* [Tarantool-patches] [PATCH luajit v3 0/2] memprof: add demangling capabilities for C functions
@ 2021-09-15 17:18 Maxim Kokryashkin via Tarantool-patches
2021-09-15 17:19 ` [Tarantool-patches] [PATCH luajit v3 1/2] memprof: extend symtab with C-symbols Maxim Kokryashkin via Tarantool-patches
2021-09-15 17:19 ` [Tarantool-patches] [PATCH luajit v3 2/2] memprof: update memprof parser Maxim Kokryashkin via Tarantool-patches
0 siblings, 2 replies; 3+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2021-09-15 17:18 UTC
To: tarantool-patches, imun, skaplun
Changes in v3:
- Fixed comments as per review by Sergey
- Added support for demangling of functions loaded in the profiler's
runtime
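As an illustration of the second item: patch 1/2 compares the `dlpi_adds` counter reported by `dl_iterate_phdr()` against a saved value to notice shared objects loaded while profiling. Below is a minimal sketch of that check, assuming a glibc-like libc; the `sketch_*` names are hypothetical and are not part of the patch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* dl_iterate_phdr() is a GNU extension in glibc. */
#endif
#include <assert.h>
#include <link.h>
#include <stddef.h>

/* Hypothetical bookkeeping, not a structure from the patch. */
struct sketch_lib_state {
  unsigned long long adds; /* dlpi_adds observed during the last walk. */
  size_t objects;          /* Objects visited during the last walk. */
};

static int sketch_count_cb(struct dl_phdr_info *info, size_t size, void *data)
{
  struct sketch_lib_state *st = data;
  assert(info != NULL);
  /* dlpi_adds is only valid if the reported structure is large enough. */
  if (size >= offsetof(struct dl_phdr_info, dlpi_adds) + sizeof(info->dlpi_adds))
    st->adds = info->dlpi_adds;
  st->objects++;
  return 0; /* Zero means "keep iterating". */
}

/* Returns 1 if the load counter changed since the previous call, else 0. */
static int sketch_libs_changed(struct sketch_lib_state *st)
{
  const unsigned long long prev = st->adds;
  st->objects = 0;
  dl_iterate_phdr(sketch_count_cb, st);
  return st->adds != prev;
}
```

A caller would keep one `struct sketch_lib_state` for the whole profiling session and re-dump symbols only when `sketch_libs_changed()` returns 1.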
Maxim Kokryashkin (2):
memprof: extend symtab with C-symbols
memprof: update memprof parser
src/lj_memprof.c | 154 +++++++++++++++++-
src/lj_memprof.h | 17 +-
.../misclib-memprof-lapi.test.lua | 4 +-
tools/memprof.lua | 6 +
tools/memprof/parse.lua | 17 +-
tools/memprof/process.lua | 7 +
tools/utils/symtab.lua | 35 +++-
7 files changed, 220 insertions(+), 20 deletions(-)
---
GitHub branch:
https://github.com/tarantool/luajit/tree/fckxorg/gh-5813-demangling-of-c-symbols-v2
Issue:
https://github.com/tarantool/tarantool/issues/5813
Patch v1 thread:
https://lists.tarantool.org/tarantool-patches/cover.1627043674.git.m.kokryashkin@tarantool.org/
Patch v2 thread:
https://lists.tarantool.org/tarantool-patches/cover.1629457244.git.m.kokryashkin@tarantool.org/
2.33.0
* [Tarantool-patches] [PATCH luajit v3 1/2] memprof: extend symtab with C-symbols
From: Maxim Kokryashkin via Tarantool-patches @ 2021-09-15 17:19 UTC
To: tarantool-patches, imun, skaplun
This commit enriches memprof's symbol table and event stream with
information about C-symbols. That information will provide demangling
capabilities to the parser.
The following data is stored in symtab for each symbol:

| SYMTAB_CFUNC | symbol address | symbol name |
     1 byte          8 bytes
     magic
     number

The following data is stored in event for each newly loaded symbol:

| (AEVENT_SYMTAB | ASOURCE_CFUNC) | symbol address | symbol name |
              1 byte                    8 bytes
              magic
              number
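
The record layout above can be sketched as a small encoder. This is an illustration only: it follows the grammar in lj_memprof.h, where sym-addr and string-len are ULEB128 values, so the address does not necessarily occupy exactly 8 bytes on the wire. The `sketch_*` names and the fixed-buffer interface are hypothetical; the patch itself writes through `lj_wbuf`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical constant; mirrors the value of SYMTAB_CFUNC in lj_memprof.h. */
#define SKETCH_SYMTAB_CFUNC ((uint8_t)1)

/* Encode v as ULEB128 into out; returns bytes written (at most 10). */
static size_t sketch_put_uleb128(uint8_t *out, uint64_t v)
{
  size_t n = 0;
  do {
    uint8_t byte = (uint8_t)(v & 0x7f);
    v >>= 7;
    if (v != 0)
      byte |= 0x80; /* Continuation bit: more bytes follow. */
    out[n++] = byte;
  } while (v != 0);
  return n;
}

/*
** Serialize one record: header byte, address, name length, name bytes.
** Returns the size of the record in bytes.
*/
static size_t sketch_put_cfunc(uint8_t *out, size_t cap, uint8_t header,
                               uint64_t addr, const char *name)
{
  const size_t len = strlen(name);
  size_t n = 0;
  /* Worst case: 1 header byte, two 10-byte ULEB128 values, the payload. */
  assert(cap >= 1 + 10 + 10 + len);
  out[n++] = header;
  n += sketch_put_uleb128(out + n, addr);
  n += sketch_put_uleb128(out + n, (uint64_t)len);
  memcpy(out + n, name, len);
  return n + len;
}
```

The same routine covers both layouts: only the header byte differs between a symtab entry and a symtab event.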
Part of tarantool/tarantool#5813
---
src/lj_memprof.c | 154 ++++++++++++++++++++++++++++++++++++++++++++---
src/lj_memprof.h | 17 ++++--
2 files changed, 160 insertions(+), 11 deletions(-)
diff --git a/src/lj_memprof.c b/src/lj_memprof.c
index 2c1ef3b8..17f97cc9 100644
--- a/src/lj_memprof.c
+++ b/src/lj_memprof.c
@@ -5,10 +5,16 @@
** Copyright (C) 2015-2019 IPONWEB Ltd.
*/
+#define _GNU_SOURCE
+#include <elf.h>
+
#define lj_memprof_c
#define LUA_CORE
#include <errno.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <link.h>
#include "lj_arch.h"
#include "lj_memprof.h"
@@ -24,7 +30,118 @@
static const unsigned char ljs_header[] = {'l', 'j', 's', LJS_CURRENT_VERSION,
0x0, 0x0, 0x0};
-static void dump_symtab(struct lj_wbuf *out, const struct global_State *g)
+typedef struct {
+ uint32_t nbuckets;
+ uint32_t symoffset;
+ uint32_t bloom_size;
+ uint32_t bloom_shift;
+} ghashtab_header;
+
+uint32_t ghashtab_size(ElfW(Addr) ghashtab)
+{
+ uint32_t last_entry = 0;
+ uint32_t* cur_bucket = NULL;
+ uint32_t* entry = NULL;
+
+ const void* chain_address = NULL;
+ ghashtab_header* header = (ghashtab_header*)ghashtab;
+ const void* buckets = (void*)ghashtab + sizeof(ghashtab_header) + (sizeof(uint64_t) * header->bloom_size);
+
+ cur_bucket = (uint32_t*)buckets;
+ for (uint32_t i = 0; i < header->nbuckets; ++i) {
+ if (last_entry < *cur_bucket)
+ last_entry = *cur_bucket;
+ cur_bucket++;
+ }
+
+ if (last_entry < header->symoffset)
+ return header->symoffset;
+
+ chain_address = buckets + (sizeof(uint32_t) * header->nbuckets);
+ do {
+ entry = (uint32_t*)(chain_address + (last_entry - header->symoffset) * sizeof(uint32_t));
+ last_entry++;
+ } while (!(*entry & 1));
+
+ return last_entry;
+}
+
+struct symbol_resolver_conf {
+ struct lj_wbuf *buf;
+ const uint8_t header;
+
+ uint32_t cur_lib;
+ uint32_t lib_cnt_prev;
+ uint32_t to_dump_cnt;
+ uint32_t *lib_cnt;
+};
+
+int resolve_symbolnames(struct dl_phdr_info* info, size_t info_size, void* data)
+{
+ if(strcmp(info->dlpi_name, "linux-vdso.so.1") == 0) {
+ return 0;
+ }
+
+ ElfW(Dyn*) dyn = NULL;
+ ElfW(Sym*) sym = NULL;
+ ElfW(Word*) hashtab = NULL;
+ ElfW(Word) sym_cnt = 0;
+
+ char* strtab = 0;
+ char* sym_name = 0;
+
+ struct symbol_resolver_conf *conf = data;
+ const uint8_t header = conf->header;
+ struct lj_wbuf *buf = conf->buf;
+
+ conf->lib_cnt_prev = *conf->lib_cnt;
+ uint32_t lib_cnt_prev = conf->lib_cnt_prev;
+
+ if((conf->to_dump_cnt = info->dlpi_adds - lib_cnt_prev) == 0) {
+ /* No new libraries, stop resolver. */
+ return 1;
+ }
+
+ uint32_t lib_cnt = info->dlpi_adds - info->dlpi_subs;
+ if(conf->cur_lib < lib_cnt - conf->to_dump_cnt) {
+ /* That lib is already dumped, skip it. */
+ ++conf->cur_lib;
+ return 0;
+ }
+
+ for (size_t header_index = 0; header_index < info->dlpi_phnum; ++header_index) {
+ if (info->dlpi_phdr[header_index].p_type == PT_DYNAMIC) {
+ dyn = (ElfW(Dyn)*)(info->dlpi_addr + info->dlpi_phdr[header_index].p_vaddr);
+
+ while(dyn->d_tag != DT_NULL) {
+ if (dyn->d_tag == DT_HASH) {
+ hashtab = (ElfW(Word*))dyn->d_un.d_ptr;
+ sym_cnt = hashtab[1];
+ }
+ else if (dyn->d_tag == DT_GNU_HASH && sym_cnt == 0)
+ sym_cnt = ghashtab_size(dyn->d_un.d_ptr);
+ else if (dyn->d_tag == DT_STRTAB)
+ strtab = (char*)dyn->d_un.d_ptr;
+ else if (dyn->d_tag == DT_SYMTAB) {
+ sym = (ElfW(Sym*))dyn->d_un.d_ptr;
+
+ for (ElfW(Word) sym_index = 0; sym_index < sym_cnt; sym_index++) {
+ sym_name = &strtab[sym[sym_index].st_name];
+ lj_wbuf_addbyte(buf, header);
+ lj_wbuf_addu64(buf, sym[sym_index].st_value + info->dlpi_addr);
+ lj_wbuf_addstring(buf, sym_name);
+ }
+ }
+ dyn++;
+ }
+ }
+ }
+
+ ++conf->cur_lib;
+ return 0;
+}
+
+static void dump_symtab(struct lj_wbuf *out, const struct global_State *g, uint32_t *lib_cnt)
{
const GCRef *iter = &g->gc.root;
const GCobj *o;
@@ -49,6 +166,17 @@ static void dump_symtab(struct lj_wbuf *out, const struct global_State *g)
iter = &o->gch.nextgc;
}
+ /* Write symbols. */
+ struct symbol_resolver_conf conf = {
+ /* buf: */ out,
+ /* header: */ SYMTAB_CFUNC,
+ /* cur_lib: */ 0,
+ /* lib_cnt_prev: */ *lib_cnt,
+ /* to_dump_cnt: */ 0,
+ /* lib_cnt: */ lib_cnt
+ };
+ dl_iterate_phdr(resolve_symbolnames, &conf);
+
lj_wbuf_addbyte(out, SYMTAB_FINAL);
}
@@ -78,6 +206,7 @@ struct memprof {
struct alloc orig_alloc; /* Original allocator. */
struct lj_memprof_options opt; /* Profiling options. */
int saved_errno; /* Saved errno when profiler deinstrumented. */
+ uint32_t lib_cnt; /* Number of currently loaded libs. */
};
static struct memprof memprof = {0};
@@ -105,15 +234,26 @@ static void memprof_write_lfunc(struct lj_wbuf *out, uint8_t aevent,
}
static void memprof_write_cfunc(struct lj_wbuf *out, uint8_t aevent,
- const GCfunc *fn)
+ const GCfunc *fn, uint32_t *lib_cnt)
{
+ /* Check if there are any new libs. */
+ struct symbol_resolver_conf conf = {
+ /* buf: */ out,
+ /* header: */ AEVENT_SYMTAB | ASOURCE_CFUNC,
+ /* cur_lib: */ 0,
+ /* lib_cnt_prev: */ *lib_cnt,
+ /* to_dump_cnt: */ 0,
+ /* lib_cnt: */ lib_cnt
+ };
+ dl_iterate_phdr(resolve_symbolnames, &conf);
+
lj_wbuf_addbyte(out, aevent | ASOURCE_CFUNC);
lj_wbuf_addu64(out, (uintptr_t)fn->c.f);
}
static void memprof_write_ffunc(struct lj_wbuf *out, uint8_t aevent,
GCfunc *fn, struct lua_State *L,
- cTValue *frame)
+ cTValue *frame, uint32_t *lib_cnt)
{
cTValue *pframe = frame_prev(frame);
GCfunc *pfn = frame_func(pframe);
@@ -126,7 +266,7 @@ static void memprof_write_ffunc(struct lj_wbuf *out, uint8_t aevent,
if (pfn != NULL && isluafunc(pfn))
memprof_write_lfunc(out, aevent, pfn, L, frame);
else
- memprof_write_cfunc(out, aevent, fn);
+ memprof_write_cfunc(out, aevent, fn, lib_cnt);
}
static void memprof_write_func(struct memprof *mp, uint8_t aevent)
@@ -139,9 +279,9 @@ static void memprof_write_func(struct memprof *mp, uint8_t aevent)
if (isluafunc(fn))
memprof_write_lfunc(out, aevent, fn, L, NULL);
else if (isffunc(fn))
- memprof_write_ffunc(out, aevent, fn, L, frame);
+ memprof_write_ffunc(out, aevent, fn, L, frame, &mp->lib_cnt);
else if (iscfunc(fn))
- memprof_write_cfunc(out, aevent, fn);
+ memprof_write_cfunc(out, aevent, fn, &mp->lib_cnt);
else
lua_assert(0);
}
@@ -249,7 +389,7 @@ int lj_memprof_start(struct lua_State *L, const struct lj_memprof_options *opt)
/* Init output. */
lj_wbuf_init(&mp->out, mp_opt->writer, mp_opt->ctx, mp_opt->buf, mp_opt->len);
- dump_symtab(&mp->out, mp->g);
+ dump_symtab(&mp->out, mp->g, &mp->lib_cnt);
/* Write prologue. */
lj_wbuf_addn(&mp->out, ljm_header, ljm_header_len);
diff --git a/src/lj_memprof.h b/src/lj_memprof.h
index 3417475d..337fa76a 100644
--- a/src/lj_memprof.h
+++ b/src/lj_memprof.h
@@ -16,7 +16,7 @@
#include "lj_def.h"
#include "lj_wbuf.h"
-#define LJS_CURRENT_VERSION 0x1
+#define LJS_CURRENT_VERSION 0x2
/*
** symtab format:
@@ -25,12 +25,14 @@
** prologue := 'l' 'j' 's' version reserved
** version := <BYTE>
** reserved := <BYTE> <BYTE> <BYTE>
-** sym := sym-lua | sym-final
+** sym := sym-lua | sym-cfunc | sym-final
** sym-lua := sym-header sym-addr sym-chunk sym-line
** sym-header := <BYTE>
** sym-addr := <ULEB128>
** sym-chunk := string
** sym-line := <ULEB128>
+** sym-cfunc := sym-header sym-addr sym-name
+** sym-name := string
** sym-final := sym-header
** string := string-len string-payload
** string-len := <ULEB128>
@@ -51,9 +53,10 @@
*/
#define SYMTAB_LFUNC ((uint8_t)0)
+#define SYMTAB_CFUNC ((uint8_t)1)
#define SYMTAB_FINAL ((uint8_t)0x80)
-#define LJM_CURRENT_FORMAT_VERSION 0x01
+#define LJM_CURRENT_FORMAT_VERSION 0x02
/*
** Event stream format:
@@ -64,10 +67,11 @@
** prologue := 'l' 'j' 'm' version reserved
** version := <BYTE>
** reserved := <BYTE> <BYTE> <BYTE>
-** event := event-alloc | event-realloc | event-free
+** event := event-alloc | event-realloc | event-free | event-symtab
** event-alloc := event-header loc? naddr nsize
** event-realloc := event-header loc? oaddr osize naddr nsize
** event-free := event-header loc? oaddr osize
+** event-symtab := event-header sym-addr sym-name
** event-header := <BYTE>
** loc := loc-lua | loc-c
** loc-lua := sym-addr line-no
@@ -78,7 +82,11 @@
** naddr := <ULEB128>
** osize := <ULEB128>
** nsize := <ULEB128>
+** sym-name := string
** epilogue := event-header
+** string := string-len string-payload
+** string-len := <ULEB128>
+** string-payload := <BYTE> {string-len}
**
** <BYTE> : A single byte (no surprises here)
** <ULEB128>: Unsigned integer represented in ULEB128 encoding
@@ -97,6 +105,7 @@
*/
/* Allocation events. */
+#define AEVENT_SYMTAB ((uint8_t)0)
#define AEVENT_ALLOC ((uint8_t)1)
#define AEVENT_FREE ((uint8_t)2)
#define AEVENT_REALLOC ((uint8_t)(AEVENT_ALLOC | AEVENT_FREE))
--
2.33.0
* [Tarantool-patches] [PATCH luajit v3 2/2] memprof: update memprof parser
From: Maxim Kokryashkin via Tarantool-patches @ 2021-09-15 17:19 UTC
To: tarantool-patches, imun, skaplun
This commit introduces demangling of C symbols to the memprof parser.
As the symbol table format has changed, the parser needs to be updated
too. The parser now supports the new symbol table entries, which
contain data about symbols from shared objects that were loaded at the
moment of data collection. The parser also supports symtab events in
the event stream and extends the symtab accordingly.
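
For illustration, the payload the parser has to consume after the header byte of a C-symbol entry (sym-addr followed by sym-name, per the grammar in lj_memprof.h) can be sketched as below. The sketch is in C rather than Lua, and the `sketch_*` names and reader structure are hypothetical; the actual parser uses `reader:read_uleb128()` and `reader:read_string()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cursor over an in-memory dump. */
struct sketch_reader {
  const uint8_t *p;   /* Current position. */
  const uint8_t *end; /* One past the last byte. */
};

/* Decode one ULEB128 value; returns 0 on success, -1 on bad input. */
static int sketch_read_uleb128(struct sketch_reader *r, uint64_t *out)
{
  uint64_t v = 0;
  unsigned shift = 0;
  while (r->p < r->end && shift < 64) {
    const uint8_t byte = *r->p++;
    v |= (uint64_t)(byte & 0x7f) << shift;
    if (!(byte & 0x80)) { /* No continuation bit: value is complete. */
      *out = v;
      return 0;
    }
    shift += 7;
  }
  return -1; /* Truncated or longer than 10 bytes. */
}

/*
** Read "sym-addr sym-name". name receives a NUL-terminated copy.
** Returns 0 on success, -1 on malformed input or a too-small buffer.
*/
static int sketch_read_cfunc(struct sketch_reader *r, uint64_t *addr,
                             char *name, size_t name_cap)
{
  uint64_t len;
  assert(name_cap > 0);
  if (sketch_read_uleb128(r, addr) != 0 || sketch_read_uleb128(r, &len) != 0)
    return -1;
  if (len >= name_cap || len > (uint64_t)(r->end - r->p))
    return -1;
  memcpy(name, r->p, (size_t)len);
  name[len] = '\0';
  r->p += len;
  return 0;
}
```

The decoded pair is what ends up in the table keyed by address, whether it came from the initial symtab dump or from a later symtab event.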
Closes tarantool/tarantool#5813
---
.../misclib-memprof-lapi.test.lua | 4 +--
tools/memprof.lua | 6 ++++
tools/memprof/parse.lua | 17 ++++++++-
tools/memprof/process.lua | 7 ++++
tools/utils/symtab.lua | 35 +++++++++++++++----
5 files changed, 60 insertions(+), 9 deletions(-)
diff --git a/test/tarantool-tests/misclib-memprof-lapi.test.lua b/test/tarantool-tests/misclib-memprof-lapi.test.lua
index 06d96b3b..bdb77549 100644
--- a/test/tarantool-tests/misclib-memprof-lapi.test.lua
+++ b/test/tarantool-tests/misclib-memprof-lapi.test.lua
@@ -61,10 +61,10 @@ local function fill_ev_type(events, symbols, event_type)
name = "INTERNAL",
num = event.num,
}
- elseif symbols[addr] then
+ elseif symbols.SYMTAB_LFUNC[addr] then
ev_type[event.loc.line] = {
name = string.format(
- "%s:%d", symbols[addr].source, symbols[addr].linedefined
+ "%s:%d", symbols.SYMTAB_LFUNC[addr].source, symbols.SYMTAB_LFUNC[addr].linedefined
),
num = event.num,
}
diff --git a/tools/memprof.lua b/tools/memprof.lua
index 18b44fdd..8fbc6a3c 100644
--- a/tools/memprof.lua
+++ b/tools/memprof.lua
@@ -101,6 +101,12 @@ local function dump(inputfile)
local reader = bufread.new(inputfile)
local symbols = symtab.parse(reader)
local events = memprof.parse(reader, symbols)
+
+ -- TODO: move this loop to process.lua.
+ for addr, event in pairs(events.symtab) do
+ symtab.add_cfunc(symbols, addr, event.name)
+ end
+
if not leak_only then
view.profile_info(events, symbols)
end
diff --git a/tools/memprof/parse.lua b/tools/memprof/parse.lua
index 12e2758f..61a95d0f 100644
--- a/tools/memprof/parse.lua
+++ b/tools/memprof/parse.lua
@@ -11,10 +11,11 @@ local lshift = bit.lshift
local string_format = string.format
local LJM_MAGIC = "ljm"
-local LJM_CURRENT_VERSION = 1
+local LJM_CURRENT_VERSION = 2
local LJM_EPILOGUE_HEADER = 0x80
+local AEVENT_SYMTAB = 0
local AEVENT_ALLOC = 1
local AEVENT_FREE = 2
local AEVENT_REALLOC = 3
@@ -36,6 +37,7 @@ local function new_event(loc)
free = 0,
alloc = 0,
primary = {},
+ name = nil
}
end
@@ -77,6 +79,17 @@ local function parse_location(reader, asource)
error("Unknown asource "..asource)
end
+local function parse_symtab(reader, asource, events, heap)
+ local id = reader:read_uleb128()
+ local name = reader:read_string()
+
+ if not events[id] then
+ events[id] = new_event(0)
+ end
+
+ events[id].name = name
+end
+
local function parse_alloc(reader, asource, events, heap)
local id, loc = parse_location(reader, asource)
@@ -134,6 +147,7 @@ local function parse_free(reader, asource, events, heap)
end
local parsers = {
+ [AEVENT_SYMTAB] = {evname = "symtab", parse = parse_symtab},
[AEVENT_ALLOC] = {evname = "alloc", parse = parse_alloc},
[AEVENT_FREE] = {evname = "free", parse = parse_free},
[AEVENT_REALLOC] = {evname = "realloc", parse = parse_realloc},
@@ -174,6 +188,7 @@ function M.parse(reader)
realloc = {},
free = {},
heap = {},
+ symtab = {}
}
local magic = reader:read_octets(3)
diff --git a/tools/memprof/process.lua b/tools/memprof/process.lua
index 0bcb965b..25a70878 100644
--- a/tools/memprof/process.lua
+++ b/tools/memprof/process.lua
@@ -56,4 +56,11 @@ function M.form_heap_delta(events, symbols)
return dheap
end
+function M.enrich_symtab(events, symbols)
+ for addr, event in pairs(events.symtab) do
+ print(event.name)
+ symtab.add_cfunc(symbols, addr, event.name)
+ end
+end
+
return M
diff --git a/tools/utils/symtab.lua b/tools/utils/symtab.lua
index 3ed1dd13..79f656ac 100644
--- a/tools/utils/symtab.lua
+++ b/tools/utils/symtab.lua
@@ -10,11 +10,12 @@ local band = bit.band
local string_format = string.format
local LJS_MAGIC = "ljs"
-local LJS_CURRENT_VERSION = 1
+local LJS_CURRENT_VERSION = 2
local LJS_EPILOGUE_HEADER = 0x80
local LJS_SYMTYPE_MASK = 0x03
local SYMTAB_LFUNC = 0
+local SYMTAB_CFUNC = 1
local M = {}
@@ -24,18 +25,33 @@ local function parse_sym_lfunc(reader, symtab)
local sym_chunk = reader:read_string()
local sym_line = reader:read_uleb128()
- symtab[sym_addr] = {
+ symtab.SYMTAB_LFUNC[sym_addr] = {
source = sym_chunk,
linedefined = sym_line,
}
end
+-- Parse a single entry in a symtab: .so library
+local function parse_sym_cfunc(reader, symtab)
+ local addr = reader:read_uleb128()
+ local name = reader:read_string()
+
+ symtab.SYMTAB_CFUNC[addr] = {
+ name = name
+ }
+end
+
local parsers = {
[SYMTAB_LFUNC] = parse_sym_lfunc,
+ [SYMTAB_CFUNC] = parse_sym_cfunc
}
function M.parse(reader)
- local symtab = {}
+ local symtab = {
+ SYMTAB_LFUNC = {},
+ SYMTAB_CFUNC = {}
+ }
+
local magic = reader:read_octets(3)
local version = reader:read_octets(1)
@@ -69,7 +85,6 @@ function M.parse(reader)
parsers[sym_type](reader, symtab)
end
end
-
return symtab
end
@@ -80,11 +95,19 @@ function M.demangle(symtab, loc)
return "INTERNAL"
end
- if symtab[addr] then
- return string_format("%s:%d", symtab[addr].source, loc.line)
+ if symtab.SYMTAB_LFUNC[addr] then
+ return string_format("%s:%d", symtab.SYMTAB_LFUNC[addr].source, loc.line)
+ end
+
+ if symtab.SYMTAB_CFUNC[addr] then
+ return string_format("%s:%#x", symtab.SYMTAB_CFUNC[addr].name, addr)
end
return string_format("CFUNC %#x", addr)
end
+function M.add_cfunc(symtab, addr, name)
+ symtab.SYMTAB_CFUNC[addr] = {name = name}
+end
+
return M
--
2.33.0