* [Tarantool-patches] [PATCH luajit v1 0/4] memprof: enrich symtab when meeting new prototype @ 2021-08-21 12:49 Mikhail Shishatskiy via Tarantool-patches 2021-08-21 12:49 ` [Tarantool-patches] [PATCH luajit v1 1/4] " Mikhail Shishatskiy via Tarantool-patches ` (3 more replies) 0 siblings, 4 replies; 6+ messages in thread From: Mikhail Shishatskiy via Tarantool-patches @ 2021-08-21 12:49 UTC (permalink / raw) To: tarantool-patches, imun, skaplun Here the results of benchmark [1] for introduced patchset. The time is average for 15 runs: | -------------------------------------- | | JIT-off, memprof-on | | -------------------------------------- | | BEFORE | AFTER | | ------------------ | ----------------- | | 5.3859 (0.0325) | +0.2534 (0.0248) | | -------------------------------------- | BEFORE branch: https://github.com/tarantool/luajit/tree/shishqa/gh-5679-report-jit-allocations-as-internal [1]: https://gist.github.com/Shishqa/94152d538816fc64b68200336e6305d3 Issue: https://github.com/tarantool/tarantool/issues/5815 Branch: https://github.com/tarantool/luajit/tree/shishqa/gh-5815-enrich-symtab-when-prototype-is-allocated Tarantool branch: https://github.com/tarantool/tarantool/tree/shishqa/gh-5815-enrich-symtab-when-prototype-is-allocated Mikhail Shishatskiy (4): memprof: enrich symtab when meeting new prototype memprof: demangle symbols on the spot memprof: substitute long proto names with aliases luajit: change order of modules src/lj_bcread.c | 1 + src/lj_memprof.c | 19 +++- src/lj_memprof.h | 16 ++- src/lj_obj.h | 7 ++ src/lj_parse.c | 1 + .../misclib-memprof-lapi.test.lua | 102 +++++++++++++----- tools/memprof.lua | 5 +- tools/memprof/humanize.lua | 51 +++++++-- tools/memprof/parse.lua | 67 ++++++++---- tools/memprof/process.lua | 12 +-- tools/utils/symtab.lua | 20 +++- 11 files changed, 230 insertions(+), 71 deletions(-) -- 2.32.0 ^ permalink raw reply [flat|nested] 6+ messages in thread
* [Tarantool-patches] [PATCH luajit v1 1/4] memprof: enrich symtab when meeting new prototype 2021-08-21 12:49 [Tarantool-patches] [PATCH luajit v1 0/4] memprof: enrich symtab when meeting new prototype Mikhail Shishatskiy via Tarantool-patches @ 2021-08-21 12:49 ` Mikhail Shishatskiy via Tarantool-patches 2021-08-24 14:16 ` Mikhail Shishatskiy via Tarantool-patches 2021-08-21 12:50 ` [Tarantool-patches] [PATCH luajit v1 2/4] memprof: demangle symbols on the spot Mikhail Shishatskiy via Tarantool-patches ` (2 subsequent siblings) 3 siblings, 1 reply; 6+ messages in thread From: Mikhail Shishatskiy via Tarantool-patches @ 2021-08-21 12:49 UTC (permalink / raw) To: tarantool-patches, imun, skaplun Previously, the profiler dumped all the function prototypes only at the start of profiling. It was enough for most cases, but sometimes we may want to investigate the memory profile of module loading. In such a case, we expect memprof to dump new prototypes from parsed source code. This patch extends memprof's streaming format with entries providing extra information about allocation source's symbols if they were not streamed at the start of profiling. | loc := loc-lua | loc-lua-sym | loc-c | loc-trace | loc-lua-sym := sym-addr sym-chunk sym-line line-no The `sym-chunk` and `sym-line` are the prototype's chunk name and line where it was defined. To recognize entries with new symbols, event-header format changed: | event-header: [FUTSSSEE] | * T : 1 bit indicating if additional symtab entry provided or not Also, the profiler parser is adjusted to recognize entries described above and enrich the symtab on the fly while parsing events. For this reason, <utils/symtab.lua> API has been changed: function `parse_sym_lfunc` became public and now returns sym-addr to be used in the event parser module. Resolves tarantool/tarantool#5815 --- Issue: https://github.com/tarantool/tarantool/issues/5815 Branch: https://github.com/tarantool/luajit/tree/shishqa/gh-5815-enrich-symtab-when-prototype-is-allocated Tarantool branch: https://github.com/tarantool/tarantool/tree/shishqa/gh-5815-enrich-symtab-when-prototype-is-allocated src/lj_bcread.c | 1 + src/lj_memprof.c | 19 +++- src/lj_memprof.h | 16 ++- src/lj_obj.h | 7 ++ src/lj_parse.c | 1 + .../misclib-memprof-lapi.test.lua | 102 +++++++++++++----- tools/memprof/parse.lua | 55 +++++++--- tools/utils/symtab.lua | 6 +- 8 files changed, 160 insertions(+), 47 deletions(-) diff --git a/src/lj_bcread.c b/src/lj_bcread.c index 48c5e7c7..692e1a9d 100644 --- a/src/lj_bcread.c +++ b/src/lj_bcread.c @@ -353,6 +353,7 @@ GCproto *lj_bcread_proto(LexState *ls) pt->sizeuv = (uint8_t)sizeuv; pt->flags = (uint8_t)flags; pt->trace = 0; + pt->is_streamed = 0; setgcref(pt->chunkname, obj2gco(ls->chunkname)); /* Close potentially uninitialized gap between bc and kgc. */ diff --git a/src/lj_memprof.c b/src/lj_memprof.c index 14e4e67b..fb217a2c 100644 --- a/src/lj_memprof.c +++ b/src/lj_memprof.c @@ -67,7 +67,7 @@ static void dump_symtab_trace(struct lj_wbuf *out, const GCtrace *trace) static void dump_symtab(struct lj_wbuf *out, const struct global_State *g) { const GCRef *iter = &g->gc.root; - const GCobj *o; + GCobj *o; const size_t ljs_header_len = sizeof(ljs_header) / sizeof(ljs_header[0]); /* Write prologue. */ @@ -76,11 +76,12 @@ static void dump_symtab(struct lj_wbuf *out, const struct global_State *g) while ((o = gcref(*iter)) != NULL) { switch (o->gch.gct) { case (~LJ_TPROTO): { - const GCproto *pt = gco2pt(o); + GCproto *pt = gco2pt(o); lj_wbuf_addbyte(out, SYMTAB_LFUNC); lj_wbuf_addu64(out, (uintptr_t)pt); lj_wbuf_addstring(out, proto_chunknamestr(pt)); lj_wbuf_addu64(out, (uint64_t)pt->firstline); + pt->is_streamed = 1; break; } case (~LJ_TTRACE): { @@ -138,6 +139,7 @@ static void memprof_write_lfunc(struct lj_wbuf *out, uint8_t aevent, ** -DLUAJIT_DISABLE_DEBUGINFO flag. */ const BCLine line = lj_debug_frameline(L, fn, nextframe); + GCproto *pt = funcproto(fn); if (line < 0) { /* @@ -149,9 +151,20 @@ static void memprof_write_lfunc(struct lj_wbuf *out, uint8_t aevent, ** We report such allocations as internal in order not to confuse users. */ lj_wbuf_addbyte(out, aevent | ASOURCE_INT); + + } else if (LJ_UNLIKELY(!pt->is_streamed)) { + + lj_wbuf_addbyte(out, aevent | ASOURCE_LFUNC | LJM_SYMTAB); + lj_wbuf_addu64(out, (uintptr_t)pt); + lj_wbuf_addstring(out, proto_chunknamestr(pt)); + lj_wbuf_addu64(out, (uint64_t)pt->firstline); + lj_wbuf_addu64(out, (uint64_t)line); + pt->is_streamed = 1; + } else { + lj_wbuf_addbyte(out, aevent | ASOURCE_LFUNC); - lj_wbuf_addu64(out, (uintptr_t)funcproto(fn)); + lj_wbuf_addu64(out, (uintptr_t)pt); lj_wbuf_addu64(out, (uint64_t)line); } } diff --git a/src/lj_memprof.h b/src/lj_memprof.h index e72dadf7..0f5b4c6d 100644 --- a/src/lj_memprof.h +++ b/src/lj_memprof.h @@ -57,7 +57,7 @@ #define SYMTAB_TRACE ((uint8_t)1) #define SYMTAB_FINAL ((uint8_t)0x80) -#define LJM_CURRENT_FORMAT_VERSION 0x02 +#define LJM_CURRENT_FORMAT_VERSION 0x03 /* ** Event stream format: @@ -73,11 +73,14 @@ ** event-realloc := event-header loc? oaddr osize naddr nsize ** event-free := event-header loc? oaddr osize ** event-header := <BYTE> -** loc := loc-lua | loc-c | loc-trace +** loc := loc-lua | loc-lua-sym | loc-c | loc-trace ** loc-lua := sym-addr line-no +** loc-lua-sym := sym-addr sym-chunk sym-line line-no ** loc-c := sym-addr ** loc-trace := trace-addr trace-no ** sym-addr := <ULEB128> +** sym-chunk := string +** sym-line := <ULEB128> ** line-no := <ULEB128> ** trace-addr := <ULEB128> ** trace-no := <ULEB128> @@ -85,6 +88,9 @@ ** naddr := <ULEB128> ** osize := <ULEB128> ** nsize := <ULEB128> +** string := string-len string-payload +** string-len := <ULEB128> +** string-payload := <BYTE> {string-len} ** epilogue := event-header ** ** <BYTE> : A single byte (no surprises here) @@ -95,10 +101,11 @@ ** version: [VVVVVVVV] ** * VVVVVVVV: Byte interpreted as a plain integer version number ** -** event-header: [FUUSSSEE] +** event-header: [FUTSSSEE] ** * EE : 2 bits for representing allocation event type (AEVENT_*) ** * SSS : 3 bits for representing allocation source type (ASOURCE_*) -** * UU : 2 unused bits +** * T : 1 bit indicating if additional symtab entry provided or not +** * U : 1 unused bit ** * F : 0 for regular events, 1 for epilogue's *F*inal header ** (if F is set to 1, all other bits are currently ignored) */ @@ -114,6 +121,7 @@ #define ASOURCE_CFUNC ((uint8_t)(3 << 2)) #define ASOURCE_TRACE ((uint8_t)(4 << 2)) +#define LJM_SYMTAB 0x20 #define LJM_EPILOGUE_HEADER 0x80 /* Profiler public API. */ diff --git a/src/lj_obj.h b/src/lj_obj.h index d26e60be..c8f6b13a 100644 --- a/src/lj_obj.h +++ b/src/lj_obj.h @@ -385,6 +385,13 @@ typedef struct GCproto { MRef lineinfo; /* Compressed map from bytecode ins. to source line. */ MRef uvinfo; /* Upvalue names. */ MRef varinfo; /* Names and compressed extents of local variables. */ +#if LJ_HASMEMPROF + /* + ** Flag, indicating, if this proto has been streamed to the + ** memprof's binary output. + */ + uint8_t is_streamed; +#endif } GCproto; /* Flags for prototype. */ diff --git a/src/lj_parse.c b/src/lj_parse.c index a6325a76..1a540610 100644 --- a/src/lj_parse.c +++ b/src/lj_parse.c @@ -1577,6 +1577,7 @@ static GCproto *fs_finish(LexState *ls, BCLine line) pt->flags = (uint8_t)(fs->flags & ~(PROTO_HAS_RETURN|PROTO_FIXUP_RETURN)); pt->numparams = fs->numparams; pt->framesize = fs->framesize; + pt->is_streamed = 0; setgcref(pt->chunkname, obj2gco(ls->chunkname)); /* Close potentially uninitialized gap between bc and kgc. */ diff --git a/test/tarantool-tests/misclib-memprof-lapi.test.lua b/test/tarantool-tests/misclib-memprof-lapi.test.lua index 17bcb2f1..f4666da5 100644 --- a/test/tarantool-tests/misclib-memprof-lapi.test.lua +++ b/test/tarantool-tests/misclib-memprof-lapi.test.lua @@ -7,7 +7,7 @@ require("utils").skipcond( local tap = require("tap") local test = tap.test("misc-memprof-lapi") -test:plan(4) +test:plan(5) local jit_opt_default = { 3, -- level @@ -70,7 +70,7 @@ local function generate_parsed_output(payload) local reader = bufread.new(TMP_BINFILE) local symbols = symtab.parse(reader) - local events = memprof.parse(reader) + local events = memprof.parse(reader, symbols) -- We don't need it any more. os.remove(TMP_BINFILE) @@ -79,7 +79,9 @@ local function generate_parsed_output(payload) end local function fill_ev_type(events, symbols, event_type) - local ev_type = {} + local ev_type = { + trace = {}, + } for _, event in pairs(events[event_type]) do local addr = event.loc.addr local traceno = event.loc.traceno @@ -87,7 +89,7 @@ local function fill_ev_type(events, symbols, event_type) if traceno ~= 0 and symbols.trace[traceno] then local trace_loc = symbols.trace[traceno].sym_loc addr = trace_loc.addr - ev_type[trace_loc.line] = { + ev_type.trace[traceno] = { name = string.format("TRACE [%d] %s:%d", traceno, symbols.lfunc[addr].source, symbols.lfunc[addr].linedefined ), @@ -99,9 +101,13 @@ local function fill_ev_type(events, symbols, event_type) num = event.num, } elseif symbols.lfunc[addr] then - ev_type[event.loc.line] = { + local source = symbols.lfunc[addr].source + if ev_type[source] == nil then + ev_type[source] = {} + end + ev_type[source][event.loc.line] = { name = string.format( - "%s:%d", symbols.lfunc[addr].source, symbols.lfunc[addr].linedefined + "%s:%d", source, symbols.lfunc[addr].linedefined ), num = event.num, } @@ -110,23 +116,35 @@ local function fill_ev_type(events, symbols, event_type) return ev_type end -local function form_source_line(line) - return string.format("@%s:%d", arg[0], line) +local function form_source_line(loc) + return string.format("%s:%d", loc.source, loc.function_line) end -local function check_alloc_report(alloc, traceno, line, function_line, nevents) - assert(alloc[line], ("no event on line %d"):format(line)) - local event = alloc[line] - local expected_name - if traceno ~= 0 then - expected_name = string.format("TRACE [%d] ", traceno).. - form_source_line(function_line) +local function get_event_name_trace(loc, alloc) + local event = alloc.trace[loc.traceno] + return event, string.format("TRACE [%d] ", loc.traceno)..form_source_line(loc) +end + +local function get_event_name_lua(loc, alloc) + local event = alloc[loc.source][loc.line] + return event, form_source_line(loc) +end + +local function check_alloc_report(loc, alloc, nevents) + if loc.source == nil then + loc.source = '@'..arg[0] + end + + local event, name + if loc.traceno ~= nil then + event, name = get_event_name_trace(loc, alloc) else - expected_name = form_source_line(function_line) + event, name = get_event_name_lua(loc, alloc) end - assert(expected_name == event.name, ("got='%s', expected='%s'"):format( + + assert(name == event.name, ("got='%s', expected='%s'"):format( event.name, - expected_name + name )) assert(event.num == nevents, ("got=%d, expected=%d"):format( event.num, @@ -175,9 +193,9 @@ test:test("output", function(subtest) -- the number of allocations. -- 1 event - alocation of table by itself + 1 allocation -- of array part as far it is bigger than LJ_MAX_COLOSIZE (16). - subtest:ok(check_alloc_report(alloc, 0, 34, 32, 2)) + subtest:ok(check_alloc_report({line = 34, function_line = 32}, alloc, 2)) -- 20 strings allocations. - subtest:ok(check_alloc_report(alloc, 0, 39, 32, 20)) + subtest:ok(check_alloc_report({line = 39, function_line = 32}, alloc, 20)) -- Collect all previous allocated objects. subtest:ok(free.INTERNAL.num == 22) @@ -185,8 +203,8 @@ test:test("output", function(subtest) -- Tests for leak-only option. -- See also https://github.com/tarantool/tarantool/issues/5812. local heap_delta = process.form_heap_delta(events, symbols) - local tab_alloc_stats = heap_delta[form_source_line(34)] - local str_alloc_stats = heap_delta[form_source_line(39)] + local tab_alloc_stats = heap_delta[form_source_line({ function_line = 34 })] + local str_alloc_stats = heap_delta[form_source_line({ function_line = 39 })] subtest:ok(tab_alloc_stats.nalloc == tab_alloc_stats.nfree) subtest:ok(tab_alloc_stats.dbytes == 0) subtest:ok(str_alloc_stats.nalloc == str_alloc_stats.nfree) @@ -215,6 +233,38 @@ test:test("stack-resize", function(subtest) misc.memprof.stop() end) +-- Test for extending symtab with function prototypes +-- while profiler is running. +test:test("symtab-enriching", function(subtest) + subtest:plan(2) + + local payload_str = [[ + local M = { + tmp = string.rep("1", 100) -- line 2. + } + + function M.payload() + local str = string.rep("42", 100) -- line 6. + end + + return M + ]] + + local symbols, events = generate_parsed_output(function() + local str_chunk = assert(loadstring(payload_str, 'str_chunk'))() + str_chunk.payload() + end) + + local alloc = fill_ev_type(events, symbols, "alloc") + + subtest:ok(check_alloc_report( + {source = 'str_chunk', line = 6, function_line = 5}, alloc, 1) + ) + subtest:ok(check_alloc_report( + {source = 'str_chunk', line = 2, function_line = 0}, alloc, 1) + ) +end) + -- Test profiler with enabled JIT. jit.on() @@ -247,9 +297,13 @@ test:test("jit-output", function(subtest) alloc = fill_ev_type(events, symbols, "alloc") -- We expect, that loop will be compiled into a trace. - subtest:ok(check_alloc_report(alloc, 1, 37, 32, 20)) + subtest:ok(check_alloc_report( + { traceno = 1, line = 37, function_line = 32 }, alloc, 20 + )) -- See same checks with jit.off(). - subtest:ok(check_alloc_report(alloc, 0, 34, 32, 2)) + subtest:ok(check_alloc_report( + { line = 34, function_line = 32 }, alloc, 2 + )) -- Restore default JIT settings. jit.opt.start(unpack(jit_opt_default)) diff --git a/tools/memprof/parse.lua b/tools/memprof/parse.lua index adc7c072..888c6636 100644 --- a/tools/memprof/parse.lua +++ b/tools/memprof/parse.lua @@ -8,11 +8,14 @@ local bit = require "bit" local band = bit.band local lshift = bit.lshift +local symtab = require "utils.symtab" + local string_format = string.format local LJM_MAGIC = "ljm" -local LJM_CURRENT_VERSION = 0x02 +local LJM_CURRENT_VERSION = 0x03 +local LJM_SYMTAB = 0x20 local LJM_EPILOGUE_HEADER = 0x80 local AEVENT_ALLOC = 1 @@ -26,9 +29,10 @@ local ASOURCE_LFUNC = lshift(2, 2) local ASOURCE_CFUNC = lshift(3, 2) local ASOURCE_TRACE = lshift(4, 2) -local ASOURCE_MASK = lshift(0x7, 2) +local ASOURCE_MASK = lshift(0xF, 2) +local ASOURCE_TYPE_MASK = lshift(0x7, 2) -local EV_HEADER_MAX = ASOURCE_TRACE + AEVENT_REALLOC +local EV_HEADER_MAX = LJM_SYMTAB + ASOURCE_TRACE + AEVENT_REALLOC local M = {} @@ -70,7 +74,18 @@ local function id_location(addr, line, traceno) } end -local function parse_location(reader, asource) +local function parse_location_symbols(reader, asource, symbols) + if asource == ASOURCE_LFUNC then + return id_location( + symtab.parse_sym_lfunc(reader, symbols), + reader:read_uleb128(), + 0 + ) + end + error("Unknown asource "..asource) +end + +local function parse_location_common(reader, asource) if asource == ASOURCE_INT then return id_location(0, 0, 0) elseif asource == ASOURCE_CFUNC then @@ -83,8 +98,20 @@ local function parse_location(reader, asource) error("Unknown asource "..asource) end -local function parse_alloc(reader, asource, events, heap) - local id, loc = parse_location(reader, asource) +local function parse_location(reader, asource, symbols) + local has_symbols = band(asource, LJM_SYMTAB) + local asource_type = band(asource, ASOURCE_TYPE_MASK) + local id, loc + if has_symbols == LJM_SYMTAB then + id, loc = parse_location_symbols(reader, asource_type, symbols) + else + id, loc = parse_location_common(reader, asource_type) + end + return id, loc +end + +local function parse_alloc(reader, asource, events, heap, symbols) + local id, loc = parse_location(reader, asource, symbols) local naddr = reader:read_uleb128() local nsize = reader:read_uleb128() @@ -99,8 +126,8 @@ local function parse_alloc(reader, asource, events, heap) heap[naddr] = {nsize, id, loc} end -local function parse_realloc(reader, asource, events, heap) - local id, loc = parse_location(reader, asource) +local function parse_realloc(reader, asource, events, heap, symbols) + local id, loc = parse_location(reader, asource, symbols) local oaddr = reader:read_uleb128() local osize = reader:read_uleb128() @@ -121,8 +148,8 @@ local function parse_realloc(reader, asource, events, heap) heap[naddr] = {nsize, id, loc} end -local function parse_free(reader, asource, events, heap) - local id, loc = parse_location(reader, asource) +local function parse_free(reader, asource, events, heap, symbols) + local id, loc = parse_location(reader, asource, symbols) local oaddr = reader:read_uleb128() local osize = reader:read_uleb128() @@ -155,7 +182,7 @@ local function ev_header_split(evh) return band(evh, AEVENT_MASK), band(evh, ASOURCE_MASK) end -local function parse_event(reader, events) +local function parse_event(reader, events, symbols) local ev_header = reader:read_octet() assert(ev_header_is_valid(ev_header), "Bad ev_header "..ev_header) @@ -169,12 +196,12 @@ local function parse_event(reader, events) assert(parser, "Bad aevent "..aevent) - parser.parse(reader, asource, events[parser.evname], events.heap) + parser.parse(reader, asource, events[parser.evname], events.heap, symbols) return true end -function M.parse(reader) +function M.parse(reader, symbols) local events = { alloc = {}, realloc = {}, @@ -199,7 +226,7 @@ function M.parse(reader) )) end - while parse_event(reader, events) do + while parse_event(reader, events, symbols) do -- Empty body. end diff --git a/tools/utils/symtab.lua b/tools/utils/symtab.lua index 0e742ee1..601c9563 100644 --- a/tools/utils/symtab.lua +++ b/tools/utils/symtab.lua @@ -20,7 +20,7 @@ local SYMTAB_TRACE = 1 local M = {} -- Parse a single entry in a symtab: lfunc symbol. -local function parse_sym_lfunc(reader, symtab) +function M.parse_sym_lfunc(reader, symtab) local sym_addr = reader:read_uleb128() local sym_chunk = reader:read_string() local sym_line = reader:read_uleb128() @@ -29,6 +29,8 @@ local function parse_sym_lfunc(reader, symtab) source = sym_chunk, linedefined = sym_line, } + + return sym_addr end local function parse_sym_trace(reader, symtab) @@ -48,7 +50,7 @@ local function parse_sym_trace(reader, symtab) end local parsers = { - [SYMTAB_LFUNC] = parse_sym_lfunc, + [SYMTAB_LFUNC] = M.parse_sym_lfunc, [SYMTAB_TRACE] = parse_sym_trace, } -- 2.32.0 ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit v1 1/4] memprof: enrich symtab when meeting new prototype 2021-08-21 12:49 ` [Tarantool-patches] [PATCH luajit v1 1/4] " Mikhail Shishatskiy via Tarantool-patches @ 2021-08-24 14:16 ` Mikhail Shishatskiy via Tarantool-patches 0 siblings, 0 replies; 6+ messages in thread From: Mikhail Shishatskiy via Tarantool-patches @ 2021-08-24 14:16 UTC (permalink / raw) To: tarantool-patches, imun, skaplun Sorry, I forgot about LJ_HASMEMPROF in some files: diff --git a/src/lj_bcread.c b/src/lj_bcread.c index 692e1a9d..242ea465 100644 --- a/src/lj_bcread.c +++ b/src/lj_bcread.c @@ -353,7 +353,9 @@ GCproto *lj_bcread_proto(LexState *ls) pt->sizeuv = (uint8_t)sizeuv; pt->flags = (uint8_t)flags; pt->trace = 0; +#if LJ_HASMEMPROF pt->is_streamed = 0; +#endif setgcref(pt->chunkname, obj2gco(ls->chunkname)); /* Close potentially uninitialized gap between bc and kgc. */ diff --git a/src/lj_parse.c b/src/lj_parse.c index 1a540610..746c6e33 100644 --- a/src/lj_parse.c +++ b/src/lj_parse.c @@ -1577,7 +1577,9 @@ static GCproto *fs_finish(LexState *ls, BCLine line) pt->flags = (uint8_t)(fs->flags & ~(PROTO_HAS_RETURN|PROTO_FIXUP_RETURN)); pt->numparams = fs->numparams; pt->framesize = fs->framesize; +#if LJ_HASMEMPROF pt->is_streamed = 0; +#endif setgcref(pt->chunkname, obj2gco(ls->chunkname)); /* Close potentially uninitialized gap between bc and kgc. */ On 21.08.2021 19:49, Mikhail Shishatskiy wrote: >Previously, the profiler dumped all the function prototypes only >at the start of profiling. It was enough for most cases, but >sometimes we may want to investigate the memory profile of module >loading. In such a case, we expect memprof to dump new prototypes >from parsed source code. > >This patch extends memprof's streaming format with entries providing >extra information about allocation source's symbols if they were not >streamed at the start of profiling. > >| loc := loc-lua | loc-lua-sym | loc-c | loc-trace >| loc-lua-sym := sym-addr sym-chunk sym-line line-no > >The `sym-chunk` and `sym-line` are the prototype's chunk name and line >where it was defined. > >To recognize entries with new symbols, event-header format changed: > >| event-header: [FUTSSSEE] >| * T : 1 bit indicating if additional symtab entry provided or not > >Also, the profiler parser is adjusted to recognize entries described >above and enrich the symtab on the fly while parsing events. For this >reason, <utils/symtab.lua> API has been changed: function >`parse_sym_lfunc` became public and now returns sym-addr to be used >in the event parser module. > >Resolves tarantool/tarantool#5815 >--- > >Issue: https://github.com/tarantool/tarantool/issues/5815 >Branch: https://github.com/tarantool/luajit/tree/shishqa/gh-5815-enrich-symtab-when-prototype-is-allocated >Tarantool branch: https://github.com/tarantool/tarantool/tree/shishqa/gh-5815-enrich-symtab-when-prototype-is-allocated > > src/lj_bcread.c | 1 + > src/lj_memprof.c | 19 +++- > src/lj_memprof.h | 16 ++- > src/lj_obj.h | 7 ++ > src/lj_parse.c | 1 + > .../misclib-memprof-lapi.test.lua | 102 +++++++++++++----- > tools/memprof/parse.lua | 55 +++++++--- > tools/utils/symtab.lua | 6 +- > 8 files changed, 160 insertions(+), 47 deletions(-) > >diff --git a/src/lj_bcread.c b/src/lj_bcread.c >index 48c5e7c7..692e1a9d 100644 >--- a/src/lj_bcread.c >+++ b/src/lj_bcread.c >@@ -353,6 +353,7 @@ GCproto *lj_bcread_proto(LexState *ls) > pt->sizeuv = (uint8_t)sizeuv; > pt->flags = (uint8_t)flags; > pt->trace = 0; >+ pt->is_streamed = 0; > setgcref(pt->chunkname, obj2gco(ls->chunkname)); > > /* Close potentially uninitialized gap between bc and kgc. */ >diff --git a/src/lj_memprof.c b/src/lj_memprof.c >index 14e4e67b..fb217a2c 100644 >--- a/src/lj_memprof.c >+++ b/src/lj_memprof.c >@@ -67,7 +67,7 @@ static void dump_symtab_trace(struct lj_wbuf *out, const GCtrace *trace) > static void dump_symtab(struct lj_wbuf *out, const struct global_State *g) > { > const GCRef *iter = &g->gc.root; >- const GCobj *o; >+ GCobj *o; > const size_t ljs_header_len = sizeof(ljs_header) / sizeof(ljs_header[0]); > > /* Write prologue. */ >@@ -76,11 +76,12 @@ static void dump_symtab(struct lj_wbuf *out, const struct global_State *g) > while ((o = gcref(*iter)) != NULL) { > switch (o->gch.gct) { > case (~LJ_TPROTO): { >- const GCproto *pt = gco2pt(o); >+ GCproto *pt = gco2pt(o); > lj_wbuf_addbyte(out, SYMTAB_LFUNC); > lj_wbuf_addu64(out, (uintptr_t)pt); > lj_wbuf_addstring(out, proto_chunknamestr(pt)); > lj_wbuf_addu64(out, (uint64_t)pt->firstline); >+ pt->is_streamed = 1; > break; > } > case (~LJ_TTRACE): { >@@ -138,6 +139,7 @@ static void memprof_write_lfunc(struct lj_wbuf *out, uint8_t aevent, > ** -DLUAJIT_DISABLE_DEBUGINFO flag. > */ > const BCLine line = lj_debug_frameline(L, fn, nextframe); >+ GCproto *pt = funcproto(fn); > > if (line < 0) { > /* >@@ -149,9 +151,20 @@ static void memprof_write_lfunc(struct lj_wbuf *out, uint8_t aevent, > ** We report such allocations as internal in order not to confuse users. > */ > lj_wbuf_addbyte(out, aevent | ASOURCE_INT); >+ >+ } else if (LJ_UNLIKELY(!pt->is_streamed)) { >+ >+ lj_wbuf_addbyte(out, aevent | ASOURCE_LFUNC | LJM_SYMTAB); >+ lj_wbuf_addu64(out, (uintptr_t)pt); >+ lj_wbuf_addstring(out, proto_chunknamestr(pt)); >+ lj_wbuf_addu64(out, (uint64_t)pt->firstline); >+ lj_wbuf_addu64(out, (uint64_t)line); >+ pt->is_streamed = 1; >+ > } else { >+ > lj_wbuf_addbyte(out, aevent | ASOURCE_LFUNC); >- lj_wbuf_addu64(out, (uintptr_t)funcproto(fn)); >+ lj_wbuf_addu64(out, (uintptr_t)pt); > lj_wbuf_addu64(out, (uint64_t)line); > } > } >diff --git a/src/lj_memprof.h b/src/lj_memprof.h >index e72dadf7..0f5b4c6d 100644 >--- a/src/lj_memprof.h >+++ b/src/lj_memprof.h >@@ -57,7 +57,7 @@ > #define SYMTAB_TRACE ((uint8_t)1) > #define SYMTAB_FINAL ((uint8_t)0x80) > >-#define LJM_CURRENT_FORMAT_VERSION 0x02 >+#define LJM_CURRENT_FORMAT_VERSION 0x03 > > /* > ** Event stream format: >@@ -73,11 +73,14 @@ > ** event-realloc := event-header loc? oaddr osize naddr nsize > ** event-free := event-header loc? oaddr osize > ** event-header := <BYTE> >-** loc := loc-lua | loc-c | loc-trace >+** loc := loc-lua | loc-lua-sym | loc-c | loc-trace > ** loc-lua := sym-addr line-no >+** loc-lua-sym := sym-addr sym-chunk sym-line line-no > ** loc-c := sym-addr > ** loc-trace := trace-addr trace-no > ** sym-addr := <ULEB128> >+** sym-chunk := string >+** sym-line := <ULEB128> > ** line-no := <ULEB128> > ** trace-addr := <ULEB128> > ** trace-no := <ULEB128> >@@ -85,6 +88,9 @@ > ** naddr := <ULEB128> > ** osize := <ULEB128> > ** nsize := <ULEB128> >+** string := string-len string-payload >+** string-len := <ULEB128> >+** string-payload := <BYTE> {string-len} > ** epilogue := event-header > ** > ** <BYTE> : A single byte (no surprises here) >@@ -95,10 +101,11 @@ > ** version: [VVVVVVVV] > ** * VVVVVVVV: Byte interpreted as a plain integer version number > ** >-** event-header: [FUUSSSEE] >+** event-header: [FUTSSSEE] > ** * EE : 2 bits for representing allocation event type (AEVENT_*) > ** * SSS : 3 bits for representing allocation source type (ASOURCE_*) >-** * UU : 2 unused bits >+** * T : 1 bit indicating if additional symtab entry provided or not >+** * U : 1 unused bit > ** * F : 0 for regular events, 1 for epilogue's *F*inal header > ** (if F is set to 1, all other bits are currently ignored) > */ >@@ -114,6 +121,7 @@ > #define ASOURCE_CFUNC ((uint8_t)(3 << 2)) > #define ASOURCE_TRACE ((uint8_t)(4 << 2)) > >+#define LJM_SYMTAB 0x20 > #define LJM_EPILOGUE_HEADER 0x80 > > /* Profiler public API. */ >diff --git a/src/lj_obj.h b/src/lj_obj.h >index d26e60be..c8f6b13a 100644 >--- a/src/lj_obj.h >+++ b/src/lj_obj.h >@@ -385,6 +385,13 @@ typedef struct GCproto { > MRef lineinfo; /* Compressed map from bytecode ins. to source line. */ > MRef uvinfo; /* Upvalue names. */ > MRef varinfo; /* Names and compressed extents of local variables. */ >+#if LJ_HASMEMPROF >+ /* >+ ** Flag, indicating, if this proto has been streamed to the >+ ** memprof's binary output. >+ */ >+ uint8_t is_streamed; >+#endif > } GCproto; > > /* Flags for prototype. */ >diff --git a/src/lj_parse.c b/src/lj_parse.c >index a6325a76..1a540610 100644 >--- a/src/lj_parse.c >+++ b/src/lj_parse.c >@@ -1577,6 +1577,7 @@ static GCproto *fs_finish(LexState *ls, BCLine line) > pt->flags = (uint8_t)(fs->flags & ~(PROTO_HAS_RETURN|PROTO_FIXUP_RETURN)); > pt->numparams = fs->numparams; > pt->framesize = fs->framesize; >+ pt->is_streamed = 0; > setgcref(pt->chunkname, obj2gco(ls->chunkname)); > > /* Close potentially uninitialized gap between bc and kgc. */ >diff --git a/test/tarantool-tests/misclib-memprof-lapi.test.lua b/test/tarantool-tests/misclib-memprof-lapi.test.lua >index 17bcb2f1..f4666da5 100644 >--- a/test/tarantool-tests/misclib-memprof-lapi.test.lua >+++ b/test/tarantool-tests/misclib-memprof-lapi.test.lua >@@ -7,7 +7,7 @@ require("utils").skipcond( > local tap = require("tap") > > local test = tap.test("misc-memprof-lapi") >-test:plan(4) >+test:plan(5) > > local jit_opt_default = { > 3, -- level >@@ -70,7 +70,7 @@ local function generate_parsed_output(payload) > > local reader = bufread.new(TMP_BINFILE) > local symbols = symtab.parse(reader) >- local events = memprof.parse(reader) >+ local events = memprof.parse(reader, symbols) > > -- We don't need it any more. > os.remove(TMP_BINFILE) >@@ -79,7 +79,9 @@ local function generate_parsed_output(payload) > end > > local function fill_ev_type(events, symbols, event_type) >- local ev_type = {} >+ local ev_type = { >+ trace = {}, >+ } > for _, event in pairs(events[event_type]) do > local addr = event.loc.addr > local traceno = event.loc.traceno >@@ -87,7 +89,7 @@ local function fill_ev_type(events, symbols, event_type) > if traceno ~= 0 and symbols.trace[traceno] then > local trace_loc = symbols.trace[traceno].sym_loc > addr = trace_loc.addr >- ev_type[trace_loc.line] = { >+ ev_type.trace[traceno] = { > name = string.format("TRACE [%d] %s:%d", > traceno, symbols.lfunc[addr].source, symbols.lfunc[addr].linedefined > ), >@@ -99,9 +101,13 @@ local function fill_ev_type(events, symbols, event_type) > num = event.num, > } > elseif symbols.lfunc[addr] then >- ev_type[event.loc.line] = { >+ local source = symbols.lfunc[addr].source >+ if ev_type[source] == nil then >+ ev_type[source] = {} >+ end >+ ev_type[source][event.loc.line] = { > name = string.format( >- "%s:%d", symbols.lfunc[addr].source, symbols.lfunc[addr].linedefined >+ "%s:%d", source, symbols.lfunc[addr].linedefined > ), > num = event.num, > } >@@ -110,23 +116,35 @@ local function fill_ev_type(events, symbols, event_type) > return ev_type > end > >-local function form_source_line(line) >- return string.format("@%s:%d", arg[0], line) >+local function form_source_line(loc) >+ return string.format("%s:%d", loc.source, loc.function_line) > end > >-local function check_alloc_report(alloc, traceno, line, function_line, nevents) >- assert(alloc[line], ("no event on line %d"):format(line)) >- local event = alloc[line] >- local expected_name >- if traceno ~= 0 then >- expected_name = string.format("TRACE [%d] ", traceno).. >- form_source_line(function_line) >+local function get_event_name_trace(loc, alloc) >+ local event = alloc.trace[loc.traceno] >+ return event, string.format("TRACE [%d] ", loc.traceno)..form_source_line(loc) >+end >+ >+local function get_event_name_lua(loc, alloc) >+ local event = alloc[loc.source][loc.line] >+ return event, form_source_line(loc) >+end >+ >+local function check_alloc_report(loc, alloc, nevents) >+ if loc.source == nil then >+ loc.source = '@'..arg[0] >+ end >+ >+ local event, name >+ if loc.traceno ~= nil then >+ event, name = get_event_name_trace(loc, alloc) > else >- expected_name = form_source_line(function_line) >+ event, name = get_event_name_lua(loc, alloc) > end >- assert(expected_name == event.name, ("got='%s', expected='%s'"):format( >+ >+ assert(name == event.name, ("got='%s', expected='%s'"):format( > event.name, >- expected_name >+ name > )) > assert(event.num == nevents, ("got=%d, expected=%d"):format( > event.num, >@@ -175,9 +193,9 @@ test:test("output", function(subtest) > -- the number of allocations. > -- 1 event - alocation of table by itself + 1 allocation > -- of array part as far it is bigger than LJ_MAX_COLOSIZE (16). >- subtest:ok(check_alloc_report(alloc, 0, 34, 32, 2)) >+ subtest:ok(check_alloc_report({line = 34, function_line = 32}, alloc, 2)) > -- 20 strings allocations. >- subtest:ok(check_alloc_report(alloc, 0, 39, 32, 20)) >+ subtest:ok(check_alloc_report({line = 39, function_line = 32}, alloc, 20)) > > -- Collect all previous allocated objects. > subtest:ok(free.INTERNAL.num == 22) >@@ -185,8 +203,8 @@ test:test("output", function(subtest) > -- Tests for leak-only option. > -- See also https://github.com/tarantool/tarantool/issues/5812. > local heap_delta = process.form_heap_delta(events, symbols) >- local tab_alloc_stats = heap_delta[form_source_line(34)] >- local str_alloc_stats = heap_delta[form_source_line(39)] >+ local tab_alloc_stats = heap_delta[form_source_line({ function_line = 34 })] >+ local str_alloc_stats = heap_delta[form_source_line({ function_line = 39 })] > subtest:ok(tab_alloc_stats.nalloc == tab_alloc_stats.nfree) > subtest:ok(tab_alloc_stats.dbytes == 0) > subtest:ok(str_alloc_stats.nalloc == str_alloc_stats.nfree) >@@ -215,6 +233,38 @@ test:test("stack-resize", function(subtest) > misc.memprof.stop() > end) > >+-- Test for extending symtab with function prototypes >+-- while profiler is running. >+test:test("symtab-enriching", function(subtest) >+ subtest:plan(2) >+ >+ local payload_str = [[ >+ local M = { >+ tmp = string.rep("1", 100) -- line 2. >+ } >+ >+ function M.payload() >+ local str = string.rep("42", 100) -- line 6. >+ end >+ >+ return M >+ ]] >+ >+ local symbols, events = generate_parsed_output(function() >+ local str_chunk = assert(loadstring(payload_str, 'str_chunk'))() >+ str_chunk.payload() >+ end) >+ >+ local alloc = fill_ev_type(events, symbols, "alloc") >+ >+ subtest:ok(check_alloc_report( >+ {source = 'str_chunk', line = 6, function_line = 5}, alloc, 1) >+ ) >+ subtest:ok(check_alloc_report( >+ {source = 'str_chunk', line = 2, function_line = 0}, alloc, 1) >+ ) >+end) >+ > -- Test profiler with enabled JIT. > jit.on() > >@@ -247,9 +297,13 @@ test:test("jit-output", function(subtest) > alloc = fill_ev_type(events, symbols, "alloc") > > -- We expect, that loop will be compiled into a trace. >- subtest:ok(check_alloc_report(alloc, 1, 37, 32, 20)) >+ subtest:ok(check_alloc_report( >+ { traceno = 1, line = 37, function_line = 32 }, alloc, 20 >+ )) > -- See same checks with jit.off(). >- subtest:ok(check_alloc_report(alloc, 0, 34, 32, 2)) >+ subtest:ok(check_alloc_report( >+ { line = 34, function_line = 32 }, alloc, 2 >+ )) > > -- Restore default JIT settings. > jit.opt.start(unpack(jit_opt_default)) >diff --git a/tools/memprof/parse.lua b/tools/memprof/parse.lua >index adc7c072..888c6636 100644 >--- a/tools/memprof/parse.lua >+++ b/tools/memprof/parse.lua >@@ -8,11 +8,14 @@ local bit = require "bit" > local band = bit.band > local lshift = bit.lshift > >+local symtab = require "utils.symtab" >+ > local string_format = string.format > > local LJM_MAGIC = "ljm" >-local LJM_CURRENT_VERSION = 0x02 >+local LJM_CURRENT_VERSION = 0x03 > >+local LJM_SYMTAB = 0x20 > local LJM_EPILOGUE_HEADER = 0x80 > > local AEVENT_ALLOC = 1 >@@ -26,9 +29,10 @@ local ASOURCE_LFUNC = lshift(2, 2) > local ASOURCE_CFUNC = lshift(3, 2) > local ASOURCE_TRACE = lshift(4, 2) > >-local ASOURCE_MASK = lshift(0x7, 2) >+local ASOURCE_MASK = lshift(0xF, 2) >+local ASOURCE_TYPE_MASK = lshift(0x7, 2) > >-local EV_HEADER_MAX = ASOURCE_TRACE + AEVENT_REALLOC >+local EV_HEADER_MAX = LJM_SYMTAB + ASOURCE_TRACE + AEVENT_REALLOC > > local M = {} > >@@ -70,7 +74,18 @@ local function id_location(addr, line, traceno) > } > end > >-local function parse_location(reader, asource) >+local function parse_location_symbols(reader, asource, symbols) >+ if asource == ASOURCE_LFUNC then >+ return id_location( >+ symtab.parse_sym_lfunc(reader, symbols), >+ reader:read_uleb128(), >+ 0 >+ ) >+ end >+ error("Unknown asource "..asource) >+end >+ >+local function parse_location_common(reader, asource) > if asource == ASOURCE_INT then > return id_location(0, 0, 0) > elseif asource == ASOURCE_CFUNC then >@@ -83,8 +98,20 @@ local function parse_location(reader, asource) > error("Unknown asource "..asource) > end > >-local function parse_alloc(reader, asource, events, heap) >- local id, loc = parse_location(reader, asource) >+local function parse_location(reader, asource, symbols) >+ local has_symbols = band(asource, LJM_SYMTAB) >+ local asource_type = band(asource, ASOURCE_TYPE_MASK) >+ local id, loc >+ if has_symbols == LJM_SYMTAB then >+ id, loc = parse_location_symbols(reader, asource_type, symbols) >+ else >+ id, loc = parse_location_common(reader, asource_type) >+ end >+ return id, loc >+end >+ >+local function parse_alloc(reader, asource, events, heap, symbols) >+ local id, loc = parse_location(reader, asource, symbols) > > local naddr = reader:read_uleb128() > local nsize = reader:read_uleb128() >@@ -99,8 +126,8 @@ local function parse_alloc(reader, asource, events, heap) > heap[naddr] = {nsize, id, loc} > end > >-local function parse_realloc(reader, asource, events, heap) >- local id, loc = parse_location(reader, asource) >+local function parse_realloc(reader, asource, events, heap, symbols) >+ local id, loc = parse_location(reader, asource, symbols) > > local oaddr = reader:read_uleb128() > local osize = reader:read_uleb128() >@@ -121,8 +148,8 @@ local function parse_realloc(reader, asource, events, heap) > heap[naddr] = {nsize, id, loc} > end > >-local function parse_free(reader, asource, events, heap) >- local id, loc = parse_location(reader, asource) >+local function parse_free(reader, asource, events, heap, symbols) >+ local id, loc = parse_location(reader, asource, symbols) > > local oaddr = reader:read_uleb128() > local osize = reader:read_uleb128() >@@ -155,7 +182,7 @@ local function ev_header_split(evh) > return band(evh, AEVENT_MASK), band(evh, ASOURCE_MASK) > end > >-local function parse_event(reader, events) >+local function parse_event(reader, events, symbols) > local ev_header = reader:read_octet() > > assert(ev_header_is_valid(ev_header), "Bad ev_header "..ev_header) >@@ -169,12 +196,12 @@ local function parse_event(reader, events) > > assert(parser, "Bad aevent "..aevent) > >- parser.parse(reader, asource, events[parser.evname], events.heap) >+ parser.parse(reader, asource, events[parser.evname], events.heap, symbols) > > return true > end > >-function M.parse(reader) >+function M.parse(reader, symbols) > local events = { > alloc = {}, > realloc = {}, >@@ -199,7 +226,7 @@ function M.parse(reader) > )) > end > >- while parse_event(reader, events) do >+ while parse_event(reader, events, symbols) do > -- Empty body. > end > >diff --git a/tools/utils/symtab.lua b/tools/utils/symtab.lua >index 0e742ee1..601c9563 100644 >--- a/tools/utils/symtab.lua >+++ b/tools/utils/symtab.lua >@@ -20,7 +20,7 @@ local SYMTAB_TRACE = 1 > local M = {} > > -- Parse a single entry in a symtab: lfunc symbol. >-local function parse_sym_lfunc(reader, symtab) >+function M.parse_sym_lfunc(reader, symtab) > local sym_addr = reader:read_uleb128() > local sym_chunk = reader:read_string() > local sym_line = reader:read_uleb128() >@@ -29,6 +29,8 @@ local function parse_sym_lfunc(reader, symtab) > source = sym_chunk, > linedefined = sym_line, > } >+ >+ return sym_addr > end > > local function parse_sym_trace(reader, symtab) >@@ -48,7 +50,7 @@ local function parse_sym_trace(reader, symtab) > end > > local parsers = { >- [SYMTAB_LFUNC] = parse_sym_lfunc, >+ [SYMTAB_LFUNC] = M.parse_sym_lfunc, > [SYMTAB_TRACE] = parse_sym_trace, > } > >-- >2.32.0 > ^ permalink raw reply [flat|nested] 6+ messages in thread
* [Tarantool-patches] [PATCH luajit v1 2/4] memprof: demangle symbols on the spot 2021-08-21 12:49 [Tarantool-patches] [PATCH luajit v1 0/4] memprof: enrich symtab when meeting new prototype Mikhail Shishatskiy via Tarantool-patches 2021-08-21 12:49 ` [Tarantool-patches] [PATCH luajit v1 1/4] " Mikhail Shishatskiy via Tarantool-patches @ 2021-08-21 12:50 ` Mikhail Shishatskiy via Tarantool-patches 2021-08-21 12:50 ` [Tarantool-patches] [PATCH luajit v1 3/4] memprof: substitute long proto names with aliases Mikhail Shishatskiy via Tarantool-patches 2021-08-21 12:50 ` [Tarantool-patches] [PATCH luajit v1 4/4] luajit: change order of modules Mikhail Shishatskiy via Tarantool-patches 3 siblings, 0 replies; 6+ messages in thread From: Mikhail Shishatskiy via Tarantool-patches @ 2021-08-21 12:50 UTC (permalink / raw) To: tarantool-patches, imun, skaplun There may be situations, when new prototype has the same address than the previously dumped one. In this case, the new symtab entry will shadow the preceeding, and if we demangle symbols when all the events are parsed, we can get irrelevant information: | Dump symtab on the start: | addr name | * 0x001 proto_1 | * 0x002 proto_2 | * 0x003 proto_3 | | Memprof running: | ... | * alloc: 0x001 proto_1 | // 0x001 proto_1 collected by the GC | // new proto loaded on the 0x001 address | * alloc: 0x001 proto_NEW | + symtab enriching -> [0x001] = proto_NEW If we demangle symbols in the end, we will get 2 allocations by proto_NEW because of the collision. But since symtab enriching is implemented, we have enough information to avoid collision while parsing events, demangling now happens on the spot, while symtab entries are not shadowed by new ones. Follows up tarantool/tarantool#5815 --- Issue: https://github.com/tarantool/tarantool/issues/5815 Branch: https://github.com/tarantool/luajit/tree/shishqa/gh-5815-enrich-symtab-when-prototype-is-allocated Tarantool branch: https://github.com/tarantool/tarantool/tree/shishqa/gh-5815-enrich-symtab-when-prototype-is-allocated tools/memprof.lua | 4 ++-- tools/memprof/humanize.lua | 16 ++++++++-------- tools/memprof/parse.lua | 26 +++++++++++++------------- tools/memprof/process.lua | 12 +++++------- 4 files changed, 28 insertions(+), 30 deletions(-) diff --git a/tools/memprof.lua b/tools/memprof.lua index 18b44fdd..496a0d78 100644 --- a/tools/memprof.lua +++ b/tools/memprof.lua @@ -102,9 +102,9 @@ local function dump(inputfile) local symbols = symtab.parse(reader) local events = memprof.parse(reader, symbols) if not leak_only then - view.profile_info(events, symbols) + view.profile_info(events) end - local dheap = process.form_heap_delta(events, symbols) + local dheap = process.form_heap_delta(events) view.leak_info(dheap) os.exit(0) end diff --git a/tools/memprof/humanize.lua b/tools/memprof/humanize.lua index 800a465e..0360362d 100644 --- a/tools/memprof/humanize.lua +++ b/tools/memprof/humanize.lua @@ -7,7 +7,7 @@ local symtab = require "utils.symtab" local M = {} -function M.render(events, symbols) +function M.render(events) local ids = {} for id, _ in pairs(events) do @@ -21,15 +21,15 @@ function M.render(events, symbols) for i = 1, #ids do local event = events[ids[i]] print(string.format("%s: %d events\t+%d bytes\t-%d bytes", - M.describe_location(symbols, event.loc), + ids[i], event.num, event.alloc, event.free )) local prim_loc = {} - for _, heap_chunk in pairs(event.primary) do - table.insert(prim_loc, M.describe_location(symbols, heap_chunk.loc)) + for id, _ in pairs(event.primary) do + table.insert(prim_loc, id) end if #prim_loc ~= 0 then table.sort(prim_loc) @@ -42,17 +42,17 @@ function M.render(events, symbols) end end -function M.profile_info(events, symbols) +function M.profile_info(events) print("ALLOCATIONS") - M.render(events.alloc, symbols) + M.render(events.alloc) print("") print("REALLOCATIONS") - M.render(events.realloc, symbols) + M.render(events.realloc) print("") print("DEALLOCATIONS") - M.render(events.free, symbols) + M.render(events.free) print("") end diff --git a/tools/memprof/parse.lua b/tools/memprof/parse.lua index 888c6636..a540360b 100644 --- a/tools/memprof/parse.lua +++ b/tools/memprof/parse.lua @@ -9,6 +9,7 @@ local band = bit.band local lshift = bit.lshift local symtab = require "utils.symtab" +local humanize = require "memprof.humanize" local string_format = string.format @@ -66,12 +67,13 @@ local function link_to_previous(heap_chunk, e, nsize) end end -local function id_location(addr, line, traceno) - return string_format("f%#xl%dt%d", addr, line, traceno), { +local function id_location(addr, line, traceno, symbols) + local loc = { addr = addr, line = line, traceno = traceno, } + return humanize.describe_location(symbols, loc), loc end local function parse_location_symbols(reader, asource, symbols) @@ -79,21 +81,22 @@ local function parse_location_symbols(reader, asource, symbols) return id_location( symtab.parse_sym_lfunc(reader, symbols), reader:read_uleb128(), - 0 + 0, + symbols ) end error("Unknown asource "..asource) end -local function parse_location_common(reader, asource) +local function parse_location_common(reader, asource, symbols) if asource == ASOURCE_INT then - return id_location(0, 0, 0) + return id_location(0, 0, 0, symbols) elseif asource == ASOURCE_CFUNC then - return id_location(reader:read_uleb128(), 0, 0) + return id_location(reader:read_uleb128(), 0, 0, symbols) elseif asource == ASOURCE_LFUNC then - return id_location(reader:read_uleb128(), reader:read_uleb128(), 0) + return id_location(reader:read_uleb128(), reader:read_uleb128(), 0, symbols) elseif asource == ASOURCE_TRACE then - return id_location(reader:read_uleb128(), 0, reader:read_uleb128()) + return id_location(reader:read_uleb128(), 0, reader:read_uleb128(), symbols) end error("Unknown asource "..asource) end @@ -101,13 +104,10 @@ end local function parse_location(reader, asource, symbols) local has_symbols = band(asource, LJM_SYMTAB) local asource_type = band(asource, ASOURCE_TYPE_MASK) - local id, loc if has_symbols == LJM_SYMTAB then - id, loc = parse_location_symbols(reader, asource_type, symbols) - else - id, loc = parse_location_common(reader, asource_type) + return parse_location_symbols(reader, asource_type, symbols) end - return id, loc + return parse_location_common(reader, asource_type, symbols) end local function parse_alloc(reader, asource, events, heap, symbols) diff --git a/tools/memprof/process.lua b/tools/memprof/process.lua index f277ed84..0a39d333 100644 --- a/tools/memprof/process.lua +++ b/tools/memprof/process.lua @@ -2,9 +2,7 @@ local M = {} -local humanize = require "memprof.humanize" - -function M.form_heap_delta(events, symbols) +function M.form_heap_delta(events) -- Auto resurrects source event lines for counting/reporting. local dheap = setmetatable({}, {__index = function(t, line) rawset(t, line, { @@ -15,9 +13,9 @@ function M.form_heap_delta(events, symbols) return t[line] end}) - for _, event in pairs(events.alloc) do + for id, event in pairs(events.alloc) do if event.loc then - local ev_line = humanize.describe_location(symbols, event.loc) + local ev_line = id if (event.alloc > 0) then dheap[ev_line].dbytes = dheap[ev_line].dbytes + event.alloc @@ -36,8 +34,8 @@ function M.form_heap_delta(events, symbols) -- Realloc and free events always have key named "primary" -- that references the table with memory changed -- (may be empty). - for _, heap_chunk in pairs(event.primary) do - local ev_line = humanize.describe_location(symbols, heap_chunk.loc) + for id, heap_chunk in pairs(event.primary) do + local ev_line = id if (heap_chunk.alloced > 0) then dheap[ev_line].dbytes = dheap[ev_line].dbytes + heap_chunk.alloced -- 2.32.0 ^ permalink raw reply [flat|nested] 6+ messages in thread
* [Tarantool-patches] [PATCH luajit v1 3/4] memprof: substitute long proto names with aliases 2021-08-21 12:49 [Tarantool-patches] [PATCH luajit v1 0/4] memprof: enrich symtab when meeting new prototype Mikhail Shishatskiy via Tarantool-patches 2021-08-21 12:49 ` [Tarantool-patches] [PATCH luajit v1 1/4] " Mikhail Shishatskiy via Tarantool-patches 2021-08-21 12:50 ` [Tarantool-patches] [PATCH luajit v1 2/4] memprof: demangle symbols on the spot Mikhail Shishatskiy via Tarantool-patches @ 2021-08-21 12:50 ` Mikhail Shishatskiy via Tarantool-patches 2021-08-21 12:50 ` [Tarantool-patches] [PATCH luajit v1 4/4] luajit: change order of modules Mikhail Shishatskiy via Tarantool-patches 3 siblings, 0 replies; 6+ messages in thread From: Mikhail Shishatskiy via Tarantool-patches @ 2021-08-21 12:50 UTC (permalink / raw) To: tarantool-patches, imun, skaplun Sometimes a loaded chunk name can be multiline (actually, it is the Lua code itself). In order not to burden memprof parser output with big multiline names, aliases were introduced. The chunk name is replaced by `function_alias_N` (where N is a unique id) to be displayed in the allocation events report. All the aliases are printed in the end of parser's output under the header "ALIASES". Because of changes mentioned above, the API of <utils/symtab.lua> changed: now symtab has additional `alias` assotiative table for storing aliases and `alias_count` counter to store number of aliases. Follows up tarantool/tarantool#5815 --- Issue: https://github.com/tarantool/tarantool/issues/5815 Branch: https://github.com/tarantool/luajit/tree/shishqa/gh-5815-enrich-symtab-when-prototype-is-allocated Tarantool branch: https://github.com/tarantool/tarantool/tree/shishqa/gh-5815-enrich-symtab-when-prototype-is-allocated tools/memprof.lua | 1 + tools/memprof/humanize.lua | 35 +++++++++++++++++++++++++++++++++++ tools/utils/symtab.lua | 14 +++++++++++++- 3 files changed, 49 insertions(+), 1 deletion(-) diff --git a/tools/memprof.lua b/tools/memprof.lua index 496a0d78..52e09fd9 100644 --- a/tools/memprof.lua +++ b/tools/memprof.lua @@ -106,6 +106,7 @@ local function dump(inputfile) end local dheap = process.form_heap_delta(events) view.leak_info(dheap) + view.aliases(symbols) os.exit(0) end diff --git a/tools/memprof/humanize.lua b/tools/memprof/humanize.lua index 0360362d..72e71080 100644 --- a/tools/memprof/humanize.lua +++ b/tools/memprof/humanize.lua @@ -5,6 +5,9 @@ local symtab = require "utils.symtab" +-- Assume that loaded string chunks have less than 9999 lines. +local ALIAS_MAX_LINE_NUMBER_LEN = 4 + local M = {} function M.render(events) @@ -98,4 +101,36 @@ function M.describe_location(symbols, loc) return symtab.demangle(symbols, loc) end +local function get_aliases(symbols) + local aliases = {} + for source, alias in pairs(symbols.alias) do + table.insert(aliases, { alias, source }) + end + table.sort(aliases, function(a, b) + return a[1] < b[1] + end) + return aliases +end + +local function format_source_prefix(lineno) + local line_str = tostring(lineno) + local line_str_len = line_str:len() + return line_str..string.rep(" ", ALIAS_MAX_LINE_NUMBER_LEN - line_str_len) +end + +function M.aliases(symbols) + if symbols.alias_count == 0 then return end + print("ALIASES:") + local aliases = get_aliases(symbols) + for _, alias in ipairs(aliases) do + print(alias[1]..":") + local lineno = 1 + for line in alias[2]:gmatch("(.-)\n") do + print(format_source_prefix(lineno)..'|'..line) + lineno = lineno + 1 + end + print("~\n") + end +end + return M diff --git a/tools/utils/symtab.lua b/tools/utils/symtab.lua index 601c9563..88fdb42e 100644 --- a/tools/utils/symtab.lua +++ b/tools/utils/symtab.lua @@ -58,6 +58,8 @@ function M.parse(reader) local symtab = { lfunc = {}, trace = {}, + alias = {}, + alias_count = 0, } local magic = reader:read_octets(3) local version = reader:read_octets(1) @@ -102,7 +104,17 @@ local function demangle_lfunc(symtab, loc) if addr == 0 then return "INTERNAL" elseif symtab.lfunc[addr] then - return string_format("%s:%d", symtab.lfunc[loc.addr].source, loc.line) + local name = symtab.lfunc[addr].source + if name:find('\n') == nil then + return string_format("%s:%d", name, loc.line) + end + if not symtab.alias[name] then + symtab.alias[name] = string_format( + "function_alias_%d", symtab.alias_count + ) + symtab.alias_count = symtab.alias_count + 1 + end + return string_format("%s:%d", symtab.alias[name], loc.line) end return string_format("CFUNC %#x", addr) end -- 2.32.0 ^ permalink raw reply [flat|nested] 6+ messages in thread
* [Tarantool-patches] [PATCH luajit v1 4/4] luajit: change order of modules 2021-08-21 12:49 [Tarantool-patches] [PATCH luajit v1 0/4] memprof: enrich symtab when meeting new prototype Mikhail Shishatskiy via Tarantool-patches ` (2 preceding siblings ...) 2021-08-21 12:50 ` [Tarantool-patches] [PATCH luajit v1 3/4] memprof: substitute long proto names with aliases Mikhail Shishatskiy via Tarantool-patches @ 2021-08-21 12:50 ` Mikhail Shishatskiy via Tarantool-patches 3 siblings, 0 replies; 6+ messages in thread From: Mikhail Shishatskiy via Tarantool-patches @ 2021-08-21 12:50 UTC (permalink / raw) To: tarantool-patches, imun, skaplun Since commit [1], <memprof/parse.lua> module depends on <memprof/humanize.lua> module, so we need to change order of these modules in <src/lua/init.c> to avoid unprotected API calls. Follows up tarantool/tarantool#5815 --- Issue: https://github.com/tarantool/tarantool/issues/5815 Branch: https://github.com/tarantool/luajit/tree/shishqa/gh-5815-enrich-symtab-when-prototype-is-allocated Tarantool branch: https://github.com/tarantool/tarantool/tree/shishqa/gh-5815-enrich-symtab-when-prototype-is-allocated src/lua/init.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/lua/init.c b/src/lua/init.c index 8ac2f3592..fcb1d2c90 100644 --- a/src/lua/init.c +++ b/src/lua/init.c @@ -126,8 +126,8 @@ extern char strict_lua[], /* tools.* libraries. */ bufread_lua[], symtab_lua[], - parse_lua[], humanize_lua[], + parse_lua[], process_lua[], memprof_lua[] ; @@ -180,8 +180,8 @@ static const char *lua_modules[] = { /* tools.* libraries. Order is important. */ "utils.bufread", bufread_lua, "utils.symtab", symtab_lua, - "memprof.parse", parse_lua, "memprof.humanize", humanize_lua, + "memprof.parse", parse_lua, "memprof.process", process_lua, "memprof", memprof_lua, NULL -- 2.32.0 ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2021-08-24 14:16 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-08-21 12:49 [Tarantool-patches] [PATCH luajit v1 0/4] memprof: enrich symtab when meeting new prototype Mikhail Shishatskiy via Tarantool-patches 2021-08-21 12:49 ` [Tarantool-patches] [PATCH luajit v1 1/4] " Mikhail Shishatskiy via Tarantool-patches 2021-08-24 14:16 ` Mikhail Shishatskiy via Tarantool-patches 2021-08-21 12:50 ` [Tarantool-patches] [PATCH luajit v1 2/4] memprof: demangle symbols on the spot Mikhail Shishatskiy via Tarantool-patches 2021-08-21 12:50 ` [Tarantool-patches] [PATCH luajit v1 3/4] memprof: substitute long proto names with aliases Mikhail Shishatskiy via Tarantool-patches 2021-08-21 12:50 ` [Tarantool-patches] [PATCH luajit v1 4/4] luajit: change order of modules Mikhail Shishatskiy via Tarantool-patches
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox