[Tarantool-patches] [PATCH luajit v3 4/5] memprof: extend symtab with info about traces
Mikhail Shishatskiy
m.shishatskiy at tarantool.org
Wed Sep 29 22:21:43 MSK 2021
Hi, Igor!
Thank you for the review!
On 16.09.2021 18:32, Igor Munkin wrote:
>Misha,
>
>Thanks for the patch! Why don't you squash this one patch with the
>previous one? What is the rationale to split these changes?
The upcoming patch series v4 fixes this. Now patches are split not by
loading / rendering but by loading only basic info, simple rendering /
extending symtab, descriptive rendering.
>
>On 20.08.21, Mikhail Shishatskiy wrote:
>> This patch extends the memprof symtab by adding information
>> about traces, which are present at the start of profiling.
>>
>> The symtab format changed with adding <sym-trace> entry,
>> which consists of a trace address, trace number, address
>> of function proto, and line, where trace recording started:
>
>Again, traceno should be the first, IMHO.
>
>>
>> | sym-trace := sym-header trace-addr trace-no sym-addr sym-line
>> | trace-addr := <ULEB128> > | trace-no := <ULEB128>
>>
>> Also, the memprof parser is adjusted to recognize new
>> symtab entries and associate them with allocation events
>> from traces.
>>
>> With adding traces to the symbol table the API of <utils/symtab.lua>
>> changed: now table with symbols contains two tables: `lfunc` for
>> Lua functions symbols and `trace` for trace entries.
>>
>> As trace can be associated with a function proto, <utils/humanize.lua>
>> module extended with a function `describe_location` intended
>> to be used instead of symtab.demangle. The reason is that the
>> location name can be more than demangled `loc`: for example,
>> when getting trace location name, we want to assotiate the
>> trace with a function proto, where trace recording started.
>
>Ouch, too tricky description. I guess you mean that ev_line can
>represent both allocations in VM and on trace, right?
Fixed in the upcoming patch series v4.
>
>> On the one hand, we want to get a verbose trace description
>> with demangled proto location:
>>
>> | TRACE [<trace-no>] <trace-addr> started at @<proto-name>:<proto-line>
>>
>> On the other hand, idiomatically, this should be done not by
>> the demangler module but by the humanizer module.
>
>As we discussed offline, you violate so-called MVVM or MVC patterns (I
>might be wrong in terms, so excuse me, if I am): none of the parser
>modules can depend on humanize.lua -- this is the top-most part of the
>parser, that can produce plain text, JSON, CSV or anything else. See
>more comments below.
Fixed in the upcoming patch series v4.
>
>>
>> Resolves tarantool/tarantool#5814
>> ---
>>
>> Issue: https://github.com/tarantool/tarantool/issues/5814
>> Luajit branch: https://github.com/tarantool/luajit/tree/shishqa/gh-5814-group-allocations-on-trace-by-trace-number
>> tarantool branch: https://github.com/tarantool/tarantool/tree/shishqa/gh-5814-group-allocations-on-trace-by-trace-number
>>
>> src/lj_memprof.c | 40 ++++++
>> src/lj_memprof.h | 7 +-
>> .../misclib-memprof-lapi.test.lua | 135 +++++++++++++-----
>> tools/memprof/humanize.lua | 21 ++-
>> tools/memprof/process.lua | 6 +-
>> tools/utils/symtab.lua | 45 ++++--
>> 6 files changed, 206 insertions(+), 48 deletions(-)
>>
>> diff --git a/src/lj_memprof.c b/src/lj_memprof.c
>> index fb99829d..fc5bc301 100644
>> --- a/src/lj_memprof.c
>> +++ b/src/lj_memprof.c
>> @@ -28,6 +28,42 @@
>> static const unsigned char ljs_header[] = {'l', 'j', 's', LJS_CURRENT_VERSION,
>> 0x0, 0x0, 0x0};
>>
>> +#if LJ_HASJIT
>> +
>> +static void dump_symtab_trace(struct lj_wbuf *out, const GCtrace *trace)
>> +{
>> + const GCproto *pt = &gcref(trace->startpt)->pt;
>> + BCLine lineno = -1;
>
>Why do you use -1 here?
Changed to 0 in order not to confuse. I believe that variable
initialization is important.
>
>> +
>> + const BCIns *startpc = mref(trace->startpc, const BCIns);
>> + lua_assert(startpc >= proto_bc(pt) &&
>> + startpc < proto_bc(pt) + pt->sizebc);
>> +
>> + lineno = lj_debug_line(pt, proto_bcpos(pt, startpc));
>> + lua_assert(lineno >= 0);
>> +
>> + lj_wbuf_addbyte(out, SYMTAB_TRACE);
>> + lj_wbuf_addu64(out, (uint64_t)trace->mcode);
>> + lj_wbuf_addu64(out, (uint64_t)trace->traceno);
>> + /*
>> + ** All the existing prototypes have already been dumped, so we do not
>> + ** need to repeat their dump for trace locations.
>
>Do you mean, that we don't need to dump chunk name here again? It's also
>worth to mention, that GCproto is anchored via GCtrace, so it's not
>collected until the trace is alive.
Yes, fixed the comment.
>
>> + */
>> + lj_wbuf_addu64(out, (uintptr_t)pt);
>> + lj_wbuf_addu64(out, (uint64_t)lineno);
>> +}
>
><snipped>
>
>> @@ -47,6 +83,10 @@ static void dump_symtab(struct lj_wbuf *out, const struct global_State *g)
>> lj_wbuf_addu64(out, (uint64_t)pt->firstline);
>> break;
>> }
>> + case (~LJ_TTRACE): {
>> + dump_symtab_trace(out, gco2trace(o));
>
>Minor: IMHO, it's better to put the function body right here (like it's
>done for LJ_TPROTO), but feel free to ignore.
But we need to leave the stub, when the JIT is disabled. Personally I
do not like the defines right inside the function code.
>
>> + break;
>> + }
>> default:
>> break;
>> }
>
><snipped>
>
>> diff --git a/test/tarantool-tests/misclib-memprof-lapi.test.lua b/test/tarantool-tests/misclib-memprof-lapi.test.lua
>> index dbf384ed..f84b6df0 100644
>> --- a/test/tarantool-tests/misclib-memprof-lapi.test.lua
>> +++ b/test/tarantool-tests/misclib-memprof-lapi.test.lua
>
><snipped>
>
>> @@ -52,19 +59,49 @@ local function generate_output(filename)
>> assert(res, err)
>> end
>>
>> +local function generate_parsed_output(payload)
>
>Minor: It's better to introduce this in the second patch, IMHO. Feel
>free to ignore.
Moved to the patch with test refactored.
>
>> + local res, err = pcall(generate_output, TMP_BINFILE, payload)
>> +
>> + -- Want to cleanup carefully if something went wrong.
>> + if not res then
>> + os.remove(TMP_BINFILE)
>> + error(err)
>> + end
>> +
>> + local reader = bufread.new(TMP_BINFILE)
>> + local symbols = symtab.parse(reader)
>> + local events = memprof.parse(reader)
>> +
>> + -- We don't need it any more.
>> + os.remove(TMP_BINFILE)
>> +
>> + return symbols, events
>> +end
>> +
>> local function fill_ev_type(events, symbols, event_type)
>> local ev_type = {}
>> for _, event in pairs(events[event_type]) do
>> local addr = event.loc.addr
>> - if addr == 0 then
>> + local traceno = event.loc.traceno
>> +
>> + if traceno ~= 0 and symbols.trace[traceno] then
>> + local trace_loc = symbols.trace[traceno].sym_loc
>> + addr = trace_loc.addr
>> + ev_type[trace_loc.line] = {
>> + name = string.format("TRACE [%d] %s:%d",
>> + traceno, symbols.lfunc[addr].source, symbols.lfunc[addr].linedefined
>> + ),
>> + num = event.num,
>> + }
>> + elseif addr == 0 then
>> ev_type.INTERNAL = {
>> name = "INTERNAL",
>> num = event.num,
>> }
>
>Hm. Looks like a misindent in the old code.
>
>> - elseif symbols[addr] then
>> + elseif symbols.lfunc[addr] then
>> ev_type[event.loc.line] = {
>> name = string.format(
>> - "%s:%d", symbols[addr].source, symbols[addr].linedefined
>> + "%s:%d", symbols.lfunc[addr].source, symbols.lfunc[addr].linedefined
>> ),
>> num = event.num,
>> }
>
><snipped>
>
>> @@ -179,5 +215,38 @@ test:test("stack-resize", function(subtest)
>> misc.memprof.stop()
>> end)
>>
>> +-- Test profiler with enabled JIT.
>> jit.on()
>> +
>> +test:test("jit-output", function(subtest)
>> + -- Disabled on *BSD due to #4819.
>> + if jit.os == 'BSD' then
>> + subtest:plan(1)
>
>This line is excess, since you use skip below.
The utils module uses this semantics, and if I omit the plan call,
I will fail at the end of the test, as there is "no plan".
>
>> + subtest:skip('Disabled due to #4819')
>> + return
>> + end
>> +
>> + subtest:plan(3)
>> +
>> + jit.opt.start(3, "hotloop=10")
>> + jit.flush()
>> +
>> + -- Pregenerate traces to fill symtab entries in the next run.
>> + default_payload()
>> +
>> + local symbols, events = generate_parsed_output(default_payload)
>> +
>> + local alloc = fill_ev_type(events, symbols, "alloc")
>> + local free = fill_ev_type(events, symbols, "free")
>> +
>> + -- We expect, that loop will be compiled into a trace.
>> + subtest:ok(check_alloc_report(alloc, 1, 37, 32, 20))
>> + -- See same checks with jit.off().
>> + subtest:ok(check_alloc_report(alloc, 0, 34, 32, 2))
>> + subtest:ok(free.INTERNAL.num == 22)
>> +
>> + -- Restore default JIT settings.
>> + jit.opt.start(unpack(jit_opt_default))
>> +end)
>> +
>> os.exit(test:check() and 0 or 1)
>
><snipped>
>
>> diff --git a/tools/memprof/process.lua b/tools/memprof/process.lua
>> index 0bcb965b..f277ed84 100644
>> --- a/tools/memprof/process.lua
>> +++ b/tools/memprof/process.lua
>> @@ -2,7 +2,7 @@
>>
>> local M = {}
>>
>> -local symtab = require "utils.symtab"
>> +local humanize = require "memprof.humanize"
>>
>> function M.form_heap_delta(events, symbols)
>> -- Auto resurrects source event lines for counting/reporting.
>> @@ -17,7 +17,7 @@ function M.form_heap_delta(events, symbols)
>>
>> for _, event in pairs(events.alloc) do
>> if event.loc then
>> - local ev_line = symtab.demangle(symbols, event.loc)
>> + local ev_line = humanize.describe_location(symbols, event.loc)
>
<snipped>
>
>>
>> if (event.alloc > 0) then
>> dheap[ev_line].dbytes = dheap[ev_line].dbytes + event.alloc
>> @@ -37,7 +37,7 @@ function M.form_heap_delta(events, symbols)
>> -- that references the table with memory changed
>> -- (may be empty).
>> for _, heap_chunk in pairs(event.primary) do
>> - local ev_line = symtab.demangle(symbols, heap_chunk.loc)
>> + local ev_line = humanize.describe_location(symbols, heap_chunk.loc)
>>
>> if (heap_chunk.alloced > 0) then
>> dheap[ev_line].dbytes = dheap[ev_line].dbytes + heap_chunk.alloced
>> diff --git a/tools/utils/symtab.lua b/tools/utils/symtab.lua
>> index 3ed1dd13..0e742ee1 100644
>> --- a/tools/utils/symtab.lua
>> +++ b/tools/utils/symtab.lua
>
><snipped>
>
>> @@ -24,18 +25,38 @@ local function parse_sym_lfunc(reader, symtab)
>> local sym_chunk = reader:read_string()
>> local sym_line = reader:read_uleb128()
>>
>> - symtab[sym_addr] = {
>> + symtab.lfunc[sym_addr] = {
>> source = sym_chunk,
>> linedefined = sym_line,
>> }
>> end
>>
>> +local function parse_sym_trace(reader, symtab)
>> + local trace_addr = reader:read_uleb128()
>> + local traceno = reader:read_uleb128()
>> + local sym_addr = reader:read_uleb128()
>> + local sym_line = reader:read_uleb128()
>> +
>> + symtab.trace[traceno] = {
>> + addr = trace_addr,
>> + sym_loc = {
>
>I propose to rename <sym_loc> to <start>.
Fixed in the upcoming patch series v4.
>
>> + addr = sym_addr,
>> + line = sym_line,
>
>It's better to use <linedefined> instead of <line> here.
Here I want to follow the semantics of the <loc> structure
in order to pass it to the symtab.demangle.
>
>> + traceno = 0,
>
>Why do you need this?
See explanation above (about <loc>).
>
>> + },
>> + }
>> +end
>> +
>> local parsers = {
>> [SYMTAB_LFUNC] = parse_sym_lfunc,
>> + [SYMTAB_TRACE] = parse_sym_trace,
>> }
>>
>> function M.parse(reader)
>> - local symtab = {}
>> + local symtab = {
>> + lfunc = {},
>> + trace = {},
>> + }
>> local magic = reader:read_octets(3)
>> local version = reader:read_octets(1)
>>
>> @@ -73,18 +94,26 @@ function M.parse(reader)
>> return symtab
>> end
>>
>> -function M.demangle(symtab, loc)
>> +local function demangle_lfunc(symtab, loc)
>> local addr = loc.addr
>>
>> if addr == 0 then
>> return "INTERNAL"
>> + elseif symtab.lfunc[addr] then
>> + return string_format("%s:%d", symtab.lfunc[loc.addr].source, loc.line)
>
>Why do you use here <loc.addr> instead of <addr>?
My bad, fixed.
>
>> end
>> + return string_format("CFUNC %#x", addr)
>> +end
>>
>> - if symtab[addr] then
>> - return string_format("%s:%d", symtab[addr].source, loc.line)
>> - end
>> +local function demangle_trace(loc)
>> + return string_format("TRACE [%d] 0x%x", loc.traceno, loc.addr)
>> +end
>>
>> - return string_format("CFUNC %#x", addr)
>> +function M.demangle(symtab, loc)
>> + if loc.traceno ~= 0 then
>> + return demangle_trace(loc)
>> + end
>> + return demangle_lfunc(symtab, loc)
>> end
>>
>> return M
>> --
>> 2.32.0
>>
>
>--
>Best regards,
>IM
Best regards,
Mikhail Shishatskiy
More information about the Tarantool-patches
mailing list