[Tarantool-patches] [PATCH luajit v4 4/4] memprof: add info about trace start to symtab
Sergey Kaplun
skaplun at tarantool.org
Wed Nov 17 11:17:33 MSK 2021
Igor, Mikhail,
Sorry, I don't see the letter that Igor answered, so I'm replying here.
LGTM, after fixes mentioned by Igor.
On 12.11.21, Igor Munkin wrote:
> Misha,
>
> Thanks for your fixes! Please consider my answer regarding trace event
> rendering and a typo in the comment below.
>
> On 04.11.21, Mikhail Shishatskiy wrote:
> > Hi, Igor!
> > Thank you for the review.
> >
> > New commit message:
> > ============================================================
> > memprof: add info about trace start to symtab
> >
> > Allocation events that occurred on traces are currently recorded by
> > the memory profiler in the following way:
> >
> > | TRACE [<trace-no>] <trace-addr>
> >
> > This approach is not descriptive enough to understand where
> > exactly the allocation took place, as we do not know the code
> > chunk associated with the trace.
> >
> > This patch fixes the problem described above by extending the
> > symbol table with <sym-trace> entries, consisting of a trace's
> > mcode starting address, trace number, address of function proto,
> > and the line where trace recording started:
> >
> > | sym-trace := sym-header trace-no trace-addr sym-addr sym-line
> > | trace-no := <ULEB128>
> > | trace-addr := <ULEB128>
> >
> > The memory profiler parser is adjusted to recognize the entries
> > mentioned above. On top of that, the API of <utils/symtab.lua> changed:
> > now the table with symbols contains two tables: `lfunc` for Lua function
> > symbols and `trace` for trace entries.
> >
> > The demangler module has not changed, but the function
> > `describe_location` is added to the <memprof/humanize.lua> module,
> > which allows one to get a description of the trace location in the
> > format described below:
> >
> > | TRACE [<trace-no>] <trace-addr> started at @<sym-chunk>:<sym-line>
> >
> > Follows up tarantool/tarantool#5814
> > ============================================================
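
For readers unfamiliar with the <ULEB128> fields in the <sym-trace> grammar above, here is a minimal C sketch of how such an entry could be serialized and read back. It is not the code from the patch: the helper names (uleb128_encode, uleb128_decode, write_sym_trace) and the SYM_TRACE_HEADER value are hypothetical, and the sketch assumes that sym-addr and sym-line are ULEB128-encoded as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A 64-bit value needs at most ceil(64 / 7) = 10 ULEB128 bytes. */
#define ULEB128_MAX_LEN 10
/* Hypothetical header byte; the real value is defined by the patch. */
#define SYM_TRACE_HEADER 1
/* One header byte plus four ULEB128 fields. */
#define SYM_TRACE_MAX_LEN (1 + 4 * ULEB128_MAX_LEN)

/* Encode value as ULEB128 into out; return the number of bytes written. */
static size_t uleb128_encode(uint8_t *out, uint64_t value)
{
  size_t n = 0;
  assert(out != NULL);
  do {
    uint8_t byte = (uint8_t)(value & 0x7f);
    value >>= 7;
    if (value != 0)
      byte |= 0x80; /* Continuation bit: more bytes follow. */
    out[n++] = byte;
  } while (value != 0);
  return n;
}

/*
 * Decode a ULEB128 value from in[0..len). Return the number of bytes
 * consumed, or 0 if the input is truncated or longer than 10 bytes.
 */
static size_t uleb128_decode(const uint8_t *in, size_t len, uint64_t *value)
{
  uint64_t result = 0;
  unsigned shift = 0;
  size_t n = 0;
  assert(in != NULL && value != NULL);
  while (n < len && n < ULEB128_MAX_LEN) {
    uint8_t byte = in[n++];
    result |= (uint64_t)(byte & 0x7f) << shift;
    if ((byte & 0x80) == 0) {
      *value = result;
      return n;
    }
    shift += 7;
  }
  return 0;
}

/*
 * Serialize one entry in the order given by the grammar:
 * sym-header trace-no trace-addr sym-addr sym-line.
 * out must have room for SYM_TRACE_MAX_LEN bytes.
 */
static size_t write_sym_trace(uint8_t *out, uint64_t traceno,
                              uint64_t trace_addr, uint64_t sym_addr,
                              uint64_t sym_line)
{
  size_t n = 0;
  assert(out != NULL);
  out[n++] = SYM_TRACE_HEADER;
  n += uleb128_encode(out + n, traceno);
  n += uleb128_encode(out + n, trace_addr);
  n += uleb128_encode(out + n, sym_addr);
  n += uleb128_encode(out + n, sym_line);
  return n;
}
```

With these helpers, write_sym_trace(buf, 1, 128, 0, 37) emits six bytes: the header, 0x01 for the trace number, 0x80 0x01 for the address, 0x00 for the proto address, and 0x25 for the line. Small values stay compact, which is the usual reason for choosing ULEB128 in a dump format.
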
> >
> > Diff:
> > ============================================================
> > diff --git a/src/lj_memprof.c b/src/lj_memprof.c
> > index e8b2ebbc..05542052 100644
> > --- a/src/lj_memprof.c
> > +++ b/src/lj_memprof.c
> > @@ -40,7 +40,6 @@ static void dump_symtab_trace(struct lj_wbuf *out, const GCtrace *trace)
> > startpc < proto_bc(pt) + pt->sizebc);
> >
> > lineno = lj_debug_line(pt, proto_bcpos(pt, startpc));
> > - lua_assert(lineno >= 0);
> >
> > lj_wbuf_addbyte(out, SYMTAB_TRACE);
> > lj_wbuf_addu64(out, (uint64_t)trace->traceno);
> > diff --git a/test/tarantool-tests/misclib-memprof-lapi.test.lua b/test/tarantool-tests/misclib-memprof-lapi.test.lua
> > index 3003b9f5..ce667afc 100644
> > --- a/test/tarantool-tests/misclib-memprof-lapi.test.lua
> > +++ b/test/tarantool-tests/misclib-memprof-lapi.test.lua
> > @@ -92,7 +92,7 @@ local function fill_ev_type(events, symbols, event_type)
> > addr = trace_loc.addr
> > ev_type.trace[traceno] = {
> > name = string.format("TRACE [%d] %s:%d",
> > - traceno, symbols.lfunc[addr].source, symbols.lfunc[addr].linedefined
> > + traceno, symbols.lfunc[addr].source, trace_loc.line
> > ),
> > num = event.num,
> > }
> > @@ -122,7 +122,7 @@ local function check_alloc_report(alloc, location, nevents)
> > local traceno = location.traceno
> > if traceno then
> > expected_name = string.format("TRACE [%d] ", traceno)..
> > - form_source_line(location.linedefined)
> > + form_source_line(location.line)
> > event = alloc.trace[traceno]
> > else
> > expected_name = form_source_line(location.linedefined)
> > @@ -244,9 +244,7 @@ test:test("jit-output", function(subtest)
> > local free = fill_ev_type(events, symbols, "free")
> >
> > -- We expect, that loop will be compiled into a trace.
> > - subtest:ok(check_alloc_report(
> > - alloc, { traceno = 1, line = 37, linedefined = 32 }, 20
> > - ))
>
> Side note: I see that neither line nor linedefined is present in the
> previous patch, so everything is OK.
>
> > + subtest:ok(check_alloc_report(alloc, { traceno = 1, line = 37 }, 20))
> > -- See same checks with jit.off().
> > subtest:ok(check_alloc_report(alloc, { line = 34, linedefined = 32 }, 2))
> > subtest:ok(free.INTERNAL.num == 22)
> > diff --git a/tools/memprof/humanize.lua b/tools/memprof/humanize.lua
> > index 7d30f976..caab8b3a 100644
> > --- a/tools/memprof/humanize.lua
> > +++ b/tools/memprof/humanize.lua
> > @@ -7,7 +7,7 @@ local symtab = require "utils.symtab"
> >
> > local M = {}
> >
> > -function M.describe_location(symbols, loc)
> > +local function describe_location(symbols, loc)
> > if loc.traceno == 0 then
> > return symtab.demangle(symbols, loc)
> > end
> > @@ -15,7 +15,7 @@ function M.describe_location(symbols, loc)
> > local trace = symbols.trace[loc.traceno]
> >
> > -- If trace, which was remembered in the symtab, has not
> > - -- been flushed, assotiate it with a proto, where trace
> > + -- been flushed, associate it with a proto, where trace
> > -- recording started.
> > if trace and trace.addr == loc.addr then
> > return symtab.demangle(symbols, loc).." started at "..
> > @@ -38,7 +38,7 @@ function M.render(events, symbols)
> > for i = 1, #ids do
> > local event = events[ids[i]]
> > print(string.format("%s: %d events\t+%d bytes\t-%d bytes",
> > - M.describe_location(symbols, event.loc),
> > + describe_location(symbols, event.loc),
> > event.num,
> > event.alloc,
> > event.free
> > @@ -46,7 +46,7 @@ function M.render(events, symbols)
> >
> > local prim_loc = {}
> > for _, heap_chunk in pairs(event.primary) do
> > - table.insert(prim_loc, M.describe_location(symbols, heap_chunk.loc))
> > + table.insert(prim_loc, describe_location(symbols, heap_chunk.loc))
> > end
> > if #prim_loc ~= 0 then
> > table.sort(prim_loc)
> > @@ -80,7 +80,7 @@ function M.leak_info(dheap, symbols)
> > -- with enabled jit.
> > if info.dbytes > 0 then
> > table.insert(leaks, {
> > - line = M.describe_location(symbols, info.loc),
> > + line = describe_location(symbols, info.loc),
> > dbytes = info.dbytes
> > })
> > end
> > diff --git a/tools/utils/symtab.lua b/tools/utils/symtab.lua
> > index 49ebb36f..879979f8 100644
> > --- a/tools/utils/symtab.lua
> > +++ b/tools/utils/symtab.lua
> > @@ -39,6 +39,9 @@ local function parse_sym_trace(reader, symtab)
> >
> > symtab.trace[traceno] = {
> > addr = trace_addr,
> > + -- The structure <start> is the same as the one
> > + -- yielded from the <parse_location> fucntion
>
> Typo: s/fucntion/function/.
>
> > + -- in the <memprof/parse.lua> module.
> > start = {
> > addr = sym_addr,
> > line = sym_line,
> > ============================================================
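
The trace rendering discussed in this thread can be sketched in C as follows. This is an illustration only: the real logic is written in Lua in <memprof/humanize.lua> and <utils/symtab.lua>, the struct and function names here are hypothetical, and printing the trace address as 0x-prefixed hex is an assumption. The point it shows is that the "started at" tail is appended only when the symtab entry still describes the same trace, that is, when the recorded mcode address matches the address from the event.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mirror of a symtab <sym-trace> entry. */
struct sym_trace {
  uint64_t addr;     /* Trace mcode starting address. */
  const char *chunk; /* Chunk name of the proto, without the '@'. */
  uint64_t line;     /* Line where trace recording started. */
};

/*
 * Render a trace location into buf. trace may be NULL when the symtab
 * has no entry for this trace number.
 */
static void describe_trace(char *buf, size_t size, uint32_t traceno,
                           uint64_t addr, const struct sym_trace *trace)
{
  size_t len;
  assert(buf != NULL && size > 0);
  /* The base name that is always produced for a trace. */
  snprintf(buf, size, "TRACE [%" PRIu32 "] 0x%" PRIx64, traceno, addr);
  /*
   * A differing address means the trace remembered in the symtab was
   * flushed and the number was reused, so the start location is unknown.
   */
  if (trace == NULL || trace->addr != addr)
    return;
  len = strlen(buf);
  snprintf(buf + len, size - len, " started at @%s:%" PRIu64,
           trace->chunk, trace->line);
}
```

Keeping the base name and the tail as two separate steps mirrors the decomposition debated here: the base string can come from the demangler, while the tail stays with the humanizer.
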
> >
> > Issue: https://github.com/tarantool/tarantool/issues/5814
> > Branch: https://github.com/tarantool/luajit/tree/shishqa/gh-5814-group-allocations-on-trace-by-trace-number
> > CI: https://github.com/tarantool/tarantool/tree/shishqa/gh-5814-group-allocations-on-trace-by-trace-number
> >
> > On 01.11.2021 19:31, Igor Munkin wrote:
> > >Misha,
> > >
> > >Thanks for the patch! Please consider my comments below.
> > >
> > >On 29.09.21, Mikhail Shishatskiy wrote:
> > >> Trace allocation sources, recorded by the memory profiler,
> > >> were reported as
> >
> > <snipped>
> >
> > >>
> > >> | TRACE [<trace-no>] <trace-addr>
> > >>
> > >> This approach is not descriptive enough to understand where
> > >> exactly the allocation took place, as we do not know the code
> > >> chunk associated with the trace.
> > >>
> > >> This patch fixes the problem described above by extending the
> > >> symbol table with <sym-trace> entries, consisting of a trace's
> > >> mcode starting address, trace number, address of function proto,
> > >> and the line where trace recording started:
> > >>
> > >> | sym-trace := sym-header trace-no trace-addr sym-addr sym-line
> > >> | trace-no := <ULEB128>
> > >> | trace-addr := <ULEB128>
> > >>
> > >> The memory profiler parser is adjusted to recognize the entries
> > >> mentioned above. On top of that, the API of <utils/symtab.lua> changed:
> > >> now the table with symbols contains two tables: `lfunc` for Lua function
> > >> symbols and `trace` for trace entries.
> > >>
> > >> The demangler module has not changed, but the function
> > >> `describe_location` is added to the <memprof/humanize.lua> module,
> > >> which allows one to get a description of the trace location in the
> > >> format described below:
> > >>
> > >> | TRACE [<trace-no>] <trace-addr> started at @<sym-chunk>:<sym-line>
> > >>
> > >> Follows up tarantool/tarantool#5814
> > >> ---
> > >>
> > >> Issue: https://github.com/tarantool/tarantool/issues/5814
> > >> Branch: https://github.com/tarantool/luajit/tree/shishqa/gh-5814-group-allocations-on-trace-by-trace-number
> > >> CI: https://github.com/tarantool/tarantool/tree/shishqa/gh-5814-group-allocations-on-trace-by-trace-number
> > >>
> > >> src/lj_memprof.c | 43 +++++++++++++++++++
> > >> src/lj_memprof.h | 8 +++-
> > >> .../misclib-memprof-lapi.test.lua | 15 ++++---
> > >> tools/memprof.lua | 4 +-
> > >> tools/memprof/humanize.lua | 30 ++++++++++---
> > >> tools/memprof/process.lua | 9 ++--
> > >> tools/utils/symtab.lua | 31 ++++++++++---
> > >> 7 files changed, 118 insertions(+), 22 deletions(-)
> > >>
>
> <snipped>
>
> > >> diff --git a/test/tarantool-tests/misclib-memprof-lapi.test.lua b/test/tarantool-tests/misclib-memprof-lapi.test.lua
> > >> index 3f4ffea0..b9edb80d 100644
> > >> --- a/test/tarantool-tests/misclib-memprof-lapi.test.lua
> > >> +++ b/test/tarantool-tests/misclib-memprof-lapi.test.lua
>
> <snipped>
>
> > >> @@ -116,7 +120,8 @@ end
> > >> local function check_alloc_report(alloc, traceno, line, function_line, nevents)
> > >> local expected_name, event
> > >> if traceno ~= 0 then
> > >> - expected_name = string.format("TRACE [%d]", traceno)
> > >> + expected_name = string.format("TRACE [%d] ", traceno)..
> > >> + form_source_line(function_line)
> > >
> > >The output format differs from the one produced by memprof parser,
> > >doesn't it?
> >
> > Yes, because we demangle names in <fill_ev_type> function. So, we can
> > omit "started at" part to check that everything else is correct.
>
> Oh, I see... This is a bit odd, but now I get it, thanks!
>
> >
>
> <snipped>
>
> > >> diff --git a/tools/memprof/humanize.lua b/tools/memprof/humanize.lua
> > >> index 7771005d..7d30f976 100644
> > >> --- a/tools/memprof/humanize.lua
> > >> +++ b/tools/memprof/humanize.lua
> > >> @@ -7,6 +7,23 @@ local symtab = require "utils.symtab"
>
> <snipped>
>
> > >> + -- recording started.
> > >> + if trace and trace.addr == loc.addr then
> > >> + return symtab.demangle(symbols, loc).." started at "..
> > >> + symtab.demangle(symbols, trace.start)
> > >
> > >Finally, I got the thing that bothers me the most. Why do you make
> > ><describe_location> so complex? It looks like you can move all this
> > >if-else branching to <symtab.demangle> and the concatenation to the
> > ><demangle_trace> function, can't you? AFAICS, you can remove
> > ><describe_location> as a result and trace demangling will be
> > >encapsulated in scope of <demangle_trace> function. Feel free to correct
> > >me if I'm wrong.
> >
> > Initially it was implemented as you suggest now. But Sergey in his
> > review led me to believe that the "started at" part should ideologically
> > belong to the humanizer module. And I agree with that point, but maybe
> > I did not decompose things in a very good way.
>
> Em... In that case all other types (such as "INTERNAL" and "CFUNC %#x")
> should also be in the humanizer module, since this representation is
> specific to a particular output format. All in all, nobody stops you
> from moving <symtab.demangle> to the humanize module, since it's used
> only there (and needs to be used only there).
>
> BTW, Sergey is also in Cc, so he can also drop a few words regarding it.
OK, if we can interpret this as a token for a parser of our output, I'm
totally fine with leaving it in the old place.
>
> >
> > Another way to implement this is to demangle without "started at" and
> > then insert it into the demangled name. What do you think?
>
> My point is to have the whole "stringification" mess encapsulated in a
> single function (like it's almost done within <symtab.demangle>). And
> the only thing remaining outside of this function is "started at" tail.
> I hope this fits your vision regarding decomposition :)
>
> >
> > >
> > >> + end
> > >> + return symtab.demangle(symbols, loc)
> > >> +end
> > >> +
>
> <snipped>
>
> > >> --
> > >> 2.33.0
> > >>
> > >
> > >--
> > >Best regards,
> > >IM
> >
> > --
> > Best regards,
> > Mikhail Shishatskiy
>
> --
> Best regards,
> IM
--
Best regards,
Sergey Kaplun