[Tarantool-patches] [PATCH luajit v4 4/4] memprof: add info about trace start to symtab
Igor Munkin
imun at tarantool.org
Fri Nov 12 16:34:21 MSK 2021
Misha,
Thanks for your fixes! Please consider my answer regarding trace event
rendering and a typo in the comment below.
On 04.11.21, Mikhail Shishatskiy wrote:
> Hi, Igor!
> Thank you for the review.
>
> New commit message:
> ============================================================
> memprof: add info about trace start to symtab
>
> Allocation events occurred on traces are recorded by the memory
> profiler the following way now:
>
> | TRACE [<trace-no>] <trace-addr>
>
> This approach is not descriptive enough to understand, where
> exactly allocation took place, as we do not know the code
> chunk, associated with the trace.
>
> This patch fixes the problem described above by extending the
> symbol table with <sym-trace> entries, consisting of a trace's
> mcode starting address, trace number, address of function proto,
> and line, where trace recording started:
>
> | sym-trace := sym-header trace-no trace-addr sym-addr sym-line
> | trace-no := <ULEB128>
> | trace-addr := <ULEB128>
>
> The memory profiler parser is adjusted to recognize the entries
> mentioned above. On top of that, the API of <utils/symtab.lua> changed:
> now table with symbols contains two tables: `lfunc` for Lua functions
> symbols and `trace` for trace entries.
>
> The demangler module has not changed, but the function
> `describe_location` is added to the <memprof/humanize.lua> module,
> which allows one to get a description of the trace location in the
> format described below:
>
> | TRACE [<trace-no>] <trace-addr> started at @<sym-chunk>:<sym-line>
>
> Follows up tarantool/tarantool#5814
> ============================================================
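To make the <sym-trace> layout above concrete, here is a minimal sketch of how such an entry could be read. It assumes that sym-addr and sym-line are ULEB128-encoded like trace-no and trace-addr, and that sym-header has already been consumed by the caller. <new_reader> is a hypothetical stand-in over a Lua string, not the real reader used by <utils/symtab.lua>.

```lua
-- Hypothetical byte reader over a Lua string (an assumption for this
-- sketch; the real parser uses its own buffered reader).
local function new_reader(buf)
  local pos = 0
  return {
    read_octet = function()
      pos = pos + 1
      return assert(buf:byte(pos), "unexpected end of buffer")
    end,
  }
end

-- Decode one ULEB128 value: 7 payload bits per octet, least
-- significant group first, high bit set on every octet but the last.
local function read_uleb128(reader)
  local value, mult = 0, 1
  while true do
    local octet = reader.read_octet()
    value = value + (octet % 128) * mult
    if octet < 128 then
      return value
    end
    mult = mult * 128
  end
end

-- Read the fields of a <sym-trace> entry in the order given by the
-- grammar: trace-no trace-addr sym-addr sym-line.
local function parse_sym_trace(reader, symtab)
  local traceno = read_uleb128(reader)
  local trace_addr = read_uleb128(reader)
  local sym_addr = read_uleb128(reader)
  local sym_line = read_uleb128(reader)

  symtab.trace[traceno] = {
    addr = trace_addr,
    start = {
      addr = sym_addr,
      line = sym_line,
      traceno = 0,
    },
  }
end
```

Arithmetic is used instead of the <bit> library so that the sketch does not depend on LuaJIT extensions. Values wider than 53 bits would lose precision with plain Lua 5.1 numbers, so this is for illustration only.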
>
> Diff:
> ============================================================
> diff --git a/src/lj_memprof.c b/src/lj_memprof.c
> index e8b2ebbc..05542052 100644
> --- a/src/lj_memprof.c
> +++ b/src/lj_memprof.c
> @@ -40,7 +40,6 @@ static void dump_symtab_trace(struct lj_wbuf *out, const GCtrace *trace)
> startpc < proto_bc(pt) + pt->sizebc);
>
> lineno = lj_debug_line(pt, proto_bcpos(pt, startpc));
> - lua_assert(lineno >= 0);
>
> lj_wbuf_addbyte(out, SYMTAB_TRACE);
> lj_wbuf_addu64(out, (uint64_t)trace->traceno);
> diff --git a/test/tarantool-tests/misclib-memprof-lapi.test.lua b/test/tarantool-tests/misclib-memprof-lapi.test.lua
> index 3003b9f5..ce667afc 100644
> --- a/test/tarantool-tests/misclib-memprof-lapi.test.lua
> +++ b/test/tarantool-tests/misclib-memprof-lapi.test.lua
> @@ -92,7 +92,7 @@ local function fill_ev_type(events, symbols, event_type)
> addr = trace_loc.addr
> ev_type.trace[traceno] = {
> name = string.format("TRACE [%d] %s:%d",
> - traceno, symbols.lfunc[addr].source, symbols.lfunc[addr].linedefined
> + traceno, symbols.lfunc[addr].source, trace_loc.line
> ),
> num = event.num,
> }
> @@ -122,7 +122,7 @@ local function check_alloc_report(alloc, location, nevents)
> local traceno = location.traceno
> if traceno then
> expected_name = string.format("TRACE [%d] ", traceno)..
> - form_source_line(location.linedefined)
> + form_source_line(location.line)
> event = alloc.trace[traceno]
> else
> expected_name = form_source_line(location.linedefined)
> @@ -244,9 +244,7 @@ test:test("jit-output", function(subtest)
> local free = fill_ev_type(events, symbols, "free")
>
> -- We expect, that loop will be compiled into a trace.
> - subtest:ok(check_alloc_report(
> - alloc, { traceno = 1, line = 37, linedefined = 32 }, 20
> - ))
Side note: I see that neither <line> nor <linedefined> appears in the
previous patch, so everything is OK.
> + subtest:ok(check_alloc_report(alloc, { traceno = 1, line = 37 }, 20))
> -- See same checks with jit.off().
> subtest:ok(check_alloc_report(alloc, { line = 34, linedefined = 32 }, 2))
> subtest:ok(free.INTERNAL.num == 22)
> diff --git a/tools/memprof/humanize.lua b/tools/memprof/humanize.lua
> index 7d30f976..caab8b3a 100644
> --- a/tools/memprof/humanize.lua
> +++ b/tools/memprof/humanize.lua
> @@ -7,7 +7,7 @@ local symtab = require "utils.symtab"
>
> local M = {}
>
> -function M.describe_location(symbols, loc)
> +local function describe_location(symbols, loc)
> if loc.traceno == 0 then
> return symtab.demangle(symbols, loc)
> end
> @@ -15,7 +15,7 @@ function M.describe_location(symbols, loc)
> local trace = symbols.trace[loc.traceno]
>
> -- If trace, which was remembered in the symtab, has not
> - -- been flushed, assotiate it with a proto, where trace
> + -- been flushed, associate it with a proto, where trace
> -- recording started.
> if trace and trace.addr == loc.addr then
> return symtab.demangle(symbols, loc).." started at "..
> @@ -38,7 +38,7 @@ function M.render(events, symbols)
> for i = 1, #ids do
> local event = events[ids[i]]
> print(string.format("%s: %d events\t+%d bytes\t-%d bytes",
> - M.describe_location(symbols, event.loc),
> + describe_location(symbols, event.loc),
> event.num,
> event.alloc,
> event.free
> @@ -46,7 +46,7 @@ function M.render(events, symbols)
>
> local prim_loc = {}
> for _, heap_chunk in pairs(event.primary) do
> - table.insert(prim_loc, M.describe_location(symbols, heap_chunk.loc))
> + table.insert(prim_loc, describe_location(symbols, heap_chunk.loc))
> end
> if #prim_loc ~= 0 then
> table.sort(prim_loc)
> @@ -80,7 +80,7 @@ function M.leak_info(dheap, symbols)
> -- with enabled jit.
> if info.dbytes > 0 then
> table.insert(leaks, {
> - line = M.describe_location(symbols, info.loc),
> + line = describe_location(symbols, info.loc),
> dbytes = info.dbytes
> })
> end
> diff --git a/tools/utils/symtab.lua b/tools/utils/symtab.lua
> index 49ebb36f..879979f8 100644
> --- a/tools/utils/symtab.lua
> +++ b/tools/utils/symtab.lua
> @@ -39,6 +39,9 @@ local function parse_sym_trace(reader, symtab)
>
> symtab.trace[traceno] = {
> addr = trace_addr,
> + -- The structure <start> is the same as the one
> + -- yielded from the <parse_location> fucntion
Typo: s/fucntion/function/.
> + -- in the <memprof/parse.lua> module.
> start = {
> addr = sym_addr,
> line = sym_line,
> ============================================================
>
> Issue: https://github.com/tarantool/tarantool/issues/5814
> Branch: https://github.com/tarantool/luajit/tree/shishqa/gh-5814-group-allocations-on-trace-by-trace-number
> CI: https://github.com/tarantool/tarantool/tree/shishqa/gh-5814-group-allocations-on-trace-by-trace-number
>
> On 01.11.2021 19:31, Igor Munkin wrote:
> >Misha,
> >
> >Thanks for the patch! Please consider my comments below.
> >
> >On 29.09.21, Mikhail Shishatskiy wrote:
> >> Trace allocation sources, recorded by the memory profiler,
> >> were reported as
>
> <snipped>
>
> >>
> >> | TRACE [<trace-no>] <trace-addr>
> >>
> >> This approach is not descriptive enough to understand, where
> >> exactly allocation took place, as we do not know the code
> >> chunk, associated with the trace.
> >>
> >> This patch fixes the problem described above by extending the
> >> symbol table with <sym-trace> entries, consisting of a trace's
> >> mcode starting address, trace number, address of function proto,
> >> and line, where trace recording started:
> >>
> >> | sym-trace := sym-header trace-no trace-addr sym-addr sym-line
> >> | trace-no := <ULEB128>
> >> | trace-addr := <ULEB128>
> >>
> >> The memory profiler parser is adjusted to recognize the entries
> >> mentioned above. On top of that, the API of <utils/symtab.lua> changed:
> >> now table with symbols contains two tables: `lfunc` for Lua functions
> >> symbols and `trace` for trace entries.
> >>
> >> The demangler module has not changed, but the function
> >> `describe_location` is added to the <memprof/humanize.lua> module,
> >> which allows one to get a description of the trace location in the
> >> format described below:
> >>
> >> | TRACE [<trace-no>] <trace-addr> started at @<sym-chunk>:<sym-line>
> >>
> >> Follows up tarantool/tarantool#5814
> >> ---
> >>
> >> Issue: https://github.com/tarantool/tarantool/issues/5814
> >> Branch: https://github.com/tarantool/luajit/tree/shishqa/gh-5814-group-allocations-on-trace-by-trace-number
> >> CI: https://github.com/tarantool/tarantool/tree/shishqa/gh-5814-group-allocations-on-trace-by-trace-number
> >>
> >> src/lj_memprof.c | 43 +++++++++++++++++++
> >> src/lj_memprof.h | 8 +++-
> >> .../misclib-memprof-lapi.test.lua | 15 ++++---
> >> tools/memprof.lua | 4 +-
> >> tools/memprof/humanize.lua | 30 ++++++++++---
> >> tools/memprof/process.lua | 9 ++--
> >> tools/utils/symtab.lua | 31 ++++++++++---
> >> 7 files changed, 118 insertions(+), 22 deletions(-)
> >>
<snipped>
> >> diff --git a/test/tarantool-tests/misclib-memprof-lapi.test.lua b/test/tarantool-tests/misclib-memprof-lapi.test.lua
> >> index 3f4ffea0..b9edb80d 100644
> >> --- a/test/tarantool-tests/misclib-memprof-lapi.test.lua
> >> +++ b/test/tarantool-tests/misclib-memprof-lapi.test.lua
<snipped>
> >> @@ -116,7 +120,8 @@ end
> >> local function check_alloc_report(alloc, traceno, line, function_line, nevents)
> >> local expected_name, event
> >> if traceno ~= 0 then
> >> - expected_name = string.format("TRACE [%d]", traceno)
> >> + expected_name = string.format("TRACE [%d] ", traceno)..
> >> + form_source_line(function_line)
> >
> >The output format differs from the one produced by memprof parser,
> >doesn't it?
>
> Yes, because we demangle names in <fill_ev_type> function. So, we can
> omit "started at" part to check that everything else is correct.
Oh, I see... This is a bit odd, but now I get it, thanks!
>
<snipped>
> >> diff --git a/tools/memprof/humanize.lua b/tools/memprof/humanize.lua
> >> index 7771005d..7d30f976 100644
> >> --- a/tools/memprof/humanize.lua
> >> +++ b/tools/memprof/humanize.lua
> >> @@ -7,6 +7,23 @@ local symtab = require "utils.symtab"
<snipped>
> >> + -- recording started.
> >> + if trace and trace.addr == loc.addr then
> >> + return symtab.demangle(symbols, loc).." started at "..
> >> + symtab.demangle(symbols, trace.start)
> >
> >Finally, I got the thing that bothers me the most. Why do you make
> ><describe_location> so complex? It looks that you can move all these
> >if-else branching to <symtab.demangle> and concatenation to
> ><demangle_trace> function, doesn't it? AFAICS, you can remove
> ><describe_location> as a result and trace demangling will be
> >encapsulated in scope of <demangle_trace> function. Feel free to correct
> >me if I'm wrong.
>
> Initially it was implemented, as you suggest now. But Sergey in his
> review led me to believe, that "started at" part should ideologically
> relate to the humanizer module. And I agree with that point, but maybe
> I decomposed things not in a very good way.
Hm... By that logic, all other types (such as "INTERNAL" and "CFUNC %#x")
should also live in the humanizer module, since this representation is
specific to a particular output format. All in all, nothing stops you
from moving <symtab.demangle> to the humanize module, since it's used
only there (and needs to be used only there).
BTW, Sergey is also in Cc, so he can drop a few words regarding this too.
>
> Another way to implement this is to demangle without "started at" and
> then insert it to the demangled name. What do you think?
My point is to have the whole "stringification" mess encapsulated in a
single function (like it's almost done within <symtab.demangle>). And
the only thing remaining outside of this function is the "started at" tail.
I hope this fits your vision regarding decomposition :)
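A rough sketch of one reading of this: all the branching lives in a single <demangle> function, and the "started at" concatenation moves into a <demangle_trace> helper. The "TRACE [%d] %#x" format string and the simplified <symbols> layout are assumptions for illustration, not the actual <utils/symtab.lua> code.

```lua
-- Forward declaration: demangle_trace calls demangle for the proto
-- where the trace recording started.
local demangle

-- Stringify a trace location, including the "started at" tail.
local function demangle_trace(symbols, loc)
  local name = string.format("TRACE [%d] %#x", loc.traceno, loc.addr)
  local trace = symbols.trace[loc.traceno]
  -- If the trace remembered in the symtab has not been flushed,
  -- associate it with the proto where its recording started.
  if trace and trace.addr == loc.addr then
    return name .. " started at " .. demangle(symbols, trace.start)
  end
  return name
end

-- Single entry point for all location kinds.
demangle = function(symbols, loc)
  if loc.traceno ~= 0 then
    return demangle_trace(symbols, loc)
  end
  local addr = loc.addr
  if addr == 0 then
    return "INTERNAL"
  end
  local lfunc = symbols.lfunc[addr]
  if lfunc then
    return string.format("%s:%d", lfunc.source, loc.line)
  end
  return string.format("CFUNC %#x", addr)
end
```

With this shape, callers such as <M.render> and <M.leak_info> would call <demangle> directly and <describe_location> would no longer be needed.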
>
> >
> >> + end
> >> + return symtab.demangle(symbols, loc)
> >> +end
> >> +
<snipped>
> >> --
> >> 2.33.0
> >>
> >
> >--
> >Best regards,
> >IM
>
> --
> Best regards,
> Mikhail Shishatskiy
--
Best regards,
IM