[Tarantool-patches] [PATCH luajit v4 4/4] memprof: add info about trace start to symtab

Igor Munkin imun at tarantool.org
Mon Nov 1 19:31:14 MSK 2021


Misha,

Thanks for the patch! Please consider my comments below.

On 29.09.21, Mikhail Shishatskiy wrote:
> Trace allocation sources, recorded by the memory profiler,
> were reported as

Minor: I'd suggest rewording it the following way:
| Allocation events that occur on traces are currently recorded by the
| memory profiler the following way

> 
> | TRACE [<trace-no>] <trace-addr>
> 
> This approach is not descriptive enough to understand, where
> exactly allocation took place, as we do not know the code
> chunk, associated with the trace.
> 
> This patch fixes the problem described above by extending the
> symbol table with <sym-trace> entries, consisting of a trace's
> mcode starting address, trace number, address of function proto,
> and line, where trace recording started:
> 
> | sym-trace  := sym-header trace-no trace-addr sym-addr sym-line
> | trace-no   := <ULEB128>
> | trace-addr := <ULEB128>
> 
> The memory profiler parser is adjusted to recognize the entries
> mentioned above. On top of that, the API of <utils/symtab.lua> changed:
> now table with symbols contains two tables: `lfunc` for Lua functions
> symbols and `trace` for trace entries.
> 
> The demangler module has not changed, but the function
> `describe_location` is added to the <memprof/humanize.lua> module,
> which allows one to get a description of the trace location in the
> format described below:
> 
> | TRACE [<trace-no>] <trace-addr> started at @<sym-chunk>:<sym-line>
> 
> Follows up tarantool/tarantool#5814
> ---
> 
> Issue: https://github.com/tarantool/tarantool/issues/5814
> Branch: https://github.com/tarantool/luajit/tree/shishqa/gh-5814-group-allocations-on-trace-by-trace-number
> CI: https://github.com/tarantool/tarantool/tree/shishqa/gh-5814-group-allocations-on-trace-by-trace-number
> 
>  src/lj_memprof.c                              | 43 +++++++++++++++++++
>  src/lj_memprof.h                              |  8 +++-
>  .../misclib-memprof-lapi.test.lua             | 15 ++++---
>  tools/memprof.lua                             |  4 +-
>  tools/memprof/humanize.lua                    | 30 ++++++++++---
>  tools/memprof/process.lua                     |  9 ++--
>  tools/utils/symtab.lua                        | 31 ++++++++++---
>  7 files changed, 118 insertions(+), 22 deletions(-)
> 
> diff --git a/src/lj_memprof.c b/src/lj_memprof.c
> index 8702557f..e8b2ebbc 100644
> --- a/src/lj_memprof.c
> +++ b/src/lj_memprof.c
> @@ -28,6 +28,45 @@
>  static const unsigned char ljs_header[] = {'l', 'j', 's', LJS_CURRENT_VERSION,
>  					   0x0, 0x0, 0x0};
>  
> +#if LJ_HASJIT
> +
> +static void dump_symtab_trace(struct lj_wbuf *out, const GCtrace *trace)
> +{
> +  GCproto *pt = &gcref(trace->startpt)->pt;
> +  BCLine lineno = 0;
> +
> +  const BCIns *startpc = mref(trace->startpc, const BCIns);
> +  lua_assert(startpc >= proto_bc(pt) &&
> +             startpc < proto_bc(pt) + pt->sizebc);
> +
> +  lineno = lj_debug_line(pt, proto_bcpos(pt, startpc));
> +  lua_assert(lineno >= 0);

I suspect this assertion is always true. If it is, then what does it
check?

> +

<snipped>

> diff --git a/test/tarantool-tests/misclib-memprof-lapi.test.lua b/test/tarantool-tests/misclib-memprof-lapi.test.lua
> index 3f4ffea0..b9edb80d 100644
> --- a/test/tarantool-tests/misclib-memprof-lapi.test.lua
> +++ b/test/tarantool-tests/misclib-memprof-lapi.test.lua
> @@ -87,9 +87,13 @@ local function fill_ev_type(events, symbols, event_type)
>      local addr = event.loc.addr
>      local traceno = event.loc.traceno
>  
> -    if traceno ~= 0 then
> +    if traceno ~= 0 and symbols.trace[traceno] then
> +      local trace_loc = symbols.trace[traceno].start
> +      addr = trace_loc.addr
>        ev_type.trace[traceno] = {
> -        name = string.format("TRACE [%d]", traceno),
> +        name = string.format("TRACE [%d] %s:%d",
> +          traceno, symbols.lfunc[addr].source, symbols.lfunc[addr].linedefined

<trace_loc.line> needs to be used as the last argument to <string.format>
instead of <symbols.lfunc[addr].linedefined>, doesn't it?
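To illustrate what I mean, here is a rough standalone sketch. The symbol
table below is hand-made for the example and is not taken from the test;
only the field names follow the hunk above.

```lua
-- Sketch only: hand-made data, field names follow the hunk above.
local symbols = {
  lfunc = { [0x10] = { source = "@test.lua", linedefined = 1 } },
  trace = {
    [1] = { addr = 0x1000, start = { addr = 0x10, line = 5, traceno = 0 } },
  },
}

local traceno = 1
local trace_loc = symbols.trace[traceno].start

-- Use the line where trace recording started (trace_loc.line),
-- not the line where the enclosing function is defined.
local name = string.format("TRACE [%d] %s:%d",
  traceno, symbols.lfunc[trace_loc.addr].source, trace_loc.line
)
print(name)
```

With <linedefined> the name would point at the function definition
rather than at the line where the trace recording started.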

> +        ),
>          num = event.num,
>        }
>      elseif addr == 0 then

<snipped>

> @@ -116,7 +120,8 @@ end
>  local function check_alloc_report(alloc, traceno, line, function_line, nevents)
>    local expected_name, event
>    if traceno ~= 0 then
> -    expected_name = string.format("TRACE [%d]", traceno)
> +    expected_name = string.format("TRACE [%d] ", traceno)..
> +                    form_source_line(function_line)

The output format differs from the one produced by the memprof parser
(see the format given in the commit message), doesn't it?

>      event = alloc.trace[traceno]
>    else
>      expected_name = form_source_line(function_line)

<snipped>

> diff --git a/tools/memprof/humanize.lua b/tools/memprof/humanize.lua
> index 7771005d..7d30f976 100644
> --- a/tools/memprof/humanize.lua
> +++ b/tools/memprof/humanize.lua
> @@ -7,6 +7,23 @@ local symtab = require "utils.symtab"
>  
>  local M = {}
>  
> +function M.describe_location(symbols, loc)

There is no need to export <describe_location> from humanize.lua, so it
can be just a local function within this Lua chunk.

> +  if loc.traceno == 0 then
> +    return symtab.demangle(symbols, loc)
> +  end
> +
> +  local trace = symbols.trace[loc.traceno]
> +
> +  -- If trace, which was remembered in the symtab, has not
> +  -- been flushed, assotiate it with a proto, where trace

Typo: s/assotiate/associate/.

> +  -- recording started.
> +  if trace and trace.addr == loc.addr then
> +    return symtab.demangle(symbols, loc).." started at "..
> +           symtab.demangle(symbols, trace.start)

Finally, here is the thing that bothers me the most. Why do you make
<describe_location> so complex? It looks like you can move all this
if-else branching to <symtab.demangle> and the concatenation to a
<demangle_trace> function, can't you? AFAICS, you can then remove
<describe_location>, and trace demangling will be encapsulated within
the <demangle_trace> function. Feel free to correct me if I'm wrong.
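Roughly, I have something like the following in mind. This is only a
sketch under my own assumptions: the helper names <demangle_lfunc> and
<demangle_trace>, the "INTERNAL" and "CFUNC" fallback strings, and the
"0x%x" address formatting are placeholders, and the symbol table at the
bottom is hand-made for illustration.

```lua
-- Sketch only: helper names, fallback strings and address formatting
-- are assumptions, not code from the patch.
local M = {}

local function demangle_lfunc(symtab, loc)
  local addr = loc.addr
  if addr == 0 then
    return "INTERNAL"
  elseif symtab.lfunc[addr] then
    return string.format("%s:%d", symtab.lfunc[addr].source, loc.line)
  end
  return string.format("CFUNC 0x%x", addr)
end

local function demangle_trace(symtab, loc)
  local trace_str = string.format("TRACE [%d] 0x%x", loc.traceno, loc.addr)
  local trace = symtab.trace[loc.traceno]
  -- Mention the start location only if the trace remembered in the
  -- symtab is the one that produced the event (i.e. not flushed).
  if trace and trace.addr == loc.addr then
    return trace_str.." started at "..demangle_lfunc(symtab, trace.start)
  end
  return trace_str
end

function M.demangle(symtab, loc)
  if loc.traceno ~= 0 then
    return demangle_trace(symtab, loc)
  end
  return demangle_lfunc(symtab, loc)
end

-- Hand-made symtab, for illustration only.
local symbols = {
  lfunc = { [0x10] = { source = "@test.lua", linedefined = 1 } },
  trace = {
    [1] = { addr = 0x1000, start = { addr = 0x10, line = 5, traceno = 0 } },
  },
}
print(M.demangle(symbols, { traceno = 1, addr = 0x1000, line = 0 }))
print(M.demangle(symbols, { traceno = 0, addr = 0x10, line = 3 }))
```

This way the callers in humanize.lua keep using a single
<symtab.demangle> entry point and need not know whether a location is
a trace or a function.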

> +  end
> +  return symtab.demangle(symbols, loc)
> +end
> +

<snipped>

> diff --git a/tools/utils/symtab.lua b/tools/utils/symtab.lua
> index 85945fb2..496d8480 100644
> --- a/tools/utils/symtab.lua
> +++ b/tools/utils/symtab.lua

<snipped>

> @@ -24,18 +25,38 @@ local function parse_sym_lfunc(reader, symtab)
>    local sym_chunk = reader:read_string()
>    local sym_line = reader:read_uleb128()
>  
> -  symtab[sym_addr] = {
> +  symtab.lfunc[sym_addr] = {
>      source = sym_chunk,
>      linedefined = sym_line,
>    }
>  end
>  
> +local function parse_sym_trace(reader, symtab)
> +  local traceno = reader:read_uleb128()
> +  local trace_addr = reader:read_uleb128()
> +  local sym_addr = reader:read_uleb128()
> +  local sym_line = reader:read_uleb128()
> +
> +  symtab.trace[traceno] = {
> +    addr = trace_addr,
> +    start = {
> +      addr = sym_addr,
> +      line = sym_line,
> +      traceno = 0,

Please leave a comment noting that the structure is the same as the one
yielded by <parse_location> in memprof/parse.lua.
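For example, something along these lines. The wrapper function
<make_trace_entry> is hypothetical and exists only to make the sketch
standalone; the comment wording is just a suggestion.

```lua
-- Sketch only: make_trace_entry is a hypothetical helper used to keep
-- the example self-contained.
local function make_trace_entry(trace_addr, sym_addr, sym_line)
  return {
    addr = trace_addr,
    -- The structure is the same as the one yielded from the
    -- <parse_location> function in memprof/parse.lua, so it can be
    -- passed to the same demangling routines as an ordinary location.
    start = {
      addr = sym_addr,
      line = sym_line,
      traceno = 0,
    },
  }
end

local entry = make_trace_entry(0x1000, 0x10, 5)
print(entry.start.addr, entry.start.line, entry.start.traceno)
```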

> +    },
> +  }
> +end
> +

<snipped>

> -- 
> 2.33.0
> 

-- 
Best regards,
IM

