[Tarantool-patches] [PATCH luajit v1] memprof: group allocations on traces by trace number

Sergey Kaplun skaplun at tarantool.org
Wed Jul 21 14:47:28 MSK 2021


Hi! Thanks for the patch!
Please consider my comments below.

On 21.07.21, Mikhail Shishatskiy wrote:
> When luajit executes a trace, it's id is stored in virtual machine

Typo: s/luajit/LuaJIT/

Nit: suggest to change: s/it's id/trace's number/
Feel free to ignore.

Typo: s/in virtual machine state/in the virtual machine state/

> state. So, we can treat this trace number as allocation event source

Typo: /allocation event source/an allocation event source/

> in memprof and report allocation events from traces as well.
> 
> Previously all the allocations from traces were marked as INTERNAL.

Typo: s/Previously all/Previously, all/

> 
> This patch introduces functionality described above by adding new
> allocation source type named ASOURCE_TRACE. If at the moment when
> allocation event occur vmstate indicates that trace is executed,
> trace id is streamed to binary file:

Typo: s/to binary file/to a binary file/

> 
> | loc-trace := trace-no
> | trace-no  := <ULEB128>
> 
> Also, memory profiler parser was tuned to recognize this source type by

Nit: s/tuned/adjusted/
Feel free to ignore.

> extending <loc> structure: field trace, representing trace id, was
> added.
> 
> Now, for example, all the allocation events happened in trace with id 42,

Nit: s/id/number/
Feel free to ignore.

> will be reported at source 'TRACE [42]' and not at 'INTERNAL' source.

Nit: In my opinion "reported as <smth> source" fills better here.
Feel free to ignore.

I suppose that we can do a little bit more, than just report trace
number. We can also provide information about its start pc (i.e.
its start line). It is already implemented for perftools (see
`perftools_addtrace()` function in <src/lj_trace.c>).

With it, users can get a rough estimate of what trace causes allocation,
without run two tools at the moment (`jit.dump` and `memprof`).
Also, it can bring some clear about the trace if `jit.flush()` is
called during memprof running.

> 
> Closes tarantool/tarantool#5814

Nit: it is better to use "Resolves" either "Closes".
Rationale: Issue isn't closed by this commit. It will be closed after
the LuaJIT submodule is bumped in the Tarantool. But, indeed, the issue
is resolved by this commit.

> ---
> 
> Issue: https://github.com/tarantool/tarantool/issues/5814
> Branch: https://github.com/tarantool/luajit/tree/shishqa/gh-5814-group-allocations-on-trace-by-trace-number
> 
>  src/lj_memprof.c        | 16 ++++++++--------
>  src/lj_memprof.h        |  9 ++++++---
>  tools/memprof/parse.lua | 18 +++++++++++-------
>  tools/utils/symtab.lua  |  5 +++++

I suppose, that it will be nice to add some tests for the patch to check
its behaviour. Also, performance measurements are welcome! :)

You can use the benchmarks from here [1] as it was done for memprof.

>  4 files changed, 30 insertions(+), 18 deletions(-)
> 
> diff --git a/src/lj_memprof.c b/src/lj_memprof.c
> index 2c1ef3b8..0cbc0ed7 100644
> --- a/src/lj_memprof.c
> +++ b/src/lj_memprof.c
> @@ -146,6 +146,13 @@ static void memprof_write_func(struct memprof *mp, uint8_t aevent)
>      lua_assert(0);
>  }
>  
> +static void memprof_write_trace(struct memprof *mp, uint8_t aevent)
> +{
> +  struct lj_wbuf *out = &mp->out;
> +  lj_wbuf_addbyte(out, aevent | ASOURCE_TRACE);
> +  lj_wbuf_addu64(out, (uint64_t)mp->g->vmstate); /* write traceno. */

Nit: I suppose, that the comment is redundant, if we add a comment to
corresponding `memprof_writers[]` entry (see below).
Also, we prefer to avoid usage of inline comments (structure or
array definition are exceptions). Just use the previous line: see
`dump_symtab()` for example.

> +}
> +
>  static void memprof_write_hvmstate(struct memprof *mp, uint8_t aevent)
>  {
>    lj_wbuf_addbyte(&mp->out, aevent | ASOURCE_INT);
> @@ -163,14 +170,7 @@ static const memprof_writer memprof_writers[] = {
>    memprof_write_hvmstate, /* LJ_VMST_RECORD */
>    memprof_write_hvmstate, /* LJ_VMST_OPT */
>    memprof_write_hvmstate, /* LJ_VMST_ASM */
> -  /*
> -  ** XXX: In ideal world, we should report allocations from traces as well.
> -  ** But since traces must follow the semantics of the original code,
> -  ** behaviour of Lua and JITted code must match 1:1 in terms of allocations,
> -  ** which makes using memprof with enabled JIT virtually redundant.
> -  ** Hence use the stub below.
> -  */

I suppose that this comment is still valid -- we *should* report
allocations from traces, like it is done for bytecode execution via the
VM. So I suggest to adjust this comment by replacing the last sentence
with the description of what `memprof_write_trace()` does instead.

> -  memprof_write_hvmstate /* LJ_VMST_TRACE */
> +  memprof_write_trace /* LJ_VMST_TRACE */
>  };
>  
>  static void memprof_write_caller(struct memprof *mp, uint8_t aevent)
> diff --git a/src/lj_memprof.h b/src/lj_memprof.h
> index 3417475d..13125536 100644
> --- a/src/lj_memprof.h
> +++ b/src/lj_memprof.h
> @@ -69,11 +69,13 @@
>  ** event-realloc  := event-header loc? oaddr osize naddr nsize
>  ** event-free     := event-header loc? oaddr osize
>  ** event-header   := <BYTE>
> -** loc            := loc-lua | loc-c
> +** loc            := loc-lua | loc-c | loc-trace
>  ** loc-lua        := sym-addr line-no
>  ** loc-c          := sym-addr
> +** loc-trace      := trace-no
>  ** sym-addr       := <ULEB128>
>  ** line-no        := <ULEB128>
> +** trace-no       := <ULEB128>
>  ** oaddr          := <ULEB128>
>  ** naddr          := <ULEB128>
>  ** osize          := <ULEB128>
> @@ -90,8 +92,8 @@
>  **
>  ** event-header: [FUUUSSEE]

Please, update this header description too.

>  **  * EE   : 2 bits for representing allocation event type (AEVENT_*)

<snipped>

> diff --git a/tools/memprof/parse.lua b/tools/memprof/parse.lua
> index 12e2758f..2bd491c8 100644
> --- a/tools/memprof/parse.lua
> +++ b/tools/memprof/parse.lua

<snipped>

> @@ -59,20 +60,23 @@ local function link_to_previous(heap_chunk, e, nsize)
>    end
>  end
>  
> -local function id_location(addr, line)
> -  return string_format("f%#xl%d", addr, line), {
> +local function id_location(addr, line, trace)
> +  return string_format("f%#xl%dxt%d", addr, line, trace), {

Typo: s/"f%#xl%dxt%d"/"f%#xl%dt%d"/ - x symbol looks redundant here.

Nit: MB "traceno" fills better than trace.
Feel free to ignore.

>      addr = addr,
>      line = line,
> +    trace = trace,
>    }
>  end
>  
>  local function parse_location(reader, asource)
>    if asource == ASOURCE_INT then
> -    return id_location(0, 0)
> +    return id_location(0, 0, 0)
>    elseif asource == ASOURCE_CFUNC then
> -    return id_location(reader:read_uleb128(), 0)
> +    return id_location(reader:read_uleb128(), 0, 0)
>    elseif asource == ASOURCE_LFUNC then
> -    return id_location(reader:read_uleb128(), reader:read_uleb128())
> +    return id_location(reader:read_uleb128(), reader:read_uleb128(), 0)
> +  elseif asource == ASOURCE_TRACE then
> +    return id_location(0, 0, reader:read_uleb128())

Side note: The first two arguments can be used if we will dump
information about the start of a trace.

>    end
>    error("Unknown asource "..asource)
>  end
> @@ -140,7 +144,7 @@ local parsers = {
>  }
>  
>  local function ev_header_is_valid(evh)
> -  return evh <= 0x0f or evh == LJM_EPILOGUE_HEADER
> +  return evh <= 0x1f or evh == LJM_EPILOGUE_HEADER

0x1f is too much: the event header maximum value is the following:

| $ perl -E 'say sprintf("0x%x", (4<<2)+3)'
| 0x13

4<<2 is the biggest possible allocation event type (trace). And + 3
is the reallocation event type.

I suggest to create the corresponding constant for the check. It will
simplify maintenance in the future.

>  end
>  
>  -- Splits event header into event type (aka aevent = allocation
> diff --git a/tools/utils/symtab.lua b/tools/utils/symtab.lua
> index 3ed1dd13..6121177f 100644
> --- a/tools/utils/symtab.lua
> +++ b/tools/utils/symtab.lua
> @@ -75,6 +75,11 @@ end
>  
>  function M.demangle(symtab, loc)
>    local addr = loc.addr
> +  local trace = loc.trace
> +
> +  if trace ~= 0 then

Ditto, the nit about s/trace/traceno/.
Feel free to ignore.

> +    return string_format("TRACE [%d]", trace)
> +  end
>  
>    if addr == 0 then
>      return "INTERNAL"
> -- 
> 2.32.0
> 

[1]: https://github.com/LuaJIT/LuaJIT-test-cleanup/tree/master/bench

-- 
Best regards,
Sergey Kaplun


More information about the Tarantool-patches mailing list