[Tarantool-patches] [PATCH luajit v3 3/5] memprof: dump traceno if allocate from trace

Wed Sep 29 22:21:27 MSK 2021

Hi, Igor!
Thank you for the review!

On 16.09.2021 18:32, Igor Munkin wrote:
>Misha,
>
>Thanks for the patch! Please consider my comments below.
>
>On 20.08.21, Mikhail Shishatskiy wrote:
>> When LuaJIT executes a trace, the trace number is stored in
>> the virtual machine state. So, we can treat this number as
>> an allocation event source in memprof and report allocation events
>> from traces as well.
>>
>> Previously, all the allocations from traces were marked as INTERNAL.
>>
>> This patch introduces the functionality described above by adding
>> a new allocation source type named ASOURCE_TRACE. If the VM state
>> indicates that a trace is being executed when an allocation event
>> occurs, the trace number is streamed to the binary file:
>>
>> | loc-trace  := trace-addr trace-no
>> | trace-addr := <ULEB128>
>> | trace-no   := <ULEB128>
>>
>> Also, the memory profiler parser is adjusted to recognize this
>> source type by extending <loc> structure: field <traceno>,
>> representing trace number, is added.
>
>I understand why you've chosen this order, but I don't like it. IMHO,
>the binary format should not depend much on the particular parser
>implementation. Please consider more comments below.

Fixed in the upcoming patch series v4.
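
For reference, here is a minimal standalone sketch of how the loc-trace
record quoted above could be serialized. It is only an illustration, not
the code from lj_memprof.c: the names uleb128_write() and
loc_trace_write() are hypothetical, and the field order (trace-addr,
then trace-no) simply follows the v3 grammar, so it may differ in v4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
** Encode `value` as ULEB128 into `buf` (hypothetical helper).
** Returns the number of bytes written, at most 10 for a 64-bit value.
*/
size_t uleb128_write(uint8_t *buf, uint64_t value)
{
  size_t n = 0;
  assert(buf != NULL);
  do {
    uint8_t byte = (uint8_t)(value & 0x7f); /* Low 7 bits of the value. */
    value >>= 7;
    if (value != 0)
      byte |= 0x80; /* Continuation bit: more bytes follow. */
    buf[n++] = byte;
  } while (value != 0);
  return n;
}

/*
** Serialize one loc-trace record in the v3 order:
** | loc-trace := trace-addr trace-no
** The caller must provide at least 20 bytes in `buf`.
*/
size_t loc_trace_write(uint8_t *buf, uint64_t trace_addr, uint64_t trace_no)
{
  size_t n = uleb128_write(buf, trace_addr);
  n += uleb128_write(buf + n, trace_no);
  return n;
}
```

Swapping the two uleb128_write() calls is all it takes to change the
field order, so the parser has to agree with the writer on it.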

>
>>
>> Part of tarantool/tarantool#5814
>> ---
>>
>> Issue: https://github.com/tarantool/tarantool/issues/5814
>> Luajit branch: https://github.com/tarantool/luajit/tree/shishqa/gh-5814-group-allocations-on-trace-by-trace-number
>> tarantool branch: https://github.com/tarantool/tarantool/tree/shishqa/gh-5814-group-allocations-on-trace-by-trace-number
>>
>>  src/Makefile.dep.original |  2 +-
>>  src/lj_memprof.c          | 35 +++++++++++++++++++++++++++++++++--
>>  src/lj_memprof.h          | 15 ++++++++++-----
>>  tools/memprof/parse.lua   | 22 ++++++++++++++--------
>>  4 files changed, 58 insertions(+), 16 deletions(-)
>>
>> diff --git a/src/Makefile.dep.original b/src/Makefile.dep.original
>> index f3672413..ee6bafb2 100644
>> --- a/src/Makefile.dep.original
>> +++ b/src/Makefile.dep.original
>> @@ -146,7 +146,7 @@ lj_mcode.o: lj_mcode.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
>>   lj_gc.h lj_err.h lj_errmsg.h lj_jit.h lj_ir.h lj_mcode.h lj_trace.h \
>>   lj_dispatch.h lj_bc.h lj_traceerr.h lj_vm.h
>>  lj_memprof.o: lj_memprof.c lj_arch.h lua.h luaconf.h lj_memprof.h \
>> - lj_def.h lj_wbuf.h lj_obj.h lj_frame.h lj_bc.h lj_debug.h
>> + lj_def.h lj_wbuf.h lj_obj.h lj_frame.h lj_bc.h lj_debug.h lj_dispatch.h
>
>It looks like some headers are missing (it's better to use <make depend>
>from Makefile.original to check yourself).

Fixed in the upcoming patch series v4.

>
>>  lj_meta.o: lj_meta.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h lj_gc.h \
>>   lj_err.h lj_errmsg.h lj_buf.h lj_str.h lj_tab.h lj_meta.h lj_frame.h \
>>   lj_bc.h lj_vm.h lj_strscan.h lj_strfmt.h lj_lib.h
>> diff --git a/src/lj_memprof.c b/src/lj_memprof.c
>> index 2c1ef3b8..fb99829d 100644
>> --- a/src/lj_memprof.c
>> +++ b/src/lj_memprof.c
>
><snipped>
>
>> @@ -168,9 +197,11 @@ static const memprof_writer memprof_writers[] = {
>>    ** But since traces must follow the semantics of the original code,
>>    ** behaviour of Lua and JITted code must match 1:1 in terms of allocations,
>>    ** which makes using memprof with enabled JIT virtually redundant.
>> -  ** Hence use the stub below.
>> +  ** But if one wants to investigate allocations with JIT enabled,
>> +  ** memprof_write_trace() dumps trace number to the binary output.
>
>Typo: number and mcode starting address, right?

Fixed in the upcoming patch series v4.

>
>> +  ** It can be useful to compare with jit.v or jit.dump outputs.
>>    */
>> -  memprof_write_hvmstate /* LJ_VMST_TRACE */
>> +  memprof_write_trace /* LJ_VMST_TRACE */
>>  };
>>
>>  static void memprof_write_caller(struct memprof *mp, uint8_t aevent)
>> diff --git a/src/lj_memprof.h b/src/lj_memprof.h
>> index 3417475d..6a35385d 100644
>> --- a/src/lj_memprof.h
>> +++ b/src/lj_memprof.h
>> @@ -51,9 +51,10 @@
>>  */
>>
>>  #define SYMTAB_LFUNC ((uint8_t)0)
>> +#define SYMTAB_TRACE ((uint8_t)1)
>
>This looks related to the next patch, doesn't it?

Fixed in the upcoming patch series v4.

>
>>  #define SYMTAB_FINAL ((uint8_t)0x80)
>>
>> -#define LJM_CURRENT_FORMAT_VERSION 0x01
>> +#define LJM_CURRENT_FORMAT_VERSION 0x02
>>
>>  /*
>>  ** Event stream format:
>
><snipped>
>
>> diff --git a/tools/memprof/parse.lua b/tools/memprof/parse.lua
>> index 12e2758f..adc7c072 100644
>> --- a/tools/memprof/parse.lua
>> +++ b/tools/memprof/parse.lua
>
><snipped>
>
>> @@ -24,8 +24,11 @@ local AEVENT_MASK = 0x3
>>  local ASOURCE_INT = lshift(1, 2)
>>  local ASOURCE_LFUNC = lshift(2, 2)
>>  local ASOURCE_CFUNC = lshift(3, 2)
>> +local ASOURCE_TRACE = lshift(4, 2)
>>
>> -local ASOURCE_MASK = lshift(0x3, 2)
>> +local ASOURCE_MASK = lshift(0x7, 2)
>> +
>> +local EV_HEADER_MAX = ASOURCE_TRACE + AEVENT_REALLOC
>
>Why so complex? I believe lshift(5, 2) is more clear and covers (i.e. is
>greater than) all cases of AEVENT_* and ASOURCE_*.

To me, lshift(5, 2) is less descriptive. ASOURCE_TRACE +
AEVENT_REALLOC shows the layout of flags in the header [FUUSSSEE]
                                                           ^^^^^
and gives an idea of why EV_HEADER_MAX is EV_HEADER_MAX :)
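
To illustrate the point, here is a small C sketch of the header layout.
The ASOURCE_* values and both masks mirror the parse.lua hunk above. The
AEVENT_* values (ALLOC = 1, FREE = 2, REALLOC = ALLOC | FREE) are my
assumption about the definitions that are not shown in this hunk, and
the ev_header_*() helpers are hypothetical names used only here.

```c
#include <assert.h>
#include <stdint.h>

enum {
  /* Event bits [......EE]. The values are assumed, see the note above. */
  AEVENT_ALLOC   = 1,
  AEVENT_FREE    = 2,
  AEVENT_REALLOC = AEVENT_ALLOC | AEVENT_FREE,
  AEVENT_MASK    = 0x3,

  /* Source bits [...SSS..], as in the parse.lua hunk. */
  ASOURCE_INT    = 1 << 2,
  ASOURCE_LFUNC  = 2 << 2,
  ASOURCE_CFUNC  = 3 << 2,
  ASOURCE_TRACE  = 4 << 2,
  ASOURCE_MASK   = 0x7 << 2,

  /* The largest header built from a known source and a known event. */
  EV_HEADER_MAX  = ASOURCE_TRACE + AEVENT_REALLOC
};

/* Combine a source and an event into one header byte. */
uint8_t ev_header_make(uint8_t asource, uint8_t aevent)
{
  assert((asource & ~ASOURCE_MASK) == 0);
  assert((aevent & ~AEVENT_MASK) == 0);
  return (uint8_t)(asource | aevent);
}

/* Extract the source bits from a header byte. */
uint8_t ev_header_source(uint8_t header)
{
  return (uint8_t)(header & ASOURCE_MASK);
}

/* Extract the event bits from a header byte. */
uint8_t ev_header_event(uint8_t header)
{
  return (uint8_t)(header & AEVENT_MASK);
}

/* Check that the header does not exceed the largest known one. */
int ev_header_in_range(uint8_t header)
{
  return header <= EV_HEADER_MAX;
}
```

Under these assumptions EV_HEADER_MAX is 19, the largest header that can
actually occur, while lshift(5, 2) is 20, an upper bound that no valid
header reaches.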

>
>>
>>  local M = {}
>>
>> @@ -59,20 +62,23 @@ local function link_to_previous(heap_chunk, e, nsize)
>>    end
>>  end
>>
>> -local function id_location(addr, line)
>> -  return string_format("f%#xl%d", addr, line), {
>> +local function id_location(addr, line, traceno)
>> +  return string_format("f%#xl%dt%d", addr, line, traceno), {
>>      addr = addr,
>>      line = line,
>> +    traceno = traceno,
>>    }
>>  end
>>
>>  local function parse_location(reader, asource)
>>    if asource == ASOURCE_INT then
>> -    return id_location(0, 0)
>> +    return id_location(0, 0, 0)
>>    elseif asource == ASOURCE_CFUNC then
>> -    return id_location(reader:read_uleb128(), 0)
>> +    return id_location(reader:read_uleb128(), 0, 0)
>>    elseif asource == ASOURCE_LFUNC then
>> -    return id_location(reader:read_uleb128(), reader:read_uleb128())
>> +    return id_location(reader:read_uleb128(), reader:read_uleb128(), 0)
>> +  elseif asource == ASOURCE_TRACE then
>> +    return id_location(reader:read_uleb128(), 0, reader:read_uleb128())
>
>As a result of your changes this function becomes too "cryptic". It's
>better to refactor this function (maybe even in a separate commit), so
>we end up with something like the function below.

Refactored in the upcoming patch series v4.

>
>| local function id(params)
>|   return string_format("f%#xl%dt%d", params.addr, params.line, params.traceno)
>| end
>|
>| local function parse_location(reader, asource)
>|   local location = { addr = 0, line = 0, traceno = 0 }
>|   if asource == ASOURCE_INT then
>|     -- Do nothing
>|   elseif asource == ASOURCE_CFUNC then
>|     location.addr = reader:read_uleb128()
>|   elseif asource == ASOURCE_LFUNC then
>|     location.addr = reader:read_uleb128()
>|     location.line = reader:read_uleb128()
>|   elseif asource == ASOURCE_TRACE then
>|     location.traceno = reader:read_uleb128()
>|     location.addr = reader:read_uleb128()
>|   else
>|     error("Unknown asource "..asource)
>|   end
>|   return id(location), location
>| end
>
>You can also make this function public and move it to utils.lua module.
>
>BTW, these entries are "loaded" but not "rendered" in the final output
>now, aren't they? In other words, why don't you make everything in a
>single patch?

My bad, I split the changes in quite a strange way. The new patch series
makes it more "natural": simple rendering is moved into this patch, and
the more complex "started at ..." rendering is added in another patch.
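
As a cross-check of the location parsing discussed above, here is a
standalone C sketch of the same dispatch on the allocation source. It is
an illustration only: struct reader, struct location, read_uleb128() and
parse_location() are hypothetical C stand-ins for the Lua parser, and
the ASOURCE_TRACE branch reads the fields in the v3 order (trace-addr,
then trace-no), which may change in v4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
  ASOURCE_INT   = 1 << 2,
  ASOURCE_LFUNC = 2 << 2,
  ASOURCE_CFUNC = 3 << 2,
  ASOURCE_TRACE = 4 << 2
};

/* Hypothetical cursor over an in-memory event stream. */
struct reader {
  const uint8_t *pos;
  const uint8_t *end;
};

struct location {
  uint64_t addr;
  uint64_t line;
  uint64_t traceno;
};

/* Decode one ULEB128 value and advance the reader. */
uint64_t read_uleb128(struct reader *r)
{
  uint64_t value = 0;
  unsigned shift = 0;
  for (;;) {
    uint8_t byte;
    assert(r->pos < r->end); /* Truncated input is a bug in this sketch. */
    assert(shift < 64);      /* More than 10 bytes does not fit 64 bits. */
    byte = *r->pos++;
    value |= (uint64_t)(byte & 0x7f) << shift;
    if ((byte & 0x80) == 0)
      break; /* No continuation bit: this was the last byte. */
    shift += 7;
  }
  return value;
}

/* Fill `loc` for the given source. Returns 0 on success, -1 otherwise. */
int parse_location(struct reader *r, uint8_t asource, struct location *loc)
{
  loc->addr = 0;
  loc->line = 0;
  loc->traceno = 0;
  switch (asource) {
  case ASOURCE_INT:
    break; /* Internal allocation: nothing to read. */
  case ASOURCE_CFUNC:
    loc->addr = read_uleb128(r);
    break;
  case ASOURCE_LFUNC:
    loc->addr = read_uleb128(r);
    loc->line = read_uleb128(r);
    break;
  case ASOURCE_TRACE:
    loc->addr = read_uleb128(r);    /* trace-addr comes first in v3. */
    loc->traceno = read_uleb128(r); /* trace-no comes second in v3. */
    break;
  default:
    return -1; /* Unknown asource. */
  }
  return 0;
}
```

Filling one structure with defaults and overriding only the fields that
a source provides keeps every branch explicit about what it reads.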

>
>>    end
>>    error("Unknown asource "..asource)
>>  end
>
><snipped>
>
>> --
>> 2.32.0
>>
>
>-- 
>Best regards,
>IM

Best regards,
Mikhail Shishatskiy