[Tarantool-patches] [PATCH luajit v4 3/4] memprof: group allocations on traces by traceno
Igor Munkin
imun at tarantool.org
Wed Oct 27 16:56:55 MSK 2021
Misha,
Thanks for the patch! Please, consider the comments below.
On 29.09.21, Mikhail Shishatskiy wrote:
> When LuaJIT executes a trace, the trace number is stored in
Typo: s/in/as/.
> the virtual machine state. So, we can treat this number as
> an allocation event source in memprof and report allocation events
> from traces as well.
>
> Previously, all the allocations from traces were marked as INTERNAL.
>
> This patch introduces the functionality described above by adding
> a new allocation source type named ASOURCE_TRACE. If at the moment
> when allocation event occurs VM state indicates that trace executed,
Minor: To make the wording a bit clearer, I propose the following:
| If allocation event occurs when trace is executed, trace number...
> trace number and trace's mcode starting address streamed to a binary
Typo: s/streamed/is streamed/.
> file:
>
> | loc-trace := trace-no trace-addr
> | trace-no := <ULEB128>
> | trace-addr := <ULEB128>
>
> Also, the memory profiler parser is adjusted to recognize entries
> mentioned above. The <loc> structure is extended with field <traceno>,
> representing trace number. Trace locations are demangled as
>
> | TRACE [<trace-no>] <trace-addr>
>
> Resolves tarantool/tarantool#5814
Does this patch "resolve" the issue or only being a "part of" it?
> ---
>
> Issue: https://github.com/tarantool/tarantool/issues/5814
> Branch: https://github.com/tarantool/luajit/tree/shishqa/gh-5814-group-allocations-on-trace-by-trace-number
> CI: https://github.com/tarantool/tarantool/tree/shishqa/gh-5814-group-allocations-on-trace-by-trace-number
>
> src/Makefile.dep.original | 3 +-
> src/lj_memprof.c | 36 ++++++-
> src/lj_memprof.h | 14 ++-
> .../misclib-memprof-lapi.test.lua | 97 +++++++++++++++----
> tools/memprof/parse.lua | 13 ++-
> tools/utils/symtab.lua | 20 ++--
> 6 files changed, 148 insertions(+), 35 deletions(-)
>
<snipped>
> diff --git a/test/tarantool-tests/misclib-memprof-lapi.test.lua b/test/tarantool-tests/misclib-memprof-lapi.test.lua
> index 9de4bd98..3f4ffea0 100644
> --- a/test/tarantool-tests/misclib-memprof-lapi.test.lua
> +++ b/test/tarantool-tests/misclib-memprof-lapi.test.lua
<snipped>
> @@ -96,10 +113,21 @@ local function form_source_line(line)
> return string.format("@%s:%d", arg[0], line)
> end
>
> -local function check_alloc_report(alloc, line, function_line, nevents)
> - assert(form_source_line(function_line) == alloc[line].name)
> - assert(alloc[line].num == nevents, ("got=%d, expected=%d"):format(
> - alloc[line].num,
> +local function check_alloc_report(alloc, traceno, line, function_line, nevents)
OK, here we have a function with 5 parameters. Almost half of them is
used in a single particular case. Hence, I propose to change the
interface to the following:
| local function check_alloc_report(alloc, location, nevents)
In this case location is a table with either "traceno" key or with
"line" and "linedefined" keys. Such function looks better, doesn't it?
Consider the following:
| check_alloc_report(alloc, { traceno = 1 }, 20)
| check_alloc_report(alloc, { line = 34, linedefined = 32 }, 2)
> + local expected_name, event
> + if traceno ~= 0 then
> + expected_name = string.format("TRACE [%d]", traceno)
> + event = alloc.trace[traceno]
> + else
> + expected_name = form_source_line(function_line)
> + event = alloc.line[line]
> + end
> + assert(expected_name == event.name, ("got='%s', expected='%s'"):format(
> + event.name,
> + expected_name
> + ))
> + assert(event.num == nevents, ("got=%d, expected=%d"):format(
> + event.num,
> nevents
> ))
> return true
> @@ -145,18 +173,18 @@ test:test("output", function(subtest)
> -- one is the number of allocations. 1 event - alocation of
> -- table by itself + 1 allocation of array part as far it is
> -- bigger than LJ_MAX_COLOSIZE (16).
> - subtest:ok(check_alloc_report(alloc, 27, 25, 2))
> - -- 100 strings allocations.
> - subtest:ok(check_alloc_report(alloc, 32, 25, 100))
> + subtest:ok(check_alloc_report(alloc, 0, 34, 32, 2))
> + -- 20 strings allocations.
> + subtest:ok(check_alloc_report(alloc, 0, 39, 32, 20))
As a result of the change proposed above, you need no these ugly fixes
for the existing spots in the future.
>
> -- Collect all previous allocated objects.
> - subtest:ok(free.INTERNAL.num == 102)
> + subtest:ok(free.INTERNAL.num == 22)
>
> -- Tests for leak-only option.
> -- See also https://github.com/tarantool/tarantool/issues/5812.
> local heap_delta = process.form_heap_delta(events, symbols)
> - local tab_alloc_stats = heap_delta[form_source_line(27)]
> - local str_alloc_stats = heap_delta[form_source_line(32)]
> + local tab_alloc_stats = heap_delta[form_source_line(34)]
> + local str_alloc_stats = heap_delta[form_source_line(39)]
> subtest:ok(tab_alloc_stats.nalloc == tab_alloc_stats.nfree)
> subtest:ok(tab_alloc_stats.dbytes == 0)
> subtest:ok(str_alloc_stats.nalloc == str_alloc_stats.nfree)
> @@ -185,5 +213,38 @@ test:test("stack-resize", function(subtest)
> misc.memprof.stop()
> end)
>
> +-- Test profiler with enabled JIT.
> jit.on()
> +
> +test:test("jit-output", function(subtest)
> + -- Disabled on *BSD due to #4819.
> + if jit.os == 'BSD' then
> + subtest:plan(1)
> + subtest:skip('Disabled due to #4819')
> + return
> + end
> +
> + subtest:plan(3)
> +
> + jit.opt.start(3, "hotloop=10")
> + jit.flush()
> +
> + -- Pregenerate traces to fill symtab entries in the next run.
> + default_payload()
> +
> + local symbols, events = generate_parsed_output(default_payload)
> +
> + local alloc = fill_ev_type(events, symbols, "alloc")
> + local free = fill_ev_type(events, symbols, "free")
> +
> + -- We expect, that loop will be compiled into a trace.
> + subtest:ok(check_alloc_report(alloc, 1, 37, 32, 20))
What are 37 and 32 in this case? I can use 0 and 0 for both parameters
and the test passes, doesn't it? This is another argument for using
table with different number of keys.
> + -- See same checks with jit.off().
> + subtest:ok(check_alloc_report(alloc, 0, 34, 32, 2))
> + subtest:ok(free.INTERNAL.num == 22)
> +
> + -- Restore default JIT settings.
> + jit.opt.start(unpack(jit_opt_default))
> +end)
> +
> os.exit(test:check() and 0 or 1)
<snipped>
> diff --git a/tools/utils/symtab.lua b/tools/utils/symtab.lua
> index e01daa62..85945fb2 100644
> --- a/tools/utils/symtab.lua
> +++ b/tools/utils/symtab.lua
> @@ -74,21 +74,29 @@ function M.parse(reader)
> end
>
> function M.id(loc)
> - return string_format("f%#xl%d", loc.addr, loc.line)
> + return string_format("f%#xl%dt%d", loc.addr, loc.line, loc.traceno)
> end
>
> -function M.demangle(symtab, loc)
> +local function demangle_lfunc(symtab, loc)
> local addr = loc.addr
>
> if addr == 0 then
> return "INTERNAL"
> - end
> -
> - if symtab[addr] then
> + elseif symtab[addr] then
> return string_format("%s:%d", symtab[addr].source, loc.line)
> end
> -
Looks like all these changes are excess, aren't they?
> return string_format("CFUNC %#x", addr)
> end
>
> +local function demangle_trace(loc)
> + return string_format("TRACE [%d] %#x", loc.traceno, loc.addr)
> +end
> +
> +function M.demangle(symtab, loc)
> + if loc.traceno ~= 0 then
> + return demangle_trace(loc)
> + end
> + return demangle_lfunc(symtab, loc)
Why 3 of 4 types of ASOURCE are demangled within a separate function
(with a bit strange name), but the traces have a separate one function?
Looks irrationale a bit, IMHO. Then either introduce a new function per
ASOURCE (too crazy as for me) or just move this <if> block for trace
into the original <M.demangle> function (the only logic way I see here).
> +end
> +
> return M
> --
> 2.33.0
>
--
Best regards,
IM
More information about the Tarantool-patches
mailing list