From mboxrd@z Thu Jan 1 00:00:00 1970
To: tarantool-patches@dev.tarantool.org, imun@tarantool.org, skaplun@tarantool.org
Date: Wed, 29 Sep 2021 23:07:57 +0300
Message-Id: <20210929200758.149446-4-m.shishatskiy@tarantool.org>
X-Mailer: git-send-email 2.33.0
In-Reply-To: <20210929200758.149446-1-m.shishatskiy@tarantool.org>
References: <20210929200758.149446-1-m.shishatskiy@tarantool.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [Tarantool-patches] [PATCH
luajit v4 3/4] memprof: group allocations on traces by traceno
List-Id: Tarantool development patches
From: Mikhail Shishatskiy via Tarantool-patches
Reply-To: Mikhail Shishatskiy

When LuaJIT executes a trace, the trace number is stored in the
virtual machine state. So we can treat this number as an allocation
event source in memprof and report allocation events from traces as
well. Previously, all allocations from traces were marked as INTERNAL.

This patch implements that by adding a new allocation source type
named ASOURCE_TRACE. If the VM state indicates that a trace is being
executed at the moment an allocation event occurs, the trace number
and the starting address of the trace's mcode are streamed to the
binary file:

| loc-trace := trace-no trace-addr
| trace-no := <ULEB128>
| trace-addr := <ULEB128>

Also, the memory profiler parser is adjusted to recognize the entries
mentioned above. The loc structure is extended with the traceno field,
representing the trace number.
Trace locations are demangled as

| TRACE [<trace-no>] <trace-addr>

Resolves tarantool/tarantool#5814
---
Issue: https://github.com/tarantool/tarantool/issues/5814
Branch: https://github.com/tarantool/luajit/tree/shishqa/gh-5814-group-allocations-on-trace-by-trace-number
CI: https://github.com/tarantool/tarantool/tree/shishqa/gh-5814-group-allocations-on-trace-by-trace-number

 src/Makefile.dep.original | 3 +-
 src/lj_memprof.c | 36 ++++++-
 src/lj_memprof.h | 14 ++-
 .../misclib-memprof-lapi.test.lua | 97 +++++++++++++++----
 tools/memprof/parse.lua | 13 ++-
 tools/utils/symtab.lua | 20 ++--
 6 files changed, 148 insertions(+), 35 deletions(-)

diff --git a/src/Makefile.dep.original b/src/Makefile.dep.original
index f3672413..faa44a0b 100644
--- a/src/Makefile.dep.original
+++ b/src/Makefile.dep.original
@@ -146,7 +146,8 @@ lj_mcode.o: lj_mcode.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
  lj_gc.h lj_err.h lj_errmsg.h lj_jit.h lj_ir.h lj_mcode.h lj_trace.h \
  lj_dispatch.h lj_bc.h lj_traceerr.h lj_vm.h
 lj_memprof.o: lj_memprof.c lj_arch.h lua.h luaconf.h lj_memprof.h \
- lj_def.h lj_wbuf.h lj_obj.h lj_frame.h lj_bc.h lj_debug.h
+ lj_def.h lj_wbuf.h lj_obj.h lj_frame.h lj_bc.h lj_debug.h lj_dispatch.h \
+ lj_jit.h lj_ir.h
 lj_meta.o: lj_meta.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h lj_gc.h \
  lj_err.h lj_errmsg.h lj_buf.h lj_str.h lj_tab.h lj_meta.h lj_frame.h \
  lj_bc.h lj_vm.h lj_strscan.h lj_strfmt.h lj_lib.h
diff --git a/src/lj_memprof.c b/src/lj_memprof.c
index 2c1ef3b8..8702557f 100644
--- a/src/lj_memprof.c
+++ b/src/lj_memprof.c
@@ -19,6 +19,10 @@
 #include "lj_frame.h"
 #include "lj_debug.h"

+#if LJ_HASJIT
+#include "lj_dispatch.h"
+#endif
+
 /* --------------------------------- Symtab --------------------------------- */

 static const unsigned char ljs_header[] = {'l', 'j', 's', LJS_CURRENT_VERSION,
@@ -146,6 +150,31 @@ static void memprof_write_func(struct memprof *mp, uint8_t aevent)
   lua_assert(0);
 }

+#if LJ_HASJIT
+
+static void memprof_write_trace(struct memprof *mp, uint8_t aevent)
+{
+  struct lj_wbuf *out = &mp->out;
+  const global_State *g = mp->g;
+  const jit_State *J = G2J(g);
+  const TraceNo traceno = g->vmstate;
+  const GCtrace *trace = traceref(J, traceno);
+  lj_wbuf_addbyte(out, aevent | ASOURCE_TRACE);
+  lj_wbuf_addu64(out, (uint64_t)traceno);
+  lj_wbuf_addu64(out, (uintptr_t)trace->mcode);
+}
+
+#else
+
+static void memprof_write_trace(struct memprof *mp, uint8_t aevent)
+{
+  UNUSED(mp);
+  UNUSED(aevent);
+  lua_assert(0);
+}
+
+#endif
+
 static void memprof_write_hvmstate(struct memprof *mp, uint8_t aevent)
 {
   lj_wbuf_addbyte(&mp->out, aevent | ASOURCE_INT);
@@ -168,9 +197,12 @@ static const memprof_writer memprof_writers[] = {
   ** But since traces must follow the semantics of the original code,
   ** behaviour of Lua and JITted code must match 1:1 in terms of allocations,
   ** which makes using memprof with enabled JIT virtually redundant.
-  ** Hence use the stub below.
+  ** But if one wants to investigate allocations with JIT enabled,
+  ** memprof_write_trace() dumps trace number and mcode starting address
+  ** to the binary output. It can be useful to compare with jit.v or
+  ** jit.dump outputs.
   */
-  memprof_write_hvmstate /* LJ_VMST_TRACE */
+  memprof_write_trace /* LJ_VMST_TRACE */
 };

 static void memprof_write_caller(struct memprof *mp, uint8_t aevent)
diff --git a/src/lj_memprof.h b/src/lj_memprof.h
index 3417475d..47474a51 100644
--- a/src/lj_memprof.h
+++ b/src/lj_memprof.h
@@ -53,7 +53,7 @@
 #define SYMTAB_LFUNC ((uint8_t)0)
 #define SYMTAB_FINAL ((uint8_t)0x80)

-#define LJM_CURRENT_FORMAT_VERSION 0x01
+#define LJM_CURRENT_FORMAT_VERSION 0x02

 /*
 ** Event stream format:
@@ -69,11 +69,14 @@
 ** event-realloc := event-header loc? oaddr osize naddr nsize
 ** event-free := event-header loc? oaddr osize
 ** event-header := <BYTE>
-** loc := loc-lua | loc-c
+** loc := loc-lua | loc-c | loc-trace
 ** loc-lua := sym-addr line-no
 ** loc-c := sym-addr
+** loc-trace := trace-no trace-addr
 ** sym-addr := <ULEB128>
 ** line-no := <ULEB128>
+** trace-no := <ULEB128>
+** trace-addr := <ULEB128>
 ** oaddr := <ULEB128>
 ** naddr := <ULEB128>
 ** osize := <ULEB128>
@@ -88,10 +91,10 @@
 ** version: [VVVVVVVV]
 ** * VVVVVVVV: Byte interpreted as a plain integer version number
 **
-** event-header: [FUUUSSEE]
+** event-header: [FUUSSSEE]
 ** * EE : 2 bits for representing allocation event type (AEVENT_*)
-** * SS : 2 bits for representing allocation source type (ASOURCE_*)
-** * UUU : 3 unused bits
+** * SSS : 3 bits for representing allocation source type (ASOURCE_*)
+** * UU : 2 unused bits
 ** * F : 0 for regular events, 1 for epilogue's *F*inal header
 ** (if F is set to 1, all other bits are currently ignored)
 */
@@ -105,6 +108,7 @@
 #define ASOURCE_INT ((uint8_t)(1 << 2))
 #define ASOURCE_LFUNC ((uint8_t)(2 << 2))
 #define ASOURCE_CFUNC ((uint8_t)(3 << 2))
+#define ASOURCE_TRACE ((uint8_t)(4 << 2))

 #define LJM_EPILOGUE_HEADER 0x80

diff --git a/test/tarantool-tests/misclib-memprof-lapi.test.lua b/test/tarantool-tests/misclib-memprof-lapi.test.lua
index 9de4bd98..3f4ffea0 100644
--- a/test/tarantool-tests/misclib-memprof-lapi.test.lua
+++ b/test/tarantool-tests/misclib-memprof-lapi.test.lua
@@ -7,7 +7,14 @@ require("utils").skipcond(
 local tap = require("tap")

 local test = tap.test("misc-memprof-lapi")
-test:plan(3)
+test:plan(4)
+
+local jit_opt_default = {
+  3, -- level
+  "hotloop=56",
+  "hotexit=10",
+  "minstitch=0",
+}

 jit.off()
 jit.flush()
@@ -24,10 +31,10 @@ local BAD_PATH = arg[0]:gsub(".+/([^/]+)%.test%.lua$", "%1/memprofdata.tmp.bin")

 local function default_payload()
   -- Preallocate table to avoid table array part reallocations.
-  local _ = table_new(100, 0)
+  local _ = table_new(20, 0)

-  -- Want too see 100 objects here.
-  for i = 1, 100 do
+  -- Want to see 20 objects here.
+  for i = 1, 20 do
     -- Try to avoid crossing with "test" module objects.
     _[i] = "memprof-str-"..i
   end
@@ -72,16 +79,26 @@ local function generate_parsed_output(payload)
 end

 local function fill_ev_type(events, symbols, event_type)
-  local ev_type = {}
+  local ev_type = {
+    line = {},
+    trace = {},
+  }
   for _, event in pairs(events[event_type]) do
     local addr = event.loc.addr
-    if addr == 0 then
+    local traceno = event.loc.traceno
+
+    if traceno ~= 0 then
+      ev_type.trace[traceno] = {
+        name = string.format("TRACE [%d]", traceno),
+        num = event.num,
+      }
+    elseif addr == 0 then
       ev_type.INTERNAL = {
         name = "INTERNAL",
         num = event.num,
-    }
+      }
     elseif symbols[addr] then
-      ev_type[event.loc.line] = {
+      ev_type.line[event.loc.line] = {
         name = string.format(
           "%s:%d", symbols[addr].source, symbols[addr].linedefined
         ),
@@ -96,10 +113,21 @@ local function form_source_line(line)
   return string.format("@%s:%d", arg[0], line)
 end

-local function check_alloc_report(alloc, line, function_line, nevents)
-  assert(form_source_line(function_line) == alloc[line].name)
-  assert(alloc[line].num == nevents, ("got=%d, expected=%d"):format(
-    alloc[line].num,
+local function check_alloc_report(alloc, traceno, line, function_line, nevents)
+  local expected_name, event
+  if traceno ~= 0 then
+    expected_name = string.format("TRACE [%d]", traceno)
+    event = alloc.trace[traceno]
+  else
+    expected_name = form_source_line(function_line)
+    event = alloc.line[line]
+  end
+  assert(expected_name == event.name, ("got='%s', expected='%s'"):format(
+    event.name,
+    expected_name
+  ))
+  assert(event.num == nevents, ("got=%d, expected=%d"):format(
+    event.num,
     nevents
   ))
   return true
@@ -145,18 +173,18 @@ test:test("output", function(subtest)
   -- one is the number of allocations. 1 event - alocation of
   -- table by itself + 1 allocation of array part as far it is
   -- bigger than LJ_MAX_COLOSIZE (16).
-  subtest:ok(check_alloc_report(alloc, 27, 25, 2))
-  -- 100 strings allocations.
-  subtest:ok(check_alloc_report(alloc, 32, 25, 100))
+  subtest:ok(check_alloc_report(alloc, 0, 34, 32, 2))
+  -- 20 strings allocations.
+  subtest:ok(check_alloc_report(alloc, 0, 39, 32, 20))

   -- Collect all previous allocated objects.
-  subtest:ok(free.INTERNAL.num == 102)
+  subtest:ok(free.INTERNAL.num == 22)

   -- Tests for leak-only option.
   -- See also https://github.com/tarantool/tarantool/issues/5812.
   local heap_delta = process.form_heap_delta(events, symbols)
-  local tab_alloc_stats = heap_delta[form_source_line(27)]
-  local str_alloc_stats = heap_delta[form_source_line(32)]
+  local tab_alloc_stats = heap_delta[form_source_line(34)]
+  local str_alloc_stats = heap_delta[form_source_line(39)]
   subtest:ok(tab_alloc_stats.nalloc == tab_alloc_stats.nfree)
   subtest:ok(tab_alloc_stats.dbytes == 0)
   subtest:ok(str_alloc_stats.nalloc == str_alloc_stats.nfree)
@@ -185,5 +213,38 @@ test:test("stack-resize", function(subtest)
   misc.memprof.stop()
 end)

+-- Test profiler with enabled JIT.
 jit.on()
+
+test:test("jit-output", function(subtest)
+  -- Disabled on *BSD due to #4819.
+  if jit.os == 'BSD' then
+    subtest:plan(1)
+    subtest:skip('Disabled due to #4819')
+    return
+  end
+
+  subtest:plan(3)
+
+  jit.opt.start(3, "hotloop=10")
+  jit.flush()
+
+  -- Pregenerate traces to fill symtab entries in the next run.
+  default_payload()
+
+  local symbols, events = generate_parsed_output(default_payload)
+
+  local alloc = fill_ev_type(events, symbols, "alloc")
+  local free = fill_ev_type(events, symbols, "free")
+
+  -- We expect that the loop will be compiled into a trace.
+  subtest:ok(check_alloc_report(alloc, 1, 37, 32, 20))
+  -- See the same checks with jit.off().
+  subtest:ok(check_alloc_report(alloc, 0, 34, 32, 2))
+  subtest:ok(free.INTERNAL.num == 22)
+
+  -- Restore default JIT settings.
+  jit.opt.start(unpack(jit_opt_default))
+end)
+
 os.exit(test:check() and 0 or 1)
diff --git a/tools/memprof/parse.lua b/tools/memprof/parse.lua
index 34ff8aef..968fd90e 100644
--- a/tools/memprof/parse.lua
+++ b/tools/memprof/parse.lua
@@ -13,7 +13,7 @@ local symtab = require "utils.symtab"
 local string_format = string.format

 local LJM_MAGIC = "ljm"
-local LJM_CURRENT_VERSION = 1
+local LJM_CURRENT_VERSION = 0x02

 local LJM_EPILOGUE_HEADER = 0x80

@@ -26,8 +26,11 @@ local AEVENT_MASK = 0x3
 local ASOURCE_INT = lshift(1, 2)
 local ASOURCE_LFUNC = lshift(2, 2)
 local ASOURCE_CFUNC = lshift(3, 2)
+local ASOURCE_TRACE = lshift(4, 2)

-local ASOURCE_MASK = lshift(0x3, 2)
+local ASOURCE_MASK = lshift(0x7, 2)
+
+local EV_HEADER_MAX = ASOURCE_TRACE + AEVENT_REALLOC

 local M = {}

@@ -65,12 +68,16 @@ local function parse_location(reader, asource)
   local loc = {
     addr = 0,
     line = 0,
+    traceno = 0,
   }
   if asource == ASOURCE_CFUNC then
     loc.addr = reader:read_uleb128()
   elseif asource == ASOURCE_LFUNC then
     loc.addr = reader:read_uleb128()
     loc.line = reader:read_uleb128()
+  elseif asource == ASOURCE_TRACE then
+    loc.traceno = reader:read_uleb128()
+    loc.addr = reader:read_uleb128()
   elseif asource ~= ASOURCE_INT then
     error("Unknown asource "..asource)
   end
@@ -140,7 +147,7 @@ local parsers = {
 }

 local function ev_header_is_valid(evh)
-  return evh <= 0x0f or evh == LJM_EPILOGUE_HEADER
+  return evh <= EV_HEADER_MAX or evh == LJM_EPILOGUE_HEADER
 end

 -- Splits event header into event type (aka aevent = allocation
diff --git a/tools/utils/symtab.lua b/tools/utils/symtab.lua
index e01daa62..85945fb2 100644
--- a/tools/utils/symtab.lua
+++ b/tools/utils/symtab.lua
@@ -74,21 +74,29 @@ function M.parse(reader)
 end

 function M.id(loc)
-  return string_format("f%#xl%d", loc.addr, loc.line)
+  return string_format("f%#xl%dt%d", loc.addr, loc.line, loc.traceno)
 end

-function M.demangle(symtab, loc)
+local function demangle_lfunc(symtab, loc)
   local addr = loc.addr

   if addr == 0 then
     return "INTERNAL"
-  end
-
-  if symtab[addr] then
+  elseif symtab[addr] then
     return string_format("%s:%d", symtab[addr].source, loc.line)
   end
-
   return string_format("CFUNC %#x", addr)
 end

+local function demangle_trace(loc)
+  return string_format("TRACE [%d] %#x", loc.traceno, loc.addr)
+end
+
+function M.demangle(symtab, loc)
+  if loc.traceno ~= 0 then
+    return demangle_trace(loc)
+  end
+  return demangle_lfunc(symtab, loc)
+end
+
 return M
-- 
2.33.0
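
Illustration only, not part of the patch: a minimal sketch of how a v2
event header and the loc-trace payload could be written and read back.
It assumes both payload fields are ULEB128-encoded, as the parser's
read_uleb128() calls suggest, and it covers only the header byte and
loc-trace, not the address and size fields that follow in a full event.
The names struct trace_loc, uleb128_write(), uleb128_read(),
trace_event_write() and trace_event_read() are hypothetical; only the
ASOURCE_TRACE value and the mask widths are taken from the diff above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values from the v2 event-header layout [FUUSSSEE] in the patch. */
#define AEVENT_MASK         ((uint8_t)0x03)
#define ASOURCE_MASK        ((uint8_t)(0x07 << 2))
#define ASOURCE_TRACE       ((uint8_t)(4 << 2))
#define LJM_EPILOGUE_HEADER 0x80

/* Hypothetical container for the two loc-trace fields. */
struct trace_loc {
  uint64_t traceno; /* trace-no */
  uint64_t addr;    /* trace-addr: start of the trace's mcode */
};

/* Write v as ULEB128; returns bytes written (at most 10). */
static size_t uleb128_write(uint8_t *buf, uint64_t v)
{
  size_t n = 0;
  do {
    uint8_t byte = (uint8_t)(v & 0x7f);
    v >>= 7;
    if (v != 0)
      byte |= 0x80; /* continuation bit: more groups follow */
    buf[n++] = byte;
  } while (v != 0);
  return n;
}

/* Read a ULEB128 value; returns bytes consumed, 0 if truncated/too long. */
static size_t uleb128_read(const uint8_t *buf, size_t len, uint64_t *out)
{
  uint64_t v = 0;
  unsigned shift = 0;
  size_t n = 0;
  while (n < len && shift < 64) {
    uint8_t byte = buf[n++];
    v |= (uint64_t)(byte & 0x7f) << shift;
    if (!(byte & 0x80)) {
      *out = v;
      return n;
    }
    shift += 7;
  }
  return 0;
}

/* Emit header + loc-trace. buf needs room for 1 + 10 + 10 bytes. */
static size_t trace_event_write(uint8_t *buf, uint8_t aevent,
                                struct trace_loc loc)
{
  size_t n = 0;
  assert((aevent & ~AEVENT_MASK) == 0); /* event type must fit in EE */
  buf[n++] = (uint8_t)(aevent | ASOURCE_TRACE);
  n += uleb128_write(buf + n, loc.traceno);
  n += uleb128_write(buf + n, loc.addr);
  return n;
}

/* Parse header + loc-trace; returns bytes consumed, 0 on any mismatch. */
static size_t trace_event_read(const uint8_t *buf, size_t len,
                               uint8_t *aevent, struct trace_loc *loc)
{
  size_t n = 1, k;
  if (len < 1 || (buf[0] & LJM_EPILOGUE_HEADER) ||
      (buf[0] & ASOURCE_MASK) != ASOURCE_TRACE)
    return 0;
  *aevent = (uint8_t)(buf[0] & AEVENT_MASK);
  k = uleb128_read(buf + n, len - n, &loc->traceno);
  if (k == 0) return 0;
  n += k;
  k = uleb128_read(buf + n, len - n, &loc->addr);
  if (k == 0) return 0;
  return n + k;
}
```

With an event type of 1 and trace number 1, the header byte comes out as
0x11 (source bits 100, event bits 01), which a parser using the old
2-bit source mask would misread. That is presumably why the format
version is bumped to 0x02.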