* [Tarantool-patches] [PATCH luajit v3 0/4] sysprof: parser refactoring @ 2023-07-31 20:30 Maxim Kokryashkin via Tarantool-patches 2023-07-31 20:30 ` [Tarantool-patches] [PATCH luajit v3 1/4] utils: remove unnecessary insertion in AVL-tree Maxim Kokryashkin via Tarantool-patches ` (4 more replies) 0 siblings, 5 replies; 10+ messages in thread From: Maxim Kokryashkin via Tarantool-patches @ 2023-07-31 20:30 UTC (permalink / raw) To: tarantool-patches, skaplun, imun This patchset is targeted at reducing the memory footprint of the sysprof parser. It is mainly achieved by fixing a nasty bug in the AVL tree implementation, which caused most of the memory consumption, and by essentially purging the collapse.lua module, which means that there is no need to store the complete information about every parsed event and it is sufficient to merge stacks inplace. Alongside with described changes two minor improvements were done: - `luajit-parse-sysprof` now has permissions to be executed - `split_by_vmstate` option is purged, because no one uses it, so there is no actual reason to maintain it. ## Note for Igor: There is no way to split the 'memory footprint reduction'-part and the 'collapse.lua purge'-part, since the main reason why memory consumption is so much lower now, other the AVL-tree bug, is that we don't need to pass a complete profile information between the `parse.lua` and `collapse.lua`. That, basically, almost turns the profile parsing process into some kind of stream-based procedure. PR: https://github.com/tarantool/tarantool/pull/8703 Branch: https://github.com/tarantool/luajit/tree/fckxorg/gh-8700-sysprof-parser-refactoring Maxim Kokryashkin (4): utils: remove unnecessary insertion in AVL-tree sysprof: remove `split by vmstate` option tools: add execution permission to sysprof parser sysprof: improve parser's memory footprint .../gh-5813-resolving-of-c-symbols.test.lua | 6 +- tools/CMakeLists.txt | 4 + tools/luajit-parse-sysprof.in | 0 tools/sysprof.lua | 28 +--- tools/sysprof/collapse.lua | 127 +----------------- tools/sysprof/parse.lua | 126 ++++++++++++----- tools/utils/avl.lua | 4 +- tools/utils/symtab.lua | 2 +- 8 files changed, 106 insertions(+), 191 deletions(-) mode change 100644 => 100755 tools/luajit-parse-sysprof.in -- 2.41.0 ^ permalink raw reply [flat|nested] 10+ messages in thread
* [Tarantool-patches] [PATCH luajit v3 1/4] utils: remove unnecessary insertion in AVL-tree 2023-07-31 20:30 [Tarantool-patches] [PATCH luajit v3 0/4] sysprof: parser refactoring Maxim Kokryashkin via Tarantool-patches @ 2023-07-31 20:30 ` Maxim Kokryashkin via Tarantool-patches 2023-08-15 18:50 ` Igor Munkin via Tarantool-patches 2023-07-31 20:30 ` [Tarantool-patches] [PATCH luajit v3 2/4] sysprof: remove `split by vmstate` option Maxim Kokryashkin via Tarantool-patches ` (3 subsequent siblings) 4 siblings, 1 reply; 10+ messages in thread From: Maxim Kokryashkin via Tarantool-patches @ 2023-07-31 20:30 UTC (permalink / raw) To: tarantool-patches, skaplun, imun This patch fixes a bug in the AVL-tree implementation, which produced unnecessary inserts of values into nodes, instead of replacement. Needed for tarantool/tarantool#8700 --- .../tarantool-tests/gh-5813-resolving-of-c-symbols.test.lua | 6 ++---- tools/utils/avl.lua | 4 ++-- tools/utils/symtab.lua | 2 +- 3 files changed, 5 insertions(+), 7 deletions(-) diff --git a/test/tarantool-tests/gh-5813-resolving-of-c-symbols.test.lua b/test/tarantool-tests/gh-5813-resolving-of-c-symbols.test.lua index 30b8a3ca..c448248a 100644 --- a/test/tarantool-tests/gh-5813-resolving-of-c-symbols.test.lua +++ b/test/tarantool-tests/gh-5813-resolving-of-c-symbols.test.lua @@ -25,10 +25,8 @@ local function tree_contains(node, name) if node == nil then return false else - for i = 1, #node.value do - if node.value[i].name == name then - return true - end + if node.value.name == name then + return true end return tree_contains(node.left, name) or tree_contains(node.right, name) end diff --git a/tools/utils/avl.lua b/tools/utils/avl.lua index d5baa534..81ef9265 100644 --- a/tools/utils/avl.lua +++ b/tools/utils/avl.lua @@ -78,7 +78,7 @@ end function M.insert(node, key, value) assert(key, "Key can't be nil") if node == nil then - return create_node(key, { value }) + return create_node(key, value) end if key < node.key then @@ -86,7 +86,7 @@ function M.insert(node, key, value) elseif key > node.key then node.right = M.insert(node.right, key, value) else - table.insert(node.value, value) + node.value = value end update_height(node) diff --git a/tools/utils/symtab.lua b/tools/utils/symtab.lua index c26a9e8c..7f6c78f0 100644 --- a/tools/utils/symtab.lua +++ b/tools/utils/symtab.lua @@ -176,7 +176,7 @@ function M.demangle(symtab, loc) local key, value = avl.floor(symtab.cfunc, addr) if key then - return string_format("%s:%#x", value[gen].name, key) + return string_format("%s:%#x", value.name, key) end return string_format("CFUNC %#x", addr) -- 2.41.0 ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit v3 1/4] utils: remove unnecessary insertion in AVL-tree 2023-07-31 20:30 ` [Tarantool-patches] [PATCH luajit v3 1/4] utils: remove unnecessary insertion in AVL-tree Maxim Kokryashkin via Tarantool-patches @ 2023-08-15 18:50 ` Igor Munkin via Tarantool-patches 0 siblings, 0 replies; 10+ messages in thread From: Igor Munkin via Tarantool-patches @ 2023-08-15 18:50 UTC (permalink / raw) To: Maxim Kokryashkin; +Cc: tarantool-patches Max, Thanks for the patch! The change is quite clear, but so-called 'gen' was introduced on purpose AFAIR. Hence, I would like to ask you to describe this change or clarify my gen misunderstanding. On 31.07.23, Maxim Kokryashkin wrote: > This patch fixes a bug in the AVL-tree implementation, > which produced unnecessary inserts of values into nodes, > instead of replacement. > > Needed for tarantool/tarantool#8700 > --- > .../tarantool-tests/gh-5813-resolving-of-c-symbols.test.lua | 6 ++---- > tools/utils/avl.lua | 4 ++-- > tools/utils/symtab.lua | 2 +- > 3 files changed, 5 insertions(+), 7 deletions(-) > <snipped> > -- > 2.41.0 > -- Best regards, IM ^ permalink raw reply [flat|nested] 10+ messages in thread
* [Tarantool-patches] [PATCH luajit v3 2/4] sysprof: remove `split by vmstate` option 2023-07-31 20:30 [Tarantool-patches] [PATCH luajit v3 0/4] sysprof: parser refactoring Maxim Kokryashkin via Tarantool-patches 2023-07-31 20:30 ` [Tarantool-patches] [PATCH luajit v3 1/4] utils: remove unnecessary insertion in AVL-tree Maxim Kokryashkin via Tarantool-patches @ 2023-07-31 20:30 ` Maxim Kokryashkin via Tarantool-patches 2023-08-15 18:51 ` Igor Munkin via Tarantool-patches 2023-07-31 20:30 ` [Tarantool-patches] [PATCH luajit v3 3/4] tools: add execution permission to sysprof parser Maxim Kokryashkin via Tarantool-patches ` (2 subsequent siblings) 4 siblings, 1 reply; 10+ messages in thread From: Maxim Kokryashkin via Tarantool-patches @ 2023-07-31 20:30 UTC (permalink / raw) To: tarantool-patches, skaplun, imun This option is unneeded and was never used by anybody. There is no reason to maintain it in the scope of refactoring. Needed for tarantool/tarantool#8700 --- tools/sysprof.lua | 9 +-------- tools/sysprof/collapse.lua | 10 +++------- 2 files changed, 4 insertions(+), 15 deletions(-) diff --git a/tools/sysprof.lua b/tools/sysprof.lua index 1afab195..22c724e9 100644 --- a/tools/sysprof.lua +++ b/tools/sysprof.lua @@ -6,8 +6,6 @@ local misc = require "sysprof.collapse" local stdout, stderr = io.stdout, io.stderr local match, gmatch = string.match, string.gmatch -local split_by_vmstate = false - -- Program options. local opt_map = {} @@ -23,15 +21,10 @@ luajit-parse-sysprof [options] sysprof.bin Supported options are: --help Show this help and exit - --split Split callchains by vmstate ]] os.exit(0) end -function opt_map.split() - split_by_vmstate = true -end - -- Print error and exit with error status. local function opterror(...) stderr:write("luajit-parse-sysprof.lua: ERROR: ", ...) @@ -103,7 +96,7 @@ local function dump(inputfile) local symbols = symtab.parse(reader) local events = sysprof.parse(reader, symbols) - local calltree = misc.collapse(events, symbols, split_by_vmstate) + local calltree = misc.collapse(events, symbols) traverse_calltree(calltree, '') diff --git a/tools/sysprof/collapse.lua b/tools/sysprof/collapse.lua index 3d83d5ea..ac5269ea 100755 --- a/tools/sysprof/collapse.lua +++ b/tools/sysprof/collapse.lua @@ -75,7 +75,7 @@ end -- merge lua and host callchains into one callchain representing -- transfer of control -local function merge(event, symbols, sep_vmst) +local function merge(event, symbols) local cc = {} for _,h_fr in pairs(event.host.callchain) do @@ -98,19 +98,15 @@ local function merge(event, symbols, sep_vmst) end - if sep_vmst == true then - table.insert(cc, { name = VMST_NAMES[event.lua.vmstate] }) - end - return cc end -- Collapse all the events into call tree -function M.collapse(events, symbols, sep_vmst) +function M.collapse(events, symbols) local root = new_node('root', false) for _,ev in pairs(events) do - local callchain = merge(ev, symbols, sep_vmst) + local callchain = merge(ev, symbols) local curr_node = root for i=#callchain,1,-1 do curr_node = insert(callchain[i].name, curr_node, false) -- 2.41.0 ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit v3 2/4] sysprof: remove `split by vmstate` option 2023-07-31 20:30 ` [Tarantool-patches] [PATCH luajit v3 2/4] sysprof: remove `split by vmstate` option Maxim Kokryashkin via Tarantool-patches @ 2023-08-15 18:51 ` Igor Munkin via Tarantool-patches 0 siblings, 0 replies; 10+ messages in thread From: Igor Munkin via Tarantool-patches @ 2023-08-15 18:51 UTC (permalink / raw) To: Maxim Kokryashkin; +Cc: tarantool-patches Max, Thanks for the patch! Same for this patch: I would like to ask you to add some more rationale for removing this (i.e. why it was implemented and why it is considered unused). On 31.07.23, Maxim Kokryashkin wrote: > This option is unneeded and was never used > by anybody. There is no reason to maintain it > in the scope of refactoring. > > Needed for tarantool/tarantool#8700 > --- > tools/sysprof.lua | 9 +-------- > tools/sysprof/collapse.lua | 10 +++------- > 2 files changed, 4 insertions(+), 15 deletions(-) > <snipped> > -- > 2.41.0 > -- Best regards, IM ^ permalink raw reply [flat|nested] 10+ messages in thread
* [Tarantool-patches] [PATCH luajit v3 3/4] tools: add execution permission to sysprof parser 2023-07-31 20:30 [Tarantool-patches] [PATCH luajit v3 0/4] sysprof: parser refactoring Maxim Kokryashkin via Tarantool-patches 2023-07-31 20:30 ` [Tarantool-patches] [PATCH luajit v3 1/4] utils: remove unnecessary insertion in AVL-tree Maxim Kokryashkin via Tarantool-patches 2023-07-31 20:30 ` [Tarantool-patches] [PATCH luajit v3 2/4] sysprof: remove `split by vmstate` option Maxim Kokryashkin via Tarantool-patches @ 2023-07-31 20:30 ` Maxim Kokryashkin via Tarantool-patches 2023-08-15 18:51 ` Igor Munkin via Tarantool-patches 2023-07-31 20:30 ` [Tarantool-patches] [PATCH luajit v3 4/4] sysprof: improve parser's memory footprint Maxim Kokryashkin via Tarantool-patches 2023-08-15 18:54 ` [Tarantool-patches] [PATCH luajit v3 0/4] sysprof: parser refactoring Igor Munkin via Tarantool-patches 4 siblings, 1 reply; 10+ messages in thread From: Maxim Kokryashkin via Tarantool-patches @ 2023-07-31 20:30 UTC (permalink / raw) To: tarantool-patches, skaplun, imun This patch adds execution permission to the `luajit-parse-sysprof` helper script, so now it is possible to call it without having to change permissions beforehands. Needed for tarantool/tarantool#8700 --- tools/luajit-parse-sysprof.in | 0 1 file changed, 0 insertions(+), 0 deletions(-) mode change 100644 => 100755 tools/luajit-parse-sysprof.in diff --git a/tools/luajit-parse-sysprof.in b/tools/luajit-parse-sysprof.in old mode 100644 new mode 100755 -- 2.41.0 ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit v3 3/4] tools: add execution permission to sysprof parser 2023-07-31 20:30 ` [Tarantool-patches] [PATCH luajit v3 3/4] tools: add execution permission to sysprof parser Maxim Kokryashkin via Tarantool-patches @ 2023-08-15 18:51 ` Igor Munkin via Tarantool-patches 0 siblings, 0 replies; 10+ messages in thread From: Igor Munkin via Tarantool-patches @ 2023-08-15 18:51 UTC (permalink / raw) To: Maxim Kokryashkin; +Cc: tarantool-patches Max, Thanks for the patch! LGTM as trivial, however, I see little sense in it considering your work on memprof/sysprof CLI flags. -- Best regards, IM ^ permalink raw reply [flat|nested] 10+ messages in thread
* [Tarantool-patches] [PATCH luajit v3 4/4] sysprof: improve parser's memory footprint 2023-07-31 20:30 [Tarantool-patches] [PATCH luajit v3 0/4] sysprof: parser refactoring Maxim Kokryashkin via Tarantool-patches ` (2 preceding siblings ...) 2023-07-31 20:30 ` [Tarantool-patches] [PATCH luajit v3 3/4] tools: add execution permission to sysprof parser Maxim Kokryashkin via Tarantool-patches @ 2023-07-31 20:30 ` Maxim Kokryashkin via Tarantool-patches 2023-08-15 18:52 ` Igor Munkin via Tarantool-patches 2023-08-15 18:54 ` [Tarantool-patches] [PATCH luajit v3 0/4] sysprof: parser refactoring Igor Munkin via Tarantool-patches 4 siblings, 1 reply; 10+ messages in thread From: Maxim Kokryashkin via Tarantool-patches @ 2023-07-31 20:30 UTC (permalink / raw) To: tarantool-patches, skaplun, imun This patch reduces sysprof's parser memory footprint, by avoiding reading all callchains before collapsing them. Instead of it, parser merges stacks immediately after reading them and stores counts in a lua table. The `collapse.lua` module is purged as a result of the patch, but it is left as a stub to keep the integrational testing intact. This stub should be removed in the next series. Resolves tarantool/tarantool#8700 --- tools/CMakeLists.txt | 4 ++ tools/sysprof.lua | 21 +------ tools/sysprof/collapse.lua | 123 +----------------------------------- tools/sysprof/parse.lua | 126 ++++++++++++++++++++++++++----------- 4 files changed, 101 insertions(+), 173 deletions(-) diff --git a/tools/CMakeLists.txt b/tools/CMakeLists.txt index dd7ec6bd..1ae559ee 100644 --- a/tools/CMakeLists.txt +++ b/tools/CMakeLists.txt @@ -112,6 +112,8 @@ else() add_custom_target(tools-parse-sysprof EXCLUDE_FROM_ALL DEPENDS luajit-parse-sysprof sysprof/parse.lua + # FIXME: This line is not deleted only for the sake of integrational + # testing. It should be deleted in the next series. sysprof/collapse.lua sysprof.lua utils/bufread.lua @@ -121,6 +123,8 @@ else() install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/sysprof/parse.lua + # FIXME: This line is not deleted only for the sake of integrational + # testing. It should be deleted in the next series. ${CMAKE_CURRENT_SOURCE_DIR}/sysprof/collapse.lua DESTINATION ${LUAJIT_DATAROOTDIR}/sysprof PERMISSIONS diff --git a/tools/sysprof.lua b/tools/sysprof.lua index 22c724e9..8e110a04 100644 --- a/tools/sysprof.lua +++ b/tools/sysprof.lua @@ -1,7 +1,6 @@ local bufread = require "utils.bufread" local sysprof = require "sysprof.parse" local symtab = require "utils.symtab" -local misc = require "sysprof.collapse" local stdout, stderr = io.stdout, io.stderr local match, gmatch = string.match, string.gmatch @@ -78,28 +77,14 @@ local function parseargs(args) return args[args.argn] end -local function traverse_calltree(node, prefix) - if node.is_leaf then - print(prefix..' '..node.count) - end - - local sep_prefix = #prefix == 0 and prefix or prefix..';' - - for name,child in pairs(node.children) do - traverse_calltree(child, sep_prefix..name) - end -end - local function dump(inputfile) local reader = bufread.new(inputfile) - local symbols = symtab.parse(reader) - local events = sysprof.parse(reader, symbols) - local calltree = misc.collapse(events, symbols) - - traverse_calltree(calltree, '') + for stack, count in pairs(events) do + print(stack, count) + end os.exit(0) end diff --git a/tools/sysprof/collapse.lua b/tools/sysprof/collapse.lua index ac5269ea..9e815e0d 100755 --- a/tools/sysprof/collapse.lua +++ b/tools/sysprof/collapse.lua @@ -1,120 +1,3 @@ -local parse = require "sysprof.parse" -local vmdef = require "jit.vmdef" -local symtab = require "utils.symtab" - -local VMST_NAMES = { - [parse.VMST.INTERP] = "VMST_INTERP", - [parse.VMST.LFUNC] = "VMST_LFUNC", - [parse.VMST.FFUNC] = "VMST_FFUNC", - [parse.VMST.CFUNC] = "VMST_CFUNC", - [parse.VMST.GC] = "VMST_GC", - [parse.VMST.EXIT] = "VMST_EXIT", - [parse.VMST.RECORD] = "VMST_RECORD", - [parse.VMST.OPT] = "VMST_OPT", - [parse.VMST.ASM] = "VMST_ASM", - [parse.VMST.TRACE] = "VMST_TRACE", -} - -local M = {} - -local function new_node(name, is_leaf) - return { - name = name, - count = 0, - is_leaf = is_leaf, - children = {} - } -end - --- insert new child into a node (or increase counter in existing one) -local function insert(name, node, is_leaf) - if node.children[name] == nil then - node.children[name] = new_node(name, is_leaf) - end - - local child = node.children[name] - child.count = child.count + 1 - - return child -end - -local function insert_lua_callchain(chain, lua, symbols) - local ins_cnt = 0 - for _,fr in pairs(lua.callchain) do - local name_lua - - ins_cnt = ins_cnt + 1 - if fr.type == parse.FRAME.FFUNC then - name_lua = vmdef.ffnames[fr.ffid] - else - name_lua = symtab.demangle(symbols, { - addr = fr.addr, - line = fr.line, - gen = fr.gen - }) - if lua.trace.traceno ~= nil and lua.trace.addr == fr.addr and - lua.trace.line == fr.line then - name_lua = symtab.demangle(symbols, { - addr = fr.addr, - traceno = lua.trace.traceno, - gen = fr.gen - }) - end - - if fr.type == parse.FRAME.CFUNC then - -- C function encountered, the next chunk - -- of frames is located on the C stack. - break - end - end - - table.insert(chain, 1, { name = name_lua }) - end - table.remove(lua.callchain, ins_cnt) -end - --- merge lua and host callchains into one callchain representing --- transfer of control -local function merge(event, symbols) - local cc = {} - - for _,h_fr in pairs(event.host.callchain) do - local name_host = symtab.demangle(symbols, { - addr = h_fr.addr, - gen = h_fr.gen - }) - table.insert(cc, 1, { name = name_host }) - - if string.match(name_host, '^lua_cpcall') ~= nil then - -- Any C function is present on both the C and the Lua - -- stacks. It is more convenient to get its info from the - -- host stack, since it has information about child frames. - table.remove(event.lua.callchain, 1) - end - - if string.match(name_host, '^lua_p?call') ~= nil then - insert_lua_callchain(cc, event.lua, symbols) - end - - end - - return cc -end - --- Collapse all the events into call tree -function M.collapse(events, symbols) - local root = new_node('root', false) - - for _,ev in pairs(events) do - local callchain = merge(ev, symbols) - local curr_node = root - for i=#callchain,1,-1 do - curr_node = insert(callchain[i].name, curr_node, false) - end - insert('', curr_node, true) - end - - return root -end - -return M +-- FIXME: This line is not deleted only for the sake of +-- integrational testing. It should be deleted in the +-- next series. diff --git a/tools/sysprof/parse.lua b/tools/sysprof/parse.lua index 5b52f104..19add4f3 100755 --- a/tools/sysprof/parse.lua +++ b/tools/sysprof/parse.lua @@ -2,6 +2,7 @@ -- The format spec can be found in <src/lj_sysprof.h>. local symtab = require "utils.symtab" +local vmdef = require "jit.vmdef" local string_format = string.format @@ -10,7 +11,7 @@ local LJP_CURRENT_VERSION = 2 local M = {} -M.VMST = { +local VMST = { INTERP = 0, LFUNC = 1, FFUNC = 2, @@ -25,13 +26,14 @@ M.VMST = { } -M.FRAME = { +local FRAME = { LFUNC = 1, CFUNC = 2, FFUNC = 3, BOTTOM = 0x80 } + local STREAM_END = 0x80 local SYMTAB_LFUNC_EVENT = 10 local SYMTAB_CFUNC_EVENT = 11 @@ -54,42 +56,40 @@ local function new_event() } end -local function parse_lfunc(reader, event, symbols) +local function parse_lfunc(reader, symbols) local addr = reader:read_uleb128() local line = reader:read_uleb128() local loc = symtab.loc(symbols, { addr = addr, line = line }) - loc.type = M.FRAME.LFUNC - table.insert(event.lua.callchain, 1, loc) + loc.type = FRAME.LFUNC + return symtab.demangle(symbols, loc) end -local function parse_ffunc(reader, event, _) +local function parse_ffunc(reader, _) local ffid = reader:read_uleb128() - table.insert(event.lua.callchain, 1, { - type = M.FRAME.FFUNC, - ffid = ffid, - }) + return vmdef.ffnames[ffid] end -local function parse_cfunc(reader, event, symbols) +local function parse_cfunc(reader, symbols) local addr = reader:read_uleb128() local loc = symtab.loc(symbols, { addr = addr }) - loc.type = M.FRAME.CFUNC - table.insert(event.lua.callchain, 1, loc) + loc.type = FRAME.CFUNC + return symtab.demangle(symbols, loc) end local frame_parsers = { - [M.FRAME.LFUNC] = parse_lfunc, - [M.FRAME.FFUNC] = parse_ffunc, - [M.FRAME.CFUNC] = parse_cfunc + [FRAME.LFUNC] = parse_lfunc, + [FRAME.FFUNC] = parse_ffunc, + [FRAME.CFUNC] = parse_cfunc } local function parse_lua_callchain(reader, event, symbols) while true do local frame_header = reader:read_octet() - if frame_header == M.FRAME.BOTTOM then + if frame_header == FRAME.BOTTOM then break end - frame_parsers[frame_header](reader, event, symbols) + local name = frame_parsers[frame_header](reader, symbols) + table.insert(event.lua.callchain, 1, {name = name, type = frame_header}) end end @@ -100,7 +100,7 @@ local function parse_host_callchain(reader, event, symbols) while addr ~= 0 do local loc = symtab.loc(symbols, { addr = addr }) - table.insert(event.host.callchain, 1, loc) + table.insert(event.host.callchain, 1, symtab.demangle(symbols, loc)) addr = reader:read_uleb128() end end @@ -108,10 +108,20 @@ end --~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-- local function parse_trace_callchain(reader, event, symbols) - event.lua.trace.traceno = reader:read_uleb128() - event.lua.trace.addr = reader:read_uleb128() - event.lua.trace.line = reader:read_uleb128() - event.lua.trace.gen = symtab.loc(symbols, event.lua.trace).gen + local loc = {} + loc.traceno = reader:read_uleb128() + loc.addr = reader:read_uleb128() + loc.line = reader:read_uleb128() + + local gen = symtab.loc(symbols, loc).gen + local name_lua = symtab.demangle(symbols, { + addr = loc.addr, + traceno = loc.traceno, + gen = gen + }) + event.lua.trace = loc + event.lua.trace.gen = gen + event.lua.trace.name = name_lua end --~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-- @@ -143,18 +153,63 @@ local function parse_symtab(reader, symbols, vmstate) end local event_parsers = { - [M.VMST.INTERP] = parse_host_only, - [M.VMST.LFUNC] = parse_lua_host, - [M.VMST.FFUNC] = parse_lua_host, - [M.VMST.CFUNC] = parse_lua_host, - [M.VMST.GC] = parse_host_only, - [M.VMST.EXIT] = parse_host_only, - [M.VMST.RECORD] = parse_host_only, - [M.VMST.OPT] = parse_host_only, - [M.VMST.ASM] = parse_host_only, - [M.VMST.TRACE] = parse_trace, + [VMST.INTERP] = parse_host_only, + [VMST.LFUNC] = parse_lua_host, + [VMST.FFUNC] = parse_lua_host, + [VMST.CFUNC] = parse_lua_host, + [VMST.GC] = parse_host_only, + [VMST.EXIT] = parse_host_only, + [VMST.RECORD] = parse_host_only, + [VMST.OPT] = parse_host_only, + [VMST.ASM] = parse_host_only, + [VMST.TRACE] = parse_trace, } +local function insert_lua_callchain(chain, lua) + local ins_cnt = 0 + local name_lua + for _, fr in ipairs(lua.callchain) do + ins_cnt = ins_cnt + 1 + if fr.type == FRAME.CFUNC then + -- C function encountered, the next chunk + -- of frames is located on the C stack. + break + end + name_lua = fr.name + + if fr.type == FRAME.LFUNC + and lua.trace.traceno ~= nil + and lua.trace.addr == fr.addr + and lua.trace.line == fr.line + then + name_lua = lua.trace.name + end + + table.insert(chain, name_lua) + end + table.remove(lua.callchain, ins_cnt) +end + +local function merge(event) + local callchain = {} + + for _, name_host in ipairs(event.host.callchain) do + table.insert(callchain, name_host) + if string.match(name_host, '^lua_cpcall') ~= nil then + -- Any C function is present on both the C and the Lua + -- stacks. It is more convenient to get its info from the + -- host stack, since it has information about child frames. + table.remove(event.lua.callchain) + end + + if string.match(name_host, '^lua_p?call') ~= nil then + insert_lua_callchain(callchain, event.lua) + end + + end + return table.concat(callchain, ';') .. ';' +end + local function parse_event(reader, events, symbols) local event = new_event() @@ -171,8 +226,9 @@ local function parse_event(reader, events, symbols) event.lua.vmstate = vmstate event_parsers[vmstate](reader, event, symbols) - - table.insert(events, event) + local callchain = merge(event) + local cur_cnt = events[callchain] + events[callchain] = (cur_cnt or 0) + 1 return true end -- 2.41.0 ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit v3 4/4] sysprof: improve parser's memory footprint 2023-07-31 20:30 ` [Tarantool-patches] [PATCH luajit v3 4/4] sysprof: improve parser's memory footprint Maxim Kokryashkin via Tarantool-patches @ 2023-08-15 18:52 ` Igor Munkin via Tarantool-patches 0 siblings, 0 replies; 10+ messages in thread From: Igor Munkin via Tarantool-patches @ 2023-08-15 18:52 UTC (permalink / raw) To: Maxim Kokryashkin; +Cc: tarantool-patches Max, Thanks for the patch! Everything is OK in general, but please consider my comments below. On 31.07.23, Maxim Kokryashkin wrote: > This patch reduces sysprof's parser memory footprint, > by avoiding reading all callchains before collapsing them. > Instead of it, parser merges stacks immediately after > reading them and stores counts in a lua table. > > The `collapse.lua` module is purged as a result of the > patch, but it is left as a stub to keep the integrational > testing intact. This stub should be removed in the next > series. > > Resolves tarantool/tarantool#8700 > --- > tools/CMakeLists.txt | 4 ++ > tools/sysprof.lua | 21 +------ > tools/sysprof/collapse.lua | 123 +----------------------------------- > tools/sysprof/parse.lua | 126 ++++++++++++++++++++++++++----------- > 4 files changed, 101 insertions(+), 173 deletions(-) > > diff --git a/tools/CMakeLists.txt b/tools/CMakeLists.txt > index dd7ec6bd..1ae559ee 100644 > --- a/tools/CMakeLists.txt > +++ b/tools/CMakeLists.txt > @@ -112,6 +112,8 @@ else() > add_custom_target(tools-parse-sysprof EXCLUDE_FROM_ALL DEPENDS > luajit-parse-sysprof > sysprof/parse.lua > + # FIXME: This line is not deleted only for the sake of integrational > + # testing. It should be deleted in the next series. Minor: I'd rather left TODO instead of FIXME, but this is not a big deal, so feel free to ignore. > sysprof/collapse.lua > sysprof.lua > utils/bufread.lua > @@ -121,6 +123,8 @@ else() > > install(FILES > ${CMAKE_CURRENT_SOURCE_DIR}/sysprof/parse.lua > + # FIXME: This line is not deleted only for the sake of integrational > + # testing. It should be deleted in the next series. Ditto. > ${CMAKE_CURRENT_SOURCE_DIR}/sysprof/collapse.lua > DESTINATION ${LUAJIT_DATAROOTDIR}/sysprof > PERMISSIONS <snipped> > diff --git a/tools/sysprof/collapse.lua b/tools/sysprof/collapse.lua > index ac5269ea..9e815e0d 100755 > --- a/tools/sysprof/collapse.lua > +++ b/tools/sysprof/collapse.lua > @@ -1,120 +1,3 @@ <snipped> > +-- FIXME: This line is not deleted only for the sake of > +-- integrational testing. It should be deleted in the > +-- next series. Honestly, I would literally "purge" collapse.lua the following way: replace all of its contents with the only assert call to check that nobody will use it. However, if it breaks Tarantool, I agree to left it intact until the file is removed completely from the source tree. > diff --git a/tools/sysprof/parse.lua b/tools/sysprof/parse.lua > index 5b52f104..19add4f3 100755 > --- a/tools/sysprof/parse.lua > +++ b/tools/sysprof/parse.lua <snipped> > @@ -143,18 +153,63 @@ local function parse_symtab(reader, symbols, vmstate) <snipped> > > +local function insert_lua_callchain(chain, lua) > + local ins_cnt = 0 > + local name_lua > + for _, fr in ipairs(lua.callchain) do > + ins_cnt = ins_cnt + 1 > + if fr.type == FRAME.CFUNC then > + -- C function encountered, the next chunk > + -- of frames is located on the C stack. > + break > + end > + name_lua = fr.name > + > + if fr.type == FRAME.LFUNC > + and lua.trace.traceno ~= nil > + and lua.trace.addr == fr.addr > + and lua.trace.line == fr.line > + then > + name_lua = lua.trace.name > + end Something bad with indentation. I guess there should be something similar to this: | if fr.type == FRAME.LFUNC | and lua.trace.traceno ~= nil | and lua.trace.addr == fr.addr | and lua.trace.line == fr.line | then | name_lua = lua.trace.name | end > + > + table.insert(chain, name_lua) > + end > + table.remove(lua.callchain, ins_cnt) > +end <snipped> > @@ -171,8 +226,9 @@ local function parse_event(reader, events, symbols) > event.lua.vmstate = vmstate > > event_parsers[vmstate](reader, event, symbols) > - > - table.insert(events, event) > + local callchain = merge(event) > + local cur_cnt = events[callchain] > + events[callchain] = (cur_cnt or 0) + 1 Minor: The following line looks better (IMHO), but feel free to ignore. | events[callchain] = (events[callchain] or 0) + 1 > return true > end > > -- > 2.41.0 > -- Best regards, IM ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit v3 0/4] sysprof: parser refactoring 2023-07-31 20:30 [Tarantool-patches] [PATCH luajit v3 0/4] sysprof: parser refactoring Maxim Kokryashkin via Tarantool-patches ` (3 preceding siblings ...) 2023-07-31 20:30 ` [Tarantool-patches] [PATCH luajit v3 4/4] sysprof: improve parser's memory footprint Maxim Kokryashkin via Tarantool-patches @ 2023-08-15 18:54 ` Igor Munkin via Tarantool-patches 4 siblings, 0 replies; 10+ messages in thread From: Igor Munkin via Tarantool-patches @ 2023-08-15 18:54 UTC (permalink / raw) To: Maxim Kokryashkin; +Cc: tarantool-patches Max, Thanks for the fixes! Please consider my comments per patches. On 31.07.23, Maxim Kokryashkin wrote: > This patchset is targeted at reducing the memory footprint > of the sysprof parser. It is mainly achieved by fixing a > nasty bug in the AVL tree implementation, which caused most > of the memory consumption, and by essentially purging the collapse.lua > module, which means that there is no need to store the complete > information about every parsed event and it is sufficient to merge > stacks inplace. > > Alongside with described changes two minor improvements were done: > - `luajit-parse-sysprof` now has permissions to be executed > - `split_by_vmstate` option is purged, because no one uses it, so there > is no actual reason to maintain it. > > ## Note for Igor: > There is no way to split the 'memory footprint reduction'-part and the > 'collapse.lua purge'-part, since the main reason why memory consumption > is so much lower now, other the AVL-tree bug, is that we don't need to > pass a complete profile information between the `parse.lua` and > `collapse.lua`. That, basically, almost turns the profile parsing > process into some kind of stream-based procedure. Got it, thanks for explanation! > > PR: https://github.com/tarantool/tarantool/pull/8703 > Branch: https://github.com/tarantool/luajit/tree/fckxorg/gh-8700-sysprof-parser-refactoring > > Maxim Kokryashkin (4): > utils: remove unnecessary insertion in AVL-tree > sysprof: remove `split by vmstate` option > tools: add execution permission to sysprof parser > sysprof: improve parser's memory footprint > > .../gh-5813-resolving-of-c-symbols.test.lua | 6 +- > tools/CMakeLists.txt | 4 + > tools/luajit-parse-sysprof.in | 0 > tools/sysprof.lua | 28 +--- > tools/sysprof/collapse.lua | 127 +----------------- > tools/sysprof/parse.lua | 126 ++++++++++++----- > tools/utils/avl.lua | 4 +- > tools/utils/symtab.lua | 2 +- > 8 files changed, 106 insertions(+), 191 deletions(-) > mode change 100644 => 100755 tools/luajit-parse-sysprof.in > > -- > 2.41.0 > -- Best regards, IM ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2023-08-15 19:09 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2023-07-31 20:30 [Tarantool-patches] [PATCH luajit v3 0/4] sysprof: parser refactoring Maxim Kokryashkin via Tarantool-patches 2023-07-31 20:30 ` [Tarantool-patches] [PATCH luajit v3 1/4] utils: remove unnecessary insertion in AVL-tree Maxim Kokryashkin via Tarantool-patches 2023-08-15 18:50 ` Igor Munkin via Tarantool-patches 2023-07-31 20:30 ` [Tarantool-patches] [PATCH luajit v3 2/4] sysprof: remove `split by vmstate` option Maxim Kokryashkin via Tarantool-patches 2023-08-15 18:51 ` Igor Munkin via Tarantool-patches 2023-07-31 20:30 ` [Tarantool-patches] [PATCH luajit v3 3/4] tools: add execution permission to sysprof parser Maxim Kokryashkin via Tarantool-patches 2023-08-15 18:51 ` Igor Munkin via Tarantool-patches 2023-07-31 20:30 ` [Tarantool-patches] [PATCH luajit v3 4/4] sysprof: improve parser's memory footprint Maxim Kokryashkin via Tarantool-patches 2023-08-15 18:52 ` Igor Munkin via Tarantool-patches 2023-08-15 18:54 ` [Tarantool-patches] [PATCH luajit v3 0/4] sysprof: parser refactoring Igor Munkin via Tarantool-patches
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox