This patch set improves the LuaJIT's symtab resolution, by making the `lj_symtab` module read the `.symtab` section instead of `.dynsym`. Also, it changes the stack merge algorithm in the parser, providing support for sandwiched stacks. Branch: https://github.com/tarantool/luajit/tree/fckxorg/gh-7244-sysprof-sandwich PR: https://github.com/tarantool/tarantool/pull/7342 Maxim Kokryashkin (2): symtab: fix static symtab dump sysprof: add stack sandwich support src/lj_symtab.c | 28 ++++++++++++++++++++++++++++ tools/sysprof/collapse.lua | 31 +++++++++++++++++++------------ 2 files changed, 47 insertions(+), 12 deletions(-) -- 2.36.1
The `dl_iterate_phdr` returns an empty string as a name for the executable from which it was called. It is still possible to access its dynamic symbol table, but it is vital for sysprof to obtain the main symbol table for the LuaJIT executable. To do so, we need a valid path to the executable. Since there is no way to obtain the path to a running executable using the C standard library, this commit adds call to `readlink` to gather the symbolic link from `/proc/self/exe`. Most of the UNIX-based systems have procfs, so it is not a problem. Needed for tarantool/tarantool#7244 --- src/lj_symtab.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/src/lj_symtab.c b/src/lj_symtab.c index 2b2b205b..23696401 100644 --- a/src/lj_symtab.c +++ b/src/lj_symtab.c @@ -16,10 +16,12 @@ #if LJ_HASRESOLVER +#include <linux/limits.h> #include <elf.h> #include <link.h> #include <stdio.h> #include <sys/auxv.h> +#include <unistd.h> #include "lj_gc.h" #endif @@ -352,6 +354,7 @@ static int resolve_symbolnames(struct dl_phdr_info *info, size_t info_size, struct lj_wbuf *buf = conf->buf; lua_State *L = conf->L; const uint8_t header = conf->header; + char executable_path[PATH_MAX] = ""; uint32_t lib_cnt = 0; @@ -387,6 +390,31 @@ static int resolve_symbolnames(struct dl_phdr_info *info, size_t info_size, ** Main way: try to open ELF and read SHT_SYMTAB, SHT_STRTAB and SHT_HASH ** sections from it. */ + + /* + ** The `dl_iterate_phdr` returns an empty string as a name for + ** the executable from which it was called. It is still possible to + ** access its dynamic symbol table, but it is vital for sysprof to obtain + ** the main symbol table for the LuaJIT executable. To do so, we need a + ** valid path to the executable. Since there is no way to obtain the + ** path to a running executable using the C standard library, the only + ** more or less reliable way to do this is by reading the symbolic link + ** from `/proc/self/exe`. Most of the UNIX-based systems have procfs, so + ** it is not a problem. + */ + if (*info->dlpi_name == 0) { + if (-1 != readlink("/proc/self/exe", executable_path, PATH_MAX)) + info->dlpi_name = executable_path; + else + /* + ** It is impossible for sysprof to function properly without the + ** LuaJIT's .symtab section present. The assertion below + ** is unlikely to be triggered on any system supported by sysprof, + ** unless someone have deleted the LuaJIT binary right after the + ** start. + */ + lua_assert(0); + } if (dump_sht_symtab(info->dlpi_name, buf, L, header, info->dlpi_addr) == 0) { ++conf->cur_lib; } -- 2.36.1
This commit introduces stack sandwich support to the sysprof's parser. Sandwich is handled the following way: 1. If there is a `lua_cpcall` in the C stack trace, nothing is done on the C stack and the corresponding frame on the Lua stack is removed. 2. If there is a `lua_call`/`lua_pcall`, then next chunk of frames is from the Lua stack. That chunk ends with either a CFUNC, or a stack trace end. Resolves tarantool/tarantool#7244 --- tools/sysprof/collapse.lua | 31 +++++++++++++++++++------------ 1 file changed, 19 insertions(+), 12 deletions(-) diff --git a/tools/sysprof/collapse.lua b/tools/sysprof/collapse.lua index b0c4afc8..b815baf3 100755 --- a/tools/sysprof/collapse.lua +++ b/tools/sysprof/collapse.lua @@ -39,9 +39,11 @@ local function insert(name, node, is_leaf) end local function insert_lua_callchain(chain, lua, symbols) + local ins_cnt = 0 for _,fr in pairs(lua.callchain) do local name_lua + ins_cnt = ins_cnt + 1 if fr.type == parse.FRAME.FFUNC then name_lua = vmdef.ffnames[fr.ffid] else @@ -58,38 +60,43 @@ local function insert_lua_callchain(chain, lua, symbols) gen = fr.gen }) end + + if fr.type == parse.FRAME.CFUNC then + -- C function encountered, the next chunk + -- of frames is located on the C stack. + break + end end table.insert(chain, 1, { name = name_lua }) end + table.remove(lua.callchain, ins_cnt) end + -- merge lua and host callchains into one callchain representing -- transfer of control local function merge(event, symbols, sep_vmst) local cc = {} - local lua_inserted = false for _,h_fr in pairs(event.host.callchain) do local name_host = symtab.demangle(symbols, { addr = h_fr.addr, gen = h_fr.gen }) + table.insert(cc, 1, { name = name_host }) - -- We assume that usually the transfer of control - -- looks like: - -- HOST -> LUA -> HOST - -- so for now, lua callchain starts from lua_pcall() call - if name_host == 'lua_pcall' then - insert_lua_callchain(cc, event.lua, symbols) - lua_inserted = true + if string.match(name_host, '^lua_cpcall') ~= nil then + -- Any C function is present on both the C and the Lua + -- stacks. It is more convenient to get its info from the host stack, + -- since it has information about child frames. + table.remove(event.lua.callchain, 1) end - table.insert(cc, 1, { name = name_host }) - end + if string.match(name_host, '^lua_p?call') ~= nil then + insert_lua_callchain(cc, event.lua, symbols) + end - if lua_inserted == false then - insert_lua_callchain(cc, event.lua, symbols) end if sep_vmst == true then -- 2.36.1
Hi, Maxim! Thanks for the patch! LGTM, with several comments below. On 04.07.22, Maxim Kokryashkin wrote: > The `dl_iterate_phdr` returns an empty string as a name for > the executable from which it was called. Don't get: why it is the problem? I.e. why it is problem to obtain the main symbol table? Is it sysprof parser restrictions? > It is still possible > to access its dynamic symbol table, but it is vital for > sysprof to obtain the main symbol table for the LuaJIT > executable. To do so, we need a valid path to the executable. > > Since there is no way to obtain the path to a running executable > using the C standard library, this commit adds call to > `readlink` to gather the symbolic link from `/proc/self/exe`. > Most of the UNIX-based systems have procfs, so it is not > a problem. > > Needed for tarantool/tarantool#7244 > --- > src/lj_symtab.c | 28 ++++++++++++++++++++++++++++ > 1 file changed, 28 insertions(+) > > diff --git a/src/lj_symtab.c b/src/lj_symtab.c > index 2b2b205b..23696401 100644 > --- a/src/lj_symtab.c > +++ b/src/lj_symtab.c > @@ -16,10 +16,12 @@ > > #if LJ_HASRESOLVER > > +#include <linux/limits.h> > #include <elf.h> > #include <link.h> > #include <stdio.h> > #include <sys/auxv.h> > +#include <unistd.h> > #include "lj_gc.h" > #endif > > @@ -352,6 +354,7 @@ static int resolve_symbolnames(struct dl_phdr_info *info, size_t info_size, > struct lj_wbuf *buf = conf->buf; > lua_State *L = conf->L; > const uint8_t header = conf->header; > + char executable_path[PATH_MAX] = ""; Why do we need this initialization? ^ > > uint32_t lib_cnt = 0; > > @@ -387,6 +390,31 @@ static int resolve_symbolnames(struct dl_phdr_info *info, size_t info_size, > ** Main way: try to open ELF and read SHT_SYMTAB, SHT_STRTAB and SHT_HASH > ** sections from it. > */ > + > + /* > + ** The `dl_iterate_phdr` returns an empty string as a name for > + ** the executable from which it was called. It is still possible to > + ** access its dynamic symbol table, but it is vital for sysprof to obtain > + ** the main symbol table for the LuaJIT executable. To do so, we need a > + ** valid path to the executable. Since there is no way to obtain the > + ** path to a running executable using the C standard library, the only > + ** more or less reliable way to do this is by reading the symbolic link > + ** from `/proc/self/exe`. Most of the UNIX-based systems have procfs, so > + ** it is not a problem. > + */ > + if (*info->dlpi_name == 0) { Nit: s/0/'\0'/ to emphasize that we check that string contains only NULL char. > + if (-1 != readlink("/proc/self/exe", executable_path, PATH_MAX)) Nit: `readlink("/proc/self/exe", executable_path, PATH_MAX) != -1` > + info->dlpi_name = executable_path; > + else > + /* > + ** It is impossible for sysprof to function properly without the > + ** LuaJIT's .symtab section present. The assertion below > + ** is unlikely to be triggered on any system supported by sysprof, > + ** unless someone have deleted the LuaJIT binary right after the > + ** start. > + */ > + lua_assert(0); > + } > if (dump_sht_symtab(info->dlpi_name, buf, L, header, info->dlpi_addr) == 0) { > ++conf->cur_lib; > } > -- > 2.36.1 > -- Best regards, Sergey Kaplun
Hi, Maxim! Thanks for the patch! LGTM, except a few nits below. On 04.07.22, Maxim Kokryashkin wrote: > This commit introduces stack sandwich support to the > sysprof's parser. > > Sandwich is handled the following way: Typo: s/Sandwich/The sandwich/ > 1. If there is a `lua_cpcall` in the C stack trace, nothing > is done on the C stack and the corresponding frame on the > Lua stack is removed. > 2. If there is a `lua_call`/`lua_pcall`, then next chunk of Typo: s/next chunk/the next chunk/ > frames is from the Lua stack. That chunk ends with either > a CFUNC, or a stack trace end. > > Resolves tarantool/tarantool#7244 > --- > tools/sysprof/collapse.lua | 31 +++++++++++++++++++------------ > 1 file changed, 19 insertions(+), 12 deletions(-) > > diff --git a/tools/sysprof/collapse.lua b/tools/sysprof/collapse.lua > index b0c4afc8..b815baf3 100755 > --- a/tools/sysprof/collapse.lua > +++ b/tools/sysprof/collapse.lua > @@ -39,9 +39,11 @@ local function insert(name, node, is_leaf) > end > > local function insert_lua_callchain(chain, lua, symbols) <snipped> > end > + table.remove(lua.callchain, ins_cnt) > end > > + Nit: looks like this newline is excess. > -- merge lua and host callchains into one callchain representing > -- transfer of control > local function merge(event, symbols, sep_vmst) > local cc = {} > - local lua_inserted = false > > for _,h_fr in pairs(event.host.callchain) do > local name_host = symtab.demangle(symbols, { > addr = h_fr.addr, > gen = h_fr.gen > }) > + table.insert(cc, 1, { name = name_host }) > > - -- We assume that usually the transfer of control > - -- looks like: > - -- HOST -> LUA -> HOST > - -- so for now, lua callchain starts from lua_pcall() call > - if name_host == 'lua_pcall' then > - insert_lua_callchain(cc, event.lua, symbols) > - lua_inserted = true > + if string.match(name_host, '^lua_cpcall') ~= nil then Why do we use `string.match()` instead `==`? > + -- Any C function is present on both the C and the Lua > + -- stacks. It is more convenient to get its info from the host stack, > + -- since it has information about child frames. > + table.remove(event.lua.callchain, 1) > end > > - table.insert(cc, 1, { name = name_host }) > - end > + if string.match(name_host, '^lua_p?call') ~= nil then I suggest to check full function name, so use '^lua_p?call$' as pattern instead. > + insert_lua_callchain(cc, event.lua, symbols) > + end > > - if lua_inserted == false then > - insert_lua_callchain(cc, event.lua, symbols) > end > > if sep_vmst == true then > -- > 2.36.1 > -- Best regards, Sergey Kaplun
Max, Thanks for the patch! LGTM, after the fixes you've made[1] against Sergey's review comments. On 04.07.22, Maxim Kokryashkin wrote: > This patch set improves the LuaJIT's symtab resolution, by making > the `lj_symtab` module read the `.symtab` section instead > of `.dynsym`. > > Also, it changes the stack merge algorithm in the parser, > providing support for sandwiched stacks. > > Branch: https://github.com/tarantool/luajit/tree/fckxorg/gh-7244-sysprof-sandwich > PR: https://github.com/tarantool/tarantool/pull/7342 > > Maxim Kokryashkin (2): > symtab: fix static symtab dump > sysprof: add stack sandwich support > > src/lj_symtab.c | 28 ++++++++++++++++++++++++++++ > tools/sysprof/collapse.lua | 31 +++++++++++++++++++------------ > 2 files changed, 47 insertions(+), 12 deletions(-) > > -- > 2.36.1 > [1]: https://github.com/tarantool/luajit/commit/dfa0f97 -- Best regards, IM
Max,
I've checked the patchset into all required long-term branches in
tarantool/luajit and bumped a new version in master and 2.10.
On 04.07.22, Maxim Kokryashkin wrote:
> This patch set improves the LuaJIT's symtab resolution, by making
> the `lj_symtab` module read the `.symtab` section instead
> of `.dynsym`.
>
> Also, it changes the stack merge algorithm in the parser,
> providing support for sandwiched stacks.
>
> Branch: https://github.com/tarantool/luajit/tree/fckxorg/gh-7244-sysprof-sandwich
> PR: https://github.com/tarantool/tarantool/pull/7342
>
> Maxim Kokryashkin (2):
> symtab: fix static symtab dump
> sysprof: add stack sandwich support
>
> src/lj_symtab.c | 28 ++++++++++++++++++++++++++++
> tools/sysprof/collapse.lua | 31 +++++++++++++++++++------------
> 2 files changed, 47 insertions(+), 12 deletions(-)
>
> --
> 2.36.1
>
--
Best regards,
IM