Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH luajit 0/2] sysprof: fix inconsistencies
@ 2022-07-03 21:38 Maxim Kokryashkin via Tarantool-patches
  2022-07-03 21:38 ` [Tarantool-patches] [PATCH luajit 1/2] symtab: fix static symtab dump Maxim Kokryashkin via Tarantool-patches
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2022-07-03 21:38 UTC (permalink / raw)
  To: tarantool-patches, imun, skaplun

This patch set improves the LuaJIT's symtab resolution, by making
the `lj_symtab` module read the `.symtab` section instead
of `.dynsym`.

Also, it changes the stack merge algorithm in the parser,
providing support for sandwiched stacks.

Branch: https://github.com/tarantool/luajit/tree/fckxorg/gh-7244-sysprof-sandwich
PR: https://github.com/tarantool/tarantool/pull/7342

Maxim Kokryashkin (2):
  symtab: fix static symtab dump
  sysprof: add stack sandwich support

 src/lj_symtab.c            | 28 ++++++++++++++++++++++++++++
 tools/sysprof/collapse.lua | 31 +++++++++++++++++++------------
 2 files changed, 47 insertions(+), 12 deletions(-)

--
2.36.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Tarantool-patches] [PATCH luajit 1/2] symtab: fix static symtab dump
  2022-07-03 21:38 [Tarantool-patches] [PATCH luajit 0/2] sysprof: fix inconsistencies Maxim Kokryashkin via Tarantool-patches
@ 2022-07-03 21:38 ` Maxim Kokryashkin via Tarantool-patches
  2022-07-16  7:21   ` Sergey Kaplun via Tarantool-patches
  2022-07-03 21:38 ` [Tarantool-patches] [PATCH luajit 2/2] sysprof: add stack sandwich support Maxim Kokryashkin via Tarantool-patches
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2022-07-03 21:38 UTC (permalink / raw)
  To: tarantool-patches, imun, skaplun

The `dl_iterate_phdr` returns an empty string as a name for
the executable from which it was called. It is still possible
to access its dynamic symbol table, but it is vital for
sysprof to obtain the main symbol table for the LuaJIT
executable. To do so, we need a valid path to the executable.

Since there is no way to obtain the path to a running executable
using the C standard library, this commit adds call to
`readlink` to gather the symbolic link from `/proc/self/exe`.
Most of the UNIX-based systems have procfs, so it is not
a problem.

Needed for tarantool/tarantool#7244
---
 src/lj_symtab.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/src/lj_symtab.c b/src/lj_symtab.c
index 2b2b205b..23696401 100644
--- a/src/lj_symtab.c
+++ b/src/lj_symtab.c
@@ -16,10 +16,12 @@
 
 #if LJ_HASRESOLVER
 
+#include <linux/limits.h>
 #include <elf.h>
 #include <link.h>
 #include <stdio.h>
 #include <sys/auxv.h>
+#include <unistd.h>
 #include "lj_gc.h"
 #endif
 
@@ -352,6 +354,7 @@ static int resolve_symbolnames(struct dl_phdr_info *info, size_t info_size,
   struct lj_wbuf *buf = conf->buf;
   lua_State *L = conf->L;
   const uint8_t header = conf->header;
+  char executable_path[PATH_MAX] = "";
 
   uint32_t lib_cnt = 0;
 
@@ -387,6 +390,31 @@ static int resolve_symbolnames(struct dl_phdr_info *info, size_t info_size,
   ** Main way: try to open ELF and read SHT_SYMTAB, SHT_STRTAB and SHT_HASH
   ** sections from it.
   */
+
+  /*
+  ** The `dl_iterate_phdr` returns an empty string as a name for
+  ** the executable from which it was called. It is still possible to
+  ** access its dynamic symbol table, but it is vital for sysprof to obtain
+  ** the main symbol table for the LuaJIT executable. To do so, we need a
+  ** valid path to the executable. Since there is no way to obtain the
+  ** path to a running executable using the C standard library, the only
+  ** more or less reliable way to do this is by reading the symbolic link
+  ** from `/proc/self/exe`. Most of the UNIX-based systems have procfs, so
+  ** it is not a problem.
+  */
+  if (*info->dlpi_name == 0) {
+    if (-1 != readlink("/proc/self/exe", executable_path, PATH_MAX))
+      info->dlpi_name = executable_path;
+    else
+      /*
+      ** It is impossible for sysprof to function properly without the
+      ** LuaJIT's .symtab section present. The assertion below
+      ** is unlikely to be triggered on any system supported by sysprof,
+      ** unless someone have deleted the LuaJIT binary right after the
+      ** start.
+      */
+      lua_assert(0);
+  }
   if (dump_sht_symtab(info->dlpi_name, buf, L, header, info->dlpi_addr) == 0) {
     ++conf->cur_lib;
   }
-- 
2.36.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Tarantool-patches] [PATCH luajit 2/2] sysprof: add stack sandwich support
  2022-07-03 21:38 [Tarantool-patches] [PATCH luajit 0/2] sysprof: fix inconsistencies Maxim Kokryashkin via Tarantool-patches
  2022-07-03 21:38 ` [Tarantool-patches] [PATCH luajit 1/2] symtab: fix static symtab dump Maxim Kokryashkin via Tarantool-patches
@ 2022-07-03 21:38 ` Maxim Kokryashkin via Tarantool-patches
  2022-07-16  7:32   ` Sergey Kaplun via Tarantool-patches
  2022-07-18 17:23 ` [Tarantool-patches] [PATCH luajit 0/2] sysprof: fix inconsistencies Igor Munkin via Tarantool-patches
  2022-08-10 14:34 ` Igor Munkin via Tarantool-patches
  3 siblings, 1 reply; 7+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2022-07-03 21:38 UTC (permalink / raw)
  To: tarantool-patches, imun, skaplun

This commit introduces stack sandwich support to the
sysprof's parser.

Sandwich is handled the following way:
  1. If there is a `lua_cpcall` in the C stack trace, nothing
  is done on the C stack and the corresponding frame on the
  Lua stack is removed.
  2. If there is a `lua_call`/`lua_pcall`, then next chunk of
  frames is from the Lua stack. That chunk ends with either
  a CFUNC, or a stack trace end.

Resolves tarantool/tarantool#7244
---
 tools/sysprof/collapse.lua | 31 +++++++++++++++++++------------
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/tools/sysprof/collapse.lua b/tools/sysprof/collapse.lua
index b0c4afc8..b815baf3 100755
--- a/tools/sysprof/collapse.lua
+++ b/tools/sysprof/collapse.lua
@@ -39,9 +39,11 @@ local function insert(name, node, is_leaf)
 end
 
 local function insert_lua_callchain(chain, lua, symbols)
+  local ins_cnt = 0
   for _,fr in pairs(lua.callchain) do
     local name_lua
 
+    ins_cnt = ins_cnt + 1
     if fr.type == parse.FRAME.FFUNC then
       name_lua = vmdef.ffnames[fr.ffid]
     else
@@ -58,38 +60,43 @@ local function insert_lua_callchain(chain, lua, symbols)
           gen = fr.gen
         })
       end
+
+      if fr.type == parse.FRAME.CFUNC then
+        -- C function encountered, the next chunk
+        -- of frames is located on the C stack.
+        break
+      end
     end
 
     table.insert(chain, 1, { name = name_lua })
   end
+  table.remove(lua.callchain, ins_cnt)
 end
 
+
 -- merge lua and host callchains into one callchain representing
 -- transfer of control
 local function merge(event, symbols, sep_vmst)
   local cc = {}
-  local lua_inserted = false
 
   for _,h_fr in pairs(event.host.callchain) do
     local name_host = symtab.demangle(symbols, {
       addr = h_fr.addr,
       gen = h_fr.gen
     })
+    table.insert(cc, 1, { name = name_host })
 
-    -- We assume that usually the transfer of control
-    -- looks like:
-    --    HOST -> LUA -> HOST
-    -- so for now, lua callchain starts from lua_pcall() call
-    if name_host == 'lua_pcall' then
-      insert_lua_callchain(cc, event.lua, symbols)
-      lua_inserted = true
+    if string.match(name_host, '^lua_cpcall') ~= nil then
+      -- Any C function is present on both the C and the Lua
+      -- stacks. It is more convenient to get its info from the host stack,
+      -- since it has information about child frames.
+      table.remove(event.lua.callchain, 1)
     end
 
-    table.insert(cc, 1, { name = name_host })
-  end
+    if string.match(name_host, '^lua_p?call') ~= nil then
+      insert_lua_callchain(cc, event.lua, symbols)
+    end
 
-  if lua_inserted == false then
-    insert_lua_callchain(cc, event.lua, symbols)
   end
 
   if sep_vmst == true then
-- 
2.36.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 1/2] symtab: fix static symtab dump
  2022-07-03 21:38 ` [Tarantool-patches] [PATCH luajit 1/2] symtab: fix static symtab dump Maxim Kokryashkin via Tarantool-patches
@ 2022-07-16  7:21   ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 7+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2022-07-16  7:21 UTC (permalink / raw)
  To: Maxim Kokryashkin; +Cc: tarantool-patches

Hi, Maxim!

Thanks for the patch!
LGTM, with several comments below.

On 04.07.22, Maxim Kokryashkin wrote:
> The `dl_iterate_phdr` returns an empty string as a name for
> the executable from which it was called.

Don't get: why it is the problem? I.e. why it is problem to obtain the
main symbol table? Is it sysprof parser restrictions?

>                                          It is still possible
> to access its dynamic symbol table, but it is vital for
> sysprof to obtain the main symbol table for the LuaJIT
> executable. To do so, we need a valid path to the executable.
> 
> Since there is no way to obtain the path to a running executable
> using the C standard library, this commit adds call to
> `readlink` to gather the symbolic link from `/proc/self/exe`.
> Most of the UNIX-based systems have procfs, so it is not
> a problem.
> 
> Needed for tarantool/tarantool#7244
> ---
>  src/lj_symtab.c | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/src/lj_symtab.c b/src/lj_symtab.c
> index 2b2b205b..23696401 100644
> --- a/src/lj_symtab.c
> +++ b/src/lj_symtab.c
> @@ -16,10 +16,12 @@
>  
>  #if LJ_HASRESOLVER
>  
> +#include <linux/limits.h>
>  #include <elf.h>
>  #include <link.h>
>  #include <stdio.h>
>  #include <sys/auxv.h>
> +#include <unistd.h>
>  #include "lj_gc.h"
>  #endif
>  
> @@ -352,6 +354,7 @@ static int resolve_symbolnames(struct dl_phdr_info *info, size_t info_size,
>    struct lj_wbuf *buf = conf->buf;
>    lua_State *L = conf->L;
>    const uint8_t header = conf->header;
> +  char executable_path[PATH_MAX] = "";

Why do we need this initialization?   ^

>  
>    uint32_t lib_cnt = 0;
>  
> @@ -387,6 +390,31 @@ static int resolve_symbolnames(struct dl_phdr_info *info, size_t info_size,
>    ** Main way: try to open ELF and read SHT_SYMTAB, SHT_STRTAB and SHT_HASH
>    ** sections from it.
>    */
> +
> +  /*
> +  ** The `dl_iterate_phdr` returns an empty string as a name for
> +  ** the executable from which it was called. It is still possible to
> +  ** access its dynamic symbol table, but it is vital for sysprof to obtain
> +  ** the main symbol table for the LuaJIT executable. To do so, we need a
> +  ** valid path to the executable. Since there is no way to obtain the
> +  ** path to a running executable using the C standard library, the only
> +  ** more or less reliable way to do this is by reading the symbolic link
> +  ** from `/proc/self/exe`. Most of the UNIX-based systems have procfs, so
> +  ** it is not a problem.
> +  */
> +  if (*info->dlpi_name == 0) {

Nit: s/0/'\0'/ to emphasize that we check that string contains only NULL
char.

> +    if (-1 != readlink("/proc/self/exe", executable_path, PATH_MAX))

Nit: `readlink("/proc/self/exe", executable_path, PATH_MAX) != -1`

> +      info->dlpi_name = executable_path;
> +    else
> +      /*
> +      ** It is impossible for sysprof to function properly without the
> +      ** LuaJIT's .symtab section present. The assertion below
> +      ** is unlikely to be triggered on any system supported by sysprof,
> +      ** unless someone have deleted the LuaJIT binary right after the
> +      ** start.
> +      */
> +      lua_assert(0);
> +  }
>    if (dump_sht_symtab(info->dlpi_name, buf, L, header, info->dlpi_addr) == 0) {
>      ++conf->cur_lib;
>    }
> -- 
> 2.36.1
> 

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/2] sysprof: add stack sandwich support
  2022-07-03 21:38 ` [Tarantool-patches] [PATCH luajit 2/2] sysprof: add stack sandwich support Maxim Kokryashkin via Tarantool-patches
@ 2022-07-16  7:32   ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 7+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2022-07-16  7:32 UTC (permalink / raw)
  To: Maxim Kokryashkin; +Cc: tarantool-patches

Hi, Maxim!

Thanks for the patch!

LGTM, except a few nits below.

On 04.07.22, Maxim Kokryashkin wrote:
> This commit introduces stack sandwich support to the
> sysprof's parser.
> 
> Sandwich is handled the following way:

Typo: s/Sandwich/The sandwich/

>   1. If there is a `lua_cpcall` in the C stack trace, nothing
>   is done on the C stack and the corresponding frame on the
>   Lua stack is removed.
>   2. If there is a `lua_call`/`lua_pcall`, then next chunk of

Typo: s/next chunk/the next chunk/

>   frames is from the Lua stack. That chunk ends with either
>   a CFUNC, or a stack trace end.
> 
> Resolves tarantool/tarantool#7244
> ---
>  tools/sysprof/collapse.lua | 31 +++++++++++++++++++------------
>  1 file changed, 19 insertions(+), 12 deletions(-)
> 
> diff --git a/tools/sysprof/collapse.lua b/tools/sysprof/collapse.lua
> index b0c4afc8..b815baf3 100755
> --- a/tools/sysprof/collapse.lua
> +++ b/tools/sysprof/collapse.lua
> @@ -39,9 +39,11 @@ local function insert(name, node, is_leaf)
>  end
>  
>  local function insert_lua_callchain(chain, lua, symbols)

<snipped>

>    end
> +  table.remove(lua.callchain, ins_cnt)
>  end
>  
> +

Nit: looks like this newline is excess.

>  -- merge lua and host callchains into one callchain representing
>  -- transfer of control
>  local function merge(event, symbols, sep_vmst)
>    local cc = {}
> -  local lua_inserted = false
>  
>    for _,h_fr in pairs(event.host.callchain) do
>      local name_host = symtab.demangle(symbols, {
>        addr = h_fr.addr,
>        gen = h_fr.gen
>      })
> +    table.insert(cc, 1, { name = name_host })
>  
> -    -- We assume that usually the transfer of control
> -    -- looks like:
> -    --    HOST -> LUA -> HOST
> -    -- so for now, lua callchain starts from lua_pcall() call
> -    if name_host == 'lua_pcall' then
> -      insert_lua_callchain(cc, event.lua, symbols)
> -      lua_inserted = true
> +    if string.match(name_host, '^lua_cpcall') ~= nil then

Why do we use `string.match()` instead `==`?

> +      -- Any C function is present on both the C and the Lua
> +      -- stacks. It is more convenient to get its info from the host stack,
> +      -- since it has information about child frames.
> +      table.remove(event.lua.callchain, 1)
>      end
>  
> -    table.insert(cc, 1, { name = name_host })
> -  end
> +    if string.match(name_host, '^lua_p?call') ~= nil then

I suggest to check full function name, so use '^lua_p?call$' as pattern
instead.

> +      insert_lua_callchain(cc, event.lua, symbols)
> +    end
>  
> -  if lua_inserted == false then
> -    insert_lua_callchain(cc, event.lua, symbols)
>    end
>  
>    if sep_vmst == true then
> -- 
> 2.36.1
> 

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 0/2] sysprof: fix inconsistencies
  2022-07-03 21:38 [Tarantool-patches] [PATCH luajit 0/2] sysprof: fix inconsistencies Maxim Kokryashkin via Tarantool-patches
  2022-07-03 21:38 ` [Tarantool-patches] [PATCH luajit 1/2] symtab: fix static symtab dump Maxim Kokryashkin via Tarantool-patches
  2022-07-03 21:38 ` [Tarantool-patches] [PATCH luajit 2/2] sysprof: add stack sandwich support Maxim Kokryashkin via Tarantool-patches
@ 2022-07-18 17:23 ` Igor Munkin via Tarantool-patches
  2022-08-10 14:34 ` Igor Munkin via Tarantool-patches
  3 siblings, 0 replies; 7+ messages in thread
From: Igor Munkin via Tarantool-patches @ 2022-07-18 17:23 UTC (permalink / raw)
  To: Maxim Kokryashkin; +Cc: tarantool-patches

Max,

Thanks for the patch! LGTM, after the fixes you've made[1] against Sergey's
review comments.

On 04.07.22, Maxim Kokryashkin wrote:
> This patch set improves the LuaJIT's symtab resolution, by making
> the `lj_symtab` module read the `.symtab` section instead
> of `.dynsym`.
> 
> Also, it changes the stack merge algorithm in the parser,
> providing support for sandwiched stacks.
> 
> Branch: https://github.com/tarantool/luajit/tree/fckxorg/gh-7244-sysprof-sandwich
> PR: https://github.com/tarantool/tarantool/pull/7342
> 
> Maxim Kokryashkin (2):
>   symtab: fix static symtab dump
>   sysprof: add stack sandwich support
> 
>  src/lj_symtab.c            | 28 ++++++++++++++++++++++++++++
>  tools/sysprof/collapse.lua | 31 +++++++++++++++++++------------
>  2 files changed, 47 insertions(+), 12 deletions(-)
> 
> --
> 2.36.1
> 

[1]: https://github.com/tarantool/luajit/commit/dfa0f97

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 0/2] sysprof: fix inconsistencies
  2022-07-03 21:38 [Tarantool-patches] [PATCH luajit 0/2] sysprof: fix inconsistencies Maxim Kokryashkin via Tarantool-patches
                   ` (2 preceding siblings ...)
  2022-07-18 17:23 ` [Tarantool-patches] [PATCH luajit 0/2] sysprof: fix inconsistencies Igor Munkin via Tarantool-patches
@ 2022-08-10 14:34 ` Igor Munkin via Tarantool-patches
  3 siblings, 0 replies; 7+ messages in thread
From: Igor Munkin via Tarantool-patches @ 2022-08-10 14:34 UTC (permalink / raw)
  To: Maxim Kokryashkin; +Cc: tarantool-patches

Max,

I've checked the patchset into all required long-term branches in
tarantool/luajit and bumped a new version in master and 2.10.

On 04.07.22, Maxim Kokryashkin wrote:
> This patch set improves the LuaJIT's symtab resolution, by making
> the `lj_symtab` module read the `.symtab` section instead
> of `.dynsym`.
> 
> Also, it changes the stack merge algorithm in the parser,
> providing support for sandwiched stacks.
> 
> Branch: https://github.com/tarantool/luajit/tree/fckxorg/gh-7244-sysprof-sandwich
> PR: https://github.com/tarantool/tarantool/pull/7342
> 
> Maxim Kokryashkin (2):
>   symtab: fix static symtab dump
>   sysprof: add stack sandwich support
> 
>  src/lj_symtab.c            | 28 ++++++++++++++++++++++++++++
>  tools/sysprof/collapse.lua | 31 +++++++++++++++++++------------
>  2 files changed, 47 insertions(+), 12 deletions(-)
> 
> --
> 2.36.1
> 

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-08-10 14:44 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-03 21:38 [Tarantool-patches] [PATCH luajit 0/2] sysprof: fix inconsistencies Maxim Kokryashkin via Tarantool-patches
2022-07-03 21:38 ` [Tarantool-patches] [PATCH luajit 1/2] symtab: fix static symtab dump Maxim Kokryashkin via Tarantool-patches
2022-07-16  7:21   ` Sergey Kaplun via Tarantool-patches
2022-07-03 21:38 ` [Tarantool-patches] [PATCH luajit 2/2] sysprof: add stack sandwich support Maxim Kokryashkin via Tarantool-patches
2022-07-16  7:32   ` Sergey Kaplun via Tarantool-patches
2022-07-18 17:23 ` [Tarantool-patches] [PATCH luajit 0/2] sysprof: fix inconsistencies Igor Munkin via Tarantool-patches
2022-08-10 14:34 ` Igor Munkin via Tarantool-patches

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox