Tarantool development patches archive
 help / color / mirror / Atom feed
From: Sergey Kaplun via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Maxim Kokryashkin <max.kokryashkin@gmail.com>,
	Mikhail Shishatskiy <m.shishatskiy@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH luajit 4/7] core: introduce lua and platform profiler
Date: Wed, 29 Sep 2021 09:53:56 +0300
Message-ID: <YVQNhPETDiFWdjHY@root> (raw)
In-Reply-To: <c3d0b3830a8f22048bd2b0309118e426a57be90c.1631122521.git.m.kokryashkin@tarantool.org>

Hi, Maxim, Mikhail!

Thanks for the patch!
Please consider my comments below.

On 08.09.21, Maxim Kokryashkin wrote:
> From: Mikhail Shishatskiy <m.shishatskiy@tarantool.org>
> 
> This patch introduces a sampling platform profiler for
> the Lua machine.
> 
> The profiler uses the signal sampling backend from the
> low-level profiler built in vanilla LuaJIT, which was put
> into a separate module in one of the previous patches.

I suppose we can omit this clarification -- if somebody looks at git log
he'll see it.

> Thus, one cannot use both profilers at the same time.

Why we can define the signal (other that used for LuaJIT profiler) to
avoid conflict?

> 
> First of all profiler dumps the definitions of all
> loaded Lua functions and all loaded shared libraries
> (symtab) via the write buffer introduced in one of
> the previous patches.

Suggest to drop "introduced in one of ..."

> 
> As the profiling signal may occur at any time, we need
> to provide some guarantees for the sampling profiler
> to unwind consistent stack frames.
>                                    So, the VM is adjusted
> to save stack's current top frame into a dedicated
> variable `guesttop` in the virtual machine global state.

It is done in previous patch, so I suppose we shouldn't mention it in
this commit.

> 
> When signal occurs, the profiler dumps current VM state

Typo: s/signal/a signal/

> and additional info about about stacks that can be
> considered consistent in a given state. Also, profiling
> can be done without writing to file: vmstate counters
> are accumulated in static memory and can be received via
> the `luaM_sysprof_report` function. For more details see

Typo: s/details see/details, see/

> the <lmisclib.h> header file.
> 
> When profiling is over, the old signal handler is restored,
> and a special epilogue header is written.
> 
> Part of tarantool/tarantool#781

I suggest to move tests and Lua C API to commit, that introduses Lua API
as well. And in this commit we will only introduce core functionality
without API.


> ---
>  CMakeLists.txt                                |   6 +
>  src/CMakeLists.txt                            |   1 +
>  src/Makefile.dep.original                     |   8 +-
>  src/lj_arch.h                                 |  11 +
>  src/lj_errmsg.h                               |   2 +-
>  src/lj_mapi.c                                 |  26 +
>  src/lj_state.c                                |   7 +
>  src/lj_sysprof.c                              | 483 ++++++++++++++++++
>  src/lj_sysprof.h                              |  94 ++++
>  src/lmisclib.h                                |  93 ++++
>  test/tarantool-tests/CMakeLists.txt           |   1 +
>  .../misclib-sysprof-capi.test.lua             |  53 ++
>  .../misclib-sysprof-capi/CMakeLists.txt       |   1 +
>  .../misclib-sysprof-capi/testsysprof.c        | 269 ++++++++++
>  14 files changed, 1052 insertions(+), 3 deletions(-)
>  create mode 100644 src/lj_sysprof.c
>  create mode 100644 src/lj_sysprof.h
>  create mode 100644 test/tarantool-tests/misclib-sysprof-capi.test.lua
>  create mode 100644 test/tarantool-tests/misclib-sysprof-capi/CMakeLists.txt
>  create mode 100644 test/tarantool-tests/misclib-sysprof-capi/testsysprof.c

Please update <src/Makefile.original> for build with disabled sysprof as well.

> 
> diff --git a/CMakeLists.txt b/CMakeLists.txt
> index 5348e043..0c702093 100644
> --- a/CMakeLists.txt
> +++ b/CMakeLists.txt
> @@ -184,6 +184,12 @@ if(LUAJIT_DISABLE_MEMPROF)
>    AppendFlags(TARGET_C_FLAGS -DLUAJIT_DISABLE_MEMPROF)
>  endif()
>  
> +# Disable platform and lua profiler.
> +option(LUAJIT_DISABLE_SYSPROF "LuaJIT platform and lua profiler support" OFF)

Typo: s/lua/Lua/

> +if(LUAJIT_DISABLE_SYSPROF)
> +  AppendFlags(TARGET_C_FLAGS -DLUAJIT_DISABLE_SYSPROF)
> +endif()
> +
>  # Switch to harder (and slower) hash function when a collision
>  # chain in the string hash table exceeds a certain length.
>  option(LUAJIT_SMART_STRINGS "Harder string hashing function" ON)
> diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
> index 0f3d888f..c19d8b54 100644
> --- a/src/CMakeLists.txt
> +++ b/src/CMakeLists.txt
> @@ -69,6 +69,7 @@ make_source_list(SOURCES_UTILS
>  make_source_list(SOURCES_PROFILER
>    SOURCES
>      lj_memprof.c
> +    lj_sysprof.c
>      lj_profile_timer.c
>      lj_profile.c

Minor: lj_sysprof.c should be the last entry (alphabetically).

>  )
> diff --git a/src/Makefile.dep.original b/src/Makefile.dep.original
> index b7d82719..503817bb 100644
> --- a/src/Makefile.dep.original
> +++ b/src/Makefile.dep.original
> @@ -140,8 +140,8 @@ lj_lib.o: lj_lib.c lauxlib.h lua.h luaconf.h lj_obj.h lj_def.h lj_arch.h \
>  lj_load.o: lj_load.c lua.h luaconf.h lauxlib.h lj_obj.h lj_def.h \
>   lj_arch.h lj_gc.h lj_err.h lj_errmsg.h lj_buf.h lj_str.h lj_func.h \
>   lj_frame.h lj_bc.h lj_vm.h lj_lex.h lj_bcdump.h lj_parse.h
> -lj_mapi.o: lj_mapi.c lua.h luaconf.h lmisclib.h lj_obj.h lj_def.h lj_arch.h \
> - lj_dispatch.h lj_bc.h lj_jit.h lj_ir.h
> +lj_mapi.o: lj_mapi.c lua.h luaconf.h lmisclib.h lj_obj.h lj_def.h \
> + lj_arch.h lj_dispatch.h lj_bc.h lj_jit.h lj_ir.h lj_sysprof.h
>  lj_mcode.o: lj_mcode.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
>   lj_gc.h lj_err.h lj_errmsg.h lj_jit.h lj_ir.h lj_mcode.h lj_trace.h \
>   lj_dispatch.h lj_bc.h lj_traceerr.h lj_vm.h
> @@ -204,6 +204,10 @@ lj_strscan.o: lj_strscan.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
>   lj_char.h lj_strscan.h
>  lj_symtab.o: lj_symtab.c lj_symtab.h lj_wbuf.h lj_def.h lua.h luaconf.h \
>   lj_obj.h lj_arch.h
> +lj_sysprof.o: lj_sysprof.c lj_arch.h lua.h luaconf.h lj_sysprof.h \
> + lj_obj.h lj_def.h lmisclib.h lj_debug.h lj_dispatch.h lj_bc.h lj_jit.h \
> + lj_ir.h lj_frame.h lj_trace.h lj_traceerr.h lj_wbuf.h lj_profile_timer.h \
> + lj_symtab.h
>  lj_tab.o: lj_tab.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h lj_gc.h \
>   lj_err.h lj_errmsg.h lj_tab.h
>  lj_trace.o: lj_trace.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \

Looks like lj_state.o dependencies should be updated too, IINM:

0) We can fix missed dependencies from lj_memprof.h
1) We should add dependency by lj_sysprof.h

> diff --git a/src/lj_arch.h b/src/lj_arch.h
> index 5bf0afb8..f0b60092 100644
> --- a/src/lj_arch.h
> +++ b/src/lj_arch.h

<snipped>

> @@ -585,4 +589,11 @@
>  #define LJ_HASMEMPROF		1
>  #endif
>  
> +/* Disable or enable the platform and lua profiler. */

Typo: s/lua/Lua/

> +#if defined(LUAJIT_DISABLE_SYSPROF) || defined(LJ_ARCH_NOSYSPROF) || LJ_TARGET_WINDOWS || LJ_TARGET_CYGWIN || LJ_TARGET_PS3 || LJ_TARGET_PS4 || LJ_TARGET_XBOX360

Side note: I not sure that all available systems have pthread.h. Is that
so?

> +#define LJ_HASSYSPROF		0
> +#else
> +#define LJ_HASSYSPROF		1
> +#endif
> +
>  #endif
> diff --git a/src/lj_errmsg.h b/src/lj_errmsg.h
> index ae0a18c0..77a08cb0 100644
> --- a/src/lj_errmsg.h
> +++ b/src/lj_errmsg.h

<snipped>

> diff --git a/src/lj_mapi.c b/src/lj_mapi.c
> index b2b35a17..0b2284f0 100644
> --- a/src/lj_mapi.c
> +++ b/src/lj_mapi.c

<snipped>

> diff --git a/src/lj_state.c b/src/lj_state.c
> index f82b1b5b..cc6f92f1 100644
> --- a/src/lj_state.c
> +++ b/src/lj_state.c

<snipped>

> diff --git a/src/lj_sysprof.c b/src/lj_sysprof.c
> new file mode 100644
> index 00000000..c0c83fa9
> --- /dev/null
> +++ b/src/lj_sysprof.c
> @@ -0,0 +1,483 @@

Minor: It would be nice to add some module description like it is done
for <src/lj_memprof.c>.

> +#define lj_sysprof_c
> +#define LUA_CORE
> +
> +#include "lj_arch.h"
> +#include "lj_sysprof.h"
> +
> +#if LJ_HASSYSPROF
> +
> +#include "lj_obj.h"
> +#include "lj_debug.h"
> +#include "lj_dispatch.h"
> +#include "lj_frame.h"
> +
> +#if LJ_HASJIT
> +#include "lj_jit.h"
> +#include "lj_trace.h"
> +#endif
> +
> +#include "lj_wbuf.h"
> +#include "lj_profile_timer.h"
> +#include "lj_symtab.h"
> +
> +#include <pthread.h>
> +#include <errno.h>
> +#include <execinfo.h>

Minor: I supposet that we should check that execinfo is available on the
system and `backtrace` symbol is defined. Not sure that it is true for
Alpine Linux, for example.

> +
> +#define SYSPROF_HANDLER_STACK_DEPTH 4

Minor: please add a comment why it is 4 and not 5 or 3.

> +#define SYSPROF_BACKTRACE_BUF_SIZE 4096

Firstly, I thought that this is the size of the buffer in bytes.
I suggest to rename this macro to SYSPROF_BACKTRACE_FRAME_MAX or some
think like that (mention FRAME is important -- to get that it stands
for the highest possible frame).
Feel free to change it on your own.

Also, I suppose that 4096 values (i.e. 4 Kb) on the stack is too much.
Do you know somebody ready to examine backtrace with at least 1k
entries? Also, if for LuaJIT expected limit *may be* really around 4K
entries with the following code:

| luajit -e 'local function rev (s) return string.gsub(s, "(.)(.+)", function (c,s1) return rev(s1)..c end) end local x = string.rep("0", 1000) rev(x)'

Tarantool's Lua fiber's stack is limited by ~300 entries, IIRC.
(All numbers are empirically established.)
In real life I never see backtraces with more than 50 entries.
So, I suppose that 512 entries are more than enough.
Thoughts?

> +
> +enum sysprof_state {
> +  /* Profiler needs to be configured. */
> +  SPS_UNCONFIGURED,

In my opinion, it is better to have default config that profiler
starts with by default.
Thoughts?

> +  /* Profiler is not running. */
> +  SPS_IDLE,
> +  /* Profiler is running. */
> +  SPS_PROFILE,
> +  /*
> +  ** Stopped in case of stopped or failed stream.
> +  ** Saved errno is set at luaM_sysprof_stop.
> +  */
> +  SPS_HALT
> +};
> +
> +struct sysprof {
> +  global_State *g; /* Profiled VM. */
> +  pthread_t thread; /* Profiled thread. */
> +  enum sysprof_state state; /* Internal state. */

As far as this internal state can be changed inside a signal handler (see
`sysprof_record_sample()`) it should be declared as volatile, to avoid
compile optimization of this code part.
Moreover, it should be used `sig_atomic_t` here to be sure that its
usage is atomic in context of usage inside the signal handler.

> +  struct lj_wbuf out; /* Output accumulator. */
> +  struct luam_sysprof_counters counters; /* Profiling counters. */
> +  struct luam_sysprof_options opt; /* Profiling options. */
> +  struct luam_sysprof_config config; /* Profiler configurations. */
> +  lj_profile_timer timer; /* Profiling timer. */
> +  int saved_errno; /* Saved errno when profiler failed. */
> +};
> +

Minor: I suppose we should add a comment here, that only one VM
(`global_State` in <lj_obj.h> terms) can be profiled at a time.

> +static struct sysprof sysprof = {0};
> +
> +/* --- Stream ------------------------------------------------------------- */
> +
> +static const uint8_t ljp_header[] = {'l', 'j', 'p', LJP_FORMAT_VERSION,
> +                                     0x0, 0x0, 0x0};

Minor: use tabs for alignment here.

> +
> +static int stream_is_needed(struct sysprof *sp)
> +{
> +  return LUAM_SYSPROF_DEFAULT != sp->opt.mode;

Nit: `sp->opt.mode != LUAM_SYSPROF_DEFAULT` is more readable for me.
Feel free to ignore.

> +}
> +
> +static void stream_prologue(struct sysprof *sp)
> +{
> +  lj_symtab_dump(&sp->out, sp->g);
> +  lj_wbuf_addn(&sp->out, ljp_header, sizeof(ljp_header));
> +}
> +
> +static void stream_epilogue(struct sysprof *sp)
> +{
> +  lj_wbuf_addbyte(&sp->out, LJP_EPILOGUE_BYTE);
> +}
> +
> +static void stream_lfunc(struct lj_wbuf *buf, GCfunc *func)

Nit: I suppose, that we can use `const GCfunc` instead to accent, that
this func cann't change here.
Here and below.

> +{

Minor: I suppose, that we should add lua_assert(isluafunc(func)) here.

> +  GCproto *pt = funcproto(func);

Nit: GC proto can be declared as `const` here, isn't it?

> +  lua_assert(pt != NULL);
> +  lj_wbuf_addbyte(buf, LJP_FRAME_LFUNC);
> +  lj_wbuf_addu64(buf, (uintptr_t)pt);
> +  lj_wbuf_addu64(buf, (uint64_t)pt->firstline);
> +}
> +
> +static void stream_cfunc(struct lj_wbuf *buf, GCfunc *func)
> +{

Minor: I suppose, that we should add lua_assert(iscfunc(func)) here.

> +  lj_wbuf_addbyte(buf, LJP_FRAME_CFUNC);
> +  lj_wbuf_addu64(buf, (uintptr_t)func->c.f);
> +}
> +
> +static void stream_ffunc(struct lj_wbuf *buf, GCfunc *func)
> +{

Minor: I suppose, that we should add lua_assert(isffunc(func)) here.

> +  lj_wbuf_addbyte(buf, LJP_FRAME_FFUNC);
> +  lj_wbuf_addu64(buf, func->c.ffid);
> +}
> +
> +static void stream_frame_lua(struct lj_wbuf *buf, cTValue *frame)
> +{
> +  GCfunc *func = frame_func(frame);

Nit: GCfunc may be declared as const here, IINM.

> +  lua_assert(NULL != func);

Side note: why do you use the reversed condition order here? :)

> +  if (isluafunc(func)) {
> +    stream_lfunc(buf, func);
> +  } else if (isffunc(func)) {
> +    stream_ffunc(buf, func);
> +  } else {

Minor: I suppose that we should use also `else if` here with else
lua_assert(0) unreachable branch below.

> +    stream_cfunc(buf, func);
> +  }
> +}
> +
> +static void stream_backtrace_lua(struct sysprof *sp)
> +{
> +  global_State *g = sp->g;
> +  struct lj_wbuf *buf = &sp->out;
> +  cTValue *top_frame = NULL, *frame = NULL, *bot = NULL;
> +  lua_State *L = NULL;

Nit: This initialization looks excess. Just declaration is enough.
Same for top_frame, frame and bot.

> +
> +  lua_assert(g != NULL);
> +  L = gco2th(gcref(g->cur_L));
> +  lua_assert(L != NULL);
> +
> +  top_frame = sp->g->top_frame.guesttop.interp_base - 1;

Looks like this line expects that interp_base always has Lua function on
top. Does it conflict with fast functions and C functions?

Should it be `- (1 + LJ_FR2)` instead?

Nit: s/sp->g/g/

> +  bot = tvref(L->stack) + LJ_FR2;
> +  /* Traverse frames backwards */
> +  for (frame = top_frame; frame > bot; frame = frame_prev(frame)) {
> +    if (frame_gc(frame) == obj2gco(L)) {

Nit: `{` and `}` are excess.

> +      continue;  /* Skip dummy frames. See lj_err_optype_call(). */
> +    }
> +    stream_frame_lua(buf, frame);

Should we skip VARAG pseudo-frames as well?

> +  }
> +
> +  lj_wbuf_addbyte(buf, LJP_FRAME_LUA_LAST);
> +}
> +

Minor: please add comment about return values. I don't get when this
function returns the same address was given, and when NULL. And for what
NULL stands for.

> +static void *stream_frame_host(int frame_no, void *addr)

Nit: also, looks like addr can be declared as a constant, can't it?

> +{
> +  struct sysprof *sp = &sysprof;
> +
> +  if (LJ_UNLIKELY(frame_no <= SYSPROF_HANDLER_STACK_DEPTH)) {
> +    /*
> +    ** We don't want the profiler stack to be streamed, as it will
> +    ** burden the profile with unnecessary information.
> +    */
> +    return addr;
> +  } else if (LJ_UNLIKELY(sp->opt.mode == LUAM_SYSPROF_LEAF &&
> +                         frame_no > SYSPROF_HANDLER_STACK_DEPTH + 1)) {

Looks like it should be `frame_no > SYSPROF_HANDLER_STACK_DEPTH, indeed
If `SYSPROF_HANDLER_STACK_DEPTH + 1 == frame_no` the first branch is
taken.

> +    return NULL;
> +  }

Nit: I suggest to move comment upper and omit curly brackets.
Feel free to ignore.

> +
> +  lj_wbuf_addu64(&sp->out, (uintptr_t)addr);
> +  return addr;
> +}
> +
> +static void default_backtrace_host(void *(writer)(int frame_no, void *addr))
> +{
> +  static void* backtrace_buf[SYSPROF_BACKTRACE_BUF_SIZE] = {};

Typo: s/void* backtrace_buf/void *backtrace_buf/

> +
> +  struct sysprof *sp = &sysprof;
> +  const int depth = backtrace(backtrace_buf,
> +                              sp->opt.mode == LUAM_SYSPROF_LEAF
> +                              ? SYSPROF_HANDLER_STACK_DEPTH + 1
> +                              : SYSPROF_BACKTRACE_BUF_SIZE);
> +  lua_assert(depth >= SYSPROF_HANDLER_STACK_DEPTH);

Nit: strictly saying it is the following:
| depth <= sp->opt.mode == LUAM_SYSPROF_LEAF ?
|          SYSPROF_HANDLER_STACK_DEPTH + 1 : SYSPROF_BACKTRACE_BUF_SIZE

So I suggest to add a variable to use in backtrace configuration and the
assert.

> +
> +  for (int i = SYSPROF_HANDLER_STACK_DEPTH; i < depth; ++i) {
> +    if (!writer(i - SYSPROF_HANDLER_STACK_DEPTH + 1, backtrace_buf[i])) {

Nit: { } brackets are excess.

> +      return;
> +    }
> +  }
> +}
> +
> +static void stream_backtrace_host(struct sysprof *sp)
> +{
> +  lua_assert(sp->config.backtracer != NULL);
> +  sp->config.backtracer(stream_frame_host);
> +  lj_wbuf_addu64(&sp->out, (uintptr_t)LJP_FRAME_HOST_LAST);
> +}
> +
> +static void stream_trace(struct sysprof *sp)
> +{

I suppose we should add the following ifdef here.
| #if LJ_HASJIT

> +  struct lj_wbuf *out = &sp->out;
> +

Nit: this empty line is excess.

> +  uint32_t traceno = sp->g->vmstate;
> +  jit_State *J = G2J(sp->g);
> +
> +  lj_wbuf_addu64(out, traceno);
> +  lj_wbuf_addu64(out, (uintptr_t)J->prev_pt);
> +  lj_wbuf_addu64(out, J->prev_line);

Don't get this line. J->prev_line is not the line of currently executed
trace, IINM. But the line of code that is recorded now. Also, prev_line
is updated only by LuaJIT's profiler in 'l' mode.

Ditto for `J->prev_pt`.

> +}
> +
> +static void stream_guest(struct sysprof *sp)
> +{
> +  stream_backtrace_lua(sp);
> +  stream_backtrace_host(sp);
> +}
> +
> +static void stream_host(struct sysprof *sp)
> +{
> +  stream_backtrace_host(sp);
> +}
> +
> +typedef void (*event_streamer)(struct sysprof *sp);
> +
> +static event_streamer event_streamers[] = {
> +  /* XXX: order is important */
> +  stream_host,  /* LJ_VMST_INTERP */
> +  stream_guest, /* LJ_VMST_LFUNC */
> +  stream_guest, /* LJ_VMST_FFUNC */
> +  stream_guest, /* LJ_VMST_CFUNC */
> +  stream_host,  /* LJ_VMST_GC */
> +  stream_host,  /* LJ_VMST_EXIT */
> +  stream_host,  /* LJ_VMST_RECORD */
> +  stream_host,  /* LJ_VMST_OPT */
> +  stream_host,  /* LJ_VMST_ASM */
> +  stream_trace  /* LJ_VMST_TRACE */
> +};

I don't understand the following point:

Assume I have the following Lua code named <tsysprof.lua>:

| 1  local tsp = require "testsysprof"
| 2  jit.off() jit.flush()
| 3
| 4  function lc(nsec) -- Lua call to C
| 5    local r = 0
| 6    for i = 1, 1e5 do r = r + i end
| 7    return tsp.c(nsec) -- Regular C function does n sleeps to 1 sec
| 8  end
| 9
| 10 local function lclc(nsec) -- Lua call to C call to Lua call to C
| 11   local r = 0
| 12   for i = 1, 1e5 do r = r + i end
| 13   return tsp.clc(nsec) -- C call to Lua call to C
| 14 end
| 15
| 16 local s, r = misc.sysprof.start( {mode = "C", interval = 1, path = "sysprof.bin"})
| 17 assert(s,r)
| 18
| 19 for i = 1, 100 do
| 20   assert(lclc(3) == true) -- 3 sec sleep
| 21 end
| 22
| 23 s,r = misc.sysprof.stop()
| 24 assert(s,r)

I expect to see the following callgraph: Lua->C->Lua->C:

 +----------------------------+
 |    CFUNC <tsp.c addr>      |
 +----------------------------+
 |....CFUNCS related to VM....|
 +----------------------------+
 |    tsysprof.lua:4          |
 +----------------------------+
 |....CFUNCS related to VM....|
 +----------------------------+
 |    CFUNC <tsp.c addr>      |
 +----------------------------+
 |....CFUNCS related to VM....|
 +----------------------------+
 |    tsysprof.lua:10         |
 +----------------------------+---------------------------+
 |    tsysprof.lua:0          |  MB some other code flow  |
 +----------------------------+---------------------------+
 |                      tsysprof.lua:0                    |
 +--------------------------------------------------------+
 |......................CFUNCS till main()................|
 +--------------------------------------------------------+

From the code flow above IINM we report separate information about host
stack and guest stack. Will they be merged together by sysprof-parser?

If not -- then instead excpected callgraph I'll see something like the
following:
                              +---------------------------+
                              |    CFUNC <tsp.c addr>     |
                              +---------------------------+
                              |....CFUNCS related to VM...|
                              +---------------------------+
                              |    tsysprof.lua:4         |
 +----------------------------+---------------------------+
 |....CFUNCS related to VM....|....CFUNCS related to VM...|
 +----------------------------+---------------------------+
 |    tsysprof.lua:10         |    CFUNC <tsp.c addr>     |
 +----------------------------+---------------------------+-------------------------+
 |                        sysprof.lua:0  ttttt            |.MB some other code flow.|
 +--------------------------------------------------------+-------------------------+
 |                      tsysprof.lua:0                                              |
 +----------------------------------------------------------------------------------+
 |......................CFUNCS till main()..........................................|
 +----------------------------------------------------------------------------------+

But, if them merged please add the corresponding comment here and in
RFC, to avoid confusing.

> +
> +static void stream_event(struct sysprof *sp, uint32_t vmstate)
> +{
> +  event_streamer stream = NULL;
> +
> +  /* Check that vmstate fits in 4 bits (see streaming format) */
> +  lua_assert((vmstate & ~(uint32_t)((1 << 4) - 1)) == 0);

Minor: May be it is better to add corresponding define in the
header?

> +  lj_wbuf_addbyte(&sp->out, (uint8_t)vmstate);
> +

Nit: the following empty line is excess.
Feel free to ignor.

> +  stream = event_streamers[vmstate];
> +  lua_assert(NULL != stream);
> +  stream(sp);
> +}
> +
> +/* -- Signal handler ------------------------------------------------------ */
> +
> +static void sysprof_record_sample(struct sysprof *sp, siginfo_t *info)
> +{
> +  global_State *g = sp->g;
> +  uint32_t _vmstate = ~(uint32_t)(g->vmstate);
> +  uint32_t vmstate = _vmstate < LJ_VMST_TRACE ? _vmstate : LJ_VMST_TRACE;
> +

Some comments about the assert below are wellcome.

> +  lua_assert(pthread_self() == sp->thread);
From Programmer's Manual PTHREAD_SELF(3)

| SYNOPSIS
|        #include <pthread.h>
|        pthread_t pthread_self(void);
|        Compile and link with -pthread.

So it looks like we should add corresponding flag in CMake.

> +
> +  /* Caveat: order of counters must match vmstate order in <lj_obj.h>. */
> +  ((uint64_t *)&sp->counters)[vmstate]++;
> +
> +  sp->counters.samples++;
> +  sp->counters.overruns += info->si_overrun;
> +
> +  if (stream_is_needed(sp)) {

Minor: `if (!stream_is_needed(sp)) return;` (early return) looks pretier to me.

> +    stream_event(sp, vmstate);
> +    if (LJ_UNLIKELY(lj_wbuf_test_flag(&sp->out, STREAM_ERRIO|STREAM_STOP))) {
> +      sp->saved_errno = lj_wbuf_errno(&sp->out);
> +      lj_wbuf_terminate(&sp->out);
> +      sp->state = SPS_HALT;
> +    }
> +  }
> +}
> +
> +static void sysprof_signal_handler(int sig, siginfo_t *info, void *ctx)
> +{
> +  struct sysprof *sp = &sysprof;
> +  UNUSED(sig);
> +  UNUSED(ctx);
> +
> +  switch (sp->state) {
> +    case SPS_PROFILE:
> +      sysprof_record_sample(sp, info);
> +      break;
> +
> +    case SPS_IDLE:
> +    case SPS_HALT:
> +      /* noop */
> +      break;
> +
> +    default:
> +      lua_assert(0);
> +      break;
> +  }
> +}
> +
> +/* -- Internal ------------------------------------------------------------ */
> +
> +static int sysprof_validate(struct sysprof *sp,
> +                            const struct luam_sysprof_options *opt)

Side note: See my question about default configuration above.

> +{
> +  switch (sp->state) {
> +    case SPS_UNCONFIGURED:
> +      return SYSPROF_ERRUSE;
> +
> +    case SPS_IDLE: {

Nit: {} looks unnecessary here.

> +      if (opt->mode > LUAM_SYSPROF_CALLGRAPH) {

As far as it is enum it can be < than LUAM_SYSPROF_DEFAULT.

> +        return SYSPROF_ERRUSE;
> +      } else if (opt->mode != LUAM_SYSPROF_DEFAULT &&
> +                 (opt->buf == NULL || opt->len == 0 ||
> +                  sp->config.writer == NULL || sp->config.on_stop == NULL)) {

If they all are set by public API may be it is better to use
lua_apicheck() instead?

> +        return SYSPROF_ERRUSE;
> +      } else if (opt->interval == 0) {
> +        return SYSPROF_ERRUSE;
> +      }
> +      break;
> +    }
> +
> +    case SPS_PROFILE:
> +    case SPS_HALT:
> +      return SYSPROF_ERRRUN;

And the similar check is missed for memprof start (ERRRUN return only if
the state is MPS_PROFILE).

I suppose we should forbid to start profiling if the profiler is in HALT
state -- it means that the profiling with error to report still running.
Wait Igor's opinion here.
If yes, it should be fixed in memprof in the separate patchset.

> +
> +    default:
> +      lua_assert(0);

Nit: `break;` is missing.

From The C programming language - Second edition (K&R 2):

Chapter 3.4 Switch

| As a matter of good form, put a break after the last case (the default
| here) even though it's logically unnecessary. Some day when another case
| gets added at the end, this bit of defensive programming will save you.

> +  }
> +
> +  return SYSPROF_SUCCESS;
> +}
> +
> +static int sysprof_init(struct sysprof *sp, lua_State *L,
> +                        const struct luam_sysprof_options *opt)
> +{
> +  int status = sysprof_validate(sp, opt);

Nit: status may be declared as const.

> +  if (SYSPROF_SUCCESS != status) {

Nit: {} are excess.

> +    return status;
> +  }
> +
> +  /* Copy validated options to sysprof state. */
> +  memcpy(&sp->opt, opt, sizeof(sp->opt));
> +
> +  /* Init general fields. */
> +  sp->g = G(L);
> +  sp->thread = pthread_self();
> +
> +  /* Reset counters. */
> +  memset(&sp->counters, 0, sizeof(sp->counters));
> +
> +  /* Reset saved errno. */
> +  sp->saved_errno = 0;
> +
> +  if (stream_is_needed(sp)) {
> +    lj_wbuf_init(&sp->out, sp->config.writer, opt->ctx, opt->buf, opt->len);
> +  }

Nit: {} are excess.

> +
> +  return SYSPROF_SUCCESS;
> +}
> +
> +/* -- Public profiling API ------------------------------------------------ */
> +
> +int lj_sysprof_configure(const struct luam_sysprof_config *config)
> +{
> +  struct sysprof *sp = &sysprof;
> +  lua_assert(config != NULL);
> +  if (sp->state != SPS_UNCONFIGURED && sp->state != SPS_IDLE) {

Nit: {} are excess.

> +    return SYSPROF_ERRUSE;
> +  }
> +
> +  memcpy(&sp->config, config, sizeof(*config));
> +
> +  if (sp->config.backtracer == NULL) {

Ditto.

> +    sp->config.backtracer = default_backtrace_host;
> +  }
> +
> +  sp->state = SPS_IDLE;
> +
> +  return SYSPROF_SUCCESS;
> +}
> +
> +int lj_sysprof_start(lua_State *L, const struct luam_sysprof_options *opt)
> +{
> +  struct sysprof *sp = &sysprof;
> +
> +  int status = sysprof_init(sp, L, opt);
> +  if (SYSPROF_SUCCESS != status) {
> +    if (NULL != sp->config.on_stop) {

I suppose as far as this situation is only possible by the user's
mistake during code development it will be better to use lua_assert() or
lua_apicheck() here.
Thoughts?

> +      /*
> +      ** Initialization may fail in case of unconfigured sysprof,
> +      ** so we cannot guarantee cleaning up resources in this case.
> +      */
> +      sp->config.on_stop(opt->ctx, opt->buf);
> +    }
> +    return status;
> +  }
> +
> +  sp->state = SPS_PROFILE;
> +
> +  if (stream_is_needed(sp)) {
> +    stream_prologue(sp);
> +    if (LJ_UNLIKELY(lj_wbuf_test_flag(&sp->out, STREAM_ERRIO|STREAM_STOP))) {
> +      /* on_stop call may change errno value. */
> +      int saved_errno = lj_wbuf_errno(&sp->out);

Nit: saved_errno may be declared as const.

> +      /* Ignore possible errors. mp->out.buf may be NULL here. */
> +      sp->config.on_stop(opt->ctx, sp->out.buf);
> +      lj_wbuf_terminate(&sp->out);
> +      sp->state = SPS_IDLE;
> +      errno = saved_errno;
> +      return SYSPROF_ERRIO;
> +    }
> +  }
> +
> +  sp->timer.opt.interval_msec = opt->interval;
> +  sp->timer.opt.handler = sysprof_signal_handler;
> +  lj_profile_timer_start(&sp->timer);
> +
> +  return SYSPROF_SUCCESS;
> +}
> +
> +int lj_sysprof_stop(lua_State *L)
> +{
> +  struct sysprof *sp = &sysprof;
> +  global_State *g = sp->g;
> +  struct lj_wbuf *out = &sp->out;
> +
> +  if (SPS_IDLE == sp->state) {
> +    return SYSPROF_ERRSTOP;

Why it is bad to have SPS_IDLE here (why UNCONFIGURED is OK)?
We can use PROFILE_ERRUSE as for this error for consistency with
memprof.

> +  } else if (G(L) != g) {
> +    return SYSPROF_ERRUSE;
> +  }

Nit: {} are excess.

> +
> +  lj_profile_timer_stop(&sp->timer);
> +
> +  if (SPS_HALT == sp->state) {
> +    errno = sp->saved_errno;
> +    sp->state = SPS_IDLE;
> +    /* wbuf was terminated when error occured. */
> +    return SYSPROF_ERRIO;
> +  }
> +
> +  sp->state = SPS_IDLE;
> +
> +  if (stream_is_needed(sp)) {
> +    int cb_status = 0;

Nit: This initialization looks excess.

> +
> +    stream_epilogue(sp);
> +    lj_wbuf_flush(out);
> +
> +    cb_status = sp->config.on_stop(sp->opt.ctx, out->buf);
> +    if (LJ_UNLIKELY(lj_wbuf_test_flag(out, STREAM_ERRIO|STREAM_STOP)) ||

Typo: s/ERRIO|STREAM/ERRIO | STREAM/

> +        0 != cb_status) {

Side note: why do you use here 0 != cb_status, but not cb_status != 0?

> +      errno = lj_wbuf_errno(out);
> +      lj_wbuf_terminate(out);
> +      return SYSPROF_ERRIO;
> +    }
> +
> +    lj_wbuf_terminate(out);
> +  }
> +
> +  return SYSPROF_SUCCESS;
> +}
> +
> +int lj_sysprof_report(struct luam_sysprof_counters *counters)
> +{
> +  const struct sysprof *sp = &sysprof;
> +  if (sp->state != SPS_IDLE) {

Nit: {} are excess.

> +    return SYSPROF_ERRUSE;
> +  }
> +  memcpy(counters, &sp->counters, sizeof(sp->counters));
> +  return SYSPROF_SUCCESS;
> +}
> +
> +#else /* LJ_HASSYSPROF */

<snipped>

> +#endif /* LJ_HASSYSPROF */
> +

Nit: this empty line is excess.

> diff --git a/src/lj_sysprof.h b/src/lj_sysprof.h
> new file mode 100644
> index 00000000..515bc08c
> --- /dev/null
> +++ b/src/lj_sysprof.h
> @@ -0,0 +1,94 @@
> +/*
> +** Sysprof - platform and Lua profiler

Typo: s/profiler/profiler/.

> +*/
> +
> +/*
> +** XXX: Platform profiler is not thread safe. Please, don't try to
> +** use it inside several VM, you can profile only one at a time.
> +*/
> +
> +/*
> +** XXX: Platform profiler uses the same signal backend as lj_profile. Please,
> +** don't use both at the same time.

So, may be it is better to use another one signal for sysprof
profilling?

> +*/
> +
> +#ifndef _LJ_SYSPROF_H
> +#define _LJ_SYSPROF_H
> +
> +#include "lj_obj.h"
> +#include "lmisclib.h"
> +
> +#define LJP_FORMAT_VERSION 0x1
> +
> +/*
> +** Event stream format:

<snipped>

> +*/
> +
> +#define LJP_FRAME_LFUNC ((uint8_t)1)
> +#define LJP_FRAME_CFUNC ((uint8_t)2)
> +#define LJP_FRAME_FFUNC ((uint8_t)3)
> +#define LJP_FRAME_LUA_LAST  0x80
> +#define LJP_FRAME_HOST_LAST NULL
> +
> +#define LJP_EPILOGUE_BYTE 0x80
> +
> +int lj_sysprof_configure(const struct luam_sysprof_config *config);
> +
> +int lj_sysprof_start(lua_State *L, const struct luam_sysprof_options *opt);
> +
> +int lj_sysprof_stop(lua_State *L);
> +
> +int lj_sysprof_report(struct luam_sysprof_counters *counters);
> +
> +#endif
> +

Nit: this empty line is excess.

> diff --git a/src/lmisclib.h b/src/lmisclib.h
> index 0c07707e..3545ff47 100644
> --- a/src/lmisclib.h
> +++ b/src/lmisclib.h
> @@ -60,6 +60,99 @@ struct luam_Metrics {
>  
>  LUAMISC_API void luaM_metrics(lua_State *L, struct luam_Metrics *metrics);
>  
> +/* --- Sysprof - platform and lua profiler -------------------------------- */
> +
> +/* Profiler configurations */

Typo: s/configurations/configurations./

> +struct luam_sysprof_config {

Minor: luam_Sysprof_Config is more consistent. Perfect variant is the
luam_<one word with the first letter in uppercase>, but I don't know how
to name it.

> +  /*
> +  ** Writer function for profile events.
> +  ** Should return amount of written bytes on success or zero in case of error.
> +  ** Setting *data to NULL means end of profiling.
> +  ** For details see <lj_wbuf.h>.

Please add also, that this writer *must be* asinc-safe function (see also
man 7 signal-safety).

> +  */
> +  size_t (*writer)(const void **data, size_t len, void *ctx);
> +  /*
> +  ** Callback on profiler stopping. Required for correctly cleaning
> +  ** at VM finalization when profiler is still running.
> +  ** Returns zero on success.
> +  */
> +  int (*on_stop)(void *ctx, uint8_t *buf);
> +  /*
> +  ** Backtracing function for the host stack. Should call `frame_writer` on
> +  ** each frame in the stack in the order from the stack top to the stack
> +  ** bottom. The `frame_writer` function is implemented inside the sysprof
> +  ** and will be passed to the `backtracer` function. If `frame_writer` returns
> +  ** NULL, backtracing should be stopped. If `frame_writer` returns not NULL,
> +  ** the backtracing should be continued if there are frames left.
> +  */
> +  void (*backtracer)(void *(*frame_writer)(int frame_no, void *addr));
> +};
> +
> +enum luam_sysprof_mode {

Nit: I don't think that we need to provide the enum here, just macro
defines is enough.
Feel free to ignore, if not:

Minor: Ditto for naming as for luam_sysprof_config.

> +  /*
> +  ** DEFAULT mode collects only data for luam_sysprof_counters, which is stored
> +  ** in memory and can be collected with luaM_sysprof_report after profiler
> +  ** stops.
> +  */
> +  LUAM_SYSPROF_DEFAULT,
> +  /*
> +  ** LEAF mode = DEFAULT + streams samples with only top frames of host and
> +  ** guests stacks in format described in <lj_sysprof.h>
> +  */
> +  LUAM_SYSPROF_LEAF,
> +  /*
> +  ** CALLGRAPH mode = DEFAULT + streams samples with full callchains of host
> +  ** and guest stacks in format described in <lj_sysprof.h>
> +  */
> +  LUAM_SYSPROF_CALLGRAPH
> +};
> +
> +struct luam_sysprof_counters {

Minor: Ditto for naming as for luam_sysprof_config.

> +  uint64_t vmst_interp;
> +  uint64_t vmst_lfunc;
> +  uint64_t vmst_ffunc;
> +  uint64_t vmst_cfunc;
> +  uint64_t vmst_gc;
> +  uint64_t vmst_exit;
> +  uint64_t vmst_record;
> +  uint64_t vmst_opt;
> +  uint64_t vmst_asm;
> +  uint64_t vmst_trace;
> +  /* XXX: order of vmst counters is important */

Minor: please add clarification that it must be the same as the order of
the vmstates. Or even better: add LUAJIT_STATIC_ASSERT.

Typo: s/important/important./

> +  uint64_t samples;
> +  uint64_t overruns;
> +};
> +
> +/* Profiler options */

Typo: s/options/options./

> +struct luam_sysprof_options {

Minor: Ditto for naming as for luam_sysprof_config.

> +  /* Profiling mode */

Typo: s/mode/mode./

> +  enum luam_sysprof_mode mode;
> +  /* Sampling interval in msec */

Typo: s/msec/msec./

> +  uint64_t interval;
> +  /* Custom buffer to write data. */
> +  uint8_t *buf;
> +  /* The buffer's size. */
> +  size_t len;
> +  /* Context for the profile writer and final callback. */
> +  void *ctx;
> +};
> +
> +#define SYSPROF_SUCCESS (0)
> +#define SYSPROF_ERRUSE  (1)
> +#define SYSPROF_ERRRUN  (2)
> +#define SYSPROF_ERRSTOP (3)
> +#define SYSPROF_ERRIO   (4)

There is no need in () here. Also why we can't move PROFILER status from
<lj_memprof.h> here instead?

> +
> +LUAMISC_API int luaM_sysprof_configure(const struct luam_sysprof_config *config);
> +
> +LUAMISC_API int luaM_sysprof_start(lua_State *L,
> +                                   const struct luam_sysprof_options *opt);
> +
> +LUAMISC_API int luaM_sysprof_stop(lua_State *L);
> +
> +LUAMISC_API int luaM_sysprof_report(struct luam_sysprof_counters *counters);
> +
> +
>  #define LUAM_MISCLIBNAME "misc"
>  LUALIB_API int luaopen_misc(lua_State *L);
>  
> diff --git a/test/tarantool-tests/CMakeLists.txt b/test/tarantool-tests/CMakeLists.txt
> index a872fa5e..cf47cc67 100644
> --- a/test/tarantool-tests/CMakeLists.txt
> +++ b/test/tarantool-tests/CMakeLists.txt
> @@ -62,6 +62,7 @@ add_subdirectory(gh-6189-cur_L)

<snipped>

> diff --git a/test/tarantool-tests/misclib-sysprof-capi.test.lua b/test/tarantool-tests/misclib-sysprof-capi.test.lua
> new file mode 100644
> index 00000000..d468572d
> --- /dev/null
> +++ b/test/tarantool-tests/misclib-sysprof-capi.test.lua
> @@ -0,0 +1,53 @@
> +-- Sysprof is implemented for x86 and x64 architectures only.
> +require("utils").skipcond(
> +  jit.arch ~= "x86" and jit.arch ~= "x64",
> +  jit.arch.." architecture is NIY for memprof"

Typo: s/memprof/sysprof/

> +)
> +
> +local testsysprof = require("testsysprof")
> +
> +local tap = require("tap")
> +
> +local test = tap.test("clib-misc-sysprof")
> +test:plan(4)
> +
> +test:ok(testsysprof.base())
> +test:ok(testsysprof.validation())
> +
> +local function lua_payload(n)
> +  if n <= 1 then
> +    return n
> +  end
> +  return lua_payload(n - 1) + lua_payload(n - 2)
> +end
> +
> +local function payload()
> +  local n_iterations = 500000
> +
> +  local co = coroutine.create(function ()
> +    for i = 1, n_iterations do
> +      if i % 2 == 0 then
> +        testsysprof.c_payload(10)
> +      else
> +        lua_payload(10)
> +      end
> +      coroutine.yield()
> +    end
> +  end)
> +
> +  for _ = 1, n_iterations do
> +    coroutine.resume(co)
> +  end
> +end
> +
> +local jit = require('jit')
> +
> +jit.off()
> +
> +test:ok(testsysprof.profile_func(payload))
> +
> +jit.on()
> +jit.flush()
> +
> +test:ok(testsysprof.profile_func(payload))
> +

Minor: please, add the following line to unify test with the others:
| os.exit(test:check() and 0 or 1)

> diff --git a/test/tarantool-tests/misclib-sysprof-capi/CMakeLists.txt b/test/tarantool-tests/misclib-sysprof-capi/CMakeLists.txt
> new file mode 100644
> index 00000000..d9fb1a1a
> --- /dev/null
> +++ b/test/tarantool-tests/misclib-sysprof-capi/CMakeLists.txt
> @@ -0,0 +1 @@
> +BuildTestCLib(testsysprof testsysprof.c)
> diff --git a/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c b/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c
> new file mode 100644
> index 00000000..46970a72
> --- /dev/null
> +++ b/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c
> @@ -0,0 +1,269 @@

I see the following warnings during compilation:

| /home/burii/reviews/luajit/sysprof/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c: In function 'on_stop_cb_default':
| /home/burii/reviews/luajit/sysprof/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c:74:51: warning: unused parameter 'buf' [-Wunused-parameter]
|    74 | static int on_stop_cb_default(void *opt, uint8_t *buf)
|       |                                          ~~~~~~~~~^~~
| /home/burii/reviews/luajit/sysprof/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c: In function 'c_payload':
| /home/burii/reviews/luajit/sysprof/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c:115:1: warning: no return statement in function returning non-void [-Wreturn-type]
|   115 | }
|       | ^
| /home/burii/reviews/luajit/sysprof/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c: In function 'base':
| /home/burii/reviews/luajit/sysprof/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c:126:3: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
|   126 |   struct luam_sysprof_options opt = {};
|       |   ^~~~~~
| /home/burii/reviews/luajit/sysprof/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c:133:3: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
|   133 |   struct luam_sysprof_counters cnt = {};
|       |   ^~~~~~

Also test fails with the following coredump in backtrace:

| #0  0x000055c326294d8d in lj_BC_RET () at buildvm_x86.dasc:811
| #1  0x000055c326296138 in lj_ff_coroutine_resume () at buildvm_x86.dasc:1688
| #2  0x000055c3262271da in lua_pcall (L=0x4197d378, nargs=0, nresults=0, errfunc=0) at /home/burii/reviews/luajit/sysprof/src/lj_api.c:1187
| #3  0x00007f5639a8c9b0 in profile_func (L=0x4197d378) at /home/burii/reviews/luajit/sysprof/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c:231
| #4  0x000055c326295313 in lj_BC_FUNCC () at buildvm_x86.dasc:811
| #5  0x000055c3262271da in lua_pcall (L=0x4197d378, nargs=0, nresults=-1, errfunc=2) at /home/burii/reviews/luajit/sysprof/src/lj_api.c:1187
| #6  0x000055c326220a9e in docall (L=0x4197d378, narg=0, clear=0) at /home/burii/reviews/luajit/sysprof/src/luajit.c:121
| #7  0x000055c32622146a in handle_script (L=0x4197d378, argx=0x7fffcbbb3010) at /home/burii/reviews/luajit/sysprof/src/luajit.c:291
| #8  0x000055c3262220c3 in pmain (L=0x4197d378) at /home/burii/reviews/luajit/sysprof/src/luajit.c:551
| #9  0x000055c326295313 in lj_BC_FUNCC () at buildvm_x86.dasc:811
| #10 0x000055c3262274f9 in lua_cpcall (L=0x4197d378, func=0x55c326221ee2 <pmain>, ud=0x0) at /home/burii/reviews/luajit/sysprof/src/lj_api.c:1221
| #11 0x000055c3262221d5 in main (argc=4, argv=0x7fffcbbb2ff8) at /home/burii/reviews/luajit/sysprof/src/luajit.c:580

> +#include <lua.h>
> +#include <luajit.h>
> +#include <lauxlib.h>
> +
> +#include <lmisclib.h>
> +
> +#include <stdlib.h>
> +#include <errno.h>
> +
> +#undef NDEBUG
> +#include <assert.h>
> +
> +
> +/* --- utils -------------------------------------------------------------- */
> +
> +#define SYSPROF_INTERVAL_DEFAULT 11
> +
> +/*
> +** Yep, 8Mb. Tuned in order not to bother the platform with too often flushes.
> +*/
> +#define STREAM_BUFFER_SIZE (8 * 1024 * 1024)

Minor: I suppose that we can take a smaller size for test.
Feel free to ignore.

> +
> +/* Structure given as ctx to memprof writer and on_stop callback. */
> +struct sysprof_ctx {

Minor: For our test files we use tabs (as in Tarantool) instead
"half-quoter tabs" (yes I know, that it is inconsistent).
Here and below.

> +  /* Output file stream for data. */
> +  FILE *stream;
> +  /* Buffer for data. */
> +  uint8_t buf[STREAM_BUFFER_SIZE];
> +};
> +
> +/*
> +** Default buffer writer function.
> +** Just call fwrite to the corresponding FILE.

This writer function is not async-signal-safe. We should use just write
here instead.

> +*/
> +static size_t buffer_writer_default(const void **buf_addr, size_t len,
> +                                    void *opt)
> +{
> +  struct sysprof_ctx *ctx = opt;
> +  FILE *stream = ctx->stream;
> +  const void * const buf_start = *buf_addr;
> +  const void *data = *buf_addr;
> +  size_t write_total = 0;
> +
> +  assert(len <= STREAM_BUFFER_SIZE);
> +
> +  for (;;) {
> +    const size_t written = fwrite(data, 1, len - write_total, stream);
> +
> +    if (written == 0) {
> +      /* Re-tries write in case of EINTR. */
> +      if (errno != EINTR) {
> +        /* Will be freed as whole chunk later. */
> +        *buf_addr = NULL;
> +        return write_total;
> +      }
> +      errno = 0;
> +      continue;
> +    }
> +
> +    write_total += written;
> +    assert(write_total <= len);
> +
> +    if (write_total == len)
> +      break;
> +
> +    data = (uint8_t *)data + (ptrdiff_t)written;
> +  }
> +
> +  *buf_addr = buf_start;
> +  return write_total;
> +}
> +
> +/* Default on stop callback. Just close the corresponding stream. */
> +static int on_stop_cb_default(void *opt, uint8_t *buf)
> +{
> +  struct sysprof_ctx *ctx = opt;
> +  FILE *stream = ctx->stream;
> +  free(ctx);
> +  return fclose(stream);
> +}
> +
> +static int stream_init(struct luam_sysprof_options *opt)
> +{
> +  struct sysprof_ctx *ctx = calloc(1, sizeof(struct sysprof_ctx));
> +  if (NULL == ctx) {

Nit: {} are excess.

> +    return SYSPROF_ERRIO;
> +  }
> +
> +  ctx->stream = fopen("/dev/null", "wb");
> +  if (NULL == ctx->stream) {
> +    free(ctx);
> +    return SYSPROF_ERRIO;
> +  }
> +
> +  opt->ctx = ctx;
> +  opt->buf = ctx->buf;
> +  opt->len = STREAM_BUFFER_SIZE;
> +
> +  return SYSPROF_SUCCESS;
> +}
> +
> +/* --- Payload ------------------------------------------------------------ */
> +
> +static double fib(double n)
> +{
> +  if (n <= 1) {
> +    return n;
> +  }

Nit: {} are excess.

> +  return fib(n - 1) + fib(n - 2);
> +}
> +
> +static int c_payload(lua_State *L)
> +{
> +  fib(luaL_checknumber(L, 1));

How many values does this function return?

> +}
> +
> +/* --- sysprof C API tests ------------------------------------------------ */
> +
> +static int base(lua_State *L)
> +{
> +  struct luam_sysprof_config config = {};
> +  (void)config.writer;
> +  (void)config.on_stop;
> +  (void)config.backtracer;
> +
> +  struct luam_sysprof_options opt = {};
> +  (void)opt.interval;
> +  (void)opt.mode;
> +  (void)opt.ctx;
> +  (void)opt.buf;
> +  (void)opt.len;
> +
> +  struct luam_sysprof_counters cnt = {};
> +  luaM_sysprof_report(&cnt);
> +
> +  (void)cnt.samples;
> +  (void)cnt.overruns;
> +  (void)cnt.vmst_interp;
> +  (void)cnt.vmst_lfunc;
> +  (void)cnt.vmst_ffunc;
> +  (void)cnt.vmst_cfunc;
> +  (void)cnt.vmst_gc;
> +  (void)cnt.vmst_exit;
> +  (void)cnt.vmst_record;
> +  (void)cnt.vmst_opt;
> +  (void)cnt.vmst_asm;
> +  (void)cnt.vmst_trace;
> +
> +  lua_pushboolean(L, 1);
> +  return 1;
> +}
> +
> +static int validation(lua_State *L)
> +{
> +  struct luam_sysprof_config config = {};
> +  struct luam_sysprof_options opt = {};
> +  int status = SYSPROF_SUCCESS;
> +
> +  status = luaM_sysprof_configure(&config);
> +  assert(SYSPROF_SUCCESS == status);
> +
> +  /* Unknown mode */

Typo: s/mode/mode./

> +  opt.mode = 0x40;
> +  status = luaM_sysprof_start(L, &opt);
> +  assert(SYSPROF_ERRUSE == status);
> +
> +  /* Buffer not configured */

Typo: s/configured/configured./

> +  opt.mode = LUAM_SYSPROF_CALLGRAPH;
> +  opt.buf = NULL;
> +  status = luaM_sysprof_start(L, &opt);
> +  assert(SYSPROF_ERRUSE == status);
> +
> +  /* Bad interval */

Typo: s/interval/interval./

> +  opt.mode = LUAM_SYSPROF_DEFAULT;
> +  opt.interval = 0;
> +  status = luaM_sysprof_start(L, &opt);
> +  assert(SYSPROF_ERRUSE == status);
> +
> +  /* Check if profiling started */

Typo: s/started/started./

> +  opt.mode = LUAM_SYSPROF_DEFAULT;
> +  opt.interval = SYSPROF_INTERVAL_DEFAULT;
> +  status = luaM_sysprof_start(L, &opt);
> +  assert(SYSPROF_SUCCESS == status);
> +
> +  /* Already running */

Typo: s/running/running./

> +  status = luaM_sysprof_start(L, &opt);
> +  assert(SYSPROF_ERRRUN == status);
> +
> +  /* Profiler stopping */

Typo: s/stopping/stopping./

> +  status = luaM_sysprof_stop(L);
> +  assert(SYSPROF_SUCCESS == status);
> +
> +  /* Stopping profiler which is not running */

Typo: s/running/running./

> +  status = luaM_sysprof_stop(L);
> +  assert(SYSPROF_ERRSTOP == status);

Suggest to use ERRRUN here, see comment about consistensy with memprof
above.

> +
> +  lua_pushboolean(L, 1);
> +  return 1;
> +}
> +
> +static int profile_func(lua_State *L)
> +{
> +  struct luam_sysprof_config config = {};
> +  struct luam_sysprof_options opt = {};
> +  struct luam_sysprof_counters cnt = {};
> +  int status = SYSPROF_ERRUSE;
> +
> +  int n = lua_gettop(L);
> +  if (n != 1 || !lua_isfunction(L, 1)) {

Nit: {} are excess.

> +    luaL_error(L, "incorrect argument: 1 function is required");
> +  }
> +
> +  /*
> +  ** Since all the other modes functionality is the
> +  ** subset of CALLGRAPH mode, run this mode to test
> +  ** the profiler's behavior.
> +  */
> +  opt.mode = LUAM_SYSPROF_CALLGRAPH;
> +  opt.interval = SYSPROF_INTERVAL_DEFAULT;
> +  stream_init(&opt);
> +
> +  config.on_stop = on_stop_cb_default;
> +  config.writer = buffer_writer_default;
> +  status = luaM_sysprof_configure(&config);
> +  assert(SYSPROF_SUCCESS == status);
> +
> +  status = luaM_sysprof_start(L, &opt);
> +  assert(SYSPROF_SUCCESS == status);
> +
> +  /* Run payload. */
> +  if (lua_pcall(L, 0, 0, 0) != 0) {

Minor: it is better to use != LUA_OK
Nit: {} are excess.

> +    luaL_error(L, "error running payload: %s", lua_tostring(L, -1));
> +  }
> +
> +  status = luaM_sysprof_stop(L);
> +  assert(SYSPROF_SUCCESS == status);
> +
> +  status = luaM_sysprof_report(&cnt);
> +  assert(SYSPROF_SUCCESS == status);
> +
> +  assert(cnt.samples > 1);
> +  assert(cnt.samples == cnt.vmst_asm +
> +                        cnt.vmst_cfunc +
> +                        cnt.vmst_exit +
> +                        cnt.vmst_ffunc +
> +                        cnt.vmst_gc +
> +                        cnt.vmst_interp +
> +                        cnt.vmst_lfunc +
> +                        cnt.vmst_opt +
> +                        cnt.vmst_record +
> +                        cnt.vmst_trace);
> +
> +  lua_pushboolean(L, 1);
> +  return 1;
> +}
> +
> +static const struct luaL_Reg testsysprof[] = {
> +  {"c_payload", c_payload},
> +  {"base", base},
> +  {"validation", validation},
> +  {"profile_func", profile_func},

Nit: Please sort entries alphabetically.

> +  {NULL, NULL}
> +};
> +
> +LUA_API int luaopen_testsysprof(lua_State *L)
> +{
> +  luaL_register(L, "testsysprof", testsysprof);
> +  return 1;
> +}
> -- 
> 2.33.0
> 

-- 
Best regards,
Sergey Kaplun

  reply	other threads:[~2021-09-29  6:55 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-08 17:50 [Tarantool-patches] [PATCH luajit 0/7] misc: introduce sysprof Maxim Kokryashkin via Tarantool-patches
2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 1/7] core: save current frame top to lj_obj Maxim Kokryashkin via Tarantool-patches
2021-09-20 17:21   ` Sergey Kaplun via Tarantool-patches
2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 2/7] core: separate the profiling timer from lj_profile Maxim Kokryashkin via Tarantool-patches
2021-09-21 11:13   ` Sergey Kaplun via Tarantool-patches
2021-09-23 11:37     ` Mikhail Shishatskiy via Tarantool-patches
2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 3/7] memprof: move symtab to a separate module Maxim Kokryashkin via Tarantool-patches
2021-09-22  7:51   ` Sergey Kaplun via Tarantool-patches
2021-09-22  8:14     ` Sergey Kaplun via Tarantool-patches
2021-09-23 14:51   ` [Tarantool-patches] [PATCH luajit v2] " Maxim Kokryashkin via Tarantool-patches
2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 4/7] core: introduce lua and platform profiler Maxim Kokryashkin via Tarantool-patches
2021-09-29  6:53   ` Sergey Kaplun via Tarantool-patches [this message]
2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 5/7] memprof: add profile common section Maxim Kokryashkin via Tarantool-patches
2021-10-05 10:48   ` Sergey Kaplun via Tarantool-patches
2021-10-06 19:15     ` [Tarantool-patches] [PATCH luajit v2] " Maxim Kokryashkin via Tarantool-patches
2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 6/7] sysprof: introduce Lua API Maxim Kokryashkin via Tarantool-patches
2021-10-05 15:36   ` Sergey Kaplun via Tarantool-patches
2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 7/7] tools: introduce parsers for sysprof Maxim Kokryashkin via Tarantool-patches
2021-10-07 11:28   ` Sergey Kaplun via Tarantool-patches

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YVQNhPETDiFWdjHY@root \
    --to=tarantool-patches@dev.tarantool.org \
    --cc=m.shishatskiy@tarantool.org \
    --cc=max.kokryashkin@gmail.com \
    --cc=skaplun@tarantool.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Tarantool development patches archive

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://lists.tarantool.org/tarantool-patches/0 tarantool-patches/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 tarantool-patches tarantool-patches/ https://lists.tarantool.org/tarantool-patches \
		tarantool-patches@dev.tarantool.org.
	public-inbox-index tarantool-patches

Example config snippet for mirrors.


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git