Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH luajit 0/7] misc: introduce sysprof
@ 2021-09-08 17:50 Maxim Kokryashkin via Tarantool-patches
  2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 1/7] core: save current frame top to lj_obj Maxim Kokryashkin via Tarantool-patches
                   ` (6 more replies)
  0 siblings, 7 replies; 19+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2021-09-08 17:50 UTC (permalink / raw)
  To: tarantool-patches, imun, skaplun

Currently, available options for profiling LuaJIT are not fine enough
to get an understanding of performance. For example, perf only able to
show host stack, so all the Lua calls are seen as single pcall.
Oppositely, jit.p module provided with LuaJIT is not able to give any
information about the host stack.

To get a detailed perspective of platform performance, a more advanced
profiler is needed. Sysprof is able to capture both guest and host stacks
simultaneously, along with virtual machine states.

Maxim Kokryashkin (4):
  memprof: move symtab to a separate module
  memprof: add profile common section
  sysprof: introduce Lua API
  tools: introduce parsers for sysprof

Mikhail Shishatskiy (3):
  core: save current frame top to lj_obj
  core: separate the profiling timer from lj_profile
  core: introduce lua and platform profiler

 .gitignore                                    |   1 +
 CMakeLists.txt                                |   6 +
 src/CMakeLists.txt                            |   3 +
 src/Makefile.dep.original                     |  17 +-
 src/lib_misc.c                                | 239 ++++++++-
 src/lj_arch.h                                 |  11 +
 src/lj_errmsg.h                               |   2 +-
 src/lj_mapi.c                                 |  26 +
 src/lj_memprof.c                              |  37 +-
 src/lj_memprof.h                              |  36 --
 src/lj_obj.h                                  |  12 +
 src/lj_profile.c                              | 178 +------
 src/lj_profile_timer.c                        | 126 +++++
 src/lj_profile_timer.h                        |  83 +++
 src/lj_state.c                                |   7 +
 src/lj_symtab.c                               |  36 ++
 src/lj_symtab.h                               |  46 ++
 src/lj_sysprof.c                              | 483 ++++++++++++++++++
 src/lj_sysprof.h                              |  94 ++++
 src/lmisclib.h                                |  93 ++++
 src/vm_x64.dasc                               |  52 +-
 src/vm_x86.dasc                               |  52 +-
 test/tarantool-tests/CMakeLists.txt           |   1 +
 .../misclib-sysprof-capi.test.lua             |  53 ++
 .../misclib-sysprof-capi/CMakeLists.txt       |   1 +
 .../misclib-sysprof-capi/testsysprof.c        | 269 ++++++++++
 .../misclib-sysprof-lapi.test.lua             | 107 ++++
 tools/CMakeLists.txt                          |  83 +++
 tools/luajit-parse-sysprof.in                 |   6 +
 tools/sysprof.lua                             | 119 +++++
 tools/sysprof/collapse.lua                    | 110 ++++
 tools/sysprof/parse.lua                       | 188 +++++++
 32 files changed, 2309 insertions(+), 268 deletions(-)
 create mode 100644 src/lj_profile_timer.c
 create mode 100644 src/lj_profile_timer.h
 create mode 100644 src/lj_symtab.c
 create mode 100644 src/lj_symtab.h
 create mode 100644 src/lj_sysprof.c
 create mode 100644 src/lj_sysprof.h
 create mode 100644 test/tarantool-tests/misclib-sysprof-capi.test.lua
 create mode 100644 test/tarantool-tests/misclib-sysprof-capi/CMakeLists.txt
 create mode 100644 test/tarantool-tests/misclib-sysprof-capi/testsysprof.c
 create mode 100644 test/tarantool-tests/misclib-sysprof-lapi.test.lua
 create mode 100644 tools/luajit-parse-sysprof.in
 create mode 100644 tools/sysprof.lua
 create mode 100755 tools/sysprof/collapse.lua
 create mode 100755 tools/sysprof/parse.lua

---
GitHub branch:
https://github.com/tarantool/luajit/tree/shishqa/gh-781-platform-and-lua-profiler

RFC:
https://hackmd.io/g-o4f21RRkChYTyVgN1AnQ

Original issue:
https://github.com/tarantool/tarantool/issues/781

2.33.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Tarantool-patches] [PATCH luajit 1/7] core: save current frame top to lj_obj
  2021-09-08 17:50 [Tarantool-patches] [PATCH luajit 0/7] misc: introduce sysprof Maxim Kokryashkin via Tarantool-patches
@ 2021-09-08 17:50 ` Maxim Kokryashkin via Tarantool-patches
  2021-09-20 17:21   ` Sergey Kaplun via Tarantool-patches
  2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 2/7] core: separate the profiling timer from lj_profile Maxim Kokryashkin via Tarantool-patches
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2021-09-08 17:50 UTC (permalink / raw)
  To: tarantool-patches, imun, skaplun

From: Mikhail Shishatskiy <m.shishatskiy@tarantool.org>

Since commit 111d377d524e54e02187148a1832683291d620b2
the VM has LFUNC and FFUNC states. The upcoming sampling
profiler uses these vmstates to determine if the guest
stack is valid or not. There is an inconsistent behavior
of the VM when the Lua stack is not valid, but the state
is set to LFUNC. This patch is just a gross hack with which
the profiler works fine. The problem is to be investigated
more deeply :(
---
 src/lj_obj.h    | 12 ++++++++++++
 src/vm_x64.dasc | 52 +++++++++++++++++++++++++++++++++++++++----------
 src/vm_x86.dasc | 52 +++++++++++++++++++++++++++++++++++++++----------
 3 files changed, 96 insertions(+), 20 deletions(-)

diff --git a/src/lj_obj.h b/src/lj_obj.h
index d26e60be..b76c3155 100644
--- a/src/lj_obj.h
+++ b/src/lj_obj.h
@@ -514,6 +514,17 @@ typedef struct GCtab {
 #define setfreetop(t, n, v)	(setmref((n)->freetop, (v)))
 #endif
 
+/* -- Misc objects -------------------------------------------------------- */
+
+struct lj_sysprof_topframe {
+  uint8_t ffid;          /* FFUNC: fast function id. */
+  union {
+    uint64_t raw;        /* Raw value for context save/restore. */
+    TValue *interp_base; /* LFUNC: Base of the executed coroutine. */
+    lua_CFunction cf;    /* CFUNC: Address of the C function. */
+  } guesttop;
+};
+
 /* -- State objects ------------------------------------------------------- */
 
 /* VM states. */
@@ -674,6 +685,7 @@ typedef struct global_State {
   MRef jit_base;	/* Current JIT code L->base or NULL. */
   MRef ctype_state;	/* Pointer to C type state. */
   GCRef gcroot[GCROOT_MAX];  /* GC roots. */
+  struct lj_sysprof_topframe top_frame;  /* Top frame for sysprof */
 } global_State;
 
 #define mainthread(g)	(&gcref(g->mainthref)->th)
diff --git a/src/vm_x64.dasc b/src/vm_x64.dasc
index 974047d3..c4beb5e7 100644
--- a/src/vm_x64.dasc
+++ b/src/vm_x64.dasc
@@ -345,6 +345,35 @@
 |  mov dword [DISPATCH+DISPATCH_GL(vmstate)], ~LJ_VMST_..st
 |.endmacro
 |
+|// Stash interpreter's internal base and enter LFUNC VM state.
+|// PROFILER: Each time profiler sees LFUNC state, it will inspect [BASE-1]
+|// expecting to see a valid framelink there. So enter this state only when
+|// BASE is stable and slots are not moved on the stack.
+|.macro set_vmstate_lfunc
+|  set_vmstate INTERP // Guard for non-atomic VM context restoration
+|  mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], BASE
+|  set_vmstate LFUNC
+|.endmacro
+|
+|// Stash ID of the fast function about to be executed and enter FFUNC VM state.
+|// PROFILER: Each time profiler sees FFUNC state, it will write ffid
+|// to the profile stream.
+|.macro set_vmstate_ffunc
+|  set_vmstate INTERP // Guard for non-atomic VM context restoration
+|  mov XCHGd, dword [BASE-8]
+|  mov dword [DISPATCH+DISPATCH_GL(top_frame.ffid)], XCHGd
+|  set_vmstate FFUNC
+|.endmacro
+|
+|// Stash address of the C function about to be executed and enter CFUNC VM state.
+|// PROFILER: Each time profiler sees CFUNC state, it will write this address
+|// to the profile stream.
+|.macro set_vmstate_cfunc
+|  set_vmstate INTERP // Guard for non-atomic VM context restoration
+|  mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], BASE
+|  set_vmstate CFUNC
+|.endmacro
+|
 |// Uses TMPRd (r10d).
 |.macro save_vmstate
 |.if not WIN
@@ -435,7 +464,7 @@ static void build_subroutines(BuildCtx *ctx)
   |  jnz ->vm_returnp
   |
   |  // Return to C.
-  |  set_vmstate CFUNC
+  |  set_vmstate_cfunc
   |  and PC, -8
   |  sub PC, BASE
   |  neg PC				// Previous base = BASE - delta.
@@ -467,6 +496,9 @@ static void build_subroutines(BuildCtx *ctx)
   |  xor eax, eax			// Ok return status for vm_pcall.
   |
   |->vm_leave_unw:
+  |  set_vmstate INTERP // Guard for non-atomic VM context restoration
+  |  mov XCHGd, L:RBa->base
+  |  mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], XCHGd
   |  // DISPATCH required to set properly.
   |  restore_vmstate			// Caveat: uses TMPRd (r10d).
   |  restoreregs
@@ -725,7 +757,7 @@ static void build_subroutines(BuildCtx *ctx)
   |  cleartp LFUNC:KBASE
   |  mov KBASE, LFUNC:KBASE->pc
   |  mov KBASE, [KBASE+PC2PROTO(k)]
-  |  set_vmstate LFUNC			// LFUNC after KBASE restoration.
+  |  set_vmstate_lfunc			// LFUNC after KBASE restoration.
   |  // BASE = base, RC = result, RB = meta base
   |  jmp RA				// Jump to continuation.
   |
@@ -1166,7 +1198,7 @@ static void build_subroutines(BuildCtx *ctx)
   |
   |.macro .ffunc, name
   |->ff_ .. name:
-  |  set_vmstate FFUNC
+  |  set_vmstate_ffunc
   |.endmacro
   |
   |.macro .ffunc_1, name
@@ -1748,7 +1780,7 @@ static void build_subroutines(BuildCtx *ctx)
   |  movzx RAd, PC_RA
   |  neg RA
   |  lea BASE, [BASE+RA*8-16]		// base = base - (RA+2)*8
-  |  set_vmstate LFUNC			// LFUNC state after BASE restoration.
+  |  set_vmstate_lfunc			// LFUNC state after BASE restoration.
   |  ins_next
   |
   |6:  // Fill up results with nil.
@@ -2513,7 +2545,7 @@ static void build_subroutines(BuildCtx *ctx)
   |  mov KBASE, [KBASE+PC2PROTO(k)]
   |  mov L:RB->base, BASE
   |  mov qword [DISPATCH+DISPATCH_GL(jit_base)], 0
-  |  set_vmstate LFUNC			// LFUNC after BASE & KBASE restoration.
+  |  set_vmstate_lfunc			// LFUNC after BASE & KBASE restoration.
   |  // Modified copy of ins_next which handles function header dispatch, too.
   |  mov RCd, [PC]
   |  movzx RAd, RCH
@@ -2730,7 +2762,7 @@ static void build_subroutines(BuildCtx *ctx)
   |  call extern lj_ccallback_enter	// (CTState *cts, void *cf)
   |  // lua_State * returned in eax (RD).
   |  mov BASE, L:RD->base
-  |  set_vmstate LFUNC			// LFUNC after BASE restoration.
+  |  set_vmstate_lfunc			// LFUNC after BASE restoration.
   |  mov RD, L:RD->top
   |  sub RD, BASE
   |  mov LFUNC:RB, [BASE-16]
@@ -4299,7 +4331,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
     |  mov KBASE, LFUNC:KBASE->pc
     |  mov KBASE, [KBASE+PC2PROTO(k)]
     |  // LFUNC after the old BASE & KBASE is restored.
-    |  set_vmstate LFUNC
+    |  set_vmstate_lfunc
     |  ins_next
     |
     |6:  // Fill up results with nil.
@@ -4591,7 +4623,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
     |  ins_AD  // BASE = new base, RA = framesize, RD = nargs+1
     |  mov KBASE, [PC-4+PC2PROTO(k)]
     |  mov L:RB, SAVE_L
-    |  set_vmstate LFUNC		// LFUNC after KBASE restoration.
+    |  set_vmstate_lfunc		// LFUNC after KBASE restoration.
     |  lea RA, [BASE+RA*8]		// Top of frame.
     |  cmp RA, L:RB->maxstack
     |  ja ->vm_growstack_f
@@ -4629,7 +4661,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
     |  mov [RD-8], RB			// Store delta + FRAME_VARG.
     |  mov [RD-16], LFUNC:KBASE		// Store copy of LFUNC.
     |  mov L:RB, SAVE_L
-    |  set_vmstate LFUNC		// LFUNC after KBASE restoration.
+    |  set_vmstate_lfunc		// LFUNC after KBASE restoration.
     |  lea RA, [RD+RA*8]
     |  cmp RA, L:RB->maxstack
     |  ja ->vm_growstack_v		// Need to grow stack.
@@ -4685,7 +4717,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
       |  mov CARG1, L:RB		// Caveat: CARG1 may be RA.
     }
     |  ja ->vm_growstack_c		// Need to grow stack.
-    |  set_vmstate CFUNC		// CFUNC before entering C function.
+    |  set_vmstate_cfunc		// CFUNC before entering C function.
     if (op == BC_FUNCC) {
       |  call KBASE			// (lua_State *L)
     } else {
diff --git a/src/vm_x86.dasc b/src/vm_x86.dasc
index ab8e6f27..222754fe 100644
--- a/src/vm_x86.dasc
+++ b/src/vm_x86.dasc
@@ -443,6 +443,35 @@
 |  mov dword [DISPATCH+DISPATCH_GL(vmstate)], ~LJ_VMST_..st
 |.endmacro
 |
+|// Stash interpreter's internal base and enter LFUNC VM state.
+|// PROFILER: Each time profiler sees LFUNC state, it will inspect [BASE-1]
+|// expecting to see a valid framelink there. So enter this state only when
+|// BASE is stable and slots are not moved on the stack.
+|.macro set_vmstate_lfunc
+|  set_vmstate INTERP // Guard for non-atomic VM context restoration
+|  mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], BASE
+|  set_vmstate LFUNC
+|.endmacro
+|
+|// Stash ID of the fast function about to be executed and enter FFUNC VM state.
+|// PROFILER: Each time profiler sees FFUNC state, it will write ffid
+|// to the profile stream.
+|.macro set_vmstate_ffunc
+|  set_vmstate INTERP // Guard for non-atomic VM context restoration
+|  mov XCHGd, dword [BASE-8]
+|  mov dword [DISPATCH+DISPATCH_GL(top_frame.ffid)], XCHGd
+|  set_vmstate FFUNC
+|.endmacro
+|
+|// Stash address of the C function about to be executed and enter CFUNC VM state.
+|// PROFILER: Each time profiler sees CFUNC state, it will write this address
+|// to the profile stream.
+|.macro set_vmstate_cfunc
+|  set_vmstate INTERP // Guard for non-atomic VM context restoration
+|  mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], BASE
+|  set_vmstate CFUNC
+|.endmacro
+|
 |// Uses spilled ecx on x86 or XCHGd (r11d) on x64.
 |.macro save_vmstate
 |.if not WIN
@@ -560,7 +589,7 @@ static void build_subroutines(BuildCtx *ctx)
   |  jnz ->vm_returnp
   |
   |  // Return to C.
-  |  set_vmstate CFUNC
+  |  set_vmstate_cfunc
   |  and PC, -8
   |  sub PC, BASE
   |  neg PC				// Previous base = BASE - delta.
@@ -599,6 +628,9 @@ static void build_subroutines(BuildCtx *ctx)
   |  xor eax, eax			// Ok return status for vm_pcall.
   |
   |->vm_leave_unw:
+  |  set_vmstate INTERP // Guard for non-atomic VM context restoration
+  |  mov XCHGd, L:RBa->base
+  |  mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], XCHGd
   |  // DISPATCH required to set properly.
   |  restore_vmstate			// Caveat: on x64 uses XCHGd (r11d).
   |  restoreregs
@@ -934,7 +966,7 @@ static void build_subroutines(BuildCtx *ctx)
   |  mov KBASE, LFUNC:KBASE->pc
   |  mov KBASE, [KBASE+PC2PROTO(k)]
   |  // BASE = base, RC = result, RB = meta base
-  |  set_vmstate LFUNC			// LFUNC after KBASE restoration.
+  |  set_vmstate_lfunc			// LFUNC after KBASE restoration.
   |  jmp RAa				// Jump to continuation.
   |
   |.if FFI
@@ -1459,7 +1491,7 @@ static void build_subroutines(BuildCtx *ctx)
   |
   |.macro .ffunc, name
   |->ff_ .. name:
-  |  set_vmstate FFUNC
+  |  set_vmstate_ffunc
   |.endmacro
   |
   |.macro .ffunc_1, name
@@ -2141,7 +2173,7 @@ static void build_subroutines(BuildCtx *ctx)
   |  movzx RA, PC_RA
   |  not RAa				// Note: ~RA = -(RA+1)
   |  lea BASE, [BASE+RA*8]		// base = base - (RA+1)*8
-  |  set_vmstate LFUNC			// LFUNC state after BASE restoration.
+  |  set_vmstate_lfunc			// LFUNC state after BASE restoration.
   |  ins_next
   |
   |6:  // Fill up results with nil.
@@ -2986,7 +3018,7 @@ static void build_subroutines(BuildCtx *ctx)
   |  mov KBASE, [KBASE+PC2PROTO(k)]
   |  mov L:RB->base, BASE
   |  mov dword [DISPATCH+DISPATCH_GL(jit_base)], 0
-  |  set_vmstate LFUNC			// LFUNC after BASE & KBASE restoration.
+  |  set_vmstate_lfunc			// LFUNC after BASE & KBASE restoration.
   |  // Modified copy of ins_next which handles function header dispatch, too.
   |  mov RC, [PC]
   |  movzx RA, RCH
@@ -3257,7 +3289,7 @@ static void build_subroutines(BuildCtx *ctx)
   |  call extern lj_ccallback_enter@8	// (CTState *cts, void *cf)
   |  // lua_State * returned in eax (RD).
   |  mov BASE, L:RD->base
-  |  set_vmstate LFUNC			// LFUNC after BASE restoration.
+  |  set_vmstate_lfunc			// LFUNC after BASE restoration.
   |  mov RD, L:RD->top
   |  sub RD, BASE
   |  mov LFUNC:RB, [BASE-8]
@@ -5103,7 +5135,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
     |  mov KBASE, LFUNC:KBASE->pc
     |  mov KBASE, [KBASE+PC2PROTO(k)]
     |  // LFUNC after the old BASE & KBASE is restored.
-    |  set_vmstate LFUNC
+    |  set_vmstate_lfunc
     |  ins_next
     |
     |6:  // Fill up results with nil.
@@ -5391,7 +5423,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
     |  ins_AD  // BASE = new base, RA = framesize, RD = nargs+1
     |  mov KBASE, [PC-4+PC2PROTO(k)]
     |  mov L:RB, SAVE_L
-    |  set_vmstate LFUNC		// LFUNC after KBASE restoration.
+    |  set_vmstate_lfunc		// LFUNC after KBASE restoration.
     |  lea RA, [BASE+RA*8]		// Top of frame.
     |  cmp RA, L:RB->maxstack
     |  ja ->vm_growstack_f
@@ -5429,7 +5461,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
     |  mov [RD-4], RB			// Store delta + FRAME_VARG.
     |  mov [RD-8], LFUNC:KBASE		// Store copy of LFUNC.
     |  mov L:RB, SAVE_L
-    |  set_vmstate LFUNC		// LFUNC after KBASE restoration.
+    |  set_vmstate_lfunc		// LFUNC after KBASE restoration.
     |  lea RA, [RD+RA*8]
     |  cmp RA, L:RB->maxstack
     |  ja ->vm_growstack_v		// Need to grow stack.
@@ -5494,7 +5526,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
       |.endif
     }
     |  ja ->vm_growstack_c		// Need to grow stack.
-    |  set_vmstate CFUNC		// CFUNC before entering C function.
+    |  set_vmstate_cfunc		// CFUNC before entering C function.
     if (op == BC_FUNCC) {
       |  call KBASEa			// (lua_State *L)
     } else {
-- 
2.33.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Tarantool-patches] [PATCH luajit 2/7] core: separate the profiling timer from lj_profile
  2021-09-08 17:50 [Tarantool-patches] [PATCH luajit 0/7] misc: introduce sysprof Maxim Kokryashkin via Tarantool-patches
  2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 1/7] core: save current frame top to lj_obj Maxim Kokryashkin via Tarantool-patches
@ 2021-09-08 17:50 ` Maxim Kokryashkin via Tarantool-patches
  2021-09-21 11:13   ` Sergey Kaplun via Tarantool-patches
  2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 3/7] memprof: move symtab to a separate module Maxim Kokryashkin via Tarantool-patches
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2021-09-08 17:50 UTC (permalink / raw)
  To: tarantool-patches, imun, skaplun

From: Mikhail Shishatskiy <m.shishatskiy@tarantool.org>

This patch makes timer machinery in lj_profile self-reliant by
introducing lj_profile_timer structure and start/stop interface
for it. This makes the timer useful for other sampling profiling
features.

Part of tarantool/tarantool#781
---
 src/CMakeLists.txt        |   1 +
 src/Makefile.dep.original |   5 +-
 src/lj_profile.c          | 178 +++-----------------------------------
 src/lj_profile_timer.c    | 126 +++++++++++++++++++++++++++
 src/lj_profile_timer.h    |  83 ++++++++++++++++++
 5 files changed, 227 insertions(+), 166 deletions(-)
 create mode 100644 src/lj_profile_timer.c
 create mode 100644 src/lj_profile_timer.h

diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 809aac68..50c23cae 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -68,6 +68,7 @@ make_source_list(SOURCES_UTILS
 make_source_list(SOURCES_PROFILER
   SOURCES
     lj_memprof.c
+    lj_profile_timer.c
     lj_profile.c
 )
 
diff --git a/src/Makefile.dep.original b/src/Makefile.dep.original
index f3672413..da5abf0d 100644
--- a/src/Makefile.dep.original
+++ b/src/Makefile.dep.original
@@ -177,7 +177,10 @@ lj_parse.o: lj_parse.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
  lj_vm.h lj_vmevent.h
 lj_profile.o: lj_profile.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
  lj_buf.h lj_gc.h lj_str.h lj_frame.h lj_bc.h lj_debug.h lj_dispatch.h \
- lj_jit.h lj_ir.h lj_trace.h lj_traceerr.h lj_profile.h luajit.h
+ lj_jit.h lj_ir.h lj_trace.h lj_traceerr.h lj_profile.h \
+ lj_profile_timer.h luajit.h
+lj_profile_timer.o: lj_profile_timer.c lj_profile_timer.h lj_arch.h lua.h \
+ luaconf.h
 lj_record.o: lj_record.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
  lj_err.h lj_errmsg.h lj_str.h lj_tab.h lj_meta.h lj_frame.h lj_bc.h \
  lj_ctype.h lj_gc.h lj_ff.h lj_ffdef.h lj_debug.h lj_ir.h lj_jit.h \
diff --git a/src/lj_profile.c b/src/lj_profile.c
index 7b09a63a..e7e1a303 100644
--- a/src/lj_profile.c
+++ b/src/lj_profile.c
@@ -19,66 +19,19 @@
 #include "lj_trace.h"
 #endif
 #include "lj_profile.h"
+#include "lj_profile_timer.h"
 
 #include "luajit.h"
 
-#if LJ_PROFILE_SIGPROF
-
-#include <sys/time.h>
-#include <signal.h>
-#define profile_lock(ps)	UNUSED(ps)
-#define profile_unlock(ps)	UNUSED(ps)
-
-#elif LJ_PROFILE_PTHREAD
-
-#include <pthread.h>
-#include <time.h>
-#if LJ_TARGET_PS3
-#include <sys/timer.h>
-#endif
-#define profile_lock(ps)	pthread_mutex_lock(&ps->lock)
-#define profile_unlock(ps)	pthread_mutex_unlock(&ps->lock)
-
-#elif LJ_PROFILE_WTHREAD
-
-#define WIN32_LEAN_AND_MEAN
-#if LJ_TARGET_XBOX360
-#include <xtl.h>
-#include <xbox.h>
-#else
-#include <windows.h>
-#endif
-typedef unsigned int (WINAPI *WMM_TPFUNC)(unsigned int);
-#define profile_lock(ps)	EnterCriticalSection(&ps->lock)
-#define profile_unlock(ps)	LeaveCriticalSection(&ps->lock)
-
-#endif
-
 /* Profiler state. */
 typedef struct ProfileState {
   global_State *g;		/* VM state that started the profiler. */
   luaJIT_profile_callback cb;	/* Profiler callback. */
   void *data;			/* Profiler callback data. */
   SBuf sb;			/* String buffer for stack dumps. */
-  int interval;			/* Sample interval in milliseconds. */
   int samples;			/* Number of samples for next callback. */
   int vmstate;			/* VM state when profile timer triggered. */
-#if LJ_PROFILE_SIGPROF
-  struct sigaction oldsa;	/* Previous SIGPROF state. */
-#elif LJ_PROFILE_PTHREAD
-  pthread_mutex_t lock;		/* g->hookmask update lock. */
-  pthread_t thread;		/* Timer thread. */
-  int abort;			/* Abort timer thread. */
-#elif LJ_PROFILE_WTHREAD
-#if LJ_TARGET_WINDOWS
-  HINSTANCE wmm;		/* WinMM library handle. */
-  WMM_TPFUNC wmm_tbp;		/* WinMM timeBeginPeriod function. */
-  WMM_TPFUNC wmm_tep;		/* WinMM timeEndPeriod function. */
-#endif
-  CRITICAL_SECTION lock;	/* g->hookmask update lock. */
-  HANDLE thread;		/* Timer thread. */
-  int abort;			/* Abort timer thread. */
-#endif
+  lj_profile_timer timer;  /* Profiling timer */
 } ProfileState;
 
 /* Sadly, we have to use a static profiler state.
@@ -145,6 +98,7 @@ void LJ_FASTCALL lj_profile_interpreter(lua_State *L)
   profile_unlock(ps);
 }
 
+
 /* Trigger profile hook. Asynchronous call from OS-specific profile timer. */
 static void profile_trigger(ProfileState *ps)
 {
@@ -168,129 +122,21 @@ static void profile_trigger(ProfileState *ps)
   profile_unlock(ps);
 }
 
-/* -- OS-specific profile timer handling ---------------------------------- */
-
 #if LJ_PROFILE_SIGPROF
 
-/* SIGPROF handler. */
-static void profile_signal(int sig)
+static void profile_handler(int sig, siginfo_t *info, void* ctx)
 {
   UNUSED(sig);
+  UNUSED(info);
+  UNUSED(ctx);
   profile_trigger(&profile_state);
 }
 
-/* Start profiling timer. */
-static void profile_timer_start(ProfileState *ps)
-{
-  int interval = ps->interval;
-  struct itimerval tm;
-  struct sigaction sa;
-  tm.it_value.tv_sec = tm.it_interval.tv_sec = interval / 1000;
-  tm.it_value.tv_usec = tm.it_interval.tv_usec = (interval % 1000) * 1000;
-  setitimer(ITIMER_PROF, &tm, NULL);
-  sa.sa_flags = SA_RESTART;
-  sa.sa_handler = profile_signal;
-  sigemptyset(&sa.sa_mask);
-  sigaction(SIGPROF, &sa, &ps->oldsa);
-}
-
-/* Stop profiling timer. */
-static void profile_timer_stop(ProfileState *ps)
-{
-  struct itimerval tm;
-  tm.it_value.tv_sec = tm.it_interval.tv_sec = 0;
-  tm.it_value.tv_usec = tm.it_interval.tv_usec = 0;
-  setitimer(ITIMER_PROF, &tm, NULL);
-  sigaction(SIGPROF, &ps->oldsa, NULL);
-}
-
-#elif LJ_PROFILE_PTHREAD
-
-/* POSIX timer thread. */
-static void *profile_thread(ProfileState *ps)
-{
-  int interval = ps->interval;
-#if !LJ_TARGET_PS3
-  struct timespec ts;
-  ts.tv_sec = interval / 1000;
-  ts.tv_nsec = (interval % 1000) * 1000000;
-#endif
-  while (1) {
-#if LJ_TARGET_PS3
-    sys_timer_usleep(interval * 1000);
 #else
-    nanosleep(&ts, NULL);
-#endif
-    if (ps->abort) break;
-    profile_trigger(ps);
-  }
-  return NULL;
-}
-
-/* Start profiling timer thread. */
-static void profile_timer_start(ProfileState *ps)
-{
-  pthread_mutex_init(&ps->lock, 0);
-  ps->abort = 0;
-  pthread_create(&ps->thread, NULL, (void *(*)(void *))profile_thread, ps);
-}
 
-/* Stop profiling timer thread. */
-static void profile_timer_stop(ProfileState *ps)
+static void profile_handler()
 {
-  ps->abort = 1;
-  pthread_join(ps->thread, NULL);
-  pthread_mutex_destroy(&ps->lock);
-}
-
-#elif LJ_PROFILE_WTHREAD
-
-/* Windows timer thread. */
-static DWORD WINAPI profile_thread(void *psx)
-{
-  ProfileState *ps = (ProfileState *)psx;
-  int interval = ps->interval;
-#if LJ_TARGET_WINDOWS
-  ps->wmm_tbp(interval);
-#endif
-  while (1) {
-    Sleep(interval);
-    if (ps->abort) break;
-    profile_trigger(ps);
-  }
-#if LJ_TARGET_WINDOWS
-  ps->wmm_tep(interval);
-#endif
-  return 0;
-}
-
-/* Start profiling timer thread. */
-static void profile_timer_start(ProfileState *ps)
-{
-#if LJ_TARGET_WINDOWS
-  if (!ps->wmm) {  /* Load WinMM library on-demand. */
-    ps->wmm = LoadLibraryExA("winmm.dll", NULL, 0);
-    if (ps->wmm) {
-      ps->wmm_tbp = (WMM_TPFUNC)GetProcAddress(ps->wmm, "timeBeginPeriod");
-      ps->wmm_tep = (WMM_TPFUNC)GetProcAddress(ps->wmm, "timeEndPeriod");
-      if (!ps->wmm_tbp || !ps->wmm_tep) {
-	ps->wmm = NULL;
-	return;
-      }
-    }
-  }
-#endif
-  InitializeCriticalSection(&ps->lock);
-  ps->abort = 0;
-  ps->thread = CreateThread(NULL, 0, profile_thread, ps, 0, NULL);
-}
-
-/* Stop profiling timer thread. */
-static void profile_timer_stop(ProfileState *ps)
-{
-  ps->abort = 1;
-  WaitForSingleObject(ps->thread, INFINITE);
-  DeleteCriticalSection(&ps->lock);
+  profile_trigger(&profile_state);
 }
 
 #endif
@@ -327,12 +173,14 @@ LUA_API void luaJIT_profile_start(lua_State *L, const char *mode,
     if (ps->g) return;  /* Profiler in use by another VM. */
   }
   ps->g = G(L);
-  ps->interval = interval;
   ps->cb = cb;
   ps->data = data;
   ps->samples = 0;
   lj_buf_init(L, &ps->sb);
-  profile_timer_start(ps);
+
+  ps->timer.opt.interval_msec = interval;
+  ps->timer.opt.handler = profile_handler;
+  lj_profile_timer_start(&ps->timer);
 }
 
 /* Stop profiling. */
@@ -341,7 +189,7 @@ LUA_API void luaJIT_profile_stop(lua_State *L)
   ProfileState *ps = &profile_state;
   global_State *g = ps->g;
   if (G(L) == g) {  /* Only stop profiler if started by this VM. */
-    profile_timer_stop(ps);
+    lj_profile_timer_stop(&ps->timer);
     g->hookmask &= ~HOOK_PROFILE;
     lj_dispatch_update(g);
 #if LJ_HASJIT
diff --git a/src/lj_profile_timer.c b/src/lj_profile_timer.c
new file mode 100644
index 00000000..ce420a85
--- /dev/null
+++ b/src/lj_profile_timer.c
@@ -0,0 +1,126 @@
+/*
+** Simple profiling timer extracted from inbuilt luajit profiler
+*/
+
+#define lj_profile_timer_c
+#define LUA_CORE
+
+#include "lj_profile_timer.h"
+
+#if LJ_HASPROFILE
+
+#if LJ_PROFILE_SIGPROF
+
+/* Start profiling timer. */
+void lj_profile_timer_start(lj_profile_timer *timer) {
+  int interval = timer->opt.interval_msec;
+  struct itimerval tm;
+  struct sigaction sa;
+  tm.it_value.tv_sec = tm.it_interval.tv_sec = interval / 1000;
+  tm.it_value.tv_usec = tm.it_interval.tv_usec = (interval % 1000) * 1000;
+  setitimer(ITIMER_PROF, &tm, NULL);
+  sa.sa_flags = SA_RESTART | SA_SIGINFO;
+  sa.sa_sigaction = timer->opt.handler;
+  sigemptyset(&sa.sa_mask);
+  sigaction(SIGPROF, &sa, &timer->oldsa);
+}
+
+/* Stop profiling timer. */
+void lj_profile_timer_stop(lj_profile_timer *timer) {
+  struct itimerval tm;
+  tm.it_value.tv_sec = tm.it_interval.tv_sec = 0;
+  tm.it_value.tv_usec = tm.it_interval.tv_usec = 0;
+  setitimer(ITIMER_PROF, &tm, NULL);
+  sigaction(SIGPROF, &timer->oldsa, NULL);
+}
+
+#elif LJ_PROFILE_PTHREAD
+
+/* POSIX timer thread. */
+static void *profile_thread(lj_timer *timer) {
+  int interval = timer->opt.interval_msec;
+#if !LJ_TARGET_PS3
+  struct timespec ts;
+  ts.tv_sec = interval / 1000;
+  ts.tv_nsec = (interval % 1000) * 1000000;
+#endif
+  while (1) {
+#if LJ_TARGET_PS3
+    sys_timer_usleep(interval * 1000);
+#else
+    nanosleep(&ts, NULL);
+#endif
+    if (timer->abort)
+      break;
+    timer->opt.handler();
+  }
+  return NULL;
+}
+
+/* Start profiling timer thread. */
+void lj_profile_timer_start(lj_profile_timer *timer) {
+  pthread_mutex_init(&timer->lock, 0);
+  timer->abort = 0;
+  pthread_create(&timer->thread, NULL, (void *(*)(void *))profile_thread,
+                 timer);
+}
+
+/* Stop profiling timer thread. */
+void lj_profile_timer_stop(lj_profile_timer *timer) {
+  timer->abort = 1;
+  pthread_join(timer->thread, NULL);
+  pthread_mutex_destroy(&timer->lock);
+}
+
+#elif LJ_PROFILE_WTHREAD
+
+/* Windows timer thread. */
+static DWORD WINAPI profile_thread(void *timerx) {
+  lj_profile_timer *timer = (lj_profile_timer *)timerx;
+  int interval = timer->opt.interval_msec;
+#if LJ_TARGET_WINDOWS
+  timer->wmm_tbp(interval);
+#endif
+  while (1) {
+    Sleep(interval);
+    if (timer->abort)
+      break;
+    timer->opt.handler();
+  }
+#if LJ_TARGET_WINDOWS
+  timer->wmm_tep(interval);
+#endif
+  return 0;
+}
+
+/* Start profiling timer thread. */
+void lj_profile_timer_start(lj_profile_timer *timer) {
+#if LJ_TARGET_WINDOWS
+  if (!timer->wmm) { /* Load WinMM library on-demand. */
+    timer->wmm = LoadLibraryExA("winmm.dll", NULL, 0);
+    if (timer->wmm) {
+      timer->wmm_tbp =
+          (WMM_TPFUNC)GetProcAddress(timer->wmm, "timeBeginPeriod");
+      timer->wmm_tep = (WMM_TPFUNC)GetProcAddress(timer->wmm, "timeEndPeriod");
+      if (!timer->wmm_tbp || !timer->wmm_tep) {
+        timer->wmm = NULL;
+        return;
+      }
+    }
+  }
+#endif
+  InitializeCriticalSection(&timer->lock);
+  timer->abort = 0;
+  timer->thread = CreateThread(NULL, 0, profile_thread, timer, 0, NULL);
+}
+
+/* Stop profiling timer thread. */
+void lj_profile_timer_stop(lj_profile_timer *timer) {
+  timer->abort = 1;
+  WaitForSingleObject(timer->thread, INFINITE);
+  DeleteCriticalSection(&timer->lock);
+}
+
+#endif
+
+#endif  /* LJ_HASPROFILE */
diff --git a/src/lj_profile_timer.h b/src/lj_profile_timer.h
new file mode 100644
index 00000000..af9eabce
--- /dev/null
+++ b/src/lj_profile_timer.h
@@ -0,0 +1,83 @@
+/*
+** Simple profiling timer extracted from inbuilt luajit profiler
+*/
+
+#ifndef _LJ_PROFILE_TIMER_H
+#define _LJ_PROFILE_TIMER_H
+
+#include <stdint.h>
+
+#include "lj_arch.h"
+
+#if LJ_HASPROFILE
+
+#if LJ_PROFILE_SIGPROF
+
+#include <sys/time.h>
+#include <signal.h>
+#define profile_lock(ps)	UNUSED(ps)
+#define profile_unlock(ps)	UNUSED(ps)
+
+#elif LJ_PROFILE_PTHREAD
+
+#include <pthread.h>
+#include <time.h>
+#if LJ_TARGET_PS3
+#include <sys/timer.h>
+#endif
+#define profile_lock(ps)	pthread_mutex_lock(&ps->lock)
+#define profile_unlock(ps)	pthread_mutex_unlock(&ps->lock)
+
+#elif LJ_PROFILE_WTHREAD
+
+#define WIN32_LEAN_AND_MEAN
+#if LJ_TARGET_XBOX360
+#include <xtl.h>
+#include <xbox.h>
+#else
+#include <windows.h>
+#endif
+typedef unsigned int (WINAPI *WMM_TPFUNC)(unsigned int);
+#define profile_lock(ps)	EnterCriticalSection(&ps->lock)
+#define profile_unlock(ps)	LeaveCriticalSection(&ps->lock)
+
+#endif
+
+typedef struct {
+#if LJ_PROFILE_SIGPROF
+  void (*handler)(int, siginfo_t*, void*);
+#else
+  void (*handler)(void);
+#endif
+  uint32_t interval_msec;
+} lj_profile_timer_opt;
+
+typedef struct {
+  lj_profile_timer_opt opt;
+#if LJ_PROFILE_SIGPROF
+  struct sigaction oldsa;  /* Previous SIGPROF state. */
+#elif LJ_PROFILE_PTHREAD
+  pthread_mutex_t lock;		 /* g->hookmask update lock. */
+  pthread_t thread;		     /* Timer thread. */
+  int abort;			         /* Abort timer thread. */
+#elif LJ_PROFILE_WTHREAD
+#if LJ_TARGET_WINDOWS
+  HINSTANCE wmm;		       /* WinMM library handle. */
+  WMM_TPFUNC wmm_tbp;		   /* WinMM timeBeginPeriod function. */
+  WMM_TPFUNC wmm_tep;		   /* WinMM timeEndPeriod function. */
+#endif
+  CRITICAL_SECTION lock;	 /* g->hookmask update lock. */
+  HANDLE thread;		       /* Timer thread. */
+  int abort;			         /* Abort timer thread. */
+#endif
+} lj_profile_timer;
+
+/* Start profiling timer */
+void lj_profile_timer_start(lj_profile_timer *timer);
+
+/* Stop profiling timer */
+void lj_profile_timer_stop(lj_profile_timer *timer);
+
+#endif  /* LJ_HASPROFILE */
+
+#endif
-- 
2.33.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Tarantool-patches] [PATCH luajit 3/7] memprof: move symtab to a separate module
  2021-09-08 17:50 [Tarantool-patches] [PATCH luajit 0/7] misc: introduce sysprof Maxim Kokryashkin via Tarantool-patches
  2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 1/7] core: save current frame top to lj_obj Maxim Kokryashkin via Tarantool-patches
  2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 2/7] core: separate the profiling timer from lj_profile Maxim Kokryashkin via Tarantool-patches
@ 2021-09-08 17:50 ` Maxim Kokryashkin via Tarantool-patches
  2021-09-22  7:51   ` Sergey Kaplun via Tarantool-patches
  2021-09-23 14:51   ` [Tarantool-patches] [PATCH luajit v2] " Maxim Kokryashkin via Tarantool-patches
  2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 4/7] core: introduce lua and platform profiler Maxim Kokryashkin via Tarantool-patches
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 19+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2021-09-08 17:50 UTC (permalink / raw)
  To: tarantool-patches, imun, skaplun

Considering the symbol table format, it is obvious that it is suitable
not only for memprof, but for other profiling modules too. This commit
moves the symbol table to the separate module, so modules other than
memprof will be able to access it.

Part of tarantool/tarantool#781
---
 src/CMakeLists.txt        |  1 +
 src/Makefile.dep.original |  4 +++-
 src/lj_memprof.c          | 37 ++-----------------------------
 src/lj_memprof.h          | 36 ------------------------------
 src/lj_symtab.c           | 36 ++++++++++++++++++++++++++++++
 src/lj_symtab.h           | 46 +++++++++++++++++++++++++++++++++++++++
 6 files changed, 88 insertions(+), 72 deletions(-)
 create mode 100644 src/lj_symtab.c
 create mode 100644 src/lj_symtab.h

diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 50c23cae..0f3d888f 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -60,6 +60,7 @@ make_source_list(SOURCES_UTILS
   SOURCES
     lj_alloc.c
     lj_char.c
+    lj_symtab.c
     lj_utils_leb128.c
     lj_vmmath.c
     lj_wbuf.c
diff --git a/src/Makefile.dep.original b/src/Makefile.dep.original
index da5abf0d..b7d82719 100644
--- a/src/Makefile.dep.original
+++ b/src/Makefile.dep.original
@@ -146,7 +146,7 @@ lj_mcode.o: lj_mcode.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
  lj_gc.h lj_err.h lj_errmsg.h lj_jit.h lj_ir.h lj_mcode.h lj_trace.h \
  lj_dispatch.h lj_bc.h lj_traceerr.h lj_vm.h
 lj_memprof.o: lj_memprof.c lj_arch.h lua.h luaconf.h lj_memprof.h \
- lj_def.h lj_wbuf.h lj_obj.h lj_frame.h lj_bc.h lj_debug.h
+ lj_def.h lj_wbuf.h lj_obj.h lj_frame.h lj_bc.h lj_debug.h lj_symtab.h
 lj_meta.o: lj_meta.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h lj_gc.h \
  lj_err.h lj_errmsg.h lj_buf.h lj_str.h lj_tab.h lj_meta.h lj_frame.h \
  lj_bc.h lj_vm.h lj_strscan.h lj_strfmt.h lj_lib.h
@@ -202,6 +202,8 @@ lj_strfmt_num.o: lj_strfmt_num.c lj_obj.h lua.h luaconf.h lj_def.h \
  lj_arch.h lj_buf.h lj_gc.h lj_str.h lj_strfmt.h
 lj_strscan.o: lj_strscan.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
  lj_char.h lj_strscan.h
+lj_symtab.o: lj_symtab.c lj_symtab.h lj_wbuf.h lj_def.h lua.h luaconf.h \
+ lj_obj.h lj_arch.h
 lj_tab.o: lj_tab.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h lj_gc.h \
  lj_err.h lj_errmsg.h lj_tab.h
 lj_trace.o: lj_trace.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
diff --git a/src/lj_memprof.c b/src/lj_memprof.c
index 2c1ef3b8..fba688bc 100644
--- a/src/lj_memprof.c
+++ b/src/lj_memprof.c
@@ -19,40 +19,7 @@
 #include "lj_frame.h"
 #include "lj_debug.h"
 
-/* --------------------------------- Symtab --------------------------------- */
-
-static const unsigned char ljs_header[] = {'l', 'j', 's', LJS_CURRENT_VERSION,
-					   0x0, 0x0, 0x0};
-
-static void dump_symtab(struct lj_wbuf *out, const struct global_State *g)
-{
-  const GCRef *iter = &g->gc.root;
-  const GCobj *o;
-  const size_t ljs_header_len = sizeof(ljs_header) / sizeof(ljs_header[0]);
-
-  /* Write prologue. */
-  lj_wbuf_addn(out, ljs_header, ljs_header_len);
-
-  while ((o = gcref(*iter)) != NULL) {
-    switch (o->gch.gct) {
-    case (~LJ_TPROTO): {
-      const GCproto *pt = gco2pt(o);
-      lj_wbuf_addbyte(out, SYMTAB_LFUNC);
-      lj_wbuf_addu64(out, (uintptr_t)pt);
-      lj_wbuf_addstring(out, proto_chunknamestr(pt));
-      lj_wbuf_addu64(out, (uint64_t)pt->firstline);
-      break;
-    }
-    default:
-      break;
-    }
-    iter = &o->gch.nextgc;
-  }
-
-  lj_wbuf_addbyte(out, SYMTAB_FINAL);
-}
-
-/* ---------------------------- Memory profiler ----------------------------- */
+#include "lj_symtab.h"
 
 enum memprof_state {
   /* Memory profiler is not running. */
@@ -249,7 +216,7 @@ int lj_memprof_start(struct lua_State *L, const struct lj_memprof_options *opt)
 
   /* Init output. */
   lj_wbuf_init(&mp->out, mp_opt->writer, mp_opt->ctx, mp_opt->buf, mp_opt->len);
-  dump_symtab(&mp->out, mp->g);
+  lj_symtab_dump(&mp->out, mp->g);
 
   /* Write prologue. */
   lj_wbuf_addn(&mp->out, ljm_header, ljm_header_len);
diff --git a/src/lj_memprof.h b/src/lj_memprof.h
index 3417475d..33a9b869 100644
--- a/src/lj_memprof.h
+++ b/src/lj_memprof.h
@@ -16,42 +16,6 @@
 #include "lj_def.h"
 #include "lj_wbuf.h"
 
-#define LJS_CURRENT_VERSION 0x1
-
-/*
-** symtab format:
-**
-** symtab         := prologue sym*
-** prologue       := 'l' 'j' 's' version reserved
-** version        := <BYTE>
-** reserved       := <BYTE> <BYTE> <BYTE>
-** sym            := sym-lua | sym-final
-** sym-lua        := sym-header sym-addr sym-chunk sym-line
-** sym-header     := <BYTE>
-** sym-addr       := <ULEB128>
-** sym-chunk      := string
-** sym-line       := <ULEB128>
-** sym-final      := sym-header
-** string         := string-len string-payload
-** string-len     := <ULEB128>
-** string-payload := <BYTE> {string-len}
-**
-** <BYTE>   :  A single byte (no surprises here)
-** <ULEB128>:  Unsigned integer represented in ULEB128 encoding
-**
-** (Order of bits below is hi -> lo)
-**
-** version: [VVVVVVVV]
-**  * VVVVVVVV: Byte interpreted as a plain numeric version number
-**
-** sym-header: [FUUUUUTT]
-**  * TT    : 2 bits for representing symbol type
-**  * UUUUU : 5 unused bits
-**  * F     : 1 bit marking the end of the symtab (final symbol)
-*/
-
-#define SYMTAB_LFUNC ((uint8_t)0)
-#define SYMTAB_FINAL ((uint8_t)0x80)
 
 #define LJM_CURRENT_FORMAT_VERSION 0x01
 
diff --git a/src/lj_symtab.c b/src/lj_symtab.c
new file mode 100644
index 00000000..93e19353
--- /dev/null
+++ b/src/lj_symtab.c
@@ -0,0 +1,36 @@
+#define lj_symtab_c
+#define LUA_CORE
+
+#include "lj_symtab.h"
+
+static const unsigned char ljs_header[] = {'l', 'j', 's', LJS_CURRENT_VERSION,
+                                           0x0, 0x0, 0x0};
+
+void lj_symtab_dump(struct lj_wbuf *out, const struct global_State *g)
+{
+  const GCRef *iter = &g->gc.root;
+  const GCobj *o;
+  const size_t ljs_header_len = sizeof(ljs_header) / sizeof(ljs_header[0]);
+
+  /* Write prologue. */
+  lj_wbuf_addn(out, ljs_header, ljs_header_len);
+
+  while ((o = gcref(*iter)) != NULL) {
+    switch (o->gch.gct) {
+    case (~LJ_TPROTO): {
+      const GCproto *pt = gco2pt(o);
+      lj_wbuf_addbyte(out, SYMTAB_LFUNC);
+      lj_wbuf_addu64(out, (uintptr_t)pt);
+      lj_wbuf_addstring(out, proto_chunknamestr(pt));
+      lj_wbuf_addu64(out, (uint64_t)pt->firstline);
+      break;
+    }
+    default:
+      break;
+    }
+    iter = &o->gch.nextgc;
+  }
+
+  lj_wbuf_addbyte(out, SYMTAB_FINAL);
+}
+
diff --git a/src/lj_symtab.h b/src/lj_symtab.h
new file mode 100644
index 00000000..9ea2c949
--- /dev/null
+++ b/src/lj_symtab.h
@@ -0,0 +1,46 @@
+#ifndef LJ_SYMTAB
+#define LJ_SYMTAB
+
+#include "lj_wbuf.h"
+#include "lj_obj.h"
+
+/*
+** symtab format:
+**
+** symtab         := prologue sym*
+** prologue       := 'l' 'j' 's' version reserved
+** version        := <BYTE>
+** reserved       := <BYTE> <BYTE> <BYTE>
+** sym            := sym-lua | sym-final
+** sym-lua        := sym-header sym-addr sym-chunk sym-line
+** sym-header     := <BYTE>
+** sym-addr       := <ULEB128>
+** sym-chunk      := string
+** sym-line       := <ULEB128>
+** sym-final      := sym-header
+** string         := string-len string-payload
+** string-len     := <ULEB128>
+** string-payload := <BYTE> {string-len}
+**
+** <BYTE>   :  A single byte (no surprises here)
+** <ULEB128>:  Unsigned integer represented in ULEB128 encoding
+**
+** (Order of bits below is hi -> lo)
+**
+** version: [VVVVVVVV]
+**  * VVVVVVVV: Byte interpreted as a plain numeric version number
+**
+** sym-header: [FUUUUUTT]
+**  * TT    : 2 bits for representing symbol type
+**  * UUUUU : 5 unused bits
+**  * F     : 1 bit marking the end of the symtab (final symbol)
+*/
+
+#define LJS_CURRENT_VERSION 0x1
+
+#define SYMTAB_LFUNC ((uint8_t)0)
+#define SYMTAB_FINAL ((uint8_t)0x80)
+
+void lj_symtab_dump(struct lj_wbuf *out, const struct global_State *g);
+
+#endif
-- 
2.33.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Tarantool-patches] [PATCH luajit 4/7] core: introduce lua and platform profiler
  2021-09-08 17:50 [Tarantool-patches] [PATCH luajit 0/7] misc: introduce sysprof Maxim Kokryashkin via Tarantool-patches
                   ` (2 preceding siblings ...)
  2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 3/7] memprof: move symtab to a separate module Maxim Kokryashkin via Tarantool-patches
@ 2021-09-08 17:50 ` Maxim Kokryashkin via Tarantool-patches
  2021-09-29  6:53   ` Sergey Kaplun via Tarantool-patches
  2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 5/7] memprof: add profile common section Maxim Kokryashkin via Tarantool-patches
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2021-09-08 17:50 UTC (permalink / raw)
  To: tarantool-patches, imun, skaplun

From: Mikhail Shishatskiy <m.shishatskiy@tarantool.org>

This patch introduces a sampling platform profiler for
the Lua machine.

The profiler uses the signal sampling backend from the
low-level profiler built in vanilla LuaJIT, which was put
into a separate module in one of the previous patches.
Thus, one cannot use both profilers at the same time.

First of all profiler dumps the definitions of all
loaded Lua functions and all loaded shared libraries
(symtab) via the write buffer introduced in one of
the previous patches.

As the profiling signal may occur at any time, we need
to provide some guarantees for the sampling profiler
to unwind consistent stack frames. So, the VM is adjusted
to save stack's current top frame into a dedicated
variable `guesttop` in the virtual machine global state.

When signal occurs, the profiler dumps current VM state
and additional info about about stacks that can be
considered consistent in a given state. Also, profiling
can be done without writing to file: vmstate counters
are accumulated in static memory and can be received via
the `luaM_sysprof_report` function. For more details see
the <lmisclib.h> header file.

When profiling is over, the old signal handler is restored,
and a special epilogue header is written.

Part of tarantool/tarantool#781
---
 CMakeLists.txt                                |   6 +
 src/CMakeLists.txt                            |   1 +
 src/Makefile.dep.original                     |   8 +-
 src/lj_arch.h                                 |  11 +
 src/lj_errmsg.h                               |   2 +-
 src/lj_mapi.c                                 |  26 +
 src/lj_state.c                                |   7 +
 src/lj_sysprof.c                              | 483 ++++++++++++++++++
 src/lj_sysprof.h                              |  94 ++++
 src/lmisclib.h                                |  93 ++++
 test/tarantool-tests/CMakeLists.txt           |   1 +
 .../misclib-sysprof-capi.test.lua             |  53 ++
 .../misclib-sysprof-capi/CMakeLists.txt       |   1 +
 .../misclib-sysprof-capi/testsysprof.c        | 269 ++++++++++
 14 files changed, 1052 insertions(+), 3 deletions(-)
 create mode 100644 src/lj_sysprof.c
 create mode 100644 src/lj_sysprof.h
 create mode 100644 test/tarantool-tests/misclib-sysprof-capi.test.lua
 create mode 100644 test/tarantool-tests/misclib-sysprof-capi/CMakeLists.txt
 create mode 100644 test/tarantool-tests/misclib-sysprof-capi/testsysprof.c

diff --git a/CMakeLists.txt b/CMakeLists.txt
index 5348e043..0c702093 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -184,6 +184,12 @@ if(LUAJIT_DISABLE_MEMPROF)
   AppendFlags(TARGET_C_FLAGS -DLUAJIT_DISABLE_MEMPROF)
 endif()
 
+# Disable platform and lua profiler.
+option(LUAJIT_DISABLE_SYSPROF "LuaJIT platform and lua profiler support" OFF)
+if(LUAJIT_DISABLE_SYSPROF)
+  AppendFlags(TARGET_C_FLAGS -DLUAJIT_DISABLE_SYSPROF)
+endif()
+
 # Switch to harder (and slower) hash function when a collision
 # chain in the string hash table exceeds a certain length.
 option(LUAJIT_SMART_STRINGS "Harder string hashing function" ON)
diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 0f3d888f..c19d8b54 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -69,6 +69,7 @@ make_source_list(SOURCES_UTILS
 make_source_list(SOURCES_PROFILER
   SOURCES
     lj_memprof.c
+    lj_sysprof.c
     lj_profile_timer.c
     lj_profile.c
 )
diff --git a/src/Makefile.dep.original b/src/Makefile.dep.original
index b7d82719..503817bb 100644
--- a/src/Makefile.dep.original
+++ b/src/Makefile.dep.original
@@ -140,8 +140,8 @@ lj_lib.o: lj_lib.c lauxlib.h lua.h luaconf.h lj_obj.h lj_def.h lj_arch.h \
 lj_load.o: lj_load.c lua.h luaconf.h lauxlib.h lj_obj.h lj_def.h \
  lj_arch.h lj_gc.h lj_err.h lj_errmsg.h lj_buf.h lj_str.h lj_func.h \
  lj_frame.h lj_bc.h lj_vm.h lj_lex.h lj_bcdump.h lj_parse.h
-lj_mapi.o: lj_mapi.c lua.h luaconf.h lmisclib.h lj_obj.h lj_def.h lj_arch.h \
- lj_dispatch.h lj_bc.h lj_jit.h lj_ir.h
+lj_mapi.o: lj_mapi.c lua.h luaconf.h lmisclib.h lj_obj.h lj_def.h \
+ lj_arch.h lj_dispatch.h lj_bc.h lj_jit.h lj_ir.h lj_sysprof.h
 lj_mcode.o: lj_mcode.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
  lj_gc.h lj_err.h lj_errmsg.h lj_jit.h lj_ir.h lj_mcode.h lj_trace.h \
  lj_dispatch.h lj_bc.h lj_traceerr.h lj_vm.h
@@ -204,6 +204,10 @@ lj_strscan.o: lj_strscan.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
  lj_char.h lj_strscan.h
 lj_symtab.o: lj_symtab.c lj_symtab.h lj_wbuf.h lj_def.h lua.h luaconf.h \
  lj_obj.h lj_arch.h
+lj_sysprof.o: lj_sysprof.c lj_arch.h lua.h luaconf.h lj_sysprof.h \
+ lj_obj.h lj_def.h lmisclib.h lj_debug.h lj_dispatch.h lj_bc.h lj_jit.h \
+ lj_ir.h lj_frame.h lj_trace.h lj_traceerr.h lj_wbuf.h lj_profile_timer.h \
+ lj_symtab.h
 lj_tab.o: lj_tab.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h lj_gc.h \
  lj_err.h lj_errmsg.h lj_tab.h
 lj_trace.o: lj_trace.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
diff --git a/src/lj_arch.h b/src/lj_arch.h
index 5bf0afb8..f0b60092 100644
--- a/src/lj_arch.h
+++ b/src/lj_arch.h
@@ -220,6 +220,7 @@
 #endif
 
 #define LJ_ARCH_NOMEMPROF	1
+#define LJ_ARCH_NOSYSPROF	1
 
 #elif LUAJIT_TARGET == LUAJIT_ARCH_ARM64
 
@@ -243,6 +244,7 @@
 #define LJ_ARCH_VERSION		80
 
 #define LJ_ARCH_NOMEMPROF	1
+#define LJ_ARCH_NOSYSPROF	1
 
 #elif LUAJIT_TARGET == LUAJIT_ARCH_PPC
 
@@ -310,6 +312,7 @@
 #endif
 
 #define LJ_ARCH_NOMEMPROF	1
+#define LJ_ARCH_NOSYSPROF	1
 
 #elif LUAJIT_TARGET == LUAJIT_ARCH_MIPS32 || LUAJIT_TARGET == LUAJIT_ARCH_MIPS64
 
@@ -371,6 +374,7 @@
 #endif
 
 #define LJ_ARCH_NOMEMPROF	1
+#define LJ_ARCH_NOSYSPROF	1
 
 #else
 #error "No target architecture defined"
@@ -585,4 +589,11 @@
 #define LJ_HASMEMPROF		1
 #endif
 
+/* Disable or enable the platform and lua profiler. */
+#if defined(LUAJIT_DISABLE_SYSPROF) || defined(LJ_ARCH_NOSYSPROF) || LJ_TARGET_WINDOWS || LJ_TARGET_CYGWIN || LJ_TARGET_PS3 || LJ_TARGET_PS4 || LJ_TARGET_XBOX360
+#define LJ_HASSYSPROF		0
+#else
+#define LJ_HASSYSPROF		1
+#endif
+
 #endif
diff --git a/src/lj_errmsg.h b/src/lj_errmsg.h
index ae0a18c0..77a08cb0 100644
--- a/src/lj_errmsg.h
+++ b/src/lj_errmsg.h
@@ -187,7 +187,7 @@ ERRDEF(FFI_NYICALL,	"NYI: cannot call this C function (yet)")
 
 /* Profiler errors. */
 ERRDEF(PROF_MISUSE,	"profiler misuse")
-#if LJ_HASMEMPROF
+#if LJ_HASMEMPROF || LJ_HASSYSPROF
 ERRDEF(PROF_ISRUNNING,	"profiler is running already")
 ERRDEF(PROF_NOTRUNNING,	"profiler is not running")
 #endif
diff --git a/src/lj_mapi.c b/src/lj_mapi.c
index b2b35a17..0b2284f0 100644
--- a/src/lj_mapi.c
+++ b/src/lj_mapi.c
@@ -18,6 +18,8 @@
 #include "lj_jit.h"
 #endif
 
+#include "lj_sysprof.h"
+
 LUAMISC_API void luaM_metrics(lua_State *L, struct luam_Metrics *metrics)
 {
   global_State *g = G(L);
@@ -63,3 +65,27 @@ LUAMISC_API void luaM_metrics(lua_State *L, struct luam_Metrics *metrics)
   metrics->jit_trace_num = 0;
 #endif
 }
+
+/* --- Platform and Lua profiler ------------------------------------------ */
+
+LUAMISC_API int luaM_sysprof_configure(const struct luam_sysprof_config *config)
+{
+  return lj_sysprof_configure(config);
+}
+
+LUAMISC_API int luaM_sysprof_start(lua_State *L,
+                                   const struct luam_sysprof_options *opt)
+{
+  return lj_sysprof_start(L, opt);
+}
+
+/* Stop profiling. */
+LUAMISC_API int luaM_sysprof_stop(lua_State *L)
+{
+  return lj_sysprof_stop(L);
+}
+
+LUAMISC_API int luaM_sysprof_report(struct luam_sysprof_counters *counters)
+{
+  return lj_sysprof_report(counters);
+}
diff --git a/src/lj_state.c b/src/lj_state.c
index f82b1b5b..cc6f92f1 100644
--- a/src/lj_state.c
+++ b/src/lj_state.c
@@ -33,6 +33,10 @@
 #include "lj_memprof.h"
 #endif
 
+#if LJ_HASSYSPROF
+#include "lj_sysprof.h"
+#endif
+
 /* -- Stack handling ------------------------------------------------------ */
 
 /* Stack sizes. */
@@ -267,6 +271,9 @@ LUA_API void lua_close(lua_State *L)
 #if LJ_HASMEMPROF
   lj_memprof_stop(L);
 #endif
+#if LJ_HASSYSPROF
+  lj_sysprof_stop(L);
+#endif
 #if LJ_HASPROFILE
   luaJIT_profile_stop(L);
 #endif
diff --git a/src/lj_sysprof.c b/src/lj_sysprof.c
new file mode 100644
index 00000000..c0c83fa9
--- /dev/null
+++ b/src/lj_sysprof.c
@@ -0,0 +1,483 @@
+#define lj_sysprof_c
+#define LUA_CORE
+
+#include "lj_arch.h"
+#include "lj_sysprof.h"
+
+#if LJ_HASSYSPROF
+
+#include "lj_obj.h"
+#include "lj_debug.h"
+#include "lj_dispatch.h"
+#include "lj_frame.h"
+
+#if LJ_HASJIT
+#include "lj_jit.h"
+#include "lj_trace.h"
+#endif
+
+#include "lj_wbuf.h"
+#include "lj_profile_timer.h"
+#include "lj_symtab.h"
+
+#include <pthread.h>
+#include <errno.h>
+#include <execinfo.h>
+
+#define SYSPROF_HANDLER_STACK_DEPTH 4
+#define SYSPROF_BACKTRACE_BUF_SIZE 4096
+
+enum sysprof_state {
+  /* Profiler needs to be configured. */
+  SPS_UNCONFIGURED,
+  /* Profiler is not running. */
+  SPS_IDLE,
+  /* Profiler is running. */
+  SPS_PROFILE,
+  /*
+  ** Stopped in case of stopped or failed stream.
+  ** Saved errno is set at luaM_sysprof_stop.
+  */
+  SPS_HALT
+};
+
+struct sysprof {
+  global_State *g; /* Profiled VM. */
+  pthread_t thread; /* Profiled thread. */
+  enum sysprof_state state; /* Internal state. */
+  struct lj_wbuf out; /* Output accumulator. */
+  struct luam_sysprof_counters counters; /* Profiling counters. */
+  struct luam_sysprof_options opt; /* Profiling options. */
+  struct luam_sysprof_config config; /* Profiler configurations. */
+  lj_profile_timer timer; /* Profiling timer. */
+  int saved_errno; /* Saved errno when profiler failed. */
+};
+
+static struct sysprof sysprof = {0};
+
+/* --- Stream ------------------------------------------------------------- */
+
+static const uint8_t ljp_header[] = {'l', 'j', 'p', LJP_FORMAT_VERSION,
+                                     0x0, 0x0, 0x0};
+
+static int stream_is_needed(struct sysprof *sp)
+{
+  return LUAM_SYSPROF_DEFAULT != sp->opt.mode;
+}
+
+static void stream_prologue(struct sysprof *sp)
+{
+  lj_symtab_dump(&sp->out, sp->g);
+  lj_wbuf_addn(&sp->out, ljp_header, sizeof(ljp_header));
+}
+
+static void stream_epilogue(struct sysprof *sp)
+{
+  lj_wbuf_addbyte(&sp->out, LJP_EPILOGUE_BYTE);
+}
+
+static void stream_lfunc(struct lj_wbuf *buf, GCfunc *func)
+{
+  GCproto *pt = funcproto(func);
+  lua_assert(pt != NULL);
+  lj_wbuf_addbyte(buf, LJP_FRAME_LFUNC);
+  lj_wbuf_addu64(buf, (uintptr_t)pt);
+  lj_wbuf_addu64(buf, (uint64_t)pt->firstline);
+}
+
+static void stream_cfunc(struct lj_wbuf *buf, GCfunc *func)
+{
+  lj_wbuf_addbyte(buf, LJP_FRAME_CFUNC);
+  lj_wbuf_addu64(buf, (uintptr_t)func->c.f);
+}
+
+static void stream_ffunc(struct lj_wbuf *buf, GCfunc *func)
+{
+  lj_wbuf_addbyte(buf, LJP_FRAME_FFUNC);
+  lj_wbuf_addu64(buf, func->c.ffid);
+}
+
+static void stream_frame_lua(struct lj_wbuf *buf, cTValue *frame)
+{
+  GCfunc *func = frame_func(frame);
+  lua_assert(NULL != func);
+  if (isluafunc(func)) {
+    stream_lfunc(buf, func);
+  } else if (isffunc(func)) {
+    stream_ffunc(buf, func);
+  } else {
+    stream_cfunc(buf, func);
+  }
+}
+
+static void stream_backtrace_lua(struct sysprof *sp)
+{
+  global_State *g = sp->g;
+  struct lj_wbuf *buf = &sp->out;
+  cTValue *top_frame = NULL, *frame = NULL, *bot = NULL;
+  lua_State *L = NULL;
+
+  lua_assert(g != NULL);
+  L = gco2th(gcref(g->cur_L));
+  lua_assert(L != NULL);
+
+  top_frame = sp->g->top_frame.guesttop.interp_base - 1;
+  bot = tvref(L->stack) + LJ_FR2;
+  /* Traverse frames backwards */
+  for (frame = top_frame; frame > bot; frame = frame_prev(frame)) {
+    if (frame_gc(frame) == obj2gco(L)) {
+      continue;  /* Skip dummy frames. See lj_err_optype_call(). */
+    }
+    stream_frame_lua(buf, frame);
+  }
+
+  lj_wbuf_addbyte(buf, LJP_FRAME_LUA_LAST);
+}
+
+static void *stream_frame_host(int frame_no, void *addr)
+{
+  struct sysprof *sp = &sysprof;
+
+  if (LJ_UNLIKELY(frame_no <= SYSPROF_HANDLER_STACK_DEPTH)) {
+    /*
+    ** We don't want the profiler stack to be streamed, as it will
+    ** burden the profile with unnecessary information.
+    */
+    return addr;
+  } else if (LJ_UNLIKELY(sp->opt.mode == LUAM_SYSPROF_LEAF &&
+                         frame_no > SYSPROF_HANDLER_STACK_DEPTH + 1)) {
+    return NULL;
+  }
+
+  lj_wbuf_addu64(&sp->out, (uintptr_t)addr);
+  return addr;
+}
+
+static void default_backtrace_host(void *(writer)(int frame_no, void *addr))
+{
+  static void* backtrace_buf[SYSPROF_BACKTRACE_BUF_SIZE] = {};
+
+  struct sysprof *sp = &sysprof;
+  const int depth = backtrace(backtrace_buf,
+                              sp->opt.mode == LUAM_SYSPROF_LEAF
+                              ? SYSPROF_HANDLER_STACK_DEPTH + 1
+                              : SYSPROF_BACKTRACE_BUF_SIZE);
+  lua_assert(depth >= SYSPROF_HANDLER_STACK_DEPTH);
+
+  for (int i = SYSPROF_HANDLER_STACK_DEPTH; i < depth; ++i) {
+    if (!writer(i - SYSPROF_HANDLER_STACK_DEPTH + 1, backtrace_buf[i])) {
+      return;
+    }
+  }
+}
+
+static void stream_backtrace_host(struct sysprof *sp)
+{
+  lua_assert(sp->config.backtracer != NULL);
+  sp->config.backtracer(stream_frame_host);
+  lj_wbuf_addu64(&sp->out, (uintptr_t)LJP_FRAME_HOST_LAST);
+}
+
+static void stream_trace(struct sysprof *sp)
+{
+  struct lj_wbuf *out = &sp->out;
+
+  uint32_t traceno = sp->g->vmstate;
+  jit_State *J = G2J(sp->g);
+
+  lj_wbuf_addu64(out, traceno);
+  lj_wbuf_addu64(out, (uintptr_t)J->prev_pt);
+  lj_wbuf_addu64(out, J->prev_line);
+}
+
+static void stream_guest(struct sysprof *sp)
+{
+  stream_backtrace_lua(sp);
+  stream_backtrace_host(sp);
+}
+
+static void stream_host(struct sysprof *sp)
+{
+  stream_backtrace_host(sp);
+}
+
+typedef void (*event_streamer)(struct sysprof *sp);
+
+static event_streamer event_streamers[] = {
+  /* XXX: order is important */
+  stream_host,  /* LJ_VMST_INTERP */
+  stream_guest, /* LJ_VMST_LFUNC */
+  stream_guest, /* LJ_VMST_FFUNC */
+  stream_guest, /* LJ_VMST_CFUNC */
+  stream_host,  /* LJ_VMST_GC */
+  stream_host,  /* LJ_VMST_EXIT */
+  stream_host,  /* LJ_VMST_RECORD */
+  stream_host,  /* LJ_VMST_OPT */
+  stream_host,  /* LJ_VMST_ASM */
+  stream_trace  /* LJ_VMST_TRACE */
+};
+
+static void stream_event(struct sysprof *sp, uint32_t vmstate)
+{
+  event_streamer stream = NULL;
+
+  /* Check that vmstate fits in 4 bits (see streaming format) */
+  lua_assert((vmstate & ~(uint32_t)((1 << 4) - 1)) == 0);
+  lj_wbuf_addbyte(&sp->out, (uint8_t)vmstate);
+
+  stream = event_streamers[vmstate];
+  lua_assert(NULL != stream);
+  stream(sp);
+}
+
+/* -- Signal handler ------------------------------------------------------ */
+
+static void sysprof_record_sample(struct sysprof *sp, siginfo_t *info)
+{
+  global_State *g = sp->g;
+  uint32_t _vmstate = ~(uint32_t)(g->vmstate);
+  uint32_t vmstate = _vmstate < LJ_VMST_TRACE ? _vmstate : LJ_VMST_TRACE;
+
+  lua_assert(pthread_self() == sp->thread);
+
+  /* Caveat: order of counters must match vmstate order in <lj_obj.h>. */
+  ((uint64_t *)&sp->counters)[vmstate]++;
+
+  sp->counters.samples++;
+  sp->counters.overruns += info->si_overrun;
+
+  if (stream_is_needed(sp)) {
+    stream_event(sp, vmstate);
+    if (LJ_UNLIKELY(lj_wbuf_test_flag(&sp->out, STREAM_ERRIO|STREAM_STOP))) {
+      sp->saved_errno = lj_wbuf_errno(&sp->out);
+      lj_wbuf_terminate(&sp->out);
+      sp->state = SPS_HALT;
+    }
+  }
+}
+
+static void sysprof_signal_handler(int sig, siginfo_t *info, void *ctx)
+{
+  struct sysprof *sp = &sysprof;
+  UNUSED(sig);
+  UNUSED(ctx);
+
+  switch (sp->state) {
+    case SPS_PROFILE:
+      sysprof_record_sample(sp, info);
+      break;
+
+    case SPS_IDLE:
+    case SPS_HALT:
+      /* noop */
+      break;
+
+    default:
+      lua_assert(0);
+      break;
+  }
+}
+
+/* -- Internal ------------------------------------------------------------ */
+
+static int sysprof_validate(struct sysprof *sp,
+                            const struct luam_sysprof_options *opt)
+{
+  switch (sp->state) {
+    case SPS_UNCONFIGURED:
+      return SYSPROF_ERRUSE;
+
+    case SPS_IDLE: {
+      if (opt->mode > LUAM_SYSPROF_CALLGRAPH) {
+        return SYSPROF_ERRUSE;
+      } else if (opt->mode != LUAM_SYSPROF_DEFAULT &&
+                 (opt->buf == NULL || opt->len == 0 ||
+                  sp->config.writer == NULL || sp->config.on_stop == NULL)) {
+        return SYSPROF_ERRUSE;
+      } else if (opt->interval == 0) {
+        return SYSPROF_ERRUSE;
+      }
+      break;
+    }
+
+    case SPS_PROFILE:
+    case SPS_HALT:
+      return SYSPROF_ERRRUN;
+
+    default:
+      lua_assert(0);
+  }
+
+  return SYSPROF_SUCCESS;
+}
+
+static int sysprof_init(struct sysprof *sp, lua_State *L,
+                        const struct luam_sysprof_options *opt)
+{
+  int status = sysprof_validate(sp, opt);
+  if (SYSPROF_SUCCESS != status) {
+    return status;
+  }
+
+  /* Copy validated options to sysprof state. */
+  memcpy(&sp->opt, opt, sizeof(sp->opt));
+
+  /* Init general fields. */
+  sp->g = G(L);
+  sp->thread = pthread_self();
+
+  /* Reset counters. */
+  memset(&sp->counters, 0, sizeof(sp->counters));
+
+  /* Reset saved errno. */
+  sp->saved_errno = 0;
+
+  if (stream_is_needed(sp)) {
+    lj_wbuf_init(&sp->out, sp->config.writer, opt->ctx, opt->buf, opt->len);
+  }
+
+  return SYSPROF_SUCCESS;
+}
+
+/* -- Public profiling API ------------------------------------------------ */
+
+int lj_sysprof_configure(const struct luam_sysprof_config *config)
+{
+  struct sysprof *sp = &sysprof;
+  lua_assert(config != NULL);
+  if (sp->state != SPS_UNCONFIGURED && sp->state != SPS_IDLE) {
+    return SYSPROF_ERRUSE;
+  }
+
+  memcpy(&sp->config, config, sizeof(*config));
+
+  if (sp->config.backtracer == NULL) {
+    sp->config.backtracer = default_backtrace_host;
+  }
+
+  sp->state = SPS_IDLE;
+
+  return SYSPROF_SUCCESS;
+}
+
+int lj_sysprof_start(lua_State *L, const struct luam_sysprof_options *opt)
+{
+  struct sysprof *sp = &sysprof;
+
+  int status = sysprof_init(sp, L, opt);
+  if (SYSPROF_SUCCESS != status) {
+    if (NULL != sp->config.on_stop) {
+      /*
+      ** Initialization may fail in case of unconfigured sysprof,
+      ** so we cannot guarantee cleaning up resources in this case.
+      */
+      sp->config.on_stop(opt->ctx, opt->buf);
+    }
+    return status;
+  }
+
+  sp->state = SPS_PROFILE;
+
+  if (stream_is_needed(sp)) {
+    stream_prologue(sp);
+    if (LJ_UNLIKELY(lj_wbuf_test_flag(&sp->out, STREAM_ERRIO|STREAM_STOP))) {
+      /* on_stop call may change errno value. */
+      int saved_errno = lj_wbuf_errno(&sp->out);
+      /* Ignore possible errors. mp->out.buf may be NULL here. */
+      sp->config.on_stop(opt->ctx, sp->out.buf);
+      lj_wbuf_terminate(&sp->out);
+      sp->state = SPS_IDLE;
+      errno = saved_errno;
+      return SYSPROF_ERRIO;
+    }
+  }
+
+  sp->timer.opt.interval_msec = opt->interval;
+  sp->timer.opt.handler = sysprof_signal_handler;
+  lj_profile_timer_start(&sp->timer);
+
+  return SYSPROF_SUCCESS;
+}
+
+int lj_sysprof_stop(lua_State *L)
+{
+  struct sysprof *sp = &sysprof;
+  global_State *g = sp->g;
+  struct lj_wbuf *out = &sp->out;
+
+  if (SPS_IDLE == sp->state) {
+    return SYSPROF_ERRSTOP;
+  } else if (G(L) != g) {
+    return SYSPROF_ERRUSE;
+  }
+
+  lj_profile_timer_stop(&sp->timer);
+
+  if (SPS_HALT == sp->state) {
+    errno = sp->saved_errno;
+    sp->state = SPS_IDLE;
+    /* wbuf was terminated when error occured. */
+    return SYSPROF_ERRIO;
+  }
+
+  sp->state = SPS_IDLE;
+
+  if (stream_is_needed(sp)) {
+    int cb_status = 0;
+
+    stream_epilogue(sp);
+    lj_wbuf_flush(out);
+
+    cb_status = sp->config.on_stop(sp->opt.ctx, out->buf);
+    if (LJ_UNLIKELY(lj_wbuf_test_flag(out, STREAM_ERRIO|STREAM_STOP)) ||
+        0 != cb_status) {
+      errno = lj_wbuf_errno(out);
+      lj_wbuf_terminate(out);
+      return SYSPROF_ERRIO;
+    }
+
+    lj_wbuf_terminate(out);
+  }
+
+  return SYSPROF_SUCCESS;
+}
+
+int lj_sysprof_report(struct luam_sysprof_counters *counters)
+{
+  const struct sysprof *sp = &sysprof;
+  if (sp->state != SPS_IDLE) {
+    return SYSPROF_ERRUSE;
+  }
+  memcpy(counters, &sp->counters, sizeof(sp->counters));
+  return SYSPROF_SUCCESS;
+}
+
+#else /* LJ_HASSYSPROF */
+
+int lj_sysprof_configure(const struct luam_sysprof_config *config)
+{
+  UNUSED(config);
+  return SYSPROF_ERRUSE;
+}
+
+int lj_sysprof_start(lua_State *L, const struct luam_sysprof_options *opt)
+{
+  UNUSED(L);
+  opt->on_stop(opt->ctx, opt->buf);
+  return SYSPROF_ERRUSE;
+}
+
+int lj_sysprof_stop(lua_State *L)
+{
+  UNUSED(L);
+  return SYSPROF_ERRUSE;
+}
+
+int lj_sysprof_report(struct luam_sysprof_counters *counters)
+{
+  UNUSED(counters);
+  return SYSPROF_ERRUSE;
+}
+
+#endif /* LJ_HASSYSPROF */
+
diff --git a/src/lj_sysprof.h b/src/lj_sysprof.h
new file mode 100644
index 00000000..515bc08c
--- /dev/null
+++ b/src/lj_sysprof.h
@@ -0,0 +1,94 @@
+/*
+** Sysprof - platform and Lua profiler
+*/
+
+/*
+** XXX: Platform profiler is not thread safe. Please, don't try to
+** use it inside several VM, you can profile only one at a time.
+*/
+
+/*
+** XXX: Platform profiler uses the same signal backend as lj_profile. Please,
+** don't use both at the same time.
+*/
+
+#ifndef _LJ_SYSPROF_H
+#define _LJ_SYSPROF_H
+
+#include "lj_obj.h"
+#include "lmisclib.h"
+
+#define LJP_FORMAT_VERSION 0x1
+
+/*
+** Event stream format:
+**
+** stream          := symtab sysprof
+** symtab          := see symtab description
+** sysprof         := prologue sample* epilogue
+** prologue        := 'l' 'j' 'p' version reserved
+** version         := <BYTE>
+** reserved        := <BYTE> <BYTE> <BYTE>
+** sample          := sample-guest | sample-host | sample-trace
+** sample-guest    := sample-header stack-lua stack-host
+** sample-host     := sample-header stack-host
+** sample-trace    := sample-header traceno sym-addr line-no
+** sample-header   := <BYTE>
+** stack-lua       := frame-lua* frame-lua-last
+** stack-host      := frame-host* frame-host-last
+** frame-lua       := frame-lfunc | frame-cfunc | frame-ffunc
+** frame-lfunc     := frame-header sym-addr line-no
+** frame-cfunc     := frame-header exec-addr
+** frame-ffunc     := frame-header ffid
+** frame-lua-last  := frame-header
+** frame-header    := <BYTE>
+** frame-host      := exec-addr
+** frame-host-last := <ULEB128>
+** line-no         := <ULEB128>
+** traceno         := <ULEB128>
+** ffid            := <ULEB128>
+** sym-addr        := <ULEB128>
+** exec-addr       := <ULEB128>
+** epilogue        := sample-header
+**
+** <BYTE>   :  A single byte (no surprises here)
+** <ULEB128>:  Unsigned integer represented in ULEB128 encoding
+**
+** (Order of bits below is hi -> lo)
+**
+** version: [VVVVVVVV]
+**  * VVVVVVVV: Byte interpreted as a plain integer version number
+**
+** sample-header: [FUUUUEEE]
+**  * EEE  : 3 bits for representing vmstate (LJ_VMST_*)
+**  * UUUU : 4 unused bits
+**  * F    : 0 for regular samples, 1 for epilogue's Final header
+**           (if F is set to 1, all other bits are currently ignored)
+**
+** frame-header: [FUUUUUEE]
+**  * EE    : 2 bits for representing frame type (FRAME_*)
+**  * UUUUU : 5 unused bits
+**  * F     : 0 for regular frames, 1 for final frame
+**            (if F is set to 1, all other bits are currently ignored)
+**
+** frame-host-last = NULL
+*/
+
+#define LJP_FRAME_LFUNC ((uint8_t)1)
+#define LJP_FRAME_CFUNC ((uint8_t)2)
+#define LJP_FRAME_FFUNC ((uint8_t)3)
+#define LJP_FRAME_LUA_LAST  0x80
+#define LJP_FRAME_HOST_LAST NULL
+
+#define LJP_EPILOGUE_BYTE 0x80
+
+int lj_sysprof_configure(const struct luam_sysprof_config *config);
+
+int lj_sysprof_start(lua_State *L, const struct luam_sysprof_options *opt);
+
+int lj_sysprof_stop(lua_State *L);
+
+int lj_sysprof_report(struct luam_sysprof_counters *counters);
+
+#endif
+
diff --git a/src/lmisclib.h b/src/lmisclib.h
index 0c07707e..3545ff47 100644
--- a/src/lmisclib.h
+++ b/src/lmisclib.h
@@ -60,6 +60,99 @@ struct luam_Metrics {
 
 LUAMISC_API void luaM_metrics(lua_State *L, struct luam_Metrics *metrics);
 
+/* --- Sysprof - platform and lua profiler -------------------------------- */
+
+/* Profiler configurations */
+struct luam_sysprof_config {
+  /*
+  ** Writer function for profile events.
+  ** Should return amount of written bytes on success or zero in case of error.
+  ** Setting *data to NULL means end of profiling.
+  ** For details see <lj_wbuf.h>.
+  */
+  size_t (*writer)(const void **data, size_t len, void *ctx);
+  /*
+  ** Callback on profiler stopping. Required for correctly cleaning
+  ** at VM finalization when profiler is still running.
+  ** Returns zero on success.
+  */
+  int (*on_stop)(void *ctx, uint8_t *buf);
+  /*
+  ** Backtracing function for the host stack. Should call `frame_writer` on
+  ** each frame in the stack in the order from the stack top to the stack
+  ** bottom. The `frame_writer` function is implemented inside the sysprof
+  ** and will be passed to the `backtracer` function. If `frame_writer` returns
+  ** NULL, backtracing should be stopped. If `frame_writer` returns not NULL,
+  ** the backtracing should be continued if there are frames left.
+  */
+  void (*backtracer)(void *(*frame_writer)(int frame_no, void *addr));
+};
+
+enum luam_sysprof_mode {
+  /*
+  ** DEFAULT mode collects only data for luam_sysprof_counters, which is stored
+  ** in memory and can be collected with luaM_sysprof_report after profiler
+  ** stops.
+  */
+  LUAM_SYSPROF_DEFAULT,
+  /*
+  ** LEAF mode = DEFAULT + streams samples with only top frames of host and
+  ** guests stacks in format described in <lj_sysprof.h>
+  */
+  LUAM_SYSPROF_LEAF,
+  /*
+  ** CALLGRAPH mode = DEFAULT + streams samples with full callchains of host
+  ** and guest stacks in format described in <lj_sysprof.h>
+  */
+  LUAM_SYSPROF_CALLGRAPH
+};
+
+struct luam_sysprof_counters {
+  uint64_t vmst_interp;
+  uint64_t vmst_lfunc;
+  uint64_t vmst_ffunc;
+  uint64_t vmst_cfunc;
+  uint64_t vmst_gc;
+  uint64_t vmst_exit;
+  uint64_t vmst_record;
+  uint64_t vmst_opt;
+  uint64_t vmst_asm;
+  uint64_t vmst_trace;
+  /* XXX: order of vmst counters is important */
+  uint64_t samples;
+  uint64_t overruns;
+};
+
+/* Profiler options */
+struct luam_sysprof_options {
+  /* Profiling mode */
+  enum luam_sysprof_mode mode;
+  /* Sampling interval in msec */
+  uint64_t interval;
+  /* Custom buffer to write data. */
+  uint8_t *buf;
+  /* The buffer's size. */
+  size_t len;
+  /* Context for the profile writer and final callback. */
+  void *ctx;
+};
+
+#define SYSPROF_SUCCESS (0)
+#define SYSPROF_ERRUSE  (1)
+#define SYSPROF_ERRRUN  (2)
+#define SYSPROF_ERRSTOP (3)
+#define SYSPROF_ERRIO   (4)
+
+LUAMISC_API int luaM_sysprof_configure(const struct luam_sysprof_config *config);
+
+LUAMISC_API int luaM_sysprof_start(lua_State *L,
+                                   const struct luam_sysprof_options *opt);
+
+LUAMISC_API int luaM_sysprof_stop(lua_State *L);
+
+LUAMISC_API int luaM_sysprof_report(struct luam_sysprof_counters *counters);
+
+
 #define LUAM_MISCLIBNAME "misc"
 LUALIB_API int luaopen_misc(lua_State *L);
 
diff --git a/test/tarantool-tests/CMakeLists.txt b/test/tarantool-tests/CMakeLists.txt
index a872fa5e..cf47cc67 100644
--- a/test/tarantool-tests/CMakeLists.txt
+++ b/test/tarantool-tests/CMakeLists.txt
@@ -62,6 +62,7 @@ add_subdirectory(gh-6189-cur_L)
 add_subdirectory(lj-49-bad-lightuserdata)
 add_subdirectory(lj-flush-on-trace)
 add_subdirectory(misclib-getmetrics-capi)
+add_subdirectory(misclib-sysprof-capi)
 
 # The part of the memory profiler toolchain is located in tools
 # directory, jit, profiler, and bytecode toolchains are located
diff --git a/test/tarantool-tests/misclib-sysprof-capi.test.lua b/test/tarantool-tests/misclib-sysprof-capi.test.lua
new file mode 100644
index 00000000..d468572d
--- /dev/null
+++ b/test/tarantool-tests/misclib-sysprof-capi.test.lua
@@ -0,0 +1,53 @@
+-- Sysprof is implemented for x86 and x64 architectures only.
+require("utils").skipcond(
+  jit.arch ~= "x86" and jit.arch ~= "x64",
+  jit.arch.." architecture is NIY for memprof"
+)
+
+local testsysprof = require("testsysprof")
+
+local tap = require("tap")
+
+local test = tap.test("clib-misc-sysprof")
+test:plan(4)
+
+test:ok(testsysprof.base())
+test:ok(testsysprof.validation())
+
+local function lua_payload(n)
+  if n <= 1 then
+    return n
+  end
+  return lua_payload(n - 1) + lua_payload(n - 2)
+end
+
+local function payload()
+  local n_iterations = 500000
+
+  local co = coroutine.create(function ()
+    for i = 1, n_iterations do
+      if i % 2 == 0 then
+        testsysprof.c_payload(10)
+      else
+        lua_payload(10)
+      end
+      coroutine.yield()
+    end
+  end)
+
+  for _ = 1, n_iterations do
+    coroutine.resume(co)
+  end
+end
+
+local jit = require('jit')
+
+jit.off()
+
+test:ok(testsysprof.profile_func(payload))
+
+jit.on()
+jit.flush()
+
+test:ok(testsysprof.profile_func(payload))
+
diff --git a/test/tarantool-tests/misclib-sysprof-capi/CMakeLists.txt b/test/tarantool-tests/misclib-sysprof-capi/CMakeLists.txt
new file mode 100644
index 00000000..d9fb1a1a
--- /dev/null
+++ b/test/tarantool-tests/misclib-sysprof-capi/CMakeLists.txt
@@ -0,0 +1 @@
+BuildTestCLib(testsysprof testsysprof.c)
diff --git a/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c b/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c
new file mode 100644
index 00000000..46970a72
--- /dev/null
+++ b/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c
@@ -0,0 +1,269 @@
+#include <lua.h>
+#include <luajit.h>
+#include <lauxlib.h>
+
+#include <lmisclib.h>
+
+#include <stdlib.h>
+#include <errno.h>
+
+#undef NDEBUG
+#include <assert.h>
+
+
+/* --- utils -------------------------------------------------------------- */
+
+#define SYSPROF_INTERVAL_DEFAULT 11
+
+/*
+** Yep, 8Mb. Tuned in order not to bother the platform with too often flushes.
+*/
+#define STREAM_BUFFER_SIZE (8 * 1024 * 1024)
+
+/* Structure given as ctx to memprof writer and on_stop callback. */
+struct sysprof_ctx {
+  /* Output file stream for data. */
+  FILE *stream;
+  /* Buffer for data. */
+  uint8_t buf[STREAM_BUFFER_SIZE];
+};
+
+/*
+** Default buffer writer function.
+** Just call fwrite to the corresponding FILE.
+*/
+static size_t buffer_writer_default(const void **buf_addr, size_t len,
+                                    void *opt)
+{
+  struct sysprof_ctx *ctx = opt;
+  FILE *stream = ctx->stream;
+  const void * const buf_start = *buf_addr;
+  const void *data = *buf_addr;
+  size_t write_total = 0;
+
+  assert(len <= STREAM_BUFFER_SIZE);
+
+  for (;;) {
+    const size_t written = fwrite(data, 1, len - write_total, stream);
+
+    if (written == 0) {
+      /* Re-tries write in case of EINTR. */
+      if (errno != EINTR) {
+        /* Will be freed as whole chunk later. */
+        *buf_addr = NULL;
+        return write_total;
+      }
+      errno = 0;
+      continue;
+    }
+
+    write_total += written;
+    assert(write_total <= len);
+
+    if (write_total == len)
+      break;
+
+    data = (uint8_t *)data + (ptrdiff_t)written;
+  }
+
+  *buf_addr = buf_start;
+  return write_total;
+}
+
+/* Default on stop callback. Just close the corresponding stream. */
+static int on_stop_cb_default(void *opt, uint8_t *buf)
+{
+  struct sysprof_ctx *ctx = opt;
+  FILE *stream = ctx->stream;
+  free(ctx);
+  return fclose(stream);
+}
+
+static int stream_init(struct luam_sysprof_options *opt)
+{
+  struct sysprof_ctx *ctx = calloc(1, sizeof(struct sysprof_ctx));
+  if (NULL == ctx) {
+    return SYSPROF_ERRIO;
+  }
+
+  ctx->stream = fopen("/dev/null", "wb");
+  if (NULL == ctx->stream) {
+    free(ctx);
+    return SYSPROF_ERRIO;
+  }
+
+  opt->ctx = ctx;
+  opt->buf = ctx->buf;
+  opt->len = STREAM_BUFFER_SIZE;
+
+  return SYSPROF_SUCCESS;
+}
+
+/* --- Payload ------------------------------------------------------------ */
+
+static double fib(double n)
+{
+  if (n <= 1) {
+    return n;
+  }
+  return fib(n - 1) + fib(n - 2);
+}
+
+static int c_payload(lua_State *L)
+{
+  fib(luaL_checknumber(L, 1));
+}
+
+/* --- sysprof C API tests ------------------------------------------------ */
+
+static int base(lua_State *L)
+{
+  struct luam_sysprof_config config = {};
+  (void)config.writer;
+  (void)config.on_stop;
+  (void)config.backtracer;
+
+  struct luam_sysprof_options opt = {};
+  (void)opt.interval;
+  (void)opt.mode;
+  (void)opt.ctx;
+  (void)opt.buf;
+  (void)opt.len;
+
+  struct luam_sysprof_counters cnt = {};
+  luaM_sysprof_report(&cnt);
+
+  (void)cnt.samples;
+  (void)cnt.overruns;
+  (void)cnt.vmst_interp;
+  (void)cnt.vmst_lfunc;
+  (void)cnt.vmst_ffunc;
+  (void)cnt.vmst_cfunc;
+  (void)cnt.vmst_gc;
+  (void)cnt.vmst_exit;
+  (void)cnt.vmst_record;
+  (void)cnt.vmst_opt;
+  (void)cnt.vmst_asm;
+  (void)cnt.vmst_trace;
+
+  lua_pushboolean(L, 1);
+  return 1;
+}
+
+static int validation(lua_State *L)
+{
+  struct luam_sysprof_config config = {};
+  struct luam_sysprof_options opt = {};
+  int status = SYSPROF_SUCCESS;
+
+  status = luaM_sysprof_configure(&config);
+  assert(SYSPROF_SUCCESS == status);
+
+  /* Unknown mode */
+  opt.mode = 0x40;
+  status = luaM_sysprof_start(L, &opt);
+  assert(SYSPROF_ERRUSE == status);
+
+  /* Buffer not configured */
+  opt.mode = LUAM_SYSPROF_CALLGRAPH;
+  opt.buf = NULL;
+  status = luaM_sysprof_start(L, &opt);
+  assert(SYSPROF_ERRUSE == status);
+
+  /* Bad interval */
+  opt.mode = LUAM_SYSPROF_DEFAULT;
+  opt.interval = 0;
+  status = luaM_sysprof_start(L, &opt);
+  assert(SYSPROF_ERRUSE == status);
+
+  /* Check if profiling started */
+  opt.mode = LUAM_SYSPROF_DEFAULT;
+  opt.interval = SYSPROF_INTERVAL_DEFAULT;
+  status = luaM_sysprof_start(L, &opt);
+  assert(SYSPROF_SUCCESS == status);
+
+  /* Already running */
+  status = luaM_sysprof_start(L, &opt);
+  assert(SYSPROF_ERRRUN == status);
+
+  /* Profiler stopping */
+  status = luaM_sysprof_stop(L);
+  assert(SYSPROF_SUCCESS == status);
+
+  /* Stopping profiler which is not running */
+  status = luaM_sysprof_stop(L);
+  assert(SYSPROF_ERRSTOP == status);
+
+  lua_pushboolean(L, 1);
+  return 1;
+}
+
+static int profile_func(lua_State *L)
+{
+  struct luam_sysprof_config config = {};
+  struct luam_sysprof_options opt = {};
+  struct luam_sysprof_counters cnt = {};
+  int status = SYSPROF_ERRUSE;
+
+  int n = lua_gettop(L);
+  if (n != 1 || !lua_isfunction(L, 1)) {
+    luaL_error(L, "incorrect argument: 1 function is required");
+  }
+
+  /*
+  ** Since all the other modes functionality is the
+  ** subset of CALLGRAPH mode, run this mode to test
+  ** the profiler's behavior.
+  */
+  opt.mode = LUAM_SYSPROF_CALLGRAPH;
+  opt.interval = SYSPROF_INTERVAL_DEFAULT;
+  stream_init(&opt);
+
+  config.on_stop = on_stop_cb_default;
+  config.writer = buffer_writer_default;
+  status = luaM_sysprof_configure(&config);
+  assert(SYSPROF_SUCCESS == status);
+
+  status = luaM_sysprof_start(L, &opt);
+  assert(SYSPROF_SUCCESS == status);
+
+  /* Run payload. */
+  if (lua_pcall(L, 0, 0, 0) != 0) {
+    luaL_error(L, "error running payload: %s", lua_tostring(L, -1));
+  }
+
+  status = luaM_sysprof_stop(L);
+  assert(SYSPROF_SUCCESS == status);
+
+  status = luaM_sysprof_report(&cnt);
+  assert(SYSPROF_SUCCESS == status);
+
+  assert(cnt.samples > 1);
+  assert(cnt.samples == cnt.vmst_asm +
+                        cnt.vmst_cfunc +
+                        cnt.vmst_exit +
+                        cnt.vmst_ffunc +
+                        cnt.vmst_gc +
+                        cnt.vmst_interp +
+                        cnt.vmst_lfunc +
+                        cnt.vmst_opt +
+                        cnt.vmst_record +
+                        cnt.vmst_trace);
+
+  lua_pushboolean(L, 1);
+  return 1;
+}
+
+static const struct luaL_Reg testsysprof[] = {
+  {"c_payload", c_payload},
+  {"base", base},
+  {"validation", validation},
+  {"profile_func", profile_func},
+  {NULL, NULL}
+};
+
+LUA_API int luaopen_testsysprof(lua_State *L)
+{
+  luaL_register(L, "testsysprof", testsysprof);
+  return 1;
+}
-- 
2.33.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Tarantool-patches] [PATCH luajit 5/7] memprof: add profile common section
  2021-09-08 17:50 [Tarantool-patches] [PATCH luajit 0/7] misc: introduce sysprof Maxim Kokryashkin via Tarantool-patches
                   ` (3 preceding siblings ...)
  2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 4/7] core: introduce lua and platform profiler Maxim Kokryashkin via Tarantool-patches
@ 2021-09-08 17:50 ` Maxim Kokryashkin via Tarantool-patches
  2021-10-05 10:48   ` Sergey Kaplun via Tarantool-patches
  2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 6/7] sysprof: introduce Lua API Maxim Kokryashkin via Tarantool-patches
  2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 7/7] tools: introduce parsers for sysprof Maxim Kokryashkin via Tarantool-patches
  6 siblings, 1 reply; 19+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2021-09-08 17:50 UTC (permalink / raw)
  To: tarantool-patches, imun, skaplun

A Lua API for sysprof needs to be introduced, but sysprof's C API is quite
similar with memprof's. Considering this, there are some structures and
functions that should be common between memprof's and sysprof's
implementations of Lua API to avoid duplication.

Part of tarantool/tarantool#781
---
 src/lib_misc.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/src/lib_misc.c b/src/lib_misc.c
index 1dab08cc..a44476e0 100644
--- a/src/lib_misc.c
+++ b/src/lib_misc.c
@@ -75,9 +75,7 @@ LJLIB_CF(misc_getmetrics)
 
 #include "lj_libdef.h"
 
-/* ----- misc.memprof module ---------------------------------------------- */
-
-#define LJLIB_MODULE_misc_memprof
+/* --------- profile common  ---------------------------------------------- */
 
 /*
 ** Yep, 8Mb. Tuned in order not to bother the platform with too often flushes.
@@ -85,7 +83,7 @@ LJLIB_CF(misc_getmetrics)
 #define STREAM_BUFFER_SIZE (8 * 1024 * 1024)
 
 /* Structure given as ctx to memprof writer and on_stop callback. */
-struct memprof_ctx {
+struct profile_ctx {
   /* Output file stream for data. */
   FILE *stream;
   /* Profiled global_State for lj_mem_free at on_stop callback. */
@@ -101,7 +99,7 @@ struct memprof_ctx {
 static size_t buffer_writer_default(const void **buf_addr, size_t len,
 				    void *opt)
 {
-  struct memprof_ctx *ctx = opt;
+  struct profile_ctx *ctx = opt;
   FILE *stream = ctx->stream;
   const void * const buf_start = *buf_addr;
   const void *data = *buf_addr;
@@ -140,19 +138,22 @@ static size_t buffer_writer_default(const void **buf_addr, size_t len,
 /* Default on stop callback. Just close the corresponding stream. */
 static int on_stop_cb_default(void *opt, uint8_t *buf)
 {
-  struct memprof_ctx *ctx = opt;
+  struct profile_ctx *ctx = opt;
   FILE *stream = ctx->stream;
   UNUSED(buf);
   lj_mem_free(ctx->g, ctx, sizeof(*ctx));
   return fclose(stream);
 }
 
+/* ----- misc.memprof module ---------------------------------------------- */
+
+#define LJLIB_MODULE_misc_memprof
 /* local started, err, errno = misc.memprof.start(fname) */
 LJLIB_CF(misc_memprof_start)
 {
   struct lj_memprof_options opt = {0};
   const char *fname = strdata(lj_lib_checkstr(L, 1));
-  struct memprof_ctx *ctx;
+  struct profile_ctx *ctx;
   int memprof_status;
 
   /*
-- 
2.33.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Tarantool-patches] [PATCH luajit 6/7] sysprof: introduce Lua API
  2021-09-08 17:50 [Tarantool-patches] [PATCH luajit 0/7] misc: introduce sysprof Maxim Kokryashkin via Tarantool-patches
                   ` (4 preceding siblings ...)
  2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 5/7] memprof: add profile common section Maxim Kokryashkin via Tarantool-patches
@ 2021-09-08 17:50 ` Maxim Kokryashkin via Tarantool-patches
  2021-10-05 15:36   ` Sergey Kaplun via Tarantool-patches
  2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 7/7] tools: introduce parsers for sysprof Maxim Kokryashkin via Tarantool-patches
  6 siblings, 1 reply; 19+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2021-09-08 17:50 UTC (permalink / raw)
  To: tarantool-patches, imun, skaplun

This commit introduces Lua API for sysprof, so it is now possible to
call it from anywhere in a Lua script. All of the parameters are passed
inside a single Lua table.
The following options are supported:
- mode:     "D" (Default mode, only virtual machine state counters)
            "L" (Leaf mode, shows the last frame on the stack)
            "C" (Callchain mode, full stack dump)
- interval: time between samples
- path:     Path to file in which profiling data should be stored

Part of tarantool/tarantool#781
---
 src/lib_misc.c                                | 224 ++++++++++++++++++
 .../misclib-sysprof-lapi.test.lua             | 105 ++++++++
 2 files changed, 329 insertions(+)
 create mode 100644 test/tarantool-tests/misclib-sysprof-lapi.test.lua

diff --git a/src/lib_misc.c b/src/lib_misc.c
index a44476e0..a3423cdd 100644
--- a/src/lib_misc.c
+++ b/src/lib_misc.c
@@ -145,6 +145,218 @@ static int on_stop_cb_default(void *opt, uint8_t *buf)
   return fclose(stream);
 }
 
+/* ----- misc.sysprof module ---------------------------------------------- */
+
+/* Not loaded by default, use: local profile = require("misc.sysprof") */
+#define LJLIB_MODULE_misc_sysprof
+
+/* The default profiling interval equals to 11 ms. */
+#define SYSPROF_DEFAULT_INTERVAL (11)
+#define SYSPROF_DEFAULT_OUTPUT "sysprof.bin"
+
+int parse_sysprof_opts(lua_State *L, struct luam_sysprof_options *opt, int idx) {
+  GCtab *options = lj_lib_checktab(L, idx);
+
+  /* get profiling mode */
+  {
+    cTValue *mode_opt = lj_tab_getstr(options, lj_str_newlit(L, "mode"));
+    char mode = 0;
+    if (mode_opt) {
+      mode = *strVdata(mode_opt);
+    }
+
+    switch (mode) {
+      case 'D':
+        opt->mode = LUAM_SYSPROF_DEFAULT;
+        break;
+      case 'L':
+        opt->mode = LUAM_SYSPROF_LEAF;
+        break;
+      case 'C':
+        opt->mode = LUAM_SYSPROF_CALLGRAPH;
+        break;
+      default:
+        return SYSPROF_ERRUSE;
+    }
+  }
+
+  /* Get profiling interval. */
+  {
+    cTValue *interval = lj_tab_getstr(options, lj_str_newlit(L, "interval"));
+    opt->interval = SYSPROF_DEFAULT_INTERVAL;
+    if (interval) {
+      int32_t signed_interval = numberVint(interval);
+      if (signed_interval < 1) {
+        return SYSPROF_ERRUSE;
+      }
+      opt->interval = signed_interval;
+    }
+  }
+
+  return SYSPROF_SUCCESS;
+}
+
+int set_output_path(const char* path, struct luam_sysprof_options *opt) {
+  lua_assert(path != NULL);
+  struct profile_ctx *ctx = opt->ctx;
+  FILE *stream = fopen(path, "wb");
+  if(!stream) {
+    return SYSPROF_ERRIO;
+  }
+  ctx->stream = stream;
+  return SYSPROF_SUCCESS;
+}
+
+const char* parse_output_path(lua_State *L, struct luam_sysprof_options *opt, int idx) {
+  /* By default, sysprof writes events to a `sysprof.bin` file. */
+  const char *path = luaL_checkstring(L, idx);
+  return path ? path : SYSPROF_DEFAULT_OUTPUT;
+}
+
+int parse_options(lua_State *L, struct luam_sysprof_options *opt)
+{
+  int status = SYSPROF_SUCCESS;
+
+  switch(lua_gettop(L)) {
+    case 2:
+      if(!(lua_isstring(L, 1) && lua_istable(L, 2))) {
+        status = SYSPROF_ERRUSE;
+        break;
+      }
+      const char* path = parse_output_path(L, opt, 1); 
+      status = set_output_path(path, opt);
+      if(status != SYSPROF_SUCCESS) 
+        break;
+
+      status = parse_sysprof_opts(L, opt, 2);
+      break;
+    case 1:
+      if(!lua_istable(L, 1)) {
+        status = SYSPROF_ERRUSE;
+        break;
+      }
+      status = parse_sysprof_opts(L, opt, 1);
+      if(status != SYSPROF_SUCCESS)
+        break;
+      status = set_output_path(SYSPROF_DEFAULT_OUTPUT, opt);
+      break;
+    default:
+      status = SYSPROF_ERRUSE;
+      break;
+  } 
+  return status;
+}
+
+int sysprof_error(lua_State *L, int status)
+{
+  switch (status) {
+    case SYSPROF_ERRUSE:
+      lua_pushnil(L);
+      lua_pushstring(L, err2msg(LJ_ERR_PROF_MISUSE));
+      lua_pushinteger(L, EINVAL);
+      return 3;
+    case SYSPROF_ERRRUN:
+      lua_pushnil(L);
+      lua_pushstring(L, err2msg(LJ_ERR_PROF_ISRUNNING));
+      lua_pushinteger(L, EINVAL);
+      return 3;
+    case SYSPROF_ERRSTOP:
+      lua_pushnil(L);
+      lua_pushstring(L, err2msg(LJ_ERR_PROF_NOTRUNNING));
+      lua_pushinteger(L, EINVAL);
+      return 3;
+    case SYSPROF_ERRIO:
+      return luaL_fileresult(L, 0, NULL);
+    default:
+      lua_assert(0);
+      return 0;
+  }
+}
+
+/* local res, err, errno = sysprof.start(options) */
+LJLIB_CF(misc_sysprof_start)
+{
+  int status = SYSPROF_SUCCESS;
+
+  struct luam_sysprof_options opt = {};
+  struct profile_ctx *ctx = NULL;
+
+  ctx = lj_mem_new(L, sizeof(*ctx));
+  ctx->g = G(L);
+  opt.ctx = ctx;
+  opt.buf = ctx->buf;
+  opt.len = STREAM_BUFFER_SIZE;
+
+  status = parse_options(L, &opt);
+  if (LJ_UNLIKELY(status != PROFILE_SUCCESS)) {
+    lj_mem_free(ctx->g, ctx, sizeof(*ctx));
+    return sysprof_error(L, status);
+  }
+
+  const struct luam_sysprof_config conf = {buffer_writer_default, on_stop_cb_default, NULL};
+  status = luaM_sysprof_configure(&conf);
+  if (LJ_UNLIKELY(status != PROFILE_SUCCESS)) {
+    return sysprof_error(L, status);
+  }
+
+  status = luaM_sysprof_start(L, &opt);
+  if (LJ_UNLIKELY(status != PROFILE_SUCCESS)) {
+    /* Allocated memory will be freed in on_stop callback. */
+    return sysprof_error(L, status);
+  }
+
+  lua_pushboolean(L, 1);
+  return 1;
+}
+
+/* local res, err, errno = profile.sysprof_stop() */
+LJLIB_CF(misc_sysprof_stop)
+{
+  int status = luaM_sysprof_stop(L);
+  if (LJ_UNLIKELY(status != PROFILE_SUCCESS)) {
+    return sysprof_error(L, status);
+  }
+  lua_pushboolean(L, 1);
+  return 1;
+}
+
+/* local counters, err, errno = sysprof.report() */
+LJLIB_CF(misc_sysprof_report)
+{
+  struct luam_sysprof_counters counters = {};
+  GCtab *data_tab = NULL;
+  GCtab *count_tab = NULL;
+
+  int status = luaM_sysprof_report(&counters);
+  if (status != SYSPROF_SUCCESS) {
+    return sysprof_error(L, status);
+  }
+
+  lua_createtable(L, 0, 3);
+  data_tab = tabV(L->top - 1);
+
+  setnumfield(L, data_tab, "samples", counters.samples);
+  setnumfield(L, data_tab, "overruns", counters.overruns);
+
+  lua_createtable(L, 0, LJ_VMST__MAX + 1);
+  count_tab = tabV(L->top - 1);
+
+  setnumfield(L, count_tab, "INTERP", counters.vmst_interp);
+  setnumfield(L, count_tab, "LFUNC",  counters.vmst_lfunc);
+  setnumfield(L, count_tab, "FFUNC",  counters.vmst_ffunc);
+  setnumfield(L, count_tab, "CFUNC",  counters.vmst_cfunc);
+  setnumfield(L, count_tab, "GC",     counters.vmst_gc);
+  setnumfield(L, count_tab, "EXIT",   counters.vmst_exit);
+  setnumfield(L, count_tab, "RECORD", counters.vmst_record);
+  setnumfield(L, count_tab, "OPT",    counters.vmst_opt);
+  setnumfield(L, count_tab, "ASM",    counters.vmst_asm);
+  setnumfield(L, count_tab, "TRACE",  counters.vmst_trace);
+
+  lua_setfield(L, -2, "vmstate");
+
+  return 1;
+}
+
 /* ----- misc.memprof module ---------------------------------------------- */
 
 #define LJLIB_MODULE_misc_memprof
@@ -237,7 +449,19 @@ LJLIB_CF(misc_memprof_stop)
 
 LUALIB_API int luaopen_misc(struct lua_State *L)
 {
+  int status = SYSPROF_SUCCESS;
+  struct luam_sysprof_config config = {};
+
+  config.writer = buffer_writer_default;
+  config.on_stop = on_stop_cb_default;
+  status = luaM_sysprof_configure(&config);
+  if (LJ_UNLIKELY(status != PROFILE_SUCCESS)) {
+    return sysprof_error(L, status);
+  }
+
   LJ_LIB_REG(L, LUAM_MISCLIBNAME, misc);
   LJ_LIB_REG(L, LUAM_MISCLIBNAME ".memprof", misc_memprof);
+  LJ_LIB_REG(L, LUAM_MISCLIBNAME ".sysprof", misc_sysprof);
+
   return 1;
 }
diff --git a/test/tarantool-tests/misclib-sysprof-lapi.test.lua b/test/tarantool-tests/misclib-sysprof-lapi.test.lua
new file mode 100644
index 00000000..17d83d1b
--- /dev/null
+++ b/test/tarantool-tests/misclib-sysprof-lapi.test.lua
@@ -0,0 +1,105 @@
+local tap = require("tap")
+
+local test = tap.test("misc-sysprof-lapi")
+test:plan(14)
+
+jit.off()
+jit.flush()
+
+local bufread = require "utils.bufread"
+local symtab = require "utils.symtab"
+
+local TMP_BINFILE = arg[0]:gsub(".+/([^/]+)%.test%.lua$", "%.%1.sysprofdata.tmp.bin")
+local BAD_PATH = arg[0]:gsub(".+/([^/]+)%.test%.lua$", "%1/sysprofdata.tmp.bin")
+
+local function payload()
+  local function fib(n)
+    if n <= 1 then
+      return n
+    end
+    return fib(n - 1) + fib(n - 2)
+  end
+  return fib(32)
+end
+
+local function generate_output(file, opts)
+  file = file or "sysprof.bin"
+  local res, err = misc.sysprof.start(file, opts)
+  assert(res, err)
+
+  payload()
+
+  res,err = misc.sysprof.stop()
+  assert(res, err)
+end
+--## GENERAL ###############################################################--
+
+--wrong profiling mode
+local res, err, errno = misc.sysprof.start{ mode = "A" }
+test:ok(res == nil and err:match("profiler misuse"))
+test:ok(type(errno) == "number")
+
+--already running
+res, err = misc.sysprof.start{ mode = "D" }
+assert(res, err)
+
+res, err, errno = misc.sysprof.start{ mode = "D" }
+print(err)
+test:ok(res == nil and err:match("profiler misuse"))
+test:ok(type(errno) == "number")
+
+res, err = misc.sysprof.stop()
+assert(res, err)
+
+--not running
+res, err, errno = misc.sysprof.stop()
+test:ok(res == nil and err:match("profiler is not running"))
+test:ok(type(errno) == "number")
+
+--bad path
+res, err, errno = misc.sysprof.start(BAD_PATH, { mode = "C" })
+test:ok(res == nil and err:match("No such file or directory"))
+test:ok(type(errno) == "number")
+
+--bad interval
+res, err, errno = misc.sysprof.start{ mode = "C", interval = -1 }
+test:ok(res == nil and err:match("profiler misuse"))
+test:ok(type(errno) == "number")
+--## DEFAULT MODE ##########################################################--
+
+res, err = pcall(generate_output, nil, { mode = "D", interval = 11 })
+if res == nil then
+  error(err)
+end
+
+local report = misc.sysprof.report()
+
+test:ok(report.samples > 0)  -- TODO: more accurate test?
+test:ok(report.vmstate.LFUNC > 0)
+test:ok(report.vmstate.TRACE == 0)
+
+-- with very big interval
+res, err = pcall(generate_output, nil, { mode = "D", interval = 1000 })
+if res == nil then
+  error(err)
+end
+
+report = misc.sysprof.report()
+test:ok(report.samples == 0)
+
+--## CALL MODE #############################################################--
+
+res, err = pcall(
+  generate_output, TMP_BINFILE, { mode = "C", interval = 11 }
+)
+if res == nil then
+  error(err)
+end
+
+local reader = bufread.new(TMP_BINFILE)
+symtab.parse(reader)
+
+os.remove(TMP_BINFILE)
+
+jit.on()
+os.exit(test:check() and 0 or 1)
-- 
2.33.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Tarantool-patches] [PATCH luajit 7/7] tools: introduce parsers for sysprof
  2021-09-08 17:50 [Tarantool-patches] [PATCH luajit 0/7] misc: introduce sysprof Maxim Kokryashkin via Tarantool-patches
                   ` (5 preceding siblings ...)
  2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 6/7] sysprof: introduce Lua API Maxim Kokryashkin via Tarantool-patches
@ 2021-09-08 17:50 ` Maxim Kokryashkin via Tarantool-patches
  2021-10-07 11:28   ` Sergey Kaplun via Tarantool-patches
  6 siblings, 1 reply; 19+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2021-09-08 17:50 UTC (permalink / raw)
  To: tarantool-patches, imun, skaplun

Since the sysprof's binary output is not human-readable, so there is a
demand to create a parser. The parser, which this commit provides,
converts sysprof's event stream into the format that the flamegraph.pl
can process.

The sysprof parser machinery uses the same symtab module as memprof's
parser since the format is the same.

Part of tarantool/tarantool#781
---
 .gitignore                                    |   1 +
 .../misclib-sysprof-lapi.test.lua             |   2 +
 tools/CMakeLists.txt                          |  83 ++++++++
 tools/luajit-parse-sysprof.in                 |   6 +
 tools/sysprof.lua                             | 119 +++++++++++
 tools/sysprof/collapse.lua                    | 110 ++++++++++
 tools/sysprof/parse.lua                       | 188 ++++++++++++++++++
 7 files changed, 509 insertions(+)
 create mode 100644 tools/luajit-parse-sysprof.in
 create mode 100644 tools/sysprof.lua
 create mode 100755 tools/sysprof/collapse.lua
 create mode 100755 tools/sysprof/parse.lua

diff --git a/.gitignore b/.gitignore
index 2103a30f..099df060 100644
--- a/.gitignore
+++ b/.gitignore
@@ -19,4 +19,5 @@ cmake_uninstall.cmake
 compile_commands.json
 install_manifest.txt
 luajit-parse-memprof
+luajit-parse-sysprof
 luajit.pc
diff --git a/test/tarantool-tests/misclib-sysprof-lapi.test.lua b/test/tarantool-tests/misclib-sysprof-lapi.test.lua
index 17d83d1b..b9b3461f 100644
--- a/test/tarantool-tests/misclib-sysprof-lapi.test.lua
+++ b/test/tarantool-tests/misclib-sysprof-lapi.test.lua
@@ -8,6 +8,7 @@ jit.flush()
 
 local bufread = require "utils.bufread"
 local symtab = require "utils.symtab"
+local sysprof = require "sysprof.parse"
 
 local TMP_BINFILE = arg[0]:gsub(".+/([^/]+)%.test%.lua$", "%.%1.sysprofdata.tmp.bin")
 local BAD_PATH = arg[0]:gsub(".+/([^/]+)%.test%.lua$", "%1/sysprofdata.tmp.bin")
@@ -98,6 +99,7 @@ end
 
 local reader = bufread.new(TMP_BINFILE)
 symtab.parse(reader)
+sysprof.parse(reader)
 
 os.remove(TMP_BINFILE)
 
diff --git a/tools/CMakeLists.txt b/tools/CMakeLists.txt
index 61830e44..93a2f763 100644
--- a/tools/CMakeLists.txt
+++ b/tools/CMakeLists.txt
@@ -87,4 +87,87 @@ else()
   )
 endif()
 
+
+
+if(LUAJIT_DISABLE_SYSPROF)
+  message(STATUS "LuaJIT system profiler support is disabled")
+else()
+  # XXX: Can use genex here since the value need to be evaluated
+  # at the configuration phase. Fortunately, we know the exact
+  # path where LuaJIT binary is located.
+  set(LUAJIT_TOOLS_BIN ${LUAJIT_BINARY_DIR}/${LUAJIT_CLI_NAME})
+  set(LUAJIT_TOOLS_DIR ${CMAKE_CURRENT_SOURCE_DIR})
+  # XXX: Unfortunately, there is no convenient way to set
+  # particular permissions to the output file via CMake.
+  # Furthermore, I even failed to copy the given file to the same
+  # path to change its permissions. After looking at the docs, I
+  # realized that the valid solution would be too monstrous for
+  # such a simple task. As a result I've made the template itself
+  # executable, so the issue is resolved.
+  configure_file(luajit-parse-sysprof.in luajit-parse-sysprof @ONLY ESCAPE_QUOTES)
+
+
+  add_custom_target(tools-parse-sysprof EXCLUDE_FROM_ALL DEPENDS
+    luajit-parse-sysprof
+    sysprof/parse.lua
+    sysprof/collapse.lua
+    sysprof.lua
+    utils/bufread.lua
+    utils/symtab.lua
+  )
+  list(APPEND LUAJIT_TOOLS_DEPS tools-parse-sysprof)
+
+  install(FILES
+      ${CMAKE_CURRENT_SOURCE_DIR}/sysprof/parse.lua
+      ${CMAKE_CURRENT_SOURCE_DIR}/sysprof/collapse.lua
+    DESTINATION ${LUAJIT_DATAROOTDIR}/sysprof
+    PERMISSIONS
+      OWNER_READ OWNER_WRITE
+      GROUP_READ
+      WORLD_READ
+    COMPONENT tools-parse-sysprof
+  )
+  install(FILES
+      ${CMAKE_CURRENT_SOURCE_DIR}/utils/bufread.lua
+      ${CMAKE_CURRENT_SOURCE_DIR}/utils/symtab.lua
+    DESTINATION ${LUAJIT_DATAROOTDIR}/utils
+    PERMISSIONS
+      OWNER_READ OWNER_WRITE
+      GROUP_READ
+      WORLD_READ
+    COMPONENT tools-parse-sysprof
+  )
+  install(FILES
+      ${CMAKE_CURRENT_SOURCE_DIR}/sysprof.lua
+    DESTINATION ${LUAJIT_DATAROOTDIR}
+    PERMISSIONS
+      OWNER_READ OWNER_WRITE
+      GROUP_READ
+      WORLD_READ
+    COMPONENT tools-parse-sysprof
+  )
+  install(CODE
+    # XXX: The auxiliary script needs to be configured for to be
+    # used in repository directly. Furthermore, it needs to be
+    # reconfigured prior to its installation. The temporary
+    # <configure_file> output is stored to the project build
+    # directory and removed later after being installed. This
+    # script will have gone as a result of the issue:
+    # https://github.com/tarantool/tarantool/issues/5688.
+    "
+      set(LUAJIT_TOOLS_BIN ${CMAKE_INSTALL_PREFIX}/bin/${LUAJIT_CLI_NAME})
+      set(LUAJIT_TOOLS_DIR ${CMAKE_INSTALL_PREFIX}/${LUAJIT_DATAROOTDIR})
+      configure_file(${CMAKE_CURRENT_SOURCE_DIR}/luajit-parse-sysprof.in
+        ${PROJECT_BINARY_DIR}/luajit-parse-sysprof @ONLY ESCAPE_QUOTES)
+      file(INSTALL ${PROJECT_BINARY_DIR}/luajit-parse-sysprof
+        DESTINATION ${CMAKE_INSTALL_PREFIX}/bin
+        USE_SOURCE_PERMISSIONS
+      )
+      file(REMOVE ${PROJECT_BINARY_DIR}/luajit-parse-sysprof)
+    "
+    COMPONENT tools-parse-sysprof
+  )
+endif()
+
 add_custom_target(LuaJIT-tools DEPENDS ${LUAJIT_TOOLS_DEPS})
+
diff --git a/tools/luajit-parse-sysprof.in b/tools/luajit-parse-sysprof.in
new file mode 100644
index 00000000..2be25eb3
--- /dev/null
+++ b/tools/luajit-parse-sysprof.in
@@ -0,0 +1,6 @@
+#!/bin/bash
+#
+# Launcher for sysprof parser.
+
+LUA_PATH="@LUAJIT_TOOLS_DIR@/?.lua;;" \
+  @LUAJIT_TOOLS_BIN@ @LUAJIT_TOOLS_DIR@/sysprof.lua $@
diff --git a/tools/sysprof.lua b/tools/sysprof.lua
new file mode 100644
index 00000000..e6d8cc34
--- /dev/null
+++ b/tools/sysprof.lua
@@ -0,0 +1,119 @@
+local bufread = require "utils.bufread"
+local sysprof = require "sysprof.parse"
+local symtab = require "utils.symtab"
+local misc = require "sysprof.collapse"
+
+local stdout, stderr = io.stdout, io.stderr
+local match, gmatch = string.match, string.gmatch
+
+local split_by_vmstate = false
+
+-- Program options.
+local opt_map = {}
+
+function opt_map.help()
+  stdout:write [[
+luajit-parse-sysprof - parser of the profile collected
+                       with LuaJIT's sysprof.
+
+SYNOPSIS
+
+luajit-parse-sysprof [options] sysprof.bin
+
+Supported options are:
+
+  --help                            Show this help and exit
+  --split                           Split callchains by vmstate
+]]
+  os.exit(0)
+end
+
+function opt_map.split()
+  split_by_vmstate = true
+end
+
+-- Print error and exit with error status.
+local function opterror(...)
+  stderr:write("luajit-parse-sysprof.lua: ERROR: ", ...)
+  stderr:write("\n")
+  os.exit(1)
+end
+
+-- Parse single option.
+local function parseopt(opt, args)
+  local opt_current = #opt == 1 and "-"..opt or "--"..opt
+  local f = opt_map[opt]
+  if not f then
+    opterror("unrecognized option `", opt_current, "'. Try `--help'.\n")
+  end
+  f(args)
+end
+
+-- Parse arguments.
+local function parseargs(args)
+  -- Process all option arguments.
+  args.argn = 1
+  repeat
+    local a = args[args.argn]
+    if not a then
+      break
+    end
+    local lopt, opt = match(a, "^%-(%-?)(.+)")
+    if not opt then
+      break
+    end
+    args.argn = args.argn + 1
+    if lopt == "" then
+      -- Loop through short options.
+      for o in gmatch(opt, ".") do
+        parseopt(o, args)
+      end
+    else
+      -- Long option.
+      parseopt(opt, args)
+    end
+  until false
+
+  -- Check for proper number of arguments.
+  local nargs = #args - args.argn + 1
+  if nargs ~= 1 then
+    opt_map.help()
+  end
+
+  -- Translate a single input file.
+  -- TODO: Handle multiple files?
+  return args[args.argn]
+end
+
+local function traverse_calltree(node, prefix)
+  if node.is_leaf then
+    print(prefix..' '..node.count)
+  end
+
+  local sep_prefix = #prefix == 0 and prefix or prefix..';'
+
+  for name,child in pairs(node.children) do
+    traverse_calltree(child, sep_prefix..name)
+  end
+end
+
+local function dump(inputfile)
+  local reader = bufread.new(inputfile)
+
+  local symbols = symtab.parse(reader)
+
+  local events = sysprof.parse(reader)
+  local calltree = misc.collapse(events, symbols, split_by_vmstate)
+
+  traverse_calltree(calltree, '')
+
+  os.exit(0)
+end
+
+-- FIXME: this script should be application-independent.
+local args = {...}
+if #args == 1 and args[1] == "sysprof" then
+  return dump
+else
+  dump(parseargs(args))
+end
diff --git a/tools/sysprof/collapse.lua b/tools/sysprof/collapse.lua
new file mode 100755
index 00000000..2d684c95
--- /dev/null
+++ b/tools/sysprof/collapse.lua
@@ -0,0 +1,110 @@
+local parse = require "sysprof.parse"
+local vmdef = require "jit.vmdef"
+local symtab = require "utils.symtab"
+
+local VMST_NAMES = {
+  [parse.VMST.INTERP] = "VMST_INTERP",
+  [parse.VMST.LFUNC]  = "VMST_LFUNC",
+  [parse.VMST.FFUNC]  = "VMST_FFUNC",
+  [parse.VMST.CFUNC]  = "VMST_CFUNC",
+  [parse.VMST.GC]     = "VMST_GC",
+  [parse.VMST.EXIT]   = "VMST_EXIT",
+  [parse.VMST.RECORD] = "VMST_RECORD",
+  [parse.VMST.OPT]    = "VMST_OPT",
+  [parse.VMST.ASM]    = "VMST_ASM",
+  [parse.VMST.TRACE]  = "VMST_TRACE",
+}
+
+local M = {}
+
+local function new_node(name, is_leaf)
+  return {
+    name = name,
+    count = 0,
+    is_leaf = is_leaf,
+    children = {}
+  }
+end
+
+-- insert new child into a node (or increase counter in existing one)
+local function insert(name, node, is_leaf)
+  if node.children[name] == nil then
+    node.children[name] = new_node(name, is_leaf)
+  end
+
+  local child = node.children[name]
+  child.count = child.count + 1
+
+  return child
+end
+
+local function insert_lua_callchain(chain, lua, symbols)
+  for _,fr in pairs(lua.callchain) do
+    local name_lua
+
+    if fr.type == parse.FRAME.FFUNC then
+      name_lua = vmdef.ffnames[fr.ffid]
+    else
+      name_lua = symtab.demangle(symbols, {
+        addr = fr.addr,
+        line = fr.line
+      })
+      if lua.trace.id ~= nil and lua.trace.addr == fr.addr and
+          lua.trace.line == fr.line then
+        name_lua = 'TRACE_'..tostring(lua.trace.id)..'_'..name_lua
+      end
+    end
+
+    table.insert(chain, { name = name_lua })
+  end
+end
+
+-- merge lua and host callchains into one callchain representing
+-- transfer of control
+local function merge(event, symbols, sep_vmst)
+  local cc = {}
+  local lua_inserted = false
+
+  for _,h_fr in pairs(event.host.callchain) do
+    local name_host = symtab.demangle(symbols, { addr = h_fr.addr })
+
+    -- We assume that usually the transfer of control
+    -- looks like:
+    --    HOST -> LUA -> HOST
+    -- so for now, lua callchain starts from lua_pcall() call
+    if name_host == 'lua_pcall' then
+      insert_lua_callchain(cc, event.lua, symbols)
+      lua_inserted = true
+    end
+
+    table.insert(cc, { name = name_host })
+  end
+
+  if lua_inserted == false then
+    insert_lua_callchain(cc, event.lua, symbols)
+  end
+
+  if sep_vmst == true then
+    table.insert(cc, { name = VMST_NAMES[event.lua.vmstate] })
+  end
+
+  return cc
+end
+
+-- Collapse all the events into call tree
+function M.collapse(events, symbols, sep_vmst)
+  local root = new_node('root', false)
+
+  for _,ev in pairs(events) do
+    local callchain = merge(ev, symbols, sep_vmst)
+    local curr_node = root
+    for i=#callchain,1,-1 do
+      curr_node = insert(callchain[i].name, curr_node, false)
+    end
+    insert('', curr_node, true)
+  end
+
+  return root
+end
+
+return M
diff --git a/tools/sysprof/parse.lua b/tools/sysprof/parse.lua
new file mode 100755
index 00000000..766b0c99
--- /dev/null
+++ b/tools/sysprof/parse.lua
@@ -0,0 +1,188 @@
+-- Parser of LuaJIT's sysprof binary stream.
+-- The format spec can be found in <src/lj_sysprof.h>.
+
+local string_format = string.format
+
+local LJP_MAGIC = "ljp"
+local LJP_CURRENT_VERSION = 1
+
+local M = {}
+
+M.VMST = {
+  INTERP = 0,
+  LFUNC  = 1,
+  FFUNC  = 2,
+  CFUNC  = 3,
+  GC     = 4,
+  EXIT   = 5,
+  RECORD = 6,
+  OPT    = 7,
+  ASM    = 8,
+  TRACE  = 9,
+}
+
+
+M.FRAME = {
+  LFUNC  = 1,
+  CFUNC  = 2,
+  FFUNC  = 3,
+  BOTTOM = 0x80
+}
+
+local STREAM_END = 0x80
+
+local function new_event()
+  return {
+    lua = {
+      vmstate = 0,
+      callchain = {},
+      trace = {
+        id = nil,
+        addr = 0,
+        line = 0
+      }
+    },
+    host = {
+      callchain = {}
+    }
+  }
+end
+
+local function parse_lfunc(reader, event)
+  local addr = reader:read_uleb128()
+  local line = reader:read_uleb128()
+  table.insert(event.lua.callchain, {
+    type = M.FRAME.LFUNC,
+    addr = addr,
+    line = line
+  })
+end
+
+local function parse_ffunc(reader, event)
+  local ffid = reader:read_uleb128()
+  table.insert(event.lua.callchain, {
+    type = M.FRAME.FFUNC,
+    ffid = ffid,
+  })
+end
+
+local function parse_cfunc(reader, event)
+  local addr = reader:read_uleb128()
+  table.insert(event.lua.callchain, {
+    type = M.FRAME.CFUNC,
+    addr = addr
+  })
+end
+
+local frame_parsers = {
+  [M.FRAME.LFUNC] = parse_lfunc,
+  [M.FRAME.FFUNC] = parse_ffunc,
+  [M.FRAME.CFUNC] = parse_cfunc
+}
+
+local function parse_lua_callchain(reader, event)
+  while true do
+    local frame_header = reader:read_octet()
+    if frame_header == M.FRAME.BOTTOM then
+      break
+    end
+    frame_parsers[frame_header](reader, event)
+  end
+end
+
+--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--
+
+local function parse_host_callchain(reader, event)
+  local addr = reader:read_uleb128()
+
+  while addr ~= 0 do
+    table.insert(event.host.callchain, {
+      addr = addr
+    })
+    addr = reader:read_uleb128()
+  end
+end
+
+--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--
+
+local function parse_trace_callchain(reader, event)
+  event.lua.trace.id   = reader:read_uleb128()
+  event.lua.trace.addr = reader:read_uleb128()
+  event.lua.trace.line = reader:read_uleb128()
+end
+
+--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--
+
+local function parse_host_only(reader, event)
+  parse_host_callchain(reader, event)
+end
+
+local function parse_lua_host(reader, event)
+  parse_lua_callchain(reader, event)
+  parse_host_callchain(reader, event)
+end
+
+local function parse_trace(reader, event)
+  parse_trace_callchain(reader, event)
+  -- parse_lua_callchain(reader, event)
+end
+
+local event_parsers = {
+  [M.VMST.INTERP] = parse_host_only,
+  [M.VMST.LFUNC]  = parse_lua_host,
+  [M.VMST.FFUNC]  = parse_lua_host,
+  [M.VMST.CFUNC]  = parse_lua_host,
+  [M.VMST.GC]     = parse_host_only,
+  [M.VMST.EXIT]   = parse_host_only,
+  [M.VMST.RECORD] = parse_host_only,
+  [M.VMST.OPT]    = parse_host_only,
+  [M.VMST.ASM]    = parse_host_only,
+  [M.VMST.TRACE]  = parse_trace
+}
+
+local function parse_event(reader, events)
+  local event = new_event()
+
+  local vmstate = reader:read_octet()
+  if vmstate == STREAM_END then
+    -- TODO: samples & overruns
+    return false
+  end
+
+  assert(0 <= vmstate and vmstate <= 9, "Vmstate "..vmstate.." is not valid")
+  event.lua.vmstate = vmstate
+
+  event_parsers[vmstate](reader, event)
+
+  table.insert(events, event)
+  return true
+end
+
+function M.parse(reader)
+  local events = {}
+
+  local magic = reader:read_octets(3)
+  local version = reader:read_octets(1)
+  -- Dummy-consume reserved bytes.
+  local _ = reader:read_octets(3)
+
+  if magic ~= LJP_MAGIC then
+    error("Bad LJP format prologue: "..magic)
+  end
+
+  if string.byte(version) ~= LJP_CURRENT_VERSION then
+    error(string_format(
+      "LJP format version mismatch: the tool expects %d, but your data is %d",
+      LJP_CURRENT_VERSION,
+      string.byte(version)
+    ))
+  end
+
+  while parse_event(reader, events) do
+    -- Empty body.
+  end
+
+  return events
+end
+
+return M
-- 
2.33.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 1/7] core: save current frame top to lj_obj
  2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 1/7] core: save current frame top to lj_obj Maxim Kokryashkin via Tarantool-patches
@ 2021-09-20 17:21   ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 19+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-09-20 17:21 UTC (permalink / raw)
  To: Maxim Kokryashkin, Mikhail Shishatskiy; +Cc: tarantool-patches

Hi, Maxim, Mikhail!

Thanks for the patch!

Please consider my comments below.

Side note: please share Tarantool branch with bumped LuaJIT version.

On 08.09.21, Maxim Kokryashkin wrote:
> From: Mikhail Shishatskiy <m.shishatskiy@tarantool.org>

| core: save current frame top to lj_obj

Minor: It is better to use "vm:" prefix here. Also, "frame top to
lj_obj" sounds a little bit confusing to me. I suggest reformulate it
like the following: "topframe info into global_State".

> 
> Since commit 111d377d524e54e02187148a1832683291d620b2

Minor: Please mention any commit in the following format:

hashhashhashhashhashhashhashhashhash ('commit subject here')

> the VM has LFUNC and FFUNC states. The upcoming sampling
> profiler uses these vmstates to determine if the guest
> stack is valid or not. There is an inconsistent behavior
> of the VM when the Lua stack is not valid, but the state
> is set to LFUNC. This patch is just a gross hack with which
> the profiler works fine.

You've said offline, that there is some problem nearby `vm_growstack`
functionality. From this commit message and commit itself I

1) don't see any related changes (I don't see any hack :))
2) have no idea about what exactly the problem that this hack solved

Also, I notice that there are two unrelated changes: refactoring of code
with general macro `set_vmstate_.func` and saving metainformation
information inside the topframe structure.

NB: May be I excessively granulate changes, and the second reviewer does
not agree with me at this point.

>                          The problem is to be investigated
> more deeply :(

Typo: s/ :(/./. Moreover, I suggest to drop this sentence.

Minor: looks like commit message lines is not fullfilled.
Nit: please add corresponding "Part of " or may be "Need for " keyword.

> ---
>  src/lj_obj.h    | 12 ++++++++++++
>  src/vm_x64.dasc | 52 +++++++++++++++++++++++++++++++++++++++----------
>  src/vm_x86.dasc | 52 +++++++++++++++++++++++++++++++++++++++----------
>  3 files changed, 96 insertions(+), 20 deletions(-)
> 
> diff --git a/src/lj_obj.h b/src/lj_obj.h
> index d26e60be..b76c3155 100644
> --- a/src/lj_obj.h
> +++ b/src/lj_obj.h
> @@ -514,6 +514,17 @@ typedef struct GCtab {
>  #define setfreetop(t, n, v)	(setmref((n)->freetop, (v)))
>  #endif
>  
> +/* -- Misc objects -------------------------------------------------------- */

Minor: I suggest not Misc, but Profiler.

> +
> +struct lj_sysprof_topframe {
> +  uint8_t ffid;          /* FFUNC: fast function id. */

Why do we separate this field from the union?

> +  union {
> +    uint64_t raw;        /* Raw value for context save/restore. */
> +    TValue *interp_base; /* LFUNC: Base of the executed coroutine. */
> +    lua_CFunction cf;    /* CFUNC: Address of the C function. */

Nit: please use tabs instead spaces for comments alignment, like it is
done for other structures in this header.

> +  } guesttop;
> +};
> +
>  /* -- State objects ------------------------------------------------------- */
>  
>  /* VM states. */
> @@ -674,6 +685,7 @@ typedef struct global_State {
>    MRef jit_base;	/* Current JIT code L->base or NULL. */
>    MRef ctype_state;	/* Pointer to C type state. */
>    GCRef gcroot[GCROOT_MAX];  /* GC roots. */
> +  struct lj_sysprof_topframe top_frame;  /* Top frame for sysprof */

Nit: I suppose that this structure should be introduced only if sysprof
is enabled. OTOH, I see no reason to hide this structure, so I suggest
to drop it as is for now.

Side note: Also, I'm not sure that this field has addressable offset for
ARM architecture (for DynASM).

Typo: s/for sysprof/for sysprof./

>  } global_State;
>  
>  #define mainthread(g)	(&gcref(g->mainthref)->th)
> diff --git a/src/vm_x64.dasc b/src/vm_x64.dasc
> index 974047d3..c4beb5e7 100644
> --- a/src/vm_x64.dasc
> +++ b/src/vm_x64.dasc
> @@ -345,6 +345,35 @@
>  |  mov dword [DISPATCH+DISPATCH_GL(vmstate)], ~LJ_VMST_..st
>  |.endmacro
>  |
> +|// Stash interpreter's internal base and enter LFUNC VM state.
> +|// PROFILER: Each time profiler sees LFUNC state, it will inspect [BASE-1]

I suppose, that this is not valid for GC64. LuaJIT uses 2-slot frame
info here. The func value is in BASE - 2 slot (see <src/lj_frame.h> for
details). So this part should be adjusted.

| (gdb) f 0
| #0  lj_cf_print (L=0x7ffff7c83378) at src/lib_base.c:486
| 486       ptrdiff_t i, nargs = L->top - L->base;
|
| (gdb) lj-arch
| LJ_64: True, LJ_GC64: True, LJ_DUALNUM: False
| (gdb) p gcval(L->base - 2)->fn.l.ffid
| $11 = 29 '\035' # print
| (gdb) p gcval(L->base - 1)->fn.l.ffid
| $12 = 200 '\310'

Minor: please use well-known special comments instead PROFILER.
XXX [1] looks good here, IMO. Here and below.
| Use XXX in a comment to flag something that is bogus but works.

> +|// expecting to see a valid framelink there. So enter this state only when
> +|// BASE is stable and slots are not moved on the stack.
> +|.macro set_vmstate_lfunc
> +|  set_vmstate INTERP // Guard for non-atomic VM context restoration

Minor: Missed dot in the end of the sentence.
Here and below.

> +|  mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], BASE

BASE stands for 64-bit register.

The error is occured, when build with the following command:

| $ cmake -DCMAKE_BUILD_TYPE=Debug -DLUAJIT_ENABLE_GC64=ON -DLUA_USE_ASSERT=ON -DLUA_USE_APICHECK=ON . && make -j
| ...
| /home/burii/reviews/luajit/sysprof/src/vm_x64.dasc:761: error: mixed operand size in `mov xd,rq':
|   |  set_vmstate_lfunc                  // LFUNC after KBASE restoration.
|   |    mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], BASE       [MACRO set_vmstate_lfunc (0)]

> +|  set_vmstate LFUNC
> +|.endmacro
> +|
> +|// Stash ID of the fast function about to be executed and enter FFUNC VM state.
> +|// PROFILER: Each time profiler sees FFUNC state, it will write ffid
> +|// to the profile stream.
> +|.macro set_vmstate_ffunc
> +|  set_vmstate INTERP // Guard for non-atomic VM context restoration
> +|  mov XCHGd, dword [BASE-8]

XCHGd register is not defined in <vm_x64.dasm>.

| /home/burii/reviews/luajit/sysprof/src/vm_x64.dasc:501: error: bad operand mode in `mov i?,i?':
|  |  mov XCHGd, L:RBa->base
| ...

Also, BASE stands for 64-bit register.
| ...
| /home/burii/reviews/luajit/sysprof/src/vm_x64.dasc:467: error: mixed operand size in `mov xd,rq':
|   |  set_vmstate_cfunc
|   |    mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], BASE       [MACRO set_vmstate_cfunc (0)]

> +|  mov dword [DISPATCH+DISPATCH_GL(top_frame.ffid)], XCHGd
> +|  set_vmstate FFUNC
> +|.endmacro
> +|
> +|// Stash address of the C function about to be executed and enter CFUNC VM state.
> +|// PROFILER: Each time profiler sees CFUNC state, it will write this address
> +|// to the profile stream.
> +|.macro set_vmstate_cfunc
> +|  set_vmstate INTERP // Guard for non-atomic VM context restoration
> +|  mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], BASE

Ditto.

> +|  set_vmstate CFUNC
> +|.endmacro
> +|
>  |// Uses TMPRd (r10d).
>  |.macro save_vmstate
>  |.if not WIN
> @@ -435,7 +464,7 @@ static void build_subroutines(BuildCtx *ctx)

<snipped>

> @@ -467,6 +496,9 @@ static void build_subroutines(BuildCtx *ctx)
>    |  xor eax, eax			// Ok return status for vm_pcall.
>    |
>    |->vm_leave_unw:
> +  |  set_vmstate INTERP // Guard for non-atomic VM context restoration
> +  |  mov XCHGd, L:RBa->base

RBa register is not defined in <vm_x64.dasm>.

> +  |  mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], XCHGd
>    |  // DISPATCH required to set properly.
>    |  restore_vmstate			// Caveat: uses TMPRd (r10d).
>    |  restoreregs
> @@ -725,7 +757,7 @@ static void build_subroutines(BuildCtx *ctx)

<snipped>

> diff --git a/src/vm_x86.dasc b/src/vm_x86.dasc
> index ab8e6f27..222754fe 100644
> --- a/src/vm_x86.dasc
> +++ b/src/vm_x86.dasc
> @@ -443,6 +443,35 @@
>  |  mov dword [DISPATCH+DISPATCH_GL(vmstate)], ~LJ_VMST_..st
>  |.endmacro
>  |
> +|// Stash interpreter's internal base and enter LFUNC VM state.
> +|// PROFILER: Each time profiler sees LFUNC state, it will inspect [BASE-1]
> +|// expecting to see a valid framelink there. So enter this state only when
> +|// BASE is stable and slots are not moved on the stack.
> +|.macro set_vmstate_lfunc
> +|  set_vmstate INTERP // Guard for non-atomic VM context restoration
> +|  mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], BASE
> +|  set_vmstate LFUNC
> +|.endmacro
> +|
> +|// Stash ID of the fast function about to be executed and enter FFUNC VM state.
> +|// PROFILER: Each time profiler sees FFUNC state, it will write ffid
> +|// to the profile stream.
> +|.macro set_vmstate_ffunc
> +|  set_vmstate INTERP // Guard for non-atomic VM context restoration
> +|  mov XCHGd, dword [BASE-8]

This register is defined only for x64 architecture.
The error is occured, when build with the following command:

| $ make CC="gcc -m32" -f Makefile.original -j
| ...
| DYNASM    host/buildvm_arch.h
| vm_x86.dasc:632: error: bad operand mode in `mov i?,x?':
|   |  mov XCHGd, L:RBa->base
| vm_x86.dasc:1544: error: bad operand mode in `mov i?,xd':
|   |.ffunc_1 assert
|   |    mov XCHGd, dword [BASE-8]        [MACRO set_vmstate_ffunc (0)]
| vm_x86.dasc:1572: error: bad operand mode in `mov i?,xd':
|   |.ffunc_1 type
|   |    mov XCHGd, dword [BASE-8]        [MACRO set_vmstate_ffunc (0)]
| vm_x86.dasc:1599: error: bad operand mode in `mov i?,xd':
|   |.ffunc_1 getmetatable
|   |    mov XCHGd, dword [BASE-8]        [MACRO set_vmstate_ffunc (0)]

Side note: Unfortunately it's impossible for now forcify x32 build via
cmake. It requires to proxy CMAKE_C_FLAGS to macro in LuaJITUtils, IINM.

> +|  mov dword [DISPATCH+DISPATCH_GL(top_frame.ffid)], XCHGd

Please clarify the following things:

1) IINM, we save not ffid but the GCref for this function (since we not
   load ffid field from function).
2) Why do we store 64 bytes instead 8? Yes, the struct is not packed and
   there is a hole in it, so it works correct. But it is a little bit
   confusing. Also, why can't we write BASE here too and inspect ffid
   from this base later?

> +|  set_vmstate FFUNC
> +|.endmacro
> +|
> +|// Stash address of the C function about to be executed and enter CFUNC VM state.
> +|// PROFILER: Each time profiler sees CFUNC state, it will write this address
> +|// to the profile stream.
> +|.macro set_vmstate_cfunc
> +|  set_vmstate INTERP // Guard for non-atomic VM context restoration
> +|  mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], BASE
> +|  set_vmstate CFUNC
> +|.endmacro
> +|
>  |// Uses spilled ecx on x86 or XCHGd (r11d) on x64.
>  |.macro save_vmstate
>  |.if not WIN
> @@ -560,7 +589,7 @@ static void build_subroutines(BuildCtx *ctx)

<snipped>

> @@ -599,6 +628,9 @@ static void build_subroutines(BuildCtx *ctx)
>    |  xor eax, eax			// Ok return status for vm_pcall.
>    |
>    |->vm_leave_unw:
> +  |  set_vmstate INTERP // Guard for non-atomic VM context restoration
> +  |  mov XCHGd, L:RBa->base

AFAIKS, there is no garantee, that L:RBa is set up to `lua_State *`.

For example, during vm_unwind_c_eh RB register is set to
`global_State *`.

|->vm_unwind_c_eh:			// Landing pad for external unwinder.
|  mov L:DISPATCH, SAVE_L
|  mov GL:RB, L:DISPATCH->glref
|  mov dword GL:RB->cur_L, L:DISPATCH
|  mov dword GL:RB->vmstate, ~LJ_VMST_CFUNC
|  mov DISPATCH, L:DISPATCH->glref	// Setup pointer to dispatch table.
|  add DISPATCH, GG_G2DISP
|  jmp ->vm_leave_unw

> +  |  mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], XCHGd
>    |  // DISPATCH required to set properly.
>    |  restore_vmstate			// Caveat: on x64 uses XCHGd (r11d).
>    |  restoreregs
> @@ -934,7 +966,7 @@ static void build_subroutines(BuildCtx *ctx)

<snipped>

> -- 
> 2.33.0
> 

[1]: https://www.oracle.com/java/technologies/javase/codeconventions-programmingpractices.html

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/7] core: separate the profiling timer from lj_profile
  2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 2/7] core: separate the profiling timer from lj_profile Maxim Kokryashkin via Tarantool-patches
@ 2021-09-21 11:13   ` Sergey Kaplun via Tarantool-patches
  2021-09-23 11:37     ` Mikhail Shishatskiy via Tarantool-patches
  0 siblings, 1 reply; 19+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-09-21 11:13 UTC (permalink / raw)
  To: Maxim Kokryashkin, Mikhail Shishatskiy; +Cc: tarantool-patches

Hi, Maxim, Mikhail!

Thanks for the patch!

Please consider my comments below.

On 08.09.21, Maxim Kokryashkin wrote:
> From: Mikhail Shishatskiy <m.shishatskiy@tarantool.org>
> 
> This patch makes timer machinery in lj_profile self-reliant by
> introducing lj_profile_timer structure and start/stop interface
> for it. This makes the timer useful for other sampling profiling

Minor: s/This makes the timer useful/This timer is useful/

> features.
> 
> Part of tarantool/tarantool#781
> ---
>  src/CMakeLists.txt        |   1 +
>  src/Makefile.dep.original |   5 +-
>  src/lj_profile.c          | 178 +++-----------------------------------
>  src/lj_profile_timer.c    | 126 +++++++++++++++++++++++++++
>  src/lj_profile_timer.h    |  83 ++++++++++++++++++

Amalgamated build should be updated as well.

>  5 files changed, 227 insertions(+), 166 deletions(-)
>  create mode 100644 src/lj_profile_timer.c
>  create mode 100644 src/lj_profile_timer.h
> 
> diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
> index 809aac68..50c23cae 100644
> --- a/src/CMakeLists.txt
> +++ b/src/CMakeLists.txt
> @@ -68,6 +68,7 @@ make_source_list(SOURCES_UTILS
>  make_source_list(SOURCES_PROFILER
>    SOURCES
>      lj_memprof.c
> +    lj_profile_timer.c
>      lj_profile.c

Nit: lj_profile.c should go first (alphabetically).

>  )
>  
> diff --git a/src/Makefile.dep.original b/src/Makefile.dep.original

<snipped>

> diff --git a/src/lj_profile.c b/src/lj_profile.c
> index 7b09a63a..e7e1a303 100644
> --- a/src/lj_profile.c
> +++ b/src/lj_profile.c
> @@ -19,66 +19,19 @@
>  #include "lj_trace.h"
>  #endif
>  #include "lj_profile.h"
> +#include "lj_profile_timer.h"
>  
>  #include "luajit.h"
>  
> -#if LJ_PROFILE_SIGPROF

<snipped>

> -#endif
> -
>  /* Profiler state. */
>  typedef struct ProfileState {
>    global_State *g;		/* VM state that started the profiler. */
>    luaJIT_profile_callback cb;	/* Profiler callback. */
>    void *data;			/* Profiler callback data. */
>    SBuf sb;			/* String buffer for stack dumps. */
> -  int interval;			/* Sample interval in milliseconds. */
>    int samples;			/* Number of samples for next callback. */
>    int vmstate;			/* VM state when profile timer triggered. */
> -#if LJ_PROFILE_SIGPROF
> -  struct sigaction oldsa;	/* Previous SIGPROF state. */
> -#elif LJ_PROFILE_PTHREAD
> -  pthread_mutex_t lock;		/* g->hookmask update lock. */
> -  pthread_t thread;		/* Timer thread. */
> -  int abort;			/* Abort timer thread. */
> -#elif LJ_PROFILE_WTHREAD
> -#if LJ_TARGET_WINDOWS
> -  HINSTANCE wmm;		/* WinMM library handle. */
> -  WMM_TPFUNC wmm_tbp;		/* WinMM timeBeginPeriod function. */
> -  WMM_TPFUNC wmm_tep;		/* WinMM timeEndPeriod function. */
> -#endif
> -  CRITICAL_SECTION lock;	/* g->hookmask update lock. */
> -  HANDLE thread;		/* Timer thread. */
> -  int abort;			/* Abort timer thread. */
> -#endif
> +  lj_profile_timer timer;  /* Profiling timer */

Minor: Use tabs for allignment here.

>  } ProfileState;
>  
>  /* Sadly, we have to use a static profiler state.
> @@ -145,6 +98,7 @@ void LJ_FASTCALL lj_profile_interpreter(lua_State *L)
>    profile_unlock(ps);
>  }
>  
> +

This change is excess.

>  /* Trigger profile hook. Asynchronous call from OS-specific profile timer. */
>  static void profile_trigger(ProfileState *ps)
>  {
> @@ -168,129 +122,21 @@ static void profile_trigger(ProfileState *ps)
>    profile_unlock(ps);
>  }
>  
> -/* -- OS-specific profile timer handling ---------------------------------- */
> -
>  #if LJ_PROFILE_SIGPROF
>  
> -/* SIGPROF handler. */
> -static void profile_signal(int sig)
> +static void profile_handler(int sig, siginfo_t *info, void* ctx)

Why do we need to change signature of this function?

Nit: s/void* ctx/void *ctx/

>  {
>    UNUSED(sig);
> +  UNUSED(info);
> +  UNUSED(ctx);
>    profile_trigger(&profile_state);
>  }
>  
> -/* Start profiling timer. */
> -static void profile_timer_start(ProfileState *ps)
> -{
> -  int interval = ps->interval;
> -  struct itimerval tm;
> -  struct sigaction sa;
> -  tm.it_value.tv_sec = tm.it_interval.tv_sec = interval / 1000;
> -  tm.it_value.tv_usec = tm.it_interval.tv_usec = (interval % 1000) * 1000;
> -  setitimer(ITIMER_PROF, &tm, NULL);
> -  sa.sa_flags = SA_RESTART;
> -  sa.sa_handler = profile_signal;
> -  sigemptyset(&sa.sa_mask);
> -  sigaction(SIGPROF, &sa, &ps->oldsa);
> -}
> -
> -/* Stop profiling timer. */
> -static void profile_timer_stop(ProfileState *ps)
> -{

<snipped>

> -}
> -
> -#elif LJ_PROFILE_PTHREAD
> -
> -/* POSIX timer thread. */
> -static void *profile_thread(ProfileState *ps)
> -{
> -  int interval = ps->interval;
> -#if !LJ_TARGET_PS3
> -  struct timespec ts;
> -  ts.tv_sec = interval / 1000;
> -  ts.tv_nsec = (interval % 1000) * 1000000;
> -#endif
> -  while (1) {
> -#if LJ_TARGET_PS3
> -    sys_timer_usleep(interval * 1000);
>  #else
> -    nanosleep(&ts, NULL);
> -#endif
> -    if (ps->abort) break;
> -    profile_trigger(ps);
> -  }
> -  return NULL;
> -}
> -
> -/* Start profiling timer thread. */
> -static void profile_timer_start(ProfileState *ps)
> -{

<snipped>

> -}
>  
> -/* Stop profiling timer thread. */
> -static void profile_timer_stop(ProfileState *ps)
> +static void profile_handler()
>  {

<snipped>

> -}
> -
> -#elif LJ_PROFILE_WTHREAD
> -
> -/* Windows timer thread. */
> -static DWORD WINAPI profile_thread(void *psx)
> -{
> -  ProfileState *ps = (ProfileState *)psx;
> -  int interval = ps->interval;
> -#if LJ_TARGET_WINDOWS
> -  ps->wmm_tbp(interval);
> -#endif
> -  while (1) {
> -    Sleep(interval);
> -    if (ps->abort) break;
> -    profile_trigger(ps);
> -  }
> -#if LJ_TARGET_WINDOWS
> -  ps->wmm_tep(interval);
> -#endif
> -  return 0;
> -}
> -
> -/* Start profiling timer thread. */
> -static void profile_timer_start(ProfileState *ps)
> -{
> -#if LJ_TARGET_WINDOWS
> -  if (!ps->wmm) {  /* Load WinMM library on-demand. */
> -    ps->wmm = LoadLibraryExA("winmm.dll", NULL, 0);
> -    if (ps->wmm) {
> -      ps->wmm_tbp = (WMM_TPFUNC)GetProcAddress(ps->wmm, "timeBeginPeriod");
> -      ps->wmm_tep = (WMM_TPFUNC)GetProcAddress(ps->wmm, "timeEndPeriod");
> -      if (!ps->wmm_tbp || !ps->wmm_tep) {
> -	ps->wmm = NULL;
> -	return;
> -      }
> -    }
> -  }
> -#endif
> -  InitializeCriticalSection(&ps->lock);
> -  ps->abort = 0;
> -  ps->thread = CreateThread(NULL, 0, profile_thread, ps, 0, NULL);
> -}
> -
> -/* Stop profiling timer thread. */
> -static void profile_timer_stop(ProfileState *ps)
> -{

<snipped>

>  }
>  
>  #endif
> @@ -327,12 +173,14 @@ LUA_API void luaJIT_profile_start(lua_State *L, const char *mode,
>      if (ps->g) return;  /* Profiler in use by another VM. */
>    }
>    ps->g = G(L);
> -  ps->interval = interval;
>    ps->cb = cb;
>    ps->data = data;
>    ps->samples = 0;
>    lj_buf_init(L, &ps->sb);
> -  profile_timer_start(ps);
> +

Nit: I think this empty line is excess.
Feel free to ignore.

> +  ps->timer.opt.interval_msec = interval;
> +  ps->timer.opt.handler = profile_handler;
> +  lj_profile_timer_start(&ps->timer);
>  }
>  
>  /* Stop profiling. */
> @@ -341,7 +189,7 @@ LUA_API void luaJIT_profile_stop(lua_State *L)

<snipped>

> diff --git a/src/lj_profile_timer.c b/src/lj_profile_timer.c
> new file mode 100644
> index 00000000..ce420a85
> --- /dev/null
> +++ b/src/lj_profile_timer.c
> @@ -0,0 +1,126 @@
> +/*
> +** Simple profiling timer extracted from inbuilt luajit profiler

Nit: I suppose we can omit the part about extracting.

> +*/
> +
> +#define lj_profile_timer_c
> +#define LUA_CORE
> +
> +#include "lj_profile_timer.h"
> +
> +#if LJ_HASPROFILE
> +
> +#if LJ_PROFILE_SIGPROF
> +
> +/* Start profiling timer. */
> +void lj_profile_timer_start(lj_profile_timer *timer) {

Minor: use the newline for curly bracket after function declaration.
Here and below.

> +  int interval = timer->opt.interval_msec;

Minor: looks like interval may be declared as `const int`.
Feel free to ignore.

> +  struct itimerval tm;
> +  struct sigaction sa;
> +  tm.it_value.tv_sec = tm.it_interval.tv_sec = interval / 1000;
> +  tm.it_value.tv_usec = tm.it_interval.tv_usec = (interval % 1000) * 1000;
> +  setitimer(ITIMER_PROF, &tm, NULL);
> +  sa.sa_flags = SA_RESTART | SA_SIGINFO;
> +  sa.sa_sigaction = timer->opt.handler;

Why do we need to forcify using of 3 argument handler?

| -  sa.sa_flags = SA_RESTART;
| -  sa.sa_handler = profile_signal;

Looks like the old format is enough.

> +  sigemptyset(&sa.sa_mask);
> +  sigaction(SIGPROF, &sa, &timer->oldsa);
> +}
> +
> +/* Stop profiling timer. */
> +void lj_profile_timer_stop(lj_profile_timer *timer) {

<snipped>

> +}
> +
> +#elif LJ_PROFILE_PTHREAD
> +
> +/* POSIX timer thread. */
> +static void *profile_thread(lj_timer *timer) {

Its lj_profile_timer, not lj_timer :).

Also, `timer_thread()` name is looks better here, as far as it only
handler timer in another thread.
Thoughts?

> +  int interval = timer->opt.interval_msec;
> +#if !LJ_TARGET_PS3
> +  struct timespec ts;
> +  ts.tv_sec = interval / 1000;
> +  ts.tv_nsec = (interval % 1000) * 1000000;
> +#endif
> +  while (1) {
> +#if LJ_TARGET_PS3
> +    sys_timer_usleep(interval * 1000);
> +#else
> +    nanosleep(&ts, NULL);
> +#endif
> +    if (timer->abort)
> +      break;


Minor: I suggest to save old code style here, i.e. without newline.

> +    timer->opt.handler();
> +  }
> +  return NULL;
> +}
> +
> +/* Start profiling timer thread. */
> +void lj_profile_timer_start(lj_profile_timer *timer) {
> +  pthread_mutex_init(&timer->lock, 0);
> +  timer->abort = 0;
> +  pthread_create(&timer->thread, NULL, (void *(*)(void *))profile_thread,
> +                 timer);

Minor: Use tabs for allignment here.

Side note: may be `t` stands for timer is good enough here.
Feel free to ignore.

> +}
> +
> +/* Stop profiling timer thread. */
> +void lj_profile_timer_stop(lj_profile_timer *timer) {

<snipped>

> +}
> +
> +#elif LJ_PROFILE_WTHREAD
> +
> +/* Windows timer thread. */
> +static DWORD WINAPI profile_thread(void *timerx) {

Also, `timer_thread()` name is looks better here, as far as it only
handler timer in another thread.
Thoughts?

> +  lj_profile_timer *timer = (lj_profile_timer *)timerx;
> +  int interval = timer->opt.interval_msec;
> +#if LJ_TARGET_WINDOWS
> +  timer->wmm_tbp(interval);
> +#endif
> +  while (1) {
> +    Sleep(interval);
> +    if (timer->abort)
> +      break;

Minor: I suggest to save old code style here, i.e. without newline.

> +    timer->opt.handler();
> +  }
> +#if LJ_TARGET_WINDOWS
> +  timer->wmm_tep(interval);
> +#endif
> +  return 0;
> +}
> +
> +/* Start profiling timer thread. */
> +void lj_profile_timer_start(lj_profile_timer *timer) {
> +#if LJ_TARGET_WINDOWS
> +  if (!timer->wmm) { /* Load WinMM library on-demand. */
> +    timer->wmm = LoadLibraryExA("winmm.dll", NULL, 0);
> +    if (timer->wmm) {
> +      timer->wmm_tbp =
> +          (WMM_TPFUNC)GetProcAddress(timer->wmm, "timeBeginPeriod");
> +      timer->wmm_tep = (WMM_TPFUNC)GetProcAddress(timer->wmm, "timeEndPeriod");
> +      if (!timer->wmm_tbp || !timer->wmm_tep) {
> +        timer->wmm = NULL;
> +        return;

LuaJIT use quoter-half tabs codestyle. Please, return 1 tab instead 8
spaces.

> +      }
> +    }
> +  }
> +#endif
> +  InitializeCriticalSection(&timer->lock);
> +  timer->abort = 0;
> +  timer->thread = CreateThread(NULL, 0, profile_thread, timer, 0, NULL);
> +}
> +
> +/* Stop profiling timer thread. */
> +void lj_profile_timer_stop(lj_profile_timer *timer) {

<snipped>

> +}
> +
> +#endif
> +
> +#endif  /* LJ_HASPROFILE */
> diff --git a/src/lj_profile_timer.h b/src/lj_profile_timer.h
> new file mode 100644
> index 00000000..af9eabce
> --- /dev/null
> +++ b/src/lj_profile_timer.h
> @@ -0,0 +1,83 @@
> +/*
> +** Simple profiling timer extracted from inbuilt luajit profiler

Nit: I suppose we can omit the part about extracting.

> +*/
> +
> +#ifndef _LJ_PROFILE_TIMER_H
> +#define _LJ_PROFILE_TIMER_H
> +
> +#include <stdint.h>

It's better to use "lj_def.h" here (see it to explanation).

> +
> +#include "lj_arch.h"
> +
> +#if LJ_HASPROFILE
> +
> +#if LJ_PROFILE_SIGPROF

<snipped>

> +#endif
> +
> +typedef struct {
> +#if LJ_PROFILE_SIGPROF
> +  void (*handler)(int, siginfo_t*, void*);
> +#else
> +  void (*handler)(void);
> +#endif

Don't get why do we need different handlers signature here.
If we use the old one i.e. void (*)(void).

> +  uint32_t interval_msec;
> +} lj_profile_timer_opt;

Why do we need a separate structure for these options?

> +

Minor: Please fix mess with tabs and spaces below.

> +typedef struct {
> +  lj_profile_timer_opt opt;
> +#if LJ_PROFILE_SIGPROF
> +  struct sigaction oldsa;  /* Previous SIGPROF state. */
> +#elif LJ_PROFILE_PTHREAD
> +  pthread_mutex_t lock;		 /* g->hookmask update lock. */
> +  pthread_t thread;		     /* Timer thread. */
> +  int abort;			         /* Abort timer thread. */
> +#elif LJ_PROFILE_WTHREAD
> +#if LJ_TARGET_WINDOWS
> +  HINSTANCE wmm;		       /* WinMM library handle. */
> +  WMM_TPFUNC wmm_tbp;		   /* WinMM timeBeginPeriod function. */
> +  WMM_TPFUNC wmm_tep;		   /* WinMM timeEndPeriod function. */
> +#endif
> +  CRITICAL_SECTION lock;	 /* g->hookmask update lock. */
> +  HANDLE thread;		       /* Timer thread. */
> +  int abort;			         /* Abort timer thread. */
> +#endif
> +} lj_profile_timer;
> +
> +/* Start profiling timer */

Typo: missed dot at the end of the sentence.

> +void lj_profile_timer_start(lj_profile_timer *timer);
> +
> +/* Stop profiling timer */

Typo: missed dot at the end of the sentence.

> +void lj_profile_timer_stop(lj_profile_timer *timer);
> +
> +#endif  /* LJ_HASPROFILE */
> +
> +#endif
> -- 
> 2.33.0
> 

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 3/7] memprof: move symtab to a separate module
  2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 3/7] memprof: move symtab to a separate module Maxim Kokryashkin via Tarantool-patches
@ 2021-09-22  7:51   ` Sergey Kaplun via Tarantool-patches
  2021-09-22  8:14     ` Sergey Kaplun via Tarantool-patches
  2021-09-23 14:51   ` [Tarantool-patches] [PATCH luajit v2] " Maxim Kokryashkin via Tarantool-patches
  1 sibling, 1 reply; 19+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-09-22  7:51 UTC (permalink / raw)
  To: Maxim Kokryashkin; +Cc: tarantool-patches

Hi, Maxim!

Thanks for the patch!

Please consider my comments below.

On 08.09.21, Maxim Kokryashkin wrote:
> Considering the symbol table format, it is obvious that it is suitable
> not only for memprof, but for other profiling modules too. This commit
> moves the symbol table to the separate module, so modules other than
> memprof will be able to access it.
> 
> Part of tarantool/tarantool#781
> ---
>  src/CMakeLists.txt        |  1 +
>  src/Makefile.dep.original |  4 +++-
>  src/lj_memprof.c          | 37 ++-----------------------------
>  src/lj_memprof.h          | 36 ------------------------------
>  src/lj_symtab.c           | 36 ++++++++++++++++++++++++++++++
>  src/lj_symtab.h           | 46 +++++++++++++++++++++++++++++++++++++++

Amalgamated build should be updated as well.

>  6 files changed, 88 insertions(+), 72 deletions(-)
>  create mode 100644 src/lj_symtab.c
>  create mode 100644 src/lj_symtab.h
> 
> diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
> index 50c23cae..0f3d888f 100644
> --- a/src/CMakeLists.txt
> +++ b/src/CMakeLists.txt
> @@ -60,6 +60,7 @@ make_source_list(SOURCES_UTILS
>    SOURCES
>      lj_alloc.c
>      lj_char.c
> +    lj_symtab.c

Minor: Don't sure that it is SOURCES_UTILS, but more likely
SOURCES_PROFILER.

>      lj_utils_leb128.c
>      lj_vmmath.c
>      lj_wbuf.c
> diff --git a/src/Makefile.dep.original b/src/Makefile.dep.original
> index da5abf0d..b7d82719 100644
> --- a/src/Makefile.dep.original
> +++ b/src/Makefile.dep.original

<skipped>

> diff --git a/src/lj_memprof.c b/src/lj_memprof.c
> index 2c1ef3b8..fba688bc 100644
> --- a/src/lj_memprof.c
> +++ b/src/lj_memprof.c
> @@ -19,40 +19,7 @@
>  #include "lj_frame.h"
>  #include "lj_debug.h"
>  
> -/* --------------------------------- Symtab --------------------------------- */
> -
> -static const unsigned char ljs_header[] = {'l', 'j', 's', LJS_CURRENT_VERSION,
> -					   0x0, 0x0, 0x0};
> -

<snipped>

> @@ -249,7 +216,7 @@ int lj_memprof_start(struct lua_State *L, const struct lj_memprof_options *opt)
>  
>    /* Init output. */
>    lj_wbuf_init(&mp->out, mp_opt->writer, mp_opt->ctx, mp_opt->buf, mp_opt->len);
> -  dump_symtab(&mp->out, mp->g);
> +  lj_symtab_dump(&mp->out, mp->g);
>  
>    /* Write prologue. */
>    lj_wbuf_addn(&mp->out, ljm_header, ljm_header_len);
> diff --git a/src/lj_memprof.h b/src/lj_memprof.h
> index 3417475d..33a9b869 100644
> --- a/src/lj_memprof.h
> +++ b/src/lj_memprof.h
> @@ -16,42 +16,6 @@
>  #include "lj_def.h"
>  #include "lj_wbuf.h"
>  
> -#define LJS_CURRENT_VERSION 0x1

This a little bit confusing for me:
Symtab version in tarantool branch is 0, not 1.

> -
> -/*
> -** symtab format:

<snipped>

> -*/
> -
> -#define SYMTAB_LFUNC ((uint8_t)0)
> -#define SYMTAB_FINAL ((uint8_t)0x80)
>  
>  #define LJM_CURRENT_FORMAT_VERSION 0x01
>  
> diff --git a/src/lj_symtab.c b/src/lj_symtab.c
> new file mode 100644
> index 00000000..93e19353
> --- /dev/null
> +++ b/src/lj_symtab.c

Minor: please add some module description here. Also, COPYRIGHT should
be duplicated here too. See <lj_memprof.c> for example.

> @@ -0,0 +1,36 @@
> +#define lj_symtab_c
> +#define LUA_CORE
> +
> +#include "lj_symtab.h"
> +
> +static const unsigned char ljs_header[] = {'l', 'j', 's', LJS_CURRENT_VERSION,
> +                                           0x0, 0x0, 0x0};

Please use tabs for alignment here.

> +
> +void lj_symtab_dump(struct lj_wbuf *out, const struct global_State *g)
> +{

<snipped>

> +}
> +

Nit: Excess empty line.

> diff --git a/src/lj_symtab.h b/src/lj_symtab.h
> new file mode 100644
> index 00000000..9ea2c949
> --- /dev/null
> +++ b/src/lj_symtab.h
> @@ -0,0 +1,46 @@

Minor: please add some module description here. Also, COPYRIGHT should
be duplicated here too. See <lj_memprof.h> for example.

> +#ifndef LJ_SYMTAB
> +#define LJ_SYMTAB

Minor: It should be _LJ_SYMTAB_H, like for other header files.

> +
> +#include "lj_wbuf.h"
> +#include "lj_obj.h"
> +
> +/*
> +** symtab format:

<snipped>

> +*/
> +
> +#define LJS_CURRENT_VERSION 0x1

Nit: I suppose this can be mention before format description.
I.e. it will be look like the following:
This symtab version has the following format: ...
Feel free to ignore.

> +
> +#define SYMTAB_LFUNC ((uint8_t)0)
> +#define SYMTAB_FINAL ((uint8_t)0x80)
> +

Minor: please add some comments for function's description.

> +void lj_symtab_dump(struct lj_wbuf *out, const struct global_State *g);
> +
> +#endif
> -- 
> 2.33.0
> 

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 3/7] memprof: move symtab to a separate module
  2021-09-22  7:51   ` Sergey Kaplun via Tarantool-patches
@ 2021-09-22  8:14     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 19+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-09-22  8:14 UTC (permalink / raw)
  To: Maxim Kokryashkin, tarantool-patches

On 22.09.21, Sergey Kaplun via Tarantool-patches wrote:

<snipped>

> > diff --git a/src/lj_memprof.h b/src/lj_memprof.h
> > index 3417475d..33a9b869 100644
> > --- a/src/lj_memprof.h
> > +++ b/src/lj_memprof.h
> > @@ -16,42 +16,6 @@
> >  #include "lj_def.h"
> >  #include "lj_wbuf.h"
> >  
> > -#define LJS_CURRENT_VERSION 0x1
> 
> This a little bit confusing for me:
> Symtab version in tarantool branch is 0, not 1.

Sorry, my bad -- still 0x1 :)
Please ignore this comment.

> 

<snipped>

> 
> -- 
> Best regards,
> Sergey Kaplun

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/7] core: separate the profiling timer from lj_profile
  2021-09-21 11:13   ` Sergey Kaplun via Tarantool-patches
@ 2021-09-23 11:37     ` Mikhail Shishatskiy via Tarantool-patches
  0 siblings, 0 replies; 19+ messages in thread
From: Mikhail Shishatskiy via Tarantool-patches @ 2021-09-23 11:37 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: max.kokryashkin, tarantool-patches

Hi, Sergey!
Thanks for the review!

On 21.09.2021 14:13, Sergey Kaplun wrote:
>Hi, Maxim, Mikhail!
>
>Thanks for the patch!
>
>Please consider my comments below.
>
>On 08.09.21, Maxim Kokryashkin wrote:
>> From: Mikhail Shishatskiy <m.shishatskiy@tarantool.org>
>>
>> This patch makes timer machinery in lj_profile self-reliant by
>> introducing lj_profile_timer structure and start/stop interface
>> for it. This makes the timer useful for other sampling profiling
>
>Minor: s/This makes the timer useful/This timer is useful/

Fixed.

>
>> features.
>>
>> Part of tarantool/tarantool#781
>> ---
>>  src/CMakeLists.txt        |   1 +
>>  src/Makefile.dep.original |   5 +-
>>  src/lj_profile.c          | 178 +++-----------------------------------
>>  src/lj_profile_timer.c    | 126 +++++++++++++++++++++++++++
>>  src/lj_profile_timer.h    |  83 ++++++++++++++++++
>
>Amalgamated build should be updated as well.

Fixed.

>
>>  5 files changed, 227 insertions(+), 166 deletions(-)
>>  create mode 100644 src/lj_profile_timer.c
>>  create mode 100644 src/lj_profile_timer.h
>>
>> diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
>> index 809aac68..50c23cae 100644
>> --- a/src/CMakeLists.txt
>> +++ b/src/CMakeLists.txt
>> @@ -68,6 +68,7 @@ make_source_list(SOURCES_UTILS
>>  make_source_list(SOURCES_PROFILER
>>    SOURCES
>>      lj_memprof.c
>> +    lj_profile_timer.c
>>      lj_profile.c
>
>Nit: lj_profile.c should go first (alphabetically).

Fixed.

>
>>  )
>>
>> diff --git a/src/Makefile.dep.original b/src/Makefile.dep.original
>
><snipped>
>
>> diff --git a/src/lj_profile.c b/src/lj_profile.c
>> index 7b09a63a..e7e1a303 100644
>> --- a/src/lj_profile.c
>> +++ b/src/lj_profile.c
>> @@ -19,66 +19,19 @@
>>  #include "lj_trace.h"
>>  #endif
>>  #include "lj_profile.h"
>> +#include "lj_profile_timer.h"
>>
>>  #include "luajit.h"
>>
>> -#if LJ_PROFILE_SIGPROF
>
><snipped>
>
>> -#endif
>> -
>>  /* Profiler state. */
>>  typedef struct ProfileState {
>>    global_State *g;		/* VM state that started the profiler. */
>>    luaJIT_profile_callback cb;	/* Profiler callback. */
>>    void *data;			/* Profiler callback data. */
>>    SBuf sb;			/* String buffer for stack dumps. */
>> -  int interval;			/* Sample interval in milliseconds. */
>>    int samples;			/* Number of samples for next callback. */
>>    int vmstate;			/* VM state when profile timer triggered. */
>> -#if LJ_PROFILE_SIGPROF
>> -  struct sigaction oldsa;	/* Previous SIGPROF state. */
>> -#elif LJ_PROFILE_PTHREAD
>> -  pthread_mutex_t lock;		/* g->hookmask update lock. */
>> -  pthread_t thread;		/* Timer thread. */
>> -  int abort;			/* Abort timer thread. */
>> -#elif LJ_PROFILE_WTHREAD
>> -#if LJ_TARGET_WINDOWS
>> -  HINSTANCE wmm;		/* WinMM library handle. */
>> -  WMM_TPFUNC wmm_tbp;		/* WinMM timeBeginPeriod function. */
>> -  WMM_TPFUNC wmm_tep;		/* WinMM timeEndPeriod function. */
>> -#endif
>> -  CRITICAL_SECTION lock;	/* g->hookmask update lock. */
>> -  HANDLE thread;		/* Timer thread. */
>> -  int abort;			/* Abort timer thread. */
>> -#endif
>> +  lj_profile_timer timer;  /* Profiling timer */
>
>Minor: Use tabs for allignment here.

Fixed.

>
>>  } ProfileState;
>>
>>  /* Sadly, we have to use a static profiler state.
>> @@ -145,6 +98,7 @@ void LJ_FASTCALL lj_profile_interpreter(lua_State *L)
>>    profile_unlock(ps);
>>  }
>>
>> +
>
>This change is excess.

Fixed.

>
>>  /* Trigger profile hook. Asynchronous call from OS-specific profile timer. */
>>  static void profile_trigger(ProfileState *ps)
>>  {
>> @@ -168,129 +122,21 @@ static void profile_trigger(ProfileState *ps)
>>    profile_unlock(ps);
>>  }
>>
>> -/* -- OS-specific profile timer handling ---------------------------------- */
>> -
>>  #if LJ_PROFILE_SIGPROF
>>
>> -/* SIGPROF handler. */
>> -static void profile_signal(int sig)
>> +static void profile_handler(int sig, siginfo_t *info, void* ctx)
>
>Why do we need to change signature of this function?

We need this to make use of extra arguments inside the sysprof.
Yes, we can make this optional, but this will lead to the complicated
API.

>
>Nit: s/void* ctx/void *ctx/

Fixed.

>
>>  {
>>    UNUSED(sig);
>> +  UNUSED(info);
>> +  UNUSED(ctx);
>>    profile_trigger(&profile_state);
>>  }
>>
>> -/* Start profiling timer. */
>> -static void profile_timer_start(ProfileState *ps)
>> -{
>> -  int interval = ps->interval;
>> -  struct itimerval tm;
>> -  struct sigaction sa;
>> -  tm.it_value.tv_sec = tm.it_interval.tv_sec = interval / 1000;
>> -  tm.it_value.tv_usec = tm.it_interval.tv_usec = (interval % 1000) * 1000;
>> -  setitimer(ITIMER_PROF, &tm, NULL);
>> -  sa.sa_flags = SA_RESTART;
>> -  sa.sa_handler = profile_signal;
>> -  sigemptyset(&sa.sa_mask);
>> -  sigaction(SIGPROF, &sa, &ps->oldsa);
>> -}
>> -
>> -/* Stop profiling timer. */
>> -static void profile_timer_stop(ProfileState *ps)
>> -{
>
><snipped>
>
>> -}
>> -
>> -#elif LJ_PROFILE_PTHREAD
>> -
>> -/* POSIX timer thread. */
>> -static void *profile_thread(ProfileState *ps)
>> -{
>> -  int interval = ps->interval;
>> -#if !LJ_TARGET_PS3
>> -  struct timespec ts;
>> -  ts.tv_sec = interval / 1000;
>> -  ts.tv_nsec = (interval % 1000) * 1000000;
>> -#endif
>> -  while (1) {
>> -#if LJ_TARGET_PS3
>> -    sys_timer_usleep(interval * 1000);
>>  #else
>> -    nanosleep(&ts, NULL);
>> -#endif
>> -    if (ps->abort) break;
>> -    profile_trigger(ps);
>> -  }
>> -  return NULL;
>> -}
>> -
>> -/* Start profiling timer thread. */
>> -static void profile_timer_start(ProfileState *ps)
>> -{
>
><snipped>
>
>> -}
>>
>> -/* Stop profiling timer thread. */
>> -static void profile_timer_stop(ProfileState *ps)
>> +static void profile_handler()
>>  {
>
><snipped>
>
>> -}
>> -
>> -#elif LJ_PROFILE_WTHREAD
>> -
>> -/* Windows timer thread. */
>> -static DWORD WINAPI profile_thread(void *psx)
>> -{
>> -  ProfileState *ps = (ProfileState *)psx;
>> -  int interval = ps->interval;
>> -#if LJ_TARGET_WINDOWS
>> -  ps->wmm_tbp(interval);
>> -#endif
>> -  while (1) {
>> -    Sleep(interval);
>> -    if (ps->abort) break;
>> -    profile_trigger(ps);
>> -  }
>> -#if LJ_TARGET_WINDOWS
>> -  ps->wmm_tep(interval);
>> -#endif
>> -  return 0;
>> -}
>> -
>> -/* Start profiling timer thread. */
>> -static void profile_timer_start(ProfileState *ps)
>> -{
>> -#if LJ_TARGET_WINDOWS
>> -  if (!ps->wmm) {  /* Load WinMM library on-demand. */
>> -    ps->wmm = LoadLibraryExA("winmm.dll", NULL, 0);
>> -    if (ps->wmm) {
>> -      ps->wmm_tbp = (WMM_TPFUNC)GetProcAddress(ps->wmm, "timeBeginPeriod");
>> -      ps->wmm_tep = (WMM_TPFUNC)GetProcAddress(ps->wmm, "timeEndPeriod");
>> -      if (!ps->wmm_tbp || !ps->wmm_tep) {
>> -	ps->wmm = NULL;
>> -	return;
>> -      }
>> -    }
>> -  }
>> -#endif
>> -  InitializeCriticalSection(&ps->lock);
>> -  ps->abort = 0;
>> -  ps->thread = CreateThread(NULL, 0, profile_thread, ps, 0, NULL);
>> -}
>> -
>> -/* Stop profiling timer thread. */
>> -static void profile_timer_stop(ProfileState *ps)
>> -{
>
><snipped>
>
>>  }
>>
>>  #endif
>> @@ -327,12 +173,14 @@ LUA_API void luaJIT_profile_start(lua_State *L, const char *mode,
>>      if (ps->g) return;  /* Profiler in use by another VM. */
>>    }
>>    ps->g = G(L);
>> -  ps->interval = interval;
>>    ps->cb = cb;
>>    ps->data = data;
>>    ps->samples = 0;
>>    lj_buf_init(L, &ps->sb);
>> -  profile_timer_start(ps);
>> +
>
>Nit: I think this empty line is excess.
>Feel free to ignore.

Fixed.

>
>> +  ps->timer.opt.interval_msec = interval;
>> +  ps->timer.opt.handler = profile_handler;
>> +  lj_profile_timer_start(&ps->timer);
>>  }
>>
>>  /* Stop profiling. */
>> @@ -341,7 +189,7 @@ LUA_API void luaJIT_profile_stop(lua_State *L)
>
><snipped>
>
>> diff --git a/src/lj_profile_timer.c b/src/lj_profile_timer.c
>> new file mode 100644
>> index 00000000..ce420a85
>> --- /dev/null
>> +++ b/src/lj_profile_timer.c
>> @@ -0,0 +1,126 @@
>> +/*
>> +** Simple profiling timer extracted from inbuilt luajit profiler
>
>Nit: I suppose we can omit the part about extracting.

Fixed. Also, copyright note copied from the lj_profile

>
>> +*/
>> +
>> +#define lj_profile_timer_c
>> +#define LUA_CORE
>> +
>> +#include "lj_profile_timer.h"
>> +
>> +#if LJ_HASPROFILE
>> +
>> +#if LJ_PROFILE_SIGPROF
>> +
>> +/* Start profiling timer. */
>> +void lj_profile_timer_start(lj_profile_timer *timer) {
>
>Minor: use the newline for curly bracket after function declaration.
>Here and below.

Fixed.

>
>> +  int interval = timer->opt.interval_msec;
>
>Minor: looks like interval may be declared as `const int`.
>Feel free to ignore.

Fixed.

>
>> +  struct itimerval tm;
>> +  struct sigaction sa;
>> +  tm.it_value.tv_sec = tm.it_interval.tv_sec = interval / 1000;
>> +  tm.it_value.tv_usec = tm.it_interval.tv_usec = (interval % 1000) * 1000;
>> +  setitimer(ITIMER_PROF, &tm, NULL);
>> +  sa.sa_flags = SA_RESTART | SA_SIGINFO;
>> +  sa.sa_sigaction = timer->opt.handler;
>
>Why do we need to forcify using of 3 argument handler?
>
>| -  sa.sa_flags = SA_RESTART;
>| -  sa.sa_handler = profile_signal;
>
>Looks like the old format is enough.

Answered above.

>
>> +  sigemptyset(&sa.sa_mask);
>> +  sigaction(SIGPROF, &sa, &timer->oldsa);
>> +}
>> +
>> +/* Stop profiling timer. */
>> +void lj_profile_timer_stop(lj_profile_timer *timer) {
>
><snipped>
>
>> +}
>> +
>> +#elif LJ_PROFILE_PTHREAD
>> +
>> +/* POSIX timer thread. */
>> +static void *profile_thread(lj_timer *timer) {
>
>Its lj_profile_timer, not lj_timer :).

Fixed.

>
>Also, `timer_thread()` name is looks better here, as far as it only
>handler timer in another thread.
>Thoughts?

Seems reasonable, fixed.

>
>> +  int interval = timer->opt.interval_msec;
>> +#if !LJ_TARGET_PS3
>> +  struct timespec ts;
>> +  ts.tv_sec = interval / 1000;
>> +  ts.tv_nsec = (interval % 1000) * 1000000;
>> +#endif
>> +  while (1) {
>> +#if LJ_TARGET_PS3
>> +    sys_timer_usleep(interval * 1000);
>> +#else
>> +    nanosleep(&ts, NULL);
>> +#endif
>> +    if (timer->abort)
>> +      break;
>
>
>Minor: I suggest to save old code style here, i.e. without newline.

Fixed.

>
>> +    timer->opt.handler();
>> +  }
>> +  return NULL;
>> +}
>> +
>> +/* Start profiling timer thread. */
>> +void lj_profile_timer_start(lj_profile_timer *timer) {
>> +  pthread_mutex_init(&timer->lock, 0);
>> +  timer->abort = 0;
>> +  pthread_create(&timer->thread, NULL, (void *(*)(void *))profile_thread,
>> +                 timer);
>
>Minor: Use tabs for allignment here.

Fixed.

>
>Side note: may be `t` stands for timer is good enough here.
>Feel free to ignore.

For me, timer is more clear :)

>
>> +}
>> +
>> +/* Stop profiling timer thread. */
>> +void lj_profile_timer_stop(lj_profile_timer *timer) {
>
><snipped>
>
>> +}
>> +
>> +#elif LJ_PROFILE_WTHREAD
>> +
>> +/* Windows timer thread. */
>> +static DWORD WINAPI profile_thread(void *timerx) {
>
>Also, `timer_thread()` name is looks better here, as far as it only
>handler timer in another thread.
>Thoughts?
>
>> +  lj_profile_timer *timer = (lj_profile_timer *)timerx;
>> +  int interval = timer->opt.interval_msec;
>> +#if LJ_TARGET_WINDOWS
>> +  timer->wmm_tbp(interval);
>> +#endif
>> +  while (1) {
>> +    Sleep(interval);
>> +    if (timer->abort)
>> +      break;
>
>Minor: I suggest to save old code style here, i.e. without newline.

Fixed.

>
>> +    timer->opt.handler();
>> +  }
>> +#if LJ_TARGET_WINDOWS
>> +  timer->wmm_tep(interval);
>> +#endif
>> +  return 0;
>> +}
>> +
>> +/* Start profiling timer thread. */
>> +void lj_profile_timer_start(lj_profile_timer *timer) {
>> +#if LJ_TARGET_WINDOWS
>> +  if (!timer->wmm) { /* Load WinMM library on-demand. */
>> +    timer->wmm = LoadLibraryExA("winmm.dll", NULL, 0);
>> +    if (timer->wmm) {
>> +      timer->wmm_tbp =
>> +          (WMM_TPFUNC)GetProcAddress(timer->wmm, "timeBeginPeriod");
>> +      timer->wmm_tep = (WMM_TPFUNC)GetProcAddress(timer->wmm, "timeEndPeriod");
>> +      if (!timer->wmm_tbp || !timer->wmm_tep) {
>> +        timer->wmm = NULL;
>> +        return;
>
>LuaJIT use quoter-half tabs codestyle. Please, return 1 tab instead 8
>spaces.

Fixed.

>
>> +      }
>> +    }
>> +  }
>> +#endif
>> +  InitializeCriticalSection(&timer->lock);
>> +  timer->abort = 0;
>> +  timer->thread = CreateThread(NULL, 0, profile_thread, timer, 0, NULL);
>> +}
>> +
>> +/* Stop profiling timer thread. */
>> +void lj_profile_timer_stop(lj_profile_timer *timer) {
>
><snipped>
>
>> +}
>> +
>> +#endif
>> +
>> +#endif  /* LJ_HASPROFILE */
>> diff --git a/src/lj_profile_timer.h b/src/lj_profile_timer.h
>> new file mode 100644
>> index 00000000..af9eabce
>> --- /dev/null
>> +++ b/src/lj_profile_timer.h
>> @@ -0,0 +1,83 @@
>> +/*
>> +** Simple profiling timer extracted from inbuilt luajit profiler
>
>Nit: I suppose we can omit the part about extracting.

Fixed.

>
>> +*/
>> +
>> +#ifndef _LJ_PROFILE_TIMER_H
>> +#define _LJ_PROFILE_TIMER_H
>> +
>> +#include <stdint.h>
>
>It's better to use "lj_def.h" here (see it to explanation).

Fixed.

>
>> +
>> +#include "lj_arch.h"
>> +
>> +#if LJ_HASPROFILE
>> +
>> +#if LJ_PROFILE_SIGPROF
>
><snipped>
>
>> +#endif
>> +
>> +typedef struct {
>> +#if LJ_PROFILE_SIGPROF
>> +  void (*handler)(int, siginfo_t*, void*);
>> +#else
>> +  void (*handler)(void);
>> +#endif
>
>Don't get why do we need different handlers signature here.
>If we use the old one i.e. void (*)(void).
>
>> +  uint32_t interval_msec;
>> +} lj_profile_timer_opt;
>
>Why do we need a separate structure for these options?

I think, we need it to highlight the configurable parameters.

>
>> +
>
>Minor: Please fix mess with tabs and spaces below.

Fixed.

>
>> +typedef struct {
>> +  lj_profile_timer_opt opt;
>> +#if LJ_PROFILE_SIGPROF
>> +  struct sigaction oldsa;  /* Previous SIGPROF state. */
>> +#elif LJ_PROFILE_PTHREAD
>> +  pthread_mutex_t lock;		 /* g->hookmask update lock. */
>> +  pthread_t thread;		     /* Timer thread. */
>> +  int abort;			         /* Abort timer thread. */
>> +#elif LJ_PROFILE_WTHREAD
>> +#if LJ_TARGET_WINDOWS
>> +  HINSTANCE wmm;		       /* WinMM library handle. */
>> +  WMM_TPFUNC wmm_tbp;		   /* WinMM timeBeginPeriod function. */
>> +  WMM_TPFUNC wmm_tep;		   /* WinMM timeEndPeriod function. */
>> +#endif
>> +  CRITICAL_SECTION lock;	 /* g->hookmask update lock. */
>> +  HANDLE thread;		       /* Timer thread. */
>> +  int abort;			         /* Abort timer thread. */
>> +#endif
>> +} lj_profile_timer;
>> +
>> +/* Start profiling timer */
>
>Typo: missed dot at the end of the sentence.

Fixed.

>
>> +void lj_profile_timer_start(lj_profile_timer *timer);
>> +
>> +/* Stop profiling timer */
>
>Typo: missed dot at the end of the sentence.

Fixed.

>
>> +void lj_profile_timer_stop(lj_profile_timer *timer);
>> +
>> +#endif  /* LJ_HASPROFILE */
>> +
>> +#endif
>> --
>> 2.33.0
>>
>
>-- 
>Best regards,
>Sergey Kaplun

Branch: https://github.com/tarantool/luajit/tree/shishqa/gh-781-platform-and-lua-profiler
New hash: https://github.com/tarantool/luajit/commit/ecbf4af8ec1e8a597199e6dcdeca2a95b95cf667

diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 50c23cae..c92d78cc 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -68,8 +68,8 @@ make_source_list(SOURCES_UTILS
   make_source_list(SOURCES_PROFILER
     SOURCES
       lj_memprof.c
-    lj_profile_timer.c
       lj_profile.c
+    lj_profile_timer.c
   )
   
   # Lua standard library + extensions by LuaJIT.
diff --git a/src/Makefile.dep.original b/src/Makefile.dep.original
index da5abf0d..68f51706 100644
--- a/src/Makefile.dep.original
+++ b/src/Makefile.dep.original
@@ -179,8 +179,8 @@ lj_profile.o: lj_profile.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
    lj_buf.h lj_gc.h lj_str.h lj_frame.h lj_bc.h lj_debug.h lj_dispatch.h \
    lj_jit.h lj_ir.h lj_trace.h lj_traceerr.h lj_profile.h \
    lj_profile_timer.h luajit.h
-lj_profile_timer.o: lj_profile_timer.c lj_profile_timer.h lj_arch.h lua.h \
- luaconf.h
+lj_profile_timer.o: lj_profile_timer.c lj_profile_timer.h lj_def.h lua.h \
+ luaconf.h lj_arch.h
   lj_record.o: lj_record.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
    lj_err.h lj_errmsg.h lj_str.h lj_tab.h lj_meta.h lj_frame.h lj_bc.h \
    lj_ctype.h lj_gc.h lj_ff.h lj_ffdef.h lj_debug.h lj_ir.h lj_jit.h \
@@ -223,21 +223,23 @@ ljamalg.o: ljamalg.c lua.h luaconf.h lauxlib.h lj_gc.c lj_obj.h lj_def.h \
    lj_func.h lj_udata.h lj_meta.h lj_state.h lj_frame.h lj_bc.h lj_ctype.h \
    lj_cdata.h lj_trace.h lj_jit.h lj_ir.h lj_dispatch.h lj_traceerr.h \
    lj_vm.h lj_err.c lj_debug.h lj_ff.h lj_ffdef.h lj_strfmt.h lj_char.c \
- lj_char.h lj_bc.c lj_bcdef.h lj_obj.c lj_buf.c lj_wbuf.c lj_wbuf.h lj_utils.h \
- lj_str.c lj_tab.c lj_func.c lj_udata.c lj_meta.c lj_strscan.h lj_lib.h \
- lj_debug.c lj_state.c lj_lex.h lj_alloc.h luajit.h lj_dispatch.c \
- lj_ccallback.h lj_profile.h lj_memprof.h lj_vmevent.c lj_vmevent.h \
+ lj_char.h lj_bc.c lj_bcdef.h lj_obj.c lj_buf.c lj_wbuf.c lj_wbuf.h \
+ lj_utils.h lj_str.c lj_tab.c lj_func.c lj_udata.c lj_meta.c lj_strscan.h \
+ lj_lib.h lj_debug.c lj_state.c lj_lex.h lj_alloc.h luajit.h lj_memprof.h \
+ lj_dispatch.c lj_ccallback.h lj_profile.h lj_vmevent.c lj_vmevent.h \
    lj_vmmath.c lj_strscan.c lj_strfmt.c lj_strfmt_num.c lj_api.c lj_mapi.c \
- lmisclib.h lj_profile.c lj_memprof.c lj_lex.c lualib.h lj_parse.h lj_parse.c \
- lj_bcread.c lj_bcdump.h lj_bcwrite.c lj_load.c lj_ctype.c lj_cdata.c \
- lj_cconv.h lj_cconv.c lj_ccall.c lj_ccall.h lj_ccallback.c lj_target.h \
- lj_target_*.h lj_mcode.h lj_carith.c lj_carith.h lj_clib.c lj_clib.h \
- lj_cparse.c lj_cparse.h lj_lib.c lj_ir.c lj_ircall.h lj_iropt.h lj_opt_mem.c \
- lj_opt_fold.c lj_folddef.h lj_opt_narrow.c lj_opt_dce.c lj_opt_loop.c \
- lj_snap.h lj_opt_split.c lj_opt_sink.c lj_mcode.c lj_snap.c lj_record.c \
- lj_record.h lj_ffrecord.h lj_crecord.c lj_crecord.h lj_ffrecord.c lj_recdef.h \
- lj_asm.c lj_asm.h lj_emit_*.h lj_asm_*.h lj_trace.c lj_gdbjit.h lj_gdbjit.c \
- lj_alloc.c lj_utils_leb128.c lib_aux.c lib_base.c lj_libdef.h lib_math.c \
+ lmisclib.h lj_profile.c lj_profile_timer.h lj_profile_timer.c \
+ lj_memprof.c lj_lex.c lualib.h lj_parse.h lj_parse.c lj_bcread.c \
+ lj_bcdump.h lj_bcwrite.c lj_load.c lj_ctype.c lj_cdata.c lj_cconv.h \
+ lj_cconv.c lj_ccall.c lj_ccall.h lj_ccallback.c lj_target.h \
+ lj_target_x86.h lj_mcode.h lj_carith.c lj_carith.h lj_clib.c lj_clib.h \
+ lj_cparse.c lj_cparse.h lj_lib.c lj_ir.c lj_ircall.h lj_iropt.h \
+ lj_opt_mem.c lj_opt_fold.c lj_folddef.h lj_opt_narrow.c lj_opt_dce.c \
+ lj_opt_loop.c lj_snap.h lj_opt_split.c lj_opt_sink.c lj_mcode.c \
+ lj_snap.c lj_record.c lj_record.h lj_ffrecord.h lj_crecord.c \
+ lj_crecord.h lj_ffrecord.c lj_recdef.h lj_asm.c lj_asm.h lj_emit_x86.h \
+ lj_asm_x86.h lj_trace.c lj_gdbjit.h lj_gdbjit.c lj_alloc.c \
+ lj_utils_leb128.c lib_aux.c lib_base.c lj_libdef.h lib_math.c \
    lib_string.c lib_table.c lib_io.c lib_os.c lib_package.c lib_debug.c \
    lib_bit.c lib_jit.c lib_ffi.c lib_misc.c lib_init.c
   luajit.o: luajit.c lua.h luaconf.h lauxlib.h lualib.h luajit.h lj_arch.h
diff --git a/src/lj_profile.c b/src/lj_profile.c
index e7e1a303..4412d68b 100644
--- a/src/lj_profile.c
+++ b/src/lj_profile.c
@@ -31,7 +31,7 @@ typedef struct ProfileState {
     SBuf sb;			/* String buffer for stack dumps. */
     int samples;			/* Number of samples for next callback. */
     int vmstate;			/* VM state when profile timer triggered. */
-  lj_profile_timer timer;  /* Profiling timer */
+  lj_profile_timer timer;	/* Profiling timer */
   } ProfileState;
   
   /* Sadly, we have to use a static profiler state.
@@ -98,7 +98,6 @@ void LJ_FASTCALL lj_profile_interpreter(lua_State *L)
     profile_unlock(ps);
   }
   
-
   /* Trigger profile hook. Asynchronous call from OS-specific profile timer. */
   static void profile_trigger(ProfileState *ps)
   {
@@ -124,7 +123,7 @@ static void profile_trigger(ProfileState *ps)
   
   #if LJ_PROFILE_SIGPROF
   
-static void profile_handler(int sig, siginfo_t *info, void* ctx)
+static void profile_handler(int sig, siginfo_t *info, void *ctx)
   {
     UNUSED(sig);
     UNUSED(info);
@@ -177,7 +176,6 @@ LUA_API void luaJIT_profile_start(lua_State *L, const char *mode,
     ps->data = data;
     ps->samples = 0;
     lj_buf_init(L, &ps->sb);
-
     ps->timer.opt.interval_msec = interval;
     ps->timer.opt.handler = profile_handler;
     lj_profile_timer_start(&ps->timer);
diff --git a/src/lj_profile_timer.c b/src/lj_profile_timer.c
index ce420a85..056fd1f7 100644
--- a/src/lj_profile_timer.c
+++ b/src/lj_profile_timer.c
@@ -1,5 +1,6 @@
   /*
-** Simple profiling timer extracted from inbuilt luajit profiler
+** Simple profiling timer.
+** Copyright (C) 2005-2017 Mike Pall. See Copyright Notice in luajit.h
   */
   
   #define lj_profile_timer_c
@@ -12,8 +13,9 @@
   #if LJ_PROFILE_SIGPROF
   
   /* Start profiling timer. */
-void lj_profile_timer_start(lj_profile_timer *timer) {
-  int interval = timer->opt.interval_msec;
+void lj_profile_timer_start(lj_profile_timer *timer)
+{
+  const int interval = timer->opt.interval_msec;
     struct itimerval tm;
     struct sigaction sa;
     tm.it_value.tv_sec = tm.it_interval.tv_sec = interval / 1000;
@@ -26,7 +28,8 @@ void lj_profile_timer_start(lj_profile_timer *timer) {
   }
   
   /* Stop profiling timer. */
-void lj_profile_timer_stop(lj_profile_timer *timer) {
+void lj_profile_timer_stop(lj_profile_timer *timer)
+{
     struct itimerval tm;
     tm.it_value.tv_sec = tm.it_interval.tv_sec = 0;
     tm.it_value.tv_usec = tm.it_interval.tv_usec = 0;
@@ -37,7 +40,8 @@ void lj_profile_timer_stop(lj_profile_timer *timer) {
   #elif LJ_PROFILE_PTHREAD
   
   /* POSIX timer thread. */
-static void *profile_thread(lj_timer *timer) {
+static void *timer_thread(lj_profile_timer *timer)
+{
     int interval = timer->opt.interval_msec;
   #if !LJ_TARGET_PS3
     struct timespec ts;
@@ -50,23 +54,24 @@ static void *profile_thread(lj_timer *timer) {
   #else
       nanosleep(&ts, NULL);
   #endif
-    if (timer->abort)
-      break;
+    if (timer->abort) break;
       timer->opt.handler();
     }
     return NULL;
   }
   
   /* Start profiling timer thread. */
-void lj_profile_timer_start(lj_profile_timer *timer) {
+void lj_profile_timer_start(lj_profile_timer *timer)
+{
     pthread_mutex_init(&timer->lock, 0);
     timer->abort = 0;
-  pthread_create(&timer->thread, NULL, (void *(*)(void *))profile_thread,
-                 timer);
+  pthread_create(&timer->thread, NULL, (void *(*)(void *))timer_thread,
+		 timer);
   }
   
   /* Stop profiling timer thread. */
-void lj_profile_timer_stop(lj_profile_timer *timer) {
+void lj_profile_timer_stop(lj_profile_timer *timer)
+{
     timer->abort = 1;
     pthread_join(timer->thread, NULL);
     pthread_mutex_destroy(&timer->lock);
@@ -75,7 +80,8 @@ void lj_profile_timer_stop(lj_profile_timer *timer) {
   #elif LJ_PROFILE_WTHREAD
   
   /* Windows timer thread. */
-static DWORD WINAPI profile_thread(void *timerx) {
+static DWORD WINAPI timer_thread(void *timerx)
+{
     lj_profile_timer *timer = (lj_profile_timer *)timerx;
     int interval = timer->opt.interval_msec;
   #if LJ_TARGET_WINDOWS
@@ -83,8 +89,7 @@ static DWORD WINAPI profile_thread(void *timerx) {
   #endif
     while (1) {
       Sleep(interval);
-    if (timer->abort)
-      break;
+    if (timer->abort) break;
       timer->opt.handler();
     }
   #if LJ_TARGET_WINDOWS
@@ -94,28 +99,30 @@ static DWORD WINAPI profile_thread(void *timerx) {
   }
   
   /* Start profiling timer thread. */
-void lj_profile_timer_start(lj_profile_timer *timer) {
+void lj_profile_timer_start(lj_profile_timer *timer)
+{
   #if LJ_TARGET_WINDOWS
     if (!timer->wmm) { /* Load WinMM library on-demand. */
       timer->wmm = LoadLibraryExA("winmm.dll", NULL, 0);
       if (timer->wmm) {
         timer->wmm_tbp =
-          (WMM_TPFUNC)GetProcAddress(timer->wmm, "timeBeginPeriod");
+	(WMM_TPFUNC)GetProcAddress(timer->wmm, "timeBeginPeriod");
         timer->wmm_tep = (WMM_TPFUNC)GetProcAddress(timer->wmm, "timeEndPeriod");
         if (!timer->wmm_tbp || !timer->wmm_tep) {
-        timer->wmm = NULL;
-        return;
+	timer->wmm = NULL;
+	return;
         }
       }
     }
   #endif
     InitializeCriticalSection(&timer->lock);
     timer->abort = 0;
-  timer->thread = CreateThread(NULL, 0, profile_thread, timer, 0, NULL);
+  timer->thread = CreateThread(NULL, 0, timer_thread, timer, 0, NULL);
   }
   
   /* Stop profiling timer thread. */
-void lj_profile_timer_stop(lj_profile_timer *timer) {
+void lj_profile_timer_stop(lj_profile_timer *timer)
+{
     timer->abort = 1;
     WaitForSingleObject(timer->thread, INFINITE);
     DeleteCriticalSection(&timer->lock);
diff --git a/src/lj_profile_timer.h b/src/lj_profile_timer.h
index af9eabce..1deeea53 100644
--- a/src/lj_profile_timer.h
+++ b/src/lj_profile_timer.h
@@ -1,12 +1,12 @@
   /*
-** Simple profiling timer extracted from inbuilt luajit profiler
+** Simple profiling timer.
+** Copyright (C) 2005-2017 Mike Pall. See Copyright Notice in luajit.h
   */
   
   #ifndef _LJ_PROFILE_TIMER_H
   #define _LJ_PROFILE_TIMER_H
   
-#include <stdint.h>
-
+#include "lj_def.h"
   #include "lj_arch.h"
   
   #if LJ_HASPROFILE
@@ -55,27 +55,27 @@ typedef struct {
   typedef struct {
     lj_profile_timer_opt opt;
   #if LJ_PROFILE_SIGPROF
-  struct sigaction oldsa;  /* Previous SIGPROF state. */
+  struct sigaction oldsa;	/* Previous SIGPROF state. */
   #elif LJ_PROFILE_PTHREAD
-  pthread_mutex_t lock;		 /* g->hookmask update lock. */
-  pthread_t thread;		     /* Timer thread. */
-  int abort;			         /* Abort timer thread. */
+  pthread_mutex_t lock;		/* g->hookmask update lock. */
+  pthread_t thread;		/* Timer thread. */
+  int abort;			/* Abort timer thread. */
   #elif LJ_PROFILE_WTHREAD
   #if LJ_TARGET_WINDOWS
-  HINSTANCE wmm;		       /* WinMM library handle. */
-  WMM_TPFUNC wmm_tbp;		   /* WinMM timeBeginPeriod function. */
-  WMM_TPFUNC wmm_tep;		   /* WinMM timeEndPeriod function. */
+  HINSTANCE wmm;		/* WinMM library handle. */
+  WMM_TPFUNC wmm_tbp;		/* WinMM timeBeginPeriod function. */
+  WMM_TPFUNC wmm_tep;		/* WinMM timeEndPeriod function. */
   #endif
-  CRITICAL_SECTION lock;	 /* g->hookmask update lock. */
-  HANDLE thread;		       /* Timer thread. */
-  int abort;			         /* Abort timer thread. */
+  CRITICAL_SECTION lock;	/* g->hookmask update lock. */
+  HANDLE thread;		/* Timer thread. */
+  int abort;			/* Abort timer thread. */
   #endif
   } lj_profile_timer;
   
-/* Start profiling timer */
+/* Start profiling timer. */
   void lj_profile_timer_start(lj_profile_timer *timer);
   
-/* Stop profiling timer */
+/* Stop profiling timer. */
   void lj_profile_timer_stop(lj_profile_timer *timer);
   
   #endif  /* LJ_HASPROFILE */
diff --git a/src/ljamalg.c b/src/ljamalg.c
index 3f7e6860..ce7a0d6c 100644
--- a/src/ljamalg.c
+++ b/src/ljamalg.c
@@ -51,6 +51,7 @@
   #include "lj_api.c"
   #include "lj_mapi.c"
   #include "lj_profile.c"
+#include "lj_profile_timer.c"
   #include "lj_memprof.c"
   #include "lj_lex.c"
   #include "lj_parse.c"

Best regards,
Mikhail Shishatskiy

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Tarantool-patches] [PATCH luajit v2] memprof: move symtab to a separate module
  2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 3/7] memprof: move symtab to a separate module Maxim Kokryashkin via Tarantool-patches
  2021-09-22  7:51   ` Sergey Kaplun via Tarantool-patches
@ 2021-09-23 14:51   ` Maxim Kokryashkin via Tarantool-patches
  1 sibling, 0 replies; 19+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2021-09-23 14:51 UTC (permalink / raw)
  To: tarantool-patches, imun, skaplun

Considering the symbol table format, it is obvious that it is suitable
not only for memprof, but for other profiling modules too. This commit
moves the symbol table to the separate module, so modules other than
memprof will be able to access it.

Part of tarantool/tarantool#781
---
Changes in v2:
 - Fixed comments as per review by Sergey

Also, I've checked the LJS version in the tarantool branch, and it is 1.
Link: https://github.com/tarantool/luajit/blob/814625fc7208d97d74e1998a23697b9a099da4b0/src/lj_memprof.h#L19

 src/CMakeLists.txt        |  1 +
 src/Makefile.dep.original |  4 ++-
 src/lj_memprof.c          | 37 ++------------------------
 src/lj_memprof.h          | 36 -------------------------
 src/lj_symtab.c           | 42 +++++++++++++++++++++++++++++
 src/lj_symtab.h           | 56 +++++++++++++++++++++++++++++++++++++++
 src/ljamalg.c             |  1 +
 7 files changed, 105 insertions(+), 72 deletions(-)
 create mode 100644 src/lj_symtab.c
 create mode 100644 src/lj_symtab.h

diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index c92d78cc..b4ce407b 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -70,6 +70,7 @@ make_source_list(SOURCES_PROFILER
     lj_memprof.c
     lj_profile.c
     lj_profile_timer.c
+    lj_symtab.c
 )
 
 # Lua standard library + extensions by LuaJIT.
diff --git a/src/Makefile.dep.original b/src/Makefile.dep.original
index 68f51706..9389de73 100644
--- a/src/Makefile.dep.original
+++ b/src/Makefile.dep.original
@@ -146,7 +146,7 @@ lj_mcode.o: lj_mcode.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
  lj_gc.h lj_err.h lj_errmsg.h lj_jit.h lj_ir.h lj_mcode.h lj_trace.h \
  lj_dispatch.h lj_bc.h lj_traceerr.h lj_vm.h
 lj_memprof.o: lj_memprof.c lj_arch.h lua.h luaconf.h lj_memprof.h \
- lj_def.h lj_wbuf.h lj_obj.h lj_frame.h lj_bc.h lj_debug.h
+ lj_def.h lj_wbuf.h lj_obj.h lj_frame.h lj_bc.h lj_debug.h lj_symtab.h
 lj_meta.o: lj_meta.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h lj_gc.h \
  lj_err.h lj_errmsg.h lj_buf.h lj_str.h lj_tab.h lj_meta.h lj_frame.h \
  lj_bc.h lj_vm.h lj_strscan.h lj_strfmt.h lj_lib.h
@@ -202,6 +202,8 @@ lj_strfmt_num.o: lj_strfmt_num.c lj_obj.h lua.h luaconf.h lj_def.h \
  lj_arch.h lj_buf.h lj_gc.h lj_str.h lj_strfmt.h
 lj_strscan.o: lj_strscan.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
  lj_char.h lj_strscan.h
+lj_symtab.o: lj_symtab.c lj_symtab.h lj_wbuf.h lj_def.h lua.h luaconf.h \
+ lj_obj.h lj_arch.h
 lj_tab.o: lj_tab.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h lj_gc.h \
  lj_err.h lj_errmsg.h lj_tab.h
 lj_trace.o: lj_trace.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
diff --git a/src/lj_memprof.c b/src/lj_memprof.c
index 2c1ef3b8..fba688bc 100644
--- a/src/lj_memprof.c
+++ b/src/lj_memprof.c
@@ -19,40 +19,7 @@
 #include "lj_frame.h"
 #include "lj_debug.h"
 
-/* --------------------------------- Symtab --------------------------------- */
-
-static const unsigned char ljs_header[] = {'l', 'j', 's', LJS_CURRENT_VERSION,
-					   0x0, 0x0, 0x0};
-
-static void dump_symtab(struct lj_wbuf *out, const struct global_State *g)
-{
-  const GCRef *iter = &g->gc.root;
-  const GCobj *o;
-  const size_t ljs_header_len = sizeof(ljs_header) / sizeof(ljs_header[0]);
-
-  /* Write prologue. */
-  lj_wbuf_addn(out, ljs_header, ljs_header_len);
-
-  while ((o = gcref(*iter)) != NULL) {
-    switch (o->gch.gct) {
-    case (~LJ_TPROTO): {
-      const GCproto *pt = gco2pt(o);
-      lj_wbuf_addbyte(out, SYMTAB_LFUNC);
-      lj_wbuf_addu64(out, (uintptr_t)pt);
-      lj_wbuf_addstring(out, proto_chunknamestr(pt));
-      lj_wbuf_addu64(out, (uint64_t)pt->firstline);
-      break;
-    }
-    default:
-      break;
-    }
-    iter = &o->gch.nextgc;
-  }
-
-  lj_wbuf_addbyte(out, SYMTAB_FINAL);
-}
-
-/* ---------------------------- Memory profiler ----------------------------- */
+#include "lj_symtab.h"
 
 enum memprof_state {
   /* Memory profiler is not running. */
@@ -249,7 +216,7 @@ int lj_memprof_start(struct lua_State *L, const struct lj_memprof_options *opt)
 
   /* Init output. */
   lj_wbuf_init(&mp->out, mp_opt->writer, mp_opt->ctx, mp_opt->buf, mp_opt->len);
-  dump_symtab(&mp->out, mp->g);
+  lj_symtab_dump(&mp->out, mp->g);
 
   /* Write prologue. */
   lj_wbuf_addn(&mp->out, ljm_header, ljm_header_len);
diff --git a/src/lj_memprof.h b/src/lj_memprof.h
index 3417475d..33a9b869 100644
--- a/src/lj_memprof.h
+++ b/src/lj_memprof.h
@@ -16,42 +16,6 @@
 #include "lj_def.h"
 #include "lj_wbuf.h"
 
-#define LJS_CURRENT_VERSION 0x1
-
-/*
-** symtab format:
-**
-** symtab         := prologue sym*
-** prologue       := 'l' 'j' 's' version reserved
-** version        := <BYTE>
-** reserved       := <BYTE> <BYTE> <BYTE>
-** sym            := sym-lua | sym-final
-** sym-lua        := sym-header sym-addr sym-chunk sym-line
-** sym-header     := <BYTE>
-** sym-addr       := <ULEB128>
-** sym-chunk      := string
-** sym-line       := <ULEB128>
-** sym-final      := sym-header
-** string         := string-len string-payload
-** string-len     := <ULEB128>
-** string-payload := <BYTE> {string-len}
-**
-** <BYTE>   :  A single byte (no surprises here)
-** <ULEB128>:  Unsigned integer represented in ULEB128 encoding
-**
-** (Order of bits below is hi -> lo)
-**
-** version: [VVVVVVVV]
-**  * VVVVVVVV: Byte interpreted as a plain numeric version number
-**
-** sym-header: [FUUUUUTT]
-**  * TT    : 2 bits for representing symbol type
-**  * UUUUU : 5 unused bits
-**  * F     : 1 bit marking the end of the symtab (final symbol)
-*/
-
-#define SYMTAB_LFUNC ((uint8_t)0)
-#define SYMTAB_FINAL ((uint8_t)0x80)
 
 #define LJM_CURRENT_FORMAT_VERSION 0x01
 
diff --git a/src/lj_symtab.c b/src/lj_symtab.c
new file mode 100644
index 00000000..5ca8ccec
--- /dev/null
+++ b/src/lj_symtab.c
@@ -0,0 +1,42 @@
+/*
+** Implementation of symbol table for profilers.
+**
+** Major portions taken verbatim or adapted from the LuaVela.
+** Copyright (C) 2015-2019 IPONWEB Ltd.
+*/
+
+#define lj_symtab_c
+#define LUA_CORE
+
+#include "lj_symtab.h"
+
+static const unsigned char ljs_header[] = {'l', 'j', 's', LJS_CURRENT_VERSION,
+                                          0x0, 0x0, 0x0};
+
+void lj_symtab_dump(struct lj_wbuf *out, const struct global_State *g)
+{
+  const GCRef *iter = &g->gc.root;
+  const GCobj *o;
+  const size_t ljs_header_len = sizeof(ljs_header) / sizeof(ljs_header[0]);
+
+  /* Write prologue. */
+  lj_wbuf_addn(out, ljs_header, ljs_header_len);
+
+  while ((o = gcref(*iter)) != NULL) {
+    switch (o->gch.gct) {
+    case (~LJ_TPROTO): {
+      const GCproto *pt = gco2pt(o);
+      lj_wbuf_addbyte(out, SYMTAB_LFUNC);
+      lj_wbuf_addu64(out, (uintptr_t)pt);
+      lj_wbuf_addstring(out, proto_chunknamestr(pt));
+      lj_wbuf_addu64(out, (uint64_t)pt->firstline);
+      break;
+    }
+    default:
+      break;
+    }
+    iter = &o->gch.nextgc;
+  }
+
+  lj_wbuf_addbyte(out, SYMTAB_FINAL);
+}
diff --git a/src/lj_symtab.h b/src/lj_symtab.h
new file mode 100644
index 00000000..4bbdef82
--- /dev/null
+++ b/src/lj_symtab.h
@@ -0,0 +1,56 @@
+/*
+** Symbol table for profilers.
+**
+** Major portions taken verbatim or adapted from the LuaVela.
+** Copyright (C) 2015-2019 IPONWEB Ltd.
+*/
+
+#ifndef LJ_SYMTAB_H
+#define LJ_SYMTAB_H
+
+#include "lj_wbuf.h"
+#include "lj_obj.h"
+
+#define LJS_CURRENT_VERSION 0x1
+
+/*
+** symtab format:
+**
+** symtab         := prologue sym*
+** prologue       := 'l' 'j' 's' version reserved
+** version        := <BYTE>
+** reserved       := <BYTE> <BYTE> <BYTE>
+** sym            := sym-lua | sym-final
+** sym-lua        := sym-header sym-addr sym-chunk sym-line
+** sym-header     := <BYTE>
+** sym-addr       := <ULEB128>
+** sym-chunk      := string
+** sym-line       := <ULEB128>
+** sym-final      := sym-header
+** string         := string-len string-payload
+** string-len     := <ULEB128>
+** string-payload := <BYTE> {string-len}
+**
+** <BYTE>   :  A single byte (no surprises here)
+** <ULEB128>:  Unsigned integer represented in ULEB128 encoding
+**
+** (Order of bits below is hi -> lo)
+**
+** version: [VVVVVVVV]
+**  * VVVVVVVV: Byte interpreted as a plain numeric version number
+**
+** sym-header: [FUUUUUTT]
+**  * TT    : 2 bits for representing symbol type
+**  * UUUUU : 5 unused bits
+**  * F     : 1 bit marking the end of the symtab (final symbol)
+*/
+
+#define SYMTAB_LFUNC ((uint8_t)0)
+#define SYMTAB_FINAL ((uint8_t)0x80)
+
+/*
+** Dumps symbol table for Lua functions into a buffer
+*/
+void lj_symtab_dump(struct lj_wbuf *out, const struct global_State *g);
+
+#endif
diff --git a/src/ljamalg.c b/src/ljamalg.c
index ce7a0d6c..6ad5289c 100644
--- a/src/ljamalg.c
+++ b/src/ljamalg.c
@@ -52,6 +52,7 @@
 #include "lj_mapi.c"
 #include "lj_profile.c"
 #include "lj_profile_timer.c"
+#include "lj_symtab.c"
 #include "lj_memprof.c"
 #include "lj_lex.c"
 #include "lj_parse.c"
-- 
2.33.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 4/7] core: introduce lua and platform profiler
  2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 4/7] core: introduce lua and platform profiler Maxim Kokryashkin via Tarantool-patches
@ 2021-09-29  6:53   ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 19+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-09-29  6:53 UTC (permalink / raw)
  To: Maxim Kokryashkin, Mikhail Shishatskiy; +Cc: tarantool-patches

Hi, Maxim, Mikhail!

Thanks for the patch!
Please consider my comments below.

On 08.09.21, Maxim Kokryashkin wrote:
> From: Mikhail Shishatskiy <m.shishatskiy@tarantool.org>
> 
> This patch introduces a sampling platform profiler for
> the Lua machine.
> 
> The profiler uses the signal sampling backend from the
> low-level profiler built in vanilla LuaJIT, which was put
> into a separate module in one of the previous patches.

I suppose we can omit this clarification -- if somebody looks at git log
he'll see it.

> Thus, one cannot use both profilers at the same time.

Why we can define the signal (other that used for LuaJIT profiler) to
avoid conflict?

> 
> First of all profiler dumps the definitions of all
> loaded Lua functions and all loaded shared libraries
> (symtab) via the write buffer introduced in one of
> the previous patches.

Suggest to drop "introduced in one of ..."

> 
> As the profiling signal may occur at any time, we need
> to provide some guarantees for the sampling profiler
> to unwind consistent stack frames.
>                                    So, the VM is adjusted
> to save stack's current top frame into a dedicated
> variable `guesttop` in the virtual machine global state.

It is done in previous patch, so I suppose we shouldn't mention it in
this commit.

> 
> When signal occurs, the profiler dumps current VM state

Typo: s/signal/a signal/

> and additional info about about stacks that can be
> considered consistent in a given state. Also, profiling
> can be done without writing to file: vmstate counters
> are accumulated in static memory and can be received via
> the `luaM_sysprof_report` function. For more details see

Typo: s/details see/details, see/

> the <lmisclib.h> header file.
> 
> When profiling is over, the old signal handler is restored,
> and a special epilogue header is written.
> 
> Part of tarantool/tarantool#781

I suggest to move tests and Lua C API to commit, that introduses Lua API
as well. And in this commit we will only introduce core functionality
without API.


> ---
>  CMakeLists.txt                                |   6 +
>  src/CMakeLists.txt                            |   1 +
>  src/Makefile.dep.original                     |   8 +-
>  src/lj_arch.h                                 |  11 +
>  src/lj_errmsg.h                               |   2 +-
>  src/lj_mapi.c                                 |  26 +
>  src/lj_state.c                                |   7 +
>  src/lj_sysprof.c                              | 483 ++++++++++++++++++
>  src/lj_sysprof.h                              |  94 ++++
>  src/lmisclib.h                                |  93 ++++
>  test/tarantool-tests/CMakeLists.txt           |   1 +
>  .../misclib-sysprof-capi.test.lua             |  53 ++
>  .../misclib-sysprof-capi/CMakeLists.txt       |   1 +
>  .../misclib-sysprof-capi/testsysprof.c        | 269 ++++++++++
>  14 files changed, 1052 insertions(+), 3 deletions(-)
>  create mode 100644 src/lj_sysprof.c
>  create mode 100644 src/lj_sysprof.h
>  create mode 100644 test/tarantool-tests/misclib-sysprof-capi.test.lua
>  create mode 100644 test/tarantool-tests/misclib-sysprof-capi/CMakeLists.txt
>  create mode 100644 test/tarantool-tests/misclib-sysprof-capi/testsysprof.c

Please update <src/Makefile.original> for build with disabled sysprof as well.

> 
> diff --git a/CMakeLists.txt b/CMakeLists.txt
> index 5348e043..0c702093 100644
> --- a/CMakeLists.txt
> +++ b/CMakeLists.txt
> @@ -184,6 +184,12 @@ if(LUAJIT_DISABLE_MEMPROF)
>    AppendFlags(TARGET_C_FLAGS -DLUAJIT_DISABLE_MEMPROF)
>  endif()
>  
> +# Disable platform and lua profiler.
> +option(LUAJIT_DISABLE_SYSPROF "LuaJIT platform and lua profiler support" OFF)

Typo: s/lua/Lua/

> +if(LUAJIT_DISABLE_SYSPROF)
> +  AppendFlags(TARGET_C_FLAGS -DLUAJIT_DISABLE_SYSPROF)
> +endif()
> +
>  # Switch to harder (and slower) hash function when a collision
>  # chain in the string hash table exceeds a certain length.
>  option(LUAJIT_SMART_STRINGS "Harder string hashing function" ON)
> diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
> index 0f3d888f..c19d8b54 100644
> --- a/src/CMakeLists.txt
> +++ b/src/CMakeLists.txt
> @@ -69,6 +69,7 @@ make_source_list(SOURCES_UTILS
>  make_source_list(SOURCES_PROFILER
>    SOURCES
>      lj_memprof.c
> +    lj_sysprof.c
>      lj_profile_timer.c
>      lj_profile.c

Minor: lj_sysprof.c should be the last entry (alphabetically).

>  )
> diff --git a/src/Makefile.dep.original b/src/Makefile.dep.original
> index b7d82719..503817bb 100644
> --- a/src/Makefile.dep.original
> +++ b/src/Makefile.dep.original
> @@ -140,8 +140,8 @@ lj_lib.o: lj_lib.c lauxlib.h lua.h luaconf.h lj_obj.h lj_def.h lj_arch.h \
>  lj_load.o: lj_load.c lua.h luaconf.h lauxlib.h lj_obj.h lj_def.h \
>   lj_arch.h lj_gc.h lj_err.h lj_errmsg.h lj_buf.h lj_str.h lj_func.h \
>   lj_frame.h lj_bc.h lj_vm.h lj_lex.h lj_bcdump.h lj_parse.h
> -lj_mapi.o: lj_mapi.c lua.h luaconf.h lmisclib.h lj_obj.h lj_def.h lj_arch.h \
> - lj_dispatch.h lj_bc.h lj_jit.h lj_ir.h
> +lj_mapi.o: lj_mapi.c lua.h luaconf.h lmisclib.h lj_obj.h lj_def.h \
> + lj_arch.h lj_dispatch.h lj_bc.h lj_jit.h lj_ir.h lj_sysprof.h
>  lj_mcode.o: lj_mcode.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
>   lj_gc.h lj_err.h lj_errmsg.h lj_jit.h lj_ir.h lj_mcode.h lj_trace.h \
>   lj_dispatch.h lj_bc.h lj_traceerr.h lj_vm.h
> @@ -204,6 +204,10 @@ lj_strscan.o: lj_strscan.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
>   lj_char.h lj_strscan.h
>  lj_symtab.o: lj_symtab.c lj_symtab.h lj_wbuf.h lj_def.h lua.h luaconf.h \
>   lj_obj.h lj_arch.h
> +lj_sysprof.o: lj_sysprof.c lj_arch.h lua.h luaconf.h lj_sysprof.h \
> + lj_obj.h lj_def.h lmisclib.h lj_debug.h lj_dispatch.h lj_bc.h lj_jit.h \
> + lj_ir.h lj_frame.h lj_trace.h lj_traceerr.h lj_wbuf.h lj_profile_timer.h \
> + lj_symtab.h
>  lj_tab.o: lj_tab.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h lj_gc.h \
>   lj_err.h lj_errmsg.h lj_tab.h
>  lj_trace.o: lj_trace.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \

Looks like lj_state.o dependencies should be updated too, IINM:

0) We can fix missed dependencies from lj_memprof.h
1) We should add dependency by lj_sysprof.h

> diff --git a/src/lj_arch.h b/src/lj_arch.h
> index 5bf0afb8..f0b60092 100644
> --- a/src/lj_arch.h
> +++ b/src/lj_arch.h

<snipped>

> @@ -585,4 +589,11 @@
>  #define LJ_HASMEMPROF		1
>  #endif
>  
> +/* Disable or enable the platform and lua profiler. */

Typo: s/lua/Lua/

> +#if defined(LUAJIT_DISABLE_SYSPROF) || defined(LJ_ARCH_NOSYSPROF) || LJ_TARGET_WINDOWS || LJ_TARGET_CYGWIN || LJ_TARGET_PS3 || LJ_TARGET_PS4 || LJ_TARGET_XBOX360

Side note: I not sure that all available systems have pthread.h. Is that
so?

> +#define LJ_HASSYSPROF		0
> +#else
> +#define LJ_HASSYSPROF		1
> +#endif
> +
>  #endif
> diff --git a/src/lj_errmsg.h b/src/lj_errmsg.h
> index ae0a18c0..77a08cb0 100644
> --- a/src/lj_errmsg.h
> +++ b/src/lj_errmsg.h

<snipped>

> diff --git a/src/lj_mapi.c b/src/lj_mapi.c
> index b2b35a17..0b2284f0 100644
> --- a/src/lj_mapi.c
> +++ b/src/lj_mapi.c

<snipped>

> diff --git a/src/lj_state.c b/src/lj_state.c
> index f82b1b5b..cc6f92f1 100644
> --- a/src/lj_state.c
> +++ b/src/lj_state.c

<snipped>

> diff --git a/src/lj_sysprof.c b/src/lj_sysprof.c
> new file mode 100644
> index 00000000..c0c83fa9
> --- /dev/null
> +++ b/src/lj_sysprof.c
> @@ -0,0 +1,483 @@

Minor: It would be nice to add some module description like it is done
for <src/lj_memprof.c>.

> +#define lj_sysprof_c
> +#define LUA_CORE
> +
> +#include "lj_arch.h"
> +#include "lj_sysprof.h"
> +
> +#if LJ_HASSYSPROF
> +
> +#include "lj_obj.h"
> +#include "lj_debug.h"
> +#include "lj_dispatch.h"
> +#include "lj_frame.h"
> +
> +#if LJ_HASJIT
> +#include "lj_jit.h"
> +#include "lj_trace.h"
> +#endif
> +
> +#include "lj_wbuf.h"
> +#include "lj_profile_timer.h"
> +#include "lj_symtab.h"
> +
> +#include <pthread.h>
> +#include <errno.h>
> +#include <execinfo.h>

Minor: I supposet that we should check that execinfo is available on the
system and `backtrace` symbol is defined. Not sure that it is true for
Alpine Linux, for example.

> +
> +#define SYSPROF_HANDLER_STACK_DEPTH 4

Minor: please add a comment why it is 4 and not 5 or 3.

> +#define SYSPROF_BACKTRACE_BUF_SIZE 4096

Firstly, I thought that this is the size of the buffer in bytes.
I suggest to rename this macro to SYSPROF_BACKTRACE_FRAME_MAX or some
think like that (mention FRAME is important -- to get that it stands
for the highest possible frame).
Feel free to change it on your own.

Also, I suppose that 4096 values (i.e. 4 Kb) on the stack is too much.
Do you know somebody ready to examine backtrace with at least 1k
entries? Also, if for LuaJIT expected limit *may be* really around 4K
entries with the following code:

| luajit -e 'local function rev (s) return string.gsub(s, "(.)(.+)", function (c,s1) return rev(s1)..c end) end local x = string.rep("0", 1000) rev(x)'

Tarantool's Lua fiber's stack is limited by ~300 entries, IIRC.
(All numbers are empirically established.)
In real life I never see backtraces with more than 50 entries.
So, I suppose that 512 entries are more than enough.
Thoughts?

> +
> +enum sysprof_state {
> +  /* Profiler needs to be configured. */
> +  SPS_UNCONFIGURED,

In my opinion, it is better to have default config that profiler
starts with by default.
Thoughts?

> +  /* Profiler is not running. */
> +  SPS_IDLE,
> +  /* Profiler is running. */
> +  SPS_PROFILE,
> +  /*
> +  ** Stopped in case of stopped or failed stream.
> +  ** Saved errno is set at luaM_sysprof_stop.
> +  */
> +  SPS_HALT
> +};
> +
> +struct sysprof {
> +  global_State *g; /* Profiled VM. */
> +  pthread_t thread; /* Profiled thread. */
> +  enum sysprof_state state; /* Internal state. */

As far as this internal state can be changed inside a signal handler (see
`sysprof_record_sample()`) it should be declared as volatile, to avoid
compile optimization of this code part.
Moreover, it should be used `sig_atomic_t` here to be sure that its
usage is atomic in context of usage inside the signal handler.

> +  struct lj_wbuf out; /* Output accumulator. */
> +  struct luam_sysprof_counters counters; /* Profiling counters. */
> +  struct luam_sysprof_options opt; /* Profiling options. */
> +  struct luam_sysprof_config config; /* Profiler configurations. */
> +  lj_profile_timer timer; /* Profiling timer. */
> +  int saved_errno; /* Saved errno when profiler failed. */
> +};
> +

Minor: I suppose we should add a comment here, that only one VM
(`global_State` in <lj_obj.h> terms) can be profiled at a time.

> +static struct sysprof sysprof = {0};
> +
> +/* --- Stream ------------------------------------------------------------- */
> +
> +static const uint8_t ljp_header[] = {'l', 'j', 'p', LJP_FORMAT_VERSION,
> +                                     0x0, 0x0, 0x0};

Minor: use tabs for alignment here.

> +
> +static int stream_is_needed(struct sysprof *sp)
> +{
> +  return LUAM_SYSPROF_DEFAULT != sp->opt.mode;

Nit: `sp->opt.mode != LUAM_SYSPROF_DEFAULT` is more readable for me.
Feel free to ignore.

> +}
> +
> +static void stream_prologue(struct sysprof *sp)
> +{
> +  lj_symtab_dump(&sp->out, sp->g);
> +  lj_wbuf_addn(&sp->out, ljp_header, sizeof(ljp_header));
> +}
> +
> +static void stream_epilogue(struct sysprof *sp)
> +{
> +  lj_wbuf_addbyte(&sp->out, LJP_EPILOGUE_BYTE);
> +}
> +
> +static void stream_lfunc(struct lj_wbuf *buf, GCfunc *func)

Nit: I suppose, that we can use `const GCfunc` instead to accent, that
this func cann't change here.
Here and below.

> +{

Minor: I suppose, that we should add lua_assert(isluafunc(func)) here.

> +  GCproto *pt = funcproto(func);

Nit: GC proto can be declared as `const` here, isn't it?

> +  lua_assert(pt != NULL);
> +  lj_wbuf_addbyte(buf, LJP_FRAME_LFUNC);
> +  lj_wbuf_addu64(buf, (uintptr_t)pt);
> +  lj_wbuf_addu64(buf, (uint64_t)pt->firstline);
> +}
> +
> +static void stream_cfunc(struct lj_wbuf *buf, GCfunc *func)
> +{

Minor: I suppose, that we should add lua_assert(iscfunc(func)) here.

> +  lj_wbuf_addbyte(buf, LJP_FRAME_CFUNC);
> +  lj_wbuf_addu64(buf, (uintptr_t)func->c.f);
> +}
> +
> +static void stream_ffunc(struct lj_wbuf *buf, GCfunc *func)
> +{

Minor: I suppose, that we should add lua_assert(isffunc(func)) here.

> +  lj_wbuf_addbyte(buf, LJP_FRAME_FFUNC);
> +  lj_wbuf_addu64(buf, func->c.ffid);
> +}
> +
> +static void stream_frame_lua(struct lj_wbuf *buf, cTValue *frame)
> +{
> +  GCfunc *func = frame_func(frame);

Nit: GCfunc may be declared as const here, IINM.

> +  lua_assert(NULL != func);

Side note: why do you use the reversed condition order here? :)

> +  if (isluafunc(func)) {
> +    stream_lfunc(buf, func);
> +  } else if (isffunc(func)) {
> +    stream_ffunc(buf, func);
> +  } else {

Minor: I suppose that we should use also `else if` here with else
lua_assert(0) unreachable branch below.

> +    stream_cfunc(buf, func);
> +  }
> +}
> +
> +static void stream_backtrace_lua(struct sysprof *sp)
> +{
> +  global_State *g = sp->g;
> +  struct lj_wbuf *buf = &sp->out;
> +  cTValue *top_frame = NULL, *frame = NULL, *bot = NULL;
> +  lua_State *L = NULL;

Nit: This initialization looks excess. Just declaration is enough.
Same for top_frame, frame and bot.

> +
> +  lua_assert(g != NULL);
> +  L = gco2th(gcref(g->cur_L));
> +  lua_assert(L != NULL);
> +
> +  top_frame = sp->g->top_frame.guesttop.interp_base - 1;

Looks like this line expects that interp_base always has Lua function on
top. Does it conflict with fast functions and C functions?

Should it be `- (1 + LJ_FR2)` instead?

Nit: s/sp->g/g/

> +  bot = tvref(L->stack) + LJ_FR2;
> +  /* Traverse frames backwards */
> +  for (frame = top_frame; frame > bot; frame = frame_prev(frame)) {
> +    if (frame_gc(frame) == obj2gco(L)) {

Nit: `{` and `}` are excess.

> +      continue;  /* Skip dummy frames. See lj_err_optype_call(). */
> +    }
> +    stream_frame_lua(buf, frame);

Should we skip VARAG pseudo-frames as well?

> +  }
> +
> +  lj_wbuf_addbyte(buf, LJP_FRAME_LUA_LAST);
> +}
> +

Minor: please add comment about return values. I don't get when this
function returns the same address was given, and when NULL. And for what
NULL stands for.

> +static void *stream_frame_host(int frame_no, void *addr)

Nit: also, looks like addr can be declared as a constant, can't it?

> +{
> +  struct sysprof *sp = &sysprof;
> +
> +  if (LJ_UNLIKELY(frame_no <= SYSPROF_HANDLER_STACK_DEPTH)) {
> +    /*
> +    ** We don't want the profiler stack to be streamed, as it will
> +    ** burden the profile with unnecessary information.
> +    */
> +    return addr;
> +  } else if (LJ_UNLIKELY(sp->opt.mode == LUAM_SYSPROF_LEAF &&
> +                         frame_no > SYSPROF_HANDLER_STACK_DEPTH + 1)) {

Looks like it should be `frame_no > SYSPROF_HANDLER_STACK_DEPTH, indeed
If `SYSPROF_HANDLER_STACK_DEPTH + 1 == frame_no` the first branch is
taken.

> +    return NULL;
> +  }

Nit: I suggest to move comment upper and omit curly brackets.
Feel free to ignore.

> +
> +  lj_wbuf_addu64(&sp->out, (uintptr_t)addr);
> +  return addr;
> +}
> +
> +static void default_backtrace_host(void *(writer)(int frame_no, void *addr))
> +{
> +  static void* backtrace_buf[SYSPROF_BACKTRACE_BUF_SIZE] = {};

Typo: s/void* backtrace_buf/void *backtrace_buf/

> +
> +  struct sysprof *sp = &sysprof;
> +  const int depth = backtrace(backtrace_buf,
> +                              sp->opt.mode == LUAM_SYSPROF_LEAF
> +                              ? SYSPROF_HANDLER_STACK_DEPTH + 1
> +                              : SYSPROF_BACKTRACE_BUF_SIZE);
> +  lua_assert(depth >= SYSPROF_HANDLER_STACK_DEPTH);

Nit: strictly saying it is the following:
| depth <= sp->opt.mode == LUAM_SYSPROF_LEAF ?
|          SYSPROF_HANDLER_STACK_DEPTH + 1 : SYSPROF_BACKTRACE_BUF_SIZE

So I suggest to add a variable to use in backtrace configuration and the
assert.

> +
> +  for (int i = SYSPROF_HANDLER_STACK_DEPTH; i < depth; ++i) {
> +    if (!writer(i - SYSPROF_HANDLER_STACK_DEPTH + 1, backtrace_buf[i])) {

Nit: { } brackets are excess.

> +      return;
> +    }
> +  }
> +}
> +
> +static void stream_backtrace_host(struct sysprof *sp)
> +{
> +  lua_assert(sp->config.backtracer != NULL);
> +  sp->config.backtracer(stream_frame_host);
> +  lj_wbuf_addu64(&sp->out, (uintptr_t)LJP_FRAME_HOST_LAST);
> +}
> +
> +static void stream_trace(struct sysprof *sp)
> +{

I suppose we should add the following ifdef here.
| #if LJ_HASJIT

> +  struct lj_wbuf *out = &sp->out;
> +

Nit: this empty line is excess.

> +  uint32_t traceno = sp->g->vmstate;
> +  jit_State *J = G2J(sp->g);
> +
> +  lj_wbuf_addu64(out, traceno);
> +  lj_wbuf_addu64(out, (uintptr_t)J->prev_pt);
> +  lj_wbuf_addu64(out, J->prev_line);

Don't get this line. J->prev_line is not the line of currently executed
trace, IINM. But the line of code that is recorded now. Also, prev_line
is updated only by LuaJIT's profiler in 'l' mode.

Ditto for `J->prev_pt`.

> +}
> +
> +static void stream_guest(struct sysprof *sp)
> +{
> +  stream_backtrace_lua(sp);
> +  stream_backtrace_host(sp);
> +}
> +
> +static void stream_host(struct sysprof *sp)
> +{
> +  stream_backtrace_host(sp);
> +}
> +
> +typedef void (*event_streamer)(struct sysprof *sp);
> +
> +static event_streamer event_streamers[] = {
> +  /* XXX: order is important */
> +  stream_host,  /* LJ_VMST_INTERP */
> +  stream_guest, /* LJ_VMST_LFUNC */
> +  stream_guest, /* LJ_VMST_FFUNC */
> +  stream_guest, /* LJ_VMST_CFUNC */
> +  stream_host,  /* LJ_VMST_GC */
> +  stream_host,  /* LJ_VMST_EXIT */
> +  stream_host,  /* LJ_VMST_RECORD */
> +  stream_host,  /* LJ_VMST_OPT */
> +  stream_host,  /* LJ_VMST_ASM */
> +  stream_trace  /* LJ_VMST_TRACE */
> +};

I don't understand the following point:

Assume I have the following Lua code named <tsysprof.lua>:

| 1  local tsp = require "testsysprof"
| 2  jit.off() jit.flush()
| 3
| 4  function lc(nsec) -- Lua call to C
| 5    local r = 0
| 6    for i = 1, 1e5 do r = r + i end
| 7    return tsp.c(nsec) -- Regular C function does n sleeps to 1 sec
| 8  end
| 9
| 10 local function lclc(nsec) -- Lua call to C call to Lua call to C
| 11   local r = 0
| 12   for i = 1, 1e5 do r = r + i end
| 13   return tsp.clc(nsec) -- C call to Lua call to C
| 14 end
| 15
| 16 local s, r = misc.sysprof.start( {mode = "C", interval = 1, path = "sysprof.bin"})
| 17 assert(s,r)
| 18
| 19 for i = 1, 100 do
| 20   assert(lclc(3) == true) -- 3 sec sleep
| 21 end
| 22
| 23 s,r = misc.sysprof.stop()
| 24 assert(s,r)

I expect to see the following callgraph: Lua->C->Lua->C:

 +----------------------------+
 |    CFUNC <tsp.c addr>      |
 +----------------------------+
 |....CFUNCS related to VM....|
 +----------------------------+
 |    tsysprof.lua:4          |
 +----------------------------+
 |....CFUNCS related to VM....|
 +----------------------------+
 |    CFUNC <tsp.c addr>      |
 +----------------------------+
 |....CFUNCS related to VM....|
 +----------------------------+
 |    tsysprof.lua:10         |
 +----------------------------+---------------------------+
 |    tsysprof.lua:0          |  MB some other code flow  |
 +----------------------------+---------------------------+
 |                      tsysprof.lua:0                    |
 +--------------------------------------------------------+
 |......................CFUNCS till main()................|
 +--------------------------------------------------------+

From the code flow above IINM we report separate information about host
stack and guest stack. Will they be merged together by sysprof-parser?

If not -- then instead excpected callgraph I'll see something like the
following:
                              +---------------------------+
                              |    CFUNC <tsp.c addr>     |
                              +---------------------------+
                              |....CFUNCS related to VM...|
                              +---------------------------+
                              |    tsysprof.lua:4         |
 +----------------------------+---------------------------+
 |....CFUNCS related to VM....|....CFUNCS related to VM...|
 +----------------------------+---------------------------+
 |    tsysprof.lua:10         |    CFUNC <tsp.c addr>     |
 +----------------------------+---------------------------+-------------------------+
 |                        sysprof.lua:0  ttttt            |.MB some other code flow.|
 +--------------------------------------------------------+-------------------------+
 |                      tsysprof.lua:0                                              |
 +----------------------------------------------------------------------------------+
 |......................CFUNCS till main()..........................................|
 +----------------------------------------------------------------------------------+

But, if them merged please add the corresponding comment here and in
RFC, to avoid confusing.

> +
> +static void stream_event(struct sysprof *sp, uint32_t vmstate)
> +{
> +  event_streamer stream = NULL;
> +
> +  /* Check that vmstate fits in 4 bits (see streaming format) */
> +  lua_assert((vmstate & ~(uint32_t)((1 << 4) - 1)) == 0);

Minor: May be it is better to add corresponding define in the
header?

> +  lj_wbuf_addbyte(&sp->out, (uint8_t)vmstate);
> +

Nit: the following empty line is excess.
Feel free to ignor.

> +  stream = event_streamers[vmstate];
> +  lua_assert(NULL != stream);
> +  stream(sp);
> +}
> +
> +/* -- Signal handler ------------------------------------------------------ */
> +
> +static void sysprof_record_sample(struct sysprof *sp, siginfo_t *info)
> +{
> +  global_State *g = sp->g;
> +  uint32_t _vmstate = ~(uint32_t)(g->vmstate);
> +  uint32_t vmstate = _vmstate < LJ_VMST_TRACE ? _vmstate : LJ_VMST_TRACE;
> +

Some comments about the assert below are wellcome.

> +  lua_assert(pthread_self() == sp->thread);
From Programmer's Manual PTHREAD_SELF(3)

| SYNOPSIS
|        #include <pthread.h>
|        pthread_t pthread_self(void);
|        Compile and link with -pthread.

So it looks like we should add corresponding flag in CMake.

> +
> +  /* Caveat: order of counters must match vmstate order in <lj_obj.h>. */
> +  ((uint64_t *)&sp->counters)[vmstate]++;
> +
> +  sp->counters.samples++;
> +  sp->counters.overruns += info->si_overrun;
> +
> +  if (stream_is_needed(sp)) {

Minor: `if (!stream_is_needed(sp)) return;` (early return) looks pretier to me.

> +    stream_event(sp, vmstate);
> +    if (LJ_UNLIKELY(lj_wbuf_test_flag(&sp->out, STREAM_ERRIO|STREAM_STOP))) {
> +      sp->saved_errno = lj_wbuf_errno(&sp->out);
> +      lj_wbuf_terminate(&sp->out);
> +      sp->state = SPS_HALT;
> +    }
> +  }
> +}
> +
> +static void sysprof_signal_handler(int sig, siginfo_t *info, void *ctx)
> +{
> +  struct sysprof *sp = &sysprof;
> +  UNUSED(sig);
> +  UNUSED(ctx);
> +
> +  switch (sp->state) {
> +    case SPS_PROFILE:
> +      sysprof_record_sample(sp, info);
> +      break;
> +
> +    case SPS_IDLE:
> +    case SPS_HALT:
> +      /* noop */
> +      break;
> +
> +    default:
> +      lua_assert(0);
> +      break;
> +  }
> +}
> +
> +/* -- Internal ------------------------------------------------------------ */
> +
> +static int sysprof_validate(struct sysprof *sp,
> +                            const struct luam_sysprof_options *opt)

Side note: See my question about default configuration above.

> +{
> +  switch (sp->state) {
> +    case SPS_UNCONFIGURED:
> +      return SYSPROF_ERRUSE;
> +
> +    case SPS_IDLE: {

Nit: {} looks unnecessary here.

> +      if (opt->mode > LUAM_SYSPROF_CALLGRAPH) {

As far as it is enum it can be < than LUAM_SYSPROF_DEFAULT.

> +        return SYSPROF_ERRUSE;
> +      } else if (opt->mode != LUAM_SYSPROF_DEFAULT &&
> +                 (opt->buf == NULL || opt->len == 0 ||
> +                  sp->config.writer == NULL || sp->config.on_stop == NULL)) {

If they all are set by public API may be it is better to use
lua_apicheck() instead?

> +        return SYSPROF_ERRUSE;
> +      } else if (opt->interval == 0) {
> +        return SYSPROF_ERRUSE;
> +      }
> +      break;
> +    }
> +
> +    case SPS_PROFILE:
> +    case SPS_HALT:
> +      return SYSPROF_ERRRUN;

And the similar check is missed for memprof start (ERRRUN return only if
the state is MPS_PROFILE).

I suppose we should forbid to start profiling if the profiler is in HALT
state -- it means that the profiling with error to report still running.
Wait Igor's opinion here.
If yes, it should be fixed in memprof in the separate patchset.

> +
> +    default:
> +      lua_assert(0);

Nit: `break;` is missing.

From The C programming language - Second edition (K&R 2):

Chapter 3.4 Switch

| As a matter of good form, put a break after the last case (the default
| here) even though it's logically unnecessary. Some day when another case
| gets added at the end, this bit of defensive programming will save you.

> +  }
> +
> +  return SYSPROF_SUCCESS;
> +}
> +
> +static int sysprof_init(struct sysprof *sp, lua_State *L,
> +                        const struct luam_sysprof_options *opt)
> +{
> +  int status = sysprof_validate(sp, opt);

Nit: status may be declared as const.

> +  if (SYSPROF_SUCCESS != status) {

Nit: {} are excess.

> +    return status;
> +  }
> +
> +  /* Copy validated options to sysprof state. */
> +  memcpy(&sp->opt, opt, sizeof(sp->opt));
> +
> +  /* Init general fields. */
> +  sp->g = G(L);
> +  sp->thread = pthread_self();
> +
> +  /* Reset counters. */
> +  memset(&sp->counters, 0, sizeof(sp->counters));
> +
> +  /* Reset saved errno. */
> +  sp->saved_errno = 0;
> +
> +  if (stream_is_needed(sp)) {
> +    lj_wbuf_init(&sp->out, sp->config.writer, opt->ctx, opt->buf, opt->len);
> +  }

Nit: {} are excess.

> +
> +  return SYSPROF_SUCCESS;
> +}
> +
> +/* -- Public profiling API ------------------------------------------------ */
> +
> +int lj_sysprof_configure(const struct luam_sysprof_config *config)
> +{
> +  struct sysprof *sp = &sysprof;
> +  lua_assert(config != NULL);
> +  if (sp->state != SPS_UNCONFIGURED && sp->state != SPS_IDLE) {

Nit: {} are excess.

> +    return SYSPROF_ERRUSE;
> +  }
> +
> +  memcpy(&sp->config, config, sizeof(*config));
> +
> +  if (sp->config.backtracer == NULL) {

Ditto.

> +    sp->config.backtracer = default_backtrace_host;
> +  }
> +
> +  sp->state = SPS_IDLE;
> +
> +  return SYSPROF_SUCCESS;
> +}
> +
> +int lj_sysprof_start(lua_State *L, const struct luam_sysprof_options *opt)
> +{
> +  struct sysprof *sp = &sysprof;
> +
> +  int status = sysprof_init(sp, L, opt);
> +  if (SYSPROF_SUCCESS != status) {
> +    if (NULL != sp->config.on_stop) {

I suppose as far as this situation is only possible by the user's
mistake during code development it will be better to use lua_assert() or
lua_apicheck() here.
Thoughts?

> +      /*
> +      ** Initialization may fail in case of unconfigured sysprof,
> +      ** so we cannot guarantee cleaning up resources in this case.
> +      */
> +      sp->config.on_stop(opt->ctx, opt->buf);
> +    }
> +    return status;
> +  }
> +
> +  sp->state = SPS_PROFILE;
> +
> +  if (stream_is_needed(sp)) {
> +    stream_prologue(sp);
> +    if (LJ_UNLIKELY(lj_wbuf_test_flag(&sp->out, STREAM_ERRIO|STREAM_STOP))) {
> +      /* on_stop call may change errno value. */
> +      int saved_errno = lj_wbuf_errno(&sp->out);

Nit: saved_errno may be declared as const.

> +      /* Ignore possible errors. mp->out.buf may be NULL here. */
> +      sp->config.on_stop(opt->ctx, sp->out.buf);
> +      lj_wbuf_terminate(&sp->out);
> +      sp->state = SPS_IDLE;
> +      errno = saved_errno;
> +      return SYSPROF_ERRIO;
> +    }
> +  }
> +
> +  sp->timer.opt.interval_msec = opt->interval;
> +  sp->timer.opt.handler = sysprof_signal_handler;
> +  lj_profile_timer_start(&sp->timer);
> +
> +  return SYSPROF_SUCCESS;
> +}
> +
> +int lj_sysprof_stop(lua_State *L)
> +{
> +  struct sysprof *sp = &sysprof;
> +  global_State *g = sp->g;
> +  struct lj_wbuf *out = &sp->out;
> +
> +  if (SPS_IDLE == sp->state) {
> +    return SYSPROF_ERRSTOP;

Why it is bad to have SPS_IDLE here (why UNCONFIGURED is OK)?
We can use PROFILE_ERRUSE as for this error for consistency with
memprof.

> +  } else if (G(L) != g) {
> +    return SYSPROF_ERRUSE;
> +  }

Nit: {} are excess.

> +
> +  lj_profile_timer_stop(&sp->timer);
> +
> +  if (SPS_HALT == sp->state) {
> +    errno = sp->saved_errno;
> +    sp->state = SPS_IDLE;
> +    /* wbuf was terminated when error occured. */
> +    return SYSPROF_ERRIO;
> +  }
> +
> +  sp->state = SPS_IDLE;
> +
> +  if (stream_is_needed(sp)) {
> +    int cb_status = 0;

Nit: This initialization looks excess.

> +
> +    stream_epilogue(sp);
> +    lj_wbuf_flush(out);
> +
> +    cb_status = sp->config.on_stop(sp->opt.ctx, out->buf);
> +    if (LJ_UNLIKELY(lj_wbuf_test_flag(out, STREAM_ERRIO|STREAM_STOP)) ||

Typo: s/ERRIO|STREAM/ERRIO | STREAM/

> +        0 != cb_status) {

Side note: why do you use here 0 != cb_status, but not cb_status != 0?

> +      errno = lj_wbuf_errno(out);
> +      lj_wbuf_terminate(out);
> +      return SYSPROF_ERRIO;
> +    }
> +
> +    lj_wbuf_terminate(out);
> +  }
> +
> +  return SYSPROF_SUCCESS;
> +}
> +
> +int lj_sysprof_report(struct luam_sysprof_counters *counters)
> +{
> +  const struct sysprof *sp = &sysprof;
> +  if (sp->state != SPS_IDLE) {

Nit: {} are excess.

> +    return SYSPROF_ERRUSE;
> +  }
> +  memcpy(counters, &sp->counters, sizeof(sp->counters));
> +  return SYSPROF_SUCCESS;
> +}
> +
> +#else /* LJ_HASSYSPROF */

<snipped>

> +#endif /* LJ_HASSYSPROF */
> +

Nit: this empty line is excess.

> diff --git a/src/lj_sysprof.h b/src/lj_sysprof.h
> new file mode 100644
> index 00000000..515bc08c
> --- /dev/null
> +++ b/src/lj_sysprof.h
> @@ -0,0 +1,94 @@
> +/*
> +** Sysprof - platform and Lua profiler

Typo: s/profiler/profiler/.

> +*/
> +
> +/*
> +** XXX: Platform profiler is not thread safe. Please, don't try to
> +** use it inside several VM, you can profile only one at a time.
> +*/
> +
> +/*
> +** XXX: Platform profiler uses the same signal backend as lj_profile. Please,
> +** don't use both at the same time.

So, may be it is better to use another one signal for sysprof
profilling?

> +*/
> +
> +#ifndef _LJ_SYSPROF_H
> +#define _LJ_SYSPROF_H
> +
> +#include "lj_obj.h"
> +#include "lmisclib.h"
> +
> +#define LJP_FORMAT_VERSION 0x1
> +
> +/*
> +** Event stream format:

<snipped>

> +*/
> +
> +#define LJP_FRAME_LFUNC ((uint8_t)1)
> +#define LJP_FRAME_CFUNC ((uint8_t)2)
> +#define LJP_FRAME_FFUNC ((uint8_t)3)
> +#define LJP_FRAME_LUA_LAST  0x80
> +#define LJP_FRAME_HOST_LAST NULL
> +
> +#define LJP_EPILOGUE_BYTE 0x80
> +
> +int lj_sysprof_configure(const struct luam_sysprof_config *config);
> +
> +int lj_sysprof_start(lua_State *L, const struct luam_sysprof_options *opt);
> +
> +int lj_sysprof_stop(lua_State *L);
> +
> +int lj_sysprof_report(struct luam_sysprof_counters *counters);
> +
> +#endif
> +

Nit: this empty line is excess.

> diff --git a/src/lmisclib.h b/src/lmisclib.h
> index 0c07707e..3545ff47 100644
> --- a/src/lmisclib.h
> +++ b/src/lmisclib.h
> @@ -60,6 +60,99 @@ struct luam_Metrics {
>  
>  LUAMISC_API void luaM_metrics(lua_State *L, struct luam_Metrics *metrics);
>  
> +/* --- Sysprof - platform and lua profiler -------------------------------- */
> +
> +/* Profiler configurations */

Typo: s/configurations/configurations./

> +struct luam_sysprof_config {

Minor: luam_Sysprof_Config is more consistent. Perfect variant is the
luam_<one word with the first letter in uppercase>, but I don't know how
to name it.

> +  /*
> +  ** Writer function for profile events.
> +  ** Should return amount of written bytes on success or zero in case of error.
> +  ** Setting *data to NULL means end of profiling.
> +  ** For details see <lj_wbuf.h>.

Please add also, that this writer *must be* asinc-safe function (see also
man 7 signal-safety).

> +  */
> +  size_t (*writer)(const void **data, size_t len, void *ctx);
> +  /*
> +  ** Callback on profiler stopping. Required for correctly cleaning
> +  ** at VM finalization when profiler is still running.
> +  ** Returns zero on success.
> +  */
> +  int (*on_stop)(void *ctx, uint8_t *buf);
> +  /*
> +  ** Backtracing function for the host stack. Should call `frame_writer` on
> +  ** each frame in the stack in the order from the stack top to the stack
> +  ** bottom. The `frame_writer` function is implemented inside the sysprof
> +  ** and will be passed to the `backtracer` function. If `frame_writer` returns
> +  ** NULL, backtracing should be stopped. If `frame_writer` returns not NULL,
> +  ** the backtracing should be continued if there are frames left.
> +  */
> +  void (*backtracer)(void *(*frame_writer)(int frame_no, void *addr));
> +};
> +
> +enum luam_sysprof_mode {

Nit: I don't think that we need to provide the enum here, just macro
defines is enough.
Feel free to ignore, if not:

Minor: Ditto for naming as for luam_sysprof_config.

> +  /*
> +  ** DEFAULT mode collects only data for luam_sysprof_counters, which is stored
> +  ** in memory and can be collected with luaM_sysprof_report after profiler
> +  ** stops.
> +  */
> +  LUAM_SYSPROF_DEFAULT,
> +  /*
> +  ** LEAF mode = DEFAULT + streams samples with only top frames of host and
> +  ** guests stacks in format described in <lj_sysprof.h>
> +  */
> +  LUAM_SYSPROF_LEAF,
> +  /*
> +  ** CALLGRAPH mode = DEFAULT + streams samples with full callchains of host
> +  ** and guest stacks in format described in <lj_sysprof.h>
> +  */
> +  LUAM_SYSPROF_CALLGRAPH
> +};
> +
> +struct luam_sysprof_counters {

Minor: Ditto for naming as for luam_sysprof_config.

> +  uint64_t vmst_interp;
> +  uint64_t vmst_lfunc;
> +  uint64_t vmst_ffunc;
> +  uint64_t vmst_cfunc;
> +  uint64_t vmst_gc;
> +  uint64_t vmst_exit;
> +  uint64_t vmst_record;
> +  uint64_t vmst_opt;
> +  uint64_t vmst_asm;
> +  uint64_t vmst_trace;
> +  /* XXX: order of vmst counters is important */

Minor: please add clarification that it must be the same as the order of
the vmstates. Or even better: add LUAJIT_STATIC_ASSERT.

Typo: s/important/important./

> +  uint64_t samples;
> +  uint64_t overruns;
> +};
> +
> +/* Profiler options */

Typo: s/options/options./

> +struct luam_sysprof_options {

Minor: Ditto for naming as for luam_sysprof_config.

> +  /* Profiling mode */

Typo: s/mode/mode./

> +  enum luam_sysprof_mode mode;
> +  /* Sampling interval in msec */

Typo: s/msec/msec./

> +  uint64_t interval;
> +  /* Custom buffer to write data. */
> +  uint8_t *buf;
> +  /* The buffer's size. */
> +  size_t len;
> +  /* Context for the profile writer and final callback. */
> +  void *ctx;
> +};
> +
> +#define SYSPROF_SUCCESS (0)
> +#define SYSPROF_ERRUSE  (1)
> +#define SYSPROF_ERRRUN  (2)
> +#define SYSPROF_ERRSTOP (3)
> +#define SYSPROF_ERRIO   (4)

There is no need in () here. Also why we can't move PROFILER status from
<lj_memprof.h> here instead?

> +
> +LUAMISC_API int luaM_sysprof_configure(const struct luam_sysprof_config *config);
> +
> +LUAMISC_API int luaM_sysprof_start(lua_State *L,
> +                                   const struct luam_sysprof_options *opt);
> +
> +LUAMISC_API int luaM_sysprof_stop(lua_State *L);
> +
> +LUAMISC_API int luaM_sysprof_report(struct luam_sysprof_counters *counters);
> +
> +
>  #define LUAM_MISCLIBNAME "misc"
>  LUALIB_API int luaopen_misc(lua_State *L);
>  
> diff --git a/test/tarantool-tests/CMakeLists.txt b/test/tarantool-tests/CMakeLists.txt
> index a872fa5e..cf47cc67 100644
> --- a/test/tarantool-tests/CMakeLists.txt
> +++ b/test/tarantool-tests/CMakeLists.txt
> @@ -62,6 +62,7 @@ add_subdirectory(gh-6189-cur_L)

<snipped>

> diff --git a/test/tarantool-tests/misclib-sysprof-capi.test.lua b/test/tarantool-tests/misclib-sysprof-capi.test.lua
> new file mode 100644
> index 00000000..d468572d
> --- /dev/null
> +++ b/test/tarantool-tests/misclib-sysprof-capi.test.lua
> @@ -0,0 +1,53 @@
> +-- Sysprof is implemented for x86 and x64 architectures only.
> +require("utils").skipcond(
> +  jit.arch ~= "x86" and jit.arch ~= "x64",
> +  jit.arch.." architecture is NIY for memprof"

Typo: s/memprof/sysprof/

> +)
> +
> +local testsysprof = require("testsysprof")
> +
> +local tap = require("tap")
> +
> +local test = tap.test("clib-misc-sysprof")
> +test:plan(4)
> +
> +test:ok(testsysprof.base())
> +test:ok(testsysprof.validation())
> +
> +local function lua_payload(n)
> +  if n <= 1 then
> +    return n
> +  end
> +  return lua_payload(n - 1) + lua_payload(n - 2)
> +end
> +
> +local function payload()
> +  local n_iterations = 500000
> +
> +  local co = coroutine.create(function ()
> +    for i = 1, n_iterations do
> +      if i % 2 == 0 then
> +        testsysprof.c_payload(10)
> +      else
> +        lua_payload(10)
> +      end
> +      coroutine.yield()
> +    end
> +  end)
> +
> +  for _ = 1, n_iterations do
> +    coroutine.resume(co)
> +  end
> +end
> +
> +local jit = require('jit')
> +
> +jit.off()
> +
> +test:ok(testsysprof.profile_func(payload))
> +
> +jit.on()
> +jit.flush()
> +
> +test:ok(testsysprof.profile_func(payload))
> +

Minor: please, add the following line to unify test with the others:
| os.exit(test:check() and 0 or 1)

> diff --git a/test/tarantool-tests/misclib-sysprof-capi/CMakeLists.txt b/test/tarantool-tests/misclib-sysprof-capi/CMakeLists.txt
> new file mode 100644
> index 00000000..d9fb1a1a
> --- /dev/null
> +++ b/test/tarantool-tests/misclib-sysprof-capi/CMakeLists.txt
> @@ -0,0 +1 @@
> +BuildTestCLib(testsysprof testsysprof.c)
> diff --git a/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c b/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c
> new file mode 100644
> index 00000000..46970a72
> --- /dev/null
> +++ b/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c
> @@ -0,0 +1,269 @@

I see the following warnings during compilation:

| /home/burii/reviews/luajit/sysprof/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c: In function 'on_stop_cb_default':
| /home/burii/reviews/luajit/sysprof/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c:74:51: warning: unused parameter 'buf' [-Wunused-parameter]
|    74 | static int on_stop_cb_default(void *opt, uint8_t *buf)
|       |                                          ~~~~~~~~~^~~
| /home/burii/reviews/luajit/sysprof/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c: In function 'c_payload':
| /home/burii/reviews/luajit/sysprof/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c:115:1: warning: no return statement in function returning non-void [-Wreturn-type]
|   115 | }
|       | ^
| /home/burii/reviews/luajit/sysprof/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c: In function 'base':
| /home/burii/reviews/luajit/sysprof/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c:126:3: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
|   126 |   struct luam_sysprof_options opt = {};
|       |   ^~~~~~
| /home/burii/reviews/luajit/sysprof/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c:133:3: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
|   133 |   struct luam_sysprof_counters cnt = {};
|       |   ^~~~~~

Also test fails with the following coredump in backtrace:

| #0  0x000055c326294d8d in lj_BC_RET () at buildvm_x86.dasc:811
| #1  0x000055c326296138 in lj_ff_coroutine_resume () at buildvm_x86.dasc:1688
| #2  0x000055c3262271da in lua_pcall (L=0x4197d378, nargs=0, nresults=0, errfunc=0) at /home/burii/reviews/luajit/sysprof/src/lj_api.c:1187
| #3  0x00007f5639a8c9b0 in profile_func (L=0x4197d378) at /home/burii/reviews/luajit/sysprof/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c:231
| #4  0x000055c326295313 in lj_BC_FUNCC () at buildvm_x86.dasc:811
| #5  0x000055c3262271da in lua_pcall (L=0x4197d378, nargs=0, nresults=-1, errfunc=2) at /home/burii/reviews/luajit/sysprof/src/lj_api.c:1187
| #6  0x000055c326220a9e in docall (L=0x4197d378, narg=0, clear=0) at /home/burii/reviews/luajit/sysprof/src/luajit.c:121
| #7  0x000055c32622146a in handle_script (L=0x4197d378, argx=0x7fffcbbb3010) at /home/burii/reviews/luajit/sysprof/src/luajit.c:291
| #8  0x000055c3262220c3 in pmain (L=0x4197d378) at /home/burii/reviews/luajit/sysprof/src/luajit.c:551
| #9  0x000055c326295313 in lj_BC_FUNCC () at buildvm_x86.dasc:811
| #10 0x000055c3262274f9 in lua_cpcall (L=0x4197d378, func=0x55c326221ee2 <pmain>, ud=0x0) at /home/burii/reviews/luajit/sysprof/src/lj_api.c:1221
| #11 0x000055c3262221d5 in main (argc=4, argv=0x7fffcbbb2ff8) at /home/burii/reviews/luajit/sysprof/src/luajit.c:580

> +#include <lua.h>
> +#include <luajit.h>
> +#include <lauxlib.h>
> +
> +#include <lmisclib.h>
> +
> +#include <stdlib.h>
> +#include <errno.h>
> +
> +#undef NDEBUG
> +#include <assert.h>
> +
> +
> +/* --- utils -------------------------------------------------------------- */
> +
> +#define SYSPROF_INTERVAL_DEFAULT 11
> +
> +/*
> +** Yep, 8Mb. Tuned in order not to bother the platform with too often flushes.
> +*/
> +#define STREAM_BUFFER_SIZE (8 * 1024 * 1024)

Minor: I suppose that we can take a smaller size for test.
Feel free to ignore.

> +
> +/* Structure given as ctx to memprof writer and on_stop callback. */
> +struct sysprof_ctx {

Minor: For our test files we use tabs (as in Tarantool) instead
"half-quoter tabs" (yes I know, that it is inconsistent).
Here and below.

> +  /* Output file stream for data. */
> +  FILE *stream;
> +  /* Buffer for data. */
> +  uint8_t buf[STREAM_BUFFER_SIZE];
> +};
> +
> +/*
> +** Default buffer writer function.
> +** Just call fwrite to the corresponding FILE.

This writer function is not async-signal-safe. We should use just write
here instead.

> +*/
> +static size_t buffer_writer_default(const void **buf_addr, size_t len,
> +                                    void *opt)
> +{
> +  struct sysprof_ctx *ctx = opt;
> +  FILE *stream = ctx->stream;
> +  const void * const buf_start = *buf_addr;
> +  const void *data = *buf_addr;
> +  size_t write_total = 0;
> +
> +  assert(len <= STREAM_BUFFER_SIZE);
> +
> +  for (;;) {
> +    const size_t written = fwrite(data, 1, len - write_total, stream);
> +
> +    if (written == 0) {
> +      /* Re-tries write in case of EINTR. */
> +      if (errno != EINTR) {
> +        /* Will be freed as whole chunk later. */
> +        *buf_addr = NULL;
> +        return write_total;
> +      }
> +      errno = 0;
> +      continue;
> +    }
> +
> +    write_total += written;
> +    assert(write_total <= len);
> +
> +    if (write_total == len)
> +      break;
> +
> +    data = (uint8_t *)data + (ptrdiff_t)written;
> +  }
> +
> +  *buf_addr = buf_start;
> +  return write_total;
> +}
> +
> +/* Default on stop callback. Just close the corresponding stream. */
> +static int on_stop_cb_default(void *opt, uint8_t *buf)
> +{
> +  struct sysprof_ctx *ctx = opt;
> +  FILE *stream = ctx->stream;
> +  free(ctx);
> +  return fclose(stream);
> +}
> +
> +static int stream_init(struct luam_sysprof_options *opt)
> +{
> +  struct sysprof_ctx *ctx = calloc(1, sizeof(struct sysprof_ctx));
> +  if (NULL == ctx) {

Nit: {} are excess.

> +    return SYSPROF_ERRIO;
> +  }
> +
> +  ctx->stream = fopen("/dev/null", "wb");
> +  if (NULL == ctx->stream) {
> +    free(ctx);
> +    return SYSPROF_ERRIO;
> +  }
> +
> +  opt->ctx = ctx;
> +  opt->buf = ctx->buf;
> +  opt->len = STREAM_BUFFER_SIZE;
> +
> +  return SYSPROF_SUCCESS;
> +}
> +
> +/* --- Payload ------------------------------------------------------------ */
> +
> +static double fib(double n)
> +{
> +  if (n <= 1) {
> +    return n;
> +  }

Nit: {} are excess.

> +  return fib(n - 1) + fib(n - 2);
> +}
> +
> +static int c_payload(lua_State *L)
> +{
> +  fib(luaL_checknumber(L, 1));

How many values does this function return?

> +}
> +
> +/* --- sysprof C API tests ------------------------------------------------ */
> +
> +static int base(lua_State *L)
> +{
> +  struct luam_sysprof_config config = {};
> +  (void)config.writer;
> +  (void)config.on_stop;
> +  (void)config.backtracer;
> +
> +  struct luam_sysprof_options opt = {};
> +  (void)opt.interval;
> +  (void)opt.mode;
> +  (void)opt.ctx;
> +  (void)opt.buf;
> +  (void)opt.len;
> +
> +  struct luam_sysprof_counters cnt = {};
> +  luaM_sysprof_report(&cnt);
> +
> +  (void)cnt.samples;
> +  (void)cnt.overruns;
> +  (void)cnt.vmst_interp;
> +  (void)cnt.vmst_lfunc;
> +  (void)cnt.vmst_ffunc;
> +  (void)cnt.vmst_cfunc;
> +  (void)cnt.vmst_gc;
> +  (void)cnt.vmst_exit;
> +  (void)cnt.vmst_record;
> +  (void)cnt.vmst_opt;
> +  (void)cnt.vmst_asm;
> +  (void)cnt.vmst_trace;
> +
> +  lua_pushboolean(L, 1);
> +  return 1;
> +}
> +
> +static int validation(lua_State *L)
> +{
> +  struct luam_sysprof_config config = {};
> +  struct luam_sysprof_options opt = {};
> +  int status = SYSPROF_SUCCESS;
> +
> +  status = luaM_sysprof_configure(&config);
> +  assert(SYSPROF_SUCCESS == status);
> +
> +  /* Unknown mode */

Typo: s/mode/mode./

> +  opt.mode = 0x40;
> +  status = luaM_sysprof_start(L, &opt);
> +  assert(SYSPROF_ERRUSE == status);
> +
> +  /* Buffer not configured */

Typo: s/configured/configured./

> +  opt.mode = LUAM_SYSPROF_CALLGRAPH;
> +  opt.buf = NULL;
> +  status = luaM_sysprof_start(L, &opt);
> +  assert(SYSPROF_ERRUSE == status);
> +
> +  /* Bad interval */

Typo: s/interval/interval./

> +  opt.mode = LUAM_SYSPROF_DEFAULT;
> +  opt.interval = 0;
> +  status = luaM_sysprof_start(L, &opt);
> +  assert(SYSPROF_ERRUSE == status);
> +
> +  /* Check if profiling started */

Typo: s/started/started./

> +  opt.mode = LUAM_SYSPROF_DEFAULT;
> +  opt.interval = SYSPROF_INTERVAL_DEFAULT;
> +  status = luaM_sysprof_start(L, &opt);
> +  assert(SYSPROF_SUCCESS == status);
> +
> +  /* Already running */

Typo: s/running/running./

> +  status = luaM_sysprof_start(L, &opt);
> +  assert(SYSPROF_ERRRUN == status);
> +
> +  /* Profiler stopping */

Typo: s/stopping/stopping./

> +  status = luaM_sysprof_stop(L);
> +  assert(SYSPROF_SUCCESS == status);
> +
> +  /* Stopping profiler which is not running */

Typo: s/running/running./

> +  status = luaM_sysprof_stop(L);
> +  assert(SYSPROF_ERRSTOP == status);

Suggest to use ERRRUN here, see comment about consistensy with memprof
above.

> +
> +  lua_pushboolean(L, 1);
> +  return 1;
> +}
> +
> +static int profile_func(lua_State *L)
> +{
> +  struct luam_sysprof_config config = {};
> +  struct luam_sysprof_options opt = {};
> +  struct luam_sysprof_counters cnt = {};
> +  int status = SYSPROF_ERRUSE;
> +
> +  int n = lua_gettop(L);
> +  if (n != 1 || !lua_isfunction(L, 1)) {

Nit: {} are excess.

> +    luaL_error(L, "incorrect argument: 1 function is required");
> +  }
> +
> +  /*
> +  ** Since all the other modes functionality is the
> +  ** subset of CALLGRAPH mode, run this mode to test
> +  ** the profiler's behavior.
> +  */
> +  opt.mode = LUAM_SYSPROF_CALLGRAPH;
> +  opt.interval = SYSPROF_INTERVAL_DEFAULT;
> +  stream_init(&opt);
> +
> +  config.on_stop = on_stop_cb_default;
> +  config.writer = buffer_writer_default;
> +  status = luaM_sysprof_configure(&config);
> +  assert(SYSPROF_SUCCESS == status);
> +
> +  status = luaM_sysprof_start(L, &opt);
> +  assert(SYSPROF_SUCCESS == status);
> +
> +  /* Run payload. */
> +  if (lua_pcall(L, 0, 0, 0) != 0) {

Minor: it is better to use != LUA_OK
Nit: {} are excess.

> +    luaL_error(L, "error running payload: %s", lua_tostring(L, -1));
> +  }
> +
> +  status = luaM_sysprof_stop(L);
> +  assert(SYSPROF_SUCCESS == status);
> +
> +  status = luaM_sysprof_report(&cnt);
> +  assert(SYSPROF_SUCCESS == status);
> +
> +  assert(cnt.samples > 1);
> +  assert(cnt.samples == cnt.vmst_asm +
> +                        cnt.vmst_cfunc +
> +                        cnt.vmst_exit +
> +                        cnt.vmst_ffunc +
> +                        cnt.vmst_gc +
> +                        cnt.vmst_interp +
> +                        cnt.vmst_lfunc +
> +                        cnt.vmst_opt +
> +                        cnt.vmst_record +
> +                        cnt.vmst_trace);
> +
> +  lua_pushboolean(L, 1);
> +  return 1;
> +}
> +
> +static const struct luaL_Reg testsysprof[] = {
> +  {"c_payload", c_payload},
> +  {"base", base},
> +  {"validation", validation},
> +  {"profile_func", profile_func},

Nit: Please sort entries alphabetically.

> +  {NULL, NULL}
> +};
> +
> +LUA_API int luaopen_testsysprof(lua_State *L)
> +{
> +  luaL_register(L, "testsysprof", testsysprof);
> +  return 1;
> +}
> -- 
> 2.33.0
> 

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 5/7] memprof: add profile common section
  2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 5/7] memprof: add profile common section Maxim Kokryashkin via Tarantool-patches
@ 2021-10-05 10:48   ` Sergey Kaplun via Tarantool-patches
  2021-10-06 19:15     ` [Tarantool-patches] [PATCH luajit v2] " Maxim Kokryashkin via Tarantool-patches
  0 siblings, 1 reply; 19+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-10-05 10:48 UTC (permalink / raw)
  To: Maxim Kokryashkin; +Cc: tarantool-patches

Hi, Maxim!

Thanks for the patch!
Please consider my comments below.

On 08.09.21, Maxim Kokryashkin wrote:
> A Lua API for sysprof needs to be introduced, but sysprof's C API is quite

Nevertheless, there is a dramatic difference: memprof's callback isn't
called inside a signal handler, but sysprof's is. So in sysprof we can't
use non asinc-signal-safe functions and system calls. It means that we
can't use `fwrite()`, for example, because it manages internal buffer
associated with `FILE *`. It is possible that another one signal is
handled when we write via `FILE *` which using pointer related to buffer
or other related variables aren't updated yet. Then the second call to
`fwrite()` will operate on inconsistent data.

See also:
| man 7 signal-safety

> similar with memprof's. Considering this, there are some structures and
> functions that should be common between memprof's and sysprof's

Typo? s/between/among/

> implementations of Lua API to avoid duplication.

Typo: s/Lua API/the Lua API/

Minor: Please align the commit message. See Code-Review procedure [1].

| Commit message should fit 72 symbols line width. This comes from GitHub
| UI restrictions, and from general rules of formatting commits in git.
| For example 72 comes from being able to add 4 whitespaces shift, so as
| the result message is aligned in the middle of 80. However sometimes 50
| is too small, so this is an optional limit. The commit message limit of
| 72 symbols is not applied when need to copy-paste something long without
| changes, such as a failed test output.

> 
> Part of tarantool/tarantool#781
> ---
>  src/lib_misc.c | 15 ++++++++-------
>  1 file changed, 8 insertions(+), 7 deletions(-)
> 
> diff --git a/src/lib_misc.c b/src/lib_misc.c
> index 1dab08cc..a44476e0 100644
> --- a/src/lib_misc.c
> +++ b/src/lib_misc.c
> @@ -75,9 +75,7 @@ LJLIB_CF(misc_getmetrics)
>  
>  #include "lj_libdef.h"
>  
> -/* ----- misc.memprof module ---------------------------------------------- */
> -
> -#define LJLIB_MODULE_misc_memprof
> +/* --------- profile common  ---------------------------------------------- */

Minor: I prefer "profile common section".
Feel free to ignore.

>  
>  /*
>  ** Yep, 8Mb. Tuned in order not to bother the platform with too often flushes.
> @@ -85,7 +83,7 @@ LJLIB_CF(misc_getmetrics)
>  #define STREAM_BUFFER_SIZE (8 * 1024 * 1024)
>  
>  /* Structure given as ctx to memprof writer and on_stop callback. */
> -struct memprof_ctx {
> +struct profile_ctx {
>    /* Output file stream for data. */
>    FILE *stream;
>    /* Profiled global_State for lj_mem_free at on_stop callback. */
> @@ -101,7 +99,7 @@ struct memprof_ctx {
>  static size_t buffer_writer_default(const void **buf_addr, size_t len,
>  				    void *opt)
>  {
> -  struct memprof_ctx *ctx = opt;
> +  struct profile_ctx *ctx = opt;
>    FILE *stream = ctx->stream;
>    const void * const buf_start = *buf_addr;
>    const void *data = *buf_addr;
> @@ -140,19 +138,22 @@ static size_t buffer_writer_default(const void **buf_addr, size_t len,
>  /* Default on stop callback. Just close the corresponding stream. */
>  static int on_stop_cb_default(void *opt, uint8_t *buf)
>  {
> -  struct memprof_ctx *ctx = opt;
> +  struct profile_ctx *ctx = opt;
>    FILE *stream = ctx->stream;
>    UNUSED(buf);
>    lj_mem_free(ctx->g, ctx, sizeof(*ctx));
>    return fclose(stream);
>  }
>  
> +/* ----- misc.memprof module ---------------------------------------------- */
> +
> +#define LJLIB_MODULE_misc_memprof
>  /* local started, err, errno = misc.memprof.start(fname) */
>  LJLIB_CF(misc_memprof_start)
>  {
>    struct lj_memprof_options opt = {0};
>    const char *fname = strdata(lj_lib_checkstr(L, 1));
> -  struct memprof_ctx *ctx;
> +  struct profile_ctx *ctx;
>    int memprof_status;
>  
>    /*
> -- 
> 2.33.0
> 

[1]: https://github.com/tarantool/tarantool/wiki/Code-review-procedure#commit-message

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 6/7] sysprof: introduce Lua API
  2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 6/7] sysprof: introduce Lua API Maxim Kokryashkin via Tarantool-patches
@ 2021-10-05 15:36   ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 19+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-10-05 15:36 UTC (permalink / raw)
  To: Maxim Kokryashkin; +Cc: tarantool-patches

Hi, Maxim!

Thanks for the patch!

Please consider my comments below.

On 08.09.21, Maxim Kokryashkin wrote:
> This commit introduces Lua API for sysprof, so it is now possible to
> call it from anywhere in a Lua script. All of the parameters are passed
> inside a single Lua table.
> The following options are supported:
> - mode:     "D" (Default mode, only virtual machine state counters)
>             "L" (Leaf mode, shows the last frame on the stack)
>             "C" (Callchain mode, full stack dump)
> - interval: time between samples

Please add note that this is time in miliseconds.

> - path:     Path to file in which profiling data should be stored

Minor: It would be nice to mention the return values.

> 
> Part of tarantool/tarantool#781
> ---
>  src/lib_misc.c                                | 224 ++++++++++++++++++
>  .../misclib-sysprof-lapi.test.lua             | 105 ++++++++
>  2 files changed, 329 insertions(+)
>  create mode 100644 test/tarantool-tests/misclib-sysprof-lapi.test.lua
> 
> diff --git a/src/lib_misc.c b/src/lib_misc.c
> index a44476e0..a3423cdd 100644
> --- a/src/lib_misc.c
> +++ b/src/lib_misc.c

I have the following warnings when compile with
| -DCMAKE_C_FLAGS="-Wextra -Wdeclaration-after-statement -Wredundant-decls -Wshadow -Wpointer-arith"

| /home/burii/reviews/luajit/sysprof/src/lib_misc.c: In function 'set_output_path':
| /home/burii/reviews/luajit/sysprof/src/lib_misc.c:201:3: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
|   201 |   struct profile_ctx *ctx = opt->ctx;
|       |   ^~~~~~
| /home/burii/reviews/luajit/sysprof/src/lib_misc.c: In function 'parse_output_path':
| /home/burii/reviews/luajit/sysprof/src/lib_misc.c: In function 'set_output_path':
| /home/burii/reviews/luajit/sysprof/src/lib_misc.c:201:3: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
|   201 |   struct profile_ctx *ctx = opt->ctx;
|       |   ^~~~~~
| /home/burii/reviews/luajit/sysprof/src/lib_misc.c: In function 'parse_output_path':
| /home/burii/reviews/luajit/sysprof/src/lib_misc.c:210:74: warning: unused parameter 'opt' [-Wunused-parameter]
|   210 | const char* parse_output_path(lua_State *L, struct luam_sysprof_options *opt, int idx) {
|       |                                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
| /home/burii/reviews/luajit/sysprof/src/lib_misc.c: In function 'parse_options':
| /home/burii/reviews/luajit/sysprof/src/lib_misc.c:226:7: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
|   226 |       const char* path = parse_output_path(L, opt, 1);
|       |       ^~~~~
| /home/burii/reviews/luajit/sysprof/src/lib_misc.c: In function 'lj_cf_misc_sysprof_start':
| /home/burii/reviews/luajit/sysprof/src/lib_misc.c:296:3: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
|   296 |   const struct luam_sysprof_config conf = {buffer_writer_default, on_stop_cb_default, NULL};
|       |   ^~~~~
| /home/burii/reviews/luajit/sysprof/src/lib_misc.c:210:74: warning: unused parameter 'opt' [-Wunused-parameter]
|   210 | const char* parse_output_path(lua_State *L, struct luam_sysprof_options *opt, int idx) {
|       |                                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
| /home/burii/reviews/luajit/sysprof/src/lib_misc.c: In function 'parse_options':
| /home/burii/reviews/luajit/sysprof/src/lib_misc.c:226:7: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
|   226 |       const char* path = parse_output_path(L, opt, 1);
|       |       ^~~~~
| /home/burii/reviews/luajit/sysprof/src/lib_misc.c: In function 'lj_cf_misc_sysprof_start':
| /home/burii/reviews/luajit/sysprof/src/lib_misc.c:296:3: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
|   296 |   const struct luam_sysprof_config conf = {buffer_writer_default, on_stop_cb_default, NULL};
|       |   ^~~~~

> @@ -145,6 +145,218 @@ static int on_stop_cb_default(void *opt, uint8_t *buf)
>    return fclose(stream);
>  }
>  
> +/* ----- misc.sysprof module ---------------------------------------------- */
> +
> +/* Not loaded by default, use: local profile = require("misc.sysprof") */

Looks like it is not true. See example here [1].
Also, I see no reason for differnce between sysprof and memprof here.

> +#define LJLIB_MODULE_misc_sysprof
> +
> +/* The default profiling interval equals to 11 ms. */
> +#define SYSPROF_DEFAULT_INTERVAL (11)

Why 11 and not 10, 9 or something else?
Nit: () are excess

> +#define SYSPROF_DEFAULT_OUTPUT "sysprof.bin"

I suppose that it is better if user makes a mistake in any parameter you
should raise an error to inform him.

> +
> +int parse_sysprof_opts(lua_State *L, struct luam_sysprof_options *opt, int idx) {
> +  GCtab *options = lj_lib_checktab(L, idx);

Looks like the index is used only here, so we can check this outside
this function. Also, I'm a little bit frightened by the difference with
memprof. User may set this parameter in the 1st argument and in the
opttable, what we should prefer to use?

I suggest to similarize API with memprof's -- so the first argument is
always a string contains the filename for profiler's output. The second
is a table with others options (as we agreed offline, but I still think
that it is not Lua-style API) or nil. If the second argument is nil we
still can run profiler with default options (i.e. interval and mode "D"
as the most harmless). If the option has the wrong type (a function or
a table, etc. instead a string) we should raise an error.

If the second argument is the table with only "mode" or only "interval"
key is set we use the default value for another. If there is any unknown
option (for example "inteval") we must raise an error (unknown option),
to notify the user that he tries to use a wrong option in the table.

OTOH, I don't know how the API should look when we profiling in
Default mode, i.e. without any output to file. So maybe it is better
always expect the table as the first argument. When the profiling mode
is Default we should raise and error that, path argument is invalid.

Thoughts?

> +
> +  /* get profiling mode */

Typo: s/get/Get/ s/mode/mode./

> +  {
> +    cTValue *mode_opt = lj_tab_getstr(options, lj_str_newlit(L, "mode"));
> +    char mode = 0;
> +    if (mode_opt) {

Nit: {} are excess.

> +      mode = *strVdata(mode_opt);

There is no guarantee that the value is a string, so the behaviour is
undefined.

> +    }

Default mode is not taking into account here, see the comment above.

> +
> +    switch (mode) {
> +      case 'D':
> +        opt->mode = LUAM_SYSPROF_DEFAULT;

Nit: We use quater-half tabs code style in LuaJIT.
So 8spaces should be replased with a single tab.
Here and below.

> +        break;
> +      case 'L':
> +        opt->mode = LUAM_SYSPROF_LEAF;
> +        break;
> +      case 'C':
> +        opt->mode = LUAM_SYSPROF_CALLGRAPH;
> +        break;
> +      default:
> +        return SYSPROF_ERRUSE;

What is the behaviour in case when mode is "CDL", and when is "DCL"?

> +    }
> +  }
> +
> +  /* Get profiling interval. */
> +  {
> +    cTValue *interval = lj_tab_getstr(options, lj_str_newlit(L, "interval"));
> +    opt->interval = SYSPROF_DEFAULT_INTERVAL;
> +    if (interval) {
> +      int32_t signed_interval = numberVint(interval);

There is no guarantee that interval TValue is a number or string, we
should check it.

| $ src/luajit -e 'misc.sysprof.start( {mode = "C", interval = function() end, path = "/tmp/sysprof.bin"})'
| luajit: /home/burii/reviews/luajit/sysprof/src/lj_obj.h:1025: numberVint: Assertion `(((o)->it) < 0xfffeffffu)' failed.
| Aborted

> +      if (signed_interval < 1) {

Nit: {} are excess.

> +        return SYSPROF_ERRUSE;
> +      }
> +      opt->interval = signed_interval;
> +    }
> +  }
> +
> +  return SYSPROF_SUCCESS;

According to RFC [2] we can use set path via this table, however the
corresponding check is omited.

| local started, err, errno = misc.sysprof.start{
|   mode = profiling_mode,
|   interval = sampling_interval,
|   path = fname,
| }

But, as I said, for more consistency with memprof we can drop path from
options table.

> +}
> +
> +int set_output_path(const char* path, struct luam_sysprof_options *opt) {

Typo: s/char* path/char *path/

> +  lua_assert(path != NULL);
> +  struct profile_ctx *ctx = opt->ctx;
> +  FILE *stream = fopen(path, "wb");
> +  if(!stream) {
> +    return SYSPROF_ERRIO;
> +  }
> +  ctx->stream = stream;
> +  return SYSPROF_SUCCESS;
> +}

I don't understand this fucntion's destiny. Why we can't just use fopen
and check than returned value isn't NULL?

> +
> +const char* parse_output_path(lua_State *L, struct luam_sysprof_options *opt, int idx) {
> +  /* By default, sysprof writes events to a `sysprof.bin` file. */

See comment about default file above.

> +  const char *path = luaL_checkstring(L, idx);
> +  return path ? path : SYSPROF_DEFAULT_OUTPUT;
> +}
> +
> +int parse_options(lua_State *L, struct luam_sysprof_options *opt)
> +{
> +  int status = SYSPROF_SUCCESS;
> +
> +  switch(lua_gettop(L)) {

See comment about Lua API above.

> +    case 2:
> +      if(!(lua_isstring(L, 1) && lua_istable(L, 2))) {
> +        status = SYSPROF_ERRUSE;
> +        break;
> +      }
> +      const char* path = parse_output_path(L, opt, 1); 
                                                         ^
Minor: Trailing whitespace here.-------------------------^

> +      status = set_output_path(path, opt);
> +      if(status != SYSPROF_SUCCESS) 
                                      ^
Minor: Trailing whitespace here.------^

> +        break;
> +
> +      status = parse_sysprof_opts(L, opt, 2);
> +      break;
> +    case 1:
> +      if(!lua_istable(L, 1)) {
> +        status = SYSPROF_ERRUSE;
> +        break;
> +      }
> +      status = parse_sysprof_opts(L, opt, 1);
> +      if(status != SYSPROF_SUCCESS)
> +        break;
> +      status = set_output_path(SYSPROF_DEFAULT_OUTPUT, opt);
> +      break;
> +    default:
> +      status = SYSPROF_ERRUSE;
> +      break;
> +  } 
      ^
Minor: Trailing whitespace here.

> +  return status;
> +}
> +
> +int sysprof_error(lua_State *L, int status)
> +{
> +  switch (status) {
> +    case SYSPROF_ERRUSE:
> +      lua_pushnil(L);
> +      lua_pushstring(L, err2msg(LJ_ERR_PROF_MISUSE));
> +      lua_pushinteger(L, EINVAL);
> +      return 3;
> +    case SYSPROF_ERRRUN:

LJ_ERR_PROF_ISRUNNING and LJ_ERR_PROF_NOTRUNNING is defined only, when
LJ_HASMEMPROF || LJ_HASSYSPROF, so it must be the corresponding ifdef
here.

> +      lua_pushnil(L);
> +      lua_pushstring(L, err2msg(LJ_ERR_PROF_ISRUNNING));
> +      lua_pushinteger(L, EINVAL);
> +      return 3;
> +    case SYSPROF_ERRSTOP:
> +      lua_pushnil(L);
> +      lua_pushstring(L, err2msg(LJ_ERR_PROF_NOTRUNNING));
> +      lua_pushinteger(L, EINVAL);
> +      return 3;
> +    case SYSPROF_ERRIO:
> +      return luaL_fileresult(L, 0, NULL);
> +    default:
> +      lua_assert(0);
> +      return 0;
> +  }
> +}
> +
> +/* local res, err, errno = sysprof.start(options) */
> +LJLIB_CF(misc_sysprof_start)
> +{
> +  int status = SYSPROF_SUCCESS;
> +
> +  struct luam_sysprof_options opt = {};
> +  struct profile_ctx *ctx = NULL;
> +
> +  ctx = lj_mem_new(L, sizeof(*ctx));

We don't need this context for streaming events if profiled mode is
default. Also we don't need to open any files to write data.

| src/luajit -e 'misc.sysprof.start( {mode = "D", interval = 1, path = "/tmp/sysprof.bin"})'
| luajit: /home/burii/reviews/luajit/sysprof/src/lj_state.c:199: close_state: Assertion `g->gc.total == sizeof(GG_State)' failed.
| Aborted

> +  ctx->g = G(L);
> +  opt.ctx = ctx;
> +  opt.buf = ctx->buf;
> +  opt.len = STREAM_BUFFER_SIZE;
> +
> +  status = parse_options(L, &opt);
> +  if (LJ_UNLIKELY(status != PROFILE_SUCCESS)) {
> +    lj_mem_free(ctx->g, ctx, sizeof(*ctx));
> +    return sysprof_error(L, status);
> +  }
> +
> +  const struct luam_sysprof_config conf = {buffer_writer_default, on_stop_cb_default, NULL};

Nit: linewidth is more than 80 symbols. Also, I see nothing bad to
set config fields per entry.

Also, we can't use non async-signal-safe functions in writer like
fwrite.

> +  status = luaM_sysprof_configure(&conf);
> +  if (LJ_UNLIKELY(status != PROFILE_SUCCESS)) {

We need to call `on_stop` callback manually here.

| $ src/luajit -e 'misc.sysprof.start( {mode = "C", interval = 1 end, path = "/tmp/sysprof.bin"}) misc.sysprof.start( {mode = "C", interval = 1 end, path = "/tmp/sysprof.bin"})'
| luajit: /home/burii/reviews/luajit/sysprof/src/lj_state.c:199: close_state: Assertion `g->gc.total == sizeof(GG_State)' failed.
| Aborted

> +    return sysprof_error(L, status);

Nit: {} are excess.

> +  }
> +
> +  status = luaM_sysprof_start(L, &opt);
> +  if (LJ_UNLIKELY(status != PROFILE_SUCCESS)) {
> +    /* Allocated memory will be freed in on_stop callback. */
> +    return sysprof_error(L, status);

Nit: {} are excess.

> +  }
> +
> +  lua_pushboolean(L, 1);
> +  return 1;
> +}
> +
> +/* local res, err, errno = profile.sysprof_stop() */
> +LJLIB_CF(misc_sysprof_stop)
> +{
> +  int status = luaM_sysprof_stop(L);
> +  if (LJ_UNLIKELY(status != PROFILE_SUCCESS)) {

Nit: {} are excess.

> +    return sysprof_error(L, status);
> +  }
> +  lua_pushboolean(L, 1);
> +  return 1;
> +}
> +
> +/* local counters, err, errno = sysprof.report() */
> +LJLIB_CF(misc_sysprof_report)
> +{
> +  struct luam_sysprof_counters counters = {};
> +  GCtab *data_tab = NULL;
> +  GCtab *count_tab = NULL;
> +
> +  int status = luaM_sysprof_report(&counters);
> +  if (status != SYSPROF_SUCCESS) {

Nit: {} are excess.

> +    return sysprof_error(L, status);
> +  }
> +
> +  lua_createtable(L, 0, 3);
> +  data_tab = tabV(L->top - 1);
> +
> +  setnumfield(L, data_tab, "samples", counters.samples);
> +  setnumfield(L, data_tab, "overruns", counters.overruns);
> +
> +  lua_createtable(L, 0, LJ_VMST__MAX + 1);

May be it is better to use 1 level tab here. Wait for the second
reviewer opinion.

> +  count_tab = tabV(L->top - 1);
> +
> +  setnumfield(L, count_tab, "INTERP", counters.vmst_interp);
> +  setnumfield(L, count_tab, "LFUNC",  counters.vmst_lfunc);
> +  setnumfield(L, count_tab, "FFUNC",  counters.vmst_ffunc);
> +  setnumfield(L, count_tab, "CFUNC",  counters.vmst_cfunc);
> +  setnumfield(L, count_tab, "GC",     counters.vmst_gc);
> +  setnumfield(L, count_tab, "EXIT",   counters.vmst_exit);
> +  setnumfield(L, count_tab, "RECORD", counters.vmst_record);
> +  setnumfield(L, count_tab, "OPT",    counters.vmst_opt);
> +  setnumfield(L, count_tab, "ASM",    counters.vmst_asm);
> +  setnumfield(L, count_tab, "TRACE",  counters.vmst_trace);
> +
> +  lua_setfield(L, -2, "vmstate");
> +
> +  return 1;
> +}
> +
>  /* ----- misc.memprof module ---------------------------------------------- */
>  
>  #define LJLIB_MODULE_misc_memprof

Nit: Empty line after LJLIB_MODULE_misc_memprof is missed.

> @@ -237,7 +449,19 @@ LJLIB_CF(misc_memprof_stop)
>  
>  LUALIB_API int luaopen_misc(struct lua_State *L)
>  {
> +  int status = SYSPROF_SUCCESS;
> +  struct luam_sysprof_config config = {};
> +
> +  config.writer = buffer_writer_default;
> +  config.on_stop = on_stop_cb_default;
> +  status = luaM_sysprof_configure(&config);
> +  if (LJ_UNLIKELY(status != PROFILE_SUCCESS)) {
> +    return sysprof_error(L, status);
> +  }

Don't get this: Why do we need to configure sysprof when loading module,
if we configure it manually during start() call?

> +
>    LJ_LIB_REG(L, LUAM_MISCLIBNAME, misc);
>    LJ_LIB_REG(L, LUAM_MISCLIBNAME ".memprof", misc_memprof);
> +  LJ_LIB_REG(L, LUAM_MISCLIBNAME ".sysprof", misc_sysprof);
> +
>    return 1;
>  }
> diff --git a/test/tarantool-tests/misclib-sysprof-lapi.test.lua b/test/tarantool-tests/misclib-sysprof-lapi.test.lua
> new file mode 100644
> index 00000000..17d83d1b
> --- /dev/null
> +++ b/test/tarantool-tests/misclib-sysprof-lapi.test.lua
> @@ -0,0 +1,105 @@

We need to use the same skipcondition as for memprof here.

> +local tap = require("tap")
> +
> +local test = tap.test("misc-sysprof-lapi")
> +test:plan(14)
> +
> +jit.off()
> +jit.flush()
> +
> +local bufread = require "utils.bufread"
> +local symtab = require "utils.symtab"

Nit: please be consistent: use `require("modulename")`.

> +
> +local TMP_BINFILE = arg[0]:gsub(".+/([^/]+)%.test%.lua$", "%.%1.sysprofdata.tmp.bin")
> +local BAD_PATH = arg[0]:gsub(".+/([^/]+)%.test%.lua$", "%1/sysprofdata.tmp.bin")
> +
> +local function payload()
> +  local function fib(n)
> +    if n <= 1 then
> +      return n
> +    end
> +    return fib(n - 1) + fib(n - 2)
> +  end
> +  return fib(32)
> +end
> +
> +local function generate_output(file, opts)
> +  file = file or "sysprof.bin"

Please, remove default file name.

> +  local res, err = misc.sysprof.start(file, opts)
> +  assert(res, err)
> +
> +  payload()
> +
> +  res,err = misc.sysprof.stop()
> +  assert(res, err)
> +end

Nit: missed empty line.
Here and below.

> +--## GENERAL ###############################################################--

Nit: this mixing style looks confusing. Please use only `--`.
Here and below.

> +
> +--wrong profiling mode

Typo: s/--wrong/-- Wrong/
Typo: s/mode/mode./

> +local res, err, errno = misc.sysprof.start{ mode = "A" }
> +test:ok(res == nil and err:match("profiler misuse"))
> +test:ok(type(errno) == "number")
> +
> +--already running

Typo: s/--already running/-- Already running./

> +res, err = misc.sysprof.start{ mode = "D" }
> +assert(res, err)
> +
> +res, err, errno = misc.sysprof.start{ mode = "D" }
> +print(err)

Please remove this debugging output.

> +test:ok(res == nil and err:match("profiler misuse"))
> +test:ok(type(errno) == "number")
> +
> +res, err = misc.sysprof.stop()
> +assert(res, err)
> +
> +--not running

Typo: s/--not running/-- Not running./

> +res, err, errno = misc.sysprof.stop()
> +test:ok(res == nil and err:match("profiler is not running"))
> +test:ok(type(errno) == "number")
> +
> +--bad path

Typo: s/--bad path/-- Bad path./

> +res, err, errno = misc.sysprof.start(BAD_PATH, { mode = "C" })
> +test:ok(res == nil and err:match("No such file or directory"))
> +test:ok(type(errno) == "number")
> +
> +--bad interval

Typo: s/--bad interval/-- Bad interval./

> +res, err, errno = misc.sysprof.start{ mode = "C", interval = -1 }
> +test:ok(res == nil and err:match("profiler misuse"))
> +test:ok(type(errno) == "number")
> +--## DEFAULT MODE ##########################################################--
> +
> +res, err = pcall(generate_output, nil, { mode = "D", interval = 11 })
> +if res == nil then

`pcall()` returns either true when there is no error raised, either
false in case of an error.
I suppose we should remove TMP file in case of error here.

> +  error(err)
> +end
> +
> +local report = misc.sysprof.report()
> +
> +test:ok(report.samples > 0)  -- TODO: more accurate test?

The test looks OK to me. Maybe we can use a specific function to sleep
every N ms, where N is the interval value, so then the amount of sleep
calls should be equal to the amount of samples.

> +test:ok(report.vmstate.LFUNC > 0)
> +test:ok(report.vmstate.TRACE == 0)
> +
> +-- with very big interval

Typo: s/with/With/
Typo: s/interval/interval./

> +res, err = pcall(generate_output, nil, { mode = "D", interval = 1000 })
> +if res == nil then

`pcall()` returns either true when there is no error raised, either
false in case of an error.
I suppose we should remove TMP file in case of error here.

> +  error(err)
> +end
> +
> +report = misc.sysprof.report()
> +test:ok(report.samples == 0)
> +
> +--## CALL MODE #############################################################--
> +
> +res, err = pcall(
> +  generate_output, TMP_BINFILE, { mode = "C", interval = 11 }
> +)
> +if res == nil then

`pcall()` returns either true when there is no error raised, either
false in case of an error.
I suppose we should remove TMP file in case of error here.

> +  error(err)
> +end
> +
> +local reader = bufread.new(TMP_BINFILE)
> +symtab.parse(reader)
> +
> +os.remove(TMP_BINFILE)
> +
> +jit.on()
> +os.exit(test:check() and 0 or 1)

What about Leaf mode? I suppose, that we can check it works like for
Callgraph mode.

> -- 
> 2.33.0
> 

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Tarantool-patches] [PATCH luajit v2] memprof: add profile common section
  2021-10-05 10:48   ` Sergey Kaplun via Tarantool-patches
@ 2021-10-06 19:15     ` Maxim Kokryashkin via Tarantool-patches
  0 siblings, 0 replies; 19+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2021-10-06 19:15 UTC (permalink / raw)
  To: tarantool-patches, imun, skaplun, m.shishatskiy

A Lua API for sysprof needs to be introduced, but sysprof's C API is
quite similar with memprof's. Considering this, there are some
structures and functions that should be common among memprof's and
sysprof's implementations of Lua API to avoid duplication.

Part of tarantool/tarantool#781
---
GitHub branch: https://github.com/tarantool/luajit/tree/shishqa/gh-781-platform-and-lua-profiler

 src/lib_misc.c | 36 +++++++++++++++++++-----------------
 1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/src/lib_misc.c b/src/lib_misc.c
index 1dab08cc..22d29d78 100644
--- a/src/lib_misc.c
+++ b/src/lib_misc.c
@@ -8,8 +8,9 @@
 #define lib_misc_c
 #define LUA_LIB
 
-#include <stdio.h>
 #include <errno.h>
+#include <fcntl.h>
+#include <unistd.h>
 
 #include "lua.h"
 #include "lmisclib.h"
@@ -75,9 +76,7 @@ LJLIB_CF(misc_getmetrics)
 
 #include "lj_libdef.h"
 
-/* ----- misc.memprof module ---------------------------------------------- */
-
-#define LJLIB_MODULE_misc_memprof
+/* --------- profile common section --------------------------------------- */
 
 /*
 ** Yep, 8Mb. Tuned in order not to bother the platform with too often flushes.
@@ -85,9 +84,9 @@ LJLIB_CF(misc_getmetrics)
 #define STREAM_BUFFER_SIZE (8 * 1024 * 1024)
 
 /* Structure given as ctx to memprof writer and on_stop callback. */
-struct memprof_ctx {
-  /* Output file stream for data. */
-  FILE *stream;
+struct profile_ctx {
+  /* Output file descriptor for data. */
+  int fd;
   /* Profiled global_State for lj_mem_free at on_stop callback. */
   global_State *g;
   /* Buffer for data. */
@@ -96,13 +95,13 @@ struct memprof_ctx {
 
 /*
 ** Default buffer writer function.
-** Just call fwrite to the corresponding FILE.
+** Just call write to the corresponding descriptor.
 */
 static size_t buffer_writer_default(const void **buf_addr, size_t len,
 				    void *opt)
 {
-  struct memprof_ctx *ctx = opt;
-  FILE *stream = ctx->stream;
+  struct profile_ctx *ctx = opt;
+  const int fd = ctx->fd;
   const void * const buf_start = *buf_addr;
   const void *data = *buf_addr;
   size_t write_total = 0;
@@ -110,9 +109,9 @@ static size_t buffer_writer_default(const void **buf_addr, size_t len,
   lua_assert(len <= STREAM_BUFFER_SIZE);
 
   for (;;) {
-    const size_t written = fwrite(data, 1, len - write_total, stream);
+    const size_t written = write(fd, data, len - write_total);
 
-    if (LJ_UNLIKELY(written == 0)) {
+    if (LJ_UNLIKELY(written == -1)) {
       /* Re-tries write in case of EINTR. */
       if (errno != EINTR) {
 	/* Will be freed as whole chunk later. */
@@ -137,22 +136,25 @@ static size_t buffer_writer_default(const void **buf_addr, size_t len,
   return write_total;
 }
 
-/* Default on stop callback. Just close the corresponding stream. */
+/* Default on stop callback. Just close the corresponding descriptor. */
 static int on_stop_cb_default(void *opt, uint8_t *buf)
 {
-  struct memprof_ctx *ctx = opt;
-  FILE *stream = ctx->stream;
+  struct profile_ctx *ctx = opt;
+  const int fd = ctx->fd;
   UNUSED(buf);
   lj_mem_free(ctx->g, ctx, sizeof(*ctx));
-  return fclose(stream);
+  return close(fd);
 }
 
+/* ----- misc.memprof module ---------------------------------------------- */
+
+#define LJLIB_MODULE_misc_memprof
 /* local started, err, errno = misc.memprof.start(fname) */
 LJLIB_CF(misc_memprof_start)
 {
   struct lj_memprof_options opt = {0};
   const char *fname = strdata(lj_lib_checkstr(L, 1));
-  struct memprof_ctx *ctx;
+  struct profile_ctx *ctx;
   int memprof_status;
 
   /*
-- 
2.33.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 7/7] tools: introduce parsers for sysprof
  2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 7/7] tools: introduce parsers for sysprof Maxim Kokryashkin via Tarantool-patches
@ 2021-10-07 11:28   ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 19+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-10-07 11:28 UTC (permalink / raw)
  To: Maxim Kokryashkin; +Cc: tarantool-patches

Hi, Maxim!

Thanks for the patch!

Please consider my comments below.

| tools: introduce parsers for sysprof

Typo: s/parsers/parser/

On 08.09.21, Maxim Kokryashkin wrote:
> Since the sysprof's binary output is not human-readable, so there is a
> demand to create a parser. The parser, which this commit provides,
> converts sysprof's event stream into the format that the flamegraph.pl

Minor: It will be nice to add a reference to flamegraph.pl here.

> can process.
> 
> The sysprof parser machinery uses the same symtab module as memprof's
> parser since the format is the same.
> 
> Part of tarantool/tarantool#781

Minor: As far it is the last commit in patchset it should be Resolves
instead.

> ---
>  .gitignore                                    |   1 +
>  .../misclib-sysprof-lapi.test.lua             |   2 +
>  tools/CMakeLists.txt                          |  83 ++++++++
>  tools/luajit-parse-sysprof.in                 |   6 +
>  tools/sysprof.lua                             | 119 +++++++++++
>  tools/sysprof/collapse.lua                    | 110 ++++++++++
>  tools/sysprof/parse.lua                       | 188 ++++++++++++++++++
>  7 files changed, 509 insertions(+)
>  create mode 100644 tools/luajit-parse-sysprof.in

Please make this file executable to make the resultt file executable
too.

>  create mode 100644 tools/sysprof.lua
>  create mode 100755 tools/sysprof/collapse.lua
>  create mode 100755 tools/sysprof/parse.lua

And vice versa -- these files shouldn't be executable.

> 
> diff --git a/.gitignore b/.gitignore
> index 2103a30f..099df060 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -19,4 +19,5 @@ cmake_uninstall.cmake

<snipped>

> diff --git a/test/tarantool-tests/misclib-sysprof-lapi.test.lua b/test/tarantool-tests/misclib-sysprof-lapi.test.lua
> index 17d83d1b..b9b3461f 100644
> --- a/test/tarantool-tests/misclib-sysprof-lapi.test.lua
> +++ b/test/tarantool-tests/misclib-sysprof-lapi.test.lua
> @@ -8,6 +8,7 @@ jit.flush()
>  
>  local bufread = require "utils.bufread"
>  local symtab = require "utils.symtab"
> +local sysprof = require "sysprof.parse"
>  
>  local TMP_BINFILE = arg[0]:gsub(".+/([^/]+)%.test%.lua$", "%.%1.sysprofdata.tmp.bin")
>  local BAD_PATH = arg[0]:gsub(".+/([^/]+)%.test%.lua$", "%1/sysprofdata.tmp.bin")
> @@ -98,6 +99,7 @@ end
>  
>  local reader = bufread.new(TMP_BINFILE)
>  symtab.parse(reader)
> +sysprof.parse(reader)

I suppose it is better to check some entyties in this profile.

>  
>  os.remove(TMP_BINFILE)
>  
> diff --git a/tools/CMakeLists.txt b/tools/CMakeLists.txt
> index 61830e44..93a2f763 100644
> --- a/tools/CMakeLists.txt
> +++ b/tools/CMakeLists.txt
> @@ -87,4 +87,87 @@ else()
>    )
>  endif()
>  
> +
> +

Nit: These empty lines are excess.

> +if(LUAJIT_DISABLE_SYSPROF)
> +  message(STATUS "LuaJIT system profiler support is disabled")
> +else()
> +  # XXX: Can use genex here since the value need to be evaluated
> +  # at the configuration phase. Fortunately, we know the exact
> +  # path where LuaJIT binary is located.
> +  set(LUAJIT_TOOLS_BIN ${LUAJIT_BINARY_DIR}/${LUAJIT_CLI_NAME})
> +  set(LUAJIT_TOOLS_DIR ${CMAKE_CURRENT_SOURCE_DIR})
> +  # XXX: Unfortunately, there is no convenient way to set
> +  # particular permissions to the output file via CMake.
> +  # Furthermore, I even failed to copy the given file to the same
> +  # path to change its permissions. After looking at the docs, I
> +  # realized that the valid solution would be too monstrous for
> +  # such a simple task. As a result I've made the template itself
> +  # executable, so the issue is resolved.
> +  configure_file(luajit-parse-sysprof.in luajit-parse-sysprof @ONLY ESCAPE_QUOTES)
> +

Nit: This empty line is excess.

> +
> +  add_custom_target(tools-parse-sysprof EXCLUDE_FROM_ALL DEPENDS
> +    luajit-parse-sysprof
> +    sysprof/parse.lua
> +    sysprof/collapse.lua
> +    sysprof.lua
> +    utils/bufread.lua
> +    utils/symtab.lua
> +  )
> +  list(APPEND LUAJIT_TOOLS_DEPS tools-parse-sysprof)

<snipped>

>  add_custom_target(LuaJIT-tools DEPENDS ${LUAJIT_TOOLS_DEPS})
> +

Nit: This new line is excess.

> diff --git a/tools/luajit-parse-sysprof.in b/tools/luajit-parse-sysprof.in
> new file mode 100644
> index 00000000..2be25eb3
> --- /dev/null
> +++ b/tools/luajit-parse-sysprof.in

<snipped>

> diff --git a/tools/sysprof.lua b/tools/sysprof.lua
> new file mode 100644
> index 00000000..e6d8cc34
> --- /dev/null
> +++ b/tools/sysprof.lua
> @@ -0,0 +1,119 @@
> +local bufread = require "utils.bufread"
> +local sysprof = require "sysprof.parse"
> +local symtab = require "utils.symtab"
> +local misc = require "sysprof.collapse"
> +
> +local stdout, stderr = io.stdout, io.stderr
> +local match, gmatch = string.match, string.gmatch
> +
> +local split_by_vmstate = false
> +
> +-- Program options.
> +local opt_map = {}
> +
> +function opt_map.help()
> +  stdout:write [[
> +luajit-parse-sysprof - parser of the profile collected
> +                       with LuaJIT's sysprof.
> +
> +SYNOPSIS
> +
> +luajit-parse-sysprof [options] sysprof.bin
> +
> +Supported options are:
> +
> +  --help                            Show this help and exit
> +  --split                           Split callchains by vmstate
> +]]

Side note: -o, --output option seems usefull for sysprof's parser output
too. It may be done in the separate ticket.

> +  os.exit(0)
> +end
> +
> +function opt_map.split()
> +  split_by_vmstate = true
> +end
> +
> +-- Print error and exit with error status.
> +local function opterror(...)
> +  stderr:write("luajit-parse-sysprof.lua: ERROR: ", ...)
> +  stderr:write("\n")
> +  os.exit(1)
> +end
> +
> +-- Parse single option.
> +local function parseopt(opt, args)
> +  local opt_current = #opt == 1 and "-"..opt or "--"..opt
> +  local f = opt_map[opt]
> +  if not f then
> +    opterror("unrecognized option `", opt_current, "'. Try `--help'.\n")
> +  end
> +  f(args)
> +end
> +
> +-- Parse arguments.
> +local function parseargs(args)
> +  -- Process all option arguments.
> +  args.argn = 1
> +  repeat
> +    local a = args[args.argn]
> +    if not a then
> +      break
> +    end
> +    local lopt, opt = match(a, "^%-(%-?)(.+)")
> +    if not opt then
> +      break
> +    end
> +    args.argn = args.argn + 1
> +    if lopt == "" then
> +      -- Loop through short options.
> +      for o in gmatch(opt, ".") do
> +        parseopt(o, args)
> +      end
> +    else
> +      -- Long option.
> +      parseopt(opt, args)
> +    end
> +  until false
> +
> +  -- Check for proper number of arguments.
> +  local nargs = #args - args.argn + 1
> +  if nargs ~= 1 then
> +    opt_map.help()
> +  end
> +
> +  -- Translate a single input file.
> +  -- TODO: Handle multiple files?
> +  return args[args.argn]
> +end

Minor: This logic with parse options is pretty the same for sysprof and
memprof (only options and callbacks are different). So, we can move this
part to the separate module (maybe within the separate commit).

> +
> +local function traverse_calltree(node, prefix)
> +  if node.is_leaf then
> +    print(prefix..' '..node.count)
> +  end
> +
> +  local sep_prefix = #prefix == 0 and prefix or prefix..';'
> +
> +  for name,child in pairs(node.children) do
> +    traverse_calltree(child, sep_prefix..name)
> +  end
> +end
> +
> +local function dump(inputfile)
> +  local reader = bufread.new(inputfile)
> +
> +  local symbols = symtab.parse(reader)
> +
> +  local events = sysprof.parse(reader)
> +  local calltree = misc.collapse(events, symbols, split_by_vmstate)
> +
> +  traverse_calltree(calltree, '')
> +
> +  os.exit(0)
> +end
> +
> +-- FIXME: this script should be application-independent.
> +local args = {...}
> +if #args == 1 and args[1] == "sysprof" then
> +  return dump
> +else
> +  dump(parseargs(args))
> +end
> diff --git a/tools/sysprof/collapse.lua b/tools/sysprof/collapse.lua
> new file mode 100755
> index 00000000..2d684c95
> --- /dev/null
> +++ b/tools/sysprof/collapse.lua
> @@ -0,0 +1,110 @@

Side note: I don't look this part carefully as far as we assume offline
to rework it.

> +local parse = require "sysprof.parse"
> +local vmdef = require "jit.vmdef"
> +local symtab = require "utils.symtab"
> +
> +local VMST_NAMES = {
> +  [parse.VMST.INTERP] = "VMST_INTERP",
> +  [parse.VMST.LFUNC]  = "VMST_LFUNC",
> +  [parse.VMST.FFUNC]  = "VMST_FFUNC",
> +  [parse.VMST.CFUNC]  = "VMST_CFUNC",
> +  [parse.VMST.GC]     = "VMST_GC",
> +  [parse.VMST.EXIT]   = "VMST_EXIT",
> +  [parse.VMST.RECORD] = "VMST_RECORD",
> +  [parse.VMST.OPT]    = "VMST_OPT",
> +  [parse.VMST.ASM]    = "VMST_ASM",
> +  [parse.VMST.TRACE]  = "VMST_TRACE",
> +}
> +
> +local M = {}
> +
> +local function new_node(name, is_leaf)
> +  return {
> +    name = name,
> +    count = 0,
> +    is_leaf = is_leaf,
> +    children = {}
> +  }
> +end
> +
> +-- insert new child into a node (or increase counter in existing one)

Nit: The sentence must start with a capital letter and end with a dot.
Here and below.

> +local function insert(name, node, is_leaf)
> +  if node.children[name] == nil then
> +    node.children[name] = new_node(name, is_leaf)
> +  end
> +
> +  local child = node.children[name]
> +  child.count = child.count + 1
> +
> +  return child
> +end
> +
> +local function insert_lua_callchain(chain, lua, symbols)
> +  for _,fr in pairs(lua.callchain) do

Typo: s/_,fr/_, fr/

> +    local name_lua
> +
> +    if fr.type == parse.FRAME.FFUNC then
> +      name_lua = vmdef.ffnames[fr.ffid]
> +    else
> +      name_lua = symtab.demangle(symbols, {
> +        addr = fr.addr,
> +        line = fr.line
> +      })
> +      if lua.trace.id ~= nil and lua.trace.addr == fr.addr and
> +          lua.trace.line == fr.line then
> +        name_lua = 'TRACE_'..tostring(lua.trace.id)..'_'..name_lua
> +      end
> +    end
> +
> +    table.insert(chain, { name = name_lua })
> +  end
> +end
> +
> +-- merge lua and host callchains into one callchain representing
> +-- transfer of control
> +local function merge(event, symbols, sep_vmst)

Side note: This part should be reworked to show HOST->LUA->HOST->LUA
callchains correctly as we discussed offline. The host stack is always
on top, the Lua stack is started after BC_* (simple waterline).

> +  local cc = {}
> +  local lua_inserted = false
> +
> +  for _,h_fr in pairs(event.host.callchain) do
> +    local name_host = symtab.demangle(symbols, { addr = h_fr.addr })
> +
> +    -- We assume that usually the transfer of control
> +    -- looks like:
> +    --    HOST -> LUA -> HOST
> +    -- so for now, lua callchain starts from lua_pcall() call
> +    if name_host == 'lua_pcall' then
> +      insert_lua_callchain(cc, event.lua, symbols)
> +      lua_inserted = true
> +    end
> +
> +    table.insert(cc, { name = name_host })
> +  end
> +
> +  if lua_inserted == false then

Minor: `not lua_inserted` looks more readable to me.
Feel free to ignore.

> +    insert_lua_callchain(cc, event.lua, symbols)
> +  end
> +
> +  if sep_vmst == true then

Minor: `if sep_vmst then` looks more readable to me.
Feel free to ignore.

> +    table.insert(cc, { name = VMST_NAMES[event.lua.vmstate] })
> +  end
> +
> +  return cc
> +end
> +
> +-- Collapse all the events into call tree
> +function M.collapse(events, symbols, sep_vmst)
> +  local root = new_node('root', false)
> +
> +  for _,ev in pairs(events) do
> +    local callchain = merge(ev, symbols, sep_vmst)
> +    local curr_node = root
> +    for i=#callchain,1,-1 do

Typo: s/,/, /g
Typo: s/=/ = /

> +      curr_node = insert(callchain[i].name, curr_node, false)
> +    end
> +    insert('', curr_node, true)
> +  end
> +
> +  return root
> +end
> +
> +return M
> diff --git a/tools/sysprof/parse.lua b/tools/sysprof/parse.lua
> new file mode 100755
> index 00000000..766b0c99
> --- /dev/null
> +++ b/tools/sysprof/parse.lua
> @@ -0,0 +1,188 @@
> +-- Parser of LuaJIT's sysprof binary stream.
> +-- The format spec can be found in <src/lj_sysprof.h>.
> +
> +local string_format = string.format
> +
> +local LJP_MAGIC = "ljp"
> +local LJP_CURRENT_VERSION = 1
> +
> +local M = {}
> +
> +M.VMST = {
> +  INTERP = 0,
> +  LFUNC  = 1,
> +  FFUNC  = 2,
> +  CFUNC  = 3,
> +  GC     = 4,
> +  EXIT   = 5,
> +  RECORD = 6,
> +  OPT    = 7,
> +  ASM    = 8,
> +  TRACE  = 9,
> +}
> +
> +
> +M.FRAME = {
> +  LFUNC  = 1,
> +  CFUNC  = 2,
> +  FFUNC  = 3,
> +  BOTTOM = 0x80
> +}
> +
> +local STREAM_END = 0x80
> +
> +local function new_event()
> +  return {
> +    lua = {
> +      vmstate = 0,
> +      callchain = {},
> +      trace = {

Minor: looks like trace should be separated as unique case as Lua or
host, doesn't it?

> +        id = nil,
> +        addr = 0,
> +        line = 0
> +      }

Minor: It is better to use "," in the last entries too: it is easier to
add the other entries, if you want.
Here and below.

> +    },
> +    host = {
> +      callchain = {}
> +    }
> +  }
> +end
> +
> +local function parse_lfunc(reader, event)
> +  local addr = reader:read_uleb128()
> +  local line = reader:read_uleb128()
> +  table.insert(event.lua.callchain, {
> +    type = M.FRAME.LFUNC,
> +    addr = addr,
> +    line = line
> +  })
> +end
> +
> +local function parse_ffunc(reader, event)
> +  local ffid = reader:read_uleb128()
> +  table.insert(event.lua.callchain, {
> +    type = M.FRAME.FFUNC,
> +    ffid = ffid,
> +  })
> +end
> +
> +local function parse_cfunc(reader, event)
> +  local addr = reader:read_uleb128()
> +  table.insert(event.lua.callchain, {
> +    type = M.FRAME.CFUNC,
> +    addr = addr
> +  })
> +end
> +
> +local frame_parsers = {
> +  [M.FRAME.LFUNC] = parse_lfunc,
> +  [M.FRAME.FFUNC] = parse_ffunc,
> +  [M.FRAME.CFUNC] = parse_cfunc
> +}
> +
> +local function parse_lua_callchain(reader, event)
> +  while true do
> +    local frame_header = reader:read_octet()
> +    if frame_header == M.FRAME.BOTTOM then
> +      break
> +    end
> +    frame_parsers[frame_header](reader, event)
> +  end
> +end
> +
> +--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--

Side note: what does this curves line separate, only
parse_host_callchain? Also, IMO it is pretier to use `-` instead.
Feel free to ignore.

> +
> +local function parse_host_callchain(reader, event)
> +  local addr = reader:read_uleb128()
> +
> +  while addr ~= 0 do

Side note: NULL byte is always written for last host frame?

> +    table.insert(event.host.callchain, {
> +      addr = addr
> +    })
> +    addr = reader:read_uleb128()
> +  end
> +end
> +
> +--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--
> +
> +local function parse_trace_callchain(reader, event)
> +  event.lua.trace.id   = reader:read_uleb128()
> +  event.lua.trace.addr = reader:read_uleb128()
> +  event.lua.trace.line = reader:read_uleb128()
> +end
> +
> +--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--
> +
> +local function parse_host_only(reader, event)
> +  parse_host_callchain(reader, event)
> +end
> +
> +local function parse_lua_host(reader, event)
> +  parse_lua_callchain(reader, event)
> +  parse_host_callchain(reader, event)
> +end
> +
> +local function parse_trace(reader, event)
> +  parse_trace_callchain(reader, event)
> +  -- parse_lua_callchain(reader, event)

Minor: Looks like this comment should be deleted.

> +end
> +
> +local event_parsers = {
> +  [M.VMST.INTERP] = parse_host_only,
> +  [M.VMST.LFUNC]  = parse_lua_host,
> +  [M.VMST.FFUNC]  = parse_lua_host,
> +  [M.VMST.CFUNC]  = parse_lua_host,
> +  [M.VMST.GC]     = parse_host_only,
> +  [M.VMST.EXIT]   = parse_host_only,
> +  [M.VMST.RECORD] = parse_host_only,
> +  [M.VMST.OPT]    = parse_host_only,
> +  [M.VMST.ASM]    = parse_host_only,
> +  [M.VMST.TRACE]  = parse_trace
> +}
> +
> +local function parse_event(reader, events)
> +  local event = new_event()
> +
> +  local vmstate = reader:read_octet()
> +  if vmstate == STREAM_END then
> +    -- TODO: samples & overruns
> +    return false
> +  end
> +
> +  assert(0 <= vmstate and vmstate <= 9, "Vmstate "..vmstate.." is not valid")

Minor: For what 0 and 9 stands for?

> +  event.lua.vmstate = vmstate
> +
> +  event_parsers[vmstate](reader, event)
> +
> +  table.insert(events, event)
> +  return true
> +end
> +
> +function M.parse(reader)

<snipped>

> +end
> +
> +return M
> -- 
> 2.33.0
> 

-- 
Best regards,
Sergey Kaplun


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2021-10-07 11:29 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-08 17:50 [Tarantool-patches] [PATCH luajit 0/7] misc: introduce sysprof Maxim Kokryashkin via Tarantool-patches
2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 1/7] core: save current frame top to lj_obj Maxim Kokryashkin via Tarantool-patches
2021-09-20 17:21   ` Sergey Kaplun via Tarantool-patches
2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 2/7] core: separate the profiling timer from lj_profile Maxim Kokryashkin via Tarantool-patches
2021-09-21 11:13   ` Sergey Kaplun via Tarantool-patches
2021-09-23 11:37     ` Mikhail Shishatskiy via Tarantool-patches
2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 3/7] memprof: move symtab to a separate module Maxim Kokryashkin via Tarantool-patches
2021-09-22  7:51   ` Sergey Kaplun via Tarantool-patches
2021-09-22  8:14     ` Sergey Kaplun via Tarantool-patches
2021-09-23 14:51   ` [Tarantool-patches] [PATCH luajit v2] " Maxim Kokryashkin via Tarantool-patches
2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 4/7] core: introduce lua and platform profiler Maxim Kokryashkin via Tarantool-patches
2021-09-29  6:53   ` Sergey Kaplun via Tarantool-patches
2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 5/7] memprof: add profile common section Maxim Kokryashkin via Tarantool-patches
2021-10-05 10:48   ` Sergey Kaplun via Tarantool-patches
2021-10-06 19:15     ` [Tarantool-patches] [PATCH luajit v2] " Maxim Kokryashkin via Tarantool-patches
2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 6/7] sysprof: introduce Lua API Maxim Kokryashkin via Tarantool-patches
2021-10-05 15:36   ` Sergey Kaplun via Tarantool-patches
2021-09-08 17:50 ` [Tarantool-patches] [PATCH luajit 7/7] tools: introduce parsers for sysprof Maxim Kokryashkin via Tarantool-patches
2021-10-07 11:28   ` Sergey Kaplun via Tarantool-patches

Tarantool development patches archive

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://lists.tarantool.org/tarantool-patches/0 tarantool-patches/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 tarantool-patches tarantool-patches/ https://lists.tarantool.org/tarantool-patches \
		tarantool-patches@dev.tarantool.org.
	public-inbox-index tarantool-patches

Example config snippet for mirrors.


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git