[Tarantool-patches] [PATCH luajit v3] sysprof: fix crash during FFUNC stream

Sergey Bronnikov sergeyb at tarantool.org
Thu Jul 13 11:12:43 MSK 2023


Hi, Max!

Thanks for the patch!


Test is passed with reverted patch, so it is useless.


As I get it right there is a non-zero possibility to catch inconsistency 
in VM state in a future,

so I propose to enable sysprof in our fuzzing test [1] with a high 
sampling interval and check that profiler

will work on generated Lua programs . What do you think?


Also see my comments inline.


1. 
https://github.com/ligurio/tarantool/commit/b8bc3293da66ded41383faf5ea913a2554d987b8


Sergey

On 7/10/23 15:28, Maxim Kokryashkin wrote:
> Sometimes, the Lua stack can be inconsistent during
> the FFUNC execution, which may lead to a sysprof
> crash during the stack unwinding.
>
> This patch replaces the `top_frame` property of `global_State`
> with `lj_sysprof_topframe` structure, which contains `top_frame`
> and `ffid` properties. `ffid` property makes sense only when the
> LuaJIT VM state is set to `FFUNC`. That property is set to the
> ffid of the fast function that VM is about to execute.
> In the same time, `top_frame` property is not updated now, so
> the top frame of the Lua stack can be streamed based on the ffid,
> and the rest of the Lua stack can be streamed as usual.
>
> Also, this patch fixes build with plain makefile, by adding
> the `LJ_HASSYSPROF` flag support to it.
>
> Resolves tarantool/tarantool#8594
> ---
> Changes in v3:
> - Fixed comments as per review by Sergey
>
> Branch: https://github.com/tarantool/luajit/tree/fckxorg/gh-8594-sysprof-ffunc-crash
> PR: https://github.com/tarantool/tarantool/pull/8737
>   src/Makefile.original                         |  3 ++
>   src/lj_obj.h                                  |  7 +++-
>   src/lj_sysprof.c                              | 26 ++++++++++++---
>   src/vm_x64.dasc                               | 22 +++++++++++--
>   src/vm_x86.dasc                               | 31 ++++++++++++++---
>   .../gh-8594-sysprof-ffunc-crash.test.lua      | 33 +++++++++++++++++++
>   6 files changed, 109 insertions(+), 13 deletions(-)
>   create mode 100644 test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua
>
> diff --git a/src/Makefile.original b/src/Makefile.original
> index aedaaa73..e21a0e56 100644
> --- a/src/Makefile.original
> +++ b/src/Makefile.original
> @@ -441,6 +441,9 @@ ifneq (,$(findstring LJ_NO_UNWIND 1,$(TARGET_TESTARCH)))
>     DASM_AFLAGS+= -D NO_UNWIND
>     TARGET_ARCH+= -DLUAJIT_NO_UNWIND
>   endif
> +ifneq (,$(findstring LJ_HASSYSPROF 1,$(TARGET_TESTARCH)))
> +  DASM_AFLAGS+= -D LJ_HASSYSPROF
> +endif
>   DASM_AFLAGS+= -D VER=$(subst LJ_ARCH_VERSION_,,$(filter LJ_ARCH_VERSION_%,$(subst LJ_ARCH_VERSION ,LJ_ARCH_VERSION_,$(TARGET_TESTARCH))))
>   ifeq (Windows,$(TARGET_SYS))
>     DASM_AFLAGS+= -D WIN
> diff --git a/src/lj_obj.h b/src/lj_obj.h
> index 45507e0d..e17316df 100644
> --- a/src/lj_obj.h
> +++ b/src/lj_obj.h
> @@ -598,6 +598,11 @@ enum {
>     GCSmax
>   };
>   
> +struct lj_sysprof_topframe {
> +  uint8_t ffid; /* FFID of the fast function VM is about to execute. */
> +  TValue *top_frame; /* Top frame for sysprof. */
> +};
> +
>   typedef struct GCState {
>     GCSize total;		/* Memory currently allocated. */
>     GCSize threshold;	/* Memory threshold. */
> @@ -675,7 +680,7 @@ typedef struct global_State {
>     MRef ctype_state;	/* Pointer to C type state. */
>     GCRef gcroot[GCROOT_MAX];  /* GC roots. */
>   #ifdef LJ_HASSYSPROF
> -  TValue *top_frame;	/* Top frame for sysprof. */
> +  struct lj_sysprof_topframe top_frame_info;	/* Top frame info for sysprof. */
>   #endif
>   } global_State;
>   
> diff --git a/src/lj_sysprof.c b/src/lj_sysprof.c
> index 2e9ed9b3..0a341e16 100644
> --- a/src/lj_sysprof.c
> +++ b/src/lj_sysprof.c
> @@ -109,6 +109,12 @@ static void stream_epilogue(struct sysprof *sp)
>     lj_wbuf_addbyte(&sp->out, LJP_EPILOGUE_BYTE);
>   }
>   
> +static void stream_ffunc_impl(struct lj_wbuf *buf, uint8_t ffid)
> +{
> +  lj_wbuf_addbyte(buf, LJP_FRAME_FFUNC);
> +  lj_wbuf_addu64(buf, ffid);
> +}
> +
>   static void stream_lfunc(struct lj_wbuf *buf, const GCfunc *func)
>   {
>     lua_assert(isluafunc(func));
> @@ -129,8 +135,7 @@ static void stream_cfunc(struct lj_wbuf *buf, const GCfunc *func)
>   static void stream_ffunc(struct lj_wbuf *buf, const GCfunc *func)
>   {
>     lua_assert(isffunc(func));
> -  lj_wbuf_addbyte(buf, LJP_FRAME_FFUNC);
> -  lj_wbuf_addu64(buf, func->c.ffid);
> +  stream_ffunc_impl(buf, func->c.ffid);
>   }
>   
>   static void stream_frame_lua(struct lj_wbuf *buf, const cTValue *frame)
> @@ -148,7 +153,7 @@ static void stream_frame_lua(struct lj_wbuf *buf, const cTValue *frame)
>       lua_assert(0);
>   }
>   
> -static void stream_backtrace_lua(struct sysprof *sp)
> +static void stream_backtrace_lua(struct sysprof *sp, uint32_t vmstate)
>   {
>     global_State *g = sp->g;
>     struct lj_wbuf *buf = &sp->out;
> @@ -158,8 +163,19 @@ static void stream_backtrace_lua(struct sysprof *sp)
>     lua_assert(g != NULL);
>     L = gco2th(gcref(g->cur_L));
>     lua_assert(L != NULL);
> +  /*
> +  ** Lua stack may be inconsistent during the execution of a
> +  ** fast-function, so instead of updating the `top_frame` for
> +  ** it, its `ffid` is set instead. The first frame on the
> +  ** result stack is streamed manually, and the rest of the
> +  ** stack is streamed based on the previous `top_frame` value.
> +  */
> +  if (vmstate == LJ_VMST_FFUNC) {
> +    uint8_t ffid = g->top_frame_info.ffid;
> +    stream_ffunc_impl(buf, ffid);
> +  }
>   
> -  top_frame = g->top_frame - 1; //(1 + LJ_FR2)
> +  top_frame = g->top_frame_info.top_frame - 1;
>   
>     bot = tvref(L->stack) + LJ_FR2;
>     /* Traverse frames backwards */
> @@ -234,7 +250,7 @@ static void stream_trace(struct sysprof *sp, uint32_t vmstate)
>   static void stream_guest(struct sysprof *sp, uint32_t vmstate)
>   {
>     lj_wbuf_addbyte(&sp->out, (uint8_t)vmstate);
> -  stream_backtrace_lua(sp);
> +  stream_backtrace_lua(sp, vmstate);
>     stream_backtrace_host(sp);
>   }
>   
> diff --git a/src/vm_x64.dasc b/src/vm_x64.dasc
> index 7b04b928..2a0c3f03 100644
> --- a/src/vm_x64.dasc
> +++ b/src/vm_x64.dasc
> @@ -53,6 +53,7 @@
>   |.define RDL,		RCL
>   |.define TMPR,		r10
>   |.define TMPRd,		r10d
> +|.define TMPRb,		r10b
>   |.define ITYPE,		r11
>   |.define ITYPEd,	r11d
>   |
> @@ -353,14 +354,29 @@
>   |// it syncs with the BASE register only when the control is passed to
>   |// user code. So we need to sync the BASE on each vmstate change to
>   |// keep it consistent.
> +|// The only exception are FFUNCs because sometimes even internal BASE
> +|// stash is inconsistent for them. To address that issue, their ffid
> +|// is stashed instead, so the corresponding frame can be streamed
> +|// manually.
>   |.macro set_vmstate_sync_base, st
>   |.if LJ_HASSYSPROF
>   |  set_vmstate INTERP  // Guard for non-atomic VM context restoration
> -|  mov qword [DISPATCH+DISPATCH_GL(top_frame)], BASE
> +|  mov qword [DISPATCH+DISPATCH_GL(top_frame_info.top_frame)], BASE
>   |.endif
>   |  set_vmstate st
>   |.endmacro
>   |
> +|.macro set_vmstate_ffunc
> +|.if LJ_HASSYSPROF
> +|  set_vmstate INTERP
> +|  mov TMPR, [BASE - 16]
> +|  cleartp LFUNC:TMPR // Obtain plain address value. Equivalent of `gcval`.
> +|  mov TMPRb, LFUNC:TMPR->ffid
> +|  mov byte [DISPATCH+DISPATCH_GL(top_frame_info.ffid)], TMPRb
> +|.endif
> +|  set_vmstate FFUNC
> +|.endmacro
> +|
>   |// Uses TMPRd (r10d).
>   |.macro save_vmstate
>   |.if not WIN
> @@ -376,7 +392,7 @@
>   |  set_vmstate INTERP
>   |  mov TMPR, SAVE_L
>   |  mov TMPR, L:TMPR->base
> -|  mov qword [DISPATCH+DISPATCH_GL(top_frame)], TMPR
> +|  mov qword [DISPATCH+DISPATCH_GL(top_frame_info.top_frame)], TMPR
>   |.endif
>   |  mov TMPRd, SAVE_VMSTATE
>   |  mov dword [DISPATCH+DISPATCH_GL(vmstate)], TMPRd
> @@ -1188,7 +1204,7 @@ static void build_subroutines(BuildCtx *ctx)
>     |
>     |.macro .ffunc, name
>     |->ff_ .. name:
> -  |  set_vmstate_sync_base FFUNC
> +  |  set_vmstate_ffunc
>     |.endmacro
>     |
>     |.macro .ffunc_1, name
> diff --git a/src/vm_x86.dasc b/src/vm_x86.dasc
> index bd1e940e..ff388d58 100644
> --- a/src/vm_x86.dasc
> +++ b/src/vm_x86.dasc
> @@ -101,6 +101,7 @@
>   |.define FCARG2,	CARG2d
>   |
>   |.define XCHGd,		r11d		// TMP on x64, used for exchange.
> +|.define XCHGb,		r11b		// TMPRb on x64, used for exchange.
>   |.endif
>   |
>   |// Type definitions. Some of these are only used for documentation.
> @@ -451,14 +452,36 @@
>   |// it syncs with the BASE register only when the control is passed to
>   |// user code. So we need to sync the BASE on each vmstate change to
>   |// keep it consistent.
> +|// The only exception are FFUNCs because sometimes even internal BASE
> +|// stash is inconsistent for them. To address that issue, their ffid
> +|// is stashed instead, so the corresponding frame can be streamed
> +|// manually.
>   |.macro set_vmstate_sync_base, st
>   |.if LJ_HASSYSPROF
>   |  set_vmstate INTERP  // Guard for non-atomic VM context restoration
> -|  mov dword [DISPATCH+DISPATCH_GL(top_frame)], BASE
> +|  mov dword [DISPATCH+DISPATCH_GL(top_frame_info.top_frame)], BASE
>   |.endif
>   |  set_vmstate st
>   |.endmacro
>   |
> +|.macro set_vmstate_ffunc
> +|.if LJ_HASSYSPROF
> +|  set_vmstate INTERP
> +|.if not X64
> +|  mov SPILLECX, ecx
> +|  mov LFUNC:ecx, [BASE - 4]
> +|  mov cl, LFUNC:ecx->ffid
> +|  mov byte [DISPATCH+DISPATCH_GL(top_frame_info.ffid)], cl
> +|  mov ecx, SPILLECX
> +|.else // X64
> +|  mov LFUNC:XCHGd, [BASE - 8]
> +|  mov XCHGb, LFUNC:XCHGd->ffid
> +|  mov byte [DISPATCH+DISPATCH_GL(top_frame_info.ffid)], XCHGb
> +|.endif // X64
> +|.endif // LJ_HASSYSPROF
> +|  set_vmstate FFUNC
> +|.endmacro
> +|
>   |// Uses spilled ecx on x86 or XCHGd (r11d) on x64.
>   |.macro save_vmstate
>   |.if not WIN
> @@ -485,7 +508,7 @@
>   |.if LJ_HASSYSPROF
>   |  mov ecx, SAVE_L
>   |  mov ecx, L:ecx->base
> -|  mov dword [DISPATCH+DISPATCH_GL(top_frame)], ecx
> +|  mov dword [DISPATCH+DISPATCH_GL(top_frame_info.top_frame)], ecx
>   |.endif
>   |  mov ecx, SAVE_VMSTATE
>   |  mov dword [DISPATCH+DISPATCH_GL(vmstate)], ecx
> @@ -494,7 +517,7 @@
>   |.if LJ_HASSYSPROF
>   |  mov XCHGd, SAVE_L
>   |  mov XCHGd, L:XCHGd->base
> -|  mov dword [DISPATCH+DISPATCH_GL(top_frame)], XCHGd
> +|  mov dword [DISPATCH+DISPATCH_GL(top_frame_info.top_frame)], XCHGd
>   |.endif
>   |  mov XCHGd, SAVE_VMSTATE
>   |  mov dword [DISPATCH+DISPATCH_GL(vmstate)], XCHGd
> @@ -1488,7 +1511,7 @@ static void build_subroutines(BuildCtx *ctx)
>     |
>     |.macro .ffunc, name
>     |->ff_ .. name:
> -  |  set_vmstate_sync_base FFUNC
> +  |  set_vmstate_ffunc
>     |.endmacro
>     |
>     |.macro .ffunc_1, name
> diff --git a/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua b/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua
> new file mode 100644
> index 00000000..e5cdeb07
> --- /dev/null
> +++ b/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua
> @@ -0,0 +1,33 @@
> +local tap = require('tap')
> +local test = tap.test('gh-8594-sysprof-ffunc-crash'):skipcond({
> +  ['Sysprof is implemented for x86_64 only'] = jit.arch ~= 'x86' and
> +                                               jit.arch ~= 'x64',
> +  ['Sysprof is implemented for Linux only'] = jit.os ~= 'Linux',
> +})
> +
> +test:plan(1)
> +
> +jit.off()
> +-- XXX: Run JIT tuning functions in a safe frame to avoid errors
> +-- thrown when LuaJIT is compiled with JIT engine disabled.
> +pcall(jit.flush)
> +
> +local TMP_BINFILE = '/dev/null'
> +
> +local res, err = misc.sysprof.start{
> +  mode = 'C',

In the reproducer from bug description "L" mode is used.

Why do you use "C" mode here?

> +  interval = 3,
> +  path = TMP_BINFILE,
> +}
> +assert(res, err)
> +
> +for i = 1, 1e5 do
> +  tostring(i)

Please add a comment, that we use "tostring" here, because "tostring" is 
a fast function.

(Sure it's may be obvious for you, but it will be more clear for those 
who will read it.)

> +end
> +
> +res, err = misc.sysprof.stop()
> +assert(res, err)
> +
> +test:ok(true, 'sysprof finished successfully')
> +
> +os.exit(test:check() and 0 or 1)


More information about the Tarantool-patches mailing list