From: Sergey Kaplun via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Maxim Kokryashkin <max.kokryashkin@gmail.com>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH luajit v2] sysprof: fix crash during FFUNC stream
Date: Thu, 29 Jun 2023 11:54:55 +0300
Message-ID: <ZJ1G3zkwOY6TjCiU@root>
In-Reply-To: <20230607122557.510692-1-m.kokryashkin@tarantool.org>

Hi, Maxim!
Thanks for the patch!
Please consider my comments below.

On 07.06.23, Maxim Kokryashkin wrote:
> Sometimes, the Lua stack can be inconsistent during
> the FFUNC execution, which may lead to a sysprof
> crash during the stack unwinding.
>
> This patch replaces the `top_frame` property of `global_State`
> with `lj_sysprof_topframe` structure, which contains `top_frame`
> and `ffid` properties. `ffid` property makes sense only when the
> LuaJIT VM state is set to `FFUNC`. That property is set to the
> ffid of the fast function that VM is about to execute.
> In the same time, `top_frame` property is not updated now, so
> the top frame of the Lua stack can be streamed based on the ffid,
> and the rest of the Lua stack can be streamed as usual.
>
> Resolves tarantool/tarantool#8594
> ---
> Changes in v2:
> - Sysprof binary data is now dumped into `/dev/null` to avoid cluttering
>   of the test runner drive
>
> Branch: https://github.com/tarantool/luajit/tree/fckxorg/gh-8594-sysprof-ffunc-crash
> PR: https://github.com/tarantool/tarantool/pull/8737
>
>  src/lj_obj.h | 7 +++-
>  src/lj_sysprof.c | 26 ++++++++++++---
>  src/vm_x64.dasc | 21 ++++++++++--
>  src/vm_x86.dasc | 22 ++++++++++---
>  .../gh-8594-sysprof-ffunc-crash.test.lua | 33 +++++++++++++++++++
>  5 files changed, 96 insertions(+), 13 deletions(-)
>  create mode 100644 test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua
>
> diff --git a/src/lj_obj.h b/src/lj_obj.h
> index 45507e0d..186433a3 100644
> --- a/src/lj_obj.h
> +++ b/src/lj_obj.h
> @@ -598,6 +598,11 @@ enum {
>    GCSmax
>  };
>
> +struct lj_sysprof_topframe {
> +  TValue *top_frame; /* Top frame for sysprof. */
> +  uint8_t ffid; /* FFID of the fast function VM is about to execute. */
> +};

I'm a bit concerned that the structure isn't well aligned.
Maybe we should place ffid at the top: this makes a "hole" in the
structure, but it will be 64-bit aligned.

> +
>  typedef struct GCState {
>    GCSize total; /* Memory currently allocated. */
>    GCSize threshold; /* Memory threshold. */
> @@ -675,7 +680,7 @@ typedef struct global_State {
>    MRef ctype_state; /* Pointer to C type state. */
>    GCRef gcroot[GCROOT_MAX]; /* GC roots. */
>  #ifdef LJ_HASSYSPROF
> -  TValue *top_frame; /* Top frame for sysprof. */
> +  struct lj_sysprof_topframe top_frame_info; /* Top frame info for sysprof. */
>  #endif
>  } global_State;
>
> diff --git a/src/lj_sysprof.c b/src/lj_sysprof.c
> index 2e9ed9b3..0a341e16 100644
> --- a/src/lj_sysprof.c
> +++ b/src/lj_sysprof.c

<snipped>

> diff --git a/src/vm_x64.dasc b/src/vm_x64.dasc
> index 7b04b928..3a35b9f7 100644
> --- a/src/vm_x64.dasc
> +++ b/src/vm_x64.dasc
> @@ -353,14 +353,29 @@
>  |// it syncs with the BASE register only when the control is passed to
>  |// user code. So we need to sync the BASE on each vmstate change to
>  |// keep it consistent.
> +|// The only execption are FFUNCs because sometimes even internal BASE

Typo: s/execption/exception/

> +|// stash is inconsistent for them. To address that issue, their ffid
> +|// is stashed instead, so the corresponding frame can be streamed
> +|// manually.

<snipped>

> +|.macro set_vmstate_ffunc
> +|.if LJ_HASSYSPROF
> +|  set_vmstate INTERP
> +|  mov TMPR, [BASE - 16]
> +|  cleartp LFUNC:TMPR

I suppose this line is redundant: we don't work with TMPR as LFUNC
again after this chunk.

> +|  mov r10b, LFUNC:TMPR->ffid // r10b is the byte-sized part of TMPR

So, maybe it's better to define a macro instead, like `TMPRb`.

> +|  mov byte [DISPATCH+DISPATCH_GL(top_frame_info.ffid)], r10b
> +|.endif
> +|  set_vmstate FFUNC
> +|.endmacro
> +|
>  |// Uses TMPRd (r10d).
>  |.macro save_vmstate
>  |.if not WIN
> @@ -376,7 +391,7 @@

<snipped>

> diff --git a/src/vm_x86.dasc b/src/vm_x86.dasc
> index bd1e940e..fabeec9f 100644
> --- a/src/vm_x86.dasc
> +++ b/src/vm_x86.dasc
> @@ -451,14 +451,28 @@
>  |// it syncs with the BASE register only when the control is passed to
>  |// user code. So we need to sync the BASE on each vmstate change to
>  |// keep it consistent.
> +|// The only execption are FFUNCs because sometimes even internal BASE

Typo: s/execption/exception/

> +|// stash is inconsistent for them. To address that issue, their ffid
> +|// is stashed instead, so the corresponding frame can be streamed
> +|// manually.

<snipped>

>  |
> +|.macro set_vmstate_ffunc
> +|.if LJ_HASSYSPROF
> +|  set_vmstate INTERP
> +|  mov LFUNC:XCHGd, [BASE - 8]

What about the x86 arch -- XCHGd isn't defined for it, so I'm very
surprised that the VM even builds :)...
We should spill ECX here too, I suppose.

| >>> src/luajit -e 'print(jit.arch)'
| x86
| >>> cd test/tarantool-tests/
| >>> LUA_PATH="./?.lua;../../src/?.lua;;" ../../src/luajit gh-8594-sysprof-ffunc-crash.test.lua
| TAP version 13
| 1..1
| Segmentation fault

Build like the following:
| make -j CC="gcc -m32" CCDEBUG=" -g -ggdb3" CFLAGS=" -O0" XCFLAGS=" -DLUA_USE_APICHECK -DLUA_USE_ASSERT " -f Makefile.original

Side note: I'm really disappointed that we still don't have flags to
do this from cmake, so that it would be available in our exotic build
testing.

> +|  mov r11b, LFUNC:XCHGd->ffid // r11b is the byte-sized part of XCHGd

So, maybe it's better to define a macro instead, like `XCHGb`.

> +|  mov byte [DISPATCH+DISPATCH_GL(top_frame_info.ffid)], r11b
> +|.endif
> +|  set_vmstate FFUNC
> +|.endmacro
> +|
>  |// Uses spilled ecx on x86 or XCHGd (r11d) on x64.
>  |.macro save_vmstate
>  |.if not WIN
> @@ -485,7 +499,7 @@

<snipped>

> diff --git a/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua b/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua
> new file mode 100644
> index 00000000..027eed74
> --- /dev/null
> +++ b/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua
> @@ -0,0 +1,33 @@
> +local tap = require('tap')
> +local test = tap.test('gh-8594-sysprof-ffunc-crash'):skipcond({
> +  ['Sysprof is implemented for x86_64 only'] = jit.arch ~= 'x86' and
> +                                               jit.arch ~= 'x64',
> +  ['Sysprof is implemented for Linux only'] = jit.os ~= "Linux",

Nit: Typo: s/"Linux"/'Linux'/

> +})

<snipped>

> --
> 2.40.1
>

--
Best regards,
Sergey Kaplun
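
Note on the alignment remark about `struct lj_sysprof_topframe`: the
two field orders can be compared with a small standalone C sketch. It
is not part of the patch. `TValueStub` and both struct names are
hypothetical stand-ins (only the pointer type matters for the layout),
and the real `TValue` from lj_obj.h is not used. On typical x86 and
x86-64 ABIs both orders should have the same total size; they differ
only in whether the padding sits after `ffid` or before `top_frame`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for LuaJIT's TValue (assumption, not lj_obj.h). */
typedef struct TValueStub { uint64_t u64; } TValueStub;

/* Field order as in the patch: pointer first, tail padding after ffid. */
struct topframe_ptr_first {
  TValueStub *top_frame;
  uint8_t ffid;
};

/* Order suggested in the review: ffid first, a "hole" before the pointer. */
struct topframe_ffid_first {
  uint8_t ffid;
  TValueStub *top_frame;
};

/* Check properties that the C layout rules give for either order. */
void check_topframe_layouts(void)
{
  /* The first member of a struct always sits at offset 0. */
  assert(offsetof(struct topframe_ptr_first, top_frame) == 0);
  assert(offsetof(struct topframe_ffid_first, ffid) == 0);
  /* The compiler pads so that the pointer member stays aligned. */
  assert(offsetof(struct topframe_ffid_first, top_frame) %
         _Alignof(TValueStub *) == 0);
}

/* Print sizes and member offsets for the ABI this file is built for. */
void print_topframe_layouts(void)
{
  printf("ptr first:  size=%zu top_frame@%zu ffid@%zu\n",
         sizeof(struct topframe_ptr_first),
         offsetof(struct topframe_ptr_first, top_frame),
         offsetof(struct topframe_ptr_first, ffid));
  printf("ffid first: size=%zu ffid@%zu top_frame@%zu\n",
         sizeof(struct topframe_ffid_first),
         offsetof(struct topframe_ffid_first, ffid),
         offsetof(struct topframe_ffid_first, top_frame));
}
```

Calling `check_topframe_layouts()` and `print_topframe_layouts()` from a
`main()` built with and without `-m32` shows the layout for each target.
The offsets that the DynASM code reaches via
`DISPATCH_GL(top_frame_info.ffid)` depend on the order chosen.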