* [Tarantool-patches] [PATCH luajit] Throw any errors before stack changes in trace stitching. @ 2024-02-16 11:11 Maxim Kokryashkin via Tarantool-patches 2024-02-20 8:08 ` Sergey Bronnikov via Tarantool-patches 2024-02-21 12:34 ` Sergey Kaplun via Tarantool-patches 0 siblings, 2 replies; 5+ messages in thread From: Maxim Kokryashkin via Tarantool-patches @ 2024-02-16 11:11 UTC (permalink / raw) To: tarantool-patches, skaplun, sergeyb From: Mike Pall <mike> Thanks to doujiang24. (cherry-picked from commit 3f9389edc6cdf3f78a6896d550c236860aed62b2) The Lua stack is changed in the `lj_record_stop`, so if the trace is aborted, for example, when the `maxsnap` is reached, and an error is thrown from there, the execution continues with an unbalanced stack. Since the only error that is thrown from the `lj_record_stop` with modified stack is `LJ_TRERR_SNAPOV`, this patch adds a check for snapshot overflow to the `recff_stitch`. Maxim Kokryashkin: * added the description and the test for the problem Part of tarantool/tarantool#9595 --- Branch: https://github.com/tarantool/luajit/tree/fckxorg/lj-pr-720-errors-before-stitch PR: https://github.com/tarantool/tarantool/pull/9703 Issues: https://github.com/tarantool/tarantool/issues/9595 https://github.com/LuaJIT/LuaJIT/pull/720 src/lj_ffrecord.c | 4 ++++ .../lj-pr-720-errors-before-stitch.test.lua | 20 +++++++++++++++++++ 2 files changed, 24 insertions(+) create mode 100644 test/tarantool-tests/lj-pr-720-errors-before-stitch.test.lua diff --git a/src/lj_ffrecord.c b/src/lj_ffrecord.c index 1b2ddb72..e3ed80fb 100644 --- a/src/lj_ffrecord.c +++ b/src/lj_ffrecord.c @@ -107,6 +107,10 @@ static void recff_stitch(jit_State *J) const BCIns *pc = frame_pc(base-1); TValue *pframe = frame_prevl(base-1); + /* Check for this now. Throwing in lj_record_stop messes up the stack. */ + if (J->cur.nsnap >= (MSize)J->param[JIT_P_maxsnap]) + lj_trace_err(J, LJ_TRERR_SNAPOV); + /* Move func + args up in Lua stack and insert continuation. */ memmove(&base[1], &base[-1-LJ_FR2], sizeof(TValue)*nslot); setframe_ftsz(nframe, ((char *)nframe - (char *)pframe) + FRAME_CONT); diff --git a/test/tarantool-tests/lj-pr-720-errors-before-stitch.test.lua b/test/tarantool-tests/lj-pr-720-errors-before-stitch.test.lua new file mode 100644 index 00000000..53b7b5ea --- /dev/null +++ b/test/tarantool-tests/lj-pr-720-errors-before-stitch.test.lua @@ -0,0 +1,20 @@ +local tap = require('tap') +local test = tap.test('lj-pr-720-errors-before-stitch'):skipcond({ + ['Test requires JIT enabled'] = not jit.status(), +}) +test:plan(1) + +-- `math.modf` recording is NYI. +local modf = require('math').modf +jit.opt.start('hotloop=1', 'maxsnap=1') + +-- The loop has only two iterations: the first to detect its +-- hotness and the second to record it. The snapshot limit is +-- set to one and is certainly reached. +for _ = 1, 2 do + -- Forcify stitch. + modf(1.2) +end + +test:ok(true, 'stack is balanced') +test:done(true) -- 2.43.0 ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit] Throw any errors before stack changes in trace stitching. 2024-02-16 11:11 [Tarantool-patches] [PATCH luajit] Throw any errors before stack changes in trace stitching Maxim Kokryashkin via Tarantool-patches @ 2024-02-20 8:08 ` Sergey Bronnikov via Tarantool-patches 2024-03-11 15:58 ` Sergey Bronnikov via Tarantool-patches 2024-02-21 12:34 ` Sergey Kaplun via Tarantool-patches 1 sibling, 1 reply; 5+ messages in thread From: Sergey Bronnikov via Tarantool-patches @ 2024-02-20 8:08 UTC (permalink / raw) To: Maxim Kokryashkin, tarantool-patches, skaplun Hello, Max, thanks for the patch! see my comments Sergey On 2/16/24 14:11, Maxim Kokryashkin wrote: > From: Mike Pall <mike> > > Thanks to doujiang24. > > (cherry-picked from commit 3f9389edc6cdf3f78a6896d550c236860aed62b2) > > The Lua stack is changed in the `lj_record_stop`, so if the trace > is aborted, for example, when the `maxsnap` is reached, and an > error is thrown from there, the execution continues with an > unbalanced stack. > > Since the only error that is thrown from the `lj_record_stop` > with modified stack is `LJ_TRERR_SNAPOV`, this patch adds a As I see lj_record_stop [1] throws only LJ_TRERR_RETRY or I got you wrong. 1. https://github.com/LuaJIT/LuaJIT/blob/0d313b243194a0b8d2399d8b549ca5a0ff234db5/src/lj_record.c#L299 > check for snapshot overflow to the `recff_stitch`. > > Maxim Kokryashkin: > * added the description and the test for the problem > > Part of tarantool/tarantool#9595 > --- > Branch: https://github.com/tarantool/luajit/tree/fckxorg/lj-pr-720-errors-before-stitch > PR: https://github.com/tarantool/tarantool/pull/9703 > Issues: > https://github.com/tarantool/tarantool/issues/9595 > https://github.com/LuaJIT/LuaJIT/pull/720 > > src/lj_ffrecord.c | 4 ++++ > .../lj-pr-720-errors-before-stitch.test.lua | 20 +++++++++++++++++++ > 2 files changed, 24 insertions(+) > create mode 100644 test/tarantool-tests/lj-pr-720-errors-before-stitch.test.lua > > diff --git a/src/lj_ffrecord.c b/src/lj_ffrecord.c > index 1b2ddb72..e3ed80fb 100644 > --- a/src/lj_ffrecord.c > +++ b/src/lj_ffrecord.c > @@ -107,6 +107,10 @@ static void recff_stitch(jit_State *J) > const BCIns *pc = frame_pc(base-1); > TValue *pframe = frame_prevl(base-1); > > + /* Check for this now. Throwing in lj_record_stop messes up the stack. */ > + if (J->cur.nsnap >= (MSize)J->param[JIT_P_maxsnap]) > + lj_trace_err(J, LJ_TRERR_SNAPOV); > + > /* Move func + args up in Lua stack and insert continuation. */ > memmove(&base[1], &base[-1-LJ_FR2], sizeof(TValue)*nslot); > setframe_ftsz(nframe, ((char *)nframe - (char *)pframe) + FRAME_CONT); > diff --git a/test/tarantool-tests/lj-pr-720-errors-before-stitch.test.lua b/test/tarantool-tests/lj-pr-720-errors-before-stitch.test.lua > new file mode 100644 > index 00000000..53b7b5ea > --- /dev/null > +++ b/test/tarantool-tests/lj-pr-720-errors-before-stitch.test.lua You could omit "-pr" in the name, anyway https://github.com/LuaJIT/LuaJIT/issues/720 redirects to https://github.com/LuaJIT/LuaJIT/pull/720. > @@ -0,0 +1,20 @@ > +local tap = require('tap') > +local test = tap.test('lj-pr-720-errors-before-stitch'):skipcond({ > + ['Test requires JIT enabled'] = not jit.status(), > +}) > +test:plan(1) > + > +-- `math.modf` recording is NYI. > +local modf = require('math').modf > +jit.opt.start('hotloop=1', 'maxsnap=1') > + > +-- The loop has only two iterations: the first to detect its > +-- hotness and the second to record it. The snapshot limit is > +-- set to one and is certainly reached. > +for _ = 1, 2 do > + -- Forcify stitch. > + modf(1.2) > +end > + > +test:ok(true, 'stack is balanced') > +test:done(true) > -- > 2.43.0 > ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit] Throw any errors before stack changes in trace stitching. 2024-02-20 8:08 ` Sergey Bronnikov via Tarantool-patches @ 2024-03-11 15:58 ` Sergey Bronnikov via Tarantool-patches 0 siblings, 0 replies; 5+ messages in thread From: Sergey Bronnikov via Tarantool-patches @ 2024-03-11 15:58 UTC (permalink / raw) To: Maxim Kokryashkin, tarantool-patches, skaplun Max, thanks for fixes! LGTM On 2/20/24 11:08, Sergey Bronnikov via Tarantool-patches wrote: > Hello, Max, > > thanks for the patch! see my comments > > Sergey > > On 2/16/24 14:11, Maxim Kokryashkin wrote: >> From: Mike Pall <mike> >> >> Thanks to doujiang24. >> >> (cherry-picked from commit 3f9389edc6cdf3f78a6896d550c236860aed62b2) >> >> The Lua stack is changed in the `lj_record_stop`, so if the trace >> is aborted, for example, when the `maxsnap` is reached, and an >> error is thrown from there, the execution continues with an >> unbalanced stack. >> >> Since the only error that is thrown from the `lj_record_stop` >> with modified stack is `LJ_TRERR_SNAPOV`, this patch adds a > > As I see lj_record_stop [1] throws only LJ_TRERR_RETRY or > > I got you wrong. > > > 1. > https://github.com/LuaJIT/LuaJIT/blob/0d313b243194a0b8d2399d8b549ca5a0ff234db5/src/lj_record.c#L299 > >> check for snapshot overflow to the `recff_stitch`. >> >> Maxim Kokryashkin: >> * added the description and the test for the problem >> >> Part of tarantool/tarantool#9595 >> --- >> Branch: >> https://github.com/tarantool/luajit/tree/fckxorg/lj-pr-720-errors-before-stitch >> PR: https://github.com/tarantool/tarantool/pull/9703 >> Issues: >> https://github.com/tarantool/tarantool/issues/9595 >> https://github.com/LuaJIT/LuaJIT/pull/720 >> >> src/lj_ffrecord.c | 4 ++++ >> .../lj-pr-720-errors-before-stitch.test.lua | 20 +++++++++++++++++++ >> 2 files changed, 24 insertions(+) >> create mode 100644 >> test/tarantool-tests/lj-pr-720-errors-before-stitch.test.lua >> >> diff --git a/src/lj_ffrecord.c b/src/lj_ffrecord.c >> index 1b2ddb72..e3ed80fb 100644 >> --- a/src/lj_ffrecord.c >> +++ b/src/lj_ffrecord.c >> @@ -107,6 +107,10 @@ static void recff_stitch(jit_State *J) >> const BCIns *pc = frame_pc(base-1); >> TValue *pframe = frame_prevl(base-1); >> >> + /* Check for this now. Throwing in lj_record_stop messes up the >> stack. */ >> + if (J->cur.nsnap >= (MSize)J->param[JIT_P_maxsnap]) >> + lj_trace_err(J, LJ_TRERR_SNAPOV); >> + >> /* Move func + args up in Lua stack and insert continuation. */ >> memmove(&base[1], &base[-1-LJ_FR2], sizeof(TValue)*nslot); >> setframe_ftsz(nframe, ((char *)nframe - (char *)pframe) + >> FRAME_CONT); >> diff --git >> a/test/tarantool-tests/lj-pr-720-errors-before-stitch.test.lua >> b/test/tarantool-tests/lj-pr-720-errors-before-stitch.test.lua >> new file mode 100644 >> index 00000000..53b7b5ea >> --- /dev/null >> +++ b/test/tarantool-tests/lj-pr-720-errors-before-stitch.test.lua > > You could omit "-pr" in the name, anyway > > https://github.com/LuaJIT/LuaJIT/issues/720 redirects to > https://github.com/LuaJIT/LuaJIT/pull/720. > >> @@ -0,0 +1,20 @@ >> +local tap = require('tap') >> +local test = tap.test('lj-pr-720-errors-before-stitch'):skipcond({ >> + ['Test requires JIT enabled'] = not jit.status(), >> +}) >> +test:plan(1) >> + >> +-- `math.modf` recording is NYI. >> +local modf = require('math').modf >> +jit.opt.start('hotloop=1', 'maxsnap=1') >> + >> +-- The loop has only two iterations: the first to detect its >> +-- hotness and the second to record it. The snapshot limit is >> +-- set to one and is certainly reached. >> +for _ = 1, 2 do >> + -- Forcify stitch. >> + modf(1.2) >> +end >> + >> +test:ok(true, 'stack is balanced') >> +test:done(true) >> -- >> 2.43.0 >> ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit] Throw any errors before stack changes in trace stitching. 2024-02-16 11:11 [Tarantool-patches] [PATCH luajit] Throw any errors before stack changes in trace stitching Maxim Kokryashkin via Tarantool-patches 2024-02-20 8:08 ` Sergey Bronnikov via Tarantool-patches @ 2024-02-21 12:34 ` Sergey Kaplun via Tarantool-patches 2024-02-22 14:03 ` Sergey Kaplun via Tarantool-patches 1 sibling, 1 reply; 5+ messages in thread From: Sergey Kaplun via Tarantool-patches @ 2024-02-21 12:34 UTC (permalink / raw) To: Maxim Kokryashkin; +Cc: tarantool-patches Hi, Maxim! Thanks for the patch! Please consider my comments below. On 16.02.24, Maxim Kokryashkin wrote: > From: Mike Pall <mike> > > Thanks to doujiang24. > > (cherry-picked from commit 3f9389edc6cdf3f78a6896d550c236860aed62b2) > > The Lua stack is changed in the `lj_record_stop`, so if the trace Actually, it is changed in the `recff_stitch()` -- the continuation frame is inserted below a function with arguments. > is aborted, for example, when the `maxsnap` is reached, and an If there are any other situations, we must handle them too: 1) AFAICS, when LuaJIT is built with -DLUAJIT_ENABLE_TABLE_BUMP, the following lines may lead to the error, and thereby the Lua stack becomes unbalanced: <src/lj_record.c:290> `lj_record_stop()`: | if (J->retryrec) | lj_trace_err(J, LJ_TRERR_RETRY); Build with the following flags: | cmake -DCMAKE_C_FLAGS="-DLUAJIT_ENABLE_TABLE_BUMP" -DLUA_USE_APICHECK=ON \ | -DLUA_USE_ASSERT=ON -DCMAKE_BUILD_TYPE=Debug \ | -DLUAJIT_ENABLE_GC64=OFF . && make -j The following script reproduces the issue (`maxsnap` configuration option is gone): | src/luajit -e " | local modf = math.modf | jit.opt.start('hotloop=1') | | local t1 | for i = 1, 2 do | t1 = {} | t1[i] = i | -- Forcify stitch. | modf(1.2) | end | " | LuaJIT ASSERT src/lj_dispatch.c:502: lj_dispatch_call: unbalanced stack after hot instruction 2) Also, it may happen when the memory error is raised when we try to reallocate the IR buffer for the next instruction in `lj_ir_nextins()`. The reproducer for this issue is much more tricky. My suggestions here are the following: Resolve the special case in the scope of backporting this patch, and report the 2 follow-ups with suggested solutions to the upstream. As I can see, this issue should be fixed in the following way: Add the `lj_vm_cpcall()` to the `record_stop()` call and rethrow the error, if needed, after clenup. Thoughts? > error is thrown from there, the execution continues with an > unbalanced stack. > > Since the only error that is thrown from the `lj_record_stop` > with modified stack is `LJ_TRERR_SNAPOV`, this patch adds a > check for snapshot overflow to the `recff_stitch`. > > Maxim Kokryashkin: > * added the description and the test for the problem > > Part of tarantool/tarantool#9595 > --- > Branch: https://github.com/tarantool/luajit/tree/fckxorg/lj-pr-720-errors-before-stitch > PR: https://github.com/tarantool/tarantool/pull/9703 > Issues: > https://github.com/tarantool/tarantool/issues/9595 > https://github.com/LuaJIT/LuaJIT/pull/720 > > src/lj_ffrecord.c | 4 ++++ > .../lj-pr-720-errors-before-stitch.test.lua | 20 +++++++++++++++++++ > 2 files changed, 24 insertions(+) > create mode 100644 test/tarantool-tests/lj-pr-720-errors-before-stitch.test.lua > > diff --git a/src/lj_ffrecord.c b/src/lj_ffrecord.c > index 1b2ddb72..e3ed80fb 100644 > --- a/src/lj_ffrecord.c > +++ b/src/lj_ffrecord.c > @@ -107,6 +107,10 @@ static void recff_stitch(jit_State *J) > const BCIns *pc = frame_pc(base-1); > TValue *pframe = frame_prevl(base-1); > > + /* Check for this now. Throwing in lj_record_stop messes up the stack. */ > + if (J->cur.nsnap >= (MSize)J->param[JIT_P_maxsnap]) > + lj_trace_err(J, LJ_TRERR_SNAPOV); > + > /* Move func + args up in Lua stack and insert continuation. */ > memmove(&base[1], &base[-1-LJ_FR2], sizeof(TValue)*nslot); > setframe_ftsz(nframe, ((char *)nframe - (char *)pframe) + FRAME_CONT); > diff --git a/test/tarantool-tests/lj-pr-720-errors-before-stitch.test.lua b/test/tarantool-tests/lj-pr-720-errors-before-stitch.test.lua > new file mode 100644 > index 00000000..53b7b5ea > --- /dev/null > +++ b/test/tarantool-tests/lj-pr-720-errors-before-stitch.test.lua I suggest to ommit the "pr-" part of the prefix. Since PR are treated the same way by Mike we don't need to mention it specifically. Also, as Sergey mentioned, GitHub redirects users to the PR page, so we can avoid specific mentioning. Side note-reminder (top priority of mentioning at the top): * lj-dddd stands for issues from LuaJIT/LuaJIT repository. * gh-dddd stands for issues from tarantool/tarantool repository. * or-dddd stands for issues from openresty/lua-resty-core or (since lack of them) openresty/luajit2 repository. > @@ -0,0 +1,20 @@ > +local tap = require('tap') > +local test = tap.test('lj-pr-720-errors-before-stitch'):skipcond({ > + ['Test requires JIT enabled'] = not jit.status(), > +}) > +test:plan(1) > + > +-- `math.modf` recording is NYI. > +local modf = require('math').modf There is no need to require math library since it is already loaded. > +jit.opt.start('hotloop=1', 'maxsnap=1') > + > +-- The loop has only two iterations: the first to detect its > +-- hotness and the second to record it. The snapshot limit is > +-- set to one and is certainly reached. > +for _ = 1, 2 do > + -- Forcify stitch. > + modf(1.2) > +end > + > +test:ok(true, 'stack is balanced') > +test:done(true) > -- > 2.43.0 > -- Best regards, Sergey Kaplun ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit] Throw any errors before stack changes in trace stitching. 2024-02-21 12:34 ` Sergey Kaplun via Tarantool-patches @ 2024-02-22 14:03 ` Sergey Kaplun via Tarantool-patches 0 siblings, 0 replies; 5+ messages in thread From: Sergey Kaplun via Tarantool-patches @ 2024-02-22 14:03 UTC (permalink / raw) To: Maxim Kokryashkin, tarantool-patches Hi, again, folks! On 21.02.24, Sergey Kaplun via Tarantool-patches wrote: > Hi, Maxim! > Thanks for the patch! > Please consider my comments below. > > On 16.02.24, Maxim Kokryashkin wrote: > > From: Mike Pall <mike> > > > > Thanks to doujiang24. > > > > (cherry-picked from commit 3f9389edc6cdf3f78a6896d550c236860aed62b2) > > > > The Lua stack is changed in the `lj_record_stop`, so if the trace > > Actually, it is changed in the `recff_stitch()` -- the continuation frame > is inserted below a function with arguments. > > > is aborted, for example, when the `maxsnap` is reached, and an > > If there are any other situations, we must handle them too: > > 1) AFAICS, when LuaJIT is built with -DLUAJIT_ENABLE_TABLE_BUMP, the > following lines may lead to the error, and thereby the Lua stack > becomes unbalanced: > > <src/lj_record.c:290> `lj_record_stop()`: > | if (J->retryrec) > | lj_trace_err(J, LJ_TRERR_RETRY); > > Build with the following flags: > | cmake -DCMAKE_C_FLAGS="-DLUAJIT_ENABLE_TABLE_BUMP" -DLUA_USE_APICHECK=ON \ > | -DLUA_USE_ASSERT=ON -DCMAKE_BUILD_TYPE=Debug \ > | -DLUAJIT_ENABLE_GC64=OFF . && make -j > > The following script reproduces the issue (`maxsnap` configuration option > is gone): > > | src/luajit -e " > | local modf = math.modf > | jit.opt.start('hotloop=1') > | > | local t1 > | for i = 1, 2 do > | t1 = {} > | t1[i] = i > | -- Forcify stitch. > | modf(1.2) > | end > | " > | LuaJIT ASSERT src/lj_dispatch.c:502: lj_dispatch_call: unbalanced stack after hot instruction > > 2) Also, it may happen when the memory error is raised when we try to > reallocate the IR buffer for the next instruction in `lj_ir_nextins()`. > > The reproducer for this issue is much more tricky. > > My suggestions here are the following: > Resolve the special case in the scope of backporting this patch, and > report the 2 follow-ups with suggested solutions to the upstream. > As I can see, this issue should be fixed in the following way: > Add the `lj_vm_cpcall()` to the `record_stop()` call and rethrow > the error, if needed, after clenup. > Reported here: https://github.com/LuaJIT/LuaJIT/issues/1166 <snipped> > > > > -- > Best regards, > Sergey Kaplun -- Best regards, Sergey Kaplun ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2024-03-11 15:58 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2024-02-16 11:11 [Tarantool-patches] [PATCH luajit] Throw any errors before stack changes in trace stitching Maxim Kokryashkin via Tarantool-patches 2024-02-20 8:08 ` Sergey Bronnikov via Tarantool-patches 2024-03-11 15:58 ` Sergey Bronnikov via Tarantool-patches 2024-02-21 12:34 ` Sergey Kaplun via Tarantool-patches 2024-02-22 14:03 ` Sergey Kaplun via Tarantool-patches
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox