[Tarantool-patches] [PATCH luajit] Throw any errors before stack changes in trace stitching.
Sergey Kaplun
skaplun at tarantool.org
Wed Feb 21 15:34:38 MSK 2024
Hi, Maxim!
Thanks for the patch!
Please consider my comments below.
On 16.02.24, Maxim Kokryashkin wrote:
> From: Mike Pall <mike>
>
> Thanks to doujiang24.
>
> (cherry-picked from commit 3f9389edc6cdf3f78a6896d550c236860aed62b2)
>
> The Lua stack is changed in the `lj_record_stop`, so if the trace
Actually, it is changed in the `recff_stitch()` -- the continuation frame
is inserted below a function with arguments.
> is aborted, for example, when the `maxsnap` is reached, and an
If there are any other situations, we must handle them too:
1) AFAICS, when LuaJIT is built with -DLUAJIT_ENABLE_TABLE_BUMP, the
following lines may lead to the error, and thereby the Lua stack
becomes unbalanced:
<src/lj_record.c:290> `lj_record_stop()`:
| if (J->retryrec)
| lj_trace_err(J, LJ_TRERR_RETRY);
Build with the following flags:
| cmake -DCMAKE_C_FLAGS="-DLUAJIT_ENABLE_TABLE_BUMP" -DLUA_USE_APICHECK=ON \
| -DLUA_USE_ASSERT=ON -DCMAKE_BUILD_TYPE=Debug \
| -DLUAJIT_ENABLE_GC64=OFF . && make -j
The following script reproduces the issue (`maxsnap` configuration option
is gone):
| src/luajit -e "
| local modf = math.modf
| jit.opt.start('hotloop=1')
|
| local t1
| for i = 1, 2 do
| t1 = {}
| t1[i] = i
| -- Forcify stitch.
| modf(1.2)
| end
| "
| LuaJIT ASSERT src/lj_dispatch.c:502: lj_dispatch_call: unbalanced stack after hot instruction
2) Also, it may happen when the memory error is raised when we try to
reallocate the IR buffer for the next instruction in `lj_ir_nextins()`.
The reproducer for this issue is much more tricky.
My suggestions here are the following:
Resolve the special case in the scope of backporting this patch, and
report the 2 follow-ups with suggested solutions to the upstream.
As I can see, this issue should be fixed in the following way:
Add the `lj_vm_cpcall()` to the `record_stop()` call and rethrow
the error, if needed, after clenup.
Thoughts?
> error is thrown from there, the execution continues with an
> unbalanced stack.
>
> Since the only error that is thrown from the `lj_record_stop`
> with modified stack is `LJ_TRERR_SNAPOV`, this patch adds a
> check for snapshot overflow to the `recff_stitch`.
>
> Maxim Kokryashkin:
> * added the description and the test for the problem
>
> Part of tarantool/tarantool#9595
> ---
> Branch: https://github.com/tarantool/luajit/tree/fckxorg/lj-pr-720-errors-before-stitch
> PR: https://github.com/tarantool/tarantool/pull/9703
> Issues:
> https://github.com/tarantool/tarantool/issues/9595
> https://github.com/LuaJIT/LuaJIT/pull/720
>
> src/lj_ffrecord.c | 4 ++++
> .../lj-pr-720-errors-before-stitch.test.lua | 20 +++++++++++++++++++
> 2 files changed, 24 insertions(+)
> create mode 100644 test/tarantool-tests/lj-pr-720-errors-before-stitch.test.lua
>
> diff --git a/src/lj_ffrecord.c b/src/lj_ffrecord.c
> index 1b2ddb72..e3ed80fb 100644
> --- a/src/lj_ffrecord.c
> +++ b/src/lj_ffrecord.c
> @@ -107,6 +107,10 @@ static void recff_stitch(jit_State *J)
> const BCIns *pc = frame_pc(base-1);
> TValue *pframe = frame_prevl(base-1);
>
> + /* Check for this now. Throwing in lj_record_stop messes up the stack. */
> + if (J->cur.nsnap >= (MSize)J->param[JIT_P_maxsnap])
> + lj_trace_err(J, LJ_TRERR_SNAPOV);
> +
> /* Move func + args up in Lua stack and insert continuation. */
> memmove(&base[1], &base[-1-LJ_FR2], sizeof(TValue)*nslot);
> setframe_ftsz(nframe, ((char *)nframe - (char *)pframe) + FRAME_CONT);
> diff --git a/test/tarantool-tests/lj-pr-720-errors-before-stitch.test.lua b/test/tarantool-tests/lj-pr-720-errors-before-stitch.test.lua
> new file mode 100644
> index 00000000..53b7b5ea
> --- /dev/null
> +++ b/test/tarantool-tests/lj-pr-720-errors-before-stitch.test.lua
I suggest to ommit the "pr-" part of the prefix. Since PR are treated
the same way by Mike we don't need to mention it specifically. Also, as
Sergey mentioned, GitHub redirects users to the PR page, so we can avoid
specific mentioning.
Side note-reminder (top priority of mentioning at the top):
* lj-dddd stands for issues from LuaJIT/LuaJIT repository.
* gh-dddd stands for issues from tarantool/tarantool repository.
* or-dddd stands for issues from openresty/lua-resty-core or (since lack
of them) openresty/luajit2 repository.
> @@ -0,0 +1,20 @@
> +local tap = require('tap')
> +local test = tap.test('lj-pr-720-errors-before-stitch'):skipcond({
> + ['Test requires JIT enabled'] = not jit.status(),
> +})
> +test:plan(1)
> +
> +-- `math.modf` recording is NYI.
> +local modf = require('math').modf
There is no need to require math library since it is already loaded.
> +jit.opt.start('hotloop=1', 'maxsnap=1')
> +
> +-- The loop has only two iterations: the first to detect its
> +-- hotness and the second to record it. The snapshot limit is
> +-- set to one and is certainly reached.
> +for _ = 1, 2 do
> + -- Forcify stitch.
> + modf(1.2)
> +end
> +
> +test:ok(true, 'stack is balanced')
> +test:done(true)
> --
> 2.43.0
>
--
Best regards,
Sergey Kaplun
More information about the Tarantool-patches
mailing list