Tarantool development patches archive
 help / color / mirror / Atom feed
From: Sergey Kaplun via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Maxim Kokryashkin <max.kokryashkin@gmail.com>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH luajit] Throw any errors before stack changes in trace stitching.
Date: Wed, 21 Feb 2024 15:34:38 +0300	[thread overview]
Message-ID: <ZdXt3oDNjOzpB_Vc@root> (raw)
In-Reply-To: <20240216111152.152025-1-m.kokryashkin@tarantool.org>

Hi, Maxim!
Thanks for the patch!
Please consider my comments below.

On 16.02.24, Maxim Kokryashkin wrote:
> From: Mike Pall <mike>
> 
> Thanks to doujiang24.
> 
> (cherry-picked from commit 3f9389edc6cdf3f78a6896d550c236860aed62b2)
> 
> The Lua stack is changed in the `lj_record_stop`, so if the trace

Actually, it is changed in the `recff_stitch()` -- the continuation frame
is inserted below a function with arguments.

> is aborted, for example, when the `maxsnap` is reached, and an

If there are any other situations, we must handle them too:

1) AFAICS, when LuaJIT is built with -DLUAJIT_ENABLE_TABLE_BUMP, the
following lines may lead to the error, and thereby the Lua stack
becomes unbalanced:

<src/lj_record.c:290> `lj_record_stop()`:
| if (J->retryrec)
|   lj_trace_err(J, LJ_TRERR_RETRY);

Build with the following flags:
| cmake -DCMAKE_C_FLAGS="-DLUAJIT_ENABLE_TABLE_BUMP" -DLUA_USE_APICHECK=ON \
|       -DLUA_USE_ASSERT=ON -DCMAKE_BUILD_TYPE=Debug \
|       -DLUAJIT_ENABLE_GC64=OFF . && make -j

The following script reproduces the issue (`maxsnap` configuration option
is gone):

| src/luajit -e "
| local modf = math.modf
| jit.opt.start('hotloop=1')
|
| local t1
| for i = 1, 2 do
|   t1 = {}
|   t1[i] = i
|   -- Forcify stitch.
|   modf(1.2)
| end
| "
| LuaJIT ASSERT src/lj_dispatch.c:502: lj_dispatch_call: unbalanced stack after hot instruction

2) Also, it may happen when the memory error is raised when we try to
reallocate the IR buffer for the next instruction in `lj_ir_nextins()`.

The reproducer for this issue is much more tricky.

My suggestions here are the following:
Resolve the special case in the scope of backporting this patch, and
report the 2 follow-ups with suggested solutions to the upstream.
As I can see, this issue should be fixed in the following way:
Add the `lj_vm_cpcall()` to the `record_stop()` call and rethrow
the error, if needed, after clenup.

Thoughts?

> error is thrown from there, the execution continues with an
> unbalanced stack.
> 
> Since the only error that is thrown from the `lj_record_stop`
> with modified stack is `LJ_TRERR_SNAPOV`, this patch adds a
> check for snapshot overflow to the `recff_stitch`.
> 
> Maxim Kokryashkin:
> * added the description and the test for the problem
> 
> Part of tarantool/tarantool#9595
> ---
> Branch: https://github.com/tarantool/luajit/tree/fckxorg/lj-pr-720-errors-before-stitch
> PR: https://github.com/tarantool/tarantool/pull/9703
> Issues:
> https://github.com/tarantool/tarantool/issues/9595
> https://github.com/LuaJIT/LuaJIT/pull/720
> 
>  src/lj_ffrecord.c                             |  4 ++++
>  .../lj-pr-720-errors-before-stitch.test.lua   | 20 +++++++++++++++++++
>  2 files changed, 24 insertions(+)
>  create mode 100644 test/tarantool-tests/lj-pr-720-errors-before-stitch.test.lua
> 
> diff --git a/src/lj_ffrecord.c b/src/lj_ffrecord.c
> index 1b2ddb72..e3ed80fb 100644
> --- a/src/lj_ffrecord.c
> +++ b/src/lj_ffrecord.c
> @@ -107,6 +107,10 @@ static void recff_stitch(jit_State *J)
>    const BCIns *pc = frame_pc(base-1);
>    TValue *pframe = frame_prevl(base-1);
> 
> +  /* Check for this now. Throwing in lj_record_stop messes up the stack. */
> +  if (J->cur.nsnap >= (MSize)J->param[JIT_P_maxsnap])
> +    lj_trace_err(J, LJ_TRERR_SNAPOV);
> +
>    /* Move func + args up in Lua stack and insert continuation. */
>    memmove(&base[1], &base[-1-LJ_FR2], sizeof(TValue)*nslot);
>    setframe_ftsz(nframe, ((char *)nframe - (char *)pframe) + FRAME_CONT);
> diff --git a/test/tarantool-tests/lj-pr-720-errors-before-stitch.test.lua b/test/tarantool-tests/lj-pr-720-errors-before-stitch.test.lua
> new file mode 100644
> index 00000000..53b7b5ea
> --- /dev/null
> +++ b/test/tarantool-tests/lj-pr-720-errors-before-stitch.test.lua

I suggest to ommit the "pr-" part of the prefix. Since PR are treated
the same way by Mike we don't need to mention it specifically. Also, as
Sergey mentioned, GitHub redirects users to the PR page, so we can avoid
specific mentioning.

Side note-reminder (top priority of mentioning at the top):
* lj-dddd stands for issues from LuaJIT/LuaJIT repository.
* gh-dddd stands for issues from tarantool/tarantool repository.
* or-dddd stands for issues from openresty/lua-resty-core or (since lack
  of them) openresty/luajit2 repository.

> @@ -0,0 +1,20 @@
> +local tap = require('tap')
> +local test = tap.test('lj-pr-720-errors-before-stitch'):skipcond({
> +  ['Test requires JIT enabled'] = not jit.status(),
> +})
> +test:plan(1)
> +
> +-- `math.modf` recording is NYI.
> +local modf = require('math').modf

There is no need to require math library since it is already loaded.

> +jit.opt.start('hotloop=1', 'maxsnap=1')
> +
> +-- The loop has only two iterations: the first to detect its
> +-- hotness and the second to record it. The snapshot limit is
> +-- set to one and is certainly reached.
> +for _ = 1, 2 do
> +  -- Forcify stitch.
> +  modf(1.2)
> +end
> +
> +test:ok(true, 'stack is balanced')
> +test:done(true)
> --
> 2.43.0
> 

-- 
Best regards,
Sergey Kaplun

  parent reply	other threads:[~2024-02-21 12:38 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-16 11:11 Maxim Kokryashkin via Tarantool-patches
2024-02-20  8:08 ` Sergey Bronnikov via Tarantool-patches
2024-03-11 15:58   ` Sergey Bronnikov via Tarantool-patches
2024-02-21 12:34 ` Sergey Kaplun via Tarantool-patches [this message]
2024-02-22 14:03   ` Sergey Kaplun via Tarantool-patches

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZdXt3oDNjOzpB_Vc@root \
    --to=tarantool-patches@dev.tarantool.org \
    --cc=max.kokryashkin@gmail.com \
    --cc=skaplun@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH luajit] Throw any errors before stack changes in trace stitching.' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox