[Tarantool-patches] [PATCH luajit 1/2] Fix handling of errors during snapshot restore.

Sergey Kaplun skaplun at tarantool.org
Mon Aug 1 13:04:57 MSK 2022


Hi, Sergos!

Thanks for the review!

On 01.08.22, sergos wrote:
> Hi!
> 
> Thanks for the patch!
> With some comments fixes LGTM.
> 
> Sergos
> 
> 
> > On 31 Jul 2022, at 13:58, Sergey Kaplun <skaplun at tarantool.org> wrote:
> > 
> > From: Mike Pall <mike>
> > 
> > (cherry picked from commit 12ab596997b9cb27846a5b254d11230c3f9c50c8)
> > 
> > When an error is raised during snapshot restore, `err_unwind()` skipped the
>                                 the               the
> > correct cframe to stop unwinding. It happens due this frame is C frame
>                 that should stop   The reason is                a     
> > without Lua frame and the special negative value of `cfram_nres()` for
>         ^^^^^^^                                            ^e
>        not sure if I got it right - wrapping frame? Anyways an article is missing
> 
> > this frame isn't set.
> > 
> > This patch sets `cframe_nres()` for cframe with snap restoration to
>                 the                 a       that contains a
> > `-2*LUAI_MAXSTACK` to guarantee that an error will be always caught
> > here.
> Not quite clear why this should be done always, since you mentioned before the
> Lua frame presence mitigates the problem. Does it mean the cframe_nres() is
> ignored if Lua frame is present? Mention it if it is true.

Sorry, my bad wording. I've meant that this is just pseudo frame without
true honestly frame set up like it is done for `lua_cpcall()`. I've
reformulated the commit message as the following, considerring your
comments:


| Fix handling of errors during snapshot restore.
|
| (cherry picked from commit 12ab596997b9cb27846a5b254d11230c3f9c50c8)
|
| When an error is raised during a snapshot restore, the `err_unwind()`
| skips the correct cframe that should stop unwinding. The reason is this
| frame is the pseudo C frame for error handling without an actual frame
| and the special negative value of `cframe_nres()` for this frame isn't
| set.
|
| This patch sets `cframe_nres()` for a cframe that contains a snap
| restoration to `-2*LUAI_MAXSTACK` to guarantee that an error will be
| always caught here.
|
| Sergey Kaplun:
| * added the description and the test for the problem
|
| Part of tarantool/tarantool#7230

> 
> > 
> > Sergey Kaplun:
> > * added the description and the test for the problem
> > 
> > Part of tarantool/tarantool#7230
> > ---
> > src/lj_trace.c                                |  2 ++
> > .../lj-603-err-snap-restore.test.lua          | 30 +++++++++++++++++++
> > 2 files changed, 32 insertions(+)
> > create mode 100644 test/tarantool-tests/lj-603-err-snap-restore.test.lua
> > 
> > diff --git a/src/lj_trace.c b/src/lj_trace.c
> > index d7a78d4d..68a657a7 100644
> > --- a/src/lj_trace.c
> > +++ b/src/lj_trace.c
> > @@ -803,6 +803,8 @@ static TValue *trace_exit_cp(lua_State *L, lua_CFunction dummy, void *ud)
> > {
> >   ExitDataCP *exd = (ExitDataCP *)ud;
> >   cframe_errfunc(L->cframe) = -1;  /* Inherit error function. */
> > +  /* Always catch error here. */
> > +  cframe_nres(L->cframe) = -2*LUAI_MAXSTACK*(int)sizeof(TValue);
> >   exd->pc = lj_snap_restore(exd->J, exd->exptr);
> >   UNUSED(dummy);
> >   return NULL;
> > diff --git a/test/tarantool-tests/lj-603-err-snap-restore.test.lua b/test/tarantool-tests/lj-603-err-snap-restore.test.lua
> > new file mode 100644
> > index 00000000..82ce6a8f
> > --- /dev/null
> > +++ b/test/tarantool-tests/lj-603-err-snap-restore.test.lua
> > @@ -0,0 +1,30 @@
> > +local tap = require('tap')
> > +
> > +-- Test file to demonstrate the incorrect JIT behaviour when an
>            ??? 
> 
> > +-- error is raised on restoration from the snapshot.
> > +-- See also https://github.com/LuaJIT/LuaJIT/issues/603.

<snipped>

> > +
> > +test:ok(true)
> > +
> > +-- XXX: Don't use `os.exit()` here intense. When error on snap
>                                       ^^^ explicitly? | by intention?
>                                             
> > +-- restoration is raised, `err_unwind()` doesn't stop on correct
> > +-- cframe. So later, on exit from VM this corrupted cframe chain
> > +-- shows itself. `os.exit()` literally calls `exit()` and doesn't
> > +-- show the issue.

Fixed. See the iterative patch below.

===================================================================
diff --git a/test/tarantool-tests/lj-603-err-snap-restore.test.lua b/test/tarantool-tests/lj-603-err-snap-restore.test.lua
index 82ce6a8f..d20c4d6a 100644
--- a/test/tarantool-tests/lj-603-err-snap-restore.test.lua
+++ b/test/tarantool-tests/lj-603-err-snap-restore.test.lua
@@ -1,7 +1,7 @@
 local tap = require('tap')
 
--- Test file to demonstrate the incorrect JIT behaviour when an
--- error is raised on restoration from the snapshot.
+-- Test to demonstrate the incorrect JIT behaviour when an error
+-- is raised on restoration from the snapshot.
 -- See also https://github.com/LuaJIT/LuaJIT/issues/603.
 local test = tap.test('lj-603-err-snap-restore.test.lua')
 test:plan(1)
@@ -23,8 +23,8 @@ recursive_f()
 
 test:ok(true)
 
--- XXX: Don't use `os.exit()` here intense. When error on snap
--- restoration is raised, `err_unwind()` doesn't stop on correct
--- cframe. So later, on exit from VM this corrupted cframe chain
--- shows itself. `os.exit()` literally calls `exit()` and doesn't
--- show the issue.
+-- XXX: Don't use `os.exit()` here by intention. When error on
+-- snap restoration is raised, `err_unwind()` doesn't stop on
+-- correct cframe. So later, on exit from VM this corrupted cframe
+-- chain shows itself. `os.exit()` literally calls `exit()` and
+-- doesn't show the issue.
===================================================================

> > -- 
> > 2.34.1
> > 
> 

-- 
Best regards,
Sergey Kaplun


More information about the Tarantool-patches mailing list