[Tarantool-patches] [PATCH luajit 2/2] Only emit proper parent references in snapshot replay.
Maxim Kokryashkin
m.kokryashkin at tarantool.org
Wed Jan 31 12:29:03 MSK 2024
Hi, Sergey!
Thanks for the patch!
LGTM
On Wed, Jan 24, 2024 at 05:11:09PM +0300, Sergey Kaplun wrote:
> From: Mike Pall <mike>
>
> Thanks to Peter Cawley.
>
> (cherry picked from commit 9bdfd34dccb913777be0efcc6869b6eeb5b9b43b)
>
> Assume we have a trace containing the IR instruction:
> | {sink} tab TNEW #32762 #0
>
> `lj_snap_replay()` assumes that 32762 (0x7ffa) (op1 of TNEW) is a
> constant reference. It is passed to the `snap_replay_const()` lookup to
> the IR constant in the 0x7ffa slot. If this slot contains the second
> part of the IR constant number 0.5029296875 (step of the cycle) in its
> raw form (0x3fe0180000000000). The 0x18 part is treated as IROp
> (IR_KGC), and JIT is trying to continue with a store of an invalid GC
> object, which leads to a crash.
>
Oh, god...
> This patch checks that only the IRMref IR operand is needed to restore.
>
> Sergey Kaplun:
> * added the description and the test for the problem
>
> Part of tarantool/tarantool#9595
> ---
> src/lj_snap.c | 12 ++++---
> .../lj-1132-bad-snap-refs.test.lua | 36 +++++++++++++++++++
> 2 files changed, 44 insertions(+), 4 deletions(-)
> create mode 100644 test/tarantool-tests/lj-1132-bad-snap-refs.test.lua
>
> diff --git a/src/lj_snap.c b/src/lj_snap.c
> index 3f0fccec..3eb0cd28 100644
> --- a/src/lj_snap.c
> +++ b/src/lj_snap.c
> @@ -516,13 +516,15 @@ void lj_snap_replay(jit_State *J, GCtrace *T)
> IRRef refp = snap_ref(sn);
> IRIns *ir = &T->ir[refp];
> if (regsp_reg(ir->r) == RID_SUNK) {
> + uint8_t m;
> if (J->slot[snap_slot(sn)] != snap_slot(sn)) continue;
> pass23 = 1;
> lj_assertJ(ir->o == IR_TNEW || ir->o == IR_TDUP ||
> ir->o == IR_CNEW || ir->o == IR_CNEWI,
> "sunk parent IR %04d has bad op %d", refp - REF_BIAS, ir->o);
> - if (ir->op1 >= T->nk) snap_pref(J, T, map, nent, seen, ir->op1);
> - if (ir->op2 >= T->nk) snap_pref(J, T, map, nent, seen, ir->op2);
> + m = lj_ir_mode[ir->o];
> + if (irm_op1(m) == IRMref) snap_pref(J, T, map, nent, seen, ir->op1);
> + if (irm_op2(m) == IRMref) snap_pref(J, T, map, nent, seen, ir->op2);
> if (LJ_HASFFI && ir->o == IR_CNEWI) {
> if (LJ_32 && refp+1 < T->nins && (ir+1)->o == IR_HIOP)
> snap_pref(J, T, map, nent, seen, (ir+1)->op2);
> @@ -550,14 +552,16 @@ void lj_snap_replay(jit_State *J, GCtrace *T)
> IRIns *ir = &T->ir[refp];
> if (regsp_reg(ir->r) == RID_SUNK) {
> TRef op1, op2;
> + uint8_t m;
> if (J->slot[snap_slot(sn)] != snap_slot(sn)) { /* De-dup allocs. */
> J->slot[snap_slot(sn)] = J->slot[J->slot[snap_slot(sn)]];
> continue;
> }
> op1 = ir->op1;
> - if (op1 >= T->nk) op1 = snap_pref(J, T, map, nent, seen, op1);
> + m = lj_ir_mode[ir->o];
> + if (irm_op1(m) == IRMref) op1 = snap_pref(J, T, map, nent, seen, op1);
> op2 = ir->op2;
> - if (op2 >= T->nk) op2 = snap_pref(J, T, map, nent, seen, op2);
> + if (irm_op2(m) == IRMref) op2 = snap_pref(J, T, map, nent, seen, op2);
> if (LJ_HASFFI && ir->o == IR_CNEWI) {
> if (LJ_32 && refp+1 < T->nins && (ir+1)->o == IR_HIOP) {
> lj_needsplit(J); /* Emit joining HIOP. */
> diff --git a/test/tarantool-tests/lj-1132-bad-snap-refs.test.lua b/test/tarantool-tests/lj-1132-bad-snap-refs.test.lua
> new file mode 100644
> index 00000000..1f2b5400
> --- /dev/null
> +++ b/test/tarantool-tests/lj-1132-bad-snap-refs.test.lua
> @@ -0,0 +1,36 @@
> +local tap = require('tap')
> +
> +-- Test file to demonstrate LuaJIT's crash in cases of sunk
> +-- restore for huge tables.
> +-- See also https://github.com/LuaJIT/LuaJIT/issues/1132.
> +
> +local test = tap.test('lj-1132-bad-snap-refs'):skipcond({
> + ['Test requires JIT enabled'] = not jit.status(),
> +})
> +
> +test:plan(1)
> +
> +local table_new = require('table.new')
> +
> +jit.opt.start('hotloop=1', 'hotexit=1')
> +
> +local result_tab
> +-- Create a trace containing the IR instruction:
> +-- | {sink} tab TNEW #32762 #0
> +-- `lj_snap_replay()` assumes that 32762 (0x7ffa) (op1 of TNEW) is
> +-- a constant reference. It is passed to the `snap_replay_const()`
> +-- lookup to the IR constant in the 0x7ffa slot.
> +-- This slot contains the second part of the IR constant
> +-- number 0.5029296875 (step of the cycle) in its raw form
> +-- (0x3fe0180000000000). The 0x18 part is treated as IROp
> +-- (IR_KGC), and JIT is trying to continue with a store of an
> +-- invalid GC object, which leads to a crash.
> +for i = 1, 2.5, 0.5029296875 do
> + local sunk_tab = table_new(0x7ff9, 0)
> + -- Force the side exit with restoration of the sunk table.
> + if i > 2 then result_tab = sunk_tab end
> +end
> +
> +test:ok(type(result_tab) == 'table', 'no crash during sunk restore')
> +
> +test:done(true)
> --
> 2.43.0
>
More information about the Tarantool-patches
mailing list