[Tarantool-patches] [PATCH luajit 2/2] Only emit proper parent references in snapshot replay.

Sergey Bronnikov sergeyb at tarantool.org
Tue Feb 6 12:46:08 MSK 2024


Hi, Sergey

Thanks for the patch! However, the test also passes with the patch reverted.


cmake -S . -B build -DLUA_USE_ASSERT=ON -DLUA_USE_APICHECK=ON

cmake --build build

cmake --build build -t LuaJIT-tests


On 1/24/24 17:11, Sergey Kaplun wrote:
> From: Mike Pall <mike>
>
> Thanks to Peter Cawley.
>
> (cherry picked from commit 9bdfd34dccb913777be0efcc6869b6eeb5b9b43b)
>
> Assume we have a trace containing the IR instruction:
> | {sink}  tab TNEW   #32762  #0
>
> `lj_snap_replay()` assumes that 32762 (0x7ffa) (op1 of TNEW) is a
> constant reference. It is passed to `snap_replay_const()`, which looks
> up the IR constant in the 0x7ffa slot. If this slot contains the second
> part of the IR constant number 0.5029296875 (the loop step) in its raw
> form (0x3fe0180000000000), the 0x18 part is treated as the IROp
> (IR_KGC), and the JIT tries to continue with a store of an invalid GC
> object, which leads to a crash.
>
> This patch restores an operand of a sunk instruction only if its IR
> operand mode is IRMref, i.e. only if the operand is a real IR reference.
>
> Sergey Kaplun:
> * added the description and the test for the problem
>
> Part of tarantool/tarantool#9595
> ---
>   src/lj_snap.c                                 | 12 ++++---
>   .../lj-1132-bad-snap-refs.test.lua            | 36 +++++++++++++++++++
>   2 files changed, 44 insertions(+), 4 deletions(-)
>   create mode 100644 test/tarantool-tests/lj-1132-bad-snap-refs.test.lua
>
> diff --git a/src/lj_snap.c b/src/lj_snap.c
> index 3f0fccec..3eb0cd28 100644
> --- a/src/lj_snap.c
> +++ b/src/lj_snap.c
> @@ -516,13 +516,15 @@ void lj_snap_replay(jit_State *J, GCtrace *T)
>         IRRef refp = snap_ref(sn);
>         IRIns *ir = &T->ir[refp];
>         if (regsp_reg(ir->r) == RID_SUNK) {
> +	uint8_t m;
>   	if (J->slot[snap_slot(sn)] != snap_slot(sn)) continue;
>   	pass23 = 1;
>   	lj_assertJ(ir->o == IR_TNEW || ir->o == IR_TDUP ||
>   		   ir->o == IR_CNEW || ir->o == IR_CNEWI,
>   		   "sunk parent IR %04d has bad op %d", refp - REF_BIAS, ir->o);
> -	if (ir->op1 >= T->nk) snap_pref(J, T, map, nent, seen, ir->op1);
> -	if (ir->op2 >= T->nk) snap_pref(J, T, map, nent, seen, ir->op2);
> +	m = lj_ir_mode[ir->o];
> +	if (irm_op1(m) == IRMref) snap_pref(J, T, map, nent, seen, ir->op1);
> +	if (irm_op2(m) == IRMref) snap_pref(J, T, map, nent, seen, ir->op2);
>   	if (LJ_HASFFI && ir->o == IR_CNEWI) {
>   	  if (LJ_32 && refp+1 < T->nins && (ir+1)->o == IR_HIOP)
>   	    snap_pref(J, T, map, nent, seen, (ir+1)->op2);
> @@ -550,14 +552,16 @@ void lj_snap_replay(jit_State *J, GCtrace *T)
>         IRIns *ir = &T->ir[refp];
>         if (regsp_reg(ir->r) == RID_SUNK) {
>   	TRef op1, op2;
> +	uint8_t m;
>   	if (J->slot[snap_slot(sn)] != snap_slot(sn)) {  /* De-dup allocs. */
>   	  J->slot[snap_slot(sn)] = J->slot[J->slot[snap_slot(sn)]];
>   	  continue;
>   	}
>   	op1 = ir->op1;
> -	if (op1 >= T->nk) op1 = snap_pref(J, T, map, nent, seen, op1);
> +	m = lj_ir_mode[ir->o];
> +	if (irm_op1(m) == IRMref) op1 = snap_pref(J, T, map, nent, seen, op1);
>   	op2 = ir->op2;
> -	if (op2 >= T->nk) op2 = snap_pref(J, T, map, nent, seen, op2);
> +	if (irm_op2(m) == IRMref) op2 = snap_pref(J, T, map, nent, seen, op2);
>   	if (LJ_HASFFI && ir->o == IR_CNEWI) {
>   	  if (LJ_32 && refp+1 < T->nins && (ir+1)->o == IR_HIOP) {
>   	    lj_needsplit(J);  /* Emit joining HIOP. */
> diff --git a/test/tarantool-tests/lj-1132-bad-snap-refs.test.lua b/test/tarantool-tests/lj-1132-bad-snap-refs.test.lua
> new file mode 100644
> index 00000000..1f2b5400
> --- /dev/null
> +++ b/test/tarantool-tests/lj-1132-bad-snap-refs.test.lua
> @@ -0,0 +1,36 @@
> +local tap = require('tap')
> +
> +-- Test file to demonstrate LuaJIT's crash in cases of sunk
> +-- restore for huge tables.
> +-- See also https://github.com/LuaJIT/LuaJIT/issues/1132.
> +
> +local test = tap.test('lj-1132-bad-snap-refs'):skipcond({
> +  ['Test requires JIT enabled'] = not jit.status(),
> +})
> +
> +test:plan(1)
> +
> +local table_new = require('table.new')
> +
> +jit.opt.start('hotloop=1', 'hotexit=1')
> +
> +local result_tab
> +-- Create a trace containing the IR instruction:
> +-- | {sink}  tab TNEW   #32762  #0
> +-- `lj_snap_replay()` assumes that 32762 (0x7ffa) (op1 of TNEW) is
> +-- a constant reference. It is passed to the `snap_replay_const()`
> +-- lookup to the IR constant in the 0x7ffa slot.
> +-- This slot contains the second part of the IR constant
> +-- number 0.5029296875 (step of the cycle) in its raw form
> +-- (0x3fe0180000000000). The 0x18 part is treated as IROp
> +-- (IR_KGC), and JIT is trying to continue with a store of an
> +-- invalid GC object, which leads to a crash.
> +for i = 1, 2.5, 0.5029296875 do
> +  local sunk_tab = table_new(0x7ff9, 0)
> +  -- Force the side exit with restoration of the sunk table.
> +  if i > 2 then result_tab = sunk_tab end
> +end
> +
> +test:ok(type(result_tab) == 'table', 'no crash during sunk restore')
> +
> +test:done(true)