* [Tarantool-patches] [PATCH luajit 0/2] Fix assembling of the head of side traces
@ 2023-10-11 15:04 Sergey Kaplun via Tarantool-patches
2023-10-11 15:04 ` [Tarantool-patches] [PATCH luajit 1/2] Fix register mask for stack check in head of side trace Sergey Kaplun via Tarantool-patches
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-10-11 15:04 UTC (permalink / raw)
To: Maxim Kokryashkin, Sergey Bronnikov; +Cc: tarantool-patches
Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-1016-1031-asm-head-side
Tarantool PR: https://github.com/tarantool/tarantool/pull/9252
Related Issues:
* https://github.com/LuaJIT/LuaJIT/issues/1016
* https://github.com/LuaJIT/LuaJIT/issues/1031
* https://github.com/tarantool/tarantool/issues/8767
* https://github.com/tarantool/tarantool/issues/9145
The reproducer in 8767 doesn't fail anymore after the second patch.
Also, the box/bitset.test.lua doesn't fail either.
Mike Pall (2):
Fix register mask for stack check in head of side trace.
Fix base register coalescing in side trace.
src/lj_asm.c | 11 +-
src/lj_asm_arm.h | 7 +-
src/lj_asm_arm64.h | 7 +-
src/lj_asm_mips.h | 8 +-
src/lj_asm_ppc.h | 8 +-
src/lj_asm_x86.h | 8 +-
.../lj-1031-asm-head-side-base-reg.test.lua | 139 ++++++++++++++++++
7 files changed, 166 insertions(+), 22 deletions(-)
create mode 100644 test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua
--
2.42.0
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Tarantool-patches] [PATCH luajit 1/2] Fix register mask for stack check in head of side trace.
2023-10-11 15:04 [Tarantool-patches] [PATCH luajit 0/2] Fix assembling of the head of side traces Sergey Kaplun via Tarantool-patches
@ 2023-10-11 15:04 ` Sergey Kaplun via Tarantool-patches
2023-10-13 5:48 ` Maxim Kokryashkin via Tarantool-patches
2023-11-07 18:42 ` Igor Munkin via Tarantool-patches
2023-10-11 15:04 ` [Tarantool-patches] [PATCH luajit 2/2] Fix base register coalescing in " Sergey Kaplun via Tarantool-patches
2023-11-23 6:30 ` [Tarantool-patches] [PATCH luajit 0/2] Fix assembling of the head of side traces Igor Munkin via Tarantool-patches
2 siblings, 2 replies; 12+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-10-11 15:04 UTC (permalink / raw)
To: Maxim Kokryashkin, Sergey Bronnikov; +Cc: tarantool-patches
From: Mike Pall <mike>
Analyzed by Sergey Kaplun.
(cherry-picked from commit b7a8c7c184257858699454408420dd5f0b6c8a75)
Assume we have parent and child traces with the following IRs from the
IR dump:
Parent:
| 0009 rax > tab TNEW 0 0
| 0010 p32 FLOAD 0008 tab.node
| 0011 p32 HREFK 0010 "Name" @1
| 0012 {0008} tab HSTORE 0011 0009
| .... SNAP 2 [ ---- 0001 0002 0008 ---- ]
| 0013 {sink} tab TNEW 0 0
| 0014 {0008} fal HSTORE 0011 false
| .... SNAP 3 [ ---- 0001 0002 0008 ---- ]
Child:
| 0001 r15 tab SLOAD 1 PI
| 0002 rbp tab SLOAD 2 PI
| 0003 tab PVAL 9
As we can see from the trace dump above, the `rax` register is missing
in the `0003 PVAL` IR for the side trace -- so it is assumed to be
available in the allow RegSet inside `asm_stack_check()` and its value
is spoiled during this check, so if we are restoring from the 3rd
snapshot by stack overflow -- we are in trouble.
The moment when IR is spoiled is when we set a hint on the register
inherited from the parent trace (see `asm_setup_regsp()` for details).
The 0th register (`rax`) shapeshifts into `RID_NONE`. Hence, when
collecting register dependencies from the parent trace, `0003 PVAL` is
considered the IR with `RID_NONE`, i.e., without an assigned register.
So, this register is considered free (picked as bottom from the free
set) in the `asm_stack_check()` and is used for stack overflow check, so
the table reference is gone.
This patch introduces another register set for the context of the parent
trace to use in the stack check. All registers used on the child trace
are excluded from this set.
The test case for this patch is omitted since it requires specific
register allocation, which is hard to construct and not stable in any
future patches.
Part of tarantool/tarantool#9145
Sergey Kaplun:
* added the description for the problem
---
src/lj_asm.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/src/lj_asm.c b/src/lj_asm.c
index 3a1909d5..ca06860a 100644
--- a/src/lj_asm.c
+++ b/src/lj_asm.c
@@ -1859,6 +1859,7 @@ static void asm_head_side(ASMState *as)
IRRef1 sloadins[RID_MAX];
RegSet allow = RSET_ALL; /* Inverse of all coalesced registers. */
RegSet live = RSET_EMPTY; /* Live parent registers. */
+ RegSet pallow = RSET_GPR; /* Registers needed by the parent stack check. */
IRIns *irp = &as->parent->ir[REF_BASE]; /* Parent base. */
int32_t spadj, spdelta;
int pass2 = 0;
@@ -1899,6 +1900,7 @@ static void asm_head_side(ASMState *as)
sloadins[rs] = (IRRef1)i;
rset_set(live, rs); /* Block live parent register. */
}
+ if (!ra_hasspill(regsp_spill(rs))) rset_clear(pallow, regsp_reg(rs));
}
/* Calculate stack frame adjustment. */
@@ -2015,7 +2017,7 @@ static void asm_head_side(ASMState *as)
ExitNo exitno = as->J->exitno;
#endif
as->T->topslot = (uint8_t)as->topslot; /* Remember for child traces. */
- asm_stack_check(as, as->topslot, irp, allow & RSET_GPR, exitno);
+ asm_stack_check(as, as->topslot, irp, pallow, exitno);
}
}
--
2.42.0
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Tarantool-patches] [PATCH luajit 2/2] Fix base register coalescing in side trace.
2023-10-11 15:04 [Tarantool-patches] [PATCH luajit 0/2] Fix assembling of the head of side traces Sergey Kaplun via Tarantool-patches
2023-10-11 15:04 ` [Tarantool-patches] [PATCH luajit 1/2] Fix register mask for stack check in head of side trace Sergey Kaplun via Tarantool-patches
@ 2023-10-11 15:04 ` Sergey Kaplun via Tarantool-patches
2023-10-13 5:59 ` Maxim Kokryashkin via Tarantool-patches
2023-11-07 18:42 ` Igor Munkin via Tarantool-patches
2023-11-23 6:30 ` [Tarantool-patches] [PATCH luajit 0/2] Fix assembling of the head of side traces Igor Munkin via Tarantool-patches
2 siblings, 2 replies; 12+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-10-11 15:04 UTC (permalink / raw)
To: Maxim Kokryashkin, Sergey Bronnikov; +Cc: tarantool-patches
From: Mike Pall <mike>
Thanks to Sergey Kaplun, NiLuJe and Peter Cawley.
(cherry-picked from commit aa2db7ebd1267836af5221336ccb4e9b4aa8372d)
The previous patch fixed just part of the problem with the register
coalesing. For example, the parent base register may be used inside the
parent or child register sets when it shouldn't. This leads to incorrect
register allocations, which may lead to crashes or undefined behaviour.
This patch fixes it by excluding the parent base register from both
register sets.
The test case for this patch doesn't fail before the commit since it
requires specific register allocation, which is hard to construct and
very fragile. Due to the lack of ideal sync with the upstream
repository, the test is passed before the patch.
It should become correct in future patches.
Resolves tarantool/tarantool#8767
Part of tarantool/tarantool#9145
Sergey Kaplun:
* added the description and the test for the problem
---
src/lj_asm.c | 7 +-
src/lj_asm_arm.h | 7 +-
src/lj_asm_arm64.h | 7 +-
src/lj_asm_mips.h | 8 +-
src/lj_asm_ppc.h | 8 +-
src/lj_asm_x86.h | 8 +-
.../lj-1031-asm-head-side-base-reg.test.lua | 139 ++++++++++++++++++
7 files changed, 163 insertions(+), 21 deletions(-)
create mode 100644 test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua
diff --git a/src/lj_asm.c b/src/lj_asm.c
index ca06860a..5137d05c 100644
--- a/src/lj_asm.c
+++ b/src/lj_asm.c
@@ -1860,6 +1860,7 @@ static void asm_head_side(ASMState *as)
RegSet allow = RSET_ALL; /* Inverse of all coalesced registers. */
RegSet live = RSET_EMPTY; /* Live parent registers. */
RegSet pallow = RSET_GPR; /* Registers needed by the parent stack check. */
+ Reg pbase;
IRIns *irp = &as->parent->ir[REF_BASE]; /* Parent base. */
int32_t spadj, spdelta;
int pass2 = 0;
@@ -1870,7 +1871,11 @@ static void asm_head_side(ASMState *as)
/* Force snap #0 alloc to prevent register overwrite in stack check. */
asm_snap_alloc(as, 0);
}
- allow = asm_head_side_base(as, irp, allow);
+ pbase = asm_head_side_base(as, irp);
+ if (pbase != RID_NONE) {
+ rset_clear(allow, pbase);
+ rset_clear(pallow, pbase);
+ }
/* Scan all parent SLOADs and collect register dependencies. */
for (i = as->stopins; i > REF_BASE; i--) {
diff --git a/src/lj_asm_arm.h b/src/lj_asm_arm.h
index 47564d2e..5a0f925f 100644
--- a/src/lj_asm_arm.h
+++ b/src/lj_asm_arm.h
@@ -2107,7 +2107,7 @@ static void asm_head_root_base(ASMState *as)
}
/* Coalesce BASE register for a side trace. */
-static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
+static Reg asm_head_side_base(ASMState *as, IRIns *irp)
{
IRIns *ir;
asm_head_lreg(as);
@@ -2115,16 +2115,15 @@ static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
if (ra_hasreg(ir->r) && (rset_test(as->modset, ir->r) || irt_ismarked(ir->t)))
ra_spill(as, ir);
if (ra_hasspill(irp->s)) {
- rset_clear(allow, ra_dest(as, ir, allow));
+ return ra_dest(as, ir, RSET_GPR);
} else {
Reg r = irp->r;
lj_assertA(ra_hasreg(r), "base reg lost");
- rset_clear(allow, r);
if (r != ir->r && !rset_test(as->freeset, r))
ra_restore(as, regcost_ref(as->cost[r]));
ra_destreg(as, ir, r);
+ return r;
}
- return allow;
}
/* -- Tail of trace ------------------------------------------------------- */
diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
index d1d4237b..88b47ceb 100644
--- a/src/lj_asm_arm64.h
+++ b/src/lj_asm_arm64.h
@@ -1873,7 +1873,7 @@ static void asm_head_root_base(ASMState *as)
}
/* Coalesce BASE register for a side trace. */
-static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
+static Reg asm_head_side_base(ASMState *as, IRIns *irp)
{
IRIns *ir;
asm_head_lreg(as);
@@ -1881,16 +1881,15 @@ static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
if (ra_hasreg(ir->r) && (rset_test(as->modset, ir->r) || irt_ismarked(ir->t)))
ra_spill(as, ir);
if (ra_hasspill(irp->s)) {
- rset_clear(allow, ra_dest(as, ir, allow));
+ return ra_dest(as, ir, RSET_GPR);
} else {
Reg r = irp->r;
lj_assertA(ra_hasreg(r), "base reg lost");
- rset_clear(allow, r);
if (r != ir->r && !rset_test(as->freeset, r))
ra_restore(as, regcost_ref(as->cost[r]));
ra_destreg(as, ir, r);
+ return r;
}
- return allow;
}
/* -- Tail of trace ------------------------------------------------------- */
diff --git a/src/lj_asm_mips.h b/src/lj_asm_mips.h
index ac9090f2..597c6d62 100644
--- a/src/lj_asm_mips.h
+++ b/src/lj_asm_mips.h
@@ -2595,7 +2595,7 @@ static void asm_head_root_base(ASMState *as)
}
/* Coalesce BASE register for a side trace. */
-static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
+static Reg asm_head_side_base(ASMState *as, IRIns *irp)
{
IRIns *ir = IR(REF_BASE);
Reg r = ir->r;
@@ -2605,15 +2605,15 @@ static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
if (rset_test(as->modset, r) || irt_ismarked(ir->t))
ir->r = RID_INIT; /* No inheritance for modified BASE register. */
if (irp->r == r) {
- rset_clear(allow, r); /* Mark same BASE register as coalesced. */
+ return r; /* Same BASE register already coalesced. */
} else if (ra_hasreg(irp->r) && rset_test(as->freeset, irp->r)) {
- rset_clear(allow, irp->r);
emit_move(as, r, irp->r); /* Move from coalesced parent reg. */
+ return irp->r;
} else {
emit_getgl(as, r, jit_base); /* Otherwise reload BASE. */
}
}
- return allow;
+ return RID_NONE;
}
/* -- Tail of trace ------------------------------------------------------- */
diff --git a/src/lj_asm_ppc.h b/src/lj_asm_ppc.h
index 971dcc88..7bba71b3 100644
--- a/src/lj_asm_ppc.h
+++ b/src/lj_asm_ppc.h
@@ -2130,7 +2130,7 @@ static void asm_head_root_base(ASMState *as)
}
/* Coalesce BASE register for a side trace. */
-static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
+static Reg asm_head_side_base(ASMState *as, IRIns *irp)
{
IRIns *ir = IR(REF_BASE);
Reg r = ir->r;
@@ -2139,15 +2139,15 @@ static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
if (rset_test(as->modset, r) || irt_ismarked(ir->t))
ir->r = RID_INIT; /* No inheritance for modified BASE register. */
if (irp->r == r) {
- rset_clear(allow, r); /* Mark same BASE register as coalesced. */
+ return r; /* Same BASE register already coalesced. */
} else if (ra_hasreg(irp->r) && rset_test(as->freeset, irp->r)) {
- rset_clear(allow, irp->r);
emit_mr(as, r, irp->r); /* Move from coalesced parent reg. */
+ return irp->r;
} else {
emit_getgl(as, r, jit_base); /* Otherwise reload BASE. */
}
}
- return allow;
+ return RID_NONE;
}
/* -- Tail of trace ------------------------------------------------------- */
diff --git a/src/lj_asm_x86.h b/src/lj_asm_x86.h
index 2b810c8d..86ce3937 100644
--- a/src/lj_asm_x86.h
+++ b/src/lj_asm_x86.h
@@ -2856,7 +2856,7 @@ static void asm_head_root_base(ASMState *as)
}
/* Coalesce or reload BASE register for a side trace. */
-static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
+static Reg asm_head_side_base(ASMState *as, IRIns *irp)
{
IRIns *ir = IR(REF_BASE);
Reg r = ir->r;
@@ -2865,16 +2865,16 @@ static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
if (rset_test(as->modset, r) || irt_ismarked(ir->t))
ir->r = RID_INIT; /* No inheritance for modified BASE register. */
if (irp->r == r) {
- rset_clear(allow, r); /* Mark same BASE register as coalesced. */
+ return r; /* Same BASE register already coalesced. */
} else if (ra_hasreg(irp->r) && rset_test(as->freeset, irp->r)) {
/* Move from coalesced parent reg. */
- rset_clear(allow, irp->r);
emit_rr(as, XO_MOV, r|REX_GC64, irp->r);
+ return irp->r;
} else {
emit_getgl(as, r, jit_base); /* Otherwise reload BASE. */
}
}
- return allow;
+ return RID_NONE;
}
/* -- Tail of trace ------------------------------------------------------- */
diff --git a/test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua b/test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua
new file mode 100644
index 00000000..fc5efaaa
--- /dev/null
+++ b/test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua
@@ -0,0 +1,139 @@
+local tap = require('tap')
+-- Test file to demonstrate incorrect side trace head assembling
+-- when use parent trace register holding base (`RID_BASE`).
+-- See also, https://github.com/LuaJIT/LuaJIT/issues/1031.
+--
+-- XXX: For now, the test doesn't fail even on arm64 due to the
+-- different from upstream register allocation. Nevertheless, this
+-- issue should be gone with future backporting, so just leave the
+-- test case as is.
+local test = tap.test('lj-1031-asm-head-side-base-reg'):skipcond({
+ ['Test requires JIT enabled'] = not jit.status(),
+})
+
+local ffi = require 'ffi'
+local int64_t = ffi.typeof('int64_t')
+
+test:plan(1)
+
+-- To reproduce the issue (reproduced only on arm64) we need a
+-- little bit of a tricky structure for the traces:
+-- +-> + -- start TRACE 1
+-- | | ...
+-- | |---> + -- start TRACE 3 (side trace for the 1st)
+-- | | | ...
+-- | | v (link with TRACE 2)
+-- | | +-> + -- start TRACE 2 (some loop that wasn't recorded
+-- | | | | before due to the loop limit)
+-- | | +---+
+-- +<--+
+-- Also, we need high register pressure to be sure that register
+-- holding value of the base will be spilled.
+-- So, during the assembly of "TRACE 3" (#6 in our test),
+-- `RID_BASE` is spilled and restored via 32-bit fpr. This leads
+-- to an incorrect value because it contains a 64-bit value.
+-- See the original ticket for the details.
+
+-- XXX: Reduce amount of IR.
+local tonumber = tonumber
+local ffi_new = ffi.new
+
+-- Resulting slots for values calculated on trace.
+-- luacheck: ignore
+local r1
+local r2
+local r3
+local r4
+local r5
+local r6
+local r7
+local r8
+local r9
+local r10
+local r11
+local r12
+local r13
+local r14
+local r15
+local r16
+local r17
+local r18
+local r19
+
+local ARR_SIZE = 1e2
+local lim_arr = {}
+
+-- XXX: Prevent irrelevant output in jit.dump().
+jit.off()
+
+local INNER_TRACE_LIMIT = 20
+local INNER_LIMIT1 = 1
+local INNER_LIMIT2 = 4
+for i = 1, INNER_TRACE_LIMIT do lim_arr[i] = INNER_LIMIT1 end
+for i = INNER_TRACE_LIMIT + 1, ARR_SIZE + 1 do lim_arr[i] = INNER_LIMIT2 end
+
+-- Enable compilation back.
+jit.on()
+
+jit.opt.start('hotloop=1', 'hotexit=2')
+
+-- XXX: Trace numbers are given with the respect of using
+-- `jit.dump`.
+-- Start TRACE 2 (1 is inside dump bc).
+local k = 0
+while k < ARR_SIZE do
+ k = k + 1
+ -- Forcify register pressure.
+ local l1 = ffi_new(int64_t, k + 1)
+ local l2 = ffi_new(int64_t, k + 2)
+ local l3 = ffi_new(int64_t, k + 3)
+ local l4 = ffi_new(int64_t, k + 4)
+ local l5 = ffi_new(int64_t, k + 5)
+ local l6 = ffi_new(int64_t, k + 6)
+ local l7 = ffi_new(int64_t, k + 7)
+ local l8 = ffi_new(int64_t, k + 8)
+ local l9 = ffi_new(int64_t, k + 9)
+ local l10 = ffi_new(int64_t, k + 10)
+ local l11 = ffi_new(int64_t, k + 11)
+ local l12 = ffi_new(int64_t, k + 12)
+ local l13 = ffi_new(int64_t, k + 13)
+ local l14 = ffi_new(int64_t, k + 14)
+ local l15 = ffi_new(int64_t, k + 15)
+ local l16 = ffi_new(int64_t, k + 16)
+ local l17 = ffi_new(int64_t, k + 17)
+ local l18 = ffi_new(int64_t, k + 18)
+ local l19 = ffi_new(int64_t, k + 19)
+ -- Side exit for TRACE 6 start 2/1.
+ -- luacheck: ignore
+ if k > 55 then else end
+ r1 = tonumber(l1)
+ r2 = tonumber(l2)
+ r3 = tonumber(l3)
+ r4 = tonumber(l4)
+ r5 = tonumber(l5)
+ r6 = tonumber(l6)
+ r7 = tonumber(l7)
+ r8 = tonumber(l8)
+ r9 = tonumber(l9)
+ r10 = tonumber(l10)
+ r11 = tonumber(l11)
+ r12 = tonumber(l12)
+ r13 = tonumber(l13)
+ r14 = tonumber(l14)
+ r15 = tonumber(l15)
+ r15 = tonumber(l15)
+ r16 = tonumber(l16)
+ r17 = tonumber(l17)
+ r18 = tonumber(l18)
+ r19 = tonumber(l19)
+ local lim_inner = lim_arr[k]
+ -- TRACE 3. Loop is not taken at the moment of recording TRACE 2
+ -- (lim_inner == 1).
+ for _ = 1, lim_inner do
+ -- TRACE 6 stop -> 3.
+ end
+end
+
+test:ok(true, 'correct side trace head assembling')
+
+test:done(true)
--
2.42.0
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 1/2] Fix register mask for stack check in head of side trace.
2023-10-11 15:04 ` [Tarantool-patches] [PATCH luajit 1/2] Fix register mask for stack check in head of side trace Sergey Kaplun via Tarantool-patches
@ 2023-10-13 5:48 ` Maxim Kokryashkin via Tarantool-patches
2023-10-23 9:27 ` Sergey Kaplun via Tarantool-patches
2023-11-07 18:42 ` Igor Munkin via Tarantool-patches
1 sibling, 1 reply; 12+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2023-10-13 5:48 UTC (permalink / raw)
To: Sergey Kaplun; +Cc: tarantool-patches
Hi, Sergey!
Thanks for the patch!
LGTM, except for a few nits below.
On Wed, Oct 11, 2023 at 06:04:09PM +0300, Sergey Kaplun wrote:
> From: Mike Pall <mike>
>
> Analyzed by Sergey Kaplun.
>
> (cherry-picked from commit b7a8c7c184257858699454408420dd5f0b6c8a75)
>
> Assume we have parent and child traces with the following IRs from the
> IR dump:
>
> Parent:
> | 0009 rax > tab TNEW 0 0
> | 0010 p32 FLOAD 0008 tab.node
> | 0011 p32 HREFK 0010 "Name" @1
> | 0012 {0008} tab HSTORE 0011 0009
> | .... SNAP 2 [ ---- 0001 0002 0008 ---- ]
> | 0013 {sink} tab TNEW 0 0
> | 0014 {0008} fal HSTORE 0011 false
> | .... SNAP 3 [ ---- 0001 0002 0008 ---- ]
>
> Child:
> | 0001 r15 tab SLOAD 1 PI
> | 0002 rbp tab SLOAD 2 PI
> | 0003 tab PVAL 9
>
> As we can see from the trace dump above, the `rax` register is missing
> in the `0003 PVAL` IR for the side trace -- so it is assumed to be
> available in the allow RegSet inside `asm_stack_check()` and its value
> is spoiled during this check, so if we are restoring from the 3rd
Typo: s/spoiled/spilled/
> snapshot by stack overflow -- we are in trouble.
>
> The moment when IR is spoiled is when we set a hint on the register
Typo: s/spoiled/spilled/
> inherited from the parent trace (see `asm_setup_regsp()` for details).
> The 0th register (`rax`) shapeshifts into `RID_NONE`. Hence, when
> collecting register dependencies from the parent trace, `0003 PVAL` is
> considered the IR with `RID_NONE`, i.e., without an assigned register.
> So, this register is considered free (picked as bottom from the free
> set) in the `asm_stack_check()` and is used for stack overflow check, so
> the table reference is gone.
>
> This patch introduces another register set for the context of the parent
> trace to use in the stack check. All registers used on the child trace
> are excluded from this set.
>
> The test case for this patch is omitted since it requires specific
> register allocation, which is hard to construct and not stable in any
> future patches.
>
> Part of tarantool/tarantool#9145
>
> Sergey Kaplun:
> * added the description for the problem
> ---
> src/lj_asm.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/src/lj_asm.c b/src/lj_asm.c
> index 3a1909d5..ca06860a 100644
> --- a/src/lj_asm.c
> +++ b/src/lj_asm.c
> @@ -1859,6 +1859,7 @@ static void asm_head_side(ASMState *as)
> IRRef1 sloadins[RID_MAX];
> RegSet allow = RSET_ALL; /* Inverse of all coalesced registers. */
> RegSet live = RSET_EMPTY; /* Live parent registers. */
> + RegSet pallow = RSET_GPR; /* Registers needed by the parent stack check. */
> IRIns *irp = &as->parent->ir[REF_BASE]; /* Parent base. */
> int32_t spadj, spdelta;
> int pass2 = 0;
> @@ -1899,6 +1900,7 @@ static void asm_head_side(ASMState *as)
> sloadins[rs] = (IRRef1)i;
> rset_set(live, rs); /* Block live parent register. */
> }
> + if (!ra_hasspill(regsp_spill(rs))) rset_clear(pallow, regsp_reg(rs));
> }
>
> /* Calculate stack frame adjustment. */
> @@ -2015,7 +2017,7 @@ static void asm_head_side(ASMState *as)
> ExitNo exitno = as->J->exitno;
> #endif
> as->T->topslot = (uint8_t)as->topslot; /* Remember for child traces. */
> - asm_stack_check(as, as->topslot, irp, allow & RSET_GPR, exitno);
> + asm_stack_check(as, as->topslot, irp, pallow, exitno);
> }
> }
>
> --
> 2.42.0
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 2/2] Fix base register coalescing in side trace.
2023-10-11 15:04 ` [Tarantool-patches] [PATCH luajit 2/2] Fix base register coalescing in " Sergey Kaplun via Tarantool-patches
@ 2023-10-13 5:59 ` Maxim Kokryashkin via Tarantool-patches
2023-10-23 9:50 ` Sergey Kaplun via Tarantool-patches
2023-11-07 18:42 ` Igor Munkin via Tarantool-patches
1 sibling, 1 reply; 12+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2023-10-13 5:59 UTC (permalink / raw)
To: Sergey Kaplun; +Cc: tarantool-patches
Hi, Sergey!
Thanks for the patch!
LGTM, except for a few comments below.
On Wed, Oct 11, 2023 at 06:04:10PM +0300, Sergey Kaplun wrote:
> From: Mike Pall <mike>
>
> Thanks to Sergey Kaplun, NiLuJe and Peter Cawley.
>
> (cherry-picked from commit aa2db7ebd1267836af5221336ccb4e9b4aa8372d)
>
> The previous patch fixed just part of the problem with the register
> coalesing. For example, the parent base register may be used inside the
> parent or child register sets when it shouldn't. This leads to incorrect
> register allocations, which may lead to crashes or undefined behaviour.
> This patch fixes it by excluding the parent base register from both
> register sets.
>
> The test case for this patch doesn't fail before the commit since it
> requires specific register allocation, which is hard to construct and
> very fragile. Due to the lack of ideal sync with the upstream
> repository, the test is passed before the patch.
> It should become correct in future patches.
Typo: s/in future patches/after backporting future patches/
>
> Resolves tarantool/tarantool#8767
How can I reassure that, since we don't have any reproducer?
> Part of tarantool/tarantool#9145
>
> Sergey Kaplun:
> * added the description and the test for the problem
> ---
> src/lj_asm.c | 7 +-
> src/lj_asm_arm.h | 7 +-
> src/lj_asm_arm64.h | 7 +-
> src/lj_asm_mips.h | 8 +-
> src/lj_asm_ppc.h | 8 +-
> src/lj_asm_x86.h | 8 +-
> .../lj-1031-asm-head-side-base-reg.test.lua | 139 ++++++++++++++++++
> 7 files changed, 163 insertions(+), 21 deletions(-)
> create mode 100644 test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua
>
> diff --git a/src/lj_asm.c b/src/lj_asm.c
> index ca06860a..5137d05c 100644
> --- a/src/lj_asm.c
> +++ b/src/lj_asm.c
> @@ -1860,6 +1860,7 @@ static void asm_head_side(ASMState *as)
> RegSet allow = RSET_ALL; /* Inverse of all coalesced registers. */
> RegSet live = RSET_EMPTY; /* Live parent registers. */
> RegSet pallow = RSET_GPR; /* Registers needed by the parent stack check. */
> + Reg pbase;
> IRIns *irp = &as->parent->ir[REF_BASE]; /* Parent base. */
> int32_t spadj, spdelta;
> int pass2 = 0;
> @@ -1870,7 +1871,11 @@ static void asm_head_side(ASMState *as)
> /* Force snap #0 alloc to prevent register overwrite in stack check. */
> asm_snap_alloc(as, 0);
> }
> - allow = asm_head_side_base(as, irp, allow);
> + pbase = asm_head_side_base(as, irp);
> + if (pbase != RID_NONE) {
> + rset_clear(allow, pbase);
> + rset_clear(pallow, pbase);
> + }
>
> /* Scan all parent SLOADs and collect register dependencies. */
> for (i = as->stopins; i > REF_BASE; i--) {
> diff --git a/src/lj_asm_arm.h b/src/lj_asm_arm.h
> index 47564d2e..5a0f925f 100644
> --- a/src/lj_asm_arm.h
> +++ b/src/lj_asm_arm.h
> @@ -2107,7 +2107,7 @@ static void asm_head_root_base(ASMState *as)
> }
>
> /* Coalesce BASE register for a side trace. */
> -static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
> +static Reg asm_head_side_base(ASMState *as, IRIns *irp)
> {
> IRIns *ir;
> asm_head_lreg(as);
> @@ -2115,16 +2115,15 @@ static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
> if (ra_hasreg(ir->r) && (rset_test(as->modset, ir->r) || irt_ismarked(ir->t)))
> ra_spill(as, ir);
> if (ra_hasspill(irp->s)) {
> - rset_clear(allow, ra_dest(as, ir, allow));
> + return ra_dest(as, ir, RSET_GPR);
> } else {
> Reg r = irp->r;
> lj_assertA(ra_hasreg(r), "base reg lost");
> - rset_clear(allow, r);
> if (r != ir->r && !rset_test(as->freeset, r))
> ra_restore(as, regcost_ref(as->cost[r]));
> ra_destreg(as, ir, r);
> + return r;
> }
> - return allow;
> }
>
> /* -- Tail of trace ------------------------------------------------------- */
> diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
> index d1d4237b..88b47ceb 100644
> --- a/src/lj_asm_arm64.h
> +++ b/src/lj_asm_arm64.h
> @@ -1873,7 +1873,7 @@ static void asm_head_root_base(ASMState *as)
> }
>
> /* Coalesce BASE register for a side trace. */
> -static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
> +static Reg asm_head_side_base(ASMState *as, IRIns *irp)
> {
> IRIns *ir;
> asm_head_lreg(as);
> @@ -1881,16 +1881,15 @@ static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
> if (ra_hasreg(ir->r) && (rset_test(as->modset, ir->r) || irt_ismarked(ir->t)))
> ra_spill(as, ir);
> if (ra_hasspill(irp->s)) {
> - rset_clear(allow, ra_dest(as, ir, allow));
> + return ra_dest(as, ir, RSET_GPR);
> } else {
> Reg r = irp->r;
> lj_assertA(ra_hasreg(r), "base reg lost");
> - rset_clear(allow, r);
> if (r != ir->r && !rset_test(as->freeset, r))
> ra_restore(as, regcost_ref(as->cost[r]));
> ra_destreg(as, ir, r);
> + return r;
> }
> - return allow;
> }
>
> /* -- Tail of trace ------------------------------------------------------- */
> diff --git a/src/lj_asm_mips.h b/src/lj_asm_mips.h
> index ac9090f2..597c6d62 100644
> --- a/src/lj_asm_mips.h
> +++ b/src/lj_asm_mips.h
> @@ -2595,7 +2595,7 @@ static void asm_head_root_base(ASMState *as)
> }
>
> /* Coalesce BASE register for a side trace. */
> -static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
> +static Reg asm_head_side_base(ASMState *as, IRIns *irp)
> {
> IRIns *ir = IR(REF_BASE);
> Reg r = ir->r;
> @@ -2605,15 +2605,15 @@ static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
> if (rset_test(as->modset, r) || irt_ismarked(ir->t))
> ir->r = RID_INIT; /* No inheritance for modified BASE register. */
> if (irp->r == r) {
> - rset_clear(allow, r); /* Mark same BASE register as coalesced. */
> + return r; /* Same BASE register already coalesced. */
> } else if (ra_hasreg(irp->r) && rset_test(as->freeset, irp->r)) {
> - rset_clear(allow, irp->r);
> emit_move(as, r, irp->r); /* Move from coalesced parent reg. */
> + return irp->r;
> } else {
> emit_getgl(as, r, jit_base); /* Otherwise reload BASE. */
> }
> }
> - return allow;
> + return RID_NONE;
> }
>
> /* -- Tail of trace ------------------------------------------------------- */
> diff --git a/src/lj_asm_ppc.h b/src/lj_asm_ppc.h
> index 971dcc88..7bba71b3 100644
> --- a/src/lj_asm_ppc.h
> +++ b/src/lj_asm_ppc.h
> @@ -2130,7 +2130,7 @@ static void asm_head_root_base(ASMState *as)
> }
>
> /* Coalesce BASE register for a side trace. */
> -static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
> +static Reg asm_head_side_base(ASMState *as, IRIns *irp)
> {
> IRIns *ir = IR(REF_BASE);
> Reg r = ir->r;
> @@ -2139,15 +2139,15 @@ static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
> if (rset_test(as->modset, r) || irt_ismarked(ir->t))
> ir->r = RID_INIT; /* No inheritance for modified BASE register. */
> if (irp->r == r) {
> - rset_clear(allow, r); /* Mark same BASE register as coalesced. */
> + return r; /* Same BASE register already coalesced. */
> } else if (ra_hasreg(irp->r) && rset_test(as->freeset, irp->r)) {
> - rset_clear(allow, irp->r);
> emit_mr(as, r, irp->r); /* Move from coalesced parent reg. */
> + return irp->r;
> } else {
> emit_getgl(as, r, jit_base); /* Otherwise reload BASE. */
> }
> }
> - return allow;
> + return RID_NONE;
> }
>
> /* -- Tail of trace ------------------------------------------------------- */
> diff --git a/src/lj_asm_x86.h b/src/lj_asm_x86.h
> index 2b810c8d..86ce3937 100644
> --- a/src/lj_asm_x86.h
> +++ b/src/lj_asm_x86.h
> @@ -2856,7 +2856,7 @@ static void asm_head_root_base(ASMState *as)
> }
>
> /* Coalesce or reload BASE register for a side trace. */
> -static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
> +static Reg asm_head_side_base(ASMState *as, IRIns *irp)
> {
> IRIns *ir = IR(REF_BASE);
> Reg r = ir->r;
> @@ -2865,16 +2865,16 @@ static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
> if (rset_test(as->modset, r) || irt_ismarked(ir->t))
> ir->r = RID_INIT; /* No inheritance for modified BASE register. */
> if (irp->r == r) {
> - rset_clear(allow, r); /* Mark same BASE register as coalesced. */
> + return r; /* Same BASE register already coalesced. */
> } else if (ra_hasreg(irp->r) && rset_test(as->freeset, irp->r)) {
> /* Move from coalesced parent reg. */
> - rset_clear(allow, irp->r);
> emit_rr(as, XO_MOV, r|REX_GC64, irp->r);
> + return irp->r;
> } else {
> emit_getgl(as, r, jit_base); /* Otherwise reload BASE. */
> }
> }
> - return allow;
> + return RID_NONE;
> }
>
> /* -- Tail of trace ------------------------------------------------------- */
> diff --git a/test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua b/test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua
> new file mode 100644
> index 00000000..fc5efaaa
> --- /dev/null
> +++ b/test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua
> @@ -0,0 +1,139 @@
> +local tap = require('tap')
> +-- Test file to demonstrate incorrect side trace head assembling
> +-- when use parent trace register holding base (`RID_BASE`).
> +-- See also, https://github.com/LuaJIT/LuaJIT/issues/1031.
> +--
> +-- XXX: For now, the test doesn't fail even on arm64 due to the
> +-- different from upstream register allocation. Nevertheless, this
> +-- issue should be gone with future backporting, so just leave the
> +-- test case as is.
> +local test = tap.test('lj-1031-asm-head-side-base-reg'):skipcond({
> + ['Test requires JIT enabled'] = not jit.status(),
> +})
> +
> +local ffi = require 'ffi'
Please use parantheses here too.
> +local int64_t = ffi.typeof('int64_t')
> +
> +test:plan(1)
> +
> +-- To reproduce the issue (reproduced only on arm64) we need a
> +-- little bit of a tricky structure for the traces:
> +-- +-> + -- start TRACE 1
> +-- | | ...
> +-- | |---> + -- start TRACE 3 (side trace for the 1st)
> +-- | | | ...
> +-- | | v (link with TRACE 2)
> +-- | | +-> + -- start TRACE 2 (some loop that wasn't recorded
> +-- | | | | before due to the loop limit)
> +-- | | +---+
> +-- +<--+
> +-- Also, we need high register pressure to be sure that register
> +-- holding value of the base will be spilled.
> +-- So, during the assembly of "TRACE 3" (#6 in our test),
> +-- `RID_BASE` is spilled and restored via 32-bit fpr. This leads
> +-- to an incorrect value because it contains a 64-bit value.
> +-- See the original ticket for the details.
> +
> +-- XXX: Reduce amount of IR.
> +local tonumber = tonumber
> +local ffi_new = ffi.new
> +
> +-- Resulting slots for values calculated on trace.
> +-- luacheck: ignore
> +local r1
> +local r2
> +local r3
> +local r4
> +local r5
> +local r6
> +local r7
> +local r8
> +local r9
> +local r10
> +local r11
> +local r12
> +local r13
> +local r14
> +local r15
> +local r16
> +local r17
> +local r18
> +local r19
> +
> +local ARR_SIZE = 1e2
Maybe it is better to just write it as 100?
> +local lim_arr = {}
> +
> +-- XXX: Prevent irrelevant output in jit.dump().
> +jit.off()
> +
> +local INNER_TRACE_LIMIT = 20
> +local INNER_LIMIT1 = 1
> +local INNER_LIMIT2 = 4
Please drop a comment with explanation of why those numbers were chosen.
> +for i = 1, INNER_TRACE_LIMIT do lim_arr[i] = INNER_LIMIT1 end
> +for i = INNER_TRACE_LIMIT + 1, ARR_SIZE + 1 do lim_arr[i] = INNER_LIMIT2 end
These loops are hardly readable, it would be nice to make the multiline.
> +
> +-- Enable compilation back.
> +jit.on()
> +
> +jit.opt.start('hotloop=1', 'hotexit=2')
> +
> +-- XXX: Trace numbers are given with the respect of using
> +-- `jit.dump`.
> +-- Start TRACE 2 (1 is inside dump bc).
> +local k = 0
> +while k < ARR_SIZE do
> + k = k + 1
> + -- Forcify register pressure.
> + local l1 = ffi_new(int64_t, k + 1)
> + local l2 = ffi_new(int64_t, k + 2)
> + local l3 = ffi_new(int64_t, k + 3)
> + local l4 = ffi_new(int64_t, k + 4)
> + local l5 = ffi_new(int64_t, k + 5)
> + local l6 = ffi_new(int64_t, k + 6)
> + local l7 = ffi_new(int64_t, k + 7)
> + local l8 = ffi_new(int64_t, k + 8)
> + local l9 = ffi_new(int64_t, k + 9)
> + local l10 = ffi_new(int64_t, k + 10)
> + local l11 = ffi_new(int64_t, k + 11)
> + local l12 = ffi_new(int64_t, k + 12)
> + local l13 = ffi_new(int64_t, k + 13)
> + local l14 = ffi_new(int64_t, k + 14)
> + local l15 = ffi_new(int64_t, k + 15)
> + local l16 = ffi_new(int64_t, k + 16)
> + local l17 = ffi_new(int64_t, k + 17)
> + local l18 = ffi_new(int64_t, k + 18)
> + local l19 = ffi_new(int64_t, k + 19)
> + -- Side exit for TRACE 6 start 2/1.
> + -- luacheck: ignore
> + if k > 55 then else end
Please drop a comment and explain why it is 55, for some people
that is not obvious.
> + r1 = tonumber(l1)
> + r2 = tonumber(l2)
> + r3 = tonumber(l3)
> + r4 = tonumber(l4)
> + r5 = tonumber(l5)
> + r6 = tonumber(l6)
> + r7 = tonumber(l7)
> + r8 = tonumber(l8)
> + r9 = tonumber(l9)
> + r10 = tonumber(l10)
> + r11 = tonumber(l11)
> + r12 = tonumber(l12)
> + r13 = tonumber(l13)
> + r14 = tonumber(l14)
> + r15 = tonumber(l15)
> + r15 = tonumber(l15)
> + r16 = tonumber(l16)
> + r17 = tonumber(l17)
> + r18 = tonumber(l18)
> + r19 = tonumber(l19)
> + local lim_inner = lim_arr[k]
> + -- TRACE 3. Loop is not taken at the moment of recording TRACE 2
> + -- (lim_inner == 1).
> + for _ = 1, lim_inner do
> + -- TRACE 6 stop -> 3.
> + end
> +end
> +
> +test:ok(true, 'correct side trace head assembling')
> +
> +test:done(true)
> --
> 2.42.0
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 1/2] Fix register mask for stack check in head of side trace.
2023-10-13 5:48 ` Maxim Kokryashkin via Tarantool-patches
@ 2023-10-23 9:27 ` Sergey Kaplun via Tarantool-patches
2023-10-24 13:51 ` Maxim Kokryashkin via Tarantool-patches
0 siblings, 1 reply; 12+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-10-23 9:27 UTC (permalink / raw)
To: Maxim Kokryashkin; +Cc: tarantool-patches
Hi, Maxim!
Thanks for the review!
Please, consider my answers below.
On 13.10.23, Maxim Kokryashkin wrote:
> Hi, Sergey!
> Thanks for the patch!
> LGTM, except for a few nits below.
> On Wed, Oct 11, 2023 at 06:04:09PM +0300, Sergey Kaplun wrote:
> > From: Mike Pall <mike>
> >
> > Analyzed by Sergey Kaplun.
> >
> > (cherry-picked from commit b7a8c7c184257858699454408420dd5f0b6c8a75)
> >
> > Assume we have parent and child traces with the following IRs from the
> > IR dump:
> >
> > Parent:
> > | 0009 rax > tab TNEW 0 0
> > | 0010 p32 FLOAD 0008 tab.node
> > | 0011 p32 HREFK 0010 "Name" @1
> > | 0012 {0008} tab HSTORE 0011 0009
> > | .... SNAP 2 [ ---- 0001 0002 0008 ---- ]
> > | 0013 {sink} tab TNEW 0 0
> > | 0014 {0008} fal HSTORE 0011 false
> > | .... SNAP 3 [ ---- 0001 0002 0008 ---- ]
> >
> > Child:
> > | 0001 r15 tab SLOAD 1 PI
> > | 0002 rbp tab SLOAD 2 PI
> > | 0003 tab PVAL 9
> >
> > As we can see from the trace dump above, the `rax` register is missing
> > in the `0003 PVAL` IR for the side trace -- so it is assumed to be
> > available in the allow RegSet inside `asm_stack_check()` and its value
> > is spoiled during this check, so if we are restoring from the 3rd
> Typo: s/spoiled/spilled/
I mean spoiled (damaged) (with incorrect value) here -- there is no spill, but
incorrect value as `ir->r`.
> > snapshot by stack overflow -- we are in trouble.
> >
> > The moment when IR is spoiled is when we set a hint on the register
> Typo: s/spoiled/spilled/
Ditto.
> > inherited from the parent trace (see `asm_setup_regsp()` for details).
<snipped>
> >
--
Best regards,
Sergey Kaplun
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 2/2] Fix base register coalescing in side trace.
2023-10-13 5:59 ` Maxim Kokryashkin via Tarantool-patches
@ 2023-10-23 9:50 ` Sergey Kaplun via Tarantool-patches
2023-10-24 13:50 ` Maxim Kokryashkin via Tarantool-patches
0 siblings, 1 reply; 12+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-10-23 9:50 UTC (permalink / raw)
To: Maxim Kokryashkin; +Cc: tarantool-patches
Hi, Maxim!
Thanks for the review!
Fixed your comments and force-pushed the branch!
On 13.10.23, Maxim Kokryashkin wrote:
> Hi, Sergey!
> Thanks for the patch!
> LGTM, except for a few comments below.
>
> On Wed, Oct 11, 2023 at 06:04:10PM +0300, Sergey Kaplun wrote:
> > From: Mike Pall <mike>
> >
> > Thanks to Sergey Kaplun, NiLuJe and Peter Cawley.
> >
> > (cherry-picked from commit aa2db7ebd1267836af5221336ccb4e9b4aa8372d)
> >
> > The previous patch fixed just part of the problem with the register
> > coalesing. For example, the parent base register may be used inside the
> > parent or child register sets when it shouldn't. This leads to incorrect
> > register allocations, which may lead to crashes or undefined behaviour.
> > This patch fixes it by excluding the parent base register from both
> > register sets.
> >
> > The test case for this patch doesn't fail before the commit since it
> > requires specific register allocation, which is hard to construct and
> > very fragile. Due to the lack of ideal sync with the upstream
> > repository, the test is passed before the patch.
> > It should become correct in future patches.
> Typo: s/in future patches/after backporting future patches/
Fixed, thanks!
> >
> > Resolves tarantool/tarantool#8767
> How can I reassure that, since we don't have any reproducer?
You may try the recipe mentioned in the ticket and see that there are no
any crashes anymore.
<snipped>
> > diff --git a/test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua b/test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua
> > new file mode 100644
> > index 00000000..fc5efaaa
> > --- /dev/null
> > +++ b/test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua
> > @@ -0,0 +1,139 @@
> > +local tap = require('tap')
> > +-- Test file to demonstrate incorrect side trace head assembling
> > +-- when use parent trace register holding base (`RID_BASE`).
> > +-- See also, https://github.com/LuaJIT/LuaJIT/issues/1031.
> > +--
> > +-- XXX: For now, the test doesn't fail even on arm64 due to the
> > +-- different from upstream register allocation. Nevertheless, this
> > +-- issue should be gone with future backporting, so just leave the
> > +-- test case as is.
> > +local test = tap.test('lj-1031-asm-head-side-base-reg'):skipcond({
> > + ['Test requires JIT enabled'] = not jit.status(),
> > +})
> > +
> > +local ffi = require 'ffi'
> Please use parantheses here too.
Fixed, thanks!
> > +local int64_t = ffi.typeof('int64_t')
> > +
<snipped>
> > +
> > +local ARR_SIZE = 1e2
> Maybe it is better to just write it as 100?
Fixed.
> > +local lim_arr = {}
> > +
> > +-- XXX: Prevent irrelevant output in jit.dump().
> > +jit.off()
> > +
> > +local INNER_TRACE_LIMIT = 20
> > +local INNER_LIMIT1 = 1
> > +local INNER_LIMIT2 = 4
> Please drop a comment with explanation of why those numbers were chosen.
Added, see the full patch below.
> > +for i = 1, INNER_TRACE_LIMIT do lim_arr[i] = INNER_LIMIT1 end
> > +for i = INNER_TRACE_LIMIT + 1, ARR_SIZE + 1 do lim_arr[i] = INNER_LIMIT2 end
> These loops are hardly readable, it would be nice to make the multiline.
Fixed.
> > +
<snipped>
> > + if k > 55 then else end
> Please drop a comment and explain why it is 55, for some people
> that is not obvious.
Added. See iterational patch below. Brach is force-pushed.
===================================================================
diff --git a/test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua b/test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua
index fc5efaaa..35364f4e 100644
--- a/test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua
+++ b/test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua
@@ -11,7 +11,7 @@ local test = tap.test('lj-1031-asm-head-side-base-reg'):skipcond({
['Test requires JIT enabled'] = not jit.status(),
})
-local ffi = require 'ffi'
+local ffi = require('ffi')
local int64_t = ffi.typeof('int64_t')
test:plan(1)
@@ -60,21 +60,31 @@ local r17
local r18
local r19
-local ARR_SIZE = 1e2
+local ARR_SIZE = 100
local lim_arr = {}
-- XXX: Prevent irrelevant output in jit.dump().
jit.off()
+-- `INNER_LIMIT1` - no cycle is taken.
+-- `INNER_LIMIT2` - cycle is taken and compiled.
+-- `INNER_TRACE_LIMIT` - empirical number of iterations to compile
+-- all necessary traces from the outer cycle.
local INNER_TRACE_LIMIT = 20
local INNER_LIMIT1 = 1
local INNER_LIMIT2 = 4
-for i = 1, INNER_TRACE_LIMIT do lim_arr[i] = INNER_LIMIT1 end
-for i = INNER_TRACE_LIMIT + 1, ARR_SIZE + 1 do lim_arr[i] = INNER_LIMIT2 end
+for i = 1, INNER_TRACE_LIMIT do
+ lim_arr[i] = INNER_LIMIT1
+end
+for i = INNER_TRACE_LIMIT + 1, ARR_SIZE + 1 do
+ lim_arr[i] = INNER_LIMIT2
+end
-- Enable compilation back.
jit.on()
+-- XXX: `hotexit` is set to 2 to decrease the number of
+-- meaningless side traces.
jit.opt.start('hotloop=1', 'hotexit=2')
-- XXX: Trace numbers are given with the respect of using
@@ -105,6 +115,8 @@ while k < ARR_SIZE do
local l19 = ffi_new(int64_t, k + 19)
-- Side exit for TRACE 6 start 2/1.
-- luacheck: ignore
+ -- XXX: The number is meaningless, just needs to be big enough
+ -- to be sure that all necessary traces are compiled.
if k > 55 then else end
r1 = tonumber(l1)
r2 = tonumber(l2)
===================================================================
<snipped>
> >
--
Best regards,
Sergey Kaplun
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 2/2] Fix base register coalescing in side trace.
2023-10-23 9:50 ` Sergey Kaplun via Tarantool-patches
@ 2023-10-24 13:50 ` Maxim Kokryashkin via Tarantool-patches
0 siblings, 0 replies; 12+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2023-10-24 13:50 UTC (permalink / raw)
To: Sergey Kaplun; +Cc: tarantool-patches
[-- Attachment #1: Type: text/plain, Size: 5802 bytes --]
Hi, Sergey!
Thanks for the fixes!
LGTM
--
Best regards,
Maxim Kokryashkin
>Понедельник, 23 октября 2023, 12:54 +03:00 от Sergey Kaplun <skaplun@tarantool.org>:
>
>Hi, Maxim!
>Thanks for the review!
>Fixed your comments and force-pushed the branch!
>
>On 13.10.23, Maxim Kokryashkin wrote:
>> Hi, Sergey!
>> Thanks for the patch!
>> LGTM, except for a few comments below.
>>
>> On Wed, Oct 11, 2023 at 06:04:10PM +0300, Sergey Kaplun wrote:
>> > From: Mike Pall <mike>
>> >
>> > Thanks to Sergey Kaplun, NiLuJe and Peter Cawley.
>> >
>> > (cherry-picked from commit aa2db7ebd1267836af5221336ccb4e9b4aa8372d)
>> >
>> > The previous patch fixed just part of the problem with the register
>> > coalesing. For example, the parent base register may be used inside the
>> > parent or child register sets when it shouldn't. This leads to incorrect
>> > register allocations, which may lead to crashes or undefined behaviour.
>> > This patch fixes it by excluding the parent base register from both
>> > register sets.
>> >
>> > The test case for this patch doesn't fail before the commit since it
>> > requires specific register allocation, which is hard to construct and
>> > very fragile. Due to the lack of ideal sync with the upstream
>> > repository, the test is passed before the patch.
>> > It should become correct in future patches.
>> Typo: s/in future patches/after backporting future patches/
>
>Fixed, thanks!
>
>> >
>> > Resolves tarantool/tarantool#8767
>> How can I reassure that, since we don't have any reproducer?
>
>You may try the recipe mentioned in the ticket and see that there are no
>any crashes anymore.
>
><snipped>
>
>> > diff --git a/test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua b/test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua
>> > new file mode 100644
>> > index 00000000..fc5efaaa
>> > --- /dev/null
>> > +++ b/test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua
>> > @@ -0,0 +1,139 @@
>> > +local tap = require('tap')
>> > +-- Test file to demonstrate incorrect side trace head assembling
>> > +-- when use parent trace register holding base (`RID_BASE`).
>> > +-- See also, https://github.com/LuaJIT/LuaJIT/issues/1031 .
>> > +--
>> > +-- XXX: For now, the test doesn't fail even on arm64 due to the
>> > +-- different from upstream register allocation. Nevertheless, this
>> > +-- issue should be gone with future backporting, so just leave the
>> > +-- test case as is.
>> > +local test = tap.test('lj-1031-asm-head-side-base-reg'):skipcond({
>> > + ['Test requires JIT enabled'] = not jit.status(),
>> > +})
>> > +
>> > +local ffi = require 'ffi'
>> Please use parantheses here too.
>
>Fixed, thanks!
>
>> > +local int64_t = ffi.typeof('int64_t')
>> > +
>
><snipped>
>
>> > +
>> > +local ARR_SIZE = 1e2
>> Maybe it is better to just write it as 100?
>
>Fixed.
>
>> > +local lim_arr = {}
>> > +
>> > +-- XXX: Prevent irrelevant output in jit.dump().
>> > +jit.off()
>> > +
>> > +local INNER_TRACE_LIMIT = 20
>> > +local INNER_LIMIT1 = 1
>> > +local INNER_LIMIT2 = 4
>> Please drop a comment with explanation of why those numbers were chosen.
>
>Added, see the full patch below.
>
>> > +for i = 1, INNER_TRACE_LIMIT do lim_arr[i] = INNER_LIMIT1 end
>> > +for i = INNER_TRACE_LIMIT + 1, ARR_SIZE + 1 do lim_arr[i] = INNER_LIMIT2 end
>> These loops are hardly readable, it would be nice to make the multiline.
>
>Fixed.
>
>> > +
>
><snipped>
>
>> > + if k > 55 then else end
>> Please drop a comment and explain why it is 55, for some people
>> that is not obvious.
>
>Added. See iterational patch below. Brach is force-pushed.
>
>===================================================================
>diff --git a/test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua b/test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua
>index fc5efaaa..35364f4e 100644
>--- a/test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua
>+++ b/test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua
>@@ -11,7 +11,7 @@ local test = tap.test('lj-1031-asm-head-side-base-reg'):skipcond({
> ['Test requires JIT enabled'] = not jit.status(),
> })
>
>-local ffi = require 'ffi'
>+local ffi = require('ffi')
> local int64_t = ffi.typeof('int64_t')
>
> test:plan(1)
>@@ -60,21 +60,31 @@ local r17
> local r18
> local r19
>
>-local ARR_SIZE = 1e2
>+local ARR_SIZE = 100
> local lim_arr = {}
>
> -- XXX: Prevent irrelevant output in jit.dump().
> jit.off()
>
>+-- `INNER_LIMIT1` - no cycle is taken.
>+-- `INNER_LIMIT2` - cycle is taken and compiled.
>+-- `INNER_TRACE_LIMIT` - empirical number of iterations to compile
>+-- all necessary traces from the outer cycle.
> local INNER_TRACE_LIMIT = 20
> local INNER_LIMIT1 = 1
> local INNER_LIMIT2 = 4
>-for i = 1, INNER_TRACE_LIMIT do lim_arr[i] = INNER_LIMIT1 end
>-for i = INNER_TRACE_LIMIT + 1, ARR_SIZE + 1 do lim_arr[i] = INNER_LIMIT2 end
>+for i = 1, INNER_TRACE_LIMIT do
>+ lim_arr[i] = INNER_LIMIT1
>+end
>+for i = INNER_TRACE_LIMIT + 1, ARR_SIZE + 1 do
>+ lim_arr[i] = INNER_LIMIT2
>+end
>
> -- Enable compilation back.
> jit.on()
>
>+-- XXX: `hotexit` is set to 2 to decrease the number of
>+-- meaningless side traces.
> jit.opt.start('hotloop=1', 'hotexit=2')
>
> -- XXX: Trace numbers are given with the respect of using
>@@ -105,6 +115,8 @@ while k < ARR_SIZE do
> local l19 = ffi_new(int64_t, k + 19)
> -- Side exit for TRACE 6 start 2/1.
> -- luacheck: ignore
>+ -- XXX: The number is meaningless, just needs to be big enough
>+ -- to be sure that all necessary traces are compiled.
> if k > 55 then else end
> r1 = tonumber(l1)
> r2 = tonumber(l2)
>===================================================================
>
><snipped>
>
>> >
>
>--
>Best regards,
>Sergey Kaplun
[-- Attachment #2: Type: text/html, Size: 7260 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 1/2] Fix register mask for stack check in head of side trace.
2023-10-23 9:27 ` Sergey Kaplun via Tarantool-patches
@ 2023-10-24 13:51 ` Maxim Kokryashkin via Tarantool-patches
0 siblings, 0 replies; 12+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2023-10-24 13:51 UTC (permalink / raw)
To: Sergey Kaplun; +Cc: tarantool-patches
[-- Attachment #1: Type: text/plain, Size: 1874 bytes --]
Hi, Sergey!
Thanks for the explanation!
LGTM
--
Best regards,
Maxim Kokryashkin
>Понедельник, 23 октября 2023, 12:32 +03:00 от Sergey Kaplun <skaplun@tarantool.org>:
>
>Hi, Maxim!
>Thanks for the review!
>Please, consider my answers below.
>
>On 13.10.23, Maxim Kokryashkin wrote:
>> Hi, Sergey!
>> Thanks for the patch!
>> LGTM, except for a few nits below.
>> On Wed, Oct 11, 2023 at 06:04:09PM +0300, Sergey Kaplun wrote:
>> > From: Mike Pall <mike>
>> >
>> > Analyzed by Sergey Kaplun.
>> >
>> > (cherry-picked from commit b7a8c7c184257858699454408420dd5f0b6c8a75)
>> >
>> > Assume we have parent and child traces with the following IRs from the
>> > IR dump:
>> >
>> > Parent:
>> > | 0009 rax > tab TNEW 0 0
>> > | 0010 p32 FLOAD 0008 tab.node
>> > | 0011 p32 HREFK 0010 "Name" @1
>> > | 0012 {0008} tab HSTORE 0011 0009
>> > | .... SNAP 2 [ ---- 0001 0002 0008 ---- ]
>> > | 0013 {sink} tab TNEW 0 0
>> > | 0014 {0008} fal HSTORE 0011 false
>> > | .... SNAP 3 [ ---- 0001 0002 0008 ---- ]
>> >
>> > Child:
>> > | 0001 r15 tab SLOAD 1 PI
>> > | 0002 rbp tab SLOAD 2 PI
>> > | 0003 tab PVAL 9
>> >
>> > As we can see from the trace dump above, the `rax` register is missing
>> > in the `0003 PVAL` IR for the side trace -- so it is assumed to be
>> > available in the allow RegSet inside `asm_stack_check()` and its value
>> > is spoiled during this check, so if we are restoring from the 3rd
>> Typo: s/spoiled/spilled/
>
>I mean spoiled (damaged) (with incorrect value) here -- there is no spill, but
>incorrect value as `ir->r`.
>
>> > snapshot by stack overflow -- we are in trouble.
>> >
>> > The moment when IR is spoiled is when we set a hint on the register
>> Typo: s/spoiled/spilled/
>
>Ditto.
>
>> > inherited from the parent trace (see `asm_setup_regsp()` for details).
>
><snipped>
>
>> >
>
>--
>Best regards,
>Sergey Kaplun
[-- Attachment #2: Type: text/html, Size: 2657 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 1/2] Fix register mask for stack check in head of side trace.
2023-10-11 15:04 ` [Tarantool-patches] [PATCH luajit 1/2] Fix register mask for stack check in head of side trace Sergey Kaplun via Tarantool-patches
2023-10-13 5:48 ` Maxim Kokryashkin via Tarantool-patches
@ 2023-11-07 18:42 ` Igor Munkin via Tarantool-patches
1 sibling, 0 replies; 12+ messages in thread
From: Igor Munkin via Tarantool-patches @ 2023-11-07 18:42 UTC (permalink / raw)
To: Sergey Kaplun; +Cc: tarantool-patches
Sergey,
Thanks for the patch! LGTM. I have no idea regarding the test either,
since the condition for reproducing is definitely rather fragile.
--
Best regards,
IM
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 2/2] Fix base register coalescing in side trace.
2023-10-11 15:04 ` [Tarantool-patches] [PATCH luajit 2/2] Fix base register coalescing in " Sergey Kaplun via Tarantool-patches
2023-10-13 5:59 ` Maxim Kokryashkin via Tarantool-patches
@ 2023-11-07 18:42 ` Igor Munkin via Tarantool-patches
1 sibling, 0 replies; 12+ messages in thread
From: Igor Munkin via Tarantool-patches @ 2023-11-07 18:42 UTC (permalink / raw)
To: Sergey Kaplun; +Cc: tarantool-patches
Sergey,
Thanks for the patch! LGTM, except a typo below.
On 11.10.23, Sergey Kaplun via Tarantool-patches wrote:
> From: Mike Pall <mike>
>
> Thanks to Sergey Kaplun, NiLuJe and Peter Cawley.
>
> (cherry-picked from commit aa2db7ebd1267836af5221336ccb4e9b4aa8372d)
>
> The previous patch fixed just part of the problem with the register
> coalesing. For example, the parent base register may be used inside the
Typo: s/coalesing/coalescing.
> parent or child register sets when it shouldn't. This leads to incorrect
> register allocations, which may lead to crashes or undefined behaviour.
> This patch fixes it by excluding the parent base register from both
> register sets.
>
> The test case for this patch doesn't fail before the commit since it
> requires specific register allocation, which is hard to construct and
> very fragile. Due to the lack of ideal sync with the upstream
> repository, the test is passed before the patch.
> It should become correct in future patches.
>
> Resolves tarantool/tarantool#8767
> Part of tarantool/tarantool#9145
>
> Sergey Kaplun:
> * added the description and the test for the problem
> ---
> src/lj_asm.c | 7 +-
> src/lj_asm_arm.h | 7 +-
> src/lj_asm_arm64.h | 7 +-
> src/lj_asm_mips.h | 8 +-
> src/lj_asm_ppc.h | 8 +-
> src/lj_asm_x86.h | 8 +-
> .../lj-1031-asm-head-side-base-reg.test.lua | 139 ++++++++++++++++++
> 7 files changed, 163 insertions(+), 21 deletions(-)
> create mode 100644 test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua
>
<snipped>
> --
> 2.42.0
>
--
Best regards,
IM
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 0/2] Fix assembling of the head of side traces
2023-10-11 15:04 [Tarantool-patches] [PATCH luajit 0/2] Fix assembling of the head of side traces Sergey Kaplun via Tarantool-patches
2023-10-11 15:04 ` [Tarantool-patches] [PATCH luajit 1/2] Fix register mask for stack check in head of side trace Sergey Kaplun via Tarantool-patches
2023-10-11 15:04 ` [Tarantool-patches] [PATCH luajit 2/2] Fix base register coalescing in " Sergey Kaplun via Tarantool-patches
@ 2023-11-23 6:30 ` Igor Munkin via Tarantool-patches
2 siblings, 0 replies; 12+ messages in thread
From: Igor Munkin via Tarantool-patches @ 2023-11-23 6:30 UTC (permalink / raw)
To: Sergey Kaplun; +Cc: tarantool-patches
Sergey,
I've checked the patchset into all long-term branches in
tarantool/luajit and bumped a new version in master, release/2.11 and
release/2.10.
On 11.10.23, Sergey Kaplun via Tarantool-patches wrote:
> Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-1016-1031-asm-head-side
> Tarantool PR: https://github.com/tarantool/tarantool/pull/9252
> Related Issues:
> * https://github.com/LuaJIT/LuaJIT/issues/1016
> * https://github.com/LuaJIT/LuaJIT/issues/1031
> * https://github.com/tarantool/tarantool/issues/8767
> * https://github.com/tarantool/tarantool/issues/9145
>
> The reproducer in 8767 doesn't fail anymore after the second patch.
> Also, the box/bitset.test.lua doesn't fail either.
>
> Mike Pall (2):
> Fix register mask for stack check in head of side trace.
> Fix base register coalescing in side trace.
>
> src/lj_asm.c | 11 +-
> src/lj_asm_arm.h | 7 +-
> src/lj_asm_arm64.h | 7 +-
> src/lj_asm_mips.h | 8 +-
> src/lj_asm_ppc.h | 8 +-
> src/lj_asm_x86.h | 8 +-
> .../lj-1031-asm-head-side-base-reg.test.lua | 139 ++++++++++++++++++
> 7 files changed, 166 insertions(+), 22 deletions(-)
> create mode 100644 test/tarantool-tests/lj-1031-asm-head-side-base-reg.test.lua
>
> --
> 2.42.0
>
--
Best regards,
IM
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2023-11-23 6:35 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-11 15:04 [Tarantool-patches] [PATCH luajit 0/2] Fix assembling of the head of side traces Sergey Kaplun via Tarantool-patches
2023-10-11 15:04 ` [Tarantool-patches] [PATCH luajit 1/2] Fix register mask for stack check in head of side trace Sergey Kaplun via Tarantool-patches
2023-10-13 5:48 ` Maxim Kokryashkin via Tarantool-patches
2023-10-23 9:27 ` Sergey Kaplun via Tarantool-patches
2023-10-24 13:51 ` Maxim Kokryashkin via Tarantool-patches
2023-11-07 18:42 ` Igor Munkin via Tarantool-patches
2023-10-11 15:04 ` [Tarantool-patches] [PATCH luajit 2/2] Fix base register coalescing in " Sergey Kaplun via Tarantool-patches
2023-10-13 5:59 ` Maxim Kokryashkin via Tarantool-patches
2023-10-23 9:50 ` Sergey Kaplun via Tarantool-patches
2023-10-24 13:50 ` Maxim Kokryashkin via Tarantool-patches
2023-11-07 18:42 ` Igor Munkin via Tarantool-patches
2023-11-23 6:30 ` [Tarantool-patches] [PATCH luajit 0/2] Fix assembling of the head of side traces Igor Munkin via Tarantool-patches
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox