<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body style="padding-bottom: 1px;">
<p>Hi, Sergey!</p>
<p>Thanks for the patch and the good explanation in the commit
message!</p>
<p>LGTM</p>
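<p>For anyone following the thread, here is a minimal C sketch of the
failure mode described in the commit message below. It is a simplified
model, not LuaJIT code: the <code>Tab</code> and <code>GState</code>
structs, the <code>BLACK</code> constant, both function names, and the
<code>stale</code> parameter are hypothetical and exist only for
illustration. The first function models the intended TBAR semantics,
the second models what happens once the <code>gclist</code> store is
merged with the <code>metatable</code> store after the branch target.</p>
<pre>
```c
/* Simplified, hypothetical model of the table fields touched by TBAR.
 * Field names follow the commit message; this is not LuaJIT's GCtab. */
#define BLACK 0x4u  /* Assumed black bit, as in `tst w30, #0x4`. */

typedef struct Tab {
  unsigned char marked;
  struct Tab *gclist;
  struct Tab *metatable;
} Tab;

typedef struct GState {
  Tab *grayagain;
} GState;

/* Intended semantics: a non-black table skips the whole barrier body. */
void tbar_then_store(GState *g, Tab *t, Tab *mt)
{
  if (t->marked & BLACK) {     /* tst + beq: skip if not black. */
    Tab *link = g->grayagain;  /* ldr x27, [x20, #128] */
    g->grayagain = t;          /* str x19, [x20, #128] */
    t->marked = (unsigned char)(t->marked & ~BLACK);  /* and + strb */
    t->gclist = link;          /* str x27, [x19, #24] */
  }
  t->metatable = mt;           /* str x0, [x19, #32] */
}

/* Miscompiled variant: the gclist store sits after the branch target,
 * so it runs unconditionally. `stale` stands for whatever the link
 * register happens to hold when the branch is taken. */
void tbar_then_store_fused(GState *g, Tab *t, Tab *mt, Tab *stale)
{
  Tab *link = stale;
  if (t->marked & BLACK) {
    link = g->grayagain;
    g->grayagain = t;
    t->marked = (unsigned char)(t->marked & ~BLACK);
  }
  t->gclist = link;            /* Fused stp: no longer skipped. */
  t->metatable = mt;
}
```
</pre>
<p>With a non-black table, the first function leaves
<code>gclist</code> untouched, while the second overwrites it with the
stale value. In this model that corresponds to the broken GC chain the
test is designed to catch.</p>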
<p>Sergey<br>
</p>
<div class="moz-cite-prefix">On 7/24/25 12:04, Sergey Kaplun wrote:<br>
</div>
<blockquote type="cite"
cite="mid:6a0d3fabe8caca468e02a319a45db6f2556dd2fe.1753344905.git.skaplun@tarantool.org">
<pre wrap="" class="moz-quote-pre">From: Mike Pall &lt;mike&gt;
Thanks to Peter Cawley.
(cherry picked from commit 7cc53f0b85f834dfba1516ea79d59db463e856fa)
Assume a trace records several `setmetatable()` calls on the same
table. The trace contains the following IR:
| 0011 p64 FREF 0003 tab.meta
| ...
| 0018 x0 > tab TNEW #0 #0
| 0019 tab TBAR 0003
| 0020 tab FSTORE 0011 0018
The expected mcode to be emitted for the last two IRs is the following:
| 55626cffb0 ldrb w30, [x19, #8] ; tab->marked
| 55626cffb4 tst w30, #0x4 ; Is black?
| 55626cffb8 beq 0x626cffd0 ; Skip marking.
| 55626cffbc ldr x27, [x20, #128]
| 55626cffc0 and w30, w30, #0xfffffffb
| 55626cffc4 str x19, [x20, #128]
| 55626cffcc strb w30, [x19, #8] ; tab->marked
| 55626cffc8 str x27, [x19, #24] ; tab->gclist
| 55626cffd0 str x0, [x19, #32] ; tab->metatable
But the last 2 instructions are fused into the following `stp`:
| 55581dffd0 stp x27, x0, [x19, #48]
Hence, the GC propagation frontier is only partially moved back:
`str x27, [x19, #24]` is not skipped as TBAR semantics require. This
leaves an incorrect value in `gclist` and causes a segmentation fault
during its traversal on a GC step.
This patch prevents the fusion by swapping the order of the stores to
`tab->gclist` and `tab->marked`.
Sergey Kaplun:
* added the description and the test for the problem
Part of tarantool/tarantool#11691
---
src/lj_asm_arm64.h | 3 +-
...1057-arm64-stp-fusing-across-tbar.test.lua | 79 +++++++++++++++++++
2 files changed, 81 insertions(+), 1 deletion(-)
create mode 100644 test/tarantool-tests/lj-1057-arm64-stp-fusing-across-tbar.test.lua
diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
index 5a6c60b7..9b3c0467 100644
--- a/src/lj_asm_arm64.h
+++ b/src/lj_asm_arm64.h
@@ -1271,8 +1271,9 @@ static void asm_tbar(ASMState *as, IRIns *ir)
Reg link = ra_scratch(as, rset_exclude(RSET_GPR, tab));
Reg mark = RID_TMP;
MCLabel l_end = emit_label(as);
- emit_lso(as, A64I_STRx, link, tab, (int32_t)offsetof(GCtab, gclist));
emit_lso(as, A64I_STRB, mark, tab, (int32_t)offsetof(GCtab, marked));
+ /* Keep STRx in the middle to avoid LDP/STP fusion with surrounding code. */
+ emit_lso(as, A64I_STRx, link, tab, (int32_t)offsetof(GCtab, gclist));
emit_setgl(as, tab, gc.grayagain);
emit_dn(as, A64I_ANDw^emit_isk13(~LJ_GC_BLACK, 0), mark, mark);
emit_getgl(as, link, gc.grayagain);
diff --git a/test/tarantool-tests/lj-1057-arm64-stp-fusing-across-tbar.test.lua b/test/tarantool-tests/lj-1057-arm64-stp-fusing-across-tbar.test.lua
new file mode 100644
index 00000000..27d18916
--- /dev/null
+++ b/test/tarantool-tests/lj-1057-arm64-stp-fusing-across-tbar.test.lua
@@ -0,0 +1,79 @@
+local tap = require('tap')
+
+-- This test demonstrates LuaJIT's incorrect fusing of store
+-- instructions separated by a conditional branch on arm64.
+-- See also <a class="moz-txt-link-freetext" href="https://github.com/LuaJIT/LuaJIT/issues/1057">https://github.com/LuaJIT/LuaJIT/issues/1057</a>.
+local test = tap.test('lj-1057-arm64-stp-fusing-across-tbar'):skipcond({
+ ['Test requires JIT enabled'] = not jit.status(),
+})
+
+test:plan(2)
+
+-- XXX: Simplify the `jit.dump()` output.
+local setmetatable = setmetatable
+
+-- The function below generates the following IR:
+-- | 0011 p64 FREF 0003 tab.meta
+-- | ...
+-- | 0018 x0 > tab TNEW #0 #0
+-- | 0019 tab TBAR 0003
+-- | 0020 tab FSTORE 0011 0018
+-- The expected mcode to be emitted for the last two IRs is the
+-- following:
+-- | 55626cffb0 ldrb w30, [x19, #8] ; tab->marked
+-- | 55626cffb4 tst w30, #0x4 ; Is black?
+-- | 55626cffb8 beq 0x626cffd0 ; Skip marking.
+-- | 55626cffbc ldr x27, [x20, #128]
+-- | 55626cffc0 and w30, w30, #0xfffffffb
+-- | 55626cffc4 str x19, [x20, #128]
+-- | 55626cffcc strb w30, [x19, #8] ; tab->marked
+-- | 55626cffc8 str x27, [x19, #24] ; tab->gclist
+-- | 55626cffd0 str x0, [x19, #32] ; tab->metatable
+--
+-- But the last 2 instructions are fused into the following `stp`:
+-- | 55581dffd0 stp x27, x0, [x19, #48]
+-- Hence, the GC propagation frontier is only partially moved
+-- back: `str x27, [x19, #24]` is not skipped as TBAR semantics
+-- require. This leaves an incorrect value in `gclist` and causes
+-- a segmentation fault during its traversal on a GC step.
+local function trace(target_t)
+ -- Precreate a table for the FLOAD to avoid TNEW in between.
+ local stack_t = {}
+ -- Generate an FSTORE/TBAR pair. DSE will drop this FSTORE
+ -- because of the FSTORE below.
+ setmetatable(target_t, {})
+ -- Generate FSTORE. TBAR will be dropped by CSE.
+ setmetatable(target_t, stack_t)
+end
+
+jit.opt.start('hotloop=1')
+
+-- XXX: The GC must run while on trace to reveal that the GC
+-- chain is broken. Use an empirically chosen 10000 iterations.
+local tab = {}
+for _ = 1, 1e4 do
+ trace(tab)
+end
+
+test:ok(true, 'no assertion failure in the simple loop')
+
+-- A similar test, but make sure that we finish the whole GC
+-- cycle and use an upvalue instead of a stack slot for the
+-- target table.
+
+local target_t = {}
+local function trace2()
+ local stack_t = {}
+ setmetatable(target_t, {})
+ setmetatable(target_t, stack_t)
+end
+
+collectgarbage('collect')
+collectgarbage('setstepmul', 1)
+while not collectgarbage('step') do
+ trace2()
+end
+
+test:ok(true, 'no assertion failure in the whole GC cycle')
+
+test:done(true)
</pre>
</blockquote>
</body>
</html>