From mboxrd@z Thu Jan 1 00:00:00 1970
To: Sergey Bronnikov
Date: Thu, 24 Jul 2025 12:04:00 +0300
Message-ID: <6a0d3fabe8caca468e02a319a45db6f2556dd2fe.1753344905.git.skaplun@tarantool.org>
X-Mailer: git-send-email 2.50.0
In-Reply-To:
References:
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [Tarantool-patches] [PATCH luajit 3/3] ARM64: Prevent STP fusion for conditional code emitted by TBAR.
X-BeenThere: tarantool-patches@dev.tarantool.org
X-Mailman-Version: 2.1.34
Precedence: list
List-Id: Tarantool development patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
From: Sergey Kaplun via Tarantool-patches
Reply-To: Sergey Kaplun
Cc: tarantool-patches@dev.tarantool.org
Errors-To: tarantool-patches-bounces@dev.tarantool.org
Sender: "Tarantool-patches"

From: Mike Pall

Thanks to Peter Cawley.

(cherry picked from commit 7cc53f0b85f834dfba1516ea79d59db463e856fa)

Assume we have a trace for several `setmetatable()` calls on the
same table. This trace contains the following IR:
| 0011       p64 FREF   0003  tab.meta
| ...
| 0018 x0  > tab TNEW   #0    #0
| 0019       tab TBAR   0003
| 0020       tab FSTORE 0011  0018

The expected mcode to be emitted for the last two IRs is the
following:
| 55626cffb0  ldrb  w30, [x19, #8]   ; tab->marked
| 55626cffb4  tst   w30, #0x4        ; Is black?
| 55626cffb8  beq   0x626cffd0       ; Skip marking.
| 55626cffbc  ldr   x27, [x20, #128]
| 55626cffc0  and   w30, w30, #0xfffffffb
| 55626cffc4  str   x19, [x20, #128]
| 55626cffcc  strb  w30, [x19, #8]   ; tab->marked
| 55626cffc8  str   x27, [x19, #24]  ; tab->gclist
| 55626cffd0  str   x0, [x19, #32]   ; tab->metatable

But the last 2 instructions are fused into the following `stp`:
| 55581dffd0  stp   x27, x0, [x19, #48]

Hence, the barrier is skipped only partially: the store
`str x27, [x19, #24]` is still executed despite the TBAR semantics.
This leaves an incorrect value in `gclist` and causes a
segmentation fault during its traversal on a GC step.

This patch prevents the fusion by swapping the order of the stores
to `tab->gclist` and `tab->marked`.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#11691
---
 src/lj_asm_arm64.h                            |  3 +-
 ...1057-arm64-stp-fusing-across-tbar.test.lua | 79 +++++++++++++++++++
 2 files changed, 81 insertions(+), 1 deletion(-)
 create mode 100644 test/tarantool-tests/lj-1057-arm64-stp-fusing-across-tbar.test.lua

diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
index 5a6c60b7..9b3c0467 100644
--- a/src/lj_asm_arm64.h
+++ b/src/lj_asm_arm64.h
@@ -1271,8 +1271,9 @@ static void asm_tbar(ASMState *as, IRIns *ir)
   Reg link = ra_scratch(as, rset_exclude(RSET_GPR, tab));
   Reg mark = RID_TMP;
   MCLabel l_end = emit_label(as);
-  emit_lso(as, A64I_STRx, link, tab, (int32_t)offsetof(GCtab, gclist));
   emit_lso(as, A64I_STRB, mark, tab, (int32_t)offsetof(GCtab, marked));
+  /* Keep STRx in the middle to avoid LDP/STP fusion with surrounding code. */
+  emit_lso(as, A64I_STRx, link, tab, (int32_t)offsetof(GCtab, gclist));
   emit_setgl(as, tab, gc.grayagain);
   emit_dn(as, A64I_ANDw^emit_isk13(~LJ_GC_BLACK, 0), mark, mark);
   emit_getgl(as, link, gc.grayagain);
diff --git a/test/tarantool-tests/lj-1057-arm64-stp-fusing-across-tbar.test.lua b/test/tarantool-tests/lj-1057-arm64-stp-fusing-across-tbar.test.lua
new file mode 100644
index 00000000..27d18916
--- /dev/null
+++ b/test/tarantool-tests/lj-1057-arm64-stp-fusing-across-tbar.test.lua
@@ -0,0 +1,79 @@
+local tap = require('tap')
+
+-- This test demonstrates LuaJIT's incorrect fusing of store
+-- instructions separated by the conditional branch on arm64.
+-- See also https://github.com/LuaJIT/LuaJIT/issues/1057.
+local test = tap.test('lj-1057-arm64-stp-fusing-across-tbar'):skipcond({
+  ['Test requires JIT enabled'] = not jit.status(),
+})
+
+test:plan(2)
+
+-- XXX: Simplify the `jit.dump()` output.
+local setmetatable = setmetatable
+
+-- The function below generates the following IR:
+-- | 0011       p64 FREF   0003  tab.meta
+-- | ...
+-- | 0018 x0  > tab TNEW   #0    #0
+-- | 0019       tab TBAR   0003
+-- | 0020       tab FSTORE 0011  0018
+-- The expected mcode to be emitted for the last two IRs is the
+-- following:
+-- | 55626cffb0  ldrb  w30, [x19, #8]   ; tab->marked
+-- | 55626cffb4  tst   w30, #0x4        ; Is black?
+-- | 55626cffb8  beq   0x626cffd0       ; Skip marking.
+-- | 55626cffbc  ldr   x27, [x20, #128]
+-- | 55626cffc0  and   w30, w30, #0xfffffffb
+-- | 55626cffc4  str   x19, [x20, #128]
+-- | 55626cffcc  strb  w30, [x19, #8]   ; tab->marked
+-- | 55626cffc8  str   x27, [x19, #24]  ; tab->gclist
+-- | 55626cffd0  str   x0, [x19, #32]   ; tab->metatable
+--
+-- But the last 2 instructions are fused into the following `stp`:
+-- | 55581dffd0  stp   x27, x0, [x19, #48]
+-- Hence, the barrier is skipped only partially: the store
+-- `str x27, [x19, #24]` is still executed despite the TBAR
+-- semantics. This leaves an incorrect value in `gclist` and
+-- causes a segmentation fault during its traversal on a GC step.
+local function trace(target_t)
+  -- Precreate a table for the FLOAD to avoid TNEW in between.
+  local stack_t = {}
+  -- Generate FSTORE TBAR pair. The FSTORE will be dropped due to
+  -- the FSTORE below by DSE.
+  setmetatable(target_t, {})
+  -- Generate FSTORE. TBAR will be dropped by CSE.
+  setmetatable(target_t, stack_t)
+end
+
+jit.opt.start('hotloop=1')
+
+-- XXX: Need to trigger the GC on trace to introspect that the
+-- GC chain is broken. Use empirical 10000 iterations.
+local tab = {}
+for _ = 1, 1e4 do
+  trace(tab)
+end
+
+test:ok(true, 'no assertion failure in the simple loop')
+
+-- A similar test, but make sure that we finish the whole GC
+-- cycle, and use an upvalue instead of a stack slot for the
+-- target table.
+
+local target_t = {}
+local function trace2()
+  local stack_t = {}
+  setmetatable(target_t, {})
+  setmetatable(target_t, stack_t)
+end
+
+collectgarbage('collect')
+collectgarbage('setstepmul', 1)
+while not collectgarbage('step') do
+  trace2()
+end
+
+test:ok(true, 'no assertion failure in the whole GC cycle')
+
+test:done(true)
-- 
2.50.0