From: Sergey Bronnikov via Tarantool-patches
Reply-To: Sergey Bronnikov
To: Sergey Kaplun
Cc: tarantool-patches@dev.tarantool.org
Date: Wed, 25 Jun 2025 17:20:09 +0300
Subject: Re: [Tarantool-patches] [PATCH luajit] ARM64: Fix assembly of HREFK.
In-Reply-To: <20250612093651.7552-1-skaplun@tarantool.org>

Hi, Sergey,

thanks for the patch! Please see my comments below.

Sergey

On 6/12/25 12:36, Sergey Kaplun wrote:
From: Mike Pall <mike>

Reported by caohongqing.
Fix contributed by Peter Cawley.

(cherry picked from commit 8fbd576fb9414a5fa70dfa6069733d3416a78269)

`asm_hrefk()` uses the check for the offset for the corresponding node
structure. However, the target load is performed from its inner `key`
field with the offset 8. In the case of a huge table, it is possible
that the offset of the node (4095 * 8) is less than 4096 * 8 and can be
emitted via the corresponding instruction as an immediate offset, but
the offset of the `key` field is not. This leads to the corresponding
assertion failure in `emit_lso()`.

The issue [1] contains yet another fix in the same place [2]. We decided to backport that patch
separately, but please mention this in the commit message.


1. https://github.com/LuaJIT/LuaJIT/issues/1026

2. https://github.com/LuaJIT/LuaJIT/commit/93ce12ee15abf28ef4cb24ae7e4b8a5b73d75c85



This patch fixes this behaviour by the correct check.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#11278
---

Related issues:
* https://github.com/LuaJIT/LuaJIT/issues/1026
* https://github.com/tarantool/tarantool/issues/11278
Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-1026-arm64-invalid-hrefk-offset-check

 src/lj_asm_arm64.h                            |  2 +-
 ...-arm64-invalid-hrefk-offset-check.test.lua | 48 +++++++++++++++++++
 2 files changed, 49 insertions(+), 1 deletion(-)
 create mode 100644 test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua

diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
index 6c7b011f..a7f059a2 100644
--- a/src/lj_asm_arm64.h
+++ b/src/lj_asm_arm64.h
@@ -885,7 +885,7 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
   IRIns *irkey = IR(kslot->op1);
   int32_t ofs = (int32_t)(kslot->op2 * sizeof(Node));
   int32_t kofs = ofs + (int32_t)offsetof(Node, key);
-  int bigofs = !emit_checkofs(A64I_LDRx, ofs);
+  int bigofs = !emit_checkofs(A64I_LDRx, kofs);
   Reg dest = (ra_used(ir) || bigofs) ? ra_dest(as, ir, RSET_GPR) : RID_NONE;
   Reg node = ra_alloc1(as, ir->op1, RSET_GPR);
   Reg key, idx = node;
diff --git a/test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua b/test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua
new file mode 100644
index 00000000..de243814
--- /dev/null
+++ b/test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua
@@ -0,0 +1,48 @@
+local tap = require('tap')
+
+-- Test file to demonstrate LuaJIT misbehaviour when assembling
+-- HREFK instruction on arm64 with the huge offset.
+-- See also: https://github.com/LuaJIT/LuaJIT/issues/1026.
+local test = tap.test('lj-1026-arm64-invalid-hrefk-offset-check'):skipcond({
+  ['Test requires JIT enabled'] = not jit.status(),
This is an ARM64-specific patch; should we add a skip condition so the test runs only on ARM64?
+})
+
+test:plan(1)
+
+-- The assertion fails since in HREFK we are checking the offset
+-- from the hslots of the table of the Node structure itself
s/Node/`Node`/
+-- instead of its inner field `key` (with additional 8 bytes).
+-- So to test this, we generate a big table with constant keys
+-- and compile a trace for each HREFK possible.
+
+local big_tab = {}
+-- The map of the characters to generate constant string keys.
+-- The offset of the node should be 4096 * 8. It takes at least
+-- 1365 keys to hit this value. The maximum possible slots in the
+-- hash part is 2048, so to fill it with the maximum density (with
+-- the way below), we need 45 * 45 = 2025 keys.
+local chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRS'
+for c1 in chars:gmatch('.') do
+  for c2 in chars:gmatch('.') do
+    big_tab[c1 .. c2] = 1
+  end
+end
+
+jit.opt.start('hotloop=1')
+
+-- Generate bunch of traces.
+for c1 in chars:gmatch('.') do
+  for c2 in chars:gmatch('.') do
+    loadstring([=[
+      local t = ...
+      for i = 1, 4 do
+        -- HREFK generation.
+        t[ ']=] .. c1 .. c2 .. [=[' ] = i
+      end
+    ]=])(big_tab)
+  end
+end
+
+test:ok(true, 'no assertion failed')

I would replace the test case description with something like "emitted assembly is correct".

Feel free to ignore.

+
+test:done(true)