From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from [87.239.111.99] (localhost [127.0.0.1]) by dev.tarantool.org (Postfix) with ESMTP id 89B651456557; Thu, 26 Jun 2025 12:47:22 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 89B651456557 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev; t=1750931242; bh=2SmRp8VVpNXxohceDK7ZZSgdSIB3IbenRrif4IFCrUY=; h=Date:To:Cc:References:In-Reply-To:Subject:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=WEmqJOa1tqgoW1jDXi0iD0GKsFWRfH2enfXOQGuFE/dpPyu/kBNtn4i/rnlarkiPx +jdnARMThk4w/jzHc6Ut8AUWk0xVIfyX05bUzhMkpqYvmBmVgY5TmIUJbp6zNJrlpq /ulvltjt/Ef2SuAPxt3l/j0r7/nKMO9okqokzp70= Received: from send126.i.mail.ru (send126.i.mail.ru [89.221.237.221]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 4985D4923A0 for ; Thu, 26 Jun 2025 12:47:21 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 4985D4923A0 Received: by exim-smtp-68f89ddb46-v9vv5 with esmtpa (envelope-from ) id 1uUjCS-00000000CA4-1Ovv; Thu, 26 Jun 2025 12:47:20 +0300 Content-Type: multipart/alternative; boundary="------------oxX2aO0sIEyhYD5MhCF5OU95" Message-ID: Date: Thu, 26 Jun 2025 12:47:20 +0300 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Content-Language: en-US To: Sergey Kaplun Cc: tarantool-patches@dev.tarantool.org References: <20250612093651.7552-1-skaplun@tarantool.org> In-Reply-To: X-Mailru-Src: smtp X-4EC0790: 10 X-7564579A: B8F34718100C35BD X-77F55803: 4F1203BC0FB41BD9D919194CF4FC6604B620CE3B3404E4A9D3F3E03D2E6477AC00894C459B0CD1B96C87A1492C5057290578E6996F383413B47D90B9F1D8E4FAF977F7E30D03C3224DF7BC4678047644 X-7FA49CB5: FF5795518A3D127A4AD6D5ED66289B5278DA827A17800CE704BA85F3D5A9F85BEA1F7E6F0F101C67BD4B6F7A4D31EC0BCC500DACC3FED6E28638F802B75D45FF8AA50765F7900637AC83A81C8FD4AD23D82A6BABE6F325AC2E85FA5F3EDFCBAA7353EFBB553375669C5B23AD11DEB7B9A5DE3221F792DF5407F64C342066FF372A837687C31248F4389733CBF5DBD5E913377AFFFEAFD269176DF2183F8FC7C0D9442B0B5983000E8941B15DA834481FCF19DD082D7633A0EF3E4896CB9E6436389733CBF5DBD5E9D5E8D9A59859A8B636DA1BED736F9328CC7F00164DA146DA6F5DAA56C3B73B237318B6A418E8EAB8D32BA5DBAC0009BE9E8FC8737B5C224902F69741B4859B5F76E601842F6C81A12EF20D2F80756B5FB606B96278B59C4276E601842F6C81A127C277FBC8AE2E8B1DB702F0F215A03F3AA81AA40904B5D99C9F4D5AE37F343AD1F44FA8B9022EA23BBE47FD9DD3FB595F5C1EE8F4F765FC72CEEB2601E22B093A03B725D353964B0B7D0EA88DDEDAC722CA9DD8327EE4930A3850AC1BE2E7359E834CA16AA0C2CCC4224003CC83647689D4C264860C145E X-C1DE0DAB: 0D63561A33F958A5248BDEFE75680A685002B1117B3ED696AD584AA14FB3EDFACA7E60A991436CA2823CB91A9FED034534781492E4B8EEADA91A6E18C88C5E2F X-C8649E89: 1C3962B70DF3F0ADBF74143AD284FC7177DD89D51EBB7742424CF958EAFF5D571004E42C50DC4CA955A7F0CF078B5EC49A30900B95165D343D50AEDB859DBAD9FE99B7B65D04A58E15EEF4AE0A8073DDB7E3A780B16C9430AC48213214E6151E1D7E09C32AA3244CCE9E2AFD37EA56EB77DD89D51EBB7742D16075F0E61EDAEFEA455F16B58544A2E30DDF7C44BCB90DA5AE236DF995FB59978A700BF655EAEEED6A17656DB59BCAD427812AF56FC65B X-D57D3AED: 3ZO7eAau8CL7WIMRKs4sN3D3tLDjz0dLbV79QFUyzQ2Ujvy7cMT6pYYqY16iZVKkSc3dCLJ7zSJH7+u4VD18S7Vl4ZUrpaVfd2+vE6kuoey4m4VkSEu53w8ahmwBjZKM/YPHZyZHvz5uv+WouB9+ObcCpyrx6l7KImUglyhkEat/+ysWwi0gdhEs0JGjl6ggRWTy1haxBpVdbIX1nthFXMZebaIdHP2ghjoIc/363UZI6Kf1ptIMVVWXk7QTiVzHTwA89fX2ORM= X-Mailru-Sender: 520A125C2F17F0B1E52FEF5D219D61406C87A1492C5057290578E6996F383413C75578898D53F1050152A3D17938EB451EB5A0BCEC6A560B3DDE9B364B0DF289BE2DA36745F2EEB5CEBA01FB949A1F1EEAB4BC95F72C04283CDA0F3B3F5B9367 X-Mras: Ok Subject: Re: [Tarantool-patches] [PATCH luajit] ARM64: Fix assembly of HREFK. X-BeenThere: tarantool-patches@dev.tarantool.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Sergey Bronnikov via Tarantool-patches Reply-To: Sergey Bronnikov Errors-To: tarantool-patches-bounces@dev.tarantool.org Sender: "Tarantool-patches" This is a multi-part message in MIME format. --------------oxX2aO0sIEyhYD5MhCF5OU95 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit LGTM, thanks! On 6/25/25 17:41, Sergey Kaplun wrote: > Hi, Sergey! > Thanks for the review! > Please consider my answers below. > > On 25.06.25, Sergey Bronnikov wrote: >> Hi, Sergey, >> >> thanks for the patch! Please see my comments below. >> >> Sergey >> >> On 6/12/25 12:36, Sergey Kaplun wrote: >>> From: Mike Pall >>> >>> Reported by caohongqing. >>> Fix contributed by Peter Cawley. >>> >>> (cherry picked from commit 8fbd576fb9414a5fa70dfa6069733d3416a78269) >>> >>> `asm_hrefk()` uses the check for the offset for the corresponding node >>> structure. However, the target load is performed from its inner `key` >>> field with the offset 8. In the case of a huge table, it is possible >>> that the offset of the node (4095 * 8) is less than 4096 * 8 and can be >>> emitted via the corresponding instruction as an immediate offset, but >>> the offset of the `key` field is not. This leads to the corresponding >>> assertion failure in `emit_lso()`. >> The issue [1] contains yet another fix in the same place [2]. We decided >> to backport the patch >> >> separately. But please mention this in commit message. >> >> >> 1.https://github.com/LuaJIT/LuaJIT/issues/1026 >> >> 2. >> https://github.com/LuaJIT/LuaJIT/commit/93ce12ee15abf28ef4cb24ae7e4b8a5b73d75c85 > These issues are completely independent, IMO. I would rather not mention > it. Otherwise, by this logic, we should mention every problem related to > the HREFK here. ok >>> This patch fixes this behaviour by the correct check. >>> >>> Sergey Kaplun: >>> * added the description and the test for the problem >>> >>> Part of tarantool/tarantool#11278 >>> --- >>> >>> Related issues: >>> *https://github.com/LuaJIT/LuaJIT/issues/1026 >>> *https://github.com/tarantool/tarantool/issues/11278 >>> Branch:https://github.com/tarantool/luajit/tree/skaplun/lj-1026-arm64-invalid-hrefk-offset-check >>> >>> src/lj_asm_arm64.h | 2 +- >>> ...-arm64-invalid-hrefk-offset-check.test.lua | 48 +++++++++++++++++++ >>> 2 files changed, 49 insertions(+), 1 deletion(-) >>> create mode 100644 test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua >>> >>> diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h >>> index 6c7b011f..a7f059a2 100644 >>> --- a/src/lj_asm_arm64.h >>> +++ b/src/lj_asm_arm64.h >>> @@ -885,7 +885,7 @@ static void asm_hrefk(ASMState *as, IRIns *ir) > > >>> diff --git a/test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua b/test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua >>> new file mode 100644 >>> index 00000000..de243814 >>> --- /dev/null >>> +++ b/test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua >>> @@ -0,0 +1,48 @@ >>> +local tap = require('tap') >>> + >>> +-- Test file to demonstrate LuaJIT misbehaviour when assembling >>> +-- HREFK instruction on arm64 with the huge offset. >>> +-- Seealso:https://github.com/LuaJIT/LuaJIT/issues/1026. >>> +local test = tap.test('lj-1026-arm64-invalid-hrefk-offset-check'):skipcond({ >>> + ['Test requires JIT enabled'] = not jit.status(), >> It is an ARM-specific patch, should we add a condition for ARM here? > It is a good question. It was once discussed, and we decided not to add > the skip condition to make other architectures more covered by tests too > (for example, we may check MIPS/PPC in the same test if we want to > support them). Ok. >>> +}) >>> + >>> +test:plan(1) >>> + >>> +-- The assertion fails since in HREFK we are checking the offset >>> +-- from the hslots of the table of the Node structure itself >> s/Node/`Node`/ > Fixed. See the iterative patch below. > > =================================================================== > diff --git a/test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua b/test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua > index de243814..caa6291d 100644 > --- a/test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua > +++ b/test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua > @@ -10,7 +10,7 @@ local test = tap.test('lj-1026-arm64-invalid-hrefk-offset-check'):skipcond({ > test:plan(1) > > -- The assertion fails since in HREFK we are checking the offset > --- from the hslots of the table of the Node structure itself > +-- from the hslots of the table of the `Node` structure itself > -- instead of its inner field `key` (with additional 8 bytes). > -- So to test this, we generate a big table with constant keys > -- and compile a trace for each HREFK possible. > =================================================================== > > Branch is force-pushed. > >>> +-- instead of its inner field `key` (with additional 8 bytes). >>> +-- So to test this, we generate a big table with constant keys >>> +-- and compile a trace for each HREFK possible. >>> + >>> +local big_tab = {} >>> +-- The map of the characters to generate constant string keys. >>> +-- The offset of the node should be 4096 * 8. It takes at least >>> +-- 1365 keys to hit this value. The maximum possible slots in the >>> +-- hash part is 2048, so to fill it with the maximum density (with >>> +-- the way below), we need 45 * 45 = 2025 keys. >>> +local chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRS' >>> +for c1inchars:gmatch('.') do >>> + for c2inchars:gmatch('.') do >>> + big_tab[c1 .. c2] = 1 >>> + end >>> +end >>> + >>> +jit.opt.start('hotloop=1') >>> + >>> +-- Generate bunch of traces. >>> +for c1inchars:gmatch('.') do >>> + for c2inchars:gmatch('.') do >>> + loadstring([=[ >>> + local t = ... >>> + for i = 1, 4 do >>> + -- HREFK generation. >>> + t[ ']=] .. c1 .. c2 .. [=[' ] = i >>> + end >>> + ]=])(big_tab) >>> + end >>> +end >>> + >>> +test:ok(true, 'no assertion failed') >> I would replace testcase description to something like "emitted assembly >> is correct". >> >> Feel free to ignore. > It triggers the assertion in the first place, so ignoring. > >>> + >>> +test:done(true) --------------oxX2aO0sIEyhYD5MhCF5OU95 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 7bit

LGTM, thanks!

On 6/25/25 17:41, Sergey Kaplun wrote:
Hi, Sergey!
Thanks for the review!
Please consider my answers below.

On 25.06.25, Sergey Bronnikov wrote:
Hi, Sergey,

thanks for the patch! Please see my comments below.

Sergey

On 6/12/25 12:36, Sergey Kaplun wrote:
From: Mike Pall <mike>

Reported by caohongqing.
Fix contributed by Peter Cawley.

(cherry picked from commit 8fbd576fb9414a5fa70dfa6069733d3416a78269)

`asm_hrefk()` uses the check for the offset for the corresponding node
structure. However, the target load is performed from its inner `key`
field with the offset 8. In the case of a huge table, it is possible
that the offset of the node (4095 * 8) is less than 4096 * 8 and can be
emitted via the corresponding instruction as an immediate offset, but
the offset of the `key` field is not. This leads to the corresponding
assertion failure in `emit_lso()`.
The issue [1] contains yet another fix in the same place [2]. We decided 
to backport the patch

separately. But please mention this in commit message.


1. https://github.com/LuaJIT/LuaJIT/issues/1026

2. 
https://github.com/LuaJIT/LuaJIT/commit/93ce12ee15abf28ef4cb24ae7e4b8a5b73d75c85
These issues are completely independent, IMO. I would rather not mention
it. Otherwise, by this logic, we should mention every problem related to
the HREFK here.
ok

      
This patch fixes this behaviour by the correct check.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#11278
---

Related issues:
*https://github.com/LuaJIT/LuaJIT/issues/1026
*https://github.com/tarantool/tarantool/issues/11278
Branch:https://github.com/tarantool/luajit/tree/skaplun/lj-1026-arm64-invalid-hrefk-offset-check

  src/lj_asm_arm64.h                            |  2 +-
  ...-arm64-invalid-hrefk-offset-check.test.lua | 48 +++++++++++++++++++
  2 files changed, 49 insertions(+), 1 deletion(-)
  create mode 100644 test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua

diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
index 6c7b011f..a7f059a2 100644
--- a/src/lj_asm_arm64.h
+++ b/src/lj_asm_arm64.h
@@ -885,7 +885,7 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
<snipped>

diff --git a/test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua b/test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua
new file mode 100644
index 00000000..de243814
--- /dev/null
+++ b/test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua
@@ -0,0 +1,48 @@
+local tap = require('tap')
+
+-- Test file to demonstrate LuaJIT misbehaviour when assembling
+-- HREFK instruction on arm64 with the huge offset.
+-- See also:https://github.com/LuaJIT/LuaJIT/issues/1026.
+local test = tap.test('lj-1026-arm64-invalid-hrefk-offset-check'):skipcond({
+  ['Test requires JIT enabled'] = not jit.status(),
It is an ARM-specific patch, should we add a condition for ARM here?
It is a good question. It was once discussed, and we decided not to add
the skip condition to make other architectures more covered by tests too
(for example, we may check MIPS/PPC in the same test if we want to
support them).
Ok.

      
+})
+
+test:plan(1)
+
+-- The assertion fails since in HREFK we are checking the offset
+-- from the hslots of the table of the Node structure itself
s/Node/`Node`/
Fixed. See the iterative patch below.

===================================================================
diff --git a/test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua b/test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua
index de243814..caa6291d 100644
--- a/test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua
+++ b/test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua
@@ -10,7 +10,7 @@ local test = tap.test('lj-1026-arm64-invalid-hrefk-offset-check'):skipcond({
 test:plan(1)
 
 -- The assertion fails since in HREFK we are checking the offset
--- from the hslots of the table of the Node structure itself
+-- from the hslots of the table of the `Node` structure itself
 -- instead of its inner field `key` (with additional 8 bytes).
 -- So to test this, we generate a big table with constant keys
 -- and compile a trace for each HREFK possible.
===================================================================

Branch is force-pushed.

+-- instead of its inner field `key` (with additional 8 bytes).
+-- So to test this, we generate a big table with constant keys
+-- and compile a trace for each HREFK possible.
+
+local big_tab = {}
+-- The map of the characters to generate constant string keys.
+-- The offset of the node should be 4096 * 8. It takes at least
+-- 1365 keys to hit this value. The maximum possible slots in the
+-- hash part is 2048, so to fill it with the maximum density (with
+-- the way below), we need 45 * 45 = 2025 keys.
+local chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRS'
+for c1 inchars:gmatch('.') do
+  for c2 inchars:gmatch('.') do
+    big_tab[c1 .. c2] = 1
+  end
+end
+
+jit.opt.start('hotloop=1')
+
+-- Generate bunch of traces.
+for c1 inchars:gmatch('.') do
+  for c2 inchars:gmatch('.') do
+    loadstring([=[
+      local t = ...
+      for i = 1, 4 do
+        -- HREFK generation.
+        t[ ']=] .. c1 .. c2 .. [=[' ] = i
+      end
+    ]=])(big_tab)
+  end
+end
+
+test:ok(true, 'no assertion failed')
I would replace testcase description to something like "emitted assembly 
is correct".

Feel free to ignore.
It triggers the assertion in the first place, so ignoring.


        
+
+test:done(true)

    
--------------oxX2aO0sIEyhYD5MhCF5OU95--