From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from [87.239.111.99] (localhost [127.0.0.1]) by dev.tarantool.org (Postfix) with ESMTP id 7FC0913C2CCF; Tue, 1 Jul 2025 11:53:18 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 7FC0913C2CCF DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev; t=1751359998; bh=hbGkFxR3uhtVO+S4DoGuYRzAQ//NDKxPCWrLfphUtb8=; h=Date:To:Cc:References:In-Reply-To:Subject:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=hZLKjPqpM6DVTQSq/hK0mX9+itDpONZewZOLgLvnjmfVU5IoXXnMB7VXXYyHOqm3C npnxrkLVag+blLC5pIl+VXCKlSpkmQfm+t4yt9opeT8uwDEXGMbsCaYzKWeUm9RJg0 hxpSQzpoi8ivVOwkuzTEOKor8GIeUegHw+p0I2KE= Received: from send126.i.mail.ru (send126.i.mail.ru [89.221.237.221]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 5281113C2CD5 for ; Tue, 1 Jul 2025 11:53:16 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 5281113C2CD5 Received: by exim-smtp-6c74f64b69-hzspk with esmtpa (envelope-from ) id 1uWWjr-000000001xr-05WU; Tue, 01 Jul 2025 11:53:15 +0300 Content-Type: multipart/alternative; boundary="------------kmAz3jItRgirgZceIYXjibgZ" Message-ID: <667dfdf8-5fe0-4656-89d0-b6eaf0cb8836@tarantool.org> Date: Tue, 1 Jul 2025 11:53:14 +0300 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird To: Sergey Kaplun Cc: tarantool-patches@dev.tarantool.org References: <20250626151224.27925-1-skaplun@tarantool.org> <63170032-dc7a-47e4-ad84-9627a02070e0@tarantool.org> Content-Language: en-US In-Reply-To: X-Mailru-Src: smtp X-4EC0790: 10 X-7564579A: B8F34718100C35BD X-77F55803: 4F1203BC0FB41BD93BC787D97D52D839EB482FF62014574896C38B394B0703AC182A05F538085040D1FF27EDB6848D2F3DE06ABAFEAF67051E2B2AB5C5A21D6442742D089820740919AD13A5913A5F63 X-7FA49CB5: FF5795518A3D127A4AD6D5ED66289B5278DA827A17800CE705B093C0FC4B30B9EA1F7E6F0F101C67BD4B6F7A4D31EC0BCC500DACC3FED6E28638F802B75D45FF8AA50765F7900637AC83A81C8FD4AD23D82A6BABE6F325AC2E85FA5F3EDFCBAA7353EFBB55337566560991A0683723ABD7ABD796D2F2F91012D919A14EC2ABF070600483E7FBBDDB389733CBF5DBD5E913377AFFFEAFD269176DF2183F8FC7C0A29E2F051442AF778941B15DA834481FCF19DD082D7633A0EF3E4896CB9E6436389733CBF5DBD5E9D5E8D9A59859A8B6D52CD31C43BF465FCC7F00164DA146DA6F5DAA56C3B73B237318B6A418E8EAB8D32BA5DBAC0009BE9E8FC8737B5C22490F250D17497FEF6176E601842F6C81A12EF20D2F80756B5FB606B96278B59C4276E601842F6C81A127C277FBC8AE2E8B8DBB596EC94336063AA81AA40904B5D99C9F4D5AE37F343AD1F44FA8B9022EA23BBE47FD9DD3FB595F5C1EE8F4F765FC72CEEB2601E22B093A03B725D353964B2FFDA4F57982C5F435872C767BF85DA227C277FBC8AE2E8BEB1A37DF9DABAA8F75ECD9A6C639B01B4E70A05D1297E1BBCB5012B2E24CD356 X-C1DE0DAB: 0D63561A33F958A5FDEFC9F85AA2CEB45002B1117B3ED696CEF8603450C6E79292212597CCBD6D77823CB91A9FED034534781492E4B8EEADA2D5570B22232E1EBDAD6C7F3747799A X-C8649E89: 1C3962B70DF3F0ADBF74143AD284FC7177DD89D51EBB7742424CF958EAFF5D571004E42C50DC4CA955A7F0CF078B5EC49A30900B95165D349379E7F8541B6C9A3CEFBE3E60E64E99755D566C0159EE776D6B69937DB53A0142AF78A48B1E11991D7E09C32AA3244CA0AB2EEEEA22812B77DD89D51EBB77424460C793D1B77124EA455F16B58544A2E30DDF7C44BCB90DA5AE236DF995FB59978A700BF655EAEEED6A17656DB59BCAD427812AF56FC65B X-D57D3AED: 3ZO7eAau8CL7WIMRKs4sN3D3tLDjz0dLbV79QFUyzQ2Ujvy7cMT6pYYqY16iZVKkSc3dCLJ7zSJH7+u4VD18S7Vl4ZUrpaVfd2+vE6kuoey4m4VkSEu53w8ahmwBjZKM/YPHZyZHvz5uv+WouB9+ObcCpyrx6l7KImUglyhkEat/+ysWwi0gdhEs0JGjl6ggRWTy1haxBpVdbIX1nthFXMZebaIdHP2ghjoIc/363UZI6Kf1ptIMVV45pMz5J5bN2Yhn9ZcIdv8= X-Mailru-Sender: 520A125C2F17F0B1E52FEF5D219D6140AB493B89FE677AA99487ABAC94A94B544FC3A2C67621C86D0152A3D17938EB451EB5A0BCEC6A560B3DDE9B364B0DF289BE2DA36745F2EEB5CEBA01FB949A1F1EEAB4BC95F72C04283CDA0F3B3F5B9367 X-Mras: Ok Subject: Re: [Tarantool-patches] [PATCH luajit] ARM64: Fix LDP/STP fusing for unaligned accesses. X-BeenThere: tarantool-patches@dev.tarantool.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Sergey Bronnikov via Tarantool-patches Reply-To: Sergey Bronnikov Errors-To: tarantool-patches-bounces@dev.tarantool.org Sender: "Tarantool-patches" This is a multi-part message in MIME format. --------------kmAz3jItRgirgZceIYXjibgZ Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Thanks! LGTM On 6/30/25 10:26, Sergey Kaplun wrote: > Hi, Sergey! > Thanks for the review! > Fixed your comments and force-pushed the branch. > > On 27.06.25, Sergey Bronnikov wrote: >> Hi, Sergey, >> >> thanks for the patch! LGTM with a minor two comments below. >> >> Sergey >> >> On 6/26/25 18:12, Sergey Kaplun wrote: >>> From: Mike Pall >>> >>> Thanks to Peter Cawley. >>> >>> (cherry picked from commit 0fa2f1cbcf023ad0549f1428809e506fa2c78552) >>> >>> The arm64 emitting of load/store operation works incorrectly in the >>> case when at least one offset of load/store to be fused into ldp/stp is >>> misaligned. In this case this misaligning is ignored, and instructions >>> are fused, which leads to loading/storing from/to at least one incorrect >>> address. >>> >>> For example, the following instructions: >>> | stur w0, [x1, #17] >>> | stur w0, [x1, #21] >>> >>> May be fused to the following: >>> | stp w0, w0, [x1, #16] >>> >>> This patch prevents fusion in this case by testing the alignment with >>> the help of bitwise ROR by the alignment value. In case of misaligned >>> offset, the value overflows the 7-bit length mask in the check. >>> >>> The negative immediate (7-bit width including sign bit) is limited by >>> the corresponding addition of `64 << sc` (it is harmless in the case of >>> positive values). >>> >>> Sergey Kaplun: >>> * added the description and the test for the problem >>> >>> Part of tarantool/tarantool#11278 >>> --- >>> >>> Related issues: >>> *https://github.com/LuaJIT/LuaJIT/issues/1056 >>> *https://github.com/tarantool/tarantool/issues/11278 >>> Branch:https://github.com/tarantool/luajit/tree/skaplun/lj-1056-arm64-ldp-sdp-misaligned-fusing >>> >>> src/lj_emit_arm64.h | 2 +- >>> ...6-arm64-ldp-sdp-misaligned-fusing.test.lua | 98 +++++++++++++++++++ >>> 2 files changed, 99 insertions(+), 1 deletion(-) >>> create mode 100644 test/tarantool-tests/lj-1056-arm64-ldp-sdp-misaligned-fusing.test.lua >>> >>> diff --git a/src/lj_emit_arm64.h b/src/lj_emit_arm64.h >>> index 30cd3505..5c1bc372 100644 >>> --- a/src/lj_emit_arm64.h >>> +++ b/src/lj_emit_arm64.h > > >>> diff --git a/test/tarantool-tests/lj-1056-arm64-ldp-sdp-misaligned-fusing.test.lua b/test/tarantool-tests/lj-1056-arm64-ldp-sdp-misaligned-fusing.test.lua >>> new file mode 100644 >>> index 00000000..5d03097e >>> --- /dev/null >>> +++ b/test/tarantool-tests/lj-1056-arm64-ldp-sdp-misaligned-fusing.test.lua >>> @@ -0,0 +1,98 @@ >>> +local tap = require('tap') >>> +local ffi = require('ffi') >>> + >>> +-- This test demonstrates LuaJIT's incorrect emitting of LDP/STP >>> +-- instructions from LDUR/STUR instructions with misaligned offset >>> +-- on arm64. >>> +-- Seealsohttps://github.com/LuaJIT/LuaJIT/issue/1056. >> s/issue/issues/ > Fixed, thanks! > > =================================================================== > diff --git a/test/tarantool-tests/lj-1056-arm64-ldp-sdp-misaligned-fusing.test.lua b/test/tarantool-tests/lj-1056-arm64-ldp-sdp-misaligned-fusing.test.lua > index 815da15d..5ff040e7 100644 > --- a/test/tarantool-tests/lj-1056-arm64-ldp-sdp-misaligned-fusing.test.lua > +++ b/test/tarantool-tests/lj-1056-arm64-ldp-sdp-misaligned-fusing.test.lua > @@ -4,7 +4,7 @@ local ffi = require('ffi') > -- This test demonstrates LuaJIT's incorrect emitting of LDP/STP > -- instructions from LDUR/STUR instructions with misaligned offset > -- on arm64. > --- See alsohttps://github.com/LuaJIT/LuaJIT/issue/1056. > +-- See alsohttps://github.com/LuaJIT/LuaJIT/issues/1056. > local test = tap.test('lj-1056-arm64-ldp-sdp-misaligned-fusing'):skipcond({ > ['Test requires JIT enabled'] = not jit.status(), > }) > =================================================================== > >>> +local test = tap.test('lj-1056-arm64-ldp-sdp-misaligned-fusing'):skipcond({ >>> + ['Test requires JIT enabled'] = not jit.status(), >>> +}) >>> + > > >>> + >>> +test:is(resl, 0x4030201, 'pair of misaligned loads, left') >>> +test:is(resr, 0x8070605, 'pair of misaligned loads, right') >> What does mean these magic numbers? Please add a comment or >> >> use a variable with self-explained name. Here and below. > Added the comments nearby, see the corresponding patch below: > > =================================================================== > diff --git a/test/tarantool-tests/lj-1056-arm64-ldp-sdp-misaligned-fusing.test.lua b/test/tarantool-tests/lj-1056-arm64-ldp-sdp-misaligned-fusing.test.lua > index 5d03097e..815da15d 100644 > --- a/test/tarantool-tests/lj-1056-arm64-ldp-sdp-misaligned-fusing.test.lua > +++ b/test/tarantool-tests/lj-1056-arm64-ldp-sdp-misaligned-fusing.test.lua > @@ -81,6 +81,8 @@ for _ = 1, N_ITERATIONS do > resr = ffi.cast('int32_t *', ptr + 5)[0] > end > > +-- Values are resulted from the `init_buf()` function with the > +-- corresponding offset. > test:is(resl, 0x4030201, 'pair of misaligned loads, left') > test:is(resr, 0x8070605, 'pair of misaligned loads, right') > > @@ -92,6 +94,8 @@ for _ = 1, N_ITERATIONS do > resl = ffi.cast('int32_t *', ptr)[0] > end > > +-- Values are resulted from the `init_buf()` function with the > +-- corresponding offset. > test:is(resl, 0x3020100, 'aligned / misaligned load, aligned') > test:is(resr, 0x8070605, 'aligned / misaligned load, misaligned') > > =================================================================== > > > --------------kmAz3jItRgirgZceIYXjibgZ Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 7bit

Thanks! LGTM

On 6/30/25 10:26, Sergey Kaplun wrote:
Hi, Sergey!
Thanks for the review!
Fixed your comments and force-pushed the branch.

On 27.06.25, Sergey Bronnikov wrote:
Hi, Sergey,

thanks for the patch! LGTM with a minor two comments below.

Sergey

On 6/26/25 18:12, Sergey Kaplun wrote:
From: Mike Pall <mike>

Thanks to Peter Cawley.

(cherry picked from commit 0fa2f1cbcf023ad0549f1428809e506fa2c78552)

The arm64 emitting of load/store operation works incorrectly in the
case when at least one offset of load/store to be fused into ldp/stp is
misaligned. In this case this misaligning is ignored, and instructions
are fused, which leads to loading/storing from/to at least one incorrect
address.

For example, the following instructions:
| stur  w0, [x1, #17]
| stur  w0, [x1, #21]

May be fused to the following:
| stp   w0, w0, [x1, #16]

This patch prevents fusion in this case by testing the alignment with
the help of bitwise ROR by the alignment value. In case of misaligned
offset, the value overflows the 7-bit length mask in the check.

The negative immediate (7-bit width including sign bit) is limited by
the corresponding addition of `64 << sc` (it is harmless in the case of
positive values).

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#11278
---

Related issues:
*https://github.com/LuaJIT/LuaJIT/issues/1056
*https://github.com/tarantool/tarantool/issues/11278
Branch:https://github.com/tarantool/luajit/tree/skaplun/lj-1056-arm64-ldp-sdp-misaligned-fusing

  src/lj_emit_arm64.h                           |  2 +-
  ...6-arm64-ldp-sdp-misaligned-fusing.test.lua | 98 +++++++++++++++++++
  2 files changed, 99 insertions(+), 1 deletion(-)
  create mode 100644 test/tarantool-tests/lj-1056-arm64-ldp-sdp-misaligned-fusing.test.lua

diff --git a/src/lj_emit_arm64.h b/src/lj_emit_arm64.h
index 30cd3505..5c1bc372 100644
--- a/src/lj_emit_arm64.h
+++ b/src/lj_emit_arm64.h
<snipped>

diff --git a/test/tarantool-tests/lj-1056-arm64-ldp-sdp-misaligned-fusing.test.lua b/test/tarantool-tests/lj-1056-arm64-ldp-sdp-misaligned-fusing.test.lua
new file mode 100644
index 00000000..5d03097e
--- /dev/null
+++ b/test/tarantool-tests/lj-1056-arm64-ldp-sdp-misaligned-fusing.test.lua
@@ -0,0 +1,98 @@
+local tap = require('tap')
+local ffi = require('ffi')
+
+-- This test demonstrates LuaJIT's incorrect emitting of LDP/STP
+-- instructions from LDUR/STUR instructions with misaligned offset
+-- on arm64.
+-- See alsohttps://github.com/LuaJIT/LuaJIT/issue/1056.
s/issue/issues/
Fixed, thanks!

===================================================================
diff --git a/test/tarantool-tests/lj-1056-arm64-ldp-sdp-misaligned-fusing.test.lua b/test/tarantool-tests/lj-1056-arm64-ldp-sdp-misaligned-fusing.test.lua
index 815da15d..5ff040e7 100644
--- a/test/tarantool-tests/lj-1056-arm64-ldp-sdp-misaligned-fusing.test.lua
+++ b/test/tarantool-tests/lj-1056-arm64-ldp-sdp-misaligned-fusing.test.lua
@@ -4,7 +4,7 @@ local ffi = require('ffi')
 -- This test demonstrates LuaJIT's incorrect emitting of LDP/STP
 -- instructions from LDUR/STUR instructions with misaligned offset
 -- on arm64.
--- See also https://github.com/LuaJIT/LuaJIT/issue/1056.
+-- See also https://github.com/LuaJIT/LuaJIT/issues/1056.
 local test = tap.test('lj-1056-arm64-ldp-sdp-misaligned-fusing'):skipcond({
   ['Test requires JIT enabled'] = not jit.status(),
 })
===================================================================

+local test = tap.test('lj-1056-arm64-ldp-sdp-misaligned-fusing'):skipcond({
+  ['Test requires JIT enabled'] = not jit.status(),
+})
+
<snipped>

+
+test:is(resl, 0x4030201, 'pair of misaligned loads, left')
+test:is(resr, 0x8070605, 'pair of misaligned loads, right')
What does mean these magic numbers? Please add a comment or

use a variable with self-explained name. Here and below.
Added the comments nearby, see the corresponding patch below:

===================================================================
diff --git a/test/tarantool-tests/lj-1056-arm64-ldp-sdp-misaligned-fusing.test.lua b/test/tarantool-tests/lj-1056-arm64-ldp-sdp-misaligned-fusing.test.lua
index 5d03097e..815da15d 100644
--- a/test/tarantool-tests/lj-1056-arm64-ldp-sdp-misaligned-fusing.test.lua
+++ b/test/tarantool-tests/lj-1056-arm64-ldp-sdp-misaligned-fusing.test.lua
@@ -81,6 +81,8 @@ for _ = 1, N_ITERATIONS do
   resr = ffi.cast('int32_t *', ptr + 5)[0]
 end
 
+-- Values are resulted from the `init_buf()` function with the
+-- corresponding offset.
 test:is(resl, 0x4030201, 'pair of misaligned loads, left')
 test:is(resr, 0x8070605, 'pair of misaligned loads, right')
 
@@ -92,6 +94,8 @@ for _ = 1, N_ITERATIONS do
   resl = ffi.cast('int32_t *', ptr)[0]
 end
 
+-- Values are resulted from the `init_buf()` function with the
+-- corresponding offset.
 test:is(resl, 0x3020100, 'aligned / misaligned load, aligned')
 test:is(resr, 0x8070605, 'aligned / misaligned load, misaligned')
 
===================================================================

<snipped>

--------------kmAz3jItRgirgZceIYXjibgZ--