From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from [87.239.111.99] (localhost [127.0.0.1]) by dev.tarantool.org (Postfix) with ESMTP id B8E3713DF782; Wed, 4 Jun 2025 12:43:59 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org B8E3713DF782 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev; t=1749030239; bh=ks5UaXVaXkedJBN+A7oQs564NO+4rNzTDu1UjE5W1EE=; h=Date:To:Cc:References:In-Reply-To:Subject:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=KLIs2fqcW+Vv20YkUPgfQBU8x5xvwb5WfMfrec6GKfk+jm97Qybt4WGfLD5Jqxq/K cwL1A0oMNEII/5giNMmdYySbxbssYpnBW8/x+BY7wZIEMSUv9et+hYiED/HqRINJxG RseVKWkEc71MvkoCgOyYLl1hvmnlNlrK3TSdSV7g= Received: from send82.i.mail.ru (send82.i.mail.ru [89.221.237.177]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id EDF4513DF784 for ; Wed, 4 Jun 2025 12:43:57 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org EDF4513DF784 Received: by exim-smtp-75656d46d5-5qhtk with esmtpa (envelope-from ) id 1uMkf6-000000000Zl-3uiY; Wed, 04 Jun 2025 12:43:57 +0300 Content-Type: multipart/alternative; boundary="------------I4WmE5X6fsqGuOEo04Jdbbzp" Message-ID: <7ccfb141-2475-4605-a631-1adffd5842e4@tarantool.org> Date: Wed, 4 Jun 2025 12:43:56 +0300 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Content-Language: en-US To: Sergey Kaplun Cc: tarantool-patches@dev.tarantool.org References: <20250603185300.19160-1-skaplun@tarantool.org> In-Reply-To: <20250603185300.19160-1-skaplun@tarantool.org> X-Mailru-Src: smtp X-4EC0790: 10 X-7564579A: B8F34718100C35BD X-77F55803: 4F1203BC0FB41BD9851146C857904EABAA89F6B1AA8B80D9BC23291DDA182EE7182A05F538085040BC8B627946E77B7D3DE06ABAFEAF67055AFE678A4F970C311FEAC63F0EBDC6B03004FA394ACD0D3A X-7FA49CB5: FF5795518A3D127A4AD6D5ED66289B5278DA827A17800CE79E25BE5045FD62C0EA1F7E6F0F101C67BD4B6F7A4D31EC0BCC500DACC3FED6E28638F802B75D45FF8AA50765F7900637AC83A81C8FD4AD23D82A6BABE6F325AC2E85FA5F3EDFCBAA7353EFBB553375662BC6C8E92A5ED823AF396DE90E1E6CF70E08D4732A0AC1097AF6CC9B47A1392C389733CBF5DBD5E913377AFFFEAFD269176DF2183F8FC7C091DAD9F922AA71188941B15DA834481FCF19DD082D7633A0EF3E4896CB9E6436389733CBF5DBD5E9D5E8D9A59859A8B6E232F00D8D26902CA471835C12D1D977C4224003CC836476EB9C4185024447017B076A6E789B0E975F5C1EE8F4F765FC89C074F960B19C4B3AA81AA40904B5D9CF19DD082D7633A0C84D3B47A649675F3AA81AA40904B5D98AA50765F7900637D26F802DBF5E8046D81D268191BDAD3D3666184CF4C3C14F3FC91FA280E0CE3D1A620F70A64A45A98AA50765F79006372E808ACE2090B5E1725E5C173C3A84C3C5EA940A35A165FF2DBA43225CD8A89FB26E97DCB74E6252CE5475246E174218B5C8C57E37DE458BEDA766A37F9254B7 X-C1DE0DAB: 0D63561A33F958A5F6387528CCA3732E5002B1117B3ED6961E2271F0067172B2CCE9A60C8CB01D7C823CB91A9FED034534781492E4B8EEAD27E9584FBD6BDD31BDAD6C7F3747799A X-C8649E89: 1C3962B70DF3F0ADBF74143AD284FC7177DD89D51EBB7742424CF958EAFF5D571004E42C50DC4CA955A7F0CF078B5EC49A30900B95165D34A5112A9AECFE11B1728CE85DD7B14D63730225086C8591FF13694E58F74A061957A757897C817C541D7E09C32AA3244CB3353EB7EA38216777DD89D51EBB7742171C4C61F4FE6127EA455F16B58544A2E30DDF7C44BCB90DA5AE236DF995FB59978A700BF655EAEEED6A17656DB59BCAD427812AF56FC65B X-D57D3AED: 3ZO7eAau8CL7WIMRKs4sN3D3tLDjz0dLbV79QFUyzQ2Ujvy7cMT6pYYqY16iZVKkSc3dCLJ7zSJH7+u4VD18S7Vl4ZUrpaVfd2+vE6kuoey4m4VkSEu53w8ahmwBjZKM/YPHZyZHvz5uv+WouB9+ObcCpyrx6l7KImUglyhkEat/+ysWwi0gdhEs0JGjl6ggRWTy1haxBpVdbIX1nthFXMZebaIdHP2ghjoIc/363UZI6Kf1ptIMVSykAyseJQ6/59wXdN7PC88= X-Mailru-Sender: 520A125C2F17F0B1E52FEF5D219D6140CD627DAF99E4488A3DE06ABAFEAF6705E3DC47FACA10E7990152A3D17938EB451EB5A0BCEC6A560B3DDE9B364B0DF289BE2DA36745F2EEB5CEBA01FB949A1F1EEAB4BC95F72C04283CDA0F3B3F5B9367 X-Mras: Ok Subject: Re: [Tarantool-patches] [PATCH luajit] ARM64: Fix code generation for IR_SLOAD with typecheck + conversion. X-BeenThere: tarantool-patches@dev.tarantool.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Sergey Bronnikov via Tarantool-patches Reply-To: Sergey Bronnikov Errors-To: tarantool-patches-bounces@dev.tarantool.org Sender: "Tarantool-patches" This is a multi-part message in MIME format. --------------I4WmE5X6fsqGuOEo04Jdbbzp Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Sergey, thanks for the patch! LGTM On 6/3/25 21:53, Sergey Kaplun wrote: > From: Mike Pall > > Reported by memcorrupt. > > (cherry picked from commit 564147f518af5a5d8985d9e09fc3a768231f4e75) > > The assembling of the SLOAD with typecheck and conversion from number to > int misses the corresponding move for emitting conversion to the FPR > during assembling. > > Consider the following SLOAD: > | 0006 x28 > int SLOAD #4 TCI > > Which results in the following mcode before the patch: > | ldr x28, [x3, #16] > | cmp x2, x28, lsr #32 > | bls 0x62d2fda0 ->0 > | ; here missing the move to d31 > | fcvtzs w28, d31 > | scvtf d30, w28 > | fcmp d30, d31 > | bne 0x62d2fda0 ->0 > > Instead of the expected: > | ldr x28, [x3, #16] > | cmp x2, x28, lsr #32 > | bls 0x7bacfda0 ->0 > | fmov d31, x28 > | fcvtzs w28, d31 > | scvtf d30, w28 > | fcmp d30, d31 > | bne 0x7bacfda0 ->0 > > Due to the incorrect check of the condition inside the `asm_sload()`, > which excluded the `IRSLOAD_CONVERT` flag. It may lead to inconsistent > behaviour on the trace. This patch fixes the check by comparing the > source and destination registers instead. > > Sergey Kaplun: > * added the description and the test for the problem > > Part of tarantool/tarantool#11278 > --- > > Branch:https://github.com/tarantool/luajit/tree/skaplun/lj-917-arm64-sload-typecheck-conversion > Related issues: > *https://github.com/LuaJIT/LuaJIT/issues/917 > *https://github.com/tarantool/tarantool/issues/11278 > > src/lj_asm_arm64.h | 2 +- > ...-arm64-sload-typecheck-conversion.test.lua | 58 +++++++++++++++++++ > 2 files changed, 59 insertions(+), 1 deletion(-) > create mode 100644 test/tarantool-tests/lj-917-arm64-sload-typecheck-conversion.test.lua > > diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h > index 9b27473c..6c7b011f 100644 > --- a/src/lj_asm_arm64.h > +++ b/src/lj_asm_arm64.h > @@ -1177,7 +1177,7 @@ dotypecheck: > tmp = ra_scratch(as, allow); > rset_clear(allow, tmp); > } > - if (ra_hasreg(dest) && irt_isnum(t) && !(ir->op2 & IRSLOAD_CONVERT)) > + if (ra_hasreg(dest) && tmp != dest) > emit_dn(as, A64I_FMOV_D_R, (dest & 31), tmp); > /* Need type check, even if the load result is unused. */ > asm_guardcc(as, irt_isnum(t) ? CC_LS : CC_NE); > diff --git a/test/tarantool-tests/lj-917-arm64-sload-typecheck-conversion.test.lua b/test/tarantool-tests/lj-917-arm64-sload-typecheck-conversion.test.lua > new file mode 100644 > index 00000000..9cf5cda0 > --- /dev/null > +++ b/test/tarantool-tests/lj-917-arm64-sload-typecheck-conversion.test.lua > @@ -0,0 +1,58 @@ > +local tap = require('tap') > +-- Test file to demonstrate the incorrect JIT assembling of > +-- `IR_SLOAD` with typecheck and conversion to integer from > +-- number. > +-- See alsohttps://github.com/LuaJIT/LuaJIT/issues/917. > +local test = tap.test('lj-917-arm64-sload-typecheck-conversion'):skipcond({ > + ['Test requires JIT enabled'] = not jit.status(), > +}) > + > +test:plan(1) > + > +jit.opt.start('hotloop=1') > + > +local results = {} > + > +-- Use the following mathematics on a huge number not fitting into > +-- an int to be sure that all 3 control numbers (start, stop, > +-- step) of the loop should be non-integers to avoid fallback to > +-- the `lj_vmeta_for()` and narrowing in the `lj_meta_for()` > +-- (see for details). > +local NOT_INT = 2 ^ 32 > + > +-- The interesting for us SLOAD is the loading of the start index: > +-- | 0006 x28 > int SLOAD #4 TCI > +-- > +-- Which results in the following mcode before the patch: > +-- | ldr x28, [x3, #16] > +-- | cmp x2, x28, lsr #32 > +-- | bls 0x62d2fda0 ->0 > +-- | ; here missing the move to d31 > +-- | fcvtzs w28, d31 > +-- | scvtf d30, w28 > +-- | fcmp d30, d31 > +-- | bne 0x62d2fda0 ->0 > +-- > +-- Instead of the expected: > +-- | ldr x28, [x3, #16] > +-- | cmp x2, x28, lsr #32 > +-- | bls 0x7bacfda0 ->0 > +-- | fmov d31, x28 > +-- | fcvtzs w28, d31 > +-- | scvtf d30, w28 > +-- | fcmp d30, d31 > +-- | bne 0x7bacfda0 ->0 > + > +-- At this moment d31 contains the value of the `step`, so `step` > +-- should be >= `stop` to obtain inconsistency (the too early loop > +-- end with the last `i` value equals to `step`). > +-- The resulting loop is: > +-- | for i = -4, -1, 1 do > +for i = -4 + NOT_INT * 0, -1 + NOT_INT * 0, 1 + NOT_INT * 0 do > + results[-i] = true > +end > + > +-- Expected {true, true, true, true}, since -4 is a start. > +test:samevalues(results, 'correct SLOAD TC assembling') > + > +test:done(true) --------------I4WmE5X6fsqGuOEo04Jdbbzp Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 7bit

Sergey,

thanks for the patch! LGTM

On 6/3/25 21:53, Sergey Kaplun wrote:
From: Mike Pall <mike>

Reported by memcorrupt.

(cherry picked from commit 564147f518af5a5d8985d9e09fc3a768231f4e75)

The assembling of the SLOAD with typecheck and conversion from number to
int misses the corresponding move for emitting conversion to the FPR
during assembling.

Consider the following SLOAD:
| 0006 x28   >  int SLOAD  #4    TCI

Which results in the following mcode before the patch:
|  ldr   x28, [x3, #16]
|  cmp   x2, x28, lsr #32
|  bls   0x62d2fda0        ->0
|                              ; here missing the move to d31
|  fcvtzs w28, d31
|  scvtf d30, w28
|  fcmp  d30, d31
|  bne   0x62d2fda0        ->0

Instead of the expected:
|  ldr   x28, [x3, #16]
|  cmp   x2, x28, lsr #32
|  bls   0x7bacfda0        ->0
|  fmov  d31, x28
|  fcvtzs w28, d31
|  scvtf d30, w28
|  fcmp  d30, d31
|  bne   0x7bacfda0        ->0

Due to the incorrect check of the condition inside the `asm_sload()`,
which excluded the `IRSLOAD_CONVERT` flag. It may lead to inconsistent
behaviour on the trace. This patch fixes the check by comparing the
source and destination registers instead.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#11278
---

Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-917-arm64-sload-typecheck-conversion
Related issues:
* https://github.com/LuaJIT/LuaJIT/issues/917
* https://github.com/tarantool/tarantool/issues/11278

 src/lj_asm_arm64.h                            |  2 +-
 ...-arm64-sload-typecheck-conversion.test.lua | 58 +++++++++++++++++++
 2 files changed, 59 insertions(+), 1 deletion(-)
 create mode 100644 test/tarantool-tests/lj-917-arm64-sload-typecheck-conversion.test.lua

diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
index 9b27473c..6c7b011f 100644
--- a/src/lj_asm_arm64.h
+++ b/src/lj_asm_arm64.h
@@ -1177,7 +1177,7 @@ dotypecheck:
       tmp = ra_scratch(as, allow);
       rset_clear(allow, tmp);
     }
-    if (ra_hasreg(dest) && irt_isnum(t) && !(ir->op2 & IRSLOAD_CONVERT))
+    if (ra_hasreg(dest) && tmp != dest)
       emit_dn(as, A64I_FMOV_D_R, (dest & 31), tmp);
     /* Need type check, even if the load result is unused. */
     asm_guardcc(as, irt_isnum(t) ? CC_LS : CC_NE);
diff --git a/test/tarantool-tests/lj-917-arm64-sload-typecheck-conversion.test.lua b/test/tarantool-tests/lj-917-arm64-sload-typecheck-conversion.test.lua
new file mode 100644
index 00000000..9cf5cda0
--- /dev/null
+++ b/test/tarantool-tests/lj-917-arm64-sload-typecheck-conversion.test.lua
@@ -0,0 +1,58 @@
+local tap = require('tap')
+-- Test file to demonstrate the incorrect JIT assembling of
+-- `IR_SLOAD` with typecheck and conversion to integer from
+-- number.
+-- See also https://github.com/LuaJIT/LuaJIT/issues/917.
+local test = tap.test('lj-917-arm64-sload-typecheck-conversion'):skipcond({
+  ['Test requires JIT enabled'] = not jit.status(),
+})
+
+test:plan(1)
+
+jit.opt.start('hotloop=1')
+
+local results = {}
+
+-- Use the following mathematics on a huge number not fitting into
+-- an int to be sure that all 3 control numbers (start, stop,
+-- step) of the loop should be non-integers to avoid fallback to
+-- the `lj_vmeta_for()` and narrowing in the `lj_meta_for()`
+-- (see <src/vm_arm64.dasm> for details).
+local NOT_INT = 2 ^ 32
+
+-- The interesting for us SLOAD is the loading of the start index:
+-- | 0006 x28   >  int SLOAD  #4    TCI
+--
+-- Which results in the following mcode before the patch:
+-- |  ldr   x28, [x3, #16]
+-- |  cmp   x2, x28, lsr #32
+-- |  bls   0x62d2fda0        ->0
+-- |                              ; here missing the move to d31
+-- |  fcvtzs w28, d31
+-- |  scvtf d30, w28
+-- |  fcmp  d30, d31
+-- |  bne   0x62d2fda0        ->0
+--
+-- Instead of the expected:
+-- |  ldr   x28, [x3, #16]
+-- |  cmp   x2, x28, lsr #32
+-- |  bls   0x7bacfda0        ->0
+-- |  fmov  d31, x28
+-- |  fcvtzs w28, d31
+-- |  scvtf d30, w28
+-- |  fcmp  d30, d31
+-- |  bne   0x7bacfda0        ->0
+
+-- At this moment d31 contains the value of the `step`, so `step`
+-- should be >= `stop` to obtain inconsistency (the too early loop
+-- end with the last `i` value equals to `step`).
+-- The resulting loop is:
+-- | for i = -4, -1, 1 do
+for i = -4 + NOT_INT * 0, -1 + NOT_INT * 0, 1 + NOT_INT * 0 do
+  results[-i] = true
+end
+
+-- Expected {true, true, true, true}, since -4 is a start.
+test:samevalues(results, 'correct SLOAD TC assembling')
+
+test:done(true)
--------------I4WmE5X6fsqGuOEo04Jdbbzp--