From: Sergey Bronnikov via Tarantool-patches
Reply-To: Sergey Bronnikov
To: Sergey Kaplun
Cc: tarantool-patches@dev.tarantool.org
Date: Fri, 16 May 2025 20:07:45 +0300
Subject: Re: [Tarantool-patches] [PATCH luajit] ARM64: Fix IR_SLOAD assembly.
Message-ID: <91a4423c-9968-4176-b4dd-4f5045dc7ee4@tarantool.org>
In-Reply-To: <20250514115656.13243-1-skaplun@tarantool.org>

Hello, Sergey,

thanks for the patch! Please see my comments.

Sergey

On 5/14/25 14:56, Sergey Kaplun wrote:
> From: Mike Pall <mike>
>
> Reported by Gate88.
>
> (cherry picked from commit 6c4826f12c4d33b8b978004bc681eb1eef2be977)
>
> The issue is in the case when IR SLOAD is unused on a trace, persists

typo: s/, /, it/

> only for typecheck, and has the `num` type. In this case, the `dest`
> register is `RID_NONE`. Hence, the `fmov` instruction is emitted

There are two `fmov` instructions in the generated assembler output; which one do you mean?

55690afedc  fmov  d0, x30
55690afee0  ldr   w28, [x19, #8]
55690afee4  ldur  x3, [x19, #-16]
55690afee8  and   x3, x3, #0x7fffffffffff
55690afeec  ldr   x27, [x3, #48]
55690afef0  ldr   x8, [x27, #32]
55690afef4  sub   x9, x8, x19
55690afef8  cmp   x9, #56
55690afefc  bls   0x690affe8    ->0
55690aff00  ldr   x6, [x8]
55690aff04  asr   x27, x6, #47
55690aff08  cmn   x27, #12
55690aff0c  bne   0x690affe8    ->0
55690aff10  and   x6, x6, #0x7fffffffffff
55690aff14  ldr   w7, [x6, #52]
55690aff18  cmp   w7, #1
55690aff1c  bne   0x690affe8    ->0
55690aff20  ldr   x5, [x6, #40]
55690aff24  ldr   x27, [x5, #32]
55690aff28  cmp   x27, x1
55690aff2c  bne   0x690affe8    ->0
55690aff30  ldr   x27, [x5, #24]
55690aff34  cmp   x0, x27, lsr #32
55690aff38  bls   0x690affe8    ->0
55690aff3c  fmov  d2, x27

Or (probably) you are just referring to the C code in `asm_sload` that emits `fmov`:

    if (ra_hasreg(dest) && irt_isnum(t) && !(ir->op2 & IRSLOAD_CONVERT))
      emit_dn(as, A64I_FMOV_D_R, (dest & 31), tmp);

Sorry, it is not easy for me to match the IR with the assembly, so I believe more details are required in the description.

I just want to understand the difference in the emitted assembler before and after the patch.

> unconditionally, where the destination register is `d0` (`RID_NONE &
> 31`). So, the value of this register is spoiled. If it holds any value
> evaluated before and used after this SLOAD, it leads to incorrect
> behaviour.
>
> This patch adds the check that the register is in use before emitting
> the instruction.
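
To check my understanding of the register arithmetic described above, here is a minimal standalone sketch in C. It is not taken from the patch: it assumes that `RID_NONE` is 0x80 and that `ra_hasreg()` tests that bit, as I believe `lj_target.h` defines them, and that `d0` has register ID 32 on ARM64. The helpers `fpr_field()`, `should_emit_fmov()` and `check_unused_sload()` are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Assumed to mirror LuaJIT's lj_target.h; not copied from the patch. */
#define RID_NONE 0x80
#define ra_hasreg(r) (!((r) & RID_NONE))

/* Assumption: on ARM64 the register ID of d0 is 32 (the first FPR). */
#define RID_D0_ASSUMED 32

/* Hypothetical helper: the 5-bit register field given to emit_dn(). */
unsigned fpr_field(unsigned dest)
{
  return dest & 31;
}

/* Hypothetical helper: the check added by the patch, in isolation. */
int should_emit_fmov(unsigned dest)
{
  return ra_hasreg(dest);
}

/* Self-check of the claims in the commit message. */
void check_unused_sload(void)
{
  /* RID_NONE aliases d0 once it is masked to 5 bits. */
  assert(fpr_field(RID_NONE) == fpr_field(RID_D0_ASSUMED));
  /* The new guard rejects RID_NONE but accepts a real register. */
  assert(!should_emit_fmov(RID_NONE));
  assert(should_emit_fmov(RID_D0_ASSUMED));
}
```

If these assumptions hold, calling `check_unused_sload()` from a `main()` passes silently, which matches the description: without the `ra_hasreg(dest)` check, an unused SLOAD would encode `d0` as the `fmov` destination.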

> Sergey Kaplun:
> * added the description and the test for the problem
>
> Part of tarantool/tarantool#11278
> ---
>
> Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-903-arm64-unused-number-sload-typecheck
> Related issues:
> * https://github.com/LuaJIT/LuaJIT/issues/903
> * https://github.com/tarantool/tarantool/issues/11278
>
>  src/lj_asm_arm64.h                            |  2 +-
>  ...m64-unused-number-sload-typecheck.test.lua | 45 +++++++++++++++++++
>  2 files changed, 46 insertions(+), 1 deletion(-)
>  create mode 100644 test/tarantool-tests/lj-903-arm64-unused-number-sload-typecheck.test.lua
>
> diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
> index 554bb60a..9b27473c 100644
> --- a/src/lj_asm_arm64.h
> +++ b/src/lj_asm_arm64.h
> @@ -1177,7 +1177,7 @@ dotypecheck:
>        tmp = ra_scratch(as, allow);
>        rset_clear(allow, tmp);
>      }
> -    if (irt_isnum(t) && !(ir->op2 & IRSLOAD_CONVERT))
> +    if (ra_hasreg(dest) && irt_isnum(t) && !(ir->op2 & IRSLOAD_CONVERT))
>        emit_dn(as, A64I_FMOV_D_R, (dest & 31), tmp);
>      /* Need type check, even if the load result is unused. */
>      asm_guardcc(as, irt_isnum(t) ? CC_LS : CC_NE);
> diff --git a/test/tarantool-tests/lj-903-arm64-unused-number-sload-typecheck.test.lua b/test/tarantool-tests/lj-903-arm64-unused-number-sload-typecheck.test.lua
> new file mode 100644
> index 00000000..748b88e2
> --- /dev/null
> +++ b/test/tarantool-tests/lj-903-arm64-unused-number-sload-typecheck.test.lua
> @@ -0,0 +1,45 @@
> +local tap = require('tap')
> +-- Test file to demonstrate the incorrect JIT assembling of unused
> +-- `IR_SLOAD` with number type on arm64.
> +-- See also https://github.com/LuaJIT/LuaJIT/issue/903.

typo: s/issue/issues/

> +local test = tap.test('lj-903-arm64-unused-number-sload-typecheck'):skipcond({
> +  ['Test requires JIT enabled'] = not jit.status(),
> +})
> +
> +test:plan(1)
> +
> +-- Just use any different numbers (but not integers to avoid
> +-- integer IR type).
> +local SLOT = 0.1
> +local MARKER_VALUE = 4.2
> +-- XXX: Special mapping to avoid folding and removing always true
> +-- comparison.
> +local anchor = {marker = MARKER_VALUE}
> +
> +-- Special function to inline on trace to generate SLOAD
> +-- typecheck.
> +local function sload_unused(x)
> +  return x
> +end
> +
> +-- The additional wrapper to use stackslots in the function.
> +local function test_sload()
> +  local sload = SLOT
> +  for _ = 1, 4 do
> +    -- This line should use the `d0` register.
> +    local marker = anchor.marker - MARKER_VALUE
> +    -- This generates unused IR_SLOAD with typecheck (number).
> +    -- Before the patch, it occasionally overwrites the `d0`
> +    -- register and causes the execution of the branch.
> +    sload_unused(sload)
> +    if marker ~= 0 then
> +      return false
> +    end
> +  end
> +  return true
> +end
> +
> +jit.opt.start('hotloop=1')
> +test:ok(test_sload(), 'correct SLOAD assembling')
> +
> +test:done(true)
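
For what it is worth, this is how I read the scenario in the test, written as a toy model in C. It is only a sketch under my own assumptions, not a description of how the assembler really works: the FP register file is modelled as a plain array, `marker` is assumed to live in `d0` as the test comment says, and `FprFile`, `sload_num()`, `run_loop_body()` and `check_model()` are hypothetical names. `RID_NONE` and `ra_hasreg()` are assumed to match `lj_target.h`.

```c
#include <assert.h>
#include <string.h>

/* Assumed to mirror LuaJIT's lj_target.h. */
#define RID_NONE 0x80
#define ra_hasreg(r) (!((r) & RID_NONE))

/* Toy model of the FP registers d0..d31 (hypothetical). */
typedef struct { double d[32]; } FprFile;

/* Model of a number SLOAD: `slot` is the value read from the stack slot.
** With patched == 0 the register write is unconditional, as before the fix. */
static void sload_num(FprFile *f, unsigned dest, double slot, int patched)
{
  if (!patched || ra_hasreg(dest))
    f->d[dest & 31] = slot;  /* Stands in for `fmov d<dest & 31>, tmp`. */
}

/* One loop iteration of test_sload(); returns the marker seen afterwards. */
double run_loop_body(int patched)
{
  FprFile f;
  memset(&f, 0, sizeof(f));
  f.d[0] = 4.2 - 4.2;                     /* marker kept in d0 */
  sload_num(&f, RID_NONE, 0.1, patched);  /* sload_unused(sload) */
  return f.d[0];
}

/* The marker survives only when the guard is present. */
void check_model(void)
{
  assert(run_loop_body(1) == 0.0);
  assert(run_loop_body(0) != 0.0);
}
```

In this model the unpatched variant leaves the value of `SLOT` in `d0`, so `marker ~= 0` becomes true and the test function returns `false`. If that is what happens on a real trace, a sentence like this in the commit message would answer my question above.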