* [Tarantool-patches] [PATCH luajit] ARM64: Fix assembly of HREFK.
@ 2025-06-12 9:36 Sergey Kaplun via Tarantool-patches
0 siblings, 0 replies; 6+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-06-12 9:36 UTC (permalink / raw)
To: Sergey Bronnikov; +Cc: tarantool-patches
From: Mike Pall <mike>
Reported by caohongqing.
Fix contributed by Peter Cawley.
(cherry picked from commit 8fbd576fb9414a5fa70dfa6069733d3416a78269)
`asm_hrefk()` uses the check for the offset for the corresponding node
structure. However, the target load is performed from its inner `key`
field with the offset 8. In the case of a huge table, it is possible
that the offset of the node (4095 * 8) is less than 4096 * 8 and can be
emitted via the corresponding instruction as an immediate offset, but
the offset of the `key` field is not. This leads to the corresponding
assertion failure in `emit_lso()`.
This patch fixes this behaviour by the correct check.
Sergey Kaplun:
* added the description and the test for the problem
Part of tarantool/tarantool#11278
---
Related issues:
* https://github.com/LuaJIT/LuaJIT/issues/1026
* https://github.com/tarantool/tarantool/issues/11278
Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-1026-arm64-invalid-hrefk-offset-check
src/lj_asm_arm64.h | 2 +-
...-arm64-invalid-hrefk-offset-check.test.lua | 48 +++++++++++++++++++
2 files changed, 49 insertions(+), 1 deletion(-)
create mode 100644 test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua
diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
index 6c7b011f..a7f059a2 100644
--- a/src/lj_asm_arm64.h
+++ b/src/lj_asm_arm64.h
@@ -885,7 +885,7 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
IRIns *irkey = IR(kslot->op1);
int32_t ofs = (int32_t)(kslot->op2 * sizeof(Node));
int32_t kofs = ofs + (int32_t)offsetof(Node, key);
- int bigofs = !emit_checkofs(A64I_LDRx, ofs);
+ int bigofs = !emit_checkofs(A64I_LDRx, kofs);
Reg dest = (ra_used(ir) || bigofs) ? ra_dest(as, ir, RSET_GPR) : RID_NONE;
Reg node = ra_alloc1(as, ir->op1, RSET_GPR);
Reg key, idx = node;
diff --git a/test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua b/test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua
new file mode 100644
index 00000000..de243814
--- /dev/null
+++ b/test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua
@@ -0,0 +1,48 @@
+local tap = require('tap')
+
+-- Test file to demonstrate LuaJIT misbehaviour when assembling
+-- HREFK instruction on arm64 with the huge offset.
+-- See also: https://github.com/LuaJIT/LuaJIT/issues/1026.
+local test = tap.test('lj-1026-arm64-invalid-hrefk-offset-check'):skipcond({
+ ['Test requires JIT enabled'] = not jit.status(),
+})
+
+test:plan(1)
+
+-- The assertion fails since in HREFK we are checking the offset
+-- from the hslots of the table of the Node structure itself
+-- instead of its inner field `key` (with additional 8 bytes).
+-- So to test this, we generate a big table with constant keys
+-- and compile a trace for each HREFK possible.
+
+local big_tab = {}
+-- The map of the characters to generate constant string keys.
+-- The offset of the node should be 4096 * 8. It takes at least
+-- 1365 keys to hit this value. The maximum possible slots in the
+-- hash part is 2048, so to fill it with the maximum density (with
+-- the way below), we need 45 * 45 = 2025 keys.
+local chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRS'
+for c1 in chars:gmatch('.') do
+ for c2 in chars:gmatch('.') do
+ big_tab[c1 .. c2] = 1
+ end
+end
+
+jit.opt.start('hotloop=1')
+
+-- Generate bunch of traces.
+for c1 in chars:gmatch('.') do
+ for c2 in chars:gmatch('.') do
+ loadstring([=[
+ local t = ...
+ for i = 1, 4 do
+ -- HREFK generation.
+ t[ ']=] .. c1 .. c2 .. [=[' ] = i
+ end
+ ]=])(big_tab)
+ end
+end
+
+test:ok(true, 'no assertion failed')
+
+test:done(true)
--
2.49.0
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Tarantool-patches] [PATCH luajit] ARM64: Fix assembly of HREFK.
@ 2021-10-27 13:02 Sergey Kaplun via Tarantool-patches
2022-03-30 10:26 ` sergos via Tarantool-patches
2022-06-30 12:11 ` Igor Munkin via Tarantool-patches
0 siblings, 2 replies; 6+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-10-27 13:02 UTC (permalink / raw)
To: Sergey Ostanevich, Igor Munkin; +Cc: tarantool-patches
From: Mike Pall <mike>
Reported by Jason Teplitz.
(cherry picked from commit 06cd9fce7df440323647174f1ca4a01281ec8acd)
LDR instruction with register offset [1] has a restriction, that the
destination register must be different from the register used for offset
base. HREFK IR emits LDR instruction for a load of node key to a
register. Base register to offset contains a node address. The register
holding the node address is not excluded from a allow set, when loading
the key value to a new register, and may be chosen by the register
allocator as a destination for the key value, which violates the
aforementioned restriction.
This patch excludes the node register from the allowing set in the
allocation of register for a key value.
Sergey Kaplun:
* added the description and the test for the problem
[1]: https://developer.arm.com/documentation/dui0489/c/arm-and-thumb-instructions/memory-access-instructions/ldr-and-str--register-offset-
Part of tarantool/tarantool#6548
---
Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-357-arm64-hrefk
Tarantool branch: https://github.com/tarantool/tarantool/tree/skaplun/lj-357-arm64-hrefk
LuaJIT issue: https://github.com/LuaJIT/LuaJIT/issues/357
Issue: https://github.com/tarantool/tarantool/issues/6548
src/lj_asm_arm64.h | 11 ++++----
.../lj-357-arm64-hrefk.test.lua | 27 +++++++++++++++++++
2 files changed, 32 insertions(+), 6 deletions(-)
create mode 100644 test/tarantool-tests/lj-357-arm64-hrefk.test.lua
diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
index cc8c0c9d..da0ee4bb 100644
--- a/src/lj_asm_arm64.h
+++ b/src/lj_asm_arm64.h
@@ -869,14 +869,12 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
int32_t ofs = (int32_t)(kslot->op2 * sizeof(Node));
int32_t kofs = ofs + (int32_t)offsetof(Node, key);
int bigofs = !emit_checkofs(A64I_LDRx, ofs);
- RegSet allow = RSET_GPR;
Reg dest = (ra_used(ir) || bigofs) ? ra_dest(as, ir, RSET_GPR) : RID_NONE;
- Reg node = ra_alloc1(as, ir->op1, allow);
- Reg key = ra_scratch(as, rset_clear(allow, node));
- Reg idx = node;
+ Reg node = ra_alloc1(as, ir->op1, RSET_GPR);
+ Reg key, idx = node;
+ RegSet allow = rset_exclude(RSET_GPR, node);
uint64_t k;
lua_assert(ofs % sizeof(Node) == 0);
- rset_clear(allow, key);
if (bigofs) {
idx = dest;
rset_clear(allow, dest);
@@ -892,7 +890,8 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
} else {
k = ((uint64_t)irt_toitype(irkey->t) << 47) | (uint64_t)ir_kgc(irkey);
}
- emit_nm(as, A64I_CMPx, key, ra_allock(as, k, allow));
+ key = ra_scratch(as, allow);
+ emit_nm(as, A64I_CMPx, key, ra_allock(as, k, rset_exclude(allow, key)));
emit_lso(as, A64I_LDRx, key, idx, kofs);
if (bigofs)
emit_opk(as, A64I_ADDx, dest, node, ofs, RSET_GPR);
diff --git a/test/tarantool-tests/lj-357-arm64-hrefk.test.lua b/test/tarantool-tests/lj-357-arm64-hrefk.test.lua
new file mode 100644
index 00000000..200d29f0
--- /dev/null
+++ b/test/tarantool-tests/lj-357-arm64-hrefk.test.lua
@@ -0,0 +1,27 @@
+local tap = require('tap')
+
+-- Test file to demonstrate the incorrect JIT behaviour for HREFK
+-- IR compilation on arm64.
+-- See also https://github.com/LuaJIT/LuaJIT/issues/357.
+local test = tap.test('lj-357-arm64-hrefk')
+test:plan(2)
+
+jit.opt.start('hotloop=1', 'hotexit=1')
+
+local t = {hrefk = 0}
+
+-- XXX: Need to generate a bunch of side traces (starts a new one
+-- when the hmask is changed) to wait, when the register allocator
+-- chooses the same register as a base register for offset and
+-- destination in LDR instruction.
+local START = 2e3
+local STOP = 1
+for i = START, STOP, -1 do
+ t.hrefk = t.hrefk - 1
+ t[t.hrefk] = i
+end
+
+test:is(t.hrefk, -START)
+test:is(t[t.hrefk], STOP)
+
+os.exit(test:check() and 0 or 1)
--
2.31.0
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit] ARM64: Fix assembly of HREFK.
2021-10-27 13:02 Sergey Kaplun via Tarantool-patches
@ 2022-03-30 10:26 ` sergos via Tarantool-patches
2022-04-04 8:55 ` Sergey Kaplun via Tarantool-patches
2022-06-30 12:11 ` Igor Munkin via Tarantool-patches
1 sibling, 1 reply; 6+ messages in thread
From: sergos via Tarantool-patches @ 2022-03-30 10:26 UTC (permalink / raw)
To: Sergey Kaplun; +Cc: tarantool-patches
Hi!
I see the test is guess-based in term that number of traces should
be enough to trigger the situation. Is there any confirmation it
will happen at particular traces count?
Otherwise LGTM.
Sergos
> On 27 Oct 2021, at 16:02, Sergey Kaplun <skaplun@tarantool.org> wrote:
>
> From: Mike Pall <mike>
>
> Reported by Jason Teplitz.
>
> (cherry picked from commit 06cd9fce7df440323647174f1ca4a01281ec8acd)
>
> LDR instruction with register offset [1] has a restriction, that the
> destination register must be different from the register used for offset
> base. HREFK IR emits LDR instruction for a load of node key to a
> register. Base register to offset contains a node address. The register
> holding the node address is not excluded from a allow set, when loading
> the key value to a new register, and may be chosen by the register
> allocator as a destination for the key value, which violates the
> aforementioned restriction.
>
> This patch excludes the node register from the allowing set in the
> allocation of register for a key value.
>
> Sergey Kaplun:
> * added the description and the test for the problem
>
> [1]: https://developer.arm.com/documentation/dui0489/c/arm-and-thumb-instructions/memory-access-instructions/ldr-and-str--register-offset-
>
> Part of tarantool/tarantool#6548
> ---
>
> Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-357-arm64-hrefk
> Tarantool branch: https://github.com/tarantool/tarantool/tree/skaplun/lj-357-arm64-hrefk
> LuaJIT issue: https://github.com/LuaJIT/LuaJIT/issues/357
> Issue: https://github.com/tarantool/tarantool/issues/6548
>
> src/lj_asm_arm64.h | 11 ++++----
> .../lj-357-arm64-hrefk.test.lua | 27 +++++++++++++++++++
> 2 files changed, 32 insertions(+), 6 deletions(-)
> create mode 100644 test/tarantool-tests/lj-357-arm64-hrefk.test.lua
>
> diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
> index cc8c0c9d..da0ee4bb 100644
> --- a/src/lj_asm_arm64.h
> +++ b/src/lj_asm_arm64.h
> @@ -869,14 +869,12 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
> int32_t ofs = (int32_t)(kslot->op2 * sizeof(Node));
> int32_t kofs = ofs + (int32_t)offsetof(Node, key);
> int bigofs = !emit_checkofs(A64I_LDRx, ofs);
> - RegSet allow = RSET_GPR;
> Reg dest = (ra_used(ir) || bigofs) ? ra_dest(as, ir, RSET_GPR) : RID_NONE;
> - Reg node = ra_alloc1(as, ir->op1, allow);
> - Reg key = ra_scratch(as, rset_clear(allow, node));
> - Reg idx = node;
> + Reg node = ra_alloc1(as, ir->op1, RSET_GPR);
> + Reg key, idx = node;
> + RegSet allow = rset_exclude(RSET_GPR, node);
> uint64_t k;
> lua_assert(ofs % sizeof(Node) == 0);
> - rset_clear(allow, key);
> if (bigofs) {
> idx = dest;
> rset_clear(allow, dest);
> @@ -892,7 +890,8 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
> } else {
> k = ((uint64_t)irt_toitype(irkey->t) << 47) | (uint64_t)ir_kgc(irkey);
> }
> - emit_nm(as, A64I_CMPx, key, ra_allock(as, k, allow));
> + key = ra_scratch(as, allow);
> + emit_nm(as, A64I_CMPx, key, ra_allock(as, k, rset_exclude(allow, key)));
> emit_lso(as, A64I_LDRx, key, idx, kofs);
> if (bigofs)
> emit_opk(as, A64I_ADDx, dest, node, ofs, RSET_GPR);
> diff --git a/test/tarantool-tests/lj-357-arm64-hrefk.test.lua b/test/tarantool-tests/lj-357-arm64-hrefk.test.lua
> new file mode 100644
> index 00000000..200d29f0
> --- /dev/null
> +++ b/test/tarantool-tests/lj-357-arm64-hrefk.test.lua
> @@ -0,0 +1,27 @@
> +local tap = require('tap')
> +
> +-- Test file to demonstrate the incorrect JIT behaviour for HREFK
> +-- IR compilation on arm64.
> +-- See also https://github.com/LuaJIT/LuaJIT/issues/357.
> +local test = tap.test('lj-357-arm64-hrefk')
> +test:plan(2)
> +
> +jit.opt.start('hotloop=1', 'hotexit=1')
> +
> +local t = {hrefk = 0}
> +
> +-- XXX: Need to generate a bunch of side traces (starts a new one
> +-- when the hmask is changed) to wait, when the register allocator
> +-- chooses the same register as a base register for offset and
> +-- destination in LDR instruction.
> +local START = 2e3
> +local STOP = 1
> +for i = START, STOP, -1 do
> + t.hrefk = t.hrefk - 1
> + t[t.hrefk] = i
> +end
> +
> +test:is(t.hrefk, -START)
> +test:is(t[t.hrefk], STOP)
> +
> +os.exit(test:check() and 0 or 1)
> --
> 2.31.0
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit] ARM64: Fix assembly of HREFK.
2022-03-30 10:26 ` sergos via Tarantool-patches
@ 2022-04-04 8:55 ` Sergey Kaplun via Tarantool-patches
2022-06-29 9:16 ` Igor Munkin via Tarantool-patches
0 siblings, 1 reply; 6+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2022-04-04 8:55 UTC (permalink / raw)
To: sergos; +Cc: tarantool-patches
Hi, Sergos!
Thanks for the review!
On 30.03.22, sergos wrote:
> Hi!
>
> I see the test is guess-based in term that number of traces should
> be enough to trigger the situation. Is there any confirmation it
> will happen at particular traces count?
Register allocator is deterministic (IINM), so yes, this test isn't
flaky. But the amount of iterations should be large enough to create the
necessary bugged trace.
>
> Otherwise LGTM.
>
> Sergos
>
<snipped>
>
--
Best regards,
Sergey Kaplun
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit] ARM64: Fix assembly of HREFK.
2021-10-27 13:02 Sergey Kaplun via Tarantool-patches
2022-03-30 10:26 ` sergos via Tarantool-patches
@ 2022-06-30 12:11 ` Igor Munkin via Tarantool-patches
1 sibling, 0 replies; 6+ messages in thread
From: Igor Munkin via Tarantool-patches @ 2022-06-30 12:11 UTC (permalink / raw)
To: Sergey Kaplun; +Cc: tarantool-patches
Sergey,
I've checked the patch into all long-term branches with ARM64 support in
tarantool/luajit and bumped a new version in master, 2.10.
On 27.10.21, Sergey Kaplun wrote:
> From: Mike Pall <mike>
>
> Reported by Jason Teplitz.
>
> (cherry picked from commit 06cd9fce7df440323647174f1ca4a01281ec8acd)
>
> LDR instruction with register offset [1] has a restriction, that the
> destination register must be different from the register used for offset
> base. HREFK IR emits LDR instruction for a load of node key to a
> register. Base register to offset contains a node address. The register
> holding the node address is not excluded from a allow set, when loading
> the key value to a new register, and may be chosen by the register
> allocator as a destination for the key value, which violates the
> aforementioned restriction.
>
> This patch excludes the node register from the allowing set in the
> allocation of register for a key value.
>
> Sergey Kaplun:
> * added the description and the test for the problem
>
> [1]: https://developer.arm.com/documentation/dui0489/c/arm-and-thumb-instructions/memory-access-instructions/ldr-and-str--register-offset-
>
> Part of tarantool/tarantool#6548
> ---
>
> Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-357-arm64-hrefk
> Tarantool branch: https://github.com/tarantool/tarantool/tree/skaplun/lj-357-arm64-hrefk
> LuaJIT issue: https://github.com/LuaJIT/LuaJIT/issues/357
> Issue: https://github.com/tarantool/tarantool/issues/6548
>
> src/lj_asm_arm64.h | 11 ++++----
> .../lj-357-arm64-hrefk.test.lua | 27 +++++++++++++++++++
> 2 files changed, 32 insertions(+), 6 deletions(-)
> create mode 100644 test/tarantool-tests/lj-357-arm64-hrefk.test.lua
>
<snipped>
> --
> 2.31.0
>
--
Best regards,
IM
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-06-12 9:36 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-06-12 9:36 [Tarantool-patches] [PATCH luajit] ARM64: Fix assembly of HREFK Sergey Kaplun via Tarantool-patches
-- strict thread matches above, loose matches on Subject: below --
2021-10-27 13:02 Sergey Kaplun via Tarantool-patches
2022-03-30 10:26 ` sergos via Tarantool-patches
2022-04-04 8:55 ` Sergey Kaplun via Tarantool-patches
2022-06-29 9:16 ` Igor Munkin via Tarantool-patches
2022-06-30 12:11 ` Igor Munkin via Tarantool-patches
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox