From: Mike Pall <mike> Reported by Jason Teplitz. (cherry picked from commit 06cd9fce7df440323647174f1ca4a01281ec8acd) LDR instruction with register offset [1] has a restriction, that the destination register must be different from the register used for offset base. HREFK IR emits LDR instruction for a load of node key to a register. Base register to offset contains a node address. The register holding the node address is not excluded from a allow set, when loading the key value to a new register, and may be chosen by the register allocator as a destination for the key value, which violates the aforementioned restriction. This patch excludes the node register from the allowing set in the allocation of register for a key value. Sergey Kaplun: * added the description and the test for the problem [1]: https://developer.arm.com/documentation/dui0489/c/arm-and-thumb-instructions/memory-access-instructions/ldr-and-str--register-offset- Part of tarantool/tarantool#6548 --- Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-357-arm64-hrefk Tarantool branch: https://github.com/tarantool/tarantool/tree/skaplun/lj-357-arm64-hrefk LuaJIT issue: https://github.com/LuaJIT/LuaJIT/issues/357 Issue: https://github.com/tarantool/tarantool/issues/6548 src/lj_asm_arm64.h | 11 ++++---- .../lj-357-arm64-hrefk.test.lua | 27 +++++++++++++++++++ 2 files changed, 32 insertions(+), 6 deletions(-) create mode 100644 test/tarantool-tests/lj-357-arm64-hrefk.test.lua diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h index cc8c0c9d..da0ee4bb 100644 --- a/src/lj_asm_arm64.h +++ b/src/lj_asm_arm64.h @@ -869,14 +869,12 @@ static void asm_hrefk(ASMState *as, IRIns *ir) int32_t ofs = (int32_t)(kslot->op2 * sizeof(Node)); int32_t kofs = ofs + (int32_t)offsetof(Node, key); int bigofs = !emit_checkofs(A64I_LDRx, ofs); - RegSet allow = RSET_GPR; Reg dest = (ra_used(ir) || bigofs) ? ra_dest(as, ir, RSET_GPR) : RID_NONE; - Reg node = ra_alloc1(as, ir->op1, allow); - Reg key = ra_scratch(as, rset_clear(allow, node)); - Reg idx = node; + Reg node = ra_alloc1(as, ir->op1, RSET_GPR); + Reg key, idx = node; + RegSet allow = rset_exclude(RSET_GPR, node); uint64_t k; lua_assert(ofs % sizeof(Node) == 0); - rset_clear(allow, key); if (bigofs) { idx = dest; rset_clear(allow, dest); @@ -892,7 +890,8 @@ static void asm_hrefk(ASMState *as, IRIns *ir) } else { k = ((uint64_t)irt_toitype(irkey->t) << 47) | (uint64_t)ir_kgc(irkey); } - emit_nm(as, A64I_CMPx, key, ra_allock(as, k, allow)); + key = ra_scratch(as, allow); + emit_nm(as, A64I_CMPx, key, ra_allock(as, k, rset_exclude(allow, key))); emit_lso(as, A64I_LDRx, key, idx, kofs); if (bigofs) emit_opk(as, A64I_ADDx, dest, node, ofs, RSET_GPR); diff --git a/test/tarantool-tests/lj-357-arm64-hrefk.test.lua b/test/tarantool-tests/lj-357-arm64-hrefk.test.lua new file mode 100644 index 00000000..200d29f0 --- /dev/null +++ b/test/tarantool-tests/lj-357-arm64-hrefk.test.lua @@ -0,0 +1,27 @@ +local tap = require('tap') + +-- Test file to demonstrate the incorrect JIT behaviour for HREFK +-- IR compilation on arm64. +-- See also https://github.com/LuaJIT/LuaJIT/issues/357. +local test = tap.test('lj-357-arm64-hrefk') +test:plan(2) + +jit.opt.start('hotloop=1', 'hotexit=1') + +local t = {hrefk = 0} + +-- XXX: Need to generate a bunch of side traces (starts a new one +-- when the hmask is changed) to wait, when the register allocator +-- chooses the same register as a base register for offset and +-- destination in LDR instruction. +local START = 2e3 +local STOP = 1 +for i = START, STOP, -1 do + t.hrefk = t.hrefk - 1 + t[t.hrefk] = i +end + +test:is(t.hrefk, -START) +test:is(t[t.hrefk], STOP) + +os.exit(test:check() and 0 or 1) -- 2.31.0
Hi!
I see the test is guess-based in term that number of traces should
be enough to trigger the situation. Is there any confirmation it
will happen at particular traces count?
Otherwise LGTM.
Sergos
> On 27 Oct 2021, at 16:02, Sergey Kaplun <skaplun@tarantool.org> wrote:
>
> From: Mike Pall <mike>
>
> Reported by Jason Teplitz.
>
> (cherry picked from commit 06cd9fce7df440323647174f1ca4a01281ec8acd)
>
> LDR instruction with register offset [1] has a restriction, that the
> destination register must be different from the register used for offset
> base. HREFK IR emits LDR instruction for a load of node key to a
> register. Base register to offset contains a node address. The register
> holding the node address is not excluded from a allow set, when loading
> the key value to a new register, and may be chosen by the register
> allocator as a destination for the key value, which violates the
> aforementioned restriction.
>
> This patch excludes the node register from the allowing set in the
> allocation of register for a key value.
>
> Sergey Kaplun:
> * added the description and the test for the problem
>
> [1]: https://developer.arm.com/documentation/dui0489/c/arm-and-thumb-instructions/memory-access-instructions/ldr-and-str--register-offset-
>
> Part of tarantool/tarantool#6548
> ---
>
> Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-357-arm64-hrefk
> Tarantool branch: https://github.com/tarantool/tarantool/tree/skaplun/lj-357-arm64-hrefk
> LuaJIT issue: https://github.com/LuaJIT/LuaJIT/issues/357
> Issue: https://github.com/tarantool/tarantool/issues/6548
>
> src/lj_asm_arm64.h | 11 ++++----
> .../lj-357-arm64-hrefk.test.lua | 27 +++++++++++++++++++
> 2 files changed, 32 insertions(+), 6 deletions(-)
> create mode 100644 test/tarantool-tests/lj-357-arm64-hrefk.test.lua
>
> diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
> index cc8c0c9d..da0ee4bb 100644
> --- a/src/lj_asm_arm64.h
> +++ b/src/lj_asm_arm64.h
> @@ -869,14 +869,12 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
> int32_t ofs = (int32_t)(kslot->op2 * sizeof(Node));
> int32_t kofs = ofs + (int32_t)offsetof(Node, key);
> int bigofs = !emit_checkofs(A64I_LDRx, ofs);
> - RegSet allow = RSET_GPR;
> Reg dest = (ra_used(ir) || bigofs) ? ra_dest(as, ir, RSET_GPR) : RID_NONE;
> - Reg node = ra_alloc1(as, ir->op1, allow);
> - Reg key = ra_scratch(as, rset_clear(allow, node));
> - Reg idx = node;
> + Reg node = ra_alloc1(as, ir->op1, RSET_GPR);
> + Reg key, idx = node;
> + RegSet allow = rset_exclude(RSET_GPR, node);
> uint64_t k;
> lua_assert(ofs % sizeof(Node) == 0);
> - rset_clear(allow, key);
> if (bigofs) {
> idx = dest;
> rset_clear(allow, dest);
> @@ -892,7 +890,8 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
> } else {
> k = ((uint64_t)irt_toitype(irkey->t) << 47) | (uint64_t)ir_kgc(irkey);
> }
> - emit_nm(as, A64I_CMPx, key, ra_allock(as, k, allow));
> + key = ra_scratch(as, allow);
> + emit_nm(as, A64I_CMPx, key, ra_allock(as, k, rset_exclude(allow, key)));
> emit_lso(as, A64I_LDRx, key, idx, kofs);
> if (bigofs)
> emit_opk(as, A64I_ADDx, dest, node, ofs, RSET_GPR);
> diff --git a/test/tarantool-tests/lj-357-arm64-hrefk.test.lua b/test/tarantool-tests/lj-357-arm64-hrefk.test.lua
> new file mode 100644
> index 00000000..200d29f0
> --- /dev/null
> +++ b/test/tarantool-tests/lj-357-arm64-hrefk.test.lua
> @@ -0,0 +1,27 @@
> +local tap = require('tap')
> +
> +-- Test file to demonstrate the incorrect JIT behaviour for HREFK
> +-- IR compilation on arm64.
> +-- See also https://github.com/LuaJIT/LuaJIT/issues/357.
> +local test = tap.test('lj-357-arm64-hrefk')
> +test:plan(2)
> +
> +jit.opt.start('hotloop=1', 'hotexit=1')
> +
> +local t = {hrefk = 0}
> +
> +-- XXX: Need to generate a bunch of side traces (starts a new one
> +-- when the hmask is changed) to wait, when the register allocator
> +-- chooses the same register as a base register for offset and
> +-- destination in LDR instruction.
> +local START = 2e3
> +local STOP = 1
> +for i = START, STOP, -1 do
> + t.hrefk = t.hrefk - 1
> + t[t.hrefk] = i
> +end
> +
> +test:is(t.hrefk, -START)
> +test:is(t[t.hrefk], STOP)
> +
> +os.exit(test:check() and 0 or 1)
> --
> 2.31.0
>
Hi, Sergos! Thanks for the review! On 30.03.22, sergos wrote: > Hi! > > I see the test is guess-based in term that number of traces should > be enough to trigger the situation. Is there any confirmation it > will happen at particular traces count? Register allocator is deterministic (IINM), so yes, this test isn't flaky. But the amount of iterations should be large enough to create the necessary bugged trace. > > Otherwise LGTM. > > Sergos > <snipped> > -- Best regards, Sergey Kaplun
Sergey, Thanks for the patch! LGTM, considering the fixes on the branch. -- Best regards, IM
Sergey, I've checked the patch into all long-term branches with ARM64 support in tarantool/luajit and bumped a new version in master, 2.10. On 27.10.21, Sergey Kaplun wrote: > From: Mike Pall <mike> > > Reported by Jason Teplitz. > > (cherry picked from commit 06cd9fce7df440323647174f1ca4a01281ec8acd) > > LDR instruction with register offset [1] has a restriction, that the > destination register must be different from the register used for offset > base. HREFK IR emits LDR instruction for a load of node key to a > register. Base register to offset contains a node address. The register > holding the node address is not excluded from a allow set, when loading > the key value to a new register, and may be chosen by the register > allocator as a destination for the key value, which violates the > aforementioned restriction. > > This patch excludes the node register from the allowing set in the > allocation of register for a key value. > > Sergey Kaplun: > * added the description and the test for the problem > > [1]: https://developer.arm.com/documentation/dui0489/c/arm-and-thumb-instructions/memory-access-instructions/ldr-and-str--register-offset- > > Part of tarantool/tarantool#6548 > --- > > Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-357-arm64-hrefk > Tarantool branch: https://github.com/tarantool/tarantool/tree/skaplun/lj-357-arm64-hrefk > LuaJIT issue: https://github.com/LuaJIT/LuaJIT/issues/357 > Issue: https://github.com/tarantool/tarantool/issues/6548 > > src/lj_asm_arm64.h | 11 ++++---- > .../lj-357-arm64-hrefk.test.lua | 27 +++++++++++++++++++ > 2 files changed, 32 insertions(+), 6 deletions(-) > create mode 100644 test/tarantool-tests/lj-357-arm64-hrefk.test.lua > <snipped> > -- > 2.31.0 > -- Best regards, IM