[Tarantool-patches] [PATCH luajit 17/19] MIPS64: Fix register allocation in assembly of HREF.
Maxim Kokryashkin
m.kokryashkin at tarantool.org
Wed Aug 16 12:01:08 MSK 2023
Hi, Sergey!
Thanks for the patch!
Please consider my comments below.
On Wed, Aug 09, 2023 at 06:36:06PM +0300, Sergey Kaplun via Tarantool-patches wrote:
> From: Mike Pall <mike>
>
> Contributed by James Cowgill.
>
> (cherry-picked from commit 99cdfbf6a1e8856f64908072ef10443a7eab14f2)
>
> The issue is observed for the following merged IRs:
> | p64 HREF 0001 "a" ; or other keys
> | > p64 EQ 0002 [0x4002d0c528] ; nilnode
> Sometimes, when we need to rematerialize a constant during evicting of
Typo: s/during evicting/during the eviction/
> the register. So, the instruction related to constant rematerialization
Sometimes happens what? The sentence looks kind of chopped.
> is placed in the delay branch slot, which suppose to contain the loads
Typo: s/which suppose/which is supposed/
> of trace exit number to the `$ra` register. The resulting assembly is
Typo: s/number/numbers/ (because of `loads` being in the plural form)
> the following (for example):
> | beq ra, r1, 0x400abee9b0 ->exit
> | lui r1, 65531 ; delay slot without setting of the `ra`
> This leading to the assertion failure during trace exit in
Typo: s/leading/leads/
> `lj_trace_exit()`, since a trace number is incorrect.
>
> This patch moves the constant register allocations above the main
> instruction emitting code in `asm_href()`.
AFAICS, It is not just moved, the register allocation logic has changed too.
Before the patch, there were a few cases of inplace emissions, which
disappeared after the patch. I believe it is important to mention to, along
with a more detailed description of the logic changes.
>
> Sergey Kaplun:
> * added the description and the test for the problem
>
> Part of tarantool/tarantool#8825
> ---
> src/lj_asm_mips.h | 42 +++++---
> ...-mips64-href-delay-slot-side-exit.test.lua | 101 ++++++++++++++++++
> 2 files changed, 126 insertions(+), 17 deletions(-)
> create mode 100644 test/tarantool-tests/lj-362-mips64-href-delay-slot-side-exit.test.lua
>
> diff --git a/src/lj_asm_mips.h b/src/lj_asm_mips.h
> index c27d8413..23ffc3aa 100644
> --- a/src/lj_asm_mips.h
> +++ b/src/lj_asm_mips.h
> @@ -859,6 +859,9 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
> Reg dest = ra_dest(as, ir, allow);
> Reg tab = ra_alloc1(as, ir->op1, rset_clear(allow, dest));
> Reg key = RID_NONE, type = RID_NONE, tmpnum = RID_NONE, tmp1 = RID_TMP, tmp2;
> +#if LJ_64
> + Reg cmp64 = RID_NONE;
> +#endif
> IRRef refkey = ir->op2;
> IRIns *irkey = IR(refkey);
> int isk = irref_isk(refkey);
> @@ -901,6 +904,26 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
> #endif
> tmp2 = ra_scratch(as, allow);
> rset_clear(allow, tmp2);
> +#if LJ_64
> + if (LJ_SOFTFP || !irt_isnum(kt)) {
> + /* Allocate cmp64 register used for 64-bit comparisons */
> + if (LJ_SOFTFP && irt_isnum(kt)) {
> + cmp64 = key;
> + } else if (!isk && irt_isaddr(kt)) {
> + cmp64 = tmp2;
> + } else {
> + int64_t k;
> + if (isk && irt_isaddr(kt)) {
> + k = ((int64_t)irt_toitype(irkey->t) << 47) | irkey[1].tv.u64;
> + } else {
> + lua_assert(irt_ispri(kt) && !irt_isnil(kt));
> + k = ~((int64_t)~irt_toitype(ir->t) << 47);
> + }
> + cmp64 = ra_allock(as, k, allow);
> + rset_clear(allow, cmp64);
> + }
> + }
> +#endif
>
> /* Key not found in chain: jump to exit (if merged) or load niltv. */
> l_end = emit_label(as);
> @@ -943,24 +966,9 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
> emit_dta(as, MIPSI_DSRA32, tmp1, tmp1, 15);
> emit_tg(as, MIPSI_DMTC1, tmp1, tmpnum);
> emit_tsi(as, MIPSI_LD, tmp1, dest, (int32_t)offsetof(Node, key.u64));
> - } else if (LJ_SOFTFP && irt_isnum(kt)) {
> - emit_branch(as, MIPSI_BEQ, tmp1, key, l_end);
> - emit_tsi(as, MIPSI_LD, tmp1, dest, (int32_t)offsetof(Node, key.u64));
> - } else if (irt_isaddr(kt)) {
> - Reg refk = tmp2;
> - if (isk) {
> - int64_t k = ((int64_t)irt_toitype(irkey->t) << 47) | irkey[1].tv.u64;
> - refk = ra_allock(as, k, allow);
> - rset_clear(allow, refk);
> - }
> - emit_branch(as, MIPSI_BEQ, tmp1, refk, l_end);
> - emit_tsi(as, MIPSI_LD, tmp1, dest, offsetof(Node, key));
> } else {
> - Reg pri = ra_allock(as, ~((int64_t)~irt_toitype(ir->t) << 47), allow);
> - rset_clear(allow, pri);
> - lua_assert(irt_ispri(kt) && !irt_isnil(kt));
> - emit_branch(as, MIPSI_BEQ, tmp1, pri, l_end);
> - emit_tsi(as, MIPSI_LD, tmp1, dest, offsetof(Node, key));
> + emit_branch(as, MIPSI_BEQ, tmp1, cmp64, l_end);
> + emit_tsi(as, MIPSI_LD, tmp1, dest, (int32_t)offsetof(Node, key.u64));
> }
> *l_loop = MIPSI_BNE | MIPSF_S(tmp1) | ((as->mcp-l_loop-1) & 0xffffu);
> if (!isk && irt_isaddr(kt)) {
> diff --git a/test/tarantool-tests/lj-362-mips64-href-delay-slot-side-exit.test.lua b/test/tarantool-tests/lj-362-mips64-href-delay-slot-side-exit.test.lua
> new file mode 100644
> index 00000000..8c75e69c
> --- /dev/null
> +++ b/test/tarantool-tests/lj-362-mips64-href-delay-slot-side-exit.test.lua
> @@ -0,0 +1,101 @@
> +local tap = require('tap')
> +-- Test file to demonstrate the incorrect JIT behaviour for HREF
> +-- IR compilation on mips64.
> +-- See also https://github.com/LuaJIT/LuaJIT/pull/362.
> +local test = tap.test('lj-362-mips64-href-delay-slot-side-exit'):skipcond({
> + ['Test requires JIT enabled'] = not jit.status(),
> +})
> +
> +test:plan(1)
> +
> +-- To reproduce the issue we need to compile a trace with
> +-- `IR_HREF`, with a lookup of constant hash key GC value. To
Typo: s/constant/a constant/
> +-- prevent an `IR_HREFK` to be emitted instead, we need a table
Typo: s/to be/from being/
> +-- with a huge hash part. Delta of address between the start of
Typo: s/Delta/The delta/
> +-- the hash part of the table and the current node to lookup must
> +-- be more than `(1024 * 64 - 1) * sizeof(Node)`.
Typo: s/more/greater/
> +-- See <src/lj_record.c>, for details.
> +-- XXX: This constant is well suited to prevent test to be flaky,
Typo: s/to be/from being/
> +-- because the aforementioned delta is always large enough.
> +-- Also, this constant avoids table rehashing, when inserting new
> +-- keys.
> +local N_HASH_FIELDS = 2 ^ 16 + 2 ^ 15
> +
> +-- XXX: don't set `hotexit` to prevent compilation of trace after
> +-- exiting the main test cycle.
I suggest rehprasing it the following way:
| The `hotexit` option is not set to prevent the compilation of traces
| after the emission of the main test cycle.
> +jit.opt.start('hotloop=1')
> +
> +-- Don't use `table.new()`, here by intence -- this leads to the
Typo: s/Don't use `table.new()`, here by intence/`table.new()` is not used here by intention/
> +-- allocation failure for the mcode memory, so traces are not
> +-- compiled.
> +local filled_tab = {}
> +-- Filling-up the table with GC values to minimize the amount of
Typo: s/Filling-up/Fill up/
> +-- hash collisions and increase delta between the start of the
Typo: s/delta/the delta/
> +-- hash part of the table and currently stored node.
Typo: s/currently/the currently/
> +for _ = 1, N_HASH_FIELDS do
> + filled_tab[1LL] = 1
> +end
> +
> +-- luacheck: no unused
> +local tab_value_a
> +local tab_value_b
> +local tab_value_c
> +local tab_value_d
> +local tab_value_e
> +local tab_value_f
> +local tab_value_g
> +local tab_value_h
> +local tab_value_i
> +
> +-- The function for this trace has a bunch of the following IRs:
> +-- p64 HREF 0001 "a" ; or other keys
> +-- > p64 EQ 0002 [0x4002d0c528] ; nilnode
> +-- Sometimes, when we need to rematerialize a constant during
> +-- evicting of the register. So, the instruction related to
Typo: s/evicting/the eviction/
Again, sometimes happens what?
> +-- constant rematerialization is placed in the delay branch slot,
> +-- which suppose to contain the loads of trace exit number to the
Typo: s/which suppose/which is supposed/
Typo: s/number/numbers/
> +-- `$ra` register. This leading to the assertion failure during
Typo: s/leading/leads/
> +-- trace exit in `lj_trace_exit()`, since a trace number is
> +-- incorrect. The amount of the side exit to check is empirical
Typo: s/exit/exits/
> +-- (even a little bit more, than necessary just in case).
Typo: s/more/greater/
> +local function href_const(tab)
> + tab_value_a = tab.a
> + tab_value_b = tab.b
> + tab_value_c = tab.c
> + tab_value_d = tab.d
> + tab_value_e = tab.e
> + tab_value_f = tab.f
> + tab_value_g = tab.g
> + tab_value_h = tab.h
> + tab_value_i = tab.i
> +end
> +
> +-- Compile main trace first.
Typo: s/main/the main/
> +href_const(filled_tab)
> +href_const(filled_tab)
> +
> +-- Now brute-force side exits to check that they are compiled
> +-- correct. Take side exits in the reverse order to take a new
Typo: s/correct/correctly/
Typo: s/the reverse/reverse/
> +-- side exit each time.
> +filled_tab.i = 'i'
> +href_const(filled_tab)
> +filled_tab.h = 'h'
> +href_const(filled_tab)
> +filled_tab.g = 'g'
> +href_const(filled_tab)
> +filled_tab.f = 'f'
> +href_const(filled_tab)
> +filled_tab.e = 'e'
> +href_const(filled_tab)
> +filled_tab.d = 'd'
> +href_const(filled_tab)
> +filled_tab.c = 'c'
> +href_const(filled_tab)
> +filled_tab.b = 'b'
> +href_const(filled_tab)
> +filled_tab.a = 'a'
> +href_const(filled_tab)
> +
> +test:ok(true, 'no assertion failures during trace exits')
> +
> +test:done(true)
> --
> 2.41.0
>
More information about the Tarantool-patches
mailing list