[Tarantool-patches] [PATCH luajit 17/19] MIPS64: Fix register allocation in assembly of HREF.

Maxim Kokryashkin m.kokryashkin at tarantool.org
Wed Aug 16 12:01:08 MSK 2023


Hi, Sergey!
Thanks for the patch!
Please consider my comments below.
On Wed, Aug 09, 2023 at 06:36:06PM +0300, Sergey Kaplun via Tarantool-patches wrote:
> From: Mike Pall <mike>
> 
> Contributed by James Cowgill.
> 
> (cherry-picked from commit 99cdfbf6a1e8856f64908072ef10443a7eab14f2)
> 
> The issue is observed for the following merged IRs:
> |    p64 HREF   0001  "a"            ; or other keys
> | >  p64 EQ     0002  [0x4002d0c528] ; nilnode
> Sometimes, when we need to rematerialize a constant during evicting of
Typo: s/during evicting/during the eviction/
> the register. So, the instruction related to constant rematerialization
Sometimes happens what? The sentence looks kind of chopped.
> is placed in the delay branch slot, which suppose to contain the loads
Typo: s/which suppose/which is supposed/
> of trace exit number to the `$ra` register. The resulting assembly is
Typo: s/number/numbers/ (because of `loads` being in the plural form)
> the following (for example):
> | beq     ra, r1, 0x400abee9b0  ->exit
> | lui     r1, 65531   ; delay slot without setting of the `ra`
> This leading to the assertion failure during trace exit in
Typo: s/leading/leads/
> `lj_trace_exit()`, since a trace number is incorrect.
> 
> This patch moves the constant register allocations above the main
> instruction emitting code in `asm_href()`.
AFAICS, It is not just moved, the register allocation logic has changed too.
Before the patch, there were a few cases of inplace emissions, which
disappeared after the patch. I believe it is important to mention to, along
with a more detailed description of the logic changes.
> 
> Sergey Kaplun:
> * added the description and the test for the problem
> 
> Part of tarantool/tarantool#8825
> ---
>  src/lj_asm_mips.h                             |  42 +++++---
>  ...-mips64-href-delay-slot-side-exit.test.lua | 101 ++++++++++++++++++
>  2 files changed, 126 insertions(+), 17 deletions(-)
>  create mode 100644 test/tarantool-tests/lj-362-mips64-href-delay-slot-side-exit.test.lua
> 
> diff --git a/src/lj_asm_mips.h b/src/lj_asm_mips.h
> index c27d8413..23ffc3aa 100644
> --- a/src/lj_asm_mips.h
> +++ b/src/lj_asm_mips.h
> @@ -859,6 +859,9 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>    Reg dest = ra_dest(as, ir, allow);
>    Reg tab = ra_alloc1(as, ir->op1, rset_clear(allow, dest));
>    Reg key = RID_NONE, type = RID_NONE, tmpnum = RID_NONE, tmp1 = RID_TMP, tmp2;
> +#if LJ_64
> +  Reg cmp64 = RID_NONE;
> +#endif
>    IRRef refkey = ir->op2;
>    IRIns *irkey = IR(refkey);
>    int isk = irref_isk(refkey);
> @@ -901,6 +904,26 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>  #endif
>    tmp2 = ra_scratch(as, allow);
>    rset_clear(allow, tmp2);
> +#if LJ_64
> +  if (LJ_SOFTFP || !irt_isnum(kt)) {
> +    /* Allocate cmp64 register used for 64-bit comparisons */
> +    if (LJ_SOFTFP && irt_isnum(kt)) {
> +      cmp64 = key;
> +    } else if (!isk && irt_isaddr(kt)) {
> +      cmp64 = tmp2;
> +    } else {
> +      int64_t k;
> +      if (isk && irt_isaddr(kt)) {
> +	k = ((int64_t)irt_toitype(irkey->t) << 47) | irkey[1].tv.u64;
> +      } else {
> +	lua_assert(irt_ispri(kt) && !irt_isnil(kt));
> +	k = ~((int64_t)~irt_toitype(ir->t) << 47);
> +      }
> +      cmp64 = ra_allock(as, k, allow);
> +      rset_clear(allow, cmp64);
> +    }
> +  }
> +#endif
>  
>    /* Key not found in chain: jump to exit (if merged) or load niltv. */
>    l_end = emit_label(as);
> @@ -943,24 +966,9 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>      emit_dta(as, MIPSI_DSRA32, tmp1, tmp1, 15);
>      emit_tg(as, MIPSI_DMTC1, tmp1, tmpnum);
>      emit_tsi(as, MIPSI_LD, tmp1, dest, (int32_t)offsetof(Node, key.u64));
> -  } else if (LJ_SOFTFP && irt_isnum(kt)) {
> -    emit_branch(as, MIPSI_BEQ, tmp1, key, l_end);
> -    emit_tsi(as, MIPSI_LD, tmp1, dest, (int32_t)offsetof(Node, key.u64));
> -  } else if (irt_isaddr(kt)) {
> -    Reg refk = tmp2;
> -    if (isk) {
> -      int64_t k = ((int64_t)irt_toitype(irkey->t) << 47) | irkey[1].tv.u64;
> -      refk = ra_allock(as, k, allow);
> -      rset_clear(allow, refk);
> -    }
> -    emit_branch(as, MIPSI_BEQ, tmp1, refk, l_end);
> -    emit_tsi(as, MIPSI_LD, tmp1, dest, offsetof(Node, key));
>    } else {
> -    Reg pri = ra_allock(as, ~((int64_t)~irt_toitype(ir->t) << 47), allow);
> -    rset_clear(allow, pri);
> -    lua_assert(irt_ispri(kt) && !irt_isnil(kt));
> -    emit_branch(as, MIPSI_BEQ, tmp1, pri, l_end);
> -    emit_tsi(as, MIPSI_LD, tmp1, dest, offsetof(Node, key));
> +    emit_branch(as, MIPSI_BEQ, tmp1, cmp64, l_end);
> +    emit_tsi(as, MIPSI_LD, tmp1, dest, (int32_t)offsetof(Node, key.u64));
>    }
>    *l_loop = MIPSI_BNE | MIPSF_S(tmp1) | ((as->mcp-l_loop-1) & 0xffffu);
>    if (!isk && irt_isaddr(kt)) {
> diff --git a/test/tarantool-tests/lj-362-mips64-href-delay-slot-side-exit.test.lua b/test/tarantool-tests/lj-362-mips64-href-delay-slot-side-exit.test.lua
> new file mode 100644
> index 00000000..8c75e69c
> --- /dev/null
> +++ b/test/tarantool-tests/lj-362-mips64-href-delay-slot-side-exit.test.lua
> @@ -0,0 +1,101 @@
> +local tap = require('tap')
> +-- Test file to demonstrate the incorrect JIT behaviour for HREF
> +-- IR compilation on mips64.
> +-- See also https://github.com/LuaJIT/LuaJIT/pull/362.
> +local test = tap.test('lj-362-mips64-href-delay-slot-side-exit'):skipcond({
> +  ['Test requires JIT enabled'] = not jit.status(),
> +})
> +
> +test:plan(1)
> +
> +-- To reproduce the issue we need to compile a trace with
> +-- `IR_HREF`, with a lookup of constant hash key GC value. To
Typo: s/constant/a constant/
> +-- prevent an `IR_HREFK` to be emitted instead, we need a table
Typo: s/to be/from being/
> +-- with a huge hash part. Delta of address between the start of
Typo: s/Delta/The delta/
> +-- the hash part of the table and the current node to lookup must
> +-- be more than `(1024 * 64 - 1) * sizeof(Node)`.
Typo: s/more/greater/
> +-- See <src/lj_record.c>, for details.
> +-- XXX: This constant is well suited to prevent test to be flaky,
Typo: s/to be/from being/
> +-- because the aforementioned delta is always large enough.
> +-- Also, this constant avoids table rehashing, when inserting new
> +-- keys.
> +local N_HASH_FIELDS = 2 ^ 16 + 2 ^ 15
> +
> +-- XXX: don't set `hotexit` to prevent compilation of trace after
> +-- exiting the main test cycle.
I suggest rehprasing it the following way:
| The `hotexit` option is not set to prevent the compilation of traces
| after the emission of the main test cycle.
> +jit.opt.start('hotloop=1')
> +
> +-- Don't use `table.new()`, here by intence -- this leads to the
Typo: s/Don't use `table.new()`, here by intence/`table.new()` is not used here by intention/
> +-- allocation failure for the mcode memory, so traces are not
> +-- compiled.
> +local filled_tab = {}
> +-- Filling-up the table with GC values to minimize the amount of
Typo: s/Filling-up/Fill up/
> +-- hash collisions and increase delta between the start of the
Typo: s/delta/the delta/
> +-- hash part of the table and currently stored node.
Typo: s/currently/the currently/
> +for _ = 1, N_HASH_FIELDS do
> +  filled_tab[1LL] = 1
> +end
> +
> +-- luacheck: no unused
> +local tab_value_a
> +local tab_value_b
> +local tab_value_c
> +local tab_value_d
> +local tab_value_e
> +local tab_value_f
> +local tab_value_g
> +local tab_value_h
> +local tab_value_i
> +
> +-- The function for this trace has a bunch of the following IRs:
> +--    p64 HREF   0001  "a"            ; or other keys
> +-- >  p64 EQ     0002  [0x4002d0c528] ; nilnode
> +-- Sometimes, when we need to rematerialize a constant during
> +-- evicting of the register. So, the instruction related to
Typo: s/evicting/the eviction/
Again, sometimes happens what?
> +-- constant rematerialization is placed in the delay branch slot,
> +-- which suppose to contain the loads of trace exit number to the
Typo: s/which suppose/which is supposed/
Typo: s/number/numbers/
> +-- `$ra` register. This leading to the assertion failure during
Typo: s/leading/leads/
> +-- trace exit in `lj_trace_exit()`, since a trace number is
> +-- incorrect. The amount of the side exit to check is empirical
Typo: s/exit/exits/
> +-- (even a little bit more, than necessary just in case).
Typo: s/more/greater/
> +local function href_const(tab)
> +  tab_value_a = tab.a
> +  tab_value_b = tab.b
> +  tab_value_c = tab.c
> +  tab_value_d = tab.d
> +  tab_value_e = tab.e
> +  tab_value_f = tab.f
> +  tab_value_g = tab.g
> +  tab_value_h = tab.h
> +  tab_value_i = tab.i
> +end
> +
> +-- Compile main trace first.
Typo: s/main/the main/
> +href_const(filled_tab)
> +href_const(filled_tab)
> +
> +-- Now brute-force side exits to check that they are compiled
> +-- correct. Take side exits in the reverse order to take a new
Typo: s/correct/correctly/
Typo: s/the reverse/reverse/
> +-- side exit each time.
> +filled_tab.i = 'i'
> +href_const(filled_tab)
> +filled_tab.h = 'h'
> +href_const(filled_tab)
> +filled_tab.g = 'g'
> +href_const(filled_tab)
> +filled_tab.f = 'f'
> +href_const(filled_tab)
> +filled_tab.e = 'e'
> +href_const(filled_tab)
> +filled_tab.d = 'd'
> +href_const(filled_tab)
> +filled_tab.c = 'c'
> +href_const(filled_tab)
> +filled_tab.b = 'b'
> +href_const(filled_tab)
> +filled_tab.a = 'a'
> +href_const(filled_tab)
> +
> +test:ok(true, 'no assertion failures during trace exits')
> +
> +test:done(true)
> -- 
> 2.41.0
> 


More information about the Tarantool-patches mailing list