[Tarantool-patches] [PATCH luajit] Fix bytecode register allocation for comparisons.
Igor Munkin
imun at tarantool.org
Mon Aug 16 19:27:15 MSK 2021
Sergey,
Thanks for the fixes! LGTM now.
On 16.08.21, Sergey Kaplun wrote:
> Igor,
>
> Thanks for the review!
>
> See the new commit message below:
> ===================================================================
> Fix bytecode register allocation for comparisons.
>
> (cherry picked from commit 2f3f07882fb4ad9c64967d7088461b1ca0a25d3a)
>
> When LuaJIT is built with LJ_FR2 (e.g. with GC64 mode enabled),
> information about a frame takes two slots -- the first takes the TValue
> with the function to be called, the second takes the framelink. The JIT
> recording machinery does pretty much the same -- the function IR_KGC is
> loaded in the first slot, and the second is set to TREF_FRAME value.
> This value should be rewritten after return from a callee. This slot is
> cleared either by return values or manually (set to zero), when there
> are no values to return. The latter case is done by the next bytecode
> with RA dst mode. This requires that the destination of RA takes the
> next slot after TREF_FRAME. Hence, an earlier instruction must use the
> smallest possible destination register (see `lj_record_ins()` for the
> details).
>
> The bytecode emitter swaps operands for ISGT and ISGE comparisons. As a
> result, the aforementioned rule for register allocation may be
> violated. When it happens for a chunk being recorded, the slot with
> TREF_FRAME is not rewritten (but the next empty slot after TREF_FRAME
> is). This leads to JIT slot inconsistency and an assertion failure in
> `rec_check_slots()` during recording of the next bytecode instruction.
>
> This patch fixes bytecode register allocation by changing the VM
> register allocation order in case of ISGT and ISGE bytecodes.
>
> Sergey Kaplun:
> * added the description and the test for the problem
>
> Resolves tarantool/tarantool#6227
> Part of tarantool/tarantool#5629
> ===================================================================
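To make the slot rule from the commit message easier to follow, here is
a toy model in plain Lua. It is only a sketch, not LuaJIT code: the
helper names (`state_after_ret0`, `write_dst`, `has_stale_frame`,
`replay`) are hypothetical, the state after RET0 is an assumption, and
the "fixed" register sequence is an assumed example of "smallest
destination register first" rather than a listing taken from this
thread. The old sequence follows the registers in the test comment
(UGET 0, UGET 3, KSHORT 2).

```lua
-- Toy model of JIT slot bookkeeping in LJ_FR2 mode (hypothetical names).
local FRAME = 'TREF_FRAME'

-- Assumed state right after `empty()` returned no values: slot 0 holds
-- the called function, slot 1 holds the stale frame link, and maxslot
-- has dropped back to the call base.
local function state_after_ret0()
  return { slots = { [0] = 'func', [1] = FRAME }, maxslot = 0 }
end

-- Model of a write to destination register `ra`: when `ra` lies above
-- maxslot, only the slot directly below `ra` is cleared.
local function write_dst(st, ra, value)
  st.slots[ra] = value
  if ra >= st.maxslot then
    if ra > st.maxslot then st.slots[ra - 1] = nil end
    st.maxslot = ra + 1
  end
end

-- Report whether a stale frame link survives below maxslot.
local function has_stale_frame(st)
  for i = 0, st.maxslot - 1 do
    if st.slots[i] == FRAME then return true end
  end
  return false
end

-- Replay a list of {ra, value} writes on a fresh state.
local function replay(writes)
  local st = state_after_ret0()
  for _, w in ipairs(writes) do write_dst(st, w[1], w[2]) end
  return has_stale_frame(st)
end

-- Old order from the test comment: UGET 0, UGET 3, KSHORT 2.
local old_order = replay({ {0, 'empty'}, {3, 'uv'}, {2, 1} })
-- Assumed order with the smallest register written first:
-- UGET 0, UGET 2, KSHORT 3.
local new_order = replay({ {0, 'empty'}, {2, 'uv'}, {3, 1} })
print(old_order, new_order)
```

In this model the old order leaves the frame link in slot 1 untouched
(only slot 2 is cleared), while the assumed new order clears it with the
first write above maxslot.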
>
> On 16.08.21, Igor Munkin wrote:
<snipped>
>
> >
> > Furthermore, what does stop you from using local variables?
>
> They occupy new slots and make it harder to maintain, see the new
> comment below.
Meh, OK anyway :)
>
> >
<snipped>
>
> ===================================================================
> diff --git a/test/tarantool-tests/gh-6227-bytecode-allocator-for-comparisons.test.lua b/test/tarantool-tests/gh-6227-bytecode-allocator-for-comparisons.test.lua
> index 66f6885e..9788923a 100644
> --- a/test/tarantool-tests/gh-6227-bytecode-allocator-for-comparisons.test.lua
> +++ b/test/tarantool-tests/gh-6227-bytecode-allocator-for-comparisons.test.lua
> @@ -14,26 +14,39 @@ local function empty() end
> local uv = 0
>
> -- This function needs to reset register enumerating.
> --- Also set `J->maxslot` to zero.
> --- The upvalue function to call is loaded to 0 slot.
> +-- `J->maxslot` is initialized with `nargs` (i.e. zero in this
> +-- case) in `rec_call_setup()`.
> local function bump_frame()
> -- First call function with RET0 to set TREF_FRAME in the
> -- last slot.
> empty()
> + -- The old bytecode to be recorded looks like the following:
> + -- 0000 . FUNCF 4
> + -- 0001 . UGET 0 0 ; empty
> + -- 0002 . CALL 0 1 1
> + -- 0000 . . JFUNCF 1 1
> + -- 0001 . . RET0 0 1
> + -- 0002 . CALL 0 1 1
> + -- 0003 . UGET 0 0 ; empty
> + -- 0004 . UGET 3 1 ; uv
> + -- 0005 . KSHORT 2 1
> + -- 0006 . ISLT 3 2
> -- Test ISGE or ISGT bytecode. These bytecodes swap their
> - -- operands. Also, a constant is always loaded into the slot
> - -- smaller than upvalue. So, if upvalue loads before KSHORT,
> - -- then the difference between registers is more than 2 (2 is
> - -- needed for LJ_FR2) and TREF_FRAME slot is not rewriting by
> - -- the bytecode after call and return as expected. That leads
> - -- to recording slots inconsistency and assertion failure at
> - -- `rec_check_slots()`.
> + -- operands (consider ISLT above).
> + -- Two calls of the `empty()` function in a row are necessary
> + -- for the 2-slot gap in LJ_FR2 mode.
> + -- The upvalue is loaded before KSHORT, so the difference
> + -- between the slot for upvalue `empty` (the function to be
> + -- called) and the slot for upvalue `uv` is more than 2. Hence,
> + -- the TREF_FRAME slot is not rewritten by the bytecode after
> + -- return from `empty()` as expected. That leads to recording slot
> + -- inconsistency and an assertion failure at `rec_check_slots()`.
> empty(1>uv)
> end
>
> jit.opt.start('hotloop=1')
>
> -for _ = 1,3 do
> +for _ = 1, 3 do
> bump_frame()
> end
> ===================================================================
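For anyone who wants to look at the emitted registers themselves, a
possible way is to list the function's bytecode with the `jit.bc`
module bundled with LuaJIT. The sketch below assumes that module is
loadable in the runtime at hand (it returns nothing otherwise); the
`bytecode_of` helper and the string-collecting `out` object are
illustrative names, and the exact opcodes and registers printed depend
on the LuaJIT build.

```lua
-- Sketch: list the bytecode of a function containing a `>` comparison.
-- Assumes a LuaJIT runtime where the bundled `jit.bc` module is loadable.
local ok, bc = pcall(require, 'jit.bc')

local uv = 0
local function empty() end
local function bump_frame()
  empty()
  empty(1 > uv)
end

-- Collect the listing into a string instead of writing to stdout.
local function bytecode_of(func)
  if not ok then return nil end
  local chunks = {}
  -- Minimal file-like object: the dumper calls out:write() and out:flush().
  local out = {
    write = function(_, ...)
      for i = 1, select('#', ...) do
        chunks[#chunks + 1] = tostring((select(i, ...)))
      end
    end,
    flush = function() end,
  }
  bc.dump(func, out)
  return table.concat(chunks)
end

local listing = bytecode_of(bump_frame)
if listing then io.write(listing) end
```

Comparing the registers of the UGET/KSHORT pair in front of the
comparison before and after the patch should show the change in
allocation order.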
>
<snipped>
>
> --
> Best regards,
> Sergey Kaplun
--
Best regards,
IM