[Tarantool-patches] [PATCH luajit] Fix bytecode register allocation for comparisons.
Igor Munkin
imun at tarantool.org
Sun Aug 1 13:43:18 MSK 2021
Sergey,
Thanks for the patch! Please consider the comments below. I haven't
checked the test yet, since I don't get the JIT peculiarities from your
commit message and comments. Please provide a clearer description, and
I'll proceed with the review of the test case then.
On 19.07.21, Sergey Kaplun wrote:
> From: Mike Pall <mike>
>
> (cherry picked from commit 2f3f07882fb4ad9c64967d7088461b1ca0a25d3a)
>
> When LuaJIT is build with LJ_FR2 (GC64), information about frame takes
> two slots -- the first takes the TValue with the function to call, the
> second takes the additional frame information. The recording JIT
Minor: The second slot is the framelink in LuaJIT terms.
> machinery works pretty the same -- the function IR_KGC is loaded in the
> first slot, and the second is set to TREF_FRAME value. This value
> should be rewritten after return from a callee. It is done either by the
> return values either this slot is cleared (set to zero) manually with
> the next bytecode with RA dst mode with the assumption, that the dst RA
> takes the next slot after TREF_FRAME, i.e. an earlier instruction uses
> the smallest possible destination register (see `lj_record_ins()` for
> the details).
The main point lies in the monstrous 5-line sentence. I've read it
several times, but still don't get it. Could you please reword it into
a less complex sentence?
>
> Bytecode allocator swaps operands for ISGT and ISGE comparisons.
> When it happens, the aforementioned rule for registers allocations
> may be violated. When it happens, and this chunk is recording, the slot
> with TREF_FRAME is not rewritten (but the next empty slot after
> TREF_FRAME is) during bytecode recording. This leads to JIT slots
> inconsistency and assertion failure in `rec_check_slots()` during
> recording the next bytecode instruction.
>
> This patch fixes bytecode register allocation by changing the register
> allocation order in case of ISGT and ISGE bytecodes.
It's better to use "virtual register" or even "VM register" to avoid
ambiguous plain "register" usage.
>
> Sergey Kaplun:
> * added the description and the test for the problem
>
> Resolves tarantool/tarantool#6227
Minor: Why is #5629 not mentioned?
> ---
>
> Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-6227-fix-bytecode-allocator-for-comp
> Tarantool branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-6227-fix-bytecode-allocator-for-comp
> Issue: https://github.com/tarantool/tarantool/issues/6227
>
> src/lj_parse.c | 7 +++-
> ...ytecode-allocator-for-comparisons.test.lua | 41 +++++++++++++++++++
> 2 files changed, 46 insertions(+), 2 deletions(-)
> create mode 100644 test/tarantool-tests/gh-6227-bytecode-allocator-for-comparisons.test.lua
>
> diff --git a/src/lj_parse.c b/src/lj_parse.c
> index 08f7cfa6..a6325a76 100644
> --- a/src/lj_parse.c
> +++ b/src/lj_parse.c
> @@ -853,9 +853,12 @@ static void bcemit_comp(FuncState *fs, BinOpr opr, ExpDesc *e1, ExpDesc *e2)
> e1 = e2; e2 = eret; /* Swap operands. */
> op = ((op-BC_ISLT)^3)+BC_ISLT;
> expr_toval(fs, e1);
> + ra = expr_toanyreg(fs, e1);
> + rd = expr_toanyreg(fs, e2);
> + } else {
> + rd = expr_toanyreg(fs, e2);
> + ra = expr_toanyreg(fs, e1);
> }
> - rd = expr_toanyreg(fs, e2);
> - ra = expr_toanyreg(fs, e1);
> ins = BCINS_AD(op, ra, rd);
> }
> /* Using expr_free might cause asserts if the order is wrong. */
> diff --git a/test/tarantool-tests/gh-6227-bytecode-allocator-for-comparisons.test.lua b/test/tarantool-tests/gh-6227-bytecode-allocator-for-comparisons.test.lua
> new file mode 100644
> index 00000000..66f6885e
> --- /dev/null
> +++ b/test/tarantool-tests/gh-6227-bytecode-allocator-for-comparisons.test.lua
> @@ -0,0 +1,41 @@
> +local tap = require('tap')
> +local test = tap.test('gh-6227-bytecode-allocator-for-comparisons')
> +test:plan(1)
> +
> +-- Test file to demonstrate assertion failure during recording
> +-- wrong allocated bytecode for comparisons.
> +-- See also https://github.com/tarantool/tarantool/issues/6227.
> +
> +-- Need function with RET0 bytecode to avoid reset of
> +-- the first JIT slot with frame info. Also need no assignments
> +-- by the caller.
> +local function empty() end
> +
> +local uv = 0
> +
> +-- This function needs to reset register enumerating.
> +-- Also set `J->maxslot` to zero.
> +-- The upvalue function to call is loaded to 0 slot.
> +local function bump_frame()
> + -- First call function with RET0 to set TREF_FRAME in the
> + -- last slot.
> + empty()
> + -- Test ISGE or ISGT bytecode. These bytecodes swap their
> + -- operands. Also, a constant is always loaded into the slot
> + -- smaller than upvalue. So, if upvalue loads before KSHORT,
> + -- then the difference between registers is more than 2 (2 is
> + -- needed for LJ_FR2) and TREF_FRAME slot is not rewriting by
> + -- the bytecode after call and return as expected. That leads
> + -- to recording slots inconsistency and assertion failure at
> + -- `rec_check_slots()`.
> + empty(1>uv)
> +end
> +
> +jit.opt.start('hotloop=1')
> +
> +for _ = 1,3 do
> + bump_frame()
> +end
> +
> +test:ok(true)
> +os.exit(test:check() and 0 or 1)
> --
> 2.31.0
>
--
Best regards,
IM