Hi, Sergey,
thanks for the patch! See comments below.
<snipped>
I would rephrase: XXX: Test fails with reverted fix and enabled GC64.
diff --git a/test/tarantool-tests/lj-1116-redzones-checks.test.lua b/test/tarantool-tests/lj-1116-redzones-checks.test.lua
new file mode 100644
index 00000000..70062ec9
--- /dev/null
+++ b/test/tarantool-tests/lj-1116-redzones-checks.test.lua
@@ -0,0 +1,118 @@
+local tap = require('tap')
+-- Test file to demonstrate mcode area overflow during recording a
+-- trace with the high FPR pressure.
+-- See also, https://github.com/LuaJIT/LuaJIT/issues/1116.
+--
+-- XXX: Test fails only with GC64 enabled before the commit.
+local test = tap.test('lj-1116-redzones-checks'):skipcond({
+  ['Test requires JIT enabled'] = not jit.status(),
+})
+
+test:plan(1)
+
+jit.opt.start('hotloop=1')
+
+-- XXX: This test snippet was originally created by the fuzzer.
+-- See https://oss-fuzz.com/testcase-detail/5622965122170880.
+--
+-- Unfortunately, it's impossible to reduce the testcase further.
+-- Before the patch, assembling some instructions (like `IR_CONV
+-- int.num`, for example) with many mcode to be emitted may
+-- overflow the `MCLIM_REDZONE` (64) at once due to the huge
+-- mcode emitting.
+-- For example `IR_CONV` in this test requires 66 bytes of the
+-- machine code:
+-- | cvttsd2si r15d, xmm5
+-- | xorps xmm9, xmm9
+-- | cvtsi2sd xmm9, r15d
+-- | ucomisd xmm5, xmm9
+-- | jnz 0x11edb00e5 ->37
+-- | jpe 0x11edb00e5 ->37
+-- | mov [rsp+0x80], r15d
+-- | mov r15, [rsp+0xe8]
+-- | movsd xmm9, [rsp+0xe0]
+-- | movsd xmm5, [rsp+0xd8]
+--
+-- The reproducer needs sufficient register pressure as to
+-- immediately spill the result of the instruction to the stack
+-- and then reload the three registers used by the instruction,
+-- and to have chosen enough registers with numbers >=8 (because
+-- shaving off a REX prefix [1] or two would get 66 back down
+-- to <= `MCLIM_REDZONE`), and to be using lots of spill slots
+-- (because memory offsets <= 0x7f are shorter to encode compared
+-- to those >= 0x80. So, each reload instruction consumes 9 bytes.
+-- This makes this reproducer unstable (regarding the register
+-- allocator changes). So, lets use this as a regression test.
+--
+-- [1]: https://wiki.osdev.org/X86-64_Instruction_Encoding#REX_prefix
+
+_G.a = 0
+_G.b = 0
+_G.c = 0
+_G.d = 0
+_G.e = 0
+_G.f = 0
+_G.g = 0
+_G.h = 0
+-- Skip `i`.
I didn't get it. Why is `i` skipped here?
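As a side note on the 66-byte figure in the test comment above: a rough tally of the listed instructions does come out to 66. The per-instruction sizes below are my own hand count from the x86-64 encoding rules (REX prefix, SIB byte for rsp-based addressing, 32-bit displacement for offsets >= 0x80, rel32 for the exit jumps); they are an assumption and were not taken from the patch or from a disassembler.

```lua
-- Hand-counted encoding sizes (assumption, not measured).
local insns = {
  {'cvttsd2si r15d, xmm5',   5},  -- F2 + REX + 0F 2C + ModRM
  {'xorps xmm9, xmm9',       4},  -- REX + 0F 57 + ModRM
  {'cvtsi2sd xmm9, r15d',    5},  -- F2 + REX + 0F 2A + ModRM
  {'ucomisd xmm5, xmm9',     5},  -- 66 + REX + 0F 2E + ModRM
  {'jnz ->37',               6},  -- 0F 85 + rel32
  {'jpe ->37',               6},  -- 0F 8A + rel32
  {'mov [rsp+0x80], r15d',   8},  -- REX + 89 + ModRM + SIB + disp32
  {'mov r15, [rsp+0xe8]',    8},  -- REX.W + 8B + ModRM + SIB + disp32
  {'movsd xmm9, [rsp+0xe0]', 10}, -- F2 + REX + 0F 10 + ModRM + SIB + disp32
  {'movsd xmm5, [rsp+0xd8]', 9},  -- F2 + 0F 10 + ModRM + SIB + disp32
}

-- Redzone size as quoted in the test comment.
local MCLIM_REDZONE = 64

local total = 0
for _, insn in ipairs(insns) do
  total = total + insn[2]
end

print(total, total > MCLIM_REDZONE)
```

By this count the three reloads take 8, 10 and 9 bytes, so "each reload instruction consumes 9 bytes" holds only on average. Dropping a single REX prefix gives 65, which is still above the limit, so it takes two of them (or shorter disp8 offsets) to get back to <= 64.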
<snipped>