Hi, Sergey,
thanks for the patch! See comments below.
<snipped>
diff --git a/test/tarantool-tests/lj-1116-redzones-checks.test.lua b/test/tarantool-tests/lj-1116-redzones-checks.test.lua
new file mode 100644
index 00000000..70062ec9
--- /dev/null
+++ b/test/tarantool-tests/lj-1116-redzones-checks.test.lua
@@ -0,0 +1,118 @@
+local tap = require('tap')
+-- Test file to demonstrate mcode area overflow during recording a
+-- trace with high FPR pressure.
+-- See also, https://github.com/LuaJIT/LuaJIT/issues/1116.
+--
+-- XXX: Test fails only with GC64 enabled before the commit.
I would rephrase: "XXX: The test fails only with GC64 enabled and
the fix reverted."
+local test = tap.test('lj-1116-redzones-checks'):skipcond({
+ ['Test requires JIT enabled'] = not jit.status(),
+})
+
+test:plan(1)
+
+jit.opt.start('hotloop=1')
+
+-- XXX: This test snippet was originally created by the fuzzer.
+-- See https://oss-fuzz.com/testcase-detail/5622965122170880.
+--
+-- Unfortunately, it's impossible to reduce the testcase further.
+-- Before the patch, assembling some instructions (like `IR_CONV
+-- int.num`, for example) that emit a lot of machine code may
+-- overflow the `MCLIM_REDZONE` (64 bytes) at once, i.e. within
+-- a single instruction.
+-- For example `IR_CONV` in this test requires 66 bytes of the
+-- machine code:
+-- | cvttsd2si r15d, xmm5
+-- | xorps xmm9, xmm9
+-- | cvtsi2sd xmm9, r15d
+-- | ucomisd xmm5, xmm9
+-- | jnz 0x11edb00e5 ->37
+-- | jpe 0x11edb00e5 ->37
+-- | mov [rsp+0x80], r15d
+-- | mov r15, [rsp+0xe8]
+-- | movsd xmm9, [rsp+0xe0]
+-- | movsd xmm5, [rsp+0xd8]
+--
+-- The reproducer needs sufficient register pressure to
+-- immediately spill the result of the instruction to the stack
+-- and then reload the three registers used by the instruction,
+-- and to have chosen enough registers with numbers >=8 (because
+-- shaving off a REX prefix [1] or two would get 66 back down
+-- to <= `MCLIM_REDZONE`), and to be using lots of spill slots
+-- (because memory offsets <= 0x7f are shorter to encode compared
+-- to those >= 0x80). So, each reload instruction consumes 9 bytes.
+-- This makes this reproducer unstable (regarding the register
+-- allocator changes). So, let's use this as a regression test.
+--
+-- [1]: https://wiki.osdev.org/X86-64_Instruction_Encoding#REX_prefix
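The 66-byte figure in the comment above can be cross-checked by summing
the encoded length of each listed instruction. Below is a minimal
sketch, not part of the patch: the per-instruction byte counts are my
own tally, assuming the usual x86-64 encodings (rel32 exit jumps, a SIB
byte for rsp-based addressing, disp32 for offsets >= 0x80). By that
tally the three reloads take 8, 10 and 9 bytes, i.e. about 9 each.

```lua
-- Sanity check for the mcode size quoted in the test comment.
-- ASSUMPTION: the byte counts are a hand tally of the standard
-- x86-64 encodings, they are not taken from the patch itself.
local MCLIM_REDZONE = 64 -- Value quoted in the test comment.

local insns = {
  {'cvttsd2si r15d, xmm5',     5},  -- F2, REX, 0F 2C, ModRM
  {'xorps xmm9, xmm9',         4},  -- REX, 0F 57, ModRM
  {'cvtsi2sd xmm9, r15d',      5},  -- F2, REX, 0F 2A, ModRM
  {'ucomisd xmm5, xmm9',       5},  -- 66, REX, 0F 2E, ModRM
  {'jnz ->37',                 6},  -- 0F 85, rel32
  {'jpe ->37',                 6},  -- 0F 8A, rel32
  {'mov [rsp+0x80], r15d',     8},  -- REX, 89, ModRM, SIB, disp32
  {'mov r15, [rsp+0xe8]',      8},  -- REX, 8B, ModRM, SIB, disp32
  {'movsd xmm9, [rsp+0xe0]',   10}, -- F2, REX, 0F 10, ModRM, SIB, disp32
  {'movsd xmm5, [rsp+0xd8]',   9},  -- F2, 0F 10, ModRM, SIB, disp32
}

-- Sum the encoded sizes of all instructions in the listing.
local total = 0
for _, insn in ipairs(insns) do
  total = total + insn[2]
end

-- Dropping two REX prefixes (registers with numbers < 8) saves
-- one byte each.
local without_two_rex = total - 2

print(('total mcode size: %d bytes'):format(total))
print(('exceeds the red zone: %s'):format(tostring(total > MCLIM_REDZONE)))
print(('fits without two REX prefixes: %s'):format(
  tostring(without_two_rex <= MCLIM_REDZONE)))
```

This also illustrates why the reproducer is fragile: losing just two
REX prefixes, or moving the spill slots below 0x80, already brings the
size back within the red zone.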
+
+_G.a = 0
+_G.b = 0
+_G.c = 0
+_G.d = 0
+_G.e = 0
+_G.f = 0
+_G.g = 0
+_G.h = 0
+-- Skip `i`.
I didn't get it. Why is `i` skipped?
<snipped>