Hi, Sergey,

thanks for the patch! See comments below.

On 16.01.2025 16:35, Sergey Kaplun wrote:
> diff --git a/test/tarantool-tests/lj-1116-redzones-checks.test.lua b/test/tarantool-tests/lj-1116-redzones-checks.test.lua
> new file mode 100644
> index 00000000..70062ec9
> --- /dev/null
> +++ b/test/tarantool-tests/lj-1116-redzones-checks.test.lua
> @@ -0,0 +1,118 @@
> +local tap = require('tap')
> +-- Test file to demonstrate mcode area overflow during recording a
> +-- trace with the high FPR pressure.
> +-- See also, https://github.com/LuaJIT/LuaJIT/issues/1116.
> +--
> +-- XXX: Test fails only with GC64 enabled before the commit.

I would rephrase:

XXX: Test fails with reverted fix and enabled GC64.

> +local test = tap.test('lj-1116-redzones-checks'):skipcond({
> +  ['Test requires JIT enabled'] = not jit.status(),
> +})
> +
> +test:plan(1)
> +
> +jit.opt.start('hotloop=1')
> +
> +-- XXX: This test snippet was originally created by the fuzzer.
> +-- See https://oss-fuzz.com/testcase-detail/5622965122170880.
> +--
> +-- Unfortunately, it's impossible to reduce the testcase further.
> +-- Before the patch, assembling some instructions (like `IR_CONV
> +-- int.num`, for example) with many mcode to be emitted may
> +-- overflow the `MCLIM_REDZONE` (64) at once due to the huge
> +-- mcode emitting.
> +-- For example `IR_CONV` in this test requires 66 bytes of the
> +-- machine code:
> +-- | cvttsd2si r15d, xmm5
> +-- | xorps xmm9, xmm9
> +-- | cvtsi2sd xmm9, r15d
> +-- | ucomisd xmm5, xmm9
> +-- | jnz 0x11edb00e5 ->37
> +-- | jpe 0x11edb00e5 ->37
> +-- | mov [rsp+0x80], r15d
> +-- | mov r15, [rsp+0xe8]
> +-- | movsd xmm9, [rsp+0xe0]
> +-- | movsd xmm5, [rsp+0xd8]
> +--
> +-- The reproducer needs sufficient register pressure as to
> +-- immediately spill the result of the instruction to the stack
> +-- and then reload the three registers used by the instruction,
> +-- and to have chosen enough registers with numbers >=8 (because
> +-- shaving off a REX prefix [1] or two would get 66 back down
> +-- to <= `MCLIM_REDZONE`), and to be using lots of spill slots
> +-- (because memory offsets <= 0x7f are shorter to encode compared
> +-- to those >= 0x80. So, each reload instruction consumes 9 bytes.
> +-- This makes this reproducer unstable (regarding the register
> +-- allocator changes). So, lets use this as a regression test.
> +--
> +-- [1]: https://wiki.osdev.org/X86-64_Instruction_Encoding#REX_prefix
> +
> +_G.a = 0
> +_G.b = 0
> +_G.c = 0
> +_G.d = 0
> +_G.e = 0
> +_G.f = 0
> +_G.g = 0
> +_G.h = 0
> +-- Skip `i`.

I didn't get it.
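
A side note on the byte count in the quoted comment above: the 66-byte figure can be sanity-checked with a quick tally. The sketch below is only an estimate. The per-instruction sizes are assumptions derived from the generic x86-64 encoding rules (rel32 conditional jumps, disp32 stack offsets, one REX byte whenever a register numbered >= 8 is used); they are not taken from the patch or from a disassembler dump, and the table name `insns` is made up for illustration.

```lua
-- Rough tally of the machine code listed in the quoted comment.
-- All sizes are assumed from the standard x86-64 encoding rules.
local insns = {
  {'cvttsd2si r15d, xmm5',    5}, -- F2 + REX + 0F 2C + ModRM
  {'xorps xmm9, xmm9',        4}, -- REX + 0F 57 + ModRM
  {'cvtsi2sd xmm9, r15d',     5}, -- F2 + REX + 0F 2A + ModRM
  {'ucomisd xmm5, xmm9',      5}, -- 66 + REX + 0F 2E + ModRM
  {'jnz ->37',                6}, -- 0F 85 + rel32
  {'jpe ->37',                6}, -- 0F 8A + rel32
  {'mov [rsp+0x80], r15d',    8}, -- REX + 89 + ModRM + SIB + disp32
  {'mov r15, [rsp+0xe8]',     8}, -- REX.W + 8B + ModRM + SIB + disp32
  {'movsd xmm9, [rsp+0xe0]', 10}, -- F2 + REX + 0F 10 + ModRM + SIB + disp32
  {'movsd xmm5, [rsp+0xd8]',  9}, -- F2 + 0F 10 + ModRM + SIB + disp32
}

-- Value quoted in the comment above.
local MCLIM_REDZONE = 64

local total = 0
for _, insn in ipairs(insns) do
  total = total + insn[2]
end

print(('total = %d, redzone = %d, overflow = %s'):format(
  total, MCLIM_REDZONE, tostring(total > MCLIM_REDZONE)))
```

Under these assumptions the total is 66 bytes, which matches the comment. The three reloads come out at 8, 10 and 9 bytes, so "9 bytes" looks like an average rather than the exact size of each reload. Dropping two REX prefixes, or moving one spill slot to an offset <= 0x7f (disp8 instead of disp32), would already bring the total back to <= 64.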