Hi, Sergey, thanks for review! Please see my comments below. Sergey On 9/1/25 16:07, Sergey Kaplun via Tarantool-patches wrote: > Hi, Sergey! > Thanks for the patch! > Please consider my comments below. > > On 27.08.25, Sergey Bronnikov wrote: >> Thanks to Peter Cawley. >> >> (cherry picked from commit d1a2fef8a8f53b0055ee041f7f63d83a27444ffa) >> >> The builtin `pcall()` has two separate ways by which it can >> grow the stack by one slot: >> >> 1. Resolving the `__call` metamethod of its first argument. > This is unrelated to this patch, so it can be omitted. > >> 2. Growing the stack by one slot in LJ_FR2 mode. >> >> The first case leads to a stack smash if `pcall()` is used as >> `__call`. Setting a metatable with this metamethod will cause >> an infinite loop which fills up the stack with `pcall`-frames >> and then keeps going beyond the end of the stack until it segfaults. > This issue is not related to this patch. > >> Either of these points can cause an issue if `pcall()` is used as >> `__newindex`. > Looks like the metamethods are not required for issue reproducing. > >> The patch partially fixes aforementioned issues. > By how? I've updated the commit message as the following:     Stack overflow can cause a segmentation fault in vararg     function on ARM64 and MIPS64 in LJ_FR2 mode. This happen     because stack check in BC_IFUNCV is off by one on these     platforms without the patch. The patch partially fixes     aforementioned issue by bumping LJ_STACK_EXTRA by 1 to     give a space to write the entire frame link and fixing     a number of last free slot in the stack. > >> Sergey Bronnikov: >> * added the description and the test for the problem >> >> Part of tarantool/tarantool#11691 >> --- >> src/lj_def.h | 2 +- >> src/lj_dispatch.c | 2 +- >> src/vm_arm64.dasc | 1 + >> src/vm_mips64.dasc | 1 + >> ...048-fix-stack-checks-vararg-calls.test.lua | 56 +++++++++++++++++++ >> 5 files changed, 60 insertions(+), 2 deletions(-) >> create mode 100644 test/tarantool-tests/lj-1048-fix-stack-checks-vararg-calls.test.lua >> >> diff --git a/src/lj_def.h b/src/lj_def.h > > >> diff --git a/src/lj_dispatch.c b/src/lj_dispatch.c >> index a44a5adf..431cb3c2 100644 >> --- a/src/lj_dispatch.c >> +++ b/src/lj_dispatch.c >> @@ -453,7 +453,7 @@ static int call_init(lua_State *L, GCfunc *fn) >> int numparams = pt->numparams; >> int gotparams = (int)(L->top - L->base); >> int need = pt->framesize; >> - if ((pt->flags & PROTO_VARARG)) need += 1+gotparams; >> + if ((pt->flags & PROTO_VARARG)) need += 1+LJ_FR2+gotparams; > I can't see the test related to this change. Not `prober_1()` nor > `prober_2()` lead to the assertion failure for x86_64 or aarch64 without > it. Please check again. Both testcases trigger segfault on AArch64 (odroid). cmake -S . -B build -DCMAKE_BUILD_TYPE=Debug  -DLUA_USE_ASSERT=ON -DLUA_USE_APICHECK=ON cmake --build build --parallel LUA_PATH="/root/sergeyb/luajit/test/tarantool-tests/?.lua;/root/sergeyb/luajit/test/tarantool-tests/?/init.lua;/root/sergeyb/luajit/src/?.lua;/root/sergeyb/luajit/build/src/?.lua;;" gdb --args /root/sergeyb/luajit/build/src/luajit "-e" "dofile[[/root/sergeyb/luajit/test/luajit-test-init.lua]]" "/root/sergeyb/luajit/test/tarantool-tests/lj-1048-fix-stack-checks-vararg-calls.test.lua" (gdb) bt #0  0x00000055555c16f4 in lj_alloc_free (msp=0x7fb7d56010, ptr=0x7fb7d69088)     at /root/sergeyb/luajit/src/lj_alloc.c:1405 #1  0x00000055555c1fe4 in lj_alloc_realloc (msp=0x7fb7d56010, ptr=0x7fb7d69088, nsize=1696)     at /root/sergeyb/luajit/src/lj_alloc.c:1471 #2  0x00000055555c204c in lj_alloc_f (msp=0x7fb7d56010, ptr=0x7fb7d69088, osize=816, nsize=1696)     at /root/sergeyb/luajit/src/lj_alloc.c:1486 #3  0x00000055555790e0 in lj_mem_realloc (L=0x7fb7d6d330, p=0x7fb7d69088, osz=816, nsz=1696)     at /root/sergeyb/luajit/src/lj_gc.c:896 #4  0x000000555557e610 in resizestack (L=0x7fb7d6d330, n=204) at /root/sergeyb/luajit/src/lj_state.c:82 #5  0x000000555557e970 in lj_state_growstack (L=0x7fb7d6d330, need=48)     at /root/sergeyb/luajit/src/lj_state.c:130 #6  0x00000055555fad68 in lj_vm_growstack_l () at buildvm_arm64.dasc:1263 #7  0x00000055555fb8d4 in lj_ff_coroutine_wrap_aux () at buildvm_arm64.dasc:1775 #8  0x000000555556a824 in lua_pcall (L=0x7fb7d56378, nargs=0, nresults=-1, errfunc=2)     at /root/sergeyb/luajit/src/lj_api.c:1173 #9  0x000000555555d258 in docall (L=0x7fb7d56378, narg=0, clear=0) at /root/sergeyb/luajit/src/luajit.c:134 #10 0x000000555555db9c in handle_script (L=0x7fb7d56378, argx=0x7ffffff280)     at /root/sergeyb/luajit/src/luajit.c:304 #11 0x000000555555ea54 in pmain (L=0x7fb7d56378) at /root/sergeyb/luajit/src/luajit.c:602 #12 0x00000055555fab90 in lj_BC_FUNCC () at buildvm_arm64.dasc:894 #13 0x000000555556ad90 in lua_cpcall (L=0x7fb7d56378, func=0x555555e898 , ud=0x0)     at /root/sergeyb/luajit/src/lj_api.c:1208 #14 0x000000555555ebb4 in main (argc=4, argv=0x7ffffff268) at /root/sergeyb/luajit/src/luajit.c:633 (gdb) With commented out first testcase: (gdb) bt #0  0x00000055555c18fc in lj_alloc_free (msp=0x7fb7d56010, ptr=0x7fb7d69068)     at /root/sergeyb/luajit/src/lj_alloc.c:1406 #1  0x00000055555c1fe4 in lj_alloc_realloc (msp=0x7fb7d56010, ptr=0x7fb7d69068, nsize=1696)     at /root/sergeyb/luajit/src/lj_alloc.c:1471 #2  0x00000055555c204c in lj_alloc_f (msp=0x7fb7d56010, ptr=0x7fb7d69068, osize=816, nsize=1696)     at /root/sergeyb/luajit/src/lj_alloc.c:1486 #3  0x00000055555790e0 in lj_mem_realloc (L=0x7fb7d6d2a0, p=0x7fb7d69068, osz=816, nsz=1696)     at /root/sergeyb/luajit/src/lj_gc.c:896 #4  0x000000555557e610 in resizestack (L=0x7fb7d6d2a0, n=204) at /root/sergeyb/luajit/src/lj_state.c:82 #5  0x000000555557e970 in lj_state_growstack (L=0x7fb7d6d2a0, need=48)     at /root/sergeyb/luajit/src/lj_state.c:130 #6  0x00000055555fad68 in lj_vm_growstack_l () at buildvm_arm64.dasc:1263 #7  0x00000055555fb8d4 in lj_ff_coroutine_wrap_aux () at buildvm_arm64.dasc:1775 #8  0x000000555556a824 in lua_pcall (L=0x7fb7d56378, nargs=0, nresults=-1, errfunc=2)     at /root/sergeyb/luajit/src/lj_api.c:1173 #9  0x000000555555d258 in docall (L=0x7fb7d56378, narg=0, clear=0) at /root/sergeyb/luajit/src/luajit.c:134 #10 0x000000555555db9c in handle_script (L=0x7fb7d56378, argx=0x7ffffff280)     at /root/sergeyb/luajit/src/luajit.c:304 #11 0x000000555555ea54 in pmain (L=0x7fb7d56378) at /root/sergeyb/luajit/src/luajit.c:602 #12 0x00000055555fab90 in lj_BC_FUNCC () at buildvm_arm64.dasc:894 #13 0x000000555556ad90 in lua_cpcall (L=0x7fb7d56378, func=0x555555e898 , ud=0x0)     at /root/sergeyb/luajit/src/lj_api.c:1208 #14 0x000000555555ebb4 in main (argc=4, argv=0x7ffffff268) at /root/sergeyb/luajit/src/luajit.c:633 (gdb) >> lj_state_checkstack(L, (MSize)need); >> numparams -= gotparams; >> return numparams >= 0 ? numparams : 0; >> diff --git a/src/vm_arm64.dasc b/src/vm_arm64.dasc >> index c5f0a7a7..cf8e575a 100644 >> --- a/src/vm_arm64.dasc >> +++ b/src/vm_arm64.dasc >> @@ -3779,6 +3779,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) >> | add TMP2, BASE, RC >> | addLFUNC:CARG3, CARG3, TMP0, lsl #47 >> | add RA, RA, RC >> + | sub CARG1, CARG1, #8 > Please mention in the commit message why the original stack check was > incorrect (for aarch64 and mips64). > > Also, mention why the x64 isn't affected: > > x64: > | RA == BASE + (RD=NARGS+1)*8 + framesize * 8 +8 > maxstack > The last summand here is the `LJ_FR2` adjustment. > > arm64|mips64 -- incorrect check: > | RA == BASE + (RD=NARGS)*8 + framesize * 8 >= maxstack > Added. >> | add TMP0, RC, #16+FRAME_VARG >> | strLFUNC:CARG3, [TMP2], #8 // Store (tagged) copy of LFUNC. >> | ldr KBASE, [PC, #-4+PC2PROTO(k)] >> diff --git a/src/vm_mips64.dasc b/src/vm_mips64.dasc >> index 44fba36c..7f49df5b 100644 >> --- a/src/vm_mips64.dasc >> +++ b/src/vm_mips64.dasc > > >> diff --git a/test/tarantool-tests/lj-1048-fix-stack-checks-vararg-calls.test.lua b/test/tarantool-tests/lj-1048-fix-stack-checks-vararg-calls.test.lua >> new file mode 100644 >> index 00000000..e300d5c1 >> --- /dev/null >> +++ b/test/tarantool-tests/lj-1048-fix-stack-checks-vararg-calls.test.lua >> @@ -0,0 +1,56 @@ >> +local tap = require('tap') >> + >> +-- A test file to demonstrate a stack overflow in `pcall()` in >> +-- some cases, see below testcase descriptions. >> +-- See alsohttps://github.com/LuaJIT/LuaJIT/issues/1048. >> +local test = tap.test('lj-1048-fix-stack-checks-vararg-calls'):skipcond({ >> + ['Test requires JIT enabled'] = not jit.status(), >> +}) >> + >> +test:plan(2) >> + >> +-- The first testcase demonstrate a stack overflow in `pcall()` >> +-- by recursive calling `pcall()`. The functions are vararg >> +-- because stack check in BC_IFUNCV is off by one without the > Minor: by one for the arm64, mips64 architectures. Updated (here and below): --- a/test/tarantool-tests/lj-1048-fix-stack-checks-vararg-calls.test.lua +++ b/test/tarantool-tests/lj-1048-fix-stack-checks-vararg-calls.test.lua @@ -11,8 +11,8 @@ test:plan(2)  -- The first testcase demonstrate a stack overflow in `pcall()`  -- by recursive calling `pcall()`. The functions are vararg --- because stack check in BC_IFUNCV is off by one without the --- patch. +-- because stack check in BC_IFUNCV is off by one on ARM64, +-- MIPS64 without the patch.  local function prober_1(...) -- luacheck: no unused    pcall(pcall, pcall, pcall, pcall, pcall, pcall, pcall, pcall, pairs, {})  end > >> +-- patch. >> +local function prober_1(...) -- luacheck: no unused >> + pcall(pcall, pcall, pcall, pcall, pcall, pcall, pcall, pcall, pairs, {}) >> +end > Why do we want to use probber_1 here? Why is this different from the > second example? Only because of the metamethods? > > If we want to keep it, please describe why we need at least 9 pcall-s. As I got right, exactly this number of pcall's is needed to trigger a stack overflow. > > Also, there is no need for `pairs()` here. Let's use another simpler fast > function (like `type()`). (discussed in a private conversation) Updated:  local function prober_1(...) -- luacheck: no unused -  pcall(pcall, pcall, pcall, pcall, pcall, pcall, pcall, pcall, pairs, {}) +  -- Any fast function can be used, but `type` is most convenient +  -- here because it works fast and can be used with any data type. +  pcall(pcall, pcall, pcall, pcall, pcall, pcall, pcall, pcall, type, 0)  end  local function looper_1(n, ...) > Also, please add a comment about fast function > usage, see the example below. > >> + >> +local function looper_1(n, ...) >> + prober_1(...) >> + prober_1(nil, ...) > Why do we need `nil` here? I suppose this line is excess, see the > comment with the example below. Right, removed:  end  local function looper_1(n, ...)    prober_1(...) -  prober_1(nil, ...)    return looper_1(n + 1, n, ...)  end > >> + return looper_1(n + 1, n, ...) >> +end >> + >> +pcall(coroutine.wrap(looper_1), 0) >> + >> +test:ok(true, 'no stack overflow with recursive pcall') >> + >> +-- The second testcase demonstrate a stack overflow in `pcall()` >> +-- with using metamethods. A stack overflow is triggered when >> +-- `pcall()` is used as `__call` metamethod, setting metatable >> +-- will cause an infinite loop which fills up the stack with >> +-- `pcall`-frames and then keeps going beyond the end of the >> +-- stack until it segfaults. > This comment is unrelated to this test. Updated and now it looks as the following: -- The testcase demonstrate a stack overflow when `pcall()` -- is used as `__newindex` metamethod. The function is vararg -- because stack check in BC_IFUNCV is off by one on ARM64 -- and MIPS64 without the patch. > >> Also, a stack overflow can be >> +-- triggered when `pcall()` is used as `__newindex` metamethod. >> +-- The functions are vararg because stack check in BC_IFUNCV is >> +-- off by one without the patch. >> + >> +local mt = setmetatable({}, { __newindex = pcall, __call = pairs }) >> + >> +local function prober_2(...) -- luacheck: no unused >> + mt[mt] = mt >> +end >> + >> +local function looper_2(n, ...) >> + prober_2(...) >> + prober_2(nil, ...) >> + return looper_2(n + 1, n, ...) >> +end >> + >> +pcall(coroutine.wrap(looper_2), 0) > This can be simplified to the following: > | src/luajit -e ' > | -- Do not use a Lua function as metamethod -- since it will check > | -- the stack on each invocation. Use simple `type()` built-in > | -- instead. > | local t = setmetatable({}, {__newindex = pcall, __call = type}) > | local function prober(...) > | -- Invokes `pcall(t, t, t)`. > | t[t] = t > | end > | local function looper(n, ...) > | prober(...) > | return looper(n+1, n, ...) > | end > | pcall(coroutine.wrap(looper), 0) > | ' Updated (added a comment about FF and removed prober() with nil): @@ -37,15 +38,18 @@ test:ok(true, 'no stack overflow with recursive pcall')  -- The functions are vararg because stack check in BC_IFUNCV is  -- off by one without the patch. -local mt = setmetatable({}, { __newindex = pcall, __call = pairs }) +-- The `type()` function is more convenient here, it works fast +-- and can be used with any data type. However, any fast function +-- can be used instead. +local t = setmetatable({}, { __newindex = pcall, __call = type })  local function prober_2(...) -- luacheck: no unused -  mt[mt] = mt +  -- Invokes `pcall(t, t, t)`. +  t[t] = t  end  local function looper_2(n, ...)    prober_2(...) -  prober_2(nil, ...)    return looper_2(n + 1, n, ...)  end >> + >> +test:ok(true, 'no stack overflow with using metamethod') >> + >> +test:done(true) >> -- >> 2.43.0 >>