Hi, Sergey,
thanks for review! Please see my comments below.
Sergey
Hi, Sergey! Thanks for the patch! Please consider my comments below. On 27.08.25, Sergey Bronnikov wrote:Thanks to Peter Cawley. (cherry picked from commit d1a2fef8a8f53b0055ee041f7f63d83a27444ffa) The builtin `pcall()` has two separate ways by which it can grow the stack by one slot: 1. Resolving the `__call` metamethod of its first argument.This is unrelated to this patch, so it can be omitted.
2. Growing the stack by one slot in LJ_FR2 mode. The first case leads to a stack smash if `pcall()` is used as `__call`. Setting a metatable with this metamethod will cause an infinite loop which fills up the stack with `pcall`-frames and then keeps going beyond the end of the stack until it segfaults.This issue is not related to this patch.Either of these points can cause an issue if `pcall()` is used as `__newindex`.Looks like the metamethods are not required for issue reproducing.The patch partially fixes aforementioned issues.By how?
I've updated the commit message as the following:
Stack overflow can cause a segmentation fault in vararg
function on ARM64 and MIPS64 in LJ_FR2 mode. This happen
because stack check in BC_IFUNCV is off by one on these
platforms without the patch. The patch partially fixes
aforementioned issue by bumping LJ_STACK_EXTRA by 1 to
give a space to write the entire frame link and fixing
a number of last free slot in the stack.
Sergey Bronnikov: * added the description and the test for the problem Part of tarantool/tarantool#11691 --- src/lj_def.h | 2 +- src/lj_dispatch.c | 2 +- src/vm_arm64.dasc | 1 + src/vm_mips64.dasc | 1 + ...048-fix-stack-checks-vararg-calls.test.lua | 56 +++++++++++++++++++ 5 files changed, 60 insertions(+), 2 deletions(-) create mode 100644 test/tarantool-tests/lj-1048-fix-stack-checks-vararg-calls.test.lua diff --git a/src/lj_def.h b/src/lj_def.h<snipped>diff --git a/src/lj_dispatch.c b/src/lj_dispatch.c index a44a5adf..431cb3c2 100644 --- a/src/lj_dispatch.c +++ b/src/lj_dispatch.c @@ -453,7 +453,7 @@ static int call_init(lua_State *L, GCfunc *fn) int numparams = pt->numparams; int gotparams = (int)(L->top - L->base); int need = pt->framesize; - if ((pt->flags & PROTO_VARARG)) need += 1+gotparams; + if ((pt->flags & PROTO_VARARG)) need += 1+LJ_FR2+gotparams;I can't see the test related to this change. Not `prober_1()` nor `prober_2()` lead to the assertion failure for x86_64 or aarch64 without it.
Please check again. Both testcases trigger segfault on AArch64 (odroid).
cmake -S . -B build -DCMAKE_BUILD_TYPE=Debug -DLUA_USE_ASSERT=ON -DLUA_USE_APICHECK=ON
cmake --build build --parallel
LUA_PATH="/root/sergeyb/luajit/test/tarantool-tests/?.lua;/root/sergeyb/luajit/test/tarantool-tests/?/init.lua;/root/sergeyb/luajit/src/?.lua;/root/sergeyb/luajit/build/src/?.lua;;" gdb --args /root/sergeyb/luajit/build/src/luajit "-e" "dofile[[/root/sergeyb/luajit/test/luajit-test-init.lua]]" "/root/sergeyb/luajit/test/tarantool-tests/lj-1048-fix-stack-checks-vararg-calls.test.lua"
(gdb) bt
#0 0x00000055555c16f4 in lj_alloc_free (msp=0x7fb7d56010, ptr=0x7fb7d69088)(gdb)
With commented out first testcase:
(gdb) bt
#0 0x00000055555c18fc in lj_alloc_free (msp=0x7fb7d56010,
ptr=0x7fb7d69068)
at /root/sergeyb/luajit/src/lj_alloc.c:1406
#1 0x00000055555c1fe4 in lj_alloc_realloc (msp=0x7fb7d56010,
ptr=0x7fb7d69068, nsize=1696)
at /root/sergeyb/luajit/src/lj_alloc.c:1471
#2 0x00000055555c204c in lj_alloc_f (msp=0x7fb7d56010,
ptr=0x7fb7d69068, osize=816, nsize=1696)
at /root/sergeyb/luajit/src/lj_alloc.c:1486
#3 0x00000055555790e0 in lj_mem_realloc (L=0x7fb7d6d2a0,
p=0x7fb7d69068, osz=816, nsz=1696)
at /root/sergeyb/luajit/src/lj_gc.c:896
#4 0x000000555557e610 in resizestack (L=0x7fb7d6d2a0, n=204) at
/root/sergeyb/luajit/src/lj_state.c:82
#5 0x000000555557e970 in lj_state_growstack (L=0x7fb7d6d2a0,
need=48)
at /root/sergeyb/luajit/src/lj_state.c:130
#6 0x00000055555fad68 in lj_vm_growstack_l () at
buildvm_arm64.dasc:1263
#7 0x00000055555fb8d4 in lj_ff_coroutine_wrap_aux () at
buildvm_arm64.dasc:1775
#8 0x000000555556a824 in lua_pcall (L=0x7fb7d56378, nargs=0,
nresults=-1, errfunc=2)
at /root/sergeyb/luajit/src/lj_api.c:1173
#9 0x000000555555d258 in docall (L=0x7fb7d56378, narg=0, clear=0)
at /root/sergeyb/luajit/src/luajit.c:134
#10 0x000000555555db9c in handle_script (L=0x7fb7d56378,
argx=0x7ffffff280)
at /root/sergeyb/luajit/src/luajit.c:304
#11 0x000000555555ea54 in pmain (L=0x7fb7d56378) at
/root/sergeyb/luajit/src/luajit.c:602
#12 0x00000055555fab90 in lj_BC_FUNCC () at buildvm_arm64.dasc:894
#13 0x000000555556ad90 in lua_cpcall (L=0x7fb7d56378,
func=0x555555e898 <pmain>, ud=0x0)
at /root/sergeyb/luajit/src/lj_api.c:1208
#14 0x000000555555ebb4 in main (argc=4, argv=0x7ffffff268) at
/root/sergeyb/luajit/src/luajit.c:633
(gdb)
Added.lj_state_checkstack(L, (MSize)need); numparams -= gotparams; return numparams >= 0 ? numparams : 0; diff --git a/src/vm_arm64.dasc b/src/vm_arm64.dasc index c5f0a7a7..cf8e575a 100644 --- a/src/vm_arm64.dasc +++ b/src/vm_arm64.dasc @@ -3779,6 +3779,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) | add TMP2, BASE, RC | add LFUNC:CARG3, CARG3, TMP0, lsl #47 | add RA, RA, RC + | sub CARG1, CARG1, #8Please mention in the commit message why the original stack check was incorrect (for aarch64 and mips64). Also, mention why the x64 isn't affected: x64: | RA == BASE + (RD=NARGS+1)*8 + framesize * 8 +8 > maxstack The last summand here is the `LJ_FR2` adjustment. arm64|mips64 -- incorrect check: | RA == BASE + (RD=NARGS)*8 + framesize * 8 >= maxstack
| add TMP0, RC, #16+FRAME_VARG | str LFUNC:CARG3, [TMP2], #8 // Store (tagged) copy of LFUNC. | ldr KBASE, [PC, #-4+PC2PROTO(k)] diff --git a/src/vm_mips64.dasc b/src/vm_mips64.dasc index 44fba36c..7f49df5b 100644 --- a/src/vm_mips64.dasc +++ b/src/vm_mips64.dasc<snipped>diff --git a/test/tarantool-tests/lj-1048-fix-stack-checks-vararg-calls.test.lua b/test/tarantool-tests/lj-1048-fix-stack-checks-vararg-calls.test.lua new file mode 100644 index 00000000..e300d5c1 --- /dev/null +++ b/test/tarantool-tests/lj-1048-fix-stack-checks-vararg-calls.test.lua @@ -0,0 +1,56 @@ +local tap = require('tap') + +-- A test file to demonstrate a stack overflow in `pcall()` in +-- some cases, see below testcase descriptions. +-- See also https://github.com/LuaJIT/LuaJIT/issues/1048. +local test = tap.test('lj-1048-fix-stack-checks-vararg-calls'):skipcond({ + ['Test requires JIT enabled'] = not jit.status(), +}) + +test:plan(2) + +-- The first testcase demonstrate a stack overflow in `pcall()` +-- by recursive calling `pcall()`. The functions are vararg +-- because stack check in BC_IFUNCV is off by one without theMinor: by one for the arm64, mips64 architectures.
Updated (here and below):
---
a/test/tarantool-tests/lj-1048-fix-stack-checks-vararg-calls.test.lua
+++
b/test/tarantool-tests/lj-1048-fix-stack-checks-vararg-calls.test.lua
@@ -11,8 +11,8 @@ test:plan(2)
-- The first testcase demonstrate a stack overflow in `pcall()`
-- by recursive calling `pcall()`. The functions are vararg
--- because stack check in BC_IFUNCV is off by one without the
--- patch.
+-- because stack check in BC_IFUNCV is off by one on ARM64,
+-- MIPS64 without the patch.
local function prober_1(...) -- luacheck: no unused
pcall(pcall, pcall, pcall, pcall, pcall, pcall, pcall, pcall,
pairs, {})
end
As I got right, exactly this number of pcall's is needed to trigger a stack overflow.+-- patch. +local function prober_1(...) -- luacheck: no unused + pcall(pcall, pcall, pcall, pcall, pcall, pcall, pcall, pcall, pairs, {}) +endWhy do we want to use probber_1 here? Why is this different from the second example? Only because of the metamethods? If we want to keep it, please describe why we need at least 9 pcall-s.
Also, there is no need for `pairs()` here. Let's use another simpler fast function (like `type()`).
(discussed in a private conversation)
Updated:
local function prober_1(...) -- luacheck: no unused
- pcall(pcall, pcall, pcall, pcall, pcall, pcall, pcall, pcall,
pairs, {})
+ -- Any fast function can be used, but `type` is most convenient
+ -- here because it works fast and can be used with any data
type.
+ pcall(pcall, pcall, pcall, pcall, pcall, pcall, pcall, pcall,
type, 0)
end
local function looper_1(n, ...)
Also, please add a comment about fast function usage, see the example below.+ +local function looper_1(n, ...) + prober_1(...) + prober_1(nil, ...)Why do we need `nil` here? I suppose this line is excess, see the comment with the example below.
Right, removed:
end
local function looper_1(n, ...)
prober_1(...)
- prober_1(nil, ...)
return looper_1(n + 1, n, ...)
end
+ return looper_1(n + 1, n, ...) +end + +pcall(coroutine.wrap(looper_1), 0) + +test:ok(true, 'no stack overflow with recursive pcall') + +-- The second testcase demonstrate a stack overflow in `pcall()` +-- with using metamethods. A stack overflow is triggered when +-- `pcall()` is used as `__call` metamethod, setting metatable +-- will cause an infinite loop which fills up the stack with +-- `pcall`-frames and then keeps going beyond the end of the +-- stack until it segfaults.This comment is unrelated to this test.
Updated and now it looks as the following:
-- The testcase demonstrate a stack overflow when `pcall()`
-- is used as `__newindex` metamethod. The function is vararg
-- because stack check in BC_IFUNCV is off by one on ARM64
-- and MIPS64 without the patch.
Also, a stack overflow can be +-- triggered when `pcall()` is used as `__newindex` metamethod. +-- The functions are vararg because stack check in BC_IFUNCV is +-- off by one without the patch. + +local mt = setmetatable({}, { __newindex = pcall, __call = pairs }) + +local function prober_2(...) -- luacheck: no unused + mt[mt] = mt +end + +local function looper_2(n, ...) + prober_2(...) + prober_2(nil, ...) + return looper_2(n + 1, n, ...) +end + +pcall(coroutine.wrap(looper_2), 0)This can be simplified to the following: | src/luajit -e ' | -- Do not use a Lua function as metamethod -- since it will check | -- the stack on each invocation. Use simple `type()` built-in | -- instead. | local t = setmetatable({}, {__newindex = pcall, __call = type}) | local function prober(...) | -- Invokes `pcall(t, t, t)`. | t[t] = t | end | local function looper(n, ...) | prober(...) | return looper(n+1, n, ...) | end | pcall(coroutine.wrap(looper), 0) | '
Updated (added a comment about FF and removed prober() with nil):
@@ -37,15 +38,18 @@ test:ok(true, 'no stack overflow with
recursive pcall')
-- The functions are vararg because stack check in BC_IFUNCV is
-- off by one without the patch.
-local mt = setmetatable({}, { __newindex = pcall, __call = pairs
})
+-- The `type()` function is more convenient here, it works fast
+-- and can be used with any data type. However, any fast function
+-- can be used instead.
+local t = setmetatable({}, { __newindex = pcall, __call = type })
local function prober_2(...) -- luacheck: no unused
- mt[mt] = mt
+ -- Invokes `pcall(t, t, t)`.
+ t[t] = t
end
local function looper_2(n, ...)
prober_2(...)
- prober_2(nil, ...)
return looper_2(n + 1, n, ...)
end
+ +test:ok(true, 'no stack overflow with using metamethod') + +test:done(true) -- 2.43.0