Hi, Sergey,

thanks for review! Please consider my comments below.

Sergey

On 9/1/25 16:36, Sergey Kaplun via Tarantool-patches wrote:
Hi, Sergey!
Thanks for the patch!
Please consider my comments below.

On 27.08.25, Sergey Bronnikov wrote:
Analyzed by Peter Cawley.

(cherry picked from commit a4c1640432a9d8a60624cdc8065b15078c228e36)

In the commit "LJ_FR2: Fix stack checks in vararg calls."
Minor: In the previous commit ("...")
Fixed.

stack overflow in `pcall()`/`xpcall()` was fixed partially and
This handles stack overflow for vararg functions and metamethod
invocations not `xpcall()/pcall()` (directly).

Updated:

> In the previous commit ("LJ_FR2: Fix stack checks in vararg calls.")             
>  stack overflow for vararg functions and metamethod invocations                   
>  was fixed partially and there are still cases where stack overflow               
>  happens, see comments in the test. The patch fixes the issue by                  
>  adding the stack check to fast functions `pcall()` and `xpcall()`.               
 


      
there are still cases where stack overflow happens, see comments
in the test. The patch add stack check to `pcall()` and `xpcall()`.
Please, mention in the commit message that the issue was fixed by adding
the stack check to these fast functions.
Added.

      
Sergey Bronnikov:
* added the description and the test for the problem

Part of tarantool/tarantool#11691
---
 src/vm_arm.dasc                               |  7 +++++
 src/vm_arm64.dasc                             |  8 +++++
 src/vm_mips.dasc                              | 10 +++++-
 src/vm_mips64.dasc                            | 12 +++++--
 src/vm_ppc.dasc                               |  9 ++++++
 src/vm_x64.dasc                               |  6 ++++
 src/vm_x86.dasc                               |  6 ++++
 ...048-fix-stack-checks-vararg-calls.test.lua | 31 ++++++++++++++++++-
 8 files changed, 85 insertions(+), 4 deletions(-)

diff --git a/src/vm_arm.dasc b/src/vm_arm.dasc
index 7095e660..efe9dcb2 100644
--- a/src/vm_arm.dasc
+++ b/src/vm_arm.dasc
@@ -1201,8 +1201,11 @@ static void build_subroutines(BuildCtx *ctx)
   |//-- Base library: catch errors ----------------------------------------
   |
   |.ffunc pcall
+  |   ldr RB, L->maxstack
+  |   add INS, BASE, NARGS8:RC
   |  ldrb RA, [DISPATCH, #DISPATCH_GL(hookmask)]
   |   cmp NARGS8:RC, #8
+  |   cmphs RB, INS
   |   blo ->fff_fallback
   |  tst RA, #HOOK_ACTIVE		// Remember active hook before pcall.
   |   mov RB, BASE
@@ -1213,7 +1216,11 @@ static void build_subroutines(BuildCtx *ctx)
   |  b ->vm_call_dispatch
   |
   |.ffunc_2 xpcall
+  |   ldr RB, L->maxstack
+  |   add INS, BASE, NARGS8:RC
   |  ldrb RA, [DISPATCH, #DISPATCH_GL(hookmask)]
+  |   cmp RB, INS
+  |   blo ->fff_fallback
   |  checkfunc CARG4, ->fff_fallback	// Traceback must be a function.
   |   mov RB, BASE
   |  strd CARG12, [BASE, #8]		// Swap function and traceback.
diff --git a/src/vm_arm64.dasc b/src/vm_arm64.dasc
index cf8e575a..53ff7162 100644
--- a/src/vm_arm64.dasc
+++ b/src/vm_arm64.dasc
@@ -1166,6 +1166,10 @@ static void build_subroutines(BuildCtx *ctx)
   |//-- Base library: catch errors ----------------------------------------
   |
   |.ffunc pcall
+  |  ldr TMP1, L->maxstack
+  |  add TMP2, BASE, NARGS8:RC
+  |  cmp TMP1, TMP2
+  |  blo ->fff_fallback
   |   cmp NARGS8:RC, #8
   |  ldrb TMP0w, GL->hookmask
   |   blo ->fff_fallback
@@ -1185,6 +1189,10 @@ static void build_subroutines(BuildCtx *ctx)
   |  b ->vm_call_dispatch
   |
   |.ffunc xpcall
+  |  ldr TMP1, L->maxstack
+  |  add TMP2, BASE, NARGS8:RC
+  |  cmp TMP1, TMP2
+  |  blo ->fff_fallback
   |     ldp CARG1, CARG2, [BASE]
   |  ldrb TMP0w, GL->hookmask
   |   subs NARGS8:TMP1, NARGS8:RC, #16
diff --git a/src/vm_mips.dasc b/src/vm_mips.dasc
index 32caabf7..69d09d52 100644
--- a/src/vm_mips.dasc
+++ b/src/vm_mips.dasc
@@ -1382,9 +1382,13 @@ static void build_subroutines(BuildCtx *ctx)
   |//-- Base library: catch errors ----------------------------------------
   |
   |.ffunc pcall
+  |   lw TMP1, L->maxstack
+  |   addu TMP2, BASE, NARGS8:RC
   |  lbu TMP3, DISPATCH_GL(hookmask)(DISPATCH)
   |  beqz NARGS8:RC, ->fff_fallback
-  |   move TMP2, BASE
+  |.  sltu AT, TMP1, TMP2
+  |   bnez AT, ->fff_fallback
+  |.   move TMP2, BASE
   |   addiu BASE, BASE, 8
   |  // Remember active hook before pcall.
   |  srl TMP3, TMP3, HOOK_ACTIVE_SHIFT
@@ -1394,8 +1398,12 @@ static void build_subroutines(BuildCtx *ctx)
   |.  addiu NARGS8:RC, NARGS8:RC, -8
   |
   |.ffunc xpcall
+  |   lw TMP1, L->maxstack
+  |   addu TMP2, BASE, NARGS8:RC
   |    sltiu AT, NARGS8:RC, 16
   |  lw CARG4, 8+HI(BASE)
+  |   sltu TMP1, TMP1, TMP2
+  |    or AT, AT, TMP1
   |    bnez AT, ->fff_fallback
   |.  lw CARG3, 8+LO(BASE)
   |   lw CARG1, LO(BASE)
diff --git a/src/vm_mips64.dasc b/src/vm_mips64.dasc
index 7f49df5b..06b143a2 100644
--- a/src/vm_mips64.dasc
+++ b/src/vm_mips64.dasc
@@ -1418,8 +1418,12 @@ static void build_subroutines(BuildCtx *ctx)
   |//-- Base library: catch errors ----------------------------------------
   |
   |.ffunc pcall
+  |  ld TMP1, L->maxstack
+  |  daddu TMP2, BASE, NARGS8:RC
+  |  sltu AT, TMP1, TMP2
+  |  bnez AT, ->fff_fallback
+  |.  lbu TMP3, DISPATCH_GL(hookmask)(DISPATCH)
   |  daddiu NARGS8:RC, NARGS8:RC, -8
-  |  lbu TMP3, DISPATCH_GL(hookmask)(DISPATCH)
   |  bltz NARGS8:RC, ->fff_fallback
   |.   move TMP2, BASE
   |   daddiu BASE, BASE, 16
I see that the original patch in the upstream has another diff. Please
backport the commit ea7071d3 ("MIPS64: Fix xpcall() error case") first
(as the first commit in the patch series) to avoid conflicts in the
future.\
Done.

@@ -1440,8 +1444,12 @@ static void build_subroutines(BuildCtx *ctx)
   |.  nop
   |
   |.ffunc xpcall
+  |  ld TMP1, L->maxstack
+  |  daddu TMP2, BASE, NARGS8:RC
+  |  sltu AT, TMP1, TMP2
+  |  bnez AT, ->fff_fallback
+  |.  ld CARG1, 0(BASE)
   |  daddiu NARGS8:RC, NARGS8:RC, -16
-  |  ld CARG1, 0(BASE)
   |   ld CARG2, 8(BASE)
   |    bltz NARGS8:RC, ->fff_fallback
   |.    lbu TMP1, DISPATCH_GL(hookmask)(DISPATCH)
diff --git a/src/vm_ppc.dasc b/src/vm_ppc.dasc
index 980ad897..f2ea933b 100644
--- a/src/vm_ppc.dasc
+++ b/src/vm_ppc.dasc
@@ -1755,8 +1755,12 @@ static void build_subroutines(BuildCtx *ctx)
   |//-- Base library: catch errors ----------------------------------------
   |
   |.ffunc pcall
+  |    lwz TMP1, L->maxstack
+  |    add TMP2, BASE, NARGS8:RC
   |  cmplwi NARGS8:RC, 8
   |   lbz TMP3, DISPATCH_GL(hookmask)(DISPATCH)
+  |    cmplw cr1, TMP1, TMP2
+  |  cror 4*cr0+lt, 4*cr0+lt, 4*cr1+lt
   |  blt ->fff_fallback
   |   mr TMP2, BASE
   |   la BASE, 8(BASE)
@@ -1767,14 +1771,19 @@ static void build_subroutines(BuildCtx *ctx)
   |  b ->vm_call_dispatch
   |
   |.ffunc xpcall
+  |     lwz TMP1, L->maxstack
+  |     add TMP2, BASE, NARGS8:RC
   |  cmplwi NARGS8:RC, 16
   |   lwz CARG3, 8(BASE)
+  |     cmplw cr1, TMP1, TMP2
   |.if FPU
   |    lfd FARG2, 8(BASE)
+  |  cror 4*cr0+lt, 4*cr0+lt, 4*cr1+lt
   |    lfd FARG1, 0(BASE)
   |.else
   |    lwz CARG1, 0(BASE)
   |    lwz CARG2, 4(BASE)
+  |  cror 4*cr0+lt, 4*cr0+lt, 4*cr1+lt
   |    lwz CARG4, 12(BASE)
   |.endif
   |  blt ->fff_fallback
diff --git a/src/vm_x64.dasc b/src/vm_x64.dasc
index d5296759..141f5f82 100644
--- a/src/vm_x64.dasc
+++ b/src/vm_x64.dasc
@@ -1545,6 +1545,9 @@ static void build_subroutines(BuildCtx *ctx)
   |//-- Base library: catch errors ----------------------------------------
   |
   |.ffunc_1 pcall
+  |  mov L:RB, SAVE_L
+  |  lea RA, [BASE+NARGS:RD*8]
+  |  cmp RA, L:RB->maxstack; ja ->fff_fallback
   |  lea RA, [BASE+16]
   |  sub NARGS:RDd, 1
   |  mov PCd, 16+FRAME_PCALL
@@ -1563,6 +1566,9 @@ static void build_subroutines(BuildCtx *ctx)
   |  jmp ->vm_call_dispatch
   |
   |.ffunc_2 xpcall
+  |  mov L:RB, SAVE_L
+  |  lea RA, [BASE+NARGS:RD*8]
+  |  cmp RA, L:RB->maxstack; ja ->fff_fallback
   |  mov LFUNC:RA, [BASE+8]
   |  checktp_nc LFUNC:RA, LJ_TFUNC, ->fff_fallback
   |  mov LFUNC:RB, [BASE]		// Swap function and traceback.
diff --git a/src/vm_x86.dasc b/src/vm_x86.dasc
index b043b830..1ba5abce 100644
--- a/src/vm_x86.dasc
+++ b/src/vm_x86.dasc
@@ -1914,6 +1914,9 @@ static void build_subroutines(BuildCtx *ctx)
   |//-- Base library: catch errors ----------------------------------------
   |
   |.ffunc_1 pcall
+  |  mov L:RB, SAVE_L
+  |  lea RA, [BASE+NARGS:RD*8]
+  |  cmp RA, L:RB->maxstack; ja ->fff_fallback
   |  lea RA, [BASE+8]
   |  sub NARGS:RD, 1
   |  mov PC, 8+FRAME_PCALL
@@ -1925,6 +1928,9 @@ static void build_subroutines(BuildCtx *ctx)
   |  jmp ->vm_call_dispatch
   |
   |.ffunc_2 xpcall
+  |  mov L:RB, SAVE_L
+  |  lea RA, [BASE+NARGS:RD*8]
+  |  cmp RA, L:RB->maxstack; ja ->fff_fallback
   |  cmp dword [BASE+12], LJ_TFUNC;  jne ->fff_fallback
   |  mov RB, [BASE+4]			// Swap function and traceback.
   |  mov [BASE+12], RB
diff --git a/test/tarantool-tests/lj-1048-fix-stack-checks-vararg-calls.test.lua b/test/tarantool-tests/lj-1048-fix-stack-checks-vararg-calls.test.lua
index e300d5c1..367aecb6 100644
--- a/test/tarantool-tests/lj-1048-fix-stack-checks-vararg-calls.test.lua
+++ b/test/tarantool-tests/lj-1048-fix-stack-checks-vararg-calls.test.lua
@@ -7,7 +7,7 @@ local test = tap.test('lj-1048-fix-stack-checks-vararg-calls'):skipcond({
   ['Test requires JIT enabled'] = not jit.status(),
 })
 
-test:plan(2)
+test:plan(4)
 
 -- The first testcase demonstrate a stack overflow in `pcall()`
 -- by recursive calling `pcall()`. The functions are vararg
@@ -53,4 +53,33 @@ pcall(coroutine.wrap(looper_2), 0)
 
 test:ok(true, 'no stack overflow with using metamethod')
 
Why do you drop the original test case?

Expected behaviour:
| src/luajit -e 'local t = {setmetatable({},{__call=pcall})()} print(t[#t])'
| stack overflow
Actual behaviour -- dirty read detected by ASAN.

this testcase is failed too:

TAP version 13                                                                   
1..4                                                                                     
ok - no stack overflow with recursive pcall                                 
ok - no stack overflow with metamethod                                
=================================================================                      
==3374==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x007f77c03fc8 at pc 0x005574fc4fb0 bp 0x007ff2e0f3f0
 sp 0x007ff2e0f410                                                      
WRITE of size 8 at 0x007f77c03fc8 thread T0                         
    #0 0x5574fc4faf in copyTV /root/sergeyb/luajit/src/lj_obj.h:1000       
    #1 0x5574fc4faf in lj_meta_call /root/sergeyb/luajit/src/lj_meta.c:444
    #2 0x557510c293 in lj_vmeta_call (/root/sergeyb/luajit/build/src/luajit+0x1d5293)
    #3 0x5574f92a9b in lua_pcall /root/sergeyb/luajit/src/lj_api.c:1173  
    #4 0x5574f74217 in docall /root/sergeyb/luajit/src/luajit.c:134
    #5 0x5574f75067 in handle_script /root/sergeyb/luajit/src/luajit.c:304          
    #6 0x5574f76d07 in pmain /root/sergeyb/luajit/src/luajit.c:602   
    #7 0x557510bb33 in lj_BC_FUNCC (/root/sergeyb/luajit/build/src/luajit+0x1d4b33)
    #8 0x5574f937fb in lua_cpcall /root/sergeyb/luajit/src/lj_api.c:1208                        
    #9 0x5574f77083 in main /root/sergeyb/luajit/src/luajit.c:633
    #10 0x7f7dbfe79f in __libc_start_main (/lib/aarch64-linux-gnu/libc.so.6+0x2079f)
    #11 0x5574f73aa3  (/root/sergeyb/luajit/build/src/luajit+0x3caa3)


+-- The third testcase demonstrate a stack overflow in
+-- `pcall()`/xpcall()` similar to the first testcase, but it is
+-- triggered using hand-crafted Lua chunk with a lot `pcall()`
+-- builtins.
+
+for i = 1, 100 do
+  local code = 'return pcall(' .. string.rep('pcall, ', i) .. 'pairs, {})'
Why do we need this test if it has the same semantics as the second
one?
the test was removed

+  local f = load(code)
+  coroutine.wrap(f)()
+end
+
+test:ok(true, 'no stack overflow with pcalls in load()')
+
+-- The fourth testcase demonstrate a stack overflow in
+-- `pcall()`/`xpcall()` similar to the first testcase, but it is
+-- triggered using `unpack()`.
+
+local t = {}
+local function f()
+  return pcall(unpack(t))
+end
+
+for i = 1, 100 do
This limit isn't enough for GC64 or non-GC64 mode.
| src/luajit -e '
| local t = {}
| local function f() return pcall(unpack(t)) end
| for i = 1, 100 do
|   t[i], t[i+1], t[i+2] = pcall, pairs, {}
|   coroutine.wrap(f)()
| end
| '

For the GC64 build it is necessary to set the limit as 180, (179 -- not
SegFault).

Please provide two different limits depending on the GC64 mode
configuration. Please, describe why the __exact__ limit is chosen for
the particular configuration.
Added.

+  t[i], t[i + 1], t[i + 2] = pcall, pairs, {}
+  coroutine.wrap(f)()
+end
+
+test:ok(true, 'no stack overflow with unpacked pcalls')
+
 test:done(true)
-- 
2.43.0