[Tarantool-patches] [PATCH luajit] Fix frame for on-trace out-of-memory error.

Sergey Kaplun skaplun at tarantool.org
Wed Jul 26 15:12:40 MSK 2023


Hi, Maxim!
Thanks for the patch!
Please consider my comments below.

On 24.07.23, Maxim Kokryashkin wrote:
> Reported by ruidong007.
> 
> (cherry-picked from commit 2d8300c1944f3a62c10f0829e9b7847c5a6f0482)
> 
> When an on-trace OOM error is triggered from a frame that is
> child in regard to `jit_base`, and `L->base` is not updated
> correspondingly (FUNCC, for example), it is possible to
> encounter an inconsistent Lua stack in the error handler.
> 
> This patch adds a fixup for OOM errors on trace that always

Typo: s/on trace/on the trace/

> sets the Lua stack base to `jit_base`, so the stack is
> now consistent.
> 
> Part of tarantool/tarantool#8825
> ---

<snipped>

> +local testoomframe = require('testoomframe')
> +
> +local anchor = {}
> +local function extra_frame(val)
> +  table.insert(anchor, val)
> +end
> +
> +local function chomp()
> +  while true do
> +    extra_frame(testoomframe.allocate_userdata())
> +  end
> +end


Before the patch below, the test takes a really long time to run:
| time LUA_PATH="src/?.lua;test/tarantool-tests/?.lua;;" LUA_CPATH="test/tarantool-tests/lj-1004-oom-error-frame/?.so;;" src/luajit test/tarantool-tests/lj-1004-oom-error-frame.test.lua
| TAP version 13
| 1..1
| ok - on-trace error handled successfully
|
| real    0m12.984s
| user    0m12.207s
| sys     0m0.775s


I've added a simple chunk eater (and reduced the userdata size, just
in case). This speeds the test up about x2. But I believe that we can
make it even x20 faster with careful size calculations. Also, I suggest
trying some non-compilable fast function instead of a Lua C call to
force trace stitching (it looks like the error in userdata allocations
is the main blocker).

===================================================================
diff --git a/test/tarantool-tests/lj-1004-oom-error-frame.test.lua b/test/tarantool-tests/lj-1004-oom-error-frame.test.lua
index fd167d14..f8dde4e6 100644
--- a/test/tarantool-tests/lj-1004-oom-error-frame.test.lua
+++ b/test/tarantool-tests/lj-1004-oom-error-frame.test.lua
@@ -1,13 +1,25 @@
 local tap = require('tap')
+local ffi = require('ffi')
 local test  = tap.test('lj-1004-oom-error-frame'):skipcond({
   ['Test requires JIT enabled'] = not jit.status(),
-  ['Test requires GC64 mode disabled'] = require('ffi').abi('gc64'),
+  ['Test requires GC64 mode disabled'] = ffi.abi('gc64'),
 })
 
 test:plan(1)
 
 local testoomframe = require('testoomframe')
 
+local anchor_memory = {} -- luacheck: no unused
+local function eatchunks(size)
+  while true do
+    anchor_memory[ffi.new('char[?]', size)] = 1
+  end
+end
+
+if not ffi.abi('gc64') then
+  local r,e = pcall(eatchunks, 512 * 1024 * 1024)
+end
+
 local anchor = {}
 local function extra_frame(val)
   table.insert(anchor, val)
diff --git a/test/tarantool-tests/lj-1004-oom-error-frame/testoomframe.c b/test/tarantool-tests/lj-1004-oom-error-frame/testoomframe.c
index 13071b4e..a54eac63 100644
--- a/test/tarantool-tests/lj-1004-oom-error-frame/testoomframe.c
+++ b/test/tarantool-tests/lj-1004-oom-error-frame/testoomframe.c
@@ -2,7 +2,7 @@
 #include <lauxlib.h>
 
 static int allocate_userdata(lua_State *L) {
-        lua_newuserdata(L, 16);
+        lua_newuserdata(L, 1);
         return 1;
 }
 
===================================================================

| time LUA_PATH="src/?.lua;test/tarantool-tests/?.lua;;" LUA_CPATH="test/tarantool-tests/lj-1004-oom-error-frame/?.so;;" src/luajit test/tarantool-tests/lj-1004-oom-error-frame.test.lua
| TAP version 13
| 1..1
| ok - on-trace error handled successfully
|
| real    0m5.803s
| user    0m5.006s
| sys     0m0.795s

I suggest playing a bit with these sizes and with stitching to decrease
the waiting time. But be aware! The test should fail before the commit
both for the `make LuaJIT-test` command and for the one-line command
(like the one above).
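
One possible way to experiment with the sizes is sketched below: a greedy
eater that starts with big chunks and halves the chunk size on every
failed allocation. This is only an illustration, not part of the patch.
The names `eat_memory` and `make_budget_alloc` are hypothetical, and the
allocator is faked with a fixed budget so the sketch can run anywhere. In
the real test, `alloc` would presumably wrap `ffi.new('char[?]', size)`,
and `min_size` would have to leave enough headroom for the trace to be
compiled.

```lua
-- Greedy eater: try big chunks first, halve the size on failure.
-- `alloc` is any function that returns a chunk or raises an error.
-- `min_size` must be >= 1, otherwise the loop never terminates.
local function eat_memory(alloc, start_size, min_size)
  local anchor, size = {}, start_size
  while size >= min_size do
    local ok, chunk = pcall(alloc, size)
    if ok then
      anchor[#anchor + 1] = chunk   -- keep the chunk alive
    else
      size = math.floor(size / 2)   -- too big, retry with a smaller one
    end
  end
  return anchor
end

-- Hypothetical stand-in allocator with a fixed budget (in "bytes").
local function make_budget_alloc(budget)
  local left = budget
  local function alloc(size)
    if size > left then error('not enough memory', 0) end
    left = left - size
    return size
  end
  local function remaining() return left end
  return alloc, remaining
end

local alloc, remaining = make_budget_alloc(1000)
local chunks = eat_memory(alloc, 512, 8)
print(#chunks, remaining()) --> 6       0
```

With a budget of 1000 and chunks from 512 down to 8, the eater takes
512 + 256 + 128 + 64 + 32 + 8 and stops with nothing left, so the number
of allocations grows logarithmically rather than linearly.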

> +
> +local st, _ = pcall(chomp)
> +test:ok(st == false, 'on-trace error handled successfully')

Should we also check the error type to be sure that the test is still
valid? I.e., that we catch OOM and not TABOV, for example?
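
A minimal sketch of such a check, assuming that the OOM error reaches
pcall() as the string "not enough memory" and TABOV as "table overflow".
The `is_oom` helper and the fake raisers are hypothetical and only
imitate the real failures via error():

```lua
-- Hypothetical helper: true only if pcall() reported an OOM error.
local function is_oom(st, err)
  return st == false and type(err) == 'string'
    and err:find('not enough memory', 1, true) ~= nil
end

-- Stand-ins for the real failure modes (assumed message texts).
local function fake_oom() error('not enough memory', 0) end
local function fake_tabov() error('table overflow', 0) end

print(is_oom(pcall(fake_oom)))   --> true
print(is_oom(pcall(fake_tabov))) --> false
```

In the test itself this could become something like
`test:ok(is_oom(pcall(chomp)), ...)` instead of checking `st == false`
alone.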

<snipped>

> -- 
> 2.41.0
> 

-- 
Best regards,
Sergey Kaplun

