From: Sergey Bronnikov via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Sergey Kaplun <skaplun@tarantool.org>,
Maxim Kokryashkin <m.kokryashkin@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH luajit v1 4/5] Handle all types of errors during trace stitching.
Date: Thu, 6 Jun 2024 16:03:11 +0300 [thread overview]
Message-ID: <e4102bf6-1e1f-41ce-811b-e9b7c19b493b@tarantool.org> (raw)
In-Reply-To: <362ec5c556a9e9832726ec89978ef25ba865af41.1713773432.git.skaplun@tarantool.org>
[-- Attachment #1: Type: text/plain, Size: 14353 bytes --]
Hi, Sergey,
thanks for the patch! Please see my comments.
On 22.04.2024 11:49, Sergey Kaplun wrote:
> From: Mike Pall <mike>
>
> Thanks to Sergey Kaplun and Peter Cawley.
>
> (cherry picked from commit d06beb0480c5d1eb53b3343e78063950275aa281)
>
> This commit is a follow-up for the commit
> 1b8216023d5a79814389f1c1affef27c15d9de27 ("Throw any errors before stack
> changes in trace stitching."). The patch prepends failures for the
> specific error to be thrown. Nevertheless, the error may be thrown due
> to retrying trace recording in the case when table bump optimization
> is enabled or when OOM is observed during reallocation of the snapshot
> or IR buffers.
>
> This patch adds the corresponding protected frame and rethrows the error
> after a fixup of the stack.
>
> This patch also tests the correctness of copying the error message to
> the top of the stack to get a valid "abort" reason in the `jit.dump`
> utility.
>
> Also, this patch fixes a non-ASCII space character in the comment for
> <lj-720-errors-before-stitch.test.lua>.
>
> Sergey Kaplun:
> * added the description and the test for the problem
>
> Part of tarantool/tarantool#9924
> ---
> src/lj_ffrecord.c | 21 ++++++--
> test/tarantool-tests/CMakeLists.txt | 1 +
> .../lj-1166-error-stitch-oom-ir-buff.test.lua | 46 ++++++++++++++++
> ...j-1166-error-stitch-oom-snap-buff.test.lua | 54 +++++++++++++++++++
> .../lj-1166-error-stitch-table-bump.test.lua | 38 +++++++++++++
> .../lj-1166-error-stitch/CMakeLists.txt | 1 +
> .../lj-1166-error-stitch/mockalloc.c | 51 ++++++++++++++++++
> .../lj-720-errors-before-stitch.test.lua | 40 +++++++++++++-
> 8 files changed, 245 insertions(+), 7 deletions(-)
> create mode 100644 test/tarantool-tests/lj-1166-error-stitch-oom-ir-buff.test.lua
> create mode 100644 test/tarantool-tests/lj-1166-error-stitch-oom-snap-buff.test.lua
> create mode 100644 test/tarantool-tests/lj-1166-error-stitch-table-bump.test.lua
> create mode 100644 test/tarantool-tests/lj-1166-error-stitch/CMakeLists.txt
> create mode 100644 test/tarantool-tests/lj-1166-error-stitch/mockalloc.c
>
> diff --git a/src/lj_ffrecord.c b/src/lj_ffrecord.c
> index e3ed80fb..ff14e9e4 100644
> --- a/src/lj_ffrecord.c
> +++ b/src/lj_ffrecord.c
> @@ -96,6 +96,14 @@ static ptrdiff_t results_wanted(jit_State *J)
> return -1;
> }
>
> +static TValue *rec_stop_stitch_cp(lua_State *L, lua_CFunction dummy, void *ud)
> +{
> + jit_State *J = (jit_State *)ud;
> + lj_record_stop(J, LJ_TRLINK_STITCH, 0);
> + UNUSED(L); UNUSED(dummy);
> + return NULL;
> +}
> +
> /* Trace stitching: add continuation below frame to start a new trace. */
> static void recff_stitch(jit_State *J)
> {
> @@ -106,10 +114,7 @@ static void recff_stitch(jit_State *J)
> TValue *nframe = base + 1 + LJ_FR2;
> const BCIns *pc = frame_pc(base-1);
> TValue *pframe = frame_prevl(base-1);
> -
> - /* Check for this now. Throwing in lj_record_stop messes up the stack. */
> - if (J->cur.nsnap >= (MSize)J->param[JIT_P_maxsnap])
> - lj_trace_err(J, LJ_TRERR_SNAPOV);
> + int errcode;
>
> /* Move func + args up in Lua stack and insert continuation. */
> memmove(&base[1], &base[-1-LJ_FR2], sizeof(TValue)*nslot);
> @@ -134,13 +139,19 @@ static void recff_stitch(jit_State *J)
> J->baseslot += 2 + LJ_FR2;
> J->framedepth++;
>
> - lj_record_stop(J, LJ_TRLINK_STITCH, 0);
> + errcode = lj_vm_cpcall(L, NULL, J, rec_stop_stitch_cp);
>
> /* Undo Lua stack changes. */
> memmove(&base[-1-LJ_FR2], &base[1], sizeof(TValue)*nslot);
> setframe_pc(base-1, pc);
> L->base -= 2 + LJ_FR2;
> L->top -= 2 + LJ_FR2;
> +
> + if (errcode) {
> + if (errcode == LUA_ERRRUN)
> + copyTV(L, L->top-1, L->top + (1 + LJ_FR2));
> + lj_err_throw(L, errcode); /* Propagate errors. */
> + }
> }
>
> /* Fallback handler for fast functions that are not recorded (yet). */
> diff --git a/test/tarantool-tests/CMakeLists.txt b/test/tarantool-tests/CMakeLists.txt
> index 56660932..d7c96078 100644
> --- a/test/tarantool-tests/CMakeLists.txt
> +++ b/test/tarantool-tests/CMakeLists.txt
> @@ -39,6 +39,7 @@ add_subdirectory(lj-802-panic-at-mcode-protfail)
> add_subdirectory(lj-flush-on-trace)
> add_subdirectory(lj-1004-oom-error-frame)
> add_subdirectory(lj-1066-fix-cur_L-after-coroutine-resume)
> +add_subdirectory(lj-1166-error-stitch)
>
> # The part of the memory profiler toolchain is located in tools
> # directory, jit, profiler, and bytecode toolchains are located
> diff --git a/test/tarantool-tests/lj-1166-error-stitch-oom-ir-buff.test.lua b/test/tarantool-tests/lj-1166-error-stitch-oom-ir-buff.test.lua
> new file mode 100644
> index 00000000..e3a5397d
> --- /dev/null
> +++ b/test/tarantool-tests/lj-1166-error-stitch-oom-ir-buff.test.lua
> @@ -0,0 +1,46 @@
> +local tap = require('tap')
> +
> +-- Test file to demonstrate unbalanced Lua stack after instruction
> +-- recording due to throwing an error at recording of a stitched
> +-- function.
> +-- See also:https://github.com/LuaJIT/LuaJIT/issues/1166.
> +
> +local test = tap.test('lj-1166-error-stitch-oom-snap-buff'):skipcond({
should a name in tap.test match to test file name?
now it is not.
> + ['Test requires JIT enabled'] = not jit.status(),
> + ['Disabled on *BSD due to #4819'] = jit.os == 'BSD',
> +})
> +
> +test:plan(1)
> +
> +local mockalloc = require('mockalloc')
> +
> +local function create_chunk(n_slots)
I would add a comment like this:
--- a/test/tarantool-tests/lj-1166-error-stitch-oom-ir-buff.test.lua
+++ b/test/tarantool-tests/lj-1166-error-stitch-oom-ir-buff.test.lua
@@ -14,6 +14,18 @@ test:plan(1)
local mockalloc = require('mockalloc')
+-- Generate a Lua chunk like below:
+-- local s1
+-- local s2
+-- ...
+-- local sN
+-- for i = 1, 2 do
+-- s1 = i + 1
+-- s2 = i + 2
+-- ...
+-- sN = i + N
+-- math.modf(1)
+-- end
local function create_chunk(n_slots)
local chunk = ''
for i = 1, n_slots do
> + local chunk = ''
> + for i = 1, n_slots do
> + chunk = chunk .. ('local s%d\n'):format(i)
> + end
> + chunk = chunk .. 'for i = 1, 2 do\n'
> + -- Generate additional IR instructions.
> + for i = 1, n_slots do
> + chunk = chunk .. (' s%d = i + %d\n'):format(i, i)
> + end
> + -- `math.modf()` recording is NYI.
> + chunk = chunk .. ' math.modf(1)\n'
> + chunk = chunk .. 'end\n'
> + return chunk
> +end
> +
> +-- XXX: amount of slots is empirical.
> +local tracef = assert(loadstring(create_chunk(175)))
> +
> +jit.opt.start('hotloop=1', '-loop', '-fold')
> +
> +mockalloc.mock()
> +
> +tracef()
> +
> +mockalloc.unmock()
> +
> +test:ok(true, 'stack is balanced')
> +
> +test:done(true)
> diff --git a/test/tarantool-tests/lj-1166-error-stitch-oom-snap-buff.test.lua b/test/tarantool-tests/lj-1166-error-stitch-oom-snap-buff.test.lua
> new file mode 100644
> index 00000000..8d671f8d
> --- /dev/null
> +++ b/test/tarantool-tests/lj-1166-error-stitch-oom-snap-buff.test.lua
> @@ -0,0 +1,54 @@
> +local tap = require('tap')
> +
> +-- Test file to demonstrate unbalanced Lua stack after instruction
> +-- recording due to throwing an error at recording of a stitched
> +-- function.
> +-- See also:https://github.com/LuaJIT/LuaJIT/issues/1166.
> +
> +local test = tap.test('lj-1166-error-stitch-oom-snap-buff'):skipcond({
> + ['Test requires JIT enabled'] = not jit.status(),
> + ['Disabled on *BSD due to #4819'] = jit.os == 'BSD',
> +})
> +
> +test:plan(1)
> +
> +local mockalloc = require('mockalloc')
> +
> +local function create_chunk(n_conds)
the same as above: please add a comment with an example of generated Lua
chunk
> + local chunk = ''
> + chunk = chunk .. 'for i = 1, 2 do\n'
> + -- Each condition adds additional snapshot.
> + for i = 1, n_conds do
> + chunk = chunk .. (' if i < %d then end\n'):format(i + n_conds)
> + end
> + -- `math.modf()` recording is NYI.
> + chunk = chunk .. ' math.modf(1)\n'
> + chunk = chunk .. 'end\n'
> + return chunk
> +end
> +
> +-- XXX: Need to compile the cycle in the `create_chunk()` to
> +-- preallocate the snapshot buffer.
> +jit.opt.start('hotloop=1', '-loop', '-fold')
> +
> +-- XXX: Amount of slots is empirical.
> +local tracef = assert(loadstring(create_chunk(6)))
> +
> +-- XXX: Remove previous trace.
> +jit.off()
> +jit.flush()
> +
> +-- XXX: Update hotcounts to avoid hash collisions.
> +jit.opt.start('hotloop=1')
> +
> +jit.on()
> +
> +mockalloc.mock()
> +
> +tracef()
> +
> +mockalloc.unmock()
> +
> +test:ok(true, 'stack is balanced')
> +
> +test:done(true)
> diff --git a/test/tarantool-tests/lj-1166-error-stitch-table-bump.test.lua b/test/tarantool-tests/lj-1166-error-stitch-table-bump.test.lua
> new file mode 100644
> index 00000000..f2453bbe
> --- /dev/null
> +++ b/test/tarantool-tests/lj-1166-error-stitch-table-bump.test.lua
this test is not failed after reverting patch
> @@ -0,0 +1,38 @@
> +local tap = require('tap')
> +
> +-- Test file to demonstrate unbalanced Lua stack after instruction
> +-- recording due to throwing an error at recording of a stitched
> +-- function. The test fails with LUAJIT_ENABLE_TABLE_BUMP enabled.
> +-- See also:
> +-- *https://github.com/LuaJIT/LuaJIT/issues/606,
> +-- *https://github.com/LuaJIT/LuaJIT/issues/1166.
> +
> +local test = tap.test('lj-1166-error-stitch-table-bump'):skipcond({
> + ['Test requires JIT enabled'] = not jit.status(),
> +})
> +
> +test:plan(1)
> +
> +-- `math.modf` recording is NYI.
> +-- Local `modf` simplifies `jit.dump()` output.
> +local modf = math.modf
> +
> +jit.opt.start('hotloop=1')
> +
> +-- luacheck: no unused
> +local t
> +-- There is no need to run the trace itself. Just check the
> +-- correctness of a recording.
> +for i = 1, 2 do
> + t = {}
> + -- Cause table rehashing to trigger table bump optimization.
> + t[i] = i
> + -- Forcify stitch. This will throw an error at the end of
> + -- recording, since trace recording should be retried after
> + -- bytecode updating.
> + modf(1)
> +end
> +
> +test:ok(true, 'stack is balanced')
> +
> +test:done(true)
> diff --git a/test/tarantool-tests/lj-1166-error-stitch/CMakeLists.txt b/test/tarantool-tests/lj-1166-error-stitch/CMakeLists.txt
> new file mode 100644
> index 00000000..1ebf253b
> --- /dev/null
> +++ b/test/tarantool-tests/lj-1166-error-stitch/CMakeLists.txt
> @@ -0,0 +1 @@
> +BuildTestCLib(mockalloc mockalloc.c)
> diff --git a/test/tarantool-tests/lj-1166-error-stitch/mockalloc.c b/test/tarantool-tests/lj-1166-error-stitch/mockalloc.c
> new file mode 100644
> index 00000000..d6d3492e
> --- /dev/null
> +++ b/test/tarantool-tests/lj-1166-error-stitch/mockalloc.c
> @@ -0,0 +1,51 @@
> +#include "lua.h"
> +#include "lauxlib.h"
> +
> +#undef NDEBUG
> +#include <assert.h>
> +
> +static lua_Alloc old_allocf = NULL;
> +static void *old_alloc_state = NULL;
> +
> +/* Function to be used instead of the default allocator. */
> +static void *mock_allocf(void *ud, void *ptr, size_t osize, size_t nsize)
> +{
> + assert(old_allocf != NULL);
> + /*
> + * Check the specific reallocation related to the IR
> + * buffer or the snapshot buffer.
> + */
> + if (osize * 2 == nsize)
> + return NULL;
> + return old_allocf(ud, ptr, osize, nsize);
> +}
> +
> +static int mock(lua_State *L)
It is actually not a test mock.
According to definition [1] test mock imitate a behavior of a real object.
Your memory allocator behaves as a real allocator, but in some cases it
will return
a NULL instead of memory address. What if we rename "mock" to "allocator
with fault injection"?
1. https://www.martinfowler.com/articles/mocksArentStubs.html
> +{
> + assert(old_allocf == NULL);
> + old_allocf = lua_getallocf(L, &old_alloc_state);
> + lua_setallocf(L, mock_allocf, old_alloc_state);
> + return 0;
> +}
> +
> +static int unmock(lua_State *L)
> +{
> + assert(old_allocf != NULL);
> + assert(old_allocf != mock_allocf);
> + lua_setallocf(L, old_allocf, old_alloc_state);
> + old_allocf = NULL;
> + old_alloc_state = NULL;
> + return 0;
> +}
> +
> +static const struct luaL_Reg mockalloc[] = {
> + {"mock", mock},
> + {"unmock", unmock},
> + {NULL, NULL}
> +};
> +
> +LUA_API int luaopen_mockalloc(lua_State *L)
> +{
> + luaL_register(L, "mockalloc", mockalloc);
> + return 1;
> +}
> diff --git a/test/tarantool-tests/lj-720-errors-before-stitch.test.lua b/test/tarantool-tests/lj-720-errors-before-stitch.test.lua
> index d750b721..6e8f70c2 100644
> --- a/test/tarantool-tests/lj-720-errors-before-stitch.test.lua
> +++ b/test/tarantool-tests/lj-720-errors-before-stitch.test.lua
> @@ -1,13 +1,27 @@
> local tap = require('tap')
> local test = tap.test('lj-720-errors-before-stitch'):skipcond({
> ['Test requires JIT enabled'] = not jit.status(),
> + ['Disabled on *BSD due to #4819'] = jit.os == 'BSD',
> })
> -test:plan(1)
>
> --- `math.modf` recording is NYI.
> +local jparse = require('utils').jit.parse
> +
> +-- `math.modf` recording is NYI.
> -- Local `modf` simplifies `jit.dump()` output.
> local modf = math.modf
> +
> +-- XXX: Avoid other traces compilation due to hotcount collisions
> +-- for predictable results.
> +jit.off()
> +jit.flush()
> +
> +test:plan(2)
> +
> +-- We only need the abort reason in the test.
> +jparse.start('t')
> +
> jit.opt.start('hotloop=1', 'maxsnap=1')
> +jit.on()
>
> -- The loop has only two iterations: the first to detect its
> -- hotness and the second to record it. The snapshot limit is
> @@ -17,5 +31,27 @@ for _ = 1, 2 do
> modf(1.2)
> end
>
> +local _, aborted_traces = jparse.finish()
> +
> +jit.off()
> +
> test:ok(true, 'stack is balanced')
> +
> +-- Tarantool may compile traces on the startup. These traces
> +-- already exceed the maximum snapshot amount we set after they
> +-- are compiled. Hence, there is no need to reallocate the
> +-- snapshot buffer, so the check for the snap size is not
> +-- triggered.
> +test:skipcond({
> + -- luacheck: no global
> + ['Impossible to predict the number of snapshots for Tarantool'] = _TARANTOOL,
> +})
> +
> +assert(aborted_traces and aborted_traces[1], 'aborted trace is persisted')
> +
> +-- We tried to compile only one trace.
> +local reason = aborted_traces[1][1].abort_reason
> +
> +test:like(reason, 'too many snapshots', 'abort reason is correct')
> +
> test:done(true)
[-- Attachment #2: Type: text/html, Size: 15958 bytes --]
next prev parent reply other threads:[~2024-06-06 13:03 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-22 8:49 [Tarantool-patches] [PATCH luajit v1 0/5] Handle all errors during stitching Sergey Kaplun via Tarantool-patches
2024-04-22 8:49 ` [Tarantool-patches] [PATCH luajit v1 1/5] build: introduce option LUAJIT_ENABLE_TABLE_BUMP Sergey Kaplun via Tarantool-patches
2024-05-05 12:39 ` Maxim Kokryashkin via Tarantool-patches
2024-05-13 11:47 ` Sergey Kaplun via Tarantool-patches
2024-06-06 10:53 ` Sergey Bronnikov via Tarantool-patches
2024-06-13 9:51 ` Sergey Kaplun via Tarantool-patches
2024-06-13 14:37 ` Sergey Bronnikov via Tarantool-patches
2024-04-22 8:49 ` [Tarantool-patches] [PATCH luajit v1 2/5] ci: add tablebump flavor for exotic builds Sergey Kaplun via Tarantool-patches
2024-05-05 12:40 ` Maxim Kokryashkin via Tarantool-patches
2024-05-13 11:49 ` Sergey Kaplun via Tarantool-patches
2024-06-06 10:56 ` Sergey Bronnikov via Tarantool-patches
2024-04-22 8:49 ` [Tarantool-patches] [PATCH luajit v1 3/5] test: allow `jit.parse` to return aborted traces Sergey Kaplun via Tarantool-patches
2024-05-05 12:55 ` Maxim Kokryashkin via Tarantool-patches
2024-05-13 11:53 ` Sergey Kaplun via Tarantool-patches
2024-06-06 11:01 ` Sergey Bronnikov via Tarantool-patches
2024-06-13 10:00 ` Sergey Kaplun via Tarantool-patches
2024-06-13 14:38 ` Sergey Bronnikov via Tarantool-patches
2024-04-22 8:49 ` [Tarantool-patches] [PATCH luajit v1 4/5] Handle all types of errors during trace stitching Sergey Kaplun via Tarantool-patches
2024-05-06 8:25 ` Maxim Kokryashkin via Tarantool-patches
2024-06-06 13:03 ` Sergey Bronnikov via Tarantool-patches [this message]
2024-06-13 10:25 ` Sergey Kaplun via Tarantool-patches
2024-06-13 14:52 ` Sergey Bronnikov via Tarantool-patches
2024-04-22 8:49 ` [Tarantool-patches] [PATCH luajit v1 5/5] Use generic trace error for OOM " Sergey Kaplun via Tarantool-patches
2024-05-06 8:27 ` Maxim Kokryashkin via Tarantool-patches
2024-06-06 14:06 ` Sergey Bronnikov via Tarantool-patches
2024-06-13 10:31 ` Sergey Kaplun via Tarantool-patches
2024-06-13 14:58 ` Sergey Bronnikov via Tarantool-patches
2024-06-16 19:13 ` Sergey Kaplun via Tarantool-patches
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e4102bf6-1e1f-41ce-811b-e9b7c19b493b@tarantool.org \
--to=tarantool-patches@dev.tarantool.org \
--cc=m.kokryashkin@tarantool.org \
--cc=sergeyb@tarantool.org \
--cc=skaplun@tarantool.org \
--subject='Re: [Tarantool-patches] [PATCH luajit v1 4/5] Handle all types of errors during trace stitching.' \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox