* [Tarantool-patches] [PATCH luajit 0/2] Trace abort on FFI sandwich or mode change @ 2020-03-27 10:47 Igor Munkin 2020-03-27 10:47 ` [Tarantool-patches] [PATCH luajit 1/2] jit: abort trace recording and execution for C API Igor Munkin ` (2 more replies) 0 siblings, 3 replies; 20+ messages in thread From: Igor Munkin @ 2020-03-27 10:47 UTC (permalink / raw) To: Sergey Ostanevich, Vladislav Shpilevoy, Kirill Yukhin; +Cc: tarantool-patches This series closes two issues related to the JIT machinery behaviour: * "FFI sandwich"(*) detection is introduced. If sandwich is detected while trace recording the recording is aborted. The sandwich detected while mcode execution leads to the platform panic. * luaJIT_setmode call is prohibited while mcode execution and leads to the platform panic. (*) The following stack mix is called FFI sandwich. | Lua-FFI -> С routine -> Lua-C API -> Lua VM This sort of re-entrancy is explicitly not supported by LuaJIT compiler. For more info see [1]. Branch: https://github.com/tarantool/luajit/tree/imun/ffi-sandwich [1]: https://github.com/tarantool/tarantool/issues/4427 Igor Munkin (2): jit: abort trace recording and execution for C API jit: abort trace execution on JIT mode change src/lj_api.c | 35 ++++++++++---- src/lj_dispatch.c | 5 ++ src/lj_errmsg.h | 2 + test/gh-4427-ffi-sandwich/CMakeLists.txt | 1 + test/gh-4427-ffi-sandwich/libsandwich.c | 59 ++++++++++++++++++++++++ test/gh-4427-ffi-sandwich/test.lua | 26 +++++++++++ test/lj-flush-on-trace/CMakeLists.txt | 1 + test/lj-flush-on-trace/libflush.c | 31 +++++++++++++ test/lj-flush-on-trace/test.lua | 25 ++++++++++ 9 files changed, 176 insertions(+), 9 deletions(-) create mode 100644 test/gh-4427-ffi-sandwich/CMakeLists.txt create mode 100644 test/gh-4427-ffi-sandwich/libsandwich.c create mode 100644 test/gh-4427-ffi-sandwich/test.lua create mode 100644 test/lj-flush-on-trace/CMakeLists.txt create mode 100644 test/lj-flush-on-trace/libflush.c create mode 100644 test/lj-flush-on-trace/test.lua -- 2.25.0 ^ permalink raw reply [flat|nested] 20+ messages in thread
* [Tarantool-patches] [PATCH luajit 1/2] jit: abort trace recording and execution for C API 2020-03-27 10:47 [Tarantool-patches] [PATCH luajit 0/2] Trace abort on FFI sandwich or mode change Igor Munkin @ 2020-03-27 10:47 ` Igor Munkin 2020-03-28 16:33 ` Sergey Ostanevich 2020-04-02 23:41 ` Vladislav Shpilevoy 2020-03-27 10:47 ` [Tarantool-patches] [PATCH luajit 2/2] jit: abort trace execution on JIT mode change Igor Munkin 2020-04-02 23:41 ` [Tarantool-patches] [PATCH luajit 0/2] Trace abort on FFI sandwich or " Vladislav Shpilevoy 2 siblings, 2 replies; 20+ messages in thread From: Igor Munkin @ 2020-03-27 10:47 UTC (permalink / raw) To: Sergey Ostanevich, Vladislav Shpilevoy, Kirill Yukhin; +Cc: tarantool-patches JIT recording semantics assumes FFI calls are atomic regarding the LuaJIT VM: if the execution exited Lua world through FFI it is not re-entering Lua world again. However, there are ways to break this atomicity via FFI: one can re-enter LuaJIT VM via Lua C API inside the C routine called via FFI. As a result the following stack mix is created: | Lua-FFI -> С routine -> Lua-C API -> Lua VM This sort of re-entrancy is explicitly not supported by LuaJIT compiler machinery. @mraleph named such layout an "FFI sandwich" and I'm strictly against eating it. E.g. there is a trigger machinery in Tarantool calling Lua functions from C space. As a result of FFI function call with a trigger underneath the "FFI sandwich" is created. This changeset introduces the mechanism for Lua-C API callbacks similar to the one existing for Lua-FFI: trace recording is aborted whe the execution re-enters LuaJIT VM. Sandwich detection while mcode execution leads to platform panic. Relates to tarantool/tarantool#4427 Co-authored-by: Vyacheslav Egorov <vegorov@google.com> Co-authored-by: Sergey Ostanevich <sergos@tarantool.org> Signed-off-by: Igor Munkin <imun@tarantool.org> --- src/lj_api.c | 35 ++++++++++---- src/lj_errmsg.h | 1 + test/gh-4427-ffi-sandwich/CMakeLists.txt | 1 + test/gh-4427-ffi-sandwich/libsandwich.c | 59 ++++++++++++++++++++++++ test/gh-4427-ffi-sandwich/test.lua | 26 +++++++++++ 5 files changed, 113 insertions(+), 9 deletions(-) create mode 100644 test/gh-4427-ffi-sandwich/CMakeLists.txt create mode 100644 test/gh-4427-ffi-sandwich/libsandwich.c create mode 100644 test/gh-4427-ffi-sandwich/test.lua diff --git a/src/lj_api.c b/src/lj_api.c index a5e2465..c1f53e0 100644 --- a/src/lj_api.c +++ b/src/lj_api.c @@ -76,6 +76,17 @@ static GCtab *getcurrenv(lua_State *L) return fn->c.gct == ~LJ_TFUNC ? tabref(fn->c.env) : tabref(L->env); } +static void call(lua_State *L, TValue *base, int nres) { + global_State *g = G(L); + if (tvref(g->jit_base)) { + setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL)); + if (g->panic) g->panic(L); + exit(EXIT_FAILURE); + } + lj_trace_abort(g); /* Never record across Lua VM entrance */ + lj_vm_call(L, base, nres); +} + /* -- Miscellaneous API functions ----------------------------------------- */ LUA_API int lua_status(lua_State *L) @@ -300,7 +311,7 @@ LUA_API int lua_equal(lua_State *L, int idx1, int idx2) return (int)(uintptr_t)base; } else { L->top = base+2; - lj_vm_call(L, base, 1+1); + call(L, base, 1+1); L->top -= 2+LJ_FR2; return tvistruecond(L->top+1+LJ_FR2); } @@ -323,7 +334,7 @@ LUA_API int lua_lessthan(lua_State *L, int idx1, int idx2) return (int)(uintptr_t)base; } else { L->top = base+2; - lj_vm_call(L, base, 1+1); + call(L, base, 1+1); L->top -= 2+LJ_FR2; return tvistruecond(L->top+1+LJ_FR2); } @@ -775,7 +786,7 @@ LUA_API void lua_concat(lua_State *L, int n) } n -= (int)(L->top - top); L->top = top+2; - lj_vm_call(L, top, 1+1); + call(L, top, 1+1); L->top -= 1+LJ_FR2; copyTV(L, L->top-1, L->top+LJ_FR2); } while (--n > 0); @@ -795,7 +806,7 @@ LUA_API void lua_gettable(lua_State *L, int idx) v = lj_meta_tget(L, t, L->top-1); if (v == NULL) { L->top += 2; - lj_vm_call(L, L->top-2, 1+1); + call(L, L->top-2, 1+1); L->top -= 2+LJ_FR2; v = L->top+1+LJ_FR2; } @@ -811,7 +822,7 @@ LUA_API void lua_getfield(lua_State *L, int idx, const char *k) v = lj_meta_tget(L, t, &key); if (v == NULL) { L->top += 2; - lj_vm_call(L, L->top-2, 1+1); + call(L, L->top-2, 1+1); L->top -= 2+LJ_FR2; v = L->top+1+LJ_FR2; } @@ -966,7 +977,7 @@ LUA_API void lua_settable(lua_State *L, int idx) TValue *base = L->top; copyTV(L, base+2, base-3-2*LJ_FR2); L->top = base+3; - lj_vm_call(L, base, 0+1); + call(L, base, 0+1); L->top -= 3+LJ_FR2; } } @@ -987,7 +998,7 @@ LUA_API void lua_setfield(lua_State *L, int idx, const char *k) TValue *base = L->top; copyTV(L, base+2, base-3-2*LJ_FR2); L->top = base+3; - lj_vm_call(L, base, 0+1); + call(L, base, 0+1); L->top -= 2+LJ_FR2; } } @@ -1118,7 +1129,7 @@ LUA_API void lua_call(lua_State *L, int nargs, int nresults) { api_check(L, L->status == LUA_OK || L->status == LUA_ERRERR); api_checknelems(L, nargs+1); - lj_vm_call(L, api_call_base(L, nargs), nresults+1); + call(L, api_call_base(L, nargs), nresults+1); } LUA_API int lua_pcall(lua_State *L, int nargs, int nresults, int errfunc) @@ -1136,6 +1147,12 @@ LUA_API int lua_pcall(lua_State *L, int nargs, int nresults, int errfunc) api_checkvalidindex(L, o); ef = savestack(L, o); } + if (tvref(g->jit_base)) { + setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL)); + if (g->panic) g->panic(L); + exit(EXIT_FAILURE); + } + lj_trace_abort(g); /* Never record across Lua VM entrance */ status = lj_vm_pcall(L, api_call_base(L, nargs), nresults+1, ef); if (status) hook_restore(g, oldh); return status; @@ -1172,7 +1189,7 @@ LUALIB_API int luaL_callmeta(lua_State *L, int idx, const char *field) if (LJ_FR2) setnilV(top++); copyTV(L, top++, index2adr(L, idx)); L->top = top; - lj_vm_call(L, top-1, 1+1); + call(L, top-1, 1+1); return 1; } return 0; diff --git a/src/lj_errmsg.h b/src/lj_errmsg.h index 060a9f8..1580385 100644 --- a/src/lj_errmsg.h +++ b/src/lj_errmsg.h @@ -112,6 +112,7 @@ ERRDEF(NOJIT, "no JIT compiler for this architecture (yet)") ERRDEF(NOJIT, "JIT compiler permanently disabled by build option") #endif ERRDEF(JITOPT, "unknown or malformed optimization flag " LUA_QS) +ERRDEF(JITCALL, "Lua VM re-entrancy is detected while executing the trace") /* Lexer/parser errors. */ ERRDEF(XMODE, "attempt to load chunk with wrong mode") diff --git a/test/gh-4427-ffi-sandwich/CMakeLists.txt b/test/gh-4427-ffi-sandwich/CMakeLists.txt new file mode 100644 index 0000000..995c6bb --- /dev/null +++ b/test/gh-4427-ffi-sandwich/CMakeLists.txt @@ -0,0 +1 @@ +build_lualib(libsandwich libsandwich.c) diff --git a/test/gh-4427-ffi-sandwich/libsandwich.c b/test/gh-4427-ffi-sandwich/libsandwich.c new file mode 100644 index 0000000..2a24fb2 --- /dev/null +++ b/test/gh-4427-ffi-sandwich/libsandwich.c @@ -0,0 +1,59 @@ +#include <lua.h> +#include <lauxlib.h> + +struct sandwich { + lua_State *L; /* Coroutine saved for a Lua call */ + int ref; /* Anchor to the Lua function to be run */ + int trigger; /* Trigger for switching to Lua call */ +}; + +int ipp(struct sandwich *state, int i) +{ + if (i < state->trigger) + return i + 1; + + /* Sandwich is triggered and Lua ipp function is called */ + lua_pushnumber(state->L, state->ref); + lua_gettable(state->L, LUA_REGISTRYINDEX); + lua_pushnumber(state->L, i); + lua_call(state->L, 1, 1); + return lua_tonumber(state->L, -1); +} + +#define STRUCT_SANDWICH_MT "struct sandwich" + +static int init(lua_State *L) +{ + struct sandwich *state = lua_newuserdata(L, sizeof(struct sandwich)); + + luaL_getmetatable(L, STRUCT_SANDWICH_MT); + lua_setmetatable(L, -2); + + /* Lua ipp function to be called when sandwich is triggered */ + if (luaL_dostring(L, "return function(i) return i + 1 end")) + luaL_error(L, "failed to translate Lua ipp function"); + + state->ref = luaL_ref(L, LUA_REGISTRYINDEX); + state->L = L; + state->trigger = lua_tonumber(L, 1); + return 1; +} + +static int fin(lua_State *L) +{ + struct sandwich *state = luaL_checkudata(L, 1, STRUCT_SANDWICH_MT); + + /* Release the anchored ipp function */ + luaL_unref(L, LUA_REGISTRYINDEX, state->ref); + return 0; +} + +LUA_API int luaopen_libsandwich(lua_State *L) +{ + luaL_newmetatable(L, STRUCT_SANDWICH_MT); + lua_pushcfunction(L, fin); + lua_setfield(L, -2, "__gc"); + + lua_pushcfunction(L, init); + return 1; +} diff --git a/test/gh-4427-ffi-sandwich/test.lua b/test/gh-4427-ffi-sandwich/test.lua new file mode 100644 index 0000000..3ccaee5 --- /dev/null +++ b/test/gh-4427-ffi-sandwich/test.lua @@ -0,0 +1,26 @@ +local cfg = { + hotloop = arg[1] or 1, + trigger = arg[2] or 1, +} + +local ffi = require('ffi') +local ffisandwich = ffi.load('libsandwich') +ffi.cdef('int ipp(struct sandwich *state, int i)') + +-- Save the current coroutine and set the value to trigger ipp +-- call the Lua routine instead of pure C implementation. +local sandwich = require('libsandwich')(cfg.trigger) + +-- Depending on trigger and hotloop values the following contexts +-- are possible: +-- * if trigger <= hotloop -> trace recording is aborted +-- * if trigger > hotloop -> trace is recorded but execution +-- leads to panic +jit.opt.start("3", string.format("hotloop=%d", cfg.hotloop)) + +local res +for i = 0, cfg.trigger + cfg.hotloop do + res = ffisandwich.ipp(sandwich, i) +end +-- Check the resulting value if panic didn't occur earlier. +print(res) -- 2.25.0 ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 1/2] jit: abort trace recording and execution for C API 2020-03-27 10:47 ` [Tarantool-patches] [PATCH luajit 1/2] jit: abort trace recording and execution for C API Igor Munkin @ 2020-03-28 16:33 ` Sergey Ostanevich 2020-03-28 20:30 ` Igor Munkin 2020-04-02 23:41 ` Vladislav Shpilevoy 1 sibling, 1 reply; 20+ messages in thread From: Sergey Ostanevich @ 2020-03-28 16:33 UTC (permalink / raw) To: Igor Munkin; +Cc: Vladislav Shpilevoy, tarantool-patches Hi! Thanks for the patch, see below. On 27 мар 13:47, Igor Munkin wrote: > JIT recording semantics assumes FFI calls are atomic regarding the > LuaJIT VM: if the execution exited Lua world through FFI it is not > re-entering Lua world again. > > However, there are ways to break this atomicity via FFI: one can > re-enter LuaJIT VM via Lua C API inside the C routine called via FFI. As > a result the following stack mix is created: > | Lua-FFI -> С routine -> Lua-C API -> Lua VM > This sort of re-entrancy is explicitly not supported by LuaJIT compiler > machinery. @mraleph named such layout an "FFI sandwich" and I'm strictly > against eating it. > > E.g. there is a trigger machinery in Tarantool calling Lua functions > from C space. As a result of FFI function call with a trigger underneath > the "FFI sandwich" is created. > > This changeset introduces the mechanism for Lua-C API callbacks similar > to the one existing for Lua-FFI: trace recording is aborted whe the when > execution re-enters LuaJIT VM. Sandwich detection while mcode execution > leads to platform panic. Could you elaborate here in comment on this a little: it means this particular Lua call to be aborted, fail for the fiber Lua state, or for the whole Tarantool? > > Relates to tarantool/tarantool#4427 > > Co-authored-by: Vyacheslav Egorov <vegorov@google.com> > Co-authored-by: Sergey Ostanevich <sergos@tarantool.org> > Signed-off-by: Igor Munkin <imun@tarantool.org> > --- > src/lj_api.c | 35 ++++++++++---- > src/lj_errmsg.h | 1 + > test/gh-4427-ffi-sandwich/CMakeLists.txt | 1 + > test/gh-4427-ffi-sandwich/libsandwich.c | 59 ++++++++++++++++++++++++ > test/gh-4427-ffi-sandwich/test.lua | 26 +++++++++++ > 5 files changed, 113 insertions(+), 9 deletions(-) > create mode 100644 test/gh-4427-ffi-sandwich/CMakeLists.txt > create mode 100644 test/gh-4427-ffi-sandwich/libsandwich.c > create mode 100644 test/gh-4427-ffi-sandwich/test.lua > > diff --git a/src/lj_api.c b/src/lj_api.c > index a5e2465..c1f53e0 100644 > --- a/src/lj_api.c > +++ b/src/lj_api.c > @@ -76,6 +76,17 @@ static GCtab *getcurrenv(lua_State *L) > return fn->c.gct == ~LJ_TFUNC ? tabref(fn->c.env) : tabref(L->env); > } > > +static void call(lua_State *L, TValue *base, int nres) { I would recommend to name the function in a more descriptive way, like sane_call. > + global_State *g = G(L); > + if (tvref(g->jit_base)) { > + setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL)); > + if (g->panic) g->panic(L); > + exit(EXIT_FAILURE); > + } > + lj_trace_abort(g); /* Never record across Lua VM entrance */ > + lj_vm_call(L, base, nres); > +} > + > /* -- Miscellaneous API functions ----------------------------------------- */ > > LUA_API int lua_status(lua_State *L) Also I have a question on perf impact - have you measured any benchmarks - Lua only, or as part of Tarantool? Regards, Sergos ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 1/2] jit: abort trace recording and execution for C API 2020-03-28 16:33 ` Sergey Ostanevich @ 2020-03-28 20:30 ` Igor Munkin 2020-03-29 9:21 ` Sergey Ostanevich 0 siblings, 1 reply; 20+ messages in thread From: Igor Munkin @ 2020-03-28 20:30 UTC (permalink / raw) To: Sergey Ostanevich; +Cc: Vladislav Shpilevoy, tarantool-patches Sergos, Thanks for you review! On 28.03.20, Sergey Ostanevich wrote: > Hi! > > Thanks for the patch, see below. > On 27 мар 13:47, Igor Munkin wrote: > > JIT recording semantics assumes FFI calls are atomic regarding the > > LuaJIT VM: if the execution exited Lua world through FFI it is not > > re-entering Lua world again. > > > > However, there are ways to break this atomicity via FFI: one can > > re-enter LuaJIT VM via Lua C API inside the C routine called via FFI. As > > a result the following stack mix is created: > > | Lua-FFI -> С routine -> Lua-C API -> Lua VM Fixed typo <D0><A1> (cyrillic С letter) to an ASCII C. > > This sort of re-entrancy is explicitly not supported by LuaJIT compiler > > machinery. @mraleph named such layout an "FFI sandwich" and I'm strictly > > against eating it. > > > > E.g. there is a trigger machinery in Tarantool calling Lua functions > > from C space. As a result of FFI function call with a trigger underneath > > the "FFI sandwich" is created. > > > > This changeset introduces the mechanism for Lua-C API callbacks similar > > to the one existing for Lua-FFI: trace recording is aborted whe the > when Thanks, fixed. > > execution re-enters LuaJIT VM. Sandwich detection while mcode execution > > leads to platform panic. > Could you elaborate here in comment on this a little: it means this > particular Lua call to be aborted, fail for the fiber Lua state, or for > the whole Tarantool? > The whole Tarantool finishes its execution with EXIT_FAILURE code and calls tarantool_panic_handler routine prior to the exit. Should I add this remark into commit message? > > > > Relates to tarantool/tarantool#4427 > > > > Co-authored-by: Vyacheslav Egorov <vegorov@google.com> > > Co-authored-by: Sergey Ostanevich <sergos@tarantool.org> > > Signed-off-by: Igor Munkin <imun@tarantool.org> > > --- > > src/lj_api.c | 35 ++++++++++---- > > src/lj_errmsg.h | 1 + > > test/gh-4427-ffi-sandwich/CMakeLists.txt | 1 + > > test/gh-4427-ffi-sandwich/libsandwich.c | 59 ++++++++++++++++++++++++ > > test/gh-4427-ffi-sandwich/test.lua | 26 +++++++++++ > > 5 files changed, 113 insertions(+), 9 deletions(-) > > create mode 100644 test/gh-4427-ffi-sandwich/CMakeLists.txt > > create mode 100644 test/gh-4427-ffi-sandwich/libsandwich.c > > create mode 100644 test/gh-4427-ffi-sandwich/test.lua > > > > diff --git a/src/lj_api.c b/src/lj_api.c > > index a5e2465..c1f53e0 100644 > > --- a/src/lj_api.c > > +++ b/src/lj_api.c > > @@ -76,6 +76,17 @@ static GCtab *getcurrenv(lua_State *L) > > return fn->c.gct == ~LJ_TFUNC ? tabref(fn->c.env) : tabref(L->env); > > } > > > > +static void call(lua_State *L, TValue *base, int nres) { > I would recommend to name the function in a more descriptive way, like > sane_call. IMHO no_trace_call / no_jit_call looks good to me, but feel free to propose the naming you prefer more. > > > + global_State *g = G(L); > > + if (tvref(g->jit_base)) { > > + setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL)); > > + if (g->panic) g->panic(L); > > + exit(EXIT_FAILURE); > > + } > > + lj_trace_abort(g); /* Never record across Lua VM entrance */ > > + lj_vm_call(L, base, nres); > > +} > > + > > /* -- Miscellaneous API functions ----------------------------------------- */ > > > > LUA_API int lua_status(lua_State *L) > > Also I have a question on perf impact - have you measured any > benchmarks - Lua only, or as part of Tarantool I compared two sysbench logs ([1] and [2]) via vimdiff and see no critical difference. > > Regards, > Sergos [1]: https://gitlab.com/tarantool/tarantool/-/jobs/489420257/raw [2]: https://gitlab.com/tarantool/tarantool/-/jobs/486707332/raw -- Best regards, IM ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 1/2] jit: abort trace recording and execution for C API 2020-03-28 20:30 ` Igor Munkin @ 2020-03-29 9:21 ` Sergey Ostanevich 2020-03-29 10:45 ` Igor Munkin 0 siblings, 1 reply; 20+ messages in thread From: Sergey Ostanevich @ 2020-03-29 9:21 UTC (permalink / raw) To: Igor Munkin; +Cc: Vladislav Shpilevoy, tarantool-patches On 28 мар 23:30, Igor Munkin wrote: > Sergos, > > Thanks for you review! > > On 28.03.20, Sergey Ostanevich wrote: > > Hi! > > > > Thanks for the patch, see below. > > On 27 мар 13:47, Igor Munkin wrote: > > > JIT recording semantics assumes FFI calls are atomic regarding the > > > LuaJIT VM: if the execution exited Lua world through FFI it is not > > > re-entering Lua world again. > > > > > > However, there are ways to break this atomicity via FFI: one can > > > re-enter LuaJIT VM via Lua C API inside the C routine called via FFI. As > > > a result the following stack mix is created: > > > | Lua-FFI -> С routine -> Lua-C API -> Lua VM > > Fixed typo <D0><A1> (cyrillic С letter) to an ASCII C. > > > > This sort of re-entrancy is explicitly not supported by LuaJIT compiler > > > machinery. @mraleph named such layout an "FFI sandwich" and I'm strictly > > > against eating it. > > > > > > E.g. there is a trigger machinery in Tarantool calling Lua functions > > > from C space. As a result of FFI function call with a trigger underneath > > > the "FFI sandwich" is created. > > > > > > This changeset introduces the mechanism for Lua-C API callbacks similar > > > to the one existing for Lua-FFI: trace recording is aborted whe the > > when > > Thanks, fixed. > > > > execution re-enters LuaJIT VM. Sandwich detection while mcode execution > > > leads to platform panic. > > Could you elaborate here in comment on this a little: it means this > > particular Lua call to be aborted, fail for the fiber Lua state, or for > > the whole Tarantool? > > > > The whole Tarantool finishes its execution with EXIT_FAILURE code and > calls tarantool_panic_handler routine prior to the exit. Should I add > this remark into commit message? I would like to see it in commit message. > > > > > > > Relates to tarantool/tarantool#4427 > > > > > > Co-authored-by: Vyacheslav Egorov <vegorov@google.com> > > > Co-authored-by: Sergey Ostanevich <sergos@tarantool.org> > > > Signed-off-by: Igor Munkin <imun@tarantool.org> > > > --- > > > src/lj_api.c | 35 ++++++++++---- > > > src/lj_errmsg.h | 1 + > > > test/gh-4427-ffi-sandwich/CMakeLists.txt | 1 + > > > test/gh-4427-ffi-sandwich/libsandwich.c | 59 ++++++++++++++++++++++++ > > > test/gh-4427-ffi-sandwich/test.lua | 26 +++++++++++ > > > 5 files changed, 113 insertions(+), 9 deletions(-) > > > create mode 100644 test/gh-4427-ffi-sandwich/CMakeLists.txt > > > create mode 100644 test/gh-4427-ffi-sandwich/libsandwich.c > > > create mode 100644 test/gh-4427-ffi-sandwich/test.lua > > > > > > diff --git a/src/lj_api.c b/src/lj_api.c > > > index a5e2465..c1f53e0 100644 > > > --- a/src/lj_api.c > > > +++ b/src/lj_api.c > > > @@ -76,6 +76,17 @@ static GCtab *getcurrenv(lua_State *L) > > > return fn->c.gct == ~LJ_TFUNC ? tabref(fn->c.env) : tabref(L->env); > > > } > > > > > > +static void call(lua_State *L, TValue *base, int nres) { > > I would recommend to name the function in a more descriptive way, like > > sane_call. > > IMHO no_trace_call / no_jit_call looks good to me, but feel free to > propose the naming you prefer more. If you want to mention that this call aborts the trace generation, then I would put it like 'call_w_trace_abort' But it goes beyond that AFAU. Can you name it like 'ffi_safe_call'? Although, safety resulting in a panic sounds paradoxical. BTW, should it be under 'LJ_HASFFI' and keep the original call otherwise? > > > > > > + global_State *g = G(L); > > > + if (tvref(g->jit_base)) { > > > + setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL)); > > > + if (g->panic) g->panic(L); > > > + exit(EXIT_FAILURE); > > > + } > > > + lj_trace_abort(g); /* Never record across Lua VM entrance */ > > > + lj_vm_call(L, base, nres); > > > +} > > > + > > > /* -- Miscellaneous API functions ----------------------------------------- */ > > > > > > LUA_API int lua_status(lua_State *L) > > > > Also I have a question on perf impact - have you measured any > > benchmarks - Lua only, or as part of Tarantool > > I compared two sysbench logs ([1] and [2]) via vimdiff and see no > critical difference. > > > > > Regards, > > Sergos > > [1]: https://gitlab.com/tarantool/tarantool/-/jobs/489420257/raw > [2]: https://gitlab.com/tarantool/tarantool/-/jobs/486707332/raw > > -- > Best regards, > IM ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 1/2] jit: abort trace recording and execution for C API 2020-03-29 9:21 ` Sergey Ostanevich @ 2020-03-29 10:45 ` Igor Munkin 2020-03-30 8:58 ` Sergey Ostanevich 0 siblings, 1 reply; 20+ messages in thread From: Igor Munkin @ 2020-03-29 10:45 UTC (permalink / raw) To: Sergey Ostanevich; +Cc: Vladislav Shpilevoy, tarantool-patches Sergos, On 29.03.20, Sergey Ostanevich wrote: > On 28 мар 23:30, Igor Munkin wrote: > > Sergos, > > > > Thanks for you review! > > > > On 28.03.20, Sergey Ostanevich wrote: > > > Hi! > > > > > > Thanks for the patch, see below. > > > On 27 мар 13:47, Igor Munkin wrote: > > > > JIT recording semantics assumes FFI calls are atomic regarding the > > > > LuaJIT VM: if the execution exited Lua world through FFI it is not > > > > re-entering Lua world again. > > > > > > > > However, there are ways to break this atomicity via FFI: one can > > > > re-enter LuaJIT VM via Lua C API inside the C routine called via FFI. As > > > > a result the following stack mix is created: > > > > | Lua-FFI -> С routine -> Lua-C API -> Lua VM > > > > Fixed typo <D0><A1> (cyrillic С letter) to an ASCII C. > > > > > > This sort of re-entrancy is explicitly not supported by LuaJIT compiler > > > > machinery. @mraleph named such layout an "FFI sandwich" and I'm strictly > > > > against eating it. > > > > > > > > E.g. there is a trigger machinery in Tarantool calling Lua functions > > > > from C space. As a result of FFI function call with a trigger underneath > > > > the "FFI sandwich" is created. > > > > > > > > This changeset introduces the mechanism for Lua-C API callbacks similar > > > > to the one existing for Lua-FFI: trace recording is aborted whe the > > > when > > > > Thanks, fixed. > > > > > > execution re-enters LuaJIT VM. Sandwich detection while mcode execution > > > > leads to platform panic. > > > Could you elaborate here in comment on this a little: it means this > > > particular Lua call to be aborted, fail for the fiber Lua state, or for > > > the whole Tarantool? > > > > > > > The whole Tarantool finishes its execution with EXIT_FAILURE code and > > calls tarantool_panic_handler routine prior to the exit. Should I add > > this remark into commit message? > I would like to see it in commit message. Added. Here is a new commit message: ================================================================================ jit: abort trace recording and execution for C API JIT recording semantics assumes FFI calls are atomic regarding the LuaJIT VM: if the execution exited Lua world through FFI it is not re-entering Lua world again. However, there are ways to break this atomicity via FFI: one can re-enter LuaJIT VM via Lua C API inside the C routine called via FFI. As a result the following stack mix is created: | Lua-FFI -> C routine -> Lua-C API -> Lua VM This sort of re-entrancy is explicitly not supported by LuaJIT compiler machinery. @mraleph named such layout an "FFI sandwich" and I'm strictly against eating it. E.g. there is a trigger machinery in Tarantool calling Lua functions from C space. As a result of FFI function call with a trigger underneath the "FFI sandwich" is created. This changeset introduces the mechanism for Lua-C API callbacks similar to the one existing for Lua-FFI: trace recording is aborted when the execution re-enters LuaJIT VM. If re-entrancy is detected while running mcode the platform finishes its execution with EXIT_FAILURE code and calls panic routine prior to the exit. Relates to tarantool/tarantool#4427 ================================================================================ > > > > > > > > > > > Relates to tarantool/tarantool#4427 > > > > > > > > Co-authored-by: Vyacheslav Egorov <vegorov@google.com> > > > > Co-authored-by: Sergey Ostanevich <sergos@tarantool.org> > > > > Signed-off-by: Igor Munkin <imun@tarantool.org> > > > > --- > > > > src/lj_api.c | 35 ++++++++++---- > > > > src/lj_errmsg.h | 1 + > > > > test/gh-4427-ffi-sandwich/CMakeLists.txt | 1 + > > > > test/gh-4427-ffi-sandwich/libsandwich.c | 59 ++++++++++++++++++++++++ > > > > test/gh-4427-ffi-sandwich/test.lua | 26 +++++++++++ > > > > 5 files changed, 113 insertions(+), 9 deletions(-) > > > > create mode 100644 test/gh-4427-ffi-sandwich/CMakeLists.txt > > > > create mode 100644 test/gh-4427-ffi-sandwich/libsandwich.c > > > > create mode 100644 test/gh-4427-ffi-sandwich/test.lua > > > > > > > > diff --git a/src/lj_api.c b/src/lj_api.c > > > > index a5e2465..c1f53e0 100644 > > > > --- a/src/lj_api.c > > > > +++ b/src/lj_api.c > > > > @@ -76,6 +76,17 @@ static GCtab *getcurrenv(lua_State *L) > > > > return fn->c.gct == ~LJ_TFUNC ? tabref(fn->c.env) : tabref(L->env); > > > > } > > > > > > > > +static void call(lua_State *L, TValue *base, int nres) { > > > I would recommend to name the function in a more descriptive way, like > > > sane_call. > > > > IMHO no_trace_call / no_jit_call looks good to me, but feel free to > > propose the naming you prefer more. > If you want to mention that this call aborts the trace generation, > then I would put it like 'call_w_trace_abort' It not only aborts the trace recording, but prohibits further execution for already compiled code. > But it goes beyond that AFAU. Can you name it like 'ffi_safe_call'? The changes are not related to FFI at all. FFI helps to achieve the platform failure, but the problem relates to JIT machinery. So this name doesn't clarify function semantics. > Although, safety resulting in a panic sounds paradoxical. Totally agree. > > BTW, should it be under 'LJ_HASFFI' and keep the original call > otherwise? I see no benefits for it. Again, the fix doesn't relates to FFI itself. It's just a particular way leading to JIT inconsistency. Feel free to correct me, if I'm wrong. > > > > > > > > > > + global_State *g = G(L); > > > > + if (tvref(g->jit_base)) { > > > > + setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL)); > > > > + if (g->panic) g->panic(L); > > > > + exit(EXIT_FAILURE); > > > > + } > > > > + lj_trace_abort(g); /* Never record across Lua VM entrance */ > > > > + lj_vm_call(L, base, nres); > > > > +} > > > > + > > > > /* -- Miscellaneous API functions ----------------------------------------- */ > > > > > > > > LUA_API int lua_status(lua_State *L) > > > > > > Also I have a question on perf impact - have you measured any > > > benchmarks - Lua only, or as part of Tarantool > > > > I compared two sysbench logs ([1] and [2]) via vimdiff and see no > > critical difference. > > > > > > > > Regards, > > > Sergos > > > > [1]: https://gitlab.com/tarantool/tarantool/-/jobs/489420257/raw > > [2]: https://gitlab.com/tarantool/tarantool/-/jobs/486707332/raw > > > > -- > > Best regards, > > IM -- Best regards, IM ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 1/2] jit: abort trace recording and execution for C API 2020-03-29 10:45 ` Igor Munkin @ 2020-03-30 8:58 ` Sergey Ostanevich 2020-03-30 14:25 ` Igor Munkin 0 siblings, 1 reply; 20+ messages in thread From: Sergey Ostanevich @ 2020-03-30 8:58 UTC (permalink / raw) To: Igor Munkin; +Cc: Vladislav Shpilevoy, tarantool-patches On 29 мар 13:45, Igor Munkin wrote: > Sergos, > > On 29.03.20, Sergey Ostanevich wrote: > > On 28 мар 23:30, Igor Munkin wrote: > > > Sergos, > > > > > > Thanks for you review! > > > > > > On 28.03.20, Sergey Ostanevich wrote: > > > > Hi! > > > > > > > > Thanks for the patch, see below. > > > > On 27 мар 13:47, Igor Munkin wrote: > > > > > JIT recording semantics assumes FFI calls are atomic regarding the > > > > > LuaJIT VM: if the execution exited Lua world through FFI it is not > > > > > re-entering Lua world again. > > > > > > > > > > However, there are ways to break this atomicity via FFI: one can > > > > > re-enter LuaJIT VM via Lua C API inside the C routine called via FFI. As > > > > > a result the following stack mix is created: > > > > > | Lua-FFI -> С routine -> Lua-C API -> Lua VM > > > > > > Fixed typo <D0><A1> (cyrillic С letter) to an ASCII C. > > > > > > > > This sort of re-entrancy is explicitly not supported by LuaJIT compiler > > > > > machinery. @mraleph named such layout an "FFI sandwich" and I'm strictly > > > > > against eating it. > > > > > > > > > > E.g. there is a trigger machinery in Tarantool calling Lua functions > > > > > from C space. As a result of FFI function call with a trigger underneath > > > > > the "FFI sandwich" is created. > > > > > > > > > > This changeset introduces the mechanism for Lua-C API callbacks similar > > > > > to the one existing for Lua-FFI: trace recording is aborted whe the > > > > when > > > > > > Thanks, fixed. > > > > > > > > execution re-enters LuaJIT VM. Sandwich detection while mcode execution > > > > > leads to platform panic. > > > > Could you elaborate here in comment on this a little: it means this > > > > particular Lua call to be aborted, fail for the fiber Lua state, or for > > > > the whole Tarantool? > > > > > > > > > > The whole Tarantool finishes its execution with EXIT_FAILURE code and > > > calls tarantool_panic_handler routine prior to the exit. Should I add > > > this remark into commit message? > > I would like to see it in commit message. > > Added. Here is a new commit message: > > ================================================================================ > > jit: abort trace recording and execution for C API > > JIT recording semantics assumes FFI calls are atomic regarding the > LuaJIT VM: if the execution exited Lua world through FFI it is not > re-entering Lua world again. > > However, there are ways to break this atomicity via FFI: one can > re-enter LuaJIT VM via Lua C API inside the C routine called via FFI. As > a result the following stack mix is created: > | Lua-FFI -> C routine -> Lua-C API -> Lua VM > This sort of re-entrancy is explicitly not supported by LuaJIT compiler > machinery. @mraleph named such layout an "FFI sandwich" and I'm strictly > against eating it. > > E.g. there is a trigger machinery in Tarantool calling Lua functions > from C space. As a result of FFI function call with a trigger underneath > the "FFI sandwich" is created. > > This changeset introduces the mechanism for Lua-C API callbacks similar > to the one existing for Lua-FFI: trace recording is aborted when the > execution re-enters LuaJIT VM. If re-entrancy is detected while running > mcode the platform finishes its execution with EXIT_FAILURE code and > calls panic routine prior to the exit. > > Relates to tarantool/tarantool#4427 > > ================================================================================ LGTM. > > > > > > > > > > > > > > > > Relates to tarantool/tarantool#4427 > > > > > > > > > > Co-authored-by: Vyacheslav Egorov <vegorov@google.com> > > > > > Co-authored-by: Sergey Ostanevich <sergos@tarantool.org> > > > > > Signed-off-by: Igor Munkin <imun@tarantool.org> > > > > > --- > > > > > src/lj_api.c | 35 ++++++++++---- > > > > > src/lj_errmsg.h | 1 + > > > > > test/gh-4427-ffi-sandwich/CMakeLists.txt | 1 + > > > > > test/gh-4427-ffi-sandwich/libsandwich.c | 59 ++++++++++++++++++++++++ > > > > > test/gh-4427-ffi-sandwich/test.lua | 26 +++++++++++ > > > > > 5 files changed, 113 insertions(+), 9 deletions(-) > > > > > create mode 100644 test/gh-4427-ffi-sandwich/CMakeLists.txt > > > > > create mode 100644 test/gh-4427-ffi-sandwich/libsandwich.c > > > > > create mode 100644 test/gh-4427-ffi-sandwich/test.lua > > > > > > > > > > diff --git a/src/lj_api.c b/src/lj_api.c > > > > > index a5e2465..c1f53e0 100644 > > > > > --- a/src/lj_api.c > > > > > +++ b/src/lj_api.c > > > > > @@ -76,6 +76,17 @@ static GCtab *getcurrenv(lua_State *L) > > > > > return fn->c.gct == ~LJ_TFUNC ? tabref(fn->c.env) : tabref(L->env); > > > > > } > > > > > > > > > > +static void call(lua_State *L, TValue *base, int nres) { > > > > I would recommend to name the function in a more descriptive way, like > > > > sane_call. > > > > > > IMHO no_trace_call / no_jit_call looks good to me, but feel free to > > > propose the naming you prefer more. > > If you want to mention that this call aborts the trace generation, > > then I would put it like 'call_w_trace_abort' > > It not only aborts the trace recording, but prohibits further execution > for already compiled code. > > > But it goes beyond that AFAU. Can you name it like 'ffi_safe_call'? > > The changes are not related to FFI at all. FFI helps to achieve the > platform failure, but the problem relates to JIT machinery. So this name > doesn't clarify function semantics. > Can we sayn that this machinery tracks the machinery consistency? Then the first name comes to my mind is consistent_call with in the code comment on what consistency is - abort trace collection and inconsistent trace excution. > > Although, safety resulting in a panic sounds paradoxical. > > Totally agree. > > > > > BTW, should it be under 'LJ_HASFFI' and keep the original call > > otherwise? > > I see no benefits for it. Again, the fix doesn't relates to FFI > itself. It's just a particular way leading to JIT inconsistency. > > Feel free to correct me, if I'm wrong. > > > > > > > > > > > > > > > + global_State *g = G(L); > > > > > + if (tvref(g->jit_base)) { > > > > > + setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL)); > > > > > + if (g->panic) g->panic(L); > > > > > + exit(EXIT_FAILURE); > > > > > + } > > > > > + lj_trace_abort(g); /* Never record across Lua VM entrance */ > > > > > + lj_vm_call(L, base, nres); > > > > > +} > > > > > + > > > > > /* -- Miscellaneous API functions ----------------------------------------- */ > > > > > > > > > > LUA_API int lua_status(lua_State *L) > > > > > > > > Also I have a question on perf impact - have you measured any > > > > benchmarks - Lua only, or as part of Tarantool > > > > > > I compared two sysbench logs ([1] and [2]) via vimdiff and see no > > > critical difference. > > > > > > > > > > > Regards, > > > > Sergos > > > > > > [1]: https://gitlab.com/tarantool/tarantool/-/jobs/489420257/raw > > > [2]: https://gitlab.com/tarantool/tarantool/-/jobs/486707332/raw > > > > > > -- > > > Best regards, > > > IM > > -- > Best regards, > IM ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 1/2] jit: abort trace recording and execution for C API 2020-03-30 8:58 ` Sergey Ostanevich @ 2020-03-30 14:25 ` Igor Munkin 2020-04-03 21:06 ` Sergey Ostanevich 0 siblings, 1 reply; 20+ messages in thread From: Igor Munkin @ 2020-03-30 14:25 UTC (permalink / raw) To: Sergey Ostanevich; +Cc: Vladislav Shpilevoy, tarantool-patches Sergos, On 30.03.20, Sergey Ostanevich wrote: > On 29 мар 13:45, Igor Munkin wrote: > > Sergos, > > > > On 29.03.20, Sergey Ostanevich wrote: > > > On 28 мар 23:30, Igor Munkin wrote: > > > > Sergos, > > > > > > > > Thanks for you review! > > > > > > > > On 28.03.20, Sergey Ostanevich wrote: > > > > > Hi! > > > > > > > > > > Thanks for the patch, see below. > > > > > On 27 мар 13:47, Igor Munkin wrote: > > > > > > JIT recording semantics assumes FFI calls are atomic regarding the > > > > > > LuaJIT VM: if the execution exited Lua world through FFI it is not > > > > > > re-entering Lua world again. > > > > > > > > > > > > However, there are ways to break this atomicity via FFI: one can > > > > > > re-enter LuaJIT VM via Lua C API inside the C routine called via FFI. As > > > > > > a result the following stack mix is created: > > > > > > | Lua-FFI -> С routine -> Lua-C API -> Lua VM > > > > > > > > Fixed typo <D0><A1> (cyrillic С letter) to an ASCII C. > > > > > > > > > > This sort of re-entrancy is explicitly not supported by LuaJIT compiler > > > > > > machinery. @mraleph named such layout an "FFI sandwich" and I'm strictly > > > > > > against eating it. > > > > > > > > > > > > E.g. there is a trigger machinery in Tarantool calling Lua functions > > > > > > from C space. As a result of FFI function call with a trigger underneath > > > > > > the "FFI sandwich" is created. > > > > > > > > > > > > This changeset introduces the mechanism for Lua-C API callbacks similar > > > > > > to the one existing for Lua-FFI: trace recording is aborted whe the > > > > > when > > > > > > > > Thanks, fixed. > > > > > > > > > > execution re-enters LuaJIT VM. Sandwich detection while mcode execution > > > > > > leads to platform panic. > > > > > Could you elaborate here in comment on this a little: it means this > > > > > particular Lua call to be aborted, fail for the fiber Lua state, or for > > > > > the whole Tarantool? > > > > > > > > > > > > > The whole Tarantool finishes its execution with EXIT_FAILURE code and > > > > calls tarantool_panic_handler routine prior to the exit. Should I add > > > > this remark into commit message? > > > I would like to see it in commit message. > > > > Added. Here is a new commit message: > > > > ================================================================================ > > > > jit: abort trace recording and execution for C API > > > > JIT recording semantics assumes FFI calls are atomic regarding the > > LuaJIT VM: if the execution exited Lua world through FFI it is not > > re-entering Lua world again. > > > > However, there are ways to break this atomicity via FFI: one can > > re-enter LuaJIT VM via Lua C API inside the C routine called via FFI. As > > a result the following stack mix is created: > > | Lua-FFI -> C routine -> Lua-C API -> Lua VM > > This sort of re-entrancy is explicitly not supported by LuaJIT compiler > > machinery. @mraleph named such layout an "FFI sandwich" and I'm strictly > > against eating it. > > > > E.g. there is a trigger machinery in Tarantool calling Lua functions > > from C space. As a result of FFI function call with a trigger underneath > > the "FFI sandwich" is created. > > > > This changeset introduces the mechanism for Lua-C API callbacks similar > > to the one existing for Lua-FFI: trace recording is aborted when the > > execution re-enters LuaJIT VM. If re-entrancy is detected while running > > mcode the platform finishes its execution with EXIT_FAILURE code and > > calls panic routine prior to the exit. > > > > Relates to tarantool/tarantool#4427 > > > > ================================================================================ > > LGTM. > > > > > > > > > > > > > > > > > > > > > > Relates to tarantool/tarantool#4427 > > > > > > > > > > > > Co-authored-by: Vyacheslav Egorov <vegorov@google.com> > > > > > > Co-authored-by: Sergey Ostanevich <sergos@tarantool.org> > > > > > > Signed-off-by: Igor Munkin <imun@tarantool.org> > > > > > > --- > > > > > > src/lj_api.c | 35 ++++++++++---- > > > > > > src/lj_errmsg.h | 1 + > > > > > > test/gh-4427-ffi-sandwich/CMakeLists.txt | 1 + > > > > > > test/gh-4427-ffi-sandwich/libsandwich.c | 59 ++++++++++++++++++++++++ > > > > > > test/gh-4427-ffi-sandwich/test.lua | 26 +++++++++++ > > > > > > 5 files changed, 113 insertions(+), 9 deletions(-) > > > > > > create mode 100644 test/gh-4427-ffi-sandwich/CMakeLists.txt > > > > > > create mode 100644 test/gh-4427-ffi-sandwich/libsandwich.c > > > > > > create mode 100644 test/gh-4427-ffi-sandwich/test.lua > > > > > > > > > > > > diff --git a/src/lj_api.c b/src/lj_api.c > > > > > > index a5e2465..c1f53e0 100644 > > > > > > --- a/src/lj_api.c > > > > > > +++ b/src/lj_api.c > > > > > > @@ -76,6 +76,17 @@ static GCtab *getcurrenv(lua_State *L) > > > > > > return fn->c.gct == ~LJ_TFUNC ? tabref(fn->c.env) : tabref(L->env); > > > > > > } > > > > > > > > > > > > +static void call(lua_State *L, TValue *base, int nres) { > > > > > I would recommend to name the function in a more descriptive way, like > > > > > sane_call. > > > > > > > > IMHO no_trace_call / no_jit_call looks good to me, but feel free to > > > > propose the naming you prefer more. > > > If you want to mention that this call aborts the trace generation, > > > then I would put it like 'call_w_trace_abort' > > > > It not only aborts the trace recording, but prohibits further execution > > for already compiled code. > > > > > But it goes beyond that AFAU. Can you name it like 'ffi_safe_call'? > > > > The changes are not related to FFI at all. FFI helps to achieve the > > platform failure, but the problem relates to JIT machinery. So this name > > doesn't clarify function semantics. > > > Can we sayn that this machinery tracks the machinery consistency? Then > the first name comes to my mind is consistent_call with in the code > comment on what consistency is - abort trace collection and inconsistent > trace excution. As a result of discussion, we stopped on the jit_secure_call name. The diff is below: ================================================================================ diff --git a/src/lj_api.c b/src/lj_api.c index c1f53e0..4ff8ccf 100644 --- a/src/lj_api.c +++ b/src/lj_api.c @@ -76,8 +76,9 @@ static GCtab *getcurrenv(lua_State *L) return fn->c.gct == ~LJ_TFUNC ? tabref(fn->c.env) : tabref(L->env); } -static void call(lua_State *L, TValue *base, int nres) { +static void jit_secure_call(lua_State *L, TValue *base, int nres) { global_State *g = G(L); + /* Forbid Lua world re-entrancy while running the trace */ if (tvref(g->jit_base)) { setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL)); if (g->panic) g->panic(L); @@ -311,7 +311,7 @@ LUA_API int lua_equal(lua_State *L, int idx1, int idx2) return (int)(uintptr_t)base; } else { L->top = base+2; - call(L, base, 1+1); + jit_secure_call(L, base, 1+1); L->top -= 2+LJ_FR2; return tvistruecond(L->top+1+LJ_FR2); } @@ -334,7 +334,7 @@ LUA_API int lua_lessthan(lua_State *L, int idx1, int idx2) return (int)(uintptr_t)base; } else { L->top = base+2; - call(L, base, 1+1); + jit_secure_call(L, base, 1+1); L->top -= 2+LJ_FR2; return tvistruecond(L->top+1+LJ_FR2); } @@ -786,7 +786,7 @@ LUA_API void lua_concat(lua_State *L, int n) } n -= (int)(L->top - top); L->top = top+2; - call(L, top, 1+1); + jit_secure_call(L, top, 1+1); L->top -= 1+LJ_FR2; copyTV(L, L->top-1, L->top+LJ_FR2); } while (--n > 0); @@ -806,7 +806,7 @@ LUA_API void lua_gettable(lua_State *L, int idx) v = lj_meta_tget(L, t, L->top-1); if (v == NULL) { L->top += 2; - call(L, L->top-2, 1+1); + jit_secure_call(L, L->top-2, 1+1); L->top -= 2+LJ_FR2; v = L->top+1+LJ_FR2; } @@ -822,7 +822,7 @@ LUA_API void lua_getfield(lua_State *L, int idx, const char *k) v = lj_meta_tget(L, t, &key); if (v == NULL) { L->top += 2; - call(L, L->top-2, 1+1); + jit_secure_call(L, L->top-2, 1+1); L->top -= 2+LJ_FR2; v = L->top+1+LJ_FR2; } @@ -977,7 +977,7 @@ LUA_API void lua_settable(lua_State *L, int idx) TValue *base = L->top; copyTV(L, base+2, base-3-2*LJ_FR2); L->top = base+3; - call(L, base, 0+1); + jit_secure_call(L, base, 0+1); L->top -= 3+LJ_FR2; } } @@ -998,7 +998,7 @@ LUA_API void lua_setfield(lua_State *L, int idx, const char *k) TValue *base = L->top; copyTV(L, base+2, base-3-2*LJ_FR2); L->top = base+3; - call(L, base, 0+1); + jit_secure_call(L, base, 0+1); L->top -= 2+LJ_FR2; } } @@ -1129,7 +1129,7 @@ LUA_API void lua_call(lua_State *L, int nargs, int nresults) { api_check(L, L->status == LUA_OK || L->status == LUA_ERRERR); api_checknelems(L, nargs+1); - call(L, api_call_base(L, nargs), nresults+1); + jit_secure_call(L, api_call_base(L, nargs), nresults+1); } LUA_API int lua_pcall(lua_State *L, int nargs, int nresults, int errfunc) @@ -1147,6 +1148,7 @@ LUA_API int lua_pcall(lua_State *L, int nargs, int nresults, int errfunc) api_checkvalidindex(L, o); ef = savestack(L, o); } + /* Forbid Lua world re-entrancy while running the trace */ if (tvref(g->jit_base)) { setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL)); if (g->panic) g->panic(L); @@ -1189,7 +1189,7 @@ LUALIB_API int luaL_callmeta(lua_State *L, int idx, const char *field) if (LJ_FR2) setnilV(top++); copyTV(L, top++, index2adr(L, idx)); L->top = top; - call(L, top-1, 1+1); + jit_secure_call(L, top-1, 1+1); return 1; } return 0; ================================================================================ Squashed these changes with the original patch. > > > > Although, safety resulting in a panic sounds paradoxical. > > > > Totally agree. > > > > > > > > BTW, should it be under 'LJ_HASFFI' and keep the original call > > > otherwise? > > > > I see no benefits for it. Again, the fix doesn't relates to FFI > > itself. It's just a particular way leading to JIT inconsistency. > > > > Feel free to correct me, if I'm wrong. > > > > > > > > > > > > > > > > > > > > + global_State *g = G(L); > > > > > > + if (tvref(g->jit_base)) { > > > > > > + setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL)); > > > > > > + if (g->panic) g->panic(L); > > > > > > + exit(EXIT_FAILURE); > > > > > > + } > > > > > > + lj_trace_abort(g); /* Never record across Lua VM entrance */ > > > > > > + lj_vm_call(L, base, nres); > > > > > > +} > > > > > > + > > > > > > /* -- Miscellaneous API functions ----------------------------------------- */ > > > > > > > > > > > > LUA_API int lua_status(lua_State *L) > > > > > > > > > > Also I have a question on perf impact - have you measured any > > > > > benchmarks - Lua only, or as part of Tarantool > > > > > > > > I compared two sysbench logs ([1] and [2]) via vimdiff and see no > > > > critical difference. > > > > > > > > > > > > > > Regards, > > > > > Sergos > > > > > > > > [1]: https://gitlab.com/tarantool/tarantool/-/jobs/489420257/raw > > > > [2]: https://gitlab.com/tarantool/tarantool/-/jobs/486707332/raw > > > > > > > > -- > > > > Best regards, > > > > IM > > > > -- > > Best regards, > > IM -- Best regards, IM ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 1/2] jit: abort trace recording and execution for C API 2020-03-30 14:25 ` Igor Munkin @ 2020-04-03 21:06 ` Sergey Ostanevich 2020-04-03 21:31 ` Igor Munkin 0 siblings, 1 reply; 20+ messages in thread From: Sergey Ostanevich @ 2020-04-03 21:06 UTC (permalink / raw) To: Igor Munkin; +Cc: Vladislav Shpilevoy, tarantool-patches Hi! LGTM. Thanks. On 30 мар 17:25, Igor Munkin wrote: > Sergos, > > On 30.03.20, Sergey Ostanevich wrote: > > On 29 мар 13:45, Igor Munkin wrote: > > > Sergos, > > > > > > On 29.03.20, Sergey Ostanevich wrote: > > > > On 28 мар 23:30, Igor Munkin wrote: > > > > > Sergos, > > > > > > > > > > Thanks for you review! > > > > > > > > > > On 28.03.20, Sergey Ostanevich wrote: > > > > > > Hi! > > > > > > > > > > > > Thanks for the patch, see below. > > > > > > On 27 мар 13:47, Igor Munkin wrote: > > > > > > > JIT recording semantics assumes FFI calls are atomic regarding the > > > > > > > LuaJIT VM: if the execution exited Lua world through FFI it is not > > > > > > > re-entering Lua world again. > > > > > > > > > > > > > > However, there are ways to break this atomicity via FFI: one can > > > > > > > re-enter LuaJIT VM via Lua C API inside the C routine called via FFI. As > > > > > > > a result the following stack mix is created: > > > > > > > | Lua-FFI -> С routine -> Lua-C API -> Lua VM > > > > > > > > > > Fixed typo <D0><A1> (cyrillic С letter) to an ASCII C. > > > > > > > > > > > > This sort of re-entrancy is explicitly not supported by LuaJIT compiler > > > > > > > machinery. @mraleph named such layout an "FFI sandwich" and I'm strictly > > > > > > > against eating it. > > > > > > > > > > > > > > E.g. there is a trigger machinery in Tarantool calling Lua functions > > > > > > > from C space. As a result of FFI function call with a trigger underneath > > > > > > > the "FFI sandwich" is created. > > > > > > > > > > > > > > This changeset introduces the mechanism for Lua-C API callbacks similar > > > > > > > to the one existing for Lua-FFI: trace recording is aborted whe the > > > > > > when > > > > > > > > > > Thanks, fixed. > > > > > > > > > > > > execution re-enters LuaJIT VM. Sandwich detection while mcode execution > > > > > > > leads to platform panic. > > > > > > Could you elaborate here in comment on this a little: it means this > > > > > > particular Lua call to be aborted, fail for the fiber Lua state, or for > > > > > > the whole Tarantool? > > > > > > > > > > > > > > > > The whole Tarantool finishes its execution with EXIT_FAILURE code and > > > > > calls tarantool_panic_handler routine prior to the exit. Should I add > > > > > this remark into commit message? > > > > I would like to see it in commit message. > > > > > > Added. Here is a new commit message: > > > > > > ================================================================================ > > > > > > jit: abort trace recording and execution for C API > > > > > > JIT recording semantics assumes FFI calls are atomic regarding the > > > LuaJIT VM: if the execution exited Lua world through FFI it is not > > > re-entering Lua world again. > > > > > > However, there are ways to break this atomicity via FFI: one can > > > re-enter LuaJIT VM via Lua C API inside the C routine called via FFI. As > > > a result the following stack mix is created: > > > | Lua-FFI -> C routine -> Lua-C API -> Lua VM > > > This sort of re-entrancy is explicitly not supported by LuaJIT compiler > > > machinery. @mraleph named such layout an "FFI sandwich" and I'm strictly > > > against eating it. > > > > > > E.g. there is a trigger machinery in Tarantool calling Lua functions > > > from C space. As a result of FFI function call with a trigger underneath > > > the "FFI sandwich" is created. > > > > > > This changeset introduces the mechanism for Lua-C API callbacks similar > > > to the one existing for Lua-FFI: trace recording is aborted when the > > > execution re-enters LuaJIT VM. If re-entrancy is detected while running > > > mcode the platform finishes its execution with EXIT_FAILURE code and > > > calls panic routine prior to the exit. > > > > > > Relates to tarantool/tarantool#4427 > > > > > > ================================================================================ > > > > LGTM. > > > > > > > > > > > > > > > > > > > > > > > > > > > > Relates to tarantool/tarantool#4427 > > > > > > > > > > > > > > Co-authored-by: Vyacheslav Egorov <vegorov@google.com> > > > > > > > Co-authored-by: Sergey Ostanevich <sergos@tarantool.org> > > > > > > > Signed-off-by: Igor Munkin <imun@tarantool.org> > > > > > > > --- > > > > > > > src/lj_api.c | 35 ++++++++++---- > > > > > > > src/lj_errmsg.h | 1 + > > > > > > > test/gh-4427-ffi-sandwich/CMakeLists.txt | 1 + > > > > > > > test/gh-4427-ffi-sandwich/libsandwich.c | 59 ++++++++++++++++++++++++ > > > > > > > test/gh-4427-ffi-sandwich/test.lua | 26 +++++++++++ > > > > > > > 5 files changed, 113 insertions(+), 9 deletions(-) > > > > > > > create mode 100644 test/gh-4427-ffi-sandwich/CMakeLists.txt > > > > > > > create mode 100644 test/gh-4427-ffi-sandwich/libsandwich.c > > > > > > > create mode 100644 test/gh-4427-ffi-sandwich/test.lua > > > > > > > > > > > > > > diff --git a/src/lj_api.c b/src/lj_api.c > > > > > > > index a5e2465..c1f53e0 100644 > > > > > > > --- a/src/lj_api.c > > > > > > > +++ b/src/lj_api.c > > > > > > > @@ -76,6 +76,17 @@ static GCtab *getcurrenv(lua_State *L) > > > > > > > return fn->c.gct == ~LJ_TFUNC ? tabref(fn->c.env) : tabref(L->env); > > > > > > > } > > > > > > > > > > > > > > +static void call(lua_State *L, TValue *base, int nres) { > > > > > > I would recommend to name the function in a more descriptive way, like > > > > > > sane_call. > > > > > > > > > > IMHO no_trace_call / no_jit_call looks good to me, but feel free to > > > > > propose the naming you prefer more. > > > > If you want to mention that this call aborts the trace generation, > > > > then I would put it like 'call_w_trace_abort' > > > > > > It not only aborts the trace recording, but prohibits further execution > > > for already compiled code. > > > > > > > But it goes beyond that AFAU. Can you name it like 'ffi_safe_call'? > > > > > > The changes are not related to FFI at all. FFI helps to achieve the > > > platform failure, but the problem relates to JIT machinery. So this name > > > doesn't clarify function semantics. > > > > > Can we sayn that this machinery tracks the machinery consistency? Then > > the first name comes to my mind is consistent_call with in the code > > comment on what consistency is - abort trace collection and inconsistent > > trace excution. > > As a result of discussion, we stopped on the jit_secure_call name. The > diff is below: > > ================================================================================ > > diff --git a/src/lj_api.c b/src/lj_api.c > index c1f53e0..4ff8ccf 100644 > --- a/src/lj_api.c > +++ b/src/lj_api.c > @@ -76,8 +76,9 @@ static GCtab *getcurrenv(lua_State *L) > return fn->c.gct == ~LJ_TFUNC ? tabref(fn->c.env) : tabref(L->env); > } > > -static void call(lua_State *L, TValue *base, int nres) { > +static void jit_secure_call(lua_State *L, TValue *base, int nres) { > global_State *g = G(L); > + /* Forbid Lua world re-entrancy while running the trace */ > if (tvref(g->jit_base)) { > setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL)); > if (g->panic) g->panic(L); > @@ -311,7 +311,7 @@ LUA_API int lua_equal(lua_State *L, int idx1, int idx2) > return (int)(uintptr_t)base; > } else { > L->top = base+2; > - call(L, base, 1+1); > + jit_secure_call(L, base, 1+1); > L->top -= 2+LJ_FR2; > return tvistruecond(L->top+1+LJ_FR2); > } > @@ -334,7 +334,7 @@ LUA_API int lua_lessthan(lua_State *L, int idx1, int idx2) > return (int)(uintptr_t)base; > } else { > L->top = base+2; > - call(L, base, 1+1); > + jit_secure_call(L, base, 1+1); > L->top -= 2+LJ_FR2; > return tvistruecond(L->top+1+LJ_FR2); > } > @@ -786,7 +786,7 @@ LUA_API void lua_concat(lua_State *L, int n) > } > n -= (int)(L->top - top); > L->top = top+2; > - call(L, top, 1+1); > + jit_secure_call(L, top, 1+1); > L->top -= 1+LJ_FR2; > copyTV(L, L->top-1, L->top+LJ_FR2); > } while (--n > 0); > @@ -806,7 +806,7 @@ LUA_API void lua_gettable(lua_State *L, int idx) > v = lj_meta_tget(L, t, L->top-1); > if (v == NULL) { > L->top += 2; > - call(L, L->top-2, 1+1); > + jit_secure_call(L, L->top-2, 1+1); > L->top -= 2+LJ_FR2; > v = L->top+1+LJ_FR2; > } > @@ -822,7 +822,7 @@ LUA_API void lua_getfield(lua_State *L, int idx, const char *k) > v = lj_meta_tget(L, t, &key); > if (v == NULL) { > L->top += 2; > - call(L, L->top-2, 1+1); > + jit_secure_call(L, L->top-2, 1+1); > L->top -= 2+LJ_FR2; > v = L->top+1+LJ_FR2; > } > @@ -977,7 +977,7 @@ LUA_API void lua_settable(lua_State *L, int idx) > TValue *base = L->top; > copyTV(L, base+2, base-3-2*LJ_FR2); > L->top = base+3; > - call(L, base, 0+1); > + jit_secure_call(L, base, 0+1); > L->top -= 3+LJ_FR2; > } > } > @@ -998,7 +998,7 @@ LUA_API void lua_setfield(lua_State *L, int idx, const char *k) > TValue *base = L->top; > copyTV(L, base+2, base-3-2*LJ_FR2); > L->top = base+3; > - call(L, base, 0+1); > + jit_secure_call(L, base, 0+1); > L->top -= 2+LJ_FR2; > } > } > @@ -1129,7 +1129,7 @@ LUA_API void lua_call(lua_State *L, int nargs, int nresults) > { > api_check(L, L->status == LUA_OK || L->status == LUA_ERRERR); > api_checknelems(L, nargs+1); > - call(L, api_call_base(L, nargs), nresults+1); > + jit_secure_call(L, api_call_base(L, nargs), nresults+1); > } > > LUA_API int lua_pcall(lua_State *L, int nargs, int nresults, int errfunc) > @@ -1147,6 +1148,7 @@ LUA_API int lua_pcall(lua_State *L, int nargs, int nresults, int errfunc) > api_checkvalidindex(L, o); > ef = savestack(L, o); > } > + /* Forbid Lua world re-entrancy while running the trace */ > if (tvref(g->jit_base)) { > setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL)); > if (g->panic) g->panic(L); > @@ -1189,7 +1189,7 @@ LUALIB_API int luaL_callmeta(lua_State *L, int idx, const char *field) > if (LJ_FR2) setnilV(top++); > copyTV(L, top++, index2adr(L, idx)); > L->top = top; > - call(L, top-1, 1+1); > + jit_secure_call(L, top-1, 1+1); > return 1; > } > return 0; > > ================================================================================ > > Squashed these changes with the original patch. > > > > > > > Although, safety resulting in a panic sounds paradoxical. > > > > > > Totally agree. > > > > > > > > > > > BTW, should it be under 'LJ_HASFFI' and keep the original call > > > > otherwise? > > > > > > I see no benefits for it. Again, the fix doesn't relates to FFI > > > itself. It's just a particular way leading to JIT inconsistency. > > > > > > Feel free to correct me, if I'm wrong. > > > > > > > > > > > > > > > > > > > > > > > > > + global_State *g = G(L); > > > > > > > + if (tvref(g->jit_base)) { > > > > > > > + setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL)); > > > > > > > + if (g->panic) g->panic(L); > > > > > > > + exit(EXIT_FAILURE); > > > > > > > + } > > > > > > > + lj_trace_abort(g); /* Never record across Lua VM entrance */ > > > > > > > + lj_vm_call(L, base, nres); > > > > > > > +} > > > > > > > + > > > > > > > /* -- Miscellaneous API functions ----------------------------------------- */ > > > > > > > > > > > > > > LUA_API int lua_status(lua_State *L) > > > > > > > > > > > > Also I have a question on perf impact - have you measured any > > > > > > benchmarks - Lua only, or as part of Tarantool > > > > > > > > > > I compared two sysbench logs ([1] and [2]) via vimdiff and see no > > > > > critical difference. > > > > > > > > > > > > > > > > > Regards, > > > > > > Sergos > > > > > > > > > > [1]: https://gitlab.com/tarantool/tarantool/-/jobs/489420257/raw > > > > > [2]: https://gitlab.com/tarantool/tarantool/-/jobs/486707332/raw > > > > > > > > > > -- > > > > > Best regards, > > > > > IM > > > > > > -- > > > Best regards, > > > IM > > -- > Best regards, > IM ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 1/2] jit: abort trace recording and execution for C API 2020-04-03 21:06 ` Sergey Ostanevich @ 2020-04-03 21:31 ` Igor Munkin 0 siblings, 0 replies; 20+ messages in thread From: Igor Munkin @ 2020-04-03 21:31 UTC (permalink / raw) To: Sergey Ostanevich; +Cc: Vladislav Shpilevoy, tarantool-patches Sergos, Thanks for your review! Squashed the fixes, added your Reviewed-by tag and updated the upstream. On 04.04.20, Sergey Ostanevich wrote: > Hi! > > LGTM. > > Thanks. > > On 30 мар 17:25, Igor Munkin wrote: > > Sergos, > > > > On 30.03.20, Sergey Ostanevich wrote: > > > On 29 мар 13:45, Igor Munkin wrote: > > > > Sergos, > > > > > > > > On 29.03.20, Sergey Ostanevich wrote: > > > > > On 28 мар 23:30, Igor Munkin wrote: > > > > > > Sergos, > > > > > > > > > > > > Thanks for you review! > > > > > > > > > > > > On 28.03.20, Sergey Ostanevich wrote: > > > > > > > Hi! > > > > > > > > > > > > > > Thanks for the patch, see below. > > > > > > > On 27 мар 13:47, Igor Munkin wrote: > > > > > > > > JIT recording semantics assumes FFI calls are atomic regarding the > > > > > > > > LuaJIT VM: if the execution exited Lua world through FFI it is not > > > > > > > > re-entering Lua world again. > > > > > > > > > > > > > > > > However, there are ways to break this atomicity via FFI: one can > > > > > > > > re-enter LuaJIT VM via Lua C API inside the C routine called via FFI. As > > > > > > > > a result the following stack mix is created: > > > > > > > > | Lua-FFI -> С routine -> Lua-C API -> Lua VM > > > > > > > > > > > > Fixed typo <D0><A1> (cyrillic С letter) to an ASCII C. > > > > > > > > > > > > > > This sort of re-entrancy is explicitly not supported by LuaJIT compiler > > > > > > > > machinery. @mraleph named such layout an "FFI sandwich" and I'm strictly > > > > > > > > against eating it. > > > > > > > > > > > > > > > > E.g. there is a trigger machinery in Tarantool calling Lua functions > > > > > > > > from C space. As a result of FFI function call with a trigger underneath > > > > > > > > the "FFI sandwich" is created. > > > > > > > > > > > > > > > > This changeset introduces the mechanism for Lua-C API callbacks similar > > > > > > > > to the one existing for Lua-FFI: trace recording is aborted whe the > > > > > > > when > > > > > > > > > > > > Thanks, fixed. > > > > > > > > > > > > > > execution re-enters LuaJIT VM. Sandwich detection while mcode execution > > > > > > > > leads to platform panic. > > > > > > > Could you elaborate here in comment on this a little: it means this > > > > > > > particular Lua call to be aborted, fail for the fiber Lua state, or for > > > > > > > the whole Tarantool? > > > > > > > > > > > > > > > > > > > The whole Tarantool finishes its execution with EXIT_FAILURE code and > > > > > > calls tarantool_panic_handler routine prior to the exit. Should I add > > > > > > this remark into commit message? > > > > > I would like to see it in commit message. > > > > > > > > Added. Here is a new commit message: > > > > > > > > ================================================================================ > > > > > > > > jit: abort trace recording and execution for C API > > > > > > > > JIT recording semantics assumes FFI calls are atomic regarding the > > > > LuaJIT VM: if the execution exited Lua world through FFI it is not > > > > re-entering Lua world again. > > > > > > > > However, there are ways to break this atomicity via FFI: one can > > > > re-enter LuaJIT VM via Lua C API inside the C routine called via FFI. As > > > > a result the following stack mix is created: > > > > | Lua-FFI -> C routine -> Lua-C API -> Lua VM > > > > This sort of re-entrancy is explicitly not supported by LuaJIT compiler > > > > machinery. @mraleph named such layout an "FFI sandwich" and I'm strictly > > > > against eating it. > > > > > > > > E.g. there is a trigger machinery in Tarantool calling Lua functions > > > > from C space. As a result of FFI function call with a trigger underneath > > > > the "FFI sandwich" is created. > > > > > > > > This changeset introduces the mechanism for Lua-C API callbacks similar > > > > to the one existing for Lua-FFI: trace recording is aborted when the > > > > execution re-enters LuaJIT VM. If re-entrancy is detected while running > > > > mcode the platform finishes its execution with EXIT_FAILURE code and > > > > calls panic routine prior to the exit. > > > > > > > > Relates to tarantool/tarantool#4427 > > > > > > > > ================================================================================ > > > > > > LGTM. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Relates to tarantool/tarantool#4427 > > > > > > > > > > > > > > > > Co-authored-by: Vyacheslav Egorov <vegorov@google.com> > > > > > > > > Co-authored-by: Sergey Ostanevich <sergos@tarantool.org> > > > > > > > > Signed-off-by: Igor Munkin <imun@tarantool.org> > > > > > > > > --- > > > > > > > > src/lj_api.c | 35 ++++++++++---- > > > > > > > > src/lj_errmsg.h | 1 + > > > > > > > > test/gh-4427-ffi-sandwich/CMakeLists.txt | 1 + > > > > > > > > test/gh-4427-ffi-sandwich/libsandwich.c | 59 ++++++++++++++++++++++++ > > > > > > > > test/gh-4427-ffi-sandwich/test.lua | 26 +++++++++++ > > > > > > > > 5 files changed, 113 insertions(+), 9 deletions(-) > > > > > > > > create mode 100644 test/gh-4427-ffi-sandwich/CMakeLists.txt > > > > > > > > create mode 100644 test/gh-4427-ffi-sandwich/libsandwich.c > > > > > > > > create mode 100644 test/gh-4427-ffi-sandwich/test.lua > > > > > > > > > > > > > > > > diff --git a/src/lj_api.c b/src/lj_api.c > > > > > > > > index a5e2465..c1f53e0 100644 > > > > > > > > --- a/src/lj_api.c > > > > > > > > +++ b/src/lj_api.c > > > > > > > > @@ -76,6 +76,17 @@ static GCtab *getcurrenv(lua_State *L) > > > > > > > > return fn->c.gct == ~LJ_TFUNC ? tabref(fn->c.env) : tabref(L->env); > > > > > > > > } > > > > > > > > > > > > > > > > +static void call(lua_State *L, TValue *base, int nres) { > > > > > > > I would recommend to name the function in a more descriptive way, like > > > > > > > sane_call. > > > > > > > > > > > > IMHO no_trace_call / no_jit_call looks good to me, but feel free to > > > > > > propose the naming you prefer more. > > > > > If you want to mention that this call aborts the trace generation, > > > > > then I would put it like 'call_w_trace_abort' > > > > > > > > It not only aborts the trace recording, but prohibits further execution > > > > for already compiled code. > > > > > > > > > But it goes beyond that AFAU. Can you name it like 'ffi_safe_call'? > > > > > > > > The changes are not related to FFI at all. FFI helps to achieve the > > > > platform failure, but the problem relates to JIT machinery. So this name > > > > doesn't clarify function semantics. > > > > > > > Can we sayn that this machinery tracks the machinery consistency? Then > > > the first name comes to my mind is consistent_call with in the code > > > comment on what consistency is - abort trace collection and inconsistent > > > trace excution. > > > > As a result of discussion, we stopped on the jit_secure_call name. The > > diff is below: > > > > ================================================================================ > > > > diff --git a/src/lj_api.c b/src/lj_api.c > > index c1f53e0..4ff8ccf 100644 > > --- a/src/lj_api.c > > +++ b/src/lj_api.c > > @@ -76,8 +76,9 @@ static GCtab *getcurrenv(lua_State *L) > > return fn->c.gct == ~LJ_TFUNC ? tabref(fn->c.env) : tabref(L->env); > > } > > > > -static void call(lua_State *L, TValue *base, int nres) { > > +static void jit_secure_call(lua_State *L, TValue *base, int nres) { > > global_State *g = G(L); > > + /* Forbid Lua world re-entrancy while running the trace */ > > if (tvref(g->jit_base)) { > > setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL)); > > if (g->panic) g->panic(L); > > @@ -311,7 +311,7 @@ LUA_API int lua_equal(lua_State *L, int idx1, int idx2) > > return (int)(uintptr_t)base; > > } else { > > L->top = base+2; > > - call(L, base, 1+1); > > + jit_secure_call(L, base, 1+1); > > L->top -= 2+LJ_FR2; > > return tvistruecond(L->top+1+LJ_FR2); > > } > > @@ -334,7 +334,7 @@ LUA_API int lua_lessthan(lua_State *L, int idx1, int idx2) > > return (int)(uintptr_t)base; > > } else { > > L->top = base+2; > > - call(L, base, 1+1); > > + jit_secure_call(L, base, 1+1); > > L->top -= 2+LJ_FR2; > > return tvistruecond(L->top+1+LJ_FR2); > > } > > @@ -786,7 +786,7 @@ LUA_API void lua_concat(lua_State *L, int n) > > } > > n -= (int)(L->top - top); > > L->top = top+2; > > - call(L, top, 1+1); > > + jit_secure_call(L, top, 1+1); > > L->top -= 1+LJ_FR2; > > copyTV(L, L->top-1, L->top+LJ_FR2); > > } while (--n > 0); > > @@ -806,7 +806,7 @@ LUA_API void lua_gettable(lua_State *L, int idx) > > v = lj_meta_tget(L, t, L->top-1); > > if (v == NULL) { > > L->top += 2; > > - call(L, L->top-2, 1+1); > > + jit_secure_call(L, L->top-2, 1+1); > > L->top -= 2+LJ_FR2; > > v = L->top+1+LJ_FR2; > > } > > @@ -822,7 +822,7 @@ LUA_API void lua_getfield(lua_State *L, int idx, const char *k) > > v = lj_meta_tget(L, t, &key); > > if (v == NULL) { > > L->top += 2; > > - call(L, L->top-2, 1+1); > > + jit_secure_call(L, L->top-2, 1+1); > > L->top -= 2+LJ_FR2; > > v = L->top+1+LJ_FR2; > > } > > @@ -977,7 +977,7 @@ LUA_API void lua_settable(lua_State *L, int idx) > > TValue *base = L->top; > > copyTV(L, base+2, base-3-2*LJ_FR2); > > L->top = base+3; > > - call(L, base, 0+1); > > + jit_secure_call(L, base, 0+1); > > L->top -= 3+LJ_FR2; > > } > > } > > @@ -998,7 +998,7 @@ LUA_API void lua_setfield(lua_State *L, int idx, const char *k) > > TValue *base = L->top; > > copyTV(L, base+2, base-3-2*LJ_FR2); > > L->top = base+3; > > - call(L, base, 0+1); > > + jit_secure_call(L, base, 0+1); > > L->top -= 2+LJ_FR2; > > } > > } > > @@ -1129,7 +1129,7 @@ LUA_API void lua_call(lua_State *L, int nargs, int nresults) > > { > > api_check(L, L->status == LUA_OK || L->status == LUA_ERRERR); > > api_checknelems(L, nargs+1); > > - call(L, api_call_base(L, nargs), nresults+1); > > + jit_secure_call(L, api_call_base(L, nargs), nresults+1); > > } > > > > LUA_API int lua_pcall(lua_State *L, int nargs, int nresults, int errfunc) > > @@ -1147,6 +1148,7 @@ LUA_API int lua_pcall(lua_State *L, int nargs, int nresults, int errfunc) > > api_checkvalidindex(L, o); > > ef = savestack(L, o); > > } > > + /* Forbid Lua world re-entrancy while running the trace */ > > if (tvref(g->jit_base)) { > > setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL)); > > if (g->panic) g->panic(L); > > @@ -1189,7 +1189,7 @@ LUALIB_API int luaL_callmeta(lua_State *L, int idx, const char *field) > > if (LJ_FR2) setnilV(top++); > > copyTV(L, top++, index2adr(L, idx)); > > L->top = top; > > - call(L, top-1, 1+1); > > + jit_secure_call(L, top-1, 1+1); > > return 1; > > } > > return 0; > > > > ================================================================================ > > > > Squashed these changes with the original patch. > > > > > > > > > > Although, safety resulting in a panic sounds paradoxical. > > > > > > > > Totally agree. > > > > > > > > > > > > > > BTW, should it be under 'LJ_HASFFI' and keep the original call > > > > > otherwise? > > > > > > > > I see no benefits for it. Again, the fix doesn't relates to FFI > > > > itself. It's just a particular way leading to JIT inconsistency. > > > > > > > > Feel free to correct me, if I'm wrong. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > + global_State *g = G(L); > > > > > > > > + if (tvref(g->jit_base)) { > > > > > > > > + setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL)); > > > > > > > > + if (g->panic) g->panic(L); > > > > > > > > + exit(EXIT_FAILURE); > > > > > > > > + } > > > > > > > > + lj_trace_abort(g); /* Never record across Lua VM entrance */ > > > > > > > > + lj_vm_call(L, base, nres); > > > > > > > > +} > > > > > > > > + > > > > > > > > /* -- Miscellaneous API functions ----------------------------------------- */ > > > > > > > > > > > > > > > > LUA_API int lua_status(lua_State *L) > > > > > > > > > > > > > > Also I have a question on perf impact - have you measured any > > > > > > > benchmarks - Lua only, or as part of Tarantool > > > > > > > > > > > > I compared two sysbench logs ([1] and [2]) via vimdiff and see no > > > > > > critical difference. > > > > > > > > > > > > > > > > > > > > Regards, > > > > > > > Sergos > > > > > > > > > > > > [1]: https://gitlab.com/tarantool/tarantool/-/jobs/489420257/raw > > > > > > [2]: https://gitlab.com/tarantool/tarantool/-/jobs/486707332/raw > > > > > > > > > > > > -- > > > > > > Best regards, > > > > > > IM > > > > > > > > -- > > > > Best regards, > > > > IM > > > > -- > > Best regards, > > IM -- Best regards, IM ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 1/2] jit: abort trace recording and execution for C API 2020-03-27 10:47 ` [Tarantool-patches] [PATCH luajit 1/2] jit: abort trace recording and execution for C API Igor Munkin 2020-03-28 16:33 ` Sergey Ostanevich @ 2020-04-02 23:41 ` Vladislav Shpilevoy 2020-04-04 11:55 ` Igor Munkin 1 sibling, 1 reply; 20+ messages in thread From: Vladislav Shpilevoy @ 2020-04-02 23:41 UTC (permalink / raw) To: Igor Munkin, Sergey Ostanevich, Kirill Yukhin; +Cc: tarantool-patches Hi! Thanks for the patch! > diff --git a/src/lj_api.c b/src/lj_api.c > index a5e2465..c1f53e0 100644 > --- a/src/lj_api.c > +++ b/src/lj_api.c > @@ -300,7 +311,7 @@ LUA_API int lua_equal(lua_State *L, int idx1, int idx2) > return (int)(uintptr_t)base; > } else { > L->top = base+2; > - lj_vm_call(L, base, 1+1); > + call(L, base, 1+1); These changes look dangerous to perf. So if I had operators == and < overloaded for an FFI struct, they are not jitted anymore, right? Why? Does someone really yield in them? Could you measure how does it affect perf of these calls? From what I see, these changes basically kill FFI making it not better than Lua C. It no longer helps JITting anything, does it? The same for concat, and other things. Looks like an overkill. We would need sandwich protection only for a few places. Is it possible to avoid such drastic changes? ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 1/2] jit: abort trace recording and execution for C API 2020-04-02 23:41 ` Vladislav Shpilevoy @ 2020-04-04 11:55 ` Igor Munkin 2020-04-04 21:37 ` Vladislav Shpilevoy 0 siblings, 1 reply; 20+ messages in thread From: Igor Munkin @ 2020-04-04 11:55 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches Vlad, Thanks for your review! On 03.04.20, Vladislav Shpilevoy wrote: > Hi! Thanks for the patch! > > > diff --git a/src/lj_api.c b/src/lj_api.c > > index a5e2465..c1f53e0 100644 > > --- a/src/lj_api.c > > +++ b/src/lj_api.c > > @@ -300,7 +311,7 @@ LUA_API int lua_equal(lua_State *L, int idx1, int idx2) > > return (int)(uintptr_t)base; > > } else { > > L->top = base+2; > > - lj_vm_call(L, base, 1+1); > > + call(L, base, 1+1); > > These changes look dangerous to perf. So if I had operators == > and < overloaded for an FFI struct, they are not jitted anymore, > right? Nope. I guess I need to stop right here and clarify the patch a little. Let's start with the patch itself: the introduced changes abort the trace recording whether re-entrancy to Lua VM is detected or "panic" if such re-entrancy is faced while running already compiled machine code. A good rationale for such protection is the platform failure Nikita faced in #4427: the C function called via FFI makes the trigger with a Lua-C API call underneath run. However it also might be any Lua-C API function respecting the one of available metamethods (e.g. lua_less, lua_equal, etc). If the correponding metamethod is set, the platform also re-enters to execute it. Such behaviour is also unnacceptable, so metamethod call ought to be invoked via jit_secure_call too. It's not optional and we can't avoid it, since there are no guarantees the situation never occurs. OK, what about FFI: it's a well isolated machinery, that handles everything its own way: own trace recording semantics, own metamethod handling (since only tables and userdata can have an individual metatable FFI has the one <ffi> "first level" metatable), etc. Nothing FFI-related is changed here. If you run the following snippet, you'll see the artefacts for successfully compiled trace in dump.out. | require('jit.dump').start('+tbisrmXaT', 'dump.out') | | local limit = -1ULL / 2 | local value = 1ULL | local tests = { } | | while value < limit do | value = value * 2 | table.insert(tests, value < limit) | end | | jit.off() -- prevent spoiling dump.out with other traces. | print(value, limit, #tests) Since the patch relates only to Lua-C API, metamethods triggered while running Lua code are also successfully compiled. | require('jit.dump').start('+tbisrmXaT', 'dump.out') | | local complex | | local complex_mt = { | __add = function(a, b) return complex(a.re + b.re, a.im + b.im) end, | __eq = function(a, b) return a.re == b.re and a.im == b.im end, | __tostring = function(o) return ('re: %d, im: %d'):format(o.re, o.im) end | } | | complex = function(re, im) | return setmetatable({ re = re, im = im }, complex_mt) | end | | local start = complex(0, 0) | local finish = complex(64, 64) | local step = complex(1, 1) | | repeat | start = start + step | until start == finish Feel free to point the parts that looks good to be included into commit message, comments, etc. > Why? Does someone really yield in them? The main idea is to protect ourselves in the future (we have already discussed it with Kostja here[1]). > Could you measure how does it affect perf of these calls? Talking about perf: Sergos also asked to make benchmarks and I provided the sysbench results[2] I obtained in Gitlab. I saw no dramatically changes. I guess the reason the performance stays the same is that all such places has been already handled manually in Tarantool sources (like box_txn_rollback_to_savepoint after Leonid changes in #4427). However, if you would like to see more benchmarks, please let me know. > From what I see, these changes basically kill FFI making it not better > than Lua C. It no longer helps JITting anything, does it? > > The same for concat, and other things. Looks like an overkill. We > would need sandwich protection only for a few places. Is it possible > to avoid such drastic changes? [1]: https://lists.tarantool.org/pipermail/tarantool-patches/2019-November/012732.html [2]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-March/015236.html -- Best regards, IM ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 1/2] jit: abort trace recording and execution for C API 2020-04-04 11:55 ` Igor Munkin @ 2020-04-04 21:37 ` Vladislav Shpilevoy 2020-04-07 21:16 ` Igor Munkin 0 siblings, 1 reply; 20+ messages in thread From: Vladislav Shpilevoy @ 2020-04-04 21:37 UTC (permalink / raw) To: Igor Munkin; +Cc: tarantool-patches See 8 comments below. > diff --git a/src/lj_api.c b/src/lj_api.c > index a5e2465..b543231 100644 > --- a/src/lj_api.c > +++ b/src/lj_api.c > @@ -76,6 +76,18 @@ static GCtab *getcurrenv(lua_State *L) > return fn->c.gct == ~LJ_TFUNC ? tabref(fn->c.env) : tabref(L->env); > } > > +static void jit_secure_call(lua_State *L, TValue *base, int nres) { 1. Many functions in luajit related to jitting have prefix 'lj_'. Probably this should be applied here too. I didn't see any other starting with 'jit_'. > + global_State *g = G(L); > + /* Forbid Lua world re-entrancy while running the trace */ > + if (tvref(g->jit_base)) { > + setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL)); > + if (g->panic) g->panic(L); > + exit(EXIT_FAILURE); 2. Is it correct, that it is allowed to renter Lua, and the panic will happen only if we see that during reentering the trace is still being recorded (i.e. if there is a bug in lj)? I don't see in the code above where is that check like 'the trace is being recorded'. Could you please point at it? > + } > + lj_trace_abort(g); /* Never record across Lua VM entrance */ > + lj_vm_call(L, base, nres); > +} > @@ -1172,7 +1191,7 @@ LUALIB_API int luaL_callmeta(lua_State *L, int idx, const char *field) > if (LJ_FR2) setnilV(top++); > copyTV(L, top++, index2adr(L, idx)); > L->top = top; > - lj_vm_call(L, top-1, 1+1); > + jit_secure_call(L, top-1, 1+1); > return 1; > } 3. Why didn't you change lua_cpcall(), luaL_callmeta()? > diff --git a/test/gh-4427-ffi-sandwich/CMakeLists.txt b/test/gh-4427-ffi-sandwich/CMakeLists.txt > new file mode 100644 > index 0000000..995c6bb > --- /dev/null > +++ b/test/gh-4427-ffi-sandwich/CMakeLists.txt 4. So this is not built by luajit? Does not it have anything to be more self sufficient here? This looks strange to add a CMake file to a probject, which uses Makefiles. Also this cmake does not work. I get error: CMake Error at CMakeLists.txt:1 (build_lualib): Unknown CMake command "build_lualib". This is a clear cyclic dependency on Tarantool. Can it be built by luajit's own means? This dependency becomes even stronger in the next commit, which introduces another CMake file. > @@ -0,0 +1 @@ > +build_lualib(libsandwich libsandwich.c) > diff --git a/test/gh-4427-ffi-sandwich/libsandwich.c b/test/gh-4427-ffi-sandwich/libsandwich.c > new file mode 100644 > index 0000000..2a24fb2 > --- /dev/null > +++ b/test/gh-4427-ffi-sandwich/libsandwich.c > @@ -0,0 +1,59 @@ > +#include <lua.h> > +#include <lauxlib.h> > + > +struct sandwich { > + lua_State *L; /* Coroutine saved for a Lua call */ > + int ref; /* Anchor to the Lua function to be run */ > + int trigger; /* Trigger for switching to Lua call */ > +}; > + > +int ipp(struct sandwich *state, int i) 5. What is 'ipp'? > +{ > + if (i < state->trigger) > + return i + 1; > + > + /* Sandwich is triggered and Lua ipp function is called */ > + lua_pushnumber(state->L, state->ref); > + lua_gettable(state->L, LUA_REGISTRYINDEX); > + lua_pushnumber(state->L, i); > + lua_call(state->L, 1, 1); > + return lua_tonumber(state->L, -1); > >> These changes look dangerous to perf. So if I had operators == >> and < overloaded for an FFI struct, they are not jitted anymore, >> right? > > Nope. I guess I need to stop right here and clarify the patch a little. > > Let's start with the patch itself: the introduced changes abort the > trace recording whether re-entrancy to Lua VM is detected or "panic" if > such re-entrancy is faced while running already compiled machine code. A > good rationale for such protection is the platform failure Nikita faced > in #4427: the C function called via FFI makes the trigger with a Lua-C > API call underneath run. However it also might be any Lua-C API function > respecting the one of available metamethods (e.g. lua_less, lua_equal, > etc). If the correponding metamethod is set, the platform also re-enters > to execute it. Such behaviour is also unnacceptable, so metamethod call > ought to be invoked via jit_secure_call too. It's not optional and we > can't avoid it, since there are no guarantees the situation never > occurs. 6. Yeah, ok, now I see, the changes functions are from the public C Lua API, and are called by external code only. Such as from tarantool. I was afraid, that lj uses them for all comparisons internally. > OK, what about FFI: it's a well isolated machinery, that handles > everything its own way: own trace recording semantics, own metamethod > handling (since only tables and userdata can have an individual > metatable FFI has the one <ffi> "first level" metatable), etc. Nothing > FFI-related is changed here. 7. If this has nothing to do with ffi, then why it is used in the test? > If you run the following snippet, you'll see the artefacts for > successfully compiled trace in dump.out. > | require('jit.dump').start('+tbisrmXaT', 'dump.out') > | > | local limit = -1ULL / 2 > | local value = 1ULL > | local tests = { } > | > | while value < limit do > | value = value * 2 > | table.insert(tests, value < limit) > | end > | > | jit.off() -- prevent spoiling dump.out with other traces. > | print(value, limit, #tests) > > Since the patch relates only to Lua-C API, metamethods triggered while > running Lua code are also successfully compiled. > | require('jit.dump').start('+tbisrmXaT', 'dump.out') > | > | local complex > | > | local complex_mt = { > | __add = function(a, b) return complex(a.re + b.re, a.im + b.im) end, > | __eq = function(a, b) return a.re == b.re and a.im == b.im end, > | __tostring = function(o) return ('re: %d, im: %d'):format(o.re, o.im) end > | } > | > | complex = function(re, im) > | return setmetatable({ re = re, im = im }, complex_mt) > | end > | > | local start = complex(0, 0) > | local finish = complex(64, 64) > | local step = complex(1, 1) > | > | repeat > | start = start + step > | until start == finish > > Feel free to point the parts that looks good to be included into commit > message, comments, etc. > >> Why? Does someone really yield in them? > > The main idea is to protect ourselves in the future (we have already > discussed it with Kostja here[1]). > >> Could you measure how does it affect perf of these calls? > > Talking about perf: Sergos also asked to make benchmarks and I provided > the sysbench results[2] I obtained in Gitlab. I saw no dramatically > changes. I guess the reason the performance stays the same is that > all such places has been already handled manually in Tarantool sources > (like box_txn_rollback_to_savepoint after Leonid changes in #4427). > > However, if you would like to see more benchmarks, please let me know. > >> From what I see, these changes basically kill FFI making it not better >> than Lua C. It no longer helps JITting anything, does it? >> >> The same for concat, and other things. Looks like an overkill. We >> would need sandwich protection only for a few places. Is it possible >> to avoid such drastic changes? > > [1]: https://lists.tarantool.org/pipermail/tarantool-patches/2019-November/012732.html > [2]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-March/015236.html 8. So are you saying, that ffi part is still fast? ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 1/2] jit: abort trace recording and execution for C API 2020-04-04 21:37 ` Vladislav Shpilevoy @ 2020-04-07 21:16 ` Igor Munkin 0 siblings, 0 replies; 20+ messages in thread From: Igor Munkin @ 2020-04-07 21:16 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches Vlad, Thanks for your review! On 04.04.20, Vladislav Shpilevoy wrote: > See 8 comments below. > > > diff --git a/src/lj_api.c b/src/lj_api.c > > index a5e2465..b543231 100644 > > --- a/src/lj_api.c > > +++ b/src/lj_api.c > > @@ -76,6 +76,18 @@ static GCtab *getcurrenv(lua_State *L) > > return fn->c.gct == ~LJ_TFUNC ? tabref(fn->c.env) : tabref(L->env); > > } > > > > +static void jit_secure_call(lua_State *L, TValue *base, int nres) { > > 1. Many functions in luajit related to jitting have prefix 'lj_'. > Probably this should be applied here too. I didn't see any other > starting with 'jit_'. 'lj_' prefix is used for extern functions in various units of internal API. For static (i.e. private) routines no prefixes are used (consider <index2adr>, <stkindex2adr>, <getcurrenv> helpers nearby in lj_api.c). Originally, I named this function just <call>, but Sergos asked for a more describing name in the neighbour thread and finally we chose this one. I used underscores for readability, not for a prefix (I'm OK with jitsecurecall though it much harder to parse :)). I *literally* OK with any name if it is not misguiding one. Feel free to propose your naming if you have a better one. > > > + global_State *g = G(L); > > + /* Forbid Lua world re-entrancy while running the trace */ > > + if (tvref(g->jit_base)) { > > + setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL)); > > + if (g->panic) g->panic(L); > > + exit(EXIT_FAILURE); > > 2. Is it correct, that it is allowed to renter Lua, and the panic > will happen only if we see that during reentering the trace is > still being recorded (i.e. if there is a bug in lj)? Not quite right. there are two cases: 1. Trace is being recorded when re-entrancy occurs. In this case lj_trace_abort macro asynchroniously aborts (i.e. signals to stop and purge the result) trace recording prior to re-entering Lua VM. 2. Trace is being run when re-entrancy occurs. In this case jit_base field is not NULL and contains an address to the guest stack frame base on the moment trace started its execution. Unfortunately in this case we can do nothing for now except panic. > > I don't see in the code above where is that check like 'the trace > is being recorded'. Could you please point at it? Trace recording is aborted unconditionally via lj_trace_abort macro. > > > + } > > + lj_trace_abort(g); /* Never record across Lua VM entrance */ > > + lj_vm_call(L, base, nres); > > +} > > @@ -1172,7 +1191,7 @@ LUALIB_API int luaL_callmeta(lua_State *L, int idx, const char *field) > > if (LJ_FR2) setnilV(top++); > > copyTV(L, top++, index2adr(L, idx)); > > L->top = top; > > - lj_vm_call(L, top-1, 1+1); > > + jit_secure_call(L, top-1, 1+1); > > return 1; > > } > > 3. Why didn't you change lua_cpcall(), luaL_callmeta()? OK, let's start with luaL_callmeta: I've checked twice and see it is patched. Here is the corresponding diff chunk: ================================================================================ @@ -1172,7 +1191,7 @@ LUALIB_API int luaL_callmeta(lua_State *L, int idx, const char *field) if (LJ_FR2) setnilV(top++); copyTV(L, top++, index2adr(L, idx)); L->top = top; - lj_vm_call(L, top-1, 1+1); + jit_secure_call(L, top-1, 1+1); return 1; } return 0; ================================================================================ Now I clarify the situation regarding lua_cpcall: Yes, you're *formally* right it re-enters Lua VM. However, it just prepares the safe frame, creates a GC object of Lua function type using the given C function (strictly saying lua_CFunction) and executes it via BC_FUNCC. Yes, BC_FUNCC is "recordable" for platform builtins and "fast" functions (i.e. functions with success paths implemented in DynASM, and fallbacks implemented in C, e.g. tonumber, rawget). Thereby calling arbitrary "extern" C function aborts recording. If the call occurs while running already compiled trace it doesn't break the following: * it respects caller-callee convention and all callee-safe registers are preserved * it leaves the guest stack unchanged (since lua_cpcall allows no values to be returned) * it doesn't change global fields required by trace (e.g. jit_base) *BUT* BC_FUNCC semantics changes vmstate field, where currently running trace number is stored and after C function yields the execution back to the VM the vmstate is not restored to traceno. This value is used within vm_exit_handler to obtain trace number, so further execution might lead to the platform failure. Long story short: thanks for your preciseness, I re-checked lua_cpcall sematics while writing this section and here is a fix for lua_cpcall: ================================================================================ diff --git a/src/lj_api.c b/src/lj_api.c index b543231..c9b5d22 100644 --- a/src/lj_api.c +++ b/src/lj_api.c @@ -1179,6 +1179,13 @@ LUA_API int lua_cpcall(lua_State *L, lua_CFunction func, void *ud) uint8_t oldh = hook_save(g); int status; api_check(L, L->status == LUA_OK || L->status == LUA_ERRERR); + /* Forbid Lua world re-entrancy while running the trace */ + if (tvref(g->jit_base)) { + setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL)); + if (g->panic) g->panic(L); + exit(EXIT_FAILURE); + } + lj_trace_abort(g); /* Never record across Lua VM entrance */ status = lj_vm_cpcall(L, func, ud, cpcall); if (status) hook_restore(g, oldh); return status; ================================================================================ Squashed, force pushed to the upstream branch. > > > diff --git a/test/gh-4427-ffi-sandwich/CMakeLists.txt b/test/gh-4427-ffi-sandwich/CMakeLists.txt > > new file mode 100644 > > index 0000000..995c6bb > > --- /dev/null > > +++ b/test/gh-4427-ffi-sandwich/CMakeLists.txt > > 4. So this is not built by luajit? Does not it have anything > to be more self sufficient here? This looks strange to add > a CMake file to a probject, which uses Makefiles. > > Also this cmake does not work. I get error: > > CMake Error at CMakeLists.txt:1 (build_lualib): > Unknown CMake command "build_lualib". > > This is a clear cyclic dependency on Tarantool. Can it be > built by luajit's own means? No, it can't for now. LuaJIT tests are run by Tarantool testing machinery and depends on it since[1]. > > This dependency becomes even stronger in the next commit, > which introduces another CMake file. I'm definitely going to rework LuaJIT testing environment to make it more standalone and self-sufficient in the nearest future. Here is a ticket for it[2]. > > > @@ -0,0 +1 @@ > > +build_lualib(libsandwich libsandwich.c) > > diff --git a/test/gh-4427-ffi-sandwich/libsandwich.c b/test/gh-4427-ffi-sandwich/libsandwich.c > > new file mode 100644 > > index 0000000..2a24fb2 > > --- /dev/null > > +++ b/test/gh-4427-ffi-sandwich/libsandwich.c > > @@ -0,0 +1,59 @@ > > +#include <lua.h> > > +#include <lauxlib.h> > > + > > +struct sandwich { > > + lua_State *L; /* Coroutine saved for a Lua call */ > > + int ref; /* Anchor to the Lua function to be run */ > > + int trigger; /* Trigger for switching to Lua call */ > > +}; > > + > > +int ipp(struct sandwich *state, int i) > > 5. What is 'ipp'? <ipp> stands for i-plus-plus. Does <increment> name look better? > > > +{ > > + if (i < state->trigger) > > + return i + 1; > > + > > + /* Sandwich is triggered and Lua ipp function is called */ > > + lua_pushnumber(state->L, state->ref); > > + lua_gettable(state->L, LUA_REGISTRYINDEX); > > + lua_pushnumber(state->L, i); > > + lua_call(state->L, 1, 1); > > + return lua_tonumber(state->L, -1); > > > >> These changes look dangerous to perf. So if I had operators == > >> and < overloaded for an FFI struct, they are not jitted anymore, > >> right? > > > > Nope. I guess I need to stop right here and clarify the patch a little. > > > > Let's start with the patch itself: the introduced changes abort the > > trace recording whether re-entrancy to Lua VM is detected or "panic" if > > such re-entrancy is faced while running already compiled machine code. A > > good rationale for such protection is the platform failure Nikita faced > > in #4427: the C function called via FFI makes the trigger with a Lua-C > > API call underneath run. However it also might be any Lua-C API function > > respecting the one of available metamethods (e.g. lua_less, lua_equal, > > etc). If the correponding metamethod is set, the platform also re-enters > > to execute it. Such behaviour is also unnacceptable, so metamethod call > > ought to be invoked via jit_secure_call too. It's not optional and we > > can't avoid it, since there are no guarantees the situation never > > occurs. > > 6. Yeah, ok, now I see, the changes functions are from the > public C Lua API, and are called by external code only. Such > as from tarantool. I was afraid, that lj uses them for all > comparisons internally. > > > OK, what about FFI: it's a well isolated machinery, that handles > > everything its own way: own trace recording semantics, own metamethod > > handling (since only tables and userdata can have an individual > > metatable FFI has the one <ffi> "first level" metatable), etc. Nothing > > FFI-related is changed here. > > 7. If this has nothing to do with ffi, then why it is used > in the test? It is just a particular way to make compiler crazy (but I don't know another one related to the case). > > > If you run the following snippet, you'll see the artefacts for > > successfully compiled trace in dump.out. > > | require('jit.dump').start('+tbisrmXaT', 'dump.out') > > | > > | local limit = -1ULL / 2 > > | local value = 1ULL > > | local tests = { } > > | > > | while value < limit do > > | value = value * 2 > > | table.insert(tests, value < limit) > > | end > > | > > | jit.off() -- prevent spoiling dump.out with other traces. > > | print(value, limit, #tests) > > > > Since the patch relates only to Lua-C API, metamethods triggered while > > running Lua code are also successfully compiled. > > | require('jit.dump').start('+tbisrmXaT', 'dump.out') > > | > > | local complex > > | > > | local complex_mt = { > > | __add = function(a, b) return complex(a.re + b.re, a.im + b.im) end, > > | __eq = function(a, b) return a.re == b.re and a.im == b.im end, > > | __tostring = function(o) return ('re: %d, im: %d'):format(o.re, o.im) end > > | } > > | > > | complex = function(re, im) > > | return setmetatable({ re = re, im = im }, complex_mt) > > | end > > | > > | local start = complex(0, 0) > > | local finish = complex(64, 64) > > | local step = complex(1, 1) > > | > > | repeat > > | start = start + step > > | until start == finish > > > > Feel free to point the parts that looks good to be included into commit > > message, comments, etc. > > > >> Why? Does someone really yield in them? > > > > The main idea is to protect ourselves in the future (we have already > > discussed it with Kostja here[1]). > > > >> Could you measure how does it affect perf of these calls? > > > > Talking about perf: Sergos also asked to make benchmarks and I provided > > the sysbench results[2] I obtained in Gitlab. I saw no dramatically > > changes. I guess the reason the performance stays the same is that > > all such places has been already handled manually in Tarantool sources > > (like box_txn_rollback_to_savepoint after Leonid changes in #4427). > > > > However, if you would like to see more benchmarks, please let me know. > > > >> From what I see, these changes basically kill FFI making it not better > >> than Lua C. It no longer helps JITting anything, does it? > >> > >> The same for concat, and other things. Looks like an overkill. We > >> would need sandwich protection only for a few places. Is it possible > >> to avoid such drastic changes? > > > > [1]: https://lists.tarantool.org/pipermail/tarantool-patches/2019-November/012732.html > > [2]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-March/015236.html > 8. So are you saying, that ffi part is still fast? Yes. Nothing has changed, the patch just protects one from FFI misuse. [1]: https://github.com/tarantool/tarantool/issues/4478 [2]: https://github.com/tarantool/tarantool/issues/4862 -- Best regards, IM ^ permalink raw reply [flat|nested] 20+ messages in thread
* [Tarantool-patches] [PATCH luajit 2/2] jit: abort trace execution on JIT mode change 2020-03-27 10:47 [Tarantool-patches] [PATCH luajit 0/2] Trace abort on FFI sandwich or mode change Igor Munkin 2020-03-27 10:47 ` [Tarantool-patches] [PATCH luajit 1/2] jit: abort trace recording and execution for C API Igor Munkin @ 2020-03-27 10:47 ` Igor Munkin 2020-03-28 19:36 ` Sergey Ostanevich 2020-04-02 23:41 ` [Tarantool-patches] [PATCH luajit 0/2] Trace abort on FFI sandwich or " Vladislav Shpilevoy 2 siblings, 1 reply; 20+ messages in thread From: Igor Munkin @ 2020-03-27 10:47 UTC (permalink / raw) To: Sergey Ostanevich, Vladislav Shpilevoy, Kirill Yukhin; +Cc: tarantool-patches Current luaJIT_setmode implementation aborts trace recording but nothing prevents calling it on already compiled trace. E.g. if one conditionally calls an FFI function having luaJIT_setmode with LUAJIT_MODE_FLUSH mode underneath, the trace being executed can be purged and the return address is invalidated as a result (since the mcode is released). This changeset prohibits luaJIT_setmode call while mcode is being executed leading to platform panic if the call occurs. Signed-off-by: Igor Munkin <imun@tarantool.org> --- src/lj_dispatch.c | 5 +++++ src/lj_errmsg.h | 1 + test/lj-flush-on-trace/CMakeLists.txt | 1 + test/lj-flush-on-trace/libflush.c | 31 +++++++++++++++++++++++++++ test/lj-flush-on-trace/test.lua | 25 +++++++++++++++++++++ 5 files changed, 63 insertions(+) create mode 100644 test/lj-flush-on-trace/CMakeLists.txt create mode 100644 test/lj-flush-on-trace/libflush.c create mode 100644 test/lj-flush-on-trace/test.lua diff --git a/src/lj_dispatch.c b/src/lj_dispatch.c index 5d6795f..b3448c8 100644 --- a/src/lj_dispatch.c +++ b/src/lj_dispatch.c @@ -240,6 +240,11 @@ int luaJIT_setmode(lua_State *L, int idx, int mode) { global_State *g = G(L); int mm = mode & LUAJIT_MODE_MASK; + if (tvref(g->jit_base)) { + setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITMODE)); + if (g->panic) g->panic(L); + exit(EXIT_FAILURE); + } lj_trace_abort(g); /* Abort recording on any state change. */ /* Avoid pulling the rug from under our own feet. */ if ((g->hookmask & HOOK_GC)) diff --git a/src/lj_errmsg.h b/src/lj_errmsg.h index 1580385..de7b867 100644 --- a/src/lj_errmsg.h +++ b/src/lj_errmsg.h @@ -113,6 +113,7 @@ ERRDEF(NOJIT, "JIT compiler permanently disabled by build option") #endif ERRDEF(JITOPT, "unknown or malformed optimization flag " LUA_QS) ERRDEF(JITCALL, "Lua VM re-entrancy is detected while executing the trace") +ERRDEF(JITMODE, "JIT mode change is detected while executing the trace") /* Lexer/parser errors. */ ERRDEF(XMODE, "attempt to load chunk with wrong mode") diff --git a/test/lj-flush-on-trace/CMakeLists.txt b/test/lj-flush-on-trace/CMakeLists.txt new file mode 100644 index 0000000..a90452d --- /dev/null +++ b/test/lj-flush-on-trace/CMakeLists.txt @@ -0,0 +1 @@ +build_lualib(libflush libflush.c) diff --git a/test/lj-flush-on-trace/libflush.c b/test/lj-flush-on-trace/libflush.c new file mode 100644 index 0000000..177409a --- /dev/null +++ b/test/lj-flush-on-trace/libflush.c @@ -0,0 +1,31 @@ +#include <lua.h> +#include <luajit.h> + +struct flush { + lua_State *L; /* Coroutine saved to change JIT mode */ + int trigger; /* Trigger for flushing all traces */ +}; + +void flush(struct flush *state, int i) +{ + if (i < state->trigger) + return; + + /* Trace flushing is triggered */ + (void)luaJIT_setmode(state->L, 0, LUAJIT_MODE_ENGINE|LUAJIT_MODE_FLUSH); +} + +static int init(lua_State *L) +{ + struct flush *state = lua_newuserdata(L, sizeof(struct flush)); + + state->L = L; + state->trigger = lua_tonumber(L, 1); + return 1; +} + +LUA_API int luaopen_libflush(lua_State *L) +{ + lua_pushcfunction(L, init); + return 1; +} diff --git a/test/lj-flush-on-trace/test.lua b/test/lj-flush-on-trace/test.lua new file mode 100644 index 0000000..ff6e0b6 --- /dev/null +++ b/test/lj-flush-on-trace/test.lua @@ -0,0 +1,25 @@ +local cfg = { + hotloop = arg[1] or 1, + trigger = arg[2] or 1, +} + +local ffi = require('ffi') +local ffiflush = ffi.load('libflush') +ffi.cdef('void flush(struct flush *state, int i)') + +-- Save the current coroutine and set the value to trigger ipp +-- call the Lua routine instead of pure C implementation. +local flush = require('libflush')(cfg.trigger) + +-- Depending on trigger and hotloop values the following contexts +-- are possible: +-- * if trigger <= hotloop -> trace recording is aborted +-- * if trigger > hotloop -> trace is recorded but execution +-- leads to panic +jit.opt.start("3", string.format("hotloop=%d", cfg.hotloop)) + +for i = 0, cfg.trigger + cfg.hotloop do + ffiflush.flush(flush, i) +end +-- Panic didn't occur earlier. +print('OK') -- 2.25.0 ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 2/2] jit: abort trace execution on JIT mode change 2020-03-27 10:47 ` [Tarantool-patches] [PATCH luajit 2/2] jit: abort trace execution on JIT mode change Igor Munkin @ 2020-03-28 19:36 ` Sergey Ostanevich 2020-03-29 10:46 ` Igor Munkin 0 siblings, 1 reply; 20+ messages in thread From: Sergey Ostanevich @ 2020-03-28 19:36 UTC (permalink / raw) To: Igor Munkin; +Cc: Vladislav Shpilevoy, tarantool-patches Hi! Thanks for the patch, LGTM. Sergos. On 27 мар 13:47, Igor Munkin wrote: > Current luaJIT_setmode implementation aborts trace recording but nothing > prevents calling it on already compiled trace. E.g. if one conditionally > calls an FFI function having luaJIT_setmode with LUAJIT_MODE_FLUSH mode > underneath, the trace being executed can be purged and the return > address is invalidated as a result (since the mcode is released). > > This changeset prohibits luaJIT_setmode call while mcode is being > executed leading to platform panic if the call occurs. > > Signed-off-by: Igor Munkin <imun@tarantool.org> > --- > src/lj_dispatch.c | 5 +++++ > src/lj_errmsg.h | 1 + > test/lj-flush-on-trace/CMakeLists.txt | 1 + > test/lj-flush-on-trace/libflush.c | 31 +++++++++++++++++++++++++++ > test/lj-flush-on-trace/test.lua | 25 +++++++++++++++++++++ > 5 files changed, 63 insertions(+) > create mode 100644 test/lj-flush-on-trace/CMakeLists.txt > create mode 100644 test/lj-flush-on-trace/libflush.c > create mode 100644 test/lj-flush-on-trace/test.lua > > diff --git a/src/lj_dispatch.c b/src/lj_dispatch.c > index 5d6795f..b3448c8 100644 > --- a/src/lj_dispatch.c > +++ b/src/lj_dispatch.c > @@ -240,6 +240,11 @@ int luaJIT_setmode(lua_State *L, int idx, int mode) > { > global_State *g = G(L); > int mm = mode & LUAJIT_MODE_MASK; > + if (tvref(g->jit_base)) { > + setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITMODE)); > + if (g->panic) g->panic(L); > + exit(EXIT_FAILURE); > + } > lj_trace_abort(g); /* Abort recording on any state change. */ > /* Avoid pulling the rug from under our own feet. */ > if ((g->hookmask & HOOK_GC)) > diff --git a/src/lj_errmsg.h b/src/lj_errmsg.h > index 1580385..de7b867 100644 > --- a/src/lj_errmsg.h > +++ b/src/lj_errmsg.h > @@ -113,6 +113,7 @@ ERRDEF(NOJIT, "JIT compiler permanently disabled by build option") > #endif > ERRDEF(JITOPT, "unknown or malformed optimization flag " LUA_QS) > ERRDEF(JITCALL, "Lua VM re-entrancy is detected while executing the trace") > +ERRDEF(JITMODE, "JIT mode change is detected while executing the trace") > > /* Lexer/parser errors. */ > ERRDEF(XMODE, "attempt to load chunk with wrong mode") > diff --git a/test/lj-flush-on-trace/CMakeLists.txt b/test/lj-flush-on-trace/CMakeLists.txt > new file mode 100644 > index 0000000..a90452d > --- /dev/null > +++ b/test/lj-flush-on-trace/CMakeLists.txt > @@ -0,0 +1 @@ > +build_lualib(libflush libflush.c) > diff --git a/test/lj-flush-on-trace/libflush.c b/test/lj-flush-on-trace/libflush.c > new file mode 100644 > index 0000000..177409a > --- /dev/null > +++ b/test/lj-flush-on-trace/libflush.c > @@ -0,0 +1,31 @@ > +#include <lua.h> > +#include <luajit.h> > + > +struct flush { > + lua_State *L; /* Coroutine saved to change JIT mode */ > + int trigger; /* Trigger for flushing all traces */ > +}; > + > +void flush(struct flush *state, int i) > +{ > + if (i < state->trigger) > + return; > + > + /* Trace flushing is triggered */ > + (void)luaJIT_setmode(state->L, 0, LUAJIT_MODE_ENGINE|LUAJIT_MODE_FLUSH); > +} > + > +static int init(lua_State *L) > +{ > + struct flush *state = lua_newuserdata(L, sizeof(struct flush)); > + > + state->L = L; > + state->trigger = lua_tonumber(L, 1); > + return 1; > +} > + > +LUA_API int luaopen_libflush(lua_State *L) > +{ > + lua_pushcfunction(L, init); > + return 1; > +} > diff --git a/test/lj-flush-on-trace/test.lua b/test/lj-flush-on-trace/test.lua > new file mode 100644 > index 0000000..ff6e0b6 > --- /dev/null > +++ b/test/lj-flush-on-trace/test.lua > @@ -0,0 +1,25 @@ > +local cfg = { > + hotloop = arg[1] or 1, > + trigger = arg[2] or 1, > +} > + > +local ffi = require('ffi') > +local ffiflush = ffi.load('libflush') > +ffi.cdef('void flush(struct flush *state, int i)') > + > +-- Save the current coroutine and set the value to trigger ipp > +-- call the Lua routine instead of pure C implementation. > +local flush = require('libflush')(cfg.trigger) > + > +-- Depending on trigger and hotloop values the following contexts > +-- are possible: > +-- * if trigger <= hotloop -> trace recording is aborted > +-- * if trigger > hotloop -> trace is recorded but execution > +-- leads to panic > +jit.opt.start("3", string.format("hotloop=%d", cfg.hotloop)) > + > +for i = 0, cfg.trigger + cfg.hotloop do > + ffiflush.flush(flush, i) > +end > +-- Panic didn't occur earlier. > +print('OK') > -- > 2.25.0 > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 2/2] jit: abort trace execution on JIT mode change 2020-03-28 19:36 ` Sergey Ostanevich @ 2020-03-29 10:46 ` Igor Munkin 0 siblings, 0 replies; 20+ messages in thread From: Igor Munkin @ 2020-03-29 10:46 UTC (permalink / raw) To: Sergey Ostanevich; +Cc: Vladislav Shpilevoy, tarantool-patches Sergos, Thanks for you review! On 28.03.20, Sergey Ostanevich wrote: > Hi! > > Thanks for the patch, LGTM. > > Sergos. > > On 27 мар 13:47, Igor Munkin wrote: > > Current luaJIT_setmode implementation aborts trace recording but nothing > > prevents calling it on already compiled trace. E.g. if one conditionally > > calls an FFI function having luaJIT_setmode with LUAJIT_MODE_FLUSH mode > > underneath, the trace being executed can be purged and the return > > address is invalidated as a result (since the mcode is released). > > > > This changeset prohibits luaJIT_setmode call while mcode is being > > executed leading to platform panic if the call occurs. I reworded this section considering your comment about platform panic: ================================================================================ This changeset prohibits luaJIT_setmode call while mcode is being executed. If the call occurs the platform finishes its execution with EXIT_FAILURE code and calls panic routine prior to the exit. ================================================================================ > > Reviewed-by: Sergey Ostanevich <sergos@tarantool.org> > > Signed-off-by: Igor Munkin <imun@tarantool.org> > > --- > > src/lj_dispatch.c | 5 +++++ > > src/lj_errmsg.h | 1 + > > test/lj-flush-on-trace/CMakeLists.txt | 1 + > > test/lj-flush-on-trace/libflush.c | 31 +++++++++++++++++++++++++++ > > test/lj-flush-on-trace/test.lua | 25 +++++++++++++++++++++ > > 5 files changed, 63 insertions(+) > > create mode 100644 test/lj-flush-on-trace/CMakeLists.txt > > create mode 100644 test/lj-flush-on-trace/libflush.c > > create mode 100644 test/lj-flush-on-trace/test.lua > > <snipped> > > -- > > 2.25.0 > > -- Best regards, IM ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 0/2] Trace abort on FFI sandwich or mode change 2020-03-27 10:47 [Tarantool-patches] [PATCH luajit 0/2] Trace abort on FFI sandwich or mode change Igor Munkin 2020-03-27 10:47 ` [Tarantool-patches] [PATCH luajit 1/2] jit: abort trace recording and execution for C API Igor Munkin 2020-03-27 10:47 ` [Tarantool-patches] [PATCH luajit 2/2] jit: abort trace execution on JIT mode change Igor Munkin @ 2020-04-02 23:41 ` Vladislav Shpilevoy 2020-04-03 21:32 ` Igor Munkin 2 siblings, 1 reply; 20+ messages in thread From: Vladislav Shpilevoy @ 2020-04-02 23:41 UTC (permalink / raw) To: Igor Munkin, Sergey Ostanevich, Kirill Yukhin; +Cc: tarantool-patches Hi! Thanks for the patch! I am getting build errors here: Undefined symbols for architecture x86_64: "__Unwind_DeleteException", referenced from: _lj_err_unwind_dwarf in libluajit.a(lj_err.o) "__Unwind_GetCFA", referenced from: _lj_err_unwind_dwarf in libluajit.a(lj_err.o) "__Unwind_RaiseException", referenced from: _lj_err_throw in libluajit.a(lj_err.o) "__Unwind_SetGR", referenced from: _lj_err_unwind_dwarf in libluajit.a(lj_err.o) "__Unwind_SetIP", referenced from: _lj_err_unwind_dwarf in libluajit.a(lj_err.o) I think they are not related to your patch, but probably you know how to fix them? Did I miss a build option? I used 'make -j'. I saw you fixed some things requested by Sergey, but I don't see them on the branch. Did you push the latest version? For example, in the first commit you had a typo in the commit message - 'whe' instead of 'when'. I still see 'whe', on github too. On 27/03/2020 11:47, Igor Munkin wrote: > This series closes two issues related to the JIT machinery behaviour: > * "FFI sandwich"(*) detection is introduced. If sandwich is detected > while trace recording the recording is aborted. The sandwich detected > while mcode execution leads to the platform panic. > * luaJIT_setmode call is prohibited while mcode execution and leads to > the platform panic. > > (*) The following stack mix is called FFI sandwich. > | Lua-FFI -> С routine -> Lua-C API -> Lua VM > This sort of re-entrancy is explicitly not supported by LuaJIT > compiler. For more info see [1]. > > Branch: https://github.com/tarantool/luajit/tree/imun/ffi-sandwich > > [1]: https://github.com/tarantool/tarantool/issues/4427 > > Igor Munkin (2): > jit: abort trace recording and execution for C API > jit: abort trace execution on JIT mode change > > src/lj_api.c | 35 ++++++++++---- > src/lj_dispatch.c | 5 ++ > src/lj_errmsg.h | 2 + > test/gh-4427-ffi-sandwich/CMakeLists.txt | 1 + > test/gh-4427-ffi-sandwich/libsandwich.c | 59 ++++++++++++++++++++++++ > test/gh-4427-ffi-sandwich/test.lua | 26 +++++++++++ > test/lj-flush-on-trace/CMakeLists.txt | 1 + > test/lj-flush-on-trace/libflush.c | 31 +++++++++++++ > test/lj-flush-on-trace/test.lua | 25 ++++++++++ > 9 files changed, 176 insertions(+), 9 deletions(-) > create mode 100644 test/gh-4427-ffi-sandwich/CMakeLists.txt > create mode 100644 test/gh-4427-ffi-sandwich/libsandwich.c > create mode 100644 test/gh-4427-ffi-sandwich/test.lua > create mode 100644 test/lj-flush-on-trace/CMakeLists.txt > create mode 100644 test/lj-flush-on-trace/libflush.c > create mode 100644 test/lj-flush-on-trace/test.lua > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 0/2] Trace abort on FFI sandwich or mode change 2020-04-02 23:41 ` [Tarantool-patches] [PATCH luajit 0/2] Trace abort on FFI sandwich or " Vladislav Shpilevoy @ 2020-04-03 21:32 ` Igor Munkin 2020-04-04 21:36 ` Vladislav Shpilevoy 0 siblings, 1 reply; 20+ messages in thread From: Igor Munkin @ 2020-04-03 21:32 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches Vlad, Thanks for the review! On 03.04.20, Vladislav Shpilevoy wrote: > Hi! Thanks for the patch! > > I am getting build errors here: > > Undefined symbols for architecture x86_64: > "__Unwind_DeleteException", referenced from: > _lj_err_unwind_dwarf in libluajit.a(lj_err.o) > "__Unwind_GetCFA", referenced from: > _lj_err_unwind_dwarf in libluajit.a(lj_err.o) > "__Unwind_RaiseException", referenced from: > _lj_err_throw in libluajit.a(lj_err.o) > "__Unwind_SetGR", referenced from: > _lj_err_unwind_dwarf in libluajit.a(lj_err.o) > "__Unwind_SetIP", referenced from: > _lj_err_unwind_dwarf in libluajit.a(lj_err.o) > > I think they are not related to your patch, but > probably you know how to fix them? Did I miss a build > option? I used 'make -j'. Hm, it totally doesn't relate to the patch. AFAIR you use MacOS, so the problems might be related to the external unwinder ([1], sec. 6.2) used by luajit. Could you please share your build environment configuration? > > I saw you fixed some things requested by Sergey, but I don't > see them on the branch. Did you push the latest version? For > example, in the first commit you had a typo in the commit > message - 'whe' instead of 'when'. I still see 'whe', on github > too. Sorry, my fault. I had been waiting for Sergos comments related to my post-review fixes, avoiding several force pushes. Our discussion had been stalled for a while, and I've just received his LGTM and I've pushed the fixed patches to the remote branch. Next time I'll update the upstream ASAP. Please, let me know, whether I need to send the actual series version. > > On 27/03/2020 11:47, Igor Munkin wrote: > > This series closes two issues related to the JIT machinery behaviour: > > * "FFI sandwich"(*) detection is introduced. If sandwich is detected > > while trace recording the recording is aborted. The sandwich detected > > while mcode execution leads to the platform panic. > > * luaJIT_setmode call is prohibited while mcode execution and leads to > > the platform panic. > > > > (*) The following stack mix is called FFI sandwich. > > | Lua-FFI -> С routine -> Lua-C API -> Lua VM > > This sort of re-entrancy is explicitly not supported by LuaJIT > > compiler. For more info see [1]. > > > > Branch: https://github.com/tarantool/luajit/tree/imun/ffi-sandwich > > > > [1]: https://github.com/tarantool/tarantool/issues/4427 > > > > Igor Munkin (2): > > jit: abort trace recording and execution for C API > > jit: abort trace execution on JIT mode change > > > > src/lj_api.c | 35 ++++++++++---- > > src/lj_dispatch.c | 5 ++ > > src/lj_errmsg.h | 2 + > > test/gh-4427-ffi-sandwich/CMakeLists.txt | 1 + > > test/gh-4427-ffi-sandwich/libsandwich.c | 59 ++++++++++++++++++++++++ > > test/gh-4427-ffi-sandwich/test.lua | 26 +++++++++++ > > test/lj-flush-on-trace/CMakeLists.txt | 1 + > > test/lj-flush-on-trace/libflush.c | 31 +++++++++++++ > > test/lj-flush-on-trace/test.lua | 25 ++++++++++ > > 9 files changed, 176 insertions(+), 9 deletions(-) > > create mode 100644 test/gh-4427-ffi-sandwich/CMakeLists.txt > > create mode 100644 test/gh-4427-ffi-sandwich/libsandwich.c > > create mode 100644 test/gh-4427-ffi-sandwich/test.lua > > create mode 100644 test/lj-flush-on-trace/CMakeLists.txt > > create mode 100644 test/lj-flush-on-trace/libflush.c > > create mode 100644 test/lj-flush-on-trace/test.lua > > [1]: https://www.uclibc.org/docs/psABI-x86_64.pdf -- Best regards, IM ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit 0/2] Trace abort on FFI sandwich or mode change 2020-04-03 21:32 ` Igor Munkin @ 2020-04-04 21:36 ` Vladislav Shpilevoy 0 siblings, 0 replies; 20+ messages in thread From: Vladislav Shpilevoy @ 2020-04-04 21:36 UTC (permalink / raw) To: Igor Munkin; +Cc: tarantool-patches This is a Mac problem. I found the solution here: https://github.com/LuaJIT/LuaJIT/issues/449 Tarantool sets these variables in cmake, this is why it compiles as a part of the main repo. ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2020-04-07 21:23 UTC | newest] Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-03-27 10:47 [Tarantool-patches] [PATCH luajit 0/2] Trace abort on FFI sandwich or mode change Igor Munkin 2020-03-27 10:47 ` [Tarantool-patches] [PATCH luajit 1/2] jit: abort trace recording and execution for C API Igor Munkin 2020-03-28 16:33 ` Sergey Ostanevich 2020-03-28 20:30 ` Igor Munkin 2020-03-29 9:21 ` Sergey Ostanevich 2020-03-29 10:45 ` Igor Munkin 2020-03-30 8:58 ` Sergey Ostanevich 2020-03-30 14:25 ` Igor Munkin 2020-04-03 21:06 ` Sergey Ostanevich 2020-04-03 21:31 ` Igor Munkin 2020-04-02 23:41 ` Vladislav Shpilevoy 2020-04-04 11:55 ` Igor Munkin 2020-04-04 21:37 ` Vladislav Shpilevoy 2020-04-07 21:16 ` Igor Munkin 2020-03-27 10:47 ` [Tarantool-patches] [PATCH luajit 2/2] jit: abort trace execution on JIT mode change Igor Munkin 2020-03-28 19:36 ` Sergey Ostanevich 2020-03-29 10:46 ` Igor Munkin 2020-04-02 23:41 ` [Tarantool-patches] [PATCH luajit 0/2] Trace abort on FFI sandwich or " Vladislav Shpilevoy 2020-04-03 21:32 ` Igor Munkin 2020-04-04 21:36 ` Vladislav Shpilevoy
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox