From: Igor Munkin <imun@tarantool.org> To: Sergey Ostanevich <sergos@tarantool.org> Cc: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>, tarantool-patches@dev.tarantool.org Subject: Re: [Tarantool-patches] [PATCH luajit 1/2] jit: abort trace recording and execution for C API Date: Mon, 30 Mar 2020 17:25:02 +0300 [thread overview] Message-ID: <20200330142502.GJ22874@tarantool.org> (raw) In-Reply-To: <20200330085819.GA635@tarantool.org> Sergos, On 30.03.20, Sergey Ostanevich wrote: > On 29 мар 13:45, Igor Munkin wrote: > > Sergos, > > > > On 29.03.20, Sergey Ostanevich wrote: > > > On 28 мар 23:30, Igor Munkin wrote: > > > > Sergos, > > > > > > > > Thanks for you review! > > > > > > > > On 28.03.20, Sergey Ostanevich wrote: > > > > > Hi! > > > > > > > > > > Thanks for the patch, see below. > > > > > On 27 мар 13:47, Igor Munkin wrote: > > > > > > JIT recording semantics assumes FFI calls are atomic regarding the > > > > > > LuaJIT VM: if the execution exited Lua world through FFI it is not > > > > > > re-entering Lua world again. > > > > > > > > > > > > However, there are ways to break this atomicity via FFI: one can > > > > > > re-enter LuaJIT VM via Lua C API inside the C routine called via FFI. As > > > > > > a result the following stack mix is created: > > > > > > | Lua-FFI -> С routine -> Lua-C API -> Lua VM > > > > > > > > Fixed typo <D0><A1> (cyrillic С letter) to an ASCII C. > > > > > > > > > > This sort of re-entrancy is explicitly not supported by LuaJIT compiler > > > > > > machinery. @mraleph named such layout an "FFI sandwich" and I'm strictly > > > > > > against eating it. > > > > > > > > > > > > E.g. there is a trigger machinery in Tarantool calling Lua functions > > > > > > from C space. As a result of FFI function call with a trigger underneath > > > > > > the "FFI sandwich" is created. > > > > > > > > > > > > This changeset introduces the mechanism for Lua-C API callbacks similar > > > > > > to the one existing for Lua-FFI: trace recording is aborted whe the > > > > > when > > > > > > > > Thanks, fixed. > > > > > > > > > > execution re-enters LuaJIT VM. Sandwich detection while mcode execution > > > > > > leads to platform panic. > > > > > Could you elaborate here in comment on this a little: it means this > > > > > particular Lua call to be aborted, fail for the fiber Lua state, or for > > > > > the whole Tarantool? > > > > > > > > > > > > > The whole Tarantool finishes its execution with EXIT_FAILURE code and > > > > calls tarantool_panic_handler routine prior to the exit. Should I add > > > > this remark into commit message? > > > I would like to see it in commit message. > > > > Added. Here is a new commit message: > > > > ================================================================================ > > > > jit: abort trace recording and execution for C API > > > > JIT recording semantics assumes FFI calls are atomic regarding the > > LuaJIT VM: if the execution exited Lua world through FFI it is not > > re-entering Lua world again. > > > > However, there are ways to break this atomicity via FFI: one can > > re-enter LuaJIT VM via Lua C API inside the C routine called via FFI. As > > a result the following stack mix is created: > > | Lua-FFI -> C routine -> Lua-C API -> Lua VM > > This sort of re-entrancy is explicitly not supported by LuaJIT compiler > > machinery. @mraleph named such layout an "FFI sandwich" and I'm strictly > > against eating it. > > > > E.g. there is a trigger machinery in Tarantool calling Lua functions > > from C space. As a result of FFI function call with a trigger underneath > > the "FFI sandwich" is created. > > > > This changeset introduces the mechanism for Lua-C API callbacks similar > > to the one existing for Lua-FFI: trace recording is aborted when the > > execution re-enters LuaJIT VM. If re-entrancy is detected while running > > mcode the platform finishes its execution with EXIT_FAILURE code and > > calls panic routine prior to the exit. > > > > Relates to tarantool/tarantool#4427 > > > > ================================================================================ > > LGTM. > > > > > > > > > > > > > > > > > > > > > > Relates to tarantool/tarantool#4427 > > > > > > > > > > > > Co-authored-by: Vyacheslav Egorov <vegorov@google.com> > > > > > > Co-authored-by: Sergey Ostanevich <sergos@tarantool.org> > > > > > > Signed-off-by: Igor Munkin <imun@tarantool.org> > > > > > > --- > > > > > > src/lj_api.c | 35 ++++++++++---- > > > > > > src/lj_errmsg.h | 1 + > > > > > > test/gh-4427-ffi-sandwich/CMakeLists.txt | 1 + > > > > > > test/gh-4427-ffi-sandwich/libsandwich.c | 59 ++++++++++++++++++++++++ > > > > > > test/gh-4427-ffi-sandwich/test.lua | 26 +++++++++++ > > > > > > 5 files changed, 113 insertions(+), 9 deletions(-) > > > > > > create mode 100644 test/gh-4427-ffi-sandwich/CMakeLists.txt > > > > > > create mode 100644 test/gh-4427-ffi-sandwich/libsandwich.c > > > > > > create mode 100644 test/gh-4427-ffi-sandwich/test.lua > > > > > > > > > > > > diff --git a/src/lj_api.c b/src/lj_api.c > > > > > > index a5e2465..c1f53e0 100644 > > > > > > --- a/src/lj_api.c > > > > > > +++ b/src/lj_api.c > > > > > > @@ -76,6 +76,17 @@ static GCtab *getcurrenv(lua_State *L) > > > > > > return fn->c.gct == ~LJ_TFUNC ? tabref(fn->c.env) : tabref(L->env); > > > > > > } > > > > > > > > > > > > +static void call(lua_State *L, TValue *base, int nres) { > > > > > I would recommend to name the function in a more descriptive way, like > > > > > sane_call. > > > > > > > > IMHO no_trace_call / no_jit_call looks good to me, but feel free to > > > > propose the naming you prefer more. > > > If you want to mention that this call aborts the trace generation, > > > then I would put it like 'call_w_trace_abort' > > > > It not only aborts the trace recording, but prohibits further execution > > for already compiled code. > > > > > But it goes beyond that AFAU. Can you name it like 'ffi_safe_call'? > > > > The changes are not related to FFI at all. FFI helps to achieve the > > platform failure, but the problem relates to JIT machinery. So this name > > doesn't clarify function semantics. > > > Can we sayn that this machinery tracks the machinery consistency? Then > the first name comes to my mind is consistent_call with in the code > comment on what consistency is - abort trace collection and inconsistent > trace excution. As a result of discussion, we stopped on the jit_secure_call name. The diff is below: ================================================================================ diff --git a/src/lj_api.c b/src/lj_api.c index c1f53e0..4ff8ccf 100644 --- a/src/lj_api.c +++ b/src/lj_api.c @@ -76,8 +76,9 @@ static GCtab *getcurrenv(lua_State *L) return fn->c.gct == ~LJ_TFUNC ? tabref(fn->c.env) : tabref(L->env); } -static void call(lua_State *L, TValue *base, int nres) { +static void jit_secure_call(lua_State *L, TValue *base, int nres) { global_State *g = G(L); + /* Forbid Lua world re-entrancy while running the trace */ if (tvref(g->jit_base)) { setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL)); if (g->panic) g->panic(L); @@ -311,7 +311,7 @@ LUA_API int lua_equal(lua_State *L, int idx1, int idx2) return (int)(uintptr_t)base; } else { L->top = base+2; - call(L, base, 1+1); + jit_secure_call(L, base, 1+1); L->top -= 2+LJ_FR2; return tvistruecond(L->top+1+LJ_FR2); } @@ -334,7 +334,7 @@ LUA_API int lua_lessthan(lua_State *L, int idx1, int idx2) return (int)(uintptr_t)base; } else { L->top = base+2; - call(L, base, 1+1); + jit_secure_call(L, base, 1+1); L->top -= 2+LJ_FR2; return tvistruecond(L->top+1+LJ_FR2); } @@ -786,7 +786,7 @@ LUA_API void lua_concat(lua_State *L, int n) } n -= (int)(L->top - top); L->top = top+2; - call(L, top, 1+1); + jit_secure_call(L, top, 1+1); L->top -= 1+LJ_FR2; copyTV(L, L->top-1, L->top+LJ_FR2); } while (--n > 0); @@ -806,7 +806,7 @@ LUA_API void lua_gettable(lua_State *L, int idx) v = lj_meta_tget(L, t, L->top-1); if (v == NULL) { L->top += 2; - call(L, L->top-2, 1+1); + jit_secure_call(L, L->top-2, 1+1); L->top -= 2+LJ_FR2; v = L->top+1+LJ_FR2; } @@ -822,7 +822,7 @@ LUA_API void lua_getfield(lua_State *L, int idx, const char *k) v = lj_meta_tget(L, t, &key); if (v == NULL) { L->top += 2; - call(L, L->top-2, 1+1); + jit_secure_call(L, L->top-2, 1+1); L->top -= 2+LJ_FR2; v = L->top+1+LJ_FR2; } @@ -977,7 +977,7 @@ LUA_API void lua_settable(lua_State *L, int idx) TValue *base = L->top; copyTV(L, base+2, base-3-2*LJ_FR2); L->top = base+3; - call(L, base, 0+1); + jit_secure_call(L, base, 0+1); L->top -= 3+LJ_FR2; } } @@ -998,7 +998,7 @@ LUA_API void lua_setfield(lua_State *L, int idx, const char *k) TValue *base = L->top; copyTV(L, base+2, base-3-2*LJ_FR2); L->top = base+3; - call(L, base, 0+1); + jit_secure_call(L, base, 0+1); L->top -= 2+LJ_FR2; } } @@ -1129,7 +1129,7 @@ LUA_API void lua_call(lua_State *L, int nargs, int nresults) { api_check(L, L->status == LUA_OK || L->status == LUA_ERRERR); api_checknelems(L, nargs+1); - call(L, api_call_base(L, nargs), nresults+1); + jit_secure_call(L, api_call_base(L, nargs), nresults+1); } LUA_API int lua_pcall(lua_State *L, int nargs, int nresults, int errfunc) @@ -1147,6 +1148,7 @@ LUA_API int lua_pcall(lua_State *L, int nargs, int nresults, int errfunc) api_checkvalidindex(L, o); ef = savestack(L, o); } + /* Forbid Lua world re-entrancy while running the trace */ if (tvref(g->jit_base)) { setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL)); if (g->panic) g->panic(L); @@ -1189,7 +1189,7 @@ LUALIB_API int luaL_callmeta(lua_State *L, int idx, const char *field) if (LJ_FR2) setnilV(top++); copyTV(L, top++, index2adr(L, idx)); L->top = top; - call(L, top-1, 1+1); + jit_secure_call(L, top-1, 1+1); return 1; } return 0; ================================================================================ Squashed these changes with the original patch. > > > > Although, safety resulting in a panic sounds paradoxical. > > > > Totally agree. > > > > > > > > BTW, should it be under 'LJ_HASFFI' and keep the original call > > > otherwise? > > > > I see no benefits for it. Again, the fix doesn't relates to FFI > > itself. It's just a particular way leading to JIT inconsistency. > > > > Feel free to correct me, if I'm wrong. > > > > > > > > > > > > > > > > > > > > + global_State *g = G(L); > > > > > > + if (tvref(g->jit_base)) { > > > > > > + setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL)); > > > > > > + if (g->panic) g->panic(L); > > > > > > + exit(EXIT_FAILURE); > > > > > > + } > > > > > > + lj_trace_abort(g); /* Never record across Lua VM entrance */ > > > > > > + lj_vm_call(L, base, nres); > > > > > > +} > > > > > > + > > > > > > /* -- Miscellaneous API functions ----------------------------------------- */ > > > > > > > > > > > > LUA_API int lua_status(lua_State *L) > > > > > > > > > > Also I have a question on perf impact - have you measured any > > > > > benchmarks - Lua only, or as part of Tarantool > > > > > > > > I compared two sysbench logs ([1] and [2]) via vimdiff and see no > > > > critical difference. > > > > > > > > > > > > > > Regards, > > > > > Sergos > > > > > > > > [1]: https://gitlab.com/tarantool/tarantool/-/jobs/489420257/raw > > > > [2]: https://gitlab.com/tarantool/tarantool/-/jobs/486707332/raw > > > > > > > > -- > > > > Best regards, > > > > IM > > > > -- > > Best regards, > > IM -- Best regards, IM
next prev parent reply other threads:[~2020-03-30 14:31 UTC|newest] Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-03-27 10:47 [Tarantool-patches] [PATCH luajit 0/2] Trace abort on FFI sandwich or mode change Igor Munkin 2020-03-27 10:47 ` [Tarantool-patches] [PATCH luajit 1/2] jit: abort trace recording and execution for C API Igor Munkin 2020-03-28 16:33 ` Sergey Ostanevich 2020-03-28 20:30 ` Igor Munkin 2020-03-29 9:21 ` Sergey Ostanevich 2020-03-29 10:45 ` Igor Munkin 2020-03-30 8:58 ` Sergey Ostanevich 2020-03-30 14:25 ` Igor Munkin [this message] 2020-04-03 21:06 ` Sergey Ostanevich 2020-04-03 21:31 ` Igor Munkin 2020-04-02 23:41 ` Vladislav Shpilevoy 2020-04-04 11:55 ` Igor Munkin 2020-04-04 21:37 ` Vladislav Shpilevoy 2020-04-07 21:16 ` Igor Munkin 2020-03-27 10:47 ` [Tarantool-patches] [PATCH luajit 2/2] jit: abort trace execution on JIT mode change Igor Munkin 2020-03-28 19:36 ` Sergey Ostanevich 2020-03-29 10:46 ` Igor Munkin 2020-04-02 23:41 ` [Tarantool-patches] [PATCH luajit 0/2] Trace abort on FFI sandwich or " Vladislav Shpilevoy 2020-04-03 21:32 ` Igor Munkin 2020-04-04 21:36 ` Vladislav Shpilevoy
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20200330142502.GJ22874@tarantool.org \ --to=imun@tarantool.org \ --cc=sergos@tarantool.org \ --cc=tarantool-patches@dev.tarantool.org \ --cc=v.shpilevoy@tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH luajit 1/2] jit: abort trace recording and execution for C API' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox