From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtpng3.m.smailru.net (smtpng3.m.smailru.net [94.100.177.149]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 855D94696C4 for ; Sun, 5 Apr 2020 00:37:06 +0300 (MSK) References: <50fe58f83ebb4e4971528641e74f99fe2f9fd8f2.1585304087.git.imun@tarantool.org> <20200404115505.GD30022@tarantool.org> From: Vladislav Shpilevoy Message-ID: <6c9f0e61-135a-caf3-5d29-4aa8525f685c@tarantool.org> Date: Sat, 4 Apr 2020 23:37:04 +0200 MIME-Version: 1.0 In-Reply-To: <20200404115505.GD30022@tarantool.org> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [Tarantool-patches] [PATCH luajit 1/2] jit: abort trace recording and execution for C API List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Igor Munkin Cc: tarantool-patches@dev.tarantool.org See 8 comments below. > diff --git a/src/lj_api.c b/src/lj_api.c > index a5e2465..b543231 100644 > --- a/src/lj_api.c > +++ b/src/lj_api.c > @@ -76,6 +76,18 @@ static GCtab *getcurrenv(lua_State *L) > return fn->c.gct == ~LJ_TFUNC ? tabref(fn->c.env) : tabref(L->env); > } > > +static void jit_secure_call(lua_State *L, TValue *base, int nres) { 1. Many functions in luajit related to jitting have prefix 'lj_'. Probably this should be applied here too. I didn't see any other starting with 'jit_'. > + global_State *g = G(L); > + /* Forbid Lua world re-entrancy while running the trace */ > + if (tvref(g->jit_base)) { > + setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL)); > + if (g->panic) g->panic(L); > + exit(EXIT_FAILURE); 2. Is it correct, that it is allowed to renter Lua, and the panic will happen only if we see that during reentering the trace is still being recorded (i.e. if there is a bug in lj)? I don't see in the code above where is that check like 'the trace is being recorded'. Could you please point at it? > + } > + lj_trace_abort(g); /* Never record across Lua VM entrance */ > + lj_vm_call(L, base, nres); > +} > @@ -1172,7 +1191,7 @@ LUALIB_API int luaL_callmeta(lua_State *L, int idx, const char *field) > if (LJ_FR2) setnilV(top++); > copyTV(L, top++, index2adr(L, idx)); > L->top = top; > - lj_vm_call(L, top-1, 1+1); > + jit_secure_call(L, top-1, 1+1); > return 1; > } 3. Why didn't you change lua_cpcall(), luaL_callmeta()? > diff --git a/test/gh-4427-ffi-sandwich/CMakeLists.txt b/test/gh-4427-ffi-sandwich/CMakeLists.txt > new file mode 100644 > index 0000000..995c6bb > --- /dev/null > +++ b/test/gh-4427-ffi-sandwich/CMakeLists.txt 4. So this is not built by luajit? Does not it have anything to be more self sufficient here? This looks strange to add a CMake file to a probject, which uses Makefiles. Also this cmake does not work. I get error: CMake Error at CMakeLists.txt:1 (build_lualib): Unknown CMake command "build_lualib". This is a clear cyclic dependency on Tarantool. Can it be built by luajit's own means? This dependency becomes even stronger in the next commit, which introduces another CMake file. > @@ -0,0 +1 @@ > +build_lualib(libsandwich libsandwich.c) > diff --git a/test/gh-4427-ffi-sandwich/libsandwich.c b/test/gh-4427-ffi-sandwich/libsandwich.c > new file mode 100644 > index 0000000..2a24fb2 > --- /dev/null > +++ b/test/gh-4427-ffi-sandwich/libsandwich.c > @@ -0,0 +1,59 @@ > +#include > +#include > + > +struct sandwich { > + lua_State *L; /* Coroutine saved for a Lua call */ > + int ref; /* Anchor to the Lua function to be run */ > + int trigger; /* Trigger for switching to Lua call */ > +}; > + > +int ipp(struct sandwich *state, int i) 5. What is 'ipp'? > +{ > + if (i < state->trigger) > + return i + 1; > + > + /* Sandwich is triggered and Lua ipp function is called */ > + lua_pushnumber(state->L, state->ref); > + lua_gettable(state->L, LUA_REGISTRYINDEX); > + lua_pushnumber(state->L, i); > + lua_call(state->L, 1, 1); > + return lua_tonumber(state->L, -1); > >> These changes look dangerous to perf. So if I had operators == >> and < overloaded for an FFI struct, they are not jitted anymore, >> right? > > Nope. I guess I need to stop right here and clarify the patch a little. > > Let's start with the patch itself: the introduced changes abort the > trace recording whether re-entrancy to Lua VM is detected or "panic" if > such re-entrancy is faced while running already compiled machine code. A > good rationale for such protection is the platform failure Nikita faced > in #4427: the C function called via FFI makes the trigger with a Lua-C > API call underneath run. However it also might be any Lua-C API function > respecting the one of available metamethods (e.g. lua_less, lua_equal, > etc). If the correponding metamethod is set, the platform also re-enters > to execute it. Such behaviour is also unnacceptable, so metamethod call > ought to be invoked via jit_secure_call too. It's not optional and we > can't avoid it, since there are no guarantees the situation never > occurs. 6. Yeah, ok, now I see, the changes functions are from the public C Lua API, and are called by external code only. Such as from tarantool. I was afraid, that lj uses them for all comparisons internally. > OK, what about FFI: it's a well isolated machinery, that handles > everything its own way: own trace recording semantics, own metamethod > handling (since only tables and userdata can have an individual > metatable FFI has the one "first level" metatable), etc. Nothing > FFI-related is changed here. 7. If this has nothing to do with ffi, then why it is used in the test? > If you run the following snippet, you'll see the artefacts for > successfully compiled trace in dump.out. > | require('jit.dump').start('+tbisrmXaT', 'dump.out') > | > | local limit = -1ULL / 2 > | local value = 1ULL > | local tests = { } > | > | while value < limit do > | value = value * 2 > | table.insert(tests, value < limit) > | end > | > | jit.off() -- prevent spoiling dump.out with other traces. > | print(value, limit, #tests) > > Since the patch relates only to Lua-C API, metamethods triggered while > running Lua code are also successfully compiled. > | require('jit.dump').start('+tbisrmXaT', 'dump.out') > | > | local complex > | > | local complex_mt = { > | __add = function(a, b) return complex(a.re + b.re, a.im + b.im) end, > | __eq = function(a, b) return a.re == b.re and a.im == b.im end, > | __tostring = function(o) return ('re: %d, im: %d'):format(o.re, o.im) end > | } > | > | complex = function(re, im) > | return setmetatable({ re = re, im = im }, complex_mt) > | end > | > | local start = complex(0, 0) > | local finish = complex(64, 64) > | local step = complex(1, 1) > | > | repeat > | start = start + step > | until start == finish > > Feel free to point the parts that looks good to be included into commit > message, comments, etc. > >> Why? Does someone really yield in them? > > The main idea is to protect ourselves in the future (we have already > discussed it with Kostja here[1]). > >> Could you measure how does it affect perf of these calls? > > Talking about perf: Sergos also asked to make benchmarks and I provided > the sysbench results[2] I obtained in Gitlab. I saw no dramatically > changes. I guess the reason the performance stays the same is that > all such places has been already handled manually in Tarantool sources > (like box_txn_rollback_to_savepoint after Leonid changes in #4427). > > However, if you would like to see more benchmarks, please let me know. > >> From what I see, these changes basically kill FFI making it not better >> than Lua C. It no longer helps JITting anything, does it? >> >> The same for concat, and other things. Looks like an overkill. We >> would need sandwich protection only for a few places. Is it possible >> to avoid such drastic changes? > > [1]: https://lists.tarantool.org/pipermail/tarantool-patches/2019-November/012732.html > [2]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-March/015236.html 8. So are you saying, that ffi part is still fast?