[Tarantool-patches] [PATCH luajit 1/2] jit: abort trace recording and execution for C API

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Sun Apr 5 00:37:04 MSK 2020


See 8 comments below.

> diff --git a/src/lj_api.c b/src/lj_api.c
> index a5e2465..b543231 100644
> --- a/src/lj_api.c
> +++ b/src/lj_api.c
> @@ -76,6 +76,18 @@ static GCtab *getcurrenv(lua_State *L)
>    return fn->c.gct == ~LJ_TFUNC ? tabref(fn->c.env) : tabref(L->env);
>  }
>  
> +static void jit_secure_call(lua_State *L, TValue *base, int nres) {

1. Many functions in luajit related to jitting have prefix 'lj_'.
Probably this should be applied here too. I didn't see any other
starting with 'jit_'.

> +  global_State *g = G(L);
> +  /* Forbid Lua world re-entrancy while running the trace */
> +  if (tvref(g->jit_base)) {
> +    setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL));
> +    if (g->panic) g->panic(L);
> +    exit(EXIT_FAILURE);

2. Is it correct, that it is allowed to renter Lua, and the panic
will happen only if we see that during reentering the trace is
still being recorded (i.e. if there is a bug in lj)?

I don't see in the code above where is that check like 'the trace
is being recorded'. Could you please point at it?

> +  }
> +  lj_trace_abort(g);  /* Never record across Lua VM entrance */
> +  lj_vm_call(L, base, nres);
> +}
> @@ -1172,7 +1191,7 @@ LUALIB_API int luaL_callmeta(lua_State *L, int idx, const char *field)
>      if (LJ_FR2) setnilV(top++);
>      copyTV(L, top++, index2adr(L, idx));
>      L->top = top;
> -    lj_vm_call(L, top-1, 1+1);
> +    jit_secure_call(L, top-1, 1+1);
>      return 1;
>    }

3. Why didn't you change lua_cpcall(), luaL_callmeta()?

> diff --git a/test/gh-4427-ffi-sandwich/CMakeLists.txt b/test/gh-4427-ffi-sandwich/CMakeLists.txt
> new file mode 100644
> index 0000000..995c6bb
> --- /dev/null
> +++ b/test/gh-4427-ffi-sandwich/CMakeLists.txt

4. So this is not built by luajit? Does not it have anything
to be more self sufficient here? This looks strange to add
a CMake file to a probject, which uses Makefiles.

Also this cmake does not work. I get error:

CMake Error at CMakeLists.txt:1 (build_lualib):
  Unknown CMake command "build_lualib".

This is a clear cyclic dependency on Tarantool. Can it be
built by luajit's own means?

This dependency becomes even stronger in the next commit,
which introduces another CMake file.

> @@ -0,0 +1 @@
> +build_lualib(libsandwich libsandwich.c)
> diff --git a/test/gh-4427-ffi-sandwich/libsandwich.c b/test/gh-4427-ffi-sandwich/libsandwich.c
> new file mode 100644
> index 0000000..2a24fb2
> --- /dev/null
> +++ b/test/gh-4427-ffi-sandwich/libsandwich.c
> @@ -0,0 +1,59 @@
> +#include <lua.h>
> +#include <lauxlib.h>
> +
> +struct sandwich {
> +	lua_State *L; /* Coroutine saved for a Lua call */
> +	int ref;      /* Anchor to the Lua function to be run */
> +	int trigger;  /* Trigger for switching to Lua call */
> +};
> +
> +int ipp(struct sandwich *state, int i)

5. What is 'ipp'?

> +{
> +	if (i < state->trigger)
> +		return i + 1;
> +
> +	/* Sandwich is triggered and Lua ipp function is called */
> +	lua_pushnumber(state->L, state->ref);
> +	lua_gettable(state->L, LUA_REGISTRYINDEX);
> +	lua_pushnumber(state->L, i);
> +	lua_call(state->L, 1, 1);
> +	return lua_tonumber(state->L, -1);
> 
>> These changes look dangerous to perf. So if I had operators ==
>> and < overloaded for an FFI struct, they are not jitted anymore,
>> right?
> 
> Nope. I guess I need to stop right here and clarify the patch a little.
> 
> Let's start with the patch itself: the introduced changes abort the
> trace recording whether re-entrancy to Lua VM is detected or "panic" if
> such re-entrancy is faced while running already compiled machine code. A
> good rationale for such protection is the platform failure Nikita faced
> in #4427: the C function called via FFI makes the trigger with a Lua-C
> API call underneath run. However it also might be any Lua-C API function
> respecting the one of available metamethods (e.g. lua_less, lua_equal,
> etc). If the correponding metamethod is set, the platform also re-enters
> to execute it. Such behaviour is also unnacceptable, so metamethod call
> ought to be invoked via jit_secure_call too. It's not optional and we
> can't avoid it, since there are no guarantees the situation never
> occurs.

6. Yeah, ok, now I see, the changes functions are from the
public C Lua API, and are called by external code only. Such
as from tarantool. I was afraid, that lj uses them for all
comparisons internally.

> OK, what about FFI: it's a well isolated machinery, that handles
> everything its own way: own trace recording semantics, own metamethod
> handling (since only tables and userdata can have an individual
> metatable FFI has the one <ffi> "first level" metatable), etc. Nothing
> FFI-related is changed here.

7. If this has nothing to do with ffi, then why it is used
in the test?

> If you run the following snippet, you'll see the artefacts for
> successfully compiled trace in dump.out.
> | require('jit.dump').start('+tbisrmXaT', 'dump.out')
> | 
> | local limit = -1ULL / 2
> | local value = 1ULL
> | local tests = { }
> | 
> | while value < limit do
> |   value = value * 2
> |   table.insert(tests, value < limit)
> | end
> | 
> | jit.off() -- prevent spoiling dump.out with other traces.
> | print(value, limit, #tests)
> 
> Since the patch relates only to Lua-C API, metamethods triggered while
> running Lua code are also successfully compiled.
> | require('jit.dump').start('+tbisrmXaT', 'dump.out')
> | 
> | local complex
> | 
> | local complex_mt = {
> |   __add = function(a, b) return complex(a.re + b.re, a.im + b.im) end,
> |   __eq  = function(a, b) return a.re == b.re and a.im == b.im end,
> |   __tostring = function(o) return ('re: %d, im: %d'):format(o.re, o.im) end
> | }
> | 
> | complex = function(re, im)
> |   return setmetatable({ re = re, im = im }, complex_mt)
> | end
> | 
> | local start = complex(0, 0)
> | local finish = complex(64, 64)
> | local step = complex(1, 1)
> | 
> | repeat
> |   start = start + step
> | until start == finish
> 
> Feel free to point the parts that looks good to be included into commit
> message, comments, etc.
> 
>> Why? Does someone really yield in them?
> 
> The main idea is to protect ourselves in the future (we have already
> discussed it with Kostja here[1]).
> 
>> Could you measure how does it affect perf of these calls?
> 
> Talking about perf: Sergos also asked to make benchmarks and I provided
> the sysbench results[2] I obtained in Gitlab. I saw no dramatically
> changes. I guess the reason the performance stays the same is that
> all such places has been already handled manually in Tarantool sources
> (like box_txn_rollback_to_savepoint after Leonid changes in #4427).
> 
> However, if you would like to see more benchmarks, please let me know.
> 
>> From what I see, these changes basically kill FFI making it not better
>> than Lua C. It no longer helps JITting anything, does it?
>>
>> The same for concat, and other things. Looks like an overkill. We
>> would need sandwich protection only for a few places. Is it possible
>> to avoid such drastic changes?
> 
> [1]: https://lists.tarantool.org/pipermail/tarantool-patches/2019-November/012732.html
> [2]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-March/015236.html
8. So are you saying, that ffi part is still fast?


More information about the Tarantool-patches mailing list