[Tarantool-patches] [PATCH] Move txn from shema to a separate module (use C API instead of FFI)

Igor Munkin imun at tarantool.org
Thu Nov 28 21:36:15 MSK 2019


Kostja,

On 28.11.19, Konstantin Osipov wrote:
> * Igor Munkin <imun at tarantool.org> [19/11/28 17:08]:
> 
> > Why should we be aiming at using FFI more? The root cause is that
> > current fiber machinery (as well as some parts of triggers mechanism)
> > doesn't respect the Lua coroutine switch semantics, thereby breaking
> > trace recording. Lua-C API implicitly (or non-intentionally) prevents
> > breakage by JIT trace aborts when recording FUNCC.
> 
> It's not correct. The current FFI functions were carefully crafted
> to never lead to sandwich code: only those functions which can not
> trigger a return to Lua were implemented as FFI. 
> There was one regression between 1.10 and 2.3 because we
> started firing rollback triggers when rolling back to a savepoint,
> which was spotted by a failing test.
> 
> One more time: When FFI bindings were written we were aware of NYI
> and took it into account.
> 

OK, maybe I put it the wrong way by using the word "non-intentionally". I
mean that Tarantool doesn't use any special handler to asynchronously
abort recording, since no such handler exists outside the LuaJIT
internals (jit.off blacklists the function regardless of the host and
guest stack layout).

> > Therefore, I guess we should be aiming either at changing fiber
> > switching to the one respecting the LuaJIT runtime or at tuning JIT
> > compiler way more regarding the Lua-C usage.
> 
> This is actually quite simple - we could easily call a LuaJIT hook
> whenever switching a fiber, to make sure that it carefully
> switches the internals as well. Mike Pall refused to cooperate on
> the matter, but now we (you) control our own destiny.

Unfortunately, I haven't seen the thread where this was discussed with
Mike Pall. The approach you proposed doesn't seem convenient to me,
although it does solve the problem (as does the move to the Lua-C API
for code with possible Lua VM re-entrance underneath).
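
To make the idea of a hook on fiber switch concrete, here is a minimal
toy sketch in C. It is not Tarantool or LuaJIT code: every name in it
(toy_jit, toy_scheduler, toy_fiber_switch, toy_abort_recording) is
hypothetical, and aborting the recording is only the simplest hook
action I could assume for illustration, whereas the proposal above is
about switching the JIT internals along with the fiber.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the JIT recorder state; NOT a LuaJIT structure. */
struct toy_jit {
	bool recording;     /* a trace is currently being recorded */
	int aborted_traces; /* number of recordings aborted so far */
};

/* Hypothetical hook type, invoked on every fiber switch. */
typedef void (*toy_switch_hook)(struct toy_jit *jit);

struct toy_fiber {
	const char *name;
};

struct toy_scheduler {
	struct toy_fiber *current;
	struct toy_jit *jit;
	toy_switch_hook on_switch; /* may be NULL: no hook installed */
};

/*
 * Assumed hook action: abort an in-progress recording, so that
 * a single trace never spans two fibers.
 */
static void
toy_abort_recording(struct toy_jit *jit)
{
	if (jit->recording) {
		jit->recording = false;
		jit->aborted_traces++;
	}
}

/* Let the JIT side react first, then switch the current fiber. */
static void
toy_fiber_switch(struct toy_scheduler *s, struct toy_fiber *next)
{
	assert(s != NULL && next != NULL);
	if (s->on_switch != NULL)
		s->on_switch(s->jit);
	s->current = next;
}
```

Note that the scheduler in this sketch has to know about the JIT state
in order to call the hook at all.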

The major flaw I see in this solution is that it introduces a dependency
on the JIT interface into Tarantool internals. There is already one
dependency on LuaJIT-2.1.0: internal headers are used for several hacks
in utils.c. As a result, we are not able to simply swap in another Lua
implementation (e.g. uJIT, which conforms to LuaJIT-2.0.5) to compare
the two.

The best proposal Kirill, Sergos and I came up with is to finalize a
trace exactly at the CALLXS IR. However, after some research I found
that the snapshot replayed at the corresponding trace exit would restore
a guest stack it doesn't relate to. I hope to research this direction
further, but adjusting this behaviour requires much more time and its
benefits look doubtful to me now.

For now, there is the partial fix I mentioned before; however, it still
violates the flow I described here[1]. I'm going to proceed with the
research a bit later and provide another patch.

> 
> > Besides, we can't fully prevent platform failures if there is an FFI
> > misusage in users code.
> 
> Tarantool has never been claiming that it prevents people from

Sorry, I simply misread the following:
|> Why not dig it up to protect from future erosion of the code base?
|>
|> This would be more valuable contribution than just falling back to
|> Lua/C for everything.

> shooting themselves in the foot. Performance is the ultimate
> design goal, at the cost of safety at times.
> 

Great. Leonid, Sasha and I discussed this offline and agreed to run
several benchmarks and post the results in this thread. Without
benchmarks all our estimates can simply be wrong.

> 
> > > What should be the rule of thumb in your opinion, ffi or
> > > lua/c? 
> > 
> > If you want to know my rule of thumb: FFI is for external existing
> > libraries to be used in Lua code (and all compiler related benefits are
> > nothing more than a godsend consequence, since all guest stack
> > manipulations are implemented in LuaJIT runtime, not in an external
> > code) and Lua-C is a well-designed and well-documented API for embedding
> > Lua into a host application / extending Lua with external low-level
> > libs. I totally do not insist on my point of view, since everyone has
> > their own vision of LuaJIT features.
> 
> OK, but there must be a single policy though. So far it was:
> everything that doesn't yield and doesn't call back to Lua
> uses FFI. Everything else *has* to use Lua/C API, UNTIL  there
> is a way to safely sandwich FFI calls.
> 

I agree that a policy must exist, but we all see that the one you
mentioned above can introduce bugs leading to platform failures. So I
guess we should reconsider it or simply dump it somewhere. I think we
have to run some benchmarks and provide not only stats, but also a
reproducer with the input data; otherwise JIT tests are IMHO irrelevant.
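
Just to pin down the policy under discussion (everything that neither
yields nor calls back to Lua uses FFI, everything else uses Lua/C), here
is a minimal sketch of it as a decision function in C. The names
(func_traits, binding_kind, choose_binding) are hypothetical and exist
only for this illustration; nothing like this is present in the
Tarantool sources as far as this thread shows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* How a C function is exposed to Lua (hypothetical enum). */
enum binding_kind {
	BINDING_FFI,   /* called via LuaJIT FFI */
	BINDING_LUA_C, /* called via the Lua/C API */
};

/* Properties the policy quoted above depends on (hypothetical). */
struct func_traits {
	bool may_yield;    /* can switch to another fiber */
	bool may_call_lua; /* can return to Lua, e.g. by firing triggers */
};

/*
 * The policy as quoted: FFI only for functions that neither yield
 * nor call back to Lua, Lua/C for everything else.
 */
static enum binding_kind
choose_binding(const struct func_traits *t)
{
	assert(t != NULL);
	if (t->may_yield || t->may_call_lua)
		return BINDING_LUA_C;
	return BINDING_FFI;
}
```

The weak point is that the traits are a property of the implementation,
not of the binding: once rollback to a savepoint started firing
triggers, may_call_lua silently became true for a function that had
already been bound via FFI.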

> -- 
> Konstantin Osipov, Moscow, Russia

[1]: https://github.com/tarantool/tarantool/issues/4427#issuecomment-546056302

-- 
Best regards,
IM

