From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtpng3.m.smailru.net (smtpng3.m.smailru.net [94.100.177.149]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 0489E46970F for ; Thu, 28 Nov 2019 21:38:22 +0300 (MSK) Date: Thu, 28 Nov 2019 21:36:15 +0300 From: Igor Munkin Message-ID: <20191128183615.GC1214@tarantool.org> References: <156ce93b495648d6f3fd6c879b0d9aaf56754a1e.1574773773.git.lvasiliev@tarantool.org> <20191126210520.GE23422@atlas> <20191126211701.mhavpytwkemux3vm@tkn_work_nb> <20191127083123.GA2752@atlas> <20191128123445.GC29714@atlas> <20191128130005.GA1214@tarantool.org> <20191128131804.GE29714@atlas> <20191128140340.GB1214@tarantool.org> <20191128155801.GB11584@atlas> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20191128155801.GB11584@atlas> Subject: Re: [Tarantool-patches] [PATCH] Move txn from shema to a separate module (use C API instead of FFI) List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Konstantin Osipov Cc: tarantool-patches@dev.tarantool.org Kostja, On 28.11.19, Konstantin Osipov wrote: > * Igor Munkin [19/11/28 17:08]: > > > Why should we be aiming at using FFI more? The root cause is that > > current fiber machinery (as well as some parts of triggers mechanism) > > doesn't respect the Lua coroutine switch semantics, thereby breaking > > trace recording. Lua-C API implicitly (or non-intentionally) prevents > > breakage by JIT trace aborts when recording FUNCC. > > It's not correct. The current FFI functions were carefully crafted > to never lead to sandwich code: only those functions which can not > trigger a return to Lua were implemented as FFI. > There was one regression between 1.10 and in 2.3 because we > started firing rollback triggers when rolling back to a savepoint, > which was spotted by a failing tests. > > One more time: When FFI bindings were written we were aware of NYI > and took it into account. > OK, maybe I said it the wrong way using the word "non-intentionally". I mean that Tarantool doesn't use any special handler to asynchroniously abort recording, since there is no such outside the LuaJIT internals (jit.off blacklists the function regardless the host and guest stacks layout). > > Therefore, I guess we should be aiming either at changing fiber > > switching to the one respecting the LuaJIT runtime or at tuning JIT > > compiler way more regarding the Lua-C usage. > > This is actually quite simple - we could easily call a LuaJIT hook > whenever switching a fiber, to make sure that it carefully > switches the internals as well. Mike Pall refused to cooperate on > the matter, but now we (you) control our own destiny. Unfortunately, I haven't seen the thread where the subj is discussed with Mike Pall, but the approach you proposed doesn't seem to be a convenient one, however it still solves a problem (as does the move to use Lua-C API for the code with possible Lua VM re-entrance underneath). The major flaw I see in this solution, is introducing the dependency on the JIT interface into Tarantool internals. There is already one dependency on LuaJIT-2.1.0 presented with internal headers usage for several hacks in utils.c. As a result we are not able to simply replace the Lua implementation to try another one (e.g. uJIT conforming LuaJIT-2.0.5) for comparing each other. The best proposal we had with Kirill and Sergos is to finalize a trace exactly at CALLXS IR, however after some research I found that the snapshot to be replayed at the corresponding trace exit will restore the guest stack it doesn't relate to. I hope to make further research this direction, but it requires a way more time to adjust this behaviour and its benefits are doubtful for me now. For now, there is a partial fix I mentioned before, however it still violates the flow I described here[1]. I'm going to proceed with the research a bit later and provide another patch. > > > Besides, we can't fully prevent platform failures if there is an FFI > > misusage in users code. > > Tarantool has never been claiming that it prevents people from Sorry, I simply misread the following: |> Why not dig it up to protect from future erosion of the code base? |> |> This would be more valuable contribution than just falling back to |> Lua/C for everything. > shooting themselves in the foot. Performance is the ultimate > design goal, at the cost of safety at times. > Great, we discussed with Leonid and Sasha offline and agreed to make several benchmarks to be provided in this thread. With no benchmarks all our estimates can be simply wrong. > > > > What should be the rule of thumb in your opinion, ffi or > > > lua/c? > > > > If you want to know my rule of thumb: FFI is for external existing > > libraries to be used in Lua code (and all compiler related benefits are > > nothing more than a godsend consequence, since all guest stack > > manipulations are implemented in LuaJIT runtime, not in an external > > code) and Lua-C is a well-designed and well-documented API for embedding > > Lua into a host application / extending Lua with external low-lewel > > libs. I totally do not insist on my point of view, since everyone has > > it's own vision on LuaJIT features. > > OK, but there must be a single policy though. So far it was: > everything that doesn't yield and doesn't call back to Lua > uses FFI. Everything else *has* to use Lua/C API, UNTIL there > is a way to safely sandwich FFI calls. > I agree with you for the policy existence, but we all see the one you mentioned above can introduce bugs leading to a platform failures. So I guess we should reconsider it or simply dump somewhere. I think we have to make some benchmarks and provide not only stats, but also a reproducer with the input data, otherwise JIT tests are IMHO irrelevant. > -- > Konstantin Osipov, Moscow, Russia [1]: https://github.com/tarantool/tarantool/issues/4427#issuecomment-546056302 -- Best regards, IM