[Tarantool-patches] [PATCH] Move txn from shema to a separate module (use C API instead of FFI)

Leonid Vasiliev lvasiliev at tarantool.org
Wed Dec 4 16:05:42 MSK 2019


So, several benchmarks (sergos/rollback_to_savepoint - 
lvasiliev/gh-4427-move-some-stuff-from-ffi-to-c-api):

BM1:

local os = require('os')
local clock = require('clock')

box.cfg()

local s = box.schema.space.create('test')
s:create_index('primary')


local start_clock = clock.monotonic()

for i = 1,1000000 do
     box.begin()
     s:replace{1}
     box.is_in_txn()
     local save1 = box.savepoint()
     s:replace{2}
     box.is_in_txn()
     local save2 = box.savepoint()
     s:replace{3}
     box.is_in_txn()
     local save3 = box.savepoint()
     s:replace{4}
     box.is_in_txn()
     box.rollback_to_savepoint(save3)
     box.is_in_txn()
     box.rollback_to_savepoint(save2)
     box.is_in_txn()
     box.rollback_to_savepoint(save1)
     box.is_in_txn()
     box.commit()
end

local stop_clock = clock.monotonic()
local work_time = stop_clock - start_clock
print("Result: " .. tostring(work_time))

s:drop()
os.exit()


BM2:

local os = require('os')
local clock = require('clock')

box.cfg()

local s = box.schema.space.create('test')
s:create_index('primary')


local start_clock = clock.monotonic()

for i = 1,1000000 do
     box.is_in_txn()
end

local stop_clock = clock.monotonic()
local work_time = stop_clock - start_clock
print("Result: " .. tostring(work_time))

s:drop()
os.exit()


BM3:

local os = require('os')
local clock = require('clock')

box.cfg()

local s = box.schema.space.create('test')
s:create_index('primary')


local start_clock = clock.monotonic()

for i = 1,1000000 do
     box.begin()
     box.commit()
end

local stop_clock = clock.monotonic()
local work_time = stop_clock - start_clock
print("Result: " .. tostring(work_time))

s:drop()
os.exit()


BM4:

local os = require('os')
local clock = require('clock')

box.cfg()

local s = box.schema.space.create('test')
s:create_index('primary')


local start_clock = clock.monotonic()

for i = 1,1000000 do
     box.begin()
     s:replace{1}
     local save1 = box.savepoint()
     s:replace{2}
     box.rollback_to_savepoint(save1)
     box.commit()
end

local stop_clock = clock.monotonic()
local work_time = stop_clock - start_clock
print("Result: " .. tostring(work_time))

s:drop()
os.exit()


Results:

BM	Sergo commit(s)	Leonid commit(s)	Result
bm1	26.7092		26.7897			Sergo commit win 0.3%
bm2	0.0023		0.0195			Sergo commit win x 8.5
bm3	0.1281		0.1502			Sergo commit win 17.2%
bm4	19.6567		19.8466			Sergo commit win 0.96%


On 11/28/19 9:36 PM, Igor Munkin wrote:
> Kostja,
> 
> On 28.11.19, Konstantin Osipov wrote:
>> * Igor Munkin <imun at tarantool.org> [19/11/28 17:08]:
>>
>>> Why should we be aiming at using FFI more? The root cause is that
>>> current fiber machinery (as well as some parts of triggers mechanism)
>>> doesn't respect the Lua coroutine switch semantics, thereby breaking
>>> trace recording. Lua-C API implicitly (or non-intentionally) prevents
>>> breakage by JIT trace aborts when recording FUNCC.
>>
>> It's not correct. The current FFI functions were carefully crafted
>> to never lead to sandwich code: only those functions which can not
>> trigger a return to Lua were implemented as FFI.
>> There was one regression between 1.10 and in 2.3 because we
>> started firing rollback triggers when rolling back to a savepoint,
>> which was spotted by a failing tests.
>>
>> One more time: When FFI bindings were written we were aware of NYI
>> and took it into account.
>>
> 
> OK, maybe I said it the wrong way using the word "non-intentionally". I
> mean that Tarantool doesn't use any special handler to asynchroniously
> abort recording, since there is no such outside the LuaJIT internals
> (jit.off blacklists the function regardless the host and guest stacks
> layout).
> 
>>> Therefore, I guess we should be aiming either at changing fiber
>>> switching to the one respecting the LuaJIT runtime or at tuning JIT
>>> compiler way more regarding the Lua-C usage.
>>
>> This is actually quite simple - we could easily call a LuaJIT hook
>> whenever switching a fiber, to make sure that it carefully
>> switches the internals as well. Mike Pall refused to cooperate on
>> the matter, but now we (you) control our own destiny.
> 
> Unfortunately, I haven't seen the thread where the subj is discussed
> with Mike Pall, but the approach you proposed doesn't seem to be a
> convenient one, however it still solves a problem (as does the move to
> use Lua-C API for the code with possible Lua VM re-entrance underneath).
> 
> The major flaw I see in this solution, is introducing the dependency on
> the JIT interface into Tarantool internals. There is already one
> dependency on LuaJIT-2.1.0 presented with internal headers usage for
> several hacks in utils.c. As a result we are not able to simply replace
> the Lua implementation to try another one (e.g. uJIT conforming
> LuaJIT-2.0.5) for comparing each other.
> 
> The best proposal we had with Kirill and Sergos is to finalize a trace
> exactly at CALLXS IR, however after some research I found that the
> snapshot to be replayed at the corresponding trace exit will restore the
> guest stack it doesn't relate to. I hope to make further research this
> direction, but it requires a way more time to adjust this behaviour and
> its benefits are doubtful for me now.
> 
> For now, there is a partial fix I mentioned before, however it still
> violates the flow I described here[1]. I'm going to proceed with the
> research a bit later and provide another patch.
> 
>>
>>> Besides, we can't fully prevent platform failures if there is an FFI
>>> misusage in users code.
>>
>> Tarantool has never been claiming that it prevents people from
> 
> Sorry, I simply misread the following:
> |> Why not dig it up to protect from future erosion of the code base?
> |>
> |> This would be more valuable contribution than just falling back to
> |> Lua/C for everything.
> 
>> shooting themselves in the foot. Performance is the ultimate
>> design goal, at the cost of safety at times.
>>
> 
> Great, we discussed with Leonid and Sasha offline and agreed to make
> several benchmarks to be provided in this thread. With no benchmarks all
> our estimates can be simply wrong.
> 
>>
>>>> What should be the rule of thumb in your opinion, ffi or
>>>> lua/c?
>>>
>>> If you want to know my rule of thumb: FFI is for external existing
>>> libraries to be used in Lua code (and all compiler related benefits are
>>> nothing more than a godsend consequence, since all guest stack
>>> manipulations are implemented in LuaJIT runtime, not in an external
>>> code) and Lua-C is a well-designed and well-documented API for embedding
>>> Lua into a host application / extending Lua with external low-lewel
>>> libs. I totally do not insist on my point of view, since everyone has
>>> it's own vision on LuaJIT features.
>>
>> OK, but there must be a single policy though. So far it was:
>> everything that doesn't yield and doesn't call back to Lua
>> uses FFI. Everything else *has* to use Lua/C API, UNTIL  there
>> is a way to safely sandwich FFI calls.
>>
> 
> I agree with you for the policy existence, but we all see the one you
> mentioned above can introduce bugs leading to a platform failures. So I
> guess we should reconsider it or simply dump somewhere. I think we have
> to make some benchmarks and provide not only stats, but also a
> reproducer with the input data, otherwise JIT tests are IMHO irrelevant.
> 
>> -- 
>> Konstantin Osipov, Moscow, Russia
> 
> [1]: https://github.com/tarantool/tarantool/issues/4427#issuecomment-546056302
> 


More information about the Tarantool-patches mailing list