[Tarantool-patches] [PATCH] Move txn from schema to a separate module (use C API instead of FFI)
Leonid Vasiliev
lvasiliev at tarantool.org
Thu Dec 5 11:27:29 MSK 2019
On 12/4/19 4:15 PM, Konstantin Osipov wrote:
> * Leonid Vasiliev <lvasiliev at tarantool.org> [19/12/04 16:08]:
>
> Leonid, thank you for the benchmark.
>
> Could you please also provide
> - explanation for the results (not everyone may understand why the
> results are the way they are)
Results explanation:
On the lvasiliev/gh-4427-move-some-stuff-from-ffi-to-c-api branch,
box.begin(), box.is_in_txn() and box.savepoint() use the C API.
On the sergos/rollback_to_savepoint branch,
box.begin(), box.is_in_txn() and box.savepoint() use LuaJIT's FFI.
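For readers less familiar with the two binding styles, here is a minimal
sketch of the FFI variant. It assumes it is run inside Tarantool, whose
public C API exports box_txn(); the wrapper name is_in_txn_ffi is made up
for illustration and is not the name used in either branch. With the C API
variant the same function is instead registered as a Lua/C function, which
LuaJIT sees as an opaque FUNCC call rather than something it can record
into a trace.

```lua
-- Sketch only: must run inside Tarantool, which exports box_txn().
local ffi = require('ffi')

-- Declaration of the public C API function that reports whether
-- the current fiber has an active transaction.
ffi.cdef[[
bool box_txn(void);
]]

-- Hypothetical wrapper name, for illustration only.
local function is_in_txn_ffi()
    -- A C bool result is converted to a Lua boolean by the FFI.
    return ffi.C.box_txn()
end

print(is_in_txn_ffi())
```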
> - your analysis of the results and suggestions for further
> actions?
Look, I'll be honest with you.
I don't have enough expertise in Lua performance. As I understand
it, each of these methods has its pros and cons. My point of view (and
Igor's, if I have understood him correctly) is: "We need some policy on
using FFI vs. C-API". And I am trying to push for one.
>
>> So, several benchmarks (sergos/rollback_to_savepoint -
>> lvasiliev/gh-4427-move-some-stuff-from-ffi-to-c-api):
>>
>> BM1:
>>
>> local os = require('os')
>> local clock = require('clock')
>>
>> box.cfg()
>>
>> local s = box.schema.space.create('test')
>> s:create_index('primary')
>>
>>
>> local start_clock = clock.monotonic()
>>
>> for i = 1,1000000 do
>>     box.begin()
>>     s:replace{1}
>>     box.is_in_txn()
>>     local save1 = box.savepoint()
>>     s:replace{2}
>>     box.is_in_txn()
>>     local save2 = box.savepoint()
>>     s:replace{3}
>>     box.is_in_txn()
>>     local save3 = box.savepoint()
>>     s:replace{4}
>>     box.is_in_txn()
>>     box.rollback_to_savepoint(save3)
>>     box.is_in_txn()
>>     box.rollback_to_savepoint(save2)
>>     box.is_in_txn()
>>     box.rollback_to_savepoint(save1)
>>     box.is_in_txn()
>>     box.commit()
>> end
>>
>> local stop_clock = clock.monotonic()
>> local work_time = stop_clock - start_clock
>> print("Result: " .. tostring(work_time))
>>
>> s:drop()
>> os.exit()
>>
>>
>> BM2:
>>
>> local os = require('os')
>> local clock = require('clock')
>>
>> box.cfg()
>>
>> local s = box.schema.space.create('test')
>> s:create_index('primary')
>>
>>
>> local start_clock = clock.monotonic()
>>
>> for i = 1,1000000 do
>>     box.is_in_txn()
>> end
>>
>> local stop_clock = clock.monotonic()
>> local work_time = stop_clock - start_clock
>> print("Result: " .. tostring(work_time))
>>
>> s:drop()
>> os.exit()
>>
>>
>> BM3:
>>
>> local os = require('os')
>> local clock = require('clock')
>>
>> box.cfg()
>>
>> local s = box.schema.space.create('test')
>> s:create_index('primary')
>>
>>
>> local start_clock = clock.monotonic()
>>
>> for i = 1,1000000 do
>>     box.begin()
>>     box.commit()
>> end
>>
>> local stop_clock = clock.monotonic()
>> local work_time = stop_clock - start_clock
>> print("Result: " .. tostring(work_time))
>>
>> s:drop()
>> os.exit()
>>
>>
>> BM4:
>>
>> local os = require('os')
>> local clock = require('clock')
>>
>> box.cfg()
>>
>> local s = box.schema.space.create('test')
>> s:create_index('primary')
>>
>>
>> local start_clock = clock.monotonic()
>>
>> for i = 1,1000000 do
>>     box.begin()
>>     s:replace{1}
>>     local save1 = box.savepoint()
>>     s:replace{2}
>>     box.rollback_to_savepoint(save1)
>>     box.commit()
>> end
>>
>> local stop_clock = clock.monotonic()
>> local work_time = stop_clock - start_clock
>> print("Result: " .. tostring(work_time))
>>
>> s:drop()
>> os.exit()
>>
>>
>> Results:
>>
>> BM    Sergo commit (s)   Leonid commit (s)   Result
>> bm1   26.7092            26.7897             Sergo commit wins by 0.3%
>> bm2   0.0023             0.0195              Sergo commit wins by x8.5
>> bm3   0.1281             0.1502              Sergo commit wins by 17.2%
>> bm4   19.6567            19.8466             Sergo commit wins by 0.96%
>>
>>
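The Result column above is the relative difference between the two
timings. The plain-Lua sketch below recomputes it from the table values so
the arithmetic can be checked; the helper name relative_slowdown and the
field names ffi/capi are made up for illustration.

```lua
-- Timings in seconds, copied from the results table above.
-- ffi  = sergos/rollback_to_savepoint (FFI)
-- capi = lvasiliev/gh-4427-move-some-stuff-from-ffi-to-c-api (C API)
local results = {
    {name = 'bm1', ffi = 26.7092, capi = 26.7897},
    {name = 'bm2', ffi = 0.0023,  capi = 0.0195},
    {name = 'bm3', ffi = 0.1281,  capi = 0.1502},
    {name = 'bm4', ffi = 19.6567, capi = 19.8466},
}

-- Hypothetical helper: how much slower the C API run is,
-- as a fraction of the FFI run.
local function relative_slowdown(ffi_time, capi_time)
    return (capi_time - ffi_time) / ffi_time
end

for _, r in ipairs(results) do
    print(string.format('%s: C API slower by %.2f%% (ratio x%.2f)',
        r.name, relative_slowdown(r.ffi, r.capi) * 100, r.capi / r.ffi))
end
```

In absolute terms the bm2 gap is 0.0172 s over 1,000,000 calls, i.e. about
17 ns per box.is_in_txn() call, and the bm3 gap is about 22 ns per
begin/commit pair.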
>> On 11/28/19 9:36 PM, Igor Munkin wrote:
>>> Kostja,
>>>
>>> On 28.11.19, Konstantin Osipov wrote:
>>>> * Igor Munkin <imun at tarantool.org> [19/11/28 17:08]:
>>>>
>>>>> Why should we be aiming at using FFI more? The root cause is that the
>>>>> current fiber machinery (as well as some parts of the trigger mechanism)
>>>>> doesn't respect the Lua coroutine switch semantics, thereby breaking
>>>>> trace recording. The Lua-C API implicitly (or non-intentionally) prevents
>>>>> this breakage, because the JIT aborts a trace when it records a FUNCC.
>>>>
>>>> It's not correct. The current FFI functions were carefully crafted
>>>> to never lead to sandwich code: only those functions which can not
>>>> trigger a return to Lua were implemented as FFI.
>>>> There was one regression between 1.10 and 2.3 because we
>>>> started firing rollback triggers when rolling back to a savepoint,
>>>> which was spotted by a failing test.
>>>>
>>>> One more time: When FFI bindings were written we were aware of NYI
>>>> and took it into account.
>>>>
>>>
>>> OK, maybe I said it the wrong way using the word "non-intentionally". I
>>> mean that Tarantool doesn't use any special handler to asynchronously
>>> abort recording, since there is no such handler outside the LuaJIT
>>> internals (jit.off blacklists the function regardless of the host and
>>> guest stack layout).
>>>
>>>>> Therefore, I guess we should be aiming either at changing fiber
>>>>> switching to one that respects the LuaJIT runtime or at tuning the JIT
>>>>> compiler much further with regard to Lua-C usage.
>>>>
>>>> This is actually quite simple - we could easily call a LuaJIT hook
>>>> whenever switching a fiber, to make sure that it carefully
>>>> switches the internals as well. Mike Pall refused to cooperate on
>>>> the matter, but now we (you) control our own destiny.
>>>
>>> Unfortunately, I haven't seen the thread where the subj is discussed
>>> with Mike Pall, but the approach you proposed doesn't seem to be a
>>> convenient one, however it still solves a problem (as does the move to
>>> use Lua-C API for the code with possible Lua VM re-entrance underneath).
>>>
>>> The major flaw I see in this solution is that it introduces a dependency
>>> on the JIT interface into Tarantool internals. There is already one
>>> dependency on LuaJIT-2.1.0, introduced by the use of internal headers for
>>> several hacks in utils.c. As a result we are not able to simply replace
>>> the Lua implementation with another one (e.g. uJIT, which conforms to
>>> LuaJIT-2.0.5) to compare them.
>>>
>>> The best proposal we had with Kirill and Sergos is to finalize a trace
>>> exactly at CALLXS IR, however after some research I found that the
>>> snapshot to be replayed at the corresponding trace exit will restore a
>>> guest stack it doesn't relate to. I hope to do further research in this
>>> direction, but it requires way more time to adjust this behaviour and
>>> its benefits are doubtful to me now.
>>>
>>> For now, there is a partial fix I mentioned before, however it still
>>> violates the flow I described here[1]. I'm going to proceed with the
>>> research a bit later and provide another patch.
>>>
>>>>
>>>>> Besides, we can't fully prevent platform failures if there is FFI
>>>>> misuse in user code.
>>>>
>>>> Tarantool has never been claiming that it prevents people from
>>>
>>> Sorry, I simply misread the following:
>>> |> Why not dig it up to protect from future erosion of the code base?
>>> |>
>>> |> This would be more valuable contribution than just falling back to
>>> |> Lua/C for everything.
>>>
>>>> shooting themselves in the foot. Performance is the ultimate
>>>> design goal, at the cost of safety at times.
>>>>
>>>
>>> Great, we discussed with Leonid and Sasha offline and agreed to make
>>> several benchmarks to be provided in this thread. With no benchmarks all
>>> our estimates can be simply wrong.
>>>
>>>>
>>>>>> What should be the rule of thumb in your opinion, ffi or
>>>>>> lua/c?
>>>>>
>>>>> If you want to know my rule of thumb: FFI is for external existing
>>>>> libraries to be used in Lua code (and all compiler related benefits are
>>>>> nothing more than a godsend consequence, since all guest stack
>>>>> manipulations are implemented in LuaJIT runtime, not in an external
>>>>> code) and Lua-C is a well-designed and well-documented API for embedding
>>>>> Lua into a host application / extending Lua with external low-level
>>>>> libs. I totally do not insist on my point of view, since everyone has
>>>>> their own vision of LuaJIT features.
>>>>
>>>> OK, but there must be a single policy though. So far it was:
>>>> everything that doesn't yield and doesn't call back to Lua
>>>> uses FFI. Everything else *has* to use Lua/C API, UNTIL there
>>>> is a way to safely sandwich FFI calls.
>>>>
>>>
>>> I agree with you that a policy must exist, but we all see that the one you
>>> mentioned above can introduce bugs leading to platform failures. So I
>>> guess we should reconsider it or simply dump it somewhere. I think we have
>>> to make some benchmarks and provide not only stats, but also a
>>> reproducer with the input data, otherwise JIT tests are IMHO irrelevant.
>>>
>>>> --
>>>> Konstantin Osipov, Moscow, Russia
>>>
>>> [1]: https://github.com/tarantool/tarantool/issues/4427#issuecomment-546056302
>>>
>
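On the request in the quoted thread for reproducible numbers: a single run
of each benchmark is sensitive to noise, so one option is to repeat every
measurement and keep the fastest run. A minimal plain-Lua sketch follows;
best_of and workload are hypothetical names, and os.clock() (CPU time)
stands in for Tarantool's clock.monotonic() used in the benchmarks above so
that the snippet runs in any Lua interpreter.

```lua
-- Hypothetical helper: run fn() `repeats` times and return the
-- fastest run, in seconds of CPU time.
local function best_of(repeats, fn)
    local best = math.huge
    for _ = 1, repeats do
        local start = os.clock()
        fn()
        local elapsed = os.clock() - start
        if elapsed < best then
            best = elapsed
        end
    end
    return best
end

-- Placeholder workload standing in for a benchmark loop body.
local function workload()
    local sum = 0
    for i = 1, 1000000 do
        sum = sum + i
    end
    return sum
end

print(string.format('Result: %.4f', best_of(5, workload)))
```

Inside Tarantool the workload would be one of the BM1..BM4 loops, and the
timer could be swapped back to clock.monotonic().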