[tarantool-patches] Re: [PATCH v8 1/3] box: factor fiber_gc out of txn_commit

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Tue Oct 30 23:03:45 MSK 2018

Thanks for the review!

On 30/10/2018 17:30, n.pettik wrote:
>> On 29 Oct 2018, at 20:33, imeevma at tarantool.org wrote:
>> Now txn_commit is judge, jury and executioner. It both
>> commits or rollbacks data, and collects it calling fiber_gc,
>> which destroys the region.
> Nit: both commits and rollbacks.


>> But SQL wants to use some transactional data after commit. It is
>> autogenerated identifiers - a list of sequence values generated
>> for autoincrement columns and explicit sequence:next() calls.
>> It is possible to store the list on malloced mem inside Vdbe, but
>> it complicates deallocation.
> What is the problem with deallocation? AFAIU it is enough to
> simply iterate over the list and release each element - not big deal.
> If you want to use region, mb it is worth to store separate region
> specially for VDBE? We already have it in parser, so what prevents
> us for adding the same thing to VDBE? I guess we can store many
> things there, not only list of ids. I understand that parser in its turn
> has nothing in common (at least it should, except for analyze machinery)
> with transaction routines, so separate region is likely to be more
> reasonable for parser, but anyway...

Let me explain in more detail. The parser never yields. This is why it can
grab any resources here and make any mess it likes, as long as everything
is returned by the end of parsing.

Vdbe, on the contrary, yields. So it holds some system resources while
other fibers cannot use them. If we added a special region to Vdbe, it
would take slabs from the thread's slab cache that other fibers may
want to use. Hence, when we use one region for all transactional data,
including language-specific data, allocations are much less fragmented
over different slabs.
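
To put rough numbers on the fragmentation point, here is a toy model. It
assumes only that a region holding any data pins whole slabs until it is
freed. The toy_* names and the byte counts in the usage note are made up
for illustration; this is not the real slab cache API.

```c
#include <assert.h>
#include <stddef.h>

/* Whole slabs needed to hold `bytes`; an empty region pins nothing. */
static size_t
toy_slabs_for(size_t bytes, size_t slab_size)
{
	assert(slab_size > 0);
	return (bytes + slab_size - 1) / slab_size;
}

/* Each fiber keeps txn data and Vdbe data in one shared region. */
static size_t
toy_slabs_shared(size_t fibers, size_t txn_bytes, size_t vdbe_bytes,
		 size_t slab_size)
{
	return fibers * toy_slabs_for(txn_bytes + vdbe_bytes, slab_size);
}

/* Each fiber keeps a txn region plus a separate Vdbe region. */
static size_t
toy_slabs_separate(size_t fibers, size_t txn_bytes, size_t vdbe_bytes,
		   size_t slab_size)
{
	return fibers * (toy_slabs_for(txn_bytes, slab_size) +
			 toy_slabs_for(vdbe_bytes, slab_size));
}
```

For example, with 100 yielded fibers, 10Kb of transactional data and 10Kb
of Vdbe data each, and 64Kb slabs, toy_slabs_shared() gives 100 slabs
while toy_slabs_separate() gives 200, each of them mostly empty.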

Is this explanation decent?

Also, I do not agree that 'deallocation is just iteration and it is
ok'. It is an O(n) iteration that frees heap objects one by one. If someone
inserted 10k rows with autogenerated ids, it would cost 10k heap fragments
and 10k calls of malloc/free - in my opinion it is an abysmal overhead, and,
what is more, it can be avoided for free. With the region, instead of 10k
free() calls it boils down to deallocation of N slabs, where
N = ceil(10k * 8 / slab_size); 8 - size of an autogenerated id in bytes;
slab size is at least 64Kb, so N <= ceil(80000 / (64*1024)) = 2.
It takes at most 2 deallocations vs 10k deallocations. So I think this
refactoring is worth it.
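
To make the slab arithmetic concrete, here is a minimal sketch of a bump
allocator that counts how many free() calls are needed to release a list
of 8-byte ids. It is a toy, not the real region from the small library:
every toy_* name is hypothetical, and slab headers and alignment padding
are ignored when counting payload bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One heap block holding many small allocations. */
struct toy_slab {
	struct toy_slab *next;
	size_t used;
	char data[];
};

struct toy_region {
	struct toy_slab *head;
	size_t slab_size; /* payload bytes per slab */
};

static void
toy_region_create(struct toy_region *r, size_t slab_size)
{
	assert(slab_size > 0);
	r->head = NULL;
	r->slab_size = slab_size;
}

/* Bump-allocate `size` bytes, taking a new slab when the head is full. */
static void *
toy_region_alloc(struct toy_region *r, size_t size)
{
	assert(size <= r->slab_size);
	if (r->head == NULL || r->slab_size - r->head->used < size) {
		struct toy_slab *s = malloc(sizeof(*s) + r->slab_size);
		if (s == NULL)
			return NULL;
		s->used = 0;
		s->next = r->head;
		r->head = s;
	}
	void *p = r->head->data + r->head->used;
	r->head->used += size;
	return p;
}

/* Release every slab; return the number of free() calls made. */
static size_t
toy_region_free(struct toy_region *r)
{
	size_t calls = 0;
	while (r->head != NULL) {
		struct toy_slab *next = r->head->next;
		free(r->head);
		r->head = next;
		calls++;
	}
	return calls;
}

/* Store `count` 8-byte ids in a toy region, then count the frees. */
static size_t
toy_free_calls_for_ids(size_t count, size_t slab_size)
{
	struct toy_region r;
	toy_region_create(&r, slab_size);
	for (size_t i = 0; i < count; i++) {
		int64_t *id = toy_region_alloc(&r, sizeof(*id));
		if (id == NULL)
			abort();
		*id = (int64_t)i;
	}
	return toy_region_free(&r);
}
```

With 10k ids and 64Kb slabs, toy_free_calls_for_ids(10000, 64 * 1024)
needs 2 free() calls, while a malloced list needs one free() per id,
that is 10k of them.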
