But SQL wants to use some transactional data after commit: the
autogenerated identifiers, i.e. a list of sequence values generated
for autoincrement columns and explicit sequence:next() calls.
It is possible to store the list in malloced memory inside the Vdbe,
but that complicates deallocation.
What is the problem with deallocation? AFAIU it is enough to
simply iterate over the list and release each element - not a big deal.
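I mean, the whole thing would look roughly like this (the struct and
function names are made up, not the actual code) - a plain loop,
nothing special:

    #include <stdint.h>
    #include <stdlib.h>

    /* One autogenerated identifier, heap-allocated per row. */
    struct autoinc_id_entry {
        struct autoinc_id_entry *next;
        int64_t id;
    };

    /* "Just iterate and free": an O(n) walk, one free() per row. */
    static void
    autoinc_id_list_free(struct autoinc_id_entry *head)
    {
        while (head != NULL) {
            struct autoinc_id_entry *next = head->next;
            free(head);
            head = next;
        }
    }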
If you want to use a region, maybe it is worth keeping a separate
region specifically for the VDBE? We already have one in the parser,
so what prevents us from adding the same thing to the VDBE? I guess we
could store many things there, not only the list of ids. I understand
that the parser, in its turn, has nothing in common with transaction
routines (at least it shouldn't, except for the ANALYZE machinery), so
a separate region is probably more reasonable for the parser, but
anyway...
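Something like the pattern the parser presumably already follows (a
sketch; the struct and function names are made up, I only assume the
region_create()/region_destroy() API from small and the thread's slab
cache via cord_slab_cache()):

    #include "fiber.h"
    #include "small/region.h"

    /* Hypothetical: a region owned by a single Vdbe. */
    struct vdbe_like {
        struct region region;
        /* ... the rest of the Vdbe ... */
    };

    static void
    vdbe_like_create(struct vdbe_like *v)
    {
        /* The region borrows slabs from the thread's slab cache. */
        region_create(&v->region, cord_slab_cache());
    }

    static void
    vdbe_like_destroy(struct vdbe_like *v)
    {
        /* One call returns every borrowed slab, no per-object frees. */
        region_destroy(&v->region);
    }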
Let me give more details. The parser never yields. That is why it can
waste any resources it likes and rack and ruin everything, as long as
all of it is given back at the end of parsing. The Vdbe, on the
contrary, yields, so it holds some system resources while other fibers
can not use them. If we added a special region to the Vdbe, it would
steal slabs from the thread's slab cache while other fibers may want
to use them. Hence, when we use one region for all transactional data,
including the language-specific part, the allocations are much less
fragmented over different slabs.

Is this explanation decent?
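To make it concrete, roughly this is what I mean by 'one region for
all transactional data' (only region_alloc() is the real small API
here; the entry struct and the way the transaction's region is reached
are placeholders, not the actual code):

    #include <stdint.h>
    #include "small/region.h"

    /* Same kind of list entry, but carved out of the shared
     * transaction region instead of the heap. */
    struct autoinc_id_entry {
        struct autoinc_id_entry *next;
        int64_t id;
    };

    static struct autoinc_id_entry *
    autoinc_id_save(struct region *txn_region, int64_t id,
                    struct autoinc_id_entry *head)
    {
        struct autoinc_id_entry *e = region_alloc(txn_region, sizeof(*e));
        if (e == NULL)
            return NULL;    /* OOM */
        e->id = id;
        e->next = head;
        /*
         * No free() counterpart: the entry dies together with the rest
         * of the transactional data when the region is truncated at
         * commit/rollback.
         */
        return e;
    }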
Quite. I thought that used slabs are marked somehow so that different
fibers’ regions can’t rely on the same chunk. Probably I misunderstood
how the internals of our allocation system work. I’d better ask you
f2f someday (or re-read Konstantin’s article). Anyway, thanks.
Also, I do not agree that 'deallocation is just iteration and it is
ok'. It is an O(n) iteration plus freeing of heap objects. If one
inserted 10k rows with autogenerated ids, it would waste 10k heap
fragments and 10k malloc/free calls - in my opinion that is an abysmal
overhead, and what is more, it can be avoided for free. With a region
it boils down to releasing a slab or two instead of 10k free() calls
(the arithmetic is spelled out below). That is a couple of
deallocations vs 10k deallocations. So I think this refactoring is
worth it.
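To spell the arithmetic out (the numbers are just the example above,
nothing from the real code):

    /*
     * 10000 ids * 8 bytes = 80000 bytes of payload. With 64 KiB slabs
     * that is ceil(80000 / 65536) = 2 slab releases instead of 10000
     * calls to free().
     */
    enum { N_IDS = 10000, ID_SIZE = 8, SLAB_SIZE = 64 * 1024 };
    int n_slabs = (N_IDS * ID_SIZE + SLAB_SIZE - 1) / SLAB_SIZE; /* == 2 */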
Very impressive calculations, however:
a. I doubt that somebody extensively uses queries like
'INSERT INTO t VALUES (NULL, ..), *10k repeats*, (NULL, ..)'.
*Ok, neither I nor you know which queries users execute (or will
execute), but anyway your example looks too synthetic.*
b. Nothing prevents us from counting the number of NULLs right in the
parser and allocating the memory as a single array (one malloc). In
this case it would be more efficient, I guess, since you don't need
the machinery connected with the linked list. Btw, why didn't you
consider this variant?
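Roughly what I have in mind for b (everything here is hypothetical,
just to show the shape of the single-allocation variant):

    #include <stdint.h>
    #include <stdlib.h>

    /*
     * Suppose the parser counted the autoincrement NULLs of the
     * statement (n_autoinc). Then one malloc covers every generated
     * id and cleanup is a single free(), no list machinery at all.
     */
    struct autoinc_ids {
        uint32_t count;    /* how many slots are filled so far */
        int64_t ids[];     /* n_autoinc slots follow */
    };

    static struct autoinc_ids *
    autoinc_ids_new(uint32_t n_autoinc)
    {
        struct autoinc_ids *a =
            malloc(sizeof(*a) + n_autoinc * sizeof(a->ids[0]));
        if (a != NULL)
            a->count = 0;
        return a;
    }

    /* During execution: a->ids[a->count++] = new_id; */
    /* Cleanup: a single free(a).                     */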