> On 31 Oct 2018, at 12:18, Vladislav Shpilevoy wrote:
> On 31/10/2018 02:08, n.pettik wrote:
>>>>> But SQL wants to use some transactional data after commit. It is
>>>>> autogenerated identifiers - a list of sequence values generated
>>>>> for autoincrement columns and explicit sequence:next() calls.
>>>>>
>>>>> It is possible to store the list on malloced mem inside Vdbe, but
>>>>> it complicates deallocation.
>>>> What is the problem with deallocation? AFAIU it is enough to
>>>> simply iterate over the list and release each element - not big deal.
>>>> If you want to use region, mb it is worth to store separate region
>>>> specially for VDBE? We already have it in parser, so what prevents
>>>> us for adding the same thing to VDBE? I guess we can store many
>>>> things there, not only list of ids. I understand that parser in its turn
>>>> has nothing in common (at least it should, except for analyze machinery)
>>>> with transaction routines, so separate region is likely to be more
>>>> reasonable for parser, but anyway...
>>>
>>> I've decided to say more details. Parser never yields. This is why we can
>>> waste here any resources, rack and ruin everything, but at the end of
>>> parsing it should be returned back.
>>>
>>> Vdbe, on the contrary, yields. So it holds some system resources while
>>> other fibers can not use them. If we added a special region to Vdbe, it
>>> would steal slabs from the thread's slab cache, while other fibers may
>>> want to use it. Hence, when we use one region for all transactional data,
>>> including language specific, allocations are much less fragmented over
>>> different slabs.
>>>
>>> Is this explanation decent?
>> Quite. I thought that used slabs are marked somehow so that different
>> fibers’ regions can’t rely on the same chunk. Probably, I misunderstood
>> how internals of our allocation system work. I would better ask you f2f
>> someday (or read again Konstantin’s article). Anyway, thanks.
>>>
>>> Also, I do not agree, that 'deallocation is just iteration and it is
>>> ok'. It is O(n) iteration and freeing of heap objects. If one inserted
>>> 10k rows with autogenerated ids, it would waste 10k heap fragments,
>>> 10k calls of malloc/free - in my opinion it is an abysmal overhead, but
>>> what is more, it can be avoided for free. Instead of 10k free() it boils
>>> down to deallocation of N slabs, where N = slab_size / (10k * 8); 8 - size
>>> of autogenerated id; slab size is at least 64Kb, so N = 64*1024/80000 ~ 1.
>>> It takes 1 deallocation vs 10k deallocations. So I think this refactoring
>>> is worth it.
>> Very impressive calculations, however:
>> a. I doubt that smb extensively uses queries like
>> INSERT INTO t VALUES (NULL, ..), *10k repeats*, (NULL, ..)’
>> *Ok, neither I nor you know which queries users execute (or will execute),
>> but anyway your example looks too synthetic.*
>> b. Nothing prevents us from counting number of NULLs right in parser
>> and allocate memory as single array (one malloc). In this case it would
>> be more efficient, I guess, since you don’t need that machinery connected
>> with linked list. Btw, why didn’t you consider this variant?
>
> INSERT INTO ... SELECT FROM ...;
>
> Here you can not calculate. Also, it is not possible to calculate
> autoids from triggers, box Lua functions. So a list is the only
> variant.

Ok, now I see. Then patch LGTM.
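
For the archive, here is a rough sketch of the deallocation argument from the
thread: a linked list of ids built with one malloc() per node versus the same
list carved out of slab-sized chunks. Everything named toy_* below is
hypothetical and is not the real region API; the 64 KB slab size is the figure
quoted above, and the node size includes the list pointer, so it is larger
than the 8 bytes per id used in the estimate. The number of slabs is the total
number of bytes divided by the usable slab size, rounded up.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

enum { SLAB_SIZE = 64 * 1024 }; /* assumed slab size, taken from the thread */

/* Hypothetical stand-in for a region: a chain of fixed-size slabs. */
struct toy_slab {
	struct toy_slab *next;
	size_t used;
	char data[];
};

struct toy_region {
	struct toy_slab *slabs;
};

/* One list node per autogenerated id. */
struct id_node {
	struct id_node *next;
	int64_t id;
};

/* Bump-allocate from the newest slab; take a new slab when it is full. */
static void *
toy_region_alloc(struct toy_region *r, size_t size)
{
	size_t cap = SLAB_SIZE - sizeof(struct toy_slab);
	assert(size <= cap);
	size = (size + 7) & ~(size_t)7; /* keep 8-byte alignment */
	if (r->slabs == NULL || r->slabs->used + size > cap) {
		struct toy_slab *s = malloc(SLAB_SIZE);
		if (s == NULL)
			return NULL;
		s->next = r->slabs;
		s->used = 0;
		r->slabs = s;
	}
	void *p = r->slabs->data + r->slabs->used;
	r->slabs->used += size;
	return p;
}

/* Release all slabs; returns the number of free() calls made. */
static size_t
toy_region_free(struct toy_region *r)
{
	size_t frees = 0;
	while (r->slabs != NULL) {
		struct toy_slab *next = r->slabs->next;
		free(r->slabs);
		r->slabs = next;
		frees++;
	}
	return frees;
}

/* Heap variant: one malloc() and one free() per id. */
static size_t
heap_list_roundtrip(size_t count)
{
	struct id_node *head = NULL;
	for (size_t i = 0; i < count; i++) {
		struct id_node *n = malloc(sizeof(*n));
		if (n == NULL)
			break;
		n->id = (int64_t)i;
		n->next = head;
		head = n;
	}
	size_t frees = 0;
	while (head != NULL) {
		struct id_node *next = head->next;
		free(head);
		head = next;
		frees++;
	}
	return frees;
}

/* Region variant: nodes are never freed one by one, only whole slabs. */
static size_t
region_list_roundtrip(size_t count)
{
	struct toy_region r = { NULL };
	struct id_node *head = NULL;
	for (size_t i = 0; i < count; i++) {
		struct id_node *n = toy_region_alloc(&r, sizeof(*n));
		if (n == NULL)
			break;
		n->id = (int64_t)i;
		n->next = head;
		head = n;
	}
	(void)head;
	return toy_region_free(&r);
}
```

With 10k ids, heap_list_roundtrip(10000) performs 10000 free() calls, while
region_list_roundtrip(10000) releases only a few slabs. The list still works
for the INSERT ... SELECT and trigger cases, since the count does not need to
be known up front.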