From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org> To: "n.pettik" <korablev@tarantool.org>, tarantool-patches@freelists.org Cc: Imeev Mergen <imeevma@tarantool.org> Subject: [tarantool-patches] Re: [PATCH v8 1/3] box: factor fiber_gc out of txn_commit Date: Wed, 31 Oct 2018 12:18:53 +0300 [thread overview] Message-ID: <b2c3edf4-7a4f-c40e-ee60-ee1f188aea3c@tarantool.org> (raw) In-Reply-To: <F66C06EC-E7F7-4608-8483-AAC0C0E416C8@tarantool.org> On 31/10/2018 02:08, n.pettik wrote: > >>>> But SQL wants to use some transactional data after commit. It is >>>> autogenerated identifiers - a list of sequence values generated >>>> for autoincrement columns and explicit sequence:next() calls. >>>> >>>> It is possible to store the list on malloced mem inside Vdbe, but >>>> it complicates deallocation. >>> What is the problem with deallocation? AFAIU it is enough to >>> simply iterate over the list and release each element - not big deal. >>> If you want to use region, mb it is worth to store separate region >>> specially for VDBE? We already have it in parser, so what prevents >>> us for adding the same thing to VDBE? I guess we can store many >>> things there, not only list of ids. I understand that parser in its turn >>> has nothing in common (at least it should, except for analyze machinery) >>> with transaction routines, so separate region is likely to be more >>> reasonable for parser, but anyway... >> >> I've decided to say more details. Parser never yields. This is why we can >> waste here any resources, rack and ruin everything, but at the end of >> parsing it should be returned back. >> >> Vdbe, on the contrary, yields. So it holds some system resources while >> other fibers can not use them. If we added a special region to Vdbe, it >> would steal slabs from the thread's slab cache, while other fibers may >> want to use it. Hence, when we use one region for all transactional data, >> including language specific, allocations are much less fragmented over >> different slabs. >> >> Is this explanation decent? > > Quite. I thought that used slabs are marked somehow so that different > fibers’ regions can’t rely on the same chunk. Probably, I misunderstood > how internals of our allocation system work. I would better ask you f2f > someday (or read again Konstantin’s article). Anyway, thanks. > >> >> Also, I do not agree, that 'deallocation is just iteration and it is >> ok'. It is O(n) iteration and freeing of heap objects. If a one inserted >> 10k rows with autogenerated ids, it would waste 10k heap fragments, >> 10k calls of malloc/free - in my opinion it is an abysmal overhead, but >> what is more, it can be avoided for free. Instead of 10k free() it boils >> down to deallocation of N slabs, where N = slab_size / (10k * 8); 8 - size >> of autogenerated it; slab size is at least 64Kb, so N = 64*1024/80000 <tel:1024/80000>< 1. >> It takes 1 deallocation vs 10k deallocations. So I think this refactoring >> is worth. > > Very impressive calculations, however: > > a. I doubt that smb extensively uses queries like > INSERT INTO t VALUES (NULL, ..), *10k repeats*, (NULL, ..)’ > *Ok, neither I nor you know which queries users execute (or will execute), > but anyway your example looks too synthetic.* > > b. Nothing prevents us from counting number of NULLs right in parser > and allocate memory as single array (one malloc). In this case it would > be more efficient, I guess, since you don’t need that machinery connected > with linked list. Btw, why didn’t you consider this variant? INSERT INTO ... SELECT FROM ...; Here you can not calculate. Also, it is not possible to calculate autoids from triggers, box Lua functions. So a list is the only variant. >
next prev parent reply other threads:[~2018-10-31 9:18 UTC|newest] Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top 2018-10-29 17:33 [tarantool-patches] [PATCH v8 0/3] sql: return all generated ids via IPROTO imeevma 2018-10-29 17:33 ` [tarantool-patches] [PATCH v8 1/3] box: factor fiber_gc out of txn_commit imeevma 2018-10-30 14:30 ` [tarantool-patches] " n.pettik 2018-10-30 19:15 ` Vladislav Shpilevoy 2018-10-30 20:03 ` Vladislav Shpilevoy 2018-10-30 20:06 ` Vladislav Shpilevoy 2018-10-30 21:32 ` Vladislav Shpilevoy 2018-10-30 23:08 ` n.pettik 2018-10-31 9:18 ` Vladislav Shpilevoy [this message] 2018-10-31 9:30 ` n.pettik 2018-10-29 17:33 ` [tarantool-patches] [PATCH v8 2/3] sql: return all generated ids via IPROTO imeevma 2018-10-30 14:30 ` [tarantool-patches] " n.pettik 2018-11-02 18:52 ` Imeev Mergen 2018-11-09 7:51 ` n.pettik 2018-10-29 17:33 ` [tarantool-patches] [PATCH v8 3/3] sql: remove psql_txn from Vdbe imeevma 2018-10-30 14:30 ` [tarantool-patches] " n.pettik 2018-10-30 19:41 ` Vladislav Shpilevoy 2018-11-09 8:02 ` [tarantool-patches] Re: [PATCH v8 0/3] sql: return all generated ids via IPROTO Kirill Yukhin
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=b2c3edf4-7a4f-c40e-ee60-ee1f188aea3c@tarantool.org \ --to=v.shpilevoy@tarantool.org \ --cc=imeevma@tarantool.org \ --cc=korablev@tarantool.org \ --cc=tarantool-patches@freelists.org \ --subject='[tarantool-patches] Re: [PATCH v8 1/3] box: factor fiber_gc out of txn_commit' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox