Tarantool development patches archive
 help / color / mirror / Atom feed
From: "n.pettik" <korablev@tarantool.org>
To: tarantool-patches@freelists.org
Cc: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>,
	Imeev Mergen <imeevma@tarantool.org>
Subject: [tarantool-patches] Re: [PATCH v8 1/3] box: factor fiber_gc out of txn_commit
Date: Wed, 31 Oct 2018 12:30:09 +0300	[thread overview]
Message-ID: <DB75EF28-4BF2-460A-8F05-7094750A3152@tarantool.org> (raw)
In-Reply-To: <b2c3edf4-7a4f-c40e-ee60-ee1f188aea3c@tarantool.org>

[-- Attachment #1: Type: text/plain, Size: 3548 bytes --]



> On 31 Oct 2018, at 12:18, Vladislav Shpilevoy <v.shpilevoy@tarantool.org> wrote:
> On 31/10/2018 02:08, n.pettik wrote:
>>>>> But SQL wants to use some transactional data after commit. It is
>>>>> autogenerated identifiers - a list of sequence values generated
>>>>> for autoincrement columns and explicit sequence:next() calls.
>>>>> 
>>>>> It is possible to store the list on malloced mem inside Vdbe, but
>>>>> it complicates deallocation.
>>>> What is the problem with deallocation? AFAIU it is enough to
>>>> simply iterate over the list and release each element - not big deal.
>>>> If you want to use region, mb it is worth to store separate region
>>>> specially for VDBE? We already have it in parser, so what prevents
>>>> us for adding the same thing to VDBE? I guess we can store many
>>>> things there, not only list of ids. I understand that parser in its turn
>>>> has nothing in common (at least it should, except for analyze machinery)
>>>> with transaction routines, so separate region is likely to be more
>>>> reasonable for parser, but anyway...
>>> 
>>> I've decided to say more details. Parser never yields. This is why we can
>>> waste here any resources, rack and ruin everything, but at the end of
>>> parsing it should be returned back.
>>> 
>>> Vdbe, on the contrary, yields. So it holds some system resources while
>>> other fibers can not use them. If we added a special region to Vdbe, it
>>> would steal slabs from the thread's slab cache, while other fibers may
>>> want to use it. Hence, when we use one region for all transactional data,
>>> including language specific, allocations are much less fragmented over
>>> different slabs.
>>> 
>>> Is this explanation decent?
>> Quite. I thought that used slabs are marked somehow so that different
>> fibers’ regions can’t rely on the same chunk. Probably, I misunderstood
>> how internals of our allocation system work. I would better ask you f2f
>> someday (or read again Konstantin’s article). Anyway, thanks.
>>> 
>>> Also, I do not agree, that 'deallocation is just iteration and it is
>>> ok'. It is O(n) iteration and freeing of heap objects. If a one inserted
>>> 10k rows with autogenerated ids, it would waste 10k heap fragments,
>>> 10k calls of malloc/free - in my opinion it is an abysmal overhead, but
>>> what is more, it can be avoided for free. Instead of 10k free() it boils
>>> down to deallocation of N slabs, where N = slab_size / (10k * 8); 8 - size
>>> of autogenerated it; slab size is at least 64Kb, so N = 64*1024/80000 <tel:1024/80000> <tel:1024/80000 <tel:1024/80000>>< 1.
>>> It takes 1 deallocation vs 10k deallocations. So I think this refactoring
>>> is worth.
>> Very impressive calculations, however:
>> a. I doubt that smb extensively uses queries like
>> INSERT INTO t VALUES (NULL, ..), *10k repeats*, (NULL, ..)’
>> *Ok, neither I nor you know which queries users execute (or will execute),
>>  but anyway your example looks too synthetic.*
>> b. Nothing prevents us from counting number of NULLs right in parser
>> and allocate memory as single array (one malloc). In this case it would
>> be more efficient, I guess, since you don’t need that machinery connected
>> with linked list. Btw, why didn’t you consider this variant?
> 
> INSERT INTO ... SELECT FROM ...;
> 
> Here you can not calculate. Also, it is not possible to calculate
> autoids from triggers, box Lua functions. So a list is the only
> variant.

Ok, now I see. Then patch LGTM.


[-- Attachment #2: Type: text/html, Size: 8905 bytes --]

  reply	other threads:[~2018-10-31  9:30 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-29 17:33 [tarantool-patches] [PATCH v8 0/3] sql: return all generated ids via IPROTO imeevma
2018-10-29 17:33 ` [tarantool-patches] [PATCH v8 1/3] box: factor fiber_gc out of txn_commit imeevma
2018-10-30 14:30   ` [tarantool-patches] " n.pettik
2018-10-30 19:15     ` Vladislav Shpilevoy
2018-10-30 20:03     ` Vladislav Shpilevoy
2018-10-30 20:06       ` Vladislav Shpilevoy
2018-10-30 21:32         ` Vladislav Shpilevoy
2018-10-30 23:08       ` n.pettik
2018-10-31  9:18         ` Vladislav Shpilevoy
2018-10-31  9:30           ` n.pettik [this message]
2018-10-29 17:33 ` [tarantool-patches] [PATCH v8 2/3] sql: return all generated ids via IPROTO imeevma
2018-10-30 14:30   ` [tarantool-patches] " n.pettik
2018-11-02 18:52     ` Imeev Mergen
2018-11-09  7:51       ` n.pettik
2018-10-29 17:33 ` [tarantool-patches] [PATCH v8 3/3] sql: remove psql_txn from Vdbe imeevma
2018-10-30 14:30   ` [tarantool-patches] " n.pettik
2018-10-30 19:41     ` Vladislav Shpilevoy
2018-11-09  8:02 ` [tarantool-patches] Re: [PATCH v8 0/3] sql: return all generated ids via IPROTO Kirill Yukhin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DB75EF28-4BF2-460A-8F05-7094750A3152@tarantool.org \
    --to=korablev@tarantool.org \
    --cc=imeevma@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --cc=v.shpilevoy@tarantool.org \
    --subject='[tarantool-patches] Re: [PATCH v8 1/3] box: factor fiber_gc out of txn_commit' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox