From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from localhost (localhost [127.0.0.1])
    by turing.freelists.org (Avenir Technologies Mail Multiplex)
    with ESMTP id 13A2E2F3BD for ; Wed, 31 Oct 2018 05:30:20 -0400 (EDT)
Received: from turing.freelists.org ([127.0.0.1])
    by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024)
    with ESMTP id 0-po9qPjFJbg for ; Wed, 31 Oct 2018 05:30:19 -0400 (EDT)
Received: from smtp48.i.mail.ru (smtp48.i.mail.ru [94.100.177.108])
    (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
    (No client certificate requested)
    by turing.freelists.org (Avenir Technologies Mail Multiplex)
    with ESMTPS id 9970F2EFD5 for ; Wed, 31 Oct 2018 05:30:19 -0400 (EDT)
From: "n.pettik"
Message-Id:
Content-Type: multipart/alternative; boundary="Apple-Mail=_D59DA289-AA2D-4D54-97B7-E922CECFE49C"
Mime-Version: 1.0 (Mac OS X Mail 12.0 \(3445.100.39\))
Subject: [tarantool-patches] Re: [PATCH v8 1/3] box: factor fiber_gc out of txn_commit
Date: Wed, 31 Oct 2018 12:30:09 +0300
In-Reply-To:
References: <4a7a178a-7632-4f1a-5b94-67ef886c784d@tarantool.org>
Sender: tarantool-patches-bounce@freelists.org
Errors-to: tarantool-patches-bounce@freelists.org
Reply-To: tarantool-patches@freelists.org
List-help:
List-unsubscribe:
List-software: Ecartis version 1.0.0
List-Id: tarantool-patches
List-subscribe:
List-owner:
List-post:
List-archive:
To: tarantool-patches@freelists.org
Cc: Vladislav Shpilevoy, Imeev Mergen

--Apple-Mail=_D59DA289-AA2D-4D54-97B7-E922CECFE49C
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=utf-8

> On 31 Oct 2018, at 12:18, Vladislav Shpilevoy <v.shpilevoy@tarantool.org> wrote:
> On 31/10/2018 02:08, n.pettik wrote:
>>>>> But SQL wants to use some transactional data after commit. It is
>>>>> autogenerated identifiers - a list of sequence values generated
>>>>> for autoincrement columns and explicit sequence:next() calls.
>>>>>
>>>>> It is possible to store the list on malloced mem inside Vdbe, but
>>>>> it complicates deallocation.
>>>> What is the problem with deallocation? AFAIU it is enough to
>>>> simply iterate over the list and release each element - not big deal.
>>>> If you want to use region, mb it is worth to store separate region
>>>> specially for VDBE? We already have it in parser, so what prevents
>>>> us from adding the same thing to VDBE? I guess we can store many
>>>> things there, not only list of ids. I understand that parser in its turn
>>>> has nothing in common (at least it should, except for analyze machinery)
>>>> with transaction routines, so separate region is likely to be more
>>>> reasonable for parser, but anyway...
>>>
>>> I've decided to say more details. Parser never yields. This is why we can
>>> waste here any resources, rack and ruin everything, but at the end of
>>> parsing it should be returned back.
>>>
>>> Vdbe, on the contrary, yields. So it holds some system resources while
>>> other fibers can not use them. If we added a special region to Vdbe, it
>>> would steal slabs from the thread's slab cache, while other fibers may
>>> want to use it. Hence, when we use one region for all transactional data,
>>> including language specific, allocations are much less fragmented over
>>> different slabs.
>>>
>>> Is this explanation decent?
>> Quite. I thought that used slabs are marked somehow so that different
>> fibers’ regions can’t rely on the same chunk. Probably, I misunderstood
>> how internals of our allocation system work. I would better ask you f2f
>> someday (or read again Konstantin’s article). Anyway, thanks.
>>>
>>> Also, I do not agree, that 'deallocation is just iteration and it is
>>> ok'. It is O(n) iteration and freeing of heap objects. If one inserted
>>> 10k rows with autogenerated ids, it would waste 10k heap fragments,
>>> 10k calls of malloc/free - in my opinion it is an abysmal overhead, but
>>> what is more, it can be avoided for free. Instead of 10k free() it boils
>>> down to deallocation of N slabs, where N = slab_size / (10k * 8); 8 - size
>>> of autogenerated id; slab size is at least 64Kb, so N = 64*1024/80000 < 1.
>>> It takes 1 deallocation vs 10k deallocations. So I think this refactoring
>>> is worth it.
>> Very impressive calculations, however:
>> a. I doubt that smb extensively uses queries like
>> INSERT INTO t VALUES (NULL, ..), *10k repeats*, (NULL, ..)’
>> *Ok, neither I nor you know which queries users execute (or will execute),
>> but anyway your example looks too synthetic.*
>> b. Nothing prevents us from counting number of NULLs right in parser
>> and allocate memory as single array (one malloc). In this case it would
>> be more efficient, I guess, since you don’t need that machinery connected
>> with linked list. Btw, why didn’t you consider this variant?
>
> INSERT INTO ... SELECT FROM ...;
>
> Here you can not calculate. Also, it is not possible to calculate
> autoids from triggers, box Lua functions. So a list is the only
> variant.

Ok, now I see. Then patch LGTM.

--Apple-Mail=_D59DA289-AA2D-4D54-97B7-E922CECFE49C
Content-Transfer-Encoding: quoted-printable
Content-Type: text/html; charset=utf-8

On 31 Oct 2018, at 12:18, Vladislav Shpilevoy <v.shpilevoy@tarantool.org> wrote:
On 31/10/2018 02:08, n.pettik wrote:
But SQL wants to use some transactional data after commit. It is
autogenerated identifiers - a list of sequence values generated
for autoincrement columns and explicit sequence:next() calls.

It is possible to store the list on malloced mem inside Vdbe, but
it complicates deallocation.
What is the problem with deallocation? AFAIU it is enough to
simply iterate over the list and release each element - not big deal.
If you want to use region, mb it is worth to store separate region
specially for VDBE? We already have it in parser, so what prevents
us from adding the same thing to VDBE? I guess we can store many
things there, not only list of ids. I understand that parser in its turn
has nothing in common (at least it should, except for analyze machinery)
with transaction routines, so separate region is likely to be more
reasonable for parser, but anyway...

I've decided to say more details. Parser never yields. This is why we can
waste here any resources, rack and ruin everything, but at the end of
parsing it should be returned back.

Vdbe, on the contrary, yields. So it holds some system resources while
other fibers can not use them. If we added a special region to Vdbe, it
would steal slabs from the thread's slab cache, while other fibers may
want to use it. Hence, when we use one region for all transactional data,
including language specific, allocations are much less fragmented over
different slabs.
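
The fragmentation argument can be illustrated with a rough count. This is only a sketch: the 64 KB slab size is the figure quoted later in this thread, the byte counts are made up, and `slabs_held` is a hypothetical helper, not part of Tarantool's allocator. It assumes each region takes whole slabs from the cache and keeps them until it is reset.

```python
import math

SLAB_SIZE = 64 * 1024  # assumed slab size in bytes (figure quoted in this thread)


def slabs_held(region_sizes, slab_size=SLAB_SIZE):
    """Count whole slabs pinned by a set of regions.

    Each entry in region_sizes is the number of bytes one region holds.
    A non-empty region is assumed to pin at least one whole slab.
    """
    return sum(math.ceil(size / slab_size) for size in region_sizes if size > 0)


# Hypothetical workload: 10 KB of transaction data plus 5 KB of SQL data.
txn_bytes, sql_bytes = 10 * 1024, 5 * 1024

separate = slabs_held([txn_bytes, sql_bytes])  # Vdbe keeps its own region
shared = slabs_held([txn_bytes + sql_bytes])   # one region for everything

print(separate, shared)
```

With these made-up numbers the separate layout keeps two slabs busy across a yield, while the shared region keeps one.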

Is this explanation decent?
Quite. I thought that used slabs are marked somehow so that different
fibers’ regions can’t rely on the same chunk. Probably, I misunderstood
how internals of our allocation system work. I would better ask you f2f
someday (or read again Konstantin’s article). Anyway, thanks.

Also, I do not agree, that 'deallocation is just iteration and it is
ok'. It is O(n) iteration and freeing of heap objects. If one inserted
10k rows with autogenerated ids, it would waste 10k heap fragments,
10k calls of malloc/free - in my opinion it is an abysmal overhead, but
what is more, it can be avoided for free. Instead of 10k free() it boils
down to deallocation of N slabs, where N = slab_size / (10k * 8); 8 - size
of autogenerated id; slab size is at least 64Kb, so N = 64*1024/80000 < 1.
It takes 1 deallocation vs 10k deallocations. So I think this refactoring
is worth it.
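
For reference, the estimate can be redone mechanically. This is a sketch using only the figures from the message above (10k ids, 8 bytes each, 64 KB slabs); it ignores slab headers and list-node overhead, and the variable names are illustrative. Taking the slab count as bytes needed divided by slab size, rounded up, the ids span two minimal slabs, so the comparison is roughly two slab releases against 10k free() calls.

```python
import math

ROWS = 10_000            # rows inserted, from the message
ID_SIZE = 8              # bytes per autogenerated id, from the message
SLAB_SIZE = 64 * 1024    # minimal slab size in bytes, from the message

bytes_needed = ROWS * ID_SIZE                         # total bytes for all ids
heap_frees = ROWS                                     # one free() per malloc'ed element
slab_releases = math.ceil(bytes_needed / SLAB_SIZE)   # whole slabs covering the ids

print(heap_frees, slab_releases)
```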
Very impressive calculations, however:
a. I doubt that smb extensively uses queries like
INSERT INTO t VALUES (NULL, ..), *10k repeats*, (NULL, ..)’
*Ok, neither I nor you know which queries users execute (or will execute),
but anyway your example looks too synthetic.*
b. Nothing prevents us from counting number of NULLs right in parser
and allocate memory as single array (one malloc). In this case it would
be more efficient, I guess, since you don’t need that machinery connected
with linked list. Btw, why didn’t you consider this variant?

INSERT INTO ... SELECT FROM ...;

Here you can not calculate. Also, it is not possible to calculate
autoids from triggers, box Lua functions. So a list is the only
variant.
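
The point about not knowing the count up front can be sketched as follows. `ToyRegion` is a hypothetical stand-in written for this note, not Tarantool's region API: it hands out fixed-size cells from 64 KB slabs and frees everything by dropping the slabs. Ids are appended one at a time, as an INSERT ... SELECT or a trigger would produce them, with no pre-counting in the parser.

```python
SLAB_SIZE = 64 * 1024  # assumed slab size in bytes
ID_SIZE = 8            # assumed size of one stored id in bytes


class ToyRegion:
    """Hypothetical bump allocator: append-only, released all at once."""

    def __init__(self):
        self.slabs = []                      # each slab is a list of stored ids
        self.per_slab = SLAB_SIZE // ID_SIZE

    def append(self, value):
        # Take a new slab only when the current one is full.
        if not self.slabs or len(self.slabs[-1]) == self.per_slab:
            self.slabs.append([])
        self.slabs[-1].append(value)

    def __iter__(self):
        for slab in self.slabs:
            yield from slab

    def reset(self):
        """Release every slab and return how many were released."""
        released = len(self.slabs)
        self.slabs.clear()
        return released


region = ToyRegion()
# The row count is only known at run time, e.g. for INSERT ... SELECT.
for generated_id in range(1, 10_001):
    region.append(generated_id)

ids = list(region)         # read the ids after the statement finishes
released = region.reset()  # walks slabs, not individual elements
print(len(ids), released)
```

The list grows as rows arrive, and cleanup cost depends on the number of slabs rather than the number of ids.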

Ok, now I see. Then patch LGTM.

--Apple-Mail=_D59DA289-AA2D-4D54-97B7-E922CECFE49C--