Tarantool development patches archive
 help / color / mirror / Atom feed
From: Nikita Pettik <korablev@tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH] region: do not rotate slabs in case of single slab purification
Date: Fri, 7 Feb 2020 16:58:59 +0300	[thread overview]
Message-ID: <20200207135859.GA51051@tarantool.org> (raw)
In-Reply-To: <6c6d81d1-d9d2-5248-3edd-f56aefea9e76@tarantool.org>

On 06 Feb 21:48, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patch!
> 
> On 06/02/2020 18:14, Nikita Pettik wrote:
> > Imagine following situation at the moment of region truncation:
> > 
> >        1 slab      2 slab/HEAD
> >      x of s1 used  y of s2 used
> >      +---------+   +---------+
> > ...->|xx|      |-->|yyyyyy|  |--> NULL
> >      |xx|      |   |yyyyyy|  |
> >      +---------+   +---------+
> > 
> > And region is going to be truncated by size y (i.g.
> > region_truncate(&region, region_used() - y)). After this operation
> > second slab will remain empty (slab.used == 0). Currently, this slab is
> > returned back to the slab cache, which leads to slab rotation. This may
> > not be useful in terms of performance, especially for requests which
> > don't abuse region memory (for instance, SQL queries) but do call
> > region_truncate() intensively. For example, during transfer tuples to
> > ephemeral table truncate() is called for each tuple. What is more,
> > truncate is also called on each returned tuple in sql_row_to_port()
> > (see runtime_tuple_new()).
> 
> I understand the point about perf and slab oscillation, but
> don't see how is it related to the tests.

See below.

> > Also, taking into consideration revert of 67d7ab4 (see 81dda5b) this may
> > lead to memory usage fluctuations: region memory in use before query
> > execution and after may not be the same.
> 
> Yeah, but your patch does not fix that. It is a bigger problem.
> Region does not free its slabs even at fiber_gc() if their total
> size is small enough. So how is this related to memory fluctuations?

Let's look at particular example taken from sql/gh-3199-no-mem-leaks.test.lua:

box.execute('SELECT x, y + 3 * b, b FROM test2, test WHERE b = x')

Here's brief region memory use log for this query:

region alloc size 0 //First region allocation of 0 bytes
getting new slab
region alloc size 0
current slab unused 4040 //56 bytes takes slab structure itself
current slab used 0
current slab size 4096
region alloc size 0
... //same values
region free
region alloc size 1
current slab unused 4040
current slab used 0
current slab size 4096
region alloc size 1
...
region alloc size 1
current slab unused 4038
current slab used 2
current slab size 4096
region alloc size 1
...
region join
region truncate
cut size 0 //nothing to truncate
current slab used 4
slabs used 26730 //total region memory in use, before and during execution
region truncate
cut size 4 //cut size matches with memory in use
slab used 4
removing current slab

Slab is purified and put back to the slab cache. At this point we observe
slab rotation. But new slab (i.e. head of list) can be initialized with
non-zero used memory since we reverted Georgy's patch (which zeroed whole
list of slabs). As a result, used memory in first slab has increased (which
looks extremely  contradictory). It can be seen in further logs:
...
cut size 0
slab used 2060 //used memory has bumped from 4 to 2060
slabs used 26726 //total memory in use hasn't changed tho
...
region alloc size 1
current slab unused 1978
current slab used 2062
current slab size 4096
region alloc size 1
current slab unused 1977
current slab used 2063
current slab size 4096
fiber_gc
fiber reset //at this step total memory is 26730 - 2063 = 24667

At the end of execution fibre_gc() is called, which in turn will nullify
slab->used memory consuption and ergo reduce whole fiber.memory.used
consumption by 2063. That's why amount of memory in usage at the end of
query execution does not match with initial value.

Now, if truncate didn't rotate slab, region memory before and after
query execution would be the same. So, it would make tests like this
more stable, since region leaks could be easily detected as difference
between initial (i.e. before query execution) and final
(i.e. after execution) occupied region memory.

> > For instance: total region memory
> > in usage is x, but the first slab (head) is empty (since between
> > requests region_reset() is called). During query execution
> > region_truncate() is called and first slab is put back to the slab
> > cache. Now another slab forms head of slab list, but there's no
> > guarantee that its size is 0 (since slab's sizes are not zeroed on
> > reset).
> 
> How slab size can be 0? You mean 'used' size? In that case it is
> zeroed in reset. Otherwise I am missing something.

Yep, size of used memory. I'll update commit message.
 
> > As a result, total region memory may change (decrease) due to
> > slab rotation. It is likely to be misleading for users and in particular
> > for our testing system: to detect region memory leak cases, we have
> > nothing to do but restart Tarantool instance each time before test.
> 
> From what I see, this patch makes region keep slab more often. It does
> not free them more aggressively or something. So how is it going to fix
> the slab leak or at least make it less severe?

It doesn't affect leak, but it allows avoiding workarounds like server
restart for sql/gh-3199-no-mem-leaks.test.lua:

https://lists.tarantool.org/pipermail/tarantool-patches/2020-January/013837.html

> > ---
> > Branch: https://github.com/tarantool/small/tree/np/dont-rotate-region-on-truncate
> > 
> >  small/region.c |  2 +-
> >  test/region.c  | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 70 insertions(+), 1 deletion(-)

  reply	other threads:[~2020-02-07 13:59 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-06 17:14 Nikita Pettik
2020-02-06 18:43 ` Konstantin Osipov
2020-02-06 20:48 ` Vladislav Shpilevoy
2020-02-07 13:58   ` Nikita Pettik [this message]
2020-02-08  0:28     ` Vladislav Shpilevoy
2020-02-10 11:52       ` Nikita Pettik
2020-04-10 10:10 Aleksandr Lyapunov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200207135859.GA51051@tarantool.org \
    --to=korablev@tarantool.org \
    --cc=tarantool-patches@dev.tarantool.org \
    --cc=v.shpilevoy@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH] region: do not rotate slabs in case of single slab purification' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox