[Tarantool-patches] [PATCH] test: stabilize flaky fiber memory leak detection
Nikita Pettik
korablev at tarantool.org
Wed Feb 5 18:36:12 MSK 2020
On 29 Jan 23:27, Alexander Turenko wrote:
> On Wed, Jan 29, 2020 at 09:46:30PM +0300, Nikita Pettik wrote:
> > On 29 Jan 20:03, Alexander Turenko wrote:
> > > The problem is that the first `<...>.memory.used` value may be non-zero.
> > > It depends of previous tests that were executed on this tarantool
> > > instance. It is resolved by restarting of a tarantool instance before
> > > such test cases to ensure that there are no 'garbage' slabs in a current
> > > fiber's region.
> >
> > Hm, why not simply save value of ..memory.used before workload, value
> > after workload and compare them:
> >
> > reg_sz_before = fiber.info()[fiber.self().id()].memory.used
> > ...
> > reg_sz_after = fiber.info()[fiber.self().id()].memory.used
> >
> > assert(reg_sz_before == reg_sz_after);
> >
> > So that make sure workload returns all occupied memory. This may fail
> > only in case allocated memory > 4kb, but examples in this particular
> > case definitely don't require so many memory (as you noted below).
>
> I forgot to add the reason why this approach does not work to the commit
> message. Added the following paragraph:
>
> | The obvious way to solve it would be print differences between
> | `<...>.memory.used` values before and after a workload instead of
> | absolute values. This however does not work, because a first slab in a
> | region can be almost used at the point where a test case starts and a
> | next slab will be acquired from a slab_cache. This means that the
I checked these particular SQL queries, and this is not the case you describe.
The first slab (i.e. the head of the list) is empty when query processing starts,
and the query itself allocates only a few bytes on the region.
Region memory (...memory.used) changes only after the last query is executed:
box.execute('SELECT x, y + 3 * b, b FROM test2, test WHERE b = x')
Here's a brief log of region memory usage for this query:
region alloc size 0 //First region allocation of 0 bytes
getting new slab
region alloc size 0
current slab unused 4040 // the slab structure itself takes 56 bytes
current slab used 0
current slab size 4096
region alloc size 0
... //same values
region free
region alloc size 1
current slab unused 4040
current slab used 0
current slab size 4096
region alloc size 1
...
region alloc size 1
current slab unused 4038
current slab used 2
current slab size 4096
region alloc size 1
...
region join
region truncate
cut size 0 //nothing to truncate
current slab used 4
slabs used 26730 //total region memory in use
region truncate
cut size 4 //cut size matches with memory in use
slab used 4
removing current slab
// The slab is empty, so it is put back to the cache.
// At this point we observe slab rotation.
// But the new slab (i.e. the new head of the list) may be non-empty
// (that is, slab.used != 0), since we reverted Georgy's patch, which
// zeroed the whole list of slabs.
// As a result, used memory in the first slab has increased (which looks
// extremely contradictory). Also, at the end of execution we call fiber_gc(),
// which resets slab->used to zero and thereby reduces the overall
// ...fiber.memory.used value. That's why the amount of memory in use at
// the end of query execution does not match the initial value.
// As a workaround we can remove slab rotation in region_truncate().
// Moreover, it may even result in a performance gain.
// So, instead of pushing this patch, let's consider another fix,
// in the small library. I'll send a patch.
...
cut size 0
slab used 2060
slabs used 26726
region alloc size 1
current slab unused 1980
current slab used 2060
current slab size 4096
region alloc size 1
current slab unused 1979
current slab used 2061
current slab size 4096
region alloc size 1
current slab unused 1978
current slab used 2062
current slab size 4096
region alloc size 1
current slab unused 1977
current slab used 2063
current slab size 4096
...
fiber gc
region reset