[Tarantool-patches] [PATCH] test: stabilize flaky fiber memory leak detection

Alexander Turenko alexander.turenko at tarantool.org
Wed Jan 29 23:27:48 MSK 2020


On Wed, Jan 29, 2020 at 09:46:30PM +0300, Nikita Pettik wrote:
> On 29 Jan 20:03, Alexander Turenko wrote:
> > After the #4736 regression fix (in fact, it just reverts the new
> > logic in small) it is again possible that a fiber's region holds
> > memory for a while, but releases it eventually. When the used memory
> > exceeds the 128 KiB threshold, fiber_gc() puts 'garbage' slabs back
> > to the slab_cache and subtracts them from the region_used() metric.
> > Until that point those slabs are accounted in region_used() and hence
> > in the fiber.info() metrics.
> > 
> > This commit fixes flakiness of test cases of the following kind:
> > 
> >  | fiber.info()[fiber.self().id()].memory.used -- should be zero
> >  | <...workload...>
> >  | fiber.info()[fiber.self().id()].memory.used -- should be zero
> > 
> > The problem is that the first `<...>.memory.used` value may be non-zero.
> > It depends on the previous tests that were executed on this tarantool
> > instance. It is resolved by restarting the tarantool instance before
> > such test cases to ensure that there are no 'garbage' slabs in the
> > current fiber's region.
> 
> Hm, why not simply save the value of ..memory.used before the workload
> and the value after the workload, then compare them:
> 
> reg_sz_before = fiber.info()[fiber.self().id()].memory.used
> ...
> reg_sz_after = fiber.info()[fiber.self().id()].memory.used
> 
> assert(reg_sz_before == reg_sz_after);
> 
> That way we make sure the workload returns all occupied memory. This
> may fail only in case the allocated memory is > 4 KiB, but the examples
> in this particular case definitely don't require so much memory (as you
> noted below).

I forgot to add to the commit message the reason why this approach does
not work. I added the following paragraph:

 | The obvious way to solve it would be to print the differences between
 | `<...>.memory.used` values before and after a workload instead of the
 | absolute values. This, however, does not work: the first slab in a
 | region can be almost fully used at the point where a test case starts,
 | so the next slab will be acquired from the slab_cache. The previous
 | slab then becomes 'garbage' and will not be collected until the
 | 128 KiB threshold is exceeded: the latter `<...>.memory.used` check
 | will return a bigger value than the former one. Conversely, if the
 | threshold is reached during the workload, the latter check may show a
 | smaller value than the former one. In short, the test case would be
 | unstable after this change.
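
For reference, the rejected delta-based check would look like the sketch
below (a hypothetical test fragment, not part of the actual patch; the
workload placeholder is left as-is):

 | fiber = require('fiber')
 | -- Remember the region usage before the workload.
 | used_before = fiber.info()[fiber.self().id()].memory.used
 | -- <...workload...>
 | used_after = fiber.info()[fiber.self().id()].memory.used
 | -- Expected to be zero, but unstable: it may be positive when a new
 | -- slab is acquired and the previous one becomes 'garbage', or
 | -- negative when fiber_gc() releases 'garbage' slabs after the
 | -- 128 KiB threshold is crossed during the workload.
 | used_after - used_before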

It can be checked using the snippet from the issue:

 | diff --git a/test/sql/gh-3199-no-mem-leaks.test.lua b/test/sql/gh-3199-no-mem-leaks.test.lua
 | index 41648d0fc..2f998e65f 100644
 | --- a/test/sql/gh-3199-no-mem-leaks.test.lua
 | +++ b/test/sql/gh-3199-no-mem-leaks.test.lua
 | @@ -9,6 +9,15 @@ fiber = require('fiber')
 |  
 |  -- box.cfg()
 |  
 | +box.once('init', function()        \
 | +    box.schema.space.create('s')   \
 | +    box.space.s:create_index('pk') \
 | +end)
 | +
 | +data = string.rep('x', 2048)                        \
 | +for i = 1, 1000 do                                  \
 | +    box.space.s:upsert({i, data}, {{'=', 2, data}}) \
 | +end
 |  
 |  box.execute('CREATE TABLE test (id INT PRIMARY KEY, x INTEGER, y INTEGER)')
 |  box.execute('INSERT INTO test VALUES (1, 1, 1), (2, 2, 2)')

After this change, the values of the
`fiber.info()[fiber.self().id()].memory.used` expressions in the test
are the following:

32888
32888
32888
32888
26726

If we change the absolute values to differences, the last value will not
be zero anyway.
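
Expressed as differences between consecutive checks, that is:

 | 32888 - 32888 = 0
 | 32888 - 32888 = 0
 | 32888 - 32888 = 0
 | 26726 - 32888 = -6162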

> 
> > Note: this works only if a test case reserves only one slab at a
> > time: otherwise some memory may be held after the case (and so a
> > memory check after a workload will fail). However, it seems that our
> > cases are small enough not to trigger this situation.
> > 
> > A call to region_free() would be enough, but we have no Lua API for it.
> > 
> > Fixes #4750.
> > ---
> > 
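
For completeness, the fix boils down to restarting the instance before
such a test case; a minimal sketch, assuming the usual test_run helper
from our test suites:

 | test_run = require('test_run').new()
 | -- Restart the instance so the current fiber's region holds no
 | -- 'garbage' slabs left over from previous tests.
 | test_run:cmd('restart server default')
 | fiber = require('fiber')
 | fiber.info()[fiber.self().id()].memory.used  -- should be zero now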
