From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp62.i.mail.ru (smtp62.i.mail.ru [217.69.128.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 7CEB246970E for ; Wed, 29 Jan 2020 23:27:40 +0300 (MSK) Date: Wed, 29 Jan 2020 23:27:48 +0300 From: Alexander Turenko Message-ID: <20200129202748.ermvumaerp746s2s@tkn_work_nb> References: <20200129184630.GB16149@tarantool.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20200129184630.GB16149@tarantool.org> Subject: Re: [Tarantool-patches] [PATCH] test: stabilize flaky fiber memory leak detection List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Nikita Pettik Cc: tarantool-patches@dev.tarantool.org On Wed, Jan 29, 2020 at 09:46:30PM +0300, Nikita Pettik wrote: > On 29 Jan 20:03, Alexander Turenko wrote: > > After #4736 regression fix (in fact it just reverts the new logic in > > small) it is possible again that a fiber's region may hold a memory for > > a while, but release it eventually. When the used memory exceeds 128 KiB > > threshold, fiber_gc() puts 'garbage' slabs back to slab_cache and > > subtracts them from region_used() metric. But until this point those > > slabs are accounted in region_used() and so in fiber.info() metrics. > > > > This commit fixes flakiness of test cases of the following kind: > > > > | fiber.info()[fiber.self().id()].memory.used -- should be zero > > | <...workload...> > > | fiber.info()[fiber.self().id()].memory.used -- should be zero > > > > The problem is that the first `<...>.memory.used` value may be non-zero. > > It depends of previous tests that were executed on this tarantool > > instance. It is resolved by restarting of a tarantool instance before > > such test cases to ensure that there are no 'garbage' slabs in a current > > fiber's region. > > Hm, why not simply save value of ..memory.used before workload, value > after workload and compare them: > > reg_sz_before = fiber.info()[fiber.self().id()].memory.used > ... > reg_sz_after = fiber.info()[fiber.self().id()].memory.used > > assert(reg_sz_before == reg_sz_after); > > So that make sure workload returns all occupied memory. This may fail > only in case allocated memory > 4kb, but examples in this particular > case definitely don't require so many memory (as you noted below). I forgot to add the reason why this approach does not work to the commit message. Added the following paragraph: | The obvious way to solve it would be print differences between | `<...>.memory.used` values before and after a workload instead of | absolute values. This however does not work, because a first slab in a | region can be almost used at the point where a test case starts and a | next slab will be acquired from a slab_cache. This means that the | previous slab will become a 'garbage' and will not be collected until | 128 KiB threshold will exceed: the latter `<...>.memory.used` check will | return a bigger value than the former one. However, if the threshold | will be reached during the workload, the latter check may show lesser | value than the former one. In short, the test case would be unstable | after this change. It can be checked using the snipped from the issue: | diff --git a/test/sql/gh-3199-no-mem-leaks.test.lua b/test/sql/gh-3199-no-mem-leaks.test.lua | index 41648d0fc..2f998e65f 100644 | --- a/test/sql/gh-3199-no-mem-leaks.test.lua | +++ b/test/sql/gh-3199-no-mem-leaks.test.lua | @@ -9,6 +9,15 @@ fiber = require('fiber') | | -- box.cfg() | | +box.once('init', function() \ | + box.schema.space.create('s') \ | + box.space.s:create_index('pk') \ | +end) | + | +data = string.rep('x', 2048) \ | +for i = 1, 1000 do \ | + box.space.s:upsert({i, data}, {{'=', 2, data}}) \ | +end | | box.execute('CREATE TABLE test (id INT PRIMARY KEY, x INTEGER, y INTEGER)') | box.execute('INSERT INTO test VALUES (1, 1, 1), (2, 2, 2)') After this change, values from `fiber.info()[fiber.self().id()].memory.used` expressions from the test will be the following: 32888 32888 32888 32888 26726 If we'll change the absolute values to a differences, then the last value will not be zero anyway. > > > Note: This works only if a test case reserves only one slab at the > > moment: otherwise some memory may be hold after the case (and so a > > memory check after a workload will fail). However it seems that our > > cases are small enough to don't trigger this situation. > > > > Call of region_free() would be enough, but we have no Lua API for it. > > > > Fixes #4750. > > --- > >