From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp49.i.mail.ru (smtp49.i.mail.ru [94.100.177.109]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 2E83145C304 for ; Tue, 1 Dec 2020 23:42:46 +0300 (MSK) References: <995884546a2320b0e0cf031dde7e8e5bc7c7f8dd.1605627203.git.avtikhon@tarantool.org> From: Aleksandr Lyapunov Message-ID: <54bb9a32-d506-f92c-93bd-7b5197a862c8@tarantool.org> Date: Tue, 1 Dec 2020 23:42:44 +0300 MIME-Version: 1.0 In-Reply-To: <995884546a2320b0e0cf031dde7e8e5bc7c7f8dd.1605627203.git.avtikhon@tarantool.org> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Content-Language: en-US Subject: Re: [Tarantool-patches] [PATCH v1] test: flaky hang vinyl/ddl.test.lua test List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Alexander V. Tikhonov" , Kirill Yukhin Cc: tarantool-patches@dev.tarantool.org Hi! thanks for the patch! As I can see the patch does 2 things: 1)reverts 5f96ee5907672bd32dd7fe8dad144ac3d328187f 2)don't do some actions (box.snapshot()) if the space has wrong count in an index. As for me (1) should be done in separate commit. But it's no a big deal. I think that (2) can be achieved with much more shorter patch: -box.snapshot() +if box.space.test.index.pk:count() == box.space.test.index.tk:count() then box.snapshot() end That would be much more understandable. But it still would be suspicious. The main problem is: why the test hangs if tarantool makes snapshot? We must not exit a test end even skip some lines if something went wrong. If tarantool hangs in some conditions we have to know it. I think hiding `box.snapshot()` is a wrong way. The test subsystem must handle with any extreme situation. And if tarantool hangs - the subsystem must record it's stacktrace, report and kill tarantool. As I understood didn't manage to do it. It must be fixed. It must be one fix of test subsystem instead of dozen of fixes of particular tests. On 11/17/20 6:34 PM, Alexander V. Tikhonov wrote: > Found hanging test vinyl/ddl.test.lua on: > > [159] inspector:wait_cond(function() return box.space.test.index.pk:count() == box.space.test.index.tk:count() end) > [159] --- > [159] - true > [159] ... > [159] -box.snapshot() > [159] ---- > [159] -- ok > [159] -... > > The real issue happend before it when test failed on: > > [091] --- engine/ddl.result Thu May 14 16:12:09 2020 > [091] +++ engine/ddl.reject Fri May 15 04:15:07 2020 > [091] @@ -2558,7 +2558,7 @@ > [091] ... > [091] inspector:wait_cond(function() return box.space.test.index.pk:count() == box.space.test.index.sk:count() end) > [091] --- > [091] -- true > [091] +- false > [091] ... > > To avoid of hangs decided to rewrite the test to be able to leave the > hanging subtest on the very start of the issue. Also found that the > previous fix committed with: > > 5f96ee5907672bd32dd7fe8dad144ac3d328187f ('Fix flaky test engine/ddl') > > did not fix the issue and its changes were removed with the current > commit. > > Needed for #4353 > --- > > Github: https://github.com/tarantool/tarantool/tree/avtikhon/gh-4353-hang-ddl > Issue: https://github.com/tarantool/tarantool/issues/4353 > > test/engine/ddl.result | 91 ++++++++++++++++++++++++++-------------- > test/engine/ddl.test.lua | 76 ++++++++++++++++++++++++++------- > test/engine/suite.ini | 2 +- > 3 files changed, 122 insertions(+), 47 deletions(-) > > diff --git a/test/engine/ddl.result b/test/engine/ddl.result > index 8b44d1ee1..3dda5494f 100644 > --- a/test/engine/ddl.result > +++ b/test/engine/ddl.result > @@ -2526,41 +2526,52 @@ function gen_load() > end; > --- > ... > -inspector:cmd("setopt delimiter ''"); > ---- > -- true > -... > -fiber = require('fiber') > ---- > -... > -ch = fiber.channel(1) > ---- > -... > -_ = fiber.create(function() gen_load() ch:put(true) end) > +function check_equal(check, pk, k) > + if pk ~= k then > + require('log').error("Error on fiber check: failed '" .. check .. > + "' check on equal pk " .. pk .. " and k = " .. k) > + return false > + end > + return true > +end; > --- > ... > -_ = box.space.test:create_index('sk', {unique = false, parts = {2, 'unsigned'}}) > +function check_fiber() > + _ = fiber.create(function() gen_load() ch:put(true) end) > + _ = box.space.test:create_index('sk', {unique = false, parts = {2, 'unsigned'}}) > + > + ch:get() > + > + local index = box.space.test.index > + if not check_equal("1st step secondary keys", index.pk:count(), index.sk:count()) then > + return false > + end > + > + _ = fiber.create(function() gen_load() ch:put(true) end) > + _ = box.space.test:create_index('tk', {unique = true, parts = {3, 'unsigned'}}) > + > + ch:get() > + > + index = box.space.test.index > + if not check_equal("2nd step secondary keys", index.pk:count(), index.sk:count()) or > + not check_equal("2nd step third keys", index.pk:count(), index.tk:count()) then > + return false > + end > + return true > +end; > --- > ... > -ch:get() > +inspector:cmd("setopt delimiter ''"); > --- > - true > ... > -_ = fiber.create(function() gen_load() ch:put(true) end) > ---- > -... > -_ = box.space.test:create_index('tk', {unique = true, parts = {3, 'unsigned'}}) > ---- > -... > -ch:get() > +fiber = require('fiber') > --- > -- true > ... > -inspector:wait_cond(function() return box.space.test.index.pk:count() == box.space.test.index.sk:count() end) > +ch = fiber.channel(1) > --- > -- true > ... > -inspector:wait_cond(function() return box.space.test.index.pk:count() == box.space.test.index.tk:count() end) > +check_fiber() > --- > - true > ... > @@ -2568,23 +2579,41 @@ inspector:cmd("restart server default") > inspector = require('test_run').new() > --- > ... > -inspector:wait_cond(function() return box.space.test.index.pk:count() == box.space.test.index.sk:count() end) > +inspector:cmd("setopt delimiter ';'") > --- > - true > ... > -inspector:wait_cond(function() return box.space.test.index.pk:count() == box.space.test.index.tk:count() end) > +function check_equal(check, pk, k) > + if pk ~= k then > + require('log').error("Error on server restart check: failed '" .. check .. > + "' check on equal pk " .. pk .. " and k = " .. k) > + return false > + end > + return true > +end; > --- > -- true > ... > -box.snapshot() > +function check_server_restart() > + local index = box.space.test.index > + if not check_equal("1rd step secondary keys", index.pk:count(), index.sk:count()) or > + not check_equal("1rd step third keys", index.pk:count(), index.tk:count()) then > + return false > + end > + box.snapshot() > + index = box.space.test.index > + if not check_equal("2th step secondary keys", index.pk:count(), index.sk:count()) or > + not check_equal("2th step third keys", index.pk:count(), index.tk:count()) then > + return false > + end > + return true > +end; > --- > -- ok > ... > -inspector:wait_cond(function() return box.space.test.index.pk:count() == box.space.test.index.sk:count() end) > +inspector:cmd("setopt delimiter ''"); > --- > - true > ... > -inspector:wait_cond(function() return box.space.test.index.pk:count() == box.space.test.index.tk:count() end) > +check_server_restart() > --- > - true > ... > diff --git a/test/engine/ddl.test.lua b/test/engine/ddl.test.lua > index 1d77705dd..c2eda7c06 100644 > --- a/test/engine/ddl.test.lua > +++ b/test/engine/ddl.test.lua > @@ -1003,30 +1003,76 @@ function gen_load() > end > end; > > +function check_equal(check, pk, k) > + if pk ~= k then > + require('log').error("Error on fiber check: failed '" .. check .. > + "' check on equal pk " .. pk .. " and k = " .. k) > + return false > + end > + return true > +end; > + > +function check_fiber() > + _ = fiber.create(function() gen_load() ch:put(true) end) > + _ = box.space.test:create_index('sk', {unique = false, parts = {2, 'unsigned'}}) > + > + ch:get() > + > + local index = box.space.test.index > + if not check_equal("1st step secondary keys", index.pk:count(), index.sk:count()) then > + return false > + end > + > + _ = fiber.create(function() gen_load() ch:put(true) end) > + _ = box.space.test:create_index('tk', {unique = true, parts = {3, 'unsigned'}}) > + > + ch:get() > + > + index = box.space.test.index > + if not check_equal("2nd step secondary keys", index.pk:count(), index.sk:count()) or > + not check_equal("2nd step third keys", index.pk:count(), index.tk:count()) then > + return false > + end > + return true > +end; > + > inspector:cmd("setopt delimiter ''"); > > fiber = require('fiber') > ch = fiber.channel(1) > +check_fiber() > > -_ = fiber.create(function() gen_load() ch:put(true) end) > -_ = box.space.test:create_index('sk', {unique = false, parts = {2, 'unsigned'}}) > -ch:get() > +inspector:cmd("restart server default") > +inspector = require('test_run').new() > > -_ = fiber.create(function() gen_load() ch:put(true) end) > -_ = box.space.test:create_index('tk', {unique = true, parts = {3, 'unsigned'}}) > -ch:get() > +inspector:cmd("setopt delimiter ';'") > > -inspector:wait_cond(function() return box.space.test.index.pk:count() == box.space.test.index.sk:count() end) > -inspector:wait_cond(function() return box.space.test.index.pk:count() == box.space.test.index.tk:count() end) > +function check_equal(check, pk, k) > + if pk ~= k then > + require('log').error("Error on server restart check: failed '" .. check .. > + "' check on equal pk " .. pk .. " and k = " .. k) > + return false > + end > + return true > +end; > > -inspector:cmd("restart server default") > +function check_server_restart() > + local index = box.space.test.index > + if not check_equal("1rd step secondary keys", index.pk:count(), index.sk:count()) or > + not check_equal("1rd step third keys", index.pk:count(), index.tk:count()) then > + return false > + end > + box.snapshot() > + index = box.space.test.index > + if not check_equal("2th step secondary keys", index.pk:count(), index.sk:count()) or > + not check_equal("2th step third keys", index.pk:count(), index.tk:count()) then > + return false > + end > + return true > +end; > > -inspector = require('test_run').new() > +inspector:cmd("setopt delimiter ''"); > > -inspector:wait_cond(function() return box.space.test.index.pk:count() == box.space.test.index.sk:count() end) > -inspector:wait_cond(function() return box.space.test.index.pk:count() == box.space.test.index.tk:count() end) > -box.snapshot() > -inspector:wait_cond(function() return box.space.test.index.pk:count() == box.space.test.index.sk:count() end) > -inspector:wait_cond(function() return box.space.test.index.pk:count() == box.space.test.index.tk:count() end) > +check_server_restart() > > box.space.test:drop() > diff --git a/test/engine/suite.ini b/test/engine/suite.ini > index 01899f088..5354699b3 100644 > --- a/test/engine/suite.ini > +++ b/test/engine/suite.ini > @@ -15,7 +15,7 @@ fragile = { > "tests": { > "ddl.test.lua": { > "issues": [ "gh-4353" ], > - "checksums": [ "dd8851d80183cc75052119ba646e295d" ] > + "checksums": [ "928a43a39a843edbf26ec324d73c210f" ] > }, > "gh-4973-concurrent-alter-fails.test.lua": { > "issues": [ "gh-5157" ],