From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from smtpng1.m.smailru.net (smtpng1.m.smailru.net [94.100.181.251])
    (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
    (No client certificate requested)
    by dev.tarantool.org (Postfix) with ESMTPS id B0D124696C4
    for ; Mon, 25 Nov 2019 14:50:16 +0300 (MSK)
From: Ilya Kosarev
Date: Mon, 25 Nov 2019 14:50:09 +0300
Message-Id: <74517eadcabe93b86623a37700bcb5b673f4aa81.1574681299.git.i.kosarev@tarantool.org>
In-Reply-To:
References:
Subject: [Tarantool-patches] [PATCH v3 4/4] test: stabilize quorum test conditions
List-Id: Tarantool development patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: tarantool-patches@dev.tarantool.org
Cc: v.shpilevoy@tarantool.org

There were some pass conditions in quorum test which could take some
time to be satisfied. Now they are wrapped using test_run:wait_cond to
make the test stable.

Closes #4586
---
 test/replication/quorum.result   | 34 ++++++++++++++++++++------------
 test/replication/quorum.test.lua | 19 +++++++++---------
 2 files changed, 31 insertions(+), 22 deletions(-)

diff --git a/test/replication/quorum.result b/test/replication/quorum.result
index ff5fa0150..07abe7f2a 100644
--- a/test/replication/quorum.result
+++ b/test/replication/quorum.result
@@ -40,6 +40,10 @@ box.info.ro -- true
 ---
 - true
 ...
+test_run:wait_cond(function() return box.space.test ~= nil end, 20)
+---
+- true
+...
 box.space.test:replace{100} -- error
 ---
 - error: Can't modify data because this instance is in read-only mode.
@@ -115,15 +119,15 @@ box.info.status -- running
 - running
 ...
 -- Check that the replica follows all masters.
-box.info.id == 1 or box.info.replication[1].upstream.status == 'follow'
+box.info.id == 1 or test_run:wait_cond(function() return box.info.replication[1].upstream.status == 'follow' end, 20)
 ---
 - true
 ...
-box.info.id == 2 or box.info.replication[2].upstream.status == 'follow'
+box.info.id == 2 or test_run:wait_cond(function() return box.info.replication[2].upstream.status == 'follow' end, 20)
 ---
 - true
 ...
-box.info.id == 3 or box.info.replication[3].upstream.status == 'follow'
+box.info.id == 3 or test_run:wait_cond(function() return box.info.replication[3].upstream.status == 'follow' end, 20)
 ---
 - true
 ...
@@ -149,6 +153,10 @@ test_run:cmd('stop server quorum1')
 ---
 - true
 ...
+test_run:wait_cond(function() return box.space.test.index.primary ~= nil end, 20)
+---
+- true
+...
 for i = 1, 100 do box.space.test:insert{i} end
 ---
 ...
@@ -166,9 +174,9 @@ test_run:cmd('switch quorum1')
 ---
 - true
 ...
-box.space.test:count() -- 100
+test_run:wait_cond(function() return box.space.test:count() == 100 end, 20)
 ---
-- 100
+- true
 ...
 -- Rebootstrap one node of the cluster and check that others follow.
 -- Note, due to ERRINJ_RELAY_TIMEOUT there is a substantial delay
@@ -197,9 +205,9 @@ test_run:cmd('switch quorum1')
 - true
 ...
 test_run:cmd('restart server quorum1 with cleanup=1, args="0.1 0.5"')
-box.space.test:count() -- 100
+test_run:wait_cond(function() return box.space.test:count() == 100 end, 20)
 ---
-- 100
+- true
 ...
 -- The rebootstrapped replica will be assigned id = 4,
 -- because ids 1..3 are busy.
@@ -207,11 +215,9 @@ test_run:cmd('switch quorum2')
 ---
 - true
 ...
-fiber = require('fiber')
----
-...
-while box.info.replication[4].upstream.status ~= 'follow' do fiber.sleep(0.001) end
+test_run:wait_cond(function() return box.info.replication[4].upstream.status == 'follow' end, 20)
 ---
+- true
 ...
 box.info.replication[4].upstream.status
 ---
@@ -221,11 +227,13 @@ test_run:cmd('switch quorum3')
 ---
 - true
 ...
-fiber = require('fiber')
+test_run:wait_cond(function() return box.info.replication ~= nil end, 20)
 ---
+- true
 ...
-while box.info.replication[4].upstream.status ~= 'follow' do fiber.sleep(0.001) end
+test_run:wait_cond(function() return box.info.replication[4].upstream.status == 'follow' end, 20)
 ---
+- true
 ...
 box.info.replication[4].upstream.status
 ---
diff --git a/test/replication/quorum.test.lua b/test/replication/quorum.test.lua
index 98febb367..5f2872675 100644
--- a/test/replication/quorum.test.lua
+++ b/test/replication/quorum.test.lua
@@ -22,6 +22,7 @@ test_run:cmd('restart server quorum2 with args="0.1 0.5"')
 box.info.status -- orphan
 box.ctl.wait_rw(0.001) -- timeout
 box.info.ro -- true
+test_run:wait_cond(function() return box.space.test ~= nil end, 20)
 box.space.test:replace{100} -- error
 box.cfg{replication={}}
 box.info.status -- running
@@ -47,9 +48,9 @@ box.info.ro -- false
 box.info.status -- running
 
 -- Check that the replica follows all masters.
-box.info.id == 1 or box.info.replication[1].upstream.status == 'follow'
-box.info.id == 2 or box.info.replication[2].upstream.status == 'follow'
-box.info.id == 3 or box.info.replication[3].upstream.status == 'follow'
+box.info.id == 1 or test_run:wait_cond(function() return box.info.replication[1].upstream.status == 'follow' end, 20)
+box.info.id == 2 or test_run:wait_cond(function() return box.info.replication[2].upstream.status == 'follow' end, 20)
+box.info.id == 3 or test_run:wait_cond(function() return box.info.replication[3].upstream.status == 'follow' end, 20)
 
 -- Check that box.cfg() doesn't return until the instance
 -- catches up with all configured replicas.
@@ -59,13 +60,14 @@ test_run:cmd('switch quorum2')
 box.error.injection.set("ERRINJ_RELAY_TIMEOUT", 0.001)
 test_run:cmd('stop server quorum1')
 
+test_run:wait_cond(function() return box.space.test.index.primary ~= nil end, 20)
 for i = 1, 100 do box.space.test:insert{i} end
 fiber = require('fiber')
 fiber.sleep(0.1)
 
 test_run:cmd('start server quorum1 with args="0.1 0.5"')
 test_run:cmd('switch quorum1')
-box.space.test:count() -- 100
+test_run:wait_cond(function() return box.space.test:count() == 100 end, 20)
 
 -- Rebootstrap one node of the cluster and check that others follow.
 -- Note, due to ERRINJ_RELAY_TIMEOUT there is a substantial delay
@@ -81,17 +83,16 @@ box.snapshot()
 test_run:cmd('switch quorum1')
 test_run:cmd('restart server quorum1 with cleanup=1, args="0.1 0.5"')
 
-box.space.test:count() -- 100
+test_run:wait_cond(function() return box.space.test:count() == 100 end, 20)
 
 -- The rebootstrapped replica will be assigned id = 4,
 -- because ids 1..3 are busy.
 test_run:cmd('switch quorum2')
-fiber = require('fiber')
-while box.info.replication[4].upstream.status ~= 'follow' do fiber.sleep(0.001) end
+test_run:wait_cond(function() return box.info.replication[4].upstream.status == 'follow' end, 20)
 box.info.replication[4].upstream.status
 test_run:cmd('switch quorum3')
-fiber = require('fiber')
-while box.info.replication[4].upstream.status ~= 'follow' do fiber.sleep(0.001) end
+test_run:wait_cond(function() return box.info.replication ~= nil end, 20)
+test_run:wait_cond(function() return box.info.replication[4].upstream.status == 'follow' end, 20)
 box.info.replication[4].upstream.status
 
 -- Cleanup.
-- 
2.17.1
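
Illustration of the pattern this patch applies, for readers who have not used test_run:wait_cond. This is a minimal Python sketch of a poll-until-true helper with a timeout, not test-run's actual implementation. The helper name, its `delay` parameter, the return convention, and the `FakeUpstream` class are assumptions made up for the example. The point it shows: unlike the removed `while ... do fiber.sleep(0.001) end` loops, a bounded wait cannot hang forever if the condition never holds.

```python
import time


def wait_cond(cond, timeout=20.0, delay=0.001):
    """Poll cond() until it is truthy or `timeout` seconds have passed.

    Hypothetical helper that only mirrors the idea behind
    test_run:wait_cond. Returns True on success, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if cond():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(delay)


class FakeUpstream:
    """Stand-in for a replica upstream that needs a few polls to catch up."""

    def __init__(self, polls_until_follow):
        self.remaining = polls_until_follow

    @property
    def status(self):
        # Report 'connecting' for the first few reads, then 'follow'.
        if self.remaining > 0:
            self.remaining -= 1
            return "connecting"
        return "follow"


upstream = FakeUpstream(polls_until_follow=3)

# Succeeds once the simulated upstream reaches 'follow'.
reached = wait_cond(lambda: upstream.status == "follow", timeout=1.0)

# A condition that never holds returns False after the timeout
# instead of looping forever.
timed_out = wait_cond(lambda: False, timeout=0.05)
```

In the test itself the same shape appears as `test_run:wait_cond(function() return ... end, 20)`, where 20 is the timeout in seconds and the `.result` file records `- true` once the condition holds.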