* [Tarantool-patches] [PATCH 2/2] test: stabilize quorum test conditions [not found] <cover.1574159473.git.i.kosarev@tarantool.org> @ 2019-11-19 10:31 ` Ilya Kosarev 0 siblings, 0 replies; 5+ messages in thread From: Ilya Kosarev @ 2019-11-19 10:31 UTC (permalink / raw) To: tarantool-patches; +Cc: v.shpilevoy There were some pass conditions in quorum test which could take some time to be satisfied. Now they are wrapped using test_run:wait_cond to make the test stable. Closes #4586 --- test/replication/quorum.result | 30 +++++++++++++++++------------- test/replication/quorum.test.lua | 18 +++++++++--------- 2 files changed, 26 insertions(+), 22 deletions(-) diff --git a/test/replication/quorum.result b/test/replication/quorum.result index ff5fa0150..12604c8de 100644 --- a/test/replication/quorum.result +++ b/test/replication/quorum.result @@ -115,15 +115,15 @@ box.info.status -- running - running ... -- Check that the replica follows all masters. -box.info.id == 1 or box.info.replication[1].upstream.status == 'follow' +box.info.id == 1 or test_run:wait_cond(function() return box.info.replication[1].upstream.status == 'follow' end, 20) --- - true ... -box.info.id == 2 or box.info.replication[2].upstream.status == 'follow' +box.info.id == 2 or test_run:wait_cond(function() return box.info.replication[2].upstream.status == 'follow' end, 20) --- - true ... -box.info.id == 3 or box.info.replication[3].upstream.status == 'follow' +box.info.id == 3 or test_run:wait_cond(function() return box.info.replication[3].upstream.status == 'follow' end, 20) --- - true ... @@ -149,6 +149,10 @@ test_run:cmd('stop server quorum1') --- - true ... +test_run:wait_cond(function() return box.space.test.index.primary ~= nil end, 20) +--- +- true +... for i = 1, 100 do box.space.test:insert{i} end --- ... @@ -166,9 +170,9 @@ test_run:cmd('switch quorum1') --- - true ... 
-box.space.test:count() -- 100 +test_run:wait_cond(function() return box.space.test:count() == 100 end, 20) --- -- 100 +- true ... -- Rebootstrap one node of the cluster and check that others follow. -- Note, due to ERRINJ_RELAY_TIMEOUT there is a substantial delay @@ -197,9 +201,9 @@ test_run:cmd('switch quorum1') - true ... test_run:cmd('restart server quorum1 with cleanup=1, args="0.1 0.5"') -box.space.test:count() -- 100 +test_run:wait_cond(function() return box.space.test:count() == 100 end, 20) --- -- 100 +- true ... -- The rebootstrapped replica will be assigned id = 4, -- because ids 1..3 are busy. @@ -207,11 +211,9 @@ test_run:cmd('switch quorum2') --- - true ... -fiber = require('fiber') ---- -... -while box.info.replication[4].upstream.status ~= 'follow' do fiber.sleep(0.001) end +test_run:wait_cond(function() return box.info.replication[4].upstream.status == 'follow' end, 20) --- +- true ... box.info.replication[4].upstream.status --- @@ -221,11 +223,13 @@ test_run:cmd('switch quorum3') --- - true ... -fiber = require('fiber') +test_run:wait_cond(function() return box.info.replication ~= nil end, 20) --- +- true ... -while box.info.replication[4].upstream.status ~= 'follow' do fiber.sleep(0.001) end +test_run:wait_cond(function() return box.info.replication[4].upstream.status == 'follow' end, 20) --- +- true ... box.info.replication[4].upstream.status --- diff --git a/test/replication/quorum.test.lua b/test/replication/quorum.test.lua index 98febb367..be23200d3 100644 --- a/test/replication/quorum.test.lua +++ b/test/replication/quorum.test.lua @@ -47,9 +47,9 @@ box.info.ro -- false box.info.status -- running -- Check that the replica follows all masters. 
-box.info.id == 1 or box.info.replication[1].upstream.status == 'follow' -box.info.id == 2 or box.info.replication[2].upstream.status == 'follow' -box.info.id == 3 or box.info.replication[3].upstream.status == 'follow' +box.info.id == 1 or test_run:wait_cond(function() return box.info.replication[1].upstream.status == 'follow' end, 20) +box.info.id == 2 or test_run:wait_cond(function() return box.info.replication[2].upstream.status == 'follow' end, 20) +box.info.id == 3 or test_run:wait_cond(function() return box.info.replication[3].upstream.status == 'follow' end, 20) -- Check that box.cfg() doesn't return until the instance -- catches up with all configured replicas. @@ -59,13 +59,14 @@ test_run:cmd('switch quorum2') box.error.injection.set("ERRINJ_RELAY_TIMEOUT", 0.001) test_run:cmd('stop server quorum1') +test_run:wait_cond(function() return box.space.test.index.primary ~= nil end, 20) for i = 1, 100 do box.space.test:insert{i} end fiber = require('fiber') fiber.sleep(0.1) test_run:cmd('start server quorum1 with args="0.1 0.5"') test_run:cmd('switch quorum1') -box.space.test:count() -- 100 +test_run:wait_cond(function() return box.space.test:count() == 100 end, 20) -- Rebootstrap one node of the cluster and check that others follow. -- Note, due to ERRINJ_RELAY_TIMEOUT there is a substantial delay @@ -81,17 +82,16 @@ box.snapshot() test_run:cmd('switch quorum1') test_run:cmd('restart server quorum1 with cleanup=1, args="0.1 0.5"') -box.space.test:count() -- 100 +test_run:wait_cond(function() return box.space.test:count() == 100 end, 20) -- The rebootstrapped replica will be assigned id = 4, -- because ids 1..3 are busy. 
test_run:cmd('switch quorum2') -fiber = require('fiber') -while box.info.replication[4].upstream.status ~= 'follow' do fiber.sleep(0.001) end +test_run:wait_cond(function() return box.info.replication[4].upstream.status == 'follow' end, 20) box.info.replication[4].upstream.status test_run:cmd('switch quorum3') -fiber = require('fiber') -while box.info.replication[4].upstream.status ~= 'follow' do fiber.sleep(0.001) end +test_run:wait_cond(function() return box.info.replication ~= nil end, 20) +test_run:wait_cond(function() return box.info.replication[4].upstream.status == 'follow' end, 20) box.info.replication[4].upstream.status -- Cleanup. -- 2.17.1 ^ permalink raw reply [flat|nested] 5+ messages in thread
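For readers unfamiliar with the pattern: the patch above replaces one-shot checks and unbounded `while ... do fiber.sleep(0.001) end` loops with a bounded polling wait. A minimal sketch of that idea follows. `wait_until` and its `opts` table are hypothetical names used only for illustration; this is not the test-run implementation of test_run:wait_cond. The clock and sleep functions are injected so the sketch runs in plain Lua; under Tarantool one would presumably pass fiber.clock and fiber.sleep instead.

```lua
-- Hypothetical helper, for illustration only (not test-run's wait_cond).
-- Polls cond() until it returns a truthy value or `timeout` seconds pass.
--   opts.now   : function returning the current time in seconds
--   opts.sleep : function(delay) that pauses between polls
--   opts.delay : pause between polls, in seconds (default 0.001)
function wait_until(cond, timeout, opts)
    opts = opts or {}
    local now = opts.now or os.clock
    local sleep = opts.sleep or function() end  -- busy-poll fallback
    local delay = opts.delay or 0.001
    local deadline = now() + timeout
    while true do
        if cond() then
            return true
        end
        if now() >= deadline then
            -- Bounded: the caller gets a failure instead of a hang.
            return false
        end
        sleep(delay)
    end
end

-- Usage with a fake clock, so the example is deterministic.
local t = 0
local fake = {
    now = function() return t end,
    sleep = function(d) t = t + d end,
    delay = 0.5,
}
local calls = 0
local ok = wait_until(function() calls = calls + 1; return calls >= 3 end, 20, fake)
assert(ok == true and calls == 3)

-- A condition that never holds gives up once the timeout is reached.
assert(wait_until(function() return false end, 2, fake) == false)
```

The difference from the removed `while` loops is the deadline: a condition that never becomes true turns into a visible `false` in the result file rather than a test that hangs.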
* [Tarantool-patches] [PATCH 2/2] test: stabilize quorum test conditions [not found] <cover.1574290043.git.i.kosarev@tarantool.org> @ 2019-11-20 22:47 ` Ilya Kosarev 2019-11-20 23:07 ` Vladislav Shpilevoy 0 siblings, 1 reply; 5+ messages in thread From: Ilya Kosarev @ 2019-11-20 22:47 UTC (permalink / raw) To: tarantool-patches; +Cc: v.shpilevoy There were some pass conditions in quorum test which could take some time to be satisfied. Now they are wrapped using test_run:wait_cond to make the test stable. Part of #4586 --- test/replication/quorum.result | 30 +++++++++++++++++------------- test/replication/quorum.test.lua | 18 +++++++++--------- 2 files changed, 26 insertions(+), 22 deletions(-) diff --git a/test/replication/quorum.result b/test/replication/quorum.result index ff5fa0150..12604c8de 100644 --- a/test/replication/quorum.result +++ b/test/replication/quorum.result @@ -115,15 +115,15 @@ box.info.status -- running - running ... -- Check that the replica follows all masters. -box.info.id == 1 or box.info.replication[1].upstream.status == 'follow' +box.info.id == 1 or test_run:wait_cond(function() return box.info.replication[1].upstream.status == 'follow' end, 20) --- - true ... -box.info.id == 2 or box.info.replication[2].upstream.status == 'follow' +box.info.id == 2 or test_run:wait_cond(function() return box.info.replication[2].upstream.status == 'follow' end, 20) --- - true ... -box.info.id == 3 or box.info.replication[3].upstream.status == 'follow' +box.info.id == 3 or test_run:wait_cond(function() return box.info.replication[3].upstream.status == 'follow' end, 20) --- - true ... @@ -149,6 +149,10 @@ test_run:cmd('stop server quorum1') --- - true ... +test_run:wait_cond(function() return box.space.test.index.primary ~= nil end, 20) +--- +- true +... for i = 1, 100 do box.space.test:insert{i} end --- ... @@ -166,9 +170,9 @@ test_run:cmd('switch quorum1') --- - true ... 
-box.space.test:count() -- 100 +test_run:wait_cond(function() return box.space.test:count() == 100 end, 20) --- -- 100 +- true ... -- Rebootstrap one node of the cluster and check that others follow. -- Note, due to ERRINJ_RELAY_TIMEOUT there is a substantial delay @@ -197,9 +201,9 @@ test_run:cmd('switch quorum1') - true ... test_run:cmd('restart server quorum1 with cleanup=1, args="0.1 0.5"') -box.space.test:count() -- 100 +test_run:wait_cond(function() return box.space.test:count() == 100 end, 20) --- -- 100 +- true ... -- The rebootstrapped replica will be assigned id = 4, -- because ids 1..3 are busy. @@ -207,11 +211,9 @@ test_run:cmd('switch quorum2') --- - true ... -fiber = require('fiber') ---- -... -while box.info.replication[4].upstream.status ~= 'follow' do fiber.sleep(0.001) end +test_run:wait_cond(function() return box.info.replication[4].upstream.status == 'follow' end, 20) --- +- true ... box.info.replication[4].upstream.status --- @@ -221,11 +223,13 @@ test_run:cmd('switch quorum3') --- - true ... -fiber = require('fiber') +test_run:wait_cond(function() return box.info.replication ~= nil end, 20) --- +- true ... -while box.info.replication[4].upstream.status ~= 'follow' do fiber.sleep(0.001) end +test_run:wait_cond(function() return box.info.replication[4].upstream.status == 'follow' end, 20) --- +- true ... box.info.replication[4].upstream.status --- diff --git a/test/replication/quorum.test.lua b/test/replication/quorum.test.lua index 98febb367..be23200d3 100644 --- a/test/replication/quorum.test.lua +++ b/test/replication/quorum.test.lua @@ -47,9 +47,9 @@ box.info.ro -- false box.info.status -- running -- Check that the replica follows all masters. 
-box.info.id == 1 or box.info.replication[1].upstream.status == 'follow' -box.info.id == 2 or box.info.replication[2].upstream.status == 'follow' -box.info.id == 3 or box.info.replication[3].upstream.status == 'follow' +box.info.id == 1 or test_run:wait_cond(function() return box.info.replication[1].upstream.status == 'follow' end, 20) +box.info.id == 2 or test_run:wait_cond(function() return box.info.replication[2].upstream.status == 'follow' end, 20) +box.info.id == 3 or test_run:wait_cond(function() return box.info.replication[3].upstream.status == 'follow' end, 20) -- Check that box.cfg() doesn't return until the instance -- catches up with all configured replicas. @@ -59,13 +59,14 @@ test_run:cmd('switch quorum2') box.error.injection.set("ERRINJ_RELAY_TIMEOUT", 0.001) test_run:cmd('stop server quorum1') +test_run:wait_cond(function() return box.space.test.index.primary ~= nil end, 20) for i = 1, 100 do box.space.test:insert{i} end fiber = require('fiber') fiber.sleep(0.1) test_run:cmd('start server quorum1 with args="0.1 0.5"') test_run:cmd('switch quorum1') -box.space.test:count() -- 100 +test_run:wait_cond(function() return box.space.test:count() == 100 end, 20) -- Rebootstrap one node of the cluster and check that others follow. -- Note, due to ERRINJ_RELAY_TIMEOUT there is a substantial delay @@ -81,17 +82,16 @@ box.snapshot() test_run:cmd('switch quorum1') test_run:cmd('restart server quorum1 with cleanup=1, args="0.1 0.5"') -box.space.test:count() -- 100 +test_run:wait_cond(function() return box.space.test:count() == 100 end, 20) -- The rebootstrapped replica will be assigned id = 4, -- because ids 1..3 are busy. 
test_run:cmd('switch quorum2') -fiber = require('fiber') -while box.info.replication[4].upstream.status ~= 'follow' do fiber.sleep(0.001) end +test_run:wait_cond(function() return box.info.replication[4].upstream.status == 'follow' end, 20) box.info.replication[4].upstream.status test_run:cmd('switch quorum3') -fiber = require('fiber') -while box.info.replication[4].upstream.status ~= 'follow' do fiber.sleep(0.001) end +test_run:wait_cond(function() return box.info.replication ~= nil end, 20) +test_run:wait_cond(function() return box.info.replication[4].upstream.status == 'follow' end, 20) box.info.replication[4].upstream.status -- Cleanup. -- 2.17.1 ^ permalink raw reply [flat|nested] 5+ messages in thread
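The conditions in this patch index several levels deep (box.info.replication[4].upstream.status). In Lua, indexing a nil intermediate value raises an error rather than returning nil, which is presumably why the quorum3 block first waits for box.info.replication ~= nil. One way to fold such guards into the predicate itself is a nil-safe lookup. `get_path` below is a hypothetical helper, not part of Tarantool or test-run, and the thread does not establish whether missing intermediate fields are what makes the test flaky; the `info` table is stand-in data, not real box.info output.

```lua
-- Hypothetical helper, for illustration only.
-- Walks a chain of keys and returns nil as soon as a level is missing,
-- instead of raising "attempt to index a nil value".
function get_path(tbl, ...)
    local cur = tbl
    for i = 1, select('#', ...) do
        if type(cur) ~= 'table' then
            return nil
        end
        -- Parentheses keep only the i-th argument.
        cur = cur[(select(i, ...))]
    end
    return cur
end

-- Stand-in for box.info: replica 1 is known, replica 4 has not appeared yet.
local info = {replication = {[1] = {upstream = {status = 'follow'}}}}

assert(get_path(info, 'replication', 1, 'upstream', 'status') == 'follow')
assert(get_path(info, 'replication', 4, 'upstream', 'status') == nil)
```

A predicate written this way simply evaluates to false while the replica entry or its upstream is still absent, so a polling wait keeps retrying until the timeout.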
* Re: [Tarantool-patches] [PATCH 2/2] test: stabilize quorum test conditions 2019-11-20 22:47 ` Ilya Kosarev @ 2019-11-20 23:07 ` Vladislav Shpilevoy 2019-11-21 0:12 ` Ilya Kosarev 0 siblings, 1 reply; 5+ messages in thread From: Vladislav Shpilevoy @ 2019-11-20 23:07 UTC (permalink / raw) To: Ilya Kosarev, tarantool-patches Hm. So you are not going to fix the flaky error I mentioned in the previous thread about this commit? Seems like it is also about 'conditions which need time to be satisfied'. On 20/11/2019 23:47, Ilya Kosarev wrote: > There were some pass conditions in quorum test which could take some > time to be satisfied. Now they are wrapped using test_run:wait_cond to > make the test stable. > > Part of #4586 > --- > test/replication/quorum.result | 30 +++++++++++++++++------------- > test/replication/quorum.test.lua | 18 +++++++++--------- > 2 files changed, 26 insertions(+), 22 deletions(-) > > diff --git a/test/replication/quorum.result b/test/replication/quorum.result > index ff5fa0150..12604c8de 100644 > --- a/test/replication/quorum.result > +++ b/test/replication/quorum.result > @@ -115,15 +115,15 @@ box.info.status -- running > - running > ... > -- Check that the replica follows all masters. > -box.info.id == 1 or box.info.replication[1].upstream.status == 'follow' > +box.info.id == 1 or test_run:wait_cond(function() return box.info.replication[1].upstream.status == 'follow' end, 20) > --- > - true > ... > -box.info.id == 2 or box.info.replication[2].upstream.status == 'follow' > +box.info.id == 2 or test_run:wait_cond(function() return box.info.replication[2].upstream.status == 'follow' end, 20) > --- > - true > ... > -box.info.id == 3 or box.info.replication[3].upstream.status == 'follow' > +box.info.id == 3 or test_run:wait_cond(function() return box.info.replication[3].upstream.status == 'follow' end, 20) > --- > - true > ... > @@ -149,6 +149,10 @@ test_run:cmd('stop server quorum1') > --- > - true > ... 
> +test_run:wait_cond(function() return box.space.test.index.primary ~= nil end, 20) > +--- > +- true > +... > for i = 1, 100 do box.space.test:insert{i} end > --- > ... > @@ -166,9 +170,9 @@ test_run:cmd('switch quorum1') > --- > - true > ... > -box.space.test:count() -- 100 > +test_run:wait_cond(function() return box.space.test:count() == 100 end, 20) > --- > -- 100 > +- true > ... > -- Rebootstrap one node of the cluster and check that others follow. > -- Note, due to ERRINJ_RELAY_TIMEOUT there is a substantial delay > @@ -197,9 +201,9 @@ test_run:cmd('switch quorum1') > - true > ... > test_run:cmd('restart server quorum1 with cleanup=1, args="0.1 0.5"') > -box.space.test:count() -- 100 > +test_run:wait_cond(function() return box.space.test:count() == 100 end, 20) > --- > -- 100 > +- true > ... > -- The rebootstrapped replica will be assigned id = 4, > -- because ids 1..3 are busy. > @@ -207,11 +211,9 @@ test_run:cmd('switch quorum2') > --- > - true > ... > -fiber = require('fiber') > ---- > -... > -while box.info.replication[4].upstream.status ~= 'follow' do fiber.sleep(0.001) end > +test_run:wait_cond(function() return box.info.replication[4].upstream.status == 'follow' end, 20) > --- > +- true > ... > box.info.replication[4].upstream.status > --- > @@ -221,11 +223,13 @@ test_run:cmd('switch quorum3') > --- > - true > ... > -fiber = require('fiber') > +test_run:wait_cond(function() return box.info.replication ~= nil end, 20) > --- > +- true > ... > -while box.info.replication[4].upstream.status ~= 'follow' do fiber.sleep(0.001) end > +test_run:wait_cond(function() return box.info.replication[4].upstream.status == 'follow' end, 20) > --- > +- true > ... 
> box.info.replication[4].upstream.status > --- > diff --git a/test/replication/quorum.test.lua b/test/replication/quorum.test.lua > index 98febb367..be23200d3 100644 > --- a/test/replication/quorum.test.lua > +++ b/test/replication/quorum.test.lua > @@ -47,9 +47,9 @@ box.info.ro -- false > box.info.status -- running > > -- Check that the replica follows all masters. > -box.info.id == 1 or box.info.replication[1].upstream.status == 'follow' > -box.info.id == 2 or box.info.replication[2].upstream.status == 'follow' > -box.info.id == 3 or box.info.replication[3].upstream.status == 'follow' > +box.info.id == 1 or test_run:wait_cond(function() return box.info.replication[1].upstream.status == 'follow' end, 20) > +box.info.id == 2 or test_run:wait_cond(function() return box.info.replication[2].upstream.status == 'follow' end, 20) > +box.info.id == 3 or test_run:wait_cond(function() return box.info.replication[3].upstream.status == 'follow' end, 20) > > -- Check that box.cfg() doesn't return until the instance > -- catches up with all configured replicas. > @@ -59,13 +59,14 @@ test_run:cmd('switch quorum2') > box.error.injection.set("ERRINJ_RELAY_TIMEOUT", 0.001) > test_run:cmd('stop server quorum1') > > +test_run:wait_cond(function() return box.space.test.index.primary ~= nil end, 20) > for i = 1, 100 do box.space.test:insert{i} end > fiber = require('fiber') > fiber.sleep(0.1) > > test_run:cmd('start server quorum1 with args="0.1 0.5"') > test_run:cmd('switch quorum1') > -box.space.test:count() -- 100 > +test_run:wait_cond(function() return box.space.test:count() == 100 end, 20) > > -- Rebootstrap one node of the cluster and check that others follow. 
> -- Note, due to ERRINJ_RELAY_TIMEOUT there is a substantial delay > @@ -81,17 +82,16 @@ box.snapshot() > test_run:cmd('switch quorum1') > test_run:cmd('restart server quorum1 with cleanup=1, args="0.1 0.5"') > > -box.space.test:count() -- 100 > +test_run:wait_cond(function() return box.space.test:count() == 100 end, 20) > > -- The rebootstrapped replica will be assigned id = 4, > -- because ids 1..3 are busy. > test_run:cmd('switch quorum2') > -fiber = require('fiber') > -while box.info.replication[4].upstream.status ~= 'follow' do fiber.sleep(0.001) end > +test_run:wait_cond(function() return box.info.replication[4].upstream.status == 'follow' end, 20) > box.info.replication[4].upstream.status > test_run:cmd('switch quorum3') > -fiber = require('fiber') > -while box.info.replication[4].upstream.status ~= 'follow' do fiber.sleep(0.001) end > +test_run:wait_cond(function() return box.info.replication ~= nil end, 20) > +test_run:wait_cond(function() return box.info.replication[4].upstream.status == 'follow' end, 20) > box.info.replication[4].upstream.status > > -- Cleanup. > ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Tarantool-patches] [PATCH 2/2] test: stabilize quorum test conditions 2019-11-20 23:07 ` Vladislav Shpilevoy @ 2019-11-21 0:12 ` Ilya Kosarev 0 siblings, 0 replies; 5+ messages in thread From: Ilya Kosarev @ 2019-11-21 0:12 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches [-- Attachment #1: Type: text/plain, Size: 6651 bytes --] I am going to fix it as soon as I am able to test it, since it doesn't seem practical to fix something I can't reproduce. For now I have just brought the patch to its current state. >Thursday, 21 November 2019, 2:01 +03:00 from Vladislav Shpilevoy <v.shpilevoy@tarantool.org>: > >Hm. So you are not going to fix the flaky error I mentioned >in the previous thread about this commit? > >Seems like it is also about 'conditions which need time to be >satisfied'. > >On 20/11/2019 23:47, Ilya Kosarev wrote: >> There were some pass conditions in quorum test which could take some >> time to be satisfied. Now they are wrapped using test_run:wait_cond to >> make the test stable. >> >> Part of #4586 >> --- >> test/replication/quorum.result | 30 +++++++++++++++++------------- >> test/replication/quorum.test.lua | 18 +++++++++--------- >> 2 files changed, 26 insertions(+), 22 deletions(-) >> >> diff --git a/test/replication/quorum.result b/test/replication/quorum.result >> index ff5fa0150..12604c8de 100644 >> --- a/test/replication/quorum.result >> +++ b/test/replication/quorum.result >> @@ -115,15 +115,15 @@ box.info.status -- running >> - running >> ... >> -- Check that the replica follows all masters. >> -box.info.id == 1 or box.info.replication[1].upstream.status == 'follow' >> +box.info.id == 1 or test_run:wait_cond(function() return box.info.replication[1].upstream.status == 'follow' end, 20) >> --- >> - true >> ... 
>> -box.info.id == 2 or box.info.replication[2].upstream.status == 'follow' >> +box.info.id == 2 or test_run:wait_cond(function() return box.info.replication[2].upstream.status == 'follow' end, 20) >> --- >> - true >> ... >> -box.info.id == 3 or box.info.replication[3].upstream.status == 'follow' >> +box.info.id == 3 or test_run:wait_cond(function() return box.info.replication[3].upstream.status == 'follow' end, 20) >> --- >> - true >> ... >> @@ -149,6 +149,10 @@ test_run:cmd('stop server quorum1') >> --- >> - true >> ... >> +test_run:wait_cond(function() return box.space.test.index.primary ~= nil end, 20) >> +--- >> +- true >> +... >> for i = 1, 100 do box.space.test:insert{i} end >> --- >> ... >> @@ -166,9 +170,9 @@ test_run:cmd('switch quorum1') >> --- >> - true >> ... >> -box.space.test:count() -- 100 >> +test_run:wait_cond(function() return box.space.test:count() == 100 end, 20) >> --- >> -- 100 >> +- true >> ... >> -- Rebootstrap one node of the cluster and check that others follow. >> -- Note, due to ERRINJ_RELAY_TIMEOUT there is a substantial delay >> @@ -197,9 +201,9 @@ test_run:cmd('switch quorum1') >> - true >> ... >> test_run:cmd('restart server quorum1 with cleanup=1, args="0.1 0.5"') >> -box.space.test:count() -- 100 >> +test_run:wait_cond(function() return box.space.test:count() == 100 end, 20) >> --- >> -- 100 >> +- true >> ... >> -- The rebootstrapped replica will be assigned id = 4, >> -- because ids 1..3 are busy. >> @@ -207,11 +211,9 @@ test_run:cmd('switch quorum2') >> --- >> - true >> ... >> -fiber = require('fiber') >> ---- >> -... >> -while box.info.replication[4].upstream.status ~= 'follow' do fiber.sleep(0.001) end >> +test_run:wait_cond(function() return box.info.replication[4].upstream.status == 'follow' end, 20) >> --- >> +- true >> ... >> box.info.replication[4].upstream.status >> --- >> @@ -221,11 +223,13 @@ test_run:cmd('switch quorum3') >> --- >> - true >> ... 
>> -fiber = require('fiber') >> +test_run:wait_cond(function() return box.info.replication ~= nil end, 20) >> --- >> +- true >> ... >> -while box.info.replication[4].upstream.status ~= 'follow' do fiber.sleep(0.001) end >> +test_run:wait_cond(function() return box.info.replication[4].upstream.status == 'follow' end, 20) >> --- >> +- true >> ... >> box.info.replication[4].upstream.status >> --- >> diff --git a/test/replication/quorum.test.lua b/test/replication/quorum.test.lua >> index 98febb367..be23200d3 100644 >> --- a/test/replication/quorum.test.lua >> +++ b/test/replication/quorum.test.lua >> @@ -47,9 +47,9 @@ box.info.ro -- false >> box.info.status -- running >> >> -- Check that the replica follows all masters. >> -box.info.id == 1 or box.info.replication[1].upstream.status == 'follow' >> -box.info.id == 2 or box.info.replication[2].upstream.status == 'follow' >> -box.info.id == 3 or box.info.replication[3].upstream.status == 'follow' >> +box.info.id == 1 or test_run:wait_cond(function() return box.info.replication[1].upstream.status == 'follow' end, 20) >> +box.info.id == 2 or test_run:wait_cond(function() return box.info.replication[2].upstream.status == 'follow' end, 20) >> +box.info.id == 3 or test_run:wait_cond(function() return box.info.replication[3].upstream.status == 'follow' end, 20) >> >> -- Check that box.cfg() doesn't return until the instance >> -- catches up with all configured replicas. 
>> @@ -59,13 +59,14 @@ test_run:cmd('switch quorum2') >> box.error.injection.set("ERRINJ_RELAY_TIMEOUT", 0.001) >> test_run:cmd('stop server quorum1') >> >> +test_run:wait_cond(function() return box.space.test.index.primary ~= nil end, 20) >> for i = 1, 100 do box.space.test:insert{i} end >> fiber = require('fiber') >> fiber.sleep(0.1) >> >> test_run:cmd('start server quorum1 with args="0.1 0.5"') >> test_run:cmd('switch quorum1') >> -box.space.test:count() -- 100 >> +test_run:wait_cond(function() return box.space.test:count() == 100 end, 20) >> >> -- Rebootstrap one node of the cluster and check that others follow. >> -- Note, due to ERRINJ_RELAY_TIMEOUT there is a substantial delay >> @@ -81,17 +82,16 @@ box.snapshot() >> test_run:cmd('switch quorum1') >> test_run:cmd('restart server quorum1 with cleanup=1, args="0.1 0.5"') >> >> -box.space.test:count() -- 100 >> +test_run:wait_cond(function() return box.space.test:count() == 100 end, 20) >> >> -- The rebootstrapped replica will be assigned id = 4, >> -- because ids 1..3 are busy. >> test_run:cmd('switch quorum2') >> -fiber = require('fiber') >> -while box.info.replication[4].upstream.status ~= 'follow' do fiber.sleep(0.001) end >> +test_run:wait_cond(function() return box.info.replication[4].upstream.status == 'follow' end, 20) >> box.info.replication[4].upstream.status >> test_run:cmd('switch quorum3') >> -fiber = require('fiber') >> -while box.info.replication[4].upstream.status ~= 'follow' do fiber.sleep(0.001) end >> +test_run:wait_cond(function() return box.info.replication ~= nil end, 20) >> +test_run:wait_cond(function() return box.info.replication[4].upstream.status == 'follow' end, 20) >> box.info.replication[4].upstream.status >> >> -- Cleanup. >> -- Ilya Kosarev [-- Attachment #2: Type: text/html, Size: 7991 bytes --] ^ permalink raw reply [flat|nested] 5+ messages in thread
* [Tarantool-patches] [PATCH 0/2] fix replica iteration issue & stabilize quorum test @ 2019-11-18 13:19 Ilya Kosarev 2019-11-18 13:19 ` [Tarantool-patches] [PATCH 2/2] test: stabilize quorum test conditions Ilya Kosarev 0 siblings, 1 reply; 5+ messages in thread From: Ilya Kosarev @ 2019-11-18 13:19 UTC (permalink / raw) To: tarantool-patches This patchset fixes issues with iterating over anon replicas in replicaset_follow and stabilizes the quorum test. Branch: https://github.com/tarantool/tarantool/tree/i.kosarev/gh-4586-fix-quorum-test Issue: https://github.com/tarantool/tarantool/issues/4586 Ilya Kosarev (2): replication: make anon replicas iteration safe test: stabilize quorum test conditions src/box/replication.cc | 3 ++- test/replication/quorum.result | 28 +++++++++++++++++++--------- test/replication/quorum.test.lua | 16 +++++++++------- 3 files changed, 30 insertions(+), 17 deletions(-) -- 2.17.1 ^ permalink raw reply [flat|nested] 5+ messages in thread
* [Tarantool-patches] [PATCH 2/2] test: stabilize quorum test conditions 2019-11-18 13:19 [Tarantool-patches] [PATCH 0/2] fix replica iteration issue & stabilize quorum test Ilya Kosarev @ 2019-11-18 13:19 ` Ilya Kosarev 0 siblings, 0 replies; 5+ messages in thread From: Ilya Kosarev @ 2019-11-18 13:19 UTC (permalink / raw) To: tarantool-patches There were some pass conditions in quorum test which could take some time to be satisfied. Now they are wrapped using test_run:wait_cond to make the test stable. Closes #4586 --- test/replication/quorum.result | 28 +++++++++++++++++++--------- test/replication/quorum.test.lua | 16 +++++++++------- 2 files changed, 28 insertions(+), 16 deletions(-) diff --git a/test/replication/quorum.result b/test/replication/quorum.result index ff5fa0150..73def113f 100644 --- a/test/replication/quorum.result +++ b/test/replication/quorum.result @@ -115,15 +115,15 @@ box.info.status -- running - running ... -- Check that the replica follows all masters. -box.info.id == 1 or box.info.replication[1].upstream.status == 'follow' +box.info.id == 1 or test_run:wait_cond(function() return box.info.replication[1].upstream.status == 'follow' end, 20) --- - true ... -box.info.id == 2 or box.info.replication[2].upstream.status == 'follow' +box.info.id == 2 or test_run:wait_cond(function() return box.info.replication[2].upstream.status == 'follow' end, 20) --- - true ... -box.info.id == 3 or box.info.replication[3].upstream.status == 'follow' +box.info.id == 3 or test_run:wait_cond(function() return box.info.replication[3].upstream.status == 'follow' end, 20) --- - true ... @@ -149,6 +149,10 @@ test_run:cmd('stop server quorum1') --- - true ... +test_run:wait_cond(function() return box.space.test.index.primary ~= nil end, 20) +--- +- true +... for i = 1, 100 do box.space.test:insert{i} end --- ... @@ -166,9 +170,9 @@ test_run:cmd('switch quorum1') --- - true ... 
-box.space.test:count() -- 100 +test_run:wait_cond(function() return box.space.test:count() == 100 end, 20) --- -- 100 +- true ... -- Rebootstrap one node of the cluster and check that others follow. -- Note, due to ERRINJ_RELAY_TIMEOUT there is a substantial delay @@ -197,9 +201,9 @@ test_run:cmd('switch quorum1') - true ... test_run:cmd('restart server quorum1 with cleanup=1, args="0.1 0.5"') -box.space.test:count() -- 100 +test_run:wait_cond(function() return box.space.test:count() == 100 end, 20) --- -- 100 +- true ... -- The rebootstrapped replica will be assigned id = 4, -- because ids 1..3 are busy. @@ -210,8 +214,9 @@ test_run:cmd('switch quorum2') fiber = require('fiber') --- ... -while box.info.replication[4].upstream.status ~= 'follow' do fiber.sleep(0.001) end +test_run:wait_cond(function() return box.info.replication[4].upstream.status == 'follow' end, 20) --- +- true ... box.info.replication[4].upstream.status --- @@ -224,8 +229,13 @@ test_run:cmd('switch quorum3') fiber = require('fiber') --- ... -while box.info.replication[4].upstream.status ~= 'follow' do fiber.sleep(0.001) end +test_run:wait_cond(function() return box.info.replication ~= nil end, 20) --- +- true +... +test_run:wait_cond(function() return box.info.replication[4].upstream.status == 'follow' end, 20) +--- +- true ... box.info.replication[4].upstream.status --- diff --git a/test/replication/quorum.test.lua b/test/replication/quorum.test.lua index 98febb367..1842cdffe 100644 --- a/test/replication/quorum.test.lua +++ b/test/replication/quorum.test.lua @@ -47,9 +47,9 @@ box.info.ro -- false box.info.status -- running -- Check that the replica follows all masters. 
-box.info.id == 1 or box.info.replication[1].upstream.status == 'follow' -box.info.id == 2 or box.info.replication[2].upstream.status == 'follow' -box.info.id == 3 or box.info.replication[3].upstream.status == 'follow' +box.info.id == 1 or test_run:wait_cond(function() return box.info.replication[1].upstream.status == 'follow' end, 20) +box.info.id == 2 or test_run:wait_cond(function() return box.info.replication[2].upstream.status == 'follow' end, 20) +box.info.id == 3 or test_run:wait_cond(function() return box.info.replication[3].upstream.status == 'follow' end, 20) -- Check that box.cfg() doesn't return until the instance -- catches up with all configured replicas. @@ -59,13 +59,14 @@ test_run:cmd('switch quorum2') box.error.injection.set("ERRINJ_RELAY_TIMEOUT", 0.001) test_run:cmd('stop server quorum1') +test_run:wait_cond(function() return box.space.test.index.primary ~= nil end, 20) for i = 1, 100 do box.space.test:insert{i} end fiber = require('fiber') fiber.sleep(0.1) test_run:cmd('start server quorum1 with args="0.1 0.5"') test_run:cmd('switch quorum1') -box.space.test:count() -- 100 +test_run:wait_cond(function() return box.space.test:count() == 100 end, 20) -- Rebootstrap one node of the cluster and check that others follow. -- Note, due to ERRINJ_RELAY_TIMEOUT there is a substantial delay @@ -81,17 +82,18 @@ box.snapshot() test_run:cmd('switch quorum1') test_run:cmd('restart server quorum1 with cleanup=1, args="0.1 0.5"') -box.space.test:count() -- 100 +test_run:wait_cond(function() return box.space.test:count() == 100 end, 20) -- The rebootstrapped replica will be assigned id = 4, -- because ids 1..3 are busy. 
test_run:cmd('switch quorum2') fiber = require('fiber') -while box.info.replication[4].upstream.status ~= 'follow' do fiber.sleep(0.001) end +test_run:wait_cond(function() return box.info.replication[4].upstream.status == 'follow' end, 20) box.info.replication[4].upstream.status test_run:cmd('switch quorum3') fiber = require('fiber') -while box.info.replication[4].upstream.status ~= 'follow' do fiber.sleep(0.001) end +test_run:wait_cond(function() return box.info.replication ~= nil end, 20) +test_run:wait_cond(function() return box.info.replication[4].upstream.status == 'follow' end, 20) box.info.replication[4].upstream.status -- Cleanup. -- 2.17.1 ^ permalink raw reply [flat|nested] 5+ messages in thread