Date: Wed, 29 Apr 2020 14:53:50 +0300
From: Sergey Bronnikov
Message-ID: <20200429115350.GA26469@pony.bronevichok.ru>
References: <8b8bef3060a2a3a78ab8b6de1d99443d0545e3b4.1587942462.git.avtikhon@tarantool.org>
In-Reply-To: <8b8bef3060a2a3a78ab8b6de1d99443d0545e3b4.1587942462.git.avtikhon@tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH v1] test: fix flaky replication/skip_conflict_row test
To: "Alexander V. Tikhonov"
Cc: Oleg Piskunov, tarantool-patches@dev.tarantool.org

LGTM

The test passed 100 out of 100 iterations.
See also my comments inline.

On 02:08 Mon 27 Apr, Alexander V. Tikhonov wrote:
> From: "Alexander V. Tikhonov"
>
> Fixed flaky upstream checks at replication/skip_conflict_row test.
>
> Errors fixed:
>
> [038] @@ -174,7 +174,7 @@
> [038]  ...
> [038]  box.info.replication[1].upstream.status
> [038]  ---
> [038] -- follow
> [038] +- disconnected
> [038]  ...
> [038]  -- write some conflicting records on slave
> [038]  for i = 1, 10 do box.space.test:insert({i, 'r'}) end
>
> [030] @@ -201,12 +201,12 @@
> [030]  -- lsn should be incremented
> [030]  v1 == box.info.vclock[1] - 10
> [030]  ---
> [030] -- true
> [030] +- false
> [030]  ...
> [030]  -- and state is follow
> [030]  box.info.replication[1].upstream.status
> [030]  ---
> [030] -- follow
> [030] +- disconnected
> [030]  ...
> [030]  -- restart server and check replication continues from nop-ed vclock
> [030]  test_run:cmd("switch default")
>
> [022] @@ -230,7 +230,7 @@
> [022]  ...
> [022]  box.info.replication[1].upstream.status
> [022]  ---
> [022] -- follow
> [022] +- disconnected
> [022]  ...
> [022]  box.space.test:select({11}, {iterator = "GE"})
> [022]  ---
> [022]

It is not clear from this output what problem you have fixed.
I propose removing this output and adding a description of the problem.

>
> Closes #4425
> ---
>  test/replication/skip_conflict_row.result   | 34 ++++++++-------------
>  test/replication/skip_conflict_row.test.lua | 16 +++++-----
>  test/replication/suite.ini                  |  1 -
>  3 files changed, 20 insertions(+), 31 deletions(-)
>
> diff --git a/test/replication/skip_conflict_row.result b/test/replication/skip_conflict_row.result
> index d70ac8e2a..737522b8b 100644
> --- a/test/replication/skip_conflict_row.result
> +++ b/test/replication/skip_conflict_row.result
> @@ -64,13 +64,9 @@ test_run:cmd("switch replica")
>  ---
>  - true
>  ...
> -box.info.replication[1].upstream.message
> +test_run:wait_upstream(1, {status = 'follow', message_re = box.NULL})
>  ---
> -- null
> -...
> -box.info.replication[1].upstream.status
> ----
> -- follow
> +- true
>  ...
>  box.space.test:select()
>  ---
> @@ -123,13 +119,9 @@ lsn1 == box.info.vclock[1]
>  ---
>  - true
>  ...
> -box.info.replication[1].upstream.message
> ----
> -- Duplicate key exists in unique index 'primary' in space 'test'
> -...
> -box.info.replication[1].upstream.status
> +test_run:wait_upstream(1, {status = 'stopped', message_re = "Duplicate key exists in unique index 'primary' in space 'test'"})
>  ---
> -- stopped
> +- true
>  ...
>  test_run:cmd("switch default")
>  ---
> @@ -140,9 +132,9 @@ test_run:cmd("restart server replica")
>  - true
>  ...
>  -- applier is not in follow state
> -box.info.replication[1].upstream.message
> +test_run:wait_upstream(1, {status = 'stopped', message_re = "Duplicate key exists in unique index 'primary' in space 'test'"})
>  ---
> -- Duplicate key exists in unique index 'primary' in space 'test'
> +- true
>  ...
>  --
>  -- gh-3977: check that NOP is written instead of conflicting row.
> @@ -172,9 +164,9 @@ test_run:cmd("switch replica")
>  ---
>  - true
>  ...
> -box.info.replication[1].upstream.status
> +test_run:wait_upstream(1, {status = 'follow'})
>  ---
> -- follow
> +- true
>  ...
>  -- write some conflicting records on slave
>  for i = 1, 10 do box.space.test:insert({i, 'r'}) end
> @@ -199,14 +191,14 @@ test_run:cmd("switch replica")
>  - true
>  ...
>  -- lsn should be incremented
> -v1 == box.info.vclock[1] - 10
> +test_run:wait_cond(function() return v1 == box.info.vclock[1] - 10 end)
>  ---
>  - true
>  ...
>  -- and state is follow
> -box.info.replication[1].upstream.status
> +test_run:wait_upstream(1, {status = 'follow'})
>  ---
> -- follow
> +- true
>  ...
>  -- restart server and check replication continues from nop-ed vclock
>  test_run:cmd("switch default")
> @@ -228,9 +220,9 @@ test_run:cmd("switch replica")
>  ---
>  - true
>  ...
> -box.info.replication[1].upstream.status
> +test_run:wait_upstream(1, {status = 'follow'})
>  ---
> -- follow
> +- true
>  ...
>  box.space.test:select({11}, {iterator = "GE"})
>  ---
> diff --git a/test/replication/skip_conflict_row.test.lua b/test/replication/skip_conflict_row.test.lua
> index 04fd08136..e7a93cc74 100644
> --- a/test/replication/skip_conflict_row.test.lua
> +++ b/test/replication/skip_conflict_row.test.lua
> @@ -22,8 +22,7 @@ vclock = test_run:get_vclock('default')
>  vclock[0] = nil
>  _ = test_run:wait_vclock("replica", vclock)
>  test_run:cmd("switch replica")
> -box.info.replication[1].upstream.message
> -box.info.replication[1].upstream.status
> +test_run:wait_upstream(1, {status = 'follow', message_re = box.NULL})
>  box.space.test:select()
>
>  test_run:cmd("switch default")
> @@ -41,12 +40,11 @@ box.space.test:insert{4}
>  test_run:cmd("switch replica")
>  -- lsn is not promoted
>  lsn1 == box.info.vclock[1]
> -box.info.replication[1].upstream.message
> -box.info.replication[1].upstream.status
> +test_run:wait_upstream(1, {status = 'stopped', message_re = "Duplicate key exists in unique index 'primary' in space 'test'"})
>  test_run:cmd("switch default")
>  test_run:cmd("restart server replica")
>  -- applier is not in follow state
> -box.info.replication[1].upstream.message
> +test_run:wait_upstream(1, {status = 'stopped', message_re = "Duplicate key exists in unique index 'primary' in space 'test'"})
>
>  --
>  -- gh-3977: check that NOP is written instead of conflicting row.
> @@ -60,7 +58,7 @@ test_run:cmd("switch default")
>  box.space.test:truncate()
>  test_run:cmd("restart server replica")
>  test_run:cmd("switch replica")
> -box.info.replication[1].upstream.status
> +test_run:wait_upstream(1, {status = 'follow'})
>  -- write some conflicting records on slave
>  for i = 1, 10 do box.space.test:insert({i, 'r'}) end
>  box.cfg{replication_skip_conflict = true}
> @@ -72,9 +70,9 @@ for i = 1, 10 do box.space.test:insert({i, 'm'}) end
>
>  test_run:cmd("switch replica")
>  -- lsn should be incremented
> -v1 == box.info.vclock[1] - 10
> +test_run:wait_cond(function() return v1 == box.info.vclock[1] - 10 end)
>  -- and state is follow
> -box.info.replication[1].upstream.status
> +test_run:wait_upstream(1, {status = 'follow'})
>
>  -- restart server and check replication continues from nop-ed vclock
>  test_run:cmd("switch default")
> @@ -82,7 +80,7 @@ test_run:cmd("stop server replica")
>  for i = 11, 20 do box.space.test:insert({i, 'm'}) end
>  test_run:cmd("start server replica")
>  test_run:cmd("switch replica")
> -box.info.replication[1].upstream.status
> +test_run:wait_upstream(1, {status = 'follow'})
>  box.space.test:select({11}, {iterator = "GE"})
>
>  test_run:cmd("switch default")
> diff --git a/test/replication/suite.ini b/test/replication/suite.ini
> index ac413669d..572dd47fe 100644
> --- a/test/replication/suite.ini
> +++ b/test/replication/suite.ini
> @@ -13,7 +13,6 @@ is_parallel = True
>  pretest_clean = True
>  fragile = errinj.test.lua            ; gh-3870
>            long_row_timeout.test.lua  ; gh-4351
> -          skip_conflict_row.test.lua ; gh-4457
>            sync.test.lua              ; gh-3835 gh-3877
>            transaction.test.lua       ; gh-4312
>            wal_off.test.lua           ; gh-4355
> --
> 2.17.1
>

--
sergeyb@