Found the following error in our CI:

[001] Test failed! Result content mismatch:
[001] --- replication/gh-3055-election-promote.result Mon Aug 2 17:52:55 2021
[001] +++ var/rejects/replication/gh-3055-election-promote.reject Mon Aug 9 10:29:34 2021
[001] @@ -88,7 +88,7 @@
[001] | ...
[001] assert(not box.info.ro)
[001] | ---
[001] - | - true
[001] + | - error: assertion failed!
[001] | ...
[001] assert(box.info.election.term > term)
[001] | ---
[001]

The problem was the same as in recently fixed election_qsync.test
(commit 096a0a7d5e2442656d228658f89f0c7066c60b16): PROMOTE is written to
WAL asynchronously, and box.ctl.promote() returns earlier than this
happens.

Fix the issue by waiting for the instance to become writeable.

Follow-up #6034
---
https://github.com/tarantool/tarantool/tree/sp/election-promote-fix

 test/replication/gh-3055-election-promote.result   | 4 ++--
 test/replication/gh-3055-election-promote.test.lua | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/test/replication/gh-3055-election-promote.result b/test/replication/gh-3055-election-promote.result
index 6f5af13bc..c51ee8056 100644
--- a/test/replication/gh-3055-election-promote.result
+++ b/test/replication/gh-3055-election-promote.result
@@ -50,7 +50,7 @@ assert(box.info.election.state == 'leader')
 | ---
 | - true
 | ...
-assert(not box.info.ro)
+test_run:wait_cond(function() return not box.info.ro end)
 | ---
 | - true
 | ...
@@ -86,7 +86,7 @@ assert(box.info.election.state == 'leader')
 | ---
 | - true
 | ...
-assert(not box.info.ro)
+test_run:wait_cond(function() return not box.info.ro end)
 | ---
 | - true
 | ...
diff --git a/test/replication/gh-3055-election-promote.test.lua b/test/replication/gh-3055-election-promote.test.lua
index cbc3ed206..84acc24b8 100644
--- a/test/replication/gh-3055-election-promote.test.lua
+++ b/test/replication/gh-3055-election-promote.test.lua
@@ -24,7 +24,7 @@ assert(box.info.election.state == 'follower')
 term = box.info.election.term
 box.ctl.promote()
 assert(box.info.election.state == 'leader')
-assert(not box.info.ro)
+test_run:wait_cond(function() return not box.info.ro end)
 assert(box.info.election.term > term)
 
 -- Test promote when there's a live leader.
@@ -35,7 +35,7 @@ assert(box.info.ro)
 assert(box.info.election.leader ~= 0)
 box.ctl.promote()
 assert(box.info.election.state == 'leader')
-assert(not box.info.ro)
+test_run:wait_cond(function() return not box.info.ro end)
 assert(box.info.election.term > term)
 
 -- Cleanup.
--
2.30.1 (Apple Git-130)

On Mon, Aug 09, 2021 at 10:42:05AM +0300, Serge Petrenko wrote:
> Found the following error in our CI:
>
> [001] Test failed! Result content mismatch:
> [001] --- replication/gh-3055-election-promote.result Mon Aug 2 17:52:55 2021
> [001] +++ var/rejects/replication/gh-3055-election-promote.reject Mon Aug 9 10:29:34 2021
> [001] @@ -88,7 +88,7 @@
> [001] | ...
> [001] assert(not box.info.ro)
> [001] | ---
> [001] - | - true
> [001] + | - error: assertion failed!
> [001] | ...
> [001] assert(box.info.election.term > term)
> [001] | ---
> [001]
>
> The problem was the same as in recently fixed election_qsync.test
> (commit 096a0a7d5e2442656d228658f89f0c7066c60b16): PROMOTE is written to
> WAL asynchronously, and box.ctl.promote() returns earlier than this
> happens.
Ack.
Hi team,

Thank you for fixing the flaky test.

QA LGTM

--
Vitaliia Ioffe

>Monday, 9 August 2021, 10:45 +03:00 from Cyrill Gorcunov via Tarantool-patches <tarantool-patches@dev.tarantool.org>:
>
>On Mon, Aug 09, 2021 at 10:42:05AM +0300, Serge Petrenko wrote:
>> Found the following error in our CI:
>>
>> [001] Test failed! Result content mismatch:
>> [001] --- replication/gh-3055-election-promote.result Mon Aug 2 17:52:55 2021
>> [001] +++ var/rejects/replication/gh-3055-election-promote.reject Mon Aug 9 10:29:34 2021
>> [001] @@ -88,7 +88,7 @@
>> [001] | ...
>> [001] assert(not box.info.ro)
>> [001] | ---
>> [001] - | - true
>> [001] + | - error: assertion failed!
>> [001] | ...
>> [001] assert(box.info.election.term > term)
>> [001] | ---
>> [001]
>>
>> The problem was the same as in recently fixed election_qsync.test
>> (commit 096a0a7d5e2442656d228658f89f0c7066c60b16): PROMOTE is written to
>> WAL asynchronously, and box.ctl.promote() returns earlier than this
>> happens.
>Ack.

Hello,
On 09 Aug 10:42, Serge Petrenko via Tarantool-patches wrote:
> Found the following error in our CI:
>
> [001] Test failed! Result content mismatch:
> [001] --- replication/gh-3055-election-promote.result Mon Aug 2 17:52:55 2021
> [001] +++ var/rejects/replication/gh-3055-election-promote.reject Mon Aug 9 10:29:34 2021
> [001] @@ -88,7 +88,7 @@
> [001] | ...
> [001] assert(not box.info.ro)
> [001] | ---
> [001] - | - true
> [001] + | - error: assertion failed!
> [001] | ...
> [001] assert(box.info.election.term > term)
> [001] | ---
> [001]
>
> The problem was the same as in recently fixed election_qsync.test
> (commit 096a0a7d5e2442656d228658f89f0c7066c60b16): PROMOTE is written to
> WAL asynchronously, and box.ctl.promote() returns earlier than this
> happens.
>
> Fix the issue by waiting for the instance to become writeable.
>
> Follow-up #6034
I've checked your patch into 2.7, 2.8 and master.
--
Regards, Kirill Yukhin