From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from smtp50.i.mail.ru (smtp50.i.mail.ru [94.100.177.110])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dev.tarantool.org (Postfix) with ESMTPS id 26812445321
	for ; Thu, 23 Jul 2020 02:57:06 +0300 (MSK)
From: Vladislav Shpilevoy
Date: Thu, 23 Jul 2020 01:57:02 +0200
Message-Id: <63b15c4eed1be14d3feb1d02de1e15ed4b38d748.1595462166.git.v.shpilevoy@tarantool.org>
In-Reply-To:
References:
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [Tarantool-patches] [PATCH 2/2] test: fix flaky qsync_with_anon.test.lua again
List-Id: Tarantool development patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: tarantool-patches@dev.tarantool.org, avtikhon@tarantool.org

One of the test cases had 2 problems.

- The same as in the previous commit - it started a sync transaction
  on master and switched to replica assuming the replica sees
  everything up to this sync transaction, but the replica still could
  see data from the previous test case;

- The test case tried to write a sync transaction on master, got a
  timeout, and switched to replica to ensure the data is removed there
  too. But since dirty reads are possible, it could happen that the
  data was delivered to replica and ROLLBACK was not yet. On the
  replica the rolled back data still could be visible.

The first issue is solved by flushing master's state to replica via
making a successful sync transaction.

The second issue is fixed by splitting the test case into more steps,
which do not depend on timeouts (1000 is considered infinity).

Closes #5196
---
 test/replication/qsync_with_anon.result   | 58 +++++++++++++++++++++--
 test/replication/qsync_with_anon.test.lua | 27 +++++++++--
 2 files changed, 76 insertions(+), 9 deletions(-)

diff --git a/test/replication/qsync_with_anon.result b/test/replication/qsync_with_anon.result
index 51f02bcdb..6a2952a32 100644
--- a/test/replication/qsync_with_anon.result
+++ b/test/replication/qsync_with_anon.result
@@ -96,7 +96,7 @@ box.space.sync:drop()
 -- [RFC, Asynchronous replication] failed transaction rolled back on async
 -- replica.
 -- Testcase setup.
-box.cfg{replication_synchro_quorum=BROKEN_QUORUM, replication_synchro_timeout=0.1}
+box.cfg{replication_synchro_quorum = NUM_INSTANCES, replication_synchro_timeout = 1000}
  | ---
  | ...
 _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
@@ -105,23 +105,71 @@ _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
 _ = box.space.sync:create_index('pk')
  | ---
  | ...
--- Testcase body.
+-- Write something to flush the current master's state to replica.
+_ = box.space.sync:insert{1}
+ | ---
+ | ...
+_ = box.space.sync:delete{1}
+ | ---
+ | ...
+
+box.cfg{replication_synchro_quorum = BROKEN_QUORUM, replication_synchro_timeout = 1000}
+ | ---
+ | ...
+fiber = require('fiber')
+ | ---
+ | ...
+ok, err = nil
+ | ---
+ | ...
+f = fiber.create(function() \
+    ok, err = pcall(box.space.sync.insert, box.space.sync, {1}) \
+end)
+ | ---
+ | ...
+
+test_run:cmd('switch replica_anon')
+ | ---
+ | - true
+ | ...
+test_run:wait_cond(function() return box.space.sync:count() == 1 end)
+ | ---
+ | - true
+ | ...
+box.space.sync:select{}
+ | ---
+ | - - [1]
+ | ...
+
 test_run:switch('default')
  | ---
  | - true
  | ...
-box.space.sync:insert{1} -- failure
+box.cfg{replication_synchro_timeout = 0.001}
+ | ---
+ | ...
+test_run:wait_cond(function() return f:status() == 'dead' end)
+ | ---
+ | - true
+ | ...
+box.space.sync:select{}
  | ---
- | - error: Quorum collection for a synchronous transaction is timed out
+ | - []
  | ...
+
 test_run:cmd('switch replica_anon')
  | ---
  | - true
  | ...
-box.space.sync:select{} -- none
+test_run:wait_cond(function() return box.space.sync:count() == 0 end)
+ | ---
+ | - true
+ | ...
+box.space.sync:select{}
  | ---
  | - []
  | ...
+
 test_run:switch('default')
  | ---
  | - true
diff --git a/test/replication/qsync_with_anon.test.lua b/test/replication/qsync_with_anon.test.lua
index 5bc7c8be4..d7ecaa107 100644
--- a/test/replication/qsync_with_anon.test.lua
+++ b/test/replication/qsync_with_anon.test.lua
@@ -36,14 +36,33 @@ box.space.sync:drop()
 -- [RFC, Asynchronous replication] failed transaction rolled back on async
 -- replica.
 -- Testcase setup.
-box.cfg{replication_synchro_quorum=BROKEN_QUORUM, replication_synchro_timeout=0.1}
+box.cfg{replication_synchro_quorum = NUM_INSTANCES, replication_synchro_timeout = 1000}
 _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
 _ = box.space.sync:create_index('pk')
--- Testcase body.
+-- Write something to flush the current master's state to replica.
+_ = box.space.sync:insert{1}
+_ = box.space.sync:delete{1}
+
+box.cfg{replication_synchro_quorum = BROKEN_QUORUM, replication_synchro_timeout = 1000}
+fiber = require('fiber')
+ok, err = nil
+f = fiber.create(function() \
+    ok, err = pcall(box.space.sync.insert, box.space.sync, {1}) \
+end)
+
+test_run:cmd('switch replica_anon')
+test_run:wait_cond(function() return box.space.sync:count() == 1 end)
+box.space.sync:select{}
+
 test_run:switch('default')
-box.space.sync:insert{1} -- failure
+box.cfg{replication_synchro_timeout = 0.001}
+test_run:wait_cond(function() return f:status() == 'dead' end)
+box.space.sync:select{}
+
 test_run:cmd('switch replica_anon')
-box.space.sync:select{} -- none
+test_run:wait_cond(function() return box.space.sync:count() == 0 end)
+box.space.sync:select{}
+
 test_run:switch('default')
 box.cfg{replication_synchro_quorum=NUM_INSTANCES, replication_synchro_timeout=1000}
 box.space.sync:insert{1} -- success
-- 
2.21.1 (Apple Git-122.3)
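
For illustration only, here is a minimal plain-Lua sketch of the race behind the second problem and of why polling a condition is more robust than a one-shot read after a timeout. It does not use Tarantool at all: `replica`, `replicate`, `tick` and `count` are hypothetical stand-ins for the replication stream and the applier, and `wait_cond` only loosely models `test_run:wait_cond`, with a step budget in place of a wall-clock timeout.

```lua
-- Hypothetical stand-in for a replica that applies replicated
-- operations with a delay (not a Tarantool API).
local replica = {rows = {}, pending = {}}

-- Queue an operation as if it had arrived over the replication stream.
local function replicate(op, key)
    table.insert(replica.pending, {op = op, key = key})
end

-- Apply one pending operation; models one scheduling step of the applier.
local function tick()
    local item = table.remove(replica.pending, 1)
    if item == nil then
        return
    end
    if item.op == 'insert' then
        replica.rows[item.key] = true
    elseif item.op == 'rollback' then
        replica.rows[item.key] = nil
    end
end

-- Number of rows currently visible on the replica (dirty reads included).
local function count()
    local n = 0
    for _ in pairs(replica.rows) do
        n = n + 1
    end
    return n
end

-- Poll cond() until it holds or the step budget is exhausted.
local function wait_cond(cond, max_steps)
    for _ = 1, max_steps do
        if cond() then
            return true
        end
        tick()
    end
    return cond()
end

-- The insert reaches the replica first and becomes visible there.
replicate('insert', 1)
assert(wait_cond(function() return count() == 1 end, 10))

-- The rollback is sent later; a one-shot read right now still sees the row.
replicate('rollback', 1)
assert(count() == 1)

-- Waiting for the condition instead of reading once removes the race.
assert(wait_cond(function() return count() == 0 end, 10))
print('ok')
```

The sketch mirrors the order of steps in the patch: wait until the row is visible on the replica, trigger the rollback, then wait until the row is gone, instead of assuming that the rollback has already been applied by the time of the read.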