[Tarantool-patches] [PATCH 2/2] test: fix flaky qsync_with_anon.test.lua again

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Thu Jul 23 02:57:02 MSK 2020


One of the test cases had 2 problems.

- The same as in the previous commit: it started a sync
  transaction on master and switched to replica assuming the
  replica sees everything up to this sync transaction, but the
  replica could still see data from the previous test case;

- The test case tried to write a sync transaction on master, got
  a timeout, and switched to replica to ensure the data was
  removed there too. But since dirty reads are possible, the
  data could have been delivered to replica while ROLLBACK had
  not been yet, so the rolled back data could still be visible
  on the replica.

The first issue is solved by flushing master's state to replica
via making a successful sync transaction.

The second issue is fixed by splitting the test case into more
steps that do not depend on timeouts (1000 is considered
infinity).

Closes #5196
---
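
The second fix replaces "wait for a timeout, then check once" with
"poll until the expected state is observed". Below is a minimal,
self-contained sketch of that pattern in plain Lua. The local
wait_cond() and replica_count() are hypothetical stand-ins written
for this illustration: they are not test_run:wait_cond() or
box.space.sync:count(), and the sketch bounds the wait by an attempt
counter only to stay runnable without Tarantool.

```lua
-- Hypothetical stand-in for a wait_cond-style helper: poll a predicate
-- until it holds or the attempt budget is exhausted.
local function wait_cond(cond, max_attempts)
    for attempt = 1, max_attempts do
        if cond() then
            return true, attempt
        end
    end
    return false, max_attempts
end

-- Simulated replica (assumption for the sketch): the replicated row
-- becomes visible only on the third poll.
local polls = 0
local function replica_count()
    polls = polls + 1
    return polls >= 3 and 1 or 0
end

-- Poll for the state instead of assuming it was reached after a delay.
local ok, attempts = wait_cond(function()
    return replica_count() == 1
end, 10)
assert(ok and attempts == 3)
```

The point is that the check passes as soon as the state is reached,
however long delivery takes, so the result does not depend on a
guessed delay.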
 test/replication/qsync_with_anon.result   | 58 +++++++++++++++++++++--
 test/replication/qsync_with_anon.test.lua | 27 +++++++++--
 2 files changed, 76 insertions(+), 9 deletions(-)

diff --git a/test/replication/qsync_with_anon.result b/test/replication/qsync_with_anon.result
index 51f02bcdb..6a2952a32 100644
--- a/test/replication/qsync_with_anon.result
+++ b/test/replication/qsync_with_anon.result
@@ -96,7 +96,7 @@ box.space.sync:drop()
 -- [RFC, Asynchronous replication] failed transaction rolled back on async
 -- replica.
 -- Testcase setup.
-box.cfg{replication_synchro_quorum=BROKEN_QUORUM, replication_synchro_timeout=0.1}
+box.cfg{replication_synchro_quorum = NUM_INSTANCES, replication_synchro_timeout = 1000}
  | ---
  | ...
 _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
@@ -105,23 +105,71 @@ _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
 _ = box.space.sync:create_index('pk')
  | ---
  | ...
--- Testcase body.
+-- Write something to flush the current master's state to replica.
+_ = box.space.sync:insert{1}
+ | ---
+ | ...
+_ = box.space.sync:delete{1}
+ | ---
+ | ...
+
+box.cfg{replication_synchro_quorum = BROKEN_QUORUM, replication_synchro_timeout = 1000}
+ | ---
+ | ...
+fiber = require('fiber')
+ | ---
+ | ...
+ok, err = nil
+ | ---
+ | ...
+f = fiber.create(function()                                                     \
+    ok, err = pcall(box.space.sync.insert, box.space.sync, {1})                 \
+end)
+ | ---
+ | ...
+
+test_run:cmd('switch replica_anon')
+ | ---
+ | - true
+ | ...
+test_run:wait_cond(function() return box.space.sync:count() == 1 end)
+ | ---
+ | - true
+ | ...
+box.space.sync:select{}
+ | ---
+ | - - [1]
+ | ...
+
 test_run:switch('default')
  | ---
  | - true
  | ...
-box.space.sync:insert{1} -- failure
+box.cfg{replication_synchro_timeout = 0.001}
+ | ---
+ | ...
+test_run:wait_cond(function() return f:status() == 'dead' end)
+ | ---
+ | - true
+ | ...
+box.space.sync:select{}
  | ---
- | - error: Quorum collection for a synchronous transaction is timed out
+ | - []
  | ...
+
 test_run:cmd('switch replica_anon')
  | ---
  | - true
  | ...
-box.space.sync:select{} -- none
+test_run:wait_cond(function() return box.space.sync:count() == 0 end)
+ | ---
+ | - true
+ | ...
+box.space.sync:select{}
  | ---
  | - []
  | ...
+
 test_run:switch('default')
  | ---
  | - true
diff --git a/test/replication/qsync_with_anon.test.lua b/test/replication/qsync_with_anon.test.lua
index 5bc7c8be4..d7ecaa107 100644
--- a/test/replication/qsync_with_anon.test.lua
+++ b/test/replication/qsync_with_anon.test.lua
@@ -36,14 +36,33 @@ box.space.sync:drop()
 -- [RFC, Asynchronous replication] failed transaction rolled back on async
 -- replica.
 -- Testcase setup.
-box.cfg{replication_synchro_quorum=BROKEN_QUORUM, replication_synchro_timeout=0.1}
+box.cfg{replication_synchro_quorum = NUM_INSTANCES, replication_synchro_timeout = 1000}
 _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
 _ = box.space.sync:create_index('pk')
--- Testcase body.
+-- Write something to flush the current master's state to replica.
+_ = box.space.sync:insert{1}
+_ = box.space.sync:delete{1}
+
+box.cfg{replication_synchro_quorum = BROKEN_QUORUM, replication_synchro_timeout = 1000}
+fiber = require('fiber')
+ok, err = nil
+f = fiber.create(function()                                                     \
+    ok, err = pcall(box.space.sync.insert, box.space.sync, {1})                 \
+end)
+
+test_run:cmd('switch replica_anon')
+test_run:wait_cond(function() return box.space.sync:count() == 1 end)
+box.space.sync:select{}
+
 test_run:switch('default')
-box.space.sync:insert{1} -- failure
+box.cfg{replication_synchro_timeout = 0.001}
+test_run:wait_cond(function() return f:status() == 'dead' end)
+box.space.sync:select{}
+
 test_run:cmd('switch replica_anon')
-box.space.sync:select{} -- none
+test_run:wait_cond(function() return box.space.sync:count() == 0 end)
+box.space.sync:select{}
+
 test_run:switch('default')
 box.cfg{replication_synchro_quorum=NUM_INSTANCES, replication_synchro_timeout=1000}
 box.space.sync:insert{1} -- success
-- 
2.21.1 (Apple Git-122.3)


