Tarantool development patches archive
* [Tarantool-patches] [PATCH 0/2] Qsync flaky tests, next iteration
@ 2020-07-22 23:57 Vladislav Shpilevoy
  2020-07-22 23:57 ` [Tarantool-patches] [PATCH 1/2] test: fix flaky qsync_snapshots.test.lua again Vladislav Shpilevoy
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Vladislav Shpilevoy @ 2020-07-22 23:57 UTC
  To: tarantool-patches, avtikhon

The patchset attempts to fix more flaky test cases discovered since the last
fixes.

Branch: http://github.com/tarantool/tarantool/tree/gerold103/qsync-flaky-tests
Issue: https://github.com/tarantool/tarantool/issues/5196
Issue: https://github.com/tarantool/tarantool/issues/5167

Vladislav Shpilevoy (2):
  test: fix flaky qsync_snapshots.test.lua again
  test: fix flaky qsync_with_anon.test.lua again

 test/replication/qsync_snapshots.result   |  8 +++-
 test/replication/qsync_snapshots.test.lua |  4 +-
 test/replication/qsync_with_anon.result   | 58 +++++++++++++++++++++--
 test/replication/qsync_with_anon.test.lua | 27 +++++++++--
 4 files changed, 86 insertions(+), 11 deletions(-)

-- 
2.21.1 (Apple Git-122.3)


* [Tarantool-patches] [PATCH 1/2] test: fix flaky qsync_snapshots.test.lua again
  2020-07-22 23:57 [Tarantool-patches] [PATCH 0/2] Qsync flaky tests, next iteration Vladislav Shpilevoy
@ 2020-07-22 23:57 ` Vladislav Shpilevoy
  2020-07-22 23:57 ` [Tarantool-patches] [PATCH 2/2] test: fix flaky qsync_with_anon.test.lua again Vladislav Shpilevoy
  2020-07-28  8:09 ` [Tarantool-patches] [PATCH 0/2] Qsync flaky tests, next iteration Alexander V. Tikhonov
  2 siblings, 0 replies; 5+ messages in thread
From: Vladislav Shpilevoy @ 2020-07-22 23:57 UTC
  To: tarantool-patches, avtikhon

One of the test cases started a sync transaction on master,
switched to replica, and tried to do some actions assuming that
the latest master data had already arrived there.

But in fact the replica could be far behind the master. It could
still contain data from the previous test case. That looked like
a bug: the replica seemed to have data committed on it that was
not committed on master, while in fact this was just data from
the previous test case.

The issue is solved by making a successful sync transaction,
which flushes the master's state to the replica.
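
For illustration, the flush pattern can be sketched as below. This
is a fragment for Tarantool's test-run harness, not standalone Lua:
test_run, engine, and NUM_INSTANCES are assumed to be defined earlier
in the test file. The reasoning assumed here is that with the quorum
equal to the number of instances, a sync insert returns only after
the replica has confirmed it, and since rows are replicated in order,
the replica should by then hold everything written before it.

```lua
-- Fragment for Tarantool's test-run harness; not standalone Lua.
-- Assumed to exist from earlier in the test file: test_run, engine,
-- NUM_INSTANCES.
test_run:switch('default')
box.cfg{replication_synchro_quorum = NUM_INSTANCES, replication_synchro_timeout = 1000}
_ = box.schema.space.create('sync', {is_sync = true, engine = engine})
_ = box.space.sync:create_index('pk')
-- This insert returns only after the quorum, replica included, confirms
-- it, so earlier master changes should be on the replica by then.
_ = box.space.sync:insert{1}
-- Remove the row again so the test case body starts from an empty space.
_ = box.space.sync:delete{1}
```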

Closes #5167
---
 test/replication/qsync_snapshots.result   | 8 +++++++-
 test/replication/qsync_snapshots.test.lua | 4 +++-
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/test/replication/qsync_snapshots.result b/test/replication/qsync_snapshots.result
index 782ffd482..cafdd63c8 100644
--- a/test/replication/qsync_snapshots.result
+++ b/test/replication/qsync_snapshots.result
@@ -176,8 +176,14 @@ _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
 _ = box.space.sync:create_index('pk')
  | ---
  | ...
+-- Write something to flush the current master's state to replica.
+_ = box.space.sync:insert{1}
+ | ---
+ | ...
+_ = box.space.sync:delete{1}
+ | ---
+ | ...
 
--- Testcase body.
 test_run:switch('default')
  | ---
  | - true
diff --git a/test/replication/qsync_snapshots.test.lua b/test/replication/qsync_snapshots.test.lua
index 979f04d5f..590610974 100644
--- a/test/replication/qsync_snapshots.test.lua
+++ b/test/replication/qsync_snapshots.test.lua
@@ -85,8 +85,10 @@ test_run:switch('default')
 box.cfg{replication_synchro_quorum=NUM_INSTANCES, replication_synchro_timeout=1000}
 _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
 _ = box.space.sync:create_index('pk')
+-- Write something to flush the current master's state to replica.
+_ = box.space.sync:insert{1}
+_ = box.space.sync:delete{1}
 
--- Testcase body.
 test_run:switch('default')
 box.cfg{replication_synchro_quorum=BROKEN_QUORUM}
 ok, err = nil
-- 
2.21.1 (Apple Git-122.3)


* [Tarantool-patches] [PATCH 2/2] test: fix flaky qsync_with_anon.test.lua again
  2020-07-22 23:57 [Tarantool-patches] [PATCH 0/2] Qsync flaky tests, next iteration Vladislav Shpilevoy
  2020-07-22 23:57 ` [Tarantool-patches] [PATCH 1/2] test: fix flaky qsync_snapshots.test.lua again Vladislav Shpilevoy
@ 2020-07-22 23:57 ` Vladislav Shpilevoy
  2020-07-28  8:09 ` [Tarantool-patches] [PATCH 0/2] Qsync flaky tests, next iteration Alexander V. Tikhonov
  2 siblings, 0 replies; 5+ messages in thread
From: Vladislav Shpilevoy @ 2020-07-22 23:57 UTC
  To: tarantool-patches, avtikhon

One of the test cases had 2 problems.

- The same as in the previous commit: it started a sync
  transaction on master and switched to replica assuming it sees
  everything up to this sync transaction, but the replica could
  still see data from the previous test case;

- The test case tried to write a sync transaction on master, got
  a timeout, and switched to replica to ensure the data is
  removed there too. But since dirty reads are possible, the data
  could have been delivered to the replica while ROLLBACK had not
  arrived yet. On the replica the rolled back data could still
  be visible.

The first issue is solved by making a successful sync
transaction, which flushes the master's state to the replica.

The second issue is fixed by splitting the test case into more
steps that do not depend on timeouts (1000 is considered
infinity).
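
The split into steps can be sketched as follows. This is again a
fragment for Tarantool's test-run harness, not standalone Lua. It
assumes test_run and BROKEN_QUORUM are defined earlier in the test
file and that an empty sync space 'sync' already exists on both
master and 'replica_anon'. Each step waits for a condition instead
of relying on how fast data or ROLLBACK is delivered.

```lua
-- Fragment for Tarantool's test-run harness; not standalone Lua.
-- Assumed from earlier in the test file: test_run, BROKEN_QUORUM, and
-- an empty sync space 'sync' already replicated to 'replica_anon'.
fiber = require('fiber')
box.cfg{replication_synchro_quorum = BROKEN_QUORUM, replication_synchro_timeout = 1000}
ok, err = nil
-- Step 1: start the insert in a fiber. It blocks, because the quorum
-- cannot be collected and the timeout is effectively infinite.
f = fiber.create(function()                                                     \
    ok, err = pcall(box.space.sync.insert, box.space.sync, {1})                 \
end)

-- Step 2: wait until the not yet committed row is visible on the
-- replica (a dirty read).
test_run:cmd('switch replica_anon')
test_run:wait_cond(function() return box.space.sync:count() == 1 end)

-- Step 3: only now shrink the timeout on master, so that the pending
-- transaction is rolled back and the fiber finishes.
test_run:switch('default')
box.cfg{replication_synchro_timeout = 0.001}
test_run:wait_cond(function() return f:status() == 'dead' end)

-- Step 4: wait for the rollback to reach the replica instead of
-- reading right after the switch.
test_run:cmd('switch replica_anon')
test_run:wait_cond(function() return box.space.sync:count() == 0 end)
```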

Closes #5196
---
 test/replication/qsync_with_anon.result   | 58 +++++++++++++++++++++--
 test/replication/qsync_with_anon.test.lua | 27 +++++++++--
 2 files changed, 76 insertions(+), 9 deletions(-)

diff --git a/test/replication/qsync_with_anon.result b/test/replication/qsync_with_anon.result
index 51f02bcdb..6a2952a32 100644
--- a/test/replication/qsync_with_anon.result
+++ b/test/replication/qsync_with_anon.result
@@ -96,7 +96,7 @@ box.space.sync:drop()
 -- [RFC, Asynchronous replication] failed transaction rolled back on async
 -- replica.
 -- Testcase setup.
-box.cfg{replication_synchro_quorum=BROKEN_QUORUM, replication_synchro_timeout=0.1}
+box.cfg{replication_synchro_quorum = NUM_INSTANCES, replication_synchro_timeout = 1000}
  | ---
  | ...
 _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
@@ -105,23 +105,71 @@ _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
 _ = box.space.sync:create_index('pk')
  | ---
  | ...
--- Testcase body.
+-- Write something to flush the current master's state to replica.
+_ = box.space.sync:insert{1}
+ | ---
+ | ...
+_ = box.space.sync:delete{1}
+ | ---
+ | ...
+
+box.cfg{replication_synchro_quorum = BROKEN_QUORUM, replication_synchro_timeout = 1000}
+ | ---
+ | ...
+fiber = require('fiber')
+ | ---
+ | ...
+ok, err = nil
+ | ---
+ | ...
+f = fiber.create(function()                                                     \
+    ok, err = pcall(box.space.sync.insert, box.space.sync, {1})                 \
+end)
+ | ---
+ | ...
+
+test_run:cmd('switch replica_anon')
+ | ---
+ | - true
+ | ...
+test_run:wait_cond(function() return box.space.sync:count() == 1 end)
+ | ---
+ | - true
+ | ...
+box.space.sync:select{}
+ | ---
+ | - - [1]
+ | ...
+
 test_run:switch('default')
  | ---
  | - true
  | ...
-box.space.sync:insert{1} -- failure
+box.cfg{replication_synchro_timeout = 0.001}
+ | ---
+ | ...
+test_run:wait_cond(function() return f:status() == 'dead' end)
+ | ---
+ | - true
+ | ...
+box.space.sync:select{}
  | ---
- | - error: Quorum collection for a synchronous transaction is timed out
+ | - []
  | ...
+
 test_run:cmd('switch replica_anon')
  | ---
  | - true
  | ...
-box.space.sync:select{} -- none
+test_run:wait_cond(function() return box.space.sync:count() == 0 end)
+ | ---
+ | - true
+ | ...
+box.space.sync:select{}
  | ---
  | - []
  | ...
+
 test_run:switch('default')
  | ---
  | - true
diff --git a/test/replication/qsync_with_anon.test.lua b/test/replication/qsync_with_anon.test.lua
index 5bc7c8be4..d7ecaa107 100644
--- a/test/replication/qsync_with_anon.test.lua
+++ b/test/replication/qsync_with_anon.test.lua
@@ -36,14 +36,33 @@ box.space.sync:drop()
 -- [RFC, Asynchronous replication] failed transaction rolled back on async
 -- replica.
 -- Testcase setup.
-box.cfg{replication_synchro_quorum=BROKEN_QUORUM, replication_synchro_timeout=0.1}
+box.cfg{replication_synchro_quorum = NUM_INSTANCES, replication_synchro_timeout = 1000}
 _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
 _ = box.space.sync:create_index('pk')
--- Testcase body.
+-- Write something to flush the current master's state to replica.
+_ = box.space.sync:insert{1}
+_ = box.space.sync:delete{1}
+
+box.cfg{replication_synchro_quorum = BROKEN_QUORUM, replication_synchro_timeout = 1000}
+fiber = require('fiber')
+ok, err = nil
+f = fiber.create(function()                                                     \
+    ok, err = pcall(box.space.sync.insert, box.space.sync, {1})                 \
+end)
+
+test_run:cmd('switch replica_anon')
+test_run:wait_cond(function() return box.space.sync:count() == 1 end)
+box.space.sync:select{}
+
 test_run:switch('default')
-box.space.sync:insert{1} -- failure
+box.cfg{replication_synchro_timeout = 0.001}
+test_run:wait_cond(function() return f:status() == 'dead' end)
+box.space.sync:select{}
+
 test_run:cmd('switch replica_anon')
-box.space.sync:select{} -- none
+test_run:wait_cond(function() return box.space.sync:count() == 0 end)
+box.space.sync:select{}
+
 test_run:switch('default')
 box.cfg{replication_synchro_quorum=NUM_INSTANCES, replication_synchro_timeout=1000}
 box.space.sync:insert{1} -- success
-- 
2.21.1 (Apple Git-122.3)


* Re: [Tarantool-patches] [PATCH 0/2] Qsync flaky tests, next iteration
  2020-07-22 23:57 [Tarantool-patches] [PATCH 0/2] Qsync flaky tests, next iteration Vladislav Shpilevoy
  2020-07-22 23:57 ` [Tarantool-patches] [PATCH 1/2] test: fix flaky qsync_snapshots.test.lua again Vladislav Shpilevoy
  2020-07-22 23:57 ` [Tarantool-patches] [PATCH 2/2] test: fix flaky qsync_with_anon.test.lua again Vladislav Shpilevoy
@ 2020-07-28  8:09 ` Alexander V. Tikhonov
  2020-07-28 20:37   ` Vladislav Shpilevoy
  2 siblings, 1 reply; 5+ messages in thread
From: Alexander V. Tikhonov @ 2020-07-28  8:09 UTC
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Hi Vlad, thanks for the fixes. I've checked them and it seems that they
help to avoid the issues - the patches LGTM.

On Thu, Jul 23, 2020 at 01:57:00AM +0200, Vladislav Shpilevoy wrote:
> The patchset attempts to fix more flaky test cases discovered since the last
> fixes.
> 
> Branch: http://github.com/tarantool/tarantool/tree/gerold103/qsync-flaky-tests
> Issue: https://github.com/tarantool/tarantool/issues/5196
> Issue: https://github.com/tarantool/tarantool/issues/5167
> 
> Vladislav Shpilevoy (2):
>   test: fix flaky qsync_snapshots.test.lua again
>   test: fix flaky qsync_with_anon.test.lua again
> 
>  test/replication/qsync_snapshots.result   |  8 +++-
>  test/replication/qsync_snapshots.test.lua |  4 +-
>  test/replication/qsync_with_anon.result   | 58 +++++++++++++++++++++--
>  test/replication/qsync_with_anon.test.lua | 27 +++++++++--
>  4 files changed, 86 insertions(+), 11 deletions(-)
> 
> -- 
> 2.21.1 (Apple Git-122.3)
> 


* Re: [Tarantool-patches] [PATCH 0/2] Qsync flaky tests, next iteration
  2020-07-28  8:09 ` [Tarantool-patches] [PATCH 0/2] Qsync flaky tests, next iteration Alexander V. Tikhonov
@ 2020-07-28 20:37   ` Vladislav Shpilevoy
  0 siblings, 0 replies; 5+ messages in thread
From: Vladislav Shpilevoy @ 2020-07-28 20:37 UTC
  To: Alexander V. Tikhonov; +Cc: tarantool-patches

Pushed to master and 2.5.

