Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH 0/4 v3] Additional qsync tests
@ 2020-11-17 16:13 sergeyb
  2020-11-17 16:13 ` [Tarantool-patches] [PATCH 1/4] replication: run clear_synchro_queue() with unconfigured box sergeyb
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: sergeyb @ 2020-11-17 16:13 UTC (permalink / raw)
  To: tarantool-patches, v.shpilevoy

From: Sergey Bronnikov <sergeyb@tarantool.org>

Changes v3:

- added negative case with 'broken' quorum in test with random leader
- reduced a number of iterations from 200 to 30 in test with random leader
- moved testcase with box.ctl.clear_synchro_queue() on unconfigured box from
  replication/qsync_basic.test.lua to app-tap/cfg.test.lua (added as separate
  patch).
- added wait_cond() to get final box.space.sync:count()

Changes v2:

- removed negative case in test with random leader: no more stuck tx's in a
  limbo before switching a leader. I spend couple of hours to get it work, but
failed. Some details: to avoid block a loop I should execute tx in a fiber when
quorum is set to 'broken', but both test_run:eval() and insert via netbox
connection in a fiber haven't insert value even when fiber is dead.
- check inserted values in test with random leader as Sergey asked on previous
  review
- removed negative case in test with random sync/async mode, Sergey said it is
  pointless [1]
- added test with change enable/disable sync mode, it has LGTM from Sergey [1]
- reduced a cluster size in test with random leader from 32 to 5 instances as
  Vlad asked on previous review
- aligned tests with 80 symbols per line
- adjusted values of synchro timeouts in box.cfg to make tests more reliable

GH branch: ligurio/gh-4842-qsync-testing
Github issue: https://github.com/tarantool/tarantool/issues/5055
Gitlab CI: https://gitlab.com/tarantool/tarantool/-/pipelines/217357503

1. https://lists.tarantool.org/pipermail/tarantool-patches/2020-September/019227.html

Sergey Bronnikov (4):
  replication: run clear_synchro_queue() with unconfigured box
  replication: test clear_synchro_queue() function
  replication: add test with random leaders promotion and demotion
  replication: add test with change space sync mode in a loop

 test/app-tap/cfg.test.lua                     |   8 +-
 test/replication/qsync.lua                    |  31 ++++
 test/replication/qsync1.lua                   |   1 +
 test/replication/qsync2.lua                   |   1 +
 test/replication/qsync3.lua                   |   1 +
 test/replication/qsync4.lua                   |   1 +
 test/replication/qsync5.lua                   |   1 +
 test/replication/qsync_basic.result           | 129 +++++++++++++++
 test/replication/qsync_basic.test.lua         |  45 ++++++
 test/replication/qsync_random_leader.result   | 148 ++++++++++++++++++
 test/replication/qsync_random_leader.test.lua |  76 +++++++++
 test/replication/qsync_sync_mode.result       | 114 ++++++++++++++
 test/replication/qsync_sync_mode.test.lua     |  56 +++++++
 13 files changed, 611 insertions(+), 1 deletion(-)
 create mode 100644 test/replication/qsync.lua
 create mode 120000 test/replication/qsync1.lua
 create mode 120000 test/replication/qsync2.lua
 create mode 120000 test/replication/qsync3.lua
 create mode 120000 test/replication/qsync4.lua
 create mode 120000 test/replication/qsync5.lua
 create mode 100644 test/replication/qsync_random_leader.result
 create mode 100644 test/replication/qsync_random_leader.test.lua
 create mode 100644 test/replication/qsync_sync_mode.result
 create mode 100644 test/replication/qsync_sync_mode.test.lua

-- 
2.25.1

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Tarantool-patches] [PATCH 1/4] replication: run clear_synchro_queue() with unconfigured box
  2020-11-17 16:13 [Tarantool-patches] [PATCH 0/4 v3] Additional qsync tests sergeyb
@ 2020-11-17 16:13 ` sergeyb
  2020-11-21 14:41   ` Vladislav Shpilevoy
  2020-11-17 16:13 ` [Tarantool-patches] [PATCH 2/4] replication: test clear_synchro_queue() function sergeyb
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: sergeyb @ 2020-11-17 16:13 UTC (permalink / raw)
  To: tarantool-patches, v.shpilevoy

From: Sergey Bronnikov <sergeyb@tarantool.org>

Part of #4849
---
 test/app-tap/cfg.test.lua | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/test/app-tap/cfg.test.lua b/test/app-tap/cfg.test.lua
index ba6b735ab..3d89a5167 100755
--- a/test/app-tap/cfg.test.lua
+++ b/test/app-tap/cfg.test.lua
@@ -3,7 +3,13 @@ local fiber = require('fiber')
 local tap = require('tap')
 local test = tap.test("cfg")
 
-test:plan(11)
+test:plan(12)
+
+--
+-- gh-4849: clear synchro queue is null with unconfigured box
+--
+local ok, err = pcall(box.ctl.clear_synchro_queue(), nil)
+test:isnil(ok, 'execute clear_synchro_queue with unconfigured box')
 
 --
 -- gh-4282: box.cfg should not allow nor just ignore nil UUID.
-- 
2.25.1

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Tarantool-patches] [PATCH 2/4] replication: test clear_synchro_queue() function
  2020-11-17 16:13 [Tarantool-patches] [PATCH 0/4 v3] Additional qsync tests sergeyb
  2020-11-17 16:13 ` [Tarantool-patches] [PATCH 1/4] replication: run clear_synchro_queue() with unconfigured box sergeyb
@ 2020-11-17 16:13 ` sergeyb
  2020-11-21 14:41   ` Vladislav Shpilevoy
  2020-11-17 16:13 ` [Tarantool-patches] [PATCH 3/4] replication: add test with random leaders promotion and demotion sergeyb
  2020-11-17 16:13 ` [Tarantool-patches] [PATCH 4/4] replication: add test with change space sync mode in a loop sergeyb
  3 siblings, 1 reply; 15+ messages in thread
From: sergeyb @ 2020-11-17 16:13 UTC (permalink / raw)
  To: tarantool-patches, v.shpilevoy

From: Sergey Bronnikov <sergeyb@tarantool.org>

Part of #5055
Part of #4849
---
 test/replication/qsync_basic.result   | 129 ++++++++++++++++++++++++++
 test/replication/qsync_basic.test.lua |  45 +++++++++
 2 files changed, 174 insertions(+)

diff --git a/test/replication/qsync_basic.result b/test/replication/qsync_basic.result
index bd3c3cce1..a06f65ce1 100644
--- a/test/replication/qsync_basic.result
+++ b/test/replication/qsync_basic.result
@@ -637,6 +637,135 @@ box.space.sync:count()
  | - 0
  | ...
 
+--
+-- gh-4849: clear synchro queue on a master
+--
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+box.cfg{replication_synchro_quorum = 3, replication_synchro_timeout = 1000}
+ | ---
+ | ...
+ok, err = nil
+ | ---
+ | ...
+f = fiber.create(function()							\
+    ok, err = pcall(box.space.sync.insert, box.space.sync, {10})		\
+end)
+ | ---
+ | ...
+f:status()
+ | ---
+ | - suspended
+ | ...
+test_run:switch('replica')
+ | ---
+ | - true
+ | ...
+test_run:wait_cond(function() return box.space.sync:get{10} ~= nil end)
+ | ---
+ | - true
+ | ...
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+box.cfg{replication_synchro_timeout = 0.1}
+ | ---
+ | ...
+box.ctl.clear_synchro_queue()
+ | ---
+ | ...
+test_run:switch('replica')
+ | ---
+ | - true
+ | ...
+test_run:wait_cond(function() return box.space.sync:get{10} == nil end)
+ | ---
+ | - true
+ | ...
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+test_run:wait_cond(function() return f:status() == 'dead' end)
+ | ---
+ | - true
+ | ...
+ok, err
+ | ---
+ | - false
+ | - Quorum collection for a synchronous transaction is timed out
+ | ...
+test_run:wait_cond(function() return box.space.sync:get{10} == nil end)
+ | ---
+ | - true
+ | ...
+
+--
+-- gh-4849: clear synchro queue on a replica, make sure no crashes
+--
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+box.cfg{replication_synchro_quorum = 3, replication_synchro_timeout = 1000}
+ | ---
+ | ...
+ok, err = nil
+ | ---
+ | ...
+f = fiber.create(function()                                                     \
+    ok, err = pcall(box.space.sync.insert, box.space.sync, {9})                 \
+end)
+ | ---
+ | ...
+f:status()
+ | ---
+ | - suspended
+ | ...
+test_run:wait_cond(function() return box.space.sync:get{9} ~= nil end)
+ | ---
+ | - true
+ | ...
+test_run:switch('replica')
+ | ---
+ | - true
+ | ...
+box.cfg{replication_synchro_quorum = 3, replication_synchro_timeout=0.01}
+ | ---
+ | ...
+box.ctl.clear_synchro_queue()
+ | ---
+ | ...
+test_run:wait_cond(function() return box.space.sync:get{9} == nil end)
+ | ---
+ | - true
+ | ...
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+box.cfg{replication_synchro_timeout=0.01}
+ | ---
+ | ...
+test_run:wait_cond(function() return f:status() == 'dead' end)
+ | ---
+ | - true
+ | ...
+ok, err
+ | ---
+ | - false
+ | - Quorum collection for a synchronous transaction is timed out
+ | ...
+test_run:wait_cond(function() return box.space.sync:get{9} == nil end)
+ | ---
+ | - true
+ | ...
+
+-- Note: cluster may be in a broken state here due to nature of previous test.
+
 -- Cleanup.
 test_run:cmd('switch default')
  | ---
diff --git a/test/replication/qsync_basic.test.lua b/test/replication/qsync_basic.test.lua
index 94235547d..81d3b20ad 100644
--- a/test/replication/qsync_basic.test.lua
+++ b/test/replication/qsync_basic.test.lua
@@ -248,6 +248,51 @@ for i = 1, 100 do box.space.sync:delete{i} end
 test_run:cmd('switch replica')
 box.space.sync:count()
 
+--
+-- gh-4849: clear synchro queue on a master
+--
+test_run:switch('default')
+box.cfg{replication_synchro_quorum = 3, replication_synchro_timeout = 1000}
+ok, err = nil
+f = fiber.create(function()							\
+    ok, err = pcall(box.space.sync.insert, box.space.sync, {10})		\
+end)
+f:status()
+test_run:switch('replica')
+test_run:wait_cond(function() return box.space.sync:get{10} ~= nil end)
+test_run:switch('default')
+box.cfg{replication_synchro_timeout = 0.1}
+box.ctl.clear_synchro_queue()
+test_run:switch('replica')
+test_run:wait_cond(function() return box.space.sync:get{10} == nil end)
+test_run:switch('default')
+test_run:wait_cond(function() return f:status() == 'dead' end)
+ok, err
+test_run:wait_cond(function() return box.space.sync:get{10} == nil end)
+
+--
+-- gh-4849: clear synchro queue on a replica, make sure no crashes
+--
+test_run:switch('default')
+box.cfg{replication_synchro_quorum = 3, replication_synchro_timeout = 1000}
+ok, err = nil
+f = fiber.create(function()                                                     \
+    ok, err = pcall(box.space.sync.insert, box.space.sync, {9})                 \
+end)
+f:status()
+test_run:wait_cond(function() return box.space.sync:get{9} ~= nil end)
+test_run:switch('replica')
+box.cfg{replication_synchro_quorum = 3, replication_synchro_timeout=0.01}
+box.ctl.clear_synchro_queue()
+test_run:wait_cond(function() return box.space.sync:get{9} == nil end)
+test_run:switch('default')
+box.cfg{replication_synchro_timeout=0.01}
+test_run:wait_cond(function() return f:status() == 'dead' end)
+ok, err
+test_run:wait_cond(function() return box.space.sync:get{9} == nil end)
+
+-- Note: cluster may be in a broken state here due to nature of previous test.
+
 -- Cleanup.
 test_run:cmd('switch default')
 
-- 
2.25.1

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Tarantool-patches] [PATCH 3/4] replication: add test with random leaders promotion and demotion
  2020-11-17 16:13 [Tarantool-patches] [PATCH 0/4 v3] Additional qsync tests sergeyb
  2020-11-17 16:13 ` [Tarantool-patches] [PATCH 1/4] replication: run clear_synchro_queue() with unconfigured box sergeyb
  2020-11-17 16:13 ` [Tarantool-patches] [PATCH 2/4] replication: test clear_synchro_queue() function sergeyb
@ 2020-11-17 16:13 ` sergeyb
  2020-11-21 14:41   ` Vladislav Shpilevoy
  2020-11-17 16:13 ` [Tarantool-patches] [PATCH 4/4] replication: add test with change space sync mode in a loop sergeyb
  3 siblings, 1 reply; 15+ messages in thread
From: sergeyb @ 2020-11-17 16:13 UTC (permalink / raw)
  To: tarantool-patches, v.shpilevoy

From: Sergey Bronnikov <sergeyb@tarantool.org>

Part of #5055
Part of #5144
---
 test/replication/qsync.lua                    |  31 ++++
 test/replication/qsync1.lua                   |   1 +
 test/replication/qsync2.lua                   |   1 +
 test/replication/qsync3.lua                   |   1 +
 test/replication/qsync4.lua                   |   1 +
 test/replication/qsync5.lua                   |   1 +
 test/replication/qsync_random_leader.result   | 148 ++++++++++++++++++
 test/replication/qsync_random_leader.test.lua |  76 +++++++++
 8 files changed, 260 insertions(+)
 create mode 100644 test/replication/qsync.lua
 create mode 120000 test/replication/qsync1.lua
 create mode 120000 test/replication/qsync2.lua
 create mode 120000 test/replication/qsync3.lua
 create mode 120000 test/replication/qsync4.lua
 create mode 120000 test/replication/qsync5.lua
 create mode 100644 test/replication/qsync_random_leader.result
 create mode 100644 test/replication/qsync_random_leader.test.lua

diff --git a/test/replication/qsync.lua b/test/replication/qsync.lua
new file mode 100644
index 000000000..9bbc87239
--- /dev/null
+++ b/test/replication/qsync.lua
@@ -0,0 +1,31 @@
+#!/usr/bin/env tarantool
+
+-- get instance name from filename (qsync1.lua => qsync1)
+local INSTANCE_ID = string.match(arg[0], "%d")
+
+local SOCKET_DIR = require('fio').cwd()
+
+local function instance_uri(instance_id)
+    return SOCKET_DIR..'/qsync'..instance_id..'.sock';
+end
+
+-- start console first
+require('console').listen(os.getenv('ADMIN'))
+
+box.cfg({
+    listen = instance_uri(INSTANCE_ID);
+    replication = {
+        instance_uri(1);
+        instance_uri(2);
+        instance_uri(3);
+        instance_uri(4);
+        instance_uri(5);
+    };
+    replication_synchro_timeout = 1000;
+    replication_synchro_quorum = 5;
+    read_only = false;
+})
+
+box.once("bootstrap", function()
+    box.schema.user.grant("guest", 'replication')
+end)
diff --git a/test/replication/qsync1.lua b/test/replication/qsync1.lua
new file mode 120000
index 000000000..df9f3a883
--- /dev/null
+++ b/test/replication/qsync1.lua
@@ -0,0 +1 @@
+qsync.lua
\ No newline at end of file
diff --git a/test/replication/qsync2.lua b/test/replication/qsync2.lua
new file mode 120000
index 000000000..df9f3a883
--- /dev/null
+++ b/test/replication/qsync2.lua
@@ -0,0 +1 @@
+qsync.lua
\ No newline at end of file
diff --git a/test/replication/qsync3.lua b/test/replication/qsync3.lua
new file mode 120000
index 000000000..df9f3a883
--- /dev/null
+++ b/test/replication/qsync3.lua
@@ -0,0 +1 @@
+qsync.lua
\ No newline at end of file
diff --git a/test/replication/qsync4.lua b/test/replication/qsync4.lua
new file mode 120000
index 000000000..df9f3a883
--- /dev/null
+++ b/test/replication/qsync4.lua
@@ -0,0 +1 @@
+qsync.lua
\ No newline at end of file
diff --git a/test/replication/qsync5.lua b/test/replication/qsync5.lua
new file mode 120000
index 000000000..df9f3a883
--- /dev/null
+++ b/test/replication/qsync5.lua
@@ -0,0 +1 @@
+qsync.lua
\ No newline at end of file
diff --git a/test/replication/qsync_random_leader.result b/test/replication/qsync_random_leader.result
new file mode 100644
index 000000000..2b2df99db
--- /dev/null
+++ b/test/replication/qsync_random_leader.result
@@ -0,0 +1,148 @@
+-- test-run result file version 2
+os = require('os')
+ | ---
+ | ...
+env = require('test_run')
+ | ---
+ | ...
+math = require('math')
+ | ---
+ | ...
+fiber = require('fiber')
+ | ---
+ | ...
+test_run = env.new()
+ | ---
+ | ...
+netbox = require('net.box')
+ | ---
+ | ...
+
+orig_synchro_quorum = box.cfg.replication_synchro_quorum
+ | ---
+ | ...
+orig_synchro_timeout = box.cfg.replication_synchro_timeout
+ | ---
+ | ...
+
+NUM_INSTANCES = 5
+ | ---
+ | ...
+SERVERS = {}
+ | ---
+ | ...
+for i=1,NUM_INSTANCES do                                                       \
+    SERVERS[i] = 'qsync' .. i                                                  \
+end;
+ | ---
+ | ...
+SERVERS -- print instance names
+ | ---
+ | - - qsync1
+ |   - qsync2
+ |   - qsync3
+ |   - qsync4
+ |   - qsync5
+ | ...
+
+math.randomseed(os.time())
+ | ---
+ | ...
+random = function(excluded_num, total)                                         \
+    local r = math.random(1, total)                                            \
+    if (r == excluded_num) then                                                \
+        return random(excluded_num, total)                                     \
+    end                                                                        \
+    return r                                                                   \
+end
+ | ---
+ | ...
+
+-- Set 'broken' quorum on current leader.
+-- Write value on current leader.
+-- Pick a random replica in a cluster.
+-- Set 'good' quorum on it and promote to a leader.
+-- Make sure value is there and on an old leader.
+
+-- Testcase setup.
+test_run:create_cluster(SERVERS)
+ | ---
+ | ...
+test_run:wait_fullmesh(SERVERS)
+ | ---
+ | ...
+test_run:switch('qsync1')
+ | ---
+ | - true
+ | ...
+_ = box.schema.space.create('sync', {is_sync=true, engine = test_run:get_cfg('engine')})
+ | ---
+ | ...
+_ = box.space.sync:create_index('primary')
+ | ---
+ | ...
+box.schema.user.grant('guest', 'write', 'space', 'sync')
+ | ---
+ | ...
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+current_leader_id = 1
+ | ---
+ | ...
+test_run:eval(SERVERS[current_leader_id], "box.ctl.clear_synchro_queue()")
+ | ---
+ | - []
+ | ...
+
+SOCKET_DIR = require('fio').cwd()
+ | ---
+ | ...
+
+-- Testcase body.
+for i=1,30 do                                                                  \
+    test_run:eval(SERVERS[current_leader_id],                                  \
+        "box.cfg{replication_synchro_quorum=6, replication_synchro_timeout=1000}") \
+    c = netbox.connect(SOCKET_DIR..'/'..SERVERS[current_leader_id]..'.sock')   \
+    fiber.create(function() c.space.sync:insert{i} end)                        \
+    new_leader_id = random(current_leader_id, #SERVERS)                        \
+    test_run:eval(SERVERS[new_leader_id],                                      \
+        "box.cfg{replication_synchro_quorum=3, replication_synchro_timeout=0.01}") \
+    test_run:eval(SERVERS[new_leader_id], "box.ctl.clear_synchro_queue()")     \
+    c:close()                                                                  \
+    replica = random(new_leader_id, #SERVERS)                                  \
+    test_run:wait_cond(function() return test_run:eval(SERVERS[replica],       \
+                       string.format("box.space.sync:get{%d}", i))[1] ~= nil end)  \
+    test_run:wait_cond(function() return test_run:eval(SERVERS[current_leader_id], \
+                       string.format("box.space.sync:get{%d}", i))[1] ~= nil end)  \
+    new_leader_id = random(current_leader_id, #SERVERS)                        \
+    current_leader_id = new_leader_id                                          \
+end
+ | ---
+ | ...
+
+test_run:wait_cond(function() return test_run:eval('qsync1',                   \
+                   ("box.space.sync:count()")) == 30 end)
+ | ---
+ | - false
+ | ...
+
+-- Teardown.
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+test_run:eval(SERVERS[current_leader_id], 'box.space.sync:drop()')
+ | ---
+ | - []
+ | ...
+test_run:drop_cluster(SERVERS)
+ | ---
+ | ...
+box.cfg{                                                                       \
+    replication_synchro_quorum = orig_synchro_quorum,                          \
+    replication_synchro_timeout = orig_synchro_timeout,                        \
+}
+ | ---
+ | ...
diff --git a/test/replication/qsync_random_leader.test.lua b/test/replication/qsync_random_leader.test.lua
new file mode 100644
index 000000000..d84366916
--- /dev/null
+++ b/test/replication/qsync_random_leader.test.lua
@@ -0,0 +1,76 @@
+os = require('os')
+env = require('test_run')
+math = require('math')
+fiber = require('fiber')
+test_run = env.new()
+netbox = require('net.box')
+
+orig_synchro_quorum = box.cfg.replication_synchro_quorum
+orig_synchro_timeout = box.cfg.replication_synchro_timeout
+
+NUM_INSTANCES = 5
+SERVERS = {}
+for i=1,NUM_INSTANCES do                                                       \
+    SERVERS[i] = 'qsync' .. i                                                  \
+end;
+SERVERS -- print instance names
+
+math.randomseed(os.time())
+random = function(excluded_num, total)                                         \
+    local r = math.random(1, total)                                            \
+    if (r == excluded_num) then                                                \
+        return random(excluded_num, total)                                     \
+    end                                                                        \
+    return r                                                                   \
+end
+
+-- Set 'broken' quorum on current leader.
+-- Write value on current leader.
+-- Pick a random replica in a cluster.
+-- Set 'good' quorum on it and promote to a leader.
+-- Make sure value is there and on an old leader.
+
+-- Testcase setup.
+test_run:create_cluster(SERVERS)
+test_run:wait_fullmesh(SERVERS)
+test_run:switch('qsync1')
+_ = box.schema.space.create('sync', {is_sync=true, engine = test_run:get_cfg('engine')})
+_ = box.space.sync:create_index('primary')
+box.schema.user.grant('guest', 'write', 'space', 'sync')
+test_run:switch('default')
+current_leader_id = 1
+test_run:eval(SERVERS[current_leader_id], "box.ctl.clear_synchro_queue()")
+
+SOCKET_DIR = require('fio').cwd()
+
+-- Testcase body.
+for i=1,30 do                                                                  \
+    test_run:eval(SERVERS[current_leader_id],                                  \
+        "box.cfg{replication_synchro_quorum=6, replication_synchro_timeout=1000}") \
+    c = netbox.connect(SOCKET_DIR..'/'..SERVERS[current_leader_id]..'.sock')   \
+    fiber.create(function() c.space.sync:insert{i} end)                        \
+    new_leader_id = random(current_leader_id, #SERVERS)                        \
+    test_run:eval(SERVERS[new_leader_id],                                      \
+        "box.cfg{replication_synchro_quorum=3, replication_synchro_timeout=0.01}") \
+    test_run:eval(SERVERS[new_leader_id], "box.ctl.clear_synchro_queue()")     \
+    c:close()                                                                  \
+    replica = random(new_leader_id, #SERVERS)                                  \
+    test_run:wait_cond(function() return test_run:eval(SERVERS[replica],       \
+                       string.format("box.space.sync:get{%d}", i))[1] ~= nil end)  \
+    test_run:wait_cond(function() return test_run:eval(SERVERS[current_leader_id], \
+                       string.format("box.space.sync:get{%d}", i))[1] ~= nil end)  \
+    new_leader_id = random(current_leader_id, #SERVERS)                        \
+    current_leader_id = new_leader_id                                          \
+end
+
+test_run:wait_cond(function() return test_run:eval('qsync1',                   \
+                   ("box.space.sync:count()")) == 30 end)
+
+-- Teardown.
+test_run:switch('default')
+test_run:eval(SERVERS[current_leader_id], 'box.space.sync:drop()')
+test_run:drop_cluster(SERVERS)
+box.cfg{                                                                       \
+    replication_synchro_quorum = orig_synchro_quorum,                          \
+    replication_synchro_timeout = orig_synchro_timeout,                        \
+}
-- 
2.25.1

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Tarantool-patches] [PATCH 4/4] replication: add test with change space sync mode in a loop
  2020-11-17 16:13 [Tarantool-patches] [PATCH 0/4 v3] Additional qsync tests sergeyb
                   ` (2 preceding siblings ...)
  2020-11-17 16:13 ` [Tarantool-patches] [PATCH 3/4] replication: add test with random leaders promotion and demotion sergeyb
@ 2020-11-17 16:13 ` sergeyb
  2020-11-21 14:41   ` Vladislav Shpilevoy
  2020-11-25 22:02   ` Vladislav Shpilevoy
  3 siblings, 2 replies; 15+ messages in thread
From: sergeyb @ 2020-11-17 16:13 UTC (permalink / raw)
  To: tarantool-patches, v.shpilevoy

From: Sergey Bronnikov <sergeyb@tarantool.org>

Closes #5055
Part of #5144
---
 test/replication/qsync_sync_mode.result   | 114 ++++++++++++++++++++++
 test/replication/qsync_sync_mode.test.lua |  56 +++++++++++
 2 files changed, 170 insertions(+)
 create mode 100644 test/replication/qsync_sync_mode.result
 create mode 100644 test/replication/qsync_sync_mode.test.lua

diff --git a/test/replication/qsync_sync_mode.result b/test/replication/qsync_sync_mode.result
new file mode 100644
index 000000000..66f6a70b2
--- /dev/null
+++ b/test/replication/qsync_sync_mode.result
@@ -0,0 +1,114 @@
+-- test-run result file version 2
+env = require('test_run')
+ | ---
+ | ...
+test_run = env.new()
+ | ---
+ | ...
+engine = test_run:get_cfg('engine')
+ | ---
+ | ...
+fiber = require('fiber')
+ | ---
+ | ...
+math = require('math')
+ | ---
+ | ...
+
+orig_synchro_quorum = box.cfg.replication_synchro_quorum
+ | ---
+ | ...
+orig_synchro_timeout = box.cfg.replication_synchro_timeout
+ | ---
+ | ...
+
+math.randomseed(os.time())
+ | ---
+ | ...
+random_boolean = function()                                                    \
+    if (math.random(1, 10) > 5) then                                           \
+        return true                                                            \
+    else                                                                       \
+        return false                                                           \
+    end                                                                        \
+end
+ | ---
+ | ...
+
+box.schema.user.grant('guest', 'replication')
+ | ---
+ | ...
+
+-- Setup an cluster with two instances.
+test_run:cmd('create server replica with rpl_master=default,\
+                                         script="replication/replica.lua"')
+ | ---
+ | - true
+ | ...
+test_run:cmd('start server replica with wait=True, wait_load=True')
+ | ---
+ | - true
+ | ...
+
+-- Write data to a leader and enable/disable sync mode in a loop.
+-- Testcase setup.
+_ = box.schema.space.create('sync', {is_sync=true, engine=engine})
+ | ---
+ | ...
+_ = box.space.sync:create_index('pk')
+ | ---
+ | ...
+box.cfg{replication_synchro_quorum=2, replication_synchro_timeout=1000}
+ | ---
+ | ...
+
+-- Testcase body.
+for i = 1,100 do                                                               \
+    box.space.sync:alter{is_sync=random_boolean()}                             \
+    box.space.sync:insert{i}                                                   \
+    test_run:switch('replica')                                                 \
+    test_run:wait_cond(function() return box.space.sync:get{i} ~= nil end)     \
+    test_run:switch('default')                                                 \
+    test_run:wait_cond(function() return box.space.sync:get{i} ~= nil end)     \
+end
+ | ---
+ | ...
+box.space.sync:count() -- 100
+ | ---
+ | - 100
+ | ...
+
+-- Testcase cleanup.
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+box.space.sync:drop()
+ | ---
+ | ...
+
+-- Teardown.
+test_run:cmd('switch default')
+ | ---
+ | - true
+ | ...
+test_run:cmd('stop server replica')
+ | ---
+ | - true
+ | ...
+test_run:cmd('delete server replica')
+ | ---
+ | - true
+ | ...
+test_run:cleanup_cluster()
+ | ---
+ | ...
+box.schema.user.revoke('guest', 'replication')
+ | ---
+ | ...
+box.cfg{                                                                       \
+    replication_synchro_quorum = orig_synchro_quorum,                          \
+    replication_synchro_timeout = orig_synchro_timeout,                        \
+}
+ | ---
+ | ...
diff --git a/test/replication/qsync_sync_mode.test.lua b/test/replication/qsync_sync_mode.test.lua
new file mode 100644
index 000000000..ea5725169
--- /dev/null
+++ b/test/replication/qsync_sync_mode.test.lua
@@ -0,0 +1,56 @@
+env = require('test_run')
+test_run = env.new()
+engine = test_run:get_cfg('engine')
+fiber = require('fiber')
+math = require('math')
+
+orig_synchro_quorum = box.cfg.replication_synchro_quorum
+orig_synchro_timeout = box.cfg.replication_synchro_timeout
+
+math.randomseed(os.time())
+random_boolean = function()                                                    \
+    if (math.random(1, 10) > 5) then                                           \
+        return true                                                            \
+    else                                                                       \
+        return false                                                           \
+    end                                                                        \
+end
+
+box.schema.user.grant('guest', 'replication')
+
+-- Setup an cluster with two instances.
+test_run:cmd('create server replica with rpl_master=default,\
+                                         script="replication/replica.lua"')
+test_run:cmd('start server replica with wait=True, wait_load=True')
+
+-- Write data to a leader and enable/disable sync mode in a loop.
+-- Testcase setup.
+_ = box.schema.space.create('sync', {is_sync=true, engine=engine})
+_ = box.space.sync:create_index('pk')
+box.cfg{replication_synchro_quorum=2, replication_synchro_timeout=1000}
+
+-- Testcase body.
+for i = 1,100 do                                                               \
+    box.space.sync:alter{is_sync=random_boolean()}                             \
+    box.space.sync:insert{i}                                                   \
+    test_run:switch('replica')                                                 \
+    test_run:wait_cond(function() return box.space.sync:get{i} ~= nil end)     \
+    test_run:switch('default')                                                 \
+    test_run:wait_cond(function() return box.space.sync:get{i} ~= nil end)     \
+end
+box.space.sync:count() -- 100
+
+-- Testcase cleanup.
+test_run:switch('default')
+box.space.sync:drop()
+
+-- Teardown.
+test_run:cmd('switch default')
+test_run:cmd('stop server replica')
+test_run:cmd('delete server replica')
+test_run:cleanup_cluster()
+box.schema.user.revoke('guest', 'replication')
+box.cfg{                                                                       \
+    replication_synchro_quorum = orig_synchro_quorum,                          \
+    replication_synchro_timeout = orig_synchro_timeout,                        \
+}
-- 
2.25.1

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/4] replication: run clear_synchro_queue() with unconfigured box
  2020-11-17 16:13 ` [Tarantool-patches] [PATCH 1/4] replication: run clear_synchro_queue() with unconfigured box sergeyb
@ 2020-11-21 14:41   ` Vladislav Shpilevoy
  2020-11-23 14:44     ` Sergey Bronnikov
  0 siblings, 1 reply; 15+ messages in thread
From: Vladislav Shpilevoy @ 2020-11-21 14:41 UTC (permalink / raw)
  To: sergeyb, tarantool-patches

Hi! Thanks for the patch!

From the commit title it seems like you introduced a feature to run
the function with non-configured box. Please, change 'run' to 'test',
would be better already.

Also for qsync we usually write title 'qsync: ...', not just
'replication: ...'. Or in your case it could be 'test: ...',
but up to you. The same for the other commits.

The patch itself is good.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/4] replication: test clear_synchro_queue() function
  2020-11-17 16:13 ` [Tarantool-patches] [PATCH 2/4] replication: test clear_synchro_queue() function sergeyb
@ 2020-11-21 14:41   ` Vladislav Shpilevoy
  2020-11-23 15:13     ` Sergey Bronnikov
  0 siblings, 1 reply; 15+ messages in thread
From: Vladislav Shpilevoy @ 2020-11-21 14:41 UTC (permalink / raw)
  To: sergeyb, tarantool-patches

Thanks for the patch!

See 2 comments below.

On 17.11.2020 17:13, sergeyb@tarantool.org wrote:
> From: Sergey Bronnikov <sergeyb@tarantool.org>
> 
> Part of #5055
> Part of #4849
> ---
>  test/replication/qsync_basic.result   | 129 ++++++++++++++++++++++++++
>  test/replication/qsync_basic.test.lua |  45 +++++++++
>  2 files changed, 174 insertions(+)
> 
> diff --git a/test/replication/qsync_basic.result b/test/replication/qsync_basic.result
> index bd3c3cce1..a06f65ce1 100644
> --- a/test/replication/qsync_basic.result
> +++ b/test/replication/qsync_basic.result
> @@ -637,6 +637,135 @@ box.space.sync:count()
>   | - 0
>   | ...
>  
> +--
> +-- gh-4849: clear synchro queue on a master
> +--
> +test_run:switch('default')
> + | ---
> + | - true
> + | ...
> +box.cfg{replication_synchro_quorum = 3, replication_synchro_timeout = 1000}
> + | ---
> + | ...
> +ok, err = nil
> + | ---
> + | ...
> +f = fiber.create(function()							\
> +    ok, err = pcall(box.space.sync.insert, box.space.sync, {10})		\

1. In Lua files we use spaces and 4 symbols indentation step. I
know it would be easier if it would be like in C, but at this point
it is too late to change without a reason.

Here you apparently used tabs for '\' symbols.

> +end)
> + | ---
> + | ...
> +f:status()
> + | ---
> + | - suspended
> + | ...
> +test_run:switch('replica')
> + | ---
> + | - true
> + | ...
> +test_run:wait_cond(function() return box.space.sync:get{10} ~= nil end)
> + | ---
> + | - true
> + | ...
> +test_run:switch('default')
> + | ---
> + | - true
> + | ...
> +box.cfg{replication_synchro_timeout = 0.1}
> + | ---
> + | ...
> +box.ctl.clear_synchro_queue()

2. Current behaviour of the queue clearance is incorrect,
because it should never rollback local rows, see
https://github.com/tarantool/tarantool/issues/5435. It is
fine to have this test, but better leave a comment, that the
test must stop working eventually, and it will be ok to
delete or rework it.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Tarantool-patches] [PATCH 4/4] replication: add test with change space sync mode in a loop
  2020-11-17 16:13 ` [Tarantool-patches] [PATCH 4/4] replication: add test with change space sync mode in a loop sergeyb
@ 2020-11-21 14:41   ` Vladislav Shpilevoy
  2020-11-23 15:42     ` Sergey Bronnikov
  2020-11-25 22:02   ` Vladislav Shpilevoy
  1 sibling, 1 reply; 15+ messages in thread
From: Vladislav Shpilevoy @ 2020-11-21 14:41 UTC (permalink / raw)
  To: sergeyb, tarantool-patches

Thanks for the patch!

> +++ b/test/replication/qsync_sync_mode.result
> @@ -0,0 +1,114 @@
> +-- test-run result file version 2
> +env = require('test_run')
> + | ---
> + | ...
> +test_run = env.new()
> + | ---
> + | ...
> +engine = test_run:get_cfg('engine')
> + | ---
> + | ...
> +fiber = require('fiber')
> + | ---
> + | ...
> +math = require('math')
> + | ---
> + | ...
> +
> +orig_synchro_quorum = box.cfg.replication_synchro_quorum
> + | ---
> + | ...
> +orig_synchro_timeout = box.cfg.replication_synchro_timeout
> + | ---
> + | ...
> +
> +math.randomseed(os.time())
> + | ---
> + | ...
> +random_boolean = function()                                                    \
> +    if (math.random(1, 10) > 5) then                                           \

In Lua we don't use () in if conditions. Also you could simply
'return math.random(1, 10) > 5'.

> +        return true                                                            \
> +    else                                                                       \
> +        return false                                                           \
> +    end                                                                        \
> +end

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Tarantool-patches] [PATCH 3/4] replication: add test with random leaders promotion and demotion
  2020-11-17 16:13 ` [Tarantool-patches] [PATCH 3/4] replication: add test with random leaders promotion and demotion sergeyb
@ 2020-11-21 14:41   ` Vladislav Shpilevoy
  2020-11-24  8:39     ` Sergey Bronnikov
  0 siblings, 1 reply; 15+ messages in thread
From: Vladislav Shpilevoy @ 2020-11-21 14:41 UTC (permalink / raw)
  To: sergeyb, tarantool-patches

Thanks for the patch!

See 8 comments below.

On 17.11.2020 17:13, sergeyb@tarantool.org wrote:
> From: Sergey Bronnikov <sergeyb@tarantool.org>
> 
> Part of #5055
> Part of #5144
> ---
>  test/replication/qsync.lua                    |  31 ++++
>  test/replication/qsync1.lua                   |   1 +
>  test/replication/qsync2.lua                   |   1 +
>  test/replication/qsync3.lua                   |   1 +
>  test/replication/qsync4.lua                   |   1 +
>  test/replication/qsync5.lua                   |   1 +
>  test/replication/qsync_random_leader.result   | 148 ++++++++++++++++++
>  test/replication/qsync_random_leader.test.lua |  76 +++++++++
>  8 files changed, 260 insertions(+)
>  create mode 100644 test/replication/qsync.lua
>  create mode 120000 test/replication/qsync1.lua
>  create mode 120000 test/replication/qsync2.lua
>  create mode 120000 test/replication/qsync3.lua
>  create mode 120000 test/replication/qsync4.lua
>  create mode 120000 test/replication/qsync5.lua
>  create mode 100644 test/replication/qsync_random_leader.result
>  create mode 100644 test/replication/qsync_random_leader.test.lua
> 
> diff --git a/test/replication/qsync.lua b/test/replication/qsync.lua
> new file mode 100644
> index 000000000..9bbc87239
> --- /dev/null
> +++ b/test/replication/qsync.lua
> @@ -0,0 +1,31 @@
> +#!/usr/bin/env tarantool
> +
> +-- get instance name from filename (qsync1.lua => qsync1)
> +local INSTANCE_ID = string.match(arg[0], "%d")
> +
> +local SOCKET_DIR = require('fio').cwd()
> +
> +local function instance_uri(instance_id)
> +    return SOCKET_DIR..'/qsync'..instance_id..'.sock';

1. ';' is not needed here.

> +end
> +
> +-- start console first

2. Why? The comment narrates the code, but does not say why
it is done so.

> +require('console').listen(os.getenv('ADMIN'))
> +
> +box.cfg({
> +    listen = instance_uri(INSTANCE_ID);

3. Please, don't use ';' as a separator. We use ',' in the
new code.

> +    replication = {
> +        instance_uri(1);
> +        instance_uri(2);
> +        instance_uri(3);
> +        instance_uri(4);
> +        instance_uri(5);
> +    };
> +    replication_synchro_timeout = 1000;
> +    replication_synchro_quorum = 5;
> +    read_only = false;
> +})
> +
> +box.once("bootstrap", function()
> +    box.schema.user.grant("guest", 'replication')
> +end)
> diff --git a/test/replication/qsync_random_leader.result b/test/replication/qsync_random_leader.result
> new file mode 100644
> index 000000000..2b2df99db
> --- /dev/null
> +++ b/test/replication/qsync_random_leader.result
> @@ -0,0 +1,148 @@
> +-- test-run result file version 2
> +os = require('os')
> + | ---
> + | ...
> +env = require('test_run')
> + | ---
> + | ...
> +math = require('math')
> + | ---
> + | ...
> +fiber = require('fiber')
> + | ---
> + | ...
> +test_run = env.new()
> + | ---
> + | ...
> +netbox = require('net.box')
> + | ---
> + | ...
> +
> +orig_synchro_quorum = box.cfg.replication_synchro_quorum
> + | ---
> + | ...
> +orig_synchro_timeout = box.cfg.replication_synchro_timeout
> + | ---
> + | ...
> +
> +NUM_INSTANCES = 5
> + | ---
> + | ...
> +SERVERS = {}
> + | ---
> + | ...
> +for i=1,NUM_INSTANCES do                                                       \

4. Please, use whitespaces. Here are some: '             ',
take them.

> +    SERVERS[i] = 'qsync' .. i                                                  \
> +end;

5. ';' is not needed.

> + | ---
> + | ...
> +SERVERS -- print instance names
> + | ---
> + | - - qsync1
> + |   - qsync2
> + |   - qsync3
> + |   - qsync4
> + |   - qsync5
> + | ...
> +
> +math.randomseed(os.time())
> + | ---
> + | ...
> +random = function(excluded_num, total)                                         \

6. Why not 'function random(...)'? Why do you need to assign
it to a variable via '='?

> +    local r = math.random(1, total)                                            \
> +    if (r == excluded_num) then                                                \
> +        return random(excluded_num, total)                                     \
> +    end                                                                        \
> +    return r                                                                   \
> +end
> + | ---
> + | ...
> +
> +-- Set 'broken' quorum on current leader.
> +-- Write value on current leader.
> +-- Pick a random replica in a cluster.
> +-- Set 'good' quorum on it and promote to a leader.
> +-- Make sure value is there and on an old leader.
> +
> +-- Testcase setup.
> +test_run:create_cluster(SERVERS)
> + | ---
> + | ...
> +test_run:wait_fullmesh(SERVERS)
> + | ---
> + | ...
> +test_run:switch('qsync1')
> + | ---
> + | - true
> + | ...
> +_ = box.schema.space.create('sync', {is_sync=true, engine = test_run:get_cfg('engine')})
> + | ---
> + | ...
> +_ = box.space.sync:create_index('primary')
> + | ---
> + | ...
> +box.schema.user.grant('guest', 'write', 'space', 'sync')
> + | ---
> + | ...
> +test_run:switch('default')
> + | ---
> + | - true
> + | ...
> +current_leader_id = 1
> + | ---
> + | ...
> +test_run:eval(SERVERS[current_leader_id], "box.ctl.clear_synchro_queue()")
> + | ---
> + | - []
> + | ...
> +
> +SOCKET_DIR = require('fio').cwd()
> + | ---
> + | ...
> +
> +-- Testcase body.
> +for i=1,30 do                                                                  \
> +    test_run:eval(SERVERS[current_leader_id],                                  \
> +        "box.cfg{replication_synchro_quorum=6, replication_synchro_timeout=1000}") \
> +    c = netbox.connect(SOCKET_DIR..'/'..SERVERS[current_leader_id]..'.sock')   \
> +    fiber.create(function() c.space.sync:insert{i} end)                        \
> +    new_leader_id = random(current_leader_id, #SERVERS)                        \
> +    test_run:eval(SERVERS[new_leader_id],                                      \
> +        "box.cfg{replication_synchro_quorum=3, replication_synchro_timeout=0.01}") \
> +    test_run:eval(SERVERS[new_leader_id], "box.ctl.clear_synchro_queue()")     \
> +    c:close()                                                                  \
> +    replica = random(new_leader_id, #SERVERS)                                  \
> +    test_run:wait_cond(function() return test_run:eval(SERVERS[replica],       \
> +                       string.format("box.space.sync:get{%d}", i))[1] ~= nil end)  \
> +    test_run:wait_cond(function() return test_run:eval(SERVERS[current_leader_id], \
> +                       string.format("box.space.sync:get{%d}", i))[1] ~= nil end)  \
> +    new_leader_id = random(current_leader_id, #SERVERS)                        \
> +    current_leader_id = new_leader_id                                          \
> +end
> + | ---
> + | ...
> +
> +test_run:wait_cond(function() return test_run:eval('qsync1',                   \
> +                   ("box.space.sync:count()")) == 30 end)
> + | ---
> + | - false

7. I am not sure it is supposed to be false. If it should be -
I don't understand why. Could you elaborate please?

Also in the previous commit in the end of a test doing exactly
the same, but for leader-replica config, you wrote:

	Note: cluster may be in a broken state here due to nature of previous test.

But here you do it 30 times, and it is not broken (I hope).
So that statement wasn't true?

Besides, I get

No output during 10 seconds. Will abort after 120 seconds without output. List of workers not reporting the status:
- 001_replication [replication/qsync_random_leader.test.lua, vinyl] at var/001_replication/qsync_random_leader.result:126
- 002_replication [replication/qsync_random_leader.test.lua, memtx] at var/002_replication/qsync_random_leader.result:126
No output during 20 seconds. Will abort after 120 seconds without output. List of workers not reporting the status:
- 001_replication [replication/qsync_random_leader.test.lua, vinyl] at var/001_replication/qsync_random_leader.result:126
- 002_replication [replication/qsync_random_leader.test.lua, memtx] at var/002_replication/qsync_random_leader.result:126

on this wait_cond(). Does not look like it works.

Also you don't check the space is full on other instances. Only on the
current leader, which is not so interesting.

> + | ...
> +
> +-- Teardown.
> +test_run:switch('default')
> + | ---
> + | - true
> + | ...
> +test_run:eval(SERVERS[current_leader_id], 'box.space.sync:drop()')
> + | ---
> + | - []
> + | ...
> +test_run:drop_cluster(SERVERS)
> + | ---
> + | ...
> +box.cfg{                                                                       \
> +    replication_synchro_quorum = orig_synchro_quorum,                          \
> +    replication_synchro_timeout = orig_synchro_timeout,                        \

8. You didn't change these values in the 'default' instance. So after
what do you restore them?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/4] replication: run clear_synchro_queue() with unconfigured box
  2020-11-21 14:41   ` Vladislav Shpilevoy
@ 2020-11-23 14:44     ` Sergey Bronnikov
  0 siblings, 0 replies; 15+ messages in thread
From: Sergey Bronnikov @ 2020-11-23 14:44 UTC (permalink / raw)
  To: Vladislav Shpilevoy, tarantool-patches

Hi! Thanks for review!

On 21.11.2020 17:41, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patch!
>
>  From the commit title it seems like you introduced a feature to run
> the function with non-configured box. Please, change 'run' to 'test',
> would be better already.
>
> Also for qsync we usually write title 'qsync: ...', not just
> 'replication: ...'. Or in your case it could be 'test: ...',
> but up to you. The same for the other commits.

updated prefix for all commits in patch series:

4f8b516c3 (HEAD -> ligurio/gh-4842-qsync-testing) qsync: add test with 
change space sync mode in a loop
d9a3fff52 qsync: add test with random leaders promotion and demotion
8723e4cca qsync: test clear_synchro_queue() function
8889406fc qsync: test clear_synchro_queue() with unconfigured box

> The patch itself is good.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/4] replication: test clear_synchro_queue() function
  2020-11-21 14:41   ` Vladislav Shpilevoy
@ 2020-11-23 15:13     ` Sergey Bronnikov
  0 siblings, 0 replies; 15+ messages in thread
From: Sergey Bronnikov @ 2020-11-23 15:13 UTC (permalink / raw)
  To: Vladislav Shpilevoy, tarantool-patches

Hi, thanks for review!

On 21.11.2020 17:41, Vladislav Shpilevoy wrote:
> Thanks for the patch!
>
> See 2 comments below.
>
> On 17.11.2020 17:13, sergeyb@tarantool.org wrote:
>> From: Sergey Bronnikov <sergeyb@tarantool.org>
>>
>> Part of #5055
>> Part of #4849
>> ---
>>   test/replication/qsync_basic.result   | 129 ++++++++++++++++++++++++++
>>   test/replication/qsync_basic.test.lua |  45 +++++++++
>>   2 files changed, 174 insertions(+)
>>
>> diff --git a/test/replication/qsync_basic.result b/test/replication/qsync_basic.result
>> index bd3c3cce1..a06f65ce1 100644
>> --- a/test/replication/qsync_basic.result
>> +++ b/test/replication/qsync_basic.result
>> @@ -637,6 +637,135 @@ box.space.sync:count()
>>    | - 0
>>    | ...
>>   
>> +--
>> +-- gh-4849: clear synchro queue on a master
>> +--
>> +test_run:switch('default')
>> + | ---
>> + | - true
>> + | ...
>> +box.cfg{replication_synchro_quorum = 3, replication_synchro_timeout = 1000}
>> + | ---
>> + | ...
>> +ok, err = nil
>> + | ---
>> + | ...
>> +f = fiber.create(function()							\
>> +    ok, err = pcall(box.space.sync.insert, box.space.sync, {10})		\
> 1. In Lua files we use spaces and 4 symbols indentation step. I
> know it would be easier if it would be like in C, but at this point
> it is too late to change without a reason.
>
> Here you apparently used tabs for '\' symbols.

Updated in a branch. I haven't put relevant patch here

because whitespaces are broken after copy-paste.

>
>> +end)
>> + | ---
>> + | ...
>> +f:status()
>> + | ---
>> + | - suspended
>> + | ...
>> +test_run:switch('replica')
>> + | ---
>> + | - true
>> + | ...
>> +test_run:wait_cond(function() return box.space.sync:get{10} ~= nil end)
>> + | ---
>> + | - true
>> + | ...
>> +test_run:switch('default')
>> + | ---
>> + | - true
>> + | ...
>> +box.cfg{replication_synchro_timeout = 0.1}
>> + | ---
>> + | ...
>> +box.ctl.clear_synchro_queue()
> 2. Current behaviour of the queue clearance is incorrect,
> because it should never rollback local rows, see
> https://github.com/tarantool/tarantool/issues/5435. It is
> fine to have this test, but better leave a comment, that the
> test must stop working eventually, and it will be ok to
> delete or rework it.

Updated comment:

@@ -251,11 +251,16 @@ box.space.sync:count()
  --
  -- gh-4849: clear synchro queue on a master
  --
+-- NOTE: current behavior of the queue clearance is incorrect,
+-- because it should never rollback local rows, see
+-- https://github.com/tarantool/tarantool/issues/5435. Test may
+-- stop working eventually.
+--
  test_run:switch('default')
  box.cfg{replication_synchro_quorum = 3, replication_synchro_timeout = 
1000}
  ok, err = nil

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Tarantool-patches] [PATCH 4/4] replication: add test with change space sync mode in a loop
  2020-11-21 14:41   ` Vladislav Shpilevoy
@ 2020-11-23 15:42     ` Sergey Bronnikov
  0 siblings, 0 replies; 15+ messages in thread
From: Sergey Bronnikov @ 2020-11-23 15:42 UTC (permalink / raw)
  To: Vladislav Shpilevoy, tarantool-patches

Thanks for review!

On 21.11.2020 17:41, Vladislav Shpilevoy wrote:
> Thanks for the patch!
>
>> +++ b/test/replication/qsync_sync_mode.result
>> @@ -0,0 +1,114 @@
>> +-- test-run result file version 2
>> +env = require('test_run')
>> + | ---
>> + | ...
>> +test_run = env.new()
>> + | ---
>> + | ...
>> +engine = test_run:get_cfg('engine')
>> + | ---
>> + | ...
>> +fiber = require('fiber')
>> + | ---
>> + | ...
>> +math = require('math')
>> + | ---
>> + | ...
>> +
>> +orig_synchro_quorum = box.cfg.replication_synchro_quorum
>> + | ---
>> + | ...
>> +orig_synchro_timeout = box.cfg.replication_synchro_timeout
>> + | ---
>> + | ...
>> +
>> +math.randomseed(os.time())
>> + | ---
>> + | ...
>> +random_boolean = function()                                                    \
>> +    if (math.random(1, 10) > 5) then                                           \
> In Lua we don't use () in if conditions. Also you could simply
> 'return math.random(1, 10) > 5'.

Agree. Also updated format of function.

--- a/test/replication/qsync_sync_mode.test.lua
+++ b/test/replication/qsync_sync_mode.test.lua
@@ -8,12 +8,8 @@ orig_synchro_quorum = box.cfg.replication_synchro_quorum
  orig_synchro_timeout = box.cfg.replication_synchro_timeout

  math.randomseed(os.time())
-random_boolean = 
function()                                                    \
-    if (math.random(1, 10) > 5) 
then                                           \
-        return 
true                                                            \
- else \
-        return 
false                                                           \
- end \
+function random_boolean() \
+    return math.random(1, 10) > 
5                                              \
  end

  box.schema.user.grant('guest', 'replication')

>> +        return true                                                            \
>> +    else                                                                       \
>> +        return false                                                           \
>> +    end                                                                        \
>> +end

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Tarantool-patches] [PATCH 3/4] replication: add test with random leaders promotion and demotion
  2020-11-21 14:41   ` Vladislav Shpilevoy
@ 2020-11-24  8:39     ` Sergey Bronnikov
  2020-11-25 22:02       ` Vladislav Shpilevoy
  0 siblings, 1 reply; 15+ messages in thread
From: Sergey Bronnikov @ 2020-11-24  8:39 UTC (permalink / raw)
  To: Vladislav Shpilevoy, tarantool-patches

Thanks for review!

On 21.11.2020 17:41, Vladislav Shpilevoy wrote:
> Thanks for the patch!
>
> See 8 comments below.
>
> On 17.11.2020 17:13, sergeyb@tarantool.org wrote:
>> From: Sergey Bronnikov <sergeyb@tarantool.org>
>>
>> Part of #5055
>> Part of #5144
>> ---
>>   test/replication/qsync.lua                    |  31 ++++
>>   test/replication/qsync1.lua                   |   1 +
>>   test/replication/qsync2.lua                   |   1 +
>>   test/replication/qsync3.lua                   |   1 +
>>   test/replication/qsync4.lua                   |   1 +
>>   test/replication/qsync5.lua                   |   1 +
>>   test/replication/qsync_random_leader.result   | 148 ++++++++++++++++++
>>   test/replication/qsync_random_leader.test.lua |  76 +++++++++
>>   8 files changed, 260 insertions(+)
>>   create mode 100644 test/replication/qsync.lua
>>   create mode 120000 test/replication/qsync1.lua
>>   create mode 120000 test/replication/qsync2.lua
>>   create mode 120000 test/replication/qsync3.lua
>>   create mode 120000 test/replication/qsync4.lua
>>   create mode 120000 test/replication/qsync5.lua
>>   create mode 100644 test/replication/qsync_random_leader.result
>>   create mode 100644 test/replication/qsync_random_leader.test.lua
>>
>> diff --git a/test/replication/qsync.lua b/test/replication/qsync.lua
>> new file mode 100644
>> index 000000000..9bbc87239
>> --- /dev/null
>> +++ b/test/replication/qsync.lua
>> @@ -0,0 +1,31 @@
>> +#!/usr/bin/env tarantool
>> +
>> +-- get instance name from filename (qsync1.lua => qsync1)
>> +local INSTANCE_ID = string.match(arg[0], "%d")
>> +
>> +local SOCKET_DIR = require('fio').cwd()
>> +
>> +local function instance_uri(instance_id)
>> +    return SOCKET_DIR..'/qsync'..instance_id..'.sock';
> 1. ';' is not needed here.
  local function instance_uri(instance_id)
-    return SOCKET_DIR..'/qsync'..instance_id..'.sock';
+    return SOCKET_DIR..'/qsync'..instance_id..'.sock'
  end


>> +end
>> +
>> +-- start console first
> 2. Why? The comment narrates the code, but does not say why
> it is done so.
Updated a comment. Console is required to keep instance running.
>> +require('console').listen(os.getenv('ADMIN'))
>> +
>> +box.cfg({
>> +    listen = instance_uri(INSTANCE_ID);
> 3. Please, don't use ';' as a separator. We use ',' in the
> new code.


  box.cfg({
-    listen = instance_uri(INSTANCE_ID);
+    listen = instance_uri(INSTANCE_ID),
      replication = {
-        instance_uri(1);
-        instance_uri(2);
-        instance_uri(3);
-        instance_uri(4);
-        instance_uri(5);
-    };
-    replication_synchro_timeout = 1000;
-    replication_synchro_quorum = 5;
-    read_only = false;
+        instance_uri(1),
+        instance_uri(2),
+        instance_uri(3),
+        instance_uri(4),
+        instance_uri(5),
+    },
+    replication_synchro_timeout = 1000,
+    replication_synchro_quorum = 5,
+    read_only = false,
  })

>> +    replication = {
>> +        instance_uri(1);
>> +        instance_uri(2);
>> +        instance_uri(3);
>> +        instance_uri(4);
>> +        instance_uri(5);
>> +    };
>> +    replication_synchro_timeout = 1000;
>> +    replication_synchro_quorum = 5;
>> +    read_only = false;
>> +})
>> +
>> +box.once("bootstrap", function()
>> +    box.schema.user.grant("guest", 'replication')
>> +end)
>> diff --git a/test/replication/qsync_random_leader.result b/test/replication/qsync_random_leader.result
>> new file mode 100644
>> index 000000000..2b2df99db
>> --- /dev/null
>> +++ b/test/replication/qsync_random_leader.result
>> @@ -0,0 +1,148 @@
>> +-- test-run result file version 2
>> +os = require('os')
>> + | ---
>> + | ...
>> +env = require('test_run')
>> + | ---
>> + | ...
>> +math = require('math')
>> + | ---
>> + | ...
>> +fiber = require('fiber')
>> + | ---
>> + | ...
>> +test_run = env.new()
>> + | ---
>> + | ...
>> +netbox = require('net.box')
>> + | ---
>> + | ...
>> +
>> +orig_synchro_quorum = box.cfg.replication_synchro_quorum
>> + | ---
>> + | ...
>> +orig_synchro_timeout = box.cfg.replication_synchro_timeout
>> + | ---
>> + | ...
>> +
>> +NUM_INSTANCES = 5
>> + | ---
>> + | ...
>> +SERVERS = {}
>> + | ---
>> + | ...
>> +for i=1,NUM_INSTANCES do                                                       \
> 4. Please, use whitespaces. Here are some: '             ',
> take them.
Vim doesn't show any tabs before '\'. Anyway I have updated indentation 
before '\' using whitespaces.
>> +    SERVERS[i] = 'qsync' .. i                                                  \
>> +end;
> 5. ';' is not needed.

Removed.


>> + | ---
>> + | ...
>> +SERVERS -- print instance names
>> + | ---
>> + | - - qsync1
>> + |   - qsync2
>> + |   - qsync3
>> + |   - qsync4
>> + |   - qsync5
>> + | ...
>> +
>> +math.randomseed(os.time())
>> + | ---
>> + | ...
>> +random = function(excluded_num, total)                                         \
> 6. Why not 'function random(...)'? Why do you need to assign
> it to a variable via '='?

Because for my taste this form is more convenient for reading.

Converted to "function random(...)".

>
>> +    local r = math.random(1, total)                                            \
>> +    if (r == excluded_num) then                                                \
>> +        return random(excluded_num, total)                                     \
>> +    end                                                                        \
>> +    return r                                                                   \
>> +end
>> + | ---
>> + | ...
>> +
>> +-- Set 'broken' quorum on current leader.
>> +-- Write value on current leader.
>> +-- Pick a random replica in a cluster.
>> +-- Set 'good' quorum on it and promote to a leader.
>> +-- Make sure value is there and on an old leader.
>> +
>> +-- Testcase setup.
>> +test_run:create_cluster(SERVERS)
>> + | ---
>> + | ...
>> +test_run:wait_fullmesh(SERVERS)
>> + | ---
>> + | ...
>> +test_run:switch('qsync1')
>> + | ---
>> + | - true
>> + | ...
>> +_ = box.schema.space.create('sync', {is_sync=true, engine = test_run:get_cfg('engine')})
>> + | ---
>> + | ...
>> +_ = box.space.sync:create_index('primary')
>> + | ---
>> + | ...
>> +box.schema.user.grant('guest', 'write', 'space', 'sync')
>> + | ---
>> + | ...
>> +test_run:switch('default')
>> + | ---
>> + | - true
>> + | ...
>> +current_leader_id = 1
>> + | ---
>> + | ...
>> +test_run:eval(SERVERS[current_leader_id], "box.ctl.clear_synchro_queue()")
>> + | ---
>> + | - []
>> + | ...
>> +
>> +SOCKET_DIR = require('fio').cwd()
>> + | ---
>> + | ...
>> +
>> +-- Testcase body.
>> +for i=1,30 do                                                                  \
>> +    test_run:eval(SERVERS[current_leader_id],                                  \
>> +        "box.cfg{replication_synchro_quorum=6, replication_synchro_timeout=1000}") \
>> +    c = netbox.connect(SOCKET_DIR..'/'..SERVERS[current_leader_id]..'.sock')   \
>> +    fiber.create(function() c.space.sync:insert{i} end)                        \
>> +    new_leader_id = random(current_leader_id, #SERVERS)                        \
>> +    test_run:eval(SERVERS[new_leader_id],                                      \
>> +        "box.cfg{replication_synchro_quorum=3, replication_synchro_timeout=0.01}") \
>> +    test_run:eval(SERVERS[new_leader_id], "box.ctl.clear_synchro_queue()")     \
>> +    c:close()                                                                  \
>> +    replica = random(new_leader_id, #SERVERS)                                  \
>> +    test_run:wait_cond(function() return test_run:eval(SERVERS[replica],       \
>> +                       string.format("box.space.sync:get{%d}", i))[1] ~= nil end)  \
>> +    test_run:wait_cond(function() return test_run:eval(SERVERS[current_leader_id], \
>> +                       string.format("box.space.sync:get{%d}", i))[1] ~= nil end)  \
>> +    new_leader_id = random(current_leader_id, #SERVERS)                        \
>> +    current_leader_id = new_leader_id                                          \
>> +end
>> + | ---
>> + | ...
>> +
>> +test_run:wait_cond(function() return test_run:eval('qsync1',                   \
>> +                   ("box.space.sync:count()")) == 30 end)
>> + | ---
>> + | - false
> 7. I am not sure it is supposed to be false. If it should be -
> I don't understand why. Could you elaborate please?

No, it is obviously should be True here. I missed reference value for 
that statement in .result file.

wait_cond() here was broken because it returns a list, not a number. 
Fixed it.

  test_run:wait_cond(function() return 
test_run:eval('qsync1',                   \
-                   ("box.space.sync:count()")) == 30 end)
+                   ("box.space.sync:count()"))[1] == 30 end)
   | ---
- | - false
+ | - true
   | ...

> Also in the previous commit in the end of a test doing exactly
> the same, but for leader-replica config, you wrote:
>
> 	Note: cluster may be in a broken state here due to nature of previous test.

You got it wrong. The sense of my note is following: testcases should be 
isolated from each one and if testcase

is failed we consider environment where test has been started as 'dirty' 
and should not continue to run other tests in

that environment. Testcases with 'broken' quorum uses negative scenarios 
and I placed this testcase at the end of file

to minimize impact to other tests in case of failure. Note was placed 
for others just to warn about negative testcase.

>
> But here you do it 30 times, and it is not broken (I hope).
> So that statement wasn't true?
>
> Besides, I get
>
> No output during 10 seconds. Will abort after 120 seconds without output. List of workers not reporting the status:
> - 001_replication [replication/qsync_random_leader.test.lua, vinyl] at var/001_replication/qsync_random_leader.result:126
> - 002_replication [replication/qsync_random_leader.test.lua, memtx] at var/002_replication/qsync_random_leader.result:126
> No output during 20 seconds. Will abort after 120 seconds without output. List of workers not reporting the status:
> - 001_replication [replication/qsync_random_leader.test.lua, vinyl] at var/001_replication/qsync_random_leader.result:126
> - 002_replication [replication/qsync_random_leader.test.lua, memtx] at var/002_replication/qsync_random_leader.result:126
>
> on this wait_cond(). Does not look like it works.

The problem was in a value that wait_cond() returns.

>
> Also you don't check the space is full on other instances. Only on the
> current leader, which is not so interesting.

-test_run:wait_cond(function() return 
test_run:eval('qsync1',                   \
-                   ("box.space.sync:count()")) == 30 end)
+for i=1,NUM_INSTANCES 
do                                                       \
+    instance = string.format('qsync%d', 
i)                                     \
+    test_run:wait_cond(function() return 
test_run:eval(instance,               \
+                       "box.space.sync:count()")[1] == 30 
end)                 \
+end



>> + | ...
>> +
>> +-- Teardown.
>> +test_run:switch('default')
>> + | ---
>> + | - true
>> + | ...
>> +test_run:eval(SERVERS[current_leader_id], 'box.space.sync:drop()')
>> + | ---
>> + | - []
>> + | ...
>> +test_run:drop_cluster(SERVERS)
>> + | ---
>> + | ...
>> +box.cfg{                                                                       \
>> +    replication_synchro_quorum = orig_synchro_quorum,                          \
>> +    replication_synchro_timeout = orig_synchro_timeout,                        \
> 8. You didn't change these values in the 'default' instance. So after
> what do you restore them?

Removed these lines as well as lines before test to save settings.

--- a/test/replication/qsync_random_leader.test.lua
+++ b/test/replication/qsync_random_leader.test.lua
@@ -5,18 +5,15 @@ fiber = require('fiber')
  test_run = env.new()
  netbox = require('net.box')

-orig_synchro_quorum = box.cfg.replication_synchro_quorum
-orig_synchro_timeout = box.cfg.replication_synchro_timeout
-
  NUM_INSTANCES = 5
  SERVERS = {}
  for i=1,NUM_INSTANCES 
do                                                       \
      SERVERS[i] = 'qsync' .. i


  -- Teardown.
  test_run:switch('default')
  test_run:eval(SERVERS[current_leader_id], 'box.space.sync:drop()')
  test_run:drop_cluster(SERVERS)
-box.cfg{ \
-    replication_synchro_quorum = 
orig_synchro_quorum,                          \
-    replication_synchro_timeout = 
orig_synchro_timeout,                        \
-}

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Tarantool-patches] [PATCH 3/4] replication: add test with random leaders promotion and demotion
  2020-11-24  8:39     ` Sergey Bronnikov
@ 2020-11-25 22:02       ` Vladislav Shpilevoy
  0 siblings, 0 replies; 15+ messages in thread
From: Vladislav Shpilevoy @ 2020-11-25 22:02 UTC (permalink / raw)
  To: Sergey Bronnikov, tarantool-patches

Hi! Thanks for the patch!

See 3 comments below.

> diff --git a/test/replication/qsync_random_leader.result b/test/replication/qsync_random_leader.result
> new file mode 100644
> index 000000000..2d242e802
> --- /dev/null
> +++ b/test/replication/qsync_random_leader.result
> @@ -0,0 +1,137 @@
> +-- test-run result file version 2
> +os = require('os')
> + | ---
> + | ...
> +env = require('test_run')
> + | ---
> + | ...
> +math = require('math')
> + | ---
> + | ...
> +fiber = require('fiber')
> + | ---
> + | ...
> +test_run = env.new()
> + | ---
> + | ...
> +netbox = require('net.box')
> + | ---
> + | ...
> +
> +NUM_INSTANCES = 5
> + | ---
> + | ...
> +SERVERS = {}
> + | ---
> + | ...
> +for i=1,NUM_INSTANCES do                                                       \

1. Here are some more whitespaces: '                              ',
'               ', '               ', '               '. Should be
enough for this loop and the loops below.

> +    SERVERS[i] = 'qsync' .. i                                                  \
> +end
> + | ---
> + | ...
> +SERVERS -- print instance names
> + | ---
> + | - - qsync1
> + |   - qsync2
> + |   - qsync3
> + |   - qsync4
> + |   - qsync5
> + | ...
> +
> +math.randomseed(os.time())
> + | ---
> + | ...
> +function random(excluded_num, total)                                           \
> +    local r = math.random(1, total)                                            \
> +    if (r == excluded_num) then                                                \

2. In Lua we never use () for condition bodies.

> +        return random(excluded_num, total)                                     \
> +    end                                                                        \
> +    return r                                                                   \
> +end
> + | ---
> + | ...
> +
> +-- Set 'broken' quorum on current leader.
> +-- Write value on current leader.
> +-- Pick a random replica in a cluster.
> +-- Set 'good' quorum on it and promote to a leader.
> +-- Make sure value is there and on an old leader.
> +
> +-- Testcase setup.
> +test_run:create_cluster(SERVERS)
> + | ---
> + | ...
> +test_run:wait_fullmesh(SERVERS)
> + | ---
> + | ...
> +test_run:switch('qsync1')
> + | ---
> + | - true
> + | ...
> +_ = box.schema.space.create('sync', {is_sync=true, engine = test_run:get_cfg('engine')})
> + | ---
> + | ...
> +_ = box.space.sync:create_index('primary')
> + | ---
> + | ...
> +box.schema.user.grant('guest', 'write', 'space', 'sync')
> + | ---
> + | ...
> +test_run:switch('default')
> + | ---
> + | - true
> + | ...
> +current_leader_id = 1
> + | ---
> + | ...
> +test_run:eval(SERVERS[current_leader_id], "box.ctl.clear_synchro_queue()")
> + | ---
> + | - []
> + | ...
> +
> +SOCKET_DIR = require('fio').cwd()
> + | ---
> + | ...
> +
> +-- Testcase body.
> +for i=1,30 do                                                                  \
> +    test_run:eval(SERVERS[current_leader_id],                                  \
> +        "box.cfg{replication_synchro_quorum=6, replication_synchro_timeout=1000}") \
> +    c = netbox.connect(SOCKET_DIR..'/'..SERVERS[current_leader_id]..'.sock')   \
> +    fiber.create(function() c.space.sync:insert{i} end)                        \
> +    new_leader_id = random(current_leader_id, #SERVERS)                        \
> +    test_run:eval(SERVERS[new_leader_id],                                      \
> +        "box.cfg{replication_synchro_quorum=3, replication_synchro_timeout=0.01}") \
> +    test_run:eval(SERVERS[new_leader_id], "box.ctl.clear_synchro_queue()")     \

3. There is a little issue, which makes this code test not what it is
supposed to test, I suspect. When you do c.space.sync:insert{i} in a new
fiber, you don't wait until the insertion is replicated to a new leader.

So it can happen, that when you call box.ctl.clear_synchro_queue() on the
new leader, the queue here is empty or contains some old records from
the previous iterations. Not the value from the current iteration.

If the test is supposed to commit the record of the old leader on the
new leader in the same iteration, then it is likely working by luck.

I suggest you to turn on the mvcc engine to turn off dirty reads to make
sure you don't treat a dirty read as committed read. And rely entirely on
LSNs to wait for replication. We have wait_lsn and get_lsn test_run methods
for that. In fact, it is always better to rely on LSNs. Because dirty reads
may be turned off by default someday, and we will need to rewrite such
tests.

> +    c:close()                                                                  \
> +    replica = random(new_leader_id, #SERVERS)                                  \
> +    test_run:wait_cond(function() return test_run:eval(SERVERS[replica],       \
> +                       string.format("box.space.sync:get{%d}", i))[1] ~= nil end)  \
> +    test_run:wait_cond(function() return test_run:eval(SERVERS[current_leader_id], \
> +                       string.format("box.space.sync:get{%d}", i))[1] ~= nil end)  \
> +    new_leader_id = random(current_leader_id, #SERVERS)                        \
> +    current_leader_id = new_leader_id                                          \
> +end
> + | ---
> + | ...
> +
> +for i=1,NUM_INSTANCES do                                                       \
> +    instance = string.format('qsync%d', i)                                     \
> +    test_run:wait_cond(function() return test_run:eval(instance,               \
> +                       "box.space.sync:count()")[1] == 30 end)                 \
> +end

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Tarantool-patches] [PATCH 4/4] replication: add test with change space sync mode in a loop
  2020-11-17 16:13 ` [Tarantool-patches] [PATCH 4/4] replication: add test with change space sync mode in a loop sergeyb
  2020-11-21 14:41   ` Vladislav Shpilevoy
@ 2020-11-25 22:02   ` Vladislav Shpilevoy
  1 sibling, 0 replies; 15+ messages in thread
From: Vladislav Shpilevoy @ 2020-11-25 22:02 UTC (permalink / raw)
  To: sergeyb, tarantool-patches

Thanks for the patch!

> diff --git a/test/replication/qsync_sync_mode.result b/test/replication/qsync_sync_mode.result
> new file mode 100644
> index 000000000..66f6a70b2
> --- /dev/null
> +++ b/test/replication/qsync_sync_mode.result
> @@ -0,0 +1,114 @@
> +test_run:cmd('start server replica with wait=True, wait_load=True')
> + | ---
> + | - true
> + | ...
> +
> +-- Write data to a leader and enable/disable sync mode in a loop.
> +-- Testcase setup.
> +_ = box.schema.space.create('sync', {is_sync=true, engine=engine})
> + | ---
> + | ...
> +_ = box.space.sync:create_index('pk')
> + | ---
> + | ...
> +box.cfg{replication_synchro_quorum=2, replication_synchro_timeout=1000}
> + | ---
> + | ...
> +
> +-- Testcase body.
> +for i = 1,100 do                                                               \
> +    box.space.sync:alter{is_sync=random_boolean()}                             \

I thought about it more, and I realized that it does not make
much sense to test alter of kind true -> true, and false -> false.
Maybe better simply change it to negative on each iteration.

Also if the sync mode is true, wait_cond is supposed to be not
needed. That is the purpose of the sync. Currently your test will
pass even if I will simply remove is_sync option at all.

> +    box.space.sync:insert{i}                                                   \
> +    test_run:switch('replica')                                                 \
> +    test_run:wait_cond(function() return box.space.sync:get{i} ~= nil end)     \
> +    test_run:switch('default')                                                 \
> +    test_run:wait_cond(function() return box.space.sync:get{i} ~= nil end)     \
> +end
> + | ---
> + | ...
> +box.space.sync:count() -- 100
> + | ---
> + | - 100
> + | ...

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-11-25 22:02 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-17 16:13 [Tarantool-patches] [PATCH 0/4 v3] Additional qsync tests sergeyb
2020-11-17 16:13 ` [Tarantool-patches] [PATCH 1/4] replication: run clear_synchro_queue() with unconfigured box sergeyb
2020-11-21 14:41   ` Vladislav Shpilevoy
2020-11-23 14:44     ` Sergey Bronnikov
2020-11-17 16:13 ` [Tarantool-patches] [PATCH 2/4] replication: test clear_synchro_queue() function sergeyb
2020-11-21 14:41   ` Vladislav Shpilevoy
2020-11-23 15:13     ` Sergey Bronnikov
2020-11-17 16:13 ` [Tarantool-patches] [PATCH 3/4] replication: add test with random leaders promotion and demotion sergeyb
2020-11-21 14:41   ` Vladislav Shpilevoy
2020-11-24  8:39     ` Sergey Bronnikov
2020-11-25 22:02       ` Vladislav Shpilevoy
2020-11-17 16:13 ` [Tarantool-patches] [PATCH 4/4] replication: add test with change space sync mode in a loop sergeyb
2020-11-21 14:41   ` Vladislav Shpilevoy
2020-11-23 15:42     ` Sergey Bronnikov
2020-11-25 22:02   ` Vladislav Shpilevoy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox