From: "Alexander V. Tikhonov" <avtikhon@tarantool.org> To: Kirill Yukhin <kyukhin@tarantool.org>, Mergen Imeev <imeevma@gmail.com> Cc: tarantool-patches@dev.tarantool.org Subject: [Tarantool-patches] [PATCH v2 1/2] Divide replication/misc.test.lua Date: Fri, 4 Sep 2020 20:27:06 +0300 [thread overview] Message-ID: <3e681071c39d726e895f78d02fbb7a1e999a56ac.1599240310.git.avtikhon@tarantool.org> (raw) To fix flaky issues of replication/misc.test.lua the test had to be divided into smaller tests to be able to localize the flaky results: misc_assert_connecting_master_twice_gh-3610.test.lua misc_assert_on_server_die_gh-2991.test.lua misc_assert_replica_on_applier_disconnect_gh-3510.test.lua misc_crash_on_box_concurrent_update_gh-3606.test.lua misc_heartbeats_on_master_changes_gh-3160.test.lua misc_no_failure_on_error_reading_wal_gh-4399.test.lua misc_no_panic_on_connected_gh-3637.test.lua misc_no_restart_on_same_configuration_gh-3711.test.lua misc_no_socket_leak_on_replica_disconnect_gh-3642.test.lua misc_orphan_on_reconfiguration_error_gh-4424.test.lua misc_rebootstrap_from_ro_master_gh-3111.test.lua misc_replica_checks_cluster_id_gh-3704.test.lua misc_return_on_quorum_0_gh-3760.test.lua misc_value_not_replicated_on_iproto_request_gh-3247.test.lua Needed for #4940 --- Github: https://github.com/tarantool/tarantool/tree/avtikhon/gh-4940-replication-misc Issue: https://github.com/tarantool/tarantool/issues/4940 .../gh-2991-misc-assert-on-server-die.result | 29 + ...gh-2991-misc-assert-on-server-die.test.lua | 11 + ...111-misc-rebootstrap-from-ro-master.result | 58 ++ ...1-misc-rebootstrap-from-ro-master.test.lua | 20 + ...0-misc-heartbeats-on-master-changes.result | 67 ++ ...misc-heartbeats-on-master-changes.test.lua | 37 + ...ue-not-replicated-on-iproto-request.result | 87 ++ ...-not-replicated-on-iproto-request.test.lua | 32 + ...ssert-replica-on-applier-disconnect.result | 46 + ...ert-replica-on-applier-disconnect.test.lua | 16 + ...misc-crash-on-box-concurrent-update.result | 45 + ...sc-crash-on-box-concurrent-update.test.lua | 17 + ...misc-assert-connecting-master-twice.result | 83 ++ ...sc-assert-connecting-master-twice.test.lua | 33 + .../gh-3637-misc-no-panic-on-connected.result | 69 ++ ...h-3637-misc-no-panic-on-connected.test.lua | 32 + ...o-socket-leak-on-replica-disconnect.result | 95 ++ ...socket-leak-on-replica-disconnect.test.lua | 43 + ...3704-misc-replica-checks-cluster-id.result | 68 ++ ...04-misc-replica-checks-cluster-id.test.lua | 25 + ...sc-no-restart-on-same-configuration.result | 101 ++ ...-no-restart-on-same-configuration.test.lua | 39 + .../gh-3760-misc-return-on-quorum-0.result | 23 + .../gh-3760-misc-return-on-quorum-0.test.lua | 14 + ...isc-no-failure-on-error-reading-wal.result | 94 ++ ...c-no-failure-on-error-reading-wal.test.lua | 38 + ...isc-orphan-on-reconfiguration-error.result | 82 ++ ...c-orphan-on-reconfiguration-error.test.lua | 35 + test/replication/misc.result | 866 ------------------ test/replication/misc.skipcond | 7 - test/replication/misc.test.lua | 356 ------- test/replication/suite.cfg | 15 +- test/replication/suite.ini | 2 +- 33 files changed, 1354 insertions(+), 1231 deletions(-) create mode 100644 test/replication/gh-2991-misc-assert-on-server-die.result create mode 100644 test/replication/gh-2991-misc-assert-on-server-die.test.lua create mode 100644 test/replication/gh-3111-misc-rebootstrap-from-ro-master.result create mode 100644 test/replication/gh-3111-misc-rebootstrap-from-ro-master.test.lua create mode 100644 
test/replication/gh-3160-misc-heartbeats-on-master-changes.result create mode 100644 test/replication/gh-3160-misc-heartbeats-on-master-changes.test.lua create mode 100644 test/replication/gh-3247-misc-value-not-replicated-on-iproto-request.result create mode 100644 test/replication/gh-3247-misc-value-not-replicated-on-iproto-request.test.lua create mode 100644 test/replication/gh-3510-misc-assert-replica-on-applier-disconnect.result create mode 100644 test/replication/gh-3510-misc-assert-replica-on-applier-disconnect.test.lua create mode 100644 test/replication/gh-3606-misc-crash-on-box-concurrent-update.result create mode 100644 test/replication/gh-3606-misc-crash-on-box-concurrent-update.test.lua create mode 100644 test/replication/gh-3610-misc-assert-connecting-master-twice.result create mode 100644 test/replication/gh-3610-misc-assert-connecting-master-twice.test.lua create mode 100644 test/replication/gh-3637-misc-no-panic-on-connected.result create mode 100644 test/replication/gh-3637-misc-no-panic-on-connected.test.lua create mode 100644 test/replication/gh-3642-misc-no-socket-leak-on-replica-disconnect.result create mode 100644 test/replication/gh-3642-misc-no-socket-leak-on-replica-disconnect.test.lua create mode 100644 test/replication/gh-3704-misc-replica-checks-cluster-id.result create mode 100644 test/replication/gh-3704-misc-replica-checks-cluster-id.test.lua create mode 100644 test/replication/gh-3711-misc-no-restart-on-same-configuration.result create mode 100644 test/replication/gh-3711-misc-no-restart-on-same-configuration.test.lua create mode 100644 test/replication/gh-3760-misc-return-on-quorum-0.result create mode 100644 test/replication/gh-3760-misc-return-on-quorum-0.test.lua create mode 100644 test/replication/gh-4399-misc-no-failure-on-error-reading-wal.result create mode 100644 test/replication/gh-4399-misc-no-failure-on-error-reading-wal.test.lua create mode 100644 test/replication/gh-4424-misc-orphan-on-reconfiguration-error.result create mode 100644 test/replication/gh-4424-misc-orphan-on-reconfiguration-error.test.lua delete mode 100644 test/replication/misc.result delete mode 100644 test/replication/misc.skipcond delete mode 100644 test/replication/misc.test.lua diff --git a/test/replication/gh-2991-misc-assert-on-server-die.result b/test/replication/gh-2991-misc-assert-on-server-die.result new file mode 100644 index 000000000..ffae6e44a --- /dev/null +++ b/test/replication/gh-2991-misc-assert-on-server-die.result @@ -0,0 +1,29 @@ +-- gh-2991 - Tarantool asserts on box.cfg.replication update if one of +-- servers is dead +replication_timeout = box.cfg.replication_timeout +--- +... +replication_connect_timeout = box.cfg.replication_connect_timeout +--- +... +box.cfg{replication_timeout=0.05, replication_connect_timeout=0.05, replication={}} +--- +... +box.cfg{replication_connect_quorum=2} +--- +... +box.cfg{replication = {'127.0.0.1:12345', box.cfg.listen}} +--- +... +box.info.status +--- +- orphan +... +box.info.ro +--- +- true +... +box.cfg{replication = "", replication_timeout = replication_timeout, \ + replication_connect_timeout = replication_connect_timeout} +--- +... 
diff --git a/test/replication/gh-2991-misc-assert-on-server-die.test.lua b/test/replication/gh-2991-misc-assert-on-server-die.test.lua new file mode 100644 index 000000000..b9f217cfa --- /dev/null +++ b/test/replication/gh-2991-misc-assert-on-server-die.test.lua @@ -0,0 +1,11 @@ +-- gh-2991 - Tarantool asserts on box.cfg.replication update if one of +-- servers is dead +replication_timeout = box.cfg.replication_timeout +replication_connect_timeout = box.cfg.replication_connect_timeout +box.cfg{replication_timeout=0.05, replication_connect_timeout=0.05, replication={}} +box.cfg{replication_connect_quorum=2} +box.cfg{replication = {'127.0.0.1:12345', box.cfg.listen}} +box.info.status +box.info.ro +box.cfg{replication = "", replication_timeout = replication_timeout, \ + replication_connect_timeout = replication_connect_timeout} diff --git a/test/replication/gh-3111-misc-rebootstrap-from-ro-master.result b/test/replication/gh-3111-misc-rebootstrap-from-ro-master.result new file mode 100644 index 000000000..7ffca1585 --- /dev/null +++ b/test/replication/gh-3111-misc-rebootstrap-from-ro-master.result @@ -0,0 +1,58 @@ +test_run = require('test_run').new() +--- +... +test_run:cmd("restart server default") +uuid = require('uuid') +--- +... +box.schema.user.grant('guest', 'replication') +--- +... +-- gh-3111 - Allow to rebootstrap a replica from a read-only master +replica_uuid = uuid.new() +--- +... +test_run:cmd('create server test with rpl_master=default, script="replication/replica_uuid.lua"') +--- +- true +... +test_run:cmd(string.format('start server test with args="%s"', replica_uuid)) +--- +- true +... +test_run:cmd('stop server test') +--- +- true +... +test_run:cmd('cleanup server test') +--- +- true +... +box.cfg{read_only = true} +--- +... +test_run:cmd(string.format('start server test with args="%s"', replica_uuid)) +--- +- true +... +test_run:cmd('stop server test') +--- +- true +... +test_run:cmd('cleanup server test') +--- +- true +... +box.cfg{read_only = false} +--- +... +test_run:cmd('delete server test') +--- +- true +... +test_run:cleanup_cluster() +--- +... +box.schema.user.revoke('guest', 'replication') +--- +... 
diff --git a/test/replication/gh-3111-misc-rebootstrap-from-ro-master.test.lua b/test/replication/gh-3111-misc-rebootstrap-from-ro-master.test.lua new file mode 100644 index 000000000..bb9b4a80f --- /dev/null +++ b/test/replication/gh-3111-misc-rebootstrap-from-ro-master.test.lua @@ -0,0 +1,20 @@ +test_run = require('test_run').new() +test_run:cmd("restart server default") +uuid = require('uuid') + +box.schema.user.grant('guest', 'replication') + +-- gh-3111 - Allow to rebootstrap a replica from a read-only master +replica_uuid = uuid.new() +test_run:cmd('create server test with rpl_master=default, script="replication/replica_uuid.lua"') +test_run:cmd(string.format('start server test with args="%s"', replica_uuid)) +test_run:cmd('stop server test') +test_run:cmd('cleanup server test') +box.cfg{read_only = true} +test_run:cmd(string.format('start server test with args="%s"', replica_uuid)) +test_run:cmd('stop server test') +test_run:cmd('cleanup server test') +box.cfg{read_only = false} +test_run:cmd('delete server test') +test_run:cleanup_cluster() +box.schema.user.revoke('guest', 'replication') diff --git a/test/replication/gh-3160-misc-heartbeats-on-master-changes.result b/test/replication/gh-3160-misc-heartbeats-on-master-changes.result new file mode 100644 index 000000000..9bce55ae1 --- /dev/null +++ b/test/replication/gh-3160-misc-heartbeats-on-master-changes.result @@ -0,0 +1,67 @@ +test_run = require('test_run').new() +--- +... +-- gh-3160 - Send heartbeats if there are changes from a remote master only +SERVERS = { 'autobootstrap1', 'autobootstrap2', 'autobootstrap3' } +--- +... +-- Deploy a cluster. +test_run:create_cluster(SERVERS, "replication", {args="0.03"}) +--- +... +test_run:wait_fullmesh(SERVERS) +--- +... +test_run:cmd("switch autobootstrap3") +--- +- true +... +test_run = require('test_run').new() +--- +... +_ = box.schema.space.create('test_timeout'):create_index('pk') +--- +... +test_run:cmd("setopt delimiter ';'") +--- +- true +... +function wait_not_follow(replicaA, replicaB) + return test_run:wait_cond(function() + return replicaA.status ~= 'follow' or replicaB.status ~= 'follow' + end, box.cfg.replication_timeout) +end; +--- +... +function test_timeout() + local replicaA = box.info.replication[1].upstream or box.info.replication[2].upstream + local replicaB = box.info.replication[3].upstream or box.info.replication[2].upstream + local follows = test_run:wait_cond(function() + return replicaA.status == 'follow' or replicaB.status == 'follow' + end) + if not follows then error('replicas are not in the follow status') end + for i = 0, 99 do + box.space.test_timeout:replace({1}) + if wait_not_follow(replicaA, replicaB) then + return error(box.info.replication) + end + end + return true +end; +--- +... +test_run:cmd("setopt delimiter ''"); +--- +- true +... +test_timeout() +--- +- true +... +test_run:cmd("switch default") +--- +- true +... +test_run:drop_cluster(SERVERS) +--- +... diff --git a/test/replication/gh-3160-misc-heartbeats-on-master-changes.test.lua b/test/replication/gh-3160-misc-heartbeats-on-master-changes.test.lua new file mode 100644 index 000000000..b3d8d2d54 --- /dev/null +++ b/test/replication/gh-3160-misc-heartbeats-on-master-changes.test.lua @@ -0,0 +1,37 @@ +test_run = require('test_run').new() + +-- gh-3160 - Send heartbeats if there are changes from a remote master only +SERVERS = { 'autobootstrap1', 'autobootstrap2', 'autobootstrap3' } + +-- Deploy a cluster. 
+test_run:create_cluster(SERVERS, "replication", {args="0.03"}) +test_run:wait_fullmesh(SERVERS) +test_run:cmd("switch autobootstrap3") +test_run = require('test_run').new() +_ = box.schema.space.create('test_timeout'):create_index('pk') +test_run:cmd("setopt delimiter ';'") +function wait_not_follow(replicaA, replicaB) + return test_run:wait_cond(function() + return replicaA.status ~= 'follow' or replicaB.status ~= 'follow' + end, box.cfg.replication_timeout) +end; +function test_timeout() + local replicaA = box.info.replication[1].upstream or box.info.replication[2].upstream + local replicaB = box.info.replication[3].upstream or box.info.replication[2].upstream + local follows = test_run:wait_cond(function() + return replicaA.status == 'follow' or replicaB.status == 'follow' + end) + if not follows then error('replicas are not in the follow status') end + for i = 0, 99 do + box.space.test_timeout:replace({1}) + if wait_not_follow(replicaA, replicaB) then + return error(box.info.replication) + end + end + return true +end; +test_run:cmd("setopt delimiter ''"); +test_timeout() + +test_run:cmd("switch default") +test_run:drop_cluster(SERVERS) diff --git a/test/replication/gh-3247-misc-value-not-replicated-on-iproto-request.result b/test/replication/gh-3247-misc-value-not-replicated-on-iproto-request.result new file mode 100644 index 000000000..39f9b5763 --- /dev/null +++ b/test/replication/gh-3247-misc-value-not-replicated-on-iproto-request.result @@ -0,0 +1,87 @@ +test_run = require('test_run').new() +--- +... +test_run:cmd("restart server default") +-- Deploy a cluster. +SERVERS = { 'autobootstrap1', 'autobootstrap2', 'autobootstrap3' } +--- +... +test_run:create_cluster(SERVERS, "replication", {args="0.03"}) +--- +... +test_run:wait_fullmesh(SERVERS) +--- +... +-- gh-3247 - Sequence-generated value is not replicated in case +-- the request was sent via iproto. +test_run:cmd("switch autobootstrap1") +--- +- true +... +net_box = require('net.box') +--- +... +_ = box.schema.space.create('space1') +--- +... +_ = box.schema.sequence.create('seq') +--- +... +_ = box.space.space1:create_index('primary', {sequence = true} ) +--- +... +_ = box.space.space1:create_index('secondary', {parts = {2, 'unsigned'}}) +--- +... +box.schema.user.grant('guest', 'read,write', 'space', 'space1') +--- +... +c = net_box.connect(box.cfg.listen) +--- +... +c.space.space1:insert{box.NULL, "data"} -- fails, but bumps sequence value +--- +- error: 'Tuple field 2 type does not match one required by operation: expected unsigned' +... +c.space.space1:insert{box.NULL, 1, "data"} +--- +- [2, 1, 'data'] +... +box.space.space1:select{} +--- +- - [2, 1, 'data'] +... +vclock = test_run:get_vclock("autobootstrap1") +--- +... +vclock[0] = nil +--- +... +_ = test_run:wait_vclock("autobootstrap2", vclock) +--- +... +test_run:cmd("switch autobootstrap2") +--- +- true +... +box.space.space1:select{} +--- +- - [2, 1, 'data'] +... +test_run:cmd("switch autobootstrap1") +--- +- true +... +box.space.space1:drop() +--- +... +test_run:cmd("switch default") +--- +- true +... +test_run:drop_cluster(SERVERS) +--- +... +test_run:cleanup_cluster() +--- +... 
diff --git a/test/replication/gh-3247-misc-value-not-replicated-on-iproto-request.test.lua b/test/replication/gh-3247-misc-value-not-replicated-on-iproto-request.test.lua new file mode 100644 index 000000000..a703377c3 --- /dev/null +++ b/test/replication/gh-3247-misc-value-not-replicated-on-iproto-request.test.lua @@ -0,0 +1,32 @@ +test_run = require('test_run').new() +test_run:cmd("restart server default") + +-- Deploy a cluster. +SERVERS = { 'autobootstrap1', 'autobootstrap2', 'autobootstrap3' } +test_run:create_cluster(SERVERS, "replication", {args="0.03"}) +test_run:wait_fullmesh(SERVERS) + +-- gh-3247 - Sequence-generated value is not replicated in case +-- the request was sent via iproto. +test_run:cmd("switch autobootstrap1") +net_box = require('net.box') +_ = box.schema.space.create('space1') +_ = box.schema.sequence.create('seq') +_ = box.space.space1:create_index('primary', {sequence = true} ) +_ = box.space.space1:create_index('secondary', {parts = {2, 'unsigned'}}) +box.schema.user.grant('guest', 'read,write', 'space', 'space1') +c = net_box.connect(box.cfg.listen) +c.space.space1:insert{box.NULL, "data"} -- fails, but bumps sequence value +c.space.space1:insert{box.NULL, 1, "data"} +box.space.space1:select{} +vclock = test_run:get_vclock("autobootstrap1") +vclock[0] = nil +_ = test_run:wait_vclock("autobootstrap2", vclock) +test_run:cmd("switch autobootstrap2") +box.space.space1:select{} +test_run:cmd("switch autobootstrap1") +box.space.space1:drop() + +test_run:cmd("switch default") +test_run:drop_cluster(SERVERS) +test_run:cleanup_cluster() diff --git a/test/replication/gh-3510-misc-assert-replica-on-applier-disconnect.result b/test/replication/gh-3510-misc-assert-replica-on-applier-disconnect.result new file mode 100644 index 000000000..03e4ebfd5 --- /dev/null +++ b/test/replication/gh-3510-misc-assert-replica-on-applier-disconnect.result @@ -0,0 +1,46 @@ +test_run = require('test_run').new() +--- +... +-- gh-3510 assertion failure in replica_on_applier_disconnect() +test_run:cmd('create server er_load1 with script="replication/er_load1.lua"') +--- +- true +... +test_run:cmd('create server er_load2 with script="replication/er_load2.lua"') +--- +- true +... +test_run:cmd('start server er_load1 with wait=False, wait_load=False') +--- +- true +... +-- Instance er_load2 will fail with error ER_REPLICASET_UUID_MISMATCH. +-- This is OK since we only test here that er_load1 doesn't assert. +test_run:cmd('start server er_load2 with wait=True, wait_load=True, crash_expected = True') +--- +- false +... +test_run:cmd('stop server er_load1') +--- +- true +... +-- er_load2 exits automatically. +test_run:cmd('cleanup server er_load1') +--- +- true +... +test_run:cmd('cleanup server er_load2') +--- +- true +... +test_run:cmd('delete server er_load1') +--- +- true +... +test_run:cmd('delete server er_load2') +--- +- true +... +test_run:cleanup_cluster() +--- +... 
diff --git a/test/replication/gh-3510-misc-assert-replica-on-applier-disconnect.test.lua b/test/replication/gh-3510-misc-assert-replica-on-applier-disconnect.test.lua new file mode 100644 index 000000000..f55d0137b --- /dev/null +++ b/test/replication/gh-3510-misc-assert-replica-on-applier-disconnect.test.lua @@ -0,0 +1,16 @@ +test_run = require('test_run').new() + +-- gh-3510 assertion failure in replica_on_applier_disconnect() +test_run:cmd('create server er_load1 with script="replication/er_load1.lua"') +test_run:cmd('create server er_load2 with script="replication/er_load2.lua"') +test_run:cmd('start server er_load1 with wait=False, wait_load=False') +-- Instance er_load2 will fail with error ER_REPLICASET_UUID_MISMATCH. +-- This is OK since we only test here that er_load1 doesn't assert. +test_run:cmd('start server er_load2 with wait=True, wait_load=True, crash_expected = True') +test_run:cmd('stop server er_load1') +-- er_load2 exits automatically. +test_run:cmd('cleanup server er_load1') +test_run:cmd('cleanup server er_load2') +test_run:cmd('delete server er_load1') +test_run:cmd('delete server er_load2') +test_run:cleanup_cluster() diff --git a/test/replication/gh-3606-misc-crash-on-box-concurrent-update.result b/test/replication/gh-3606-misc-crash-on-box-concurrent-update.result new file mode 100644 index 000000000..4de4ad35a --- /dev/null +++ b/test/replication/gh-3606-misc-crash-on-box-concurrent-update.result @@ -0,0 +1,45 @@ +replication_timeout = box.cfg.replication_timeout +--- +... +replication_connect_timeout = box.cfg.replication_connect_timeout +--- +... +box.cfg{replication_timeout=0.05, replication_connect_timeout=0.05, replication={}} +--- +... +-- gh-3606 - Tarantool crashes if box.cfg.replication is updated concurrently +fiber = require('fiber') +--- +... +c = fiber.channel(2) +--- +... +f = function() fiber.create(function() pcall(box.cfg, {replication = {12345}}) c:put(true) end) end +--- +... +f() +--- +... +f() +--- +... +c:get() +--- +- true +... +c:get() +--- +- true +... +box.cfg{replication = "", replication_timeout = replication_timeout, \ + replication_connect_timeout = replication_connect_timeout} +--- +... +box.info.status +--- +- running +... +box.info.ro +--- +- false +... diff --git a/test/replication/gh-3606-misc-crash-on-box-concurrent-update.test.lua b/test/replication/gh-3606-misc-crash-on-box-concurrent-update.test.lua new file mode 100644 index 000000000..3792cc9e1 --- /dev/null +++ b/test/replication/gh-3606-misc-crash-on-box-concurrent-update.test.lua @@ -0,0 +1,17 @@ +replication_timeout = box.cfg.replication_timeout +replication_connect_timeout = box.cfg.replication_connect_timeout +box.cfg{replication_timeout=0.05, replication_connect_timeout=0.05, replication={}} + +-- gh-3606 - Tarantool crashes if box.cfg.replication is updated concurrently +fiber = require('fiber') +c = fiber.channel(2) +f = function() fiber.create(function() pcall(box.cfg, {replication = {12345}}) c:put(true) end) end +f() +f() +c:get() +c:get() + +box.cfg{replication = "", replication_timeout = replication_timeout, \ + replication_connect_timeout = replication_connect_timeout} +box.info.status +box.info.ro diff --git a/test/replication/gh-3610-misc-assert-connecting-master-twice.result b/test/replication/gh-3610-misc-assert-connecting-master-twice.result new file mode 100644 index 000000000..f2b07f30b --- /dev/null +++ b/test/replication/gh-3610-misc-assert-connecting-master-twice.result @@ -0,0 +1,83 @@ +test_run = require('test_run').new() +--- +... 
+test_run:cmd("restart server default") +fiber = require('fiber') +--- +... +-- +-- Test case for gh-3610. Before the fix replica would fail with the assertion +-- when trying to connect to the same master twice. +-- +box.schema.user.grant('guest', 'replication') +--- +... +test_run:cmd("create server replica with rpl_master=default, script='replication/replica.lua'") +--- +- true +... +test_run:cmd("start server replica") +--- +- true +... +test_run:cmd("switch replica") +--- +- true +... +replication = box.cfg.replication[1] +--- +... +box.cfg{replication = {replication, replication}} +--- +- error: 'Incorrect value for option ''replication'': duplicate connection to the + same replica' +... +-- Check the case when duplicate connection is detected in the background. +test_run:cmd("switch default") +--- +- true +... +listen = box.cfg.listen +--- +... +box.cfg{listen = ''} +--- +... +test_run:cmd("switch replica") +--- +- true +... +box.cfg{replication_connect_quorum = 0, replication_connect_timeout = 0.01} +--- +... +box.cfg{replication = {replication, replication}} +--- +... +test_run:cmd("switch default") +--- +- true +... +box.cfg{listen = listen} +--- +... +while test_run:grep_log('replica', 'duplicate connection') == nil do fiber.sleep(0.01) end +--- +... +test_run:cmd("stop server replica") +--- +- true +... +test_run:cmd("cleanup server replica") +--- +- true +... +test_run:cmd("delete server replica") +--- +- true +... +test_run:cleanup_cluster() +--- +... +box.schema.user.revoke('guest', 'replication') +--- +... diff --git a/test/replication/gh-3610-misc-assert-connecting-master-twice.test.lua b/test/replication/gh-3610-misc-assert-connecting-master-twice.test.lua new file mode 100644 index 000000000..5f86eb15b --- /dev/null +++ b/test/replication/gh-3610-misc-assert-connecting-master-twice.test.lua @@ -0,0 +1,33 @@ +test_run = require('test_run').new() +test_run:cmd("restart server default") +fiber = require('fiber') + +-- +-- Test case for gh-3610. Before the fix replica would fail with the assertion +-- when trying to connect to the same master twice. +-- +box.schema.user.grant('guest', 'replication') +test_run:cmd("create server replica with rpl_master=default, script='replication/replica.lua'") +test_run:cmd("start server replica") +test_run:cmd("switch replica") +replication = box.cfg.replication[1] +box.cfg{replication = {replication, replication}} + +-- Check the case when duplicate connection is detected in the background. +test_run:cmd("switch default") +listen = box.cfg.listen +box.cfg{listen = ''} + +test_run:cmd("switch replica") +box.cfg{replication_connect_quorum = 0, replication_connect_timeout = 0.01} +box.cfg{replication = {replication, replication}} + +test_run:cmd("switch default") +box.cfg{listen = listen} +while test_run:grep_log('replica', 'duplicate connection') == nil do fiber.sleep(0.01) end + +test_run:cmd("stop server replica") +test_run:cmd("cleanup server replica") +test_run:cmd("delete server replica") +test_run:cleanup_cluster() +box.schema.user.revoke('guest', 'replication') diff --git a/test/replication/gh-3637-misc-no-panic-on-connected.result b/test/replication/gh-3637-misc-no-panic-on-connected.result new file mode 100644 index 000000000..98880d8e4 --- /dev/null +++ b/test/replication/gh-3637-misc-no-panic-on-connected.result @@ -0,0 +1,69 @@ +test_run = require('test_run').new() +--- +... +test_run:cmd("restart server default") +-- +-- Test case for gh-3637, gh-4550. 
Before the fix replica would +-- exit with an error if a user does not exist or a password is +-- incorrect. Now check that we don't hang/panic and successfully +-- connect. +-- +fiber = require('fiber') +--- +... +test_run:cmd("create server replica_auth with rpl_master=default, script='replication/replica_auth.lua'") +--- +- true +... +test_run:cmd("start server replica_auth with wait=False, wait_load=False, args='cluster:pass 0.05'") +--- +- true +... +-- Wait a bit to make sure replica waits till user is created. +fiber.sleep(0.1) +--- +... +box.schema.user.create('cluster') +--- +... +-- The user is created. Let the replica fail auth request due to +-- a wrong password. +fiber.sleep(0.1) +--- +... +box.schema.user.passwd('cluster', 'pass') +--- +... +box.schema.user.grant('cluster', 'replication') +--- +... +while box.info.replication[2] == nil do fiber.sleep(0.01) end +--- +... +vclock = test_run:get_vclock('default') +--- +... +vclock[0] = nil +--- +... +_ = test_run:wait_vclock('replica_auth', vclock) +--- +... +test_run:cmd("stop server replica_auth") +--- +- true +... +test_run:cmd("cleanup server replica_auth") +--- +- true +... +test_run:cmd("delete server replica_auth") +--- +- true +... +test_run:cleanup_cluster() +--- +... +box.schema.user.drop('cluster') +--- +... diff --git a/test/replication/gh-3637-misc-no-panic-on-connected.test.lua b/test/replication/gh-3637-misc-no-panic-on-connected.test.lua new file mode 100644 index 000000000..c51a2f628 --- /dev/null +++ b/test/replication/gh-3637-misc-no-panic-on-connected.test.lua @@ -0,0 +1,32 @@ +test_run = require('test_run').new() +test_run:cmd("restart server default") + +-- +-- Test case for gh-3637, gh-4550. Before the fix replica would +-- exit with an error if a user does not exist or a password is +-- incorrect. Now check that we don't hang/panic and successfully +-- connect. +-- +fiber = require('fiber') +test_run:cmd("create server replica_auth with rpl_master=default, script='replication/replica_auth.lua'") +test_run:cmd("start server replica_auth with wait=False, wait_load=False, args='cluster:pass 0.05'") +-- Wait a bit to make sure replica waits till user is created. +fiber.sleep(0.1) +box.schema.user.create('cluster') +-- The user is created. Let the replica fail auth request due to +-- a wrong password. +fiber.sleep(0.1) +box.schema.user.passwd('cluster', 'pass') +box.schema.user.grant('cluster', 'replication') + +while box.info.replication[2] == nil do fiber.sleep(0.01) end +vclock = test_run:get_vclock('default') +vclock[0] = nil +_ = test_run:wait_vclock('replica_auth', vclock) + +test_run:cmd("stop server replica_auth") +test_run:cmd("cleanup server replica_auth") +test_run:cmd("delete server replica_auth") +test_run:cleanup_cluster() + +box.schema.user.drop('cluster') diff --git a/test/replication/gh-3642-misc-no-socket-leak-on-replica-disconnect.result b/test/replication/gh-3642-misc-no-socket-leak-on-replica-disconnect.result new file mode 100644 index 000000000..d068ad8fc --- /dev/null +++ b/test/replication/gh-3642-misc-no-socket-leak-on-replica-disconnect.result @@ -0,0 +1,95 @@ +test_run = require('test_run').new() +--- +... +test_run:cmd("restart server default") +box.schema.user.grant('guest', 'replication') +--- +... +-- gh-3642 - Check that socket file descriptor doesn't leak +-- when a replica is disconnected. +rlimit = require('rlimit') +--- +... +lim = rlimit.limit() +--- +... +rlimit.getrlimit(rlimit.RLIMIT_NOFILE, lim) +--- +... +old_fno = lim.rlim_cur +--- +... +lim.rlim_cur = 64 +--- +... 
+rlimit.setrlimit(rlimit.RLIMIT_NOFILE, lim) +--- +... +test_run:cmd('create server sock with rpl_master=default, script="replication/replica.lua"') +--- +- true +... +test_run:cmd('start server sock') +--- +- true +... +test_run:cmd('switch sock') +--- +- true +... +test_run = require('test_run').new() +--- +... +fiber = require('fiber') +--- +... +test_run:cmd("setopt delimiter ';'") +--- +- true +... +for i = 1, 64 do + local replication = box.cfg.replication + box.cfg{replication = {}} + box.cfg{replication = replication} + while box.info.replication[1].upstream.status ~= 'follow' do + fiber.sleep(0.001) + end +end; +--- +... +test_run:cmd("setopt delimiter ''"); +--- +- true +... +box.info.replication[1].upstream.status +--- +- follow +... +test_run:cmd('switch default') +--- +- true +... +lim.rlim_cur = old_fno +--- +... +rlimit.setrlimit(rlimit.RLIMIT_NOFILE, lim) +--- +... +test_run:cmd("stop server sock") +--- +- true +... +test_run:cmd("cleanup server sock") +--- +- true +... +test_run:cmd("delete server sock") +--- +- true +... +test_run:cleanup_cluster() +--- +... +box.schema.user.revoke('guest', 'replication') +--- +... diff --git a/test/replication/gh-3642-misc-no-socket-leak-on-replica-disconnect.test.lua b/test/replication/gh-3642-misc-no-socket-leak-on-replica-disconnect.test.lua new file mode 100644 index 000000000..9cfbe7214 --- /dev/null +++ b/test/replication/gh-3642-misc-no-socket-leak-on-replica-disconnect.test.lua @@ -0,0 +1,43 @@ +test_run = require('test_run').new() +test_run:cmd("restart server default") + +box.schema.user.grant('guest', 'replication') + +-- gh-3642 - Check that socket file descriptor doesn't leak +-- when a replica is disconnected. +rlimit = require('rlimit') +lim = rlimit.limit() +rlimit.getrlimit(rlimit.RLIMIT_NOFILE, lim) +old_fno = lim.rlim_cur +lim.rlim_cur = 64 +rlimit.setrlimit(rlimit.RLIMIT_NOFILE, lim) + +test_run:cmd('create server sock with rpl_master=default, script="replication/replica.lua"') +test_run:cmd('start server sock') +test_run:cmd('switch sock') +test_run = require('test_run').new() +fiber = require('fiber') +test_run:cmd("setopt delimiter ';'") +for i = 1, 64 do + local replication = box.cfg.replication + box.cfg{replication = {}} + box.cfg{replication = replication} + while box.info.replication[1].upstream.status ~= 'follow' do + fiber.sleep(0.001) + end +end; +test_run:cmd("setopt delimiter ''"); + +box.info.replication[1].upstream.status + +test_run:cmd('switch default') + +lim.rlim_cur = old_fno +rlimit.setrlimit(rlimit.RLIMIT_NOFILE, lim) + +test_run:cmd("stop server sock") +test_run:cmd("cleanup server sock") +test_run:cmd("delete server sock") +test_run:cleanup_cluster() + +box.schema.user.revoke('guest', 'replication') diff --git a/test/replication/gh-3704-misc-replica-checks-cluster-id.result b/test/replication/gh-3704-misc-replica-checks-cluster-id.result new file mode 100644 index 000000000..1ca2913f8 --- /dev/null +++ b/test/replication/gh-3704-misc-replica-checks-cluster-id.result @@ -0,0 +1,68 @@ +test_run = require('test_run').new() +--- +... +test_run:cmd("restart server default") +uuid = require('uuid') +--- +... +-- +-- gh-3704 move cluster id check to replica +-- +test_run:cmd("create server replica with rpl_master=default, script='replication/replica.lua'") +--- +- true +... +box.schema.user.grant("guest", "replication") +--- +... +test_run:cmd("start server replica") +--- +- true +... +test_run:grep_log("replica", "REPLICASET_UUID_MISMATCH") +--- +- null +... 
+box.info.replication[2].downstream.status +--- +- follow +... +-- change master's cluster uuid and check that replica doesn't connect. +test_run:cmd("stop server replica") +--- +- true +... +_ = box.space._schema:replace{'cluster', tostring(uuid.new())} +--- +... +-- master believes replica is in cluster, but their cluster UUIDs differ. +test_run:cmd("start server replica") +--- +- true +... +test_run:wait_log("replica", "REPLICASET_UUID_MISMATCH", nil, 1.0) +--- +- REPLICASET_UUID_MISMATCH +... +test_run:wait_downstream(2, {status = 'stopped'}) +--- +- true +... +test_run:cmd("stop server replica") +--- +- true +... +test_run:cmd("cleanup server replica") +--- +- true +... +test_run:cmd("delete server replica") +--- +- true +... +test_run:cleanup_cluster() +--- +... +box.schema.user.revoke('guest', 'replication') +--- +... diff --git a/test/replication/gh-3704-misc-replica-checks-cluster-id.test.lua b/test/replication/gh-3704-misc-replica-checks-cluster-id.test.lua new file mode 100644 index 000000000..00c443a55 --- /dev/null +++ b/test/replication/gh-3704-misc-replica-checks-cluster-id.test.lua @@ -0,0 +1,25 @@ +test_run = require('test_run').new() +test_run:cmd("restart server default") +uuid = require('uuid') + +-- +-- gh-3704 move cluster id check to replica +-- +test_run:cmd("create server replica with rpl_master=default, script='replication/replica.lua'") +box.schema.user.grant("guest", "replication") +test_run:cmd("start server replica") +test_run:grep_log("replica", "REPLICASET_UUID_MISMATCH") +box.info.replication[2].downstream.status +-- change master's cluster uuid and check that replica doesn't connect. +test_run:cmd("stop server replica") +_ = box.space._schema:replace{'cluster', tostring(uuid.new())} +-- master believes replica is in cluster, but their cluster UUIDs differ. +test_run:cmd("start server replica") +test_run:wait_log("replica", "REPLICASET_UUID_MISMATCH", nil, 1.0) +test_run:wait_downstream(2, {status = 'stopped'}) + +test_run:cmd("stop server replica") +test_run:cmd("cleanup server replica") +test_run:cmd("delete server replica") +test_run:cleanup_cluster() +box.schema.user.revoke('guest', 'replication') diff --git a/test/replication/gh-3711-misc-no-restart-on-same-configuration.result b/test/replication/gh-3711-misc-no-restart-on-same-configuration.result new file mode 100644 index 000000000..c1e746f54 --- /dev/null +++ b/test/replication/gh-3711-misc-no-restart-on-same-configuration.result @@ -0,0 +1,101 @@ +test_run = require('test_run').new() +--- +... +test_run:cmd("restart server default") +-- +-- gh-3711 Do not restart replication on box.cfg in case the +-- configuration didn't change. +-- +box.schema.user.grant('guest', 'replication') +--- +... +test_run:cmd("create server replica with rpl_master=default, script='replication/replica.lua'") +--- +- true +... +test_run:cmd("start server replica") +--- +- true +... +-- Access rights are checked only during reconnect. If the new +-- and old configurations are equivalent, no reconnect will be +-- issued and replication should continue working. +box.schema.user.revoke('guest', 'replication') +--- +... +test_run:cmd("switch replica") +--- +- true +... +replication = box.cfg.replication[1] +--- +... +box.cfg{replication = {replication}} +--- +... +box.info.status == 'running' +--- +- true +... +box.cfg{replication = replication} +--- +... +box.info.status == 'running' +--- +- true +... +-- Check that comparison of tables works as expected as well. +test_run:cmd("switch default") +--- +- true +... 
+box.schema.user.grant('guest', 'replication') +--- +... +test_run:cmd("switch replica") +--- +- true +... +replication = box.cfg.replication +--- +... +table.insert(replication, box.cfg.listen) +--- +... +test_run:cmd("switch default") +--- +- true +... +box.schema.user.revoke('guest', 'replication') +--- +... +test_run:cmd("switch replica") +--- +- true +... +box.cfg{replication = replication} +--- +... +box.info.status == 'running' +--- +- true +... +test_run:cmd("switch default") +--- +- true +... +test_run:cmd("stop server replica") +--- +- true +... +test_run:cmd("cleanup server replica") +--- +- true +... +test_run:cmd("delete server replica") +--- +- true +... +test_run:cleanup_cluster() +--- +... diff --git a/test/replication/gh-3711-misc-no-restart-on-same-configuration.test.lua b/test/replication/gh-3711-misc-no-restart-on-same-configuration.test.lua new file mode 100644 index 000000000..72666c5b7 --- /dev/null +++ b/test/replication/gh-3711-misc-no-restart-on-same-configuration.test.lua @@ -0,0 +1,39 @@ +test_run = require('test_run').new() +test_run:cmd("restart server default") + +-- +-- gh-3711 Do not restart replication on box.cfg in case the +-- configuration didn't change. +-- +box.schema.user.grant('guest', 'replication') +test_run:cmd("create server replica with rpl_master=default, script='replication/replica.lua'") +test_run:cmd("start server replica") + +-- Access rights are checked only during reconnect. If the new +-- and old configurations are equivalent, no reconnect will be +-- issued and replication should continue working. +box.schema.user.revoke('guest', 'replication') +test_run:cmd("switch replica") +replication = box.cfg.replication[1] +box.cfg{replication = {replication}} +box.info.status == 'running' +box.cfg{replication = replication} +box.info.status == 'running' + +-- Check that comparison of tables works as expected as well. +test_run:cmd("switch default") +box.schema.user.grant('guest', 'replication') +test_run:cmd("switch replica") +replication = box.cfg.replication +table.insert(replication, box.cfg.listen) +test_run:cmd("switch default") +box.schema.user.revoke('guest', 'replication') +test_run:cmd("switch replica") +box.cfg{replication = replication} +box.info.status == 'running' + +test_run:cmd("switch default") +test_run:cmd("stop server replica") +test_run:cmd("cleanup server replica") +test_run:cmd("delete server replica") +test_run:cleanup_cluster() diff --git a/test/replication/gh-3760-misc-return-on-quorum-0.result b/test/replication/gh-3760-misc-return-on-quorum-0.result new file mode 100644 index 000000000..79295f5c2 --- /dev/null +++ b/test/replication/gh-3760-misc-return-on-quorum-0.result @@ -0,0 +1,23 @@ +-- +-- gh-3760: replication quorum 0 on reconfiguration should return +-- from box.cfg immediately. +-- +replication = box.cfg.replication +--- +... +box.cfg{ \ + replication = {}, \ + replication_connect_quorum = 0, \ + replication_connect_timeout = 1000000 \ +} +--- +... +-- The call below would hang, if quorum 0 is ignored, or checked +-- too late. +box.cfg{replication = {'localhost:12345'}} +--- +... +box.info.status +--- +- running +... diff --git a/test/replication/gh-3760-misc-return-on-quorum-0.test.lua b/test/replication/gh-3760-misc-return-on-quorum-0.test.lua new file mode 100644 index 000000000..30089ac23 --- /dev/null +++ b/test/replication/gh-3760-misc-return-on-quorum-0.test.lua @@ -0,0 +1,14 @@ +-- +-- gh-3760: replication quorum 0 on reconfiguration should return +-- from box.cfg immediately. 
+-- +replication = box.cfg.replication +box.cfg{ \ + replication = {}, \ + replication_connect_quorum = 0, \ + replication_connect_timeout = 1000000 \ +} +-- The call below would hang, if quorum 0 is ignored, or checked +-- too late. +box.cfg{replication = {'localhost:12345'}} +box.info.status diff --git a/test/replication/gh-4399-misc-no-failure-on-error-reading-wal.result b/test/replication/gh-4399-misc-no-failure-on-error-reading-wal.result new file mode 100644 index 000000000..46b4f6464 --- /dev/null +++ b/test/replication/gh-4399-misc-no-failure-on-error-reading-wal.result @@ -0,0 +1,94 @@ +test_run = require('test_run').new() +--- +... +test_run:cmd("restart server default") +fiber = require('fiber') +--- +... +-- +-- gh-4399 Check that an error reading WAL directory on subscribe +-- doesn't lead to a permanent replication failure. +-- +box.schema.user.grant("guest", "replication") +--- +... +test_run:cmd("create server replica with rpl_master=default, script='replication/replica.lua'") +--- +- true +... +test_run:cmd("start server replica") +--- +- true +... +-- Make the WAL directory inaccessible. +fio = require('fio') +--- +... +path = fio.abspath(box.cfg.wal_dir) +--- +... +fio.chmod(path, 0) +--- +- true +... +-- Break replication on timeout. +replication_timeout = box.cfg.replication_timeout +--- +... +box.cfg{replication_timeout = 9000} +--- +... +test_run:cmd("switch replica") +--- +- true +... +test_run:wait_cond(function() return box.info.replication[1].upstream.status ~= 'follow' end) +--- +- true +... +require('fiber').sleep(box.cfg.replication_timeout) +--- +... +test_run:cmd("switch default") +--- +- true +... +box.cfg{replication_timeout = replication_timeout} +--- +... +-- Restore access to the WAL directory. +-- Wait for replication to be reestablished. +fio.chmod(path, tonumber('777', 8)) +--- +- true +... +test_run:cmd("switch replica") +--- +- true +... +test_run:wait_cond(function() return box.info.replication[1].upstream.status == 'follow' end) +--- +- true +... +test_run:cmd("switch default") +--- +- true +... +test_run:cmd("stop server replica") +--- +- true +... +test_run:cmd("cleanup server replica") +--- +- true +... +test_run:cmd("delete server replica") +--- +- true +... +test_run:cleanup_cluster() +--- +... +box.schema.user.revoke('guest', 'replication') +--- +... diff --git a/test/replication/gh-4399-misc-no-failure-on-error-reading-wal.test.lua b/test/replication/gh-4399-misc-no-failure-on-error-reading-wal.test.lua new file mode 100644 index 000000000..a926ae590 --- /dev/null +++ b/test/replication/gh-4399-misc-no-failure-on-error-reading-wal.test.lua @@ -0,0 +1,38 @@ +test_run = require('test_run').new() +test_run:cmd("restart server default") +fiber = require('fiber') + +-- +-- gh-4399 Check that an error reading WAL directory on subscribe +-- doesn't lead to a permanent replication failure. +-- +box.schema.user.grant("guest", "replication") +test_run:cmd("create server replica with rpl_master=default, script='replication/replica.lua'") +test_run:cmd("start server replica") + +-- Make the WAL directory inaccessible. +fio = require('fio') +path = fio.abspath(box.cfg.wal_dir) +fio.chmod(path, 0) + +-- Break replication on timeout. 
+replication_timeout = box.cfg.replication_timeout +box.cfg{replication_timeout = 9000} +test_run:cmd("switch replica") +test_run:wait_cond(function() return box.info.replication[1].upstream.status ~= 'follow' end) +require('fiber').sleep(box.cfg.replication_timeout) +test_run:cmd("switch default") +box.cfg{replication_timeout = replication_timeout} + +-- Restore access to the WAL directory. +-- Wait for replication to be reestablished. +fio.chmod(path, tonumber('777', 8)) +test_run:cmd("switch replica") +test_run:wait_cond(function() return box.info.replication[1].upstream.status == 'follow' end) +test_run:cmd("switch default") + +test_run:cmd("stop server replica") +test_run:cmd("cleanup server replica") +test_run:cmd("delete server replica") +test_run:cleanup_cluster() +box.schema.user.revoke('guest', 'replication') diff --git a/test/replication/gh-4424-misc-orphan-on-reconfiguration-error.result b/test/replication/gh-4424-misc-orphan-on-reconfiguration-error.result new file mode 100644 index 000000000..c87ef2e05 --- /dev/null +++ b/test/replication/gh-4424-misc-orphan-on-reconfiguration-error.result @@ -0,0 +1,82 @@ +test_run = require('test_run').new() +--- +... +-- +-- gh-4424 Always enter orphan mode on error in replication +-- configuration change. +-- +replication_connect_timeout = box.cfg.replication_connect_timeout +--- +... +replication_connect_quorum = box.cfg.replication_connect_quorum +--- +... +box.cfg{replication="12345", replication_connect_timeout=0.1, replication_connect_quorum=1} +--- +... +box.info.status +--- +- orphan +... +box.info.ro +--- +- true +... +-- reset replication => leave orphan mode +box.cfg{replication=""} +--- +... +box.info.status +--- +- running +... +box.info.ro +--- +- false +... +-- no switch to orphan when quorum == 0 +box.cfg{replication="12345", replication_connect_quorum=0} +--- +... +box.info.status +--- +- running +... +box.info.ro +--- +- false +... +-- we could connect to one out of two replicas. Set orphan. +box.cfg{replication_connect_quorum=2} +--- +... +box.cfg{replication={box.cfg.listen, "12345"}} +--- +... +box.info.status +--- +- orphan +... +box.info.ro +--- +- true +... +-- lower quorum => leave orphan mode +box.cfg{replication_connect_quorum=1} +--- +... +box.info.status +--- +- running +... +box.info.ro +--- +- false +... +box.cfg{ \ + replication = {}, \ + replication_connect_quorum = replication_connect_quorum, \ + replication_connect_timeout = replication_connect_timeout \ +} +--- +... diff --git a/test/replication/gh-4424-misc-orphan-on-reconfiguration-error.test.lua b/test/replication/gh-4424-misc-orphan-on-reconfiguration-error.test.lua new file mode 100644 index 000000000..6f42863c3 --- /dev/null +++ b/test/replication/gh-4424-misc-orphan-on-reconfiguration-error.test.lua @@ -0,0 +1,35 @@ +test_run = require('test_run').new() + +-- +-- gh-4424 Always enter orphan mode on error in replication +-- configuration change. +-- +replication_connect_timeout = box.cfg.replication_connect_timeout +replication_connect_quorum = box.cfg.replication_connect_quorum +box.cfg{replication="12345", replication_connect_timeout=0.1, replication_connect_quorum=1} +box.info.status +box.info.ro +-- reset replication => leave orphan mode +box.cfg{replication=""} +box.info.status +box.info.ro +-- no switch to orphan when quorum == 0 +box.cfg{replication="12345", replication_connect_quorum=0} +box.info.status +box.info.ro + +-- we could connect to one out of two replicas. Set orphan. 
+box.cfg{replication_connect_quorum=2} +box.cfg{replication={box.cfg.listen, "12345"}} +box.info.status +box.info.ro +-- lower quorum => leave orphan mode +box.cfg{replication_connect_quorum=1} +box.info.status +box.info.ro + +box.cfg{ \ + replication = {}, \ + replication_connect_quorum = replication_connect_quorum, \ + replication_connect_timeout = replication_connect_timeout \ +} diff --git a/test/replication/misc.result b/test/replication/misc.result deleted file mode 100644 index e5d1f560e..000000000 --- a/test/replication/misc.result +++ /dev/null @@ -1,866 +0,0 @@ -uuid = require('uuid') ---- -... -test_run = require('test_run').new() ---- -... -box.schema.user.grant('guest', 'replication') ---- -... --- gh-2991 - Tarantool asserts on box.cfg.replication update if one of --- servers is dead -replication_timeout = box.cfg.replication_timeout ---- -... -replication_connect_timeout = box.cfg.replication_connect_timeout ---- -... -box.cfg{replication_timeout=0.05, replication_connect_timeout=0.05, replication={}} ---- -... -box.cfg{replication_connect_quorum=2} ---- -... -box.cfg{replication = {'127.0.0.1:12345', box.cfg.listen}} ---- -... -box.info.status ---- -- orphan -... -box.info.ro ---- -- true -... --- gh-3606 - Tarantool crashes if box.cfg.replication is updated concurrently -fiber = require('fiber') ---- -... -c = fiber.channel(2) ---- -... -f = function() fiber.create(function() pcall(box.cfg, {replication = {12345}}) c:put(true) end) end ---- -... -f() ---- -... -f() ---- -... -c:get() ---- -- true -... -c:get() ---- -- true -... -box.cfg{replication = "", replication_timeout = replication_timeout, replication_connect_timeout = replication_connect_timeout} ---- -... -box.info.status ---- -- running -... -box.info.ro ---- -- false -... --- gh-3111 - Allow to rebootstrap a replica from a read-only master -replica_uuid = uuid.new() ---- -... -test_run:cmd('create server test with rpl_master=default, script="replication/replica_uuid.lua"') ---- -- true -... -test_run:cmd(string.format('start server test with args="%s"', replica_uuid)) ---- -- true -... -test_run:cmd('stop server test') ---- -- true -... -test_run:cmd('cleanup server test') ---- -- true -... -box.cfg{read_only = true} ---- -... -test_run:cmd(string.format('start server test with args="%s"', replica_uuid)) ---- -- true -... -test_run:cmd('stop server test') ---- -- true -... -test_run:cmd('cleanup server test') ---- -- true -... -box.cfg{read_only = false} ---- -... -test_run:cmd('delete server test') ---- -- true -... -test_run:cleanup_cluster() ---- -... --- gh-3160 - Send heartbeats if there are changes from a remote master only -SERVERS = { 'autobootstrap1', 'autobootstrap2', 'autobootstrap3' } ---- -... --- Deploy a cluster. -test_run:create_cluster(SERVERS, "replication", {args="0.03"}) ---- -... -test_run:wait_fullmesh(SERVERS) ---- -... -test_run:cmd("switch autobootstrap3") ---- -- true -... -test_run = require('test_run').new() ---- -... -fiber = require('fiber') ---- -... -_ = box.schema.space.create('test_timeout'):create_index('pk') ---- -... -test_run:cmd("setopt delimiter ';'") ---- -- true -... -function wait_not_follow(replicaA, replicaB) - return test_run:wait_cond(function() - return replicaA.status ~= 'follow' or replicaB.status ~= 'follow' - end, box.cfg.replication_timeout) -end; ---- -... 
-function test_timeout() - local replicaA = box.info.replication[1].upstream or box.info.replication[2].upstream - local replicaB = box.info.replication[3].upstream or box.info.replication[2].upstream - local follows = test_run:wait_cond(function() - return replicaA.status == 'follow' or replicaB.status == 'follow' - end) - if not follows then error('replicas are not in the follow status') end - for i = 0, 99 do - box.space.test_timeout:replace({1}) - if wait_not_follow(replicaA, replicaB) then - return error(box.info.replication) - end - end - return true -end; ---- -... -test_run:cmd("setopt delimiter ''"); ---- -- true -... -test_timeout() ---- -- true -... --- gh-3247 - Sequence-generated value is not replicated in case --- the request was sent via iproto. -test_run:cmd("switch autobootstrap1") ---- -- true -... -net_box = require('net.box') ---- -... -_ = box.schema.space.create('space1') ---- -... -_ = box.schema.sequence.create('seq') ---- -... -_ = box.space.space1:create_index('primary', {sequence = true} ) ---- -... -_ = box.space.space1:create_index('secondary', {parts = {2, 'unsigned'}}) ---- -... -box.schema.user.grant('guest', 'read,write', 'space', 'space1') ---- -... -c = net_box.connect(box.cfg.listen) ---- -... -c.space.space1:insert{box.NULL, "data"} -- fails, but bumps sequence value ---- -- error: 'Tuple field 2 type does not match one required by operation: expected unsigned' -... -c.space.space1:insert{box.NULL, 1, "data"} ---- -- [2, 1, 'data'] -... -box.space.space1:select{} ---- -- - [2, 1, 'data'] -... -vclock = test_run:get_vclock("autobootstrap1") ---- -... -vclock[0] = nil ---- -... -_ = test_run:wait_vclock("autobootstrap2", vclock) ---- -... -test_run:cmd("switch autobootstrap2") ---- -- true -... -box.space.space1:select{} ---- -- - [2, 1, 'data'] -... -test_run:cmd("switch autobootstrap1") ---- -- true -... -box.space.space1:drop() ---- -... -test_run:cmd("switch default") ---- -- true -... -test_run:drop_cluster(SERVERS) ---- -... -test_run:cleanup_cluster() ---- -... --- gh-3642 - Check that socket file descriptor doesn't leak --- when a replica is disconnected. -rlimit = require('rlimit') ---- -... -lim = rlimit.limit() ---- -... -rlimit.getrlimit(rlimit.RLIMIT_NOFILE, lim) ---- -... -old_fno = lim.rlim_cur ---- -... -lim.rlim_cur = 64 ---- -... -rlimit.setrlimit(rlimit.RLIMIT_NOFILE, lim) ---- -... -test_run:cmd('create server sock with rpl_master=default, script="replication/replica.lua"') ---- -- true -... -test_run:cmd('start server sock') ---- -- true -... -test_run:cmd('switch sock') ---- -- true -... -test_run = require('test_run').new() ---- -... -fiber = require('fiber') ---- -... -test_run:cmd("setopt delimiter ';'") ---- -- true -... -for i = 1, 64 do - local replication = box.cfg.replication - box.cfg{replication = {}} - box.cfg{replication = replication} - while box.info.replication[1].upstream.status ~= 'follow' do - fiber.sleep(0.001) - end -end; ---- -... -test_run:cmd("setopt delimiter ''"); ---- -- true -... -box.info.replication[1].upstream.status ---- -- follow -... -test_run:cmd('switch default') ---- -- true -... -lim.rlim_cur = old_fno ---- -... -rlimit.setrlimit(rlimit.RLIMIT_NOFILE, lim) ---- -... -test_run:cmd("stop server sock") ---- -- true -... -test_run:cmd("cleanup server sock") ---- -- true -... -test_run:cmd("delete server sock") ---- -- true -... -test_run:cleanup_cluster() ---- -... -box.schema.user.revoke('guest', 'replication') ---- -... 
--- gh-3510 assertion failure in replica_on_applier_disconnect() -test_run:cmd('create server er_load1 with script="replication/er_load1.lua"') ---- -- true -... -test_run:cmd('create server er_load2 with script="replication/er_load2.lua"') ---- -- true -... -test_run:cmd('start server er_load1 with wait=False, wait_load=False') ---- -- true -... --- Instance er_load2 will fail with error ER_REPLICASET_UUID_MISMATCH. --- This is OK since we only test here that er_load1 doesn't assert. -test_run:cmd('start server er_load2 with wait=True, wait_load=True, crash_expected = True') ---- -- false -... -test_run:cmd('stop server er_load1') ---- -- true -... --- er_load2 exits automatically. -test_run:cmd('cleanup server er_load1') ---- -- true -... -test_run:cmd('cleanup server er_load2') ---- -- true -... -test_run:cmd('delete server er_load1') ---- -- true -... -test_run:cmd('delete server er_load2') ---- -- true -... -test_run:cleanup_cluster() ---- -... --- --- Test case for gh-3637, gh-4550. Before the fix replica would --- exit with an error if a user does not exist or a password is --- incorrect. Now check that we don't hang/panic and successfully --- connect. --- -fiber = require('fiber') ---- -... -test_run:cmd("create server replica_auth with rpl_master=default, script='replication/replica_auth.lua'") ---- -- true -... -test_run:cmd("start server replica_auth with wait=False, wait_load=False, args='cluster:pass 0.05'") ---- -- true -... --- Wait a bit to make sure replica waits till user is created. -fiber.sleep(0.1) ---- -... -box.schema.user.create('cluster') ---- -... --- The user is created. Let the replica fail auth request due to --- a wrong password. -fiber.sleep(0.1) ---- -... -box.schema.user.passwd('cluster', 'pass') ---- -... -box.schema.user.grant('cluster', 'replication') ---- -... -while box.info.replication[2] == nil do fiber.sleep(0.01) end ---- -... -vclock = test_run:get_vclock('default') ---- -... -vclock[0] = nil ---- -... -_ = test_run:wait_vclock('replica_auth', vclock) ---- -... -test_run:cmd("stop server replica_auth") ---- -- true -... -test_run:cmd("cleanup server replica_auth") ---- -- true -... -test_run:cmd("delete server replica_auth") ---- -- true -... -test_run:cleanup_cluster() ---- -... -box.schema.user.drop('cluster') ---- -... --- --- Test case for gh-3610. Before the fix replica would fail with the assertion --- when trying to connect to the same master twice. --- -box.schema.user.grant('guest', 'replication') ---- -... -test_run:cmd("create server replica with rpl_master=default, script='replication/replica.lua'") ---- -- true -... -test_run:cmd("start server replica") ---- -- true -... -test_run:cmd("switch replica") ---- -- true -... -replication = box.cfg.replication[1] ---- -... -box.cfg{replication = {replication, replication}} ---- -- error: 'Incorrect value for option ''replication'': duplicate connection to the - same replica' -... --- Check the case when duplicate connection is detected in the background. -test_run:cmd("switch default") ---- -- true -... -listen = box.cfg.listen ---- -... -box.cfg{listen = ''} ---- -... -test_run:cmd("switch replica") ---- -- true -... -box.cfg{replication_connect_quorum = 0, replication_connect_timeout = 0.01} ---- -... -box.cfg{replication = {replication, replication}} ---- -... -test_run:cmd("switch default") ---- -- true -... -box.cfg{listen = listen} ---- -... -while test_run:grep_log('replica', 'duplicate connection') == nil do fiber.sleep(0.01) end ---- -... 
-test_run:cmd("stop server replica") ---- -- true -... -test_run:cmd("cleanup server replica") ---- -- true -... -test_run:cmd("delete server replica") ---- -- true -... -test_run:cleanup_cluster() ---- -... -box.schema.user.revoke('guest', 'replication') ---- -... --- --- gh-3711 Do not restart replication on box.cfg in case the --- configuration didn't change. --- -box.schema.user.grant('guest', 'replication') ---- -... -test_run:cmd("create server replica with rpl_master=default, script='replication/replica.lua'") ---- -- true -... -test_run:cmd("start server replica") ---- -- true -... --- Access rights are checked only during reconnect. If the new --- and old configurations are equivalent, no reconnect will be --- issued and replication should continue working. -box.schema.user.revoke('guest', 'replication') ---- -... -test_run:cmd("switch replica") ---- -- true -... -replication = box.cfg.replication[1] ---- -... -box.cfg{replication = {replication}} ---- -... -box.info.status == 'running' ---- -- true -... -box.cfg{replication = replication} ---- -... -box.info.status == 'running' ---- -- true -... --- Check that comparison of tables works as expected as well. -test_run:cmd("switch default") ---- -- true -... -box.schema.user.grant('guest', 'replication') ---- -... -test_run:cmd("switch replica") ---- -- true -... -replication = box.cfg.replication ---- -... -table.insert(replication, box.cfg.listen) ---- -... -test_run:cmd("switch default") ---- -- true -... -box.schema.user.revoke('guest', 'replication') ---- -... -test_run:cmd("switch replica") ---- -- true -... -box.cfg{replication = replication} ---- -... -box.info.status == 'running' ---- -- true -... -test_run:cmd("switch default") ---- -- true -... -test_run:cmd("stop server replica") ---- -- true -... -test_run:cmd("cleanup server replica") ---- -- true -... -test_run:cmd("delete server replica") ---- -- true -... -test_run:cleanup_cluster() ---- -... --- --- gh-3704 move cluster id check to replica --- -test_run:cmd("create server replica with rpl_master=default, script='replication/replica.lua'") ---- -- true -... -box.schema.user.grant("guest", "replication") ---- -... -test_run:cmd("start server replica") ---- -- true -... -test_run:grep_log("replica", "REPLICASET_UUID_MISMATCH") ---- -- null -... -box.info.replication[2].downstream.status ---- -- follow -... --- change master's cluster uuid and check that replica doesn't connect. -test_run:cmd("stop server replica") ---- -- true -... -_ = box.space._schema:replace{'cluster', tostring(uuid.new())} ---- -... --- master believes replica is in cluster, but their cluster UUIDs differ. -test_run:cmd("start server replica") ---- -- true -... -test_run:wait_log("replica", "REPLICASET_UUID_MISMATCH", nil, 1.0) ---- -- REPLICASET_UUID_MISMATCH -... -test_run:wait_downstream(2, {status = 'stopped'}) ---- -- true -... -test_run:cmd("stop server replica") ---- -- true -... -test_run:cmd("cleanup server replica") ---- -- true -... -test_run:cmd("delete server replica") ---- -- true -... -test_run:cleanup_cluster() ---- -... -box.schema.user.revoke('guest', 'replication') ---- -... --- --- gh-4399 Check that an error reading WAL directory on subscribe --- doesn't lead to a permanent replication failure. --- -box.schema.user.grant("guest", "replication") ---- -... -test_run:cmd("create server replica with rpl_master=default, script='replication/replica.lua'") ---- -- true -... -test_run:cmd("start server replica") ---- -- true -... --- Make the WAL directory inaccessible. 
-fio = require('fio') ---- -... -path = fio.abspath(box.cfg.wal_dir) ---- -... -fio.chmod(path, 0) ---- -- true -... --- Break replication on timeout. -replication_timeout = box.cfg.replication_timeout ---- -... -box.cfg{replication_timeout = 9000} ---- -... -test_run:cmd("switch replica") ---- -- true -... -test_run:wait_cond(function() return box.info.replication[1].upstream.status ~= 'follow' end) ---- -- true -... -require('fiber').sleep(box.cfg.replication_timeout) ---- -... -test_run:cmd("switch default") ---- -- true -... -box.cfg{replication_timeout = replication_timeout} ---- -... --- Restore access to the WAL directory. --- Wait for replication to be reestablished. -fio.chmod(path, tonumber('777', 8)) ---- -- true -... -test_run:cmd("switch replica") ---- -- true -... -test_run:wait_cond(function() return box.info.replication[1].upstream.status == 'follow' end) ---- -- true -... -test_run:cmd("switch default") ---- -- true -... -test_run:cmd("stop server replica") ---- -- true -... -test_run:cmd("cleanup server replica") ---- -- true -... -test_run:cmd("delete server replica") ---- -- true -... -test_run:cleanup_cluster() ---- -... -box.schema.user.revoke('guest', 'replication') ---- -... --- --- gh-4424 Always enter orphan mode on error in replication --- configuration change. --- -replication_connect_timeout = box.cfg.replication_connect_timeout ---- -... -replication_connect_quorum = box.cfg.replication_connect_quorum ---- -... -box.cfg{replication="12345", replication_connect_timeout=0.1, replication_connect_quorum=1} ---- -... -box.info.status ---- -- orphan -... -box.info.ro ---- -- true -... --- reset replication => leave orphan mode -box.cfg{replication=""} ---- -... -box.info.status ---- -- running -... -box.info.ro ---- -- false -... --- no switch to orphan when quorum == 0 -box.cfg{replication="12345", replication_connect_quorum=0} ---- -... -box.info.status ---- -- running -... -box.info.ro ---- -- false -... --- we could connect to one out of two replicas. Set orphan. -box.cfg{replication_connect_quorum=2} ---- -... -box.cfg{replication={box.cfg.listen, "12345"}} ---- -... -box.info.status ---- -- orphan -... -box.info.ro ---- -- true -... --- lower quorum => leave orphan mode -box.cfg{replication_connect_quorum=1} ---- -... -box.info.status ---- -- running -... -box.info.ro ---- -- false -... --- --- gh-3760: replication quorum 0 on reconfiguration should return --- from box.cfg immediately. --- -replication = box.cfg.replication ---- -... -box.cfg{ \ - replication = {}, \ - replication_connect_quorum = 0, \ - replication_connect_timeout = 1000000 \ -} ---- -... --- The call below would hang, if quorum 0 is ignored, or checked --- too late. -box.cfg{replication = {'localhost:12345'}} ---- -... -box.info.status ---- -- running -... -box.cfg{ \ - replication = {}, \ - replication_connect_quorum = replication_connect_quorum, \ - replication_connect_timeout = replication_connect_timeout \ -} ---- -... diff --git a/test/replication/misc.skipcond b/test/replication/misc.skipcond deleted file mode 100644 index 48e17903e..000000000 --- a/test/replication/misc.skipcond +++ /dev/null @@ -1,7 +0,0 @@ -import platform - -# Disabled on FreeBSD due to flaky fail #4271. 
-if platform.system() == 'FreeBSD': - self.skip = 1 - -# vim: set ft=python: diff --git a/test/replication/misc.test.lua b/test/replication/misc.test.lua deleted file mode 100644 index d285b014a..000000000 --- a/test/replication/misc.test.lua +++ /dev/null @@ -1,356 +0,0 @@ -uuid = require('uuid') -test_run = require('test_run').new() - -box.schema.user.grant('guest', 'replication') - --- gh-2991 - Tarantool asserts on box.cfg.replication update if one of --- servers is dead -replication_timeout = box.cfg.replication_timeout -replication_connect_timeout = box.cfg.replication_connect_timeout -box.cfg{replication_timeout=0.05, replication_connect_timeout=0.05, replication={}} -box.cfg{replication_connect_quorum=2} -box.cfg{replication = {'127.0.0.1:12345', box.cfg.listen}} -box.info.status -box.info.ro - --- gh-3606 - Tarantool crashes if box.cfg.replication is updated concurrently -fiber = require('fiber') -c = fiber.channel(2) -f = function() fiber.create(function() pcall(box.cfg, {replication = {12345}}) c:put(true) end) end -f() -f() -c:get() -c:get() - -box.cfg{replication = "", replication_timeout = replication_timeout, replication_connect_timeout = replication_connect_timeout} -box.info.status -box.info.ro - --- gh-3111 - Allow to rebootstrap a replica from a read-only master -replica_uuid = uuid.new() -test_run:cmd('create server test with rpl_master=default, script="replication/replica_uuid.lua"') -test_run:cmd(string.format('start server test with args="%s"', replica_uuid)) -test_run:cmd('stop server test') -test_run:cmd('cleanup server test') -box.cfg{read_only = true} -test_run:cmd(string.format('start server test with args="%s"', replica_uuid)) -test_run:cmd('stop server test') -test_run:cmd('cleanup server test') -box.cfg{read_only = false} -test_run:cmd('delete server test') -test_run:cleanup_cluster() - --- gh-3160 - Send heartbeats if there are changes from a remote master only -SERVERS = { 'autobootstrap1', 'autobootstrap2', 'autobootstrap3' } - --- Deploy a cluster. -test_run:create_cluster(SERVERS, "replication", {args="0.03"}) -test_run:wait_fullmesh(SERVERS) -test_run:cmd("switch autobootstrap3") -test_run = require('test_run').new() -fiber = require('fiber') -_ = box.schema.space.create('test_timeout'):create_index('pk') -test_run:cmd("setopt delimiter ';'") -function wait_not_follow(replicaA, replicaB) - return test_run:wait_cond(function() - return replicaA.status ~= 'follow' or replicaB.status ~= 'follow' - end, box.cfg.replication_timeout) -end; -function test_timeout() - local replicaA = box.info.replication[1].upstream or box.info.replication[2].upstream - local replicaB = box.info.replication[3].upstream or box.info.replication[2].upstream - local follows = test_run:wait_cond(function() - return replicaA.status == 'follow' or replicaB.status == 'follow' - end) - if not follows then error('replicas are not in the follow status') end - for i = 0, 99 do - box.space.test_timeout:replace({1}) - if wait_not_follow(replicaA, replicaB) then - return error(box.info.replication) - end - end - return true -end; -test_run:cmd("setopt delimiter ''"); -test_timeout() - --- gh-3247 - Sequence-generated value is not replicated in case --- the request was sent via iproto. 
-test_run:cmd("switch autobootstrap1") -net_box = require('net.box') -_ = box.schema.space.create('space1') -_ = box.schema.sequence.create('seq') -_ = box.space.space1:create_index('primary', {sequence = true} ) -_ = box.space.space1:create_index('secondary', {parts = {2, 'unsigned'}}) -box.schema.user.grant('guest', 'read,write', 'space', 'space1') -c = net_box.connect(box.cfg.listen) -c.space.space1:insert{box.NULL, "data"} -- fails, but bumps sequence value -c.space.space1:insert{box.NULL, 1, "data"} -box.space.space1:select{} -vclock = test_run:get_vclock("autobootstrap1") -vclock[0] = nil -_ = test_run:wait_vclock("autobootstrap2", vclock) -test_run:cmd("switch autobootstrap2") -box.space.space1:select{} -test_run:cmd("switch autobootstrap1") -box.space.space1:drop() - -test_run:cmd("switch default") -test_run:drop_cluster(SERVERS) -test_run:cleanup_cluster() - --- gh-3642 - Check that socket file descriptor doesn't leak --- when a replica is disconnected. -rlimit = require('rlimit') -lim = rlimit.limit() -rlimit.getrlimit(rlimit.RLIMIT_NOFILE, lim) -old_fno = lim.rlim_cur -lim.rlim_cur = 64 -rlimit.setrlimit(rlimit.RLIMIT_NOFILE, lim) - -test_run:cmd('create server sock with rpl_master=default, script="replication/replica.lua"') -test_run:cmd('start server sock') -test_run:cmd('switch sock') -test_run = require('test_run').new() -fiber = require('fiber') -test_run:cmd("setopt delimiter ';'") -for i = 1, 64 do - local replication = box.cfg.replication - box.cfg{replication = {}} - box.cfg{replication = replication} - while box.info.replication[1].upstream.status ~= 'follow' do - fiber.sleep(0.001) - end -end; -test_run:cmd("setopt delimiter ''"); - -box.info.replication[1].upstream.status - -test_run:cmd('switch default') - -lim.rlim_cur = old_fno -rlimit.setrlimit(rlimit.RLIMIT_NOFILE, lim) - -test_run:cmd("stop server sock") -test_run:cmd("cleanup server sock") -test_run:cmd("delete server sock") -test_run:cleanup_cluster() - -box.schema.user.revoke('guest', 'replication') - --- gh-3510 assertion failure in replica_on_applier_disconnect() -test_run:cmd('create server er_load1 with script="replication/er_load1.lua"') -test_run:cmd('create server er_load2 with script="replication/er_load2.lua"') -test_run:cmd('start server er_load1 with wait=False, wait_load=False') --- Instance er_load2 will fail with error ER_REPLICASET_UUID_MISMATCH. --- This is OK since we only test here that er_load1 doesn't assert. -test_run:cmd('start server er_load2 with wait=True, wait_load=True, crash_expected = True') -test_run:cmd('stop server er_load1') --- er_load2 exits automatically. -test_run:cmd('cleanup server er_load1') -test_run:cmd('cleanup server er_load2') -test_run:cmd('delete server er_load1') -test_run:cmd('delete server er_load2') -test_run:cleanup_cluster() - --- --- Test case for gh-3637, gh-4550. Before the fix replica would --- exit with an error if a user does not exist or a password is --- incorrect. Now check that we don't hang/panic and successfully --- connect. --- -fiber = require('fiber') -test_run:cmd("create server replica_auth with rpl_master=default, script='replication/replica_auth.lua'") -test_run:cmd("start server replica_auth with wait=False, wait_load=False, args='cluster:pass 0.05'") --- Wait a bit to make sure replica waits till user is created. -fiber.sleep(0.1) -box.schema.user.create('cluster') --- The user is created. Let the replica fail auth request due to --- a wrong password. 
-fiber.sleep(0.1) -box.schema.user.passwd('cluster', 'pass') -box.schema.user.grant('cluster', 'replication') - -while box.info.replication[2] == nil do fiber.sleep(0.01) end -vclock = test_run:get_vclock('default') -vclock[0] = nil -_ = test_run:wait_vclock('replica_auth', vclock) - -test_run:cmd("stop server replica_auth") -test_run:cmd("cleanup server replica_auth") -test_run:cmd("delete server replica_auth") -test_run:cleanup_cluster() - -box.schema.user.drop('cluster') - --- --- Test case for gh-3610. Before the fix replica would fail with the assertion --- when trying to connect to the same master twice. --- -box.schema.user.grant('guest', 'replication') -test_run:cmd("create server replica with rpl_master=default, script='replication/replica.lua'") -test_run:cmd("start server replica") -test_run:cmd("switch replica") -replication = box.cfg.replication[1] -box.cfg{replication = {replication, replication}} - --- Check the case when duplicate connection is detected in the background. -test_run:cmd("switch default") -listen = box.cfg.listen -box.cfg{listen = ''} - -test_run:cmd("switch replica") -box.cfg{replication_connect_quorum = 0, replication_connect_timeout = 0.01} -box.cfg{replication = {replication, replication}} - -test_run:cmd("switch default") -box.cfg{listen = listen} -while test_run:grep_log('replica', 'duplicate connection') == nil do fiber.sleep(0.01) end - -test_run:cmd("stop server replica") -test_run:cmd("cleanup server replica") -test_run:cmd("delete server replica") -test_run:cleanup_cluster() -box.schema.user.revoke('guest', 'replication') - --- --- gh-3711 Do not restart replication on box.cfg in case the --- configuration didn't change. --- -box.schema.user.grant('guest', 'replication') -test_run:cmd("create server replica with rpl_master=default, script='replication/replica.lua'") -test_run:cmd("start server replica") - --- Access rights are checked only during reconnect. If the new --- and old configurations are equivalent, no reconnect will be --- issued and replication should continue working. -box.schema.user.revoke('guest', 'replication') -test_run:cmd("switch replica") -replication = box.cfg.replication[1] -box.cfg{replication = {replication}} -box.info.status == 'running' -box.cfg{replication = replication} -box.info.status == 'running' - --- Check that comparison of tables works as expected as well. -test_run:cmd("switch default") -box.schema.user.grant('guest', 'replication') -test_run:cmd("switch replica") -replication = box.cfg.replication -table.insert(replication, box.cfg.listen) -test_run:cmd("switch default") -box.schema.user.revoke('guest', 'replication') -test_run:cmd("switch replica") -box.cfg{replication = replication} -box.info.status == 'running' - -test_run:cmd("switch default") -test_run:cmd("stop server replica") -test_run:cmd("cleanup server replica") -test_run:cmd("delete server replica") -test_run:cleanup_cluster() - --- --- gh-3704 move cluster id check to replica --- -test_run:cmd("create server replica with rpl_master=default, script='replication/replica.lua'") -box.schema.user.grant("guest", "replication") -test_run:cmd("start server replica") -test_run:grep_log("replica", "REPLICASET_UUID_MISMATCH") -box.info.replication[2].downstream.status --- change master's cluster uuid and check that replica doesn't connect. -test_run:cmd("stop server replica") -_ = box.space._schema:replace{'cluster', tostring(uuid.new())} --- master believes replica is in cluster, but their cluster UUIDs differ. 
-test_run:cmd("start server replica") -test_run:wait_log("replica", "REPLICASET_UUID_MISMATCH", nil, 1.0) -test_run:wait_downstream(2, {status = 'stopped'}) - -test_run:cmd("stop server replica") -test_run:cmd("cleanup server replica") -test_run:cmd("delete server replica") -test_run:cleanup_cluster() -box.schema.user.revoke('guest', 'replication') - --- --- gh-4399 Check that an error reading WAL directory on subscribe --- doesn't lead to a permanent replication failure. --- -box.schema.user.grant("guest", "replication") -test_run:cmd("create server replica with rpl_master=default, script='replication/replica.lua'") -test_run:cmd("start server replica") - --- Make the WAL directory inaccessible. -fio = require('fio') -path = fio.abspath(box.cfg.wal_dir) -fio.chmod(path, 0) - --- Break replication on timeout. -replication_timeout = box.cfg.replication_timeout -box.cfg{replication_timeout = 9000} -test_run:cmd("switch replica") -test_run:wait_cond(function() return box.info.replication[1].upstream.status ~= 'follow' end) -require('fiber').sleep(box.cfg.replication_timeout) -test_run:cmd("switch default") -box.cfg{replication_timeout = replication_timeout} - --- Restore access to the WAL directory. --- Wait for replication to be reestablished. -fio.chmod(path, tonumber('777', 8)) -test_run:cmd("switch replica") -test_run:wait_cond(function() return box.info.replication[1].upstream.status == 'follow' end) -test_run:cmd("switch default") - -test_run:cmd("stop server replica") -test_run:cmd("cleanup server replica") -test_run:cmd("delete server replica") -test_run:cleanup_cluster() -box.schema.user.revoke('guest', 'replication') - --- --- gh-4424 Always enter orphan mode on error in replication --- configuration change. --- -replication_connect_timeout = box.cfg.replication_connect_timeout -replication_connect_quorum = box.cfg.replication_connect_quorum -box.cfg{replication="12345", replication_connect_timeout=0.1, replication_connect_quorum=1} -box.info.status -box.info.ro --- reset replication => leave orphan mode -box.cfg{replication=""} -box.info.status -box.info.ro --- no switch to orphan when quorum == 0 -box.cfg{replication="12345", replication_connect_quorum=0} -box.info.status -box.info.ro - --- we could connect to one out of two replicas. Set orphan. -box.cfg{replication_connect_quorum=2} -box.cfg{replication={box.cfg.listen, "12345"}} -box.info.status -box.info.ro --- lower quorum => leave orphan mode -box.cfg{replication_connect_quorum=1} -box.info.status -box.info.ro - --- --- gh-3760: replication quorum 0 on reconfiguration should return --- from box.cfg immediately. --- -replication = box.cfg.replication -box.cfg{ \ - replication = {}, \ - replication_connect_quorum = 0, \ - replication_connect_timeout = 1000000 \ -} --- The call below would hang, if quorum 0 is ignored, or checked --- too late. 
-box.cfg{replication = {'localhost:12345'}}
-box.info.status
-box.cfg{ \
-    replication = {}, \
-    replication_connect_quorum = replication_connect_quorum, \
-    replication_connect_timeout = replication_connect_timeout \
-}
diff --git a/test/replication/suite.cfg b/test/replication/suite.cfg
index f357b07da..e21daa5ad 100644
--- a/test/replication/suite.cfg
+++ b/test/replication/suite.cfg
@@ -1,6 +1,19 @@
 {
     "anon.test.lua": {},
-    "misc.test.lua": {},
+    "gh-2991-misc-assert-on-server-die.test.lua": {},
+    "gh-3111-misc-rebootstrap-from-ro-master.test.lua": {},
+    "gh-3160-misc-heartbeats-on-master-changes.test.lua": {},
+    "gh-3247-misc-value-not-replicated-on-iproto-request.test.lua": {},
+    "gh-3510-misc-assert-replica-on-applier-disconnect.test.lua": {},
+    "gh-3606-misc-crash-on-box-concurrent-update.test.lua": {},
+    "gh-3610-misc-assert-connecting-master-twice.test.lua": {},
+    "gh-3637-misc-no-panic-on-connected.test.lua": {},
+    "gh-3642-misc-no-socket-leak-on-replica-disconnect.test.lua": {},
+    "gh-3704-misc-replica-checks-cluster-id.test.lua": {},
+    "gh-3711-misc-no-restart-on-same-configuration.test.lua": {},
+    "gh-3760-misc-return-on-quorum-0.test.lua": {},
+    "gh-4399-misc-no-failure-on-error-reading-wal.test.lua": {},
+    "gh-4424-misc-orphan-on-reconfiguration-error.test.lua": {},
     "once.test.lua": {},
     "on_replace.test.lua": {},
     "status.test.lua": {},
diff --git a/test/replication/suite.ini b/test/replication/suite.ini
index ab9c3dabd..a6d653d3b 100644
--- a/test/replication/suite.ini
+++ b/test/replication/suite.ini
@@ -13,7 +13,7 @@ is_parallel = True
 pretest_clean = True
 fragile = errinj.test.lua            ; gh-3870
           long_row_timeout.test.lua  ; gh-4351
-          misc.test.lua              ; gh-4940
+          gh-3160-misc-heartbeats-on-master-changes.test.lua ; gh-4940
           skip_conflict_row.test.lua ; gh-4958
           sync.test.lua              ; gh-3835
           transaction.test.lua       ; gh-4312
-- 
2.17.1
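Note: with the suite divided, a single flaky scenario can be reproduced in
isolation instead of rerunning the whole former misc.test.lua. A minimal
sketch of how one of the new standalone tests could be exercised repeatedly,
assuming the usual in-repo test-run harness under test/ (exact paths and
options may differ in a given checkout):

    cd test
    # re-run one split test several times to see whether it is flaky
    for i in $(seq 1 10); do
        ./test-run.py replication/gh-3160-misc-heartbeats-on-master-changes.test.lua || break
    done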