[PATCH] replication: allow rebootstrapping a replica from a read-only master
Vladimir Davydov
vdavydov.dev at gmail.com
Tue Feb 6 19:45:24 MSK 2018
If an instance is read-only, an attempt to join a new replica to it fails
with ER_READONLY, because joining a replica to a cluster implies
registering it in the _cluster system space. However, if the replica is
already registered, which is the case when it is being rebootstrapped with
the same uuid (see box.cfg.instance_uuid), the corresponding record is
already present in the _cluster space and no write operation is required.
Nevertheless, rebootstrap currently fails with the same error.
Rearrange the access checks so that a replica can be rebootstrapped from
a read-only master, provided it uses the same uuid.
Closes #3111
---
Branch: gh-3111-replication-allow-rebootstrap-from-ro-master
src/box/box.cc | 17 ++++++++++---
test/replication/misc.result | 52 ++++++++++++++++++++++++++++++++++++++-
test/replication/misc.test.lua | 22 ++++++++++++++---
test/replication/replica_uuid.lua | 11 +++++++++
4 files changed, 94 insertions(+), 8 deletions(-)
create mode 100644 test/replication/replica_uuid.lua
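The gist of the reordering in box_on_join()/box_process_join() can be
sketched in plain Python (a minimal model, not Tarantool code; the names
`on_join`, `ReadOnlyError`, and the dict-based `cluster` are illustrative
stand-ins for the _cluster space and ER_READONLY):

```python
class ReadOnlyError(Exception):
    """Models ER_READONLY: a write attempted on a read-only instance."""

def on_join(cluster, instance_uuid, read_only):
    """Register a joining replica, modeling the reordered checks.

    cluster maps replica uuid -> replica id, standing in for the
    _cluster system space.
    """
    # If the replica is already registered (same uuid), no write to
    # _cluster is needed, so even a read-only master can serve the
    # rebootstrap. This lookup now happens BEFORE the writability check.
    if instance_uuid in cluster:
        return cluster[instance_uuid]
    # Only a genuinely new replica requires inserting a record into
    # _cluster, so the writability check applies only on this path.
    if read_only:
        raise ReadOnlyError("can't modify data on a read-only instance")
    # Assign the next free id, mirroring the "largest existing
    # replica id" scan in box_on_join().
    replica_id = max(cluster.values(), default=0) + 1
    cluster[instance_uuid] = replica_id
    return replica_id
```

With the old ordering the read-only check ran first, so the early-return
path for an already-registered uuid was never reached on a read-only
master; moving the lookup first is the whole fix.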
diff --git a/src/box/box.cc b/src/box/box.cc
index c33243a8..9d494257 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -1129,11 +1129,12 @@ box_register_replica(uint32_t id, const struct tt_uuid *uuid)
static void
box_on_join(const tt_uuid *instance_uuid)
{
- box_check_writable_xc();
struct replica *replica = replica_by_uuid(instance_uuid);
if (replica != NULL)
return; /* nothing to do - already registered */
+ box_check_writable_xc();
+
/** Find the largest existing replica id. */
struct space *space = space_cache_find_xc(BOX_CLUSTER_ID);
struct index *index = index_find_system_xc(space, 0);
@@ -1226,10 +1227,18 @@ box_process_join(struct ev_io *io, struct xrow_header *header)
/* Check permissions */
access_check_universe_xc(PRIV_R);
- access_check_space_xc(space_cache_find_xc(BOX_CLUSTER_ID), PRIV_W);
- /* Check that we actually can register a new replica */
- box_check_writable_xc();
+ /*
+ * Unless already registered, the new replica will be
+ * added to _cluster space once the initial join stage
+ * is complete. Fail early if the caller does not have
+ * appropriate access privileges.
+ */
+ if (replica_by_uuid(&instance_uuid) == NULL) {
+ box_check_writable_xc();
+ struct space *space = space_cache_find_xc(BOX_CLUSTER_ID);
+ access_check_space_xc(space, PRIV_W);
+ }
/* Forbid replication with disabled WAL */
if (wal_mode() == WAL_NONE) {
diff --git a/test/replication/misc.result b/test/replication/misc.result
index ae26c703..070e4ea8 100644
--- a/test/replication/misc.result
+++ b/test/replication/misc.result
@@ -1,6 +1,15 @@
+uuid = require('uuid')
+---
+...
+test_run = require('test_run').new()
+---
+...
+box.schema.user.grant('guest', 'replication')
+---
+...
-- gh-2991 - Tarantool asserts on box.cfg.replication update if one of
-- servers is dead
-box.schema.user.grant('guest', 'replication')
+replication_timeout = box.cfg.replication_timeout
---
...
box.cfg{replication_timeout=0.05, replication={}}
@@ -11,6 +20,47 @@ box.cfg{replication = {'127.0.0.1:12345', box.cfg.listen}}
- error: 'Incorrect value for option ''replication'': failed to connect to one or
more replicas'
...
+box.cfg{replication_timeout = replication_timeout}
+---
+...
+-- gh-3111 - Allow to rebootstrap a replica from a read-only master
+replica_uuid = uuid.new()
+---
+...
+test_run:cmd('create server test with rpl_master=default, script="replication/replica_uuid.lua"')
+---
+- true
+...
+test_run:cmd(string.format('start server test with args="%s"', replica_uuid))
+---
+- true
+...
+test_run:cmd('stop server test')
+---
+- true
+...
+test_run:cmd('cleanup server test')
+---
+- true
+...
+box.cfg{read_only = true}
+---
+...
+test_run:cmd(string.format('start server test with args="%s"', replica_uuid))
+---
+- true
+...
+test_run:cmd('stop server test')
+---
+- true
+...
+test_run:cmd('cleanup server test')
+---
+- true
+...
+box.cfg{read_only = false}
+---
+...
box.schema.user.revoke('guest', 'replication')
---
...
diff --git a/test/replication/misc.test.lua b/test/replication/misc.test.lua
index 04967a24..d4f714d9 100644
--- a/test/replication/misc.test.lua
+++ b/test/replication/misc.test.lua
@@ -1,9 +1,25 @@
--- gh-2991 - Tarantool asserts on box.cfg.replication update if one of
--- servers is dead
+uuid = require('uuid')
+test_run = require('test_run').new()
+
box.schema.user.grant('guest', 'replication')
+-- gh-2991 - Tarantool asserts on box.cfg.replication update if one of
+-- servers is dead
+replication_timeout = box.cfg.replication_timeout
box.cfg{replication_timeout=0.05, replication={}}
-
box.cfg{replication = {'127.0.0.1:12345', box.cfg.listen}}
+box.cfg{replication_timeout = replication_timeout}
+
+-- gh-3111 - Allow to rebootstrap a replica from a read-only master
+replica_uuid = uuid.new()
+test_run:cmd('create server test with rpl_master=default, script="replication/replica_uuid.lua"')
+test_run:cmd(string.format('start server test with args="%s"', replica_uuid))
+test_run:cmd('stop server test')
+test_run:cmd('cleanup server test')
+box.cfg{read_only = true}
+test_run:cmd(string.format('start server test with args="%s"', replica_uuid))
+test_run:cmd('stop server test')
+test_run:cmd('cleanup server test')
+box.cfg{read_only = false}
box.schema.user.revoke('guest', 'replication')
diff --git a/test/replication/replica_uuid.lua b/test/replication/replica_uuid.lua
new file mode 100644
index 00000000..f92d3119
--- /dev/null
+++ b/test/replication/replica_uuid.lua
@@ -0,0 +1,11 @@
+#!/usr/bin/env tarantool
+
+box.cfg({
+ instance_uuid = arg[1],
+ listen = os.getenv("LISTEN"),
+ replication = os.getenv("MASTER"),
+ memtx_memory = 107374182,
+})
+
+require('console').listen(os.getenv('ADMIN'))
+
--
2.11.0