* [Tarantool-patches] [PATCH v2 6/8] box: introduce `box.ctl.demote`
From: Serge Petrenko via Tarantool-patches @ 2021-06-17 21:07 UTC (permalink / raw)
To: v.shpilevoy, gorcunov; +Cc: tarantool-patches
Introduce a new journal entry, DEMOTE. The entry has the same meaning as
PROMOTE, with the only difference that it clears limbo ownership instead
of transferring it to the issuer.
Introduce `box.ctl.demote`, which works exactly like `box.ctl.promote`,
but results in writing DEMOTE instead of PROMOTE.
A new request type was necessary, rather than simply writing PROMOTE(origin_id
= 0), because origin_id is deduced from row.replica_id, which cannot be
0 for replicated rows (it is always equal to the instance_id of the row's
originator).
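The ownership rule the patch implements can be sketched in a simplified, hypothetical C model (names mirror but do not match the real txn_limbo API): a PROMOTE transfers limbo ownership to the request origin, a DEMOTE clears it, and both are ignored for stale terms.

```c
#include <assert.h>
#include <stdint.h>

enum { REPLICA_ID_NIL = 0 };
enum req_type { REQ_PROMOTE, REQ_DEMOTE };

struct limbo_model {
	uint32_t owner_id;      /* current limbo owner, NIL if unowned */
	uint64_t greatest_term; /* greatest PROMOTE/DEMOTE term seen */
};

/* Simplified counterpart of txn_limbo_process() for promote-family requests. */
static void
limbo_model_process(struct limbo_model *l, enum req_type type,
		    uint32_t origin_id, uint64_t term)
{
	if (term <= l->greatest_term)
		return; /* request for an outdated term is ignored */
	l->greatest_term = term;
	/* PROMOTE assigns ownership to the issuer, DEMOTE clears it. */
	l->owner_id = (type == REQ_PROMOTE) ? origin_id : REPLICA_ID_NIL;
}
```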
Closes #6034
@TarantoolBot document
Title: box.ctl.demote
`box.ctl.demote()` is a new function which works exactly like
`box.ctl.promote()`, except that the instance writes a DEMOTE request
to the WAL instead of a PROMOTE request.
A DEMOTE request (DEMOTE = 32) copies PROMOTE behaviour (it clears the
limbo as well), but clears limbo ownership instead of assigning it to a
new instance.
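A usage sketch in the register of the test suite below (assumes a running instance with a synchronous space named `sync`; illustrative only, not runnable standalone):

```lua
-- On the instance that should own the synchro queue:
box.ctl.promote()          -- writes PROMOTE; this instance takes limbo ownership
box.space.sync:replace{1}  -- synchronous DML is now allowed here
box.ctl.demote()           -- writes DEMOTE; limbo ownership is cleared
-- After demote(), no instance owns the limbo until the next promote().
```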
---
src/box/box.cc | 28 ++-
src/box/box.h | 3 +
src/box/iproto_constants.h | 10 +-
src/box/lua/ctl.c | 9 +
src/box/txn_limbo.c | 37 +++-
src/box/txn_limbo.h | 7 +
test/box/error.result | 2 +-
test/replication/election_basic.result | 3 +
test/replication/election_basic.test.lua | 1 +
test/replication/election_qsync.result | 3 +
test/replication/election_qsync.test.lua | 1 +
.../gh-5140-qsync-casc-rollback.result | 6 +
.../gh-5140-qsync-casc-rollback.test.lua | 2 +
.../gh-5144-qsync-dup-confirm.result | 6 +
.../gh-5144-qsync-dup-confirm.test.lua | 2 +
.../gh-5163-qsync-restart-crash.result | 6 +
.../gh-5163-qsync-restart-crash.test.lua | 2 +
.../gh-5167-qsync-rollback-snap.result | 6 +
.../gh-5167-qsync-rollback-snap.test.lua | 2 +
.../gh-5195-qsync-replica-write.result | 10 +-
.../gh-5195-qsync-replica-write.test.lua | 6 +-
.../gh-5213-qsync-applier-order-3.result | 9 +
.../gh-5213-qsync-applier-order-3.test.lua | 3 +
.../gh-5213-qsync-applier-order.result | 6 +
.../gh-5213-qsync-applier-order.test.lua | 2 +
.../replication/gh-5288-qsync-recovery.result | 6 +
.../gh-5288-qsync-recovery.test.lua | 2 +
.../gh-5298-qsync-recovery-snap.result | 6 +
.../gh-5298-qsync-recovery-snap.test.lua | 2 +
.../gh-5426-election-on-off.result | 3 +
.../gh-5426-election-on-off.test.lua | 1 +
.../gh-5433-election-restart-recovery.result | 3 +
...gh-5433-election-restart-recovery.test.lua | 1 +
...sync-clear-synchro-queue-commit-all.result | 3 +
...nc-clear-synchro-queue-commit-all.test.lua | 1 +
.../gh-5446-qsync-eval-quorum.result | 7 +
.../gh-5446-qsync-eval-quorum.test.lua | 3 +
.../gh-5506-election-on-off.result | 3 +
.../gh-5506-election-on-off.test.lua | 1 +
.../gh-5566-final-join-synchro.result | 6 +
.../gh-5566-final-join-synchro.test.lua | 2 +
.../gh-5874-qsync-txn-recovery.result | 6 +
.../gh-5874-qsync-txn-recovery.test.lua | 2 +
.../gh-6032-promote-wal-write.result | 3 +
.../gh-6032-promote-wal-write.test.lua | 1 +
.../gh-6034-limbo-ownership.result | 186 ++++++++++++++++++
.../gh-6034-limbo-ownership.test.lua | 68 +++++++
.../gh-6057-qsync-confirm-async-no-wal.result | 7 +
...h-6057-qsync-confirm-async-no-wal.test.lua | 3 +
test/replication/hang_on_synchro_fail.result | 6 +
.../replication/hang_on_synchro_fail.test.lua | 2 +
test/replication/qsync_advanced.result | 12 ++
test/replication/qsync_advanced.test.lua | 4 +
test/replication/qsync_basic.result | 33 ++--
test/replication/qsync_basic.test.lua | 16 +-
test/replication/qsync_errinj.result | 6 +
test/replication/qsync_errinj.test.lua | 2 +
test/replication/qsync_snapshots.result | 6 +
test/replication/qsync_snapshots.test.lua | 2 +
test/replication/qsync_with_anon.result | 6 +
test/replication/qsync_with_anon.test.lua | 2 +
test/replication/suite.cfg | 1 +
62 files changed, 550 insertions(+), 46 deletions(-)
create mode 100644 test/replication/gh-6034-limbo-ownership.result
create mode 100644 test/replication/gh-6034-limbo-ownership.test.lua
diff --git a/src/box/box.cc b/src/box/box.cc
index 53a8f80e5..f2bde910c 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -1527,8 +1527,8 @@ box_wait_quorum(uint32_t lead_id, int64_t target_lsn, int quorum,
return 0;
}
-int
-box_promote(void)
+static int
+box_clear_synchro_queue(bool demote)
{
/* A guard to block multiple simultaneous function invocations. */
static bool in_promote = false;
@@ -1691,10 +1691,16 @@ promote:
raft_new_term(box_raft());
uint64_t term = box_raft()->volatile_term;
- txn_limbo_write_promote(&txn_limbo, wait_lsn,
- term);
+ if (demote) {
+ txn_limbo_write_demote(&txn_limbo, wait_lsn,
+ term);
+ } else {
+ txn_limbo_write_promote(&txn_limbo, wait_lsn,
+ term);
+ }
+ uint16_t type = demote ? IPROTO_DEMOTE : IPROTO_PROMOTE;
struct synchro_request req = {
- .type = IPROTO_PROMOTE,
+ .type = type,
.replica_id = former_leader_id,
.origin_id = instance_id,
.lsn = wait_lsn,
@@ -1707,6 +1713,18 @@ promote:
return rc;
}
+int
+box_promote(void)
+{
+ return box_clear_synchro_queue(false);
+}
+
+int
+box_demote(void)
+{
+ return box_clear_synchro_queue(true);
+}
+
int
box_listen(void)
{
diff --git a/src/box/box.h b/src/box/box.h
index ecf32240d..aaf20d9dd 100644
--- a/src/box/box.h
+++ b/src/box/box.h
@@ -276,6 +276,9 @@ typedef struct tuple box_tuple_t;
int
box_promote(void);
+int
+box_demote(void);
+
/* box_select is private and used only by FFI */
API_EXPORT int
box_select(uint32_t space_id, uint32_t index_id,
diff --git a/src/box/iproto_constants.h b/src/box/iproto_constants.h
index 137bee9da..3c9edb7d2 100644
--- a/src/box/iproto_constants.h
+++ b/src/box/iproto_constants.h
@@ -241,6 +241,8 @@ enum iproto_type {
IPROTO_RAFT = 30,
/** PROMOTE request. */
IPROTO_PROMOTE = 31,
+ /** DEMOTE request. */
+ IPROTO_DEMOTE = 32,
/** A confirmation message for synchronous transactions. */
IPROTO_CONFIRM = 40,
@@ -310,6 +312,8 @@ iproto_type_name(uint16_t type)
return "RAFT";
case IPROTO_PROMOTE:
return "PROMOTE";
+ case IPROTO_DEMOTE:
+ return "DEMOTE";
case IPROTO_CONFIRM:
return "CONFIRM";
case IPROTO_ROLLBACK:
@@ -364,14 +368,14 @@ static inline bool
iproto_type_is_synchro_request(uint16_t type)
{
return type == IPROTO_CONFIRM || type == IPROTO_ROLLBACK ||
- type == IPROTO_PROMOTE;
+ type == IPROTO_PROMOTE || type == IPROTO_DEMOTE;
}
-/** PROMOTE entry (synchronous replication and leader elections). */
+/** PROMOTE/DEMOTE entry (synchronous replication and leader elections). */
static inline bool
iproto_type_is_promote_request(uint32_t type)
{
- return type == IPROTO_PROMOTE;
+ return type == IPROTO_PROMOTE || type == IPROTO_DEMOTE;
}
static inline bool
diff --git a/src/box/lua/ctl.c b/src/box/lua/ctl.c
index 368b9ab60..a613c4111 100644
--- a/src/box/lua/ctl.c
+++ b/src/box/lua/ctl.c
@@ -89,6 +89,14 @@ lbox_ctl_promote(struct lua_State *L)
return 0;
}
+static int
+lbox_ctl_demote(struct lua_State *L)
+{
+ if (box_demote() != 0)
+ return luaT_error(L);
+ return 0;
+}
+
static int
lbox_ctl_is_recovery_finished(struct lua_State *L)
{
@@ -127,6 +135,7 @@ static const struct luaL_Reg lbox_ctl_lib[] = {
{"promote", lbox_ctl_promote},
/* An old alias. */
{"clear_synchro_queue", lbox_ctl_promote},
+ {"demote", lbox_ctl_demote},
{"is_recovery_finished", lbox_ctl_is_recovery_finished},
{"set_on_shutdown_timeout", lbox_ctl_set_on_shutdown_timeout},
{NULL, NULL}
diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
index 203dbe856..b5af02479 100644
--- a/src/box/txn_limbo.c
+++ b/src/box/txn_limbo.c
@@ -503,6 +503,29 @@ txn_limbo_read_promote(struct txn_limbo *limbo, uint32_t replica_id,
limbo->confirmed_lsn = 0;
}
+void
+txn_limbo_write_demote(struct txn_limbo *limbo, int64_t lsn, uint64_t term)
+{
+ limbo->confirmed_lsn = lsn;
+ limbo->is_in_rollback = true;
+ struct txn_limbo_entry *e = txn_limbo_last_synchro_entry(limbo);
+ assert(e == NULL || e->lsn <= lsn);
+ (void)e;
+ txn_limbo_write_synchro(limbo, IPROTO_DEMOTE, lsn, term);
+ limbo->is_in_rollback = false;
+}
+
+/**
+ * Process a DEMOTE request, which is like PROMOTE, but clears the limbo
+ * ownership.
+ * @sa txn_limbo_read_promote.
+ */
+static void
+txn_limbo_read_demote(struct txn_limbo *limbo, int64_t lsn)
+{
+ txn_limbo_read_promote(limbo, REPLICA_ID_NIL, lsn);
+}
+
void
txn_limbo_ack(struct txn_limbo *limbo, uint32_t replica_id, int64_t lsn)
{
@@ -655,12 +678,13 @@ txn_limbo_process(struct txn_limbo *limbo, const struct synchro_request *req)
if (term > limbo->promote_greatest_term) {
limbo->promote_greatest_term = term;
- } else if (req->type == IPROTO_PROMOTE &&
+ } else if (iproto_type_is_promote_request(req->type) &&
limbo->promote_greatest_term > 1) {
/* PROMOTE for outdated term. Ignore. */
- say_info("RAFT: ignoring PROMOTE request from instance "
+ say_info("RAFT: ignoring %s request from instance "
"id %"PRIu32" for term %"PRIu64". Greatest term seen "
- "before (%"PRIu64") is bigger.", origin, term,
+ "before (%"PRIu64") is bigger.",
+ iproto_type_name(req->type), origin, term,
limbo->promote_greatest_term);
return;
}
@@ -671,7 +695,7 @@ txn_limbo_process(struct txn_limbo *limbo, const struct synchro_request *req)
* The limbo was empty on the instance issuing the request.
* This means this instance must empty its limbo as well.
*/
- assert(lsn == 0 && req->type == IPROTO_PROMOTE);
+ assert(lsn == 0 && iproto_type_is_promote_request(req->type));
} else if (req->replica_id != limbo->owner_id) {
/*
* Ignore CONFIRM/ROLLBACK messages for a foreign master.
@@ -679,7 +703,7 @@ txn_limbo_process(struct txn_limbo *limbo, const struct synchro_request *req)
* data from an old leader, who has just started and written
* confirm right on synchronous transaction recovery.
*/
- if (req->type != IPROTO_PROMOTE)
+ if (!iproto_type_is_promote_request(req->type))
return;
/*
* Promote has a bigger term, and tries to steal the limbo. It
@@ -699,6 +723,9 @@ txn_limbo_process(struct txn_limbo *limbo, const struct synchro_request *req)
case IPROTO_PROMOTE:
txn_limbo_read_promote(limbo, req->origin_id, lsn);
break;
+ case IPROTO_DEMOTE:
+ txn_limbo_read_demote(limbo, lsn);
+ break;
default:
unreachable();
}
diff --git a/src/box/txn_limbo.h b/src/box/txn_limbo.h
index e409ac657..801a1a0ee 100644
--- a/src/box/txn_limbo.h
+++ b/src/box/txn_limbo.h
@@ -318,6 +318,13 @@ txn_limbo_wait_confirm(struct txn_limbo *limbo);
void
txn_limbo_write_promote(struct txn_limbo *limbo, int64_t lsn, uint64_t term);
+/**
+ * Write a DEMOTE request.
+ * It has the same effect as PROMOTE and additionally clears limbo ownership.
+ */
+void
+txn_limbo_write_demote(struct txn_limbo *limbo, int64_t lsn, uint64_t term);
+
/**
* Update qsync parameters dynamically.
*/
diff --git a/test/box/error.result b/test/box/error.result
index 574521a14..55ecc3e8e 100644
--- a/test/box/error.result
+++ b/test/box/error.result
@@ -444,7 +444,7 @@ t;
| 223: box.error.INTERFERING_PROMOTE
| 224: box.error.RAFT_DISABLED
| 225: box.error.TXN_ROLLBACK
- | 226: box.error.LIMBO_UNCLAIMED
+ | 226: box.error.SYNCHRO_QUEUE_UNCLAIMED
| ...
test_run:cmd("setopt delimiter ''");
diff --git a/test/replication/election_basic.result b/test/replication/election_basic.result
index d5320b3ff..a62d32deb 100644
--- a/test/replication/election_basic.result
+++ b/test/replication/election_basic.result
@@ -114,6 +114,9 @@ box.cfg{
}
| ---
| ...
+box.ctl.demote()
+ | ---
+ | ...
--
-- See if bootstrap with election enabled works.
diff --git a/test/replication/election_basic.test.lua b/test/replication/election_basic.test.lua
index 821f73cea..2143a6000 100644
--- a/test/replication/election_basic.test.lua
+++ b/test/replication/election_basic.test.lua
@@ -43,6 +43,7 @@ box.cfg{
election_mode = 'off', \
election_timeout = old_election_timeout \
}
+box.ctl.demote()
--
-- See if bootstrap with election enabled works.
diff --git a/test/replication/election_qsync.result b/test/replication/election_qsync.result
index c06400b38..2402c8578 100644
--- a/test/replication/election_qsync.result
+++ b/test/replication/election_qsync.result
@@ -165,6 +165,9 @@ box.cfg{
}
| ---
| ...
+box.ctl.demote()
+ | ---
+ | ...
box.schema.user.revoke('guest', 'super')
| ---
| ...
diff --git a/test/replication/election_qsync.test.lua b/test/replication/election_qsync.test.lua
index ea6fc4a61..e1aca8351 100644
--- a/test/replication/election_qsync.test.lua
+++ b/test/replication/election_qsync.test.lua
@@ -84,4 +84,5 @@ box.cfg{
replication = old_replication, \
replication_synchro_timeout = old_replication_synchro_timeout, \
}
+box.ctl.demote()
box.schema.user.revoke('guest', 'super')
diff --git a/test/replication/gh-5140-qsync-casc-rollback.result b/test/replication/gh-5140-qsync-casc-rollback.result
index da77631dd..d3208e1a4 100644
--- a/test/replication/gh-5140-qsync-casc-rollback.result
+++ b/test/replication/gh-5140-qsync-casc-rollback.result
@@ -73,6 +73,9 @@ _ = box.schema.space.create('async', {is_sync=false, engine = engine})
_ = _:create_index('pk')
| ---
| ...
+box.ctl.promote()
+ | ---
+ | ...
-- Write something to flush the master state to replica.
box.space.sync:replace{1}
| ---
@@ -222,3 +225,6 @@ test_run:cmd('delete server replica')
box.schema.user.revoke('guest', 'super')
| ---
| ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-5140-qsync-casc-rollback.test.lua b/test/replication/gh-5140-qsync-casc-rollback.test.lua
index 69fc9ad02..96ddfd260 100644
--- a/test/replication/gh-5140-qsync-casc-rollback.test.lua
+++ b/test/replication/gh-5140-qsync-casc-rollback.test.lua
@@ -48,6 +48,7 @@ _ = box.schema.space.create('sync', {is_sync = true, engine = engine})
_ = _:create_index('pk')
_ = box.schema.space.create('async', {is_sync=false, engine = engine})
_ = _:create_index('pk')
+box.ctl.promote()
-- Write something to flush the master state to replica.
box.space.sync:replace{1}
@@ -103,3 +104,4 @@ test_run:cmd('stop server replica')
test_run:cmd('delete server replica')
box.schema.user.revoke('guest', 'super')
+box.ctl.demote()
diff --git a/test/replication/gh-5144-qsync-dup-confirm.result b/test/replication/gh-5144-qsync-dup-confirm.result
index 9d265d9ff..217e44412 100644
--- a/test/replication/gh-5144-qsync-dup-confirm.result
+++ b/test/replication/gh-5144-qsync-dup-confirm.result
@@ -46,6 +46,9 @@ _ = box.schema.space.create('sync', {is_sync = true, engine = engine})
_ = _:create_index('pk')
| ---
| ...
+box.ctl.promote()
+ | ---
+ | ...
-- Remember the current LSN. In the end, when the following synchronous
-- transaction is committed, result LSN should be this value +2: for the
@@ -148,6 +151,9 @@ test_run:cmd('delete server replica2')
| - true
| ...
+box.ctl.demote()
+ | ---
+ | ...
box.schema.user.revoke('guest', 'super')
| ---
| ...
diff --git a/test/replication/gh-5144-qsync-dup-confirm.test.lua b/test/replication/gh-5144-qsync-dup-confirm.test.lua
index 01a8351e0..1d6af2c62 100644
--- a/test/replication/gh-5144-qsync-dup-confirm.test.lua
+++ b/test/replication/gh-5144-qsync-dup-confirm.test.lua
@@ -19,6 +19,7 @@ box.cfg{replication_synchro_quorum = 2, replication_synchro_timeout = 1000}
_ = box.schema.space.create('sync', {is_sync = true, engine = engine})
_ = _:create_index('pk')
+box.ctl.promote()
-- Remember the current LSN. In the end, when the following synchronous
-- transaction is committed, result LSN should be this value +2: for the
@@ -69,4 +70,5 @@ test_run:cmd('delete server replica1')
test_run:cmd('stop server replica2')
test_run:cmd('delete server replica2')
+box.ctl.demote()
box.schema.user.revoke('guest', 'super')
diff --git a/test/replication/gh-5163-qsync-restart-crash.result b/test/replication/gh-5163-qsync-restart-crash.result
index e57bc76d1..1b4d3d9b5 100644
--- a/test/replication/gh-5163-qsync-restart-crash.result
+++ b/test/replication/gh-5163-qsync-restart-crash.result
@@ -16,6 +16,9 @@ _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
_ = box.space.sync:create_index('pk')
| ---
| ...
+box.ctl.promote()
+ | ---
+ | ...
box.space.sync:replace{1}
| ---
@@ -30,3 +33,6 @@ box.space.sync:select{}
box.space.sync:drop()
| ---
| ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-5163-qsync-restart-crash.test.lua b/test/replication/gh-5163-qsync-restart-crash.test.lua
index d5aca4749..c8d54aad2 100644
--- a/test/replication/gh-5163-qsync-restart-crash.test.lua
+++ b/test/replication/gh-5163-qsync-restart-crash.test.lua
@@ -7,8 +7,10 @@ engine = test_run:get_cfg('engine')
--
_ = box.schema.space.create('sync', {is_sync=true, engine=engine})
_ = box.space.sync:create_index('pk')
+box.ctl.promote()
box.space.sync:replace{1}
test_run:cmd('restart server default')
box.space.sync:select{}
box.space.sync:drop()
+box.ctl.demote()
diff --git a/test/replication/gh-5167-qsync-rollback-snap.result b/test/replication/gh-5167-qsync-rollback-snap.result
index 06f58526c..13166720f 100644
--- a/test/replication/gh-5167-qsync-rollback-snap.result
+++ b/test/replication/gh-5167-qsync-rollback-snap.result
@@ -41,6 +41,9 @@ _ = box.schema.space.create('sync', {is_sync = true, engine = engine})
_ = box.space.sync:create_index('pk')
| ---
| ...
+box.ctl.promote()
+ | ---
+ | ...
-- Write something to flush the current master's state to replica.
_ = box.space.sync:insert{1}
| ---
@@ -163,3 +166,6 @@ box.cfg{
}
| ---
| ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-5167-qsync-rollback-snap.test.lua b/test/replication/gh-5167-qsync-rollback-snap.test.lua
index 475727e61..1a2a31b7c 100644
--- a/test/replication/gh-5167-qsync-rollback-snap.test.lua
+++ b/test/replication/gh-5167-qsync-rollback-snap.test.lua
@@ -16,6 +16,7 @@ fiber = require('fiber')
box.cfg{replication_synchro_quorum = 2, replication_synchro_timeout = 1000}
_ = box.schema.space.create('sync', {is_sync = true, engine = engine})
_ = box.space.sync:create_index('pk')
+box.ctl.promote()
-- Write something to flush the current master's state to replica.
_ = box.space.sync:insert{1}
_ = box.space.sync:delete{1}
@@ -65,3 +66,4 @@ box.cfg{
replication_synchro_quorum = orig_synchro_quorum, \
replication_synchro_timeout = orig_synchro_timeout, \
}
+box.ctl.demote()
diff --git a/test/replication/gh-5195-qsync-replica-write.result b/test/replication/gh-5195-qsync-replica-write.result
index 85e00e6ed..99ec9663e 100644
--- a/test/replication/gh-5195-qsync-replica-write.result
+++ b/test/replication/gh-5195-qsync-replica-write.result
@@ -40,6 +40,9 @@ _ = box.schema.space.create('sync', {engine = engine, is_sync = true})
_ = box.space.sync:create_index('pk')
| ---
| ...
+box.ctl.promote()
+ | ---
+ | ...
box.cfg{replication_synchro_timeout = 1000, replication_synchro_quorum = 3}
| ---
@@ -71,12 +74,12 @@ test_run:wait_lsn('replica', 'default')
| ---
| ...
-- Normal DML is blocked - the limbo is not empty and does not belong to the
--- replica. But synchro queue cleanup also does a WAL write, and propagates LSN
+-- replica. But promote also does a WAL write, and propagates LSN
-- of the instance.
box.cfg{replication_synchro_timeout = 0.001}
| ---
| ...
-box.ctl.clear_synchro_queue()
+box.ctl.promote()
| ---
| ...
@@ -157,3 +160,6 @@ box.cfg{
}
| ---
| ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-5195-qsync-replica-write.test.lua b/test/replication/gh-5195-qsync-replica-write.test.lua
index 64c48be99..8b5d78357 100644
--- a/test/replication/gh-5195-qsync-replica-write.test.lua
+++ b/test/replication/gh-5195-qsync-replica-write.test.lua
@@ -17,6 +17,7 @@ test_run:cmd('start server replica with wait=True, wait_load=True')
--
_ = box.schema.space.create('sync', {engine = engine, is_sync = true})
_ = box.space.sync:create_index('pk')
+box.ctl.promote()
box.cfg{replication_synchro_timeout = 1000, replication_synchro_quorum = 3}
lsn = box.info.lsn
@@ -30,10 +31,10 @@ test_run:wait_cond(function() return box.info.lsn == lsn end)
test_run:switch('replica')
test_run:wait_lsn('replica', 'default')
-- Normal DML is blocked - the limbo is not empty and does not belong to the
--- replica. But synchro queue cleanup also does a WAL write, and propagates LSN
+-- replica. But promote also does a WAL write, and propagates LSN
-- of the instance.
box.cfg{replication_synchro_timeout = 0.001}
-box.ctl.clear_synchro_queue()
+box.ctl.promote()
test_run:switch('default')
-- Wait second ACK receipt.
@@ -66,3 +67,4 @@ box.cfg{
replication_synchro_quorum = old_synchro_quorum, \
replication_synchro_timeout = old_synchro_timeout, \
}
+box.ctl.demote()
diff --git a/test/replication/gh-5213-qsync-applier-order-3.result b/test/replication/gh-5213-qsync-applier-order-3.result
index bcb18b5c0..e788eec77 100644
--- a/test/replication/gh-5213-qsync-applier-order-3.result
+++ b/test/replication/gh-5213-qsync-applier-order-3.result
@@ -45,6 +45,9 @@ s = box.schema.space.create('test', {is_sync = true})
_ = s:create_index('pk')
| ---
| ...
+box.ctl.promote()
+ | ---
+ | ...
test_run:cmd('create server replica1 with rpl_master=default,\
script="replication/replica1.lua"')
@@ -179,6 +182,9 @@ box.cfg{
-- Replica2 takes the limbo ownership and sends the transaction to the replica1.
-- Along with the CONFIRM from the default node, which is still not applied
-- on the replica1.
+box.ctl.promote()
+ | ---
+ | ...
fiber = require('fiber')
| ---
| ...
@@ -261,3 +267,6 @@ box.cfg{
}
| ---
| ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-5213-qsync-applier-order-3.test.lua b/test/replication/gh-5213-qsync-applier-order-3.test.lua
index 37b569da7..304656de0 100644
--- a/test/replication/gh-5213-qsync-applier-order-3.test.lua
+++ b/test/replication/gh-5213-qsync-applier-order-3.test.lua
@@ -30,6 +30,7 @@ box.schema.user.grant('guest', 'super')
s = box.schema.space.create('test', {is_sync = true})
_ = s:create_index('pk')
+box.ctl.promote()
test_run:cmd('create server replica1 with rpl_master=default,\
script="replication/replica1.lua"')
@@ -90,6 +91,7 @@ box.cfg{
-- Replica2 takes the limbo ownership and sends the transaction to the replica1.
-- Along with the CONFIRM from the default node, which is still not applied
-- on the replica1.
+box.ctl.promote()
fiber = require('fiber')
f = fiber.new(function() box.space.test:replace{2} end)
@@ -123,3 +125,4 @@ box.cfg{
replication_synchro_quorum = old_synchro_quorum, \
replication_synchro_timeout = old_synchro_timeout, \
}
+box.ctl.demote()
diff --git a/test/replication/gh-5213-qsync-applier-order.result b/test/replication/gh-5213-qsync-applier-order.result
index a8c24c289..ba6cdab06 100644
--- a/test/replication/gh-5213-qsync-applier-order.result
+++ b/test/replication/gh-5213-qsync-applier-order.result
@@ -29,6 +29,9 @@ s = box.schema.space.create('test', {is_sync = true})
_ = s:create_index('pk')
| ---
| ...
+box.ctl.promote()
+ | ---
+ | ...
test_run:cmd('create server replica with rpl_master=default,\
script="replication/gh-5213-replica.lua"')
@@ -300,3 +303,6 @@ box.cfg{
}
| ---
| ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-5213-qsync-applier-order.test.lua b/test/replication/gh-5213-qsync-applier-order.test.lua
index f1eccfa84..39b1912e8 100644
--- a/test/replication/gh-5213-qsync-applier-order.test.lua
+++ b/test/replication/gh-5213-qsync-applier-order.test.lua
@@ -14,6 +14,7 @@ box.schema.user.grant('guest', 'super')
s = box.schema.space.create('test', {is_sync = true})
_ = s:create_index('pk')
+box.ctl.promote()
test_run:cmd('create server replica with rpl_master=default,\
script="replication/gh-5213-replica.lua"')
@@ -120,3 +121,4 @@ box.cfg{
replication_synchro_quorum = old_synchro_quorum, \
replication_synchro_timeout = old_synchro_timeout, \
}
+box.ctl.demote()
diff --git a/test/replication/gh-5288-qsync-recovery.result b/test/replication/gh-5288-qsync-recovery.result
index dc0babef6..704b71d93 100644
--- a/test/replication/gh-5288-qsync-recovery.result
+++ b/test/replication/gh-5288-qsync-recovery.result
@@ -12,6 +12,9 @@ s = box.schema.space.create('sync', {is_sync = true})
_ = s:create_index('pk')
| ---
| ...
+box.ctl.promote()
+ | ---
+ | ...
s:insert{1}
| ---
| - [1]
@@ -25,3 +28,6 @@ test_run:cmd('restart server default')
box.space.sync:drop()
| ---
| ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-5288-qsync-recovery.test.lua b/test/replication/gh-5288-qsync-recovery.test.lua
index 00bff7b87..2455f7278 100644
--- a/test/replication/gh-5288-qsync-recovery.test.lua
+++ b/test/replication/gh-5288-qsync-recovery.test.lua
@@ -5,7 +5,9 @@ test_run = require('test_run').new()
--
s = box.schema.space.create('sync', {is_sync = true})
_ = s:create_index('pk')
+box.ctl.promote()
s:insert{1}
box.snapshot()
test_run:cmd('restart server default')
box.space.sync:drop()
+box.ctl.demote()
diff --git a/test/replication/gh-5298-qsync-recovery-snap.result b/test/replication/gh-5298-qsync-recovery-snap.result
index 922831552..0883fe5f5 100644
--- a/test/replication/gh-5298-qsync-recovery-snap.result
+++ b/test/replication/gh-5298-qsync-recovery-snap.result
@@ -17,6 +17,9 @@ _ = box.schema.space.create('sync', {is_sync = true, engine = engine})
_ = box.space.sync:create_index('pk')
| ---
| ...
+box.ctl.promote()
+ | ---
+ | ...
for i = 1, 10 do box.space.sync:replace{i} end
| ---
| ...
@@ -98,3 +101,6 @@ box.space.sync:drop()
box.space.loc:drop()
| ---
| ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-5298-qsync-recovery-snap.test.lua b/test/replication/gh-5298-qsync-recovery-snap.test.lua
index 187f60d75..084cde963 100644
--- a/test/replication/gh-5298-qsync-recovery-snap.test.lua
+++ b/test/replication/gh-5298-qsync-recovery-snap.test.lua
@@ -8,6 +8,7 @@ engine = test_run:get_cfg('engine')
--
_ = box.schema.space.create('sync', {is_sync = true, engine = engine})
_ = box.space.sync:create_index('pk')
+box.ctl.promote()
for i = 1, 10 do box.space.sync:replace{i} end
-- Local rows could affect this by increasing the signature.
@@ -46,3 +47,4 @@ box.cfg{
}
box.space.sync:drop()
box.space.loc:drop()
+box.ctl.demote()
diff --git a/test/replication/gh-5426-election-on-off.result b/test/replication/gh-5426-election-on-off.result
index 7444ef7f2..2bdc17ec6 100644
--- a/test/replication/gh-5426-election-on-off.result
+++ b/test/replication/gh-5426-election-on-off.result
@@ -168,6 +168,9 @@ box.cfg{
}
| ---
| ...
+box.ctl.demote()
+ | ---
+ | ...
box.schema.user.revoke('guest', 'super')
| ---
| ...
diff --git a/test/replication/gh-5426-election-on-off.test.lua b/test/replication/gh-5426-election-on-off.test.lua
index bdf06903b..6277e9ef2 100644
--- a/test/replication/gh-5426-election-on-off.test.lua
+++ b/test/replication/gh-5426-election-on-off.test.lua
@@ -69,4 +69,5 @@ box.cfg{
election_mode = old_election_mode, \
replication_timeout = old_replication_timeout, \
}
+box.ctl.demote()
box.schema.user.revoke('guest', 'super')
diff --git a/test/replication/gh-5433-election-restart-recovery.result b/test/replication/gh-5433-election-restart-recovery.result
index f8f32416e..ed63ff409 100644
--- a/test/replication/gh-5433-election-restart-recovery.result
+++ b/test/replication/gh-5433-election-restart-recovery.result
@@ -169,6 +169,9 @@ box.cfg{
}
| ---
| ...
+box.ctl.demote()
+ | ---
+ | ...
box.schema.user.revoke('guest', 'super')
| ---
| ...
diff --git a/test/replication/gh-5433-election-restart-recovery.test.lua b/test/replication/gh-5433-election-restart-recovery.test.lua
index 4aff000bf..ae1f42c4d 100644
--- a/test/replication/gh-5433-election-restart-recovery.test.lua
+++ b/test/replication/gh-5433-election-restart-recovery.test.lua
@@ -84,4 +84,5 @@ box.cfg{
election_mode = old_election_mode, \
replication_timeout = old_replication_timeout, \
}
+box.ctl.demote()
box.schema.user.revoke('guest', 'super')
diff --git a/test/replication/gh-5435-qsync-clear-synchro-queue-commit-all.result b/test/replication/gh-5435-qsync-clear-synchro-queue-commit-all.result
index 2699231e5..20fab4072 100644
--- a/test/replication/gh-5435-qsync-clear-synchro-queue-commit-all.result
+++ b/test/replication/gh-5435-qsync-clear-synchro-queue-commit-all.result
@@ -49,6 +49,9 @@ _ = box.schema.space.create('test', {is_sync=true})
_ = box.space.test:create_index('pk')
| ---
| ...
+box.ctl.promote()
+ | ---
+ | ...
-- Fill the limbo with pending entries. 3 mustn't receive them yet.
test_run:cmd('stop server election_replica3')
diff --git a/test/replication/gh-5435-qsync-clear-synchro-queue-commit-all.test.lua b/test/replication/gh-5435-qsync-clear-synchro-queue-commit-all.test.lua
index 03705d96c..ec0f1d77e 100644
--- a/test/replication/gh-5435-qsync-clear-synchro-queue-commit-all.test.lua
+++ b/test/replication/gh-5435-qsync-clear-synchro-queue-commit-all.test.lua
@@ -21,6 +21,7 @@ box.ctl.wait_rw()
_ = box.schema.space.create('test', {is_sync=true})
_ = box.space.test:create_index('pk')
+box.ctl.promote()
-- Fill the limbo with pending entries. 3 mustn't receive them yet.
test_run:cmd('stop server election_replica3')
diff --git a/test/replication/gh-5446-qsync-eval-quorum.result b/test/replication/gh-5446-qsync-eval-quorum.result
index 5f83b248c..1173128a7 100644
--- a/test/replication/gh-5446-qsync-eval-quorum.result
+++ b/test/replication/gh-5446-qsync-eval-quorum.result
@@ -88,6 +88,9 @@ s = box.schema.space.create('sync', {is_sync = true, engine = engine})
_ = s:create_index('pk')
| ---
| ...
+box.ctl.promote()
+ | ---
+ | ...
-- Only one master node -> 1/2 + 1 = 1
s:insert{1} -- should pass
@@ -343,3 +346,7 @@ box.cfg{
}
| ---
| ...
+box.ctl.demote()
+ | ---
+ | ...
+
diff --git a/test/replication/gh-5446-qsync-eval-quorum.test.lua b/test/replication/gh-5446-qsync-eval-quorum.test.lua
index 6b9e324ed..b969df836 100644
--- a/test/replication/gh-5446-qsync-eval-quorum.test.lua
+++ b/test/replication/gh-5446-qsync-eval-quorum.test.lua
@@ -37,6 +37,7 @@ end
-- Create a sync space we will operate on
s = box.schema.space.create('sync', {is_sync = true, engine = engine})
_ = s:create_index('pk')
+box.ctl.promote()
-- Only one master node -> 1/2 + 1 = 1
s:insert{1} -- should pass
@@ -135,3 +136,5 @@ box.cfg{
replication_synchro_quorum = old_synchro_quorum, \
replication_synchro_timeout = old_synchro_timeout, \
}
+box.ctl.demote()
+
diff --git a/test/replication/gh-5506-election-on-off.result b/test/replication/gh-5506-election-on-off.result
index b8abd7ecd..a7f2b6a9c 100644
--- a/test/replication/gh-5506-election-on-off.result
+++ b/test/replication/gh-5506-election-on-off.result
@@ -138,3 +138,6 @@ box.cfg{
}
| ---
| ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-5506-election-on-off.test.lua b/test/replication/gh-5506-election-on-off.test.lua
index 476b00ec0..f8915c333 100644
--- a/test/replication/gh-5506-election-on-off.test.lua
+++ b/test/replication/gh-5506-election-on-off.test.lua
@@ -66,3 +66,4 @@ box.cfg{
election_mode = old_election_mode, \
replication_timeout = old_replication_timeout, \
}
+box.ctl.demote()
diff --git a/test/replication/gh-5566-final-join-synchro.result b/test/replication/gh-5566-final-join-synchro.result
index a09882ba6..c5ae2f283 100644
--- a/test/replication/gh-5566-final-join-synchro.result
+++ b/test/replication/gh-5566-final-join-synchro.result
@@ -12,6 +12,9 @@ _ = box.schema.space.create('sync', {is_sync=true})
_ = box.space.sync:create_index('pk')
| ---
| ...
+box.ctl.promote()
+ | ---
+ | ...
box.schema.user.grant('guest', 'replication')
| ---
@@ -137,3 +140,6 @@ test_run:cleanup_cluster()
box.schema.user.revoke('guest', 'replication')
| ---
| ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-5566-final-join-synchro.test.lua b/test/replication/gh-5566-final-join-synchro.test.lua
index 2db2c742f..25f411407 100644
--- a/test/replication/gh-5566-final-join-synchro.test.lua
+++ b/test/replication/gh-5566-final-join-synchro.test.lua
@@ -5,6 +5,7 @@ test_run = require('test_run').new()
--
_ = box.schema.space.create('sync', {is_sync=true})
_ = box.space.sync:create_index('pk')
+box.ctl.promote()
box.schema.user.grant('guest', 'replication')
box.schema.user.grant('guest', 'write', 'space', 'sync')
@@ -59,3 +60,4 @@ box.cfg{\
box.space.sync:drop()
test_run:cleanup_cluster()
box.schema.user.revoke('guest', 'replication')
+box.ctl.demote()
diff --git a/test/replication/gh-5874-qsync-txn-recovery.result b/test/replication/gh-5874-qsync-txn-recovery.result
index 73f903ca7..01328a9e3 100644
--- a/test/replication/gh-5874-qsync-txn-recovery.result
+++ b/test/replication/gh-5874-qsync-txn-recovery.result
@@ -31,6 +31,9 @@ sync = box.schema.create_space('sync', {is_sync = true, engine = engine})
_ = sync:create_index('pk')
| ---
| ...
+box.ctl.promote()
+ | ---
+ | ...
-- The transaction fails, but is written to the log anyway.
box.begin() async:insert{1} sync:insert{1} box.commit()
@@ -160,3 +163,6 @@ sync:drop()
loc:drop()
| ---
| ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-5874-qsync-txn-recovery.test.lua b/test/replication/gh-5874-qsync-txn-recovery.test.lua
index f35eb68de..6ddf164ac 100644
--- a/test/replication/gh-5874-qsync-txn-recovery.test.lua
+++ b/test/replication/gh-5874-qsync-txn-recovery.test.lua
@@ -12,6 +12,7 @@ async = box.schema.create_space('async', {engine = engine})
_ = async:create_index('pk')
sync = box.schema.create_space('sync', {is_sync = true, engine = engine})
_ = sync:create_index('pk')
+box.ctl.promote()
-- The transaction fails, but is written to the log anyway.
box.begin() async:insert{1} sync:insert{1} box.commit()
@@ -82,3 +83,4 @@ loc:select()
async:drop()
sync:drop()
loc:drop()
+box.ctl.demote()
diff --git a/test/replication/gh-6032-promote-wal-write.result b/test/replication/gh-6032-promote-wal-write.result
index 246c7974f..03112fb8d 100644
--- a/test/replication/gh-6032-promote-wal-write.result
+++ b/test/replication/gh-6032-promote-wal-write.result
@@ -67,3 +67,6 @@ box.cfg{\
box.space.sync:drop()
| ---
| ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-6032-promote-wal-write.test.lua b/test/replication/gh-6032-promote-wal-write.test.lua
index 8c1859083..9a036a8b4 100644
--- a/test/replication/gh-6032-promote-wal-write.test.lua
+++ b/test/replication/gh-6032-promote-wal-write.test.lua
@@ -26,3 +26,4 @@ box.cfg{\
replication_synchro_timeout = replication_synchro_timeout,\
}
box.space.sync:drop()
+box.ctl.demote()
diff --git a/test/replication/gh-6034-limbo-ownership.result b/test/replication/gh-6034-limbo-ownership.result
new file mode 100644
index 000000000..3681df3d8
--- /dev/null
+++ b/test/replication/gh-6034-limbo-ownership.result
@@ -0,0 +1,186 @@
+-- test-run result file version 2
+test_run = require('test_run').new()
+ | ---
+ | ...
+fiber = require('fiber')
+ | ---
+ | ...
+
+--
+-- gh-6034: test that transactional limbo isn't accessible without a promotion.
+--
+synchro_quorum = box.cfg.replication_synchro_quorum
+ | ---
+ | ...
+election_mode = box.cfg.election_mode
+ | ---
+ | ...
+box.cfg{replication_synchro_quorum = 1, election_mode='off'}
+ | ---
+ | ...
+
+_ = box.schema.space.create('async'):create_index('pk')
+ | ---
+ | ...
+_ = box.schema.space.create('sync', {is_sync=true}):create_index('pk')
+ | ---
+ | ...
+
+-- Limbo is initially unclaimed, everyone is writeable.
+assert(not box.info.ro)
+ | ---
+ | - true
+ | ...
+assert(box.info.synchro.queue.owner == 0)
+ | ---
+ | - true
+ | ...
+box.space.async:insert{1} -- success.
+ | ---
+ | - [1]
+ | ...
+-- Synchro spaces aren't writeable
+box.space.sync:insert{1} -- error.
+ | ---
+ | - error: The synchronous transaction queue doesn't belong to any instance
+ | ...
+
+box.ctl.promote()
+ | ---
+ | ...
+assert(not box.info.ro)
+ | ---
+ | - true
+ | ...
+assert(box.info.synchro.queue.owner == box.info.id)
+ | ---
+ | - true
+ | ...
+box.space.sync:insert{1} -- success.
+ | ---
+ | - [1]
+ | ...
+
+-- Everyone but the limbo owner is read-only.
+box.schema.user.grant('guest', 'replication')
+ | ---
+ | ...
+test_run:cmd('create server replica with rpl_master=default,\
+ script="replication/replica.lua"')
+ | ---
+ | - true
+ | ...
+test_run:cmd('start server replica with wait=True, wait_load=True')
+ | ---
+ | - true
+ | ...
+test_run:cmd('set variable rpl_listen to "replica.listen"')
+ | ---
+ | - true
+ | ...
+orig_replication = box.cfg.replication
+ | ---
+ | ...
+box.cfg{replication={box.info.listen, rpl_listen}}
+ | ---
+ | ...
+
+test_run:switch('replica')
+ | ---
+ | - true
+ | ...
+assert(box.info.ro)
+ | ---
+ | - true
+ | ...
+assert(box.info.synchro.queue.owner == test_run:eval('default', 'return box.info.id')[1])
+ | ---
+ | - true
+ | ...
+box.space.async:insert{2} -- failure.
+ | ---
+ | - error: Can't modify data because this instance is in read-only mode.
+ | ...
+
+-- Promotion on the other node. Default should become ro.
+box.ctl.promote()
+ | ---
+ | ...
+assert(not box.info.ro)
+ | ---
+ | - true
+ | ...
+assert(box.info.synchro.queue.owner == box.info.id)
+ | ---
+ | - true
+ | ...
+box.space.sync:insert{2} -- success.
+ | ---
+ | - [2]
+ | ...
+
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+assert(box.info.ro)
+ | ---
+ | - true
+ | ...
+assert(box.info.synchro.queue.owner == test_run:eval('replica', 'return box.info.id')[1])
+ | ---
+ | - true
+ | ...
+box.space.sync:insert{3} -- failure.
+ | ---
+ | - error: Can't modify data because this instance is in read-only mode.
+ | ...
+
+box.ctl.demote()
+ | ---
+ | ...
+assert(not box.info.ro)
+ | ---
+ | - true
+ | ...
+box.space.sync:insert{3} -- still fails.
+ | ---
+ | - error: The synchronous transaction queue doesn't belong to any instance
+ | ...
+assert(box.info.synchro.queue.owner == 0)
+ | ---
+ | - true
+ | ...
+box.space.async:insert{3} -- success.
+ | ---
+ | - [3]
+ | ...
+
+-- Cleanup.
+box.ctl.demote()
+ | ---
+ | ...
+test_run:cmd('stop server replica')
+ | ---
+ | - true
+ | ...
+test_run:cmd('delete server replica')
+ | ---
+ | - true
+ | ...
+box.schema.user.revoke('guest', 'replication')
+ | ---
+ | ...
+box.space.sync:drop()
+ | ---
+ | ...
+box.space.async:drop()
+ | ---
+ | ...
+box.cfg{\
+ replication_synchro_quorum = synchro_quorum,\
+ election_mode = election_mode,\
+ replication = orig_replication,\
+}
+ | ---
+ | ...
diff --git a/test/replication/gh-6034-limbo-ownership.test.lua b/test/replication/gh-6034-limbo-ownership.test.lua
new file mode 100644
index 000000000..0e1586566
--- /dev/null
+++ b/test/replication/gh-6034-limbo-ownership.test.lua
@@ -0,0 +1,68 @@
+test_run = require('test_run').new()
+fiber = require('fiber')
+
+--
+-- gh-6034: test that transactional limbo isn't accessible without a promotion.
+--
+synchro_quorum = box.cfg.replication_synchro_quorum
+election_mode = box.cfg.election_mode
+box.cfg{replication_synchro_quorum = 1, election_mode='off'}
+
+_ = box.schema.space.create('async'):create_index('pk')
+_ = box.schema.space.create('sync', {is_sync=true}):create_index('pk')
+
+-- Limbo is initially unclaimed, everyone is writeable.
+assert(not box.info.ro)
+assert(box.info.synchro.queue.owner == 0)
+box.space.async:insert{1} -- success.
+-- Synchro spaces aren't writeable
+box.space.sync:insert{1} -- error.
+
+box.ctl.promote()
+assert(not box.info.ro)
+assert(box.info.synchro.queue.owner == box.info.id)
+box.space.sync:insert{1} -- success.
+
+-- Everyone but the limbo owner is read-only.
+box.schema.user.grant('guest', 'replication')
+test_run:cmd('create server replica with rpl_master=default,\
+ script="replication/replica.lua"')
+test_run:cmd('start server replica with wait=True, wait_load=True')
+test_run:cmd('set variable rpl_listen to "replica.listen"')
+orig_replication = box.cfg.replication
+box.cfg{replication={box.info.listen, rpl_listen}}
+
+test_run:switch('replica')
+assert(box.info.ro)
+assert(box.info.synchro.queue.owner == test_run:eval('default', 'return box.info.id')[1])
+box.space.async:insert{2} -- failure.
+
+-- Promotion on the other node. Default should become ro.
+box.ctl.promote()
+assert(not box.info.ro)
+assert(box.info.synchro.queue.owner == box.info.id)
+box.space.sync:insert{2} -- success.
+
+test_run:switch('default')
+assert(box.info.ro)
+assert(box.info.synchro.queue.owner == test_run:eval('replica', 'return box.info.id')[1])
+box.space.sync:insert{3} -- failure.
+
+box.ctl.demote()
+assert(not box.info.ro)
+box.space.sync:insert{3} -- still fails.
+assert(box.info.synchro.queue.owner == 0)
+box.space.async:insert{3} -- success.
+
+-- Cleanup.
+box.ctl.demote()
+test_run:cmd('stop server replica')
+test_run:cmd('delete server replica')
+box.schema.user.revoke('guest', 'replication')
+box.space.sync:drop()
+box.space.async:drop()
+box.cfg{\
+ replication_synchro_quorum = synchro_quorum,\
+ election_mode = election_mode,\
+ replication = orig_replication,\
+}
diff --git a/test/replication/gh-6057-qsync-confirm-async-no-wal.result b/test/replication/gh-6057-qsync-confirm-async-no-wal.result
index 23c77729b..e7beefb2a 100644
--- a/test/replication/gh-6057-qsync-confirm-async-no-wal.result
+++ b/test/replication/gh-6057-qsync-confirm-async-no-wal.result
@@ -40,6 +40,10 @@ _ = s2:create_index('pk')
| ---
| ...
+box.ctl.promote()
+ | ---
+ | ...
+
errinj = box.error.injection
| ---
| ...
@@ -161,3 +165,6 @@ box.cfg{
}
| ---
| ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-6057-qsync-confirm-async-no-wal.test.lua b/test/replication/gh-6057-qsync-confirm-async-no-wal.test.lua
index a11ddc042..bb459ea02 100644
--- a/test/replication/gh-6057-qsync-confirm-async-no-wal.test.lua
+++ b/test/replication/gh-6057-qsync-confirm-async-no-wal.test.lua
@@ -21,6 +21,8 @@ _ = s:create_index('pk')
s2 = box.schema.create_space('test2')
_ = s2:create_index('pk')
+box.ctl.promote()
+
errinj = box.error.injection
function create_hanging_async_after_confirm(sync_key, async_key1, async_key2) \
@@ -86,3 +88,4 @@ box.cfg{
replication_synchro_quorum = old_synchro_quorum, \
replication_synchro_timeout = old_synchro_timeout, \
}
+box.ctl.demote()
diff --git a/test/replication/hang_on_synchro_fail.result b/test/replication/hang_on_synchro_fail.result
index 9f6fac00b..dda15af20 100644
--- a/test/replication/hang_on_synchro_fail.result
+++ b/test/replication/hang_on_synchro_fail.result
@@ -19,6 +19,9 @@ _ = box.schema.space.create('sync', {is_sync=true})
_ = box.space.sync:create_index('pk')
| ---
| ...
+box.ctl.promote()
+ | ---
+ | ...
old_synchro_quorum = box.cfg.replication_synchro_quorum
| ---
@@ -127,4 +130,7 @@ box.space.sync:drop()
box.schema.user.revoke('guest', 'replication')
| ---
| ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/hang_on_synchro_fail.test.lua b/test/replication/hang_on_synchro_fail.test.lua
index 6c3b09fab..f0d494eae 100644
--- a/test/replication/hang_on_synchro_fail.test.lua
+++ b/test/replication/hang_on_synchro_fail.test.lua
@@ -8,6 +8,7 @@ box.schema.user.grant('guest', 'replication')
_ = box.schema.space.create('sync', {is_sync=true})
_ = box.space.sync:create_index('pk')
+box.ctl.promote()
old_synchro_quorum = box.cfg.replication_synchro_quorum
box.cfg{replication_synchro_quorum=3}
@@ -54,4 +55,5 @@ box.cfg{replication_synchro_quorum=old_synchro_quorum,\
replication_synchro_timeout=old_synchro_timeout}
box.space.sync:drop()
box.schema.user.revoke('guest', 'replication')
+box.ctl.demote()
diff --git a/test/replication/qsync_advanced.result b/test/replication/qsync_advanced.result
index 94b19b1f2..72ac0c326 100644
--- a/test/replication/qsync_advanced.result
+++ b/test/replication/qsync_advanced.result
@@ -72,6 +72,9 @@ _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
_ = box.space.sync:create_index('pk')
| ---
| ...
+box.ctl.promote()
+ | ---
+ | ...
-- Testcase body.
box.space.sync:insert{1} -- success
| ---
@@ -468,6 +471,9 @@ box.space.sync:select{} -- 1
box.cfg{read_only=false} -- promote replica to master
| ---
| ...
+box.ctl.promote()
+ | ---
+ | ...
test_run:switch('default')
| ---
| - true
@@ -508,6 +514,9 @@ test_run:switch('default')
box.cfg{read_only=false}
| ---
| ...
+box.ctl.promote()
+ | ---
+ | ...
test_run:switch('replica')
| ---
| - true
@@ -781,3 +790,6 @@ box.cfg{
}
| ---
| ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/qsync_advanced.test.lua b/test/replication/qsync_advanced.test.lua
index 058ece602..37c285b8d 100644
--- a/test/replication/qsync_advanced.test.lua
+++ b/test/replication/qsync_advanced.test.lua
@@ -30,6 +30,7 @@ test_run:switch('default')
box.cfg{replication_synchro_quorum=NUM_INSTANCES, replication_synchro_timeout=1000}
_ = box.schema.space.create('sync', {is_sync=true, engine=engine})
_ = box.space.sync:create_index('pk')
+box.ctl.promote()
-- Testcase body.
box.space.sync:insert{1} -- success
test_run:cmd('switch replica')
@@ -170,6 +171,7 @@ box.space.sync:select{} -- 1
test_run:switch('replica')
box.space.sync:select{} -- 1
box.cfg{read_only=false} -- promote replica to master
+box.ctl.promote()
test_run:switch('default')
box.cfg{read_only=true} -- demote master to replica
test_run:switch('replica')
@@ -181,6 +183,7 @@ box.space.sync:select{} -- 1, 2
-- Revert cluster configuration.
test_run:switch('default')
box.cfg{read_only=false}
+box.ctl.promote()
test_run:switch('replica')
box.cfg{read_only=true}
-- Testcase cleanup.
@@ -279,3 +282,4 @@ box.cfg{
replication_synchro_quorum = orig_synchro_quorum, \
replication_synchro_timeout = orig_synchro_timeout, \
}
+box.ctl.demote()
diff --git a/test/replication/qsync_basic.result b/test/replication/qsync_basic.result
index 7e711ba13..bbdfc42fe 100644
--- a/test/replication/qsync_basic.result
+++ b/test/replication/qsync_basic.result
@@ -14,6 +14,9 @@ s1.is_sync
pk = s1:create_index('pk')
| ---
| ...
+box.ctl.promote()
+ | ---
+ | ...
box.begin() s1:insert({1}) s1:insert({2}) box.commit()
| ---
| ...
@@ -645,19 +648,12 @@ test_run:switch('default')
| ---
| - true
| ...
-box.cfg{replication_synchro_quorum = 3, replication_synchro_timeout = 1000}
- | ---
- | ...
-f = fiber.create(function() box.space.sync:replace{1} end)
+box.ctl.demote()
| ---
| ...
-test_run:wait_lsn('replica', 'default')
+box.space.sync:replace{1}
| ---
- | ...
-
-test_run:switch('replica')
- | ---
- | - true
+ | - error: The synchronous transaction queue doesn't belong to any instance
| ...
function skip_row() return nil end
| ---
@@ -674,26 +670,22 @@ box.space.sync:replace{2}
box.space.sync:before_replace(nil, skip_row)
| ---
| ...
-assert(box.space.sync:get{2} == nil)
+assert(box.space.sync:get{1} == nil)
| ---
| - true
| ...
-assert(box.space.sync:get{1} ~= nil)
+assert(box.space.sync:get{2} == nil)
| ---
| - true
| ...
-
-test_run:switch('default')
+assert(box.info.lsn == old_lsn + 1)
| ---
| - true
| ...
-box.cfg{replication_synchro_quorum = 2}
+box.ctl.promote()
| ---
| ...
-test_run:wait_cond(function() return f:status() == 'dead' end)
- | ---
- | - true
- | ...
+
box.space.sync:truncate()
| ---
| ...
@@ -758,3 +750,6 @@ box.space.sync:drop()
box.schema.user.revoke('guest', 'replication')
| ---
| ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/qsync_basic.test.lua b/test/replication/qsync_basic.test.lua
index 75c9b222b..eac465e25 100644
--- a/test/replication/qsync_basic.test.lua
+++ b/test/replication/qsync_basic.test.lua
@@ -6,6 +6,7 @@
s1 = box.schema.create_space('test1', {is_sync = true})
s1.is_sync
pk = s1:create_index('pk')
+box.ctl.promote()
box.begin() s1:insert({1}) s1:insert({2}) box.commit()
s1:select{}
@@ -253,22 +254,18 @@ box.space.sync:count()
-- instances, but also works for local rows.
--
test_run:switch('default')
-box.cfg{replication_synchro_quorum = 3, replication_synchro_timeout = 1000}
-f = fiber.create(function() box.space.sync:replace{1} end)
-test_run:wait_lsn('replica', 'default')
-
-test_run:switch('replica')
+box.ctl.demote()
+box.space.sync:replace{1}
function skip_row() return nil end
old_lsn = box.info.lsn
_ = box.space.sync:before_replace(skip_row)
box.space.sync:replace{2}
box.space.sync:before_replace(nil, skip_row)
+assert(box.space.sync:get{1} == nil)
assert(box.space.sync:get{2} == nil)
-assert(box.space.sync:get{1} ~= nil)
+assert(box.info.lsn == old_lsn + 1)
+box.ctl.promote()
-test_run:switch('default')
-box.cfg{replication_synchro_quorum = 2}
-test_run:wait_cond(function() return f:status() == 'dead' end)
box.space.sync:truncate()
--
@@ -301,3 +298,4 @@ test_run:cmd('delete server replica')
box.space.test:drop()
box.space.sync:drop()
box.schema.user.revoke('guest', 'replication')
+box.ctl.demote()
diff --git a/test/replication/qsync_errinj.result b/test/replication/qsync_errinj.result
index 635bcf939..cf1e30a90 100644
--- a/test/replication/qsync_errinj.result
+++ b/test/replication/qsync_errinj.result
@@ -35,6 +35,9 @@ _ = box.schema.space.create('sync', {is_sync = true, engine = engine})
_ = box.space.sync:create_index('pk')
| ---
| ...
+box.ctl.promote()
+ | ---
+ | ...
--
-- gh-5100: slow ACK sending shouldn't stun replica for the
@@ -542,3 +545,6 @@ box.space.sync:drop()
box.schema.user.revoke('guest', 'super')
| ---
| ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/qsync_errinj.test.lua b/test/replication/qsync_errinj.test.lua
index 6a9fd3e1a..e7c85c58c 100644
--- a/test/replication/qsync_errinj.test.lua
+++ b/test/replication/qsync_errinj.test.lua
@@ -12,6 +12,7 @@ test_run:cmd('start server replica with wait=True, wait_load=True')
_ = box.schema.space.create('sync', {is_sync = true, engine = engine})
_ = box.space.sync:create_index('pk')
+box.ctl.promote()
--
-- gh-5100: slow ACK sending shouldn't stun replica for the
@@ -222,3 +223,4 @@ test_run:cmd('delete server replica')
box.space.sync:drop()
box.schema.user.revoke('guest', 'super')
+box.ctl.demote()
diff --git a/test/replication/qsync_snapshots.result b/test/replication/qsync_snapshots.result
index cafdd63c8..ca418b168 100644
--- a/test/replication/qsync_snapshots.result
+++ b/test/replication/qsync_snapshots.result
@@ -57,6 +57,9 @@ _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
_ = box.space.sync:create_index('pk')
| ---
| ...
+box.ctl.promote()
+ | ---
+ | ...
-- Testcase body.
box.space.sync:insert{1}
| ---
@@ -299,3 +302,6 @@ box.cfg{
}
| ---
| ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/qsync_snapshots.test.lua b/test/replication/qsync_snapshots.test.lua
index 590610974..82c2e3f7c 100644
--- a/test/replication/qsync_snapshots.test.lua
+++ b/test/replication/qsync_snapshots.test.lua
@@ -23,6 +23,7 @@ test_run:switch('default')
box.cfg{replication_synchro_quorum=NUM_INSTANCES, replication_synchro_timeout=1000}
_ = box.schema.space.create('sync', {is_sync=true, engine=engine})
_ = box.space.sync:create_index('pk')
+box.ctl.promote()
-- Testcase body.
box.space.sync:insert{1}
box.space.sync:select{} -- 1
@@ -130,3 +131,4 @@ box.cfg{
replication_synchro_quorum = orig_synchro_quorum, \
replication_synchro_timeout = orig_synchro_timeout, \
}
+box.ctl.demote()
diff --git a/test/replication/qsync_with_anon.result b/test/replication/qsync_with_anon.result
index 6a2952a32..99c6fb902 100644
--- a/test/replication/qsync_with_anon.result
+++ b/test/replication/qsync_with_anon.result
@@ -57,6 +57,9 @@ _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
_ = box.space.sync:create_index('pk')
| ---
| ...
+box.ctl.promote()
+ | ---
+ | ...
-- Testcase body.
test_run:switch('default')
| ---
@@ -220,6 +223,9 @@ box.cfg{
}
| ---
| ...
+box.ctl.demote()
+ | ---
+ | ...
test_run:cleanup_cluster()
| ---
| ...
diff --git a/test/replication/qsync_with_anon.test.lua b/test/replication/qsync_with_anon.test.lua
index d7ecaa107..e73880ec7 100644
--- a/test/replication/qsync_with_anon.test.lua
+++ b/test/replication/qsync_with_anon.test.lua
@@ -22,6 +22,7 @@ test_run:switch('default')
box.cfg{replication_synchro_quorum=NUM_INSTANCES, replication_synchro_timeout=1000}
_ = box.schema.space.create('sync', {is_sync=true, engine=engine})
_ = box.space.sync:create_index('pk')
+box.ctl.promote()
-- Testcase body.
test_run:switch('default')
box.space.sync:insert{1} -- success
@@ -81,4 +82,5 @@ box.cfg{
replication_synchro_quorum = orig_synchro_quorum, \
replication_synchro_timeout = orig_synchro_timeout, \
}
+box.ctl.demote()
test_run:cleanup_cluster()
diff --git a/test/replication/suite.cfg b/test/replication/suite.cfg
index 4fc6643e4..c2bbb5aa9 100644
--- a/test/replication/suite.cfg
+++ b/test/replication/suite.cfg
@@ -48,6 +48,7 @@
"gh-5613-bootstrap-prefer-booted.test.lua": {},
"gh-6027-applier-error-show.test.lua": {},
"gh-6032-promote-wal-write.test.lua": {},
+ "gh-6034-limbo-ownership.test.lua": {},
"gh-6034-promote-bump-term.test.lua": {},
"gh-6057-qsync-confirm-async-no-wal.test.lua": {},
"gh-6094-rs-uuid-mismatch.test.lua": {},
--
2.30.1 (Apple Git-130)
* [Tarantool-patches] [PATCH v2 8/8] replication: send latest effective promote in initial join
2021-06-17 21:07 [Tarantool-patches] [PATCH v2 0/8] forbid implicit limbo ownership transition Serge Petrenko via Tarantool-patches
` (6 preceding siblings ...)
2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 7/8] txn_limbo: persist the latest effective promote in snapshot Serge Petrenko via Tarantool-patches
@ 2021-06-17 21:07 ` Serge Petrenko via Tarantool-patches
2021-06-18 13:02 ` [Tarantool-patches] [PATCH v2 0/8] forbid implicit limbo ownership transition Cyrill Gorcunov via Tarantool-patches
2021-06-21 12:11 ` [Tarantool-patches] [PATCH v2 9/8] replication: send current Raft term in join response Serge Petrenko via Tarantool-patches
9 siblings, 0 replies; 16+ messages in thread
From: Serge Petrenko via Tarantool-patches @ 2021-06-17 21:07 UTC (permalink / raw)
To: v.shpilevoy, gorcunov; +Cc: tarantool-patches
A joining instance may never receive the latest PROMOTE request, which
is the only source of information about the limbo owner. Send out the
latest limbo state (i.e. the latest applied PROMOTE request) together
with the initial join snapshot.
Follow-up #6034
---
src/box/applier.cc | 5 ++
src/box/relay.cc | 15 +++++
test/replication/replica_rejoin.result | 77 ++++++++++++++----------
test/replication/replica_rejoin.test.lua | 50 +++++++--------
4 files changed, 92 insertions(+), 55 deletions(-)
diff --git a/src/box/applier.cc b/src/box/applier.cc
index 10cea26a7..9d90a2384 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -458,6 +458,11 @@ applier_wait_snapshot(struct applier *applier)
xrow_decode_vclock_xc(&row, &replicaset.vclock);
}
break; /* end of stream */
+ } else if (iproto_type_is_promote_request(row.type)) {
+ struct synchro_request req;
+ if (xrow_decode_synchro(&row, &req) != 0)
+ diag_raise();
+ txn_limbo_process(&txn_limbo, &req);
} else if (iproto_type_is_error(row.type)) {
xrow_decode_error_xc(&row); /* rethrow error */
} else {
diff --git a/src/box/relay.cc b/src/box/relay.cc
index b47767769..e05b53d5d 100644
--- a/src/box/relay.cc
+++ b/src/box/relay.cc
@@ -399,12 +399,27 @@ relay_initial_join(int fd, uint64_t sync, struct vclock *vclock)
if (txn_limbo_wait_confirm(&txn_limbo) != 0)
diag_raise();
+ struct synchro_request req;
+ txn_limbo_checkpoint(&txn_limbo, &req);
+
/* Respond to the JOIN request with the current vclock. */
struct xrow_header row;
xrow_encode_vclock_xc(&row, vclock);
row.sync = sync;
coio_write_xrow(&relay->io, &row);
+ /*
+ * Send out the latest limbo state. Don't do that when limbo is unused,
+ * let the old instances join without trouble.
+ */
+ if (req.replica_id != REPLICA_ID_NIL) {
+ char body[XROW_SYNCHRO_BODY_LEN_MAX];
+ xrow_encode_synchro(&row, body, &req);
+ row.replica_id = req.replica_id;
+ row.sync = sync;
+ coio_write_xrow(&relay->io, &row);
+ }
+
/* Send read view to the replica. */
engine_join_xc(&ctx, &relay->stream);
}
diff --git a/test/replication/replica_rejoin.result b/test/replication/replica_rejoin.result
index 843333a19..e489c150a 100644
--- a/test/replication/replica_rejoin.result
+++ b/test/replication/replica_rejoin.result
@@ -7,10 +7,19 @@ test_run = env.new()
log = require('log')
---
...
-engine = test_run:get_cfg('engine')
+test_run:cmd("create server master with script='replication/master1.lua'")
---
+- true
...
-test_run:cleanup_cluster()
+test_run:cmd("start server master")
+---
+- true
+...
+test_run:switch("master")
+---
+- true
+...
+engine = test_run:get_cfg('engine')
---
...
--
@@ -43,7 +52,7 @@ _ = box.space.test:insert{3}
---
...
-- Join a replica, then stop it.
-test_run:cmd("create server replica with rpl_master=default, script='replication/replica_rejoin.lua'")
+test_run:cmd("create server replica with rpl_master=master, script='replication/replica_rejoin.lua'")
---
- true
...
@@ -65,7 +74,7 @@ box.space.test:select()
- [2]
- [3]
...
-test_run:cmd("switch default")
+test_run:cmd("switch master")
---
- true
...
@@ -75,7 +84,7 @@ test_run:cmd("stop server replica")
...
-- Restart the server to purge the replica from
-- the garbage collection state.
-test_run:cmd("restart server default")
+test_run:cmd("restart server master")
box.cfg{wal_cleanup_delay = 0}
---
...
@@ -146,7 +155,7 @@ box.space.test:select()
- [20]
- [30]
...
-test_run:cmd("switch default")
+test_run:cmd("switch master")
---
- true
...
@@ -154,7 +163,7 @@ test_run:cmd("switch default")
for i = 10, 30, 10 do box.space.test:update(i, {{'!', 1, i}}) end
---
...
-vclock = test_run:get_vclock('default')
+vclock = test_run:get_vclock('master')
---
...
vclock[0] = nil
@@ -191,7 +200,7 @@ box.space.test:replace{1, 2, 3} -- bumps LSN on the replica
---
- [1, 2, 3]
...
-test_run:cmd("switch default")
+test_run:cmd("switch master")
---
- true
...
@@ -199,7 +208,7 @@ test_run:cmd("stop server replica")
---
- true
...
-test_run:cmd("restart server default")
+test_run:cmd("restart server master")
box.cfg{wal_cleanup_delay = 0}
---
...
@@ -253,7 +262,7 @@ box.space.test:select()
-- from the replica.
--
-- Bootstrap a new replica.
-test_run:cmd("switch default")
+test_run:cmd("switch master")
---
- true
...
@@ -295,7 +304,7 @@ box.cfg{replication = ''}
---
...
-- Bump vclock on the master.
-test_run:cmd("switch default")
+test_run:cmd("switch master")
---
- true
...
@@ -317,15 +326,15 @@ vclock = test_run:get_vclock('replica')
vclock[0] = nil
---
...
-_ = test_run:wait_vclock('default', vclock)
+_ = test_run:wait_vclock('master', vclock)
---
...
-- Restart the master and force garbage collection.
-test_run:cmd("switch default")
+test_run:cmd("switch master")
---
- true
...
-test_run:cmd("restart server default")
+test_run:cmd("restart server master")
box.cfg{wal_cleanup_delay = 0}
---
...
@@ -373,7 +382,7 @@ vclock = test_run:get_vclock('replica')
vclock[0] = nil
---
...
-_ = test_run:wait_vclock('default', vclock)
+_ = test_run:wait_vclock('master', vclock)
---
...
-- Restart the replica. It should successfully rebootstrap.
@@ -396,38 +405,42 @@ test_run:cmd("switch default")
---
- true
...
-box.cfg{replication = ''}
+test_run:cmd("stop server replica")
---
+- true
...
-test_run:cmd("stop server replica")
+test_run:cmd("delete server replica")
---
- true
...
-test_run:cmd("cleanup server replica")
+test_run:cmd("stop server master")
---
- true
...
-test_run:cmd("delete server replica")
+test_run:cmd("delete server master")
---
- true
...
-test_run:cleanup_cluster()
+--
+-- gh-4107: rebootstrap fails if the replica was deleted from
+-- the cluster on the master.
+--
+test_run:cmd("create server master with script='replication/master1.lua'")
---
+- true
...
-box.space.test:drop()
+test_run:cmd("start server master")
---
+- true
...
-box.schema.user.revoke('guest', 'replication')
+test_run:switch("master")
---
+- true
...
---
--- gh-4107: rebootstrap fails if the replica was deleted from
--- the cluster on the master.
---
box.schema.user.grant('guest', 'replication')
---
...
-test_run:cmd("create server replica with rpl_master=default, script='replication/replica_uuid.lua'")
+test_run:cmd("create server replica with rpl_master=master, script='replication/replica_uuid.lua'")
---
- true
...
@@ -462,11 +475,11 @@ box.space._cluster:get(2) ~= nil
---
- true
...
-test_run:cmd("stop server replica")
+test_run:switch("default")
---
- true
...
-test_run:cmd("cleanup server replica")
+test_run:cmd("stop server replica")
---
- true
...
@@ -474,9 +487,11 @@ test_run:cmd("delete server replica")
---
- true
...
-box.schema.user.revoke('guest', 'replication')
+test_run:cmd("stop server master")
---
+- true
...
-test_run:cleanup_cluster()
+test_run:cmd("delete server master")
---
+- true
...
diff --git a/test/replication/replica_rejoin.test.lua b/test/replication/replica_rejoin.test.lua
index c3ba9bf3f..2563177cf 100644
--- a/test/replication/replica_rejoin.test.lua
+++ b/test/replication/replica_rejoin.test.lua
@@ -1,9 +1,11 @@
env = require('test_run')
test_run = env.new()
log = require('log')
-engine = test_run:get_cfg('engine')
-test_run:cleanup_cluster()
+test_run:cmd("create server master with script='replication/master1.lua'")
+test_run:cmd("start server master")
+test_run:switch("master")
+engine = test_run:get_cfg('engine')
--
-- gh-5806: this replica_rejoin test relies on the wal cleanup fiber
@@ -23,17 +25,17 @@ _ = box.space.test:insert{2}
_ = box.space.test:insert{3}
-- Join a replica, then stop it.
-test_run:cmd("create server replica with rpl_master=default, script='replication/replica_rejoin.lua'")
+test_run:cmd("create server replica with rpl_master=master, script='replication/replica_rejoin.lua'")
test_run:cmd("start server replica")
test_run:cmd("switch replica")
box.info.replication[1].upstream.status == 'follow' or log.error(box.info)
box.space.test:select()
-test_run:cmd("switch default")
+test_run:cmd("switch master")
test_run:cmd("stop server replica")
-- Restart the server to purge the replica from
-- the garbage collection state.
-test_run:cmd("restart server default")
+test_run:cmd("restart server master")
box.cfg{wal_cleanup_delay = 0}
-- Make some checkpoints to remove old xlogs.
@@ -58,11 +60,11 @@ box.info.replication[2].downstream.vclock ~= nil or log.error(box.info)
test_run:cmd("switch replica")
box.info.replication[1].upstream.status == 'follow' or log.error(box.info)
box.space.test:select()
-test_run:cmd("switch default")
+test_run:cmd("switch master")
-- Make sure the replica follows new changes.
for i = 10, 30, 10 do box.space.test:update(i, {{'!', 1, i}}) end
-vclock = test_run:get_vclock('default')
+vclock = test_run:get_vclock('master')
vclock[0] = nil
_ = test_run:wait_vclock('replica', vclock)
test_run:cmd("switch replica")
@@ -76,9 +78,9 @@ box.space.test:select()
-- Check that rebootstrap is NOT initiated unless the replica
-- is strictly behind the master.
box.space.test:replace{1, 2, 3} -- bumps LSN on the replica
-test_run:cmd("switch default")
+test_run:cmd("switch master")
test_run:cmd("stop server replica")
-test_run:cmd("restart server default")
+test_run:cmd("restart server master")
box.cfg{wal_cleanup_delay = 0}
checkpoint_count = box.cfg.checkpoint_count
box.cfg{checkpoint_count = 1}
@@ -99,7 +101,7 @@ box.space.test:select()
--
-- Bootstrap a new replica.
-test_run:cmd("switch default")
+test_run:cmd("switch master")
test_run:cmd("stop server replica")
test_run:cmd("cleanup server replica")
test_run:cleanup_cluster()
@@ -113,17 +115,17 @@ box.cfg{replication = replica_listen}
test_run:cmd("switch replica")
box.cfg{replication = ''}
-- Bump vclock on the master.
-test_run:cmd("switch default")
+test_run:cmd("switch master")
box.space.test:replace{1}
-- Bump vclock on the replica.
test_run:cmd("switch replica")
for i = 1, 10 do box.space.test:replace{2} end
vclock = test_run:get_vclock('replica')
vclock[0] = nil
-_ = test_run:wait_vclock('default', vclock)
+_ = test_run:wait_vclock('master', vclock)
-- Restart the master and force garbage collection.
-test_run:cmd("switch default")
-test_run:cmd("restart server default")
+test_run:cmd("switch master")
+test_run:cmd("restart server master")
box.cfg{wal_cleanup_delay = 0}
replica_listen = test_run:cmd("eval replica 'return box.cfg.listen'")
replica_listen ~= nil
@@ -139,7 +141,7 @@ test_run:cmd("switch replica")
for i = 1, 10 do box.space.test:replace{2} end
vclock = test_run:get_vclock('replica')
vclock[0] = nil
-_ = test_run:wait_vclock('default', vclock)
+_ = test_run:wait_vclock('master', vclock)
-- Restart the replica. It should successfully rebootstrap.
test_run:cmd("restart server replica with args='true'")
box.space.test:select()
@@ -148,20 +150,20 @@ box.space.test:replace{2}
-- Cleanup.
test_run:cmd("switch default")
-box.cfg{replication = ''}
test_run:cmd("stop server replica")
-test_run:cmd("cleanup server replica")
test_run:cmd("delete server replica")
-test_run:cleanup_cluster()
-box.space.test:drop()
-box.schema.user.revoke('guest', 'replication')
+test_run:cmd("stop server master")
+test_run:cmd("delete server master")
--
-- gh-4107: rebootstrap fails if the replica was deleted from
-- the cluster on the master.
--
+test_run:cmd("create server master with script='replication/master1.lua'")
+test_run:cmd("start server master")
+test_run:switch("master")
box.schema.user.grant('guest', 'replication')
-test_run:cmd("create server replica with rpl_master=default, script='replication/replica_uuid.lua'")
+test_run:cmd("create server replica with rpl_master=master, script='replication/replica_uuid.lua'")
start_cmd = string.format("start server replica with args='%s'", require('uuid').new())
box.space._cluster:get(2) == nil
test_run:cmd(start_cmd)
@@ -170,8 +172,8 @@ test_run:cmd("cleanup server replica")
box.space._cluster:delete(2) ~= nil
test_run:cmd(start_cmd)
box.space._cluster:get(2) ~= nil
+test_run:switch("default")
test_run:cmd("stop server replica")
-test_run:cmd("cleanup server replica")
test_run:cmd("delete server replica")
-box.schema.user.revoke('guest', 'replication')
-test_run:cleanup_cluster()
+test_run:cmd("stop server master")
+test_run:cmd("delete server master")
--
2.30.1 (Apple Git-130)