Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH v2 0/8] forbid implicit limbo ownership transition
@ 2021-06-17 21:07 Serge Petrenko via Tarantool-patches
  2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 1/8] replication: always send raft state to subscribers Serge Petrenko via Tarantool-patches
                   ` (9 more replies)
  0 siblings, 10 replies; 16+ messages in thread
From: Serge Petrenko via Tarantool-patches @ 2021-06-17 21:07 UTC (permalink / raw)
  To: v.shpilevoy, gorcunov; +Cc: tarantool-patches

https://github.com/tarantool/tarantool/issues/5438
https://github.com/tarantool/tarantool/issues/6034
https://github.com/tarantool/tarantool/tree/sp/gh-6034-empty-limbo-transition

Changes in v2:
  - reorder the commits for a more logical sequence.
  - review fixes as per review from Vlad.
  - fix replica_rejoin test

Serge Petrenko (8):
  replication: always send raft state to subscribers
  txn_limbo: fix promote term filtering
  raft: refactor raft_new_term()
  box: make promote always bump the term
  replication: forbid implicit limbo owner transition
  box: introduce `box.ctl.demote`
  txn_limbo: persist the latest effective promote in snapshot
  replication: send latest effective promote in initial join

 src/box/applier.cc                            |   5 +
 src/box/box.cc                                |  49 +++--
 src/box/box.h                                 |   3 +
 src/box/errcode.h                             |   1 +
 src/box/iproto_constants.h                    |  10 +-
 src/box/lua/ctl.c                             |   9 +
 src/box/lua/info.c                            |   4 +-
 src/box/memtx_engine.c                        |  32 +++
 src/box/relay.cc                              |  19 +-
 src/box/txn_limbo.c                           |  98 +++++----
 src/box/txn_limbo.h                           |  14 ++
 src/lib/raft/raft.c                           |   3 +-
 test/box/alter.result                         |   2 +-
 test/box/error.result                         |   1 +
 test/replication/election_basic.result        |   3 +
 test/replication/election_basic.test.lua      |   1 +
 test/replication/election_qsync.result        |   3 +
 test/replication/election_qsync.test.lua      |   1 +
 .../gh-4114-local-space-replication.result    |   7 +-
 .../gh-4114-local-space-replication.test.lua  |   4 +-
 .../gh-5140-qsync-casc-rollback.result        |   6 +
 .../gh-5140-qsync-casc-rollback.test.lua      |   2 +
 .../gh-5144-qsync-dup-confirm.result          |   6 +
 .../gh-5144-qsync-dup-confirm.test.lua        |   2 +
 .../gh-5163-qsync-restart-crash.result        |   6 +
 .../gh-5163-qsync-restart-crash.test.lua      |   2 +
 .../gh-5167-qsync-rollback-snap.result        |   6 +
 .../gh-5167-qsync-rollback-snap.test.lua      |   2 +
 .../gh-5195-qsync-replica-write.result        |  10 +-
 .../gh-5195-qsync-replica-write.test.lua      |   6 +-
 .../gh-5213-qsync-applier-order-3.result      |   9 +
 .../gh-5213-qsync-applier-order-3.test.lua    |   3 +
 .../gh-5213-qsync-applier-order.result        |   6 +
 .../gh-5213-qsync-applier-order.test.lua      |   2 +
 .../replication/gh-5288-qsync-recovery.result |   6 +
 .../gh-5288-qsync-recovery.test.lua           |   2 +
 .../gh-5298-qsync-recovery-snap.result        |   6 +
 .../gh-5298-qsync-recovery-snap.test.lua      |   2 +
 .../gh-5426-election-on-off.result            |   3 +
 .../gh-5426-election-on-off.test.lua          |   1 +
 .../gh-5433-election-restart-recovery.result  |   3 +
 ...gh-5433-election-restart-recovery.test.lua |   1 +
 ...sync-clear-synchro-queue-commit-all.result |   3 +
 ...nc-clear-synchro-queue-commit-all.test.lua |   1 +
 test/replication/gh-5438-raft-state.result    |  63 ++++++
 test/replication/gh-5438-raft-state.test.lua  |  28 +++
 test/replication/gh-5440-qsync-ro.result      | 133 -------------
 test/replication/gh-5440-qsync-ro.test.lua    |  53 -----
 .../gh-5446-qsync-eval-quorum.result          |   7 +
 .../gh-5446-qsync-eval-quorum.test.lua        |   3 +
 .../gh-5506-election-on-off.result            |   3 +
 .../gh-5506-election-on-off.test.lua          |   1 +
 .../gh-5566-final-join-synchro.result         |   6 +
 .../gh-5566-final-join-synchro.test.lua       |   2 +
 .../gh-5874-qsync-txn-recovery.result         |   6 +
 .../gh-5874-qsync-txn-recovery.test.lua       |   2 +
 .../gh-6032-promote-wal-write.result          |   3 +
 .../gh-6032-promote-wal-write.test.lua        |   1 +
 .../gh-6034-limbo-ownership.result            | 186 ++++++++++++++++++
 .../gh-6034-limbo-ownership.test.lua          |  68 +++++++
 .../gh-6034-promote-bump-term.result          |  37 ++++
 .../gh-6034-promote-bump-term.test.lua        |  16 ++
 .../gh-6057-qsync-confirm-async-no-wal.result |   7 +
 ...h-6057-qsync-confirm-async-no-wal.test.lua |   3 +
 test/replication/hang_on_synchro_fail.result  |   6 +
 .../replication/hang_on_synchro_fail.test.lua |   2 +
 test/replication/qsync_advanced.result        |  12 ++
 test/replication/qsync_advanced.test.lua      |   4 +
 test/replication/qsync_basic.result           |  33 ++--
 test/replication/qsync_basic.test.lua         |  16 +-
 test/replication/qsync_errinj.result          |   6 +
 test/replication/qsync_errinj.test.lua        |   2 +
 test/replication/qsync_snapshots.result       |   6 +
 test/replication/qsync_snapshots.test.lua     |   2 +
 test/replication/qsync_with_anon.result       |   6 +
 test/replication/qsync_with_anon.test.lua     |   2 +
 test/replication/replica_rejoin.result        |  77 +++++---
 test/replication/replica_rejoin.test.lua      |  50 ++---
 test/replication/suite.cfg                    |   3 +
 test/unit/raft.c                              |  15 +-
 test/unit/raft.result                         |   3 +-
 81 files changed, 898 insertions(+), 340 deletions(-)
 create mode 100644 test/replication/gh-5438-raft-state.result
 create mode 100644 test/replication/gh-5438-raft-state.test.lua
 delete mode 100644 test/replication/gh-5440-qsync-ro.result
 delete mode 100644 test/replication/gh-5440-qsync-ro.test.lua
 create mode 100644 test/replication/gh-6034-limbo-ownership.result
 create mode 100644 test/replication/gh-6034-limbo-ownership.test.lua
 create mode 100644 test/replication/gh-6034-promote-bump-term.result
 create mode 100644 test/replication/gh-6034-promote-bump-term.test.lua

-- 
2.30.1 (Apple Git-130)


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Tarantool-patches] [PATCH v2 1/8] replication: always send raft state to subscribers
  2021-06-17 21:07 [Tarantool-patches] [PATCH v2 0/8] forbid implicit limbo ownership transition Serge Petrenko via Tarantool-patches
@ 2021-06-17 21:07 ` Serge Petrenko via Tarantool-patches
  2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 2/8] txn_limbo: fix promote term filtering Serge Petrenko via Tarantool-patches
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Serge Petrenko via Tarantool-patches @ 2021-06-17 21:07 UTC (permalink / raw)
  To: v.shpilevoy, gorcunov; +Cc: tarantool-patches

Tarantool used to send out raft state on subscribe only when raft was
enabled. This was a safeguard against partially-upgraded clusters, where
some nodes had no clue about Raft messages and couldn't handle them
properly.

Actually, Raft state should be sent out always. For example, promote
will be changed to bump Raft term even when Raft is disabled, and it's
important that everyone in cluster has the same term for the sake of promote
at least.

So, send out Raft state to every subscriber with version >= 2.6.0
(that's when Raft was introduced).
Do the same for Raft broadcasts. They should be sent only to replicas
with version >= 2.6.0

Closes #5438
---
 src/box/box.cc                               | 11 ++--
 src/box/relay.cc                             |  4 +-
 test/replication/gh-5438-raft-state.result   | 63 ++++++++++++++++++++
 test/replication/gh-5438-raft-state.test.lua | 28 +++++++++
 test/replication/suite.cfg                   |  1 +
 5 files changed, 100 insertions(+), 7 deletions(-)
 create mode 100644 test/replication/gh-5438-raft-state.result
 create mode 100644 test/replication/gh-5438-raft-state.test.lua

diff --git a/src/box/box.cc b/src/box/box.cc
index ab7d983c9..d6a3b4d97 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -82,6 +82,7 @@
 #include "msgpack.h"
 #include "raft.h"
 #include "trivia/util.h"
+#include "version.h"
 
 enum {
 	IPROTO_THREADS_MAX = 1000,
@@ -2836,13 +2837,13 @@ box_process_subscribe(struct ev_io *io, struct xrow_header *header)
 		 tt_uuid_str(&replica_uuid), sio_socketname(io->fd));
 	say_info("remote vclock %s local vclock %s",
 		 vclock_to_string(&replica_clock), vclock_to_string(&vclock));
-	if (raft_is_enabled(box_raft())) {
+	if (replica_version_id >= version_id(2, 6, 0) && !anon) {
 		/*
 		 * Send out the current raft state of the instance. Don't do
-		 * that if Raft is disabled. It can be that a part of the
-		 * cluster still contains old versions, which can't handle Raft
-		 * messages. So when it is disabled, its network footprint
-		 * should be 0.
+		 * that if the remote instance is old. It can be that a part of
+		 * the cluster still contains old versions, which can't handle
+		 * Raft messages. Raft's network footprint should be 0 as seen
+		 * by such instances.
 		 */
 		struct raft_request req;
 		box_raft_checkpoint_remote(&req);
diff --git a/src/box/relay.cc b/src/box/relay.cc
index b1571b361..b47767769 100644
--- a/src/box/relay.cc
+++ b/src/box/relay.cc
@@ -760,7 +760,7 @@ relay_subscribe_f(va_list ap)
 		  &relay->relay_pipe, NULL, NULL, cbus_process);
 
 	struct relay_is_raft_enabled_msg raft_enabler;
-	if (!relay->replica->anon)
+	if (!relay->replica->anon && relay->version_id >= version_id(2, 6, 0))
 		relay_send_is_raft_enabled(relay, &raft_enabler, true);
 
 	/*
@@ -842,7 +842,7 @@ relay_subscribe_f(va_list ap)
 		cpipe_push(&relay->tx_pipe, &relay->status_msg.msg);
 	}
 
-	if (!relay->replica->anon)
+	if (!relay->replica->anon && relay->version_id >= version_id(2, 6, 0))
 		relay_send_is_raft_enabled(relay, &raft_enabler, false);
 
 	/*
diff --git a/test/replication/gh-5438-raft-state.result b/test/replication/gh-5438-raft-state.result
new file mode 100644
index 000000000..6985f026a
--- /dev/null
+++ b/test/replication/gh-5438-raft-state.result
@@ -0,0 +1,63 @@
+-- test-run result file version 2
+test_run = require('test_run').new()
+ | ---
+ | ...
+
+--
+-- gh-5428 send out Raft state to subscribers, even when Raft is disabled.
+--
+-- Bump Raft term while the replica's offline.
+term = box.info.election.term
+ | ---
+ | ...
+old_election_mode = box.cfg.election_mode
+ | ---
+ | ...
+box.cfg{election_mode = 'candidate'}
+ | ---
+ | ...
+test_run:wait_cond(function() return box.info.election.term > term end)
+ | ---
+ | - true
+ | ...
+
+-- Make sure the replica receives new term on subscribe.
+box.cfg{election_mode = 'off'}
+ | ---
+ | ...
+
+box.schema.user.grant('guest', 'replication')
+ | ---
+ | ...
+test_run:cmd('create server replica with rpl_master=default,\
+                                         script="replication/replica.lua"')
+ | ---
+ | - true
+ | ...
+test_run:cmd('start server replica')
+ | ---
+ | - true
+ | ...
+test_run:wait_cond(function()\
+    return test_run:eval('replica', 'return box.info.election.term')[1] ==\
+           box.info.election.term\
+end)
+ | ---
+ | - true
+ | ...
+
+-- Cleanup.
+box.cfg{election_mode = old_election_mode}
+ | ---
+ | ...
+test_run:cmd('stop server replica')
+ | ---
+ | - true
+ | ...
+test_run:cmd('delete server replica')
+ | ---
+ | - true
+ | ...
+box.schema.user.revoke('guest', 'replication')
+ | ---
+ | ...
diff --git a/test/replication/gh-5438-raft-state.test.lua b/test/replication/gh-5438-raft-state.test.lua
new file mode 100644
index 000000000..60c3366c1
--- /dev/null
+++ b/test/replication/gh-5438-raft-state.test.lua
@@ -0,0 +1,28 @@
+test_run = require('test_run').new()
+
+--
+-- gh-5428 send out Raft state to subscribers, even when Raft is disabled.
+--
+-- Bump Raft term while the replica's offline.
+term = box.info.election.term
+old_election_mode = box.cfg.election_mode
+box.cfg{election_mode = 'candidate'}
+test_run:wait_cond(function() return box.info.election.term > term end)
+
+-- Make sure the replica receives new term on subscribe.
+box.cfg{election_mode = 'off'}
+
+box.schema.user.grant('guest', 'replication')
+test_run:cmd('create server replica with rpl_master=default,\
+                                         script="replication/replica.lua"')
+test_run:cmd('start server replica')
+test_run:wait_cond(function()\
+    return test_run:eval('replica', 'return box.info.election.term')[1] ==\
+           box.info.election.term\
+end)
+
+-- Cleanup.
+box.cfg{election_mode = old_election_mode}
+test_run:cmd('stop server replica')
+test_run:cmd('delete server replica')
+box.schema.user.revoke('guest', 'replication')
diff --git a/test/replication/suite.cfg b/test/replication/suite.cfg
index 3a0a8649f..b0f275526 100644
--- a/test/replication/suite.cfg
+++ b/test/replication/suite.cfg
@@ -19,6 +19,7 @@
     "gh-5213-qsync-applier-order-3.test.lua": {},
     "gh-5426-election-on-off.test.lua": {},
     "gh-5433-election-restart-recovery.test.lua": {},
+    "gh-5438-raft-state.test.lua": {},
     "gh-5445-leader-inconsistency.test.lua": {},
     "gh-5506-election-on-off.test.lua": {},
     "once.test.lua": {},
-- 
2.30.1 (Apple Git-130)


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Tarantool-patches] [PATCH v2 2/8] txn_limbo: fix promote term filtering
  2021-06-17 21:07 [Tarantool-patches] [PATCH v2 0/8] forbid implicit limbo ownership transition Serge Petrenko via Tarantool-patches
  2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 1/8] replication: always send raft state to subscribers Serge Petrenko via Tarantool-patches
@ 2021-06-17 21:07 ` Serge Petrenko via Tarantool-patches
  2021-06-17 22:05   ` Cyrill Gorcunov via Tarantool-patches
  2021-06-18 10:31   ` Cyrill Gorcunov via Tarantool-patches
  2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 3/8] raft: refactor raft_new_term() Serge Petrenko via Tarantool-patches
                   ` (7 subsequent siblings)
  9 siblings, 2 replies; 16+ messages in thread
From: Serge Petrenko via Tarantool-patches @ 2021-06-17 21:07 UTC (permalink / raw)
  To: v.shpilevoy, gorcunov; +Cc: tarantool-patches

txn_limbo_process() used to filter out promote requests whose term was
equal to the greatest term seen. This wasn't correct for PROMOTE entries
with term 1.

Such entries appear after box.ctl.promote() is issued on an instance
with disabled elections. Every PROMOTE entry from such an instance has
term 1, but should still be applied. Fix this in the patch.

Also, when an outdated PROMOTE entry with term smaller than already
applied from some replica arrived, it wasn't filtered at all. Such a
situation shouldn't be possible, but fix it as well.

Part-of #6034
---
 src/box/txn_limbo.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
index 51dc2a186..a5d1df00c 100644
--- a/src/box/txn_limbo.c
+++ b/src/box/txn_limbo.c
@@ -665,17 +665,21 @@ txn_limbo_process(struct txn_limbo *limbo, const struct synchro_request *req)
 {
 	uint64_t term = req->term;
 	uint32_t origin = req->origin_id;
-	if (txn_limbo_replica_term(limbo, origin) < term) {
+	if (txn_limbo_replica_term(limbo, origin) < term)
 		vclock_follow(&limbo->promote_term_map, origin, term);
-		if (term > limbo->promote_greatest_term) {
-			limbo->promote_greatest_term = term;
-		} else if (req->type == IPROTO_PROMOTE) {
-			/*
-			 * PROMOTE for outdated term. Ignore.
-			 */
-			return;
-		}
+
+	if (term > limbo->promote_greatest_term) {
+		limbo->promote_greatest_term = term;
+	} else if (req->type == IPROTO_PROMOTE &&
+		   limbo->promote_greatest_term > 1) {
+		/* PROMOTE for outdated term. Ignore. */
+		say_info("RAFT: ignoring PROMOTE request from instance "
+			 "id %"PRIu32" for term %"PRIu64". Greatest term seen "
+			 "before (%"PRIu64") is bigger.", origin, term,
+			 limbo->promote_greatest_term);
+		return;
 	}
+
 	int64_t lsn = req->lsn;
 	if (req->replica_id == REPLICA_ID_NIL) {
 		/*
-- 
2.30.1 (Apple Git-130)


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Tarantool-patches] [PATCH v2 3/8] raft: refactor raft_new_term()
  2021-06-17 21:07 [Tarantool-patches] [PATCH v2 0/8] forbid implicit limbo ownership transition Serge Petrenko via Tarantool-patches
  2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 1/8] replication: always send raft state to subscribers Serge Petrenko via Tarantool-patches
  2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 2/8] txn_limbo: fix promote term filtering Serge Petrenko via Tarantool-patches
@ 2021-06-17 21:07 ` Serge Petrenko via Tarantool-patches
  2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 4/8] box: make promote always bump the term Serge Petrenko via Tarantool-patches
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Serge Petrenko via Tarantool-patches @ 2021-06-17 21:07 UTC (permalink / raw)
  To: v.shpilevoy, gorcunov; +Cc: tarantool-patches

Make raft_new_term() always bump the current term, even when Raft is
disabled.

Part-of #6034
---
 src/box/box.cc        |  2 +-
 src/lib/raft/raft.c   |  3 +--
 test/unit/raft.c      | 15 ++++++++++++++-
 test/unit/raft.result |  3 ++-
 4 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/src/box/box.cc b/src/box/box.cc
index d6a3b4d97..6a0950f44 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -3514,7 +3514,7 @@ box_cfg_xc(void)
 
 	if (!is_bootstrap_leader) {
 		replicaset_sync();
-	} else {
+	} else if (raft_is_enabled(box_raft())) {
 		/*
 		 * When the cluster is just bootstrapped and this instance is a
 		 * leader, it makes no sense to wait for a leader appearance.
diff --git a/src/lib/raft/raft.c b/src/lib/raft/raft.c
index eacdddb7e..78d55846f 100644
--- a/src/lib/raft/raft.c
+++ b/src/lib/raft/raft.c
@@ -987,8 +987,7 @@ raft_cfg_vclock(struct raft *raft, const struct vclock *vclock)
 void
 raft_new_term(struct raft *raft)
 {
-	if (raft->is_enabled)
-		raft_sm_schedule_new_term(raft, raft->volatile_term + 1);
+	raft_sm_schedule_new_term(raft, raft->volatile_term + 1);
 }
 
 static void
diff --git a/test/unit/raft.c b/test/unit/raft.c
index 6369c42d3..4db4024d9 100644
--- a/test/unit/raft.c
+++ b/test/unit/raft.c
@@ -1176,7 +1176,7 @@ raft_test_death_timeout(void)
 static void
 raft_test_enable_disable(void)
 {
-	raft_start_test(10);
+	raft_start_test(11);
 	struct raft_node node;
 	raft_node_create(&node);
 
@@ -1268,7 +1268,20 @@ raft_test_enable_disable(void)
 		"{0: 2}" /* Vclock. */
 	), "nothing changed");
 
+	/* Disabled node still bumps the term when needed. */
+	raft_node_new_term(&node);
+
+	ok(raft_node_check_full_state(&node,
+		RAFT_STATE_FOLLOWER /* State. */,
+		0 /* Leader. */,
+		4 /* Term. */,
+		0 /* Vote. */,
+		4 /* Volatile term. */,
+		0 /* Volatile vote. */,
+		"{0: 3}" /* Vclock. */
+	), "term bump when disabled");
 	raft_node_destroy(&node);
+
 	raft_finish_test();
 }
 
diff --git a/test/unit/raft.result b/test/unit/raft.result
index 598a7e760..57229a265 100644
--- a/test/unit/raft.result
+++ b/test/unit/raft.result
@@ -203,7 +203,7 @@ ok 10 - subtests
 ok 11 - subtests
 	*** raft_test_death_timeout: done ***
 	*** raft_test_enable_disable ***
-    1..10
+    1..11
     ok 1 - accepted a leader notification
     ok 2 - leader is seen
     ok 3 - death timeout passed
@@ -214,6 +214,7 @@ ok 11 - subtests
     ok 8 - became leader
     ok 9 - resigned from leader state
     ok 10 - nothing changed
+    ok 11 - term bump when disabled
 ok 12 - subtests
 	*** raft_test_enable_disable: done ***
 	*** raft_test_too_long_wal_write ***
-- 
2.30.1 (Apple Git-130)


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Tarantool-patches] [PATCH v2 4/8] box: make promote always bump the term
  2021-06-17 21:07 [Tarantool-patches] [PATCH v2 0/8] forbid implicit limbo ownership transition Serge Petrenko via Tarantool-patches
                   ` (2 preceding siblings ...)
  2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 3/8] raft: refactor raft_new_term() Serge Petrenko via Tarantool-patches
@ 2021-06-17 21:07 ` Serge Petrenko via Tarantool-patches
  2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 5/8] replication: forbid implicit limbo owner transition Serge Petrenko via Tarantool-patches
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Serge Petrenko via Tarantool-patches @ 2021-06-17 21:07 UTC (permalink / raw)
  To: v.shpilevoy, gorcunov; +Cc: tarantool-patches

When called without elections, promote and resulted in multiple
PROMOTE entries for the same term. This is not right, because all
the promotions for the same term except the first one would be ignored
as already seen.

Part-of #6034
---
 src/box/box.cc                                | 10 +++--
 .../gh-4114-local-space-replication.result    |  7 ++--
 .../gh-4114-local-space-replication.test.lua  |  4 +-
 .../gh-6034-promote-bump-term.result          | 37 +++++++++++++++++++
 .../gh-6034-promote-bump-term.test.lua        | 16 ++++++++
 test/replication/suite.cfg                    |  1 +
 6 files changed, 65 insertions(+), 10 deletions(-)
 create mode 100644 test/replication/gh-6034-promote-bump-term.result
 create mode 100644 test/replication/gh-6034-promote-bump-term.test.lua

diff --git a/src/box/box.cc b/src/box/box.cc
index 6a0950f44..53a8f80e5 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -1687,16 +1687,18 @@ box_promote(void)
 			rc = -1;
 		} else {
 promote:
-			/* We cannot possibly get here in a volatile state. */
-			assert(box_raft()->volatile_term == box_raft()->term);
+			if (try_wait)
+				raft_new_term(box_raft());
+
+			uint64_t term = box_raft()->volatile_term;
 			txn_limbo_write_promote(&txn_limbo, wait_lsn,
-						box_raft()->term);
+						term);
 			struct synchro_request req = {
 				.type = IPROTO_PROMOTE,
 				.replica_id = former_leader_id,
 				.origin_id = instance_id,
 				.lsn = wait_lsn,
-				.term = box_raft()->term,
+				.term = term,
 			};
 			txn_limbo_process(&txn_limbo, &req);
 			assert(txn_limbo_is_empty(&txn_limbo));
diff --git a/test/replication/gh-4114-local-space-replication.result b/test/replication/gh-4114-local-space-replication.result
index 9b63a4b99..676400cef 100644
--- a/test/replication/gh-4114-local-space-replication.result
+++ b/test/replication/gh-4114-local-space-replication.result
@@ -45,9 +45,8 @@ test_run:cmd('switch replica')
  | ---
  | - true
  | ...
-box.info.vclock[0]
+a = box.info.vclock[0] or 0
  | ---
- | - null
  | ...
 box.cfg{checkpoint_count=1}
  | ---
@@ -77,9 +76,9 @@ box.space.test:insert{3}
  | - [3]
  | ...
 
-box.info.vclock[0]
+box.info.vclock[0] == a + 3 or box.info.vclock[0] - a
  | ---
- | - 3
+ | - true
  | ...
 
 test_run:cmd('switch default')
diff --git a/test/replication/gh-4114-local-space-replication.test.lua b/test/replication/gh-4114-local-space-replication.test.lua
index c18fb3b10..8f787db84 100644
--- a/test/replication/gh-4114-local-space-replication.test.lua
+++ b/test/replication/gh-4114-local-space-replication.test.lua
@@ -18,7 +18,7 @@ for i = 1,10 do box.space.test:insert{i} end
 box.info.vclock[0] == a + 10 or box.info.vclock[0] - a
 
 test_run:cmd('switch replica')
-box.info.vclock[0]
+a = box.info.vclock[0] or 0
 box.cfg{checkpoint_count=1}
 box.space.test:select{}
 box.space.test:insert{1}
@@ -27,7 +27,7 @@ box.space.test:insert{2}
 box.snapshot()
 box.space.test:insert{3}
 
-box.info.vclock[0]
+box.info.vclock[0] == a + 3 or box.info.vclock[0] - a
 
 test_run:cmd('switch default')
 
diff --git a/test/replication/gh-6034-promote-bump-term.result b/test/replication/gh-6034-promote-bump-term.result
new file mode 100644
index 000000000..20e352922
--- /dev/null
+++ b/test/replication/gh-6034-promote-bump-term.result
@@ -0,0 +1,37 @@
+-- test-run result file version 2
+test_run = require('test_run').new()
+ | ---
+ | ...
+
+-- gh-6034: test that every box.ctl.promote() bumps
+-- the instance's term. Even when elections are disabled. Even for consequent
+-- promotes on the same instance.
+election_mode = box.cfg.election_mode
+ | ---
+ | ...
+box.cfg{election_mode='off'}
+ | ---
+ | ...
+
+term = box.info.election.term
+ | ---
+ | ...
+box.ctl.promote()
+ | ---
+ | ...
+assert(box.info.election.term == term + 1)
+ | ---
+ | - true
+ | ...
+box.ctl.promote()
+ | ---
+ | ...
+assert(box.info.election.term == term + 2)
+ | ---
+ | - true
+ | ...
+
+-- Cleanup.
+box.cfg{election_mode=election_mode}
+ | ---
+ | ...
diff --git a/test/replication/gh-6034-promote-bump-term.test.lua b/test/replication/gh-6034-promote-bump-term.test.lua
new file mode 100644
index 000000000..5847dbb8f
--- /dev/null
+++ b/test/replication/gh-6034-promote-bump-term.test.lua
@@ -0,0 +1,16 @@
+test_run = require('test_run').new()
+
+-- gh-6034: test that every box.ctl.promote() bumps
+-- the instance's term. Even when elections are disabled. Even for consequent
+-- promotes on the same instance.
+election_mode = box.cfg.election_mode
+box.cfg{election_mode='off'}
+
+term = box.info.election.term
+box.ctl.promote()
+assert(box.info.election.term == term + 1)
+box.ctl.promote()
+assert(box.info.election.term == term + 2)
+
+-- Cleanup.
+box.cfg{election_mode=election_mode}
diff --git a/test/replication/suite.cfg b/test/replication/suite.cfg
index b0f275526..4fc6643e4 100644
--- a/test/replication/suite.cfg
+++ b/test/replication/suite.cfg
@@ -48,6 +48,7 @@
     "gh-5613-bootstrap-prefer-booted.test.lua": {},
     "gh-6027-applier-error-show.test.lua": {},
     "gh-6032-promote-wal-write.test.lua": {},
+    "gh-6034-promote-bump-term.test.lua": {},
     "gh-6057-qsync-confirm-async-no-wal.test.lua": {},
     "gh-6094-rs-uuid-mismatch.test.lua": {},
     "gh-6127-election-join-new.test.lua": {},
-- 
2.30.1 (Apple Git-130)


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Tarantool-patches] [PATCH v2 5/8] replication: forbid implicit limbo owner transition
  2021-06-17 21:07 [Tarantool-patches] [PATCH v2 0/8] forbid implicit limbo ownership transition Serge Petrenko via Tarantool-patches
                   ` (3 preceding siblings ...)
  2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 4/8] box: make promote always bump the term Serge Petrenko via Tarantool-patches
@ 2021-06-17 21:07 ` Serge Petrenko via Tarantool-patches
  2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 6/8] box: introduce `box.ctl.demote` Serge Petrenko via Tarantool-patches
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Serge Petrenko via Tarantool-patches @ 2021-06-17 21:07 UTC (permalink / raw)
  To: v.shpilevoy, gorcunov; +Cc: tarantool-patches

Forbid limbo ownership transition without an explicit promote.
Make it so that synchronous transactions may be committed only after it
is claimed by some instance via a PROMOTE request.

Make everyone but the limbo owner read-only even when the limbo is
empty.

Part-of #6034

@TarantoolBot document
Title: synchronous replication changes

`box.info.synchro.queue` receives a new field: `owner`. It's a replica
id of the instance owning the synchronous transaction queue.

Once some instance owns the queue, every other instance becomes
read-only. When the queue is unclaimed, e.g.
`box.info.synchro.queue.owner` is `0`, everyone may be writeable, but
cannot create synchronous transactions.

In order to claim or re-claim the queue, you have to issue
`box.ctl.promote()` on the instance you wish to promote.

When elections are enabled, the instance issues `box.ctl.promote()`
automatically once it wins the elections, no additional actions are
required.
---
 src/box/errcode.h                          |   1 +
 src/box/lua/info.c                         |   4 +-
 src/box/txn_limbo.c                        |  35 ++----
 test/box/alter.result                      |   2 +-
 test/box/error.result                      |   1 +
 test/replication/gh-5440-qsync-ro.result   | 133 ---------------------
 test/replication/gh-5440-qsync-ro.test.lua |  53 --------
 7 files changed, 16 insertions(+), 213 deletions(-)
 delete mode 100644 test/replication/gh-5440-qsync-ro.result
 delete mode 100644 test/replication/gh-5440-qsync-ro.test.lua

diff --git a/src/box/errcode.h b/src/box/errcode.h
index 49aec4bf6..e3943c01d 100644
--- a/src/box/errcode.h
+++ b/src/box/errcode.h
@@ -278,6 +278,7 @@ struct errcode_record {
 	/*223 */_(ER_INTERFERING_PROMOTE,	"Instance with replica id %u was promoted first") \
 	/*224 */_(ER_RAFT_DISABLED,		"Elections were turned off while running box.ctl.promote()")\
 	/*225 */_(ER_TXN_ROLLBACK,		"Transaction was rolled back") \
+	/*226 */_(ER_SYNCHRO_QUEUE_UNCLAIMED,	"The synchronous transaction queue doesn't belong to any instance")\
 
 /*
  * !IMPORTANT! Please follow instructions at start of the file
diff --git a/src/box/lua/info.c b/src/box/lua/info.c
index 0eb48b823..7e3cd0b7d 100644
--- a/src/box/lua/info.c
+++ b/src/box/lua/info.c
@@ -611,9 +611,11 @@ lbox_info_synchro(struct lua_State *L)
 
 	/* Queue information. */
 	struct txn_limbo *queue = &txn_limbo;
-	lua_createtable(L, 0, 1);
+	lua_createtable(L, 0, 2);
 	lua_pushnumber(L, queue->len);
 	lua_setfield(L, -2, "len");
+	lua_pushnumber(L, queue->owner_id);
+	lua_setfield(L, -2, "owner");
 	lua_setfield(L, -2, "queue");
 
 	return 1;
diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
index a5d1df00c..203dbe856 100644
--- a/src/box/txn_limbo.c
+++ b/src/box/txn_limbo.c
@@ -55,7 +55,8 @@ txn_limbo_create(struct txn_limbo *limbo)
 bool
 txn_limbo_is_ro(struct txn_limbo *limbo)
 {
-	return limbo->owner_id != instance_id && !txn_limbo_is_empty(limbo);
+	return limbo->owner_id != REPLICA_ID_NIL &&
+	       limbo->owner_id != instance_id;
 }
 
 struct txn_limbo_entry *
@@ -95,18 +96,13 @@ txn_limbo_append(struct txn_limbo *limbo, uint32_t id, struct txn *txn)
 	}
 	if (id == 0)
 		id = instance_id;
-	bool make_ro = false;
-	if (limbo->owner_id != id) {
-		if (rlist_empty(&limbo->queue)) {
-			limbo->owner_id = id;
-			limbo->confirmed_lsn = 0;
-			if (id != instance_id)
-				make_ro = true;
-		} else {
-			diag_set(ClientError, ER_UNCOMMITTED_FOREIGN_SYNC_TXNS,
-				 limbo->owner_id);
-			return NULL;
-		}
+	if  (limbo->owner_id == REPLICA_ID_NIL) {
+		diag_set(ClientError, ER_SYNCHRO_QUEUE_UNCLAIMED);
+		return NULL;
+	} else if (limbo->owner_id != id) {
+		diag_set(ClientError, ER_UNCOMMITTED_FOREIGN_SYNC_TXNS,
+			 limbo->owner_id);
+		return NULL;
 	}
 	size_t size;
 	struct txn_limbo_entry *e = region_alloc_object(&txn->region,
@@ -122,12 +118,6 @@ txn_limbo_append(struct txn_limbo *limbo, uint32_t id, struct txn *txn)
 	e->is_rollback = false;
 	rlist_add_tail_entry(&limbo->queue, e, in_queue);
 	limbo->len++;
-	/*
-	 * We added new entries from a remote instance to an empty limbo.
-	 * Time to make this instance read-only.
-	 */
-	if (make_ro)
-		box_update_ro_summary();
 	return e;
 }
 
@@ -432,9 +422,6 @@ txn_limbo_read_confirm(struct txn_limbo *limbo, int64_t lsn)
 		assert(e->txn->signature >= 0);
 		txn_complete_success(e->txn);
 	}
-	/* Update is_ro once the limbo is clear. */
-	if (txn_limbo_is_empty(limbo))
-		box_update_ro_summary();
 }
 
 /**
@@ -482,9 +469,6 @@ txn_limbo_read_rollback(struct txn_limbo *limbo, int64_t lsn)
 		if (e == last_rollback)
 			break;
 	}
-	/* Update is_ro once the limbo is clear. */
-	if (txn_limbo_is_empty(limbo))
-		box_update_ro_summary();
 }
 
 void
@@ -515,6 +499,7 @@ txn_limbo_read_promote(struct txn_limbo *limbo, uint32_t replica_id,
 	txn_limbo_read_rollback(limbo, lsn + 1);
 	assert(txn_limbo_is_empty(&txn_limbo));
 	limbo->owner_id = replica_id;
+	box_update_ro_summary();
 	limbo->confirmed_lsn = 0;
 }
 
diff --git a/test/box/alter.result b/test/box/alter.result
index a7bffce10..6a64f6b84 100644
--- a/test/box/alter.result
+++ b/test/box/alter.result
@@ -1464,7 +1464,7 @@ assert(s.is_sync)
 ...
 s:replace{1}
 ---
-- error: Quorum collection for a synchronous transaction is timed out
+- error: The synchronous transaction queue doesn't belong to any instance
 ...
 -- When not specified or nil - ignored.
 s:alter({is_sync = nil})
diff --git a/test/box/error.result b/test/box/error.result
index 062a90399..574521a14 100644
--- a/test/box/error.result
+++ b/test/box/error.result
@@ -444,6 +444,7 @@ t;
  |   223: box.error.INTERFERING_PROMOTE
  |   224: box.error.RAFT_DISABLED
  |   225: box.error.TXN_ROLLBACK
+ |   226: box.error.LIMBO_UNCLAIMED
  | ...
 
 test_run:cmd("setopt delimiter ''");
diff --git a/test/replication/gh-5440-qsync-ro.result b/test/replication/gh-5440-qsync-ro.result
deleted file mode 100644
index 1ece26a42..000000000
--- a/test/replication/gh-5440-qsync-ro.result
+++ /dev/null
@@ -1,133 +0,0 @@
--- test-run result file version 2
---
--- gh-5440 everyone but the limbo owner is read-only on non-empty limbo.
---
-env = require('test_run')
- | ---
- | ...
-test_run = env.new()
- | ---
- | ...
-fiber = require('fiber')
- | ---
- | ...
-
-box.schema.user.grant('guest', 'replication')
- | ---
- | ...
-test_run:cmd('create server replica with rpl_master=default, script="replication/replica.lua"')
- | ---
- | - true
- | ...
-test_run:cmd('start server replica with wait=True, wait_load=True')
- | ---
- | - true
- | ...
-
-_ = box.schema.space.create('test', {is_sync=true})
- | ---
- | ...
-_ = box.space.test:create_index('pk')
- | ---
- | ...
-
-old_synchro_quorum = box.cfg.replication_synchro_quorum
- | ---
- | ...
-old_synchro_timeout = box.cfg.replication_synchro_timeout
- | ---
- | ...
-
--- Make sure that the master stalls on commit leaving the limbo non-empty.
-box.cfg{replication_synchro_quorum=3, replication_synchro_timeout=1000}
- | ---
- | ...
-
-f = fiber.new(function() box.space.test:insert{1} end)
- | ---
- | ...
-f:status()
- | ---
- | - suspended
- | ...
-
--- Wait till replica's limbo is non-empty.
-test_run:wait_lsn('replica', 'default')
- | ---
- | ...
-test_run:cmd('switch replica')
- | ---
- | - true
- | ...
-
-box.info.ro
- | ---
- | - true
- | ...
-box.space.test:insert{2}
- | ---
- | - error: Can't modify data because this instance is in read-only mode.
- | ...
-success = false
- | ---
- | ...
-f = require('fiber').new(function() box.ctl.wait_rw() success = true end)
- | ---
- | ...
-f:status()
- | ---
- | - suspended
- | ...
-
-test_run:cmd('switch default')
- | ---
- | - true
- | ...
-
--- Empty the limbo.
-box.cfg{replication_synchro_quorum=2}
- | ---
- | ...
-
-test_run:cmd('switch replica')
- | ---
- | - true
- | ...
-
-test_run:wait_cond(function() return success end)
- | ---
- | - true
- | ...
-box.info.ro
- | ---
- | - false
- | ...
--- Should succeed now.
-box.space.test:insert{2}
- | ---
- | - [2]
- | ...
-
--- Cleanup.
-test_run:cmd('switch default')
- | ---
- | - true
- | ...
-box.cfg{replication_synchro_quorum=old_synchro_quorum,\
-        replication_synchro_timeout=old_synchro_timeout}
- | ---
- | ...
-box.space.test:drop()
- | ---
- | ...
-test_run:cmd('stop server replica')
- | ---
- | - true
- | ...
-test_run:cmd('delete server replica')
- | ---
- | - true
- | ...
-box.schema.user.revoke('guest', 'replication')
- | ---
- | ...
diff --git a/test/replication/gh-5440-qsync-ro.test.lua b/test/replication/gh-5440-qsync-ro.test.lua
deleted file mode 100644
index d63ec9c1e..000000000
--- a/test/replication/gh-5440-qsync-ro.test.lua
+++ /dev/null
@@ -1,53 +0,0 @@
---
--- gh-5440 everyone but the limbo owner is read-only on non-empty limbo.
---
-env = require('test_run')
-test_run = env.new()
-fiber = require('fiber')
-
-box.schema.user.grant('guest', 'replication')
-test_run:cmd('create server replica with rpl_master=default, script="replication/replica.lua"')
-test_run:cmd('start server replica with wait=True, wait_load=True')
-
-_ = box.schema.space.create('test', {is_sync=true})
-_ = box.space.test:create_index('pk')
-
-old_synchro_quorum = box.cfg.replication_synchro_quorum
-old_synchro_timeout = box.cfg.replication_synchro_timeout
-
--- Make sure that the master stalls on commit leaving the limbo non-empty.
-box.cfg{replication_synchro_quorum=3, replication_synchro_timeout=1000}
-
-f = fiber.new(function() box.space.test:insert{1} end)
-f:status()
-
--- Wait till replica's limbo is non-empty.
-test_run:wait_lsn('replica', 'default')
-test_run:cmd('switch replica')
-
-box.info.ro
-box.space.test:insert{2}
-success = false
-f = require('fiber').new(function() box.ctl.wait_rw() success = true end)
-f:status()
-
-test_run:cmd('switch default')
-
--- Empty the limbo.
-box.cfg{replication_synchro_quorum=2}
-
-test_run:cmd('switch replica')
-
-test_run:wait_cond(function() return success end)
-box.info.ro
--- Should succeed now.
-box.space.test:insert{2}
-
--- Cleanup.
-test_run:cmd('switch default')
-box.cfg{replication_synchro_quorum=old_synchro_quorum,\
-        replication_synchro_timeout=old_synchro_timeout}
-box.space.test:drop()
-test_run:cmd('stop server replica')
-test_run:cmd('delete server replica')
-box.schema.user.revoke('guest', 'replication')
-- 
2.30.1 (Apple Git-130)


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Tarantool-patches] [PATCH v2 6/8] box: introduce `box.ctl.demote`
  2021-06-17 21:07 [Tarantool-patches] [PATCH v2 0/8] forbid implicit limbo ownership transition Serge Petrenko via Tarantool-patches
                   ` (4 preceding siblings ...)
  2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 5/8] replication: forbid implicit limbo owner transition Serge Petrenko via Tarantool-patches
@ 2021-06-17 21:07 ` Serge Petrenko via Tarantool-patches
  2021-06-17 21:15   ` Serge Petrenko via Tarantool-patches
  2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 7/8] txn_limbo: persist the latest effective promote in snapshot Serge Petrenko via Tarantool-patches
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 16+ messages in thread
From: Serge Petrenko via Tarantool-patches @ 2021-06-17 21:07 UTC (permalink / raw)
  To: v.shpilevoy, gorcunov; +Cc: tarantool-patches

Introduce a new journal entry, DEMOTE. The entry has the same meaning as
PROMOTE, with the only difference that it clears limbo ownership instead
of transferring it to the issuer.

Introduce `box.ctl.demote`, which works exactly like `box.ctl.promote`,
but results in writing DEMOTE instead of PROMOTE.

A new request was necessary instead of simply writing PROMOTE(origin_id
= 0), because origin_id is deduced from row.replica_id, which cannot be
0 for replicated rows (it's always equal to instance_id of the row
originator).

Closes #6034

@TarantoolBod document
Title: box.ctl.demote

`box.ctl.demote()` is a new function, which works exactly like
`box.ctl.promote()`, with one exception that it results in the instance
writing DEMOTE request to WAL instead of a PROMOTE request.

A DEMOTE request (DEMOTE = 32) copies PROMOTE behaviour (it clears the
limbo as well), but clears limbo ownership instead of assigning it to a
new instance.
---
 src/box/box.cc                                |  28 ++-
 src/box/box.h                                 |   3 +
 src/box/iproto_constants.h                    |  10 +-
 src/box/lua/ctl.c                             |   9 +
 src/box/txn_limbo.c                           |  37 +++-
 src/box/txn_limbo.h                           |   7 +
 test/box/error.result                         |   2 +-
 test/replication/election_basic.result        |   3 +
 test/replication/election_basic.test.lua      |   1 +
 test/replication/election_qsync.result        |   3 +
 test/replication/election_qsync.test.lua      |   1 +
 .../gh-5140-qsync-casc-rollback.result        |   6 +
 .../gh-5140-qsync-casc-rollback.test.lua      |   2 +
 .../gh-5144-qsync-dup-confirm.result          |   6 +
 .../gh-5144-qsync-dup-confirm.test.lua        |   2 +
 .../gh-5163-qsync-restart-crash.result        |   6 +
 .../gh-5163-qsync-restart-crash.test.lua      |   2 +
 .../gh-5167-qsync-rollback-snap.result        |   6 +
 .../gh-5167-qsync-rollback-snap.test.lua      |   2 +
 .../gh-5195-qsync-replica-write.result        |  10 +-
 .../gh-5195-qsync-replica-write.test.lua      |   6 +-
 .../gh-5213-qsync-applier-order-3.result      |   9 +
 .../gh-5213-qsync-applier-order-3.test.lua    |   3 +
 .../gh-5213-qsync-applier-order.result        |   6 +
 .../gh-5213-qsync-applier-order.test.lua      |   2 +
 .../replication/gh-5288-qsync-recovery.result |   6 +
 .../gh-5288-qsync-recovery.test.lua           |   2 +
 .../gh-5298-qsync-recovery-snap.result        |   6 +
 .../gh-5298-qsync-recovery-snap.test.lua      |   2 +
 .../gh-5426-election-on-off.result            |   3 +
 .../gh-5426-election-on-off.test.lua          |   1 +
 .../gh-5433-election-restart-recovery.result  |   3 +
 ...gh-5433-election-restart-recovery.test.lua |   1 +
 ...sync-clear-synchro-queue-commit-all.result |   3 +
 ...nc-clear-synchro-queue-commit-all.test.lua |   1 +
 .../gh-5446-qsync-eval-quorum.result          |   7 +
 .../gh-5446-qsync-eval-quorum.test.lua        |   3 +
 .../gh-5506-election-on-off.result            |   3 +
 .../gh-5506-election-on-off.test.lua          |   1 +
 .../gh-5566-final-join-synchro.result         |   6 +
 .../gh-5566-final-join-synchro.test.lua       |   2 +
 .../gh-5874-qsync-txn-recovery.result         |   6 +
 .../gh-5874-qsync-txn-recovery.test.lua       |   2 +
 .../gh-6032-promote-wal-write.result          |   3 +
 .../gh-6032-promote-wal-write.test.lua        |   1 +
 .../gh-6034-limbo-ownership.result            | 186 ++++++++++++++++++
 .../gh-6034-limbo-ownership.test.lua          |  68 +++++++
 .../gh-6057-qsync-confirm-async-no-wal.result |   7 +
 ...h-6057-qsync-confirm-async-no-wal.test.lua |   3 +
 test/replication/hang_on_synchro_fail.result  |   6 +
 .../replication/hang_on_synchro_fail.test.lua |   2 +
 test/replication/qsync_advanced.result        |  12 ++
 test/replication/qsync_advanced.test.lua      |   4 +
 test/replication/qsync_basic.result           |  33 ++--
 test/replication/qsync_basic.test.lua         |  16 +-
 test/replication/qsync_errinj.result          |   6 +
 test/replication/qsync_errinj.test.lua        |   2 +
 test/replication/qsync_snapshots.result       |   6 +
 test/replication/qsync_snapshots.test.lua     |   2 +
 test/replication/qsync_with_anon.result       |   6 +
 test/replication/qsync_with_anon.test.lua     |   2 +
 test/replication/suite.cfg                    |   1 +
 62 files changed, 550 insertions(+), 46 deletions(-)
 create mode 100644 test/replication/gh-6034-limbo-ownership.result
 create mode 100644 test/replication/gh-6034-limbo-ownership.test.lua

diff --git a/src/box/box.cc b/src/box/box.cc
index 53a8f80e5..f2bde910c 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -1527,8 +1527,8 @@ box_wait_quorum(uint32_t lead_id, int64_t target_lsn, int quorum,
 	return 0;
 }
 
-int
-box_promote(void)
+static int
+box_clear_synchro_queue(bool demote)
 {
 	/* A guard to block multiple simultaneous function invocations. */
 	static bool in_promote = false;
@@ -1691,10 +1691,16 @@ promote:
 				raft_new_term(box_raft());
 
 			uint64_t term = box_raft()->volatile_term;
-			txn_limbo_write_promote(&txn_limbo, wait_lsn,
-						term);
+			if (demote) {
+				txn_limbo_write_demote(&txn_limbo, wait_lsn,
+						       term);
+			} else {
+				txn_limbo_write_promote(&txn_limbo, wait_lsn,
+							term);
+			}
+			uint16_t type = demote ? IPROTO_DEMOTE : IPROTO_PROMOTE;
 			struct synchro_request req = {
-				.type = IPROTO_PROMOTE,
+				.type = type,
 				.replica_id = former_leader_id,
 				.origin_id = instance_id,
 				.lsn = wait_lsn,
@@ -1707,6 +1713,18 @@ promote:
 	return rc;
 }
 
+int
+box_promote(void)
+{
+	return box_clear_synchro_queue(false);
+}
+
+int
+box_demote(void)
+{
+	return box_clear_synchro_queue(true);
+}
+
 int
 box_listen(void)
 {
diff --git a/src/box/box.h b/src/box/box.h
index ecf32240d..aaf20d9dd 100644
--- a/src/box/box.h
+++ b/src/box/box.h
@@ -276,6 +276,9 @@ typedef struct tuple box_tuple_t;
 int
 box_promote(void);
 
+int
+box_demote(void);
+
 /* box_select is private and used only by FFI */
 API_EXPORT int
 box_select(uint32_t space_id, uint32_t index_id,
diff --git a/src/box/iproto_constants.h b/src/box/iproto_constants.h
index 137bee9da..3c9edb7d2 100644
--- a/src/box/iproto_constants.h
+++ b/src/box/iproto_constants.h
@@ -241,6 +241,8 @@ enum iproto_type {
 	IPROTO_RAFT = 30,
 	/** PROMOTE request. */
 	IPROTO_PROMOTE = 31,
+	/** DEMOTE request. */
+	IPROTO_DEMOTE = 32,
 
 	/** A confirmation message for synchronous transactions. */
 	IPROTO_CONFIRM = 40,
@@ -310,6 +312,8 @@ iproto_type_name(uint16_t type)
 		return "RAFT";
 	case IPROTO_PROMOTE:
 		return "PROMOTE";
+	case IPROTO_DEMOTE:
+		return "DEMOTE";
 	case IPROTO_CONFIRM:
 		return "CONFIRM";
 	case IPROTO_ROLLBACK:
@@ -364,14 +368,14 @@ static inline bool
 iproto_type_is_synchro_request(uint16_t type)
 {
 	return type == IPROTO_CONFIRM || type == IPROTO_ROLLBACK ||
-	       type == IPROTO_PROMOTE;
+	       type == IPROTO_PROMOTE || type == IPROTO_DEMOTE;
 }
 
-/** PROMOTE entry (synchronous replication and leader elections). */
+/** PROMOTE/DEMOTE entry (synchronous replication and leader elections). */
 static inline bool
 iproto_type_is_promote_request(uint32_t type)
 {
-       return type == IPROTO_PROMOTE;
+       return type == IPROTO_PROMOTE || type == IPROTO_DEMOTE;
 }
 
 static inline bool
diff --git a/src/box/lua/ctl.c b/src/box/lua/ctl.c
index 368b9ab60..a613c4111 100644
--- a/src/box/lua/ctl.c
+++ b/src/box/lua/ctl.c
@@ -89,6 +89,14 @@ lbox_ctl_promote(struct lua_State *L)
 	return 0;
 }
 
+static int
+lbox_ctl_demote(struct lua_State *L)
+{
+	if (box_demote() != 0)
+		return luaT_error(L);
+	return 0;
+}
+
 static int
 lbox_ctl_is_recovery_finished(struct lua_State *L)
 {
@@ -127,6 +135,7 @@ static const struct luaL_Reg lbox_ctl_lib[] = {
 	{"promote", lbox_ctl_promote},
 	/* An old alias. */
 	{"clear_synchro_queue", lbox_ctl_promote},
+	{"demote", lbox_ctl_demote},
 	{"is_recovery_finished", lbox_ctl_is_recovery_finished},
 	{"set_on_shutdown_timeout", lbox_ctl_set_on_shutdown_timeout},
 	{NULL, NULL}
diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
index 203dbe856..b5af02479 100644
--- a/src/box/txn_limbo.c
+++ b/src/box/txn_limbo.c
@@ -503,6 +503,29 @@ txn_limbo_read_promote(struct txn_limbo *limbo, uint32_t replica_id,
 	limbo->confirmed_lsn = 0;
 }
 
+void
+txn_limbo_write_demote(struct txn_limbo *limbo, int64_t lsn, uint64_t term)
+{
+	limbo->confirmed_lsn = lsn;
+	limbo->is_in_rollback = true;
+	struct txn_limbo_entry *e = txn_limbo_last_synchro_entry(limbo);
+	assert(e == NULL || e->lsn <= lsn);
+	(void)e;
+	txn_limbo_write_synchro(limbo, IPROTO_DEMOTE, lsn, term);
+	limbo->is_in_rollback = false;
+}
+
+/**
+ * Process a DEMOTE request, which's like PROMOTE, but clears the limbo
+ * ownership.
+ * @sa txn_limbo_read_promote.
+ */
+static void
+txn_limbo_read_demote(struct txn_limbo *limbo, int64_t lsn)
+{
+	return txn_limbo_read_promote(limbo, REPLICA_ID_NIL, lsn);
+}
+
 void
 txn_limbo_ack(struct txn_limbo *limbo, uint32_t replica_id, int64_t lsn)
 {
@@ -655,12 +678,13 @@ txn_limbo_process(struct txn_limbo *limbo, const struct synchro_request *req)
 
 	if (term > limbo->promote_greatest_term) {
 		limbo->promote_greatest_term = term;
-	} else if (req->type == IPROTO_PROMOTE &&
+	} else if (iproto_type_is_promote_request(req->type) &&
 		   limbo->promote_greatest_term > 1) {
 		/* PROMOTE for outdated term. Ignore. */
-		say_info("RAFT: ignoring PROMOTE request from instance "
+		say_info("RAFT: ignoring %s request from instance "
 			 "id %"PRIu32" for term %"PRIu64". Greatest term seen "
-			 "before (%"PRIu64") is bigger.", origin, term,
+			 "before (%"PRIu64") is bigger.",
+			 iproto_type_name(req->type), origin, term,
 			 limbo->promote_greatest_term);
 		return;
 	}
@@ -671,7 +695,7 @@ txn_limbo_process(struct txn_limbo *limbo, const struct synchro_request *req)
 		 * The limbo was empty on the instance issuing the request.
 		 * This means this instance must empty its limbo as well.
 		 */
-		assert(lsn == 0 && req->type == IPROTO_PROMOTE);
+		assert(lsn == 0 && iproto_type_is_promote_request(req->type));
 	} else if (req->replica_id != limbo->owner_id) {
 		/*
 		 * Ignore CONFIRM/ROLLBACK messages for a foreign master.
@@ -679,7 +703,7 @@ txn_limbo_process(struct txn_limbo *limbo, const struct synchro_request *req)
 		 * data from an old leader, who has just started and written
 		 * confirm right on synchronous transaction recovery.
 		 */
-		if (req->type != IPROTO_PROMOTE)
+		if (!iproto_type_is_promote_request(req->type))
 			return;
 		/*
 		 * Promote has a bigger term, and tries to steal the limbo. It
@@ -699,6 +723,9 @@ txn_limbo_process(struct txn_limbo *limbo, const struct synchro_request *req)
 	case IPROTO_PROMOTE:
 		txn_limbo_read_promote(limbo, req->origin_id, lsn);
 		break;
+	case IPROTO_DEMOTE:
+		txn_limbo_read_demote(limbo, lsn);
+		break;
 	default:
 		unreachable();
 	}
diff --git a/src/box/txn_limbo.h b/src/box/txn_limbo.h
index e409ac657..801a1a0ee 100644
--- a/src/box/txn_limbo.h
+++ b/src/box/txn_limbo.h
@@ -318,6 +318,13 @@ txn_limbo_wait_confirm(struct txn_limbo *limbo);
 void
 txn_limbo_write_promote(struct txn_limbo *limbo, int64_t lsn, uint64_t term);
 
+/**
+ * Write a DEMOTE request.
+ * It has the same effect as PROMOTE and additionally clears limbo ownership.
+ */
+void
+txn_limbo_write_demote(struct txn_limbo *limbo, int64_t lsn, uint64_t term);
+
 /**
  * Update qsync parameters dynamically.
  */
diff --git a/test/box/error.result b/test/box/error.result
index 574521a14..55ecc3e8e 100644
--- a/test/box/error.result
+++ b/test/box/error.result
@@ -444,7 +444,7 @@ t;
  |   223: box.error.INTERFERING_PROMOTE
  |   224: box.error.RAFT_DISABLED
  |   225: box.error.TXN_ROLLBACK
- |   226: box.error.LIMBO_UNCLAIMED
+ |   226: box.error.SYNCHRO_QUEUE_UNCLAIMED
  | ...
 
 test_run:cmd("setopt delimiter ''");
diff --git a/test/replication/election_basic.result b/test/replication/election_basic.result
index d5320b3ff..a62d32deb 100644
--- a/test/replication/election_basic.result
+++ b/test/replication/election_basic.result
@@ -114,6 +114,9 @@ box.cfg{
 }
  | ---
  | ...
+box.ctl.demote()
+ | ---
+ | ...
 
 --
 -- See if bootstrap with election enabled works.
diff --git a/test/replication/election_basic.test.lua b/test/replication/election_basic.test.lua
index 821f73cea..2143a6000 100644
--- a/test/replication/election_basic.test.lua
+++ b/test/replication/election_basic.test.lua
@@ -43,6 +43,7 @@ box.cfg{
     election_mode = 'off',                                                      \
     election_timeout = old_election_timeout                                     \
 }
+box.ctl.demote()
 
 --
 -- See if bootstrap with election enabled works.
diff --git a/test/replication/election_qsync.result b/test/replication/election_qsync.result
index c06400b38..2402c8578 100644
--- a/test/replication/election_qsync.result
+++ b/test/replication/election_qsync.result
@@ -165,6 +165,9 @@ box.cfg{
 }
  | ---
  | ...
+box.ctl.demote()
+ | ---
+ | ...
 box.schema.user.revoke('guest', 'super')
  | ---
  | ...
diff --git a/test/replication/election_qsync.test.lua b/test/replication/election_qsync.test.lua
index ea6fc4a61..e1aca8351 100644
--- a/test/replication/election_qsync.test.lua
+++ b/test/replication/election_qsync.test.lua
@@ -84,4 +84,5 @@ box.cfg{
     replication = old_replication,                                              \
     replication_synchro_timeout = old_replication_synchro_timeout,              \
 }
+box.ctl.demote()
 box.schema.user.revoke('guest', 'super')
diff --git a/test/replication/gh-5140-qsync-casc-rollback.result b/test/replication/gh-5140-qsync-casc-rollback.result
index da77631dd..d3208e1a4 100644
--- a/test/replication/gh-5140-qsync-casc-rollback.result
+++ b/test/replication/gh-5140-qsync-casc-rollback.result
@@ -73,6 +73,9 @@ _ = box.schema.space.create('async', {is_sync=false, engine = engine})
 _ = _:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 -- Write something to flush the master state to replica.
 box.space.sync:replace{1}
  | ---
@@ -222,3 +225,6 @@ test_run:cmd('delete server replica')
 box.schema.user.revoke('guest', 'super')
  | ---
  | ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-5140-qsync-casc-rollback.test.lua b/test/replication/gh-5140-qsync-casc-rollback.test.lua
index 69fc9ad02..96ddfd260 100644
--- a/test/replication/gh-5140-qsync-casc-rollback.test.lua
+++ b/test/replication/gh-5140-qsync-casc-rollback.test.lua
@@ -48,6 +48,7 @@ _ = box.schema.space.create('sync', {is_sync = true, engine = engine})
 _ = _:create_index('pk')
 _ = box.schema.space.create('async', {is_sync=false, engine = engine})
 _ = _:create_index('pk')
+box.ctl.promote()
 -- Write something to flush the master state to replica.
 box.space.sync:replace{1}
 
@@ -103,3 +104,4 @@ test_run:cmd('stop server replica')
 test_run:cmd('delete server replica')
 
 box.schema.user.revoke('guest', 'super')
+box.ctl.demote()
diff --git a/test/replication/gh-5144-qsync-dup-confirm.result b/test/replication/gh-5144-qsync-dup-confirm.result
index 9d265d9ff..217e44412 100644
--- a/test/replication/gh-5144-qsync-dup-confirm.result
+++ b/test/replication/gh-5144-qsync-dup-confirm.result
@@ -46,6 +46,9 @@ _ = box.schema.space.create('sync', {is_sync = true, engine = engine})
 _ = _:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 
 -- Remember the current LSN. In the end, when the following synchronous
 -- transaction is committed, result LSN should be this value +2: for the
@@ -148,6 +151,9 @@ test_run:cmd('delete server replica2')
  | - true
  | ...
 
+box.ctl.demote()
+ | ---
+ | ...
 box.schema.user.revoke('guest', 'super')
  | ---
  | ...
diff --git a/test/replication/gh-5144-qsync-dup-confirm.test.lua b/test/replication/gh-5144-qsync-dup-confirm.test.lua
index 01a8351e0..1d6af2c62 100644
--- a/test/replication/gh-5144-qsync-dup-confirm.test.lua
+++ b/test/replication/gh-5144-qsync-dup-confirm.test.lua
@@ -19,6 +19,7 @@ box.cfg{replication_synchro_quorum = 2, replication_synchro_timeout = 1000}
 
 _ = box.schema.space.create('sync', {is_sync = true, engine = engine})
 _ = _:create_index('pk')
+box.ctl.promote()
 
 -- Remember the current LSN. In the end, when the following synchronous
 -- transaction is committed, result LSN should be this value +2: for the
@@ -69,4 +70,5 @@ test_run:cmd('delete server replica1')
 test_run:cmd('stop server replica2')
 test_run:cmd('delete server replica2')
 
+box.ctl.demote()
 box.schema.user.revoke('guest', 'super')
diff --git a/test/replication/gh-5163-qsync-restart-crash.result b/test/replication/gh-5163-qsync-restart-crash.result
index e57bc76d1..1b4d3d9b5 100644
--- a/test/replication/gh-5163-qsync-restart-crash.result
+++ b/test/replication/gh-5163-qsync-restart-crash.result
@@ -16,6 +16,9 @@ _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
 _ = box.space.sync:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 
 box.space.sync:replace{1}
  | ---
@@ -30,3 +33,6 @@ box.space.sync:select{}
 box.space.sync:drop()
  | ---
  | ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-5163-qsync-restart-crash.test.lua b/test/replication/gh-5163-qsync-restart-crash.test.lua
index d5aca4749..c8d54aad2 100644
--- a/test/replication/gh-5163-qsync-restart-crash.test.lua
+++ b/test/replication/gh-5163-qsync-restart-crash.test.lua
@@ -7,8 +7,10 @@ engine = test_run:get_cfg('engine')
 --
 _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
 _ = box.space.sync:create_index('pk')
+box.ctl.promote()
 
 box.space.sync:replace{1}
 test_run:cmd('restart server default')
 box.space.sync:select{}
 box.space.sync:drop()
+box.ctl.demote()
diff --git a/test/replication/gh-5167-qsync-rollback-snap.result b/test/replication/gh-5167-qsync-rollback-snap.result
index 06f58526c..13166720f 100644
--- a/test/replication/gh-5167-qsync-rollback-snap.result
+++ b/test/replication/gh-5167-qsync-rollback-snap.result
@@ -41,6 +41,9 @@ _ = box.schema.space.create('sync', {is_sync = true, engine = engine})
 _ = box.space.sync:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 -- Write something to flush the current master's state to replica.
 _ = box.space.sync:insert{1}
  | ---
@@ -163,3 +166,6 @@ box.cfg{
 }
  | ---
  | ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-5167-qsync-rollback-snap.test.lua b/test/replication/gh-5167-qsync-rollback-snap.test.lua
index 475727e61..1a2a31b7c 100644
--- a/test/replication/gh-5167-qsync-rollback-snap.test.lua
+++ b/test/replication/gh-5167-qsync-rollback-snap.test.lua
@@ -16,6 +16,7 @@ fiber = require('fiber')
 box.cfg{replication_synchro_quorum = 2, replication_synchro_timeout = 1000}
 _ = box.schema.space.create('sync', {is_sync = true, engine = engine})
 _ = box.space.sync:create_index('pk')
+box.ctl.promote()
 -- Write something to flush the current master's state to replica.
 _ = box.space.sync:insert{1}
 _ = box.space.sync:delete{1}
@@ -65,3 +66,4 @@ box.cfg{
     replication_synchro_quorum = orig_synchro_quorum,                           \
     replication_synchro_timeout = orig_synchro_timeout,                         \
 }
+box.ctl.demote()
diff --git a/test/replication/gh-5195-qsync-replica-write.result b/test/replication/gh-5195-qsync-replica-write.result
index 85e00e6ed..99ec9663e 100644
--- a/test/replication/gh-5195-qsync-replica-write.result
+++ b/test/replication/gh-5195-qsync-replica-write.result
@@ -40,6 +40,9 @@ _ = box.schema.space.create('sync', {engine = engine, is_sync = true})
 _ = box.space.sync:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 
 box.cfg{replication_synchro_timeout = 1000, replication_synchro_quorum = 3}
  | ---
@@ -71,12 +74,12 @@ test_run:wait_lsn('replica', 'default')
  | ---
  | ...
 -- Normal DML is blocked - the limbo is not empty and does not belong to the
--- replica. But synchro queue cleanup also does a WAL write, and propagates LSN
+-- replica. But promote also does a WAL write, and propagates LSN
 -- of the instance.
 box.cfg{replication_synchro_timeout = 0.001}
  | ---
  | ...
-box.ctl.clear_synchro_queue()
+box.ctl.promote()
  | ---
  | ...
 
@@ -157,3 +160,6 @@ box.cfg{
 }
  | ---
  | ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-5195-qsync-replica-write.test.lua b/test/replication/gh-5195-qsync-replica-write.test.lua
index 64c48be99..8b5d78357 100644
--- a/test/replication/gh-5195-qsync-replica-write.test.lua
+++ b/test/replication/gh-5195-qsync-replica-write.test.lua
@@ -17,6 +17,7 @@ test_run:cmd('start server replica with wait=True, wait_load=True')
 --
 _ = box.schema.space.create('sync', {engine = engine, is_sync = true})
 _ = box.space.sync:create_index('pk')
+box.ctl.promote()
 
 box.cfg{replication_synchro_timeout = 1000, replication_synchro_quorum = 3}
 lsn = box.info.lsn
@@ -30,10 +31,10 @@ test_run:wait_cond(function() return box.info.lsn == lsn end)
 test_run:switch('replica')
 test_run:wait_lsn('replica', 'default')
 -- Normal DML is blocked - the limbo is not empty and does not belong to the
--- replica. But synchro queue cleanup also does a WAL write, and propagates LSN
+-- replica. But promote also does a WAL write, and propagates LSN
 -- of the instance.
 box.cfg{replication_synchro_timeout = 0.001}
-box.ctl.clear_synchro_queue()
+box.ctl.promote()
 
 test_run:switch('default')
 -- Wait second ACK receipt.
@@ -66,3 +67,4 @@ box.cfg{
     replication_synchro_quorum = old_synchro_quorum,                            \
     replication_synchro_timeout = old_synchro_timeout,                          \
 }
+box.ctl.demote()
diff --git a/test/replication/gh-5213-qsync-applier-order-3.result b/test/replication/gh-5213-qsync-applier-order-3.result
index bcb18b5c0..e788eec77 100644
--- a/test/replication/gh-5213-qsync-applier-order-3.result
+++ b/test/replication/gh-5213-qsync-applier-order-3.result
@@ -45,6 +45,9 @@ s = box.schema.space.create('test', {is_sync = true})
 _ = s:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 
 test_run:cmd('create server replica1 with rpl_master=default,\
               script="replication/replica1.lua"')
@@ -179,6 +182,9 @@ box.cfg{
 -- Replica2 takes the limbo ownership and sends the transaction to the replica1.
 -- Along with the CONFIRM from the default node, which is still not applied
 -- on the replica1.
+box.ctl.promote()
+ | ---
+ | ...
 fiber = require('fiber')
  | ---
  | ...
@@ -261,3 +267,6 @@ box.cfg{
 }
  | ---
  | ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-5213-qsync-applier-order-3.test.lua b/test/replication/gh-5213-qsync-applier-order-3.test.lua
index 37b569da7..304656de0 100644
--- a/test/replication/gh-5213-qsync-applier-order-3.test.lua
+++ b/test/replication/gh-5213-qsync-applier-order-3.test.lua
@@ -30,6 +30,7 @@ box.schema.user.grant('guest', 'super')
 
 s = box.schema.space.create('test', {is_sync = true})
 _ = s:create_index('pk')
+box.ctl.promote()
 
 test_run:cmd('create server replica1 with rpl_master=default,\
               script="replication/replica1.lua"')
@@ -90,6 +91,7 @@ box.cfg{
 -- Replica2 takes the limbo ownership and sends the transaction to the replica1.
 -- Along with the CONFIRM from the default node, which is still not applied
 -- on the replica1.
+box.ctl.promote()
 fiber = require('fiber')
 f = fiber.new(function() box.space.test:replace{2} end)
 
@@ -123,3 +125,4 @@ box.cfg{
     replication_synchro_quorum = old_synchro_quorum,                            \
     replication_synchro_timeout = old_synchro_timeout,                          \
 }
+box.ctl.demote()
diff --git a/test/replication/gh-5213-qsync-applier-order.result b/test/replication/gh-5213-qsync-applier-order.result
index a8c24c289..ba6cdab06 100644
--- a/test/replication/gh-5213-qsync-applier-order.result
+++ b/test/replication/gh-5213-qsync-applier-order.result
@@ -29,6 +29,9 @@ s = box.schema.space.create('test', {is_sync = true})
 _ = s:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 
 test_run:cmd('create server replica with rpl_master=default,\
               script="replication/gh-5213-replica.lua"')
@@ -300,3 +303,6 @@ box.cfg{
 }
  | ---
  | ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-5213-qsync-applier-order.test.lua b/test/replication/gh-5213-qsync-applier-order.test.lua
index f1eccfa84..39b1912e8 100644
--- a/test/replication/gh-5213-qsync-applier-order.test.lua
+++ b/test/replication/gh-5213-qsync-applier-order.test.lua
@@ -14,6 +14,7 @@ box.schema.user.grant('guest', 'super')
 
 s = box.schema.space.create('test', {is_sync = true})
 _ = s:create_index('pk')
+box.ctl.promote()
 
 test_run:cmd('create server replica with rpl_master=default,\
               script="replication/gh-5213-replica.lua"')
@@ -120,3 +121,4 @@ box.cfg{
     replication_synchro_quorum = old_synchro_quorum,                            \
     replication_synchro_timeout = old_synchro_timeout,                          \
 }
+box.ctl.demote()
diff --git a/test/replication/gh-5288-qsync-recovery.result b/test/replication/gh-5288-qsync-recovery.result
index dc0babef6..704b71d93 100644
--- a/test/replication/gh-5288-qsync-recovery.result
+++ b/test/replication/gh-5288-qsync-recovery.result
@@ -12,6 +12,9 @@ s = box.schema.space.create('sync', {is_sync = true})
 _ = s:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 s:insert{1}
  | ---
  | - [1]
@@ -25,3 +28,6 @@ test_run:cmd('restart server default')
 box.space.sync:drop()
  | ---
  | ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-5288-qsync-recovery.test.lua b/test/replication/gh-5288-qsync-recovery.test.lua
index 00bff7b87..2455f7278 100644
--- a/test/replication/gh-5288-qsync-recovery.test.lua
+++ b/test/replication/gh-5288-qsync-recovery.test.lua
@@ -5,7 +5,9 @@ test_run = require('test_run').new()
 --
 s = box.schema.space.create('sync', {is_sync = true})
 _ = s:create_index('pk')
+box.ctl.promote()
 s:insert{1}
 box.snapshot()
 test_run:cmd('restart server default')
 box.space.sync:drop()
+box.ctl.demote()
diff --git a/test/replication/gh-5298-qsync-recovery-snap.result b/test/replication/gh-5298-qsync-recovery-snap.result
index 922831552..0883fe5f5 100644
--- a/test/replication/gh-5298-qsync-recovery-snap.result
+++ b/test/replication/gh-5298-qsync-recovery-snap.result
@@ -17,6 +17,9 @@ _ = box.schema.space.create('sync', {is_sync = true, engine = engine})
 _ = box.space.sync:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 for i = 1, 10 do box.space.sync:replace{i} end
  | ---
  | ...
@@ -98,3 +101,6 @@ box.space.sync:drop()
 box.space.loc:drop()
  | ---
  | ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-5298-qsync-recovery-snap.test.lua b/test/replication/gh-5298-qsync-recovery-snap.test.lua
index 187f60d75..084cde963 100644
--- a/test/replication/gh-5298-qsync-recovery-snap.test.lua
+++ b/test/replication/gh-5298-qsync-recovery-snap.test.lua
@@ -8,6 +8,7 @@ engine = test_run:get_cfg('engine')
 --
 _ = box.schema.space.create('sync', {is_sync = true, engine = engine})
 _ = box.space.sync:create_index('pk')
+box.ctl.promote()
 for i = 1, 10 do box.space.sync:replace{i} end
 
 -- Local rows could affect this by increasing the signature.
@@ -46,3 +47,4 @@ box.cfg{
 }
 box.space.sync:drop()
 box.space.loc:drop()
+box.ctl.demote()
diff --git a/test/replication/gh-5426-election-on-off.result b/test/replication/gh-5426-election-on-off.result
index 7444ef7f2..2bdc17ec6 100644
--- a/test/replication/gh-5426-election-on-off.result
+++ b/test/replication/gh-5426-election-on-off.result
@@ -168,6 +168,9 @@ box.cfg{
 }
  | ---
  | ...
+box.ctl.demote()
+ | ---
+ | ...
 box.schema.user.revoke('guest', 'super')
  | ---
  | ...
diff --git a/test/replication/gh-5426-election-on-off.test.lua b/test/replication/gh-5426-election-on-off.test.lua
index bdf06903b..6277e9ef2 100644
--- a/test/replication/gh-5426-election-on-off.test.lua
+++ b/test/replication/gh-5426-election-on-off.test.lua
@@ -69,4 +69,5 @@ box.cfg{
     election_mode = old_election_mode,                                          \
     replication_timeout = old_replication_timeout,                              \
 }
+box.ctl.demote()
 box.schema.user.revoke('guest', 'super')
diff --git a/test/replication/gh-5433-election-restart-recovery.result b/test/replication/gh-5433-election-restart-recovery.result
index f8f32416e..ed63ff409 100644
--- a/test/replication/gh-5433-election-restart-recovery.result
+++ b/test/replication/gh-5433-election-restart-recovery.result
@@ -169,6 +169,9 @@ box.cfg{
 }
  | ---
  | ...
+box.ctl.demote()
+ | ---
+ | ...
 box.schema.user.revoke('guest', 'super')
  | ---
  | ...
diff --git a/test/replication/gh-5433-election-restart-recovery.test.lua b/test/replication/gh-5433-election-restart-recovery.test.lua
index 4aff000bf..ae1f42c4d 100644
--- a/test/replication/gh-5433-election-restart-recovery.test.lua
+++ b/test/replication/gh-5433-election-restart-recovery.test.lua
@@ -84,4 +84,5 @@ box.cfg{
     election_mode = old_election_mode,                                          \
     replication_timeout = old_replication_timeout,                              \
 }
+box.ctl.demote()
 box.schema.user.revoke('guest', 'super')
diff --git a/test/replication/gh-5435-qsync-clear-synchro-queue-commit-all.result b/test/replication/gh-5435-qsync-clear-synchro-queue-commit-all.result
index 2699231e5..20fab4072 100644
--- a/test/replication/gh-5435-qsync-clear-synchro-queue-commit-all.result
+++ b/test/replication/gh-5435-qsync-clear-synchro-queue-commit-all.result
@@ -49,6 +49,9 @@ _ = box.schema.space.create('test', {is_sync=true})
 _ = box.space.test:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 
 -- Fill the limbo with pending entries. 3 mustn't receive them yet.
 test_run:cmd('stop server election_replica3')
diff --git a/test/replication/gh-5435-qsync-clear-synchro-queue-commit-all.test.lua b/test/replication/gh-5435-qsync-clear-synchro-queue-commit-all.test.lua
index 03705d96c..ec0f1d77e 100644
--- a/test/replication/gh-5435-qsync-clear-synchro-queue-commit-all.test.lua
+++ b/test/replication/gh-5435-qsync-clear-synchro-queue-commit-all.test.lua
@@ -21,6 +21,7 @@ box.ctl.wait_rw()
 
 _ = box.schema.space.create('test', {is_sync=true})
 _ = box.space.test:create_index('pk')
+box.ctl.promote()
 
 -- Fill the limbo with pending entries. 3 mustn't receive them yet.
 test_run:cmd('stop server election_replica3')
diff --git a/test/replication/gh-5446-qsync-eval-quorum.result b/test/replication/gh-5446-qsync-eval-quorum.result
index 5f83b248c..1173128a7 100644
--- a/test/replication/gh-5446-qsync-eval-quorum.result
+++ b/test/replication/gh-5446-qsync-eval-quorum.result
@@ -88,6 +88,9 @@ s = box.schema.space.create('sync', {is_sync = true, engine = engine})
 _ = s:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 
 -- Only one master node -> 1/2 + 1 = 1
 s:insert{1} -- should pass
@@ -343,3 +346,7 @@ box.cfg{
 }
  | ---
  | ...
+box.ctl.demote()
+ | ---
+ | ...
+
diff --git a/test/replication/gh-5446-qsync-eval-quorum.test.lua b/test/replication/gh-5446-qsync-eval-quorum.test.lua
index 6b9e324ed..b969df836 100644
--- a/test/replication/gh-5446-qsync-eval-quorum.test.lua
+++ b/test/replication/gh-5446-qsync-eval-quorum.test.lua
@@ -37,6 +37,7 @@ end
 -- Create a sync space we will operate on
 s = box.schema.space.create('sync', {is_sync = true, engine = engine})
 _ = s:create_index('pk')
+box.ctl.promote()
 
 -- Only one master node -> 1/2 + 1 = 1
 s:insert{1} -- should pass
@@ -135,3 +136,5 @@ box.cfg{
     replication_synchro_quorum = old_synchro_quorum,                            \
     replication_synchro_timeout = old_synchro_timeout,                          \
 }
+box.ctl.demote()
+
diff --git a/test/replication/gh-5506-election-on-off.result b/test/replication/gh-5506-election-on-off.result
index b8abd7ecd..a7f2b6a9c 100644
--- a/test/replication/gh-5506-election-on-off.result
+++ b/test/replication/gh-5506-election-on-off.result
@@ -138,3 +138,6 @@ box.cfg{
 }
  | ---
  | ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-5506-election-on-off.test.lua b/test/replication/gh-5506-election-on-off.test.lua
index 476b00ec0..f8915c333 100644
--- a/test/replication/gh-5506-election-on-off.test.lua
+++ b/test/replication/gh-5506-election-on-off.test.lua
@@ -66,3 +66,4 @@ box.cfg{
     election_mode = old_election_mode,                                          \
     replication_timeout = old_replication_timeout,                              \
 }
+box.ctl.demote()
diff --git a/test/replication/gh-5566-final-join-synchro.result b/test/replication/gh-5566-final-join-synchro.result
index a09882ba6..c5ae2f283 100644
--- a/test/replication/gh-5566-final-join-synchro.result
+++ b/test/replication/gh-5566-final-join-synchro.result
@@ -12,6 +12,9 @@ _ = box.schema.space.create('sync', {is_sync=true})
 _ = box.space.sync:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 
 box.schema.user.grant('guest', 'replication')
  | ---
@@ -137,3 +140,6 @@ test_run:cleanup_cluster()
 box.schema.user.revoke('guest', 'replication')
  | ---
  | ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-5566-final-join-synchro.test.lua b/test/replication/gh-5566-final-join-synchro.test.lua
index 2db2c742f..25f411407 100644
--- a/test/replication/gh-5566-final-join-synchro.test.lua
+++ b/test/replication/gh-5566-final-join-synchro.test.lua
@@ -5,6 +5,7 @@ test_run = require('test_run').new()
 --
 _ = box.schema.space.create('sync', {is_sync=true})
 _ = box.space.sync:create_index('pk')
+box.ctl.promote()
 
 box.schema.user.grant('guest', 'replication')
 box.schema.user.grant('guest', 'write', 'space', 'sync')
@@ -59,3 +60,4 @@ box.cfg{\
 box.space.sync:drop()
 test_run:cleanup_cluster()
 box.schema.user.revoke('guest', 'replication')
+box.ctl.demote()
diff --git a/test/replication/gh-5874-qsync-txn-recovery.result b/test/replication/gh-5874-qsync-txn-recovery.result
index 73f903ca7..01328a9e3 100644
--- a/test/replication/gh-5874-qsync-txn-recovery.result
+++ b/test/replication/gh-5874-qsync-txn-recovery.result
@@ -31,6 +31,9 @@ sync = box.schema.create_space('sync', {is_sync = true, engine = engine})
 _ = sync:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 
 -- The transaction fails, but is written to the log anyway.
 box.begin() async:insert{1} sync:insert{1} box.commit()
@@ -160,3 +163,6 @@ sync:drop()
 loc:drop()
  | ---
  | ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-5874-qsync-txn-recovery.test.lua b/test/replication/gh-5874-qsync-txn-recovery.test.lua
index f35eb68de..6ddf164ac 100644
--- a/test/replication/gh-5874-qsync-txn-recovery.test.lua
+++ b/test/replication/gh-5874-qsync-txn-recovery.test.lua
@@ -12,6 +12,7 @@ async = box.schema.create_space('async', {engine = engine})
 _ = async:create_index('pk')
 sync = box.schema.create_space('sync', {is_sync = true, engine = engine})
 _ = sync:create_index('pk')
+box.ctl.promote()
 
 -- The transaction fails, but is written to the log anyway.
 box.begin() async:insert{1} sync:insert{1} box.commit()
@@ -82,3 +83,4 @@ loc:select()
 async:drop()
 sync:drop()
 loc:drop()
+box.ctl.demote()
diff --git a/test/replication/gh-6032-promote-wal-write.result b/test/replication/gh-6032-promote-wal-write.result
index 246c7974f..03112fb8d 100644
--- a/test/replication/gh-6032-promote-wal-write.result
+++ b/test/replication/gh-6032-promote-wal-write.result
@@ -67,3 +67,6 @@ box.cfg{\
 box.space.sync:drop()
  | ---
  | ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-6032-promote-wal-write.test.lua b/test/replication/gh-6032-promote-wal-write.test.lua
index 8c1859083..9a036a8b4 100644
--- a/test/replication/gh-6032-promote-wal-write.test.lua
+++ b/test/replication/gh-6032-promote-wal-write.test.lua
@@ -26,3 +26,4 @@ box.cfg{\
     replication_synchro_timeout = replication_synchro_timeout,\
 }
 box.space.sync:drop()
+box.ctl.demote()
diff --git a/test/replication/gh-6034-limbo-ownership.result b/test/replication/gh-6034-limbo-ownership.result
new file mode 100644
index 000000000..3681df3d8
--- /dev/null
+++ b/test/replication/gh-6034-limbo-ownership.result
@@ -0,0 +1,186 @@
+-- test-run result file version 2
+test_run = require('test_run').new()
+ | ---
+ | ...
+fiber = require('fiber')
+ | ---
+ | ...
+
+--
+-- gh-6034: test that transactional limbo isn't accessible without a promotion.
+--
+synchro_quorum = box.cfg.replication_synchro_quorum
+ | ---
+ | ...
+election_mode = box.cfg.election_mode
+ | ---
+ | ...
+box.cfg{replication_synchro_quorum = 1, election_mode='off'}
+ | ---
+ | ...
+
+_ = box.schema.space.create('async'):create_index('pk')
+ | ---
+ | ...
+_ = box.schema.space.create('sync', {is_sync=true}):create_index('pk')
+ | ---
+ | ...
+
+-- Limbo is initially unclaimed, everyone is writeable.
+assert(not box.info.ro)
+ | ---
+ | - true
+ | ...
+assert(box.info.synchro.queue.owner == 0)
+ | ---
+ | - true
+ | ...
+box.space.async:insert{1} -- success.
+ | ---
+ | - [1]
+ | ...
+-- Synchro spaces aren't writeable
+box.space.sync:insert{1} -- error.
+ | ---
+ | - error: The synchronous transaction queue doesn't belong to any instance
+ | ...
+
+box.ctl.promote()
+ | ---
+ | ...
+assert(not box.info.ro)
+ | ---
+ | - true
+ | ...
+assert(box.info.synchro.queue.owner == box.info.id)
+ | ---
+ | - true
+ | ...
+box.space.sync:insert{1} -- success.
+ | ---
+ | - [1]
+ | ...
+
+-- Everyone but the limbo owner is read-only.
+box.schema.user.grant('guest', 'replication')
+ | ---
+ | ...
+test_run:cmd('create server replica with rpl_master=default,\
+                                         script="replication/replica.lua"')
+ | ---
+ | - true
+ | ...
+test_run:cmd('start server replica with wait=True, wait_load=True')
+ | ---
+ | - true
+ | ...
+test_run:cmd('set variable rpl_listen to "replica.listen"')
+ | ---
+ | - true
+ | ...
+orig_replication = box.cfg.replication
+ | ---
+ | ...
+box.cfg{replication={box.info.listen, rpl_listen}}
+ | ---
+ | ...
+
+test_run:switch('replica')
+ | ---
+ | - true
+ | ...
+assert(box.info.ro)
+ | ---
+ | - true
+ | ...
+assert(box.info.synchro.queue.owner == test_run:eval('default', 'return box.info.id')[1])
+ | ---
+ | - true
+ | ...
+box.space.async:insert{2} -- failure.
+ | ---
+ | - error: Can't modify data because this instance is in read-only mode.
+ | ...
+
+-- Promotion on the other node. Default should become ro.
+box.ctl.promote()
+ | ---
+ | ...
+assert(not box.info.ro)
+ | ---
+ | - true
+ | ...
+assert(box.info.synchro.queue.owner == box.info.id)
+ | ---
+ | - true
+ | ...
+box.space.sync:insert{2} -- success.
+ | ---
+ | - [2]
+ | ...
+
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+assert(box.info.ro)
+ | ---
+ | - true
+ | ...
+assert(box.info.synchro.queue.owner == test_run:eval('replica', 'return box.info.id')[1])
+ | ---
+ | - true
+ | ...
+box.space.sync:insert{3} -- failure.
+ | ---
+ | - error: Can't modify data because this instance is in read-only mode.
+ | ...
+
+box.ctl.demote()
+ | ---
+ | ...
+assert(not box.info.ro)
+ | ---
+ | - true
+ | ...
+box.space.sync:insert{3} -- still fails.
+ | ---
+ | - error: The synchronous transaction queue doesn't belong to any instance
+ | ...
+assert(box.info.synchro.queue.owner == 0)
+ | ---
+ | - true
+ | ...
+box.space.async:insert{3} -- success.
+ | ---
+ | - [3]
+ | ...
+
+-- Cleanup.
+box.ctl.demote()
+ | ---
+ | ...
+test_run:cmd('stop server replica')
+ | ---
+ | - true
+ | ...
+test_run:cmd('delete server replica')
+ | ---
+ | - true
+ | ...
+box.schema.user.revoke('guest', 'replication')
+ | ---
+ | ...
+box.space.sync:drop()
+ | ---
+ | ...
+box.space.async:drop()
+ | ---
+ | ...
+box.cfg{\
+    replication_synchro_quorum = synchro_quorum,\
+    election_mode = election_mode,\
+    replication = orig_replication,\
+}
+ | ---
+ | ...
diff --git a/test/replication/gh-6034-limbo-ownership.test.lua b/test/replication/gh-6034-limbo-ownership.test.lua
new file mode 100644
index 000000000..0e1586566
--- /dev/null
+++ b/test/replication/gh-6034-limbo-ownership.test.lua
@@ -0,0 +1,68 @@
+test_run = require('test_run').new()
+fiber = require('fiber')
+
+--
+-- gh-6034: test that transactional limbo isn't accessible without a promotion.
+--
+synchro_quorum = box.cfg.replication_synchro_quorum
+election_mode = box.cfg.election_mode
+box.cfg{replication_synchro_quorum = 1, election_mode='off'}
+
+_ = box.schema.space.create('async'):create_index('pk')
+_ = box.schema.space.create('sync', {is_sync=true}):create_index('pk')
+
+-- Limbo is initially unclaimed, everyone is writeable.
+assert(not box.info.ro)
+assert(box.info.synchro.queue.owner == 0)
+box.space.async:insert{1} -- success.
+-- Synchro spaces aren't writeable
+box.space.sync:insert{1} -- error.
+
+box.ctl.promote()
+assert(not box.info.ro)
+assert(box.info.synchro.queue.owner == box.info.id)
+box.space.sync:insert{1} -- success.
+
+-- Everyone but the limbo owner is read-only.
+box.schema.user.grant('guest', 'replication')
+test_run:cmd('create server replica with rpl_master=default,\
+                                         script="replication/replica.lua"')
+test_run:cmd('start server replica with wait=True, wait_load=True')
+test_run:cmd('set variable rpl_listen to "replica.listen"')
+orig_replication = box.cfg.replication
+box.cfg{replication={box.info.listen, rpl_listen}}
+
+test_run:switch('replica')
+assert(box.info.ro)
+assert(box.info.synchro.queue.owner == test_run:eval('default', 'return box.info.id')[1])
+box.space.async:insert{2} -- failure.
+
+-- Promotion on the other node. Default should become ro.
+box.ctl.promote()
+assert(not box.info.ro)
+assert(box.info.synchro.queue.owner == box.info.id)
+box.space.sync:insert{2} -- success.
+
+test_run:switch('default')
+assert(box.info.ro)
+assert(box.info.synchro.queue.owner == test_run:eval('replica', 'return box.info.id')[1])
+box.space.sync:insert{3} -- failure.
+
+box.ctl.demote()
+assert(not box.info.ro)
+box.space.sync:insert{3} -- still fails.
+assert(box.info.synchro.queue.owner == 0)
+box.space.async:insert{3} -- success.
+
+-- Cleanup.
+box.ctl.demote()
+test_run:cmd('stop server replica')
+test_run:cmd('delete server replica')
+box.schema.user.revoke('guest', 'replication')
+box.space.sync:drop()
+box.space.async:drop()
+box.cfg{\
+    replication_synchro_quorum = synchro_quorum,\
+    election_mode = election_mode,\
+    replication = orig_replication,\
+}
diff --git a/test/replication/gh-6057-qsync-confirm-async-no-wal.result b/test/replication/gh-6057-qsync-confirm-async-no-wal.result
index 23c77729b..e7beefb2a 100644
--- a/test/replication/gh-6057-qsync-confirm-async-no-wal.result
+++ b/test/replication/gh-6057-qsync-confirm-async-no-wal.result
@@ -40,6 +40,10 @@ _ = s2:create_index('pk')
  | ---
  | ...
 
+box.ctl.promote()
+ | ---
+ | ...
+
 errinj = box.error.injection
  | ---
  | ...
@@ -161,3 +165,6 @@ box.cfg{
 }
  | ---
  | ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/gh-6057-qsync-confirm-async-no-wal.test.lua b/test/replication/gh-6057-qsync-confirm-async-no-wal.test.lua
index a11ddc042..bb459ea02 100644
--- a/test/replication/gh-6057-qsync-confirm-async-no-wal.test.lua
+++ b/test/replication/gh-6057-qsync-confirm-async-no-wal.test.lua
@@ -21,6 +21,8 @@ _ = s:create_index('pk')
 s2 = box.schema.create_space('test2')
 _ = s2:create_index('pk')
 
+box.ctl.promote()
+
 errinj = box.error.injection
 
 function create_hanging_async_after_confirm(sync_key, async_key1, async_key2)   \
@@ -86,3 +88,4 @@ box.cfg{
     replication_synchro_quorum = old_synchro_quorum,                            \
     replication_synchro_timeout = old_synchro_timeout,                          \
 }
+box.ctl.demote()
diff --git a/test/replication/hang_on_synchro_fail.result b/test/replication/hang_on_synchro_fail.result
index 9f6fac00b..dda15af20 100644
--- a/test/replication/hang_on_synchro_fail.result
+++ b/test/replication/hang_on_synchro_fail.result
@@ -19,6 +19,9 @@ _ = box.schema.space.create('sync', {is_sync=true})
 _ = box.space.sync:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 
 old_synchro_quorum = box.cfg.replication_synchro_quorum
  | ---
@@ -127,4 +130,7 @@ box.space.sync:drop()
 box.schema.user.revoke('guest', 'replication')
  | ---
  | ...
+box.ctl.demote()
+ | ---
+ | ...
 
diff --git a/test/replication/hang_on_synchro_fail.test.lua b/test/replication/hang_on_synchro_fail.test.lua
index 6c3b09fab..f0d494eae 100644
--- a/test/replication/hang_on_synchro_fail.test.lua
+++ b/test/replication/hang_on_synchro_fail.test.lua
@@ -8,6 +8,7 @@ box.schema.user.grant('guest', 'replication')
 
 _ = box.schema.space.create('sync', {is_sync=true})
 _ = box.space.sync:create_index('pk')
+box.ctl.promote()
 
 old_synchro_quorum = box.cfg.replication_synchro_quorum
 box.cfg{replication_synchro_quorum=3}
@@ -54,4 +55,5 @@ box.cfg{replication_synchro_quorum=old_synchro_quorum,\
         replication_synchro_timeout=old_synchro_timeout}
 box.space.sync:drop()
 box.schema.user.revoke('guest', 'replication')
+box.ctl.demote()
 
diff --git a/test/replication/qsync_advanced.result b/test/replication/qsync_advanced.result
index 94b19b1f2..72ac0c326 100644
--- a/test/replication/qsync_advanced.result
+++ b/test/replication/qsync_advanced.result
@@ -72,6 +72,9 @@ _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
 _ = box.space.sync:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 -- Testcase body.
 box.space.sync:insert{1} -- success
  | ---
@@ -468,6 +471,9 @@ box.space.sync:select{} -- 1
 box.cfg{read_only=false} -- promote replica to master
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 test_run:switch('default')
  | ---
  | - true
@@ -508,6 +514,9 @@ test_run:switch('default')
 box.cfg{read_only=false}
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 test_run:switch('replica')
  | ---
  | - true
@@ -781,3 +790,6 @@ box.cfg{
 }
  | ---
  | ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/qsync_advanced.test.lua b/test/replication/qsync_advanced.test.lua
index 058ece602..37c285b8d 100644
--- a/test/replication/qsync_advanced.test.lua
+++ b/test/replication/qsync_advanced.test.lua
@@ -30,6 +30,7 @@ test_run:switch('default')
 box.cfg{replication_synchro_quorum=NUM_INSTANCES, replication_synchro_timeout=1000}
 _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
 _ = box.space.sync:create_index('pk')
+box.ctl.promote()
 -- Testcase body.
 box.space.sync:insert{1} -- success
 test_run:cmd('switch replica')
@@ -170,6 +171,7 @@ box.space.sync:select{} -- 1
 test_run:switch('replica')
 box.space.sync:select{} -- 1
 box.cfg{read_only=false} -- promote replica to master
+box.ctl.promote()
 test_run:switch('default')
 box.cfg{read_only=true} -- demote master to replica
 test_run:switch('replica')
@@ -181,6 +183,7 @@ box.space.sync:select{} -- 1, 2
 -- Revert cluster configuration.
 test_run:switch('default')
 box.cfg{read_only=false}
+box.ctl.promote()
 test_run:switch('replica')
 box.cfg{read_only=true}
 -- Testcase cleanup.
@@ -279,3 +282,4 @@ box.cfg{
     replication_synchro_quorum = orig_synchro_quorum,                           \
     replication_synchro_timeout = orig_synchro_timeout,                         \
 }
+box.ctl.demote()
diff --git a/test/replication/qsync_basic.result b/test/replication/qsync_basic.result
index 7e711ba13..bbdfc42fe 100644
--- a/test/replication/qsync_basic.result
+++ b/test/replication/qsync_basic.result
@@ -14,6 +14,9 @@ s1.is_sync
 pk = s1:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 box.begin() s1:insert({1}) s1:insert({2}) box.commit()
  | ---
  | ...
@@ -645,19 +648,12 @@ test_run:switch('default')
  | ---
  | - true
  | ...
-box.cfg{replication_synchro_quorum = 3, replication_synchro_timeout = 1000}
- | ---
- | ...
-f = fiber.create(function() box.space.sync:replace{1} end)
+box.ctl.demote()
  | ---
  | ...
-test_run:wait_lsn('replica', 'default')
+box.space.sync:replace{1}
  | ---
- | ...
-
-test_run:switch('replica')
- | ---
- | - true
+ | - error: The synchronous transaction queue doesn't belong to any instance
  | ...
 function skip_row() return nil end
  | ---
@@ -674,26 +670,22 @@ box.space.sync:replace{2}
 box.space.sync:before_replace(nil, skip_row)
  | ---
  | ...
-assert(box.space.sync:get{2} == nil)
+assert(box.space.sync:get{1} == nil)
  | ---
  | - true
  | ...
-assert(box.space.sync:get{1} ~= nil)
+assert(box.space.sync:get{2} == nil)
  | ---
  | - true
  | ...
-
-test_run:switch('default')
+assert(box.info.lsn == old_lsn + 1)
  | ---
  | - true
  | ...
-box.cfg{replication_synchro_quorum = 2}
+box.ctl.promote()
  | ---
  | ...
-test_run:wait_cond(function() return f:status() == 'dead' end)
- | ---
- | - true
- | ...
+
 box.space.sync:truncate()
  | ---
  | ...
@@ -758,3 +750,6 @@ box.space.sync:drop()
 box.schema.user.revoke('guest', 'replication')
  | ---
  | ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/qsync_basic.test.lua b/test/replication/qsync_basic.test.lua
index 75c9b222b..eac465e25 100644
--- a/test/replication/qsync_basic.test.lua
+++ b/test/replication/qsync_basic.test.lua
@@ -6,6 +6,7 @@
 s1 = box.schema.create_space('test1', {is_sync = true})
 s1.is_sync
 pk = s1:create_index('pk')
+box.ctl.promote()
 box.begin() s1:insert({1}) s1:insert({2}) box.commit()
 s1:select{}
 
@@ -253,22 +254,18 @@ box.space.sync:count()
 -- instances, but also works for local rows.
 --
 test_run:switch('default')
-box.cfg{replication_synchro_quorum = 3, replication_synchro_timeout = 1000}
-f = fiber.create(function() box.space.sync:replace{1} end)
-test_run:wait_lsn('replica', 'default')
-
-test_run:switch('replica')
+box.ctl.demote()
+box.space.sync:replace{1}
 function skip_row() return nil end
 old_lsn = box.info.lsn
 _ = box.space.sync:before_replace(skip_row)
 box.space.sync:replace{2}
 box.space.sync:before_replace(nil, skip_row)
+assert(box.space.sync:get{1} == nil)
 assert(box.space.sync:get{2} == nil)
-assert(box.space.sync:get{1} ~= nil)
+assert(box.info.lsn == old_lsn + 1)
+box.ctl.promote()
 
-test_run:switch('default')
-box.cfg{replication_synchro_quorum = 2}
-test_run:wait_cond(function() return f:status() == 'dead' end)
 box.space.sync:truncate()
 
 --
@@ -301,3 +298,4 @@ test_run:cmd('delete server replica')
 box.space.test:drop()
 box.space.sync:drop()
 box.schema.user.revoke('guest', 'replication')
+box.ctl.demote()
diff --git a/test/replication/qsync_errinj.result b/test/replication/qsync_errinj.result
index 635bcf939..cf1e30a90 100644
--- a/test/replication/qsync_errinj.result
+++ b/test/replication/qsync_errinj.result
@@ -35,6 +35,9 @@ _ = box.schema.space.create('sync', {is_sync = true, engine = engine})
 _ = box.space.sync:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 
 --
 -- gh-5100: slow ACK sending shouldn't stun replica for the
@@ -542,3 +545,6 @@ box.space.sync:drop()
 box.schema.user.revoke('guest', 'super')
  | ---
  | ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/qsync_errinj.test.lua b/test/replication/qsync_errinj.test.lua
index 6a9fd3e1a..e7c85c58c 100644
--- a/test/replication/qsync_errinj.test.lua
+++ b/test/replication/qsync_errinj.test.lua
@@ -12,6 +12,7 @@ test_run:cmd('start server replica with wait=True, wait_load=True')
 
 _ = box.schema.space.create('sync', {is_sync = true, engine = engine})
 _ = box.space.sync:create_index('pk')
+box.ctl.promote()
 
 --
 -- gh-5100: slow ACK sending shouldn't stun replica for the
@@ -222,3 +223,4 @@ test_run:cmd('delete server replica')
 
 box.space.sync:drop()
 box.schema.user.revoke('guest', 'super')
+box.ctl.demote()
diff --git a/test/replication/qsync_snapshots.result b/test/replication/qsync_snapshots.result
index cafdd63c8..ca418b168 100644
--- a/test/replication/qsync_snapshots.result
+++ b/test/replication/qsync_snapshots.result
@@ -57,6 +57,9 @@ _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
 _ = box.space.sync:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 -- Testcase body.
 box.space.sync:insert{1}
  | ---
@@ -299,3 +302,6 @@ box.cfg{
 }
  | ---
  | ...
+box.ctl.demote()
+ | ---
+ | ...
diff --git a/test/replication/qsync_snapshots.test.lua b/test/replication/qsync_snapshots.test.lua
index 590610974..82c2e3f7c 100644
--- a/test/replication/qsync_snapshots.test.lua
+++ b/test/replication/qsync_snapshots.test.lua
@@ -23,6 +23,7 @@ test_run:switch('default')
 box.cfg{replication_synchro_quorum=NUM_INSTANCES, replication_synchro_timeout=1000}
 _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
 _ = box.space.sync:create_index('pk')
+box.ctl.promote()
 -- Testcase body.
 box.space.sync:insert{1}
 box.space.sync:select{} -- 1
@@ -130,3 +131,4 @@ box.cfg{
     replication_synchro_quorum = orig_synchro_quorum,                           \
     replication_synchro_timeout = orig_synchro_timeout,                         \
 }
+box.ctl.demote()
diff --git a/test/replication/qsync_with_anon.result b/test/replication/qsync_with_anon.result
index 6a2952a32..99c6fb902 100644
--- a/test/replication/qsync_with_anon.result
+++ b/test/replication/qsync_with_anon.result
@@ -57,6 +57,9 @@ _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
 _ = box.space.sync:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 -- Testcase body.
 test_run:switch('default')
  | ---
@@ -220,6 +223,9 @@ box.cfg{
 }
  | ---
  | ...
+box.ctl.demote()
+ | ---
+ | ...
 test_run:cleanup_cluster()
  | ---
  | ...
diff --git a/test/replication/qsync_with_anon.test.lua b/test/replication/qsync_with_anon.test.lua
index d7ecaa107..e73880ec7 100644
--- a/test/replication/qsync_with_anon.test.lua
+++ b/test/replication/qsync_with_anon.test.lua
@@ -22,6 +22,7 @@ test_run:switch('default')
 box.cfg{replication_synchro_quorum=NUM_INSTANCES, replication_synchro_timeout=1000}
 _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
 _ = box.space.sync:create_index('pk')
+box.ctl.promote()
 -- Testcase body.
 test_run:switch('default')
 box.space.sync:insert{1} -- success
@@ -81,4 +82,5 @@ box.cfg{
     replication_synchro_quorum = orig_synchro_quorum,                           \
     replication_synchro_timeout = orig_synchro_timeout,                         \
 }
+box.ctl.demote()
 test_run:cleanup_cluster()
diff --git a/test/replication/suite.cfg b/test/replication/suite.cfg
index 4fc6643e4..c2bbb5aa9 100644
--- a/test/replication/suite.cfg
+++ b/test/replication/suite.cfg
@@ -48,6 +48,7 @@
     "gh-5613-bootstrap-prefer-booted.test.lua": {},
     "gh-6027-applier-error-show.test.lua": {},
     "gh-6032-promote-wal-write.test.lua": {},
+    "gh-6034-limbo-ownership.test.lua": {},
     "gh-6034-promote-bump-term.test.lua": {},
     "gh-6057-qsync-confirm-async-no-wal.test.lua": {},
     "gh-6094-rs-uuid-mismatch.test.lua": {},
-- 
2.30.1 (Apple Git-130)


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Tarantool-patches] [PATCH v2 7/8] txn_limbo: persist the latest effective promote in snapshot
  2021-06-17 21:07 [Tarantool-patches] [PATCH v2 0/8] forbid implicit limbo ownership transition Serge Petrenko via Tarantool-patches
                   ` (5 preceding siblings ...)
  2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 6/8] box: introduce `box.ctl.demote` Serge Petrenko via Tarantool-patches
@ 2021-06-17 21:07 ` Serge Petrenko via Tarantool-patches
  2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 8/8] replication: send latest effective promote in initial join Serge Petrenko via Tarantool-patches
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Serge Petrenko via Tarantool-patches @ 2021-06-17 21:07 UTC (permalink / raw)
  To: v.shpilevoy, gorcunov; +Cc: tarantool-patches

Previously PROMOTE entries, just like CONFIRM and ROLLBACK were only
stored in WALs. This is because snapshots consist solely of confirmed
transactions, so there's nothing to CONFIRM or ROLLBACK.

PROMOTE has gained additional meaning recently: it pins limbo ownership
to a specific instance, rendering everyone else read-only. So now
PROMOTE information must be stored in snapshots as well.

Save the latest limbo state (owner id and latest confirmed lsn) to the
snapshot as a PROMOTE request.

Follow-up #6034
---
 src/box/memtx_engine.c | 32 ++++++++++++++++++++++++++++++++
 src/box/txn_limbo.c    | 10 ++++++++++
 src/box/txn_limbo.h    |  7 +++++++
 3 files changed, 49 insertions(+)

diff --git a/src/box/memtx_engine.c b/src/box/memtx_engine.c
index c662a3c8c..a2cfb2615 100644
--- a/src/box/memtx_engine.c
+++ b/src/box/memtx_engine.c
@@ -50,6 +50,7 @@
 #include "schema.h"
 #include "gc.h"
 #include "raft.h"
+#include "txn_limbo.h"
 
 /* sync snapshot every 16MB */
 #define SNAP_SYNC_INTERVAL	(1 << 24)
@@ -225,6 +226,22 @@ memtx_engine_recover_raft(const struct xrow_header *row)
 	return 0;
 }
 
+static int
+memtx_engine_recover_synchro(const struct xrow_header *row)
+{
+	assert(row->type == IPROTO_PROMOTE);
+	struct synchro_request req;
+	if (xrow_decode_synchro(row, &req) != 0)
+		return -1;
+	/*
+	 * Origin id cannot be deduced from row.replica_id in a checkpoint,
+	 * because all its rows have a zero replica_id.
+	 */
+	req.origin_id = req.replica_id;
+	txn_limbo_process(&txn_limbo, &req);
+	return 0;
+}
+
 static int
 memtx_engine_recover_snapshot_row(struct memtx_engine *memtx,
 				  struct xrow_header *row, int *is_space_system)
@@ -233,6 +250,8 @@ memtx_engine_recover_snapshot_row(struct memtx_engine *memtx,
 	if (row->type != IPROTO_INSERT) {
 		if (row->type == IPROTO_RAFT)
 			return memtx_engine_recover_raft(row);
+		if (row->type == IPROTO_PROMOTE)
+			return memtx_engine_recover_synchro(row);
 		diag_set(ClientError, ER_UNKNOWN_REQUEST_TYPE,
 			 (uint32_t) row->type);
 		return -1;
@@ -542,6 +561,7 @@ struct checkpoint {
 	struct vclock vclock;
 	struct xdir dir;
 	struct raft_request raft;
+	struct synchro_request synchro_state;
 	/**
 	 * Do nothing, just touch the snapshot file - the
 	 * checkpoint already exists.
@@ -567,6 +587,7 @@ checkpoint_new(const char *snap_dirname, uint64_t snap_io_rate_limit)
 	xdir_create(&ckpt->dir, snap_dirname, SNAP, &INSTANCE_UUID, &opts);
 	vclock_create(&ckpt->vclock);
 	box_raft_checkpoint_local(&ckpt->raft);
+	txn_limbo_checkpoint(&txn_limbo, &ckpt->synchro_state);
 	ckpt->touch = false;
 	return ckpt;
 }
@@ -655,6 +676,15 @@ finish:
 	return rc;
 }
 
+static int
+checkpoint_write_synchro(struct xlog *l, const struct synchro_request *req)
+{
+	struct xrow_header row;
+	char body[XROW_SYNCHRO_BODY_LEN_MAX];
+	xrow_encode_synchro(&row, body, req);
+	return checkpoint_write_row(l, &row);
+}
+
 static int
 checkpoint_f(va_list ap)
 {
@@ -692,6 +722,8 @@ checkpoint_f(va_list ap)
 	}
 	if (checkpoint_write_raft(&snap, &ckpt->raft) != 0)
 		goto fail;
+	if (checkpoint_write_synchro(&snap, &ckpt->synchro_state) != 0)
+		goto fail;
 	if (xlog_flush(&snap) < 0)
 		goto fail;
 
diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
index b5af02479..482e17c8f 100644
--- a/src/box/txn_limbo.c
+++ b/src/box/txn_limbo.c
@@ -296,6 +296,16 @@ complete:
 	return 0;
 }
 
+void
+txn_limbo_checkpoint(const struct txn_limbo *limbo,
+		     struct synchro_request *req)
+{
+	req->type = IPROTO_PROMOTE;
+	req->replica_id = limbo->owner_id;
+	req->lsn = limbo->confirmed_lsn;
+	req->term = limbo->promote_greatest_term;
+}
+
 static void
 txn_limbo_write_synchro(struct txn_limbo *limbo, uint16_t type, int64_t lsn,
 			uint64_t term)
diff --git a/src/box/txn_limbo.h b/src/box/txn_limbo.h
index 801a1a0ee..442f2483e 100644
--- a/src/box/txn_limbo.h
+++ b/src/box/txn_limbo.h
@@ -311,6 +311,13 @@ txn_limbo_process(struct txn_limbo *limbo, const struct synchro_request *req);
 int
 txn_limbo_wait_confirm(struct txn_limbo *limbo);
 
+/**
+ * Persist limbo state to a given synchro request.
+ */
+void
+txn_limbo_checkpoint(const struct txn_limbo *limbo,
+		     struct synchro_request *req);
+
 /**
  * Write a PROMOTE request, which has the same effect as CONFIRM(@a lsn) and
  * ROLLBACK(@a lsn + 1) combined.
-- 
2.30.1 (Apple Git-130)


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Tarantool-patches] [PATCH v2 8/8] replication: send latest effective promote in initial join
  2021-06-17 21:07 [Tarantool-patches] [PATCH v2 0/8] forbid implicit limbo ownership transition Serge Petrenko via Tarantool-patches
                   ` (6 preceding siblings ...)
  2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 7/8] txn_limbo: persist the latest effective promote in snapshot Serge Petrenko via Tarantool-patches
@ 2021-06-17 21:07 ` Serge Petrenko via Tarantool-patches
  2021-06-18 13:02 ` [Tarantool-patches] [PATCH v2 0/8] forbid implicit limbo ownership transition Cyrill Gorcunov via Tarantool-patches
  2021-06-21 12:11 ` [Tarantool-patches] [PATCH v2 9/8] replication: send current Raft term in join response Serge Petrenko via Tarantool-patches
  9 siblings, 0 replies; 16+ messages in thread
From: Serge Petrenko via Tarantool-patches @ 2021-06-17 21:07 UTC (permalink / raw)
  To: v.shpilevoy, gorcunov; +Cc: tarantool-patches

A joining instance may never receive the latest PROMOTE request, which
is the only source of information about the limbo owner. Send out the
latest limbo state (e.g. the latest applied PROMOTE request) together
with the initial join snapshot.

Follow-up #6034
---
 src/box/applier.cc                       |  5 ++
 src/box/relay.cc                         | 15 +++++
 test/replication/replica_rejoin.result   | 77 ++++++++++++++----------
 test/replication/replica_rejoin.test.lua | 50 +++++++--------
 4 files changed, 92 insertions(+), 55 deletions(-)

diff --git a/src/box/applier.cc b/src/box/applier.cc
index 10cea26a7..9d90a2384 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -458,6 +458,11 @@ applier_wait_snapshot(struct applier *applier)
 				xrow_decode_vclock_xc(&row, &replicaset.vclock);
 			}
 			break; /* end of stream */
+		} else if (iproto_type_is_promote_request(row.type)) {
+			struct synchro_request req;
+			if (xrow_decode_synchro(&row, &req) != 0)
+				diag_raise();
+			txn_limbo_process(&txn_limbo, &req);
 		} else if (iproto_type_is_error(row.type)) {
 			xrow_decode_error_xc(&row);  /* rethrow error */
 		} else {
diff --git a/src/box/relay.cc b/src/box/relay.cc
index b47767769..e05b53d5d 100644
--- a/src/box/relay.cc
+++ b/src/box/relay.cc
@@ -399,12 +399,27 @@ relay_initial_join(int fd, uint64_t sync, struct vclock *vclock)
 	if (txn_limbo_wait_confirm(&txn_limbo) != 0)
 		diag_raise();
 
+	struct synchro_request req;
+	txn_limbo_checkpoint(&txn_limbo, &req);
+
 	/* Respond to the JOIN request with the current vclock. */
 	struct xrow_header row;
 	xrow_encode_vclock_xc(&row, vclock);
 	row.sync = sync;
 	coio_write_xrow(&relay->io, &row);
 
+	/*
+	 * Send out the latest limbo state. Don't do that when limbo is unused,
+	 * let the old instances join without trouble.
+	 */
+	if (req.replica_id != REPLICA_ID_NIL) {
+		char body[XROW_SYNCHRO_BODY_LEN_MAX];
+		xrow_encode_synchro(&row, body, &req);
+		row.replica_id = req.replica_id;
+		row.sync = sync;
+		coio_write_xrow(&relay->io, &row);
+	}
+
 	/* Send read view to the replica. */
 	engine_join_xc(&ctx, &relay->stream);
 }
diff --git a/test/replication/replica_rejoin.result b/test/replication/replica_rejoin.result
index 843333a19..e489c150a 100644
--- a/test/replication/replica_rejoin.result
+++ b/test/replication/replica_rejoin.result
@@ -7,10 +7,19 @@ test_run = env.new()
 log = require('log')
 ---
 ...
-engine = test_run:get_cfg('engine')
+test_run:cmd("create server master with script='replication/master1.lua'")
 ---
+- true
 ...
-test_run:cleanup_cluster()
+test_run:cmd("start server master")
+---
+- true
+...
+test_run:switch("master")
+---
+- true
+...
+engine = test_run:get_cfg('engine')
 ---
 ...
 --
@@ -43,7 +52,7 @@ _ = box.space.test:insert{3}
 ---
 ...
 -- Join a replica, then stop it.
-test_run:cmd("create server replica with rpl_master=default, script='replication/replica_rejoin.lua'")
+test_run:cmd("create server replica with rpl_master=master, script='replication/replica_rejoin.lua'")
 ---
 - true
 ...
@@ -65,7 +74,7 @@ box.space.test:select()
   - [2]
   - [3]
 ...
-test_run:cmd("switch default")
+test_run:cmd("switch master")
 ---
 - true
 ...
@@ -75,7 +84,7 @@ test_run:cmd("stop server replica")
 ...
 -- Restart the server to purge the replica from
 -- the garbage collection state.
-test_run:cmd("restart server default")
+test_run:cmd("restart server master")
 box.cfg{wal_cleanup_delay = 0}
 ---
 ...
@@ -146,7 +155,7 @@ box.space.test:select()
   - [20]
   - [30]
 ...
-test_run:cmd("switch default")
+test_run:cmd("switch master")
 ---
 - true
 ...
@@ -154,7 +163,7 @@ test_run:cmd("switch default")
 for i = 10, 30, 10 do box.space.test:update(i, {{'!', 1, i}}) end
 ---
 ...
-vclock = test_run:get_vclock('default')
+vclock = test_run:get_vclock('master')
 ---
 ...
 vclock[0] = nil
@@ -191,7 +200,7 @@ box.space.test:replace{1, 2, 3} -- bumps LSN on the replica
 ---
 - [1, 2, 3]
 ...
-test_run:cmd("switch default")
+test_run:cmd("switch master")
 ---
 - true
 ...
@@ -199,7 +208,7 @@ test_run:cmd("stop server replica")
 ---
 - true
 ...
-test_run:cmd("restart server default")
+test_run:cmd("restart server master")
 box.cfg{wal_cleanup_delay = 0}
 ---
 ...
@@ -253,7 +262,7 @@ box.space.test:select()
 -- from the replica.
 --
 -- Bootstrap a new replica.
-test_run:cmd("switch default")
+test_run:cmd("switch master")
 ---
 - true
 ...
@@ -295,7 +304,7 @@ box.cfg{replication = ''}
 ---
 ...
 -- Bump vclock on the master.
-test_run:cmd("switch default")
+test_run:cmd("switch master")
 ---
 - true
 ...
@@ -317,15 +326,15 @@ vclock = test_run:get_vclock('replica')
 vclock[0] = nil
 ---
 ...
-_ = test_run:wait_vclock('default', vclock)
+_ = test_run:wait_vclock('master', vclock)
 ---
 ...
 -- Restart the master and force garbage collection.
-test_run:cmd("switch default")
+test_run:cmd("switch master")
 ---
 - true
 ...
-test_run:cmd("restart server default")
+test_run:cmd("restart server master")
 box.cfg{wal_cleanup_delay = 0}
 ---
 ...
@@ -373,7 +382,7 @@ vclock = test_run:get_vclock('replica')
 vclock[0] = nil
 ---
 ...
-_ = test_run:wait_vclock('default', vclock)
+_ = test_run:wait_vclock('master', vclock)
 ---
 ...
 -- Restart the replica. It should successfully rebootstrap.
@@ -396,38 +405,42 @@ test_run:cmd("switch default")
 ---
 - true
 ...
-box.cfg{replication = ''}
+test_run:cmd("stop server replica")
 ---
+- true
 ...
-test_run:cmd("stop server replica")
+test_run:cmd("delete server replica")
 ---
 - true
 ...
-test_run:cmd("cleanup server replica")
+test_run:cmd("stop server master")
 ---
 - true
 ...
-test_run:cmd("delete server replica")
+test_run:cmd("delete server master")
 ---
 - true
 ...
-test_run:cleanup_cluster()
+--
+-- gh-4107: rebootstrap fails if the replica was deleted from
+-- the cluster on the master.
+--
+test_run:cmd("create server master with script='replication/master1.lua'")
 ---
+- true
 ...
-box.space.test:drop()
+test_run:cmd("start server master")
 ---
+- true
 ...
-box.schema.user.revoke('guest', 'replication')
+test_run:switch("master")
 ---
+- true
 ...
---
--- gh-4107: rebootstrap fails if the replica was deleted from
--- the cluster on the master.
---
 box.schema.user.grant('guest', 'replication')
 ---
 ...
-test_run:cmd("create server replica with rpl_master=default, script='replication/replica_uuid.lua'")
+test_run:cmd("create server replica with rpl_master=master, script='replication/replica_uuid.lua'")
 ---
 - true
 ...
@@ -462,11 +475,11 @@ box.space._cluster:get(2) ~= nil
 ---
 - true
 ...
-test_run:cmd("stop server replica")
+test_run:switch("default")
 ---
 - true
 ...
-test_run:cmd("cleanup server replica")
+test_run:cmd("stop server replica")
 ---
 - true
 ...
@@ -474,9 +487,11 @@ test_run:cmd("delete server replica")
 ---
 - true
 ...
-box.schema.user.revoke('guest', 'replication')
+test_run:cmd("stop server master")
 ---
+- true
 ...
-test_run:cleanup_cluster()
+test_run:cmd("delete server master")
 ---
+- true
 ...
diff --git a/test/replication/replica_rejoin.test.lua b/test/replication/replica_rejoin.test.lua
index c3ba9bf3f..2563177cf 100644
--- a/test/replication/replica_rejoin.test.lua
+++ b/test/replication/replica_rejoin.test.lua
@@ -1,9 +1,11 @@
 env = require('test_run')
 test_run = env.new()
 log = require('log')
-engine = test_run:get_cfg('engine')
 
-test_run:cleanup_cluster()
+test_run:cmd("create server master with script='replication/master1.lua'")
+test_run:cmd("start server master")
+test_run:switch("master")
+engine = test_run:get_cfg('engine')
 
 --
 -- gh-5806: this replica_rejoin test relies on the wal cleanup fiber
@@ -23,17 +25,17 @@ _ = box.space.test:insert{2}
 _ = box.space.test:insert{3}
 
 -- Join a replica, then stop it.
-test_run:cmd("create server replica with rpl_master=default, script='replication/replica_rejoin.lua'")
+test_run:cmd("create server replica with rpl_master=master, script='replication/replica_rejoin.lua'")
 test_run:cmd("start server replica")
 test_run:cmd("switch replica")
 box.info.replication[1].upstream.status == 'follow' or log.error(box.info)
 box.space.test:select()
-test_run:cmd("switch default")
+test_run:cmd("switch master")
 test_run:cmd("stop server replica")
 
 -- Restart the server to purge the replica from
 -- the garbage collection state.
-test_run:cmd("restart server default")
+test_run:cmd("restart server master")
 box.cfg{wal_cleanup_delay = 0}
 
 -- Make some checkpoints to remove old xlogs.
@@ -58,11 +60,11 @@ box.info.replication[2].downstream.vclock ~= nil or log.error(box.info)
 test_run:cmd("switch replica")
 box.info.replication[1].upstream.status == 'follow' or log.error(box.info)
 box.space.test:select()
-test_run:cmd("switch default")
+test_run:cmd("switch master")
 
 -- Make sure the replica follows new changes.
 for i = 10, 30, 10 do box.space.test:update(i, {{'!', 1, i}}) end
-vclock = test_run:get_vclock('default')
+vclock = test_run:get_vclock('master')
 vclock[0] = nil
 _ = test_run:wait_vclock('replica', vclock)
 test_run:cmd("switch replica")
@@ -76,9 +78,9 @@ box.space.test:select()
 -- Check that rebootstrap is NOT initiated unless the replica
 -- is strictly behind the master.
 box.space.test:replace{1, 2, 3} -- bumps LSN on the replica
-test_run:cmd("switch default")
+test_run:cmd("switch master")
 test_run:cmd("stop server replica")
-test_run:cmd("restart server default")
+test_run:cmd("restart server master")
 box.cfg{wal_cleanup_delay = 0}
 checkpoint_count = box.cfg.checkpoint_count
 box.cfg{checkpoint_count = 1}
@@ -99,7 +101,7 @@ box.space.test:select()
 --
 
 -- Bootstrap a new replica.
-test_run:cmd("switch default")
+test_run:cmd("switch master")
 test_run:cmd("stop server replica")
 test_run:cmd("cleanup server replica")
 test_run:cleanup_cluster()
@@ -113,17 +115,17 @@ box.cfg{replication = replica_listen}
 test_run:cmd("switch replica")
 box.cfg{replication = ''}
 -- Bump vclock on the master.
-test_run:cmd("switch default")
+test_run:cmd("switch master")
 box.space.test:replace{1}
 -- Bump vclock on the replica.
 test_run:cmd("switch replica")
 for i = 1, 10 do box.space.test:replace{2} end
 vclock = test_run:get_vclock('replica')
 vclock[0] = nil
-_ = test_run:wait_vclock('default', vclock)
+_ = test_run:wait_vclock('master', vclock)
 -- Restart the master and force garbage collection.
-test_run:cmd("switch default")
-test_run:cmd("restart server default")
+test_run:cmd("switch master")
+test_run:cmd("restart server master")
 box.cfg{wal_cleanup_delay = 0}
 replica_listen = test_run:cmd("eval replica 'return box.cfg.listen'")
 replica_listen ~= nil
@@ -139,7 +141,7 @@ test_run:cmd("switch replica")
 for i = 1, 10 do box.space.test:replace{2} end
 vclock = test_run:get_vclock('replica')
 vclock[0] = nil
-_ = test_run:wait_vclock('default', vclock)
+_ = test_run:wait_vclock('master', vclock)
 -- Restart the replica. It should successfully rebootstrap.
 test_run:cmd("restart server replica with args='true'")
 box.space.test:select()
@@ -148,20 +150,20 @@ box.space.test:replace{2}
 
 -- Cleanup.
 test_run:cmd("switch default")
-box.cfg{replication = ''}
 test_run:cmd("stop server replica")
-test_run:cmd("cleanup server replica")
 test_run:cmd("delete server replica")
-test_run:cleanup_cluster()
-box.space.test:drop()
-box.schema.user.revoke('guest', 'replication')
+test_run:cmd("stop server master")
+test_run:cmd("delete server master")
 
 --
 -- gh-4107: rebootstrap fails if the replica was deleted from
 -- the cluster on the master.
 --
+test_run:cmd("create server master with script='replication/master1.lua'")
+test_run:cmd("start server master")
+test_run:switch("master")
 box.schema.user.grant('guest', 'replication')
-test_run:cmd("create server replica with rpl_master=default, script='replication/replica_uuid.lua'")
+test_run:cmd("create server replica with rpl_master=master, script='replication/replica_uuid.lua'")
 start_cmd = string.format("start server replica with args='%s'", require('uuid').new())
 box.space._cluster:get(2) == nil
 test_run:cmd(start_cmd)
@@ -170,8 +172,8 @@ test_run:cmd("cleanup server replica")
 box.space._cluster:delete(2) ~= nil
 test_run:cmd(start_cmd)
 box.space._cluster:get(2) ~= nil
+test_run:switch("default")
 test_run:cmd("stop server replica")
-test_run:cmd("cleanup server replica")
 test_run:cmd("delete server replica")
-box.schema.user.revoke('guest', 'replication')
-test_run:cleanup_cluster()
+test_run:cmd("stop server master")
+test_run:cmd("delete server master")
-- 
2.30.1 (Apple Git-130)


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 6/8] box: introduce `box.ctl.demote`
  2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 6/8] box: introduce `box.ctl.demote` Serge Petrenko via Tarantool-patches
@ 2021-06-17 21:15   ` Serge Petrenko via Tarantool-patches
  0 siblings, 0 replies; 16+ messages in thread
From: Serge Petrenko via Tarantool-patches @ 2021-06-17 21:15 UTC (permalink / raw)
  To: v.shpilevoy, gorcunov; +Cc: tarantool-patches



18.06.2021 00:07, Serge Petrenko пишет:
> Introduce a new journal entry, DEMOTE. The entry has the same meaning as
> PROMOTE, with the only difference that it clears limbo ownership instead
> of transferring it to the issuer.
>

I've force pushed a fix for failing CI jobs:

diff --git a/test/replication/gh-6034-limbo-ownership.result 
b/test/replication/gh-6034-limbo-ownership.result
index 3681df3d8..3b2fcdda9 100644
--- a/test/replication/gh-6034-limbo-ownership.result
+++ b/test/replication/gh-6034-limbo-ownership.result
@@ -81,7 +81,7 @@ test_run:cmd('set variable rpl_listen to 
"replica.listen"')
  orig_replication = box.cfg.replication
   | ---
   | ...
-box.cfg{replication={box.info.listen, rpl_listen}}
+box.cfg{replication=rpl_listen}
   | ---
   | ...

diff --git a/test/replication/gh-6034-limbo-ownership.test.lua 
b/test/replication/gh-6034-limbo-ownership.test.lua
index 0e1586566..3beea8270 100644
--- a/test/replication/gh-6034-limbo-ownership.test.lua
+++ b/test/replication/gh-6034-limbo-ownership.test.lua
@@ -30,7 +30,7 @@ test_run:cmd('create server replica with 
rpl_master=default,\
  test_run:cmd('start server replica with wait=True, wait_load=True')
  test_run:cmd('set variable rpl_listen to "replica.listen"')
  orig_replication = box.cfg.replication
-box.cfg{replication={box.info.listen, rpl_listen}}
+box.cfg{replication=rpl_listen}

  test_run:switch('replica')
  assert(box.info.ro)


-- 
Serge Petrenko


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 2/8] txn_limbo: fix promote term filtering
  2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 2/8] txn_limbo: fix promote term filtering Serge Petrenko via Tarantool-patches
@ 2021-06-17 22:05   ` Cyrill Gorcunov via Tarantool-patches
  2021-06-18  8:55     ` Serge Petrenko via Tarantool-patches
  2021-06-18 10:31   ` Cyrill Gorcunov via Tarantool-patches
  1 sibling, 1 reply; 16+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-06-17 22:05 UTC (permalink / raw)
  To: Serge Petrenko; +Cc: v.shpilevoy, tarantool-patches

On Fri, Jun 18, 2021 at 12:07:36AM +0300, Serge Petrenko wrote:
> txn_limbo_process() used to filter out promote requests whose term was
> equal to the greatest term seen. This wasn't correct for PROMOTE entries
> with term 1.
...
> +	if (term > limbo->promote_greatest_term) {
> +		limbo->promote_greatest_term = term;
> +	} else if (req->type == IPROTO_PROMOTE &&
> +		   limbo->promote_greatest_term > 1) {
> +		/* PROMOTE for outdated term. Ignore. */
> +		say_info("RAFT: ignoring PROMOTE request from instance "
> +			 "id %"PRIu32" for term %"PRIu64". Greatest term seen "
> +			 "before (%"PRIu64") is bigger.", origin, term,
> +			 limbo->promote_greatest_term);
> +		return;
>  	}

Serge, please don't use PRIx helper, they are completely unreadable.
As far as I remember we agreed to use %llu with explicit long long
conversion. Ie

	say_info("RAFT: ignoring PROMOTE request from instance "
		 "id %u for term %lld . Greatest term seen "
		 "before (%llu) is bigger.", origin, (long long)term,
		 (long long)limbo->promote_greatest_term);

surely we can fix it on top of the series.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 2/8] txn_limbo: fix promote term filtering
  2021-06-17 22:05   ` Cyrill Gorcunov via Tarantool-patches
@ 2021-06-18  8:55     ` Serge Petrenko via Tarantool-patches
  0 siblings, 0 replies; 16+ messages in thread
From: Serge Petrenko via Tarantool-patches @ 2021-06-18  8:55 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: v.shpilevoy, tarantool-patches



18.06.2021 01:05, Cyrill Gorcunov пишет:
> On Fri, Jun 18, 2021 at 12:07:36AM +0300, Serge Petrenko wrote:
>> txn_limbo_process() used to filter out promote requests whose term was
>> equal to the greatest term seen. This wasn't correct for PROMOTE entries
>> with term 1.
> ...
>> +	if (term > limbo->promote_greatest_term) {
>> +		limbo->promote_greatest_term = term;
>> +	} else if (req->type == IPROTO_PROMOTE &&
>> +		   limbo->promote_greatest_term > 1) {
>> +		/* PROMOTE for outdated term. Ignore. */
>> +		say_info("RAFT: ignoring PROMOTE request from instance "
>> +			 "id %"PRIu32" for term %"PRIu64". Greatest term seen "
>> +			 "before (%"PRIu64") is bigger.", origin, term,
>> +			 limbo->promote_greatest_term);
>> +		return;
>>   	}
> Serge, please don't use PRIx helper, they are completely unreadable.
> As far as I remember we agreed to use %llu with explicit long long
> conversion. Ie
>
> 	say_info("RAFT: ignoring PROMOTE request from instance "
> 		 "id %u for term %lld . Greatest term seen "
> 		 "before (%llu) is bigger.", origin, (long long)term,
> 		 (long long)limbo->promote_greatest_term);
>
> surely we can fix it on top of the series.

Ok, sure. I thought this was fine. Let's change.

================================

diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
index a5d1df00c..e91782b40 100644
--- a/src/box/txn_limbo.c
+++ b/src/box/txn_limbo.c
@@ -674,9 +674,9 @@ txn_limbo_process(struct txn_limbo *limbo, const 
struct synchro_request *req)
                    limbo->promote_greatest_term > 1) {
                 /* PROMOTE for outdated term. Ignore. */
                 say_info("RAFT: ignoring PROMOTE request from instance "
-                        "id %"PRIu32" for term %"PRIu64". Greatest term 
seen "
-                        "before (%"PRIu64") is bigger.", origin, term,
-                        limbo->promote_greatest_term);
+                        "id %u for term %llu. Greatest term seen "
+                        "before (%llu) is bigger.", origin, (long 
long)term,
+                        (long long)limbo->promote_greatest_term);
                 return;
         }



================================

-- 
Serge Petrenko


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 2/8] txn_limbo: fix promote term filtering
  2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 2/8] txn_limbo: fix promote term filtering Serge Petrenko via Tarantool-patches
  2021-06-17 22:05   ` Cyrill Gorcunov via Tarantool-patches
@ 2021-06-18 10:31   ` Cyrill Gorcunov via Tarantool-patches
  2021-06-21 15:09     ` Serge Petrenko via Tarantool-patches
  1 sibling, 1 reply; 16+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-06-18 10:31 UTC (permalink / raw)
  To: Serge Petrenko; +Cc: v.shpilevoy, tarantool-patches

On Fri, Jun 18, 2021 at 12:07:36AM +0300, Serge Petrenko wrote:
> txn_limbo_process() used to filter out promote requests whose term was
> equal to the greatest term seen. This wasn't correct for PROMOTE entries
> with term 1.
> 
> Such entries appear after box.ctl.promote() is issued on an instance
> with disabled elections. Every PROMOTE entry from such an instance has
> term 1, but should still be applied. Fix this in the patch.
> 
> Also, when an outdated PROMOTE entry with term smaller than already
> applied from some replica arrived, it wasn't filtered at all. Such a
> situation shouldn't be possible, but fix it as well.
> 
> Part-of #6034
> ---
>  src/box/txn_limbo.c | 22 +++++++++++++---------
>  1 file changed, 13 insertions(+), 9 deletions(-)
> 
> diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
> index 51dc2a186..a5d1df00c 100644
> --- a/src/box/txn_limbo.c
> +++ b/src/box/txn_limbo.c
> @@ -665,17 +665,21 @@ txn_limbo_process(struct txn_limbo *limbo, const struct synchro_request *req)
>  {
>  	uint64_t term = req->term;
>  	uint32_t origin = req->origin_id;
> -	if (txn_limbo_replica_term(limbo, origin) < term) {
> +	if (txn_limbo_replica_term(limbo, origin) < term)
>  		vclock_follow(&limbo->promote_term_map, origin, term);
> -		if (term > limbo->promote_greatest_term) {
> -			limbo->promote_greatest_term = term;
> -		} else if (req->type == IPROTO_PROMOTE) {
> -			/*
> -			 * PROMOTE for outdated term. Ignore.
> -			 */
> -			return;
> -		}
> +
> +	if (term > limbo->promote_greatest_term) {
> +		limbo->promote_greatest_term = term;
> +	} else if (req->type == IPROTO_PROMOTE &&
> +		   limbo->promote_greatest_term > 1) {
> +		/* PROMOTE for outdated term. Ignore. */
> +		say_info("RAFT: ignoring PROMOTE request from instance "
> +			 "id %"PRIu32" for term %"PRIu64". Greatest term seen "
> +			 "before (%"PRIu64") is bigger.", origin, term,
> +			 limbo->promote_greatest_term);
> +		return;
>  	}

Serge, here is a moment I don't understand. The promote_term_map
and promote_greatest_term both are zero initially, then we start
accepting requests from the net and previously we've been updating
promote_term_map and promote_greatest_term in same context, iow
promote_term_map is filtering incoming traffic and update
promote_greatest_term with max from promote_term_map, thus
we can denote it as

	if (term > promote_term_map[i]) {
		promote_term_map[i] = term;
		promote_greatest_term = max(promote_greatest_term, term);
	}

thus promote_greatest_term always = max({promote_term_map[i]}), so I don't
understart why say

	uint64_t term = req->term;
	uint32_t origin = req->origin_id;

	if (txn_limbo_replica_term(limbo, origin) < term) {
		vclock_follow(&limbo->promote_term_map, origin, term);
		if (term > limbo->promote_greatest_term)
			limbo->promote_greatest_term = term;
	}

	if (iproto_type_is_promote_request(req->type) &&
	    limbo->promote_greatest_term > 1) {
		/* PROMOTE for outdated term. Ignore. */
		say_info("RAFT: ignoring %s request from instance "
			 "id %"PRIu32" for term %"PRIu64". Greatest term seen "
			 "before (%"PRIu64") is bigger.",
			 iproto_type_name(req->type), origin, term,
			 limbo->promote_greatest_term);
		return;
	}

won't do the trick? Do I miss something obvious here?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 0/8] forbid implicit limbo ownership transition
  2021-06-17 21:07 [Tarantool-patches] [PATCH v2 0/8] forbid implicit limbo ownership transition Serge Petrenko via Tarantool-patches
                   ` (7 preceding siblings ...)
  2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 8/8] replication: send latest effective promote in initial join Serge Petrenko via Tarantool-patches
@ 2021-06-18 13:02 ` Cyrill Gorcunov via Tarantool-patches
  2021-06-21 12:11 ` [Tarantool-patches] [PATCH v2 9/8] replication: send current Raft term in join response Serge Petrenko via Tarantool-patches
  9 siblings, 0 replies; 16+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-06-18 13:02 UTC (permalink / raw)
  To: Serge Petrenko; +Cc: v.shpilevoy, tarantool-patches

On Fri, Jun 18, 2021 at 12:07:34AM +0300, Serge Petrenko wrote:
> https://github.com/tarantool/tarantool/issues/5438
> https://github.com/tarantool/tarantool/issues/6034
> https://github.com/tarantool/tarantool/tree/sp/gh-6034-empty-limbo-transition
> 
> Changes in v2:
>   - reorder the commits for a more logical sequence.
>   - review fixes as per review from Vlad.
>   - fix replica_rejoin test
> 
> Serge Petrenko (8):
>   replication: always send raft state to subscribers
>   txn_limbo: fix promote term filtering
>   raft: refactor raft_new_term()
>   box: make promote always bump the term
>   replication: forbid implicit limbo owner transition
>   box: introduce `box.ctl.demote`
>   txn_limbo: persist the latest effective promote in snapshot
>   replication: send latest effective promote in initial join

The series looks ok to me, thanks! Lets wait for Vlad's review
(though i would make fixes on top rather if any)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Tarantool-patches] [PATCH v2 9/8] replication: send current Raft term in join response
  2021-06-17 21:07 [Tarantool-patches] [PATCH v2 0/8] forbid implicit limbo ownership transition Serge Petrenko via Tarantool-patches
                   ` (8 preceding siblings ...)
  2021-06-18 13:02 ` [Tarantool-patches] [PATCH v2 0/8] forbid implicit limbo ownership transition Cyrill Gorcunov via Tarantool-patches
@ 2021-06-21 12:11 ` Serge Petrenko via Tarantool-patches
  9 siblings, 0 replies; 16+ messages in thread
From: Serge Petrenko via Tarantool-patches @ 2021-06-21 12:11 UTC (permalink / raw)
  To: v.shpilevoy, gorcunov; +Cc: tarantool-patches

Make Raft nodes send out their latest persisted term to joining
replicas.

Follow-up #6034

diff --git a/src/box/applier.cc b/src/box/applier.cc
index 9d90a2384..2b3c653a5 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -463,6 +463,11 @@ applier_wait_snapshot(struct applier *applier)
                         if (xrow_decode_synchro(&row, &req) != 0)
                                 diag_raise();
                         txn_limbo_process(&txn_limbo, &req);
+               } else if (iproto_type_is_raft_request(row.type)) {
+                       struct raft_request req;
+                       if (xrow_decode_raft(&row, &req, NULL) != 0)
+                               diag_raise();
+                       box_raft_recover(&req);
                 } else if (iproto_type_is_error(row.type)) {
                         xrow_decode_error_xc(&row);  /* rethrow error */
                 } else {
diff --git a/src/box/relay.cc b/src/box/relay.cc
index e05b53d5d..510e5c420 100644
--- a/src/box/relay.cc
+++ b/src/box/relay.cc
@@ -400,7 +400,9 @@ relay_initial_join(int fd, uint64_t sync, struct 
vclock *vclock)
                 diag_raise();

         struct synchro_request req;
+       struct raft_request raft_req;
         txn_limbo_checkpoint(&txn_limbo, &req);
+       box_raft_checkpoint_local(&raft_req);

         /* Respond to the JOIN request with the current vclock. */
         struct xrow_header row;
@@ -419,6 +421,11 @@ relay_initial_join(int fd, uint64_t sync, struct 
vclock *vclock)
                 row.sync = sync;
                 coio_write_xrow(&relay->io, &row);
         }
+       if (raft_req.term > 1) {
+               xrow_encode_raft(&row, &fiber()->gc, &raft_req);
+               row.sync = sync;
+               coio_write_xrow(&relay->io, &row);
+       }

         /* Send read view to the replica. */
         engine_join_xc(&ctx, &relay->stream);

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 2/8] txn_limbo: fix promote term filtering
  2021-06-18 10:31   ` Cyrill Gorcunov via Tarantool-patches
@ 2021-06-21 15:09     ` Serge Petrenko via Tarantool-patches
  0 siblings, 0 replies; 16+ messages in thread
From: Serge Petrenko via Tarantool-patches @ 2021-06-21 15:09 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: v.shpilevoy, tarantool-patches



18.06.2021 13:31, Cyrill Gorcunov пишет:
> On Fri, Jun 18, 2021 at 12:07:36AM +0300, Serge Petrenko wrote:
>> txn_limbo_process() used to filter out promote requests whose term was
>> equal to the greatest term seen. This wasn't correct for PROMOTE entries
>> with term 1.
>>
>> Such entries appear after box.ctl.promote() is issued on an instance
>> with disabled elections. Every PROMOTE entry from such an instance has
>> term 1, but should still be applied. Fix this in the patch.
>>
>> Also, when an outdated PROMOTE entry with term smaller than already
>> applied from some replica arrived, it wasn't filtered at all. Such a
>> situation shouldn't be possible, but fix it as well.
>>
>> Part-of #6034
>> ---
>>   src/box/txn_limbo.c | 22 +++++++++++++---------
>>   1 file changed, 13 insertions(+), 9 deletions(-)
>>
>> diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
>> index 51dc2a186..a5d1df00c 100644
>> --- a/src/box/txn_limbo.c
>> +++ b/src/box/txn_limbo.c
>> @@ -665,17 +665,21 @@ txn_limbo_process(struct txn_limbo *limbo, const struct synchro_request *req)
>>   {
>>   	uint64_t term = req->term;
>>   	uint32_t origin = req->origin_id;
>> -	if (txn_limbo_replica_term(limbo, origin) < term) {
>> +	if (txn_limbo_replica_term(limbo, origin) < term)
>>   		vclock_follow(&limbo->promote_term_map, origin, term);
>> -		if (term > limbo->promote_greatest_term) {
>> -			limbo->promote_greatest_term = term;
>> -		} else if (req->type == IPROTO_PROMOTE) {
>> -			/*
>> -			 * PROMOTE for outdated term. Ignore.
>> -			 */
>> -			return;
>> -		}
>> +
>> +	if (term > limbo->promote_greatest_term) {
>> +		limbo->promote_greatest_term = term;
>> +	} else if (req->type == IPROTO_PROMOTE &&
>> +		   limbo->promote_greatest_term > 1) {
>> +		/* PROMOTE for outdated term. Ignore. */
>> +		say_info("RAFT: ignoring PROMOTE request from instance "
>> +			 "id %"PRIu32" for term %"PRIu64". Greatest term seen "
>> +			 "before (%"PRIu64") is bigger.", origin, term,
>> +			 limbo->promote_greatest_term);
>> +		return;
>>   	}

Thanks for the review!

> Serge, here is a moment I don't understand. The promote_term_map
> and promote_greatest_term both are zero initially, then we start
> accepting requests from the net and previously we've been updating
> promote_term_map and promote_greatest_term in same context, iow
> promote_term_map is filtering incoming traffic and update
> promote_greatest_term with max from promote_term_map, thus
> we can denote it as
>
> 	if (term > promote_term_map[i]) {
> 		promote_term_map[i] = term;
> 		promote_greatest_term = max(promote_greatest_term, term);
> 	}
>
> thus promote_greatest_term always = max({promote_term_map[i]}), so I don't
> understart why say
>
> 	uint64_t term = req->term;
> 	uint32_t origin = req->origin_id;
>
> 	if (txn_limbo_replica_term(limbo, origin) < term) {
> 		vclock_follow(&limbo->promote_term_map, origin, term);
> 		if (term > limbo->promote_greatest_term)
> 			limbo->promote_greatest_term = term;
> 	}
>
> 	if (iproto_type_is_promote_request(req->type) &&
> 	    limbo->promote_greatest_term > 1) {
> 		/* PROMOTE for outdated term. Ignore. */
> 		say_info("RAFT: ignoring %s request from instance "
> 			 "id %"PRIu32" for term %"PRIu64". Greatest term seen "
> 			 "before (%"PRIu64") is bigger.",
> 			 iproto_type_name(req->type), origin, term,
> 			 limbo->promote_greatest_term);
> 		return;
> 	}
>
> won't do the trick? Do I miss something obvious here?
It will, thanks for the suggestion! I applied your changes:

==========================

diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
index e91782b40..16181b8a0 100644
--- a/src/box/txn_limbo.c
+++ b/src/box/txn_limbo.c
@@ -665,11 +665,10 @@ txn_limbo_process(struct txn_limbo *limbo, const 
struct synchro_request *req)
  {
         uint64_t term = req->term;
         uint32_t origin = req->origin_id;
-       if (txn_limbo_replica_term(limbo, origin) < term)
+       if (txn_limbo_replica_term(limbo, origin) < term) {
                 vclock_follow(&limbo->promote_term_map, origin, term);
-
-       if (term > limbo->promote_greatest_term) {
-               limbo->promote_greatest_term = term;
+               if (term > limbo->promote_greatest_term)
+                       limbo->promote_greatest_term = term;
         } else if (req->type == IPROTO_PROMOTE &&
                    limbo->promote_greatest_term > 1) {
                 /* PROMOTE for outdated term. Ignore. */

==========================

-- 
Serge Petrenko


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2021-06-21 15:09 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-17 21:07 [Tarantool-patches] [PATCH v2 0/8] forbid implicit limbo ownership transition Serge Petrenko via Tarantool-patches
2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 1/8] replication: always send raft state to subscribers Serge Petrenko via Tarantool-patches
2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 2/8] txn_limbo: fix promote term filtering Serge Petrenko via Tarantool-patches
2021-06-17 22:05   ` Cyrill Gorcunov via Tarantool-patches
2021-06-18  8:55     ` Serge Petrenko via Tarantool-patches
2021-06-18 10:31   ` Cyrill Gorcunov via Tarantool-patches
2021-06-21 15:09     ` Serge Petrenko via Tarantool-patches
2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 3/8] raft: refactor raft_new_term() Serge Petrenko via Tarantool-patches
2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 4/8] box: make promote always bump the term Serge Petrenko via Tarantool-patches
2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 5/8] replication: forbid implicit limbo owner transition Serge Petrenko via Tarantool-patches
2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 6/8] box: introduce `box.ctl.demote` Serge Petrenko via Tarantool-patches
2021-06-17 21:15   ` Serge Petrenko via Tarantool-patches
2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 7/8] txn_limbo: persist the latest effective promote in snapshot Serge Petrenko via Tarantool-patches
2021-06-17 21:07 ` [Tarantool-patches] [PATCH v2 8/8] replication: send latest effective promote in initial join Serge Petrenko via Tarantool-patches
2021-06-18 13:02 ` [Tarantool-patches] [PATCH v2 0/8] forbid implicit limbo ownership transition Cyrill Gorcunov via Tarantool-patches
2021-06-21 12:11 ` [Tarantool-patches] [PATCH v2 9/8] replication: send current Raft term in join response Serge Petrenko via Tarantool-patches

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox