[Tarantool-patches] [PATCH v2 5/5] election: promote 'manual' bootstrap master

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Sun Jul 18 19:53:29 MSK 2021


A cluster may consist of only voters and manual nodes. In that
case nobody is elected as a Raft leader automatically on
bootstrap, so nobody can create the first snapshot and register
the others.

After the previous commit manual nodes are preferred as bootstrap
masters, but they couldn't do anything, because they never became
leaders on their own.

This patch makes a 'manual' bootstrap master promote itself for
one term so that it can boot the cluster.

Closes #6018
---
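
Note for reviewers: the condition this patch adds to box_cfg_xc()
boils down to the standalone sketch below. It is a simplified model
and not the code in box.cc: ElectionMode and should_self_promote()
are hypothetical names used only for illustration, and the real code
additionally waits for the term outcome and handles errors.

```cpp
#include <cassert>

// Hypothetical stand-in for the election modes, not the real enum.
enum class ElectionMode { Off, Voter, Manual, Candidate };

// Returns true when a node should promote itself during bootstrap.
// Only a bootstrap leader in 'candidate' or 'manual' mode qualifies;
// everyone else either syncs with the replicaset or stays passive.
static bool
should_self_promote(bool is_bootstrap_leader, ElectionMode mode)
{
	if (!is_bootstrap_leader)
		return false;
	return mode == ElectionMode::Candidate ||
	       mode == ElectionMode::Manual;
}
```

With this model a 'manual' bootstrap master promotes itself, while a
'voter' or a non-leader node never does.
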
 .../unreleased/gh-6018-election-boot-voter.md |  4 ++
 src/box/box.cc                                | 19 +++++++-
 .../gh-6018-election-boot-voter.result        | 46 +++++++++++++++++++
 .../gh-6018-election-boot-voter.test.lua      | 21 +++++++++
 4 files changed, 88 insertions(+), 2 deletions(-)
 create mode 100644 changelogs/unreleased/gh-6018-election-boot-voter.md

diff --git a/changelogs/unreleased/gh-6018-election-boot-voter.md b/changelogs/unreleased/gh-6018-election-boot-voter.md
new file mode 100644
index 000000000..080484bbe
--- /dev/null
+++ b/changelogs/unreleased/gh-6018-election-boot-voter.md
@@ -0,0 +1,4 @@
+## bugfix/replication
+
+* Fixed a cluster sometimes being unable to bootstrap if it contains nodes with
+  `election_mode` `manual` or `voter` (gh-6018).
diff --git a/src/box/box.cc b/src/box/box.cc
index b828d3a31..8c10a99dd 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -3501,7 +3501,8 @@ box_cfg_xc(void)
 
 	if (!is_bootstrap_leader) {
 		replicaset_sync();
-	} else {
+	} else if (box_election_mode == ELECTION_MODE_CANDIDATE ||
+		   box_election_mode == ELECTION_MODE_MANUAL) {
 		/*
 		 * When the cluster is just bootstrapped and this instance is a
 		 * leader, it makes no sense to wait for a leader appearance.
@@ -3509,7 +3510,21 @@ box_cfg_xc(void)
 		 * should take the control over the situation and start a new
 		 * term immediately.
 		 */
-		raft_new_term(raft);
+		raft_promote(raft);
+		int rc = box_raft_wait_term_outcome();
+		if (rc == 0 && raft->leader != instance_id) {
+			/*
+			 * It was promoted and is a single registered node -
+			 * there can't be another leader or a new term bump.
+			 */
+			panic("Bootstrap master couldn't elect self as a "
+			      "leader. Leader is %u, term is %llu",
+			      raft->leader, (unsigned long long)raft->volatile_term);
+		}
+		if (rc != 0) {
+			raft_restore(raft);
+			diag_raise();
+		}
 	}
 
 	/* box.cfg.read_only is not read yet. */
diff --git a/test/replication/gh-6018-election-boot-voter.result b/test/replication/gh-6018-election-boot-voter.result
index 6b05f0825..1b7949bb9 100644
--- a/test/replication/gh-6018-election-boot-voter.result
+++ b/test/replication/gh-6018-election-boot-voter.result
@@ -4,6 +4,11 @@
 -- as bootstrap leaders. They should not, because a voter can't be ever writable
 -- and it can neither boot itself nor register other nodes.
 --
+-- Similar situation was with the manual election. All instances might have
+-- manual election mode. Such a cluster wouldn't be able to boot if their
+-- bootstrap master wouldn't become an elected leader automatically at least
+-- once.
+--
 test_run = require('test_run').new()
  | ---
  | ...
@@ -68,3 +73,44 @@ test_run:switch('default')
 stop_cluster()
  | ---
  | ...
+
+--
+-- Manual leader.
+--
+boot_with_master_election_mode('manual')
+ | ---
+ | ...
+
+test_run:switch('master')
+ | ---
+ | - true
+ | ...
+test_run:wait_cond(function() return not box.info.ro end)
+ | ---
+ | - true
+ | ...
+assert(box.info.election.state == 'leader')
+ | ---
+ | - true
+ | ...
+
+test_run:switch('replica')
+ | ---
+ | - true
+ | ...
+assert(box.info.ro)
+ | ---
+ | - true
+ | ...
+assert(box.info.election.state == 'follower')
+ | ---
+ | - true
+ | ...
+
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+stop_cluster()
+ | ---
+ | ...
diff --git a/test/replication/gh-6018-election-boot-voter.test.lua b/test/replication/gh-6018-election-boot-voter.test.lua
index 7222beb19..fb08e2bc8 100644
--- a/test/replication/gh-6018-election-boot-voter.test.lua
+++ b/test/replication/gh-6018-election-boot-voter.test.lua
@@ -3,6 +3,11 @@
 -- as bootstrap leaders. They should not, because a voter can't be ever writable
 -- and it can neither boot itself nor register other nodes.
 --
+-- Similar situation was with the manual election. All instances might have
+-- manual election mode. Such a cluster wouldn't be able to boot if their
+-- bootstrap master wouldn't become an elected leader automatically at least
+-- once.
+--
 test_run = require('test_run').new()
 
 function boot_with_master_election_mode(mode)                                   \
@@ -36,3 +41,19 @@ assert(box.info.election.state == 'follower')
 
 test_run:switch('default')
 stop_cluster()
+
+--
+-- Manual leader.
+--
+boot_with_master_election_mode('manual')
+
+test_run:switch('master')
+test_run:wait_cond(function() return not box.info.ro end)
+assert(box.info.election.state == 'leader')
+
+test_run:switch('replica')
+assert(box.info.ro)
+assert(box.info.election.state == 'follower')
+
+test_run:switch('default')
+stop_cluster()
-- 
2.24.3 (Apple Git-128)


