Tarantool development patches archive
From: Vladislav Shpilevoy via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: tarantool-patches@dev.tarantool.org, gorcunov@gmail.com,
	sergepetrenko@tarantool.org
Subject: [Tarantool-patches] [PATCH 2/2] election: during bootstrap prefer candidates
Date: Fri, 16 Jul 2021 01:49:50 +0200
Message-ID: <b8eea53ce98aef8f58f0b642742aa9b73de5adf5.1626392372.git.v.shpilevoy@tarantool.org>
In-Reply-To: <cover.1626392372.git.v.shpilevoy@tarantool.org>

During cluster bootstrap the boot master election algorithm didn't
take the election modes of the instances into account. It could
happen that all nodes had box.cfg.read_only = false, none was
booted yet, and all were still read-only. Then the node with the
smallest UUID was chosen even if it had
box.cfg.election_mode = 'voter'.

Such a node could neither boot itself nor register other nodes, so
the cluster couldn't start.

The patch makes the boot master election prefer the instances
which can become a Raft leader when all the other parameters
don't single out a winner.

Closes #6018
---
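Note on the replication.cc hunk: the scoring change can be tried in
isolation. Below is a minimal sketch, not the actual Tarantool code.
struct ballot_sketch, ballot_score() and pick_boot_master() are
hypothetical names; only the four flags and the weights 1000/100/10/1
come from the patch. The sketch assumes that the highest score wins and
that a tie goes to the smaller UUID, compared here as text with
strcmp(). Any other tie-breakers the real replicaset_find_join_master()
applies are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the ballot fields the patch reads. */
struct ballot_sketch {
	const char *uuid;	/* Textual UUID, compared with strcmp(). */
	bool is_booted;
	bool is_ro_cfg;
	bool is_ro;
	bool can_be_leader;
};

/* Weights as in the patch: each flag outweighs all the lower ones together. */
static int
ballot_score(const struct ballot_sketch *b)
{
	int score = 0;
	if (b->is_booted)
		score += 1000;
	if (!b->is_ro_cfg)
		score += 100;
	if (!b->is_ro)
		score += 10;
	if (b->can_be_leader)
		score += 1;
	return score;
}

/* Assumption: the highest score wins, a tie goes to the smaller UUID. */
static const struct ballot_sketch *
pick_boot_master(const struct ballot_sketch *nodes, size_t count)
{
	assert(count > 0);
	const struct ballot_sketch *best = &nodes[0];
	for (size_t i = 1; i < count; i++) {
		int diff = ballot_score(&nodes[i]) - ballot_score(best);
		if (diff > 0 ||
		    (diff == 0 && strcmp(nodes[i].uuid, best->uuid) < 0))
			best = &nodes[i];
	}
	return best;
}
```

With the two instances from the test (neither booted, read_only = false
in the config, both still read-only), the candidate scores 101 and the
voter 100, so the candidate is picked even though the voter's UUID is
smaller. With the old weights 10/5/1 both would score 5 and the voter
would win on UUID.
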
 .../unreleased/gh-6018-election-boot-voter.md |   4 +
 src/box/box.cc                                |  25 +++-
 src/box/replication.cc                        |  11 +-
 .../gh-6018-election-boot-voter.result        | 116 ++++++++++++++++++
 .../gh-6018-election-boot-voter.test.lua      |  59 +++++++++
 test/replication/gh-6018-master.lua           |  17 +++
 test/replication/gh-6018-replica.lua          |  15 +++
 test/replication/suite.cfg                    |   1 +
 8 files changed, 245 insertions(+), 3 deletions(-)
 create mode 100644 changelogs/unreleased/gh-6018-election-boot-voter.md
 create mode 100644 test/replication/gh-6018-election-boot-voter.result
 create mode 100644 test/replication/gh-6018-election-boot-voter.test.lua
 create mode 100644 test/replication/gh-6018-master.lua
 create mode 100644 test/replication/gh-6018-replica.lua

diff --git a/changelogs/unreleased/gh-6018-election-boot-voter.md b/changelogs/unreleased/gh-6018-election-boot-voter.md
new file mode 100644
index 000000000..080484bbe
--- /dev/null
+++ b/changelogs/unreleased/gh-6018-election-boot-voter.md
@@ -0,0 +1,4 @@
+## bugfix/replication
+
+* Fixed a cluster sometimes being unable to bootstrap if it contains nodes with
+  `election_mode` `manual` or `voter` (gh-6018).
diff --git a/src/box/box.cc b/src/box/box.cc
index ef3efe3e0..3105b04b6 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -3519,7 +3519,30 @@ box_cfg_xc(void)
 		 * should take the control over the situation and start a new
 		 * term immediately.
 		 */
-		raft_new_term(box_raft());
+		struct raft *raft = box_raft();
+		if (box_election_mode == ELECTION_MODE_MANUAL) {
+			raft_start_candidate(raft);
+			raft_new_term(raft);
+			int rc = box_raft_wait_leader_found();
+			/*
+			 * No need to check if the mode is still manual - it
+			 * couldn't change because box.cfg is protected with a
+			 * fiber lock.
+			 */
+			assert(box_election_mode == ELECTION_MODE_MANUAL);
+			raft_stop_candidate(raft, false);
+			/*
+			 * It should not fail, because on bootstrap the node is
+			 * the only registered instance. It can't lose the
+			 * elections while being a lone participant. But still
+			 * check the result so as not to ignore potential
+			 * problems.
+			 */
+			if (rc != 0)
+				diag_raise();
+		} else {
+			raft_new_term(raft);
+		}
 	}
 
 	/* box.cfg.read_only is not read yet. */
diff --git a/src/box/replication.cc b/src/box/replication.cc
index a0b3e0186..622d12f74 100644
--- a/src/box/replication.cc
+++ b/src/box/replication.cc
@@ -978,12 +978,19 @@ replicaset_find_join_master(void)
 		 * config is stronger because if it is configured as read-only,
 		 * it is in read-only state for sure, until the config is
 		 * changed.
+		 *
+		 * In a cluster with leader election enabled all instances might
+		 * look equal by the scores above. Then prefer the ones
+		 * which can be elected as a leader, because only they would be
+		 * able to boot themselves and register the others.
 		 */
 		if (ballot->is_booted)
-			score += 10;
+			score += 1000;
 		if (!ballot->is_ro_cfg)
-			score += 5;
+			score += 100;
 		if (!ballot->is_ro)
+			score += 10;
+		if (ballot->can_be_leader)
 			score += 1;
 		if (leader_score < score)
 			goto elect;
diff --git a/test/replication/gh-6018-election-boot-voter.result b/test/replication/gh-6018-election-boot-voter.result
new file mode 100644
index 000000000..c960aa4bd
--- /dev/null
+++ b/test/replication/gh-6018-election-boot-voter.result
@@ -0,0 +1,116 @@
+-- test-run result file version 2
+--
+-- gh-6018: in a auto-election cluster nodes with voter state could be selected
+-- as bootstrap leaders. They should not, because a voter can't be ever writable
+-- and it can neither boot itself nor register other nodes.
+--
+-- Similar situation was with the manual election. All instances might have
+-- manual election mode. Such a cluster wouldn't be able to boot if their
+-- bootstrap master wouldn't become an elected leader automatically at least
+-- once.
+--
+test_run = require('test_run').new()
+ | ---
+ | ...
+
+function boot_with_master_election_mode(mode)                                   \
+        test_run:cmd('create server master with '..                             \
+                     'script="replication/gh-6018-master.lua"')                 \
+        test_run:cmd('start server master with wait=False, args="'..mode..'"')  \
+        test_run:cmd('create server replica with '..                            \
+                     'script="replication/gh-6018-replica.lua"')                \
+        test_run:cmd('start server replica')                                    \
+end
+ | ---
+ | ...
+
+function stop_cluster()                                                         \
+    test_run:cmd('stop server replica')                                         \
+    test_run:cmd('stop server master')                                          \
+    test_run:cmd('delete server replica')                                       \
+    test_run:cmd('delete server master')                                        \
+end
+ | ---
+ | ...
+
+--
+-- Candidate leader.
+--
+boot_with_master_election_mode('candidate')
+ | ---
+ | ...
+
+test_run:switch('master')
+ | ---
+ | - true
+ | ...
+test_run:wait_cond(function() return not box.info.ro end)
+ | ---
+ | - true
+ | ...
+assert(box.info.election.state == 'leader')
+ | ---
+ | - true
+ | ...
+
+test_run:switch('replica')
+ | ---
+ | - true
+ | ...
+assert(box.info.ro)
+ | ---
+ | - true
+ | ...
+assert(box.info.election.state == 'follower')
+ | ---
+ | - true
+ | ...
+
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+stop_cluster()
+ | ---
+ | ...
+
+--
+-- Manual leader.
+--
+boot_with_master_election_mode('manual')
+ | ---
+ | ...
+
+test_run:switch('master')
+ | ---
+ | - true
+ | ...
+test_run:wait_cond(function() return not box.info.ro end)
+ | ---
+ | - true
+ | ...
+assert(box.info.election.state == 'leader')
+ | ---
+ | - true
+ | ...
+
+test_run:switch('replica')
+ | ---
+ | - true
+ | ...
+assert(box.info.ro)
+ | ---
+ | - true
+ | ...
+assert(box.info.election.state == 'follower')
+ | ---
+ | - true
+ | ...
+
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+stop_cluster()
+ | ---
+ | ...
diff --git a/test/replication/gh-6018-election-boot-voter.test.lua b/test/replication/gh-6018-election-boot-voter.test.lua
new file mode 100644
index 000000000..800e20c8f
--- /dev/null
+++ b/test/replication/gh-6018-election-boot-voter.test.lua
@@ -0,0 +1,59 @@
+--
+-- gh-6018: in a auto-election cluster nodes with voter state could be selected
+-- as bootstrap leaders. They should not, because a voter can't be ever writable
+-- and it can neither boot itself nor register other nodes.
+--
+-- Similar situation was with the manual election. All instances might have
+-- manual election mode. Such a cluster wouldn't be able to boot if their
+-- bootstrap master wouldn't become an elected leader automatically at least
+-- once.
+--
+test_run = require('test_run').new()
+
+function boot_with_master_election_mode(mode)                                   \
+        test_run:cmd('create server master with '..                             \
+                     'script="replication/gh-6018-master.lua"')                 \
+        test_run:cmd('start server master with wait=False, args="'..mode..'"')  \
+        test_run:cmd('create server replica with '..                            \
+                     'script="replication/gh-6018-replica.lua"')                \
+        test_run:cmd('start server replica')                                    \
+end
+
+function stop_cluster()                                                         \
+    test_run:cmd('stop server replica')                                         \
+    test_run:cmd('stop server master')                                          \
+    test_run:cmd('delete server replica')                                       \
+    test_run:cmd('delete server master')                                        \
+end
+
+--
+-- Candidate leader.
+--
+boot_with_master_election_mode('candidate')
+
+test_run:switch('master')
+test_run:wait_cond(function() return not box.info.ro end)
+assert(box.info.election.state == 'leader')
+
+test_run:switch('replica')
+assert(box.info.ro)
+assert(box.info.election.state == 'follower')
+
+test_run:switch('default')
+stop_cluster()
+
+--
+-- Manual leader.
+--
+boot_with_master_election_mode('manual')
+
+test_run:switch('master')
+test_run:wait_cond(function() return not box.info.ro end)
+assert(box.info.election.state == 'leader')
+
+test_run:switch('replica')
+assert(box.info.ro)
+assert(box.info.election.state == 'follower')
+
+test_run:switch('default')
+stop_cluster()
diff --git a/test/replication/gh-6018-master.lua b/test/replication/gh-6018-master.lua
new file mode 100644
index 000000000..1192204ff
--- /dev/null
+++ b/test/replication/gh-6018-master.lua
@@ -0,0 +1,17 @@
+#!/usr/bin/env tarantool
+
+require('console').listen(os.getenv('ADMIN'))
+
+box.cfg({
+    listen = 'unix/:./gh-6018-master.sock',
+    replication = {
+	'unix/:./gh-6018-master.sock',
+	'unix/:./gh-6018-replica.sock',
+    },
+    election_mode = arg[1],
+    instance_uuid = 'cbf06940-0790-498b-948d-042b62cf3d29',
+    replication_timeout = 0.1,
+})
+
+box.ctl.wait_rw()
+box.schema.user.grant('guest', 'super')
diff --git a/test/replication/gh-6018-replica.lua b/test/replication/gh-6018-replica.lua
new file mode 100644
index 000000000..71e669141
--- /dev/null
+++ b/test/replication/gh-6018-replica.lua
@@ -0,0 +1,15 @@
+#!/usr/bin/env tarantool
+
+require('console').listen(os.getenv('ADMIN'))
+
+box.cfg({
+    listen = 'unix/:./gh-6018-replica.sock',
+    replication = {
+	'unix/:./gh-6018-master.sock',
+	'unix/:./gh-6018-replica.sock',
+    },
+    election_mode = 'voter',
+    -- Smaller than master UUID.
+    instance_uuid = 'cbf06940-0790-498b-948d-042b62cf3d28',
+    replication_timeout = 0.1,
+})
diff --git a/test/replication/suite.cfg b/test/replication/suite.cfg
index 69f2f3511..2bfc3b845 100644
--- a/test/replication/suite.cfg
+++ b/test/replication/suite.cfg
@@ -45,6 +45,7 @@
     "gh-5536-wal-limit.test.lua": {},
     "gh-5566-final-join-synchro.test.lua": {},
     "gh-5613-bootstrap-prefer-booted.test.lua": {},
+    "gh-6018-election-boot-voter.test.lua": {},
     "gh-6027-applier-error-show.test.lua": {},
     "gh-6032-promote-wal-write.test.lua": {},
     "gh-6057-qsync-confirm-async-no-wal.test.lua": {},
-- 
2.24.3 (Apple Git-128)


Thread overview: 23+ messages
2021-07-15 23:49 [Tarantool-patches] [PATCH 0/2] Bootstrap voter Vladislav Shpilevoy via Tarantool-patches
2021-07-15 23:49 ` [Tarantool-patches] [PATCH 1/2] replication: introduce ballot.can_be_leader Vladislav Shpilevoy via Tarantool-patches
2021-07-16 10:59   ` Serge Petrenko via Tarantool-patches
2021-07-18 17:00     ` Vladislav Shpilevoy via Tarantool-patches
2021-07-19  9:11       ` Sergey Petrenko via Tarantool-patches
2021-07-16 14:29   ` Konstantin Osipov via Tarantool-patches
2021-07-18 17:00     ` Vladislav Shpilevoy via Tarantool-patches
2021-07-19  9:12       ` Konstantin Osipov via Tarantool-patches
2021-07-19 22:06         ` Vladislav Shpilevoy via Tarantool-patches
2021-07-20  8:49           ` Konstantin Osipov via Tarantool-patches
2021-07-20 20:02             ` Vladislav Shpilevoy via Tarantool-patches
2021-07-20 20:18               ` Konstantin Osipov via Tarantool-patches
2021-07-20 21:16         ` Cyrill Gorcunov via Tarantool-patches
2021-07-20 23:20           ` Konstantin Osipov via Tarantool-patches
2021-07-21 18:51             ` Cyrill Gorcunov via Tarantool-patches
2021-07-21 21:43             ` Vladislav Shpilevoy via Tarantool-patches
2021-07-15 23:49 ` Vladislav Shpilevoy via Tarantool-patches [this message]
2021-07-16 11:30   ` [Tarantool-patches] [PATCH 2/2] election: during bootstrap prefer candidates Serge Petrenko via Tarantool-patches
2021-07-18 17:00     ` Vladislav Shpilevoy via Tarantool-patches
2021-07-16 14:27 ` [Tarantool-patches] [PATCH 0/2] Bootstrap voter Konstantin Osipov via Tarantool-patches
2021-07-18 17:00   ` Vladislav Shpilevoy via Tarantool-patches
2021-07-19  9:13     ` Konstantin Osipov via Tarantool-patches
2021-07-19 22:04       ` Vladislav Shpilevoy via Tarantool-patches
