Tarantool development patches archive
 help / color / mirror / Atom feed
From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: tarantool-patches@dev.tarantool.org, sergepetrenko@tarantool.org
Subject: [Tarantool-patches] [PATCH 1/3] raft: send state to new subscribers if Raft worked
Date: Sat, 17 Oct 2020 19:17:55 +0200	[thread overview]
Message-ID: <15c343a9d2140a6c7ba31cb8761b2bca0ca6e09a.1602954690.git.v.shpilevoy@tarantool.org> (raw)
In-Reply-To: <cover.1602954690.git.v.shpilevoy@tarantool.org>

Before the patch a node sent its complete Raft state to new
replication subscribers only when Raft was enabled: voter or
candidate. When the mode was off, it didn't send anything.

This led to a bug (or maybe just inconvenience), that if the node
was a leader, and couldn't deliver the 'off' message to all the
followers, some of them would still think the node is a leader
until their restart or a new leader election.

If the followers were just voters or also had Raft disabled, they
would treat the old node as a leader infinitely until a new one
was elected somewhere.

The state wasn't sent not to break connections with the old
versions, which won't understand the Raft messages.

The patch makes Raft state being sent even if Raft is disabled on
the node, but was enabled ever in the past (term is not initial,
!= 1). In that way the old leader will keep sending resignation
messages even in case of restart or any other issue. And it won't
break the old versions when Raft is not used and was not used
ever.

The issue was found while working on #5433, but is not strictly
related.
---
 src/box/box.cc                                | 14 +++--
 src/box/raft.h                                | 10 ++++
 .../gh-5426-election-on-off.result            | 59 +++++++++++++++----
 .../gh-5426-election-on-off.test.lua          | 26 +++++---
 4 files changed, 84 insertions(+), 25 deletions(-)

diff --git a/src/box/box.cc b/src/box/box.cc
index 18568df3b..6d2ca3349 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -2154,12 +2154,16 @@ box_process_subscribe(struct ev_io *io, struct xrow_header *header)
 		 tt_uuid_str(&replica_uuid), sio_socketname(io->fd));
 	say_info("remote vclock %s local vclock %s",
 		 vclock_to_string(&replica_clock), vclock_to_string(&vclock));
-	if (raft_is_enabled()) {
+	if (raft_was_enabled()) {
 		/*
-		 * Send out the current raft state of the instance. Don't do
-		 * that if Raft is disabled. It can be that a part of the
-		 * cluster still contains old versions, which can't handle Raft
-		 * messages. So when it is disabled, its network footprint
+		 * Send Raft state only if it was ever used in the cluster. That
+		 * is because if it was used, then this node could be a leader
+		 * in the past. And there may be nodes, who still consider it a
+		 * leader. So even if it is disabled now, but was probably
+		 * enabled in the past, the state must be sent. Can't send it
+		 * always because it can be a part of a cluster with old
+		 * Tarantool versions, which can't handle Raft messages. When it
+		 * is disabled and was never enabled, its network footprint
 		 * should be 0.
 		 */
 		struct raft_request req;
diff --git a/src/box/raft.h b/src/box/raft.h
index 82d5aa442..74255e627 100644
--- a/src/box/raft.h
+++ b/src/box/raft.h
@@ -185,6 +185,16 @@ raft_is_enabled(void)
 	return raft.is_enabled;
 }
 
+/**
+ * Check if the node ever heard anything from anyone about Raft. When it is
+ * true, the node, in theory, could be a leader in the past.
+ */
+static inline bool
+raft_was_enabled(void)
+{
+	return raft.volatile_term > 1;
+}
+
 /** Process a raft entry stored in WAL/snapshot. */
 void
 raft_process_recovery(const struct raft_request *req);
diff --git a/test/replication/gh-5426-election-on-off.result b/test/replication/gh-5426-election-on-off.result
index 1abfb9154..08253443c 100644
--- a/test/replication/gh-5426-election-on-off.result
+++ b/test/replication/gh-5426-election-on-off.result
@@ -6,13 +6,6 @@ box.schema.user.grant('guest', 'super')
  | ---
  | ...
 
-old_election_mode = box.cfg.election_mode
- | ---
- | ...
-old_replication_timeout = box.cfg.replication_timeout
- | ---
- | ...
-
 test_run:cmd('create server replica with rpl_master=default,\
               script="replication/replica.lua"')
  | ---
@@ -111,23 +104,63 @@ test_run:wait_cond(function() return box.info.election.leader == 0 end)
  | - true
  | ...
 
+-- When turned off, the former leader could stop sending its state to new
+-- subscribers. That could make them think the leader is still alive.
 test_run:switch('default')
  | ---
  | - true
  | ...
-test_run:cmd('stop server replica')
+box.cfg{election_mode = 'candidate'}
+ | ---
+ | ...
+test_run:wait_cond(function() return box.info.election.state == 'leader' end)
  | ---
  | - true
  | ...
-test_run:cmd('delete server replica')
+
+test_run:switch('replica')
  | ---
  | - true
  | ...
-box.cfg{                                                                        \
-    election_mode = old_election_mode,                                          \
-    replication_timeout = old_replication_timeout,                              \
-}
+test_run:wait_cond(function() return box.info.election.leader ~= 0 end)
+ | ---
+ | - true
+ | ...
+
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+test_run:cmd('restart server default')
+ | 
+-- All settings are reset, the election is 'off'. The replica re-subscribes,
+-- and the former leader should send its state. Even though the election is
+-- disabled on it.
+assert(box.cfg.election_mode == 'off')
  | ---
+ | - true
+ | ...
+
+test_run:switch('replica')
+ | ---
+ | - true
+ | ...
+test_run:wait_cond(function() return box.info.election.leader == 0 end)
+ | ---
+ | - true
+ | ...
+
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+test_run:cmd('stop server replica')
+ | ---
+ | - true
+ | ...
+test_run:cmd('delete server replica')
+ | ---
+ | - true
  | ...
 box.schema.user.revoke('guest', 'super')
  | ---
diff --git a/test/replication/gh-5426-election-on-off.test.lua b/test/replication/gh-5426-election-on-off.test.lua
index d6b980d0a..adbfcd97d 100644
--- a/test/replication/gh-5426-election-on-off.test.lua
+++ b/test/replication/gh-5426-election-on-off.test.lua
@@ -1,9 +1,6 @@
 test_run = require('test_run').new()
 box.schema.user.grant('guest', 'super')
 
-old_election_mode = box.cfg.election_mode
-old_replication_timeout = box.cfg.replication_timeout
-
 test_run:cmd('create server replica with rpl_master=default,\
               script="replication/replica.lua"')
 test_run:cmd('start server replica with wait=True, wait_load=True')
@@ -47,11 +44,26 @@ box.cfg{election_mode = 'off'}
 test_run:switch('replica')
 test_run:wait_cond(function() return box.info.election.leader == 0 end)
 
+-- When turned off, the former leader could stop sending its state to new
+-- subscribers. That could make them think the leader is still alive.
+test_run:switch('default')
+box.cfg{election_mode = 'candidate'}
+test_run:wait_cond(function() return box.info.election.state == 'leader' end)
+
+test_run:switch('replica')
+test_run:wait_cond(function() return box.info.election.leader ~= 0 end)
+
+test_run:switch('default')
+test_run:cmd('restart server default')
+-- All settings are reset, the election is 'off'. The replica re-subscribes,
+-- and the former leader should send its state. Even though the election is
+-- disabled on it.
+assert(box.cfg.election_mode == 'off')
+
+test_run:switch('replica')
+test_run:wait_cond(function() return box.info.election.leader == 0 end)
+
 test_run:switch('default')
 test_run:cmd('stop server replica')
 test_run:cmd('delete server replica')
-box.cfg{                                                                        \
-    election_mode = old_election_mode,                                          \
-    replication_timeout = old_replication_timeout,                              \
-}
 box.schema.user.revoke('guest', 'super')
-- 
2.21.1 (Apple Git-122.3)

  reply	other threads:[~2020-10-17 17:17 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-17 17:17 [Tarantool-patches] [PATCH 0/3] Raft on leader election recovery restart Vladislav Shpilevoy
2020-10-17 17:17 ` Vladislav Shpilevoy [this message]
2020-10-20 20:43   ` [Tarantool-patches] [PATCH 1/3] raft: send state to new subscribers if Raft worked Vladislav Shpilevoy
2020-10-21 11:41     ` Serge Petrenko
2020-10-21 21:41       ` Vladislav Shpilevoy
2020-10-22  8:53         ` Alexander V. Tikhonov
2020-10-17 17:17 ` [Tarantool-patches] [PATCH 2/3] raft: use local LSN in relay recovery restart Vladislav Shpilevoy
2020-10-17 17:17 ` [Tarantool-patches] [PATCH 3/3] raft: don't drop GC when restart relay recovery Vladislav Shpilevoy
2020-10-19  9:36 ` [Tarantool-patches] [PATCH 0/3] Raft on leader election recovery restart Serge Petrenko
2020-10-19 20:26   ` Vladislav Shpilevoy
2020-10-20  8:18     ` Serge Petrenko
2020-10-22  8:55 ` Alexander V. Tikhonov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=15c343a9d2140a6c7ba31cb8761b2bca0ca6e09a.1602954690.git.v.shpilevoy@tarantool.org \
    --to=v.shpilevoy@tarantool.org \
    --cc=sergepetrenko@tarantool.org \
    --cc=tarantool-patches@dev.tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH 1/3] raft: send state to new subscribers if Raft worked' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox