From: Serge Petrenko via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: v.shpilevoy@tarantool.org, gorcunov@gmail.com
Cc: tarantool-patches@dev.tarantool.org
Subject: [Tarantool-patches] [PATCH v4 01/16] replication: always send raft state to subscribers
Date: Wed, 14 Jul 2021 21:25:29 +0300
Message-ID: <e6001e4186d28d179852dec988713578bc3b5f64.1626287002.git.sergepetrenko@tarantool.org>
In-Reply-To: <cover.1626287002.git.sergepetrenko@tarantool.org>

Tarantool used to send out Raft state on subscribe only when Raft was
enabled. This was a safeguard against partially upgraded clusters, where
some nodes knew nothing about Raft messages and couldn't handle them
properly.

However, Raft state should always be sent out. For example, promote
will be changed to bump the Raft term even when Raft is disabled, and
every instance in the cluster must have the same term, at least for the
sake of promote.

So send out Raft state to every non-anonymous subscriber with
version >= 2.6.0 (that's when Raft was introduced).
Do the same for Raft broadcasts: they should be sent only to replicas
with version >= 2.6.0.

Closes #5438
---
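The version gate described in the commit message can be sketched in
isolation as follows. This is only an illustration, not code from the
patch: sketch_version_id() and sketch_replica_accepts_raft() are
hypothetical names, and the one-byte-per-component packing is an
assumption about how version_id() from "version.h" orders versions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for version_id() from "version.h".
 * Assumption: each component is packed into its own byte, so plain
 * integer comparison orders versions the same way as (major, minor,
 * patch) tuples.
 */
static inline uint32_t
sketch_version_id(unsigned major, unsigned minor, unsigned patch)
{
	assert(major < 256 && minor < 256 && patch < 256);
	return ((((uint32_t)major << 8) | minor) << 8) | patch;
}

/*
 * Hypothetical predicate mirroring the condition used in this patch:
 * Raft state goes only to non-anonymous replicas that are new enough
 * to understand Raft messages (2.6.0 or later).
 */
static inline bool
sketch_replica_accepts_raft(uint32_t replica_version_id, bool anon)
{
	return !anon && replica_version_id >= sketch_version_id(2, 6, 0);
}
```

Under these assumptions a 2.5.3 replica compares below 2.6.0 and gets
no Raft traffic, while 2.6.0 and later non-anonymous replicas do. An
anonymous replica is skipped regardless of its version.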
 src/box/box.cc                                | 11 ++--
 src/box/relay.cc                              |  4 +-
 .../replication/gh-5438-election-state.result | 63 +++++++++++++++++++
 .../gh-5438-election-state.test.lua           | 28 +++++++++
 test/replication/suite.cfg                    |  1 +
 5 files changed, 100 insertions(+), 7 deletions(-)
 create mode 100644 test/replication/gh-5438-election-state.result
 create mode 100644 test/replication/gh-5438-election-state.test.lua

diff --git a/src/box/box.cc b/src/box/box.cc
index eeb57b04e..5dcf5b460 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -82,6 +82,7 @@
 #include "msgpack.h"
 #include "raft.h"
 #include "trivia/util.h"
+#include "version.h"
 
 enum {
 	IPROTO_THREADS_MAX = 1000,
@@ -2831,13 +2832,13 @@ box_process_subscribe(struct ev_io *io, struct xrow_header *header)
 		 tt_uuid_str(&replica_uuid), sio_socketname(io->fd));
 	say_info("remote vclock %s local vclock %s",
 		 vclock_to_string(&replica_clock), vclock_to_string(&vclock));
-	if (raft_is_enabled(box_raft())) {
+	if (replica_version_id >= version_id(2, 6, 0) && !anon) {
 		/*
 		 * Send out the current raft state of the instance. Don't do
-		 * that if Raft is disabled. It can be that a part of the
-		 * cluster still contains old versions, which can't handle Raft
-		 * messages. So when it is disabled, its network footprint
-		 * should be 0.
+		 * that if the remote instance is old. It can be that a part of
+		 * the cluster still contains old versions, which can't handle
+		 * Raft messages. Raft's network footprint should be 0 as seen
+		 * by such instances.
 		 */
 		struct raft_request req;
 		box_raft_checkpoint_remote(&req);
diff --git a/src/box/relay.cc b/src/box/relay.cc
index 115037fc3..60f527b7f 100644
--- a/src/box/relay.cc
+++ b/src/box/relay.cc
@@ -800,7 +800,7 @@ relay_subscribe_f(va_list ap)
 		  &relay->relay_pipe, NULL, NULL, cbus_process);
 
 	struct relay_is_raft_enabled_msg raft_enabler;
-	if (!relay->replica->anon)
+	if (!relay->replica->anon && relay->version_id >= version_id(2, 6, 0))
 		relay_send_is_raft_enabled(relay, &raft_enabler, true);
 
 	/*
@@ -883,7 +883,7 @@ relay_subscribe_f(va_list ap)
 		cpipe_push(&relay->tx_pipe, &relay->status_msg.msg);
 	}
 
-	if (!relay->replica->anon)
+	if (!relay->replica->anon && relay->version_id >= version_id(2, 6, 0))
 		relay_send_is_raft_enabled(relay, &raft_enabler, false);
 
 	/*
diff --git a/test/replication/gh-5438-election-state.result b/test/replication/gh-5438-election-state.result
new file mode 100644
index 000000000..6985f026a
--- /dev/null
+++ b/test/replication/gh-5438-election-state.result
@@ -0,0 +1,63 @@
+-- test-run result file version 2
+test_run = require('test_run').new()
+ | ---
+ | ...
+
+--
+-- gh-5438 send out Raft state to subscribers, even when Raft is disabled.
+--
+-- Bump Raft term while the replica's offline.
+term = box.info.election.term
+ | ---
+ | ...
+old_election_mode = box.cfg.election_mode
+ | ---
+ | ...
+box.cfg{election_mode = 'candidate'}
+ | ---
+ | ...
+test_run:wait_cond(function() return box.info.election.term > term end)
+ | ---
+ | - true
+ | ...
+
+-- Make sure the replica receives new term on subscribe.
+box.cfg{election_mode = 'off'}
+ | ---
+ | ...
+
+box.schema.user.grant('guest', 'replication')
+ | ---
+ | ...
+test_run:cmd('create server replica with rpl_master=default,\
+                                         script="replication/replica.lua"')
+ | ---
+ | - true
+ | ...
+test_run:cmd('start server replica')
+ | ---
+ | - true
+ | ...
+test_run:wait_cond(function()\
+    return test_run:eval('replica', 'return box.info.election.term')[1] ==\
+           box.info.election.term\
+end)
+ | ---
+ | - true
+ | ...
+
+-- Cleanup.
+box.cfg{election_mode = old_election_mode}
+ | ---
+ | ...
+test_run:cmd('stop server replica')
+ | ---
+ | - true
+ | ...
+test_run:cmd('delete server replica')
+ | ---
+ | - true
+ | ...
+box.schema.user.revoke('guest', 'replication')
+ | ---
+ | ...
diff --git a/test/replication/gh-5438-election-state.test.lua b/test/replication/gh-5438-election-state.test.lua
new file mode 100644
index 000000000..60c3366c1
--- /dev/null
+++ b/test/replication/gh-5438-election-state.test.lua
@@ -0,0 +1,28 @@
+test_run = require('test_run').new()
+
+--
+-- gh-5438 send out Raft state to subscribers, even when Raft is disabled.
+--
+-- Bump Raft term while the replica's offline.
+term = box.info.election.term
+old_election_mode = box.cfg.election_mode
+box.cfg{election_mode = 'candidate'}
+test_run:wait_cond(function() return box.info.election.term > term end)
+
+-- Make sure the replica receives new term on subscribe.
+box.cfg{election_mode = 'off'}
+
+box.schema.user.grant('guest', 'replication')
+test_run:cmd('create server replica with rpl_master=default,\
+                                         script="replication/replica.lua"')
+test_run:cmd('start server replica')
+test_run:wait_cond(function()\
+    return test_run:eval('replica', 'return box.info.election.term')[1] ==\
+           box.info.election.term\
+end)
+
+-- Cleanup.
+box.cfg{election_mode = old_election_mode}
+test_run:cmd('stop server replica')
+test_run:cmd('delete server replica')
+box.schema.user.revoke('guest', 'replication')
diff --git a/test/replication/suite.cfg b/test/replication/suite.cfg
index 69f2f3511..ae146c366 100644
--- a/test/replication/suite.cfg
+++ b/test/replication/suite.cfg
@@ -19,6 +19,7 @@
     "gh-5213-qsync-applier-order-3.test.lua": {},
     "gh-5426-election-on-off.test.lua": {},
     "gh-5433-election-restart-recovery.test.lua": {},
+    "gh-5438-election-state.test.lua": {},
     "gh-5445-leader-inconsistency.test.lua": {},
     "gh-5506-election-on-off.test.lua": {},
     "once.test.lua": {},
-- 
2.30.1 (Apple Git-130)

