Tarantool development patches archive
 help / color / mirror / Atom feed
From: Serge Petrenko <sergepetrenko@tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>,
	tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH 1/3] raft: send state to new subscribers if Raft worked
Date: Wed, 21 Oct 2020 14:41:56 +0300	[thread overview]
Message-ID: <eb05fa39-fb16-8657-20a3-7714c0e13c9d@tarantool.org> (raw)
In-Reply-To: <47356d83-4464-070b-0df8-c7fcbbdb1e02@tarantool.org>


20.10.2020 23:43, Vladislav Shpilevoy пишет:
> The first patch was reworked to look at the subscriber's version.
> And send Raft state if it is supported. Even when mode is 'off'.


Hi! Thanks for the patch!

Please see my comments below.

>
> ====================
>      raft: send state always if version is >= 2.6.0
>      
>      Before the patch a node sent its complete Raft state to new
>      replication subscribers only when Raft was enabled: voter or
>      candidate. When the mode was off, it didn't send anything.
>      
>      This led to a bug (or maybe just inconvenience), that if the node
>      was a leader, was turned off, and couldn't deliver the 'off'
>      message to all the followers, some of them would still think the
>      node is a leader until their restart or a new leader election.
>      
>      If the followers were just voters or also had Raft disabled, they
>      would treat the old node as a leader infinitely until a new one
>      was elected somewhere.
>      
>      The state wasn't sent not to break connections with the old
>      versions, which won't understand the Raft messages.
>      
>      The patch makes Raft state being sent even if Raft is disabled on
>      the node, but the connected node supports Raft (its version is >=
>      2.6.0). In that way the old leader will keep sending resignation
>      messages even in case of restart or any other issue. And it won't
>      break the old versions when Raft is not used and was not used
>      ever.
>      
>      The issue was found while working on #5433, but is not strictly
>      related.
>
> diff --git a/src/box/box.cc b/src/box/box.cc
> index 18568df3b..86daaf626 100644
> --- a/src/box/box.cc
> +++ b/src/box/box.cc
> @@ -80,6 +80,7 @@
>   #include "msgpack.h"
>   #include "raft.h"
>   #include "trivia/util.h"
> +#include "version.h"
>   
>   static char status[64] = "unknown";
>   
> @@ -2154,13 +2155,14 @@ box_process_subscribe(struct ev_io *io, struct xrow_header *header)
>   		 tt_uuid_str(&replica_uuid), sio_socketname(io->fd));
>   	say_info("remote vclock %s local vclock %s",
>   		 vclock_to_string(&replica_clock), vclock_to_string(&vclock));
> -	if (raft_is_enabled()) {
> +	if (replica_version_id >= version_id(2, 6, 0)) {

Why don't you use 'raft_was_used()` here together with version check?

I've just understood we should also check for relay version in 
relay_push_raft(), shouldn't we?


>   		/*
> -		 * Send out the current raft state of the instance. Don't do
> -		 * that if Raft is disabled. It can be that a part of the
> -		 * cluster still contains old versions, which can't handle Raft
> -		 * messages. So when it is disabled, its network footprint
> -		 * should be 0.
> +		 * Send Raft state always to all instances supporting it.
> +		 * Regardless of its mode, even when disabled. That is because
> +		 * if it was ever enabled, then this node could be a leader in
> +		 * the past. And there may be nodes, who still consider it a
> +		 * leader. So even if it is disabled now, but was probably
> +		 * enabled in the past, the state must be sent.
>   		 */
>   		struct raft_request req;
>   		raft_serialize_for_network(&req, &vclock);
> diff --git a/test/replication/gh-5426-election-on-off.result b/test/replication/gh-5426-election-on-off.result
> index 1abfb9154..08253443c 100644
> --- a/test/replication/gh-5426-election-on-off.result
> +++ b/test/replication/gh-5426-election-on-off.result
> @@ -6,13 +6,6 @@ box.schema.user.grant('guest', 'super')
>    | ---
>    | ...
>   
> -old_election_mode = box.cfg.election_mode
> - | ---
> - | ...
> -old_replication_timeout = box.cfg.replication_timeout
> - | ---
> - | ...
> -
>   test_run:cmd('create server replica with rpl_master=default,\
>                 script="replication/replica.lua"')
>    | ---
> @@ -111,23 +104,63 @@ test_run:wait_cond(function() return box.info.election.leader == 0 end)
>    | - true
>    | ...
>   
> +-- When turned off, the former leader could stop sending its state to new
> +-- subscribers. That could make them think the leader is still alive.
>   test_run:switch('default')
>    | ---
>    | - true
>    | ...
> -test_run:cmd('stop server replica')
> +box.cfg{election_mode = 'candidate'}
> + | ---
> + | ...
> +test_run:wait_cond(function() return box.info.election.state == 'leader' end)
>    | ---
>    | - true
>    | ...
> -test_run:cmd('delete server replica')
> +
> +test_run:switch('replica')
>    | ---
>    | - true
>    | ...
> -box.cfg{                                                                        \
> -    election_mode = old_election_mode,                                          \
> -    replication_timeout = old_replication_timeout,                              \
> -}
> +test_run:wait_cond(function() return box.info.election.leader ~= 0 end)
> + | ---
> + | - true
> + | ...
> +
> +test_run:switch('default')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('restart server default')
> + |
> +-- All settings are reset, the election is 'off'. The replica re-subscribes,
> +-- and the former leader should send its state. Even though the election is
> +-- disabled on it.
> +assert(box.cfg.election_mode == 'off')
>    | ---
> + | - true
> + | ...
> +
> +test_run:switch('replica')
> + | ---
> + | - true
> + | ...
> +test_run:wait_cond(function() return box.info.election.leader == 0 end)
> + | ---
> + | - true
> + | ...
> +
> +test_run:switch('default')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('stop server replica')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('delete server replica')
> + | ---
> + | - true
>    | ...
>   box.schema.user.revoke('guest', 'super')
>    | ---
> diff --git a/test/replication/gh-5426-election-on-off.test.lua b/test/replication/gh-5426-election-on-off.test.lua
> index d6b980d0a..adbfcd97d 100644
> --- a/test/replication/gh-5426-election-on-off.test.lua
> +++ b/test/replication/gh-5426-election-on-off.test.lua
> @@ -1,9 +1,6 @@
>   test_run = require('test_run').new()
>   box.schema.user.grant('guest', 'super')
>   
> -old_election_mode = box.cfg.election_mode
> -old_replication_timeout = box.cfg.replication_timeout
> -
>   test_run:cmd('create server replica with rpl_master=default,\
>                 script="replication/replica.lua"')
>   test_run:cmd('start server replica with wait=True, wait_load=True')
> @@ -47,11 +44,26 @@ box.cfg{election_mode = 'off'}
>   test_run:switch('replica')
>   test_run:wait_cond(function() return box.info.election.leader == 0 end)
>   
> +-- When turned off, the former leader could stop sending its state to new
> +-- subscribers. That could make them think the leader is still alive.
> +test_run:switch('default')
> +box.cfg{election_mode = 'candidate'}
> +test_run:wait_cond(function() return box.info.election.state == 'leader' end)
> +
> +test_run:switch('replica')
> +test_run:wait_cond(function() return box.info.election.leader ~= 0 end)
> +
> +test_run:switch('default')
> +test_run:cmd('restart server default')
> +-- All settings are reset, the election is 'off'. The replica re-subscribes,
> +-- and the former leader should send its state. Even though the election is
> +-- disabled on it.
> +assert(box.cfg.election_mode == 'off')
> +
> +test_run:switch('replica')
> +test_run:wait_cond(function() return box.info.election.leader == 0 end)
> +
>   test_run:switch('default')
>   test_run:cmd('stop server replica')
>   test_run:cmd('delete server replica')
> -box.cfg{                                                                        \
> -    election_mode = old_election_mode,                                          \
> -    replication_timeout = old_replication_timeout,                              \
> -}
>   box.schema.user.revoke('guest', 'super')

-- 
Serge Petrenko

  reply	other threads:[~2020-10-21 11:41 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-17 17:17 [Tarantool-patches] [PATCH 0/3] Raft on leader election recovery restart Vladislav Shpilevoy
2020-10-17 17:17 ` [Tarantool-patches] [PATCH 1/3] raft: send state to new subscribers if Raft worked Vladislav Shpilevoy
2020-10-20 20:43   ` Vladislav Shpilevoy
2020-10-21 11:41     ` Serge Petrenko [this message]
2020-10-21 21:41       ` Vladislav Shpilevoy
2020-10-22  8:53         ` Alexander V. Tikhonov
2020-10-17 17:17 ` [Tarantool-patches] [PATCH 2/3] raft: use local LSN in relay recovery restart Vladislav Shpilevoy
2020-10-17 17:17 ` [Tarantool-patches] [PATCH 3/3] raft: don't drop GC when restart relay recovery Vladislav Shpilevoy
2020-10-19  9:36 ` [Tarantool-patches] [PATCH 0/3] Raft on leader election recovery restart Serge Petrenko
2020-10-19 20:26   ` Vladislav Shpilevoy
2020-10-20  8:18     ` Serge Petrenko
2020-10-22  8:55 ` Alexander V. Tikhonov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=eb05fa39-fb16-8657-20a3-7714c0e13c9d@tarantool.org \
    --to=sergepetrenko@tarantool.org \
    --cc=tarantool-patches@dev.tarantool.org \
    --cc=v.shpilevoy@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH 1/3] raft: send state to new subscribers if Raft worked' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox