Tarantool development patches archive
From: Vladislav Shpilevoy via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: tarantool-patches@dev.tarantool.org, sergepetrenko@tarantool.org
Subject: [Tarantool-patches] [PATCH 4/4] election: activate raft split vote handling
Date: Sat, 15 Jan 2022 01:48:56 +0100
Message-ID: <a86513c785f5084b20a4a709946cb2e30e9a8ef0.1642207647.git.v.shpilevoy@tarantool.org>
In-Reply-To: <cover.1642207647.git.v.shpilevoy@tarantool.org>

Raft needs to know cluster size in order to detect and handle
split vote. The patch uses registered server count as cluster
size.

The change is neither documented nor given a changelog file
because it is an optimization. It can't be observed except in
logs or by timing the election.

Closes #5285
---
 src/box/raft.c                                |  4 +-
 .../election_split_vote_test.lua              | 92 +++++++++++++++++++
 2 files changed, 95 insertions(+), 1 deletion(-)
 create mode 100644 test/replication-luatest/election_split_vote_test.lua
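
For readers of this patch alone, here is a minimal sketch of why the cluster
size is needed. It is not the code from patch 3/4: the helper name
split_vote_sketch and its parameters are hypothetical, and the real check in
the raft library may differ. It assumes each registered node casts at most one
vote per term, so the votes not cast yet are the cluster size minus the votes
already seen.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not part of the patch series. Returns true when no
 * candidate can reach the quorum in the current term, even if it collects
 * every vote that has not been cast yet.
 *
 * votes[i] is the number of votes seen for candidate i. cluster_size plays
 * the role of replicaset.registered_count from the patch.
 */
static bool
split_vote_sketch(const int *votes, int candidate_count, int cluster_size,
		  int quorum)
{
	assert(cluster_size > 0 && quorum > 0);
	int voted = 0;
	int max_votes = 0;
	for (int i = 0; i < candidate_count; ++i) {
		voted += votes[i];
		if (votes[i] > max_votes)
			max_votes = votes[i];
	}
	/* Votes that may still arrive; never negative. */
	int unused = cluster_size - voted;
	if (unused < 0)
		unused = 0;
	/* Even the best-placed candidate can't get enough votes. */
	return max_votes + unused < quorum;
}
```

With two nodes, a quorum of 2 and one self-vote on each node, as in the test
below, the sketch reports a split vote: the best candidate has 1 vote and no
votes are left. While only one node has voted, the second vote is still unused
and the term can still produce a leader.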

diff --git a/src/box/raft.c b/src/box/raft.c
index 1e360dc88..1908b71b6 100644
--- a/src/box/raft.c
+++ b/src/box/raft.c
@@ -229,7 +229,9 @@ box_raft_update_election_quorum(void)
 	 *   be lost.
 	 */
 	int quorum = MIN(replication_synchro_quorum, max);
-	raft_cfg_election_quorum(box_raft(), quorum);
+	struct raft *raft = box_raft();
+	raft_cfg_election_quorum(raft, quorum);
+	raft_cfg_cluster_size(raft, replicaset.registered_count);
 }
 
 void
diff --git a/test/replication-luatest/election_split_vote_test.lua b/test/replication-luatest/election_split_vote_test.lua
new file mode 100644
index 000000000..f31bfd7f3
--- /dev/null
+++ b/test/replication-luatest/election_split_vote_test.lua
@@ -0,0 +1,92 @@
+local t = require('luatest')
+local cluster = require('test.luatest_helpers.cluster')
+local helpers = require('test.luatest_helpers')
+local wait_timeout = 120
+
+--
+-- gh-5285: a split vote is when no candidate can win the leader role in the
+-- current term: the number of unused votes is not enough for anyone to get the
+-- quorum. It can be detected to speed up the term bump.
+--
+local g = t.group('split-vote')
+
+g.before_each(function()
+    g.cluster = cluster:new({})
+    local node1_uri = helpers.instance_uri('node1')
+    local node2_uri = helpers.instance_uri('node2')
+    local replication = {node1_uri, node2_uri}
+    local box_cfg = {
+        listen = node1_uri,
+        replication = replication,
+        -- To speed up the new term when trying to elect the first leader.
+        replication_timeout = 0.1,
+        replication_synchro_quorum = 2,
+        election_timeout = 1000000,
+    }
+    g.node1 = g.cluster:build_server({alias = 'node1', box_cfg = box_cfg})
+
+    box_cfg.listen = node2_uri
+    g.node2 = g.cluster:build_server({alias = 'node2', box_cfg = box_cfg})
+
+    g.cluster:add_server(g.node1)
+    g.cluster:add_server(g.node2)
+    g.cluster:start()
+end)
+
+g.after_each(function()
+    g.cluster:drop()
+end)
+
+g.test_split_vote = function(g)
+    -- Stop replication so that the nodes can't request votes from each other.
+    local node1_repl = g.node1:exec(function()
+        local repl = box.cfg.replication
+        box.cfg{replication = {}}
+        return repl
+    end)
+    local node2_repl = g.node2:exec(function()
+        local repl = box.cfg.replication
+        box.cfg{replication = {}}
+        return repl
+    end)
+
+    -- Both vote for self but don't see the split-vote yet.
+    g.node1:exec(function()
+        box.cfg{election_mode = 'candidate'}
+    end)
+    g.node2:exec(function()
+        box.cfg{election_mode = 'candidate'}
+    end)
+
+    -- Wait for the votes to actually happen.
+    t.helpers.retrying({timeout = wait_timeout}, function()
+        local func = function()
+            return box.info.election.vote == box.info.id
+        end
+        assert(g.node1:exec(func))
+        assert(g.node2:exec(func))
+    end)
+
+    -- Now let the nodes notice the split vote.
+    g.node1:exec(function(repl)
+        box.cfg{replication = repl}
+    end, {node1_repl})
+    g.node2:exec(function(repl)
+        box.cfg{replication = repl}
+    end, {node2_repl})
+
+    t.helpers.retrying({timeout = wait_timeout}, function()
+        local msg = 'split vote is discovered'
+        assert(g.node1:grep_log(msg) or g.node2:grep_log(msg))
+    end)
+
+    -- Ensure a leader is eventually elected. Nothing is broken for good.
+    g.node1:exec(function()
+        box.cfg{election_timeout = 1}
+    end)
+    g.node2:exec(function()
+        box.cfg{election_timeout = 1}
+    end)
+    g.node1:wait_election_leader_found()
+    g.node2:wait_election_leader_found()
+end
-- 
2.24.3 (Apple Git-128)



Thread overview: 16+ messages
2022-01-15  0:48 [Tarantool-patches] [PATCH 0/4] Split vote Vladislav Shpilevoy via Tarantool-patches
2022-01-15  0:48 ` [Tarantool-patches] [PATCH 1/4] raft: fix crash on election_timeout reconfig Vladislav Shpilevoy via Tarantool-patches
2022-01-18 13:12   ` Serge Petrenko via Tarantool-patches
2022-01-15  0:48 ` [Tarantool-patches] [PATCH 2/4] raft: track all votes, even not own Vladislav Shpilevoy via Tarantool-patches
2022-01-21  0:42   ` Vladislav Shpilevoy via Tarantool-patches
2022-01-15  0:48 ` [Tarantool-patches] [PATCH 3/4] raft: introduce split vote detection Vladislav Shpilevoy via Tarantool-patches
2022-01-18 13:20   ` Serge Petrenko via Tarantool-patches
2022-01-20  0:44     ` Vladislav Shpilevoy via Tarantool-patches
2022-01-20 10:21       ` Serge Petrenko via Tarantool-patches
2022-01-20 23:02         ` Vladislav Shpilevoy via Tarantool-patches
2022-01-15  0:48 ` Vladislav Shpilevoy via Tarantool-patches [this message]
2022-01-18 13:21   ` [Tarantool-patches] [PATCH 4/4] election: activate raft split vote handling Serge Petrenko via Tarantool-patches
2022-01-20  0:44     ` Vladislav Shpilevoy via Tarantool-patches
2022-01-16 14:10 ` [Tarantool-patches] [PATCH 0/4] Split vote Konstantin Osipov via Tarantool-patches
2022-01-17 22:57   ` Vladislav Shpilevoy via Tarantool-patches
2022-01-18  7:18     ` Konstantin Osipov via Tarantool-patches
