Tarantool development patches archive
 help / color / mirror / Atom feed
From: Vladislav Shpilevoy via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: tarantool-patches@dev.tarantool.org, sergepetrenko@tarantool.org
Subject: [Tarantool-patches] [PATCH v2 1/5] raft: fix crash on election_timeout reconfig
Date: Thu, 20 Jan 2022 01:43:43 +0100	[thread overview]
Message-ID: <4cde0a624f2f716e868653030048771643e84bc7.1642639079.git.v.shpilevoy@tarantool.org> (raw)
In-Reply-To: <cover.1642639079.git.v.shpilevoy@tarantool.org>

It used to crash if done during election on a node voted for
anybody, it is a candidate, it doesn't know a leader yet, but has
a WAL write in progress.

Thus it could only happen if the term was bumped by a message from
a non-leader node and wasn't flushed to the disk yet.

The patch makes the reconfig check if there is a WAL write in
progress. Then don't do anything.

Could also check for volatile vote instead of persistent, but it
would create the same problem for the case when started writing
vote for self and didn't finish yet. Reconfig would crash.
---
 .../unreleased/election-timeout-cfg-crash.md  |  5 ++
 src/lib/raft/raft.c                           |  3 +-
 test/unit/raft.c                              | 75 ++++++++++++++++++-
 test/unit/raft.result                         | 16 +++-
 4 files changed, 96 insertions(+), 3 deletions(-)
 create mode 100644 changelogs/unreleased/election-timeout-cfg-crash.md

diff --git a/changelogs/unreleased/election-timeout-cfg-crash.md b/changelogs/unreleased/election-timeout-cfg-crash.md
new file mode 100644
index 000000000..e870e3665
--- /dev/null
+++ b/changelogs/unreleased/election-timeout-cfg-crash.md
@@ -0,0 +1,5 @@
+## bugfix/raft
+
+* Reconfiguration of `box.cfg.election_timeout` could lead to a crash or
+  undefined behaviour if done during an ongoing election with a special WAL
+  write in progress.
diff --git a/src/lib/raft/raft.c b/src/lib/raft/raft.c
index 0e6dc6155..1ae8fe87f 100644
--- a/src/lib/raft/raft.c
+++ b/src/lib/raft/raft.c
@@ -935,7 +935,8 @@ raft_cfg_election_timeout(struct raft *raft, double timeout)
 		return;
 
 	raft->election_timeout = timeout;
-	if (raft->vote != 0 && raft->leader == 0 && raft->is_candidate) {
+	if (raft->vote != 0 && raft->leader == 0 && raft->is_candidate &&
+	    !raft->is_write_in_progress) {
 		assert(raft_ev_is_active(&raft->timer));
 		struct ev_loop *loop = raft_loop();
 		double timeout = raft_ev_timer_remaining(loop, &raft->timer) -
diff --git a/test/unit/raft.c b/test/unit/raft.c
index 8f4d2dd9a..520b94466 100644
--- a/test/unit/raft.c
+++ b/test/unit/raft.c
@@ -1296,7 +1296,7 @@ raft_test_enable_disable(void)
 static void
 raft_test_too_long_wal_write(void)
 {
-	raft_start_test(8);
+	raft_start_test(22);
 	struct raft_node node;
 	raft_node_create(&node);
 
@@ -1362,6 +1362,79 @@ raft_test_too_long_wal_write(void)
 	ok(raft_time() - ts == node.cfg_death_timeout, "timer works again");
 	is(node.raft.state, RAFT_STATE_CANDIDATE, "became candidate");
 
+	/*
+	 * During WAL write it is possible to reconfigure election timeout.
+	 * The dangerous case is when the timer is active already. It happens
+	 * when the node voted and is a candidate, but leader is unknown.
+	 */
+	raft_node_destroy(&node);
+	raft_node_create(&node);
+
+	raft_node_cfg_election_timeout(&node, 100);
+	raft_run_next_event();
+	is(node.raft.term, 2, "term is bumped");
+
+	/* Bump term again but it is not written to WAL yet. */
+	raft_node_block(&node);
+	is(raft_node_send_vote_response(&node,
+		3 /* Term. */,
+		3 /* Vote. */,
+		2 /* Source. */
+	), 0, "2 votes for 3 in a new term");
+	raft_run_next_event();
+	is(node.raft.term, 2, "term is old");
+	is(node.raft.vote, 1, "vote is used for self");
+	is(node.raft.volatile_term, 3, "volatile term is new");
+	is(node.raft.volatile_vote, 0, "volatile vote is unused");
+
+	raft_node_cfg_election_timeout(&node, 50);
+	raft_node_unblock(&node);
+	ts = raft_time();
+	raft_run_next_event();
+	ts = raft_time() - ts;
+	/* 50 + <= 10% random delay. */
+	ok(ts >= 50 && ts <= 55, "new election timeout works");
+	ok(raft_node_check_full_state(&node,
+		RAFT_STATE_CANDIDATE /* State. */,
+		0 /* Leader. */,
+		4 /* Term. */,
+		1 /* Vote. */,
+		4 /* Volatile term. */,
+		1 /* Volatile vote. */,
+		"{0: 4}" /* Vclock. */
+	), "new term is started with vote for self");
+
+	/*
+	 * Similar case when a vote is being written but not finished yet.
+	 */
+	raft_node_destroy(&node);
+	raft_node_create(&node);
+
+	raft_node_cfg_election_timeout(&node, 100);
+	raft_node_block(&node);
+	raft_run_next_event();
+	is(node.raft.term, 1, "term is old");
+	is(node.raft.vote, 0, "vote is unused");
+	is(node.raft.volatile_term, 2, "volatile term is new");
+	is(node.raft.volatile_vote, 1, "volatile vote is self");
+
+	raft_node_cfg_election_timeout(&node, 50);
+	raft_node_unblock(&node);
+	ts = raft_time();
+	raft_run_next_event();
+	ts = raft_time() - ts;
+	/* 50 + <= 10% random delay. */
+	ok(ts >= 50 && ts <= 55, "new election timeout works");
+	ok(raft_node_check_full_state(&node,
+		RAFT_STATE_CANDIDATE /* State. */,
+		0 /* Leader. */,
+		3 /* Term. */,
+		1 /* Vote. */,
+		3 /* Volatile term. */,
+		1 /* Volatile vote. */,
+		"{0: 2}" /* Vclock. */
+	), "new term is started with vote for self");
+
 	raft_node_destroy(&node);
 	raft_finish_test();
 }
diff --git a/test/unit/raft.result b/test/unit/raft.result
index 14689ea81..764d82969 100644
--- a/test/unit/raft.result
+++ b/test/unit/raft.result
@@ -220,7 +220,7 @@ ok 11 - subtests
 ok 12 - subtests
 	*** raft_test_enable_disable: done ***
 	*** raft_test_too_long_wal_write ***
-    1..8
+    1..22
     ok 1 - vote for 2
     ok 2 - vote is volatile
     ok 3 - message from leader
@@ -229,6 +229,20 @@ ok 12 - subtests
     ok 6 - wal write is finished
     ok 7 - timer works again
     ok 8 - became candidate
+    ok 9 - term is bumped
+    ok 10 - 2 votes for 3 in a new term
+    ok 11 - term is old
+    ok 12 - vote is used for self
+    ok 13 - volatile term is new
+    ok 14 - volatile vote is unused
+    ok 15 - new election timeout works
+    ok 16 - new term is started with vote for self
+    ok 17 - term is old
+    ok 18 - vote is unused
+    ok 19 - volatile term is new
+    ok 20 - volatile vote is self
+    ok 21 - new election timeout works
+    ok 22 - new term is started with vote for self
 ok 13 - subtests
 	*** raft_test_too_long_wal_write: done ***
 	*** raft_test_promote_restore ***
-- 
2.24.3 (Apple Git-128)


  reply	other threads:[~2022-01-20  0:44 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-20  0:43 [Tarantool-patches] [PATCH v2 0/5] Split vote and bugs Vladislav Shpilevoy via Tarantool-patches
2022-01-20  0:43 ` Vladislav Shpilevoy via Tarantool-patches [this message]
2022-01-20  0:43 ` [Tarantool-patches] [PATCH v2 2/5] raft: fix ev_timer.at incorrect usage Vladislav Shpilevoy via Tarantool-patches
2022-01-20  0:43 ` [Tarantool-patches] [PATCH v2 3/5] raft: track all votes, even not own Vladislav Shpilevoy via Tarantool-patches
2022-01-20  0:43 ` [Tarantool-patches] [PATCH v2 4/5] raft: introduce split vote detection Vladislav Shpilevoy via Tarantool-patches
2022-01-20 13:22   ` Serge Petrenko via Tarantool-patches
2022-01-20 23:02     ` Vladislav Shpilevoy via Tarantool-patches
2022-01-25 10:17       ` Serge Petrenko via Tarantool-patches
2022-01-20  0:43 ` [Tarantool-patches] [PATCH v2 5/5] election: activate raft split vote handling Vladislav Shpilevoy via Tarantool-patches
2022-01-25 10:18 ` [Tarantool-patches] [PATCH v2 0/5] Split vote and bugs Serge Petrenko via Tarantool-patches
2022-01-25 22:51 ` Vladislav Shpilevoy via Tarantool-patches

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4cde0a624f2f716e868653030048771643e84bc7.1642639079.git.v.shpilevoy@tarantool.org \
    --to=tarantool-patches@dev.tarantool.org \
    --cc=sergepetrenko@tarantool.org \
    --cc=v.shpilevoy@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH v2 1/5] raft: fix crash on election_timeout reconfig' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox