[Tarantool-patches] [PATCH 8/8] raft: fix crash on death timeout decrease

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Sun Dec 13 20:15:30 MSK 2020


If the death timeout was decreased while waiting for leader death
or discovery, and the new value made the current wait end
immediately, raft could crash in libev.

The remaining time until leader death then became negative. The
negative timeout was passed to libev without any checks, and libev
asserts that a timeout is always >= 0.

With this commit the raft code is covered by tests at almost 100%,
not counting one 'unreachable()' place.

Closes #5303
---
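For reviewers, the arithmetic behind the fix can be sketched in
isolation with plain doubles. This is only an illustration: the
helper name and its parameters are hypothetical and do not exist in
raft.c. It assumes the old timeout is the value the timer was armed
with, which is the role raft->timer.at plays in the hunk below.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of raft.c. It repeats the
 * computation from raft_cfg_death_timeout() without libev:
 *   remaining   - seconds left on the currently armed timer;
 *   old_timeout - the timeout the timer was armed with;
 *   new_timeout - the newly configured death timeout.
 */
static double
death_timeout_remaining(double remaining, double old_timeout,
			double new_timeout)
{
	assert(old_timeout >= 0 && new_timeout >= 0);
	double timeout = remaining - old_timeout + new_timeout;
	/* The new deadline is already in the past: fire right away. */
	if (timeout < 0)
		timeout = 0;
	return timeout;
}
```

With an old timeout of 1.0 and half of it already elapsed, lowering
the timeout to 0.25 gives 0.5 - 1.0 + 0.25 = -0.25, which the clamp
turns into 0. This is the same scenario the new unit test drives
with timeout / 2 and timeout / 4.
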
 src/lib/raft/raft.c   |  2 ++
 test/unit/raft.c      | 26 +++++++++++++++++++++++++-
 test/unit/raft.result |  7 ++++++-
 3 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/src/lib/raft/raft.c b/src/lib/raft/raft.c
index 4f6a5ee5e..4ea4fc3f8 100644
--- a/src/lib/raft/raft.c
+++ b/src/lib/raft/raft.c
@@ -924,6 +924,8 @@ raft_cfg_death_timeout(struct raft *raft, double death_timeout)
 		struct ev_loop *loop = raft_loop();
 		double timeout = raft_ev_timer_remaining(loop, &raft->timer) -
 				 raft->timer.at + raft->death_timeout;
+		if (timeout < 0)
+			timeout = 0;
 		raft_ev_timer_stop(loop, &raft->timer);
 		raft_ev_timer_set(&raft->timer, timeout, timeout);
 		raft_ev_timer_start(loop, &raft->timer);
diff --git a/test/unit/raft.c b/test/unit/raft.c
index 2c3935cbf..11e101777 100644
--- a/test/unit/raft.c
+++ b/test/unit/raft.c
@@ -971,7 +971,7 @@ raft_test_election_quorum(void)
 static void
 raft_test_death_timeout(void)
 {
-	raft_start_test(4);
+	raft_start_test(9);
 	struct raft_node node;
 	raft_node_create(&node);
 
@@ -1018,6 +1018,30 @@ raft_test_death_timeout(void)
 		"{0: 2}" /* Vclock. */
 	), "enter candidate state when the new death timeout expires");
 
+	/* Decrease timeout to earlier than now. */
+
+	is(raft_node_send_leader(&node,
+		3 /* Term. */,
+		2 /* Source. */
+	), 0, "message from leader");
+	is(node.raft.leader, 2, "leader is accepted");
+	is(node.raft.state, RAFT_STATE_FOLLOWER, "became follower");
+
+	raft_run_for(timeout / 2);
+	raft_node_cfg_death_timeout(&node, timeout / 4);
+	double ts = raft_time();
+	raft_run_next_event();
+	ok(raft_time() == ts, "death is detected immediately");
+	ok(raft_node_check_full_state(&node,
+		RAFT_STATE_CANDIDATE /* State. */,
+		0 /* Leader. */,
+		4 /* Term. */,
+		1 /* Vote. */,
+		4 /* Volatile term. */,
+		1 /* Volatile vote. */,
+		"{0: 3}" /* Vclock. */
+	), "enter candidate state");
+
 	raft_node_destroy(&node);
 	raft_finish_test();
 }
diff --git a/test/unit/raft.result b/test/unit/raft.result
index fcd180cfc..8188d1806 100644
--- a/test/unit/raft.result
+++ b/test/unit/raft.result
@@ -176,11 +176,16 @@ ok 9 - subtests
 ok 10 - subtests
 	*** raft_test_election_quorum: done ***
 	*** raft_test_death_timeout ***
-    1..4
+    1..9
     ok 1 - leader notification
     ok 2 - follow the leader
     ok 3 - the leader still is considered alive
     ok 4 - enter candidate state when the new death timeout expires
+    ok 5 - message from leader
+    ok 6 - leader is accepted
+    ok 7 - became follower
+    ok 8 - death is detected immediately
+    ok 9 - enter candidate state
 ok 11 - subtests
 	*** raft_test_death_timeout: done ***
 	*** raft_test_enable_disable ***
-- 
2.24.3 (Apple Git-128)


