From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtpng1.m.smailru.net (smtpng1.m.smailru.net [94.100.181.251]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 5B5BB45C304 for ; Sun, 13 Dec 2020 20:15:45 +0300 (MSK) From: Vladislav Shpilevoy Date: Sun, 13 Dec 2020 18:15:30 +0100 Message-Id: <607c7c67a3d31471a8c8d2fcb13c86615bd48024.1607879643.git.v.shpilevoy@tarantool.org> In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Subject: [Tarantool-patches] [PATCH 8/8] raft: fix crash on death timeout decrease List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: tarantool-patches@dev.tarantool.org, sergepetrenko@tarantool.org If death timeout was decreased during waiting for leader death or discovery to a new value making the current death waiting end immediately, it could crash in libev. Because it would mean the remaining time until leader death became negative. The negative timeout was passed to libev without any checks, and there is an assertion, that a timeout should always be >= 0. This commit makes raft code covered almost on 100%, not counting one 'unreachable()' place. Closes #5303 --- src/lib/raft/raft.c | 2 ++ test/unit/raft.c | 26 +++++++++++++++++++++++++- test/unit/raft.result | 7 ++++++- 3 files changed, 33 insertions(+), 2 deletions(-) diff --git a/src/lib/raft/raft.c b/src/lib/raft/raft.c index 4f6a5ee5e..4ea4fc3f8 100644 --- a/src/lib/raft/raft.c +++ b/src/lib/raft/raft.c @@ -924,6 +924,8 @@ raft_cfg_death_timeout(struct raft *raft, double death_timeout) struct ev_loop *loop = raft_loop(); double timeout = raft_ev_timer_remaining(loop, &raft->timer) - raft->timer.at + raft->death_timeout; + if (timeout < 0) + timeout = 0; raft_ev_timer_stop(loop, &raft->timer); raft_ev_timer_set(&raft->timer, timeout, timeout); raft_ev_timer_start(loop, &raft->timer); diff --git a/test/unit/raft.c b/test/unit/raft.c index 2c3935cbf..11e101777 100644 --- a/test/unit/raft.c +++ b/test/unit/raft.c @@ -971,7 +971,7 @@ raft_test_election_quorum(void) static void raft_test_death_timeout(void) { - raft_start_test(4); + raft_start_test(9); struct raft_node node; raft_node_create(&node); @@ -1018,6 +1018,30 @@ raft_test_death_timeout(void) "{0: 2}" /* Vclock. */ ), "enter candidate state when the new death timeout expires"); + /* Decrease timeout to earlier than now. */ + + is(raft_node_send_leader(&node, + 3 /* Term. */, + 2 /* Source. */ + ), 0, "message from leader"); + is(node.raft.leader, 2, "leader is accepted"); + is(node.raft.state, RAFT_STATE_FOLLOWER, "became follower"); + + raft_run_for(timeout / 2); + raft_node_cfg_death_timeout(&node, timeout / 4); + double ts = raft_time(); + raft_run_next_event(); + ok(raft_time() == ts, "death is detected immediately"); + ok(raft_node_check_full_state(&node, + RAFT_STATE_CANDIDATE /* State. */, + 0 /* Leader. */, + 4 /* Term. */, + 1 /* Vote. */, + 4 /* Volatile term. */, + 1 /* Volatile vote. */, + "{0: 3}" /* Vclock. */ + ), "enter candidate state"); + raft_node_destroy(&node); raft_finish_test(); } diff --git a/test/unit/raft.result b/test/unit/raft.result index fcd180cfc..8188d1806 100644 --- a/test/unit/raft.result +++ b/test/unit/raft.result @@ -176,11 +176,16 @@ ok 9 - subtests ok 10 - subtests *** raft_test_election_quorum: done *** *** raft_test_death_timeout *** - 1..4 + 1..9 ok 1 - leader notification ok 2 - follow the leader ok 3 - the leader still is considered alive ok 4 - enter candidate state when the new death timeout expires + ok 5 - message from leader + ok 6 - leader is accepted + ok 7 - became follower + ok 8 - death is detected immediately + ok 9 - enter candidate state ok 11 - subtests *** raft_test_death_timeout: done *** *** raft_test_enable_disable *** -- 2.24.3 (Apple Git-128)