From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from smtpng1.m.smailru.net (smtpng1.m.smailru.net [94.100.181.251])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dev.tarantool.org (Postfix) with ESMTPS id 05F1545C30C
	for ; Sun, 13 Dec 2020 20:15:43 +0300 (MSK)
From: Vladislav Shpilevoy
Date: Sun, 13 Dec 2020 18:15:29 +0100
Message-Id: <4d367a7c1a4fe8efc92de40583b5f4843b11b295.1607879643.git.v.shpilevoy@tarantool.org>
In-Reply-To:
References:
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [Tarantool-patches] [PATCH 7/8] raft: fix crash on election timeout decrease
List-Id: Tarantool development patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: tarantool-patches@dev.tarantool.org, sergepetrenko@tarantool.org

If the election timeout was decreased during an election to a value
that made the current election expire immediately, the instance could
crash in libev. In that case the remaining time until the end of the
election became negative. The negative timeout was passed to libev
without any checks, and libev has an assertion that a timeout is
always >= 0.

Part of #5303
---
 src/lib/raft/raft.c   |  2 ++
 test/unit/raft.c      | 20 +++++++++++++++++++-
 test/unit/raft.result |  8 +++++---
 3 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/src/lib/raft/raft.c b/src/lib/raft/raft.c
index ab007a462..4f6a5ee5e 100644
--- a/src/lib/raft/raft.c
+++ b/src/lib/raft/raft.c
@@ -895,6 +895,8 @@ raft_cfg_election_timeout(struct raft *raft, double timeout)
 		struct ev_loop *loop = raft_loop();
 		double timeout = raft_ev_timer_remaining(loop, &raft->timer) -
 				 raft->timer.at + raft->election_timeout;
+		if (timeout < 0)
+			timeout = 0;
 		raft_ev_timer_stop(loop, &raft->timer);
 		raft_ev_timer_set(&raft->timer, timeout, timeout);
 		raft_ev_timer_start(loop, &raft->timer);
diff --git a/test/unit/raft.c b/test/unit/raft.c
index b97d9d0aa..2c3935cbf 100644
--- a/test/unit/raft.c
+++ b/test/unit/raft.c
@@ -793,7 +793,7 @@ raft_test_heartbeat(void)
 static void
 raft_test_election_timeout(void)
 {
-	raft_start_test(11);
+	raft_start_test(13);
 	struct raft_node node;
 	raft_node_create(&node);
 
@@ -865,6 +865,24 @@ raft_test_election_timeout(void)
 		"{0: 3}" /* Vclock. */
 	), "re-enter candidate state");
 
+	/* Decrease election timeout to earlier than now. */
+
+	raft_run_for(election_timeout / 2);
+	raft_node_cfg_election_timeout(&node, election_timeout / 4);
+	ts = raft_time();
+	raft_run_next_event();
+
+	ok(raft_time() == ts, "the new timeout acts immediately");
+	ok(raft_node_check_full_state(&node,
+		RAFT_STATE_CANDIDATE /* State. */,
+		0 /* Leader. */,
+		5 /* Term. */,
+		1 /* Vote. */,
+		5 /* Volatile term. */,
+		1 /* Volatile vote. */,
+		"{0: 4}" /* Vclock. */
+	), "re-enter candidate state");
+
 	/*
 	 * Timeout smaller than a millisecond. Election random shift has
 	 * millisecond precision. When timeout is smaller, maximal shift is
diff --git a/test/unit/raft.result b/test/unit/raft.result
index 3fa2682c8..fcd180cfc 100644
--- a/test/unit/raft.result
+++ b/test/unit/raft.result
@@ -148,7 +148,7 @@ ok 7 - subtests
 ok 8 - subtests
 	*** raft_test_heartbeat: done ***
 	*** raft_test_election_timeout ***
-    1..11
+    1..13
     ok 1 - election is started
     ok 2 - enter candidate state
     ok 3 - new election is started
@@ -158,8 +158,10 @@ ok 8 - subtests
     ok 7 - new election timeout is respected
     ok 8 - but not too late
     ok 9 - re-enter candidate state
-    ok 10 - term is bumped, timeout was truly random
-    ok 11 - still candidate
+    ok 10 - the new timeout acts immediately
+    ok 11 - re-enter candidate state
+    ok 12 - term is bumped, timeout was truly random
+    ok 13 - still candidate
 ok 9 - subtests
 	*** raft_test_election_timeout: done ***
 	*** raft_test_election_quorum ***
-- 
2.24.3 (Apple Git-128)
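
For illustration only, the clamp this patch adds to
raft_cfg_election_timeout() can be sketched in isolation. The helper
name sketch_rearm_delay() and its parameters are hypothetical and not
part of Tarantool. The sketch also assumes that the raft->timer.at term
in the patch stands for the delay the timer was previously armed with;
the patch itself does not state this.

```c
#include <assert.h>

/*
 * Hypothetical helper, not Tarantool code. Returns the delay to re-arm
 * the election timer with after the election timeout is reconfigured.
 *
 * remaining   - time left until the currently armed timer would fire
 * old_timeout - delay the timer was armed with (ASSUMPTION: this is
 *               what raft->timer.at represents in the patch)
 * new_timeout - newly configured election timeout
 */
double
sketch_rearm_delay(double remaining, double old_timeout, double new_timeout)
{
	/* Same arithmetic as in the patch: shift the deadline by the change. */
	double delay = remaining - old_timeout + new_timeout;
	/* The fix: a deadline that is already in the past fires right away. */
	if (delay < 0)
		delay = 0;
	/* A negative value must never reach the event loop. */
	assert(delay >= 0);
	return delay;
}
```

Under that assumption, the new unit test case corresponds to a call such
as sketch_rearm_delay(T / 2, T, T / 4): half of the old timeout T is
left, the unclamped result is -T / 4, and the clamp turns it into 0, so
the timer fires on the next event loop iteration.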