[Tarantool-patches] [PATCH 7/8] raft: fix crash on election timeout decrease

Serge Petrenko sergepetrenko at tarantool.org
Wed Dec 16 16:08:29 MSK 2020


13.12.2020 20:15, Vladislav Shpilevoy wrote:
> If the election timeout was decreased during an election to a new
> value that made the current election expire immediately, this could
> cause a crash in libev.
>
> In that case the remaining time until the election end became
> negative. The negative timeout was passed to libev without any
> checks, and libev has an assertion that a timeout is always
> >= 0.
>
> Part of #5303
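
The arithmetic behind the fix can be sketched in isolation. This is only an illustration: the helper name and its parameters are hypothetical and are not part of Tarantool or libev. It assumes the intent of the patched expression is to shift the pending deadline by the difference between the new and the old timeout, and then clamp the result at zero.

```c
/*
 * Hypothetical helper, not a Tarantool or libev function.
 *
 * remaining   - seconds left until the currently armed timer fires
 * old_timeout - timeout the timer was armed with
 * new_timeout - newly configured election timeout
 */
static double
election_remaining_after_reconfig(double remaining, double old_timeout,
				  double new_timeout)
{
	/* Move the deadline by the change in the configured timeout. */
	double result = remaining - old_timeout + new_timeout;
	/*
	 * A deadline that is already in the past means "fire as soon
	 * as possible". A negative value must not reach the timer.
	 */
	if (result < 0)
		result = 0;
	return result;
}
```

For a scenario shaped like the new unit test, and ignoring any random shift of the timeout: half of a 1 second timeout has elapsed and the timeout is lowered to 0.25 seconds, so the unclamped value would be 0.5 - 1 + 0.25 = -0.25, and the helper returns 0.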


LGTM.


> ---
>   src/lib/raft/raft.c   |  2 ++
>   test/unit/raft.c      | 20 +++++++++++++++++++-
>   test/unit/raft.result |  8 +++++---
>   3 files changed, 26 insertions(+), 4 deletions(-)
>
> diff --git a/src/lib/raft/raft.c b/src/lib/raft/raft.c
> index ab007a462..4f6a5ee5e 100644
> --- a/src/lib/raft/raft.c
> +++ b/src/lib/raft/raft.c
> @@ -895,6 +895,8 @@ raft_cfg_election_timeout(struct raft *raft, double timeout)
>   		struct ev_loop *loop = raft_loop();
>   		double timeout = raft_ev_timer_remaining(loop, &raft->timer) -
>   				 raft->timer.at + raft->election_timeout;
> +		if (timeout < 0)
> +			timeout = 0;
>   		raft_ev_timer_stop(loop, &raft->timer);
>   		raft_ev_timer_set(&raft->timer, timeout, timeout);
>   		raft_ev_timer_start(loop, &raft->timer);
> diff --git a/test/unit/raft.c b/test/unit/raft.c
> index b97d9d0aa..2c3935cbf 100644
> --- a/test/unit/raft.c
> +++ b/test/unit/raft.c
> @@ -793,7 +793,7 @@ raft_test_heartbeat(void)
>   static void
>   raft_test_election_timeout(void)
>   {
> -	raft_start_test(11);
> +	raft_start_test(13);
>   	struct raft_node node;
>   	raft_node_create(&node);
>   
> @@ -865,6 +865,24 @@ raft_test_election_timeout(void)
>   		"{0: 3}" /* Vclock. */
>   	), "re-enter candidate state");
>   
> +	/* Decrease election timeout to earlier than now. */
> +
> +	raft_run_for(election_timeout / 2);
> +	raft_node_cfg_election_timeout(&node, election_timeout / 4);
> +	ts = raft_time();
> +	raft_run_next_event();
> +
> +	ok(raft_time() == ts, "the new timeout acts immediately");
> +	ok(raft_node_check_full_state(&node,
> +		RAFT_STATE_CANDIDATE /* State. */,
> +		0 /* Leader. */,
> +		5 /* Term. */,
> +		1 /* Vote. */,
> +		5 /* Volatile term. */,
> +		1 /* Volatile vote. */,
> +		"{0: 4}" /* Vclock. */
> +	), "re-enter candidate state");
> +
>   	/*
>   	 * Timeout smaller than a millisecond. Election random shift has
>   	 * millisecond precision. When timeout is smaller, maximal shift is
> diff --git a/test/unit/raft.result b/test/unit/raft.result
> index 3fa2682c8..fcd180cfc 100644
> --- a/test/unit/raft.result
> +++ b/test/unit/raft.result
> @@ -148,7 +148,7 @@ ok 7 - subtests
>   ok 8 - subtests
>   	*** raft_test_heartbeat: done ***
>   	*** raft_test_election_timeout ***
> -    1..11
> +    1..13
>       ok 1 - election is started
>       ok 2 - enter candidate state
>       ok 3 - new election is started
> @@ -158,8 +158,10 @@ ok 8 - subtests
>       ok 7 - new election timeout is respected
>       ok 8 - but not too late
>       ok 9 - re-enter candidate state
> -    ok 10 - term is bumped, timeout was truly random
> -    ok 11 - still candidate
> +    ok 10 - the new timeout acts immediately
> +    ok 11 - re-enter candidate state
> +    ok 12 - term is bumped, timeout was truly random
> +    ok 13 - still candidate
>   ok 9 - subtests
>   	*** raft_test_election_timeout: done ***
>   	*** raft_test_election_quorum ***

-- 
Serge Petrenko


