[Tarantool-patches] [PATCH 8/8] raft: fix crash on death timeout decrease

Serge Petrenko sergepetrenko at tarantool.org
Wed Dec 16 16:10:32 MSK 2020


13.12.2020 20:15, Vladislav Shpilevoy пишет:
> If death timeout was decreased during waiting for leader death or
> discovery to a new value making the current death waiting end
> immediately, it could crash in libev.
>
> Because it would mean the remaining time until leader death became
> negative. The negative timeout was passed to libev without any
> checks, and there is an assertion, that a timeout should always
> be >= 0.
>
> This commit makes raft code covered almost on 100%, not counting
> one 'unreachable()' place.
>
> Closes #5303


LGTM.


> ---
>   src/lib/raft/raft.c   |  2 ++
>   test/unit/raft.c      | 26 +++++++++++++++++++++++++-
>   test/unit/raft.result |  7 ++++++-
>   3 files changed, 33 insertions(+), 2 deletions(-)
>
> diff --git a/src/lib/raft/raft.c b/src/lib/raft/raft.c
> index 4f6a5ee5e..4ea4fc3f8 100644
> --- a/src/lib/raft/raft.c
> +++ b/src/lib/raft/raft.c
> @@ -924,6 +924,8 @@ raft_cfg_death_timeout(struct raft *raft, double death_timeout)
>   		struct ev_loop *loop = raft_loop();
>   		double timeout = raft_ev_timer_remaining(loop, &raft->timer) -
>   				 raft->timer.at + raft->death_timeout;
> +		if (timeout < 0)
> +			timeout = 0;
>   		raft_ev_timer_stop(loop, &raft->timer);
>   		raft_ev_timer_set(&raft->timer, timeout, timeout);
>   		raft_ev_timer_start(loop, &raft->timer);
> diff --git a/test/unit/raft.c b/test/unit/raft.c
> index 2c3935cbf..11e101777 100644
> --- a/test/unit/raft.c
> +++ b/test/unit/raft.c
> @@ -971,7 +971,7 @@ raft_test_election_quorum(void)
>   static void
>   raft_test_death_timeout(void)
>   {
> -	raft_start_test(4);
> +	raft_start_test(9);
>   	struct raft_node node;
>   	raft_node_create(&node);
>   
> @@ -1018,6 +1018,30 @@ raft_test_death_timeout(void)
>   		"{0: 2}" /* Vclock. */
>   	), "enter candidate state when the new death timeout expires");
>   
> +	/* Decrease timeout to earlier than now. */
> +
> +	is(raft_node_send_leader(&node,
> +		3 /* Term. */,
> +		2 /* Source. */
> +	), 0, "message from leader");
> +	is(node.raft.leader, 2, "leader is accepted");
> +	is(node.raft.state, RAFT_STATE_FOLLOWER, "became follower");
> +
> +	raft_run_for(timeout / 2);
> +	raft_node_cfg_death_timeout(&node, timeout / 4);
> +	double ts = raft_time();
> +	raft_run_next_event();
> +	ok(raft_time() == ts, "death is detected immediately");
> +	ok(raft_node_check_full_state(&node,
> +		RAFT_STATE_CANDIDATE /* State. */,
> +		0 /* Leader. */,
> +		4 /* Term. */,
> +		1 /* Vote. */,
> +		4 /* Volatile term. */,
> +		1 /* Volatile vote. */,
> +		"{0: 3}" /* Vclock. */
> +	), "enter candidate state");
> +
>   	raft_node_destroy(&node);
>   	raft_finish_test();
>   }
> diff --git a/test/unit/raft.result b/test/unit/raft.result
> index fcd180cfc..8188d1806 100644
> --- a/test/unit/raft.result
> +++ b/test/unit/raft.result
> @@ -176,11 +176,16 @@ ok 9 - subtests
>   ok 10 - subtests
>   	*** raft_test_election_quorum: done ***
>   	*** raft_test_death_timeout ***
> -    1..4
> +    1..9
>       ok 1 - leader notification
>       ok 2 - follow the leader
>       ok 3 - the leader still is considered alive
>       ok 4 - enter candidate state when the new death timeout expires
> +    ok 5 - message from leader
> +    ok 6 - leader is accepted
> +    ok 7 - became follower
> +    ok 8 - death is detected immediately
> +    ok 9 - enter candidate state
>   ok 11 - subtests
>   	*** raft_test_death_timeout: done ***
>   	*** raft_test_enable_disable ***

-- 
Serge Petrenko



More information about the Tarantool-patches mailing list