Tarantool development patches archive
From: Serge Petrenko via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>,
	tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH 3/4] raft: introduce split vote detection
Date: Tue, 18 Jan 2022 16:20:52 +0300
Message-ID: <0d65c52d-c42f-7271-d4d2-a997268138a7@tarantool.org>
In-Reply-To: <8ce7d7d2ff3c79f11f73272ad08e43838689681a.1642207647.git.v.shpilevoy@tarantool.org>


Thanks for the patch!

I don't think this optimisation is "too much of a hassle".
It's quite nice, and it looks like a good share of the SLOC in the patch
is taken up by verbose printing (I mean raft_scores_snprintf).

In other words, I like the idea and I think we should have that on board.
(Just like pre-voting)

Please find my comments below.

> diff --git a/src/lib/raft/raft.c b/src/lib/raft/raft.c
> index 289d53fd5..5dcbc7821 100644
> --- a/src/lib/raft/raft.c
> +++ b/src/lib/raft/raft.c
> @@ -152,20 +152,69 @@ raft_can_vote_for(const struct raft *raft, const struct vclock *v)
>   	return cmp == 0 || cmp == 1;
>   }
>   
> -static inline void
> +static inline bool
>   raft_add_vote(struct raft *raft, int src, int dst)
>   {
>   	struct raft_vote *v = &raft->votes[src];
>   	if (v->did_vote)
> -		return;
> +		return false;
>   	v->did_vote = true;
>   	++raft->votes[dst].count;
> +	return true;
> +}
> +

You may check for a split vote right in raft_add_vote:
simply track the number of votes given in this term and
the max number of votes given for one instance.

This way you won't have to run over all 32 nodes each time a vote
is cast.
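
A minimal sketch of that idea, under stated assumptions: struct
vote_tracker, its fields and the vote_tracker_* helpers are hypothetical
stand-ins for the vote state in struct raft, not code from the patch.
The counters are updated on every accepted vote, so the split vote check
becomes O(1):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum { SKETCH_MAX_NODES = 32 };

/* Hypothetical, simplified stand-in for the vote state in struct raft. */
struct vote_tracker {
	bool did_vote[SKETCH_MAX_NODES];
	int count[SKETCH_MAX_NODES];
	/* Number of votes cast in the current term. */
	int voted_count;
	/* Highest number of votes collected by a single node. */
	int max_vote;
	int cluster_size;
	int quorum;
};

/* Must be called on each new term: old votes do not carry over. */
static void
vote_tracker_reset(struct vote_tracker *t, int cluster_size, int quorum)
{
	memset(t, 0, sizeof(*t));
	t->cluster_size = cluster_size;
	t->quorum = quorum;
}

/* Returns false if src has already voted in this term. */
static bool
vote_tracker_add(struct vote_tracker *t, int src, int dst)
{
	assert(src >= 0 && src < SKETCH_MAX_NODES);
	assert(dst >= 0 && dst < SKETCH_MAX_NODES);
	if (t->did_vote[src])
		return false;
	t->did_vote[src] = true;
	++t->voted_count;
	if (++t->count[dst] > t->max_vote)
		t->max_vote = t->count[dst];
	return true;
}

/* True when no node can reach the quorum even with all remaining votes. */
static bool
vote_tracker_has_split_vote(const struct vote_tracker *t)
{
	int vote_vac = t->cluster_size - t->voted_count;
	return t->max_vote + vote_vac < t->quorum;
}
```

For example, with cluster_size = 4 and quorum = 3, the votes 0->0, 1->1,
2->0 leave one vacant vote and a best score of 2, so a winner is still
possible. A fourth vote 3->1 makes it 2:2 with nothing left, and the
check reports a split vote.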

> +static bool
> +raft_has_split_vote(const struct raft *raft)
> +{
> +	int max_vote = 0;
> +	int vote_vac = raft->cluster_size;
> +	int quorum = raft->election_quorum;
> +	for (int i = 0; i < VCLOCK_MAX; ++i) {
> +		int count = raft->votes[i].count;
> +		vote_vac -= count;
> +		if (count > max_vote)
> +			max_vote = count;
> +	}
> +	return max_vote < quorum && max_vote + vote_vac < quorum;

This is equivalent to `return max_vote + vote_vac < quorum`,
as long as vote_vac is never negative.
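
A small brute-force check of that equivalence, as a sketch. The
split_vote_* function names are made up for illustration. The two forms
agree for every non-negative vote_vac; they can differ only if vote_vac
goes negative, which would mean more votes were counted than
cluster_size allows:

```c
#include <assert.h>
#include <stdbool.h>

/* The condition as written in the patch. */
static bool
split_vote_full(int max_vote, int vote_vac, int quorum)
{
	return max_vote < quorum && max_vote + vote_vac < quorum;
}

/* The shortened condition. */
static bool
split_vote_short(int max_vote, int vote_vac, int quorum)
{
	return max_vote + vote_vac < quorum;
}

/*
 * Counts the inputs in [0, limit]^3 on which the two forms disagree.
 * Only non-negative values are tried, vote_vac included.
 */
static int
split_vote_count_mismatches(int limit)
{
	assert(limit >= 0);
	int mismatches = 0;
	for (int max_vote = 0; max_vote <= limit; ++max_vote) {
		for (int vac = 0; vac <= limit; ++vac) {
			for (int q = 0; q <= limit; ++q) {
				if (split_vote_full(max_vote, vac, q) !=
				    split_vote_short(max_vote, vac, q))
					++mismatches;
			}
		}
	}
	return mismatches;
}
```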

> +}
> +
> +static int
> +raft_scores_snprintf(const struct raft *raft, char *buf, int size)
> +{
> +	int total = 0;
> +	bool is_empty = true;
> +	SNPRINT(total, snprintf, buf, size, "{");
> +	for (int i = 0; i < VCLOCK_MAX; ++i) {
> +		int count = raft->votes[i].count;
> +		if (count == 0)
> +			continue;
> +		if (!is_empty)
> +			SNPRINT(total, snprintf, buf, size, ", ");
> +		is_empty = false;

Nit: you may move is_empty = false into the 'else' branch.
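
A sketch of the loop with that change applied. To keep it
self-contained it does not use Tarantool's SNPRINT macro:
sketch_append() and sketch_scores_snprint() are hypothetical helpers
written only for this example, built on plain vsnprintf():

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Appends formatted text at buf + offset if there is room left and
 * returns the length the text needs, like snprintf() does.
 */
static int
sketch_append(char *buf, int size, int offset, const char *fmt, ...)
{
	assert(size >= 0 && offset >= 0);
	va_list ap;
	va_start(ap, fmt);
	int rc;
	if (offset < size)
		rc = vsnprintf(buf + offset, size - offset, fmt, ap);
	else
		rc = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	return rc < 0 ? 0 : rc;
}

/* Prints non-zero counts as "{index: count, ...}". */
static int
sketch_scores_snprint(const int *counts, int n, char *buf, int size)
{
	int total = 0;
	bool is_empty = true;
	total += sketch_append(buf, size, total, "{");
	for (int i = 0; i < n; ++i) {
		if (counts[i] == 0)
			continue;
		if (!is_empty)
			total += sketch_append(buf, size, total, ", ");
		else
			is_empty = false;
		total += sketch_append(buf, size, total, "%d: %d", i,
				       counts[i]);
	}
	total += sketch_append(buf, size, total, "}");
	return total;
}
```

The flag is then written once, on the first printed entry, instead of
on every iteration.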

> +		SNPRINT(total, snprintf, buf, size, "%d: %d", i, count);
> +	}
> +	SNPRINT(total, snprintf, buf, size, "}");
> +	return total;
> +}
> +

...

>   
> +static void
> +raft_check_split_vote(struct raft *raft)
> +{
> +	/* When leader is known, there is no election. Thus no vote to split. */
> +	if (raft->leader != 0)
> +		return;
> +	/* Not a candidate = can't trigger term bump anyway. */
> +	if (!raft->is_candidate)
> +		return;
> +	/*
> +	 * WAL write in progress means the state is changing. All is rechecked
> +	 * when it is done.
> +	 */
> +	if (raft->is_write_in_progress)
> +		return;
> +	if (!raft_has_split_vote(raft))
> +		return;
> +	assert(raft_ev_is_active(&raft->timer));
> +	if (raft->timer.at < raft->election_timeout)
> +		return;

I don't understand that. timer.at should point at the current time,
shouldn't it?

> +
> +	assert(raft->state == RAFT_STATE_FOLLOWER ||
> +	       raft->state == RAFT_STATE_CANDIDATE);
> +	struct ev_loop *loop = raft_loop();
> +	struct ev_timer *timer = &raft->timer;
> +	double delay = raft_new_random_election_shift(raft);
> +	/*
> +	 * Could be too late to speed up anything - probably the term is almost
> +	 * over anyway.
> +	 */
> +	double remaining = raft_ev_timer_remaining(loop, timer);
> +	if (delay >= remaining)
> +		delay = remaining;
> +	say_info("RAFT: split vote is discovered - %s, new term in %lf sec",
> +		 raft_scores_str(raft), delay);
> +	raft_ev_timer_stop(loop, timer);
> +	raft_ev_timer_set(timer, delay, delay);
> +	raft_ev_timer_start(loop, timer);
> +}
> +
>   void
>   raft_create(struct raft *raft, const struct raft_vtab *vtab)
>   {
> @@ -1053,6 +1150,7 @@ raft_create(struct raft *raft, const struct raft_vtab *vtab)
>   		.election_quorum = 1,
>   		.election_timeout = 5,
>   		.death_timeout = 5,
> +		.cluster_size = VCLOCK_MAX,
>   		.vtab = vtab,
>   	};
>   	raft_ev_timer_init(&raft->timer, raft_sm_schedule_new_election_cb,
...
>   

-- 
Serge Petrenko



Thread overview: 16+ messages
2022-01-15  0:48 [Tarantool-patches] [PATCH 0/4] Split vote Vladislav Shpilevoy via Tarantool-patches
2022-01-15  0:48 ` [Tarantool-patches] [PATCH 1/4] raft: fix crash on election_timeout reconfig Vladislav Shpilevoy via Tarantool-patches
2022-01-18 13:12   ` Serge Petrenko via Tarantool-patches
2022-01-15  0:48 ` [Tarantool-patches] [PATCH 2/4] raft: track all votes, even not own Vladislav Shpilevoy via Tarantool-patches
2022-01-21  0:42   ` Vladislav Shpilevoy via Tarantool-patches
2022-01-15  0:48 ` [Tarantool-patches] [PATCH 3/4] raft: introduce split vote detection Vladislav Shpilevoy via Tarantool-patches
2022-01-18 13:20   ` Serge Petrenko via Tarantool-patches [this message]
2022-01-20  0:44     ` Vladislav Shpilevoy via Tarantool-patches
2022-01-20 10:21       ` Serge Petrenko via Tarantool-patches
2022-01-20 23:02         ` Vladislav Shpilevoy via Tarantool-patches
2022-01-15  0:48 ` [Tarantool-patches] [PATCH 4/4] election: activate raft split vote handling Vladislav Shpilevoy via Tarantool-patches
2022-01-18 13:21   ` Serge Petrenko via Tarantool-patches
2022-01-20  0:44     ` Vladislav Shpilevoy via Tarantool-patches
2022-01-16 14:10 ` [Tarantool-patches] [PATCH 0/4] Split vote Konstantin Osipov via Tarantool-patches
2022-01-17 22:57   ` Vladislav Shpilevoy via Tarantool-patches
2022-01-18  7:18     ` Konstantin Osipov via Tarantool-patches
