From: Serge Petrenko via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>, tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH 3/4] raft: introduce split vote detection
Date: Tue, 18 Jan 2022 16:20:52 +0300
Message-ID: <0d65c52d-c42f-7271-d4d2-a997268138a7@tarantool.org>
In-Reply-To: <8ce7d7d2ff3c79f11f73272ad08e43838689681a.1642207647.git.v.shpilevoy@tarantool.org>

Thanks for the patch!

I don't think this optimisation is "too much of a hassle". It's quite
nice, and it looks like a good share of the SLOC in the patch is used up
by verbose printing (I mean raft_scores_snprintf).

In other words, I like the idea and I think we should have that on
board. (Just like pre-voting)

Please find my comments below.

> diff --git a/src/lib/raft/raft.c b/src/lib/raft/raft.c
> index 289d53fd5..5dcbc7821 100644
> --- a/src/lib/raft/raft.c
> +++ b/src/lib/raft/raft.c
> @@ -152,20 +152,69 @@ raft_can_vote_for(const struct raft *raft, const struct vclock *v)
>  	return cmp == 0 || cmp == 1;
>  }
>
> -static inline void
> +static inline bool
>  raft_add_vote(struct raft *raft, int src, int dst)
>  {
>  	struct raft_vote *v = &raft->votes[src];
>  	if (v->did_vote)
> -		return;
> +		return false;
>  	v->did_vote = true;
>  	++raft->votes[dst].count;
> +	return true;
> +}
> +

You may check split_vote right in raft_add_vote: simply track the number
of votes given in this term and the max votes given for one instance.

This way you won't have to run over all 32 nodes each time a vote is cast.
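A minimal sketch of that suggestion, not taken from the patch. It assumes
two extra per-term counters; the names vote_tracker, voted_count and
max_vote are hypothetical, and NODE_MAX only stands in for VCLOCK_MAX.
The counters would have to be reset whenever a new term starts, and the
check assumes cluster_size is never smaller than the number of votes
already cast.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for VCLOCK_MAX; hypothetical name for this sketch. */
enum { NODE_MAX = 32 };

struct vote_slot {
	/* True once this node has voted in the current term. */
	bool did_vote;
	/* Number of votes this node has received in the current term. */
	int count;
};

/* Hypothetical container; in the patch these would live in struct raft. */
struct vote_tracker {
	struct vote_slot votes[NODE_MAX];
	/* Hypothetical: total votes cast in the current term. */
	int voted_count;
	/* Hypothetical: highest score of any single node in this term. */
	int max_vote;
	int cluster_size;
	int quorum;
};

/* Record a vote and update both counters in O(1). */
static bool
tracker_add_vote(struct vote_tracker *t, int src, int dst)
{
	assert(src >= 0 && src < NODE_MAX);
	assert(dst >= 0 && dst < NODE_MAX);
	struct vote_slot *v = &t->votes[src];
	if (v->did_vote)
		return false;
	v->did_vote = true;
	int count = ++t->votes[dst].count;
	if (count > t->max_vote)
		t->max_vote = count;
	++t->voted_count;
	return true;
}

/* No scan over the nodes: the leading score plus all unused votes. */
static bool
tracker_has_split_vote(const struct vote_tracker *t)
{
	/* Assumes cluster_size >= voted_count, so vote_vac is not negative. */
	int vote_vac = t->cluster_size - t->voted_count;
	return t->max_vote + vote_vac < t->quorum;
}
```

With 4 nodes and a quorum of 3, two votes for one node followed by two
votes for another leave max_vote at 2 and no vacant votes, so the check
reports a split vote without iterating over the votes array.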
> +static bool
> +raft_has_split_vote(const struct raft *raft)
> +{
> +	int max_vote = 0;
> +	int vote_vac = raft->cluster_size;
> +	int quorum = raft->election_quorum;
> +	for (int i = 0; i < VCLOCK_MAX; ++i) {
> +		int count = raft->votes[i].count;
> +		vote_vac -= count;
> +		if (count > max_vote)
> +			max_vote = count;
> +	}
> +	return max_vote < quorum && max_vote + vote_vac < quorum;

This is equal to `return max_vote + vote_vac < quorum`

> +}
> +
> +static int
> +raft_scores_snprintf(const struct raft *raft, char *buf, int size)
> +{
> +	int total = 0;
> +	bool is_empty = true;
> +	SNPRINT(total, snprintf, buf, size, "{");
> +	for (int i = 0; i < VCLOCK_MAX; ++i) {
> +		int count = raft->votes[i].count;
> +		if (count == 0)
> +			continue;
> +		if (!is_empty)
> +			SNPRINT(total, snprintf, buf, size, ", ");
> +		is_empty = false;

Nit: you may move is_empty = false into the 'else' branch.

> +		SNPRINT(total, snprintf, buf, size, "%d: %d", i, count);
> +	}
> +	SNPRINT(total, snprintf, buf, size, "}");
> +	return total;
> +}
> +

...

>
> +static void
> +raft_check_split_vote(struct raft *raft)
> +{
> +	/* When leader is known, there is no election. Thus no vote to split. */
> +	if (raft->leader != 0)
> +		return;
> +	/* Not a candidate = can't trigger term bump anyway. */
> +	if (!raft->is_candidate)
> +		return;
> +	/*
> +	 * WAL write in progress means the state is changing. All is rechecked
> +	 * when it is done.
> +	 */
> +	if (raft->is_write_in_progress)
> +		return;
> +	if (!raft_has_split_vote(raft))
> +		return;
> +	assert(raft_ev_is_active(&raft->timer));
> +	if (raft->timer.at < raft->election_timeout)
> +		return;

I don't understand that. timer.at should point at current time, shouldn't it?
> +
> +	assert(raft->state == RAFT_STATE_FOLLOWER ||
> +	       raft->state == RAFT_STATE_CANDIDATE);
> +	struct ev_loop *loop = raft_loop();
> +	struct ev_timer *timer = &raft->timer;
> +	double delay = raft_new_random_election_shift(raft);
> +	/*
> +	 * Could be too late to speed up anything - probably the term is almost
> +	 * over anyway.
> +	 */
> +	double remaining = raft_ev_timer_remaining(loop, timer);
> +	if (delay >= remaining)
> +		delay = remaining;
> +	say_info("RAFT: split vote is discovered - %s, new term in %lf sec",
> +		 raft_scores_str(raft), delay);
> +	raft_ev_timer_stop(loop, timer);
> +	raft_ev_timer_set(timer, delay, delay);
> +	raft_ev_timer_start(loop, timer);
> +}
> +
>  void
>  raft_create(struct raft *raft, const struct raft_vtab *vtab)
>  {
> @@ -1053,6 +1150,7 @@ raft_create(struct raft *raft, const struct raft_vtab *vtab)
>  		.election_quorum = 1,
>  		.election_timeout = 5,
>  		.death_timeout = 5,
> +		.cluster_size = VCLOCK_MAX,
>  		.vtab = vtab,
>  	};
>  	raft_ev_timer_init(&raft->timer, raft_sm_schedule_new_election_cb,

...

>

-- 
Serge Petrenko