[Tarantool-patches] [PATCH 1/2] replication: introduce ballot.can_be_leader

Konstantin Osipov kostja.osipov at gmail.com
Mon Jul 19 12:12:48 MSK 2021


* Vladislav Shpilevoy <v.shpilevoy at tarantool.org> [21/07/18 20:03]:
> >> The new field during bootstrap will help to avoid selecting a
> >> 'voter' as a master. Voters can't write, so they can neither
> >> boot themselves nor register others.
> >>
> >> @TarantoolBot document
> >> Title: New field - IPROTO_BALLOT_CAN_BE_LEADER
> >> It is sent as a part of `IPROTO_BALLOT (0x29)`. The field is a
> >> boolean flag which is true if the sender has `election_mode` in
> >> its config `'manual'` or `'candidate'`.
> >>
> >> During bootstrap the nodes able to become a leader are preferred
> >> over the nodes configured as `'voter'`.
> > 
> > Curious why you added this feature in the first place, I mean
> > "eligibility"? Each voter has to be able to become a leader,
> > otherwise raft liveness guarantees are violated. Raft has
> > learners, but learners neither vote nor can become leaders.
> 
> Voters are nodes which an admin does not want to be a leader. For
> instance, they are too far away physically. As voters, they might
> help to elect a leader, for example, if there are just 3 nodes one
> of which is a voter.
> 
> Another application is when you specifically start 1 node as a
> voter and 2 candidates. The voter might skip all the replication
> data and work on a slow small machine.
> 
> It can help to form a majority. We are planning to make this
> feature even easier to use by adding dataless nodes just for
> voting.
> 
> As for Raft, it should not bring any problems. In Raft you can
> say that all nodes are candidates, but some of them are so slow
> that they can never vote for themselves in time. Raft still works,
> and you essentially have 'voters'.

Imagine there are nodes A, B, C, D, E.
A is the leader, E is a voter which cannot become a leader.

Imagine A's log index is 5, B = 4, C = 3, D = 2, E = 5.

The majority's log index is 4, so entry 4 is committed. A dies, B
is partitioned away. The cluster is stuck, because neither C nor D
can get a quorum (3 votes): E cannot run itself and will not vote
for a node whose log is behind its own.
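
To make the vote counting concrete, here is a minimal sketch of that
scenario in Python. It is not Tarantool code. It assumes all entries
belong to the same term, so the Raft "at least as up to date" check
reduces to comparing log indices; the names votes_for and electable
are made up for the illustration.

```python
# Log indices from the example above (single term assumed).
log_index = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 5}
reachable = {"C", "D", "E"}          # A is dead, B is partitioned away
quorum = len(log_index) // 2 + 1     # 3 votes out of 5 nodes


def votes_for(candidate):
    """Reachable nodes that grant a vote: those whose log is not ahead
    of the candidate's log (the candidate votes for itself)."""
    return {n for n in reachable if log_index[candidate] >= log_index[n]}


def electable(voters_only):
    """Reachable nodes that are allowed to run and can collect a quorum."""
    candidates = sorted(reachable - voters_only)
    return [c for c in candidates if len(votes_for(c)) >= quorum]


print(electable({"E"}))   # E may only vote: prints []
print(electable(set()))   # E may also run:  prints ['E']
```

With E restricted to voting, C collects two votes (C and D) and D only
its own, so nobody reaches 3. If E were allowed to run, it would win
with the votes of C, D and itself.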

Worse yet, if E's (voter) commit index is low rather than high, it
can vote for a node which doesn't have a committed entry. In that
case you can lose a committed entry.

-- 
Konstantin Osipov, Moscow, Russia

