[Tarantool-patches] [PATCH v2 09/11] raft: introduce state machine
Serge Petrenko
sergepetrenko at tarantool.org
Mon Sep 21 11:34:20 MSK 2020
21.09.2020 11:22, Serge Petrenko пишет:
>
> 19.09.2020 18:49, Vladislav Shpilevoy пишет:
>> Here is a new version of the patch after some squashes.
>>
>> ====================
>> raft: introduce state machine
>> The commit is a core part of Raft implementation. It
>> introduces
>> the Raft state machine implementation and its integration into the
>> instance's life cycle.
>> The implementation follows the protocol to the letter
>> except a few
>> important details.
>> Firstly, the original Raft assumes, that all nodes share
>> the same
>> log record numbers. In Tarantool they are called LSNs. But in case
>> of Tarantool each node has its own LSN in its own component of
>> vclock. That makes the election messages a bit heavier, because
>> the nodes need to send and compare complete vclocks of each other
>> instead of a single number like in the original Raft. But logic
>> becomes simpler. Because in the original Raft there is a problem
>> of uncertainty about what to do with records of an old leader
>> right after a new leader is elected. They could be rolled back or
>> confirmed depending on circumstances. The issue disappears when
>> vclock is used.
>> Secondly, leader election works differently during cluster
>> bootstrap, until number of bootstrapped replicas becomes >=
>> election quorum. That arises from specifics of replicas bootstrap
>> and order of systems initialization. In short: during bootstrap a
>> leader election may use a smaller election quorum than the
>> configured one. See more details in the code.
>> Part of #1146
>
> Consider these fixes for applier vclock segfault pushed on top of the
> branch.
I've split the patch in two and placed the commits on top of
`raft: relay status updates to followers` and `raft: introduce state
machine`
>
> src/box/box.cc | 6 +-----
> src/box/raft.c | 12 +++++++++---
> 2 files changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/src/box/box.cc b/src/box/box.cc
> index c5dcbd959..1169f2cd5 100644
> --- a/src/box/box.cc
> +++ b/src/box/box.cc
> @@ -2160,11 +2160,7 @@ box_process_subscribe(struct ev_io *io, struct
> xrow_header *header)
> * should be 0.
> */
> struct raft_request req;
> - /*
> - * Omit the candidate vclock, since we've just sent it in
> - * subscribe response.
> - */
> - raft_serialize_for_network(&req, NULL);
> + raft_serialize_for_network(&req, &vclock);
> xrow_encode_raft(&row, &fiber()->gc, &req);
> coio_write_xrow(io, &row);
> }
> diff --git a/src/box/raft.c b/src/box/raft.c
> index 7b7ef9c1c..45783f0e3 100644
> --- a/src/box/raft.c
> +++ b/src/box/raft.c
> @@ -130,8 +130,6 @@ raft_new_random_election_shift(void)
> static inline bool
> raft_can_vote_for(const struct vclock *v)
> {
> - if (v == NULL)
> - return false;
> int cmp = vclock_compare_ignore0(v, &replicaset.vclock);
> return cmp == 0 || cmp == 1;
> }
> @@ -369,6 +367,12 @@ raft_process_msg(const struct raft_request *req,
> uint32_t source)
> "candidate");
> break;
> }
> + /* Can't vote when vclock is unknown. */
> + if (req->vclock == NULL) {
> + say_info("RAFT: vote request is skipped - "
> + "missing candidate vclock.");
> + break;
> + }
> /* Can't vote for too old or incomparable nodes. */
> if (!raft_can_vote_for(req->vclock)) {
> say_info("RAFT: vote request is skipped - "
> @@ -858,8 +862,10 @@ raft_serialize_for_network(struct raft_request
> *req, struct vclock *vclock)
> req->state = raft.state;
> /*
> * Raft does not own vclock, so it always expects it passed
> externally.
> + * Vclock is sent out only by candidate instances.
> */
> - req->vclock = vclock;
> + if (req->state == RAFT_STATE_CANDIDATE)
> + req->vclock = vclock;
> }
>
> void
--
Serge Petrenko
More information about the Tarantool-patches
mailing list