[Tarantool-patches] [PATCH v4 13/12] replication: send accumulated Raft messages after relay start

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Tue Apr 20 01:36:48 MSK 2021


Thanks for the patch!

See 2 comments below.

> diff --git a/src/box/relay.cc b/src/box/relay.cc
> index 7be33ee31..85f335cd7 100644
> --- a/src/box/relay.cc
> +++ b/src/box/relay.cc
> @@ -628,13 +659,38 @@ struct relay_is_raft_enabled_msg {
>      bool is_finished;
>  };
> 
> +static void
> +relay_push_raft_msg(struct relay *relay, bool do_restart_recovery)

1. Why is the recovery restart flag ignored if a message is already
sent? This might lead to recovery restart loss if I am not mistaken.
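
To make the concern concrete, here is a minimal standalone sketch. It is
not the real relay code: struct toy_relay_tx, toy_push_lossy(),
toy_push_sticky(), toy_on_push_done() and the is_restart_pending field are
all hypothetical names. It assumes the caller marks a push as pending when
one is already in flight, which the quoted hunk does not show. The first
function mirrors the early return from the patch, the second shows one
possible way to keep the flag until the next message actually goes out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the tx-side relay state, not the real struct. */
struct toy_relay_tx {
	bool is_raft_enabled;
	bool is_raft_push_sent;
	bool is_raft_push_pending;
	/* Hypothetical field: a restart request remembered across pushes. */
	bool is_restart_pending;
	/* What the most recently "sent" message carried. */
	bool sent_do_restart_recovery;
	int sent_count;
};

typedef void (*toy_push_f)(struct toy_relay_tx *tx, bool do_restart_recovery);

/* Mirrors the early return in the patch: the flag is simply dropped. */
static void
toy_push_lossy(struct toy_relay_tx *tx, bool do_restart_recovery)
{
	assert(tx != NULL);
	if (!tx->is_raft_enabled || tx->is_raft_push_sent)
		return;
	tx->sent_do_restart_recovery = do_restart_recovery;
	tx->sent_count++;
	tx->is_raft_push_sent = true;
	tx->is_raft_push_pending = false;
}

/* Possible alternative: accumulate the flag until a push really happens. */
static void
toy_push_sticky(struct toy_relay_tx *tx, bool do_restart_recovery)
{
	assert(tx != NULL);
	tx->is_restart_pending |= do_restart_recovery;
	if (!tx->is_raft_enabled || tx->is_raft_push_sent) {
		tx->is_raft_push_pending = true;
		return;
	}
	tx->sent_do_restart_recovery = tx->is_restart_pending;
	tx->is_restart_pending = false;
	tx->sent_count++;
	tx->is_raft_push_sent = true;
	tx->is_raft_push_pending = false;
}

/* Hypothetical completion hook: the in-flight message has returned to tx. */
static void
toy_on_push_done(struct toy_relay_tx *tx, toy_push_f push)
{
	assert(tx != NULL);
	tx->is_raft_push_sent = false;
	if (tx->is_raft_push_pending)
		push(tx, false);
}
```

With the lossy variant, a push(true) issued while another message is in
flight never reaches the relay thread. With the sticky variant the next
outgoing message carries the restart request.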

> +{
> +    if (!relay->tx.is_raft_enabled || relay->tx.is_raft_push_sent)
> +        return;
> +    struct relay_raft_msg *msg =
> +        &relay->tx.raft_msgs[relay->tx.raft_ready_msg];
> +    msg->do_restart_recovery = do_restart_recovery;
> +    cpipe_push(&relay->relay_pipe, &msg->base);
> +    relay->tx.raft_ready_msg = (relay->tx.raft_ready_msg + 1) % 2;
> +    relay->tx.is_raft_push_sent = true;
> +    relay->tx.is_raft_push_pending = false;
> +}
> +
>  /** TX thread part of the Raft flag setting, first hop. */
>  static void
>  tx_set_is_raft_enabled(struct cmsg *base)
>  {
>      struct relay_is_raft_enabled_msg *msg =
>          (struct relay_is_raft_enabled_msg *)base;
> -    msg->relay->tx.is_raft_enabled = msg->value;
> +    struct relay *relay  = msg->relay;
> +    relay->tx.is_raft_enabled = msg->value;
> +    /*
> +     * Send saved raft message as soon as relay becomes operational.
> +     * Do not restart recovery upon the message arrival. Recovery is
> +     * positioned at replica_clock initially, i.e. already "restarted" and
> +     * restarting it once again would position it at the oldest xlog
> +     * possible, because relay reader hasn't received replica vclock yet.
> +     */
> +    if (relay->tx.is_raft_push_pending) {
> +        relay_push_raft_msg(msg->relay, false);

2. I don't understand. Why wasn't there such a problem before? Recovery
must be restarted when the node becomes a leader. If you do not restart
it, the data would be ignored by the replicas. How do you know it is
positioned right now at replica_clock? You are in tx thread, you can't
tell. What am I missing?