[Tarantool-patches] [RFC] Quorum-based synchronous replication

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Thu May 14 00:42:37 MSK 2020


Thanks for the discussion!

On 06/05/2020 20:55, Konstantin Osipov wrote:
> * Sergey Ostanevich <sergos at tarantool.org> [20/04/30 17:51]:
> 
> A few more issues:
> 
> - the spec assumes there is a full mesh. In any other
>   topology electing a leader based on the longest wal can easily
>   deadlock. Yet it provides no protection against non-full-mesh
>   setups. Currently the server can't even detect that this is not
>   a full-mesh setup, so can't check if the precondition for this
>   to work correctly is met.

Yes, this is a very unstable construction. But so far we have failed to
come up with a solution that would protect against an accidental
non-full-mesh topology. For example, how would it work when I add a new
node? If non-full-mesh is forbidden, the new node can never be added,
because it can't be added on all nodes simultaneously.

> - the spec assumes that quorum is identical to the
>   number of replicas, and the number of replicas is stable across
>   cluster life time. Can I have quorum=2 while the number of
>   replicas is 4? Am I allowed to increase the number of replicas
>   online? What happens when a replica is added,
>   how exactly and starting from which transaction is the leader
>   required to collect a bigger quorum?

Quorum <= number of replicas. It is a parameter, just like
replication_connect_quorum.
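
To make the constraint concrete, here is a minimal sketch in Python. It is
not Tarantool code: the function name is hypothetical, and it assumes that
"replicas" counts every node able to acknowledge a transaction.

```python
def can_gather_quorum(quorum: int, replicas: int) -> bool:
    """Return True if `replicas` nodes are enough to collect `quorum` acks.

    Hypothetical helper: the quorum is an independent parameter, and the
    only constraint modelled here is quorum <= number of replicas.
    """
    if quorum < 1:
        raise ValueError("quorum must be at least 1")
    return quorum <= replicas
```

Under this model quorum=2 with 4 replicas is fine (can_gather_quorum(2, 4)
is True), while a quorum of 3 with only 2 nodes left can never be met.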

I think you are allowed to add new replicas. When a replica is added,
it goes through the normal join process.

> - the same goes for removing a replica. How is the quorum reduced?

The node is just removed, I guess. If the total number of nodes becomes
less than the quorum, obviously no transactions will be served.

However, what should be done with the existing pending transactions that
have already counted the removed replica in their quorums? Should their
ack counts be decremented?
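
The two possible answers can be sketched in Python as below. This is only
an illustration of the open question, not a proposal for the RFC and not
Tarantool code: PendingTxn, remove_replica and the `decrement` flag are all
hypothetical names, and acks are modelled as a plain set of replica ids.

```python
from dataclasses import dataclass, field


@dataclass
class PendingTxn:
    """A synchronous transaction still waiting for its quorum (hypothetical)."""
    lsn: int
    acked_by: set = field(default_factory=set)

    def ack(self, replica_id: int) -> None:
        # Record that this replica confirmed the transaction.
        self.acked_by.add(replica_id)

    def has_quorum(self, quorum: int) -> bool:
        return len(self.acked_by) >= quorum


def remove_replica(pending: list, replica_id: int, decrement: bool) -> None:
    """Remove a replica; optionally forget the acks it already gave."""
    if decrement:
        for txn in pending:
            txn.acked_by.discard(replica_id)
    # With decrement=False the acks of the removed replica keep counting.


def scenario(decrement: bool) -> bool:
    # quorum=3: replicas 1 and 2 ack, replica 2 is removed, replica 3 acks.
    txn = PendingTxn(lsn=1)
    txn.ack(1)
    txn.ack(2)
    remove_replica([txn], replica_id=2, decrement=decrement)
    txn.ack(3)
    return txn.has_quorum(3)
```

In this scenario, if the acks are kept, the transaction reaches its quorum
partly thanks to a node that no longer exists. If they are decremented, it
stays pending until one more live replica confirms it.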

Everything I am saying here is a guess, which should be clarified in the
RFC in an ideal world, of course.

Tbh, we discussed sync replication for many hours by voice, and it is a
surprise that all of it fit into such a small update of the RFC. Except
that it didn't really fit, since we obviously still haven't clarified many
things, especially the exact look of the API.

