[Tarantool-patches] [RFC] Quorum-based synchronous replication
Sergey Ostanevich
sergos at tarantool.org
Fri Apr 17 16:45:18 MSK 2020
Hi, thanks for review!
On 17 Apr 13:10, Konstantin Osipov wrote:
> * sergos at tarantool.org <sergos at tarantool.org> [20/04/15 17:51]:
> > ### Quorum commit
>
> This part looks correct. It only describes two paths out of many
> though:
> - leader is able to collect the majority
> - leader is not able to collect the majority
>
> What happens when a leader receives a message for a round which is
> complete?
It just ignores it; for the reason, see the next comment.
> How does a replica which missed a round catch up?
> What happens if replica fails to apply txn 1 (e.g. because of a
> duplicate key), but confirms txn 2?
This should never happen, since each replica applies txns in strict
order, which means a failure of txn 1 will happen before the
confirmation of txn 2. As soon as a replica fails to apply a txn it
should report an error, disconnect and roll back all txns in its
pipeline. After that the replica will be in a state consistent with the
Leader's lsn before txn 1.
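
To illustrate the strict-order rule, here is a minimal Python sketch. It
is not Tarantool code: the Replica class, its toy key/value storage and
the ApplyError exception are hypothetical names made up for this
example, and "pipeline" is assumed to mean the txns that are applied but
not yet confirmed by the Leader.

```python
class ApplyError(Exception):
    """Raised when a txn cannot be applied (e.g. a duplicate key)."""


class Replica:
    def __init__(self):
        self.confirmed_lsn = 0  # last LSN confirmed by the Leader
        self.pipeline = []      # applied, not yet confirmed: (lsn, key)
        self.data = {}          # toy storage: key -> LSN that inserted it
        self.connected = True

    def apply(self, lsn, key):
        """Apply one txn in strict LSN order and return the LSN to ACK."""
        if not self.connected:
            raise ApplyError("replica is disconnected")
        last = self.pipeline[-1][0] if self.pipeline else self.confirmed_lsn
        if lsn != last + 1 or key in self.data:
            # Out-of-order txn or duplicate key: nothing after this point
            # can be ACKed, so txn 2 is never confirmed after txn 1 failed.
            self.fail()
            raise ApplyError(f"cannot apply txn with lsn {lsn}")
        self.data[key] = lsn
        self.pipeline.append((lsn, key))
        return lsn

    def confirm(self, lsn):
        """The Leader confirmed every txn up to and including lsn."""
        self.pipeline = [(l, k) for l, k in self.pipeline if l > lsn]
        self.confirmed_lsn = max(self.confirmed_lsn, lsn)

    def fail(self):
        """Roll back the whole pipeline and disconnect."""
        for _, key in reversed(self.pipeline):
            del self.data[key]
        self.pipeline.clear()
        self.connected = False
```

In this sketch a failed apply leaves only the confirmed txns in
storage, so the replica ends up at the last confirmed lsn.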
>
> What happens if txn1 gets no majority at the leader, but txn 2
> gets a majority? How are the followers rolled back?
This situation means that some of the ACKs from replicas didn't arrive,
which doesn't mean they failed to apply txn 1. However, success of
txn 2 means that txn 1 was also applied; hence, receiving an ACK for
txn N from a replica means an ACK for each txn M: M < N.
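
The cumulative ACK rule can be sketched in Python as follows. This is
only an illustration under assumptions, not the proposed
implementation: the Leader class, the on_ack method and the quorum
parameter are hypothetical, and the quorum here counts replica ACKs
only (whether the Leader itself counts is left open).

```python
class Leader:
    def __init__(self, replica_ids, quorum):
        # Highest LSN ACKed by each replica; an ACK for N covers all M < N.
        self.acked = {rid: 0 for rid in replica_ids}
        self.quorum = quorum    # number of replica ACKs needed (assumption)
        self.committed_lsn = 0  # highest LSN that has reached the quorum

    def on_ack(self, replica_id, lsn):
        """Record an ACK and return the highest committed LSN."""
        # A late ACK for an already completed round changes nothing below,
        # which is why the Leader can simply ignore it.
        self.acked[replica_id] = max(self.acked[replica_id], lsn)
        # The quorum-th highest ACKed LSN is covered by at least
        # `quorum` replicas, thanks to the cumulative meaning of an ACK.
        ranked = sorted(self.acked.values(), reverse=True)
        candidate = ranked[self.quorum - 1]
        if candidate > self.committed_lsn:
            self.committed_lsn = candidate
        return self.committed_lsn
```

With three replicas and a quorum of two, ACKs for txn 2 from two
replicas commit both txn 1 and txn 2, even if no ACK for txn 1 ever
arrived.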
> > In case of a leader failure a replica with the biggest LSN with former
> > leader's ID is elected as a new leader.
>
> As long as multi-master is not banned, there may be multiple
> leaders. Does this proposal suggest multi-master is banned? Then
> it should describe the implementation of this, and in absence of
> transparent query forwarding it will break all clients.
>
It was mentioned at the top of RFC:
> What this RFC is not:
>
> - high availability (HA) solution with automated failover, roles
> assignments and so on
> - master-master configuration support
Which I tend to describe as 'do not recommend'. Similar to what we have
in documentation about the cascading replication configuration.
Although, I heard from some users that they successfully use such a config.
Regards,
Sergos