[Tarantool-patches] [RFC] Quorum-based synchronous replication

Sergey Ostanevich sergos at tarantool.org
Fri Apr 17 16:45:18 MSK 2020


Hi, thanks for the review!

On 17 Apr 13:10, Konstantin Osipov wrote:
> * sergos at tarantool.org <sergos at tarantool.org> [20/04/15 17:51]:
> > ### Quorum commit
> 
> This part looks correct. It only describes two paths out of many
> though:
> - leader is able to collect the majority
> - leader is not able to collect the majority
> 
> What happens when a leader receives a message for a round which is
> complete?

It just ignores it; see the next answer for the reason.

> How does a replica which missed a round catch up?
> What happens if replica fails to apply txn 1 (e.g. because of a
> duplicate key), but confirms txn 2?

This should never happen: each replica applies txns in strict order,
so a failure of txn 1 will happen before the confirmation of txn 2.
As soon as a replica fails to apply a txn, it should report an error,
disconnect and roll back all txns in its pipeline. After that the
replica will be in a state consistent with the Leader's LSN before
txn 1.
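
To illustrate the ordering argument, here is a minimal Python sketch of a
replica. It is only a toy model under my own assumptions: the names
`Replica`, `confirmed_lsn`, `pipeline` and `apply` are hypothetical and
are not taken from the RFC or from the Tarantool code.

```python
class Replica:
    """Toy replica: applies txns in strict LSN order (hypothetical model)."""

    def __init__(self, confirmed_lsn=0):
        self.confirmed_lsn = confirmed_lsn  # last LSN confirmed by the leader
        self.pipeline = []                  # applied, but not yet confirmed LSNs
        self.connected = True

    def apply(self, lsn, ok=True):
        """Apply txn `lsn`; return the LSN to ACK, or None if nothing is ACKed.

        `ok=False` simulates an apply failure, e.g. a duplicate key.
        """
        if not self.connected:
            return None  # a disconnected replica cannot confirm anything
        expected = self.confirmed_lsn + len(self.pipeline) + 1
        if lsn != expected:
            raise ValueError(f"out-of-order txn: got {lsn}, expected {expected}")
        if not ok:
            # Report the error, disconnect, roll back the whole pipeline.
            self.pipeline.clear()
            self.connected = False
            return None
        self.pipeline.append(lsn)
        return lsn


replica = Replica(confirmed_lsn=10)
first = replica.apply(11, ok=False)  # txn 1 fails
second = replica.apply(12)           # txn 2 can no longer be confirmed
```

In this model the replica never ACKs txn 2 after failing txn 1, and its
state falls back to the last confirmed LSN.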
> 
> What happens if txn1 gets no majority at the leader, but txn 2
> gets a majority? How are the followers rolled back?

This situation means that some of the ACKs from replicas didn't arrive,
which doesn't mean they failed to apply txn 1. In fact, the success of
txn 2 means that txn 1 was also applied: receiving an ACK for txn N from
a replica implies an ACK for each txn M with M < N.
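
A minimal Python sketch of such cumulative ACK accounting on the leader
side follows. The names (`Leader`, `on_ack`, `acked`, `quorum`) are
hypothetical, and I assume here that the quorum counts replica ACKs only;
the RFC may count the leader itself differently.

```python
class Leader:
    """Toy leader: an ACK for LSN N covers every LSN M < N (hypothetical model)."""

    def __init__(self, replica_ids, quorum):
        self.acked = {r: 0 for r in replica_ids}  # highest LSN ACKed per replica
        self.quorum = quorum                      # number of replica ACKs required
        self.confirmed_lsn = 0                    # highest LSN that reached quorum

    def on_ack(self, replica_id, lsn):
        """Process one ACK and return the highest confirmed LSN."""
        if lsn <= self.confirmed_lsn:
            return self.confirmed_lsn  # the round is already complete: ignore
        self.acked[replica_id] = max(self.acked[replica_id], lsn)
        # The quorum-th largest ACKed LSN is covered by at least `quorum` replicas.
        ranked = sorted(self.acked.values(), reverse=True)
        self.confirmed_lsn = max(self.confirmed_lsn, ranked[self.quorum - 1])
        return self.confirmed_lsn


leader = Leader(["a", "b", "c"], quorum=2)
after_a = leader.on_ack("a", 2)  # ACK for txn 1 from "a" was lost, txn 2 arrived
after_b = leader.on_ack("b", 2)  # same for "b": quorum reached for txn 1 and 2
late = leader.on_ack("c", 1)     # late ACK for a completed round is ignored
```

Here txn 1 is confirmed together with txn 2 although no explicit ACK for
txn 1 ever arrived, and the late message for the completed round changes
nothing.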

> > In case of a leader failure a replica with the biggest LSN with former
> > leader's ID is elected as a new leader.
> 
> As long as multi-master is not banned, there may be multiple
> leaders. Does this proposal suggest multi-master is banned? Then
> it should describe the implementation of this, and in absence of
> transparent query forwarding it will break all clients.
> 

It was mentioned at the top of the RFC:

> What this RFC is not:
>
>    - high availability (HA) solution with automated failover, roles
>      assignments and so on
>    - master-master configuration support

I tend to describe this as 'do not recommend', similar to what we have
in the documentation about the cascading replication configuration.
That said, I have heard from some users that they successfully use such
a config.

Regards,
Sergos


