[Tarantool-patches] [RFC] Quorum-based synchronous replication
Konstantin Osipov
kostja.osipov at gmail.com
Fri May 8 02:01:12 MSK 2020
> ### Synchronous replication enabling.
>
> Synchronous operation can be required for a set of spaces in the data
> scheme. That means only transactions that contain data modification for
> these spaces should require quorum. Such transactions are called
> synchronous. As soon as the last operation of a synchronous transaction
> appears in the leader's WAL, it will cause all following transactions - no
> matter if they are synchronous or not - to wait for the quorum. In case
> quorum is not achieved
> the 'rollback' operation will cause rollback of all transactions after
> the synchronous one. It will ensure the consistent state of the data both
> on leader and replicas. In case user doesn't require synchronous operation
> for any space then no changes to the WAL generation and replication will
> appear.
1) It's unclear what happens here if an async tx follows a sync tx.
Does it wait for the sync tx? If it does, this reduces availability
for async txs, so it's hardly acceptable. Besides, with
group=local spaces, one can quickly run out of memory for undo.
So the async tx should be allowed to proceed and commit.
But then mixing sync and async tables in a single transaction
shouldn't be allowed.
Imagine t1 is sync and t2 is async. tx1 changes t1 and t2, tx2
changes t2. tx1 is not confirmed and must be rolled back, but the
rollback cannot revert the changes of tx2, which has already committed.
The spec should clarify that.
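To make the t1/t2 scenario concrete, here is a toy sketch. It is not
Tarantool code: the dict-based spaces, the undo log format and the
apply/rollback helpers are all assumptions made for illustration. It only
shows that a naive undo of tx1 also wipes out the write tx2 has already
committed to t2.

```python
# Toy model, not Tarantool code: spaces are dicts, each tx keeps an undo log.
spaces = {"t1": {}, "t2": {}}  # t1 plays the sync space, t2 the async one


def apply(tx_writes):
    """Apply (space, key, value) writes; return undo records (space, key, old)."""
    undo = []
    for space, key, value in tx_writes:
        undo.append((space, key, spaces[space].get(key)))  # None means "absent"
        spaces[space][key] = value
    return undo


def rollback(undo):
    """Restore the saved old values in reverse order."""
    for space, key, old in reversed(undo):
        if old is None:
            spaces[space].pop(key, None)
        else:
            spaces[space][key] = old


# tx1 changes sync t1 and async t2, then waits for quorum.
undo_tx1 = apply([("t1", "k", "tx1"), ("t2", "k", "tx1")])
# tx2 changes only async t2 and is allowed to commit without waiting.
apply([("t2", "k", "tx2")])
# Quorum is not reached, so tx1 is rolled back.
rollback(undo_tx1)

state_after = {name: dict(rows) for name, rows in spaces.items()}
print(state_after)  # t2 no longer holds the value tx2 committed
```

Either the rollback destroys the committed change of tx2, as here, or
it has to leave t2 in a state that still depends on tx1.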
2) The first candidates for "sync" spaces are system spaces, especially
_schema (to fix box.once()) and _cluster (to fix parallel join
of multiple replicas).
I can't imagine it's possible to make system spaces synchronous
with an external coordinator - the coordinator may not be
available during box.cfg{}.
3) One can quickly run out of memory for undo. Any sync
transaction should be capped with a timeout to avoid OOMs. I
don't know how many times I should repeat it. The only good
solution for load control is an in-memory WAL, which would allow
rolling back all transactions as soon as a network partition is
detected.
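A minimal sketch of what capping sync transactions with a timeout could
look like. The PendingQueue class, its method names and the byte counts
are hypothetical and are not part of the RFC or of Tarantool. Time is
passed in explicitly to keep the example deterministic. Following the
quoted RFC text, a timed-out transaction is rolled back together with
everything queued after it, which releases the undo memory.

```python
import collections


class PendingQueue:
    """Hypothetical queue of txs waiting for quorum; each one holds undo memory."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.pending = collections.deque()  # entries: (tx_id, start_time, undo_bytes)

    def add(self, tx_id, now, undo_bytes):
        self.pending.append((tx_id, now, undo_bytes))

    def undo_memory(self):
        """Total undo memory held by unconfirmed transactions."""
        return sum(undo_bytes for _, _, undo_bytes in self.pending)

    def expire(self, now):
        """If the oldest tx timed out, roll back it and all txs after it."""
        if self.pending and now - self.pending[0][1] >= self.timeout:
            rolled_back = [tx_id for tx_id, _, _ in self.pending]
            self.pending.clear()
            return rolled_back
        return []


queue = PendingQueue(timeout=5)
queue.add(tx_id=1, now=0, undo_bytes=100)
queue.add(tx_id=2, now=3, undo_bytes=200)

early = queue.expire(now=4)        # nothing has timed out yet
held = queue.undo_memory()         # both txs still hold undo memory
late = queue.expire(now=5)         # tx 1 hits the timeout, tx 2 goes with it
held_after = queue.undo_memory()   # undo memory is released
```

Without the timeout, the queue and the undo memory it holds grow for as
long as the quorum is unreachable.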
--
Konstantin Osipov, Moscow, Russia