[Tarantool-patches] [RFC] Quorum-based synchronous replication

Konstantin Osipov kostja.osipov at gmail.com
Thu Apr 23 12:14:36 MSK 2020


* Konstantin Osipov <kostja.osipov at gmail.com> [20/04/23 09:58]:
> > > To my understanding - it's up to user. I was considering a cluster that
> > > has no WAL at all - relying on synchro replication and sufficient number
> > > of replicas. Everyone who I asked about it told me I'm nuts. To my great
> > > surprise Alexander Lyapunov brought exactly the same idea to discuss. 
> > 
> > I didn't see an RFC on that, and this can become easily possible, when
> > in-memory relay is implemented. If it is implemented in a clean way. We
> > just can turn off the disk backoff, and it will work from memory-only.
> 
> Sync replication must work from in-memory relay only. It works as
> a natural failure detector: a replica which is slow or unavailable
> is first removed from the subscribers of in-memory relay, and only
> then (possibly much, much later) is marked as down.
> 
> By looking at the in-memory relay you have a clear idea which peers
> are available, and can abort a transaction right away if the
> cluster is in a degraded state. You never wait for impossible events.
> 
> If you do have to wait, and say your wait timeout is 1 second, you
> quickly run out of fibers in the fiber pool for any other work,
> because all of them will be waiting for the sync transactions they
> picked up from iproto to finish. The system will lose its
> throttling capability.
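
To make the "never wait for impossible events" point concrete, here
is a minimal sketch in Python. It is not Tarantool code: the names
InMemoryRelay, admit_sync_txn and QuorumUnavailable are made up for
illustration, and it assumes the master's own copy counts towards
the quorum.

```python
from dataclasses import dataclass, field


class QuorumUnavailable(Exception):
    """Raised when the quorum cannot be reached with the current peers."""


@dataclass
class InMemoryRelay:
    # Hypothetical model: ids of replicas currently fed from memory.
    subscribers: set = field(default_factory=set)


def admit_sync_txn(relay: InMemoryRelay, quorum: int) -> int:
    """Admit a sync transaction only if a quorum is reachable right now."""
    # Assumption: the master's own copy counts as one acknowledgement.
    reachable = 1 + len(relay.subscribers)
    if reachable < quorum:
        # Fail fast: no fiber is parked waiting for acks that cannot arrive.
        raise QuorumUnavailable(
            f"need {quorum} copies, only {reachable} reachable")
    return reachable
```

The check costs a set-size lookup, so a degraded cluster rejects
new sync transactions immediately instead of tying up a fiber for
the whole timeout.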

The other issue is that if your replicas are alive but slow or
lagging behind, you can't let too many undo records pile up
unacknowledged in the tx thread.
The in-memory relay solves this nicely too, because it kicks
replicas out of memory mode into file mode if they are unable to
keep up with the rate of change.
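
A rough sketch of that kick-out rule, again in Python and again
purely illustrative: RelaySketch, max_lag and the LSN bookkeeping
are my assumptions here, not the actual relay design. The point is
only that demoting a laggard bounds how many rows can be pending
for the replicas that still count.

```python
class RelaySketch:
    """Hypothetical in-memory relay that demotes replicas lagging too far."""

    def __init__(self, max_lag: int):
        self.max_lag = max_lag   # max unacknowledged rows per memory-mode replica
        self.lsn = 0             # LSN of the last row appended by the master
        self.acked = {}          # memory-mode replica id -> last acknowledged LSN
        self.file_mode = set()   # replicas demoted to reading from files

    def subscribe(self, replica: str) -> None:
        # A new subscriber is assumed to be caught up to the current LSN.
        self.acked[replica] = self.lsn

    def append(self) -> int:
        """Append one row and demote every replica that fell too far behind."""
        self.lsn += 1
        for replica, acked in list(self.acked.items()):
            if self.lsn - acked > self.max_lag:
                del self.acked[replica]
                self.file_mode.add(replica)
        return self.lsn

    def ack(self, replica: str, lsn: int) -> None:
        # Acks from demoted replicas are ignored in this sketch.
        if replica in self.acked:
            self.acked[replica] = max(self.acked[replica], lsn)

    def pending(self) -> int:
        """Rows not yet acknowledged by the slowest memory-mode replica."""
        if not self.acked:
            return 0
        return self.lsn - min(self.acked.values())


relay = RelaySketch(max_lag=3)
relay.subscribe("fast")
relay.subscribe("slow")
for _ in range(5):
    lsn = relay.append()
    relay.ack("fast", lsn)   # "slow" never acknowledges anything
```

After each append() in this model, pending() cannot exceed max_lag,
which is the bound on unacknowledged undo records that the tx
thread would need.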

-- 
Konstantin Osipov, Moscow, Russia

