[Tarantool-patches] [RFC] Quorum-based synchronous replication

Konstantin Osipov kostja.osipov at gmail.com
Thu May 14 02:54:48 MSK 2020


* Vladislav Shpilevoy <v.shpilevoy at tarantool.org> [20/05/14 00:42]:
> > Sure, yes: if it restarted, then the lost connection can't go unnoticed
> > by anyone, be it the coordinator or the cluster.
> 
> Here comes another problem. Disconnect and restart have nothing to do with
> each other. The coordinator can lose its connection without the peer leader
> restarting, just because it is a network and anything can happen. Moreover,
> while the coordinator does not have a connection, the leader can restart
> multiple times.

Yes.

> We can't tell the coordinator to rely on connectivity as a restart signal.

Well, we could demand that the leader always demote itself after a
restart. But the spec should be explicit about it and explain how
the election happens in this case, because the restarted leader may
still have the longest WAL (but with some junk in it, thanks to lost
confirms). So after a restart the leader may need to reconcile its
WAL with the majority, fetching the missing records back.
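
To make the reconciliation step concrete, here is a minimal sketch in
Python. Everything in it is an assumption made for illustration: the
Record type, its term field and the reconcile() helper are hypothetical
and are not part of the RFC or of Tarantool. It also assumes the log
agreed on by the majority is the source of truth, so whatever the
restarted leader holds past the first point of divergence is treated
as junk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    lsn: int    # log sequence number
    term: int   # hypothetical term of the leader that wrote the record
    data: str


def reconcile(local_wal, majority_wal):
    """Reconcile a restarted leader's WAL with the majority's log.

    Returns (new_wal, junk, fetched): the reconciled log, the local
    records that were dropped, and the records fetched back.
    """
    # Length of the common prefix shared by both logs.
    common = 0
    for mine, theirs in zip(local_wal, majority_wal):
        if mine != theirs:
            break
        common += 1
    junk = local_wal[common:]         # never agreed on by the majority
    fetched = majority_wal[common:]   # missing on the restarted leader
    return local_wal[:common] + fetched, junk, fetched
```

Note that in this sketch a longer local WAL does not win by itself:
only the prefix shared with the majority survives.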

Once again, Raft is very explicit about this. By default it
requires that the leader's commit log is durable, i.e.
wal_mode=fsync. This would kill performance. Implementations exist
which run in wal_mode=write (Cassandra is one of them), but they know
how to repair the log at the leader before proceeding with the next
transaction. The reason I brought this up is that it's extremely
tricky, and confusing as hell if the election is external. I agree
there should be an API. Or, better yet, abandon the idea of
external election: have no election at all for now, assume
the leader never changes, and only provide durability in a
multi-master config, with no consistency guarantees beyond
eventual consistency.

> > How a restart can be unnoticed, if it causes disconnection?
> 
> Disconnection has nothing to do with restart. The coordinator itself may
> restart. Or it may lose its connection to the leader temporarily. Or the
> leader may lose it without any restarts.

And yes.
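
To illustrate why connectivity cannot serve as the restart signal,
here is a small sketch in Python. It assumes a hypothetical boot_id:
a counter the leader would persist and increment on every start, and
report on each new connection. Neither boot_id nor the Coordinator
class exists in the RFC or in Tarantool; they are only there to show
that a disconnect can hide zero, one or many restarts.

```python
class Coordinator:
    """Tracks one leader; boot_id is a hypothetical restart counter."""

    def __init__(self):
        self.connected = False
        self.last_boot_id = None

    def on_disconnect(self):
        # Losing the connection says nothing about a leader restart.
        self.connected = False

    def on_connect(self, leader_boot_id):
        """Return how many leader restarts happened since last contact."""
        missed = 0
        if self.last_boot_id is not None:
            missed = leader_boot_id - self.last_boot_id
        self.connected = True
        self.last_boot_id = leader_boot_id
        return missed
```

With such a counter, a reconnect that reports the same boot_id was a
pure network hiccup, while a jump from 1 to 4 means three restarts
went by unseen.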

-- 
Konstantin Osipov, Moscow, Russia

