Date: Thu, 14 May 2020 02:54:48 +0300
From: Konstantin Osipov
To: Vladislav Shpilevoy
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [RFC] Quorum-based synchronous replication
Message-ID: <20200513235448.GC5698@atlas>
In-Reply-To: <78713377-806f-8cf6-efe0-5019f3d3e428@tarantool.org>
References: <20200403210836.GB18283@tarantool.org> <20200430145033.GF112@tarantool.org> <20200506085249.GA2842@atlas> <20200506163901.GH112@tarantool.org> <20200506184445.GB24913@atlas> <20200512155508.GJ112@tarantool.org> <78713377-806f-8cf6-efe0-5019f3d3e428@tarantool.org>

* Vladislav Shpilevoy [20/05/14 00:42]:
> > Sure yes, if it restarted - then connection lost can't be unnoticed by
> > anyone, be it coordinator or cluster.
>
> Here comes another problem. Disconnect and restart have nothing to do with
> each other. The coordinator can lose connection without the peer leader
> restart. Just because it is network. Anything can happen. Moreover, while
> the coordinator does not have a connection, the leader can restart multiple
> times.

yes.

> We can't tell the coordinator to rely on connectivity as a restart signal.

Well, we could demand that the leader always demotes itself after restart.
But the spec should be explicit about it and explain how the election
happens in this case, because the leader may still have the longest WAL
(but with some junk in it, thanks to lost confirms), so after restart
it may need to reconcile its WAL with the majority, fetching missing
records back.

Once again, Raft is very explicit about this. By default it requires
that the leader's commit log is durable, i.e. wal_mode=fsync. This would
kill performance. Implementations exist which run in wal_mode=write
(Cassandra is one of them), but they know how to repair the log at the
leader before proceeding with the next transaction.

The reason I brought this up is that it's extremely tricky, and
confusing as hell if the election is external. I agree there should be
an API, or better yet, abandon the idea of external election: just have
no election at all for now, assume the leader never changes, and only
provide durability in a multi-master config, with no consistency
guarantees (other than eventual consistency).

> > How a restart can be unnoticed, if it causes disconnection?
>
> Disconnection has nothing to do with restart. The coordinator itself may
> restart. Or it may lose connection to the leader temporarily. Or the
> leader may lose it without any restarts.

and yes.

-- 
Konstantin Osipov, Moscow, Russia
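
To make the log repair step discussed above more concrete, here is a minimal sketch of what "reconcile the WAL with the majority" could look like. It is not Tarantool code and not a complete or safe protocol. It assumes that the log the majority agrees on has already been determined somehow (the hard part, not shown here), and that a record is a hypothetical (lsn, payload) pair. The function name `reconcile` and the record shape are made up for illustration.

```python
from typing import List, Tuple

# Hypothetical record shape for illustration: (lsn, payload).
Record = Tuple[int, str]


def reconcile(
    leader_log: List[Record], majority_log: List[Record]
) -> Tuple[List[Record], List[Record], List[Record]]:
    """Repair a restarted leader's log against the majority's log.

    Assumes majority_log is the log the majority agrees on; how that
    log is established is outside the scope of this sketch.

    Returns (repaired_log, dropped, fetched).
    """
    # Length of the longest common prefix of the two logs.
    common = 0
    for mine, theirs in zip(leader_log, majority_log):
        if mine != theirs:
            break
        common += 1

    # Tail the leader wrote but the majority does not have: the "junk"
    # left behind by lost confirms.
    dropped = leader_log[common:]
    # Records the majority has and the leader must fetch back.
    fetched = majority_log[common:]
    repaired = leader_log[:common] + fetched
    return repaired, dropped, fetched


if __name__ == "__main__":
    leader = [(1, "a"), (2, "b"), (3, "junk")]
    majority = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
    repaired, dropped, fetched = reconcile(leader, majority)
    print("repaired:", repaired)
    print("dropped: ", dropped)
    print("fetched: ", fetched)
```

The sketch only covers the truncate-and-fetch step. Deciding which log the majority actually agrees on when the leader itself may have lost records is exactly the part the spec would need to spell out.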