From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp49.i.mail.ru (smtp49.i.mail.ru [94.100.177.109])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dev.tarantool.org (Postfix) with ESMTPS id 367DC4696C3
	for ; Thu, 23 Apr 2020 18:11:35 +0300 (MSK)
Date: Thu, 23 Apr 2020 18:11:34 +0300
From: Sergey Ostanevich
Message-ID: <20200423151134.GD112@tarantool.org>
References: <20200403210836.GB18283@tarantool.org>
	<20200421104918.GA112@tarantool.org>
	<20200423065809.GA4528@atlas>
	<20200423091436.GA14576@atlas>
	<20200423112702.GC112@tarantool.org>
	<20200423114325.GA19129@atlas>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <20200423114325.GA19129@atlas>
Subject: Re: [Tarantool-patches] [RFC] Quorum-based synchronous replication
List-Id: Tarantool development patches
To: Konstantin Osipov, Vladislav Shpilevoy,
	tarantool-patches@dev.tarantool.org

On 23 Apr 14:43, Konstantin Osipov wrote:
> > The quality one buys for this price: consistency of data in multiple
> > instances distributed across different locations.
>
> The spec should demonstrate the consistency is guaranteed: right
> now it can easily be violated during a leader change, and this is
> left out of scope of the spec.
>
> My take is that any implementation which is not close enough to a
> TLA+ proven spec is not trustworthy, so I would not claim myself
> or trust any one elses claims that it is consistent. At best this
> RFC could achieve durability, by ensuring that no transaction is
> committed unless it is delivered to a majority of replicas.

Which is exactly what the RFC goals state.

> Consistency requires implementing RAFT spec in full and showing
> that leader changes preserve the write ahead log linearizability.
>

So the leader should stop accepting transactions and wait until every
txn in the queue is resolved: either confirmed or, after a timeout as a
last resort, rolled back.
Since leader election is not automated, the cluster will be in a
consistent state after this. A new leader can then be appointed with
all circumstances taken into account: node availability, ping from the
proxy, LSN, etc.
Again, this RFC is not about any HA features, such as auto-failover.

> > > The other issue is that if your replicas are alive but
> > > slow/lagging behind, you can't let too many undo records to
> > > pile up unacknowledged in tx thread.
> > > The in-memory relay solves this nicely too, because it kicks out
> > > replicas from memory to file mode if they are unable to keep up
> > > with the speed of change.
> > >
> > That is the same problem - resources of leader, so natural limit for
> > throughput. I bet Tarantool faces similar limitations even now,
> > although different ones.
> >
> > The in-memory relay supposed to keep the same interface, so we expect to
> > hop easily to this new shiny express as soon as it appears. This will be
> > an optimization and we're trying to implement something and then speed
> > it up.
>
> It is pretty clear that the implementation will be different.
>

Which contradicts preserving the interface, right?

> --
> Konstantin Osipov, Moscow, Russia
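
For illustration only, the manual step-down described in the reply above
(stop accepting transactions, wait for queued txns to be confirmed, roll
back the rest after a timeout) could be sketched roughly as below. This is
not Tarantool code: `Txn`, `Leader`, `step_down` and the `poll` hook are
hypothetical names made up for the sketch, and the quorum is assumed to be
a simple majority with the leader's own write counted as one ack.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Txn:
    lsn: int
    acks: int = 1            # assumption: the leader's own WAL write is one ack
    state: str = "pending"   # pending -> confirmed | rolled_back


@dataclass
class Leader:
    replica_count: int       # total instances, leader included
    accepting: bool = True
    queue: list = field(default_factory=list)

    @property
    def quorum(self) -> int:
        # Simple majority; an assumption of this sketch.
        return self.replica_count // 2 + 1

    def submit(self, lsn: int) -> Txn:
        if not self.accepting:
            raise RuntimeError("leader is stepping down")
        txn = Txn(lsn)
        self.queue.append(txn)
        return txn

    def ack(self, lsn: int) -> None:
        # A replica reports that it has written every txn up to `lsn`.
        for txn in self.queue:
            if txn.state == "pending" and txn.lsn <= lsn:
                txn.acks += 1
                if txn.acks >= self.quorum:
                    txn.state = "confirmed"

    def step_down(self, timeout: float, poll=lambda: None) -> list:
        """Drain the queue; return the LSNs that had to be rolled back."""
        self.accepting = False
        deadline = time.monotonic() + timeout
        while any(t.state == "pending" for t in self.queue):
            if time.monotonic() >= deadline:
                break
            poll()               # hypothetical hook: process incoming acks
            time.sleep(0.001)    # avoid spinning while waiting
        rolled_back = []
        # Last resort: roll back what is still pending, newest first.
        for txn in reversed(self.queue):
            if txn.state == "pending":
                txn.state = "rolled_back"
                rolled_back.append(txn.lsn)
        return rolled_back
```

With three instances and only the first of three txns acknowledged by a
replica, `step_down` in this sketch confirms that txn and rolls back the
other two once the timeout expires; choosing the next leader stays a
manual decision, as in the RFC.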