Date: Tue, 12 May 2020 20:47:41 +0300
From: Konstantin Osipov
Message-ID: <20200512174741.GC2049@atlas>
In-Reply-To: <20200512164048.GM112@tarantool.org>
References: <20200403210836.GB18283@tarantool.org>
 <20200430145033.GF112@tarantool.org>
 <20200507230112.GB14285@atlas>
 <20200512164048.GM112@tarantool.org>
Subject: Re: [Tarantool-patches] [RFC] Quorum-based synchronous replication
To: Sergey Ostanevich
Cc: tarantool-patches@dev.tarantool.org, Vladislav Shpilevoy

* Sergey Ostanevich [20/05/12 19:43]:
> > 1) It's unclear what happens here if an async tx follows a sync tx.
> > Does it wait for the sync tx? This reduces availability for
>
> Definitely yes, unless we keep the 'dirty read' as it is at the moment
> in memtx. This is the essence of the design, and it is temporary until
> MVCC similar to the vinyl machinery appears. I intentionally didn't
> include this big task in this RFC.
>
> It will provide similar capabilities, although it will keep only
> dependent transactions in the undo log. Also, it looks like it will fit
> well into the machinery of this RFC.

= reduced availability for everyone who has at least one sync space.

If different spaces have different quorum sizes, the quorum size of
the biggest group is effectively used for all spaces.

Replica-local transactions, e.g. those used by vinyl compaction,
are rolled back if there is no quorum. What's the value of this?

> > async txs - so it's hardly acceptable. Besides, with
> > group=local spaces, one can quickly run out of memory for undo.
> >
> > Then it should be allowed to proceed and commit.
> >
> > Then mixing sync and async tables in a single transaction
> > shouldn't be allowed.
> >
> > Imagine t1 is sync and t2 is async. tx1 changes t1 and t2, tx2
> > changes t2. tx1 is not confirmed and must be rolled back. But it
> > cannot revert the changes of tx2.
> >
> > The spec should clarify that.

You conveniently skip this explanation of the problem - does that mean
you don't intend to address it?

> > 3) One can quickly run out of memory for undo. Any sync
> > transaction should be capped with a timeout to avoid OOMs. I
> > don't know how many times I should repeat it. The only good
> > solution for load control is the in-memory WAL, which will allow
> > rolling back all transactions as soon as network partitioning is
> > detected.
>
> How can the in-memory WAL help save on _undo_ memory?
> To roll back whatever amount of transactions one needs to store the undo.

I wrote earlier that it works as a natural failure detector and
throttling mechanism. If there is no quorum, we can see it
immediately by looking at the number of active subscribers of the
in-memory WAL, so we do not accumulate undo. Two rough sketches of
what I mean follow below.

-- 
Konstantin Osipov, Moscow, Russia
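
To make the failure-detector/throttling point concrete, here is a minimal
sketch of the check I have in mind. It is not the actual Tarantool code:
struct inmem_wal, struct sync_config and sync_tx_may_begin are made-up
names. The only assumption is that the in-memory WAL knows at any moment
how many replicas are attached to it, so a sync transaction can be
rejected before any undo is allocated once the quorum is known to be
unreachable.

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical view of the in-memory WAL: it always knows how many
 * replicas are currently attached and reading rows over the network. */
struct inmem_wal {
    int active_subscribers;
};

/* Hypothetical synchronous-replication settings. */
struct sync_config {
    int quorum; /* acks required to confirm a sync transaction */
};

/*
 * Cheap, purely local check that acts as a failure detector: if fewer
 * replicas are attached than the quorum requires, reject the sync
 * transaction right away instead of queueing its undo and waiting for
 * a timeout to fire.
 */
static bool
sync_tx_may_begin(const struct inmem_wal *wal, const struct sync_config *cfg)
{
    /* +1 counts the local master as a quorum member. */
    return wal->active_subscribers + 1 >= cfg->quorum;
}

int
main(void)
{
    struct inmem_wal wal = { .active_subscribers = 1 };
    struct sync_config cfg = { .quorum = 3 };
    /* 1 subscriber + the master = 2 < 3: no quorum, so no undo is built. */
    printf("may begin: %d\n", sync_tx_may_begin(&wal, &cfg));
    return 0;
}

The check costs nothing and needs no network round trip, which is the
whole point: the master learns that the quorum cannot be reached before
it accumulates any undo.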
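
The same goes for the timeout point quoted above: an unconfirmed sync
transaction must have a deadline after which it is rolled back and its
undo is released, so undo memory stays bounded even during a partition.
Again a hypothetical sketch with made-up names (struct sync_tx,
sync_tx_check), not the real implementation:

#include <stdbool.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical state of one unconfirmed synchronous transaction. */
struct sync_tx {
    int acks;         /* replica acknowledgements received so far */
    int quorum;       /* acks required to confirm */
    bool rolled_back; /* set once the undo log has been released */
};

static double
monotonic_now(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec * 1e-9;
}

/*
 * Called on every replica ack and on a periodic timer tick.
 * Returns 1 once the quorum is reached, -1 if the deadline expired
 * and the transaction was rolled back (freeing its undo), and 0 if
 * it is still waiting.
 */
static int
sync_tx_check(struct sync_tx *tx, double deadline)
{
    if (tx->acks >= tx->quorum)
        return 1;
    if (monotonic_now() >= deadline) {
        tx->rolled_back = true;
        return -1;
    }
    return 0;
}

int
main(void)
{
    struct sync_tx tx = { .acks = 1, .quorum = 3, .rolled_back = false };
    /* Give the quorum half a second to assemble, then give up. */
    double deadline = monotonic_now() + 0.5;
    int rc;
    while ((rc = sync_tx_check(&tx, deadline)) == 0)
        ; /* a real server would yield to the event loop here */
    printf("result: %d, rolled back: %d\n", rc, tx.rolled_back);
    return 0;
}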