From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <kostja.osipov@gmail.com>
Received: from mail-lj1-f177.google.com (mail-lj1-f177.google.com
 [209.85.208.177])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (No client certificate requested)
 by dev.tarantool.org (Postfix) with ESMTPS id 83C114696C3
 for <tarantool-patches@dev.tarantool.org>;
 Thu, 23 Apr 2020 12:14:38 +0300 (MSK)
Received: by mail-lj1-f177.google.com with SMTP id w20so5461367ljj.0
 for <tarantool-patches@dev.tarantool.org>;
 Thu, 23 Apr 2020 02:14:38 -0700 (PDT)
Date: Thu, 23 Apr 2020 12:14:36 +0300
From: Konstantin Osipov <kostja.osipov@gmail.com>
Message-ID: <20200423091436.GA14576@atlas>
References: <20200403210836.GB18283@tarantool.org>
 <ab849382-feb5-b906-84a8-402124e1c0a8@tarantool.org>
 <20200421104918.GA112@tarantool.org>
 <dd4d703e-6918-ccd9-ac5e-76fb54fff0f9@tarantool.org>
 <20200423065809.GA4528@atlas>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20200423065809.GA4528@atlas>
Subject: Re: [Tarantool-patches] [RFC] Quorum-based synchronous replication
List-Id: Tarantool development patches <tarantool-patches.dev.tarantool.org>
List-Unsubscribe: <https://lists.tarantool.org/mailman/options/tarantool-patches>, 
 <mailto:tarantool-patches-request@dev.tarantool.org?subject=unsubscribe>
List-Archive: <https://lists.tarantool.org/pipermail/tarantool-patches/>
List-Post: <mailto:tarantool-patches@dev.tarantool.org>
List-Help: <mailto:tarantool-patches-request@dev.tarantool.org?subject=help>
List-Subscribe: <https://lists.tarantool.org/mailman/listinfo/tarantool-patches>, 
 <mailto:tarantool-patches-request@dev.tarantool.org?subject=subscribe>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>, Sergey Ostanevich <sergos@tarantool.org>, tarantool-patches@dev.tarantool.org

* Konstantin Osipov <kostja.osipov@gmail.com> [20/04/23 09:58]:
> > > To my understanding - it's up to user. I was considering a cluster that
> > > has no WAL at all - relying on synchro replication and sufficient number
> > > of replicas. Everyone who I asked about it told me I'm nuts. To my great
> > > surprise Alexander Lyapunov brought exactly the same idea to discuss. 
> > 
> > I didn't see an RFC on that, and this can become easily possible, when
> > in-memory relay is implemented. If it is implemented in a clean way. We
> > just can turn off the disk backoff, and it will work from memory-only.
> 
> Sync replication must work from in-memory relay only. It works as
> a natural failure detector: a replica which is slow or unavailable
> is first removed from the subscribers of in-memory relay, and only 
> then (possibly much much later) is marked as down.
> 
> By looking at the in-memory relay you have a clear idea what peers
> are available and can abort a transaction if a cluster is in the
> downgraded state right away. You never wait for impossible events. 
> 
> If you do have to wait, and say your wait timeout is 1 second, you
> quickly run out of any fibers in the fiber pool for any work,
> because all of them will be waiting on the sync transactions they
> picked up from iproto to finish. The system will lose its
> throttling capability. 

The other issue is that if your replicas are alive but
slow/lagging behind, you can't let too many undo records
pile up unacknowledged in the tx thread.
The in-memory relay solves this nicely too, because it kicks out
replicas from memory to file mode if they are unable to keep up
with the speed of change.
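
To make both points concrete, here is a rough toy model in Python, not
Tarantool code. Every name in it (InMemoryRelay, max_lag, quorum,
file_mode, pending_undo) is hypothetical and made up for illustration;
it only assumes that the relay tracks the last acknowledged LSN per
in-memory subscriber. A write first kicks laggards out to file mode,
then aborts immediately if the remaining subscribers cannot form a
quorum, so nothing ever waits on a timeout and the number of
unconfirmed (undo) records stays bounded by max_lag + 1.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryRelay:
    """Toy model of an in-memory relay used as a failure detector.

    Hypothetical sketch: none of these names exist in Tarantool.
    """
    max_lag: int            # how many records a subscriber may trail by
    quorum: int             # votes needed to confirm, master included
    last_lsn: int = 0       # last record written by the master
    confirmed_lsn: int = 0  # highest LSN acknowledged by a quorum
    acked: dict = field(default_factory=dict)    # in-memory subscribers
    file_mode: set = field(default_factory=set)  # replicas kicked out

    def subscribe(self, replica_id):
        # A (re)joining replica is assumed to be caught up.
        self.acked[replica_id] = self.last_lsn
        self.file_mode.discard(replica_id)

    def ack(self, replica_id, lsn):
        # Acks from replicas already in file mode do not count.
        if replica_id in self.acked:
            lsn = min(lsn, self.last_lsn)
            self.acked[replica_id] = max(self.acked[replica_id], lsn)
            self._update_confirmed()

    def _evict_laggards(self):
        # Replicas that cannot keep up fall back to reading from file.
        for rid, lsn in list(self.acked.items()):
            if self.last_lsn - lsn > self.max_lag:
                del self.acked[rid]
                self.file_mode.add(rid)

    def _update_confirmed(self):
        # The master votes for its own last LSN.
        votes = sorted([self.last_lsn, *self.acked.values()], reverse=True)
        if len(votes) >= self.quorum:
            self.confirmed_lsn = max(self.confirmed_lsn,
                                     votes[self.quorum - 1])

    def can_commit(self):
        return 1 + len(self.acked) >= self.quorum

    def write(self):
        """Return the new LSN, or None if aborted right away."""
        self._evict_laggards()
        if not self.can_commit():
            return None  # no quorum among live subscribers: do not wait
        self.last_lsn += 1
        self._update_confirmed()
        return self.last_lsn

    def pending_undo(self):
        # Records written but not yet confirmed by a quorum.
        return self.last_lsn - self.confirmed_lsn
```

With max_lag=2 and quorum=2, a replica that stops acknowledging is
moved to file_mode on the write after it falls more than two records
behind, and once the last in-memory subscriber is gone write() returns
None instead of blocking a fiber.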

-- 
Konstantin Osipov, Moscow, Russia