Tarantool development patches archive
From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: Konstantin Osipov <kostja.osipov@gmail.com>,
	Sergey Ostanevich <sergos@tarantool.org>,
	tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [RFC] Quorum-based synchronous replication
Date: Wed, 13 May 2020 23:34:24 +0200
Message-ID: <29b2a0df-fe3e-332e-1d33-e9ee37353383@tarantool.org>
In-Reply-To: <20200512174741.GC2049@atlas>

Thanks for the discussion!

On 12/05/2020 19:47, Konstantin Osipov wrote:
> * Sergey Ostanevich <sergos@tarantool.org> [20/05/12 19:43]:
> 
>>> 1) It's unclear what happens here if async tx follows a sync tx.
>>>    Does it wait for the sync tx? This reduces availability for
>>
>> Definitely yes, unless we keep the 'dirty read' as it is at the moment
>> in memtx. This is the essence of the design, and it is temporary until 
>> the MVCC similar to the vinyl machinery appears. I intentionally didn't
>> include this big task into this RFC. 
>>
>> It will provide similar capabilities, although it will keep only
>> dependent transactions in the undo log. Also, it looks like it will fit
>> well into the machinery of this RFC. 
> 
> = reduced availability for all who have at least one sync space.
> 
> If different spaces have different quorum size = quorum size of
> the biggest group is effectively used for all spaces.
> 
> Replica-local transactions, e.g. those used by vinyl compaction, 
> are rolled back if there is no quorum.
> 
> What's the value of this?

There is an example where it leaves the database in an inconsistent
state, with half of a transaction applied. I don't know why Sergey
didn't add it. I propose that he extend the RFC with these examples,
since you are not the first person to find this strange and wrong.
Clearly the RFC still does not explain this point thoroughly enough.

>>>    async txs - so it's hardly acceptable. Besides, with
>>>    group=local spaces, one can quickly run out of memory for undo.
>>>   
>>>
>>> 3) One can quickly run out of memory for undo. Any sync
>>>    transaction should be capped with a timeout to avoid OOMs. I
>>>    don't know how many times I should repeat it. The only good
>>>    solution for load control is in-memory WAL, which will allow to
>>>    rollback all transactions as soon as network partitioning is
>>>    detected.
>>
>> How can in-memory WAL help save on _undo_ memory?
>> To roll back any number of transactions, one needs to store the undo.
> 
> I wrote earlier that it works as a natural failure detector and
> throttling mechanism. If
> there is no quorum, we can see it immediately by looking at the
> number of active subscribers of the in-memory WAL, so do not
> accumulate undo.

Here we go again ...

Speaking of throttling: without in-memory WAL there is no need for
throttling. Everything is already 'slow' by design, as you see it.

Speaking of failure detection: what??? I don't get it. This is something
new. With or without in-memory relay, you can see whether there is a
quorum anyway. This is a matter of the API of the replication and
transaction modules and of their interaction with each other, which is
solved by txn_limbo in my branch.

But still, I don't see how knowing the number of subscribers helps with
the quorum. Subscriber presence does not add to quorums by itself. Every
transaction still needs to be replicated before you can say that its
quorum got +1 replica ack.
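
To make the distinction between subscribers and acks concrete, here is a
minimal sketch in Python. It is a toy model only, not the txn_limbo code
from the branch: the class and method names (Limbo, subscribe, submit,
ack, is_confirmed) are hypothetical, and it assumes that the master's
own WAL write counts as the first ack towards the quorum.

```python
class Limbo:
    """Toy model: pending sync transactions wait for per-transaction acks."""

    def __init__(self, quorum):
        self.quorum = quorum      # acks required (assumed to include the master)
        self.subscribers = set()  # replicas that are currently connected
        self.acks = {}            # lsn -> set of node ids that acked this lsn

    def subscribe(self, replica_id):
        # A connected replica does not contribute to any quorum by itself.
        self.subscribers.add(replica_id)

    def submit(self, lsn):
        # Assumption: the master's local write is counted as the first ack.
        self.acks[lsn] = {"master"}

    def ack(self, replica_id, lsn):
        # Meant to be called only after the replica received and wrote this lsn.
        self.acks[lsn].add(replica_id)

    def is_confirmed(self, lsn):
        return len(self.acks[lsn]) >= self.quorum


limbo = Limbo(quorum=2)
limbo.subscribe("r1")
limbo.subscribe("r2")
limbo.submit(lsn=10)
before = limbo.is_confirmed(10)  # two subscribers are present, no acks yet
limbo.ack("r1", 10)
after = limbo.is_confirmed(10)   # one replica ack plus the master's own
```

In this model the transaction stays unconfirmed while both replicas are
merely subscribed, and becomes confirmed only once one of them acks the
specific LSN.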

