From: Konstantin Osipov <kostja.osipov@gmail.com>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [RFC] Quorum-based synchronous replication
Date: Fri, 24 Apr 2020 01:28:37 +0300
Message-ID: <20200423222837.GC22011@atlas>
In-Reply-To: <c86ef610-f54e-524e-103a-324e7e572d2d@tarantool.org>
* Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [20/04/24 00:42]:
> It says, that once the quorum is collected, and 'confirm' is written
> to local leader's WAL, it is considered committed and is reported
> to the client as successful.
>
> On the other hand it is said, that in case of leader change the
> new leader will rollback all not confirmed transactions. That leads
> to the following bug:
>
> Assume we have 4 instances: i1, i2, i3, i4. Leader is i1. It
> writes a transaction with LSN1. The LSN1 is sent to other nodes,
> they apply it ok, and send acks to the leader. The leader sees
> i2-i4 all applied the transaction (propagated their LSNs to LSN1).
> It writes 'confirm' to its local WAL, reports it to the client as
> success, the client's request is over, it is returned back to
> some remote node, etc. The transaction is officially synchronously
> committed.
>
> Then the leader's machine dies - disk is dead. The confirm was
> not sent to any of the other nodes. For example, it started having
> problems with network connection to the replicas recently before
> the death. Or it just didn't manage to hand the confirm out.
>
> From now on if any of the other nodes i2-i4 becomes a leader, it
> will rollback the officially confirmed transaction, even if it
> has it, and all the other nodes too.
>
> That basically means, this sync replication gives exactly the same
> guarantees as the async replication - 'confirm' on the leader tells
> nothing about replicas except that they *are able to apply the
> transaction*, but still may not apply it.
>
> Am I missing something?
This video explains what a leader has to do after it's been elected:
https://www.youtube.com/watch?v=YbZ3zDzDnrw
In short, the transactions in the leader's WAL have to be committed,
not rolled back.
The Raft paper, https://raft.github.io/raft.pdf, has the answers in a
concise single-page summary.
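To make the scenario above concrete, here is a toy Python model. It is
not Tarantool code and not a full Raft implementation. The names Node,
replicate, take_over and the two policy strings are made up for
illustration, and a quorum of 3 out of 4 nodes is assumed. It contrasts
a new leader that rolls back unconfirmed entries with one that finishes
committing what it finds in its own WAL:

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    wal: list = field(default_factory=list)       # LSNs in the local WAL
    confirmed: set = field(default_factory=set)   # LSNs with a 'confirm' record
    alive: bool = True


def replicate(leader, replicas, lsn, quorum):
    """Leader writes lsn, collects acks, and writes 'confirm' locally only."""
    leader.wal.append(lsn)
    acks = 1  # the leader's own write counts towards the quorum
    for r in replicas:
        if r.alive:
            r.wal.append(lsn)
            acks += 1
    if acks >= quorum:
        # 'confirm' reaches the leader's WAL but is never sent to replicas.
        leader.confirmed.add(lsn)
        return True  # the client is told the transaction succeeded
    return False


def take_over(new_leader, others, quorum, policy):
    """Handle unconfirmed entries found in the new leader's WAL."""
    nodes = [n for n in [new_leader] + others if n.alive]
    pending = [l for l in new_leader.wal if l not in new_leader.confirmed]
    for lsn in pending:
        if policy == "rollback":
            # Behaviour described in the quoted scenario: drop the entry.
            for n in nodes:
                if lsn in n.wal:
                    n.wal.remove(lsn)
        elif policy == "commit":
            # Finish the job: re-send the entry, then confirm on a quorum.
            for n in nodes:
                if lsn not in n.wal:
                    n.wal.append(lsn)
            if len(nodes) >= quorum:
                for n in nodes:
                    n.confirmed.add(lsn)


def scenario(policy):
    i1, i2, i3, i4 = (Node(n) for n in ("i1", "i2", "i3", "i4"))
    quorum = 3  # assumed: majority of 4
    reported_ok = replicate(i1, [i2, i3, i4], lsn=1, quorum=quorum)
    i1.alive = False  # the old leader dies before handing out 'confirm'
    take_over(i2, [i3, i4], quorum, policy)
    survived = all(1 in n.confirmed for n in (i2, i3, i4))
    return reported_ok, survived
```

With "rollback" the client has been told success, yet no surviving node
keeps the transaction. With "commit" it survives. Real Raft additionally
restricts which node may be elected and commits entries from older terms
only together with an entry from the new term, which this sketch leaves
out.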
Why have this discussion at all? Any ambiguity or discrepancy
between this document and the Raft paper should be treated as a
mistake in this document. Or do you actually think it's possible
to invent a new consensus algorithm this way?
> Note for those who is concerned: this has nothing to do with
> in-memory relay. It has the same problems, which are in the protocol,
> not in the implementation.
No, the issues are distinct:
1) there may be cases where this paper doesn't follow Raft. It
should be obvious to everyone that, with the exception of
external leader election and failure detection, it has to if
correctness is of any concern, so it's simply a matter of
fixing this doc to match Raft.
As to the leader election, there are two alternatives: either
spec out in this paper how the external election interacts
with the cluster, including finishing up old transactions and
neutralizing old leaders, or allow multi-master and forget
about consistency for now.
2) an implementation based on triggers will be complicated and
will have performance/stability implications. This is what I
hope I was able to convey and in this case we can put the
matter to rest.
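On "neutralizing old leaders" in point 1: Raft does this with a
monotonically increasing term number that replicas use to reject
requests from a stale leader. A minimal Python sketch of that idea
follows. The Replica class and its method names are hypothetical, and
it is only an assumption that an external election service would hand
out such a term:

```python
class Replica:
    def __init__(self):
        self.term = 0   # highest leader term this replica has seen
        self.wal = []   # LSNs accepted into the local WAL

    def on_new_leader(self, term):
        """Hypothetical notification from the external election service."""
        if term > self.term:
            self.term = term

    def on_append(self, leader_term, lsn):
        """Accept a write only from a leader whose term is not stale."""
        if leader_term < self.term:
            return False  # old leader: refuse, so it cannot collect a quorum
        self.term = leader_term
        self.wal.append(lsn)
        return True
```

Once a majority of replicas has seen the new term, the old leader can
no longer gather a quorum of acks, so it cannot confirm anything new.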
--
Konstantin Osipov, Moscow, Russia