From: Konstantin Osipov <kostja.osipov@gmail.com>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [RFC] Quorum-based synchronous replication
Date: Fri, 24 Apr 2020 01:28:37 +0300
Message-ID: <20200423222837.GC22011@atlas>
In-Reply-To: <c86ef610-f54e-524e-103a-324e7e572d2d@tarantool.org>

* Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [20/04/24 00:42]:
> It says, that once the quorum is collected, and 'confirm' is written
> to local leader's WAL, it is considered committed and is reported
> to the client as successful.
>
> On the other hand it is said, that in case of leader change the
> new leader will rollback all not confirmed transactions. That leads
> to the following bug:
>
> Assume we have 4 instances: i1, i2, i3, i4. Leader is i1. It
> writes a transaction with LSN1. The LSN1 is sent to other nodes,
> they apply it ok, and send acks to the leader. The leader sees
> i2-i4 all applied the transaction (propagated their LSNs to LSN1).
> It writes 'confirm' to its local WAL, reports it to the client as
> success, the client's request is over, it is returned back to
> some remote node, etc. The transaction is officially synchronously
> committed.
>
> Then the leader's machine dies - disk is dead. The confirm was
> not sent to any of the other nodes. For example, it started having
> problems with network connection to the replicas recently before
> the death. Or it just didn't manage to hand the confirm out.
>
> From now on if any of the other nodes i2-i4 becomes a leader, it
> will rollback the officially confirmed transaction, even if it
> has it, and all the other nodes too.
>
> That basically means, this sync replication gives exactly the same
> guarantees as the async replication - 'confirm' on the leader tells
> nothing about replicas except that they *are able to apply the
> transaction*, but still may not apply it.
>
> Am I missing something?
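
The quoted scenario can be sketched as a toy model. This is only an
illustration under stated assumptions: Node, replicate_and_confirm,
take_over and the two policy names are hypothetical and are not Tarantool
code or anything defined in the RFC. The "commit" policy is a deliberate
simplification of what a newly elected leader does; Raft itself commits
entries from earlier terms indirectly, and that detail is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    wal: list = field(default_factory=list)      # LSNs written to the local WAL
    confirmed: set = field(default_factory=set)  # LSNs with a local 'confirm'


def replicate_and_confirm(leader, replicas, lsn, quorum):
    """Leader writes lsn, collects acks, then writes 'confirm' locally only.

    The 'confirm' record is deliberately NOT delivered to the replicas:
    this models the leader dying right after it has answered the client.
    Returns True if the client is told the transaction succeeded.
    """
    leader.wal.append(lsn)
    acks = 1  # the leader's own WAL write
    for replica in replicas:
        replica.wal.append(lsn)
        acks += 1
    if acks >= quorum:
        leader.confirmed.add(lsn)
        return True
    return False


def take_over(new_leader, followers, policy):
    """Finish the new leader's unconfirmed entries under the given policy."""
    pending = [lsn for lsn in new_leader.wal if lsn not in new_leader.confirmed]
    for lsn in pending:
        for node in [new_leader, *followers]:
            if policy == "rollback":
                # The reading of the RFC that the quoted mail objects to.
                if lsn in node.wal:
                    node.wal.remove(lsn)
            elif policy == "commit":
                # Simplified "finish what is already in the leader's WAL".
                if lsn not in node.wal:
                    node.wal.append(lsn)
                node.confirmed.add(lsn)
            else:
                raise ValueError(f"unknown policy: {policy}")


def run(policy):
    i1, i2, i3, i4 = (Node(name) for name in ("i1", "i2", "i3", "i4"))
    client_ok = replicate_and_confirm(i1, [i2, i3, i4], lsn=1, quorum=3)
    # i1 dies here; i2 is elected and finishes the pending entries.
    take_over(i2, [i3, i4], policy)
    survived = all(1 in node.wal for node in (i2, i3, i4))
    return client_ok, survived


if __name__ == "__main__":
    for policy in ("rollback", "commit"):
        client_ok, survived = run(policy)
        print(f"{policy}: client told ok={client_ok}, survived={survived}")
```

With the "rollback" policy the client has been told the transaction
succeeded, yet it is gone from every surviving node; with the "commit"
policy it survives the leader change.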

This video explains what a leader has to do after it has been elected:

https://www.youtube.com/watch?v=YbZ3zDzDnrw

In short, the transactions in the leader's WAL have to be committed,
not rolled back.

The Raft paper (https://raft.github.io/raft.pdf) has the answers in a
concise single-page summary.

Why have this discussion at all? Any ambiguity or discrepancy between
this document and the Raft paper should be treated as a mistake in
this document. Or do you actually think it's possible to invent a new
consensus algorithm this way?

> Note for those who is concerned: this has nothing to do with
> in-memory relay. It has the same problems, which are in the protocol,
> not in the implementation.

No, the issues are distinct:

1) There may be cases where this paper doesn't follow Raft. It should
   be obvious to everyone that, with the exception of external leader
   election and failure detection, it has to if correctness is of any
   concern, so it's simply a matter of fixing this doc to match Raft.

   As to the leader election, there are two alternatives: either spec
   out in this paper how the external election interacts with the
   cluster, including finishing up old transactions and neutralizing
   old leaders, or allow multi-master, and so forget about consistency
   for now.

2) An implementation based on triggers will be complicated and will
   have performance/stability implications.

This is what I hope I was able to convey, and in this case we can put
the matter to rest.

-- 
Konstantin Osipov, Moscow, Russia