Tarantool development patches archive
From: Konstantin Osipov <kostja.osipov@gmail.com>
To: Sergey Ostanevich <sergos@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org,
	Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Subject: Re: [Tarantool-patches] [RFC] Quorum-based synchronous replication
Date: Thu, 23 Apr 2020 23:39:32 +0300
Message-ID: <20200423203932.GA22011@atlas>
In-Reply-To: <20200423151134.GD112@tarantool.org>

* Sergey Ostanevich <sergos@tarantool.org> [20/04/23 18:11]:
> > The spec should demonstrate the consistency is guaranteed: right
> > now it can easily be violated during a leader change, and this is
> > left out of scope of the spec.
> > 
> > My take is that any implementation which is not close enough to a
> > TLA+ proven spec is not trustworthy, so I would not claim myself
> > or trust anyone else's claims that it is consistent. At best this
> > RFC could achieve durability, by ensuring that no transaction is
> > committed unless it is delivered to a majority of replicas.
> 
> Which is exactly what the RFC goals state.

This is durability, though, not consistency. My point is: if
consistency cannot be guaranteed anyway, why assume a single leader?
Let's consider what happens if all replicas are allowed to collect
acks, and define for that case the same semantics as we have today for
async multi-master. Then add the remaining bits of Raft.
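
To make the durability-only rule concrete, here is a minimal sketch of
ack collection on a single replica. It assumes nothing about the actual
Tarantool code: the names AckCollector, submit and on_ack are
hypothetical, and the quorum is taken to be a simple majority that
includes the originating replica. It shows only that a transaction is
confirmed once a majority has it, and says nothing about consistency
across leader changes.

```python
from dataclasses import dataclass, field


@dataclass
class PendingTxn:
    """A transaction waiting for acks (hypothetical structure)."""
    lsn: int
    acked_by: set = field(default_factory=set)


class AckCollector:
    """Collects acks for transactions originated on one replica."""

    def __init__(self, replica_id: int, cluster_size: int):
        self.replica_id = replica_id
        self.quorum = cluster_size // 2 + 1  # simple majority (assumption)
        self.pending = {}
        self.confirmed = []

    def submit(self, lsn: int) -> None:
        # The originating replica counts as the first ack: the record
        # is already in its own log.
        self.pending[lsn] = PendingTxn(lsn, {self.replica_id})
        self._maybe_confirm(lsn)

    def on_ack(self, lsn: int, from_replica: int) -> None:
        txn = self.pending.get(lsn)
        if txn is None:
            return  # already confirmed or unknown: ignore late acks
        txn.acked_by.add(from_replica)  # a set, so duplicates are harmless
        self._maybe_confirm(lsn)

    def _maybe_confirm(self, lsn: int) -> None:
        if len(self.pending[lsn].acked_by) >= self.quorum:
            del self.pending[lsn]
            self.confirmed.append(lsn)
```

Since the collector is keyed by the originating replica, every node
could run its own instance, which is the multi-master variant
suggested above.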
> 
> > Consistency requires implementing the Raft spec in full and showing
> > that leader changes preserve the write ahead log linearizability.
> > 
> So the leader should stop accepting transactions and wait for all txns
> in the queue to be resolved as confirmed, or else issue a rollback
> after a timeout as a last resort.
> Since there is no automation in leader election, the cluster will end
> up in a consistent state after this. Now a new leader can be appointed
> with all circumstances taken into account - node availability, ping
> from the proxy, lsn, etc.
> Again, this RFC is not about any HA features, such as auto-failover.
> 
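
The manual switch-over quoted above can be sketched as follows. This is
an illustration under stated assumptions, not the RFC's implementation:
drain_leader, is_confirmed and poll are hypothetical names, and the
caller is assumed to have already stopped accepting new transactions.

```python
import time


def drain_leader(pending, is_confirmed, timeout, poll=0.01):
    """Resolve every queued transaction before a manual leader switch.

    pending:      LSNs still waiting for a quorum (hypothetical queue).
    is_confirmed: callable(lsn) -> bool, True once a quorum has acked.
    Returns (confirmed, rolled_back).
    """
    deadline = time.monotonic() + timeout
    confirmed, waiting = [], list(pending)
    while True:
        still_waiting = []
        for lsn in waiting:
            if is_confirmed(lsn):
                confirmed.append(lsn)
            else:
                still_waiting.append(lsn)
        waiting = still_waiting
        if not waiting or time.monotonic() >= deadline:
            break
        time.sleep(poll)
    # Last resort: whatever did not reach a quorum in time is rolled back.
    return confirmed, waiting
```

Only after this function returns would an operator appoint the new
leader; nothing here elects one automatically.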
> > > > The other issue is that if your replicas are alive but
> > > > slow/lagging behind, you can't let too many undo records to
> > > > pile up unacknowledged in tx thread.
> > > > The in-memory relay solves this nicely too, because it kicks out
> > > > replicas from memory to file mode if they are unable to keep up
> > > > with the speed of change.
> > > > 
> > > That is the same problem - the leader's resources, and so a natural
> > > limit on throughput. I bet Tarantool faces similar limitations even
> > > now, although different ones.
> > > 
> > > The in-memory relay is supposed to keep the same interface, so we
> > > expect to hop easily onto this new shiny express as soon as it
> > > appears. This will be an optimization: we're trying to implement
> > > something first and then speed it up.
> > 
> > It is pretty clear that the implementation will be different. 
> > 
> Which contradicts preserving the interface, right?

I don't believe internals and API can be so disconnected. I think
in-memory relay is such a significant change that the
implementation has to build upon it.
The trigger-based implementation was contributed back in 2015 and
went nowhere. In fact, it was the inspiration for creating a backlog
of items such as parallel applier, applier in iproto, in-memory
relay, and so on - all of these are "review items" for the
trigger-based syncrep:

https://github.com/Alexey-Ivanensky/tarantool/tree/bsync

-- 
Konstantin Osipov, Moscow, Russia
