Tarantool development patches archive
From: Konstantin Osipov via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: tarantool-patches@dev.tarantool.org,
	Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH 1/2] replication: introduce ballot.can_be_leader
Date: Wed, 21 Jul 2021 02:20:57 +0300
Message-ID: <20210720232057.GA85781@starling> (raw)
In-Reply-To: <YPc9QCmbEHUIQjOT@grain>

* Cyrill Gorcunov <gorcunov@gmail.com> [21/07/21 00:17]:
> > Imagine there are nodes A, B, C, D, E.
> > A is a leader, E is a voter which can not become a leader.
> > 
> > Imagine A's log index is 5, B = 4, C = 3, D = 2, E = 5.
> > 
> > The majority's log index is 4, so entry 4 is committed. A dies, B
> > is partitioned away. The cluster is stuck, because neither C nor D
> > can get a quorum (3 votes).
> > 
> > Worse yet, if E's (voter) commit index is low, not high, it can vote for a
> > node which doesn't have a committed entry. In that case you can
> > lose a committed entry.
> 
> Wait, Kostya, here is a set
> 
>      A  B  C  D  E
>     {5, 4, 3, 2, 5}
>      *  *        *
>      L  F  F  F  V
> 
> where L - leader, F - follower, V - voter, LCI is 4 (least common index),
> Q(uorum) = 3, then
> 
>      A  B  C  D  E
>     {-, -, 3, 2, 5}
>            F  F  V
> 
> The Voter E won't be able to choose C or D because its log
> is bigger and the cluster gets stuck (this is guaranteed by
> vclock comparison).

Right. That's what I am saying - the cluster is stuck even though
a quorum (3 nodes) is present. And the behavior is not
consistent: whether such a cluster gets stuck depends simply on the
state of the voter - sometimes it will, sometimes it won't.
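
To make the stuck case concrete, here is a minimal sketch in Python. It is
a hypothetical model, not Tarantool code: each node is reduced to a single
log index, and the assumed rule is that a vote is granted only if the
candidate's index is not lower than the voter's own, which simplifies the
vclock comparison mentioned above. The names votes_for and electable are
made up for the illustration.

```python
# Hypothetical model, not Tarantool code: a node is just a log index, and a
# vote is granted only when the candidate's index is >= the voter's own
# (a simplification of the vclock comparison).

QUORUM = 3  # majority of the 5 configured nodes A..E


def votes_for(candidate, alive):
    """Names of alive nodes that would vote for `candidate` (itself included)."""
    return [name for name, index in alive.items()
            if alive[candidate] >= index]


def electable(alive, candidates):
    """Candidates able to collect at least QUORUM votes."""
    return [c for c in candidates if len(votes_for(c, alive)) >= QUORUM]


# A is dead and B is partitioned away; E is a voter and never a candidate.
alive = {"C": 3, "D": 2, "E": 5}

print(votes_for("C", alive))         # ['C', 'D']
print(electable(alive, ["C", "D"]))  # []
```

Three nodes are alive, yet no candidate reaches three votes, because E's
longer log makes it refuse both C and D.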

> Let's assume E's index is low, say 3
> 
>     A  B  C  D  E
>    {5, 4, 4, 3, 3}
>     *  *
>     L  F  F  F  V
> 
> in this config the leader won't commit record 5 until one
> of {C,D,E} writes the new record(s), since otherwise the quorum
> won't be reached. Assume A and B get out of the set without
> record 4 written to C
> 
>      A  B  C  D  E
>     {-, -, 4, 3, 3}
>            F  F  V
> 
> Now the node E can vote for C and D because its index is less than
> or equal to theirs. And since C's index is bigger than the others',
> it will be elected next as far as I understand, no?

You're right: assuming the voter never casts a vote for a
candidate with a shorter log, safety is not violated. I wasn't
sure that's the case, and presumed that the voter may have no log of
its own. But there are still issues with liveness. The Raft PhD thesis
has learners, so why not use them instead?
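
The safety argument for the low-index case can be checked with a small
sketch in Python. It is a hypothetical model, not Tarantool code: nodes are
reduced to a log index, and the assumed rule is the one discussed here, that
a voter never votes for a candidate with a shorter log. The helper can_win
is made up for the illustration.

```python
# Hypothetical model, not Tarantool code: a vote is granted only when the
# candidate's log index is >= the voter's own index.

QUORUM = 3  # majority of the 5 configured nodes A..E


def can_win(candidate, alive):
    """True if `candidate` collects at least QUORUM votes from alive nodes."""
    granted = sum(1 for index in alive.values() if alive[candidate] >= index)
    return granted >= QUORUM


committed = 4  # entry 4 is on A, B and C, i.e. on a majority of five
alive = {"C": 4, "D": 3, "E": 3}  # A and B are gone; E is voter-only

winners = [c for c in ("C", "D") if can_win(c, alive)]
print(winners)                                      # ['C']
print(all(alive[w] >= committed for w in winners))  # True
```

Only C can win, and C holds the committed entry, so nothing committed is
lost in this configuration.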

> The E won't be leader but will
> help C to gather the majority. So the cluster should be safe,
> if only I'm not missing something obvious.

-- 
Konstantin Osipov, Moscow, Russia


