From: Serge Petrenko via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>,
tarantool-patches@dev.tarantool.org, gorcunov@gmail.com
Subject: Re: [Tarantool-patches] [PATCH 5/6] replication: use 'score' to find a join-master
Date: Thu, 10 Jun 2021 17:06:16 +0300
Message-ID: <3f77385e-9bcd-654c-68eb-8bad8a634a0a@tarantool.org>
In-Reply-To: <6675abcfa409f1fd6e05a7e7852b42e1a08d1795.1622849790.git.v.shpilevoy@tarantool.org>
On 05.06.2021 02:37, Vladislav Shpilevoy wrote:
> The patch refactors the algorithm of finding a join-master (in
> replicaset_find_join_master()) to use scores instead of multiple
> iterations with different criteria.
>
> The original code was relatively fine as long as it had only
> one parameter to change: whether it should skip
> `box.cfg{read_only = true}` nodes.
>
> However, it was clearly "on the edge" of acceptable
> complexity due to a second, non-configurable parameter: whether a
> replica is in read-only state regardless of its config.
>
> It is going to get more complicated when the algorithm takes
> into account a third parameter: whether an instance is
> bootstrapped.
>
> Then it should make decisions like "among bootstrapped nodes try
> to prefer instances not having read_only=true, and not being in
> read-only state". The easiest way to do so is to use
> scores/weights incremented according to the instance's parameters
> matching certain "good points".
>
> Part of #5613
LGTM.
> ---
> src/box/replication.cc | 62 ++++++++++++++++--------------------------
> 1 file changed, 23 insertions(+), 39 deletions(-)
>
> diff --git a/src/box/replication.cc b/src/box/replication.cc
> index 990f6239c..d33e70f28 100644
> --- a/src/box/replication.cc
> +++ b/src/box/replication.cc
> @@ -960,71 +960,55 @@ replicaset_next(struct replica *replica)
>   * replicas, choose a read-only replica with biggest vclock
>   * as a leader, in hope it will become read-write soon.
>   */
> -static struct replica *
> -replicaset_round(bool skip_ro)
> +struct replica *
> +replicaset_find_join_master(void)
>  {
>  	struct replica *leader = NULL;
> +	int leader_score = -1;
>  	replicaset_foreach(replica) {
>  		struct applier *applier = replica->applier;
>  		if (applier == NULL)
>  			continue;
>  		const struct ballot *ballot = &applier->ballot;
> -		/**
> -		 * While bootstrapping a new cluster, read-only
> -		 * replicas shouldn't be considered as a leader.
> -		 * The only exception if there is no read-write
> -		 * replicas since there is still a possibility
> -		 * that all replicas exist in cluster table.
> -		 */
> -		if (skip_ro && ballot->is_ro_cfg)
> -			continue;
> -		if (leader == NULL) {
> -			leader = replica;
> -			continue;
> -		}
> -		const struct ballot *leader_ballot = &leader->applier->ballot;
> +		int score = 0;
>  		/*
> -		 * Try to find a replica which has already left
> -		 * orphan mode.
> +		 * Prefer instances not configured as read-only via box.cfg, and
> +		 * not being in read-only state due to any other reason. The
> +		 * config is stronger because if it is configured as read-only,
> +		 * it is in read-only state for sure, until the config is
> +		 * changed.
>  		 */
> -		if (ballot->is_ro && !leader_ballot->is_ro)
> +		if (!ballot->is_ro_cfg)
> +			score += 5;
> +		if (!ballot->is_ro)
> +			score += 1;
> +		if (leader_score < score)
> +			goto elect;
> +		if (score < leader_score)
>  			continue;
> +		const struct ballot *leader_ballot;
> +		leader_ballot = &leader->applier->ballot;
>  		/*
>  		 * Choose the replica with the most advanced
>  		 * vclock. If there are two or more replicas
>  		 * with the same vclock, prefer the one with
>  		 * the lowest uuid.
>  		 */
> -		int cmp = vclock_compare_ignore0(&ballot->vclock,
> -						 &leader_ballot->vclock);
> +		int cmp;
> +		cmp = vclock_compare_ignore0(&ballot->vclock,
> +					     &leader_ballot->vclock);
>  		if (cmp < 0)
>  			continue;
>  		if (cmp == 0 && tt_uuid_compare(&replica->uuid,
>  						&leader->uuid) > 0)
>  			continue;
> +	elect:
>  		leader = replica;
> +		leader_score = score;
>  	}
>  	return leader;
>  }
>
> -struct replica *
> -replicaset_find_join_master(void)
> -{
> -	bool skip_ro = true;
> -	/**
> -	 * Two loops, first prefers read-write replicas among others.
> -	 * Second for backward compatibility, if there is no such
> -	 * replicas at all.
> -	 */
> -	struct replica *leader = replicaset_round(skip_ro);
> -	if (leader == NULL) {
> -		skip_ro = false;
> -		leader = replicaset_round(skip_ro);
> -	}
> -
> -	return leader;
> -}
> -
>  struct replica *
>  replica_by_uuid(const struct tt_uuid *uuid)
>  {
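
For readers following the thread, the selection rule in the hunk above
can be sketched as a standalone C function. This is a minimal
illustration, not Tarantool code: `struct node`, `node_score()` and
`find_join_master()` are hypothetical names, and a plain integer stands
in for both the vclock and the UUID. Only the weights (5 and 1) and the
tie-breaking order are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the ballot fields the patch reads. */
struct node {
	bool has_applier; /* false models applier == NULL */
	bool is_ro_cfg;   /* read-only via box.cfg{read_only = true} */
	bool is_ro;       /* read-only state for any reason */
	int vclock;       /* assumption: one integer instead of a vclock */
	int uuid;         /* assumption: one integer instead of a UUID */
};

/* Same weights as in the patch: the config flag outweighs the state flag. */
static int
node_score(const struct node *n)
{
	int score = 0;
	if (!n->is_ro_cfg)
		score += 5;
	if (!n->is_ro)
		score += 1;
	return score;
}

/* Return the index of the preferred node, or -1 if no node is usable. */
static int
find_join_master(const struct node *nodes, size_t count)
{
	assert(nodes != NULL || count == 0);
	int leader = -1;
	int leader_score = -1;
	for (size_t i = 0; i < count; i++) {
		const struct node *n = &nodes[i];
		if (!n->has_applier)
			continue;
		int score = node_score(n);
		if (score < leader_score)
			continue;
		if (score == leader_score) {
			/* Equal scores: larger vclock wins, then lower uuid. */
			const struct node *l = &nodes[leader];
			if (n->vclock < l->vclock)
				continue;
			if (n->vclock == l->vclock && n->uuid > l->uuid)
				continue;
		}
		leader = (int)i;
		leader_score = score;
	}
	return leader;
}
```

Since the config weight (5) is larger than the state weight (1), a node
without read_only=true in its config always outranks one that has it,
whatever their vclocks are. The vclock and UUID are compared only
between nodes with equal scores.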
--
Serge Petrenko