From: Vladislav Shpilevoy via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: tarantool-patches@dev.tarantool.org, gorcunov@gmail.com,
	sergepetrenko@tarantool.org
Subject: [Tarantool-patches] [PATCH v2 4/5] election: during bootstrap prefer candidates
Date: Sun, 18 Jul 2021 18:53:28 +0200
Message-ID: <97e77a3158ef83db3c93f247297297874d7465cb.1626627097.git.v.shpilevoy@tarantool.org>
In-Reply-To: <cover.1626627097.git.v.shpilevoy@tarantool.org>

During cluster bootstrap the boot master election algorithm didn't
take into account the election modes of the instances.

It could be that all nodes have box.cfg.read_only = false, none is
booted, and all are read-only now. Then the node with the smallest
UUID was chosen even if it was a box.cfg.election_mode='voter' node.
It could neither boot nor register other nodes, so the cluster
couldn't start.

The patch makes the boot master election prefer the instances which
can become a Raft leader when all the other parameters are equal.

Part #6018
---
 src/box/replication.cc                        | 11 ++-
 .../gh-6018-election-boot-voter.result        | 70 +++++++++++++++++++
 .../gh-6018-election-boot-voter.test.lua      | 38 ++++++++++
 test/replication/gh-6018-master.lua           | 17 +++++
 test/replication/gh-6018-replica.lua          | 15 ++++
 test/replication/suite.cfg                    |  1 +
 6 files changed, 150 insertions(+), 2 deletions(-)
 create mode 100644 test/replication/gh-6018-election-boot-voter.result
 create mode 100644 test/replication/gh-6018-election-boot-voter.test.lua
 create mode 100644 test/replication/gh-6018-master.lua
 create mode 100644 test/replication/gh-6018-replica.lua

diff --git a/src/box/replication.cc b/src/box/replication.cc
index a0b3e0186..45ad03dfd 100644
--- a/src/box/replication.cc
+++ b/src/box/replication.cc
@@ -978,12 +978,19 @@ replicaset_find_join_master(void)
 		 * config is stronger because if it is configured as read-only,
 		 * it is in read-only state for sure, until the config is
 		 * changed.
+		 *
+		 * In a cluster with leader election enabled all instances might
+		 * look equal by the scores above. Then must prefer the ones
+		 * which can be elected as a leader, because only they would be
+		 * able to boot themselves and register the others.
 		 */
 		if (ballot->is_booted)
-			score += 10;
+			score += 1000;
 		if (!ballot->is_ro_cfg)
-			score += 5;
+			score += 100;
 		if (!ballot->is_ro)
+			score += 10;
+		if (ballot->can_lead)
 			score += 1;
 		if (leader_score < score)
 			goto elect;
diff --git a/test/replication/gh-6018-election-boot-voter.result b/test/replication/gh-6018-election-boot-voter.result
new file mode 100644
index 000000000..6b05f0825
--- /dev/null
+++ b/test/replication/gh-6018-election-boot-voter.result
@@ -0,0 +1,70 @@
+-- test-run result file version 2
+--
+-- gh-6018: in an auto-election cluster nodes with voter state could be selected
+-- as bootstrap leaders. They should not, because a voter can't be ever writable
+-- and it can neither boot itself nor register other nodes.
+--
+test_run = require('test_run').new()
+ | ---
+ | ...
+
+function boot_with_master_election_mode(mode)                                  \
+    test_run:cmd('create server master with '..                                \
+                 'script="replication/gh-6018-master.lua"')                    \
+    test_run:cmd('start server master with wait=False, args="'..mode..'"')     \
+    test_run:cmd('create server replica with '..                               \
+                 'script="replication/gh-6018-replica.lua"')                   \
+    test_run:cmd('start server replica')                                       \
+end
+ | ---
+ | ...
+
+function stop_cluster()                                                        \
+    test_run:cmd('stop server replica')                                        \
+    test_run:cmd('stop server master')                                         \
+    test_run:cmd('delete server replica')                                      \
+    test_run:cmd('delete server master')                                       \
+end
+ | ---
+ | ...
+
+--
+-- Candidate leader.
+--
+boot_with_master_election_mode('candidate')
+ | ---
+ | ...
+
+test_run:switch('master')
+ | ---
+ | - true
+ | ...
+test_run:wait_cond(function() return not box.info.ro end)
+ | ---
+ | - true
+ | ...
+assert(box.info.election.state == 'leader')
+ | ---
+ | - true
+ | ...
+
+test_run:switch('replica')
+ | ---
+ | - true
+ | ...
+assert(box.info.ro)
+ | ---
+ | - true
+ | ...
+assert(box.info.election.state == 'follower')
+ | ---
+ | - true
+ | ...
+
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+stop_cluster()
+ | ---
+ | ...
diff --git a/test/replication/gh-6018-election-boot-voter.test.lua b/test/replication/gh-6018-election-boot-voter.test.lua
new file mode 100644
index 000000000..7222beb19
--- /dev/null
+++ b/test/replication/gh-6018-election-boot-voter.test.lua
@@ -0,0 +1,38 @@
+--
+-- gh-6018: in an auto-election cluster nodes with voter state could be selected
+-- as bootstrap leaders. They should not, because a voter can't be ever writable
+-- and it can neither boot itself nor register other nodes.
+--
+test_run = require('test_run').new()
+
+function boot_with_master_election_mode(mode)                                  \
+    test_run:cmd('create server master with '..                                \
+                 'script="replication/gh-6018-master.lua"')                    \
+    test_run:cmd('start server master with wait=False, args="'..mode..'"')     \
+    test_run:cmd('create server replica with '..                               \
+                 'script="replication/gh-6018-replica.lua"')                   \
+    test_run:cmd('start server replica')                                       \
+end
+
+function stop_cluster()                                                        \
+    test_run:cmd('stop server replica')                                        \
+    test_run:cmd('stop server master')                                         \
+    test_run:cmd('delete server replica')                                      \
+    test_run:cmd('delete server master')                                       \
+end
+
+--
+-- Candidate leader.
+--
+boot_with_master_election_mode('candidate')
+
+test_run:switch('master')
+test_run:wait_cond(function() return not box.info.ro end)
+assert(box.info.election.state == 'leader')
+
+test_run:switch('replica')
+assert(box.info.ro)
+assert(box.info.election.state == 'follower')
+
+test_run:switch('default')
+stop_cluster()
diff --git a/test/replication/gh-6018-master.lua b/test/replication/gh-6018-master.lua
new file mode 100644
index 000000000..1192204ff
--- /dev/null
+++ b/test/replication/gh-6018-master.lua
@@ -0,0 +1,17 @@
+#!/usr/bin/env tarantool
+
+require('console').listen(os.getenv('ADMIN'))
+
+box.cfg({
+    listen = 'unix/:./gh-6018-master.sock',
+    replication = {
+        'unix/:./gh-6018-master.sock',
+        'unix/:./gh-6018-replica.sock',
+    },
+    election_mode = arg[1],
+    instance_uuid = 'cbf06940-0790-498b-948d-042b62cf3d29',
+    replication_timeout = 0.1,
+})
+
+box.ctl.wait_rw()
+box.schema.user.grant('guest', 'super')
diff --git a/test/replication/gh-6018-replica.lua b/test/replication/gh-6018-replica.lua
new file mode 100644
index 000000000..71e669141
--- /dev/null
+++ b/test/replication/gh-6018-replica.lua
@@ -0,0 +1,15 @@
+#!/usr/bin/env tarantool
+
+require('console').listen(os.getenv('ADMIN'))
+
+box.cfg({
+    listen = 'unix/:./gh-6018-replica.sock',
+    replication = {
+        'unix/:./gh-6018-master.sock',
+        'unix/:./gh-6018-replica.sock',
+    },
+    election_mode = 'voter',
+    -- Smaller than master UUID.
+    instance_uuid = 'cbf06940-0790-498b-948d-042b62cf3d28',
+    replication_timeout = 0.1,
+})
diff --git a/test/replication/suite.cfg b/test/replication/suite.cfg
index 69f2f3511..2bfc3b845 100644
--- a/test/replication/suite.cfg
+++ b/test/replication/suite.cfg
@@ -45,6 +45,7 @@
     "gh-5536-wal-limit.test.lua": {},
     "gh-5566-final-join-synchro.test.lua": {},
     "gh-5613-bootstrap-prefer-booted.test.lua": {},
+    "gh-6018-election-boot-voter.test.lua": {},
     "gh-6027-applier-error-show.test.lua": {},
     "gh-6032-promote-wal-write.test.lua": {},
     "gh-6057-qsync-confirm-async-no-wal.test.lua": {},
-- 
2.24.3 (Apple Git-128)
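
For readers of the archive, the scoring change in src/box/replication.cc can be read as a strict priority order: each weight is larger than the sum of all weaker ones, so can_lead only decides the outcome when the three older criteria are equal. Below is a minimal standalone sketch of just that arithmetic. The names struct ballot_sketch and join_master_score() are hypothetical and are not part of Tarantool; only the four flag names and the weights are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the ballot fields the patch reads.
 * The real ballot structure in Tarantool has more members.
 */
struct ballot_sketch {
	bool is_booted;
	bool is_ro_cfg;
	bool is_ro;
	bool can_lead;
};

/*
 * Same weights as in the patched hunk of
 * replicaset_find_join_master(): 1000 > 100 + 10 + 1,
 * 100 > 10 + 1, 10 > 1, so a stronger criterion always wins
 * over any combination of weaker ones.
 */
static int
join_master_score(const struct ballot_sketch *ballot)
{
	int score = 0;
	if (ballot->is_booted)
		score += 1000;
	if (!ballot->is_ro_cfg)
		score += 100;
	if (!ballot->is_ro)
		score += 10;
	if (ballot->can_lead)
		score += 1;
	return score;
}
```

In the scenario from the commit message (nothing is booted, read_only = false everywhere, every node is read-only right now) a voter scores 100 and a candidate scores 101 under this sketch, so the candidate wins before the smallest-UUID rule gets a chance to pick the voter.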