From mboxrd@z Thu Jan 1 00:00:00 1970
From: Vladislav Shpilevoy via Tarantool-patches
Reply-To: Vladislav Shpilevoy
To: tarantool-patches@dev.tarantool.org, gorcunov@gmail.com, sergepetrenko@tarantool.org
Date: Sun, 18 Jul 2021 18:53:29 +0200
Subject: [Tarantool-patches] [PATCH v2 5/5] election: promote 'manual' bootstrap master
X-Mailer: git-send-email 2.24.3 (Apple Git-128)
List-Id: Tarantool development patches
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

A cluster may consist of only voters and manual nodes. This means
that on bootstrap nobody would be elected as a Raft leader
automatically to create the first snapshot and register the
others.

After the previous commit the manual nodes were preferred to be
bootstrap masters, but they couldn't do anything.

This patch makes the 'manual' bootstrap master promote itself for
one term so that it can boot the cluster.

Closes #6018
---
 .../unreleased/gh-6018-election-boot-voter.md |  4 ++
 src/box/box.cc                                | 19 +++++++-
 .../gh-6018-election-boot-voter.result        | 46 +++++++++++++++++++
 .../gh-6018-election-boot-voter.test.lua      | 21 +++++++++
 4 files changed, 88 insertions(+), 2 deletions(-)
 create mode 100644 changelogs/unreleased/gh-6018-election-boot-voter.md

diff --git a/changelogs/unreleased/gh-6018-election-boot-voter.md b/changelogs/unreleased/gh-6018-election-boot-voter.md
new file mode 100644
index 000000000..080484bbe
--- /dev/null
+++ b/changelogs/unreleased/gh-6018-election-boot-voter.md
@@ -0,0 +1,4 @@
+## bugfix/replication
+
+* Fixed a cluster sometimes being unable to bootstrap if it contains nodes with
+  `election_mode` `manual` or `voter` (gh-6018).
diff --git a/src/box/box.cc b/src/box/box.cc
index b828d3a31..8c10a99dd 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -3501,7 +3501,8 @@ box_cfg_xc(void)
 
 	if (!is_bootstrap_leader) {
 		replicaset_sync();
-	} else {
+	} else if (box_election_mode == ELECTION_MODE_CANDIDATE ||
+		   box_election_mode == ELECTION_MODE_MANUAL) {
 		/*
 		 * When the cluster is just bootstrapped and this instance is a
 		 * leader, it makes no sense to wait for a leader appearance.
@@ -3509,7 +3510,21 @@ box_cfg_xc(void)
 		 * should take the control over the situation and start a new
 		 * term immediately.
 		 */
-		raft_new_term(raft);
+		raft_promote(raft);
+		int rc = box_raft_wait_term_outcome();
+		if (rc == 0 && raft->leader != instance_id) {
+			/*
+			 * It was promoted and is a single registered node -
+			 * there can't be another leader or a new term bump.
+			 */
+			panic("Bootstrap master couldn't elect self as a "
+			      "leader. Leader is %u, term is %llu",
+			      raft->leader, (long long)raft->volatile_term);
+		}
+		if (rc != 0) {
+			raft_restore(raft);
+			diag_raise();
+		}
 	}
 
 	/* box.cfg.read_only is not read yet. */
diff --git a/test/replication/gh-6018-election-boot-voter.result b/test/replication/gh-6018-election-boot-voter.result
index 6b05f0825..1b7949bb9 100644
--- a/test/replication/gh-6018-election-boot-voter.result
+++ b/test/replication/gh-6018-election-boot-voter.result
@@ -4,6 +4,11 @@
 -- as bootstrap leaders. They should not, because a voter can't be ever writable
 -- and it can neither boot itself nor register other nodes.
 --
+-- Similar situation was with the manual election. All instances might have
+-- manual election mode. Such a cluster wouldn't be able to boot if their
+-- bootstrap master wouldn't become an elected leader automatically at least
+-- once.
+--
 test_run = require('test_run').new()
  | ---
  | ...
@@ -68,3 +73,44 @@ test_run:switch('default')
 stop_cluster()
  | ---
  | ...
+
+--
+-- Manual leader.
+--
+boot_with_master_election_mode('manual')
+ | ---
+ | ...
+
+test_run:switch('master')
+ | ---
+ | - true
+ | ...
+test_run:wait_cond(function() return not box.info.ro end)
+ | ---
+ | - true
+ | ...
+assert(box.info.election.state == 'leader')
+ | ---
+ | - true
+ | ...
+
+test_run:switch('replica')
+ | ---
+ | - true
+ | ...
+assert(box.info.ro)
+ | ---
+ | - true
+ | ...
+assert(box.info.election.state == 'follower')
+ | ---
+ | - true
+ | ...
+
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+stop_cluster()
+ | ---
+ | ...
diff --git a/test/replication/gh-6018-election-boot-voter.test.lua b/test/replication/gh-6018-election-boot-voter.test.lua
index 7222beb19..fb08e2bc8 100644
--- a/test/replication/gh-6018-election-boot-voter.test.lua
+++ b/test/replication/gh-6018-election-boot-voter.test.lua
@@ -3,6 +3,11 @@
 -- as bootstrap leaders. They should not, because a voter can't be ever writable
 -- and it can neither boot itself nor register other nodes.
 --
+-- Similar situation was with the manual election. All instances might have
+-- manual election mode. Such a cluster wouldn't be able to boot if their
+-- bootstrap master wouldn't become an elected leader automatically at least
+-- once.
+--
 test_run = require('test_run').new()
 
 function boot_with_master_election_mode(mode)                                   \
@@ -36,3 +41,19 @@ assert(box.info.election.state == 'follower')
 
 test_run:switch('default')
 stop_cluster()
+
+--
+-- Manual leader.
+--
+boot_with_master_election_mode('manual')
+
+test_run:switch('master')
+test_run:wait_cond(function() return not box.info.ro end)
+assert(box.info.election.state == 'leader')
+
+test_run:switch('replica')
+assert(box.info.ro)
+assert(box.info.election.state == 'follower')
+
+test_run:switch('default')
+stop_cluster()
-- 
2.24.3 (Apple Git-128)
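
For readers who want the control flow of the box.cc change in isolation,
here is a minimal standalone sketch. It is a toy model, not Tarantool code:
ToyRaft, toy_promote() and bootstrap_leader_step() are hypothetical names
invented for illustration, and the promotion is reduced to "bump the term
and become leader", which only holds under the patch's stated assumption
that the bootstrap master is the single registered node. The real
raft_promote(), box_raft_wait_term_outcome() and raft_restore() are not
modelled here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only; not Tarantool's real types.
enum class ElectionMode { Off, Voter, Manual, Candidate };

struct ToyRaft {
	explicit ToyRaft(uint32_t id) : self_id(id), leader(0), term(1) {}
	uint32_t self_id;
	uint32_t leader; // 0 means "no leader known"
	uint64_t term;
};

// Toy promotion: start a new term and win it. With one registered node the
// node's own vote is already a majority, so no other outcome is modelled.
static void
toy_promote(ToyRaft &raft)
{
	++raft.term;
	raft.leader = raft.self_id;
}

// Mirrors the shape of the patched branch: only 'candidate' and 'manual'
// bootstrap masters promote themselves; other modes do nothing here.
// Returns true if the node ended up as the leader.
static bool
bootstrap_leader_step(ToyRaft &raft, ElectionMode mode)
{
	if (mode != ElectionMode::Candidate && mode != ElectionMode::Manual)
		return false;
	toy_promote(raft);
	// Plays the role of the panic() in the patch: a single registered
	// node that was promoted must see itself as the leader.
	assert(raft.leader == raft.self_id);
	return true;
}
```

Calling bootstrap_leader_step() on a fresh ToyRaft with ElectionMode::Manual
leaves the node as leader of term 2, while ElectionMode::Voter leaves it
untouched, which matches the intent described in the commit message.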