To: Vladislav Shpilevoy, tarantool-patches@dev.tarantool.org, gorcunov@gmail.com
References: <3bce2555-0fb2-4edf-3373-d068d99d7309@tarantool.org>
In-Reply-To: <3bce2555-0fb2-4edf-3373-d068d99d7309@tarantool.org>
Date: Thu, 10 Jun 2021 17:17:41 +0300
Subject: Re: [Tarantool-patches] [PATCH 7/6] raft: test join to a raft cluster
List-Id: Tarantool development patches
From: Serge Petrenko via Tarantool-patches
Reply-To: Serge Petrenko

06.06.2021 20:03, Vladislav Shpilevoy wrote:
> There was a bug that a new replica at join to a Raft cluster
> sometimes tried to register on a non-leader node which couldn't
> write to _cluster, so the join failed with ER_READONLY error.
>
> Now in scope of #5613 the algorithm of join-master selection is
> changed. A new node looks for writable members of the cluster to
> use a join-master. It will not choose a follower if there is a
> leader.
>
> Closes #6127

Thanks for working on this!
LGTM.

> ---
>  .../unreleased/gh-6127-raft-join-new.md       |   4 +
>  test/replication/gh-6127-master1.lua          |  15 +++
>  test/replication/gh-6127-master2.lua          |  13 +++
>  test/replication/gh-6127-raft-join-new.result | 105 ++++++++++++++++++
>  .../gh-6127-raft-join-new.test.lua            |  41 +++++++
>  test/replication/gh-6127-replica.lua          |   9 ++
>  6 files changed, 187 insertions(+)
>  create mode 100644 changelogs/unreleased/gh-6127-raft-join-new.md
>  create mode 100644 test/replication/gh-6127-master1.lua
>  create mode 100644 test/replication/gh-6127-master2.lua
>  create mode 100644 test/replication/gh-6127-raft-join-new.result
>  create mode 100644 test/replication/gh-6127-raft-join-new.test.lua
>  create mode 100644 test/replication/gh-6127-replica.lua
>
> diff --git a/changelogs/unreleased/gh-6127-raft-join-new.md b/changelogs/unreleased/gh-6127-raft-join-new.md
> new file mode 100644
> index 000000000..a2d898df0
> --- /dev/null
> +++ b/changelogs/unreleased/gh-6127-raft-join-new.md
> @@ -0,0 +1,4 @@
> +## bugfix/raft
> +
> +* Fixed an error when a new replica in a Raft cluster could try to join from a
> +  follower instead of a leader and failed with an error `ER_READONLY` (gh-6127).
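
For readers of the archive, the selection rule from the commit message could be sketched roughly like this. It is only an illustration under stated assumptions: `pick_join_master`, `is_better` and the `uuid`/`is_ro` fields are made-up names, not the actual code from #5613, and the vclock comparison is left out because the test makes the vclocks equal before the replica joins.

```lua
-- Illustrative sketch only: the function and field names are hypothetical
-- and do not come from the Tarantool source.

-- Returns true if candidate `a` should be preferred over candidate `b`.
local function is_better(a, b)
    -- A writable node always beats a read-only one.
    if a.is_ro ~= b.is_ro then
        return not a.is_ro
    end
    -- Otherwise fall back to the smaller UUID (plain string comparison).
    return a.uuid < b.uuid
end

-- Picks the preferred candidate from a list, or nil if the list is empty.
local function pick_join_master(candidates)
    local best = nil
    for _, node in ipairs(candidates) do
        if best == nil or is_better(node, best) then
            best = node
        end
    end
    return best
end

-- The situation from the test: master1 has the smaller UUID but is read-only.
local master1 = {uuid = '10f9828d-b5d5-46a9-b698-ddac7cce5e27', is_ro = true}
local master2 = {uuid = '20f9828d-b5d5-46a9-b698-ddac7cce5e27', is_ro = false}
local chosen = pick_join_master({master1, master2})
print(chosen.uuid) -- prints 20f9828d-b5d5-46a9-b698-ddac7cce5e27
```

With UUID-only selection the sketch would pick master1 and reproduce the ER_READONLY failure; checking writability first is what moves the choice to master2.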
> diff --git a/test/replication/gh-6127-master1.lua b/test/replication/gh-6127-master1.lua
> new file mode 100644
> index 000000000..708574322
> --- /dev/null
> +++ b/test/replication/gh-6127-master1.lua
> @@ -0,0 +1,15 @@
> +#!/usr/bin/env tarantool
> +
> +require('console').listen(os.getenv('ADMIN'))
> +box.cfg({
> +    listen = 'unix/:./master1.sock',
> +    replication = {
> +        'unix/:./master1.sock',
> +        'unix/:./master2.sock'
> +    },
> +    election_mode = 'candidate',
> +    election_timeout = 0.1,
> +    instance_uuid = '10f9828d-b5d5-46a9-b698-ddac7cce5e27',
> +})
> +box.ctl.wait_rw()
> +box.schema.user.grant('guest', 'super')
> diff --git a/test/replication/gh-6127-master2.lua b/test/replication/gh-6127-master2.lua
> new file mode 100644
> index 000000000..1851070c7
> --- /dev/null
> +++ b/test/replication/gh-6127-master2.lua
> @@ -0,0 +1,13 @@
> +#!/usr/bin/env tarantool
> +
> +require('console').listen(os.getenv('ADMIN'))
> +box.cfg({
> +    listen = 'unix/:./master2.sock',
> +    replication = {
> +        'unix/:./master1.sock',
> +        'unix/:./master2.sock'
> +    },
> +    election_mode = 'voter',
> +    election_timeout = 0.1,
> +    instance_uuid = '20f9828d-b5d5-46a9-b698-ddac7cce5e27',
> +})
> diff --git a/test/replication/gh-6127-raft-join-new.result b/test/replication/gh-6127-raft-join-new.result
> new file mode 100644
> index 000000000..be6f8489b
> --- /dev/null
> +++ b/test/replication/gh-6127-raft-join-new.result
> @@ -0,0 +1,105 @@
> +-- test-run result file version 2
> +test_run = require('test_run').new()
> + | ---
> + | ...
> +
> +--
> +-- gh-6127: the algorithm selecting a node from which to join to a replicaset
> +-- should take into account who is the leader (is writable and can write to
> +-- _cluster) and who is a follower/candidate.
> +--
> +test_run:cmd('create server master1 with script="replication/gh-6127-master1.lua"')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('start server master1 with wait=False')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('create server master2 with script="replication/gh-6127-master2.lua"')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('start server master2')
> + | ---
> + | - true
> + | ...
> +
> +test_run:switch('master1')
> + | ---
> + | - true
> + | ...
> +box.cfg{election_mode = 'voter'}
> + | ---
> + | ...
> +test_run:switch('master2')
> + | ---
> + | - true
> + | ...
> +-- Perform manual election because it is faster - the automatic one still tries
> +-- to wait for 'death timeout' first which is several seconds.
> +box.cfg{ \
> +    election_mode = 'manual', \
> +    election_timeout = 0.1, \
> +}
> + | ---
> + | ...
> +box.ctl.promote()
> + | ---
> + | ...
> +box.ctl.wait_rw()
> + | ---
> + | ...
> +-- Make sure the other node received the promotion row. Vclocks now should be
> +-- equal so the new node would select only using read-only state and min UUID.
> +test_run:wait_lsn('master1', 'master2')
> + | ---
> + | ...
> +
> +-- Min UUID is master1, but it is not writable. Therefore must join from
> +-- master2.
> +test_run:cmd('create server replica with script="replication/gh-6127-replica.lua"')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('start server replica')
> + | ---
> + | - true
> + | ...
> +test_run:switch('replica')
> + | ---
> + | - true
> + | ...
> +assert(box.info.leader ~= 0)
> + | ---
> + | - true
> + | ...
> +
> +test_run:switch('default')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('stop server replica')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('delete server replica')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('stop server master2')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('delete server master2')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('stop server master1')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('delete server master1')
> + | ---
> + | - true
> + | ...
> diff --git a/test/replication/gh-6127-raft-join-new.test.lua b/test/replication/gh-6127-raft-join-new.test.lua
> new file mode 100644
> index 000000000..3e0e9f226
> --- /dev/null
> +++ b/test/replication/gh-6127-raft-join-new.test.lua
> @@ -0,0 +1,41 @@
> +test_run = require('test_run').new()
> +
> +--
> +-- gh-6127: the algorithm selecting a node from which to join to a replicaset
> +-- should take into account who is the leader (is writable and can write to
> +-- _cluster) and who is a follower/candidate.
> +--
> +test_run:cmd('create server master1 with script="replication/gh-6127-master1.lua"')
> +test_run:cmd('start server master1 with wait=False')
> +test_run:cmd('create server master2 with script="replication/gh-6127-master2.lua"')
> +test_run:cmd('start server master2')
> +
> +test_run:switch('master1')
> +box.cfg{election_mode = 'voter'}
> +test_run:switch('master2')
> +-- Perform manual election because it is faster - the automatic one still tries
> +-- to wait for 'death timeout' first which is several seconds.
> +box.cfg{ \
> +    election_mode = 'manual', \
> +    election_timeout = 0.1, \
> +}
> +box.ctl.promote()
> +box.ctl.wait_rw()
> +-- Make sure the other node received the promotion row. Vclocks now should be
> +-- equal so the new node would select only using read-only state and min UUID.
> +test_run:wait_lsn('master1', 'master2')
> +
> +-- Min UUID is master1, but it is not writable. Therefore must join from
> +-- master2.
> +test_run:cmd('create server replica with script="replication/gh-6127-replica.lua"')
> +test_run:cmd('start server replica')
> +test_run:switch('replica')
> +assert(box.info.leader ~= 0)
> +
> +test_run:switch('default')
> +test_run:cmd('stop server replica')
> +test_run:cmd('delete server replica')
> +test_run:cmd('stop server master2')
> +test_run:cmd('delete server master2')
> +test_run:cmd('stop server master1')
> +test_run:cmd('delete server master1')
> diff --git a/test/replication/gh-6127-replica.lua b/test/replication/gh-6127-replica.lua
> new file mode 100644
> index 000000000..9f4c35ecd
> --- /dev/null
> +++ b/test/replication/gh-6127-replica.lua
> @@ -0,0 +1,9 @@
> +#!/usr/bin/env tarantool
> +
> +require('console').listen(os.getenv('ADMIN'))
> +box.cfg({
> +    replication = {
> +        'unix/:./master1.sock',
> +        'unix/:./master2.sock'
> +    },
> +})

-- 
Serge Petrenko