From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Date: Tue, 22 May 2018 19:45:42 +0300
From: Konstantin Osipov
Subject: Re: [PATCH] replication: fix bug with read-only replica as a bootstrap leader
Message-ID: <20180522164542.GC28644@atlas>
References: <20180522154004.2278-1-k.belyavskiy@tarantool.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20180522154004.2278-1-k.belyavskiy@tarantool.org>
To: Konstantin Belyavskiy
Cc: vdavydov@tarantool.org, georgy@tarantool.org, tarantool-patches@freelists.org
List-ID:

* Konstantin Belyavskiy [18/05/22 18:41]:

It's OK to push. As a nitpick, I'd prefer extracting the leader search
loop into a routine, rather than adding a goto label.
I'm not crazy about goto, but I think in this case the code would
win from not having it.

> Another broken case. Adding a new replica to cluster:
> + if (replica->applier->remote_is_ro &&
> +     replica->applier->vclock.signature == 0)
> In this case we may got an ER_READONLY, since signature is not 0.
> So leader election now has two phases:
> 1. To select among read-write replicas.
> 2. If no such found, try old algorithm for backward compatibility
> (case then all replicas exist in cluster table).
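To make the nitpick concrete, here is a rough, self-contained sketch of the
two-phase election with the search loop pulled into a helper instead of a
goto label. It is not the actual patch: Replica, find_leader() and
replicaset_leader_sketch() are hypothetical stand-ins for struct replica and
replicaset_leader(), and the rule for choosing between two candidates is a
placeholder (highest signature wins), since that part of the function is not
visible in the quoted hunks.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for struct replica (assumption).
struct Replica {
    bool has_applier;   // models replica->applier != NULL
    bool remote_is_ro;  // models replica->applier->remote_is_ro
    int64_t signature;  // models replica->applier->vclock.signature
};

// One pass over the candidates; returns nullptr if nobody qualifies.
const Replica *
find_leader(const std::vector<Replica> &replicas, bool skip_ro)
{
    const Replica *leader = nullptr;
    for (const Replica &replica : replicas) {
        if (!replica.has_applier)
            continue;
        if (skip_ro && replica.remote_is_ro)
            continue;
        // Placeholder comparison (assumption): keep the first
        // candidate with the highest signature.
        if (leader == nullptr || replica.signature > leader->signature)
            leader = &replica;
    }
    return leader;
}

// Phase 1: read-write replicas only. Phase 2: fall back to the old
// behaviour and consider read-only replicas as well.
const Replica *
replicaset_leader_sketch(const std::vector<Replica> &replicas)
{
    const Replica *leader = find_leader(replicas, true);
    if (leader == nullptr)
        leader = find_leader(replicas, false);
    return leader;
}

void
example_usage()
{
    // Two read-only peers and one read-write peer: the read-write
    // peer wins even though its signature is the lowest.
    std::vector<Replica> peers = {
        {true, true, 10},
        {true, true, 7},
        {true, false, 0},
    };
    assert(replicaset_leader_sketch(peers) == &peers[2]);
}
```

With a helper like this the fallback is an ordinary second call, so the
skip_ro flag becomes a parameter and the label is not needed.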
>
> Closes #3257
> ---
> https://github.com/tarantool/tarantool/issues/3257
> https://github.com/tarantool/tarantool/tree/kbelyavs/gh-3257-fix-bug-with-read-only-as-a-leader
>
>  src/box/replication.cc                         | 22 +++++++++++++++++-----
>  test/replication/replica_uuid_ro3.lua          |  1 +
>  test/replication/replicaset_ro_mostly.result   | 20 ++++++++++++++++++++
>  test/replication/replicaset_ro_mostly.test.lua |  8 ++++++++
>  4 files changed, 46 insertions(+), 5 deletions(-)
>  create mode 120000 test/replication/replica_uuid_ro3.lua
>
> diff --git a/src/box/replication.cc b/src/box/replication.cc
> index 0b770c913..8dcf1f656 100644
> --- a/src/box/replication.cc
> +++ b/src/box/replication.cc
> @@ -691,16 +691,24 @@ struct replica *
>  replicaset_leader(void)
>  {
>  	struct replica *leader = NULL;
> +	bool skip_ro = true;
> +	/**
> +	 * Two loops, first prefers read-write replicas among others.
> +	 * Second for backward compatibility, if there is no such
> +	 * replicas at all.
> +	 */
> +loop:
>  	replicaset_foreach(replica) {
>  		if (replica->applier == NULL)
>  			continue;
>  		/**
> -		 * While bootstrapping a new cluster,
> -		 * read-only replicas shouldn't be considered
> -		 * as a leader.
> +		 * While bootstrapping a new cluster, read-only
> +		 * replicas shouldn't be considered as a leader.
> +		 * The only exception if there is no read-write
> +		 * replicas since there is still a possibility
> +		 * that all replicas exist in cluster table.
>  		 */
> -		if (replica->applier->remote_is_ro &&
> -		    replica->applier->vclock.signature == 0)
> +		if (skip_ro && replica->applier->remote_is_ro)
>  			continue;
>  		if (leader == NULL) {
>  			leader = replica;
> @@ -721,6 +729,10 @@ replicaset_leader(void)
>  			continue;
>  		leader = replica;
>  	}
> +	if (skip_ro && leader == NULL) {
> +		skip_ro = false;
> +		goto loop;
> +	}
>  	return leader;
>  }
>
> diff --git a/test/replication/replica_uuid_ro3.lua b/test/replication/replica_uuid_ro3.lua
> new file mode 120000
> index 000000000..342d71c57
> --- /dev/null
> +++ b/test/replication/replica_uuid_ro3.lua
> @@ -0,0 +1 @@
> +replica_uuid_ro.lua
> \ No newline at end of file
> diff --git a/test/replication/replicaset_ro_mostly.result b/test/replication/replicaset_ro_mostly.result
> index d753a182d..b9e8f1fe8 100644
> --- a/test/replication/replicaset_ro_mostly.result
> +++ b/test/replication/replicaset_ro_mostly.result
> @@ -53,6 +53,26 @@ create_cluster_uuid(SERVERS, UUID)
>  test_run:wait_fullmesh(SERVERS)
>  ---
>  ...
> +-- Add third replica
> +name = 'replica_uuid_ro3'
> +---
> +...
> +test_run:cmd(create_cluster_cmd1:format(name, name))
> +---
> +- true
> +...
> +test_run:cmd(create_cluster_cmd2:format(name, uuid.new()))
> +---
> +- true
> +...
> +test_run:cmd('switch replica_uuid_ro3')
> +---
> +- true
> +...
> +test_run:cmd('switch default')
> +---
> +- true
> +...
>  -- Cleanup.
>  test_run:drop_cluster(SERVERS)
>  ---
> diff --git a/test/replication/replicaset_ro_mostly.test.lua b/test/replication/replicaset_ro_mostly.test.lua
> index 539ca5a13..f2c2d0d11 100644
> --- a/test/replication/replicaset_ro_mostly.test.lua
> +++ b/test/replication/replicaset_ro_mostly.test.lua
> @@ -26,5 +26,13 @@ test_run:cmd("setopt delimiter ''");
>  -- Deploy a cluster.
>  create_cluster_uuid(SERVERS, UUID)
>  test_run:wait_fullmesh(SERVERS)
> +
> +-- Add third replica
> +name = 'replica_uuid_ro3'
> +test_run:cmd(create_cluster_cmd1:format(name, name))
> +test_run:cmd(create_cluster_cmd2:format(name, uuid.new()))
> +test_run:cmd('switch replica_uuid_ro3')
> +test_run:cmd('switch default')
> +
>  -- Cleanup.
>  test_run:drop_cluster(SERVERS)
> --
> 2.14.3 (Apple Git-98)

--
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov