From mboxrd@z Thu Jan 1 00:00:00 1970
From: Vladislav Shpilevoy
Date: Mon, 2 Mar 2020 22:26:51 +0100
Subject: Re: [Tarantool-patches] [PATCH] replication: fix rebootstrap in case the instance is listed in box.cfg.replication
In-Reply-To: <1583169513.32666235@f144.i.mail.ru>
References: <20200228170130.81713-1-sergepetrenko@tarantool.org>
 <1582969950.401106116@f137.i.mail.ru>
 <41560ae6-e2fb-1b7e-32d0-d596e73db8c5@tarantool.org>
 <1583169513.32666235@f144.i.mail.ru>
List-Id: Tarantool development patches
To: Serge Petrenko
Cc: kirichenkoga@gmail.com, tarantool-patches@dev.tarantool.org

LGTM.

On 02/03/2020 18:18, Serge Petrenko wrote:
>
> Saturday, February 29, 2020, 17:19 +03:00 from Vladislav Shpilevoy:
>
> On 29/02/2020 10:52, Serge Petrenko wrote:
> > Hi!
> >
> > Saturday, February 29, 2020, 2:43 +03:00 from Vladislav Shpilevoy:
> >
> > Thanks for the patch!
> >
> > On 28/02/2020 18:01, Serge Petrenko wrote:
> > > When checking whether rejoin is needed, the replica loops through all
> > > the instances in box.cfg.replication, which makes it believe that
> > > there is a master holding the files it needs, since it counts itself
> > > just like all the other instances.
> > > So make the replica skip itself when looking for an instance that
> > > holds the files it needs and when determining whether rebootstrap is
> > > needed.
> > >
> > > We already have a test for the issue, but it missed the bug due to
> > > the replication settings in replica.lua. Fix replica.lua to optionally
> > > include itself in box.cfg.replication, so that the corresponding test
> > > works correctly.
> > >
> > > Closes #4759
> > > ---
> > > https://github.com/tarantool/tarantool/issues/4759
> > > https://github.com/tarantool/tarantool/tree/sp/gh-4759-rebootstrap-fix
> > >
> > > @ChangeLog
> > > - fix rebootstrap procedure not working in case the replica itself
> > >   is listed in `box.cfg.replication`
> > >
> > >  src/box/replication.cc                   | 13 ++++++++++++-
> > >  test/replication/replica.lua             | 11 ++++++++++-
> > >  test/replication/replica_rejoin.result   | 12 ++++++------
> > >  test/replication/replica_rejoin.test.lua | 12 ++++++------
> > >  4 files changed, 34 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/src/box/replication.cc b/src/box/replication.cc
> > > index e7bfa22ab..01edc0fb2 100644
> > > --- a/src/box/replication.cc
> > > +++ b/src/box/replication.cc
> > > @@ -768,8 +768,19 @@ replicaset_needs_rejoin(struct replica **master)
> > >  	struct replica *leader = NULL;
> > >  	replicaset_foreach(replica) {
> > >  		struct applier *applier = replica->applier;
> > > -		if (applier == NULL)
> > > +		/*
> > > +		 * The function is called right after
> > > +		 * box_sync_replication(), which in turn calls
> > > +		 * replicaset_connect(), which ensures that
> > > +		 * appliers are either stopped (APPLIER_OFF) or
> > > +		 * connected.
> > > +		 * Also ignore self, as self applier might not
> > > +		 * have disconnected yet.
> > > +		 */
> > > +		if (applier == NULL || applier->state == APPLIER_OFF ||
> > > +		    tt_uuid_is_equal(&replica->uuid, &INSTANCE_UUID))
> > > +			continue;
> > > +		assert(applier->state == APPLIER_CONNECTED);
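The skip-self check in the hunk above boils down to comparing each
configured peer's UUID with the instance's own one. A minimal
standalone sketch of that pattern (the struct layout, list traversal,
and all "toy_" names are invented for illustration; this is not the
real tree code):

#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for struct tt_uuid: 16 opaque bytes. */
struct toy_uuid {
	unsigned char b[16];
};

static bool
toy_uuid_is_equal(const struct toy_uuid *a, const struct toy_uuid *b)
{
	return memcmp(a->b, b->b, sizeof(a->b)) == 0;
}

struct toy_replica {
	struct toy_uuid uuid;
	void *applier;		/* NULL when the peer was never connected. */
	struct toy_replica *next;
};

/* UUID of the local instance; stands in for INSTANCE_UUID. */
static struct toy_uuid toy_instance_uuid;

/*
 * Walk the configured replicas the way the fixed
 * replicaset_needs_rejoin() does: skip entries that have no applier
 * and skip the local instance itself, so that our own gc state is
 * never mistaken for "a master still stores the xlogs we need".
 */
static void
toy_check_peers(struct toy_replica *head)
{
	for (struct toy_replica *r = head; r != NULL; r = r->next) {
		if (r->applier == NULL ||
		    toy_uuid_is_equal(&r->uuid, &toy_instance_uuid))
			continue;
		/* ... here the real code consults the peer's ballot ... */
	}
}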
> >
> > Could you please help me understand one thing? Below I see this:
> >
> > > 	const struct ballot *ballot = &applier->ballot;
> > > 	if (vclock_compare(&ballot->gc_vclock,
> > > 			   &replicaset.vclock) <= 0) {
> > > 		/*
> > > 		 * There's at least one master that still stores
> > > 		 * WALs needed by this instance. Proceed to local
> > > 		 * recovery.
> > > 		 */
> > > 		return false;
> > > 	}
> >
> > The question is: why do we need a rebootstrap if some remote node's
> > vclock is bigger than ours? It does not mean anything. It doesn't
> > say whether that remote instance still keeps any xlogs. All it
> > tells us is that the remote instance committed some more data since
> > our restart.
> >
> > In the piece above we check the master's gc vclock, which is the
> > vclock of the oldest WAL the instance still has. If the master's gc
> > vclock is less than or equal to our vclock, no rebootstrap is needed:
> > the master will be able to send us all the fresh data.
> > Later we check whether our vclock is greater than, or incompatible
> > with, the remote instance's vclock. If it is, we cannot rebootstrap
> > from that instance, since we have some rows the instance doesn't have.
>
> Thanks for the explanation. Then I would change the comment to say
> more about the issue. Currently it only says this about self:
>
>     "Also ignore self, as self applier might not have disconnected yet."
>
> This does not look related to the actual problem, which is that we
> simply should not look at our own gc vclock.
>
> Ok, I changed the comment. The diff is below.
>
> Also, why do we need the 'applier->state == APPLIER_OFF' check? How is
> it related to the issue? If this is a separate bug, is it possible to
> make it a separate patch?
>
> We don't need this check after all. replicaset_update() at the end of
> replicaset_connect() ensures that all the replicas in the replicaset
> have appliers in state APPLIER_CONNECTED.
> Thanks for pointing this out.
>
> > diff --git a/src/box/replication.cc b/src/box/replication.cc
> > index 01edc0fb2..1345f189b 100644
> > --- a/src/box/replication.cc
> > +++ b/src/box/replication.cc
> > @@ -769,18 +769,12 @@ replicaset_needs_rejoin(struct replica **master)
> >  	replicaset_foreach(replica) {
> >  		struct applier *applier = replica->applier;
> >  		/*
> > -		 * The function is called right after
> > -		 * box_sync_replication(), which in turn calls
> > -		 * replicaset_connect(), which ensures that
> > -		 * appliers are either stopped (APPLIER_OFF) or
> > -		 * connected.
> > -		 * Also ignore self, as self applier might not
> > -		 * have disconnected yet.
> > +		 * Skip the local instance, we shouldn't perform a
> > +		 * check against our own gc vclock.
> >  		 */
> > -		if (applier == NULL || applier->state == APPLIER_OFF ||
> > -		    tt_uuid_is_equal(&replica->uuid, &INSTANCE_UUID))
> > +		if (applier == NULL || tt_uuid_is_equal(&replica->uuid,
> > +							&INSTANCE_UUID))
> >  			continue;
> > -		assert(applier->state == APPLIER_CONNECTED);
> >
> >  		const struct ballot *ballot = &applier->ballot;
> >
> --
> Serge Petrenko
>
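As a footnote to the gc vclock discussion above: the "no rebootstrap
needed" decision rests on a componentwise logical-clock comparison. A
toy model of that comparison (TOY_VCLOCK_SIZE and all names here are
invented; the real API lives in src/box/vclock.h and differs in
detail):

#include <stdio.h>

enum { TOY_VCLOCK_SIZE = 3 };	/* toy size; real vclocks track more ids */

/*
 * Componentwise comparison of two logical clocks: -1 if a <= b and
 * a != b, 0 if equal, 1 if a >= b and a != b, and 2 if the clocks
 * are incomparable. A stand-in for the real vclock_compare().
 */
static int
toy_vclock_compare(const long a[], const long b[])
{
	int le = 1, ge = 1;
	for (int i = 0; i < TOY_VCLOCK_SIZE; i++) {
		if (a[i] > b[i])
			le = 0;
		if (a[i] < b[i])
			ge = 0;
	}
	if (le && ge)
		return 0;
	if (le)
		return -1;
	if (ge)
		return 1;
	return 2;
}

int
main(void)
{
	/* Master's gc vclock: the oldest rows it can still replay. */
	long gc_vclock[TOY_VCLOCK_SIZE] = {5, 0, 2};
	/* Our vclock after local recovery. */
	long ours[TOY_VCLOCK_SIZE] = {7, 1, 2};

	/*
	 * gc_vclock <= ours means the master still has every xlog we
	 * are missing, so local recovery plus follow-up replication
	 * is enough and no rebootstrap is needed.
	 */
	if (toy_vclock_compare(gc_vclock, ours) <= 0)
		printf("no rebootstrap needed\n");
	else
		printf("rebootstrap required\n");
	return 0;
}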