[Tarantool-patches] [PATCH] replication: fix rebootstrap in case the instance is listed in box.cfg.replication
Vladislav Shpilevoy
v.shpilevoy at tarantool.org
Tue Mar 3 00:26:51 MSK 2020
LGTM.
On 02/03/2020 18:18, Serge Petrenko wrote:
>
>
> Saturday, February 29, 2020, 17:19 +03:00 from Vladislav Shpilevoy <v.shpilevoy at tarantool.org>:
>
> On 29/02/2020 10:52, Serge Petrenko wrote:
> > Hi!
> >
> > Saturday, February 29, 2020, 2:43 +03:00 from Vladislav Shpilevoy <v.shpilevoy at tarantool.org>:
> >
> > Thanks for the patch!
> >
> > On 28/02/2020 18:01, Serge Petrenko wrote:
> > > When checking whether rejoin is needed, the replica loops through all the
> > > instances in box.cfg.replication, which makes it believe that there is a
> > > master holding the files it needs, since it counts itself just like
> > > all other instances.
> > > So make the replica skip itself when looking for an instance that holds
> > > the files it needs and when determining whether rebootstrap is needed.
> > >
> > > We already have a test for the issue, but it missed the bug due to the
> > > replica.lua replication settings. Fix replica.lua to optionally include
> > > the replica itself in box.cfg.replication, so that the corresponding test
> > > works correctly.
> > >
> > > Closes #4759
> > > ---
> > > https://github.com/tarantool/tarantool/issues/4759
> > > https://github.com/tarantool/tarantool/tree/sp/gh-4759-rebootstrap-fix
> > >
> > > @ChangeLog
> > > - fix the rebootstrap procedure not working when the replica itself
> > > is listed in `box.cfg.replication`
> > >
> > > src/box/replication.cc | 13 ++++++++++++-
> > > test/replication/replica.lua | 11 ++++++++++-
> > > test/replication/replica_rejoin.result | 12 ++++++------
> > > test/replication/replica_rejoin.test.lua | 12 ++++++------
> > > 4 files changed, 34 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/src/box/replication.cc b/src/box/replication.cc
> > > index e7bfa22ab..01edc0fb2 100644
> > > --- a/src/box/replication.cc
> > > +++ b/src/box/replication.cc
> > > @@ -768,8 +768,19 @@ replicaset_needs_rejoin(struct replica **master)
> > >  	struct replica *leader = NULL;
> > >  	replicaset_foreach(replica) {
> > >  		struct applier *applier = replica->applier;
> > > -		if (applier == NULL)
> > > +		/*
> > > +		 * The function is called right after
> > > +		 * box_sync_replication(), which in turn calls
> > > +		 * replicaset_connect(), which ensures that
> > > +		 * appliers are either stopped (APPLIER OFF) or
> > > +		 * connected.
> > > +		 * Also ignore self, as self applier might not
> > > +		 * have disconnected yet.
> > > +		 */
> > > +		if (applier == NULL || applier->state == APPLIER_OFF ||
> > > +		    tt_uuid_is_equal(&replica->uuid, &INSTANCE_UUID))
> > >  			continue;
> > > +		assert(applier->state == APPLIER_CONNECTED);
> > >
> >
> > Could you please help me understand one thing? Below I see this:
> >
> > > 		const struct ballot *ballot = &applier->ballot;
> > > 		if (vclock_compare(&ballot->gc_vclock,
> > > 				   &replicaset.vclock) <= 0) {
> > > 			/*
> > > 			 * There's at least one master that still stores
> > > 			 * WALs needed by this instance. Proceed to local
> > > 			 * recovery.
> > > 			 */
> > > 			return false;
> > > 		}
> >
> > The question is: why do we need rebootstrap if some remote node's
> > vclock is bigger than ours? That does not mean anything. It doesn't
> > say whether the remote instance still keeps any xlogs. All it
> > tells us is that the remote instance committed some more data since
> > our restart.
> >
> >
> > In the piece above we check the master’s gc vclock, which is the vclock of the
> > oldest WAL the instance still has. If the master’s gc vclock is not greater than
> > our vclock, no rebootstrap is needed: the master will be able to send us all the
> > fresh data. Later we check whether our vclock is greater than or incomparable with
> > the remote instance’s vclock. If it is, we cannot rebootstrap from that instance,
> > since we have some rows that the instance doesn’t have.
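To make that decision concrete, here is a minimal self-contained sketch of the per-master check described above. It uses toy two-component vclocks and made-up names (toy_vclock_compare, check_master), not Tarantool's real struct vclock API:

#include <stdio.h>

#define NCOMP 2

/*
 * Toy componentwise comparison: -1 if a < b, 0 if equal, 1 if a > b,
 * 2 if the vclocks are incomparable.
 */
static int
toy_vclock_compare(const long a[NCOMP], const long b[NCOMP])
{
	int le = 1, ge = 1;
	for (int i = 0; i < NCOMP; i++) {
		if (a[i] > b[i])
			le = 0;
		if (a[i] < b[i])
			ge = 0;
	}
	if (le && ge)
		return 0;
	if (le)
		return -1;
	if (ge)
		return 1;
	return 2;
}

/*
 * Per-master verdict:
 *   0 - the master still has all WALs we need, local recovery is enough;
 *   1 - the master purged WALs we need but has everything we have,
 *       so it is a valid rebootstrap source;
 *  -1 - we have rows the master lacks, it cannot be a rebootstrap source.
 */
static int
check_master(const long gc_vclock[NCOMP], const long master_vclock[NCOMP],
	     const long self_vclock[NCOMP])
{
	/* The oldest WAL kept by the master still covers our position. */
	if (toy_vclock_compare(gc_vclock, self_vclock) <= 0)
		return 0;
	/* Our vclock is greater than or incomparable with the master's. */
	if (toy_vclock_compare(self_vclock, master_vclock) > 0)
		return -1;
	return 1;
}

int
main(void)
{
	long self[NCOMP] = {42, 0};
	long gc[NCOMP] = {100, 0};
	long master[NCOMP] = {150, 0};
	/* The master purged WALs past our position: rebootstrap is needed. */
	printf("%d\n", check_master(gc, master, self)); /* prints 1 */
	return 0;
}

The real replicaset_needs_rejoin() runs this kind of check for every configured master and additionally remembers the most suitable one as the rejoin leader; the sketch only shows the verdict for a single master.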
>
> Thanks for the explanation. Then I would change the comment to say
> more about the issue. Currently it only talks about self:
>
> "Also ignore self, as self applier might not have disconnected yet."
>
> This does not look related to the actual problem, which is that we just
> should not look at our own gc vclock.
>
>
> Ok, I changed the comment. The diff is below.
>
>
>
> Also, why do we need the 'applier->state == APPLIER_OFF' check? How is it
> related to the issue? If this is a separate bug, is it possible to
> make it a separate patch?
>
>
> We don’t need this check after all. replicaset_update() at the end of replicaset_connect()
> ensures that all the replicas in the replicaset have their appliers in the
> APPLIER_CONNECTED state. Thanks for pointing this out.
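As a side note, the invariant relied on here can be illustrated with a tiny self-contained sketch (toy types and names, not Tarantool's real applier/replicaset structures): once the connect step leaves every attached applier connected, an extra filter on the OFF state in the rejoin loop can never match.

#include <assert.h>
#include <stddef.h>

enum toy_applier_state { TOY_APPLIER_OFF, TOY_APPLIER_CONNECTED };

struct toy_applier {
	enum toy_applier_state state;
};

struct toy_replica {
	struct toy_applier *applier; /* NULL when there is no applier */
};

/* Mimics the post-connect guarantee: every attached applier is connected. */
static void
toy_connect_all(struct toy_replica *set, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (set[i].applier != NULL)
			set[i].applier->state = TOY_APPLIER_CONNECTED;
}

int
main(void)
{
	struct toy_applier a = { TOY_APPLIER_OFF };
	struct toy_replica set[2] = { { &a }, { NULL } };
	toy_connect_all(set, 2);
	for (size_t i = 0; i < 2; i++) {
		if (set[i].applier == NULL)
			continue;
		/* An additional check for the OFF state could never fire here. */
		assert(set[i].applier->state == TOY_APPLIER_CONNECTED);
	}
	return 0;
}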
>
>
> diff --git a/src/box/replication.cc b/src/box/replication.cc
> index 01edc0fb2..1345f189b 100644
> --- a/src/box/replication.cc
> +++ b/src/box/replication.cc
> @@ -769,18 +769,12 @@ replicaset_needs_rejoin(struct replica **master)
>  	replicaset_foreach(replica) {
>  		struct applier *applier = replica->applier;
>  		/*
> -		 * The function is called right after
> -		 * box_sync_replication(), which in turn calls
> -		 * replicaset_connect(), which ensures that
> -		 * appliers are either stopped (APPLIER OFF) or
> -		 * connected.
> -		 * Also ignore self, as self applier might not
> -		 * have disconnected yet.
> +		 * Skip the local instance, we shouldn't perform a
> +		 * check against our own gc vclock.
>  		 */
> -		if (applier == NULL || applier->state == APPLIER_OFF ||
> -		    tt_uuid_is_equal(&replica->uuid, &INSTANCE_UUID))
> +		if (applier == NULL || tt_uuid_is_equal(&replica->uuid,
> +							&INSTANCE_UUID))
>  			continue;
> -		assert(applier->state == APPLIER_CONNECTED);
> 
>  		const struct ballot *ballot = &applier->ballot;
>
>
> --
> Serge Petrenko
>