[Tarantool-patches] [PATCH] replication: fix rebootstrap in case the instance is listed in box.cfg.replication
Serge Petrenko
sergepetrenko at tarantool.org
Mon Mar 2 20:18:33 MSK 2020
>Суббота, 29 февраля 2020, 17:19 +03:00 от Vladislav Shpilevoy <v.shpilevoy at tarantool.org>:
>
>On 29/02/2020 10:52, Serge Petrenko wrote:
>> Hi!
>>
>> Суббота, 29 февраля 2020, 2:43 +03:00 от Vladislav Shpilevoy < v.shpilevoy at tarantool.org >:
>>
>> Thanks for the patch!
>>
>> On 28/02/2020 18:01, Serge Petrenko wrote:
>> > When checking wheter rejoin is needed, replica loops through all the
>> > instances in box.cfg.replication, which makes it believe that there is a
>> > master holding files, needed by it, since it accounts itself just like
>> > all other instances.
>> > So make replica skip itself when finding an instance which holds files
>> > needed by it, and determining whether rebootstrap is needed.
>> >
>> > We already have a working test for the issue, it missed the issue due to
>> > replica.lua replication settings. Fix replica.lua to optionally include
>> > itself in box.cfg.replication, so that the corresponding test works
>> > correctly.
>> >
>> > Closes #4759
>> > ---
>> > https://github.com/tarantool/tarantool/issues/4759
>> > https://github.com/tarantool/tarantool/tree/sp/gh-4759-rebootstrap-fix
>> >
>> > @ChangeLog
>> > - fix rebootstrap procedure not working in case replica itself
>> > is listed in `box.cfg.replication`
>> >
>> > src/box/replication.cc | 13 ++++++++++++-
>> > test/replication/replica.lua | 11 ++++++++++-
>> > test/replication/replica_rejoin.result | 12 ++++++------
>> > test/replication/replica_rejoin.test.lua | 12 ++++++------
>> > 4 files changed, 34 insertions(+), 14 deletions(-)
>> >
>> > diff --git a/src/box/replication.cc b/src/box/replication.cc
>> > index e7bfa22ab..01edc0fb2 100644
>> > --- a/src/box/replication.cc
>> > +++ b/src/box/replication.cc
>> > @@ -768,8 +768,19 @@ replicaset_needs_rejoin(struct replica **master)
>> > struct replica *leader = NULL;
>> > replicaset_foreach(replica) {
>> > struct applier *applier = replica->applier;
>> > - if (applier == NULL)
>> > + /*
>> > + * The function is called right after
>> > + * box_sync_replication(), which in turn calls
>> > + * replicaset_connect(), which ensures that
>> > + * appliers are either stopped (APPLIER OFF) or
>> > + * connected.
>> > + * Also ignore self, as self applier might not
>> > + * have disconnected yet.
>> > + */
>> > + if (applier == NULL || applier->state == APPLIER_OFF ||
>> > + tt_uuid_is_equal(&replica->uuid, &INSTANCE_UUID))
>> > continue;
>> > + assert(applier->state == APPLIER_CONNECTED);
>> >
>>
>> Could you please understand one thing? Below I see this:
>>
>> > const struct ballot *ballot = &applier->ballot;
>> > if (vclock_compare(&ballot->gc_vclock,
>> > &replicaset.vclock) <= 0) {
>> > /*
>> > * There's at least one master that still stores
>> > * WALs needed by this instance. Proceed to local
>> > * recovery.
>> > */
>> > return false;
>> > }
>>
>> Question is why do we need rebootstrap if some remote node's
>> vclock is bigger than ours? It does not mean anything. It doesn't
>> say whether that remote instance still keeps any xlogs. All it
>> tells is that the remote instance committed some more data since
>> our restart.
>>
>>
>> In the piece above we check for master’s gc vclock, which is vclock of the
>> oldest WAL the instance still has. If master’s gc vclock is less than our vclock,
>> no rebootstrap is needed. Master’ll be able to send us all the fresh data.
>> Later we check whether our vclock is greater or incompatible with the remote
>> instance’s vclock. If it is, we cannot rebootstrap from the instance, since we have
>> some rows, that the instance doesn’t have.
>Thanks for the explanation. Then I would change the comment to say
>more about the issue. Currently it says about self only
>
> "Also ignore self, as self applier might not have disconnected yet."
>
>This does not look related to the actual problem, that we just should
>not look at our gc vclock.
Ok, I changed the comment. The diff is below.
>
>Also why do we need 'applier->state == APPLIER_OFF' check? How is it
>related to the issue? If this is a separate bug, is it possible to
>make it a separate patch?
We needn’t this check, after all. replicaset_update() at the end of replicaset_connect()
ensures that all the replicas in the replicaset have appliers in state APPLIER_CONNECTED.
Thanks for pointing this out.
diff --git a/src/box/replication.cc b/src/box/replication.cc
index 01edc0fb2..1345f189b 100644
--- a/src/box/replication.cc
+++ b/src/box/replication.cc
@@ -769,18 +769,12 @@ replicaset_needs_rejoin(struct replica **master)
replicaset_foreach(replica) {
struct applier *applier = replica->applier;
/*
- * The function is called right after
- * box_sync_replication(), which in turn calls
- * replicaset_connect(), which ensures that
- * appliers are either stopped (APPLIER OFF) or
- * connected.
- * Also ignore self, as self applier might not
- * have disconnected yet.
+ * Skip the local instance, we shouldn't perform a
+ * check against our own gc vclock.
*/
- if (applier == NULL || applier->state == APPLIER_OFF ||
- tt_uuid_is_equal(&replica->uuid, &INSTANCE_UUID))
+ if (applier == NULL || tt_uuid_is_equal(&replica->uuid,
+ &INSTANCE_UUID))
continue;
- assert(applier->state == APPLIER_CONNECTED);
const struct ballot *ballot = &applier->ballot;
--
Serge Petrenko
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.tarantool.org/pipermail/tarantool-patches/attachments/20200302/b85b623a/attachment.html>
More information about the Tarantool-patches
mailing list