<HTML><BODY><div> </div><blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;">Суббота, 29 февраля 2020, 17:19 +03:00 от Vladislav Shpilevoy <v.shpilevoy@tarantool.org>:<br> <div id=""><div class="js-helper js-readmsg-msg"><style type="text/css"></style><div><div id="style_15829859741177080087_BODY">On 29/02/2020 10:52, Serge Petrenko wrote:<div class="mail-quote-collapse">> Hi! <br>><br>> Суббота, 29 февраля 2020, 2:43 +03:00 от Vladislav Shpilevoy <<a href="/compose?To=v.shpilevoy@tarantool.org">v.shpilevoy@tarantool.org</a>>:<br>> <br>> Thanks for the patch!<br>><br>> On 28/02/2020 18:01, Serge Petrenko wrote:<br>> > When checking wheter rejoin is needed, replica loops through all the<br>> > instances in box.cfg.replication, which makes it believe that there is a<br>> > master holding files, needed by it, since it accounts itself just like<br>> > all other instances.<br>> > So make replica skip itself when finding an instance which holds files<br>> > needed by it, and determining whether rebootstrap is needed.<br>> ><br>> > We already have a working test for the issue, it missed the issue due to<br>> > replica.lua replication settings. Fix replica.lua to optionally include<br>> > itself in box.cfg.replication, so that the corresponding test works<br>> > correctly.<br>> ><br>> > Closes #4759<br>> > ---<br>> > <a href="https://github.com/tarantool/tarantool/issues/4759" target="_blank">https://github.com/tarantool/tarantool/issues/4759</a><br>> > <a href="https://github.com/tarantool/tarantool/tree/sp/gh-4759-rebootstrap-fix" target="_blank">https://github.com/tarantool/tarantool/tree/sp/gh-4759-rebootstrap-fix</a><br>> ><br>> > @ChangeLog<br>> > - fix rebootstrap procedure not working in case replica itself<br>> > is listed in `box.cfg.replication`<br>> ><br>> > src/box/replication.cc | 13 ++++++++++++-<br>> > test/replication/replica.lua | 11 ++++++++++-<br>> > test/replication/replica_rejoin.result | 12 ++++++------<br>> > test/replication/replica_rejoin.test.lua | 12 ++++++------<br>> > 4 files changed, 34 insertions(+), 14 deletions(-)<br>> ><br>> > diff --git a/src/box/replication.cc b/src/box/replication.cc<br>> > index e7bfa22ab..01edc0fb2 100644<br>> > --- a/src/box/replication.cc<br>> > +++ b/src/box/replication.cc<br>> > @@ -768,8 +768,19 @@ replicaset_needs_rejoin(struct replica **master)<br>> > struct replica *leader = NULL;<br>> > replicaset_foreach(replica) {<br>> > struct applier *applier = replica->applier;<br>> > - if (applier == NULL)<br>> > + /*<br>> > + * The function is called right after<br>> > + * box_sync_replication(), which in turn calls<br>> > + * replicaset_connect(), which ensures that<br>> > + * appliers are either stopped (APPLIER OFF) or<br>> > + * connected.<br>> > + * Also ignore self, as self applier might not<br>> > + * have disconnected yet.<br>> > + */<br>> > + if (applier == NULL || applier->state == APPLIER_OFF ||<br>> > + tt_uuid_is_equal(&replica->uuid, &INSTANCE_UUID))<br>> > continue;<br>> > + assert(applier->state == APPLIER_CONNECTED);<br>> ><br>><br>> Could you please understand one thing? Below I see this:<br>><br>> > const struct ballot *ballot = &applier->ballot;<br>> > if (vclock_compare(&ballot->gc_vclock,<br>> > &replicaset.vclock) <= 0) {<br>> > /*<br>> > * There's at least one master that still stores<br>> > * WALs needed by this instance. Proceed to local<br>> > * recovery.<br>> > */<br>> > return false;<br>> > }<br>><br>> Question is why do we need rebootstrap if some remote node's<br>> vclock is bigger than ours? It does not mean anything. It doesn't<br>> say whether that remote instance still keeps any xlogs. All it<br>> tells is that the remote instance committed some more data since<br>> our restart.<br>><br>> <br>> In the piece above we check for master’s gc vclock, which is vclock of the<br>> oldest WAL the instance still has. If master’s gc vclock is less than our vclock,<br>> no rebootstrap is needed. Master’ll be able to send us all the fresh data.<br>> Later we check whether our vclock is greater or incompatible with the remote<br>> instance’s vclock. If it is, we cannot rebootstrap from the instance, since we have<br>> some rows, that the instance doesn’t have.</div><br>Thanks for the explanation. Then I would change the comment to say<br>more about the issue. Currently it says about self only<br><br> "Also ignore self, as self applier might not have disconnected yet."<br><br>This does not look related to the actual problem, that we just should<br>not look at our gc vclock.</div></div></div></div></blockquote><div> </div><div>Ok, I changed the comment. The diff is below.</div><div> </div><blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;"><div><div class="js-helper js-readmsg-msg"><div><div><br>Also why do we need 'applier->state == APPLIER_OFF' check? How is it<br>related to the issue? If this is a separate bug, is it possible to<br>make it a separate patch?</div></div></div></div></blockquote><div> </div><div>We needn’t this check, after all. replicaset_update() at the end of replicaset_connect()</div><div>ensures that all the replicas in the replicaset have appliers in state APPLIER_CONNECTED.</div><div>Thanks for pointing this out.</div><div> </div><div><p>diff --git a/src/box/replication.cc b/src/box/replication.cc</p><p>index 01edc0fb2..1345f189b 100644</p><p>--- a/src/box/replication.cc</p><p>+++ b/src/box/replication.cc</p><p>@@ -769,18 +769,12 @@ replicaset_needs_rejoin(struct replica **master)</p><p> replicaset_foreach(replica) {</p><p> struct applier *applier = replica->applier;</p><p> /*</p><p>- * The function is called right after</p><p>- * box_sync_replication(), which in turn calls</p><p>- * replicaset_connect(), which ensures that</p><p>- * appliers are either stopped (APPLIER OFF) or</p><p>- * connected.</p><p>- * Also ignore self, as self applier might not</p><p>- * have disconnected yet.</p><p>+ * Skip the local instance, we shouldn't perform a</p><p>+ * check against our own gc vclock.</p><p> */</p><p>- if (applier == NULL || applier->state == APPLIER_OFF ||</p><p>- tt_uuid_is_equal(&replica->uuid, &INSTANCE_UUID))</p><p>+ if (applier == NULL || tt_uuid_is_equal(&replica->uuid,</p><p>+ &INSTANCE_UUID))</p><p> continue;</p><p>- assert(applier->state == APPLIER_CONNECTED);</p><p> </p><p> const struct ballot *ballot = &applier->ballot;</p></div><div> </div><div>--<br>Serge Petrenko</div><div> </div></BODY></HTML>