From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <v.shpilevoy@tarantool.org>
Received: from smtp53.i.mail.ru (smtp53.i.mail.ru [94.100.177.113])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by dev.tarantool.org (Postfix) with ESMTPS id 77A5646970E
 for <tarantool-patches@dev.tarantool.org>;
 Sat, 28 Dec 2019 15:15:21 +0300 (MSK)
References: <cover.1577277455.git.sergepetrenko@tarantool.org>
 <1577460437.511789779@f420.i.mail.ru>
 <42e12eeb-00b3-00dc-789f-a851459937ba@tarantool.org>
 <1577533695.773222909@f469.i.mail.ru>
From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Message-ID: <40e33eff-8893-5917-edeb-1f5a8aee8777@tarantool.org>
Date: Sat, 28 Dec 2019 15:15:19 +0300
MIME-Version: 1.0
In-Reply-To: <1577533695.773222909@f469.i.mail.ru>
Content-Type: text/plain; charset="utf-8"
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Subject: Re: [Tarantool-patches] [PATCH v2 5/5] replication: introduce
	anonymous replica.
List-Id: Tarantool development patches <tarantool-patches.dev.tarantool.org>
List-Unsubscribe: <https://lists.tarantool.org/mailman/options/tarantool-patches>, 
 <mailto:tarantool-patches-request@dev.tarantool.org?subject=unsubscribe>
List-Archive: <https://lists.tarantool.org/pipermail/tarantool-patches/>
List-Post: <mailto:tarantool-patches@dev.tarantool.org>
List-Help: <mailto:tarantool-patches-request@dev.tarantool.org?subject=help>
List-Subscribe: <https://lists.tarantool.org/mailman/listinfo/tarantool-patches>, 
 <mailto:tarantool-patches-request@dev.tarantool.org?subject=subscribe>
To: Sergey Petrenko <sergepetrenko@tarantool.org>, Kirill Yukhin <kyukhin@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org

Thanks for the fixes!

The whole patchset LGTM. With some of my answers below.

>     >> 1. Why do we check for version here? Wouldn't it be more straight
>     >> to do something like this?
>     >>
>     >> ================================================================================
>     >>
>     >> diff --git a/src/box/applier.cc b/src/box/applier.cc
>     >> index 3efa2b765..32851506c 100644
>     >> --- a/src/box/applier.cc
>     >> +++ b/src/box/applier.cc
>     >> @@ -1032,6 +1032,7 @@ applier_f(va_list ap)
>     >>  while (!fiber_is_cancelled()) {
>     >>  try {
>     >>  applier_connect(applier);
>     >> +bool was_anon = replication_anon;
>     >>  if (tt_uuid_is_nil(&REPLICASET_UUID)) {
>     >>  /*
>     >>   * Execute JOIN if this is a bootstrap.
>     >> @@ -1046,8 +1047,8 @@ applier_f(va_list ap)
>     >>  else
>     >>  applier_join(applier);
>     >>  }
>     >> -if (applier->version_id >= version_id(1, 7, 0) &&
>     >> - !replication_anon && instance_id == REPLICA_ID_NIL) {
>     >> +if (was_anon && !replication_anon &&
>     >> + instance_id == REPLICA_ID_NIL) {
>     >>  /*
>     >>   * Anonymity was turned off while
>     >>   * we were fetching a snapshot or
>     >>
>     >> ================================================================================
>     >>
>     >> Also I don't understand, why is it needed? From what I see in
>     >> box_set_replication_anon(), we stop existing appliers, when
>     >> change `replication_anon`. So in case `replication_anon` is
>     >> changed, the fiber should be canceled by that moment.
>     >> applier_connect() does fiber_testcancel() via coio after each
>     >> yield.
>     >>
>     >> A register could even harm here, because this instance will do
>     >> register twice - from there, and from a new applier fiber, started
>     >> in box_set_replication_anon().
>     >
>     > True, the version check is pointless and the comment is misleading.
>     > But we need to call applier_register here. In case of a transition
>     > from anon to normal, we have already fetched a snapshot, so we
>     > shouldn't execute join in a new applier, we just execute register and
>     > proceed to subscribe. Fixed.
> 
>     Thanks for the fix. Although I still don't understand how is this
>     code reachable (I tried to add assert(false), and it crashed). It
>     is even covered with a test, what is cool btw.
> 
> 
> 
> 
>     As I see, when we change anon setting, the existing appliers are
>     cancelled. Why does this applier_f fiber still works, if any change
>     in anon and replication cancels all the existing appliers?
> 
> 
> On replication_anon reconfiguration old appliers are killed.
> The new applier is started, it connects, checks for replicaset_uuid, which is
> not nil, since the previous applier has fetched a snapshot,
> proceeds to checking whether the instance_uuid is zero.
> If it is (it is, since we were anonymous), and replication_anon is false
> (and it is, since we have reconfigured it), it proceeds to register.
> Then it subscribes.
> 
> I guess the previous comment that we fixed here was misleading.
> This was how I understood it when I started working on this and
> tried to reuse the same applier without killing it. So, sorry for the
> confusion.

Aha, now I understand. Then looks ok.

>     However the thing below looks like a bug:
> 
>         Master:
>           box.cfg{listen=3301}
>           box.schema.user.grant('guest', 'read,write,execute', 'universe')
>           box.schema.user.grant('guest', 'replication')
> 
>         Replica:
>           box.cfg{replication_anon=true,
>                   read_only=true,
>                   replication_connect_quorum=0,
>                   replication=3301}
> 
>           box.cfg{replication_anon=false}
>           ---
>           - error: Couldn't find an instance to register this replica on.
>           ...
> 
>     This is because of quorum 0. When the quorum is 0, box_set_replication_anon()
>     tries to find a leader immediately after creation of appliers. Note, that the
>     master is alive. I tried to make a simple fix, but failed. Temporary change
>     to quorum 1 won't help, btw, because it can connect to a read-only replica
>     instead of a master.
> 
>     Everything works, when I change quorum to 1, and then try deanon again.
> 
>     Also it works when I do this:
> 
>     @@ -789,7 +789,7 @@ box_set_replication_anon(void)
>       * them can register and others resend a
>       * non-anonymous subscribe.
>       */
>     - box_sync_replication(false);
>     + box_sync_replication(true);
>       /*
>       * Wait until the master has registered this
>       * instance.
> 
>     But I am not sure that it is legal to do so.
> 
> 
> Maybe leave it as is for now? I can't see what the consequences
> will be. If you want to promote an anon replica to master this
> probably means that one of the masters is dead. So if you do
> box_sync_replication(true), your instance will hang on replication_anon
> configuration for replication timeout or whatever (replication_connect_timeout?).
> You'll need to remove the dead master from box.cfg.replication first.
> So this will still require additional configuration mangling.
> 
> If you think it's fine to wait for a timeout on replication_anon reconfiguration,
> feel free to apply the change above. I'm not against it.

I don't know what is more correct to do here. Yes, lets keep it
as is, and look at user feedback.

>     >>> +/*
>     >>> + * Reset all appliers. This will interrupt
>     >>> + * anonymous follow they're in so that one of
>     >>> + * them can register and others resend a
>     >>> + * non-anonymous subscribe.
>     >>> + */
>     >>> +replicaset_foreach(replica) {
>     >>> +struct applier *applier = replica->applier;
>     >>> +if (applier == NULL)
>     >>> +continue;
>     >>> +replica_clear_applier(replica);
>     >>> +applier_stop(applier);
>     >>> +replica->applier_sync_state = APPLIER_DISCONNECTED;
>     >>> +replica_set_applier(replica, applier);
>     >>> +}
>     >>> +/* Choose a master to send register request to. */
>     >>> +struct replica *master = replicaset_leader();
>     >>> +assert(master != NULL && master->applier != NULL);
>     >>> +struct applier *master_applier = master->applier;
>     >>> +
>     >>> +applier_start(master_applier);
>     >>> +
>     >>> +applier_resume_to_state(master_applier, APPLIER_REGISTER, TIMEOUT_INFINITY);
>     >>> +applier_resume_to_state(master_applier, APPLIER_REGISTERED, TIMEOUT_INFINITY);
>     >>> +applier_resume_to_state(master_applier, APPLIER_READY, TIMEOUT_INFINITY);
> 
>     Is it possible to always wait for APPLIER_READY and not to
>     pay attention to the prior states?
> 
> No, applier transitions to ready right after CONNECTED.
> 
> 
> 
>     Or at least wait for REGISTERED + READY only? Like this:
> 
> 
> This is ok. I just liked the verbosity of waiting for all the states step by step.

You are free to revert this part of my change, I don't mind.