From: "Sergey Petrenko" <sergepetrenko@tarantool.org> To: "Vladislav Shpilevoy" <v.shpilevoy@tarantool.org> Cc: tarantool-patches@dev.tarantool.org Subject: Re: [Tarantool-patches] [PATCH v2 5/5] replication: introduce anonymous replica. Date: Fri, 27 Dec 2019 18:27:17 +0300 [thread overview] Message-ID: <1577460437.511789779@f420.i.mail.ru> (raw) In-Reply-To: <590158f6-a650-97c8-94fb-6f8c530e5c88@tarantool.org> >Среда, 25 декабря 2019, 21:22 +03:00 от Vladislav Shpilevoy <v.shpilevoy@tarantool.org>: > >Thanks for the patch! Hi! Thanks for your answer! My comments and incremental diff are below. Also please check out the discussion with Kostja Osipov in this thread ( [Tarantool-patches] [PATCH 4/5] vclock: ignore 0th component in comparisons.), there are some new changes to the commit regarding 0 vclock component comparison. > >I've pushed my review fixes on top of this commit. See it in the >end of the email, and on the branch. Thanks! Looks good. > > >We should document new iproto codes: IPROTO_FETCH_SNAPSHOT, >IPROTO_REGISTER, IPROTO_REPLICA_ANON, like here: >https://www.tarantool.io/en/doc/2.2/dev_guide/internals/box_protocol/ Opened a ticket. https://github.com/tarantool/doc/issues/1046 > > >See 4 comments below. > >> diff --git a/src/box/applier.cc b/src/box/applier.cc >> index f4f9d0670..2939a0f61 100644 >> --- a/src/box/applier.cc >> +++ b/src/box/applier.cc >> @@ -996,10 +1035,24 @@ applier_f(va_list ap) >> if (tt_uuid_is_nil(&REPLICASET_UUID)) { >> /* >> * Execute JOIN if this is a bootstrap. >> + * In case of anonymous replication, don't >> + * join but just fetch master's snapshot. >> + * >> * The join will pause the applier >> * until WAL is created. >> */ >> -applier_join(applier); >> +if (replication_anon) >> +applier_fetch_snapshot(applier); >> +else >> +applier_join(applier); >> +} >> +if (applier->version_id >= version_id(1, 7, 0) && >> + !replication_anon && instance_id == REPLICA_ID_NIL) { >> +/* anonymity was turned off while we were >> + * fetching a snapshot or following master. >> + * Register the replica now. >> + */ >> +applier_register(applier); >> } > >1. Why do we check for version here? Wouldn't it be more straight >to do something like this? > >================================================================================ > >diff --git a/src/box/applier.cc b/src/box/applier.cc >index 3efa2b765..32851506c 100644 >--- a/src/box/applier.cc >+++ b/src/box/applier.cc >@@ -1032,6 +1032,7 @@ applier_f(va_list ap) > while (!fiber_is_cancelled()) { > try { > applier_connect(applier); >+bool was_anon = replication_anon; > if (tt_uuid_is_nil(&REPLICASET_UUID)) { > /* > * Execute JOIN if this is a bootstrap. >@@ -1046,8 +1047,8 @@ applier_f(va_list ap) > else > applier_join(applier); > } >-if (applier->version_id >= version_id(1, 7, 0) && >- !replication_anon && instance_id == REPLICA_ID_NIL) { >+if (was_anon && !replication_anon && >+ instance_id == REPLICA_ID_NIL) { > /* > * Anonymity was turned off while > * we were fetching a snapshot or > >================================================================================ > >Also I don't understand, why is it needed? From what I see in >box_set_replication_anon(), we stop existing appliers, when >change `replication_anon`. So in case `replication_anon` is >changed, the fiber should be canceled by that moment. >applier_connect() does fiber_testcancel() via coio after each >yield. > >A register could even harm here, because this instance will do >register twice - from there, and from a new applier fiber, started >in box_set_replication_anon(). True, the version check is pointless and the comment is misleading. But we need to call applier_register here. In case of a transition from anon to normal, we have already fetched a snapshot, so we shouldn't execute join in a new applier, we just execute register and proceed to subscribe. Fixed. > > >> diff --git a/src/box/box.cc b/src/box/box.cc >> index 56d07e634..a93e2503e 100644 >> --- a/src/box/box.cc >> +++ b/src/box/box.cc >> @@ -740,6 +770,64 @@ box_set_replication_skip_conflict(void) >> replication_skip_conflict = cfg_geti("replication_skip_conflict"); >> } >> >> +void >> +box_set_replication_anon(void) >> +{ >> +bool anon = box_check_replication_anon(); >> +if (anon == replication_anon) >> +return; >> + >> +if (!anon) { >> +/* Turn anonymous instance into a normal one. */ >> +replication_anon = anon; > >2. In case of any exception `replication_anon` will contain a not >applied value. > >But there is a worse problem - we stop the appliers, and don't >restart them in case of an exception. So box_set_replication_anon() >is not atomic, it may leave the instance in a half configured >state. box_set_replication() deals with that by creation of a new >array of appliers. It connects them, and only then atomically >replaces the old appliers. > >Also looks like deanonymisation does not respect quorum. I did >this test: > > Master: > box.cfg{listen=3301} > box.schema.user.grant('guest', 'read,write,execute', 'universe') > > > Replica: > box.cfg{replication_anon=true, > read_only=true, > replication_connect_quorum=0, > replication=3301} > > Master: > Killed. > > Replica: > box.cfg{replication_anon=false, read_only=false} > >The replica hangs. Even though the quorum is 0. > >I think, we somehow need to reuse box_set_replication() for all >of this. Looks like it does all we need except does not send the >register requests. True. > >> +/* >> + * Reset all appliers. This will interrupt >> + * anonymous follow they're in so that one of >> + * them can register and others resend a >> + * non-anonymous subscribe. >> + */ >> +replicaset_foreach(replica) { >> +struct applier *applier = replica->applier; >> +if (applier == NULL) >> +continue; >> +replica_clear_applier(replica); >> +applier_stop(applier); >> +replica->applier_sync_state = APPLIER_DISCONNECTED; >> +replica_set_applier(replica, applier); >> +} >> +/* Choose a master to send register request to. */ >> +struct replica *master = replicaset_leader(); >> +assert(master != NULL && master->applier != NULL); >> +struct applier *master_applier = master->applier; >> + >> +applier_start(master_applier); >> + >> +applier_resume_to_state(master_applier, APPLIER_REGISTER, TIMEOUT_INFINITY); >> +applier_resume_to_state(master_applier, APPLIER_REGISTERED, TIMEOUT_INFINITY); >> +applier_resume_to_state(master_applier, APPLIER_READY, TIMEOUT_INFINITY); >> +applier_resume(master_applier); >> +/** >> + * Restart other appliers to >> + * resend non-anonymous subscribe. >> + */ >> +replicaset_foreach(replica) { >> +if (replica != master && replica->applier) >> +applier_start(replica->applier); >> +} >> +} else if (!is_box_configured) { >> +replication_anon = anon; >> +} else { >> +/* >> + * It is forbidden to turn a normal replica into >> + * an anonymous one. >> + */ >> +tnt_raise(ClientError, ER_CFG, "replication_anon", >> + "cannot be turned on after bootstrap" >> + " has finished"); >> +} >> + >> +} >> diff --git a/src/box/relay.cc b/src/box/relay.cc >> index e849fcf4f..22029751f 100644 >> --- a/src/box/relay.cc >> +++ b/src/box/relay.cc >> @@ -691,7 +697,10 @@ relay_subscribe(struct replica *replica, int fd, uint64_t sync, >> relay_start(relay, fd, sync, relay_send_row); >> auto relay_guard = make_scoped_guard([=] { >> relay_stop(relay); >> -replica_on_relay_stop(replica); >> +if (replica->anon) >> +replica_anon_delete(replica); >> +else >> +replica_on_relay_stop(replica); > >3. Why can't replica_on_relay_stop() be used for an anon replica? replica_on_relay_stop() asserts the replica is registered with gc and removes it from consumers. Anonymous replicas aren't registered with gc. I can rewrite replica_on_relay_stop() appropriately, but why? > >> }); >> >> @@ -741,6 +750,13 @@ relay_send_row(struct xstream *stream, struct xrow_header *packet) >> { >> struct relay *relay = container_of(stream, struct relay, stream); >> assert(iproto_type_is_dml(packet->type)); >> +/* >> + * Replica-local requests generated while replica was >> + * anonymous have a zero instance id. Just skip all >> + * these rows. >> + */ >> +if (packet->replica_id == REPLICA_ID_NIL) >> +return; > >4. From what I remember about local requests in an anon replica, >this is possible only with packet->group_id == GROUP_LOCAL, >right? Then can you move it into the check about GROUP_LOCAL >below? To avoid checking replica_id, when the group is not local >anyway. Like this: > >if (packet->group_id == GROUP_LOCAL) { >if (packet->replica_id == REPLICA_ID_NIL) >return; >packet->type = IPROTO_NOP; >packet->group_id = GROUP_DEFAULT; >packet->bodycnt = 0; >} Good point, thanks! My changes are below. > >================================================================================ > >commit 3e8fded091fae8abacc876bf6c899750b595d72d >Author: Vladislav Shpilevoy < v.shpilevoy@tarantool.org > >Date: Wed Dec 25 17:20:45 2019 +0100 > > Review fixes > >diff --git a/src/box/applier.cc b/src/box/applier.cc >index 2939a0f61..3efa2b765 100644 >--- a/src/box/applier.cc >+++ b/src/box/applier.cc >@@ -1048,9 +1048,11 @@ applier_f(va_list ap) > } > if (applier->version_id >= version_id(1, 7, 0) && > !replication_anon && instance_id == REPLICA_ID_NIL) { >-/* anonymity was turned off while we were >- * fetching a snapshot or following master. >- * Register the replica now. >+/* >+ * Anonymity was turned off while >+ * we were fetching a snapshot or >+ * following master. Register the >+ * replica now. > */ > applier_register(applier); > } >diff --git a/src/box/lua/load_cfg.lua b/src/box/lua/load_cfg.lua >index 9dee75b7d..51d157aa3 100644 >--- a/src/box/lua/load_cfg.lua >+++ b/src/box/lua/load_cfg.lua >@@ -279,6 +279,7 @@ local dynamic_cfg_order = { > replication_sync_timeout = 150, > replication_connect_timeout = 150, > replication_connect_quorum = 150, >+ replication_anon = 175, >================================================================================ > >I am not sure about this change. > >The only way how can it be changed now - from true to false. >Not back. The only dubious scenario: we change `replication_anon` >to false, and change `replication` at the same time. > >If we set `replication_anon` false before new `replication`, >we will try to register the old `replication`. It means >`replication_anon` update should be done after `replication`. >So as the new configuration could connect as anon, and then be >upgraded to normal. > >But when we will support anonymization of normal replicas, the >config update could look like this: > > `replication_anon` false -> true > `replication` old -> new > >And here `replication_anon` should be set before `replication`. >So as we would connect anonymously to the new replication set. > >In both cases we do unnecessary actions with the old >replication. > >Seems like it can't be properly solved by dynamic_cfg_order >table. We need to configure `replication_anon` and `replication` >at the same time. So as in box_set_replication() you always know >if the new `replication` setting is anon or not. > >Maybe we need to extend dynamic_cfg_order idea, and introduce >depending parameters. Or parameter groups. Or something like >that. So as we could, for example, group `replication_anon` and >`replication`, and pass anon value to box_set_replication() when >they are updated both at the same box.cfg. > >Or we need to forbid update of some parameters in one box.cfg. > >Or we could make that order of parameters depends on their >values. Although it does not solve the problem that when we >change both `replication_anon` and `replication`, we do >unnecessary actions to the old `replication` in some cases. Well, I reused the box_set_replication code in replication_anon change, so now if replication_anon is changed together with replication, it will trigger duplicate box_set_replication action (almost). This is ok, I guess. > > >================================================================================ > replication = 200, > } > >diff --git a/src/box/replication.cc b/src/box/replication.cc >index ce707811a..d9848debe 100644 >--- a/src/box/replication.cc >+++ b/src/box/replication.cc >@@ -916,7 +916,6 @@ replica_anon_delete(struct replica *replica) > replica_delete(replica); > } > >- > struct replica * > replicaset_first(void) > { diff --git a/src/box/applier.cc b/src/box/applier.cc index 10725ed2c..ae3d281a5 100644 --- a/src/box/applier.cc +++ b/src/box/applier.cc @@ -1046,13 +1046,12 @@ applier_f(va_list ap) else applier_join(applier); } -if (applier->version_id >= version_id(1, 7, 0) && - !replication_anon && instance_id == REPLICA_ID_NIL) { +if (instance_id == REPLICA_ID_NIL && + !replication_anon) { /* - * Anonymity was turned off while - * we were fetching a snapshot or - * following master. Register the - * replica now. + * The instance transitioned + * from anonymous. Register it + * now. */ applier_register(applier); } diff --git a/src/box/box.cc b/src/box/box.cc index a93e2503e..f99aeebc6 100644 --- a/src/box/box.cc +++ b/src/box/box.cc @@ -778,6 +778,9 @@ box_set_replication_anon(void) return; if (!anon) { +auto guard = make_scoped_guard([&]{ +replication_anon = !anon; +}); /* Turn anonymous instance into a normal one. */ replication_anon = anon; /* @@ -786,34 +789,30 @@ box_set_replication_anon(void) * them can register and others resend a * non-anonymous subscribe. */ -replicaset_foreach(replica) { -struct applier *applier = replica->applier; -if (applier == NULL) -continue; -replica_clear_applier(replica); -applier_stop(applier); -replica->applier_sync_state = APPLIER_DISCONNECTED; -replica_set_applier(replica, applier); -} +box_sync_replication(false); /* Choose a master to send register request to. */ struct replica *master = replicaset_leader(); -assert(master != NULL && master->applier != NULL); +if (master == NULL || master->applier == NULL || + master->applier->state != APPLIER_CONNECTED) { +tnt_raise(ClientError, ER_CANNOT_REGISTER); +} struct applier *master_applier = master->applier; -applier_start(master_applier); - -applier_resume_to_state(master_applier, APPLIER_REGISTER, TIMEOUT_INFINITY); -applier_resume_to_state(master_applier, APPLIER_REGISTERED, TIMEOUT_INFINITY); -applier_resume_to_state(master_applier, APPLIER_READY, TIMEOUT_INFINITY); -applier_resume(master_applier); +applier_resume_to_state(master_applier, APPLIER_READY, +TIMEOUT_INFINITY); +applier_resume_to_state(master_applier, APPLIER_REGISTER, +TIMEOUT_INFINITY); +applier_resume_to_state(master_applier, APPLIER_REGISTERED, +TIMEOUT_INFINITY); +applier_resume_to_state(master_applier, APPLIER_READY, +TIMEOUT_INFINITY); /** - * Restart other appliers to + * Resume other appliers to * resend non-anonymous subscribe. */ -replicaset_foreach(replica) { -if (replica != master && replica->applier) -applier_start(replica->applier); -} +replicaset_follow(); +replicaset_sync(); +guard.is_active = false; } else if (!is_box_configured) { replication_anon = anon; } else { diff --git a/src/box/errcode.h b/src/box/errcode.h index dce69c9c0..0f7563e64 100644 --- a/src/box/errcode.h +++ b/src/box/errcode.h @@ -259,6 +259,7 @@ struct errcode_record { /*204 */_(ER_SQL_FUNC_WRONG_RET_COUNT,"SQL expects exactly one argument returned from %s, got %d")\ /*205 */_(ER_FUNC_INVALID_RETURN_TYPE,"Function '%s' returned value of invalid type: expected %s got %s") \ /*206 */_(ER_REPLICA_NOT_ANON, "Replica '%s' is not anonymous and cannot register.") \ +/*207 */_(ER_CANNOT_REGISTER, "Couldn't find an instance to register this replica on.") \ /* * !IMPORTANT! Please follow instructions at start of the file diff --git a/src/box/relay.cc b/src/box/relay.cc index 22029751f..4258edfcd 100644 --- a/src/box/relay.cc +++ b/src/box/relay.cc @@ -750,19 +750,19 @@ relay_send_row(struct xstream *stream, struct xrow_header *packet) { struct relay *relay = container_of(stream, struct relay, stream); assert(iproto_type_is_dml(packet->type)); -/* - * Replica-local requests generated while replica was - * anonymous have a zero instance id. Just skip all - * these rows. - */ -if (packet->replica_id == REPLICA_ID_NIL) -return; /* * Transform replica local requests to IPROTO_NOP so as to * promote vclock on the replica without actually modifying * any data. */ if (packet->group_id == GROUP_LOCAL) { +/* + * Replica-local requests generated while replica + * was anonymous have a zero instance id. Just + * skip all these rows. + */ +if (packet->replica_id == REPLICA_ID_NIL) +return; packet->type = IPROTO_NOP; packet->group_id = GROUP_DEFAULT; packet->bodycnt = 0; diff --git a/test/box/misc.result b/test/box/misc.result index 134caef63..51ad29cdd 100644 --- a/test/box/misc.result +++ b/test/box/misc.result @@ -555,6 +555,7 @@ t; 204: box.error.SQL_FUNC_WRONG_RET_COUNT 205: box.error.FUNC_INVALID_RETURN_TYPE 206: box.error.REPLICA_NOT_ANON + 207: box.error.CANNOT_REGISTER ... test_run:cmd("setopt delimiter ''"); --- -- Sergey Petrenko
next prev parent reply other threads:[~2019-12-27 15:27 UTC|newest] Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-12-25 12:46 [Tarantool-patches] [PATCH v2 0/5] introduce anonymous replicas sergepetrenko 2019-12-25 12:46 ` [Tarantool-patches] [PATCH v2 1/5] box: update comment describing join protocol sergepetrenko 2019-12-25 12:46 ` [Tarantool-patches] [PATCH v2 2/5] replication: do not decode replicaset uuid when processing a subscribe sergepetrenko 2019-12-25 12:47 ` [Tarantool-patches] [PATCH v2 3/5] applier: split join processing into two stages sergepetrenko 2019-12-25 12:47 ` [Tarantool-patches] [PATCH v2 4/5] vclock: ignore 0th component in comparisons sergepetrenko 2019-12-25 16:00 ` Vladislav Shpilevoy 2019-12-27 18:42 ` Vladislav Shpilevoy 2019-12-28 11:21 ` Sergey Petrenko 2019-12-25 12:47 ` [Tarantool-patches] [PATCH v2 5/5] replication: introduce anonymous replica sergepetrenko 2019-12-25 18:22 ` Vladislav Shpilevoy 2019-12-27 15:27 ` Sergey Petrenko [this message] 2019-12-27 18:42 ` Vladislav Shpilevoy 2019-12-28 11:48 ` Sergey Petrenko 2019-12-28 12:15 ` Vladislav Shpilevoy 2019-12-30 5:12 ` [Tarantool-patches] [PATCH v2 0/5] introduce anonymous replicas Kirill Yukhin
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=1577460437.511789779@f420.i.mail.ru \ --to=sergepetrenko@tarantool.org \ --cc=tarantool-patches@dev.tarantool.org \ --cc=v.shpilevoy@tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH v2 5/5] replication: introduce anonymous replica.' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox