From: Vladislav Shpilevoy via Tarantool-patches <tarantool-patches@dev.tarantool.org> To: Sergey Ostanevich <sergos@tarantool.org> Cc: tarantool-patches@dev.tarantool.org Subject: Re: [Tarantool-patches] [PATCH 1/1] replication: set replica ID before _cluster commit Date: Tue, 27 Jul 2021 23:53:44 +0200 [thread overview] Message-ID: <959e1a8f-f5f8-85db-e0ed-22d249ed496c@tarantool.org> (raw) In-Reply-To: <E096DE72-5C3B-4FDC-99BD-D994C55B240D@tarantool.org> Hi! Thanks for the review! >> On 25 Jul 2021, at 19:53, Vladislav Shpilevoy via Tarantool-patches <tarantool-patches@dev.tarantool.org> wrote: >> >> Replica registration works via looking for the smallest not >> occupied ID in _cluster and inserting it into the space. >> >> It works not so good when mvcc is enabled. In particular, if more >> than 1 replica try to register at the same time, they might get >> the same replica_id because don't see changes of each other until >> the registration in _cluster is complete. >> >> This in the end leads to all replicas failing the registration >> except one with the 'duplicate key' error (primary index in >> _cluster is replica ID). >> >> The patch makes the replicas occupy their ID before they commit it >> into _cluster. And new replica ID search now uses the replica ID >> map instead of _cluster iterator. >> >> This way the registration works like before - like MVCC does not >> exist which is fine. > > Did you discuss the MVCC capabilities - if we can address it there, by > setting a dedicated flag for this particular - and perhaps for some > other internal - space(s) to suppress MVCC ‘coverage’ for them. The > solution will be way more common with supposedly less hassle with > local structures, triggers, and so on? I did think of that in the beginning, but then I decided it is a bad idea. Firstly, from the side of unnecessary complication of MVCC which is complex enough already. Secondly, _cluster DQL is going to see the uncommitted changes despite MVCC turned on. Does not look well. In my patch I made so that you do not see the tuples in :get() and :select() in _cluster and you won't stumble into the "dirty" errors because you are not supposed to make any :insert() nor :replace() into this space manually. Automatic registration bypasses the busy IDs before trying to insert them. >> diff --git a/src/box/alter.cc b/src/box/alter.cc >> index 89bb5946c..64ba09021 100644 >> --- a/src/box/alter.cc >> +++ b/src/box/alter.cc >> @@ -4280,14 +4244,34 @@ on_replace_dd_cluster(struct trigger *trigger, void *event) >> "updates of instance uuid"); >> return -1; >> } >> - } else { >> - struct trigger *on_commit; >> - on_commit = txn_alter_trigger_new(register_replica, >> - new_tuple); >> - if (on_commit == NULL) >> - return -1; >> - txn_stmt_on_commit(stmt, on_commit); >> + return 0; >> + } >> + /* >> + * With read-views enabled there might be already a replica >> + * whose registration is in progress in another transaction. >> + * With the same replica ID. >> + */ >> + if (replica_by_id(replica_id) != NULL) { >> + diag_set(ClientError, ER_UNSUPPORTED, "Tarantool", >> + "more than 1 replica with the same ID"); > > There should be more details on replica ID/UUID/name here, same as in unregister > panic below. The info would be visible in the logs anyway because before something is added to _cluster, there are always more logs about 'joining ...', 'subscribed replica ...' and so on. But I don't mind to extend the logs. ==================== diff --git a/src/box/alter.cc b/src/box/alter.cc index 64ba09021..390199298 100644 --- a/src/box/alter.cc +++ b/src/box/alter.cc @@ -4251,9 +4251,14 @@ on_replace_dd_cluster(struct trigger *trigger, void *event) * whose registration is in progress in another transaction. * With the same replica ID. */ - if (replica_by_id(replica_id) != NULL) { - diag_set(ClientError, ER_UNSUPPORTED, "Tarantool", - "more than 1 replica with the same ID"); + struct replica *replica = replica_by_id(replica_id); + if (replica != NULL) { + const char *msg = tt_sprintf( + "more than 1 replica with the same ID %u: new " + "uuid - %s, old uuid - %s", replica_id, + tt_uuid_str(&replica_uuid), + tt_uuid_str(&replica->uuid)); + diag_set(ClientError, ER_UNSUPPORTED, "Tarantool", msg); return -1; } struct trigger *on_rollback = txn_alter_trigger_new( @@ -4265,7 +4270,7 @@ on_replace_dd_cluster(struct trigger *trigger, void *event) * replica ID now. While WAL write is in progress, new replicas * might come, they should see the ID is already in use. */ - struct replica *replica = replica_by_uuid(&replica_uuid); + replica = replica_by_uuid(&replica_uuid); if (replica != NULL) replica_set_id(replica, replica_id); else @@ -4283,6 +4288,10 @@ on_replace_dd_cluster(struct trigger *trigger, void *event) return -1; if (replica_check_id(replica_id) != 0) return -1; + tt_uuid replica_uuid; + if (tuple_field_uuid(old_tuple, BOX_CLUSTER_FIELD_UUID, + &replica_uuid) != 0) + return -1; struct replica *replica = replica_by_id(replica_id); if (replica == NULL) { @@ -4292,7 +4301,15 @@ on_replace_dd_cluster(struct trigger *trigger, void *event) * subsystem is affected. */ panic("Tried to unregister a replica not stored in " - "replica_by_id map, id is %u", replica_id); + "replica_by_id map, id is %u, uuid is %s", + replica_id, tt_uuid_str(&replica_uuid)); + } + if (!tt_uuid_is_equal(&replica->uuid, &replica_uuid)) { + panic("Tried to unregister a replica with id %u, but " + "its uuid is different from stored internally: " + "in space - %s, internally - %s", replica_id, + tt_uuid_str(&replica_uuid), + tt_uuid_str(&replica->uuid)); } /* * Unregister only after commit. Otherwise if the transaction ==================== >> diff --git a/src/box/box.cc b/src/box/box.cc >> index 8c10a99dd..5c10aceff 100644 >> --- a/src/box/box.cc >> +++ b/src/box/box.cc >> @@ -2407,22 +2407,9 @@ box_on_join(const tt_uuid *instance_uuid) >> return; /* nothing to do - already registered */ >> >> box_check_writable_xc(); >> - >> - /** Find the largest existing replica id. */ >> - struct space *space = space_cache_find_xc(BOX_CLUSTER_ID); >> - struct index *index = index_find_system_xc(space, 0); >> - struct iterator *it = index_create_iterator_xc(index, ITER_ALL, >> - NULL, 0); >> - IteratorGuard iter_guard(it); >> - struct tuple *tuple; >> - /** Assign a new replica id. */ >> - uint32_t replica_id = 1; >> - while ((tuple = iterator_next_xc(it)) != NULL) { >> - if (tuple_field_u32_xc(tuple, >> - BOX_CLUSTER_FIELD_ID) != replica_id) >> - break; >> - replica_id++; >> - } >> + uint32_t replica_id; >> + if (replica_find_new_id(&replica_id) != 0) >> + diag_raise(); > > Any info on why register fails? replica_find_new_id() sets diag with a proper error. This is why I do diag_raise() here.
next prev parent reply other threads:[~2021-07-27 21:53 UTC|newest] Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-07-25 16:53 Vladislav Shpilevoy via Tarantool-patches 2021-07-25 18:31 ` Sergey Petrenko via Tarantool-patches 2021-07-26 22:06 ` Vladislav Shpilevoy via Tarantool-patches 2021-07-27 12:13 ` Cyrill Gorcunov via Tarantool-patches 2021-07-27 15:10 ` Sergey Ostanevich via Tarantool-patches 2021-07-27 21:53 ` Vladislav Shpilevoy via Tarantool-patches [this message] 2021-07-28 22:20 ` Sergey Ostanevich via Tarantool-patches 2021-07-29 21:42 ` Vladislav Shpilevoy via Tarantool-patches
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=959e1a8f-f5f8-85db-e0ed-22d249ed496c@tarantool.org \ --to=tarantool-patches@dev.tarantool.org \ --cc=sergos@tarantool.org \ --cc=v.shpilevoy@tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH 1/1] replication: set replica ID before _cluster commit' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox