Thanks for updates! LGTM. Best regards, Sergos  Wednesday, 28 July 2021, 00:53 +0300 from Shpilevoy Vladislav : >Hi! Thanks for the review! > >>> On 25 Jul 2021, at 19:53, Vladislav Shpilevoy via Tarantool-patches < tarantool-patches@dev.tarantool.org > wrote: >>> >>> Replica registration works via looking for the smallest not >>> occupied ID in _cluster and inserting it into the space. >>> >>> It works not so good when mvcc is enabled. In particular, if more >>> than 1 replica try to register at the same time, they might get >>> the same replica_id because don't see changes of each other until >>> the registration in _cluster is complete. >>> >>> This in the end leads to all replicas failing the registration >>> except one with the 'duplicate key' error (primary index in >>> _cluster is replica ID). >>> >>> The patch makes the replicas occupy their ID before they commit it >>> into _cluster. And new replica ID search now uses the replica ID >>> map instead of _cluster iterator. >>> >>> This way the registration works like before - like MVCC does not >>> exist which is fine. >> >> Did you discuss the MVCC capabilities - if we can address it there, by >> setting a dedicated flag for this particular - and perhaps for some >> other internal - space(s) to suppress MVCC ‘coverage’ for them. The >> solution will be way more common with supposedly less hassle with >> local structures, triggers, and so on? > >I did think of that in the beginning, but then I decided it is a bad >idea. Firstly, from the side of unnecessary complication of MVCC which >is complex enough already. Secondly, _cluster DQL is going to see the >uncommitted changes despite MVCC turned on. Does not look well. In my >patch I made so that you do not see the tuples in :get() and :select() >in _cluster and you won't stumble into the "dirty" errors because you >are not supposed to make any :insert() nor :replace() into this space >manually. Automatic registration bypasses the busy IDs before trying >to insert them. > >>> diff --git a/src/box/alter.cc b/src/box/alter.cc >>> index 89bb5946c..64ba09021 100644 >>> --- a/src/box/alter.cc >>> +++ b/src/box/alter.cc >>> @@ -4280,14 +4244,34 @@ on_replace_dd_cluster(struct trigger *trigger, void *event) >>> "updates of instance uuid"); >>> return -1; >>> } >>> - } else { >>> - struct trigger *on_commit; >>> - on_commit = txn_alter_trigger_new(register_replica, >>> - new_tuple); >>> - if (on_commit == NULL) >>> - return -1; >>> - txn_stmt_on_commit(stmt, on_commit); >>> + return 0; >>> + } >>> + /* >>> + * With read-views enabled there might be already a replica >>> + * whose registration is in progress in another transaction. >>> + * With the same replica ID. >>> + */ >>> + if (replica_by_id(replica_id) != NULL) { >>> + diag_set(ClientError, ER_UNSUPPORTED, "Tarantool", >>> + "more than 1 replica with the same ID"); >> >> There should be more details on replica ID/UUID/name here, same as in unregister >> panic below. > >The info would be visible in the logs anyway because before something >is added to _cluster, there are always more logs about 'joining ...', >'subscribed replica ...' and so on. > >But I don't mind to extend the logs. > >==================== >diff --git a/src/box/alter.cc b/src/box/alter.cc >index 64ba09021..390199298 100644 >--- a/src/box/alter.cc >+++ b/src/box/alter.cc >@@ -4251,9 +4251,14 @@ on_replace_dd_cluster(struct trigger *trigger, void *event) >  * whose registration is in progress in another transaction. >  * With the same replica ID. >  */ >- if (replica_by_id(replica_id) != NULL) { >- diag_set(ClientError, ER_UNSUPPORTED, "Tarantool", >- "more than 1 replica with the same ID"); >+ struct replica *replica = replica_by_id(replica_id); >+ if (replica != NULL) { >+ const char *msg = tt_sprintf( >+ "more than 1 replica with the same ID %u: new " >+ "uuid - %s, old uuid - %s", replica_id, >+ tt_uuid_str(&replica_uuid), >+ tt_uuid_str(&replica->uuid)); >+ diag_set(ClientError, ER_UNSUPPORTED, "Tarantool", msg); >  return -1; >  } >  struct trigger *on_rollback = txn_alter_trigger_new( >@@ -4265,7 +4270,7 @@ on_replace_dd_cluster(struct trigger *trigger, void *event) >  * replica ID now. While WAL write is in progress, new replicas >  * might come, they should see the ID is already in use. >  */ >- struct replica *replica = replica_by_uuid(&replica_uuid); >+ replica = replica_by_uuid(&replica_uuid); >  if (replica != NULL) >  replica_set_id(replica, replica_id); >  else >@@ -4283,6 +4288,10 @@ on_replace_dd_cluster(struct trigger *trigger, void *event) >  return -1; >  if (replica_check_id(replica_id) != 0) >  return -1; >+ tt_uuid replica_uuid; >+ if (tuple_field_uuid(old_tuple, BOX_CLUSTER_FIELD_UUID, >+ &replica_uuid) != 0) >+ return -1; >  >  struct replica *replica = replica_by_id(replica_id); >  if (replica == NULL) { >@@ -4292,7 +4301,15 @@ on_replace_dd_cluster(struct trigger *trigger, void *event) >  * subsystem is affected. >  */ >  panic("Tried to unregister a replica not stored in " >- "replica_by_id map, id is %u", replica_id); >+ "replica_by_id map, id is %u, uuid is %s", >+ replica_id, tt_uuid_str(&replica_uuid)); >+ } >+ if (!tt_uuid_is_equal(&replica->uuid, &replica_uuid)) { >+ panic("Tried to unregister a replica with id %u, but " >+ "its uuid is different from stored internally: " >+ "in space - %s, internally - %s", replica_id, >+ tt_uuid_str(&replica_uuid), >+ tt_uuid_str(&replica->uuid)); >  } >  /* >  * Unregister only after commit. Otherwise if the transaction >==================== > >>> diff --git a/src/box/box.cc b/src/box/box.cc >>> index 8c10a99dd..5c10aceff 100644 >>> --- a/src/box/box.cc >>> +++ b/src/box/box.cc >>> @@ -2407,22 +2407,9 @@ box_on_join(const tt_uuid *instance_uuid) >>> return; /* nothing to do - already registered */ >>> >>> box_check_writable_xc(); >>> - >>> - /** Find the largest existing replica id. */ >>> - struct space *space = space_cache_find_xc(BOX_CLUSTER_ID); >>> - struct index *index = index_find_system_xc(space, 0); >>> - struct iterator *it = index_create_iterator_xc(index, ITER_ALL, >>> - NULL, 0); >>> - IteratorGuard iter_guard(it); >>> - struct tuple *tuple; >>> - /** Assign a new replica id. */ >>> - uint32_t replica_id = 1; >>> - while ((tuple = iterator_next_xc(it)) != NULL) { >>> - if (tuple_field_u32_xc(tuple, >>> - BOX_CLUSTER_FIELD_ID) != replica_id) >>> - break; >>> - replica_id++; >>> - } >>> + uint32_t replica_id; >>> + if (replica_find_new_id(&replica_id) != 0) >>> + diag_raise(); >> >> Any info on why register fails? > >replica_find_new_id() sets diag with a proper error. This is >why I do diag_raise() here.