[Tarantool-patches] [PATCH 1/1] replication: check rs uuid on subscribe process

Serge Petrenko sergepetrenko at tarantool.org
Thu Jun 3 10:18:02 MSK 2021



02.06.2021 23:16, Vladislav Shpilevoy пишет:
> Hi! Good job on the review!
>
>>>>> The UUID ignorance on subscribe decode was introduced here:
>>>>> https://github.com/tarantool/tarantool/commit/7f8cbde3555084ad6c41f137aec4faba4648c705#diff-fc276b44b551b4eac3431c9433d4bc881790ddd7df76226d7579f80da7798f6e
>>>>>
>>>>> And I don't understand why. Maybe I miss something? The tests have
>>>>> passed. Sergey, do you remember why was it needed really?
>>>>>
>>>>> Replicaset UUID mismatch definitely means the node can't connect.
>>>>> It is not related to whether it is anonymous or not. Because it
>>>>> has nothing to do with _cluster.
>>>> Hi! Thanks for the patch!
>>>>
>>>> That change was meant to help anonymous replicas fetch data from
>>>> multiple clusters. There was such an idea back then.
>>>> It never got implemented though, and I doubt it will.
>>> Wow, I am not sure how it is supposed to work at all. Starting from
>>> the problem that in different RS you have conflicting replica_id, and
>>> you won't be able to keep track of the changes properly. You simply can't
>>> pack the same replica_id from 2 different replicasets into one replica_id
>>> in your own journal.
>>>
>>> It does not matter if you are anon or not - the changes in the given
>>> replicasets are generated by non-anon nodes and their replica IDs will
>>> clash.
>> I guess you could assign the same replica_id to all the changes coming from
>> one cluster. Just assign lsns in the order of arrival, or something.
> How will you resubscribe then to get the changes from one RS starting from
> a certain vclock? The same for the idea below.
>
> When you resubscribe, you need to tell the other RS starting from which
> vclock it should send data. If you changed the received replica IDs, then
> you don't have the needed information persisted anywhere.
>
> Absence of a resubscribe means a rejoin on each restart and disconnect,
> which makes it hardly usable.

Maybe save received vclocks in some local space? It would have
[cluster_uuid ; vclock ] pairs. Then you could recover all the data first,
and know which vclock to send in resubscribe to every cluster 
representative.


>
>> Sorry, one more nit. I failed to notice this earlier, but there's an obsolete
>> comment in box_process_subscribe().
>>
>> It goes like
>> "Tarantool > 2.1.1 master doesn't check that replica has
>> the same cluster uuid".
>>
>> LGTM, once you delete the comment.
> Thanks for noticing. I didn't delete it, but applied the diff below,
> pushed to master, 2.8, 2.7.
>
> ====================
> @@ -2786,11 +2786,13 @@ box_process_subscribe(struct ev_io *io, struct xrow_header *header)
>   	 * the replica how many rows we have in stock for it,
>   	 * and identify ourselves with our own replica id.
>   	 *
> -	 * Tarantool > 2.1.1 master doesn't check that replica
> -	 * has the same cluster id. Instead it sends its cluster
> -	 * id to replica, and replica checks that its cluster id
> -	 * matches master's one. Older versions will just ignore
> -	 * the additional field.
> +	 * Master not only checks the replica has the same replicaset UUID, but
> +	 * also sends the UUID to the replica so both Tarantools could perform
> +	 * any checks they want depending on their version and supported
> +	 * features.
> +	 *
> +	 * Older versions not supporting replicaset UUID in the response will
> +	 * just ignore the additional field (these are < 2.1.1).
>   	 */
>   	struct xrow_header row;
>   	xrow_encode_subscribe_response_xc(&row, &REPLICASET_UUID, &vclock);

-- 
Serge Petrenko



More information about the Tarantool-patches mailing list