[Tarantool-patches] [PATCH v2 5/5] replication: introduce anonymous replica.

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Wed Dec 25 21:22:30 MSK 2019


Thanks for the patch!

I've pushed my review fixes on top of this commit. See it in the
end of the email, and on the branch.

We should document new iproto codes: IPROTO_FETCH_SNAPSHOT,
IPROTO_REGISTER, IPROTO_REPLICA_ANON, like here:
https://www.tarantool.io/en/doc/2.2/dev_guide/internals/box_protocol/

See 4 comments below.

> diff --git a/src/box/applier.cc b/src/box/applier.cc
> index f4f9d0670..2939a0f61 100644
> --- a/src/box/applier.cc
> +++ b/src/box/applier.cc
> @@ -996,10 +1035,24 @@ applier_f(va_list ap)
>  			if (tt_uuid_is_nil(&REPLICASET_UUID)) {
>  				/*
>  				 * Execute JOIN if this is a bootstrap.
> +				 * In case of anonymous replication, don't
> +				 * join but just fetch master's snapshot.
> +				 *
>  				 * The join will pause the applier
>  				 * until WAL is created.
>  				 */
> -				applier_join(applier);
> +				if (replication_anon)
> +					applier_fetch_snapshot(applier);
> +				else
> +					applier_join(applier);
> +			}
> +			if (applier->version_id >= version_id(1, 7, 0) &&
> +			    !replication_anon && instance_id == REPLICA_ID_NIL) {
> +				/* anonymity was turned off while we were
> +				 * fetching a snapshot or following master.
> +				 * Register the replica now.
> +				 */
> +				applier_register(applier);
>  			}

1. Why do we check for version here? Wouldn't it be more straight
to do something like this?

================================================================================

diff --git a/src/box/applier.cc b/src/box/applier.cc
index 3efa2b765..32851506c 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -1032,6 +1032,7 @@ applier_f(va_list ap)
 	while (!fiber_is_cancelled()) {
 		try {
 			applier_connect(applier);
+			bool was_anon = replication_anon;
 			if (tt_uuid_is_nil(&REPLICASET_UUID)) {
 				/*
 				 * Execute JOIN if this is a bootstrap.
@@ -1046,8 +1047,8 @@ applier_f(va_list ap)
 				else
 					applier_join(applier);
 			}
-			if (applier->version_id >= version_id(1, 7, 0) &&
-			    !replication_anon && instance_id == REPLICA_ID_NIL) {
+			if (was_anon && !replication_anon &&
+			    instance_id == REPLICA_ID_NIL) {
 				/*
 				 * Anonymity was turned off while
 				 * we were fetching a snapshot or

================================================================================

Also I don't understand, why is it needed? From what I see in
box_set_replication_anon(), we stop existing appliers, when
change `replication_anon`. So in case `replication_anon` is
changed, the fiber should be canceled by that moment.
applier_connect() does fiber_testcancel() via coio after each
yield.

A register could even harm here, because this instance will do
register twice - from there, and from a new applier fiber, started
in box_set_replication_anon().

> diff --git a/src/box/box.cc b/src/box/box.cc
> index 56d07e634..a93e2503e 100644
> --- a/src/box/box.cc
> +++ b/src/box/box.cc
> @@ -740,6 +770,64 @@ box_set_replication_skip_conflict(void)
>  	replication_skip_conflict = cfg_geti("replication_skip_conflict");
>  }
>  
> +void
> +box_set_replication_anon(void)
> +{
> +	bool anon = box_check_replication_anon();
> +	if (anon == replication_anon)
> +		return;
> +
> +	if (!anon) {
> +		/* Turn anonymous instance into a normal one. */
> +		replication_anon = anon;

2. In case of any exception `replication_anon` will contain a not
applied value.

But there is a worse problem - we stop the appliers, and don't
restart them in case of an exception. So box_set_replication_anon()
is not atomic, it may leave the instance in a half configured
state. box_set_replication() deals with that by creation of a new
array of appliers. It connects them, and only then atomically
replaces the old appliers.

Also looks like deanonymisation does not respect quorum. I did
this test:

    Master:
        box.cfg{listen=3301}
        box.schema.user.grant('guest', 'read,write,execute', 'universe')


    Replica:
        box.cfg{replication_anon=true,
                read_only=true,
                replication_connect_quorum=0,
                replication=3301}

    Master:
        Killed.

    Replica:
        box.cfg{replication_anon=false, read_only=false}

The replica hangs. Even though the quorum is 0.

I think, we somehow need to reuse box_set_replication() for all
of this. Looks like it does all we need except does not send the
register requests.

> +		/*
> +		 * Reset all appliers. This will interrupt
> +		 * anonymous follow they're in so that one of
> +		 * them can register and others resend a
> +		 * non-anonymous subscribe.
> +		 */
> +		replicaset_foreach(replica) {
> +			struct applier *applier = replica->applier;
> +			if (applier == NULL)
> +				continue;
> +			replica_clear_applier(replica);
> +			applier_stop(applier);
> +			replica->applier_sync_state = APPLIER_DISCONNECTED;
> +			replica_set_applier(replica, applier);
> +		}
> +		/* Choose a master to send register request to. */
> +		struct replica *master = replicaset_leader();
> +		assert(master != NULL && master->applier != NULL);
> +		struct applier *master_applier = master->applier;
> +
> +		applier_start(master_applier);
> +
> +		applier_resume_to_state(master_applier, APPLIER_REGISTER, TIMEOUT_INFINITY);
> +		applier_resume_to_state(master_applier, APPLIER_REGISTERED, TIMEOUT_INFINITY);
> +		applier_resume_to_state(master_applier, APPLIER_READY, TIMEOUT_INFINITY);
> +		applier_resume(master_applier);
> +		/**
> +		 * Restart other appliers to
> +		 * resend non-anonymous subscribe.
> +		 */
> +		replicaset_foreach(replica) {
> +			if (replica != master && replica->applier)
> +				applier_start(replica->applier);
> +		}
> +	} else if (!is_box_configured) {
> +		replication_anon = anon;
> +	} else {
> +		/*
> +		 * It is forbidden to turn a normal replica into
> +		 * an anonymous one.
> +		 */
> +		tnt_raise(ClientError, ER_CFG, "replication_anon",
> +			  "cannot be turned on after bootstrap"
> +			  " has finished");
> +	}
> +
> +}
> diff --git a/src/box/relay.cc b/src/box/relay.cc
> index e849fcf4f..22029751f 100644
> --- a/src/box/relay.cc
> +++ b/src/box/relay.cc
> @@ -691,7 +697,10 @@ relay_subscribe(struct replica *replica, int fd, uint64_t sync,
>  	relay_start(relay, fd, sync, relay_send_row);
>  	auto relay_guard = make_scoped_guard([=] {
>  		relay_stop(relay);
> -		replica_on_relay_stop(replica);
> +		if (replica->anon)
> +			replica_anon_delete(replica);
> +		else
> +			replica_on_relay_stop(replica);

3. Why can't replica_on_relay_stop() be used for an anon replica?

>  	});
>  
> @@ -741,6 +750,13 @@ relay_send_row(struct xstream *stream, struct xrow_header *packet)
>  {
>  	struct relay *relay = container_of(stream, struct relay, stream);
>  	assert(iproto_type_is_dml(packet->type));
> +	/*
> +	 * Replica-local requests generated while replica was
> +	 * anonymous have a zero instance id. Just skip all
> +	 * these rows.
> +	 */
> +	if (packet->replica_id == REPLICA_ID_NIL)
> +		return;

4. From what I remember about local requests in an anon replica,
this is possible only with packet->group_id == GROUP_LOCAL,
right? Then can you move it into the check about GROUP_LOCAL
below? To avoid checking replica_id, when the group is not local
anyway. Like this:

	if (packet->group_id == GROUP_LOCAL) {
		if (packet->replica_id == REPLICA_ID_NIL)
			return;
		packet->type = IPROTO_NOP;
		packet->group_id = GROUP_DEFAULT;
		packet->bodycnt = 0;
	}

================================================================================

commit 3e8fded091fae8abacc876bf6c899750b595d72d
Author: Vladislav Shpilevoy <v.shpilevoy at tarantool.org>
Date:   Wed Dec 25 17:20:45 2019 +0100

    Review fixes

diff --git a/src/box/applier.cc b/src/box/applier.cc
index 2939a0f61..3efa2b765 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -1048,9 +1048,11 @@ applier_f(va_list ap)
 			}
 			if (applier->version_id >= version_id(1, 7, 0) &&
 			    !replication_anon && instance_id == REPLICA_ID_NIL) {
-				/* anonymity was turned off while we were
-				 * fetching a snapshot or following master.
-				 * Register the replica now.
+				/*
+				 * Anonymity was turned off while
+				 * we were fetching a snapshot or
+				 * following master. Register the
+				 * replica now.
 				 */
 				applier_register(applier);
 			}
diff --git a/src/box/lua/load_cfg.lua b/src/box/lua/load_cfg.lua
index 9dee75b7d..51d157aa3 100644
--- a/src/box/lua/load_cfg.lua
+++ b/src/box/lua/load_cfg.lua
@@ -279,6 +279,7 @@ local dynamic_cfg_order = {
     replication_sync_timeout    = 150,
     replication_connect_timeout = 150,
     replication_connect_quorum  = 150,
+    replication_anon        = 175,
================================================================================

I am not sure about this change.

The only way how can it be changed now - from true to false.
Not back. The only dubious scenario: we change `replication_anon`
to false, and change `replication` at the same time.

If we set `replication_anon` false before new `replication`,
we will try to register the old `replication`. It means
`replication_anon` update should be done after `replication`.
So as the new configuration could connect as anon, and then be
upgraded to normal.

But when we will support anonymization of normal replicas, the
config update could look like this:

    `replication_anon` false -> true
    `replication` old -> new

And here `replication_anon` should be set before `replication`.
So as we would connect anonymously to the new replication set.

In both cases we do unnecessary actions with the old
replication.

Seems like it can't be properly solved by dynamic_cfg_order
table. We need to configure `replication_anon` and `replication`
at the same time. So as in box_set_replication() you always know
if the new `replication` setting is anon or not.

Maybe we need to extend dynamic_cfg_order idea, and introduce
depending parameters. Or parameter groups. Or something like
that. So as we could, for example, group `replication_anon` and
`replication`, and pass anon value to box_set_replication() when
they are updated both at the same box.cfg.

Or we need to forbid update of some parameters in one box.cfg.

Or we could make that order of parameters depends on their
values. Although it does not solve the problem that when we
change both `replication_anon` and `replication`, we do
unnecessary actions to the old `replication` in some cases.

================================================================================
     replication             = 200,
 }
 
diff --git a/src/box/replication.cc b/src/box/replication.cc
index ce707811a..d9848debe 100644
--- a/src/box/replication.cc
+++ b/src/box/replication.cc
@@ -916,7 +916,6 @@ replica_anon_delete(struct replica *replica)
 	replica_delete(replica);
 }
 
-
 struct replica *
 replicaset_first(void)
 {


More information about the Tarantool-patches mailing list