[PATCH] replication: fix bug with read-only replica as a bootstrap leader

Konstantin Osipov kostja at tarantool.org
Tue May 22 19:45:42 MSK 2018


* Konstantin Belyavskiy <k.belyavskiy at tarantool.org> [18/05/22 18:41]:

It's OK to push.

As a nitpick, I'd prefer extracting the leader search loop into a
routine, rather than adding a goto label.
I'm not crazy about goto, but I think in this case the code would
win from not having it.
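
To illustrate the suggestion, here is a minimal sketch of the two-phase
search with the loop extracted into a helper. It is not the actual
replication.cc code: applier_sketch, replica_sketch, find_leader_sketch()
and replicaset_leader_sketch() are hypothetical, simplified stand-ins, and
the ordering rule (largest vclock signature wins) is only a placeholder,
since the real comparison lies outside the quoted hunk.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for struct applier / struct replica.
struct applier_sketch {
	bool remote_is_ro;
	int64_t vclock_signature;
};

struct replica_sketch {
	applier_sketch *applier; // nullptr if the replica has no applier
};

// One pass over the replica set. When skip_ro is true, read-only
// replicas are ignored. Returns nullptr if no candidate is found.
// Placeholder ordering (an assumption): largest vclock signature wins.
static replica_sketch *
find_leader_sketch(std::vector<replica_sketch> &replicas, bool skip_ro)
{
	replica_sketch *leader = nullptr;
	for (replica_sketch &replica : replicas) {
		if (replica.applier == nullptr)
			continue;
		if (skip_ro && replica.applier->remote_is_ro)
			continue;
		if (leader == nullptr ||
		    replica.applier->vclock_signature >
		    leader->applier->vclock_signature)
			leader = &replica;
	}
	return leader;
}

// Phase 1: prefer read-write replicas.
// Phase 2: fall back to any replica, for backward compatibility.
static replica_sketch *
replicaset_leader_sketch(std::vector<replica_sketch> &replicas)
{
	replica_sketch *leader = find_leader_sketch(replicas, true);
	if (leader == nullptr)
		leader = find_leader_sketch(replicas, false);
	return leader;
}
```

With this shape the fallback is an ordinary second call, so neither the
label nor the mutable skip_ro flag is needed.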

> Another broken case: adding a new replica to the cluster:
> +		if (replica->applier->remote_is_ro &&
> +		    replica->applier->vclock.signature == 0)
> In this case we may get ER_READONLY, since the signature is not 0.
> So leader election now has two phases:
>  1. Select among read-write replicas.
>  2. If none is found, try the old algorithm for backward compatibility
>     (the case when all replicas exist in the cluster table).
> 
> Closes #3257
> ---
> https://github.com/tarantool/tarantool/issues/3257
> https://github.com/tarantool/tarantool/tree/kbelyavs/gh-3257-fix-bug-with-read-only-as-a-leader
> 
>  src/box/replication.cc                         | 22 +++++++++++++++++-----
>  test/replication/replica_uuid_ro3.lua          |  1 +
>  test/replication/replicaset_ro_mostly.result   | 20 ++++++++++++++++++++
>  test/replication/replicaset_ro_mostly.test.lua |  8 ++++++++
>  4 files changed, 46 insertions(+), 5 deletions(-)
>  create mode 120000 test/replication/replica_uuid_ro3.lua
> 
> diff --git a/src/box/replication.cc b/src/box/replication.cc
> index 0b770c913..8dcf1f656 100644
> --- a/src/box/replication.cc
> +++ b/src/box/replication.cc
> @@ -691,16 +691,24 @@ struct replica *
>  replicaset_leader(void)
>  {
>  	struct replica *leader = NULL;
> +	bool skip_ro = true;
> +	/**
> +	 * Two loops, first prefers read-write replicas among others.
> +	 * Second for backward compatibility, if there is no such
> +	 * replicas at all.
> +	 */
> +loop:
>  	replicaset_foreach(replica) {
>  		if (replica->applier == NULL)
>  			continue;
>  		/**
> -		 * While bootstrapping a new cluster,
> -		 * read-only replicas shouldn't be considered
> -		 * as a leader.
> +		 * While bootstrapping a new cluster, read-only
> +		 * replicas shouldn't be considered as a leader.
> +		 * The only exception if there is no read-write
> +		 * replicas since there is still a possibility
> +		 * that all replicas exist in cluster table.
>  		 */
> -		if (replica->applier->remote_is_ro &&
> -		    replica->applier->vclock.signature == 0)
> +		if (skip_ro && replica->applier->remote_is_ro)
>  			continue;
>  		if (leader == NULL) {
>  			leader = replica;
> @@ -721,6 +729,10 @@ replicaset_leader(void)
>  			continue;
>  		leader = replica;
>  	}
> +	if (skip_ro && leader == NULL) {
> +		skip_ro = false;
> +		goto loop;
> +	}
>  	return leader;
>  }
>  
> diff --git a/test/replication/replica_uuid_ro3.lua b/test/replication/replica_uuid_ro3.lua
> new file mode 120000
> index 000000000..342d71c57
> --- /dev/null
> +++ b/test/replication/replica_uuid_ro3.lua
> @@ -0,0 +1 @@
> +replica_uuid_ro.lua
> \ No newline at end of file
> diff --git a/test/replication/replicaset_ro_mostly.result b/test/replication/replicaset_ro_mostly.result
> index d753a182d..b9e8f1fe8 100644
> --- a/test/replication/replicaset_ro_mostly.result
> +++ b/test/replication/replicaset_ro_mostly.result
> @@ -53,6 +53,26 @@ create_cluster_uuid(SERVERS, UUID)
>  test_run:wait_fullmesh(SERVERS)
>  ---
>  ...
> +-- Add third replica
> +name = 'replica_uuid_ro3'
> +---
> +...
> +test_run:cmd(create_cluster_cmd1:format(name, name))
> +---
> +- true
> +...
> +test_run:cmd(create_cluster_cmd2:format(name, uuid.new()))
> +---
> +- true
> +...
> +test_run:cmd('switch replica_uuid_ro3')
> +---
> +- true
> +...
> +test_run:cmd('switch default')
> +---
> +- true
> +...
>  -- Cleanup.
>  test_run:drop_cluster(SERVERS)
>  ---
> diff --git a/test/replication/replicaset_ro_mostly.test.lua b/test/replication/replicaset_ro_mostly.test.lua
> index 539ca5a13..f2c2d0d11 100644
> --- a/test/replication/replicaset_ro_mostly.test.lua
> +++ b/test/replication/replicaset_ro_mostly.test.lua
> @@ -26,5 +26,13 @@ test_run:cmd("setopt delimiter ''");
>  -- Deploy a cluster.
>  create_cluster_uuid(SERVERS, UUID)
>  test_run:wait_fullmesh(SERVERS)
> +
> +-- Add third replica
> +name = 'replica_uuid_ro3'
> +test_run:cmd(create_cluster_cmd1:format(name, name))
> +test_run:cmd(create_cluster_cmd2:format(name, uuid.new()))
> +test_run:cmd('switch replica_uuid_ro3')
> +test_run:cmd('switch default')
> +
>  -- Cleanup.
>  test_run:drop_cluster(SERVERS)
> -- 
> 2.14.3 (Apple Git-98)

-- 
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov
