[PATCH] replication: fix bug with read-only replica as a bootstrap leader
Konstantin Belyavskiy
k.belyavskiy at tarantool.org
Tue May 22 18:40:04 MSK 2018
Another broken case is adding a new replica to the cluster with the following check in place:
+ if (replica->applier->remote_is_ro &&
+ replica->applier->vclock.signature == 0)
In this case we may get ER_READONLY: the signature is not 0, so a
read-only replica is not skipped and can be chosen as the leader.
So leader election now has two phases:
1. Select among read-write replicas.
2. If none is found, fall back to the old algorithm for backward
compatibility (the case when all replicas exist in the cluster table).
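
The two phases above can be sketched in isolation as follows. This is a
minimal illustration and not the code from the patch: `struct peer`,
`pick_leader_pass()` and `pick_leader()` are hypothetical names, and the
"highest signature wins" rule is an assumed stand-in for the real comparison,
which is not shown in the hunks below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct replica and its applier. */
struct peer {
	bool has_applier;  /* false models replica->applier == NULL */
	bool remote_is_ro; /* the peer reported itself as read-only */
	int64_t signature; /* assumed stand-in for the vclock comparison */
};

/* One pass over the peers; read-only peers are skipped if skip_ro is set. */
const struct peer *
pick_leader_pass(const struct peer *peers, size_t count, bool skip_ro)
{
	const struct peer *leader = NULL;
	for (size_t i = 0; i < count; i++) {
		const struct peer *p = &peers[i];
		if (!p->has_applier)
			continue;
		if (skip_ro && p->remote_is_ro)
			continue;
		/* Assumed rule: strictly higher signature replaces the leader. */
		if (leader == NULL || p->signature > leader->signature)
			leader = p;
	}
	return leader;
}

/* Phase 1: read-write peers only. Phase 2: any peer, as a fallback. */
const struct peer *
pick_leader(const struct peer *peers, size_t count)
{
	assert(peers != NULL || count == 0);
	const struct peer *leader = pick_leader_pass(peers, count, true);
	if (leader == NULL)
		leader = pick_leader_pass(peers, count, false);
	return leader;
}
```

Splitting the pass into a helper avoids the `goto`, at the cost of one extra
function. The behavior is meant to be the same: the fallback pass runs only
when the first pass finds no candidate.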
Closes #3257
---
https://github.com/tarantool/tarantool/issues/3257
https://github.com/tarantool/tarantool/tree/kbelyavs/gh-3257-fix-bug-with-read-only-as-a-leader
src/box/replication.cc | 22 +++++++++++++++++-----
test/replication/replica_uuid_ro3.lua | 1 +
test/replication/replicaset_ro_mostly.result | 20 ++++++++++++++++++++
test/replication/replicaset_ro_mostly.test.lua | 8 ++++++++
4 files changed, 46 insertions(+), 5 deletions(-)
create mode 120000 test/replication/replica_uuid_ro3.lua
diff --git a/src/box/replication.cc b/src/box/replication.cc
index 0b770c913..8dcf1f656 100644
--- a/src/box/replication.cc
+++ b/src/box/replication.cc
@@ -691,16 +691,24 @@ struct replica *
replicaset_leader(void)
{
struct replica *leader = NULL;
+ bool skip_ro = true;
+ /**
+ * Two passes: the first one considers only read-write
+ * replicas. The second one is for backward compatibility
+ * and runs only if there are no such replicas at all.
+ */
+loop:
replicaset_foreach(replica) {
if (replica->applier == NULL)
continue;
/**
- * While bootstrapping a new cluster,
- * read-only replicas shouldn't be considered
- * as a leader.
+ * While bootstrapping a new cluster, read-only
+ * replicas shouldn't be considered as a leader.
+ * The only exception is when there are no read-write
+ * replicas, since it is still possible that all
+ * replicas exist in the cluster table.
*/
- if (replica->applier->remote_is_ro &&
- replica->applier->vclock.signature == 0)
+ if (skip_ro && replica->applier->remote_is_ro)
continue;
if (leader == NULL) {
leader = replica;
@@ -721,6 +729,10 @@ replicaset_leader(void)
continue;
leader = replica;
}
+ if (skip_ro && leader == NULL) {
+ skip_ro = false;
+ goto loop;
+ }
return leader;
}
diff --git a/test/replication/replica_uuid_ro3.lua b/test/replication/replica_uuid_ro3.lua
new file mode 120000
index 000000000..342d71c57
--- /dev/null
+++ b/test/replication/replica_uuid_ro3.lua
@@ -0,0 +1 @@
+replica_uuid_ro.lua
\ No newline at end of file
diff --git a/test/replication/replicaset_ro_mostly.result b/test/replication/replicaset_ro_mostly.result
index d753a182d..b9e8f1fe8 100644
--- a/test/replication/replicaset_ro_mostly.result
+++ b/test/replication/replicaset_ro_mostly.result
@@ -53,6 +53,26 @@ create_cluster_uuid(SERVERS, UUID)
test_run:wait_fullmesh(SERVERS)
---
...
+-- Add third replica
+name = 'replica_uuid_ro3'
+---
+...
+test_run:cmd(create_cluster_cmd1:format(name, name))
+---
+- true
+...
+test_run:cmd(create_cluster_cmd2:format(name, uuid.new()))
+---
+- true
+...
+test_run:cmd('switch replica_uuid_ro3')
+---
+- true
+...
+test_run:cmd('switch default')
+---
+- true
+...
-- Cleanup.
test_run:drop_cluster(SERVERS)
---
diff --git a/test/replication/replicaset_ro_mostly.test.lua b/test/replication/replicaset_ro_mostly.test.lua
index 539ca5a13..f2c2d0d11 100644
--- a/test/replication/replicaset_ro_mostly.test.lua
+++ b/test/replication/replicaset_ro_mostly.test.lua
@@ -26,5 +26,13 @@ test_run:cmd("setopt delimiter ''");
-- Deploy a cluster.
create_cluster_uuid(SERVERS, UUID)
test_run:wait_fullmesh(SERVERS)
+
+-- Add third replica
+name = 'replica_uuid_ro3'
+test_run:cmd(create_cluster_cmd1:format(name, name))
+test_run:cmd(create_cluster_cmd2:format(name, uuid.new()))
+test_run:cmd('switch replica_uuid_ro3')
+test_run:cmd('switch default')
+
-- Cleanup.
test_run:drop_cluster(SERVERS)
--
2.14.3 (Apple Git-98)