[Tarantool-patches] [PATCH 7/6] raft: test join to a raft cluster

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Sun Jun 6 20:03:44 MSK 2021


A new replica joining a Raft cluster sometimes tried to register on
a non-leader node. Such a node cannot write to _cluster, so the join
failed with an ER_READONLY error.

Now, in the scope of #5613, the join-master selection algorithm has
changed. A new node looks for writable members of the cluster to
use as a join-master. It will not choose a follower if there is a
leader.

Closes #6127
---
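
Note for reviewers: the selection rule from the commit message can be
illustrated with a small plain-Lua sketch. It is not the actual
implementation. The function name choose_join_master and the candidate
fields uuid and is_ro are hypothetical, and vclocks are assumed to be
equal, as in the test scenario of this patch, so only the read-only
state and the minimal UUID decide the outcome.

```lua
-- Hypothetical sketch, not the real join-master selection code.
-- Each candidate is a plain table: {uuid = <string>, is_ro = <boolean>}.
-- Assumption: all candidates have equal vclocks.
local function choose_join_master(candidates)
    local best = nil
    for _, c in ipairs(candidates) do
        if best == nil then
            best = c
        elseif best.is_ro ~= c.is_ro then
            -- A writable node always beats a read-only one.
            if not c.is_ro then
                best = c
            end
        elseif c.uuid < best.uuid then
            -- Same writability: fall back to the minimal UUID.
            best = c
        end
    end
    return best
end

-- Mirrors the test: master1 has the smaller UUID but is a follower.
local chosen = choose_join_master({
    {uuid = '10f9828d-b5d5-46a9-b698-ddac7cce5e27', is_ro = true},
    {uuid = '20f9828d-b5d5-46a9-b698-ddac7cce5e27', is_ro = false},
})
assert(chosen.uuid == '20f9828d-b5d5-46a9-b698-ddac7cce5e27')
```

With two read-only candidates the sketch falls back to the smaller
UUID, which is why the test makes the node with the minimal UUID
(master1) a read-only voter before starting the replica.
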
 .../unreleased/gh-6127-raft-join-new.md       |   4 +
 test/replication/gh-6127-master1.lua          |  15 +++
 test/replication/gh-6127-master2.lua          |  13 +++
 test/replication/gh-6127-raft-join-new.result | 105 ++++++++++++++++++
 .../gh-6127-raft-join-new.test.lua            |  41 +++++++
 test/replication/gh-6127-replica.lua          |   9 ++
 6 files changed, 187 insertions(+)
 create mode 100644 changelogs/unreleased/gh-6127-raft-join-new.md
 create mode 100644 test/replication/gh-6127-master1.lua
 create mode 100644 test/replication/gh-6127-master2.lua
 create mode 100644 test/replication/gh-6127-raft-join-new.result
 create mode 100644 test/replication/gh-6127-raft-join-new.test.lua
 create mode 100644 test/replication/gh-6127-replica.lua

diff --git a/changelogs/unreleased/gh-6127-raft-join-new.md b/changelogs/unreleased/gh-6127-raft-join-new.md
new file mode 100644
index 000000000..a2d898df0
--- /dev/null
+++ b/changelogs/unreleased/gh-6127-raft-join-new.md
@@ -0,0 +1,4 @@
+## bugfix/raft
+
+* Fixed a bug where a new replica joining a Raft cluster could try to register
+  on a follower instead of the leader and fail with `ER_READONLY` (gh-6127).
diff --git a/test/replication/gh-6127-master1.lua b/test/replication/gh-6127-master1.lua
new file mode 100644
index 000000000..708574322
--- /dev/null
+++ b/test/replication/gh-6127-master1.lua
@@ -0,0 +1,15 @@
+#!/usr/bin/env tarantool
+
+require('console').listen(os.getenv('ADMIN'))
+box.cfg({
+    listen = 'unix/:./master1.sock',
+    replication = {
+        'unix/:./master1.sock',
+        'unix/:./master2.sock'
+    },
+    election_mode = 'candidate',
+    election_timeout = 0.1,
+    instance_uuid = '10f9828d-b5d5-46a9-b698-ddac7cce5e27',
+})
+box.ctl.wait_rw()
+box.schema.user.grant('guest', 'super')
diff --git a/test/replication/gh-6127-master2.lua b/test/replication/gh-6127-master2.lua
new file mode 100644
index 000000000..1851070c7
--- /dev/null
+++ b/test/replication/gh-6127-master2.lua
@@ -0,0 +1,13 @@
+#!/usr/bin/env tarantool
+
+require('console').listen(os.getenv('ADMIN'))
+box.cfg({
+    listen = 'unix/:./master2.sock',
+    replication = {
+        'unix/:./master1.sock',
+        'unix/:./master2.sock'
+    },
+    election_mode = 'voter',
+    election_timeout = 0.1,
+    instance_uuid = '20f9828d-b5d5-46a9-b698-ddac7cce5e27',
+})
diff --git a/test/replication/gh-6127-raft-join-new.result b/test/replication/gh-6127-raft-join-new.result
new file mode 100644
index 000000000..be6f8489b
--- /dev/null
+++ b/test/replication/gh-6127-raft-join-new.result
@@ -0,0 +1,105 @@
+-- test-run result file version 2
+test_run = require('test_run').new()
+ | ---
+ | ...
+
+--
+-- gh-6127: the algorithm selecting a node from which to join to a replicaset
+-- should take into account who is the leader (is writable and can write to
+-- _cluster) and who is a follower/candidate.
+--
+test_run:cmd('create server master1 with script="replication/gh-6127-master1.lua"')
+ | ---
+ | - true
+ | ...
+test_run:cmd('start server master1 with wait=False')
+ | ---
+ | - true
+ | ...
+test_run:cmd('create server master2 with script="replication/gh-6127-master2.lua"')
+ | ---
+ | - true
+ | ...
+test_run:cmd('start server master2')
+ | ---
+ | - true
+ | ...
+
+test_run:switch('master1')
+ | ---
+ | - true
+ | ...
+box.cfg{election_mode = 'voter'}
+ | ---
+ | ...
+test_run:switch('master2')
+ | ---
+ | - true
+ | ...
+-- Perform manual election because it is faster - the automatic one still tries
+-- to wait for 'death timeout' first which is several seconds.
+box.cfg{                                                                        \
+    election_mode = 'manual',                                                   \
+    election_timeout = 0.1,                                                     \
+}
+ | ---
+ | ...
+box.ctl.promote()
+ | ---
+ | ...
+box.ctl.wait_rw()
+ | ---
+ | ...
+-- Make sure the other node received the promotion row. Vclocks now should be
+-- equal so the new node would select only using read-only state and min UUID.
+test_run:wait_lsn('master1', 'master2')
+ | ---
+ | ...
+
+-- Min UUID is master1, but it is not writable. Therefore must join from
+-- master2.
+test_run:cmd('create server replica with script="replication/gh-6127-replica.lua"')
+ | ---
+ | - true
+ | ...
+test_run:cmd('start server replica')
+ | ---
+ | - true
+ | ...
+test_run:switch('replica')
+ | ---
+ | - true
+ | ...
+assert(box.info.leader ~= 0)
+ | ---
+ | - true
+ | ...
+
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+test_run:cmd('stop server replica')
+ | ---
+ | - true
+ | ...
+test_run:cmd('delete server replica')
+ | ---
+ | - true
+ | ...
+test_run:cmd('stop server master2')
+ | ---
+ | - true
+ | ...
+test_run:cmd('delete server master2')
+ | ---
+ | - true
+ | ...
+test_run:cmd('stop server master1')
+ | ---
+ | - true
+ | ...
+test_run:cmd('delete server master1')
+ | ---
+ | - true
+ | ...
diff --git a/test/replication/gh-6127-raft-join-new.test.lua b/test/replication/gh-6127-raft-join-new.test.lua
new file mode 100644
index 000000000..3e0e9f226
--- /dev/null
+++ b/test/replication/gh-6127-raft-join-new.test.lua
@@ -0,0 +1,41 @@
+test_run = require('test_run').new()
+
+--
+-- gh-6127: the algorithm selecting a node from which to join to a replicaset
+-- should take into account who is the leader (is writable and can write to
+-- _cluster) and who is a follower/candidate.
+--
+test_run:cmd('create server master1 with script="replication/gh-6127-master1.lua"')
+test_run:cmd('start server master1 with wait=False')
+test_run:cmd('create server master2 with script="replication/gh-6127-master2.lua"')
+test_run:cmd('start server master2')
+
+test_run:switch('master1')
+box.cfg{election_mode = 'voter'}
+test_run:switch('master2')
+-- Perform manual election because it is faster - the automatic one still tries
+-- to wait for 'death timeout' first which is several seconds.
+box.cfg{                                                                        \
+    election_mode = 'manual',                                                   \
+    election_timeout = 0.1,                                                     \
+}
+box.ctl.promote()
+box.ctl.wait_rw()
+-- Make sure the other node received the promotion row. Vclocks now should be
+-- equal so the new node would select only using read-only state and min UUID.
+test_run:wait_lsn('master1', 'master2')
+
+-- Min UUID is master1, but it is not writable. Therefore must join from
+-- master2.
+test_run:cmd('create server replica with script="replication/gh-6127-replica.lua"')
+test_run:cmd('start server replica')
+test_run:switch('replica')
+assert(box.info.leader ~= 0)
+
+test_run:switch('default')
+test_run:cmd('stop server replica')
+test_run:cmd('delete server replica')
+test_run:cmd('stop server master2')
+test_run:cmd('delete server master2')
+test_run:cmd('stop server master1')
+test_run:cmd('delete server master1')
diff --git a/test/replication/gh-6127-replica.lua b/test/replication/gh-6127-replica.lua
new file mode 100644
index 000000000..9f4c35ecd
--- /dev/null
+++ b/test/replication/gh-6127-replica.lua
@@ -0,0 +1,9 @@
+#!/usr/bin/env tarantool
+
+require('console').listen(os.getenv('ADMIN'))
+box.cfg({
+    replication = {
+        'unix/:./master1.sock',
+        'unix/:./master2.sock'
+    },
+})
-- 
2.24.3 (Apple Git-128)



