* [PATCH] replication: fix bug with zero replication_connect_quorum
@ 2018-04-09 8:04 Konstantin Belyavskiy
2018-04-09 8:41 ` Vladimir Davydov
0 siblings, 1 reply; 4+ messages in thread
From: Konstantin Belyavskiy @ 2018-04-09 8:04 UTC (permalink / raw)
To: vdavydov, georgy; +Cc: tarantool-patches
If 'box.cfg.read_only' is false, 'replication' defines at least one
replica (other than itself), but they are not available at the time
of box.cfg execution and replication_connect_quorum is set to zero,
master displays 'orphan' status instead of 'running' since logic
which cnange this state is executed only after successfull connection.
ticket: https://github.com/tarantool/tarantool/issues/3278
branch: https://github.com/tarantool/tarantool/compare/gh-3278-fix-bug-with-zero-replication-connect-quorum
Closes #3278
---
src/box/replication.cc | 8 ++++--
test/replication/quorum.result | 52 +++++++++++++++++++++++++++++++++++++
test/replication/quorum.test.lua | 25 ++++++++++++++++++
test/replication/replica_params.lua | 23 ++++++++++++++++
4 files changed, 106 insertions(+), 2 deletions(-)
create mode 100644 test/replication/replica_params.lua
diff --git a/src/box/replication.cc b/src/box/replication.cc
index b1c84d36c..760f83751 100644
--- a/src/box/replication.cc
+++ b/src/box/replication.cc
@@ -600,8 +600,12 @@ error:
void
replicaset_follow(void)
{
- if (replicaset.applier.total == 0) {
- /* Replication is not configured. */
+ if (replicaset.applier.total == 0 || replicaset_quorum() == 0) {
+ /*
+ * Replication is not configured or quorum is set to
+ * zero so in the latter case we have no need to wait
+ * for others.
+ */
box_clear_orphan();
return;
}
diff --git a/test/replication/quorum.result b/test/replication/quorum.result
index 34408d421..78d98bfa2 100644
--- a/test/replication/quorum.result
+++ b/test/replication/quorum.result
@@ -239,3 +239,55 @@ test_run:cmd('switch default')
test_run:drop_cluster(SERVERS)
---
...
+--
+-- gh-3278: test different replication and replication_connect_quorum configs.
+--
+-- For next test try to find port with no tarantool running on it.
+remote = require('net.box')
+---
+...
+attempt = 0
+---
+...
+port = 32768 + math.random(32768)
+---
+...
+test_run:cmd("setopt delimiter ';'")
+---
+- true
+...
+while attempt < 10 do
+ s = remote.connect('localhost:'..port)
+ if s.state ~= 'error' then
+ port = 32768 + math.random(32768)
+ attempt = attempt + 1
+ else
+ break
+ end
+end;
+---
+...
+test_run:cmd("setopt delimiter ''");
+---
+- true
+...
+test_run:cmd("create server replica with rpl_master=default, script='replication/replica_params.lua'")
+---
+- true
+...
+test_run:cmd("start server replica with args='0 0 "..port.."'")
+---
+- true
+...
+test_run:cmd("restart server replica with args='0 1 "..port.."'")
+---
+- true
+...
+test_run:cmd('switch replica')
+---
+- true
+...
+box.info.status -- running (old: orphan)
+---
+- running
+...
diff --git a/test/replication/quorum.test.lua b/test/replication/quorum.test.lua
index 856006843..d92ed23a6 100644
--- a/test/replication/quorum.test.lua
+++ b/test/replication/quorum.test.lua
@@ -97,3 +97,28 @@ box.info.replication[4].upstream.status
-- Cleanup.
test_run:cmd('switch default')
test_run:drop_cluster(SERVERS)
+
+--
+-- gh-3278: test different replication and replication_connect_quorum configs.
+--
+
+-- For next test try to find port with no tarantool running on it.
+remote = require('net.box')
+attempt = 0
+port = 32768 + math.random(32768)
+test_run:cmd("setopt delimiter ';'")
+while attempt < 10 do
+ s = remote.connect('localhost:'..port)
+ if s.state ~= 'error' then
+ port = 32768 + math.random(32768)
+ attempt = attempt + 1
+ else
+ break
+ end
+end;
+test_run:cmd("setopt delimiter ''");
+test_run:cmd("create server replica with rpl_master=default, script='replication/replica_params.lua'")
+test_run:cmd("start server replica with args='0 0 "..port.."'")
+test_run:cmd("restart server replica with args='0 1 "..port.."'")
+test_run:cmd('switch replica')
+box.info.status -- running (old: orphan)
diff --git a/test/replication/replica_params.lua b/test/replication/replica_params.lua
new file mode 100644
index 000000000..4f8ee5d0a
--- /dev/null
+++ b/test/replication/replica_params.lua
@@ -0,0 +1,23 @@
+#!/usr/bin/env tarantool
+
+local quorum = tonumber(arg[1])
+local n_replics = tonumber(arg[2])
+local addr = '127.0.0.1:'..arg[3]
+listen = os.getenv("LISTEN")
+-- Test different replicaset configurations:
+-- First, when the only address in the replicaset is itself.
+repl = {}
+-- To test situation with second master unavailable, add
+-- second address (should be empty).
+if n_replics == 1 then repl = { addr } end
+if n_replics == 2 then repl = { listen, addr } end
+
+box.cfg({
+ listen = listen,
+ replication = repl,
+ memtx_memory = 100 * 1024 * 1024,
+ replication_connect_quorum = quorum,
+ replication_connect_timeout = 0.1,
+})
+
+require('console').listen(os.getenv('ADMIN'))
--
2.14.3 (Apple Git-98)
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] replication: fix bug with zero replication_connect_quorum
2018-04-09 8:04 [PATCH] replication: fix bug with zero replication_connect_quorum Konstantin Belyavskiy
@ 2018-04-09 8:41 ` Vladimir Davydov
0 siblings, 0 replies; 4+ messages in thread
From: Vladimir Davydov @ 2018-04-09 8:41 UTC (permalink / raw)
To: Konstantin Belyavskiy; +Cc: georgy, tarantool-patches
v2 tag missing, changes since v1 missing
Please read and follow
https://tarantool.io/en/doc/1.9/dev_guide/developer_guidelines.html#how-to-submit-a-patch-for-review
On Mon, Apr 09, 2018 at 11:04:04AM +0300, Konstantin Belyavskiy wrote:
> diff --git a/test/replication/quorum.test.lua b/test/replication/quorum.test.lua
> index 856006843..d92ed23a6 100644
> --- a/test/replication/quorum.test.lua
> +++ b/test/replication/quorum.test.lua
> @@ -97,3 +97,28 @@ box.info.replication[4].upstream.status
> -- Cleanup.
> test_run:cmd('switch default')
> test_run:drop_cluster(SERVERS)
> +
> +--
> +-- gh-3278: test different replication and replication_connect_quorum configs.
> +--
> +
> +-- For next test try to find port with no tarantool running on it.
> +remote = require('net.box')
> +attempt = 0
> +port = 32768 + math.random(32768)
> +test_run:cmd("setopt delimiter ';'")
> +while attempt < 10 do
> + s = remote.connect('localhost:'..port)
> + if s.state ~= 'error' then
> + port = 32768 + math.random(32768)
> + attempt = attempt + 1
> + else
> + break
> + end
> +end;
> +test_run:cmd("setopt delimiter ''");
> +test_run:cmd("create server replica with rpl_master=default, script='replication/replica_params.lua'")
> +test_run:cmd("start server replica with args='0 0 "..port.."'")
> +test_run:cmd("restart server replica with args='0 1 "..port.."'")
> +test_run:cmd('switch replica')
> +box.info.status -- running (old: orphan)
> diff --git a/test/replication/replica_params.lua b/test/replication/replica_params.lua
> new file mode 100644
> index 000000000..4f8ee5d0a
> --- /dev/null
> +++ b/test/replication/replica_params.lua
> @@ -0,0 +1,23 @@
> +#!/usr/bin/env tarantool
> +
> +local quorum = tonumber(arg[1])
> +local n_replics = tonumber(arg[2])
> +local addr = '127.0.0.1:'..arg[3]
> +listen = os.getenv("LISTEN")
> +-- Test different replicaset configurations:
> +-- First, when the only address in the replicaset is itself.
> +repl = {}
> +-- To test situation with second master unavailable, add
> +-- second address (should be empty).
> +if n_replics == 1 then repl = { addr } end
> +if n_replics == 2 then repl = { listen, addr } end
> +
> +box.cfg({
> + listen = listen,
> + replication = repl,
> + memtx_memory = 100 * 1024 * 1024,
> + replication_connect_quorum = quorum,
> + replication_connect_timeout = 0.1,
> +})
> +
> +require('console').listen(os.getenv('ADMIN'))
Why all this complexity? Why don't you just use the test case from
my previous email? It's much more straightforward and easier to
understand. All you need to do is create replica_no_quorum.lua that
has replication_connect_quorum=0 and replication_connect_timeout=0.1
and then add the following test case to replication/quorum.test.lua:
box.schema.user.grant('guest', 'replication')
test_run:cmd("create server replica with rpl_master=default, script='replication/replica_no_quorum.lua'")
test_run:cmd("start server replica")
test_run:cmd("switch replica")
box.info.status -- running
test_run:cmd("switch default")
test_run:cmd("stop server replica")
listen = box.cfg.listen
box.cfg{listen = ''}
test_run:cmd("start server replica")
test_run:cmd("switch replica")
box.info.status -- running
test_run:cmd("switch default")
test_run:cmd("stop server replica")
test_run:cmd("cleanup server replica")
box.schema.user.revoke('guest', 'replication')
box.cfg{listen = listen}
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH] replication: fix bug with zero replication_connect_quorum
@ 2018-04-06 8:39 Konstantin Belyavskiy
2018-04-07 14:55 ` Vladimir Davydov
0 siblings, 1 reply; 4+ messages in thread
From: Konstantin Belyavskiy @ 2018-04-06 8:39 UTC (permalink / raw)
To: vdavydov, georgy; +Cc: tarantool-patches
If 'box.cfg.read_only' is false, 'replication' defines at least one
replica (other than itself), but they are not available at the time
of box.cfg execution and replication_connect_quorum is set to zero,
master displays 'orphan' status instead of 'running' since logic
which cnange this state is executed only after successfull connection.
Closes #3278
---
ticket: https://github.com/tarantool/tarantool/issues/3278
branch: https://github.com/tarantool/tarantool/compare/gh-3278-fix-bug-with-zero-replication-connect-quorum
src/box/replication.cc | 8 +++-
test/replication/check_quorum.result | 86 ++++++++++++++++++++++++++++++++++
test/replication/check_quorum.test.lua | 30 ++++++++++++
test/replication/replica_params.lua | 21 +++++++++
4 files changed, 143 insertions(+), 2 deletions(-)
create mode 100644 test/replication/check_quorum.result
create mode 100644 test/replication/check_quorum.test.lua
create mode 100644 test/replication/replica_params.lua
diff --git a/src/box/replication.cc b/src/box/replication.cc
index b1c84d36c..760f83751 100644
--- a/src/box/replication.cc
+++ b/src/box/replication.cc
@@ -600,8 +600,12 @@ error:
void
replicaset_follow(void)
{
- if (replicaset.applier.total == 0) {
- /* Replication is not configured. */
+ if (replicaset.applier.total == 0 || replicaset_quorum() == 0) {
+ /*
+ * Replication is not configured or quorum is set to
+ * zero so in the latter case we have no need to wait
+ * for others.
+ */
box_clear_orphan();
return;
}
diff --git a/test/replication/check_quorum.result b/test/replication/check_quorum.result
new file mode 100644
index 000000000..017d8380f
--- /dev/null
+++ b/test/replication/check_quorum.result
@@ -0,0 +1,86 @@
+--
+-- gh-3278: test different replication and replication_connect_quorum configs.
+--
+env = require('test_run')
+---
+...
+test_run = env.new()
+---
+...
+socket = require('socket')
+---
+...
+s = socket.tcp_server('localhost', 3371, function() end)
+---
+...
+test_run:cmd('switch default')
+---
+- true
+...
+listen = os.getenv("LISTEN")
+---
+...
+box.cfg{listen = listen, replication_timeout = 1, read_only = false}
+---
+...
+box.info.status -- running
+---
+- running
+...
+test_run:cmd("restart server default")
+listen = os.getenv("LISTEN")
+---
+...
+box.cfg{listen = listen, replication = { listen }, replication_timeout = 1, read_only = false}
+---
+...
+box.info.status -- running
+---
+- running
+...
+test_run:cmd("restart server default")
+listen = os.getenv("LISTEN")
+---
+...
+box.cfg{listen = listen, replication = { listen }, replication_timeout = 1, read_only = false, replication_connect_quorum = 1}
+---
+...
+box.info.status -- running
+---
+- running
+...
+test_run:cmd("create server replica with rpl_master=default, script='replication/replica_params.lua'")
+---
+- true
+...
+test_run:cmd("start server replica with args='2 1'")
+---
+- true
+...
+test_run:cmd('switch replica')
+---
+- true
+...
+box.info.status -- running
+---
+- running
+...
+box.cfg{replication_connect_quorum = 1}
+---
+...
+test_run:cmd('switch default')
+---
+- true
+...
+test_run:cmd("restart server replica with args='2 2'")
+---
+- true
+...
+test_run:cmd('switch replica')
+---
+- true
+...
+box.info.status -- orphan
+---
+- orphan
+...
diff --git a/test/replication/check_quorum.test.lua b/test/replication/check_quorum.test.lua
new file mode 100644
index 000000000..497d2af6c
--- /dev/null
+++ b/test/replication/check_quorum.test.lua
@@ -0,0 +1,30 @@
+--
+-- gh-3278: test different replication and replication_connect_quorum configs.
+--
+
+env = require('test_run')
+test_run = env.new()
+socket = require('socket')
+s = socket.tcp_server('localhost', 3371, function() end)
+test_run:cmd('switch default')
+listen = os.getenv("LISTEN")
+box.cfg{listen = listen, replication_timeout = 1, read_only = false}
+box.info.status -- running
+test_run:cmd("restart server default")
+listen = os.getenv("LISTEN")
+box.cfg{listen = listen, replication = { listen }, replication_timeout = 1, read_only = false}
+box.info.status -- running
+test_run:cmd("restart server default")
+listen = os.getenv("LISTEN")
+box.cfg{listen = listen, replication = { listen }, replication_timeout = 1, read_only = false, replication_connect_quorum = 1}
+box.info.status -- running
+
+test_run:cmd("create server replica with rpl_master=default, script='replication/replica_params.lua'")
+test_run:cmd("start server replica with args='2 1'")
+test_run:cmd('switch replica')
+box.info.status -- running
+box.cfg{replication_connect_quorum = 1}
+test_run:cmd('switch default')
+test_run:cmd("restart server replica with args='2 2'")
+test_run:cmd('switch replica')
+box.info.status -- orphan
diff --git a/test/replication/replica_params.lua b/test/replication/replica_params.lua
new file mode 100644
index 000000000..73a15e6c3
--- /dev/null
+++ b/test/replication/replica_params.lua
@@ -0,0 +1,21 @@
+#!/usr/bin/env tarantool
+
+local quorum = tonumber(arg[1])
+local n_replics = tonumber(arg[2])
+listen = os.getenv("LISTEN")
+-- Test different replicaset configurations:
+-- First, when the only address in the replicaset is itself.
+repl = {listen}
+-- To test situation with second master unavailable, add
+-- second address (should be empty).
+if n_replics == 2 then repl = { listen, '127.0.0.1:3371' } end
+
+box.cfg({
+ listen = listen,
+ replication = repl,
+ memtx_memory = 107374182,
+ replication_connect_quorum = quorum,
+ replication_connect_timeout = 0.1,
+})
+
+require('console').listen(os.getenv('ADMIN'))
--
2.14.3 (Apple Git-98)
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] replication: fix bug with zero replication_connect_quorum
2018-04-06 8:39 Konstantin Belyavskiy
@ 2018-04-07 14:55 ` Vladimir Davydov
0 siblings, 0 replies; 4+ messages in thread
From: Vladimir Davydov @ 2018-04-07 14:55 UTC (permalink / raw)
To: Konstantin Belyavskiy; +Cc: georgy, tarantool-patches
On Fri, Apr 06, 2018 at 11:39:45AM +0300, Konstantin Belyavskiy wrote:
> If 'box.cfg.read_only' is false, 'replication' defines at least one
> replica (other than itself), but they are not available at the time
> of box.cfg execution and replication_connect_quorum is set to zero,
> master displays 'orphan' status instead of 'running' since logic
> which cnange this state is executed only after successfull connection.
>
> Closes #3278
> ---
> ticket: https://github.com/tarantool/tarantool/issues/3278
> branch: https://github.com/tarantool/tarantool/compare/gh-3278-fix-bug-with-zero-replication-connect-quorum
>
> src/box/replication.cc | 8 +++-
> test/replication/check_quorum.result | 86 ++++++++++++++++++++++++++++++++++
> test/replication/check_quorum.test.lua | 30 ++++++++++++
> test/replication/replica_params.lua | 21 +++++++++
> 4 files changed, 143 insertions(+), 2 deletions(-)
> create mode 100644 test/replication/check_quorum.result
> create mode 100644 test/replication/check_quorum.test.lua
> create mode 100644 test/replication/replica_params.lua
> diff --git a/test/replication/check_quorum.test.lua b/test/replication/check_quorum.test.lua
This test case should be a part of replication/quorum.test.lua
> new file mode 100644
> index 000000000..497d2af6c
> --- /dev/null
> +++ b/test/replication/check_quorum.test.lua
> @@ -0,0 +1,30 @@
> +--
> +-- gh-3278: test different replication and replication_connect_quorum configs.
> +--
> +
This test passes with and without the fix, i.e. it is pointless:
it is supposed to test the case of replication_connect_quorum=0,
but it only sets this parameter to 1 or 2.
> +env = require('test_run')
> +test_run = env.new()
> +socket = require('socket')
> +s = socket.tcp_server('localhost', 3371, function() end)
Do not use an arbitrary tcp port as it may be busy.
> +test_run:cmd('switch default')
> +listen = os.getenv("LISTEN")
> +box.cfg{listen = listen, replication_timeout = 1, read_only = false}
> +box.info.status -- running
> +test_run:cmd("restart server default")
> +listen = os.getenv("LISTEN")
> +box.cfg{listen = listen, replication = { listen }, replication_timeout = 1, read_only = false}
> +box.info.status -- running
> +test_run:cmd("restart server default")
> +listen = os.getenv("LISTEN")
> +box.cfg{listen = listen, replication = { listen }, replication_timeout = 1, read_only = false, replication_connect_quorum = 1}
> +box.info.status -- running
> +
> +test_run:cmd("create server replica with rpl_master=default, script='replication/replica_params.lua'")
> +test_run:cmd("start server replica with args='2 1'")
Why don't you simply use the default server as replication master?
Something like this:
box.schema.user.grant('guest', 'replication')
test_run:cmd("create server replica with rpl_master=default, script='replication/replica_quorum.lua'")
test_run:cmd("start server replica")
test_run:cmd("switch replica")
box.info.status -- running
test_run:cmd("switch default")
test_run:cmd("stop server replica")
listen = box.cfg.listen
box.cfg{listen = ''}
test_run:cmd("start server replica")
test_run:cmd("switch replica")
box.info.status -- running
test_run:cmd("switch default")
test_run:cmd("stop server replica")
test_run:cmd("cleanup server replica")
box.schema.user.revoke('guest', 'replication')
box.cfg{listen = listen}
> +test_run:cmd('switch replica')
> +box.info.status -- running
> +box.cfg{replication_connect_quorum = 1}
> +test_run:cmd('switch default')
> +test_run:cmd("restart server replica with args='2 2'")
> +test_run:cmd('switch replica')
> +box.info.status -- orphan
> diff --git a/test/replication/replica_params.lua b/test/replication/replica_params.lua
> new file mode 100644
> index 000000000..73a15e6c3
> --- /dev/null
> +++ b/test/replication/replica_params.lua
> @@ -0,0 +1,21 @@
> +#!/usr/bin/env tarantool
> +
> +local quorum = tonumber(arg[1])
> +local n_replics = tonumber(arg[2])
> +listen = os.getenv("LISTEN")
> +-- Test different replicaset configurations:
> +-- First, when the only address in the replicaset is itself.
> +repl = {listen}
> +-- To test situation with second master unavailable, add
> +-- second address (should be empty).
> +if n_replics == 2 then repl = { listen, '127.0.0.1:3371' } end
> +
> +box.cfg({
> + listen = listen,
> + replication = repl,
> + memtx_memory = 107374182,
> + replication_connect_quorum = quorum,
> + replication_connect_timeout = 0.1,
> +})
> +
> +require('console').listen(os.getenv('ADMIN'))
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2018-04-09 8:41 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-09 8:04 [PATCH] replication: fix bug with zero replication_connect_quorum Konstantin Belyavskiy
2018-04-09 8:41 ` Vladimir Davydov
-- strict thread matches above, loose matches on Subject: below --
2018-04-06 8:39 Konstantin Belyavskiy
2018-04-07 14:55 ` Vladimir Davydov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox