Tarantool development patches archive
 help / color / mirror / Atom feed
* [PATCH] replication: fix bug with zero replication_connect_quorum
@ 2018-04-06  8:39 Konstantin Belyavskiy
  2018-04-06  8:49 ` [tarantool-patches] " Konstantin Osipov
  2018-04-07 14:55 ` Vladimir Davydov
  0 siblings, 2 replies; 4+ messages in thread
From: Konstantin Belyavskiy @ 2018-04-06  8:39 UTC (permalink / raw)
  To: vdavydov, georgy; +Cc: tarantool-patches

If 'box.cfg.read_only' is false, 'replication' defines at least one
replica (other than itself), but they are not available at the time
of box.cfg execution and replication_connect_quorum is set to zero,
master displays 'orphan' status instead of 'running' since logic
which cnange this state is executed only after successfull connection.

Closes #3278
---
ticket: https://github.com/tarantool/tarantool/issues/3278
branch: https://github.com/tarantool/tarantool/compare/gh-3278-fix-bug-with-zero-replication-connect-quorum

 src/box/replication.cc                 |  8 +++-
 test/replication/check_quorum.result   | 86 ++++++++++++++++++++++++++++++++++
 test/replication/check_quorum.test.lua | 30 ++++++++++++
 test/replication/replica_params.lua    | 21 +++++++++
 4 files changed, 143 insertions(+), 2 deletions(-)
 create mode 100644 test/replication/check_quorum.result
 create mode 100644 test/replication/check_quorum.test.lua
 create mode 100644 test/replication/replica_params.lua

diff --git a/src/box/replication.cc b/src/box/replication.cc
index b1c84d36c..760f83751 100644
--- a/src/box/replication.cc
+++ b/src/box/replication.cc
@@ -600,8 +600,12 @@ error:
 void
 replicaset_follow(void)
 {
-	if (replicaset.applier.total == 0) {
-		/* Replication is not configured. */
+	if (replicaset.applier.total == 0 || replicaset_quorum() == 0) {
+		/*
+		 * Replication is not configured or quorum is set to
+		 * zero so in the latter case we have no need to wait
+		 * for others.
+		 */
 		box_clear_orphan();
 		return;
 	}
diff --git a/test/replication/check_quorum.result b/test/replication/check_quorum.result
new file mode 100644
index 000000000..017d8380f
--- /dev/null
+++ b/test/replication/check_quorum.result
@@ -0,0 +1,86 @@
+--
+-- gh-3278: test different replication and replication_connect_quorum configs.
+--
+env = require('test_run')
+---
+...
+test_run = env.new()
+---
+...
+socket = require('socket')
+---
+...
+s = socket.tcp_server('localhost', 3371, function() end)
+---
+...
+test_run:cmd('switch default')
+---
+- true
+...
+listen = os.getenv("LISTEN")
+---
+...
+box.cfg{listen = listen, replication_timeout = 1, read_only = false}
+---
+...
+box.info.status -- running
+---
+- running
+...
+test_run:cmd("restart server default")
+listen = os.getenv("LISTEN")
+---
+...
+box.cfg{listen = listen, replication = { listen }, replication_timeout = 1, read_only = false}
+---
+...
+box.info.status -- running
+---
+- running
+...
+test_run:cmd("restart server default")
+listen = os.getenv("LISTEN")
+---
+...
+box.cfg{listen = listen, replication = { listen }, replication_timeout = 1, read_only = false, replication_connect_quorum = 1}
+---
+...
+box.info.status -- running
+---
+- running
+...
+test_run:cmd("create server replica with rpl_master=default, script='replication/replica_params.lua'")
+---
+- true
+...
+test_run:cmd("start server replica with args='2 1'")
+---
+- true
+...
+test_run:cmd('switch replica')
+---
+- true
+...
+box.info.status -- running
+---
+- running
+...
+box.cfg{replication_connect_quorum = 1}
+---
+...
+test_run:cmd('switch default')
+---
+- true
+...
+test_run:cmd("restart server replica with args='2 2'")
+---
+- true
+...
+test_run:cmd('switch replica')
+---
+- true
+...
+box.info.status -- orphan
+---
+- orphan
+...
diff --git a/test/replication/check_quorum.test.lua b/test/replication/check_quorum.test.lua
new file mode 100644
index 000000000..497d2af6c
--- /dev/null
+++ b/test/replication/check_quorum.test.lua
@@ -0,0 +1,30 @@
+--
+-- gh-3278: test different replication and replication_connect_quorum configs.
+--
+
+env = require('test_run')
+test_run = env.new()
+socket = require('socket')
+s = socket.tcp_server('localhost', 3371, function() end)
+test_run:cmd('switch default')
+listen = os.getenv("LISTEN")
+box.cfg{listen = listen, replication_timeout = 1, read_only = false}
+box.info.status -- running
+test_run:cmd("restart server default")
+listen = os.getenv("LISTEN")
+box.cfg{listen = listen, replication = { listen }, replication_timeout = 1, read_only = false}
+box.info.status -- running
+test_run:cmd("restart server default")
+listen = os.getenv("LISTEN")
+box.cfg{listen = listen, replication = { listen }, replication_timeout = 1, read_only = false, replication_connect_quorum = 1}
+box.info.status -- running
+
+test_run:cmd("create server replica with rpl_master=default, script='replication/replica_params.lua'")
+test_run:cmd("start server replica with args='2 1'")
+test_run:cmd('switch replica')
+box.info.status -- running
+box.cfg{replication_connect_quorum = 1}
+test_run:cmd('switch default')
+test_run:cmd("restart server replica with args='2 2'")
+test_run:cmd('switch replica')
+box.info.status -- orphan
diff --git a/test/replication/replica_params.lua b/test/replication/replica_params.lua
new file mode 100644
index 000000000..73a15e6c3
--- /dev/null
+++ b/test/replication/replica_params.lua
@@ -0,0 +1,21 @@
+#!/usr/bin/env tarantool
+
+local quorum = tonumber(arg[1])
+local n_replics = tonumber(arg[2])
+listen = os.getenv("LISTEN")
+-- Test different replicaset configurations:
+-- First, when the only address in the replicaset is itself.
+repl = {listen}
+-- To test situation with second master unavailable, add
+-- second address (should be empty).
+if n_replics == 2 then repl = { listen, '127.0.0.1:3371' } end
+
+box.cfg({
+    listen              = listen,
+    replication         = repl,
+    memtx_memory        = 107374182,
+    replication_connect_quorum = quorum,
+    replication_connect_timeout = 0.1,
+})
+
+require('console').listen(os.getenv('ADMIN'))
-- 
2.14.3 (Apple Git-98)

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [tarantool-patches] [PATCH] replication: fix bug with zero replication_connect_quorum
  2018-04-06  8:39 [PATCH] replication: fix bug with zero replication_connect_quorum Konstantin Belyavskiy
@ 2018-04-06  8:49 ` Konstantin Osipov
  2018-04-06 13:02   ` [tarantool-patches] " Konstantin Belyavskiy
  2018-04-07 14:55 ` Vladimir Davydov
  1 sibling, 1 reply; 4+ messages in thread
From: Konstantin Osipov @ 2018-04-06  8:49 UTC (permalink / raw)
  To: tarantool-patches; +Cc: vdavydov, georgy

* Konstantin Belyavskiy <k.belyavskiy@tarantool.org> [18/04/06 11:40]:
> +socket = require('socket')
> +---
> +...
> +s = socket.tcp_server('localhost', 3371, function() end)

Is it possible to avoid using a specific port? The port may be
already in use for a thousand of reasons, and this will break the test.
> +++ b/test/replication/replica_params.lua
> @@ -0,0 +1,21 @@
> +#!/usr/bin/env tarantool
> +
> +local quorum = tonumber(arg[1])
> +local n_replics = tonumber(arg[2])
> +listen = os.getenv("LISTEN")
> +-- Test different replicaset configurations:
> +-- First, when the only address in the replicaset is itself.
> +repl = {listen}
> +-- To test situation with second master unavailable, add
> +-- second address (should be empty).
> +if n_replics == 2 then repl = { listen, '127.0.0.1:3371' } end
> +
> +box.cfg({
> +    listen              = listen,
> +    replication         = repl,
> +    memtx_memory        = 107374182,

Why? What's magical about this number? Could you use 100 * 1024 *
1024 instead? Then it's clear the test is not using too much
memory and the exact amount is also clearly visible.

> +    replication_connect_quorum = quorum,
> +    replication_connect_timeout = 0.1,
> +})
> +
> +require('console').listen(os.getenv('ADMIN'))

Thanks,


-- 
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [tarantool-patches] Re: [PATCH] replication: fix bug with zero replication_connect_quorum
  2018-04-06  8:49 ` [tarantool-patches] " Konstantin Osipov
@ 2018-04-06 13:02   ` Konstantin Belyavskiy
  0 siblings, 0 replies; 4+ messages in thread
From: Konstantin Belyavskiy @ 2018-04-06 13:02 UTC (permalink / raw)
  To: tarantool-patches; +Cc: vdavydov, georgy

[-- Attachment #1: Type: text/plain, Size: 1900 bytes --]




>Пятница,  6 апреля 2018, 11:49 +03:00 от Konstantin Osipov <kostja@tarantool.org>:
>
>* Konstantin Belyavskiy < k.belyavskiy@tarantool.org > [18/04/06 11:40]:
>> +socket = require('socket')
>> +---
>> +...
>> +s = socket.tcp_server('localhost', 3371, function() end)
>
>Is it possible to avoid using a specific port? The port may be
>already in use for a thousand of reasons, and this will break the test. 
Done, the aim here was to make one port busy to make sure, that no other tarantool instance use it.
But here was an error due to this port was socket based.
Update this code and transfer found port to replica script as a third parameter.
>> +++ b/test/replication/replica_params.lua
>> @@ -0,0 +1,21 @@
>> +#!/usr/bin/env tarantool
>> +
>> +local quorum = tonumber(arg[1])
>> +local n_replics = tonumber(arg[2])
>> +listen = os.getenv("LISTEN")
>> +-- Test different replicaset configurations:
>> +-- First, when the only address in the replicaset is itself.
>> +repl = {listen}
>> +-- To test situation with second master unavailable, add
>> +-- second address (should be empty).
>> +if n_replics == 2 then repl = { listen, '127.0.0.1:3371' } end
>> +
>> +box.cfg({
>> +    listen              = listen,
>> +    replication         = repl,
>> +    memtx_memory        = 107374182,
>
>Why? What's magical about this number? Could you use 100 * 1024 *
>1024 instead? Then it's clear the test is not using too much
>memory and the exact amount is also clearly visible. 
OK, it was "copy exactly" principle, just reuse same value.
>> +    replication_connect_quorum = quorum,
>> +    replication_connect_timeout = 0.1,
>> +})
>> +
>> +require('console').listen(os.getenv('ADMIN'))
>
>Thanks,
>
>
>-- 
>Konstantin Osipov, Moscow, Russia,  +7 903 626 22 32
>http://tarantool.io -  www.twitter.com/kostja_osipov
>


Best regards,
Konstantin Belyavskiy
k.belyavskiy@tarantool.org

[-- Attachment #2: Type: text/html, Size: 3063 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] replication: fix bug with zero replication_connect_quorum
  2018-04-06  8:39 [PATCH] replication: fix bug with zero replication_connect_quorum Konstantin Belyavskiy
  2018-04-06  8:49 ` [tarantool-patches] " Konstantin Osipov
@ 2018-04-07 14:55 ` Vladimir Davydov
  1 sibling, 0 replies; 4+ messages in thread
From: Vladimir Davydov @ 2018-04-07 14:55 UTC (permalink / raw)
  To: Konstantin Belyavskiy; +Cc: georgy, tarantool-patches

On Fri, Apr 06, 2018 at 11:39:45AM +0300, Konstantin Belyavskiy wrote:
> If 'box.cfg.read_only' is false, 'replication' defines at least one
> replica (other than itself), but they are not available at the time
> of box.cfg execution and replication_connect_quorum is set to zero,
> master displays 'orphan' status instead of 'running' since logic
> which cnange this state is executed only after successfull connection.
> 
> Closes #3278
> ---
> ticket: https://github.com/tarantool/tarantool/issues/3278
> branch: https://github.com/tarantool/tarantool/compare/gh-3278-fix-bug-with-zero-replication-connect-quorum
> 
>  src/box/replication.cc                 |  8 +++-
>  test/replication/check_quorum.result   | 86 ++++++++++++++++++++++++++++++++++
>  test/replication/check_quorum.test.lua | 30 ++++++++++++
>  test/replication/replica_params.lua    | 21 +++++++++
>  4 files changed, 143 insertions(+), 2 deletions(-)
>  create mode 100644 test/replication/check_quorum.result
>  create mode 100644 test/replication/check_quorum.test.lua
>  create mode 100644 test/replication/replica_params.lua

> diff --git a/test/replication/check_quorum.test.lua b/test/replication/check_quorum.test.lua

This test case should be a part of replication/quorum.test.lua

> new file mode 100644
> index 000000000..497d2af6c
> --- /dev/null
> +++ b/test/replication/check_quorum.test.lua
> @@ -0,0 +1,30 @@
> +--
> +-- gh-3278: test different replication and replication_connect_quorum configs.
> +--
> +

This test passes with and without the fix, i.e. it is pointless:
it is supposed to test the case of replication_connect_quorum=0,
but it only sets this parameter to 1 or 2.

> +env = require('test_run')
> +test_run = env.new()
> +socket = require('socket')
> +s = socket.tcp_server('localhost', 3371, function() end)

Do not use an arbitrary tcp port as it may be busy.

> +test_run:cmd('switch default')
> +listen = os.getenv("LISTEN")
> +box.cfg{listen = listen, replication_timeout = 1, read_only = false}
> +box.info.status -- running
> +test_run:cmd("restart server default")
> +listen = os.getenv("LISTEN")
> +box.cfg{listen = listen, replication = { listen }, replication_timeout = 1, read_only = false}
> +box.info.status -- running
> +test_run:cmd("restart server default")
> +listen = os.getenv("LISTEN")
> +box.cfg{listen = listen, replication = { listen }, replication_timeout = 1, read_only = false, replication_connect_quorum = 1}
> +box.info.status -- running
> +
> +test_run:cmd("create server replica with rpl_master=default, script='replication/replica_params.lua'")
> +test_run:cmd("start server replica with args='2 1'")

Why don't you simply use the default server as replication master?
Something like this:

box.schema.user.grant('guest', 'replication')
test_run:cmd("create server replica with rpl_master=default, script='replication/replica_quorum.lua'")
test_run:cmd("start server replica")
test_run:cmd("switch replica")
box.info.status -- running
test_run:cmd("switch default")
test_run:cmd("stop server replica")
listen = box.cfg.listen
box.cfg{listen = ''}
test_run:cmd("start server replica")
test_run:cmd("switch replica")
box.info.status -- running
test_run:cmd("switch default")
test_run:cmd("stop server replica")
test_run:cmd("cleanup server replica")
box.schema.user.revoke('guest', 'replication')
box.cfg{listen = listen}

> +test_run:cmd('switch replica')
> +box.info.status -- running
> +box.cfg{replication_connect_quorum = 1}
> +test_run:cmd('switch default')
> +test_run:cmd("restart server replica with args='2 2'")
> +test_run:cmd('switch replica')
> +box.info.status -- orphan
> diff --git a/test/replication/replica_params.lua b/test/replication/replica_params.lua
> new file mode 100644
> index 000000000..73a15e6c3
> --- /dev/null
> +++ b/test/replication/replica_params.lua
> @@ -0,0 +1,21 @@
> +#!/usr/bin/env tarantool
> +
> +local quorum = tonumber(arg[1])
> +local n_replics = tonumber(arg[2])
> +listen = os.getenv("LISTEN")
> +-- Test different replicaset configurations:
> +-- First, when the only address in the replicaset is itself.
> +repl = {listen}
> +-- To test situation with second master unavailable, add
> +-- second address (should be empty).
> +if n_replics == 2 then repl = { listen, '127.0.0.1:3371' } end
> +
> +box.cfg({
> +    listen              = listen,
> +    replication         = repl,
> +    memtx_memory        = 107374182,
> +    replication_connect_quorum = quorum,
> +    replication_connect_timeout = 0.1,
> +})
> +
> +require('console').listen(os.getenv('ADMIN'))

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2018-04-07 14:55 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-06  8:39 [PATCH] replication: fix bug with zero replication_connect_quorum Konstantin Belyavskiy
2018-04-06  8:49 ` [tarantool-patches] " Konstantin Osipov
2018-04-06 13:02   ` [tarantool-patches] " Konstantin Belyavskiy
2018-04-07 14:55 ` Vladimir Davydov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox