[PATCH] replication: fix bug with zero replication_connect_quorum

Vladimir Davydov vdavydov.dev at gmail.com
Sat Apr 7 17:55:49 MSK 2018


On Fri, Apr 06, 2018 at 11:39:45AM +0300, Konstantin Belyavskiy wrote:
> If 'box.cfg.read_only' is false, 'replication' defines at least one
> replica (other than itself), but they are not available at the time
> of box.cfg execution and replication_connect_quorum is set to zero,
> master displays 'orphan' status instead of 'running' since logic
> which cnange this state is executed only after successfull connection.
> 
> Closes #3278
> ---
> ticket: https://github.com/tarantool/tarantool/issues/3278
> branch: https://github.com/tarantool/tarantool/compare/gh-3278-fix-bug-with-zero-replication-connect-quorum
> 
>  src/box/replication.cc                 |  8 +++-
>  test/replication/check_quorum.result   | 86 ++++++++++++++++++++++++++++++++++
>  test/replication/check_quorum.test.lua | 30 ++++++++++++
>  test/replication/replica_params.lua    | 21 +++++++++
>  4 files changed, 143 insertions(+), 2 deletions(-)
>  create mode 100644 test/replication/check_quorum.result
>  create mode 100644 test/replication/check_quorum.test.lua
>  create mode 100644 test/replication/replica_params.lua

> diff --git a/test/replication/check_quorum.test.lua b/test/replication/check_quorum.test.lua

This test case should be a part of replication/quorum.test.lua

> new file mode 100644
> index 000000000..497d2af6c
> --- /dev/null
> +++ b/test/replication/check_quorum.test.lua
> @@ -0,0 +1,30 @@
> +--
> +-- gh-3278: test different replication and replication_connect_quorum configs.
> +--
> +

This test passes with and without the fix, i.e. it is pointless:
it is supposed to test the case of replication_connect_quorum=0,
but it only sets this parameter to 1 or 2.

> +env = require('test_run')
> +test_run = env.new()
> +socket = require('socket')
> +s = socket.tcp_server('localhost', 3371, function() end)

Do not use an arbitrary tcp port as it may be busy.

> +test_run:cmd('switch default')
> +listen = os.getenv("LISTEN")
> +box.cfg{listen = listen, replication_timeout = 1, read_only = false}
> +box.info.status -- running
> +test_run:cmd("restart server default")
> +listen = os.getenv("LISTEN")
> +box.cfg{listen = listen, replication = { listen }, replication_timeout = 1, read_only = false}
> +box.info.status -- running
> +test_run:cmd("restart server default")
> +listen = os.getenv("LISTEN")
> +box.cfg{listen = listen, replication = { listen }, replication_timeout = 1, read_only = false, replication_connect_quorum = 1}
> +box.info.status -- running
> +
> +test_run:cmd("create server replica with rpl_master=default, script='replication/replica_params.lua'")
> +test_run:cmd("start server replica with args='2 1'")

Why don't you simply use the default server as replication master?
Something like this:

box.schema.user.grant('guest', 'replication')
test_run:cmd("create server replica with rpl_master=default, script='replication/replica_quorum.lua'")
test_run:cmd("start server replica")
test_run:cmd("switch replica")
box.info.status -- running
test_run:cmd("switch default")
test_run:cmd("stop server replica")
listen = box.cfg.listen
box.cfg{listen = ''}
test_run:cmd("start server replica")
test_run:cmd("switch replica")
box.info.status -- running
test_run:cmd("switch default")
test_run:cmd("stop server replica")
test_run:cmd("cleanup server replica")
box.schema.user.revoke('guest', 'replication')
box.cfg{listen = listen}

> +test_run:cmd('switch replica')
> +box.info.status -- running
> +box.cfg{replication_connect_quorum = 1}
> +test_run:cmd('switch default')
> +test_run:cmd("restart server replica with args='2 2'")
> +test_run:cmd('switch replica')
> +box.info.status -- orphan
> diff --git a/test/replication/replica_params.lua b/test/replication/replica_params.lua
> new file mode 100644
> index 000000000..73a15e6c3
> --- /dev/null
> +++ b/test/replication/replica_params.lua
> @@ -0,0 +1,21 @@
> +#!/usr/bin/env tarantool
> +
> +local quorum = tonumber(arg[1])
> +local n_replics = tonumber(arg[2])
> +listen = os.getenv("LISTEN")
> +-- Test different replicaset configurations:
> +-- First, when the only address in the replicaset is itself.
> +repl = {listen}
> +-- To test situation with second master unavailable, add
> +-- second address (should be empty).
> +if n_replics == 2 then repl = { listen, '127.0.0.1:3371' } end
> +
> +box.cfg({
> +    listen              = listen,
> +    replication         = repl,
> +    memtx_memory        = 107374182,
> +    replication_connect_quorum = quorum,
> +    replication_connect_timeout = 0.1,
> +})
> +
> +require('console').listen(os.getenv('ADMIN'))



More information about the Tarantool-patches mailing list