Re[2]: [PATCH] replication: fix bug with zero replication_connect_quorum

Konstantin Belyavskiy k.belyavskiy at tarantool.org
Mon Apr 9 10:29:32 MSK 2018




>Суббота,  7 апреля 2018, 17:55 +03:00 от Vladimir Davydov <vdavydov.dev at gmail.com>:
>
>On Fri, Apr 06, 2018 at 11:39:45AM +0300, Konstantin Belyavskiy wrote:
>> If 'box.cfg.read_only' is false, 'replication' defines at least one
>> replica (other than itself), but they are not available at the time
>> of box.cfg execution and replication_connect_quorum is set to zero,
>> master displays 'orphan' status instead of 'running' since logic
>> which cnange this state is executed only after successfull connection.
>> 
>> Closes #3278
>> ---
>> ticket:  https://github.com/tarantool/tarantool/issues/3278
>> branch:  https://github.com/tarantool/tarantool/compare/gh-3278-fix-bug-with-zero-replication-connect-quorum
>> 
>>  src/box/replication.cc                 |  8 +++-
>>  test/replication/check_quorum.result   | 86 ++++++++++++++++++++++++++++++++++
>>  test/replication/check_quorum.test.lua | 30 ++++++++++++
>>  test/replication/replica_params.lua    | 21 +++++++++
>>  4 files changed, 143 insertions(+), 2 deletions(-)
>>  create mode 100644 test/replication/check_quorum.result
>>  create mode 100644 test/replication/check_quorum.test.lua
>>  create mode 100644 test/replication/replica_params.lua
>
>> diff --git a/test/replication/check_quorum.test.lua b/test/replication/check_quorum.test.lua
>
>This test case should be a part of replication/quorum.test.lua 
Replace
>
>> new file mode 100644
>> index 000000000..497d2af6c
>> --- /dev/null
>> +++ b/test/replication/check_quorum.test.lua
>> @@ -0,0 +1,30 @@
>> +--
>> +-- gh-3278: test different replication and replication_connect_quorum configs.
>> +--
>> +
>
>This test passes with and without the fix, i.e. it is pointless:
>it is supposed to test the case of replication_connect_quorum=0,
>but it only sets this parameter to 1 or 2. 
Yes, sorry, want to test also old behaviour to not break backward capability and also
check cases in ticket, but forgot to add main case. Fixed.
And removed all others to make it shorter.
>
>> +env = require('test_run')
>> +test_run = env.new()
>> +socket = require('socket')
>> +s = socket.tcp_server('localhost', 3371, function() end)
>
>Do not use an arbitrary tcp port as it may be busy. 
Vladimir, please checkout latest version, already fixed, first find empty port, then pass as a param.
>
>> +test_run:cmd('switch default')
>> +listen = os.getenv("LISTEN")
>> +box.cfg{listen = listen, replication_timeout = 1, read_only = false}
>> +box.info.status -- running
>> +test_run:cmd("restart server default")
>> +listen = os.getenv("LISTEN")
>> +box.cfg{listen = listen, replication = { listen }, replication_timeout = 1, read_only = false}
>> +box.info.status -- running
>> +test_run:cmd("restart server default")
>> +listen = os.getenv("LISTEN")
>> +box.cfg{listen = listen, replication = { listen }, replication_timeout = 1, read_only = false, replication_connect_quorum = 1}
>> +box.info.status -- running
>> +
>> +test_run:cmd("create server replica with rpl_master=default, script='replication/replica_params.lua'")
>> +test_run:cmd("start server replica with args='2 1'")
>
>Why don't you simply use the default server as replication master?
>Something like this:
>
>box.schema.user.grant('guest', 'replication')
>test_run:cmd("create server replica with rpl_master=default, script='replication/replica_quorum.lua'")
>test_run:cmd("start server replica")
>test_run:cmd("switch replica")
>box.info.status -- running
>test_run:cmd("switch default")
>test_run:cmd("stop server replica")
>listen = box.cfg.listen
>box.cfg{listen = ''}
>test_run:cmd("start server replica")
>test_run:cmd("switch replica")
>box.info.status -- running
>test_run:cmd("switch default")
>test_run:cmd("stop server replica")
>test_run:cmd("cleanup server replica")
>box.schema.user.revoke('guest', 'replication')
>box.cfg{listen = listen} 
Have to create separate scripts with parameters, since first have to do box.cfg{} without replication
and then with some unreachable.
>
>> +test_run:cmd('switch replica')
>> +box.info.status -- running
>> +box.cfg{replication_connect_quorum = 1}
>> +test_run:cmd('switch default')
>> +test_run:cmd("restart server replica with args='2 2'")
>> +test_run:cmd('switch replica')
>> +box.info.status -- orphan
>> diff --git a/test/replication/replica_params.lua b/test/replication/replica_params.lua
>> new file mode 100644
>> index 000000000..73a15e6c3
>> --- /dev/null
>> +++ b/test/replication/replica_params.lua
>> @@ -0,0 +1,21 @@
>> +#!/usr/bin/env tarantool
>> +
>> +local quorum = tonumber(arg[1])
>> +local n_replics = tonumber(arg[2])
>> +listen = os.getenv("LISTEN")
>> +-- Test different replicaset configurations:
>> +-- First, when the only address in the replicaset is itself.
>> +repl = {listen}
>> +-- To test situation with second master unavailable, add
>> +-- second address (should be empty).
>> +if n_replics == 2 then repl = { listen, '127.0.0.1:3371' } end
>> +
>> +box.cfg({
>> +    listen              = listen,
>> +    replication         = repl,
>> +    memtx_memory        = 107374182,
>> +    replication_connect_quorum = quorum,
>> +    replication_connect_timeout = 0.1,
>> +})
>> +
>> +require('console').listen(os.getenv('ADMIN'))


Best regards,
Konstantin Belyavskiy
k.belyavskiy at tarantool.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.tarantool.org/pipermail/tarantool-patches/attachments/20180409/0b919187/attachment.html>


More information about the Tarantool-patches mailing list