[Tarantool-patches] [PATCH v3 3/3] test: add replication/gh-5446-sqync-eval-quorum.test.lua

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Sat Dec 5 02:52:25 MSK 2020


Thanks for the patch!

See 5 comments below.

On 03.12.2020 15:04, Cyrill Gorcunov via Tarantool-patches wrote:
> Part-of #5446
> 
> Signed-off-by: Cyrill Gorcunov <gorcunov at gmail.com>
> ---
>  .../gh-5446-sqync-eval-quorum.result          | 156 ++++++++++++++++++
>  .../gh-5446-sqync-eval-quorum.test.lua        |  62 +++++++
>  test/replication/replica-quorum-1.lua         |   1 +
>  test/replication/replica-quorum-2.lua         |   1 +
>  test/replication/replica-quorum-3.lua         |   1 +
>  test/replication/replica-quorum-4.lua         |   1 +
>  6 files changed, 222 insertions(+)
>  create mode 100644 test/replication/gh-5446-sqync-eval-quorum.result
>  create mode 100644 test/replication/gh-5446-sqync-eval-quorum.test.lua

1. sqync?

>  create mode 120000 test/replication/replica-quorum-1.lua
>  create mode 120000 test/replication/replica-quorum-2.lua
>  create mode 120000 test/replication/replica-quorum-3.lua
>  create mode 120000 test/replication/replica-quorum-4.lua
> 
> diff --git a/test/replication/gh-5446-sqync-eval-quorum.result b/test/replication/gh-5446-sqync-eval-quorum.result
> new file mode 100644
> index 000000000..14050aa4b
> --- /dev/null
> +++ b/test/replication/gh-5446-sqync-eval-quorum.result
> @@ -0,0 +1,156 @@
> +-- test-run result file version 2
> +test_run = require('test_run').new()
> + | ---
> + | ...
> +engine = test_run:get_cfg('engine')
> + | ---
> + | ...
> +
> +box.schema.user.grant('guest', 'replication')
> + | ---
> + | ...
> +
> +box.cfg { replication_synchro_quorum = "N/2+1", replication_synchro_timeout = 1000 }
> + | ---
> + | ...
> +
> +test_run:cmd('create server replica1 with rpl_master=default,\
> +              script="replication/replica-quorum-1.lua"')

2. There is a function test_run:create_cluster for such cases.

But what I actually would want to see is how the quorum value changes
as we add more replicas. Here you just add 4, and that is it. We
don't see how the value changes dynamically.

> + | ---
> + | - true
> + | ...
> +test_run:cmd('start server replica1 with wait=True, wait_load=True')
> + | ---
> + | - true
> + | ...
> +
> +test_run:cmd('create server replica2 with rpl_master=default,\
> +              script="replication/replica-quorum-2.lua"')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('start server replica2 with wait=True, wait_load=True')
> + | ---
> + | - true
> + | ...
> +
> +test_run:cmd('create server replica3 with rpl_master=default,\
> +              script="replication/replica-quorum-3.lua"')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('start server replica3 with wait=True, wait_load=True')
> + | ---
> + | - true
> + | ...
> +
> +test_run:cmd('create server replica4 with rpl_master=default,\
> +              script="replication/replica-quorum-4.lua"')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('start server replica4 with wait=True, wait_load=True')
> + | ---
> + | - true
> + | ...
> +
> +_ = box.schema.space.create('sync', {is_sync = true, engine = engine})
> + | ---
> + | ...
> +s = box.space.sync
> + | ---
> + | ...
> +
> +s:format({{name = 'id', type = 'unsigned'}, {name = 'band_name', type = 'string'}})

3. Why do you need the format? Why isn't the space dropped
in the end? Why do you need the tuples below, if you never
use them?

> + | ---
> + | ...
> +
> +_ = s:create_index('primary', {parts = {'id'}})
> + | ---
> + | ...
> +s:insert({1, '1'})
> + | ---
> + | - [1, '1']
> + | ...
> +s:insert({2, '2'})
> + | ---
> + | - [2, '2']
> + | ...
> +s:insert({3, '3'})
> + | ---
> + | - [3, '3']
> + | ...
> +
> +-- Wait the data to be propagated, there is no much
> +-- need for this since we're testing the quorum number
> +-- update only but just in case.
> +test_run:wait_lsn('replica1', 'default')
> + | ---
> + | ...
> +test_run:wait_lsn('replica2', 'default')
> + | ---
> + | ...
> +test_run:wait_lsn('replica3', 'default')
> + | ---
> + | ...
> +test_run:wait_lsn('replica4', 'default')

4. Why do you need these wait_lsns? The whole purpose of is_sync is to
make sure the data is already replicated before commit.

> + | ---
> + | ...
> +
> +-- Make sure we were configured in a proper way
> +match = 'set \'replication_synchro_quorum\' configuration option to \"N\\/2%+1'
> + | ---
> + | ...
> +test_run:grep_log("default", match) ~= nil
> + | ---
> + | - true
> + | ...
> +
> +-- There are 4 replicas -> replication_synchro_quorum = 4/2 + 1 = 3
> +match = 'update replication_synchro_quorum = 3'
> + | ---
> + | ...
> +test_run:grep_log("default", match) ~= nil

5. I would better expect you to test the quorum in action. The logs may
say anything, but we must validate it. For example, here quorum must be 3.
So if you kill 2 replicas, it should stop working.

Would be also interesting how the quorum decreases as we remove replicas
from _cluster and kill them for good.

Besides, I don't see tests for errors in the quorum expression.

Also we usually add tests in the same commit as the feature.


More information about the Tarantool-patches mailing list