From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp41.i.mail.ru (smtp41.i.mail.ru [94.100.177.101]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id D6F0145C305 for ; Sat, 5 Dec 2020 02:52:27 +0300 (MSK) References: <20201203140446.66312-1-gorcunov@gmail.com> <20201203140446.66312-4-gorcunov@gmail.com> From: Vladislav Shpilevoy Message-ID: <4877abd0-72b4-57c0-a257-cf991ed8721a@tarantool.org> Date: Sat, 5 Dec 2020 00:52:25 +0100 MIME-Version: 1.0 In-Reply-To: <20201203140446.66312-4-gorcunov@gmail.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [Tarantool-patches] [PATCH v3 3/3] test: add replication/gh-5446-sqync-eval-quorum.test.lua List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Cyrill Gorcunov , tml Cc: Mons Anderson Thanks for the patch! See 5 comments below. On 03.12.2020 15:04, Cyrill Gorcunov via Tarantool-patches wrote: > Part-of #5446 > > Signed-off-by: Cyrill Gorcunov > --- > .../gh-5446-sqync-eval-quorum.result | 156 ++++++++++++++++++ > .../gh-5446-sqync-eval-quorum.test.lua | 62 +++++++ > test/replication/replica-quorum-1.lua | 1 + > test/replication/replica-quorum-2.lua | 1 + > test/replication/replica-quorum-3.lua | 1 + > test/replication/replica-quorum-4.lua | 1 + > 6 files changed, 222 insertions(+) > create mode 100644 test/replication/gh-5446-sqync-eval-quorum.result > create mode 100644 test/replication/gh-5446-sqync-eval-quorum.test.lua 1. sqync? > create mode 120000 test/replication/replica-quorum-1.lua > create mode 120000 test/replication/replica-quorum-2.lua > create mode 120000 test/replication/replica-quorum-3.lua > create mode 120000 test/replication/replica-quorum-4.lua > > diff --git a/test/replication/gh-5446-sqync-eval-quorum.result b/test/replication/gh-5446-sqync-eval-quorum.result > new file mode 100644 > index 000000000..14050aa4b > --- /dev/null > +++ b/test/replication/gh-5446-sqync-eval-quorum.result > @@ -0,0 +1,156 @@ > +-- test-run result file version 2 > +test_run = require('test_run').new() > + | --- > + | ... > +engine = test_run:get_cfg('engine') > + | --- > + | ... > + > +box.schema.user.grant('guest', 'replication') > + | --- > + | ... > + > +box.cfg { replication_synchro_quorum = "N/2+1", replication_synchro_timeout = 1000 } > + | --- > + | ... > + > +test_run:cmd('create server replica1 with rpl_master=default,\ > + script="replication/replica-quorum-1.lua"') 2. There is a function test_run:create_cluster for such cases. But what I actually would want to see is how the quorum value changes as we add more replicas. Here you just add 4, and that is it. We don't see how the value changes dynamically. > + | --- > + | - true > + | ... > +test_run:cmd('start server replica1 with wait=True, wait_load=True') > + | --- > + | - true > + | ... > + > +test_run:cmd('create server replica2 with rpl_master=default,\ > + script="replication/replica-quorum-2.lua"') > + | --- > + | - true > + | ... > +test_run:cmd('start server replica2 with wait=True, wait_load=True') > + | --- > + | - true > + | ... > + > +test_run:cmd('create server replica3 with rpl_master=default,\ > + script="replication/replica-quorum-3.lua"') > + | --- > + | - true > + | ... > +test_run:cmd('start server replica3 with wait=True, wait_load=True') > + | --- > + | - true > + | ... > + > +test_run:cmd('create server replica4 with rpl_master=default,\ > + script="replication/replica-quorum-4.lua"') > + | --- > + | - true > + | ... > +test_run:cmd('start server replica4 with wait=True, wait_load=True') > + | --- > + | - true > + | ... > + > +_ = box.schema.space.create('sync', {is_sync = true, engine = engine}) > + | --- > + | ... > +s = box.space.sync > + | --- > + | ... > + > +s:format({{name = 'id', type = 'unsigned'}, {name = 'band_name', type = 'string'}}) 3. Why do you need the format? Why isn't the space dropped in the end? Why do you need the tuples below, if you never use them? > + | --- > + | ... > + > +_ = s:create_index('primary', {parts = {'id'}}) > + | --- > + | ... > +s:insert({1, '1'}) > + | --- > + | - [1, '1'] > + | ... > +s:insert({2, '2'}) > + | --- > + | - [2, '2'] > + | ... > +s:insert({3, '3'}) > + | --- > + | - [3, '3'] > + | ... > + > +-- Wait the data to be propagated, there is no much > +-- need for this since we're testing the quorum number > +-- update only but just in case. > +test_run:wait_lsn('replica1', 'default') > + | --- > + | ... > +test_run:wait_lsn('replica2', 'default') > + | --- > + | ... > +test_run:wait_lsn('replica3', 'default') > + | --- > + | ... > +test_run:wait_lsn('replica4', 'default') 4. Why do you need these wait_lsns? The whole purpose of is_sync is to make sure the data is already replicated before commit. > + | --- > + | ... > + > +-- Make sure we were configured in a proper way > +match = 'set \'replication_synchro_quorum\' configuration option to \"N\\/2%+1' > + | --- > + | ... > +test_run:grep_log("default", match) ~= nil > + | --- > + | - true > + | ... > + > +-- There are 4 replicas -> replication_synchro_quorum = 4/2 + 1 = 3 > +match = 'update replication_synchro_quorum = 3' > + | --- > + | ... > +test_run:grep_log("default", match) ~= nil 5. I would better expect you to test the quorum in action. The logs may say anything, but we must validate it. For example, here quorum must be 3. So if you kill 2 replicas, it should stop working. Would be also interesting how the quorum decreases as we remove replicas from _cluster and kill them for good. Besides, I don't see tests for errors in the quorum expression. Also we usually add tests in the same commit as the feature.