From: Sergey Bronnikov <sergeyb@tarantool.org> To: Sergey Petrenko <sergepetrenko@tarantool.org> Cc: tarantool-patches@dev.tarantool.org, Vladislav Shpilevoy <v.shpilevoy@tarantool.org> Subject: Re: [Tarantool-patches] [PATCH v1] replication: change space sync mode in a loop Date: Mon, 31 Aug 2020 14:06:42 +0300 [thread overview] Message-ID: <20200831110642.GB94343@pony.bronevichok.ru> (raw) In-Reply-To: <347d6e10-deb8-d349-e1df-033b5075903f@tarantool.org> Sergey, thanks for review! See my comments inline. Test has been updated in a branch. On 14:27 Thu 27 Aug , Sergey Petrenko wrote: > Hi! Thanks for the patch! > > Please see my comments below. > > 26.08.2020 18:10, sergeyb@tarantool.org пишет: > > From: Sergey Bronnikov <sergeyb@tarantool.org> > > > > New regression tests covers cases when one can change synchronous mode > > of space to asynchronous and vice versa. > > > > Closes #5055 > > Part of #5144 > > --- > > > > Branch: ligurio/gh-4842-qsync-change-mode > > CI: https://gitlab.com/tarantool/tarantool/-/pipelines/182271234 > > > > test/replication/qsync_sync_mode.result | 164 ++++++++++++++++++++++ > > test/replication/qsync_sync_mode.test.lua | 90 ++++++++++++ > > 2 files changed, 254 insertions(+) > > create mode 100644 test/replication/qsync_sync_mode.result > > create mode 100644 test/replication/qsync_sync_mode.test.lua > > > > diff --git a/test/replication/qsync_sync_mode.result b/test/replication/qsync_sync_mode.result > > new file mode 100644 > > index 000000000..f2f95ec0f > > --- /dev/null > > +++ b/test/replication/qsync_sync_mode.result > > @@ -0,0 +1,164 @@ > > +-- test-run result file version 2 > > +env = require('test_run') > > + | --- > > + | ... > > +test_run = env.new() > > + | --- > > + | ... > > +engine = test_run:get_cfg('engine') > > + | --- > > + | ... > > +fiber = require('fiber') > > + | --- > > + | ... > > +math = require('math') > > + | --- > > + | ... > > +math.randomseed(os.time()) > > + | --- > > + | ... > > + > > +orig_synchro_quorum = box.cfg.replication_synchro_quorum > > + | --- > > + | ... > > +orig_synchro_timeout = box.cfg.replication_synchro_timeout > > + | --- > > + | ... > > + > > +disable_sync_mode = function() \ > > + local s = box.space._space:get(box.space.sync.id) \ > > + local new_s = s:update({{'=', 6, {is_sync=false}}}) \ > > + box.space._space:replace(new_s) \ > > +end > > + | --- > > + | ... > > + > > +enable_sync_mode = function() \ > > + local s = box.space._space:get(box.space.sync.id) \ > > + local new_s = s:update({{'=', 6, {is_sync=true}}}) \ > > + box.space._space:replace(new_s) \ > > +end > > + | --- > > + | ... > > + > > +set_random_sync_mode = function() \ > > + if (math.random(1, 10) > 5) then \ > > + enable_sync_mode() \ > > + else \ > > + disable_sync_mode() \ > > + end \ > > +end > > + | --- > > + | ... > > + > > +set_random_quorum = function(n) \ > > + box.cfg{replication_synchro_quorum=math.random(1, n)} \ > > +end > > + | --- > > + | ... > > + > > +box.schema.user.grant('guest', 'replication') > > + | --- > > + | ... > > + > > +-- Setup an async cluster with two instances. > > +test_run:cmd('create server replica with rpl_master=default,\ > > + script="replication/replica.lua"') > > + | --- > > + | - true > > + | ... > > +test_run:cmd('start server replica with wait=True, wait_load=True') > > + | --- > > + | - true > > + | ... > > + > > +-- Write data to a leader, enable and disable sync mode in background in a > > +-- loop. Expected no data loss. > > +-- Testcase setup. > > +_ = box.schema.space.create('sync', {is_sync=true, engine=engine}) > > + | --- > > + | ... > > +_ = box.space.sync:create_index('pk') > > + | --- > > + | ... > > +box.cfg{replication_synchro_quorum=2, replication_synchro_timeout=0.001} > > + | --- > > + | ... > > +-- Testcase body. > > +for i = 1,10 do \ > > + set_random_sync_mode() \ > > + if pcall(box.space.sync.insert, box.space.sync, {i}) then \ > > + test_run:switch('replica') \ > > + test_run:wait_cond(function() return box.space.sync:get{i} ~= nil end) \ > > + end \ > > + test_run:switch('default') \ > > +end > > + | --- > > + | ... > > +-- Testcase cleanup. > > +test_run:switch('default') > > + | --- > > + | - true > > + | ... > > +box.space.sync:drop() > > + | --- > > + | ... > > + > > +-- Write data to a leader, enable and disable sync mode and change quorum value > > +-- in background in a loop. > > +-- Testcase setup. > > +_ = box.schema.space.create('sync', {is_sync=true, engine=engine}) > > + | --- > > + | ... > > +_ = box.space.sync:create_index('pk') > > + | --- > > + | ... > > +box.cfg{replication_synchro_quorum=2, replication_synchro_timeout=0.001} > > + | --- > > + | ... > > +-- Testcase body. > > +for i = 1,10 do \ > > + set_random_sync_mode() \ > > + set_random_quorum(5) \ > > + if pcall(box.space.sync.insert, box.space.sync, {i}) then \ > > + test_run:switch('replica') \ > > + test_run:wait_cond(function() return box.space.sync:get{i} ~= nil end) \ > > + end \ > > + test_run:switch('default') \ > > +end > > + | --- > > + | ... > > +-- Testcase cleanup. > > +test_run:switch('default') > > + | --- > > + | - true > > + | ... > > +box.space.sync:drop() > > + | --- > > + | ... > > + > > +-- Teardown. > > +test_run:cmd('switch default') > > + | --- > > + | - true > > + | ... > > +test_run:cmd('stop server replica') > > + | --- > > + | - true > > + | ... > > +test_run:cmd('delete server replica') > > + | --- > > + | - true > > + | ... > > +test_run:cleanup_cluster() > > + | --- > > + | ... > > +box.schema.user.revoke('guest', 'replication') > > + | --- > > + | ... > > +box.cfg{ \ > > + replication_synchro_quorum = orig_synchro_quorum, \ > > + replication_synchro_timeout = orig_synchro_timeout, \ > > +} > > + | --- > > + | ... > > diff --git a/test/replication/qsync_sync_mode.test.lua b/test/replication/qsync_sync_mode.test.lua > > new file mode 100644 > > index 000000000..706261c10 > > --- /dev/null > > +++ b/test/replication/qsync_sync_mode.test.lua > > @@ -0,0 +1,90 @@ > > +env = require('test_run') > > +test_run = env.new() > > +engine = test_run:get_cfg('engine') > > +fiber = require('fiber') > > +math = require('math') > > +math.randomseed(os.time()) > > + > > +orig_synchro_quorum = box.cfg.replication_synchro_quorum > > +orig_synchro_timeout = box.cfg.replication_synchro_timeout > > + > > +disable_sync_mode = function() \ > > + local s = box.space._space:get(box.space.sync.id) \ > > + local new_s = s:update({{'=', 6, {is_sync=false}}}) \ > > + box.space._space:replace(new_s) \ > > +end > > + > > +enable_sync_mode = function() \ > > + local s = box.space._space:get(box.space.sync.id) \ > > + local new_s = s:update({{'=', 6, {is_sync=true}}}) \ > > + box.space._space:replace(new_s) \ > > +end > > + > > Vlad has pushed a patch with `space:alter` a couple of days ago. > So now you may say `space:alter{is_sync=true}`, `space:alter{is_sync=false}` > It does the same work you do, but looks much simpler. used alter in updated patch: -- Testcase body. for i = 1,100 do \ box.space.sync:alter{is_sync=random_boolean()} \ box.space.sync:insert{i} \ test_run:switch('replica') \ test_run:wait_cond(function() return box.space.sync:get{i} ~= nil end) \ test_run:switch('default') \ end > > +set_random_sync_mode = function() \ > > + if (math.random(1, 10) > 5) then \ > > + enable_sync_mode() \ > > + else \ > > + disable_sync_mode() \ > > + end \ > > +end > > + > > +set_random_quorum = function(n) \ > > + box.cfg{replication_synchro_quorum=math.random(1, n)} \ > > +end > > + > > +box.schema.user.grant('guest', 'replication') > > + > > +-- Setup an async cluster with two instances. > > +test_run:cmd('create server replica with rpl_master=default,\ > > + script="replication/replica.lua"') > > +test_run:cmd('start server replica with wait=True, wait_load=True') > > + > > +-- Write data to a leader, enable and disable sync mode in background in a > > +-- loop. Expected no data loss. > > But sync mode is not set in background, it's set in the same loop where > insertions happen. in updated version I set sync mode just before an insert, description in comment has been updated > > +-- Testcase setup. > > +_ = box.schema.space.create('sync', {is_sync=true, engine=engine}) > > +_ = box.space.sync:create_index('pk') > > +box.cfg{replication_synchro_quorum=2, replication_synchro_timeout=0.001} > > Is such a tiny timeout intended? I see you wrap the errors in a pcall, but > still, > it may happen that you get a timeout 10 out of 10 times and then you'll > test nothing. We usually use 30 second replication timeouts. Fixed by setting it to 1000. > > +-- Testcase body. > > +for i = 1,10 do \ > > + set_random_sync_mode() \ > > + if pcall(box.space.sync.insert, box.space.sync, {i}) then \ > > + test_run:switch('replica') \ > > + test_run:wait_cond(function() return box.space.sync:get{i} ~= nil end) \ > > + end \ > > + test_run:switch('default') \ > > +end > > +-- Testcase cleanup. > > +test_run:switch('default') > > +box.space.sync:drop() > > + > > +-- Write data to a leader, enable and disable sync mode and change quorum value > > +-- in background in a loop. > > +-- Testcase setup. > > +_ = box.schema.space.create('sync', {is_sync=true, engine=engine}) > > +_ = box.space.sync:create_index('pk') > > +box.cfg{replication_synchro_quorum=2, replication_synchro_timeout=0.001} > > Ok, I see why you need a small timeout in this testcase. Maybe set a small > timeout > only when the insertion is expected to fail(quorum = 3 or 4)?. And use a 30 > sec timeout > otherwise. I have rewrote tests to consider to cases: - positive, when we have achievable quorum, set mode to sync or async on a master and then make an insert. Expected successful writes always. - negative, when we change mode to sync or async with tx's in a limbo. So comment above is not relevant now. > > +-- Testcase body. > > +for i = 1,10 do \ > > + set_random_sync_mode() \ > > + set_random_quorum(5) \ > > Math.random() range is inclusive, so it'd be more fair, to choose quorum > from 1 to 4. > Then there'd be 2 cases of successful write, quorum 1 and 2, and 2 cases of > failed write, > quorum 3 and 4. removed set_random_quorum() in updated tests > > + if pcall(box.space.sync.insert, box.space.sync, {i}) then \ > > + test_run:switch('replica') \ > > + test_run:wait_cond(function() return box.space.sync:get{i} ~= nil end) \ > > + end \ > > + test_run:switch('default') \ > > +end > > +-- Testcase cleanup. > > +test_run:switch('default') > > +box.space.sync:drop() > > + > > +-- Teardown. > > +test_run:cmd('switch default') > > +test_run:cmd('stop server replica') > > +test_run:cmd('delete server replica') > > +test_run:cleanup_cluster() > > +box.schema.user.revoke('guest', 'replication') > > +box.cfg{ \ > > + replication_synchro_quorum = orig_synchro_quorum, \ > > + replication_synchro_timeout = orig_synchro_timeout, \ > > +} -- sergeyb@
next prev parent reply other threads:[~2020-08-31 11:06 UTC|newest] Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-08-26 15:10 sergeyb 2020-08-27 11:27 ` Sergey Petrenko 2020-08-31 11:06 ` Sergey Bronnikov [this message] 2020-09-01 12:25 ` Serge Petrenko 2020-09-01 12:50 ` Sergey Bronnikov 2020-11-12 9:27 ` Sergey Bronnikov
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20200831110642.GB94343@pony.bronevichok.ru \ --to=sergeyb@tarantool.org \ --cc=sergepetrenko@tarantool.org \ --cc=tarantool-patches@dev.tarantool.org \ --cc=v.shpilevoy@tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH v1] replication: change space sync mode in a loop' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox