From: Sergey Bronnikov <sergeyb@tarantool.org> To: Serge Petrenko <sergepetrenko@tarantool.org> Cc: tarantool-patches@dev.tarantool.org, Vladislav Shpilevoy <v.shpilevoy@tarantool.org> Subject: Re: [Tarantool-patches] [PATCH 3/3] replication: add test with random leaders promotion and demotion Date: Mon, 31 Aug 2020 13:05:39 +0300 [thread overview] Message-ID: <20200831100539.GA94343@pony.bronevichok.ru> (raw) In-Reply-To: <1598525382.271317018@f511.i.mail.ru> Sergey, thanks for review! See my comments inline, test updated in a branch. On 13:49 Thu 27 Aug , Serge Petrenko wrote: > > Hi! Thanks for the fixes! > I’m pasting parts of the patch below to comment on. > > +box.cfg({ > + listen = instance_uri(INSTANCE_ID); > + replication_connect_quorum = 3; > + replication = { > + instance_uri(1); > + instance_uri(2); > + instance_uri(3); > + instance_uri(4); > + instance_uri(5); > + }; > +}) > > > * You should either omit `replication_connect_quorum` at all, or set it to 5. > Omitting it will have the same effect. > I think you meant `replication_synchro_quorum` here, then it makes sense > to set it to 3. Also `replication_synchro_timeout` should be set here, > I’ll mention it again below. removed replication_connect_quorum at all > + > +NUM_INSTANCES = 5 > +BROKEN_QUORUM = NUM_INSTANCES + 1 > + > > > * BROKEN_QUORUM assigned but never used. removed > > + > +test_run:create_cluster(SERVERS, "replication", {args="0.1"}) > +test_run:wait_fullmesh(SERVERS) > > > * You’re passing some argument to qsync1, … qsync5 instances, but you never use it. removed argument > > +current_leader_id = 1 > +test_run:switch(SERVERS[current_leader_id]) > +box.cfg{replication_synchro_timeout=1} > > > * You should set `replication_synchro_timeout` on every instance, not only > on qsync1 so you better move this box.cfg call to the instance file. > Besides, the timeout should be bigger (much bigger), like Vlad said. > We typically use 30 seconds for various replication timeouts. > It’s fairly common when a test is stable on your machine, but is flaky on > testing machines. increased timeout up to 1000 and set synchro settings in a qsync.lua file > +test_run:switch('default') > + > +-- Testcase body. > +for i=1,10 do \ > + new_leader_id = random(current_leader_id, 1, #SERVERS) \ > + test_run:switch(SERVERS[new_leader_id]) \ > + box.cfg{read_only=false} \ > + f1 = fiber.create(function() \ > + pcall(box.space.sync:truncate{}) \ > + end) \ > > > * Why put truncate call in a separate fiber? > > Why use truncate at all? You may just replace all your `insert` calls below > with `replace`, and then truncate won’t be needed. This is up to you > though. As far as I remember original idea was to cleanup before next insert. I rewrote a body of test and keep all inserted values in a space and after loop check amount of inserted values. > > + f2 = fiber.create(function() \ > + for i=1,10000 do box.space.sync:insert{i} end \ > + end) \ > > * So you’re testing a case when a leader has some unconfirmed transactions in > limbo and then a leader change happens. Then you need to call > `clear_synchro_queue` on a new leader to wait for confirmation of old txns. Otherwise > the new leader fails to insert its data, but the test doesn’t show this, because you > don’t check fiber state or `insert()` return values. added clear_synchro_queue() call to updated test > + test_run:switch('default') \ > + test_run:switch(SERVERS[current_leader_id]) \ > > > * The 2 lines above are useless. removed in updated test: -- Testcase body. for i=1,100 do \ test_run:eval(SERVERS[current_leader_id], \ "box.cfg{replication_synchro_quorum=6}") \ test_run:eval(SERVERS[current_leader_id], \ string.format("box.space.sync:insert{%d}", i)) \ new_leader_id = random(current_leader_id, 1, #SERVERS) \ test_run:eval(SERVERS[new_leader_id], \ "box.cfg{replication_synchro_quorum=3}") \ test_run:eval(SERVERS[new_leader_id], "box.ctl.clear_synchro_queue()") \ replica = random(new_leader_id, 1, #SERVERS) \ test_run:eval(SERVERS[replica], \ string.format("box.space.sync:get{%d}", i)) \ test_run:switch('default') \ current_leader_id = new_leader_id \ end test_run:switch('qsync1') box.space.sync:count() -- 100 > > + box.cfg{read_only=true} \ > + test_run:switch('default') \ > + current_leader_id = new_leader_id \ > +end > + > +-- Teardown. > -- > Serge Petrenko -- sergeyb@
next prev parent reply other threads:[~2020-08-31 10:05 UTC|newest] Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top [not found] <cover.1594314820.git.sergeyb@tarantool.org> 2020-07-09 17:16 ` [Tarantool-patches] [PATCH 1/3] replication: print number of txs in limbo before its clear sergeyb 2020-07-09 20:07 ` Vladislav Shpilevoy 2020-07-10 12:55 ` Sergey Bronnikov 2020-07-20 21:59 ` Vladislav Shpilevoy 2020-07-09 17:16 ` [Tarantool-patches] [PATCH 2/3] replication: test clear_synchro_queue function sergeyb 2020-07-20 22:00 ` Vladislav Shpilevoy 2020-08-25 12:49 ` Sergey Bronnikov 2020-08-26 7:31 ` Serge Petrenko 2020-08-26 14:48 ` Sergey Bronnikov 2020-07-09 17:16 ` [Tarantool-patches] [PATCH 3/3] replication: add test with random leaders promotion and demotion sergeyb 2020-07-09 20:07 ` Vladislav Shpilevoy 2020-07-10 8:05 ` Sergey Bronnikov 2020-07-20 22:01 ` Vladislav Shpilevoy 2020-08-26 14:45 ` Sergey Bronnikov 2020-08-27 10:49 ` Serge Petrenko 2020-08-31 10:05 ` Sergey Bronnikov [this message] 2020-09-01 13:03 ` Serge Petrenko 2020-11-12 9:32 ` Sergey Bronnikov
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20200831100539.GA94343@pony.bronevichok.ru \ --to=sergeyb@tarantool.org \ --cc=sergepetrenko@tarantool.org \ --cc=tarantool-patches@dev.tarantool.org \ --cc=v.shpilevoy@tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH 3/3] replication: add test with random leaders promotion and demotion' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox