From: "Serge Petrenko" <sergepetrenko@tarantool.org> To: "Sergey Bronnikov" <sergeyb@tarantool.org> Cc: tarantool-patches@dev.tarantool.org, "Vladislav Shpilevoy" <v.shpilevoy@tarantool.org> Subject: Re: [Tarantool-patches] [PATCH 3/3] replication: add test with random leaders promotion and demotion Date: Thu, 27 Aug 2020 13:49:42 +0300 [thread overview] Message-ID: <1598525382.271317018@f511.i.mail.ru> (raw) In-Reply-To: <20200826144538.GC47610@pony.bronevichok.ru> [-- Attachment #1: Type: text/plain, Size: 3856 bytes --] Hi! Thanks for the fixes! I’m pasting parts of the patch below to comment on. +box.cfg({ + listen = instance_uri(INSTANCE_ID); + replication_connect_quorum = 3; + replication = { + instance_uri(1); + instance_uri(2); + instance_uri(3); + instance_uri(4); + instance_uri(5); + }; +}) * You should either omit `replication_connect_quorum` at all, or set it to 5. Omitting it will have the same effect. I think you meant `replication_synchro_quorum` here, then it makes sense to set it to 3. Also `replication_synchro_timeout` should be set here, I’ll mention it again below. + +NUM_INSTANCES = 5 +BROKEN_QUORUM = NUM_INSTANCES + 1 + * BROKEN_QUORUM assigned but never used. + +test_run:create_cluster(SERVERS, "replication", {args="0.1"}) +test_run:wait_fullmesh(SERVERS) * You’re passing some argument to qsync1, … qsync5 instances, but you never use it. +current_leader_id = 1 +test_run:switch(SERVERS[current_leader_id]) +box.cfg{replication_synchro_timeout=1} * You should set `replication_synchro_timeout` on every instance, not only on qsync1 so you better move this box.cfg call to the instance file. Besides, the timeout should be bigger (much bigger), like Vlad said. We typically use 30 seconds for various replication timeouts. It’s fairly common when a test is stable on your machine, but is flaky on testing machines. +test_run:switch('default') + +-- Testcase body. +for i=1,10 do \ + new_leader_id = random(current_leader_id, 1, #SERVERS) \ + test_run:switch(SERVERS[new_leader_id]) \ + box.cfg{read_only=false} \ + f1 = fiber.create(function() \ + pcall(box.space.sync:truncate{}) \ + end) \ * Why put truncate call in a separate fiber? Why use truncate at all? You may just replace all your `insert` calls below with `replace`, and then truncate won’t be needed. This is up to you though. + f2 = fiber.create(function() \ + for i=1,10000 do box.space.sync:insert{i} end \ + end) \ * So you’re testing a case when a leader has some unconfirmed transactions in limbo and then a leader change happens. Then you need to call `clear_synchro_queue` on a new leader to wait for confirmation of old txns. Otherwise the new leader fails to insert its data, but the test doesn’t show this, because you don’t check fiber state or `insert()` return values. + test_run:switch('default') \ + test_run:switch(SERVERS[current_leader_id]) \ * The 2 lines above are useless. + box.cfg{read_only=true} \ + test_run:switch('default') \ + current_leader_id = new_leader_id \ +end + +-- Teardown. -- Serge Petrenko [-- Attachment #2: Type: text/html, Size: 13274 bytes --]
next prev parent reply other threads:[~2020-08-27 10:49 UTC|newest] Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top [not found] <cover.1594314820.git.sergeyb@tarantool.org> 2020-07-09 17:16 ` [Tarantool-patches] [PATCH 1/3] replication: print number of txs in limbo before its clear sergeyb 2020-07-09 20:07 ` Vladislav Shpilevoy 2020-07-10 12:55 ` Sergey Bronnikov 2020-07-20 21:59 ` Vladislav Shpilevoy 2020-07-09 17:16 ` [Tarantool-patches] [PATCH 2/3] replication: test clear_synchro_queue function sergeyb 2020-07-20 22:00 ` Vladislav Shpilevoy 2020-08-25 12:49 ` Sergey Bronnikov 2020-08-26 7:31 ` Serge Petrenko 2020-08-26 14:48 ` Sergey Bronnikov 2020-07-09 17:16 ` [Tarantool-patches] [PATCH 3/3] replication: add test with random leaders promotion and demotion sergeyb 2020-07-09 20:07 ` Vladislav Shpilevoy 2020-07-10 8:05 ` Sergey Bronnikov 2020-07-20 22:01 ` Vladislav Shpilevoy 2020-08-26 14:45 ` Sergey Bronnikov 2020-08-27 10:49 ` Serge Petrenko [this message] 2020-08-31 10:05 ` Sergey Bronnikov 2020-09-01 13:03 ` Serge Petrenko 2020-11-12 9:32 ` Sergey Bronnikov
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=1598525382.271317018@f511.i.mail.ru \ --to=sergepetrenko@tarantool.org \ --cc=sergeyb@tarantool.org \ --cc=tarantool-patches@dev.tarantool.org \ --cc=v.shpilevoy@tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH 3/3] replication: add test with random leaders promotion and demotion' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox