<HTML><BODY><div><div><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">Hi! Thanks for the fixes!<br>I’m pasting parts of the patch below to comment on.</span></div><div> </div><p> </p><div><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+box.cfg({</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+ listen = instance_uri(INSTANCE_ID);</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+ replication_connect_quorum = 3;</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+ replication = {</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+ instance_uri(1);</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+ instance_uri(2);</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+ instance_uri(3);</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+ instance_uri(4);</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+ instance_uri(5);</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+ };</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+})</span></p><div> </div><div> </div><ol><li><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">You should either omit 
`replication_connect_quorum` altogether, or set it to 5.</span><br><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">Omitting it will have the same effect.<br>I think you meant `replication_synchro_quorum` here; in that case it makes sense<br>to set it to 3. Also,</span> <span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">`replication_synchro_timeout` should be set here;<br>I’ll mention it</span> <span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">again below.</span></li></ol><div> </div><div> </div><div><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+NUM_INSTANCES = 5</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+BROKEN_QUORUM = NUM_INSTANCES + 1</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+</span></p><div> </div><div> </div><ol start="2"><li><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">BROKEN_QUORUM is assigned but never used.</span></li></ol></div><div> </div><div> </div><div><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+test_run:create_cluster(SERVERS, "replication", {args="0.1"})</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+test_run:wait_fullmesh(SERVERS)</span></p><div> </div><div> </div><ol 
start="3"><li><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">You’re passing an argument to the qsync1, … qsync5 instances, but you never use</span> it.</li></ol><div> </div><div> </div><div><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+current_leader_id = 1</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+test_run:switch(SERVERS[current_leader_id])</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+box.cfg{replication_synchro_timeout=1}</span></p><div> </div><div> </div><ol start="4"><li><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">You should set `replication_synchro_timeout` on every instance, not only<br>on qsync1, so</span> <span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">you’d better move this box.cfg call to the instance file.</span><br><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">Besides, the timeout should be much bigger, as Vlad said.</span><br><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">We typically use 30 seconds for various replication timeouts.</span><br><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">It’s fairly common for a test to be stable on your machine but flaky on</span><br><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">testing machines.</span></li></ol><div> </div><div> </div><p><span 
style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+test_run:switch('default')</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+-- Testcase body.</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+for i=1,10 do \</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+ new_leader_id = random(current_leader_id, 1, #SERVERS) \</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+ test_run:switch(SERVERS[new_leader_id]) \</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+ box.cfg{read_only=false} \</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+ f1 = fiber.create(function() \</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+ pcall(box.space.sync:truncate{}) \</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+ end) \</span></p><div> </div><div> </div><ol start="5"><li><font face="Helvetica, Arial">Why put the truncate call in a separate fiber?</font><br><br>Why use truncate at all? You could just replace all your `insert` calls below<br>with `replace`, and then truncate won’t be needed. 
This is up to you<br>though.</li></ol><div> </div><div> </div><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+ f2 = fiber.create(function() \</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+ for i=1,10000 do box.space.sync:insert{i} end \</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+ end) \</span></p><div> </div><ol start="6"><li><font face="Helvetica, Arial">So you’re testing the case where a leader has some unconfirmed transactions in<br>limbo and then a leader change happens. In that case you need to call<br>`clear_synchro_queue` on the new leader to wait for confirmation of the old transactions. Otherwise<br>the new leader fails to insert its data, but the test doesn’t show this because you<br>don’t check the fiber state or the `insert()` return values.</font></li></ol><div> </div><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+ test_run:switch('default') \</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+ test_run:switch(SERVERS[current_leader_id]) \</span></p><div> </div><div> </div><ol start="7"><li>The two lines above are useless.</li></ol><div> </div><div> </div><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+ box.cfg{read_only=true} \</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+ test_run:switch('default') \</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+ current_leader_id = new_leader_id \</span></p><p><span 
style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+end</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+</span></p><p><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">+-- Teardown.</span></p></div></div><div><p> </p><div> </div></div></div><div> </div></div><div><span style="font-family:Helvetica,Arial;font-size:15px;font-style:normal;font-weight:normal;line-height:20px;">--<br>Serge Petrenko</span></div></BODY></HTML>