[PATCH 3/4] test: increase timeout to check replica status
Alexander Turenko
alexander.turenko at tarantool.org
Mon Oct 8 22:07:59 MSK 2018
On Fri, Oct 05, 2018 at 12:02:14PM +0300, Sergei Voronezhskii wrote:
> The replica status is checked 100 times, each check within
> `replica_timeout`. Refactor the code to get properly upstream.
> Then in loop with little sleep check upstreams status until
> it is not in follow mode. If count of checks is more than 200
> break the loop with error. The value 200 and little sleep 0.001
> choosed suitably to `replica_timeout` and `replica_connect_timeout`.
replica_timeout -> replication_timeout
replica_connect_timeout -> replication_connect_timeout
>
> Part of #2436, #3232
> ---
> test/replication/misc.result | 22 ++++++++++++----------
> test/replication/misc.test.lua | 22 ++++++++++++----------
> 2 files changed, 24 insertions(+), 20 deletions(-)
>
> diff --git a/test/replication/misc.test.lua b/test/replication/misc.test.lua
> index 375c8b58a..cb658f6d0 100644
> --- a/test/replication/misc.test.lua
> +++ b/test/replication/misc.test.lua
> @@ -43,30 +43,32 @@ test_run:create_cluster(SERVERS, "replication", {args="0.1"})
> test_run:wait_fullmesh(SERVERS)
> test_run:cmd("switch autobootstrap1")
> test_run = require('test_run').new()
> -box.cfg{replication_timeout = 0.01, replication_connect_timeout=0.01}
> +box.cfg{replication_timeout = 0.2, replication_connect_timeout=0.2}
> test_run:cmd("switch autobootstrap2")
> test_run = require('test_run').new()
> -box.cfg{replication_timeout = 0.01, replication_connect_timeout=0.01}
> +box.cfg{replication_timeout = 0.2, replication_connect_timeout=0.2}
> test_run:cmd("switch autobootstrap3")
> test_run = require('test_run').new()
> fiber=require('fiber')
> -box.cfg{replication_timeout = 0.01, replication_connect_timeout=0.01}
> +box.cfg{replication_timeout = 0.2, replication_connect_timeout=0.2}
> _ = box.schema.space.create('test_timeout'):create_index('pk')
> test_run:cmd("setopt delimiter ';'")
> function test_timeout()
> + local replicaA = box.info.replication[1].upstream or box.info.replication[2].upstream
> + local replicaB = box.info.replication[3].upstream or box.info.replication[2].upstream
Does the 'box' code guarantee that box.info.replication[N].upstream will
update the same table in place? I don't think so. It is better to read
these values inside the loop.
Nit: too long lines.
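To illustrate reading the values inside the loop, here is a minimal
sketch. It assumes that each access to box.info.replication yields a new
snapshot, which is what I suspect above. `all_upstreams_follow` is a
hypothetical helper, not something from the patch or from test-run, and
the snapshots are stand-in tables so the sketch runs without Tarantool.

```lua
-- Sketch only: `all_upstreams_follow` is a hypothetical helper.
-- It takes one snapshot shaped like box.info.replication.
local function all_upstreams_follow(rinfo)
    for _, r in pairs(rinfo) do
        -- The instance's own entry has no upstream, so nil is skipped.
        if r.upstream ~= nil and r.upstream.status ~= 'follow' then
            return false
        end
    end
    return true
end

-- In the test the snapshot would be taken anew on every iteration, e.g.
--   if not all_upstreams_follow(box.info.replication) then ... end
-- Stand-in snapshots are used below instead of a live instance.
local ok_snapshot = {
    {id = 1},
    {id = 2, upstream = {status = 'follow'}},
    {id = 3, upstream = {status = 'follow'}},
}
local bad_snapshot = {
    {id = 1},
    {id = 2, upstream = {status = 'disconnected'}},
    {id = 3, upstream = {status = 'follow'}},
}

print(all_upstreams_follow(ok_snapshot))
print(all_upstreams_follow(bad_snapshot))
```

Passing the snapshot as an argument also keeps the lines short and
avoids hardcoding the replica indexes.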
> for i = 0, 99 do
> box.space.test_timeout:replace({1})
> - fiber.sleep(0.005)
> - local rinfo = box.info.replication
> - if rinfo[1].upstream and rinfo[1].upstream.status ~= 'follow' or
> - rinfo[2].upstream and rinfo[2].upstream.status ~= 'follow' or
> - rinfo[3].upstream and rinfo[3].upstream.status ~= 'follow' then
> - return error('Replication broken')
> - end
> + local n = 200
> + repeat
> + fiber.sleep(0.001)
> + n = n - 1
> + if n == 0 then return error(box.info.replication) end
> + until replicaA.status == 'follow' and replicaB.status == 'follow'
> end
> return true
> end ;
> test_run:cmd("setopt delimiter ''");
> +-- the replica status is checked 100 times, each check within replication_timeout
I don't understand the comment 'each check within replication_timeout'.
What does it mean?
Anyway, I think this just broke the test case. It used to check that
replicas do not leave the 'follow' state; now it checks nothing.
I still push for the approach where the QA team works closely with
developers to understand the cases, or at least files issues so that
developers fix their own test cases. I am strongly against random
timeout tweaks made just to get a test 'working'.
Please work through the test case with Georgy (the case was introduced
in 195d4462).
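To make the difference between the two kinds of check concrete, here is
a small sketch. Both helpers are hypothetical and work on a recorded
list of upstream statuses rather than on a live box.info.replication;
the status sequence is made up for illustration.

```lua
-- Sketch only: both helpers are hypothetical illustrations.

-- Invariant check: fails as soon as any observed status is not 'follow'.
-- Returns false plus the index of the first offending observation.
local function never_left_follow(statuses)
    for i, s in ipairs(statuses) do
        if s ~= 'follow' then
            return false, i
        end
    end
    return true
end

-- Eventual check: succeeds if 'follow' shows up within max_checks
-- observations. Returns true plus the index where it was first seen.
local function eventually_follow(statuses, max_checks)
    for i = 1, math.min(max_checks, #statuses) do
        if statuses[i] == 'follow' then
            return true, i
        end
    end
    return false
end

-- A replica that is out of 'follow' for two observations, then recovers.
local flapping = {'disconnected', 'disconnected', 'follow'}

print(never_left_follow(flapping))
print(eventually_follow(flapping, 200))
```

With a generous retry budget the eventual check accepts a replica that
did leave 'follow', which is exactly the situation the original test was
meant to catch.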
> test_timeout()
>
> -- gh-3247 - Sequence-generated value is not replicated in case
> --
> 2.18.0
>