[PATCH 3/4] test: increase timeout to check replica status

Alexander Turenko alexander.turenko at tarantool.org
Mon Oct 8 22:07:59 MSK 2018


On Fri, Oct 05, 2018 at 12:02:14PM +0300, Sergei Voronezhskii wrote:
> The replica status is checked 100 times, each check within
> `replica_timeout`. Refactor the code to get properly upstream.
> Then in loop with little sleep check upstreams status until
> it is not in follow mode. If count of checks is more than 200
> break the loop with error. The value 200 and little sleep 0.001
> choosed suitably to `replica_timeout` and `replica_connect_timeout`.

replica_timeout -> replication_timeout
replica_connect_timeout -> replication_connect_timeout

> 
> Part of #2436, #3232
> ---
>  test/replication/misc.result   | 22 ++++++++++++----------
>  test/replication/misc.test.lua | 22 ++++++++++++----------
>  2 files changed, 24 insertions(+), 20 deletions(-)
> 
> diff --git a/test/replication/misc.test.lua b/test/replication/misc.test.lua
> index 375c8b58a..cb658f6d0 100644
> --- a/test/replication/misc.test.lua
> +++ b/test/replication/misc.test.lua
> @@ -43,30 +43,32 @@ test_run:create_cluster(SERVERS, "replication", {args="0.1"})
>  test_run:wait_fullmesh(SERVERS)
>  test_run:cmd("switch autobootstrap1")
>  test_run = require('test_run').new()
> -box.cfg{replication_timeout = 0.01, replication_connect_timeout=0.01}
> +box.cfg{replication_timeout = 0.2, replication_connect_timeout=0.2}
>  test_run:cmd("switch autobootstrap2")
>  test_run = require('test_run').new()
> -box.cfg{replication_timeout = 0.01, replication_connect_timeout=0.01}
> +box.cfg{replication_timeout = 0.2, replication_connect_timeout=0.2}
>  test_run:cmd("switch autobootstrap3")
>  test_run = require('test_run').new()
>  fiber=require('fiber')
> -box.cfg{replication_timeout = 0.01, replication_connect_timeout=0.01}
> +box.cfg{replication_timeout = 0.2, replication_connect_timeout=0.2}
>  _ = box.schema.space.create('test_timeout'):create_index('pk')
>  test_run:cmd("setopt delimiter ';'")
>  function test_timeout()
> +    local replicaA = box.info.replication[1].upstream or box.info.replication[2].upstream
> +    local replicaB = box.info.replication[3].upstream or box.info.replication[2].upstream

Does the 'box' code guarantee that box.info.replication[N].upstream will
keep updating the same table? I don't think so. It is better to read
these values inside the loop.
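
A minimal sketch of what I mean, not a drop-in replacement for the test.
It assumes each read of box.info.replication yields a fresh snapshot, so
the info has to be re-read on every check. `get_rinfo` and
`all_upstreams_follow` are hypothetical names used only for illustration;
in the test `get_rinfo` would be something like
`function() return box.info.replication end`.

```lua
-- Sketch: re-read the replication info on every check instead of
-- caching `upstream` tables before the loop.
-- `get_rinfo` is a hypothetical accessor passed in by the caller.
local function all_upstreams_follow(get_rinfo)
    local rinfo = get_rinfo()  -- fresh read on each call
    for _, r in pairs(rinfo) do
        -- The instance's own entry has no upstream, so skip nil.
        if r.upstream ~= nil and r.upstream.status ~= 'follow' then
            return false
        end
    end
    return true
end
```

Calling such a helper from inside the loop avoids relying on whether a
previously fetched table is ever updated.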

Nit: the lines are too long.

>      for i = 0, 99 do 
>          box.space.test_timeout:replace({1})
> -        fiber.sleep(0.005)
> -        local rinfo = box.info.replication
> -        if rinfo[1].upstream and rinfo[1].upstream.status ~= 'follow' or
> -           rinfo[2].upstream and rinfo[2].upstream.status ~= 'follow' or
> -           rinfo[3].upstream and rinfo[3].upstream.status ~= 'follow' then
> -            return error('Replication broken')
> -        end
> +        local n = 200
> +        repeat
> +            fiber.sleep(0.001)
> +            n = n - 1
> +            if n == 0 then return error(box.info.replication) end
> +        until replicaA.status == 'follow' and replicaB.status == 'follow'
>      end
>      return true
>  end ;
>  test_run:cmd("setopt delimiter ''");
> +-- the replica status is checked 100 times, each check within replication_timeout

I don't get the comment 'each check within replication_timeout'. What
does it mean?

Anyway, I think this just broke the test case. It used to check that
replicas do not leave the 'follow' state; now it checks nothing.
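
To illustrate the difference, here is a simplified model that uses
simulated statuses and no Tarantool API. `strict_check`,
`wait_until_follow` and `samples` are hypothetical names, and the model
ignores the actual timing of the test.

```lua
-- Sketch with simulated upstream statuses sampled over time.

-- Old behaviour (simplified): fail as soon as any sample is not 'follow'.
local function strict_check(statuses)
    for _, s in ipairs(statuses) do
        if s ~= 'follow' then
            return false
        end
    end
    return true
end

-- Patched behaviour (simplified): succeed if 'follow' is seen within
-- max_tries samples, no matter what was seen before it.
local function wait_until_follow(statuses, max_tries)
    for i = 1, math.min(max_tries, #statuses) do
        if statuses[i] == 'follow' then
            return true
        end
    end
    return false
end

-- An upstream that dropped out of 'follow' and then recovered.
local samples = {'disconnected', 'follow'}
local strict_ok = strict_check(samples)              -- false
local tolerant_ok = wait_until_follow(samples, 200)  -- true
```

In this model the strict variant reports the dropout while the waiting
variant accepts it, which is why I say the case no longer verifies that
replicas stay in 'follow'.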

I still push for the approach where the QA team works closely with
developers to understand the cases, or at least files issues so that
developers can fix their test cases. I am strongly against random
timeout tweaks to get a test 'working'.

Please discuss the test case with Georgy (the case was introduced in
195d4462).

>  test_timeout()
>  
>  -- gh-3247 - Sequence-generated value is not replicated in case
> -- 
> 2.18.0
> 


