[tarantool-patches] Re: [PATCH v2 4/5] test: use wait_cond to check follow status

Mon Oct 29 13:41:07 MSK 2018

> >> If `test_run:wait_cond()` found a not 'follow` status it returns true.
> >> Which immediately causes an error.
> >> 
> >> Fixes #3734
> >> Part of #2436, #3232
> >> ---
> >>  test/replication/misc.result   | 17 +++++++++++------
> >>  test/replication/misc.test.lua | 15 +++++++++------
> >>  2 files changed, 20 insertions(+), 12 deletions(-)
> >> 
> >
> >> diff --git a/test/replication/misc.test.lua b/test/replication/misc.test.lua
> >> index 06ad974db..3866eb3ac 100644
> >> --- a/test/replication/misc.test.lua
> >> +++ b/test/replication/misc.test.lua
> >> @@ -53,15 +53,18 @@ fiber=require('fiber')
> >>  box.cfg{replication_timeout = 0.01, replication_connect_timeout=0.01}
> >>  _ = box.schema.space.create('test_timeout'):create_index('pk')
> >>  test_run:cmd("setopt delimiter ';'")
> >> +function wait_follow(replicaA, replicaB)
> >> +    return test_run:wait_cond(function()
> >> +        return replicaA.status ~= 'follow' or replicaB.status ~= 'follow'
> >> +    end, 0.01)
> >> +end ;
> >>  function test_timeout()
> >>      for i = 0, 99 do 
> >> +        local replicaA = box.info.replication[1].upstream or box.info.replication[2].upstream
> >> +        local replicaB = box.info.replication[3].upstream or box.info.replication[2].upstream
> >>          box.space.test_timeout:replace({1})
> >> -        fiber.sleep(0.005)
> >> -        local rinfo = box.info.replication
> >> -        if rinfo[1].upstream and rinfo[1].upstream.status ~= 'follow' or
> >> -           rinfo[2].upstream and rinfo[2].upstream.status ~= 'follow' or
> >> -           rinfo[3].upstream and rinfo[3].upstream.status ~= 'follow' then
> >> -            return error('Replication broken')
> >> +        if wait_follow(replicaA, replicaB) then
> >> +            return error(box.info.replication)
> >
> >AFAIU, this test case checks that replicas do not leave from 'follow'
> >state even for a short time period. We should wait for 'follow' state
> >before the loop and perform some amount of attemps to catch an another
> >state. I don't sure, though. Georgy should draw the line.
> >
> >I still think correction of test cases is a developer responsibility. If
> >you want to do it, please, discuss it with the author before. This will
> >save us some time we spend now on those extra review iterations.
>
> We discussed with Georgy how to do it:
> function test_timeout()
>     local replicaA = box.info.replication[1].upstream or box.info.replication[2].upstream
>     local replicaB = box.info.replication[3].upstream or box.info.replication[2].upstream
>     local follows = test_run:wait_cond(function()
>         return replicaA.status == 'follow' or replicaB.status == 'follow'
>     end, 0.1)
>     if not follows then error('replicas not in follow status') end
>     for i = 0, 99 do
>         box.space.test_timeout:replace({1})
>         if wait_follow(replicaA, replicaB) then
>             return error(box.info.replication)
>         end
>     end
>     return true
> end ;
> 
> Branch was updated.

Now I understand the approach. It looks good. I think it should be
commented inside the test, because it is counter-intuitive that wait_xxx
function returns true when something went wrong.

WBR, Alexander Turenko.