Tarantool development patches archive
 help / color / mirror / Atom feed
From: Serge Petrenko <sergepetrenko@tarantool.org>
To: "Alexander V. Tikhonov" <avtikhon@tarantool.org>,
	Kirill Yukhin <kyukhin@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH v1] test: replication-py/init_storage.test.py
Date: Tue, 10 Nov 2020 12:06:34 +0300	[thread overview]
Message-ID: <4c34dfd4-216e-f556-9cfa-e2a16bcc0bc8@tarantool.org> (raw)
In-Reply-To: <1f40f4a9644bb2a51586cb9a44303ca2af131015.1604816629.git.avtikhon@tarantool.org>


08.11.2020 09:24, Alexander V. Tikhonov пишет:
> Found that test failed in 2 common places when it tried to start the
> replica and wait it within 'JOIN' either 'SUBSCRIBE' test parts.
> It used to wait for replica start check the 'wait_until_started()'
> function 'TarantoolServer' class from test-run repository. But it
> didn't try resolve connection issues on replica creation, like:
>
>    [30534] main/103/replica I> connecting to 1 replicas
>    [30534] main/112/applier/localhost:49168 I> can't connect to master
>    [30534] main/112/applier/localhost:49168 sio.c:208 !> SystemError connect to 127.0.0.1:49168, called on fd 27, aka 127.0.0.1:47954: Connection refused
>    [30534] main/112/applier/localhost:49168 I> will retry every 0.10 second
>    [30534] main/112/applier/localhost:49168 I> remote master c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168 running Tarantool 2.7.0
>    [30534] main/103/replica I> connected to 1 replicas
>    [30534] main/103/replica I> bootstrapping replica from c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168
>    [30534] main/112/applier/localhost:49168 I> can't read row
>    [30534] main/112/applier/localhost:49168 box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
>    [30534] main/103/replica box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
>    [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.
>    [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.


Hi! Thanks for the patch!

Please find my comments below.


>
> To resolve it the test was changed to be able to catch exception
> 'TarantoolStartError' from test-run. Also the test should have the
> ability to be restarted by test-run using fragile list and in this way
> 'crash_expected' flag was disabled to let the test fail with exception.

You're saying `crash_expected` is disabled, but you enable it in your patch.


>
> Needed by #4949
> ---
>
> Github: https://github.com/tarantool/tarantool/tree/avtikhon/gh-4949
> Issue: https://github.com/tarantool/tarantool/issues/4949
>
>   test/replication-py/init_storage.test.py | 27 ++++++++++++++++++++----
>   test/replication-py/suite.ini            |  5 +++--
>   2 files changed, 26 insertions(+), 6 deletions(-)
>
> diff --git a/test/replication-py/init_storage.test.py b/test/replication-py/init_storage.test.py
> index 4be531f8d..a830821cd 100644
> --- a/test/replication-py/init_storage.test.py
> +++ b/test/replication-py/init_storage.test.py
> @@ -1,5 +1,6 @@
>   import os
>   import glob
> +from lib.tarantool_server import TarantoolStartError
>   from lib.tarantool_server import TarantoolServer
>   
>   # master server
> @@ -64,8 +65,17 @@ replica.deploy(wait=False)
>   
>   print 'waiting reconnect on JOIN...'
>   server.start()
> -replica.wait_until_started()
> -print 'ok'
> +try:
> +    # Replica may fail to start due connection issues may occur, check
> +    # gh-4949. Also the test should have the ability to be restarted by
> +    # test-run using fragile list and in this way 'crash_expected' flag
> +    # should be disabled to let the test fail with exception.
> +    replica.crash_expected = True
> +    replica.wait_until_started()
> +except TarantoolStartError:
> +    print 'not ok - server failed to start'
> +else:
> +    print 'ok'

You're saying that crash_expected flag should be disabled,
but you're enabling it.

What happens if you don't catch the exception?
Is this added to make the test produce the same output on
multiple failures?

>   
>   replica.stop()
>   server.stop()
> @@ -73,8 +83,17 @@ server.stop()
>   print 'waiting reconnect on SUBSCRIBE...'
>   replica.start(wait=False)
>   server.start()
> -replica.wait_until_started()
> -print 'ok'
> +try:
> +    # Replica may fail to start due connection issues may occur, check
> +    # gh-4949. Also the test should have the ability to be restarted by
> +    # test-run using fragile list and in this way 'crash_expected' flag
> +    # should be disabled to let the test fail with exception.
> +    replica.crash_expected = True
> +    replica.wait_until_started()
> +except TarantoolStartError:
> +    print 'not ok - server failed to start'
> +else:
> +    print 'ok'
>   
>   replica.stop()
>   replica.cleanup()
> diff --git a/test/replication-py/suite.ini b/test/replication-py/suite.ini
> index b563b9fca..db4403457 100644
> --- a/test/replication-py/suite.ini
> +++ b/test/replication-py/suite.ini
> @@ -4,10 +4,11 @@ script =  master.lua
>   description = tarantool/box, replication
>   is_parallel = True
>   fragile = {
> -    "retries": 10,
> +    "retries": 50,

Why 50 retries?  Don't we have a rule that only 10 retries  are allowed?
I understand that 10 retries aren't sufficient sometimes, but still. 50 
is too much.

I guess we should teach test_run to retry starting a server instead of 
restarting the
whole test, when the problem is only with replica start.

>       "tests": {
>           "init_storage.test.py": {
> -            "issues": [ "gh-4949" ]
> +            "issues": [ "gh-4949" ],
> +            "checksums": [ "9b4235bb6bb9d76aa6a1f7dc8f088075", "4c5fc871955a3166d67fbfa9f254f68a", "bc2781acdb5745d01da2f533a0d519f9" ]
>           },
>           "conflict.test.py": {
>               "issues": [ "gh-4980" ]

-- 
Serge Petrenko

  reply	other threads:[~2020-11-10  9:06 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-08  6:24 Alexander V. Tikhonov
2020-11-10  9:06 ` Serge Petrenko [this message]
2020-12-02 20:31   ` Alexander V. Tikhonov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4c34dfd4-216e-f556-9cfa-e2a16bcc0bc8@tarantool.org \
    --to=sergepetrenko@tarantool.org \
    --cc=avtikhon@tarantool.org \
    --cc=kyukhin@tarantool.org \
    --cc=tarantool-patches@dev.tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH v1] test: replication-py/init_storage.test.py' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox