[Tarantool-patches] [test-run] Add options for upgrade testing

Mon Nov 16 19:40:06 MSK 2020

On Tue, Nov 10, 2020 at 12:55:46PM +0300, Sergey Bronnikov wrote:
> Hi, Alexander!
> 
> I just want to ask you to take a look on updated patch again.
> I have fixed code according to your comments and also
> fixed a bug when path for snapshot copying was invalid.
> 
> Branch: https://github.com/tarantool/test-run/tree/ligurio/gh-4801-add-snapshot-option
> CI: https://travis-ci.com/github/tarantool/test-run/builds/199375391
> 
> On 14:29 Wed 10 Jun , Sergey Bronnikov wrote:
> > Hi,

Hi!

Sorry for the delay.

I looked over the code and didn't perform any testing of the changes. I
hope the success path works as expected, however there are certainly
broken error paths and there are code hunks for test-run.py parameters
validation, which should not spread far away from options.py. See
details below.

WBR, Alexander Turenko.

> Option '--snapshot-version' specifies a path to snapshot
> that will be loaded in Tarantool before testing.

I would name it --snapshot, because the argument of the option is path
to a snapshot, not a version of some kind.

Ouch, I see, it is --snapshot-path now, but the commit message does not
reflect this change. --snapshot-path is okay for me too (despite that
personally I would prefer --snapshot).

(lib/app_server.py)
> +    def test_option_get(self, option_list_str, silent=False):
> +        args = [self.binary] + shlex.split(option_list_str)
> +        if not silent:
> +            print " ".join([os.path.basename(self.binary)] + args[1:])
> +        output = subprocess.Popen(args,
> +                                  stdout=subprocess.PIPE,
> +                                  stderr=subprocess.STDOUT).stdout.read()
> +        return output
> +
> +    def test_debug(self):
> +        if re.findall(r"-Debug", self.test_option_get("-V", True), re.I):
> +            return True
> +        return False

I don't see a reason to duplicate the code, it is already present in
TarantoolServer. We run all tests against the same tarantool. We should
move test-run.py parameters validation code out from the *Server classes
and the need to have this code in AppServer will gone.

> +        if getattr(self, 'disable_schema_upgrade', '') and \
> +                not (self, 'snapshot_path', ''):
> +            color_stdout("option --disable-schema-upgrade \
> +                depends on --snapshot-version\n", schema='error')
> +            check_error = True
> +
> +        if getattr(self, 'snapshot_path', ''):
> +            snapshot_path = getattr(self, 'snapshot_path', '')
> +            if not os.path.exists(snapshot_path):
> +                color_stdout("path {} not exist\n", snapshot_path, schema='error')
> +                check_error = True

First, it does not work at all as well as the validation for conflicting
--gdb, --lldb and so options above. It tries to obtain values from
`self` instead of `self.args`.

Second, there is no reason to use getattr() for options that have
default values.

Third, there is no --snapshot-version option anymore.

Fourth, the first message contains extra whitespaces:

 | option --disable-schema-upgrade                 depends on --snapshot-version

The second one is printed in two lines without leading newline
character:

 | path {} not exist
 | fooalex at tkn_work_nb test-run ((df4a0e8...) *)

(It should be 'not exist*s*'.)

This code would not work even if getattr() from `self` would do, because
of missed getattr(): `not (self, 'snapshot_path', '')`.

It is also strange to use '' as default value for a boolean option.

> -from lib.colorer import color_stdout
> +from lib.colorer import color_stdout, color_log
> +from lib.options import Options

Nit: Most of imports are function-per-line. I would keep the style
instead of randomly using one or another.

> +    DISABLE_AUTO_UPGRADE = "require('fiber').sleep(0) \
> +        assert(box.error.injection.set('ERRINJ_AUTO_UPGRADE', true) == 'ok', \
> +        'no such errinj')"

You and me knows why fiber.sleep() is here. Nobody else will understand
this code. Please, refer the relevant issue and mention that it is
workaround for it.

> -        corecls = __import__(cls.mdlname,
> -                             fromlist=cls.clsname).__dict__[cls.clsname]
> +        corecls = __import__(cls.mdlname, fromlist=cls.clsname).__dict__[cls.clsname]

Unrelated change.

> +        self.disable_schema_upgrade = Options().args.disable_schema_upgrade
> +        if self.snapshot_path:
> +            self.snapshot_path = os.path.abspath(self.snapshot_path)
> +            if os.path.exists(self.snapshot_path):
> +                self.snapshot_basename = os.path.basename(self.snapshot_path)
> +            else:
> +                color_stdout("\nSnapshot {} have been not found\n".format(self.snapshot_path),
> +                             schema='error')

I don't understand this logic. If there is no requested snapshot file,
we'll print the warning and continue? In my understanding we should give
an error and stop, if an incorrect option is passed.

And parameters verification should not be grounded somewhere in
unrelated code. Preferably it should be in options.py. If some
test-run's facilities are used to verify the option, it should be
somewhere after options parsing in some top-level code. Maybe in
module_init(). However here we just verify whether the file exists, so
let's move to options.py.

>      def install(self, binary=None, vardir=None, mem=None, silent=True):
> -        pass
> +        if self.snapshot_basename:
> +            (instance_name, _) = os.path.splitext(os.path.basename(self.script))
> +            instance_dir = os.path.join(self.vardir, instance_name)
> +            if not os.path.exists(instance_dir):
> +                os.mkdir(instance_dir)
> +            snapshot_dest = os.path.join(instance_dir, "00000000000000000000.snap")
> +            color_log("Copying snapshot {} to {}\n".format(
> +                self.snapshot_path, snapshot_dest))
> +            shutil.copy(self.snapshot_path, snapshot_dest)

We assume tarantoolctl layout (as it is defined in test/.tarantoolctl):

 | vardir/
 | + instance_name/
 |   + *.{snap,xlog}
 | + instance_name.log
 | + instance_name.control

It is not relevant for AppServer and UnittestServer.

>  from lib.preprocessor import TestState
>  from lib.server import Server
>  from lib.test import Test
> -from lib.utils import find_port
> +from lib.utils import find_port, extract_schema_from_snapshot
>  from lib.utils import format_process
>  from lib.utils import signame
>  from lib.utils import warn_unix_socket

Nit: Most of imports are function-per-line, I would keep it this way. It
looks more accurate.

> @@ -609,7 +609,6 @@ def __init__(self, _ini=None, test_suite=None):
>          }
>          ini.update(_ini)
>          Server.__init__(self, ini, test_suite)
> -        self.testdir = os.path.abspath(os.curdir)
>          self.sourcedir = os.path.abspath(os.path.join(os.path.basename(
>              sys.argv[0]), "..", ".."))
>          self.name = "default"

I see that it is moved to the parent class, but:

1. It does not look related to your changes, but if it is related, you
   should describe a reason in the commit message.
2. If it is a side refactoring, you should move it to its own commit
   with explained motivation.

For now, I don't see any reason for this change.

> @@ -641,6 +640,10 @@ def __init__(self, _ini=None, test_suite=None):
>          if 'test_run_current_test' in caller_globals.keys():
>              self.current_test = caller_globals['test_run_current_test']
>  
> +        if self.disable_schema_upgrade and not self.test_debug():
> +            color_stdout("Can't disable schema upgrade on release build\n",
> +                         schema='error')
> +

Should not we verify it on the upper level, because it is like options
validation? Say, right in options.py or at least in module_init()?

Should not we give an error and don't start testing in the case?

BTW, there is `debug` property, which calls test_debug().

> @@ -861,6 +873,18 @@ def start(self, silent=True, wait=True, wait_load=True, rais=True, args=[],
>          self.admin = CON_SWITCH[self.tests_type]('localhost', port)
>          self.status = 'started'
>  
> +        if not self.ctl_path:
> +            self.ctl_path = self.find_exe(self.builddir)

1. Are there situations, when there is no `ctl_path` value at starting a
   server? So `prepare_args()` call above is incorrect? I don't get the
   trick. I guess `lib/__init__.py` call to `find_exe()` should be
   enough to fill `cls.ctl_path`.
2. `find_exe()` sets `cls.ctl_path` on its own.

> +        if self.disable_schema_upgrade:
> +            version = extract_schema_from_snapshot(self.ctl_path,
> +                                                   self.builddir, self.snapshot_path)
> +            current_ver = yaml.safe_load(self.admin.execute(
> +                'box.space._schema:get{"version"}'))
> +            if version == current_ver:
> +                raise_msg = 'Versions are inequal ({} vs {})'.format(
> +                    version, current_ver)
> +                raise Exception(raise_msg)
> +

AFAIS, the usual recommendation is to raise standard or user-defined
Exception subclasses. It seems, RuntimeError fits best here.

However, I guess, RuntimeError and TarantoolStartError will lead to
different behaviour. The former just fails the test, but the latter
stops testing at whole. Should not we stop testing in the case?

Will the new server left running in the case? It is undesirable
situation.

> @@ -1066,7 +1090,6 @@ def test_option_get(self, option_list_str, silent=False):
>          if not silent:
>              print " ".join([os.path.basename(self.binary)] + args[1:])
>          output = subprocess.Popen(args,
> -                                  cwd=self.vardir,
>                                    stdout=subprocess.PIPE,
>                                    stderr=subprocess.STDOUT).stdout.read()
>          return output

1. It looks unrelated.
2. There is not any described reason for this change.

> @@ -1126,7 +1149,7 @@ def is_correct(run_name):
>      def get_param(self, param=None):
>          if param is not None:
>              return yaml.safe_load(self.admin("box.info." + param,
> -                                  silent=True))[0]
> +                                             silent=True))[0]
>          return yaml.safe_load(self.admin("box.info", silent=True))

Unrelated change.