[Tarantool-patches] [test-run] Add options for upgrade testing

Thu May 28 12:39:34 MSK 2020

Alexander,

thanks for review. See my comments inline.
All changes on top of branch.

On 19:17 Fri 15 May , Alexander Turenko wrote:
> The most important question: as I see from the code both options do not
> work with 'core = app' tests. I think we should support those option in
> AppServer too.

Fixed with small refactoring of AppServer, TarantoolServer and Server
classes.

> Other comments are below. They are mostly like 'we can polish it a bit
> here and there'.
>
> WBR, Alexander Turenko.
> 
> On Mon, Mar 30, 2020 at 05:48:23PM +0300, Sergey Bronnikov wrote:
> > First option '--snapshot-version' specifies snapshot version which loaded in
> 
> Nit: Over 72 symbols.

Fixed.

> > tarantool before testing. Prepared snapshots lies in a directory
> > test/static/snapshots.
> > 
> > Second option '--disable-chema-upgrade' allows to skip execution
> 
> Typo: chema -> schema.

Fixed.

> > of `upgrade.lua` script.
> 
> Nit: It is not obvious that it is about src/box/lua script, because we
> have several upgrade.lua in tests:
> 
> $ find . -name upgrade.lua
> ./src/box/lua/upgrade.lua
> ./test/sql/upgrade/upgrade.lua
> ./test/vinyl/upgrade.lua
> ./test/xlog/upgrade.lua

Fixed too.

> > 
> > Ticket: #4801
> 
> Nit: tarantool/tarantool issue will not be linked this way rfom
> tarantool/test-run repository commit. I usually use full link in such
> cases.

Done.

> > ---
> 
> Nit: Please, place branch name here.

Will do next time.

> >  lib/options.py          | 14 +++++++++
> >  lib/tarantool_server.py | 66 ++++++++++++++++++++++++++++++++++++++++-
> >  2 files changed, 79 insertions(+), 1 deletion(-)
> > 
> > diff --git a/lib/options.py b/lib/options.py
> > index 8bacb4a..d627eb2 100644
> > --- a/lib/options.py
> > +++ b/lib/options.py
> > @@ -201,6 +201,20 @@ class Options:
> >                  help="""Run the server under 'luacov'.
> >                  Default: false.""")
> >  
> > +        parser.add_argument(
> > +                "--snapshot-version",
> > +                dest='snapshot_version',
> > +                default=None,
> > +                help="""Version of Tarantool snapshot loaded before
> > +                testing.""")
> 
> I would accept a path to a snapshot file here. This way does not require
> knowledge from a user where test-run expect snapshots to be placed (and
> that '.snap' should not be added). It also more flexible: a user may
> choose where it is better to place snapshots within its project (or even
> outside it).

Don't forget we need a way to run upgrade testing for several tarantool
versions automatically. With your suggestion we should specify one
option several times and I don't like this approach, because it will
overload command line for running tests.

> > +
> > +        parser.add_argument(
> > +                "--disable-schema-upgrade",
> > +                dest='disable_schema_upgrade',
> > +                action="store_true",
> > +                default=False,
> > +                help="""Disable schema upgrade on testing with snapshots.""")
> > +
> >          # XXX: We can use parser.parse_intermixed_args() on
> >          # Python 3.7 to understand commands like
> >          # ./test-run.py foo --exclude bar baz
> > diff --git a/lib/tarantool_server.py b/lib/tarantool_server.py
> > index fd102b9..035ba5f 100644
> > --- a/lib/tarantool_server.py
> > +++ b/lib/tarantool_server.py
> > @@ -2,6 +2,7 @@ import errno
> >  import gevent
> >  import glob
> >  import inspect  # for caller_globals
> > +import json
> >  import os
> >  import os.path
> >  import re
> > @@ -32,6 +33,7 @@ from lib.utils import format_process
> >  from lib.utils import signame
> >  from lib.utils import warn_unix_socket
> >  from lib.utils import prefix_each_line
> > +from lib.options import Options
> >  from test import TestRunGreenlet, TestExecutionError
> >  
> >  
> > @@ -448,6 +450,9 @@ class TarantoolLog(object):
> >  
> >  
> >  class TarantoolServer(Server):
> > +
> > +    SNAPSHOT_DIRECTORY = 'static/snapshots'
> > +
> >      default_tarantool = {
> >          "bin": "tarantool",
> >          "logfile": "tarantool.log",
> > @@ -641,6 +646,22 @@ class TarantoolServer(Server):
> >          if 'test_run_current_test' in caller_globals.keys():
> >              self.current_test = caller_globals['test_run_current_test']
> >  
> > +        self.snapshot_basename = Options().args.snapshot_version
> > +        if self.snapshot_basename:
> > +            self.snapshot_name = '{}.snap'.format(self.snapshot_basename)
> > +            self.snapshot_path = os.path.join(self.testdir, self.SNAPSHOT_DIRECTORY, self.snapshot_name)
> > +            if not os.path.exists(self.snapshot_path):
> > +                color_stdout("Snapshot {} have been not found\n".format(self.snapshot_path),
> > +                            schema='error')
> 
> Can we move this check to lib/options.py? There is check() function
> there.

These conditions are not about constraints checking but also set
internal variables. So I cannot move them to lib/options.py.

> > +                raise TarantoolStartError
> > +                server.kill_current_test()
> > +        self.disable_schema_upgrade = Options().args.disable_schema_upgrade
> > +        if self.disable_schema_upgrade and not self.snapshot_basename:
> > +            color_stdout("option --disable-schema-upgrade depends on --snapshot-version\n",
> > +                        schema='error')
> > +            raise TarantoolStartError
> > +            server.kill_current_test
> > +
> 
> And this check too?

Moved to lib/options.py.

> I would also verify whether tarantool is built in debug: otherwise error
> injections are not available. When the build is not Debug we obviously
> unable to disable upgrade and it worth to give a user exact reason.

Good point! Added.

> cls.version() captures `tarantool --version` output.
> 
> >      def __del__(self):
> >          self.stop()
> >  
> > @@ -727,6 +748,15 @@ class TarantoolServer(Server):
> >              self.cleanup()
> >          self.copy_files()
> >  
> > +        if self.snapshot_basename:
> > +            (instance_name, _) = os.path.splitext(os.path.basename(self.script))
> > +            instance_dir = os.path.join(self.vardir, instance_name)
> > +            if not os.path.exists(instance_dir):
> > +                os.mkdir(instance_dir)
> > +            snapshot_dest = os.path.join(instance_dir, '00000000000000000000.snap')
> 
> We can keep a snapshot name, it may be useful for debugging.

AFAIU you mean 'keep original name', done.

> > +            color_log("Copying snapshot {} to {}".format(self.snapshot_name, snapshot_dest))
> > +            shutil.copy(self.snapshot_path, snapshot_dest)
> > +
> >          if self.use_unix_sockets:
> >              path = os.path.join(self.vardir, self.name + ".socket-admin")
> >              warn_unix_socket(path)
> > @@ -767,6 +797,7 @@ class TarantoolServer(Server):
> >                      if (e.errno == errno.ENOENT):
> >                          continue
> >                      raise
> > +
> >          shutil.copy('.tarantoolctl', self.vardir)
> >          shutil.copy(os.path.join(self.TEST_RUN_DIR, 'test_run.lua'),
> >                      self.vardir)
> > @@ -776,7 +807,12 @@ class TarantoolServer(Server):
> >                          self.vardir)
> >  
> >      def prepare_args(self, args=[]):
> > -        return [self.ctl_path, 'start', os.path.basename(self.script)] + args
> > +        ctl_args = [self.ctl_path, 'start', os.path.basename(self.script)] + args
> > +        if self.disable_schema_upgrade:
> > +            auto_upgrade = 'box.error.injection.set("ERRINJ_AUTO_UPGRADE", true)'
> > +            ctl_args = [self.binary, '-e', auto_upgrade] + ctl_args
> 
> It does not fail when the errinj is not available (say, on old tarantool
> version). I experimented around and found that this variant will fails
> in the case:
> 
>  | -e "require('fiber').sleep(0) assert(box.error.injection.set('ERRINJ_AUTO_UPGRADE', true) == 'ok', 'no such errinj')"

Updated.

> Filed https://github.com/tarantool/tarantool/issues/4983
> 
> Please, mention the issue in a comment here if you'll use this
> fiber.sleep(0) hack.

Mentioned.

> > +
> > +        return ctl_args
> >  
> >      def pretest_clean(self):
> >          # Don't delete snap and logs for 'default' tarantool server
> > @@ -819,6 +855,7 @@ class TarantoolServer(Server):
> >  
> >          # redirect stdout from tarantoolctl and tarantool
> >          os.putenv("TEST_WORKDIR", self.vardir)
> > +
> >          self.process = subprocess.Popen(args,
> >                                          cwd=self.vardir,
> >                                          stdout=self.log_des,
> > @@ -861,6 +898,33 @@ class TarantoolServer(Server):
> >          self.admin = CON_SWITCH[self.tests_type]('localhost', port)
> >          self.status = 'started'
> >  
> > +        # make sure schema is old
> > +        if self.disable_schema_upgrade:
> > +            version = self.extract_schema_from_snapshot()
> > +            conn = AdminConnection('localhost', self.admin.port)
> > +            current_ver = yaml.safe_load(conn.execute('box.space._schema:get{"version"}'))
> > +            conn.disconnect()
> 
> self.admin is a connection, it can be used here.

Fixed.

> > +            assert(version, current_ver)
> 
> Nit: It does not check whether version are equivalent.
> 
> Nit: assert is statement, not a function in Python (it is better to use
> it w/o parentheses).
> 
> Nit: I would explicitly raise RuntimeError(<...message...>), because I
> would not want depend on optimization level here. This check is not only
> for test-run code, but also depends on tarantool behaviour.

Well, replaced assert with raise.

> BTW, CI (`make lint`) show errors:
> https://travis-ci.com/github/tarantool/test-run/builds/156536350

Most warnings fixed.

> > +
> > +    def extract_schema_from_snapshot(self):
> 
> Nit: We can make it a class method.

What do you mean? It's already a class method.

> Nit: We can call find_exe() if cls.ctl_path is not defined.

Fixed.

> > +        ctl_args = [self.ctl_path, 'cat', self.snapshot_path,
> > +                        '--format=json', '--show-system']
> > +        # Extract schema version, example of record:
> > +        # {"HEADER":{"lsn":2,"type":"INSERT","timestamp":1584694286.0031},
> > +        #   "BODY":{"space_id":272,"tuple":["version",2,3,1]}}
> > +        with open(os.devnull, 'w') as fp:
> > +            proc = subprocess.Popen(ctl_args, stdout=subprocess.PIPE, stderr=fp)
> 
> In theory snapshots may be quite large. Can we read it line-by-line? I
> guess `for line in iter(proc.stdout.readline, '')` should work, see:
> 
> https://stackoverflow.com/a/2813530/1598057 (comments)
> https://web.archive.org/web/20181102115150/bugs.python.org/issue3907

Fixed.

> > +            content = proc.stdout.read()
> > +            proc.wait()
> > +            for line in content.split('\n'):
> > +                if line:
> > +                    json_line = json.loads(line)
> > +                    t = json_line['BODY']['tuple']
> > +                    if t[0] == 'version':
> > +                        return t
> 
> It looks fragile. Can we at least check space id? Those IDs are fixed
> for system spaces.

Discussed it and decided to check that box.space._schema.id == 272 in
tuples. Added such check.

> It also worth to consider only INSERT/REPLACE operations.

Added filter for operations.

> > +
> > +        return None
> > +
> >      def crash_detect(self):
> >          if self.crash_expected:
> >              return
> > -- 
> > 2.23.0
> > 
> > 
> > -- 
> > sergeyb@