[Tarantool-patches] [PATCH] Add options for upgrade testing

Alexander Turenko alexander.turenko at tarantool.org
Fri Nov 27 04:45:28 MSK 2020


Sorry for the long delay.

I looked over the patch, tested it a bit and proposed several fixups.

I pasted them here (for ease commenting) and pushed to the branch
(upward your commits). If you're okay with them, I can squash commits
and push the result. Or, maybe, extract several commits (re debug
property, re xlog module) and squash the rest (keeping the conflicting
options fix separate, of course).

The branch: ligurio/gh-4801-add-snapshot-option.

And sorry that I reworked a lot of code. I tried to minimize review
ping-ponging, but make myself comfortable with the result. And amount of
changes becomes larger than I initially expect. A lot of low hanging
fruits of the kind 'okay, but there is the easy way to make it a bit
better' appears.

I hope such moments will be less and less frequent with a time we'll
work in touch. A review is often about compromises and it is the work on
ourself for both sides.

WBR, Alexander Turenko.

----

Catched the patch version from the ligurio/gh-4801-add-snapshot-option
branch (8300e499b888c0d66daa941c956462767274b996 with conflicting option
fix 12b791c897cbf84da28326ce3af8a061703c77e3 upward).

Verified for a 'core = app' test. Server.cleanup() removes the snapshot
before it'll be read by tarantool. Fixed:

 | commit 04ffc8d6ca21f00c130e33e9db7947fb1b96b783
 | Author: Alexander Turenko <alexander.turenko at tarantool.org>
 | Date:   Wed Nov 25 02:25:08 2020 +0300
 | 
 |     FIXUP: Add options for upgrade testing
 |     
 |     The --bootstrap option don't work on 'core = app' tests before this
 |     fixup, because pretest_clean() removes all snapshots in a working
 |     directory. pretest_clean() is called before each test, while deploy() is
 |     called once a testing begins. We should copy the snapshot right before
 |     tarantool startup.
 | 
 | diff --git a/lib/app_server.py b/lib/app_server.py
 | index c32c5cf..a046b21 100644
 | --- a/lib/app_server.py
 | +++ b/lib/app_server.py
 | @@ -44,6 +44,14 @@ class AppTest(Test):
 |                                      server.logfile, retval)
 |          self.current_test_greenlet = tarantool
 |  
 | +        # Copy the snapshot right before starting the server.
 | +        # Otherwise pretest_clean() would remove it.
 | +        if server.snapshot_path:
 | +            snapshot_dest = os.path.join(server.vardir, DEFAULT_SNAPSHOT_NAME)
 | +            color_log("Copying snapshot {} to {}\n".format(
 | +                server.snapshot_path, snapshot_dest))
 | +            shutil.copy(server.snapshot_path, snapshot_dest)
 | +
 |          try:
 |              tarantool.start()
 |              tarantool.join()
 | @@ -125,12 +133,6 @@ class AppServer(Server):
 |          # cannot check length of path of *.control unix socket created by it.
 |          # So for 'app' tests type we don't check *.control unix sockets paths.
 |  
 | -        if self.snapshot_path:
 | -            snapshot_dest = os.path.join(self.vardir, DEFAULT_SNAPSHOT_NAME)
 | -            color_log("Copying snapshot {} to {}\n".format(
 | -                self.snapshot_path, snapshot_dest))
 | -            shutil.copy(self.snapshot_path, snapshot_dest)
 | -
 |      def stop(self, silent):
 |          if not self.process:
 |              return

Verified for a 'core = tarantool' test: it works. Observed that 'Processing
file <filename>...' lines are printed to the terminal and fixed it:

 | commit 7f25709a84af1cf862037e0e2bf99318836b05d0
 | Author: Alexander Turenko <alexander.turenko at tarantool.org>
 | Date:   Wed Nov 25 20:14:31 2020 +0300
 | 
 |     FIXUP: Add options for upgrade testing
 |     
 |     Explicitly discard `tarantoolctl cat` stderr output. Otherwise test-run
 |     prints 'Processing file <filename>...' lines to the terminal before
 |     starting a server for 'core = tarantool' test (or when 'start server'
 |     command is used inside a test) if --disable-schema-upgrade option is
 |     passed.
 | 
 | diff --git a/lib/utils.py b/lib/utils.py
 | index ef7d8a8..bdf920e 100644
 | --- a/lib/utils.py
 | +++ b/lib/utils.py
 | @@ -281,7 +281,8 @@ def extract_schema_from_snapshot(ctl_path, builddir, snapshot_path):
 |  
 |      ctl_args = [ctl_path, 'cat', snapshot_path,
 |                  '--format=json', '--show-system']
 | -    proc = subprocess.Popen(ctl_args, stdout=subprocess.PIPE)
 | +    with open(os.devnull, 'w') as devnull:
 | +        proc = subprocess.Popen(ctl_args, stdout=subprocess.PIPE, stderr=devnull)
 |      version = None
 |      while True:
 |          line = proc.stdout.readline()

(I unable to use subprocess.DEVNULL, because it is supported since
Python 3.3.)

The last thing I want to verify is whether we'll able to reimplement the
option in the way that is described in #5540 ([1]) without changing
anything on the tarantool side (including snapshots we pass). In other
words, whether there is any sense in our agreement about calling the
option as --bootstrap in order to reimplement it later without breaking
backward compatibility.

[1]: https://github.com/tarantool/tarantool/issues/5540

I tried to pass a bootstrap snapshot (src/box/bootstrap.snap from the
latest 1.10) and got the following errors:

 | [001] 2020-11-25 02:24:08.839 [22154] main/103/merger.test.lua box.cc:2747 E> ER_UNKNOWN_REPLICA: Replica 1e00d635-4bea-4eb8-86d8-52807fcd0adb is not registered with replica set 00000000-0000-0000-0000-000000000000
 | [001] 2020-11-25 02:24:08.839 [22154] main/103/merger.test.lua F> can't initialize storage: Replica 1e00d635-4bea-4eb8-86d8-52807fcd0adb is not registered with replica set 00000000-0000-0000-0000-000000000000

I investigated the differences between a bootstrap snapshot and the
snapshot that is written to the file system after a first box.cfg() and
created the utility function to transform the former to the latter.

I'll place four patches with support of this approach at the of the
email and on the branch.

See the rest of comments and proposed patches are inline.

> diff --git a/lib/__init__.py b/lib/__init__.py
> index 398439d..a3d1597 100644
> --- a/lib/__init__.py
> +++ b/lib/__init__.py
> @@ -6,6 +6,7 @@ from lib.options import Options
>  from lib.tarantool_server import TarantoolServer
>  from lib.unittest_server import UnittestServer
>  from utils import warn_unix_sockets_at_start
> +from utils import test_debug
>  
>  
>  __all__ = ['Options']
> @@ -56,6 +57,9 @@ def module_init():
>      TarantoolServer.find_exe(args.builddir)
>      UnittestServer.find_exe(args.builddir)
>  
> +    is_debug = test_debug(TarantoolServer().version())
> +    Options().check_schema_upgrade_option(is_debug)
> +

There is no need to create an instance of TarantoolServer to call a
class method. In fact, there is no need to move the `test_debug()`
method implementation to utils and, moreover, there is no need to keep
this method at all. We can just use the 'version()' class method at
appropriate place and save the result within a class field. It still
will be available as the field of an instance.

And, to be honest, the name 'test_debug()' is dubious. It may mean 'to
test a debug build' or 'to test whether a build is debug'. Too common,
IMHO.

BTW, while we're here I would note that it would be good to fix the
problem with the 'release_disable' suite.ini option in 'core = app' and
'core = unittest' test suites:
https://github.com/tarantool/test-run/issues/199

I propose the following change:

 | commit d4cc26096643f41888e7ed004392d27bce60ee78
 | Author: Alexander Turenko <alexander.turenko at tarantool.org>
 | Date:   Thu Nov 26 22:56:06 2020 +0300
 | 
 |     FIXUP: Add options for upgrade testing
 |     
 |     Move TarantoolServer's property 'debug' to a class level instead of
 |     instance level.
 |     
 |     Drop unnecessary test_debug() function (it is not used anywhere in
 |     tests, but was used to implement the property). Drop the same named
 |     utils function, because it is unnecessary too now.
 |     
 |     Eliminate unnecesary instantiation of TarantoolServer at options
 |     verification in lib/__init__.py. Use the 'debug' property directly
 |     instead of the 'version()' class method.
 |     
 |     The purpose of changes is simplification of the code.
 | 
 | diff --git a/lib/__init__.py b/lib/__init__.py
 | index a3d1597..767f0d6 100644
 | --- a/lib/__init__.py
 | +++ b/lib/__init__.py
 | @@ -6,7 +6,6 @@ from lib.options import Options
 |  from lib.tarantool_server import TarantoolServer
 |  from lib.unittest_server import UnittestServer
 |  from utils import warn_unix_sockets_at_start
 | -from utils import test_debug
 |  
 |  
 |  __all__ = ['Options']
 | @@ -57,8 +56,7 @@ def module_init():
 |      TarantoolServer.find_exe(args.builddir)
 |      UnittestServer.find_exe(args.builddir)
 |  
 | -    is_debug = test_debug(TarantoolServer().version())
 | -    Options().check_schema_upgrade_option(is_debug)
 | +    Options().check_schema_upgrade_option(TarantoolServer.debug)
 |  
 |  
 |  # Init
 | diff --git a/lib/tarantool_server.py b/lib/tarantool_server.py
 | index 8de1a30..4121a4d 100644
 | --- a/lib/tarantool_server.py
 | +++ b/lib/tarantool_server.py
 | @@ -35,7 +35,6 @@ from lib.utils import safe_makedirs
 |  from lib.utils import signame
 |  from lib.utils import warn_unix_socket
 |  from lib.utils import prefix_each_line
 | -from lib.utils import test_debug
 |  from test import TestRunGreenlet, TestExecutionError
 |  
 |  
 | @@ -461,10 +460,6 @@ class TarantoolServer(Server):
 |      }
 |  
 |      # ----------------------------PROPERTIES--------------------------------- #
 | -    @property
 | -    def debug(self):
 | -        return self.test_debug()
 | -
 |      @property
 |      def name(self):
 |          if not hasattr(self, '_name') or not self._name:
 | @@ -692,7 +687,10 @@ class TarantoolServer(Server):
 |                          ctl_dir + '/?.lua;' + \
 |                          ctl_dir + '/?/init.lua;' + \
 |                          os.environ.get("LUA_PATH", ";;")
 | +                cls.debug = bool(re.findall(r'-Debug', str(cls.version()),
 | +                                 re.I))
 |                  return exe
 | +
 |          raise RuntimeError("Can't find server executable in " + path)
 |  
 |      @classmethod
 | @@ -1112,9 +1110,6 @@ class TarantoolServer(Server):
 |      def test_option(self, option_list_str):
 |          print self.test_option_get(option_list_str)
 |  
 | -    def test_debug(self):
 | -        return test_debug(self.test_option_get("-V", True))
 | -
 |      @staticmethod
 |      def find_tests(test_suite, suite_path):
 |          test_suite.ini['suite'] = suite_path
 | diff --git a/lib/utils.py b/lib/utils.py
 | index bdf920e..e0bd394 100644
 | --- a/lib/utils.py
 | +++ b/lib/utils.py
 | @@ -1,5 +1,4 @@
 |  import os
 | -import re
 |  import sys
 |  import six
 |  import collections
 | @@ -298,9 +297,3 @@ def extract_schema_from_snapshot(ctl_path, builddir, snapshot_path):
 |                  break
 |  
 |      return version
 | -
 | -
 | -def test_debug(version):
 | -    if re.findall(r"-Debug", version, re.I):
 | -        return True
 | -    return False

> diff --git a/lib/options.py b/lib/options.py
> index 47bbc0f..9dcb70e 100644
> --- a/lib/options.py
> +++ b/lib/options.py
> @@ -209,6 +209,19 @@ class Options:
>                  help="""Update or create file with reference output (.result).
>                  Default: false.""")
>  
> +        parser.add_argument(
> +                "--snapshot",
> +                dest='snapshot_path',
> +                help="""Path to a Tarantool snapshot that will be
> +                loaded before testing.""")

(The cited patch is a bit old, now it is --bootstrap.)

Typo: boostrap -> bootstrap. Fixed in a fixup commit.

Omitting of the value is the same as passing `default=None`, but the
latter is more obvious. See, you use getattr(), because your guess was
that it works as `default=argparse.SUPPRESS` (see docs, [1]). And using
the explicit default would be more consistent with other arguments. If
omitting of `default` would work like argparse.SUPRESS, the code, which
uses `Options().args.snapshot_path` in server.py, would fail when the
argument would not be passed.

[1]: https://docs.python.org/2/library/argparse.html#default

I would be against of using `default=argparse.SUPPRESS`: this way we
would be obligated to use getattr() (and remember to use it) every time
we need to obtain the option. But there is no reason to introduce the
extra complexity.

I propose the following change:

 | commit 637e3f309b5c43b031f66440f12fabf02b1d26dc
 | Author: Alexander Turenko <alexander.turenko at tarantool.org>
 | Date:   Thu Nov 26 23:45:00 2020 +0300
 | 
 |     FIXUP: Add options for upgrade testing
 |     
 |     Explicitly set default=None for the new option. It does not change
 |     anything, but is consistent with other options and more obvious for a
 |     reader.
 |     
 |     Removed unnecessary getattr().
 | 
 | diff --git a/lib/options.py b/lib/options.py
 | index fc905cb..0476a36 100644
 | --- a/lib/options.py
 | +++ b/lib/options.py
 | @@ -212,6 +212,7 @@ class Options:
 |          parser.add_argument(
 |                  "--bootstrap",
 |                  dest='snapshot_path',
 | +                default=None,
 |                  help="""Path to snapshot that will be used to bootstrap
 |                  Tarantool before testing.""")
 |  
 | @@ -240,7 +241,7 @@ class Options:
 |                  check_error = True
 |                  break
 |  
 | -        snapshot_path = getattr(self.args, 'snapshot_path', '')
 | +        snapshot_path = self.args.snapshot_path
 |          if self.args.disable_schema_upgrade and not snapshot_path:
 |              color_stdout("\nOption --disable-schema-upgrade requires --bootstrap\n",
 |                           schema='error')

By the way, the option is named '--bootstrap', but the destination is
'snapshot_path'. The thought whether it may be confusing spins in my
mind, but I don't know. I left it unchanged.

> @@ -227,5 +240,23 @@ class Options:
>                  color_stdout(format_str.format(op1, op2), schema='error')
>                  check_error = True
>  
> +        snapshot_path = getattr(self.args, 'snapshot_path', '')
> +        if self.args.disable_schema_upgrade and not snapshot_path:
> +            color_stdout("\nOption --disable-schema-upgrade requires --snapshot\n",
> +                        schema='error')
> +            check_error = True

Typo: --boostrap -> --bootstrap. Fixed in a fixup commit.

> +
> +        if snapshot_path:
> +            if not os.path.exists(snapshot_path):
> +                color_stdout("\nPath {} not exists\n".format(snapshot_path), schema='error')
> +                check_error = True
> +            else:
> +                self.args.snapshot_path = os.path.abspath(snapshot_path)
> +

I like the idea to normalize the path before test-run calls chdir(). I
think other path options should do the same: I didn't try, but they may
be broken for the case, when test-run is called outside of the test/
directory.

But I found it being a bit counter-intuitive to normalize an option in a
'check()' method. If I look over the options, I may end up with the
guess that the destination variable contains a relative path. There are
other ways to normalize:

1. A custom 'action' add_argument() parameter.
2. A custom 'type' add_argument() parameter.
3. A separate normalize() Options method.

The first looks most intuitive on the first glance, but it requires to
define an Action class, not just a function. In fact, the second choice
is the simplest one. I propose to use it:

 | commit cda60788573dc4396c6bd0e6bf196593cb3d1485
 | Author: Alexander Turenko <alexander.turenko at tarantool.org>
 | Date:   Fri Nov 27 00:59:44 2020 +0300
 | 
 |     FIXUP: Add options for upgrade testing
 |     
 |     Move normalization of the --bootstrap option value into the option
 |     declaration block.
 |     
 |     The intention is to simplify the code.
 |     
 |     Corner case: `./test-run.py --bootstrap ../00000000000000000000.snap/`
 |     will be accepted now as if would we don't place the slash at the end. I
 |     would ignore the case for the sake of code simplicity.
 | 
 | diff --git a/lib/options.py b/lib/options.py
 | index 0476a36..c478030 100644
 | --- a/lib/options.py
 | +++ b/lib/options.py
 | @@ -213,6 +213,7 @@ class Options:
 |                  "--bootstrap",
 |                  dest='snapshot_path',
 |                  default=None,
 | +                type=os.path.abspath,
 |                  help="""Path to snapshot that will be used to bootstrap
 |                  Tarantool before testing.""")
 |  
 | @@ -247,12 +248,9 @@ class Options:
 |                           schema='error')
 |              check_error = True
 |  
 | -        if snapshot_path:
 | -            if not os.path.exists(snapshot_path):
 | -                color_stdout("\nPath {} not exists\n".format(snapshot_path), schema='error')
 | -                check_error = True
 | -            else:
 | -                self.args.snapshot_path = os.path.abspath(snapshot_path)
 | +        if snapshot_path and not os.path.exists(snapshot_path):
 | +            color_stdout("\nPath {} not exists\n".format(snapshot_path), schema='error')
 | +            check_error = True
 |  
 |          if check_error:
 |              exit(-1)

BTW, you may fix exit(-1) (which gives 255) to exit(1). I guess it was
the unintentional mistake. I left it unchanged (just my laziness,
sorry).

> @@ -84,6 +90,9 @@ class Server(object):
>          self.inspector_port = int(ini.get(
>              'inspector_port', self.DEFAULT_INSPECTOR
>          ))
> +        self.testdir = os.path.abspath(os.curdir)
> +        self.disable_schema_upgrade = Options().args.disable_schema_upgrade
> +        self.snapshot_path = Options().args.snapshot_path

I would use the option variables directly, it looks simpler for me (less
transitive def-use dependencies, so the dependency tree has lesser
height). However other options are also saved in *Server instances, so
okay.

> @@ -609,7 +613,6 @@ class TarantoolServer(Server):
>          }
>          ini.update(_ini)
>          Server.__init__(self, ini, test_suite)
> -        self.testdir = os.path.abspath(os.curdir)

We discussed it in the past review and you said that reverted it, but it
seems, you don't. I did it in a fixup commit.

> @@ -775,8 +778,29 @@ class TarantoolServer(Server):
>              shutil.copy(os.path.join(self.TEST_RUN_DIR, 'pretest_clean.lua'),
>                          self.vardir)
>  
> +        if self.snapshot_path:
> +            # Copy snapshot to the workdir.
> +            # Usually Tarantool looking for snapshots on start in a current directory
> +            # or in a directories that specified in memtx_dir or vinyl_dir box settings.
> +            # Before running test current directory (workdir) passed to a new instance in
> +            # an environment variable TEST_WORKDIR and then .tarantoolctl.in
> +            # adds to it instance_name and set to memtx_dir and vinyl_dir.

I replaced '.tarantoolctl.in' with 'tarantoolctl' in a fixup commit. It
looks as typo. Let me know, if I misunderstood something.

> @@ -861,6 +885,18 @@ class TarantoolServer(Server):
>          self.admin = CON_SWITCH[self.tests_type]('localhost', port)
>          self.status = 'started'
>  
> +        if not self.ctl_path:
> +            self.ctl_path = self.find_exe(self.builddir)

I remember, you said that the same if-condition is present in the
print_exe() classmethod. But this situation is different: we already use
self.ctl_path inside self.prepare_args() call above. If it is None,
False or [], Popen will fail with AttributeError or TypeError. So
despite it may have a sense in another place of the code, here it just
gives a reader the wrong idea that self.ctl_path may be None.

Removed in a fixup commit.

> +        if self.disable_schema_upgrade:
> +            version = extract_schema_from_snapshot(self.ctl_path,
> +                                                   self.builddir, self.snapshot_path)
> +            current_ver = yaml.safe_load(self.admin.execute(
> +                'box.space._schema:get{"version"}'))
> +            if version == current_ver:
> +                raise_msg = 'Versions are inequal ({} vs {})'.format(
> +                    version, current_ver)
> +                raise Exception(raise_msg)
> +

I made several fixups here:

 | commit 947df1416c0f3b6330ad8d257077e46b2d06d821
 | Author: Alexander Turenko <alexander.turenko at tarantool.org>
 | Date:   Thu Nov 26 19:42:11 2020 +0300
 | 
 |     FIXUP: Add options for upgrade testing
 |     
 |     Several problems are fixed within the hunk:
 |     
 |     1. Added a short explanation comment.
 |     2. Gave a bit more descriptive names to variables.
 |     3. Fixed actual version acquiring (added [0]).
 |     4. Fixed versions comparison (`==` -> `!=`).
 |     5. Print a bit more descriptive error message.
 |     6. Changed the exception to TarantoolStartError.
 |     
 |     The 6th point is dubious. I mean, `raise Exception(<...>)` should be
 |     changed to some more-or-less concrete exception class: RuntimeError or
 |     TarantoolStartError. Neither of them handled good by test-run: it prints
 |     a traceback and it looks like internal test-run error. But at least it
 |     kills the server, so it don't hang is the prcess table after the
 |     testing.
 |     
 |     I would leave it as is here and maybe investigate later if there will be
 |     a demand.
 | 
 | diff --git a/lib/tarantool_server.py b/lib/tarantool_server.py
 | index 5b47ff0..8de1a30 100644
 | --- a/lib/tarantool_server.py
 | +++ b/lib/tarantool_server.py
 | @@ -886,15 +886,18 @@ class TarantoolServer(Server):
 |          self.admin = CON_SWITCH[self.tests_type]('localhost', port)
 |          self.status = 'started'
 |  
 | +        # Verify that the schema actually was not upgraded.
 |          if self.disable_schema_upgrade:
 | -            version = extract_schema_from_snapshot(self.ctl_path,
 | -                                                   self.builddir, self.snapshot_path)
 | -            current_ver = yaml.safe_load(self.admin.execute(
 | -                'box.space._schema:get{"version"}'))
 | -            if version == current_ver:
 | -                raise_msg = 'Versions are inequal ({} vs {})'.format(
 | -                    version, current_ver)
 | -                raise Exception(raise_msg)
 | +            expected_version = extract_schema_from_snapshot(
 | +                self.ctl_path, self.builddir, self.snapshot_path)
 | +            actual_version = yaml.safe_load(self.admin.execute(
 | +                'box.space._schema:get{"version"}'))[0]
 | +            if expected_version != actual_version:
 | +                color_stdout('Schema version check fails: expected '
 | +                             '{}, got {}\n'.format(expected_version,
 | +                                                   actual_version),
 | +                             schema='error')
 | +                raise TarantoolStartError(self.name)
 |  
 |      def crash_detect(self):
 |          if self.crash_expected:

----

The patches about using a bootstrap snapshot (and simplification of
extract_schema_from_snapshot()). To be removed (maybe except
extract_schema_from_snapshot()) after #5540.

commit a34c28ea61c9a56b8ed2be2f045ebea8f94d0f8a
Author: Alexander Turenko <alexander.turenko at tarantool.org>
Date:   Fri Nov 27 02:17:43 2020 +0300

    FIXUP: Add options for upgrade testing
    
    Add xlog utility functions. Will be used in the following fixup commits.

diff --git a/lib/xlog.py b/lib/xlog.py
new file mode 100644
index 0000000..dbaa4dc
--- /dev/null
+++ b/lib/xlog.py
@@ -0,0 +1,232 @@
+"""Xlog and snapshot utility functions."""
+
+import os
+import msgpack
+import subprocess
+import json
+from uuid import uuid4
+
+
+__all__ = ['init', 'snapshot_is_for_bootstrap', 'prepare_bootstrap_snapshot']
+
+
+# {{{ Constants
+
+# Xlog binary format constants.
+ROW_MARKER = b'\xd5\xba\x0b\xab'
+EOF_MARKER = b'\xd5\x10\xad\xed'
+XLOG_FIXHEADER_SIZE = 19
+VCLOCK_MAX = 32
+VCLOCK_STR_LEN_MAX = 1 + VCLOCK_MAX * (2 + 2 + 20 + 2) + 1
+XLOG_META_LEN_MAX = 1024 + VCLOCK_STR_LEN_MAX
+
+
+# The binary protocol (iproto) constants.
+#
+# Widely reused for xlog / snapshot files.
+IPROTO_REQUEST_TYPE = 0x00
+IPROTO_LSN = 0x03
+IPROTO_TIMESTAMP = 0x04
+IPROTO_INSERT = 2
+IPROTO_SPACE_ID = 0x10
+IPROTO_TUPLE = 0x21
+
+
+# System space IDs.
+BOX_SCHEMA_ID = 272
+BOX_CLUSTER_ID = 320
+
+# }}} Constants
+
+
+tarantool_cmd = 'tarantool'
+tarantoolctl_cmd = 'tarantoolctl'
+debug_f = lambda x: None  # noqa: E731
+
+
+def init(tarantool=None, tarantoolctl=None, debug=None):
+    """ Redefine module level globals.
+
+        If the function is not called, tarantool and tarantoolctl
+        will be called according to the PATH environment variable.
+
+        Beware: tarantool and tarantoolctl are lists. Example:
+
+        tarantool_cmd = ['/path/to/tarantool']
+        tarantoolctl_cmd = tarantool_cmd + ['/path/to/tarantoolctl']
+        xlog.init(tarantool=tarantool_cmd, tarantoolctl=tarantoolctl_cmd)
+    """
+    global tarantool_cmd
+    global tarantoolctl_cmd
+    global debug_f
+
+    if tarantool:
+        assert isinstance(tarantool, list)
+        tarantool_cmd = tarantool
+    if tarantool_cmd:
+        assert isinstance(tarantoolctl, list)
+        tarantoolctl_cmd = tarantoolctl
+    if debug:
+        debug_f = debug
+
+
+# {{{ General purpose helpers
+
+def crc32c(data):
+    """ Use tarantool's implementation of CRC32C algorithm.
+
+        Python has no built-in implementation of CRC32C.
+    """
+    lua = "print(require('digest').crc32_update(0, io.stdin:read({})))".format(
+        len(data))
+    with open(os.devnull, 'w') as devnull:
+        process = subprocess.Popen(tarantool_cmd + ['-e', lua],
+                                   stdin=subprocess.PIPE,
+                                   stdout=subprocess.PIPE,
+                                   stderr=devnull)
+    process.stdin.write(data)
+    res, _ = process.communicate()
+    return int(res)
+
+# }}} General purpose helpers
+
+
+# {{{ parse xlog / snapshot
+
+def xlog_rows(xlog_path):
+    cmd = tarantoolctl_cmd + ['cat', xlog_path, '--format=json',
+                              '--show-system']
+    with open(os.devnull, 'w') as devnull:
+        process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=devnull)
+    for line in process.stdout.readlines():
+        yield json.loads(line)
+
+# }}} parse xlog / snapshot
+
+
+# {{{ xlog encode data helpers
+
+def encode_xrow_header(xrow):
+    packer = msgpack.Packer(use_bin_type=False)
+    xrow_header = ROW_MARKER
+    # xrow size
+    xrow_header += packer.pack(len(xrow))
+    # previous xrow crc32
+    xrow_header += packer.pack(0)
+    # current xrow crc32
+    xrow_header += packer.pack(crc32c(xrow))
+    # padding
+    padding_size = XLOG_FIXHEADER_SIZE - len(xrow_header)
+    xrow_header += packer.pack(b'\x00' * (padding_size - 1))
+    assert(len(xrow_header) == XLOG_FIXHEADER_SIZE)
+    return xrow_header
+
+
+def encode_xrow(header, body):
+    packer = msgpack.Packer(use_bin_type=False)
+    header = packer.pack(header)
+    body = packer.pack(body)
+    xrow_data = header + body
+    return encode_xrow_header(xrow_data) + xrow_data
+
+# }}} xlog encode data helpers
+
+
+# {{{ xlog write data helpers
+
+def xlog_seek_end(xlog):
+    """Set the file position right before the end marker."""
+    WHENCE_END = 2
+    xlog.seek(-4, WHENCE_END)
+    eof_marker = xlog.read(4)
+    if eof_marker != EOF_MARKER:
+        raise RuntimeError('Invalid eof marker: {}'.format(eof_marker))
+    xlog.seek(-4, WHENCE_END)
+
+
+def xlog_write_eof(xlog):
+    xlog.write(EOF_MARKER)
+
+# }}} xlog write data helpers
+
+
+# {{{ xlog write meta helpers
+
+def xlog_meta_write_instance_uuid(xlog, instance_uuid):
+    xlog.seek(0)
+    xlog.seek(xlog.read(XLOG_META_LEN_MAX).find(b'Instance: '))
+    # Trick: old and new UUID have the same size.
+    xlog.write(b'Instance: ' + instance_uuid)
+
+# }}} xlog write meta helpers
+
+
+def snapshot_is_for_bootstrap(snapshot_path):
+    """ A bootstrap snapshot (src/box/bootstrap.snap) has no
+        <cluster_uuid> and <instance_uuid> in _schema and
+        _cluster system spaces.
+    """
+    cluster_uuid_exists = False
+    instance_uuid_exists = False
+
+    for row in xlog_rows(snapshot_path):
+        if row['HEADER']['type'] == 'INSERT' and         \
+           row['BODY']['space_id'] == BOX_SCHEMA_ID and \
+           row['BODY']['tuple'][0] == 'cluster':
+            cluster_uuid_exists = True
+
+        if row['HEADER']['type'] == 'INSERT' and          \
+           row['BODY']['space_id'] == BOX_CLUSTER_ID and \
+           row['BODY']['tuple'][0] == 1:
+            instance_uuid_exists = True
+
+        if cluster_uuid_exists and instance_uuid_exists:
+            break
+
+    if cluster_uuid_exists != instance_uuid_exists:
+        raise RuntimeError('A cluster UUID and an instance UUID should exist '
+                           'or not exist both')
+
+    return not cluster_uuid_exists
+
+
+def prepare_bootstrap_snapshot(snapshot_path):
+    """ Prepare a bootstrap snapshot to use with local recovery."""
+    cluster_uuid = str(uuid4()).encode('ascii')
+    debug_f('Cluster UUID: {}'.format(cluster_uuid))
+    instance_uuid = str(uuid4()).encode('ascii')
+    instance_id = 1
+    debug_f('Instance ID: {}'.format(instance_id))
+    debug_f('Instance UUID: {}'.format(instance_uuid))
+
+    last_row = list(xlog_rows(snapshot_path))[-1]
+    lsn = int(last_row['HEADER']['lsn'])
+    timestamp = float(last_row['HEADER']['timestamp'])
+
+    with open(snapshot_path, 'rb+') as xlog:
+        xlog_meta_write_instance_uuid(xlog, instance_uuid)
+        xlog_seek_end(xlog)
+
+        # Write cluster UUID to _schema.
+        lsn += 1
+        xlog.write(encode_xrow({
+            IPROTO_REQUEST_TYPE: IPROTO_INSERT,
+            IPROTO_LSN: lsn,
+            IPROTO_TIMESTAMP: timestamp,
+        }, {
+            IPROTO_SPACE_ID: BOX_SCHEMA_ID,
+            IPROTO_TUPLE: ['cluster', cluster_uuid],
+        }))
+
+        # Write replica ID and replica UUID to _cluster.
+        lsn += 1
+        xlog.write(encode_xrow({
+            IPROTO_REQUEST_TYPE: IPROTO_INSERT,
+            IPROTO_LSN: lsn,
+            IPROTO_TIMESTAMP: timestamp,
+        }, {
+            IPROTO_SPACE_ID: BOX_CLUSTER_ID,
+            IPROTO_TUPLE: [1, instance_uuid],
+        }))
+
+        xlog_write_eof(xlog)

commit 4e860a6a405d08aebc23a233783c043027aaa1f5
Author: Alexander Turenko <alexander.turenko at tarantool.org>
Date:   Fri Nov 27 02:52:47 2020 +0300

    FIXUP: Add options for upgrade testing
    
    Move the module search path tweaking into the upper level. The next
    commit will use the xlog module, which requires msgpack. So we need to
    add the necessary search paths before importing the module.

diff --git a/lib/__init__.py b/lib/__init__.py
index 767f0d6..dccb76b 100644
--- a/lib/__init__.py
+++ b/lib/__init__.py
@@ -2,10 +2,14 @@ import os
 import sys
 import shutil
 
-from lib.options import Options
-from lib.tarantool_server import TarantoolServer
-from lib.unittest_server import UnittestServer
-from utils import warn_unix_sockets_at_start
+# monkey patch tarantool and msgpack
+from lib.utils import check_libs
+check_libs()
+
+from lib.options import Options                   # noqa: E402
+from lib.tarantool_server import TarantoolServer  # noqa: E402
+from lib.unittest_server import UnittestServer    # noqa: E402
+from utils import warn_unix_sockets_at_start      # noqa: E402
 
 
 __all__ = ['Options']
diff --git a/lib/box_connection.py b/lib/box_connection.py
index 6eb5121..12a5abd 100644
--- a/lib/box_connection.py
+++ b/lib/box_connection.py
@@ -26,13 +26,8 @@ import ctypes
 import socket
 
 from tarantool_connection import TarantoolConnection
-
-# monkey patch tarantool and msgpack
-from lib.utils import check_libs
-check_libs()
-
-from tarantool import Connection as tnt_connection  # noqa: E402
-from tarantool import Schema                        # noqa: E402
+from tarantool import Connection as tnt_connection
+from tarantool import Schema
 
 
 SEPARATOR = '\n'

commit 4c93e1a105737c74f77035aa638100b09c2367ce
Author: Alexander Turenko <alexander.turenko at tarantool.org>
Date:   Fri Nov 27 02:58:16 2020 +0300

    FIXUP: Add options for upgrade testing
    
    Accept only bootstrap snapshots, but prepare them to use with local
    recovery. We have no other way to feed an external bootstrap snapshot to
    tarantool until [1] will be implemented.
    
    [1]: https://github.com/tarantool/tarantool/issues/5540

diff --git a/lib/__init__.py b/lib/__init__.py
index dccb76b..c8e8626 100644
--- a/lib/__init__.py
+++ b/lib/__init__.py
@@ -10,6 +10,8 @@ from lib.options import Options                   # noqa: E402
 from lib.tarantool_server import TarantoolServer  # noqa: E402
 from lib.unittest_server import UnittestServer    # noqa: E402
 from utils import warn_unix_sockets_at_start      # noqa: E402
+from lib.colorer import color_log                 # noqa: E402
+import xlog                                       # noqa: E402
 
 
 __all__ = ['Options']
@@ -60,6 +62,14 @@ def module_init():
     TarantoolServer.find_exe(args.builddir)
     UnittestServer.find_exe(args.builddir)
 
+    # Initialize xlog module with found tarantool / tarantoolctl.
+    # Set color_log() as the function for write debug logs.
+    tarantool_cmd = [TarantoolServer.binary]
+    tarantoolctl_cmd = tarantool_cmd + [TarantoolServer.ctl_path]
+    xlog.init(tarantool=tarantool_cmd, tarantoolctl=tarantoolctl_cmd,
+              debug=lambda x: color_log(' | ' + x + '\n'))
+
+    Options().check_bootstrap_option()
     Options().check_schema_upgrade_option(TarantoolServer.debug)
 
 
diff --git a/lib/app_server.py b/lib/app_server.py
index a046b21..f2916f0 100644
--- a/lib/app_server.py
+++ b/lib/app_server.py
@@ -17,6 +17,7 @@ from lib.utils import find_port
 from lib.utils import format_process
 from lib.utils import warn_unix_socket
 from test import TestRunGreenlet, TestExecutionError
+from xlog import prepare_bootstrap_snapshot
 
 
 def run_server(execs, cwd, server, logfile, retval):
@@ -51,6 +52,9 @@ class AppTest(Test):
             color_log("Copying snapshot {} to {}\n".format(
                 server.snapshot_path, snapshot_dest))
             shutil.copy(server.snapshot_path, snapshot_dest)
+            color_log('Preparing snapshot {} for local recovery...\n'.format(
+                      snapshot_dest))
+            prepare_bootstrap_snapshot(snapshot_dest)
 
         try:
             tarantool.start()
diff --git a/lib/options.py b/lib/options.py
index c478030..324f59d 100644
--- a/lib/options.py
+++ b/lib/options.py
@@ -5,6 +5,7 @@ from itertools import product
 from lib.singleton import Singleton
 
 from lib.colorer import color_stdout
+from xlog import snapshot_is_for_bootstrap
 
 
 def env_int(name, default):
@@ -255,6 +256,12 @@ class Options:
         if check_error:
             exit(-1)
 
+    def check_bootstrap_option(self):
+        if not snapshot_is_for_bootstrap(self.args.snapshot_path):
+            color_stdout('Expected a boostrap snapshot, one for local recovery '
+                         'is given\n', schema='error')
+            exit(1)
+
     def check_schema_upgrade_option(self, is_debug):
         if self.args.disable_schema_upgrade and not is_debug:
             color_stdout("Can't disable schema upgrade on release build\n", schema='error')
diff --git a/lib/tarantool_server.py b/lib/tarantool_server.py
index 4121a4d..78994c6 100644
--- a/lib/tarantool_server.py
+++ b/lib/tarantool_server.py
@@ -36,6 +36,7 @@ from lib.utils import signame
 from lib.utils import warn_unix_socket
 from lib.utils import prefix_each_line
 from test import TestRunGreenlet, TestExecutionError
+from xlog import prepare_bootstrap_snapshot
 
 
 def save_join(green_obj, timeout=None):
@@ -791,6 +792,9 @@ class TarantoolServer(Server):
             color_log("Copying snapshot {} to {}\n".format(
                 self.snapshot_path, snapshot_dest))
             shutil.copy(self.snapshot_path, snapshot_dest)
+            color_log('Preparing snapshot {} for local recovery...\n'.format(
+                      snapshot_dest))
+            prepare_bootstrap_snapshot(snapshot_dest)
 
     def prepare_args(self, args=[]):
         cli_args = [self.ctl_path, 'start',

commit 2fe384b4287033695f43eb695e776e3ecf8ae968
Author: Alexander Turenko <alexander.turenko at tarantool.org>
Date:   Fri Nov 27 03:49:18 2020 +0300

    FIXUP: Add options for upgrade testing
    
    Use xlog.xlog_rows() helper in the extract_schema_from_snapshot()
    function.
    
    Several other changes:
    
    1. Moved to the xlog module, since we have it separate from other utils.
    2. Prettify the example of json formatted xrow.
    3. Get rid of one-time-use variables. It looks more readable for me
       here.
    4. Expect only 'INSERT' queries, because the source is a snapshot. The
       function should be changed anyway to extract a last version from an
       xlog file: traverse till end, handle 'UPDATE'.

diff --git a/lib/tarantool_server.py b/lib/tarantool_server.py
index 78994c6..d954032 100644
--- a/lib/tarantool_server.py
+++ b/lib/tarantool_server.py
@@ -29,7 +29,6 @@ from lib.server import Server
 from lib.server import DEFAULT_SNAPSHOT_NAME
 from lib.test import Test
 from lib.utils import find_port
-from lib.utils import extract_schema_from_snapshot
 from lib.utils import format_process
 from lib.utils import safe_makedirs
 from lib.utils import signame
@@ -37,6 +36,7 @@ from lib.utils import warn_unix_socket
 from lib.utils import prefix_each_line
 from test import TestRunGreenlet, TestExecutionError
 from xlog import prepare_bootstrap_snapshot
+from xlog import extract_schema_from_snapshot
 
 
 def save_join(green_obj, timeout=None):
@@ -890,8 +890,7 @@ class TarantoolServer(Server):
 
         # Verify that the schema actually was not upgraded.
         if self.disable_schema_upgrade:
-            expected_version = extract_schema_from_snapshot(
-                self.ctl_path, self.builddir, self.snapshot_path)
+            expected_version = extract_schema_from_snapshot(self.snapshot_path)
             actual_version = yaml.safe_load(self.admin.execute(
                 'box.space._schema:get{"version"}'))[0]
             if expected_version != actual_version:
diff --git a/lib/utils.py b/lib/utils.py
index e0bd394..cc2f6b7 100644
--- a/lib/utils.py
+++ b/lib/utils.py
@@ -3,10 +3,8 @@ import sys
 import six
 import collections
 import signal
-import subprocess
 import random
 import fcntl
-import json
 import difflib
 import time
 from gevent import socket
@@ -266,34 +264,3 @@ def just_and_trim(src, width):
     if len(src) > width:
         return src[:width - 1] + '>'
     return src.ljust(width)
-
-
-def extract_schema_from_snapshot(ctl_path, builddir, snapshot_path):
-    """
-    Extract schema version from snapshot, example of record:
-     {"HEADER":{"lsn":2,"type":"INSERT","timestamp":1584694286.0031},
-       "BODY":{"space_id":272,"tuple":["version",2,3,1]}}
-
-    :returns: [u'version', 2, 3, 1]
-    """
-    SCHEMA_SPACE_ID = 272
-
-    ctl_args = [ctl_path, 'cat', snapshot_path,
-                '--format=json', '--show-system']
-    with open(os.devnull, 'w') as devnull:
-        proc = subprocess.Popen(ctl_args, stdout=subprocess.PIPE, stderr=devnull)
-    version = None
-    while True:
-        line = proc.stdout.readline()
-        if not line:
-            break
-        json_line = json.loads(line.strip())
-        query_type = json_line["HEADER"]["type"]
-        space_id = json_line["BODY"]["space_id"]
-        if query_type == "INSERT" or query_type == "REPLACE" and space_id == SCHEMA_SPACE_ID:
-            query_tuple = json_line['BODY']['tuple']
-            if query_tuple[0] == 'version':
-                version = query_tuple
-                break
-
-    return version
diff --git a/lib/xlog.py b/lib/xlog.py
index dbaa4dc..0105dfc 100644
--- a/lib/xlog.py
+++ b/lib/xlog.py
@@ -7,7 +7,8 @@ import json
 from uuid import uuid4
 
 
-__all__ = ['init', 'snapshot_is_for_bootstrap', 'prepare_bootstrap_snapshot']
+__all__ = ['init', 'snapshot_is_for_bootstrap', 'prepare_bootstrap_snapshot',
+           'extract_schema_from_snapshot']
 
 
 # {{{ Constants
@@ -230,3 +231,25 @@ def prepare_bootstrap_snapshot(snapshot_path):
         }))
 
         xlog_write_eof(xlog)
+
+
+def extract_schema_from_snapshot(snapshot_path):
+    """
+    Extract schema version from snapshot.
+
+    Example of record:
+
+     {
+       "HEADER": {"lsn":2, "type": "INSERT", "timestamp": 1584694286.0031},
+       "BODY": {"space_id": 272, "tuple": ["version", 2, 3, 1]}
+     }
+
+    :returns: [u'version', 2, 3, 1]
+    """
+    for row in xlog_rows(snapshot_path):
+        if row['HEADER']['type'] == 'INSERT' and \
+           row['BODY']['space_id'] == BOX_SCHEMA_ID:
+            res = row['BODY']['tuple']
+            if res[0] == 'version':
+                return res
+    return None


More information about the Tarantool-patches mailing list