[Tarantool-patches] [PATCH] Add options for upgrade testing

Alexander Turenko alexander.turenko at tarantool.org
Wed Dec 2 06:40:03 MSK 2020


> > +def snapshot_is_for_bootstrap(snapshot_path):
> > +    """ A bootstrap snapshot (src/box/bootstrap.snap) has no
> > +        <cluster_uuid> and <instance_uuid> in _schema and
> > +        _cluster system spaces.
> > +    """
> > +    cluster_uuid_exists = False
> > +    instance_uuid_exists = False
> > +
> > +    for row in xlog_rows(snapshot_path):
> > +        if row['HEADER']['type'] == 'INSERT' and         \
> > +           row['BODY']['space_id'] == BOX_SCHEMA_ID and \
> > +           row['BODY']['tuple'][0] == 'cluster':
> > +            cluster_uuid_exists = True
> > +
> > +        if row['HEADER']['type'] == 'INSERT' and          \
> > +           row['BODY']['space_id'] == BOX_CLUSTER_ID and \
> > +           row['BODY']['tuple'][0] == 1:
> > +            instance_uuid_exists = True
> > +
> > +        if cluster_uuid_exists and instance_uuid_exists:
> > +            break
> > +
> > +    if cluster_uuid_exists != instance_uuid_exists:
> > +        raise RuntimeError('A cluster UUID and an instance UUID should exist '
> > +                           'or not exist both')
> > +
> > +    return not cluster_uuid_exists
> > +
> 
> snapshot_is_for_bootstrap() accept a path that can be None
> 
> (see default value in argument --bootstrap) when option --bootstrap is not
> specified.
> 
> Due to this xlog_rows() may fail. xlog_rows() returns iterator and we can
> return something like
> 
> "yield None" but we should handle this properly everywhere where xlog_rows()
> called.
> 
> So I just added patch to snapshot_is_for_bootstrap():
> 
> --- a/lib/options.py
> +++ b/lib/options.py
> @@ -257,7 +257,7 @@ class Options:
>              exit(-1)
> 
>      def check_bootstrap_option(self):
> -        if not snapshot_is_for_bootstrap(self.args.snapshot_path):
> +        if self.args.snapshot_path and not
> snapshot_is_for_bootstrap(self.args.snapshot_path):
>              color_stdout('Expected a boostrap snapshot, one for local
> recovery '
>                           'is given\n', schema='error')
>              exit(1)

My bad. Thanks for catching it up.

The fix you pasted here looks better for me than one that is on the
branch (which returns True from snapshot_is_for_bootstrap() when a
snapshot path is None).

Applied to the squashed commits on the branch.

----

We had another short discussion with Mons (after his discussion with
Vlad) regarding possible ways to test an instance with an old schema or
an upgraded schema. He highlighted that this is quite far from users'
scenarious.

This convenced me that it would be dubious decision to assume that we'll
actually implement [1] for testing purposes and so name the option as
--bootstrap and restrict accepted snapshots only to ones that are
prepared for bootstrap (as opposite to local recovery). Sorry for those
flips back and forth, but now I think it would be more wise to implement
the simplest feature (--snapshot option, as in your original patch) and
postpone further activities until we'll have at least rough data from
test fails analysis (that is part of [2]).

So, I prepared your branch to push to master with the --snapshot option
without all my patches around mangling bootstrap snapshots. In
particular, I extracted a couple of preparatory patches and apply all
proposed fixups. The result looks so:

 | $ git log --reverse --oneline -4 ligurio/gh-4801-add-snapshot-option | cat
 | dfb6d66 refactoring: make 'debug' property class level
 | 8e80565 Allow to extract a schema version from a snapshot
 | f93af28 Add options for upgrade testing
 | 7d73f2f Update code with checks of conflicting arguments

I'm okay with the state on the branch [3] (it is updated as shown above)
and I ready to push it if you don't have objections against the result
of squashing. I included my variant of extract_schema_from_snapshot()
(but it leans on PATH, it is enough for usages from TarantoolServer).

I performed the following tests using tarantool 2.7.0-73-gb53cb2aec
repository (debug build).

0. make lint
   No errors / warnings. Okay.

1. An app test.
    | (cd test && ./test-run.py box-tap/merger.test.lua)
   Passed. Okay.

2. An app test. Snapshot from 1.10.
    | (cd test && ./test-run.py                              \
    |     --snapshot ../../../1.10/00000000000000000000.snap \
    |     box-tap/merger.test.lua)
   Passed. Upgrade entries are present in the log. Okay.

3. An app test. Snapshot from 1.10. No schema upgrade.
    | (cd test && ./test-run.py                              \
    |     --snapshot ../../../1.10/00000000000000000000.snap \
    |     box-tap/merger.test.lua --disable-schema-upgrade)
   Passed. No upgrade entries in the log. Okay.

4. A tarantool test.
    | (cd test && ./test-run.py box/net.box_timeout_gh-1533.test.lua)
   Passed. Okay.

5. A tarantool test. Snapshot from 1.10.
    | (cd test && ./test-run.py                              \
    |     --snapshot ../../../1.10/00000000000000000000.snap \
    |     box/net.box_timeout_gh-1533.test.lua)
   Passed. Upgrade entries are present in the log. Okay.

6. A tarantool test. Snapshot from 1.10. No schema upgrade.
    | (cd test && ./test-run.py                              \
    |     --snapshot ../../../1.10/00000000000000000000.snap \
    |     box/net.box_timeout_gh-1533.test.lua --disable-schema-upgrade)
   Passed. No upgrade entries in the log. Okay.

7. --disable-schema-upgrade without --snapshot.
    | (cd test && ./test-run.py --disable-schema-upgrade)
   Gives an error. Okay.

8. Conflicting options.
    | (cd test && ./test-run.py --gdb --valgrind)
   Gives an error. Okay.

9. Run the whole test suite.
    | (cd test && ./test-run.py --long --force)
   Passed except replication/election_qsync_stress.test.lua, which is
   known as flaky on this revision. Okay.

Rebuilt the same tarantool revision as RelWithDebInfo and performed the
following test:

10. Attempt to use --disable-schema-upgrade on a RelWithDebInfo build.
     | (cd test && ./test-run.py                              \
     |     --snapshot ../../../1.10/00000000000000000000.snap \
     |     box/net.box_timeout_gh-1533.test.lua --disable-schema-upgrade)
    Gives an error. Okay.

I rewrote my 'snapshot mangling' patches to introduce the new
--bootstrap option and pushed them to the branch [4]. I propose to don't
push them to master at least until we'll found them useful.

[1]: https://github.com/tarantool/tarantool/issues/5540
[2]: https://github.com/tarantool/tarantool/issues/4801
[3]: ligurio/gh-4801-add-snapshot-option
[4]: Totktonada/gh-4801-add-bootstrap-option


More information about the Tarantool-patches mailing list