[Tarantool-patches] [PATCH] net.box: fix fetching of schema of an old version

Sun Apr 5 19:41:29 MSK 2020

On 05/04/2020 14:05, Alexander Turenko wrote:
> On Sat, Apr 04, 2020 at 07:40:44PM +0200, Vladislav Shpilevoy wrote:
>>>>> https://github.com/tarantool/tarantool/issues/4691
>>>>> https://github.com/tarantool/tarantool/tree/Totktonada/gh-4691-net-box-connect-schema-2-1-3
>>>>>
>>>>>  src/box/lua/net_box.lua                       |   9 ++
>>>>>  ...4691-net-box-connect-schema-2-1-3.test.lua |  93 ++++++++++++++++++
>>>>>  test/box-tap/no_auto_schema_upgrade.lua       |  30 ++++++
>>>>>  .../snap/2.1.3/00000000000000000000.snap      | Bin 0 -> 4466 bytes
>>>>>  4 files changed, 132 insertions(+)
>>>>>  create mode 100755 test/box-tap/gh-4691-net-box-connect-schema-2-1-3.test.lua
>>>>
>>>> We have xlog/upgrade suite specifically for these tests. With
>>>> already designed processes, where and how to store old snaps, etc.
>>>> See xlog/upgrade/how_to_add_new_test.md.
>>>
>>>
>>> I want to test the case when an instance is NOT upgraded. So 'upgrade'
>>> directory name for snaps would be misleading.
>>
>> The test and the bug are still upgrade related. 'xlog/upgrade' is not
>> called 'xlog/upgrade_called'. It is called just 'upgrade'. For all cases
>> when an old snapshot is used to start a new tarantool.
> 
> I disagree here. 'upgrade' is not something about working upward an old
> snapshot.

It is. Because you won't work on the old snapshot forever. You are
going to upgrade anyway. Your bug is for when upgrade is started but
not finished. Because Tarantools are new, but upgrade() is not called
yet.

>>> I also think that we should move from using compare-against-output
>>> tests.
>>
>> This is not about tap vs diff. It is about having all upgrade related
>> tests in one place, and organized in one way, with how to fill them, with
>> how to store xlogs and snapshots in the repository.
> 
> But I still want to test it using TAP13 test. test/xlog is the test
> suite that is fed to the console.

Well, I also want many things different, but also I don't want to break
what works, and I want to be consistent. Currently this patch basically
proposes to throw away the just started to work xlog/upgrade suite, which
was disabled for a long time. And it makes upgrade tests inconsistent.

>>> We can move 'snap' directory upward (right into test/) and use it for
>>> snapshots for all test suites. Or keep them per-suite. However see the
>>> next paragraph re fill.lua.
>>
>> This is a bad idea, because different upgrade-related tests usually
>> need some specific data stored in the old snapshot. If you merge data,
>> needed by many tests, into one snapshot, you will violate the policy of
>> one-bug-one-test. Because many tests will start to depend on each other.
> 
> Ouch, okay, fill.lua is called on the old tarantool version (on which
> you generate a snapshot). It is okay for my case to disable autoupgrade
> and insert the data using the new tarantool version. I guess it is okay
> too for most of cases. Your case with upgrade of a sequence can be
> tested this way too.

No, it can't. Unless you tried and actually managed to reproduce this.
Because I just tried your way. I made checkout to 2.1, took an empty
snapshot (00000000000000000000.snap). Took 2.2 before the sequence fix
(that commit: 13de917a0e3b8affb693d041663c174d9ec7fcc9). Then I started
Tarantool in read-only mode:

    box.cfg{read_only = true}

Then I check the upgrade is not called:

    tarantool> box.space._schema:select{}
    ---
    - - ['cluster', '9990ec84-1f65-4452-aa39-3c06c1aac709']
      - ['max_id', 511]
      - ['version', 2, 1, 3]
    ...

    tarantool> _TARANTOOL
    ---
    - 2.2.2-40-g13de917a0
    ...

Then I made it read_only false to be able to create a sequence:

    tarantool> box.cfg{read_only = false}
    2020-04-05 18:14:27.256 [23941] main/102/interactive I> set 'read_only' configuration option to false
    ---
    ...

Then I created a space with sequence, like described in the ticket:
https://github.com/tarantool/tarantool/issues/4771

    tarantool> box.schema.create_space('s')
    ---
    - engine: memtx
      ... -- snipped
    - created
    ...

    tarantool> box.space.s:create_index('pk', {parts = {{1, 'integer'}}, sequence = true})
    ---
    - parts:
      ... -- snipped
      name: pk
    ...

Then I called upgrade:

    tarantool> box.schema.upgrade()
    2020-04-05 18:15:54.580 [23941] main/102/interactive I> add sequence field to space _space_sequence
    2020-04-05 18:15:54.581 [23941] main/102/interactive I> create space _ck_constraint
    2020-04-05 18:15:54.581 [23941] main/102/interactive I> create index primary on _ck_constraint
    2020-04-05 18:15:54.582 [23941] main/102/interactive I> create view _vcollation...
    2020-04-05 18:15:54.583 [23941] main/102/interactive I> create index primary on _vcollation
    2020-04-05 18:15:54.583 [23941] main/102/interactive I> create index name on _vcollation
    2020-04-05 18:15:54.583 [23941] main/102/interactive I> grant read access to 'public' role for _vcollation view
    2020-04-05 18:15:54.584 [23941] main/102/interactive I> Update _func format
    2020-04-05 18:15:54.602 [23941] main/102/interactive I> Create _func_index space
    2020-04-05 18:15:54.603 [23941] main/102/interactive I> set schema version to 2.2.1
    ---
    ...

So even though in your particular case it is ok, it is very
artificial, and can't be applied to all tests.

> Inserting necessary data right from a test seems to be more sane
> approach: you don't need to jump over several files to look at the test
> case at whole: its data and code.

It may look simpler (even though for me, honestly, the test in the
patch does not look simple). But it does not mean anything it is right.
As I said, I consider that test too artificial.

> This way also splits snapshot from test data. The snapshot contains just
> what appears after box.cfg() + box.snapshot() on an old tarantool
> version. No surprises.

For having test data we use fill.lua. I don't see any surprises here.
What you put into fill.lua, you will see after bootstrap.

>>> I prefer to feed initial schema and data right from a test to keep them
>>> near to actual usage in test cases (within one file).
>>
>> This makes impossible to test bugs, which appear only when you bootstrap
>> from the old snapshot.
> 
> Any example? Your case with a sequence upgrade can be tested with
> inserting data using the new tarantool version.

As I already showed above, it can't. At least without manual change
of _sequence/_space_sequence/_sequence_data (but this I didn't check),
which would make the test artificial, not real.

So if you want examples, first is this:
https://github.com/tarantool/tarantool/issues/4771

Second is this:
https://github.com/tarantool/tarantool/issues/2950

The problem was that space formats started being validated, and you
couldn't insert bad data even if you didn't call upgrade yet. Because
alter.cc already parsed existing formats, and turned on format validation.
Idea is to insert bad data on the old version, and bootstrap a new
version on that. From what I remember, fix was to add a check in
alter.cc that wouldn't turn on validation in case you have an old record
in _schema.

> We just need to start tarantool of the new version on the old snapshot
> in the read-only mode to disable autoupgrading, enable read-write,
> insert necessary data. Then we can either test the instance on the old
> schema or call upgrade and test it after upgrade.

This does not work, in case your 'necessary data' will be inserted
or validated differently due to alter.cc or any other internal changes.

>>> Sometimes it eases
>>> reading of test cases, especially when you're not much familiar with a
>>> testing system.
>>
>> You are now creating a second way of testing upgrade. This is definitely
>> not a simplification of the testing system.
> 
> Nope. Again, I don't test upgrade here.
> 
> But, yep, I want to enforce several properties and so I'm unable to use
> the way you proposed to run tarantool on an old schema:
> 
> * A test is the script with TAP13 output, so it should be inside
>   a corresponding test suite.

Tap or diff has nothing to do with that. It is not a blocker.

> * A test inserts all needed data right from the test file. The instance
>   file just calls box.cfg() and give guest user accesses. It is highly
>   reusable so.

This is also not a blocker to write a test on this issue.

On the summary, all 'blockers' are subjective, and don't really
block anything. It is the same as I would refuse to patch luajit code,
because I personally prefer Tarantool's code style.

>>> I don't much like the approach with fill.lua script (at
>>> least for data corresponds to test cases).
>>
>> fill.lua is inevitable, when you need to test bootstrap from an old
>> snapshot having some data in it before you start tarantool, and you want
>> to know what data is stored here exactly.
> 
> I would call for examples.

See two examples above.

> But anyway, we can keep both ways and use fill.lua when it is not
> possible to feed initial data for a test using the new tarantool
> version.

Okey, you can omit fill.lua, but I want upgrade-related tests in
one place. Or organized in one way in maximum two places, if you are
so desperate to use tap.