[Tarantool-patches] [PATCH] box: allow upgrading from version 1.6

Sergey Ostanevich sergos at tarantool.org
Tue Aug 10 17:43:48 MSK 2021


Hi!

Thanks for the patch! See my 7 comments below

Sergos

1. I don’t see any doc references to 
https://www.tarantool.io/en/doc/latest/book/admin/upgrades/#admin-upgrades
where obviously some changes are needed: “Meanwhile Tarantool 2.x may have 
incompatible changes when migrating from Tarantool 1.6. to 2.x directly."


> On 10 Aug 2021, at 14:28, Serge Petrenko <sergepetrenko at tarantool.org> wrote:
> 
> Direct upgrade support from pre-1.7.5 versions was removed in commit
> 7d3b80e78206c479cab75b4893629cfa1932252e
> (Forbid upgrade from Tarantool < 1.7.5 and refactor upgrade.lua)
> The reason for that was the mandatory space format checks introduced
> back then. Later on, more incompatible changes were made, and due to these
> changes, Tarantool failed to recover from pre-1.7.5 snapshots.
> 
> Actually this is rather inconvenient. Now one has to perform an extra
> upgrade step when upgrading from, say, 1.6 to 2.x: instead of performing a
> direct upgrade one has to do 1.6 -> 1.10 -> 2.x upgrade which takes
> twice the time.
> 

2. I didn’t get the part before this line - if it part of the commit message?
I don’t think we need that deep explanation, partially covered later in
description how the issue is addressed. 

> Make it possible to boot from snapshots coming from Tarantool version 1.6.8 and
> above.
> 

3. Not sure if it’s too long, but it looks like.

> In order to do so, introduce before_replace triggers on system spaces,
> which work during snapshot/xlog recovery. The triggers will set tuple
> formats to the ones supported by current Tarantool (2.x). This way the
> recovered data will have the correct format for a usual schema upgrade.
> 
> Also add upgrade_to_1_7_5() handler, which finishes transformation of
> old schema to 1.7.5. The handler is fired together with other
> box.schema.upgrade() handlers, so there's no user-visible behaviour
> change.
> 
> Side note: it would be great to use the same technique to allow booting
> from pre-1.6.8 snapshots. Unfortunately, this is not possible.
> 
> Current triggers don't break the order of schema upgrades, so 1.7.1
> upgrades come before 1.7.2 and 1.7.5. This is because all the upgrades
> in these versions are replacing existing tuples and not inserting new
> ones, so the upgrades may be handled by the before_replace triggers.
> 
> Upgrade to 1.6.8 requires inserting new tuples: creating sysviews, like
> _vspace, _vuser and so on. This can't be done from the before_replace
> triggers, so we would have to run triggers for 1.7.x first which would
> allow Tarantool to recover the snapshot, and then run an upgrade handler for
> 1.6.8. This looks really messy.
> 
> Closes #5894
> ---
> https://github.com/tarantool/tarantool/issues/5894
> https://github.com/tarantool/tarantool/tree/sp/gh-5894-1.6-upgrade
> src/box/lua/load_cfg.lua |   9 ++
> src/box/lua/upgrade.lua  | 251 ++++++++++++++++++++++++++++++++++++++-
> 2 files changed, 258 insertions(+), 2 deletions(-)
> 
> diff --git a/src/box/lua/load_cfg.lua b/src/box/lua/load_cfg.lua
> index 4df70c210..ea73337b7 100644
> --- a/src/box/lua/load_cfg.lua
> +++ b/src/box/lua/load_cfg.lua
> @@ -695,6 +695,12 @@ local function load_cfg(cfg)
>         return box.error() -- re-throw exception from check_cfg()
>     end
> 
> +    local snap_dir = box.cfg.memtx_dir

4. It should be covered with comment on the matter - why is it memtx only,
what versions are expected here, what triggers will do.

> +    local snap_version = private.get_snapshot_version(snap_dir)
> +    if snap_version and private.need_recovery_triggers(snap_version) then
> +        private.set_recovery_triggers(snap_version)
> +    end
> +
>     -- NB: After this point the function should not raise an
>     -- error.
>     --
> @@ -722,6 +728,9 @@ local function load_cfg(cfg)
>     -- This call either succeeds or calls panic() / exit().
>     private.cfg_load()
> 
> +    if snap_version and private.need_recovery_triggers(snap_version) then
> +        private.reset_recovery_triggers()
> +    end

5. So, after this point the formats will be up to 2.x. If we make a snapshot
at this moment, the new snap file will become incompatible with 1.6.8?
If an 2.x instance will bootstrap from this snap, will it be able to
be a replica in 1.6.8 cluster? 

>     -- This block does not raise an error: all necessary checks
>     -- already performed in private.cfg_check(). See <dynamic_cfg>
>     -- comment.
> diff --git a/src/box/lua/upgrade.lua b/src/box/lua/upgrade.lua
> index 97afc0b4d..c1f3a027a 100644
> --- a/src/box/lua/upgrade.lua
> +++ b/src/box/lua/upgrade.lua
> @@ -1,6 +1,8 @@
> local log = require('log')
> local bit = require('bit')
> local json = require('json')
> +local fio = require('fio')
> +local xlog = require('xlog')
> 
> -- Guest user id - the default user
> local GUEST = 0
> @@ -86,6 +88,41 @@ local function set_system_triggers(val)
>     foreach_system_space(function(s) s:run_triggers(val) end)
> end
> 
> +-- Get schema version, stored in _schema system space, by reading the latest
> +-- snapshot file from the snap_dir. Useful to define schema_version before
> +-- recovering the snapshot, because some schema versions are too old and cannot
> +-- be recovered normally.
> +local function get_snapshot_version(snap_dir)
> +    local SCHEMA_ID = 272

6. Can we reuse the global definition at schema_def.h:72 ?

> +    local snap_pattern = snap_dir..'/'..string.rep('[0-9]', 20)..'.snap'
> +    local snap_list = fio.glob(snap_pattern)
> +    table.sort(snap_list)
> +    local snap = snap_list[#snap_list]
> +    if snap == nil then
> +        return nil
> +    end
> +    local version = nil
> +    for _, row in xlog.pairs(snap) do
> +        local sid = row.BODY and row.BODY.space_id
> +        if sid == SCHEMA_ID then
> +            local tuple = row.BODY.tuple
> +            if tuple and tuple[1] == 'version' then
> +                local major, minor, patch = tuple[2], tuple[3], tuple[4] or 0
> +                if major and minor and patch and type(major) == 'number' and
> +                   type(minor) == 'number' and type(patch) == 'number' then
> +                    version = mkversion(major, minor, patch)
> +                    break
> +                end
> +            end
> +        elseif sid and sid > SCHEMA_ID then
> +            -- Exit early if version wasn't found in _schema space.
> +            -- Snapshot rows are ordered by space id.
> +            break
> +        end
> +    end
> +    return version
> +end
> +
> --------------------------------------------------------------------------------
> -- Bootstrap
> --------------------------------------------------------------------------------
> @@ -131,6 +168,135 @@ local function create_sysview(source_id, target_id)
>     end
> end
> 
> +--------------------------------------------------------------------------------
> +-- Tarantool 1.7.1
> +--------------------------------------------------------------------------------
> +local function user_trig_1_7_1(_, tuple)
> +    if tuple[3] == 'guest' and not tuple[5] then
> +        local auth_method_list = {}
> +        auth_method_list["chap-sha1"] = box.schema.user.password("")
> +        tuple = tuple:update{{'=', 5, auth_method_list}}
> +        log.info("Set empty password to user 'guest'")
> +    end
> +    return tuple
> +end
> +
> +--------------------------------------------------------------------------------
> +-- Tarantool 1.7.2
> +--------------------------------------------------------------------------------
> +local function index_trig_1_7_2(_, tuple)
> +    local field_types_v16 = {
> +        num = 'unsigned',
> +        int = 'integer',
> +        str = 'string',
> +    }
> +    local parts = tuple[6]
> +    local changed = false
> +    for _, part in pairs(parts) do
> +        local field_type = part[2]:lower()
> +        if field_types_v16[field_type] ~= nil then
> +            part[2] = field_types_v16[field_type]
> +            changed = true
> +        end
> +    end
> +    if changed then
> +        log.info("Updating index '%s' on space '%s': set parts to %s", tuple[3],
> +                 box.space[tuple[1]].name, json.encode(parts))
> +        tuple = tuple:update{{'=', 6, parts}}
> +    end
> +    return tuple
> +end
> +
> +--------------------------------------------------------------------------------
> +-- Tarantool 1.7.5
> +--------------------------------------------------------------------------------
> +local function create_truncate_space()
> +    local _truncate = box.space[box.schema.TRUNCATE_ID]
> +
> +    log.info("create space _truncate")
> +    box.space._space:insert{_truncate.id, ADMIN, '_truncate', 'memtx', 0, setmap({}),
> +                            {{name = 'id', type = 'unsigned'}, {name = 'count', type = 'unsigned'}}}
> +
> +    log.info("create index primary on _truncate")
> +    box.space._index:insert{_truncate.id, 0, 'primary', 'tree', {unique = true}, {{0, 'unsigned'}}}
> +
> +    local _priv = box.space[box.schema.PRIV_ID]
> +    _priv:insert{ADMIN, PUBLIC, 'space', _truncate.id, box.priv.W}
> +end
> +
> +local function upgrade_to_1_7_5()
> +    create_truncate_space()

7. Is it an intended nesting of a single-call function?

> +end
> +
> +local function user_trig_1_7_5(_, tuple)
> +    if not tuple[5] then
> +        tuple = tuple:update{{'=', 5, setmap({})}}
> +        log.info("Set empty password to %s '%s'", tuple[4], tuple[3])
> +    end
> +    return tuple
> +end
> +


More information about the Tarantool-patches mailing list