Tarantool development patches archive
* [tarantool-patches] [PATCH v2] vshard reload mechanism
@ 2018-07-23 11:14 AKhatskevich
  2018-07-23 11:14 ` [tarantool-patches] [PATCH 1/4] Add test on error during reconfigure AKhatskevich
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: AKhatskevich @ 2018-07-23 11:14 UTC
  To: v.shpilevoy, tarantool-patches

Issue1: https://github.com/tarantool/vshard/issues/112
Issue2: https://github.com/tarantool/vshard/issues/125
Branch: https://github.com/tarantool/vshard/tree/kh/gh-112-reload-mt-2

This patchset improves the vshard reload mechanism.

AKhatskevich (4):
  Add test on error during reconfigure
  Complete module reload
  Tests: separate bootstrap routine to a lua_libs
  Introduce storage reload evolution

 .travis.yml                                        |   2 +-
 rpm/prebuild.sh                                    |   2 +
 test/lua_libs/bootstrap_test_storage.lua           |  50 +++++
 test/lua_libs/git_util.lua                         |  51 +++++
 test/lua_libs/util.lua                             |  44 ++++
 test/rebalancer/box_1_a.lua                        |  47 +---
 test/rebalancer/errinj.result                      |   2 +-
 test/rebalancer/errinj.test.lua                    |   2 +-
 test/rebalancer/rebalancer.result                  |   2 +-
 test/rebalancer/rebalancer.test.lua                |   2 +-
 test/rebalancer/rebalancer_lock_and_pin.result     |   2 +-
 test/rebalancer/rebalancer_lock_and_pin.test.lua   |   2 +-
 test/rebalancer/restart_during_rebalancing.result  |   2 +-
 .../rebalancer/restart_during_rebalancing.test.lua |   2 +-
 test/rebalancer/stress_add_remove_rs.result        |   2 +-
 test/rebalancer/stress_add_remove_rs.test.lua      |   2 +-
 .../rebalancer/stress_add_remove_several_rs.result |   2 +-
 .../stress_add_remove_several_rs.test.lua          |   2 +-
 test/rebalancer/suite.ini                          |   2 +-
 test/reload_evolution/storage.result               | 248 +++++++++++++++++++++
 test/reload_evolution/storage.test.lua             |  88 ++++++++
 test/reload_evolution/storage_1_a.lua              |  48 ++++
 test/reload_evolution/storage_1_b.lua              |   1 +
 test/reload_evolution/storage_2_a.lua              |   1 +
 test/reload_evolution/storage_2_b.lua              |   1 +
 test/reload_evolution/suite.ini                    |   6 +
 test/reload_evolution/test.lua                     |   9 +
 test/router/reload.result                          | 126 +++++++++++
 test/router/reload.test.lua                        |  36 +++
 test/router/router.result                          |  33 ++-
 test/router/router.test.lua                        |   9 +
 test/storage/reload.result                         |  29 +++
 test/storage/reload.test.lua                       |  10 +
 test/storage/storage.result                        |  39 ++++
 test/storage/storage.test.lua                      |  12 +
 test/unit/reload_evolution.result                  |  45 ++++
 test/unit/reload_evolution.test.lua                |  18 ++
 vshard/cfg.lua                                     |   5 +
 vshard/error.lua                                   |   5 +
 vshard/replicaset.lua                              | 102 +++++++--
 vshard/router/init.lua                             |  54 +++--
 vshard/storage/init.lua                            |  65 ++++--
 vshard/storage/reload_evolution.lua                |  58 +++++
 vshard/util.lua                                    |  20 ++
 44 files changed, 1177 insertions(+), 113 deletions(-)
 create mode 100644 test/lua_libs/bootstrap_test_storage.lua
 create mode 100644 test/lua_libs/git_util.lua
 create mode 100644 test/reload_evolution/storage.result
 create mode 100644 test/reload_evolution/storage.test.lua
 create mode 100755 test/reload_evolution/storage_1_a.lua
 create mode 120000 test/reload_evolution/storage_1_b.lua
 create mode 120000 test/reload_evolution/storage_2_a.lua
 create mode 120000 test/reload_evolution/storage_2_b.lua
 create mode 100644 test/reload_evolution/suite.ini
 create mode 100644 test/reload_evolution/test.lua
 create mode 100644 test/unit/reload_evolution.result
 create mode 100644 test/unit/reload_evolution.test.lua
 create mode 100644 vshard/storage/reload_evolution.lua

-- 
2.14.1


* [tarantool-patches] [PATCH 1/4] Add test on error during reconfigure
  2018-07-23 11:14 [tarantool-patches] [PATCH v2] vshard reload mechanism AKhatskevich
@ 2018-07-23 11:14 ` AKhatskevich
  2018-07-23 13:18   ` [tarantool-patches] " Vladislav Shpilevoy
  2018-07-23 11:14 ` [tarantool-patches] [PATCH 2/4] Complete module reload AKhatskevich
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: AKhatskevich @ 2018-07-23 11:14 UTC
  To: v.shpilevoy, tarantool-patches

If the reconfigure process fails, the node should continue to
work properly.
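
A minimal sketch of the check the new tests perform, written in
plain Lua so it runs without Tarantool. The `internal` table and
the `cfg` function are hypothetical stand-ins for a module's
internal state and its cfg entry point, not vshard code; only
the shape of `has_same_fields` follows the helper added in this
patch.

```lua
-- Hypothetical module state, standing in for vshard.*.internal.
local internal = {
    total_bucket_count = 3000,
    errinj = { ERRINJ_CFG = false },
}

-- Hypothetical cfg function: the injected error fires before any
-- field of `internal` is modified.
local function cfg(new_cfg)
    if internal.errinj.ERRINJ_CFG then
        error('Error injection: cfg')
    end
    internal.total_bucket_count = new_cfg.bucket_count
end

-- Shallow copy, a plain-Lua substitute for Tarantool's table.copy.
local function shallow_copy(t)
    local res = {}
    for k, v in pairs(t) do
        res[k] = v
    end
    return res
end

-- Check that `data` has every field of `etalon` with an equal value.
local function has_same_fields(etalon, data)
    local diff = {}
    for k, v in pairs(etalon) do
        if v ~= data[k] then
            table.insert(diff, k)
        end
    end
    if #diff > 0 then
        return false, diff
    end
    return true
end

-- Snapshot the state, force cfg to fail, then compare.
internal.errinj.ERRINJ_CFG = true
local old_internal = shallow_copy(internal)
local ok, err = pcall(cfg, { bucket_count = 100 })
internal.errinj.ERRINJ_CFG = false
local unchanged = has_same_fields(old_internal, internal)
```

The comparison is shallow: nested tables such as `errinj` are
compared by reference, which is why toggling the injection flag
inside it does not show up as a difference.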
---
 test/lua_libs/util.lua        | 24 ++++++++++++++++++++++++
 test/router/router.result     | 30 ++++++++++++++++++++++++++++++
 test/router/router.test.lua   |  9 +++++++++
 test/storage/storage.result   | 39 +++++++++++++++++++++++++++++++++++++++
 test/storage/storage.test.lua | 12 ++++++++++++
 vshard/router/init.lua        |  7 +++++++
 vshard/storage/init.lua       |  9 +++++++++
 7 files changed, 130 insertions(+)

diff --git a/test/lua_libs/util.lua b/test/lua_libs/util.lua
index f2d3b48..f40d3a6 100644
--- a/test/lua_libs/util.lua
+++ b/test/lua_libs/util.lua
@@ -69,9 +69,33 @@ local function wait_master(test_run, replicaset, master)
     log.info('Slaves are connected to a master "%s"', master)
 end
 
+--
+-- Check that data has at least all etalon's fields and they are
+-- equal.
+-- @param etalon Table which fields should be found in `data`.
+-- @param data Table which is checked against `etalon`.
+--
+-- @retval Boolean indicator of equality and, if not equal, a
+--         table of names of fields which differ in `data`.
+--
+local function has_same_fields(etalon, data)
+    assert(type(etalon) == 'table' and type(data) == 'table')
+    local diff = {}
+    for k, v in pairs(etalon) do
+        if v ~= data[k] then
+            table.insert(diff, k)
+        end
+    end
+    if #diff > 0 then
+        return false, diff
+    end
+    return true
+end
+
 return {
     check_error = check_error,
     shuffle_masters = shuffle_masters,
     collect_timeouts = collect_timeouts,
     wait_master = wait_master,
+    has_same_fields = has_same_fields,
 }
diff --git a/test/router/router.result b/test/router/router.result
index 15f4fd0..4919962 100644
--- a/test/router/router.result
+++ b/test/router/router.result
@@ -1156,6 +1156,36 @@ util.check_error(vshard.router.cfg, non_dynamic_cfg)
 ---
 - Non-dynamic option shard_index cannot be reconfigured
 ...
+-- Error during reconfigure process.
+vshard.router.route(1):callro('echo', {'some_data'})
+---
+- some_data
+- null
+- null
+...
+vshard.router.internal.errinj.ERRINJ_CFG = true
+---
+...
+old_internal = table.copy(vshard.router.internal)
+---
+...
+util.check_error(vshard.router.cfg, cfg)
+---
+- 'Error injection: cfg'
+...
+vshard.router.internal.errinj.ERRINJ_CFG = false
+---
+...
+util.has_same_fields(old_internal, vshard.router.internal)
+---
+- true
+...
+vshard.router.route(1):callro('echo', {'some_data'})
+---
+- some_data
+- null
+- null
+...
 _ = test_run:cmd("switch default")
 ---
 ...
diff --git a/test/router/router.test.lua b/test/router/router.test.lua
index 8006e5d..df2f381 100644
--- a/test/router/router.test.lua
+++ b/test/router/router.test.lua
@@ -444,6 +444,15 @@ non_dynamic_cfg = table.copy(cfg)
 non_dynamic_cfg.shard_index = 'non_default_name'
 util.check_error(vshard.router.cfg, non_dynamic_cfg)
 
+-- Error during reconfigure process.
+vshard.router.route(1):callro('echo', {'some_data'})
+vshard.router.internal.errinj.ERRINJ_CFG = true
+old_internal = table.copy(vshard.router.internal)
+util.check_error(vshard.router.cfg, cfg)
+vshard.router.internal.errinj.ERRINJ_CFG = false
+util.has_same_fields(old_internal, vshard.router.internal)
+vshard.router.route(1):callro('echo', {'some_data'})
+
 _ = test_run:cmd("switch default")
 test_run:drop_cluster(REPLICASET_2)
 
diff --git a/test/storage/storage.result b/test/storage/storage.result
index 4399ff0..ff07fe9 100644
--- a/test/storage/storage.result
+++ b/test/storage/storage.result
@@ -732,6 +732,45 @@ util.check_error(vshard.storage.cfg, non_dynamic_cfg, names.storage_1_a)
 ---
 - Non-dynamic option bucket_count cannot be reconfigured
 ...
+-- Error during reconfigure process.
+_, rs = next(vshard.storage.internal.replicasets)
+---
+...
+rs:callro('echo', {'some_data'})
+---
+- some_data
+- null
+- null
+...
+vshard.storage.internal.errinj.ERRINJ_CFG = true
+---
+...
+old_internal = table.copy(vshard.storage.internal)
+---
+...
+_, err = pcall(vshard.storage.cfg, cfg, names.storage_1_a)
+---
+...
+err:match('Error injection:.*')
+---
+- 'Error injection: cfg'
+...
+vshard.storage.internal.errinj.ERRINJ_CFG = false
+---
+...
+util.has_same_fields(old_internal, vshard.storage.internal)
+---
+- true
+...
+_, rs = next(vshard.storage.internal.replicasets)
+---
+...
+rs:callro('echo', {'some_data'})
+---
+- some_data
+- null
+- null
+...
 _ = test_run:cmd("switch default")
 ---
 ...
diff --git a/test/storage/storage.test.lua b/test/storage/storage.test.lua
index 72564e1..04bb608 100644
--- a/test/storage/storage.test.lua
+++ b/test/storage/storage.test.lua
@@ -182,6 +182,18 @@ non_dynamic_cfg = table.copy(cfg)
 non_dynamic_cfg.bucket_count = require('vshard.consts').DEFAULT_BUCKET_COUNT + 1
 util.check_error(vshard.storage.cfg, non_dynamic_cfg, names.storage_1_a)
 
+-- Error during reconfigure process.
+_, rs = next(vshard.storage.internal.replicasets)
+rs:callro('echo', {'some_data'})
+vshard.storage.internal.errinj.ERRINJ_CFG = true
+old_internal = table.copy(vshard.storage.internal)
+_, err = pcall(vshard.storage.cfg, cfg, names.storage_1_a)
+err:match('Error injection:.*')
+vshard.storage.internal.errinj.ERRINJ_CFG = false
+util.has_same_fields(old_internal, vshard.storage.internal)
+_, rs = next(vshard.storage.internal.replicasets)
+rs:callro('echo', {'some_data'})
+
 _ = test_run:cmd("switch default")
 
 test_run:drop_cluster(REPLICASET_2)
diff --git a/vshard/router/init.lua b/vshard/router/init.lua
index 4531f3a..a143070 100644
--- a/vshard/router/init.lua
+++ b/vshard/router/init.lua
@@ -11,6 +11,7 @@ local M = rawget(_G, '__module_vshard_router')
 if not M then
     M = {
         errinj = {
+            ERRINJ_CFG = false,
             ERRINJ_FAILOVER_CHANGE_CFG = false,
             ERRINJ_RELOAD = false,
             ERRINJ_LONG_DISCOVERY = false,
@@ -486,6 +487,12 @@ local function router_cfg(cfg)
     for k, v in pairs(cfg) do
         log.info({[k] = v})
     end
+    -- It is considered that all possible errors during cfg
+    -- process occur only before this place.
+    -- This check should be placed as late as possible.
+    if M.errinj.ERRINJ_CFG then
+        error('Error injection: cfg')
+    end
     box.cfg(cfg)
     log.info("Box has been configured")
     M.total_bucket_count = total_bucket_count
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index ff204a4..052e94f 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -33,6 +33,7 @@ if not M then
         -- Bucket count stored on all replicasets.
         total_bucket_count = 0,
         errinj = {
+            ERRINJ_CFG = false,
             ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false,
             ERRINJ_RELOAD = false,
             ERRINJ_CFG_DELAY = false,
@@ -1560,6 +1561,14 @@ local function storage_cfg(cfg, this_replica_uuid)
     local shard_index = cfg.shard_index
     local collect_bucket_garbage_interval = cfg.collect_bucket_garbage_interval
     local collect_lua_garbage = cfg.collect_lua_garbage
+
+    -- It is considered that all possible errors during cfg
+    -- process occur only before this place.
+    -- This check should be placed as late as possible.
+    if M.errinj.ERRINJ_CFG then
+        error('Error injection: cfg')
+    end
+
     --
     -- Sync timeout is a special case - it must be updated before
     -- all other options to allow a user to demote a master with
-- 
2.14.1


* [tarantool-patches] [PATCH 2/4] Complete module reload
  2018-07-23 11:14 [tarantool-patches] [PATCH v2] vshard reload mechanism AKhatskevich
  2018-07-23 11:14 ` [tarantool-patches] [PATCH 1/4] Add test on error during reconfigure AKhatskevich
@ 2018-07-23 11:14 ` AKhatskevich
  2018-07-23 13:31   ` [tarantool-patches] " Vladislav Shpilevoy
  2018-07-23 11:14 ` [tarantool-patches] [PATCH 3/4] Tests: separate bootstrap routine to a lua_libs AKhatskevich
  2018-07-23 11:14 ` [tarantool-patches] [PATCH 4/4] Introduce storage reload evolution AKhatskevich
  3 siblings, 1 reply; 14+ messages in thread
From: AKhatskevich @ 2018-07-23 11:14 UTC
  To: v.shpilevoy, tarantool-patches

This commit improves the reload mechanism so that vshard can be
upgraded to a new version across a wider variety of possible
changes between the two versions.

Changes:
 * introduce cfg option `connection_outdate_delay`
 * improve reload mechanism
 * add `util.async_task` method, which runs a function after a
   delay
 * delete replicaset:rebind_connections method as it is replaced
   with `rebind_replicasets` which updates all replicasets at once

Reload mechanism:
 * reload all vshard modules
 * create new `replicaset` and `replica` objects
 * reuse old netbox connections in new replica objects if
   possible
 * update router/storage.internal table
 * after a `connection_outdate_delay` disable old instances of
   `replicaset` and `replica` objects
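
The last step above (disabling old objects) can be sketched in
plain Lua as a metatable swap. This is a simplified illustration,
not the patch itself: `outdated_error` is a hypothetical stand-in
for the vshard error constructor, the `callro` stub just echoes
its first argument, and the delay before the swap is omitted.

```lua
-- Hypothetical error object; the patch builds a ShardingError
-- with name OBJECT_IS_OUTDATED and code 20.
local function outdated_error()
    return nil, { name = 'OBJECT_IS_OUTDATED', code = 20 }
end

-- A toy replicaset metatable with a single working method.
local replicaset_mt = {
    __index = {
        callro = function(self, func, args)
            return args[1]
        end,
    },
}

-- Metatable for outdated objects: every method of the live
-- metatable is replaced by a stub that returns the error.
local outdated_mt = { __index = { is_outdated = true } }
for name in pairs(replicaset_mt.__index) do
    outdated_mt.__index[name] = outdated_error
end

local rs = setmetatable({ uuid = 'rs-1' }, replicaset_mt)
local before = rs:callro('echo', { 'some_data' })

-- Outdate the object in place; references held by user code now
-- see the stubbed methods.
setmetatable(rs, outdated_mt)
local res, err = rs:callro('echo', { 'some_data' })
```

Swapping the metatable rather than replacing the object means
that a reference a user saved before the reload starts failing
with a clear error, while plain fields such as `uuid` stay
readable.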

Reload works for modules:
 * vshard.router
 * vshard.storage

Here is a module reload algorithm:
 * old vshard is working
 * delete old vshard src
 * install new vshard
 * call: package.loaded['vshard.router'] = nil
 * call: old_router = vshard.router -- Save working router copy.
 * call: vshard.router = require('vshard.router')
 * if require fails: continue using old_router
 * if require succeeds: use vshard.router
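
The call sequence above can be sketched in plain Lua as follows.
The module name `demo.router` and the `reload` helper are
hypothetical; the module is registered through `package.preload`
so the sketch runs without vshard installed. Restoring
`package.loaded` after a failed require is an assumption of this
sketch, not something the algorithm above states.

```lua
-- Hypothetical module: each load returns a table with a new version.
local version = 0
package.preload['demo.router'] = function()
    version = version + 1
    return { version = version }
end

-- Reload `name`, keeping `old_module` if the new code fails to load.
local function reload(name, old_module)
    package.loaded[name] = nil
    local ok, new_module = pcall(require, name)
    if ok then
        return new_module, true
    end
    -- Assumption: put the old module back so that later
    -- require() calls keep returning the working copy.
    package.loaded[name] = old_module
    return old_module, false
end

local router = require('demo.router')

-- Successful reload: a new module table is returned.
local reloaded, ok = reload('demo.router', router)

-- Simulate a broken new installation: require now fails and the
-- previously loaded module stays in use.
package.preload['demo.router'] = function()
    error('broken install')
end
local kept, ok2 = reload('demo.router', reloaded)
```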

If the reload process fails, the old router/storage module,
replicaset and replica objects continue working properly. If the
reload succeeds, all old objects become outdated.

Extra changes:
 * introduce MODULE_INTERNALS, which stores the name under which
   the module's internal data is kept in the global namespace

Part of #112
---
 test/router/reload.result    | 126 +++++++++++++++++++++++++++++++++++++++++++
 test/router/reload.test.lua  |  36 +++++++++++++
 test/router/router.result    |   3 +-
 test/storage/reload.result   |  29 ++++++++++
 test/storage/reload.test.lua |  10 ++++
 vshard/cfg.lua               |   5 ++
 vshard/error.lua             |   5 ++
 vshard/replicaset.lua        | 102 ++++++++++++++++++++++++++---------
 vshard/router/init.lua       |  47 +++++++++++-----
 vshard/storage/init.lua      |  45 ++++++++++------
 vshard/util.lua              |  20 +++++++
 11 files changed, 373 insertions(+), 55 deletions(-)

diff --git a/test/router/reload.result b/test/router/reload.result
index 47f3c2e..88122aa 100644
--- a/test/router/reload.result
+++ b/test/router/reload.result
@@ -174,6 +174,132 @@ vshard.router.module_version()
 check_reloaded()
 ---
 ...
+--
+-- Outdate old replicaset and replica objects.
+--
+rs = vshard.router.route(1)
+---
+...
+rs:callro('echo', {'some_data'})
+---
+- some_data
+- null
+- null
+...
+package.loaded["vshard.router"] = nil
+---
+...
+_ = require('vshard.router')
+---
+...
+-- Make sure outdate async task has had cpu time.
+while not rs.is_outdated do fiber.sleep(0.001) end
+---
+...
+rs.callro(rs, 'echo', {'some_data'})
+---
+- null
+- type: ShardingError
+  name: OBJECT_IS_OUTDATED
+  message: Object is outdated after module reload/reconfigure. Use new instance.
+  code: 20
+...
+vshard.router = require('vshard.router')
+---
+...
+rs = vshard.router.route(1)
+---
+...
+rs:callro('echo', {'some_data'})
+---
+- some_data
+- null
+- null
+...
+-- Test `connection_outdate_delay`.
+old_connection_delay = cfg.connection_outdate_delay
+---
+...
+cfg.connection_outdate_delay = 0.3
+---
+...
+vshard.router.cfg(cfg)
+---
+...
+cfg.connection_outdate_delay = old_connection_delay
+---
+...
+vshard.router.internal.connection_outdate_delay = nil
+---
+...
+rs_new = vshard.router.route(1)
+---
+...
+rs_old = rs
+---
+...
+_, replica_old = next(rs_old.replicas)
+---
+...
+rs_new:callro('echo', {'some_data'})
+---
+- some_data
+- null
+- null
+...
+-- Check old objets are still valid.
+rs_old:callro('echo', {'some_data'})
+---
+- some_data
+- null
+- null
+...
+replica_old.conn ~= nil
+---
+- true
+...
+fiber.sleep(0.2)
+---
+...
+rs_old:callro('echo', {'some_data'})
+---
+- some_data
+- null
+- null
+...
+replica_old.conn ~= nil
+---
+- true
+...
+replica_old.is_outdated == nil
+---
+- true
+...
+fiber.sleep(0.2)
+---
+...
+rs_old:callro('echo', {'some_data'})
+---
+- null
+- type: ShardingError
+  name: OBJECT_IS_OUTDATED
+  message: Object is outdated after module reload/reconfigure. Use new instance.
+  code: 20
+...
+replica_old.conn == nil
+---
+- true
+...
+replica_old.is_outdated == true
+---
+- true
+...
+rs_new:callro('echo', {'some_data'})
+---
+- some_data
+- null
+- null
+...
 test_run:switch('default')
 ---
 - true
diff --git a/test/router/reload.test.lua b/test/router/reload.test.lua
index af2939d..01b7163 100644
--- a/test/router/reload.test.lua
+++ b/test/router/reload.test.lua
@@ -86,6 +86,42 @@ _ = require('vshard.router')
 vshard.router.module_version()
 check_reloaded()
 
+--
+-- Outdate old replicaset and replica objects.
+--
+rs = vshard.router.route(1)
+rs:callro('echo', {'some_data'})
+package.loaded["vshard.router"] = nil
+_ = require('vshard.router')
+-- Make sure outdate async task has had cpu time.
+while not rs.is_outdated do fiber.sleep(0.001) end
+rs.callro(rs, 'echo', {'some_data'})
+vshard.router = require('vshard.router')
+rs = vshard.router.route(1)
+rs:callro('echo', {'some_data'})
+-- Test `connection_outdate_delay`.
+old_connection_delay = cfg.connection_outdate_delay
+cfg.connection_outdate_delay = 0.3
+vshard.router.cfg(cfg)
+cfg.connection_outdate_delay = old_connection_delay
+vshard.router.internal.connection_outdate_delay = nil
+rs_new = vshard.router.route(1)
+rs_old = rs
+_, replica_old = next(rs_old.replicas)
+rs_new:callro('echo', {'some_data'})
+-- Check old objets are still valid.
+rs_old:callro('echo', {'some_data'})
+replica_old.conn ~= nil
+fiber.sleep(0.2)
+rs_old:callro('echo', {'some_data'})
+replica_old.conn ~= nil
+replica_old.is_outdated == nil
+fiber.sleep(0.2)
+rs_old:callro('echo', {'some_data'})
+replica_old.conn == nil
+replica_old.is_outdated == true
+rs_new:callro('echo', {'some_data'})
+
 test_run:switch('default')
 test_run:cmd('stop server router_1')
 test_run:cmd('cleanup server router_1')
diff --git a/test/router/router.result b/test/router/router.result
index 4919962..45394e1 100644
--- a/test/router/router.result
+++ b/test/router/router.result
@@ -1024,11 +1024,10 @@ error_messages
 - - Use replicaset:callro(...) instead of replicaset.callro(...)
   - Use replicaset:connect_master(...) instead of replicaset.connect_master(...)
   - Use replicaset:connect_replica(...) instead of replicaset.connect_replica(...)
-  - Use replicaset:rebind_connections(...) instead of replicaset.rebind_connections(...)
   - Use replicaset:down_replica_priority(...) instead of replicaset.down_replica_priority(...)
   - Use replicaset:call(...) instead of replicaset.call(...)
-  - Use replicaset:up_replica_priority(...) instead of replicaset.up_replica_priority(...)
   - Use replicaset:connect(...) instead of replicaset.connect(...)
+  - Use replicaset:up_replica_priority(...) instead of replicaset.up_replica_priority(...)
   - Use replicaset:callrw(...) instead of replicaset.callrw(...)
   - Use replicaset:connect_all(...) instead of replicaset.connect_all(...)
 ...
diff --git a/test/storage/reload.result b/test/storage/reload.result
index 531d984..b91b622 100644
--- a/test/storage/reload.result
+++ b/test/storage/reload.result
@@ -174,6 +174,35 @@ vshard.storage.module_version()
 check_reloaded()
 ---
 ...
+--
+-- Outdate old replicaset and replica objects.
+--
+_, rs = next(vshard.storage.internal.replicasets)
+---
+...
+package.loaded["vshard.storage"] = nil
+---
+...
+_ = require('vshard.storage')
+---
+...
+rs.callro(rs, 'echo', {'some_data'})
+---
+- null
+- type: ShardingError
+  name: OBJECT_IS_OUTDATED
+  message: Object is outdated after module reload/reconfigure. Use new instance.
+  code: 20
+...
+_, rs = next(vshard.storage.internal.replicasets)
+---
+...
+rs.callro(rs, 'echo', {'some_data'})
+---
+- some_data
+- null
+- null
+...
 test_run:switch('default')
 ---
 - true
diff --git a/test/storage/reload.test.lua b/test/storage/reload.test.lua
index 64c3a60..9140299 100644
--- a/test/storage/reload.test.lua
+++ b/test/storage/reload.test.lua
@@ -87,6 +87,16 @@ _ = require('vshard.storage')
 vshard.storage.module_version()
 check_reloaded()
 
+--
+-- Outdate old replicaset and replica objects.
+--
+_, rs = next(vshard.storage.internal.replicasets)
+package.loaded["vshard.storage"] = nil
+_ = require('vshard.storage')
+rs.callro(rs, 'echo', {'some_data'})
+_, rs = next(vshard.storage.internal.replicasets)
+rs.callro(rs, 'echo', {'some_data'})
+
 test_run:switch('default')
 test_run:drop_cluster(REPLICASET_2)
 test_run:drop_cluster(REPLICASET_1)
diff --git a/vshard/cfg.lua b/vshard/cfg.lua
index d5429af..bba12cc 100644
--- a/vshard/cfg.lua
+++ b/vshard/cfg.lua
@@ -217,6 +217,10 @@ local cfg_template = {
         type = 'non-negative number', name = 'Sync timeout', is_optional = true,
         default = consts.DEFAULT_SYNC_TIMEOUT
     }},
+    {'connection_outdate_delay', {
+        type = 'non-negative number', name = 'Object outdate timeout',
+        is_optional = true
+    }},
 }
 
 --
@@ -264,6 +268,7 @@ local function remove_non_box_options(cfg)
     cfg.collect_bucket_garbage_interval = nil
     cfg.collect_lua_garbage = nil
     cfg.sync_timeout = nil
+    cfg.connection_outdate_delay = nil
 end
 
 return {
diff --git a/vshard/error.lua b/vshard/error.lua
index cf2f9d2..f79107b 100644
--- a/vshard/error.lua
+++ b/vshard/error.lua
@@ -100,6 +100,11 @@ local error_message_template = {
     [19] = {
         name = 'REPLICASET_IS_LOCKED',
         msg = 'Replicaset is locked'
+    },
+    [20] = {
+        name = 'OBJECT_IS_OUTDATED',
+        msg = 'Object is outdated after module reload/reconfigure. ' ..
+              'Use new instance.'
     }
 }
 
diff --git a/vshard/replicaset.lua b/vshard/replicaset.lua
index 99f59aa..6c8d477 100644
--- a/vshard/replicaset.lua
+++ b/vshard/replicaset.lua
@@ -21,6 +21,7 @@
 --                                  requests to the replica>,
 --             net_sequential_fail = <count of sequential failed
 --                                    requests to the replica>,
+--             is_outdated = nil/true,
 --          }
 --      },
 --      master = <master server from the array above>,
@@ -34,6 +35,7 @@
 --      etalon_bucket_count = <bucket count, that must be stored
 --                             on this replicaset to reach the
 --                             balance in a cluster>,
+--      is_outdated = nil/true,
 --  }
 --
 -- replicasets = {
@@ -48,7 +50,8 @@ local lerror = require('vshard.error')
 local fiber = require('fiber')
 local luri = require('uri')
 local ffi = require('ffi')
-local gsc = require('vshard.util').generate_self_checker
+local util = require('vshard.util')
+local gsc = util.generate_self_checker
 
 --
 -- on_connect() trigger for net.box
@@ -337,27 +340,39 @@ local function replicaset_tostring(replicaset)
                          master)
 end
 
+local outdate_replicasets
 --
--- Rebind connections of old replicas to new ones.
+-- Copy netbox connections from old replica objects to new ones
+-- and outdate old objects.
+-- @param replicasets New replicasets
+-- @param old_replicasets Replicasets and replicas to be outdated.
+-- @param outdate_delay Number of seconds; delay to outdate
+--        old objects.
 --
-local function replicaset_rebind_connections(replicaset)
-    for _, replica in pairs(replicaset.replicas) do
-        local old_replica = replica.old_replica
-        if old_replica then
-            local conn = old_replica.conn
-            replica.conn = conn
-            replica.down_ts = old_replica.down_ts
-            replica.net_timeout = old_replica.net_timeout
-            replica.net_sequential_ok = old_replica.net_sequential_ok
-            replica.net_sequential_fail = old_replica.net_sequential_fail
-            if conn then
-                conn.replica = replica
-                conn.replicaset = replicaset
-                old_replica.conn = nil
+local function rebind_replicasets(replicasets, old_replicasets, outdate_delay)
+    for replicaset_uuid, replicaset in pairs(replicasets) do
+        local old_replicaset = old_replicasets and
+                               old_replicasets[replicaset_uuid]
+        for replica_uuid, replica in pairs(replicaset.replicas) do
+            local old_replica = old_replicaset and
+                                old_replicaset.replicas[replica_uuid]
+            if old_replica then
+                local conn = old_replica.conn
+                replica.conn = conn
+                replica.down_ts = old_replica.down_ts
+                replica.net_timeout = old_replica.net_timeout
+                replica.net_sequential_ok = old_replica.net_sequential_ok
+                replica.net_sequential_fail = old_replica.net_sequential_fail
+                if conn then
+                    conn.replica = replica
+                    conn.replicaset = replicaset
+                end
             end
-            replica.old_replica = nil
         end
     end
+    if old_replicasets then
+        util.async_task(outdate_delay, outdate_replicasets, old_replicasets)
+    end
 end
 
 --
@@ -369,7 +384,6 @@ local replicaset_mt = {
         connect_master = replicaset_connect_master;
         connect_all = replicaset_connect_all;
         connect_replica = replicaset_connect_to_replica;
-        rebind_connections = replicaset_rebind_connections;
         down_replica_priority = replicaset_down_replica_priority;
         up_replica_priority = replicaset_up_replica_priority;
         call = replicaset_master_call;
@@ -412,6 +426,49 @@ for name, func in pairs(replica_mt.__index) do
 end
 replica_mt.__index = index
 
+--
+-- Meta-methods of outdated objects.
+-- They define only attributes from corresponding metatables to
+-- make user able to access fields of old objects.
+--
+local function outdated_warning(...)
+    return nil, lerror.vshard(lerror.code.OBJECT_IS_OUTDATED)
+end
+
+local outdated_replicaset_mt = {
+    __index = {
+        is_outdated = true
+    }
+}
+for fname, func in pairs(replicaset_mt.__index) do
+    outdated_replicaset_mt.__index[fname] = outdated_warning
+end
+
+local outdated_replica_mt = {
+    __index = {
+        is_outdated = true
+    }
+}
+for fname, func in pairs(replica_mt.__index) do
+    outdated_replica_mt.__index[fname] = outdated_warning
+end
+
+--
+-- Outdate replicaset and replica objects:
+--  * Set outdated_metatables.
+--  * Remove connections.
+--
+outdate_replicasets = function(replicasets)
+    for _, replicaset in pairs(replicasets) do
+        setmetatable(replicaset, outdated_replicaset_mt)
+        for _, replica in pairs(replicaset.replicas) do
+            setmetatable(replica, outdated_replica_mt)
+            replica.conn = nil
+        end
+    end
+    log.info('Old replicaset and replica objects are outdated.')
+end
+
 --
 -- Calculate for each replicaset its etalon bucket count.
 -- Iterative algorithm is used to learn the best balance in a
@@ -503,7 +560,7 @@ end
 --
 -- Update/build replicasets from configuration
 --
-local function buildall(sharding_cfg, old_replicasets)
+local function buildall(sharding_cfg)
     local new_replicasets = {}
     local weights = sharding_cfg.weights
     local zone = sharding_cfg.zone
@@ -515,8 +572,6 @@ local function buildall(sharding_cfg, old_replicasets)
     end
     local curr_ts = fiber.time()
     for replicaset_uuid, replicaset in pairs(sharding_cfg.sharding) do
-        local old_replicaset = old_replicasets and
-                               old_replicasets[replicaset_uuid]
         local new_replicaset = setmetatable({
             replicas = {},
             uuid = replicaset_uuid,
@@ -526,8 +581,6 @@ local function buildall(sharding_cfg, old_replicasets)
         }, replicaset_mt)
         local priority_list = {}
         for replica_uuid, replica in pairs(replicaset.replicas) do
-            local old_replica = old_replicaset and
-                                old_replicaset.replicas[replica_uuid]
             -- The old replica is saved in the new object to
             -- rebind its connection at the end of a
             -- router/storage reconfiguration.
@@ -535,7 +588,7 @@ local function buildall(sharding_cfg, old_replicasets)
                 uri = replica.uri, name = replica.name, uuid = replica_uuid,
                 zone = replica.zone, net_timeout = consts.CALL_TIMEOUT_MIN,
                 net_sequential_ok = 0, net_sequential_fail = 0,
-                down_ts = curr_ts, old_replica = old_replica,
+                down_ts = curr_ts,
             }, replica_mt)
             new_replicaset.replicas[replica_uuid] = new_replica
             if replica.master then
@@ -596,4 +649,5 @@ return {
     buildall = buildall,
     calculate_etalon_balance = cluster_calculate_etalon_balance,
     wait_masters_connect = wait_masters_connect,
+    rebind_replicasets = rebind_replicasets,
 }
diff --git a/vshard/router/init.lua b/vshard/router/init.lua
index a143070..142ddb6 100644
--- a/vshard/router/init.lua
+++ b/vshard/router/init.lua
@@ -1,5 +1,17 @@
 local log = require('log')
 local lfiber = require('fiber')
+
+local MODULE_INTERNALS = '__module_vshard_router'
+-- Reload requirements, in case this module is reloaded manually.
+if rawget(_G, MODULE_INTERNALS) then
+    local vshard_modules = {
+        'vshard.consts', 'vshard.error', 'vshard.cfg',
+        'vshard.hash', 'vshard.replicaset', 'vshard.util',
+    }
+    for _, module in pairs(vshard_modules) do
+        package.loaded[module] = nil
+    end
+end
 local consts = require('vshard.consts')
 local lerror = require('vshard.error')
 local lcfg = require('vshard.cfg')
@@ -7,15 +19,20 @@ local lhash = require('vshard.hash')
 local lreplicaset = require('vshard.replicaset')
 local util = require('vshard.util')
 
-local M = rawget(_G, '__module_vshard_router')
+local M = rawget(_G, MODULE_INTERNALS)
 if not M then
     M = {
+        ---------------- Common module attributes ----------------
+        -- The last passed configuration.
+        current_cfg = nil,
         errinj = {
             ERRINJ_CFG = false,
             ERRINJ_FAILOVER_CHANGE_CFG = false,
             ERRINJ_RELOAD = false,
             ERRINJ_LONG_DISCOVERY = false,
         },
+        -- Time to outdate old objects on reload.
+        connection_outdate_delay = nil,
         -- Bucket map cache.
         route_map = {},
         -- All known replicasets used for bucket re-balancing
@@ -479,12 +496,13 @@ local function router_cfg(cfg)
     else
         log.info('Starting router reconfiguration')
     end
-    local new_replicasets = lreplicaset.buildall(cfg, M.replicasets)
+    local new_replicasets = lreplicaset.buildall(cfg)
     local total_bucket_count = cfg.bucket_count
     local collect_lua_garbage = cfg.collect_lua_garbage
-    lcfg.remove_non_box_options(cfg)
+    local box_cfg = table.copy(cfg)
+    lcfg.remove_non_box_options(box_cfg)
     log.info("Calling box.cfg()...")
-    for k, v in pairs(cfg) do
+    for k, v in pairs(box_cfg) do
         log.info({[k] = v})
     end
     -- It is considered that all possible errors during cfg
@@ -493,18 +511,18 @@ local function router_cfg(cfg)
     if M.errinj.ERRINJ_CFG then
         error('Error injection: cfg')
     end
-    box.cfg(cfg)
+    box.cfg(box_cfg)
     log.info("Box has been configured")
+    M.connection_outdate_delay = cfg.connection_outdate_delay
     M.total_bucket_count = total_bucket_count
     M.collect_lua_garbage = collect_lua_garbage
-    M.replicasets = new_replicasets
     M.current_cfg = new_cfg
     -- Move connections from an old configuration to a new one.
     -- It must be done with no yields to prevent usage both of not
     -- fully moved old replicasets, and not fully built new ones.
-    for _, replicaset in pairs(new_replicasets) do
-        replicaset:rebind_connections()
-    end
+    lreplicaset.rebind_replicasets(new_replicasets, M.replicasets,
+                                   M.connection_outdate_delay)
+    M.replicasets = new_replicasets
     -- Now the new replicasets are fully built. Can establish
     -- connections and yield.
     for _, replicaset in pairs(new_replicasets) do
@@ -793,15 +811,16 @@ end
 -- About functions, saved in M, and reloading see comment in
 -- storage/init.lua.
 --
-M.discovery_f = discovery_f
-M.failover_f = failover_f
-
-if not rawget(_G, '__module_vshard_router') then
-    rawset(_G, '__module_vshard_router', M)
+if not rawget(_G, MODULE_INTERNALS) then
+    rawset(_G, MODULE_INTERNALS, M)
 else
+    router_cfg(M.current_cfg)
     M.module_version = M.module_version + 1
 end
 
+M.discovery_f = discovery_f
+M.failover_f = failover_f
+
 return {
     cfg = router_cfg;
     info = router_info;
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index 052e94f..07bd00c 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -2,20 +2,34 @@ local log = require('log')
 local luri = require('uri')
 local lfiber = require('fiber')
 local netbox = require('net.box') -- for net.box:self()
+local trigger = require('internal.trigger')
+
+local MODULE_INTERNALS = '__module_vshard_storage'
+-- Reload requirements, in case this module is reloaded manually.
+if rawget(_G, MODULE_INTERNALS) then
+    local vshard_modules = {
+        'vshard.consts', 'vshard.error', 'vshard.cfg',
+        'vshard.replicaset', 'vshard.util',
+    }
+    for _, module in pairs(vshard_modules) do
+        package.loaded[module] = nil
+    end
+end
 local consts = require('vshard.consts')
 local lerror = require('vshard.error')
-local util = require('vshard.util')
 local lcfg = require('vshard.cfg')
 local lreplicaset = require('vshard.replicaset')
-local trigger = require('internal.trigger')
+local util = require('vshard.util')
 
-local M = rawget(_G, '__module_vshard_storage')
+local M = rawget(_G, MODULE_INTERNALS)
 if not M then
     --
     -- The module is loaded for the first time.
     --
     M = {
         ---------------- Common module attributes ----------------
+        -- The last passed configuration.
+        current_cfg = nil,
         --
         -- All known replicasets used for bucket re-balancing.
         -- See format in replicaset.lua.
@@ -1497,7 +1511,7 @@ local function storage_cfg(cfg, this_replica_uuid)
 
     local this_replicaset
     local this_replica
-    local new_replicasets = lreplicaset.buildall(cfg, M.replicasets)
+    local new_replicasets = lreplicaset.buildall(cfg)
     local min_master
     for rs_uuid, rs in pairs(new_replicasets) do
         for replica_uuid, replica in pairs(rs.replicas) do
@@ -1576,7 +1590,6 @@ local function storage_cfg(cfg, this_replica_uuid)
     --
     local old_sync_timeout = M.sync_timeout
     M.sync_timeout = cfg.sync_timeout
-    lcfg.remove_non_box_options(cfg)
 
     if was_master and not is_master then
         local_on_master_disable_prepare()
@@ -1585,7 +1598,9 @@ local function storage_cfg(cfg, this_replica_uuid)
         local_on_master_enable_prepare()
     end
 
-    local ok, err = pcall(box.cfg, cfg)
+    local box_cfg = table.copy(cfg)
+    lcfg.remove_non_box_options(box_cfg)
+    local ok, err = pcall(box.cfg, box_cfg)
     while M.errinj.ERRINJ_CFG_DELAY do
         lfiber.sleep(0.01)
     end
@@ -1604,10 +1619,8 @@ local function storage_cfg(cfg, this_replica_uuid)
     local uri = luri.parse(this_replica.uri)
     box.once("vshard:storage:1", storage_schema_v1, uri.login, uri.password)
 
+    lreplicaset.rebind_replicasets(new_replicasets, M.replicasets)
     M.replicasets = new_replicasets
-    for _, replicaset in pairs(new_replicasets) do
-        replicaset:rebind_connections()
-    end
     M.this_replicaset = this_replicaset
     M.this_replica = this_replica
     M.total_bucket_count = total_bucket_count
@@ -1846,6 +1859,14 @@ end
 -- restarted (or is restarted from M.background_f, which is not
 -- changed) and continues use old func1 and func2.
 --
+
+if not rawget(_G, MODULE_INTERNALS) then
+    rawset(_G, MODULE_INTERNALS, M)
+else
+    storage_cfg(M.current_cfg, M.this_replica.uuid)
+    M.module_version = M.module_version + 1
+end
+
 M.recovery_f = recovery_f
 M.collect_garbage_f = collect_garbage_f
 M.rebalancer_f = rebalancer_f
@@ -1861,12 +1882,6 @@ M.rebalancer_build_routes = rebalancer_build_routes
 M.rebalancer_calculate_metrics = rebalancer_calculate_metrics
 M.cached_find_sharded_spaces = find_sharded_spaces
 
-if not rawget(_G, '__module_vshard_storage') then
-    rawset(_G, '__module_vshard_storage', M)
-else
-    M.module_version = M.module_version + 1
-end
-
 return {
     sync = sync,
     bucket_force_create = bucket_force_create,
diff --git a/vshard/util.lua b/vshard/util.lua
index ce79930..fb875ce 100644
--- a/vshard/util.lua
+++ b/vshard/util.lua
@@ -100,9 +100,29 @@ end
 -- Update latest versions of function
 M.reloadable_fiber_f = reloadable_fiber_f
 
+local function sync_task(delay, task, ...)
+    if delay then
+        fiber.sleep(delay)
+    end
+    task(...)
+end
+
+--
+-- Run a function without interrupting current fiber.
+-- @param delay Delay in seconds before the task should be
+--        executed.
+-- @param task Function to be executed.
+-- @param ... Arguments which would be passed to the `task`.
+--
+local function async_task(delay, task, ...)
+    assert(delay == nil or type(delay) == 'number')
+    fiber.create(sync_task, delay, task, ...)
+end
+
 return {
     tuple_extract_key = tuple_extract_key,
     reloadable_fiber_f = reloadable_fiber_f,
     generate_self_checker = generate_self_checker,
+    async_task = async_task,
     internal = M,
 }
-- 
2.14.1

* [tarantool-patches] [PATCH 3/4] Tests: separate bootstrap routine to a lua_libs
  2018-07-23 11:14 [tarantool-patches] [PATCH v2] vshard reload mechanism AKhatskevich
  2018-07-23 11:14 ` [tarantool-patches] [PATCH 1/4] Add test on error during reconfigure AKhatskevich
  2018-07-23 11:14 ` [tarantool-patches] [PATCH 2/4] Complete module reload AKhatskevich
@ 2018-07-23 11:14 ` AKhatskevich
  2018-07-23 13:36   ` [tarantool-patches] " Vladislav Shpilevoy
  2018-07-23 11:14 ` [tarantool-patches] [PATCH 4/4] Introduce storage reload evolution AKhatskevich
  3 siblings, 1 reply; 14+ messages in thread
From: AKhatskevich @ 2018-07-23 11:14 UTC (permalink / raw)
  To: v.shpilevoy, tarantool-patches

What is moved to `test/lua_libs/bootstrap_test_storage.lua`:
1. create schema
2. create main stored procedures
3. `wait_rebalancer_state` procedure

This code will be reused in further commits.
---
 test/lua_libs/bootstrap_test_storage.lua           | 50 ++++++++++++++++++++++
 test/rebalancer/box_1_a.lua                        | 47 ++------------------
 test/rebalancer/errinj.result                      |  2 +-
 test/rebalancer/errinj.test.lua                    |  2 +-
 test/rebalancer/rebalancer.result                  |  2 +-
 test/rebalancer/rebalancer.test.lua                |  2 +-
 test/rebalancer/rebalancer_lock_and_pin.result     |  2 +-
 test/rebalancer/rebalancer_lock_and_pin.test.lua   |  2 +-
 test/rebalancer/restart_during_rebalancing.result  |  2 +-
 .../rebalancer/restart_during_rebalancing.test.lua |  2 +-
 test/rebalancer/stress_add_remove_rs.result        |  2 +-
 test/rebalancer/stress_add_remove_rs.test.lua      |  2 +-
 .../rebalancer/stress_add_remove_several_rs.result |  2 +-
 .../stress_add_remove_several_rs.test.lua          |  2 +-
 test/rebalancer/suite.ini                          |  2 +-
 15 files changed, 66 insertions(+), 57 deletions(-)
 create mode 100644 test/lua_libs/bootstrap_test_storage.lua

diff --git a/test/lua_libs/bootstrap_test_storage.lua b/test/lua_libs/bootstrap_test_storage.lua
new file mode 100644
index 0000000..62c2f78
--- /dev/null
+++ b/test/lua_libs/bootstrap_test_storage.lua
@@ -0,0 +1,50 @@
+local log = require('log')
+
+function init_schema()
+	local format = {}
+	format[1] = {name = 'field', type = 'unsigned'}
+	format[2] = {name = 'bucket_id', type = 'unsigned'}
+	local s = box.schema.create_space('test', {format = format})
+	local pk = s:create_index('pk')
+	local bucket_id_idx =
+		s:create_index('vbucket', {parts = {'bucket_id'},
+					   unique = false})
+end
+
+box.once('schema', function()
+	box.schema.func.create('do_replace')
+	box.schema.role.grant('public', 'execute', 'function', 'do_replace')
+	box.schema.func.create('do_select')
+	box.schema.role.grant('public', 'execute', 'function', 'do_select')
+	init_schema()
+end)
+
+function do_replace(...)
+	box.space.test:replace(...)
+	return true
+end
+
+function do_select(...)
+	return box.space.test:select(...)
+end
+
+function check_consistency()
+	for _, tuple in box.space.test:pairs() do
+		assert(box.space._bucket:get{tuple.bucket_id})
+	end
+	return true
+end
+
+--
+-- Wait for a specified log message.
+-- Requirements:
+-- * Should be executed from a storage with a rebalancer.
+-- * NAME - global variable, name of instance should be set.
+function wait_rebalancer_state(state, test_run)
+	log.info(string.rep('a', 1000))
+	vshard.storage.rebalancer_wakeup()
+	while not test_run:grep_log(NAME, state, 1000) do
+		fiber.sleep(0.1)
+		vshard.storage.rebalancer_wakeup()
+	end
+end
diff --git a/test/rebalancer/box_1_a.lua b/test/rebalancer/box_1_a.lua
index 8fddcf0..8ab3444 100644
--- a/test/rebalancer/box_1_a.lua
+++ b/test/rebalancer/box_1_a.lua
@@ -2,7 +2,7 @@
 -- Get instance name
 require('strict').on()
 local fio = require('fio')
-local NAME = fio.basename(arg[0], '.lua')
+NAME = fio.basename(arg[0], '.lua')
 log = require('log')
 require('console').listen(os.getenv('ADMIN'))
 fiber = require('fiber')
@@ -23,40 +23,8 @@ if NAME == 'box_4_a' or NAME == 'box_4_b' or
 end
 vshard.storage.cfg(cfg, names.replica_uuid[NAME])
 
-function init_schema()
-	local format = {}
-	format[1] = {name = 'field', type = 'unsigned'}
-	format[2] = {name = 'bucket_id', type = 'unsigned'}
-	local s = box.schema.create_space('test', {format = format})
-	local pk = s:create_index('pk')
-	local bucket_id_idx =
-		s:create_index('vbucket', {parts = {'bucket_id'},
-					   unique = false})
-end
-
-box.once('schema', function()
-	box.schema.func.create('do_replace')
-	box.schema.role.grant('public', 'execute', 'function', 'do_replace')
-	box.schema.func.create('do_select')
-	box.schema.role.grant('public', 'execute', 'function', 'do_select')
-	init_schema()
-end)
-
-function do_replace(...)
-	box.space.test:replace(...)
-	return true
-end
-
-function do_select(...)
-	return box.space.test:select(...)
-end
-
-function check_consistency()
-	for _, tuple in box.space.test:pairs() do
-		assert(box.space._bucket:get{tuple.bucket_id})
-	end
-	return true
-end
+-- Bootstrap storage.
+require('lua_libs.bootstrap_test_storage')
 
 function switch_rs1_master()
 	local replica_uuid = names.replica_uuid
@@ -68,12 +36,3 @@ end
 function nullify_rs_weight()
 	cfg.sharding[names.rs_uuid[1]].weight = 0
 end
-
-function wait_rebalancer_state(state, test_run)
-	log.info(string.rep('a', 1000))
-	vshard.storage.rebalancer_wakeup()
-	while not test_run:grep_log(NAME, state, 1000) do
-		fiber.sleep(0.1)
-		vshard.storage.rebalancer_wakeup()
-	end
-end
diff --git a/test/rebalancer/errinj.result b/test/rebalancer/errinj.result
index d09349e..826c2c6 100644
--- a/test/rebalancer/errinj.result
+++ b/test/rebalancer/errinj.result
@@ -13,7 +13,7 @@ test_run:create_cluster(REPLICASET_1, 'rebalancer')
 test_run:create_cluster(REPLICASET_2, 'rebalancer')
 ---
 ...
-util = require('util')
+util = require('lua_libs.util')
 ---
 ...
 util.wait_master(test_run, REPLICASET_1, 'box_1_a')
diff --git a/test/rebalancer/errinj.test.lua b/test/rebalancer/errinj.test.lua
index d6a2920..fc0730c 100644
--- a/test/rebalancer/errinj.test.lua
+++ b/test/rebalancer/errinj.test.lua
@@ -5,7 +5,7 @@ REPLICASET_2 = { 'box_2_a', 'box_2_b' }
 
 test_run:create_cluster(REPLICASET_1, 'rebalancer')
 test_run:create_cluster(REPLICASET_2, 'rebalancer')
-util = require('util')
+util = require('lua_libs.util')
 util.wait_master(test_run, REPLICASET_1, 'box_1_a')
 util.wait_master(test_run, REPLICASET_2, 'box_2_a')
 
diff --git a/test/rebalancer/rebalancer.result b/test/rebalancer/rebalancer.result
index 88cbaae..71e43e1 100644
--- a/test/rebalancer/rebalancer.result
+++ b/test/rebalancer/rebalancer.result
@@ -13,7 +13,7 @@ test_run:create_cluster(REPLICASET_1, 'rebalancer')
 test_run:create_cluster(REPLICASET_2, 'rebalancer')
 ---
 ...
-util = require('util')
+util = require('lua_libs.util')
 ---
 ...
 util.wait_master(test_run, REPLICASET_1, 'box_1_a')
diff --git a/test/rebalancer/rebalancer.test.lua b/test/rebalancer/rebalancer.test.lua
index 01f2061..1b7ddae 100644
--- a/test/rebalancer/rebalancer.test.lua
+++ b/test/rebalancer/rebalancer.test.lua
@@ -5,7 +5,7 @@ REPLICASET_2 = { 'box_2_a', 'box_2_b' }
 
 test_run:create_cluster(REPLICASET_1, 'rebalancer')
 test_run:create_cluster(REPLICASET_2, 'rebalancer')
-util = require('util')
+util = require('lua_libs.util')
 util.wait_master(test_run, REPLICASET_1, 'box_1_a')
 util.wait_master(test_run, REPLICASET_2, 'box_2_a')
 
diff --git a/test/rebalancer/rebalancer_lock_and_pin.result b/test/rebalancer/rebalancer_lock_and_pin.result
index dd9fe47..0f2921c 100644
--- a/test/rebalancer/rebalancer_lock_and_pin.result
+++ b/test/rebalancer/rebalancer_lock_and_pin.result
@@ -16,7 +16,7 @@ test_run:create_cluster(REPLICASET_1, 'rebalancer')
 test_run:create_cluster(REPLICASET_2, 'rebalancer')
 ---
 ...
-util = require('util')
+util = require('lua_libs.util')
 ---
 ...
 util.wait_master(test_run, REPLICASET_1, 'box_1_a')
diff --git a/test/rebalancer/rebalancer_lock_and_pin.test.lua b/test/rebalancer/rebalancer_lock_and_pin.test.lua
index fe866c4..3a2daa0 100644
--- a/test/rebalancer/rebalancer_lock_and_pin.test.lua
+++ b/test/rebalancer/rebalancer_lock_and_pin.test.lua
@@ -6,7 +6,7 @@ REPLICASET_3 = { 'box_3_a', 'box_3_b' }
 
 test_run:create_cluster(REPLICASET_1, 'rebalancer')
 test_run:create_cluster(REPLICASET_2, 'rebalancer')
-util = require('util')
+util = require('lua_libs.util')
 util.wait_master(test_run, REPLICASET_1, 'box_1_a')
 util.wait_master(test_run, REPLICASET_2, 'box_2_a')
 
diff --git a/test/rebalancer/restart_during_rebalancing.result b/test/rebalancer/restart_during_rebalancing.result
index d2b8a12..0eb0f2e 100644
--- a/test/rebalancer/restart_during_rebalancing.result
+++ b/test/rebalancer/restart_during_rebalancing.result
@@ -25,7 +25,7 @@ test_run:create_cluster(REPLICASET_3, 'rebalancer')
 test_run:create_cluster(REPLICASET_4, 'rebalancer')
 ---
 ...
-util = require('util')
+util = require('lua_libs.util')
 ---
 ...
 util.wait_master(test_run, REPLICASET_1, 'fullbox_1_a')
diff --git a/test/rebalancer/restart_during_rebalancing.test.lua b/test/rebalancer/restart_during_rebalancing.test.lua
index 5b1a8df..7b707ca 100644
--- a/test/rebalancer/restart_during_rebalancing.test.lua
+++ b/test/rebalancer/restart_during_rebalancing.test.lua
@@ -9,7 +9,7 @@ test_run:create_cluster(REPLICASET_1, 'rebalancer')
 test_run:create_cluster(REPLICASET_2, 'rebalancer')
 test_run:create_cluster(REPLICASET_3, 'rebalancer')
 test_run:create_cluster(REPLICASET_4, 'rebalancer')
-util = require('util')
+util = require('lua_libs.util')
 util.wait_master(test_run, REPLICASET_1, 'fullbox_1_a')
 util.wait_master(test_run, REPLICASET_2, 'fullbox_2_a')
 util.wait_master(test_run, REPLICASET_3, 'fullbox_3_a')
diff --git a/test/rebalancer/stress_add_remove_rs.result b/test/rebalancer/stress_add_remove_rs.result
index 8a955e2..10bcaac 100644
--- a/test/rebalancer/stress_add_remove_rs.result
+++ b/test/rebalancer/stress_add_remove_rs.result
@@ -16,7 +16,7 @@ test_run:create_cluster(REPLICASET_1, 'rebalancer')
 test_run:create_cluster(REPLICASET_2, 'rebalancer')
 ---
 ...
-util = require('util')
+util = require('lua_libs.util')
 ---
 ...
 util.wait_master(test_run, REPLICASET_1, 'box_1_a')
diff --git a/test/rebalancer/stress_add_remove_rs.test.lua b/test/rebalancer/stress_add_remove_rs.test.lua
index c80df40..b9bb027 100644
--- a/test/rebalancer/stress_add_remove_rs.test.lua
+++ b/test/rebalancer/stress_add_remove_rs.test.lua
@@ -6,7 +6,7 @@ REPLICASET_3 = { 'box_3_a', 'box_3_b' }
 
 test_run:create_cluster(REPLICASET_1, 'rebalancer')
 test_run:create_cluster(REPLICASET_2, 'rebalancer')
-util = require('util')
+util = require('lua_libs.util')
 util.wait_master(test_run, REPLICASET_1, 'box_1_a')
 util.wait_master(test_run, REPLICASET_2, 'box_2_a')
 
diff --git a/test/rebalancer/stress_add_remove_several_rs.result b/test/rebalancer/stress_add_remove_several_rs.result
index d6008b8..611362c 100644
--- a/test/rebalancer/stress_add_remove_several_rs.result
+++ b/test/rebalancer/stress_add_remove_several_rs.result
@@ -19,7 +19,7 @@ test_run:create_cluster(REPLICASET_1, 'rebalancer')
 test_run:create_cluster(REPLICASET_2, 'rebalancer')
 ---
 ...
-util = require('util')
+util = require('lua_libs.util')
 ---
 ...
 util.wait_master(test_run, REPLICASET_1, 'box_1_a')
diff --git a/test/rebalancer/stress_add_remove_several_rs.test.lua b/test/rebalancer/stress_add_remove_several_rs.test.lua
index 3cc105e..9acb8de 100644
--- a/test/rebalancer/stress_add_remove_several_rs.test.lua
+++ b/test/rebalancer/stress_add_remove_several_rs.test.lua
@@ -7,7 +7,7 @@ REPLICASET_4 = { 'box_4_a', 'box_4_b' }
 
 test_run:create_cluster(REPLICASET_1, 'rebalancer')
 test_run:create_cluster(REPLICASET_2, 'rebalancer')
-util = require('util')
+util = require('lua_libs.util')
 util.wait_master(test_run, REPLICASET_1, 'box_1_a')
 util.wait_master(test_run, REPLICASET_2, 'box_2_a')
 
diff --git a/test/rebalancer/suite.ini b/test/rebalancer/suite.ini
index afc5141..8689da5 100644
--- a/test/rebalancer/suite.ini
+++ b/test/rebalancer/suite.ini
@@ -4,6 +4,6 @@ description = Rebalancer tests
 script = test.lua
 is_parallel = False
 release_disabled = errinj.test.lua
-lua_libs = ../lua_libs/util.lua config.lua names.lua router_1.lua
+lua_libs = ../lua_libs config.lua names.lua router_1.lua
            box_1_a.lua box_1_b.lua box_2_a.lua box_2_b.lua
            box_3_a.lua box_3_b.lua rebalancer_utils.lua
-- 
2.14.1

* [tarantool-patches] [PATCH 4/4] Introduce storage reload evolution
  2018-07-23 11:14 [tarantool-patches] [PATCH v2] vshard reload mechanism AKhatskevich
                   ` (2 preceding siblings ...)
  2018-07-23 11:14 ` [tarantool-patches] [PATCH 3/4] Tests: separate bootstrap routine to a lua_libs AKhatskevich
@ 2018-07-23 11:14 ` AKhatskevich
  2018-07-23 14:44   ` [tarantool-patches] " Vladislav Shpilevoy
  3 siblings, 1 reply; 14+ messages in thread
From: AKhatskevich @ 2018-07-23 11:14 UTC (permalink / raw)
  To: v.shpilevoy, tarantool-patches

Changes:
1. Introduce storage reload evolution.
2. Setup cross-version reload testing.

1:
This mechanism updates Lua objects on reload in case they are
changed in a new vshard.storage version.

Since this commit, any change in vshard.storage.M has to be
reflected in vshard.storage.reload_evolution to guarantee
correct reload.
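
A minimal sketch of how such an evolution step could look, assuming a
list of migration functions indexed by version. The names `migrations`,
`upgrade` and `new_field` are hypothetical and are not taken from
vshard/storage/reload_evolution.lua; only `reload_evolution_version` is
the field checked by the test in this patch.

```lua
-- Hypothetical sketch, not the contents of reload_evolution.lua.
-- Each migration brings the module internals M from version N-1 to N.
local migrations = {}

-- Migration 1: add a field that an older module version did not have.
-- `new_field` is an invented name used only for illustration.
migrations[#migrations + 1] = function(M)
    if M.new_field == nil then
        M.new_field = {}
    end
end

-- Apply every migration that M has not seen yet and record the version.
local function upgrade(M)
    local start_version = M.reload_evolution_version or 0
    assert(start_version <= #migrations,
           'downgrade is not handled in this sketch')
    for version = start_version + 1, #migrations do
        migrations[version](M)
        M.reload_evolution_version = version
    end
    return M.reload_evolution_version
end

-- M stands in for the table kept in _G across reloads.
local M = {}
upgrade(M)
-- A second call finds nothing to apply and leaves M untouched.
upgrade(M)
```

Keeping the version inside M means a reload only has to compare one
number to decide which steps are still pending.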

2:
The testing uses git infrastructure and is performed in the following
way:
1. Copy an old version of vshard to a temp folder.
2. Run vshard on this code.
3. Check out the latest version of the vshard sources.
4. Reload vshard storage.
5. Make sure it works (perform simple tests).
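
The two checkout steps above can be sketched as plain shell command
lines. The helper below only builds the strings and does not run git;
`build_git_cmd`, the copy path and both hashes are hypothetical
placeholders, and the shape of the params table merely follows the one
used by `git_util.exec_cmd` in this patch.

```lua
-- Build (but do not execute) a git command line from a params table
-- with optional `options`, `cmd`, `args` and `dir` fields.
local function build_git_cmd(params)
    local parts = {'git'}
    for _, key in ipairs({'options', 'cmd', 'args'}) do
        if params[key] then
            table.insert(parts, params[key])
        end
    end
    local cmd = table.concat(parts, ' ')
    if params.dir then
        cmd = string.format('cd %s && %s', params.dir, cmd)
    end
    return cmd
end

-- Placeholder values, chosen only for illustration.
local copy_path = '/tmp/vshard_copy'
local oldest_hash, newest_hash = 'aaaaaaa', 'bbbbbbb'

-- Steps 1-2: start from the commit just before the mechanism appeared.
local before = build_git_cmd({cmd = 'checkout',
                              args = oldest_hash .. '~1',
                              dir = copy_path})
-- Step 3: move the same tree to the latest version before reloading.
local after = build_git_cmd({cmd = 'checkout', args = newest_hash,
                             dir = copy_path})
print(before)
print(after)
```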

Notes:
* This patch contains some legacy-driven decisions:
  1. The SOURCEDIR path is retrieved differently in case of
     a packpack build.
  2. The git directory in the `reload_evolution/storage` test
     is copied with respect to CentOS 7 and the `ro` mode of
     SOURCEDIR.
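
The SOURCEDIR lookup order from note 1 can be sketched as a fallback
chain over environment variables. `resolve_dir` and the fake
environment are hypothetical and exist only so the sketch runs outside
a packpack build; the variable names are the ones used in
test/lua_libs/util.lua in this patch.

```lua
-- Return the value of the first set environment variable from `names`,
-- or `default` if none is set. `getenv` is injectable for testing.
local function resolve_dir(names, default, getenv)
    getenv = getenv or os.getenv
    for _, name in ipairs(names) do
        local value = getenv(name)
        if value then
            return value
        end
    end
    return default
end

-- Fake environment imitating a packpack build (an assumption).
local fake_env = {PACKPACK_GIT_SOURCEDIR = '/source/'}
local function fake_getenv(name)
    return fake_env[name]
end

-- The packpack variable wins over SOURCEDIR; '.' is a placeholder
-- default, not the script-relative path computed in util.lua.
local sourcedir = resolve_dir({'PACKPACK_GIT_SOURCEDIR', 'SOURCEDIR'},
                              '.', fake_getenv)
-- BUILDDIR falls back to SOURCEDIR when it is not set.
local builddir = resolve_dir({'BUILDDIR'}, sourcedir, fake_getenv)
print(sourcedir, builddir)
```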

Closes #112 #125
---
 .travis.yml                            |   2 +-
 rpm/prebuild.sh                        |   2 +
 test/lua_libs/git_util.lua             |  51 +++++++
 test/lua_libs/util.lua                 |  20 +++
 test/reload_evolution/storage.result   | 248 +++++++++++++++++++++++++++++++++
 test/reload_evolution/storage.test.lua |  88 ++++++++++++
 test/reload_evolution/storage_1_a.lua  |  48 +++++++
 test/reload_evolution/storage_1_b.lua  |   1 +
 test/reload_evolution/storage_2_a.lua  |   1 +
 test/reload_evolution/storage_2_b.lua  |   1 +
 test/reload_evolution/suite.ini        |   6 +
 test/reload_evolution/test.lua         |   9 ++
 test/unit/reload_evolution.result      |  45 ++++++
 test/unit/reload_evolution.test.lua    |  18 +++
 vshard/storage/init.lua                |  11 ++
 vshard/storage/reload_evolution.lua    |  58 ++++++++
 16 files changed, 608 insertions(+), 1 deletion(-)
 create mode 100644 test/lua_libs/git_util.lua
 create mode 100644 test/reload_evolution/storage.result
 create mode 100644 test/reload_evolution/storage.test.lua
 create mode 100755 test/reload_evolution/storage_1_a.lua
 create mode 120000 test/reload_evolution/storage_1_b.lua
 create mode 120000 test/reload_evolution/storage_2_a.lua
 create mode 120000 test/reload_evolution/storage_2_b.lua
 create mode 100644 test/reload_evolution/suite.ini
 create mode 100644 test/reload_evolution/test.lua
 create mode 100644 test/unit/reload_evolution.result
 create mode 100644 test/unit/reload_evolution.test.lua
 create mode 100644 vshard/storage/reload_evolution.lua

diff --git a/.travis.yml b/.travis.yml
index 54bfe44..eff4a51 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -41,7 +41,7 @@ env:
 script:
   - git describe --long
   - git clone https://github.com/packpack/packpack.git packpack
-  - packpack/packpack
+  - packpack/packpack -e PACKPACK_GIT_SOURCEDIR=/source/
 
 before_deploy:
   - ls -l build/
diff --git a/rpm/prebuild.sh b/rpm/prebuild.sh
index 768b22b..554032b 100755
--- a/rpm/prebuild.sh
+++ b/rpm/prebuild.sh
@@ -1 +1,3 @@
 curl -s https://packagecloud.io/install/repositories/tarantool/1_9/script.rpm.sh | sudo bash
+sudo yum -y install python-devel python-pip
+sudo pip install tarantool msgpack
diff --git a/test/lua_libs/git_util.lua b/test/lua_libs/git_util.lua
new file mode 100644
index 0000000..a75bb08
--- /dev/null
+++ b/test/lua_libs/git_util.lua
@@ -0,0 +1,51 @@
+--
+-- Lua bridge for some of the git commands.
+--
+local os = require('os')
+
+local temp_file = 'some_strange_rare_unique_file_name_for_git_util'
+
+--
+-- Exec a git command.
+-- @param params Table of parameters:
+--        * options - git options.
+--        * cmd - git command.
+--        * args - command arguments.
+--        * dir - working directory.
+--        * fout - write output to the file.
+local function exec_cmd(params)
+    local fout = params.fout
+    local shell_cmd = {'git'}
+    for _, param in pairs({'options', 'cmd', 'args'}) do
+        table.insert(shell_cmd, params[param])
+    end
+    if fout then
+        table.insert(shell_cmd, ' >' .. fout)
+    end
+    shell_cmd = table.concat(shell_cmd, ' ')
+    if params.dir then
+        shell_cmd = string.format('cd %s && %s', params.dir, shell_cmd)
+    end
+    local res = os.execute(shell_cmd)
+    assert(res == 0, 'Git cmd error: ' .. res)
+end
+
+local function log_hashes(params)
+    params.args = "--format='%h' " .. params.args
+    local local_temp_file = string.format('%s/%s', os.getenv('PWD'), temp_file)
+    params.fout = local_temp_file
+    params.cmd = 'log'
+    exec_cmd(params)
+    local lines = {}
+    for line in io.lines(local_temp_file) do
+        table.insert(lines, line)
+    end
+    os.remove(local_temp_file)
+    return lines
+end
+
+
+return {
+    exec_cmd = exec_cmd,
+    log_hashes = log_hashes
+}
diff --git a/test/lua_libs/util.lua b/test/lua_libs/util.lua
index f40d3a6..935ff41 100644
--- a/test/lua_libs/util.lua
+++ b/test/lua_libs/util.lua
@@ -1,5 +1,6 @@
 local fiber = require('fiber')
 local log = require('log')
+local fio = require('fio')
 
 local function check_error(func, ...)
     local pstatus, status, err = pcall(func, ...)
@@ -92,10 +93,29 @@ local function has_same_fields(etalon, data)
     return true
 end
 
+-- Git directory of the project. Used in evolution tests to
+-- fetch old versions of vshard.
+local SOURCEDIR = os.getenv('PACKPACK_GIT_SOURCEDIR')
+if not SOURCEDIR then
+    SOURCEDIR = os.getenv('SOURCEDIR')
+end
+if not SOURCEDIR then
+    local script_path = debug.getinfo(1).source:match("@?(.*/)")
+    script_path = fio.abspath(script_path)
+    SOURCEDIR = fio.abspath(script_path .. '/../../../')
+end
+
+local BUILDDIR = os.getenv('BUILDDIR')
+if not BUILDDIR then
+    BUILDDIR = SOURCEDIR
+end
+
 return {
     check_error = check_error,
     shuffle_masters = shuffle_masters,
     collect_timeouts = collect_timeouts,
     wait_master = wait_master,
     has_same_fields = has_same_fields,
+    SOURCEDIR = SOURCEDIR,
+    BUILDDIR = BUILDDIR,
 }
diff --git a/test/reload_evolution/storage.result b/test/reload_evolution/storage.result
new file mode 100644
index 0000000..54ff6b7
--- /dev/null
+++ b/test/reload_evolution/storage.result
@@ -0,0 +1,248 @@
+test_run = require('test_run').new()
+---
+...
+git_util = require('lua_libs.git_util')
+---
+...
+util = require('lua_libs.util')
+---
+...
+vshard_copy_path = util.BUILDDIR .. '/test/var/vshard_git_tree_copy'
+---
+...
+evolution_log = git_util.log_hashes({args='vshard/storage/reload_evolution.lua', dir=util.SOURCEDIR})
+---
+...
+-- Cleanup the directory after a previous build.
+_ = os.execute('rm -rf ' .. vshard_copy_path)
+---
+...
+-- 1. `git worktree` cannot be used because PACKPACK mounts
+-- `/source/` in `ro` mode.
+-- 2. Just `cp -rf` cannot be used due to a little different
+-- behavior in Centos 7.
+_ = os.execute('mkdir ' .. vshard_copy_path)
+---
+...
+_ = os.execute("cd " .. util.SOURCEDIR .. ' && cp -rf `ls -A --ignore=build` ' .. vshard_copy_path)
+---
+...
+-- Checkout the first commit with a reload_evolution mechanism.
+git_util.exec_cmd({cmd='checkout', args='-f', dir=vshard_copy_path})
+---
+...
+git_util.exec_cmd({cmd='checkout', args=evolution_log[#evolution_log] .. '~1', dir=vshard_copy_path})
+---
+...
+REPLICASET_1 = { 'storage_1_a', 'storage_1_b' }
+---
+...
+REPLICASET_2 = { 'storage_2_a', 'storage_2_b' }
+---
+...
+test_run:create_cluster(REPLICASET_1, 'reload_evolution')
+---
+...
+test_run:create_cluster(REPLICASET_2, 'reload_evolution')
+---
+...
+util = require('lua_libs.util')
+---
+...
+util.wait_master(test_run, REPLICASET_1, 'storage_1_a')
+---
+...
+util.wait_master(test_run, REPLICASET_2, 'storage_2_a')
+---
+...
+test_run:switch('storage_1_a')
+---
+- true
+...
+vshard.storage.bucket_force_create(1, vshard.consts.DEFAULT_BUCKET_COUNT / 2)
+---
+- true
+...
+bucket_id_to_move = vshard.consts.DEFAULT_BUCKET_COUNT
+---
+...
+test_run:switch('storage_2_a')
+---
+- true
+...
+fiber = require('fiber')
+---
+...
+vshard.storage.bucket_force_create(vshard.consts.DEFAULT_BUCKET_COUNT / 2 + 1, vshard.consts.DEFAULT_BUCKET_COUNT / 2)
+---
+- true
+...
+bucket_id_to_move = vshard.consts.DEFAULT_BUCKET_COUNT
+---
+...
+vshard.storage.internal.reload_evolution_version
+---
+- null
+...
+box.space.test:insert({42, bucket_id_to_move})
+---
+- [42, 3000]
+...
+while test_run:grep_log('storage_2_a', 'The cluster is balanced ok') == nil do vshard.storage.rebalancer_wakeup() fiber.sleep(0.1) end
+---
+...
+test_run:switch('default')
+---
+- true
+...
+git_util.exec_cmd({cmd='checkout', args=evolution_log[1], dir=vshard_copy_path})
+---
+...
+test_run:switch('storage_2_a')
+---
+- true
+...
+package.loaded["vshard.storage"] = nil
+---
+...
+vshard.storage = require("vshard.storage")
+---
+...
+test_run:grep_log('storage_2_a', 'vshard.storage.reload_evolution: upgraded to') ~= nil
+---
+- true
+...
+vshard.storage.internal.reload_evolution_version
+---
+- 1
+...
+-- Make sure storage operates well.
+vshard.storage.bucket_force_drop(2000)
+---
+- true
+...
+vshard.storage.bucket_force_create(2000)
+---
+- true
+...
+vshard.storage.buckets_info()[2000]
+---
+- status: active
+  id: 2000
+...
+vshard.storage.call(bucket_id_to_move, 'read', 'do_select', {42})
+---
+- true
+- - [42, 3000]
+...
+vshard.storage.bucket_send(bucket_id_to_move, replicaset1_uuid)
+---
+- true
+...
+vshard.storage.garbage_collector_wakeup()
+---
+...
+fiber = require('fiber')
+---
+...
+while box.space._bucket:get({bucket_id_to_move}) do fiber.sleep(0.01) end
+---
+...
+test_run:switch('storage_1_a')
+---
+- true
+...
+vshard.storage.bucket_send(bucket_id_to_move, replicaset2_uuid)
+---
+- true
+...
+test_run:switch('storage_2_a')
+---
+- true
+...
+vshard.storage.call(bucket_id_to_move, 'read', 'do_select', {42})
+---
+- true
+- - [42, 3000]
+...
+-- Check info() does not fail.
+vshard.storage.info() ~= nil
+---
+- true
+...
+--
+-- Send buckets to create a disbalance. Wait until the rebalancer
+-- repairs it. Similar to `tests/rebalancer/rebalancer.test.lua`.
+--
+vshard.storage.rebalancer_disable()
+---
+...
+move_start = vshard.consts.DEFAULT_BUCKET_COUNT / 2 + 1
+---
+...
+move_cnt = 100
+---
+...
+assert(move_start + move_cnt < vshard.consts.DEFAULT_BUCKET_COUNT)
+---
+- true
+...
+for i = move_start, move_start + move_cnt - 1 do box.space._bucket:delete{i} end
+---
+...
+box.space._bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
+---
+- 1400
+...
+test_run:switch('storage_1_a')
+---
+- true
+...
+move_start = vshard.consts.DEFAULT_BUCKET_COUNT / 2 + 1
+---
+...
+move_cnt = 100
+---
+...
+vshard.storage.bucket_force_create(move_start, move_cnt)
+---
+- true
+...
+box.space._bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
+---
+- 1600
+...
+test_run:switch('storage_2_a')
+---
+- true
+...
+vshard.storage.rebalancer_enable()
+---
+...
+vshard.storage.rebalancer_wakeup()
+---
+...
+wait_rebalancer_state("Rebalance routes are sent", test_run)
+---
+...
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+---
+...
+box.space._bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
+---
+- 1500
+...
+test_run:switch('default')
+---
+- true
+...
+test_run:drop_cluster(REPLICASET_2)
+---
+...
+test_run:drop_cluster(REPLICASET_1)
+---
+...
+test_run:cmd('clear filter')
+---
+- true
+...
diff --git a/test/reload_evolution/storage.test.lua b/test/reload_evolution/storage.test.lua
new file mode 100644
index 0000000..5817667
--- /dev/null
+++ b/test/reload_evolution/storage.test.lua
@@ -0,0 +1,88 @@
+test_run = require('test_run').new()
+
+git_util = require('lua_libs.git_util')
+util = require('lua_libs.util')
+vshard_copy_path = util.BUILDDIR .. '/test/var/vshard_git_tree_copy'
+evolution_log = git_util.log_hashes({args='vshard/storage/reload_evolution.lua', dir=util.SOURCEDIR})
+-- Cleanup the directory after a previous build.
+_ = os.execute('rm -rf ' .. vshard_copy_path)
+-- 1. `git worktree` cannot be used because PACKPACK mounts
+-- `/source/` in `ro` mode.
+-- 2. Just `cp -rf` cannot be used due to a little different
+-- behavior in Centos 7.
+_ = os.execute('mkdir ' .. vshard_copy_path)
+_ = os.execute("cd " .. util.SOURCEDIR .. ' && cp -rf `ls -A --ignore=build` ' .. vshard_copy_path)
+-- Checkout the first commit with a reload_evolution mechanism.
+git_util.exec_cmd({cmd='checkout', args='-f', dir=vshard_copy_path})
+git_util.exec_cmd({cmd='checkout', args=evolution_log[#evolution_log] .. '~1', dir=vshard_copy_path})
+
+REPLICASET_1 = { 'storage_1_a', 'storage_1_b' }
+REPLICASET_2 = { 'storage_2_a', 'storage_2_b' }
+test_run:create_cluster(REPLICASET_1, 'reload_evolution')
+test_run:create_cluster(REPLICASET_2, 'reload_evolution')
+util = require('lua_libs.util')
+util.wait_master(test_run, REPLICASET_1, 'storage_1_a')
+util.wait_master(test_run, REPLICASET_2, 'storage_2_a')
+
+test_run:switch('storage_1_a')
+vshard.storage.bucket_force_create(1, vshard.consts.DEFAULT_BUCKET_COUNT / 2)
+bucket_id_to_move = vshard.consts.DEFAULT_BUCKET_COUNT
+
+test_run:switch('storage_2_a')
+fiber = require('fiber')
+vshard.storage.bucket_force_create(vshard.consts.DEFAULT_BUCKET_COUNT / 2 + 1, vshard.consts.DEFAULT_BUCKET_COUNT / 2)
+bucket_id_to_move = vshard.consts.DEFAULT_BUCKET_COUNT
+vshard.storage.internal.reload_evolution_version
+box.space.test:insert({42, bucket_id_to_move})
+while test_run:grep_log('storage_2_a', 'The cluster is balanced ok') == nil do vshard.storage.rebalancer_wakeup() fiber.sleep(0.1) end
+
+test_run:switch('default')
+git_util.exec_cmd({cmd='checkout', args=evolution_log[1], dir=vshard_copy_path})
+
+test_run:switch('storage_2_a')
+package.loaded["vshard.storage"] = nil
+vshard.storage = require("vshard.storage")
+test_run:grep_log('storage_2_a', 'vshard.storage.reload_evolution: upgraded to') ~= nil
+vshard.storage.internal.reload_evolution_version
+-- Make sure storage operates well.
+vshard.storage.bucket_force_drop(2000)
+vshard.storage.bucket_force_create(2000)
+vshard.storage.buckets_info()[2000]
+vshard.storage.call(bucket_id_to_move, 'read', 'do_select', {42})
+vshard.storage.bucket_send(bucket_id_to_move, replicaset1_uuid)
+vshard.storage.garbage_collector_wakeup()
+fiber = require('fiber')
+while box.space._bucket:get({bucket_id_to_move}) do fiber.sleep(0.01) end
+test_run:switch('storage_1_a')
+vshard.storage.bucket_send(bucket_id_to_move, replicaset2_uuid)
+test_run:switch('storage_2_a')
+vshard.storage.call(bucket_id_to_move, 'read', 'do_select', {42})
+-- Check info() does not fail.
+vshard.storage.info() ~= nil
+
+--
+-- Send buckets to create a disbalance. Wait until the rebalancer
+-- repairs it. Similar to `tests/rebalancer/rebalancer.test.lua`.
+--
+vshard.storage.rebalancer_disable()
+move_start = vshard.consts.DEFAULT_BUCKET_COUNT / 2 + 1
+move_cnt = 100
+assert(move_start + move_cnt < vshard.consts.DEFAULT_BUCKET_COUNT)
+for i = move_start, move_start + move_cnt - 1 do box.space._bucket:delete{i} end
+box.space._bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
+test_run:switch('storage_1_a')
+move_start = vshard.consts.DEFAULT_BUCKET_COUNT / 2 + 1
+move_cnt = 100
+vshard.storage.bucket_force_create(move_start, move_cnt)
+box.space._bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
+test_run:switch('storage_2_a')
+vshard.storage.rebalancer_enable()
+vshard.storage.rebalancer_wakeup()
+wait_rebalancer_state("Rebalance routes are sent", test_run)
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+box.space._bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
+
+test_run:switch('default')
+test_run:drop_cluster(REPLICASET_2)
+test_run:drop_cluster(REPLICASET_1)
+test_run:cmd('clear filter')
diff --git a/test/reload_evolution/storage_1_a.lua b/test/reload_evolution/storage_1_a.lua
new file mode 100755
index 0000000..a971457
--- /dev/null
+++ b/test/reload_evolution/storage_1_a.lua
@@ -0,0 +1,48 @@
+#!/usr/bin/env tarantool
+
+require('strict').on()
+
+local log = require('log')
+local fiber = require('fiber')
+local util = require('lua_libs.util')
+local fio = require('fio')
+
+-- Get instance name
+NAME = fio.basename(arg[0], '.lua')
+
+-- test-run gate.
+test_run = require('test_run').new()
+require('console').listen(os.getenv('ADMIN'))
+
+-- Run one storage on a different vshard version.
+-- To do that, place vshard src to
+-- BUILDDIR/test/var/vshard_git_tree_copy/.
+if NAME == 'storage_2_a' then
+    local script_path = debug.getinfo(1).source:match("@?(.*/)")
+    vshard_copy = util.BUILDDIR .. '/test/var/vshard_git_tree_copy'
+    package.path = string.format(
+        '%s/?.lua;%s/?/init.lua;%s',
+        vshard_copy, vshard_copy, package.path
+    )
+end
+
+-- Call a configuration provider
+cfg = require('localcfg')
+-- Name to uuid map
+names = {
+    ['storage_1_a'] = '8a274925-a26d-47fc-9e1b-af88ce939412',
+    ['storage_1_b'] = '3de2e3e1-9ebe-4d0d-abb1-26d301b84633',
+    ['storage_2_a'] = '1e02ae8a-afc0-4e91-ba34-843a356b8ed7',
+    ['storage_2_b'] = '001688c3-66f8-4a31-8e19-036c17d489c2',
+}
+
+replicaset1_uuid = 'cbf06940-0790-498b-948d-042b62cf3d29'
+replicaset2_uuid = 'ac522f65-aa94-4134-9f64-51ee384f1a54'
+replicasets = {replicaset1_uuid, replicaset2_uuid}
+
+-- Start the database with sharding
+vshard = require('vshard')
+vshard.storage.cfg(cfg, names[NAME])
+
+-- Bootstrap storage.
+require('lua_libs.bootstrap_test_storage')
diff --git a/test/reload_evolution/storage_1_b.lua b/test/reload_evolution/storage_1_b.lua
new file mode 120000
index 0000000..02572da
--- /dev/null
+++ b/test/reload_evolution/storage_1_b.lua
@@ -0,0 +1 @@
+storage_1_a.lua
\ No newline at end of file
diff --git a/test/reload_evolution/storage_2_a.lua b/test/reload_evolution/storage_2_a.lua
new file mode 120000
index 0000000..02572da
--- /dev/null
+++ b/test/reload_evolution/storage_2_a.lua
@@ -0,0 +1 @@
+storage_1_a.lua
\ No newline at end of file
diff --git a/test/reload_evolution/storage_2_b.lua b/test/reload_evolution/storage_2_b.lua
new file mode 120000
index 0000000..02572da
--- /dev/null
+++ b/test/reload_evolution/storage_2_b.lua
@@ -0,0 +1 @@
+storage_1_a.lua
\ No newline at end of file
diff --git a/test/reload_evolution/suite.ini b/test/reload_evolution/suite.ini
new file mode 100644
index 0000000..5f55418
--- /dev/null
+++ b/test/reload_evolution/suite.ini
@@ -0,0 +1,6 @@
+[default]
+core = tarantool
+description = Reload evolution tests
+script = test.lua
+is_parallel = False
+lua_libs = ../lua_libs ../../example/localcfg.lua
diff --git a/test/reload_evolution/test.lua b/test/reload_evolution/test.lua
new file mode 100644
index 0000000..ad0543a
--- /dev/null
+++ b/test/reload_evolution/test.lua
@@ -0,0 +1,9 @@
+#!/usr/bin/env tarantool
+
+require('strict').on()
+
+box.cfg{
+    listen = os.getenv("LISTEN"),
+}
+
+require('console').listen(os.getenv('ADMIN'))
diff --git a/test/unit/reload_evolution.result b/test/unit/reload_evolution.result
new file mode 100644
index 0000000..342ac24
--- /dev/null
+++ b/test/unit/reload_evolution.result
@@ -0,0 +1,45 @@
+test_run = require('test_run').new()
+---
+...
+fiber = require('fiber')
+---
+...
+log = require('log')
+---
+...
+util = require('util')
+---
+...
+reload_evolution = require('vshard.storage.reload_evolution')
+---
+...
+-- Init with the latest version.
+fake_M = { reload_evolution_version = reload_evolution.version }
+---
+...
+-- Test reload to the same version.
+reload_evolution.upgrade(fake_M)
+---
+...
+test_run:grep_log('default', 'vshard.storage.evolution') == nil
+---
+- true
+...
+-- Test downgrage version.
+log.info(string.rep('a', 1000))
+---
+...
+fake_M.reload_evolution_version = fake_M.reload_evolution_version + 1
+---
+...
+err = util.check_error(reload_evolution.upgrade, fake_M)
+---
+...
+err:match('auto%-downgrade is not implemented')
+---
+- auto-downgrade is not implemented
+...
+test_run:grep_log('default', 'vshard.storage.evolution', 1000) ~= nil
+---
+- false
+...
diff --git a/test/unit/reload_evolution.test.lua b/test/unit/reload_evolution.test.lua
new file mode 100644
index 0000000..c0fcdcd
--- /dev/null
+++ b/test/unit/reload_evolution.test.lua
@@ -0,0 +1,18 @@
+test_run = require('test_run').new()
+fiber = require('fiber')
+log = require('log')
+util = require('util')
+reload_evolution = require('vshard.storage.reload_evolution')
+-- Init with the latest version.
+fake_M = { reload_evolution_version = reload_evolution.version }
+
+-- Test reload to the same version.
+reload_evolution.upgrade(fake_M)
+test_run:grep_log('default', 'vshard.storage.evolution') == nil
+
+-- Test downgrage version.
+log.info(string.rep('a', 1000))
+fake_M.reload_evolution_version = fake_M.reload_evolution_version + 1
+err = util.check_error(reload_evolution.upgrade, fake_M)
+err:match('auto%-downgrade is not implemented')
+test_run:grep_log('default', 'vshard.storage.evolution', 1000) ~= nil
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index 07bd00c..3bec09f 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -10,6 +10,7 @@ if rawget(_G, MODULE_INTERNALS) then
     local vshard_modules = {
         'vshard.consts', 'vshard.error', 'vshard.cfg',
         'vshard.replicaset', 'vshard.util',
+        'vshard.storage.reload_evolution'
     }
     for _, module in pairs(vshard_modules) do
         package.loaded[module] = nil
@@ -20,12 +21,16 @@ local lerror = require('vshard.error')
 local lcfg = require('vshard.cfg')
 local lreplicaset = require('vshard.replicaset')
 local util = require('vshard.util')
+local reload_evolution = require('vshard.storage.reload_evolution')
 
 local M = rawget(_G, MODULE_INTERNALS)
 if not M then
     --
     -- The module is loaded for the first time.
     --
+    -- !!!WARNING: any change of this table must be reflected in
+    -- `vshard.storage.reload_evolution` module to guarantee
+    -- reloadability of the module.
     M = {
         ---------------- Common module attributes ----------------
         -- The last passed configuration.
@@ -105,6 +110,11 @@ if not M then
         -- a destination replicaset must drop already received
         -- data.
         rebalancer_sending_bucket = 0,
+
+        ------------------------- Reload -------------------------
+        -- Version of the loaded module. This number is used on
+        -- reload to determine which upgrade scripts to run.
+        reload_evolution_version = reload_evolution.version,
     }
 end
 
@@ -1863,6 +1873,7 @@ end
 if not rawget(_G, MODULE_INTERNALS) then
     rawset(_G, MODULE_INTERNALS, M)
 else
+    reload_evolution.upgrade(M)
     storage_cfg(M.current_cfg, M.this_replica.uuid)
     M.module_version = M.module_version + 1
 end
diff --git a/vshard/storage/reload_evolution.lua b/vshard/storage/reload_evolution.lua
new file mode 100644
index 0000000..f25ad49
--- /dev/null
+++ b/vshard/storage/reload_evolution.lua
@@ -0,0 +1,58 @@
+--
+-- This module is used to upgrade the vshard.storage on the fly.
+-- It updates internal Lua structures in case they are changed
+-- in a commit.
+--
+local log = require('log')
+
+--
+-- Array of upgrade functions.
+-- migrations[version] = function which upgrades module version
+-- from `version` to `version + 1`.
+--
+local migrations = {}
+
+-- Initialize the reload_evolution mechanism.
+migrations[#migrations + 1] = function (M)
+    -- Code to update Lua objects.
+end
+
+--
+-- Perform an update based on a version stored in `M` (internals).
+-- @param M Old module internals which should be updated.
+--
+local function upgrade(M)
+    local start_version = M.reload_evolution_version or 1
+    if start_version > #migrations then
+        local err_msg = string.format(
+            'vshard.storage.reload_evolution: ' ..
+            'auto-downgrade is not implemented; ' ..
+            'loaded version is %d, upgrade script version is %d',
+            start_version, #migrations
+        )
+        log.error(err_msg)
+        error(err_msg)
+    end
+    for i = start_version, #migrations  do
+        local ok, err = pcall(migrations[i], M)
+        if ok then
+            log.info('vshard.storage.reload_evolution: upgraded to %d version',
+                     i)
+        else
+            local err_msg = string.format(
+                'vshard.storage.reload_evolution: ' ..
+                'error during upgrade to %d version: %s', i, err
+            )
+            log.error(err_msg)
+            error(err_msg)
+        end
+        -- Update the version right after each upgrade so that it
+        -- stays up to date if a later migration fails.
+        M.reload_evolution_version = i
+    end
+end
+
+return {
+    version = #migrations,
+    upgrade = upgrade,
+}
-- 
2.14.1

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [tarantool-patches] Re: [PATCH 1/4] Add test on error during reconfigure
  2018-07-23 11:14 ` [tarantool-patches] [PATCH 1/4] Add test on error during reconfigure AKhatskevich
@ 2018-07-23 13:18   ` Vladislav Shpilevoy
  0 siblings, 0 replies; 14+ messages in thread
From: Vladislav Shpilevoy @ 2018-07-23 13:18 UTC (permalink / raw)
  To: tarantool-patches, AKhatskevich

Pushed to the master with one minor change:

* use util.check_error in storage/storage.test.lua.

On 23/07/2018 14:14, AKhatskevich wrote:
> In case reconfigure process fails, the node should continue
> work properly.
> ---
>   test/lua_libs/util.lua        | 24 ++++++++++++++++++++++++
>   test/router/router.result     | 30 ++++++++++++++++++++++++++++++
>   test/router/router.test.lua   |  9 +++++++++
>   test/storage/storage.result   | 39 +++++++++++++++++++++++++++++++++++++++
>   test/storage/storage.test.lua | 12 ++++++++++++
>   vshard/router/init.lua        |  7 +++++++
>   vshard/storage/init.lua       |  9 +++++++++
>   7 files changed, 130 insertions(+)
> 


* [tarantool-patches] Re: [PATCH 2/4] Complete module reload
  2018-07-23 11:14 ` [tarantool-patches] [PATCH 2/4] Complete module reload AKhatskevich
@ 2018-07-23 13:31   ` Vladislav Shpilevoy
  2018-07-23 13:45     ` Alex Khatskevich
  0 siblings, 1 reply; 14+ messages in thread
From: Vladislav Shpilevoy @ 2018-07-23 13:31 UTC (permalink / raw)
  To: tarantool-patches, AKhatskevich

Hi! Thanks for the patch!

> diff --git a/vshard/replicaset.lua b/vshard/replicaset.lua
> index 99f59aa..6c8d477 100644
> --- a/vshard/replicaset.lua
> +++ b/vshard/replicaset.lua
> @@ -21,6 +21,7 @@
>   --                                  requests to the replica>,
>   --             net_sequential_fail = <count of sequential failed
>   --                                    requests to the replica>,
> +--             is_outdated = nil/true,

https://github.com/tarantool/vshard/blob/kh/gh-112-reload-mt-2/vshard/replicaset.lua#L24

Looks like it is the version on the branch that is outdated,
not the replica object. Please push the latest version.

>   --          }
>   --      },
>   --      master = <master server from the array above>,
> @@ -34,6 +35,7 @@
>   --      etalon_bucket_count = <bucket count, that must be stored
>   --                             on this replicaset to reach the
>   --                             balance in a cluster>,
> +--      is_outdated = nil/true,
>   --  }
>   --
>   -- replicasets = {


* [tarantool-patches] Re: [PATCH 3/4] Tests: separate bootstrap routine to a lua_libs
  2018-07-23 11:14 ` [tarantool-patches] [PATCH 3/4] Tests: separate bootstrap routine to a lua_libs AKhatskevich
@ 2018-07-23 13:36   ` Vladislav Shpilevoy
  2018-07-23 17:19     ` Alex Khatskevich
  0 siblings, 1 reply; 14+ messages in thread
From: Vladislav Shpilevoy @ 2018-07-23 13:36 UTC (permalink / raw)
  To: tarantool-patches, AKhatskevich

Thank you for working on the patch!

1. Please do not start a commit title with a capital letter
when it is prefixed with a subsystem name. Here you should write:

     test: separate bootstrap ...

On 23/07/2018 14:14, AKhatskevich wrote:
> What is moved to `test/lul_libs/bootstrap_test_storage.lua`:

2. 'lul' libs?

> 1. create schema
> 2. create main stored procedures
> 3. `wait_rebalancer_state` procedure
> 
> This code would be reused it further commits.
> ---
>   test/lua_libs/bootstrap_test_storage.lua           | 50 ++++++++++++++++++++++
>   test/rebalancer/box_1_a.lua                        | 47 ++------------------
>   test/rebalancer/errinj.result                      |  2 +-
>   test/rebalancer/errinj.test.lua                    |  2 +-
>   test/rebalancer/rebalancer.result                  |  2 +-
>   test/rebalancer/rebalancer.test.lua                |  2 +-
>   test/rebalancer/rebalancer_lock_and_pin.result     |  2 +-
>   test/rebalancer/rebalancer_lock_and_pin.test.lua   |  2 +-
>   test/rebalancer/restart_during_rebalancing.result  |  2 +-
>   .../rebalancer/restart_during_rebalancing.test.lua |  2 +-
>   test/rebalancer/stress_add_remove_rs.result        |  2 +-
>   test/rebalancer/stress_add_remove_rs.test.lua      |  2 +-
>   .../rebalancer/stress_add_remove_several_rs.result |  2 +-
>   .../stress_add_remove_several_rs.test.lua          |  2 +-
>   test/rebalancer/suite.ini                          |  2 +-
>   15 files changed, 66 insertions(+), 57 deletions(-)
>   create mode 100644 test/lua_libs/bootstrap_test_storage.lua
> 
> diff --git a/test/lua_libs/bootstrap_test_storage.lua b/test/lua_libs/bootstrap_test_storage.lua
> new file mode 100644
> index 0000000..62c2f78
> --- /dev/null
> +++ b/test/lua_libs/bootstrap_test_storage.lua

3. Please, just merge it into util. It is actually just util.

> diff --git a/test/rebalancer/errinj.result b/test/rebalancer/errinj.result
> index d09349e..826c2c6 100644
> --- a/test/rebalancer/errinj.result
> +++ b/test/rebalancer/errinj.result
> @@ -13,7 +13,7 @@ test_run:create_cluster(REPLICASET_1, 'rebalancer')
>   test_run:create_cluster(REPLICASET_2, 'rebalancer')
>   ---
>   ...
> -util = require('util')
> +util = require('lua_libs.util')

4. Please, don't. This should disappear when you merge the utils
into util.lua.

>   ---
>   ...
>   util.wait_master(test_run, REPLICASET_1, 'box_1_a')


* [tarantool-patches] Re: [PATCH 2/4] Complete module reload
  2018-07-23 13:31   ` [tarantool-patches] " Vladislav Shpilevoy
@ 2018-07-23 13:45     ` Alex Khatskevich
  2018-07-23 14:58       ` Vladislav Shpilevoy
  0 siblings, 1 reply; 14+ messages in thread
From: Alex Khatskevich @ 2018-07-23 13:45 UTC (permalink / raw)
  To: Vladislav Shpilevoy, tarantool-patches



On 23.07.2018 16:31, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patch!
>
>> diff --git a/vshard/replicaset.lua b/vshard/replicaset.lua
>> index 99f59aa..6c8d477 100644
>> --- a/vshard/replicaset.lua
>> +++ b/vshard/replicaset.lua
>> @@ -21,6 +21,7 @@
>>   --                                  requests to the replica>,
>>   --             net_sequential_fail = <count of sequential failed
>>   --                                    requests to the replica>,
>> +--             is_outdated = nil/true,
>
> https://github.com/tarantool/vshard/blob/kh/gh-112-reload-mt-2/vshard/replicaset.lua#L24 
>
>
> Looks like the version on the branch actually is outdated
> instead of the replica object. Please, push the latest
> version.
Sorry. Pushed.


* [tarantool-patches] Re: [PATCH 4/4] Introduce storage reload evolution
  2018-07-23 11:14 ` [tarantool-patches] [PATCH 4/4] Introduce storage reload evolution AKhatskevich
@ 2018-07-23 14:44   ` Vladislav Shpilevoy
  2018-07-23 20:10     ` Alex Khatskevich
  0 siblings, 1 reply; 14+ messages in thread
From: Vladislav Shpilevoy @ 2018-07-23 14:44 UTC (permalink / raw)
  To: tarantool-patches, AKhatskevich

Thanks for the patch! See 4 comments below.

On 23/07/2018 14:14, AKhatskevich wrote:
> Changes:
> 1. Introduce storage reload evolution.
> 2. Setup cross-version reload testing.
> 
> 1:
> This mechanism updates Lua objects on reload in case they are
> changed in a new vshard.storage version.
> 
> Since this commit, any change in vshard.storage.M has to be
> reflected in vshard.storage.reload_evolution to guarantee
> correct reload.
> 
> 2:
> The testing uses git infrastructure and is performed in the following
> way:
> 1. Copy old version of vshard to a temp folder.
> 2. Run vshard on this code.
> 3. Checkout the latest version of the vshard sources.
> 4. Reload vshard storage.
> 5. Make sure it works (Perform simple tests).
> 
> Notes:
> * this patch contains some legacy-driven decisions:
>    1. SOURCEDIR path retrieved differently in case of
>       packpack build.
>    2. git directory in the `reload_evolution/storage` test
>       is copied with respect to Centos 7 and `ro` mode of
>       SOURCEDIR.
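The reload mechanism described in the commit message above can be illustrated with a minimal standalone sketch. It mirrors the loop in `vshard/storage/reload_evolution.lua` from this patch, but the second migration and the `new_counter` field are hypothetical and exist only to show how a later change to `M` would be registered. Logging and `pcall` handling are omitted for brevity.

```lua
-- Standalone sketch, not the vshard module itself.
local migrations = {}

-- Version 1: the mechanism is introduced, nothing to convert.
migrations[#migrations + 1] = function(M)
end

-- Version 2 (hypothetical): a later commit adds `new_counter` to M,
-- so internals created by older code must receive a default value.
migrations[#migrations + 1] = function(M)
    if M.new_counter == nil then
        M.new_counter = 0
    end
end

local function upgrade(M)
    -- Internals created before the mechanism existed have no version.
    local start_version = M.reload_evolution_version or 1
    if start_version > #migrations then
        error('auto-downgrade is not implemented')
    end
    for i = start_version, #migrations do
        migrations[i](M)
        M.reload_evolution_version = i
    end
end

-- Internals left over from an old module version.
local old_M = {}
upgrade(old_M)
print(old_M.reload_evolution_version, old_M.new_counter)
```

Each migration is written to be safe when run against internals that already have the field, because the loop starts at the stored version rather than the one after it.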
> 
> diff --git a/test/reload_evolution/storage.result b/test/reload_evolution/storage.result
> new file mode 100644
> index 0000000..54ff6b7
> --- /dev/null
> +++ b/test/reload_evolution/storage.result
> @@ -0,0 +1,248 @@
> +test_run = require('test_run').new()
> +---
> +...
> +git_util = require('lua_libs.git_util')
> +---
> +...
> +util = require('lua_libs.util')
> +---
> +...
> +vshard_copy_path = util.BUILDDIR .. '/test/var/vshard_git_tree_copy'
> +---
> +...
> +evolution_log = git_util.log_hashes({args='vshard/storage/reload_evolution.lua', dir=util.SOURCEDIR})
> +---
> +...
> +-- Cleanup the directory after a previous build.
> +_ = os.execute('rm -rf ' .. vshard_copy_path)
> +---
> +...
> +-- 1. `git worktree` cannot be used because PACKPACK mounts
> +-- `/source/` in `ro` mode.
> +-- 2. Just `cp -rf` cannot be used due to a little different
> +-- behavior in Centos 7.
> +_ = os.execute('mkdir ' .. vshard_copy_path)
> +---
> +...
> +_ = os.execute("cd " .. util.SOURCEDIR .. ' && cp -rf `ls -A --ignore=build` ' .. vshard_copy_path)
> +---
> +...
> +-- Checkout the first commit with a reload_evolution mechanism.
> +git_util.exec_cmd({cmd='checkout', args='-f', dir=vshard_copy_path})
> +---
> +...
> +git_util.exec_cmd({cmd='checkout', args=evolution_log[#evolution_log] .. '~1', dir=vshard_copy_path})
> +---
> +...
> +REPLICASET_1 = { 'storage_1_a', 'storage_1_b' }
> +---
> +...
> +REPLICASET_2 = { 'storage_2_a', 'storage_2_b' }
> +---
> +...
> +test_run:create_cluster(REPLICASET_1, 'reload_evolution')
> +---
> +...
> +test_run:create_cluster(REPLICASET_2, 'reload_evolution')
> +---
> +...
> +util = require('lua_libs.util')
> +---
> +...
> +util.wait_master(test_run, REPLICASET_1, 'storage_1_a')
> +---
> +...
> +util.wait_master(test_run, REPLICASET_2, 'storage_2_a')
> +---
> +...
> +test_run:switch('storage_1_a')
> +---
> +- true
> +...
> +vshard.storage.bucket_force_create(1, vshard.consts.DEFAULT_BUCKET_COUNT / 2)
> +---
> +- true
> +...
> +bucket_id_to_move = vshard.consts.DEFAULT_BUCKET_COUNT
> +---
> +...
> +test_run:switch('storage_2_a')
> +---
> +- true
> +...
> +fiber = require('fiber')
> +---
> +...
> +vshard.storage.bucket_force_create(vshard.consts.DEFAULT_BUCKET_COUNT / 2 + 1, vshard.consts.DEFAULT_BUCKET_COUNT / 2)
> +---
> +- true
> +...
> +bucket_id_to_move = vshard.consts.DEFAULT_BUCKET_COUNT
> +---
> +...
> +vshard.storage.internal.reload_evolution_version
> +---
> +- null
> +...
> +box.space.test:insert({42, bucket_id_to_move})
> +---
> +- [42, 3000]
> +...
> +while test_run:grep_log('storage_2_a', 'The cluster is balanced ok') == nil do vshard.storage.rebalancer_wakeup() fiber.sleep(0.1) end

1. You now have the wait_rebalancer_state util from the previous
commit. Please use it here.

> +---
> +...
> +test_run:switch('default')
> +---
> +- true
> +...
> +git_util.exec_cmd({cmd='checkout', args=evolution_log[1], dir=vshard_copy_path})
> +---
> +...
> +test_run:switch('storage_2_a')
> +---
> +- true
> +...
> +package.loaded["vshard.storage"] = nil
> +---
> +...
> +vshard.storage = require("vshard.storage")
> +---
> +...
> +test_run:grep_log('storage_2_a', 'vshard.storage.reload_evolution: upgraded to') ~= nil
> +---
> +- true
> +...
> +vshard.storage.internal.reload_evolution_version
> +---
> +- 1
> +...
> +-- Make sure storage operates well.
> +vshard.storage.bucket_force_drop(2000)
> +---
> +- true
> +...
> +vshard.storage.bucket_force_create(2000)
> +---
> +- true
> +...
> +vshard.storage.buckets_info()[2000]
> +---
> +- status: active
> +  id: 2000
> +...
> +vshard.storage.call(bucket_id_to_move, 'read', 'do_select', {42})
> +---
> +- true
> +- - [42, 3000]
> +...
> +vshard.storage.bucket_send(bucket_id_to_move, replicaset1_uuid)
> +---
> +- true
> +...
> +vshard.storage.garbage_collector_wakeup()
> +---
> +...
> +fiber = require('fiber')
> +---
> +...
> +while box.space._bucket:get({bucket_id_to_move}) do fiber.sleep(0.01) end
> +---
> +...
> +test_run:switch('storage_1_a')
> +---
> +- true
> +...
> +vshard.storage.bucket_send(bucket_id_to_move, replicaset2_uuid)
> +---
> +- true
> +...
> +test_run:switch('storage_2_a')
> +---
> +- true
> +...
> +vshard.storage.call(bucket_id_to_move, 'read', 'do_select', {42})
> +---
> +- true
> +- - [42, 3000]
> +...
> +-- Check info() does not fail.
> +vshard.storage.info() ~= nil
> +---
> +- true
> +...
> +--
> +-- Send buckets to create a disbalance. Wait until the rebalancer
> +-- repairs it. Similar to `tests/rebalancer/rebalancer.test.lua`.
> +--
> +vshard.storage.rebalancer_disable()
> +---
> +...
> +move_start = vshard.consts.DEFAULT_BUCKET_COUNT / 2 + 1
> +---
> +...
> +move_cnt = 100
> +---
> +...
> +assert(move_start + move_cnt < vshard.consts.DEFAULT_BUCKET_COUNT)
> +---
> +- true
> +...
> +for i = move_start, move_start + move_cnt - 1 do box.space._bucket:delete{i} end
> +---
> +...
> +box.space._bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
> +---
> +- 1400
> +...
> +test_run:switch('storage_1_a')
> +---
> +- true
> +...
> +move_start = vshard.consts.DEFAULT_BUCKET_COUNT / 2 + 1
> +---
> +...
> +move_cnt = 100
> +---
> +...
> +vshard.storage.bucket_force_create(move_start, move_cnt)
> +---
> +- true
> +...
> +box.space._bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
> +---
> +- 1600
> +...
> +test_run:switch('storage_2_a')
> +---
> +- true
> +...
> +vshard.storage.rebalancer_enable()
> +---
> +...
> +vshard.storage.rebalancer_wakeup()

2. You do not need explicit rebalancer_wakeup. wait_rebalancer_state
calls it.

> +---
> +...
> +wait_rebalancer_state("Rebalance routes are sent", test_run)
> +---
> +...
> +wait_rebalancer_state('The cluster is balanced ok', test_run)
> +---
> +...
> +box.space._bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
> +---
> +- 1500
> +...
> +test_run:switch('default')
> +---
> +- true
> +...
> +test_run:drop_cluster(REPLICASET_2)
> +---
> +...
> +test_run:drop_cluster(REPLICASET_1)
> +---
> +...
> +test_run:cmd('clear filter')
> +---
> +- true
> +...
> diff --git a/test/unit/reload_evolution.result b/test/unit/reload_evolution.result
> new file mode 100644
> index 0000000..342ac24
> --- /dev/null
> +++ b/test/unit/reload_evolution.result
> @@ -0,0 +1,45 @@
> +test_run = require('test_run').new()
> +---
> +...
> +fiber = require('fiber')
> +---
> +...
> +log = require('log')
> +---
> +...
> +util = require('util')
> +---
> +...
> +reload_evolution = require('vshard.storage.reload_evolution')
> +---
> +...
> +-- Init with the latest version.
> +fake_M = { reload_evolution_version = reload_evolution.version }
> +---
> +...
> +-- Test reload to the same version.
> +reload_evolution.upgrade(fake_M)
> +---
> +...
> +test_run:grep_log('default', 'vshard.storage.evolution') == nil
> +---
> +- true
> +...
> +-- Test downgrage version.
> +log.info(string.rep('a', 1000))
> +---
> +...
> +fake_M.reload_evolution_version = fake_M.reload_evolution_version + 1
> +---
> +...
> +err = util.check_error(reload_evolution.upgrade, fake_M)
> +---
> +...
> +err:match('auto%-downgrade is not implemented')
> +---
> +- auto-downgrade is not implemented

3. Why do you need match? check_error output is ok already. And
what is 'auto%'? I see that you always print exactly "auto-downgrade"
in reload_evolution.lua.
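For context on the `auto%` question: in Lua patterns `-` is a lazy repetition modifier, and `%` escapes the character that follows it, so `%-` matches a literal hyphen. The short sketch below uses a made-up message string shaped like the one in the patch. It shows the escaped pattern matching, the unescaped one failing, and a plain-text `string.find` as an alternative that needs no escaping.

```lua
-- Sample message shaped like the error text built in reload_evolution.lua.
local msg = 'vshard.storage.reload_evolution: ' ..
            'auto-downgrade is not implemented'

-- '%-' is a literal hyphen.
local escaped = msg:match('auto%-downgrade is not implemented')

-- Unescaped, 'o-' means "zero or more 'o', as few as possible",
-- so the pattern expects 'd' right after 'aut' plus optional 'o's
-- and never sees the hyphen in the message.
local unescaped = msg:match('auto-downgrade is not implemented')

-- Plain search (4th argument `true`) treats the needle as raw text.
local first, last = msg:find('auto-downgrade', 1, true)

print(escaped, unescaped, first, last)
```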

> +...
> +test_run:grep_log('default', 'vshard.storage.evolution', 1000) ~= nil
> +---
> +- false
> +...
> @@ -105,6 +110,11 @@ if not M then
>           -- a destination replicaset must drop already received
>           -- data.
>           rebalancer_sending_bucket = 0,
> +
> +        ------------------------- Reload -------------------------
> +        -- Version of the loaded module. This number is used on
> +        -- reload to determine which upgrade scripts to run.
> +        reload_evolution_version = reload_evolution.version,

4. Please, rename it to just 'version' or 'reload_version' or
'module_version'. 'Reload_evolution_version' is too long and
complex.

>       }
>   end
>   


* [tarantool-patches] Re: [PATCH 2/4] Complete module reload
  2018-07-23 13:45     ` Alex Khatskevich
@ 2018-07-23 14:58       ` Vladislav Shpilevoy
  0 siblings, 0 replies; 14+ messages in thread
From: Vladislav Shpilevoy @ 2018-07-23 14:58 UTC (permalink / raw)
  To: Alex Khatskevich, tarantool-patches



On 23/07/2018 16:45, Alex Khatskevich wrote:
> 
> 
> On 23.07.2018 16:31, Vladislav Shpilevoy wrote:
>> Hi! Thanks for the patch!
>>
>>> diff --git a/vshard/replicaset.lua b/vshard/replicaset.lua
>>> index 99f59aa..6c8d477 100644
>>> --- a/vshard/replicaset.lua
>>> +++ b/vshard/replicaset.lua
>>> @@ -21,6 +21,7 @@
>>>   --                                  requests to the replica>,
>>>   --             net_sequential_fail = <count of sequential failed
>>>   --                                    requests to the replica>,
>>> +--             is_outdated = nil/true,
>>
>> https://github.com/tarantool/vshard/blob/kh/gh-112-reload-mt-2/vshard/replicaset.lua#L24
>>
>> Looks like the version on the branch actually is outdated
>> instead of the replica object. Please, push the latest
>> version.
> Sorry. Pushed.

Thanks! Merged into the master.


* [tarantool-patches] Re: [PATCH 3/4] Tests: separate bootstrap routine to a lua_libs
  2018-07-23 13:36   ` [tarantool-patches] " Vladislav Shpilevoy
@ 2018-07-23 17:19     ` Alex Khatskevich
  0 siblings, 0 replies; 14+ messages in thread
From: Alex Khatskevich @ 2018-07-23 17:19 UTC (permalink / raw)
  To: Vladislav Shpilevoy, tarantool-patches


On 23.07.2018 16:36, Vladislav Shpilevoy wrote:
> Thank you for working on the patch!
>
> 1. Please, do not start commit title with capital letter,
> when it is related to a subsystem. Here you should write
>
>     test: separate bootstrap ...
fixed: `tests: separate bootstrap routine to a lua_libs`
>
> On 23/07/2018 14:14, AKhatskevich wrote:
>> What is moved to `test/lul_libs/bootstrap_test_storage.lua`:
>
> 2. 'lul' libs?
fixed
>
>>
>> diff --git a/test/lua_libs/bootstrap_test_storage.lua b/test/lua_libs/bootstrap_test_storage.lua
>> new file mode 100644
>> index 0000000..62c2f78
>> --- /dev/null
>> +++ b/test/lua_libs/bootstrap_test_storage.lua
>
> 3. Please, just merge it into util. It is actually just util.
3. Discussed verbally. The file is renamed to bootstrap.lua.
>
>> diff --git a/test/rebalancer/errinj.result b/test/rebalancer/errinj.result
>> index d09349e..826c2c6 100644
>> --- a/test/rebalancer/errinj.result
>> +++ b/test/rebalancer/errinj.result
>> @@ -13,7 +13,7 @@ test_run:create_cluster(REPLICASET_1, 'rebalancer')
>>   test_run:create_cluster(REPLICASET_2, 'rebalancer')
>>   ---
>>   ...
>> -util = require('util')
>> +util = require('lua_libs.util')
>
> 4. Please, don't. This should disappear when you merge the utils
> into util.lua. 
4. Same as (3)

full diff:

commit c4c5a3a9678153e7e3315bd11f2f7c882dafbb0c
Author: AKhatskevich <avkhatskevich@tarantool.org>
Date:   Sat Jul 21 02:25:08 2018 +0300

     tests: separate bootstrap routine to a lua_libs

     What is moved to `test/lua_libs/bootstrap.lua`:
     1. create schema
     2. create main stored procedures
     3. `wait_rebalancer_state` procedure

     This code will be reused in further commits.

diff --git a/test/lua_libs/bootstrap.lua b/test/lua_libs/bootstrap.lua
new file mode 100644
index 0000000..62c2f78
--- /dev/null
+++ b/test/lua_libs/bootstrap.lua
@@ -0,0 +1,50 @@
+local log = require('log')
+
+function init_schema()
+    local format = {}
+    format[1] = {name = 'field', type = 'unsigned'}
+    format[2] = {name = 'bucket_id', type = 'unsigned'}
+    local s = box.schema.create_space('test', {format = format})
+    local pk = s:create_index('pk')
+    local bucket_id_idx =
+        s:create_index('vbucket', {parts = {'bucket_id'},
+                       unique = false})
+end
+
+box.once('schema', function()
+    box.schema.func.create('do_replace')
+    box.schema.role.grant('public', 'execute', 'function', 'do_replace')
+    box.schema.func.create('do_select')
+    box.schema.role.grant('public', 'execute', 'function', 'do_select')
+    init_schema()
+end)
+
+function do_replace(...)
+    box.space.test:replace(...)
+    return true
+end
+
+function do_select(...)
+    return box.space.test:select(...)
+end
+
+function check_consistency()
+    for _, tuple in box.space.test:pairs() do
+        assert(box.space._bucket:get{tuple.bucket_id})
+    end
+    return true
+end
+
+--
+-- Wait a specified log message.
+-- Requirements:
+-- * Should be executed from a storage with a rebalancer.
+-- * NAME - global variable, name of instance should be set.
+function wait_rebalancer_state(state, test_run)
+    log.info(string.rep('a', 1000))
+    vshard.storage.rebalancer_wakeup()
+    while not test_run:grep_log(NAME, state, 1000) do
+        fiber.sleep(0.1)
+        vshard.storage.rebalancer_wakeup()
+    end
+end
diff --git a/test/rebalancer/box_1_a.lua b/test/rebalancer/box_1_a.lua
index 8fddcf0..2ca8306 100644
--- a/test/rebalancer/box_1_a.lua
+++ b/test/rebalancer/box_1_a.lua
@@ -2,7 +2,7 @@
  -- Get instance name
  require('strict').on()
  local fio = require('fio')
-local NAME = fio.basename(arg[0], '.lua')
+NAME = fio.basename(arg[0], '.lua')
  log = require('log')
  require('console').listen(os.getenv('ADMIN'))
  fiber = require('fiber')
@@ -23,40 +23,8 @@ if NAME == 'box_4_a' or NAME == 'box_4_b' or
  end
  vshard.storage.cfg(cfg, names.replica_uuid[NAME])

-function init_schema()
-    local format = {}
-    format[1] = {name = 'field', type = 'unsigned'}
-    format[2] = {name = 'bucket_id', type = 'unsigned'}
-    local s = box.schema.create_space('test', {format = format})
-    local pk = s:create_index('pk')
-    local bucket_id_idx =
-        s:create_index('vbucket', {parts = {'bucket_id'},
-                       unique = false})
-end
-
-box.once('schema', function()
-    box.schema.func.create('do_replace')
-    box.schema.role.grant('public', 'execute', 'function', 'do_replace')
-    box.schema.func.create('do_select')
-    box.schema.role.grant('public', 'execute', 'function', 'do_select')
-    init_schema()
-end)
-
-function do_replace(...)
-    box.space.test:replace(...)
-    return true
-end
-
-function do_select(...)
-    return box.space.test:select(...)
-end
-
-function check_consistency()
-    for _, tuple in box.space.test:pairs() do
-        assert(box.space._bucket:get{tuple.bucket_id})
-    end
-    return true
-end
+-- Bootstrap storage.
+require('lua_libs.bootstrap')

  function switch_rs1_master()
      local replica_uuid = names.replica_uuid
@@ -68,12 +36,3 @@ end
  function nullify_rs_weight()
      cfg.sharding[names.rs_uuid[1]].weight = 0
  end
-
-function wait_rebalancer_state(state, test_run)
-    log.info(string.rep('a', 1000))
-    vshard.storage.rebalancer_wakeup()
-    while not test_run:grep_log(NAME, state, 1000) do
-        fiber.sleep(0.1)
-        vshard.storage.rebalancer_wakeup()
-    end
-end
diff --git a/test/rebalancer/errinj.result b/test/rebalancer/errinj.result
index d09349e..826c2c6 100644
--- a/test/rebalancer/errinj.result
+++ b/test/rebalancer/errinj.result
@@ -13,7 +13,7 @@ test_run:create_cluster(REPLICASET_1, 'rebalancer')
  test_run:create_cluster(REPLICASET_2, 'rebalancer')
  ---
  ...
-util = require('util')
+util = require('lua_libs.util')
  ---
  ...
  util.wait_master(test_run, REPLICASET_1, 'box_1_a')
diff --git a/test/rebalancer/errinj.test.lua b/test/rebalancer/errinj.test.lua
index d6a2920..fc0730c 100644
--- a/test/rebalancer/errinj.test.lua
+++ b/test/rebalancer/errinj.test.lua
@@ -5,7 +5,7 @@ REPLICASET_2 = { 'box_2_a', 'box_2_b' }

  test_run:create_cluster(REPLICASET_1, 'rebalancer')
  test_run:create_cluster(REPLICASET_2, 'rebalancer')
-util = require('util')
+util = require('lua_libs.util')
  util.wait_master(test_run, REPLICASET_1, 'box_1_a')
  util.wait_master(test_run, REPLICASET_2, 'box_2_a')

diff --git a/test/rebalancer/rebalancer.result b/test/rebalancer/rebalancer.result
index 88cbaae..71e43e1 100644
--- a/test/rebalancer/rebalancer.result
+++ b/test/rebalancer/rebalancer.result
@@ -13,7 +13,7 @@ test_run:create_cluster(REPLICASET_1, 'rebalancer')
  test_run:create_cluster(REPLICASET_2, 'rebalancer')
  ---
  ...
-util = require('util')
+util = require('lua_libs.util')
  ---
  ...
  util.wait_master(test_run, REPLICASET_1, 'box_1_a')
diff --git a/test/rebalancer/rebalancer.test.lua b/test/rebalancer/rebalancer.test.lua
index 01f2061..1b7ddae 100644
--- a/test/rebalancer/rebalancer.test.lua
+++ b/test/rebalancer/rebalancer.test.lua
@@ -5,7 +5,7 @@ REPLICASET_2 = { 'box_2_a', 'box_2_b' }

  test_run:create_cluster(REPLICASET_1, 'rebalancer')
  test_run:create_cluster(REPLICASET_2, 'rebalancer')
-util = require('util')
+util = require('lua_libs.util')
  util.wait_master(test_run, REPLICASET_1, 'box_1_a')
  util.wait_master(test_run, REPLICASET_2, 'box_2_a')

diff --git a/test/rebalancer/rebalancer_lock_and_pin.result b/test/rebalancer/rebalancer_lock_and_pin.result
index dd9fe47..0f2921c 100644
--- a/test/rebalancer/rebalancer_lock_and_pin.result
+++ b/test/rebalancer/rebalancer_lock_and_pin.result
@@ -16,7 +16,7 @@ test_run:create_cluster(REPLICASET_1, 'rebalancer')
  test_run:create_cluster(REPLICASET_2, 'rebalancer')
  ---
  ...
-util = require('util')
+util = require('lua_libs.util')
  ---
  ...
  util.wait_master(test_run, REPLICASET_1, 'box_1_a')
diff --git a/test/rebalancer/rebalancer_lock_and_pin.test.lua b/test/rebalancer/rebalancer_lock_and_pin.test.lua
index fe866c4..3a2daa0 100644
--- a/test/rebalancer/rebalancer_lock_and_pin.test.lua
+++ b/test/rebalancer/rebalancer_lock_and_pin.test.lua
@@ -6,7 +6,7 @@ REPLICASET_3 = { 'box_3_a', 'box_3_b' }

  test_run:create_cluster(REPLICASET_1, 'rebalancer')
  test_run:create_cluster(REPLICASET_2, 'rebalancer')
-util = require('util')
+util = require('lua_libs.util')
  util.wait_master(test_run, REPLICASET_1, 'box_1_a')
  util.wait_master(test_run, REPLICASET_2, 'box_2_a')

diff --git a/test/rebalancer/restart_during_rebalancing.result b/test/rebalancer/restart_during_rebalancing.result
index d2b8a12..0eb0f2e 100644
--- a/test/rebalancer/restart_during_rebalancing.result
+++ b/test/rebalancer/restart_during_rebalancing.result
@@ -25,7 +25,7 @@ test_run:create_cluster(REPLICASET_3, 'rebalancer')
  test_run:create_cluster(REPLICASET_4, 'rebalancer')
  ---
  ...
-util = require('util')
+util = require('lua_libs.util')
  ---
  ...
  util.wait_master(test_run, REPLICASET_1, 'fullbox_1_a')
diff --git a/test/rebalancer/restart_during_rebalancing.test.lua b/test/rebalancer/restart_during_rebalancing.test.lua
index 5b1a8df..7b707ca 100644
--- a/test/rebalancer/restart_during_rebalancing.test.lua
+++ b/test/rebalancer/restart_during_rebalancing.test.lua
@@ -9,7 +9,7 @@ test_run:create_cluster(REPLICASET_1, 'rebalancer')
  test_run:create_cluster(REPLICASET_2, 'rebalancer')
  test_run:create_cluster(REPLICASET_3, 'rebalancer')
  test_run:create_cluster(REPLICASET_4, 'rebalancer')
-util = require('util')
+util = require('lua_libs.util')
  util.wait_master(test_run, REPLICASET_1, 'fullbox_1_a')
  util.wait_master(test_run, REPLICASET_2, 'fullbox_2_a')
  util.wait_master(test_run, REPLICASET_3, 'fullbox_3_a')
diff --git a/test/rebalancer/stress_add_remove_rs.result b/test/rebalancer/stress_add_remove_rs.result
index 8a955e2..10bcaac 100644
--- a/test/rebalancer/stress_add_remove_rs.result
+++ b/test/rebalancer/stress_add_remove_rs.result
@@ -16,7 +16,7 @@ test_run:create_cluster(REPLICASET_1, 'rebalancer')
  test_run:create_cluster(REPLICASET_2, 'rebalancer')
  ---
  ...
-util = require('util')
+util = require('lua_libs.util')
  ---
  ...
  util.wait_master(test_run, REPLICASET_1, 'box_1_a')
diff --git a/test/rebalancer/stress_add_remove_rs.test.lua b/test/rebalancer/stress_add_remove_rs.test.lua
index c80df40..b9bb027 100644
--- a/test/rebalancer/stress_add_remove_rs.test.lua
+++ b/test/rebalancer/stress_add_remove_rs.test.lua
@@ -6,7 +6,7 @@ REPLICASET_3 = { 'box_3_a', 'box_3_b' }

  test_run:create_cluster(REPLICASET_1, 'rebalancer')
  test_run:create_cluster(REPLICASET_2, 'rebalancer')
-util = require('util')
+util = require('lua_libs.util')
  util.wait_master(test_run, REPLICASET_1, 'box_1_a')
  util.wait_master(test_run, REPLICASET_2, 'box_2_a')

diff --git a/test/rebalancer/stress_add_remove_several_rs.result b/test/rebalancer/stress_add_remove_several_rs.result
index d6008b8..611362c 100644
--- a/test/rebalancer/stress_add_remove_several_rs.result
+++ b/test/rebalancer/stress_add_remove_several_rs.result
@@ -19,7 +19,7 @@ test_run:create_cluster(REPLICASET_1, 'rebalancer')
  test_run:create_cluster(REPLICASET_2, 'rebalancer')
  ---
  ...
-util = require('util')
+util = require('lua_libs.util')
  ---
  ...
  util.wait_master(test_run, REPLICASET_1, 'box_1_a')
diff --git a/test/rebalancer/stress_add_remove_several_rs.test.lua b/test/rebalancer/stress_add_remove_several_rs.test.lua
index 3cc105e..9acb8de 100644
--- a/test/rebalancer/stress_add_remove_several_rs.test.lua
+++ b/test/rebalancer/stress_add_remove_several_rs.test.lua
@@ -7,7 +7,7 @@ REPLICASET_4 = { 'box_4_a', 'box_4_b' }

  test_run:create_cluster(REPLICASET_1, 'rebalancer')
  test_run:create_cluster(REPLICASET_2, 'rebalancer')
-util = require('util')
+util = require('lua_libs.util')
  util.wait_master(test_run, REPLICASET_1, 'box_1_a')
  util.wait_master(test_run, REPLICASET_2, 'box_2_a')

diff --git a/test/rebalancer/suite.ini b/test/rebalancer/suite.ini
index afc5141..8689da5 100644
--- a/test/rebalancer/suite.ini
+++ b/test/rebalancer/suite.ini
@@ -4,6 +4,6 @@ description = Rebalancer tests
  script = test.lua
  is_parallel = False
  release_disabled = errinj.test.lua
-lua_libs = ../lua_libs/util.lua config.lua names.lua router_1.lua
+lua_libs = ../lua_libs config.lua names.lua router_1.lua
             box_1_a.lua box_1_b.lua box_2_a.lua box_2_b.lua
             box_3_a.lua box_3_b.lua rebalancer_utils.lua


* [tarantool-patches] Re: [PATCH 4/4] Introduce storage reload evolution
  2018-07-23 14:44   ` [tarantool-patches] " Vladislav Shpilevoy
@ 2018-07-23 20:10     ` Alex Khatskevich
  0 siblings, 0 replies; 14+ messages in thread
From: Alex Khatskevich @ 2018-07-23 20:10 UTC (permalink / raw)
  To: Vladislav Shpilevoy, tarantool-patches



On 23.07.2018 17:44, Vladislav Shpilevoy wrote:
> Thanks for the patch! See 4 comments below.
>
> On 23/07/2018 14:14, AKhatskevich wrote:
>> Changes:
>> 1. Introduce storage reload evolution.
>> 2. Setup cross-version reload testing.
>>
>> 1:
>> This mechanism updates Lua objects on reload in case they are
>> changed in a new vshard.storage version.
>>
>> Since this commit, any change in vshard.storage.M has to be
>> reflected in vshard.storage.reload_evolution to guarantee
>> correct reload.
>>
>> 2:
>> The testing uses git infrastructure and is performed in the following
>> way:
>> 1. Copy old version of vshard to a temp folder.
>> 2. Run vshard on this code.
>> 3. Checkout the latest version of the vshard sources.
>> 4. Reload vshard storage.
>> 5. Make sure it works (Perform simple tests).
>>
>> Notes:
>> * this patch contains some legacy-driven decisions:
>>    1. SOURCEDIR path retrieved differently in case of
>>       packpack build.
>>    2. git directory in the `reload_evolution/storage` test
>>       is copied with respect to Centos 7 and `ro` mode of
>>       SOURCEDIR.
>>
>> diff --git a/test/reload_evolution/storage.result b/test/reload_evolution/storage.result
>> new file mode 100644
>> index 0000000..54ff6b7
>> --- /dev/null
>> +++ b/test/reload_evolution/storage.result
>> @@ -0,0 +1,248 @@
>> +test_run = require('test_run').new()
>> +---
>> +...
>> +git_util = require('lua_libs.git_util')
>> +---
>> +...
>> +util = require('lua_libs.util')
>> +---
>> +...
>> +vshard_copy_path = util.BUILDDIR .. '/test/var/vshard_git_tree_copy'
>> +---
>> +...
>> +evolution_log = git_util.log_hashes({args='vshard/storage/reload_evolution.lua', dir=util.SOURCEDIR})
>> +---
>> +...
>> +-- Cleanup the directory after a previous build.
>> +_ = os.execute('rm -rf ' .. vshard_copy_path)
>> +---
>> +...
>> +-- 1. `git worktree` cannot be used because PACKPACK mounts
>> +-- `/source/` in `ro` mode.
>> +-- 2. Just `cp -rf` cannot be used due to a little different
>> +-- behavior in Centos 7.
>> +_ = os.execute('mkdir ' .. vshard_copy_path)
>> +---
>> +...
>> +_ = os.execute("cd " .. util.SOURCEDIR .. ' && cp -rf `ls -A --ignore=build` ' .. vshard_copy_path)
>> +---
>> +...
>> +-- Checkout the first commit with a reload_evolution mechanism.
>> +git_util.exec_cmd({cmd='checkout', args='-f', dir=vshard_copy_path})
>> +---
>> +...
>> +git_util.exec_cmd({cmd='checkout', args=evolution_log[#evolution_log] .. '~1', dir=vshard_copy_path})
>> +---
>> +...
>> +REPLICASET_1 = { 'storage_1_a', 'storage_1_b' }
>> +---
>> +...
>> +REPLICASET_2 = { 'storage_2_a', 'storage_2_b' }
>> +---
>> +...
>> +test_run:create_cluster(REPLICASET_1, 'reload_evolution')
>> +---
>> +...
>> +test_run:create_cluster(REPLICASET_2, 'reload_evolution')
>> +---
>> +...
>> +util = require('lua_libs.util')
>> +---
>> +...
>> +util.wait_master(test_run, REPLICASET_1, 'storage_1_a')
>> +---
>> +...
>> +util.wait_master(test_run, REPLICASET_2, 'storage_2_a')
>> +---
>> +...
>> +test_run:switch('storage_1_a')
>> +---
>> +- true
>> +...
>> +vshard.storage.bucket_force_create(1, vshard.consts.DEFAULT_BUCKET_COUNT / 2)
>> +---
>> +- true
>> +...
>> +bucket_id_to_move = vshard.consts.DEFAULT_BUCKET_COUNT
>> +---
>> +...
>> +test_run:switch('storage_2_a')
>> +---
>> +- true
>> +...
>> +fiber = require('fiber')
>> +---
>> +...
>> +vshard.storage.bucket_force_create(vshard.consts.DEFAULT_BUCKET_COUNT / 2 + 1, vshard.consts.DEFAULT_BUCKET_COUNT / 2)
>> +---
>> +- true
>> +...
>> +bucket_id_to_move = vshard.consts.DEFAULT_BUCKET_COUNT
>> +---
>> +...
>> +vshard.storage.internal.reload_evolution_version
>> +---
>> +- null
>> +...
>> +box.space.test:insert({42, bucket_id_to_move})
>> +---
>> +- [42, 3000]
>> +...
>> +while test_run:grep_log('storage_2_a', 'The cluster is balanced ok') == nil do vshard.storage.rebalancer_wakeup() fiber.sleep(0.1) end
>
> 1. Now you have wait_rebalancer_state util from the previous commit.
I thought the cluster might already be balanced at this point,
but that seems to be close to impossible.

Fixed
>
>> +---
>> +...
>> +test_run:switch('default')
>> +---
>> +- true
>> +...
>> +git_util.exec_cmd({cmd='checkout', args=evolution_log[1], dir=vshard_copy_path})
>> +---
>> +...
>> +test_run:switch('storage_2_a')
>> +---
>> +- true
>> +...
>> +package.loaded["vshard.storage"] = nil
>> +---
>> +...
>> +vshard.storage = require("vshard.storage")
>> +---
>> +...
>> +test_run:grep_log('storage_2_a', 'vshard.storage.reload_evolution: upgraded to') ~= nil
>> +---
>> +- true
>> +...
>> +vshard.storage.internal.reload_evolution_version
>> +---
>> +- 1
>> +...
>> +-- Make sure storage operates well.
>> +vshard.storage.bucket_force_drop(2000)
>> +---
>> +- true
>> +...
>> +vshard.storage.bucket_force_create(2000)
>> +---
>> +- true
>> +...
>> +vshard.storage.buckets_info()[2000]
>> +---
>> +- status: active
>> +  id: 2000
>> +...
>> +vshard.storage.call(bucket_id_to_move, 'read', 'do_select', {42})
>> +---
>> +- true
>> +- - [42, 3000]
>> +...
>> +vshard.storage.bucket_send(bucket_id_to_move, replicaset1_uuid)
>> +---
>> +- true
>> +...
>> +vshard.storage.garbage_collector_wakeup()
>> +---
>> +...
>> +fiber = require('fiber')
>> +---
>> +...
>> +while box.space._bucket:get({bucket_id_to_move}) do fiber.sleep(0.01) end
>> +---
>> +...
>> +test_run:switch('storage_1_a')
>> +---
>> +- true
>> +...
>> +vshard.storage.bucket_send(bucket_id_to_move, replicaset2_uuid)
>> +---
>> +- true
>> +...
>> +test_run:switch('storage_2_a')
>> +---
>> +- true
>> +...
>> +vshard.storage.call(bucket_id_to_move, 'read', 'do_select', {42})
>> +---
>> +- true
>> +- - [42, 3000]
>> +...
>> +-- Check info() does not fail.
>> +vshard.storage.info() ~= nil
>> +---
>> +- true
>> +...
>> +--
>> +-- Send buckets to create a disbalance. Wait until the rebalancer
>> +-- repairs it. Similar to `tests/rebalancer/rebalancer.test.lua`.
>> +--
>> +vshard.storage.rebalancer_disable()
>> +---
>> +...
>> +move_start = vshard.consts.DEFAULT_BUCKET_COUNT / 2 + 1
>> +---
>> +...
>> +move_cnt = 100
>> +---
>> +...
>> +assert(move_start + move_cnt < vshard.consts.DEFAULT_BUCKET_COUNT)
>> +---
>> +- true
>> +...
>> +for i = move_start, move_start + move_cnt - 1 do box.space._bucket:delete{i} end
>> +---
>> +...
>> +box.space._bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
>> +---
>> +- 1400
>> +...
>> +test_run:switch('storage_1_a')
>> +---
>> +- true
>> +...
>> +move_start = vshard.consts.DEFAULT_BUCKET_COUNT / 2 + 1
>> +---
>> +...
>> +move_cnt = 100
>> +---
>> +...
>> +vshard.storage.bucket_force_create(move_start, move_cnt)
>> +---
>> +- true
>> +...
>> +box.space._bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
>> +---
>> +- 1600
>> +...
>> +test_run:switch('storage_2_a')
>> +---
>> +- true
>> +...
>> +vshard.storage.rebalancer_enable()
>> +---
>> +...
>> +vshard.storage.rebalancer_wakeup()
>
> 2. You do not need explicit rebalancer_wakeup. wait_rebalancer_state
> calls it.
Fixed
>
>> +---
>> +...
>> +wait_rebalancer_state("Rebalance routes are sent", test_run)
>> +---
>> +...
>> +wait_rebalancer_state('The cluster is balanced ok', test_run)
>> +---
>> +...
>> +box.space._bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
>> +---
>> +- 1500
>> +...
>> +test_run:switch('default')
>> +---
>> +- true
>> +...
>> +test_run:drop_cluster(REPLICASET_2)
>> +---
>> +...
>> +test_run:drop_cluster(REPLICASET_1)
>> +---
>> +...
>> +test_run:cmd('clear filter')
>> +---
>> +- true
>> +...
>> diff --git a/test/unit/reload_evolution.result b/test/unit/reload_evolution.result
>> new file mode 100644
>> index 0000000..342ac24
>> --- /dev/null
>> +++ b/test/unit/reload_evolution.result
>> @@ -0,0 +1,45 @@
>> +test_run = require('test_run').new()
>> +---
>> +...
>> +fiber = require('fiber')
>> +---
>> +...
>> +log = require('log')
>> +---
>> +...
>> +util = require('util')
>> +---
>> +...
>> +reload_evolution = require('vshard.storage.reload_evolution')
>> +---
>> +...
>> +-- Init with the latest version.
>> +fake_M = { reload_evolution_version = reload_evolution.version }
>> +---
>> +...
>> +-- Test reload to the same version.
>> +reload_evolution.upgrade(fake_M)
>> +---
>> +...
>> +test_run:grep_log('default', 'vshard.storage.evolution') == nil
>> +---
>> +- true
>> +...
>> +-- Test downgrage version.
>> +log.info(string.rep('a', 1000))
>> +---
>> +...
>> +fake_M.reload_evolution_version = fake_M.reload_evolution_version + 1
>> +---
>> +...
>> +err = util.check_error(reload_evolution.upgrade, fake_M)
>> +---
>> +...
>> +err:match('auto%-downgrade is not implemented')
>> +---
>> +- auto-downgrade is not implemented
>
> 3. Why do you need match? check_error output is ok already. And
> what is 'auto%'? I see that you always print exactly "auto-downgrade"
> in reload_evolution.lua.
I need match to cut out of the output the numbers, which may change.
`%` is used to escape `-` (which is a special character in Lua patterns).
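
For illustration, a small standalone sketch of the escaping. The error
text below is a made-up stand-in with version numbers in it, not the
exact message produced by reload_evolution.lua:

```lua
-- Hypothetical error text; the real message may differ.
local err = 'auto-downgrade is not implemented; ' ..
            'loaded version is 2, upgrade script version is 1'

-- Unescaped, `o-` means "as few 'o' characters as possible", so the
-- pattern looks for "aut", optional 'o's and then "downgrade". The
-- literal '-' in the text never matches and the result is nil.
local unescaped = err:match('auto-downgrade is not implemented')

-- Escaped, `%-` matches a literal '-'. Without captures, match()
-- returns only the matched substring, so the numbers are cut off.
local escaped = err:match('auto%-downgrade is not implemented')

print(unescaped) -- nil
print(escaped)   -- auto-downgrade is not implemented
```

This way the .result file keeps only the stable part of the message.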
>
>> +...
>> +test_run:grep_log('default', 'vshard.storage.evolution', 1000) ~= nil
>> +---
>> +- false
>> +...
>> @@ -105,6 +110,11 @@ if not M then
>>           -- a destination replicaset must drop already received
>>           -- data.
>>           rebalancer_sending_bucket = 0,
>> +
>> +        ------------------------- Reload -------------------------
>> +        -- Version of the loaded module. This number is used on
>> +        -- reload to determine which upgrade scripts to run.
>> +        reload_evolution_version = reload_evolution.version,
>
> 4. Please, rename it to just 'version' or 'reload_version' or
> 'module_version'. 'Reload_evolution_version' is too long and
> complex.
Renamed to reload_version.
I like `reload_evolution_version` though.
>
>>       }
>>   end
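
To make the role of this version number concrete, here is a rough
sketch of how a version-driven upgrade could work. It is an assumption
for illustration only, not the code of
vshard/storage/reload_evolution.lua; the `migrations` table, its
single entry and the `example_field` name are hypothetical:

```lua
-- Hypothetical list of upgrade scripts: migrations[N] brings the
-- module table M from version N - 1 to version N.
local migrations = {}
migrations[1] = function(M)
    -- Placeholder change, e.g. add a field that old code did not have.
    M.example_field = true
end

local function upgrade(M)
    -- A module loaded before the mechanism existed has no version.
    local start_version = M.reload_version or 0
    if start_version > #migrations then
        error('auto-downgrade is not implemented')
    end
    for i = start_version + 1, #migrations do
        migrations[i](M)
        M.reload_version = i
    end
end

local M = {}
upgrade(M)
print(M.reload_version) -- 1

-- A table that claims a newer version than the scripts know about.
M.reload_version = 2
local ok = pcall(upgrade, M)
print(ok) -- false
```

The sketch mirrors what the tests check: the version is nil before the
reload, 1 after it, and a version from the future is rejected.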


Full diff:

commit ad151e7c3fb4c15dd49859b28113195cf74ad418
Author: AKhatskevich <avkhatskevich@tarantool.org>
Date:   Fri Jun 29 20:34:26 2018 +0300

     Introduce storage reload evolution

     Changes:
     1. Introduce storage reload evolution.
     2. Setup cross-version reload testing.

     1:
     This mechanism updates Lua objects on reload in case they are
     changed in a new vshard.storage version.

     Since this commit, any change in vshard.storage.M has to be
     reflected in vshard.storage.reload_evolution to guarantee
     correct reload.

     2:
     The testing uses git infrastructure and is performed in the following
     way:
     1. Copy old version of vshard to a temp folder.
     2. Run vshard on this code.
     3. Checkout the latest version of the vshard sources.
     4. Reload vshard storage.
     5. Make sure it works (Perform simple tests).

     Notes:
     * this patch contains some legacy-driven decisions:
       1. SOURCEDIR path retrieved differently in case of
          packpack build.
       2. git directory in the `reload_evolution/storage` test
          is copied with respect to Centos 7 and `ro` mode of
          SOURCEDIR.

     Closes #112 #125

diff --git a/.travis.yml b/.travis.yml
index 54bfe44..eff4a51 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -41,7 +41,7 @@ env:
  script:
    - git describe --long
    - git clone https://github.com/packpack/packpack.git packpack
-  - packpack/packpack
+  - packpack/packpack -e PACKPACK_GIT_SOURCEDIR=/source/

  before_deploy:
    - ls -l build/
diff --git a/rpm/prebuild.sh b/rpm/prebuild.sh
index 768b22b..554032b 100755
--- a/rpm/prebuild.sh
+++ b/rpm/prebuild.sh
@@ -1 +1,3 @@
  curl -s https://packagecloud.io/install/repositories/tarantool/1_9/script.rpm.sh | sudo bash
+sudo yum -y install python-devel python-pip
+sudo pip install tarantool msgpack
diff --git a/test/lua_libs/git_util.lua b/test/lua_libs/git_util.lua
new file mode 100644
index 0000000..a75bb08
--- /dev/null
+++ b/test/lua_libs/git_util.lua
@@ -0,0 +1,51 @@
+--
+-- Lua bridge for some of the git commands.
+--
+local os = require('os')
+
+local temp_file = 'some_strange_rare_unique_file_name_for_git_util'
+
+--
+-- Exec a git command.
+-- @param params Table of parameters:
+--        * options - git options.
+--        * cmd - git command.
+--        * args - command arguments.
+--        * dir - working directory.
+--        * fout - write output to the file.
+local function exec_cmd(params)
+    local fout = params.fout
+    local shell_cmd = {'git'}
+    for _, param in pairs({'options', 'cmd', 'args'}) do
+        table.insert(shell_cmd, params[param])
+    end
+    if fout then
+        table.insert(shell_cmd, ' >' .. fout)
+    end
+    shell_cmd = table.concat(shell_cmd, ' ')
+    if params.dir then
+        shell_cmd = string.format('cd %s && %s', params.dir, shell_cmd)
+    end
+    local res = os.execute(shell_cmd)
+    assert(res == 0, 'Git cmd error: ' .. res)
+end
+
+local function log_hashes(params)
+    params.args = "--format='%h' " .. params.args
+    local local_temp_file = string.format('%s/%s', os.getenv('PWD'), temp_file)
+    params.fout = local_temp_file
+    params.cmd = 'log'
+    exec_cmd(params)
+    local lines = {}
+    for line in io.lines(local_temp_file) do
+        table.insert(lines, line)
+    end
+    os.remove(local_temp_file)
+    return lines
+end
+
+
+return {
+    exec_cmd = exec_cmd,
+    log_hashes = log_hashes
+}
diff --git a/test/lua_libs/util.lua b/test/lua_libs/util.lua
index f40d3a6..935ff41 100644
--- a/test/lua_libs/util.lua
+++ b/test/lua_libs/util.lua
@@ -1,5 +1,6 @@
  local fiber = require('fiber')
  local log = require('log')
+local fio = require('fio')

  local function check_error(func, ...)
      local pstatus, status, err = pcall(func, ...)
@@ -92,10 +93,29 @@ local function has_same_fields(etalon, data)
      return true
  end

+-- Git directory of the project. Used in evolution tests to
+-- fetch old versions of vshard.
+local SOURCEDIR = os.getenv('PACKPACK_GIT_SOURCEDIR')
+if not SOURCEDIR then
+    SOURCEDIR = os.getenv('SOURCEDIR')
+end
+if not SOURCEDIR then
+    local script_path = debug.getinfo(1).source:match("@?(.*/)")
+    script_path = fio.abspath(script_path)
+    SOURCEDIR = fio.abspath(script_path .. '/../../../')
+end
+
+local BUILDDIR = os.getenv('BUILDDIR')
+if not BUILDDIR then
+    BUILDDIR = SOURCEDIR
+end
+
  return {
      check_error = check_error,
      shuffle_masters = shuffle_masters,
      collect_timeouts = collect_timeouts,
      wait_master = wait_master,
      has_same_fields = has_same_fields,
+    SOURCEDIR = SOURCEDIR,
+    BUILDDIR = BUILDDIR,
  }
diff --git a/test/reload_evolution/storage.result b/test/reload_evolution/storage.result
new file mode 100644
index 0000000..007192c
--- /dev/null
+++ b/test/reload_evolution/storage.result
@@ -0,0 +1,245 @@
+test_run = require('test_run').new()
+---
+...
+git_util = require('lua_libs.git_util')
+---
+...
+util = require('lua_libs.util')
+---
+...
+vshard_copy_path = util.BUILDDIR .. '/test/var/vshard_git_tree_copy'
+---
+...
+evolution_log = git_util.log_hashes({args='vshard/storage/reload_evolution.lua', dir=util.SOURCEDIR})
+---
+...
+-- Cleanup the directory after a previous build.
+_ = os.execute('rm -rf ' .. vshard_copy_path)
+---
+...
+-- 1. `git worktree` cannot be used because PACKPACK mounts
+-- `/source/` in `ro` mode.
+-- 2. Just `cp -rf` cannot be used due to a little different
+-- behavior in Centos 7.
+_ = os.execute('mkdir ' .. vshard_copy_path)
+---
+...
+_ = os.execute("cd " .. util.SOURCEDIR .. ' && cp -rf `ls -A --ignore=build` ' .. vshard_copy_path)
+---
+...
+-- Checkout the first commit with a reload_evolution mechanism.
+git_util.exec_cmd({cmd='checkout', args='-f', dir=vshard_copy_path})
+---
+...
+git_util.exec_cmd({cmd='checkout', args=evolution_log[#evolution_log] .. '~1', dir=vshard_copy_path})
+---
+...
+REPLICASET_1 = { 'storage_1_a', 'storage_1_b' }
+---
+...
+REPLICASET_2 = { 'storage_2_a', 'storage_2_b' }
+---
+...
+test_run:create_cluster(REPLICASET_1, 'reload_evolution')
+---
+...
+test_run:create_cluster(REPLICASET_2, 'reload_evolution')
+---
+...
+util = require('lua_libs.util')
+---
+...
+util.wait_master(test_run, REPLICASET_1, 'storage_1_a')
+---
+...
+util.wait_master(test_run, REPLICASET_2, 'storage_2_a')
+---
+...
+test_run:switch('storage_1_a')
+---
+- true
+...
+vshard.storage.bucket_force_create(1, vshard.consts.DEFAULT_BUCKET_COUNT / 2)
+---
+- true
+...
+bucket_id_to_move = vshard.consts.DEFAULT_BUCKET_COUNT
+---
+...
+test_run:switch('storage_2_a')
+---
+- true
+...
+fiber = require('fiber')
+---
+...
+vshard.storage.bucket_force_create(vshard.consts.DEFAULT_BUCKET_COUNT / 2 + 1, vshard.consts.DEFAULT_BUCKET_COUNT / 2)
+---
+- true
+...
+bucket_id_to_move = vshard.consts.DEFAULT_BUCKET_COUNT
+---
+...
+vshard.storage.internal.reload_version
+---
+- null
+...
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+---
+...
+box.space.test:insert({42, bucket_id_to_move})
+---
+- [42, 3000]
+...
+test_run:switch('default')
+---
+- true
+...
+git_util.exec_cmd({cmd='checkout', args=evolution_log[1], dir=vshard_copy_path})
+---
+...
+test_run:switch('storage_2_a')
+---
+- true
+...
+package.loaded['vshard.storage'] = nil
+---
+...
+vshard.storage = require("vshard.storage")
+---
+...
+test_run:grep_log('storage_2_a', 'vshard.storage.reload_evolution: upgraded to') ~= nil
+---
+- true
+...
+vshard.storage.internal.reload_version
+---
+- 1
+...
+-- Make sure storage operates well.
+vshard.storage.bucket_force_drop(2000)
+---
+- true
+...
+vshard.storage.bucket_force_create(2000)
+---
+- true
+...
+vshard.storage.buckets_info()[2000]
+---
+- status: active
+  id: 2000
+...
+vshard.storage.call(bucket_id_to_move, 'read', 'do_select', {42})
+---
+- true
+- - [42, 3000]
+...
+vshard.storage.bucket_send(bucket_id_to_move, replicaset1_uuid)
+---
+- true
+...
+vshard.storage.garbage_collector_wakeup()
+---
+...
+fiber = require('fiber')
+---
+...
+while box.space._bucket:get({bucket_id_to_move}) do fiber.sleep(0.01) end
+---
+...
+test_run:switch('storage_1_a')
+---
+- true
+...
+vshard.storage.bucket_send(bucket_id_to_move, replicaset2_uuid)
+---
+- true
+...
+test_run:switch('storage_2_a')
+---
+- true
+...
+vshard.storage.call(bucket_id_to_move, 'read', 'do_select', {42})
+---
+- true
+- - [42, 3000]
+...
+-- Check info() does not fail.
+vshard.storage.info() ~= nil
+---
+- true
+...
+--
+-- Send buckets to create a disbalance. Wait until the rebalancer
+-- repairs it. Similar to `tests/rebalancer/rebalancer.test.lua`.
+--
+vshard.storage.rebalancer_disable()
+---
+...
+move_start = vshard.consts.DEFAULT_BUCKET_COUNT / 2 + 1
+---
+...
+move_cnt = 100
+---
+...
+assert(move_start + move_cnt < vshard.consts.DEFAULT_BUCKET_COUNT)
+---
+- true
+...
+for i = move_start, move_start + move_cnt - 1 do box.space._bucket:delete{i} end
+---
+...
+box.space._bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
+---
+- 1400
+...
+test_run:switch('storage_1_a')
+---
+- true
+...
+move_start = vshard.consts.DEFAULT_BUCKET_COUNT / 2 + 1
+---
+...
+move_cnt = 100
+---
+...
+vshard.storage.bucket_force_create(move_start, move_cnt)
+---
+- true
+...
+box.space._bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
+---
+- 1600
+...
+test_run:switch('storage_2_a')
+---
+- true
+...
+vshard.storage.rebalancer_enable()
+---
+...
+wait_rebalancer_state('Rebalance routes are sent', test_run)
+---
+...
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+---
+...
+box.space._bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
+---
+- 1500
+...
+test_run:switch('default')
+---
+- true
+...
+test_run:drop_cluster(REPLICASET_2)
+---
+...
+test_run:drop_cluster(REPLICASET_1)
+---
+...
+test_run:cmd('clear filter')
+---
+- true
+...
diff --git a/test/reload_evolution/storage.test.lua b/test/reload_evolution/storage.test.lua
new file mode 100644
index 0000000..7af464b
--- /dev/null
+++ b/test/reload_evolution/storage.test.lua
@@ -0,0 +1,87 @@
+test_run = require('test_run').new()
+
+git_util = require('lua_libs.git_util')
+util = require('lua_libs.util')
+vshard_copy_path = util.BUILDDIR .. '/test/var/vshard_git_tree_copy'
+evolution_log = git_util.log_hashes({args='vshard/storage/reload_evolution.lua', dir=util.SOURCEDIR})
+-- Cleanup the directory after a previous build.
+_ = os.execute('rm -rf ' .. vshard_copy_path)
+-- 1. `git worktree` cannot be used because PACKPACK mounts
+-- `/source/` in `ro` mode.
+-- 2. Just `cp -rf` cannot be used due to a little different
+-- behavior in Centos 7.
+_ = os.execute('mkdir ' .. vshard_copy_path)
+_ = os.execute("cd " .. util.SOURCEDIR .. ' && cp -rf `ls -A --ignore=build` ' .. vshard_copy_path)
+-- Checkout the first commit with a reload_evolution mechanism.
+git_util.exec_cmd({cmd='checkout', args='-f', dir=vshard_copy_path})
+git_util.exec_cmd({cmd='checkout', args=evolution_log[#evolution_log] .. '~1', dir=vshard_copy_path})
+
+REPLICASET_1 = { 'storage_1_a', 'storage_1_b' }
+REPLICASET_2 = { 'storage_2_a', 'storage_2_b' }
+test_run:create_cluster(REPLICASET_1, 'reload_evolution')
+test_run:create_cluster(REPLICASET_2, 'reload_evolution')
+util = require('lua_libs.util')
+util.wait_master(test_run, REPLICASET_1, 'storage_1_a')
+util.wait_master(test_run, REPLICASET_2, 'storage_2_a')
+
+test_run:switch('storage_1_a')
+vshard.storage.bucket_force_create(1, vshard.consts.DEFAULT_BUCKET_COUNT / 2)
+bucket_id_to_move = vshard.consts.DEFAULT_BUCKET_COUNT
+
+test_run:switch('storage_2_a')
+fiber = require('fiber')
+vshard.storage.bucket_force_create(vshard.consts.DEFAULT_BUCKET_COUNT / 2 + 1, vshard.consts.DEFAULT_BUCKET_COUNT / 2)
+bucket_id_to_move = vshard.consts.DEFAULT_BUCKET_COUNT
+vshard.storage.internal.reload_version
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+box.space.test:insert({42, bucket_id_to_move})
+
+test_run:switch('default')
+git_util.exec_cmd({cmd='checkout', args=evolution_log[1], dir=vshard_copy_path})
+
+test_run:switch('storage_2_a')
+package.loaded['vshard.storage'] = nil
+vshard.storage = require("vshard.storage")
+test_run:grep_log('storage_2_a', 'vshard.storage.reload_evolution: upgraded to') ~= nil
+vshard.storage.internal.reload_version
+-- Make sure storage operates well.
+vshard.storage.bucket_force_drop(2000)
+vshard.storage.bucket_force_create(2000)
+vshard.storage.buckets_info()[2000]
+vshard.storage.call(bucket_id_to_move, 'read', 'do_select', {42})
+vshard.storage.bucket_send(bucket_id_to_move, replicaset1_uuid)
+vshard.storage.garbage_collector_wakeup()
+fiber = require('fiber')
+while box.space._bucket:get({bucket_id_to_move}) do fiber.sleep(0.01) end
+test_run:switch('storage_1_a')
+vshard.storage.bucket_send(bucket_id_to_move, replicaset2_uuid)
+test_run:switch('storage_2_a')
+vshard.storage.call(bucket_id_to_move, 'read', 'do_select', {42})
+-- Check info() does not fail.
+vshard.storage.info() ~= nil
+
+--
+-- Send buckets to create a disbalance. Wait until the rebalancer
+-- repairs it. Similar to `tests/rebalancer/rebalancer.test.lua`.
+--
+vshard.storage.rebalancer_disable()
+move_start = vshard.consts.DEFAULT_BUCKET_COUNT / 2 + 1
+move_cnt = 100
+assert(move_start + move_cnt < vshard.consts.DEFAULT_BUCKET_COUNT)
+for i = move_start, move_start + move_cnt - 1 do box.space._bucket:delete{i} end
+box.space._bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
+test_run:switch('storage_1_a')
+move_start = vshard.consts.DEFAULT_BUCKET_COUNT / 2 + 1
+move_cnt = 100
+vshard.storage.bucket_force_create(move_start, move_cnt)
+box.space._bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
+test_run:switch('storage_2_a')
+vshard.storage.rebalancer_enable()
+wait_rebalancer_state('Rebalance routes are sent', test_run)
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+box.space._bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
+
+test_run:switch('default')
+test_run:drop_cluster(REPLICASET_2)
+test_run:drop_cluster(REPLICASET_1)
+test_run:cmd('clear filter')
diff --git a/test/reload_evolution/storage_1_a.lua b/test/reload_evolution/storage_1_a.lua
new file mode 100755
index 0000000..f1a2981
--- /dev/null
+++ b/test/reload_evolution/storage_1_a.lua
@@ -0,0 +1,48 @@
+#!/usr/bin/env tarantool
+
+require('strict').on()
+
+local log = require('log')
+local fiber = require('fiber')
+local util = require('lua_libs.util')
+local fio = require('fio')
+
+-- Get instance name
+NAME = fio.basename(arg[0], '.lua')
+
+-- test-run gate.
+test_run = require('test_run').new()
+require('console').listen(os.getenv('ADMIN'))
+
+-- Run one storage on a different vshard version.
+-- To do that, place vshard src to
+-- BUILDDIR/test/var/vshard_git_tree_copy/.
+if NAME == 'storage_2_a' then
+    local script_path = debug.getinfo(1).source:match("@?(.*/)")
+    vshard_copy = util.BUILDDIR .. '/test/var/vshard_git_tree_copy'
+    package.path = string.format(
+        '%s/?.lua;%s/?/init.lua;%s',
+        vshard_copy, vshard_copy, package.path
+    )
+end
+
+-- Call a configuration provider
+cfg = require('localcfg')
+-- Name to uuid map
+names = {
+    ['storage_1_a'] = '8a274925-a26d-47fc-9e1b-af88ce939412',
+    ['storage_1_b'] = '3de2e3e1-9ebe-4d0d-abb1-26d301b84633',
+    ['storage_2_a'] = '1e02ae8a-afc0-4e91-ba34-843a356b8ed7',
+    ['storage_2_b'] = '001688c3-66f8-4a31-8e19-036c17d489c2',
+}
+
+replicaset1_uuid = 'cbf06940-0790-498b-948d-042b62cf3d29'
+replicaset2_uuid = 'ac522f65-aa94-4134-9f64-51ee384f1a54'
+replicasets = {replicaset1_uuid, replicaset2_uuid}
+
+-- Start the database with sharding
+vshard = require('vshard')
+vshard.storage.cfg(cfg, names[NAME])
+
+-- Bootstrap storage.
+require('lua_libs.bootstrap')
diff --git a/test/reload_evolution/storage_1_b.lua b/test/reload_evolution/storage_1_b.lua
new file mode 120000
index 0000000..02572da
--- /dev/null
+++ b/test/reload_evolution/storage_1_b.lua
@@ -0,0 +1 @@
+storage_1_a.lua
\ No newline at end of file
diff --git a/test/reload_evolution/storage_2_a.lua b/test/reload_evolution/storage_2_a.lua
new file mode 120000
index 0000000..02572da
--- /dev/null
+++ b/test/reload_evolution/storage_2_a.lua
@@ -0,0 +1 @@
+storage_1_a.lua
\ No newline at end of file
diff --git a/test/reload_evolution/storage_2_b.lua b/test/reload_evolution/storage_2_b.lua
new file mode 120000
index 0000000..02572da
--- /dev/null
+++ b/test/reload_evolution/storage_2_b.lua
@@ -0,0 +1 @@
+storage_1_a.lua
\ No newline at end of file
diff --git a/test/reload_evolution/suite.ini b/test/reload_evolution/suite.ini
new file mode 100644
index 0000000..5f55418
--- /dev/null
+++ b/test/reload_evolution/suite.ini
@@ -0,0 +1,6 @@
+[default]
+core = tarantool
+description = Reload evolution tests
+script = test.lua
+is_parallel = False
+lua_libs = ../lua_libs ../../example/localcfg.lua
diff --git a/test/reload_evolution/test.lua b/test/reload_evolution/test.lua
new file mode 100644
index 0000000..ad0543a
--- /dev/null
+++ b/test/reload_evolution/test.lua
@@ -0,0 +1,9 @@
+#!/usr/bin/env tarantool
+
+require('strict').on()
+
+box.cfg{
+    listen = os.getenv("LISTEN"),
+}
+
+require('console').listen(os.getenv('ADMIN'))
diff --git a/test/unit/reload_evolution.result b/test/unit/reload_evolution.result
new file mode 100644
index 0000000..342ac24
--- /dev/null
+++ b/test/unit/reload_evolution.result
@@ -0,0 +1,45 @@
+test_run = require('test_run').new()
+---
+...
+fiber = require('fiber')
+---
+...
+log = require('log')
+---
+...
+util = require('util')
+---
+...
+reload_evolution = require('vshard.storage.reload_evolution')
+---
+...
+-- Init with the latest version.
+fake_M = { reload_evolution_version = reload_evolution.version }
+---
+...
+-- Test reload to the same version.
+reload_evolution.upgrade(fake_M)
+---
+...
+test_run:grep_log('default', 'vshard.storage.evolution') == nil
+---
+- true
+...
+-- Test downgrage version.
+log.info(string.rep('a', 1000))
+---
+...
+fake_M.reload_evolution_version = fake_M.reload_evolution_version + 1
+---
+...
+err = util.check_error(reload_evolution.upgrade, fake_M)
+---
+...
+err:match('auto%-downgrade is not implemented')
+---
+- auto-downgrade is not implemented
+...
+test_run:grep_log('default', 'vshard.storage.evolution', 1000) ~= nil
+---
+- false
+...
diff --git a/test/unit/reload_evolution.test.lua b/test/unit/reload_evolution.test.lua
new file mode 100644
index 0000000..b8a3ca8
--- /dev/null
+++ b/test/unit/reload_evolution.test.lua
@@ -0,0 +1,18 @@
+test_run = require('test_run').new()
+fiber = require('fiber')
+log = require('log')
+util = require('util')
+reload_evolution = require('vshard.storage.reload_evolution')
+-- Init with the latest version.
+fake_M = { reload_version = reload_evolution.version }
+
+-- Test reload to the same version.
+reload_evolution.upgrade(fake_M)
+test_run:grep_log('default', 'vshard.storage.evolution') == nil
+
+-- Test downgrage version.
+log.info(string.rep('a', 1000))
+fake_M.reload_evolution_version = fake_M.reload_evolution_version + 1
+err = util.check_error(reload_evolution.upgrade, fake_M)
+err:match('auto%-downgrade is not implemented')
+test_run:grep_log('default', 'vshard.storage.evolution', 1000) ~= nil
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index 07bd00c..69e858a 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -10,6 +10,7 @@ if rawget(_G, MODULE_INTERNALS) then
      local vshard_modules = {
          'vshard.consts', 'vshard.error', 'vshard.cfg',
          'vshard.replicaset', 'vshard.util',
+        'vshard.storage.reload_evolution'
      }
      for _, module in pairs(vshard_modules) do
          package.loaded[module] = nil
@@ -20,12 +21,16 @@ local lerror = require('vshard.error')
  local lcfg = require('vshard.cfg')
  local lreplicaset = require('vshard.replicaset')
  local util = require('vshard.util')
+local reload_evolution = require('vshard.storage.reload_evolution')

  local M = rawget(_G, MODULE_INTERNALS)
  if not M then
      --
      -- The module is loaded for the first time.
      --
+    -- !!!WARNING: any change of this table must be reflected in
+    -- `vshard.storage.reload_evolution` module to guarantee
+    -- reloadability of the module.
      M = {
          ---------------- Common module attributes ----------------
          -- The last passed configuration.
@@ -105,6 +110,11 @@ if not M then
          -- a destination replicaset must drop already received
          -- data.
          rebalancer_sending_bucket = 0,
+
+        ------------------------- Reload -------------------------
+        -- Version of the loaded module. This number is used on
+        -- reload to determine which upgrade scripts to run.
+        reload_version = reload_evolution.version,
      }
  end

@@ -1863,6 +1873,7 @@ end
  if not rawget(_G, MODULE_INTERNALS) then
      rawset(_G, MODULE_INTERNALS, M)
  else
+    reload_evolution.upgrade(M)
      storage_cfg(M.current_cfg, M.this_replica.uuid)
      M.module_version = M.module_version + 1
  end
diff --git a/vshard/storage/reload_evolution.lua b/vshard/storage/reload_evolution.lua
new file mode 100644
index 0000000..8502a33
--- /dev/null
+++ b/vshard/storage/reload_evolution.lua
@@ -0,0 +1,58 @@
+--
+-- This module is used to upgrade the vshard.storage on the fly.
+-- It updates internal Lua structures in case they are changed
+-- in a commit.
+--
+local log = require('log')
+
+--
+-- Array of upgrade functions.
+-- migrations[version] = function which upgrades module version
+-- from `version` to `version + 1`.
+--
+local migrations = {}
+
+-- Initialize reload_upgrade mechanism
+migrations[#migrations + 1] = function (M)
+    -- Code to update Lua objects.
+end
+
+--
+-- Perform an update based on a version stored in `M` (internals).
+-- @param M Old module internals which should be updated.
+--
+local function upgrade(M)
+    local start_version = M.reload_version or 1
+    if start_version > #migrations then
+        local err_msg = string.format(
+            'vshard.storage.reload_evolution: ' ..
+            'auto-downgrade is not implemented; ' ..
+            'loaded version is %d, upgrade script version is %d',
+            start_version, #migrations
+        )
+        log.error(err_msg)
+        error(err_msg)
+    end
+    for i = start_version, #migrations  do
+        local ok, err = pcall(migrations[i], M)
+        if ok then
+            log.info('vshard.storage.reload_evolution: upgraded to %d version',
+                     i)
+        else
+            local err_msg = string.format(
+                'vshard.storage.reload_evolution: ' ..
+                'error during upgrade to %d version: %s', i, err
+            )
+            log.error(err_msg)
+            error(err_msg)
+        end
+        -- Update the version just after upgrade to have an
+        -- actual version in case of an error.
+        M.reload_version = i
+    end
+end
+
+return {
+    version = #migrations,
+    upgrade = upgrade,
+}

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [tarantool-patches] [PATCH 4/4] Introduce storage reload evolution
  2018-07-30  8:56 [tarantool-patches] [PATCH v4] vshard module reload AKhatskevich
@ 2018-07-30  8:56 ` AKhatskevich
  0 siblings, 0 replies; 14+ messages in thread
From: AKhatskevich @ 2018-07-30  8:56 UTC (permalink / raw)
  To: v.shpilevoy, tarantool-patches

Changes:
1. Introduce storage reload evolution.
2. Set up cross-version reload testing.

1:
This mechanism updates Lua objects on reload when they have
changed in the new vshard.storage version.

Starting with this commit, any change to vshard.storage.M has to
be reflected in vshard.storage.reload_evolution to guarantee a
correct reload.
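
A minimal sketch of how a later commit might register a migration,
assuming a hypothetical new attribute `new_field` in the internals
table. It is plain Lua modeled on `reload_evolution.lua` from this
patch, with logging omitted so that it runs without Tarantool; it is an
illustration, not the module itself.

```lua
-- Simplified model of vshard/storage/reload_evolution.lua (no logging).
local migrations = {}

-- Migration 1: initialize the mechanism, nothing to convert.
migrations[#migrations + 1] = function(M)
end

-- Migration 2 (hypothetical): a commit adds `new_field` to M.
-- Written to be idempotent, so running it twice is harmless.
migrations[#migrations + 1] = function(M)
    if M.new_field == nil then
        M.new_field = 0
    end
end

local function upgrade(M)
    -- Internals created before this mechanism have no version field.
    local start_version = M.reload_version or 1
    if start_version > #migrations then
        error('auto-downgrade is not implemented')
    end
    for i = start_version, #migrations do
        migrations[i](M)
        M.reload_version = i
    end
end

-- Internals left over from code that predates the mechanism.
local old_M = {}
upgrade(old_M)

-- Internals from a newer version than the loaded code: refused.
local ok = pcall(upgrade, {reload_version = #migrations + 1})
```

After `upgrade(old_M)` the table carries both `new_field` and
`reload_version`, while the call on internals with a higher version
fails, which is the behavior the unit test in this patch checks for
the real module.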

2:
The testing relies on git and is performed in the following way:
1. Copy an old version of vshard to a temporary folder.
2. Run vshard on this code.
3. Check out the latest version of the vshard sources.
4. Reload vshard storage.
5. Make sure it works (perform simple tests).
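
The reload in step 4 can be sketched in plain Lua as follows. The module
name `demo_storage` and the key `__demo_storage_internals` are
hypothetical, and `package.preload` stands in for a module file on
disk; the sketch only shows why the internals survive a reload, it is
not the vshard code.

```lua
-- Hypothetical key under which the module keeps its state in _G.
local MODULE_INTERNALS = '__demo_storage_internals'

-- package.preload replaces a real file so the sketch is self-contained.
package.preload['demo_storage'] = function()
    local M = rawget(_G, MODULE_INTERNALS)
    if not M then
        -- First load: create the internals and publish them globally.
        M = {module_version = 0, data = {}}
        rawset(_G, MODULE_INTERNALS, M)
    else
        -- Reload: reuse the old internals, only bump the counter.
        M.module_version = M.module_version + 1
    end
    return {internal = M}
end

local storage = require('demo_storage')
storage.internal.data.key = 'value'

-- Reload: drop the cached module and require it again.
package.loaded['demo_storage'] = nil
storage = require('demo_storage')
```

Because the state lives in `_G` rather than in the module table, the
second `require` sees the data written before the reload.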

Notes:
* This patch contains some legacy-driven decisions:
  1. The SOURCEDIR path is retrieved differently for a
     packpack build.
  2. The git directory in the `reload_evolution/storage` test
     is copied in a way that works on CentOS 7 and with
     SOURCEDIR mounted in `ro` mode.

Closes #112 #125
---
 .travis.yml                            |   2 +-
 rpm/prebuild.sh                        |   2 +
 test/lua_libs/git_util.lua             |  51 +++++++
 test/lua_libs/util.lua                 |  20 +++
 test/reload_evolution/storage.result   | 245 +++++++++++++++++++++++++++++++++
 test/reload_evolution/storage.test.lua |  87 ++++++++++++
 test/reload_evolution/storage_1_a.lua  |  48 +++++++
 test/reload_evolution/storage_1_b.lua  |   1 +
 test/reload_evolution/storage_2_a.lua  |   1 +
 test/reload_evolution/storage_2_b.lua  |   1 +
 test/reload_evolution/suite.ini        |   6 +
 test/reload_evolution/test.lua         |   9 ++
 test/unit/reload_evolution.result      |  45 ++++++
 test/unit/reload_evolution.test.lua    |  18 +++
 vshard/storage/init.lua                |  11 ++
 vshard/storage/reload_evolution.lua    |  58 ++++++++
 16 files changed, 604 insertions(+), 1 deletion(-)
 create mode 100644 test/lua_libs/git_util.lua
 create mode 100644 test/reload_evolution/storage.result
 create mode 100644 test/reload_evolution/storage.test.lua
 create mode 100755 test/reload_evolution/storage_1_a.lua
 create mode 120000 test/reload_evolution/storage_1_b.lua
 create mode 120000 test/reload_evolution/storage_2_a.lua
 create mode 120000 test/reload_evolution/storage_2_b.lua
 create mode 100644 test/reload_evolution/suite.ini
 create mode 100644 test/reload_evolution/test.lua
 create mode 100644 test/unit/reload_evolution.result
 create mode 100644 test/unit/reload_evolution.test.lua
 create mode 100644 vshard/storage/reload_evolution.lua

diff --git a/.travis.yml b/.travis.yml
index 54bfe44..eff4a51 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -41,7 +41,7 @@ env:
 script:
   - git describe --long
   - git clone https://github.com/packpack/packpack.git packpack
-  - packpack/packpack
+  - packpack/packpack -e PACKPACK_GIT_SOURCEDIR=/source/
 
 before_deploy:
   - ls -l build/
diff --git a/rpm/prebuild.sh b/rpm/prebuild.sh
index 768b22b..554032b 100755
--- a/rpm/prebuild.sh
+++ b/rpm/prebuild.sh
@@ -1 +1,3 @@
 curl -s https://packagecloud.io/install/repositories/tarantool/1_9/script.rpm.sh | sudo bash
+sudo yum -y install python-devel python-pip
+sudo pip install tarantool msgpack
diff --git a/test/lua_libs/git_util.lua b/test/lua_libs/git_util.lua
new file mode 100644
index 0000000..a75bb08
--- /dev/null
+++ b/test/lua_libs/git_util.lua
@@ -0,0 +1,51 @@
+--
+-- Lua bridge for some of the git commands.
+--
+local os = require('os')
+
+local temp_file = 'some_strange_rare_unique_file_name_for_git_util'
+
+--
+-- Exec a git command.
+-- @param params Table of parameters:
+--        * options - git options.
+--        * cmd - git command.
+--        * args - command arguments.
+--        * dir - working directory.
+--        * fout - write output to the file.
+local function exec_cmd(params)
+    local fout = params.fout
+    local shell_cmd = {'git'}
+    for _, param in pairs({'options', 'cmd', 'args'}) do
+        table.insert(shell_cmd, params[param])
+    end
+    if fout then
+        table.insert(shell_cmd, ' >' .. fout)
+    end
+    shell_cmd = table.concat(shell_cmd, ' ')
+    if params.dir then
+        shell_cmd = string.format('cd %s && %s', params.dir, shell_cmd)
+    end
+    local res = os.execute(shell_cmd)
+    assert(res == 0, 'Git cmd error: ' .. res)
+end
+
+local function log_hashes(params)
+    params.args = "--format='%h' " .. params.args
+    local local_temp_file = string.format('%s/%s', os.getenv('PWD'), temp_file)
+    params.fout = local_temp_file
+    params.cmd = 'log'
+    exec_cmd(params)
+    local lines = {}
+    for line in io.lines(local_temp_file) do
+        table.insert(lines, line)
+    end
+    os.remove(local_temp_file)
+    return lines
+end
+
+
+return {
+    exec_cmd = exec_cmd,
+    log_hashes = log_hashes
+}
diff --git a/test/lua_libs/util.lua b/test/lua_libs/util.lua
index f40d3a6..935ff41 100644
--- a/test/lua_libs/util.lua
+++ b/test/lua_libs/util.lua
@@ -1,5 +1,6 @@
 local fiber = require('fiber')
 local log = require('log')
+local fio = require('fio')
 
 local function check_error(func, ...)
     local pstatus, status, err = pcall(func, ...)
@@ -92,10 +93,29 @@ local function has_same_fields(etalon, data)
     return true
 end
 
+-- Git directory of the project. Used in evolution tests to
+-- fetch old versions of vshard.
+local SOURCEDIR = os.getenv('PACKPACK_GIT_SOURCEDIR')
+if not SOURCEDIR then
+    SOURCEDIR = os.getenv('SOURCEDIR')
+end
+if not SOURCEDIR then
+    local script_path = debug.getinfo(1).source:match("@?(.*/)")
+    script_path = fio.abspath(script_path)
+    SOURCEDIR = fio.abspath(script_path .. '/../../../')
+end
+
+local BUILDDIR = os.getenv('BUILDDIR')
+if not BUILDDIR then
+    BUILDDIR = SOURCEDIR
+end
+
 return {
     check_error = check_error,
     shuffle_masters = shuffle_masters,
     collect_timeouts = collect_timeouts,
     wait_master = wait_master,
     has_same_fields = has_same_fields,
+    SOURCEDIR = SOURCEDIR,
+    BUILDDIR = BUILDDIR,
 }
diff --git a/test/reload_evolution/storage.result b/test/reload_evolution/storage.result
new file mode 100644
index 0000000..007192c
--- /dev/null
+++ b/test/reload_evolution/storage.result
@@ -0,0 +1,245 @@
+test_run = require('test_run').new()
+---
+...
+git_util = require('lua_libs.git_util')
+---
+...
+util = require('lua_libs.util')
+---
+...
+vshard_copy_path = util.BUILDDIR .. '/test/var/vshard_git_tree_copy'
+---
+...
+evolution_log = git_util.log_hashes({args='vshard/storage/reload_evolution.lua', dir=util.SOURCEDIR})
+---
+...
+-- Cleanup the directory after a previous build.
+_ = os.execute('rm -rf ' .. vshard_copy_path)
+---
+...
+-- 1. `git worktree` cannot be used because PACKPACK mounts
+-- `/source/` in `ro` mode.
+-- 2. Just `cp -rf` cannot be used due to a little different
+-- behavior in Centos 7.
+_ = os.execute('mkdir ' .. vshard_copy_path)
+---
+...
+_ = os.execute("cd " .. util.SOURCEDIR .. ' && cp -rf `ls -A --ignore=build` ' .. vshard_copy_path)
+---
+...
+-- Checkout the first commit with a reload_evolution mechanism.
+git_util.exec_cmd({cmd='checkout', args='-f', dir=vshard_copy_path})
+---
+...
+git_util.exec_cmd({cmd='checkout', args=evolution_log[#evolution_log] .. '~1', dir=vshard_copy_path})
+---
+...
+REPLICASET_1 = { 'storage_1_a', 'storage_1_b' }
+---
+...
+REPLICASET_2 = { 'storage_2_a', 'storage_2_b' }
+---
+...
+test_run:create_cluster(REPLICASET_1, 'reload_evolution')
+---
+...
+test_run:create_cluster(REPLICASET_2, 'reload_evolution')
+---
+...
+util = require('lua_libs.util')
+---
+...
+util.wait_master(test_run, REPLICASET_1, 'storage_1_a')
+---
+...
+util.wait_master(test_run, REPLICASET_2, 'storage_2_a')
+---
+...
+test_run:switch('storage_1_a')
+---
+- true
+...
+vshard.storage.bucket_force_create(1, vshard.consts.DEFAULT_BUCKET_COUNT / 2)
+---
+- true
+...
+bucket_id_to_move = vshard.consts.DEFAULT_BUCKET_COUNT
+---
+...
+test_run:switch('storage_2_a')
+---
+- true
+...
+fiber = require('fiber')
+---
+...
+vshard.storage.bucket_force_create(vshard.consts.DEFAULT_BUCKET_COUNT / 2 + 1, vshard.consts.DEFAULT_BUCKET_COUNT / 2)
+---
+- true
+...
+bucket_id_to_move = vshard.consts.DEFAULT_BUCKET_COUNT
+---
+...
+vshard.storage.internal.reload_version
+---
+- null
+...
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+---
+...
+box.space.test:insert({42, bucket_id_to_move})
+---
+- [42, 3000]
+...
+test_run:switch('default')
+---
+- true
+...
+git_util.exec_cmd({cmd='checkout', args=evolution_log[1], dir=vshard_copy_path})
+---
+...
+test_run:switch('storage_2_a')
+---
+- true
+...
+package.loaded['vshard.storage'] = nil
+---
+...
+vshard.storage = require("vshard.storage")
+---
+...
+test_run:grep_log('storage_2_a', 'vshard.storage.reload_evolution: upgraded to') ~= nil
+---
+- true
+...
+vshard.storage.internal.reload_version
+---
+- 1
+...
+-- Make sure storage operates well.
+vshard.storage.bucket_force_drop(2000)
+---
+- true
+...
+vshard.storage.bucket_force_create(2000)
+---
+- true
+...
+vshard.storage.buckets_info()[2000]
+---
+- status: active
+  id: 2000
+...
+vshard.storage.call(bucket_id_to_move, 'read', 'do_select', {42})
+---
+- true
+- - [42, 3000]
+...
+vshard.storage.bucket_send(bucket_id_to_move, replicaset1_uuid)
+---
+- true
+...
+vshard.storage.garbage_collector_wakeup()
+---
+...
+fiber = require('fiber')
+---
+...
+while box.space._bucket:get({bucket_id_to_move}) do fiber.sleep(0.01) end
+---
+...
+test_run:switch('storage_1_a')
+---
+- true
+...
+vshard.storage.bucket_send(bucket_id_to_move, replicaset2_uuid)
+---
+- true
+...
+test_run:switch('storage_2_a')
+---
+- true
+...
+vshard.storage.call(bucket_id_to_move, 'read', 'do_select', {42})
+---
+- true
+- - [42, 3000]
+...
+-- Check info() does not fail.
+vshard.storage.info() ~= nil
+---
+- true
+...
+--
+-- Send buckets to create a disbalance. Wait until the rebalancer
+-- repairs it. Similar to `tests/rebalancer/rebalancer.test.lua`.
+--
+vshard.storage.rebalancer_disable()
+---
+...
+move_start = vshard.consts.DEFAULT_BUCKET_COUNT / 2 + 1
+---
+...
+move_cnt = 100
+---
+...
+assert(move_start + move_cnt < vshard.consts.DEFAULT_BUCKET_COUNT)
+---
+- true
+...
+for i = move_start, move_start + move_cnt - 1 do box.space._bucket:delete{i} end
+---
+...
+box.space._bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
+---
+- 1400
+...
+test_run:switch('storage_1_a')
+---
+- true
+...
+move_start = vshard.consts.DEFAULT_BUCKET_COUNT / 2 + 1
+---
+...
+move_cnt = 100
+---
+...
+vshard.storage.bucket_force_create(move_start, move_cnt)
+---
+- true
+...
+box.space._bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
+---
+- 1600
+...
+test_run:switch('storage_2_a')
+---
+- true
+...
+vshard.storage.rebalancer_enable()
+---
+...
+wait_rebalancer_state('Rebalance routes are sent', test_run)
+---
+...
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+---
+...
+box.space._bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
+---
+- 1500
+...
+test_run:switch('default')
+---
+- true
+...
+test_run:drop_cluster(REPLICASET_2)
+---
+...
+test_run:drop_cluster(REPLICASET_1)
+---
+...
+test_run:cmd('clear filter')
+---
+- true
+...
diff --git a/test/reload_evolution/storage.test.lua b/test/reload_evolution/storage.test.lua
new file mode 100644
index 0000000..7af464b
--- /dev/null
+++ b/test/reload_evolution/storage.test.lua
@@ -0,0 +1,87 @@
+test_run = require('test_run').new()
+
+git_util = require('lua_libs.git_util')
+util = require('lua_libs.util')
+vshard_copy_path = util.BUILDDIR .. '/test/var/vshard_git_tree_copy'
+evolution_log = git_util.log_hashes({args='vshard/storage/reload_evolution.lua', dir=util.SOURCEDIR})
+-- Cleanup the directory after a previous build.
+_ = os.execute('rm -rf ' .. vshard_copy_path)
+-- 1. `git worktree` cannot be used because PACKPACK mounts
+-- `/source/` in `ro` mode.
+-- 2. Just `cp -rf` cannot be used due to a little different
+-- behavior in Centos 7.
+_ = os.execute('mkdir ' .. vshard_copy_path)
+_ = os.execute("cd " .. util.SOURCEDIR .. ' && cp -rf `ls -A --ignore=build` ' .. vshard_copy_path)
+-- Checkout the first commit with a reload_evolution mechanism.
+git_util.exec_cmd({cmd='checkout', args='-f', dir=vshard_copy_path})
+git_util.exec_cmd({cmd='checkout', args=evolution_log[#evolution_log] .. '~1', dir=vshard_copy_path})
+
+REPLICASET_1 = { 'storage_1_a', 'storage_1_b' }
+REPLICASET_2 = { 'storage_2_a', 'storage_2_b' }
+test_run:create_cluster(REPLICASET_1, 'reload_evolution')
+test_run:create_cluster(REPLICASET_2, 'reload_evolution')
+util = require('lua_libs.util')
+util.wait_master(test_run, REPLICASET_1, 'storage_1_a')
+util.wait_master(test_run, REPLICASET_2, 'storage_2_a')
+
+test_run:switch('storage_1_a')
+vshard.storage.bucket_force_create(1, vshard.consts.DEFAULT_BUCKET_COUNT / 2)
+bucket_id_to_move = vshard.consts.DEFAULT_BUCKET_COUNT
+
+test_run:switch('storage_2_a')
+fiber = require('fiber')
+vshard.storage.bucket_force_create(vshard.consts.DEFAULT_BUCKET_COUNT / 2 + 1, vshard.consts.DEFAULT_BUCKET_COUNT / 2)
+bucket_id_to_move = vshard.consts.DEFAULT_BUCKET_COUNT
+vshard.storage.internal.reload_version
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+box.space.test:insert({42, bucket_id_to_move})
+
+test_run:switch('default')
+git_util.exec_cmd({cmd='checkout', args=evolution_log[1], dir=vshard_copy_path})
+
+test_run:switch('storage_2_a')
+package.loaded['vshard.storage'] = nil
+vshard.storage = require("vshard.storage")
+test_run:grep_log('storage_2_a', 'vshard.storage.reload_evolution: upgraded to') ~= nil
+vshard.storage.internal.reload_version
+-- Make sure storage operates well.
+vshard.storage.bucket_force_drop(2000)
+vshard.storage.bucket_force_create(2000)
+vshard.storage.buckets_info()[2000]
+vshard.storage.call(bucket_id_to_move, 'read', 'do_select', {42})
+vshard.storage.bucket_send(bucket_id_to_move, replicaset1_uuid)
+vshard.storage.garbage_collector_wakeup()
+fiber = require('fiber')
+while box.space._bucket:get({bucket_id_to_move}) do fiber.sleep(0.01) end
+test_run:switch('storage_1_a')
+vshard.storage.bucket_send(bucket_id_to_move, replicaset2_uuid)
+test_run:switch('storage_2_a')
+vshard.storage.call(bucket_id_to_move, 'read', 'do_select', {42})
+-- Check info() does not fail.
+vshard.storage.info() ~= nil
+
+--
+-- Send buckets to create a disbalance. Wait until the rebalancer
+-- repairs it. Similar to `tests/rebalancer/rebalancer.test.lua`.
+--
+vshard.storage.rebalancer_disable()
+move_start = vshard.consts.DEFAULT_BUCKET_COUNT / 2 + 1
+move_cnt = 100
+assert(move_start + move_cnt < vshard.consts.DEFAULT_BUCKET_COUNT)
+for i = move_start, move_start + move_cnt - 1 do box.space._bucket:delete{i} end
+box.space._bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
+test_run:switch('storage_1_a')
+move_start = vshard.consts.DEFAULT_BUCKET_COUNT / 2 + 1
+move_cnt = 100
+vshard.storage.bucket_force_create(move_start, move_cnt)
+box.space._bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
+test_run:switch('storage_2_a')
+vshard.storage.rebalancer_enable()
+wait_rebalancer_state('Rebalance routes are sent', test_run)
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+box.space._bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
+
+test_run:switch('default')
+test_run:drop_cluster(REPLICASET_2)
+test_run:drop_cluster(REPLICASET_1)
+test_run:cmd('clear filter')
diff --git a/test/reload_evolution/storage_1_a.lua b/test/reload_evolution/storage_1_a.lua
new file mode 100755
index 0000000..f1a2981
--- /dev/null
+++ b/test/reload_evolution/storage_1_a.lua
@@ -0,0 +1,48 @@
+#!/usr/bin/env tarantool
+
+require('strict').on()
+
+local log = require('log')
+local fiber = require('fiber')
+local util = require('lua_libs.util')
+local fio = require('fio')
+
+-- Get instance name
+NAME = fio.basename(arg[0], '.lua')
+
+-- test-run gate.
+test_run = require('test_run').new()
+require('console').listen(os.getenv('ADMIN'))
+
+-- Run one storage on a different vshard version.
+-- To do that, place vshard src to
+-- BUILDDIR/test/var/vshard_git_tree_copy/.
+if NAME == 'storage_2_a' then
+    local script_path = debug.getinfo(1).source:match("@?(.*/)")
+    vshard_copy = util.BUILDDIR .. '/test/var/vshard_git_tree_copy'
+    package.path = string.format(
+        '%s/?.lua;%s/?/init.lua;%s',
+        vshard_copy, vshard_copy, package.path
+    )
+end
+
+-- Call a configuration provider
+cfg = require('localcfg')
+-- Name to uuid map
+names = {
+    ['storage_1_a'] = '8a274925-a26d-47fc-9e1b-af88ce939412',
+    ['storage_1_b'] = '3de2e3e1-9ebe-4d0d-abb1-26d301b84633',
+    ['storage_2_a'] = '1e02ae8a-afc0-4e91-ba34-843a356b8ed7',
+    ['storage_2_b'] = '001688c3-66f8-4a31-8e19-036c17d489c2',
+}
+
+replicaset1_uuid = 'cbf06940-0790-498b-948d-042b62cf3d29'
+replicaset2_uuid = 'ac522f65-aa94-4134-9f64-51ee384f1a54'
+replicasets = {replicaset1_uuid, replicaset2_uuid}
+
+-- Start the database with sharding
+vshard = require('vshard')
+vshard.storage.cfg(cfg, names[NAME])
+
+-- Bootstrap storage.
+require('lua_libs.bootstrap')
diff --git a/test/reload_evolution/storage_1_b.lua b/test/reload_evolution/storage_1_b.lua
new file mode 120000
index 0000000..02572da
--- /dev/null
+++ b/test/reload_evolution/storage_1_b.lua
@@ -0,0 +1 @@
+storage_1_a.lua
\ No newline at end of file
diff --git a/test/reload_evolution/storage_2_a.lua b/test/reload_evolution/storage_2_a.lua
new file mode 120000
index 0000000..02572da
--- /dev/null
+++ b/test/reload_evolution/storage_2_a.lua
@@ -0,0 +1 @@
+storage_1_a.lua
\ No newline at end of file
diff --git a/test/reload_evolution/storage_2_b.lua b/test/reload_evolution/storage_2_b.lua
new file mode 120000
index 0000000..02572da
--- /dev/null
+++ b/test/reload_evolution/storage_2_b.lua
@@ -0,0 +1 @@
+storage_1_a.lua
\ No newline at end of file
diff --git a/test/reload_evolution/suite.ini b/test/reload_evolution/suite.ini
new file mode 100644
index 0000000..5f55418
--- /dev/null
+++ b/test/reload_evolution/suite.ini
@@ -0,0 +1,6 @@
+[default]
+core = tarantool
+description = Reload evolution tests
+script = test.lua
+is_parallel = False
+lua_libs = ../lua_libs ../../example/localcfg.lua
diff --git a/test/reload_evolution/test.lua b/test/reload_evolution/test.lua
new file mode 100644
index 0000000..ad0543a
--- /dev/null
+++ b/test/reload_evolution/test.lua
@@ -0,0 +1,9 @@
+#!/usr/bin/env tarantool
+
+require('strict').on()
+
+box.cfg{
+    listen = os.getenv("LISTEN"),
+}
+
+require('console').listen(os.getenv('ADMIN'))
diff --git a/test/unit/reload_evolution.result b/test/unit/reload_evolution.result
new file mode 100644
index 0000000..10e606d
--- /dev/null
+++ b/test/unit/reload_evolution.result
@@ -0,0 +1,45 @@
+test_run = require('test_run').new()
+---
+...
+fiber = require('fiber')
+---
+...
+log = require('log')
+---
+...
+util = require('util')
+---
+...
+reload_evolution = require('vshard.storage.reload_evolution')
+---
+...
+-- Init with the latest version.
+fake_M = { reload_version = reload_evolution.version }
+---
+...
+-- Test reload to the same version.
+reload_evolution.upgrade(fake_M)
+---
+...
+test_run:grep_log('default', 'vshard.storage.evolution') == nil
+---
+- true
+...
+-- Test version downgrade.
+log.info(string.rep('a', 1000))
+---
+...
+fake_M.reload_version = fake_M.reload_version + 1
+---
+...
+err = util.check_error(reload_evolution.upgrade, fake_M)
+---
+...
+err:match('auto%-downgrade is not implemented')
+---
+- auto-downgrade is not implemented
+...
+test_run:grep_log('default', 'vshard.storage.evolution', 1000) ~= nil
+---
+- false
+...
diff --git a/test/unit/reload_evolution.test.lua b/test/unit/reload_evolution.test.lua
new file mode 100644
index 0000000..2e99152
--- /dev/null
+++ b/test/unit/reload_evolution.test.lua
@@ -0,0 +1,18 @@
+test_run = require('test_run').new()
+fiber = require('fiber')
+log = require('log')
+util = require('util')
+reload_evolution = require('vshard.storage.reload_evolution')
+-- Init with the latest version.
+fake_M = { reload_version = reload_evolution.version }
+
+-- Test reload to the same version.
+reload_evolution.upgrade(fake_M)
+test_run:grep_log('default', 'vshard.storage.evolution') == nil
+
+-- Test version downgrade.
+log.info(string.rep('a', 1000))
+fake_M.reload_version = fake_M.reload_version + 1
+err = util.check_error(reload_evolution.upgrade, fake_M)
+err:match('auto%-downgrade is not implemented')
+test_run:grep_log('default', 'vshard.storage.evolution', 1000) ~= nil
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index 8ca81f6..6be66d2 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -10,6 +10,7 @@ if rawget(_G, MODULE_INTERNALS) then
     local vshard_modules = {
         'vshard.consts', 'vshard.error', 'vshard.cfg',
         'vshard.replicaset', 'vshard.util',
+        'vshard.storage.reload_evolution'
     }
     for _, module in pairs(vshard_modules) do
         package.loaded[module] = nil
@@ -20,12 +21,16 @@ local lerror = require('vshard.error')
 local lcfg = require('vshard.cfg')
 local lreplicaset = require('vshard.replicaset')
 local util = require('vshard.util')
+local reload_evolution = require('vshard.storage.reload_evolution')
 
 local M = rawget(_G, MODULE_INTERNALS)
 if not M then
     --
     -- The module is loaded for the first time.
     --
+    -- !!!WARNING: any change to this table must be reflected in
+    -- the `vshard.storage.reload_evolution` module to guarantee
+    -- reloadability of the module.
     M = {
         ---------------- Common module attributes ----------------
         -- The last passed configuration.
@@ -105,6 +110,11 @@ if not M then
         -- a destination replicaset must drop already received
         -- data.
         rebalancer_sending_bucket = 0,
+
+        ------------------------- Reload -------------------------
+        -- Version of the loaded module. This number is used on
+        -- reload to determine which upgrade scripts to run.
+        reload_version = reload_evolution.version,
     }
 end
 
@@ -1863,6 +1873,7 @@ end
 if not rawget(_G, MODULE_INTERNALS) then
     rawset(_G, MODULE_INTERNALS, M)
 else
+    reload_evolution.upgrade(M)
     storage_cfg(M.current_cfg, M.this_replica.uuid)
     M.module_version = M.module_version + 1
 end
diff --git a/vshard/storage/reload_evolution.lua b/vshard/storage/reload_evolution.lua
new file mode 100644
index 0000000..8502a33
--- /dev/null
+++ b/vshard/storage/reload_evolution.lua
@@ -0,0 +1,58 @@
+--
+-- This module is used to upgrade the vshard.storage on the fly.
+-- It updates internal Lua structures in case they are changed
+-- in a commit.
+--
+local log = require('log')
+
+--
+-- Array of upgrade functions.
+-- migrations[version] = function which brings the module
+-- internals up to `version`.
+--
+local migrations = {}
+
+-- Initialize the reload_evolution mechanism.
+migrations[#migrations + 1] = function (M)
+    -- Code to update Lua objects.
+end
+
+--
+-- Perform an update based on a version stored in `M` (internals).
+-- @param M Old module internals which should be updated.
+--
+local function upgrade(M)
+    local start_version = M.reload_version or 1
+    if start_version > #migrations then
+        local err_msg = string.format(
+            'vshard.storage.reload_evolution: ' ..
+            'auto-downgrade is not implemented; ' ..
+            'loaded version is %d, upgrade script version is %d',
+            start_version, #migrations
+        )
+        log.error(err_msg)
+        error(err_msg)
+    end
+    for i = start_version, #migrations do
+        local ok, err = pcall(migrations[i], M)
+        if ok then
+            log.info('vshard.storage.reload_evolution: upgraded to %d version',
+                     i)
+        else
+            local err_msg = string.format(
+                'vshard.storage.reload_evolution: ' ..
+                'error during upgrade to %d version: %s', i, err
+            )
+            log.error(err_msg)
+            error(err_msg)
+        end
+        -- Update the version right after each successful step so
+        -- that M holds the current version if a later step fails.
+        M.reload_version = i
+    end
+end
+
+return {
+    version = #migrations,
+    upgrade = upgrade,
+}
-- 
2.14.1
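
A standalone sketch of how further steps might be appended to the `migrations` array from this patch. It runs in plain Lua without Tarantool, and logging is omitted. The fields `new_counter`, `old_field` and `renamed_field` are hypothetical and exist only for illustration; they are not part of vshard. Because the loop in `upgrade` starts at the stored `reload_version`, a reload at the same version runs the last step again, so each step here is written to be idempotent.

```lua
-- Illustrative only: mirrors the structure of upgrade() in the patch,
-- with hypothetical migration steps and without the log module.
local migrations = {}

-- Hypothetical step 1: add a field that older internals lack.
migrations[#migrations + 1] = function(M)
    if M.new_counter == nil then
        M.new_counter = 0
    end
end

-- Hypothetical step 2: rename a field; safe to run more than once.
migrations[#migrations + 1] = function(M)
    if M.old_field ~= nil then
        M.renamed_field = M.old_field
        M.old_field = nil
    end
end

local function upgrade(M)
    local start_version = M.reload_version or 1
    if start_version > #migrations then
        error(string.format(
            'auto-downgrade is not implemented; ' ..
            'loaded version is %d, upgrade script version is %d',
            start_version, #migrations))
    end
    for i = start_version, #migrations do
        local ok, err = pcall(migrations[i], M)
        if not ok then
            error(string.format('error during upgrade to %d version: %s',
                                i, tostring(err)))
        end
        -- Record progress after every successful step.
        M.reload_version = i
    end
end

-- Old internals without a stored version go through every step.
local old_M = { old_field = 'x' }
upgrade(old_M)
```

Internals that report a version newer than the script knows about (for example `{ reload_version = 3 }` here) make `upgrade` raise the auto-downgrade error, as in the patch.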


Thread overview: 14+ messages
2018-07-23 11:14 [tarantool-patches] [PATCH v2] vshard reload mechanism AKhatskevich
2018-07-23 11:14 ` [tarantool-patches] [PATCH 1/4] Add test on error during reconfigure AKhatskevich
2018-07-23 13:18   ` [tarantool-patches] " Vladislav Shpilevoy
2018-07-23 11:14 ` [tarantool-patches] [PATCH 2/4] Complete module reload AKhatskevich
2018-07-23 13:31   ` [tarantool-patches] " Vladislav Shpilevoy
2018-07-23 13:45     ` Alex Khatskevich
2018-07-23 14:58       ` Vladislav Shpilevoy
2018-07-23 11:14 ` [tarantool-patches] [PATCH 3/4] Tests: separate bootstrap routine to a lua_libs AKhatskevich
2018-07-23 13:36   ` [tarantool-patches] " Vladislav Shpilevoy
2018-07-23 17:19     ` Alex Khatskevich
2018-07-23 11:14 ` [tarantool-patches] [PATCH 4/4] Introduce storage reload evolution AKhatskevich
2018-07-23 14:44   ` [tarantool-patches] " Vladislav Shpilevoy
2018-07-23 20:10     ` Alex Khatskevich
2018-07-30  8:56 [tarantool-patches] [PATCH v4] vshard module reload AKhatskevich
2018-07-30  8:56 ` [tarantool-patches] [PATCH 4/4] Introduce storage reload evolution AKhatskevich
