* [tarantool-patches] [PATCH 0/3] multiple routers
@ 2018-07-31 16:25 AKhatskevich
2018-07-31 16:25 ` [tarantool-patches] [PATCH 1/3] Update only vshard part of a cfg on reload AKhatskevich
` (4 more replies)
0 siblings, 5 replies; 23+ messages in thread
From: AKhatskevich @ 2018-07-31 16:25 UTC (permalink / raw)
To: v.shpilevoy, tarantool-patches
Issue: https://github.com/tarantool/vshard/issues/130
Extra issue: https://github.com/tarantool/vshard/issues/138
Branch: https://github.com/tarantool/vshard/tree/kh/gh-130-multiple-routers
This patchset introduces the multiple routers feature.
A user can create multiple router instances, each connected
to a different (or the same) cluster.
AKhatskevich (3):
Update only vshard part of a cfg on reload
Move lua gc to a dedicated module
Introduce multiple routers feature
test/multiple_routers/configs.lua | 81 ++++++
test/multiple_routers/multiple_routers.result | 226 +++++++++++++++
test/multiple_routers/multiple_routers.test.lua | 85 ++++++
test/multiple_routers/router_1.lua | 15 +
test/multiple_routers/storage_1_1_a.lua | 23 ++
test/multiple_routers/storage_1_1_b.lua | 1 +
test/multiple_routers/storage_1_2_a.lua | 1 +
test/multiple_routers/storage_1_2_b.lua | 1 +
test/multiple_routers/storage_2_1_a.lua | 1 +
test/multiple_routers/storage_2_1_b.lua | 1 +
test/multiple_routers/storage_2_2_a.lua | 1 +
test/multiple_routers/storage_2_2_b.lua | 1 +
test/multiple_routers/suite.ini | 6 +
test/multiple_routers/test.lua | 9 +
test/router/garbage_collector.result | 27 +-
test/router/garbage_collector.test.lua | 18 +-
test/router/router.result | 4 +-
test/router/router.test.lua | 4 +-
test/storage/garbage_collector.result | 27 +-
test/storage/garbage_collector.test.lua | 22 +-
vshard/cfg.lua | 54 ++--
vshard/lua_gc.lua | 54 ++++
vshard/router/init.lua | 364 +++++++++++++++---------
vshard/storage/init.lua | 71 ++---
vshard/util.lua | 12 +-
25 files changed, 865 insertions(+), 244 deletions(-)
create mode 100644 test/multiple_routers/configs.lua
create mode 100644 test/multiple_routers/multiple_routers.result
create mode 100644 test/multiple_routers/multiple_routers.test.lua
create mode 100644 test/multiple_routers/router_1.lua
create mode 100644 test/multiple_routers/storage_1_1_a.lua
create mode 120000 test/multiple_routers/storage_1_1_b.lua
create mode 120000 test/multiple_routers/storage_1_2_a.lua
create mode 120000 test/multiple_routers/storage_1_2_b.lua
create mode 120000 test/multiple_routers/storage_2_1_a.lua
create mode 120000 test/multiple_routers/storage_2_1_b.lua
create mode 120000 test/multiple_routers/storage_2_2_a.lua
create mode 120000 test/multiple_routers/storage_2_2_b.lua
create mode 100644 test/multiple_routers/suite.ini
create mode 100644 test/multiple_routers/test.lua
create mode 100644 vshard/lua_gc.lua
--
2.14.1
^ permalink raw reply [flat|nested] 23+ messages in thread
* [tarantool-patches] [PATCH 1/3] Update only vshard part of a cfg on reload
2018-07-31 16:25 [tarantool-patches] [PATCH 0/3] multiple routers AKhatskevich
@ 2018-07-31 16:25 ` AKhatskevich
2018-08-01 18:43 ` [tarantool-patches] " Vladislav Shpilevoy
2018-07-31 16:25 ` [tarantool-patches] [PATCH 2/3] Move lua gc to a dedicated module AKhatskevich
` (3 subsequent siblings)
4 siblings, 1 reply; 23+ messages in thread
From: AKhatskevich @ 2018-07-31 16:25 UTC (permalink / raw)
To: v.shpilevoy, tarantool-patches
The box cfg could have been changed by a user and then overridden by
an old vshard config on reload.
Since this commit, the box part of a config is applied only when
it is explicitly passed to the `cfg` method.
This change is important for the multiple routers feature.
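To illustrate the new contract, here is a hypothetical usage sketch (the `sharding` map and `instance_uuid` are placeholders, not from this patch): vshard and box options live in one table, and `split_cfg()` routes each key to its destination, so box options reach `box.cfg` only when explicitly passed.

```lua
-- Hypothetical sketch: one mixed table; split_cfg() separates it.
local cfg = {
    sharding = { --[[ replicaset map, omitted ]] },
    bucket_count = 3000,              -- vshard option -> vshard_cfg
    listen = 3301,                    -- box option    -> box_cfg
    memtx_memory = 100 * 1024 * 1024, -- box option    -> box_cfg
}
-- On plain (re)configuration the box part is applied via box.cfg();
-- on module reload the stale box options are no longer re-applied.
vshard.storage.cfg(cfg, instance_uuid)
```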
---
vshard/cfg.lua | 54 +++++++++++++++++++++++++------------------------
vshard/router/init.lua | 18 ++++++++---------
vshard/storage/init.lua | 53 ++++++++++++++++++++++++++++--------------------
3 files changed, 67 insertions(+), 58 deletions(-)
diff --git a/vshard/cfg.lua b/vshard/cfg.lua
index bba12cc..8282086 100644
--- a/vshard/cfg.lua
+++ b/vshard/cfg.lua
@@ -230,48 +230,50 @@ local non_dynamic_options = {
'bucket_count', 'shard_index'
}
+--
+-- Deepcopy a config and split it into vshard_cfg and box_cfg.
+--
+local function split_cfg(cfg)
+ local vshard_field_map = {}
+ for _, field in ipairs(cfg_template) do
+ vshard_field_map[field[1]] = true
+ end
+ local vshard_cfg = {}
+ local box_cfg = {}
+ for k, v in pairs(cfg) do
+ if vshard_field_map[k] then
+ vshard_cfg[k] = table.deepcopy(v)
+ else
+ box_cfg[k] = table.deepcopy(v)
+ end
+ end
+ return vshard_cfg, box_cfg
+end
+
--
-- Check sharding config on correctness. Check types, name and uri
-- uniqueness, master count (in each replicaset must be <= 1).
--
-local function cfg_check(shard_cfg, old_cfg)
- if type(shard_cfg) ~= 'table' then
+local function cfg_check(cfg, old_vshard_cfg)
+ if type(cfg) ~= 'table' then
error('Сonfig must be map of options')
end
- shard_cfg = table.deepcopy(shard_cfg)
- validate_config(shard_cfg, cfg_template)
- if not old_cfg then
- return shard_cfg
+ local vshard_cfg, box_cfg = split_cfg(cfg)
+ validate_config(vshard_cfg, cfg_template)
+ if not old_vshard_cfg then
+ return vshard_cfg, box_cfg
end
-- Check non-dynamic after default values are added.
for _, f_name in pairs(non_dynamic_options) do
-- New option may be added in new vshard version.
- if shard_cfg[f_name] ~= old_cfg[f_name] then
+ if vshard_cfg[f_name] ~= old_vshard_cfg[f_name] then
error(string.format('Non-dynamic option %s ' ..
'cannot be reconfigured', f_name))
end
end
- return shard_cfg
-end
-
---
--- Nullify non-box options.
---
-local function remove_non_box_options(cfg)
- cfg.sharding = nil
- cfg.weights = nil
- cfg.zone = nil
- cfg.bucket_count = nil
- cfg.rebalancer_disbalance_threshold = nil
- cfg.rebalancer_max_receiving = nil
- cfg.shard_index = nil
- cfg.collect_bucket_garbage_interval = nil
- cfg.collect_lua_garbage = nil
- cfg.sync_timeout = nil
- cfg.connection_outdate_delay = nil
+ return vshard_cfg, box_cfg
end
return {
check = cfg_check,
- remove_non_box_options = remove_non_box_options,
}
diff --git a/vshard/router/init.lua b/vshard/router/init.lua
index 4cb19fd..e2b2b22 100644
--- a/vshard/router/init.lua
+++ b/vshard/router/init.lua
@@ -496,18 +496,15 @@ end
--------------------------------------------------------------------------------
local function router_cfg(cfg)
- cfg = lcfg.check(cfg, M.current_cfg)
- local new_cfg = table.copy(cfg)
+ local vshard_cfg, box_cfg = lcfg.check(cfg, M.current_cfg)
if not M.replicasets then
log.info('Starting router configuration')
else
log.info('Starting router reconfiguration')
end
- local new_replicasets = lreplicaset.buildall(cfg)
- local total_bucket_count = cfg.bucket_count
- local collect_lua_garbage = cfg.collect_lua_garbage
- local box_cfg = table.copy(cfg)
- lcfg.remove_non_box_options(box_cfg)
+ local new_replicasets = lreplicaset.buildall(vshard_cfg)
+ local total_bucket_count = vshard_cfg.bucket_count
+ local collect_lua_garbage = vshard_cfg.collect_lua_garbage
log.info("Calling box.cfg()...")
for k, v in pairs(box_cfg) do
log.info({[k] = v})
@@ -530,11 +527,12 @@ local function router_cfg(cfg)
replicaset:connect_all()
end
lreplicaset.wait_masters_connect(new_replicasets)
- lreplicaset.outdate_replicasets(M.replicasets, cfg.connection_outdate_delay)
- M.connection_outdate_delay = cfg.connection_outdate_delay
+ lreplicaset.outdate_replicasets(M.replicasets,
+ vshard_cfg.connection_outdate_delay)
+ M.connection_outdate_delay = vshard_cfg.connection_outdate_delay
M.total_bucket_count = total_bucket_count
M.collect_lua_garbage = collect_lua_garbage
- M.current_cfg = cfg
+ M.current_cfg = vshard_cfg
M.replicasets = new_replicasets
-- Update existing route map in-place.
local old_route_map = M.route_map
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index 102b942..75f5df9 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -1500,13 +1500,17 @@ end
--------------------------------------------------------------------------------
-- Configuration
--------------------------------------------------------------------------------
+-- Private (not accessible by a user) reload indicator.
+local is_reload = false
local function storage_cfg(cfg, this_replica_uuid)
+ -- Reset is_reload indicator in case of errors.
+ local xis_reload = is_reload
+ is_reload = false
if this_replica_uuid == nil then
error('Usage: cfg(configuration, this_replica_uuid)')
end
- cfg = lcfg.check(cfg, M.current_cfg)
- local new_cfg = table.copy(cfg)
- if cfg.weights or cfg.zone then
+ local vshard_cfg, box_cfg = lcfg.check(cfg, M.current_cfg)
+ if vshard_cfg.weights or vshard_cfg.zone then
error('Weights and zone are not allowed for storage configuration')
end
if M.replicasets then
@@ -1520,7 +1524,7 @@ local function storage_cfg(cfg, this_replica_uuid)
local this_replicaset
local this_replica
- local new_replicasets = lreplicaset.buildall(cfg)
+ local new_replicasets = lreplicaset.buildall(vshard_cfg)
local min_master
for rs_uuid, rs in pairs(new_replicasets) do
for replica_uuid, replica in pairs(rs.replicas) do
@@ -1553,18 +1557,19 @@ local function storage_cfg(cfg, this_replica_uuid)
--
-- If a master role of the replica is not changed, then
-- 'read_only' can be set right here.
- cfg.listen = cfg.listen or this_replica.uri
- if cfg.replication == nil and this_replicaset.master and not is_master then
- cfg.replication = {this_replicaset.master.uri}
+ box_cfg.listen = box_cfg.listen or this_replica.uri
+ if box_cfg.replication == nil and this_replicaset.master
+ and not is_master then
+ box_cfg.replication = {this_replicaset.master.uri}
else
- cfg.replication = {}
+ box_cfg.replication = {}
end
if was_master == is_master then
- cfg.read_only = not is_master
+ box_cfg.read_only = not is_master
end
if type(box.cfg) == 'function' then
- cfg.instance_uuid = this_replica.uuid
- cfg.replicaset_uuid = this_replicaset.uuid
+ box_cfg.instance_uuid = this_replica.uuid
+ box_cfg.replicaset_uuid = this_replicaset.uuid
else
local info = box.info
if this_replica_uuid ~= info.uuid then
@@ -1578,12 +1583,14 @@ local function storage_cfg(cfg, this_replica_uuid)
this_replicaset.uuid))
end
end
- local total_bucket_count = cfg.bucket_count
- local rebalancer_disbalance_threshold = cfg.rebalancer_disbalance_threshold
- local rebalancer_max_receiving = cfg.rebalancer_max_receiving
- local shard_index = cfg.shard_index
- local collect_bucket_garbage_interval = cfg.collect_bucket_garbage_interval
- local collect_lua_garbage = cfg.collect_lua_garbage
+ local total_bucket_count = vshard_cfg.bucket_count
+ local rebalancer_disbalance_threshold =
+ vshard_cfg.rebalancer_disbalance_threshold
+ local rebalancer_max_receiving = vshard_cfg.rebalancer_max_receiving
+ local shard_index = vshard_cfg.shard_index
+ local collect_bucket_garbage_interval =
+ vshard_cfg.collect_bucket_garbage_interval
+ local collect_lua_garbage = vshard_cfg.collect_lua_garbage
-- It is considered that all possible errors during cfg
-- process occur only before this place.
@@ -1598,7 +1605,7 @@ local function storage_cfg(cfg, this_replica_uuid)
-- a new sync timeout.
--
local old_sync_timeout = M.sync_timeout
- M.sync_timeout = cfg.sync_timeout
+ M.sync_timeout = vshard_cfg.sync_timeout
if was_master and not is_master then
local_on_master_disable_prepare()
@@ -1607,9 +1614,10 @@ local function storage_cfg(cfg, this_replica_uuid)
local_on_master_enable_prepare()
end
- local box_cfg = table.copy(cfg)
- lcfg.remove_non_box_options(box_cfg)
- local ok, err = pcall(box.cfg, box_cfg)
+ local ok, err = true, nil
+ if not xis_reload then
+ ok, err = pcall(box.cfg, box_cfg)
+ end
while M.errinj.ERRINJ_CFG_DELAY do
lfiber.sleep(0.01)
end
@@ -1639,7 +1647,7 @@ local function storage_cfg(cfg, this_replica_uuid)
M.shard_index = shard_index
M.collect_bucket_garbage_interval = collect_bucket_garbage_interval
M.collect_lua_garbage = collect_lua_garbage
- M.current_cfg = new_cfg
+ M.current_cfg = vshard_cfg
if was_master and not is_master then
local_on_master_disable()
@@ -1874,6 +1882,7 @@ if not rawget(_G, MODULE_INTERNALS) then
rawset(_G, MODULE_INTERNALS, M)
else
reload_evolution.upgrade(M)
+ is_reload = true
storage_cfg(M.current_cfg, M.this_replica.uuid)
M.module_version = M.module_version + 1
end
--
2.14.1
* [tarantool-patches] [PATCH 2/3] Move lua gc to a dedicated module
2018-07-31 16:25 [tarantool-patches] [PATCH 0/3] multiple routers AKhatskevich
2018-07-31 16:25 ` [tarantool-patches] [PATCH 1/3] Update only vshard part of a cfg on reload AKhatskevich
@ 2018-07-31 16:25 ` AKhatskevich
2018-08-01 18:43 ` [tarantool-patches] " Vladislav Shpilevoy
2018-07-31 16:25 ` [tarantool-patches] [PATCH 3/3] Introduce multiple routers feature AKhatskevich
` (2 subsequent siblings)
4 siblings, 1 reply; 23+ messages in thread
From: AKhatskevich @ 2018-07-31 16:25 UTC (permalink / raw)
To: v.shpilevoy, tarantool-patches
`vshard.lua_gc` is a new module which makes Lua GC work more
aggressively.
Before this commit, that was a duty of the router and storage.
Reasons to move lua gc to a separate module:
1. It is not a duty of vshard to collect garbage, so let the gc fiber
be as far from vshard as possible.
2. The next commits will introduce the multiple routers feature, which
requires the gc fiber to be a singleton.
Closes #138
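A minimal sketch of the new module's API as introduced below (internal to vshard, not user-facing; the interval value is an example):

```lua
local lua_gc = require('vshard.lua_gc')

-- Start the background fiber; it calls collectgarbage() once per
-- interval (seconds). Idempotent: a second call just wakes the fiber.
lua_gc.set_state(true, 100)

-- Stop it again; the fiber is cancelled and dropped.
lua_gc.set_state(false, 100)

-- Introspection hooks used by the tests:
-- lua_gc.internal.bg_fiber, lua_gc.internal.iterations
```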
---
test/router/garbage_collector.result | 27 +++++++++++------
test/router/garbage_collector.test.lua | 18 ++++++-----
test/storage/garbage_collector.result | 27 +++++++++--------
test/storage/garbage_collector.test.lua | 22 ++++++--------
vshard/lua_gc.lua | 54 +++++++++++++++++++++++++++++++++
vshard/router/init.lua | 19 +++---------
vshard/storage/init.lua | 20 ++++--------
7 files changed, 116 insertions(+), 71 deletions(-)
create mode 100644 vshard/lua_gc.lua
diff --git a/test/router/garbage_collector.result b/test/router/garbage_collector.result
index 3c2a4f1..a7474fc 100644
--- a/test/router/garbage_collector.result
+++ b/test/router/garbage_collector.result
@@ -40,27 +40,30 @@ test_run:switch('router_1')
fiber = require('fiber')
---
...
-cfg.collect_lua_garbage = true
+lua_gc = require('vshard.lua_gc')
---
...
-iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DISCOVERY_INTERVAL
+cfg.collect_lua_garbage = true
---
...
vshard.router.cfg(cfg)
---
...
-a = setmetatable({}, {__mode = 'v'})
+lua_gc.internal.bg_fiber ~= nil
+---
+- true
+...
+iterations = lua_gc.internal.iterations
---
...
-a.k = {b = 100}
+lua_gc.internal.bg_fiber:wakeup()
---
...
-for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end
+while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end
---
...
-a.k
+lua_gc.internal.interval = 0.001
---
-- null
...
cfg.collect_lua_garbage = false
---
@@ -68,13 +71,17 @@ cfg.collect_lua_garbage = false
vshard.router.cfg(cfg)
---
...
-a.k = {b = 100}
+lua_gc.internal.bg_fiber == nil
+---
+- true
+...
+iterations = lua_gc.internal.iterations
---
...
-for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end
+fiber.sleep(0.01)
---
...
-a.k ~= nil
+iterations == lua_gc.internal.iterations
---
- true
...
diff --git a/test/router/garbage_collector.test.lua b/test/router/garbage_collector.test.lua
index b3411cd..d1da8e9 100644
--- a/test/router/garbage_collector.test.lua
+++ b/test/router/garbage_collector.test.lua
@@ -13,18 +13,20 @@ test_run:cmd("start server router_1")
--
test_run:switch('router_1')
fiber = require('fiber')
+lua_gc = require('vshard.lua_gc')
cfg.collect_lua_garbage = true
-iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DISCOVERY_INTERVAL
vshard.router.cfg(cfg)
-a = setmetatable({}, {__mode = 'v'})
-a.k = {b = 100}
-for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end
-a.k
+lua_gc.internal.bg_fiber ~= nil
+iterations = lua_gc.internal.iterations
+lua_gc.internal.bg_fiber:wakeup()
+while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end
+lua_gc.internal.interval = 0.001
cfg.collect_lua_garbage = false
vshard.router.cfg(cfg)
-a.k = {b = 100}
-for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end
-a.k ~= nil
+lua_gc.internal.bg_fiber == nil
+iterations = lua_gc.internal.iterations
+fiber.sleep(0.01)
+iterations == lua_gc.internal.iterations
test_run:switch("default")
test_run:cmd("stop server router_1")
diff --git a/test/storage/garbage_collector.result b/test/storage/garbage_collector.result
index 3588fb4..d94ba24 100644
--- a/test/storage/garbage_collector.result
+++ b/test/storage/garbage_collector.result
@@ -120,7 +120,7 @@ test_run:switch('storage_1_a')
fiber = require('fiber')
---
...
-log = require('log')
+lua_gc = require('vshard.lua_gc')
---
...
cfg.collect_lua_garbage = true
@@ -129,24 +129,21 @@ cfg.collect_lua_garbage = true
vshard.storage.cfg(cfg, names.storage_1_a)
---
...
--- Create a weak reference to a able {b = 100} - it must be
--- deleted on the next GC.
-a = setmetatable({}, {__mode = 'v'})
+lua_gc.internal.bg_fiber ~= nil
---
+- true
...
-a.k = {b = 100}
+iterations = lua_gc.internal.iterations
---
...
-iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
+lua_gc.internal.bg_fiber:wakeup()
---
...
--- Wait until Lua GC deletes a.k.
-for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
+while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end
---
...
-a.k
+lua_gc.internal.interval = 0.001
---
-- null
...
cfg.collect_lua_garbage = false
---
@@ -154,13 +151,17 @@ cfg.collect_lua_garbage = false
vshard.storage.cfg(cfg, names.storage_1_a)
---
...
-a.k = {b = 100}
+lua_gc.internal.bg_fiber == nil
+---
+- true
+...
+iterations = lua_gc.internal.iterations
---
...
-for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
+fiber.sleep(0.01)
---
...
-a.k ~= nil
+iterations == lua_gc.internal.iterations
---
- true
...
diff --git a/test/storage/garbage_collector.test.lua b/test/storage/garbage_collector.test.lua
index 79e76d8..ee3ecf4 100644
--- a/test/storage/garbage_collector.test.lua
+++ b/test/storage/garbage_collector.test.lua
@@ -46,22 +46,20 @@ customer:select{}
--
test_run:switch('storage_1_a')
fiber = require('fiber')
-log = require('log')
+lua_gc = require('vshard.lua_gc')
cfg.collect_lua_garbage = true
vshard.storage.cfg(cfg, names.storage_1_a)
--- Create a weak reference to a able {b = 100} - it must be
--- deleted on the next GC.
-a = setmetatable({}, {__mode = 'v'})
-a.k = {b = 100}
-iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
--- Wait until Lua GC deletes a.k.
-for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
-a.k
+lua_gc.internal.bg_fiber ~= nil
+iterations = lua_gc.internal.iterations
+lua_gc.internal.bg_fiber:wakeup()
+while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end
+lua_gc.internal.interval = 0.001
cfg.collect_lua_garbage = false
vshard.storage.cfg(cfg, names.storage_1_a)
-a.k = {b = 100}
-for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
-a.k ~= nil
+lua_gc.internal.bg_fiber == nil
+iterations = lua_gc.internal.iterations
+fiber.sleep(0.01)
+iterations == lua_gc.internal.iterations
test_run:switch('default')
test_run:drop_cluster(REPLICASET_2)
diff --git a/vshard/lua_gc.lua b/vshard/lua_gc.lua
new file mode 100644
index 0000000..8d6af3e
--- /dev/null
+++ b/vshard/lua_gc.lua
@@ -0,0 +1,54 @@
+--
+-- This module implements a background Lua GC fiber.
+-- Its purpose is to make GC more aggressive.
+--
+
+local lfiber = require('fiber')
+local MODULE_INTERNALS = '__module_vshard_lua_gc'
+
+local M = rawget(_G, MODULE_INTERNALS)
+if not M then
+ M = {
+ -- Background fiber.
+ bg_fiber = nil,
+ -- GC interval in seconds.
+ interval = nil,
+ -- Main loop.
+ -- Stored here to make the fiber reloadable.
+ main_loop = nil,
+ -- Number of `collectgarbage()` calls.
+ iterations = 0,
+ }
+end
+local DEFAULT_INTERVAL = 100
+
+M.main_loop = function()
+ lfiber.sleep(M.interval or DEFAULT_INTERVAL)
+ collectgarbage()
+ M.iterations = M.iterations + 1
+ return M.main_loop()
+end
+
+local function set_state(active, interval)
+ M.interval = interval
+ if active and not M.bg_fiber then
+ M.bg_fiber = lfiber.create(M.main_loop)
+ M.bg_fiber:name('vshard.lua_gc')
+ end
+ if not active and M.bg_fiber then
+ M.bg_fiber:cancel()
+ M.bg_fiber = nil
+ end
+ if active then
+ M.bg_fiber:wakeup()
+ end
+end
+
+if not rawget(_G, MODULE_INTERNALS) then
+ rawset(_G, MODULE_INTERNALS, M)
+end
+
+return {
+ set_state = set_state,
+ internal = M,
+}
diff --git a/vshard/router/init.lua b/vshard/router/init.lua
index e2b2b22..3e127cb 100644
--- a/vshard/router/init.lua
+++ b/vshard/router/init.lua
@@ -7,6 +7,7 @@ if rawget(_G, MODULE_INTERNALS) then
local vshard_modules = {
'vshard.consts', 'vshard.error', 'vshard.cfg',
'vshard.hash', 'vshard.replicaset', 'vshard.util',
+ 'vshard.lua_gc',
}
for _, module in pairs(vshard_modules) do
package.loaded[module] = nil
@@ -18,6 +19,7 @@ local lcfg = require('vshard.cfg')
local lhash = require('vshard.hash')
local lreplicaset = require('vshard.replicaset')
local util = require('vshard.util')
+local lua_gc = require('vshard.lua_gc')
local M = rawget(_G, MODULE_INTERNALS)
if not M then
@@ -43,8 +45,7 @@ if not M then
discovery_fiber = nil,
-- Bucket count stored on all replicasets.
total_bucket_count = 0,
- -- If true, then discovery fiber starts to call
- -- collectgarbage() periodically.
+ -- Boolean lua_gc state (create periodic gc task).
collect_lua_garbage = nil,
-- This counter is used to restart background fibers with
-- new reloaded code.
@@ -151,8 +152,6 @@ end
--
local function discovery_f()
local module_version = M.module_version
- local iterations_until_lua_gc =
- consts.COLLECT_LUA_GARBAGE_INTERVAL / consts.DISCOVERY_INTERVAL
while module_version == M.module_version do
while not next(M.replicasets) do
lfiber.sleep(consts.DISCOVERY_INTERVAL)
@@ -188,12 +187,6 @@ local function discovery_f()
M.route_map[bucket_id] = replicaset
end
end
- iterations_until_lua_gc = iterations_until_lua_gc - 1
- if M.collect_lua_garbage and iterations_until_lua_gc == 0 then
- iterations_until_lua_gc =
- consts.COLLECT_LUA_GARBAGE_INTERVAL / consts.DISCOVERY_INTERVAL
- collectgarbage()
- end
lfiber.sleep(consts.DISCOVERY_INTERVAL)
end
end
@@ -504,7 +497,6 @@ local function router_cfg(cfg)
end
local new_replicasets = lreplicaset.buildall(vshard_cfg)
local total_bucket_count = vshard_cfg.bucket_count
- local collect_lua_garbage = vshard_cfg.collect_lua_garbage
log.info("Calling box.cfg()...")
for k, v in pairs(box_cfg) do
log.info({[k] = v})
@@ -531,7 +523,7 @@ local function router_cfg(cfg)
vshard_cfg.connection_outdate_delay)
M.connection_outdate_delay = vshard_cfg.connection_outdate_delay
M.total_bucket_count = total_bucket_count
- M.collect_lua_garbage = collect_lua_garbage
+ M.collect_lua_garbage = vshard_cfg.collect_lua_garbage
M.current_cfg = vshard_cfg
M.replicasets = new_replicasets
-- Update existing route map in-place.
@@ -548,8 +540,7 @@ local function router_cfg(cfg)
M.discovery_fiber = util.reloadable_fiber_create(
'vshard.discovery', M, 'discovery_f')
end
- -- Destroy connections, not used in a new configuration.
- collectgarbage()
+ lua_gc.set_state(M.collect_lua_garbage, consts.COLLECT_LUA_GARBAGE_INTERVAL)
end
--------------------------------------------------------------------------------
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index 75f5df9..1e11960 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -10,7 +10,8 @@ if rawget(_G, MODULE_INTERNALS) then
local vshard_modules = {
'vshard.consts', 'vshard.error', 'vshard.cfg',
'vshard.replicaset', 'vshard.util',
- 'vshard.storage.reload_evolution'
+ 'vshard.storage.reload_evolution',
+ 'vshard.lua_gc',
}
for _, module in pairs(vshard_modules) do
package.loaded[module] = nil
@@ -21,6 +22,7 @@ local lerror = require('vshard.error')
local lcfg = require('vshard.cfg')
local lreplicaset = require('vshard.replicaset')
local util = require('vshard.util')
+local lua_gc = require('vshard.lua_gc')
local reload_evolution = require('vshard.storage.reload_evolution')
local M = rawget(_G, MODULE_INTERNALS)
@@ -75,8 +77,7 @@ if not M then
collect_bucket_garbage_fiber = nil,
-- Do buckets garbage collection once per this time.
collect_bucket_garbage_interval = nil,
- -- If true, then bucket garbage collection fiber starts to
- -- call collectgarbage() periodically.
+ -- Boolean lua_gc state (create periodic gc task).
collect_lua_garbage = nil,
-------------------- Bucket recovery ---------------------
@@ -1063,9 +1064,6 @@ function collect_garbage_f()
-- buckets_for_redirect is deleted, it gets empty_sent_buckets
-- for next deletion.
local empty_sent_buckets = {}
- local iterations_until_lua_gc =
- consts.COLLECT_LUA_GARBAGE_INTERVAL / M.collect_bucket_garbage_interval
-
while M.module_version == module_version do
-- Check if no changes in buckets configuration.
if control.bucket_generation_collected ~= control.bucket_generation then
@@ -1106,12 +1104,6 @@ function collect_garbage_f()
end
end
::continue::
- iterations_until_lua_gc = iterations_until_lua_gc - 1
- if iterations_until_lua_gc == 0 and M.collect_lua_garbage then
- iterations_until_lua_gc = consts.COLLECT_LUA_GARBAGE_INTERVAL /
- M.collect_bucket_garbage_interval
- collectgarbage()
- end
lfiber.sleep(M.collect_bucket_garbage_interval)
end
end
@@ -1590,7 +1582,6 @@ local function storage_cfg(cfg, this_replica_uuid)
local shard_index = vshard_cfg.shard_index
local collect_bucket_garbage_interval =
vshard_cfg.collect_bucket_garbage_interval
- local collect_lua_garbage = vshard_cfg.collect_lua_garbage
-- It is considered that all possible errors during cfg
-- process occur only before this place.
@@ -1646,7 +1637,7 @@ local function storage_cfg(cfg, this_replica_uuid)
M.rebalancer_max_receiving = rebalancer_max_receiving
M.shard_index = shard_index
M.collect_bucket_garbage_interval = collect_bucket_garbage_interval
- M.collect_lua_garbage = collect_lua_garbage
+ M.collect_lua_garbage = vshard_cfg.collect_lua_garbage
M.current_cfg = vshard_cfg
if was_master and not is_master then
@@ -1671,6 +1662,7 @@ local function storage_cfg(cfg, this_replica_uuid)
M.rebalancer_fiber:cancel()
M.rebalancer_fiber = nil
end
+ lua_gc.set_state(M.collect_lua_garbage, consts.COLLECT_LUA_GARBAGE_INTERVAL)
-- Destroy connections, not used in a new configuration.
collectgarbage()
end
--
2.14.1
* [tarantool-patches] [PATCH 3/3] Introduce multiple routers feature
2018-07-31 16:25 [tarantool-patches] [PATCH 0/3] multiple routers AKhatskevich
2018-07-31 16:25 ` [tarantool-patches] [PATCH 1/3] Update only vshard part of a cfg on reload AKhatskevich
2018-07-31 16:25 ` [tarantool-patches] [PATCH 2/3] Move lua gc to a dedicated module AKhatskevich
@ 2018-07-31 16:25 ` AKhatskevich
2018-08-01 18:43 ` [tarantool-patches] " Vladislav Shpilevoy
2018-08-01 14:30 ` [tarantool-patches] [PATCH] Check self arg passed for router objects AKhatskevich
2018-08-03 20:07 ` [tarantool-patches] [PATCH] Refactor config templates AKhatskevich
4 siblings, 1 reply; 23+ messages in thread
From: AKhatskevich @ 2018-07-31 16:25 UTC (permalink / raw)
To: v.shpilevoy, tarantool-patches
Key points:
* Old `vshard.router.some_method()` api is preserved.
* Add `vshard.router.new(name, cfg)` method which returns a new router.
* Each router has its own:
1. name
2. background fibers
3. attributes (route_map, replicasets, outdate_delay...)
* Module reload reloads all configured routers.
* `cfg` reconfigures a single router.
* All routers share the same box configuration. The last passed config
overrides the global config.
* Multiple router instances can be connected to the same cluster.
* For now, a router cannot be destroyed.
Extra changes:
* Add `data` parameter to `reloadable_fiber_create` function.
Closes #130
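The key points above can be sketched as follows (assuming two cluster configs `cfg_1` and `cfg_2` as in the tests; `do_select` is a test helper, not part of vshard):

```lua
-- Old static API keeps working:
vshard.router.cfg(cfg_1)
vshard.router.call(1, 'read', 'do_select', {1})

-- New instance API: an independent router to another cluster,
-- with its own route map and background fibers.
local router_2 = vshard.router.new('router_2', cfg_2)
router_2:bootstrap()
router_2:call(1, 'read', 'do_select', {2})
```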
---
test/multiple_routers/configs.lua | 81 ++++++
test/multiple_routers/multiple_routers.result | 226 ++++++++++++++++
test/multiple_routers/multiple_routers.test.lua | 85 ++++++
test/multiple_routers/router_1.lua | 15 ++
test/multiple_routers/storage_1_1_a.lua | 23 ++
test/multiple_routers/storage_1_1_b.lua | 1 +
test/multiple_routers/storage_1_2_a.lua | 1 +
test/multiple_routers/storage_1_2_b.lua | 1 +
test/multiple_routers/storage_2_1_a.lua | 1 +
test/multiple_routers/storage_2_1_b.lua | 1 +
test/multiple_routers/storage_2_2_a.lua | 1 +
test/multiple_routers/storage_2_2_b.lua | 1 +
test/multiple_routers/suite.ini | 6 +
test/multiple_routers/test.lua | 9 +
test/router/router.result | 4 +-
test/router/router.test.lua | 4 +-
vshard/router/init.lua | 341 ++++++++++++++++--------
vshard/util.lua | 12 +-
18 files changed, 690 insertions(+), 123 deletions(-)
create mode 100644 test/multiple_routers/configs.lua
create mode 100644 test/multiple_routers/multiple_routers.result
create mode 100644 test/multiple_routers/multiple_routers.test.lua
create mode 100644 test/multiple_routers/router_1.lua
create mode 100644 test/multiple_routers/storage_1_1_a.lua
create mode 120000 test/multiple_routers/storage_1_1_b.lua
create mode 120000 test/multiple_routers/storage_1_2_a.lua
create mode 120000 test/multiple_routers/storage_1_2_b.lua
create mode 120000 test/multiple_routers/storage_2_1_a.lua
create mode 120000 test/multiple_routers/storage_2_1_b.lua
create mode 120000 test/multiple_routers/storage_2_2_a.lua
create mode 120000 test/multiple_routers/storage_2_2_b.lua
create mode 100644 test/multiple_routers/suite.ini
create mode 100644 test/multiple_routers/test.lua
diff --git a/test/multiple_routers/configs.lua b/test/multiple_routers/configs.lua
new file mode 100644
index 0000000..a6ce33c
--- /dev/null
+++ b/test/multiple_routers/configs.lua
@@ -0,0 +1,81 @@
+names = {
+ storage_1_1_a = '32a2d4b8-f146-44ed-9d51-2436507efdf8',
+ storage_1_1_b = 'c1c849b1-641d-40b8-9283-bcfe73d46270',
+ storage_1_2_a = '04e677ed-c7ba-47e0-a67f-b5100cfa86af',
+ storage_1_2_b = 'c7a979ee-9263-4a38-84a5-2fb6a0a32684',
+ storage_2_1_a = '88dc03f0-23fb-4f05-b462-e29186542864',
+ storage_2_1_b = '4230b711-f5c4-4131-bf98-88cd43a16901',
+ storage_2_2_a = '6b1eefbc-1e2e-410e-84ff-44c572ea9916',
+ storage_2_2_b = 'be74419a-1e56-4ba4-97e9-6b18710f63c5',
+}
+
+rs_1_1 = 'dd208fb8-8b90-49bc-8393-6b3a99da7c52'
+rs_1_2 = 'af9cfe88-2091-4613-a877-a623776c5c0e'
+rs_2_1 = '9ca8ee15-ae18-4f31-9385-4859f89ce73f'
+rs_2_2 = '007f5f58-b654-4125-8441-a71866fb62b5'
+
+local cfg_1 = {}
+cfg_1.sharding = {
+ [rs_1_1] = {
+ replicas = {
+ [names.storage_1_1_a] = {
+ uri = 'storage:storage@127.0.0.1:3301',
+ name = 'storage_1_1_a',
+ master = true,
+ },
+ [names.storage_1_1_b] = {
+ uri = 'storage:storage@127.0.0.1:3302',
+ name = 'storage_1_1_b',
+ },
+ }
+ },
+ [rs_1_2] = {
+ replicas = {
+ [names.storage_1_2_a] = {
+ uri = 'storage:storage@127.0.0.1:3303',
+ name = 'storage_1_2_a',
+ master = true,
+ },
+ [names.storage_1_2_b] = {
+ uri = 'storage:storage@127.0.0.1:3304',
+ name = 'storage_1_2_b',
+ },
+ }
+ },
+}
+
+
+local cfg_2 = {}
+cfg_2.sharding = {
+ [rs_2_1] = {
+ replicas = {
+ [names.storage_2_1_a] = {
+ uri = 'storage:storage@127.0.0.1:3305',
+ name = 'storage_2_1_a',
+ master = true,
+ },
+ [names.storage_2_1_b] = {
+ uri = 'storage:storage@127.0.0.1:3306',
+ name = 'storage_2_1_b',
+ },
+ }
+ },
+ [rs_2_2] = {
+ replicas = {
+ [names.storage_2_2_a] = {
+ uri = 'storage:storage@127.0.0.1:3307',
+ name = 'storage_2_2_a',
+ master = true,
+ },
+ [names.storage_2_2_b] = {
+ uri = 'storage:storage@127.0.0.1:3308',
+ name = 'storage_2_2_b',
+ },
+ }
+ },
+}
+
+return {
+ cfg_1 = cfg_1,
+ cfg_2 = cfg_2,
+}
diff --git a/test/multiple_routers/multiple_routers.result b/test/multiple_routers/multiple_routers.result
new file mode 100644
index 0000000..33f4034
--- /dev/null
+++ b/test/multiple_routers/multiple_routers.result
@@ -0,0 +1,226 @@
+test_run = require('test_run').new()
+---
+...
+REPLICASET_1_1 = { 'storage_1_1_a', 'storage_1_1_b' }
+---
+...
+REPLICASET_1_2 = { 'storage_1_2_a', 'storage_1_2_b' }
+---
+...
+REPLICASET_2_1 = { 'storage_2_1_a', 'storage_2_1_b' }
+---
+...
+REPLICASET_2_2 = { 'storage_2_2_a', 'storage_2_2_b' }
+---
+...
+test_run:create_cluster(REPLICASET_1_1, 'multiple_routers')
+---
+...
+test_run:create_cluster(REPLICASET_1_2, 'multiple_routers')
+---
+...
+test_run:create_cluster(REPLICASET_2_1, 'multiple_routers')
+---
+...
+test_run:create_cluster(REPLICASET_2_2, 'multiple_routers')
+---
+...
+util = require('lua_libs.util')
+---
+...
+util.wait_master(test_run, REPLICASET_1_1, 'storage_1_1_a')
+---
+...
+util.wait_master(test_run, REPLICASET_1_2, 'storage_1_2_a')
+---
+...
+util.wait_master(test_run, REPLICASET_2_1, 'storage_2_1_a')
+---
+...
+util.wait_master(test_run, REPLICASET_2_2, 'storage_2_2_a')
+---
+...
+test_run:cmd("create server router_1 with script='multiple_routers/router_1.lua'")
+---
+- true
+...
+test_run:cmd("start server router_1")
+---
+- true
+...
+-- Configure default (static) router.
+_ = test_run:cmd("switch router_1")
+---
+...
+vshard.router.cfg(configs.cfg_1)
+---
+...
+vshard.router.bootstrap()
+---
+- true
+...
+_ = test_run:cmd("switch storage_1_2_a")
+---
+...
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+---
+...
+_ = test_run:cmd("switch router_1")
+---
+...
+vshard.router.call(1, 'write', 'do_replace', {{1, 1}})
+---
+- true
+...
+vshard.router.call(1, 'read', 'do_select', {1})
+---
+- [[1, 1]]
+...
+-- Configure extra router.
+router_2 = vshard.router.new('router_2', configs.cfg_2)
+---
+...
+router_2:bootstrap()
+---
+- true
+...
+_ = test_run:cmd("switch storage_2_2_a")
+---
+...
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+---
+...
+_ = test_run:cmd("switch router_1")
+---
+...
+router_2:call(1, 'write', 'do_replace', {{2, 2}})
+---
+- true
+...
+router_2:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+-- Check that router_2 and the static router serve different clusters.
+#router_2:call(1, 'read', 'do_select', {1}) == 0
+---
+- true
+...
+-- Create several routers to the same cluster.
+routers = {}
+---
+...
+for i = 3, 10 do routers[i] = vshard.router.new('router_' .. i, configs.cfg_2) end
+---
+...
+routers[3]:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+-- Check that they have their own background fibers.
+fiber_names = {}
+---
+...
+for i = 2, 10 do fiber_names['vshard.failover.router_' .. i] = true; fiber_names['vshard.discovery.router_' .. i] = true; end
+---
+...
+next(fiber_names) ~= nil
+---
+- true
+...
+fiber = require('fiber')
+---
+...
+for _, xfiber in pairs(fiber.info()) do fiber_names[xfiber.name] = nil end
+---
+...
+next(fiber_names) == nil
+---
+- true
+...
+-- Reconfiguring one of the routers does not affect the others.
+routers[3]:cfg(configs.cfg_1)
+---
+...
+routers[3]:call(1, 'read', 'do_select', {1})
+---
+- [[1, 1]]
+...
+#routers[3]:call(1, 'read', 'do_select', {2}) == 0
+---
+- true
+...
+#routers[4]:call(1, 'read', 'do_select', {1}) == 0
+---
+- true
+...
+routers[4]:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+routers[3]:cfg(configs.cfg_2)
+---
+...
+-- Try to create a router with the same name.
+util = require('lua_libs.util')
+---
+...
+util.check_error(vshard.router.new, 'router_2', configs.cfg_2)
+---
+- null
+- Router with name router_2 already exists
+...
+-- Reload router module.
+_, old_rs_1 = next(vshard.router.internal.static_router.replicasets)
+---
+...
+_, old_rs_2 = next(router_2.replicasets)
+---
+...
+package.loaded['vshard.router'] = nil
+---
+...
+vshard.router = require('vshard.router')
+---
+...
+while not old_rs_1.is_outdated do fiber.sleep(0.01) end
+---
+...
+while not old_rs_2.is_outdated do fiber.sleep(0.01) end
+---
+...
+vshard.router.call(1, 'read', 'do_select', {1})
+---
+- [[1, 1]]
+...
+router_2:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+routers[5]:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+_ = test_run:cmd("switch default")
+---
+...
+test_run:cmd("stop server router_1")
+---
+- true
+...
+test_run:cmd("cleanup server router_1")
+---
+- true
+...
+test_run:drop_cluster(REPLICASET_1_1)
+---
+...
+test_run:drop_cluster(REPLICASET_1_2)
+---
+...
+test_run:drop_cluster(REPLICASET_2_1)
+---
+...
+test_run:drop_cluster(REPLICASET_2_2)
+---
+...
diff --git a/test/multiple_routers/multiple_routers.test.lua b/test/multiple_routers/multiple_routers.test.lua
new file mode 100644
index 0000000..6d470e1
--- /dev/null
+++ b/test/multiple_routers/multiple_routers.test.lua
@@ -0,0 +1,85 @@
+test_run = require('test_run').new()
+
+REPLICASET_1_1 = { 'storage_1_1_a', 'storage_1_1_b' }
+REPLICASET_1_2 = { 'storage_1_2_a', 'storage_1_2_b' }
+REPLICASET_2_1 = { 'storage_2_1_a', 'storage_2_1_b' }
+REPLICASET_2_2 = { 'storage_2_2_a', 'storage_2_2_b' }
+
+test_run:create_cluster(REPLICASET_1_1, 'multiple_routers')
+test_run:create_cluster(REPLICASET_1_2, 'multiple_routers')
+test_run:create_cluster(REPLICASET_2_1, 'multiple_routers')
+test_run:create_cluster(REPLICASET_2_2, 'multiple_routers')
+util = require('lua_libs.util')
+util.wait_master(test_run, REPLICASET_1_1, 'storage_1_1_a')
+util.wait_master(test_run, REPLICASET_1_2, 'storage_1_2_a')
+util.wait_master(test_run, REPLICASET_2_1, 'storage_2_1_a')
+util.wait_master(test_run, REPLICASET_2_2, 'storage_2_2_a')
+
+test_run:cmd("create server router_1 with script='multiple_routers/router_1.lua'")
+test_run:cmd("start server router_1")
+
+-- Configure default (static) router.
+_ = test_run:cmd("switch router_1")
+vshard.router.cfg(configs.cfg_1)
+vshard.router.bootstrap()
+_ = test_run:cmd("switch storage_1_2_a")
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+_ = test_run:cmd("switch router_1")
+
+vshard.router.call(1, 'write', 'do_replace', {{1, 1}})
+vshard.router.call(1, 'read', 'do_select', {1})
+
+-- Configure extra router.
+router_2 = vshard.router.new('router_2', configs.cfg_2)
+router_2:bootstrap()
+_ = test_run:cmd("switch storage_2_2_a")
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+_ = test_run:cmd("switch router_1")
+
+router_2:call(1, 'write', 'do_replace', {{2, 2}})
+router_2:call(1, 'read', 'do_select', {2})
+-- Check that router_2 and the static router serve different clusters.
+#router_2:call(1, 'read', 'do_select', {1}) == 0
+
+-- Create several routers to the same cluster.
+routers = {}
+for i = 3, 10 do routers[i] = vshard.router.new('router_' .. i, configs.cfg_2) end
+routers[3]:call(1, 'read', 'do_select', {2})
+-- Check that they have their own background fibers.
+fiber_names = {}
+for i = 2, 10 do fiber_names['vshard.failover.router_' .. i] = true; fiber_names['vshard.discovery.router_' .. i] = true; end
+next(fiber_names) ~= nil
+fiber = require('fiber')
+for _, xfiber in pairs(fiber.info()) do fiber_names[xfiber.name] = nil end
+next(fiber_names) == nil
+
+-- Reconfiguring one of the routers does not affect the others.
+routers[3]:cfg(configs.cfg_1)
+routers[3]:call(1, 'read', 'do_select', {1})
+#routers[3]:call(1, 'read', 'do_select', {2}) == 0
+#routers[4]:call(1, 'read', 'do_select', {1}) == 0
+routers[4]:call(1, 'read', 'do_select', {2})
+routers[3]:cfg(configs.cfg_2)
+
+-- Try to create a router with the same name.
+util = require('lua_libs.util')
+util.check_error(vshard.router.new, 'router_2', configs.cfg_2)
+
+-- Reload router module.
+_, old_rs_1 = next(vshard.router.internal.static_router.replicasets)
+_, old_rs_2 = next(router_2.replicasets)
+package.loaded['vshard.router'] = nil
+vshard.router = require('vshard.router')
+while not old_rs_1.is_outdated do fiber.sleep(0.01) end
+while not old_rs_2.is_outdated do fiber.sleep(0.01) end
+vshard.router.call(1, 'read', 'do_select', {1})
+router_2:call(1, 'read', 'do_select', {2})
+routers[5]:call(1, 'read', 'do_select', {2})
+
+_ = test_run:cmd("switch default")
+test_run:cmd("stop server router_1")
+test_run:cmd("cleanup server router_1")
+test_run:drop_cluster(REPLICASET_1_1)
+test_run:drop_cluster(REPLICASET_1_2)
+test_run:drop_cluster(REPLICASET_2_1)
+test_run:drop_cluster(REPLICASET_2_2)
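The fiber-name check in the test above relies on the new per-router fiber naming scheme (`vshard.failover.<router_name>` and `vshard.discovery.<router_name>`). A standalone sketch of the same check, assuming a Tarantool instance on which the named routers have already been created:

```lua
local fiber = require('fiber')

-- Collect the names of all currently running fibers.
local running = {}
for _, f in pairs(fiber.info()) do
    running[f.name] = true
end

-- Every router must own its pair of background fibers.
for _, name in ipairs({'router_2', 'router_3'}) do
    assert(running['vshard.failover.' .. name])
    assert(running['vshard.discovery.' .. name])
end
```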
diff --git a/test/multiple_routers/router_1.lua b/test/multiple_routers/router_1.lua
new file mode 100644
index 0000000..2e9ea91
--- /dev/null
+++ b/test/multiple_routers/router_1.lua
@@ -0,0 +1,15 @@
+#!/usr/bin/env tarantool
+
+require('strict').on()
+
+-- Get instance name.
+local fio = require('fio')
+local NAME = fio.basename(arg[0], '.lua')
+
+require('console').listen(os.getenv('ADMIN'))
+
+configs = require('configs')
+
+-- Start the database with sharding
+vshard = require('vshard')
+box.cfg{}
diff --git a/test/multiple_routers/storage_1_1_a.lua b/test/multiple_routers/storage_1_1_a.lua
new file mode 100644
index 0000000..b44a97a
--- /dev/null
+++ b/test/multiple_routers/storage_1_1_a.lua
@@ -0,0 +1,23 @@
+#!/usr/bin/env tarantool
+
+require('strict').on()
+
+-- Get instance name.
+local fio = require('fio')
+NAME = fio.basename(arg[0], '.lua')
+
+require('console').listen(os.getenv('ADMIN'))
+
+-- Fetch config for the cluster of the instance.
+if NAME:sub(9,9) == '1' then
+ cfg = require('configs').cfg_1
+else
+ cfg = require('configs').cfg_2
+end
+
+-- Start the database with sharding.
+vshard = require('vshard')
+vshard.storage.cfg(cfg, names[NAME])
+
+-- Bootstrap storage.
+require('lua_libs.bootstrap')
diff --git a/test/multiple_routers/storage_1_1_b.lua b/test/multiple_routers/storage_1_1_b.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_1_1_b.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_1_2_a.lua b/test/multiple_routers/storage_1_2_a.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_1_2_a.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_1_2_b.lua b/test/multiple_routers/storage_1_2_b.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_1_2_b.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_2_1_a.lua b/test/multiple_routers/storage_2_1_a.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_2_1_a.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_2_1_b.lua b/test/multiple_routers/storage_2_1_b.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_2_1_b.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_2_2_a.lua b/test/multiple_routers/storage_2_2_a.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_2_2_a.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_2_2_b.lua b/test/multiple_routers/storage_2_2_b.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_2_2_b.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/suite.ini b/test/multiple_routers/suite.ini
new file mode 100644
index 0000000..d2d4470
--- /dev/null
+++ b/test/multiple_routers/suite.ini
@@ -0,0 +1,6 @@
+[default]
+core = tarantool
+description = Multiple routers tests
+script = test.lua
+is_parallel = False
+lua_libs = ../lua_libs configs.lua
diff --git a/test/multiple_routers/test.lua b/test/multiple_routers/test.lua
new file mode 100644
index 0000000..cb7c1ee
--- /dev/null
+++ b/test/multiple_routers/test.lua
@@ -0,0 +1,9 @@
+#!/usr/bin/env tarantool
+
+require('strict').on()
+
+box.cfg{
+ listen = os.getenv("LISTEN"),
+}
+
+require('console').listen(os.getenv('ADMIN'))
diff --git a/test/router/router.result b/test/router/router.result
index 45394e1..f123ab9 100644
--- a/test/router/router.result
+++ b/test/router/router.result
@@ -225,7 +225,7 @@ vshard.router.bootstrap()
--
-- gh-108: negative bucket count on discovery.
--
-vshard.router.internal.route_map = {}
+vshard.router.internal.static_router.route_map = {}
---
...
rets = {}
@@ -1111,7 +1111,7 @@ end;
vshard.router.cfg(cfg);
---
...
-vshard.router.internal.route_map = {};
+vshard.router.internal.static_router.route_map = {};
---
...
vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY = false;
diff --git a/test/router/router.test.lua b/test/router/router.test.lua
index df2f381..a421d0c 100644
--- a/test/router/router.test.lua
+++ b/test/router/router.test.lua
@@ -91,7 +91,7 @@ vshard.router.bootstrap()
--
-- gh-108: negative bucket count on discovery.
--
-vshard.router.internal.route_map = {}
+vshard.router.internal.static_router.route_map = {}
rets = {}
function do_echo() table.insert(rets, vshard.router.callro(1, 'echo', {1})) end
f1 = fiber.create(do_echo) f2 = fiber.create(do_echo)
@@ -423,7 +423,7 @@ while vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY ~= 'waiting' do
fiber.sleep(0.02)
end;
vshard.router.cfg(cfg);
-vshard.router.internal.route_map = {};
+vshard.router.internal.static_router.route_map = {};
vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY = false;
-- Do discovery iteration. Upload buckets from the
-- first replicaset.
diff --git a/vshard/router/init.lua b/vshard/router/init.lua
index 3e127cb..7569baf 100644
--- a/vshard/router/init.lua
+++ b/vshard/router/init.lua
@@ -25,14 +25,31 @@ local M = rawget(_G, MODULE_INTERNALS)
if not M then
M = {
---------------- Common module attributes ----------------
- -- The last passed configuration.
- current_cfg = nil,
errinj = {
ERRINJ_CFG = false,
ERRINJ_FAILOVER_CHANGE_CFG = false,
ERRINJ_RELOAD = false,
ERRINJ_LONG_DISCOVERY = false,
},
+ -- Dictionary, key is router name, value is a router.
+ routers = {},
+ -- Router object which can be accessed via the old API:
+ -- e.g. vshard.router.call(...)
+ static_router = nil,
+ -- This counter is used to restart background fibers with
+ -- new reloaded code.
+ module_version = 0,
+ }
+end
+
+--
+-- Router object attributes.
+--
+local ROUTER_TEMPLATE = {
+ -- Name of router.
+ name = nil,
+ -- The last passed configuration.
+ current_cfg = nil,
-- Time to outdate old objects on reload.
connection_outdate_delay = nil,
-- Bucket map cache.
@@ -47,38 +64,36 @@ if not M then
total_bucket_count = 0,
-- Boolean lua_gc state (create periodic gc task).
collect_lua_garbage = nil,
- -- This counter is used to restart background fibers with
- -- new reloaded code.
- module_version = 0,
- }
-end
+}
+
+local STATIC_ROUTER_NAME = 'static_router'
-- Set a bucket to a replicaset.
-local function bucket_set(bucket_id, rs_uuid)
- local replicaset = M.replicasets[rs_uuid]
+local function bucket_set(router, bucket_id, rs_uuid)
+ local replicaset = router.replicasets[rs_uuid]
-- It is technically possible to delete a replicaset at the
-- same time when route to the bucket is discovered.
if not replicaset then
return nil, lerror.vshard(lerror.code.NO_ROUTE_TO_BUCKET, bucket_id)
end
- local old_replicaset = M.route_map[bucket_id]
+ local old_replicaset = router.route_map[bucket_id]
if old_replicaset ~= replicaset then
if old_replicaset then
old_replicaset.bucket_count = old_replicaset.bucket_count - 1
end
replicaset.bucket_count = replicaset.bucket_count + 1
end
- M.route_map[bucket_id] = replicaset
+ router.route_map[bucket_id] = replicaset
return replicaset
end
-- Remove a bucket from the cache.
-local function bucket_reset(bucket_id)
- local replicaset = M.route_map[bucket_id]
+local function bucket_reset(router, bucket_id)
+ local replicaset = router.route_map[bucket_id]
if replicaset then
replicaset.bucket_count = replicaset.bucket_count - 1
end
- M.route_map[bucket_id] = nil
+ router.route_map[bucket_id] = nil
end
--------------------------------------------------------------------------------
@@ -86,8 +101,8 @@ end
--------------------------------------------------------------------------------
-- Search bucket in whole cluster
-local function bucket_discovery(bucket_id)
- local replicaset = M.route_map[bucket_id]
+local function bucket_discovery(router, bucket_id)
+ local replicaset = router.route_map[bucket_id]
if replicaset ~= nil then
return replicaset
end
@@ -95,14 +110,14 @@ local function bucket_discovery(bucket_id)
log.verbose("Discovering bucket %d", bucket_id)
local last_err = nil
local unreachable_uuid = nil
- for uuid, _ in pairs(M.replicasets) do
+ for uuid, _ in pairs(router.replicasets) do
-- Handle reload/reconfigure.
- replicaset = M.replicasets[uuid]
+ replicaset = router.replicasets[uuid]
if replicaset then
local _, err =
replicaset:callrw('vshard.storage.bucket_stat', {bucket_id})
if err == nil then
- return bucket_set(bucket_id, replicaset.uuid)
+ return bucket_set(router, bucket_id, replicaset.uuid)
elseif err.code ~= lerror.code.WRONG_BUCKET then
last_err = err
unreachable_uuid = uuid
@@ -132,14 +147,14 @@ local function bucket_discovery(bucket_id)
end
-- Resolve bucket id to replicaset uuid
-local function bucket_resolve(bucket_id)
+local function bucket_resolve(router, bucket_id)
local replicaset, err
- local replicaset = M.route_map[bucket_id]
+ local replicaset = router.route_map[bucket_id]
if replicaset ~= nil then
return replicaset
end
-- Replicaset removed from cluster, perform discovery
- replicaset, err = bucket_discovery(bucket_id)
+ replicaset, err = bucket_discovery(router, bucket_id)
if replicaset == nil then
return nil, err
end
@@ -150,14 +165,14 @@ end
-- Background fiber to perform discovery. It periodically scans
-- replicasets one by one and updates route_map.
--
-local function discovery_f()
+local function discovery_f(router)
local module_version = M.module_version
while module_version == M.module_version do
- while not next(M.replicasets) do
+ while not next(router.replicasets) do
lfiber.sleep(consts.DISCOVERY_INTERVAL)
end
- local old_replicasets = M.replicasets
- for rs_uuid, replicaset in pairs(M.replicasets) do
+ local old_replicasets = router.replicasets
+ for rs_uuid, replicaset in pairs(router.replicasets) do
local active_buckets, err =
replicaset:callro('vshard.storage.buckets_discovery', {},
{timeout = 2})
@@ -167,7 +182,7 @@ local function discovery_f()
end
-- Renew replicasets object captured by the for loop
-- in case of reconfigure and reload events.
- if M.replicasets ~= old_replicasets then
+ if router.replicasets ~= old_replicasets then
break
end
if not active_buckets then
@@ -180,11 +195,11 @@ local function discovery_f()
end
replicaset.bucket_count = #active_buckets
for _, bucket_id in pairs(active_buckets) do
- local old_rs = M.route_map[bucket_id]
+ local old_rs = router.route_map[bucket_id]
if old_rs and old_rs ~= replicaset then
old_rs.bucket_count = old_rs.bucket_count - 1
end
- M.route_map[bucket_id] = replicaset
+ router.route_map[bucket_id] = replicaset
end
end
lfiber.sleep(consts.DISCOVERY_INTERVAL)
@@ -195,9 +210,9 @@ end
--
-- Immediately wakeup discovery fiber if exists.
--
-local function discovery_wakeup()
- if M.discovery_fiber then
- M.discovery_fiber:wakeup()
+local function discovery_wakeup(router)
+ if router.discovery_fiber then
+ router.discovery_fiber:wakeup()
end
end
@@ -209,7 +224,7 @@ end
-- Function will restart operation after wrong bucket response until timeout
-- is reached
--
-local function router_call(bucket_id, mode, func, args, opts)
+local function router_call(router, bucket_id, mode, func, args, opts)
if opts and (type(opts) ~= 'table' or
(opts.timeout and type(opts.timeout) ~= 'number')) then
error('Usage: call(bucket_id, mode, func, args, opts)')
@@ -217,7 +232,7 @@ local function router_call(bucket_id, mode, func, args, opts)
local timeout = opts and opts.timeout or consts.CALL_TIMEOUT_MIN
local replicaset, err
local tend = lfiber.time() + timeout
- if bucket_id > M.total_bucket_count or bucket_id <= 0 then
+ if bucket_id > router.total_bucket_count or bucket_id <= 0 then
error('Bucket is unreachable: bucket id is out of range')
end
local call
@@ -227,7 +242,7 @@ local function router_call(bucket_id, mode, func, args, opts)
call = 'callrw'
end
repeat
- replicaset, err = bucket_resolve(bucket_id)
+ replicaset, err = bucket_resolve(router, bucket_id)
if replicaset then
::replicaset_is_found::
local storage_call_status, call_status, call_error =
@@ -243,9 +258,9 @@ local function router_call(bucket_id, mode, func, args, opts)
end
err = call_status
if err.code == lerror.code.WRONG_BUCKET then
- bucket_reset(bucket_id)
+ bucket_reset(router, bucket_id)
if err.destination then
- replicaset = M.replicasets[err.destination]
+ replicaset = router.replicasets[err.destination]
if not replicaset then
log.warn('Replicaset "%s" was not found, but received'..
' from storage as destination - please '..
@@ -257,13 +272,13 @@ local function router_call(bucket_id, mode, func, args, opts)
-- but already is executed on storages.
while lfiber.time() <= tend do
lfiber.sleep(0.05)
- replicaset = M.replicasets[err.destination]
+ replicaset = router.replicasets[err.destination]
if replicaset then
goto replicaset_is_found
end
end
else
- replicaset = bucket_set(bucket_id, replicaset.uuid)
+ replicaset = bucket_set(router, bucket_id, replicaset.uuid)
lfiber.yield()
-- Protect against infinite cycle in a
-- case of broken cluster, when a bucket
@@ -280,7 +295,7 @@ local function router_call(bucket_id, mode, func, args, opts)
-- is not timeout - these requests are repeated in
-- any case on client, if error.
assert(mode == 'write')
- bucket_reset(bucket_id)
+ bucket_reset(router, bucket_id)
return nil, err
elseif err.code == lerror.code.NON_MASTER then
-- Same, as above - do not wait and repeat.
@@ -306,12 +321,12 @@ end
--
-- Wrappers for router_call with preset mode.
--
-local function router_callro(bucket_id, ...)
- return router_call(bucket_id, 'read', ...)
+local function router_callro(router, bucket_id, ...)
+ return router_call(router, bucket_id, 'read', ...)
end
-local function router_callrw(bucket_id, ...)
- return router_call(bucket_id, 'write', ...)
+local function router_callrw(router, bucket_id, ...)
+ return router_call(router, bucket_id, 'write', ...)
end
--
@@ -319,27 +334,27 @@ end
-- @param bucket_id Bucket identifier.
-- @retval Netbox connection.
--
-local function router_route(bucket_id)
+local function router_route(router, bucket_id)
if type(bucket_id) ~= 'number' then
error('Usage: router.route(bucket_id)')
end
- return bucket_resolve(bucket_id)
+ return bucket_resolve(router, bucket_id)
end
--
-- Return map of all replicasets.
-- @retval See self.replicasets map.
--
-local function router_routeall()
- return M.replicasets
+local function router_routeall(router)
+ return router.replicasets
end
--------------------------------------------------------------------------------
-- Failover
--------------------------------------------------------------------------------
-local function failover_ping_round()
- for _, replicaset in pairs(M.replicasets) do
+local function failover_ping_round(router)
+ for _, replicaset in pairs(router.replicasets) do
local replica = replicaset.replica
if replica ~= nil and replica.conn ~= nil and
replica.down_ts == nil then
@@ -382,10 +397,10 @@ end
-- Collect UUIDs of replicasets, priority of whose replica
-- connections must be updated.
--
-local function failover_collect_to_update()
+local function failover_collect_to_update(router)
local ts = lfiber.time()
local uuid_to_update = {}
- for uuid, rs in pairs(M.replicasets) do
+ for uuid, rs in pairs(router.replicasets) do
if failover_need_down_priority(rs, ts) or
failover_need_up_priority(rs, ts) then
table.insert(uuid_to_update, uuid)
@@ -400,16 +415,16 @@ end
-- disconnected replicas.
-- @retval true A replica of an replicaset has been changed.
--
-local function failover_step()
- failover_ping_round()
- local uuid_to_update = failover_collect_to_update()
+local function failover_step(router)
+ failover_ping_round(router)
+ local uuid_to_update = failover_collect_to_update(router)
if #uuid_to_update == 0 then
return false
end
local curr_ts = lfiber.time()
local replica_is_changed = false
for _, uuid in pairs(uuid_to_update) do
- local rs = M.replicasets[uuid]
+ local rs = router.replicasets[uuid]
if M.errinj.ERRINJ_FAILOVER_CHANGE_CFG then
rs = nil
M.errinj.ERRINJ_FAILOVER_CHANGE_CFG = false
@@ -451,7 +466,7 @@ end
-- tries to reconnect to the best replica. When the connection is
-- established, it replaces the original replica.
--
-local function failover_f()
+local function failover_f(router)
local module_version = M.module_version
local min_timeout = math.min(consts.FAILOVER_UP_TIMEOUT,
consts.FAILOVER_DOWN_TIMEOUT)
@@ -461,7 +476,7 @@ local function failover_f()
local prev_was_ok = false
while module_version == M.module_version do
::continue::
- local ok, replica_is_changed = pcall(failover_step)
+ local ok, replica_is_changed = pcall(failover_step, router)
if not ok then
log.error('Error during failovering: %s',
lerror.make(replica_is_changed))
@@ -488,9 +503,14 @@ end
-- Configuration
--------------------------------------------------------------------------------
-local function router_cfg(cfg)
- local vshard_cfg, box_cfg = lcfg.check(cfg, M.current_cfg)
- if not M.replicasets then
+-- Types of configuration.
+local CFG_NEW = 'new'
+local CFG_RELOAD = 'reload'
+local CFG_RECONFIGURE = 'reconfigure'
+
+local function router_cfg(router, cfg, cfg_type)
+ local vshard_cfg, box_cfg = lcfg.check(cfg, router.current_cfg)
+ if cfg_type == CFG_NEW then
log.info('Starting router configuration')
else
log.info('Starting router reconfiguration')
@@ -512,44 +532,53 @@ local function router_cfg(cfg)
-- Move connections from an old configuration to a new one.
-- It must be done with no yields to prevent usage both of not
-- fully moved old replicasets, and not fully built new ones.
- lreplicaset.rebind_replicasets(new_replicasets, M.replicasets)
+ lreplicaset.rebind_replicasets(new_replicasets, router.replicasets)
-- Now the new replicasets are fully built. Can establish
-- connections and yield.
for _, replicaset in pairs(new_replicasets) do
replicaset:connect_all()
end
lreplicaset.wait_masters_connect(new_replicasets)
- lreplicaset.outdate_replicasets(M.replicasets,
+ lreplicaset.outdate_replicasets(router.replicasets,
vshard_cfg.connection_outdate_delay)
- M.connection_outdate_delay = vshard_cfg.connection_outdate_delay
- M.total_bucket_count = total_bucket_count
- M.collect_lua_garbage = vshard_cfg.collect_lua_garbage
- M.current_cfg = vshard_cfg
- M.replicasets = new_replicasets
+ router.connection_outdate_delay = vshard_cfg.connection_outdate_delay
+ router.total_bucket_count = total_bucket_count
+ router.collect_lua_garbage = vshard_cfg.collect_lua_garbage
+ router.current_cfg = vshard_cfg
+ router.replicasets = new_replicasets
-- Update existing route map in-place.
- local old_route_map = M.route_map
- M.route_map = {}
+ local old_route_map = router.route_map
+ router.route_map = {}
for bucket, rs in pairs(old_route_map) do
- M.route_map[bucket] = M.replicasets[rs.uuid]
+ router.route_map[bucket] = router.replicasets[rs.uuid]
end
- if M.failover_fiber == nil then
- M.failover_fiber = util.reloadable_fiber_create(
- 'vshard.failover', M, 'failover_f')
+ if router.failover_fiber == nil then
+ router.failover_fiber = util.reloadable_fiber_create(
+ 'vshard.failover.' .. router.name, M, 'failover_f', router)
end
- if M.discovery_fiber == nil then
- M.discovery_fiber = util.reloadable_fiber_create(
- 'vshard.discovery', M, 'discovery_f')
+ if router.discovery_fiber == nil then
+ router.discovery_fiber = util.reloadable_fiber_create(
+ 'vshard.discovery.' .. router.name, M, 'discovery_f', router)
end
- lua_gc.set_state(M.collect_lua_garbage, consts.COLLECT_LUA_GARBAGE_INTERVAL)
+end
+
+local function updage_lua_gc_state()
+    local collect_lua_garbage = false
+    for _, xrouter in pairs(M.routers) do
+        if xrouter.collect_lua_garbage then
+            collect_lua_garbage = true
+        end
+    end
+    lua_gc.set_state(collect_lua_garbage, consts.COLLECT_LUA_GARBAGE_INTERVAL)
--------------------------------------------------------------------------------
-- Bootstrap
--------------------------------------------------------------------------------
-local function cluster_bootstrap()
+local function cluster_bootstrap(router)
local replicasets = {}
- for uuid, replicaset in pairs(M.replicasets) do
+ for uuid, replicaset in pairs(router.replicasets) do
table.insert(replicasets, replicaset)
local count, err = replicaset:callrw('vshard.storage.buckets_count',
{})
@@ -560,9 +589,10 @@ local function cluster_bootstrap()
return nil, lerror.vshard(lerror.code.NON_EMPTY)
end
end
- lreplicaset.calculate_etalon_balance(M.replicasets, M.total_bucket_count)
+ lreplicaset.calculate_etalon_balance(router.replicasets,
+ router.total_bucket_count)
local bucket_id = 1
- for uuid, replicaset in pairs(M.replicasets) do
+ for uuid, replicaset in pairs(router.replicasets) do
if replicaset.etalon_bucket_count > 0 then
local ok, err =
replicaset:callrw('vshard.storage.bucket_force_create',
@@ -618,7 +648,7 @@ local function replicaset_instance_info(replicaset, name, alerts, errcolor,
return info, consts.STATUS.GREEN
end
-local function router_info()
+local function router_info(router)
local state = {
replicasets = {},
bucket = {
@@ -632,7 +662,7 @@ local function router_info()
}
local bucket_info = state.bucket
local known_bucket_count = 0
- for rs_uuid, replicaset in pairs(M.replicasets) do
+ for rs_uuid, replicaset in pairs(router.replicasets) do
-- Replicaset info parameters:
-- * master instance info;
-- * replica instance info;
@@ -720,7 +750,7 @@ local function router_info()
-- If a bucket is unreachable, then replicaset is
-- unreachable too and color already is red.
end
- bucket_info.unknown = M.total_bucket_count - known_bucket_count
+ bucket_info.unknown = router.total_bucket_count - known_bucket_count
if bucket_info.unknown > 0 then
state.status = math.max(state.status, consts.STATUS.YELLOW)
table.insert(state.alerts, lerror.alert(lerror.code.UNKNOWN_BUCKETS,
@@ -737,13 +767,13 @@ end
-- @param limit Maximal bucket count in output.
-- @retval Map of type {bucket_id = 'unknown'/replicaset_uuid}.
--
-local function router_buckets_info(offset, limit)
+local function router_buckets_info(router, offset, limit)
if offset ~= nil and type(offset) ~= 'number' or
limit ~= nil and type(limit) ~= 'number' then
error('Usage: buckets_info(offset, limit)')
end
offset = offset or 0
- limit = limit or M.total_bucket_count
+ limit = limit or router.total_bucket_count
local ret = {}
-- Use one string memory for all unknown buckets.
local available_rw = 'available_rw'
@@ -752,9 +782,9 @@ local function router_buckets_info(offset, limit)
local unreachable = 'unreachable'
-- Collect limit.
local first = math.max(1, offset + 1)
- local last = math.min(offset + limit, M.total_bucket_count)
+ local last = math.min(offset + limit, router.total_bucket_count)
for bucket_id = first, last do
- local rs = M.route_map[bucket_id]
+ local rs = router.route_map[bucket_id]
if rs then
if rs.master and rs.master:is_connected() then
ret[bucket_id] = {uuid = rs.uuid, status = available_rw}
@@ -774,22 +804,22 @@ end
-- Other
--------------------------------------------------------------------------------
-local function router_bucket_id(key)
+local function router_bucket_id(router, key)
if key == nil then
error("Usage: vshard.router.bucket_id(key)")
end
- return lhash.key_hash(key) % M.total_bucket_count + 1
+ return lhash.key_hash(key) % router.total_bucket_count + 1
end
-local function router_bucket_count()
- return M.total_bucket_count
+local function router_bucket_count(router)
+ return router.total_bucket_count
end
-local function router_sync(timeout)
+local function router_sync(router, timeout)
if timeout ~= nil and type(timeout) ~= 'number' then
error('Usage: vshard.router.sync([timeout: number])')
end
- for rs_uuid, replicaset in pairs(M.replicasets) do
+ for rs_uuid, replicaset in pairs(router.replicasets) do
local status, err = replicaset:callrw('vshard.storage.sync', {timeout})
if not status then
-- Add information about replicaset
@@ -803,6 +833,93 @@ if M.errinj.ERRINJ_RELOAD then
error('Error injection: reload')
end
+--------------------------------------------------------------------------------
+-- Managing router instances
+--------------------------------------------------------------------------------
+
+local function cfg_reconfigure(router, cfg)
+ return router_cfg(router, cfg, CFG_RECONFIGURE)
+end
+
+local router_mt = {
+ __index = {
+ cfg = cfg_reconfigure;
+ info = router_info;
+ buckets_info = router_buckets_info;
+ call = router_call;
+ callro = router_callro;
+ callrw = router_callrw;
+ route = router_route;
+ routeall = router_routeall;
+ bucket_id = router_bucket_id;
+ bucket_count = router_bucket_count;
+ sync = router_sync;
+ bootstrap = cluster_bootstrap;
+ bucket_discovery = bucket_discovery;
+ discovery_wakeup = discovery_wakeup;
+ }
+}
+
+-- Table which represents this module.
+local module = {}
+
+local function export_static_router_attributes()
+ -- This metatable forwards module-level calls to the static_router.
+ local module_mt = {__index = {}}
+ for method_name, method in pairs(router_mt.__index) do
+ module_mt.__index[method_name] = function(...)
+ if M.static_router then
+ return method(M.static_router, ...)
+ else
+ error('Static router is not configured')
+ end
+ end
+ end
+ setmetatable(module, module_mt)
+ -- Make static_router attributes accessible from
+ -- vshard.router.internal.
+ local M_static_router_attributes = {
+ name = true,
+ replicasets = true,
+ route_map = true,
+ total_bucket_count = true,
+ }
+ setmetatable(M, {
+ __index = function(M, key)
+ return M.static_router[key]
+ end
+ })
+end
+
+local function router_new(name, cfg)
+ assert(type(name) == 'string' and type(cfg) == 'table',
+ 'Wrong argument type. Usage: vshard.router.new(name, cfg).')
+ if M.routers[name] then
+ return nil, string.format('Router with name %s already exists', name)
+ end
+ local router = table.deepcopy(ROUTER_TEMPLATE)
+ setmetatable(router, router_mt)
+ router.name = name
+ M.routers[name] = router
+ if name == STATIC_ROUTER_NAME then
+ M.static_router = router
+ export_static_router_attributes()
+ end
+ router_cfg(router, cfg, CFG_NEW)
+ updage_lua_gc_state()
+ return router
+end
+
+local function legacy_cfg(cfg)
+ if M.static_router then
+ -- Reconfigure.
+ router_cfg(M.static_router, cfg, CFG_RECONFIGURE)
+ else
+ -- Create new static instance.
+ router_new(STATIC_ROUTER_NAME, cfg)
+ end
+end
+
--------------------------------------------------------------------------------
-- Module definition
--------------------------------------------------------------------------------
@@ -813,28 +930,24 @@ end
if not rawget(_G, MODULE_INTERNALS) then
rawset(_G, MODULE_INTERNALS, M)
else
- router_cfg(M.current_cfg)
+ for _, router in pairs(M.routers) do
+ router_cfg(router, router.current_cfg, CFG_RELOAD)
+ setmetatable(router, router_mt)
+ end
+ updage_lua_gc_state()
M.module_version = M.module_version + 1
end
M.discovery_f = discovery_f
M.failover_f = failover_f
+M.router_mt = router_mt
+if M.static_router then
+ export_static_router_attributes()
+end
-return {
- cfg = router_cfg;
- info = router_info;
- buckets_info = router_buckets_info;
- call = router_call;
- callro = router_callro;
- callrw = router_callrw;
- route = router_route;
- routeall = router_routeall;
- bucket_id = router_bucket_id;
- bucket_count = router_bucket_count;
- sync = router_sync;
- bootstrap = cluster_bootstrap;
- bucket_discovery = bucket_discovery;
- discovery_wakeup = discovery_wakeup;
- internal = M;
- module_version = function() return M.module_version end;
-}
+module.cfg = legacy_cfg
+module.new = router_new
+module.internal = M
+module.module_version = function() return M.module_version end
+
+return module
diff --git a/vshard/util.lua b/vshard/util.lua
index ea676ff..852e8a3 100644
--- a/vshard/util.lua
+++ b/vshard/util.lua
@@ -38,11 +38,11 @@ end
-- reload of that module.
-- See description of parameters in `reloadable_fiber_create`.
--
-local function reloadable_fiber_main_loop(module, func_name)
+local function reloadable_fiber_main_loop(module, func_name, data)
log.info('%s has been started', func_name)
local func = module[func_name]
::restart_loop::
- local ok, err = pcall(func)
+ local ok, err = pcall(func, data)
-- yield serves two purposes:
-- * makes this fiber cancellable
-- * prevents 100% cpu consumption
@@ -60,7 +60,7 @@ local function reloadable_fiber_main_loop(module, func_name)
log.info('module is reloaded, restarting')
-- luajit drops this frame if next function is called in
-- return statement.
- return M.reloadable_fiber_main_loop(module, func_name)
+ return M.reloadable_fiber_main_loop(module, func_name, data)
end
--
@@ -73,11 +73,13 @@ end
-- @param module Module which can be reloaded.
-- @param func_name Name of a function to be executed in the
-- module.
+-- @param data Data to be passed to the specified function.
-- @retval New fiber.
--
-local function reloadable_fiber_create(fiber_name, module, func_name)
+local function reloadable_fiber_create(fiber_name, module, func_name, data)
assert(type(fiber_name) == 'string')
- local xfiber = fiber.create(reloadable_fiber_main_loop, module, func_name)
+ local xfiber = fiber.create(reloadable_fiber_main_loop, module, func_name,
+ data)
xfiber:name(fiber_name)
return xfiber
end
--
2.14.1
* [tarantool-patches] [PATCH] Check self arg passed for router objects
2018-07-31 16:25 [tarantool-patches] [PATCH 0/3] multiple routers AKhatskevich
` (2 preceding siblings ...)
2018-07-31 16:25 ` [tarantool-patches] [PATCH 3/3] Introduce multiple routers feature AKhatskevich
@ 2018-08-01 14:30 ` AKhatskevich
2018-08-03 20:07 ` [tarantool-patches] [PATCH] Refactor config templates AKhatskevich
4 siblings, 0 replies; 23+ messages in thread
From: AKhatskevich @ 2018-08-01 14:30 UTC (permalink / raw)
To: v.shpilevoy, tarantool-patches
Raise an exception in case someone calls a router method like
`router.info()` instead of `router:info()`.
---
test/multiple_routers/multiple_routers.result | 5 +++++
test/multiple_routers/multiple_routers.test.lua | 3 +++
vshard/router/init.lua | 9 +++++++++
3 files changed, 17 insertions(+)
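The patch relies on `util.generate_self_checker`, which is not shown in this diff. A minimal sketch of what such a wrapper could look like (the error format is assumed from the test output below; the helper's real signature may differ):

```lua
-- Hypothetical sketch of a self-arg checker similar to
-- util.generate_self_checker. It wraps a method and raises a readable
-- error when the method is called with '.' instead of ':'.
local function generate_self_checker(obj_name, func_name, mt, func)
    return function(self, ...)
        -- A method called via ':' receives the object itself, whose
        -- metatable is the one the method was registered in.
        if getmetatable(self) ~= mt then
            local fmt = 'Use %s:%s(...) instead of %s.%s(...)'
            error(string.format(fmt, obj_name, func_name, obj_name,
                                func_name))
        end
        return func(self, ...)
    end
end

-- Minimal usage demonstration with a toy object.
local mt = {__index = {}}
mt.__index.info = function(self) return 'ok' end
mt.__index.info = generate_self_checker('router', 'info', mt,
                                        mt.__index.info)
local router = setmetatable({}, mt)
-- router:info() works; router.info() raises an error.
```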
diff --git a/test/multiple_routers/multiple_routers.result b/test/multiple_routers/multiple_routers.result
index 33f4034..389bf9a 100644
--- a/test/multiple_routers/multiple_routers.result
+++ b/test/multiple_routers/multiple_routers.result
@@ -201,6 +201,11 @@ routers[5]:call(1, 'read', 'do_select', {2})
---
- [[2, 2]]
...
+-- Self checker.
+util.check_error(router_2.info)
+---
+- Use router:info(...) instead of router.info(...)
+...
_ = test_run:cmd("switch default")
---
...
diff --git a/test/multiple_routers/multiple_routers.test.lua b/test/multiple_routers/multiple_routers.test.lua
index 6d470e1..2f159c7 100644
--- a/test/multiple_routers/multiple_routers.test.lua
+++ b/test/multiple_routers/multiple_routers.test.lua
@@ -76,6 +76,9 @@ vshard.router.call(1, 'read', 'do_select', {1})
router_2:call(1, 'read', 'do_select', {2})
routers[5]:call(1, 'read', 'do_select', {2})
+-- Self checker.
+util.check_error(router_2.info)
+
_ = test_run:cmd("switch default")
test_run:cmd("stop server router_1")
test_run:cmd("cleanup server router_1")
diff --git a/vshard/router/init.lua b/vshard/router/init.lua
index 128628b..e0a39b2 100644
--- a/vshard/router/init.lua
+++ b/vshard/router/init.lua
@@ -860,6 +860,15 @@ local router_mt = {
}
}
+--
+-- Wrap self methods with a sanity checker.
+--
+local mt_index = {}
+for name, func in pairs(router_mt.__index) do
+ mt_index[name] = util.generate_self_checker("router", name, router_mt, func)
+end
+router_mt.__index = mt_index
+
-- Table which represents this module.
local module = {}
--
2.14.1
* [tarantool-patches] Re: [PATCH 1/3] Update only vshard part of a cfg on reload
2018-07-31 16:25 ` [tarantool-patches] [PATCH 1/3] Update only vshard part of a cfg on reload AKhatskevich
@ 2018-08-01 18:43 ` Vladislav Shpilevoy
2018-08-03 20:03 ` Alex Khatskevich
0 siblings, 1 reply; 23+ messages in thread
From: Vladislav Shpilevoy @ 2018-08-01 18:43 UTC (permalink / raw)
To: tarantool-patches, AKhatskevich
Thanks for the patch! See 4 comments below.
On 31/07/2018 19:25, AKhatskevich wrote:
> Box cfg could have been changed by a user and then overridden by
> an old vshard config on reload.
>
> Since that commit, box part of a config is applied only when
> it is explicitly passed to a `cfg` method.
>
> This change is important for the multiple routers feature.
> ---
> vshard/cfg.lua | 54 +++++++++++++++++++++++++------------------------
> vshard/router/init.lua | 18 ++++++++---------
> vshard/storage/init.lua | 53 ++++++++++++++++++++++++++++--------------------
> 3 files changed, 67 insertions(+), 58 deletions(-)
>
> diff --git a/vshard/cfg.lua b/vshard/cfg.lua
> index bba12cc..8282086 100644
> --- a/vshard/cfg.lua
> +++ b/vshard/cfg.lua
> @@ -230,48 +230,50 @@ local non_dynamic_options = {
> 'bucket_count', 'shard_index'
> }
>
> +--
> +-- Deepcopy a config and split it into vshard_cfg and box_cfg.
> +--
> +local function split_cfg(cfg)
> + local vshard_field_map = {}
> + for _, field in ipairs(cfg_template) do
> + vshard_field_map[field[1]] = true
> + end
1. vshard_field_map does not change ever. Why do you build it
on each cfg? Please, store it in a module local variable like
cfg_template. Or refactor cfg_template and other templates so
they would be maps with parameter name as a key - looks like
the most suitable solution.
> + local vshard_cfg = {}
> + local box_cfg = {}
> + for k, v in pairs(cfg) do
> + if vshard_field_map[k] then
> + vshard_cfg[k] = table.deepcopy(v)
> + else
> + box_cfg[k] = table.deepcopy(v)
> + end
> + end
> + return vshard_cfg, box_cfg
> +end
> +
> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
> index 102b942..75f5df9 100644
> --- a/vshard/storage/init.lua
> +++ b/vshard/storage/init.lua
> @@ -1500,13 +1500,17 @@ end
> --------------------------------------------------------------------------------
> -- Configuration
> --------------------------------------------------------------------------------
> +-- Private (not accessible by a user) reload indicator.
> +local is_reload = false
2. Please, make this variable be parameter of storage_cfg and wrap public
storage.cfg with a one-liner:
storage.cfg = function(cfg, uuid) return storage_cfg(cfg, uuid, false) end
I believe/hope you understand that passing parameters this way, via global
variables, is flawed by design.
> @@ -1553,18 +1557,19 @@ local function storage_cfg(cfg, this_replica_uuid)
> --
> -- If a master role of the replica is not changed, then
> -- 'read_only' can be set right here.
> - cfg.listen = cfg.listen or this_replica.uri
> - if cfg.replication == nil and this_replicaset.master and not is_master then
> - cfg.replication = {this_replicaset.master.uri}
> + box_cfg.listen = box_cfg.listen or this_replica.uri
> + if box_cfg.replication == nil and this_replicaset.master
> + and not is_master then
3. Broken indentation.
> + box_cfg.replication = {this_replicaset.master.uri}
> else
> - cfg.replication = {}
> + box_cfg.replication = {}
> end
> if was_master == is_master then
> - cfg.read_only = not is_master
> + box_cfg.read_only = not is_master
> end
> if type(box.cfg) == 'function' then
> - cfg.instance_uuid = this_replica.uuid
> - cfg.replicaset_uuid = this_replicaset.uuid
> + box_cfg.instance_uuid = this_replica.uuid
> + box_cfg.replicaset_uuid = this_replicaset.uuid
> else
> local info = box.info
> if this_replica_uuid ~= info.uuid then
> @@ -1607,9 +1614,10 @@ local function storage_cfg(cfg, this_replica_uuid)
> local_on_master_enable_prepare()
> end
>
> - local box_cfg = table.copy(cfg)
> - lcfg.remove_non_box_options(box_cfg)
> - local ok, err = pcall(box.cfg, box_cfg)
> + local ok, err = true, nil
> + if not xis_reload then
> + ok, err = pcall(box.cfg, box_cfg)
> + end
4. The code below (if not ok then ...) can be moved inside
'if not is_reload' together with 'local ok, err' declaration.
Please, do.
> while M.errinj.ERRINJ_CFG_DELAY do
> lfiber.sleep(0.01)
> end
* [tarantool-patches] Re: [PATCH 3/3] Introduce multiple routers feature
2018-07-31 16:25 ` [tarantool-patches] [PATCH 3/3] Introduce multiple routers feature AKhatskevich
@ 2018-08-01 18:43 ` Vladislav Shpilevoy
2018-08-03 20:05 ` Alex Khatskevich
0 siblings, 1 reply; 23+ messages in thread
From: Vladislav Shpilevoy @ 2018-08-01 18:43 UTC (permalink / raw)
To: tarantool-patches, AKhatskevich
Thanks for the patch! See 10 comments below.
On 31/07/2018 19:25, AKhatskevich wrote:
> Key points:
> * Old `vshard.router.some_method()` api is preserved.
> * Add `vshard.router.new(name, cfg)` method which returns a new router.
> * Each router has its own:
> 1. name
> 2. background fibers
> 3. attributes (route_map, replicasets, outdate_delay...)
> * Module reload reloads all configured routers.
> * `cfg` reconfigures a single router.
> * All routers share the same box configuration. The last passed config
> overrides the global config.
> * Multiple router instances can be connected to the same cluster.
> * By now, a router cannot be destroyed.
>
> Extra changes:
> * Add `data` parameter to `reloadable_fiber_create` function.
>
> Closes #130
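Taken together, the key points above imply usage roughly like the following (a hypothetical sketch: `cfg_cluster_1`, `cfg_cluster_2`, and `key` are placeholders, and the behavior is assumed from the commit message, not verified):

```lua
-- Hypothetical usage sketch of the multiple routers API.
local vshard = require('vshard')

-- The old static API keeps working and configures the static router.
vshard.router.cfg(cfg_cluster_1)

-- An additional router, possibly connected to another cluster.
local router_2 = vshard.router.new('router_2', cfg_cluster_2)

-- Each instance has its own route_map, replicasets, and fibers.
local bucket = router_2:bucket_id(key)
router_2:call(bucket, 'read', 'do_select', {key})

-- `cfg` reconfigures a single router instance.
router_2:cfg(cfg_cluster_2)
```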
> ---
> diff --git a/test/multiple_routers/multiple_routers.result b/test/multiple_routers/multiple_routers.result
> new file mode 100644
> index 0000000..33f4034
> --- /dev/null
> +++ b/test/multiple_routers/multiple_routers.result
> @@ -0,0 +1,226 @@
> +-- Reconfiguring one of the routers does not affect the others.
> +routers[3]:cfg(configs.cfg_1)
1. You did not change configs.cfg_1, so it is not actually a
reconfiguration. Please, change something to check that the
parameter affects one router and does not affect the others.
2. Please, add a test on an ability to get the static router
into a variable and use it like others. It should be possible
to hide distinctions between static and other routers.
Like this:
r1 = vshard.router.static
r2 = vshard.router.new(...)
do_something_with_router(r1)
do_something_with_router(r2)
Here do_something_with_router() is unaware of whether the
router is static or not.
> diff --git a/vshard/router/init.lua b/vshard/router/init.lua
> index 3e127cb..7569baf 100644
> --- a/vshard/router/init.lua
> +++ b/vshard/router/init.lua
> @@ -257,13 +272,13 @@ local function router_call(bucket_id, mode, func, args, opts)
> -- but already is executed on storages.
> while lfiber.time() <= tend do
> lfiber.sleep(0.05)
> - replicaset = M.replicasets[err.destination]
> + replicaset = router.replicasets[err.destination]
> if replicaset then
> goto replicaset_is_found
> end
> end
> else
> - replicaset = bucket_set(bucket_id, replicaset.uuid)
> + replicaset = bucket_set(router, bucket_id, replicaset.uuid)
3. Out of 80 symbols.
> lfiber.yield()
> -- Protect against infinite cycle in a
> -- case of broken cluster, when a bucket
> @@ -488,9 +503,14 @@ end
> -- Configuration
> --------------------------------------------------------------------------------
>
> -local function router_cfg(cfg)
> - local vshard_cfg, box_cfg = lcfg.check(cfg, M.current_cfg)
> - if not M.replicasets then
> +-- Types of configuration.
> +CFG_NEW = 'new'
> +CFG_RELOAD = 'reload'
> +CFG_RECONFIGURE = 'reconfigure'
4. Last two values are never used in router_cfg(). The first
is used for logging only and can be checked as it was before
with no explicit passing.
> +
> +local function router_cfg(router, cfg, cfg_type)
> + local vshard_cfg, box_cfg = lcfg.check(cfg, router.current_cfg)
> + if cfg_type == CFG_NEW then
> log.info('Starting router configuration')
> else
> log.info('Starting router reconfiguration')
> @@ -512,44 +532,53 @@ local function router_cfg(cfg)
> +
> +local function updage_lua_gc_state()
5. This function is not needed actually.
On router_new() the only thing that can change is start of
the gc fiber if the new router has the flag and the gc is
not started now. It can be checked by a simple 'if' with
no full-scan of all routers.
On reload it is not possible to change configuration, so
the gc state can not be changed and does not need an update.
Even if it could be changed, you already iterate over routers
on reload to call router_cfg and can collect their flags
along side.
The next point is that it is not possible now to manage the
gc via simple :cfg() call. You do nothing with gc when
router_cfg is called directly. And that produces a question -
why do your tests pass if so?
The possible solution - keep a counter of set lua gc flags
overall routers in M. On each cfg you update the counter
if the value is changed. If it was 0 and become > 0, then
you start gc. If it was > 0 and become 0, then you stop gc.
No routers iteration at all.
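The counter scheme proposed here could be sketched as follows (a hypothetical helper, not the patch's code; `lua_gc.set_state` is assumed from patch 2/3 and can be mocked in isolation):

```lua
-- Sketch: reference-count the collect_lua_garbage flag across routers
-- so the gc fiber is started/stopped without iterating all routers.
-- M.collect_lua_garbage_cnt would live in the module internals.
local function update_lua_gc_refcount(M, lua_gc, interval, old_flag,
                                      new_flag)
    if old_flag == new_flag then
        return
    end
    local cnt = M.collect_lua_garbage_cnt or 0
    cnt = cnt + (new_flag and 1 or -1)
    assert(cnt >= 0)
    M.collect_lua_garbage_cnt = cnt
    -- Transition 0 -> 1 starts the gc fiber; 1 -> 0 stops it.
    lua_gc.set_state(cnt > 0, interval)
end
```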
> + local lua_gc = false
> + for _, xrouter in pairs(M.routers) do
> + if xrouter.collect_lua_garbage then
> + lua_gc = true
> + end
> + end
> + lua_gc.set_state(lua_gc, consts.COLLECT_LUA_GARBAGE_INTERVAL)
> end
>
> @@ -803,6 +833,93 @@ if M.errinj.ERRINJ_RELOAD then
> error('Error injection: reload')
> end
>
> +--------------------------------------------------------------------------------
> +-- Managing router instances
> +--------------------------------------------------------------------------------
> +
> +local function cfg_reconfigure(router, cfg)
> + return router_cfg(router, cfg, CFG_RECONFIGURE)
> +end
> +
> +local router_mt = {
> + __index = {
> + cfg = cfg_reconfigure;
> + info = router_info;
> + buckets_info = router_buckets_info;
> + call = router_call;
> + callro = router_callro;
> + callrw = router_callrw;
> + route = router_route;
> + routeall = router_routeall;
> + bucket_id = router_bucket_id;
> + bucket_count = router_bucket_count;
> + sync = router_sync;
> + bootstrap = cluster_bootstrap;
> + bucket_discovery = bucket_discovery;
> + discovery_wakeup = discovery_wakeup;
> + }
> +}
> +
> +-- Table which represents this module.
> +local module = {}
> +
> +local function export_static_router_attributes()
> + -- This metatable forwards module-level calls to the static_router.
> + local module_mt = {__index = {}}
> + for method_name, method in pairs(router_mt.__index) do
> + module_mt.__index[method_name] = function(...)
> + if M.static_router then
> + return method(M.static_router, ...)
> + else
> + error('Static router is not configured')
6. This should not be a check performed on every call. You should
initialize the static router metatable with error stubs only.
On the first cfg you reset the metatable to always use
regular methods. But anyway this code is unreachable. See
below in the comment 10 why it is so.
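What comment 6 proposes could look roughly like this (a hypothetical sketch; the names are illustrative, not vshard's actual code):

```lua
-- Sketch: start with a metatable whose methods only raise, and swap
-- in the real forwarding metatable on the first cfg.
local module = {}
local methods = {info = function(router) return router.name end}

local error_mt = {__index = function(_, _)
    return function() error('Static router is not configured') end
end}
setmetatable(module, error_mt)

local function on_first_cfg(static_router)
    local mt = {__index = {}}
    for name, method in pairs(methods) do
        -- No per-call configuration check is needed any more.
        mt.__index[name] = function(...)
            return method(static_router, ...)
        end
    end
    setmetatable(module, mt)
end
```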
> + end
> + end
> + end
> + setmetatable(module, module_mt)
> + -- Make static_router attributes accessible from
> + -- vshard.router.internal.
> + local M_static_router_attributes = {
> + name = true,
> + replicasets = true,
> + route_map = true,
> + total_bucket_count = true,
> + }
7. I saw in the tests that you are using vshard.router.internal.static_router
instead. Please, remove M_static_router_attributes then.
> + setmetatable(M, {
> + __index = function(M, key)
> + return M.static_router[key]
> + end
> + })
> +end
> +
> +local function router_new(name, cfg)
> + assert(type(name) == 'string' and type(cfg) == 'table',
> + 'Wrong argument type. Usage: vshard.router.new(name, cfg).')
8. As I said before, do not use assertions for usage checks in public
API. Use 'if wrong_usage then error(...) end'.
> + if M.routers[name] then
> + return nil, string.format('Router with name %s already exists', name)
> + end
> + local router = table.deepcopy(ROUTER_TEMPLATE)
> + setmetatable(router, router_mt)
> + router.name = name
> + M.routers[name] = router
> + if name == STATIC_ROUTER_NAME then
> + M.static_router = router
> + export_static_router_attributes()
> + end
9. This check can be removed if you move
export_static_router_attributes call into legacy_cfg.
10. Looks like all your effort in
export_static_router_attributes() to raise an error on a non-configured
router makes no sense, since until cfg is called vshard.router
has no methods except cfg and new.
> + router_cfg(router, cfg, CFG_NEW)
> + updage_lua_gc_state()
> + return router
> +end
> +
* [tarantool-patches] Re: [PATCH 2/3] Move lua gc to a dedicated module
2018-07-31 16:25 ` [tarantool-patches] [PATCH 2/3] Move lua gc to a dedicated module AKhatskevich
@ 2018-08-01 18:43 ` Vladislav Shpilevoy
2018-08-03 20:04 ` Alex Khatskevich
0 siblings, 1 reply; 23+ messages in thread
From: Vladislav Shpilevoy @ 2018-08-01 18:43 UTC (permalink / raw)
To: tarantool-patches, AKhatskevich
Thanks for the patch! See 4 comments below.
On 31/07/2018 19:25, AKhatskevich wrote:
> `vshard.lua_gc.lua` is a new module which makes gc work more
> intensively.
> Before this commit that was a duty of the router and storage.
>
> Reasons to move lua gc to a separate module:
> 1. It is not a duty of vshard to collect garbage, so let gc fiber
> be as far from vshard as possible.
> 2. Next commits will introduce the multiple routers feature, which
> requires the gc fiber to be a singleton.
>
> Closes #138
> ---
> test/router/garbage_collector.result | 27 +++++++++++------
> test/router/garbage_collector.test.lua | 18 ++++++-----
> test/storage/garbage_collector.result | 27 +++++++++--------
> test/storage/garbage_collector.test.lua | 22 ++++++--------
> vshard/lua_gc.lua | 54 +++++++++++++++++++++++++++++++++
> vshard/router/init.lua | 19 +++---------
> vshard/storage/init.lua | 20 ++++--------
> 7 files changed, 116 insertions(+), 71 deletions(-)
> create mode 100644 vshard/lua_gc.lua
>
> diff --git a/test/router/garbage_collector.result b/test/router/garbage_collector.result
> index 3c2a4f1..a7474fc 100644
> --- a/test/router/garbage_collector.result
> +++ b/test/router/garbage_collector.result
> @@ -40,27 +40,30 @@ test_run:switch('router_1')
> fiber = require('fiber')
> ---
> ...
> -cfg.collect_lua_garbage = true
> +lua_gc = require('vshard.lua_gc')
> ---
> ...
> -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DISCOVERY_INTERVAL
> +cfg.collect_lua_garbage = true
1. Now this code tests nothing but the fiber itself. Below you do a wakeup
and check that the iteration counter is increased, but that is an obvious
thing. Before your patch the test really checked that GC is called
by testing for nullified weak references. Now I can remove collectgarbage()
from the main_loop and nothing would change. Please, make this test
be a real test.
Moreover, the test hangs forever both locally and on Travis.
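The weak-reference technique referred to here can be shown in isolation (a standalone sketch, not the actual test file):

```lua
-- Sketch: verify that a gc main loop really calls collectgarbage()
-- by watching an otherwise unreferenced object through a weak table.
local weak = setmetatable({}, {__mode = 'v'})
weak.obj = {}

-- Stand-in for one iteration of the gc fiber's main loop.
local function gc_iteration()
    collectgarbage('collect')
end

gc_iteration()
-- The weak reference must be nullified once the gc has actually run;
-- merely waking the fiber and bumping a counter would not clear it.
```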
> diff --git a/test/storage/garbage_collector.result b/test/storage/garbage_collector.result
> index 3588fb4..d94ba24 100644
> --- a/test/storage/garbage_collector.result
> +++ b/test/storage/garbage_collector.result
2. Same. Now the test passes even if I removed collectgarbage() from
the main loop.
> diff --git a/vshard/lua_gc.lua b/vshard/lua_gc.lua
> new file mode 100644
> index 0000000..8d6af3e
> --- /dev/null
> +++ b/vshard/lua_gc.lua
> @@ -0,0 +1,54 @@
> +--
> +-- This module implements a background Lua GC fiber.
> +-- Its purpose is to make GC more aggressive.
> +--
> +
> +local lfiber = require('fiber')
> +local MODULE_INTERNALS = '__module_vshard_lua_gc'
> +
> +local M = rawget(_G, MODULE_INTERNALS)
> +if not M then
> + M = {
> + -- Background fiber.
> + bg_fiber = nil,
> + -- GC interval in seconds.
> + interval = nil,
> + -- Main loop.
> + -- Stored here to make the fiber reloadable.
> + main_loop = nil,
> + -- Number of `collectgarbage()` calls.
> + iterations = 0,
> + }
> +end
> +local DEFAULT_INTERVAL = 100
3. For constants please use vshard.consts.
4. You should not choose the interval inside the main_loop.
Please, use the 'default' option in cfg.lua.
* [tarantool-patches] Re: [PATCH 1/3] Update only vshard part of a cfg on reload
2018-08-01 18:43 ` [tarantool-patches] " Vladislav Shpilevoy
@ 2018-08-03 20:03 ` Alex Khatskevich
2018-08-06 17:03 ` Vladislav Shpilevoy
0 siblings, 1 reply; 23+ messages in thread
From: Alex Khatskevich @ 2018-08-03 20:03 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches
On 01.08.2018 21:43, Vladislav Shpilevoy wrote:
> Thanks for the patch! See 4 comments below.
>
> On 31/07/2018 19:25, AKhatskevich wrote:
>> Box cfg could have been changed by a user and then overridden by
>> an old vshard config on reload.
>>
>> Since that commit, box part of a config is applied only when
>> it is explicitly passed to a `cfg` method.
>>
>> This change is important for the multiple routers feature.
>> ---
>> vshard/cfg.lua | 54
>> +++++++++++++++++++++++++------------------------
>> vshard/router/init.lua | 18 ++++++++---------
>> vshard/storage/init.lua | 53
>> ++++++++++++++++++++++++++++--------------------
>> 3 files changed, 67 insertions(+), 58 deletions(-)
>>
>> diff --git a/vshard/cfg.lua b/vshard/cfg.lua
>> index bba12cc..8282086 100644
>> --- a/vshard/cfg.lua
>> +++ b/vshard/cfg.lua
>> @@ -230,48 +230,50 @@ local non_dynamic_options = {
>> 'bucket_count', 'shard_index'
>> }
>> +--
>> +-- Deepcopy a config and split it into vshard_cfg and box_cfg.
>> +--
>> +local function split_cfg(cfg)
>> + local vshard_field_map = {}
>> + for _, field in ipairs(cfg_template) do
>> + vshard_field_map[field[1]] = true
>> + end
>
> 1. vshard_field_map does not change ever. Why do you build it
> on each cfg? Please, store it in a module local variable like
> cfg_template. Or refactor cfg_template and other templates so
> they would be maps with parameter name as a key - looks like
> the most suitable solution.
Refactored cfg_template (added an extra commit before this one).
>
>> + local vshard_cfg = {}
>> + local box_cfg = {}
>> + for k, v in pairs(cfg) do
>> + if vshard_field_map[k] then
>> + vshard_cfg[k] = table.deepcopy(v)
>> + else
>> + box_cfg[k] = table.deepcopy(v)
>> + end
>> + end
>> + return vshard_cfg, box_cfg
>> +end
>> +
>> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
>> index 102b942..75f5df9 100644
>> --- a/vshard/storage/init.lua
>> +++ b/vshard/storage/init.lua
>> @@ -1500,13 +1500,17 @@ end
>> --------------------------------------------------------------------------------
>> -- Configuration
>> --------------------------------------------------------------------------------
>> +-- Private (not accessible by a user) reload indicator.
>> +local is_reload = false
>
> 2. Please, make this variable be parameter of storage_cfg and wrap public
> storage.cfg with a one-liner:
>
> storage.cfg = function(cfg, uuid) return storage_cfg(cfg, uuid,
> false) end
>
> I believe/hope you understand that passing parameters this way, via
> global variables, is flawed by design.
Nice idea.
Fixed.
>
>> @@ -1553,18 +1557,19 @@ local function storage_cfg(cfg,
>> this_replica_uuid)
>> --
>> -- If a master role of the replica is not changed, then
>> -- 'read_only' can be set right here.
>> - cfg.listen = cfg.listen or this_replica.uri
>> - if cfg.replication == nil and this_replicaset.master and not
>> is_master then
>> - cfg.replication = {this_replicaset.master.uri}
>> + box_cfg.listen = box_cfg.listen or this_replica.uri
>> + if box_cfg.replication == nil and this_replicaset.master
>> + and not is_master then
>
> 3. Broken indentation.
fixed
>
>> + box_cfg.replication = {this_replicaset.master.uri}
>> else
>> - cfg.replication = {}
>> + box_cfg.replication = {}
>> end
>> if was_master == is_master then
>> - cfg.read_only = not is_master
>> + box_cfg.read_only = not is_master
>> end
>> if type(box.cfg) == 'function' then
>> - cfg.instance_uuid = this_replica.uuid
>> - cfg.replicaset_uuid = this_replicaset.uuid
>> + box_cfg.instance_uuid = this_replica.uuid
>> + box_cfg.replicaset_uuid = this_replicaset.uuid
>> else
>> local info = box.info
>> if this_replica_uuid ~= info.uuid then
>> @@ -1607,9 +1614,10 @@ local function storage_cfg(cfg,
>> this_replica_uuid)
>> local_on_master_enable_prepare()
>> end
>> - local box_cfg = table.copy(cfg)
>> - lcfg.remove_non_box_options(box_cfg)
>> - local ok, err = pcall(box.cfg, box_cfg)
>> + local ok, err = true, nil
>> + if not xis_reload then
>> + ok, err = pcall(box.cfg, box_cfg)
>> + end
>
> 4. The code below (if not ok then ...) can be moved inside
> 'if not is_reload' together with 'local ok, err' declaration.
> Please, do.
>
>> while M.errinj.ERRINJ_CFG_DELAY do
>> lfiber.sleep(0.01)
>> end
Done.
Full diff
commit 81cb60df74fbacae3aee1817f1ff16e7fe0af72f
Author: AKhatskevich <avkhatskevich@tarantool.org>
Date: Mon Jul 23 16:42:22 2018 +0300
Update only vshard part of a cfg on reload
Box cfg could have been changed by a user and then overridden by
an old vshard config on reload.
Since that commit, box part of a config is applied only when
it is explicitly passed to a `cfg` method.
This change is important for the multiple routers feature.
diff --git a/vshard/cfg.lua b/vshard/cfg.lua
index 7c9ab77..80ea432 100644
--- a/vshard/cfg.lua
+++ b/vshard/cfg.lua
@@ -221,6 +221,22 @@ local cfg_template = {
},
}
+--
+-- Deepcopy a config and split it into vshard_cfg and box_cfg.
+--
+local function split_cfg(cfg)
+ local vshard_cfg = {}
+ local box_cfg = {}
+ for k, v in pairs(cfg) do
+ if cfg_template[k] then
+ vshard_cfg[k] = table.deepcopy(v)
+ else
+ box_cfg[k] = table.deepcopy(v)
+ end
+ end
+ return vshard_cfg, box_cfg
+end
+
--
-- Names of options which cannot be changed during reconfigure.
--
@@ -232,44 +248,26 @@ local non_dynamic_options = {
-- Check sharding config on correctness. Check types, name and uri
-- uniqueness, master count (in each replicaset must be <= 1).
--
-local function cfg_check(shard_cfg, old_cfg)
- if type(shard_cfg) ~= 'table' then
+local function cfg_check(cfg, old_vshard_cfg)
+ if type(cfg) ~= 'table' then
error('Сonfig must be map of options')
end
- shard_cfg = table.deepcopy(shard_cfg)
- validate_config(shard_cfg, cfg_template)
- if not old_cfg then
- return shard_cfg
+ local vshard_cfg, box_cfg = split_cfg(cfg)
+ validate_config(vshard_cfg, cfg_template)
+ if not old_vshard_cfg then
+ return vshard_cfg, box_cfg
end
-- Check non-dynamic after default values are added.
for _, f_name in pairs(non_dynamic_options) do
-- New option may be added in new vshard version.
- if shard_cfg[f_name] ~= old_cfg[f_name] then
+ if vshard_cfg[f_name] ~= old_vshard_cfg[f_name] then
error(string.format('Non-dynamic option %s ' ..
'cannot be reconfigured', f_name))
end
end
- return shard_cfg
-end
-
---
--- Nullify non-box options.
---
-local function remove_non_box_options(cfg)
- cfg.sharding = nil
- cfg.weights = nil
- cfg.zone = nil
- cfg.bucket_count = nil
- cfg.rebalancer_disbalance_threshold = nil
- cfg.rebalancer_max_receiving = nil
- cfg.shard_index = nil
- cfg.collect_bucket_garbage_interval = nil
- cfg.collect_lua_garbage = nil
- cfg.sync_timeout = nil
- cfg.connection_outdate_delay = nil
+ return vshard_cfg, box_cfg
end
return {
check = cfg_check,
- remove_non_box_options = remove_non_box_options,
}
diff --git a/vshard/router/init.lua b/vshard/router/init.lua
index 4cb19fd..e2b2b22 100644
--- a/vshard/router/init.lua
+++ b/vshard/router/init.lua
@@ -496,18 +496,15 @@ end
--------------------------------------------------------------------------------
local function router_cfg(cfg)
- cfg = lcfg.check(cfg, M.current_cfg)
- local new_cfg = table.copy(cfg)
+ local vshard_cfg, box_cfg = lcfg.check(cfg, M.current_cfg)
if not M.replicasets then
log.info('Starting router configuration')
else
log.info('Starting router reconfiguration')
end
- local new_replicasets = lreplicaset.buildall(cfg)
- local total_bucket_count = cfg.bucket_count
- local collect_lua_garbage = cfg.collect_lua_garbage
- local box_cfg = table.copy(cfg)
- lcfg.remove_non_box_options(box_cfg)
+ local new_replicasets = lreplicaset.buildall(vshard_cfg)
+ local total_bucket_count = vshard_cfg.bucket_count
+ local collect_lua_garbage = vshard_cfg.collect_lua_garbage
log.info("Calling box.cfg()...")
for k, v in pairs(box_cfg) do
log.info({[k] = v})
@@ -530,11 +527,12 @@ local function router_cfg(cfg)
replicaset:connect_all()
end
lreplicaset.wait_masters_connect(new_replicasets)
- lreplicaset.outdate_replicasets(M.replicasets, cfg.connection_outdate_delay)
- M.connection_outdate_delay = cfg.connection_outdate_delay
+ lreplicaset.outdate_replicasets(M.replicasets,
+ vshard_cfg.connection_outdate_delay)
+ M.connection_outdate_delay = vshard_cfg.connection_outdate_delay
M.total_bucket_count = total_bucket_count
M.collect_lua_garbage = collect_lua_garbage
- M.current_cfg = cfg
+ M.current_cfg = vshard_cfg
M.replicasets = new_replicasets
-- Update existing route map in-place.
local old_route_map = M.route_map
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index 102b942..40216ea 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -1500,13 +1500,13 @@ end
--------------------------------------------------------------------------------
-- Configuration
--------------------------------------------------------------------------------
-local function storage_cfg(cfg, this_replica_uuid)
+
+local function storage_cfg(cfg, this_replica_uuid, is_reload)
if this_replica_uuid == nil then
error('Usage: cfg(configuration, this_replica_uuid)')
end
- cfg = lcfg.check(cfg, M.current_cfg)
- local new_cfg = table.copy(cfg)
- if cfg.weights or cfg.zone then
+ local vshard_cfg, box_cfg = lcfg.check(cfg, M.current_cfg)
+ if vshard_cfg.weights or vshard_cfg.zone then
error('Weights and zone are not allowed for storage configuration')
end
if M.replicasets then
@@ -1520,7 +1520,7 @@ local function storage_cfg(cfg, this_replica_uuid)
local this_replicaset
local this_replica
- local new_replicasets = lreplicaset.buildall(cfg)
+ local new_replicasets = lreplicaset.buildall(vshard_cfg)
local min_master
for rs_uuid, rs in pairs(new_replicasets) do
for replica_uuid, replica in pairs(rs.replicas) do
@@ -1553,18 +1553,19 @@ local function storage_cfg(cfg, this_replica_uuid)
--
-- If a master role of the replica is not changed, then
-- 'read_only' can be set right here.
- cfg.listen = cfg.listen or this_replica.uri
- if cfg.replication == nil and this_replicaset.master and not is_master then
- cfg.replication = {this_replicaset.master.uri}
+ box_cfg.listen = box_cfg.listen or this_replica.uri
+ if box_cfg.replication == nil and this_replicaset.master
+ and not is_master then
+ box_cfg.replication = {this_replicaset.master.uri}
else
- cfg.replication = {}
+ box_cfg.replication = {}
end
if was_master == is_master then
- cfg.read_only = not is_master
+ box_cfg.read_only = not is_master
end
if type(box.cfg) == 'function' then
- cfg.instance_uuid = this_replica.uuid
- cfg.replicaset_uuid = this_replicaset.uuid
+ box_cfg.instance_uuid = this_replica.uuid
+ box_cfg.replicaset_uuid = this_replicaset.uuid
else
local info = box.info
if this_replica_uuid ~= info.uuid then
@@ -1578,12 +1579,14 @@ local function storage_cfg(cfg, this_replica_uuid)
this_replicaset.uuid))
end
end
- local total_bucket_count = cfg.bucket_count
- local rebalancer_disbalance_threshold = cfg.rebalancer_disbalance_threshold
- local rebalancer_max_receiving = cfg.rebalancer_max_receiving
- local shard_index = cfg.shard_index
- local collect_bucket_garbage_interval = cfg.collect_bucket_garbage_interval
- local collect_lua_garbage = cfg.collect_lua_garbage
+ local total_bucket_count = vshard_cfg.bucket_count
+ local rebalancer_disbalance_threshold =
+ vshard_cfg.rebalancer_disbalance_threshold
+ local rebalancer_max_receiving = vshard_cfg.rebalancer_max_receiving
+ local shard_index = vshard_cfg.shard_index
+ local collect_bucket_garbage_interval =
+ vshard_cfg.collect_bucket_garbage_interval
+ local collect_lua_garbage = vshard_cfg.collect_lua_garbage
-- It is considered that all possible errors during cfg
-- process occur only before this place.
@@ -1598,7 +1601,7 @@ local function storage_cfg(cfg, this_replica_uuid)
-- a new sync timeout.
--
local old_sync_timeout = M.sync_timeout
- M.sync_timeout = cfg.sync_timeout
+ M.sync_timeout = vshard_cfg.sync_timeout
if was_master and not is_master then
local_on_master_disable_prepare()
@@ -1607,27 +1610,27 @@ local function storage_cfg(cfg, this_replica_uuid)
local_on_master_enable_prepare()
end
- local box_cfg = table.copy(cfg)
- lcfg.remove_non_box_options(box_cfg)
- local ok, err = pcall(box.cfg, box_cfg)
- while M.errinj.ERRINJ_CFG_DELAY do
- lfiber.sleep(0.01)
- end
- if not ok then
- M.sync_timeout = old_sync_timeout
- if was_master and not is_master then
- local_on_master_disable_abort()
+ if not is_reload then
+ local ok, err = true, nil
+ ok, err = pcall(box.cfg, box_cfg)
+ while M.errinj.ERRINJ_CFG_DELAY do
+ lfiber.sleep(0.01)
end
- if not was_master and is_master then
- local_on_master_enable_abort()
+ if not ok then
+ M.sync_timeout = old_sync_timeout
+ if was_master and not is_master then
+ local_on_master_disable_abort()
+ end
+ if not was_master and is_master then
+ local_on_master_enable_abort()
+ end
+ error(err)
end
- error(err)
+ log.info("Box has been configured")
+ local uri = luri.parse(this_replica.uri)
+ box.once("vshard:storage:1", storage_schema_v1, uri.login, uri.password)
end
- log.info("Box has been configured")
- local uri = luri.parse(this_replica.uri)
- box.once("vshard:storage:1", storage_schema_v1, uri.login, uri.password)
-
lreplicaset.rebind_replicasets(new_replicasets, M.replicasets)
lreplicaset.outdate_replicasets(M.replicasets)
M.replicasets = new_replicasets
@@ -1639,7 +1642,7 @@ local function storage_cfg(cfg, this_replica_uuid)
M.shard_index = shard_index
M.collect_bucket_garbage_interval = collect_bucket_garbage_interval
M.collect_lua_garbage = collect_lua_garbage
- M.current_cfg = new_cfg
+ M.current_cfg = vshard_cfg
if was_master and not is_master then
local_on_master_disable()
@@ -1874,7 +1877,7 @@ if not rawget(_G, MODULE_INTERNALS) then
rawset(_G, MODULE_INTERNALS, M)
else
reload_evolution.upgrade(M)
- storage_cfg(M.current_cfg, M.this_replica.uuid)
+ storage_cfg(M.current_cfg, M.this_replica.uuid, true)
M.module_version = M.module_version + 1
end
@@ -1913,7 +1916,7 @@ return {
rebalancing_is_in_progress = rebalancing_is_in_progress,
recovery_wakeup = recovery_wakeup,
call = storage_call,
- cfg = storage_cfg,
+ cfg = function(cfg, uuid) return storage_cfg(cfg, uuid, false) end,
info = storage_info,
buckets_info = storage_buckets_info,
buckets_count = storage_buckets_count,
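The `is_reload` flag threaded through `storage_cfg()` above separates a hot code reload (the box is already configured) from a user-initiated cfg call. A sketch of the resulting contract, with illustrative names only:

```lua
-- Public API: external callers always (re)configure the box, so the
-- wrapper in the return table fixes is_reload to false.
local function public_cfg(cfg, uuid)
    return storage_cfg(cfg, uuid, false)
end

-- Hot reload path: the module re-applies its saved configuration but
-- must skip box.cfg() and box.once(), since the instance is already
-- set up:
-- storage_cfg(M.current_cfg, M.this_replica.uuid, true)
```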
* [tarantool-patches] Re: [PATCH 2/3] Move lua gc to a dedicated module
2018-08-01 18:43 ` [tarantool-patches] " Vladislav Shpilevoy
@ 2018-08-03 20:04 ` Alex Khatskevich
2018-08-06 17:03 ` Vladislav Shpilevoy
2018-08-08 11:17 ` Vladislav Shpilevoy
0 siblings, 2 replies; 23+ messages in thread
From: Alex Khatskevich @ 2018-08-03 20:04 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches
On 01.08.2018 21:43, Vladislav Shpilevoy wrote:
> Thanks for the patch! See 4 comments below.
>
> On 31/07/2018 19:25, AKhatskevich wrote:
>> `vshard.lua_gc.lua` is a new module which helps make gc work more
>> intense.
>> Before the commit that was a duty of router and storage.
>>
>> Reasons to move lua gc to a separate module:
>> 1. It is not a duty of vshard to collect garbage, so let gc fiber
>> be as far from vshard as possible.
>> 2. Next commits will introduce multiple routers feature, which require
>> gc fiber to be a singleton.
>>
>> Closes #138
>> ---
>> test/router/garbage_collector.result | 27 +++++++++++------
>> test/router/garbage_collector.test.lua | 18 ++++++-----
>> test/storage/garbage_collector.result | 27 +++++++++--------
>> test/storage/garbage_collector.test.lua | 22 ++++++--------
>> vshard/lua_gc.lua | 54 +++++++++++++++++++++++++++++++++++++++++
>> vshard/router/init.lua | 19 +++---------
>> vshard/storage/init.lua | 20 ++++--------
>> 7 files changed, 116 insertions(+), 71 deletions(-)
>> create mode 100644 vshard/lua_gc.lua
>>
>> diff --git a/test/router/garbage_collector.result b/test/router/garbage_collector.result
>> index 3c2a4f1..a7474fc 100644
>> --- a/test/router/garbage_collector.result
>> +++ b/test/router/garbage_collector.result
>> @@ -40,27 +40,30 @@ test_run:switch('router_1')
>> fiber = require('fiber')
>> ---
>> ...
>> -cfg.collect_lua_garbage = true
>> +lua_gc = require('vshard.lua_gc')
>> ---
>> ...
>> -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DISCOVERY_INTERVAL
>> +cfg.collect_lua_garbage = true
>
> 1. Now this code tests nothing but just fibers. Below you do wakeup
> and check that iteration counter is increased, but it is obvious
> thing. Before your patch the test really tested that GC is called
> by checking for nullified weak references. Now I can remove
> collectgarbage()
> from the main_loop and nothing would changed. Please, make this test
> be a test.
The GC test has been restored.
>
> Moreover, the test hangs forever both locally and on Travis.
Fixed
>
>> diff --git a/test/storage/garbage_collector.result b/test/storage/garbage_collector.result
>> index 3588fb4..d94ba24 100644
>> --- a/test/storage/garbage_collector.result
>> +++ b/test/storage/garbage_collector.result
>
> 2. Same. Now the test passes even if I removed collectgarbage() from
> the main loop.
Restored.
>
>> diff --git a/vshard/lua_gc.lua b/vshard/lua_gc.lua
>> new file mode 100644
>> index 0000000..8d6af3e
>> --- /dev/null
>> +++ b/vshard/lua_gc.lua
>> @@ -0,0 +1,54 @@
>> +--
>> +-- This module implements background lua GC fiber.
>> +-- It's purpose is to make GC more aggressive.
>> +--
>> +
>> +local lfiber = require('fiber')
>> +local MODULE_INTERNALS = '__module_vshard_lua_gc'
>> +
>> +local M = rawget(_G, MODULE_INTERNALS)
>> +if not M then
>> + M = {
>> + -- Background fiber.
>> + bg_fiber = nil,
>> + -- GC interval in seconds.
>> + interval = nil,
>> + -- Main loop.
>> + -- Stored here to make the fiber reloadable.
>> + main_loop = nil,
>> + -- Number of `collectgarbage()` calls.
>> + iterations = 0,
>> + }
>> +end
>> +local DEFALUT_INTERVAL = 100
>
> 3. For constants please use vshard.consts.
>
> 4. You should not choose interval inside the main_loop.
> Please, use 'default' option in cfg.lua.
DEFAULT_INTERVAL has been removed entirely.
The interval value is now required.
full diff
commit ec221bd060f46e4dc009eaab1c6c1bd1cf5a4150
Author: AKhatskevich <avkhatskevich@tarantool.org>
Date: Thu Jul 26 01:17:00 2018 +0300
Move lua gc to a dedicated module
`vshard.lua_gc.lua` is a new module which makes GC work more
intensively.
Before this commit, that was a duty of the router and the storage.
Reasons to move lua gc to a separate module:
1. It is not a duty of vshard to collect garbage, so let gc fiber
be as far from vshard as possible.
2. Next commits will introduce the multiple routers feature, which
requires the gc fiber to be a singleton.
Closes #138
diff --git a/test/router/garbage_collector.result b/test/router/garbage_collector.result
index 3c2a4f1..7780046 100644
--- a/test/router/garbage_collector.result
+++ b/test/router/garbage_collector.result
@@ -40,41 +40,59 @@ test_run:switch('router_1')
fiber = require('fiber')
---
...
-cfg.collect_lua_garbage = true
+lua_gc = require('vshard.lua_gc')
---
...
-iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DISCOVERY_INTERVAL
+cfg.collect_lua_garbage = true
---
...
vshard.router.cfg(cfg)
---
...
+lua_gc.internal.bg_fiber ~= nil
+---
+- true
+...
+-- Check that `collectgarbage()` was really called.
a = setmetatable({}, {__mode = 'v'})
---
...
a.k = {b = 100}
---
...
-for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end
+iterations = lua_gc.internal.iterations
+---
+...
+lua_gc.internal.bg_fiber:wakeup()
+---
+...
+while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end
---
...
a.k
---
- null
...
+lua_gc.internal.interval = 0.001
+---
+...
cfg.collect_lua_garbage = false
---
...
vshard.router.cfg(cfg)
---
...
-a.k = {b = 100}
+lua_gc.internal.bg_fiber == nil
+---
+- true
+...
+iterations = lua_gc.internal.iterations
---
...
-for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end
+fiber.sleep(0.01)
---
...
-a.k ~= nil
+iterations == lua_gc.internal.iterations
---
- true
...
diff --git a/test/router/garbage_collector.test.lua b/test/router/garbage_collector.test.lua
index b3411cd..e8d0876 100644
--- a/test/router/garbage_collector.test.lua
+++ b/test/router/garbage_collector.test.lua
@@ -13,18 +13,24 @@ test_run:cmd("start server router_1")
--
test_run:switch('router_1')
fiber = require('fiber')
+lua_gc = require('vshard.lua_gc')
cfg.collect_lua_garbage = true
-iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DISCOVERY_INTERVAL
vshard.router.cfg(cfg)
+lua_gc.internal.bg_fiber ~= nil
+-- Check that `collectgarbage()` was really called.
a = setmetatable({}, {__mode = 'v'})
a.k = {b = 100}
-for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end
+iterations = lua_gc.internal.iterations
+lua_gc.internal.bg_fiber:wakeup()
+while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end
a.k
+lua_gc.internal.interval = 0.001
cfg.collect_lua_garbage = false
vshard.router.cfg(cfg)
-a.k = {b = 100}
-for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end
-a.k ~= nil
+lua_gc.internal.bg_fiber == nil
+iterations = lua_gc.internal.iterations
+fiber.sleep(0.01)
+iterations == lua_gc.internal.iterations
test_run:switch("default")
test_run:cmd("stop server router_1")
diff --git a/test/storage/garbage_collector.result b/test/storage/garbage_collector.result
index 3588fb4..6bec2db 100644
--- a/test/storage/garbage_collector.result
+++ b/test/storage/garbage_collector.result
@@ -120,7 +120,7 @@ test_run:switch('storage_1_a')
fiber = require('fiber')
---
...
-log = require('log')
+lua_gc = require('vshard.lua_gc')
---
...
cfg.collect_lua_garbage = true
@@ -129,38 +129,50 @@ cfg.collect_lua_garbage = true
vshard.storage.cfg(cfg, names.storage_1_a)
---
...
--- Create a weak reference to a able {b = 100} - it must be
--- deleted on the next GC.
+lua_gc.internal.bg_fiber ~= nil
+---
+- true
+...
+-- Check that `collectgarbage()` was really called.
a = setmetatable({}, {__mode = 'v'})
---
...
a.k = {b = 100}
---
...
-iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
+iterations = lua_gc.internal.iterations
---
...
--- Wait until Lua GC deletes a.k.
-for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
+lua_gc.internal.bg_fiber:wakeup()
+---
+...
+while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end
---
...
a.k
---
- null
...
+lua_gc.internal.interval = 0.001
+---
+...
cfg.collect_lua_garbage = false
---
...
vshard.storage.cfg(cfg, names.storage_1_a)
---
...
-a.k = {b = 100}
+lua_gc.internal.bg_fiber == nil
+---
+- true
+...
+iterations = lua_gc.internal.iterations
---
...
-for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
+fiber.sleep(0.01)
---
...
-a.k ~= nil
+iterations == lua_gc.internal.iterations
---
- true
...
diff --git a/test/storage/garbage_collector.test.lua b/test/storage/garbage_collector.test.lua
index 79e76d8..407b8a1 100644
--- a/test/storage/garbage_collector.test.lua
+++ b/test/storage/garbage_collector.test.lua
@@ -46,22 +46,24 @@ customer:select{}
--
test_run:switch('storage_1_a')
fiber = require('fiber')
-log = require('log')
+lua_gc = require('vshard.lua_gc')
cfg.collect_lua_garbage = true
vshard.storage.cfg(cfg, names.storage_1_a)
--- Create a weak reference to a able {b = 100} - it must be
--- deleted on the next GC.
+lua_gc.internal.bg_fiber ~= nil
+-- Check that `collectgarbage()` was really called.
a = setmetatable({}, {__mode = 'v'})
a.k = {b = 100}
-iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
--- Wait until Lua GC deletes a.k.
-for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
+iterations = lua_gc.internal.iterations
+lua_gc.internal.bg_fiber:wakeup()
+while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end
a.k
+lua_gc.internal.interval = 0.001
cfg.collect_lua_garbage = false
vshard.storage.cfg(cfg, names.storage_1_a)
-a.k = {b = 100}
-for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
-a.k ~= nil
+lua_gc.internal.bg_fiber == nil
+iterations = lua_gc.internal.iterations
+fiber.sleep(0.01)
+iterations == lua_gc.internal.iterations
test_run:switch('default')
test_run:drop_cluster(REPLICASET_2)
diff --git a/vshard/lua_gc.lua b/vshard/lua_gc.lua
new file mode 100644
index 0000000..c6c5cd3
--- /dev/null
+++ b/vshard/lua_gc.lua
@@ -0,0 +1,54 @@
+--
+-- This module implements background lua GC fiber.
+-- Its purpose is to make GC more aggressive.
+--
+
+local lfiber = require('fiber')
+local MODULE_INTERNALS = '__module_vshard_lua_gc'
+
+local M = rawget(_G, MODULE_INTERNALS)
+if not M then
+ M = {
+ -- Background fiber.
+ bg_fiber = nil,
+ -- GC interval in seconds.
+ interval = nil,
+ -- Main loop.
+ -- Stored here to make the fiber reloadable.
+ main_loop = nil,
+ -- Number of `collectgarbage()` calls.
+ iterations = 0,
+ }
+end
+
+M.main_loop = function()
+ lfiber.sleep(M.interval)
+ collectgarbage()
+ M.iterations = M.iterations + 1
+ return M.main_loop()
+end
+
+local function set_state(active, interval)
+ assert(type(interval) == 'number')
+ M.interval = interval
+ if active and not M.bg_fiber then
+ M.bg_fiber = lfiber.create(M.main_loop)
+ M.bg_fiber:name('vshard.lua_gc')
+ end
+ if not active and M.bg_fiber then
+ M.bg_fiber:cancel()
+ M.bg_fiber = nil
+ end
+ if active then
+ M.bg_fiber:wakeup()
+ end
+end
+
+if not rawget(_G, MODULE_INTERNALS) then
+ rawset(_G, MODULE_INTERNALS, M)
+end
+
+return {
+ set_state = set_state,
+ internal = M,
+}
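For reference, a usage sketch of the module above (it requires a Tarantool runtime because of the `fiber` dependency; the intervals here are illustrative):

```lua
local lua_gc = require('vshard.lua_gc')

-- Start the background fiber: one collectgarbage() per 100 seconds.
lua_gc.set_state(true, 100)

-- Shrink the interval; set_state() wakes the fiber up, so the new
-- value takes effect immediately.
lua_gc.set_state(true, 0.5)

-- Stop the fiber and drop the reference to it.
lua_gc.set_state(false, 0.5)

-- lua_gc.internal.iterations counts completed GC cycles; the tests
-- above use it to verify that collectgarbage() was really called.
```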
diff --git a/vshard/router/init.lua b/vshard/router/init.lua
index e2b2b22..3e127cb 100644
--- a/vshard/router/init.lua
+++ b/vshard/router/init.lua
@@ -7,6 +7,7 @@ if rawget(_G, MODULE_INTERNALS) then
local vshard_modules = {
'vshard.consts', 'vshard.error', 'vshard.cfg',
'vshard.hash', 'vshard.replicaset', 'vshard.util',
+ 'vshard.lua_gc',
}
for _, module in pairs(vshard_modules) do
package.loaded[module] = nil
@@ -18,6 +19,7 @@ local lcfg = require('vshard.cfg')
local lhash = require('vshard.hash')
local lreplicaset = require('vshard.replicaset')
local util = require('vshard.util')
+local lua_gc = require('vshard.lua_gc')
local M = rawget(_G, MODULE_INTERNALS)
if not M then
@@ -43,8 +45,7 @@ if not M then
discovery_fiber = nil,
-- Bucket count stored on all replicasets.
total_bucket_count = 0,
- -- If true, then discovery fiber starts to call
- -- collectgarbage() periodically.
+ -- Boolean lua_gc state (create periodic gc task).
collect_lua_garbage = nil,
-- This counter is used to restart background fibers with
-- new reloaded code.
@@ -151,8 +152,6 @@ end
--
local function discovery_f()
local module_version = M.module_version
- local iterations_until_lua_gc =
- consts.COLLECT_LUA_GARBAGE_INTERVAL / consts.DISCOVERY_INTERVAL
while module_version == M.module_version do
while not next(M.replicasets) do
lfiber.sleep(consts.DISCOVERY_INTERVAL)
@@ -188,12 +187,6 @@ local function discovery_f()
M.route_map[bucket_id] = replicaset
end
end
- iterations_until_lua_gc = iterations_until_lua_gc - 1
- if M.collect_lua_garbage and iterations_until_lua_gc == 0 then
- iterations_until_lua_gc =
- consts.COLLECT_LUA_GARBAGE_INTERVAL / consts.DISCOVERY_INTERVAL
- collectgarbage()
- end
lfiber.sleep(consts.DISCOVERY_INTERVAL)
end
end
@@ -504,7 +497,6 @@ local function router_cfg(cfg)
end
local new_replicasets = lreplicaset.buildall(vshard_cfg)
local total_bucket_count = vshard_cfg.bucket_count
- local collect_lua_garbage = vshard_cfg.collect_lua_garbage
log.info("Calling box.cfg()...")
for k, v in pairs(box_cfg) do
log.info({[k] = v})
@@ -531,7 +523,7 @@ local function router_cfg(cfg)
vshard_cfg.connection_outdate_delay)
M.connection_outdate_delay = vshard_cfg.connection_outdate_delay
M.total_bucket_count = total_bucket_count
- M.collect_lua_garbage = collect_lua_garbage
+ M.collect_lua_garbage = vshard_cfg.collect_lua_garbage
M.current_cfg = vshard_cfg
M.replicasets = new_replicasets
-- Update existing route map in-place.
@@ -548,8 +540,7 @@ local function router_cfg(cfg)
M.discovery_fiber = util.reloadable_fiber_create(
'vshard.discovery', M, 'discovery_f')
end
- -- Destroy connections, not used in a new configuration.
- collectgarbage()
+ lua_gc.set_state(M.collect_lua_garbage, consts.COLLECT_LUA_GARBAGE_INTERVAL)
end
--------------------------------------------------------------------------------
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index 40216ea..3e29e9d 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -10,7 +10,8 @@ if rawget(_G, MODULE_INTERNALS) then
local vshard_modules = {
'vshard.consts', 'vshard.error', 'vshard.cfg',
'vshard.replicaset', 'vshard.util',
- 'vshard.storage.reload_evolution'
+ 'vshard.storage.reload_evolution',
+ 'vshard.lua_gc',
}
for _, module in pairs(vshard_modules) do
package.loaded[module] = nil
@@ -21,6 +22,7 @@ local lerror = require('vshard.error')
local lcfg = require('vshard.cfg')
local lreplicaset = require('vshard.replicaset')
local util = require('vshard.util')
+local lua_gc = require('vshard.lua_gc')
local reload_evolution = require('vshard.storage.reload_evolution')
local M = rawget(_G, MODULE_INTERNALS)
@@ -75,8 +77,7 @@ if not M then
collect_bucket_garbage_fiber = nil,
-- Do buckets garbage collection once per this time.
collect_bucket_garbage_interval = nil,
- -- If true, then bucket garbage collection fiber starts to
- -- call collectgarbage() periodically.
+ -- Boolean lua_gc state (create periodic gc task).
collect_lua_garbage = nil,
-------------------- Bucket recovery ---------------------
@@ -1063,9 +1064,6 @@ function collect_garbage_f()
-- buckets_for_redirect is deleted, it gets empty_sent_buckets
-- for next deletion.
local empty_sent_buckets = {}
- local iterations_until_lua_gc =
- consts.COLLECT_LUA_GARBAGE_INTERVAL / M.collect_bucket_garbage_interval
-
while M.module_version == module_version do
-- Check if no changes in buckets configuration.
if control.bucket_generation_collected ~=
control.bucket_generation then
@@ -1106,12 +1104,6 @@ function collect_garbage_f()
end
end
::continue::
- iterations_until_lua_gc = iterations_until_lua_gc - 1
- if iterations_until_lua_gc == 0 and M.collect_lua_garbage then
- iterations_until_lua_gc = consts.COLLECT_LUA_GARBAGE_INTERVAL /
- M.collect_bucket_garbage_interval
- collectgarbage()
- end
lfiber.sleep(M.collect_bucket_garbage_interval)
end
end
@@ -1586,7 +1578,6 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload)
local shard_index = vshard_cfg.shard_index
local collect_bucket_garbage_interval =
vshard_cfg.collect_bucket_garbage_interval
- local collect_lua_garbage = vshard_cfg.collect_lua_garbage
-- It is considered that all possible errors during cfg
-- process occur only before this place.
@@ -1641,7 +1632,7 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload)
M.rebalancer_max_receiving = rebalancer_max_receiving
M.shard_index = shard_index
M.collect_bucket_garbage_interval = collect_bucket_garbage_interval
- M.collect_lua_garbage = collect_lua_garbage
+ M.collect_lua_garbage = vshard_cfg.collect_lua_garbage
M.current_cfg = vshard_cfg
if was_master and not is_master then
@@ -1666,6 +1657,7 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload)
M.rebalancer_fiber:cancel()
M.rebalancer_fiber = nil
end
+ lua_gc.set_state(M.collect_lua_garbage, consts.COLLECT_LUA_GARBAGE_INTERVAL)
-- Destroy connections, not used in a new configuration.
collectgarbage()
end
* [tarantool-patches] Re: [PATCH 3/3] Introduce multiple routers feature
2018-08-01 18:43 ` [tarantool-patches] " Vladislav Shpilevoy
@ 2018-08-03 20:05 ` Alex Khatskevich
2018-08-06 17:03 ` Vladislav Shpilevoy
0 siblings, 1 reply; 23+ messages in thread
From: Alex Khatskevich @ 2018-08-03 20:05 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches
On 01.08.2018 21:43, Vladislav Shpilevoy wrote:
> Thanks for the patch! See 10 comments below.
>
> On 31/07/2018 19:25, AKhatskevich wrote:
>> Key points:
>> * Old `vshard.router.some_method()` api is preserved.
>> * Add `vshard.router.new(name, cfg)` method which returns a new router.
>> * Each router has its own:
>> 1. name
>> 2. background fibers
>> 3. attributes (route_map, replicasets, outdate_delay...)
>> * Module reload reloads all configured routers.
>> * `cfg` reconfigures a single router.
>> * All routers share the same box configuration. The last passed config
>> overrides the global config.
>> * Multiple router instances can be connected to the same cluster.
>> * By now, a router cannot be destroyed.
>>
>> Extra changes:
>> * Add `data` parameter to `reloadable_fiber_create` function.
>>
>> Closes #130
>> ---
>> diff --git a/test/multiple_routers/multiple_routers.result b/test/multiple_routers/multiple_routers.result
>> new file mode 100644
>> index 0000000..33f4034
>> --- /dev/null
>> +++ b/test/multiple_routers/multiple_routers.result
>> @@ -0,0 +1,226 @@
>> +-- Reconfigure one of routers do not affect the others.
>> +routers[3]:cfg(configs.cfg_1)
>
> 1. You did not change configs.cfg_1 so it is not reconfig
> actually. Please, change something to check that the
> parameter affects one router and does not affect others.
routers[3] was configured with configs.cfg_2 before,
so its config was indeed changed.
>
> 2. Please, add a test on an ability to get the static router
> into a variable and use it like others. It should be possible
> to hide distinctions between static and other routers.
>
> Like this:
>
> r1 = vshard.router.static
> r2 = vshard.router.new(...)
> do_something_with_router(r1)
> do_something_with_router(r2)
>
> Here do_something_with_router() is unaware of whether the
> router is static or not.
>
A few such calls have been added.
>> diff --git a/vshard/router/init.lua b/vshard/router/init.lua
>> index 3e127cb..7569baf 100644
>> --- a/vshard/router/init.lua
>> +++ b/vshard/router/init.lua
>> @@ -257,13 +272,13 @@ local function router_call(bucket_id, mode, func, args, opts)
>> -- but already is executed on storages.
>> while lfiber.time() <= tend do
>> lfiber.sleep(0.05)
>> - replicaset = M.replicasets[err.destination]
>> + replicaset = router.replicasets[err.destination]
>> if replicaset then
>> goto replicaset_is_found
>> end
>> end
>> else
>> - replicaset = bucket_set(bucket_id, replicaset.uuid)
>> + replicaset = bucket_set(router, bucket_id, replicaset.uuid)
>
> 3. Out of 80 symbols.
fixed
>
>> lfiber.yield()
>> -- Protect against infinite cycle in a
>> -- case of broken cluster, when a bucket
>> @@ -488,9 +503,14 @@ end
>> -- Configuration
>> --------------------------------------------------------------------------------
>> -local function router_cfg(cfg)
>> - local vshard_cfg, box_cfg = lcfg.check(cfg, M.current_cfg)
>> - if not M.replicasets then
>> +-- Types of configuration.
>> +CFG_NEW = 'new'
>> +CFG_RELOAD = 'reload'
>> +CFG_RECONFIGURE = 'reconfigure'
>
> 4. Last two values are never used in router_cfg(). The first
> is used for logging only and can be checked as it was before
> with no explicit passing.
I have left it as it is.
Now, each of those is passed at least once.
>> +
>> +local function router_cfg(router, cfg, cfg_type)
>> + local vshard_cfg, box_cfg = lcfg.check(cfg, router.current_cfg)
>> + if cfg_type == CFG_NEW then
>> log.info('Starting router configuration')
>> else
>> log.info('Starting router reconfiguration')
>> @@ -512,44 +532,53 @@ local function router_cfg(cfg)
>> +
>> +local function updage_lua_gc_state()
>
> 5. This function is not needed actually.
>
> On router_new() the only thing that can change is start of
> the gc fiber if the new router has the flag and the gc is
> not started now. It can be checked by a simple 'if' with
> no full-scan of all routers.
>
> On reload it is not possible to change configuration, so
> the gc state can not be changed and does not need an update.
> Even if it could be changed, you already iterate over routers
> on reload to call router_cfg and can collect their flags
> along side.
>
> The next point is that it is not possible now to manage the
> gc via simple :cfg() call. You do nothing with gc when
> router_cfg is called directly. And that produces a question -
> why do your tests pass if so?
>
> The possible solution - keep a counter of set lua gc flags
> overall routers in M. On each cfg you update the counter
> if the value is changed. If it was 0 and become > 0, then
> you start gc. If it was > 0 and become 0, then you stop gc.
> No routers iteration at all.
Implemented by introducing a counter.
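
The counter-based solution can be sketched as follows (the names are illustrative and may differ from the final patch):

```lua
local lua_gc = require('vshard.lua_gc')
local consts = require('vshard.consts')

-- Module-wide number of routers with collect_lua_garbage = true.
local M = { collect_lua_garbage_cnt = 0 }

-- Called from router_cfg() when the flag of one router changes.
local function update_lua_gc_cnt(old_flag, new_flag)
    if old_flag == new_flag then
        return
    end
    local cnt = M.collect_lua_garbage_cnt + (new_flag and 1 or -1)
    M.collect_lua_garbage_cnt = cnt
    -- The gc fiber runs iff at least one router wants it: 0 -> >0
    -- starts it, >0 -> 0 stops it, with no iteration over routers.
    lua_gc.set_state(cnt > 0, consts.COLLECT_LUA_GARBAGE_INTERVAL)
end
```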
>
>> + local lua_gc = false
>> + for _, xrouter in pairs(M.routers) do
>> + if xrouter.collect_lua_garbage then
>> + lua_gc = true
>> + end
>> + end
>> + lua_gc.set_state(lua_gc, consts.COLLECT_LUA_GARBAGE_INTERVAL)
>> end
>> @@ -803,6 +833,93 @@ if M.errinj.ERRINJ_RELOAD then
>> error('Error injection: reload')
>> end
>> +--------------------------------------------------------------------------------
>> +-- Managing router instances
>> +--------------------------------------------------------------------------------
>>
>> +
>> +local function cfg_reconfigure(router, cfg)
>> + return router_cfg(router, cfg, CFG_RECONFIGURE)
>> +end
>> +
>> +local router_mt = {
>> + __index = {
>> + cfg = cfg_reconfigure;
>> + info = router_info;
>> + buckets_info = router_buckets_info;
>> + call = router_call;
>> + callro = router_callro;
>> + callrw = router_callrw;
>> + route = router_route;
>> + routeall = router_routeall;
>> + bucket_id = router_bucket_id;
>> + bucket_count = router_bucket_count;
>> + sync = router_sync;
>> + bootstrap = cluster_bootstrap;
>> + bucket_discovery = bucket_discovery;
>> + discovery_wakeup = discovery_wakeup;
>> + }
>> +}
>> +
>> +-- Table which represents this module.
>> +local module = {}
>> +
>> +local function export_static_router_attributes()
>> + -- This metatable bypasses calls to a module to the static_router.
>> + local module_mt = {__index = {}}
>> + for method_name, method in pairs(router_mt.__index) do
>> + module_mt.__index[method_name] = function(...)
>> + if M.static_router then
>> + return method(M.static_router, ...)
>> + else
>> + error('Static router is not configured')
>
> 6. This should not be all-time check. You should
> initialize the static router metatable with only errors.
> On the first cfg you reset the metatable to always use
> regular methods. But anyway this code is unreachable. See
> below in the comment 10 why it is so.
Yes. Fixed.
>
>> + end
>> + end
>> + end
>> + setmetatable(module, module_mt)
>> + -- Make static_router attributes accessible form
>> + -- vshard.router.internal.
>> + local M_static_router_attributes = {
>> + name = true,
>> + replicasets = true,
>> + route_map = true,
>> + total_bucket_count = true,
>> + }
>
> 7. I saw in the tests that you are using
> vshard.router.internal.static_router
> instead. Please, remove M_static_router_attributes then.
Deleted. Tests are fixed.
>
>> + setmetatable(M, {
>> + __index = function(M, key)
>> + return M.static_router[key]
>> + end
>> + })
>> +end
>> +
>> +local function router_new(name, cfg)
>> + assert(type(name) == 'string' and type(cfg) == 'table',
>> + 'Wrong argument type. Usage: vshard.router.new(name, cfg).')
>
> 8. As I said before, do not use assertions for usage checks in public
> API. Use 'if wrong_usage then error(...) end'.
Fixed.
>
>> + if M.routers[name] then
>> + return nil, string.format('Router with name %s already exists', name)
>> + end
>> + local router = table.deepcopy(ROUTER_TEMPLATE)
>> + setmetatable(router, router_mt)
>> + router.name = name
>> + M.routers[name] = router
>> + if name == STATIC_ROUTER_NAME then
>> + M.static_router = router
>> + export_static_router_attributes()
>> + end
>
> 9. This check can be removed if you move
> export_static_router_attributes call into legacy_cfg.
But due to this `if`, the static router can also be configured via
`vshard.router.new(static_router_name, cfg)`.
>
> 10. Looks like all your struggles in
> export_static_router_attributes() about raising an error on a
> non-configured router make no sense: until cfg is called,
> vshard.router has no methods except cfg and new.
>
>> + router_cfg(router, cfg, CFG_NEW)
>> + updage_lua_gc_state()
>> + return router
>> +end
>> +
Fixed.
full diff
commit f3ffb6a6a3632277f05ee4ea7d095a19dd85a42f
Author: AKhatskevich <avkhatskevich@tarantool.org>
Date: Thu Jul 26 16:17:25 2018 +0300
Introduce multiple routers feature
Key points:
* Old `vshard.router.some_method()` API is preserved.
* Add `vshard.router.new(name, cfg)` method which returns a new router.
* Each router has its own:
1. name
2. background fibers
3. attributes (route_map, replicasets, outdate_delay...)
* Module reload reloads all configured routers.
* `cfg` reconfigures a single router.
* All routers share the same box configuration. The last passed config
overrides the global box config.
* Multiple router instances can be connected to the same cluster.
* For now, a router cannot be destroyed.
Extra changes:
* Add `data` parameter to `reloadable_fiber_create` function.
Closes #130
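The registry behind `vshard.router.new` can be illustrated with a minimal standalone sketch (hypothetical, stripped of the actual router setup and configuration):

```lua
-- Minimal model of the router registry: one table of routers keyed
-- by name, with duplicate names rejected as in the patch.
local M = {routers = {}}
local function router_new(name, cfg)
    if M.routers[name] then
        return nil, string.format('Router with name %s already exists', name)
    end
    local router = {name = name, cfg = cfg}
    M.routers[name] = router
    return router
end
```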
diff --git a/test/failover/failover.result b/test/failover/failover.result
index 73a4250..50410ad 100644
--- a/test/failover/failover.result
+++ b/test/failover/failover.result
@@ -174,7 +174,7 @@ test_run:switch('router_1')
---
- true
...
-rs1 = vshard.router.internal.replicasets[rs_uuid[1]]
+rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]]
---
...
while not rs1.replica_up_ts do fiber.sleep(0.1) end
diff --git a/test/failover/failover.test.lua b/test/failover/failover.test.lua
index 6e06314..44c8b6d 100644
--- a/test/failover/failover.test.lua
+++ b/test/failover/failover.test.lua
@@ -74,7 +74,7 @@ echo_count
-- Ensure that replica_up_ts is updated periodically.
test_run:switch('router_1')
-rs1 = vshard.router.internal.replicasets[rs_uuid[1]]
+rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]]
while not rs1.replica_up_ts do fiber.sleep(0.1) end
old_up_ts = rs1.replica_up_ts
while rs1.replica_up_ts == old_up_ts do fiber.sleep(0.1) end
diff --git a/test/failover/failover_errinj.result b/test/failover/failover_errinj.result
index 3b6d986..484a1e3 100644
--- a/test/failover/failover_errinj.result
+++ b/test/failover/failover_errinj.result
@@ -49,7 +49,7 @@ vshard.router.cfg(cfg)
-- Check that already run failover step is restarted on
-- configuration change (if some replicasets are removed from
-- config).
-rs1 = vshard.router.internal.replicasets[rs_uuid[1]]
+rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]]
---
...
while not rs1.replica or not rs1.replica.conn:is_connected() do fiber.sleep(0.1) end
diff --git a/test/failover/failover_errinj.test.lua b/test/failover/failover_errinj.test.lua
index b4d2d35..14228de 100644
--- a/test/failover/failover_errinj.test.lua
+++ b/test/failover/failover_errinj.test.lua
@@ -20,7 +20,7 @@ vshard.router.cfg(cfg)
-- Check that already run failover step is restarted on
-- configuration change (if some replicasets are removed from
-- config).
-rs1 = vshard.router.internal.replicasets[rs_uuid[1]]
+rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]]
while not rs1.replica or not rs1.replica.conn:is_connected() do fiber.sleep(0.1) end
vshard.router.internal.errinj.ERRINJ_FAILOVER_CHANGE_CFG = true
wait_state('Configuration has changed, restart ')
diff --git a/test/failover/router_1.lua b/test/failover/router_1.lua
index d71209b..664a6c6 100644
--- a/test/failover/router_1.lua
+++ b/test/failover/router_1.lua
@@ -42,7 +42,7 @@ end
function priority_order()
local ret = {}
for _, uuid in pairs(rs_uuid) do
- local rs = vshard.router.internal.replicasets[uuid]
+ local rs = vshard.router.internal.static_router.replicasets[uuid]
local sorted = {}
for _, replica in pairs(rs.priority_list) do
local z
diff --git a/test/misc/reconfigure.result b/test/misc/reconfigure.result
index c7960b3..311f749 100644
--- a/test/misc/reconfigure.result
+++ b/test/misc/reconfigure.result
@@ -250,7 +250,7 @@ test_run:switch('router_1')
-- Ensure that in a case of error router internals are not
-- changed.
--
-not vshard.router.internal.collect_lua_garbage
+not vshard.router.internal.static_router.collect_lua_garbage
---
- true
...
@@ -264,7 +264,7 @@ vshard.router.cfg(cfg)
---
- error: 'Incorrect value for option ''invalid_option'': unexpected
option'
...
-not vshard.router.internal.collect_lua_garbage
+not vshard.router.internal.static_router.collect_lua_garbage
---
- true
...
diff --git a/test/misc/reconfigure.test.lua b/test/misc/reconfigure.test.lua
index 25dc2ca..298b9b0 100644
--- a/test/misc/reconfigure.test.lua
+++ b/test/misc/reconfigure.test.lua
@@ -99,11 +99,11 @@ test_run:switch('router_1')
-- Ensure that in a case of error router internals are not
-- changed.
--
-not vshard.router.internal.collect_lua_garbage
+not vshard.router.internal.static_router.collect_lua_garbage
cfg.collect_lua_garbage = true
cfg.invalid_option = 'kek'
vshard.router.cfg(cfg)
-not vshard.router.internal.collect_lua_garbage
+not vshard.router.internal.static_router.collect_lua_garbage
cfg.invalid_option = nil
cfg.collect_lua_garbage = nil
vshard.router.cfg(cfg)
diff --git a/test/multiple_routers/configs.lua b/test/multiple_routers/configs.lua
new file mode 100644
index 0000000..a6ce33c
--- /dev/null
+++ b/test/multiple_routers/configs.lua
@@ -0,0 +1,81 @@
+names = {
+ storage_1_1_a = '32a2d4b8-f146-44ed-9d51-2436507efdf8',
+ storage_1_1_b = 'c1c849b1-641d-40b8-9283-bcfe73d46270',
+ storage_1_2_a = '04e677ed-c7ba-47e0-a67f-b5100cfa86af',
+ storage_1_2_b = 'c7a979ee-9263-4a38-84a5-2fb6a0a32684',
+ storage_2_1_a = '88dc03f0-23fb-4f05-b462-e29186542864',
+ storage_2_1_b = '4230b711-f5c4-4131-bf98-88cd43a16901',
+ storage_2_2_a = '6b1eefbc-1e2e-410e-84ff-44c572ea9916',
+ storage_2_2_b = 'be74419a-1e56-4ba4-97e9-6b18710f63c5',
+}
+
+rs_1_1 = 'dd208fb8-8b90-49bc-8393-6b3a99da7c52'
+rs_1_2 = 'af9cfe88-2091-4613-a877-a623776c5c0e'
+rs_2_1 = '9ca8ee15-ae18-4f31-9385-4859f89ce73f'
+rs_2_2 = '007f5f58-b654-4125-8441-a71866fb62b5'
+
+local cfg_1 = {}
+cfg_1.sharding = {
+ [rs_1_1] = {
+ replicas = {
+ [names.storage_1_1_a] = {
+ uri = 'storage:storage@127.0.0.1:3301',
+ name = 'storage_1_1_a',
+ master = true,
+ },
+ [names.storage_1_1_b] = {
+ uri = 'storage:storage@127.0.0.1:3302',
+ name = 'storage_1_1_b',
+ },
+ }
+ },
+ [rs_1_2] = {
+ replicas = {
+ [names.storage_1_2_a] = {
+ uri = 'storage:storage@127.0.0.1:3303',
+ name = 'storage_1_2_a',
+ master = true,
+ },
+ [names.storage_1_2_b] = {
+ uri = 'storage:storage@127.0.0.1:3304',
+ name = 'storage_1_2_b',
+ },
+ }
+ },
+}
+
+
+local cfg_2 = {}
+cfg_2.sharding = {
+ [rs_2_1] = {
+ replicas = {
+ [names.storage_2_1_a] = {
+ uri = 'storage:storage@127.0.0.1:3305',
+ name = 'storage_2_1_a',
+ master = true,
+ },
+ [names.storage_2_1_b] = {
+ uri = 'storage:storage@127.0.0.1:3306',
+ name = 'storage_2_1_b',
+ },
+ }
+ },
+ [rs_2_2] = {
+ replicas = {
+ [names.storage_2_2_a] = {
+ uri = 'storage:storage@127.0.0.1:3307',
+ name = 'storage_2_2_a',
+ master = true,
+ },
+ [names.storage_2_2_b] = {
+ uri = 'storage:storage@127.0.0.1:3308',
+ name = 'storage_2_2_b',
+ },
+ }
+ },
+}
+
+return {
+ cfg_1 = cfg_1,
+ cfg_2 = cfg_2,
+}
diff --git a/test/multiple_routers/multiple_routers.result b/test/multiple_routers/multiple_routers.result
new file mode 100644
index 0000000..1e309a7
--- /dev/null
+++ b/test/multiple_routers/multiple_routers.result
@@ -0,0 +1,295 @@
+test_run = require('test_run').new()
+---
+...
+REPLICASET_1_1 = { 'storage_1_1_a', 'storage_1_1_b' }
+---
+...
+REPLICASET_1_2 = { 'storage_1_2_a', 'storage_1_2_b' }
+---
+...
+REPLICASET_2_1 = { 'storage_2_1_a', 'storage_2_1_b' }
+---
+...
+REPLICASET_2_2 = { 'storage_2_2_a', 'storage_2_2_b' }
+---
+...
+test_run:create_cluster(REPLICASET_1_1, 'multiple_routers')
+---
+...
+test_run:create_cluster(REPLICASET_1_2, 'multiple_routers')
+---
+...
+test_run:create_cluster(REPLICASET_2_1, 'multiple_routers')
+---
+...
+test_run:create_cluster(REPLICASET_2_2, 'multiple_routers')
+---
+...
+util = require('lua_libs.util')
+---
+...
+util.wait_master(test_run, REPLICASET_1_1, 'storage_1_1_a')
+---
+...
+util.wait_master(test_run, REPLICASET_1_2, 'storage_1_2_a')
+---
+...
+util.wait_master(test_run, REPLICASET_2_1, 'storage_2_1_a')
+---
+...
+util.wait_master(test_run, REPLICASET_2_2, 'storage_2_2_a')
+---
+...
+test_run:cmd("create server router_1 with script='multiple_routers/router_1.lua'")
+---
+- true
+...
+test_run:cmd("start server router_1")
+---
+- true
+...
+-- Configure default (static) router.
+_ = test_run:cmd("switch router_1")
+---
+...
+static_router = vshard.router.new('_static_router', configs.cfg_1)
+---
+...
+vshard.router.bootstrap()
+---
+- true
+...
+_ = test_run:cmd("switch storage_1_2_a")
+---
+...
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+---
+...
+_ = test_run:cmd("switch router_1")
+---
+...
+vshard.router.call(1, 'write', 'do_replace', {{1, 1}})
+---
+- true
+...
+vshard.router.call(1, 'read', 'do_select', {1})
+---
+- [[1, 1]]
+...
+-- Test that static router is just a router object under the hood.
+static_router:route(1) == vshard.router.route(1)
+---
+- true
+...
+-- Configure extra router.
+router_2 = vshard.router.new('router_2', configs.cfg_2)
+---
+...
+router_2:bootstrap()
+---
+- true
+...
+_ = test_run:cmd("switch storage_2_2_a")
+---
+...
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+---
+...
+_ = test_run:cmd("switch router_1")
+---
+...
+router_2:call(1, 'write', 'do_replace', {{2, 2}})
+---
+- true
+...
+router_2:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+-- Check that router_2 and the static router serve different clusters.
+#router_2:call(1, 'read', 'do_select', {1}) == 0
+---
+- true
+...
+-- Create several routers to the same cluster.
+routers = {}
+---
+...
+for i = 3, 10 do routers[i] = vshard.router.new('router_' .. i, configs.cfg_2) end
+---
+...
+routers[3]:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+-- Check that they have their own background fibers.
+fiber_names = {}
+---
+...
+for i = 2, 10 do fiber_names['vshard.failover.router_' .. i] = true; fiber_names['vshard.discovery.router_' .. i] = true; end
+---
+...
+next(fiber_names) ~= nil
+---
+- true
+...
+fiber = require('fiber')
+---
+...
+for _, xfiber in pairs(fiber.info()) do fiber_names[xfiber.name] = nil end
+---
+...
+next(fiber_names) == nil
+---
+- true
+...
+-- Reconfiguration of one router does not affect the others.
+routers[3]:cfg(configs.cfg_1)
+---
+...
+routers[3]:call(1, 'read', 'do_select', {1})
+---
+- [[1, 1]]
+...
+#routers[3]:call(1, 'read', 'do_select', {2}) == 0
+---
+- true
+...
+#routers[4]:call(1, 'read', 'do_select', {1}) == 0
+---
+- true
+...
+routers[4]:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+routers[3]:cfg(configs.cfg_2)
+---
+...
+-- Try to create router with the same name.
+util = require('lua_libs.util')
+---
+...
+util.check_error(vshard.router.new, 'router_2', configs.cfg_2)
+---
+- null
+- Router with name router_2 already exists
+...
+-- Reload router module.
+_, old_rs_1 = next(vshard.router.internal.static_router.replicasets)
+---
+...
+_, old_rs_2 = next(router_2.replicasets)
+---
+...
+package.loaded['vshard.router'] = nil
+---
+...
+vshard.router = require('vshard.router')
+---
+...
+while not old_rs_1.is_outdated do fiber.sleep(0.01) end
+---
+...
+while not old_rs_2.is_outdated do fiber.sleep(0.01) end
+---
+...
+vshard.router.call(1, 'read', 'do_select', {1})
+---
+- [[1, 1]]
+...
+router_2:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+routers[5]:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+-- Check lua_gc counter.
+lua_gc = require('vshard.lua_gc')
+---
+...
+vshard.router.internal.collect_lua_garbage_cnt == 0
+---
+- true
+...
+lua_gc.internal.bg_fiber == nil
+---
+- true
+...
+configs.cfg_2.collect_lua_garbage = true
+---
+...
+routers[5]:cfg(configs.cfg_2)
+---
+...
+lua_gc.internal.bg_fiber ~= nil
+---
+- true
+...
+routers[7]:cfg(configs.cfg_2)
+---
+...
+lua_gc.internal.bg_fiber ~= nil
+---
+- true
+...
+vshard.router.internal.collect_lua_garbage_cnt == 2
+---
+- true
+...
+package.loaded['vshard.router'] = nil
+---
+...
+vshard.router = require('vshard.router')
+---
+...
+vshard.router.internal.collect_lua_garbage_cnt == 2
+---
+- true
+...
+configs.cfg_2.collect_lua_garbage = nil
+---
+...
+routers[5]:cfg(configs.cfg_2)
+---
+...
+lua_gc.internal.bg_fiber ~= nil
+---
+- true
+...
+routers[7]:cfg(configs.cfg_2)
+---
+...
+vshard.router.internal.collect_lua_garbage_cnt == 0
+---
+- true
+...
+lua_gc.internal.bg_fiber == nil
+---
+- true
+...
+_ = test_run:cmd("switch default")
+---
+...
+test_run:cmd("stop server router_1")
+---
+- true
+...
+test_run:cmd("cleanup server router_1")
+---
+- true
+...
+test_run:drop_cluster(REPLICASET_1_1)
+---
+...
+test_run:drop_cluster(REPLICASET_1_2)
+---
+...
+test_run:drop_cluster(REPLICASET_2_1)
+---
+...
+test_run:drop_cluster(REPLICASET_2_2)
+---
+...
diff --git a/test/multiple_routers/multiple_routers.test.lua b/test/multiple_routers/multiple_routers.test.lua
new file mode 100644
index 0000000..760ad9f
--- /dev/null
+++ b/test/multiple_routers/multiple_routers.test.lua
@@ -0,0 +1,108 @@
+test_run = require('test_run').new()
+
+REPLICASET_1_1 = { 'storage_1_1_a', 'storage_1_1_b' }
+REPLICASET_1_2 = { 'storage_1_2_a', 'storage_1_2_b' }
+REPLICASET_2_1 = { 'storage_2_1_a', 'storage_2_1_b' }
+REPLICASET_2_2 = { 'storage_2_2_a', 'storage_2_2_b' }
+
+test_run:create_cluster(REPLICASET_1_1, 'multiple_routers')
+test_run:create_cluster(REPLICASET_1_2, 'multiple_routers')
+test_run:create_cluster(REPLICASET_2_1, 'multiple_routers')
+test_run:create_cluster(REPLICASET_2_2, 'multiple_routers')
+util = require('lua_libs.util')
+util.wait_master(test_run, REPLICASET_1_1, 'storage_1_1_a')
+util.wait_master(test_run, REPLICASET_1_2, 'storage_1_2_a')
+util.wait_master(test_run, REPLICASET_2_1, 'storage_2_1_a')
+util.wait_master(test_run, REPLICASET_2_2, 'storage_2_2_a')
+
+test_run:cmd("create server router_1 with script='multiple_routers/router_1.lua'")
+test_run:cmd("start server router_1")
+
+-- Configure default (static) router.
+_ = test_run:cmd("switch router_1")
+static_router = vshard.router.new('_static_router', configs.cfg_1)
+vshard.router.bootstrap()
+_ = test_run:cmd("switch storage_1_2_a")
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+_ = test_run:cmd("switch router_1")
+
+vshard.router.call(1, 'write', 'do_replace', {{1, 1}})
+vshard.router.call(1, 'read', 'do_select', {1})
+
+-- Test that static router is just a router object under the hood.
+static_router:route(1) == vshard.router.route(1)
+
+-- Configure extra router.
+router_2 = vshard.router.new('router_2', configs.cfg_2)
+router_2:bootstrap()
+_ = test_run:cmd("switch storage_2_2_a")
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+_ = test_run:cmd("switch router_1")
+
+router_2:call(1, 'write', 'do_replace', {{2, 2}})
+router_2:call(1, 'read', 'do_select', {2})
+-- Check that router_2 and the static router serve different clusters.
+#router_2:call(1, 'read', 'do_select', {1}) == 0
+
+-- Create several routers to the same cluster.
+routers = {}
+for i = 3, 10 do routers[i] = vshard.router.new('router_' .. i, configs.cfg_2) end
+routers[3]:call(1, 'read', 'do_select', {2})
+-- Check that they have their own background fibers.
+fiber_names = {}
+for i = 2, 10 do fiber_names['vshard.failover.router_' .. i] = true; fiber_names['vshard.discovery.router_' .. i] = true; end
+next(fiber_names) ~= nil
+fiber = require('fiber')
+for _, xfiber in pairs(fiber.info()) do fiber_names[xfiber.name] = nil end
+next(fiber_names) == nil
+
+-- Reconfiguration of one router does not affect the others.
+routers[3]:cfg(configs.cfg_1)
+routers[3]:call(1, 'read', 'do_select', {1})
+#routers[3]:call(1, 'read', 'do_select', {2}) == 0
+#routers[4]:call(1, 'read', 'do_select', {1}) == 0
+routers[4]:call(1, 'read', 'do_select', {2})
+routers[3]:cfg(configs.cfg_2)
+
+-- Try to create router with the same name.
+util = require('lua_libs.util')
+util.check_error(vshard.router.new, 'router_2', configs.cfg_2)
+
+-- Reload router module.
+_, old_rs_1 = next(vshard.router.internal.static_router.replicasets)
+_, old_rs_2 = next(router_2.replicasets)
+package.loaded['vshard.router'] = nil
+vshard.router = require('vshard.router')
+while not old_rs_1.is_outdated do fiber.sleep(0.01) end
+while not old_rs_2.is_outdated do fiber.sleep(0.01) end
+vshard.router.call(1, 'read', 'do_select', {1})
+router_2:call(1, 'read', 'do_select', {2})
+routers[5]:call(1, 'read', 'do_select', {2})
+
+-- Check lua_gc counter.
+lua_gc = require('vshard.lua_gc')
+vshard.router.internal.collect_lua_garbage_cnt == 0
+lua_gc.internal.bg_fiber == nil
+configs.cfg_2.collect_lua_garbage = true
+routers[5]:cfg(configs.cfg_2)
+lua_gc.internal.bg_fiber ~= nil
+routers[7]:cfg(configs.cfg_2)
+lua_gc.internal.bg_fiber ~= nil
+vshard.router.internal.collect_lua_garbage_cnt == 2
+package.loaded['vshard.router'] = nil
+vshard.router = require('vshard.router')
+vshard.router.internal.collect_lua_garbage_cnt == 2
+configs.cfg_2.collect_lua_garbage = nil
+routers[5]:cfg(configs.cfg_2)
+lua_gc.internal.bg_fiber ~= nil
+routers[7]:cfg(configs.cfg_2)
+vshard.router.internal.collect_lua_garbage_cnt == 0
+lua_gc.internal.bg_fiber == nil
+
+_ = test_run:cmd("switch default")
+test_run:cmd("stop server router_1")
+test_run:cmd("cleanup server router_1")
+test_run:drop_cluster(REPLICASET_1_1)
+test_run:drop_cluster(REPLICASET_1_2)
+test_run:drop_cluster(REPLICASET_2_1)
+test_run:drop_cluster(REPLICASET_2_2)
diff --git a/test/multiple_routers/router_1.lua b/test/multiple_routers/router_1.lua
new file mode 100644
index 0000000..2e9ea91
--- /dev/null
+++ b/test/multiple_routers/router_1.lua
@@ -0,0 +1,15 @@
+#!/usr/bin/env tarantool
+
+require('strict').on()
+
+-- Get instance name
+local fio = require('fio')
+local NAME = fio.basename(arg[0], '.lua')
+
+require('console').listen(os.getenv('ADMIN'))
+
+configs = require('configs')
+
+-- Start the database with sharding
+vshard = require('vshard')
+box.cfg{}
diff --git a/test/multiple_routers/storage_1_1_a.lua b/test/multiple_routers/storage_1_1_a.lua
new file mode 100644
index 0000000..b44a97a
--- /dev/null
+++ b/test/multiple_routers/storage_1_1_a.lua
@@ -0,0 +1,23 @@
+#!/usr/bin/env tarantool
+
+require('strict').on()
+
+-- Get instance name.
+local fio = require('fio')
+NAME = fio.basename(arg[0], '.lua')
+
+require('console').listen(os.getenv('ADMIN'))
+
+-- Fetch config for the cluster of the instance.
+if NAME:sub(9,9) == '1' then
+ cfg = require('configs').cfg_1
+else
+ cfg = require('configs').cfg_2
+end
+
+-- Start the database with sharding.
+vshard = require('vshard')
+vshard.storage.cfg(cfg, names[NAME])
+
+-- Bootstrap storage.
+require('lua_libs.bootstrap')
diff --git a/test/multiple_routers/storage_1_1_b.lua b/test/multiple_routers/storage_1_1_b.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_1_1_b.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_1_2_a.lua b/test/multiple_routers/storage_1_2_a.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_1_2_a.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_1_2_b.lua b/test/multiple_routers/storage_1_2_b.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_1_2_b.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_2_1_a.lua b/test/multiple_routers/storage_2_1_a.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_2_1_a.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_2_1_b.lua b/test/multiple_routers/storage_2_1_b.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_2_1_b.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_2_2_a.lua b/test/multiple_routers/storage_2_2_a.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_2_2_a.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_2_2_b.lua b/test/multiple_routers/storage_2_2_b.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_2_2_b.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/suite.ini b/test/multiple_routers/suite.ini
new file mode 100644
index 0000000..d2d4470
--- /dev/null
+++ b/test/multiple_routers/suite.ini
@@ -0,0 +1,6 @@
+[default]
+core = tarantool
+description = Multiple routers tests
+script = test.lua
+is_parallel = False
+lua_libs = ../lua_libs configs.lua
diff --git a/test/multiple_routers/test.lua b/test/multiple_routers/test.lua
new file mode 100644
index 0000000..cb7c1ee
--- /dev/null
+++ b/test/multiple_routers/test.lua
@@ -0,0 +1,9 @@
+#!/usr/bin/env tarantool
+
+require('strict').on()
+
+box.cfg{
+ listen = os.getenv("LISTEN"),
+}
+
+require('console').listen(os.getenv('ADMIN'))
diff --git a/test/router/exponential_timeout.result b/test/router/exponential_timeout.result
index fb54d0f..6748b64 100644
--- a/test/router/exponential_timeout.result
+++ b/test/router/exponential_timeout.result
@@ -37,10 +37,10 @@ test_run:cmd('switch router_1')
util = require('util')
---
...
-rs1 = vshard.router.internal.replicasets[replicasets[1]]
+rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]]
---
...
-rs2 = vshard.router.internal.replicasets[replicasets[2]]
+rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]]
---
...
util.collect_timeouts(rs1)
diff --git a/test/router/exponential_timeout.test.lua b/test/router/exponential_timeout.test.lua
index 3ec0b8c..75d85bf 100644
--- a/test/router/exponential_timeout.test.lua
+++ b/test/router/exponential_timeout.test.lua
@@ -13,8 +13,8 @@ test_run:cmd("start server router_1")
test_run:cmd('switch router_1')
util = require('util')
-rs1 = vshard.router.internal.replicasets[replicasets[1]]
-rs2 = vshard.router.internal.replicasets[replicasets[2]]
+rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]]
+rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]]
util.collect_timeouts(rs1)
util.collect_timeouts(rs2)
diff --git a/test/router/reconnect_to_master.result b/test/router/reconnect_to_master.result
index 5e678ce..d502723 100644
--- a/test/router/reconnect_to_master.result
+++ b/test/router/reconnect_to_master.result
@@ -76,7 +76,7 @@ _ = test_run:cmd('stop server storage_1_a')
_ = test_run:switch('router_1')
---
...
-reps = vshard.router.internal.replicasets
+reps = vshard.router.internal.static_router.replicasets
---
...
test_run:cmd("setopt delimiter ';'")
@@ -95,7 +95,7 @@ end;
...
function count_known_buckets()
local known_buckets = 0
- for _, id in pairs(vshard.router.internal.route_map) do
+ for _, id in pairs(vshard.router.internal.static_router.route_map) do
known_buckets = known_buckets + 1
end
return known_buckets
@@ -127,7 +127,7 @@ is_disconnected()
fiber = require('fiber')
---
...
-while vshard.router.internal.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end
+while vshard.router.internal.static_router.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end
---
...
vshard.router.info()
diff --git a/test/router/reconnect_to_master.test.lua b/test/router/reconnect_to_master.test.lua
index 39ba90e..8820fa7 100644
--- a/test/router/reconnect_to_master.test.lua
+++ b/test/router/reconnect_to_master.test.lua
@@ -34,7 +34,7 @@ _ = test_run:cmd('stop server storage_1_a')
_ = test_run:switch('router_1')
-reps = vshard.router.internal.replicasets
+reps = vshard.router.internal.static_router.replicasets
test_run:cmd("setopt delimiter ';'")
function is_disconnected()
for i, rep in pairs(reps) do
@@ -46,7 +46,7 @@ function is_disconnected()
end;
function count_known_buckets()
local known_buckets = 0
- for _, id in pairs(vshard.router.internal.route_map) do
+ for _, id in pairs(vshard.router.internal.static_router.route_map) do
known_buckets = known_buckets + 1
end
return known_buckets
@@ -63,7 +63,7 @@ is_disconnected()
-- Wait until replica is connected to test alerts on unavailable
-- master.
fiber = require('fiber')
-while vshard.router.internal.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end
+while vshard.router.internal.static_router.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end
vshard.router.info()
-- Return master.
diff --git a/test/router/reload.result b/test/router/reload.result
index f0badc3..98e8e71 100644
--- a/test/router/reload.result
+++ b/test/router/reload.result
@@ -229,7 +229,7 @@ vshard.router.cfg(cfg)
cfg.connection_outdate_delay = old_connection_delay
---
...
-vshard.router.internal.connection_outdate_delay = nil
+vshard.router.internal.static_router.connection_outdate_delay = nil
---
...
rs_new = vshard.router.route(1)
diff --git a/test/router/reload.test.lua b/test/router/reload.test.lua
index 528222a..293cb26 100644
--- a/test/router/reload.test.lua
+++ b/test/router/reload.test.lua
@@ -104,7 +104,7 @@ old_connection_delay = cfg.connection_outdate_delay
cfg.connection_outdate_delay = 0.3
vshard.router.cfg(cfg)
cfg.connection_outdate_delay = old_connection_delay
-vshard.router.internal.connection_outdate_delay = nil
+vshard.router.internal.static_router.connection_outdate_delay = nil
rs_new = vshard.router.route(1)
rs_old = rs
_, replica_old = next(rs_old.replicas)
diff --git a/test/router/reroute_wrong_bucket.result b/test/router/reroute_wrong_bucket.result
index 7f2a494..989dc79 100644
--- a/test/router/reroute_wrong_bucket.result
+++ b/test/router/reroute_wrong_bucket.result
@@ -98,7 +98,7 @@ vshard.router.call(100, 'read', 'customer_lookup', {1}, {timeout = 100})
---
- {'accounts': [], 'customer_id': 1, 'name': 'name'}
...
-vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]]
+vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]]
---
...
vshard.router.call(100, 'write', 'customer_add', {{customer_id = 2, bucket_id = 100, name = 'name2', accounts = {}}}, {timeout = 100})
@@ -146,13 +146,13 @@ test_run:switch('router_1')
...
-- Emulate a situation, when a replicaset_2 while is unknown for
-- router, but is already known for storages.
-save_rs2 = vshard.router.internal.replicasets[replicasets[2]]
+save_rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]]
---
...
-vshard.router.internal.replicasets[replicasets[2]] = nil
+vshard.router.internal.static_router.replicasets[replicasets[2]] = nil
---
...
-vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]]
+vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]]
---
...
fiber = require('fiber')
@@ -207,7 +207,7 @@ err
require('log').info(string.rep('a', 1000))
---
...
-vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]]
+vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]]
---
...
call_retval = nil
@@ -219,7 +219,7 @@ f = fiber.create(do_call, 100)
while not test_run:grep_log('router_1', 'please update configuration', 1000) do fiber.sleep(0.1) end
---
...
-vshard.router.internal.replicasets[replicasets[2]] = save_rs2
+vshard.router.internal.static_router.replicasets[replicasets[2]] = save_rs2
---
...
while not call_retval do fiber.sleep(0.1) end
diff --git a/test/router/reroute_wrong_bucket.test.lua b/test/router/reroute_wrong_bucket.test.lua
index 03384d1..a00f941 100644
--- a/test/router/reroute_wrong_bucket.test.lua
+++ b/test/router/reroute_wrong_bucket.test.lua
@@ -35,7 +35,7 @@ customer_add({customer_id = 1, bucket_id = 100, name = 'name', accounts = {}})
test_run:switch('router_1')
vshard.router.call(100, 'read', 'customer_lookup', {1}, {timeout = 100})
-vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]]
+vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]]
vshard.router.call(100, 'write', 'customer_add', {{customer_id = 2, bucket_id = 100, name = 'name2', accounts = {}}}, {timeout = 100})
-- Create cycle.
@@ -55,9 +55,9 @@ box.space._bucket:replace({100, vshard.consts.BUCKET.SENT, replicasets[2]})
test_run:switch('router_1')
-- Emulate a situation, when a replicaset_2 while is unknown for
-- router, but is already known for storages.
-save_rs2 = vshard.router.internal.replicasets[replicasets[2]]
-vshard.router.internal.replicasets[replicasets[2]] = nil
-vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]]
+save_rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]]
+vshard.router.internal.static_router.replicasets[replicasets[2]] = nil
+vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]]
fiber = require('fiber')
call_retval = nil
@@ -84,11 +84,11 @@ err
-- detect it and end with ok.
--
require('log').info(string.rep('a', 1000))
-vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]]
+vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]]
call_retval = nil
f = fiber.create(do_call, 100)
while not test_run:grep_log('router_1', 'please update configuration', 1000) do fiber.sleep(0.1) end
-vshard.router.internal.replicasets[replicasets[2]] = save_rs2
+vshard.router.internal.static_router.replicasets[replicasets[2]] = save_rs2
while not call_retval do fiber.sleep(0.1) end
call_retval
vshard.router.call(100, 'read', 'customer_lookup', {3}, {timeout = 1})
diff --git a/test/router/retry_reads.result b/test/router/retry_reads.result
index 64b0ff3..b803ae3 100644
--- a/test/router/retry_reads.result
+++ b/test/router/retry_reads.result
@@ -37,7 +37,7 @@ test_run:cmd('switch router_1')
util = require('util')
---
...
-rs1 = vshard.router.internal.replicasets[replicasets[1]]
+rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]]
---
...
min_timeout = vshard.consts.CALL_TIMEOUT_MIN
diff --git a/test/router/retry_reads.test.lua b/test/router/retry_reads.test.lua
index 2fb2fc7..510e961 100644
--- a/test/router/retry_reads.test.lua
+++ b/test/router/retry_reads.test.lua
@@ -13,7 +13,7 @@ test_run:cmd("start server router_1")
test_run:cmd('switch router_1')
util = require('util')
-rs1 = vshard.router.internal.replicasets[replicasets[1]]
+rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]]
min_timeout = vshard.consts.CALL_TIMEOUT_MIN
--
diff --git a/test/router/router.result b/test/router/router.result
* [tarantool-patches] [PATCH] Refactor config templates
2018-07-31 16:25 [tarantool-patches] [PATCH 0/3] multiple routers AKhatskevich
` (3 preceding siblings ...)
2018-08-01 14:30 ` [tarantool-patches] [PATCH] Check self arg passed for router objects AKhatskevich
@ 2018-08-03 20:07 ` AKhatskevich
2018-08-06 15:49 ` [tarantool-patches] " Vladislav Shpilevoy
4 siblings, 1 reply; 23+ messages in thread
From: AKhatskevich @ 2018-08-03 20:07 UTC (permalink / raw)
To: v.shpilevoy, tarantool-patches
Config templates are converted to dictionary format.
Before: format = {{field_name, description}}
After: format = {field_name = description}
This change is made for fast template lookups, which will be used
in further commits.
---
This is an extra commit, created especially for the
'Update only vshard part of a cfg on reload' patch.
vshard/cfg.lua | 64 ++++++++++++++++++++++++++++------------------------------
1 file changed, 31 insertions(+), 33 deletions(-)
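As a standalone illustration of why the dictionary format gives faster template lookups, here is a minimal sketch; the template contents below are simplified examples, not the actual vshard templates:

```lua
-- Illustrative sketch only: compares list-style and dictionary-style
-- template lookups. The field descriptions are made up for brevity.
local list_template = {
    {'uri', {type = 'string'}},
    {'name', {type = 'string'}},
}
local dict_template = {
    uri = {type = 'string'},
    name = {type = 'string'},
}

-- List format: a lookup by key requires a linear scan.
local function list_lookup(template, key)
    for _, key_template in pairs(template) do
        if key_template[1] == key then
            return key_template[2]
        end
    end
end

-- Dictionary format: a lookup is a single table index.
assert(list_lookup(list_template, 'name').type == 'string')
assert(dict_template.name.type == 'string')
```

Iteration still works in both formats (`pairs` over the dictionary yields key/description pairs directly), which is why `validate_config` in the patch becomes shorter.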
diff --git a/vshard/cfg.lua b/vshard/cfg.lua
index bba12cc..7c9ab77 100644
--- a/vshard/cfg.lua
+++ b/vshard/cfg.lua
@@ -43,9 +43,7 @@ local type_validate = {
}
local function validate_config(config, template, check_arg)
- for _, key_template in pairs(template) do
- local key = key_template[1]
- local template_value = key_template[2]
+ for key, template_value in pairs(template) do
local value = config[key]
if not value then
if not template_value.is_optional then
@@ -83,13 +81,13 @@ local function validate_config(config, template, check_arg)
end
local replica_template = {
- {'uri', {type = 'non-empty string', name = 'URI', check = check_uri}},
- {'name', {type = 'string', name = "Name", is_optional = true}},
- {'zone', {type = {'string', 'number'}, name = "Zone", is_optional = true}},
- {'master', {
+ uri = {type = 'non-empty string', name = 'URI', check = check_uri},
+ name = {type = 'string', name = "Name", is_optional = true},
+ zone = {type = {'string', 'number'}, name = "Zone", is_optional = true},
+ master = {
type = 'boolean', name = "Master", is_optional = true, default = false,
check = check_master
- }},
+ },
}
local function check_replicas(replicas)
@@ -100,12 +98,12 @@ local function check_replicas(replicas)
end
local replicaset_template = {
- {'replicas', {type = 'table', name = 'Replicas', check = check_replicas}},
- {'weight', {
+ replicas = {type = 'table', name = 'Replicas', check = check_replicas},
+ weight = {
type = 'non-negative number', name = 'Weight', is_optional = true,
default = 1,
- }},
- {'lock', {type = 'boolean', name = 'Lock', is_optional = true}},
+ },
+ lock = {type = 'boolean', name = 'Lock', is_optional = true},
}
--
@@ -177,50 +175,50 @@ local function check_sharding(sharding)
end
local cfg_template = {
- {'sharding', {type = 'table', name = 'Sharding', check = check_sharding}},
- {'weights', {
+ sharding = {type = 'table', name = 'Sharding', check = check_sharding},
+ weights = {
type = 'table', name = 'Weight matrix', is_optional = true,
check = cfg_check_weights
- }},
- {'shard_index', {
+ },
+ shard_index = {
type = {'non-empty string', 'non-negative integer'},
name = 'Shard index', is_optional = true, default = 'bucket_id',
- }},
- {'zone', {
+ },
+ zone = {
type = {'string', 'number'}, name = 'Zone identifier',
is_optional = true
- }},
- {'bucket_count', {
+ },
+ bucket_count = {
type = 'positive integer', name = 'Bucket count', is_optional = true,
default = consts.DEFAULT_BUCKET_COUNT
- }},
- {'rebalancer_disbalance_threshold', {
+ },
+ rebalancer_disbalance_threshold = {
type = 'non-negative number', name = 'Rebalancer disbalance threshold',
is_optional = true,
default = consts.DEFAULT_REBALANCER_DISBALANCE_THRESHOLD
- }},
- {'rebalancer_max_receiving', {
+ },
+ rebalancer_max_receiving = {
type = 'positive integer',
name = 'Rebalancer max receiving bucket count', is_optional = true,
default = consts.DEFAULT_REBALANCER_MAX_RECEIVING
- }},
- {'collect_bucket_garbage_interval', {
+ },
+ collect_bucket_garbage_interval = {
type = 'positive number', name = 'Garbage bucket collect interval',
is_optional = true,
default = consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
- }},
- {'collect_lua_garbage', {
+ },
+ collect_lua_garbage = {
type = 'boolean', name = 'Garbage Lua collect necessity',
is_optional = true, default = false
- }},
- {'sync_timeout', {
+ },
+ sync_timeout = {
type = 'non-negative number', name = 'Sync timeout', is_optional = true,
default = consts.DEFAULT_SYNC_TIMEOUT
- }},
- {'connection_outdate_delay', {
+ },
+ connection_outdate_delay = {
type = 'non-negative number', name = 'Object outdate timeout',
is_optional = true
- }},
+ },
}
--
--
2.14.1
* [tarantool-patches] Re: [PATCH] Refactor config templates
2018-08-03 20:07 ` [tarantool-patches] [PATCH] Refactor config templates AKhatskevich
@ 2018-08-06 15:49 ` Vladislav Shpilevoy
0 siblings, 0 replies; 23+ messages in thread
From: Vladislav Shpilevoy @ 2018-08-06 15:49 UTC (permalink / raw)
To: tarantool-patches, AKhatskevich
Thanks for the patch! LGTM and pushed.
* [tarantool-patches] Re: [PATCH 1/3] Update only vshard part of a cfg on reload
2018-08-03 20:03 ` Alex Khatskevich
@ 2018-08-06 17:03 ` Vladislav Shpilevoy
2018-08-07 13:19 ` Alex Khatskevich
0 siblings, 1 reply; 23+ messages in thread
From: Vladislav Shpilevoy @ 2018-08-06 17:03 UTC (permalink / raw)
To: Alex Khatskevich, tarantool-patches
Thanks for the patch! See 3 comments below.
> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
> index 102b942..40216ea 100644
> --- a/vshard/storage/init.lua
> +++ b/vshard/storage/init.lua
> @@ -1553,18 +1553,19 @@ local function storage_cfg(cfg, this_replica_uuid)
> --
> -- If a master role of the replica is not changed, then
> -- 'read_only' can be set right here.
> - cfg.listen = cfg.listen or this_replica.uri
> - if cfg.replication == nil and this_replicaset.master and not is_master then
> - cfg.replication = {this_replicaset.master.uri}
> + box_cfg.listen = box_cfg.listen or this_replica.uri
> + if box_cfg.replication == nil and this_replicaset.master
> + and not is_master then
> + box_cfg.replication = {this_replicaset.master.uri}
> else
> - cfg.replication = {}
> + box_cfg.replication = {}
> end
> if was_master == is_master then
> - cfg.read_only = not is_master
> + box_cfg.read_only = not is_master
> end
> if type(box.cfg) == 'function' then
> - cfg.instance_uuid = this_replica.uuid
> - cfg.replicaset_uuid = this_replicaset.uuid
> + box_cfg.instance_uuid = this_replica.uuid
> + box_cfg.replicaset_uuid = this_replicaset.uuid
1. All these box_cfg manipulations should be done under 'if not is_reload'
I think.
> else
> local info = box.info
> if this_replica_uuid ~= info.uuid then
> @@ -1607,27 +1610,27 @@ local function storage_cfg(cfg, this_replica_uuid)
> local_on_master_enable_prepare()
> end
>
> - local box_cfg = table.copy(cfg)
> - lcfg.remove_non_box_options(box_cfg)
> - local ok, err = pcall(box.cfg, box_cfg)
> - while M.errinj.ERRINJ_CFG_DELAY do
> - lfiber.sleep(0.01)
> - end
> - if not ok then
> - M.sync_timeout = old_sync_timeout
> - if was_master and not is_master then
> - local_on_master_disable_abort()
> + if not is_reload then
> + local ok, err = true, nil
> + ok, err = pcall(box.cfg, box_cfg)
2. Why do you need to declare 'local ok, err' before
their use on the next line?
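The idiomatic form the reviewer is pointing at is to declare and assign in a single statement, since `pcall` already returns both values. A minimal standalone sketch (`might_fail` is a made-up helper, not vshard code):

```lua
-- Hedged sketch: there is no need for a separate
-- 'local ok, err = true, nil' declaration before a pcall.
local function might_fail(x)
    if not x then
        error('boom')
    end
    return x
end

-- Declare and assign in one statement.
local ok, err = pcall(might_fail, nil)
assert(ok == false)
assert(type(err) == 'string')
```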
> + while M.errinj.ERRINJ_CFG_DELAY do
> + lfiber.sleep(0.01)
> end
> - if not was_master and is_master then
> - local_on_master_enable_abort()
> + if not ok then
> + M.sync_timeout = old_sync_timeout
> + if was_master and not is_master then
> + local_on_master_disable_abort()
> + end
> + if not was_master and is_master then
> + local_on_master_enable_abort()
> + end
> + error(err)
> end
> - error(err)
> + log.info("Box has been configured")
> + local uri = luri.parse(this_replica.uri)
> + box.once("vshard:storage:1", storage_schema_v1, uri.login, uri.password)
> end
>
> - log.info("Box has been configured")
> - local uri = luri.parse(this_replica.uri)
> - box.once("vshard:storage:1", storage_schema_v1, uri.login, uri.password)
> -
> lreplicaset.rebind_replicasets(new_replicasets, M.replicasets)
> lreplicaset.outdate_replicasets(M.replicasets)
> M.replicasets = new_replicasets
> @@ -1874,7 +1877,7 @@ if not rawget(_G, MODULE_INTERNALS) then
> rawset(_G, MODULE_INTERNALS, M)
> else
> reload_evolution.upgrade(M)
> - storage_cfg(M.current_cfg, M.this_replica.uuid)
> + storage_cfg(M.current_cfg, M.this_replica.uuid, true)
3. I see that you have stored vshard_cfg in M.current_cfg, not a full
config, so it does not have any box options. This raises a question:
why do you need to separate reload from non-reload, if reload in such
an implementation is like a 'box.cfg{}' call with no parameters?
And if you do not store box_cfg options, how are you going to compare
configs when we implement atomic cfg over the cluster?
> M.module_version = M.module_version + 1
> end
>
* [tarantool-patches] Re: [PATCH 2/3] Move lua gc to a dedicated module
2018-08-03 20:04 ` Alex Khatskevich
@ 2018-08-06 17:03 ` Vladislav Shpilevoy
2018-08-08 11:17 ` Vladislav Shpilevoy
1 sibling, 0 replies; 23+ messages in thread
From: Vladislav Shpilevoy @ 2018-08-06 17:03 UTC (permalink / raw)
To: tarantool-patches, Alex Khatskevich
Thanks for the patch! It is LGTM, but I cannot push it since it
depends on the previous one (cherry-pick shows conflicts).
On 03/08/2018 23:04, Alex Khatskevich wrote:
>
> On 01.08.2018 21:43, Vladislav Shpilevoy wrote:
>> Thanks for the patch! See 4 comments below.
>>
>> On 31/07/2018 19:25, AKhatskevich wrote:
>>> `vshard.lua_gc.lua` is a new module which helps make gc work more
>>> intensively.
>>> Before this commit it was a duty of the router and storage.
>>>
>>> Reasons to move lua gc to a separate module:
>>> 1. It is not a duty of vshard to collect garbage, so let gc fiber
>>> be as far from vshard as possible.
>>> 2. Next commits will introduce the multiple routers feature, which
>>> requires the gc fiber to be a singleton.
>>>
>>> Closes #138
>>> ---
>>> test/router/garbage_collector.result | 27 +++++++++++------
>>> test/router/garbage_collector.test.lua | 18 ++++++-----
>>> test/storage/garbage_collector.result | 27 +++++++++--------
>>> test/storage/garbage_collector.test.lua | 22 ++++++--------
>>> vshard/lua_gc.lua | 54 +++++++++++++++++++++++++++++++++
>>> vshard/router/init.lua | 19 +++---------
>>> vshard/storage/init.lua | 20 ++++--------
>>> 7 files changed, 116 insertions(+), 71 deletions(-)
>>> create mode 100644 vshard/lua_gc.lua
>>>
>>> diff --git a/test/router/garbage_collector.result b/test/router/garbage_collector.result
>>> index 3c2a4f1..a7474fc 100644
>>> --- a/test/router/garbage_collector.result
>>> +++ b/test/router/garbage_collector.result
>>> @@ -40,27 +40,30 @@ test_run:switch('router_1')
>>> fiber = require('fiber')
>>> ---
>>> ...
>>> -cfg.collect_lua_garbage = true
>>> +lua_gc = require('vshard.lua_gc')
>>> ---
>>> ...
>>> -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DISCOVERY_INTERVAL
>>> +cfg.collect_lua_garbage = true
>>
>> 1. Now this code tests nothing but fibers. Below you do a wakeup
>> and check that the iteration counter is increased, but that is an
>> obvious thing. Before your patch the test really checked that GC is
>> called, by verifying that weak references were nullified. Now I can
>> remove collectgarbage() from the main_loop and nothing would change.
>> Please, make this test a real test.
> GC test returned back.
>>
>> Moreover, the test hangs forever both locally and on Travis.
> Fixed
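For reference, the weak-reference technique the reviewer describes can be shown in isolation: a value kept only through a table with `__mode = 'v'` disappears after a full GC cycle, which is what proves `collectgarbage()` really ran. A standalone sketch, not the test code itself:

```lua
-- A weak-valued table: its values do not keep objects alive.
local a = setmetatable({}, {__mode = 'v'})
a.k = {b = 100}  -- only this weak reference keeps {b = 100} alive

-- Run full GC cycles; the weakly referenced table is collected.
collectgarbage('collect')
collectgarbage('collect')
assert(a.k == nil)
```

This is stronger than counting iterations: if `collectgarbage()` were removed from the main loop, `a.k` would stay non-nil and the test would fail.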
>>
>>> diff --git a/test/storage/garbage_collector.result b/test/storage/garbage_collector.result
>>> index 3588fb4..d94ba24 100644
>>> --- a/test/storage/garbage_collector.result
>>> +++ b/test/storage/garbage_collector.result
>>
>> 2. Same. Now the test passes even if I removed collectgarbage() from
>> the main loop.
> returned.
>>
>>> diff --git a/vshard/lua_gc.lua b/vshard/lua_gc.lua
>>> new file mode 100644
>>> index 0000000..8d6af3e
>>> --- /dev/null
>>> +++ b/vshard/lua_gc.lua
>>> @@ -0,0 +1,54 @@
>>> +--
>>> +-- This module implements a background Lua GC fiber.
>>> +-- Its purpose is to make GC more aggressive.
>>> +--
>>> +
>>> +local lfiber = require('fiber')
>>> +local MODULE_INTERNALS = '__module_vshard_lua_gc'
>>> +
>>> +local M = rawget(_G, MODULE_INTERNALS)
>>> +if not M then
>>> + M = {
>>> + -- Background fiber.
>>> + bg_fiber = nil,
>>> + -- GC interval in seconds.
>>> + interval = nil,
>>> + -- Main loop.
>>> + -- Stored here to make the fiber reloadable.
>>> + main_loop = nil,
>>> + -- Number of `collectgarbage()` calls.
>>> + iterations = 0,
>>> + }
>>> +end
>>> +local DEFAULT_INTERVAL = 100
>>
>> 3. For constants please use vshard.consts.
>>
>> 4. You should not choose interval inside the main_loop.
>> Please, use 'default' option in cfg.lua.
> DEFAULT_INTERVAL has been removed entirely.
> The interval value is now required.
>
>
>
> full diff
>
>
>
> commit ec221bd060f46e4dc009eaab1c6c1bd1cf5a4150
> Author: AKhatskevich <avkhatskevich@tarantool.org>
> Date: Thu Jul 26 01:17:00 2018 +0300
>
> Move lua gc to a dedicated module
>
> `vshard.lua_gc.lua` is a new module which helps make gc work more
> intensively.
> Before this commit it was a duty of the router and storage.
>
> Reasons to move lua gc to a separate module:
> 1. It is not a duty of vshard to collect garbage, so let gc fiber
> be as far from vshard as possible.
> 2. Next commits will introduce the multiple routers feature, which
> requires the gc fiber to be a singleton.
>
> Closes #138
>
> diff --git a/test/router/garbage_collector.result b/test/router/garbage_collector.result
> index 3c2a4f1..7780046 100644
> --- a/test/router/garbage_collector.result
> +++ b/test/router/garbage_collector.result
> @@ -40,41 +40,59 @@ test_run:switch('router_1')
> fiber = require('fiber')
> ---
> ...
> -cfg.collect_lua_garbage = true
> +lua_gc = require('vshard.lua_gc')
> ---
> ...
> -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DISCOVERY_INTERVAL
> +cfg.collect_lua_garbage = true
> ---
> ...
> vshard.router.cfg(cfg)
> ---
> ...
> +lua_gc.internal.bg_fiber ~= nil
> +---
> +- true
> +...
> +-- Check that `collectgarbage()` was really called.
> a = setmetatable({}, {__mode = 'v'})
> ---
> ...
> a.k = {b = 100}
> ---
> ...
> -for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end
> +iterations = lua_gc.internal.iterations
> +---
> +...
> +lua_gc.internal.bg_fiber:wakeup()
> +---
> +...
> +while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end
> ---
> ...
> a.k
> ---
> - null
> ...
> +lua_gc.internal.interval = 0.001
> +---
> +...
> cfg.collect_lua_garbage = false
> ---
> ...
> vshard.router.cfg(cfg)
> ---
> ...
> -a.k = {b = 100}
> +lua_gc.internal.bg_fiber == nil
> +---
> +- true
> +...
> +iterations = lua_gc.internal.iterations
> ---
> ...
> -for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end
> +fiber.sleep(0.01)
> ---
> ...
> -a.k ~= nil
> +iterations == lua_gc.internal.iterations
> ---
> - true
> ...
> diff --git a/test/router/garbage_collector.test.lua b/test/router/garbage_collector.test.lua
> index b3411cd..e8d0876 100644
> --- a/test/router/garbage_collector.test.lua
> +++ b/test/router/garbage_collector.test.lua
> @@ -13,18 +13,24 @@ test_run:cmd("start server router_1")
> --
> test_run:switch('router_1')
> fiber = require('fiber')
> +lua_gc = require('vshard.lua_gc')
> cfg.collect_lua_garbage = true
> -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DISCOVERY_INTERVAL
> vshard.router.cfg(cfg)
> +lua_gc.internal.bg_fiber ~= nil
> +-- Check that `collectgarbage()` was really called.
> a = setmetatable({}, {__mode = 'v'})
> a.k = {b = 100}
> -for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end
> +iterations = lua_gc.internal.iterations
> +lua_gc.internal.bg_fiber:wakeup()
> +while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end
> a.k
> +lua_gc.internal.interval = 0.001
> cfg.collect_lua_garbage = false
> vshard.router.cfg(cfg)
> -a.k = {b = 100}
> -for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end
> -a.k ~= nil
> +lua_gc.internal.bg_fiber == nil
> +iterations = lua_gc.internal.iterations
> +fiber.sleep(0.01)
> +iterations == lua_gc.internal.iterations
>
> test_run:switch("default")
> test_run:cmd("stop server router_1")
> diff --git a/test/storage/garbage_collector.result b/test/storage/garbage_collector.result
> index 3588fb4..6bec2db 100644
> --- a/test/storage/garbage_collector.result
> +++ b/test/storage/garbage_collector.result
> @@ -120,7 +120,7 @@ test_run:switch('storage_1_a')
> fiber = require('fiber')
> ---
> ...
> -log = require('log')
> +lua_gc = require('vshard.lua_gc')
> ---
> ...
> cfg.collect_lua_garbage = true
> @@ -129,38 +129,50 @@ cfg.collect_lua_garbage = true
> vshard.storage.cfg(cfg, names.storage_1_a)
> ---
> ...
> --- Create a weak reference to a table {b = 100} - it must be
> --- deleted on the next GC.
> +lua_gc.internal.bg_fiber ~= nil
> +---
> +- true
> +...
> +-- Check that `collectgarbage()` was really called.
> a = setmetatable({}, {__mode = 'v'})
> ---
> ...
> a.k = {b = 100}
> ---
> ...
> -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
> +iterations = lua_gc.internal.iterations
> ---
> ...
> --- Wait until Lua GC deletes a.k.
> -for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
> +lua_gc.internal.bg_fiber:wakeup()
> +---
> +...
> +while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end
> ---
> ...
> a.k
> ---
> - null
> ...
> +lua_gc.internal.interval = 0.001
> +---
> +...
> cfg.collect_lua_garbage = false
> ---
> ...
> vshard.storage.cfg(cfg, names.storage_1_a)
> ---
> ...
> -a.k = {b = 100}
> +lua_gc.internal.bg_fiber == nil
> +---
> +- true
> +...
> +iterations = lua_gc.internal.iterations
> ---
> ...
> -for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
> +fiber.sleep(0.01)
> ---
> ...
> -a.k ~= nil
> +iterations == lua_gc.internal.iterations
> ---
> - true
> ...
> diff --git a/test/storage/garbage_collector.test.lua b/test/storage/garbage_collector.test.lua
> index 79e76d8..407b8a1 100644
> --- a/test/storage/garbage_collector.test.lua
> +++ b/test/storage/garbage_collector.test.lua
> @@ -46,22 +46,24 @@ customer:select{}
> --
> test_run:switch('storage_1_a')
> fiber = require('fiber')
> -log = require('log')
> +lua_gc = require('vshard.lua_gc')
> cfg.collect_lua_garbage = true
> vshard.storage.cfg(cfg, names.storage_1_a)
> --- Create a weak reference to a table {b = 100} - it must be
> --- deleted on the next GC.
> +lua_gc.internal.bg_fiber ~= nil
> +-- Check that `collectgarbage()` was really called.
> a = setmetatable({}, {__mode = 'v'})
> a.k = {b = 100}
> -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
> --- Wait until Lua GC deletes a.k.
> -for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
> +iterations = lua_gc.internal.iterations
> +lua_gc.internal.bg_fiber:wakeup()
> +while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end
> a.k
> +lua_gc.internal.interval = 0.001
> cfg.collect_lua_garbage = false
> vshard.storage.cfg(cfg, names.storage_1_a)
> -a.k = {b = 100}
> -for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
> -a.k ~= nil
> +lua_gc.internal.bg_fiber == nil
> +iterations = lua_gc.internal.iterations
> +fiber.sleep(0.01)
> +iterations == lua_gc.internal.iterations
>
> test_run:switch('default')
> test_run:drop_cluster(REPLICASET_2)
> diff --git a/vshard/lua_gc.lua b/vshard/lua_gc.lua
> new file mode 100644
> index 0000000..c6c5cd3
> --- /dev/null
> +++ b/vshard/lua_gc.lua
> @@ -0,0 +1,54 @@
> +--
> +-- This module implements a background Lua GC fiber.
> +-- Its purpose is to make GC more aggressive.
> +--
> +
> +local lfiber = require('fiber')
> +local MODULE_INTERNALS = '__module_vshard_lua_gc'
> +
> +local M = rawget(_G, MODULE_INTERNALS)
> +if not M then
> + M = {
> + -- Background fiber.
> + bg_fiber = nil,
> + -- GC interval in seconds.
> + interval = nil,
> + -- Main loop.
> + -- Stored here to make the fiber reloadable.
> + main_loop = nil,
> + -- Number of `collectgarbage()` calls.
> + iterations = 0,
> + }
> +end
> +
> +M.main_loop = function()
> + lfiber.sleep(M.interval)
> + collectgarbage()
> + M.iterations = M.iterations + 1
> + return M.main_loop()
> +end
> +
> +local function set_state(active, interval)
> + assert(type(interval) == 'number')
> + M.interval = interval
> + if active and not M.bg_fiber then
> + M.bg_fiber = lfiber.create(M.main_loop)
> + M.bg_fiber:name('vshard.lua_gc')
> + end
> + if not active and M.bg_fiber then
> + M.bg_fiber:cancel()
> + M.bg_fiber = nil
> + end
> + if active then
> + M.bg_fiber:wakeup()
> + end
> +end
> +
> +if not rawget(_G, MODULE_INTERNALS) then
> + rawset(_G, MODULE_INTERNALS, M)
> +end
> +
> +return {
> + set_state = set_state,
> + internal = M,
> +}
> diff --git a/vshard/router/init.lua b/vshard/router/init.lua
> index e2b2b22..3e127cb 100644
> --- a/vshard/router/init.lua
> +++ b/vshard/router/init.lua
> @@ -7,6 +7,7 @@ if rawget(_G, MODULE_INTERNALS) then
> local vshard_modules = {
> 'vshard.consts', 'vshard.error', 'vshard.cfg',
> 'vshard.hash', 'vshard.replicaset', 'vshard.util',
> + 'vshard.lua_gc',
> }
> for _, module in pairs(vshard_modules) do
> package.loaded[module] = nil
> @@ -18,6 +19,7 @@ local lcfg = require('vshard.cfg')
> local lhash = require('vshard.hash')
> local lreplicaset = require('vshard.replicaset')
> local util = require('vshard.util')
> +local lua_gc = require('vshard.lua_gc')
>
> local M = rawget(_G, MODULE_INTERNALS)
> if not M then
> @@ -43,8 +45,7 @@ if not M then
> discovery_fiber = nil,
> -- Bucket count stored on all replicasets.
> total_bucket_count = 0,
> - -- If true, then discovery fiber starts to call
> - -- collectgarbage() periodically.
> + -- Boolean lua_gc state (create periodic gc task).
> collect_lua_garbage = nil,
> -- This counter is used to restart background fibers with
> -- new reloaded code.
> @@ -151,8 +152,6 @@ end
> --
> local function discovery_f()
> local module_version = M.module_version
> - local iterations_until_lua_gc =
> - consts.COLLECT_LUA_GARBAGE_INTERVAL / consts.DISCOVERY_INTERVAL
> while module_version == M.module_version do
> while not next(M.replicasets) do
> lfiber.sleep(consts.DISCOVERY_INTERVAL)
> @@ -188,12 +187,6 @@ local function discovery_f()
> M.route_map[bucket_id] = replicaset
> end
> end
> - iterations_until_lua_gc = iterations_until_lua_gc - 1
> - if M.collect_lua_garbage and iterations_until_lua_gc == 0 then
> - iterations_until_lua_gc =
> - consts.COLLECT_LUA_GARBAGE_INTERVAL / consts.DISCOVERY_INTERVAL
> - collectgarbage()
> - end
> lfiber.sleep(consts.DISCOVERY_INTERVAL)
> end
> end
> @@ -504,7 +497,6 @@ local function router_cfg(cfg)
> end
> local new_replicasets = lreplicaset.buildall(vshard_cfg)
> local total_bucket_count = vshard_cfg.bucket_count
> - local collect_lua_garbage = vshard_cfg.collect_lua_garbage
> log.info("Calling box.cfg()...")
> for k, v in pairs(box_cfg) do
> log.info({[k] = v})
> @@ -531,7 +523,7 @@ local function router_cfg(cfg)
> vshard_cfg.connection_outdate_delay)
> M.connection_outdate_delay = vshard_cfg.connection_outdate_delay
> M.total_bucket_count = total_bucket_count
> - M.collect_lua_garbage = collect_lua_garbage
> + M.collect_lua_garbage = vshard_cfg.collect_lua_garbage
> M.current_cfg = vshard_cfg
> M.replicasets = new_replicasets
> -- Update existing route map in-place.
> @@ -548,8 +540,7 @@ local function router_cfg(cfg)
> M.discovery_fiber = util.reloadable_fiber_create(
> 'vshard.discovery', M, 'discovery_f')
> end
> - -- Destroy connections, not used in a new configuration.
> - collectgarbage()
> + lua_gc.set_state(M.collect_lua_garbage, consts.COLLECT_LUA_GARBAGE_INTERVAL)
> end
>
> --------------------------------------------------------------------------------
> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
> index 40216ea..3e29e9d 100644
> --- a/vshard/storage/init.lua
> +++ b/vshard/storage/init.lua
> @@ -10,7 +10,8 @@ if rawget(_G, MODULE_INTERNALS) then
> local vshard_modules = {
> 'vshard.consts', 'vshard.error', 'vshard.cfg',
> 'vshard.replicaset', 'vshard.util',
> - 'vshard.storage.reload_evolution'
> + 'vshard.storage.reload_evolution',
> + 'vshard.lua_gc',
> }
> for _, module in pairs(vshard_modules) do
> package.loaded[module] = nil
> @@ -21,6 +22,7 @@ local lerror = require('vshard.error')
> local lcfg = require('vshard.cfg')
> local lreplicaset = require('vshard.replicaset')
> local util = require('vshard.util')
> +local lua_gc = require('vshard.lua_gc')
> local reload_evolution = require('vshard.storage.reload_evolution')
>
> local M = rawget(_G, MODULE_INTERNALS)
> @@ -75,8 +77,7 @@ if not M then
> collect_bucket_garbage_fiber = nil,
> -- Do buckets garbage collection once per this time.
> collect_bucket_garbage_interval = nil,
> - -- If true, then bucket garbage collection fiber starts to
> - -- call collectgarbage() periodically.
> + -- Boolean lua_gc state (create periodic gc task).
> collect_lua_garbage = nil,
>
> -------------------- Bucket recovery ---------------------
> @@ -1063,9 +1064,6 @@ function collect_garbage_f()
> -- buckets_for_redirect is deleted, it gets empty_sent_buckets
> -- for next deletion.
> local empty_sent_buckets = {}
> - local iterations_until_lua_gc =
> - consts.COLLECT_LUA_GARBAGE_INTERVAL / M.collect_bucket_garbage_interval
> -
> while M.module_version == module_version do
> -- Check if no changes in buckets configuration.
> if control.bucket_generation_collected ~= control.bucket_generation then
> @@ -1106,12 +1104,6 @@ function collect_garbage_f()
> end
> end
> ::continue::
> - iterations_until_lua_gc = iterations_until_lua_gc - 1
> - if iterations_until_lua_gc == 0 and M.collect_lua_garbage then
> - iterations_until_lua_gc = consts.COLLECT_LUA_GARBAGE_INTERVAL /
> - M.collect_bucket_garbage_interval
> - collectgarbage()
> - end
> lfiber.sleep(M.collect_bucket_garbage_interval)
> end
> end
> @@ -1586,7 +1578,6 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload)
> local shard_index = vshard_cfg.shard_index
> local collect_bucket_garbage_interval =
> vshard_cfg.collect_bucket_garbage_interval
> - local collect_lua_garbage = vshard_cfg.collect_lua_garbage
>
> -- It is considered that all possible errors during cfg
> -- process occur only before this place.
> @@ -1641,7 +1632,7 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload)
> M.rebalancer_max_receiving = rebalancer_max_receiving
> M.shard_index = shard_index
> M.collect_bucket_garbage_interval = collect_bucket_garbage_interval
> - M.collect_lua_garbage = collect_lua_garbage
> + M.collect_lua_garbage = vshard_cfg.collect_lua_garbage
> M.current_cfg = vshard_cfg
>
> if was_master and not is_master then
> @@ -1666,6 +1657,7 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload)
> M.rebalancer_fiber:cancel()
> M.rebalancer_fiber = nil
> end
> + lua_gc.set_state(M.collect_lua_garbage, consts.COLLECT_LUA_GARBAGE_INTERVAL)
> -- Destroy connections, not used in a new configuration.
> collectgarbage()
> end
>
>
* [tarantool-patches] Re: [PATCH 3/3] Introduce multiple routers feature
2018-08-03 20:05 ` Alex Khatskevich
@ 2018-08-06 17:03 ` Vladislav Shpilevoy
2018-08-07 13:18 ` Alex Khatskevich
0 siblings, 1 reply; 23+ messages in thread
From: Vladislav Shpilevoy @ 2018-08-06 17:03 UTC (permalink / raw)
To: Alex Khatskevich, tarantool-patches
Thanks for the patch! See 8 comments below.
1. You did not send a full diff. There are tests only. (In
this email I pasted it myself.) Please, send a full diff
next time.
>>> + if M.routers[name] then
>>> + return nil, string.format('Router with name %s already exists', name)
>>> + end
>>> + local router = table.deepcopy(ROUTER_TEMPLATE)
>>> + setmetatable(router, router_mt)
>>> + router.name = name
>>> + M.routers[name] = router
>>> + if name == STATIC_ROUTER_NAME then
>>> + M.static_router = router
>>> + export_static_router_attributes()
>>> + end
>>
>> 9. This check can be removed if you move
>> export_static_router_attributes call into legacy_cfg.
> But due to this 'if', the static router can be configured by
> `vshard.router.new(static_router_name)`.
2. It is not ok. A user should not use any internal names like
_static_router to configure and get it. Please, add a new member
vshard.router.static that references the static one. Until cfg
is called it is nil.
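A minimal sketch of what the reviewer suggests, with stubbed names; `router_new`, the `_static_router` name, and the config contents here are illustrative, not the actual vshard implementation:

```lua
-- Stand-ins for the vshard.router module internals.
local M = {routers = {}, static_router = nil}
local module = {}  -- stands in for the public vshard.router table

-- Create a plain named router object.
local function router_new(name, cfg)
    local router = {name = name, current_cfg = cfg}
    M.routers[name] = router
    return router
end

-- Legacy cfg: configures the static router and exposes it as
-- module.static, so no magic internal name leaks to the user.
local function cfg(user_cfg)
    M.static_router = router_new('_static_router', user_cfg)
    module.static = M.static_router
end

assert(module.static == nil)        -- nil until cfg() is called
cfg({bucket_count = 3000})
assert(module.static ~= nil)
assert(module.static.current_cfg.bucket_count == 3000)
```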
> diff --git a/vshard/router/init.lua b/vshard/router/init.lua
> index 3e127cb..62fdcda 100644
> --- a/vshard/router/init.lua
> +++ b/vshard/router/init.lua
> @@ -25,14 +25,32 @@ local M = rawget(_G, MODULE_INTERNALS)
> if not M then
> M = {
> ---------------- Common module attributes ----------------
> - -- The last passed configuration.
> - current_cfg = nil,
> errinj = {
> ERRINJ_CFG = false,
> ERRINJ_FAILOVER_CHANGE_CFG = false,
> ERRINJ_RELOAD = false,
> ERRINJ_LONG_DISCOVERY = false,
> },
> + -- Dictionary, key is router name, value is a router.
> + routers = {},
> + -- Router object which can be accessed by old api:
> + -- e.g. vshard.router.call(...)
> + static_router = nil,
> + -- This counter is used to restart background fibers with
> + -- new reloaded code.
> + module_version = 0,
> + collect_lua_garbage_cnt = 0,
3. A comment?
> + }
> +end
> +
> +--
> +-- Router object attributes.
> +--
> +local ROUTER_TEMPLATE = {
> + -- Name of router.
> + name = nil,
> + -- The last passed configuration.
> + current_cfg = nil,
> -- Time to outdate old objects on reload.
> connection_outdate_delay = nil,
> -- Bucket map cache.
> @@ -488,8 +505,20 @@
> -- Configuration
> --------------------------------------------------------------------------------
>
> -local function router_cfg(cfg)
> - local vshard_cfg, box_cfg = lcfg.check(cfg, M.current_cfg)
> +local function change_lua_gc_cnt(val)
4. The same.
> + assert(M.collect_lua_garbage_cnt >= 0)
> + local prev_cnt = M.collect_lua_garbage_cnt
> + M.collect_lua_garbage_cnt = M.collect_lua_garbage_cnt + val
> + if prev_cnt == 0 and M.collect_lua_garbage_cnt > 0 then
> + lua_gc.set_state(true, consts.COLLECT_LUA_GARBAGE_INTERVAL)
> + end
> + if prev_cnt > 0 and M.collect_lua_garbage_cnt == 0 then
> + lua_gc.set_state(false, consts.COLLECT_LUA_GARBAGE_INTERVAL)
> + end
5. You always know the concrete val in the caller: 1 or -1. I think
it would look simpler if you split this function into separate inc
and dec ones. The former checks for prev == 0 and new > 0, the latter
checks for prev > 0 and new == 0. There is no need to check both
each time.
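A standalone sketch of the suggested inc/dec split. The real patch uses `M.collect_lua_garbage_cnt` and `lua_gc.set_state()`; both are stubbed here so the reference-counting logic can run on its own:

```lua
-- Stub for lua_gc.set_state(): records the requested state.
local INTERVAL = 100
local gc_state = {active = false}
local function set_state(active, interval)
    gc_state.active, gc_state.interval = active, interval
end

local M = {collect_lua_garbage_cnt = 0}

local function gc_cnt_inc()
    M.collect_lua_garbage_cnt = M.collect_lua_garbage_cnt + 1
    -- Only the 0 -> 1 transition starts the background fiber.
    if M.collect_lua_garbage_cnt == 1 then
        set_state(true, INTERVAL)
    end
end

local function gc_cnt_dec()
    assert(M.collect_lua_garbage_cnt > 0)
    M.collect_lua_garbage_cnt = M.collect_lua_garbage_cnt - 1
    -- Only the 1 -> 0 transition stops it.
    if M.collect_lua_garbage_cnt == 0 then
        set_state(false, INTERVAL)
    end
end

gc_cnt_inc(); gc_cnt_inc(); gc_cnt_dec()
assert(gc_state.active == true)   -- one router still wants GC
gc_cnt_dec()
assert(gc_state.active == false)  -- last router released it
```

Each helper checks only the single transition it can cause, which is the simplification the reviewer asks for.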
> +end
> +
> +local function router_cfg(router, cfg)
> + local vshard_cfg, box_cfg = lcfg.check(cfg, router.current_cfg)
> if not M.replicasets then
> log.info('Starting router configuration')
> else
> @@ -803,6 +839,77 @@ if M.errinj.ERRINJ_RELOAD then
> error('Error injection: reload')
> end
>
> +--------------------------------------------------------------------------------
> +-- Managing router instances
> +--------------------------------------------------------------------------------
> +
> +local function cfg_reconfigure(router, cfg)
> + return router_cfg(router, cfg)
> +end
> +
> +local router_mt = {
> + __index = {
> + cfg = cfg_reconfigure;
> + info = router_info;
> + buckets_info = router_buckets_info;
> + call = router_call;
> + callro = router_callro;
> + callrw = router_callrw;
> + route = router_route;
> + routeall = router_routeall;
> + bucket_id = router_bucket_id;
> + bucket_count = router_bucket_count;
> + sync = router_sync;
> + bootstrap = cluster_bootstrap;
> + bucket_discovery = bucket_discovery;
> + discovery_wakeup = discovery_wakeup;
> + }
> +}
> +
> +-- Table which represents this module.
> +local module = {}
> +
> +-- This metatable forwards calls on the module to the static_router.
> +local module_mt = {__index = {}}
> +for method_name, method in pairs(router_mt.__index) do
> + module_mt.__index[method_name] = function(...)
> + return method(M.static_router, ...)
> + end
> +end
> +
> +local function export_static_router_attributes()
> + setmetatable(module, module_mt)
> +end
> +
> +local function router_new(name, cfg)
6. A comment?
7. This function should not check for router_name == static one.
It just creates a new router and returns it. The caller should set
it into M.routers or M.static_router if needed. To the user, expose
not this function but a wrapper that calls router_new and registers
the result in M.routers.
> + if type(name) ~= 'string' or type(cfg) ~= 'table' then
> + error('Wrong argument type. Usage: vshard.router.new(name, cfg).')
> + end
> + if M.routers[name] then
> + return nil, string.format('Router with name %s already exists', name)
> + end
> + local router = table.deepcopy(ROUTER_TEMPLATE)
> + setmetatable(router, router_mt)
> + router.name = name
> + M.routers[name] = router
> + if name == STATIC_ROUTER_NAME then
> + M.static_router = router
> + export_static_router_attributes()
> + end
> + router_cfg(router, cfg)
> + return router
> +end
> +
> @@ -813,28 +920,23 @@ end
> if not rawget(_G, MODULE_INTERNALS) then
> rawset(_G, MODULE_INTERNALS, M)
> else
> - router_cfg(M.current_cfg)
> + for _, router in pairs(M.routers) do
> + router_cfg(router, router.current_cfg)
> + setmetatable(router, router_mt)
> + end
> M.module_version = M.module_version + 1
> end
>
> M.discovery_f = discovery_f
> M.failover_f = failover_f
> +M.router_mt = router_mt
> +if M.static_router then
> + export_static_router_attributes()
> +end
8. This is possible only on reload and can be moved into the
reload branch of the if above.
* [tarantool-patches] Re: [PATCH 3/3] Introduce multiple routers feature
2018-08-06 17:03 ` Vladislav Shpilevoy
@ 2018-08-07 13:18 ` Alex Khatskevich
2018-08-08 12:28 ` Vladislav Shpilevoy
0 siblings, 1 reply; 23+ messages in thread
From: Alex Khatskevich @ 2018-08-07 13:18 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches
On 06.08.2018 20:03, Vladislav Shpilevoy wrote:
> Thanks for the patch! See 8 comments below.
>
> 1. You did not send a full diff. There are tests only. (In
> this email I pasted it myself). Please, send a full diff
> next time.
Ok, sorry.
>
>>>> + if M.routers[name] then
>>>> + return nil, string.format('Router with name %s already exists', name)
>>>> + end
>>>> + local router = table.deepcopy(ROUTER_TEMPLATE)
>>>> + setmetatable(router, router_mt)
>>>> + router.name = name
>>>> + M.routers[name] = router
>>>> + if name == STATIC_ROUTER_NAME then
>>>> + M.static_router = router
>>>> + export_static_router_attributes()
>>>> + end
>>>
>>> 9. This check can be removed if you move
>>> export_static_router_attributes call into legacy_cfg.
>> But due to this `if`, the static router can be configured by
>> `vshard.router.new(static_router_name)`.
>
> 2. It is not ok. A user should not use any internal names like
> _static_router to configure or get it. Please, add a new member
> vshard.router.static that references the static one. Until cfg
> is called, it is nil.
Fixed.
Now a user can create the static router only by calling
`vshard.router.cfg()`.
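For reference, the compatibility layer behind the old module-level API (calls like `vshard.router.call(...)` forwarded to the static router) can be sketched in isolation. This is a self-contained sketch, not the patch itself: the `bucket_count` method and `total_bucket_count` field are illustrative stand-ins.

```lua
-- Sketch of the delegation pattern: module-level calls are forwarded
-- to M.static_router through a generated metatable.
local M = { static_router = nil }

-- A toy router metatable with one illustrative method.
local router_mt = { __index = {
    bucket_count = function(router) return router.total_bucket_count end,
} }

-- The table which represents the module; its metatable forwards every
-- router method to the static router.
local module = {}
local module_mt = { __index = {} }
for method_name, method in pairs(router_mt.__index) do
    module_mt.__index[method_name] = function(...)
        return method(M.static_router, ...)
    end
end
setmetatable(module, module_mt)

-- Once the static router exists, module.<method>() works on it.
M.static_router = setmetatable({ total_bucket_count = 3000 }, router_mt)
```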
>
>> diff --git a/vshard/router/init.lua b/vshard/router/init.lua
>> index 3e127cb..62fdcda 100644
>> --- a/vshard/router/init.lua
>> +++ b/vshard/router/init.lua
>> @@ -25,14 +25,32 @@ local M = rawget(_G, MODULE_INTERNALS)
>> if not M then
>> M = {
>> ---------------- Common module attributes ----------------
>> - -- The last passed configuration.
>> - current_cfg = nil,
>> errinj = {
>> ERRINJ_CFG = false,
>> ERRINJ_FAILOVER_CHANGE_CFG = false,
>> ERRINJ_RELOAD = false,
>> ERRINJ_LONG_DISCOVERY = false,
>> },
>> + -- Dictionary, key is router name, value is a router.
>> + routers = {},
>> + -- Router object which can be accessed by old api:
>> + -- e.g. vshard.router.call(...)
>> + static_router = nil,
>> + -- This counter is used to restart background fibers with
>> + -- new reloaded code.
>> + module_version = 0,
>> + collect_lua_garbage_cnt = 0,
>
> 3. A comment?
Added:
-- Number of routers which require collecting Lua garbage.
>
>> + }
>> +end
>> +
>> +--
>> +-- Router object attributes.
>> +--
>> +local ROUTER_TEMPLATE = {
>> + -- Name of router.
>> + name = nil,
>> + -- The last passed configuration.
>> + current_cfg = nil,
>> -- Time to outdate old objects on reload.
>> connection_outdate_delay = nil,
>> -- Bucket map cache.> @@ -488,8 +505,20 @@ end
>> -- Configuration
>> --------------------------------------------------------------------------------
>>
>>
>> -local function router_cfg(cfg)
>> - local vshard_cfg, box_cfg = lcfg.check(cfg, M.current_cfg)
>> +local function change_lua_gc_cnt(val)
>
> 4. The same.
Fixed.
>
>> + assert(M.collect_lua_garbage_cnt >= 0)
>> + local prev_cnt = M.collect_lua_garbage_cnt
>> + M.collect_lua_garbage_cnt = M.collect_lua_garbage_cnt + val
>> + if prev_cnt == 0 and M.collect_lua_garbage_cnt > 0 then
>> + lua_gc.set_state(true, consts.COLLECT_LUA_GARBAGE_INTERVAL)
>> + end
>> + if prev_cnt > 0 and M.collect_lua_garbage_cnt == 0 then
>> + lua_gc.set_state(false, consts.COLLECT_LUA_GARBAGE_INTERVAL)
>> + end
>
> 5. You know the concrete val in the caller always: 1 or -1. I think
> it would look simpler if you split this function into separate inc
> and dec ones. The former checks for prev == 0 and new > 0, the latter
> checks for prev > 0 and new == 0. It is not needed to check both
> each time.
Changed.
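The inc/dec split suggested in comment 5 can be sketched in isolation; `lua_gc`, the interval constant, and the function names below are self-contained stubs, not the final committed code.

```lua
-- Stub of vshard.lua_gc, just enough to observe the on/off switching.
local lua_gc = { state = false }
function lua_gc.set_state(on, interval) lua_gc.state = on end

local COLLECT_LUA_GARBAGE_INTERVAL = 100

local M = { collect_lua_garbage_cnt = 0 }

-- Increment the counter; start background Lua GC on the 0 -> 1 edge.
local function lua_gc_cnt_inc()
    M.collect_lua_garbage_cnt = M.collect_lua_garbage_cnt + 1
    if M.collect_lua_garbage_cnt == 1 then
        lua_gc.set_state(true, COLLECT_LUA_GARBAGE_INTERVAL)
    end
end

-- Decrement the counter; stop background Lua GC on the 1 -> 0 edge.
local function lua_gc_cnt_dec()
    assert(M.collect_lua_garbage_cnt > 0)
    M.collect_lua_garbage_cnt = M.collect_lua_garbage_cnt - 1
    if M.collect_lua_garbage_cnt == 0 then
        lua_gc.set_state(false, COLLECT_LUA_GARBAGE_INTERVAL)
    end
end
```

Each function now checks only its own edge, which is the point of the split.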
>
>> +end
>> +
>> +local function router_cfg(router, cfg)
>> + local vshard_cfg, box_cfg = lcfg.check(cfg, router.current_cfg)
>> if not M.replicasets then
>> log.info('Starting router configuration')
>> else
>> @@ -803,6 +839,77 @@ if M.errinj.ERRINJ_RELOAD then
>> error('Error injection: reload')
>> end
>>
>> +--------------------------------------------------------------------------------
>>
>> +-- Managing router instances
>> +--------------------------------------------------------------------------------
>>
>> +
>> +local function cfg_reconfigure(router, cfg)
>> + return router_cfg(router, cfg)
>> +end
>> +
>> +local router_mt = {
>> + __index = {
>> + cfg = cfg_reconfigure;
>> + info = router_info;
>> + buckets_info = router_buckets_info;
>> + call = router_call;
>> + callro = router_callro;
>> + callrw = router_callrw;
>> + route = router_route;
>> + routeall = router_routeall;
>> + bucket_id = router_bucket_id;
>> + bucket_count = router_bucket_count;
>> + sync = router_sync;
>> + bootstrap = cluster_bootstrap;
>> + bucket_discovery = bucket_discovery;
>> + discovery_wakeup = discovery_wakeup;
>> + }
>> +}
>> +
>> +-- Table which represents this module.
>> +local module = {}
>> +
>> +-- This metatable forwards module-level calls to the static_router.
>> +local module_mt = {__index = {}}
>> +for method_name, method in pairs(router_mt.__index) do
>> + module_mt.__index[method_name] = function(...)
>> + return method(M.static_router, ...)
>> + end
>> +end
>> +
>> +local function export_static_router_attributes()
>> + setmetatable(module, module_mt)
>> +end
>> +
>> +local function router_new(name, cfg)
>
> 6. A comment?
Added.
>
> 7. This function should not check for router_name == static one.
> It just creates a new router and returns it. The caller should set
> it into M.routers or M.static_router if needed. To the user, expose
> not this function but a wrapper that calls router_new and registers
> the result in M.routers.
Fixed.
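The split suggested in comment 7 might look like the sketch below. It is self-contained: `router_cfg()`, error objects, and `table.deepcopy` from the real module are omitted or stubbed, and `router_api_new` is an illustrative name for the public wrapper.

```lua
local M = { routers = {}, static_router = nil }
local router_mt = { __index = {} }

-- Pure constructor: only builds and returns a router object,
-- without touching M.routers or M.static_router.
local function router_new(name)
    return setmetatable({ name = name }, router_mt)
end

-- Public wrapper: validates arguments, registers the router, and
-- would configure it (router_cfg is omitted in this sketch).
local function router_api_new(name, cfg)
    if type(name) ~= 'string' or type(cfg) ~= 'table' then
        error('Wrong argument type. Usage: vshard.router.new(name, cfg).')
    end
    if M.routers[name] then
        return nil, string.format('Router with name %s already exists', name)
    end
    local router = router_new(name)
    M.routers[name] = router
    return router
end
```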
>
>> + if type(name) ~= 'string' or type(cfg) ~= 'table' then
>> + error('Wrong argument type. Usage: vshard.router.new(name, cfg).')
>> + end
>> + if M.routers[name] then
>> + return nil, string.format('Router with name %s already exists', name)
>> + end
>> + local router = table.deepcopy(ROUTER_TEMPLATE)
>> + setmetatable(router, router_mt)
>> + router.name = name
>> + M.routers[name] = router
>> + if name == STATIC_ROUTER_NAME then
>> + M.static_router = router
>> + export_static_router_attributes()
>> + end
>> + router_cfg(router, cfg)
>> + return router
>> +end
>> +
>> @@ -813,28 +920,23 @@ end
>> if not rawget(_G, MODULE_INTERNALS) then
>> rawset(_G, MODULE_INTERNALS, M)
>> else
>> - router_cfg(M.current_cfg)
>> + for _, router in pairs(M.routers) do
>> + router_cfg(router, router.current_cfg)
>> + setmetatable(router, router_mt)
>> + end
>> M.module_version = M.module_version + 1
>> end
>>
>> M.discovery_f = discovery_f
>> M.failover_f = failover_f
>> +M.router_mt = router_mt
>> +if M.static_router then
>> + export_static_router_attributes()
>> +end
>
> 8. This is possible only on reload and can be moved into the
> reload branch of the if above.
Fixed.
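After that fix, the module tail might be shaped roughly as below. This is a self-contained sketch with stubbed internals (`router_cfg`, `export_static_router_attributes`, the internals key), mirroring the patch rather than the final committed code.

```lua
local MODULE_INTERNALS = '__module_vshard_router'
local router_mt = { __index = {} }
local function router_cfg(router, cfg) router.current_cfg = cfg end
local function export_static_router_attributes() end

local M = rawget(_G, MODULE_INTERNALS)
if not M then
    -- First require: register the internals table in _G.
    M = { routers = {}, static_router = nil, module_version = 0 }
    rawset(_G, MODULE_INTERNALS, M)
else
    -- Reload branch: reconfigure every router with its stored config
    -- and re-export the static router attributes, all in one place.
    for _, router in pairs(M.routers) do
        router_cfg(router, router.current_cfg)
        setmetatable(router, router_mt)
    end
    if M.static_router then
        export_static_router_attributes()
    end
    M.module_version = M.module_version + 1
end
```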
Here is the full diff:
commit 87b6dc044de177e159dbe24f07abf3f98839ccff
Author: AKhatskevich <avkhatskevich@tarantool.org>
Date: Thu Jul 26 16:17:25 2018 +0300
Introduce multiple routers feature
Key points:
* Old `vshard.router.some_method()` api is preserved.
* Add `vshard.router.new(name, cfg)` method which returns a new router.
* Each router has its own:
1. name
2. background fibers
3. attributes (route_map, replicasets, outdate_delay...)
* Module reload reloads all configured routers.
* `cfg` reconfigures a single router.
* All routers share the same box configuration. The last passed config
overrides the global box config.
* Multiple router instances can be connected to the same cluster.
* For now, a router cannot be destroyed.
Extra changes:
* Add `data` parameter to `reloadable_fiber_create` function.
Closes #130
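In terms of usage, the key points above boil down to the following sketch. It assumes a live cluster; `cfg_1` and `cfg_2` are placeholder configs for two different clusters, and the `do_select` call mirrors the helper used in the tests.

```lua
local vshard = require('vshard')

-- Old, still supported API: configures the static router.
vshard.router.cfg(cfg_1)
vshard.router.bootstrap()
vshard.router.call(1, 'read', 'do_select', {1})

-- New API: an extra router connected to another cluster.
local router_2 = vshard.router.new('router_2', cfg_2)
router_2:bootstrap()
router_2:call(1, 'read', 'do_select', {2})

-- Reconfiguring one router does not affect the others.
router_2:cfg(cfg_2)
```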
diff --git a/test/failover/failover.result b/test/failover/failover.result
index 73a4250..50410ad 100644
--- a/test/failover/failover.result
+++ b/test/failover/failover.result
@@ -174,7 +174,7 @@ test_run:switch('router_1')
---
- true
...
-rs1 = vshard.router.internal.replicasets[rs_uuid[1]]
+rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]]
---
...
while not rs1.replica_up_ts do fiber.sleep(0.1) end
diff --git a/test/failover/failover.test.lua b/test/failover/failover.test.lua
index 6e06314..44c8b6d 100644
--- a/test/failover/failover.test.lua
+++ b/test/failover/failover.test.lua
@@ -74,7 +74,7 @@ echo_count
-- Ensure that replica_up_ts is updated periodically.
test_run:switch('router_1')
-rs1 = vshard.router.internal.replicasets[rs_uuid[1]]
+rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]]
while not rs1.replica_up_ts do fiber.sleep(0.1) end
old_up_ts = rs1.replica_up_ts
while rs1.replica_up_ts == old_up_ts do fiber.sleep(0.1) end
diff --git a/test/failover/failover_errinj.result b/test/failover/failover_errinj.result
index 3b6d986..484a1e3 100644
--- a/test/failover/failover_errinj.result
+++ b/test/failover/failover_errinj.result
@@ -49,7 +49,7 @@ vshard.router.cfg(cfg)
-- Check that already run failover step is restarted on
-- configuration change (if some replicasets are removed from
-- config).
-rs1 = vshard.router.internal.replicasets[rs_uuid[1]]
+rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]]
---
...
while not rs1.replica or not rs1.replica.conn:is_connected() do fiber.sleep(0.1) end
diff --git a/test/failover/failover_errinj.test.lua b/test/failover/failover_errinj.test.lua
index b4d2d35..14228de 100644
--- a/test/failover/failover_errinj.test.lua
+++ b/test/failover/failover_errinj.test.lua
@@ -20,7 +20,7 @@ vshard.router.cfg(cfg)
-- Check that already run failover step is restarted on
-- configuration change (if some replicasets are removed from
-- config).
-rs1 = vshard.router.internal.replicasets[rs_uuid[1]]
+rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]]
while not rs1.replica or not rs1.replica.conn:is_connected() do fiber.sleep(0.1) end
vshard.router.internal.errinj.ERRINJ_FAILOVER_CHANGE_CFG = true
wait_state('Configuration has changed, restart ')
diff --git a/test/failover/router_1.lua b/test/failover/router_1.lua
index d71209b..664a6c6 100644
--- a/test/failover/router_1.lua
+++ b/test/failover/router_1.lua
@@ -42,7 +42,7 @@ end
function priority_order()
local ret = {}
for _, uuid in pairs(rs_uuid) do
- local rs = vshard.router.internal.replicasets[uuid]
+ local rs = vshard.router.internal.static_router.replicasets[uuid]
local sorted = {}
for _, replica in pairs(rs.priority_list) do
local z
diff --git a/test/misc/reconfigure.result b/test/misc/reconfigure.result
index c7960b3..311f749 100644
--- a/test/misc/reconfigure.result
+++ b/test/misc/reconfigure.result
@@ -250,7 +250,7 @@ test_run:switch('router_1')
-- Ensure that in a case of error router internals are not
-- changed.
--
-not vshard.router.internal.collect_lua_garbage
+not vshard.router.internal.static_router.collect_lua_garbage
---
- true
...
@@ -264,7 +264,7 @@ vshard.router.cfg(cfg)
---
- error: 'Incorrect value for option ''invalid_option'': unexpected option'
...
-not vshard.router.internal.collect_lua_garbage
+not vshard.router.internal.static_router.collect_lua_garbage
---
- true
...
diff --git a/test/misc/reconfigure.test.lua b/test/misc/reconfigure.test.lua
index 25dc2ca..298b9b0 100644
--- a/test/misc/reconfigure.test.lua
+++ b/test/misc/reconfigure.test.lua
@@ -99,11 +99,11 @@ test_run:switch('router_1')
-- Ensure that in a case of error router internals are not
-- changed.
--
-not vshard.router.internal.collect_lua_garbage
+not vshard.router.internal.static_router.collect_lua_garbage
cfg.collect_lua_garbage = true
cfg.invalid_option = 'kek'
vshard.router.cfg(cfg)
-not vshard.router.internal.collect_lua_garbage
+not vshard.router.internal.static_router.collect_lua_garbage
cfg.invalid_option = nil
cfg.collect_lua_garbage = nil
vshard.router.cfg(cfg)
diff --git a/test/multiple_routers/configs.lua b/test/multiple_routers/configs.lua
new file mode 100644
index 0000000..a6ce33c
--- /dev/null
+++ b/test/multiple_routers/configs.lua
@@ -0,0 +1,81 @@
+names = {
+ storage_1_1_a = '32a2d4b8-f146-44ed-9d51-2436507efdf8',
+ storage_1_1_b = 'c1c849b1-641d-40b8-9283-bcfe73d46270',
+ storage_1_2_a = '04e677ed-c7ba-47e0-a67f-b5100cfa86af',
+ storage_1_2_b = 'c7a979ee-9263-4a38-84a5-2fb6a0a32684',
+ storage_2_1_a = '88dc03f0-23fb-4f05-b462-e29186542864',
+ storage_2_1_b = '4230b711-f5c4-4131-bf98-88cd43a16901',
+ storage_2_2_a = '6b1eefbc-1e2e-410e-84ff-44c572ea9916',
+ storage_2_2_b = 'be74419a-1e56-4ba4-97e9-6b18710f63c5',
+}
+
+rs_1_1 = 'dd208fb8-8b90-49bc-8393-6b3a99da7c52'
+rs_1_2 = 'af9cfe88-2091-4613-a877-a623776c5c0e'
+rs_2_1 = '9ca8ee15-ae18-4f31-9385-4859f89ce73f'
+rs_2_2 = '007f5f58-b654-4125-8441-a71866fb62b5'
+
+local cfg_1 = {}
+cfg_1.sharding = {
+ [rs_1_1] = {
+ replicas = {
+ [names.storage_1_1_a] = {
+ uri = 'storage:storage@127.0.0.1:3301',
+ name = 'storage_1_1_a',
+ master = true,
+ },
+ [names.storage_1_1_b] = {
+ uri = 'storage:storage@127.0.0.1:3302',
+ name = 'storage_1_1_b',
+ },
+ }
+ },
+ [rs_1_2] = {
+ replicas = {
+ [names.storage_1_2_a] = {
+ uri = 'storage:storage@127.0.0.1:3303',
+ name = 'storage_1_2_a',
+ master = true,
+ },
+ [names.storage_1_2_b] = {
+ uri = 'storage:storage@127.0.0.1:3304',
+ name = 'storage_1_2_b',
+ },
+ }
+ },
+}
+
+
+local cfg_2 = {}
+cfg_2.sharding = {
+ [rs_2_1] = {
+ replicas = {
+ [names.storage_2_1_a] = {
+ uri = 'storage:storage@127.0.0.1:3305',
+ name = 'storage_2_1_a',
+ master = true,
+ },
+ [names.storage_2_1_b] = {
+ uri = 'storage:storage@127.0.0.1:3306',
+ name = 'storage_2_1_b',
+ },
+ }
+ },
+ [rs_2_2] = {
+ replicas = {
+ [names.storage_2_2_a] = {
+ uri = 'storage:storage@127.0.0.1:3307',
+ name = 'storage_2_2_a',
+ master = true,
+ },
+ [names.storage_2_2_b] = {
+ uri = 'storage:storage@127.0.0.1:3308',
+ name = 'storage_2_2_b',
+ },
+ }
+ },
+}
+
+return {
+ cfg_1 = cfg_1,
+ cfg_2 = cfg_2,
+}
diff --git a/test/multiple_routers/multiple_routers.result b/test/multiple_routers/multiple_routers.result
new file mode 100644
index 0000000..5b85e1c
--- /dev/null
+++ b/test/multiple_routers/multiple_routers.result
@@ -0,0 +1,301 @@
+test_run = require('test_run').new()
+---
+...
+REPLICASET_1_1 = { 'storage_1_1_a', 'storage_1_1_b' }
+---
+...
+REPLICASET_1_2 = { 'storage_1_2_a', 'storage_1_2_b' }
+---
+...
+REPLICASET_2_1 = { 'storage_2_1_a', 'storage_2_1_b' }
+---
+...
+REPLICASET_2_2 = { 'storage_2_2_a', 'storage_2_2_b' }
+---
+...
+test_run:create_cluster(REPLICASET_1_1, 'multiple_routers')
+---
+...
+test_run:create_cluster(REPLICASET_1_2, 'multiple_routers')
+---
+...
+test_run:create_cluster(REPLICASET_2_1, 'multiple_routers')
+---
+...
+test_run:create_cluster(REPLICASET_2_2, 'multiple_routers')
+---
+...
+util = require('lua_libs.util')
+---
+...
+util.wait_master(test_run, REPLICASET_1_1, 'storage_1_1_a')
+---
+...
+util.wait_master(test_run, REPLICASET_1_2, 'storage_1_2_a')
+---
+...
+util.wait_master(test_run, REPLICASET_2_1, 'storage_2_1_a')
+---
+...
+util.wait_master(test_run, REPLICASET_2_2, 'storage_2_2_a')
+---
+...
+test_run:cmd("create server router_1 with script='multiple_routers/router_1.lua'")
+---
+- true
+...
+test_run:cmd("start server router_1")
+---
+- true
+...
+-- Configure default (static) router.
+_ = test_run:cmd("switch router_1")
+---
+...
+vshard.router.cfg(configs.cfg_1)
+---
+...
+vshard.router.bootstrap()
+---
+- true
+...
+_ = test_run:cmd("switch storage_1_2_a")
+---
+...
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+---
+...
+_ = test_run:cmd("switch router_1")
+---
+...
+vshard.router.call(1, 'write', 'do_replace', {{1, 1}})
+---
+- true
+...
+vshard.router.call(1, 'read', 'do_select', {1})
+---
+- [[1, 1]]
+...
+-- Test that static router is just a router object under the hood.
+static_router = vshard.router.internal.static_router
+---
+...
+static_router:route(1) == vshard.router.route(1)
+---
+- true
+...
+-- Configure extra router.
+router_2 = vshard.router.new('router_2', configs.cfg_2)
+---
+...
+router_2:bootstrap()
+---
+- true
+...
+_ = test_run:cmd("switch storage_2_2_a")
+---
+...
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+---
+...
+_ = test_run:cmd("switch router_1")
+---
+...
+router_2:call(1, 'write', 'do_replace', {{2, 2}})
+---
+- true
+...
+router_2:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+-- Check that router_2 and the static router serve different clusters.
+#router_2:call(1, 'read', 'do_select', {1}) == 0
+---
+- true
+...
+-- Create several routers to the same cluster.
+routers = {}
+---
+...
+for i = 3, 10 do routers[i] = vshard.router.new('router_' .. i, configs.cfg_2) end
+---
+...
+routers[3]:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+-- Check that they have their own background fibers.
+fiber_names = {}
+---
+...
+for i = 2, 10 do fiber_names['vshard.failover.router_' .. i] = true; fiber_names['vshard.discovery.router_' .. i] = true; end
+---
+...
+next(fiber_names) ~= nil
+---
+- true
+...
+fiber = require('fiber')
+---
+...
+for _, xfiber in pairs(fiber.info()) do fiber_names[xfiber.name] = nil end
+---
+...
+next(fiber_names) == nil
+---
+- true
+...
+-- Reconfiguring one of the routers does not affect the others.
+routers[3]:cfg(configs.cfg_1)
+---
+...
+routers[3]:call(1, 'read', 'do_select', {1})
+---
+- [[1, 1]]
+...
+#routers[3]:call(1, 'read', 'do_select', {2}) == 0
+---
+- true
+...
+#routers[4]:call(1, 'read', 'do_select', {1}) == 0
+---
+- true
+...
+routers[4]:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+routers[3]:cfg(configs.cfg_2)
+---
+...
+-- Try to create router with the same name.
+util = require('lua_libs.util')
+---
+...
+util.check_error(vshard.router.new, 'router_2', configs.cfg_2)
+---
+- null
+- type: ShardingError
+ code: 21
+ name: ROUTER_ALREADY_EXISTS
+ message: Router with name router_2 already exists
+...
+-- Reload router module.
+_, old_rs_1 = next(vshard.router.internal.static_router.replicasets)
+---
+...
+_, old_rs_2 = next(router_2.replicasets)
+---
+...
+package.loaded['vshard.router'] = nil
+---
+...
+vshard.router = require('vshard.router')
+---
+...
+while not old_rs_1.is_outdated do fiber.sleep(0.01) end
+---
+...
+while not old_rs_2.is_outdated do fiber.sleep(0.01) end
+---
+...
+vshard.router.call(1, 'read', 'do_select', {1})
+---
+- [[1, 1]]
+...
+router_2:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+routers[5]:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+-- Check lua_gc counter.
+lua_gc = require('vshard.lua_gc')
+---
+...
+vshard.router.internal.collect_lua_garbage_cnt == 0
+---
+- true
+...
+lua_gc.internal.bg_fiber == nil
+---
+- true
+...
+configs.cfg_2.collect_lua_garbage = true
+---
+...
+routers[5]:cfg(configs.cfg_2)
+---
+...
+lua_gc.internal.bg_fiber ~= nil
+---
+- true
+...
+routers[7]:cfg(configs.cfg_2)
+---
+...
+lua_gc.internal.bg_fiber ~= nil
+---
+- true
+...
+vshard.router.internal.collect_lua_garbage_cnt == 2
+---
+- true
+...
+package.loaded['vshard.router'] = nil
+---
+...
+vshard.router = require('vshard.router')
+---
+...
+vshard.router.internal.collect_lua_garbage_cnt == 2
+---
+- true
+...
+configs.cfg_2.collect_lua_garbage = nil
+---
+...
+routers[5]:cfg(configs.cfg_2)
+---
+...
+lua_gc.internal.bg_fiber ~= nil
+---
+- true
+...
+routers[7]:cfg(configs.cfg_2)
+---
+...
+vshard.router.internal.collect_lua_garbage_cnt == 0
+---
+- true
+...
+lua_gc.internal.bg_fiber == nil
+---
+- true
+...
+_ = test_run:cmd("switch default")
+---
+...
+test_run:cmd("stop server router_1")
+---
+- true
+...
+test_run:cmd("cleanup server router_1")
+---
+- true
+...
+test_run:drop_cluster(REPLICASET_1_1)
+---
+...
+test_run:drop_cluster(REPLICASET_1_2)
+---
+...
+test_run:drop_cluster(REPLICASET_2_1)
+---
+...
+test_run:drop_cluster(REPLICASET_2_2)
+---
+...
diff --git a/test/multiple_routers/multiple_routers.test.lua b/test/multiple_routers/multiple_routers.test.lua
new file mode 100644
index 0000000..ec3c7f7
--- /dev/null
+++ b/test/multiple_routers/multiple_routers.test.lua
@@ -0,0 +1,109 @@
+test_run = require('test_run').new()
+
+REPLICASET_1_1 = { 'storage_1_1_a', 'storage_1_1_b' }
+REPLICASET_1_2 = { 'storage_1_2_a', 'storage_1_2_b' }
+REPLICASET_2_1 = { 'storage_2_1_a', 'storage_2_1_b' }
+REPLICASET_2_2 = { 'storage_2_2_a', 'storage_2_2_b' }
+
+test_run:create_cluster(REPLICASET_1_1, 'multiple_routers')
+test_run:create_cluster(REPLICASET_1_2, 'multiple_routers')
+test_run:create_cluster(REPLICASET_2_1, 'multiple_routers')
+test_run:create_cluster(REPLICASET_2_2, 'multiple_routers')
+util = require('lua_libs.util')
+util.wait_master(test_run, REPLICASET_1_1, 'storage_1_1_a')
+util.wait_master(test_run, REPLICASET_1_2, 'storage_1_2_a')
+util.wait_master(test_run, REPLICASET_2_1, 'storage_2_1_a')
+util.wait_master(test_run, REPLICASET_2_2, 'storage_2_2_a')
+
+test_run:cmd("create server router_1 with script='multiple_routers/router_1.lua'")
+test_run:cmd("start server router_1")
+
+-- Configure default (static) router.
+_ = test_run:cmd("switch router_1")
+vshard.router.cfg(configs.cfg_1)
+vshard.router.bootstrap()
+_ = test_run:cmd("switch storage_1_2_a")
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+_ = test_run:cmd("switch router_1")
+
+vshard.router.call(1, 'write', 'do_replace', {{1, 1}})
+vshard.router.call(1, 'read', 'do_select', {1})
+
+-- Test that static router is just a router object under the hood.
+static_router = vshard.router.internal.static_router
+static_router:route(1) == vshard.router.route(1)
+
+-- Configure extra router.
+router_2 = vshard.router.new('router_2', configs.cfg_2)
+router_2:bootstrap()
+_ = test_run:cmd("switch storage_2_2_a")
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+_ = test_run:cmd("switch router_1")
+
+router_2:call(1, 'write', 'do_replace', {{2, 2}})
+router_2:call(1, 'read', 'do_select', {2})
+-- Check that router_2 and the static router serve different clusters.
+#router_2:call(1, 'read', 'do_select', {1}) == 0
+
+-- Create several routers to the same cluster.
+routers = {}
+for i = 3, 10 do routers[i] = vshard.router.new('router_' .. i, configs.cfg_2) end
+routers[3]:call(1, 'read', 'do_select', {2})
+-- Check that they have their own background fibers.
+fiber_names = {}
+for i = 2, 10 do fiber_names['vshard.failover.router_' .. i] = true; fiber_names['vshard.discovery.router_' .. i] = true; end
+next(fiber_names) ~= nil
+fiber = require('fiber')
+for _, xfiber in pairs(fiber.info()) do fiber_names[xfiber.name] = nil end
+next(fiber_names) == nil
+
+-- Reconfiguring one of the routers does not affect the others.
+routers[3]:cfg(configs.cfg_1)
+routers[3]:call(1, 'read', 'do_select', {1})
+#routers[3]:call(1, 'read', 'do_select', {2}) == 0
+#routers[4]:call(1, 'read', 'do_select', {1}) == 0
+routers[4]:call(1, 'read', 'do_select', {2})
+routers[3]:cfg(configs.cfg_2)
+
+-- Try to create router with the same name.
+util = require('lua_libs.util')
+util.check_error(vshard.router.new, 'router_2', configs.cfg_2)
+
+-- Reload router module.
+_, old_rs_1 = next(vshard.router.internal.static_router.replicasets)
+_, old_rs_2 = next(router_2.replicasets)
+package.loaded['vshard.router'] = nil
+vshard.router = require('vshard.router')
+while not old_rs_1.is_outdated do fiber.sleep(0.01) end
+while not old_rs_2.is_outdated do fiber.sleep(0.01) end
+vshard.router.call(1, 'read', 'do_select', {1})
+router_2:call(1, 'read', 'do_select', {2})
+routers[5]:call(1, 'read', 'do_select', {2})
+
+-- Check lua_gc counter.
+lua_gc = require('vshard.lua_gc')
+vshard.router.internal.collect_lua_garbage_cnt == 0
+lua_gc.internal.bg_fiber == nil
+configs.cfg_2.collect_lua_garbage = true
+routers[5]:cfg(configs.cfg_2)
+lua_gc.internal.bg_fiber ~= nil
+routers[7]:cfg(configs.cfg_2)
+lua_gc.internal.bg_fiber ~= nil
+vshard.router.internal.collect_lua_garbage_cnt == 2
+package.loaded['vshard.router'] = nil
+vshard.router = require('vshard.router')
+vshard.router.internal.collect_lua_garbage_cnt == 2
+configs.cfg_2.collect_lua_garbage = nil
+routers[5]:cfg(configs.cfg_2)
+lua_gc.internal.bg_fiber ~= nil
+routers[7]:cfg(configs.cfg_2)
+vshard.router.internal.collect_lua_garbage_cnt == 0
+lua_gc.internal.bg_fiber == nil
+
+_ = test_run:cmd("switch default")
+test_run:cmd("stop server router_1")
+test_run:cmd("cleanup server router_1")
+test_run:drop_cluster(REPLICASET_1_1)
+test_run:drop_cluster(REPLICASET_1_2)
+test_run:drop_cluster(REPLICASET_2_1)
+test_run:drop_cluster(REPLICASET_2_2)
diff --git a/test/multiple_routers/router_1.lua b/test/multiple_routers/router_1.lua
new file mode 100644
index 0000000..2e9ea91
--- /dev/null
+++ b/test/multiple_routers/router_1.lua
@@ -0,0 +1,15 @@
+#!/usr/bin/env tarantool
+
+require('strict').on()
+
+-- Get instance name
+local fio = require('fio')
+local NAME = fio.basename(arg[0], '.lua')
+
+require('console').listen(os.getenv('ADMIN'))
+
+configs = require('configs')
+
+-- Start the database with sharding
+vshard = require('vshard')
+box.cfg{}
diff --git a/test/multiple_routers/storage_1_1_a.lua b/test/multiple_routers/storage_1_1_a.lua
new file mode 100644
index 0000000..b44a97a
--- /dev/null
+++ b/test/multiple_routers/storage_1_1_a.lua
@@ -0,0 +1,23 @@
+#!/usr/bin/env tarantool
+
+require('strict').on()
+
+-- Get instance name.
+local fio = require('fio')
+NAME = fio.basename(arg[0], '.lua')
+
+require('console').listen(os.getenv('ADMIN'))
+
+-- Fetch config for the cluster of the instance.
+if NAME:sub(9,9) == '1' then
+ cfg = require('configs').cfg_1
+else
+ cfg = require('configs').cfg_2
+end
+
+-- Start the database with sharding.
+vshard = require('vshard')
+vshard.storage.cfg(cfg, names[NAME])
+
+-- Bootstrap storage.
+require('lua_libs.bootstrap')
diff --git a/test/multiple_routers/storage_1_1_b.lua b/test/multiple_routers/storage_1_1_b.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_1_1_b.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_1_2_a.lua b/test/multiple_routers/storage_1_2_a.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_1_2_a.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_1_2_b.lua b/test/multiple_routers/storage_1_2_b.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_1_2_b.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_2_1_a.lua b/test/multiple_routers/storage_2_1_a.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_2_1_a.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_2_1_b.lua b/test/multiple_routers/storage_2_1_b.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_2_1_b.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_2_2_a.lua b/test/multiple_routers/storage_2_2_a.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_2_2_a.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_2_2_b.lua b/test/multiple_routers/storage_2_2_b.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_2_2_b.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/suite.ini b/test/multiple_routers/suite.ini
new file mode 100644
index 0000000..d2d4470
--- /dev/null
+++ b/test/multiple_routers/suite.ini
@@ -0,0 +1,6 @@
+[default]
+core = tarantool
+description = Multiple routers tests
+script = test.lua
+is_parallel = False
+lua_libs = ../lua_libs configs.lua
diff --git a/test/multiple_routers/test.lua b/test/multiple_routers/test.lua
new file mode 100644
index 0000000..cb7c1ee
--- /dev/null
+++ b/test/multiple_routers/test.lua
@@ -0,0 +1,9 @@
+#!/usr/bin/env tarantool
+
+require('strict').on()
+
+box.cfg{
+ listen = os.getenv("LISTEN"),
+}
+
+require('console').listen(os.getenv('ADMIN'))
diff --git a/test/router/exponential_timeout.result b/test/router/exponential_timeout.result
index fb54d0f..6748b64 100644
--- a/test/router/exponential_timeout.result
+++ b/test/router/exponential_timeout.result
@@ -37,10 +37,10 @@ test_run:cmd('switch router_1')
util = require('util')
---
...
-rs1 = vshard.router.internal.replicasets[replicasets[1]]
+rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]]
---
...
-rs2 = vshard.router.internal.replicasets[replicasets[2]]
+rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]]
---
...
util.collect_timeouts(rs1)
diff --git a/test/router/exponential_timeout.test.lua b/test/router/exponential_timeout.test.lua
index 3ec0b8c..75d85bf 100644
--- a/test/router/exponential_timeout.test.lua
+++ b/test/router/exponential_timeout.test.lua
@@ -13,8 +13,8 @@ test_run:cmd("start server router_1")
test_run:cmd('switch router_1')
util = require('util')
-rs1 = vshard.router.internal.replicasets[replicasets[1]]
-rs2 = vshard.router.internal.replicasets[replicasets[2]]
+rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]]
+rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]]
util.collect_timeouts(rs1)
util.collect_timeouts(rs2)
diff --git a/test/router/reconnect_to_master.result b/test/router/reconnect_to_master.result
index 5e678ce..d502723 100644
--- a/test/router/reconnect_to_master.result
+++ b/test/router/reconnect_to_master.result
@@ -76,7 +76,7 @@ _ = test_run:cmd('stop server storage_1_a')
_ = test_run:switch('router_1')
---
...
-reps = vshard.router.internal.replicasets
+reps = vshard.router.internal.static_router.replicasets
---
...
test_run:cmd("setopt delimiter ';'")
@@ -95,7 +95,7 @@ end;
...
function count_known_buckets()
local known_buckets = 0
- for _, id in pairs(vshard.router.internal.route_map) do
+ for _, id in pairs(vshard.router.internal.static_router.route_map) do
known_buckets = known_buckets + 1
end
return known_buckets
@@ -127,7 +127,7 @@ is_disconnected()
fiber = require('fiber')
---
...
-while vshard.router.internal.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end
+while vshard.router.internal.static_router.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end
---
...
vshard.router.info()
diff --git a/test/router/reconnect_to_master.test.lua b/test/router/reconnect_to_master.test.lua
index 39ba90e..8820fa7 100644
--- a/test/router/reconnect_to_master.test.lua
+++ b/test/router/reconnect_to_master.test.lua
@@ -34,7 +34,7 @@ _ = test_run:cmd('stop server storage_1_a')
_ = test_run:switch('router_1')
-reps = vshard.router.internal.replicasets
+reps = vshard.router.internal.static_router.replicasets
test_run:cmd("setopt delimiter ';'")
function is_disconnected()
for i, rep in pairs(reps) do
@@ -46,7 +46,7 @@ function is_disconnected()
end;
function count_known_buckets()
local known_buckets = 0
- for _, id in pairs(vshard.router.internal.route_map) do
+ for _, id in pairs(vshard.router.internal.static_router.route_map) do
known_buckets = known_buckets + 1
end
return known_buckets
@@ -63,7 +63,7 @@ is_disconnected()
-- Wait until replica is connected to test alerts on unavailable
-- master.
fiber = require('fiber')
-while vshard.router.internal.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end
+while vshard.router.internal.static_router.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end
vshard.router.info()
-- Return master.
diff --git a/test/router/reload.result b/test/router/reload.result
index f0badc3..98e8e71 100644
--- a/test/router/reload.result
+++ b/test/router/reload.result
@@ -229,7 +229,7 @@ vshard.router.cfg(cfg)
cfg.connection_outdate_delay = old_connection_delay
---
...
-vshard.router.internal.connection_outdate_delay = nil
+vshard.router.internal.static_router.connection_outdate_delay = nil
---
...
rs_new = vshard.router.route(1)
diff --git a/test/router/reload.test.lua b/test/router/reload.test.lua
index 528222a..293cb26 100644
--- a/test/router/reload.test.lua
+++ b/test/router/reload.test.lua
@@ -104,7 +104,7 @@ old_connection_delay = cfg.connection_outdate_delay
cfg.connection_outdate_delay = 0.3
vshard.router.cfg(cfg)
cfg.connection_outdate_delay = old_connection_delay
-vshard.router.internal.connection_outdate_delay = nil
+vshard.router.internal.static_router.connection_outdate_delay = nil
rs_new = vshard.router.route(1)
rs_old = rs
_, replica_old = next(rs_old.replicas)
diff --git a/test/router/reroute_wrong_bucket.result b/test/router/reroute_wrong_bucket.result
index 7f2a494..989dc79 100644
--- a/test/router/reroute_wrong_bucket.result
+++ b/test/router/reroute_wrong_bucket.result
@@ -98,7 +98,7 @@ vshard.router.call(100, 'read', 'customer_lookup', {1}, {timeout = 100})
---
- {'accounts': [], 'customer_id': 1, 'name': 'name'}
...
-vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]]
+vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]]
---
...
vshard.router.call(100, 'write', 'customer_add', {{customer_id = 2, bucket_id = 100, name = 'name2', accounts = {}}}, {timeout = 100})
@@ -146,13 +146,13 @@ test_run:switch('router_1')
...
-- Emulate a situation when replicaset_2 is still unknown to the
-- router, but is already known to the storages.
-save_rs2 = vshard.router.internal.replicasets[replicasets[2]]
+save_rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]]
---
...
-vshard.router.internal.replicasets[replicasets[2]] = nil
+vshard.router.internal.static_router.replicasets[replicasets[2]] = nil
---
...
-vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]]
+vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]]
---
...
fiber = require('fiber')
@@ -207,7 +207,7 @@ err
require('log').info(string.rep('a', 1000))
---
...
-vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]]
+vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]]
---
...
call_retval = nil
@@ -219,7 +219,7 @@ f = fiber.create(do_call, 100)
while not test_run:grep_log('router_1', 'please update configuration', 1000) do fiber.sleep(0.1) end
---
...
-vshard.router.internal.replicasets[replicasets[2]] = save_rs2
+vshard.router.internal.static_router.replicasets[replicasets[2]] = save_rs2
---
...
while not call_retval do fiber.sleep(0.1) end
diff --git a/test/router/reroute_wrong_bucket.test.lua b/test/router/reroute_wrong_bucket.test.lua
index 03384d1..a00f941 100644
--- a/test/router/reroute_wrong_bucket.test.lua
+++ b/test/router/reroute_wrong_bucket.test.lua
@@ -35,7 +35,7 @@ customer_add({customer_id = 1, bucket_id = 100, name = 'name', accounts = {}})
test_run:switch('router_1')
vshard.router.call(100, 'read', 'customer_lookup', {1}, {timeout = 100})
-vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]]
+vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]]
vshard.router.call(100, 'write', 'customer_add', {{customer_id = 2, bucket_id = 100, name = 'name2', accounts = {}}}, {timeout = 100})
-- Create cycle.
@@ -55,9 +55,9 @@ box.space._bucket:replace({100, vshard.consts.BUCKET.SENT, replicasets[2]})
test_run:switch('router_1')
-- Emulate a situation when replicaset_2 is still unknown to the
-- router, but is already known to the storages.
-save_rs2 = vshard.router.internal.replicasets[replicasets[2]]
-vshard.router.internal.replicasets[replicasets[2]] = nil
-vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]]
+save_rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]]
+vshard.router.internal.static_router.replicasets[replicasets[2]] = nil
+vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]]
fiber = require('fiber')
call_retval = nil
@@ -84,11 +84,11 @@ err
-- detect it and end with ok.
--
require('log').info(string.rep('a', 1000))
-vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]]
+vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]]
call_retval = nil
f = fiber.create(do_call, 100)
while not test_run:grep_log('router_1', 'please update configuration', 1000) do fiber.sleep(0.1) end
-vshard.router.internal.replicasets[replicasets[2]] = save_rs2
+vshard.router.internal.static_router.replicasets[replicasets[2]] = save_rs2
while not call_retval do fiber.sleep(0.1) end
call_retval
vshard.router.call(100, 'read', 'customer_lookup', {3}, {timeout = 1})
diff --git a/test/router/retry_reads.result b/test/router/retry_reads.result
index 64b0ff3..b803ae3 100644
--- a/test/router/retry_reads.result
+++ b/test/router/retry_reads.result
@@ -37,7 +37,7 @@ test_run:cmd('switch router_1')
util = require('util')
---
...
-rs1 = vshard.router.internal.replicasets[replicasets[1]]
+rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]]
---
...
min_timeout = vshard.consts.CALL_TIMEOUT_MIN
diff --git a/test/router/retry_reads.test.lua b/test/router/retry_reads.test.lua
index 2fb2fc7..510e961 100644
--- a/test/router/retry_reads.test.lua
+++ b/test/router/retry_reads.test.lua
@@ -13,7 +13,7 @@ test_run:cmd("start server router_1")
test_run:cmd('switch router_1')
util = require('util')
-rs1 = vshard.router.internal.replicasets[replicasets[1]]
+rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]]
min_timeout = vshard.consts.CALL_TIMEOUT_MIN
--
diff --git a/test/router/router.result b/test/router/router.result
index 45394e1..ceaf672 100644
--- a/test/router/router.result
+++ b/test/router/router.result
@@ -70,10 +70,10 @@ test_run:grep_log('router_1', 'connected to ')
---
- 'connected to '
...
-rs1 = vshard.router.internal.replicasets[replicasets[1]]
+rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]]
---
...
-rs2 = vshard.router.internal.replicasets[replicasets[2]]
+rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]]
---
...
fiber = require('fiber')
@@ -95,7 +95,7 @@ rs2.replica == rs2.master
-- Part of gh-76: on reconfiguration do not recreate connections
-- to replicas, that are kept in a new configuration.
--
-old_replicasets = vshard.router.internal.replicasets
+old_replicasets = vshard.router.internal.static_router.replicasets
---
...
old_connections = {}
@@ -127,17 +127,17 @@ connection_count == 4
vshard.router.cfg(cfg)
---
...
-new_replicasets = vshard.router.internal.replicasets
+new_replicasets = vshard.router.internal.static_router.replicasets
---
...
old_replicasets ~= new_replicasets
---
- true
...
-rs1 = vshard.router.internal.replicasets[replicasets[1]]
+rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]]
---
...
-rs2 = vshard.router.internal.replicasets[replicasets[2]]
+rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]]
---
...
while not rs1.replica or not rs2.replica do fiber.sleep(0.1) end
@@ -225,7 +225,7 @@ vshard.router.bootstrap()
--
-- gh-108: negative bucket count on discovery.
--
-vshard.router.internal.route_map = {}
+vshard.router.internal.static_router.route_map = {}
---
...
rets = {}
@@ -456,7 +456,7 @@ conn.state
rs_uuid = '<replicaset_2>'
---
...
-rs = vshard.router.internal.replicasets[rs_uuid]
+rs = vshard.router.internal.static_router.replicasets[rs_uuid]
---
...
master = rs.master
@@ -605,7 +605,7 @@ vshard.router.info()
...
-- Remove replica and master connections to trigger alert
-- UNREACHABLE_REPLICASET.
-rs = vshard.router.internal.replicasets[replicasets[1]]
+rs = vshard.router.internal.static_router.replicasets[replicasets[1]]
---
...
master_conn = rs.master.conn
@@ -749,7 +749,7 @@ test_run:cmd("setopt delimiter ';'")
...
function calculate_known_buckets()
local known_buckets = 0
- for _, rs in pairs(vshard.router.internal.route_map) do
+ for _, rs in pairs(vshard.router.internal.static_router.route_map) do
known_buckets = known_buckets + 1
end
return known_buckets
@@ -851,10 +851,10 @@ test_run:cmd("setopt delimiter ';'")
- true
...
for i = 1, 100 do
- local rs = vshard.router.internal.route_map[i]
+ local rs = vshard.router.internal.static_router.route_map[i]
assert(rs)
rs.bucket_count = rs.bucket_count - 1
- vshard.router.internal.route_map[i] = nil
+ vshard.router.internal.static_router.route_map[i] = nil
end;
---
...
@@ -999,7 +999,7 @@ vshard.router.sync(100500)
-- object method like this: object.method() instead of
-- object:method(), an appropriate help-error returns.
--
-_, replicaset = next(vshard.router.internal.replicasets)
+_, replicaset = next(vshard.router.internal.static_router.replicasets)
---
...
error_messages = {}
@@ -1069,7 +1069,7 @@ test_run:cmd("setopt delimiter ';'")
---
- true
...
-for bucket, rs in pairs(vshard.router.internal.route_map) do
+for bucket, rs in pairs(vshard.router.internal.static_router.route_map) do
bucket_to_old_rs[bucket] = rs
bucket_cnt = bucket_cnt + 1
end;
@@ -1084,7 +1084,7 @@ vshard.router.cfg(cfg);
...
for bucket, old_rs in pairs(bucket_to_old_rs) do
local old_uuid = old_rs.uuid
- local rs = vshard.router.internal.route_map[bucket]
+ local rs = vshard.router.internal.static_router.route_map[bucket]
if not rs or not old_uuid == rs.uuid then
error("Bucket lost during reconfigure.")
end
@@ -1111,7 +1111,7 @@ end;
vshard.router.cfg(cfg);
---
...
-vshard.router.internal.route_map = {};
+vshard.router.internal.static_router.route_map = {};
---
...
vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY = false;
@@ -1119,7 +1119,7 @@ vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY = false;
...
-- Do discovery iteration. Upload buckets from the
-- first replicaset.
-while not next(vshard.router.internal.route_map) do
+while not next(vshard.router.internal.static_router.route_map) do
vshard.router.discovery_wakeup()
fiber.sleep(0.01)
end;
@@ -1128,12 +1128,12 @@ end;
new_replicasets = {};
---
...
-for _, rs in pairs(vshard.router.internal.replicasets) do
+for _, rs in pairs(vshard.router.internal.static_router.replicasets) do
new_replicasets[rs] = true
end;
---
...
-_, rs = next(vshard.router.internal.route_map);
+_, rs = next(vshard.router.internal.static_router.route_map);
---
...
new_replicasets[rs] == true;
@@ -1185,6 +1185,17 @@ vshard.router.route(1):callro('echo', {'some_data'})
- null
- null
...
+-- Multiple routers: check that static router can be used as an
+-- object.
+static_router = vshard.router.internal.static_router
+---
+...
+static_router:route(1):callro('echo', {'some_data'})
+---
+- some_data
+- null
+- null
+...
_ = test_run:cmd("switch default")
---
...
diff --git a/test/router/router.test.lua b/test/router/router.test.lua
index df2f381..d7588f7 100644
--- a/test/router/router.test.lua
+++ b/test/router/router.test.lua
@@ -27,8 +27,8 @@ util = require('util')
-- gh-24: log all connect/disconnect events.
test_run:grep_log('router_1', 'connected to ')
-rs1 = vshard.router.internal.replicasets[replicasets[1]]
-rs2 = vshard.router.internal.replicasets[replicasets[2]]
+rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]]
+rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]]
fiber = require('fiber')
while not rs1.replica or not rs2.replica do fiber.sleep(0.1) end
-- With no zones the nearest server is master.
@@ -39,7 +39,7 @@ rs2.replica == rs2.master
-- Part of gh-76: on reconfiguration do not recreate connections
-- to replicas, that are kept in a new configuration.
--
-old_replicasets = vshard.router.internal.replicasets
+old_replicasets = vshard.router.internal.static_router.replicasets
old_connections = {}
connection_count = 0
test_run:cmd("setopt delimiter ';'")
@@ -52,10 +52,10 @@ end;
test_run:cmd("setopt delimiter ''");
connection_count == 4
vshard.router.cfg(cfg)
-new_replicasets = vshard.router.internal.replicasets
+new_replicasets = vshard.router.internal.static_router.replicasets
old_replicasets ~= new_replicasets
-rs1 = vshard.router.internal.replicasets[replicasets[1]]
-rs2 = vshard.router.internal.replicasets[replicasets[2]]
+rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]]
+rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]]
while not rs1.replica or not rs2.replica do fiber.sleep(0.1) end
vshard.router.discovery_wakeup()
-- Check that netbox connections are the same.
@@ -91,7 +91,7 @@ vshard.router.bootstrap()
--
-- gh-108: negative bucket count on discovery.
--
-vshard.router.internal.route_map = {}
+vshard.router.internal.static_router.route_map = {}
rets = {}
function do_echo() table.insert(rets, vshard.router.callro(1, 'echo', {1})) end
f1 = fiber.create(do_echo) f2 = fiber.create(do_echo)
@@ -153,7 +153,7 @@ conn = vshard.router.route(1).master.conn
conn.state
-- Test missing master.
rs_uuid = 'ac522f65-aa94-4134-9f64-51ee384f1a54'
-rs = vshard.router.internal.replicasets[rs_uuid]
+rs = vshard.router.internal.static_router.replicasets[rs_uuid]
master = rs.master
rs.master = nil
vshard.router.route(1).master
@@ -223,7 +223,7 @@ vshard.router.info()
-- Remove replica and master connections to trigger alert
-- UNREACHABLE_REPLICASET.
-rs = vshard.router.internal.replicasets[replicasets[1]]
+rs = vshard.router.internal.static_router.replicasets[replicasets[1]]
master_conn = rs.master.conn
replica_conn = rs.replica.conn
rs.master.conn = nil
@@ -261,7 +261,7 @@ util.check_error(vshard.router.buckets_info, 123, '456')
test_run:cmd("setopt delimiter ';'")
function calculate_known_buckets()
local known_buckets = 0
- for _, rs in pairs(vshard.router.internal.route_map) do
+ for _, rs in pairs(vshard.router.internal.static_router.route_map) do
known_buckets = known_buckets + 1
end
return known_buckets
@@ -301,10 +301,10 @@ test_run:switch('router_1')
--
test_run:cmd("setopt delimiter ';'")
for i = 1, 100 do
- local rs = vshard.router.internal.route_map[i]
+ local rs = vshard.router.internal.static_router.route_map[i]
assert(rs)
rs.bucket_count = rs.bucket_count - 1
- vshard.router.internal.route_map[i] = nil
+ vshard.router.internal.static_router.route_map[i] = nil
end;
test_run:cmd("setopt delimiter ''");
calculate_known_buckets()
@@ -367,7 +367,7 @@ vshard.router.sync(100500)
-- object method like this: object.method() instead of
-- object:method(), an appropriate help-error returns.
--
-_, replicaset = next(vshard.router.internal.replicasets)
+_, replicaset = next(vshard.router.internal.static_router.replicasets)
error_messages = {}
test_run:cmd("setopt delimiter ';'")
@@ -395,7 +395,7 @@ error_messages
bucket_to_old_rs = {}
bucket_cnt = 0
test_run:cmd("setopt delimiter ';'")
-for bucket, rs in pairs(vshard.router.internal.route_map) do
+for bucket, rs in pairs(vshard.router.internal.static_router.route_map) do
bucket_to_old_rs[bucket] = rs
bucket_cnt = bucket_cnt + 1
end;
@@ -403,7 +403,7 @@ bucket_cnt;
vshard.router.cfg(cfg);
for bucket, old_rs in pairs(bucket_to_old_rs) do
local old_uuid = old_rs.uuid
- local rs = vshard.router.internal.route_map[bucket]
+ local rs = vshard.router.internal.static_router.route_map[bucket]
if not rs or not old_uuid == rs.uuid then
error("Bucket lost during reconfigure.")
end
@@ -423,19 +423,19 @@ while vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY ~= 'waiting' do
fiber.sleep(0.02)
end;
vshard.router.cfg(cfg);
-vshard.router.internal.route_map = {};
+vshard.router.internal.static_router.route_map = {};
vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY = false;
-- Do discovery iteration. Upload buckets from the
-- first replicaset.
-while not next(vshard.router.internal.route_map) do
+while not next(vshard.router.internal.static_router.route_map) do
vshard.router.discovery_wakeup()
fiber.sleep(0.01)
end;
new_replicasets = {};
-for _, rs in pairs(vshard.router.internal.replicasets) do
+for _, rs in pairs(vshard.router.internal.static_router.replicasets) do
new_replicasets[rs] = true
end;
-_, rs = next(vshard.router.internal.route_map);
+_, rs = next(vshard.router.internal.static_router.route_map);
new_replicasets[rs] == true;
test_run:cmd("setopt delimiter ''");
@@ -453,6 +453,11 @@ vshard.router.internal.errinj.ERRINJ_CFG = false
util.has_same_fields(old_internal, vshard.router.internal)
vshard.router.route(1):callro('echo', {'some_data'})
+-- Multiple routers: check that static router can be used as an
+-- object.
+static_router = vshard.router.internal.static_router
+static_router:route(1):callro('echo', {'some_data'})
+
_ = test_run:cmd("switch default")
test_run:drop_cluster(REPLICASET_2)
diff --git a/vshard/error.lua b/vshard/error.lua
index f79107b..da92b58 100644
--- a/vshard/error.lua
+++ b/vshard/error.lua
@@ -105,7 +105,12 @@ local error_message_template = {
name = 'OBJECT_IS_OUTDATED',
msg = 'Object is outdated after module reload/reconfigure. ' ..
'Use new instance.'
- }
+ },
+ [21] = {
+ name = 'ROUTER_ALREADY_EXISTS',
+ msg = 'Router with name %s already exists',
+ args = {'name'},
+ },
}
--
diff --git a/vshard/router/init.lua b/vshard/router/init.lua
index 59c25a0..b31f7dc 100644
--- a/vshard/router/init.lua
+++ b/vshard/router/init.lua
@@ -25,14 +25,33 @@ local M = rawget(_G, MODULE_INTERNALS)
if not M then
M = {
---------------- Common module attributes ----------------
- -- The last passed configuration.
- current_cfg = nil,
errinj = {
ERRINJ_CFG = false,
ERRINJ_FAILOVER_CHANGE_CFG = false,
ERRINJ_RELOAD = false,
ERRINJ_LONG_DISCOVERY = false,
},
+ -- Dictionary, key is router name, value is a router.
+ routers = {},
+ -- Router object which can be accessed via the old API,
+ -- e.g. vshard.router.call(...).
+ static_router = nil,
+ -- This counter is used to restart background fibers with
+ -- new reloaded code.
+ module_version = 0,
+ -- Number of routers which require collecting Lua garbage.
+ collect_lua_garbage_cnt = 0,
+ }
+end
+
+--
+-- Router object attributes.
+--
+local ROUTER_TEMPLATE = {
+ -- Name of router.
+ name = nil,
+ -- The last passed configuration.
+ current_cfg = nil,
-- Time to outdate old objects on reload.
connection_outdate_delay = nil,
-- Bucket map cache.
@@ -47,38 +66,60 @@ if not M then
total_bucket_count = 0,
-- Boolean lua_gc state (create periodic gc task).
collect_lua_garbage = nil,
- -- This counter is used to restart background fibers with
- -- new reloaded code.
- module_version = 0,
- }
-end
+}
+
+local STATIC_ROUTER_NAME = '_static_router'
-- Set a bucket to a replicaset.
-local function bucket_set(bucket_id, rs_uuid)
- local replicaset = M.replicasets[rs_uuid]
+local function bucket_set(router, bucket_id, rs_uuid)
+ local replicaset = router.replicasets[rs_uuid]
-- It is technically possible to delete a replicaset at the
-- same time when route to the bucket is discovered.
if not replicaset then
return nil, lerror.vshard(lerror.code.NO_ROUTE_TO_BUCKET,
bucket_id)
end
- local old_replicaset = M.route_map[bucket_id]
+ local old_replicaset = router.route_map[bucket_id]
if old_replicaset ~= replicaset then
if old_replicaset then
old_replicaset.bucket_count = old_replicaset.bucket_count - 1
end
replicaset.bucket_count = replicaset.bucket_count + 1
end
- M.route_map[bucket_id] = replicaset
+ router.route_map[bucket_id] = replicaset
return replicaset
end
-- Remove a bucket from the cache.
-local function bucket_reset(bucket_id)
- local replicaset = M.route_map[bucket_id]
+local function bucket_reset(router, bucket_id)
+ local replicaset = router.route_map[bucket_id]
if replicaset then
replicaset.bucket_count = replicaset.bucket_count - 1
end
- M.route_map[bucket_id] = nil
+ router.route_map[bucket_id] = nil
+end
+
+--------------------------------------------------------------------------------
+-- Helpers
+--------------------------------------------------------------------------------
+
+--
+-- Increase/decrease the number of routers which require Lua
+-- garbage collection and change the state of the `lua_gc` fiber
+-- accordingly.
+--
+
+local function lua_gc_cnt_inc()
+ M.collect_lua_garbage_cnt = M.collect_lua_garbage_cnt + 1
+ if M.collect_lua_garbage_cnt == 1 then
+ lua_gc.set_state(true, consts.COLLECT_LUA_GARBAGE_INTERVAL)
+ end
+end
+
+local function lua_gc_cnt_dec()
+ M.collect_lua_garbage_cnt = M.collect_lua_garbage_cnt - 1
+ assert(M.collect_lua_garbage_cnt >= 0)
+ if M.collect_lua_garbage_cnt == 0 then
+ lua_gc.set_state(false, consts.COLLECT_LUA_GARBAGE_INTERVAL)
+ end
end
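The two helpers above gate a single shared `lua_gc` fiber with a reference count: the fiber is switched on when the first router that wants periodic Lua GC appears, and off only when the last one is gone. A minimal standalone sketch of this pattern (all names below are illustrative, not vshard's real API):

```lua
-- Reference-counted on/off switch for a shared background task.
local gc = { cnt = 0, running = false }

-- Stand-in for lua_gc.set_state(): just records the desired state.
local function set_state(on)
    gc.running = on
end

local function cnt_inc()
    gc.cnt = gc.cnt + 1
    -- Start the shared task only on the 0 -> 1 transition.
    if gc.cnt == 1 then set_state(true) end
end

local function cnt_dec()
    gc.cnt = gc.cnt - 1
    assert(gc.cnt >= 0)
    -- Stop it only on the 1 -> 0 transition.
    if gc.cnt == 0 then set_state(false) end
end

cnt_inc(); cnt_inc()
cnt_dec()
assert(gc.running == true)  -- one router still wants GC
cnt_dec()
assert(gc.running == false) -- last user gone, task stopped
```

This is why the patch replaces the old per-module boolean `collect_lua_garbage` with the shared counter `collect_lua_garbage_cnt`: several routers may toggle the option independently without fighting over one global flag.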
--------------------------------------------------------------------------------
@@ -86,8 +127,8 @@ end
--------------------------------------------------------------------------------
-- Search bucket in whole cluster
-local function bucket_discovery(bucket_id)
- local replicaset = M.route_map[bucket_id]
+local function bucket_discovery(router, bucket_id)
+ local replicaset = router.route_map[bucket_id]
if replicaset ~= nil then
return replicaset
end
@@ -95,11 +136,11 @@ local function bucket_discovery(bucket_id)
log.verbose("Discovering bucket %d", bucket_id)
local last_err = nil
local unreachable_uuid = nil
- for uuid, replicaset in pairs(M.replicasets) do
+ for uuid, replicaset in pairs(router.replicasets) do
local _, err =
replicaset:callrw('vshard.storage.bucket_stat', {bucket_id})
if err == nil then
- return bucket_set(bucket_id, replicaset.uuid)
+ return bucket_set(router, bucket_id, replicaset.uuid)
elseif err.code ~= lerror.code.WRONG_BUCKET then
last_err = err
unreachable_uuid = uuid
@@ -128,14 +169,14 @@ local function bucket_discovery(bucket_id)
end
-- Resolve bucket id to replicaset uuid
-local function bucket_resolve(bucket_id)
+local function bucket_resolve(router, bucket_id)
local replicaset, err
- local replicaset = M.route_map[bucket_id]
+ local replicaset = router.route_map[bucket_id]
if replicaset ~= nil then
return replicaset
end
-- Replicaset removed from cluster, perform discovery
- replicaset, err = bucket_discovery(bucket_id)
+ replicaset, err = bucket_discovery(router, bucket_id)
if replicaset == nil then
return nil, err
end
@@ -146,14 +187,14 @@ end
-- Background fiber to perform discovery. It periodically scans
-- replicasets one by one and updates route_map.
--
-local function discovery_f()
+local function discovery_f(router)
local module_version = M.module_version
while module_version == M.module_version do
- while not next(M.replicasets) do
+ while not next(router.replicasets) do
lfiber.sleep(consts.DISCOVERY_INTERVAL)
end
- local old_replicasets = M.replicasets
- for rs_uuid, replicaset in pairs(M.replicasets) do
+ local old_replicasets = router.replicasets
+ for rs_uuid, replicaset in pairs(router.replicasets) do
local active_buckets, err =
replicaset:callro('vshard.storage.buckets_discovery', {},
{timeout = 2})
@@ -163,7 +204,7 @@ local function discovery_f()
end
-- Renew replicasets object captured by the for loop
-- in case of reconfigure and reload events.
- if M.replicasets ~= old_replicasets then
+ if router.replicasets ~= old_replicasets then
break
end
if not active_buckets then
@@ -176,11 +217,11 @@ local function discovery_f()
end
replicaset.bucket_count = #active_buckets
for _, bucket_id in pairs(active_buckets) do
- local old_rs = M.route_map[bucket_id]
+ local old_rs = router.route_map[bucket_id]
if old_rs and old_rs ~= replicaset then
old_rs.bucket_count = old_rs.bucket_count - 1
end
- M.route_map[bucket_id] = replicaset
+ router.route_map[bucket_id] = replicaset
end
end
lfiber.sleep(consts.DISCOVERY_INTERVAL)
@@ -191,9 +232,9 @@ end
--
-- Immediately wakeup discovery fiber if exists.
--
-local function discovery_wakeup()
- if M.discovery_fiber then
- M.discovery_fiber:wakeup()
+local function discovery_wakeup(router)
+ if router.discovery_fiber then
+ router.discovery_fiber:wakeup()
end
end
@@ -205,7 +246,7 @@ end
-- Function will restart operation after wrong bucket response until timeout
-- is reached
--
-local function router_call(bucket_id, mode, func, args, opts)
+local function router_call(router, bucket_id, mode, func, args, opts)
if opts and (type(opts) ~= 'table' or
(opts.timeout and type(opts.timeout) ~= 'number')) then
error('Usage: call(bucket_id, mode, func, args, opts)')
@@ -213,7 +254,7 @@ local function router_call(bucket_id, mode, func, args, opts)
local timeout = opts and opts.timeout or consts.CALL_TIMEOUT_MIN
local replicaset, err
local tend = lfiber.time() + timeout
- if bucket_id > M.total_bucket_count or bucket_id <= 0 then
+ if bucket_id > router.total_bucket_count or bucket_id <= 0 then
error('Bucket is unreachable: bucket id is out of range')
end
local call
@@ -223,7 +264,7 @@ local function router_call(bucket_id, mode, func, args, opts)
call = 'callrw'
end
repeat
- replicaset, err = bucket_resolve(bucket_id)
+ replicaset, err = bucket_resolve(router, bucket_id)
if replicaset then
::replicaset_is_found::
local storage_call_status, call_status, call_error =
@@ -239,9 +280,9 @@ local function router_call(bucket_id, mode, func, args, opts)
end
err = call_status
if err.code == lerror.code.WRONG_BUCKET then
- bucket_reset(bucket_id)
+ bucket_reset(router, bucket_id)
if err.destination then
- replicaset = M.replicasets[err.destination]
+ replicaset = router.replicasets[err.destination]
if not replicaset then
log.warn('Replicaset "%s" was not found, but received'..
' from storage as destination - please '..
@@ -253,13 +294,13 @@ local function router_call(bucket_id, mode, func, args, opts)
-- but already is executed on storages.
while lfiber.time() <= tend do
lfiber.sleep(0.05)
- replicaset = M.replicasets[err.destination]
+ replicaset = router.replicasets[err.destination]
if replicaset then
goto replicaset_is_found
end
end
else
- replicaset = bucket_set(bucket_id, replicaset.uuid)
+ replicaset = bucket_set(router, bucket_id,
+ replicaset.uuid)
lfiber.yield()
-- Protect against infinite cycle in a
-- case of broken cluster, when a bucket
@@ -276,7 +318,7 @@ local function router_call(bucket_id, mode, func, args, opts)
-- is not timeout - these requests are repeated in
-- any case on client, if error.
assert(mode == 'write')
- bucket_reset(bucket_id)
+ bucket_reset(router, bucket_id)
return nil, err
elseif err.code == lerror.code.NON_MASTER then
-- Same, as above - do not wait and repeat.
@@ -302,12 +344,12 @@ end
--
-- Wrappers for router_call with preset mode.
--
-local function router_callro(bucket_id, ...)
- return router_call(bucket_id, 'read', ...)
+local function router_callro(router, bucket_id, ...)
+ return router_call(router, bucket_id, 'read', ...)
end
-local function router_callrw(bucket_id, ...)
- return router_call(bucket_id, 'write', ...)
+local function router_callrw(router, bucket_id, ...)
+ return router_call(router, bucket_id, 'write', ...)
end
--
@@ -315,27 +357,27 @@ end
-- @param bucket_id Bucket identifier.
-- @retval Netbox connection.
--
-local function router_route(bucket_id)
+local function router_route(router, bucket_id)
if type(bucket_id) ~= 'number' then
error('Usage: router.route(bucket_id)')
end
- return bucket_resolve(bucket_id)
+ return bucket_resolve(router, bucket_id)
end
--
-- Return map of all replicasets.
-- @retval See self.replicasets map.
--
-local function router_routeall()
- return M.replicasets
+local function router_routeall(router)
+ return router.replicasets
end
--------------------------------------------------------------------------------
-- Failover
--------------------------------------------------------------------------------
-local function failover_ping_round()
- for _, replicaset in pairs(M.replicasets) do
+local function failover_ping_round(router)
+ for _, replicaset in pairs(router.replicasets) do
local replica = replicaset.replica
if replica ~= nil and replica.conn ~= nil and
replica.down_ts == nil then
@@ -378,10 +420,10 @@ end
-- Collect UUIDs of replicasets, priority of whose replica
-- connections must be updated.
--
-local function failover_collect_to_update()
+local function failover_collect_to_update(router)
local ts = lfiber.time()
local uuid_to_update = {}
- for uuid, rs in pairs(M.replicasets) do
+ for uuid, rs in pairs(router.replicasets) do
if failover_need_down_priority(rs, ts) or
failover_need_up_priority(rs, ts) then
table.insert(uuid_to_update, uuid)
@@ -396,16 +438,16 @@ end
-- disconnected replicas.
-- @retval true A replica of an replicaset has been changed.
--
-local function failover_step()
- failover_ping_round()
- local uuid_to_update = failover_collect_to_update()
+local function failover_step(router)
+ failover_ping_round(router)
+ local uuid_to_update = failover_collect_to_update(router)
if #uuid_to_update == 0 then
return false
end
local curr_ts = lfiber.time()
local replica_is_changed = false
for _, uuid in pairs(uuid_to_update) do
- local rs = M.replicasets[uuid]
+ local rs = router.replicasets[uuid]
if M.errinj.ERRINJ_FAILOVER_CHANGE_CFG then
rs = nil
M.errinj.ERRINJ_FAILOVER_CHANGE_CFG = false
@@ -447,7 +489,7 @@ end
-- tries to reconnect to the best replica. When the connection is
-- established, it replaces the original replica.
--
-local function failover_f()
+local function failover_f(router)
local module_version = M.module_version
local min_timeout = math.min(consts.FAILOVER_UP_TIMEOUT,
consts.FAILOVER_DOWN_TIMEOUT)
@@ -457,7 +499,7 @@ local function failover_f()
local prev_was_ok = false
while module_version == M.module_version do
::continue::
- local ok, replica_is_changed = pcall(failover_step)
+ local ok, replica_is_changed = pcall(failover_step, router)
if not ok then
log.error('Error during failovering: %s',
lerror.make(replica_is_changed))
@@ -484,8 +526,8 @@ end
-- Configuration
--------------------------------------------------------------------------------
-local function router_cfg(cfg, is_reload)
- cfg = lcfg.check(cfg, M.current_cfg)
+local function router_cfg(router, cfg, is_reload)
+ cfg = lcfg.check(cfg, router.current_cfg)
local vshard_cfg, box_cfg = lcfg.split(cfg)
if not M.replicasets then
log.info('Starting router configuration')
@@ -511,41 +553,47 @@ local function router_cfg(cfg, is_reload)
-- Move connections from an old configuration to a new one.
-- It must be done with no yields to prevent usage both of not
-- fully moved old replicasets, and not fully built new ones.
- lreplicaset.rebind_replicasets(new_replicasets, M.replicasets)
+ lreplicaset.rebind_replicasets(new_replicasets, router.replicasets)
-- Now the new replicasets are fully built. Can establish
-- connections and yield.
for _, replicaset in pairs(new_replicasets) do
replicaset:connect_all()
end
+ -- Change state of lua GC.
+    if vshard_cfg.collect_lua_garbage and not
+       router.collect_lua_garbage then
+ lua_gc_cnt_inc()
+ elseif not vshard_cfg.collect_lua_garbage and
+ router.collect_lua_garbage then
+ lua_gc_cnt_dec()
+ end
lreplicaset.wait_masters_connect(new_replicasets)
- lreplicaset.outdate_replicasets(M.replicasets,
+ lreplicaset.outdate_replicasets(router.replicasets,
vshard_cfg.connection_outdate_delay)
- M.connection_outdate_delay = vshard_cfg.connection_outdate_delay
- M.total_bucket_count = total_bucket_count
- M.collect_lua_garbage = vshard_cfg.collect_lua_garbage
- M.current_cfg = cfg
- M.replicasets = new_replicasets
- for bucket, rs in pairs(M.route_map) do
- M.route_map[bucket] = M.replicasets[rs.uuid]
- end
- if M.failover_fiber == nil then
-        M.failover_fiber = util.reloadable_fiber_create('vshard.failover', M,
-                                                        'failover_f')
+ router.connection_outdate_delay = vshard_cfg.connection_outdate_delay
+ router.total_bucket_count = total_bucket_count
+ router.collect_lua_garbage = vshard_cfg.collect_lua_garbage
+ router.current_cfg = cfg
+ router.replicasets = new_replicasets
+ for bucket, rs in pairs(router.route_map) do
+ router.route_map[bucket] = router.replicasets[rs.uuid]
+ end
+ if router.failover_fiber == nil then
+ router.failover_fiber = util.reloadable_fiber_create(
+ 'vshard.failover.' .. router.name, M, 'failover_f', router)
+ end
+ if router.discovery_fiber == nil then
+ router.discovery_fiber = util.reloadable_fiber_create(
+ 'vshard.discovery.' .. router.name, M, 'discovery_f', router)
end
-    if M.discovery_fiber == nil then
-        M.discovery_fiber = util.reloadable_fiber_create('vshard.discovery',
-                                                         M, 'discovery_f')
-    end
-    lua_gc.set_state(M.collect_lua_garbage,
-                     consts.COLLECT_LUA_GARBAGE_INTERVAL)
end
--------------------------------------------------------------------------------
-- Bootstrap
--------------------------------------------------------------------------------
-local function cluster_bootstrap()
+local function cluster_bootstrap(router)
local replicasets = {}
- for uuid, replicaset in pairs(M.replicasets) do
+ for uuid, replicaset in pairs(router.replicasets) do
table.insert(replicasets, replicaset)
        local count, err =
            replicaset:callrw('vshard.storage.buckets_count', {})
@@ -556,9 +604,10 @@ local function cluster_bootstrap()
return nil, lerror.vshard(lerror.code.NON_EMPTY)
end
end
-    lreplicaset.calculate_etalon_balance(M.replicasets, M.total_bucket_count)
+ lreplicaset.calculate_etalon_balance(router.replicasets,
+ router.total_bucket_count)
local bucket_id = 1
- for uuid, replicaset in pairs(M.replicasets) do
+ for uuid, replicaset in pairs(router.replicasets) do
if replicaset.etalon_bucket_count > 0 then
local ok, err =
replicaset:callrw('vshard.storage.bucket_force_create',
@@ -614,7 +663,7 @@ local function replicaset_instance_info(replicaset, name, alerts, errcolor,
return info, consts.STATUS.GREEN
end
-local function router_info()
+local function router_info(router)
local state = {
replicasets = {},
bucket = {
@@ -628,7 +677,7 @@ local function router_info()
}
local bucket_info = state.bucket
local known_bucket_count = 0
- for rs_uuid, replicaset in pairs(M.replicasets) do
+ for rs_uuid, replicaset in pairs(router.replicasets) do
-- Replicaset info parameters:
-- * master instance info;
-- * replica instance info;
@@ -716,7 +765,7 @@ local function router_info()
-- If a bucket is unreachable, then replicaset is
-- unreachable too and color already is red.
end
- bucket_info.unknown = M.total_bucket_count - known_bucket_count
+ bucket_info.unknown = router.total_bucket_count - known_bucket_count
if bucket_info.unknown > 0 then
state.status = math.max(state.status, consts.STATUS.YELLOW)
        table.insert(state.alerts, lerror.alert(lerror.code.UNKNOWN_BUCKETS,
@@ -733,13 +782,13 @@ end
-- @param limit Maximal bucket count in output.
-- @retval Map of type {bucket_id = 'unknown'/replicaset_uuid}.
--
-local function router_buckets_info(offset, limit)
+local function router_buckets_info(router, offset, limit)
if offset ~= nil and type(offset) ~= 'number' or
limit ~= nil and type(limit) ~= 'number' then
error('Usage: buckets_info(offset, limit)')
end
offset = offset or 0
- limit = limit or M.total_bucket_count
+ limit = limit or router.total_bucket_count
local ret = {}
-- Use one string memory for all unknown buckets.
local available_rw = 'available_rw'
@@ -748,9 +797,9 @@ local function router_buckets_info(offset, limit)
local unreachable = 'unreachable'
-- Collect limit.
local first = math.max(1, offset + 1)
- local last = math.min(offset + limit, M.total_bucket_count)
+ local last = math.min(offset + limit, router.total_bucket_count)
for bucket_id = first, last do
- local rs = M.route_map[bucket_id]
+ local rs = router.route_map[bucket_id]
if rs then
if rs.master and rs.master:is_connected() then
ret[bucket_id] = {uuid = rs.uuid, status = available_rw}
@@ -770,22 +819,22 @@ end
-- Other
--------------------------------------------------------------------------------
-local function router_bucket_id(key)
+local function router_bucket_id(router, key)
if key == nil then
error("Usage: vshard.router.bucket_id(key)")
end
- return lhash.key_hash(key) % M.total_bucket_count + 1
+ return lhash.key_hash(key) % router.total_bucket_count + 1
end
-local function router_bucket_count()
- return M.total_bucket_count
+local function router_bucket_count(router)
+ return router.total_bucket_count
end
-local function router_sync(timeout)
+local function router_sync(router, timeout)
if timeout ~= nil and type(timeout) ~= 'number' then
error('Usage: vshard.router.sync([timeout: number])')
end
- for rs_uuid, replicaset in pairs(M.replicasets) do
+ for rs_uuid, replicaset in pairs(router.replicasets) do
        local status, err = replicaset:callrw('vshard.storage.sync',
                                              {timeout})
if not status then
-- Add information about replicaset
@@ -799,6 +848,90 @@ if M.errinj.ERRINJ_RELOAD then
error('Error injection: reload')
end
+--------------------------------------------------------------------------------
+-- Managing router instances
+--------------------------------------------------------------------------------
+
+local function cfg_reconfigure(router, cfg)
+ return router_cfg(router, cfg, false)
+end
+
+local router_mt = {
+ __index = {
+ cfg = cfg_reconfigure;
+ info = router_info;
+ buckets_info = router_buckets_info;
+ call = router_call;
+ callro = router_callro;
+ callrw = router_callrw;
+ route = router_route;
+ routeall = router_routeall;
+ bucket_id = router_bucket_id;
+ bucket_count = router_bucket_count;
+ sync = router_sync;
+ bootstrap = cluster_bootstrap;
+ bucket_discovery = bucket_discovery;
+ discovery_wakeup = discovery_wakeup;
+ }
+}
+
+-- Table which represents this module.
+local module = {}
+
+-- This metatable forwards calls on the module to the static_router.
+local module_mt = {__index = {}}
+for method_name, method in pairs(router_mt.__index) do
+ module_mt.__index[method_name] = function(...)
+ return method(M.static_router, ...)
+ end
+end
+
+local function export_static_router_attributes()
+ setmetatable(module, module_mt)
+end
+
+--
+-- Create a new instance of router.
+-- @param name Name of a new router.
+-- @param cfg Configuration for `router_cfg`.
+-- @retval Router instance.
+-- @retval Nil and error object.
+--
+local function router_new(name, cfg)
+ if type(name) ~= 'string' or type(cfg) ~= 'table' then
+        error('Wrong argument type. Usage: vshard.router.new(name, cfg).')
+ end
+ if M.routers[name] then
+ return nil, lerror.vshard(lerror.code.ROUTER_ALREADY_EXISTS, name)
+ end
+ local router = table.deepcopy(ROUTER_TEMPLATE)
+ setmetatable(router, router_mt)
+ router.name = name
+ M.routers[name] = router
+ router_cfg(router, cfg)
+ return router
+end
+
+--
+-- Wrapper around the `router_new` API which allows using the old
+-- static `vshard.router.cfg()` API.
+--
+local function legacy_cfg(cfg)
+ if M.static_router then
+ -- Reconfigure.
+ router_cfg(M.static_router, cfg, false)
+ else
+ -- Create new static instance.
+ local router, err = router_new(STATIC_ROUTER_NAME, cfg)
+ if router then
+ M.static_router = router
+ export_static_router_attributes()
+ else
+ return nil, err
+ end
+ end
+end
+
--------------------------------------------------------------------------------
-- Module definition
--------------------------------------------------------------------------------
@@ -809,28 +942,23 @@ end
if not rawget(_G, MODULE_INTERNALS) then
rawset(_G, MODULE_INTERNALS, M)
else
- router_cfg(M.current_cfg, true)
+ for _, router in pairs(M.routers) do
+ router_cfg(router, router.current_cfg, true)
+ setmetatable(router, router_mt)
+ end
+ if M.static_router then
+ export_static_router_attributes()
+ end
M.module_version = M.module_version + 1
end
M.discovery_f = discovery_f
M.failover_f = failover_f
+M.router_mt = router_mt
-return {
- cfg = function(cfg) return router_cfg(cfg, false) end;
- info = router_info;
- buckets_info = router_buckets_info;
- call = router_call;
- callro = router_callro;
- callrw = router_callrw;
- route = router_route;
- routeall = router_routeall;
- bucket_id = router_bucket_id;
- bucket_count = router_bucket_count;
- sync = router_sync;
- bootstrap = cluster_bootstrap;
- bucket_discovery = bucket_discovery;
- discovery_wakeup = discovery_wakeup;
- internal = M;
- module_version = function() return M.module_version end;
-}
+module.cfg = legacy_cfg
+module.new = router_new
+module.internal = M
+module.module_version = function() return M.module_version end
+
+return module
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index 0593edf..63aa96f 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -1632,8 +1632,6 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload)
M.rebalancer_max_receiving = rebalancer_max_receiving
M.shard_index = shard_index
M.collect_bucket_garbage_interval = collect_bucket_garbage_interval
- M.collect_lua_garbage = collect_lua_garbage
- M.current_cfg = cfg
M.collect_lua_garbage = vshard_cfg.collect_lua_garbage
M.current_cfg = cfg
diff --git a/vshard/util.lua b/vshard/util.lua
index 37abe2b..3afaa61 100644
--- a/vshard/util.lua
+++ b/vshard/util.lua
@@ -38,11 +38,11 @@ end
-- reload of that module.
-- See description of parameters in `reloadable_fiber_create`.
--
-local function reloadable_fiber_main_loop(module, func_name)
+local function reloadable_fiber_main_loop(module, func_name, data)
log.info('%s has been started', func_name)
local func = module[func_name]
::restart_loop::
- local ok, err = pcall(func)
+ local ok, err = pcall(func, data)
-- yield serves two purposes:
-- * makes this fiber cancellable
-- * prevents 100% cpu consumption
@@ -60,7 +60,7 @@ local function reloadable_fiber_main_loop(module, func_name)
log.info('module is reloaded, restarting')
-- luajit drops this frame if next function is called in
-- return statement.
- return M.reloadable_fiber_main_loop(module, func_name)
+ return M.reloadable_fiber_main_loop(module, func_name, data)
end
--
@@ -74,11 +74,13 @@ end
-- @param module Module which can be reloaded.
-- @param func_name Name of a function to be executed in the
-- module.
+-- @param data Data to be passed to the specified function.
-- @retval New fiber.
--
-local function reloadable_fiber_create(fiber_name, module, func_name)
+local function reloadable_fiber_create(fiber_name, module, func_name, data)
assert(type(fiber_name) == 'string')
-    local xfiber = fiber.create(reloadable_fiber_main_loop, module, func_name)
+    local xfiber = fiber.create(reloadable_fiber_main_loop, module, func_name,
+                                data)
xfiber:name(fiber_name)
return xfiber
end
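
The util.lua hunk above threads an opaque `data` argument through the
reloadable fiber machinery, so each background loop (failover, discovery)
receives its router instance on every start and restart. A minimal
standalone sketch of that plumbing, with the tarantool `fiber` machinery
replaced by a direct call and made-up names (`worker_module`, `router_1`)
for illustration:

```lua
-- Collected output, standing in for log.info() lines.
local log_lines = {}

-- Simplified reloadable main loop: look the function up by name (so a
-- module reload picks up new code) and pass `data` through to it.
local function reloadable_main_loop(module, func_name, data)
    local func = module[func_name]
    local ok, err = pcall(func, data)
    if not ok then
        table.insert(log_lines, 'error: ' .. tostring(err))
    end
end

-- A toy "module" whose worker needs per-instance state, like failover_f
-- needs its router in the patch.
local worker_module = {
    failover_f = function(router)
        table.insert(log_lines, 'failover for ' .. router.name)
    end,
}

reloadable_main_loop(worker_module, 'failover_f', {name = 'router_1'})
-- log_lines[1] is now 'failover for router_1'
```

Without the `data` parameter, the worker could only reach shared state
through the module table `M`, which is exactly what multiple routers make
impossible.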
^ permalink raw reply [flat|nested] 23+ messages in thread
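
The backward-compatibility trick in patch 3 is the module metatable that
forwards every router method to `M.static_router`. A self-contained sketch
of that delegation pattern (toy methods only; `total_bucket_count` and the
`_static_router` name here are illustrative, not the vshard source):

```lua
local M = { static_router = nil }

-- Per-router methods take the router as the first argument.
local router_mt = {
    __index = {
        bucket_count = function(router) return router.total_bucket_count end,
        name = function(router) return router.name end,
    }
}

-- Build module-level wrappers that prepend the static router as the
-- implicit `self`, so old vshard.router.<method>() calls keep working.
local module = {}
local module_mt = {__index = {}}
for method_name, method in pairs(router_mt.__index) do
    module_mt.__index[method_name] = function(...)
        return method(M.static_router, ...)
    end
end
setmetatable(module, module_mt)

-- Once the static router exists, module calls delegate to it.
M.static_router = setmetatable({name = '_static_router',
                                total_bucket_count = 3000}, router_mt)
print(module.bucket_count()) -- 3000
```

Note the wrappers capture `M.static_router` at call time, not at build
time, so the table can be created (or replaced on reload) after the
metatable is installed.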
* [tarantool-patches] Re: [PATCH 1/3] Update only vshard part of a cfg on reload
2018-08-06 17:03 ` Vladislav Shpilevoy
@ 2018-08-07 13:19 ` Alex Khatskevich
2018-08-08 11:17 ` Vladislav Shpilevoy
0 siblings, 1 reply; 23+ messages in thread
From: Alex Khatskevich @ 2018-08-07 13:19 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches
On 06.08.2018 20:03, Vladislav Shpilevoy wrote:
> Thanks for the patch! See 3 comments below.
>
>> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
>> index 102b942..40216ea 100644
>> --- a/vshard/storage/init.lua
>> +++ b/vshard/storage/init.lua
>> @@ -1553,18 +1553,19 @@ local function storage_cfg(cfg, this_replica_uuid)
>> --
>> -- If a master role of the replica is not changed, then
>> -- 'read_only' can be set right here.
>> - cfg.listen = cfg.listen or this_replica.uri
>> - if cfg.replication == nil and this_replicaset.master and not is_master then
>> - cfg.replication = {this_replicaset.master.uri}
>> + box_cfg.listen = box_cfg.listen or this_replica.uri
>> + if box_cfg.replication == nil and this_replicaset.master
>> + and not is_master then
>> + box_cfg.replication = {this_replicaset.master.uri}
>> else
>> - cfg.replication = {}
>> + box_cfg.replication = {}
>> end
>> if was_master == is_master then
>> - cfg.read_only = not is_master
>> + box_cfg.read_only = not is_master
>> end
>> if type(box.cfg) == 'function' then
>> - cfg.instance_uuid = this_replica.uuid
>> - cfg.replicaset_uuid = this_replicaset.uuid
>> + box_cfg.instance_uuid = this_replica.uuid
>> + box_cfg.replicaset_uuid = this_replicaset.uuid
>
> 1. All these box_cfg manipulations should be done under 'if not
> is_reload'
> I think.
Fixed.
>
>> else
>> local info = box.info
>> if this_replica_uuid ~= info.uuid then
>> @@ -1607,27 +1610,27 @@ local function storage_cfg(cfg, this_replica_uuid)
>> local_on_master_enable_prepare()
>> end
>>
>> - local box_cfg = table.copy(cfg)
>> - lcfg.remove_non_box_options(box_cfg)
>> - local ok, err = pcall(box.cfg, box_cfg)
>> - while M.errinj.ERRINJ_CFG_DELAY do
>> - lfiber.sleep(0.01)
>> - end
>> - if not ok then
>> - M.sync_timeout = old_sync_timeout
>> - if was_master and not is_master then
>> - local_on_master_disable_abort()
>> + if not is_reload then
>> + local ok, err = true, nil
>> + ok, err = pcall(box.cfg, box_cfg)
>
> 2. Why do you need to announce 'local ok, err' before
> their usage on the next line?
fixed.
>
>
>> + while M.errinj.ERRINJ_CFG_DELAY do
>> + lfiber.sleep(0.01)
>> end
>> - if not was_master and is_master then
>> - local_on_master_enable_abort()
>> + if not ok then
>> + M.sync_timeout = old_sync_timeout
>> + if was_master and not is_master then
>> + local_on_master_disable_abort()
>> + end
>> + if not was_master and is_master then
>> + local_on_master_enable_abort()
>> + end
>> + error(err)
>> end
>> - error(err)
>> + log.info("Box has been configured")
>> + local uri = luri.parse(this_replica.uri)
>> + box.once("vshard:storage:1", storage_schema_v1, uri.login, uri.password)
>> end
>>
>> - log.info("Box has been configured")
>> - local uri = luri.parse(this_replica.uri)
>> - box.once("vshard:storage:1", storage_schema_v1, uri.login, uri.password)
>> -
>> lreplicaset.rebind_replicasets(new_replicasets, M.replicasets)
>> lreplicaset.outdate_replicasets(M.replicasets)
>> M.replicasets = new_replicasets
>> @@ -1874,7 +1877,7 @@ if not rawget(_G, MODULE_INTERNALS) then
>> rawset(_G, MODULE_INTERNALS, M)
>> else
>> reload_evolution.upgrade(M)
>> - storage_cfg(M.current_cfg, M.this_replica.uuid)
>> + storage_cfg(M.current_cfg, M.this_replica.uuid, true)
>
> 3. I see that you have stored vshard_cfg in M.current_cfg. Not a full
> config. So it does not have any box options. And it causes a question
> - why do you need to separate reload from non-reload, if reload anyway
> in such implementation is like 'box.cfg{}' call with no parameters?
> And if you do not store box_cfg options how are you going to compare
> configs when we will implement atomic cfg over cluster?
Fixed. And in a router too.
Full diff:
commit d3c35612130ff95b20245993ab5053981d3b985f
Author: AKhatskevich <avkhatskevich@tarantool.org>
Date: Mon Jul 23 16:42:22 2018 +0300
Update only vshard part of a cfg on reload
Box cfg could have been changed by a user and then overridden by
an old vshard config on reload.
Since this commit, the box part of a config is applied only when
it is explicitly passed to a `cfg` method.
This change is important for the multiple routers feature.
diff --git a/vshard/cfg.lua b/vshard/cfg.lua
index 7c9ab77..af1c3ee 100644
--- a/vshard/cfg.lua
+++ b/vshard/cfg.lua
@@ -221,6 +221,22 @@ local cfg_template = {
},
}
+--
+-- Split a config into vshard_cfg and box_cfg parts.
+--
+local function cfg_split(cfg)
+ local vshard_cfg = {}
+ local box_cfg = {}
+ for k, v in pairs(cfg) do
+ if cfg_template[k] then
+ vshard_cfg[k] = v
+ else
+ box_cfg[k] = v
+ end
+ end
+ return vshard_cfg, box_cfg
+end
+
--
-- Names of options which cannot be changed during reconfigure.
--
@@ -252,24 +268,7 @@ local function cfg_check(shard_cfg, old_cfg)
return shard_cfg
end
---
--- Nullify non-box options.
---
-local function remove_non_box_options(cfg)
- cfg.sharding = nil
- cfg.weights = nil
- cfg.zone = nil
- cfg.bucket_count = nil
- cfg.rebalancer_disbalance_threshold = nil
- cfg.rebalancer_max_receiving = nil
- cfg.shard_index = nil
- cfg.collect_bucket_garbage_interval = nil
- cfg.collect_lua_garbage = nil
- cfg.sync_timeout = nil
- cfg.connection_outdate_delay = nil
-end
-
return {
check = cfg_check,
- remove_non_box_options = remove_non_box_options,
+ split = cfg_split,
}
diff --git a/vshard/router/init.lua b/vshard/router/init.lua
index d8c026b..1e8d898 100644
--- a/vshard/router/init.lua
+++ b/vshard/router/init.lua
@@ -491,19 +491,17 @@ end
-- Configuration
--------------------------------------------------------------------------------
-local function router_cfg(cfg)
+local function router_cfg(cfg, is_reload)
cfg = lcfg.check(cfg, M.current_cfg)
- local new_cfg = table.copy(cfg)
+ local vshard_cfg, box_cfg = lcfg.split(cfg)
if not M.replicasets then
log.info('Starting router configuration')
else
log.info('Starting router reconfiguration')
end
- local new_replicasets = lreplicaset.buildall(cfg)
- local total_bucket_count = cfg.bucket_count
- local collect_lua_garbage = cfg.collect_lua_garbage
- local box_cfg = table.copy(cfg)
- lcfg.remove_non_box_options(box_cfg)
+ local new_replicasets = lreplicaset.buildall(vshard_cfg)
+ local total_bucket_count = vshard_cfg.bucket_count
+ local collect_lua_garbage = vshard_cfg.collect_lua_garbage
log.info("Calling box.cfg()...")
for k, v in pairs(box_cfg) do
log.info({[k] = v})
@@ -514,8 +512,10 @@ local function router_cfg(cfg)
if M.errinj.ERRINJ_CFG then
error('Error injection: cfg')
end
- box.cfg(box_cfg)
- log.info("Box has been configured")
+ if not is_reload then
+ box.cfg(box_cfg)
+ log.info("Box has been configured")
+ end
-- Move connections from an old configuration to a new one.
-- It must be done with no yields to prevent usage both of not
-- fully moved old replicasets, and not fully built new ones.
@@ -526,8 +526,9 @@ local function router_cfg(cfg)
replicaset:connect_all()
end
lreplicaset.wait_masters_connect(new_replicasets)
-    lreplicaset.outdate_replicasets(M.replicasets, cfg.connection_outdate_delay)
- M.connection_outdate_delay = cfg.connection_outdate_delay
+ lreplicaset.outdate_replicasets(M.replicasets,
+ vshard_cfg.connection_outdate_delay)
+ M.connection_outdate_delay = vshard_cfg.connection_outdate_delay
M.total_bucket_count = total_bucket_count
M.collect_lua_garbage = collect_lua_garbage
M.current_cfg = cfg
@@ -817,7 +818,7 @@ end
if not rawget(_G, MODULE_INTERNALS) then
rawset(_G, MODULE_INTERNALS, M)
else
- router_cfg(M.current_cfg)
+ router_cfg(M.current_cfg, true)
M.module_version = M.module_version + 1
end
@@ -825,7 +826,7 @@ M.discovery_f = discovery_f
M.failover_f = failover_f
return {
- cfg = router_cfg;
+ cfg = function(cfg) return router_cfg(cfg, false) end;
info = router_info;
buckets_info = router_buckets_info;
call = router_call;
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index 1f29323..2080769 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -1500,13 +1500,14 @@ end
--------------------------------------------------------------------------------
-- Configuration
--------------------------------------------------------------------------------
-local function storage_cfg(cfg, this_replica_uuid)
+
+local function storage_cfg(cfg, this_replica_uuid, is_reload)
if this_replica_uuid == nil then
error('Usage: cfg(configuration, this_replica_uuid)')
end
cfg = lcfg.check(cfg, M.current_cfg)
- local new_cfg = table.copy(cfg)
- if cfg.weights or cfg.zone then
+ local vshard_cfg, box_cfg = lcfg.split(cfg)
+ if vshard_cfg.weights or vshard_cfg.zone then
        error('Weights and zone are not allowed for storage configuration')
end
if M.replicasets then
@@ -1520,7 +1521,7 @@ local function storage_cfg(cfg, this_replica_uuid)
local this_replicaset
local this_replica
- local new_replicasets = lreplicaset.buildall(cfg)
+ local new_replicasets = lreplicaset.buildall(vshard_cfg)
local min_master
for rs_uuid, rs in pairs(new_replicasets) do
for replica_uuid, replica in pairs(rs.replicas) do
@@ -1544,46 +1545,14 @@ local function storage_cfg(cfg, this_replica_uuid)
log.info('I am master')
end
- -- Do not change 'read_only' option here - if a master is
- -- disabled and there are triggers on master disable, then
- -- they would not be able to modify anything, if 'read_only'
- -- had been set here. 'Read_only' is set in
- -- local_on_master_disable after triggers and is unset in
- -- local_on_master_enable before triggers.
- --
- -- If a master role of the replica is not changed, then
- -- 'read_only' can be set right here.
- cfg.listen = cfg.listen or this_replica.uri
-    if cfg.replication == nil and this_replicaset.master and not is_master then
- cfg.replication = {this_replicaset.master.uri}
- else
- cfg.replication = {}
- end
- if was_master == is_master then
- cfg.read_only = not is_master
- end
- if type(box.cfg) == 'function' then
- cfg.instance_uuid = this_replica.uuid
- cfg.replicaset_uuid = this_replicaset.uuid
- else
- local info = box.info
- if this_replica_uuid ~= info.uuid then
-            error(string.format('Instance UUID mismatch: already set "%s" '..
- 'but "%s" in arguments', info.uuid,
- this_replica_uuid))
- end
- if this_replicaset.uuid ~= info.cluster.uuid then
-            error(string.format('Replicaset UUID mismatch: already set "%s" '..
-                                'but "%s" in vshard config', info.cluster.uuid,
- this_replicaset.uuid))
- end
- end
- local total_bucket_count = cfg.bucket_count
-    local rebalancer_disbalance_threshold = cfg.rebalancer_disbalance_threshold
- local rebalancer_max_receiving = cfg.rebalancer_max_receiving
- local shard_index = cfg.shard_index
-    local collect_bucket_garbage_interval = cfg.collect_bucket_garbage_interval
- local collect_lua_garbage = cfg.collect_lua_garbage
+ local total_bucket_count = vshard_cfg.bucket_count
+ local rebalancer_disbalance_threshold =
+ vshard_cfg.rebalancer_disbalance_threshold
+ local rebalancer_max_receiving = vshard_cfg.rebalancer_max_receiving
+ local shard_index = vshard_cfg.shard_index
+ local collect_bucket_garbage_interval =
+ vshard_cfg.collect_bucket_garbage_interval
+ local collect_lua_garbage = vshard_cfg.collect_lua_garbage
-- It is considered that all possible errors during cfg
-- process occur only before this place.
@@ -1598,7 +1567,7 @@ local function storage_cfg(cfg, this_replica_uuid)
-- a new sync timeout.
--
local old_sync_timeout = M.sync_timeout
- M.sync_timeout = cfg.sync_timeout
+ M.sync_timeout = vshard_cfg.sync_timeout
if was_master and not is_master then
local_on_master_disable_prepare()
@@ -1607,27 +1576,61 @@ local function storage_cfg(cfg, this_replica_uuid)
local_on_master_enable_prepare()
end
- local box_cfg = table.copy(cfg)
- lcfg.remove_non_box_options(box_cfg)
- local ok, err = pcall(box.cfg, box_cfg)
- while M.errinj.ERRINJ_CFG_DELAY do
- lfiber.sleep(0.01)
- end
- if not ok then
- M.sync_timeout = old_sync_timeout
- if was_master and not is_master then
- local_on_master_disable_abort()
+ if not is_reload then
+ -- Do not change 'read_only' option here - if a master is
+ -- disabled and there are triggers on master disable, then
+ -- they would not be able to modify anything, if 'read_only'
+ -- had been set here. 'Read_only' is set in
+ -- local_on_master_disable after triggers and is unset in
+ -- local_on_master_enable before triggers.
+ --
+ -- If a master role of the replica is not changed, then
+ -- 'read_only' can be set right here.
+ box_cfg.listen = box_cfg.listen or this_replica.uri
+ if box_cfg.replication == nil and this_replicaset.master
+ and not is_master then
+ box_cfg.replication = {this_replicaset.master.uri}
+ else
+ box_cfg.replication = {}
end
- if not was_master and is_master then
- local_on_master_enable_abort()
+ if was_master == is_master then
+ box_cfg.read_only = not is_master
end
- error(err)
+ if type(box.cfg) == 'function' then
+ box_cfg.instance_uuid = this_replica.uuid
+ box_cfg.replicaset_uuid = this_replicaset.uuid
+ else
+ local info = box.info
+ if this_replica_uuid ~= info.uuid then
+                error(string.format('Instance UUID mismatch: already set ' ..
+                                    '"%s" but "%s" in arguments', info.uuid,
+                                    this_replica_uuid))
+ end
+ if this_replicaset.uuid ~= info.cluster.uuid then
+                error(string.format('Replicaset UUID mismatch: already set ' ..
+                                    '"%s" but "%s" in vshard config',
+                                    info.cluster.uuid, this_replicaset.uuid))
+ end
+ end
+ local ok, err = pcall(box.cfg, box_cfg)
+ while M.errinj.ERRINJ_CFG_DELAY do
+ lfiber.sleep(0.01)
+ end
+ if not ok then
+ M.sync_timeout = old_sync_timeout
+ if was_master and not is_master then
+ local_on_master_disable_abort()
+ end
+ if not was_master and is_master then
+ local_on_master_enable_abort()
+ end
+ error(err)
+ end
+ log.info("Box has been configured")
+ local uri = luri.parse(this_replica.uri)
+        box.once("vshard:storage:1", storage_schema_v1, uri.login,
+                 uri.password)
end
- log.info("Box has been configured")
- local uri = luri.parse(this_replica.uri)
-    box.once("vshard:storage:1", storage_schema_v1, uri.login, uri.password)
-
lreplicaset.rebind_replicasets(new_replicasets, M.replicasets)
lreplicaset.outdate_replicasets(M.replicasets)
M.replicasets = new_replicasets
@@ -1639,7 +1642,7 @@ local function storage_cfg(cfg, this_replica_uuid)
M.shard_index = shard_index
M.collect_bucket_garbage_interval = collect_bucket_garbage_interval
M.collect_lua_garbage = collect_lua_garbage
- M.current_cfg = new_cfg
+ M.current_cfg = cfg
if was_master and not is_master then
local_on_master_disable()
@@ -1875,7 +1878,7 @@ if not rawget(_G, MODULE_INTERNALS) then
rawset(_G, MODULE_INTERNALS, M)
else
reload_evolution.upgrade(M)
- storage_cfg(M.current_cfg, M.this_replica.uuid)
+ storage_cfg(M.current_cfg, M.this_replica.uuid, true)
M.module_version = M.module_version + 1
end
@@ -1914,7 +1917,7 @@ return {
rebalancing_is_in_progress = rebalancing_is_in_progress,
recovery_wakeup = recovery_wakeup,
call = storage_call,
- cfg = storage_cfg,
+ cfg = function(cfg, uuid) return storage_cfg(cfg, uuid, false) end,
info = storage_info,
buckets_info = storage_buckets_info,
buckets_count = storage_buckets_count,
^ permalink raw reply [flat|nested] 23+ messages in thread
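
The `cfg_split` helper this patch adds to vshard/cfg.lua partitions one
flat config table by membership in the vshard option template: known
options go to `vshard_cfg`, everything else is treated as a `box.cfg`
option. A runnable sketch, where `cfg_template` is a toy subset (the real
template maps option names to validation tables, but only key presence
matters for the split):

```lua
-- Toy template: keys are vshard option names.
local cfg_template = {
    sharding = true, bucket_count = true, collect_lua_garbage = true,
}

-- Split a config into vshard_cfg and box_cfg parts.
local function cfg_split(cfg)
    local vshard_cfg, box_cfg = {}, {}
    for k, v in pairs(cfg) do
        if cfg_template[k] then
            vshard_cfg[k] = v
        else
            box_cfg[k] = v
        end
    end
    return vshard_cfg, box_cfg
end

local vshard_cfg, box_cfg = cfg_split({
    bucket_count = 3000,
    listen = 3301,
    replication = {'localhost:3302'},
})
-- vshard_cfg = {bucket_count = 3000}
-- box_cfg = {listen = 3301, replication = {'localhost:3302'}}
```

Compared to the removed `remove_non_box_options`, the split needs no
hand-maintained deny list: any option added to the template is
automatically kept out of `box.cfg`.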
* [tarantool-patches] Re: [PATCH 1/3] Update only vshard part of a cfg on reload
2018-08-07 13:19 ` Alex Khatskevich
@ 2018-08-08 11:17 ` Vladislav Shpilevoy
0 siblings, 0 replies; 23+ messages in thread
From: Vladislav Shpilevoy @ 2018-08-08 11:17 UTC (permalink / raw)
To: tarantool-patches, Alex Khatskevich
Thanks for the patch! Pushed into the master.
On 07/08/2018 16:19, Alex Khatskevich wrote:
>
> On 06.08.2018 20:03, Vladislav Shpilevoy wrote:
>> Thanks for the patch! See 3 comments below.
>>
>>> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
>>> index 102b942..40216ea 100644
>>> --- a/vshard/storage/init.lua
>>> +++ b/vshard/storage/init.lua
>>> @@ -1553,18 +1553,19 @@ local function storage_cfg(cfg, this_replica_uuid)
>>> --
>>> -- If a master role of the replica is not changed, then
>>> -- 'read_only' can be set right here.
>>> - cfg.listen = cfg.listen or this_replica.uri
>>> - if cfg.replication == nil and this_replicaset.master and not is_master then
>>> - cfg.replication = {this_replicaset.master.uri}
>>> + box_cfg.listen = box_cfg.listen or this_replica.uri
>>> + if box_cfg.replication == nil and this_replicaset.master
>>> + and not is_master then
>>> + box_cfg.replication = {this_replicaset.master.uri}
>>> else
>>> - cfg.replication = {}
>>> + box_cfg.replication = {}
>>> end
>>> if was_master == is_master then
>>> - cfg.read_only = not is_master
>>> + box_cfg.read_only = not is_master
>>> end
>>> if type(box.cfg) == 'function' then
>>> - cfg.instance_uuid = this_replica.uuid
>>> - cfg.replicaset_uuid = this_replicaset.uuid
>>> + box_cfg.instance_uuid = this_replica.uuid
>>> + box_cfg.replicaset_uuid = this_replicaset.uuid
>>
>> 1. All these box_cfg manipulations should be done under 'if not is_reload'
>> I think.
> Fixed.
>>
>>> else
>>> local info = box.info
>>> if this_replica_uuid ~= info.uuid then
>>> @@ -1607,27 +1610,27 @@ local function storage_cfg(cfg, this_replica_uuid)
>>> local_on_master_enable_prepare()
>>> end
>>>
>>> - local box_cfg = table.copy(cfg)
>>> - lcfg.remove_non_box_options(box_cfg)
>>> - local ok, err = pcall(box.cfg, box_cfg)
>>> - while M.errinj.ERRINJ_CFG_DELAY do
>>> - lfiber.sleep(0.01)
>>> - end
>>> - if not ok then
>>> - M.sync_timeout = old_sync_timeout
>>> - if was_master and not is_master then
>>> - local_on_master_disable_abort()
>>> + if not is_reload then
>>> + local ok, err = true, nil
>>> + ok, err = pcall(box.cfg, box_cfg)
>>
>> 2. Why do you need to announce 'local ok, err' before
>> their usage on the next line?
> fixed.
>>
>>
>>> + while M.errinj.ERRINJ_CFG_DELAY do
>>> + lfiber.sleep(0.01)
>>> end
>>> - if not was_master and is_master then
>>> - local_on_master_enable_abort()
>>> + if not ok then
>>> + M.sync_timeout = old_sync_timeout
>>> + if was_master and not is_master then
>>> + local_on_master_disable_abort()
>>> + end
>>> + if not was_master and is_master then
>>> + local_on_master_enable_abort()
>>> + end
>>> + error(err)
>>> end
>>> - error(err)
>>> + log.info("Box has been configured")
>>> + local uri = luri.parse(this_replica.uri)
>>> + box.once("vshard:storage:1", storage_schema_v1, uri.login, uri.password)
>>> end
>>>
>>> - log.info("Box has been configured")
>>> - local uri = luri.parse(this_replica.uri)
>>> - box.once("vshard:storage:1", storage_schema_v1, uri.login, uri.password)
>>> -
>>> lreplicaset.rebind_replicasets(new_replicasets, M.replicasets)
>>> lreplicaset.outdate_replicasets(M.replicasets)
>>> M.replicasets = new_replicasets
>>> @@ -1874,7 +1877,7 @@ if not rawget(_G, MODULE_INTERNALS) then
>>> rawset(_G, MODULE_INTERNALS, M)
>>> else
>>> reload_evolution.upgrade(M)
>>> - storage_cfg(M.current_cfg, M.this_replica.uuid)
>>> + storage_cfg(M.current_cfg, M.this_replica.uuid, true)
>>
>> 3. I see that you have stored vshard_cfg in M.current_cfg. Not a full
>> config. So it does not have any box options. And it causes a question
>> - why do you need to separate reload from non-reload, if reload anyway
>> in such implementation is like 'box.cfg{}' call with no parameters?
>> And if you do not store box_cfg options how are you going to compare
>> configs when we will implement atomic cfg over cluster?
> Fixed. And in a router too.
>
>
>
> Full diff:
>
> commit d3c35612130ff95b20245993ab5053981d3b985f
> Author: AKhatskevich <avkhatskevich@tarantool.org>
> Date: Mon Jul 23 16:42:22 2018 +0300
>
> Update only vshard part of a cfg on reload
>
> Box cfg could have been changed by a user and then overridden by
> an old vshard config on reload.
>
> Since that commit, box part of a config is applied only when
> it is explicitly passed to a `cfg` method.
>
> This change is important for the multiple routers feature.
>
> diff --git a/vshard/cfg.lua b/vshard/cfg.lua
> index 7c9ab77..af1c3ee 100644
> --- a/vshard/cfg.lua
> +++ b/vshard/cfg.lua
> @@ -221,6 +221,22 @@ local cfg_template = {
> },
> }
>
> +--
> +-- Split it into vshard_cfg and box_cfg parts.
> +--
> +local function cfg_split(cfg)
> + local vshard_cfg = {}
> + local box_cfg = {}
> + for k, v in pairs(cfg) do
> + if cfg_template[k] then
> + vshard_cfg[k] = v
> + else
> + box_cfg[k] = v
> + end
> + end
> + return vshard_cfg, box_cfg
> +end
> +
> --
> -- Names of options which cannot be changed during reconfigure.
> --
> @@ -252,24 +268,7 @@ local function cfg_check(shard_cfg, old_cfg)
> return shard_cfg
> end
>
> ---
> --- Nullify non-box options.
> ---
> -local function remove_non_box_options(cfg)
> - cfg.sharding = nil
> - cfg.weights = nil
> - cfg.zone = nil
> - cfg.bucket_count = nil
> - cfg.rebalancer_disbalance_threshold = nil
> - cfg.rebalancer_max_receiving = nil
> - cfg.shard_index = nil
> - cfg.collect_bucket_garbage_interval = nil
> - cfg.collect_lua_garbage = nil
> - cfg.sync_timeout = nil
> - cfg.connection_outdate_delay = nil
> -end
> -
> return {
> check = cfg_check,
> - remove_non_box_options = remove_non_box_options,
> + split = cfg_split,
> }
> diff --git a/vshard/router/init.lua b/vshard/router/init.lua
> index d8c026b..1e8d898 100644
> --- a/vshard/router/init.lua
> +++ b/vshard/router/init.lua
> @@ -491,19 +491,17 @@ end
> -- Configuration
> --------------------------------------------------------------------------------
>
> -local function router_cfg(cfg)
> +local function router_cfg(cfg, is_reload)
> cfg = lcfg.check(cfg, M.current_cfg)
> - local new_cfg = table.copy(cfg)
> + local vshard_cfg, box_cfg = lcfg.split(cfg)
> if not M.replicasets then
> log.info('Starting router configuration')
> else
> log.info('Starting router reconfiguration')
> end
> - local new_replicasets = lreplicaset.buildall(cfg)
> - local total_bucket_count = cfg.bucket_count
> - local collect_lua_garbage = cfg.collect_lua_garbage
> - local box_cfg = table.copy(cfg)
> - lcfg.remove_non_box_options(box_cfg)
> + local new_replicasets = lreplicaset.buildall(vshard_cfg)
> + local total_bucket_count = vshard_cfg.bucket_count
> + local collect_lua_garbage = vshard_cfg.collect_lua_garbage
> log.info("Calling box.cfg()...")
> for k, v in pairs(box_cfg) do
> log.info({[k] = v})
> @@ -514,8 +512,10 @@ local function router_cfg(cfg)
> if M.errinj.ERRINJ_CFG then
> error('Error injection: cfg')
> end
> - box.cfg(box_cfg)
> - log.info("Box has been configured")
> + if not is_reload then
> + box.cfg(box_cfg)
> + log.info("Box has been configured")
> + end
> -- Move connections from an old configuration to a new one.
> -- It must be done with no yields to prevent usage both of not
> -- fully moved old replicasets, and not fully built new ones.
> @@ -526,8 +526,9 @@ local function router_cfg(cfg)
> replicaset:connect_all()
> end
> lreplicaset.wait_masters_connect(new_replicasets)
> - lreplicaset.outdate_replicasets(M.replicasets, cfg.connection_outdate_delay)
> - M.connection_outdate_delay = cfg.connection_outdate_delay
> + lreplicaset.outdate_replicasets(M.replicasets,
> + vshard_cfg.connection_outdate_delay)
> + M.connection_outdate_delay = vshard_cfg.connection_outdate_delay
> M.total_bucket_count = total_bucket_count
> M.collect_lua_garbage = collect_lua_garbage
> M.current_cfg = cfg
> @@ -817,7 +818,7 @@ end
> if not rawget(_G, MODULE_INTERNALS) then
> rawset(_G, MODULE_INTERNALS, M)
> else
> - router_cfg(M.current_cfg)
> + router_cfg(M.current_cfg, true)
> M.module_version = M.module_version + 1
> end
>
> @@ -825,7 +826,7 @@ M.discovery_f = discovery_f
> M.failover_f = failover_f
>
> return {
> - cfg = router_cfg;
> + cfg = function(cfg) return router_cfg(cfg, false) end;
> info = router_info;
> buckets_info = router_buckets_info;
> call = router_call;
> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
> index 1f29323..2080769 100644
> --- a/vshard/storage/init.lua
> +++ b/vshard/storage/init.lua
> @@ -1500,13 +1500,14 @@ end
> --------------------------------------------------------------------------------
> -- Configuration
> --------------------------------------------------------------------------------
> -local function storage_cfg(cfg, this_replica_uuid)
> +
> +local function storage_cfg(cfg, this_replica_uuid, is_reload)
> if this_replica_uuid == nil then
> error('Usage: cfg(configuration, this_replica_uuid)')
> end
> cfg = lcfg.check(cfg, M.current_cfg)
> - local new_cfg = table.copy(cfg)
> - if cfg.weights or cfg.zone then
> + local vshard_cfg, box_cfg = lcfg.split(cfg)
> + if vshard_cfg.weights or vshard_cfg.zone then
> error('Weights and zone are not allowed for storage configuration')
> end
> if M.replicasets then
> @@ -1520,7 +1521,7 @@ local function storage_cfg(cfg, this_replica_uuid)
>
> local this_replicaset
> local this_replica
> - local new_replicasets = lreplicaset.buildall(cfg)
> + local new_replicasets = lreplicaset.buildall(vshard_cfg)
> local min_master
> for rs_uuid, rs in pairs(new_replicasets) do
> for replica_uuid, replica in pairs(rs.replicas) do
> @@ -1544,46 +1545,14 @@ local function storage_cfg(cfg, this_replica_uuid)
> log.info('I am master')
> end
>
> - -- Do not change 'read_only' option here - if a master is
> - -- disabled and there are triggers on master disable, then
> - -- they would not be able to modify anything, if 'read_only'
> - -- had been set here. 'Read_only' is set in
> - -- local_on_master_disable after triggers and is unset in
> - -- local_on_master_enable before triggers.
> - --
> - -- If a master role of the replica is not changed, then
> - -- 'read_only' can be set right here.
> - cfg.listen = cfg.listen or this_replica.uri
> - if cfg.replication == nil and this_replicaset.master and not is_master then
> - cfg.replication = {this_replicaset.master.uri}
> - else
> - cfg.replication = {}
> - end
> - if was_master == is_master then
> - cfg.read_only = not is_master
> - end
> - if type(box.cfg) == 'function' then
> - cfg.instance_uuid = this_replica.uuid
> - cfg.replicaset_uuid = this_replicaset.uuid
> - else
> - local info = box.info
> - if this_replica_uuid ~= info.uuid then
> - error(string.format('Instance UUID mismatch: already set "%s" '..
> - 'but "%s" in arguments', info.uuid,
> - this_replica_uuid))
> - end
> - if this_replicaset.uuid ~= info.cluster.uuid then
> - error(string.format('Replicaset UUID mismatch: already set "%s" '..
> - 'but "%s" in vshard config', info.cluster.uuid,
> - this_replicaset.uuid))
> - end
> - end
> - local total_bucket_count = cfg.bucket_count
> - local rebalancer_disbalance_threshold = cfg.rebalancer_disbalance_threshold
> - local rebalancer_max_receiving = cfg.rebalancer_max_receiving
> - local shard_index = cfg.shard_index
> - local collect_bucket_garbage_interval = cfg.collect_bucket_garbage_interval
> - local collect_lua_garbage = cfg.collect_lua_garbage
> + local total_bucket_count = vshard_cfg.bucket_count
> + local rebalancer_disbalance_threshold =
> + vshard_cfg.rebalancer_disbalance_threshold
> + local rebalancer_max_receiving = vshard_cfg.rebalancer_max_receiving
> + local shard_index = vshard_cfg.shard_index
> + local collect_bucket_garbage_interval =
> + vshard_cfg.collect_bucket_garbage_interval
> + local collect_lua_garbage = vshard_cfg.collect_lua_garbage
>
> -- It is considered that all possible errors during cfg
> -- process occur only before this place.
> @@ -1598,7 +1567,7 @@ local function storage_cfg(cfg, this_replica_uuid)
> -- a new sync timeout.
> --
> local old_sync_timeout = M.sync_timeout
> - M.sync_timeout = cfg.sync_timeout
> + M.sync_timeout = vshard_cfg.sync_timeout
>
> if was_master and not is_master then
> local_on_master_disable_prepare()
> @@ -1607,27 +1576,61 @@ local function storage_cfg(cfg, this_replica_uuid)
> local_on_master_enable_prepare()
> end
>
> - local box_cfg = table.copy(cfg)
> - lcfg.remove_non_box_options(box_cfg)
> - local ok, err = pcall(box.cfg, box_cfg)
> - while M.errinj.ERRINJ_CFG_DELAY do
> - lfiber.sleep(0.01)
> - end
> - if not ok then
> - M.sync_timeout = old_sync_timeout
> - if was_master and not is_master then
> - local_on_master_disable_abort()
> + if not is_reload then
> + -- Do not change 'read_only' option here - if a master is
> + -- disabled and there are triggers on master disable, then
> + -- they would not be able to modify anything, if 'read_only'
> + -- had been set here. 'Read_only' is set in
> + -- local_on_master_disable after triggers and is unset in
> + -- local_on_master_enable before triggers.
> + --
> + -- If a master role of the replica is not changed, then
> + -- 'read_only' can be set right here.
> + box_cfg.listen = box_cfg.listen or this_replica.uri
> + if box_cfg.replication == nil and this_replicaset.master
> + and not is_master then
> + box_cfg.replication = {this_replicaset.master.uri}
> + else
> + box_cfg.replication = {}
> end
> - if not was_master and is_master then
> - local_on_master_enable_abort()
> + if was_master == is_master then
> + box_cfg.read_only = not is_master
> end
> - error(err)
> + if type(box.cfg) == 'function' then
> + box_cfg.instance_uuid = this_replica.uuid
> + box_cfg.replicaset_uuid = this_replicaset.uuid
> + else
> + local info = box.info
> + if this_replica_uuid ~= info.uuid then
> + error(string.format('Instance UUID mismatch: already set ' ..
> + '"%s" but "%s" in arguments', info.uuid,
> + this_replica_uuid))
> + end
> + if this_replicaset.uuid ~= info.cluster.uuid then
> + error(string.format('Replicaset UUID mismatch: already set ' ..
> + '"%s" but "%s" in vshard config',
> + info.cluster.uuid, this_replicaset.uuid))
> + end
> + end
> + local ok, err = pcall(box.cfg, box_cfg)
> + while M.errinj.ERRINJ_CFG_DELAY do
> + lfiber.sleep(0.01)
> + end
> + if not ok then
> + M.sync_timeout = old_sync_timeout
> + if was_master and not is_master then
> + local_on_master_disable_abort()
> + end
> + if not was_master and is_master then
> + local_on_master_enable_abort()
> + end
> + error(err)
> + end
> + log.info("Box has been configured")
> + local uri = luri.parse(this_replica.uri)
> + box.once("vshard:storage:1", storage_schema_v1, uri.login, uri.password)
> end
>
> - log.info("Box has been configured")
> - local uri = luri.parse(this_replica.uri)
> - box.once("vshard:storage:1", storage_schema_v1, uri.login, uri.password)
> -
> lreplicaset.rebind_replicasets(new_replicasets, M.replicasets)
> lreplicaset.outdate_replicasets(M.replicasets)
> M.replicasets = new_replicasets
> @@ -1639,7 +1642,7 @@ local function storage_cfg(cfg, this_replica_uuid)
> M.shard_index = shard_index
> M.collect_bucket_garbage_interval = collect_bucket_garbage_interval
> M.collect_lua_garbage = collect_lua_garbage
> - M.current_cfg = new_cfg
> + M.current_cfg = cfg
>
> if was_master and not is_master then
> local_on_master_disable()
> @@ -1875,7 +1878,7 @@ if not rawget(_G, MODULE_INTERNALS) then
> rawset(_G, MODULE_INTERNALS, M)
> else
> reload_evolution.upgrade(M)
> - storage_cfg(M.current_cfg, M.this_replica.uuid)
> + storage_cfg(M.current_cfg, M.this_replica.uuid, true)
> M.module_version = M.module_version + 1
> end
>
> @@ -1914,7 +1917,7 @@ return {
> rebalancing_is_in_progress = rebalancing_is_in_progress,
> recovery_wakeup = recovery_wakeup,
> call = storage_call,
> - cfg = storage_cfg,
> + cfg = function(cfg, uuid) return storage_cfg(cfg, uuid, false) end,
> info = storage_info,
> buckets_info = storage_buckets_info,
> buckets_count = storage_buckets_count,
>
>
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* [tarantool-patches] Re: [PATCH 2/3] Move lua gc to a dedicated module
2018-08-03 20:04 ` Alex Khatskevich
2018-08-06 17:03 ` Vladislav Shpilevoy
@ 2018-08-08 11:17 ` Vladislav Shpilevoy
1 sibling, 0 replies; 23+ messages in thread
From: Vladislav Shpilevoy @ 2018-08-08 11:17 UTC (permalink / raw)
To: tarantool-patches, Alex Khatskevich
Thanks for the patch! Pushed into the master.
On 03/08/2018 23:04, Alex Khatskevich wrote:
>
> On 01.08.2018 21:43, Vladislav Shpilevoy wrote:
>> Thanks for the patch! See 4 comments below.
>>
>> On 31/07/2018 19:25, AKhatskevich wrote:
>>> `vshard.lua_gc.lua` is a new module which helps make GC work more
>>> aggressively.
>>>
>>>
>>> Reasons to move lua gc to a separate module:
>>> 1. It is not a duty of vshard to collect garbage, so let gc fiber
>>> be as far from vshard as possible.
>>> 2. The next commits will introduce the multiple routers feature, which
>>> requires the gc fiber to be a singleton.
>>>
>>> Closes #138
>>> ---
>>> test/router/garbage_collector.result | 27 +++++++++++------
>>> test/router/garbage_collector.test.lua | 18 ++++++-----
>>> test/storage/garbage_collector.result | 27 +++++++++--------
>>> test/storage/garbage_collector.test.lua | 22 ++++++--------
>>> vshard/lua_gc.lua | 54 +++++++++++++++++++++++++++++++++
>>> vshard/router/init.lua | 19 +++---------
>>> vshard/storage/init.lua | 20 ++++--------
>>> 7 files changed, 116 insertions(+), 71 deletions(-)
>>> create mode 100644 vshard/lua_gc.lua
>>>
>>> diff --git a/test/router/garbage_collector.result b/test/router/garbage_collector.result
>>> index 3c2a4f1..a7474fc 100644
>>> --- a/test/router/garbage_collector.result
>>> +++ b/test/router/garbage_collector.result
>>> @@ -40,27 +40,30 @@ test_run:switch('router_1')
>>> fiber = require('fiber')
>>> ---
>>> ...
>>> -cfg.collect_lua_garbage = true
>>> +lua_gc = require('vshard.lua_gc')
>>> ---
>>> ...
>>> -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DISCOVERY_INTERVAL
>>> +cfg.collect_lua_garbage = true
>>
>> 1. Now this code tests nothing but fibers. Below you do a wakeup
>> and check that the iteration counter is increased, but that is an
>> obvious thing. Before your patch the test really checked that GC is
>> called by looking for nullified weak references. Now I can remove
>> collectgarbage() from the main_loop and nothing would change. Please,
>> make this a real test.
> The GC test has been restored.
>>
>> Moreover, the test hangs forever both locally and on Travis.
> Fixed
>>
>>> diff --git a/test/storage/garbage_collector.result b/test/storage/garbage_collector.result
>>> index 3588fb4..d94ba24 100644
>>> --- a/test/storage/garbage_collector.result
>>> +++ b/test/storage/garbage_collector.result
>>
>> 2. Same. Now the test passes even if I remove collectgarbage() from
>> the main loop.
> Restored.
>>
>>> diff --git a/vshard/lua_gc.lua b/vshard/lua_gc.lua
>>> new file mode 100644
>>> index 0000000..8d6af3e
>>> --- /dev/null
>>> +++ b/vshard/lua_gc.lua
>>> @@ -0,0 +1,54 @@
>>> +--
>>> +-- This module implements background lua GC fiber.
>>> +-- It's purpose is to make GC more aggressive.
>>> +--
>>> +
>>> +local lfiber = require('fiber')
>>> +local MODULE_INTERNALS = '__module_vshard_lua_gc'
>>> +
>>> +local M = rawget(_G, MODULE_INTERNALS)
>>> +if not M then
>>> + M = {
>>> + -- Background fiber.
>>> + bg_fiber = nil,
>>> + -- GC interval in seconds.
>>> + interval = nil,
>>> + -- Main loop.
>>> + -- Stored here to make the fiber reloadable.
>>> + main_loop = nil,
>>> + -- Number of `collectgarbage()` calls.
>>> + iterations = 0,
>>> + }
>>> +end
>>> +local DEFALUT_INTERVAL = 100
>>
>> 3. For constants please use vshard.consts.
>>
>> 4. You should not choose interval inside the main_loop.
>> Please, use 'default' option in cfg.lua.
> DEFAULT_INTERVAL has been removed entirely.
> The interval value is now required.
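> With the interval now a required argument, a call site would look
> roughly like this (a sketch based on the set_state signature from the
> patch; the module names are taken from vshard, but the exact call
> sites may differ):
>
>     local lua_gc = require('vshard.lua_gc')
>     local consts = require('vshard.consts')
>
>     -- Enable the background GC fiber; the interval is mandatory,
>     -- there is no built-in default anymore.
>     lua_gc.set_state(true, consts.COLLECT_LUA_GARBAGE_INTERVAL)
>
>     -- Disable it; an interval must still be passed, since
>     -- set_state asserts its type unconditionally.
>     lua_gc.set_state(false, consts.COLLECT_LUA_GARBAGE_INTERVAL)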
>
>
>
> full diff
>
>
>
> commit ec221bd060f46e4dc009eaab1c6c1bd1cf5a4150
> Author: AKhatskevich <avkhatskevich@tarantool.org>
> Date: Thu Jul 26 01:17:00 2018 +0300
>
> Move lua gc to a dedicated module
>
> `vshard.lua_gc.lua` is a new module which helps make GC work more
> aggressively.
> Before the commit that was a duty of router and storage.
>
> Reasons to move lua gc to a separate module:
> 1. It is not a duty of vshard to collect garbage, so let gc fiber
> be as far from vshard as possible.
> 2. The next commits will introduce the multiple routers feature, which
> requires the gc fiber to be a singleton.
>
> Closes #138
>
> diff --git a/test/router/garbage_collector.result b/test/router/garbage_collector.result
> index 3c2a4f1..7780046 100644
> --- a/test/router/garbage_collector.result
> +++ b/test/router/garbage_collector.result
> @@ -40,41 +40,59 @@ test_run:switch('router_1')
> fiber = require('fiber')
> ---
> ...
> -cfg.collect_lua_garbage = true
> +lua_gc = require('vshard.lua_gc')
> ---
> ...
> -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DISCOVERY_INTERVAL
> +cfg.collect_lua_garbage = true
> ---
> ...
> vshard.router.cfg(cfg)
> ---
> ...
> +lua_gc.internal.bg_fiber ~= nil
> +---
> +- true
> +...
> +-- Check that `collectgarbage()` was really called.
> a = setmetatable({}, {__mode = 'v'})
> ---
> ...
> a.k = {b = 100}
> ---
> ...
> -for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end
> +iterations = lua_gc.internal.iterations
> +---
> +...
> +lua_gc.internal.bg_fiber:wakeup()
> +---
> +...
> +while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end
> ---
> ...
> a.k
> ---
> - null
> ...
> +lua_gc.internal.interval = 0.001
> +---
> +...
> cfg.collect_lua_garbage = false
> ---
> ...
> vshard.router.cfg(cfg)
> ---
> ...
> -a.k = {b = 100}
> +lua_gc.internal.bg_fiber == nil
> +---
> +- true
> +...
> +iterations = lua_gc.internal.iterations
> ---
> ...
> -for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end
> +fiber.sleep(0.01)
> ---
> ...
> -a.k ~= nil
> +iterations == lua_gc.internal.iterations
> ---
> - true
> ...
> diff --git a/test/router/garbage_collector.test.lua b/test/router/garbage_collector.test.lua
> index b3411cd..e8d0876 100644
> --- a/test/router/garbage_collector.test.lua
> +++ b/test/router/garbage_collector.test.lua
> @@ -13,18 +13,24 @@ test_run:cmd("start server router_1")
> --
> test_run:switch('router_1')
> fiber = require('fiber')
> +lua_gc = require('vshard.lua_gc')
> cfg.collect_lua_garbage = true
> -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DISCOVERY_INTERVAL
> vshard.router.cfg(cfg)
> +lua_gc.internal.bg_fiber ~= nil
> +-- Check that `collectgarbage()` was really called.
> a = setmetatable({}, {__mode = 'v'})
> a.k = {b = 100}
> -for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end
> +iterations = lua_gc.internal.iterations
> +lua_gc.internal.bg_fiber:wakeup()
> +while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end
> a.k
> +lua_gc.internal.interval = 0.001
> cfg.collect_lua_garbage = false
> vshard.router.cfg(cfg)
> -a.k = {b = 100}
> -for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end
> -a.k ~= nil
> +lua_gc.internal.bg_fiber == nil
> +iterations = lua_gc.internal.iterations
> +fiber.sleep(0.01)
> +iterations == lua_gc.internal.iterations
>
> test_run:switch("default")
> test_run:cmd("stop server router_1")
> diff --git a/test/storage/garbage_collector.result b/test/storage/garbage_collector.result
> index 3588fb4..6bec2db 100644
> --- a/test/storage/garbage_collector.result
> +++ b/test/storage/garbage_collector.result
> @@ -120,7 +120,7 @@ test_run:switch('storage_1_a')
> fiber = require('fiber')
> ---
> ...
> -log = require('log')
> +lua_gc = require('vshard.lua_gc')
> ---
> ...
> cfg.collect_lua_garbage = true
> @@ -129,38 +129,50 @@ cfg.collect_lua_garbage = true
> vshard.storage.cfg(cfg, names.storage_1_a)
> ---
> ...
> --- Create a weak reference to a able {b = 100} - it must be
> --- deleted on the next GC.
> +lua_gc.internal.bg_fiber ~= nil
> +---
> +- true
> +...
> +-- Check that `collectgarbage()` was really called.
> a = setmetatable({}, {__mode = 'v'})
> ---
> ...
> a.k = {b = 100}
> ---
> ...
> -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
> +iterations = lua_gc.internal.iterations
> ---
> ...
> --- Wait until Lua GC deletes a.k.
> -for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
> +lua_gc.internal.bg_fiber:wakeup()
> +---
> +...
> +while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end
> ---
> ...
> a.k
> ---
> - null
> ...
> +lua_gc.internal.interval = 0.001
> +---
> +...
> cfg.collect_lua_garbage = false
> ---
> ...
> vshard.storage.cfg(cfg, names.storage_1_a)
> ---
> ...
> -a.k = {b = 100}
> +lua_gc.internal.bg_fiber == nil
> +---
> +- true
> +...
> +iterations = lua_gc.internal.iterations
> ---
> ...
> -for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
> +fiber.sleep(0.01)
> ---
> ...
> -a.k ~= nil
> +iterations == lua_gc.internal.iterations
> ---
> - true
> ...
> diff --git a/test/storage/garbage_collector.test.lua b/test/storage/garbage_collector.test.lua
> index 79e76d8..407b8a1 100644
> --- a/test/storage/garbage_collector.test.lua
> +++ b/test/storage/garbage_collector.test.lua
> @@ -46,22 +46,24 @@ customer:select{}
> --
> test_run:switch('storage_1_a')
> fiber = require('fiber')
> -log = require('log')
> +lua_gc = require('vshard.lua_gc')
> cfg.collect_lua_garbage = true
> vshard.storage.cfg(cfg, names.storage_1_a)
> --- Create a weak reference to a able {b = 100} - it must be
> --- deleted on the next GC.
> +lua_gc.internal.bg_fiber ~= nil
> +-- Check that `collectgarbage()` was really called.
> a = setmetatable({}, {__mode = 'v'})
> a.k = {b = 100}
> -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
> --- Wait until Lua GC deletes a.k.
> -for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
> +iterations = lua_gc.internal.iterations
> +lua_gc.internal.bg_fiber:wakeup()
> +while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end
> a.k
> +lua_gc.internal.interval = 0.001
> cfg.collect_lua_garbage = false
> vshard.storage.cfg(cfg, names.storage_1_a)
> -a.k = {b = 100}
> -for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
> -a.k ~= nil
> +lua_gc.internal.bg_fiber == nil
> +iterations = lua_gc.internal.iterations
> +fiber.sleep(0.01)
> +iterations == lua_gc.internal.iterations
>
> test_run:switch('default')
> test_run:drop_cluster(REPLICASET_2)
> diff --git a/vshard/lua_gc.lua b/vshard/lua_gc.lua
> new file mode 100644
> index 0000000..c6c5cd3
> --- /dev/null
> +++ b/vshard/lua_gc.lua
> @@ -0,0 +1,54 @@
> +--
> +-- This module implements background lua GC fiber.
> +-- It's purpose is to make GC more aggressive.
> +--
> +
> +local lfiber = require('fiber')
> +local MODULE_INTERNALS = '__module_vshard_lua_gc'
> +
> +local M = rawget(_G, MODULE_INTERNALS)
> +if not M then
> + M = {
> + -- Background fiber.
> + bg_fiber = nil,
> + -- GC interval in seconds.
> + interval = nil,
> + -- Main loop.
> + -- Stored here to make the fiber reloadable.
> + main_loop = nil,
> + -- Number of `collectgarbage()` calls.
> + iterations = 0,
> + }
> +end
> +
> +M.main_loop = function()
> + lfiber.sleep(M.interval)
> + collectgarbage()
> + M.iterations = M.iterations + 1
> + return M.main_loop()
> +end
> +
> +local function set_state(active, interval)
> + assert(type(interval) == 'number')
> + M.interval = interval
> + if active and not M.bg_fiber then
> + M.bg_fiber = lfiber.create(M.main_loop)
> + M.bg_fiber:name('vshard.lua_gc')
> + end
> + if not active and M.bg_fiber then
> + M.bg_fiber:cancel()
> + M.bg_fiber = nil
> + end
> + if active then
> + M.bg_fiber:wakeup()
> + end
> +end
> +
> +if not rawget(_G, MODULE_INTERNALS) then
> + rawset(_G, MODULE_INTERNALS, M)
> +end
> +
> +return {
> + set_state = set_state,
> + internal = M,
> +}
> diff --git a/vshard/router/init.lua b/vshard/router/init.lua
> index e2b2b22..3e127cb 100644
> --- a/vshard/router/init.lua
> +++ b/vshard/router/init.lua
> @@ -7,6 +7,7 @@ if rawget(_G, MODULE_INTERNALS) then
> local vshard_modules = {
> 'vshard.consts', 'vshard.error', 'vshard.cfg',
> 'vshard.hash', 'vshard.replicaset', 'vshard.util',
> + 'vshard.lua_gc',
> }
> for _, module in pairs(vshard_modules) do
> package.loaded[module] = nil
> @@ -18,6 +19,7 @@ local lcfg = require('vshard.cfg')
> local lhash = require('vshard.hash')
> local lreplicaset = require('vshard.replicaset')
> local util = require('vshard.util')
> +local lua_gc = require('vshard.lua_gc')
>
> local M = rawget(_G, MODULE_INTERNALS)
> if not M then
> @@ -43,8 +45,7 @@ if not M then
> discovery_fiber = nil,
> -- Bucket count stored on all replicasets.
> total_bucket_count = 0,
> - -- If true, then discovery fiber starts to call
> - -- collectgarbage() periodically.
> + -- Boolean lua_gc state (create periodic gc task).
> collect_lua_garbage = nil,
> -- This counter is used to restart background fibers with
> -- new reloaded code.
> @@ -151,8 +152,6 @@ end
> --
> local function discovery_f()
> local module_version = M.module_version
> - local iterations_until_lua_gc =
> - consts.COLLECT_LUA_GARBAGE_INTERVAL / consts.DISCOVERY_INTERVAL
> while module_version == M.module_version do
> while not next(M.replicasets) do
> lfiber.sleep(consts.DISCOVERY_INTERVAL)
> @@ -188,12 +187,6 @@ local function discovery_f()
> M.route_map[bucket_id] = replicaset
> end
> end
> - iterations_until_lua_gc = iterations_until_lua_gc - 1
> - if M.collect_lua_garbage and iterations_until_lua_gc == 0 then
> - iterations_until_lua_gc =
> - consts.COLLECT_LUA_GARBAGE_INTERVAL / consts.DISCOVERY_INTERVAL
> - collectgarbage()
> - end
> lfiber.sleep(consts.DISCOVERY_INTERVAL)
> end
> end
> @@ -504,7 +497,6 @@ local function router_cfg(cfg)
> end
> local new_replicasets = lreplicaset.buildall(vshard_cfg)
> local total_bucket_count = vshard_cfg.bucket_count
> - local collect_lua_garbage = vshard_cfg.collect_lua_garbage
> log.info("Calling box.cfg()...")
> for k, v in pairs(box_cfg) do
> log.info({[k] = v})
> @@ -531,7 +523,7 @@ local function router_cfg(cfg)
> vshard_cfg.connection_outdate_delay)
> M.connection_outdate_delay = vshard_cfg.connection_outdate_delay
> M.total_bucket_count = total_bucket_count
> - M.collect_lua_garbage = collect_lua_garbage
> + M.collect_lua_garbage = vshard_cfg.collect_lua_garbage
> M.current_cfg = vshard_cfg
> M.replicasets = new_replicasets
> -- Update existing route map in-place.
> @@ -548,8 +540,7 @@ local function router_cfg(cfg)
> M.discovery_fiber = util.reloadable_fiber_create(
> 'vshard.discovery', M, 'discovery_f')
> end
> - -- Destroy connections, not used in a new configuration.
> - collectgarbage()
> + lua_gc.set_state(M.collect_lua_garbage, consts.COLLECT_LUA_GARBAGE_INTERVAL)
> end
>
> --------------------------------------------------------------------------------
> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
> index 40216ea..3e29e9d 100644
> --- a/vshard/storage/init.lua
> +++ b/vshard/storage/init.lua
> @@ -10,7 +10,8 @@ if rawget(_G, MODULE_INTERNALS) then
> local vshard_modules = {
> 'vshard.consts', 'vshard.error', 'vshard.cfg',
> 'vshard.replicaset', 'vshard.util',
> - 'vshard.storage.reload_evolution'
> + 'vshard.storage.reload_evolution',
> + 'vshard.lua_gc',
> }
> for _, module in pairs(vshard_modules) do
> package.loaded[module] = nil
> @@ -21,6 +22,7 @@ local lerror = require('vshard.error')
> local lcfg = require('vshard.cfg')
> local lreplicaset = require('vshard.replicaset')
> local util = require('vshard.util')
> +local lua_gc = require('vshard.lua_gc')
> local reload_evolution = require('vshard.storage.reload_evolution')
>
> local M = rawget(_G, MODULE_INTERNALS)
> @@ -75,8 +77,7 @@ if not M then
> collect_bucket_garbage_fiber = nil,
> -- Do buckets garbage collection once per this time.
> collect_bucket_garbage_interval = nil,
> - -- If true, then bucket garbage collection fiber starts to
> - -- call collectgarbage() periodically.
> + -- Boolean lua_gc state (create periodic gc task).
> collect_lua_garbage = nil,
>
> -------------------- Bucket recovery ---------------------
> @@ -1063,9 +1064,6 @@ function collect_garbage_f()
> -- buckets_for_redirect is deleted, it gets empty_sent_buckets
> -- for next deletion.
> local empty_sent_buckets = {}
> - local iterations_until_lua_gc =
> - consts.COLLECT_LUA_GARBAGE_INTERVAL / M.collect_bucket_garbage_interval
> -
> while M.module_version == module_version do
> -- Check if no changes in buckets configuration.
> if control.bucket_generation_collected ~= control.bucket_generation then
> @@ -1106,12 +1104,6 @@ function collect_garbage_f()
> end
> end
> ::continue::
> - iterations_until_lua_gc = iterations_until_lua_gc - 1
> - if iterations_until_lua_gc == 0 and M.collect_lua_garbage then
> - iterations_until_lua_gc = consts.COLLECT_LUA_GARBAGE_INTERVAL /
> - M.collect_bucket_garbage_interval
> - collectgarbage()
> - end
> lfiber.sleep(M.collect_bucket_garbage_interval)
> end
> end
> @@ -1586,7 +1578,6 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload)
> local shard_index = vshard_cfg.shard_index
> local collect_bucket_garbage_interval =
> vshard_cfg.collect_bucket_garbage_interval
> - local collect_lua_garbage = vshard_cfg.collect_lua_garbage
>
> -- It is considered that all possible errors during cfg
> -- process occur only before this place.
> @@ -1641,7 +1632,7 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload)
> M.rebalancer_max_receiving = rebalancer_max_receiving
> M.shard_index = shard_index
> M.collect_bucket_garbage_interval = collect_bucket_garbage_interval
> - M.collect_lua_garbage = collect_lua_garbage
> + M.collect_lua_garbage = vshard_cfg.collect_lua_garbage
> M.current_cfg = vshard_cfg
>
> if was_master and not is_master then
> @@ -1666,6 +1657,7 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload)
> M.rebalancer_fiber:cancel()
> M.rebalancer_fiber = nil
> end
> + lua_gc.set_state(M.collect_lua_garbage, consts.COLLECT_LUA_GARBAGE_INTERVAL)
> -- Destroy connections, not used in a new configuration.
> collectgarbage()
> end
>
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* [tarantool-patches] Re: [PATCH 3/3] Introduce multiple routers feature
2018-08-07 13:18 ` Alex Khatskevich
@ 2018-08-08 12:28 ` Vladislav Shpilevoy
2018-08-08 14:04 ` Alex Khatskevich
0 siblings, 1 reply; 23+ messages in thread
From: Vladislav Shpilevoy @ 2018-08-08 12:28 UTC (permalink / raw)
To: Alex Khatskevich, tarantool-patches
Thanks for the fixes!
1. Please, rebase on the master. I've failed to do it
easily.
2. Please, when adding a new commit, send it to the same thread.
I am talking about "Fix: do not update route map in place".
Since you've not sent it, I review it here.
2.1. First, please prefix the commit title with the name of the
subsystem the patch is for. Here it is not "Fix: "
but "router: ".
2.2. We know the new route map size before the rebuild - it is
equal to the total bucket count. So it can be allocated
once via table.new(total_bucket_count, 0), which avoids
reallocs.
I've fixed both remarks and pushed the commit into the
master.
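The preallocation from remark 2.2 can be sketched like this; the
function name and its surroundings are illustrative, not the exact
code pushed to master:

    -- table.new is a LuaJIT extension that preallocates the array
    -- and hash parts of a table.
    local table_new = require('table.new')

    local function build_route_map(new_replicasets, total_bucket_count)
        -- The final size is known up front: one slot per bucket.
        -- Allocating the array part once means the fill loop below
        -- never triggers a realloc.
        local route_map = table_new(total_bucket_count, 0)
        for _, replicaset in pairs(new_replicasets) do
            -- ... fill route_map[bucket_id] = replicaset for each
            -- bucket discovered on this replicaset ...
        end
        return route_map
    end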
> diff --git a/vshard/router/init.lua b/vshard/router/init.lua
> index 59c25a0..b31f7dc 100644
> --- a/vshard/router/init.lua
> +++ b/vshard/router/init.lua
> @@ -799,6 +848,90 @@ if M.errinj.ERRINJ_RELOAD then
> error('Error injection: reload')
> end
>
> +--------------------------------------------------------------------------------
> +-- Managing router instances
> +--------------------------------------------------------------------------------
> +
> +local function cfg_reconfigure(router, cfg)
> + return router_cfg(router, cfg, false)
> +end
> +
> +local router_mt = {
> + __index = {
> + cfg = cfg_reconfigure;
> + info = router_info;
> + buckets_info = router_buckets_info;
> + call = router_call;
> + callro = router_callro;
> + callrw = router_callrw;
> + route = router_route;
> + routeall = router_routeall;
> + bucket_id = router_bucket_id;
> + bucket_count = router_bucket_count;
> + sync = router_sync;
> + bootstrap = cluster_bootstrap;
> + bucket_discovery = bucket_discovery;
> + discovery_wakeup = discovery_wakeup;
> + }
> +}
> +
> +-- Table which represents this module.
> +local module = {}
> +
> +-- This metatable forwards calls on the module to the static_router.
> +local module_mt = {__index = {}}
> +for method_name, method in pairs(router_mt.__index) do
> + module_mt.__index[method_name] = function(...)
> + return method(M.static_router, ...)
> + end
> +end
> +
> +local function export_static_router_attributes()
> + setmetatable(module, module_mt)
> +end
> +
> +--
> +-- Create a new instance of router.
> +-- @param name Name of a new router.
> +-- @param cfg Configuration for `router_cfg`.
> +-- @retval Router instance.
> +-- @retval Nil and error object.
> +--
> +local function router_new(name, cfg)
> + if type(name) ~= 'string' or type(cfg) ~= 'table' then
> + error('Wrong argument type. Usage: vshard.router.new(name, cfg).')
> + end
> + if M.routers[name] then
> + return nil, lerror.vshard(lerror.code.ROUTER_ALREADY_EXISTS, name)
> + end
> + local router = table.deepcopy(ROUTER_TEMPLATE)
> + setmetatable(router, router_mt)
> + router.name = name
> + M.routers[name] = router
> + router_cfg(router, cfg)
3. router_cfg can raise an error from box.cfg. So on an error let's catch it,
remove the router from M.routers, and rethrow the error.
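The suggested cleanup can be sketched in plain Lua (no vshard). Names here are stand-ins: `cfg_func` plays the role of `router_cfg`, which may raise; the point is that a failed `new()` must not leave a half-registered router behind.

```lua
local routers = {}

-- Register the router first, configure under pcall(), and on failure
-- undo the registration before rethrowing the original error.
local function router_new(name, cfg_func)
    local router = {name = name}
    routers[name] = router
    local ok, err = pcall(cfg_func, router)
    if not ok then
        -- Roll back the registration and rethrow.
        routers[name] = nil
        error(err)
    end
    return router
end
```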
Otherwise the patch LGTM. Please fix the comments above and I will
push it. Thank you for working on this!
* [tarantool-patches] Re: [PATCH 3/3] Introduce multiple routers feature
2018-08-08 12:28 ` Vladislav Shpilevoy
@ 2018-08-08 14:04 ` Alex Khatskevich
2018-08-08 15:37 ` Vladislav Shpilevoy
0 siblings, 1 reply; 23+ messages in thread
From: Alex Khatskevich @ 2018-08-08 14:04 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches
On 08.08.2018 15:28, Vladislav Shpilevoy wrote:
> Thanks for the fixes!
>
> 1. Please, rebase on the master. I've failed to do it
> easily.
>
Done
> 2. Please, when adding a new commit, send it to the same thread.
> I am talking about "Fix: do not update route map in place".
>
> Since you haven't sent it, I review it here.
>
> 2.1. First, please prefix the commit title with the name
> of the subsystem the patch is for. Here it is not
> "Fix: " but "router: ".
>
> 2.2. We know the new route map size before the rebuild -
> it is equal to the total bucket count. So it can be
> allocated once via table.new(total_bucket_count, 0),
> which avoids reallocs.
>
> I've fixed both remarks and pushed the commit into the
> master.
>
Thanks
>> +local function router_new(name, cfg)
>> + if type(name) ~= 'string' or type(cfg) ~= 'table' then
>> + error('Wrong argument type. Usage: vshard.router.new(name, cfg).')
>> + end
>> + if M.routers[name] then
>> + return nil, lerror.vshard(lerror.code.ROUTER_ALREADY_EXISTS, name)
>> + end
>> + local router = table.deepcopy(ROUTER_TEMPLATE)
>> + setmetatable(router, router_mt)
>> + router.name = name
>> + M.routers[name] = router
>> + router_cfg(router, cfg)
>
> 3. router_cfg can raise an error from box.cfg. So on an error let's
> catch it, remove the router from M.routers, and rethrow the error.
Done
>
> Otherwise the patch LGTM. Please fix the comments above and I will
> push it.
Full diff:
commit 5cc3991487b6b212ef1c35880963c020e443200e
Author: AKhatskevich <avkhatskevich@tarantool.org>
Date: Thu Jul 26 16:17:25 2018 +0300
Introduce multiple routers feature
Key points:
* Old `vshard.router.some_method()` API is preserved.
* Add `vshard.router.new(name, cfg)` method which returns a new router.
* Each router has its own:
1. name
2. background fibers
3. attributes (route_map, replicasets, outdate_delay...)
* Module reload reloads all configured routers.
* `cfg` reconfigures a single router.
* All routers share the same box configuration. The last passed config
overrides the global box config.
* Multiple router instances can be connected to the same cluster.
* For now, a router cannot be destroyed.
Extra changes:
* Add `data` parameter to `reloadable_fiber_create` function.
Closes #130
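A usage sketch of the API described by the key points above (the `cfg_1` and `cfg_2` tables are placeholders for real vshard configs with a `sharding` section; this needs a running cluster, so it is not standalone):

```lua
local vshard = require('vshard')

-- Old static API keeps working and drives the static router:
vshard.router.cfg(cfg_1)

-- New: an independent router instance, possibly to another cluster.
local router_2, err = vshard.router.new('router_2', cfg_2)
if not router_2 then
    error(err)
end

-- Instance methods mirror the module-level API.
router_2:call(bucket_id, 'read', 'do_select', {1})
router_2:cfg(cfg_2)  -- reconfigures only this router
```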
diff --git a/test/failover/failover.result b/test/failover/failover.result
index 73a4250..50410ad 100644
--- a/test/failover/failover.result
+++ b/test/failover/failover.result
@@ -174,7 +174,7 @@ test_run:switch('router_1')
---
- true
...
-rs1 = vshard.router.internal.replicasets[rs_uuid[1]]
+rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]]
---
...
while not rs1.replica_up_ts do fiber.sleep(0.1) end
diff --git a/test/failover/failover.test.lua b/test/failover/failover.test.lua
index 6e06314..44c8b6d 100644
--- a/test/failover/failover.test.lua
+++ b/test/failover/failover.test.lua
@@ -74,7 +74,7 @@ echo_count
-- Ensure that replica_up_ts is updated periodically.
test_run:switch('router_1')
-rs1 = vshard.router.internal.replicasets[rs_uuid[1]]
+rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]]
while not rs1.replica_up_ts do fiber.sleep(0.1) end
old_up_ts = rs1.replica_up_ts
while rs1.replica_up_ts == old_up_ts do fiber.sleep(0.1) end
diff --git a/test/failover/failover_errinj.result b/test/failover/failover_errinj.result
index 3b6d986..484a1e3 100644
--- a/test/failover/failover_errinj.result
+++ b/test/failover/failover_errinj.result
@@ -49,7 +49,7 @@ vshard.router.cfg(cfg)
-- Check that already run failover step is restarted on
-- configuration change (if some replicasets are removed from
-- config).
-rs1 = vshard.router.internal.replicasets[rs_uuid[1]]
+rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]]
---
...
while not rs1.replica or not rs1.replica.conn:is_connected() do fiber.sleep(0.1) end
diff --git a/test/failover/failover_errinj.test.lua b/test/failover/failover_errinj.test.lua
index b4d2d35..14228de 100644
--- a/test/failover/failover_errinj.test.lua
+++ b/test/failover/failover_errinj.test.lua
@@ -20,7 +20,7 @@ vshard.router.cfg(cfg)
-- Check that already run failover step is restarted on
-- configuration change (if some replicasets are removed from
-- config).
-rs1 = vshard.router.internal.replicasets[rs_uuid[1]]
+rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]]
while not rs1.replica or not rs1.replica.conn:is_connected() do fiber.sleep(0.1) end
vshard.router.internal.errinj.ERRINJ_FAILOVER_CHANGE_CFG = true
wait_state('Configuration has changed, restart ')
diff --git a/test/failover/router_1.lua b/test/failover/router_1.lua
index d71209b..664a6c6 100644
--- a/test/failover/router_1.lua
+++ b/test/failover/router_1.lua
@@ -42,7 +42,7 @@ end
function priority_order()
local ret = {}
for _, uuid in pairs(rs_uuid) do
- local rs = vshard.router.internal.replicasets[uuid]
+ local rs = vshard.router.internal.static_router.replicasets[uuid]
local sorted = {}
for _, replica in pairs(rs.priority_list) do
local z
diff --git a/test/misc/reconfigure.result b/test/misc/reconfigure.result
index c7960b3..311f749 100644
--- a/test/misc/reconfigure.result
+++ b/test/misc/reconfigure.result
@@ -250,7 +250,7 @@ test_run:switch('router_1')
-- Ensure that in a case of error router internals are not
-- changed.
--
-not vshard.router.internal.collect_lua_garbage
+not vshard.router.internal.static_router.collect_lua_garbage
---
- true
...
@@ -264,7 +264,7 @@ vshard.router.cfg(cfg)
---
- error: 'Incorrect value for option ''invalid_option'': unexpected option'
...
-not vshard.router.internal.collect_lua_garbage
+not vshard.router.internal.static_router.collect_lua_garbage
---
- true
...
diff --git a/test/misc/reconfigure.test.lua b/test/misc/reconfigure.test.lua
index 25dc2ca..298b9b0 100644
--- a/test/misc/reconfigure.test.lua
+++ b/test/misc/reconfigure.test.lua
@@ -99,11 +99,11 @@ test_run:switch('router_1')
-- Ensure that in a case of error router internals are not
-- changed.
--
-not vshard.router.internal.collect_lua_garbage
+not vshard.router.internal.static_router.collect_lua_garbage
cfg.collect_lua_garbage = true
cfg.invalid_option = 'kek'
vshard.router.cfg(cfg)
-not vshard.router.internal.collect_lua_garbage
+not vshard.router.internal.static_router.collect_lua_garbage
cfg.invalid_option = nil
cfg.collect_lua_garbage = nil
vshard.router.cfg(cfg)
diff --git a/test/multiple_routers/configs.lua b/test/multiple_routers/configs.lua
new file mode 100644
index 0000000..a6ce33c
--- /dev/null
+++ b/test/multiple_routers/configs.lua
@@ -0,0 +1,81 @@
+names = {
+ storage_1_1_a = '32a2d4b8-f146-44ed-9d51-2436507efdf8',
+ storage_1_1_b = 'c1c849b1-641d-40b8-9283-bcfe73d46270',
+ storage_1_2_a = '04e677ed-c7ba-47e0-a67f-b5100cfa86af',
+ storage_1_2_b = 'c7a979ee-9263-4a38-84a5-2fb6a0a32684',
+ storage_2_1_a = '88dc03f0-23fb-4f05-b462-e29186542864',
+ storage_2_1_b = '4230b711-f5c4-4131-bf98-88cd43a16901',
+ storage_2_2_a = '6b1eefbc-1e2e-410e-84ff-44c572ea9916',
+ storage_2_2_b = 'be74419a-1e56-4ba4-97e9-6b18710f63c5',
+}
+
+rs_1_1 = 'dd208fb8-8b90-49bc-8393-6b3a99da7c52'
+rs_1_2 = 'af9cfe88-2091-4613-a877-a623776c5c0e'
+rs_2_1 = '9ca8ee15-ae18-4f31-9385-4859f89ce73f'
+rs_2_2 = '007f5f58-b654-4125-8441-a71866fb62b5'
+
+local cfg_1 = {}
+cfg_1.sharding = {
+ [rs_1_1] = {
+ replicas = {
+ [names.storage_1_1_a] = {
+ uri = 'storage:storage@127.0.0.1:3301',
+ name = 'storage_1_1_a',
+ master = true,
+ },
+ [names.storage_1_1_b] = {
+ uri = 'storage:storage@127.0.0.1:3302',
+ name = 'storage_1_1_b',
+ },
+ }
+ },
+ [rs_1_2] = {
+ replicas = {
+ [names.storage_1_2_a] = {
+ uri = 'storage:storage@127.0.0.1:3303',
+ name = 'storage_1_2_a',
+ master = true,
+ },
+ [names.storage_1_2_b] = {
+ uri = 'storage:storage@127.0.0.1:3304',
+ name = 'storage_1_2_b',
+ },
+ }
+ },
+}
+
+
+local cfg_2 = {}
+cfg_2.sharding = {
+ [rs_2_1] = {
+ replicas = {
+ [names.storage_2_1_a] = {
+ uri = 'storage:storage@127.0.0.1:3305',
+ name = 'storage_2_1_a',
+ master = true,
+ },
+ [names.storage_2_1_b] = {
+ uri = 'storage:storage@127.0.0.1:3306',
+ name = 'storage_2_1_b',
+ },
+ }
+ },
+ [rs_2_2] = {
+ replicas = {
+ [names.storage_2_2_a] = {
+ uri = 'storage:storage@127.0.0.1:3307',
+ name = 'storage_2_2_a',
+ master = true,
+ },
+ [names.storage_2_2_b] = {
+ uri = 'storage:storage@127.0.0.1:3308',
+ name = 'storage_2_2_b',
+ },
+ }
+ },
+}
+
+return {
+ cfg_1 = cfg_1,
+ cfg_2 = cfg_2,
+}
diff --git a/test/multiple_routers/multiple_routers.result b/test/multiple_routers/multiple_routers.result
new file mode 100644
index 0000000..5b85e1c
--- /dev/null
+++ b/test/multiple_routers/multiple_routers.result
@@ -0,0 +1,301 @@
+test_run = require('test_run').new()
+---
+...
+REPLICASET_1_1 = { 'storage_1_1_a', 'storage_1_1_b' }
+---
+...
+REPLICASET_1_2 = { 'storage_1_2_a', 'storage_1_2_b' }
+---
+...
+REPLICASET_2_1 = { 'storage_2_1_a', 'storage_2_1_b' }
+---
+...
+REPLICASET_2_2 = { 'storage_2_2_a', 'storage_2_2_b' }
+---
+...
+test_run:create_cluster(REPLICASET_1_1, 'multiple_routers')
+---
+...
+test_run:create_cluster(REPLICASET_1_2, 'multiple_routers')
+---
+...
+test_run:create_cluster(REPLICASET_2_1, 'multiple_routers')
+---
+...
+test_run:create_cluster(REPLICASET_2_2, 'multiple_routers')
+---
+...
+util = require('lua_libs.util')
+---
+...
+util.wait_master(test_run, REPLICASET_1_1, 'storage_1_1_a')
+---
+...
+util.wait_master(test_run, REPLICASET_1_2, 'storage_1_2_a')
+---
+...
+util.wait_master(test_run, REPLICASET_2_1, 'storage_2_1_a')
+---
+...
+util.wait_master(test_run, REPLICASET_2_2, 'storage_2_2_a')
+---
+...
+test_run:cmd("create server router_1 with script='multiple_routers/router_1.lua'")
+---
+- true
+...
+test_run:cmd("start server router_1")
+---
+- true
+...
+-- Configure default (static) router.
+_ = test_run:cmd("switch router_1")
+---
+...
+vshard.router.cfg(configs.cfg_1)
+---
+...
+vshard.router.bootstrap()
+---
+- true
+...
+_ = test_run:cmd("switch storage_1_2_a")
+---
+...
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+---
+...
+_ = test_run:cmd("switch router_1")
+---
+...
+vshard.router.call(1, 'write', 'do_replace', {{1, 1}})
+---
+- true
+...
+vshard.router.call(1, 'read', 'do_select', {1})
+---
+- [[1, 1]]
+...
+-- Test that static router is just a router object under the hood.
+static_router = vshard.router.internal.static_router
+---
+...
+static_router:route(1) == vshard.router.route(1)
+---
+- true
+...
+-- Configure extra router.
+router_2 = vshard.router.new('router_2', configs.cfg_2)
+---
+...
+router_2:bootstrap()
+---
+- true
+...
+_ = test_run:cmd("switch storage_2_2_a")
+---
+...
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+---
+...
+_ = test_run:cmd("switch router_1")
+---
+...
+router_2:call(1, 'write', 'do_replace', {{2, 2}})
+---
+- true
+...
+router_2:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+-- Check that router_2 and the static router serve different clusters.
+#router_2:call(1, 'read', 'do_select', {1}) == 0
+---
+- true
+...
+-- Create several routers to the same cluster.
+routers = {}
+---
+...
+for i = 3, 10 do routers[i] = vshard.router.new('router_' .. i, configs.cfg_2) end
+---
+...
+routers[3]:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+-- Check that they have their own background fibers.
+fiber_names = {}
+---
+...
+for i = 2, 10 do fiber_names['vshard.failover.router_' .. i] = true; fiber_names['vshard.discovery.router_' .. i] = true; end
+---
+...
+next(fiber_names) ~= nil
+---
+- true
+...
+fiber = require('fiber')
+---
+...
+for _, xfiber in pairs(fiber.info()) do fiber_names[xfiber.name] = nil end
+---
+...
+next(fiber_names) == nil
+---
+- true
+...
+-- Reconfiguring one of the routers does not affect the others.
+routers[3]:cfg(configs.cfg_1)
+---
+...
+routers[3]:call(1, 'read', 'do_select', {1})
+---
+- [[1, 1]]
+...
+#routers[3]:call(1, 'read', 'do_select', {2}) == 0
+---
+- true
+...
+#routers[4]:call(1, 'read', 'do_select', {1}) == 0
+---
+- true
+...
+routers[4]:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+routers[3]:cfg(configs.cfg_2)
+---
+...
+-- Try to create router with the same name.
+util = require('lua_libs.util')
+---
+...
+util.check_error(vshard.router.new, 'router_2', configs.cfg_2)
+---
+- null
+- type: ShardingError
+ code: 21
+ name: ROUTER_ALREADY_EXISTS
+ message: Router with name router_2 already exists
+...
+-- Reload router module.
+_, old_rs_1 = next(vshard.router.internal.static_router.replicasets)
+---
+...
+_, old_rs_2 = next(router_2.replicasets)
+---
+...
+package.loaded['vshard.router'] = nil
+---
+...
+vshard.router = require('vshard.router')
+---
+...
+while not old_rs_1.is_outdated do fiber.sleep(0.01) end
+---
+...
+while not old_rs_2.is_outdated do fiber.sleep(0.01) end
+---
+...
+vshard.router.call(1, 'read', 'do_select', {1})
+---
+- [[1, 1]]
+...
+router_2:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+routers[5]:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+-- Check lua_gc counter.
+lua_gc = require('vshard.lua_gc')
+---
+...
+vshard.router.internal.collect_lua_garbage_cnt == 0
+---
+- true
+...
+lua_gc.internal.bg_fiber == nil
+---
+- true
+...
+configs.cfg_2.collect_lua_garbage = true
+---
+...
+routers[5]:cfg(configs.cfg_2)
+---
+...
+lua_gc.internal.bg_fiber ~= nil
+---
+- true
+...
+routers[7]:cfg(configs.cfg_2)
+---
+...
+lua_gc.internal.bg_fiber ~= nil
+---
+- true
+...
+vshard.router.internal.collect_lua_garbage_cnt == 2
+---
+- true
+...
+package.loaded['vshard.router'] = nil
+---
+...
+vshard.router = require('vshard.router')
+---
+...
+vshard.router.internal.collect_lua_garbage_cnt == 2
+---
+- true
+...
+configs.cfg_2.collect_lua_garbage = nil
+---
+...
+routers[5]:cfg(configs.cfg_2)
+---
+...
+lua_gc.internal.bg_fiber ~= nil
+---
+- true
+...
+routers[7]:cfg(configs.cfg_2)
+---
+...
+vshard.router.internal.collect_lua_garbage_cnt == 0
+---
+- true
+...
+lua_gc.internal.bg_fiber == nil
+---
+- true
+...
+_ = test_run:cmd("switch default")
+---
+...
+test_run:cmd("stop server router_1")
+---
+- true
+...
+test_run:cmd("cleanup server router_1")
+---
+- true
+...
+test_run:drop_cluster(REPLICASET_1_1)
+---
+...
+test_run:drop_cluster(REPLICASET_1_2)
+---
+...
+test_run:drop_cluster(REPLICASET_2_1)
+---
+...
+test_run:drop_cluster(REPLICASET_2_2)
+---
+...
diff --git a/test/multiple_routers/multiple_routers.test.lua b/test/multiple_routers/multiple_routers.test.lua
new file mode 100644
index 0000000..ec3c7f7
--- /dev/null
+++ b/test/multiple_routers/multiple_routers.test.lua
@@ -0,0 +1,109 @@
+test_run = require('test_run').new()
+
+REPLICASET_1_1 = { 'storage_1_1_a', 'storage_1_1_b' }
+REPLICASET_1_2 = { 'storage_1_2_a', 'storage_1_2_b' }
+REPLICASET_2_1 = { 'storage_2_1_a', 'storage_2_1_b' }
+REPLICASET_2_2 = { 'storage_2_2_a', 'storage_2_2_b' }
+
+test_run:create_cluster(REPLICASET_1_1, 'multiple_routers')
+test_run:create_cluster(REPLICASET_1_2, 'multiple_routers')
+test_run:create_cluster(REPLICASET_2_1, 'multiple_routers')
+test_run:create_cluster(REPLICASET_2_2, 'multiple_routers')
+util = require('lua_libs.util')
+util.wait_master(test_run, REPLICASET_1_1, 'storage_1_1_a')
+util.wait_master(test_run, REPLICASET_1_2, 'storage_1_2_a')
+util.wait_master(test_run, REPLICASET_2_1, 'storage_2_1_a')
+util.wait_master(test_run, REPLICASET_2_2, 'storage_2_2_a')
+
+test_run:cmd("create server router_1 with script='multiple_routers/router_1.lua'")
+test_run:cmd("start server router_1")
+
+-- Configure default (static) router.
+_ = test_run:cmd("switch router_1")
+vshard.router.cfg(configs.cfg_1)
+vshard.router.bootstrap()
+_ = test_run:cmd("switch storage_1_2_a")
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+_ = test_run:cmd("switch router_1")
+
+vshard.router.call(1, 'write', 'do_replace', {{1, 1}})
+vshard.router.call(1, 'read', 'do_select', {1})
+
+-- Test that static router is just a router object under the hood.
+static_router = vshard.router.internal.static_router
+static_router:route(1) == vshard.router.route(1)
+
+-- Configure extra router.
+router_2 = vshard.router.new('router_2', configs.cfg_2)
+router_2:bootstrap()
+_ = test_run:cmd("switch storage_2_2_a")
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+_ = test_run:cmd("switch router_1")
+
+router_2:call(1, 'write', 'do_replace', {{2, 2}})
+router_2:call(1, 'read', 'do_select', {2})
+-- Check that router_2 and the static router serve different clusters.
+#router_2:call(1, 'read', 'do_select', {1}) == 0
+
+-- Create several routers to the same cluster.
+routers = {}
+for i = 3, 10 do routers[i] = vshard.router.new('router_' .. i, configs.cfg_2) end
+routers[3]:call(1, 'read', 'do_select', {2})
+-- Check that they have their own background fibers.
+fiber_names = {}
+for i = 2, 10 do fiber_names['vshard.failover.router_' .. i] = true; fiber_names['vshard.discovery.router_' .. i] = true; end
+next(fiber_names) ~= nil
+fiber = require('fiber')
+for _, xfiber in pairs(fiber.info()) do fiber_names[xfiber.name] = nil end
+next(fiber_names) == nil
+
+-- Reconfiguring one of the routers does not affect the others.
+routers[3]:cfg(configs.cfg_1)
+routers[3]:call(1, 'read', 'do_select', {1})
+#routers[3]:call(1, 'read', 'do_select', {2}) == 0
+#routers[4]:call(1, 'read', 'do_select', {1}) == 0
+routers[4]:call(1, 'read', 'do_select', {2})
+routers[3]:cfg(configs.cfg_2)
+
+-- Try to create router with the same name.
+util = require('lua_libs.util')
+util.check_error(vshard.router.new, 'router_2', configs.cfg_2)
+
+-- Reload router module.
+_, old_rs_1 = next(vshard.router.internal.static_router.replicasets)
+_, old_rs_2 = next(router_2.replicasets)
+package.loaded['vshard.router'] = nil
+vshard.router = require('vshard.router')
+while not old_rs_1.is_outdated do fiber.sleep(0.01) end
+while not old_rs_2.is_outdated do fiber.sleep(0.01) end
+vshard.router.call(1, 'read', 'do_select', {1})
+router_2:call(1, 'read', 'do_select', {2})
+routers[5]:call(1, 'read', 'do_select', {2})
+
+-- Check lua_gc counter.
+lua_gc = require('vshard.lua_gc')
+vshard.router.internal.collect_lua_garbage_cnt == 0
+lua_gc.internal.bg_fiber == nil
+configs.cfg_2.collect_lua_garbage = true
+routers[5]:cfg(configs.cfg_2)
+lua_gc.internal.bg_fiber ~= nil
+routers[7]:cfg(configs.cfg_2)
+lua_gc.internal.bg_fiber ~= nil
+vshard.router.internal.collect_lua_garbage_cnt == 2
+package.loaded['vshard.router'] = nil
+vshard.router = require('vshard.router')
+vshard.router.internal.collect_lua_garbage_cnt == 2
+configs.cfg_2.collect_lua_garbage = nil
+routers[5]:cfg(configs.cfg_2)
+lua_gc.internal.bg_fiber ~= nil
+routers[7]:cfg(configs.cfg_2)
+vshard.router.internal.collect_lua_garbage_cnt == 0
+lua_gc.internal.bg_fiber == nil
+
+_ = test_run:cmd("switch default")
+test_run:cmd("stop server router_1")
+test_run:cmd("cleanup server router_1")
+test_run:drop_cluster(REPLICASET_1_1)
+test_run:drop_cluster(REPLICASET_1_2)
+test_run:drop_cluster(REPLICASET_2_1)
+test_run:drop_cluster(REPLICASET_2_2)
diff --git a/test/multiple_routers/router_1.lua b/test/multiple_routers/router_1.lua
new file mode 100644
index 0000000..2e9ea91
--- /dev/null
+++ b/test/multiple_routers/router_1.lua
@@ -0,0 +1,15 @@
+#!/usr/bin/env tarantool
+
+require('strict').on()
+
+-- Get instance name
+local fio = require('fio')
+local NAME = fio.basename(arg[0], '.lua')
+
+require('console').listen(os.getenv('ADMIN'))
+
+configs = require('configs')
+
+-- Start the database with sharding
+vshard = require('vshard')
+box.cfg{}
diff --git a/test/multiple_routers/storage_1_1_a.lua b/test/multiple_routers/storage_1_1_a.lua
new file mode 100644
index 0000000..b44a97a
--- /dev/null
+++ b/test/multiple_routers/storage_1_1_a.lua
@@ -0,0 +1,23 @@
+#!/usr/bin/env tarantool
+
+require('strict').on()
+
+-- Get instance name.
+local fio = require('fio')
+NAME = fio.basename(arg[0], '.lua')
+
+require('console').listen(os.getenv('ADMIN'))
+
+-- Fetch config for the cluster of the instance.
+if NAME:sub(9,9) == '1' then
+ cfg = require('configs').cfg_1
+else
+ cfg = require('configs').cfg_2
+end
+
+-- Start the database with sharding.
+vshard = require('vshard')
+vshard.storage.cfg(cfg, names[NAME])
+
+-- Bootstrap storage.
+require('lua_libs.bootstrap')
diff --git a/test/multiple_routers/storage_1_1_b.lua b/test/multiple_routers/storage_1_1_b.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_1_1_b.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_1_2_a.lua b/test/multiple_routers/storage_1_2_a.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_1_2_a.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_1_2_b.lua b/test/multiple_routers/storage_1_2_b.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_1_2_b.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_2_1_a.lua b/test/multiple_routers/storage_2_1_a.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_2_1_a.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_2_1_b.lua b/test/multiple_routers/storage_2_1_b.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_2_1_b.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_2_2_a.lua b/test/multiple_routers/storage_2_2_a.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_2_2_a.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_2_2_b.lua b/test/multiple_routers/storage_2_2_b.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_2_2_b.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/suite.ini b/test/multiple_routers/suite.ini
new file mode 100644
index 0000000..d2d4470
--- /dev/null
+++ b/test/multiple_routers/suite.ini
@@ -0,0 +1,6 @@
+[default]
+core = tarantool
+description = Multiple routers tests
+script = test.lua
+is_parallel = False
+lua_libs = ../lua_libs configs.lua
diff --git a/test/multiple_routers/test.lua b/test/multiple_routers/test.lua
new file mode 100644
index 0000000..cb7c1ee
--- /dev/null
+++ b/test/multiple_routers/test.lua
@@ -0,0 +1,9 @@
+#!/usr/bin/env tarantool
+
+require('strict').on()
+
+box.cfg{
+ listen = os.getenv("LISTEN"),
+}
+
+require('console').listen(os.getenv('ADMIN'))
diff --git a/test/router/exponential_timeout.result b/test/router/exponential_timeout.result
index fb54d0f..6748b64 100644
--- a/test/router/exponential_timeout.result
+++ b/test/router/exponential_timeout.result
@@ -37,10 +37,10 @@ test_run:cmd('switch router_1')
util = require('util')
---
...
-rs1 = vshard.router.internal.replicasets[replicasets[1]]
+rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]]
---
...
-rs2 = vshard.router.internal.replicasets[replicasets[2]]
+rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]]
---
...
util.collect_timeouts(rs1)
diff --git a/test/router/exponential_timeout.test.lua b/test/router/exponential_timeout.test.lua
index 3ec0b8c..75d85bf 100644
--- a/test/router/exponential_timeout.test.lua
+++ b/test/router/exponential_timeout.test.lua
@@ -13,8 +13,8 @@ test_run:cmd("start server router_1")
test_run:cmd('switch router_1')
util = require('util')
-rs1 = vshard.router.internal.replicasets[replicasets[1]]
-rs2 = vshard.router.internal.replicasets[replicasets[2]]
+rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]]
+rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]]
util.collect_timeouts(rs1)
util.collect_timeouts(rs2)
diff --git a/test/router/reconnect_to_master.result b/test/router/reconnect_to_master.result
index 5e678ce..d502723 100644
--- a/test/router/reconnect_to_master.result
+++ b/test/router/reconnect_to_master.result
@@ -76,7 +76,7 @@ _ = test_run:cmd('stop server storage_1_a')
_ = test_run:switch('router_1')
---
...
-reps = vshard.router.internal.replicasets
+reps = vshard.router.internal.static_router.replicasets
---
...
test_run:cmd("setopt delimiter ';'")
@@ -95,7 +95,7 @@ end;
...
function count_known_buckets()
local known_buckets = 0
- for _, id in pairs(vshard.router.internal.route_map) do
+ for _, id in pairs(vshard.router.internal.static_router.route_map) do
known_buckets = known_buckets + 1
end
return known_buckets
@@ -127,7 +127,7 @@ is_disconnected()
fiber = require('fiber')
---
...
-while vshard.router.internal.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end
+while vshard.router.internal.static_router.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end
---
...
vshard.router.info()
diff --git a/test/router/reconnect_to_master.test.lua b/test/router/reconnect_to_master.test.lua
index 39ba90e..8820fa7 100644
--- a/test/router/reconnect_to_master.test.lua
+++ b/test/router/reconnect_to_master.test.lua
@@ -34,7 +34,7 @@ _ = test_run:cmd('stop server storage_1_a')
_ = test_run:switch('router_1')
-reps = vshard.router.internal.replicasets
+reps = vshard.router.internal.static_router.replicasets
test_run:cmd("setopt delimiter ';'")
function is_disconnected()
for i, rep in pairs(reps) do
@@ -46,7 +46,7 @@ function is_disconnected()
end;
function count_known_buckets()
local known_buckets = 0
- for _, id in pairs(vshard.router.internal.route_map) do
+ for _, id in pairs(vshard.router.internal.static_router.route_map) do
known_buckets = known_buckets + 1
end
return known_buckets
@@ -63,7 +63,7 @@ is_disconnected()
-- Wait until replica is connected to test alerts on unavailable
-- master.
fiber = require('fiber')
-while vshard.router.internal.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end
+while vshard.router.internal.static_router.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end
vshard.router.info()
-- Return master.
diff --git a/test/router/reload.result b/test/router/reload.result
index f0badc3..98e8e71 100644
--- a/test/router/reload.result
+++ b/test/router/reload.result
@@ -229,7 +229,7 @@ vshard.router.cfg(cfg)
cfg.connection_outdate_delay = old_connection_delay
---
...
-vshard.router.internal.connection_outdate_delay = nil
+vshard.router.internal.static_router.connection_outdate_delay = nil
---
...
rs_new = vshard.router.route(1)
diff --git a/test/router/reload.test.lua b/test/router/reload.test.lua
index 528222a..293cb26 100644
--- a/test/router/reload.test.lua
+++ b/test/router/reload.test.lua
@@ -104,7 +104,7 @@ old_connection_delay = cfg.connection_outdate_delay
cfg.connection_outdate_delay = 0.3
vshard.router.cfg(cfg)
cfg.connection_outdate_delay = old_connection_delay
-vshard.router.internal.connection_outdate_delay = nil
+vshard.router.internal.static_router.connection_outdate_delay = nil
rs_new = vshard.router.route(1)
rs_old = rs
_, replica_old = next(rs_old.replicas)
diff --git a/test/router/reroute_wrong_bucket.result b/test/router/reroute_wrong_bucket.result
index 7f2a494..989dc79 100644
--- a/test/router/reroute_wrong_bucket.result
+++ b/test/router/reroute_wrong_bucket.result
@@ -98,7 +98,7 @@ vshard.router.call(100, 'read', 'customer_lookup', {1}, {timeout = 100})
---
- {'accounts': [], 'customer_id': 1, 'name': 'name'}
...
-vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]]
+vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]]
---
...
vshard.router.call(100, 'write', 'customer_add', {{customer_id = 2, bucket_id = 100, name = 'name2', accounts = {}}}, {timeout = 100})
@@ -146,13 +146,13 @@ test_run:switch('router_1')
...
-- Emulate a situation when replicaset_2 is still unknown to the
-- router, but is already known to the storages.
-save_rs2 = vshard.router.internal.replicasets[replicasets[2]]
+save_rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]]
---
...
-vshard.router.internal.replicasets[replicasets[2]] = nil
+vshard.router.internal.static_router.replicasets[replicasets[2]] = nil
---
...
-vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]]
+vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]]
---
...
fiber = require('fiber')
@@ -207,7 +207,7 @@ err
require('log').info(string.rep('a', 1000))
---
...
-vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]]
+vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]]
---
...
call_retval = nil
@@ -219,7 +219,7 @@ f = fiber.create(do_call, 100)
while not test_run:grep_log('router_1', 'please update configuration', 1000) do fiber.sleep(0.1) end
---
...
-vshard.router.internal.replicasets[replicasets[2]] = save_rs2
+vshard.router.internal.static_router.replicasets[replicasets[2]] = save_rs2
---
...
while not call_retval do fiber.sleep(0.1) end
diff --git a/test/router/reroute_wrong_bucket.test.lua b/test/router/reroute_wrong_bucket.test.lua
index 03384d1..a00f941 100644
--- a/test/router/reroute_wrong_bucket.test.lua
+++ b/test/router/reroute_wrong_bucket.test.lua
@@ -35,7 +35,7 @@ customer_add({customer_id = 1, bucket_id = 100, name = 'name', accounts = {}})
test_run:switch('router_1')
vshard.router.call(100, 'read', 'customer_lookup', {1}, {timeout = 100})
-vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]]
+vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]]
vshard.router.call(100, 'write', 'customer_add', {{customer_id = 2, bucket_id = 100, name = 'name2', accounts = {}}}, {timeout = 100})
-- Create cycle.
@@ -55,9 +55,9 @@ box.space._bucket:replace({100, vshard.consts.BUCKET.SENT, replicasets[2]})
test_run:switch('router_1')
-- Emulate a situation when replicaset_2 is still unknown to the
-- router, but is already known to the storages.
-save_rs2 = vshard.router.internal.replicasets[replicasets[2]]
-vshard.router.internal.replicasets[replicasets[2]] = nil
-vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]]
+save_rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]]
+vshard.router.internal.static_router.replicasets[replicasets[2]] = nil
+vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]]
fiber = require('fiber')
call_retval = nil
@@ -84,11 +84,11 @@ err
-- detect it and end with ok.
--
require('log').info(string.rep('a', 1000))
-vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]]
+vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]]
call_retval = nil
f = fiber.create(do_call, 100)
while not test_run:grep_log('router_1', 'please update configuration', 1000) do fiber.sleep(0.1) end
-vshard.router.internal.replicasets[replicasets[2]] = save_rs2
+vshard.router.internal.static_router.replicasets[replicasets[2]] = save_rs2
while not call_retval do fiber.sleep(0.1) end
call_retval
vshard.router.call(100, 'read', 'customer_lookup', {3}, {timeout = 1})
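
The mechanical renames above move the router's bucket caches onto the static_router object. A minimal sketch of the new access path (a sketch only — it assumes a configured test environment with a `replicasets` UUID array, as in the test above):

```lua
local internal = vshard.router.internal

-- Before the patch the caches lived directly on the module internals:
--   internal.route_map, internal.replicasets
-- After the patch they belong to the static router object:
local rs = internal.static_router.replicasets[replicasets[1]]
internal.static_router.route_map[100] = rs -- pin bucket 100 to rs1
```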
diff --git a/test/router/retry_reads.result b/test/router/retry_reads.result
index 64b0ff3..b803ae3 100644
--- a/test/router/retry_reads.result
+++ b/test/router/retry_reads.result
@@ -37,7 +37,7 @@ test_run:cmd('switch router_1')
util = require('util')
---
...
-rs1 = vshard.router.internal.replicasets[replicasets[1]]
+rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]]
---
...
min_timeout = vshard.consts.CALL_TIMEOUT_MIN
diff --git a/test/router/retry_reads.test.lua b/test/router/retry_reads.test.lua
index 2fb2fc7..510e961 100644
--- a/test/router/retry_reads.test.lua
+++ b/test/router/retry_reads.test.lua
@@ -13,7 +13,7 @@ test_run:cmd("start server router_1")
test_run:cmd('switch router_1')
util = require('util')
-rs1 = vshard.router.internal.replicasets[replicasets[1]]
+rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]]
min_timeout = vshard.consts.CALL_TIMEOUT_MIN
--
diff --git a/test/router/router.result b/test/router/router.result
index 45394e1..ceaf672 100644
--- a/test/router/router.result
+++ b/test/router/router.result
@@ -70,10 +70,10 @@ test_run:grep_log('router_1', 'connected to ')
---
- 'connected to '
...
-rs1 = vshard.router.internal.replicasets[replicasets[1]]
+rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]]
---
...
-rs2 = vshard.router.internal.replicasets[replicasets[2]]
+rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]]
---
...
fiber = require('fiber')
@@ -95,7 +95,7 @@ rs2.replica == rs2.master
-- Part of gh-76: on reconfiguration do not recreate connections
-- to replicas, that are kept in a new configuration.
--
-old_replicasets = vshard.router.internal.replicasets
+old_replicasets = vshard.router.internal.static_router.replicasets
---
...
old_connections = {}
@@ -127,17 +127,17 @@ connection_count == 4
vshard.router.cfg(cfg)
---
...
-new_replicasets = vshard.router.internal.replicasets
+new_replicasets = vshard.router.internal.static_router.replicasets
---
...
old_replicasets ~= new_replicasets
---
- true
...
-rs1 = vshard.router.internal.replicasets[replicasets[1]]
+rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]]
---
...
-rs2 = vshard.router.internal.replicasets[replicasets[2]]
+rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]]
---
...
while not rs1.replica or not rs2.replica do fiber.sleep(0.1) end
@@ -225,7 +225,7 @@ vshard.router.bootstrap()
--
-- gh-108: negative bucket count on discovery.
--
-vshard.router.internal.route_map = {}
+vshard.router.internal.static_router.route_map = {}
---
...
rets = {}
@@ -456,7 +456,7 @@ conn.state
rs_uuid = '<replicaset_2>'
---
...
-rs = vshard.router.internal.replicasets[rs_uuid]
+rs = vshard.router.internal.static_router.replicasets[rs_uuid]
---
...
master = rs.master
@@ -605,7 +605,7 @@ vshard.router.info()
...
-- Remove replica and master connections to trigger alert
-- UNREACHABLE_REPLICASET.
-rs = vshard.router.internal.replicasets[replicasets[1]]
+rs = vshard.router.internal.static_router.replicasets[replicasets[1]]
---
...
master_conn = rs.master.conn
@@ -749,7 +749,7 @@ test_run:cmd("setopt delimiter ';'")
...
function calculate_known_buckets()
local known_buckets = 0
- for _, rs in pairs(vshard.router.internal.route_map) do
+ for _, rs in pairs(vshard.router.internal.static_router.route_map) do
known_buckets = known_buckets + 1
end
return known_buckets
@@ -851,10 +851,10 @@ test_run:cmd("setopt delimiter ';'")
- true
...
for i = 1, 100 do
- local rs = vshard.router.internal.route_map[i]
+ local rs = vshard.router.internal.static_router.route_map[i]
assert(rs)
rs.bucket_count = rs.bucket_count - 1
- vshard.router.internal.route_map[i] = nil
+ vshard.router.internal.static_router.route_map[i] = nil
end;
---
...
@@ -999,7 +999,7 @@ vshard.router.sync(100500)
-- object method like this: object.method() instead of
-- object:method(), an appropriate help-error returns.
--
-_, replicaset = next(vshard.router.internal.replicasets)
+_, replicaset = next(vshard.router.internal.static_router.replicasets)
---
...
error_messages = {}
@@ -1069,7 +1069,7 @@ test_run:cmd("setopt delimiter ';'")
---
- true
...
-for bucket, rs in pairs(vshard.router.internal.route_map) do
+for bucket, rs in pairs(vshard.router.internal.static_router.route_map) do
bucket_to_old_rs[bucket] = rs
bucket_cnt = bucket_cnt + 1
end;
@@ -1084,7 +1084,7 @@ vshard.router.cfg(cfg);
...
for bucket, old_rs in pairs(bucket_to_old_rs) do
local old_uuid = old_rs.uuid
- local rs = vshard.router.internal.route_map[bucket]
+ local rs = vshard.router.internal.static_router.route_map[bucket]
if not rs or not old_uuid == rs.uuid then
error("Bucket lost during reconfigure.")
end
@@ -1111,7 +1111,7 @@ end;
vshard.router.cfg(cfg);
---
...
-vshard.router.internal.route_map = {};
+vshard.router.internal.static_router.route_map = {};
---
...
vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY = false;
@@ -1119,7 +1119,7 @@ vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY = false;
...
-- Do discovery iteration. Upload buckets from the
-- first replicaset.
-while not next(vshard.router.internal.route_map) do
+while not next(vshard.router.internal.static_router.route_map) do
vshard.router.discovery_wakeup()
fiber.sleep(0.01)
end;
@@ -1128,12 +1128,12 @@ end;
new_replicasets = {};
---
...
-for _, rs in pairs(vshard.router.internal.replicasets) do
+for _, rs in pairs(vshard.router.internal.static_router.replicasets) do
new_replicasets[rs] = true
end;
---
...
-_, rs = next(vshard.router.internal.route_map);
+_, rs = next(vshard.router.internal.static_router.route_map);
---
...
new_replicasets[rs] == true;
@@ -1185,6 +1185,17 @@ vshard.router.route(1):callro('echo', {'some_data'})
- null
- null
...
+-- Multiple routers: check that static router can be used as an
+-- object.
+static_router = vshard.router.internal.static_router
+---
+...
+static_router:route(1):callro('echo', {'some_data'})
+---
+- some_data
+- null
+- null
+...
_ = test_run:cmd("switch default")
---
...
diff --git a/test/router/router.test.lua b/test/router/router.test.lua
index df2f381..d7588f7 100644
--- a/test/router/router.test.lua
+++ b/test/router/router.test.lua
@@ -27,8 +27,8 @@ util = require('util')
-- gh-24: log all connnect/disconnect events.
test_run:grep_log('router_1', 'connected to ')
-rs1 = vshard.router.internal.replicasets[replicasets[1]]
-rs2 = vshard.router.internal.replicasets[replicasets[2]]
+rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]]
+rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]]
fiber = require('fiber')
while not rs1.replica or not rs2.replica do fiber.sleep(0.1) end
-- With no zones the nearest server is master.
@@ -39,7 +39,7 @@ rs2.replica == rs2.master
-- Part of gh-76: on reconfiguration do not recreate connections
-- to replicas, that are kept in a new configuration.
--
-old_replicasets = vshard.router.internal.replicasets
+old_replicasets = vshard.router.internal.static_router.replicasets
old_connections = {}
connection_count = 0
test_run:cmd("setopt delimiter ';'")
@@ -52,10 +52,10 @@ end;
test_run:cmd("setopt delimiter ''");
connection_count == 4
vshard.router.cfg(cfg)
-new_replicasets = vshard.router.internal.replicasets
+new_replicasets = vshard.router.internal.static_router.replicasets
old_replicasets ~= new_replicasets
-rs1 = vshard.router.internal.replicasets[replicasets[1]]
-rs2 = vshard.router.internal.replicasets[replicasets[2]]
+rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]]
+rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]]
while not rs1.replica or not rs2.replica do fiber.sleep(0.1) end
vshard.router.discovery_wakeup()
-- Check that netbox connections are the same.
@@ -91,7 +91,7 @@ vshard.router.bootstrap()
--
-- gh-108: negative bucket count on discovery.
--
-vshard.router.internal.route_map = {}
+vshard.router.internal.static_router.route_map = {}
rets = {}
function do_echo() table.insert(rets, vshard.router.callro(1, 'echo', {1})) end
f1 = fiber.create(do_echo) f2 = fiber.create(do_echo)
@@ -153,7 +153,7 @@ conn = vshard.router.route(1).master.conn
conn.state
-- Test missing master.
rs_uuid = 'ac522f65-aa94-4134-9f64-51ee384f1a54'
-rs = vshard.router.internal.replicasets[rs_uuid]
+rs = vshard.router.internal.static_router.replicasets[rs_uuid]
master = rs.master
rs.master = nil
vshard.router.route(1).master
@@ -223,7 +223,7 @@ vshard.router.info()
-- Remove replica and master connections to trigger alert
-- UNREACHABLE_REPLICASET.
-rs = vshard.router.internal.replicasets[replicasets[1]]
+rs = vshard.router.internal.static_router.replicasets[replicasets[1]]
master_conn = rs.master.conn
replica_conn = rs.replica.conn
rs.master.conn = nil
@@ -261,7 +261,7 @@ util.check_error(vshard.router.buckets_info, 123, '456')
test_run:cmd("setopt delimiter ';'")
function calculate_known_buckets()
local known_buckets = 0
- for _, rs in pairs(vshard.router.internal.route_map) do
+ for _, rs in pairs(vshard.router.internal.static_router.route_map) do
known_buckets = known_buckets + 1
end
return known_buckets
@@ -301,10 +301,10 @@ test_run:switch('router_1')
--
test_run:cmd("setopt delimiter ';'")
for i = 1, 100 do
- local rs = vshard.router.internal.route_map[i]
+ local rs = vshard.router.internal.static_router.route_map[i]
assert(rs)
rs.bucket_count = rs.bucket_count - 1
- vshard.router.internal.route_map[i] = nil
+ vshard.router.internal.static_router.route_map[i] = nil
end;
test_run:cmd("setopt delimiter ''");
calculate_known_buckets()
@@ -367,7 +367,7 @@ vshard.router.sync(100500)
-- object method like this: object.method() instead of
-- object:method(), an appropriate help-error returns.
--
-_, replicaset = next(vshard.router.internal.replicasets)
+_, replicaset = next(vshard.router.internal.static_router.replicasets)
error_messages = {}
test_run:cmd("setopt delimiter ';'")
@@ -395,7 +395,7 @@ error_messages
bucket_to_old_rs = {}
bucket_cnt = 0
test_run:cmd("setopt delimiter ';'")
-for bucket, rs in pairs(vshard.router.internal.route_map) do
+for bucket, rs in pairs(vshard.router.internal.static_router.route_map) do
bucket_to_old_rs[bucket] = rs
bucket_cnt = bucket_cnt + 1
end;
@@ -403,7 +403,7 @@ bucket_cnt;
vshard.router.cfg(cfg);
for bucket, old_rs in pairs(bucket_to_old_rs) do
local old_uuid = old_rs.uuid
- local rs = vshard.router.internal.route_map[bucket]
+ local rs = vshard.router.internal.static_router.route_map[bucket]
if not rs or not old_uuid == rs.uuid then
error("Bucket lost during reconfigure.")
end
@@ -423,19 +423,19 @@ while vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY ~= 'waiting' do
fiber.sleep(0.02)
end;
vshard.router.cfg(cfg);
-vshard.router.internal.route_map = {};
+vshard.router.internal.static_router.route_map = {};
vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY = false;
-- Do discovery iteration. Upload buckets from the
-- first replicaset.
-while not next(vshard.router.internal.route_map) do
+while not next(vshard.router.internal.static_router.route_map) do
vshard.router.discovery_wakeup()
fiber.sleep(0.01)
end;
new_replicasets = {};
-for _, rs in pairs(vshard.router.internal.replicasets) do
+for _, rs in pairs(vshard.router.internal.static_router.replicasets) do
new_replicasets[rs] = true
end;
-_, rs = next(vshard.router.internal.route_map);
+_, rs = next(vshard.router.internal.static_router.route_map);
new_replicasets[rs] == true;
test_run:cmd("setopt delimiter ''");
@@ -453,6 +453,11 @@ vshard.router.internal.errinj.ERRINJ_CFG = false
util.has_same_fields(old_internal, vshard.router.internal)
vshard.router.route(1):callro('echo', {'some_data'})
+-- Multiple routers: check that static router can be used as an
+-- object.
+static_router = vshard.router.internal.static_router
+static_router:route(1):callro('echo', {'some_data'})
+
_ = test_run:cmd("switch default")
test_run:drop_cluster(REPLICASET_2)
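
The new test case above only smoke-tests the object API. A sketch of the two call styles that become equivalent after this patch (assumes a bootstrapped cluster, as in the surrounding test):

```lua
-- Module-level API: implicitly routed through the static router.
vshard.router.route(1):callro('echo', {'some_data'})

-- Object API: the same static router addressed explicitly.
local static_router = vshard.router.internal.static_router
static_router:route(1):callro('echo', {'some_data'})
```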
diff --git a/vshard/error.lua b/vshard/error.lua
index f79107b..da92b58 100644
--- a/vshard/error.lua
+++ b/vshard/error.lua
@@ -105,7 +105,12 @@ local error_message_template = {
name = 'OBJECT_IS_OUTDATED',
msg = 'Object is outdated after module reload/reconfigure. ' ..
'Use new instance.'
- }
+ },
+ [21] = {
+ name = 'ROUTER_ALREADY_EXISTS',
+ msg = 'Router with name %s already exists',
+ args = {'name'},
+ },
}
--
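
A hypothetical sketch of how the new error code could be raised when a second router is registered under an occupied name. Note the hunk above only adds the error definition; `router_new` here is an illustrative name, not an API introduced by this hunk.

```lua
local lerror = require('vshard.error')

local function router_new(name, cfg)
    -- Refuse to register two routers under one name.
    if M.routers[name] then
        return nil, lerror.vshard(lerror.code.ROUTER_ALREADY_EXISTS, name)
    end
    -- ... build the router object and register it in M.routers ...
end
```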
diff --git a/vshard/router/init.lua b/vshard/router/init.lua
index 69cd37c..7ab2145 100644
--- a/vshard/router/init.lua
+++ b/vshard/router/init.lua
@@ -26,14 +26,33 @@ local M = rawget(_G, MODULE_INTERNALS)
if not M then
M = {
---------------- Common module attributes ----------------
- -- The last passed configuration.
- current_cfg = nil,
errinj = {
ERRINJ_CFG = false,
ERRINJ_FAILOVER_CHANGE_CFG = false,
ERRINJ_RELOAD = false,
ERRINJ_LONG_DISCOVERY = false,
},
+ -- Dictionary, key is router name, value is a router.
+ routers = {},
+        -- Router object which can be accessed via the old API,
+        -- e.g. vshard.router.call(...).
+ static_router = nil,
+ -- This counter is used to restart background fibers with
+ -- new reloaded code.
+ module_version = 0,
+        -- Number of routers which require collecting Lua garbage.
+ collect_lua_garbage_cnt = 0,
+ }
+end
+
+--
+-- Router object attributes.
+--
+local ROUTER_TEMPLATE = {
+ -- Name of router.
+ name = nil,
+ -- The last passed configuration.
+ current_cfg = nil,
-- Time to outdate old objects on reload.
connection_outdate_delay = nil,
-- Bucket map cache.
@@ -48,38 +67,60 @@ if not M then
total_bucket_count = 0,
-- Boolean lua_gc state (create periodic gc task).
collect_lua_garbage = nil,
- -- This counter is used to restart background fibers with
- -- new reloaded code.
- module_version = 0,
- }
-end
+}
+
+local STATIC_ROUTER_NAME = '_static_router'
-- Set a bucket to a replicaset.
-local function bucket_set(bucket_id, rs_uuid)
- local replicaset = M.replicasets[rs_uuid]
+local function bucket_set(router, bucket_id, rs_uuid)
+ local replicaset = router.replicasets[rs_uuid]
-- It is technically possible to delete a replicaset at the
-- same time when route to the bucket is discovered.
if not replicaset then
return nil, lerror.vshard(lerror.code.NO_ROUTE_TO_BUCKET,
bucket_id)
end
- local old_replicaset = M.route_map[bucket_id]
+ local old_replicaset = router.route_map[bucket_id]
if old_replicaset ~= replicaset then
if old_replicaset then
old_replicaset.bucket_count = old_replicaset.bucket_count - 1
end
replicaset.bucket_count = replicaset.bucket_count + 1
end
- M.route_map[bucket_id] = replicaset
+ router.route_map[bucket_id] = replicaset
return replicaset
end
-- Remove a bucket from the cache.
-local function bucket_reset(bucket_id)
- local replicaset = M.route_map[bucket_id]
+local function bucket_reset(router, bucket_id)
+ local replicaset = router.route_map[bucket_id]
if replicaset then
replicaset.bucket_count = replicaset.bucket_count - 1
end
- M.route_map[bucket_id] = nil
+ router.route_map[bucket_id] = nil
+end
+
+--------------------------------------------------------------------------------
+-- Helpers
+--------------------------------------------------------------------------------
+
+--
+-- Increase/decrease the number of routers which require Lua
+-- garbage collection and update the state of the `lua_gc` fiber.
+--
+
+local function lua_gc_cnt_inc()
+ M.collect_lua_garbage_cnt = M.collect_lua_garbage_cnt + 1
+ if M.collect_lua_garbage_cnt == 1 then
+ lua_gc.set_state(true, consts.COLLECT_LUA_GARBAGE_INTERVAL)
+ end
+end
+
+local function lua_gc_cnt_dec()
+ M.collect_lua_garbage_cnt = M.collect_lua_garbage_cnt - 1
+ assert(M.collect_lua_garbage_cnt >= 0)
+ if M.collect_lua_garbage_cnt == 0 then
+ lua_gc.set_state(false, consts.COLLECT_LUA_GARBAGE_INTERVAL)
+ end
end
--------------------------------------------------------------------------------
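
The helpers above implement a plain reference count: the shared `lua_gc` fiber starts when the first interested router appears and stops when the last one is reconfigured away. A standalone sketch of the pattern (`start_gc_fiber`/`stop_gc_fiber` are illustrative stand-ins for `lua_gc.set_state`):

```lua
local gc_users = 0

local function gc_ref()
    gc_users = gc_users + 1
    if gc_users == 1 then
        start_gc_fiber() -- first interested router starts the fiber
    end
end

local function gc_unref()
    gc_users = gc_users - 1
    assert(gc_users >= 0)
    if gc_users == 0 then
        stop_gc_fiber() -- last interested router stops it
    end
end
```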
@@ -87,8 +128,8 @@ end
--------------------------------------------------------------------------------
-- Search bucket in whole cluster
-local function bucket_discovery(bucket_id)
- local replicaset = M.route_map[bucket_id]
+local function bucket_discovery(router, bucket_id)
+ local replicaset = router.route_map[bucket_id]
if replicaset ~= nil then
return replicaset
end
@@ -96,11 +137,11 @@ local function bucket_discovery(bucket_id)
log.verbose("Discovering bucket %d", bucket_id)
local last_err = nil
local unreachable_uuid = nil
- for uuid, replicaset in pairs(M.replicasets) do
+ for uuid, replicaset in pairs(router.replicasets) do
local _, err =
replicaset:callrw('vshard.storage.bucket_stat', {bucket_id})
if err == nil then
- return bucket_set(bucket_id, replicaset.uuid)
+ return bucket_set(router, bucket_id, replicaset.uuid)
elseif err.code ~= lerror.code.WRONG_BUCKET then
last_err = err
unreachable_uuid = uuid
@@ -129,14 +170,14 @@ local function bucket_discovery(bucket_id)
end
-- Resolve bucket id to replicaset uuid
-local function bucket_resolve(bucket_id)
+local function bucket_resolve(router, bucket_id)
local replicaset, err
- local replicaset = M.route_map[bucket_id]
+ local replicaset = router.route_map[bucket_id]
if replicaset ~= nil then
return replicaset
end
-- Replicaset removed from cluster, perform discovery
- replicaset, err = bucket_discovery(bucket_id)
+ replicaset, err = bucket_discovery(router, bucket_id)
if replicaset == nil then
return nil, err
end
@@ -147,14 +188,14 @@ end
-- Background fiber to perform discovery. It periodically scans
-- replicasets one by one and updates route_map.
--
-local function discovery_f()
+local function discovery_f(router)
local module_version = M.module_version
while module_version == M.module_version do
- while not next(M.replicasets) do
+ while not next(router.replicasets) do
lfiber.sleep(consts.DISCOVERY_INTERVAL)
end
- local old_replicasets = M.replicasets
- for rs_uuid, replicaset in pairs(M.replicasets) do
+ local old_replicasets = router.replicasets
+ for rs_uuid, replicaset in pairs(router.replicasets) do
local active_buckets, err =
replicaset:callro('vshard.storage.buckets_discovery', {},
{timeout = 2})
@@ -164,7 +205,7 @@ local function discovery_f()
end
-- Renew replicasets object captured by the for loop
-- in case of reconfigure and reload events.
- if M.replicasets ~= old_replicasets then
+ if router.replicasets ~= old_replicasets then
break
end
if not active_buckets then
@@ -177,11 +218,11 @@ local function discovery_f()
end
replicaset.bucket_count = #active_buckets
for _, bucket_id in pairs(active_buckets) do
- local old_rs = M.route_map[bucket_id]
+ local old_rs = router.route_map[bucket_id]
if old_rs and old_rs ~= replicaset then
old_rs.bucket_count = old_rs.bucket_count - 1
end
- M.route_map[bucket_id] = replicaset
+ router.route_map[bucket_id] = replicaset
end
end
lfiber.sleep(consts.DISCOVERY_INTERVAL)
@@ -192,9 +233,9 @@ end
--
-- Immediately wakeup discovery fiber if exists.
--
-local function discovery_wakeup()
- if M.discovery_fiber then
- M.discovery_fiber:wakeup()
+local function discovery_wakeup(router)
+ if router.discovery_fiber then
+ router.discovery_fiber:wakeup()
end
end
@@ -206,7 +247,7 @@ end
-- Function will restart operation after wrong bucket response until timeout
-- is reached
--
-local function router_call(bucket_id, mode, func, args, opts)
+local function router_call(router, bucket_id, mode, func, args, opts)
if opts and (type(opts) ~= 'table' or
(opts.timeout and type(opts.timeout) ~= 'number')) then
error('Usage: call(bucket_id, mode, func, args, opts)')
@@ -214,7 +255,7 @@ local function router_call(bucket_id, mode, func, args, opts)
local timeout = opts and opts.timeout or consts.CALL_TIMEOUT_MIN
local replicaset, err
local tend = lfiber.time() + timeout
- if bucket_id > M.total_bucket_count or bucket_id <= 0 then
+ if bucket_id > router.total_bucket_count or bucket_id <= 0 then
error('Bucket is unreachable: bucket id is out of range')
end
local call
@@ -224,7 +265,7 @@ local function router_call(bucket_id, mode, func, args, opts)
call = 'callrw'
end
repeat
- replicaset, err = bucket_resolve(bucket_id)
+ replicaset, err = bucket_resolve(router, bucket_id)
if replicaset then
::replicaset_is_found::
local storage_call_status, call_status, call_error =
@@ -240,9 +281,9 @@ local function router_call(bucket_id, mode, func, args, opts)
end
err = call_status
if err.code == lerror.code.WRONG_BUCKET then
- bucket_reset(bucket_id)
+ bucket_reset(router, bucket_id)
if err.destination then
- replicaset = M.replicasets[err.destination]
+ replicaset = router.replicasets[err.destination]
if not replicaset then
                    log.warn('Replicaset "%s" was not found, but received'..
                             ' from storage as destination - please '..
@@ -254,13 +295,14 @@ local function router_call(bucket_id, mode, func,
args, opts)
-- but already is executed on storages.
while lfiber.time() <= tend do
lfiber.sleep(0.05)
- replicaset = M.replicasets[err.destination]
+                        replicaset = router.replicasets[err.destination]
if replicaset then
goto replicaset_is_found
end
end
else
- replicaset = bucket_set(bucket_id, replicaset.uuid)
+ replicaset = bucket_set(router, bucket_id,
+ replicaset.uuid)
lfiber.yield()
-- Protect against infinite cycle in a
-- case of broken cluster, when a bucket
@@ -277,7 +319,7 @@ local function router_call(bucket_id, mode, func, args, opts)
-- is not timeout - these requests are repeated in
-- any case on client, if error.
assert(mode == 'write')
- bucket_reset(bucket_id)
+ bucket_reset(router, bucket_id)
return nil, err
elseif err.code == lerror.code.NON_MASTER then
-- Same, as above - do not wait and repeat.
@@ -303,12 +345,12 @@ end
--
-- Wrappers for router_call with preset mode.
--
-local function router_callro(bucket_id, ...)
- return router_call(bucket_id, 'read', ...)
+local function router_callro(router, bucket_id, ...)
+ return router_call(router, bucket_id, 'read', ...)
end
-local function router_callrw(bucket_id, ...)
- return router_call(bucket_id, 'write', ...)
+local function router_callrw(router, bucket_id, ...)
+ return router_call(router, bucket_id, 'write', ...)
end
--
@@ -316,27 +358,27 @@ end
-- @param bucket_id Bucket identifier.
-- @retval Netbox connection.
--
-local function router_route(bucket_id)
+local function router_route(router, bucket_id)
if type(bucket_id) ~= 'number' then
error('Usage: router.route(bucket_id)')
end
- return bucket_resolve(bucket_id)
+ return bucket_resolve(router, bucket_id)
end
--
-- Return map of all replicasets.
-- @retval See self.replicasets map.
--
-local function router_routeall()
- return M.replicasets
+local function router_routeall(router)
+ return router.replicasets
end
--------------------------------------------------------------------------------
-- Failover
--------------------------------------------------------------------------------
-local function failover_ping_round()
- for _, replicaset in pairs(M.replicasets) do
+local function failover_ping_round(router)
+ for _, replicaset in pairs(router.replicasets) do
local replica = replicaset.replica
if replica ~= nil and replica.conn ~= nil and
replica.down_ts == nil then
@@ -379,10 +421,10 @@ end
-- Collect UUIDs of replicasets, priority of whose replica
-- connections must be updated.
--
-local function failover_collect_to_update()
+local function failover_collect_to_update(router)
local ts = lfiber.time()
local uuid_to_update = {}
- for uuid, rs in pairs(M.replicasets) do
+ for uuid, rs in pairs(router.replicasets) do
if failover_need_down_priority(rs, ts) or
failover_need_up_priority(rs, ts) then
table.insert(uuid_to_update, uuid)
@@ -397,16 +439,16 @@ end
-- disconnected replicas.
-- @retval true A replica of an replicaset has been changed.
--
-local function failover_step()
- failover_ping_round()
- local uuid_to_update = failover_collect_to_update()
+local function failover_step(router)
+ failover_ping_round(router)
+ local uuid_to_update = failover_collect_to_update(router)
if #uuid_to_update == 0 then
return false
end
local curr_ts = lfiber.time()
local replica_is_changed = false
for _, uuid in pairs(uuid_to_update) do
- local rs = M.replicasets[uuid]
+ local rs = router.replicasets[uuid]
if M.errinj.ERRINJ_FAILOVER_CHANGE_CFG then
rs = nil
M.errinj.ERRINJ_FAILOVER_CHANGE_CFG = false
@@ -448,7 +490,7 @@ end
-- tries to reconnect to the best replica. When the connection is
-- established, it replaces the original replica.
--
-local function failover_f()
+local function failover_f(router)
local module_version = M.module_version
local min_timeout = math.min(consts.FAILOVER_UP_TIMEOUT,
consts.FAILOVER_DOWN_TIMEOUT)
@@ -458,7 +500,7 @@ local function failover_f()
local prev_was_ok = false
while module_version == M.module_version do
::continue::
- local ok, replica_is_changed = pcall(failover_step)
+ local ok, replica_is_changed = pcall(failover_step, router)
if not ok then
log.error('Error during failovering: %s',
lerror.make(replica_is_changed))
@@ -485,8 +527,8 @@ end
-- Configuration
--------------------------------------------------------------------------------
-local function router_cfg(cfg, is_reload)
- cfg = lcfg.check(cfg, M.current_cfg)
+local function router_cfg(router, cfg, is_reload)
+ cfg = lcfg.check(cfg, router.current_cfg)
local vshard_cfg, box_cfg = lcfg.split(cfg)
if not M.replicasets then
log.info('Starting router configuration')
@@ -511,45 +553,49 @@ local function router_cfg(cfg, is_reload)
-- Move connections from an old configuration to a new one.
-- It must be done with no yields to prevent usage both of not
-- fully moved old replicasets, and not fully built new ones.
- lreplicaset.rebind_replicasets(new_replicasets, M.replicasets)
+ lreplicaset.rebind_replicasets(new_replicasets, router.replicasets)
-- Now the new replicasets are fully built. Can establish
-- connections and yield.
for _, replicaset in pairs(new_replicasets) do
replicaset:connect_all()
end
+ -- Change state of lua GC.
+    if vshard_cfg.collect_lua_garbage and not router.collect_lua_garbage then
+ lua_gc_cnt_inc()
+ elseif not vshard_cfg.collect_lua_garbage and
+ router.collect_lua_garbage then
+ lua_gc_cnt_dec()
+ end
lreplicaset.wait_masters_connect(new_replicasets)
- lreplicaset.outdate_replicasets(M.replicasets,
+ lreplicaset.outdate_replicasets(router.replicasets,
vshard_cfg.connection_outdate_delay)
- M.connection_outdate_delay = vshard_cfg.connection_outdate_delay
- M.total_bucket_count = vshard_cfg.bucket_count
- M.collect_lua_garbage = vshard_cfg.collect_lua_garbage
- M.current_cfg = cfg
- M.replicasets = new_replicasets
- local old_route_map = M.route_map
- M.route_map = table_new(M.total_bucket_count, 0)
+ router.connection_outdate_delay = vshard_cfg.connection_outdate_delay
+ router.total_bucket_count = vshard_cfg.bucket_count
+ router.collect_lua_garbage = vshard_cfg.collect_lua_garbage
+ router.current_cfg = cfg
+ router.replicasets = new_replicasets
+ local old_route_map = router.route_map
+ router.route_map = table_new(router.total_bucket_count, 0)
for bucket, rs in pairs(old_route_map) do
- M.route_map[bucket] = M.replicasets[rs.uuid]
+ router.route_map[bucket] = router.replicasets[rs.uuid]
end
- if M.failover_fiber == nil then
-        M.failover_fiber = util.reloadable_fiber_create('vshard.failover', M,
- 'failover_f')
+ if router.failover_fiber == nil then
+ router.failover_fiber = util.reloadable_fiber_create(
+ 'vshard.failover.' .. router.name, M, 'failover_f', router)
end
- if M.discovery_fiber == nil then
-        M.discovery_fiber = util.reloadable_fiber_create('vshard.discovery', M,
- 'discovery_f')
+ if router.discovery_fiber == nil then
+ router.discovery_fiber = util.reloadable_fiber_create(
+ 'vshard.discovery.' .. router.name, M, 'discovery_f', router)
end
-    lua_gc.set_state(M.collect_lua_garbage, consts.COLLECT_LUA_GARBAGE_INTERVAL)
- -- Destroy connections, not used in a new configuration.
- collectgarbage()
end
--------------------------------------------------------------------------------
-- Bootstrap
--------------------------------------------------------------------------------
-local function cluster_bootstrap()
+local function cluster_bootstrap(router)
local replicasets = {}
- for uuid, replicaset in pairs(M.replicasets) do
+ for uuid, replicaset in pairs(router.replicasets) do
table.insert(replicasets, replicaset)
        local count, err = replicaset:callrw('vshard.storage.buckets_count',
{})
@@ -560,9 +606,10 @@ local function cluster_bootstrap()
return nil, lerror.vshard(lerror.code.NON_EMPTY)
end
end
-    lreplicaset.calculate_etalon_balance(M.replicasets, M.total_bucket_count)
+ lreplicaset.calculate_etalon_balance(router.replicasets,
+ router.total_bucket_count)
local bucket_id = 1
- for uuid, replicaset in pairs(M.replicasets) do
+ for uuid, replicaset in pairs(router.replicasets) do
if replicaset.etalon_bucket_count > 0 then
local ok, err =
replicaset:callrw('vshard.storage.bucket_force_create',
@@ -618,7 +665,7 @@ local function replicaset_instance_info(replicaset, name, alerts, errcolor,
return info, consts.STATUS.GREEN
end
-local function router_info()
+local function router_info(router)
local state = {
replicasets = {},
bucket = {
@@ -632,7 +679,7 @@ local function router_info()
}
local bucket_info = state.bucket
local known_bucket_count = 0
- for rs_uuid, replicaset in pairs(M.replicasets) do
+ for rs_uuid, replicaset in pairs(router.replicasets) do
-- Replicaset info parameters:
-- * master instance info;
-- * replica instance info;
@@ -720,7 +767,7 @@ local function router_info()
-- If a bucket is unreachable, then replicaset is
-- unreachable too and color already is red.
end
- bucket_info.unknown = M.total_bucket_count - known_bucket_count
+ bucket_info.unknown = router.total_bucket_count - known_bucket_count
if bucket_info.unknown > 0 then
state.status = math.max(state.status, consts.STATUS.YELLOW)
        table.insert(state.alerts, lerror.alert(lerror.code.UNKNOWN_BUCKETS,
@@ -737,13 +784,13 @@ end
-- @param limit Maximal bucket count in output.
-- @retval Map of type {bucket_id = 'unknown'/replicaset_uuid}.
--
-local function router_buckets_info(offset, limit)
+local function router_buckets_info(router, offset, limit)
if offset ~= nil and type(offset) ~= 'number' or
limit ~= nil and type(limit) ~= 'number' then
error('Usage: buckets_info(offset, limit)')
end
offset = offset or 0
- limit = limit or M.total_bucket_count
+ limit = limit or router.total_bucket_count
local ret = {}
-- Use one string memory for all unknown buckets.
local available_rw = 'available_rw'
@@ -752,9 +799,9 @@ local function router_buckets_info(offset, limit)
local unreachable = 'unreachable'
-- Collect limit.
local first = math.max(1, offset + 1)
- local last = math.min(offset + limit, M.total_bucket_count)
+ local last = math.min(offset + limit, router.total_bucket_count)
for bucket_id = first, last do
- local rs = M.route_map[bucket_id]
+ local rs = router.route_map[bucket_id]
if rs then
if rs.master and rs.master:is_connected() then
ret[bucket_id] = {uuid = rs.uuid, status = available_rw}
@@ -774,22 +821,22 @@ end
-- Other
--------------------------------------------------------------------------------
-local function router_bucket_id(key)
+local function router_bucket_id(router, key)
if key == nil then
error("Usage: vshard.router.bucket_id(key)")
end
- return lhash.key_hash(key) % M.total_bucket_count + 1
+ return lhash.key_hash(key) % router.total_bucket_count + 1
end
-local function router_bucket_count()
- return M.total_bucket_count
+local function router_bucket_count(router)
+ return router.total_bucket_count
end
-local function router_sync(timeout)
+local function router_sync(router, timeout)
if timeout ~= nil and type(timeout) ~= 'number' then
error('Usage: vshard.router.sync([timeout: number])')
end
- for rs_uuid, replicaset in pairs(M.replicasets) do
+ for rs_uuid, replicaset in pairs(router.replicasets) do
local status, err = replicaset:callrw('vshard.storage.sync',
{timeout})
if not status then
-- Add information about replicaset
@@ -803,6 +850,94 @@ if M.errinj.ERRINJ_RELOAD then
error('Error injection: reload')
end
+--------------------------------------------------------------------------------
+-- Managing router instances
+--------------------------------------------------------------------------------
+
+local function cfg_reconfigure(router, cfg)
+ return router_cfg(router, cfg, false)
+end
+
+local router_mt = {
+ __index = {
+ cfg = cfg_reconfigure;
+ info = router_info;
+ buckets_info = router_buckets_info;
+ call = router_call;
+ callro = router_callro;
+ callrw = router_callrw;
+ route = router_route;
+ routeall = router_routeall;
+ bucket_id = router_bucket_id;
+ bucket_count = router_bucket_count;
+ sync = router_sync;
+ bootstrap = cluster_bootstrap;
+ bucket_discovery = bucket_discovery;
+ discovery_wakeup = discovery_wakeup;
+ }
+}
+
+-- Table which represents this module.
+local module = {}
+
+-- This metatable forwards module-level calls to the static_router.
+local module_mt = {__index = {}}
+for method_name, method in pairs(router_mt.__index) do
+ module_mt.__index[method_name] = function(...)
+ return method(M.static_router, ...)
+ end
+end
+
+local function export_static_router_attributes()
+ setmetatable(module, module_mt)
+end
+
+--
+-- Create a new instance of router.
+-- @param name Name of a new router.
+-- @param cfg Configuration for `router_cfg`.
+-- @retval Router instance.
+-- @retval Nil and error object.
+--
+local function router_new(name, cfg)
+ if type(name) ~= 'string' or type(cfg) ~= 'table' then
+ error('Wrong argument type. Usage: vshard.router.new(name, cfg).')
+ end
+ if M.routers[name] then
+ return nil, lerror.vshard(lerror.code.ROUTER_ALREADY_EXISTS, name)
+ end
+ local router = table.deepcopy(ROUTER_TEMPLATE)
+ setmetatable(router, router_mt)
+ router.name = name
+ M.routers[name] = router
+ local ok, err = pcall(router_cfg, router, cfg)
+ if not ok then
+ M.routers[name] = nil
+ error(err)
+ end
+ return router
+end
+
+--
+-- Wrapper around the `router_new` API, which allows using the old
+-- static `vshard.router.cfg()` API.
+--
+local function legacy_cfg(cfg)
+ if M.static_router then
+ -- Reconfigure.
+ router_cfg(M.static_router, cfg, false)
+ else
+ -- Create new static instance.
+ local router, err = router_new(STATIC_ROUTER_NAME, cfg)
+ if router then
+ M.static_router = router
+ export_static_router_attributes()
+ else
+ return nil, err
+ end
+ end
+end
+
--------------------------------------------------------------------------------
-- Module definition
--------------------------------------------------------------------------------
@@ -813,28 +948,23 @@ end
if not rawget(_G, MODULE_INTERNALS) then
rawset(_G, MODULE_INTERNALS, M)
else
- router_cfg(M.current_cfg, true)
+ for _, router in pairs(M.routers) do
+ router_cfg(router, router.current_cfg, true)
+ setmetatable(router, router_mt)
+ end
+ if M.static_router then
+ export_static_router_attributes()
+ end
M.module_version = M.module_version + 1
end
M.discovery_f = discovery_f
M.failover_f = failover_f
+M.router_mt = router_mt
-return {
- cfg = function(cfg) return router_cfg(cfg, false) end;
- info = router_info;
- buckets_info = router_buckets_info;
- call = router_call;
- callro = router_callro;
- callrw = router_callrw;
- route = router_route;
- routeall = router_routeall;
- bucket_id = router_bucket_id;
- bucket_count = router_bucket_count;
- sync = router_sync;
- bootstrap = cluster_bootstrap;
- bucket_discovery = bucket_discovery;
- discovery_wakeup = discovery_wakeup;
- internal = M;
- module_version = function() return M.module_version end;
-}
+module.cfg = legacy_cfg
+module.new = router_new
+module.internal = M
+module.module_version = function() return M.module_version end
+
+return module
diff --git a/vshard/util.lua b/vshard/util.lua
index 37abe2b..3afaa61 100644
--- a/vshard/util.lua
+++ b/vshard/util.lua
@@ -38,11 +38,11 @@ end
-- reload of that module.
-- See description of parameters in `reloadable_fiber_create`.
--
-local function reloadable_fiber_main_loop(module, func_name)
+local function reloadable_fiber_main_loop(module, func_name, data)
log.info('%s has been started', func_name)
local func = module[func_name]
::restart_loop::
- local ok, err = pcall(func)
+ local ok, err = pcall(func, data)
-- yield serves two purposes:
-- * makes this fiber cancellable
-- * prevents 100% cpu consumption
@@ -60,7 +60,7 @@ local function reloadable_fiber_main_loop(module, func_name)
log.info('module is reloaded, restarting')
-- luajit drops this frame if next function is called in
-- return statement.
- return M.reloadable_fiber_main_loop(module, func_name)
+ return M.reloadable_fiber_main_loop(module, func_name, data)
end
--
@@ -74,11 +74,13 @@ end
-- @param module Module which can be reloaded.
-- @param func_name Name of a function to be executed in the
-- module.
+-- @param data Data to be passed to the specified function.
-- @retval New fiber.
--
-local function reloadable_fiber_create(fiber_name, module, func_name)
+local function reloadable_fiber_create(fiber_name, module, func_name, data)
assert(type(fiber_name) == 'string')
- local xfiber = fiber.create(reloadable_fiber_main_loop, module, func_name)
+ local xfiber = fiber.create(reloadable_fiber_main_loop, module, func_name,
+ data)
xfiber:name(fiber_name)
return xfiber
end
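For context, the feature this patchset introduces can be exercised roughly as sketched below. The configuration tables (`cfg_1`, `cfg_2`), the key, and the called function name are hypothetical placeholders, not taken from the patch (a real config describes the sharding topology, cf. test/multiple_routers/configs.lua), and the snippet assumes a running tarantool cluster:

```lua
-- Hypothetical usage sketch of the multiple-routers API.
local vshard = require('vshard')

-- The old static API keeps working; it transparently operates on the
-- static router instance via the module metatable.
vshard.router.cfg(cfg_1)
local bucket_count = vshard.router.bucket_count()

-- New API: named router instances, possibly connected to different
-- (or the same) clusters.
local router_2 = vshard.router.new('router_2', cfg_2)
local bucket = router_2:bucket_id('some_key')
router_2:callrw(bucket, 'some_storage_func', {})

-- Creating a second router with an already used name fails with a
-- ROUTER_ALREADY_EXISTS error object instead of a new instance.
local dup, err = vshard.router.new('router_2', cfg_2)
assert(dup == nil and err ~= nil)
```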
* [tarantool-patches] Re: [PATCH 3/3] Introduce multiple routers feature
2018-08-08 14:04 ` Alex Khatskevich
@ 2018-08-08 15:37 ` Vladislav Shpilevoy
0 siblings, 0 replies; 23+ messages in thread
From: Vladislav Shpilevoy @ 2018-08-08 15:37 UTC (permalink / raw)
To: Alex Khatskevich, tarantool-patches
Thanks for the fixes! Pushed into the master.
Thread overview: 23+ messages
2018-07-31 16:25 [tarantool-patches] [PATCH 0/3] multiple routers AKhatskevich
2018-07-31 16:25 ` [tarantool-patches] [PATCH 1/3] Update only vshard part of a cfg on reload AKhatskevich
2018-08-01 18:43 ` [tarantool-patches] " Vladislav Shpilevoy
2018-08-03 20:03 ` Alex Khatskevich
2018-08-06 17:03 ` Vladislav Shpilevoy
2018-08-07 13:19 ` Alex Khatskevich
2018-08-08 11:17 ` Vladislav Shpilevoy
2018-07-31 16:25 ` [tarantool-patches] [PATCH 2/3] Move lua gc to a dedicated module AKhatskevich
2018-08-01 18:43 ` [tarantool-patches] " Vladislav Shpilevoy
2018-08-03 20:04 ` Alex Khatskevich
2018-08-06 17:03 ` Vladislav Shpilevoy
2018-08-08 11:17 ` Vladislav Shpilevoy
2018-07-31 16:25 ` [tarantool-patches] [PATCH 3/3] Introduce multiple routers feature AKhatskevich
2018-08-01 18:43 ` [tarantool-patches] " Vladislav Shpilevoy
2018-08-03 20:05 ` Alex Khatskevich
2018-08-06 17:03 ` Vladislav Shpilevoy
2018-08-07 13:18 ` Alex Khatskevich
2018-08-08 12:28 ` Vladislav Shpilevoy
2018-08-08 14:04 ` Alex Khatskevich
2018-08-08 15:37 ` Vladislav Shpilevoy
2018-08-01 14:30 ` [tarantool-patches] [PATCH] Check self arg passed for router objects AKhatskevich
2018-08-03 20:07 ` [tarantool-patches] [PATCH] Refactor config templates AKhatskevich
2018-08-06 15:49 ` [tarantool-patches] " Vladislav Shpilevoy