* [tarantool-patches] [PATCH 0/3] multiple routers
@ 2018-07-31 16:25 AKhatskevich
  2018-07-31 16:25 ` [tarantool-patches] [PATCH 1/3] Update only vshard part of a cfg on reload AKhatskevich
                   ` (4 more replies)
  0 siblings, 5 replies; 23+ messages in thread
From: AKhatskevich @ 2018-07-31 16:25 UTC (permalink / raw)
To: v.shpilevoy, tarantool-patches

Issue: https://github.com/tarantool/vshard/issues/130
Extra issue: https://github.com/tarantool/vshard/issues/138
Branch: https://github.com/tarantool/vshard/tree/kh/gh-130-multiple-routers

This patchset introduces the multiple routers feature. A user can create
multiple router instances, each connected to a different (or the same)
cluster.

AKhatskevich (3):
  Update only vshard part of a cfg on reload
  Move lua gc to a dedicated module
  Introduce multiple routers feature

 test/multiple_routers/configs.lua               |  81 ++++++
 test/multiple_routers/multiple_routers.result   | 226 +++++++++++++++
 test/multiple_routers/multiple_routers.test.lua |  85 ++++++
 test/multiple_routers/router_1.lua              |  15 +
 test/multiple_routers/storage_1_1_a.lua         |  23 ++
 test/multiple_routers/storage_1_1_b.lua         |   1 +
 test/multiple_routers/storage_1_2_a.lua         |   1 +
 test/multiple_routers/storage_1_2_b.lua         |   1 +
 test/multiple_routers/storage_2_1_a.lua         |   1 +
 test/multiple_routers/storage_2_1_b.lua         |   1 +
 test/multiple_routers/storage_2_2_a.lua         |   1 +
 test/multiple_routers/storage_2_2_b.lua         |   1 +
 test/multiple_routers/suite.ini                 |   6 +
 test/multiple_routers/test.lua                  |   9 +
 test/router/garbage_collector.result            |  27 +-
 test/router/garbage_collector.test.lua          |  18 +-
 test/router/router.result                       |   4 +-
 test/router/router.test.lua                     |   4 +-
 test/storage/garbage_collector.result           |  27 +-
 test/storage/garbage_collector.test.lua         |  22 +-
 vshard/cfg.lua                                  |  54 ++--
 vshard/lua_gc.lua                               |  54 ++++
 vshard/router/init.lua                          | 364 +++++++++++++++---------
 vshard/storage/init.lua                         |  71 ++---
 vshard/util.lua                                 |  12 +-
 25 files changed, 865 insertions(+), 244 deletions(-)
 create mode 100644 test/multiple_routers/configs.lua
 create mode 100644 test/multiple_routers/multiple_routers.result
 create mode 100644 test/multiple_routers/multiple_routers.test.lua
 create mode 100644 test/multiple_routers/router_1.lua
 create mode 100644 test/multiple_routers/storage_1_1_a.lua
 create mode 120000 test/multiple_routers/storage_1_1_b.lua
 create mode 120000 test/multiple_routers/storage_1_2_a.lua
 create mode 120000 test/multiple_routers/storage_1_2_b.lua
 create mode 120000 test/multiple_routers/storage_2_1_a.lua
 create mode 120000 test/multiple_routers/storage_2_1_b.lua
 create mode 120000 test/multiple_routers/storage_2_2_a.lua
 create mode 120000 test/multiple_routers/storage_2_2_b.lua
 create mode 100644 test/multiple_routers/suite.ini
 create mode 100644 test/multiple_routers/test.lua
 create mode 100644 vshard/lua_gc.lua

-- 
2.14.1

^ permalink raw reply	[flat|nested] 23+ messages in thread
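[Editorial note: the cover letter only names the feature. As a rough illustration of what patch 3 enables, a per-cluster router instance might be created as below. This is a sketch: the `vshard.router.new(name, cfg)` constructor and every variable name here are assumptions for illustration, not taken verbatim from this series, and the code requires a Tarantool runtime with vshard installed.]

```lua
-- Sketch: assumes patch 3 exports a `vshard.router.new()` constructor.
local vshard = require('vshard')

-- The legacy static router keeps working as before.
vshard.router.cfg(cfg_cluster_1)

-- Additional router instances, possibly pointing at other clusters.
local router_1 = vshard.router.new('router_1', cfg_cluster_1)
local router_2 = vshard.router.new('router_2', cfg_cluster_2)

-- Each instance routes requests independently of the others.
router_2:call(bucket_id, 'read', 'echo', {1})
```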
* [tarantool-patches] [PATCH 1/3] Update only vshard part of a cfg on reload 2018-07-31 16:25 [tarantool-patches] [PATCH 0/3] multiple routers AKhatskevich @ 2018-07-31 16:25 ` AKhatskevich 2018-08-01 18:43 ` [tarantool-patches] " Vladislav Shpilevoy 2018-07-31 16:25 ` [tarantool-patches] [PATCH 2/3] Move lua gc to a dedicated module AKhatskevich ` (3 subsequent siblings) 4 siblings, 1 reply; 23+ messages in thread From: AKhatskevich @ 2018-07-31 16:25 UTC (permalink / raw) To: v.shpilevoy, tarantool-patches Box cfg could have been changed by a user and then overridden by an old vshard config on reload. Since that commit, box part of a config is applied only when it is explicitly passed to a `cfg` method. This change is important for the multiple routers feature. --- vshard/cfg.lua | 54 +++++++++++++++++++++++++------------------------ vshard/router/init.lua | 18 ++++++++--------- vshard/storage/init.lua | 53 ++++++++++++++++++++++++++++-------------------- 3 files changed, 67 insertions(+), 58 deletions(-) diff --git a/vshard/cfg.lua b/vshard/cfg.lua index bba12cc..8282086 100644 --- a/vshard/cfg.lua +++ b/vshard/cfg.lua @@ -230,48 +230,50 @@ local non_dynamic_options = { 'bucket_count', 'shard_index' } +-- +-- Deepcopy a config and split it into vshard_cfg and box_cfg. +-- +local function split_cfg(cfg) + local vshard_field_map = {} + for _, field in ipairs(cfg_template) do + vshard_field_map[field[1]] = true + end + local vshard_cfg = {} + local box_cfg = {} + for k, v in pairs(cfg) do + if vshard_field_map[k] then + vshard_cfg[k] = table.deepcopy(v) + else + box_cfg[k] = table.deepcopy(v) + end + end + return vshard_cfg, box_cfg +end + -- -- Check sharding config on correctness. Check types, name and uri -- uniqueness, master count (in each replicaset must be <= 1). 
-- -local function cfg_check(shard_cfg, old_cfg) - if type(shard_cfg) ~= 'table' then +local function cfg_check(cfg, old_vshard_cfg) + if type(cfg) ~= 'table' then error('Сonfig must be map of options') end - shard_cfg = table.deepcopy(shard_cfg) - validate_config(shard_cfg, cfg_template) - if not old_cfg then - return shard_cfg + local vshard_cfg, box_cfg = split_cfg(cfg) + validate_config(vshard_cfg, cfg_template) + if not old_vshard_cfg then + return vshard_cfg, box_cfg end -- Check non-dynamic after default values are added. for _, f_name in pairs(non_dynamic_options) do -- New option may be added in new vshard version. - if shard_cfg[f_name] ~= old_cfg[f_name] then + if vshard_cfg[f_name] ~= old_vshard_cfg[f_name] then error(string.format('Non-dynamic option %s ' .. 'cannot be reconfigured', f_name)) end end - return shard_cfg -end - --- --- Nullify non-box options. --- -local function remove_non_box_options(cfg) - cfg.sharding = nil - cfg.weights = nil - cfg.zone = nil - cfg.bucket_count = nil - cfg.rebalancer_disbalance_threshold = nil - cfg.rebalancer_max_receiving = nil - cfg.shard_index = nil - cfg.collect_bucket_garbage_interval = nil - cfg.collect_lua_garbage = nil - cfg.sync_timeout = nil - cfg.connection_outdate_delay = nil + return vshard_cfg, box_cfg end return { check = cfg_check, - remove_non_box_options = remove_non_box_options, } diff --git a/vshard/router/init.lua b/vshard/router/init.lua index 4cb19fd..e2b2b22 100644 --- a/vshard/router/init.lua +++ b/vshard/router/init.lua @@ -496,18 +496,15 @@ end -------------------------------------------------------------------------------- local function router_cfg(cfg) - cfg = lcfg.check(cfg, M.current_cfg) - local new_cfg = table.copy(cfg) + local vshard_cfg, box_cfg = lcfg.check(cfg, M.current_cfg) if not M.replicasets then log.info('Starting router configuration') else log.info('Starting router reconfiguration') end - local new_replicasets = lreplicaset.buildall(cfg) - local total_bucket_count = 
cfg.bucket_count - local collect_lua_garbage = cfg.collect_lua_garbage - local box_cfg = table.copy(cfg) - lcfg.remove_non_box_options(box_cfg) + local new_replicasets = lreplicaset.buildall(vshard_cfg) + local total_bucket_count = vshard_cfg.bucket_count + local collect_lua_garbage = vshard_cfg.collect_lua_garbage log.info("Calling box.cfg()...") for k, v in pairs(box_cfg) do log.info({[k] = v}) @@ -530,11 +527,12 @@ local function router_cfg(cfg) replicaset:connect_all() end lreplicaset.wait_masters_connect(new_replicasets) - lreplicaset.outdate_replicasets(M.replicasets, cfg.connection_outdate_delay) - M.connection_outdate_delay = cfg.connection_outdate_delay + lreplicaset.outdate_replicasets(M.replicasets, + vshard_cfg.connection_outdate_delay) + M.connection_outdate_delay = vshard_cfg.connection_outdate_delay M.total_bucket_count = total_bucket_count M.collect_lua_garbage = collect_lua_garbage - M.current_cfg = cfg + M.current_cfg = vshard_cfg M.replicasets = new_replicasets -- Update existing route map in-place. local old_route_map = M.route_map diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua index 102b942..75f5df9 100644 --- a/vshard/storage/init.lua +++ b/vshard/storage/init.lua @@ -1500,13 +1500,17 @@ end -------------------------------------------------------------------------------- -- Configuration -------------------------------------------------------------------------------- +-- Private (not accessible by a user) reload indicator. +local is_reload = false local function storage_cfg(cfg, this_replica_uuid) + -- Reset is_reload indicator in case of errors. 
+ local xis_reload = is_reload + is_reload = false if this_replica_uuid == nil then error('Usage: cfg(configuration, this_replica_uuid)') end - cfg = lcfg.check(cfg, M.current_cfg) - local new_cfg = table.copy(cfg) - if cfg.weights or cfg.zone then + local vshard_cfg, box_cfg = lcfg.check(cfg, M.current_cfg) + if vshard_cfg.weights or vshard_cfg.zone then error('Weights and zone are not allowed for storage configuration') end if M.replicasets then @@ -1520,7 +1524,7 @@ local function storage_cfg(cfg, this_replica_uuid) local this_replicaset local this_replica - local new_replicasets = lreplicaset.buildall(cfg) + local new_replicasets = lreplicaset.buildall(vshard_cfg) local min_master for rs_uuid, rs in pairs(new_replicasets) do for replica_uuid, replica in pairs(rs.replicas) do @@ -1553,18 +1557,19 @@ local function storage_cfg(cfg, this_replica_uuid) -- -- If a master role of the replica is not changed, then -- 'read_only' can be set right here. - cfg.listen = cfg.listen or this_replica.uri - if cfg.replication == nil and this_replicaset.master and not is_master then - cfg.replication = {this_replicaset.master.uri} + box_cfg.listen = box_cfg.listen or this_replica.uri + if box_cfg.replication == nil and this_replicaset.master + and not is_master then + box_cfg.replication = {this_replicaset.master.uri} else - cfg.replication = {} + box_cfg.replication = {} end if was_master == is_master then - cfg.read_only = not is_master + box_cfg.read_only = not is_master end if type(box.cfg) == 'function' then - cfg.instance_uuid = this_replica.uuid - cfg.replicaset_uuid = this_replicaset.uuid + box_cfg.instance_uuid = this_replica.uuid + box_cfg.replicaset_uuid = this_replicaset.uuid else local info = box.info if this_replica_uuid ~= info.uuid then @@ -1578,12 +1583,14 @@ local function storage_cfg(cfg, this_replica_uuid) this_replicaset.uuid)) end end - local total_bucket_count = cfg.bucket_count - local rebalancer_disbalance_threshold = cfg.rebalancer_disbalance_threshold 
- local rebalancer_max_receiving = cfg.rebalancer_max_receiving - local shard_index = cfg.shard_index - local collect_bucket_garbage_interval = cfg.collect_bucket_garbage_interval - local collect_lua_garbage = cfg.collect_lua_garbage + local total_bucket_count = vshard_cfg.bucket_count + local rebalancer_disbalance_threshold = + vshard_cfg.rebalancer_disbalance_threshold + local rebalancer_max_receiving = vshard_cfg.rebalancer_max_receiving + local shard_index = vshard_cfg.shard_index + local collect_bucket_garbage_interval = + vshard_cfg.collect_bucket_garbage_interval + local collect_lua_garbage = vshard_cfg.collect_lua_garbage -- It is considered that all possible errors during cfg -- process occur only before this place. @@ -1598,7 +1605,7 @@ local function storage_cfg(cfg, this_replica_uuid) -- a new sync timeout. -- local old_sync_timeout = M.sync_timeout - M.sync_timeout = cfg.sync_timeout + M.sync_timeout = vshard_cfg.sync_timeout if was_master and not is_master then local_on_master_disable_prepare() @@ -1607,9 +1614,10 @@ local function storage_cfg(cfg, this_replica_uuid) local_on_master_enable_prepare() end - local box_cfg = table.copy(cfg) - lcfg.remove_non_box_options(box_cfg) - local ok, err = pcall(box.cfg, box_cfg) + local ok, err = true, nil + if not xis_reload then + ok, err = pcall(box.cfg, box_cfg) + end while M.errinj.ERRINJ_CFG_DELAY do lfiber.sleep(0.01) end @@ -1639,7 +1647,7 @@ local function storage_cfg(cfg, this_replica_uuid) M.shard_index = shard_index M.collect_bucket_garbage_interval = collect_bucket_garbage_interval M.collect_lua_garbage = collect_lua_garbage - M.current_cfg = new_cfg + M.current_cfg = vshard_cfg if was_master and not is_master then local_on_master_disable() @@ -1874,6 +1882,7 @@ if not rawget(_G, MODULE_INTERNALS) then rawset(_G, MODULE_INTERNALS, M) else reload_evolution.upgrade(M) + is_reload = true storage_cfg(M.current_cfg, M.this_replica.uuid) M.module_version = M.module_version + 1 end -- 2.14.1 ^ permalink raw 
reply [flat|nested] 23+ messages in thread
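[Editorial note: stripped of validation, the config split this patch performs can be reproduced in plain Lua. This is an illustrative sketch, not the vshard code itself: `deepcopy` is inlined because `table.deepcopy` is Tarantool-specific, the template is keyed by option name (the shape the thread later converges on), and only two vshard options are listed for brevity.]

```lua
-- Minimal stand-in for Tarantool's table.deepcopy.
local function deepcopy(v)
    if type(v) ~= 'table' then return v end
    local copy = {}
    for k, val in pairs(v) do copy[k] = deepcopy(val) end
    return copy
end

-- Template keyed by option name; membership check is then O(1).
local cfg_template = {
    sharding = {type = 'table'},
    bucket_count = {type = 'positive integer'},
}

-- Split a mixed config: options named in the template go to
-- vshard_cfg, everything else is assumed to be a box.cfg option.
local function split_cfg(cfg)
    local vshard_cfg, box_cfg = {}, {}
    for k, v in pairs(cfg) do
        if cfg_template[k] then
            vshard_cfg[k] = deepcopy(v)
        else
            box_cfg[k] = deepcopy(v)
        end
    end
    return vshard_cfg, box_cfg
end

local vshard_cfg, box_cfg = split_cfg({
    sharding = {}, bucket_count = 3000, listen = 3301,
})
print(vshard_cfg.bucket_count, box_cfg.listen) -- 3000	3301
```

Because the split deep-copies both halves, later mutation of the caller's table cannot leak into the stored config, which is what makes it safe to apply the box part only when it is explicitly passed.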
* [tarantool-patches] Re: [PATCH 1/3] Update only vshard part of a cfg on reload 2018-07-31 16:25 ` [tarantool-patches] [PATCH 1/3] Update only vshard part of a cfg on reload AKhatskevich @ 2018-08-01 18:43 ` Vladislav Shpilevoy 2018-08-03 20:03 ` Alex Khatskevich 0 siblings, 1 reply; 23+ messages in thread From: Vladislav Shpilevoy @ 2018-08-01 18:43 UTC (permalink / raw) To: tarantool-patches, AKhatskevich Thanks for the patch! See 4 comments below. On 31/07/2018 19:25, AKhatskevich wrote: > Box cfg could have been changed by a user and then overridden by > an old vshard config on reload. > > Since that commit, box part of a config is applied only when > it is explicitly passed to a `cfg` method. > > This change is important for the multiple routers feature. > --- > vshard/cfg.lua | 54 +++++++++++++++++++++++++------------------------ > vshard/router/init.lua | 18 ++++++++--------- > vshard/storage/init.lua | 53 ++++++++++++++++++++++++++++-------------------- > 3 files changed, 67 insertions(+), 58 deletions(-) > > diff --git a/vshard/cfg.lua b/vshard/cfg.lua > index bba12cc..8282086 100644 > --- a/vshard/cfg.lua > +++ b/vshard/cfg.lua > @@ -230,48 +230,50 @@ local non_dynamic_options = { > 'bucket_count', 'shard_index' > } > > +-- > +-- Deepcopy a config and split it into vshard_cfg and box_cfg. > +-- > +local function split_cfg(cfg) > + local vshard_field_map = {} > + for _, field in ipairs(cfg_template) do > + vshard_field_map[field[1]] = true > + end 1. vshard_field_map does not change ever. Why do you build it on each cfg? Please, store it in a module local variable like cfg_template. Or refactor cfg_template and other templates so they would be maps with parameter name as a key - looks like the most suitable solution. 
> + local vshard_cfg = {} > + local box_cfg = {} > + for k, v in pairs(cfg) do > + if vshard_field_map[k] then > + vshard_cfg[k] = table.deepcopy(v) > + else > + box_cfg[k] = table.deepcopy(v) > + end > + end > + return vshard_cfg, box_cfg > +end > + > diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua > index 102b942..75f5df9 100644 > --- a/vshard/storage/init.lua > +++ b/vshard/storage/init.lua > @@ -1500,13 +1500,17 @@ end > -------------------------------------------------------------------------------- > -- Configuration > -------------------------------------------------------------------------------- > +-- Private (not accessible by a user) reload indicator. > +local is_reload = false 2. Please, make this variable be parameter of storage_cfg and wrap public storage.cfg with a one-liner: storage.cfg = function(cfg, uuid) return storage_cfg(cfg, uuid, false) end I believe/hope you understand that such way to pass parameters, via global variables, is flawed by design. > @@ -1553,18 +1557,19 @@ local function storage_cfg(cfg, this_replica_uuid) > -- > -- If a master role of the replica is not changed, then > -- 'read_only' can be set right here. > - cfg.listen = cfg.listen or this_replica.uri > - if cfg.replication == nil and this_replicaset.master and not is_master then > - cfg.replication = {this_replicaset.master.uri} > + box_cfg.listen = box_cfg.listen or this_replica.uri > + if box_cfg.replication == nil and this_replicaset.master > + and not is_master then 3. Broken indentation. 
> + box_cfg.replication = {this_replicaset.master.uri} > else > - cfg.replication = {} > + box_cfg.replication = {} > end > if was_master == is_master then > - cfg.read_only = not is_master > + box_cfg.read_only = not is_master > end > if type(box.cfg) == 'function' then > - cfg.instance_uuid = this_replica.uuid > - cfg.replicaset_uuid = this_replicaset.uuid > + box_cfg.instance_uuid = this_replica.uuid > + box_cfg.replicaset_uuid = this_replicaset.uuid > else > local info = box.info > if this_replica_uuid ~= info.uuid then > @@ -1607,9 +1614,10 @@ local function storage_cfg(cfg, this_replica_uuid) > local_on_master_enable_prepare() > end > > - local box_cfg = table.copy(cfg) > - lcfg.remove_non_box_options(box_cfg) > - local ok, err = pcall(box.cfg, box_cfg) > + local ok, err = true, nil > + if not xis_reload then > + ok, err = pcall(box.cfg, box_cfg) > + end 4. The code below (if not ok then ...) can be moved inside 'if not is_reload' together with 'local ok, err' declaration. Please, do. > while M.errinj.ERRINJ_CFG_DELAY do > lfiber.sleep(0.01) > end ^ permalink raw reply [flat|nested] 23+ messages in thread
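[Editorial note: the wrapper suggested in comment 2 is a general pattern, shown here in isolation as a generic Lua sketch rather than the actual vshard code; the return strings are placeholders standing in for the real configuration work.]

```lua
-- Private implementation: the reload flag is an explicit argument,
-- so every call site states its intent and no hidden module-local
-- state is left behind if the function throws midway.
local function storage_cfg(cfg, uuid, is_reload)
    if is_reload then
        return 'reconfigured without box.cfg'
    end
    return 'configured with box.cfg'
end

local storage = {}

-- Public entry point: a user can never request the reload path.
storage.cfg = function(cfg, uuid) return storage_cfg(cfg, uuid, false) end

-- Internal reload path, as used by the hot-reload branch.
local function reload(cfg, uuid)
    return storage_cfg(cfg, uuid, true)
end

print(storage.cfg({}, 'uuid'))  -- configured with box.cfg
print(reload({}, 'uuid'))       -- reconfigured without box.cfg
```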
* [tarantool-patches] Re: [PATCH 1/3] Update only vshard part of a cfg on reload 2018-08-01 18:43 ` [tarantool-patches] " Vladislav Shpilevoy @ 2018-08-03 20:03 ` Alex Khatskevich 2018-08-06 17:03 ` Vladislav Shpilevoy 0 siblings, 1 reply; 23+ messages in thread From: Alex Khatskevich @ 2018-08-03 20:03 UTC (permalink / raw) To: Vladislav Shpilevoy, tarantool-patches On 01.08.2018 21:43, Vladislav Shpilevoy wrote: > Thanks for the patch! See 4 comments below. > > On 31/07/2018 19:25, AKhatskevich wrote: >> Box cfg could have been changed by a user and then overridden by >> an old vshard config on reload. >> >> Since that commit, box part of a config is applied only when >> it is explicitly passed to a `cfg` method. >> >> This change is important for the multiple routers feature. >> --- >> vshard/cfg.lua | 54 >> +++++++++++++++++++++++++------------------------ >> vshard/router/init.lua | 18 ++++++++--------- >> vshard/storage/init.lua | 53 >> ++++++++++++++++++++++++++++-------------------- >> 3 files changed, 67 insertions(+), 58 deletions(-) >> >> diff --git a/vshard/cfg.lua b/vshard/cfg.lua >> index bba12cc..8282086 100644 >> --- a/vshard/cfg.lua >> +++ b/vshard/cfg.lua >> @@ -230,48 +230,50 @@ local non_dynamic_options = { >> 'bucket_count', 'shard_index' >> } >> +-- >> +-- Deepcopy a config and split it into vshard_cfg and box_cfg. >> +-- >> +local function split_cfg(cfg) >> + local vshard_field_map = {} >> + for _, field in ipairs(cfg_template) do >> + vshard_field_map[field[1]] = true >> + end > > 1. vshard_field_map does not change ever. Why do you build it > on each cfg? Please, store it in a module local variable like > cfg_template. Or refactor cfg_template and other templates so > they would be maps with parameter name as a key - looks like > the most suitable solution. Refactored cfg_template. 
(add extra commit before this one) > >> + local vshard_cfg = {} >> + local box_cfg = {} >> + for k, v in pairs(cfg) do >> + if vshard_field_map[k] then >> + vshard_cfg[k] = table.deepcopy(v) >> + else >> + box_cfg[k] = table.deepcopy(v) >> + end >> + end >> + return vshard_cfg, box_cfg >> +end >> + >> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua >> index 102b942..75f5df9 100644 >> --- a/vshard/storage/init.lua >> +++ b/vshard/storage/init.lua >> @@ -1500,13 +1500,17 @@ end >> -------------------------------------------------------------------------------- >> -- Configuration >> -------------------------------------------------------------------------------- >> +-- Private (not accessible by a user) reload indicator. >> +local is_reload = false > > 2. Please, make this variable be parameter of storage_cfg and wrap public > storage.cfg with a one-liner: > > storage.cfg = function(cfg, uuid) return storage_cfg(cfg, uuid, > false) end > > I believe/hope you understand that such way to pass parameters, via > global > variables, is flawed by design. Nice idea. Fixed. > >> @@ -1553,18 +1557,19 @@ local function storage_cfg(cfg, >> this_replica_uuid) >> -- >> -- If a master role of the replica is not changed, then >> -- 'read_only' can be set right here. >> - cfg.listen = cfg.listen or this_replica.uri >> - if cfg.replication == nil and this_replicaset.master and not >> is_master then >> - cfg.replication = {this_replicaset.master.uri} >> + box_cfg.listen = box_cfg.listen or this_replica.uri >> + if box_cfg.replication == nil and this_replicaset.master >> + and not is_master then > > 3. Broken indentation. 
fixed > >> + box_cfg.replication = {this_replicaset.master.uri} >> else >> - cfg.replication = {} >> + box_cfg.replication = {} >> end >> if was_master == is_master then >> - cfg.read_only = not is_master >> + box_cfg.read_only = not is_master >> end >> if type(box.cfg) == 'function' then >> - cfg.instance_uuid = this_replica.uuid >> - cfg.replicaset_uuid = this_replicaset.uuid >> + box_cfg.instance_uuid = this_replica.uuid >> + box_cfg.replicaset_uuid = this_replicaset.uuid >> else >> local info = box.info >> if this_replica_uuid ~= info.uuid then >> @@ -1607,9 +1614,10 @@ local function storage_cfg(cfg, >> this_replica_uuid) >> local_on_master_enable_prepare() >> end >> - local box_cfg = table.copy(cfg) >> - lcfg.remove_non_box_options(box_cfg) >> - local ok, err = pcall(box.cfg, box_cfg) >> + local ok, err = true, nil >> + if not xis_reload then >> + ok, err = pcall(box.cfg, box_cfg) >> + end > > 4. The code below (if not ok then ...) can be moved inside > 'if not is_reload' together with 'local ok, err' declaration. > Please, do. > >> while M.errinj.ERRINJ_CFG_DELAY do >> lfiber.sleep(0.01) >> end Done. Full diff commit 81cb60df74fbacae3aee1817f1ff16e7fe0af72f Author: AKhatskevich <avkhatskevich@tarantool.org> Date: Mon Jul 23 16:42:22 2018 +0300 Update only vshard part of a cfg on reload Box cfg could have been changed by a user and then overridden by an old vshard config on reload. Since that commit, box part of a config is applied only when it is explicitly passed to a `cfg` method. This change is important for the multiple routers feature. diff --git a/vshard/cfg.lua b/vshard/cfg.lua index 7c9ab77..80ea432 100644 --- a/vshard/cfg.lua +++ b/vshard/cfg.lua @@ -221,6 +221,22 @@ local cfg_template = { }, } +-- +-- Deepcopy a config and split it into vshard_cfg and box_cfg. 
+-- +local function split_cfg(cfg) + local vshard_cfg = {} + local box_cfg = {} + for k, v in pairs(cfg) do + if cfg_template[k] then + vshard_cfg[k] = table.deepcopy(v) + else + box_cfg[k] = table.deepcopy(v) + end + end + return vshard_cfg, box_cfg +end + -- -- Names of options which cannot be changed during reconfigure. -- @@ -232,44 +248,26 @@ local non_dynamic_options = { -- Check sharding config on correctness. Check types, name and uri -- uniqueness, master count (in each replicaset must be <= 1). -- -local function cfg_check(shard_cfg, old_cfg) - if type(shard_cfg) ~= 'table' then +local function cfg_check(cfg, old_vshard_cfg) + if type(cfg) ~= 'table' then error('Сonfig must be map of options') end - shard_cfg = table.deepcopy(shard_cfg) - validate_config(shard_cfg, cfg_template) - if not old_cfg then - return shard_cfg + local vshard_cfg, box_cfg = split_cfg(cfg) + validate_config(vshard_cfg, cfg_template) + if not old_vshard_cfg then + return vshard_cfg, box_cfg end -- Check non-dynamic after default values are added. for _, f_name in pairs(non_dynamic_options) do -- New option may be added in new vshard version. - if shard_cfg[f_name] ~= old_cfg[f_name] then + if vshard_cfg[f_name] ~= old_vshard_cfg[f_name] then error(string.format('Non-dynamic option %s ' .. 'cannot be reconfigured', f_name)) end end - return shard_cfg -end - --- --- Nullify non-box options. 
--- -local function remove_non_box_options(cfg) - cfg.sharding = nil - cfg.weights = nil - cfg.zone = nil - cfg.bucket_count = nil - cfg.rebalancer_disbalance_threshold = nil - cfg.rebalancer_max_receiving = nil - cfg.shard_index = nil - cfg.collect_bucket_garbage_interval = nil - cfg.collect_lua_garbage = nil - cfg.sync_timeout = nil - cfg.connection_outdate_delay = nil + return vshard_cfg, box_cfg end return { check = cfg_check, - remove_non_box_options = remove_non_box_options, } diff --git a/vshard/router/init.lua b/vshard/router/init.lua index 4cb19fd..e2b2b22 100644 --- a/vshard/router/init.lua +++ b/vshard/router/init.lua @@ -496,18 +496,15 @@ end -------------------------------------------------------------------------------- local function router_cfg(cfg) - cfg = lcfg.check(cfg, M.current_cfg) - local new_cfg = table.copy(cfg) + local vshard_cfg, box_cfg = lcfg.check(cfg, M.current_cfg) if not M.replicasets then log.info('Starting router configuration') else log.info('Starting router reconfiguration') end - local new_replicasets = lreplicaset.buildall(cfg) - local total_bucket_count = cfg.bucket_count - local collect_lua_garbage = cfg.collect_lua_garbage - local box_cfg = table.copy(cfg) - lcfg.remove_non_box_options(box_cfg) + local new_replicasets = lreplicaset.buildall(vshard_cfg) + local total_bucket_count = vshard_cfg.bucket_count + local collect_lua_garbage = vshard_cfg.collect_lua_garbage log.info("Calling box.cfg()...") for k, v in pairs(box_cfg) do log.info({[k] = v}) @@ -530,11 +527,12 @@ local function router_cfg(cfg) replicaset:connect_all() end lreplicaset.wait_masters_connect(new_replicasets) - lreplicaset.outdate_replicasets(M.replicasets, cfg.connection_outdate_delay) - M.connection_outdate_delay = cfg.connection_outdate_delay + lreplicaset.outdate_replicasets(M.replicasets, + vshard_cfg.connection_outdate_delay) + M.connection_outdate_delay = vshard_cfg.connection_outdate_delay M.total_bucket_count = total_bucket_count 
M.collect_lua_garbage = collect_lua_garbage - M.current_cfg = cfg + M.current_cfg = vshard_cfg M.replicasets = new_replicasets -- Update existing route map in-place. local old_route_map = M.route_map diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua index 102b942..40216ea 100644 --- a/vshard/storage/init.lua +++ b/vshard/storage/init.lua @@ -1500,13 +1500,13 @@ end -------------------------------------------------------------------------------- -- Configuration -------------------------------------------------------------------------------- -local function storage_cfg(cfg, this_replica_uuid) + +local function storage_cfg(cfg, this_replica_uuid, is_reload) if this_replica_uuid == nil then error('Usage: cfg(configuration, this_replica_uuid)') end - cfg = lcfg.check(cfg, M.current_cfg) - local new_cfg = table.copy(cfg) - if cfg.weights or cfg.zone then + local vshard_cfg, box_cfg = lcfg.check(cfg, M.current_cfg) + if vshard_cfg.weights or vshard_cfg.zone then error('Weights and zone are not allowed for storage configuration') end if M.replicasets then @@ -1520,7 +1520,7 @@ local function storage_cfg(cfg, this_replica_uuid) local this_replicaset local this_replica - local new_replicasets = lreplicaset.buildall(cfg) + local new_replicasets = lreplicaset.buildall(vshard_cfg) local min_master for rs_uuid, rs in pairs(new_replicasets) do for replica_uuid, replica in pairs(rs.replicas) do @@ -1553,18 +1553,19 @@ local function storage_cfg(cfg, this_replica_uuid) -- -- If a master role of the replica is not changed, then -- 'read_only' can be set right here. 
- cfg.listen = cfg.listen or this_replica.uri - if cfg.replication == nil and this_replicaset.master and not is_master then - cfg.replication = {this_replicaset.master.uri} + box_cfg.listen = box_cfg.listen or this_replica.uri + if box_cfg.replication == nil and this_replicaset.master + and not is_master then + box_cfg.replication = {this_replicaset.master.uri} else - cfg.replication = {} + box_cfg.replication = {} end if was_master == is_master then - cfg.read_only = not is_master + box_cfg.read_only = not is_master end if type(box.cfg) == 'function' then - cfg.instance_uuid = this_replica.uuid - cfg.replicaset_uuid = this_replicaset.uuid + box_cfg.instance_uuid = this_replica.uuid + box_cfg.replicaset_uuid = this_replicaset.uuid else local info = box.info if this_replica_uuid ~= info.uuid then @@ -1578,12 +1579,14 @@ local function storage_cfg(cfg, this_replica_uuid) this_replicaset.uuid)) end end - local total_bucket_count = cfg.bucket_count - local rebalancer_disbalance_threshold = cfg.rebalancer_disbalance_threshold - local rebalancer_max_receiving = cfg.rebalancer_max_receiving - local shard_index = cfg.shard_index - local collect_bucket_garbage_interval = cfg.collect_bucket_garbage_interval - local collect_lua_garbage = cfg.collect_lua_garbage + local total_bucket_count = vshard_cfg.bucket_count + local rebalancer_disbalance_threshold = + vshard_cfg.rebalancer_disbalance_threshold + local rebalancer_max_receiving = vshard_cfg.rebalancer_max_receiving + local shard_index = vshard_cfg.shard_index + local collect_bucket_garbage_interval = + vshard_cfg.collect_bucket_garbage_interval + local collect_lua_garbage = vshard_cfg.collect_lua_garbage -- It is considered that all possible errors during cfg -- process occur only before this place. @@ -1598,7 +1601,7 @@ local function storage_cfg(cfg, this_replica_uuid) -- a new sync timeout. 
-- local old_sync_timeout = M.sync_timeout - M.sync_timeout = cfg.sync_timeout + M.sync_timeout = vshard_cfg.sync_timeout if was_master and not is_master then local_on_master_disable_prepare() @@ -1607,27 +1610,27 @@ local function storage_cfg(cfg, this_replica_uuid) local_on_master_enable_prepare() end - local box_cfg = table.copy(cfg) - lcfg.remove_non_box_options(box_cfg) - local ok, err = pcall(box.cfg, box_cfg) - while M.errinj.ERRINJ_CFG_DELAY do - lfiber.sleep(0.01) - end - if not ok then - M.sync_timeout = old_sync_timeout - if was_master and not is_master then - local_on_master_disable_abort() + if not is_reload then + local ok, err = true, nil + ok, err = pcall(box.cfg, box_cfg) + while M.errinj.ERRINJ_CFG_DELAY do + lfiber.sleep(0.01) end - if not was_master and is_master then - local_on_master_enable_abort() + if not ok then + M.sync_timeout = old_sync_timeout + if was_master and not is_master then + local_on_master_disable_abort() + end + if not was_master and is_master then + local_on_master_enable_abort() + end + error(err) end - error(err) + log.info("Box has been configured") + local uri = luri.parse(this_replica.uri) + box.once("vshard:storage:1", storage_schema_v1, uri.login, uri.password) end - log.info("Box has been configured") - local uri = luri.parse(this_replica.uri) - box.once("vshard:storage:1", storage_schema_v1, uri.login, uri.password) - lreplicaset.rebind_replicasets(new_replicasets, M.replicasets) lreplicaset.outdate_replicasets(M.replicasets) M.replicasets = new_replicasets @@ -1639,7 +1642,7 @@ local function storage_cfg(cfg, this_replica_uuid) M.shard_index = shard_index M.collect_bucket_garbage_interval = collect_bucket_garbage_interval M.collect_lua_garbage = collect_lua_garbage - M.current_cfg = new_cfg + M.current_cfg = vshard_cfg if was_master and not is_master then local_on_master_disable() @@ -1874,7 +1877,7 @@ if not rawget(_G, MODULE_INTERNALS) then rawset(_G, MODULE_INTERNALS, M) else reload_evolution.upgrade(M) - 
storage_cfg(M.current_cfg, M.this_replica.uuid) + storage_cfg(M.current_cfg, M.this_replica.uuid, true) M.module_version = M.module_version + 1 end @@ -1913,7 +1916,7 @@ return { rebalancing_is_in_progress = rebalancing_is_in_progress, recovery_wakeup = recovery_wakeup, call = storage_call, - cfg = storage_cfg, + cfg = function(cfg, uuid) return storage_cfg(cfg, uuid, false) end, info = storage_info, buckets_info = storage_buckets_info, buckets_count = storage_buckets_count, ^ permalink raw reply [flat|nested] 23+ messages in thread
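[Editorial note: the reworked hunk above keeps the rollback discipline around `pcall(box.cfg, box_cfg)`. The shape of that pattern, stripped of vshard specifics, is sketched below; `M`, `reconfigure`, and the single `sync_timeout` field are simplifications for illustration.]

```lua
local M = {sync_timeout = 1}

-- Apply a new timeout around an effectful `apply` step that may
-- throw. On failure the previous value is restored and the error
-- is re-raised, so a failed cfg leaves no half-applied state.
local function reconfigure(new_timeout, apply)
    local old = M.sync_timeout
    M.sync_timeout = new_timeout
    local ok, err = pcall(apply)
    if not ok then
        M.sync_timeout = old
        error(err)
    end
end

reconfigure(5, function() end)
print(M.sync_timeout) -- 5

local ok = pcall(reconfigure, 10, function() error('cfg failed') end)
print(ok, M.sync_timeout) -- false	5
```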
* [tarantool-patches] Re: [PATCH 1/3] Update only vshard part of a cfg on reload 2018-08-03 20:03 ` Alex Khatskevich @ 2018-08-06 17:03 ` Vladislav Shpilevoy 2018-08-07 13:19 ` Alex Khatskevich 0 siblings, 1 reply; 23+ messages in thread From: Vladislav Shpilevoy @ 2018-08-06 17:03 UTC (permalink / raw) To: Alex Khatskevich, tarantool-patches Thanks for the patch! See 3 comments below. > diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua > index 102b942..40216ea 100644 > --- a/vshard/storage/init.lua > +++ b/vshard/storage/init.lua > @@ -1553,18 +1553,19 @@ local function storage_cfg(cfg, this_replica_uuid) > -- > -- If a master role of the replica is not changed, then > -- 'read_only' can be set right here. > - cfg.listen = cfg.listen or this_replica.uri > - if cfg.replication == nil and this_replicaset.master and not is_master then > - cfg.replication = {this_replicaset.master.uri} > + box_cfg.listen = box_cfg.listen or this_replica.uri > + if box_cfg.replication == nil and this_replicaset.master > + and not is_master then > + box_cfg.replication = {this_replicaset.master.uri} > else > - cfg.replication = {} > + box_cfg.replication = {} > end > if was_master == is_master then > - cfg.read_only = not is_master > + box_cfg.read_only = not is_master > end > if type(box.cfg) == 'function' then > - cfg.instance_uuid = this_replica.uuid > - cfg.replicaset_uuid = this_replicaset.uuid > + box_cfg.instance_uuid = this_replica.uuid > + box_cfg.replicaset_uuid = this_replicaset.uuid 1. All these box_cfg manipulations should be done under 'if not is_reload' I think. 
>      else
>          local info = box.info
>          if this_replica_uuid ~= info.uuid then
> @@ -1607,27 +1610,27 @@ local function storage_cfg(cfg, this_replica_uuid)
>          local_on_master_enable_prepare()
>      end
>
> -    local box_cfg = table.copy(cfg)
> -    lcfg.remove_non_box_options(box_cfg)
> -    local ok, err = pcall(box.cfg, box_cfg)
> -    while M.errinj.ERRINJ_CFG_DELAY do
> -        lfiber.sleep(0.01)
> -    end
> -    if not ok then
> -        M.sync_timeout = old_sync_timeout
> -        if was_master and not is_master then
> -            local_on_master_disable_abort()
> +    if not is_reload then
> +        local ok, err = true, nil
> +        ok, err = pcall(box.cfg, box_cfg)

2. Why do you need to announce 'local ok, err' before
their usage on the next line?

> +        while M.errinj.ERRINJ_CFG_DELAY do
> +            lfiber.sleep(0.01)
>          end
> -        if not was_master and is_master then
> -            local_on_master_enable_abort()
> +        if not ok then
> +            M.sync_timeout = old_sync_timeout
> +            if was_master and not is_master then
> +                local_on_master_disable_abort()
> +            end
> +            if not was_master and is_master then
> +                local_on_master_enable_abort()
> +            end
> +            error(err)
>          end
> -        error(err)
> +        log.info("Box has been configured")
> +        local uri = luri.parse(this_replica.uri)
> +        box.once("vshard:storage:1", storage_schema_v1, uri.login, uri.password)
>      end
>
> -    log.info("Box has been configured")
> -    local uri = luri.parse(this_replica.uri)
> -    box.once("vshard:storage:1", storage_schema_v1, uri.login, uri.password)
> -
>      lreplicaset.rebind_replicasets(new_replicasets, M.replicasets)
>      lreplicaset.outdate_replicasets(M.replicasets)
>      M.replicasets = new_replicasets
> @@ -1874,7 +1877,7 @@ if not rawget(_G, MODULE_INTERNALS) then
>      rawset(_G, MODULE_INTERNALS, M)
> else
>      reload_evolution.upgrade(M)
> -    storage_cfg(M.current_cfg, M.this_replica.uuid)
> +    storage_cfg(M.current_cfg, M.this_replica.uuid, true)

3. I see that you have stored vshard_cfg in M.current_cfg. Not a full
config.
And it causes a question - why do you need to separate reload from
non-reload, if reload anyway in such implementation is like 'box.cfg{}'
call with no parameters? And if you do not store box_cfg options how are
you going to compare configs when we will implement atomic cfg over
cluster?

> M.module_version = M.module_version + 1
> end
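The reload/non-reload distinction questioned here comes down to which side effects run: the patch guards the `box.cfg` call behind `is_reload`, so a reload re-applies only the vshard part. A condensed sketch of that guard, with a stub `box` table standing in for Tarantool's real one:

```lua
-- Stub standing in for Tarantool's box.cfg; records whether it was called.
local box_cfg_called = false
local box = { cfg = function(options) box_cfg_called = true end }

local function apply_cfg(vshard_cfg, box_options, is_reload)
    if not is_reload then
        -- Box part is applied only on an explicit cfg() call,
        -- never when the module is hot-reloaded.
        box.cfg(box_options)
    end
    -- The vshard part would be (re)applied unconditionally here.
    return vshard_cfg
end

apply_cfg({ bucket_count = 3000 }, { listen = 3301 }, true)
assert(box_cfg_called == false)  -- reload skipped box.cfg
apply_cfg({ bucket_count = 3000 }, { listen = 3301 }, false)
assert(box_cfg_called == true)   -- explicit cfg() applied box options
```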
* [tarantool-patches] Re: [PATCH 1/3] Update only vshard part of a cfg on reload 2018-08-06 17:03 ` Vladislav Shpilevoy @ 2018-08-07 13:19 ` Alex Khatskevich 2018-08-08 11:17 ` Vladislav Shpilevoy 0 siblings, 1 reply; 23+ messages in thread From: Alex Khatskevich @ 2018-08-07 13:19 UTC (permalink / raw) To: Vladislav Shpilevoy, tarantool-patches On 06.08.2018 20:03, Vladislav Shpilevoy wrote: > Thanks for the patch! See 3 comments below. > >> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua >> index 102b942..40216ea 100644 >> --- a/vshard/storage/init.lua >> +++ b/vshard/storage/init.lua >> @@ -1553,18 +1553,19 @@ local function storage_cfg(cfg, >> this_replica_uuid) >> -- >> -- If a master role of the replica is not changed, then >> -- 'read_only' can be set right here. >> - cfg.listen = cfg.listen or this_replica.uri >> - if cfg.replication == nil and this_replicaset.master and not >> is_master then >> - cfg.replication = {this_replicaset.master.uri} >> + box_cfg.listen = box_cfg.listen or this_replica.uri >> + if box_cfg.replication == nil and this_replicaset.master >> + and not is_master then >> + box_cfg.replication = {this_replicaset.master.uri} >> else >> - cfg.replication = {} >> + box_cfg.replication = {} >> end >> if was_master == is_master then >> - cfg.read_only = not is_master >> + box_cfg.read_only = not is_master >> end >> if type(box.cfg) == 'function' then >> - cfg.instance_uuid = this_replica.uuid >> - cfg.replicaset_uuid = this_replicaset.uuid >> + box_cfg.instance_uuid = this_replica.uuid >> + box_cfg.replicaset_uuid = this_replicaset.uuid > > 1. All these box_cfg manipulations should be done under 'if not > is_reload' > I think. Fixed. 
> >> else >> local info = box.info >> if this_replica_uuid ~= info.uuid then >> @@ -1607,27 +1610,27 @@ local function storage_cfg(cfg, >> this_replica_uuid) >> local_on_master_enable_prepare() >> end >> >> - local box_cfg = table.copy(cfg) >> - lcfg.remove_non_box_options(box_cfg) >> - local ok, err = pcall(box.cfg, box_cfg) >> - while M.errinj.ERRINJ_CFG_DELAY do >> - lfiber.sleep(0.01) >> - end >> - if not ok then >> - M.sync_timeout = old_sync_timeout >> - if was_master and not is_master then >> - local_on_master_disable_abort() >> + if not is_reload then >> + local ok, err = true, nil >> + ok, err = pcall(box.cfg, box_cfg) > > 2. Why do you need to announce 'local ok, err' before > their usage on the next line? fixed. > > >> + while M.errinj.ERRINJ_CFG_DELAY do >> + lfiber.sleep(0.01) >> end >> - if not was_master and is_master then >> - local_on_master_enable_abort() >> + if not ok then >> + M.sync_timeout = old_sync_timeout >> + if was_master and not is_master then >> + local_on_master_disable_abort() >> + end >> + if not was_master and is_master then >> + local_on_master_enable_abort() >> + end >> + error(err) >> end >> - error(err) >> + log.info("Box has been configured") >> + local uri = luri.parse(this_replica.uri) >> + box.once("vshard:storage:1", storage_schema_v1, uri.login, >> uri.password) >> end >> >> - log.info("Box has been configured") >> - local uri = luri.parse(this_replica.uri) >> - box.once("vshard:storage:1", storage_schema_v1, uri.login, >> uri.password) >> - >> lreplicaset.rebind_replicasets(new_replicasets, M.replicasets) >> lreplicaset.outdate_replicasets(M.replicasets) >> M.replicasets = new_replicasets >> @@ -1874,7 +1877,7 @@ if not rawget(_G, MODULE_INTERNALS) then >> rawset(_G, MODULE_INTERNALS, M) >> else >> reload_evolution.upgrade(M) >> - storage_cfg(M.current_cfg, M.this_replica.uuid) >> + storage_cfg(M.current_cfg, M.this_replica.uuid, true) > > 3. I see that you have stored vshard_cfg in M.current_cfg. Not a full > config. 
So it does not have any box options. And it causes a question > - why do you need to separate reload from non-reload, if reload anyway > in such implementation is like 'box.cfg{}' call with no parameters? > And if you do not store box_cfg options how are you going to compare > configs when we will implement atomic cfg over cluster? Fixed. And in a router too. Full diff: commit d3c35612130ff95b20245993ab5053981d3b985f Author: AKhatskevich <avkhatskevich@tarantool.org> Date: Mon Jul 23 16:42:22 2018 +0300 Update only vshard part of a cfg on reload Box cfg could have been changed by a user and then overridden by an old vshard config on reload. Since that commit, box part of a config is applied only when it is explicitly passed to a `cfg` method. This change is important for the multiple routers feature. diff --git a/vshard/cfg.lua b/vshard/cfg.lua index 7c9ab77..af1c3ee 100644 --- a/vshard/cfg.lua +++ b/vshard/cfg.lua @@ -221,6 +221,22 @@ local cfg_template = { }, } +-- +-- Split it into vshard_cfg and box_cfg parts. +-- +local function cfg_split(cfg) + local vshard_cfg = {} + local box_cfg = {} + for k, v in pairs(cfg) do + if cfg_template[k] then + vshard_cfg[k] = v + else + box_cfg[k] = v + end + end + return vshard_cfg, box_cfg +end + -- -- Names of options which cannot be changed during reconfigure. -- @@ -252,24 +268,7 @@ local function cfg_check(shard_cfg, old_cfg) return shard_cfg end --- --- Nullify non-box options. 
--- -local function remove_non_box_options(cfg) - cfg.sharding = nil - cfg.weights = nil - cfg.zone = nil - cfg.bucket_count = nil - cfg.rebalancer_disbalance_threshold = nil - cfg.rebalancer_max_receiving = nil - cfg.shard_index = nil - cfg.collect_bucket_garbage_interval = nil - cfg.collect_lua_garbage = nil - cfg.sync_timeout = nil - cfg.connection_outdate_delay = nil -end - return { check = cfg_check, - remove_non_box_options = remove_non_box_options, + split = cfg_split, } diff --git a/vshard/router/init.lua b/vshard/router/init.lua index d8c026b..1e8d898 100644 --- a/vshard/router/init.lua +++ b/vshard/router/init.lua @@ -491,19 +491,17 @@ end -- Configuration -------------------------------------------------------------------------------- -local function router_cfg(cfg) +local function router_cfg(cfg, is_reload) cfg = lcfg.check(cfg, M.current_cfg) - local new_cfg = table.copy(cfg) + local vshard_cfg, box_cfg = lcfg.split(cfg) if not M.replicasets then log.info('Starting router configuration') else log.info('Starting router reconfiguration') end - local new_replicasets = lreplicaset.buildall(cfg) - local total_bucket_count = cfg.bucket_count - local collect_lua_garbage = cfg.collect_lua_garbage - local box_cfg = table.copy(cfg) - lcfg.remove_non_box_options(box_cfg) + local new_replicasets = lreplicaset.buildall(vshard_cfg) + local total_bucket_count = vshard_cfg.bucket_count + local collect_lua_garbage = vshard_cfg.collect_lua_garbage log.info("Calling box.cfg()...") for k, v in pairs(box_cfg) do log.info({[k] = v}) @@ -514,8 +512,10 @@ local function router_cfg(cfg) if M.errinj.ERRINJ_CFG then error('Error injection: cfg') end - box.cfg(box_cfg) - log.info("Box has been configured") + if not is_reload then + box.cfg(box_cfg) + log.info("Box has been configured") + end -- Move connections from an old configuration to a new one. -- It must be done with no yields to prevent usage both of not -- fully moved old replicasets, and not fully built new ones. 
@@ -526,8 +526,9 @@ local function router_cfg(cfg) replicaset:connect_all() end lreplicaset.wait_masters_connect(new_replicasets) - lreplicaset.outdate_replicasets(M.replicasets, cfg.connection_outdate_delay) - M.connection_outdate_delay = cfg.connection_outdate_delay + lreplicaset.outdate_replicasets(M.replicasets, + vshard_cfg.connection_outdate_delay) + M.connection_outdate_delay = vshard_cfg.connection_outdate_delay M.total_bucket_count = total_bucket_count M.collect_lua_garbage = collect_lua_garbage M.current_cfg = cfg @@ -817,7 +818,7 @@ end if not rawget(_G, MODULE_INTERNALS) then rawset(_G, MODULE_INTERNALS, M) else - router_cfg(M.current_cfg) + router_cfg(M.current_cfg, true) M.module_version = M.module_version + 1 end @@ -825,7 +826,7 @@ M.discovery_f = discovery_f M.failover_f = failover_f return { - cfg = router_cfg; + cfg = function(cfg) return router_cfg(cfg, false) end; info = router_info; buckets_info = router_buckets_info; call = router_call; diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua index 1f29323..2080769 100644 --- a/vshard/storage/init.lua +++ b/vshard/storage/init.lua @@ -1500,13 +1500,14 @@ end -------------------------------------------------------------------------------- -- Configuration -------------------------------------------------------------------------------- -local function storage_cfg(cfg, this_replica_uuid) + +local function storage_cfg(cfg, this_replica_uuid, is_reload) if this_replica_uuid == nil then error('Usage: cfg(configuration, this_replica_uuid)') end cfg = lcfg.check(cfg, M.current_cfg) - local new_cfg = table.copy(cfg) - if cfg.weights or cfg.zone then + local vshard_cfg, box_cfg = lcfg.split(cfg) + if vshard_cfg.weights or vshard_cfg.zone then error('Weights and zone are not allowed for storage configuration') end if M.replicasets then @@ -1520,7 +1521,7 @@ local function storage_cfg(cfg, this_replica_uuid) local this_replicaset local this_replica - local new_replicasets = 
lreplicaset.buildall(cfg) + local new_replicasets = lreplicaset.buildall(vshard_cfg) local min_master for rs_uuid, rs in pairs(new_replicasets) do for replica_uuid, replica in pairs(rs.replicas) do @@ -1544,46 +1545,14 @@ local function storage_cfg(cfg, this_replica_uuid) log.info('I am master') end - -- Do not change 'read_only' option here - if a master is - -- disabled and there are triggers on master disable, then - -- they would not be able to modify anything, if 'read_only' - -- had been set here. 'Read_only' is set in - -- local_on_master_disable after triggers and is unset in - -- local_on_master_enable before triggers. - -- - -- If a master role of the replica is not changed, then - -- 'read_only' can be set right here. - cfg.listen = cfg.listen or this_replica.uri - if cfg.replication == nil and this_replicaset.master and not is_master then - cfg.replication = {this_replicaset.master.uri} - else - cfg.replication = {} - end - if was_master == is_master then - cfg.read_only = not is_master - end - if type(box.cfg) == 'function' then - cfg.instance_uuid = this_replica.uuid - cfg.replicaset_uuid = this_replicaset.uuid - else - local info = box.info - if this_replica_uuid ~= info.uuid then - error(string.format('Instance UUID mismatch: already set "%s" '.. - 'but "%s" in arguments', info.uuid, - this_replica_uuid)) - end - if this_replicaset.uuid ~= info.cluster.uuid then - error(string.format('Replicaset UUID mismatch: already set "%s" '.. 
- 'but "%s" in vshard config', info.cluster.uuid, - this_replicaset.uuid)) - end - end - local total_bucket_count = cfg.bucket_count - local rebalancer_disbalance_threshold = cfg.rebalancer_disbalance_threshold - local rebalancer_max_receiving = cfg.rebalancer_max_receiving - local shard_index = cfg.shard_index - local collect_bucket_garbage_interval = cfg.collect_bucket_garbage_interval - local collect_lua_garbage = cfg.collect_lua_garbage + local total_bucket_count = vshard_cfg.bucket_count + local rebalancer_disbalance_threshold = + vshard_cfg.rebalancer_disbalance_threshold + local rebalancer_max_receiving = vshard_cfg.rebalancer_max_receiving + local shard_index = vshard_cfg.shard_index + local collect_bucket_garbage_interval = + vshard_cfg.collect_bucket_garbage_interval + local collect_lua_garbage = vshard_cfg.collect_lua_garbage -- It is considered that all possible errors during cfg -- process occur only before this place. @@ -1598,7 +1567,7 @@ local function storage_cfg(cfg, this_replica_uuid) -- a new sync timeout. -- local old_sync_timeout = M.sync_timeout - M.sync_timeout = cfg.sync_timeout + M.sync_timeout = vshard_cfg.sync_timeout if was_master and not is_master then local_on_master_disable_prepare() @@ -1607,27 +1576,61 @@ local function storage_cfg(cfg, this_replica_uuid) local_on_master_enable_prepare() end - local box_cfg = table.copy(cfg) - lcfg.remove_non_box_options(box_cfg) - local ok, err = pcall(box.cfg, box_cfg) - while M.errinj.ERRINJ_CFG_DELAY do - lfiber.sleep(0.01) - end - if not ok then - M.sync_timeout = old_sync_timeout - if was_master and not is_master then - local_on_master_disable_abort() + if not is_reload then + -- Do not change 'read_only' option here - if a master is + -- disabled and there are triggers on master disable, then + -- they would not be able to modify anything, if 'read_only' + -- had been set here. 
'Read_only' is set in + -- local_on_master_disable after triggers and is unset in + -- local_on_master_enable before triggers. + -- + -- If a master role of the replica is not changed, then + -- 'read_only' can be set right here. + box_cfg.listen = box_cfg.listen or this_replica.uri + if box_cfg.replication == nil and this_replicaset.master + and not is_master then + box_cfg.replication = {this_replicaset.master.uri} + else + box_cfg.replication = {} end - if not was_master and is_master then - local_on_master_enable_abort() + if was_master == is_master then + box_cfg.read_only = not is_master end - error(err) + if type(box.cfg) == 'function' then + box_cfg.instance_uuid = this_replica.uuid + box_cfg.replicaset_uuid = this_replicaset.uuid + else + local info = box.info + if this_replica_uuid ~= info.uuid then + error(string.format('Instance UUID mismatch: already set ' .. + '"%s" but "%s" in arguments', info.uuid, + this_replica_uuid)) + end + if this_replicaset.uuid ~= info.cluster.uuid then + error(string.format('Replicaset UUID mismatch: already set ' .. 
+ '"%s" but "%s" in vshard config', + info.cluster.uuid, this_replicaset.uuid)) + end + end + local ok, err = pcall(box.cfg, box_cfg) + while M.errinj.ERRINJ_CFG_DELAY do + lfiber.sleep(0.01) + end + if not ok then + M.sync_timeout = old_sync_timeout + if was_master and not is_master then + local_on_master_disable_abort() + end + if not was_master and is_master then + local_on_master_enable_abort() + end + error(err) + end + log.info("Box has been configured") + local uri = luri.parse(this_replica.uri) + box.once("vshard:storage:1", storage_schema_v1, uri.login, uri.password) end - log.info("Box has been configured") - local uri = luri.parse(this_replica.uri) - box.once("vshard:storage:1", storage_schema_v1, uri.login, uri.password) - lreplicaset.rebind_replicasets(new_replicasets, M.replicasets) lreplicaset.outdate_replicasets(M.replicasets) M.replicasets = new_replicasets @@ -1639,7 +1642,7 @@ local function storage_cfg(cfg, this_replica_uuid) M.shard_index = shard_index M.collect_bucket_garbage_interval = collect_bucket_garbage_interval M.collect_lua_garbage = collect_lua_garbage - M.current_cfg = new_cfg + M.current_cfg = cfg if was_master and not is_master then local_on_master_disable() @@ -1875,7 +1878,7 @@ if not rawget(_G, MODULE_INTERNALS) then rawset(_G, MODULE_INTERNALS, M) else reload_evolution.upgrade(M) - storage_cfg(M.current_cfg, M.this_replica.uuid) + storage_cfg(M.current_cfg, M.this_replica.uuid, true) M.module_version = M.module_version + 1 end @@ -1914,7 +1917,7 @@ return { rebalancing_is_in_progress = rebalancing_is_in_progress, recovery_wakeup = recovery_wakeup, call = storage_call, - cfg = storage_cfg, + cfg = function(cfg, uuid) return storage_cfg(cfg, uuid, false) end, info = storage_info, buckets_info = storage_buckets_info, buckets_count = storage_buckets_count, ^ permalink raw reply [flat|nested] 23+ messages in thread
* [tarantool-patches] Re: [PATCH 1/3] Update only vshard part of a cfg on reload 2018-08-07 13:19 ` Alex Khatskevich @ 2018-08-08 11:17 ` Vladislav Shpilevoy 0 siblings, 0 replies; 23+ messages in thread From: Vladislav Shpilevoy @ 2018-08-08 11:17 UTC (permalink / raw) To: tarantool-patches, Alex Khatskevich Thanks for the patch! Pushed into the master. On 07/08/2018 16:19, Alex Khatskevich wrote: > > On 06.08.2018 20:03, Vladislav Shpilevoy wrote: >> Thanks for the patch! See 3 comments below. >> >>> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua >>> index 102b942..40216ea 100644 >>> --- a/vshard/storage/init.lua >>> +++ b/vshard/storage/init.lua >>> @@ -1553,18 +1553,19 @@ local function storage_cfg(cfg, this_replica_uuid) >>> -- >>> -- If a master role of the replica is not changed, then >>> -- 'read_only' can be set right here. >>> - cfg.listen = cfg.listen or this_replica.uri >>> - if cfg.replication == nil and this_replicaset.master and not is_master then >>> - cfg.replication = {this_replicaset.master.uri} >>> + box_cfg.listen = box_cfg.listen or this_replica.uri >>> + if box_cfg.replication == nil and this_replicaset.master >>> + and not is_master then >>> + box_cfg.replication = {this_replicaset.master.uri} >>> else >>> - cfg.replication = {} >>> + box_cfg.replication = {} >>> end >>> if was_master == is_master then >>> - cfg.read_only = not is_master >>> + box_cfg.read_only = not is_master >>> end >>> if type(box.cfg) == 'function' then >>> - cfg.instance_uuid = this_replica.uuid >>> - cfg.replicaset_uuid = this_replicaset.uuid >>> + box_cfg.instance_uuid = this_replica.uuid >>> + box_cfg.replicaset_uuid = this_replicaset.uuid >> >> 1. All these box_cfg manipulations should be done under 'if not is_reload' >> I think. > Fixed. 
>> >>> else >>> local info = box.info >>> if this_replica_uuid ~= info.uuid then >>> @@ -1607,27 +1610,27 @@ local function storage_cfg(cfg, this_replica_uuid) >>> local_on_master_enable_prepare() >>> end >>> >>> - local box_cfg = table.copy(cfg) >>> - lcfg.remove_non_box_options(box_cfg) >>> - local ok, err = pcall(box.cfg, box_cfg) >>> - while M.errinj.ERRINJ_CFG_DELAY do >>> - lfiber.sleep(0.01) >>> - end >>> - if not ok then >>> - M.sync_timeout = old_sync_timeout >>> - if was_master and not is_master then >>> - local_on_master_disable_abort() >>> + if not is_reload then >>> + local ok, err = true, nil >>> + ok, err = pcall(box.cfg, box_cfg) >> >> 2. Why do you need to announce 'local ok, err' before >> their usage on the next line? > fixed. >> >> >>> + while M.errinj.ERRINJ_CFG_DELAY do >>> + lfiber.sleep(0.01) >>> end >>> - if not was_master and is_master then >>> - local_on_master_enable_abort() >>> + if not ok then >>> + M.sync_timeout = old_sync_timeout >>> + if was_master and not is_master then >>> + local_on_master_disable_abort() >>> + end >>> + if not was_master and is_master then >>> + local_on_master_enable_abort() >>> + end >>> + error(err) >>> end >>> - error(err) >>> + log.info("Box has been configured") >>> + local uri = luri.parse(this_replica.uri) >>> + box.once("vshard:storage:1", storage_schema_v1, uri.login, uri.password) >>> end >>> >>> - log.info("Box has been configured") >>> - local uri = luri.parse(this_replica.uri) >>> - box.once("vshard:storage:1", storage_schema_v1, uri.login, uri.password) >>> - >>> lreplicaset.rebind_replicasets(new_replicasets, M.replicasets) >>> lreplicaset.outdate_replicasets(M.replicasets) >>> M.replicasets = new_replicasets >>> @@ -1874,7 +1877,7 @@ if not rawget(_G, MODULE_INTERNALS) then >>> rawset(_G, MODULE_INTERNALS, M) >>> else >>> reload_evolution.upgrade(M) >>> - storage_cfg(M.current_cfg, M.this_replica.uuid) >>> + storage_cfg(M.current_cfg, M.this_replica.uuid, true) >> >> 3. 
I see that you have stored vshard_cfg in M.current_cfg. Not a full >> config. So it does not have any box options. And it causes a question >> - why do you need to separate reload from non-reload, if reload anyway >> in such implementation is like 'box.cfg{}' call with no parameters? >> And if you do not store box_cfg options how are you going to compare >> configs when we will implement atomic cfg over cluster? > Fixed. And in a router too. > > > > Full diff: > > commit d3c35612130ff95b20245993ab5053981d3b985f > Author: AKhatskevich <avkhatskevich@tarantool.org> > Date: Mon Jul 23 16:42:22 2018 +0300 > > Update only vshard part of a cfg on reload > > Box cfg could have been changed by a user and then overridden by > an old vshard config on reload. > > Since that commit, box part of a config is applied only when > it is explicitly passed to a `cfg` method. > > This change is important for the multiple routers feature. > > diff --git a/vshard/cfg.lua b/vshard/cfg.lua > index 7c9ab77..af1c3ee 100644 > --- a/vshard/cfg.lua > +++ b/vshard/cfg.lua > @@ -221,6 +221,22 @@ local cfg_template = { > }, > } > > +-- > +-- Split it into vshard_cfg and box_cfg parts. > +-- > +local function cfg_split(cfg) > + local vshard_cfg = {} > + local box_cfg = {} > + for k, v in pairs(cfg) do > + if cfg_template[k] then > + vshard_cfg[k] = v > + else > + box_cfg[k] = v > + end > + end > + return vshard_cfg, box_cfg > +end > + > -- > -- Names of options which cannot be changed during reconfigure. > -- > @@ -252,24 +268,7 @@ local function cfg_check(shard_cfg, old_cfg) > return shard_cfg > end > > --- > --- Nullify non-box options. 
> --- > -local function remove_non_box_options(cfg) > - cfg.sharding = nil > - cfg.weights = nil > - cfg.zone = nil > - cfg.bucket_count = nil > - cfg.rebalancer_disbalance_threshold = nil > - cfg.rebalancer_max_receiving = nil > - cfg.shard_index = nil > - cfg.collect_bucket_garbage_interval = nil > - cfg.collect_lua_garbage = nil > - cfg.sync_timeout = nil > - cfg.connection_outdate_delay = nil > -end > - > return { > check = cfg_check, > - remove_non_box_options = remove_non_box_options, > + split = cfg_split, > } > diff --git a/vshard/router/init.lua b/vshard/router/init.lua > index d8c026b..1e8d898 100644 > --- a/vshard/router/init.lua > +++ b/vshard/router/init.lua > @@ -491,19 +491,17 @@ end > -- Configuration > -------------------------------------------------------------------------------- > > -local function router_cfg(cfg) > +local function router_cfg(cfg, is_reload) > cfg = lcfg.check(cfg, M.current_cfg) > - local new_cfg = table.copy(cfg) > + local vshard_cfg, box_cfg = lcfg.split(cfg) > if not M.replicasets then > log.info('Starting router configuration') > else > log.info('Starting router reconfiguration') > end > - local new_replicasets = lreplicaset.buildall(cfg) > - local total_bucket_count = cfg.bucket_count > - local collect_lua_garbage = cfg.collect_lua_garbage > - local box_cfg = table.copy(cfg) > - lcfg.remove_non_box_options(box_cfg) > + local new_replicasets = lreplicaset.buildall(vshard_cfg) > + local total_bucket_count = vshard_cfg.bucket_count > + local collect_lua_garbage = vshard_cfg.collect_lua_garbage > log.info("Calling box.cfg()...") > for k, v in pairs(box_cfg) do > log.info({[k] = v}) > @@ -514,8 +512,10 @@ local function router_cfg(cfg) > if M.errinj.ERRINJ_CFG then > error('Error injection: cfg') > end > - box.cfg(box_cfg) > - log.info("Box has been configured") > + if not is_reload then > + box.cfg(box_cfg) > + log.info("Box has been configured") > + end > -- Move connections from an old configuration to a new one. 
> -- It must be done with no yields to prevent usage both of not > -- fully moved old replicasets, and not fully built new ones. > @@ -526,8 +526,9 @@ local function router_cfg(cfg) > replicaset:connect_all() > end > lreplicaset.wait_masters_connect(new_replicasets) > - lreplicaset.outdate_replicasets(M.replicasets, cfg.connection_outdate_delay) > - M.connection_outdate_delay = cfg.connection_outdate_delay > + lreplicaset.outdate_replicasets(M.replicasets, > + vshard_cfg.connection_outdate_delay) > + M.connection_outdate_delay = vshard_cfg.connection_outdate_delay > M.total_bucket_count = total_bucket_count > M.collect_lua_garbage = collect_lua_garbage > M.current_cfg = cfg > @@ -817,7 +818,7 @@ end > if not rawget(_G, MODULE_INTERNALS) then > rawset(_G, MODULE_INTERNALS, M) > else > - router_cfg(M.current_cfg) > + router_cfg(M.current_cfg, true) > M.module_version = M.module_version + 1 > end > > @@ -825,7 +826,7 @@ M.discovery_f = discovery_f > M.failover_f = failover_f > > return { > - cfg = router_cfg; > + cfg = function(cfg) return router_cfg(cfg, false) end; > info = router_info; > buckets_info = router_buckets_info; > call = router_call; > diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua > index 1f29323..2080769 100644 > --- a/vshard/storage/init.lua > +++ b/vshard/storage/init.lua > @@ -1500,13 +1500,14 @@ end > -------------------------------------------------------------------------------- > -- Configuration > -------------------------------------------------------------------------------- > -local function storage_cfg(cfg, this_replica_uuid) > + > +local function storage_cfg(cfg, this_replica_uuid, is_reload) > if this_replica_uuid == nil then > error('Usage: cfg(configuration, this_replica_uuid)') > end > cfg = lcfg.check(cfg, M.current_cfg) > - local new_cfg = table.copy(cfg) > - if cfg.weights or cfg.zone then > + local vshard_cfg, box_cfg = lcfg.split(cfg) > + if vshard_cfg.weights or vshard_cfg.zone then > error('Weights and zone are 
not allowed for storage configuration') > end > if M.replicasets then > @@ -1520,7 +1521,7 @@ local function storage_cfg(cfg, this_replica_uuid) > > local this_replicaset > local this_replica > - local new_replicasets = lreplicaset.buildall(cfg) > + local new_replicasets = lreplicaset.buildall(vshard_cfg) > local min_master > for rs_uuid, rs in pairs(new_replicasets) do > for replica_uuid, replica in pairs(rs.replicas) do > @@ -1544,46 +1545,14 @@ local function storage_cfg(cfg, this_replica_uuid) > log.info('I am master') > end > > - -- Do not change 'read_only' option here - if a master is > - -- disabled and there are triggers on master disable, then > - -- they would not be able to modify anything, if 'read_only' > - -- had been set here. 'Read_only' is set in > - -- local_on_master_disable after triggers and is unset in > - -- local_on_master_enable before triggers. > - -- > - -- If a master role of the replica is not changed, then > - -- 'read_only' can be set right here. > - cfg.listen = cfg.listen or this_replica.uri > - if cfg.replication == nil and this_replicaset.master and not is_master then > - cfg.replication = {this_replicaset.master.uri} > - else > - cfg.replication = {} > - end > - if was_master == is_master then > - cfg.read_only = not is_master > - end > - if type(box.cfg) == 'function' then > - cfg.instance_uuid = this_replica.uuid > - cfg.replicaset_uuid = this_replicaset.uuid > - else > - local info = box.info > - if this_replica_uuid ~= info.uuid then > - error(string.format('Instance UUID mismatch: already set "%s" '.. > - 'but "%s" in arguments', info.uuid, > - this_replica_uuid)) > - end > - if this_replicaset.uuid ~= info.cluster.uuid then > - error(string.format('Replicaset UUID mismatch: already set "%s" '.. 
> - 'but "%s" in vshard config', info.cluster.uuid, > - this_replicaset.uuid)) > - end > - end > - local total_bucket_count = cfg.bucket_count > - local rebalancer_disbalance_threshold = cfg.rebalancer_disbalance_threshold > - local rebalancer_max_receiving = cfg.rebalancer_max_receiving > - local shard_index = cfg.shard_index > - local collect_bucket_garbage_interval = cfg.collect_bucket_garbage_interval > - local collect_lua_garbage = cfg.collect_lua_garbage > + local total_bucket_count = vshard_cfg.bucket_count > + local rebalancer_disbalance_threshold = > + vshard_cfg.rebalancer_disbalance_threshold > + local rebalancer_max_receiving = vshard_cfg.rebalancer_max_receiving > + local shard_index = vshard_cfg.shard_index > + local collect_bucket_garbage_interval = > + vshard_cfg.collect_bucket_garbage_interval > + local collect_lua_garbage = vshard_cfg.collect_lua_garbage > > -- It is considered that all possible errors during cfg > -- process occur only before this place. > @@ -1598,7 +1567,7 @@ local function storage_cfg(cfg, this_replica_uuid) > -- a new sync timeout. > -- > local old_sync_timeout = M.sync_timeout > - M.sync_timeout = cfg.sync_timeout > + M.sync_timeout = vshard_cfg.sync_timeout > > if was_master and not is_master then > local_on_master_disable_prepare() > @@ -1607,27 +1576,61 @@ local function storage_cfg(cfg, this_replica_uuid) > local_on_master_enable_prepare() > end > > - local box_cfg = table.copy(cfg) > - lcfg.remove_non_box_options(box_cfg) > - local ok, err = pcall(box.cfg, box_cfg) > - while M.errinj.ERRINJ_CFG_DELAY do > - lfiber.sleep(0.01) > - end > - if not ok then > - M.sync_timeout = old_sync_timeout > - if was_master and not is_master then > - local_on_master_disable_abort() > + if not is_reload then > + -- Do not change 'read_only' option here - if a master is > + -- disabled and there are triggers on master disable, then > + -- they would not be able to modify anything, if 'read_only' > + -- had been set here. 
'Read_only' is set in > + -- local_on_master_disable after triggers and is unset in > + -- local_on_master_enable before triggers. > + -- > + -- If a master role of the replica is not changed, then > + -- 'read_only' can be set right here. > + box_cfg.listen = box_cfg.listen or this_replica.uri > + if box_cfg.replication == nil and this_replicaset.master > + and not is_master then > + box_cfg.replication = {this_replicaset.master.uri} > + else > + box_cfg.replication = {} > end > - if not was_master and is_master then > - local_on_master_enable_abort() > + if was_master == is_master then > + box_cfg.read_only = not is_master > end > - error(err) > + if type(box.cfg) == 'function' then > + box_cfg.instance_uuid = this_replica.uuid > + box_cfg.replicaset_uuid = this_replicaset.uuid > + else > + local info = box.info > + if this_replica_uuid ~= info.uuid then > + error(string.format('Instance UUID mismatch: already set ' .. > + '"%s" but "%s" in arguments', info.uuid, > + this_replica_uuid)) > + end > + if this_replicaset.uuid ~= info.cluster.uuid then > + error(string.format('Replicaset UUID mismatch: already set ' .. 
> + '"%s" but "%s" in vshard config', > + info.cluster.uuid, this_replicaset.uuid)) > + end > + end > + local ok, err = pcall(box.cfg, box_cfg) > + while M.errinj.ERRINJ_CFG_DELAY do > + lfiber.sleep(0.01) > + end > + if not ok then > + M.sync_timeout = old_sync_timeout > + if was_master and not is_master then > + local_on_master_disable_abort() > + end > + if not was_master and is_master then > + local_on_master_enable_abort() > + end > + error(err) > + end > + log.info("Box has been configured") > + local uri = luri.parse(this_replica.uri) > + box.once("vshard:storage:1", storage_schema_v1, uri.login, uri.password) > end > > - log.info("Box has been configured") > - local uri = luri.parse(this_replica.uri) > - box.once("vshard:storage:1", storage_schema_v1, uri.login, uri.password) > - > lreplicaset.rebind_replicasets(new_replicasets, M.replicasets) > lreplicaset.outdate_replicasets(M.replicasets) > M.replicasets = new_replicasets > @@ -1639,7 +1642,7 @@ local function storage_cfg(cfg, this_replica_uuid) > M.shard_index = shard_index > M.collect_bucket_garbage_interval = collect_bucket_garbage_interval > M.collect_lua_garbage = collect_lua_garbage > - M.current_cfg = new_cfg > + M.current_cfg = cfg > > if was_master and not is_master then > local_on_master_disable() > @@ -1875,7 +1878,7 @@ if not rawget(_G, MODULE_INTERNALS) then > rawset(_G, MODULE_INTERNALS, M) > else > reload_evolution.upgrade(M) > - storage_cfg(M.current_cfg, M.this_replica.uuid) > + storage_cfg(M.current_cfg, M.this_replica.uuid, true) > M.module_version = M.module_version + 1 > end > > @@ -1914,7 +1917,7 @@ return { > rebalancing_is_in_progress = rebalancing_is_in_progress, > recovery_wakeup = recovery_wakeup, > call = storage_call, > - cfg = storage_cfg, > + cfg = function(cfg, uuid) return storage_cfg(cfg, uuid, false) end, > info = storage_info, > buckets_info = storage_buckets_info, > buckets_count = storage_buckets_count, > > > ^ permalink raw reply [flat|nested] 23+ messages in thread
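The hunk above splits `storage_cfg` on an `is_reload` flag: a first-time configuration runs `box.cfg` with the UUIDs, while a reload skips `box.cfg` and only verifies that the new config still describes the same instance. A minimal, self-contained sketch of that branch logic (the `box_stub` table and all UUID values are hypothetical stand-ins, not vshard's real API):

```lua
-- Sketch of the reload-aware configuration split. Assumption: 'box_stub'
-- imitates an already-configured Tarantool 'box'; the UUIDs are made up.
local box_stub = {
    info = {uuid = 'replica-1', cluster = {uuid = 'rs-1'}},
}

local function storage_cfg(this_replica_uuid, replicaset_uuid, is_reload)
    if is_reload then
        -- Reload: box is already configured, so only verify that the
        -- vshard config still refers to this very instance.
        local info = box_stub.info
        if this_replica_uuid ~= info.uuid then
            error(string.format('Instance UUID mismatch: already set ' ..
                                '"%s" but "%s" in arguments', info.uuid,
                                this_replica_uuid))
        end
        if replicaset_uuid ~= info.cluster.uuid then
            error(string.format('Replicaset UUID mismatch: already set ' ..
                                '"%s" but "%s" in vshard config',
                                info.cluster.uuid, replicaset_uuid))
        end
        return
    end
    -- First configuration: this is where box.cfg{} would run and the
    -- UUIDs get recorded.
    box_stub.info.uuid = this_replica_uuid
    box_stub.info.cluster.uuid = replicaset_uuid
end

storage_cfg('replica-1', 'rs-1', false)               -- initial cfg: ok
storage_cfg('replica-1', 'rs-1', true)                -- reload, UUIDs match: ok
print(pcall(storage_cfg, 'replica-2', 'rs-1', true))  -- mismatch is rejected
```

The public `cfg` wrapper in the patch pins `is_reload` to `false`, so only the internal reload path (triggered via `MODULE_INTERNALS`) can skip `box.cfg`.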
* [tarantool-patches] [PATCH 2/3] Move lua gc to a dedicated module 2018-07-31 16:25 [tarantool-patches] [PATCH 0/3] multiple routers AKhatskevich 2018-07-31 16:25 ` [tarantool-patches] [PATCH 1/3] Update only vshard part of a cfg on reload AKhatskevich @ 2018-07-31 16:25 ` AKhatskevich 2018-08-01 18:43 ` [tarantool-patches] " Vladislav Shpilevoy 2018-07-31 16:25 ` [tarantool-patches] [PATCH 3/3] Introduce multiple routers feature AKhatskevich ` (2 subsequent siblings) 4 siblings, 1 reply; 23+ messages in thread From: AKhatskevich @ 2018-07-31 16:25 UTC (permalink / raw) To: v.shpilevoy, tarantool-patches `vshard.lua_gc.lua` is a new module that makes Lua GC work more intensively. Before this commit, that was a duty of the router and the storage. Reasons to move Lua GC to a separate module: 1. It is not a duty of vshard to collect garbage, so let the gc fiber be as far from vshard as possible. 2. The next commits will introduce the multiple routers feature, which requires the gc fiber to be a singleton. Closes #138 --- test/router/garbage_collector.result | 27 +++++++++++------ test/router/garbage_collector.test.lua | 18 ++++++----- test/storage/garbage_collector.result | 27 +++++++++-------- test/storage/garbage_collector.test.lua | 22 ++++++-------- vshard/lua_gc.lua | 54 +++++++++++++++++++++++++++++++++ vshard/router/init.lua | 19 +++--------- vshard/storage/init.lua | 20 ++++-------- 7 files changed, 116 insertions(+), 71 deletions(-) create mode 100644 vshard/lua_gc.lua diff --git a/test/router/garbage_collector.result b/test/router/garbage_collector.result index 3c2a4f1..a7474fc 100644 --- a/test/router/garbage_collector.result +++ b/test/router/garbage_collector.result @@ -40,27 +40,30 @@ test_run:switch('router_1') fiber = require('fiber') --- ... -cfg.collect_lua_garbage = true +lua_gc = require('vshard.lua_gc') --- ... -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DISCOVERY_INTERVAL +cfg.collect_lua_garbage = true --- ... vshard.router.cfg(cfg) --- ...
-a = setmetatable({}, {__mode = 'v'}) +lua_gc.internal.bg_fiber ~= nil +--- +- true +... +iterations = lua_gc.internal.iterations --- ... -a.k = {b = 100} +lua_gc.internal.bg_fiber:wakeup() --- ... -for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end +while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end --- ... -a.k +lua_gc.internal.interval = 0.001 --- -- null ... cfg.collect_lua_garbage = false --- @@ -68,13 +71,17 @@ cfg.collect_lua_garbage = false vshard.router.cfg(cfg) --- ... -a.k = {b = 100} +lua_gc.internal.bg_fiber == nil +--- +- true +... +iterations = lua_gc.internal.iterations --- ... -for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end +fiber.sleep(0.01) --- ... -a.k ~= nil +iterations == lua_gc.internal.iterations --- - true ... diff --git a/test/router/garbage_collector.test.lua b/test/router/garbage_collector.test.lua index b3411cd..d1da8e9 100644 --- a/test/router/garbage_collector.test.lua +++ b/test/router/garbage_collector.test.lua @@ -13,18 +13,20 @@ test_run:cmd("start server router_1") -- test_run:switch('router_1') fiber = require('fiber') +lua_gc = require('vshard.lua_gc') cfg.collect_lua_garbage = true -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DISCOVERY_INTERVAL vshard.router.cfg(cfg) -a = setmetatable({}, {__mode = 'v'}) -a.k = {b = 100} -for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end -a.k +lua_gc.internal.bg_fiber ~= nil +iterations = lua_gc.internal.iterations +lua_gc.internal.bg_fiber:wakeup() +while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end +lua_gc.internal.interval = 0.001 cfg.collect_lua_garbage = false vshard.router.cfg(cfg) -a.k = {b = 100} -for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end -a.k ~= nil +lua_gc.internal.bg_fiber == nil +iterations = lua_gc.internal.iterations +fiber.sleep(0.01) +iterations == lua_gc.internal.iterations 
test_run:switch("default") test_run:cmd("stop server router_1") diff --git a/test/storage/garbage_collector.result b/test/storage/garbage_collector.result index 3588fb4..d94ba24 100644 --- a/test/storage/garbage_collector.result +++ b/test/storage/garbage_collector.result @@ -120,7 +120,7 @@ test_run:switch('storage_1_a') fiber = require('fiber') --- ... -log = require('log') +lua_gc = require('vshard.lua_gc') --- ... cfg.collect_lua_garbage = true @@ -129,24 +129,21 @@ cfg.collect_lua_garbage = true vshard.storage.cfg(cfg, names.storage_1_a) --- ... --- Create a weak reference to a able {b = 100} - it must be --- deleted on the next GC. -a = setmetatable({}, {__mode = 'v'}) +lua_gc.internal.bg_fiber ~= nil --- +- true ... -a.k = {b = 100} +iterations = lua_gc.internal.iterations --- ... -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL +lua_gc.internal.bg_fiber:wakeup() --- ... --- Wait until Lua GC deletes a.k. -for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end +while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end --- ... -a.k +lua_gc.internal.interval = 0.001 --- -- null ... cfg.collect_lua_garbage = false --- @@ -154,13 +151,17 @@ cfg.collect_lua_garbage = false vshard.storage.cfg(cfg, names.storage_1_a) --- ... -a.k = {b = 100} +lua_gc.internal.bg_fiber == nil +--- +- true +... +iterations = lua_gc.internal.iterations --- ... -for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end +fiber.sleep(0.01) --- ... -a.k ~= nil +iterations == lua_gc.internal.iterations --- - true ... 
diff --git a/test/storage/garbage_collector.test.lua b/test/storage/garbage_collector.test.lua index 79e76d8..ee3ecf4 100644 --- a/test/storage/garbage_collector.test.lua +++ b/test/storage/garbage_collector.test.lua @@ -46,22 +46,20 @@ customer:select{} -- test_run:switch('storage_1_a') fiber = require('fiber') -log = require('log') +lua_gc = require('vshard.lua_gc') cfg.collect_lua_garbage = true vshard.storage.cfg(cfg, names.storage_1_a) --- Create a weak reference to a able {b = 100} - it must be --- deleted on the next GC. -a = setmetatable({}, {__mode = 'v'}) -a.k = {b = 100} -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL --- Wait until Lua GC deletes a.k. -for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end -a.k +lua_gc.internal.bg_fiber ~= nil +iterations = lua_gc.internal.iterations +lua_gc.internal.bg_fiber:wakeup() +while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end +lua_gc.internal.interval = 0.001 cfg.collect_lua_garbage = false vshard.storage.cfg(cfg, names.storage_1_a) -a.k = {b = 100} -for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end -a.k ~= nil +lua_gc.internal.bg_fiber == nil +iterations = lua_gc.internal.iterations +fiber.sleep(0.01) +iterations == lua_gc.internal.iterations test_run:switch('default') test_run:drop_cluster(REPLICASET_2) diff --git a/vshard/lua_gc.lua b/vshard/lua_gc.lua new file mode 100644 index 0000000..8d6af3e --- /dev/null +++ b/vshard/lua_gc.lua @@ -0,0 +1,54 @@ +-- +-- This module implements background lua GC fiber. +-- It's purpose is to make GC more aggressive. +-- + +local lfiber = require('fiber') +local MODULE_INTERNALS = '__module_vshard_lua_gc' + +local M = rawget(_G, MODULE_INTERNALS) +if not M then + M = { + -- Background fiber. + bg_fiber = nil, + -- GC interval in seconds. + interval = nil, + -- Main loop. + -- Stored here to make the fiber reloadable. 
+ main_loop = nil, + -- Number of `collectgarbage()` calls. + iterations = 0, + } +end +local DEFALUT_INTERVAL = 100 + +M.main_loop = function() + lfiber.sleep(M.interval or DEFALUT_INTERVAL) + collectgarbage() + M.iterations = M.iterations + 1 + return M.main_loop() +end + +local function set_state(active, interval) + M.inverval = interval + if active and not M.bg_fiber then + M.bg_fiber = lfiber.create(M.main_loop) + M.bg_fiber:name('vshard.lua_gc') + end + if not active and M.bg_fiber then + M.bg_fiber:cancel() + M.bg_fiber = nil + end + if active then + M.bg_fiber:wakeup() + end +end + +if not rawget(_G, MODULE_INTERNALS) then + rawset(_G, MODULE_INTERNALS, M) +end + +return { + set_state = set_state, + internal = M, +} diff --git a/vshard/router/init.lua b/vshard/router/init.lua index e2b2b22..3e127cb 100644 --- a/vshard/router/init.lua +++ b/vshard/router/init.lua @@ -7,6 +7,7 @@ if rawget(_G, MODULE_INTERNALS) then local vshard_modules = { 'vshard.consts', 'vshard.error', 'vshard.cfg', 'vshard.hash', 'vshard.replicaset', 'vshard.util', + 'vshard.lua_gc', } for _, module in pairs(vshard_modules) do package.loaded[module] = nil @@ -18,6 +19,7 @@ local lcfg = require('vshard.cfg') local lhash = require('vshard.hash') local lreplicaset = require('vshard.replicaset') local util = require('vshard.util') +local lua_gc = require('vshard.lua_gc') local M = rawget(_G, MODULE_INTERNALS) if not M then @@ -43,8 +45,7 @@ if not M then discovery_fiber = nil, -- Bucket count stored on all replicasets. total_bucket_count = 0, - -- If true, then discovery fiber starts to call - -- collectgarbage() periodically. + -- Boolean lua_gc state (create periodic gc task). collect_lua_garbage = nil, -- This counter is used to restart background fibers with -- new reloaded code. 
@@ -151,8 +152,6 @@ end -- local function discovery_f() local module_version = M.module_version - local iterations_until_lua_gc = - consts.COLLECT_LUA_GARBAGE_INTERVAL / consts.DISCOVERY_INTERVAL while module_version == M.module_version do while not next(M.replicasets) do lfiber.sleep(consts.DISCOVERY_INTERVAL) @@ -188,12 +187,6 @@ local function discovery_f() M.route_map[bucket_id] = replicaset end end - iterations_until_lua_gc = iterations_until_lua_gc - 1 - if M.collect_lua_garbage and iterations_until_lua_gc == 0 then - iterations_until_lua_gc = - consts.COLLECT_LUA_GARBAGE_INTERVAL / consts.DISCOVERY_INTERVAL - collectgarbage() - end lfiber.sleep(consts.DISCOVERY_INTERVAL) end end @@ -504,7 +497,6 @@ local function router_cfg(cfg) end local new_replicasets = lreplicaset.buildall(vshard_cfg) local total_bucket_count = vshard_cfg.bucket_count - local collect_lua_garbage = vshard_cfg.collect_lua_garbage log.info("Calling box.cfg()...") for k, v in pairs(box_cfg) do log.info({[k] = v}) @@ -531,7 +523,7 @@ local function router_cfg(cfg) vshard_cfg.connection_outdate_delay) M.connection_outdate_delay = vshard_cfg.connection_outdate_delay M.total_bucket_count = total_bucket_count - M.collect_lua_garbage = collect_lua_garbage + M.collect_lua_garbage = vshard_cfg.collect_lua_garbage M.current_cfg = vshard_cfg M.replicasets = new_replicasets -- Update existing route map in-place. @@ -548,8 +540,7 @@ local function router_cfg(cfg) M.discovery_fiber = util.reloadable_fiber_create( 'vshard.discovery', M, 'discovery_f') end - -- Destroy connections, not used in a new configuration. 
- collectgarbage() + lua_gc.set_state(M.collect_lua_garbage, consts.COLLECT_LUA_GARBAGE_INTERVAL) end -------------------------------------------------------------------------------- diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua index 75f5df9..1e11960 100644 --- a/vshard/storage/init.lua +++ b/vshard/storage/init.lua @@ -10,7 +10,8 @@ if rawget(_G, MODULE_INTERNALS) then local vshard_modules = { 'vshard.consts', 'vshard.error', 'vshard.cfg', 'vshard.replicaset', 'vshard.util', - 'vshard.storage.reload_evolution' + 'vshard.storage.reload_evolution', + 'vshard.lua_gc', } for _, module in pairs(vshard_modules) do package.loaded[module] = nil @@ -21,6 +22,7 @@ local lerror = require('vshard.error') local lcfg = require('vshard.cfg') local lreplicaset = require('vshard.replicaset') local util = require('vshard.util') +local lua_gc = require('vshard.lua_gc') local reload_evolution = require('vshard.storage.reload_evolution') local M = rawget(_G, MODULE_INTERNALS) @@ -75,8 +77,7 @@ if not M then collect_bucket_garbage_fiber = nil, -- Do buckets garbage collection once per this time. collect_bucket_garbage_interval = nil, - -- If true, then bucket garbage collection fiber starts to - -- call collectgarbage() periodically. + -- Boolean lua_gc state (create periodic gc task). collect_lua_garbage = nil, -------------------- Bucket recovery --------------------- @@ -1063,9 +1064,6 @@ function collect_garbage_f() -- buckets_for_redirect is deleted, it gets empty_sent_buckets -- for next deletion. local empty_sent_buckets = {} - local iterations_until_lua_gc = - consts.COLLECT_LUA_GARBAGE_INTERVAL / M.collect_bucket_garbage_interval - while M.module_version == module_version do -- Check if no changes in buckets configuration. 
if control.bucket_generation_collected ~= control.bucket_generation then @@ -1106,12 +1104,6 @@ function collect_garbage_f() end end ::continue:: - iterations_until_lua_gc = iterations_until_lua_gc - 1 - if iterations_until_lua_gc == 0 and M.collect_lua_garbage then - iterations_until_lua_gc = consts.COLLECT_LUA_GARBAGE_INTERVAL / - M.collect_bucket_garbage_interval - collectgarbage() - end lfiber.sleep(M.collect_bucket_garbage_interval) end end @@ -1590,7 +1582,6 @@ local function storage_cfg(cfg, this_replica_uuid) local shard_index = vshard_cfg.shard_index local collect_bucket_garbage_interval = vshard_cfg.collect_bucket_garbage_interval - local collect_lua_garbage = vshard_cfg.collect_lua_garbage -- It is considered that all possible errors during cfg -- process occur only before this place. @@ -1646,7 +1637,7 @@ local function storage_cfg(cfg, this_replica_uuid) M.rebalancer_max_receiving = rebalancer_max_receiving M.shard_index = shard_index M.collect_bucket_garbage_interval = collect_bucket_garbage_interval - M.collect_lua_garbage = collect_lua_garbage + M.collect_lua_garbage = vshard_cfg.collect_lua_garbage M.current_cfg = vshard_cfg if was_master and not is_master then @@ -1671,6 +1662,7 @@ local function storage_cfg(cfg, this_replica_uuid) M.rebalancer_fiber:cancel() M.rebalancer_fiber = nil end + lua_gc.set_state(M.collect_lua_garbage, consts.COLLECT_LUA_GARBAGE_INTERVAL) -- Destroy connections, not used in a new configuration. collectgarbage() end -- 2.14.1 ^ permalink raw reply [flat|nested] 23+ messages in thread
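The `MODULE_INTERNALS` idiom in `vshard/lua_gc.lua` above is what makes the fiber state survive a hot reload: the state table is published in `_G`, so when the module code is re-required after `package.loaded` is cleared, it picks up the old table instead of starting from scratch. A stripped-down, vshard-free sketch of the pattern (the module name and fields here are illustrative only):

```lua
-- Reload-safe singleton state, modelled on vshard's MODULE_INTERNALS trick.
local MODULE_INTERNALS = '__module_example_singleton'

-- On the first load, create the state table and publish it in _G; on a
-- later load of the same chunk, reuse the published table.
local M = rawget(_G, MODULE_INTERNALS)
if not M then
    M = {
        iterations = 0, -- this counter survives reloads
    }
    rawset(_G, MODULE_INTERNALS, M)
end

local function bump()
    M.iterations = M.iterations + 1
    return M.iterations
end

print(bump()) -- 1

-- Simulate a module reload: the lookup finds the already-published
-- table, so the counter is not reset back to zero.
local M2 = rawget(_G, MODULE_INTERNALS)
print(M2.iterations) -- 1
```

Storing `main_loop` on `M` itself (as the patch does) follows the same logic: the running fiber calls the function through the table, so a reload can swap in new loop code without restarting the fiber.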
* [tarantool-patches] Re: [PATCH 2/3] Move lua gc to a dedicated module 2018-07-31 16:25 ` [tarantool-patches] [PATCH 2/3] Move lua gc to a dedicated module AKhatskevich @ 2018-08-01 18:43 ` Vladislav Shpilevoy 2018-08-03 20:04 ` Alex Khatskevich 0 siblings, 1 reply; 23+ messages in thread From: Vladislav Shpilevoy @ 2018-08-01 18:43 UTC (permalink / raw) To: tarantool-patches, AKhatskevich Thanks for the patch! See 4 comments below. On 31/07/2018 19:25, AKhatskevich wrote: > `vshard.lua_gc.lua` is a new module which helps make gc work more > intense. > Before the commit that was a duty of router and storage. > > Reasons to move lua gc to a separate module: > 1. It is not a duty of vshard to collect garbage, so let gc fiber > be as far from vshard as possible. > 2. Next commits will introduce multiple routers feature, which require > gc fiber to be a singleton. > > Closes #138 > --- > test/router/garbage_collector.result | 27 +++++++++++------ > test/router/garbage_collector.test.lua | 18 ++++++----- > test/storage/garbage_collector.result | 27 +++++++++-------- > test/storage/garbage_collector.test.lua | 22 ++++++-------- > vshard/lua_gc.lua | 54 +++++++++++++++++++++++++++++++++ > vshard/router/init.lua | 19 +++--------- > vshard/storage/init.lua | 20 ++++-------- > 7 files changed, 116 insertions(+), 71 deletions(-) > create mode 100644 vshard/lua_gc.lua > > diff --git a/test/router/garbage_collector.result b/test/router/garbage_collector.result > index 3c2a4f1..a7474fc 100644 > --- a/test/router/garbage_collector.result > +++ b/test/router/garbage_collector.result > @@ -40,27 +40,30 @@ test_run:switch('router_1') > fiber = require('fiber') > --- > ... > -cfg.collect_lua_garbage = true > +lua_gc = require('vshard.lua_gc') > --- > ... > -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DISCOVERY_INTERVAL > +cfg.collect_lua_garbage = true 1. Now this code tests nothing but just fibers. 
Below you do wakeup and check that the iteration counter is increased, but that is an obvious thing. Before your patch the test really tested that GC is called by checking for nullified weak references. Now I can remove collectgarbage() from the main_loop and nothing would change. Please make this test a real test. Moreover, the test hangs forever both locally and on Travis. > diff --git a/test/storage/garbage_collector.result b/test/storage/garbage_collector.result > index 3588fb4..d94ba24 100644 > --- a/test/storage/garbage_collector.result > +++ b/test/storage/garbage_collector.result 2. Same. Now the test passes even if I removed collectgarbage() from the main loop. > diff --git a/vshard/lua_gc.lua b/vshard/lua_gc.lua > new file mode 100644 > index 0000000..8d6af3e > --- /dev/null > +++ b/vshard/lua_gc.lua > @@ -0,0 +1,54 @@ > +-- > +-- This module implements background lua GC fiber. > +-- It's purpose is to make GC more aggressive. > +-- > + > +local lfiber = require('fiber') > +local MODULE_INTERNALS = '__module_vshard_lua_gc' > + > +local M = rawget(_G, MODULE_INTERNALS) > +if not M then > + M = { > + -- Background fiber. > + bg_fiber = nil, > + -- GC interval in seconds. > + interval = nil, > + -- Main loop. > + -- Stored here to make the fiber reloadable. > + main_loop = nil, > + -- Number of `collectgarbage()` calls. > + iterations = 0, > + } > +end > +local DEFALUT_INTERVAL = 100 > > 3. For constants please use vshard.consts. > > 4. You should not choose the interval inside the main_loop. > Please, use a 'default' option in cfg.lua. ^ permalink raw reply [flat|nested] 23+ messages in thread
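The reviewer's first point can be reproduced outside vshard: a value held only through a weak-valued table disappears after `collectgarbage()`, which is exactly what made the original test prove that GC really ran. A standalone illustration (plain Lua 5.1/LuaJIT, no tarantool needed):

```lua
-- A table with weak values: its entries do not keep their values alive.
local a = setmetatable({}, {__mode = 'v'})

-- Create the value inside a helper so no interpreter register keeps a
-- strong reference to it after the call returns.
local function fill(t)
    t.k = {b = 100}
end
fill(a)
assert(a.k ~= nil) -- not collected yet

-- Run full cycles: the only remaining reference to {b = 100} is weak,
-- so the entry vanishes.
collectgarbage('collect')
collectgarbage('collect')
assert(a.k == nil)
print('weak value collected')
```

A test built on this observes GC directly; a test that only watches an iteration counter passes even if the `collectgarbage()` call is deleted from the loop, which is the reviewer's complaint.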
* [tarantool-patches] Re: [PATCH 2/3] Move lua gc to a dedicated module 2018-08-01 18:43 ` [tarantool-patches] " Vladislav Shpilevoy @ 2018-08-03 20:04 ` Alex Khatskevich 2018-08-06 17:03 ` Vladislav Shpilevoy 2018-08-08 11:17 ` Vladislav Shpilevoy 0 siblings, 2 replies; 23+ messages in thread From: Alex Khatskevich @ 2018-08-03 20:04 UTC (permalink / raw) To: Vladislav Shpilevoy, tarantool-patches On 01.08.2018 21:43, Vladislav Shpilevoy wrote: > Thanks for the patch! See 4 comments below. > > On 31/07/2018 19:25, AKhatskevich wrote: >> `vshard.lua_gc.lua` is a new module which helps make gc work more >> intense. >> Before the commit that was a duty of router and storage. >> >> Reasons to move lua gc to a separate module: >> 1. It is not a duty of vshard to collect garbage, so let gc fiber >> be as far from vshard as possible. >> 2. Next commits will introduce multiple routers feature, which require >> gc fiber to be a singleton. >> >> Closes #138 >> --- >> test/router/garbage_collector.result | 27 +++++++++++------ >> test/router/garbage_collector.test.lua | 18 ++++++----- >> test/storage/garbage_collector.result | 27 +++++++++-------- >> test/storage/garbage_collector.test.lua | 22 ++++++-------- >> vshard/lua_gc.lua | 54 >> +++++++++++++++++++++++++++++++++ >> vshard/router/init.lua | 19 +++--------- >> vshard/storage/init.lua | 20 ++++-------- >> 7 files changed, 116 insertions(+), 71 deletions(-) >> create mode 100644 vshard/lua_gc.lua >> >> diff --git a/test/router/garbage_collector.result >> b/test/router/garbage_collector.result >> index 3c2a4f1..a7474fc 100644 >> --- a/test/router/garbage_collector.result >> +++ b/test/router/garbage_collector.result >> @@ -40,27 +40,30 @@ test_run:switch('router_1') >> fiber = require('fiber') >> --- >> ... >> -cfg.collect_lua_garbage = true >> +lua_gc = require('vshard.lua_gc') >> --- >> ... 
>> -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / >> vshard.consts.DISCOVERY_INTERVAL >> +cfg.collect_lua_garbage = true > > 1. Now this code tests nothing but just fibers. Below you do wakeup > and check that the iteration counter is increased, but that is an obvious > thing. Before your patch the test really tested that GC is called > by checking for nullified weak references. Now I can remove > collectgarbage() > from the main_loop and nothing would change. Please make this test > a real test. The GC test has been restored. > > Moreover, the test hangs forever both locally and on Travis. Fixed > >> diff --git a/test/storage/garbage_collector.result >> b/test/storage/garbage_collector.result >> index 3588fb4..d94ba24 100644 >> --- a/test/storage/garbage_collector.result >> +++ b/test/storage/garbage_collector.result > > 2. Same. Now the test passes even if I removed collectgarbage() from > the main loop. Restored. > >> diff --git a/vshard/lua_gc.lua b/vshard/lua_gc.lua >> new file mode 100644 >> index 0000000..8d6af3e >> --- /dev/null >> +++ b/vshard/lua_gc.lua >> @@ -0,0 +1,54 @@ >> +-- >> +-- This module implements background lua GC fiber. >> +-- It's purpose is to make GC more aggressive. >> +-- >> + >> +local lfiber = require('fiber') >> +local MODULE_INTERNALS = '__module_vshard_lua_gc' >> + >> +local M = rawget(_G, MODULE_INTERNALS) >> +if not M then >> + M = { >> + -- Background fiber. >> + bg_fiber = nil, >> + -- GC interval in seconds. >> + interval = nil, >> + -- Main loop. >> + -- Stored here to make the fiber reloadable. >> + main_loop = nil, >> + -- Number of `collectgarbage()` calls. >> + iterations = 0, >> + } >> +end >> +local DEFALUT_INTERVAL = 100 > > 3. For constants please use vshard.consts. > > 4. You should not choose the interval inside the main_loop. > Please, use a 'default' option in cfg.lua. DEFAULT_INTERVAL has been removed entirely. The interval value is now required.
full diff commit ec221bd060f46e4dc009eaab1c6c1bd1cf5a4150 Author: AKhatskevich <avkhatskevich@tarantool.org> Date: Thu Jul 26 01:17:00 2018 +0300 Move lua gc to a dedicated module `vshard.lua_gc.lua` is a new module which helps make gc work more intense. Before the commit that was a duty of router and storage. Reasons to move lua gc to a separate module: 1. It is not a duty of vshard to collect garbage, so let gc fiber be as far from vshard as possible. 2. Next commits will introduce multiple routers feature, which require gc fiber to be a singleton. Closes #138 diff --git a/test/router/garbage_collector.result b/test/router/garbage_collector.result index 3c2a4f1..7780046 100644 --- a/test/router/garbage_collector.result +++ b/test/router/garbage_collector.result @@ -40,41 +40,59 @@ test_run:switch('router_1') fiber = require('fiber') --- ... -cfg.collect_lua_garbage = true +lua_gc = require('vshard.lua_gc') --- ... -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DISCOVERY_INTERVAL +cfg.collect_lua_garbage = true --- ... vshard.router.cfg(cfg) --- ... +lua_gc.internal.bg_fiber ~= nil +--- +- true +... +-- Check that `collectgarbage()` was really called. a = setmetatable({}, {__mode = 'v'}) --- ... a.k = {b = 100} --- ... -for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end +iterations = lua_gc.internal.iterations +--- +... +lua_gc.internal.bg_fiber:wakeup() +--- +... +while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end --- ... a.k --- - null ... +lua_gc.internal.interval = 0.001 +--- +... cfg.collect_lua_garbage = false --- ... vshard.router.cfg(cfg) --- ... -a.k = {b = 100} +lua_gc.internal.bg_fiber == nil +--- +- true +... +iterations = lua_gc.internal.iterations --- ... -for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end +fiber.sleep(0.01) --- ... -a.k ~= nil +iterations == lua_gc.internal.iterations --- - true ... 
diff --git a/test/router/garbage_collector.test.lua b/test/router/garbage_collector.test.lua index b3411cd..e8d0876 100644 --- a/test/router/garbage_collector.test.lua +++ b/test/router/garbage_collector.test.lua @@ -13,18 +13,24 @@ test_run:cmd("start server router_1") -- test_run:switch('router_1') fiber = require('fiber') +lua_gc = require('vshard.lua_gc') cfg.collect_lua_garbage = true -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DISCOVERY_INTERVAL vshard.router.cfg(cfg) +lua_gc.internal.bg_fiber ~= nil +-- Check that `collectgarbage()` was really called. a = setmetatable({}, {__mode = 'v'}) a.k = {b = 100} -for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end +iterations = lua_gc.internal.iterations +lua_gc.internal.bg_fiber:wakeup() +while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end a.k +lua_gc.internal.interval = 0.001 cfg.collect_lua_garbage = false vshard.router.cfg(cfg) -a.k = {b = 100} -for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end -a.k ~= nil +lua_gc.internal.bg_fiber == nil +iterations = lua_gc.internal.iterations +fiber.sleep(0.01) +iterations == lua_gc.internal.iterations test_run:switch("default") test_run:cmd("stop server router_1") diff --git a/test/storage/garbage_collector.result b/test/storage/garbage_collector.result index 3588fb4..6bec2db 100644 --- a/test/storage/garbage_collector.result +++ b/test/storage/garbage_collector.result @@ -120,7 +120,7 @@ test_run:switch('storage_1_a') fiber = require('fiber') --- ... -log = require('log') +lua_gc = require('vshard.lua_gc') --- ... cfg.collect_lua_garbage = true @@ -129,38 +129,50 @@ cfg.collect_lua_garbage = true vshard.storage.cfg(cfg, names.storage_1_a) --- ... --- Create a weak reference to a able {b = 100} - it must be --- deleted on the next GC. +lua_gc.internal.bg_fiber ~= nil +--- +- true +... +-- Check that `collectgarbage()` was really called. 
a = setmetatable({}, {__mode = 'v'}) --- ... a.k = {b = 100} --- ... -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL +iterations = lua_gc.internal.iterations --- ... --- Wait until Lua GC deletes a.k. -for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end +lua_gc.internal.bg_fiber:wakeup() +--- +... +while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end --- ... a.k --- - null ... +lua_gc.internal.interval = 0.001 +--- +... cfg.collect_lua_garbage = false --- ... vshard.storage.cfg(cfg, names.storage_1_a) --- ... -a.k = {b = 100} +lua_gc.internal.bg_fiber == nil +--- +- true +... +iterations = lua_gc.internal.iterations --- ... -for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end +fiber.sleep(0.01) --- ... -a.k ~= nil +iterations == lua_gc.internal.iterations --- - true ... diff --git a/test/storage/garbage_collector.test.lua b/test/storage/garbage_collector.test.lua index 79e76d8..407b8a1 100644 --- a/test/storage/garbage_collector.test.lua +++ b/test/storage/garbage_collector.test.lua @@ -46,22 +46,24 @@ customer:select{} -- test_run:switch('storage_1_a') fiber = require('fiber') -log = require('log') +lua_gc = require('vshard.lua_gc') cfg.collect_lua_garbage = true vshard.storage.cfg(cfg, names.storage_1_a) --- Create a weak reference to a able {b = 100} - it must be --- deleted on the next GC. +lua_gc.internal.bg_fiber ~= nil +-- Check that `collectgarbage()` was really called. a = setmetatable({}, {__mode = 'v'}) a.k = {b = 100} -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL --- Wait until Lua GC deletes a.k. 
-for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end +iterations = lua_gc.internal.iterations +lua_gc.internal.bg_fiber:wakeup() +while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end a.k +lua_gc.internal.interval = 0.001 cfg.collect_lua_garbage = false vshard.storage.cfg(cfg, names.storage_1_a) -a.k = {b = 100} -for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end -a.k ~= nil +lua_gc.internal.bg_fiber == nil +iterations = lua_gc.internal.iterations +fiber.sleep(0.01) +iterations == lua_gc.internal.iterations test_run:switch('default') test_run:drop_cluster(REPLICASET_2) diff --git a/vshard/lua_gc.lua b/vshard/lua_gc.lua new file mode 100644 index 0000000..c6c5cd3 --- /dev/null +++ b/vshard/lua_gc.lua @@ -0,0 +1,54 @@ +-- +-- This module implements background lua GC fiber. +-- It's purpose is to make GC more aggressive. +-- + +local lfiber = require('fiber') +local MODULE_INTERNALS = '__module_vshard_lua_gc' + +local M = rawget(_G, MODULE_INTERNALS) +if not M then + M = { + -- Background fiber. + bg_fiber = nil, + -- GC interval in seconds. + interval = nil, + -- Main loop. + -- Stored here to make the fiber reloadable. + main_loop = nil, + -- Number of `collectgarbage()` calls. 
+ iterations = 0, + } +end + +M.main_loop = function() + lfiber.sleep(M.interval) + collectgarbage() + M.iterations = M.iterations + 1 + return M.main_loop() +end + +local function set_state(active, interval) + assert(type(interval) == 'number') + M.interval = interval + if active and not M.bg_fiber then + M.bg_fiber = lfiber.create(M.main_loop) + M.bg_fiber:name('vshard.lua_gc') + end + if not active and M.bg_fiber then + M.bg_fiber:cancel() + M.bg_fiber = nil + end + if active then + M.bg_fiber:wakeup() + end +end + +if not rawget(_G, MODULE_INTERNALS) then + rawset(_G, MODULE_INTERNALS, M) +end + +return { + set_state = set_state, + internal = M, +} diff --git a/vshard/router/init.lua b/vshard/router/init.lua index e2b2b22..3e127cb 100644 --- a/vshard/router/init.lua +++ b/vshard/router/init.lua @@ -7,6 +7,7 @@ if rawget(_G, MODULE_INTERNALS) then local vshard_modules = { 'vshard.consts', 'vshard.error', 'vshard.cfg', 'vshard.hash', 'vshard.replicaset', 'vshard.util', + 'vshard.lua_gc', } for _, module in pairs(vshard_modules) do package.loaded[module] = nil @@ -18,6 +19,7 @@ local lcfg = require('vshard.cfg') local lhash = require('vshard.hash') local lreplicaset = require('vshard.replicaset') local util = require('vshard.util') +local lua_gc = require('vshard.lua_gc') local M = rawget(_G, MODULE_INTERNALS) if not M then @@ -43,8 +45,7 @@ if not M then discovery_fiber = nil, -- Bucket count stored on all replicasets. total_bucket_count = 0, - -- If true, then discovery fiber starts to call - -- collectgarbage() periodically. + -- Boolean lua_gc state (create periodic gc task). collect_lua_garbage = nil, -- This counter is used to restart background fibers with -- new reloaded code. 
@@ -151,8 +152,6 @@ end -- local function discovery_f() local module_version = M.module_version - local iterations_until_lua_gc = - consts.COLLECT_LUA_GARBAGE_INTERVAL / consts.DISCOVERY_INTERVAL while module_version == M.module_version do while not next(M.replicasets) do lfiber.sleep(consts.DISCOVERY_INTERVAL) @@ -188,12 +187,6 @@ local function discovery_f() M.route_map[bucket_id] = replicaset end end - iterations_until_lua_gc = iterations_until_lua_gc - 1 - if M.collect_lua_garbage and iterations_until_lua_gc == 0 then - iterations_until_lua_gc = - consts.COLLECT_LUA_GARBAGE_INTERVAL / consts.DISCOVERY_INTERVAL - collectgarbage() - end lfiber.sleep(consts.DISCOVERY_INTERVAL) end end @@ -504,7 +497,6 @@ local function router_cfg(cfg) end local new_replicasets = lreplicaset.buildall(vshard_cfg) local total_bucket_count = vshard_cfg.bucket_count - local collect_lua_garbage = vshard_cfg.collect_lua_garbage log.info("Calling box.cfg()...") for k, v in pairs(box_cfg) do log.info({[k] = v}) @@ -531,7 +523,7 @@ local function router_cfg(cfg) vshard_cfg.connection_outdate_delay) M.connection_outdate_delay = vshard_cfg.connection_outdate_delay M.total_bucket_count = total_bucket_count - M.collect_lua_garbage = collect_lua_garbage + M.collect_lua_garbage = vshard_cfg.collect_lua_garbage M.current_cfg = vshard_cfg M.replicasets = new_replicasets -- Update existing route map in-place. @@ -548,8 +540,7 @@ local function router_cfg(cfg) M.discovery_fiber = util.reloadable_fiber_create( 'vshard.discovery', M, 'discovery_f') end - -- Destroy connections, not used in a new configuration. 
- collectgarbage() + lua_gc.set_state(M.collect_lua_garbage, consts.COLLECT_LUA_GARBAGE_INTERVAL) end -------------------------------------------------------------------------------- diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua index 40216ea..3e29e9d 100644 --- a/vshard/storage/init.lua +++ b/vshard/storage/init.lua @@ -10,7 +10,8 @@ if rawget(_G, MODULE_INTERNALS) then local vshard_modules = { 'vshard.consts', 'vshard.error', 'vshard.cfg', 'vshard.replicaset', 'vshard.util', - 'vshard.storage.reload_evolution' + 'vshard.storage.reload_evolution', + 'vshard.lua_gc', } for _, module in pairs(vshard_modules) do package.loaded[module] = nil @@ -21,6 +22,7 @@ local lerror = require('vshard.error') local lcfg = require('vshard.cfg') local lreplicaset = require('vshard.replicaset') local util = require('vshard.util') +local lua_gc = require('vshard.lua_gc') local reload_evolution = require('vshard.storage.reload_evolution') local M = rawget(_G, MODULE_INTERNALS) @@ -75,8 +77,7 @@ if not M then collect_bucket_garbage_fiber = nil, -- Do buckets garbage collection once per this time. collect_bucket_garbage_interval = nil, - -- If true, then bucket garbage collection fiber starts to - -- call collectgarbage() periodically. + -- Boolean lua_gc state (create periodic gc task). collect_lua_garbage = nil, -------------------- Bucket recovery --------------------- @@ -1063,9 +1064,6 @@ function collect_garbage_f() -- buckets_for_redirect is deleted, it gets empty_sent_buckets -- for next deletion. local empty_sent_buckets = {} - local iterations_until_lua_gc = - consts.COLLECT_LUA_GARBAGE_INTERVAL / M.collect_bucket_garbage_interval - while M.module_version == module_version do -- Check if no changes in buckets configuration. 
if control.bucket_generation_collected ~= control.bucket_generation then @@ -1106,12 +1104,6 @@ function collect_garbage_f() end end ::continue:: - iterations_until_lua_gc = iterations_until_lua_gc - 1 - if iterations_until_lua_gc == 0 and M.collect_lua_garbage then - iterations_until_lua_gc = consts.COLLECT_LUA_GARBAGE_INTERVAL / - M.collect_bucket_garbage_interval - collectgarbage() - end lfiber.sleep(M.collect_bucket_garbage_interval) end end @@ -1586,7 +1578,6 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload) local shard_index = vshard_cfg.shard_index local collect_bucket_garbage_interval = vshard_cfg.collect_bucket_garbage_interval - local collect_lua_garbage = vshard_cfg.collect_lua_garbage -- It is considered that all possible errors during cfg -- process occur only before this place. @@ -1641,7 +1632,7 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload) M.rebalancer_max_receiving = rebalancer_max_receiving M.shard_index = shard_index M.collect_bucket_garbage_interval = collect_bucket_garbage_interval - M.collect_lua_garbage = collect_lua_garbage + M.collect_lua_garbage = vshard_cfg.collect_lua_garbage M.current_cfg = vshard_cfg if was_master and not is_master then @@ -1666,6 +1657,7 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload) M.rebalancer_fiber:cancel() M.rebalancer_fiber = nil end + lua_gc.set_state(M.collect_lua_garbage, consts.COLLECT_LUA_GARBAGE_INTERVAL) -- Destroy connections, not used in a new configuration. collectgarbage() end ^ permalink raw reply [flat|nested] 23+ messages in thread
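The reworked tests in this patch verify GC in an observable way: a table held only through a weak-valued reference (`setmetatable({}, {__mode = 'v'})`) must be gone after `collectgarbage()` runs. The same check can be sketched outside Lua; here is a rough Python analog using `weakref` and `gc` (not part of the patch, and the `Payload` class is purely illustrative):

```python
import gc
import weakref

class Payload:
    """Stand-in for the Lua table {b = 100} stored in a weak-valued slot."""
    pass

obj = Payload()
ref = weakref.ref(obj)   # analogous to a.k in a table with __mode = 'v'

assert ref() is obj      # still reachable through the strong reference
del obj                  # drop the only strong reference
gc.collect()             # analogous to Lua's collectgarbage()

# After a full collection the weak reference is cleared -- exactly the
# observable effect the vshard tests assert on via `a.k` becoming null.
assert ref() is None
```

This is why the reviewer insisted the test keep the weak-reference check: asserting only that an iteration counter grows would pass even if `collectgarbage()` were removed from the main loop.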
* [tarantool-patches] Re: [PATCH 2/3] Move lua gc to a dedicated module 2018-08-03 20:04 ` Alex Khatskevich @ 2018-08-06 17:03 ` Vladislav Shpilevoy 2018-08-08 11:17 ` Vladislav Shpilevoy 1 sibling, 0 replies; 23+ messages in thread From: Vladislav Shpilevoy @ 2018-08-06 17:03 UTC (permalink / raw) To: tarantool-patches, Alex Khatskevich Thanks for the patch! It is LGTM but can not push since it depends on the previous one (cherry-pick shows conflicts). On 03/08/2018 23:04, Alex Khatskevich wrote: > > On 01.08.2018 21:43, Vladislav Shpilevoy wrote: >> Thanks for the patch! See 4 comments below. >> >> On 31/07/2018 19:25, AKhatskevich wrote: >>> `vshard.lua_gc.lua` is a new module which helps make gc work more >>> intense. >>> Before the commit that was a duty of router and storage. >>> >>> Reasons to move lua gc to a separate module: >>> 1. It is not a duty of vshard to collect garbage, so let gc fiber >>> be as far from vshard as possible. >>> 2. Next commits will introduce multiple routers feature, which require >>> gc fiber to be a singleton. >>> >>> Closes #138 >>> --- >>> test/router/garbage_collector.result | 27 +++++++++++------ >>> test/router/garbage_collector.test.lua | 18 ++++++----- >>> test/storage/garbage_collector.result | 27 +++++++++-------- >>> test/storage/garbage_collector.test.lua | 22 ++++++-------- >>> vshard/lua_gc.lua | 54 +++++++++++++++++++++++++++++++++ >>> vshard/router/init.lua | 19 +++--------- >>> vshard/storage/init.lua | 20 ++++-------- >>> 7 files changed, 116 insertions(+), 71 deletions(-) >>> create mode 100644 vshard/lua_gc.lua >>> >>> diff --git a/test/router/garbage_collector.result b/test/router/garbage_collector.result >>> index 3c2a4f1..a7474fc 100644 >>> --- a/test/router/garbage_collector.result >>> +++ b/test/router/garbage_collector.result >>> @@ -40,27 +40,30 @@ test_run:switch('router_1') >>> fiber = require('fiber') >>> --- >>> ... 
>>> -cfg.collect_lua_garbage = true >>> +lua_gc = require('vshard.lua_gc') >>> --- >>> ... >>> -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DISCOVERY_INTERVAL >>> +cfg.collect_lua_garbage = true >> >> 1. Now this code tests nothing but just fibers. Below you do wakeup >> and check that iteration counter is increased, but it is obvious >> thing. Before your patch the test really tested that GC is called >> by checking for nullified weak references. Now I can remove collectgarbage() >> from the main_loop and nothing would changed. Please, make this test >> be a test. > GC test returned back. >> >> Moreover, the test hangs forever both locally and on Travis. > Fixed >> >>> diff --git a/test/storage/garbage_collector.result b/test/storage/garbage_collector.result >>> index 3588fb4..d94ba24 100644 >>> --- a/test/storage/garbage_collector.result >>> +++ b/test/storage/garbage_collector.result >> >> 2. Same. Now the test passes even if I removed collectgarbage() from >> the main loop. > returned. >> >>> diff --git a/vshard/lua_gc.lua b/vshard/lua_gc.lua >>> new file mode 100644 >>> index 0000000..8d6af3e >>> --- /dev/null >>> +++ b/vshard/lua_gc.lua >>> @@ -0,0 +1,54 @@ >>> +-- >>> +-- This module implements background lua GC fiber. >>> +-- It's purpose is to make GC more aggressive. >>> +-- >>> + >>> +local lfiber = require('fiber') >>> +local MODULE_INTERNALS = '__module_vshard_lua_gc' >>> + >>> +local M = rawget(_G, MODULE_INTERNALS) >>> +if not M then >>> + M = { >>> + -- Background fiber. >>> + bg_fiber = nil, >>> + -- GC interval in seconds. >>> + interval = nil, >>> + -- Main loop. >>> + -- Stored here to make the fiber reloadable. >>> + main_loop = nil, >>> + -- Number of `collectgarbage()` calls. >>> + iterations = 0, >>> + } >>> +end >>> +local DEFALUT_INTERVAL = 100 >> >> 3. For constants please use vshard.consts. >> >> 4. You should not choose interval inside the main_loop. >> Please, use 'default' option in cfg.lua. 
> DEFAULT_INTERVAL is removed at all. > Interval value is became required. > > > > full diff > > > > commit ec221bd060f46e4dc009eaab1c6c1bd1cf5a4150 > Author: AKhatskevich <avkhatskevich@tarantool.org> > Date: Thu Jul 26 01:17:00 2018 +0300 > > Move lua gc to a dedicated module > > `vshard.lua_gc.lua` is a new module which helps make gc work more > intense. > Before the commit that was a duty of router and storage. > > Reasons to move lua gc to a separate module: > 1. It is not a duty of vshard to collect garbage, so let gc fiber > be as far from vshard as possible. > 2. Next commits will introduce multiple routers feature, which require > gc fiber to be a singleton. > > Closes #138 > > diff --git a/test/router/garbage_collector.result b/test/router/garbage_collector.result > index 3c2a4f1..7780046 100644 > --- a/test/router/garbage_collector.result > +++ b/test/router/garbage_collector.result > @@ -40,41 +40,59 @@ test_run:switch('router_1') > fiber = require('fiber') > --- > ... > -cfg.collect_lua_garbage = true > +lua_gc = require('vshard.lua_gc') > --- > ... > -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DISCOVERY_INTERVAL > +cfg.collect_lua_garbage = true > --- > ... > vshard.router.cfg(cfg) > --- > ... > +lua_gc.internal.bg_fiber ~= nil > +--- > +- true > +... > +-- Check that `collectgarbage()` was really called. > a = setmetatable({}, {__mode = 'v'}) > --- > ... > a.k = {b = 100} > --- > ... > -for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end > +iterations = lua_gc.internal.iterations > +--- > +... > +lua_gc.internal.bg_fiber:wakeup() > +--- > +... > +while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end > --- > ... > a.k > --- > - null > ... > +lua_gc.internal.interval = 0.001 > +--- > +... > cfg.collect_lua_garbage = false > --- > ... > vshard.router.cfg(cfg) > --- > ... > -a.k = {b = 100} > +lua_gc.internal.bg_fiber == nil > +--- > +- true > +... 
> +iterations = lua_gc.internal.iterations > --- > ... > -for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end > +fiber.sleep(0.01) > --- > ... > -a.k ~= nil > +iterations == lua_gc.internal.iterations > --- > - true > ... > diff --git a/test/router/garbage_collector.test.lua b/test/router/garbage_collector.test.lua > index b3411cd..e8d0876 100644 > --- a/test/router/garbage_collector.test.lua > +++ b/test/router/garbage_collector.test.lua > @@ -13,18 +13,24 @@ test_run:cmd("start server router_1") > -- > test_run:switch('router_1') > fiber = require('fiber') > +lua_gc = require('vshard.lua_gc') > cfg.collect_lua_garbage = true > -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DISCOVERY_INTERVAL > vshard.router.cfg(cfg) > +lua_gc.internal.bg_fiber ~= nil > +-- Check that `collectgarbage()` was really called. > a = setmetatable({}, {__mode = 'v'}) > a.k = {b = 100} > -for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end > +iterations = lua_gc.internal.iterations > +lua_gc.internal.bg_fiber:wakeup() > +while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end > a.k > +lua_gc.internal.interval = 0.001 > cfg.collect_lua_garbage = false > vshard.router.cfg(cfg) > -a.k = {b = 100} > -for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end > -a.k ~= nil > +lua_gc.internal.bg_fiber == nil > +iterations = lua_gc.internal.iterations > +fiber.sleep(0.01) > +iterations == lua_gc.internal.iterations > > test_run:switch("default") > test_run:cmd("stop server router_1") > diff --git a/test/storage/garbage_collector.result b/test/storage/garbage_collector.result > index 3588fb4..6bec2db 100644 > --- a/test/storage/garbage_collector.result > +++ b/test/storage/garbage_collector.result > @@ -120,7 +120,7 @@ test_run:switch('storage_1_a') > fiber = require('fiber') > --- > ... > -log = require('log') > +lua_gc = require('vshard.lua_gc') > --- > ... 
> cfg.collect_lua_garbage = true > @@ -129,38 +129,50 @@ cfg.collect_lua_garbage = true > vshard.storage.cfg(cfg, names.storage_1_a) > --- > ... > --- Create a weak reference to a able {b = 100} - it must be > --- deleted on the next GC. > +lua_gc.internal.bg_fiber ~= nil > +--- > +- true > +... > +-- Check that `collectgarbage()` was really called. > a = setmetatable({}, {__mode = 'v'}) > --- > ... > a.k = {b = 100} > --- > ... > -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL > +iterations = lua_gc.internal.iterations > --- > ... > --- Wait until Lua GC deletes a.k. > -for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end > +lua_gc.internal.bg_fiber:wakeup() > +--- > +... > +while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end > --- > ... > a.k > --- > - null > ... > +lua_gc.internal.interval = 0.001 > +--- > +... > cfg.collect_lua_garbage = false > --- > ... > vshard.storage.cfg(cfg, names.storage_1_a) > --- > ... > -a.k = {b = 100} > +lua_gc.internal.bg_fiber == nil > +--- > +- true > +... > +iterations = lua_gc.internal.iterations > --- > ... > -for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end > +fiber.sleep(0.01) > --- > ... > -a.k ~= nil > +iterations == lua_gc.internal.iterations > --- > - true > ... > diff --git a/test/storage/garbage_collector.test.lua b/test/storage/garbage_collector.test.lua > index 79e76d8..407b8a1 100644 > --- a/test/storage/garbage_collector.test.lua > +++ b/test/storage/garbage_collector.test.lua > @@ -46,22 +46,24 @@ customer:select{} > -- > test_run:switch('storage_1_a') > fiber = require('fiber') > -log = require('log') > +lua_gc = require('vshard.lua_gc') > cfg.collect_lua_garbage = true > vshard.storage.cfg(cfg, names.storage_1_a) > --- Create a weak reference to a able {b = 100} - it must be > --- deleted on the next GC. 
> +lua_gc.internal.bg_fiber ~= nil > +-- Check that `collectgarbage()` was really called. > a = setmetatable({}, {__mode = 'v'}) > a.k = {b = 100} > -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL > --- Wait until Lua GC deletes a.k. > -for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end > +iterations = lua_gc.internal.iterations > +lua_gc.internal.bg_fiber:wakeup() > +while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end > a.k > +lua_gc.internal.interval = 0.001 > cfg.collect_lua_garbage = false > vshard.storage.cfg(cfg, names.storage_1_a) > -a.k = {b = 100} > -for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end > -a.k ~= nil > +lua_gc.internal.bg_fiber == nil > +iterations = lua_gc.internal.iterations > +fiber.sleep(0.01) > +iterations == lua_gc.internal.iterations > > test_run:switch('default') > test_run:drop_cluster(REPLICASET_2) > diff --git a/vshard/lua_gc.lua b/vshard/lua_gc.lua > new file mode 100644 > index 0000000..c6c5cd3 > --- /dev/null > +++ b/vshard/lua_gc.lua > @@ -0,0 +1,54 @@ > +-- > +-- This module implements background lua GC fiber. > +-- It's purpose is to make GC more aggressive. > +-- > + > +local lfiber = require('fiber') > +local MODULE_INTERNALS = '__module_vshard_lua_gc' > + > +local M = rawget(_G, MODULE_INTERNALS) > +if not M then > + M = { > + -- Background fiber. > + bg_fiber = nil, > + -- GC interval in seconds. > + interval = nil, > + -- Main loop. > + -- Stored here to make the fiber reloadable. > + main_loop = nil, > + -- Number of `collectgarbage()` calls. 
> + iterations = 0, > + } > +end > + > +M.main_loop = function() > + lfiber.sleep(M.interval) > + collectgarbage() > + M.iterations = M.iterations + 1 > + return M.main_loop() > +end > + > +local function set_state(active, interval) > + assert(type(interval) == 'number') > + M.interval = interval > + if active and not M.bg_fiber then > + M.bg_fiber = lfiber.create(M.main_loop) > + M.bg_fiber:name('vshard.lua_gc') > + end > + if not active and M.bg_fiber then > + M.bg_fiber:cancel() > + M.bg_fiber = nil > + end > + if active then > + M.bg_fiber:wakeup() > + end > +end > + > +if not rawget(_G, MODULE_INTERNALS) then > + rawset(_G, MODULE_INTERNALS, M) > +end > + > +return { > + set_state = set_state, > + internal = M, > +} > diff --git a/vshard/router/init.lua b/vshard/router/init.lua > index e2b2b22..3e127cb 100644 > --- a/vshard/router/init.lua > +++ b/vshard/router/init.lua > @@ -7,6 +7,7 @@ if rawget(_G, MODULE_INTERNALS) then > local vshard_modules = { > 'vshard.consts', 'vshard.error', 'vshard.cfg', > 'vshard.hash', 'vshard.replicaset', 'vshard.util', > + 'vshard.lua_gc', > } > for _, module in pairs(vshard_modules) do > package.loaded[module] = nil > @@ -18,6 +19,7 @@ local lcfg = require('vshard.cfg') > local lhash = require('vshard.hash') > local lreplicaset = require('vshard.replicaset') > local util = require('vshard.util') > +local lua_gc = require('vshard.lua_gc') > > local M = rawget(_G, MODULE_INTERNALS) > if not M then > @@ -43,8 +45,7 @@ if not M then > discovery_fiber = nil, > -- Bucket count stored on all replicasets. > total_bucket_count = 0, > - -- If true, then discovery fiber starts to call > - -- collectgarbage() periodically. > + -- Boolean lua_gc state (create periodic gc task). > collect_lua_garbage = nil, > -- This counter is used to restart background fibers with > -- new reloaded code. 
> @@ -151,8 +152,6 @@ end > -- > local function discovery_f() > local module_version = M.module_version > - local iterations_until_lua_gc = > - consts.COLLECT_LUA_GARBAGE_INTERVAL / consts.DISCOVERY_INTERVAL > while module_version == M.module_version do > while not next(M.replicasets) do > lfiber.sleep(consts.DISCOVERY_INTERVAL) > @@ -188,12 +187,6 @@ local function discovery_f() > M.route_map[bucket_id] = replicaset > end > end > - iterations_until_lua_gc = iterations_until_lua_gc - 1 > - if M.collect_lua_garbage and iterations_until_lua_gc == 0 then > - iterations_until_lua_gc = > - consts.COLLECT_LUA_GARBAGE_INTERVAL / consts.DISCOVERY_INTERVAL > - collectgarbage() > - end > lfiber.sleep(consts.DISCOVERY_INTERVAL) > end > end > @@ -504,7 +497,6 @@ local function router_cfg(cfg) > end > local new_replicasets = lreplicaset.buildall(vshard_cfg) > local total_bucket_count = vshard_cfg.bucket_count > - local collect_lua_garbage = vshard_cfg.collect_lua_garbage > log.info("Calling box.cfg()...") > for k, v in pairs(box_cfg) do > log.info({[k] = v}) > @@ -531,7 +523,7 @@ local function router_cfg(cfg) > vshard_cfg.connection_outdate_delay) > M.connection_outdate_delay = vshard_cfg.connection_outdate_delay > M.total_bucket_count = total_bucket_count > - M.collect_lua_garbage = collect_lua_garbage > + M.collect_lua_garbage = vshard_cfg.collect_lua_garbage > M.current_cfg = vshard_cfg > M.replicasets = new_replicasets > -- Update existing route map in-place. > @@ -548,8 +540,7 @@ local function router_cfg(cfg) > M.discovery_fiber = util.reloadable_fiber_create( > 'vshard.discovery', M, 'discovery_f') > end > - -- Destroy connections, not used in a new configuration. 
> - collectgarbage() > + lua_gc.set_state(M.collect_lua_garbage, consts.COLLECT_LUA_GARBAGE_INTERVAL) > end > > -------------------------------------------------------------------------------- > diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua > index 40216ea..3e29e9d 100644 > --- a/vshard/storage/init.lua > +++ b/vshard/storage/init.lua > @@ -10,7 +10,8 @@ if rawget(_G, MODULE_INTERNALS) then > local vshard_modules = { > 'vshard.consts', 'vshard.error', 'vshard.cfg', > 'vshard.replicaset', 'vshard.util', > - 'vshard.storage.reload_evolution' > + 'vshard.storage.reload_evolution', > + 'vshard.lua_gc', > } > for _, module in pairs(vshard_modules) do > package.loaded[module] = nil > @@ -21,6 +22,7 @@ local lerror = require('vshard.error') > local lcfg = require('vshard.cfg') > local lreplicaset = require('vshard.replicaset') > local util = require('vshard.util') > +local lua_gc = require('vshard.lua_gc') > local reload_evolution = require('vshard.storage.reload_evolution') > > local M = rawget(_G, MODULE_INTERNALS) > @@ -75,8 +77,7 @@ if not M then > collect_bucket_garbage_fiber = nil, > -- Do buckets garbage collection once per this time. > collect_bucket_garbage_interval = nil, > - -- If true, then bucket garbage collection fiber starts to > - -- call collectgarbage() periodically. > + -- Boolean lua_gc state (create periodic gc task). > collect_lua_garbage = nil, > > -------------------- Bucket recovery --------------------- > @@ -1063,9 +1064,6 @@ function collect_garbage_f() > -- buckets_for_redirect is deleted, it gets empty_sent_buckets > -- for next deletion. > local empty_sent_buckets = {} > - local iterations_until_lua_gc = > - consts.COLLECT_LUA_GARBAGE_INTERVAL / M.collect_bucket_garbage_interval > - > while M.module_version == module_version do > -- Check if no changes in buckets configuration. 
> if control.bucket_generation_collected ~= control.bucket_generation then > @@ -1106,12 +1104,6 @@ function collect_garbage_f() > end > end > ::continue:: > - iterations_until_lua_gc = iterations_until_lua_gc - 1 > - if iterations_until_lua_gc == 0 and M.collect_lua_garbage then > - iterations_until_lua_gc = consts.COLLECT_LUA_GARBAGE_INTERVAL / > - M.collect_bucket_garbage_interval > - collectgarbage() > - end > lfiber.sleep(M.collect_bucket_garbage_interval) > end > end > @@ -1586,7 +1578,6 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload) > local shard_index = vshard_cfg.shard_index > local collect_bucket_garbage_interval = > vshard_cfg.collect_bucket_garbage_interval > - local collect_lua_garbage = vshard_cfg.collect_lua_garbage > > -- It is considered that all possible errors during cfg > -- process occur only before this place. > @@ -1641,7 +1632,7 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload) > M.rebalancer_max_receiving = rebalancer_max_receiving > M.shard_index = shard_index > M.collect_bucket_garbage_interval = collect_bucket_garbage_interval > - M.collect_lua_garbage = collect_lua_garbage > + M.collect_lua_garbage = vshard_cfg.collect_lua_garbage > M.current_cfg = vshard_cfg > > if was_master and not is_master then > @@ -1666,6 +1657,7 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload) > M.rebalancer_fiber:cancel() > M.rebalancer_fiber = nil > end > + lua_gc.set_state(M.collect_lua_garbage, consts.COLLECT_LUA_GARBAGE_INTERVAL) > -- Destroy connections, not used in a new configuration. > collectgarbage() > end > > ^ permalink raw reply [flat|nested] 23+ messages in thread
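The review thread above also settles the test synchronization pattern: instead of sleeping for a fixed number of intervals, the tests remember the fiber's `iterations` counter, wake the fiber explicitly, and poll until the counter advances. A minimal threading-based sketch of that pattern in Python (a loose analog of the `lua_gc` fiber; names like `BgWorker` are illustrative, not vshard API):

```python
import threading
import time

class BgWorker:
    """Background loop that sleeps, does one unit of work, and counts
    iterations so callers can synchronize on observable progress."""

    def __init__(self, interval):
        self.interval = interval
        self.iterations = 0                  # like M.iterations
        self._wake = threading.Event()
        self._stop = False
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        while not self._stop:
            self._wake.wait(self.interval)   # like lfiber.sleep(M.interval)
            self._wake.clear()
            if self._stop:
                break
            self.iterations += 1             # one unit of work done

    def wakeup(self):
        self._wake.set()                     # like M.bg_fiber:wakeup()

    def stop(self):                          # like cancelling the fiber
        self._stop = True
        self._wake.set()
        self._thread.join()

w = BgWorker(interval=60)        # long interval: progress only on wakeup
seen = w.iterations
w.wakeup()
while w.iterations < seen + 1:   # the same polling loop the tests use
    time.sleep(0.01)
w.stop()
assert w.iterations >= seen + 1
```

Waking and polling on a counter makes the test deterministic regardless of the configured interval, which is why the hang the reviewer saw on Travis disappeared once the fixed-iteration sleep loops were replaced.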
* [tarantool-patches] Re: [PATCH 2/3] Move lua gc to a dedicated module 2018-08-03 20:04 ` Alex Khatskevich 2018-08-06 17:03 ` Vladislav Shpilevoy @ 2018-08-08 11:17 ` Vladislav Shpilevoy 1 sibling, 0 replies; 23+ messages in thread From: Vladislav Shpilevoy @ 2018-08-08 11:17 UTC (permalink / raw) To: tarantool-patches, Alex Khatskevich Thanks for the patch! Pushed into the master. On 03/08/2018 23:04, Alex Khatskevich wrote: > > On 01.08.2018 21:43, Vladislav Shpilevoy wrote: >> Thanks for the patch! See 4 comments below. >> >> On 31/07/2018 19:25, AKhatskevich wrote: >>> `vshard.lua_gc.lua` is a new module which helps make gc work more >>> intense. >>> Before the commit that was a duty of router and storage. >>> >>> Reasons to move lua gc to a separate module: >>> 1. It is not a duty of vshard to collect garbage, so let gc fiber >>> be as far from vshard as possible. >>> 2. Next commits will introduce multiple routers feature, which require >>> gc fiber to be a singleton. >>> >>> Closes #138 >>> --- >>> test/router/garbage_collector.result | 27 +++++++++++------ >>> test/router/garbage_collector.test.lua | 18 ++++++----- >>> test/storage/garbage_collector.result | 27 +++++++++-------- >>> test/storage/garbage_collector.test.lua | 22 ++++++-------- >>> vshard/lua_gc.lua | 54 +++++++++++++++++++++++++++++++++ >>> vshard/router/init.lua | 19 +++--------- >>> vshard/storage/init.lua | 20 ++++-------- >>> 7 files changed, 116 insertions(+), 71 deletions(-) >>> create mode 100644 vshard/lua_gc.lua >>> >>> diff --git a/test/router/garbage_collector.result b/test/router/garbage_collector.result >>> index 3c2a4f1..a7474fc 100644 >>> --- a/test/router/garbage_collector.result >>> +++ b/test/router/garbage_collector.result >>> @@ -40,27 +40,30 @@ test_run:switch('router_1') >>> fiber = require('fiber') >>> --- >>> ... >>> -cfg.collect_lua_garbage = true >>> +lua_gc = require('vshard.lua_gc') >>> --- >>> ... 
>>> -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DISCOVERY_INTERVAL >>> +cfg.collect_lua_garbage = true >> >> 1. Now this code tests nothing but just fibers. Below you do wakeup >> and check that iteration counter is increased, but it is obvious >> thing. Before your patch the test really tested that GC is called >> by checking for nullified weak references. Now I can remove collectgarbage() >> from the main_loop and nothing would changed. Please, make this test >> be a test. > GC test returned back. >> >> Moreover, the test hangs forever both locally and on Travis. > Fixed >> >>> diff --git a/test/storage/garbage_collector.result b/test/storage/garbage_collector.result >>> index 3588fb4..d94ba24 100644 >>> --- a/test/storage/garbage_collector.result >>> +++ b/test/storage/garbage_collector.result >> >> 2. Same. Now the test passes even if I removed collectgarbage() from >> the main loop. > returned. >> >>> diff --git a/vshard/lua_gc.lua b/vshard/lua_gc.lua >>> new file mode 100644 >>> index 0000000..8d6af3e >>> --- /dev/null >>> +++ b/vshard/lua_gc.lua >>> @@ -0,0 +1,54 @@ >>> +-- >>> +-- This module implements background lua GC fiber. >>> +-- It's purpose is to make GC more aggressive. >>> +-- >>> + >>> +local lfiber = require('fiber') >>> +local MODULE_INTERNALS = '__module_vshard_lua_gc' >>> + >>> +local M = rawget(_G, MODULE_INTERNALS) >>> +if not M then >>> + M = { >>> + -- Background fiber. >>> + bg_fiber = nil, >>> + -- GC interval in seconds. >>> + interval = nil, >>> + -- Main loop. >>> + -- Stored here to make the fiber reloadable. >>> + main_loop = nil, >>> + -- Number of `collectgarbage()` calls. >>> + iterations = 0, >>> + } >>> +end >>> +local DEFALUT_INTERVAL = 100 >> >> 3. For constants please use vshard.consts. >> >> 4. You should not choose interval inside the main_loop. >> Please, use 'default' option in cfg.lua. > DEFAULT_INTERVAL is removed at all. > Interval value is became required. 
> > > > full diff > > > > commit ec221bd060f46e4dc009eaab1c6c1bd1cf5a4150 > Author: AKhatskevich <avkhatskevich@tarantool.org> > Date: Thu Jul 26 01:17:00 2018 +0300 > > Move lua gc to a dedicated module > > `vshard.lua_gc.lua` is a new module which helps make gc work more > intense. > Before the commit that was a duty of router and storage. > > Reasons to move lua gc to a separate module: > 1. It is not a duty of vshard to collect garbage, so let gc fiber > be as far from vshard as possible. > 2. Next commits will introduce multiple routers feature, which require > gc fiber to be a singleton. > > Closes #138 > > diff --git a/test/router/garbage_collector.result b/test/router/garbage_collector.result > index 3c2a4f1..7780046 100644 > --- a/test/router/garbage_collector.result > +++ b/test/router/garbage_collector.result > @@ -40,41 +40,59 @@ test_run:switch('router_1') > fiber = require('fiber') > --- > ... > -cfg.collect_lua_garbage = true > +lua_gc = require('vshard.lua_gc') > --- > ... > -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DISCOVERY_INTERVAL > +cfg.collect_lua_garbage = true > --- > ... > vshard.router.cfg(cfg) > --- > ... > +lua_gc.internal.bg_fiber ~= nil > +--- > +- true > +... > +-- Check that `collectgarbage()` was really called. > a = setmetatable({}, {__mode = 'v'}) > --- > ... > a.k = {b = 100} > --- > ... > -for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end > +iterations = lua_gc.internal.iterations > +--- > +... > +lua_gc.internal.bg_fiber:wakeup() > +--- > +... > +while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end > --- > ... > a.k > --- > - null > ... > +lua_gc.internal.interval = 0.001 > +--- > +... > cfg.collect_lua_garbage = false > --- > ... > vshard.router.cfg(cfg) > --- > ... > -a.k = {b = 100} > +lua_gc.internal.bg_fiber == nil > +--- > +- true > +... > +iterations = lua_gc.internal.iterations > --- > ... 
> -for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end > +fiber.sleep(0.01) > --- > ... > -a.k ~= nil > +iterations == lua_gc.internal.iterations > --- > - true > ... > diff --git a/test/router/garbage_collector.test.lua b/test/router/garbage_collector.test.lua > index b3411cd..e8d0876 100644 > --- a/test/router/garbage_collector.test.lua > +++ b/test/router/garbage_collector.test.lua > @@ -13,18 +13,24 @@ test_run:cmd("start server router_1") > -- > test_run:switch('router_1') > fiber = require('fiber') > +lua_gc = require('vshard.lua_gc') > cfg.collect_lua_garbage = true > -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DISCOVERY_INTERVAL > vshard.router.cfg(cfg) > +lua_gc.internal.bg_fiber ~= nil > +-- Check that `collectgarbage()` was really called. > a = setmetatable({}, {__mode = 'v'}) > a.k = {b = 100} > -for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end > +iterations = lua_gc.internal.iterations > +lua_gc.internal.bg_fiber:wakeup() > +while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end > a.k > +lua_gc.internal.interval = 0.001 > cfg.collect_lua_garbage = false > vshard.router.cfg(cfg) > -a.k = {b = 100} > -for i = 1, iters + 1 do vshard.router.discovery_wakeup() fiber.sleep(0.01) end > -a.k ~= nil > +lua_gc.internal.bg_fiber == nil > +iterations = lua_gc.internal.iterations > +fiber.sleep(0.01) > +iterations == lua_gc.internal.iterations > > test_run:switch("default") > test_run:cmd("stop server router_1") > diff --git a/test/storage/garbage_collector.result b/test/storage/garbage_collector.result > index 3588fb4..6bec2db 100644 > --- a/test/storage/garbage_collector.result > +++ b/test/storage/garbage_collector.result > @@ -120,7 +120,7 @@ test_run:switch('storage_1_a') > fiber = require('fiber') > --- > ... > -log = require('log') > +lua_gc = require('vshard.lua_gc') > --- > ... 
> cfg.collect_lua_garbage = true > @@ -129,38 +129,50 @@ cfg.collect_lua_garbage = true > vshard.storage.cfg(cfg, names.storage_1_a) > --- > ... > --- Create a weak reference to a able {b = 100} - it must be > --- deleted on the next GC. > +lua_gc.internal.bg_fiber ~= nil > +--- > +- true > +... > +-- Check that `collectgarbage()` was really called. > a = setmetatable({}, {__mode = 'v'}) > --- > ... > a.k = {b = 100} > --- > ... > -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL > +iterations = lua_gc.internal.iterations > --- > ... > --- Wait until Lua GC deletes a.k. > -for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end > +lua_gc.internal.bg_fiber:wakeup() > +--- > +... > +while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end > --- > ... > a.k > --- > - null > ... > +lua_gc.internal.interval = 0.001 > +--- > +... > cfg.collect_lua_garbage = false > --- > ... > vshard.storage.cfg(cfg, names.storage_1_a) > --- > ... > -a.k = {b = 100} > +lua_gc.internal.bg_fiber == nil > +--- > +- true > +... > +iterations = lua_gc.internal.iterations > --- > ... > -for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end > +fiber.sleep(0.01) > --- > ... > -a.k ~= nil > +iterations == lua_gc.internal.iterations > --- > - true > ... > diff --git a/test/storage/garbage_collector.test.lua b/test/storage/garbage_collector.test.lua > index 79e76d8..407b8a1 100644 > --- a/test/storage/garbage_collector.test.lua > +++ b/test/storage/garbage_collector.test.lua > @@ -46,22 +46,24 @@ customer:select{} > -- > test_run:switch('storage_1_a') > fiber = require('fiber') > -log = require('log') > +lua_gc = require('vshard.lua_gc') > cfg.collect_lua_garbage = true > vshard.storage.cfg(cfg, names.storage_1_a) > --- Create a weak reference to a able {b = 100} - it must be > --- deleted on the next GC. 
> +lua_gc.internal.bg_fiber ~= nil > +-- Check that `collectgarbage()` was really called. > a = setmetatable({}, {__mode = 'v'}) > a.k = {b = 100} > -iters = vshard.consts.COLLECT_LUA_GARBAGE_INTERVAL / vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL > --- Wait until Lua GC deletes a.k. > -for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end > +iterations = lua_gc.internal.iterations > +lua_gc.internal.bg_fiber:wakeup() > +while lua_gc.internal.iterations < iterations + 1 do fiber.sleep(0.01) end > a.k > +lua_gc.internal.interval = 0.001 > cfg.collect_lua_garbage = false > vshard.storage.cfg(cfg, names.storage_1_a) > -a.k = {b = 100} > -for i = 1, iters + 1 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end > -a.k ~= nil > +lua_gc.internal.bg_fiber == nil > +iterations = lua_gc.internal.iterations > +fiber.sleep(0.01) > +iterations == lua_gc.internal.iterations > > test_run:switch('default') > test_run:drop_cluster(REPLICASET_2) > diff --git a/vshard/lua_gc.lua b/vshard/lua_gc.lua > new file mode 100644 > index 0000000..c6c5cd3 > --- /dev/null > +++ b/vshard/lua_gc.lua > @@ -0,0 +1,54 @@ > +-- > +-- This module implements a background Lua GC fiber. > +-- Its purpose is to make GC more aggressive. > +-- > + > +local lfiber = require('fiber') > +local MODULE_INTERNALS = '__module_vshard_lua_gc' > + > +local M = rawget(_G, MODULE_INTERNALS) > +if not M then > + M = { > + -- Background fiber. > + bg_fiber = nil, > + -- GC interval in seconds. > + interval = nil, > + -- Main loop. > + -- Stored here to make the fiber reloadable. > + main_loop = nil, > + -- Number of `collectgarbage()` calls.
> + iterations = 0, > + } > +end > + > +M.main_loop = function() > + lfiber.sleep(M.interval) > + collectgarbage() > + M.iterations = M.iterations + 1 > + return M.main_loop() > +end > + > +local function set_state(active, interval) > + assert(type(interval) == 'number') > + M.interval = interval > + if active and not M.bg_fiber then > + M.bg_fiber = lfiber.create(M.main_loop) > + M.bg_fiber:name('vshard.lua_gc') > + end > + if not active and M.bg_fiber then > + M.bg_fiber:cancel() > + M.bg_fiber = nil > + end > + if active then > + M.bg_fiber:wakeup() > + end > +end > + > +if not rawget(_G, MODULE_INTERNALS) then > + rawset(_G, MODULE_INTERNALS, M) > +end > + > +return { > + set_state = set_state, > + internal = M, > +} > diff --git a/vshard/router/init.lua b/vshard/router/init.lua > index e2b2b22..3e127cb 100644 > --- a/vshard/router/init.lua > +++ b/vshard/router/init.lua > @@ -7,6 +7,7 @@ if rawget(_G, MODULE_INTERNALS) then > local vshard_modules = { > 'vshard.consts', 'vshard.error', 'vshard.cfg', > 'vshard.hash', 'vshard.replicaset', 'vshard.util', > + 'vshard.lua_gc', > } > for _, module in pairs(vshard_modules) do > package.loaded[module] = nil > @@ -18,6 +19,7 @@ local lcfg = require('vshard.cfg') > local lhash = require('vshard.hash') > local lreplicaset = require('vshard.replicaset') > local util = require('vshard.util') > +local lua_gc = require('vshard.lua_gc') > > local M = rawget(_G, MODULE_INTERNALS) > if not M then > @@ -43,8 +45,7 @@ if not M then > discovery_fiber = nil, > -- Bucket count stored on all replicasets. > total_bucket_count = 0, > - -- If true, then discovery fiber starts to call > - -- collectgarbage() periodically. > + -- Boolean lua_gc state (create periodic gc task). > collect_lua_garbage = nil, > -- This counter is used to restart background fibers with > -- new reloaded code. 
> @@ -151,8 +152,6 @@ end > -- > local function discovery_f() > local module_version = M.module_version > - local iterations_until_lua_gc = > - consts.COLLECT_LUA_GARBAGE_INTERVAL / consts.DISCOVERY_INTERVAL > while module_version == M.module_version do > while not next(M.replicasets) do > lfiber.sleep(consts.DISCOVERY_INTERVAL) > @@ -188,12 +187,6 @@ local function discovery_f() > M.route_map[bucket_id] = replicaset > end > end > - iterations_until_lua_gc = iterations_until_lua_gc - 1 > - if M.collect_lua_garbage and iterations_until_lua_gc == 0 then > - iterations_until_lua_gc = > - consts.COLLECT_LUA_GARBAGE_INTERVAL / consts.DISCOVERY_INTERVAL > - collectgarbage() > - end > lfiber.sleep(consts.DISCOVERY_INTERVAL) > end > end > @@ -504,7 +497,6 @@ local function router_cfg(cfg) > end > local new_replicasets = lreplicaset.buildall(vshard_cfg) > local total_bucket_count = vshard_cfg.bucket_count > - local collect_lua_garbage = vshard_cfg.collect_lua_garbage > log.info("Calling box.cfg()...") > for k, v in pairs(box_cfg) do > log.info({[k] = v}) > @@ -531,7 +523,7 @@ local function router_cfg(cfg) > vshard_cfg.connection_outdate_delay) > M.connection_outdate_delay = vshard_cfg.connection_outdate_delay > M.total_bucket_count = total_bucket_count > - M.collect_lua_garbage = collect_lua_garbage > + M.collect_lua_garbage = vshard_cfg.collect_lua_garbage > M.current_cfg = vshard_cfg > M.replicasets = new_replicasets > -- Update existing route map in-place. > @@ -548,8 +540,7 @@ local function router_cfg(cfg) > M.discovery_fiber = util.reloadable_fiber_create( > 'vshard.discovery', M, 'discovery_f') > end > - -- Destroy connections, not used in a new configuration. 
> - collectgarbage() > + lua_gc.set_state(M.collect_lua_garbage, consts.COLLECT_LUA_GARBAGE_INTERVAL) > end > > -------------------------------------------------------------------------------- > diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua > index 40216ea..3e29e9d 100644 > --- a/vshard/storage/init.lua > +++ b/vshard/storage/init.lua > @@ -10,7 +10,8 @@ if rawget(_G, MODULE_INTERNALS) then > local vshard_modules = { > 'vshard.consts', 'vshard.error', 'vshard.cfg', > 'vshard.replicaset', 'vshard.util', > - 'vshard.storage.reload_evolution' > + 'vshard.storage.reload_evolution', > + 'vshard.lua_gc', > } > for _, module in pairs(vshard_modules) do > package.loaded[module] = nil > @@ -21,6 +22,7 @@ local lerror = require('vshard.error') > local lcfg = require('vshard.cfg') > local lreplicaset = require('vshard.replicaset') > local util = require('vshard.util') > +local lua_gc = require('vshard.lua_gc') > local reload_evolution = require('vshard.storage.reload_evolution') > > local M = rawget(_G, MODULE_INTERNALS) > @@ -75,8 +77,7 @@ if not M then > collect_bucket_garbage_fiber = nil, > -- Do buckets garbage collection once per this time. > collect_bucket_garbage_interval = nil, > - -- If true, then bucket garbage collection fiber starts to > - -- call collectgarbage() periodically. > + -- Boolean lua_gc state (create periodic gc task). > collect_lua_garbage = nil, > > -------------------- Bucket recovery --------------------- > @@ -1063,9 +1064,6 @@ function collect_garbage_f() > -- buckets_for_redirect is deleted, it gets empty_sent_buckets > -- for next deletion. > local empty_sent_buckets = {} > - local iterations_until_lua_gc = > - consts.COLLECT_LUA_GARBAGE_INTERVAL / M.collect_bucket_garbage_interval > - > while M.module_version == module_version do > -- Check if no changes in buckets configuration. 
> if control.bucket_generation_collected ~= control.bucket_generation then > @@ -1106,12 +1104,6 @@ function collect_garbage_f() > end > end > ::continue:: > - iterations_until_lua_gc = iterations_until_lua_gc - 1 > - if iterations_until_lua_gc == 0 and M.collect_lua_garbage then > - iterations_until_lua_gc = consts.COLLECT_LUA_GARBAGE_INTERVAL / > - M.collect_bucket_garbage_interval > - collectgarbage() > - end > lfiber.sleep(M.collect_bucket_garbage_interval) > end > end > @@ -1586,7 +1578,6 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload) > local shard_index = vshard_cfg.shard_index > local collect_bucket_garbage_interval = > vshard_cfg.collect_bucket_garbage_interval > - local collect_lua_garbage = vshard_cfg.collect_lua_garbage > > -- It is considered that all possible errors during cfg > -- process occur only before this place. > @@ -1641,7 +1632,7 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload) > M.rebalancer_max_receiving = rebalancer_max_receiving > M.shard_index = shard_index > M.collect_bucket_garbage_interval = collect_bucket_garbage_interval > - M.collect_lua_garbage = collect_lua_garbage > + M.collect_lua_garbage = vshard_cfg.collect_lua_garbage > M.current_cfg = vshard_cfg > > if was_master and not is_master then > @@ -1666,6 +1657,7 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload) > M.rebalancer_fiber:cancel() > M.rebalancer_fiber = nil > end > + lua_gc.set_state(M.collect_lua_garbage, consts.COLLECT_LUA_GARBAGE_INTERVAL) > -- Destroy connections, not used in a new configuration. > collectgarbage() > end > > ^ permalink raw reply [flat|nested] 23+ messages in thread
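A note on the testing technique used throughout this patch: the tests detect that the background fiber really ran `collectgarbage()` by keeping an object reachable only through a weak-value table. A minimal standalone sketch of that trick (plain Lua/LuaJIT, independent of tarantool and of this patch):

```lua
-- A table whose metatable has __mode = 'v' holds its values weakly:
-- a full GC cycle reclaims any value with no remaining strong reference.
local a = setmetatable({}, {__mode = 'v'})
a.k = {b = 100}           -- this table is reachable only via the weak slot a.k
collectgarbage('collect') -- force a full cycle, as the vshard.lua_gc fiber does
assert(a.k == nil, 'weakly referenced value should have been collected')
```

This is why `a.k` is expected to be `null` in the .result files above once the GC fiber has completed an iteration.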
* [tarantool-patches] [PATCH 3/3] Introduce multiple routers feature 2018-07-31 16:25 [tarantool-patches] [PATCH 0/3] multiple routers AKhatskevich 2018-07-31 16:25 ` [tarantool-patches] [PATCH 1/3] Update only vshard part of a cfg on reload AKhatskevich 2018-07-31 16:25 ` [tarantool-patches] [PATCH 2/3] Move lua gc to a dedicated module AKhatskevich @ 2018-07-31 16:25 ` AKhatskevich 2018-08-01 18:43 ` [tarantool-patches] " Vladislav Shpilevoy 2018-08-01 14:30 ` [tarantool-patches] [PATCH] Check self arg passed for router objects AKhatskevich 2018-08-03 20:07 ` [tarantool-patches] [PATCH] Refactor config templates AKhatskevich 4 siblings, 1 reply; 23+ messages in thread From: AKhatskevich @ 2018-07-31 16:25 UTC (permalink / raw) To: v.shpilevoy, tarantool-patches Key points: * The old `vshard.router.some_method()` API is preserved. * Add a `vshard.router.new(name, cfg)` method which returns a new router. * Each router has its own: 1. name 2. background fibers 3. attributes (route_map, replicasets, outdate_delay...) * Module reload reloads all configured routers. * `cfg` reconfigures a single router. * All routers share the same box configuration. The last passed config overrides the global config. * Multiple router instances can be connected to the same cluster. * For now, a router cannot be destroyed. Extra changes: * Add a `data` parameter to the `reloadable_fiber_create` function. 
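Condensing the key points above into a usage sketch (the config tables `cfg_1`, `cfg_2`, `cfg_2_updated` and the bucket ids are illustrative placeholders, not taken from the patch):

```lua
local vshard = require('vshard')

-- The old static API keeps working; it is served by an implicit
-- static router instance.
vshard.router.cfg(cfg_1)
vshard.router.call(1, 'write', 'do_replace', {{1, 1}})

-- New API: an independent router object with its own name, route map,
-- replicasets and background fibers, possibly for a different cluster.
local router_2 = vshard.router.new('router_2', cfg_2)
router_2:call(1, 'read', 'do_select', {1})

-- Reconfiguring one instance does not affect the others.
router_2:cfg(cfg_2_updated)
```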
Closes #130 --- test/multiple_routers/configs.lua | 81 ++++++ test/multiple_routers/multiple_routers.result | 226 ++++++++++++++++ test/multiple_routers/multiple_routers.test.lua | 85 ++++++ test/multiple_routers/router_1.lua | 15 ++ test/multiple_routers/storage_1_1_a.lua | 23 ++ test/multiple_routers/storage_1_1_b.lua | 1 + test/multiple_routers/storage_1_2_a.lua | 1 + test/multiple_routers/storage_1_2_b.lua | 1 + test/multiple_routers/storage_2_1_a.lua | 1 + test/multiple_routers/storage_2_1_b.lua | 1 + test/multiple_routers/storage_2_2_a.lua | 1 + test/multiple_routers/storage_2_2_b.lua | 1 + test/multiple_routers/suite.ini | 6 + test/multiple_routers/test.lua | 9 + test/router/router.result | 4 +- test/router/router.test.lua | 4 +- vshard/router/init.lua | 341 ++++++++++++++++-------- vshard/util.lua | 12 +- 18 files changed, 690 insertions(+), 123 deletions(-) create mode 100644 test/multiple_routers/configs.lua create mode 100644 test/multiple_routers/multiple_routers.result create mode 100644 test/multiple_routers/multiple_routers.test.lua create mode 100644 test/multiple_routers/router_1.lua create mode 100644 test/multiple_routers/storage_1_1_a.lua create mode 120000 test/multiple_routers/storage_1_1_b.lua create mode 120000 test/multiple_routers/storage_1_2_a.lua create mode 120000 test/multiple_routers/storage_1_2_b.lua create mode 120000 test/multiple_routers/storage_2_1_a.lua create mode 120000 test/multiple_routers/storage_2_1_b.lua create mode 120000 test/multiple_routers/storage_2_2_a.lua create mode 120000 test/multiple_routers/storage_2_2_b.lua create mode 100644 test/multiple_routers/suite.ini create mode 100644 test/multiple_routers/test.lua diff --git a/test/multiple_routers/configs.lua b/test/multiple_routers/configs.lua new file mode 100644 index 0000000..a6ce33c --- /dev/null +++ b/test/multiple_routers/configs.lua @@ -0,0 +1,81 @@ +names = { + storage_1_1_a = '32a2d4b8-f146-44ed-9d51-2436507efdf8', + storage_1_1_b = 
'c1c849b1-641d-40b8-9283-bcfe73d46270', + storage_1_2_a = '04e677ed-c7ba-47e0-a67f-b5100cfa86af', + storage_1_2_b = 'c7a979ee-9263-4a38-84a5-2fb6a0a32684', + storage_2_1_a = '88dc03f0-23fb-4f05-b462-e29186542864', + storage_2_1_b = '4230b711-f5c4-4131-bf98-88cd43a16901', + storage_2_2_a = '6b1eefbc-1e2e-410e-84ff-44c572ea9916', + storage_2_2_b = 'be74419a-1e56-4ba4-97e9-6b18710f63c5', +} + +rs_1_1 = 'dd208fb8-8b90-49bc-8393-6b3a99da7c52' +rs_1_2 = 'af9cfe88-2091-4613-a877-a623776c5c0e' +rs_2_1 = '9ca8ee15-ae18-4f31-9385-4859f89ce73f' +rs_2_2 = '007f5f58-b654-4125-8441-a71866fb62b5' + +local cfg_1 = {} +cfg_1.sharding = { + [rs_1_1] = { + replicas = { + [names.storage_1_1_a] = { + uri = 'storage:storage@127.0.0.1:3301', + name = 'storage_1_1_a', + master = true, + }, + [names.storage_1_1_b] = { + uri = 'storage:storage@127.0.0.1:3302', + name = 'storage_1_1_b', + }, + } + }, + [rs_1_2] = { + replicas = { + [names.storage_1_2_a] = { + uri = 'storage:storage@127.0.0.1:3303', + name = 'storage_1_2_a', + master = true, + }, + [names.storage_1_2_b] = { + uri = 'storage:storage@127.0.0.1:3304', + name = 'storage_1_2_b', + }, + } + }, +} + + +local cfg_2 = {} +cfg_2.sharding = { + [rs_2_1] = { + replicas = { + [names.storage_2_1_a] = { + uri = 'storage:storage@127.0.0.1:3305', + name = 'storage_2_1_a', + master = true, + }, + [names.storage_2_1_b] = { + uri = 'storage:storage@127.0.0.1:3306', + name = 'storage_2_1_b', + }, + } + }, + [rs_2_2] = { + replicas = { + [names.storage_2_2_a] = { + uri = 'storage:storage@127.0.0.1:3307', + name = 'storage_2_2_a', + master = true, + }, + [names.storage_2_2_b] = { + uri = 'storage:storage@127.0.0.1:3308', + name = 'storage_2_2_b', + }, + } + }, +} + +return { + cfg_1 = cfg_1, + cfg_2 = cfg_2, +} diff --git a/test/multiple_routers/multiple_routers.result b/test/multiple_routers/multiple_routers.result new file mode 100644 index 0000000..33f4034 --- /dev/null +++ b/test/multiple_routers/multiple_routers.result @@ -0,0 +1,226 @@ 
+test_run = require('test_run').new() +--- +... +REPLICASET_1_1 = { 'storage_1_1_a', 'storage_1_1_b' } +--- +... +REPLICASET_1_2 = { 'storage_1_2_a', 'storage_1_2_b' } +--- +... +REPLICASET_2_1 = { 'storage_2_1_a', 'storage_2_1_b' } +--- +... +REPLICASET_2_2 = { 'storage_2_2_a', 'storage_2_2_b' } +--- +... +test_run:create_cluster(REPLICASET_1_1, 'multiple_routers') +--- +... +test_run:create_cluster(REPLICASET_1_2, 'multiple_routers') +--- +... +test_run:create_cluster(REPLICASET_2_1, 'multiple_routers') +--- +... +test_run:create_cluster(REPLICASET_2_2, 'multiple_routers') +--- +... +util = require('lua_libs.util') +--- +... +util.wait_master(test_run, REPLICASET_1_1, 'storage_1_1_a') +--- +... +util.wait_master(test_run, REPLICASET_1_2, 'storage_1_2_a') +--- +... +util.wait_master(test_run, REPLICASET_2_1, 'storage_2_1_a') +--- +... +util.wait_master(test_run, REPLICASET_2_2, 'storage_2_2_a') +--- +... +test_run:cmd("create server router_1 with script='multiple_routers/router_1.lua'") +--- +- true +... +test_run:cmd("start server router_1") +--- +- true +... +-- Configure default (static) router. +_ = test_run:cmd("switch router_1") +--- +... +vshard.router.cfg(configs.cfg_1) +--- +... +vshard.router.bootstrap() +--- +- true +... +_ = test_run:cmd("switch storage_1_2_a") +--- +... +wait_rebalancer_state('The cluster is balanced ok', test_run) +--- +... +_ = test_run:cmd("switch router_1") +--- +... +vshard.router.call(1, 'write', 'do_replace', {{1, 1}}) +--- +- true +... +vshard.router.call(1, 'read', 'do_select', {1}) +--- +- [[1, 1]] +... +-- Configure extra router. +router_2 = vshard.router.new('router_2', configs.cfg_2) +--- +... +router_2:bootstrap() +--- +- true +... +_ = test_run:cmd("switch storage_2_2_a") +--- +... +wait_rebalancer_state('The cluster is balanced ok', test_run) +--- +... +_ = test_run:cmd("switch router_1") +--- +... +router_2:call(1, 'write', 'do_replace', {{2, 2}}) +--- +- true +... 
+router_2:call(1, 'read', 'do_select', {2}) +--- +- [[2, 2]] +... +-- Check that router_2 and static router serves different clusters. +#router_2:call(1, 'read', 'do_select', {1}) == 0 +--- +- true +... +-- Create several routers to the same cluster. +routers = {} +--- +... +for i = 3, 10 do routers[i] = vshard.router.new('router_' .. i, configs.cfg_2) end +--- +... +routers[3]:call(1, 'read', 'do_select', {2}) +--- +- [[2, 2]] +... +-- Check that they have their own background fibers. +fiber_names = {} +--- +... +for i = 2, 10 do fiber_names['vshard.failover.router_' .. i] = true; fiber_names['vshard.discovery.router_' .. i] = true; end +--- +... +next(fiber_names) ~= nil +--- +- true +... +fiber = require('fiber') +--- +... +for _, xfiber in pairs(fiber.info()) do fiber_names[xfiber.name] = nil end +--- +... +next(fiber_names) == nil +--- +- true +... +-- Reconfigure one of routers do not affect the others. +routers[3]:cfg(configs.cfg_1) +--- +... +routers[3]:call(1, 'read', 'do_select', {1}) +--- +- [[1, 1]] +... +#routers[3]:call(1, 'read', 'do_select', {2}) == 0 +--- +- true +... +#routers[4]:call(1, 'read', 'do_select', {1}) == 0 +--- +- true +... +routers[4]:call(1, 'read', 'do_select', {2}) +--- +- [[2, 2]] +... +routers[3]:cfg(configs.cfg_2) +--- +... +-- Try to create router with the same name. +util = require('lua_libs.util') +--- +... +util.check_error(vshard.router.new, 'router_2', configs.cfg_2) +--- +- null +- Router with name router_2 already exists +... +-- Reload router module. +_, old_rs_1 = next(vshard.router.internal.static_router.replicasets) +--- +... +_, old_rs_2 = next(router_2.replicasets) +--- +... +package.loaded['vshard.router'] = nil +--- +... +vshard.router = require('vshard.router') +--- +... +while not old_rs_1.is_outdated do fiber.sleep(0.01) end +--- +... +while not old_rs_2.is_outdated do fiber.sleep(0.01) end +--- +... +vshard.router.call(1, 'read', 'do_select', {1}) +--- +- [[1, 1]] +... 
+router_2:call(1, 'read', 'do_select', {2}) +--- +- [[2, 2]] +... +routers[5]:call(1, 'read', 'do_select', {2}) +--- +- [[2, 2]] +... +_ = test_run:cmd("switch default") +--- +... +test_run:cmd("stop server router_1") +--- +- true +... +test_run:cmd("cleanup server router_1") +--- +- true +... +test_run:drop_cluster(REPLICASET_1_1) +--- +... +test_run:drop_cluster(REPLICASET_1_2) +--- +... +test_run:drop_cluster(REPLICASET_2_1) +--- +... +test_run:drop_cluster(REPLICASET_2_2) +--- +... diff --git a/test/multiple_routers/multiple_routers.test.lua b/test/multiple_routers/multiple_routers.test.lua new file mode 100644 index 0000000..6d470e1 --- /dev/null +++ b/test/multiple_routers/multiple_routers.test.lua @@ -0,0 +1,85 @@ +test_run = require('test_run').new() + +REPLICASET_1_1 = { 'storage_1_1_a', 'storage_1_1_b' } +REPLICASET_1_2 = { 'storage_1_2_a', 'storage_1_2_b' } +REPLICASET_2_1 = { 'storage_2_1_a', 'storage_2_1_b' } +REPLICASET_2_2 = { 'storage_2_2_a', 'storage_2_2_b' } + +test_run:create_cluster(REPLICASET_1_1, 'multiple_routers') +test_run:create_cluster(REPLICASET_1_2, 'multiple_routers') +test_run:create_cluster(REPLICASET_2_1, 'multiple_routers') +test_run:create_cluster(REPLICASET_2_2, 'multiple_routers') +util = require('lua_libs.util') +util.wait_master(test_run, REPLICASET_1_1, 'storage_1_1_a') +util.wait_master(test_run, REPLICASET_1_2, 'storage_1_2_a') +util.wait_master(test_run, REPLICASET_2_1, 'storage_2_1_a') +util.wait_master(test_run, REPLICASET_2_2, 'storage_2_2_a') + +test_run:cmd("create server router_1 with script='multiple_routers/router_1.lua'") +test_run:cmd("start server router_1") + +-- Configure default (static) router. 
+_ = test_run:cmd("switch router_1") +vshard.router.cfg(configs.cfg_1) +vshard.router.bootstrap() +_ = test_run:cmd("switch storage_1_2_a") +wait_rebalancer_state('The cluster is balanced ok', test_run) +_ = test_run:cmd("switch router_1") + +vshard.router.call(1, 'write', 'do_replace', {{1, 1}}) +vshard.router.call(1, 'read', 'do_select', {1}) + +-- Configure extra router. +router_2 = vshard.router.new('router_2', configs.cfg_2) +router_2:bootstrap() +_ = test_run:cmd("switch storage_2_2_a") +wait_rebalancer_state('The cluster is balanced ok', test_run) +_ = test_run:cmd("switch router_1") + +router_2:call(1, 'write', 'do_replace', {{2, 2}}) +router_2:call(1, 'read', 'do_select', {2}) +-- Check that router_2 and static router serves different clusters. +#router_2:call(1, 'read', 'do_select', {1}) == 0 + +-- Create several routers to the same cluster. +routers = {} +for i = 3, 10 do routers[i] = vshard.router.new('router_' .. i, configs.cfg_2) end +routers[3]:call(1, 'read', 'do_select', {2}) +-- Check that they have their own background fibers. +fiber_names = {} +for i = 2, 10 do fiber_names['vshard.failover.router_' .. i] = true; fiber_names['vshard.discovery.router_' .. i] = true; end +next(fiber_names) ~= nil +fiber = require('fiber') +for _, xfiber in pairs(fiber.info()) do fiber_names[xfiber.name] = nil end +next(fiber_names) == nil + +-- Reconfigure one of routers do not affect the others. +routers[3]:cfg(configs.cfg_1) +routers[3]:call(1, 'read', 'do_select', {1}) +#routers[3]:call(1, 'read', 'do_select', {2}) == 0 +#routers[4]:call(1, 'read', 'do_select', {1}) == 0 +routers[4]:call(1, 'read', 'do_select', {2}) +routers[3]:cfg(configs.cfg_2) + +-- Try to create router with the same name. +util = require('lua_libs.util') +util.check_error(vshard.router.new, 'router_2', configs.cfg_2) + +-- Reload router module. 
+_, old_rs_1 = next(vshard.router.internal.static_router.replicasets) +_, old_rs_2 = next(router_2.replicasets) +package.loaded['vshard.router'] = nil +vshard.router = require('vshard.router') +while not old_rs_1.is_outdated do fiber.sleep(0.01) end +while not old_rs_2.is_outdated do fiber.sleep(0.01) end +vshard.router.call(1, 'read', 'do_select', {1}) +router_2:call(1, 'read', 'do_select', {2}) +routers[5]:call(1, 'read', 'do_select', {2}) + +_ = test_run:cmd("switch default") +test_run:cmd("stop server router_1") +test_run:cmd("cleanup server router_1") +test_run:drop_cluster(REPLICASET_1_1) +test_run:drop_cluster(REPLICASET_1_2) +test_run:drop_cluster(REPLICASET_2_1) +test_run:drop_cluster(REPLICASET_2_2) diff --git a/test/multiple_routers/router_1.lua b/test/multiple_routers/router_1.lua new file mode 100644 index 0000000..2e9ea91 --- /dev/null +++ b/test/multiple_routers/router_1.lua @@ -0,0 +1,15 @@ +#!/usr/bin/env tarantool + +require('strict').on() + +-- Get instance name +local fio = require('fio') +local NAME = fio.basename(arg[0], '.lua') + +require('console').listen(os.getenv('ADMIN')) + +configs = require('configs') + +-- Start the database with sharding +vshard = require('vshard') +box.cfg{} diff --git a/test/multiple_routers/storage_1_1_a.lua b/test/multiple_routers/storage_1_1_a.lua new file mode 100644 index 0000000..b44a97a --- /dev/null +++ b/test/multiple_routers/storage_1_1_a.lua @@ -0,0 +1,23 @@ +#!/usr/bin/env tarantool + +require('strict').on() + +-- Get instance name. +local fio = require('fio') +NAME = fio.basename(arg[0], '.lua') + +require('console').listen(os.getenv('ADMIN')) + +-- Fetch config for the cluster of the instance. +if NAME:sub(9,9) == '1' then + cfg = require('configs').cfg_1 +else + cfg = require('configs').cfg_2 +end + +-- Start the database with sharding. +vshard = require('vshard') +vshard.storage.cfg(cfg, names[NAME]) + +-- Bootstrap storage. 
+require('lua_libs.bootstrap') diff --git a/test/multiple_routers/storage_1_1_b.lua b/test/multiple_routers/storage_1_1_b.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_1_1_b.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/storage_1_2_a.lua b/test/multiple_routers/storage_1_2_a.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_1_2_a.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/storage_1_2_b.lua b/test/multiple_routers/storage_1_2_b.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_1_2_b.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/storage_2_1_a.lua b/test/multiple_routers/storage_2_1_a.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_2_1_a.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/storage_2_1_b.lua b/test/multiple_routers/storage_2_1_b.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_2_1_b.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/storage_2_2_a.lua b/test/multiple_routers/storage_2_2_a.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_2_2_a.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/storage_2_2_b.lua b/test/multiple_routers/storage_2_2_b.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_2_2_b.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/suite.ini b/test/multiple_routers/suite.ini new file mode 100644 index 0000000..d2d4470 --- /dev/null +++ 
b/test/multiple_routers/suite.ini @@ -0,0 +1,6 @@ +[default] +core = tarantool +description = Multiple routers tests +script = test.lua +is_parallel = False +lua_libs = ../lua_libs configs.lua diff --git a/test/multiple_routers/test.lua b/test/multiple_routers/test.lua new file mode 100644 index 0000000..cb7c1ee --- /dev/null +++ b/test/multiple_routers/test.lua @@ -0,0 +1,9 @@ +#!/usr/bin/env tarantool + +require('strict').on() + +box.cfg{ + listen = os.getenv("LISTEN"), +} + +require('console').listen(os.getenv('ADMIN')) diff --git a/test/router/router.result b/test/router/router.result index 45394e1..f123ab9 100644 --- a/test/router/router.result +++ b/test/router/router.result @@ -225,7 +225,7 @@ vshard.router.bootstrap() -- -- gh-108: negative bucket count on discovery. -- -vshard.router.internal.route_map = {} +vshard.router.internal.static_router.route_map = {} --- ... rets = {} @@ -1111,7 +1111,7 @@ end; vshard.router.cfg(cfg); --- ... -vshard.router.internal.route_map = {}; +vshard.router.internal.static_router.route_map = {}; --- ... vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY = false; diff --git a/test/router/router.test.lua b/test/router/router.test.lua index df2f381..a421d0c 100644 --- a/test/router/router.test.lua +++ b/test/router/router.test.lua @@ -91,7 +91,7 @@ vshard.router.bootstrap() -- -- gh-108: negative bucket count on discovery. -- -vshard.router.internal.route_map = {} +vshard.router.internal.static_router.route_map = {} rets = {} function do_echo() table.insert(rets, vshard.router.callro(1, 'echo', {1})) end f1 = fiber.create(do_echo) f2 = fiber.create(do_echo) @@ -423,7 +423,7 @@ while vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY ~= 'waiting' do fiber.sleep(0.02) end; vshard.router.cfg(cfg); -vshard.router.internal.route_map = {}; +vshard.router.internal.static_router.route_map = {}; vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY = false; -- Do discovery iteration. Upload buckets from the -- first replicaset. 
diff --git a/vshard/router/init.lua b/vshard/router/init.lua index 3e127cb..7569baf 100644 --- a/vshard/router/init.lua +++ b/vshard/router/init.lua @@ -25,14 +25,31 @@ local M = rawget(_G, MODULE_INTERNALS) if not M then M = { ---------------- Common module attributes ---------------- - -- The last passed configuration. - current_cfg = nil, errinj = { ERRINJ_CFG = false, ERRINJ_FAILOVER_CHANGE_CFG = false, ERRINJ_RELOAD = false, ERRINJ_LONG_DISCOVERY = false, }, + -- Dictionary, key is router name, value is a router. + routers = {}, + -- Router object which can be accessed by old api: + -- e.g. vshard.router.call(...) + static_router = nil, + -- This counter is used to restart background fibers with + -- new reloaded code. + module_version = 0, + } +end + +-- +-- Router object attributes. +-- +local ROUTER_TEMPLATE = { + -- Name of router. + name = nil, + -- The last passed configuration. + current_cfg = nil, -- Time to outdate old objects on reload. connection_outdate_delay = nil, -- Bucket map cache. @@ -47,38 +64,36 @@ if not M then total_bucket_count = 0, -- Boolean lua_gc state (create periodic gc task). collect_lua_garbage = nil, - -- This counter is used to restart background fibers with - -- new reloaded code. - module_version = 0, - } -end +} + +local STATIC_ROUTER_NAME = 'static_router' -- Set a bucket to a replicaset. -local function bucket_set(bucket_id, rs_uuid) - local replicaset = M.replicasets[rs_uuid] +local function bucket_set(router, bucket_id, rs_uuid) + local replicaset = router.replicasets[rs_uuid] -- It is technically possible to delete a replicaset at the -- same time when route to the bucket is discovered. 
if not replicaset then return nil, lerror.vshard(lerror.code.NO_ROUTE_TO_BUCKET, bucket_id) end - local old_replicaset = M.route_map[bucket_id] + local old_replicaset = router.route_map[bucket_id] if old_replicaset ~= replicaset then if old_replicaset then old_replicaset.bucket_count = old_replicaset.bucket_count - 1 end replicaset.bucket_count = replicaset.bucket_count + 1 end - M.route_map[bucket_id] = replicaset + router.route_map[bucket_id] = replicaset return replicaset end -- Remove a bucket from the cache. -local function bucket_reset(bucket_id) - local replicaset = M.route_map[bucket_id] +local function bucket_reset(router, bucket_id) + local replicaset = router.route_map[bucket_id] if replicaset then replicaset.bucket_count = replicaset.bucket_count - 1 end - M.route_map[bucket_id] = nil + router.route_map[bucket_id] = nil end -------------------------------------------------------------------------------- @@ -86,8 +101,8 @@ end -------------------------------------------------------------------------------- -- Search bucket in whole cluster -local function bucket_discovery(bucket_id) - local replicaset = M.route_map[bucket_id] +local function bucket_discovery(router, bucket_id) + local replicaset = router.route_map[bucket_id] if replicaset ~= nil then return replicaset end @@ -95,14 +110,14 @@ local function bucket_discovery(bucket_id) log.verbose("Discovering bucket %d", bucket_id) local last_err = nil local unreachable_uuid = nil - for uuid, _ in pairs(M.replicasets) do + for uuid, _ in pairs(router.replicasets) do -- Handle reload/reconfigure. 
- replicaset = M.replicasets[uuid] + replicaset = router.replicasets[uuid] if replicaset then local _, err = replicaset:callrw('vshard.storage.bucket_stat', {bucket_id}) if err == nil then - return bucket_set(bucket_id, replicaset.uuid) + return bucket_set(router, bucket_id, replicaset.uuid) elseif err.code ~= lerror.code.WRONG_BUCKET then last_err = err unreachable_uuid = uuid @@ -132,14 +147,14 @@ local function bucket_discovery(bucket_id) end -- Resolve bucket id to replicaset uuid -local function bucket_resolve(bucket_id) +local function bucket_resolve(router, bucket_id) local replicaset, err - local replicaset = M.route_map[bucket_id] + local replicaset = router.route_map[bucket_id] if replicaset ~= nil then return replicaset end -- Replicaset removed from cluster, perform discovery - replicaset, err = bucket_discovery(bucket_id) + replicaset, err = bucket_discovery(router, bucket_id) if replicaset == nil then return nil, err end @@ -150,14 +165,14 @@ end -- Background fiber to perform discovery. It periodically scans -- replicasets one by one and updates route_map. -- -local function discovery_f() +local function discovery_f(router) local module_version = M.module_version while module_version == M.module_version do - while not next(M.replicasets) do + while not next(router.replicasets) do lfiber.sleep(consts.DISCOVERY_INTERVAL) end - local old_replicasets = M.replicasets - for rs_uuid, replicaset in pairs(M.replicasets) do + local old_replicasets = router.replicasets + for rs_uuid, replicaset in pairs(router.replicasets) do local active_buckets, err = replicaset:callro('vshard.storage.buckets_discovery', {}, {timeout = 2}) @@ -167,7 +182,7 @@ local function discovery_f() end -- Renew replicasets object captured by the for loop -- in case of reconfigure and reload events. 
- if M.replicasets ~= old_replicasets then + if router.replicasets ~= old_replicasets then break end if not active_buckets then @@ -180,11 +195,11 @@ local function discovery_f() end replicaset.bucket_count = #active_buckets for _, bucket_id in pairs(active_buckets) do - local old_rs = M.route_map[bucket_id] + local old_rs = router.route_map[bucket_id] if old_rs and old_rs ~= replicaset then old_rs.bucket_count = old_rs.bucket_count - 1 end - M.route_map[bucket_id] = replicaset + router.route_map[bucket_id] = replicaset end end lfiber.sleep(consts.DISCOVERY_INTERVAL) @@ -195,9 +210,9 @@ end -- -- Immediately wakeup discovery fiber if exists. -- -local function discovery_wakeup() - if M.discovery_fiber then - M.discovery_fiber:wakeup() +local function discovery_wakeup(router) + if router.discovery_fiber then + router.discovery_fiber:wakeup() end end @@ -209,7 +224,7 @@ end -- Function will restart operation after wrong bucket response until timeout -- is reached -- -local function router_call(bucket_id, mode, func, args, opts) +local function router_call(router, bucket_id, mode, func, args, opts) if opts and (type(opts) ~= 'table' or (opts.timeout and type(opts.timeout) ~= 'number')) then error('Usage: call(bucket_id, mode, func, args, opts)') @@ -217,7 +232,7 @@ local function router_call(bucket_id, mode, func, args, opts) local timeout = opts and opts.timeout or consts.CALL_TIMEOUT_MIN local replicaset, err local tend = lfiber.time() + timeout - if bucket_id > M.total_bucket_count or bucket_id <= 0 then + if bucket_id > router.total_bucket_count or bucket_id <= 0 then error('Bucket is unreachable: bucket id is out of range') end local call @@ -227,7 +242,7 @@ local function router_call(bucket_id, mode, func, args, opts) call = 'callrw' end repeat - replicaset, err = bucket_resolve(bucket_id) + replicaset, err = bucket_resolve(router, bucket_id) if replicaset then ::replicaset_is_found:: local storage_call_status, call_status, call_error = @@ -243,9 +258,9 @@ local 
function router_call(bucket_id, mode, func, args, opts) end err = call_status if err.code == lerror.code.WRONG_BUCKET then - bucket_reset(bucket_id) + bucket_reset(router, bucket_id) if err.destination then - replicaset = M.replicasets[err.destination] + replicaset = router.replicasets[err.destination] if not replicaset then log.warn('Replicaset "%s" was not found, but received'.. ' from storage as destination - please '.. @@ -257,13 +272,13 @@ local function router_call(bucket_id, mode, func, args, opts) -- but already is executed on storages. while lfiber.time() <= tend do lfiber.sleep(0.05) - replicaset = M.replicasets[err.destination] + replicaset = router.replicasets[err.destination] if replicaset then goto replicaset_is_found end end else - replicaset = bucket_set(bucket_id, replicaset.uuid) + replicaset = bucket_set(router, bucket_id, replicaset.uuid) lfiber.yield() -- Protect against infinite cycle in a -- case of broken cluster, when a bucket @@ -280,7 +295,7 @@ local function router_call(bucket_id, mode, func, args, opts) -- is not timeout - these requests are repeated in -- any case on client, if error. assert(mode == 'write') - bucket_reset(bucket_id) + bucket_reset(router, bucket_id) return nil, err elseif err.code == lerror.code.NON_MASTER then -- Same, as above - do not wait and repeat. @@ -306,12 +321,12 @@ end -- -- Wrappers for router_call with preset mode. -- -local function router_callro(bucket_id, ...) - return router_call(bucket_id, 'read', ...) +local function router_callro(router, bucket_id, ...) + return router_call(router, bucket_id, 'read', ...) end -local function router_callrw(bucket_id, ...) - return router_call(bucket_id, 'write', ...) +local function router_callrw(router, bucket_id, ...) + return router_call(router, bucket_id, 'write', ...) end -- @@ -319,27 +334,27 @@ end -- @param bucket_id Bucket identifier. -- @retval Netbox connection. 
-- -local function router_route(bucket_id) +local function router_route(router, bucket_id) if type(bucket_id) ~= 'number' then error('Usage: router.route(bucket_id)') end - return bucket_resolve(bucket_id) + return bucket_resolve(router, bucket_id) end -- -- Return map of all replicasets. -- @retval See self.replicasets map. -- -local function router_routeall() - return M.replicasets +local function router_routeall(router) + return router.replicasets end -------------------------------------------------------------------------------- -- Failover -------------------------------------------------------------------------------- -local function failover_ping_round() - for _, replicaset in pairs(M.replicasets) do +local function failover_ping_round(router) + for _, replicaset in pairs(router.replicasets) do local replica = replicaset.replica if replica ~= nil and replica.conn ~= nil and replica.down_ts == nil then @@ -382,10 +397,10 @@ end -- Collect UUIDs of replicasets, priority of whose replica -- connections must be updated. -- -local function failover_collect_to_update() +local function failover_collect_to_update(router) local ts = lfiber.time() local uuid_to_update = {} - for uuid, rs in pairs(M.replicasets) do + for uuid, rs in pairs(router.replicasets) do if failover_need_down_priority(rs, ts) or failover_need_up_priority(rs, ts) then table.insert(uuid_to_update, uuid) @@ -400,16 +415,16 @@ end -- disconnected replicas. -- @retval true A replica of an replicaset has been changed. 
-- -local function failover_step() - failover_ping_round() - local uuid_to_update = failover_collect_to_update() +local function failover_step(router) + failover_ping_round(router) + local uuid_to_update = failover_collect_to_update(router) if #uuid_to_update == 0 then return false end local curr_ts = lfiber.time() local replica_is_changed = false for _, uuid in pairs(uuid_to_update) do - local rs = M.replicasets[uuid] + local rs = router.replicasets[uuid] if M.errinj.ERRINJ_FAILOVER_CHANGE_CFG then rs = nil M.errinj.ERRINJ_FAILOVER_CHANGE_CFG = false @@ -451,7 +466,7 @@ end -- tries to reconnect to the best replica. When the connection is -- established, it replaces the original replica. -- -local function failover_f() +local function failover_f(router) local module_version = M.module_version local min_timeout = math.min(consts.FAILOVER_UP_TIMEOUT, consts.FAILOVER_DOWN_TIMEOUT) @@ -461,7 +476,7 @@ local function failover_f() local prev_was_ok = false while module_version == M.module_version do ::continue:: - local ok, replica_is_changed = pcall(failover_step) + local ok, replica_is_changed = pcall(failover_step, router) if not ok then log.error('Error during failovering: %s', lerror.make(replica_is_changed)) @@ -488,9 +503,14 @@ end -- Configuration -------------------------------------------------------------------------------- -local function router_cfg(cfg) - local vshard_cfg, box_cfg = lcfg.check(cfg, M.current_cfg) - if not M.replicasets then +-- Types of configuration. +CFG_NEW = 'new' +CFG_RELOAD = 'reload' +CFG_RECONFIGURE = 'reconfigure' + +local function router_cfg(router, cfg, cfg_type) + local vshard_cfg, box_cfg = lcfg.check(cfg, router.current_cfg) + if cfg_type == CFG_NEW then log.info('Starting router configuration') else log.info('Starting router reconfiguration') @@ -512,44 +532,53 @@ local function router_cfg(cfg) -- Move connections from an old configuration to a new one. 
-- It must be done with no yields to prevent usage both of not -- fully moved old replicasets, and not fully built new ones. - lreplicaset.rebind_replicasets(new_replicasets, M.replicasets) + lreplicaset.rebind_replicasets(new_replicasets, router.replicasets) -- Now the new replicasets are fully built. Can establish -- connections and yield. for _, replicaset in pairs(new_replicasets) do replicaset:connect_all() end lreplicaset.wait_masters_connect(new_replicasets) - lreplicaset.outdate_replicasets(M.replicasets, + lreplicaset.outdate_replicasets(router.replicasets, vshard_cfg.connection_outdate_delay) - M.connection_outdate_delay = vshard_cfg.connection_outdate_delay - M.total_bucket_count = total_bucket_count - M.collect_lua_garbage = vshard_cfg.collect_lua_garbage - M.current_cfg = vshard_cfg - M.replicasets = new_replicasets + router.connection_outdate_delay = vshard_cfg.connection_outdate_delay + router.total_bucket_count = total_bucket_count + router.collect_lua_garbage = vshard_cfg.collect_lua_garbage + router.current_cfg = vshard_cfg + router.replicasets = new_replicasets -- Update existing route map in-place. - local old_route_map = M.route_map - M.route_map = {} + local old_route_map = router.route_map + router.route_map = {} for bucket, rs in pairs(old_route_map) do - M.route_map[bucket] = M.replicasets[rs.uuid] + router.route_map[bucket] = router.replicasets[rs.uuid] end - if M.failover_fiber == nil then - M.failover_fiber = util.reloadable_fiber_create( - 'vshard.failover', M, 'failover_f') + if router.failover_fiber == nil then + router.failover_fiber = util.reloadable_fiber_create( + 'vshard.failover.' .. router.name, M, 'failover_f', router) end - if M.discovery_fiber == nil then - M.discovery_fiber = util.reloadable_fiber_create( - 'vshard.discovery', M, 'discovery_f') + if router.discovery_fiber == nil then + router.discovery_fiber = util.reloadable_fiber_create( + 'vshard.discovery.' .. 
router.name, M, 'discovery_f', router) end - lua_gc.set_state(M.collect_lua_garbage, consts.COLLECT_LUA_GARBAGE_INTERVAL) +end + +local function updage_lua_gc_state() + local lua_gc = false + for _, xrouter in pairs(M.routers) do + if xrouter.collect_lua_garbage then + lua_gc = true + end + end + lua_gc.set_state(lua_gc, consts.COLLECT_LUA_GARBAGE_INTERVAL) end -------------------------------------------------------------------------------- -- Bootstrap -------------------------------------------------------------------------------- -local function cluster_bootstrap() +local function cluster_bootstrap(router) local replicasets = {} - for uuid, replicaset in pairs(M.replicasets) do + for uuid, replicaset in pairs(router.replicasets) do table.insert(replicasets, replicaset) local count, err = replicaset:callrw('vshard.storage.buckets_count', {}) @@ -560,9 +589,10 @@ local function cluster_bootstrap() return nil, lerror.vshard(lerror.code.NON_EMPTY) end end - lreplicaset.calculate_etalon_balance(M.replicasets, M.total_bucket_count) + lreplicaset.calculate_etalon_balance(router.replicasets, + router.total_bucket_count) local bucket_id = 1 - for uuid, replicaset in pairs(M.replicasets) do + for uuid, replicaset in pairs(router.replicasets) do if replicaset.etalon_bucket_count > 0 then local ok, err = replicaset:callrw('vshard.storage.bucket_force_create', @@ -618,7 +648,7 @@ local function replicaset_instance_info(replicaset, name, alerts, errcolor, return info, consts.STATUS.GREEN end -local function router_info() +local function router_info(router) local state = { replicasets = {}, bucket = { @@ -632,7 +662,7 @@ local function router_info() } local bucket_info = state.bucket local known_bucket_count = 0 - for rs_uuid, replicaset in pairs(M.replicasets) do + for rs_uuid, replicaset in pairs(router.replicasets) do -- Replicaset info parameters: -- * master instance info; -- * replica instance info; @@ -720,7 +750,7 @@ local function router_info() -- If a bucket is 
unreachable, then replicaset is -- unreachable too and color already is red. end - bucket_info.unknown = M.total_bucket_count - known_bucket_count + bucket_info.unknown = router.total_bucket_count - known_bucket_count if bucket_info.unknown > 0 then state.status = math.max(state.status, consts.STATUS.YELLOW) table.insert(state.alerts, lerror.alert(lerror.code.UNKNOWN_BUCKETS, @@ -737,13 +767,13 @@ end -- @param limit Maximal bucket count in output. -- @retval Map of type {bucket_id = 'unknown'/replicaset_uuid}. -- -local function router_buckets_info(offset, limit) +local function router_buckets_info(router, offset, limit) if offset ~= nil and type(offset) ~= 'number' or limit ~= nil and type(limit) ~= 'number' then error('Usage: buckets_info(offset, limit)') end offset = offset or 0 - limit = limit or M.total_bucket_count + limit = limit or router.total_bucket_count local ret = {} -- Use one string memory for all unknown buckets. local available_rw = 'available_rw' @@ -752,9 +782,9 @@ local function router_buckets_info(offset, limit) local unreachable = 'unreachable' -- Collect limit. 
local first = math.max(1, offset + 1) - local last = math.min(offset + limit, M.total_bucket_count) + local last = math.min(offset + limit, router.total_bucket_count) for bucket_id = first, last do - local rs = M.route_map[bucket_id] + local rs = router.route_map[bucket_id] if rs then if rs.master and rs.master:is_connected() then ret[bucket_id] = {uuid = rs.uuid, status = available_rw} @@ -774,22 +804,22 @@ end -- Other -------------------------------------------------------------------------------- -local function router_bucket_id(key) +local function router_bucket_id(router, key) if key == nil then error("Usage: vshard.router.bucket_id(key)") end - return lhash.key_hash(key) % M.total_bucket_count + 1 + return lhash.key_hash(key) % router.total_bucket_count + 1 end -local function router_bucket_count() - return M.total_bucket_count +local function router_bucket_count(router) + return router.total_bucket_count end -local function router_sync(timeout) +local function router_sync(router, timeout) if timeout ~= nil and type(timeout) ~= 'number' then error('Usage: vshard.router.sync([timeout: number])') end - for rs_uuid, replicaset in pairs(M.replicasets) do + for rs_uuid, replicaset in pairs(router.replicasets) do local status, err = replicaset:callrw('vshard.storage.sync', {timeout}) if not status then -- Add information about replicaset @@ -803,6 +833,93 @@ if M.errinj.ERRINJ_RELOAD then error('Error injection: reload') end +-------------------------------------------------------------------------------- +-- Managing router instances +-------------------------------------------------------------------------------- + +local function cfg_reconfigure(router, cfg) + return router_cfg(router, cfg, CFG_RECONFIGURE) +end + +local router_mt = { + __index = { + cfg = cfg_reconfigure; + info = router_info; + buckets_info = router_buckets_info; + call = router_call; + callro = router_callro; + callrw = router_callrw; + route = router_route; + routeall = router_routeall; + 
bucket_id = router_bucket_id; + bucket_count = router_bucket_count; + sync = router_sync; + bootstrap = cluster_bootstrap; + bucket_discovery = bucket_discovery; + discovery_wakeup = discovery_wakeup; + } +} + +-- Table which represents this module. +local module = {} + +local function export_static_router_attributes() + -- This metatable bypasses calls to a module to the static_router. + local module_mt = {__index = {}} + for method_name, method in pairs(router_mt.__index) do + module_mt.__index[method_name] = function(...) + if M.static_router then + return method(M.static_router, ...) + else + error('Static router is not configured') + end + end + end + setmetatable(module, module_mt) + -- Make static_router attributes accessible form + -- vshard.router.internal. + local M_static_router_attributes = { + name = true, + replicasets = true, + route_map = true, + total_bucket_count = true, + } + setmetatable(M, { + __index = function(M, key) + return M.static_router[key] + end + }) +end + +local function router_new(name, cfg) + assert(type(name) == 'string' and type(cfg) == 'table', + 'Wrong argument type. Usage: vshard.router.new(name, cfg).') + if M.routers[name] then + return nil, string.format('Router with name %s already exists', name) + end + local router = table.deepcopy(ROUTER_TEMPLATE) + setmetatable(router, router_mt) + router.name = name + M.routers[name] = router + if name == STATIC_ROUTER_NAME then + M.static_router = router + export_static_router_attributes() + end + router_cfg(router, cfg, CFG_NEW) + updage_lua_gc_state() + return router +end + +local function legacy_cfg(cfg) + if M.static_router then + -- Reconfigure. + router_cfg(M.static_router, cfg, CFG_RECONFIGURE) + else + -- Create new static instance. 
+ router_new(STATIC_ROUTER_NAME, cfg) + end +end + -------------------------------------------------------------------------------- -- Module definition -------------------------------------------------------------------------------- @@ -813,28 +930,24 @@ end if not rawget(_G, MODULE_INTERNALS) then rawset(_G, MODULE_INTERNALS, M) else - router_cfg(M.current_cfg) + for _, router in pairs(M.routers) do + router_cfg(router, router.current_cfg, CFG_RELOAD) + setmetatable(router, router_mt) + end + updage_lua_gc_state() M.module_version = M.module_version + 1 end M.discovery_f = discovery_f M.failover_f = failover_f +M.router_mt = router_mt +if M.static_router then + export_static_router_attributes() +end -return { - cfg = router_cfg; - info = router_info; - buckets_info = router_buckets_info; - call = router_call; - callro = router_callro; - callrw = router_callrw; - route = router_route; - routeall = router_routeall; - bucket_id = router_bucket_id; - bucket_count = router_bucket_count; - sync = router_sync; - bootstrap = cluster_bootstrap; - bucket_discovery = bucket_discovery; - discovery_wakeup = discovery_wakeup; - internal = M; - module_version = function() return M.module_version end; -} +module.cfg = legacy_cfg +module.new = router_new +module.internal = M +module.module_version = function() return M.module_version end + +return module diff --git a/vshard/util.lua b/vshard/util.lua index ea676ff..852e8a3 100644 --- a/vshard/util.lua +++ b/vshard/util.lua @@ -38,11 +38,11 @@ end -- reload of that module. -- See description of parameters in `reloadable_fiber_create`. 
-- -local function reloadable_fiber_main_loop(module, func_name) +local function reloadable_fiber_main_loop(module, func_name, data) log.info('%s has been started', func_name) local func = module[func_name] ::restart_loop:: - local ok, err = pcall(func) + local ok, err = pcall(func, data) -- yield serves two purposes: -- * makes this fiber cancellable -- * prevents 100% cpu consumption @@ -60,7 +60,7 @@ local function reloadable_fiber_main_loop(module, func_name) log.info('module is reloaded, restarting') -- luajit drops this frame if next function is called in -- return statement. - return M.reloadable_fiber_main_loop(module, func_name) + return M.reloadable_fiber_main_loop(module, func_name, data) end -- @@ -73,11 +73,13 @@ end -- @param module Module which can be reloaded. -- @param func_name Name of a function to be executed in the -- module. +-- @param data Data to be passed to the specified function. -- @retval New fiber. -- -local function reloadable_fiber_create(fiber_name, module, func_name) +local function reloadable_fiber_create(fiber_name, module, func_name, data) assert(type(fiber_name) == 'string') - local xfiber = fiber.create(reloadable_fiber_main_loop, module, func_name) + local xfiber = fiber.create(reloadable_fiber_main_loop, module, func_name, + data) xfiber:name(fiber_name) return xfiber end -- 2.14.1 ^ permalink raw reply [flat|nested] 23+ messages in thread
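The mechanical core of the patch above — threading an explicit `router` object through every function instead of reading the module-global `M`, and exposing those functions as methods via a `__index` metatable — can be sketched in plain Lua. This is a simplified standalone illustration, not the real vshard code: `router_new` here omits `ROUTER_TEMPLATE`, fibers, and configuration.

```lua
-- Standalone sketch: every former module-level function takes the
-- router as an explicit first argument, and a shared metatable
-- exposes it as a method, so several independent routers can coexist.
local function bucket_reset(router, bucket_id)
    local replicaset = router.route_map[bucket_id]
    if replicaset then
        replicaset.bucket_count = replicaset.bucket_count - 1
    end
    router.route_map[bucket_id] = nil
end

local router_mt = {
    __index = {
        bucket_reset = bucket_reset,
    }
}

local function router_new(name)
    -- The real patch deep-copies ROUTER_TEMPLATE and runs router_cfg.
    return setmetatable({
        name = name,
        route_map = {},
        replicasets = {},
    }, router_mt)
end

local r1 = router_new('router_1')
local r2 = router_new('router_2')
local rs = {bucket_count = 1}
r1.route_map[5] = rs
r2.route_map[5] = rs
r1:bucket_reset(5)  -- touches only r1's route map
assert(r1.route_map[5] == nil and r2.route_map[5] == rs)
```

The same shape explains the `data` argument added to `reloadable_fiber_create`: since the background fibers no longer read global state, the router they serve has to be passed in explicitly.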
* [tarantool-patches] Re: [PATCH 3/3] Introduce multiple routers feature 2018-07-31 16:25 ` [tarantool-patches] [PATCH 3/3] Introduce multiple routers feature AKhatskevich @ 2018-08-01 18:43 ` Vladislav Shpilevoy 2018-08-03 20:05 ` Alex Khatskevich 0 siblings, 1 reply; 23+ messages in thread From: Vladislav Shpilevoy @ 2018-08-01 18:43 UTC (permalink / raw) To: tarantool-patches, AKhatskevich Thanks for the patch! See 10 comments below. On 31/07/2018 19:25, AKhatskevich wrote: > Key points: > * Old `vshard.router.some_method()` API is preserved. > * Add `vshard.router.new(name, cfg)` method which returns a new router. > * Each router has its own: > 1. name > 2. background fibers > 3. attributes (route_map, replicasets, outdate_delay...) > * Module reload reloads all configured routers. > * `cfg` reconfigures a single router. > * All routers share the same box configuration. The last passed config > overrides the global config. > * Multiple router instances can be connected to the same cluster. > * For now, a router cannot be destroyed. > > Extra changes: > * Add `data` parameter to `reloadable_fiber_create` function. > > Closes #130 > ---> diff --git a/test/multiple_routers/multiple_routers.result > b/test/multiple_routers/multiple_routers.result > new file mode 100644 > index 0000000..33f4034 > --- /dev/null > +++ b/test/multiple_routers/multiple_routers.result > @@ -0,0 +1,226 @@ > +-- Reconfiguring one of the routers does not affect the others. > +routers[3]:cfg(configs.cfg_1) 1. You did not change configs.cfg_1, so it is not actually a reconfiguration. Please, change something to check that the parameter affects one router and does not affect others. 2. Please, add a test of the ability to get the static router into a variable and use it like others. It should be possible to hide distinctions between static and other routers. Like this: r1 = vshard.router.static r2 = vshard.router.new(...)
do_something_with_router(r1) do_something_with_router(r2) Here do_something_with_router() is unaware of whether the router is static or not. > diff --git a/vshard/router/init.lua b/vshard/router/init.lua > index 3e127cb..7569baf 100644 > --- a/vshard/router/init.lua > +++ b/vshard/router/init.lua > @@ -257,13 +272,13 @@ local function router_call(bucket_id, mode, func, args, opts) > -- but already is executed on storages. > while lfiber.time() <= tend do > lfiber.sleep(0.05) > - replicaset = M.replicasets[err.destination] > + replicaset = router.replicasets[err.destination] > if replicaset then > goto replicaset_is_found > end > end > else > - replicaset = bucket_set(bucket_id, replicaset.uuid) > + replicaset = bucket_set(router, bucket_id, replicaset.uuid) 3. Out of 80 symbols. > lfiber.yield() > -- Protect against infinite cycle in a > -- case of broken cluster, when a bucket > @@ -488,9 +503,14 @@ end > -- Configuration > -------------------------------------------------------------------------------- > > -local function router_cfg(cfg) > - local vshard_cfg, box_cfg = lcfg.check(cfg, M.current_cfg) > - if not M.replicasets then > +-- Types of configuration. > +CFG_NEW = 'new' > +CFG_RELOAD = 'reload' > +CFG_RECONFIGURE = 'reconfigure' 4. Last two values are never used in router_cfg(). The first is used for logging only and can be checked as it was before with no explicit passing. > + > +local function router_cfg(router, cfg, cfg_type) > + local vshard_cfg, box_cfg = lcfg.check(cfg, router.current_cfg) > + if cfg_type == CFG_NEW then > log.info('Starting router configuration') > else > log.info('Starting router reconfiguration') > @@ -512,44 +532,53 @@ local function router_cfg(cfg) > + > +local function updage_lua_gc_state() 5. This function is not needed actually. On router_new() the only thing that can change is start of the gc fiber if the new router has the flag and the gc is not started now. 
It can be checked by a simple 'if' with no full scan of all routers. On reload it is not possible to change the configuration, so the gc state cannot be changed and does not need an update. Even if it could be changed, you already iterate over the routers on reload to call router_cfg and can collect their flags alongside. The next point is that it is not possible now to manage the gc via a simple :cfg() call. You do nothing with gc when router_cfg is called directly. And that produces a question - why do your tests pass if so? The possible solution - keep a counter of set lua gc flags over all routers in M. On each cfg you update the counter if the value is changed. If it was 0 and becomes > 0, then you start gc. If it was > 0 and becomes 0, then you stop gc. No routers iteration at all. > + local lua_gc = false > + for _, xrouter in pairs(M.routers) do > + if xrouter.collect_lua_garbage then > + lua_gc = true > + end > + end > + lua_gc.set_state(lua_gc, consts.COLLECT_LUA_GARBAGE_INTERVAL) > end > > @@ -803,6 +833,93 @@ if M.errinj.ERRINJ_RELOAD then > error('Error injection: reload') > end > > +-------------------------------------------------------------------------------- > +-- Managing router instances > +-------------------------------------------------------------------------------- > + > +local function cfg_reconfigure(router, cfg) > + return router_cfg(router, cfg, CFG_RECONFIGURE) > +end > + > +local router_mt = { > + __index = { > + cfg = cfg_reconfigure; > + info = router_info; > + buckets_info = router_buckets_info; > + call = router_call; > + callro = router_callro; > + callrw = router_callrw; > + route = router_route; > + routeall = router_routeall; > + bucket_id = router_bucket_id; > + bucket_count = router_bucket_count; > + sync = router_sync; > + bootstrap = cluster_bootstrap; > + bucket_discovery = bucket_discovery; > + discovery_wakeup = discovery_wakeup; > + } > +} > + > +-- Table which represents this module.
> +local module = {} > + > +local function export_static_router_attributes() > + -- This metatable bypasses calls to a module to the static_router. > + local module_mt = {__index = {}} > + for method_name, method in pairs(router_mt.__index) do > + module_mt.__index[method_name] = function(...) > + if M.static_router then > + return method(M.static_router, ...) > + else > + error('Static router is not configured') 6. This should not be all-time check. You should initialize the static router metatable with only errors. On the first cfg you reset the metatable to always use regular methods. But anyway this code is unreachable. See below in the comment 10 why it is so. > + end > + end > + end > + setmetatable(module, module_mt) > + -- Make static_router attributes accessible form > + -- vshard.router.internal. > + local M_static_router_attributes = { > + name = true, > + replicasets = true, > + route_map = true, > + total_bucket_count = true, > + } 7. I saw in the tests that you are using vshard.router.internal.static_router instead. Please, remove M_static_router_attributes then. > + setmetatable(M, { > + __index = function(M, key) > + return M.static_router[key] > + end > + }) > +end > + > +local function router_new(name, cfg) > + assert(type(name) == 'string' and type(cfg) == 'table', > + 'Wrong argument type. Usage: vshard.router.new(name, cfg).') 8. As I said before, do not use assertions for usage checks in public API. Use 'if wrong_usage then error(...) end'. > + if M.routers[name] then > + return nil, string.format('Router with name %s already exists', name) > + end > + local router = table.deepcopy(ROUTER_TEMPLATE) > + setmetatable(router, router_mt) > + router.name = name > + M.routers[name] = router > + if name == STATIC_ROUTER_NAME then > + M.static_router = router > + export_static_router_attributes() > + end 9. This check can be removed if you move export_static_router_attributes call into legacy_cfg. 10. 
Looks like all your struggles in export_static_router_attributes() about the error on a non-configured router make no sense, since until cfg is called vshard.router has no methods except cfg and new. > + router_cfg(router, cfg, CFG_NEW) > + updage_lua_gc_state() > + return router > +end > + ^ permalink raw reply [flat|nested] 23+ messages in thread
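The counter-based Lua GC bookkeeping proposed in comment 5 can be sketched as follows. This is a standalone illustration of the idea, not vshard code: `lua_gc` here is a stub standing in for the real `vshard.lua_gc` module, and the counter field name is an assumption.

```lua
-- Sketch: count how many routers have collect_lua_garbage enabled and
-- start/stop the shared gc fiber only on 0 <-> >0 transitions, with no
-- iteration over all routers.
local lua_gc_started = false
local lua_gc = {
    set_state = function(state) lua_gc_started = state end,
}

local M = {collect_lua_garbage_cnt = 0}

-- Called from router_cfg with the router's old and new flag values.
local function update_lua_gc_state(old_flag, new_flag)
    local cnt = M.collect_lua_garbage_cnt
    if not old_flag and new_flag then
        cnt = cnt + 1
    elseif old_flag and not new_flag then
        cnt = cnt - 1
    end
    M.collect_lua_garbage_cnt = cnt
    -- Start gc when the counter leaves 0, stop it when it returns to 0.
    lua_gc.set_state(cnt > 0)
end

update_lua_gc_state(false, true)   -- first router enables gc
assert(lua_gc_started == true)
update_lua_gc_state(false, true)   -- second router enables gc
update_lua_gc_state(true, false)   -- one disables: counter still > 0
assert(lua_gc_started == true)
update_lua_gc_state(true, false)   -- last one disables
assert(lua_gc_started == false)
```

This is the scheme the author reports adopting in the follow-up reply ("Implemented by introducing a counter").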
* [tarantool-patches] Re: [PATCH 3/3] Introduce multiple routers feature 2018-08-01 18:43 ` [tarantool-patches] " Vladislav Shpilevoy @ 2018-08-03 20:05 ` Alex Khatskevich 2018-08-06 17:03 ` Vladislav Shpilevoy 0 siblings, 1 reply; 23+ messages in thread From: Alex Khatskevich @ 2018-08-03 20:05 UTC (permalink / raw) To: Vladislav Shpilevoy, tarantool-patches On 01.08.2018 21:43, Vladislav Shpilevoy wrote: > Thanks for the patch! See 10 comments below. > > On 31/07/2018 19:25, AKhatskevich wrote: >> Key points: >> * Old `vshard.router.some_method()` api is preserved. >> * Add `vshard.router.new(name, cfg)` method which returns a new router. >> * Each router has its own: >> 1. name >> 2. background fibers >> 3. attributes (route_map, replicasets, outdate_delay...) >> * Module reload reloads all configured routers. >> * `cfg` reconfigures a single router. >> * All routers share the same box configuration. The last passed config >> overrides the global config. >> * Multiple router instances can be connected to the same cluster. >> * By now, a router cannot be destroyed. >> >> Extra changes: >> * Add `data` parameter to `reloadable_fiber_create` function. >> >> Closes #130 >> ---> diff --git a/test/multiple_routers/multiple_routers.result >> b/test/multiple_routers/multiple_routers.result >> new file mode 100644 >> index 0000000..33f4034 >> --- /dev/null >> +++ b/test/multiple_routers/multiple_routers.result >> @@ -0,0 +1,226 @@ >> +-- Reconfigure one of routers do not affect the others. >> +routers[3]:cfg(configs.cfg_1) > > 1. You did not change configs.cfg_1 so it is not reconfig > actually. Please, change something to check that the > parameter affects one router and does not affect others. router[3] was configured with configs.cfg_2 before. So, its config was changed. > > 2. Please, add a test on an ability to get the static router > into a variable and use it like others. It should be possible > to hide distinctions between static and other routers. 
> > Like this: > > r1 = vshard.router.static > r2 = vshard.router.new(...) > do_something_with_router(r1) > do_something_with_router(r2) > > Here do_something_with_router() is unaware of whether the > router is static or not. > Few calls are added. >> diff --git a/vshard/router/init.lua b/vshard/router/init.lua >> index 3e127cb..7569baf 100644 >> --- a/vshard/router/init.lua >> +++ b/vshard/router/init.lua >> @@ -257,13 +272,13 @@ local function router_call(bucket_id, mode, >> func, args, opts) >> -- but already is executed on storages. >> while lfiber.time() <= tend do >> lfiber.sleep(0.05) >> - replicaset = M.replicasets[err.destination] >> + replicaset = >> router.replicasets[err.destination] >> if replicaset then >> goto replicaset_is_found >> end >> end >> else >> - replicaset = bucket_set(bucket_id, >> replicaset.uuid) >> + replicaset = bucket_set(router, bucket_id, >> replicaset.uuid) > > 3. Out of 80 symbols. fixed > >> lfiber.yield() >> -- Protect against infinite cycle in a >> -- case of broken cluster, when a bucket >> @@ -488,9 +503,14 @@ end >> -- Configuration >> -------------------------------------------------------------------------------- >> -local function router_cfg(cfg) >> - local vshard_cfg, box_cfg = lcfg.check(cfg, M.current_cfg) >> - if not M.replicasets then >> +-- Types of configuration. >> +CFG_NEW = 'new' >> +CFG_RELOAD = 'reload' >> +CFG_RECONFIGURE = 'reconfigure' > > 4. Last two values are never used in router_cfg(). The first > is used for logging only and can be checked as it was before > with no explicit passing. I have left it as it is. Now, each of those is passed at least once. 
>> + >> +local function router_cfg(router, cfg, cfg_type) >> + local vshard_cfg, box_cfg = lcfg.check(cfg, router.current_cfg) >> + if cfg_type == CFG_NEW then >> log.info('Starting router configuration') >> else >> log.info('Starting router reconfiguration') >> @@ -512,44 +532,53 @@ local function router_cfg(cfg) >> + >> +local function updage_lua_gc_state() > > 5. This function is not needed actually. > > On router_new() the only thing that can change is start of > the gc fiber if the new router has the flag and the gc is > not started now. It can be checked by a simple 'if' with > no full-scan of all routers. > > On reload it is not possible to change configuration, so > the gc state can not be changed and does not need an update. > Even if it could be changed, you already iterate over routers > on reload to call router_cfg and can collect their flags > along side. > > The next point is that it is not possible now to manage the > gc via simple :cfg() call. You do nothing with gc when > router_cfg is called directly. And that produces a question - > why do your tests pass if so? > > The possible solution - keep a counter of set lua gc flags > overall routers in M. On each cfg you update the counter > if the value is changed. If it was 0 and become > 0, then > you start gc. If it was > 0 and become 0, then you stop gc. > No routers iteration at all. Implemented by introducing a counter. 
> >> + local lua_gc = false >> + for _, xrouter in pairs(M.routers) do >> + if xrouter.collect_lua_garbage then >> + lua_gc = true >> + end >> + end >> + lua_gc.set_state(lua_gc, consts.COLLECT_LUA_GARBAGE_INTERVAL) >> end >> @@ -803,6 +833,93 @@ if M.errinj.ERRINJ_RELOAD then >> error('Error injection: reload') >> end >> +-------------------------------------------------------------------------------- >> +-- Managing router instances >> +-------------------------------------------------------------------------------- >> >> + >> +local function cfg_reconfigure(router, cfg) >> + return router_cfg(router, cfg, CFG_RECONFIGURE) >> +end >> + >> +local router_mt = { >> + __index = { >> + cfg = cfg_reconfigure; >> + info = router_info; >> + buckets_info = router_buckets_info; >> + call = router_call; >> + callro = router_callro; >> + callrw = router_callrw; >> + route = router_route; >> + routeall = router_routeall; >> + bucket_id = router_bucket_id; >> + bucket_count = router_bucket_count; >> + sync = router_sync; >> + bootstrap = cluster_bootstrap; >> + bucket_discovery = bucket_discovery; >> + discovery_wakeup = discovery_wakeup; >> + } >> +} >> + >> +-- Table which represents this module. >> +local module = {} >> + >> +local function export_static_router_attributes() >> + -- This metatable bypasses calls to a module to the static_router. >> + local module_mt = {__index = {}} >> + for method_name, method in pairs(router_mt.__index) do >> + module_mt.__index[method_name] = function(...) >> + if M.static_router then >> + return method(M.static_router, ...) >> + else >> + error('Static router is not configured') > > 6. This should not be all-time check. You should > initialize the static router metatable with only errors. > On the first cfg you reset the metatable to always use > regular methods. But anyway this code is unreachable. See > below in the comment 10 why it is so. Yes. Fixed. 
> >> + end >> + end >> + end >> + setmetatable(module, module_mt) >> + -- Make static_router attributes accessible form >> + -- vshard.router.internal. >> + local M_static_router_attributes = { >> + name = true, >> + replicasets = true, >> + route_map = true, >> + total_bucket_count = true, >> + } > > 7. I saw in the tests that you are using > vshard.router.internal.static_router > instead. Please, remove M_static_router_attributes then. Deleted. Tests are fixed. > >> + setmetatable(M, { >> + __index = function(M, key) >> + return M.static_router[key] >> + end >> + }) >> +end >> + >> +local function router_new(name, cfg) >> + assert(type(name) == 'string' and type(cfg) == 'table', >> + 'Wrong argument type. Usage: vshard.router.new(name, cfg).') > > 8. As I said before, do not use assertions for usage checks in public > API. Use 'if wrong_usage then error(...) end'. Fixed. > >> + if M.routers[name] then >> + return nil, string.format('Router with name %s already >> exists', name) >> + end >> + local router = table.deepcopy(ROUTER_TEMPLATE) >> + setmetatable(router, router_mt) >> + router.name = name >> + M.routers[name] = router >> + if name == STATIC_ROUTER_NAME then >> + M.static_router = router >> + export_static_router_attributes() >> + end > > 9. This check can be removed if you move > export_static_router_attributes call into legacy_cfg. But due to this `if`, the static router can be configured by `vshard.router.new(static_router_name)`. > > 10. Looks like all your struggles in > export_static_router_attributes() about error on non-configured > router make no sense, since until cfg is called, vshard.router > has no methods except cfg and new. > >> + router_cfg(router, cfg, CFG_NEW) >> + updage_lua_gc_state() >> + return router >> +end >> +
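The forwarding discussed in comments 6 and 10 — a module-level metatable that dispatches `vshard.router.*` calls to the static router instance — can be modeled like this. This is a hypothetical simplification (a single stand-in method, stub `M`), not the patch's exact code:

```lua
-- Simplified model of forwarding module-level calls to the
-- static router instance.
local M = {static_router = nil}
local router_mt = {__index = {}}
-- Stand-in for the real router methods (call, route, info, ...).
function router_mt.__index.bucket_count(router)
    return router.total_bucket_count
end

local module = {}
local module_mt = {__index = {}}
for name, method in pairs(router_mt.__index) do
    module_mt.__index[name] = function(...)
        -- Until the first cfg() the static router does not exist.
        if not M.static_router then
            error('Static router is not configured')
        end
        return method(M.static_router, ...)
    end
end
setmetatable(module, module_mt)

-- Before cfg: the wrapper raises an error.
local ok = pcall(module.bucket_count)
-- After cfg creates the static router, calls are forwarded.
M.static_router = setmetatable({total_bucket_count = 3000}, router_mt)
print(module.bucket_count())  -- 3000
```

Comment 6's point is that the `if M.static_router` check need not run on every call: the module metatable can start out with error-only stubs and be swapped for direct method forwarding on the first successful cfg.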
full diff commit f3ffb6a6a3632277f05ee4ea7d095a19dd85a42f Author: AKhatskevich <avkhatskevich@tarantool.org> Date: Thu Jul 26 16:17:25 2018 +0300 Introduce multiple routers feature Key points: * Old `vshard.router.some_method()` API is preserved. * Add `vshard.router.new(name, cfg)` method which returns a new router. * Each router has its own: 1. name 2. background fibers 3. attributes (route_map, replicasets, outdate_delay...) * Module reload reloads all configured routers. * `cfg` reconfigures a single router. * All routers share the same box configuration. The last passed config overrides the global box config. * Multiple router instances can be connected to the same cluster. * For now, a router cannot be destroyed. Extra changes: * Add `data` parameter to `reloadable_fiber_create` function. Closes #130 diff --git a/test/failover/failover.result b/test/failover/failover.result index 73a4250..50410ad 100644 --- a/test/failover/failover.result +++ b/test/failover/failover.result @@ -174,7 +174,7 @@ test_run:switch('router_1') --- - true ... -rs1 = vshard.router.internal.replicasets[rs_uuid[1]] +rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]] --- ... while not rs1.replica_up_ts do fiber.sleep(0.1) end diff --git a/test/failover/failover.test.lua b/test/failover/failover.test.lua index 6e06314..44c8b6d 100644 --- a/test/failover/failover.test.lua +++ b/test/failover/failover.test.lua @@ -74,7 +74,7 @@ echo_count -- Ensure that replica_up_ts is updated periodically.
test_run:switch('router_1') -rs1 = vshard.router.internal.replicasets[rs_uuid[1]] +rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]] while not rs1.replica_up_ts do fiber.sleep(0.1) end old_up_ts = rs1.replica_up_ts while rs1.replica_up_ts == old_up_ts do fiber.sleep(0.1) end diff --git a/test/failover/failover_errinj.result b/test/failover/failover_errinj.result index 3b6d986..484a1e3 100644 --- a/test/failover/failover_errinj.result +++ b/test/failover/failover_errinj.result @@ -49,7 +49,7 @@ vshard.router.cfg(cfg) -- Check that already run failover step is restarted on -- configuration change (if some replicasets are removed from -- config). -rs1 = vshard.router.internal.replicasets[rs_uuid[1]] +rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]] --- ... while not rs1.replica or not rs1.replica.conn:is_connected() do fiber.sleep(0.1) end diff --git a/test/failover/failover_errinj.test.lua b/test/failover/failover_errinj.test.lua index b4d2d35..14228de 100644 --- a/test/failover/failover_errinj.test.lua +++ b/test/failover/failover_errinj.test.lua @@ -20,7 +20,7 @@ vshard.router.cfg(cfg) -- Check that already run failover step is restarted on -- configuration change (if some replicasets are removed from -- config). 
-rs1 = vshard.router.internal.replicasets[rs_uuid[1]] +rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]] while not rs1.replica or not rs1.replica.conn:is_connected() do fiber.sleep(0.1) end vshard.router.internal.errinj.ERRINJ_FAILOVER_CHANGE_CFG = true wait_state('Configuration has changed, restart ') diff --git a/test/failover/router_1.lua b/test/failover/router_1.lua index d71209b..664a6c6 100644 --- a/test/failover/router_1.lua +++ b/test/failover/router_1.lua @@ -42,7 +42,7 @@ end function priority_order() local ret = {} for _, uuid in pairs(rs_uuid) do - local rs = vshard.router.internal.replicasets[uuid] + local rs = vshard.router.internal.static_router.replicasets[uuid] local sorted = {} for _, replica in pairs(rs.priority_list) do local z diff --git a/test/misc/reconfigure.result b/test/misc/reconfigure.result index c7960b3..311f749 100644 --- a/test/misc/reconfigure.result +++ b/test/misc/reconfigure.result @@ -250,7 +250,7 @@ test_run:switch('router_1') -- Ensure that in a case of error router internals are not -- changed. -- -not vshard.router.internal.collect_lua_garbage +not vshard.router.internal.static_router.collect_lua_garbage --- - true ... @@ -264,7 +264,7 @@ vshard.router.cfg(cfg) --- - error: 'Incorrect value for option ''invalid_option'': unexpected option' ... -not vshard.router.internal.collect_lua_garbage +not vshard.router.internal.static_router.collect_lua_garbage --- - true ... diff --git a/test/misc/reconfigure.test.lua b/test/misc/reconfigure.test.lua index 25dc2ca..298b9b0 100644 --- a/test/misc/reconfigure.test.lua +++ b/test/misc/reconfigure.test.lua @@ -99,11 +99,11 @@ test_run:switch('router_1') -- Ensure that in a case of error router internals are not -- changed. 
-- -not vshard.router.internal.collect_lua_garbage +not vshard.router.internal.static_router.collect_lua_garbage cfg.collect_lua_garbage = true cfg.invalid_option = 'kek' vshard.router.cfg(cfg) -not vshard.router.internal.collect_lua_garbage +not vshard.router.internal.static_router.collect_lua_garbage cfg.invalid_option = nil cfg.collect_lua_garbage = nil vshard.router.cfg(cfg) diff --git a/test/multiple_routers/configs.lua b/test/multiple_routers/configs.lua new file mode 100644 index 0000000..a6ce33c --- /dev/null +++ b/test/multiple_routers/configs.lua @@ -0,0 +1,81 @@ +names = { + storage_1_1_a = '32a2d4b8-f146-44ed-9d51-2436507efdf8', + storage_1_1_b = 'c1c849b1-641d-40b8-9283-bcfe73d46270', + storage_1_2_a = '04e677ed-c7ba-47e0-a67f-b5100cfa86af', + storage_1_2_b = 'c7a979ee-9263-4a38-84a5-2fb6a0a32684', + storage_2_1_a = '88dc03f0-23fb-4f05-b462-e29186542864', + storage_2_1_b = '4230b711-f5c4-4131-bf98-88cd43a16901', + storage_2_2_a = '6b1eefbc-1e2e-410e-84ff-44c572ea9916', + storage_2_2_b = 'be74419a-1e56-4ba4-97e9-6b18710f63c5', +} + +rs_1_1 = 'dd208fb8-8b90-49bc-8393-6b3a99da7c52' +rs_1_2 = 'af9cfe88-2091-4613-a877-a623776c5c0e' +rs_2_1 = '9ca8ee15-ae18-4f31-9385-4859f89ce73f' +rs_2_2 = '007f5f58-b654-4125-8441-a71866fb62b5' + +local cfg_1 = {} +cfg_1.sharding = { + [rs_1_1] = { + replicas = { + [names.storage_1_1_a] = { + uri = 'storage:storage@127.0.0.1:3301', + name = 'storage_1_1_a', + master = true, + }, + [names.storage_1_1_b] = { + uri = 'storage:storage@127.0.0.1:3302', + name = 'storage_1_1_b', + }, + } + }, + [rs_1_2] = { + replicas = { + [names.storage_1_2_a] = { + uri = 'storage:storage@127.0.0.1:3303', + name = 'storage_1_2_a', + master = true, + }, + [names.storage_1_2_b] = { + uri = 'storage:storage@127.0.0.1:3304', + name = 'storage_1_2_b', + }, + } + }, +} + + +local cfg_2 = {} +cfg_2.sharding = { + [rs_2_1] = { + replicas = { + [names.storage_2_1_a] = { + uri = 'storage:storage@127.0.0.1:3305', + name = 'storage_2_1_a', + master = true, 
+ }, + [names.storage_2_1_b] = { + uri = 'storage:storage@127.0.0.1:3306', + name = 'storage_2_1_b', + }, + } + }, + [rs_2_2] = { + replicas = { + [names.storage_2_2_a] = { + uri = 'storage:storage@127.0.0.1:3307', + name = 'storage_2_2_a', + master = true, + }, + [names.storage_2_2_b] = { + uri = 'storage:storage@127.0.0.1:3308', + name = 'storage_2_2_b', + }, + } + }, +} + +return { + cfg_1 = cfg_1, + cfg_2 = cfg_2, +} diff --git a/test/multiple_routers/multiple_routers.result b/test/multiple_routers/multiple_routers.result new file mode 100644 index 0000000..1e309a7 --- /dev/null +++ b/test/multiple_routers/multiple_routers.result @@ -0,0 +1,295 @@ +test_run = require('test_run').new() +--- +... +REPLICASET_1_1 = { 'storage_1_1_a', 'storage_1_1_b' } +--- +... +REPLICASET_1_2 = { 'storage_1_2_a', 'storage_1_2_b' } +--- +... +REPLICASET_2_1 = { 'storage_2_1_a', 'storage_2_1_b' } +--- +... +REPLICASET_2_2 = { 'storage_2_2_a', 'storage_2_2_b' } +--- +... +test_run:create_cluster(REPLICASET_1_1, 'multiple_routers') +--- +... +test_run:create_cluster(REPLICASET_1_2, 'multiple_routers') +--- +... +test_run:create_cluster(REPLICASET_2_1, 'multiple_routers') +--- +... +test_run:create_cluster(REPLICASET_2_2, 'multiple_routers') +--- +... +util = require('lua_libs.util') +--- +... +util.wait_master(test_run, REPLICASET_1_1, 'storage_1_1_a') +--- +... +util.wait_master(test_run, REPLICASET_1_2, 'storage_1_2_a') +--- +... +util.wait_master(test_run, REPLICASET_2_1, 'storage_2_1_a') +--- +... +util.wait_master(test_run, REPLICASET_2_2, 'storage_2_2_a') +--- +... +test_run:cmd("create server router_1 with script='multiple_routers/router_1.lua'") +--- +- true +... +test_run:cmd("start server router_1") +--- +- true +... +-- Configure default (static) router. +_ = test_run:cmd("switch router_1") +--- +... +static_router = vshard.router.new('_static_router', configs.cfg_1) +--- +... +vshard.router.bootstrap() +--- +- true +... +_ = test_run:cmd("switch storage_1_2_a") +--- +... 
+wait_rebalancer_state('The cluster is balanced ok', test_run) +--- +... +_ = test_run:cmd("switch router_1") +--- +... +vshard.router.call(1, 'write', 'do_replace', {{1, 1}}) +--- +- true +... +vshard.router.call(1, 'read', 'do_select', {1}) +--- +- [[1, 1]] +... +-- Test that static router is just a router object under the hood. +static_router:route(1) == vshard.router.route(1) +--- +- true +... +-- Configure extra router. +router_2 = vshard.router.new('router_2', configs.cfg_2) +--- +... +router_2:bootstrap() +--- +- true +... +_ = test_run:cmd("switch storage_2_2_a") +--- +... +wait_rebalancer_state('The cluster is balanced ok', test_run) +--- +... +_ = test_run:cmd("switch router_1") +--- +... +router_2:call(1, 'write', 'do_replace', {{2, 2}}) +--- +- true +... +router_2:call(1, 'read', 'do_select', {2}) +--- +- [[2, 2]] +... +-- Check that router_2 and static router serves different clusters. +#router_2:call(1, 'read', 'do_select', {1}) == 0 +--- +- true +... +-- Create several routers to the same cluster. +routers = {} +--- +... +for i = 3, 10 do routers[i] = vshard.router.new('router_' .. i, configs.cfg_2) end +--- +... +routers[3]:call(1, 'read', 'do_select', {2}) +--- +- [[2, 2]] +... +-- Check that they have their own background fibers. +fiber_names = {} +--- +... +for i = 2, 10 do fiber_names['vshard.failover.router_' .. i] = true; fiber_names['vshard.discovery.router_' .. i] = true; end +--- +... +next(fiber_names) ~= nil +--- +- true +... +fiber = require('fiber') +--- +... +for _, xfiber in pairs(fiber.info()) do fiber_names[xfiber.name] = nil end +--- +... +next(fiber_names) == nil +--- +- true +... +-- Reconfigure one of routers do not affect the others. +routers[3]:cfg(configs.cfg_1) +--- +... +routers[3]:call(1, 'read', 'do_select', {1}) +--- +- [[1, 1]] +... +#routers[3]:call(1, 'read', 'do_select', {2}) == 0 +--- +- true +... +#routers[4]:call(1, 'read', 'do_select', {1}) == 0 +--- +- true +... 
+routers[4]:call(1, 'read', 'do_select', {2}) +--- +- [[2, 2]] +... +routers[3]:cfg(configs.cfg_2) +--- +... +-- Try to create router with the same name. +util = require('lua_libs.util') +--- +... +util.check_error(vshard.router.new, 'router_2', configs.cfg_2) +--- +- null +- Router with name router_2 already exists +... +-- Reload router module. +_, old_rs_1 = next(vshard.router.internal.static_router.replicasets) +--- +... +_, old_rs_2 = next(router_2.replicasets) +--- +... +package.loaded['vshard.router'] = nil +--- +... +vshard.router = require('vshard.router') +--- +... +while not old_rs_1.is_outdated do fiber.sleep(0.01) end +--- +... +while not old_rs_2.is_outdated do fiber.sleep(0.01) end +--- +... +vshard.router.call(1, 'read', 'do_select', {1}) +--- +- [[1, 1]] +... +router_2:call(1, 'read', 'do_select', {2}) +--- +- [[2, 2]] +... +routers[5]:call(1, 'read', 'do_select', {2}) +--- +- [[2, 2]] +... +-- Check lua_gc counter. +lua_gc = require('vshard.lua_gc') +--- +... +vshard.router.internal.collect_lua_garbage_cnt == 0 +--- +- true +... +lua_gc.internal.bg_fiber == nil +--- +- true +... +configs.cfg_2.collect_lua_garbage = true +--- +... +routers[5]:cfg(configs.cfg_2) +--- +... +lua_gc.internal.bg_fiber ~= nil +--- +- true +... +routers[7]:cfg(configs.cfg_2) +--- +... +lua_gc.internal.bg_fiber ~= nil +--- +- true +... +vshard.router.internal.collect_lua_garbage_cnt == 2 +--- +- true +... +package.loaded['vshard.router'] = nil +--- +... +vshard.router = require('vshard.router') +--- +... +vshard.router.internal.collect_lua_garbage_cnt == 2 +--- +- true +... +configs.cfg_2.collect_lua_garbage = nil +--- +... +routers[5]:cfg(configs.cfg_2) +--- +... +lua_gc.internal.bg_fiber ~= nil +--- +- true +... +routers[7]:cfg(configs.cfg_2) +--- +... +vshard.router.internal.collect_lua_garbage_cnt == 0 +--- +- true +... +lua_gc.internal.bg_fiber == nil +--- +- true +... +_ = test_run:cmd("switch default") +--- +... 
+test_run:cmd("stop server router_1") +--- +- true +... +test_run:cmd("cleanup server router_1") +--- +- true +... +test_run:drop_cluster(REPLICASET_1_1) +--- +... +test_run:drop_cluster(REPLICASET_1_2) +--- +... +test_run:drop_cluster(REPLICASET_2_1) +--- +... +test_run:drop_cluster(REPLICASET_2_2) +--- +... diff --git a/test/multiple_routers/multiple_routers.test.lua b/test/multiple_routers/multiple_routers.test.lua new file mode 100644 index 0000000..760ad9f --- /dev/null +++ b/test/multiple_routers/multiple_routers.test.lua @@ -0,0 +1,108 @@ +test_run = require('test_run').new() + +REPLICASET_1_1 = { 'storage_1_1_a', 'storage_1_1_b' } +REPLICASET_1_2 = { 'storage_1_2_a', 'storage_1_2_b' } +REPLICASET_2_1 = { 'storage_2_1_a', 'storage_2_1_b' } +REPLICASET_2_2 = { 'storage_2_2_a', 'storage_2_2_b' } + +test_run:create_cluster(REPLICASET_1_1, 'multiple_routers') +test_run:create_cluster(REPLICASET_1_2, 'multiple_routers') +test_run:create_cluster(REPLICASET_2_1, 'multiple_routers') +test_run:create_cluster(REPLICASET_2_2, 'multiple_routers') +util = require('lua_libs.util') +util.wait_master(test_run, REPLICASET_1_1, 'storage_1_1_a') +util.wait_master(test_run, REPLICASET_1_2, 'storage_1_2_a') +util.wait_master(test_run, REPLICASET_2_1, 'storage_2_1_a') +util.wait_master(test_run, REPLICASET_2_2, 'storage_2_2_a') + +test_run:cmd("create server router_1 with script='multiple_routers/router_1.lua'") +test_run:cmd("start server router_1") + +-- Configure default (static) router. +_ = test_run:cmd("switch router_1") +static_router = vshard.router.new('_static_router', configs.cfg_1) +vshard.router.bootstrap() +_ = test_run:cmd("switch storage_1_2_a") +wait_rebalancer_state('The cluster is balanced ok', test_run) +_ = test_run:cmd("switch router_1") + +vshard.router.call(1, 'write', 'do_replace', {{1, 1}}) +vshard.router.call(1, 'read', 'do_select', {1}) + +-- Test that static router is just a router object under the hood. 
+static_router:route(1) == vshard.router.route(1) + +-- Configure extra router. +router_2 = vshard.router.new('router_2', configs.cfg_2) +router_2:bootstrap() +_ = test_run:cmd("switch storage_2_2_a") +wait_rebalancer_state('The cluster is balanced ok', test_run) +_ = test_run:cmd("switch router_1") + +router_2:call(1, 'write', 'do_replace', {{2, 2}}) +router_2:call(1, 'read', 'do_select', {2}) +-- Check that router_2 and static router serves different clusters. +#router_2:call(1, 'read', 'do_select', {1}) == 0 + +-- Create several routers to the same cluster. +routers = {} +for i = 3, 10 do routers[i] = vshard.router.new('router_' .. i, configs.cfg_2) end +routers[3]:call(1, 'read', 'do_select', {2}) +-- Check that they have their own background fibers. +fiber_names = {} +for i = 2, 10 do fiber_names['vshard.failover.router_' .. i] = true; fiber_names['vshard.discovery.router_' .. i] = true; end +next(fiber_names) ~= nil +fiber = require('fiber') +for _, xfiber in pairs(fiber.info()) do fiber_names[xfiber.name] = nil end +next(fiber_names) == nil + +-- Reconfigure one of routers do not affect the others. +routers[3]:cfg(configs.cfg_1) +routers[3]:call(1, 'read', 'do_select', {1}) +#routers[3]:call(1, 'read', 'do_select', {2}) == 0 +#routers[4]:call(1, 'read', 'do_select', {1}) == 0 +routers[4]:call(1, 'read', 'do_select', {2}) +routers[3]:cfg(configs.cfg_2) + +-- Try to create router with the same name. +util = require('lua_libs.util') +util.check_error(vshard.router.new, 'router_2', configs.cfg_2) + +-- Reload router module. 
+_, old_rs_1 = next(vshard.router.internal.static_router.replicasets) +_, old_rs_2 = next(router_2.replicasets) +package.loaded['vshard.router'] = nil +vshard.router = require('vshard.router') +while not old_rs_1.is_outdated do fiber.sleep(0.01) end +while not old_rs_2.is_outdated do fiber.sleep(0.01) end +vshard.router.call(1, 'read', 'do_select', {1}) +router_2:call(1, 'read', 'do_select', {2}) +routers[5]:call(1, 'read', 'do_select', {2}) + +-- Check lua_gc counter. +lua_gc = require('vshard.lua_gc') +vshard.router.internal.collect_lua_garbage_cnt == 0 +lua_gc.internal.bg_fiber == nil +configs.cfg_2.collect_lua_garbage = true +routers[5]:cfg(configs.cfg_2) +lua_gc.internal.bg_fiber ~= nil +routers[7]:cfg(configs.cfg_2) +lua_gc.internal.bg_fiber ~= nil +vshard.router.internal.collect_lua_garbage_cnt == 2 +package.loaded['vshard.router'] = nil +vshard.router = require('vshard.router') +vshard.router.internal.collect_lua_garbage_cnt == 2 +configs.cfg_2.collect_lua_garbage = nil +routers[5]:cfg(configs.cfg_2) +lua_gc.internal.bg_fiber ~= nil +routers[7]:cfg(configs.cfg_2) +vshard.router.internal.collect_lua_garbage_cnt == 0 +lua_gc.internal.bg_fiber == nil + +_ = test_run:cmd("switch default") +test_run:cmd("stop server router_1") +test_run:cmd("cleanup server router_1") +test_run:drop_cluster(REPLICASET_1_1) +test_run:drop_cluster(REPLICASET_1_2) +test_run:drop_cluster(REPLICASET_2_1) +test_run:drop_cluster(REPLICASET_2_2) diff --git a/test/multiple_routers/router_1.lua b/test/multiple_routers/router_1.lua new file mode 100644 index 0000000..2e9ea91 --- /dev/null +++ b/test/multiple_routers/router_1.lua @@ -0,0 +1,15 @@ +#!/usr/bin/env tarantool + +require('strict').on() + +-- Get instance name +local fio = require('fio') +local NAME = fio.basename(arg[0], '.lua') + +require('console').listen(os.getenv('ADMIN')) + +configs = require('configs') + +-- Start the database with sharding +vshard = require('vshard') +box.cfg{} diff --git 
a/test/multiple_routers/storage_1_1_a.lua b/test/multiple_routers/storage_1_1_a.lua new file mode 100644 index 0000000..b44a97a --- /dev/null +++ b/test/multiple_routers/storage_1_1_a.lua @@ -0,0 +1,23 @@ +#!/usr/bin/env tarantool + +require('strict').on() + +-- Get instance name. +local fio = require('fio') +NAME = fio.basename(arg[0], '.lua') + +require('console').listen(os.getenv('ADMIN')) + +-- Fetch config for the cluster of the instance. +if NAME:sub(9,9) == '1' then + cfg = require('configs').cfg_1 +else + cfg = require('configs').cfg_2 +end + +-- Start the database with sharding. +vshard = require('vshard') +vshard.storage.cfg(cfg, names[NAME]) + +-- Bootstrap storage. +require('lua_libs.bootstrap') diff --git a/test/multiple_routers/storage_1_1_b.lua b/test/multiple_routers/storage_1_1_b.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_1_1_b.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/storage_1_2_a.lua b/test/multiple_routers/storage_1_2_a.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_1_2_a.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/storage_1_2_b.lua b/test/multiple_routers/storage_1_2_b.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_1_2_b.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/storage_2_1_a.lua b/test/multiple_routers/storage_2_1_a.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_2_1_a.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/storage_2_1_b.lua b/test/multiple_routers/storage_2_1_b.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_2_1_b.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No 
newline at end of file diff --git a/test/multiple_routers/storage_2_2_a.lua b/test/multiple_routers/storage_2_2_a.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_2_2_a.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/storage_2_2_b.lua b/test/multiple_routers/storage_2_2_b.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_2_2_b.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/suite.ini b/test/multiple_routers/suite.ini new file mode 100644 index 0000000..d2d4470 --- /dev/null +++ b/test/multiple_routers/suite.ini @@ -0,0 +1,6 @@ +[default] +core = tarantool +description = Multiple routers tests +script = test.lua +is_parallel = False +lua_libs = ../lua_libs configs.lua diff --git a/test/multiple_routers/test.lua b/test/multiple_routers/test.lua new file mode 100644 index 0000000..cb7c1ee --- /dev/null +++ b/test/multiple_routers/test.lua @@ -0,0 +1,9 @@ +#!/usr/bin/env tarantool + +require('strict').on() + +box.cfg{ + listen = os.getenv("LISTEN"), +} + +require('console').listen(os.getenv('ADMIN')) diff --git a/test/router/exponential_timeout.result b/test/router/exponential_timeout.result index fb54d0f..6748b64 100644 --- a/test/router/exponential_timeout.result +++ b/test/router/exponential_timeout.result @@ -37,10 +37,10 @@ test_run:cmd('switch router_1') util = require('util') --- ... -rs1 = vshard.router.internal.replicasets[replicasets[1]] +rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]] --- ... -rs2 = vshard.router.internal.replicasets[replicasets[2]] +rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]] --- ... 
util.collect_timeouts(rs1) diff --git a/test/router/exponential_timeout.test.lua b/test/router/exponential_timeout.test.lua index 3ec0b8c..75d85bf 100644 --- a/test/router/exponential_timeout.test.lua +++ b/test/router/exponential_timeout.test.lua @@ -13,8 +13,8 @@ test_run:cmd("start server router_1") test_run:cmd('switch router_1') util = require('util') -rs1 = vshard.router.internal.replicasets[replicasets[1]] -rs2 = vshard.router.internal.replicasets[replicasets[2]] +rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]] +rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]] util.collect_timeouts(rs1) util.collect_timeouts(rs2) diff --git a/test/router/reconnect_to_master.result b/test/router/reconnect_to_master.result index 5e678ce..d502723 100644 --- a/test/router/reconnect_to_master.result +++ b/test/router/reconnect_to_master.result @@ -76,7 +76,7 @@ _ = test_run:cmd('stop server storage_1_a') _ = test_run:switch('router_1') --- ... -reps = vshard.router.internal.replicasets +reps = vshard.router.internal.static_router.replicasets --- ... test_run:cmd("setopt delimiter ';'") @@ -95,7 +95,7 @@ end; ... function count_known_buckets() local known_buckets = 0 - for _, id in pairs(vshard.router.internal.route_map) do + for _, id in pairs(vshard.router.internal.static_router.route_map) do known_buckets = known_buckets + 1 end return known_buckets @@ -127,7 +127,7 @@ is_disconnected() fiber = require('fiber') --- ... -while vshard.router.internal.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end +while vshard.router.internal.static_router.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end --- ... 
vshard.router.info() diff --git a/test/router/reconnect_to_master.test.lua b/test/router/reconnect_to_master.test.lua index 39ba90e..8820fa7 100644 --- a/test/router/reconnect_to_master.test.lua +++ b/test/router/reconnect_to_master.test.lua @@ -34,7 +34,7 @@ _ = test_run:cmd('stop server storage_1_a') _ = test_run:switch('router_1') -reps = vshard.router.internal.replicasets +reps = vshard.router.internal.static_router.replicasets test_run:cmd("setopt delimiter ';'") function is_disconnected() for i, rep in pairs(reps) do @@ -46,7 +46,7 @@ function is_disconnected() end; function count_known_buckets() local known_buckets = 0 - for _, id in pairs(vshard.router.internal.route_map) do + for _, id in pairs(vshard.router.internal.static_router.route_map) do known_buckets = known_buckets + 1 end return known_buckets @@ -63,7 +63,7 @@ is_disconnected() -- Wait until replica is connected to test alerts on unavailable -- master. fiber = require('fiber') -while vshard.router.internal.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end +while vshard.router.internal.static_router.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end vshard.router.info() -- Return master. diff --git a/test/router/reload.result b/test/router/reload.result index f0badc3..98e8e71 100644 --- a/test/router/reload.result +++ b/test/router/reload.result @@ -229,7 +229,7 @@ vshard.router.cfg(cfg) cfg.connection_outdate_delay = old_connection_delay --- ... -vshard.router.internal.connection_outdate_delay = nil +vshard.router.internal.static_router.connection_outdate_delay = nil --- ... 
rs_new = vshard.router.route(1) diff --git a/test/router/reload.test.lua b/test/router/reload.test.lua index 528222a..293cb26 100644 --- a/test/router/reload.test.lua +++ b/test/router/reload.test.lua @@ -104,7 +104,7 @@ old_connection_delay = cfg.connection_outdate_delay cfg.connection_outdate_delay = 0.3 vshard.router.cfg(cfg) cfg.connection_outdate_delay = old_connection_delay -vshard.router.internal.connection_outdate_delay = nil +vshard.router.internal.static_router.connection_outdate_delay = nil rs_new = vshard.router.route(1) rs_old = rs _, replica_old = next(rs_old.replicas) diff --git a/test/router/reroute_wrong_bucket.result b/test/router/reroute_wrong_bucket.result index 7f2a494..989dc79 100644 --- a/test/router/reroute_wrong_bucket.result +++ b/test/router/reroute_wrong_bucket.result @@ -98,7 +98,7 @@ vshard.router.call(100, 'read', 'customer_lookup', {1}, {timeout = 100}) --- - {'accounts': [], 'customer_id': 1, 'name': 'name'} ... -vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]] +vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]] --- ... vshard.router.call(100, 'write', 'customer_add', {{customer_id = 2, bucket_id = 100, name = 'name2', accounts = {}}}, {timeout = 100}) @@ -146,13 +146,13 @@ test_run:switch('router_1') ... -- Emulate a situation, when a replicaset_2 while is unknown for -- router, but is already known for storages. -save_rs2 = vshard.router.internal.replicasets[replicasets[2]] +save_rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]] --- ... -vshard.router.internal.replicasets[replicasets[2]] = nil +vshard.router.internal.static_router.replicasets[replicasets[2]] = nil --- ... -vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]] +vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]] --- ... 
fiber = require('fiber') @@ -207,7 +207,7 @@ err require('log').info(string.rep('a', 1000)) --- ... -vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]] +vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]] --- ... call_retval = nil @@ -219,7 +219,7 @@ f = fiber.create(do_call, 100) while not test_run:grep_log('router_1', 'please update configuration', 1000) do fiber.sleep(0.1) end --- ... -vshard.router.internal.replicasets[replicasets[2]] = save_rs2 +vshard.router.internal.static_router.replicasets[replicasets[2]] = save_rs2 --- ... while not call_retval do fiber.sleep(0.1) end diff --git a/test/router/reroute_wrong_bucket.test.lua b/test/router/reroute_wrong_bucket.test.lua index 03384d1..a00f941 100644 --- a/test/router/reroute_wrong_bucket.test.lua +++ b/test/router/reroute_wrong_bucket.test.lua @@ -35,7 +35,7 @@ customer_add({customer_id = 1, bucket_id = 100, name = 'name', accounts = {}}) test_run:switch('router_1') vshard.router.call(100, 'read', 'customer_lookup', {1}, {timeout = 100}) -vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]] +vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]] vshard.router.call(100, 'write', 'customer_add', {{customer_id = 2, bucket_id = 100, name = 'name2', accounts = {}}}, {timeout = 100}) -- Create cycle. @@ -55,9 +55,9 @@ box.space._bucket:replace({100, vshard.consts.BUCKET.SENT, replicasets[2]}) test_run:switch('router_1') -- Emulate a situation, when a replicaset_2 while is unknown for -- router, but is already known for storages. 
-save_rs2 = vshard.router.internal.replicasets[replicasets[2]] -vshard.router.internal.replicasets[replicasets[2]] = nil -vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]] +save_rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]] +vshard.router.internal.static_router.replicasets[replicasets[2]] = nil +vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]] fiber = require('fiber') call_retval = nil @@ -84,11 +84,11 @@ err -- detect it and end with ok. -- require('log').info(string.rep('a', 1000)) -vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]] +vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]] call_retval = nil f = fiber.create(do_call, 100) while not test_run:grep_log('router_1', 'please update configuration', 1000) do fiber.sleep(0.1) end -vshard.router.internal.replicasets[replicasets[2]] = save_rs2 +vshard.router.internal.static_router.replicasets[replicasets[2]] = save_rs2 while not call_retval do fiber.sleep(0.1) end call_retval vshard.router.call(100, 'read', 'customer_lookup', {3}, {timeout = 1}) diff --git a/test/router/retry_reads.result b/test/router/retry_reads.result index 64b0ff3..b803ae3 100644 --- a/test/router/retry_reads.result +++ b/test/router/retry_reads.result @@ -37,7 +37,7 @@ test_run:cmd('switch router_1') util = require('util') --- ... -rs1 = vshard.router.internal.replicasets[replicasets[1]] +rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]] --- ... 
min_timeout = vshard.consts.CALL_TIMEOUT_MIN diff --git a/test/router/retry_reads.test.lua b/test/router/retry_reads.test.lua index 2fb2fc7..510e961 100644 --- a/test/router/retry_reads.test.lua +++ b/test/router/retry_reads.test.lua @@ -13,7 +13,7 @@ test_run:cmd("start server router_1") test_run:cmd('switch router_1') util = require('util') -rs1 = vshard.router.internal.replicasets[replicasets[1]] +rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]] min_timeout = vshard.consts.CALL_TIMEOUT_MIN -- diff --git a/test/router/router.result b/test/router/router.result ^ permalink raw reply [flat|nested] 23+ messages in thread
* [tarantool-patches] Re: [PATCH 3/3] Introduce multiple routers feature 2018-08-03 20:05 ` Alex Khatskevich @ 2018-08-06 17:03 ` Vladislav Shpilevoy 2018-08-07 13:18 ` Alex Khatskevich 0 siblings, 1 reply; 23+ messages in thread From: Vladislav Shpilevoy @ 2018-08-06 17:03 UTC (permalink / raw) To: Alex Khatskevich, tarantool-patches Thanks for the patch! See 8 comments below. 1. You did not send a full diff. There are tests only. (In this email I pasted it myself.) Please send a full diff next time. >>> + if M.routers[name] then >>> + return nil, string.format('Router with name %s already exists', name) >>> + end >>> + local router = table.deepcopy(ROUTER_TEMPLATE) >>> + setmetatable(router, router_mt) >>> + router.name = name >>> + M.routers[name] = router >>> + if name == STATIC_ROUTER_NAME then >>> + M.static_router = router >>> + export_static_router_attributes() >>> + end >> >> 9. This check can be removed if you move >> export_static_router_attributes call into legacy_cfg. > But due to this `if`, the static router can be configured by > `vshard.router.new(static_router_name)`. 2. It is not ok. A user should not use any internal names like _static_router to configure or get it. Please add a new member vshard.router.static that references the static one. Until cfg is called it is nil. > diff --git a/vshard/router/init.lua b/vshard/router/init.lua > index 3e127cb..62fdcda 100644 > --- a/vshard/router/init.lua > +++ b/vshard/router/init.lua > @@ -25,14 +25,32 @@ local M = rawget(_G, MODULE_INTERNALS) > if not M then > M = { > ---------------- Common module attributes ---------------- > - -- The last passed configuration. > - current_cfg = nil, > errinj = { > ERRINJ_CFG = false, > ERRINJ_FAILOVER_CHANGE_CFG = false, > ERRINJ_RELOAD = false, > ERRINJ_LONG_DISCOVERY = false, > }, > + -- Dictionary, key is router name, value is a router. > + routers = {}, > + -- Router object which can be accessed by old api: > + -- e.g. vshard.router.call(...) 
> + static_router = nil, > + -- This counter is used to restart background fibers with > + -- new reloaded code. > + module_version = 0, > + collect_lua_garbage_cnt = 0, 3. A comment? > + } > +end > + > +-- > +-- Router object attributes. > +-- > +local ROUTER_TEMPLATE = { > + -- Name of router. > + name = nil, > + -- The last passed configuration. > + current_cfg = nil, > -- Time to outdate old objects on reload. > connection_outdate_delay = nil, > -- Bucket map cache. > @@ -488,8 +505,20 @@ end > -- Configuration > -------------------------------------------------------------------------------- > > -local function router_cfg(cfg) > - local vshard_cfg, box_cfg = lcfg.check(cfg, M.current_cfg) > +local function change_lua_gc_cnt(val) 4. The same. > + assert(M.collect_lua_garbage_cnt >= 0) > + local prev_cnt = M.collect_lua_garbage_cnt > + M.collect_lua_garbage_cnt = M.collect_lua_garbage_cnt + val > + if prev_cnt == 0 and M.collect_lua_garbage_cnt > 0 then > + lua_gc.set_state(true, consts.COLLECT_LUA_GARBAGE_INTERVAL) > + end > + if prev_cnt > 0 and M.collect_lua_garbage_cnt == 0 then > + lua_gc.set_state(false, consts.COLLECT_LUA_GARBAGE_INTERVAL) > + end 5. You always know the concrete val in the caller: 1 or -1. I think it would look simpler if you split this function into separate inc and dec ones. The former checks for prev == 0 and new > 0, the latter checks for prev > 0 and new == 0. There is no need to check both each time. 
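The inc/dec split suggested in comment 5 could look like the following sketch. Here `M`, `lua_gc` and `consts` are stubbed stand-ins for the router module internals from the patch, and the function names are illustrative, not from the actual code:

```lua
-- Stubs standing in for the patch's module state; in vshard these
-- would be vshard/router/init.lua's M, vshard/lua_gc and vshard/consts.
local consts = {COLLECT_LUA_GARBAGE_INTERVAL = 100}
local lua_gc = {state = false}
function lua_gc.set_state(on, interval) lua_gc.state = on end
local M = {collect_lua_garbage_cnt = 0}

-- Increment: only the 0 -> 1 transition can start the GC fiber.
function lua_gc_cnt_inc()
    M.collect_lua_garbage_cnt = M.collect_lua_garbage_cnt + 1
    if M.collect_lua_garbage_cnt == 1 then
        lua_gc.set_state(true, consts.COLLECT_LUA_GARBAGE_INTERVAL)
    end
end

-- Decrement: only the 1 -> 0 transition can stop it.
function lua_gc_cnt_dec()
    assert(M.collect_lua_garbage_cnt > 0)
    M.collect_lua_garbage_cnt = M.collect_lua_garbage_cnt - 1
    if M.collect_lua_garbage_cnt == 0 then
        lua_gc.set_state(false, consts.COLLECT_LUA_GARBAGE_INTERVAL)
    end
end

lua_gc_cnt_inc(); lua_gc_cnt_inc() -- two routers want GC
lua_gc_cnt_dec()                   -- one still does: fiber keeps running
print(lua_gc.state)                -- true
lua_gc_cnt_dec()
print(lua_gc.state)                -- false
```

Each caller always knows whether it is adding or removing a router, so each function checks only its own boundary transition.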
> +end > + > +local function router_cfg(router, cfg) > + local vshard_cfg, box_cfg = lcfg.check(cfg, router.current_cfg) > if not M.replicasets then > log.info('Starting router configuration') > else > @@ -803,6 +839,77 @@ if M.errinj.ERRINJ_RELOAD then > error('Error injection: reload') > end > > +-------------------------------------------------------------------------------- > +-- Managing router instances > +-------------------------------------------------------------------------------- > + > +local function cfg_reconfigure(router, cfg) > + return router_cfg(router, cfg) > +end > + > +local router_mt = { > + __index = { > + cfg = cfg_reconfigure; > + info = router_info; > + buckets_info = router_buckets_info; > + call = router_call; > + callro = router_callro; > + callrw = router_callrw; > + route = router_route; > + routeall = router_routeall; > + bucket_id = router_bucket_id; > + bucket_count = router_bucket_count; > + sync = router_sync; > + bootstrap = cluster_bootstrap; > + bucket_discovery = bucket_discovery; > + discovery_wakeup = discovery_wakeup; > + } > +} > + > +-- Table which represents this module. > +local module = {} > + > +-- This metatable bypasses calls to a module to the static_router. > +local module_mt = {__index = {}} > +for method_name, method in pairs(router_mt.__index) do > + module_mt.__index[method_name] = function(...) > + return method(M.static_router, ...) > + end > +end > + > +local function export_static_router_attributes() > + setmetatable(module, module_mt) > +end > + > +local function router_new(name, cfg) 6. A comment? 7. This function should not check for router_name == static one. It just creates a new router and returns it. The caller should set it into M.routers or M.static_router if needed. For a user you expose not this function but a wrapper that calls router_new and sets M.routers. > + if type(name) ~= 'string' or type(cfg) ~= 'table' then > + error('Wrong argument type. 
Usage: vshard.router.new(name, cfg).') > + end > + if M.routers[name] then > + return nil, string.format('Router with name %s already exists', name) > + end > + local router = table.deepcopy(ROUTER_TEMPLATE) > + setmetatable(router, router_mt) > + router.name = name > + M.routers[name] = router > + if name == STATIC_ROUTER_NAME then > + M.static_router = router > + export_static_router_attributes() > + end > + router_cfg(router, cfg) > + return router > +end > + > @@ -813,28 +920,23 @@ end > if not rawget(_G, MODULE_INTERNALS) then > rawset(_G, MODULE_INTERNALS, M) > else > - router_cfg(M.current_cfg) > + for _, router in pairs(M.routers) do > + router_cfg(router, router.current_cfg) > + setmetatable(router, router_mt) > + end > M.module_version = M.module_version + 1 > end > > M.discovery_f = discovery_f > M.failover_f = failover_f > +M.router_mt = router_mt > +if M.static_router then > + export_static_router_attributes() > +end 8. This is possible only on reload and can be moved into the `if` above, into the reload-case branch. ^ permalink raw reply [flat|nested] 23+ messages in thread
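The split proposed in comment 7 could be sketched as below. `ROUTER_TEMPLATE`, `router_mt` and `M` are simplified stand-ins for the module internals, `router_api_new` is an illustrative name for the public wrapper, and Tarantool's `table.deepcopy` is replaced by a tiny local helper so the sketch runs in plain Lua:

```lua
-- Simplified stand-ins for vshard/router/init.lua internals.
local ROUTER_TEMPLATE = {}
local router_mt = {__index = {
    cfg = function(router, cfg) router.current_cfg = cfg end,
}}
local M = {routers = {}}

-- Plain-Lua replacement for Tarantool's table.deepcopy.
local function deepcopy(t)
    local copy = {}
    for k, v in pairs(t) do
        copy[k] = type(v) == 'table' and deepcopy(v) or v
    end
    return copy
end

-- Pure creator: builds and configures a router, nothing more.
local function router_new(name, cfg)
    local router = deepcopy(ROUTER_TEMPLATE)
    setmetatable(router, router_mt)
    router.name = name
    router:cfg(cfg)
    return router
end

-- Public wrapper: validates arguments and registers the router.
function router_api_new(name, cfg)
    if type(name) ~= 'string' or type(cfg) ~= 'table' then
        error('Wrong argument type. Usage: vshard.router.new(name, cfg).')
    end
    if M.routers[name] then
        return nil, string.format('Router with name %s already exists', name)
    end
    local router = router_new(name, cfg)
    M.routers[name] = router
    return router
end
```

With this shape, the caller (e.g. a legacy `cfg` path) can use `router_new` directly and decide itself whether the result becomes `M.static_router`.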
* [tarantool-patches] Re: [PATCH 3/3] Introduce multiple routers feature 2018-08-06 17:03 ` Vladislav Shpilevoy @ 2018-08-07 13:18 ` Alex Khatskevich 2018-08-08 12:28 ` Vladislav Shpilevoy 0 siblings, 1 reply; 23+ messages in thread From: Alex Khatskevich @ 2018-08-07 13:18 UTC (permalink / raw) To: Vladislav Shpilevoy, tarantool-patches On 06.08.2018 20:03, Vladislav Shpilevoy wrote: > Thanks for the patch! See 8 comments below. > > 1. You did not send a full diff. There are tests only. (In > this email I pasted it myself.) Please send a full diff > next time. Ok, sorry. > >>>> + if M.routers[name] then >>>> + return nil, string.format('Router with name %s already >>>> exists', name) >>>> + end >>>> + local router = table.deepcopy(ROUTER_TEMPLATE) >>>> + setmetatable(router, router_mt) >>>> + router.name = name >>>> + M.routers[name] = router >>>> + if name == STATIC_ROUTER_NAME then >>>> + M.static_router = router >>>> + export_static_router_attributes() >>>> + end >>> >>> 9. This check can be removed if you move >>> export_static_router_attributes call into legacy_cfg. >> But due to this `if`, the static router can be configured by >> `vshard.router.new(static_router_name)`. > > 2. It is not ok. A user should not use any internal names like > _static_router to configure or get it. Please add a new member > vshard.router.static that references the static one. Until cfg > is called it is nil. Fixed. For now, a user can create the static router only by calling `vshard.router.cfg()`. > >> diff --git a/vshard/router/init.lua b/vshard/router/init.lua >> index 3e127cb..62fdcda 100644 >> --- a/vshard/router/init.lua >> +++ b/vshard/router/init.lua >> @@ -25,14 +25,32 @@ local M = rawget(_G, MODULE_INTERNALS) >> if not M then >> M = { >> ---------------- Common module attributes ---------------- >> - -- The last passed configuration. 
>> - current_cfg = nil, >> errinj = { >> ERRINJ_CFG = false, >> ERRINJ_FAILOVER_CHANGE_CFG = false, >> ERRINJ_RELOAD = false, >> ERRINJ_LONG_DISCOVERY = false, >> }, >> + -- Dictionary, key is router name, value is a router. >> + routers = {}, >> + -- Router object which can be accessed by old api: >> + -- e.g. vshard.router.call(...) >> + static_router = nil, >> + -- This counter is used to restart background fibers with >> + -- new reloaded code. >> + module_version = 0, >> + collect_lua_garbage_cnt = 0, > > 3. A comment? added: -- Number of routers that require collecting Lua garbage. > >> + } >> +end >> + >> +-- >> +-- Router object attributes. >> +-- >> +local ROUTER_TEMPLATE = { >> + -- Name of router. >> + name = nil, >> + -- The last passed configuration. >> + current_cfg = nil, >> -- Time to outdate old objects on reload. >> connection_outdate_delay = nil, >> -- Bucket map cache. >> @@ -488,8 +505,20 @@ end >> -- Configuration >> -------------------------------------------------------------------------------- >> >> >> -local function router_cfg(cfg) >> - local vshard_cfg, box_cfg = lcfg.check(cfg, M.current_cfg) >> +local function change_lua_gc_cnt(val) > > 4. The same. fixed > >> + assert(M.collect_lua_garbage_cnt >= 0) >> + local prev_cnt = M.collect_lua_garbage_cnt >> + M.collect_lua_garbage_cnt = M.collect_lua_garbage_cnt + val >> + if prev_cnt == 0 and M.collect_lua_garbage_cnt > 0 then >> + lua_gc.set_state(true, consts.COLLECT_LUA_GARBAGE_INTERVAL) >> + end >> + if prev_cnt > 0 and M.collect_lua_garbage_cnt == 0 then >> + lua_gc.set_state(false, consts.COLLECT_LUA_GARBAGE_INTERVAL) >> + end > > 5. You always know the concrete val in the caller: 1 or -1. I think > it would look simpler if you split this function into separate inc > and dec ones. The former checks for prev == 0 and new > 0, the latter > checks for prev > 0 and new == 0. There is no need to check both > each time. 
changed > >> +end >> + >> +local function router_cfg(router, cfg) >> + local vshard_cfg, box_cfg = lcfg.check(cfg, router.current_cfg) >> if not M.replicasets then >> log.info('Starting router configuration') >> else >> @@ -803,6 +839,77 @@ if M.errinj.ERRINJ_RELOAD then >> error('Error injection: reload') >> end >> >> +-------------------------------------------------------------------------------- >> >> +-- Managing router instances >> +-------------------------------------------------------------------------------- >> >> + >> +local function cfg_reconfigure(router, cfg) >> + return router_cfg(router, cfg) >> +end >> + >> +local router_mt = { >> + __index = { >> + cfg = cfg_reconfigure; >> + info = router_info; >> + buckets_info = router_buckets_info; >> + call = router_call; >> + callro = router_callro; >> + callrw = router_callrw; >> + route = router_route; >> + routeall = router_routeall; >> + bucket_id = router_bucket_id; >> + bucket_count = router_bucket_count; >> + sync = router_sync; >> + bootstrap = cluster_bootstrap; >> + bucket_discovery = bucket_discovery; >> + discovery_wakeup = discovery_wakeup; >> + } >> +} >> + >> +-- Table which represents this module. >> +local module = {} >> + >> +-- This metatable bypasses calls to a module to the static_router. >> +local module_mt = {__index = {}} >> +for method_name, method in pairs(router_mt.__index) do >> + module_mt.__index[method_name] = function(...) >> + return method(M.static_router, ...) >> + end >> +end >> + >> +local function export_static_router_attributes() >> + setmetatable(module, module_mt) >> +end >> + >> +local function router_new(name, cfg) > > 6. A comment? added > > 7. This function should not check for router_name == static one. > It just creates a new router and returns it. The caller should set > it into M.routers or M.static_router if needed. For a user you > expose not this function but a wrapper that calls router_new and > sets M.routers. fixed. 
> >> + if type(name) ~= 'string' or type(cfg) ~= 'table' then >> + error('Wrong argument type. Usage: >> vshard.router.new(name, cfg).') >> + end >> + if M.routers[name] then >> + return nil, string.format('Router with name %s already >> exists', name) >> + end >> + local router = table.deepcopy(ROUTER_TEMPLATE) >> + setmetatable(router, router_mt) >> + router.name = name >> + M.routers[name] = router >> + if name == STATIC_ROUTER_NAME then >> + M.static_router = router >> + export_static_router_attributes() >> + end >> + router_cfg(router, cfg) >> + return router >> +end >> + >> @@ -813,28 +920,23 @@ end >> if not rawget(_G, MODULE_INTERNALS) then >> rawset(_G, MODULE_INTERNALS, M) >> else >> - router_cfg(M.current_cfg) >> + for _, router in pairs(M.routers) do >> + router_cfg(router, router.current_cfg) >> + setmetatable(router, router_mt) >> + end >> M.module_version = M.module_version + 1 >> end >> >> M.discovery_f = discovery_f >> M.failover_f = failover_f >> +M.router_mt = router_mt >> +if M.static_router then >> + export_static_router_attributes() >> +end > > 8. This is possible on reload only and can be moved into > the if above to the reload case processing. Fixed. Here is a full diff commit 87b6dc044de177e159dbe24f07abf3f98839ccff Author: AKhatskevich <avkhatskevich@tarantool.org> Date: Thu Jul 26 16:17:25 2018 +0300 Introduce multiple routers feature Key points: * Old `vshard.router.some_method()` api is preserved. * Add `vshard.router.new(name, cfg)` method which returns a new router. * Each router has its own: 1. name 2. background fibers 3. attributes (route_map, replicasets, outdate_delay...) * Module reload reloads all configured routers. * `cfg` reconfigures a single router. * All routers share the same box configuration. The last passed config overrides the global box config. * Multiple router instances can be connected to the same cluster. * For now, a router cannot be destroyed. 
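The "old api is preserved" point rests on the forwarding metatable discussed above (export_static_router_attributes / module_mt): module-level calls are redirected to the static router object. A self-contained sketch of that pattern, with a dummy `call` method standing in for the real router API:

```lua
-- A toy router metatable with one method; in vshard, router_mt
-- carries call/callro/callrw/route/etc.
local router_mt = {
    __index = {
        call = function(router, bucket_id)
            return ('%s handles bucket %d'):format(router.name, bucket_id)
        end,
    }
}

M = {static_router = setmetatable({name = 'static'}, router_mt)}

-- Table which represents the module; for every router method,
-- generate a module-level function forwarding to M.static_router.
module = {}
local module_mt = {__index = {}}
for method_name, method in pairs(router_mt.__index) do
    module_mt.__index[method_name] = function(...)
        return method(M.static_router, ...)
    end
end
setmetatable(module, module_mt)

print(module.call(100)) -- static handles bucket 100
```

Because the wrappers read `M.static_router` at call time, reassigning the static router (e.g. on reload) transparently retargets the old-style `vshard.router.*` API.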
Extra changes: * Add `data` parameter to `reloadable_fiber_create` function. Closes #130 diff --git a/test/failover/failover.result b/test/failover/failover.result index 73a4250..50410ad 100644 --- a/test/failover/failover.result +++ b/test/failover/failover.result @@ -174,7 +174,7 @@ test_run:switch('router_1') --- - true ... -rs1 = vshard.router.internal.replicasets[rs_uuid[1]] +rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]] --- ... while not rs1.replica_up_ts do fiber.sleep(0.1) end diff --git a/test/failover/failover.test.lua b/test/failover/failover.test.lua index 6e06314..44c8b6d 100644 --- a/test/failover/failover.test.lua +++ b/test/failover/failover.test.lua @@ -74,7 +74,7 @@ echo_count -- Ensure that replica_up_ts is updated periodically. test_run:switch('router_1') -rs1 = vshard.router.internal.replicasets[rs_uuid[1]] +rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]] while not rs1.replica_up_ts do fiber.sleep(0.1) end old_up_ts = rs1.replica_up_ts while rs1.replica_up_ts == old_up_ts do fiber.sleep(0.1) end diff --git a/test/failover/failover_errinj.result b/test/failover/failover_errinj.result index 3b6d986..484a1e3 100644 --- a/test/failover/failover_errinj.result +++ b/test/failover/failover_errinj.result @@ -49,7 +49,7 @@ vshard.router.cfg(cfg) -- Check that already run failover step is restarted on -- configuration change (if some replicasets are removed from -- config). -rs1 = vshard.router.internal.replicasets[rs_uuid[1]] +rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]] --- ... 
while not rs1.replica or not rs1.replica.conn:is_connected() do fiber.sleep(0.1) end diff --git a/test/failover/failover_errinj.test.lua b/test/failover/failover_errinj.test.lua index b4d2d35..14228de 100644 --- a/test/failover/failover_errinj.test.lua +++ b/test/failover/failover_errinj.test.lua @@ -20,7 +20,7 @@ vshard.router.cfg(cfg) -- Check that already run failover step is restarted on -- configuration change (if some replicasets are removed from -- config). -rs1 = vshard.router.internal.replicasets[rs_uuid[1]] +rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]] while not rs1.replica or not rs1.replica.conn:is_connected() do fiber.sleep(0.1) end vshard.router.internal.errinj.ERRINJ_FAILOVER_CHANGE_CFG = true wait_state('Configuration has changed, restart ') diff --git a/test/failover/router_1.lua b/test/failover/router_1.lua index d71209b..664a6c6 100644 --- a/test/failover/router_1.lua +++ b/test/failover/router_1.lua @@ -42,7 +42,7 @@ end function priority_order() local ret = {} for _, uuid in pairs(rs_uuid) do - local rs = vshard.router.internal.replicasets[uuid] + local rs = vshard.router.internal.static_router.replicasets[uuid] local sorted = {} for _, replica in pairs(rs.priority_list) do local z diff --git a/test/misc/reconfigure.result b/test/misc/reconfigure.result index c7960b3..311f749 100644 --- a/test/misc/reconfigure.result +++ b/test/misc/reconfigure.result @@ -250,7 +250,7 @@ test_run:switch('router_1') -- Ensure that in a case of error router internals are not -- changed. -- -not vshard.router.internal.collect_lua_garbage +not vshard.router.internal.static_router.collect_lua_garbage --- - true ... @@ -264,7 +264,7 @@ vshard.router.cfg(cfg) --- - error: 'Incorrect value for option ''invalid_option'': unexpected option' ... -not vshard.router.internal.collect_lua_garbage +not vshard.router.internal.static_router.collect_lua_garbage --- - true ... 
diff --git a/test/misc/reconfigure.test.lua b/test/misc/reconfigure.test.lua index 25dc2ca..298b9b0 100644 --- a/test/misc/reconfigure.test.lua +++ b/test/misc/reconfigure.test.lua @@ -99,11 +99,11 @@ test_run:switch('router_1') -- Ensure that in a case of error router internals are not -- changed. -- -not vshard.router.internal.collect_lua_garbage +not vshard.router.internal.static_router.collect_lua_garbage cfg.collect_lua_garbage = true cfg.invalid_option = 'kek' vshard.router.cfg(cfg) -not vshard.router.internal.collect_lua_garbage +not vshard.router.internal.static_router.collect_lua_garbage cfg.invalid_option = nil cfg.collect_lua_garbage = nil vshard.router.cfg(cfg) diff --git a/test/multiple_routers/configs.lua b/test/multiple_routers/configs.lua new file mode 100644 index 0000000..a6ce33c --- /dev/null +++ b/test/multiple_routers/configs.lua @@ -0,0 +1,81 @@ +names = { + storage_1_1_a = '32a2d4b8-f146-44ed-9d51-2436507efdf8', + storage_1_1_b = 'c1c849b1-641d-40b8-9283-bcfe73d46270', + storage_1_2_a = '04e677ed-c7ba-47e0-a67f-b5100cfa86af', + storage_1_2_b = 'c7a979ee-9263-4a38-84a5-2fb6a0a32684', + storage_2_1_a = '88dc03f0-23fb-4f05-b462-e29186542864', + storage_2_1_b = '4230b711-f5c4-4131-bf98-88cd43a16901', + storage_2_2_a = '6b1eefbc-1e2e-410e-84ff-44c572ea9916', + storage_2_2_b = 'be74419a-1e56-4ba4-97e9-6b18710f63c5', +} + +rs_1_1 = 'dd208fb8-8b90-49bc-8393-6b3a99da7c52' +rs_1_2 = 'af9cfe88-2091-4613-a877-a623776c5c0e' +rs_2_1 = '9ca8ee15-ae18-4f31-9385-4859f89ce73f' +rs_2_2 = '007f5f58-b654-4125-8441-a71866fb62b5' + +local cfg_1 = {} +cfg_1.sharding = { + [rs_1_1] = { + replicas = { + [names.storage_1_1_a] = { + uri = 'storage:storage@127.0.0.1:3301', + name = 'storage_1_1_a', + master = true, + }, + [names.storage_1_1_b] = { + uri = 'storage:storage@127.0.0.1:3302', + name = 'storage_1_1_b', + }, + } + }, + [rs_1_2] = { + replicas = { + [names.storage_1_2_a] = { + uri = 'storage:storage@127.0.0.1:3303', + name = 'storage_1_2_a', + master = true, + 
}, + [names.storage_1_2_b] = { + uri = 'storage:storage@127.0.0.1:3304', + name = 'storage_1_2_b', + }, + } + }, +} + + +local cfg_2 = {} +cfg_2.sharding = { + [rs_2_1] = { + replicas = { + [names.storage_2_1_a] = { + uri = 'storage:storage@127.0.0.1:3305', + name = 'storage_2_1_a', + master = true, + }, + [names.storage_2_1_b] = { + uri = 'storage:storage@127.0.0.1:3306', + name = 'storage_2_1_b', + }, + } + }, + [rs_2_2] = { + replicas = { + [names.storage_2_2_a] = { + uri = 'storage:storage@127.0.0.1:3307', + name = 'storage_2_2_a', + master = true, + }, + [names.storage_2_2_b] = { + uri = 'storage:storage@127.0.0.1:3308', + name = 'storage_2_2_b', + }, + } + }, +} + +return { + cfg_1 = cfg_1, + cfg_2 = cfg_2, +} diff --git a/test/multiple_routers/multiple_routers.result b/test/multiple_routers/multiple_routers.result new file mode 100644 index 0000000..5b85e1c --- /dev/null +++ b/test/multiple_routers/multiple_routers.result @@ -0,0 +1,301 @@ +test_run = require('test_run').new() +--- +... +REPLICASET_1_1 = { 'storage_1_1_a', 'storage_1_1_b' } +--- +... +REPLICASET_1_2 = { 'storage_1_2_a', 'storage_1_2_b' } +--- +... +REPLICASET_2_1 = { 'storage_2_1_a', 'storage_2_1_b' } +--- +... +REPLICASET_2_2 = { 'storage_2_2_a', 'storage_2_2_b' } +--- +... +test_run:create_cluster(REPLICASET_1_1, 'multiple_routers') +--- +... +test_run:create_cluster(REPLICASET_1_2, 'multiple_routers') +--- +... +test_run:create_cluster(REPLICASET_2_1, 'multiple_routers') +--- +... +test_run:create_cluster(REPLICASET_2_2, 'multiple_routers') +--- +... +util = require('lua_libs.util') +--- +... +util.wait_master(test_run, REPLICASET_1_1, 'storage_1_1_a') +--- +... +util.wait_master(test_run, REPLICASET_1_2, 'storage_1_2_a') +--- +... +util.wait_master(test_run, REPLICASET_2_1, 'storage_2_1_a') +--- +... +util.wait_master(test_run, REPLICASET_2_2, 'storage_2_2_a') +--- +... +test_run:cmd("create server router_1 with script='multiple_routers/router_1.lua'") +--- +- true +... 
+test_run:cmd("start server router_1") +--- +- true +... +-- Configure default (static) router. +_ = test_run:cmd("switch router_1") +--- +... +vshard.router.cfg(configs.cfg_1) +--- +... +vshard.router.bootstrap() +--- +- true +... +_ = test_run:cmd("switch storage_1_2_a") +--- +... +wait_rebalancer_state('The cluster is balanced ok', test_run) +--- +... +_ = test_run:cmd("switch router_1") +--- +... +vshard.router.call(1, 'write', 'do_replace', {{1, 1}}) +--- +- true +... +vshard.router.call(1, 'read', 'do_select', {1}) +--- +- [[1, 1]] +... +-- Test that static router is just a router object under the hood. +static_router = vshard.router.internal.static_router +--- +... +static_router:route(1) == vshard.router.route(1) +--- +- true +... +-- Configure extra router. +router_2 = vshard.router.new('router_2', configs.cfg_2) +--- +... +router_2:bootstrap() +--- +- true +... +_ = test_run:cmd("switch storage_2_2_a") +--- +... +wait_rebalancer_state('The cluster is balanced ok', test_run) +--- +... +_ = test_run:cmd("switch router_1") +--- +... +router_2:call(1, 'write', 'do_replace', {{2, 2}}) +--- +- true +... +router_2:call(1, 'read', 'do_select', {2}) +--- +- [[2, 2]] +... +-- Check that router_2 and static router serves different clusters. +#router_2:call(1, 'read', 'do_select', {1}) == 0 +--- +- true +... +-- Create several routers to the same cluster. +routers = {} +--- +... +for i = 3, 10 do routers[i] = vshard.router.new('router_' .. i, configs.cfg_2) end +--- +... +routers[3]:call(1, 'read', 'do_select', {2}) +--- +- [[2, 2]] +... +-- Check that they have their own background fibers. +fiber_names = {} +--- +... +for i = 2, 10 do fiber_names['vshard.failover.router_' .. i] = true; fiber_names['vshard.discovery.router_' .. i] = true; end +--- +... +next(fiber_names) ~= nil +--- +- true +... +fiber = require('fiber') +--- +... +for _, xfiber in pairs(fiber.info()) do fiber_names[xfiber.name] = nil end +--- +... +next(fiber_names) == nil +--- +- true +... 
+-- Reconfigure one of routers do not affect the others. +routers[3]:cfg(configs.cfg_1) +--- +... +routers[3]:call(1, 'read', 'do_select', {1}) +--- +- [[1, 1]] +... +#routers[3]:call(1, 'read', 'do_select', {2}) == 0 +--- +- true +... +#routers[4]:call(1, 'read', 'do_select', {1}) == 0 +--- +- true +... +routers[4]:call(1, 'read', 'do_select', {2}) +--- +- [[2, 2]] +... +routers[3]:cfg(configs.cfg_2) +--- +... +-- Try to create router with the same name. +util = require('lua_libs.util') +--- +... +util.check_error(vshard.router.new, 'router_2', configs.cfg_2) +--- +- null +- type: ShardingError + code: 21 + name: ROUTER_ALREADY_EXISTS + message: Router with name router_2 already exists +... +-- Reload router module. +_, old_rs_1 = next(vshard.router.internal.static_router.replicasets) +--- +... +_, old_rs_2 = next(router_2.replicasets) +--- +... +package.loaded['vshard.router'] = nil +--- +... +vshard.router = require('vshard.router') +--- +... +while not old_rs_1.is_outdated do fiber.sleep(0.01) end +--- +... +while not old_rs_2.is_outdated do fiber.sleep(0.01) end +--- +... +vshard.router.call(1, 'read', 'do_select', {1}) +--- +- [[1, 1]] +... +router_2:call(1, 'read', 'do_select', {2}) +--- +- [[2, 2]] +... +routers[5]:call(1, 'read', 'do_select', {2}) +--- +- [[2, 2]] +... +-- Check lua_gc counter. +lua_gc = require('vshard.lua_gc') +--- +... +vshard.router.internal.collect_lua_garbage_cnt == 0 +--- +- true +... +lua_gc.internal.bg_fiber == nil +--- +- true +... +configs.cfg_2.collect_lua_garbage = true +--- +... +routers[5]:cfg(configs.cfg_2) +--- +... +lua_gc.internal.bg_fiber ~= nil +--- +- true +... +routers[7]:cfg(configs.cfg_2) +--- +... +lua_gc.internal.bg_fiber ~= nil +--- +- true +... +vshard.router.internal.collect_lua_garbage_cnt == 2 +--- +- true +... +package.loaded['vshard.router'] = nil +--- +... +vshard.router = require('vshard.router') +--- +... +vshard.router.internal.collect_lua_garbage_cnt == 2 +--- +- true +... 
+configs.cfg_2.collect_lua_garbage = nil +--- +... +routers[5]:cfg(configs.cfg_2) +--- +... +lua_gc.internal.bg_fiber ~= nil +--- +- true +... +routers[7]:cfg(configs.cfg_2) +--- +... +vshard.router.internal.collect_lua_garbage_cnt == 0 +--- +- true +... +lua_gc.internal.bg_fiber == nil +--- +- true +... +_ = test_run:cmd("switch default") +--- +... +test_run:cmd("stop server router_1") +--- +- true +... +test_run:cmd("cleanup server router_1") +--- +- true +... +test_run:drop_cluster(REPLICASET_1_1) +--- +... +test_run:drop_cluster(REPLICASET_1_2) +--- +... +test_run:drop_cluster(REPLICASET_2_1) +--- +... +test_run:drop_cluster(REPLICASET_2_2) +--- +... diff --git a/test/multiple_routers/multiple_routers.test.lua b/test/multiple_routers/multiple_routers.test.lua new file mode 100644 index 0000000..ec3c7f7 --- /dev/null +++ b/test/multiple_routers/multiple_routers.test.lua @@ -0,0 +1,109 @@ +test_run = require('test_run').new() + +REPLICASET_1_1 = { 'storage_1_1_a', 'storage_1_1_b' } +REPLICASET_1_2 = { 'storage_1_2_a', 'storage_1_2_b' } +REPLICASET_2_1 = { 'storage_2_1_a', 'storage_2_1_b' } +REPLICASET_2_2 = { 'storage_2_2_a', 'storage_2_2_b' } + +test_run:create_cluster(REPLICASET_1_1, 'multiple_routers') +test_run:create_cluster(REPLICASET_1_2, 'multiple_routers') +test_run:create_cluster(REPLICASET_2_1, 'multiple_routers') +test_run:create_cluster(REPLICASET_2_2, 'multiple_routers') +util = require('lua_libs.util') +util.wait_master(test_run, REPLICASET_1_1, 'storage_1_1_a') +util.wait_master(test_run, REPLICASET_1_2, 'storage_1_2_a') +util.wait_master(test_run, REPLICASET_2_1, 'storage_2_1_a') +util.wait_master(test_run, REPLICASET_2_2, 'storage_2_2_a') + +test_run:cmd("create server router_1 with script='multiple_routers/router_1.lua'") +test_run:cmd("start server router_1") + +-- Configure default (static) router. 
+_ = test_run:cmd("switch router_1") +vshard.router.cfg(configs.cfg_1) +vshard.router.bootstrap() +_ = test_run:cmd("switch storage_1_2_a") +wait_rebalancer_state('The cluster is balanced ok', test_run) +_ = test_run:cmd("switch router_1") + +vshard.router.call(1, 'write', 'do_replace', {{1, 1}}) +vshard.router.call(1, 'read', 'do_select', {1}) + +-- Test that static router is just a router object under the hood. +static_router = vshard.router.internal.static_router +static_router:route(1) == vshard.router.route(1) + +-- Configure extra router. +router_2 = vshard.router.new('router_2', configs.cfg_2) +router_2:bootstrap() +_ = test_run:cmd("switch storage_2_2_a") +wait_rebalancer_state('The cluster is balanced ok', test_run) +_ = test_run:cmd("switch router_1") + +router_2:call(1, 'write', 'do_replace', {{2, 2}}) +router_2:call(1, 'read', 'do_select', {2}) +-- Check that router_2 and static router serves different clusters. +#router_2:call(1, 'read', 'do_select', {1}) == 0 + +-- Create several routers to the same cluster. +routers = {} +for i = 3, 10 do routers[i] = vshard.router.new('router_' .. i, configs.cfg_2) end +routers[3]:call(1, 'read', 'do_select', {2}) +-- Check that they have their own background fibers. +fiber_names = {} +for i = 2, 10 do fiber_names['vshard.failover.router_' .. i] = true; fiber_names['vshard.discovery.router_' .. i] = true; end +next(fiber_names) ~= nil +fiber = require('fiber') +for _, xfiber in pairs(fiber.info()) do fiber_names[xfiber.name] = nil end +next(fiber_names) == nil + +-- Reconfigure one of routers do not affect the others. +routers[3]:cfg(configs.cfg_1) +routers[3]:call(1, 'read', 'do_select', {1}) +#routers[3]:call(1, 'read', 'do_select', {2}) == 0 +#routers[4]:call(1, 'read', 'do_select', {1}) == 0 +routers[4]:call(1, 'read', 'do_select', {2}) +routers[3]:cfg(configs.cfg_2) + +-- Try to create router with the same name. 
+util = require('lua_libs.util') +util.check_error(vshard.router.new, 'router_2', configs.cfg_2) + +-- Reload router module. +_, old_rs_1 = next(vshard.router.internal.static_router.replicasets) +_, old_rs_2 = next(router_2.replicasets) +package.loaded['vshard.router'] = nil +vshard.router = require('vshard.router') +while not old_rs_1.is_outdated do fiber.sleep(0.01) end +while not old_rs_2.is_outdated do fiber.sleep(0.01) end +vshard.router.call(1, 'read', 'do_select', {1}) +router_2:call(1, 'read', 'do_select', {2}) +routers[5]:call(1, 'read', 'do_select', {2}) + +-- Check lua_gc counter. +lua_gc = require('vshard.lua_gc') +vshard.router.internal.collect_lua_garbage_cnt == 0 +lua_gc.internal.bg_fiber == nil +configs.cfg_2.collect_lua_garbage = true +routers[5]:cfg(configs.cfg_2) +lua_gc.internal.bg_fiber ~= nil +routers[7]:cfg(configs.cfg_2) +lua_gc.internal.bg_fiber ~= nil +vshard.router.internal.collect_lua_garbage_cnt == 2 +package.loaded['vshard.router'] = nil +vshard.router = require('vshard.router') +vshard.router.internal.collect_lua_garbage_cnt == 2 +configs.cfg_2.collect_lua_garbage = nil +routers[5]:cfg(configs.cfg_2) +lua_gc.internal.bg_fiber ~= nil +routers[7]:cfg(configs.cfg_2) +vshard.router.internal.collect_lua_garbage_cnt == 0 +lua_gc.internal.bg_fiber == nil + +_ = test_run:cmd("switch default") +test_run:cmd("stop server router_1") +test_run:cmd("cleanup server router_1") +test_run:drop_cluster(REPLICASET_1_1) +test_run:drop_cluster(REPLICASET_1_2) +test_run:drop_cluster(REPLICASET_2_1) +test_run:drop_cluster(REPLICASET_2_2) diff --git a/test/multiple_routers/router_1.lua b/test/multiple_routers/router_1.lua new file mode 100644 index 0000000..2e9ea91 --- /dev/null +++ b/test/multiple_routers/router_1.lua @@ -0,0 +1,15 @@ +#!/usr/bin/env tarantool + +require('strict').on() + +-- Get instance name +local fio = require('fio') +local NAME = fio.basename(arg[0], '.lua') + +require('console').listen(os.getenv('ADMIN')) + +configs = 
require('configs') + +-- Start the database with sharding +vshard = require('vshard') +box.cfg{} diff --git a/test/multiple_routers/storage_1_1_a.lua b/test/multiple_routers/storage_1_1_a.lua new file mode 100644 index 0000000..b44a97a --- /dev/null +++ b/test/multiple_routers/storage_1_1_a.lua @@ -0,0 +1,23 @@ +#!/usr/bin/env tarantool + +require('strict').on() + +-- Get instance name. +local fio = require('fio') +NAME = fio.basename(arg[0], '.lua') + +require('console').listen(os.getenv('ADMIN')) + +-- Fetch config for the cluster of the instance. +if NAME:sub(9,9) == '1' then + cfg = require('configs').cfg_1 +else + cfg = require('configs').cfg_2 +end + +-- Start the database with sharding. +vshard = require('vshard') +vshard.storage.cfg(cfg, names[NAME]) + +-- Bootstrap storage. +require('lua_libs.bootstrap') diff --git a/test/multiple_routers/storage_1_1_b.lua b/test/multiple_routers/storage_1_1_b.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_1_1_b.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/storage_1_2_a.lua b/test/multiple_routers/storage_1_2_a.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_1_2_a.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/storage_1_2_b.lua b/test/multiple_routers/storage_1_2_b.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_1_2_b.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/storage_2_1_a.lua b/test/multiple_routers/storage_2_1_a.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_2_1_a.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/storage_2_1_b.lua b/test/multiple_routers/storage_2_1_b.lua new file mode 120000 index 
0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_2_1_b.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/storage_2_2_a.lua b/test/multiple_routers/storage_2_2_a.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_2_2_a.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/storage_2_2_b.lua b/test/multiple_routers/storage_2_2_b.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_2_2_b.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/suite.ini b/test/multiple_routers/suite.ini new file mode 100644 index 0000000..d2d4470 --- /dev/null +++ b/test/multiple_routers/suite.ini @@ -0,0 +1,6 @@ +[default] +core = tarantool +description = Multiple routers tests +script = test.lua +is_parallel = False +lua_libs = ../lua_libs configs.lua diff --git a/test/multiple_routers/test.lua b/test/multiple_routers/test.lua new file mode 100644 index 0000000..cb7c1ee --- /dev/null +++ b/test/multiple_routers/test.lua @@ -0,0 +1,9 @@ +#!/usr/bin/env tarantool + +require('strict').on() + +box.cfg{ + listen = os.getenv("LISTEN"), +} + +require('console').listen(os.getenv('ADMIN')) diff --git a/test/router/exponential_timeout.result b/test/router/exponential_timeout.result index fb54d0f..6748b64 100644 --- a/test/router/exponential_timeout.result +++ b/test/router/exponential_timeout.result @@ -37,10 +37,10 @@ test_run:cmd('switch router_1') util = require('util') --- ... -rs1 = vshard.router.internal.replicasets[replicasets[1]] +rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]] --- ... -rs2 = vshard.router.internal.replicasets[replicasets[2]] +rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]] --- ... 
util.collect_timeouts(rs1) diff --git a/test/router/exponential_timeout.test.lua b/test/router/exponential_timeout.test.lua index 3ec0b8c..75d85bf 100644 --- a/test/router/exponential_timeout.test.lua +++ b/test/router/exponential_timeout.test.lua @@ -13,8 +13,8 @@ test_run:cmd("start server router_1") test_run:cmd('switch router_1') util = require('util') -rs1 = vshard.router.internal.replicasets[replicasets[1]] -rs2 = vshard.router.internal.replicasets[replicasets[2]] +rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]] +rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]] util.collect_timeouts(rs1) util.collect_timeouts(rs2) diff --git a/test/router/reconnect_to_master.result b/test/router/reconnect_to_master.result index 5e678ce..d502723 100644 --- a/test/router/reconnect_to_master.result +++ b/test/router/reconnect_to_master.result @@ -76,7 +76,7 @@ _ = test_run:cmd('stop server storage_1_a') _ = test_run:switch('router_1') --- ... -reps = vshard.router.internal.replicasets +reps = vshard.router.internal.static_router.replicasets --- ... test_run:cmd("setopt delimiter ';'") @@ -95,7 +95,7 @@ end; ... function count_known_buckets() local known_buckets = 0 - for _, id in pairs(vshard.router.internal.route_map) do + for _, id in pairs(vshard.router.internal.static_router.route_map) do known_buckets = known_buckets + 1 end return known_buckets @@ -127,7 +127,7 @@ is_disconnected() fiber = require('fiber') --- ... -while vshard.router.internal.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end +while vshard.router.internal.static_router.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end --- ... 
vshard.router.info() diff --git a/test/router/reconnect_to_master.test.lua b/test/router/reconnect_to_master.test.lua index 39ba90e..8820fa7 100644 --- a/test/router/reconnect_to_master.test.lua +++ b/test/router/reconnect_to_master.test.lua @@ -34,7 +34,7 @@ _ = test_run:cmd('stop server storage_1_a') _ = test_run:switch('router_1') -reps = vshard.router.internal.replicasets +reps = vshard.router.internal.static_router.replicasets test_run:cmd("setopt delimiter ';'") function is_disconnected() for i, rep in pairs(reps) do @@ -46,7 +46,7 @@ function is_disconnected() end; function count_known_buckets() local known_buckets = 0 - for _, id in pairs(vshard.router.internal.route_map) do + for _, id in pairs(vshard.router.internal.static_router.route_map) do known_buckets = known_buckets + 1 end return known_buckets @@ -63,7 +63,7 @@ is_disconnected() -- Wait until replica is connected to test alerts on unavailable -- master. fiber = require('fiber') -while vshard.router.internal.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end +while vshard.router.internal.static_router.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end vshard.router.info() -- Return master. diff --git a/test/router/reload.result b/test/router/reload.result index f0badc3..98e8e71 100644 --- a/test/router/reload.result +++ b/test/router/reload.result @@ -229,7 +229,7 @@ vshard.router.cfg(cfg) cfg.connection_outdate_delay = old_connection_delay --- ... -vshard.router.internal.connection_outdate_delay = nil +vshard.router.internal.static_router.connection_outdate_delay = nil --- ... 
rs_new = vshard.router.route(1) diff --git a/test/router/reload.test.lua b/test/router/reload.test.lua index 528222a..293cb26 100644 --- a/test/router/reload.test.lua +++ b/test/router/reload.test.lua @@ -104,7 +104,7 @@ old_connection_delay = cfg.connection_outdate_delay cfg.connection_outdate_delay = 0.3 vshard.router.cfg(cfg) cfg.connection_outdate_delay = old_connection_delay -vshard.router.internal.connection_outdate_delay = nil +vshard.router.internal.static_router.connection_outdate_delay = nil rs_new = vshard.router.route(1) rs_old = rs _, replica_old = next(rs_old.replicas) diff --git a/test/router/reroute_wrong_bucket.result b/test/router/reroute_wrong_bucket.result index 7f2a494..989dc79 100644 --- a/test/router/reroute_wrong_bucket.result +++ b/test/router/reroute_wrong_bucket.result @@ -98,7 +98,7 @@ vshard.router.call(100, 'read', 'customer_lookup', {1}, {timeout = 100}) --- - {'accounts': [], 'customer_id': 1, 'name': 'name'} ... -vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]] +vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]] --- ... vshard.router.call(100, 'write', 'customer_add', {{customer_id = 2, bucket_id = 100, name = 'name2', accounts = {}}}, {timeout = 100}) @@ -146,13 +146,13 @@ test_run:switch('router_1') ... -- Emulate a situation, when a replicaset_2 while is unknown for -- router, but is already known for storages. -save_rs2 = vshard.router.internal.replicasets[replicasets[2]] +save_rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]] --- ... -vshard.router.internal.replicasets[replicasets[2]] = nil +vshard.router.internal.static_router.replicasets[replicasets[2]] = nil --- ... -vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]] +vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]] --- ... 
fiber = require('fiber') @@ -207,7 +207,7 @@ err require('log').info(string.rep('a', 1000)) --- ... -vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]] +vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]] --- ... call_retval = nil @@ -219,7 +219,7 @@ f = fiber.create(do_call, 100) while not test_run:grep_log('router_1', 'please update configuration', 1000) do fiber.sleep(0.1) end --- ... -vshard.router.internal.replicasets[replicasets[2]] = save_rs2 +vshard.router.internal.static_router.replicasets[replicasets[2]] = save_rs2 --- ... while not call_retval do fiber.sleep(0.1) end diff --git a/test/router/reroute_wrong_bucket.test.lua b/test/router/reroute_wrong_bucket.test.lua index 03384d1..a00f941 100644 --- a/test/router/reroute_wrong_bucket.test.lua +++ b/test/router/reroute_wrong_bucket.test.lua @@ -35,7 +35,7 @@ customer_add({customer_id = 1, bucket_id = 100, name = 'name', accounts = {}}) test_run:switch('router_1') vshard.router.call(100, 'read', 'customer_lookup', {1}, {timeout = 100}) -vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]] +vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]] vshard.router.call(100, 'write', 'customer_add', {{customer_id = 2, bucket_id = 100, name = 'name2', accounts = {}}}, {timeout = 100}) -- Create cycle. @@ -55,9 +55,9 @@ box.space._bucket:replace({100, vshard.consts.BUCKET.SENT, replicasets[2]}) test_run:switch('router_1') -- Emulate a situation, when a replicaset_2 while is unknown for -- router, but is already known for storages. 
-save_rs2 = vshard.router.internal.replicasets[replicasets[2]] -vshard.router.internal.replicasets[replicasets[2]] = nil -vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]] +save_rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]] +vshard.router.internal.static_router.replicasets[replicasets[2]] = nil +vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]] fiber = require('fiber') call_retval = nil @@ -84,11 +84,11 @@ err -- detect it and end with ok. -- require('log').info(string.rep('a', 1000)) -vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]] +vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]] call_retval = nil f = fiber.create(do_call, 100) while not test_run:grep_log('router_1', 'please update configuration', 1000) do fiber.sleep(0.1) end -vshard.router.internal.replicasets[replicasets[2]] = save_rs2 +vshard.router.internal.static_router.replicasets[replicasets[2]] = save_rs2 while not call_retval do fiber.sleep(0.1) end call_retval vshard.router.call(100, 'read', 'customer_lookup', {3}, {timeout = 1}) diff --git a/test/router/retry_reads.result b/test/router/retry_reads.result index 64b0ff3..b803ae3 100644 --- a/test/router/retry_reads.result +++ b/test/router/retry_reads.result @@ -37,7 +37,7 @@ test_run:cmd('switch router_1') util = require('util') --- ... -rs1 = vshard.router.internal.replicasets[replicasets[1]] +rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]] --- ... 
min_timeout = vshard.consts.CALL_TIMEOUT_MIN diff --git a/test/router/retry_reads.test.lua b/test/router/retry_reads.test.lua index 2fb2fc7..510e961 100644 --- a/test/router/retry_reads.test.lua +++ b/test/router/retry_reads.test.lua @@ -13,7 +13,7 @@ test_run:cmd("start server router_1") test_run:cmd('switch router_1') util = require('util') -rs1 = vshard.router.internal.replicasets[replicasets[1]] +rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]] min_timeout = vshard.consts.CALL_TIMEOUT_MIN -- diff --git a/test/router/router.result b/test/router/router.result index 45394e1..ceaf672 100644 --- a/test/router/router.result +++ b/test/router/router.result @@ -70,10 +70,10 @@ test_run:grep_log('router_1', 'connected to ') --- - 'connected to ' ... -rs1 = vshard.router.internal.replicasets[replicasets[1]] +rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]] --- ... -rs2 = vshard.router.internal.replicasets[replicasets[2]] +rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]] --- ... fiber = require('fiber') @@ -95,7 +95,7 @@ rs2.replica == rs2.master -- Part of gh-76: on reconfiguration do not recreate connections -- to replicas, that are kept in a new configuration. -- -old_replicasets = vshard.router.internal.replicasets +old_replicasets = vshard.router.internal.static_router.replicasets --- ... old_connections = {} @@ -127,17 +127,17 @@ connection_count == 4 vshard.router.cfg(cfg) --- ... -new_replicasets = vshard.router.internal.replicasets +new_replicasets = vshard.router.internal.static_router.replicasets --- ... old_replicasets ~= new_replicasets --- - true ... -rs1 = vshard.router.internal.replicasets[replicasets[1]] +rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]] --- ... -rs2 = vshard.router.internal.replicasets[replicasets[2]] +rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]] --- ... 
while not rs1.replica or not rs2.replica do fiber.sleep(0.1) end @@ -225,7 +225,7 @@ vshard.router.bootstrap() -- -- gh-108: negative bucket count on discovery. -- -vshard.router.internal.route_map = {} +vshard.router.internal.static_router.route_map = {} --- ... rets = {} @@ -456,7 +456,7 @@ conn.state rs_uuid = '<replicaset_2>' --- ... -rs = vshard.router.internal.replicasets[rs_uuid] +rs = vshard.router.internal.static_router.replicasets[rs_uuid] --- ... master = rs.master @@ -605,7 +605,7 @@ vshard.router.info() ... -- Remove replica and master connections to trigger alert -- UNREACHABLE_REPLICASET. -rs = vshard.router.internal.replicasets[replicasets[1]] +rs = vshard.router.internal.static_router.replicasets[replicasets[1]] --- ... master_conn = rs.master.conn @@ -749,7 +749,7 @@ test_run:cmd("setopt delimiter ';'") ... function calculate_known_buckets() local known_buckets = 0 - for _, rs in pairs(vshard.router.internal.route_map) do + for _, rs in pairs(vshard.router.internal.static_router.route_map) do known_buckets = known_buckets + 1 end return known_buckets @@ -851,10 +851,10 @@ test_run:cmd("setopt delimiter ';'") - true ... for i = 1, 100 do - local rs = vshard.router.internal.route_map[i] + local rs = vshard.router.internal.static_router.route_map[i] assert(rs) rs.bucket_count = rs.bucket_count - 1 - vshard.router.internal.route_map[i] = nil + vshard.router.internal.static_router.route_map[i] = nil end; --- ... @@ -999,7 +999,7 @@ vshard.router.sync(100500) -- object method like this: object.method() instead of -- object:method(), an appropriate help-error returns. -- -_, replicaset = next(vshard.router.internal.replicasets) +_, replicaset = next(vshard.router.internal.static_router.replicasets) --- ... error_messages = {} @@ -1069,7 +1069,7 @@ test_run:cmd("setopt delimiter ';'") --- - true ... 
-for bucket, rs in pairs(vshard.router.internal.route_map) do +for bucket, rs in pairs(vshard.router.internal.static_router.route_map) do bucket_to_old_rs[bucket] = rs bucket_cnt = bucket_cnt + 1 end; @@ -1084,7 +1084,7 @@ vshard.router.cfg(cfg); ... for bucket, old_rs in pairs(bucket_to_old_rs) do local old_uuid = old_rs.uuid - local rs = vshard.router.internal.route_map[bucket] + local rs = vshard.router.internal.static_router.route_map[bucket] if not rs or not old_uuid == rs.uuid then error("Bucket lost during reconfigure.") end @@ -1111,7 +1111,7 @@ end; vshard.router.cfg(cfg); --- ... -vshard.router.internal.route_map = {}; +vshard.router.internal.static_router.route_map = {}; --- ... vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY = false; @@ -1119,7 +1119,7 @@ vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY = false; ... -- Do discovery iteration. Upload buckets from the -- first replicaset. -while not next(vshard.router.internal.route_map) do +while not next(vshard.router.internal.static_router.route_map) do vshard.router.discovery_wakeup() fiber.sleep(0.01) end; @@ -1128,12 +1128,12 @@ end; new_replicasets = {}; --- ... -for _, rs in pairs(vshard.router.internal.replicasets) do +for _, rs in pairs(vshard.router.internal.static_router.replicasets) do new_replicasets[rs] = true end; --- ... -_, rs = next(vshard.router.internal.route_map); +_, rs = next(vshard.router.internal.static_router.route_map); --- ... new_replicasets[rs] == true; @@ -1185,6 +1185,17 @@ vshard.router.route(1):callro('echo', {'some_data'}) - null - null ... +-- Multiple routers: check that static router can be used as an +-- object. +static_router = vshard.router.internal.static_router +--- +... +static_router:route(1):callro('echo', {'some_data'}) +--- +- some_data +- null +- null +... _ = test_run:cmd("switch default") --- ... 
diff --git a/test/router/router.test.lua b/test/router/router.test.lua index df2f381..d7588f7 100644 --- a/test/router/router.test.lua +++ b/test/router/router.test.lua @@ -27,8 +27,8 @@ util = require('util') -- gh-24: log all connnect/disconnect events. test_run:grep_log('router_1', 'connected to ') -rs1 = vshard.router.internal.replicasets[replicasets[1]] -rs2 = vshard.router.internal.replicasets[replicasets[2]] +rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]] +rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]] fiber = require('fiber') while not rs1.replica or not rs2.replica do fiber.sleep(0.1) end -- With no zones the nearest server is master. @@ -39,7 +39,7 @@ rs2.replica == rs2.master -- Part of gh-76: on reconfiguration do not recreate connections -- to replicas, that are kept in a new configuration. -- -old_replicasets = vshard.router.internal.replicasets +old_replicasets = vshard.router.internal.static_router.replicasets old_connections = {} connection_count = 0 test_run:cmd("setopt delimiter ';'") @@ -52,10 +52,10 @@ end; test_run:cmd("setopt delimiter ''"); connection_count == 4 vshard.router.cfg(cfg) -new_replicasets = vshard.router.internal.replicasets +new_replicasets = vshard.router.internal.static_router.replicasets old_replicasets ~= new_replicasets -rs1 = vshard.router.internal.replicasets[replicasets[1]] -rs2 = vshard.router.internal.replicasets[replicasets[2]] +rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]] +rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]] while not rs1.replica or not rs2.replica do fiber.sleep(0.1) end vshard.router.discovery_wakeup() -- Check that netbox connections are the same. @@ -91,7 +91,7 @@ vshard.router.bootstrap() -- -- gh-108: negative bucket count on discovery. 
-- -vshard.router.internal.route_map = {} +vshard.router.internal.static_router.route_map = {} rets = {} function do_echo() table.insert(rets, vshard.router.callro(1, 'echo', {1})) end f1 = fiber.create(do_echo) f2 = fiber.create(do_echo) @@ -153,7 +153,7 @@ conn = vshard.router.route(1).master.conn conn.state -- Test missing master. rs_uuid = 'ac522f65-aa94-4134-9f64-51ee384f1a54' -rs = vshard.router.internal.replicasets[rs_uuid] +rs = vshard.router.internal.static_router.replicasets[rs_uuid] master = rs.master rs.master = nil vshard.router.route(1).master @@ -223,7 +223,7 @@ vshard.router.info() -- Remove replica and master connections to trigger alert -- UNREACHABLE_REPLICASET. -rs = vshard.router.internal.replicasets[replicasets[1]] +rs = vshard.router.internal.static_router.replicasets[replicasets[1]] master_conn = rs.master.conn replica_conn = rs.replica.conn rs.master.conn = nil @@ -261,7 +261,7 @@ util.check_error(vshard.router.buckets_info, 123, '456') test_run:cmd("setopt delimiter ';'") function calculate_known_buckets() local known_buckets = 0 - for _, rs in pairs(vshard.router.internal.route_map) do + for _, rs in pairs(vshard.router.internal.static_router.route_map) do known_buckets = known_buckets + 1 end return known_buckets @@ -301,10 +301,10 @@ test_run:switch('router_1') -- test_run:cmd("setopt delimiter ';'") for i = 1, 100 do - local rs = vshard.router.internal.route_map[i] + local rs = vshard.router.internal.static_router.route_map[i] assert(rs) rs.bucket_count = rs.bucket_count - 1 - vshard.router.internal.route_map[i] = nil + vshard.router.internal.static_router.route_map[i] = nil end; test_run:cmd("setopt delimiter ''"); calculate_known_buckets() @@ -367,7 +367,7 @@ vshard.router.sync(100500) -- object method like this: object.method() instead of -- object:method(), an appropriate help-error returns. 
-- -_, replicaset = next(vshard.router.internal.replicasets) +_, replicaset = next(vshard.router.internal.static_router.replicasets) error_messages = {} test_run:cmd("setopt delimiter ';'") @@ -395,7 +395,7 @@ error_messages bucket_to_old_rs = {} bucket_cnt = 0 test_run:cmd("setopt delimiter ';'") -for bucket, rs in pairs(vshard.router.internal.route_map) do +for bucket, rs in pairs(vshard.router.internal.static_router.route_map) do bucket_to_old_rs[bucket] = rs bucket_cnt = bucket_cnt + 1 end; @@ -403,7 +403,7 @@ bucket_cnt; vshard.router.cfg(cfg); for bucket, old_rs in pairs(bucket_to_old_rs) do local old_uuid = old_rs.uuid - local rs = vshard.router.internal.route_map[bucket] + local rs = vshard.router.internal.static_router.route_map[bucket] if not rs or not old_uuid == rs.uuid then error("Bucket lost during reconfigure.") end @@ -423,19 +423,19 @@ while vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY ~= 'waiting' do fiber.sleep(0.02) end; vshard.router.cfg(cfg); -vshard.router.internal.route_map = {}; +vshard.router.internal.static_router.route_map = {}; vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY = false; -- Do discovery iteration. Upload buckets from the -- first replicaset. 
-while not next(vshard.router.internal.route_map) do +while not next(vshard.router.internal.static_router.route_map) do vshard.router.discovery_wakeup() fiber.sleep(0.01) end; new_replicasets = {}; -for _, rs in pairs(vshard.router.internal.replicasets) do +for _, rs in pairs(vshard.router.internal.static_router.replicasets) do new_replicasets[rs] = true end; -_, rs = next(vshard.router.internal.route_map); +_, rs = next(vshard.router.internal.static_router.route_map); new_replicasets[rs] == true; test_run:cmd("setopt delimiter ''"); @@ -453,6 +453,11 @@ vshard.router.internal.errinj.ERRINJ_CFG = false util.has_same_fields(old_internal, vshard.router.internal) vshard.router.route(1):callro('echo', {'some_data'}) +-- Multiple routers: check that static router can be used as an +-- object. +static_router = vshard.router.internal.static_router +static_router:route(1):callro('echo', {'some_data'}) + _ = test_run:cmd("switch default") test_run:drop_cluster(REPLICASET_2) diff --git a/vshard/error.lua b/vshard/error.lua index f79107b..da92b58 100644 --- a/vshard/error.lua +++ b/vshard/error.lua @@ -105,7 +105,12 @@ local error_message_template = { name = 'OBJECT_IS_OUTDATED', msg = 'Object is outdated after module reload/reconfigure. ' .. 'Use new instance.' - } + }, + [21] = { + name = 'ROUTER_ALREADY_EXISTS', + msg = 'Router with name %s already exists', + args = {'name'}, + }, } -- diff --git a/vshard/router/init.lua b/vshard/router/init.lua index 59c25a0..b31f7dc 100644 --- a/vshard/router/init.lua +++ b/vshard/router/init.lua @@ -25,14 +25,33 @@ local M = rawget(_G, MODULE_INTERNALS) if not M then M = { ---------------- Common module attributes ---------------- - -- The last passed configuration. - current_cfg = nil, errinj = { ERRINJ_CFG = false, ERRINJ_FAILOVER_CHANGE_CFG = false, ERRINJ_RELOAD = false, ERRINJ_LONG_DISCOVERY = false, }, + -- Dictionary, key is router name, value is a router. 
+ routers = {}, + -- Router object which can be accessed via the old API, + -- e.g. vshard.router.call(...) + static_router = nil, + -- This counter is used to restart background fibers with + -- new reloaded code. + module_version = 0, + -- Number of routers that require collecting Lua garbage. + collect_lua_garbage_cnt = 0, + } +end + +-- +-- Router object attributes. +-- +local ROUTER_TEMPLATE = { + -- Name of the router. + name = nil, + -- The last passed configuration. + current_cfg = nil, -- Time to outdate old objects on reload. connection_outdate_delay = nil, -- Bucket map cache. @@ -47,38 +66,60 @@ if not M then total_bucket_count = 0, -- Boolean lua_gc state (create periodic gc task). collect_lua_garbage = nil, - -- This counter is used to restart background fibers with - -- new reloaded code. - module_version = 0, - } -end +} + +local STATIC_ROUTER_NAME = '_static_router' -- Set a bucket to a replicaset. -local function bucket_set(bucket_id, rs_uuid) - local replicaset = M.replicasets[rs_uuid] +local function bucket_set(router, bucket_id, rs_uuid) + local replicaset = router.replicasets[rs_uuid] -- It is technically possible to delete a replicaset at the -- same time when route to the bucket is discovered. if not replicaset then return nil, lerror.vshard(lerror.code.NO_ROUTE_TO_BUCKET, bucket_id) end - local old_replicaset = M.route_map[bucket_id] + local old_replicaset = router.route_map[bucket_id] if old_replicaset ~= replicaset then if old_replicaset then old_replicaset.bucket_count = old_replicaset.bucket_count - 1 end replicaset.bucket_count = replicaset.bucket_count + 1 end - M.route_map[bucket_id] = replicaset + router.route_map[bucket_id] = replicaset return replicaset end -- Remove a bucket from the cache. 
-local function bucket_reset(bucket_id) - local replicaset = M.route_map[bucket_id] +local function bucket_reset(router, bucket_id) + local replicaset = router.route_map[bucket_id] if replicaset then replicaset.bucket_count = replicaset.bucket_count - 1 end - M.route_map[bucket_id] = nil + router.route_map[bucket_id] = nil +end + +-------------------------------------------------------------------------------- +-- Helpers +-------------------------------------------------------------------------------- + +-- +-- Increase/decrease the number of routers that require collecting +-- Lua garbage, and change the state of the `lua_gc` fiber. +-- + +local function lua_gc_cnt_inc() + M.collect_lua_garbage_cnt = M.collect_lua_garbage_cnt + 1 + if M.collect_lua_garbage_cnt == 1 then + lua_gc.set_state(true, consts.COLLECT_LUA_GARBAGE_INTERVAL) + end +end + +local function lua_gc_cnt_dec() + M.collect_lua_garbage_cnt = M.collect_lua_garbage_cnt - 1 + assert(M.collect_lua_garbage_cnt >= 0) + if M.collect_lua_garbage_cnt == 0 then + lua_gc.set_state(false, consts.COLLECT_LUA_GARBAGE_INTERVAL) + end end -------------------------------------------------------------------------------- @@ -86,8 +127,8 @@ end -------------------------------------------------------------------------------- -- Search bucket in whole cluster -local function bucket_discovery(bucket_id) - local replicaset = M.route_map[bucket_id] +local function bucket_discovery(router, bucket_id) + local replicaset = router.route_map[bucket_id] if replicaset ~= nil then return replicaset end @@ -95,11 +136,11 @@ local function bucket_discovery(bucket_id) log.verbose("Discovering bucket %d", bucket_id) local last_err = nil local unreachable_uuid = nil - for uuid, replicaset in pairs(M.replicasets) do + for uuid, replicaset in pairs(router.replicasets) do local _, err = replicaset:callrw('vshard.storage.bucket_stat', {bucket_id}) if err == nil then - return bucket_set(bucket_id, replicaset.uuid) + return bucket_set(router, 
bucket_id, replicaset.uuid) elseif err.code ~= lerror.code.WRONG_BUCKET then last_err = err unreachable_uuid = uuid @@ -128,14 +169,14 @@ local function bucket_discovery(bucket_id) end -- Resolve bucket id to replicaset uuid -local function bucket_resolve(bucket_id) +local function bucket_resolve(router, bucket_id) local replicaset, err - local replicaset = M.route_map[bucket_id] + local replicaset = router.route_map[bucket_id] if replicaset ~= nil then return replicaset end -- Replicaset removed from cluster, perform discovery - replicaset, err = bucket_discovery(bucket_id) + replicaset, err = bucket_discovery(router, bucket_id) if replicaset == nil then return nil, err end @@ -146,14 +187,14 @@ end -- Background fiber to perform discovery. It periodically scans -- replicasets one by one and updates route_map. -- -local function discovery_f() +local function discovery_f(router) local module_version = M.module_version while module_version == M.module_version do - while not next(M.replicasets) do + while not next(router.replicasets) do lfiber.sleep(consts.DISCOVERY_INTERVAL) end - local old_replicasets = M.replicasets - for rs_uuid, replicaset in pairs(M.replicasets) do + local old_replicasets = router.replicasets + for rs_uuid, replicaset in pairs(router.replicasets) do local active_buckets, err = replicaset:callro('vshard.storage.buckets_discovery', {}, {timeout = 2}) @@ -163,7 +204,7 @@ local function discovery_f() end -- Renew replicasets object captured by the for loop -- in case of reconfigure and reload events. 
- if M.replicasets ~= old_replicasets then + if router.replicasets ~= old_replicasets then break end if not active_buckets then @@ -176,11 +217,11 @@ local function discovery_f() end replicaset.bucket_count = #active_buckets for _, bucket_id in pairs(active_buckets) do - local old_rs = M.route_map[bucket_id] + local old_rs = router.route_map[bucket_id] if old_rs and old_rs ~= replicaset then old_rs.bucket_count = old_rs.bucket_count - 1 end - M.route_map[bucket_id] = replicaset + router.route_map[bucket_id] = replicaset end end lfiber.sleep(consts.DISCOVERY_INTERVAL) @@ -191,9 +232,9 @@ end -- -- Immediately wakeup discovery fiber if exists. -- -local function discovery_wakeup() - if M.discovery_fiber then - M.discovery_fiber:wakeup() +local function discovery_wakeup(router) + if router.discovery_fiber then + router.discovery_fiber:wakeup() end end @@ -205,7 +246,7 @@ end -- Function will restart operation after wrong bucket response until timeout -- is reached -- -local function router_call(bucket_id, mode, func, args, opts) +local function router_call(router, bucket_id, mode, func, args, opts) if opts and (type(opts) ~= 'table' or (opts.timeout and type(opts.timeout) ~= 'number')) then error('Usage: call(bucket_id, mode, func, args, opts)') @@ -213,7 +254,7 @@ local function router_call(bucket_id, mode, func, args, opts) local timeout = opts and opts.timeout or consts.CALL_TIMEOUT_MIN local replicaset, err local tend = lfiber.time() + timeout - if bucket_id > M.total_bucket_count or bucket_id <= 0 then + if bucket_id > router.total_bucket_count or bucket_id <= 0 then error('Bucket is unreachable: bucket id is out of range') end local call @@ -223,7 +264,7 @@ local function router_call(bucket_id, mode, func, args, opts) call = 'callrw' end repeat - replicaset, err = bucket_resolve(bucket_id) + replicaset, err = bucket_resolve(router, bucket_id) if replicaset then ::replicaset_is_found:: local storage_call_status, call_status, call_error = @@ -239,9 +280,9 @@ local 
function router_call(bucket_id, mode, func, args, opts) end err = call_status if err.code == lerror.code.WRONG_BUCKET then - bucket_reset(bucket_id) + bucket_reset(router, bucket_id) if err.destination then - replicaset = M.replicasets[err.destination] + replicaset = router.replicasets[err.destination] if not replicaset then log.warn('Replicaset "%s" was not found, but received'.. ' from storage as destination - please '.. @@ -253,13 +294,14 @@ local function router_call(bucket_id, mode, func, args, opts) -- but already is executed on storages. while lfiber.time() <= tend do lfiber.sleep(0.05) - replicaset = M.replicasets[err.destination] + replicaset = router.replicasets[err.destination] if replicaset then goto replicaset_is_found end end else - replicaset = bucket_set(bucket_id, replicaset.uuid) + replicaset = bucket_set(router, bucket_id, + replicaset.uuid) lfiber.yield() -- Protect against infinite cycle in a -- case of broken cluster, when a bucket @@ -276,7 +318,7 @@ local function router_call(bucket_id, mode, func, args, opts) -- is not timeout - these requests are repeated in -- any case on client, if error. assert(mode == 'write') - bucket_reset(bucket_id) + bucket_reset(router, bucket_id) return nil, err elseif err.code == lerror.code.NON_MASTER then -- Same, as above - do not wait and repeat. @@ -302,12 +344,12 @@ end -- -- Wrappers for router_call with preset mode. -- -local function router_callro(bucket_id, ...) - return router_call(bucket_id, 'read', ...) +local function router_callro(router, bucket_id, ...) + return router_call(router, bucket_id, 'read', ...) end -local function router_callrw(bucket_id, ...) - return router_call(bucket_id, 'write', ...) +local function router_callrw(router, bucket_id, ...) + return router_call(router, bucket_id, 'write', ...) end -- @@ -315,27 +357,27 @@ end -- @param bucket_id Bucket identifier. -- @retval Netbox connection. 
-- -local function router_route(bucket_id) +local function router_route(router, bucket_id) if type(bucket_id) ~= 'number' then error('Usage: router.route(bucket_id)') end - return bucket_resolve(bucket_id) + return bucket_resolve(router, bucket_id) end -- -- Return map of all replicasets. -- @retval See self.replicasets map. -- -local function router_routeall() - return M.replicasets +local function router_routeall(router) + return router.replicasets end -------------------------------------------------------------------------------- -- Failover -------------------------------------------------------------------------------- -local function failover_ping_round() - for _, replicaset in pairs(M.replicasets) do +local function failover_ping_round(router) + for _, replicaset in pairs(router.replicasets) do local replica = replicaset.replica if replica ~= nil and replica.conn ~= nil and replica.down_ts == nil then @@ -378,10 +420,10 @@ end -- Collect UUIDs of replicasets, priority of whose replica -- connections must be updated. -- -local function failover_collect_to_update() +local function failover_collect_to_update(router) local ts = lfiber.time() local uuid_to_update = {} - for uuid, rs in pairs(M.replicasets) do + for uuid, rs in pairs(router.replicasets) do if failover_need_down_priority(rs, ts) or failover_need_up_priority(rs, ts) then table.insert(uuid_to_update, uuid) @@ -396,16 +438,16 @@ end -- disconnected replicas. -- @retval true A replica of an replicaset has been changed. 
-- -local function failover_step() - failover_ping_round() - local uuid_to_update = failover_collect_to_update() +local function failover_step(router) + failover_ping_round(router) + local uuid_to_update = failover_collect_to_update(router) if #uuid_to_update == 0 then return false end local curr_ts = lfiber.time() local replica_is_changed = false for _, uuid in pairs(uuid_to_update) do - local rs = M.replicasets[uuid] + local rs = router.replicasets[uuid] if M.errinj.ERRINJ_FAILOVER_CHANGE_CFG then rs = nil M.errinj.ERRINJ_FAILOVER_CHANGE_CFG = false @@ -447,7 +489,7 @@ end -- tries to reconnect to the best replica. When the connection is -- established, it replaces the original replica. -- -local function failover_f() +local function failover_f(router) local module_version = M.module_version local min_timeout = math.min(consts.FAILOVER_UP_TIMEOUT, consts.FAILOVER_DOWN_TIMEOUT) @@ -457,7 +499,7 @@ local function failover_f() local prev_was_ok = false while module_version == M.module_version do ::continue:: - local ok, replica_is_changed = pcall(failover_step) + local ok, replica_is_changed = pcall(failover_step, router) if not ok then log.error('Error during failovering: %s', lerror.make(replica_is_changed)) @@ -484,8 +526,8 @@ end -- Configuration -------------------------------------------------------------------------------- -local function router_cfg(cfg, is_reload) - cfg = lcfg.check(cfg, M.current_cfg) +local function router_cfg(router, cfg, is_reload) + cfg = lcfg.check(cfg, router.current_cfg) local vshard_cfg, box_cfg = lcfg.split(cfg) if not M.replicasets then log.info('Starting router configuration') @@ -511,41 +553,47 @@ local function router_cfg(cfg, is_reload) -- Move connections from an old configuration to a new one. -- It must be done with no yields to prevent usage both of not -- fully moved old replicasets, and not fully built new ones. 
- lreplicaset.rebind_replicasets(new_replicasets, M.replicasets) + lreplicaset.rebind_replicasets(new_replicasets, router.replicasets) -- Now the new replicasets are fully built. Can establish -- connections and yield. for _, replicaset in pairs(new_replicasets) do replicaset:connect_all() end + -- Change state of lua GC. + if vshard_cfg.collect_lua_garbage and not router.collect_lua_garbage then + lua_gc_cnt_inc() + elseif not vshard_cfg.collect_lua_garbage and + router.collect_lua_garbage then + lua_gc_cnt_dec() + end lreplicaset.wait_masters_connect(new_replicasets) - lreplicaset.outdate_replicasets(M.replicasets, + lreplicaset.outdate_replicasets(router.replicasets, vshard_cfg.connection_outdate_delay) - M.connection_outdate_delay = vshard_cfg.connection_outdate_delay - M.total_bucket_count = total_bucket_count - M.collect_lua_garbage = vshard_cfg.collect_lua_garbage - M.current_cfg = cfg - M.replicasets = new_replicasets - for bucket, rs in pairs(M.route_map) do - M.route_map[bucket] = M.replicasets[rs.uuid] - end - if M.failover_fiber == nil then - M.failover_fiber = util.reloadable_fiber_create('vshard.failover', M, - 'failover_f') + router.connection_outdate_delay = vshard_cfg.connection_outdate_delay + router.total_bucket_count = total_bucket_count + router.collect_lua_garbage = vshard_cfg.collect_lua_garbage + router.current_cfg = cfg + router.replicasets = new_replicasets + for bucket, rs in pairs(router.route_map) do + router.route_map[bucket] = router.replicasets[rs.uuid] + end + if router.failover_fiber == nil then + router.failover_fiber = util.reloadable_fiber_create( + 'vshard.failover.' .. router.name, M, 'failover_f', router) + end + if router.discovery_fiber == nil then + router.discovery_fiber = util.reloadable_fiber_create( + 'vshard.discovery.' .. 
router.name, M, 'discovery_f', router) end - if M.discovery_fiber == nil then - M.discovery_fiber = util.reloadable_fiber_create('vshard.discovery', M, - 'discovery_f') - end - lua_gc.set_state(M.collect_lua_garbage, consts.COLLECT_LUA_GARBAGE_INTERVAL) end -------------------------------------------------------------------------------- -- Bootstrap -------------------------------------------------------------------------------- -local function cluster_bootstrap() +local function cluster_bootstrap(router) local replicasets = {} - for uuid, replicaset in pairs(M.replicasets) do + for uuid, replicaset in pairs(router.replicasets) do table.insert(replicasets, replicaset) local count, err = replicaset:callrw('vshard.storage.buckets_count', {}) @@ -556,9 +604,10 @@ local function cluster_bootstrap() return nil, lerror.vshard(lerror.code.NON_EMPTY) end end - lreplicaset.calculate_etalon_balance(M.replicasets, M.total_bucket_count) + lreplicaset.calculate_etalon_balance(router.replicasets, + router.total_bucket_count) local bucket_id = 1 - for uuid, replicaset in pairs(M.replicasets) do + for uuid, replicaset in pairs(router.replicasets) do if replicaset.etalon_bucket_count > 0 then local ok, err = replicaset:callrw('vshard.storage.bucket_force_create', @@ -614,7 +663,7 @@ local function replicaset_instance_info(replicaset, name, alerts, errcolor, return info, consts.STATUS.GREEN end -local function router_info() +local function router_info(router) local state = { replicasets = {}, bucket = { @@ -628,7 +677,7 @@ local function router_info() } local bucket_info = state.bucket local known_bucket_count = 0 - for rs_uuid, replicaset in pairs(M.replicasets) do + for rs_uuid, replicaset in pairs(router.replicasets) do -- Replicaset info parameters: -- * master instance info; -- * replica instance info; @@ -716,7 +765,7 @@ local function router_info() -- If a bucket is unreachable, then replicaset is -- unreachable too and color already is red. 
end - bucket_info.unknown = M.total_bucket_count - known_bucket_count + bucket_info.unknown = router.total_bucket_count - known_bucket_count if bucket_info.unknown > 0 then state.status = math.max(state.status, consts.STATUS.YELLOW) table.insert(state.alerts, lerror.alert(lerror.code.UNKNOWN_BUCKETS, @@ -733,13 +782,13 @@ end -- @param limit Maximal bucket count in output. -- @retval Map of type {bucket_id = 'unknown'/replicaset_uuid}. -- -local function router_buckets_info(offset, limit) +local function router_buckets_info(router, offset, limit) if offset ~= nil and type(offset) ~= 'number' or limit ~= nil and type(limit) ~= 'number' then error('Usage: buckets_info(offset, limit)') end offset = offset or 0 - limit = limit or M.total_bucket_count + limit = limit or router.total_bucket_count local ret = {} -- Use one string memory for all unknown buckets. local available_rw = 'available_rw' @@ -748,9 +797,9 @@ local function router_buckets_info(offset, limit) local unreachable = 'unreachable' -- Collect limit. 
local first = math.max(1, offset + 1) - local last = math.min(offset + limit, M.total_bucket_count) + local last = math.min(offset + limit, router.total_bucket_count) for bucket_id = first, last do - local rs = M.route_map[bucket_id] + local rs = router.route_map[bucket_id] if rs then if rs.master and rs.master:is_connected() then ret[bucket_id] = {uuid = rs.uuid, status = available_rw} @@ -770,22 +819,22 @@ end -- Other -------------------------------------------------------------------------------- -local function router_bucket_id(key) +local function router_bucket_id(router, key) if key == nil then error("Usage: vshard.router.bucket_id(key)") end - return lhash.key_hash(key) % M.total_bucket_count + 1 + return lhash.key_hash(key) % router.total_bucket_count + 1 end -local function router_bucket_count() - return M.total_bucket_count +local function router_bucket_count(router) + return router.total_bucket_count end -local function router_sync(timeout) +local function router_sync(router, timeout) if timeout ~= nil and type(timeout) ~= 'number' then error('Usage: vshard.router.sync([timeout: number])') end - for rs_uuid, replicaset in pairs(M.replicasets) do + for rs_uuid, replicaset in pairs(router.replicasets) do local status, err = replicaset:callrw('vshard.storage.sync', {timeout}) if not status then -- Add information about replicaset @@ -799,6 +848,90 @@ if M.errinj.ERRINJ_RELOAD then error('Error injection: reload') end +-------------------------------------------------------------------------------- +-- Managing router instances +-------------------------------------------------------------------------------- + +local function cfg_reconfigure(router, cfg) + return router_cfg(router, cfg, false) +end + +local router_mt = { + __index = { + cfg = cfg_reconfigure; + info = router_info; + buckets_info = router_buckets_info; + call = router_call; + callro = router_callro; + callrw = router_callrw; + route = router_route; + routeall = router_routeall; + bucket_id = 
router_bucket_id; + bucket_count = router_bucket_count; + sync = router_sync; + bootstrap = cluster_bootstrap; + bucket_discovery = bucket_discovery; + discovery_wakeup = discovery_wakeup; + } +} + +-- Table which represents this module. +local module = {} + +-- This metatable bypasses calls to a module to the static_router. +local module_mt = {__index = {}} +for method_name, method in pairs(router_mt.__index) do + module_mt.__index[method_name] = function(...) + return method(M.static_router, ...) + end +end + +local function export_static_router_attributes() + setmetatable(module, module_mt) +end + +-- +-- Create a new instance of router. +-- @param name Name of a new router. +-- @param cfg Configuration for `router_cfg`. +-- @retval Router instance. +-- @retval Nil and error object. +-- +local function router_new(name, cfg) + if type(name) ~= 'string' or type(cfg) ~= 'table' then + error('Wrong argument type. Usage: vshard.router.new(name, cfg).') + end + if M.routers[name] then + return nil, lerror.vshard(lerror.code.ROUTER_ALREADY_EXISTS, name) + end + local router = table.deepcopy(ROUTER_TEMPLATE) + setmetatable(router, router_mt) + router.name = name + M.routers[name] = router + router_cfg(router, cfg) + return router +end + +-- +-- Wrapper around a `router_new` API, which allow to use old +-- static `vshard.router.cfg()` API. +-- +local function legacy_cfg(cfg) + if M.static_router then + -- Reconfigure. + router_cfg(M.static_router, cfg, false) + else + -- Create new static instance. 
+ local router, err = router_new(STATIC_ROUTER_NAME, cfg) + if router then + M.static_router = router + export_static_router_attributes() + else + return nil, err + end + end +end + -------------------------------------------------------------------------------- -- Module definition -------------------------------------------------------------------------------- @@ -809,28 +942,23 @@ end if not rawget(_G, MODULE_INTERNALS) then rawset(_G, MODULE_INTERNALS, M) else - router_cfg(M.current_cfg, true) + for _, router in pairs(M.routers) do + router_cfg(router, router.current_cfg, true) + setmetatable(router, router_mt) + end + if M.static_router then + export_static_router_attributes() + end M.module_version = M.module_version + 1 end M.discovery_f = discovery_f M.failover_f = failover_f +M.router_mt = router_mt -return { - cfg = function(cfg) return router_cfg(cfg, false) end; - info = router_info; - buckets_info = router_buckets_info; - call = router_call; - callro = router_callro; - callrw = router_callrw; - route = router_route; - routeall = router_routeall; - bucket_id = router_bucket_id; - bucket_count = router_bucket_count; - sync = router_sync; - bootstrap = cluster_bootstrap; - bucket_discovery = bucket_discovery; - discovery_wakeup = discovery_wakeup; - internal = M; - module_version = function() return M.module_version end; -} +module.cfg = legacy_cfg +module.new = router_new +module.internal = M +module.module_version = function() return M.module_version end + +return module diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua index 0593edf..63aa96f 100644 --- a/vshard/storage/init.lua +++ b/vshard/storage/init.lua @@ -1632,8 +1632,6 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload) M.rebalancer_max_receiving = rebalancer_max_receiving M.shard_index = shard_index M.collect_bucket_garbage_interval = collect_bucket_garbage_interval - M.collect_lua_garbage = collect_lua_garbage - M.current_cfg = cfg M.collect_lua_garbage = 
vshard_cfg.collect_lua_garbage M.current_cfg = cfg diff --git a/vshard/util.lua b/vshard/util.lua index 37abe2b..3afaa61 100644 --- a/vshard/util.lua +++ b/vshard/util.lua @@ -38,11 +38,11 @@ end -- reload of that module. -- See description of parameters in `reloadable_fiber_create`. -- -local function reloadable_fiber_main_loop(module, func_name) +local function reloadable_fiber_main_loop(module, func_name, data) log.info('%s has been started', func_name) local func = module[func_name] ::restart_loop:: - local ok, err = pcall(func) + local ok, err = pcall(func, data) -- yield serves two purposes: -- * makes this fiber cancellable -- * prevents 100% cpu consumption @@ -60,7 +60,7 @@ local function reloadable_fiber_main_loop(module, func_name) log.info('module is reloaded, restarting') -- luajit drops this frame if next function is called in -- return statement. - return M.reloadable_fiber_main_loop(module, func_name) + return M.reloadable_fiber_main_loop(module, func_name, data) end -- @@ -74,11 +74,13 @@ end -- @param module Module which can be reloaded. -- @param func_name Name of a function to be executed in the -- module. +-- @param data Data to be passed to the specified function. -- @retval New fiber. -- -local function reloadable_fiber_create(fiber_name, module, func_name) +local function reloadable_fiber_create(fiber_name, module, func_name, data) assert(type(fiber_name) == 'string') - local xfiber = fiber.create(reloadable_fiber_main_loop, module, func_name) + local xfiber = fiber.create(reloadable_fiber_main_loop, module, func_name, + data) xfiber:name(fiber_name) return xfiber end ^ permalink raw reply [flat|nested] 23+ messages in thread
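[Editorial note on the API in the patch above: `router_bucket_id` keeps the original formula and only switches the bucket count source from the global `M` to the router instance. The mapping itself is a one-liner worth sanity-checking in plain Lua — a sketch in which `hash` stands in for the result of vshard's `lhash.key_hash(key)`, and `3000` is an assumed `total_bucket_count`, not a value from this patch:]

```lua
-- Sketch of the bucket id computation used by router_bucket_id:
-- `hash % total + 1` maps any non-negative hash onto the inclusive
-- range [1, total_bucket_count], which is vshard's bucket id space.
local function bucket_id(hash, total_bucket_count)
    return hash % total_bucket_count + 1
end

-- For example, with 3000 buckets:
--   bucket_id(0, 3000)    -> 1
--   bucket_id(2999, 3000) -> 3000
--   bucket_id(3000, 3000) -> 1 (wraps around)
```

[This is why two routers configured with different `total_bucket_count` values resolve the same key to different buckets — the count is per-router state, not module state.]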
* [tarantool-patches] Re: [PATCH 3/3] Introduce multiple routers feature 2018-08-07 13:18 ` Alex Khatskevich @ 2018-08-08 12:28 ` Vladislav Shpilevoy 2018-08-08 14:04 ` Alex Khatskevich 0 siblings, 1 reply; 23+ messages in thread From: Vladislav Shpilevoy @ 2018-08-08 12:28 UTC (permalink / raw) To: Alex Khatskevich, tarantool-patches Thanks for the fixes! 1. Please, rebase on the master. I've failed to do it easy. 2. Please, adding a new commit send it to the same thread. I am talking about "Fix: do not update route map in place". Since you've not sent it, I review it here. 2.1. At first, please, prefix the commit title with a subsystem name the patch is for. Here it is not "Fix: ", but "router: ". 2.2. We know a new route map size before rebuild - it is equal to the total bucket count. So it can be allocated once via table.new(total_bucket_count, 0). It allows to avoid reallocs. I've fixed both remarks and pushed the commit into the master. > diff --git a/vshard/router/init.lua b/vshard/router/init.lua > index 59c25a0..b31f7dc 100644 > --- a/vshard/router/init.lua > +++ b/vshard/router/init.lua > @@ -799,6 +848,90 @@ if M.errinj.ERRINJ_RELOAD then > error('Error injection: reload') > end > > +-------------------------------------------------------------------------------- > +-- Managing router instances > +-------------------------------------------------------------------------------- > + > +local function cfg_reconfigure(router, cfg) > + return router_cfg(router, cfg, false) > +end > + > +local router_mt = { > + __index = { > + cfg = cfg_reconfigure; > + info = router_info; > + buckets_info = router_buckets_info; > + call = router_call; > + callro = router_callro; > + callrw = router_callrw; > + route = router_route; > + routeall = router_routeall; > + bucket_id = router_bucket_id; > + bucket_count = router_bucket_count; > + sync = router_sync; > + bootstrap = cluster_bootstrap; > + bucket_discovery = bucket_discovery; > + discovery_wakeup = discovery_wakeup; > 
+ } > +} > + > +-- Table which represents this module. > +local module = {} > + > +-- This metatable bypasses calls to a module to the static_router. > +local module_mt = {__index = {}} > +for method_name, method in pairs(router_mt.__index) do > + module_mt.__index[method_name] = function(...) > + return method(M.static_router, ...) > + end > +end > + > +local function export_static_router_attributes() > + setmetatable(module, module_mt) > +end > + > +-- > +-- Create a new instance of router. > +-- @param name Name of a new router. > +-- @param cfg Configuration for `router_cfg`. > +-- @retval Router instance. > +-- @retval Nil and error object. > +-- > +local function router_new(name, cfg) > + if type(name) ~= 'string' or type(cfg) ~= 'table' then > + error('Wrong argument type. Usage: vshard.router.new(name, cfg).') > + end > + if M.routers[name] then > + return nil, lerror.vshard(lerror.code.ROUTER_ALREADY_EXISTS, name) > + end > + local router = table.deepcopy(ROUTER_TEMPLATE) > + setmetatable(router, router_mt) > + router.name = name > + M.routers[name] = router > + router_cfg(router, cfg) 3. router_cfg can raise an error from box.cfg. So on an error lets catch it, remove the router from M.routers and rethrow the error. In other things the patch LGTM. Please, fix the comments above and I will push it. Thank you for working on this! ^ permalink raw reply [flat|nested] 23+ messages in thread
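[Editorial note: remark 3 above asks that `router_new` not leave a half-configured router registered when `router_cfg` raises (e.g. from `box.cfg`). A sketch of one way to satisfy the remark, using the names from the quoted patch (`M`, `ROUTER_TEMPLATE`, `router_mt`, `lerror`, `router_cfg`) — the actual follow-up commit may differ:]

```lua
local function router_new(name, cfg)
    if type(name) ~= 'string' or type(cfg) ~= 'table' then
        error('Wrong argument type. Usage: vshard.router.new(name, cfg).')
    end
    if M.routers[name] then
        return nil, lerror.vshard(lerror.code.ROUTER_ALREADY_EXISTS, name)
    end
    local router = table.deepcopy(ROUTER_TEMPLATE)
    setmetatable(router, router_mt)
    router.name = name
    M.routers[name] = router
    -- router_cfg can raise an error from box.cfg. Catch it, unregister
    -- the half-configured router, and rethrow, so a failed new() does
    -- not leave a broken instance in M.routers.
    local ok, err = pcall(router_cfg, router, cfg)
    if not ok then
        M.routers[name] = nil
        error(err)
    end
    return router
end
```

[The registration-before-configure order is kept so that `router_cfg` can already see the router under its name; only the cleanup on failure is new.]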
* [tarantool-patches] Re: [PATCH 3/3] Introduce multiple routers feature 2018-08-08 12:28 ` Vladislav Shpilevoy @ 2018-08-08 14:04 ` Alex Khatskevich 2018-08-08 15:37 ` Vladislav Shpilevoy 0 siblings, 1 reply; 23+ messages in thread From: Alex Khatskevich @ 2018-08-08 14:04 UTC (permalink / raw) To: Vladislav Shpilevoy, tarantool-patches On 08.08.2018 15:28, Vladislav Shpilevoy wrote: > Thanks for the fixes! > > 1. Please, rebase on the master. I've failed to do it > easy. > Done > 2. Please, adding a new commit send it to the same thread. > I am talking about "Fix: do not update route map in place". > > Since you've not sent it, I review it here. > > 2.1. At first, please, prefix the commit title with a > subsystem name the patch is for. Here it is not "Fix: ", > but "router: ". > > 2.2. We know a new route map size before rebuild - it is > equal to the total bucket count. So it can be allocated > once via table.new(total_bucket_count, 0). It allows to > avoid reallocs. > > I've fixed both remarks and pushed the commit into the > master. > Thanks >> +local function router_new(name, cfg) >> + if type(name) ~= 'string' or type(cfg) ~= 'table' then >> + error('Wrong argument type. Usage: >> vshard.router.new(name, cfg).') >> + end >> + if M.routers[name] then >> + return nil, lerror.vshard(lerror.code.ROUTER_ALREADY_EXISTS, >> name) >> + end >> + local router = table.deepcopy(ROUTER_TEMPLATE) >> + setmetatable(router, router_mt) >> + router.name = name >> + M.routers[name] = router >> + router_cfg(router, cfg) > > 3. router_cfg can raise an error from box.cfg. So on an error lets > catch it, > remove the router from M.routers and rethrow the error. Done > > In other things the patch LGTM. Please, fix the comments above and I will > push it. Thank you for working on this! 
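[Editorial note: remark 2.2 in the review above — preallocating the rebuilt route map via `table.new(total_bucket_count, 0)` — boils down to the following pattern. A sketch only; `table.new` is the LuaJIT extension (available in Tarantool via `require('table.new')`), and the function name here is illustrative, not from the patch:]

```lua
local table_new = require('table.new')

-- Rebuild the route map into a fresh table whose array part is
-- preallocated for `total_bucket_count` slots, so filling it never
-- triggers an array-part realloc. Each bucket is remapped onto the
-- replicaset object from the new configuration, matched by UUID.
local function rebuild_route_map(old_route_map, replicasets,
                                 total_bucket_count)
    local route_map = table_new(total_bucket_count, 0)
    for bucket, rs in pairs(old_route_map) do
        route_map[bucket] = replicasets[rs.uuid]
    end
    return route_map
end
```

[Since the map size is known up front (it is exactly the bucket count), the single preallocation replaces the incremental growth a plain `{}` would go through.]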
full diff commit 5cc3991487b6b212ef1c35880963c020e443200e Author: AKhatskevich <avkhatskevich@tarantool.org> Date: Thu Jul 26 16:17:25 2018 +0300 Introduce multiple routers feature Key points: * Old `vshard.router.some_method()` api is preserved. * Add `vshard.router.new(name, cfg)` method which returns a new router. * Each router has its own: 1. name 2. background fibers 3. attributes (route_map, replicasets, outdate_delay...) * Module reload reloads all configured routers. * `cfg` reconfigures a single router. * All routers share the same box configuration. The last passed config overrides the global box config. * Multiple router instances can be connected to the same cluster. * By now, a router cannot be destroyed. Extra changes: * Add `data` parameter to `reloadable_fiber_create` function. Closes #130 diff --git a/test/failover/failover.result b/test/failover/failover.result index 73a4250..50410ad 100644 --- a/test/failover/failover.result +++ b/test/failover/failover.result @@ -174,7 +174,7 @@ test_run:switch('router_1') --- - true ... -rs1 = vshard.router.internal.replicasets[rs_uuid[1]] +rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]] --- ... while not rs1.replica_up_ts do fiber.sleep(0.1) end diff --git a/test/failover/failover.test.lua b/test/failover/failover.test.lua index 6e06314..44c8b6d 100644 --- a/test/failover/failover.test.lua +++ b/test/failover/failover.test.lua @@ -74,7 +74,7 @@ echo_count -- Ensure that replica_up_ts is updated periodically. 
test_run:switch('router_1') -rs1 = vshard.router.internal.replicasets[rs_uuid[1]] +rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]] while not rs1.replica_up_ts do fiber.sleep(0.1) end old_up_ts = rs1.replica_up_ts while rs1.replica_up_ts == old_up_ts do fiber.sleep(0.1) end diff --git a/test/failover/failover_errinj.result b/test/failover/failover_errinj.result index 3b6d986..484a1e3 100644 --- a/test/failover/failover_errinj.result +++ b/test/failover/failover_errinj.result @@ -49,7 +49,7 @@ vshard.router.cfg(cfg) -- Check that already run failover step is restarted on -- configuration change (if some replicasets are removed from -- config). -rs1 = vshard.router.internal.replicasets[rs_uuid[1]] +rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]] --- ... while not rs1.replica or not rs1.replica.conn:is_connected() do fiber.sleep(0.1) end diff --git a/test/failover/failover_errinj.test.lua b/test/failover/failover_errinj.test.lua index b4d2d35..14228de 100644 --- a/test/failover/failover_errinj.test.lua +++ b/test/failover/failover_errinj.test.lua @@ -20,7 +20,7 @@ vshard.router.cfg(cfg) -- Check that already run failover step is restarted on -- configuration change (if some replicasets are removed from -- config). 
-rs1 = vshard.router.internal.replicasets[rs_uuid[1]] +rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]] while not rs1.replica or not rs1.replica.conn:is_connected() do fiber.sleep(0.1) end vshard.router.internal.errinj.ERRINJ_FAILOVER_CHANGE_CFG = true wait_state('Configuration has changed, restart ') diff --git a/test/failover/router_1.lua b/test/failover/router_1.lua index d71209b..664a6c6 100644 --- a/test/failover/router_1.lua +++ b/test/failover/router_1.lua @@ -42,7 +42,7 @@ end function priority_order() local ret = {} for _, uuid in pairs(rs_uuid) do - local rs = vshard.router.internal.replicasets[uuid] + local rs = vshard.router.internal.static_router.replicasets[uuid] local sorted = {} for _, replica in pairs(rs.priority_list) do local z diff --git a/test/misc/reconfigure.result b/test/misc/reconfigure.result index c7960b3..311f749 100644 --- a/test/misc/reconfigure.result +++ b/test/misc/reconfigure.result @@ -250,7 +250,7 @@ test_run:switch('router_1') -- Ensure that in a case of error router internals are not -- changed. -- -not vshard.router.internal.collect_lua_garbage +not vshard.router.internal.static_router.collect_lua_garbage --- - true ... @@ -264,7 +264,7 @@ vshard.router.cfg(cfg) --- - error: 'Incorrect value for option ''invalid_option'': unexpected option' ... -not vshard.router.internal.collect_lua_garbage +not vshard.router.internal.static_router.collect_lua_garbage --- - true ... diff --git a/test/misc/reconfigure.test.lua b/test/misc/reconfigure.test.lua index 25dc2ca..298b9b0 100644 --- a/test/misc/reconfigure.test.lua +++ b/test/misc/reconfigure.test.lua @@ -99,11 +99,11 @@ test_run:switch('router_1') -- Ensure that in a case of error router internals are not -- changed. 
-- -not vshard.router.internal.collect_lua_garbage +not vshard.router.internal.static_router.collect_lua_garbage cfg.collect_lua_garbage = true cfg.invalid_option = 'kek' vshard.router.cfg(cfg) -not vshard.router.internal.collect_lua_garbage +not vshard.router.internal.static_router.collect_lua_garbage cfg.invalid_option = nil cfg.collect_lua_garbage = nil vshard.router.cfg(cfg) diff --git a/test/multiple_routers/configs.lua b/test/multiple_routers/configs.lua new file mode 100644 index 0000000..a6ce33c --- /dev/null +++ b/test/multiple_routers/configs.lua @@ -0,0 +1,81 @@ +names = { + storage_1_1_a = '32a2d4b8-f146-44ed-9d51-2436507efdf8', + storage_1_1_b = 'c1c849b1-641d-40b8-9283-bcfe73d46270', + storage_1_2_a = '04e677ed-c7ba-47e0-a67f-b5100cfa86af', + storage_1_2_b = 'c7a979ee-9263-4a38-84a5-2fb6a0a32684', + storage_2_1_a = '88dc03f0-23fb-4f05-b462-e29186542864', + storage_2_1_b = '4230b711-f5c4-4131-bf98-88cd43a16901', + storage_2_2_a = '6b1eefbc-1e2e-410e-84ff-44c572ea9916', + storage_2_2_b = 'be74419a-1e56-4ba4-97e9-6b18710f63c5', +} + +rs_1_1 = 'dd208fb8-8b90-49bc-8393-6b3a99da7c52' +rs_1_2 = 'af9cfe88-2091-4613-a877-a623776c5c0e' +rs_2_1 = '9ca8ee15-ae18-4f31-9385-4859f89ce73f' +rs_2_2 = '007f5f58-b654-4125-8441-a71866fb62b5' + +local cfg_1 = {} +cfg_1.sharding = { + [rs_1_1] = { + replicas = { + [names.storage_1_1_a] = { + uri = 'storage:storage@127.0.0.1:3301', + name = 'storage_1_1_a', + master = true, + }, + [names.storage_1_1_b] = { + uri = 'storage:storage@127.0.0.1:3302', + name = 'storage_1_1_b', + }, + } + }, + [rs_1_2] = { + replicas = { + [names.storage_1_2_a] = { + uri = 'storage:storage@127.0.0.1:3303', + name = 'storage_1_2_a', + master = true, + }, + [names.storage_1_2_b] = { + uri = 'storage:storage@127.0.0.1:3304', + name = 'storage_1_2_b', + }, + } + }, +} + + +local cfg_2 = {} +cfg_2.sharding = { + [rs_2_1] = { + replicas = { + [names.storage_2_1_a] = { + uri = 'storage:storage@127.0.0.1:3305', + name = 'storage_2_1_a', + master = true, 
+ }, + [names.storage_2_1_b] = { + uri = 'storage:storage@127.0.0.1:3306', + name = 'storage_2_1_b', + }, + } + }, + [rs_2_2] = { + replicas = { + [names.storage_2_2_a] = { + uri = 'storage:storage@127.0.0.1:3307', + name = 'storage_2_2_a', + master = true, + }, + [names.storage_2_2_b] = { + uri = 'storage:storage@127.0.0.1:3308', + name = 'storage_2_2_b', + }, + } + }, +} + +return { + cfg_1 = cfg_1, + cfg_2 = cfg_2, +} diff --git a/test/multiple_routers/multiple_routers.result b/test/multiple_routers/multiple_routers.result new file mode 100644 index 0000000..5b85e1c --- /dev/null +++ b/test/multiple_routers/multiple_routers.result @@ -0,0 +1,301 @@ +test_run = require('test_run').new() +--- +... +REPLICASET_1_1 = { 'storage_1_1_a', 'storage_1_1_b' } +--- +... +REPLICASET_1_2 = { 'storage_1_2_a', 'storage_1_2_b' } +--- +... +REPLICASET_2_1 = { 'storage_2_1_a', 'storage_2_1_b' } +--- +... +REPLICASET_2_2 = { 'storage_2_2_a', 'storage_2_2_b' } +--- +... +test_run:create_cluster(REPLICASET_1_1, 'multiple_routers') +--- +... +test_run:create_cluster(REPLICASET_1_2, 'multiple_routers') +--- +... +test_run:create_cluster(REPLICASET_2_1, 'multiple_routers') +--- +... +test_run:create_cluster(REPLICASET_2_2, 'multiple_routers') +--- +... +util = require('lua_libs.util') +--- +... +util.wait_master(test_run, REPLICASET_1_1, 'storage_1_1_a') +--- +... +util.wait_master(test_run, REPLICASET_1_2, 'storage_1_2_a') +--- +... +util.wait_master(test_run, REPLICASET_2_1, 'storage_2_1_a') +--- +... +util.wait_master(test_run, REPLICASET_2_2, 'storage_2_2_a') +--- +... +test_run:cmd("create server router_1 with script='multiple_routers/router_1.lua'") +--- +- true +... +test_run:cmd("start server router_1") +--- +- true +... +-- Configure default (static) router. +_ = test_run:cmd("switch router_1") +--- +... +vshard.router.cfg(configs.cfg_1) +--- +... +vshard.router.bootstrap() +--- +- true +... +_ = test_run:cmd("switch storage_1_2_a") +--- +... 
+wait_rebalancer_state('The cluster is balanced ok', test_run) +--- +... +_ = test_run:cmd("switch router_1") +--- +... +vshard.router.call(1, 'write', 'do_replace', {{1, 1}}) +--- +- true +... +vshard.router.call(1, 'read', 'do_select', {1}) +--- +- [[1, 1]] +... +-- Test that static router is just a router object under the hood. +static_router = vshard.router.internal.static_router +--- +... +static_router:route(1) == vshard.router.route(1) +--- +- true +... +-- Configure extra router. +router_2 = vshard.router.new('router_2', configs.cfg_2) +--- +... +router_2:bootstrap() +--- +- true +... +_ = test_run:cmd("switch storage_2_2_a") +--- +... +wait_rebalancer_state('The cluster is balanced ok', test_run) +--- +... +_ = test_run:cmd("switch router_1") +--- +... +router_2:call(1, 'write', 'do_replace', {{2, 2}}) +--- +- true +... +router_2:call(1, 'read', 'do_select', {2}) +--- +- [[2, 2]] +... +-- Check that router_2 and static router serves different clusters. +#router_2:call(1, 'read', 'do_select', {1}) == 0 +--- +- true +... +-- Create several routers to the same cluster. +routers = {} +--- +... +for i = 3, 10 do routers[i] = vshard.router.new('router_' .. i, configs.cfg_2) end +--- +... +routers[3]:call(1, 'read', 'do_select', {2}) +--- +- [[2, 2]] +... +-- Check that they have their own background fibers. +fiber_names = {} +--- +... +for i = 2, 10 do fiber_names['vshard.failover.router_' .. i] = true; fiber_names['vshard.discovery.router_' .. i] = true; end +--- +... +next(fiber_names) ~= nil +--- +- true +... +fiber = require('fiber') +--- +... +for _, xfiber in pairs(fiber.info()) do fiber_names[xfiber.name] = nil end +--- +... +next(fiber_names) == nil +--- +- true +... +-- Reconfigure one of routers do not affect the others. +routers[3]:cfg(configs.cfg_1) +--- +... +routers[3]:call(1, 'read', 'do_select', {1}) +--- +- [[1, 1]] +... +#routers[3]:call(1, 'read', 'do_select', {2}) == 0 +--- +- true +... 
+#routers[4]:call(1, 'read', 'do_select', {1}) == 0 +--- +- true +... +routers[4]:call(1, 'read', 'do_select', {2}) +--- +- [[2, 2]] +... +routers[3]:cfg(configs.cfg_2) +--- +... +-- Try to create a router with the same name. +util = require('lua_libs.util') +--- +... +util.check_error(vshard.router.new, 'router_2', configs.cfg_2) +--- +- null +- type: ShardingError + code: 21 + name: ROUTER_ALREADY_EXISTS + message: Router with name router_2 already exists +... +-- Reload router module. +_, old_rs_1 = next(vshard.router.internal.static_router.replicasets) +--- +... +_, old_rs_2 = next(router_2.replicasets) +--- +... +package.loaded['vshard.router'] = nil +--- +... +vshard.router = require('vshard.router') +--- +... +while not old_rs_1.is_outdated do fiber.sleep(0.01) end +--- +... +while not old_rs_2.is_outdated do fiber.sleep(0.01) end +--- +... +vshard.router.call(1, 'read', 'do_select', {1}) +--- +- [[1, 1]] +... +router_2:call(1, 'read', 'do_select', {2}) +--- +- [[2, 2]] +... +routers[5]:call(1, 'read', 'do_select', {2}) +--- +- [[2, 2]] +... +-- Check lua_gc counter. +lua_gc = require('vshard.lua_gc') +--- +... +vshard.router.internal.collect_lua_garbage_cnt == 0 +--- +- true +... +lua_gc.internal.bg_fiber == nil +--- +- true +... +configs.cfg_2.collect_lua_garbage = true +--- +... +routers[5]:cfg(configs.cfg_2) +--- +... +lua_gc.internal.bg_fiber ~= nil +--- +- true +... +routers[7]:cfg(configs.cfg_2) +--- +... +lua_gc.internal.bg_fiber ~= nil +--- +- true +... +vshard.router.internal.collect_lua_garbage_cnt == 2 +--- +- true +... +package.loaded['vshard.router'] = nil +--- +... +vshard.router = require('vshard.router') +--- +... +vshard.router.internal.collect_lua_garbage_cnt == 2 +--- +- true +... +configs.cfg_2.collect_lua_garbage = nil +--- +... +routers[5]:cfg(configs.cfg_2) +--- +... +lua_gc.internal.bg_fiber ~= nil +--- +- true +... +routers[7]:cfg(configs.cfg_2) +--- +... +vshard.router.internal.collect_lua_garbage_cnt == 0 +--- +- true +...
+lua_gc.internal.bg_fiber == nil +--- +- true +... +_ = test_run:cmd("switch default") +--- +... +test_run:cmd("stop server router_1") +--- +- true +... +test_run:cmd("cleanup server router_1") +--- +- true +... +test_run:drop_cluster(REPLICASET_1_1) +--- +... +test_run:drop_cluster(REPLICASET_1_2) +--- +... +test_run:drop_cluster(REPLICASET_2_1) +--- +... +test_run:drop_cluster(REPLICASET_2_2) +--- +... diff --git a/test/multiple_routers/multiple_routers.test.lua b/test/multiple_routers/multiple_routers.test.lua new file mode 100644 index 0000000..ec3c7f7 --- /dev/null +++ b/test/multiple_routers/multiple_routers.test.lua @@ -0,0 +1,109 @@ +test_run = require('test_run').new() + +REPLICASET_1_1 = { 'storage_1_1_a', 'storage_1_1_b' } +REPLICASET_1_2 = { 'storage_1_2_a', 'storage_1_2_b' } +REPLICASET_2_1 = { 'storage_2_1_a', 'storage_2_1_b' } +REPLICASET_2_2 = { 'storage_2_2_a', 'storage_2_2_b' } + +test_run:create_cluster(REPLICASET_1_1, 'multiple_routers') +test_run:create_cluster(REPLICASET_1_2, 'multiple_routers') +test_run:create_cluster(REPLICASET_2_1, 'multiple_routers') +test_run:create_cluster(REPLICASET_2_2, 'multiple_routers') +util = require('lua_libs.util') +util.wait_master(test_run, REPLICASET_1_1, 'storage_1_1_a') +util.wait_master(test_run, REPLICASET_1_2, 'storage_1_2_a') +util.wait_master(test_run, REPLICASET_2_1, 'storage_2_1_a') +util.wait_master(test_run, REPLICASET_2_2, 'storage_2_2_a') + +test_run:cmd("create server router_1 with script='multiple_routers/router_1.lua'") +test_run:cmd("start server router_1") + +-- Configure default (static) router. 
+_ = test_run:cmd("switch router_1") +vshard.router.cfg(configs.cfg_1) +vshard.router.bootstrap() +_ = test_run:cmd("switch storage_1_2_a") +wait_rebalancer_state('The cluster is balanced ok', test_run) +_ = test_run:cmd("switch router_1") + +vshard.router.call(1, 'write', 'do_replace', {{1, 1}}) +vshard.router.call(1, 'read', 'do_select', {1}) + +-- Test that static router is just a router object under the hood. +static_router = vshard.router.internal.static_router +static_router:route(1) == vshard.router.route(1) + +-- Configure extra router. +router_2 = vshard.router.new('router_2', configs.cfg_2) +router_2:bootstrap() +_ = test_run:cmd("switch storage_2_2_a") +wait_rebalancer_state('The cluster is balanced ok', test_run) +_ = test_run:cmd("switch router_1") + +router_2:call(1, 'write', 'do_replace', {{2, 2}}) +router_2:call(1, 'read', 'do_select', {2}) +-- Check that router_2 and the static router serve different clusters. +#router_2:call(1, 'read', 'do_select', {1}) == 0 + +-- Create several routers to the same cluster. +routers = {} +for i = 3, 10 do routers[i] = vshard.router.new('router_' .. i, configs.cfg_2) end +routers[3]:call(1, 'read', 'do_select', {2}) +-- Check that they have their own background fibers. +fiber_names = {} +for i = 2, 10 do fiber_names['vshard.failover.router_' .. i] = true; fiber_names['vshard.discovery.router_' .. i] = true; end +next(fiber_names) ~= nil +fiber = require('fiber') +for _, xfiber in pairs(fiber.info()) do fiber_names[xfiber.name] = nil end +next(fiber_names) == nil + +-- Reconfiguring one of the routers does not affect the others. +routers[3]:cfg(configs.cfg_1) +routers[3]:call(1, 'read', 'do_select', {1}) +#routers[3]:call(1, 'read', 'do_select', {2}) == 0 +#routers[4]:call(1, 'read', 'do_select', {1}) == 0 +routers[4]:call(1, 'read', 'do_select', {2}) +routers[3]:cfg(configs.cfg_2) + +-- Try to create a router with the same name.
+util = require('lua_libs.util') +util.check_error(vshard.router.new, 'router_2', configs.cfg_2) + +-- Reload router module. +_, old_rs_1 = next(vshard.router.internal.static_router.replicasets) +_, old_rs_2 = next(router_2.replicasets) +package.loaded['vshard.router'] = nil +vshard.router = require('vshard.router') +while not old_rs_1.is_outdated do fiber.sleep(0.01) end +while not old_rs_2.is_outdated do fiber.sleep(0.01) end +vshard.router.call(1, 'read', 'do_select', {1}) +router_2:call(1, 'read', 'do_select', {2}) +routers[5]:call(1, 'read', 'do_select', {2}) + +-- Check lua_gc counter. +lua_gc = require('vshard.lua_gc') +vshard.router.internal.collect_lua_garbage_cnt == 0 +lua_gc.internal.bg_fiber == nil +configs.cfg_2.collect_lua_garbage = true +routers[5]:cfg(configs.cfg_2) +lua_gc.internal.bg_fiber ~= nil +routers[7]:cfg(configs.cfg_2) +lua_gc.internal.bg_fiber ~= nil +vshard.router.internal.collect_lua_garbage_cnt == 2 +package.loaded['vshard.router'] = nil +vshard.router = require('vshard.router') +vshard.router.internal.collect_lua_garbage_cnt == 2 +configs.cfg_2.collect_lua_garbage = nil +routers[5]:cfg(configs.cfg_2) +lua_gc.internal.bg_fiber ~= nil +routers[7]:cfg(configs.cfg_2) +vshard.router.internal.collect_lua_garbage_cnt == 0 +lua_gc.internal.bg_fiber == nil + +_ = test_run:cmd("switch default") +test_run:cmd("stop server router_1") +test_run:cmd("cleanup server router_1") +test_run:drop_cluster(REPLICASET_1_1) +test_run:drop_cluster(REPLICASET_1_2) +test_run:drop_cluster(REPLICASET_2_1) +test_run:drop_cluster(REPLICASET_2_2) diff --git a/test/multiple_routers/router_1.lua b/test/multiple_routers/router_1.lua new file mode 100644 index 0000000..2e9ea91 --- /dev/null +++ b/test/multiple_routers/router_1.lua @@ -0,0 +1,15 @@ +#!/usr/bin/env tarantool + +require('strict').on() + +-- Get instance name +local fio = require('fio') +local NAME = fio.basename(arg[0], '.lua') + +require('console').listen(os.getenv('ADMIN')) + +configs = 
require('configs') + +-- Start the database with sharding +vshard = require('vshard') +box.cfg{} diff --git a/test/multiple_routers/storage_1_1_a.lua b/test/multiple_routers/storage_1_1_a.lua new file mode 100644 index 0000000..b44a97a --- /dev/null +++ b/test/multiple_routers/storage_1_1_a.lua @@ -0,0 +1,23 @@ +#!/usr/bin/env tarantool + +require('strict').on() + +-- Get instance name. +local fio = require('fio') +NAME = fio.basename(arg[0], '.lua') + +require('console').listen(os.getenv('ADMIN')) + +-- Fetch config for the cluster of the instance. +if NAME:sub(9,9) == '1' then + cfg = require('configs').cfg_1 +else + cfg = require('configs').cfg_2 +end + +-- Start the database with sharding. +vshard = require('vshard') +vshard.storage.cfg(cfg, names[NAME]) + +-- Bootstrap storage. +require('lua_libs.bootstrap') diff --git a/test/multiple_routers/storage_1_1_b.lua b/test/multiple_routers/storage_1_1_b.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_1_1_b.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/storage_1_2_a.lua b/test/multiple_routers/storage_1_2_a.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_1_2_a.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/storage_1_2_b.lua b/test/multiple_routers/storage_1_2_b.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_1_2_b.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/storage_2_1_a.lua b/test/multiple_routers/storage_2_1_a.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_2_1_a.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/storage_2_1_b.lua b/test/multiple_routers/storage_2_1_b.lua new file mode 120000 index 
0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_2_1_b.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/storage_2_2_a.lua b/test/multiple_routers/storage_2_2_a.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_2_2_a.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/storage_2_2_b.lua b/test/multiple_routers/storage_2_2_b.lua new file mode 120000 index 0000000..76d196b --- /dev/null +++ b/test/multiple_routers/storage_2_2_b.lua @@ -0,0 +1 @@ +storage_1_1_a.lua \ No newline at end of file diff --git a/test/multiple_routers/suite.ini b/test/multiple_routers/suite.ini new file mode 100644 index 0000000..d2d4470 --- /dev/null +++ b/test/multiple_routers/suite.ini @@ -0,0 +1,6 @@ +[default] +core = tarantool +description = Multiple routers tests +script = test.lua +is_parallel = False +lua_libs = ../lua_libs configs.lua diff --git a/test/multiple_routers/test.lua b/test/multiple_routers/test.lua new file mode 100644 index 0000000..cb7c1ee --- /dev/null +++ b/test/multiple_routers/test.lua @@ -0,0 +1,9 @@ +#!/usr/bin/env tarantool + +require('strict').on() + +box.cfg{ + listen = os.getenv("LISTEN"), +} + +require('console').listen(os.getenv('ADMIN')) diff --git a/test/router/exponential_timeout.result b/test/router/exponential_timeout.result index fb54d0f..6748b64 100644 --- a/test/router/exponential_timeout.result +++ b/test/router/exponential_timeout.result @@ -37,10 +37,10 @@ test_run:cmd('switch router_1') util = require('util') --- ... -rs1 = vshard.router.internal.replicasets[replicasets[1]] +rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]] --- ... -rs2 = vshard.router.internal.replicasets[replicasets[2]] +rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]] --- ... 
util.collect_timeouts(rs1) diff --git a/test/router/exponential_timeout.test.lua b/test/router/exponential_timeout.test.lua index 3ec0b8c..75d85bf 100644 --- a/test/router/exponential_timeout.test.lua +++ b/test/router/exponential_timeout.test.lua @@ -13,8 +13,8 @@ test_run:cmd("start server router_1") test_run:cmd('switch router_1') util = require('util') -rs1 = vshard.router.internal.replicasets[replicasets[1]] -rs2 = vshard.router.internal.replicasets[replicasets[2]] +rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]] +rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]] util.collect_timeouts(rs1) util.collect_timeouts(rs2) diff --git a/test/router/reconnect_to_master.result b/test/router/reconnect_to_master.result index 5e678ce..d502723 100644 --- a/test/router/reconnect_to_master.result +++ b/test/router/reconnect_to_master.result @@ -76,7 +76,7 @@ _ = test_run:cmd('stop server storage_1_a') _ = test_run:switch('router_1') --- ... -reps = vshard.router.internal.replicasets +reps = vshard.router.internal.static_router.replicasets --- ... test_run:cmd("setopt delimiter ';'") @@ -95,7 +95,7 @@ end; ... function count_known_buckets() local known_buckets = 0 - for _, id in pairs(vshard.router.internal.route_map) do + for _, id in pairs(vshard.router.internal.static_router.route_map) do known_buckets = known_buckets + 1 end return known_buckets @@ -127,7 +127,7 @@ is_disconnected() fiber = require('fiber') --- ... -while vshard.router.internal.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end +while vshard.router.internal.static_router.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end --- ... 
vshard.router.info() diff --git a/test/router/reconnect_to_master.test.lua b/test/router/reconnect_to_master.test.lua index 39ba90e..8820fa7 100644 --- a/test/router/reconnect_to_master.test.lua +++ b/test/router/reconnect_to_master.test.lua @@ -34,7 +34,7 @@ _ = test_run:cmd('stop server storage_1_a') _ = test_run:switch('router_1') -reps = vshard.router.internal.replicasets +reps = vshard.router.internal.static_router.replicasets test_run:cmd("setopt delimiter ';'") function is_disconnected() for i, rep in pairs(reps) do @@ -46,7 +46,7 @@ function is_disconnected() end; function count_known_buckets() local known_buckets = 0 - for _, id in pairs(vshard.router.internal.route_map) do + for _, id in pairs(vshard.router.internal.static_router.route_map) do known_buckets = known_buckets + 1 end return known_buckets @@ -63,7 +63,7 @@ is_disconnected() -- Wait until replica is connected to test alerts on unavailable -- master. fiber = require('fiber') -while vshard.router.internal.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end +while vshard.router.internal.static_router.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end vshard.router.info() -- Return master. diff --git a/test/router/reload.result b/test/router/reload.result index f0badc3..98e8e71 100644 --- a/test/router/reload.result +++ b/test/router/reload.result @@ -229,7 +229,7 @@ vshard.router.cfg(cfg) cfg.connection_outdate_delay = old_connection_delay --- ... -vshard.router.internal.connection_outdate_delay = nil +vshard.router.internal.static_router.connection_outdate_delay = nil --- ... 
rs_new = vshard.router.route(1) diff --git a/test/router/reload.test.lua b/test/router/reload.test.lua index 528222a..293cb26 100644 --- a/test/router/reload.test.lua +++ b/test/router/reload.test.lua @@ -104,7 +104,7 @@ old_connection_delay = cfg.connection_outdate_delay cfg.connection_outdate_delay = 0.3 vshard.router.cfg(cfg) cfg.connection_outdate_delay = old_connection_delay -vshard.router.internal.connection_outdate_delay = nil +vshard.router.internal.static_router.connection_outdate_delay = nil rs_new = vshard.router.route(1) rs_old = rs _, replica_old = next(rs_old.replicas) diff --git a/test/router/reroute_wrong_bucket.result b/test/router/reroute_wrong_bucket.result index 7f2a494..989dc79 100644 --- a/test/router/reroute_wrong_bucket.result +++ b/test/router/reroute_wrong_bucket.result @@ -98,7 +98,7 @@ vshard.router.call(100, 'read', 'customer_lookup', {1}, {timeout = 100}) --- - {'accounts': [], 'customer_id': 1, 'name': 'name'} ... -vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]] +vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]] --- ... vshard.router.call(100, 'write', 'customer_add', {{customer_id = 2, bucket_id = 100, name = 'name2', accounts = {}}}, {timeout = 100}) @@ -146,13 +146,13 @@ test_run:switch('router_1') ... -- Emulate a situation, when a replicaset_2 while is unknown for -- router, but is already known for storages. -save_rs2 = vshard.router.internal.replicasets[replicasets[2]] +save_rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]] --- ... -vshard.router.internal.replicasets[replicasets[2]] = nil +vshard.router.internal.static_router.replicasets[replicasets[2]] = nil --- ... -vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]] +vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]] --- ... 
fiber = require('fiber') @@ -207,7 +207,7 @@ err require('log').info(string.rep('a', 1000)) --- ... -vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]] +vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]] --- ... call_retval = nil @@ -219,7 +219,7 @@ f = fiber.create(do_call, 100) while not test_run:grep_log('router_1', 'please update configuration', 1000) do fiber.sleep(0.1) end --- ... -vshard.router.internal.replicasets[replicasets[2]] = save_rs2 +vshard.router.internal.static_router.replicasets[replicasets[2]] = save_rs2 --- ... while not call_retval do fiber.sleep(0.1) end diff --git a/test/router/reroute_wrong_bucket.test.lua b/test/router/reroute_wrong_bucket.test.lua index 03384d1..a00f941 100644 --- a/test/router/reroute_wrong_bucket.test.lua +++ b/test/router/reroute_wrong_bucket.test.lua @@ -35,7 +35,7 @@ customer_add({customer_id = 1, bucket_id = 100, name = 'name', accounts = {}}) test_run:switch('router_1') vshard.router.call(100, 'read', 'customer_lookup', {1}, {timeout = 100}) -vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]] +vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]] vshard.router.call(100, 'write', 'customer_add', {{customer_id = 2, bucket_id = 100, name = 'name2', accounts = {}}}, {timeout = 100}) -- Create cycle. @@ -55,9 +55,9 @@ box.space._bucket:replace({100, vshard.consts.BUCKET.SENT, replicasets[2]}) test_run:switch('router_1') -- Emulate a situation, when a replicaset_2 while is unknown for -- router, but is already known for storages. 
-save_rs2 = vshard.router.internal.replicasets[replicasets[2]] -vshard.router.internal.replicasets[replicasets[2]] = nil -vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]] +save_rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]] +vshard.router.internal.static_router.replicasets[replicasets[2]] = nil +vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]] fiber = require('fiber') call_retval = nil @@ -84,11 +84,11 @@ err -- detect it and end with ok. -- require('log').info(string.rep('a', 1000)) -vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]] +vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]] call_retval = nil f = fiber.create(do_call, 100) while not test_run:grep_log('router_1', 'please update configuration', 1000) do fiber.sleep(0.1) end -vshard.router.internal.replicasets[replicasets[2]] = save_rs2 +vshard.router.internal.static_router.replicasets[replicasets[2]] = save_rs2 while not call_retval do fiber.sleep(0.1) end call_retval vshard.router.call(100, 'read', 'customer_lookup', {3}, {timeout = 1}) diff --git a/test/router/retry_reads.result b/test/router/retry_reads.result index 64b0ff3..b803ae3 100644 --- a/test/router/retry_reads.result +++ b/test/router/retry_reads.result @@ -37,7 +37,7 @@ test_run:cmd('switch router_1') util = require('util') --- ... -rs1 = vshard.router.internal.replicasets[replicasets[1]] +rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]] --- ... 
min_timeout = vshard.consts.CALL_TIMEOUT_MIN diff --git a/test/router/retry_reads.test.lua b/test/router/retry_reads.test.lua index 2fb2fc7..510e961 100644 --- a/test/router/retry_reads.test.lua +++ b/test/router/retry_reads.test.lua @@ -13,7 +13,7 @@ test_run:cmd("start server router_1") test_run:cmd('switch router_1') util = require('util') -rs1 = vshard.router.internal.replicasets[replicasets[1]] +rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]] min_timeout = vshard.consts.CALL_TIMEOUT_MIN -- diff --git a/test/router/router.result b/test/router/router.result index 45394e1..ceaf672 100644 --- a/test/router/router.result +++ b/test/router/router.result @@ -70,10 +70,10 @@ test_run:grep_log('router_1', 'connected to ') --- - 'connected to ' ... -rs1 = vshard.router.internal.replicasets[replicasets[1]] +rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]] --- ... -rs2 = vshard.router.internal.replicasets[replicasets[2]] +rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]] --- ... fiber = require('fiber') @@ -95,7 +95,7 @@ rs2.replica == rs2.master -- Part of gh-76: on reconfiguration do not recreate connections -- to replicas, that are kept in a new configuration. -- -old_replicasets = vshard.router.internal.replicasets +old_replicasets = vshard.router.internal.static_router.replicasets --- ... old_connections = {} @@ -127,17 +127,17 @@ connection_count == 4 vshard.router.cfg(cfg) --- ... -new_replicasets = vshard.router.internal.replicasets +new_replicasets = vshard.router.internal.static_router.replicasets --- ... old_replicasets ~= new_replicasets --- - true ... -rs1 = vshard.router.internal.replicasets[replicasets[1]] +rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]] --- ... -rs2 = vshard.router.internal.replicasets[replicasets[2]] +rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]] --- ... 
while not rs1.replica or not rs2.replica do fiber.sleep(0.1) end @@ -225,7 +225,7 @@ vshard.router.bootstrap() -- -- gh-108: negative bucket count on discovery. -- -vshard.router.internal.route_map = {} +vshard.router.internal.static_router.route_map = {} --- ... rets = {} @@ -456,7 +456,7 @@ conn.state rs_uuid = '<replicaset_2>' --- ... -rs = vshard.router.internal.replicasets[rs_uuid] +rs = vshard.router.internal.static_router.replicasets[rs_uuid] --- ... master = rs.master @@ -605,7 +605,7 @@ vshard.router.info() ... -- Remove replica and master connections to trigger alert -- UNREACHABLE_REPLICASET. -rs = vshard.router.internal.replicasets[replicasets[1]] +rs = vshard.router.internal.static_router.replicasets[replicasets[1]] --- ... master_conn = rs.master.conn @@ -749,7 +749,7 @@ test_run:cmd("setopt delimiter ';'") ... function calculate_known_buckets() local known_buckets = 0 - for _, rs in pairs(vshard.router.internal.route_map) do + for _, rs in pairs(vshard.router.internal.static_router.route_map) do known_buckets = known_buckets + 1 end return known_buckets @@ -851,10 +851,10 @@ test_run:cmd("setopt delimiter ';'") - true ... for i = 1, 100 do - local rs = vshard.router.internal.route_map[i] + local rs = vshard.router.internal.static_router.route_map[i] assert(rs) rs.bucket_count = rs.bucket_count - 1 - vshard.router.internal.route_map[i] = nil + vshard.router.internal.static_router.route_map[i] = nil end; --- ... @@ -999,7 +999,7 @@ vshard.router.sync(100500) -- object method like this: object.method() instead of -- object:method(), an appropriate help-error returns. -- -_, replicaset = next(vshard.router.internal.replicasets) +_, replicaset = next(vshard.router.internal.static_router.replicasets) --- ... error_messages = {} @@ -1069,7 +1069,7 @@ test_run:cmd("setopt delimiter ';'") --- - true ... 
-for bucket, rs in pairs(vshard.router.internal.route_map) do +for bucket, rs in pairs(vshard.router.internal.static_router.route_map) do bucket_to_old_rs[bucket] = rs bucket_cnt = bucket_cnt + 1 end; @@ -1084,7 +1084,7 @@ vshard.router.cfg(cfg); ... for bucket, old_rs in pairs(bucket_to_old_rs) do local old_uuid = old_rs.uuid - local rs = vshard.router.internal.route_map[bucket] + local rs = vshard.router.internal.static_router.route_map[bucket] if not rs or not old_uuid == rs.uuid then error("Bucket lost during reconfigure.") end @@ -1111,7 +1111,7 @@ end; vshard.router.cfg(cfg); --- ... -vshard.router.internal.route_map = {}; +vshard.router.internal.static_router.route_map = {}; --- ... vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY = false; @@ -1119,7 +1119,7 @@ vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY = false; ... -- Do discovery iteration. Upload buckets from the -- first replicaset. -while not next(vshard.router.internal.route_map) do +while not next(vshard.router.internal.static_router.route_map) do vshard.router.discovery_wakeup() fiber.sleep(0.01) end; @@ -1128,12 +1128,12 @@ end; new_replicasets = {}; --- ... -for _, rs in pairs(vshard.router.internal.replicasets) do +for _, rs in pairs(vshard.router.internal.static_router.replicasets) do new_replicasets[rs] = true end; --- ... -_, rs = next(vshard.router.internal.route_map); +_, rs = next(vshard.router.internal.static_router.route_map); --- ... new_replicasets[rs] == true; @@ -1185,6 +1185,17 @@ vshard.router.route(1):callro('echo', {'some_data'}) - null - null ... +-- Multiple routers: check that static router can be used as an +-- object. +static_router = vshard.router.internal.static_router +--- +... +static_router:route(1):callro('echo', {'some_data'}) +--- +- some_data +- null +- null +... _ = test_run:cmd("switch default") --- ... 
diff --git a/test/router/router.test.lua b/test/router/router.test.lua index df2f381..d7588f7 100644 --- a/test/router/router.test.lua +++ b/test/router/router.test.lua @@ -27,8 +27,8 @@ util = require('util') -- gh-24: log all connnect/disconnect events. test_run:grep_log('router_1', 'connected to ') -rs1 = vshard.router.internal.replicasets[replicasets[1]] -rs2 = vshard.router.internal.replicasets[replicasets[2]] +rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]] +rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]] fiber = require('fiber') while not rs1.replica or not rs2.replica do fiber.sleep(0.1) end -- With no zones the nearest server is master. @@ -39,7 +39,7 @@ rs2.replica == rs2.master -- Part of gh-76: on reconfiguration do not recreate connections -- to replicas, that are kept in a new configuration. -- -old_replicasets = vshard.router.internal.replicasets +old_replicasets = vshard.router.internal.static_router.replicasets old_connections = {} connection_count = 0 test_run:cmd("setopt delimiter ';'") @@ -52,10 +52,10 @@ end; test_run:cmd("setopt delimiter ''"); connection_count == 4 vshard.router.cfg(cfg) -new_replicasets = vshard.router.internal.replicasets +new_replicasets = vshard.router.internal.static_router.replicasets old_replicasets ~= new_replicasets -rs1 = vshard.router.internal.replicasets[replicasets[1]] -rs2 = vshard.router.internal.replicasets[replicasets[2]] +rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]] +rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]] while not rs1.replica or not rs2.replica do fiber.sleep(0.1) end vshard.router.discovery_wakeup() -- Check that netbox connections are the same. @@ -91,7 +91,7 @@ vshard.router.bootstrap() -- -- gh-108: negative bucket count on discovery. 
-- -vshard.router.internal.route_map = {} +vshard.router.internal.static_router.route_map = {} rets = {} function do_echo() table.insert(rets, vshard.router.callro(1, 'echo', {1})) end f1 = fiber.create(do_echo) f2 = fiber.create(do_echo) @@ -153,7 +153,7 @@ conn = vshard.router.route(1).master.conn conn.state -- Test missing master. rs_uuid = 'ac522f65-aa94-4134-9f64-51ee384f1a54' -rs = vshard.router.internal.replicasets[rs_uuid] +rs = vshard.router.internal.static_router.replicasets[rs_uuid] master = rs.master rs.master = nil vshard.router.route(1).master @@ -223,7 +223,7 @@ vshard.router.info() -- Remove replica and master connections to trigger alert -- UNREACHABLE_REPLICASET. -rs = vshard.router.internal.replicasets[replicasets[1]] +rs = vshard.router.internal.static_router.replicasets[replicasets[1]] master_conn = rs.master.conn replica_conn = rs.replica.conn rs.master.conn = nil @@ -261,7 +261,7 @@ util.check_error(vshard.router.buckets_info, 123, '456') test_run:cmd("setopt delimiter ';'") function calculate_known_buckets() local known_buckets = 0 - for _, rs in pairs(vshard.router.internal.route_map) do + for _, rs in pairs(vshard.router.internal.static_router.route_map) do known_buckets = known_buckets + 1 end return known_buckets @@ -301,10 +301,10 @@ test_run:switch('router_1') -- test_run:cmd("setopt delimiter ';'") for i = 1, 100 do - local rs = vshard.router.internal.route_map[i] + local rs = vshard.router.internal.static_router.route_map[i] assert(rs) rs.bucket_count = rs.bucket_count - 1 - vshard.router.internal.route_map[i] = nil + vshard.router.internal.static_router.route_map[i] = nil end; test_run:cmd("setopt delimiter ''"); calculate_known_buckets() @@ -367,7 +367,7 @@ vshard.router.sync(100500) -- object method like this: object.method() instead of -- object:method(), an appropriate help-error returns. 
-- -_, replicaset = next(vshard.router.internal.replicasets) +_, replicaset = next(vshard.router.internal.static_router.replicasets) error_messages = {} test_run:cmd("setopt delimiter ';'") @@ -395,7 +395,7 @@ error_messages bucket_to_old_rs = {} bucket_cnt = 0 test_run:cmd("setopt delimiter ';'") -for bucket, rs in pairs(vshard.router.internal.route_map) do +for bucket, rs in pairs(vshard.router.internal.static_router.route_map) do bucket_to_old_rs[bucket] = rs bucket_cnt = bucket_cnt + 1 end; @@ -403,7 +403,7 @@ bucket_cnt; vshard.router.cfg(cfg); for bucket, old_rs in pairs(bucket_to_old_rs) do local old_uuid = old_rs.uuid - local rs = vshard.router.internal.route_map[bucket] + local rs = vshard.router.internal.static_router.route_map[bucket] if not rs or not old_uuid == rs.uuid then error("Bucket lost during reconfigure.") end @@ -423,19 +423,19 @@ while vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY ~= 'waiting' do fiber.sleep(0.02) end; vshard.router.cfg(cfg); -vshard.router.internal.route_map = {}; +vshard.router.internal.static_router.route_map = {}; vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY = false; -- Do discovery iteration. Upload buckets from the -- first replicaset. 
-while not next(vshard.router.internal.route_map) do +while not next(vshard.router.internal.static_router.route_map) do vshard.router.discovery_wakeup() fiber.sleep(0.01) end; new_replicasets = {}; -for _, rs in pairs(vshard.router.internal.replicasets) do +for _, rs in pairs(vshard.router.internal.static_router.replicasets) do new_replicasets[rs] = true end; -_, rs = next(vshard.router.internal.route_map); +_, rs = next(vshard.router.internal.static_router.route_map); new_replicasets[rs] == true; test_run:cmd("setopt delimiter ''"); @@ -453,6 +453,11 @@ vshard.router.internal.errinj.ERRINJ_CFG = false util.has_same_fields(old_internal, vshard.router.internal) vshard.router.route(1):callro('echo', {'some_data'}) +-- Multiple routers: check that static router can be used as an +-- object. +static_router = vshard.router.internal.static_router +static_router:route(1):callro('echo', {'some_data'}) + _ = test_run:cmd("switch default") test_run:drop_cluster(REPLICASET_2) diff --git a/vshard/error.lua b/vshard/error.lua index f79107b..da92b58 100644 --- a/vshard/error.lua +++ b/vshard/error.lua @@ -105,7 +105,12 @@ local error_message_template = { name = 'OBJECT_IS_OUTDATED', msg = 'Object is outdated after module reload/reconfigure. ' .. 'Use new instance.' - } + }, + [21] = { + name = 'ROUTER_ALREADY_EXISTS', + msg = 'Router with name %s already exists', + args = {'name'}, + }, } -- diff --git a/vshard/router/init.lua b/vshard/router/init.lua index 69cd37c..7ab2145 100644 --- a/vshard/router/init.lua +++ b/vshard/router/init.lua @@ -26,14 +26,33 @@ local M = rawget(_G, MODULE_INTERNALS) if not M then M = { ---------------- Common module attributes ---------------- - -- The last passed configuration. - current_cfg = nil, errinj = { ERRINJ_CFG = false, ERRINJ_FAILOVER_CHANGE_CFG = false, ERRINJ_RELOAD = false, ERRINJ_LONG_DISCOVERY = false, }, + -- Dictionary, key is router name, value is a router. 
+ routers = {}, + -- Router object which can be accessed by old api: + -- e.g. vshard.router.call(...) + static_router = nil, + -- This counter is used to restart background fibers with + -- new reloaded code. + module_version = 0, + -- Number of router which require collecting lua garbage. + collect_lua_garbage_cnt = 0, + } +end + +-- +-- Router object attributes. +-- +local ROUTER_TEMPLATE = { + -- Name of router. + name = nil, + -- The last passed configuration. + current_cfg = nil, -- Time to outdate old objects on reload. connection_outdate_delay = nil, -- Bucket map cache. @@ -48,38 +67,60 @@ if not M then total_bucket_count = 0, -- Boolean lua_gc state (create periodic gc task). collect_lua_garbage = nil, - -- This counter is used to restart background fibers with - -- new reloaded code. - module_version = 0, - } -end +} + +local STATIC_ROUTER_NAME = '_static_router' -- Set a bucket to a replicaset. -local function bucket_set(bucket_id, rs_uuid) - local replicaset = M.replicasets[rs_uuid] +local function bucket_set(router, bucket_id, rs_uuid) + local replicaset = router.replicasets[rs_uuid] -- It is technically possible to delete a replicaset at the -- same time when route to the bucket is discovered. if not replicaset then return nil, lerror.vshard(lerror.code.NO_ROUTE_TO_BUCKET, bucket_id) end - local old_replicaset = M.route_map[bucket_id] + local old_replicaset = router.route_map[bucket_id] if old_replicaset ~= replicaset then if old_replicaset then old_replicaset.bucket_count = old_replicaset.bucket_count - 1 end replicaset.bucket_count = replicaset.bucket_count + 1 end - M.route_map[bucket_id] = replicaset + router.route_map[bucket_id] = replicaset return replicaset end -- Remove a bucket from the cache. 
-local function bucket_reset(bucket_id) - local replicaset = M.route_map[bucket_id] +local function bucket_reset(router, bucket_id) + local replicaset = router.route_map[bucket_id] if replicaset then replicaset.bucket_count = replicaset.bucket_count - 1 end - M.route_map[bucket_id] = nil + router.route_map[bucket_id] = nil +end + +-------------------------------------------------------------------------------- +-- Helpers +-------------------------------------------------------------------------------- + +-- +-- Increase/decrease number of routers which require to collect +-- a lua garbage and change state of the `lua_gc` fiber. +-- + +local function lua_gc_cnt_inc() + M.collect_lua_garbage_cnt = M.collect_lua_garbage_cnt + 1 + if M.collect_lua_garbage_cnt == 1 then + lua_gc.set_state(true, consts.COLLECT_LUA_GARBAGE_INTERVAL) + end +end + +local function lua_gc_cnt_dec() + M.collect_lua_garbage_cnt = M.collect_lua_garbage_cnt - 1 + assert(M.collect_lua_garbage_cnt >= 0) + if M.collect_lua_garbage_cnt == 0 then + lua_gc.set_state(false, consts.COLLECT_LUA_GARBAGE_INTERVAL) + end end -------------------------------------------------------------------------------- @@ -87,8 +128,8 @@ end -------------------------------------------------------------------------------- -- Search bucket in whole cluster -local function bucket_discovery(bucket_id) - local replicaset = M.route_map[bucket_id] +local function bucket_discovery(router, bucket_id) + local replicaset = router.route_map[bucket_id] if replicaset ~= nil then return replicaset end @@ -96,11 +137,11 @@ local function bucket_discovery(bucket_id) log.verbose("Discovering bucket %d", bucket_id) local last_err = nil local unreachable_uuid = nil - for uuid, replicaset in pairs(M.replicasets) do + for uuid, replicaset in pairs(router.replicasets) do local _, err = replicaset:callrw('vshard.storage.bucket_stat', {bucket_id}) if err == nil then - return bucket_set(bucket_id, replicaset.uuid) + return bucket_set(router, 
bucket_id, replicaset.uuid) elseif err.code ~= lerror.code.WRONG_BUCKET then last_err = err unreachable_uuid = uuid @@ -129,14 +170,14 @@ local function bucket_discovery(bucket_id) end -- Resolve bucket id to replicaset uuid -local function bucket_resolve(bucket_id) +local function bucket_resolve(router, bucket_id) local replicaset, err - local replicaset = M.route_map[bucket_id] + local replicaset = router.route_map[bucket_id] if replicaset ~= nil then return replicaset end -- Replicaset removed from cluster, perform discovery - replicaset, err = bucket_discovery(bucket_id) + replicaset, err = bucket_discovery(router, bucket_id) if replicaset == nil then return nil, err end @@ -147,14 +188,14 @@ end -- Background fiber to perform discovery. It periodically scans -- replicasets one by one and updates route_map. -- -local function discovery_f() +local function discovery_f(router) local module_version = M.module_version while module_version == M.module_version do - while not next(M.replicasets) do + while not next(router.replicasets) do lfiber.sleep(consts.DISCOVERY_INTERVAL) end - local old_replicasets = M.replicasets - for rs_uuid, replicaset in pairs(M.replicasets) do + local old_replicasets = router.replicasets + for rs_uuid, replicaset in pairs(router.replicasets) do local active_buckets, err = replicaset:callro('vshard.storage.buckets_discovery', {}, {timeout = 2}) @@ -164,7 +205,7 @@ local function discovery_f() end -- Renew replicasets object captured by the for loop -- in case of reconfigure and reload events. 
- if M.replicasets ~= old_replicasets then + if router.replicasets ~= old_replicasets then break end if not active_buckets then @@ -177,11 +218,11 @@ local function discovery_f() end replicaset.bucket_count = #active_buckets for _, bucket_id in pairs(active_buckets) do - local old_rs = M.route_map[bucket_id] + local old_rs = router.route_map[bucket_id] if old_rs and old_rs ~= replicaset then old_rs.bucket_count = old_rs.bucket_count - 1 end - M.route_map[bucket_id] = replicaset + router.route_map[bucket_id] = replicaset end end lfiber.sleep(consts.DISCOVERY_INTERVAL) @@ -192,9 +233,9 @@ end -- -- Immediately wakeup discovery fiber if exists. -- -local function discovery_wakeup() - if M.discovery_fiber then - M.discovery_fiber:wakeup() +local function discovery_wakeup(router) + if router.discovery_fiber then + router.discovery_fiber:wakeup() end end @@ -206,7 +247,7 @@ end -- Function will restart operation after wrong bucket response until timeout -- is reached -- -local function router_call(bucket_id, mode, func, args, opts) +local function router_call(router, bucket_id, mode, func, args, opts) if opts and (type(opts) ~= 'table' or (opts.timeout and type(opts.timeout) ~= 'number')) then error('Usage: call(bucket_id, mode, func, args, opts)') @@ -214,7 +255,7 @@ local function router_call(bucket_id, mode, func, args, opts) local timeout = opts and opts.timeout or consts.CALL_TIMEOUT_MIN local replicaset, err local tend = lfiber.time() + timeout - if bucket_id > M.total_bucket_count or bucket_id <= 0 then + if bucket_id > router.total_bucket_count or bucket_id <= 0 then error('Bucket is unreachable: bucket id is out of range') end local call @@ -224,7 +265,7 @@ local function router_call(bucket_id, mode, func, args, opts) call = 'callrw' end repeat - replicaset, err = bucket_resolve(bucket_id) + replicaset, err = bucket_resolve(router, bucket_id) if replicaset then ::replicaset_is_found:: local storage_call_status, call_status, call_error = @@ -240,9 +281,9 @@ local 
function router_call(bucket_id, mode, func, args, opts) end err = call_status if err.code == lerror.code.WRONG_BUCKET then - bucket_reset(bucket_id) + bucket_reset(router, bucket_id) if err.destination then - replicaset = M.replicasets[err.destination] + replicaset = router.replicasets[err.destination] if not replicaset then log.warn('Replicaset "%s" was not found, but received'.. ' from storage as destination - please '.. @@ -254,13 +295,14 @@ local function router_call(bucket_id, mode, func, args, opts) -- but already is executed on storages. while lfiber.time() <= tend do lfiber.sleep(0.05) - replicaset = M.replicasets[err.destination] + replicaset = router.replicasets[err.destination] if replicaset then goto replicaset_is_found end end else - replicaset = bucket_set(bucket_id, replicaset.uuid) + replicaset = bucket_set(router, bucket_id, + replicaset.uuid) lfiber.yield() -- Protect against infinite cycle in a -- case of broken cluster, when a bucket @@ -277,7 +319,7 @@ local function router_call(bucket_id, mode, func, args, opts) -- is not timeout - these requests are repeated in -- any case on client, if error. assert(mode == 'write') - bucket_reset(bucket_id) + bucket_reset(router, bucket_id) return nil, err elseif err.code == lerror.code.NON_MASTER then -- Same, as above - do not wait and repeat. @@ -303,12 +345,12 @@ end -- -- Wrappers for router_call with preset mode. -- -local function router_callro(bucket_id, ...) - return router_call(bucket_id, 'read', ...) +local function router_callro(router, bucket_id, ...) + return router_call(router, bucket_id, 'read', ...) end -local function router_callrw(bucket_id, ...) - return router_call(bucket_id, 'write', ...) +local function router_callrw(router, bucket_id, ...) + return router_call(router, bucket_id, 'write', ...) end -- @@ -316,27 +358,27 @@ end -- @param bucket_id Bucket identifier. -- @retval Netbox connection. 
-- -local function router_route(bucket_id) +local function router_route(router, bucket_id) if type(bucket_id) ~= 'number' then error('Usage: router.route(bucket_id)') end - return bucket_resolve(bucket_id) + return bucket_resolve(router, bucket_id) end -- -- Return map of all replicasets. -- @retval See self.replicasets map. -- -local function router_routeall() - return M.replicasets +local function router_routeall(router) + return router.replicasets end -------------------------------------------------------------------------------- -- Failover -------------------------------------------------------------------------------- -local function failover_ping_round() - for _, replicaset in pairs(M.replicasets) do +local function failover_ping_round(router) + for _, replicaset in pairs(router.replicasets) do local replica = replicaset.replica if replica ~= nil and replica.conn ~= nil and replica.down_ts == nil then @@ -379,10 +421,10 @@ end -- Collect UUIDs of replicasets, priority of whose replica -- connections must be updated. -- -local function failover_collect_to_update() +local function failover_collect_to_update(router) local ts = lfiber.time() local uuid_to_update = {} - for uuid, rs in pairs(M.replicasets) do + for uuid, rs in pairs(router.replicasets) do if failover_need_down_priority(rs, ts) or failover_need_up_priority(rs, ts) then table.insert(uuid_to_update, uuid) @@ -397,16 +439,16 @@ end -- disconnected replicas. -- @retval true A replica of an replicaset has been changed. 
-- -local function failover_step() - failover_ping_round() - local uuid_to_update = failover_collect_to_update() +local function failover_step(router) + failover_ping_round(router) + local uuid_to_update = failover_collect_to_update(router) if #uuid_to_update == 0 then return false end local curr_ts = lfiber.time() local replica_is_changed = false for _, uuid in pairs(uuid_to_update) do - local rs = M.replicasets[uuid] + local rs = router.replicasets[uuid] if M.errinj.ERRINJ_FAILOVER_CHANGE_CFG then rs = nil M.errinj.ERRINJ_FAILOVER_CHANGE_CFG = false @@ -448,7 +490,7 @@ end -- tries to reconnect to the best replica. When the connection is -- established, it replaces the original replica. -- -local function failover_f() +local function failover_f(router) local module_version = M.module_version local min_timeout = math.min(consts.FAILOVER_UP_TIMEOUT, consts.FAILOVER_DOWN_TIMEOUT) @@ -458,7 +500,7 @@ local function failover_f() local prev_was_ok = false while module_version == M.module_version do ::continue:: - local ok, replica_is_changed = pcall(failover_step) + local ok, replica_is_changed = pcall(failover_step, router) if not ok then log.error('Error during failovering: %s', lerror.make(replica_is_changed)) @@ -485,8 +527,8 @@ end -- Configuration -------------------------------------------------------------------------------- -local function router_cfg(cfg, is_reload) - cfg = lcfg.check(cfg, M.current_cfg) +local function router_cfg(router, cfg, is_reload) + cfg = lcfg.check(cfg, router.current_cfg) local vshard_cfg, box_cfg = lcfg.split(cfg) if not M.replicasets then log.info('Starting router configuration') @@ -511,45 +553,49 @@ local function router_cfg(cfg, is_reload) -- Move connections from an old configuration to a new one. -- It must be done with no yields to prevent usage both of not -- fully moved old replicasets, and not fully built new ones. 
- lreplicaset.rebind_replicasets(new_replicasets, M.replicasets) + lreplicaset.rebind_replicasets(new_replicasets, router.replicasets) -- Now the new replicasets are fully built. Can establish -- connections and yield. for _, replicaset in pairs(new_replicasets) do replicaset:connect_all() end + -- Change state of lua GC. + if vshard_cfg.collect_lua_garbage and not router.collect_lua_garbage then + lua_gc_cnt_inc() + elseif not vshard_cfg.collect_lua_garbage and + router.collect_lua_garbage then + lua_gc_cnt_dec() + end lreplicaset.wait_masters_connect(new_replicasets) - lreplicaset.outdate_replicasets(M.replicasets, + lreplicaset.outdate_replicasets(router.replicasets, vshard_cfg.connection_outdate_delay) - M.connection_outdate_delay = vshard_cfg.connection_outdate_delay - M.total_bucket_count = vshard_cfg.bucket_count - M.collect_lua_garbage = vshard_cfg.collect_lua_garbage - M.current_cfg = cfg - M.replicasets = new_replicasets - local old_route_map = M.route_map - M.route_map = table_new(M.total_bucket_count, 0) + router.connection_outdate_delay = vshard_cfg.connection_outdate_delay + router.total_bucket_count = vshard_cfg.bucket_count + router.collect_lua_garbage = vshard_cfg.collect_lua_garbage + router.current_cfg = cfg + router.replicasets = new_replicasets + local old_route_map = router.route_map + router.route_map = table_new(router.total_bucket_count, 0) for bucket, rs in pairs(old_route_map) do - M.route_map[bucket] = M.replicasets[rs.uuid] + router.route_map[bucket] = router.replicasets[rs.uuid] end - if M.failover_fiber == nil then - M.failover_fiber = util.reloadable_fiber_create('vshard.failover', M, - 'failover_f') + if router.failover_fiber == nil then + router.failover_fiber = util.reloadable_fiber_create( + 'vshard.failover.' .. 
router.name, M, 'failover_f', router) end - if M.discovery_fiber == nil then - M.discovery_fiber = util.reloadable_fiber_create('vshard.discovery', M, - 'discovery_f') + if router.discovery_fiber == nil then + router.discovery_fiber = util.reloadable_fiber_create( + 'vshard.discovery.' .. router.name, M, 'discovery_f', router) end - lua_gc.set_state(M.collect_lua_garbage, consts.COLLECT_LUA_GARBAGE_INTERVAL) - -- Destroy connections, not used in a new configuration. - collectgarbage() end -------------------------------------------------------------------------------- -- Bootstrap -------------------------------------------------------------------------------- -local function cluster_bootstrap() +local function cluster_bootstrap(router) local replicasets = {} - for uuid, replicaset in pairs(M.replicasets) do + for uuid, replicaset in pairs(router.replicasets) do table.insert(replicasets, replicaset) local count, err = replicaset:callrw('vshard.storage.buckets_count', {}) @@ -560,9 +606,10 @@ local function cluster_bootstrap() return nil, lerror.vshard(lerror.code.NON_EMPTY) end end - lreplicaset.calculate_etalon_balance(M.replicasets, M.total_bucket_count) + lreplicaset.calculate_etalon_balance(router.replicasets, + router.total_bucket_count) local bucket_id = 1 - for uuid, replicaset in pairs(M.replicasets) do + for uuid, replicaset in pairs(router.replicasets) do if replicaset.etalon_bucket_count > 0 then local ok, err = replicaset:callrw('vshard.storage.bucket_force_create', @@ -618,7 +665,7 @@ local function replicaset_instance_info(replicaset, name, alerts, errcolor, return info, consts.STATUS.GREEN end -local function router_info() +local function router_info(router) local state = { replicasets = {}, bucket = { @@ -632,7 +679,7 @@ local function router_info() } local bucket_info = state.bucket local known_bucket_count = 0 - for rs_uuid, replicaset in pairs(M.replicasets) do + for rs_uuid, replicaset in pairs(router.replicasets) do -- Replicaset info 
parameters: -- * master instance info; -- * replica instance info; @@ -720,7 +767,7 @@ local function router_info() -- If a bucket is unreachable, then replicaset is -- unreachable too and color already is red. end - bucket_info.unknown = M.total_bucket_count - known_bucket_count + bucket_info.unknown = router.total_bucket_count - known_bucket_count if bucket_info.unknown > 0 then state.status = math.max(state.status, consts.STATUS.YELLOW) table.insert(state.alerts, lerror.alert(lerror.code.UNKNOWN_BUCKETS, @@ -737,13 +784,13 @@ end -- @param limit Maximal bucket count in output. -- @retval Map of type {bucket_id = 'unknown'/replicaset_uuid}. -- -local function router_buckets_info(offset, limit) +local function router_buckets_info(router, offset, limit) if offset ~= nil and type(offset) ~= 'number' or limit ~= nil and type(limit) ~= 'number' then error('Usage: buckets_info(offset, limit)') end offset = offset or 0 - limit = limit or M.total_bucket_count + limit = limit or router.total_bucket_count local ret = {} -- Use one string memory for all unknown buckets. local available_rw = 'available_rw' @@ -752,9 +799,9 @@ local function router_buckets_info(offset, limit) local unreachable = 'unreachable' -- Collect limit. 
local first = math.max(1, offset + 1) - local last = math.min(offset + limit, M.total_bucket_count) + local last = math.min(offset + limit, router.total_bucket_count) for bucket_id = first, last do - local rs = M.route_map[bucket_id] + local rs = router.route_map[bucket_id] if rs then if rs.master and rs.master:is_connected() then ret[bucket_id] = {uuid = rs.uuid, status = available_rw} @@ -774,22 +821,22 @@ end -- Other -------------------------------------------------------------------------------- -local function router_bucket_id(key) +local function router_bucket_id(router, key) if key == nil then error("Usage: vshard.router.bucket_id(key)") end - return lhash.key_hash(key) % M.total_bucket_count + 1 + return lhash.key_hash(key) % router.total_bucket_count + 1 end -local function router_bucket_count() - return M.total_bucket_count +local function router_bucket_count(router) + return router.total_bucket_count end -local function router_sync(timeout) +local function router_sync(router, timeout) if timeout ~= nil and type(timeout) ~= 'number' then error('Usage: vshard.router.sync([timeout: number])') end - for rs_uuid, replicaset in pairs(M.replicasets) do + for rs_uuid, replicaset in pairs(router.replicasets) do local status, err = replicaset:callrw('vshard.storage.sync', {timeout}) if not status then -- Add information about replicaset @@ -803,6 +850,94 @@ if M.errinj.ERRINJ_RELOAD then error('Error injection: reload') end +-------------------------------------------------------------------------------- +-- Managing router instances +-------------------------------------------------------------------------------- + +local function cfg_reconfigure(router, cfg) + return router_cfg(router, cfg, false) +end + +local router_mt = { + __index = { + cfg = cfg_reconfigure; + info = router_info; + buckets_info = router_buckets_info; + call = router_call; + callro = router_callro; + callrw = router_callrw; + route = router_route; + routeall = router_routeall; + bucket_id = 
router_bucket_id; + bucket_count = router_bucket_count; + sync = router_sync; + bootstrap = cluster_bootstrap; + bucket_discovery = bucket_discovery; + discovery_wakeup = discovery_wakeup; + } +} + +-- Table which represents this module. +local module = {} + +-- This metatable bypasses calls to a module to the static_router. +local module_mt = {__index = {}} +for method_name, method in pairs(router_mt.__index) do + module_mt.__index[method_name] = function(...) + return method(M.static_router, ...) + end +end + +local function export_static_router_attributes() + setmetatable(module, module_mt) +end + +-- +-- Create a new instance of router. +-- @param name Name of a new router. +-- @param cfg Configuration for `router_cfg`. +-- @retval Router instance. +-- @retval Nil and error object. +-- +local function router_new(name, cfg) + if type(name) ~= 'string' or type(cfg) ~= 'table' then + error('Wrong argument type. Usage: vshard.router.new(name, cfg).') + end + if M.routers[name] then + return nil, lerror.vshard(lerror.code.ROUTER_ALREADY_EXISTS, name) + end + local router = table.deepcopy(ROUTER_TEMPLATE) + setmetatable(router, router_mt) + router.name = name + M.routers[name] = router + local ok, err = pcall(router_cfg, router, cfg) + if not ok then + M.routers[name] = nil + error(err) + end + return router +end + +-- +-- Wrapper around a `router_new` API, which allow to use old +-- static `vshard.router.cfg()` API. +-- +local function legacy_cfg(cfg) + if M.static_router then + -- Reconfigure. + router_cfg(M.static_router, cfg, false) + else + -- Create new static instance. 
+ local router, err = router_new(STATIC_ROUTER_NAME, cfg) + if router then + M.static_router = router + export_static_router_attributes() + else + return nil, err + end + end +end + -------------------------------------------------------------------------------- -- Module definition -------------------------------------------------------------------------------- @@ -813,28 +948,23 @@ end if not rawget(_G, MODULE_INTERNALS) then rawset(_G, MODULE_INTERNALS, M) else - router_cfg(M.current_cfg, true) + for _, router in pairs(M.routers) do + router_cfg(router, router.current_cfg, true) + setmetatable(router, router_mt) + end + if M.static_router then + export_static_router_attributes() + end M.module_version = M.module_version + 1 end M.discovery_f = discovery_f M.failover_f = failover_f +M.router_mt = router_mt -return { - cfg = function(cfg) return router_cfg(cfg, false) end; - info = router_info; - buckets_info = router_buckets_info; - call = router_call; - callro = router_callro; - callrw = router_callrw; - route = router_route; - routeall = router_routeall; - bucket_id = router_bucket_id; - bucket_count = router_bucket_count; - sync = router_sync; - bootstrap = cluster_bootstrap; - bucket_discovery = bucket_discovery; - discovery_wakeup = discovery_wakeup; - internal = M; - module_version = function() return M.module_version end; -} +module.cfg = legacy_cfg +module.new = router_new +module.internal = M +module.module_version = function() return M.module_version end + +return module diff --git a/vshard/util.lua b/vshard/util.lua index 37abe2b..3afaa61 100644 --- a/vshard/util.lua +++ b/vshard/util.lua @@ -38,11 +38,11 @@ end -- reload of that module. -- See description of parameters in `reloadable_fiber_create`. 
-- -local function reloadable_fiber_main_loop(module, func_name) +local function reloadable_fiber_main_loop(module, func_name, data) log.info('%s has been started', func_name) local func = module[func_name] ::restart_loop:: - local ok, err = pcall(func) + local ok, err = pcall(func, data) -- yield serves two purposes: -- * makes this fiber cancellable -- * prevents 100% cpu consumption @@ -60,7 +60,7 @@ local function reloadable_fiber_main_loop(module, func_name) log.info('module is reloaded, restarting') -- luajit drops this frame if next function is called in -- return statement. - return M.reloadable_fiber_main_loop(module, func_name) + return M.reloadable_fiber_main_loop(module, func_name, data) end -- @@ -74,11 +74,13 @@ end -- @param module Module which can be reloaded. -- @param func_name Name of a function to be executed in the -- module. +-- @param data Data to be passed to the specified function. -- @retval New fiber. -- -local function reloadable_fiber_create(fiber_name, module, func_name) +local function reloadable_fiber_create(fiber_name, module, func_name, data) assert(type(fiber_name) == 'string') - local xfiber = fiber.create(reloadable_fiber_main_loop, module, func_name) + local xfiber = fiber.create(reloadable_fiber_main_loop, module, func_name, + data) xfiber:name(fiber_name) return xfiber end ^ permalink raw reply [flat|nested] 23+ messages in thread
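The core move in the patch above is turning module-level router functions into methods of a router object, while `export_static_router_attributes` keeps the legacy `vshard.router.*` API alive by wrapping every method so it is forwarded to `M.static_router`. A minimal Python sketch of that forwarding facade (class and attribute names here are illustrative, not vshard's API):

```python
class Router:
    """Stand-in for a router object created by vshard.router.new()."""
    def __init__(self, name, bucket_count):
        self.name = name
        self.total_bucket_count = bucket_count

    def bucket_count(self):
        return self.total_bucket_count


class ModuleFacade:
    """Forwards old static-API calls to the single static router,
    mirroring the module_mt.__index trick from the patch."""
    def __init__(self):
        self.static_router = None

    def __getattr__(self, method_name):
        # Resolve the method on the static router at call time, so
        # replacing static_router on reconfigure keeps the facade valid.
        if self.static_router is None:
            raise RuntimeError('router is not configured')
        return getattr(self.static_router, method_name)


module = ModuleFacade()
module.static_router = Router('_static_router', 3000)
print(module.bucket_count())  # -> 3000, old API forwarded to the instance
```

The same object also remains directly usable, which is what the added test (`static_router:route(1):callro(...)`) exercises on the Lua side.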
* [tarantool-patches] Re: [PATCH 3/3] Introduce multiple routers feature
  2018-08-08 14:04 ` Alex Khatskevich
@ 2018-08-08 15:37 ` Vladislav Shpilevoy
  0 siblings, 0 replies; 23+ messages in thread
From: Vladislav Shpilevoy @ 2018-08-08 15:37 UTC (permalink / raw)
To: Alex Khatskevich, tarantool-patches

Thanks for the fixes! Pushed into the master.
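Patch 2/3 moves the periodic Lua GC into a dedicated module, and the router patch gates it with `collect_lua_garbage_cnt`: the shared GC fiber starts when the first router that wants GC appears and stops when the last such router is reconfigured without it. The counting pattern can be sketched in Python (the `set_state` stand-in is illustrative):

```python
class SharedGC:
    """Reference-counted switch for one shared background task,
    mirroring lua_gc_cnt_inc/lua_gc_cnt_dec from the series."""
    def __init__(self):
        self.cnt = 0          # number of routers that want GC
        self.running = False  # state of the single shared task

    def set_state(self, active):
        self.running = active

    def inc(self):
        self.cnt += 1
        if self.cnt == 1:     # 0 -> 1: first user, start the task
            self.set_state(True)

    def dec(self):
        self.cnt -= 1
        assert self.cnt >= 0  # every decrement must match an increment
        if self.cnt == 0:     # 1 -> 0: last user, stop the task
            self.set_state(False)


gc = SharedGC()
gc.inc(); gc.inc()   # two routers enable collect_lua_garbage
gc.dec()             # one disables it: the shared task keeps running
print(gc.running)    # -> True
gc.dec()             # last one gone: the task stops
print(gc.running)    # -> False
```

The counter lets N independent routers share one fiber without any of them owning it.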
* [tarantool-patches] [PATCH] Check self arg passed for router objects
  2018-07-31 16:25 [tarantool-patches] [PATCH 0/3] multiple routers AKhatskevich
                   ` (2 preceding siblings ...)
  2018-07-31 16:25 ` [tarantool-patches] [PATCH 3/3] Introduce multiple routers feature AKhatskevich
@ 2018-08-01 14:30 ` AKhatskevich
  2018-08-03 20:07 ` [tarantool-patches] [PATCH] Refactor config templates AKhatskevich
  4 siblings, 0 replies; 23+ messages in thread
From: AKhatskevich @ 2018-08-01 14:30 UTC (permalink / raw)
To: v.shpilevoy, tarantool-patches

Raise an exception in case someone calls router like
`router.info()` instead of `router:info()`.
---
 test/multiple_routers/multiple_routers.result   | 5 +++++
 test/multiple_routers/multiple_routers.test.lua | 3 +++
 vshard/router/init.lua                          | 9 +++++++++
 3 files changed, 17 insertions(+)

diff --git a/test/multiple_routers/multiple_routers.result b/test/multiple_routers/multiple_routers.result
index 33f4034..389bf9a 100644
--- a/test/multiple_routers/multiple_routers.result
+++ b/test/multiple_routers/multiple_routers.result
@@ -201,6 +201,11 @@ routers[5]:call(1, 'read', 'do_select', {2})
 ---
 - [[2, 2]]
 ...
+-- Self checker.
+util.check_error(router_2.info)
+---
+- Use router:info(...) instead of router.info(...)
+...
 _ = test_run:cmd("switch default")
 ---
 ...
diff --git a/test/multiple_routers/multiple_routers.test.lua b/test/multiple_routers/multiple_routers.test.lua
index 6d470e1..2f159c7 100644
--- a/test/multiple_routers/multiple_routers.test.lua
+++ b/test/multiple_routers/multiple_routers.test.lua
@@ -76,6 +76,9 @@ vshard.router.call(1, 'read', 'do_select', {1})
 router_2:call(1, 'read', 'do_select', {2})
 routers[5]:call(1, 'read', 'do_select', {2})
 
+-- Self checker.
+util.check_error(router_2.info)
+
 _ = test_run:cmd("switch default")
 test_run:cmd("stop server router_1")
 test_run:cmd("cleanup server router_1")
diff --git a/vshard/router/init.lua b/vshard/router/init.lua
index 128628b..e0a39b2 100644
--- a/vshard/router/init.lua
+++ b/vshard/router/init.lua
@@ -860,6 +860,15 @@ local router_mt = {
     }
 }
 
+--
+-- Wrap self methods with a sanity checker.
+--
+local mt_index = {}
+for name, func in pairs(router_mt.__index) do
+    mt_index[name] = util.generate_self_checker("router", name, router_mt, func)
+end
+router_mt.__index = mt_index
+
 -- Table which represents this module.
 local module = {}
-- 
2.14.1
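The patch above wraps every method in `router_mt.__index` with `util.generate_self_checker`, so a `.` call without self fails with a readable message instead of misbehaving. A hypothetical Python analogue of that wrapper (Python's bound methods make the mistake differ in shape, but the guard is the same idea):

```python
def generate_self_checker(obj_name, func_name, cls, func):
    """Wrap func so a call whose first argument is not a cls instance
    fails loudly, as on `router.info()` instead of `router:info()`."""
    def wrapper(*args):
        if not args or not isinstance(args[0], cls):
            raise TypeError('Use %s:%s(...) instead of %s.%s(...)'
                            % (obj_name, func_name, obj_name, func_name))
        return func(*args)
    return wrapper


class Router:
    def __init__(self, name):
        self.name = name


def info(self):
    return {'name': self.name}


checked_info = generate_self_checker('router', 'info', Router, info)
r = Router('router_2')
print(checked_info(r))  # -> {'name': 'router_2'}
try:
    checked_info()      # a "dot call" that forgot self
except TypeError as e:
    print(e)            # -> Use router:info(...) instead of router.info(...)
```

Wrapping at metatable-construction time means the check costs one `isinstance`-style test per call and needs no change in the methods themselves.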
* [tarantool-patches] [PATCH] Refactor config templates
  2018-07-31 16:25 [tarantool-patches] [PATCH 0/3] multiple routers AKhatskevich
                   ` (3 preceding siblings ...)
  2018-08-01 14:30 ` [tarantool-patches] [PATCH] Check self arg passed for router objects AKhatskevich
@ 2018-08-03 20:07 ` AKhatskevich
  2018-08-06 15:49 ` [tarantool-patches] " Vladislav Shpilevoy
  4 siblings, 1 reply; 23+ messages in thread
From: AKhatskevich @ 2018-08-03 20:07 UTC (permalink / raw)
To: v.shpilevoy, tarantool-patches

Config templates are converted to dictionary format.

Before: format = {{field_name, description}}
After:  format = {field_name = description}

This change is made for fast template lookups, which will be used
in further commits.
---
This is an extra commit, created especially for the 'Update only
vshard part of a cfg on reload' patch.

 vshard/cfg.lua | 64 ++++++++++++++++++++++++++++------------------------------
 1 file changed, 31 insertions(+), 33 deletions(-)

diff --git a/vshard/cfg.lua b/vshard/cfg.lua
index bba12cc..7c9ab77 100644
--- a/vshard/cfg.lua
+++ b/vshard/cfg.lua
@@ -43,9 +43,7 @@ local type_validate = {
 }
 
 local function validate_config(config, template, check_arg)
-    for _, key_template in pairs(template) do
-        local key = key_template[1]
-        local template_value = key_template[2]
+    for key, template_value in pairs(template) do
         local value = config[key]
         if not value then
             if not template_value.is_optional then
@@ -83,13 +81,13 @@ local function validate_config(config, template, check_arg)
 end
 
 local replica_template = {
-    {'uri', {type = 'non-empty string', name = 'URI', check = check_uri}},
-    {'name', {type = 'string', name = "Name", is_optional = true}},
-    {'zone', {type = {'string', 'number'}, name = "Zone", is_optional = true}},
-    {'master', {
+    uri = {type = 'non-empty string', name = 'URI', check = check_uri},
+    name = {type = 'string', name = "Name", is_optional = true},
+    zone = {type = {'string', 'number'}, name = "Zone", is_optional = true},
+    master = {
         type = 'boolean', name = "Master", is_optional = true,
         default = false, check = check_master
-    }},
+    },
 }
 
 local function check_replicas(replicas)
@@ -100,12 +98,12 @@ end
 
 local replicaset_template = {
-    {'replicas', {type = 'table', name = 'Replicas', check = check_replicas}},
-    {'weight', {
+    replicas = {type = 'table', name = 'Replicas', check = check_replicas},
+    weight = {
         type = 'non-negative number', name = 'Weight', is_optional = true,
         default = 1,
-    }},
-    {'lock', {type = 'boolean', name = 'Lock', is_optional = true}},
+    },
+    lock = {type = 'boolean', name = 'Lock', is_optional = true},
 }
 
 --
@@ -177,50 +175,50 @@ local function check_sharding(sharding)
 end
 
 local cfg_template = {
-    {'sharding', {type = 'table', name = 'Sharding', check = check_sharding}},
-    {'weights', {
+    sharding = {type = 'table', name = 'Sharding', check = check_sharding},
+    weights = {
         type = 'table', name = 'Weight matrix', is_optional = true,
         check = cfg_check_weights
-    }},
-    {'shard_index', {
+    },
+    shard_index = {
         type = {'non-empty string', 'non-negative integer'},
         name = 'Shard index', is_optional = true, default = 'bucket_id',
-    }},
-    {'zone', {
+    },
+    zone = {
         type = {'string', 'number'}, name = 'Zone identifier',
         is_optional = true
-    }},
-    {'bucket_count', {
+    },
+    bucket_count = {
         type = 'positive integer', name = 'Bucket count', is_optional = true,
         default = consts.DEFAULT_BUCKET_COUNT
-    }},
-    {'rebalancer_disbalance_threshold', {
+    },
+    rebalancer_disbalance_threshold = {
         type = 'non-negative number', name = 'Rebalancer disbalance threshold',
         is_optional = true,
         default = consts.DEFAULT_REBALANCER_DISBALANCE_THRESHOLD
-    }},
-    {'rebalancer_max_receiving', {
+    },
+    rebalancer_max_receiving = {
         type = 'positive integer',
         name = 'Rebalancer max receiving bucket count', is_optional = true,
         default = consts.DEFAULT_REBALANCER_MAX_RECEIVING
-    }},
-    {'collect_bucket_garbage_interval', {
+    },
+    collect_bucket_garbage_interval = {
         type = 'positive number', name = 'Garbage bucket collect interval',
         is_optional = true,
         default = consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
-    }},
-    {'collect_lua_garbage', {
+    },
+    collect_lua_garbage = {
         type = 'boolean', name = 'Garbage Lua collect necessity',
         is_optional = true, default = false
-    }},
-    {'sync_timeout', {
+    },
+    sync_timeout = {
         type = 'non-negative number', name = 'Sync timeout', is_optional = true,
         default = consts.DEFAULT_SYNC_TIMEOUT
-    }},
-    {'connection_outdate_delay', {
+    },
+    connection_outdate_delay = {
         type = 'non-negative number', name = 'Object outdate timeout',
         is_optional = true
-    }},
+    },
 }
 
 --
-- 
2.14.1
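The dictionary template format introduced here makes `template[key]` an O(1) lookup instead of a linear scan over `{field_name, description}` pairs. The validation loop after the change can be sketched in Python (types, keys, and messages are simplified stand-ins for vshard's checkers):

```python
cfg_template = {
    # key -> description, as in the patched Lua format
    'sharding':     {'type': dict, 'is_optional': False},
    'bucket_count': {'type': int,  'is_optional': True, 'default': 3000},
}


def validate_config(config, template):
    validated = {}
    for key, tv in template.items():  # iterate key -> description pairs
        value = config.get(key)
        if value is None:
            if not tv.get('is_optional'):
                raise ValueError('%s must be specified' % key)
            value = tv.get('default')  # fall back to the template default
        elif not isinstance(value, tv['type']):
            raise TypeError('%s must be of type %s'
                            % (key, tv['type'].__name__))
        validated[key] = value
    return validated


print(validate_config({'sharding': {}}, cfg_template))
# -> {'sharding': {}, 'bucket_count': 3000}
```

With the dict format, a later "update only the vshard part of a cfg" step can cheaply test `cfg_template[key] ~= nil` to classify each option.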
* [tarantool-patches] Re: [PATCH] Refactor config templates
  2018-08-03 20:07 ` [tarantool-patches] [PATCH] Refactor config templates AKhatskevich
@ 2018-08-06 15:49 ` Vladislav Shpilevoy
  0 siblings, 0 replies; 23+ messages in thread
From: Vladislav Shpilevoy @ 2018-08-06 15:49 UTC (permalink / raw)
To: tarantool-patches, AKhatskevich

Thanks for the patch! LGTM and pushed.