From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alex Khatskevich
Subject: [tarantool-patches] Re: [PATCH 3/3] Introduce multiple routers feature
References: <30ab88fc-fba0-11d8-254c-385e59caead7@tarantool.org>
Message-ID: <9a0a958a-5f66-2aa5-83de-e2d6f55cbd71@tarantool.org>
Date: Fri, 3 Aug 2018 23:05:11 +0300
In-Reply-To: <30ab88fc-fba0-11d8-254c-385e59caead7@tarantool.org>
Reply-To: tarantool-patches@freelists.org
List-Id: tarantool-patches
To: Vladislav Shpilevoy, tarantool-patches@freelists.org

On 01.08.2018 21:43, Vladislav Shpilevoy wrote:
> Thanks for the patch! See 10 comments below.
>
> On 31/07/2018 19:25, AKhatskevich wrote:
>> Key points:
>> * Old `vshard.router.some_method()` api is preserved.
>> * Add `vshard.router.new(name, cfg)` method which returns a new router.
>> * Each router has its own:
>>    1. name
>>    2. background fibers
>>    3. attributes (route_map, replicasets, outdate_delay...)
>> * Module reload reloads all configured routers.
>> * `cfg` reconfigures a single router.
>> * All routers share the same box configuration. The last passed config
>>    overrides the global config.
>> * Multiple router instances can be connected to the same cluster.
>> * By now, a router cannot be destroyed.
>>
>> Extra changes:
>> * Add `data` parameter to `reloadable_fiber_create` function.
>>
>> Closes #130
>> ---
>> diff --git a/test/multiple_routers/multiple_routers.result
>> b/test/multiple_routers/multiple_routers.result
>> new file mode 100644
>> index 0000000..33f4034
>> --- /dev/null
>> +++ b/test/multiple_routers/multiple_routers.result
>> @@ -0,0 +1,226 @@
>> +-- Reconfigure one of routers do not affect the others.
>> +routers[3]:cfg(configs.cfg_1)
>
> 1. You did not change configs.cfg_1 so it is not reconfig
> actually. Please, change something to check that the
> parameter affects one router and does not affect others.

routers[3] was configured with configs.cfg_2 before, so its config
was changed.

>
> 2. Please, add a test on an ability to get the static router
> into a variable and use it like others. It should be possible
> to hide distinctions between static and other routers.
>
> Like this:
>
>     r1 = vshard.router.static
>     r2 = vshard.router.new(...)
>     do_something_with_router(r1)
>     do_something_with_router(r2)
>
> Here do_something_with_router() is unaware of whether the
> router is static or not.
>

A few calls have been added.

>> diff --git a/vshard/router/init.lua b/vshard/router/init.lua
>> index 3e127cb..7569baf 100644
>> --- a/vshard/router/init.lua
>> +++ b/vshard/router/init.lua
>> @@ -257,13 +272,13 @@ local function router_call(bucket_id, mode,
>> func, args, opts)
>>                           -- but already is executed on storages.
>>                           while lfiber.time() <= tend do
>>                               lfiber.sleep(0.05)
>> -                            replicaset = M.replicasets[err.destination]
>> +                            replicaset =
>> router.replicasets[err.destination]
>>                               if replicaset then
>>                                   goto replicaset_is_found
>>                               end
>>                           end
>>                       else
>> -                        replicaset = bucket_set(bucket_id,
>> replicaset.uuid)
>> +                        replicaset = bucket_set(router, bucket_id,
>> replicaset.uuid)
>
> 3. Out of 80 symbols.

Fixed.

>
>>                           lfiber.yield()
>>                           -- Protect against infinite cycle in a
>>                           -- case of broken cluster, when a bucket
>> @@ -488,9 +503,14 @@ end
>>   -- Configuration
>> --------------------------------------------------------------------------------
>>
>> -local function router_cfg(cfg)
>> -    local vshard_cfg, box_cfg = lcfg.check(cfg, M.current_cfg)
>> -    if not M.replicasets then
>> +-- Types of configuration.
>> +CFG_NEW = 'new'
>> +CFG_RELOAD = 'reload'
>> +CFG_RECONFIGURE = 'reconfigure'
>
> 4. Last two values are never used in router_cfg(). The first
> is used for logging only and can be checked as it was before
> with no explicit passing.

I have left it as it is. Now, each of those is passed at least once.

>> +
>> +local function router_cfg(router, cfg, cfg_type)
>> +    local vshard_cfg, box_cfg = lcfg.check(cfg, router.current_cfg)
>> +    if cfg_type == CFG_NEW then
>>           log.info('Starting router configuration')
>>       else
>>           log.info('Starting router reconfiguration')
>> @@ -512,44 +532,53 @@ local function router_cfg(cfg)
>> +
>> +local function updage_lua_gc_state()
>
> 5. This function is not needed actually.
>
> On router_new() the only thing that can change is start of
> the gc fiber if the new router has the flag and the gc is
> not started now. It can be checked by a simple 'if' with
> no full-scan of all routers.
>
> On reload it is not possible to change configuration, so
> the gc state can not be changed and does not need an update.
> Even if it could be changed, you already iterate over routers
> on reload to call router_cfg and can collect their flags
> alongside.
>
> The next point is that it is not possible now to manage the
> gc via simple :cfg() call. You do nothing with gc when
> router_cfg is called directly. And that produces a question -
> why do your tests pass if so?
>
> The possible solution - keep a counter of set lua gc flags
> over all routers in M. On each cfg you update the counter
> if the value is changed. If it was 0 and becomes > 0, then
> you start gc. If it was > 0 and becomes 0, then you stop gc.
> No routers iteration at all.

Implemented by introducing a counter (`collect_lua_garbage_cnt`).
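
For reference, a rough sketch of the counter idea. This is not the patch
code itself: `set_gc_state` is a hypothetical stub standing in for
`lua_gc.set_state`, `update_lua_gc_counter` is an illustrative name, and
the table `M` here models only the counter.

```lua
-- Stand-in for the module state `M`; only the counter is modelled.
local M = {collect_lua_garbage_cnt = 0}

-- Hypothetical stub instead of the real lua_gc.set_state().
local gc_is_active = false
local function set_gc_state(active)
    gc_is_active = active
end

-- Assumed to be called on every router cfg with the router's
-- collect_lua_garbage flag before and after the (re)configuration.
local function update_lua_gc_counter(old_flag, new_flag)
    -- Normalize nil to false so that flags can be compared.
    old_flag, new_flag = not not old_flag, not not new_flag
    if old_flag == new_flag then
        return
    end
    local delta = new_flag and 1 or -1
    M.collect_lua_garbage_cnt = M.collect_lua_garbage_cnt + delta
    -- GC stays active while at least one router asks for it.
    set_gc_state(M.collect_lua_garbage_cnt > 0)
end

-- Two routers enable the flag, then drop it one by one.
update_lua_gc_counter(nil, true)
update_lua_gc_counter(nil, true)
update_lua_gc_counter(true, nil)
print(M.collect_lua_garbage_cnt, gc_is_active)
update_lua_gc_counter(true, nil)
print(M.collect_lua_garbage_cnt, gc_is_active)
```

With this scheme the gc is stopped only when the last router drops the
flag, and no iteration over the routers is needed.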
>
>> +    local lua_gc = false
>> +    for _, xrouter in pairs(M.routers) do
>> +        if xrouter.collect_lua_garbage then
>> +            lua_gc = true
>> +        end
>> +    end
>> +    lua_gc.set_state(lua_gc, consts.COLLECT_LUA_GARBAGE_INTERVAL)
>>   end
>>
>> @@ -803,6 +833,93 @@ if M.errinj.ERRINJ_RELOAD then
>>       error('Error injection: reload')
>>   end
>> +--------------------------------------------------------------------------------
>> +-- Managing router instances
>> +--------------------------------------------------------------------------------
>>
>> +
>> +local function cfg_reconfigure(router, cfg)
>> +    return router_cfg(router, cfg, CFG_RECONFIGURE)
>> +end
>> +
>> +local router_mt = {
>> +    __index = {
>> +        cfg = cfg_reconfigure;
>> +        info = router_info;
>> +        buckets_info = router_buckets_info;
>> +        call = router_call;
>> +        callro = router_callro;
>> +        callrw = router_callrw;
>> +        route = router_route;
>> +        routeall = router_routeall;
>> +        bucket_id = router_bucket_id;
>> +        bucket_count = router_bucket_count;
>> +        sync = router_sync;
>> +        bootstrap = cluster_bootstrap;
>> +        bucket_discovery = bucket_discovery;
>> +        discovery_wakeup = discovery_wakeup;
>> +    }
>> +}
>> +
>> +-- Table which represents this module.
>> +local module = {}
>> +
>> +local function export_static_router_attributes()
>> +    -- This metatable bypasses calls to a module to the static_router.
>> +    local module_mt = {__index = {}}
>> +    for method_name, method in pairs(router_mt.__index) do
>> +        module_mt.__index[method_name] = function(...)
>> +            if M.static_router then
>> +                return method(M.static_router, ...)
>> +            else
>> +                error('Static router is not configured')
>
> 6. This should not be all-time check. You should
> initialize the static router metatable with only errors.
> On the first cfg you reset the metatable to always use
> regular methods. But anyway this code is unreachable. See
> below in the comment 10 why it is so.

Yes. Fixed.

>
>> +            end
>> +        end
>> +    end
>> +    setmetatable(module, module_mt)
>> +    -- Make static_router attributes accessible from
>> +    -- vshard.router.internal.
>> +    local M_static_router_attributes = {
>> +        name = true,
>> +        replicasets = true,
>> +        route_map = true,
>> +        total_bucket_count = true,
>> +    }
>
> 7. I saw in the tests that you are using
> vshard.router.internal.static_router
> instead. Please, remove M_static_router_attributes then.

Deleted. Tests are fixed.

>
>> +    setmetatable(M, {
>> +        __index = function(M, key)
>> +            return M.static_router[key]
>> +        end
>> +    })
>> +end
>> +
>> +local function router_new(name, cfg)
>> +    assert(type(name) == 'string' and type(cfg) == 'table',
>> +           'Wrong argument type. Usage: vshard.router.new(name, cfg).')
>
> 8. As I said before, do not use assertions for usage checks in public
> API. Use 'if wrong_usage then error(...) end'.

Fixed.

>
>> +    if M.routers[name] then
>> +        return nil, string.format('Router with name %s already
>> exists', name)
>> +    end
>> +    local router = table.deepcopy(ROUTER_TEMPLATE)
>> +    setmetatable(router, router_mt)
>> +    router.name = name
>> +    M.routers[name] = router
>> +    if name == STATIC_ROUTER_NAME then
>> +        M.static_router = router
>> +        export_static_router_attributes()
>> +    end
>
> 9. This check can be removed if you move
> export_static_router_attributes call into legacy_cfg.

But due to this `if`, the static router can be configured by
`vshard.router.new(static_router_name)`.

>
> 10. Looks like all your struggles in
> export_static_router_attributes() about error on non-configured
> router makes no sense since until cfg is called, vshard.router
> has no any methods except cfg and new.
>
>> +    router_cfg(router, cfg, CFG_NEW)
>> +    updage_lua_gc_state()
>> +    return router
>> +end
>> +

Fixed.

full diff

commit f3ffb6a6a3632277f05ee4ea7d095a19dd85a42f
Author: AKhatskevich
Date:   Thu Jul 26 16:17:25 2018 +0300

    Introduce multiple routers feature

    Key points:
    * Old `vshard.router.some_method()` api is preserved.
    * Add `vshard.router.new(name, cfg)` method which returns a new router.
    * Each router has its own:
      1. name
      2. background fibers
      3. attributes (route_map, replicasets, outdate_delay...)
    * Module reload reloads all configured routers.
    * `cfg` reconfigures a single router.
    * All routers share the same box configuration. The last passed config
      overrides the global box config.
    * Multiple router instances can be connected to the same cluster.
    * By now, a router cannot be destroyed.

    Extra changes:
    * Add `data` parameter to `reloadable_fiber_create` function.

    Closes #130

diff --git a/test/failover/failover.result b/test/failover/failover.result
index 73a4250..50410ad 100644
--- a/test/failover/failover.result
+++ b/test/failover/failover.result
@@ -174,7 +174,7 @@ test_run:switch('router_1')
 ---
 - true
 ...
-rs1 = vshard.router.internal.replicasets[rs_uuid[1]]
+rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]]
 ---
 ...
 while not rs1.replica_up_ts do fiber.sleep(0.1) end
diff --git a/test/failover/failover.test.lua b/test/failover/failover.test.lua
index 6e06314..44c8b6d 100644
--- a/test/failover/failover.test.lua
+++ b/test/failover/failover.test.lua
@@ -74,7 +74,7 @@ echo_count
 -- Ensure that replica_up_ts is updated periodically.
 test_run:switch('router_1')
-rs1 = vshard.router.internal.replicasets[rs_uuid[1]]
+rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]]
 while not rs1.replica_up_ts do fiber.sleep(0.1) end
 old_up_ts = rs1.replica_up_ts
 while rs1.replica_up_ts == old_up_ts do fiber.sleep(0.1) end
diff --git a/test/failover/failover_errinj.result b/test/failover/failover_errinj.result
index 3b6d986..484a1e3 100644
--- a/test/failover/failover_errinj.result
+++ b/test/failover/failover_errinj.result
@@ -49,7 +49,7 @@ vshard.router.cfg(cfg)
 -- Check that already run failover step is restarted on
 -- configuration change (if some replicasets are removed from
 -- config).
-rs1 = vshard.router.internal.replicasets[rs_uuid[1]]
+rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]]
 ---
 ...
 while not rs1.replica or not rs1.replica.conn:is_connected() do fiber.sleep(0.1) end
diff --git a/test/failover/failover_errinj.test.lua b/test/failover/failover_errinj.test.lua
index b4d2d35..14228de 100644
--- a/test/failover/failover_errinj.test.lua
+++ b/test/failover/failover_errinj.test.lua
@@ -20,7 +20,7 @@ vshard.router.cfg(cfg)
 -- Check that already run failover step is restarted on
 -- configuration change (if some replicasets are removed from
 -- config).
-rs1 = vshard.router.internal.replicasets[rs_uuid[1]]
+rs1 = vshard.router.internal.static_router.replicasets[rs_uuid[1]]
 while not rs1.replica or not rs1.replica.conn:is_connected() do fiber.sleep(0.1) end
 vshard.router.internal.errinj.ERRINJ_FAILOVER_CHANGE_CFG = true
 wait_state('Configuration has changed, restart ')
diff --git a/test/failover/router_1.lua b/test/failover/router_1.lua
index d71209b..664a6c6 100644
--- a/test/failover/router_1.lua
+++ b/test/failover/router_1.lua
@@ -42,7 +42,7 @@ end
 function priority_order()
     local ret = {}
     for _, uuid in pairs(rs_uuid) do
-        local rs = vshard.router.internal.replicasets[uuid]
+        local rs = vshard.router.internal.static_router.replicasets[uuid]
         local sorted = {}
         for _, replica in pairs(rs.priority_list) do
             local z
diff --git a/test/misc/reconfigure.result b/test/misc/reconfigure.result
index c7960b3..311f749 100644
--- a/test/misc/reconfigure.result
+++ b/test/misc/reconfigure.result
@@ -250,7 +250,7 @@ test_run:switch('router_1')
 -- Ensure that in a case of error router internals are not
 -- changed.
 --
-not vshard.router.internal.collect_lua_garbage
+not vshard.router.internal.static_router.collect_lua_garbage
 ---
 - true
 ...
@@ -264,7 +264,7 @@ vshard.router.cfg(cfg)
 ---
 - error: 'Incorrect value for option ''invalid_option'': unexpected option'
 ...
-not vshard.router.internal.collect_lua_garbage
+not vshard.router.internal.static_router.collect_lua_garbage
 ---
 - true
 ...
diff --git a/test/misc/reconfigure.test.lua b/test/misc/reconfigure.test.lua
index 25dc2ca..298b9b0 100644
--- a/test/misc/reconfigure.test.lua
+++ b/test/misc/reconfigure.test.lua
@@ -99,11 +99,11 @@ test_run:switch('router_1')
 -- Ensure that in a case of error router internals are not
 -- changed.
 --
-not vshard.router.internal.collect_lua_garbage
+not vshard.router.internal.static_router.collect_lua_garbage
 cfg.collect_lua_garbage = true
 cfg.invalid_option = 'kek'
 vshard.router.cfg(cfg)
-not vshard.router.internal.collect_lua_garbage
+not vshard.router.internal.static_router.collect_lua_garbage
 cfg.invalid_option = nil
 cfg.collect_lua_garbage = nil
 vshard.router.cfg(cfg)
diff --git a/test/multiple_routers/configs.lua b/test/multiple_routers/configs.lua
new file mode 100644
index 0000000..a6ce33c
--- /dev/null
+++ b/test/multiple_routers/configs.lua
@@ -0,0 +1,81 @@
+names = {
+    storage_1_1_a = '32a2d4b8-f146-44ed-9d51-2436507efdf8',
+    storage_1_1_b = 'c1c849b1-641d-40b8-9283-bcfe73d46270',
+    storage_1_2_a = '04e677ed-c7ba-47e0-a67f-b5100cfa86af',
+    storage_1_2_b = 'c7a979ee-9263-4a38-84a5-2fb6a0a32684',
+    storage_2_1_a = '88dc03f0-23fb-4f05-b462-e29186542864',
+    storage_2_1_b = '4230b711-f5c4-4131-bf98-88cd43a16901',
+    storage_2_2_a = '6b1eefbc-1e2e-410e-84ff-44c572ea9916',
+    storage_2_2_b = 'be74419a-1e56-4ba4-97e9-6b18710f63c5',
+}
+
+rs_1_1 = 'dd208fb8-8b90-49bc-8393-6b3a99da7c52'
+rs_1_2 = 'af9cfe88-2091-4613-a877-a623776c5c0e'
+rs_2_1 = '9ca8ee15-ae18-4f31-9385-4859f89ce73f'
+rs_2_2 = '007f5f58-b654-4125-8441-a71866fb62b5'
+
+local cfg_1 = {}
+cfg_1.sharding = {
+    [rs_1_1] = {
+        replicas = {
+            [names.storage_1_1_a] = {
+                uri = 'storage:storage@127.0.0.1:3301',
+                name = 'storage_1_1_a',
+                master = true,
+            },
+            [names.storage_1_1_b] = {
+                uri = 'storage:storage@127.0.0.1:3302',
+                name = 'storage_1_1_b',
+            },
+        }
+    },
+    [rs_1_2] = {
+        replicas = {
+            [names.storage_1_2_a] = {
+                uri = 'storage:storage@127.0.0.1:3303',
+                name = 'storage_1_2_a',
+                master = true,
+            },
+            [names.storage_1_2_b] = {
+                uri = 'storage:storage@127.0.0.1:3304',
+                name = 'storage_1_2_b',
+            },
+        }
+    },
+}
+
+
+local cfg_2 = {}
+cfg_2.sharding = {
+    [rs_2_1] = {
+        replicas = {
+            [names.storage_2_1_a] = {
+                uri = 'storage:storage@127.0.0.1:3305',
+                name = 'storage_2_1_a',
+                master = true,
+            },
+            [names.storage_2_1_b] = {
+                uri = 'storage:storage@127.0.0.1:3306',
+                name = 'storage_2_1_b',
+            },
+        }
+    },
+    [rs_2_2] = {
+        replicas = {
+            [names.storage_2_2_a] = {
+                uri = 'storage:storage@127.0.0.1:3307',
+                name = 'storage_2_2_a',
+                master = true,
+            },
+            [names.storage_2_2_b] = {
+                uri = 'storage:storage@127.0.0.1:3308',
+                name = 'storage_2_2_b',
+            },
+        }
+    },
+}
+
+return {
+    cfg_1 = cfg_1,
+    cfg_2 = cfg_2,
+}
diff --git a/test/multiple_routers/multiple_routers.result b/test/multiple_routers/multiple_routers.result
new file mode 100644
index 0000000..1e309a7
--- /dev/null
+++ b/test/multiple_routers/multiple_routers.result
@@ -0,0 +1,295 @@
+test_run = require('test_run').new()
+---
+...
+REPLICASET_1_1 = { 'storage_1_1_a', 'storage_1_1_b' }
+---
+...
+REPLICASET_1_2 = { 'storage_1_2_a', 'storage_1_2_b' }
+---
+...
+REPLICASET_2_1 = { 'storage_2_1_a', 'storage_2_1_b' }
+---
+...
+REPLICASET_2_2 = { 'storage_2_2_a', 'storage_2_2_b' }
+---
+...
+test_run:create_cluster(REPLICASET_1_1, 'multiple_routers')
+---
+...
+test_run:create_cluster(REPLICASET_1_2, 'multiple_routers')
+---
+...
+test_run:create_cluster(REPLICASET_2_1, 'multiple_routers')
+---
+...
+test_run:create_cluster(REPLICASET_2_2, 'multiple_routers')
+---
+...
+util = require('lua_libs.util')
+---
+...
+util.wait_master(test_run, REPLICASET_1_1, 'storage_1_1_a')
+---
+...
+util.wait_master(test_run, REPLICASET_1_2, 'storage_1_2_a')
+---
+...
+util.wait_master(test_run, REPLICASET_2_1, 'storage_2_1_a')
+---
+...
+util.wait_master(test_run, REPLICASET_2_2, 'storage_2_2_a')
+---
+...
+test_run:cmd("create server router_1 with script='multiple_routers/router_1.lua'")
+---
+- true
+...
+test_run:cmd("start server router_1")
+---
+- true
+...
+-- Configure default (static) router.
+_ = test_run:cmd("switch router_1")
+---
+...
+static_router = vshard.router.new('_static_router', configs.cfg_1)
+---
+...
+vshard.router.bootstrap()
+---
+- true
+...
+_ = test_run:cmd("switch storage_1_2_a")
+---
+...
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+---
+...
+_ = test_run:cmd("switch router_1")
+---
+...
+vshard.router.call(1, 'write', 'do_replace', {{1, 1}})
+---
+- true
+...
+vshard.router.call(1, 'read', 'do_select', {1})
+---
+- [[1, 1]]
+...
+-- Test that static router is just a router object under the hood.
+static_router:route(1) == vshard.router.route(1)
+---
+- true
+...
+-- Configure extra router.
+router_2 = vshard.router.new('router_2', configs.cfg_2)
+---
+...
+router_2:bootstrap()
+---
+- true
+...
+_ = test_run:cmd("switch storage_2_2_a")
+---
+...
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+---
+...
+_ = test_run:cmd("switch router_1")
+---
+...
+router_2:call(1, 'write', 'do_replace', {{2, 2}})
+---
+- true
+...
+router_2:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+-- Check that router_2 and static router serves different clusters.
+#router_2:call(1, 'read', 'do_select', {1}) == 0
+---
+- true
+...
+-- Create several routers to the same cluster.
+routers = {}
+---
+...
+for i = 3, 10 do routers[i] = vshard.router.new('router_' .. i, configs.cfg_2) end
+---
+...
+routers[3]:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+-- Check that they have their own background fibers.
+fiber_names = {}
+---
+...
+for i = 2, 10 do fiber_names['vshard.failover.router_' .. i] = true; fiber_names['vshard.discovery.router_' .. i] = true; end
+---
+...
+next(fiber_names) ~= nil
+---
+- true
+...
+fiber = require('fiber')
+---
+...
+for _, xfiber in pairs(fiber.info()) do fiber_names[xfiber.name] = nil end
+---
+...
+next(fiber_names) == nil
+---
+- true
+...
+-- Reconfigure one of routers do not affect the others.
+routers[3]:cfg(configs.cfg_1)
+---
+...
+routers[3]:call(1, 'read', 'do_select', {1})
+---
+- [[1, 1]]
+...
+#routers[3]:call(1, 'read', 'do_select', {2}) == 0
+---
+- true
+...
+#routers[4]:call(1, 'read', 'do_select', {1}) == 0
+---
+- true
+...
+routers[4]:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+routers[3]:cfg(configs.cfg_2)
+---
+...
+-- Try to create router with the same name.
+util = require('lua_libs.util')
+---
+...
+util.check_error(vshard.router.new, 'router_2', configs.cfg_2)
+---
+- null
+- Router with name router_2 already exists
+...
+-- Reload router module.
+_, old_rs_1 = next(vshard.router.internal.static_router.replicasets)
+---
+...
+_, old_rs_2 = next(router_2.replicasets)
+---
+...
+package.loaded['vshard.router'] = nil
+---
+...
+vshard.router = require('vshard.router')
+---
+...
+while not old_rs_1.is_outdated do fiber.sleep(0.01) end
+---
+...
+while not old_rs_2.is_outdated do fiber.sleep(0.01) end
+---
+...
+vshard.router.call(1, 'read', 'do_select', {1})
+---
+- [[1, 1]]
+...
+router_2:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+routers[5]:call(1, 'read', 'do_select', {2})
+---
+- [[2, 2]]
+...
+-- Check lua_gc counter.
+lua_gc = require('vshard.lua_gc')
+---
+...
+vshard.router.internal.collect_lua_garbage_cnt == 0
+---
+- true
+...
+lua_gc.internal.bg_fiber == nil
+---
+- true
+...
+configs.cfg_2.collect_lua_garbage = true
+---
+...
+routers[5]:cfg(configs.cfg_2)
+---
+...
+lua_gc.internal.bg_fiber ~= nil
+---
+- true
+...
+routers[7]:cfg(configs.cfg_2)
+---
+...
+lua_gc.internal.bg_fiber ~= nil
+---
+- true
+...
+vshard.router.internal.collect_lua_garbage_cnt == 2
+---
+- true
+...
+package.loaded['vshard.router'] = nil
+---
+...
+vshard.router = require('vshard.router')
+---
+...
+vshard.router.internal.collect_lua_garbage_cnt == 2
+---
+- true
+...
+configs.cfg_2.collect_lua_garbage = nil
+---
+...
+routers[5]:cfg(configs.cfg_2)
+---
+...
+lua_gc.internal.bg_fiber ~= nil
+---
+- true
+...
+routers[7]:cfg(configs.cfg_2)
+---
+...
+vshard.router.internal.collect_lua_garbage_cnt == 0
+---
+- true
+...
+lua_gc.internal.bg_fiber == nil
+---
+- true
+...
+_ = test_run:cmd("switch default")
+---
+...
+test_run:cmd("stop server router_1")
+---
+- true
+...
+test_run:cmd("cleanup server router_1")
+---
+- true
+...
+test_run:drop_cluster(REPLICASET_1_1)
+---
+...
+test_run:drop_cluster(REPLICASET_1_2)
+---
+...
+test_run:drop_cluster(REPLICASET_2_1)
+---
+...
+test_run:drop_cluster(REPLICASET_2_2)
+---
+...
diff --git a/test/multiple_routers/multiple_routers.test.lua b/test/multiple_routers/multiple_routers.test.lua
new file mode 100644
index 0000000..760ad9f
--- /dev/null
+++ b/test/multiple_routers/multiple_routers.test.lua
@@ -0,0 +1,108 @@
+test_run = require('test_run').new()
+
+REPLICASET_1_1 = { 'storage_1_1_a', 'storage_1_1_b' }
+REPLICASET_1_2 = { 'storage_1_2_a', 'storage_1_2_b' }
+REPLICASET_2_1 = { 'storage_2_1_a', 'storage_2_1_b' }
+REPLICASET_2_2 = { 'storage_2_2_a', 'storage_2_2_b' }
+
+test_run:create_cluster(REPLICASET_1_1, 'multiple_routers')
+test_run:create_cluster(REPLICASET_1_2, 'multiple_routers')
+test_run:create_cluster(REPLICASET_2_1, 'multiple_routers')
+test_run:create_cluster(REPLICASET_2_2, 'multiple_routers')
+util = require('lua_libs.util')
+util.wait_master(test_run, REPLICASET_1_1, 'storage_1_1_a')
+util.wait_master(test_run, REPLICASET_1_2, 'storage_1_2_a')
+util.wait_master(test_run, REPLICASET_2_1, 'storage_2_1_a')
+util.wait_master(test_run, REPLICASET_2_2, 'storage_2_2_a')
+
+test_run:cmd("create server router_1 with script='multiple_routers/router_1.lua'")
+test_run:cmd("start server router_1")
+
+-- Configure default (static) router.
+_ = test_run:cmd("switch router_1")
+static_router = vshard.router.new('_static_router', configs.cfg_1)
+vshard.router.bootstrap()
+_ = test_run:cmd("switch storage_1_2_a")
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+_ = test_run:cmd("switch router_1")
+
+vshard.router.call(1, 'write', 'do_replace', {{1, 1}})
+vshard.router.call(1, 'read', 'do_select', {1})
+
+-- Test that static router is just a router object under the hood.
+static_router:route(1) == vshard.router.route(1)
+
+-- Configure extra router.
+router_2 = vshard.router.new('router_2', configs.cfg_2)
+router_2:bootstrap()
+_ = test_run:cmd("switch storage_2_2_a")
+wait_rebalancer_state('The cluster is balanced ok', test_run)
+_ = test_run:cmd("switch router_1")
+
+router_2:call(1, 'write', 'do_replace', {{2, 2}})
+router_2:call(1, 'read', 'do_select', {2})
+-- Check that router_2 and static router serves different clusters.
+#router_2:call(1, 'read', 'do_select', {1}) == 0
+
+-- Create several routers to the same cluster.
+routers = {}
+for i = 3, 10 do routers[i] = vshard.router.new('router_' .. i, configs.cfg_2) end
+routers[3]:call(1, 'read', 'do_select', {2})
+-- Check that they have their own background fibers.
+fiber_names = {}
+for i = 2, 10 do fiber_names['vshard.failover.router_' .. i] = true; fiber_names['vshard.discovery.router_' .. i] = true; end
+next(fiber_names) ~= nil
+fiber = require('fiber')
+for _, xfiber in pairs(fiber.info()) do fiber_names[xfiber.name] = nil end
+next(fiber_names) == nil
+
+-- Reconfigure one of routers do not affect the others.
+routers[3]:cfg(configs.cfg_1)
+routers[3]:call(1, 'read', 'do_select', {1})
+#routers[3]:call(1, 'read', 'do_select', {2}) == 0
+#routers[4]:call(1, 'read', 'do_select', {1}) == 0
+routers[4]:call(1, 'read', 'do_select', {2})
+routers[3]:cfg(configs.cfg_2)
+
+-- Try to create router with the same name.
+util = require('lua_libs.util')
+util.check_error(vshard.router.new, 'router_2', configs.cfg_2)
+
+-- Reload router module.
+_, old_rs_1 = next(vshard.router.internal.static_router.replicasets)
+_, old_rs_2 = next(router_2.replicasets)
+package.loaded['vshard.router'] = nil
+vshard.router = require('vshard.router')
+while not old_rs_1.is_outdated do fiber.sleep(0.01) end
+while not old_rs_2.is_outdated do fiber.sleep(0.01) end
+vshard.router.call(1, 'read', 'do_select', {1})
+router_2:call(1, 'read', 'do_select', {2})
+routers[5]:call(1, 'read', 'do_select', {2})
+
+-- Check lua_gc counter.
+lua_gc = require('vshard.lua_gc')
+vshard.router.internal.collect_lua_garbage_cnt == 0
+lua_gc.internal.bg_fiber == nil
+configs.cfg_2.collect_lua_garbage = true
+routers[5]:cfg(configs.cfg_2)
+lua_gc.internal.bg_fiber ~= nil
+routers[7]:cfg(configs.cfg_2)
+lua_gc.internal.bg_fiber ~= nil
+vshard.router.internal.collect_lua_garbage_cnt == 2
+package.loaded['vshard.router'] = nil
+vshard.router = require('vshard.router')
+vshard.router.internal.collect_lua_garbage_cnt == 2
+configs.cfg_2.collect_lua_garbage = nil
+routers[5]:cfg(configs.cfg_2)
+lua_gc.internal.bg_fiber ~= nil
+routers[7]:cfg(configs.cfg_2)
+vshard.router.internal.collect_lua_garbage_cnt == 0
+lua_gc.internal.bg_fiber == nil
+
+_ = test_run:cmd("switch default")
+test_run:cmd("stop server router_1")
+test_run:cmd("cleanup server router_1")
+test_run:drop_cluster(REPLICASET_1_1)
+test_run:drop_cluster(REPLICASET_1_2)
+test_run:drop_cluster(REPLICASET_2_1)
+test_run:drop_cluster(REPLICASET_2_2)
diff --git a/test/multiple_routers/router_1.lua b/test/multiple_routers/router_1.lua
new file mode 100644
index 0000000..2e9ea91
--- /dev/null
+++ b/test/multiple_routers/router_1.lua
@@ -0,0 +1,15 @@
+#!/usr/bin/env tarantool
+
+require('strict').on()
+
+-- Get instance name
+local fio = require('fio')
+local NAME = fio.basename(arg[0], '.lua')
+
+require('console').listen(os.getenv('ADMIN'))
+
+configs = require('configs')
+
+-- Start the database with sharding
+vshard = require('vshard')
+box.cfg{}
diff --git a/test/multiple_routers/storage_1_1_a.lua b/test/multiple_routers/storage_1_1_a.lua
new file mode 100644
index 0000000..b44a97a
--- /dev/null
+++ b/test/multiple_routers/storage_1_1_a.lua
@@ -0,0 +1,23 @@
+#!/usr/bin/env tarantool
+
+require('strict').on()
+
+-- Get instance name.
+local fio = require('fio')
+NAME = fio.basename(arg[0], '.lua')
+
+require('console').listen(os.getenv('ADMIN'))
+
+-- Fetch config for the cluster of the instance.
+if NAME:sub(9,9) == '1' then
+    cfg = require('configs').cfg_1
+else
+    cfg = require('configs').cfg_2
+end
+
+-- Start the database with sharding.
+vshard = require('vshard')
+vshard.storage.cfg(cfg, names[NAME])
+
+-- Bootstrap storage.
+require('lua_libs.bootstrap')
diff --git a/test/multiple_routers/storage_1_1_b.lua b/test/multiple_routers/storage_1_1_b.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_1_1_b.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_1_2_a.lua b/test/multiple_routers/storage_1_2_a.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_1_2_a.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_1_2_b.lua b/test/multiple_routers/storage_1_2_b.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_1_2_b.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_2_1_a.lua b/test/multiple_routers/storage_2_1_a.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_2_1_a.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_2_1_b.lua b/test/multiple_routers/storage_2_1_b.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_2_1_b.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_2_2_a.lua b/test/multiple_routers/storage_2_2_a.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_2_2_a.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/storage_2_2_b.lua b/test/multiple_routers/storage_2_2_b.lua
new file mode 120000
index 0000000..76d196b
--- /dev/null
+++ b/test/multiple_routers/storage_2_2_b.lua
@@ -0,0 +1 @@
+storage_1_1_a.lua
\ No newline at end of file
diff --git a/test/multiple_routers/suite.ini b/test/multiple_routers/suite.ini
new file mode 100644
index 0000000..d2d4470
--- /dev/null
+++ b/test/multiple_routers/suite.ini
@@ -0,0 +1,6 @@
+[default]
+core = tarantool
+description = Multiple routers tests
+script = test.lua
+is_parallel = False
+lua_libs = ../lua_libs configs.lua
diff --git a/test/multiple_routers/test.lua b/test/multiple_routers/test.lua
new file mode 100644
index 0000000..cb7c1ee
--- /dev/null
+++ b/test/multiple_routers/test.lua
@@ -0,0 +1,9 @@
+#!/usr/bin/env tarantool
+
+require('strict').on()
+
+box.cfg{
+    listen              = os.getenv("LISTEN"),
+}
+
+require('console').listen(os.getenv('ADMIN'))
diff --git a/test/router/exponential_timeout.result b/test/router/exponential_timeout.result
index fb54d0f..6748b64 100644
--- a/test/router/exponential_timeout.result
+++ b/test/router/exponential_timeout.result
@@ -37,10 +37,10 @@ test_run:cmd('switch router_1')
 util = require('util')
 ---
 ...
-rs1 = vshard.router.internal.replicasets[replicasets[1]]
+rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]]
 ---
 ...
-rs2 = vshard.router.internal.replicasets[replicasets[2]]
+rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]]
 ---
 ...
util.collect_timeouts(rs1) diff --git a/test/router/exponential_timeout.test.lua b/test/router/exponential_timeout.test.lua index 3ec0b8c..75d85bf 100644 --- a/test/router/exponential_timeout.test.lua +++ b/test/router/exponential_timeout.test.lua @@ -13,8 +13,8 @@ test_run:cmd("start server router_1")  test_run:cmd('switch router_1')  util = require('util') -rs1 = vshard.router.internal.replicasets[replicasets[1]] -rs2 = vshard.router.internal.replicasets[replicasets[2]] +rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]] +rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]]  util.collect_timeouts(rs1)  util.collect_timeouts(rs2) diff --git a/test/router/reconnect_to_master.result b/test/router/reconnect_to_master.result index 5e678ce..d502723 100644 --- a/test/router/reconnect_to_master.result +++ b/test/router/reconnect_to_master.result @@ -76,7 +76,7 @@ _ = test_run:cmd('stop server storage_1_a')  _ = test_run:switch('router_1')  ---  ... -reps = vshard.router.internal.replicasets +reps = vshard.router.internal.static_router.replicasets  ---  ...  test_run:cmd("setopt delimiter ';'") @@ -95,7 +95,7 @@ end;  ...  function count_known_buckets()      local known_buckets = 0 -    for _, id in pairs(vshard.router.internal.route_map) do +    for _, id in pairs(vshard.router.internal.static_router.route_map) do          known_buckets = known_buckets + 1      end      return known_buckets @@ -127,7 +127,7 @@ is_disconnected()  fiber = require('fiber')  ---  ... -while vshard.router.internal.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end +while vshard.router.internal.static_router.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end  ---  ...  
vshard.router.info() diff --git a/test/router/reconnect_to_master.test.lua b/test/router/reconnect_to_master.test.lua index 39ba90e..8820fa7 100644 --- a/test/router/reconnect_to_master.test.lua +++ b/test/router/reconnect_to_master.test.lua @@ -34,7 +34,7 @@ _ = test_run:cmd('stop server storage_1_a')  _ = test_run:switch('router_1') -reps = vshard.router.internal.replicasets +reps = vshard.router.internal.static_router.replicasets  test_run:cmd("setopt delimiter ';'")  function is_disconnected()      for i, rep in pairs(reps) do @@ -46,7 +46,7 @@ function is_disconnected()  end;  function count_known_buckets()      local known_buckets = 0 -    for _, id in pairs(vshard.router.internal.route_map) do +    for _, id in pairs(vshard.router.internal.static_router.route_map) do          known_buckets = known_buckets + 1      end      return known_buckets @@ -63,7 +63,7 @@ is_disconnected()  -- Wait until replica is connected to test alerts on unavailable  -- master.  fiber = require('fiber') -while vshard.router.internal.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end +while vshard.router.internal.static_router.replicasets[replicasets[1]].replica == nil do fiber.sleep(0.1) end  vshard.router.info()  -- Return master. diff --git a/test/router/reload.result b/test/router/reload.result index f0badc3..98e8e71 100644 --- a/test/router/reload.result +++ b/test/router/reload.result @@ -229,7 +229,7 @@ vshard.router.cfg(cfg)  cfg.connection_outdate_delay = old_connection_delay  ---  ... -vshard.router.internal.connection_outdate_delay = nil +vshard.router.internal.static_router.connection_outdate_delay = nil  ---  ...  
rs_new = vshard.router.route(1) diff --git a/test/router/reload.test.lua b/test/router/reload.test.lua index 528222a..293cb26 100644 --- a/test/router/reload.test.lua +++ b/test/router/reload.test.lua @@ -104,7 +104,7 @@ old_connection_delay = cfg.connection_outdate_delay  cfg.connection_outdate_delay = 0.3  vshard.router.cfg(cfg)  cfg.connection_outdate_delay = old_connection_delay -vshard.router.internal.connection_outdate_delay = nil +vshard.router.internal.static_router.connection_outdate_delay = nil  rs_new = vshard.router.route(1)  rs_old = rs  _, replica_old = next(rs_old.replicas) diff --git a/test/router/reroute_wrong_bucket.result b/test/router/reroute_wrong_bucket.result index 7f2a494..989dc79 100644 --- a/test/router/reroute_wrong_bucket.result +++ b/test/router/reroute_wrong_bucket.result @@ -98,7 +98,7 @@ vshard.router.call(100, 'read', 'customer_lookup', {1}, {timeout = 100})  ---  - {'accounts': [], 'customer_id': 1, 'name': 'name'}  ... -vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]] +vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]]  ---  ...  vshard.router.call(100, 'write', 'customer_add', {{customer_id = 2, bucket_id = 100, name = 'name2', accounts = {}}}, {timeout = 100}) @@ -146,13 +146,13 @@ test_run:switch('router_1')  ...  -- Emulate a situation, when a replicaset_2 while is unknown for  -- router, but is already known for storages. -save_rs2 = vshard.router.internal.replicasets[replicasets[2]] +save_rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]]  ---  ... -vshard.router.internal.replicasets[replicasets[2]] = nil +vshard.router.internal.static_router.replicasets[replicasets[2]] = nil  ---  ... 
-vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]] +vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]]  ---  ...  fiber = require('fiber') @@ -207,7 +207,7 @@ err  require('log').info(string.rep('a', 1000))  ---  ... -vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]] +vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]]  ---  ...  call_retval = nil @@ -219,7 +219,7 @@ f = fiber.create(do_call, 100)  while not test_run:grep_log('router_1', 'please update configuration', 1000) do fiber.sleep(0.1) end  ---  ... -vshard.router.internal.replicasets[replicasets[2]] = save_rs2 +vshard.router.internal.static_router.replicasets[replicasets[2]] = save_rs2  ---  ...  while not call_retval do fiber.sleep(0.1) end diff --git a/test/router/reroute_wrong_bucket.test.lua b/test/router/reroute_wrong_bucket.test.lua index 03384d1..a00f941 100644 --- a/test/router/reroute_wrong_bucket.test.lua +++ b/test/router/reroute_wrong_bucket.test.lua @@ -35,7 +35,7 @@ customer_add({customer_id = 1, bucket_id = 100, name = 'name', accounts = {}})  test_run:switch('router_1')  vshard.router.call(100, 'read', 'customer_lookup', {1}, {timeout = 100}) -vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]] +vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]]  vshard.router.call(100, 'write', 'customer_add', {{customer_id = 2, bucket_id = 100, name = 'name2', accounts = {}}}, {timeout = 100})  -- Create cycle. @@ -55,9 +55,9 @@ box.space._bucket:replace({100, vshard.consts.BUCKET.SENT, replicasets[2]})  test_run:switch('router_1')  -- Emulate a situation, when a replicaset_2 while is unknown for  -- router, but is already known for storages. 
-save_rs2 = vshard.router.internal.replicasets[replicasets[2]] -vshard.router.internal.replicasets[replicasets[2]] = nil -vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]] +save_rs2 = vshard.router.internal.static_router.replicasets[replicasets[2]] +vshard.router.internal.static_router.replicasets[replicasets[2]] = nil +vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]]  fiber = require('fiber')  call_retval = nil @@ -84,11 +84,11 @@ err  -- detect it and end with ok.  --  require('log').info(string.rep('a', 1000)) -vshard.router.internal.route_map[100] = vshard.router.internal.replicasets[replicasets[1]] +vshard.router.internal.static_router.route_map[100] = vshard.router.internal.static_router.replicasets[replicasets[1]]  call_retval = nil  f = fiber.create(do_call, 100)  while not test_run:grep_log('router_1', 'please update configuration', 1000) do fiber.sleep(0.1) end -vshard.router.internal.replicasets[replicasets[2]] = save_rs2 +vshard.router.internal.static_router.replicasets[replicasets[2]] = save_rs2  while not call_retval do fiber.sleep(0.1) end  call_retval  vshard.router.call(100, 'read', 'customer_lookup', {3}, {timeout = 1}) diff --git a/test/router/retry_reads.result b/test/router/retry_reads.result index 64b0ff3..b803ae3 100644 --- a/test/router/retry_reads.result +++ b/test/router/retry_reads.result @@ -37,7 +37,7 @@ test_run:cmd('switch router_1')  util = require('util')  ---  ... -rs1 = vshard.router.internal.replicasets[replicasets[1]] +rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]]  ---  ...  
min_timeout = vshard.consts.CALL_TIMEOUT_MIN diff --git a/test/router/retry_reads.test.lua b/test/router/retry_reads.test.lua index 2fb2fc7..510e961 100644 --- a/test/router/retry_reads.test.lua +++ b/test/router/retry_reads.test.lua @@ -13,7 +13,7 @@ test_run:cmd("start server router_1")  test_run:cmd('switch router_1')  util = require('util') -rs1 = vshard.router.internal.replicasets[replicasets[1]] +rs1 = vshard.router.internal.static_router.replicasets[replicasets[1]]  min_timeout = vshard.consts.CALL_TIMEOUT_MIN  -- diff --git a/test/router/router.result b/test/router/router.result