[tarantool-patches] Re: [PATCH 3/3] Introduce multiple routers feature

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Wed Aug 1 21:43:37 MSK 2018


Thanks for the patch! See 10 comments below.

On 31/07/2018 19:25, AKhatskevich wrote:
> Key points:
> * Old `vshard.router.some_method()` api is preserved.
> * Add `vshard.router.new(name, cfg)` method which returns a new router.
> * Each router has its own:
>    1. name
>    2. background fibers
>    3. attributes (route_map, replicasets, outdate_delay...)
> * Module reload reloads all configured routers.
> * `cfg` reconfigures a single router.
> * All routers share the same box configuration. The last passed config
>    overrides the global config.
> * Multiple router instances can be connected to the same cluster.
> * By now, a router cannot be destroyed.
> 
> Extra changes:
> * Add `data` parameter to `reloadable_fiber_create` function.
> 
> Closes #130
> ---> diff --git a/test/multiple_routers/multiple_routers.result b/test/multiple_routers/multiple_routers.result
> new file mode 100644
> index 0000000..33f4034
> --- /dev/null
> +++ b/test/multiple_routers/multiple_routers.result
> @@ -0,0 +1,226 @@
> +-- Reconfigure one of routers do not affect the others.
> +routers[3]:cfg(configs.cfg_1)

1. You did not change configs.cfg_1 so it is not reconfig
actually. Please, change something to check that the
parameter affects one router and does not affect others.

2. Please, add a test on an ability to get the static router
into a variable and use it like others. It should be possible
to hide distinctions between static and other routers.

Like this:

     r1 = vshard.router.static
     r2 = vshard.router.new(...)
     do_something_with_router(r1)
     do_something_with_router(r2)

Here do_something_with_router() is unaware of whether the
router is static or not.

> diff --git a/vshard/router/init.lua b/vshard/router/init.lua
> index 3e127cb..7569baf 100644
> --- a/vshard/router/init.lua
> +++ b/vshard/router/init.lua
> @@ -257,13 +272,13 @@ local function router_call(bucket_id, mode, func, args, opts)
>                           -- but already is executed on storages.
>                           while lfiber.time() <= tend do
>                               lfiber.sleep(0.05)
> -                            replicaset = M.replicasets[err.destination]
> +                            replicaset = router.replicasets[err.destination]
>                               if replicaset then
>                                   goto replicaset_is_found
>                               end
>                           end
>                       else
> -                        replicaset = bucket_set(bucket_id, replicaset.uuid)
> +                        replicaset = bucket_set(router, bucket_id, replicaset.uuid)

3. Out of 80 symbols.

>                           lfiber.yield()
>                           -- Protect against infinite cycle in a
>                           -- case of broken cluster, when a bucket
> @@ -488,9 +503,14 @@ end
>   -- Configuration
>   --------------------------------------------------------------------------------
>   
> -local function router_cfg(cfg)
> -    local vshard_cfg, box_cfg = lcfg.check(cfg, M.current_cfg)
> -    if not M.replicasets then
> +-- Types of configuration.
> +CFG_NEW = 'new'
> +CFG_RELOAD = 'reload'
> +CFG_RECONFIGURE = 'reconfigure'

4. Last two values are never used in router_cfg(). The first
is used for logging only and can be checked as it was before
with no explicit passing.

> +
> +local function router_cfg(router, cfg, cfg_type)
> +    local vshard_cfg, box_cfg = lcfg.check(cfg, router.current_cfg)
> +    if cfg_type == CFG_NEW then
>           log.info('Starting router configuration')
>       else
>           log.info('Starting router reconfiguration')
> @@ -512,44 +532,53 @@ local function router_cfg(cfg)
> +
> +local function updage_lua_gc_state()

5. This function is not needed actually.

On router_new() the only thing that can change is start of
the gc fiber if the new router has the flag and the gc is
not started now. It can be checked by a simple 'if' with
no full-scan of all routers.

On reload it is not possible to change configuration, so
the gc state can not be changed and does not need an update.
Even if it could be changed, you already iterate over routers
on reload to call router_cfg and can collect their flags
along side.

The next point is that it is not possible now to manage the
gc via simple :cfg() call. You do nothing with gc when
router_cfg is called directly. And that produces a question -
why do your tests pass if so?

The possible solution - keep a counter of set lua gc flags
overall routers in M. On each cfg you update the counter
if the value is changed. If it was 0 and become > 0, then
you start gc. If it was > 0 and become 0, then you stop gc.
No routers iteration at all.

> +    local lua_gc = false
> +    for _, xrouter in pairs(M.routers) do
> +        if xrouter.collect_lua_garbage then
> +            lua_gc = true
> +        end
> +    end
> +    lua_gc.set_state(lua_gc, consts.COLLECT_LUA_GARBAGE_INTERVAL)
>   end
>   
> @@ -803,6 +833,93 @@ if M.errinj.ERRINJ_RELOAD then
>       error('Error injection: reload')
>   end
>   
> +--------------------------------------------------------------------------------
> +-- Managing router instances
> +--------------------------------------------------------------------------------
> +
> +local function cfg_reconfigure(router, cfg)
> +    return router_cfg(router, cfg, CFG_RECONFIGURE)
> +end
> +
> +local router_mt = {
> +    __index = {
> +        cfg = cfg_reconfigure;
> +        info = router_info;
> +        buckets_info = router_buckets_info;
> +        call = router_call;
> +        callro = router_callro;
> +        callrw = router_callrw;
> +        route = router_route;
> +        routeall = router_routeall;
> +        bucket_id = router_bucket_id;
> +        bucket_count = router_bucket_count;
> +        sync = router_sync;
> +        bootstrap = cluster_bootstrap;
> +        bucket_discovery = bucket_discovery;
> +        discovery_wakeup = discovery_wakeup;
> +    }
> +}
> +
> +-- Table which represents this module.
> +local module = {}
> +
> +local function export_static_router_attributes()
> +    -- This metatable bypasses calls to a module to the static_router.
> +    local module_mt = {__index = {}}
> +    for method_name, method in pairs(router_mt.__index) do
> +        module_mt.__index[method_name] = function(...)
> +            if M.static_router then
> +                return method(M.static_router, ...)
> +            else
> +                error('Static router is not configured')

6. This should not be all-time check. You should
initialize the static router metatable with only errors.
On the first cfg you reset the metatable to always use
regular methods. But anyway this code is unreachable. See
below in the comment 10 why it is so.

> +            end
> +        end
> +    end
> +    setmetatable(module, module_mt)
> +    -- Make static_router attributes accessible form
> +    -- vshard.router.internal.
> +    local M_static_router_attributes = {
> +        name = true,
> +        replicasets = true,
> +        route_map = true,
> +        total_bucket_count = true,
> +    }

7. I saw in the tests that you are using vshard.router.internal.static_router
instead. Please, remove M_static_router_attributes then.

> +    setmetatable(M, {
> +        __index = function(M, key)
> +            return M.static_router[key]
> +        end
> +    })
> +end
> +
> +local function router_new(name, cfg)
> +    assert(type(name) == 'string' and type(cfg) == 'table',
> +           'Wrong argument type. Usage: vshard.router.new(name, cfg).')

8. As I said before, do not use assertions for usage checks in public
API. Use 'if wrong_usage then error(...) end'.

> +    if M.routers[name] then
> +        return nil, string.format('Router with name %s already exists', name)
> +    end
> +    local router = table.deepcopy(ROUTER_TEMPLATE)
> +    setmetatable(router, router_mt)
> +    router.name = name
> +    M.routers[name] = router
> +    if name == STATIC_ROUTER_NAME then
> +        M.static_router = router
> +        export_static_router_attributes()
> +    end

9. This check can be removed if you move
export_static_router_attributes call into legacy_cfg.

10. Looks like all your struggles in
export_static_router_attributes() about error on non-configured
router makes no sense since until cfg is called, vshard.router
has no any methods except cfg and new.

> +    router_cfg(router, cfg, CFG_NEW)
> +    updage_lua_gc_state()
> +    return router
> +end
> +




More information about the Tarantool-patches mailing list