[tarantool-patches] Re: [PATCH][vshard] Reload reloadable fiber

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Thu Jun 21 15:04:53 MSK 2018


Hello. Thanks for the patch! See my 6 comments below.

On 14/06/2018 14:42, AKhatskevich wrote:
> Fixed a problem:
> The `reloadable_fiber_f` was running an infinite where loop and

1. What is a 'where loop'?

> preventing the whole module from being reloaded.
> 
> This behavior is fixed by calling new version of `reloadable_fiber_f` in
> a return statement instead of the where loop.  Note: calling a function
> in a return statement doesn't increase a stack size.
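
On the tail-call note: Lua guarantees proper tail calls for
`return f(args)`, so the claim holds. A minimal plain-Lua sketch
of the difference (the function names are hypothetical and are
not part of the patch):

```lua
-- Proper tail call: `return f(...)` reuses the caller's stack frame,
-- so the depth of the "recursion" is not limited by the stack.
local function count_down_tail(n)
    if n == 0 then
        return 'done'
    end
    return count_down_tail(n - 1)
end

-- Not a tail call: the addition runs after the inner call returns,
-- so every level has to keep its own frame.
local function count_down_plain(n)
    if n == 0 then
        return 0
    end
    return 1 + count_down_plain(n - 1)
end

local tail_result = count_down_tail(1000000)
-- The plain variant is wrapped in pcall because it is expected to
-- hit the stack limit at this depth.
local plain_ok = pcall(count_down_plain, 1000000)
print(tail_result, plain_ok)
```

Only the first form is safe for a wrapper that re-enters itself
an unbounded number of times.
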
> 
> Closes #116
> ---
> Branch: https://github.com/tarantool/vshard/tree/kh/gh-116-reloadable
> Issue: https://github.com/tarantool/vshard/issues/116
> 
>   test/router/reload.result    |  4 +--
>   test/router/reload.test.lua  |  4 +--
>   test/storage/reload.result   |  6 ++--
>   test/storage/reload.test.lua |  6 ++--
>   vshard/util.lua              | 78 +++++++++++++++++++++++++++++++-------------
>   5 files changed, 65 insertions(+), 33 deletions(-)
> 
> diff --git a/test/router/reload.result b/test/router/reload.result
> index 19a9ead..47f3c2e 100644
> --- a/test/router/reload.result
> +++ b/test/router/reload.result
> @@ -116,10 +116,10 @@ vshard.router.module_version()
>   check_reloaded()
>   ---
>   ...
> -while test_run:grep_log('router_1', 'Failover has been reloaded') == nil do fiber.sleep(0.1) end
> +while test_run:grep_log('router_1', 'Failover has been started') == nil do fiber.sleep(0.1) end

2. Why? Please keep the old message. The router already logs that
failover has started in router_cfg. The same applies in the other places.

> diff --git a/vshard/util.lua b/vshard/util.lua
> index bb71318..fa51701 100644
> --- a/vshard/util.lua
> +++ b/vshard/util.lua
> @@ -2,6 +2,24 @@
>   local log = require('log')
>   local fiber = require('fiber')
>   
> +local MODULE_INTERNALS = '__module_vshard_util'
> +local M = rawget(_G, MODULE_INTERNALS)
> +if not M then
> +    --
> +    -- The module is loaded for the first time.
> +    --
> +    M = {
> +        -- Latest versions of functions.
> +        reloadable_fiber_f = nil,
> +        errinj = {
> +            RELOADABLE_STACK_MAX = nil,

3. What is the point of this error injection? I think it
tests Lua, not VShard. It also takes too many lines in
the reloadable fiber function, which makes the function
harder to understand. So let's remove it.

> +            RELOADABLE_EXIT = nil,
> +        }
> +
> +    }
> +    rawset(_G, MODULE_INTERNALS, M)
> +end
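
For readers unfamiliar with the pattern in this hunk: the module
keeps its state in a table stored in _G, so a second load of the
file finds the existing table instead of creating a new one. A
minimal plain-Lua sketch, assuming a hypothetical key
'__module_example' and a hypothetical `load_module` helper that
stands in for executing the module file:

```lua
-- Hypothetical key; the patch uses '__module_vshard_util'.
local MODULE_INTERNALS = '__module_example'

-- Stands in for executing the module file: each call plays the role
-- of one load (or reload) of the module.
local function load_module()
    local M = rawget(_G, MODULE_INTERNALS)
    if not M then
        -- First load: create the state and publish it in _G.
        M = { load_count = 0 }
        rawset(_G, MODULE_INTERNALS, M)
    end
    M.load_count = M.load_count + 1
    return M
end

local first = load_module()
local second = load_module()  -- simulated reload
print(first == second, second.load_count)
```

Both calls return the same table, so anything stored in it
survives the reload.
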
> +
> @@ -19,33 +37,43 @@ local function tuple_extract_key(tuple, parts)
>   end
>   
>   --
> --- Wrapper to run @a func in infinite loop and restart it on the
> --- module reload. This function CAN NOT BE AUTORELOADED. To update
> --- it you must manualy stop all fibers, run by this function, do
> --- reload, and then restart all stopped fibers. This can be done,
> --- for example, by calling vshard.storage/router.cfg() again with
> --- the same config as earlier.
> +-- Wrapper to run a func in infinite loop and restart it on
> +-- errors and module reload.
> +-- To handle module reload and run new version of a function
> +-- in the module, the function should just return.
>   --
> --- @param func Reloadable function to run. It must accept current
> ---        module version as an argument, and interrupt itself,
> ---        when it is changed.
> --- @param worker_name Name of the function. Usual infinite fiber
> ---        represents a background subsystem, which has a name. For
> ---        example: "Garbage Collector", "Recovery", "Discovery",
> ---        "Rebalancer".
> --- @param M Module which can reload.
> +-- @param module Module which can be reloaded.
> +-- @param func_name Name of a function to be executed in the
> +--        module.
> +-- @param worker_name Name of the reloadable background subsystem.
> +--        For example: "Garbage Collector", "Recovery", "Discovery",
> +--        "Rebalancer". Used only for an activity logging.
>   --
> -local function reloadable_fiber_f(M, func_name, worker_name)
> -    while true do
> -        local ok, err = pcall(M[func_name], M.module_version)
> -        if not ok then
> -            log.error('%s has been failed: %s', worker_name, err)
> -            fiber.yield()
> -        else
> -            log.info('%s has been reloaded', worker_name)
> -            fiber.yield()
> +local function reloadable_fiber_f(module, func_name, worker_name)
> +    log.info('%s has been started', worker_name)
> +    local func = module[func_name]
> +    local ok, err = pcall(func, module.module_version)
> +    if not ok then
> +        log.error('%s has been failed: %s', worker_name, err)
> +        if func ~= module[func_name] then
> +            log.warn('%s reloadable function %s has changed',
> +                        worker_name, func_name)
>           end
>       end
> +    fiber.yield()
> +    log.info('%s is reloading', worker_name)
> +    if M.errinj.RELOADABLE_EXIT then
> +        return

4. How is this error possible? There are no lines in reloadable_fiber_f
that can terminate the fiber.

5. Now on any reload I see two messages:

started
reloading
started
reloading

But the fiber is actually started only once. Please restore the old
messages.
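
To make comments 4 and 5 concrete, here is a rough sketch of the
shape I have in mind: the old messages, no error injection, and a
tail call through the module table. It is an illustration under
assumptions, not the final code: `log` and `fiber` are stubbed so
that it runs in plain Lua, and `M`, `worker_f` and the "new
wrapper" message are hypothetical.

```lua
-- Stand-ins for Tarantool's log and fiber modules (assumption: only
-- the calls used below are stubbed).
local messages = {}
local function record(fmt, ...)
    table.insert(messages, string.format(fmt, ...))
end
local log = { info = record, error = record }
local fiber = { yield = coroutine.yield }

-- Hypothetical table that holds the latest version of the wrapper.
local M = {}

function M.reloadable_fiber_f(module, func_name, worker_name)
    local ok, err = pcall(module[func_name], module.module_version)
    if not ok then
        log.error('%s has been failed: %s', worker_name, err)
    else
        log.info('%s has been reloaded', worker_name)
    end
    fiber.yield()
    -- Tail call through the table: a reloaded wrapper is picked up
    -- here, and the stack does not grow.
    return M.reloadable_fiber_f(module, func_name, worker_name)
end

-- Usage: run the wrapper in a coroutine, then simulate a reload.
local module = { module_version = 1, worker_f = function(version) end }
local co = coroutine.create(M.reloadable_fiber_f)
coroutine.resume(co, module, 'worker_f', 'Worker')

M.reloadable_fiber_f = function(module, func_name, worker_name)
    log.info('%s runs the new wrapper', worker_name)
end
coroutine.resume(co)
```

After the second resume the old wrapper hands control to the
replaced one, which is the behaviour the issue asks for.
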

> diff --git a/test/unit/util.result b/test/unit/util.result
> new file mode 100644
> index 0000000..ea9edfa
> --- /dev/null
> +++ b/test/unit/util.result
> @@ -0,0 +1,107 @@
> +test_run = require('test_run').new()
> +---
> +...
> +util = require('vshard.util')
> +---
> +...
> +test_util = require('util')

6. Unused variable?

> +---
> +...
> +log = require('log')
> +---
> +...



