Tarantool development patches archive
From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: tarantool-patches@freelists.org,
	AKhatskevich <avkhatskevich@tarantool.org>
Subject: [tarantool-patches] Re: [PATCH 2/2] Fix discovery/reconfigure race
Date: Thu, 21 Jun 2018 15:54:25 +0300
Message-ID: <edbc3f57-83b9-8434-7893-99857cacb699@tarantool.org>
In-Reply-To: <fa97ea478a9e4d85237eeb2bb8ccab0067d4914d.1529066485.git.avkhatskevich@tarantool.org>

Thanks for the patch! See 5 comments below.

On 15/06/2018 15:47, AKhatskevich wrote:
> This commit prevents discovery fiber from discovering old replicasets
> and spoiling `route_map`.
> ---
>   test/router/router.result   | 62 +++++++++++++++++++++++++++++++++++++++++++++
>   test/router/router.test.lua | 42 ++++++++++++++++++++++++++++++
>   vshard/router/init.lua      | 15 ++++++++++-
>   3 files changed, 118 insertions(+), 1 deletion(-)
> 
> diff --git a/test/router/router.result b/test/router/router.result
> index 5643f3e..e61505e 100644
> --- a/test/router/router.result
> +++ b/test/router/router.result
> @@ -1095,6 +1095,68 @@ for bucket, old_rs in pairs(bucket_to_old_rs) do
>   end;
>   ---
>   ...
> +--
> +-- Check route_map is not filled with old replica objects after
> +-- recpnfigure.

1. Typo: 'recpnfigure' -> 'reconfigure'.

> +--
> +-- Perform #replicasets phases of discovery, to update replicasets
> +-- object in for loop of discovery fiber since previous cfg.
> +for _, __ in pairs(vshard.router.internal.replicasets) do
> +    vshard.router.discovery_wakeup()
> +    fiber.sleep(0.02)
> +end;
> +---
> +...
> +-- Simulate long `callro`.
> +-- Stuck on first rs in replicasets.
> +vshard.router.internal.errinj.LONG_DISCOVERY = true;

2. I do not see this error injection in the M.errinj
declaration.

> +---
> +...
> +for _, __ in pairs(vshard.router.internal.replicasets) do
> +    vshard.router.discovery_wakeup()
> +    fiber.sleep(0.02)
> +end;

3. This loop makes no sense. With LONG_DISCOVERY set, it is
equivalent to calling router.discovery_wakeup() once.

> +---
> +...
> +vshard.router.cfg(cfg);
> +---
> +...
> +vshard.router.internal.errinj.LONG_DISCOVERY = nil;
> +---
> +...
> +-- Do discovery iteration.
> +vshard.router.discovery_wakeup()
> +fiber.sleep(0.02)

4. Fixed timeouts are a reliable way to get an unstable test.
Please get rid of them and use 'while not cond do wait end'
where necessary.
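
To illustrate the 'while not cond do wait end' idea, here is a minimal sketch
of a polling helper. `wait_until`, `discovered`, and the injected `sleep`
argument are hypothetical names used only for this illustration; they are not
part of vshard or of the test framework. The attempt limit is an added design
choice so that a condition that never holds fails the test instead of hanging it.

```lua
-- Hypothetical helper: poll a condition instead of sleeping for a fixed time.
-- `sleep` is injected so that the sketch runs in plain Lua; inside Tarantool
-- one would presumably pass `require('fiber').sleep`.
local function wait_until(cond, sleep, max_attempts)
    for attempt = 1, max_attempts do
        if cond() then
            return true, attempt
        end
        sleep(0.01)
    end
    -- Attempts exhausted: report the final state of the condition.
    return cond(), max_attempts
end

-- Usage with a fake condition that becomes true on the third poll.
local polls = 0
local function discovered()
    polls = polls + 1
    return polls >= 3
end
local ok, attempts = wait_until(discovered, function() end, 100)
```

In the test, the condition would check the state the test actually depends on
(for example, that route_map has been filled) rather than elapsed time.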

> +
> +rs_cnt = 0;
> +---
> +...
> +new_replicasets = {}
> +for _, rs in pairs(vshard.router.internal.replicasets) do
> +    new_replicasets[rs] = true
> +    rs_cnt = rs_cnt + 1
> +end;
> +---
> +...
> +rs_cnt;
> +---
> +- 2
> +...
> +bucket_cnt = 0;
> +---
> +...
> +for bucket_id, rs in pairs(vshard.router.internal.route_map) do
> +    if not new_replicasets[rs] then
> +        error('Old object added to route_map.')
> +    end
> +    bucket_cnt = bucket_cnt + 1
> +end;
> +---
> +...
> +bucket_cnt;
> +---
> +- 3000
> +...
>   test_run:cmd("setopt delimiter ''");
>   ---
>   - true
> diff --git a/vshard/router/init.lua b/vshard/router/init.lua
> index 7e765fa..df5b343 100644
> --- a/vshard/router/init.lua
> +++ b/vshard/router/init.lua
> @@ -127,10 +127,23 @@ local function discovery_f(module_version)
>       local iterations_until_lua_gc =
>           consts.COLLECT_LUA_GARBAGE_INTERVAL / consts.DISCOVERY_INTERVAL
>       while module_version == M.module_version do
> -        for _, replicaset in pairs(M.replicasets) do
> +        local old_replicasets = M.replicasets
> +        for rs_uuid, replicaset in pairs(M.replicasets) do
>               local active_buckets, err =
>                   replicaset:callro('vshard.storage.buckets_discovery', {},
>                                     {timeout = 2})
> +            while M.errinj.LONG_DISCOVERY do
> +                -- Stuck on the first replicaset.
> +                if rs_uuid ~= select(1, next(M.replicasets)) then
> +                    break
> +                end
> +                lfiber.sleep(0.01)
> +            end
> +            -- Renew replicasets object in case of reconfigure
> +            -- and reload events.

5. You do not renew anything here: the code below only breaks
out of the loop.
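
A standalone sketch of what the quoted check does, under simplifying
assumptions: `M` here is a plain stand-in table, and `discovery_pass` and
`on_visit` are hypothetical names used only for this illustration. It shows
that comparing the tables by identity detects a replaced `M.replicasets` and
aborts the pass; the new table is only picked up when the next pass reads
`M.replicasets` again.

```lua
-- Stand-in for the router module state; this is not vshard code.
local M = {replicasets = {a = 'rs_a', b = 'rs_b'}}

-- One discovery pass. `on_visit` is a hypothetical hook that lets the
-- sketch trigger a reconfigure in the middle of the pass.
local function discovery_pass(on_visit)
    local old_replicasets = M.replicasets
    local visited = 0
    for uuid, rs in pairs(old_replicasets) do
        visited = visited + 1
        on_visit(uuid, rs)
        -- Tables are compared by identity, so this detects that
        -- M.replicasets was replaced. Nothing is renewed at this point;
        -- the pass is simply aborted.
        if M.replicasets ~= old_replicasets then
            return false, visited
        end
    end
    return true, visited
end

-- First pass: a simulated reconfigure replaces the table on the first visit.
local completed, visited = discovery_pass(function()
    M.replicasets = {c = 'rs_c'}
end)

-- Second pass: reads the new table at its start and runs to completion.
local completed2, visited2 = discovery_pass(function() end)
```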

> +            if M.replicasets ~= old_replicasets then
> +                break
> +            end
>               if not active_buckets then
>                   log.error('Error during discovery %s: %s', replicaset, err)
>               else
> 

Thread overview: 14+ messages
2018-06-15 12:47 [tarantool-patches] [PATCH 0/2][vshard] preserve route map AKhatskevich
2018-06-15 12:47 ` [tarantool-patches] [PATCH 1/2] Preserve route_map on router.cfg AKhatskevich
2018-06-21 12:54   ` [tarantool-patches] " Vladislav Shpilevoy
2018-06-25 12:48     ` Alex Khatskevich
2018-06-15 12:47 ` [tarantool-patches] [PATCH 2/2] Fix discovery/reconfigure race AKhatskevich
2018-06-21 12:54   ` Vladislav Shpilevoy [this message]
2018-06-25 12:48     ` [tarantool-patches] " Alex Khatskevich
2018-06-26 11:11       ` Vladislav Shpilevoy
2018-06-26 14:03         ` Alex Khatskevich
2018-06-27 11:45           ` Vladislav Shpilevoy
2018-06-27 19:50             ` Alex Khatskevich
2018-06-28 19:41               ` Vladislav Shpilevoy
2018-06-21 12:54 ` [tarantool-patches] Re: [PATCH 0/2][vshard] preserve route map Vladislav Shpilevoy
2018-06-25 11:52   ` Alex Khatskevich
