From: Oleg Babin via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>,
	tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH vshard 2/2] router: don't fallback RO to master right away
Date: Sun, 5 Dec 2021 20:53:26 +0300	[thread overview]
Message-ID: <03bd48be-b017-a48f-773e-f4b8e95f8cf5@tarantool.org> (raw)
In-Reply-To: <36c68a48a627c8af98fb6f9809d2661abcd93658.1638577114.git.v.shpilevoy@tarantool.org>
Thanks for your patch. See one comment below.
Also as I see CI is red on your branch. Is it related to your changes?
On 04.12.2021 03:19, Vladislav Shpilevoy wrote:
> RO requests use the replica with the highest prio as specified in
> the weights matrix.
>
> If the best replica is not available now and failover didn't
> happen yet, then RO requests used to fallback to master. Even if
> there were other RO replicas with better prio.
>
> This patch makes RO call firstly try the currently selected most
> prio replica. If it is not available (no other connected replicas
> at all or failover didn't happen yet), the call will try to walk
> the prio list starting from this replica until it finds an
> available one.
>
> If it also fails, the call will try to walk the list from the
> beginning hoping that the unavailable replica wasn't the best one
> and there might be better option on the other side of the prio
> list.
>
> The patch was done in scope of task about replica backoff (#298)
> because the problem would additionally exist when the best replica
> is in backoff, not only disconnected. It would get worse.
>
> Closes #288
> Needed for #298
> ---
>   test/failover/failover.result         | 64 +++++++++++++------
>   test/failover/failover.test.lua       | 30 +++++----
>   test/router/master_discovery.result   | 92 +++++++++++++++++++++++----
>   test/router/master_discovery.test.lua | 38 ++++++++---
>   vshard/replicaset.lua                 | 21 ++++--
>   5 files changed, 185 insertions(+), 60 deletions(-)
>
> diff --git a/test/failover/failover.result b/test/failover/failover.result
> index bae57fa..31eda0c 100644
> --- a/test/failover/failover.result
> +++ b/test/failover/failover.result
> @@ -85,8 +85,8 @@ test_run:cmd('switch default')
>   --   available;
>   -- * up nearest replica priority if the best one is available
>   --   again;
> --- * replicaset uses master connection, if the nearest's one is
> ---   not available before call();
> +-- * replicaset uses next prio available connection, if the
> +--   nearest's one is not available before call();
>   -- * current nearest connection is not down, when trying to
>   --   connect to the replica with less weight.
>   --
> @@ -199,30 +199,17 @@ test_run:cmd('stop server box_1_d')
>   - true
>   ...
>   -- Down_ts must be set in on_disconnect() trigger.
> -while rs1.replica.down_ts == nil do fiber.sleep(0.1) end
> +box_1_d = rs1.replicas[names.replica_uuid.box_1_d]
>   ---
>   ...
> --- Try to execute read-only request - it must use master
> --- connection, because a replica's one is not available.
> -vshard.router.call(1, 'read', 'echo', {123})
> ----
> -- 123
> -...
> -test_run:switch('box_1_a')
> ----
> -- true
> -...
> -echo_count
> ----
> -- 2
> -...
> -test_run:switch('router_1')
> +test_run:wait_cond(function() return box_1_d.down_ts ~= nil end)
>   ---
>   - true
>   ...
>   -- New replica is box_1_b.
> -while rs1.replica.name ~= 'box_1_b' do fiber.sleep(0.1) end
> +test_run:wait_cond(function() return rs1.replica.name == 'box_1_b' end)
>   ---
> +- true
>   ...
>   rs1.replica.down_ts == nil
>   ---
> @@ -247,14 +234,49 @@ test_run:cmd('switch box_1_b')
>   ...
>   -- Ensure the 'read' echo was executed on box_1_b - nearest
>   -- available replica.
> -echo_count
> +assert(echo_count == 1)
>   ---
> -- 1
> +- true
> +...
> +test_run:switch('router_1')
> +---
> +- true
> +...
> +--
> +-- Kill the best replica. Don't need to wait for failover to happen for the
> +-- router to start using the next best available replica.
> +--
> +test_run:cmd('stop server box_1_b')
> +---
> +- true
> +...
> +box_1_b = rs1.replicas[names.replica_uuid.box_1_b]
> +---
> +...
> +test_run:wait_cond(function() return box_1_b.down_ts ~= nil end)
> +---
> +- true
> +...
> +vshard.router.callro(1, 'echo', {123})
> +---
> +- 123
> +...
> +test_run:switch('box_1_c')
> +---
> +- true
> +...
> +assert(echo_count == 1)
> +---
> +- true
>   ...
>   test_run:switch('router_1')
>   ---
>   - true
>   ...
> +test_run:cmd('start server box_1_b')
> +---
> +- true
> +...
>   -- Revive the best replica. A router must reconnect to it in
>   -- FAILOVER_UP_TIMEOUT seconds.
>   test_run:cmd('start server box_1_d')
> diff --git a/test/failover/failover.test.lua b/test/failover/failover.test.lua
> index a969e0e..b713319 100644
> --- a/test/failover/failover.test.lua
> +++ b/test/failover/failover.test.lua
> @@ -43,8 +43,8 @@ test_run:cmd('switch default')
>   --   available;
>   -- * up nearest replica priority if the best one is available
>   --   again;
> --- * replicaset uses master connection, if the nearest's one is
> ---   not available before call();
> +-- * replicaset uses next prio available connection, if the
> +--   nearest's one is not available before call();
>   -- * current nearest connection is not down, when trying to
>   --   connect to the replica with less weight.
>   --
> @@ -86,15 +86,10 @@ rs1.replica_up_ts - old_up_ts >= vshard.consts.FAILOVER_UP_TIMEOUT
>   -- box_1_d.
>   test_run:cmd('stop server box_1_d')
>   -- Down_ts must be set in on_disconnect() trigger.
> -while rs1.replica.down_ts == nil do fiber.sleep(0.1) end
> --- Try to execute read-only request - it must use master
> --- connection, because a replica's one is not available.
> -vshard.router.call(1, 'read', 'echo', {123})
> -test_run:switch('box_1_a')
> -echo_count
> -test_run:switch('router_1')
> +box_1_d = rs1.replicas[names.replica_uuid.box_1_d]
> +test_run:wait_cond(function() return box_1_d.down_ts ~= nil end)
>   -- New replica is box_1_b.
> -while rs1.replica.name ~= 'box_1_b' do fiber.sleep(0.1) end
> +test_run:wait_cond(function() return rs1.replica.name == 'box_1_b' end)
>   rs1.replica.down_ts == nil
>   rs1.replica_up_ts ~= nil
>   test_run:grep_log('router_1', 'New replica box_1_b%(storage%@')
> @@ -103,8 +98,21 @@ vshard.router.callro(1, 'echo', {123})
>   test_run:cmd('switch box_1_b')
>   -- Ensure the 'read' echo was executed on box_1_b - nearest
>   -- available replica.
> -echo_count
> +assert(echo_count == 1)
> +test_run:switch('router_1')
> +
> +--
> +-- gh-288: kill the best replica. Don't need to wait for failover to happen for
> +-- the router to start using the next best available replica.
> +--
> +test_run:cmd('stop server box_1_b')
> +box_1_b = rs1.replicas[names.replica_uuid.box_1_b]
> +test_run:wait_cond(function() return box_1_b.down_ts ~= nil end)
> +vshard.router.callro(1, 'echo', {123})
> +test_run:switch('box_1_c')
> +assert(echo_count == 1)
>   test_run:switch('router_1')
> +test_run:cmd('start server box_1_b')
>   
>   -- Revive the best replica. A router must reconnect to it in
>   -- FAILOVER_UP_TIMEOUT seconds.
> diff --git a/test/router/master_discovery.result b/test/router/master_discovery.result
> index 7ebb67d..94a848b 100644
> --- a/test/router/master_discovery.result
> +++ b/test/router/master_discovery.result
> @@ -325,6 +325,22 @@ end)
>   start_aggressive_master_search()
>    | ---
>    | ...
> +test_run:cmd('stop server storage_1_b')
> + | ---
> + | - true
> + | ...
> +rs1 = vshard.router.static.replicasets[util.replicasets[1]]
> + | ---
> + | ...
> +replica = rs1.replicas[util.name_to_uuid.storage_1_b]
> + | ---
> + | ...
> +-- Ensure the replica is not available. Otherwise RO requests sneak into it
> +-- instead of waiting for master.
> +test_run:wait_cond(function() return not replica:is_connected() end)
> + | ---
> + | - true
> + | ...
>   
>   forget_masters()
>    | ---
> @@ -337,8 +353,6 @@ vshard.router.callrw(1501, 'echo', {1}, opts_big_timeout)
>   forget_masters()
>    | ---
>    | ...
> --- XXX: this should not depend on master so much. RO requests should be able to
> --- go to replicas.
>   vshard.router.callro(1501, 'echo', {1}, opts_big_timeout)
>    | ---
>    | - 1
> @@ -357,8 +371,6 @@ vshard.router.route(1501):callrw('echo', {1}, opts_big_timeout)
>   forget_masters()
>    | ---
>    | ...
> --- XXX: the same as above - should not really wait for master. Regardless of it
> --- being auto or not.
>   vshard.router.route(1501):callro('echo', {1}, opts_big_timeout)
>    | ---
>    | - 1
> @@ -369,6 +381,10 @@ vshard.router.route(1501):callro('echo', {1}, opts_big_timeout)
>   stop_aggressive_master_search()
>    | ---
>    | ...
> +test_run:cmd('start server storage_1_b')
> + | ---
> + | - true
> + | ...
>   
>   test_run:switch('storage_1_a')
>    | ---
> @@ -411,7 +427,7 @@ vshard.storage.cfg(cfg, instance_uuid)
>    | ---
>    | ...
>   
> --- Try to make RW and RO requests but then turn of the auto search.
> +-- Try to make an RW request but then turn of the auto search.
>   test_run:switch('router_1')
>    | ---
>    | - true
> @@ -425,14 +441,6 @@ f1 = fiber.create(function()
>   end)
>    | ---
>    | ...
> --- XXX: should not really wait for master since this is an RO request. It could
> --- use a replica.
> -f2 = fiber.create(function()                                                    \
> -    fiber.self():set_joinable(true)                                             \
> -    return vshard.router.callro(1501, 'echo', {1}, opts_big_timeout)            \
> -end)
> - | ---
> - | ...
>   fiber.sleep(0.01)
>    | ---
>    | ...
> @@ -449,6 +457,31 @@ f1:join()
>    |   replicaset_uuid: <replicaset_1>
>    |   message: Master is not configured for replicaset <replicaset_1>
>    | ...
> +
> +-- Try to make an RO request but then turn of the auto search.
> +test_run:cmd('stop server storage_1_a')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('stop server storage_1_b')
> + | ---
> + | - true
> + | ...
> +forget_masters()
> + | ---
> + | ...
> +f2 = fiber.create(function()                                                    \
> +    fiber.self():set_joinable(true)                                             \
> +    return vshard.router.callro(1501, 'echo', {1}, opts_big_timeout)            \
> +end)
> + | ---
> + | ...
> +fiber.sleep(0.01)
> + | ---
> + | ...
> +disable_auto_masters()
> + | ---
> + | ...
>   f2:join()
>    | ---
>    | - true
> @@ -459,6 +492,14 @@ f2:join()
>    |   replicaset_uuid: <replicaset_1>
>    |   message: Master is not configured for replicaset <replicaset_1>
>    | ...
> +test_run:cmd('start server storage_1_a')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('start server storage_1_b')
> + | ---
> + | - true
> + | ...
>   
>   --
>   -- Multiple masters logging.
> @@ -473,6 +514,9 @@ replicas = cfg.sharding[util.replicasets[1]].replicas
>   replicas[util.name_to_uuid.storage_1_a].master = true
>    | ---
>    | ...
> +replicas[util.name_to_uuid.storage_1_b].master = false
> + | ---
> + | ...
>   vshard.storage.cfg(cfg, instance_uuid)
>    | ---
>    | ...
> @@ -484,6 +528,9 @@ test_run:switch('storage_1_b')
>   replicas = cfg.sharding[util.replicasets[1]].replicas
>    | ---
>    | ...
> +replicas[util.name_to_uuid.storage_1_a].master = false
> + | ---
> + | ...
>   replicas[util.name_to_uuid.storage_1_b].master = true
>    | ---
>    | ...
> @@ -495,6 +542,25 @@ test_run:switch('router_1')
>    | ---
>    | - true
>    | ...
> +-- Ensure both replicas are connected. Otherwise the router can go to only one,
> +-- find it is master, and won't go to the second one until the first one resigns
> +-- or dies.
> +rs1 = vshard.router.static.replicasets[util.replicasets[1]]
> + | ---
> + | ...
> +replica1 = rs1.replicas[util.name_to_uuid.storage_1_a]
> + | ---
> + | ...
> +replica2 = rs1.replicas[util.name_to_uuid.storage_1_b]
> + | ---
> + | ...
> +test_run:wait_cond(function()                                                   \
> +    return replica1:is_connected() and replica2:is_connected()                  \
> +end)
> + | ---
> + | - true
> + | ...
> +
>   forget_masters()
>    | ---
>    | ...
> diff --git a/test/router/master_discovery.test.lua b/test/router/master_discovery.test.lua
> index 6276dc9..8cacd6d 100644
> --- a/test/router/master_discovery.test.lua
> +++ b/test/router/master_discovery.test.lua
> @@ -181,24 +181,27 @@ end)
>   -- Call tries to wait for master if has enough time left.
>   --
>   start_aggressive_master_search()
> +test_run:cmd('stop server storage_1_b')
> +rs1 = vshard.router.static.replicasets[util.replicasets[1]]
> +replica = rs1.replicas[util.name_to_uuid.storage_1_b]
> +-- Ensure the replica is not available. Otherwise RO requests sneak into it
> +-- instead of waiting for master.
> +test_run:wait_cond(function() return not replica:is_connected() end)
>   
>   forget_masters()
>   vshard.router.callrw(1501, 'echo', {1}, opts_big_timeout)
>   
>   forget_masters()
> --- XXX: this should not depend on master so much. RO requests should be able to
> --- go to replicas.
>   vshard.router.callro(1501, 'echo', {1}, opts_big_timeout)
>   
>   forget_masters()
>   vshard.router.route(1501):callrw('echo', {1}, opts_big_timeout)
>   
>   forget_masters()
> --- XXX: the same as above - should not really wait for master. Regardless of it
> --- being auto or not.
>   vshard.router.route(1501):callro('echo', {1}, opts_big_timeout)
>   
>   stop_aggressive_master_search()
> +test_run:cmd('start server storage_1_b')
>   
>   test_run:switch('storage_1_a')
>   assert(echo_count == 4)
> @@ -218,23 +221,30 @@ replicas = cfg.sharding[util.replicasets[1]].replicas
>   replicas[util.name_to_uuid.storage_1_a].master = false
>   vshard.storage.cfg(cfg, instance_uuid)
>   
> --- Try to make RW and RO requests but then turn of the auto search.
> +-- Try to make an RW request but then turn of the auto search.
>   test_run:switch('router_1')
>   forget_masters()
>   f1 = fiber.create(function()                                                    \
>       fiber.self():set_joinable(true)                                             \
>       return vshard.router.callrw(1501, 'echo', {1}, opts_big_timeout)            \
>   end)
> --- XXX: should not really wait for master since this is an RO request. It could
> --- use a replica.
> +fiber.sleep(0.01)
> +disable_auto_masters()
> +f1:join()
> +
> +-- Try to make an RO request but then turn of the auto search.
> +test_run:cmd('stop server storage_1_a')
> +test_run:cmd('stop server storage_1_b')
> +forget_masters()
>   f2 = fiber.create(function()                                                    \
>       fiber.self():set_joinable(true)                                             \
>       return vshard.router.callro(1501, 'echo', {1}, opts_big_timeout)            \
>   end)
>   fiber.sleep(0.01)
>   disable_auto_masters()
> -f1:join()
>   f2:join()
> +test_run:cmd('start server storage_1_a')
> +test_run:cmd('start server storage_1_b')
>   
>   --
>   -- Multiple masters logging.
> @@ -242,14 +252,26 @@ f2:join()
>   test_run:switch('storage_1_a')
>   replicas = cfg.sharding[util.replicasets[1]].replicas
>   replicas[util.name_to_uuid.storage_1_a].master = true
> +replicas[util.name_to_uuid.storage_1_b].master = false
>   vshard.storage.cfg(cfg, instance_uuid)
>   
>   test_run:switch('storage_1_b')
>   replicas = cfg.sharding[util.replicasets[1]].replicas
> +replicas[util.name_to_uuid.storage_1_a].master = false
>   replicas[util.name_to_uuid.storage_1_b].master = true
>   vshard.storage.cfg(cfg, instance_uuid)
>   
>   test_run:switch('router_1')
> +-- Ensure both replicas are connected. Otherwise the router can go to only one,
> +-- find it is master, and won't go to the second one until the first one resigns
> +-- or dies.
> +rs1 = vshard.router.static.replicasets[util.replicasets[1]]
> +replica1 = rs1.replicas[util.name_to_uuid.storage_1_a]
> +replica2 = rs1.replicas[util.name_to_uuid.storage_1_b]
> +test_run:wait_cond(function()                                                   \
> +    return replica1:is_connected() and replica2:is_connected()                  \
> +end)
> +
>   forget_masters()
>   start_aggressive_master_search()
>   test_run:wait_log('router_1', 'Found more than one master', nil, 10)
> diff --git a/vshard/replicaset.lua b/vshard/replicaset.lua
> index 174c761..93f210e 100644
> --- a/vshard/replicaset.lua
> +++ b/vshard/replicaset.lua
> @@ -455,18 +455,25 @@ local function replicaset_template_multicallro(prefer_replica, balance)
>                       return r
>                   end
>               end
> -        elseif prefer_replica then
> -            r = replicaset.replica
> +        else
> +            local start_r = replicaset.replica
> +            r = start_r
>               while r do
> -                if r:is_connected() and r ~= master then
> +                if r:is_connected() and (not prefer_replica or r ~= master) then
>                       return r
>                   end
>                   r = r.next_by_priority
>               end
> -        else
> -            r = replicaset.replica
> -            if r and r:is_connected() then
> -                return r
> +            -- Iteration above could start not from the best prio replica.
> +            -- Check the beginning of the list too.
> +            for _, r in pairs(replicaset.priority_list) do
pairs can't be jitted (at least now until 
https://github.com/tarantool/tarantool/issues/6475).
Also pairs don't give any guarantees about iteration order. Could it be 
replaced with ipairs?
> +                if r == start_r then
> +                    -- Reached already checked part.
> +                    break
> +                end
> +                if r:is_connected() and (not prefer_replica or r ~= master) then
> +                    return r
> +                end
>               end
>           end
>       end
next prev parent reply	other threads:[~2021-12-05 17:53 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-04  0:19 [Tarantool-patches] [PATCH vshard 0/2] VShard replica backoff, part 1 Vladislav Shpilevoy via Tarantool-patches
2021-12-04  0:19 ` [Tarantool-patches] [PATCH vshard 1/2] router: drop wait_connected from master discovery Vladislav Shpilevoy via Tarantool-patches
2021-12-05 17:44   ` Oleg Babin via Tarantool-patches
2021-12-06 23:34     ` Vladislav Shpilevoy via Tarantool-patches
2021-12-07  6:49       ` Oleg Babin via Tarantool-patches
2021-12-04  0:19 ` [Tarantool-patches] [PATCH vshard 2/2] router: don't fallback RO to master right away Vladislav Shpilevoy via Tarantool-patches
2021-12-05 17:53   ` Oleg Babin via Tarantool-patches [this message]
2021-12-06 23:56     ` Vladislav Shpilevoy via Tarantool-patches
2021-12-07  6:49       ` Oleg Babin via Tarantool-patches
2021-12-07 22:38 ` [Tarantool-patches] [PATCH vshard 0/2] VShard replica backoff, part 1 Vladislav Shpilevoy via Tarantool-patches
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox
  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):
  git send-email \
    --in-reply-to=03bd48be-b017-a48f-773e-f4b8e95f8cf5@tarantool.org \
    --to=tarantool-patches@dev.tarantool.org \
    --cc=olegrok@tarantool.org \
    --cc=v.shpilevoy@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH vshard 2/2] router: don'\''t fallback RO to master right away' \
    /path/to/YOUR_REPLY
  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox