From: Vladislav Shpilevoy via Tarantool-patches <tarantool-patches@dev.tarantool.org> To: tarantool-patches@dev.tarantool.org, olegrok@tarantool.org Subject: [Tarantool-patches] [PATCH vshard 2/2] router: don't fallback RO to master right away Date: Sat, 4 Dec 2021 01:19:38 +0100 [thread overview] Message-ID: <36c68a48a627c8af98fb6f9809d2661abcd93658.1638577114.git.v.shpilevoy@tarantool.org> (raw) In-Reply-To: <cover.1638577114.git.v.shpilevoy@tarantool.org> RO requests use the replica with the highest prio as specified in the weights matrix. If the best replica is not available now and failover didn't happen yet, then RO requests used to fallback to master. Even if there were other RO replicas with better prio. This patch makes RO call firstly try the currently selected most prio replica. If it is not available (no other connected replicas at all or failover didn't happen yet), the call will try to walk the prio list starting from this replica until it finds an available one. If it also fails, the call will try to walk the list from the beginning hoping that the unavailable replica wasn't the best one and there might be better option on the other side of the prio list. The patch was done in scope of task about replica backoff (#298) because the problem would additionally exist when the best replica is in backoff, not only disconnected. It would get worse. Closes #288 Needed for #298 --- test/failover/failover.result | 64 +++++++++++++------ test/failover/failover.test.lua | 30 +++++---- test/router/master_discovery.result | 92 +++++++++++++++++++++++---- test/router/master_discovery.test.lua | 38 ++++++++--- vshard/replicaset.lua | 21 ++++-- 5 files changed, 185 insertions(+), 60 deletions(-) diff --git a/test/failover/failover.result b/test/failover/failover.result index bae57fa..31eda0c 100644 --- a/test/failover/failover.result +++ b/test/failover/failover.result @@ -85,8 +85,8 @@ test_run:cmd('switch default') -- available; -- * up nearest replica priority if the best one is available -- again; --- * replicaset uses master connection, if the nearest's one is --- not available before call(); +-- * replicaset uses next prio available connection, if the +-- nearest's one is not available before call(); -- * current nearest connection is not down, when trying to -- connect to the replica with less weight. -- @@ -199,30 +199,17 @@ test_run:cmd('stop server box_1_d') - true ... -- Down_ts must be set in on_disconnect() trigger. -while rs1.replica.down_ts == nil do fiber.sleep(0.1) end +box_1_d = rs1.replicas[names.replica_uuid.box_1_d] --- ... --- Try to execute read-only request - it must use master --- connection, because a replica's one is not available. -vshard.router.call(1, 'read', 'echo', {123}) ---- -- 123 -... -test_run:switch('box_1_a') ---- -- true -... -echo_count ---- -- 2 -... -test_run:switch('router_1') +test_run:wait_cond(function() return box_1_d.down_ts ~= nil end) --- - true ... -- New replica is box_1_b. -while rs1.replica.name ~= 'box_1_b' do fiber.sleep(0.1) end +test_run:wait_cond(function() return rs1.replica.name == 'box_1_b' end) --- +- true ... rs1.replica.down_ts == nil --- @@ -247,14 +234,49 @@ test_run:cmd('switch box_1_b') ... -- Ensure the 'read' echo was executed on box_1_b - nearest -- available replica. -echo_count +assert(echo_count == 1) --- -- 1 +- true +... +test_run:switch('router_1') +--- +- true +... +-- +-- Kill the best replica. Don't need to wait for failover to happen for the +-- router to start using the next best available replica. +-- +test_run:cmd('stop server box_1_b') +--- +- true +... +box_1_b = rs1.replicas[names.replica_uuid.box_1_b] +--- +... +test_run:wait_cond(function() return box_1_b.down_ts ~= nil end) +--- +- true +... +vshard.router.callro(1, 'echo', {123}) +--- +- 123 +... +test_run:switch('box_1_c') +--- +- true +... +assert(echo_count == 1) +--- +- true ... test_run:switch('router_1') --- - true ... +test_run:cmd('start server box_1_b') +--- +- true +... -- Revive the best replica. A router must reconnect to it in -- FAILOVER_UP_TIMEOUT seconds. test_run:cmd('start server box_1_d') diff --git a/test/failover/failover.test.lua b/test/failover/failover.test.lua index a969e0e..b713319 100644 --- a/test/failover/failover.test.lua +++ b/test/failover/failover.test.lua @@ -43,8 +43,8 @@ test_run:cmd('switch default') -- available; -- * up nearest replica priority if the best one is available -- again; --- * replicaset uses master connection, if the nearest's one is --- not available before call(); +-- * replicaset uses next prio available connection, if the +-- nearest's one is not available before call(); -- * current nearest connection is not down, when trying to -- connect to the replica with less weight. -- @@ -86,15 +86,10 @@ rs1.replica_up_ts - old_up_ts >= vshard.consts.FAILOVER_UP_TIMEOUT -- box_1_d. test_run:cmd('stop server box_1_d') -- Down_ts must be set in on_disconnect() trigger. -while rs1.replica.down_ts == nil do fiber.sleep(0.1) end --- Try to execute read-only request - it must use master --- connection, because a replica's one is not available. -vshard.router.call(1, 'read', 'echo', {123}) -test_run:switch('box_1_a') -echo_count -test_run:switch('router_1') +box_1_d = rs1.replicas[names.replica_uuid.box_1_d] +test_run:wait_cond(function() return box_1_d.down_ts ~= nil end) -- New replica is box_1_b. -while rs1.replica.name ~= 'box_1_b' do fiber.sleep(0.1) end +test_run:wait_cond(function() return rs1.replica.name == 'box_1_b' end) rs1.replica.down_ts == nil rs1.replica_up_ts ~= nil test_run:grep_log('router_1', 'New replica box_1_b%(storage%@') @@ -103,8 +98,21 @@ vshard.router.callro(1, 'echo', {123}) test_run:cmd('switch box_1_b') -- Ensure the 'read' echo was executed on box_1_b - nearest -- available replica. -echo_count +assert(echo_count == 1) +test_run:switch('router_1') + +-- +-- gh-288: kill the best replica. Don't need to wait for failover to happen for +-- the router to start using the next best available replica. +-- +test_run:cmd('stop server box_1_b') +box_1_b = rs1.replicas[names.replica_uuid.box_1_b] +test_run:wait_cond(function() return box_1_b.down_ts ~= nil end) +vshard.router.callro(1, 'echo', {123}) +test_run:switch('box_1_c') +assert(echo_count == 1) test_run:switch('router_1') +test_run:cmd('start server box_1_b') -- Revive the best replica. A router must reconnect to it in -- FAILOVER_UP_TIMEOUT seconds. diff --git a/test/router/master_discovery.result b/test/router/master_discovery.result index 7ebb67d..94a848b 100644 --- a/test/router/master_discovery.result +++ b/test/router/master_discovery.result @@ -325,6 +325,22 @@ end) start_aggressive_master_search() | --- | ... +test_run:cmd('stop server storage_1_b') + | --- + | - true + | ... +rs1 = vshard.router.static.replicasets[util.replicasets[1]] + | --- + | ... +replica = rs1.replicas[util.name_to_uuid.storage_1_b] + | --- + | ... +-- Ensure the replica is not available. Otherwise RO requests sneak into it +-- instead of waiting for master. +test_run:wait_cond(function() return not replica:is_connected() end) + | --- + | - true + | ... forget_masters() | --- @@ -337,8 +353,6 @@ vshard.router.callrw(1501, 'echo', {1}, opts_big_timeout) forget_masters() | --- | ... --- XXX: this should not depend on master so much. RO requests should be able to --- go to replicas. vshard.router.callro(1501, 'echo', {1}, opts_big_timeout) | --- | - 1 @@ -357,8 +371,6 @@ vshard.router.route(1501):callrw('echo', {1}, opts_big_timeout) forget_masters() | --- | ... --- XXX: the same as above - should not really wait for master. Regardless of it --- being auto or not. vshard.router.route(1501):callro('echo', {1}, opts_big_timeout) | --- | - 1 @@ -369,6 +381,10 @@ vshard.router.route(1501):callro('echo', {1}, opts_big_timeout) stop_aggressive_master_search() | --- | ... +test_run:cmd('start server storage_1_b') + | --- + | - true + | ... test_run:switch('storage_1_a') | --- @@ -411,7 +427,7 @@ vshard.storage.cfg(cfg, instance_uuid) | --- | ... --- Try to make RW and RO requests but then turn of the auto search. +-- Try to make an RW request but then turn of the auto search. test_run:switch('router_1') | --- | - true @@ -425,14 +441,6 @@ f1 = fiber.create(function() end) | --- | ... --- XXX: should not really wait for master since this is an RO request. It could --- use a replica. -f2 = fiber.create(function() \ - fiber.self():set_joinable(true) \ - return vshard.router.callro(1501, 'echo', {1}, opts_big_timeout) \ -end) - | --- - | ... fiber.sleep(0.01) | --- | ... @@ -449,6 +457,31 @@ f1:join() | replicaset_uuid: <replicaset_1> | message: Master is not configured for replicaset <replicaset_1> | ... + +-- Try to make an RO request but then turn of the auto search. +test_run:cmd('stop server storage_1_a') + | --- + | - true + | ... +test_run:cmd('stop server storage_1_b') + | --- + | - true + | ... +forget_masters() + | --- + | ... +f2 = fiber.create(function() \ + fiber.self():set_joinable(true) \ + return vshard.router.callro(1501, 'echo', {1}, opts_big_timeout) \ +end) + | --- + | ... +fiber.sleep(0.01) + | --- + | ... +disable_auto_masters() + | --- + | ... f2:join() | --- | - true @@ -459,6 +492,14 @@ f2:join() | replicaset_uuid: <replicaset_1> | message: Master is not configured for replicaset <replicaset_1> | ... +test_run:cmd('start server storage_1_a') + | --- + | - true + | ... +test_run:cmd('start server storage_1_b') + | --- + | - true + | ... -- -- Multiple masters logging. @@ -473,6 +514,9 @@ replicas = cfg.sharding[util.replicasets[1]].replicas replicas[util.name_to_uuid.storage_1_a].master = true | --- | ... +replicas[util.name_to_uuid.storage_1_b].master = false + | --- + | ... vshard.storage.cfg(cfg, instance_uuid) | --- | ... @@ -484,6 +528,9 @@ test_run:switch('storage_1_b') replicas = cfg.sharding[util.replicasets[1]].replicas | --- | ... +replicas[util.name_to_uuid.storage_1_a].master = false + | --- + | ... replicas[util.name_to_uuid.storage_1_b].master = true | --- | ... @@ -495,6 +542,25 @@ test_run:switch('router_1') | --- | - true | ... +-- Ensure both replicas are connected. Otherwise the router can go to only one, +-- find it is master, and won't go to the second one until the first one resigns +-- or dies. +rs1 = vshard.router.static.replicasets[util.replicasets[1]] + | --- + | ... +replica1 = rs1.replicas[util.name_to_uuid.storage_1_a] + | --- + | ... +replica2 = rs1.replicas[util.name_to_uuid.storage_1_b] + | --- + | ... +test_run:wait_cond(function() \ + return replica1:is_connected() and replica2:is_connected() \ +end) + | --- + | - true + | ... + forget_masters() | --- | ... diff --git a/test/router/master_discovery.test.lua b/test/router/master_discovery.test.lua index 6276dc9..8cacd6d 100644 --- a/test/router/master_discovery.test.lua +++ b/test/router/master_discovery.test.lua @@ -181,24 +181,27 @@ end) -- Call tries to wait for master if has enough time left. -- start_aggressive_master_search() +test_run:cmd('stop server storage_1_b') +rs1 = vshard.router.static.replicasets[util.replicasets[1]] +replica = rs1.replicas[util.name_to_uuid.storage_1_b] +-- Ensure the replica is not available. Otherwise RO requests sneak into it +-- instead of waiting for master. +test_run:wait_cond(function() return not replica:is_connected() end) forget_masters() vshard.router.callrw(1501, 'echo', {1}, opts_big_timeout) forget_masters() --- XXX: this should not depend on master so much. RO requests should be able to --- go to replicas. vshard.router.callro(1501, 'echo', {1}, opts_big_timeout) forget_masters() vshard.router.route(1501):callrw('echo', {1}, opts_big_timeout) forget_masters() --- XXX: the same as above - should not really wait for master. Regardless of it --- being auto or not. vshard.router.route(1501):callro('echo', {1}, opts_big_timeout) stop_aggressive_master_search() +test_run:cmd('start server storage_1_b') test_run:switch('storage_1_a') assert(echo_count == 4) @@ -218,23 +221,30 @@ replicas = cfg.sharding[util.replicasets[1]].replicas replicas[util.name_to_uuid.storage_1_a].master = false vshard.storage.cfg(cfg, instance_uuid) --- Try to make RW and RO requests but then turn of the auto search. +-- Try to make an RW request but then turn of the auto search. test_run:switch('router_1') forget_masters() f1 = fiber.create(function() \ fiber.self():set_joinable(true) \ return vshard.router.callrw(1501, 'echo', {1}, opts_big_timeout) \ end) --- XXX: should not really wait for master since this is an RO request. It could --- use a replica. +fiber.sleep(0.01) +disable_auto_masters() +f1:join() + +-- Try to make an RO request but then turn of the auto search. +test_run:cmd('stop server storage_1_a') +test_run:cmd('stop server storage_1_b') +forget_masters() f2 = fiber.create(function() \ fiber.self():set_joinable(true) \ return vshard.router.callro(1501, 'echo', {1}, opts_big_timeout) \ end) fiber.sleep(0.01) disable_auto_masters() -f1:join() f2:join() +test_run:cmd('start server storage_1_a') +test_run:cmd('start server storage_1_b') -- -- Multiple masters logging. @@ -242,14 +252,26 @@ f2:join() test_run:switch('storage_1_a') replicas = cfg.sharding[util.replicasets[1]].replicas replicas[util.name_to_uuid.storage_1_a].master = true +replicas[util.name_to_uuid.storage_1_b].master = false vshard.storage.cfg(cfg, instance_uuid) test_run:switch('storage_1_b') replicas = cfg.sharding[util.replicasets[1]].replicas +replicas[util.name_to_uuid.storage_1_a].master = false replicas[util.name_to_uuid.storage_1_b].master = true vshard.storage.cfg(cfg, instance_uuid) test_run:switch('router_1') +-- Ensure both replicas are connected. Otherwise the router can go to only one, +-- find it is master, and won't go to the second one until the first one resigns +-- or dies. +rs1 = vshard.router.static.replicasets[util.replicasets[1]] +replica1 = rs1.replicas[util.name_to_uuid.storage_1_a] +replica2 = rs1.replicas[util.name_to_uuid.storage_1_b] +test_run:wait_cond(function() \ + return replica1:is_connected() and replica2:is_connected() \ +end) + forget_masters() start_aggressive_master_search() test_run:wait_log('router_1', 'Found more than one master', nil, 10) diff --git a/vshard/replicaset.lua b/vshard/replicaset.lua index 174c761..93f210e 100644 --- a/vshard/replicaset.lua +++ b/vshard/replicaset.lua @@ -455,18 +455,25 @@ local function replicaset_template_multicallro(prefer_replica, balance) return r end end - elseif prefer_replica then - r = replicaset.replica + else + local start_r = replicaset.replica + r = start_r while r do - if r:is_connected() and r ~= master then + if r:is_connected() and (not prefer_replica or r ~= master) then return r end r = r.next_by_priority end - else - r = replicaset.replica - if r and r:is_connected() then - return r + -- Iteration above could start not from the best prio replica. + -- Check the beginning of the list too. + for _, r in pairs(replicaset.priority_list) do + if r == start_r then + -- Reached already checked part. + break + end + if r:is_connected() and (not prefer_replica or r ~= master) then + return r + end end end end -- 2.24.3 (Apple Git-128)
next prev parent reply other threads:[~2021-12-04 0:20 UTC|newest] Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-12-04 0:19 [Tarantool-patches] [PATCH vshard 0/2] VShard replica backoff, part 1 Vladislav Shpilevoy via Tarantool-patches 2021-12-04 0:19 ` [Tarantool-patches] [PATCH vshard 1/2] router: drop wait_connected from master discovery Vladislav Shpilevoy via Tarantool-patches 2021-12-05 17:44 ` Oleg Babin via Tarantool-patches 2021-12-06 23:34 ` Vladislav Shpilevoy via Tarantool-patches 2021-12-07 6:49 ` Oleg Babin via Tarantool-patches 2021-12-04 0:19 ` Vladislav Shpilevoy via Tarantool-patches [this message] 2021-12-05 17:53 ` [Tarantool-patches] [PATCH vshard 2/2] router: don't fallback RO to master right away Oleg Babin via Tarantool-patches 2021-12-06 23:56 ` Vladislav Shpilevoy via Tarantool-patches 2021-12-07 6:49 ` Oleg Babin via Tarantool-patches 2021-12-07 22:38 ` [Tarantool-patches] [PATCH vshard 0/2] VShard replica backoff, part 1 Vladislav Shpilevoy via Tarantool-patches
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=36c68a48a627c8af98fb6f9809d2661abcd93658.1638577114.git.v.shpilevoy@tarantool.org \ --to=tarantool-patches@dev.tarantool.org \ --cc=olegrok@tarantool.org \ --cc=v.shpilevoy@tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH vshard 2/2] router: don'\''t fallback RO to master right away' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox