Date: Sun, 5 Dec 2021 20:53:26 +0300
To: Vladislav Shpilevoy, tarantool-patches@dev.tarantool.org
Message-ID: <03bd48be-b017-a48f-773e-f4b8e95f8cf5@tarantool.org>
In-Reply-To: <36c68a48a627c8af98fb6f9809d2661abcd93658.1638577114.git.v.shpilevoy@tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH vshard 2/2] router: don't fallback RO to master right away
From: Oleg Babin via Tarantool-patches
Reply-To: Oleg Babin

Thanks for your patch! See one comment below.

Also, as I can see, CI is red on your branch. Is it related to your changes?

On 04.12.2021 03:19, Vladislav Shpilevoy wrote:
> RO requests use the replica with the highest prio as specified in
> the weights matrix.
>
> If the best replica is not available now and failover didn't
> happen yet, then RO requests used to fallback to master. Even if
> there were other RO replicas with better prio.
>
> This patch makes RO call firstly try the currently selected most
> prio replica. If it is not available (no other connected replicas
> at all or failover didn't happen yet), the call will try to walk
> the prio list starting from this replica until it finds an
> available one.
>
> If it also fails, the call will try to walk the list from the
> beginning hoping that the unavailable replica wasn't the best one
> and there might be better option on the other side of the prio
> list.
>
> The patch was done in scope of task about replica backoff (#298)
> because the problem would additionally exist when the best replica
> is in backoff, not only disconnected. It would get worse.
>
> Closes #288
> Needed for #298
> ---
>  test/failover/failover.result         | 64 +++++++++++++------
>  test/failover/failover.test.lua       | 30 +++++----
>  test/router/master_discovery.result   | 92 +++++++++++++++++++++++----
>  test/router/master_discovery.test.lua | 38 ++++++++---
>  vshard/replicaset.lua                 | 21 ++++--
>  5 files changed, 185 insertions(+), 60 deletions(-)
>
> diff --git a/test/failover/failover.result b/test/failover/failover.result
> index bae57fa..31eda0c 100644
> --- a/test/failover/failover.result
> +++ b/test/failover/failover.result
> @@ -85,8 +85,8 @@ test_run:cmd('switch default')
>  -- available;
>  -- * up nearest replica priority if the best one is available
>  -- again;
> --- * replicaset uses master connection, if the nearest's one is
> --- not available before call();
> +-- * replicaset uses next prio available connection, if the
> +-- nearest's one is not available before call();
>  -- * current nearest connection is not down, when trying to
>  -- connect to the replica with less weight.
>  --
> @@ -199,30 +199,17 @@ test_run:cmd('stop server box_1_d')
>  - true
>  ...
>  -- Down_ts must be set in on_disconnect() trigger.
> -while rs1.replica.down_ts == nil do fiber.sleep(0.1) end
> +box_1_d = rs1.replicas[names.replica_uuid.box_1_d]
>  ---
>  ...
> --- Try to execute read-only request - it must use master
> --- connection, because a replica's one is not available.
> -vshard.router.call(1, 'read', 'echo', {123})
> ----
> -- 123
> -...
> -test_run:switch('box_1_a')
> ----
> -- true
> -...
> -echo_count
> ----
> -- 2
> -...
> -test_run:switch('router_1')
> +test_run:wait_cond(function() return box_1_d.down_ts ~= nil end)
>  ---
>  - true
>  ...
>  -- New replica is box_1_b.
> -while rs1.replica.name ~= 'box_1_b' do fiber.sleep(0.1) end
> +test_run:wait_cond(function() return rs1.replica.name == 'box_1_b' end)
>  ---
> +- true
>  ...
>  rs1.replica.down_ts == nil
>  ---
> @@ -247,14 +234,49 @@ test_run:cmd('switch box_1_b')
>  ...
>  -- Ensure the 'read' echo was executed on box_1_b - nearest
>  -- available replica.
> -echo_count
> +assert(echo_count == 1)
>  ---
> -- 1
> +- true
> +...
> +test_run:switch('router_1')
> +---
> +- true
> +...
> +--
> +-- Kill the best replica. Don't need to wait for failover to happen for the
> +-- router to start using the next best available replica.
> +--
> +test_run:cmd('stop server box_1_b')
> +---
> +- true
> +...
> +box_1_b = rs1.replicas[names.replica_uuid.box_1_b]
> +---
> +...
> +test_run:wait_cond(function() return box_1_b.down_ts ~= nil end)
> +---
> +- true
> +...
> +vshard.router.callro(1, 'echo', {123})
> +---
> +- 123
> +...
> +test_run:switch('box_1_c')
> +---
> +- true
> +...
> +assert(echo_count == 1)
> +---
> +- true
>  ...
>  test_run:switch('router_1')
>  ---
>  - true
>  ...
> +test_run:cmd('start server box_1_b')
> +---
> +- true
> +...
>  -- Revive the best replica. A router must reconnect to it in
>  -- FAILOVER_UP_TIMEOUT seconds.
>  test_run:cmd('start server box_1_d')
> diff --git a/test/failover/failover.test.lua b/test/failover/failover.test.lua
> index a969e0e..b713319 100644
> --- a/test/failover/failover.test.lua
> +++ b/test/failover/failover.test.lua
> @@ -43,8 +43,8 @@ test_run:cmd('switch default')
>  -- available;
>  -- * up nearest replica priority if the best one is available
>  -- again;
> --- * replicaset uses master connection, if the nearest's one is
> --- not available before call();
> +-- * replicaset uses next prio available connection, if the
> +-- nearest's one is not available before call();
>  -- * current nearest connection is not down, when trying to
>  -- connect to the replica with less weight.
>  --
> @@ -86,15 +86,10 @@ rs1.replica_up_ts - old_up_ts >= vshard.consts.FAILOVER_UP_TIMEOUT
>  -- box_1_d.
>  test_run:cmd('stop server box_1_d')
>  -- Down_ts must be set in on_disconnect() trigger.
> -while rs1.replica.down_ts == nil do fiber.sleep(0.1) end
> --- Try to execute read-only request - it must use master
> --- connection, because a replica's one is not available.
> -vshard.router.call(1, 'read', 'echo', {123})
> -test_run:switch('box_1_a')
> -echo_count
> -test_run:switch('router_1')
> +box_1_d = rs1.replicas[names.replica_uuid.box_1_d]
> +test_run:wait_cond(function() return box_1_d.down_ts ~= nil end)
>  -- New replica is box_1_b.
> -while rs1.replica.name ~= 'box_1_b' do fiber.sleep(0.1) end
> +test_run:wait_cond(function() return rs1.replica.name == 'box_1_b' end)
>  rs1.replica.down_ts == nil
>  rs1.replica_up_ts ~= nil
>  test_run:grep_log('router_1', 'New replica box_1_b%(storage%@')
> @@ -103,8 +98,21 @@ vshard.router.callro(1, 'echo', {123})
>  test_run:cmd('switch box_1_b')
>  -- Ensure the 'read' echo was executed on box_1_b - nearest
>  -- available replica.
> -echo_count
> +assert(echo_count == 1)
> +test_run:switch('router_1')
> +
> +--
> +-- gh-288: kill the best replica. Don't need to wait for failover to happen for
> +-- the router to start using the next best available replica.
> +--
> +test_run:cmd('stop server box_1_b')
> +box_1_b = rs1.replicas[names.replica_uuid.box_1_b]
> +test_run:wait_cond(function() return box_1_b.down_ts ~= nil end)
> +vshard.router.callro(1, 'echo', {123})
> +test_run:switch('box_1_c')
> +assert(echo_count == 1)
>  test_run:switch('router_1')
> +test_run:cmd('start server box_1_b')
>
>  -- Revive the best replica. A router must reconnect to it in
>  -- FAILOVER_UP_TIMEOUT seconds.
> diff --git a/test/router/master_discovery.result b/test/router/master_discovery.result
> index 7ebb67d..94a848b 100644
> --- a/test/router/master_discovery.result
> +++ b/test/router/master_discovery.result
> @@ -325,6 +325,22 @@ end)
>  start_aggressive_master_search()
>  | ---
>  | ...
> +test_run:cmd('stop server storage_1_b')
> + | ---
> + | - true
> + | ...
> +rs1 = vshard.router.static.replicasets[util.replicasets[1]]
> + | ---
> + | ...
> +replica = rs1.replicas[util.name_to_uuid.storage_1_b]
> + | ---
> + | ...
> +-- Ensure the replica is not available. Otherwise RO requests sneak into it
> +-- instead of waiting for master.
> +test_run:wait_cond(function() return not replica:is_connected() end)
> + | ---
> + | - true
> + | ...
>
>  forget_masters()
>  | ---
> @@ -337,8 +353,6 @@ vshard.router.callrw(1501, 'echo', {1}, opts_big_timeout)
>  forget_masters()
>  | ---
>  | ...
> --- XXX: this should not depend on master so much. RO requests should be able to
> --- go to replicas.
>  vshard.router.callro(1501, 'echo', {1}, opts_big_timeout)
>  | ---
>  | - 1
> @@ -357,8 +371,6 @@ vshard.router.route(1501):callrw('echo', {1}, opts_big_timeout)
>  forget_masters()
>  | ---
>  | ...
> --- XXX: the same as above - should not really wait for master. Regardless of it
> --- being auto or not.
>  vshard.router.route(1501):callro('echo', {1}, opts_big_timeout)
>  | ---
>  | - 1
> @@ -369,6 +381,10 @@ vshard.router.route(1501):callro('echo', {1}, opts_big_timeout)
>  stop_aggressive_master_search()
>  | ---
>  | ...
> +test_run:cmd('start server storage_1_b')
> + | ---
> + | - true
> + | ...
>
>  test_run:switch('storage_1_a')
>  | ---
>  | - true
>  | ...
> @@ -411,7 +427,7 @@ vshard.storage.cfg(cfg, instance_uuid)
>  | ---
>  | ...
>
> --- Try to make RW and RO requests but then turn of the auto search.
> +-- Try to make an RW request but then turn of the auto search.
>  test_run:switch('router_1')
>  | ---
>  | - true
>  | ...
>  forget_masters()
>  | ---
>  | ...
>  f1 = fiber.create(function() \
>      fiber.self():set_joinable(true) \
>      return vshard.router.callrw(1501, 'echo', {1}, opts_big_timeout) \
>  end)
>  | ---
>  | ...
> --- XXX: should not really wait for master since this is an RO request. It could
> --- use a replica.
> -f2 = fiber.create(function() \
> -    fiber.self():set_joinable(true) \
> -    return vshard.router.callro(1501, 'echo', {1}, opts_big_timeout) \
> -end)
> - | ---
> - | ...
>  fiber.sleep(0.01)
>  | ---
>  | ...
> @@ -449,6 +457,31 @@ f1:join()
>  | replicaset_uuid:
>  | message: Master is not configured for replicaset
>  | ...
> +
> +-- Try to make an RO request but then turn of the auto search.
> +test_run:cmd('stop server storage_1_a')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('stop server storage_1_b')
> + | ---
> + | - true
> + | ...
> +forget_masters()
> + | ---
> + | ...
> +f2 = fiber.create(function() \
> +    fiber.self():set_joinable(true) \
> +    return vshard.router.callro(1501, 'echo', {1}, opts_big_timeout) \
> +end)
> + | ---
> + | ...
> +fiber.sleep(0.01)
> + | ---
> + | ...
> +disable_auto_masters()
> + | ---
> + | ...
>  f2:join()
>  | ---
>  | - true
> @@ -459,6 +492,14 @@ f2:join()
>  | replicaset_uuid:
>  | message: Master is not configured for replicaset
>  | ...
> +test_run:cmd('start server storage_1_a')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('start server storage_1_b')
> + | ---
> + | - true
> + | ...
>
>  --
>  -- Multiple masters logging.
>  --
> @@ -473,6 +514,9 @@ replicas = cfg.sharding[util.replicasets[1]].replicas
>  replicas[util.name_to_uuid.storage_1_a].master = true
>  | ---
>  | ...
> +replicas[util.name_to_uuid.storage_1_b].master = false
> + | ---
> + | ...
>  vshard.storage.cfg(cfg, instance_uuid)
>  | ---
>  | ...
> @@ -484,6 +528,9 @@ test_run:switch('storage_1_b')
>  replicas = cfg.sharding[util.replicasets[1]].replicas
>  | ---
>  | ...
> +replicas[util.name_to_uuid.storage_1_a].master = false
> + | ---
> + | ...
>  replicas[util.name_to_uuid.storage_1_b].master = true
>  | ---
>  | ...
> @@ -495,6 +542,25 @@ test_run:switch('router_1')
>  | ---
>  | - true
>  | ...
> +-- Ensure both replicas are connected. Otherwise the router can go to only one,
> +-- find it is master, and won't go to the second one until the first one resigns
> +-- or dies.
> +rs1 = vshard.router.static.replicasets[util.replicasets[1]]
> + | ---
> + | ...
> +replica1 = rs1.replicas[util.name_to_uuid.storage_1_a]
> + | ---
> + | ...
> +replica2 = rs1.replicas[util.name_to_uuid.storage_1_b]
> + | ---
> + | ...
> +test_run:wait_cond(function() \
> +    return replica1:is_connected() and replica2:is_connected() \
> +end)
> + | ---
> + | - true
> + | ...
> +
>  forget_masters()
>  | ---
>  | ...
> diff --git a/test/router/master_discovery.test.lua b/test/router/master_discovery.test.lua
> index 6276dc9..8cacd6d 100644
> --- a/test/router/master_discovery.test.lua
> +++ b/test/router/master_discovery.test.lua
> @@ -181,24 +181,27 @@ end)
>  -- Call tries to wait for master if has enough time left.
>  --
>  start_aggressive_master_search()
> +test_run:cmd('stop server storage_1_b')
> +rs1 = vshard.router.static.replicasets[util.replicasets[1]]
> +replica = rs1.replicas[util.name_to_uuid.storage_1_b]
> +-- Ensure the replica is not available. Otherwise RO requests sneak into it
> +-- instead of waiting for master.
> +test_run:wait_cond(function() return not replica:is_connected() end)
>
>  forget_masters()
>  vshard.router.callrw(1501, 'echo', {1}, opts_big_timeout)
>
>  forget_masters()
> --- XXX: this should not depend on master so much. RO requests should be able to
> --- go to replicas.
>  vshard.router.callro(1501, 'echo', {1}, opts_big_timeout)
>
>  forget_masters()
>  vshard.router.route(1501):callrw('echo', {1}, opts_big_timeout)
>
>  forget_masters()
> --- XXX: the same as above - should not really wait for master. Regardless of it
> --- being auto or not.
>  vshard.router.route(1501):callro('echo', {1}, opts_big_timeout)
>
>  stop_aggressive_master_search()
> +test_run:cmd('start server storage_1_b')
>
>  test_run:switch('storage_1_a')
>  assert(echo_count == 4)
> @@ -218,23 +221,30 @@ replicas = cfg.sharding[util.replicasets[1]].replicas
>  replicas[util.name_to_uuid.storage_1_a].master = false
>  vshard.storage.cfg(cfg, instance_uuid)
>
> --- Try to make RW and RO requests but then turn of the auto search.
> +-- Try to make an RW request but then turn of the auto search.
>  test_run:switch('router_1')
>  forget_masters()
>  f1 = fiber.create(function() \
>      fiber.self():set_joinable(true) \
>      return vshard.router.callrw(1501, 'echo', {1}, opts_big_timeout) \
>  end)
> --- XXX: should not really wait for master since this is an RO request. It could
> --- use a replica.
> +fiber.sleep(0.01)
> +disable_auto_masters()
> +f1:join()
> +
> +-- Try to make an RO request but then turn of the auto search.
> +test_run:cmd('stop server storage_1_a')
> +test_run:cmd('stop server storage_1_b')
> +forget_masters()
>  f2 = fiber.create(function() \
>      fiber.self():set_joinable(true) \
>      return vshard.router.callro(1501, 'echo', {1}, opts_big_timeout) \
>  end)
>  fiber.sleep(0.01)
>  disable_auto_masters()
> -f1:join()
>  f2:join()
> +test_run:cmd('start server storage_1_a')
> +test_run:cmd('start server storage_1_b')
>
>  --
>  -- Multiple masters logging.
>  --
> @@ -242,14 +252,26 @@ f2:join()
>  test_run:switch('storage_1_a')
>  replicas = cfg.sharding[util.replicasets[1]].replicas
>  replicas[util.name_to_uuid.storage_1_a].master = true
> +replicas[util.name_to_uuid.storage_1_b].master = false
>  vshard.storage.cfg(cfg, instance_uuid)
>
>  test_run:switch('storage_1_b')
>  replicas = cfg.sharding[util.replicasets[1]].replicas
> +replicas[util.name_to_uuid.storage_1_a].master = false
>  replicas[util.name_to_uuid.storage_1_b].master = true
>  vshard.storage.cfg(cfg, instance_uuid)
>
>  test_run:switch('router_1')
> +-- Ensure both replicas are connected. Otherwise the router can go to only one,
> +-- find it is master, and won't go to the second one until the first one resigns
> +-- or dies.
> +rs1 = vshard.router.static.replicasets[util.replicasets[1]]
> +replica1 = rs1.replicas[util.name_to_uuid.storage_1_a]
> +replica2 = rs1.replicas[util.name_to_uuid.storage_1_b]
> +test_run:wait_cond(function() \
> +    return replica1:is_connected() and replica2:is_connected() \
> +end)
> +
>  forget_masters()
>  start_aggressive_master_search()
>  test_run:wait_log('router_1', 'Found more than one master', nil, 10)
> diff --git a/vshard/replicaset.lua b/vshard/replicaset.lua
> index 174c761..93f210e 100644
> --- a/vshard/replicaset.lua
> +++ b/vshard/replicaset.lua
> @@ -455,18 +455,25 @@ local function replicaset_template_multicallro(prefer_replica, balance)
>                  return r
>              end
>          end
> -    elseif prefer_replica then
> -        r = replicaset.replica
> +    else
> +        local start_r = replicaset.replica
> +        r = start_r
>          while r do
> -            if r:is_connected() and r ~= master then
> +            if r:is_connected() and (not prefer_replica or r ~= master) then
>                  return r
>              end
>              r = r.next_by_priority
>          end
> -    else
> -        r = replicaset.replica
> -        if r and r:is_connected() then
> -            return r
> +        -- Iteration above could start not from the best prio replica.
> +        -- Check the beginning of the list too.
> +        for _, r in pairs(replicaset.priority_list) do

pairs() can't be JIT-compiled (at least for now, until
https://github.com/tarantool/tarantool/issues/6475 is resolved). Also, pairs()
doesn't give any guarantees about the iteration order. Could it be replaced
with ipairs()?

> +            if r == start_r then
> +                -- Reached already checked part.
> +                break
> +            end
> +            if r:is_connected() and (not prefer_replica or r ~= master) then
> +                return r
> +            end
> +        end
>      end
>  end
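
To illustrate the ipairs() suggestion above: assuming priority_list is a dense
Lua array (which the priority-ordered walk seems to imply), it looks like a
drop-in replacement. A rough, untested sketch of what I mean, not a final
implementation:

    -- Same walk as in the patch, but ipairs() iterates the array part
    -- strictly in index order (i.e. in priority order) and, unlike
    -- pairs(), can be compiled by LuaJIT.
    for _, r in ipairs(replicaset.priority_list) do
        if r == start_r then
            -- Reached the already checked part.
            break
        end
        if r:is_connected() and (not prefer_replica or r ~= master) then
            return r
        end
    end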