Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH vshard 1/1] replicaset: check URI match when rebind connection
@ 2020-08-06 21:48 Vladislav Shpilevoy
  2020-08-06 22:02 ` Yaroslav Dynnikov
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Vladislav Shpilevoy @ 2020-08-06 21:48 UTC (permalink / raw)
  To: tarantool-patches, yaroslav.dynnikov, olegrok

Reconfiguration in router and storage always creates new
replica objects, and outdates old objects. Since most of the
time reconfiguration does not change cluster topology and URIs,
old connections are relocated to the new replicaset objects so as
not to reconnect on each router.cfg() and storage.cfg().

Connections were moved from old objects to new ones by UUID match
not taking into account possible URI change.

If URI was changed, the new replica object took the old connection
with old URI, and kept trying to connect to the old address.

The patch makes connection relocation work only if old and new URI
match.

Closes #245
---
Branch: http://github.com/tarantool/vshard/tree/gerold103/gh-245-router-reconnect
Issue: https://github.com/tarantool/vshard/issues/245

 test/router/reconnect_to_master.result   | 55 ++++++++++++++++++++++++
 test/router/reconnect_to_master.test.lua | 26 +++++++++++
 vshard/replicaset.lua                    |  2 +-
 3 files changed, 82 insertions(+), 1 deletion(-)

diff --git a/test/router/reconnect_to_master.result b/test/router/reconnect_to_master.result
index d640f0b..aa33f9c 100644
--- a/test/router/reconnect_to_master.result
+++ b/test/router/reconnect_to_master.result
@@ -169,6 +169,61 @@ is_disconnected()
 ---
 - false
 ...
+--
+-- gh-245: dynamic uri reconfiguration didn't work - even if URI was changed in
+-- the config for any instance, it used old connection, because reconfiguration
+-- compared connections by UUID instead of URI.
+--
+util = require('util')
+---
+...
+-- Firstly, clean router from storage_1_a connection.
+rs1_uuid = util.replicasets[1]
+---
+...
+rs1_cfg = cfg.sharding[rs1_uuid]
+---
+...
+cfg.sharding[rs1_uuid] = nil
+---
+...
+vshard.router.cfg(cfg)
+---
+...
+-- Now break the URI in the config.
+old_uri = rs1_cfg.replicas[util.name_to_uuid.storage_1_a].uri
+---
+...
+rs1_cfg.replicas[util.name_to_uuid.storage_1_a].uri = 'https://bad_uri.com:123'
+---
+...
+-- Apply the bad config.
+cfg.sharding[rs1_uuid] = rs1_cfg
+---
+...
+vshard.router.cfg(cfg)
+---
+...
+-- Should fail - master is not available because of the bad URI.
+res, err = vshard.router.callrw(1, 'echo', {1})
+---
+...
+res == nil and err ~= nil
+---
+- true
+...
+-- Repair the config.
+rs1_cfg.replicas[util.name_to_uuid.storage_1_a].uri = old_uri
+---
+...
+vshard.router.cfg(cfg)
+---
+...
+-- Should drop the old connection object and connect fine.
+vshard.router.callrw(1, 'echo', {1})
+---
+- 1
+...
 _ = test_run:switch("default")
 ---
 ...
diff --git a/test/router/reconnect_to_master.test.lua b/test/router/reconnect_to_master.test.lua
index c315c9f..87270af 100644
--- a/test/router/reconnect_to_master.test.lua
+++ b/test/router/reconnect_to_master.test.lua
@@ -70,6 +70,32 @@ while is_disconnected() and i < max_iters do i = i + 1 fiber.sleep(0.1) end
 -- Master connection is active again.
 is_disconnected()
 
+--
+-- gh-245: dynamic uri reconfiguration didn't work - even if URI was changed in
+-- the config for any instance, it used old connection, because reconfiguration
+-- compared connections by UUID instead of URI.
+--
+util = require('util')
+-- Firstly, clean router from storage_1_a connection.
+rs1_uuid = util.replicasets[1]
+rs1_cfg = cfg.sharding[rs1_uuid]
+cfg.sharding[rs1_uuid] = nil
+vshard.router.cfg(cfg)
+-- Now break the URI in the config.
+old_uri = rs1_cfg.replicas[util.name_to_uuid.storage_1_a].uri
+rs1_cfg.replicas[util.name_to_uuid.storage_1_a].uri = 'https://bad_uri.com:123'
+-- Apply the bad config.
+cfg.sharding[rs1_uuid] = rs1_cfg
+vshard.router.cfg(cfg)
+-- Should fail - master is not available because of the bad URI.
+res, err = vshard.router.callrw(1, 'echo', {1})
+res == nil and err ~= nil
+-- Repair the config.
+rs1_cfg.replicas[util.name_to_uuid.storage_1_a].uri = old_uri
+vshard.router.cfg(cfg)
+-- Should drop the old connection object and connect fine.
+vshard.router.callrw(1, 'echo', {1})
+
 _ = test_run:switch("default")
 _ = test_run:cmd('stop server router_1')
 _ = test_run:cmd('cleanup server router_1')
diff --git a/vshard/replicaset.lua b/vshard/replicaset.lua
index 0191936..b13d05e 100644
--- a/vshard/replicaset.lua
+++ b/vshard/replicaset.lua
@@ -456,7 +456,7 @@ local function rebind_replicasets(replicasets, old_replicasets)
         for replica_uuid, replica in pairs(replicaset.replicas) do
             local old_replica = old_replicaset and
                                 old_replicaset.replicas[replica_uuid]
-            if old_replica then
+            if old_replica and old_replica.uri == replica.uri then
                 local conn = old_replica.conn
                 replica.conn = conn
                 replica.down_ts = old_replica.down_ts
-- 
2.21.1 (Apple Git-122.3)

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Tarantool-patches] [PATCH vshard 1/1] replicaset: check URI match when rebind connection
  2020-08-06 21:48 [Tarantool-patches] [PATCH vshard 1/1] replicaset: check URI match when rebind connection Vladislav Shpilevoy
@ 2020-08-06 22:02 ` Yaroslav Dynnikov
  2020-08-07  8:07 ` Oleg Babin
  2020-08-07 21:56 ` Vladislav Shpilevoy
  2 siblings, 0 replies; 4+ messages in thread
From: Yaroslav Dynnikov @ 2020-08-06 22:02 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: Yaroslav Dynnikov, tml

[-- Attachment #1: Type: text/plain, Size: 5180 bytes --]

Hi!

Thanks for the fix. LGTM.

Best regards
Yaroslav Dynnikov


On Fri, 7 Aug 2020 at 00:48, Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
wrote:

> Reconfiguration in router and storage always creates new
> replica objects, and outdates old objects. Since most of the
> time reconfiguration does not change cluster topology and URIs,
> old connections are relocated to the new replicaset objects so as
> not to reconnect on each router.cfg() and storage.cfg().
>
> Connections were moved from old objects to new ones by UUID match
> not taking into account possible URI change.
>
> If URI was changed, the new replica object took the old connection
> with old URI, and kept trying to connect to the old address.
>
> The patch makes connection relocation work only if old and new URI
> match.
>
> Closes #245
> ---
> Branch:
> http://github.com/tarantool/vshard/tree/gerold103/gh-245-router-reconnect
> Issue: https://github.com/tarantool/vshard/issues/245
>
>  test/router/reconnect_to_master.result   | 55 ++++++++++++++++++++++++
>  test/router/reconnect_to_master.test.lua | 26 +++++++++++
>  vshard/replicaset.lua                    |  2 +-
>  3 files changed, 82 insertions(+), 1 deletion(-)
>
> diff --git a/test/router/reconnect_to_master.result
> b/test/router/reconnect_to_master.result
> index d640f0b..aa33f9c 100644
> --- a/test/router/reconnect_to_master.result
> +++ b/test/router/reconnect_to_master.result
> @@ -169,6 +169,61 @@ is_disconnected()
>  ---
>  - false
>  ...
> +--
> +-- gh-245: dynamic uri reconfiguration didn't work - even if URI was
> changed in
> +-- the config for any instance, it used old connection, because
> reconfiguration
> +-- compared connections by UUID instead of URI.
> +--
> +util = require('util')
> +---
> +...
> +-- Firstly, clean router from storage_1_a connection.
> +rs1_uuid = util.replicasets[1]
> +---
> +...
> +rs1_cfg = cfg.sharding[rs1_uuid]
> +---
> +...
> +cfg.sharding[rs1_uuid] = nil
> +---
> +...
> +vshard.router.cfg(cfg)
> +---
> +...
> +-- Now break the URI in the config.
> +old_uri = rs1_cfg.replicas[util.name_to_uuid.storage_1_a].uri
> +---
> +...
> +rs1_cfg.replicas[util.name_to_uuid.storage_1_a].uri = '
> https://bad_uri.com:123'
> +---
> +...
> +-- Apply the bad config.
> +cfg.sharding[rs1_uuid] = rs1_cfg
> +---
> +...
> +vshard.router.cfg(cfg)
> +---
> +...
> +-- Should fail - master is not available because of the bad URI.
> +res, err = vshard.router.callrw(1, 'echo', {1})
> +---
> +...
> +res == nil and err ~= nil
> +---
> +- true
> +...
> +-- Repair the config.
> +rs1_cfg.replicas[util.name_to_uuid.storage_1_a].uri = old_uri
> +---
> +...
> +vshard.router.cfg(cfg)
> +---
> +...
> +-- Should drop the old connection object and connect fine.
> +vshard.router.callrw(1, 'echo', {1})
> +---
> +- 1
> +...
>  _ = test_run:switch("default")
>  ---
>  ...
> diff --git a/test/router/reconnect_to_master.test.lua
> b/test/router/reconnect_to_master.test.lua
> index c315c9f..87270af 100644
> --- a/test/router/reconnect_to_master.test.lua
> +++ b/test/router/reconnect_to_master.test.lua
> @@ -70,6 +70,32 @@ while is_disconnected() and i < max_iters do i = i + 1
> fiber.sleep(0.1) end
>  -- Master connection is active again.
>  is_disconnected()
>
> +--
> +-- gh-245: dynamic uri reconfiguration didn't work - even if URI was
> changed in
> +-- the config for any instance, it used old connection, because
> reconfiguration
> +-- compared connections by UUID instead of URI.
> +--
> +util = require('util')
> +-- Firstly, clean router from storage_1_a connection.
> +rs1_uuid = util.replicasets[1]
> +rs1_cfg = cfg.sharding[rs1_uuid]
> +cfg.sharding[rs1_uuid] = nil
> +vshard.router.cfg(cfg)
> +-- Now break the URI in the config.
> +old_uri = rs1_cfg.replicas[util.name_to_uuid.storage_1_a].uri
> +rs1_cfg.replicas[util.name_to_uuid.storage_1_a].uri = '
> https://bad_uri.com:123'
> +-- Apply the bad config.
> +cfg.sharding[rs1_uuid] = rs1_cfg
> +vshard.router.cfg(cfg)
> +-- Should fail - master is not available because of the bad URI.
> +res, err = vshard.router.callrw(1, 'echo', {1})
> +res == nil and err ~= nil
> +-- Repair the config.
> +rs1_cfg.replicas[util.name_to_uuid.storage_1_a].uri = old_uri
> +vshard.router.cfg(cfg)
> +-- Should drop the old connection object and connect fine.
> +vshard.router.callrw(1, 'echo', {1})
> +
>  _ = test_run:switch("default")
>  _ = test_run:cmd('stop server router_1')
>  _ = test_run:cmd('cleanup server router_1')
> diff --git a/vshard/replicaset.lua b/vshard/replicaset.lua
> index 0191936..b13d05e 100644
> --- a/vshard/replicaset.lua
> +++ b/vshard/replicaset.lua
> @@ -456,7 +456,7 @@ local function rebind_replicasets(replicasets,
> old_replicasets)
>          for replica_uuid, replica in pairs(replicaset.replicas) do
>              local old_replica = old_replicaset and
>                                  old_replicaset.replicas[replica_uuid]
> -            if old_replica then
> +            if old_replica and old_replica.uri == replica.uri then
>                  local conn = old_replica.conn
>                  replica.conn = conn
>                  replica.down_ts = old_replica.down_ts
> --
> 2.21.1 (Apple Git-122.3)
>
>

[-- Attachment #2: Type: text/html, Size: 6679 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Tarantool-patches] [PATCH vshard 1/1] replicaset: check URI match when rebind connection
  2020-08-06 21:48 [Tarantool-patches] [PATCH vshard 1/1] replicaset: check URI match when rebind connection Vladislav Shpilevoy
  2020-08-06 22:02 ` Yaroslav Dynnikov
@ 2020-08-07  8:07 ` Oleg Babin
  2020-08-07 21:56 ` Vladislav Shpilevoy
  2 siblings, 0 replies; 4+ messages in thread
From: Oleg Babin @ 2020-08-07  8:07 UTC (permalink / raw)
  To: Vladislav Shpilevoy, tarantool-patches, yaroslav.dynnikov

Hi! Thanks for your patch. LGTM.

On 07/08/2020 00:48, Vladislav Shpilevoy wrote:
> Reconfiguration in router and storage always creates new
> replica objects, and outdates old objects. Since most of the
> time reconfiguration does not change cluster topology and URIs,
> old connections are relocated to the new replicaset objects so as
> not to reconnect on each router.cfg() and storage.cfg().
>
> Connections were moved from old objects to new ones by UUID match
> not taking into account possible URI change.
>
> If URI was changed, the new replica object took the old connection
> with old URI, and kept trying to connect to the old address.
>
> The patch makes connection relocation work only if old and new URI
> match.
>
> Closes #245
> ---
> Branch:http://github.com/tarantool/vshard/tree/gerold103/gh-245-router-reconnect
> Issue:https://github.com/tarantool/vshard/issues/245
>
>   test/router/reconnect_to_master.result   | 55 ++++++++++++++++++++++++
>   test/router/reconnect_to_master.test.lua | 26 +++++++++++
>   vshard/replicaset.lua                    |  2 +-
>   3 files changed, 82 insertions(+), 1 deletion(-)
>
> diff --git a/test/router/reconnect_to_master.result b/test/router/reconnect_to_master.result
> index d640f0b..aa33f9c 100644
> --- a/test/router/reconnect_to_master.result
> +++ b/test/router/reconnect_to_master.result
> @@ -169,6 +169,61 @@ is_disconnected()
>   ---
>   - false
>   ...
> +--
> +-- gh-245: dynamic uri reconfiguration didn't work - even if URI was changed in
> +-- the config for any instance, it used old connection, because reconfiguration
> +-- compared connections by UUID instead of URI.
> +--
> +util = require('util')
> +---
> +...
> +-- Firstly, clean router from storage_1_a connection.
> +rs1_uuid = util.replicasets[1]
> +---
> +...
> +rs1_cfg = cfg.sharding[rs1_uuid]
> +---
> +...
> +cfg.sharding[rs1_uuid] = nil
> +---
> +...
> +vshard.router.cfg(cfg)
> +---
> +...
> +-- Now break the URI in the config.
> +old_uri = rs1_cfg.replicas[util.name_to_uuid.storage_1_a].uri
> +---
> +...
> +rs1_cfg.replicas[util.name_to_uuid.storage_1_a].uri = 'https://bad_uri.com:123'
> +---
> +...
> +-- Apply the bad config.
> +cfg.sharding[rs1_uuid] = rs1_cfg
> +---
> +...
> +vshard.router.cfg(cfg)
> +---
> +...
> +-- Should fail - master is not available because of the bad URI.
> +res, err = vshard.router.callrw(1, 'echo', {1})
> +---
> +...
> +res == nil and err ~= nil
> +---
> +- true
> +...
> +-- Repair the config.
> +rs1_cfg.replicas[util.name_to_uuid.storage_1_a].uri = old_uri
> +---
> +...
> +vshard.router.cfg(cfg)
> +---
> +...
> +-- Should drop the old connection object and connect fine.
> +vshard.router.callrw(1, 'echo', {1})
> +---
> +- 1
> +...
>   _ = test_run:switch("default")
>   ---
>   ...
> diff --git a/test/router/reconnect_to_master.test.lua b/test/router/reconnect_to_master.test.lua
> index c315c9f..87270af 100644
> --- a/test/router/reconnect_to_master.test.lua
> +++ b/test/router/reconnect_to_master.test.lua
> @@ -70,6 +70,32 @@ while is_disconnected() and i < max_iters do i = i + 1 fiber.sleep(0.1) end
>   -- Master connection is active again.
>   is_disconnected()
>   
> +--
> +-- gh-245: dynamic uri reconfiguration didn't work - even if URI was changed in
> +-- the config for any instance, it used old connection, because reconfiguration
> +-- compared connections by UUID instead of URI.
> +--
> +util = require('util')
> +-- Firstly, clean router from storage_1_a connection.
> +rs1_uuid = util.replicasets[1]
> +rs1_cfg = cfg.sharding[rs1_uuid]
> +cfg.sharding[rs1_uuid] = nil
> +vshard.router.cfg(cfg)
> +-- Now break the URI in the config.
> +old_uri = rs1_cfg.replicas[util.name_to_uuid.storage_1_a].uri
> +rs1_cfg.replicas[util.name_to_uuid.storage_1_a].uri = 'https://bad_uri.com:123'
> +-- Apply the bad config.
> +cfg.sharding[rs1_uuid] = rs1_cfg
> +vshard.router.cfg(cfg)
> +-- Should fail - master is not available because of the bad URI.
> +res, err = vshard.router.callrw(1, 'echo', {1})
> +res == nil and err ~= nil
> +-- Repair the config.
> +rs1_cfg.replicas[util.name_to_uuid.storage_1_a].uri = old_uri
> +vshard.router.cfg(cfg)
> +-- Should drop the old connection object and connect fine.
> +vshard.router.callrw(1, 'echo', {1})
> +
>   _ = test_run:switch("default")
>   _ = test_run:cmd('stop server router_1')
>   _ = test_run:cmd('cleanup server router_1')
> diff --git a/vshard/replicaset.lua b/vshard/replicaset.lua
> index 0191936..b13d05e 100644
> --- a/vshard/replicaset.lua
> +++ b/vshard/replicaset.lua
> @@ -456,7 +456,7 @@ local function rebind_replicasets(replicasets, old_replicasets)
>           for replica_uuid, replica in pairs(replicaset.replicas) do
>               local old_replica = old_replicaset and
>                                   old_replicaset.replicas[replica_uuid]
> -            if old_replica then
> +            if old_replica and old_replica.uri == replica.uri then
>                   local conn = old_replica.conn
>                   replica.conn = conn
>                   replica.down_ts = old_replica.down_ts

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Tarantool-patches] [PATCH vshard 1/1] replicaset: check URI match when rebind connection
  2020-08-06 21:48 [Tarantool-patches] [PATCH vshard 1/1] replicaset: check URI match when rebind connection Vladislav Shpilevoy
  2020-08-06 22:02 ` Yaroslav Dynnikov
  2020-08-07  8:07 ` Oleg Babin
@ 2020-08-07 21:56 ` Vladislav Shpilevoy
  2 siblings, 0 replies; 4+ messages in thread
From: Vladislav Shpilevoy @ 2020-08-07 21:56 UTC (permalink / raw)
  To: tarantool-patches, yaroslav.dynnikov, olegrok

Pushed to master.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-08-07 21:56 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-06 21:48 [Tarantool-patches] [PATCH vshard 1/1] replicaset: check URI match when rebind connection Vladislav Shpilevoy
2020-08-06 22:02 ` Yaroslav Dynnikov
2020-08-07  8:07 ` Oleg Babin
2020-08-07 21:56 ` Vladislav Shpilevoy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox