[Tarantool-patches] [PATCH vshard 6/6] router: update master using a hint from storage
Oleg Babin
olegrok at tarantool.org
Fri Jul 2 14:49:06 MSK 2021
Thanks for your patch. See 2 comments below.
On 02.07.2021 01:09, Vladislav Shpilevoy wrote:
> Storage sends NON_MASTER error when an attempt happens to make a
> read-write request on it while it is not a master. The error
> contains UUID of the instance.
>
> The patch adds to this error a new field - UUID of the master as
> it is seen on this storage. Router now uses that information to
> quickly switch its read-write requests to the new master. In fact,
> this should happen in almost all cases of master auto-discovery on
> the router if it occurs under load.
>
> Closes #75
> ---
> test/router/master_discovery.result | 427 ++++++++++++++++++++++++++
> test/router/master_discovery.test.lua | 191 ++++++++++++
> test/router/router.result | 8 +-
> vshard/error.lua | 2 +-
> vshard/replicaset.lua | 65 ++++
> vshard/router/init.lua | 17 +-
> vshard/storage/init.lua | 6 +-
> 7 files changed, 706 insertions(+), 10 deletions(-)
>
>
> ...
> @@ -1175,6 +1176,7 @@ error_messages
> - Use replicaset:connect_replica(...) instead of replicaset.connect_replica(...)
> - Use replicaset:down_replica_priority(...) instead of replicaset.down_replica_priority(...)
> - Use replicaset:up_replica_priority(...) instead of replicaset.up_replica_priority(...)
> + - Use replicaset:update_master(...) instead of replicaset.update_master(...)
> - Use replicaset:wait_connected(...) instead of replicaset.wait_connected(...)
> ...
> _, replica = next(replicaset.replicas)
> diff --git a/vshard/error.lua b/vshard/error.lua
> index e2d1a31..fa7bdaa 100644
> --- a/vshard/error.lua
> +++ b/vshard/error.lua
> @@ -20,7 +20,7 @@ local error_message_template = {
> [2] = {
> name = 'NON_MASTER',
> msg = 'Replica %s is not a master for replicaset %s anymore',
> - args = {'replica_uuid', 'replicaset_uuid'}
> + args = {'replica_uuid', 'replicaset_uuid', 'master_uuid'}
Error format string still contains only 2 arguments. Is it ok?
> },
> [3] = {
> name = 'BUCKET_ALREADY_EXISTS',
> diff --git a/vshard/replicaset.lua b/vshard/replicaset.lua
> index 7747258..660c786 100644
> --- a/vshard/replicaset.lua
> +++ b/vshard/replicaset.lua
> @@ -570,6 +570,70 @@ local function rebind_replicasets(replicasets, old_replicasets)
> end
> end
>
> +--
> +-- Let the replicaset know @a old_master_uuid is not a master anymore, should
> +-- use @a candidate_uuid instead.
> +-- Returns whether the request, which brought this information, can be retried.
> +--
> +local function replicaset_update_master(replicaset, old_master_uuid,
> + candidate_uuid)
> + local is_auto = replicaset.is_auto_master
> + local replicaset_uuid = replicaset.uuid
> + if old_master_uuid == candidate_uuid then
> + -- It should not happen ever, but be ready to everything.
> + log.warn('Replica %s in replicaset %s reports self as both master '..
> + 'and not master', master_uuid, replicaset_uuid)
> + return is_auto
> + end
> + local master = replicaset.master
> + if not master then
> + if not is_auto or not candidate_uuid then
> + return is_auto
> + end
> + local candidate = replicaset.replicas[candidate_uuid]
AFAIU it means that candidate_uuid doesn't belong to replicaset.
Why is it true?
> + if not candidate then
> + return true
> + end
> + replicaset.master = candidate
> + log.info('Replica %s becomes a master as reported by %s for '..
> + 'replicaset %s', candidate_uuid, old_master_uuid,
> + replicaset_uuid)
> + return true
More information about the Tarantool-patches
mailing list