From: Oleg Babin via Tarantool-patches <tarantool-patches@dev.tarantool.org> To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>, tarantool-patches@dev.tarantool.org, yaroslav.dynnikov@tarantool.org Subject: Re: [Tarantool-patches] [PATCH vshard 1/1] recovery: relax recovery messages verbosity Date: Thu, 3 Jun 2021 13:19:17 +0300 [thread overview] Message-ID: <38f8a40b-4fe7-f26a-ddc5-7f088445e4f2@tarantool.org> (raw) In-Reply-To: <ddf4570db038934ba121f5a3e99f690930faa869.1622669605.git.v.shpilevoy@tarantool.org> Hi! Thanks for your patch! Looks good but I placed three comments below. On 03.06.2021 00:34, Vladislav Shpilevoy wrote: > Recovery fiber on the storages used to print messages about > starting recovery even when no recovery was needed yet: > > Starting ... buckets recovery step > Finish bucket recovery step > > It happened a lot during rebalancing even if it worked fine. > Because there appear receiving/sending buckets, and recovery > double-checks if they are really transferring, not stuck. > > The patch makes recovery fiber not account the buckets, whose > transfer is actually in progress, as broken. Hence it won't print > the recovery messages anymore unless the transfer was really > interrupted. > > Along with that the recovery now prints more details about the > first bucket which triggered the real recovery. > > Closes #274 > --- > Branch: http://github.com/tarantool/vshard/tree/gerold103/gh-274-user-friendly-recovery > Issue: https://github.com/tarantool/vshard/issues/274 > > vshard/storage/init.lua | 30 +++++++++++++++++++++++++----- > 1 file changed, 25 insertions(+), 5 deletions(-) > > diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua > index 7045d91..8a019fa 100644 > --- a/vshard/storage/init.lua > +++ b/vshard/storage/init.lua > @@ -736,21 +736,25 @@ local function recovery_step_by_type(type) > local is_empty = true To be honest I don't completely understand what "is_empty" means. > local recovered = 0 > local total = 0 > + local start_format = 'Starting %s buckets recovery step' > for _, bucket in _bucket.index.status:pairs(type) do > total = total + 1 > local bucket_id = bucket.id > if M.rebalancer_transfering_buckets[bucket_id] then Side-note: transfering -> transferring > goto continue > end > - if is_empty then > - log.info('Starting %s buckets recovery step', type) > - end > - is_empty = false > assert(bucket_is_transfer_in_progress(bucket)) > - local destination = M.replicasets[bucket.destination] > + local peer_uuid = bucket.destination > + local destination = M.replicasets[peer_uuid] > if not destination or not destination.master then > -- No replicaset master for a bucket. Wait until it > -- appears. This comment states that there is no critical and it's appropriate and we should just wait. Why "error" and not "warn"? > + if is_empty then > + log.info(start_format, type) > + log.error('Can not find for bucket %s its peer %s', bucket_id, > + peer_uuid) > + is_empty = false > + end > goto continue > end > local remote_bucket, err = > @@ -759,6 +763,15 @@ local function recovery_step_by_type(type) > -- not be used to recovery anything. Try later. > if not remote_bucket and (not err or err.type ~= 'ShardingError' or > err.code ~= lerror.code.WRONG_BUCKET) then > + if is_empty then > + if err == nil then > + err = 'unknown' > + end > + log.info(start_format, type) > + log.error('Error during recovery of bucket %s on replicaset '.. > + '%s: %s', bucket_id, peer_uuid, err) > + is_empty = false > + end > goto continue > end > -- Do nothing until the bucket on both sides stopped > @@ -772,13 +785,20 @@ local function recovery_step_by_type(type) > if not bucket or not bucket_is_transfer_in_progress(bucket) then > goto continue > end > + if is_empty then > + log.info(start_format, type) > + end > if recovery_local_bucket_is_garbage(bucket, remote_bucket) then > _bucket:update({bucket_id}, {{'=', 2, consts.BUCKET.GARBAGE}}) > recovered = recovered + 1 > elseif recovery_local_bucket_is_active(bucket, remote_bucket) then > _bucket:replace({bucket_id, consts.BUCKET.ACTIVE}) > recovered = recovered + 1 > + elseif is_empty then > + log.info('Bucket %s is %s local and %s on replicaset %s, waiting', > + bucket_id, bucket.status, remote_bucket.status, peer_uuid) > end > + is_empty = false > ::continue:: > end > if not is_empty then
next prev parent reply other threads:[~2021-06-03 10:19 UTC|newest] Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-06-02 21:34 Vladislav Shpilevoy via Tarantool-patches 2021-06-03 10:19 ` Oleg Babin via Tarantool-patches [this message] 2021-06-03 12:55 ` Vladislav Shpilevoy via Tarantool-patches 2021-06-03 13:22 ` Oleg Babin via Tarantool-patches 2021-06-03 14:13 ` Vladislav Shpilevoy via Tarantool-patches
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=38f8a40b-4fe7-f26a-ddc5-7f088445e4f2@tarantool.org \ --to=tarantool-patches@dev.tarantool.org \ --cc=olegrok@tarantool.org \ --cc=v.shpilevoy@tarantool.org \ --cc=yaroslav.dynnikov@tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH vshard 1/1] recovery: relax recovery messages verbosity' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox