To: Vladislav Shpilevoy, tarantool-patches@dev.tarantool.org, yaroslav.dynnikov@tarantool.org
Message-ID: <38f8a40b-4fe7-f26a-ddc5-7f088445e4f2@tarantool.org>
Date: Thu, 3 Jun 2021 13:19:17 +0300
Subject: Re: [Tarantool-patches] [PATCH vshard 1/1] recovery: relax recovery messages verbosity
List-Id: Tarantool development patches
From: Oleg Babin via Tarantool-patches
Reply-To: Oleg Babin

Hi! Thanks for your patch! Looks good but I placed three comments below.

On 03.06.2021 00:34, Vladislav Shpilevoy wrote:
> Recovery fiber on the storages used to print messages about
> starting recovery even when no recovery was needed yet:
>
>     Starting ... buckets recovery step
>     Finish bucket recovery step
>
> It happened a lot during rebalancing even if it worked fine.
> Because there appear receiving/sending buckets, and recovery
> double-checks if they are really transferring, not stuck.
>
> The patch makes recovery fiber not account the buckets, whose
> transfer is actually in progress, as broken. Hence it won't print
> the recovery messages anymore unless the transfer was really
> interrupted.
>
> Along with that the recovery now prints more details about the
> first bucket which triggered the real recovery.
>
> Closes #274
> ---
> Branch: http://github.com/tarantool/vshard/tree/gerold103/gh-274-user-friendly-recovery
> Issue: https://github.com/tarantool/vshard/issues/274
>
>  vshard/storage/init.lua | 30 +++++++++++++++++++++++++-----
>  1 file changed, 25 insertions(+), 5 deletions(-)
>
> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
> index 7045d91..8a019fa 100644
> --- a/vshard/storage/init.lua
> +++ b/vshard/storage/init.lua
> @@ -736,21 +736,25 @@ local function recovery_step_by_type(type)
>      local is_empty = true

To be honest I don't completely understand what "is_empty" means.

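One possible reading, judging only from how the flag is used in this patch, is "no bucket has needed recovery so far in this step". Below is a minimal sketch of that reading. It is not vshard code: the function name recovery_step_sketch, the in_progress field, the 'sending' type and the log callback are all hypothetical stand-ins chosen for illustration.

```lua
-- Hypothetical sketch, not vshard code.
-- Assumption: is_empty means "no bucket has needed recovery so far".
local function recovery_step_sketch(buckets, log)
    local is_empty = true
    local start_format = 'Starting %s buckets recovery step'
    for _, bucket in ipairs(buckets) do
        -- A transfer that is still running needs no recovery: stay silent.
        if not bucket.in_progress then
            if is_empty then
                -- First bucket that really needs attention:
                -- print the "starting" message exactly once.
                log(start_format:format('sending'))
                is_empty = false
            end
            log(('Recovering bucket %d'):format(bucket.id))
        end
    end
    if not is_empty then
        -- The closing message is paired with the opening one.
        log('Finish bucket recovery step')
    end
    -- True when at least one bucket needed recovery.
    return not is_empty
end
```

With this reading, a step that only sees in-progress transfers logs nothing at all, which seems to be the behaviour the commit message describes. If that is the intent, a name like "is_step_empty" might say it more directly.
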
>      local recovered = 0
>      local total = 0
> +    local start_format = 'Starting %s buckets recovery step'
>      for _, bucket in _bucket.index.status:pairs(type) do
>          total = total + 1
>          local bucket_id = bucket.id
>          if M.rebalancer_transfering_buckets[bucket_id] then

Side-note: transfering -> transferring

>              goto continue
>          end
> -        if is_empty then
> -            log.info('Starting %s buckets recovery step', type)
> -        end
> -        is_empty = false
>          assert(bucket_is_transfer_in_progress(bucket))
> -        local destination = M.replicasets[bucket.destination]
> +        local peer_uuid = bucket.destination
> +        local destination = M.replicasets[peer_uuid]
>          if not destination or not destination.master then
>              -- No replicaset master for a bucket. Wait until it
>              -- appears.

This comment says the situation is not critical and that simply waiting
is the appropriate reaction. Why "error" and not "warn"?

> +            if is_empty then
> +                log.info(start_format, type)
> +                log.error('Can not find for bucket %s its peer %s', bucket_id,
> +                          peer_uuid)
> +                is_empty = false
> +            end
>              goto continue
>          end
>          local remote_bucket, err =
> @@ -759,6 +763,15 @@ local function recovery_step_by_type(type)
>          -- not be used to recovery anything. Try later.
>          if not remote_bucket and (not err or err.type ~= 'ShardingError' or
>                                    err.code ~= lerror.code.WRONG_BUCKET) then
> +            if is_empty then
> +                if err == nil then
> +                    err = 'unknown'
> +                end
> +                log.info(start_format, type)
> +                log.error('Error during recovery of bucket %s on replicaset '..
> +                          '%s: %s', bucket_id, peer_uuid, err)
> +                is_empty = false
> +            end
>              goto continue
>          end
>          -- Do nothing until the bucket on both sides stopped
> @@ -772,13 +785,20 @@ local function recovery_step_by_type(type)
>          if not bucket or not bucket_is_transfer_in_progress(bucket) then
>              goto continue
>          end
> +        if is_empty then
> +            log.info(start_format, type)
> +        end
>          if recovery_local_bucket_is_garbage(bucket, remote_bucket) then
>              _bucket:update({bucket_id}, {{'=', 2, consts.BUCKET.GARBAGE}})
>              recovered = recovered + 1
>          elseif recovery_local_bucket_is_active(bucket, remote_bucket) then
>              _bucket:replace({bucket_id, consts.BUCKET.ACTIVE})
>              recovered = recovered + 1
> +        elseif is_empty then
> +            log.info('Bucket %s is %s local and %s on replicaset %s, waiting',
> +                     bucket_id, bucket.status, remote_bucket.status, peer_uuid)
>          end
> +        is_empty = false
>  ::continue::
>      end
>      if not is_empty then