From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp43.i.mail.ru (smtp43.i.mail.ru [94.100.177.103]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id D8FCC469710 for ; Tue, 5 May 2020 00:09:55 +0300 (MSK) References: <80bacfc685233ad047f6a80ddadd72b8903eae5b.1588292014.git.v.shpilevoy@tarantool.org> <4ee74ac5-55e0-9a90-91d0-69e9470d9cb3@tarantool.org> <2bb5c86a-04c4-96a9-ab3e-75b96d43b1d9@tarantool.org> From: Vladislav Shpilevoy Message-ID: Date: Mon, 4 May 2020 23:09:53 +0200 MIME-Version: 1.0 In-Reply-To: <2bb5c86a-04c4-96a9-ab3e-75b96d43b1d9@tarantool.org> Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 8bit Subject: Re: [Tarantool-patches] [PATCH vshard 6/7] router: make discovery smoother in a big cluster List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Oleg Babin , tarantool-patches@dev.tarantool.org Hi! Thanks for the review! >>>> diff --git a/vshard/router/init.lua b/vshard/router/init.lua >>>> index 6d88153..26ea85b 100644 >>>> --- a/vshard/router/init.lua >>>> +++ b/vshard/router/init.lua >>>> @@ -250,50 +250,127 @@ if util.version_is_at_least(1, 10, 0) then >>>> +            -- Don't spam many requests at once. Give >>>> +            -- storages time to handle them and other >>>> +            -- requests. >>>> +            lfiber.sleep(consts.DISCOVERY_WORK_STEP) >>>> +            if module_version ~= M.module_version then >>>> +                return >>>>                end >>> >>> >>> Is it possible to place such checks to some "common" places. At start/end of each iteration or between steps. This looks strange and a bit unobvous to do such checks after each timeout. >> >> Purpose is to stop the fiber as soon as possible in case a reload >> happened. So moving it to a common place is not really possible. >> It would make it work longer with the old code after reload. But >> if we keep the checks, I still need to do then after/before any >> long yield. >> > > My main concern is that such things could be easily missed in future. But I agree that we should left this function early as possible. It can be, and it was a couple of times. Don't know what to do with that honestly, at this moment. This restart-on-reload machinery should be reworked completely. >> During working on your comments I found that during aggressive discovery I >> added sleep in a wrong place. Fixed below: >> >> diff --git a/vshard/consts.lua b/vshard/consts.lua >> index 5391c0f..a6a8c1b 100644 >> --- a/vshard/consts.lua >> +++ b/vshard/consts.lua >> @@ -40,5 +40,8 @@ return { >>       DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL = 0.5; >>       RECOVERY_INTERVAL = 5; >>       COLLECT_LUA_GARBAGE_INTERVAL = 100; >> -    DISCOVERY_INTERVAL = 10; >> +    DISCOVERY_IDLE_INTERVAL = 10, >> +    DISCOVERY_WORK_INTERVAL = 1, >> +    DISCOVERY_WORK_STEP = 0.01, >> +    DISCOVERY_TIMEOUT = 10, >>   } > > Nit: here commas and semicolons are mixed. I think we should support it in consistent state - only commas or only semicolons. I don't like these ';'. They are against our code style. They were added in the second commit in this repository, and since then kept growing. I deliberately stopped using them and decided to switch to ',' in all new code, and wash out the old lines gradually. I guess I can split DISCOVERY_* parameters with an empty line from the others who still use ';' though. To make it more clear. ==================== diff --git a/vshard/consts.lua b/vshard/consts.lua index a6a8c1b..8c2a8b0 100644 --- a/vshard/consts.lua +++ b/vshard/consts.lua @@ -40,6 +40,7 @@ return { DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL = 0.5; RECOVERY_INTERVAL = 5; COLLECT_LUA_GARBAGE_INTERVAL = 100; + DISCOVERY_IDLE_INTERVAL = 10, DISCOVERY_WORK_INTERVAL = 1, DISCOVERY_WORK_STEP = 0.01, ====================