From mboxrd@z Thu Jan 1 00:00:00 1970
To: Cyrill Gorcunov , tml
References: <20210323154710.1696442-1-gorcunov@gmail.com> <20210323154710.1696442-2-gorcunov@gmail.com>
Message-ID: <1498cc0d-1a3a-619b-8cde-d484eca81758@tarantool.org>
Date: Wed, 24 Mar 2021 16:09:56 +0300
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.16; rv:78.0) Gecko/20100101 Thunderbird/78.8.1
MIME-Version: 1.0
In-Reply-To: <20210323154710.1696442-2-gorcunov@gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Content-Language: en-GB
Subject: Re: [Tarantool-patches] [PATCH v3 1/3] gc/xlog: delay xlog cleanup until relays are subscribed
List-Id: Tarantool development patches
From: Serge Petrenko via Tarantool-patches
Reply-To: Serge Petrenko
Cc: Vladislav Shpilevoy

Hi! Thanks for the patch!

Please see my 2 comments below.

On 23.03.2021 18:47, Cyrill Gorcunov wrote:
> --- a/src/box/box.cc
> +++ b/src/box/box.cc
> @@ -771,6 +771,19 @@ box_check_wal_queue_max_size(void)
>  	return size;
>  }
>
> +static double
> +box_check_wal_cleanup_delay(void)
> +{
> +	double value = cfg_getd("wal_cleanup_delay");
> +	if (value < 0) {
> +		diag_set(ClientError, ER_CFG, "wal_cleanup_delay",
> +			 "the value must be >= 0");
> +		return -1;
> +	}
> +
> +	return value;
> +}
> +
>  static void
>  box_check_readahead(int readahead)
>  {
> @@ -918,6 +931,8 @@ box_check_config(void)
>  	box_check_wal_mode(cfg_gets("wal_mode"));
>  	if (box_check_wal_queue_max_size() < 0)
>  		diag_raise();
> +	if (box_check_wal_cleanup_delay() < 0)
> +		diag_raise();
>  	if (box_check_memory_quota("memtx_memory") < 0)
>  		diag_raise();
>  	box_check_memtx_min_tuple_size(cfg_geti64("memtx_min_tuple_size"));
> @@ -1465,6 +1480,23 @@ box_set_wal_queue_max_size(void)
>  	return 0;
>  }
>
> +int
> +box_set_wal_cleanup_delay(void)
> +{
> +	double delay = box_check_wal_cleanup_delay();
> +	if (delay < 0)
> +		return -1;
> +	/*
> +	 * Anonymous replicas do not require
> +	 * delay the cleanup procedure since they
> +	 * are read only.
> +	 */
> +	if (replication_anon)
> +		delay = 0;
> +	gc_set_wal_cleanup_delay(delay);
> +	return 0;
> +}
> +
>  void
>  box_set_vinyl_memory(void)
>  {
> @@ -3000,7 +3032,7 @@ box_cfg_xc(void)
>  	rmean_box = rmean_new(iproto_type_strs, IPROTO_TYPE_STAT_MAX);
>  	rmean_error = rmean_new(rmean_error_strings, RMEAN_ERROR_LAST);
>
> -	gc_init();
> +	gc_init(box_check_wal_cleanup_delay());

You didn't put `wal_cleanup_delay` into `dynamic_cfg_skip_at_load`, and
that's correct, because we need to disable it if replication_anon is set.
So wal_cleanup_delay will be reapplied once box_cfg exits.

I propose to init gc with TIMEOUT_INFINITY then. It'd look simpler than
setting the same value twice IMO.

> diff --git a/src/box/gc.c b/src/box/gc.c
> index 9af4ef958..e1d7a1187 100644
> --- a/src/box/gc.c
> +++ b/src/box/gc.c
> @@ -102,11 +102,18 @@ gc_checkpoint_delete(struct gc_checkpoint *checkpoint)
>  }
>
>  void
> -gc_init(void)
> +gc_init(double wal_cleanup_delay)
>  {
>  	/* Don't delete any files until recovery is complete. */
>  	gc.min_checkpoint_count = INT_MAX;
>
> +	gc.wal_cleanup_delay = wal_cleanup_delay;
> +	gc.is_paused = wal_cleanup_delay > 0;
> +	gc.delay_ref = 0;
> +
> +	if (gc.is_paused)
> +		say_info("wal/engine cleanup is paused");
> +
>  	vclock_create(&gc.vclock);
>  	rlist_create(&gc.checkpoints);
>  	gc_tree_new(&gc.consumers);
> @@ -238,6 +245,39 @@ static int
>  gc_cleanup_fiber_f(va_list ap)
>  {
>  	(void)ap;
> +
> +	/*
> +	 * Stage 1 (optional): in case if we're booting
> +	 * up with cleanup disabled lets do wait in a
> +	 * separate cycle to minimize branching on stage 2.
> +	 */
> +	if (gc.is_paused) {
> +		double start_time = fiber_clock();
> +		while (!fiber_is_cancelled()) {
> +			double deadline = start_time + gc.wal_cleanup_delay;
> +			double timeout = gc.wal_cleanup_delay;
> +
> +			if (fiber_clock() >= deadline ||
> +			    fiber_yield_timeout(timeout)) {
> +				say_info("wal/engine cleanup is resumed "
> +					 "due to timeout expiration");
> +				gc.is_paused = false;
> +				gc.delay_ref = 0;
> +				break;
> +			}
> +
> +			/*
> +			 * If a last reference is dropped
> +			 * we can exit out early.
> +			 */
> +			if (!gc.is_paused)
> +				break;
> +		}
> +	}
> +
> +	/*
> +	 * Stage 2: a regular cleanup cycle.
> +	 */
>  	while (!fiber_is_cancelled()) {
>  		int64_t delta = gc.cleanup_scheduled - gc.cleanup_completed;
>  		if (delta == 0) {
> @@ -253,6 +293,43 @@ gc_cleanup_fiber_f(va_list ap)
>  	return 0;
>  }
>
> +void
> +gc_set_wal_cleanup_delay(double wal_cleanup_delay)
> +{
> +	gc.wal_cleanup_delay = wal_cleanup_delay;
> +	/*
> +	 * This routine may be called at arbitrary
> +	 * moment thus we must be sure the cleanup
> +	 * fiber is paused to not wake up it when
> +	 * it is already in a regular cleanup stage.
> +	 */
> +	if (gc.is_paused)
> +		fiber_wakeup(gc.cleanup_fiber);
> +}
> +
> +void
> +gc_delay_ref(void)
> +{
> +	if (gc.is_paused) {
> +		assert(gc.delay_ref >= 0);
> +		gc.delay_ref++;
> +	}
> +}
> +
> +void
> +gc_delay_unref(void)
> +{
> +	if (gc.is_paused) {
> +		assert(gc.delay_ref > 0);
> +		gc.delay_ref--;
> +		if (gc.delay_ref == 0) {
> +			say_info("wal/engine cleanup is resumed");
> +			gc.is_paused = false;
> +			fiber_wakeup(gc.cleanup_fiber);

I'd move the info message to the cleanup fiber. There you can deduce the
reason for the resume (timeout expired or replicas connected) and print it.
Or don't show the reason for the resume at all and leave a single info
message.

I don't insist on you doing this though, feel free to ignore.

--
Serge Petrenko
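
P.S. To illustrate the second comment, here is a rough standalone sketch of
how the cleanup fiber could be the only place that decides why cleanup
resumed. It is a model and not a patch against gc.c: every name in it
(struct gc_model, enum resume_reason, the gc_model_* helpers) is made up for
the example, time is passed in as a plain double instead of coming from
fiber_clock(), and there is no real fiber or logging. The unref helper only
records that the last reference was dropped; the check helper, which stands
in for the stage 1 loop body, is where a single say_info() would go.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the paused state. None of these names exist
 * in Tarantool; they only mirror the fields used in the patch.
 */
enum resume_reason {
	RESUME_NONE,		/* no resume event happened (yet) */
	RESUME_TIMEOUT,		/* the cleanup delay expired */
	RESUME_REPLICAS,	/* the last delay reference was dropped */
};

struct gc_model {
	bool is_paused;
	int delay_ref;
	double deadline;
	enum resume_reason reason;
};

static void
gc_model_init(struct gc_model *gc, double now, double delay)
{
	gc->is_paused = delay > 0;
	gc->delay_ref = 0;
	gc->deadline = now + delay;
	gc->reason = RESUME_NONE;
}

static void
gc_model_ref(struct gc_model *gc)
{
	if (gc->is_paused)
		gc->delay_ref++;
}

/* Dropping the last reference records the reason but prints nothing. */
static void
gc_model_unref(struct gc_model *gc)
{
	if (!gc->is_paused)
		return;
	assert(gc->delay_ref > 0);
	if (--gc->delay_ref == 0) {
		gc->is_paused = false;
		gc->reason = RESUME_REPLICAS;
	}
}

/*
 * Stand-in for one iteration of the stage 1 loop. The caller would
 * print one info message based on the returned reason.
 */
static enum resume_reason
gc_model_check_resume(struct gc_model *gc, double now)
{
	if (gc->is_paused && now >= gc->deadline) {
		gc->is_paused = false;
		gc->delay_ref = 0;
		gc->reason = RESUME_TIMEOUT;
	}
	return gc->reason;
}
```

With this shape the message text can be picked from the returned value in
one place, or the reason can be dropped entirely and a single generic
message printed whenever the result is not RESUME_NONE.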