From mboxrd@z Thu Jan  1 00:00:00 1970
From: Serge Petrenko via Tarantool-patches
Reply-To: Serge Petrenko
To: Vladislav Shpilevoy, gorcunov@gmail.com
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH] recovery: make it yield when positioning in a WAL
Date: Thu, 29 Apr 2021 11:55:28 +0300
Message-ID: <3d0997f3-066f-5b2c-3ae3-22b90583e9bf@tarantool.org>
In-Reply-To: <738d3030-80cf-8079-1b03-55a7d665dbbf@tarantool.org>
References: <20210426165954.46474-1-sergepetrenko@tarantool.org>
 <738d3030-80cf-8079-1b03-55a7d665dbbf@tarantool.org>
List-Id: Tarantool development patches
Content-Type: text/plain; charset=utf-8; format=flowed
28.04.2021 23:50, Vladislav Shpilevoy wrote:
>>>> We had various places in box.cc and relay.cc which counted processed
>>>> rows and yielded every now and then. These yields didn't cover cases
>>>> when recovery has to position inside a long WAL file:
>>>>
>>>> For example, when tarantool exits without leaving an empty WAL file
>>>> which'll be used to recover the instance vclock on restart. In this
>>>> case the instance freezes while processing the last available WAL in
>>>> order to recover the vclock.
>>>>
>>>> Another issue is with replication. If a replica connects and needs data
>>>> from the end of a really long WAL, recovery will read up to the needed
>>>> position without yields, making relay disconnect by timeout.
>>>>
>>>> In order to fix the issue, make recovery decide when a yield should
>>>> happen. Introduce a new callback: schedule_yield, which is called by
>>>> recovery once it processes (no matter how, either simply skips or calls
>>>> xstream_write) enough rows (WAL_ROWS_PER_YIELD).
>>>>
>>>> schedule_yield either yields right away, in case of relay, or saves the
>>>> yield for later, in case of local recovery, because it might be in the
>>>> middle of a transaction.
>>>
>>> 1. Did you consider an option to yield explicitly in recovery code when
>>> it skips rows? If they are being skipped, it does not matter what their
>>> transaction borders are.
>>
>> I did consider that. It is possible to do so, but then we'll have yet
>> another place (in addition to relay and wal_stream) which counts rows
>> and yields every now and then.
>>
>> I thought it would be better to unify all these places. Actually, it
>> could have been done this way from the very beginning.
>> I think it's not recovery's business whether to yield or not once
>> some rows are processed.
>>
>> Anyway, I can make it this way, if you insist.
>
> The current solution is also fine.
>
>>>> +
>>>> +/**
>>>> + * Plan a yield in recovery stream. Wal stream will execute it as
>>>> + * soon as it's ready.
>>>> + */
>>>> +static void
>>>> +wal_stream_schedule_yield(void)
>>>> +{
>>>> +	wal_stream.has_yield = true;
>>>> +	wal_stream_try_yield(&wal_stream);
>>>> +}
>>>> diff --git a/src/box/recovery.cc b/src/box/recovery.cc
>>>> index cd33e7635..5351d8524 100644
>>>> --- a/src/box/recovery.cc
>>>> +++ b/src/box/recovery.cc
>>>> @@ -241,10 +248,16 @@ static void
>>>>  recover_xlog(struct recovery *r, struct xstream *stream,
>>>>  	     const struct vclock *stop_vclock)
>>>>  {
>>>> +	/* Imitate old behaviour. Rows are counted separately for each xlog. */
>>>> +	r->row_count = 0;
>>>
>>> 3. But why do you need to imitate it? Does it mean if the files are
>>> too small to yield even once in each, but in total their number is
>>> huge, there won't be yields?
>>
>> Yes, that's true.
>
> Does not this look wrong to you? The xlog files might not contain enough
> rows if wal_max_size is small enough, and then the same issue still
> exists - no yields.
>
>>> Also does it mean "1M rows processed" was not ever printed in that
>>> case?
>>
>> Yes, when WALs are not big enough.
>> Recovery starts over with '0.1M rows processed' on every new WAL file.
>
> Does not this look wrong to you too? At least the number of rows
> should not drop to 0 on each next xlog file.

Yep, let's change it then. I thought we had to preserve the log output.
Fixed and added a changelog entry.

=================================

diff --git a/changelogs/unreleased/gh-5979-recovery-ligs.md b/changelogs/unreleased/gh-5979-recovery-ligs.md
new file mode 100644
index 000000000..86abfd66a
--- /dev/null
+++ b/changelogs/unreleased/gh-5979-recovery-ligs.md
@@ -0,0 +1,11 @@
+# bugfix/core
+
+* Now tarantool yields when scanning `.xlog` files for the latest applied vclock
+  and when finding the right place in `.xlog`s to start recovering. This means
+  that the instance is responsive right after the `box.cfg` call even when an
+  empty `.xlog` was not created on the previous exit.
+  This also prevents relay from timing out when a freshly subscribed replica
+  needs rows from the end of a relatively long (hundreds of MBs) `.xlog`
+  (gh-5979).
+* The counter in `x.yM rows processed` log messages does not reset on each new
+  recovered `xlog` anymore.
diff --git a/src/box/recovery.cc b/src/box/recovery.cc
index 5351d8524..8359f216d 100644
--- a/src/box/recovery.cc
+++ b/src/box/recovery.cc
@@ -149,6 +149,13 @@ recovery_scan(struct recovery *r, struct vclock *end_vclock,
 		}
 	}
 	xlog_cursor_close(&cursor, false);
+
+	/*
+	 * Do not show scanned rows in log output and yield just in case
+	 * row_count was less than WAL_ROWS_PER_YIELD when we reset it.
+	 */
+	r->row_count = 0;
+	r->schedule_yield();
 }

 static inline void
@@ -248,8 +255,6 @@ static void
 recover_xlog(struct recovery *r, struct xstream *stream,
 	     const struct vclock *stop_vclock)
 {
-	/* Imitate old behaviour. Rows are counted separately for each xlog. */
-	r->row_count = 0;
 	struct xrow_header row;
 	while (xlog_cursor_next_xc(&r->cursor, &row,
 				   r->wal_dir.force_recovery) == 0) {

=================================

-- 
Serge Petrenko
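P.S. For readers following the thread, the row-counting/yield-callback split
discussed above can be sketched as a tiny standalone C program. This is an
illustration, not the actual tarantool code: `recovery_process_row`, the
relay stub, the counters, and the value of `WAL_ROWS_PER_YIELD` are invented
here for the sketch; only the shape of `wal_stream_schedule_yield` /
`wal_stream_try_yield` mirrors the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical value; the real constant lives in the patch, not here. */
enum { WAL_ROWS_PER_YIELD = 1 << 15 };

/*
 * Recovery only counts processed rows and fires the callback every
 * WAL_ROWS_PER_YIELD rows; it does not decide how the yield happens.
 */
struct recovery {
	size_t row_count;
	void (*schedule_yield)(void);
};

/* Called once per row, whether the row is applied or simply skipped. */
static void
recovery_process_row(struct recovery *r)
{
	if (++r->row_count % WAL_ROWS_PER_YIELD == 0)
		r->schedule_yield();
}

/* Relay-style callback: yield right away (stubbed as a counter here). */
static int relay_yields = 0;
static void
relay_schedule_yield(void)
{
	relay_yields++; /* a real relay would yield the fiber here */
}

/*
 * Local-recovery-style callback: remember that a yield is due and only
 * perform it once the stream is outside a transaction.
 */
static bool wal_stream_has_yield = false;
static bool wal_stream_in_txn = false;
static int wal_stream_yields = 0;

static void
wal_stream_try_yield(void)
{
	if (!wal_stream_has_yield || wal_stream_in_txn)
		return;
	wal_stream_has_yield = false;
	wal_stream_yields++; /* the deferred yield happens here */
}

static void
wal_stream_schedule_yield(void)
{
	wal_stream_has_yield = true;
	wal_stream_try_yield();
}
```

The design point from the discussion is visible in the split: recovery only
knows "enough rows were processed"; whether that means yielding immediately
(relay) or deferring to a transaction boundary (local recovery) is entirely
the callback's business.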