From: Serge Petrenko via Tarantool-patches <tarantool-patches@dev.tarantool.org> To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>, gorcunov@gmail.com Cc: tarantool-patches@dev.tarantool.org Subject: Re: [Tarantool-patches] [PATCH] recovery: make it yield when positioning in a WAL Date: Thu, 29 Apr 2021 11:55:28 +0300 [thread overview] Message-ID: <3d0997f3-066f-5b2c-3ae3-22b90583e9bf@tarantool.org> (raw) In-Reply-To: <738d3030-80cf-8079-1b03-55a7d665dbbf@tarantool.org> 28.04.2021 23:50, Vladislav Shpilevoy пишет: >>>> We had various places in box.cc and relay.cc which counted processed >>>> rows and yielded every now and then. These yields didn't cover cases, >>>> when recovery has to position inside a long WAL file: >>>> >>>> For example, when tarantool exits without leaving an empty WAL file >>>> which'll be used to recover instance vclock on restart. In this case >>>> the instance freezes while processing the last availabe WAL in order >>>> to recover the vclock. >>>> >>>> Another issue is with replication. If a replica connects and needs data >>>> from the end of a really long WAL, recovery will read up to the needed >>>> position without yields, making relay disconnect by timeout. >>>> >>>> In order to fix the issue, make recovery decide when a yield should >>>> happen. Introduce a new callback: schedule_yield, which is called by >>>> recovery once it processes (no matter how, either simply skips or calls >>>> xstream_write) enough rows (WAL_ROWS_PER_YIELD). >>>> >>>> schedule_yield either yields right away, in case of relay, or saves the >>>> yield for later, in case of local recovery, because it might be in the >>>> middle of a transaction. >>> 1. Did you consider an option to yield explicitly in recovery code when >>> it skips rows? If they are being skipped, it does not matter what are >>> their transaction borders. >> I did consider that. It is possible to do so, but then we'll have yet another >> place (in addition to relay and wal_stream) which counts rows and yields >> every now and then. >> >> I thought it would be better to unify all these places. Actually, this could be >> done this way from the very beginning. >> I think it's not recovery's business whether to yield or not once >> some rows are processed. >> >> Anyway, I can make it this way, if you insist. > The current solution is also fine. > >>>> + >>>> +/** >>>> + * Plan a yield in recovery stream. Wal stream will execute it as soon as it's >>>> + * ready. >>>> + */ >>>> +static void >>>> +wal_stream_schedule_yield(void) >>>> +{ >>>> + wal_stream.has_yield = true; >>>> + wal_stream_try_yield(&wal_stream); >>>> +} >>>> diff --git a/src/box/recovery.cc b/src/box/recovery.cc >>>> index cd33e7635..5351d8524 100644 >>>> --- a/src/box/recovery.cc >>>> +++ b/src/box/recovery.cc >>>> @@ -241,10 +248,16 @@ static void >>>> recover_xlog(struct recovery *r, struct xstream *stream, >>>> const struct vclock *stop_vclock) >>>> { >>>> + /* Imitate old behaviour. Rows are counted separately for each xlog. */ >>>> + r->row_count = 0; >>> 3. But why do you need to imitate it? Does it mean if the files are >>> too small to yield even once in each, but in total their number is >>> huge, there won't be yields? >> Yes, that's true. > Does not this look wrong to you? The xlog files might not contain enough > rows if wal_max_size is small enough, and then the same issue still > exists - no yields. > >>> Also does it mean "1M rows processed" was not ever printed in that >>> case? >> Yes, when WALs are not big enough. >> Recovery starts over with '0.1M rows processed' on every new WAL file. > Does not this look wrong to you too? That at least the number of > rows should not drop to 0 on each next xlog file. Yep, let's change it then. I thought we had to preserve log output. Fixed and added a changelog entry. ================================= diff --git a/changelogs/unreleased/gh-5979-recovery-ligs.md b/changelogs/unreleased/gh-5979-recovery-ligs.md new file mode 100644 index 000000000..86abfd66a --- /dev/null +++ b/changelogs/unreleased/gh-5979-recovery-ligs.md @@ -0,0 +1,11 @@ +# bugfix/core + +* Now tarantool yields when scanning `.xlog` files for the latest applied vclock + and when finding the right place in `.xlog`s to start recovering. This means + that the instance is responsive right after `box.cfg` call even when an empty + `.xlog` was not created on previous exit. + Also this prevents relay from timing out when a freshly subscribed replica + needs rows from the end of a relatively long (hundreds of MBs) `.xlog` + (gh-5979). +* The counter in `x.yM rows processed` log messages does not reset on each new + recovered `xlog` anymore. diff --git a/src/box/recovery.cc b/src/box/recovery.cc index 5351d8524..8359f216d 100644 --- a/src/box/recovery.cc +++ b/src/box/recovery.cc @@ -149,6 +149,13 @@ recovery_scan(struct recovery *r, struct vclock *end_vclock, } } xlog_cursor_close(&cursor, false); + + /* + * Do not show scanned rows in log output and yield just in case + * row_count was less than WAL_ROWS_PER_YIELD when we reset it. + */ + r->row_count = 0; + r->schedule_yield(); } static inline void @@ -248,8 +255,6 @@ static void recover_xlog(struct recovery *r, struct xstream *stream, const struct vclock *stop_vclock) { - /* Imitate old behaviour. Rows are counted separately for each xlog. */ - r->row_count = 0; struct xrow_header row; while (xlog_cursor_next_xc(&r->cursor, &row, r->wal_dir.force_recovery) == 0) { ================================= -- Serge Petrenko
next prev parent reply other threads:[~2021-04-29 8:55 UTC|newest] Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-04-26 16:59 Serge Petrenko via Tarantool-patches 2021-04-26 21:20 ` Vladislav Shpilevoy via Tarantool-patches 2021-04-28 15:34 ` Serge Petrenko via Tarantool-patches 2021-04-28 20:50 ` Vladislav Shpilevoy via Tarantool-patches 2021-04-29 8:55 ` Serge Petrenko via Tarantool-patches [this message] 2021-04-29 20:03 ` Vladislav Shpilevoy via Tarantool-patches 2021-05-12 11:30 ` Serge Petrenko via Tarantool-patches
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=3d0997f3-066f-5b2c-3ae3-22b90583e9bf@tarantool.org \ --to=tarantool-patches@dev.tarantool.org \ --cc=gorcunov@gmail.com \ --cc=sergepetrenko@tarantool.org \ --cc=v.shpilevoy@tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH] recovery: make it yield when positioning in a WAL' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox