From: Serge Petrenko via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>, gorcunov@gmail.com
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH] recovery: make it yield when positioning in a WAL
Date: Wed, 28 Apr 2021 18:34:05 +0300
Message-ID: <eff3dfea-b394-606d-e342-eb5aa46e5efd@tarantool.org>
In-Reply-To: <b10505e4-5732-c224-4e0c-e9ccb99c3924@tarantool.org>

27.04.2021 00:20, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patch!
>
> See 2 questions, 1 comment.
>
> On 26.04.2021 18:59, Serge Petrenko wrote:

Thanks for the review!

>> We had various places in box.cc and relay.cc which counted processed
>> rows and yielded every now and then. These yields didn't cover cases
>> when recovery has to position inside a long WAL file:
>>
>> For example, when tarantool exits without leaving an empty WAL file
>> which'll be used to recover the instance vclock on restart. In this case
>> the instance freezes while processing the last available WAL in order
>> to recover the vclock.
>>
>> Another issue is with replication. If a replica connects and needs data
>> from the end of a really long WAL, recovery will read up to the needed
>> position without yields, making relay disconnect by timeout.
>>
>> In order to fix the issue, make recovery decide when a yield should
>> happen. Introduce a new callback, schedule_yield, which is called by
>> recovery once it processes enough rows (WAL_ROWS_PER_YIELD), no matter
>> how: either by simply skipping them or by calling xstream_write.
>>
>> schedule_yield either yields right away, in the case of relay, or saves
>> the yield for later, in the case of local recovery, because local
>> recovery might be in the middle of a transaction.
> 1. Did you consider an option to yield explicitly in recovery code when
> it skips rows? If they are being skipped, it does not matter what their
> transaction borders are.

I did consider that.
It is possible to do so, but then we'll have yet another place (in
addition to relay and wal_stream) which counts rows and yields every now
and then. I thought it would be better to unify all these places.
Actually, it could have been done this way from the very beginning: I
think it's not recovery's business whether to yield or not once some
rows are processed. Anyway, I can do it that way if you insist.

>
> Then the whole patch would be to add the yield once per WAL_ROWS_PER_YIELD
> to recovery_scan(), correct?

True. One place in recovery_scan() and one place in recover_xlog(), when
the rows are skipped.

>
>> The only place with explicit row counting and manual yielding is now in
>> relay_initial_join, since its row sources are engines rather than recovery
>> with its WAL files.
>>
>> Closes #5979
>> ---
>> https://github.com/tarantool/tarantool/tree/sp/gh-5979-recovery-yield
>> https://github.com/tarantool/tarantool/issues/5979
>>
>> diff --git a/src/box/box.cc b/src/box/box.cc
>> index 59925962d..69a8f87eb 100644
>> --- a/src/box/box.cc
>> +++ b/src/box/box.cc
>> @@ -3101,6 +3087,19 @@ bootstrap(const struct tt_uuid *instance_uuid,
>>  	}
>>  }
>>
>> +struct wal_stream wal_stream;
> 2. This must be static.

Sure.

===================================================
diff --git a/src/box/box.cc b/src/box/box.cc
index 69a8f87eb..62b55352e 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -3087,7 +3087,7 @@ bootstrap(const struct tt_uuid *instance_uuid,
 	}
 }

-struct wal_stream wal_stream;
+static struct wal_stream wal_stream;

 /**
  * Plan a yield in recovery stream. Wal stream will execute it as soon as it's
===================================================

>
>> +
>> +/**
>> + * Plan a yield in recovery stream. Wal stream will execute it as soon as it's
>> + * ready.
>> + */
>> +static void
>> +wal_stream_schedule_yield(void)
>> +{
>> +	wal_stream.has_yield = true;
>> +	wal_stream_try_yield(&wal_stream);
>> +}
>> diff --git a/src/box/recovery.cc b/src/box/recovery.cc
>> index cd33e7635..5351d8524 100644
>> --- a/src/box/recovery.cc
>> +++ b/src/box/recovery.cc
>> @@ -241,10 +248,16 @@ static void
>>  recover_xlog(struct recovery *r, struct xstream *stream,
>>  	     const struct vclock *stop_vclock)
>>  {
>> +	/* Imitate old behaviour. Rows are counted separately for each xlog. */
>> +	r->row_count = 0;
> 3. But why do you need to imitate it? Does it mean if the files are
> too small to yield even once in each, but in total their number is
> huge, there won't be yields?

Yes, that's true.

>
> Also does it mean "1M rows processed" was not ever printed in that
> case?

Yes, when WALs are not big enough. Recovery starts over with '0.1M rows
processed' on every new WAL file.

>
>>  	struct xrow_header row;
>> -	uint64_t row_count = 0;
>>  	while (xlog_cursor_next_xc(&r->cursor, &row,
>>  				   r->wal_dir.force_recovery) == 0) {
>> +		if (++r->row_count % WAL_ROWS_PER_YIELD == 0) {
>> +			r->schedule_yield();
>> +		}
>> +		if (r->row_count % 100000 == 0)
>> +			say_info("%.1fM rows processed", r->row_count / 1000000.);
>>  		/*
>>  		 * Read the next row from xlog file.
>>  		 *
>> @@ -273,12 +286,7 @@ recover_xlog(struct recovery *r, struct xstream *stream,
>>  		 * failed row anyway.
>>  		 */
>>  		vclock_follow_xrow(&r->vclock, &row);
>> -		if (xstream_write(stream, &row) == 0) {
>> -			++row_count;
>> -			if (row_count % 100000 == 0)
>> -				say_info("%.1fM rows processed",
>> -					 row_count / 1000000.);
>> -		} else {
>> +		if (xstream_write(stream, &row) != 0) {
>>  			if (!r->wal_dir.force_recovery)
>>  				diag_raise();

--
Serge Petrenko