From: Vladislav Shpilevoy via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Serge Petrenko <sergepetrenko@tarantool.org>, gorcunov@gmail.com
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows
Date: Fri, 12 Feb 2021 22:48:49 +0100
Message-ID: <b26797a1-ae3b-7c51-6dfe-0733a460c721@tarantool.org>
In-Reply-To: <20210212112541.27561-1-sergepetrenko@tarantool.org>

Hi! Thanks for the patch!

On 12.02.2021 12:25, Serge Petrenko via Tarantool-patches wrote:
> While sending a WAL, relay only yields in `coio_write_xrow`, once it
> sees the socket isn't ready for writes.
> It may happen that the socket is always ready for a long period of time,
> and relay doesn't yield at all while recovering a whole .xlog file. This
> may take well more than a minute.
> During this period of time, relay doesn't read replica's ACKs due to
> relay reader fiber not being scheduled, and once the reader is finally
> live it times out immediately, causing the replica to reconnect.
>
> The problem is amplified by the fact that replica waits for
> replication_timeout to pass prior to reconnecting, which lets master
> pile up even more ready WALs, and effectively making it impossible for
> the replica to sync.

I couldn't understand this part. Why is it bad? Yeah, replica waits, but
replica is applier, on another instance. How is it related? And
relay_reader does not send anything. So why is it bad?

Couldn't the problem be fixed by reading all the non-consumed data after
reading WAL?

The current solution also looks fine. Maybe even better because it becomes
consistent with local recovery. However I still want to understand this
part about replica.

> To fix the problem let's yield explicitly in relay_send_row every
> WAL_ROWS_PER_YIELD rows. The same is already done in local recovery, and
> serves the same purpose: to not block the event loop for too long.
>
> Closes #5762
> ---
> diff --git a/src/box/relay.cc b/src/box/relay.cc
> index df04f8198..afc57dfbc 100644
> --- a/src/box/relay.cc
> +++ b/src/box/relay.cc
> @@ -836,11 +836,20 @@ relay_send(struct relay *relay, struct xrow_header *packet)
>  {
>  	ERROR_INJECT_YIELD(ERRINJ_RELAY_SEND_DELAY);
>
> +	static uint64_t row_cnt = 0;

Relays are in threads. So this variable either should be thread-local, or
be in struct relay. Otherwise you get non-atomic updates which may lead
to some increments disappearing.

Given that thread-local variable access is not free, I would go for having
it in struct relay, but up to you.

>  	packet->sync = relay->sync;
>  	relay->last_row_time = ev_monotonic_now(loop());
>  	coio_write_xrow(&relay->io, packet);
>  	fiber_gc();
>
> +	/*
> +	 * It may happen that the socket is always ready for write, so yield
> +	 * explicitly every now and then to not block the event loop.
> +	 */
> +	row_cnt++;
> +	if (row_cnt % WAL_ROWS_PER_YIELD == 0) {
> +		fiber_sleep(0);
> +	}

Maybe better drop {} as the if's body is just one line.
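
To illustrate the two suggestions above (counter stored per relay, no braces around the one-line body), here is a minimal standalone sketch. It is not the actual patch: `struct relay_sketch`, `ROWS_PER_YIELD`, `yield_stub()` and `relay_sketch_send_row()` are hypothetical names invented for this example, and the stub merely counts yields where the real code would call `fiber_sleep(0)`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for WAL_ROWS_PER_YIELD; the value is arbitrary. */
#define ROWS_PER_YIELD 4

/*
 * Hypothetical stand-in for struct relay. Only the fields relevant to the
 * discussion are shown. Each relay owns its counter, so relays running in
 * different threads never update shared state.
 */
struct relay_sketch {
	/* Number of rows sent by this relay so far. */
	uint64_t row_count;
	/* Number of explicit yields, kept only to observe the behaviour. */
	uint64_t yield_count;
};

/* Stand-in for fiber_sleep(0): records that a yield would happen here. */
static void
yield_stub(struct relay_sketch *relay)
{
	relay->yield_count++;
}

/* Stand-in for relay_send(): count the row, yield every ROWS_PER_YIELD. */
static void
relay_sketch_send_row(struct relay_sketch *relay)
{
	assert(relay != NULL);
	/* The row would be written to the socket at this point. */
	if (++relay->row_count % ROWS_PER_YIELD == 0)
		yield_stub(relay);
}
```

With a function-local `static` counter all relay threads would increment the same variable without synchronization; keeping the counter in the per-relay structure avoids that without the cost of thread-local access mentioned above.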