Tarantool development patches archive
* [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows
@ 2021-02-12 11:25 Serge Petrenko via Tarantool-patches
From: Serge Petrenko via Tarantool-patches @ 2021-02-12 11:25 UTC
  To: v.shpilevoy, gorcunov; +Cc: tarantool-patches

While sending a WAL, relay yields only in `coio_write_xrow`, and only
when it sees that the socket isn't ready for writes.
The socket may stay ready for a long time, in which case relay doesn't
yield at all while recovering a whole .xlog file. This may take well
over a minute.
During this time relay doesn't read the replica's ACKs, because the
relay reader fiber is never scheduled. Once the reader finally runs, it
times out immediately, causing the replica to reconnect.

The problem is amplified by the fact that the replica waits for
replication_timeout to pass before reconnecting. This lets the master
pile up even more ready WALs, effectively making it impossible for the
replica to sync.

To fix the problem, let's yield explicitly in relay_send every
WAL_ROWS_PER_YIELD rows. The same is already done in local recovery and
serves the same purpose: not blocking the event loop for too long.

Closes #5762
---
No test provided as the fix is quite obvious but rather hard to test
automatically.
https://github.com/tarantool/tarantool/issues/5762
https://github.com/tarantool/tarantool/tree/sp/gh-5762-relay-yield
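
For illustration only, the counter-based yield described above can be
sketched outside of Tarantool as follows. This is a toy model, not the
relay code: `ROWS_PER_YIELD` is a hypothetical stand-in for
WAL_ROWS_PER_YIELD (its real value is not shown in this patch),
`ToyRelay` and `count_yields` are made-up names, and the `yield`
callback stands in for `fiber_sleep(0)`.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical stand-in for WAL_ROWS_PER_YIELD; the value is chosen
// only to keep the example small.
constexpr uint64_t ROWS_PER_YIELD = 4;

// Toy sender: counts sent rows and calls `yield` after every
// ROWS_PER_YIELD-th row, whether or not the "socket" would block.
struct ToyRelay {
	// Per-object counter (the patch itself uses a function-local static).
	uint64_t row_cnt = 0;
	// Stand-in for fiber_sleep(0): gives other tasks a chance to run.
	std::function<void()> yield;

	void send_row() {
		// The real code would write the row to the socket here.
		if (++row_cnt % ROWS_PER_YIELD == 0)
			yield();
	}
};

// Sends `rows` rows through a fresh ToyRelay and returns how many
// times the sender yielded along the way.
inline uint64_t count_yields(uint64_t rows) {
	uint64_t yields = 0;
	ToyRelay relay;
	relay.yield = [&yields] { ++yields; };
	for (uint64_t i = 0; i < rows; ++i)
		relay.send_row();
	return yields;
}
```

In this model a sender that never blocks on writes still hands control
back once per ROWS_PER_YIELD rows, which is the property the relay
reader fiber needs in order to be scheduled.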

 src/box/relay.cc | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/src/box/relay.cc b/src/box/relay.cc
index df04f8198..afc57dfbc 100644
--- a/src/box/relay.cc
+++ b/src/box/relay.cc
@@ -836,11 +836,20 @@ relay_send(struct relay *relay, struct xrow_header *packet)
 {
 	ERROR_INJECT_YIELD(ERRINJ_RELAY_SEND_DELAY);
 
+	static uint64_t row_cnt = 0;
 	packet->sync = relay->sync;
 	relay->last_row_time = ev_monotonic_now(loop());
 	coio_write_xrow(&relay->io, packet);
 	fiber_gc();
 
+	/*
+	 * It may happen that the socket is always ready for write, so yield
+	 * explicitly every now and then to not block the event loop.
+	 */
+	row_cnt++;
+	if (row_cnt % WAL_ROWS_PER_YIELD == 0) {
+		fiber_sleep(0);
+	}
 	struct errinj *inj = errinj(ERRINJ_RELAY_TIMEOUT, ERRINJ_DOUBLE);
 	if (inj != NULL && inj->dparam > 0)
 		fiber_sleep(inj->dparam);
-- 
2.24.3 (Apple Git-128)


Thread overview: 23+ messages
2021-02-12 11:25 [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows Serge Petrenko via Tarantool-patches
2021-02-12 11:37 ` Cyrill Gorcunov via Tarantool-patches
2021-02-12 11:46   ` Cyrill Gorcunov via Tarantool-patches
2021-02-12 12:08   ` Serge Petrenko via Tarantool-patches
2021-02-12 17:00     ` Cyrill Gorcunov via Tarantool-patches
2021-02-12 21:48 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-12 22:25   ` Cyrill Gorcunov via Tarantool-patches
2021-02-15  8:45     ` Serge Petrenko via Tarantool-patches
2021-02-15  8:40   ` Serge Petrenko via Tarantool-patches
2021-02-17 21:11     ` Vladislav Shpilevoy via Tarantool-patches
2021-02-18 20:24       ` Serge Petrenko via Tarantool-patches
2021-02-23 22:30         ` Vladislav Shpilevoy via Tarantool-patches
2021-02-24  9:48           ` Serge Petrenko via Tarantool-patches
2021-02-24 10:15             ` Cyrill Gorcunov via Tarantool-patches
2021-02-24 10:35               ` Serge Petrenko via Tarantool-patches
2021-02-24 12:07                 ` Cyrill Gorcunov via Tarantool-patches
2021-02-24 12:14                   ` Serge Petrenko via Tarantool-patches
2021-02-24 22:20 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-26  8:41 ` Kirill Yukhin via Tarantool-patches
2021-02-26 20:24   ` Vladislav Shpilevoy via Tarantool-patches
2021-03-01 11:25     ` Serge Petrenko via Tarantool-patches
2021-03-01 21:24       ` Vladislav Shpilevoy via Tarantool-patches
2021-03-02  9:52       ` Kirill Yukhin via Tarantool-patches
