[Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows

Serge Petrenko sergepetrenko at tarantool.org
Fri Feb 12 14:25:41 MSK 2021


While sending a WAL, relay only yields in `coio_write_xrow`, and only
when it sees that the socket isn't ready for writes.
The socket may stay ready for a long period of time, in which case
relay doesn't yield at all while recovering a whole .xlog file. This
may take well over a minute.
During this period relay doesn't read the replica's ACKs, because the
relay reader fiber isn't scheduled. Once the reader finally runs, it
times out immediately, causing the replica to reconnect.

The problem is amplified by the fact that the replica waits for
replication_timeout to pass prior to reconnecting. This lets the master
pile up even more ready WALs, effectively making it impossible for the
replica to sync.

To fix the problem, let's yield explicitly in relay_send every
WAL_ROWS_PER_YIELD rows. The same is already done in local recovery, and
serves the same purpose: to avoid blocking the event loop for too long.

Closes #5762
---
No test provided as the fix is quite obvious but rather hard to test
automatically.
https://github.com/tarantool/tarantool/issues/5762
https://github.com/tarantool/tarantool/tree/sp/gh-5762-relay-yield
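
For illustration only, the "yield every N sent rows" pattern described
above can be sketched outside of Tarantool. This is a minimal standalone
simulation, not the relay code: `ROWS_PER_YIELD`, `Sender` and
`count_yields` are hypothetical names, the value 1024 is an arbitrary
stand-in for WAL_ROWS_PER_YIELD, and the `yield` callback merely stands
in for `fiber_sleep(0)`.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical stand-in for WAL_ROWS_PER_YIELD; the value is arbitrary
// and not taken from Tarantool's sources.
constexpr uint64_t ROWS_PER_YIELD = 1024;

// Simulated sender whose socket is always writable, so the write itself
// never gives other tasks a chance to run.
struct Sender {
	// Rows sent so far by this sender.
	uint64_t row_cnt = 0;
	// Stand-in for fiber_sleep(0): hands control back to the scheduler.
	std::function<void()> yield;

	void send_row()
	{
		// ... the row would be written here; in this simulation the
		// write never blocks ...
		row_cnt++;
		// Yield explicitly once per ROWS_PER_YIELD rows.
		if (row_cnt % ROWS_PER_YIELD == 0)
			yield();
	}
};

// Sends `rows` rows and returns how many times the sender yielded,
// i.e. how many chances other tasks (such as an ACK reader) got to run.
uint64_t count_yields(uint64_t rows)
{
	uint64_t yields = 0;
	Sender sender;
	sender.yield = [&yields] { ++yields; };
	for (uint64_t i = 0; i < rows; i++)
		sender.send_row();
	return yields;
}
```

Under these assumptions, sending 5000 rows gives the other tasks 4
chances to run, while without the explicit yield they would get none
until the sender finishes. The sketch keeps the counter per sender; that
is a choice of the sketch only and differs from the function-level
static counter used in the patch.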

 src/box/relay.cc | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/src/box/relay.cc b/src/box/relay.cc
index df04f8198..afc57dfbc 100644
--- a/src/box/relay.cc
+++ b/src/box/relay.cc
@@ -836,11 +836,20 @@ relay_send(struct relay *relay, struct xrow_header *packet)
 {
 	ERROR_INJECT_YIELD(ERRINJ_RELAY_SEND_DELAY);
 
+	static uint64_t row_cnt = 0;
 	packet->sync = relay->sync;
 	relay->last_row_time = ev_monotonic_now(loop());
 	coio_write_xrow(&relay->io, packet);
 	fiber_gc();
 
+	/*
+	 * It may happen that the socket is always ready for write, so yield
+	 * explicitly every now and then to not block the event loop.
+	 */
+	row_cnt++;
+	if (row_cnt % WAL_ROWS_PER_YIELD == 0) {
+		fiber_sleep(0);
+	}
 	struct errinj *inj = errinj(ERRINJ_RELAY_TIMEOUT, ERRINJ_DOUBLE);
 	if (inj != NULL && inj->dparam > 0)
 		fiber_sleep(inj->dparam);
-- 
2.24.3 (Apple Git-128)


