[Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Sat Feb 13 00:48:49 MSK 2021


Hi! Thanks for the patch!

On 12.02.2021 12:25, Serge Petrenko via Tarantool-patches wrote:
> While sending a WAL, relay only yields in `coio_write_xrow`, once it
> sees the socket isn't ready for writes.
> It may happen that the socket is always ready for a long period of time,
> and relay doesn't yield at all while recovering a whole .xlog file. This
> may take well more than a minute.
> During this period of time, relay doesn't read replica's ACKs due to
> relay reader fiber not being scheduled, and once the reader is finally
> live it times out immediately, causing the replica to reconnect.
> 
> The problem is amplified by the fact that replica waits for
> replication_timeout to pass prior to reconnecting, which lets master
> pile up even more ready WALs, and effectively making it impossible for
> the replica to sync.

I couldn't understand this part. Why is it bad? Yes, the replica waits,
but the replica is the applier, running on another instance. How is that
related to the relay? And relay_reader does not send anything. So why is
it bad?

Couldn't the problem be fixed by reading all the non-consumed data after
reading the WAL?

The current solution also looks fine. Maybe even better because it
becomes consistent with local recovery. However I still want to
understand this part about replica.

> To fix the problem let's yield explicitly in relay_send_row every
> WAL_ROWS_PER_YIELD rows. The same is already done in local recovery, and
> serves the same purpose: to not block the event loop for too long.
> 
> Closes #5762
> ---
> diff --git a/src/box/relay.cc b/src/box/relay.cc
> index df04f8198..afc57dfbc 100644
> --- a/src/box/relay.cc
> +++ b/src/box/relay.cc
> @@ -836,11 +836,20 @@ relay_send(struct relay *relay, struct xrow_header *packet)
>  {
>  	ERROR_INJECT_YIELD(ERRINJ_RELAY_SEND_DELAY);
>  
> +	static uint64_t row_cnt = 0;

Relays run in separate threads. So this variable should either be
thread-local or live in struct relay. Otherwise you get non-atomic
updates, which may lead to some increments being lost.

Given that thread-local variable access is not free, I would go for
having it in struct relay, but up to you.
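
To illustrate what I mean, here is a minimal standalone sketch, not the
actual relay code. `relay_sketch`, `relay_sketch_send`, `run_relays` and
`ROWS_PER_YIELD` are hypothetical names made up for the example, and
`std::this_thread::yield()` is only a placeholder for `fiber_sleep(0)`.
Each thread owns exactly one struct, so the plain non-atomic increment is
safe and every relay yields on its own cadence:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for WAL_ROWS_PER_YIELD; the value is an assumption.
static const uint64_t ROWS_PER_YIELD = 1024;

// Hypothetical, stripped-down stand-in for struct relay.
struct relay_sketch {
	// Rows sent by this relay. Touched by the owning thread only.
	uint64_t row_cnt = 0;
	// Number of yields, kept only so the example can be checked.
	uint64_t yield_cnt = 0;
};

// Stand-in for relay_send(): count the row, yield every ROWS_PER_YIELD rows.
static void
relay_sketch_send(relay_sketch *relay)
{
	// The real code would write the row to the socket here.
	relay->row_cnt++;
	if (relay->row_cnt % ROWS_PER_YIELD == 0) {
		relay->yield_cnt++;
		// Placeholder for fiber_sleep(0) in the real relay.
		std::this_thread::yield();
	}
}

// Start n_relays threads, each sending rows_per_relay rows via its own struct.
static std::vector<relay_sketch>
run_relays(size_t n_relays, uint64_t rows_per_relay)
{
	assert(n_relays > 0);
	std::vector<relay_sketch> relays(n_relays);
	std::vector<std::thread> threads;
	for (size_t i = 0; i < n_relays; i++) {
		threads.emplace_back([&relays, i, rows_per_relay]() {
			for (uint64_t r = 0; r < rows_per_relay; r++)
				relay_sketch_send(&relays[i]);
		});
	}
	for (auto &t : threads)
		t.join();
	return relays;
}
```

With a single function-level static shared by all threads, the counts
above would no longer be guaranteed to add up. With the counter in the
struct, each relay sees exactly the rows it sent.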

>  	packet->sync = relay->sync;
>  	relay->last_row_time = ev_monotonic_now(loop());
>  	coio_write_xrow(&relay->io, packet);
>  	fiber_gc();
>  
> +	/*
> +	 * It may happen that the socket is always ready for write, so yield
> +	 * explicitly every now and then to not block the event loop.
> +	 */
> +	row_cnt++;
> +	if (row_cnt % WAL_ROWS_PER_YIELD == 0) {
> +		fiber_sleep(0);
> +	}

Maybe better to drop the {}, as the if's body is just one line.
