Tarantool development patches archive
 help / color / mirror / Atom feed
From: Serge Petrenko via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>, gorcunov@gmail.com
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows
Date: Mon, 15 Feb 2021 11:40:19 +0300	[thread overview]
Message-ID: <57b04874-1bb7-3d62-856d-b60df700514a@tarantool.org> (raw)
In-Reply-To: <b26797a1-ae3b-7c51-6dfe-0733a460c721@tarantool.org>



13.02.2021 00:48, Vladislav Shpilevoy пишет:
> Hi! Thanks for the patch!
>
> On 12.02.2021 12:25, Serge Petrenko via Tarantool-patches wrote:
>> While sending a WAL, relay only yields in `coio_write_xrow`, once it
>> sees the socket isn't ready for writes.
>> It may happen that the socket is always ready for a long period of time,
>> and relay doesn't yield at all while recovering a whole .xlog file. This
>> may take well more than a minute.
>> During this period of time, relay doesn't read replica's ACKs due to
>> relay reader fiber not being scheduled, and once the reader is finally
>> live it times out immediately, causing the replica to reconnect.
>>
>> The problem is amplified by the fact that replica waits for
>> replication_timeout to pass prior to reconnecting, which lets master
>> pile up even more ready WALs, and effectively making it impossible for
>> the replica to sync.
> I couldn't understand this part. Why is it bad? Yeah, replica waits,
> but replica is applier, on another instance. How is it related? And
> relay_reader does not send anything. So why is it bad?

Thanks for the review!

I shouldn't have included this paragraph to the explanation probably.
I tried to explain how this bug leads to replica not being able to sync
with master when master's under load.

I reworded the commit message a bit, hope it's more clear now.

>
> Couldn't the problem be fixed by reading all the non-consumed data after
> reading WAL?

Relay does read every ack received while feeding a WAL, but it reads the 
acks only
after finishing reading WAL, so all the reads time-out.

>
> The current solution also looks fine. Maybe even better because it
> becomes consistent with local recovery. However I still want to
> understand this part about replica.
>
>> To fix the problem let's yield explicitly in relay_send_row every
>> WAL_ROWS_PER_YIELD rows. The same is already done in local recovery, and
>> serves the same purpose: to not block the event loop for too long.
>>
>> Closes #5762
>> ---
>> diff --git a/src/box/relay.cc b/src/box/relay.cc
>> index df04f8198..afc57dfbc 100644
>> --- a/src/box/relay.cc
>> +++ b/src/box/relay.cc
>> @@ -836,11 +836,20 @@ relay_send(struct relay *relay, struct xrow_header *packet)
>>   {
>>   	ERROR_INJECT_YIELD(ERRINJ_RELAY_SEND_DELAY);
>>   
>> +	static uint64_t row_cnt = 0;
> Relays are in threads. So this variable either should be thread-local,
> or be in struct relay. Otherwise you get non-atomic updates which may
> lead to some increments disappearing.
>
> Given that thread-local variable access is not free, I would go for
> having it in struct relay, but up to you.

Thanks for noticing! Let it be in relay then.
Diff:

================================================
diff --git a/src/box/relay.cc b/src/box/relay.cc
index 1d8edf116..6d9269e1d 100644
--- a/src/box/relay.cc
+++ b/src/box/relay.cc
@@ -117,6 +117,11 @@ struct relay {
          * is passed by the replica on subscribe.
          */
         uint32_t id_filter;
+       /**
+        * How many rows has this relay sent to the replica. Used to 
yield once
+        * in a while when reading a WAL to unblock the event loop.
+        */
+       size_t row_cnt;
         /**
          * Local vclock at the moment of subscribe, used to check
          * dataset on the other side and send missing data rows if any.
@@ -218,6 +223,7 @@ relay_start(struct relay *relay, int fd, uint64_t sync,
         coio_create(&relay->io, fd);
         relay->sync = sync;
         relay->state = RELAY_FOLLOW;
+       relay->row_cnt = 0;
         relay->last_row_time = ev_monotonic_now(loop());
  }

@@ -836,7 +842,6 @@ relay_send(struct relay *relay, struct xrow_header 
*packet)
  {
         ERROR_INJECT_YIELD(ERRINJ_RELAY_SEND_DELAY);

-       static size_t row_cnt = 0;
         packet->sync = relay->sync;
         relay->last_row_time = ev_monotonic_now(loop());
         coio_write_xrow(&relay->io, packet);
@@ -846,7 +851,7 @@ relay_send(struct relay *relay, struct xrow_header 
*packet)
          * It may happen that the socket is always ready for write, so 
yield
          * explicitly every now and then to not block the event loop.
          */
-       if (++row_cnt % WAL_ROWS_PER_YIELD == 0)
+       if (++relay->row_cnt % WAL_ROWS_PER_YIELD == 0)
                 fiber_sleep(0);

         struct errinj *inj = errinj(ERRINJ_RELAY_TIMEOUT, ERRINJ_DOUBLE);



>
>>   	packet->sync = relay->sync;
>>   	relay->last_row_time = ev_monotonic_now(loop());
>>   	coio_write_xrow(&relay->io, packet);
>>   	fiber_gc();
>>   
>> +	/*
>> +	 * It may happen that the socket is always ready for write, so yield
>> +	 * explicitly every now and then to not block the event loop.
>> +	 */
>> +	row_cnt++;
>> +	if (row_cnt % WAL_ROWS_PER_YIELD == 0) {
>> +		fiber_sleep(0);
>> +	}
> Maybe better drop {} as the if's body is just one line.
Already fixed in reply to Cyrill.

-- 
Serge Petrenko


  parent reply	other threads:[~2021-02-15  8:40 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-12 11:25 Serge Petrenko via Tarantool-patches
2021-02-12 11:37 ` Cyrill Gorcunov via Tarantool-patches
2021-02-12 11:46   ` Cyrill Gorcunov via Tarantool-patches
2021-02-12 12:08   ` Serge Petrenko via Tarantool-patches
2021-02-12 17:00     ` Cyrill Gorcunov via Tarantool-patches
2021-02-12 21:48 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-12 22:25   ` Cyrill Gorcunov via Tarantool-patches
2021-02-15  8:45     ` Serge Petrenko via Tarantool-patches
2021-02-15  8:40   ` Serge Petrenko via Tarantool-patches [this message]
2021-02-17 21:11     ` Vladislav Shpilevoy via Tarantool-patches
2021-02-18 20:24       ` Serge Petrenko via Tarantool-patches
2021-02-23 22:30         ` Vladislav Shpilevoy via Tarantool-patches
2021-02-24  9:48           ` Serge Petrenko via Tarantool-patches
2021-02-24 10:15             ` Cyrill Gorcunov via Tarantool-patches
2021-02-24 10:35               ` Serge Petrenko via Tarantool-patches
2021-02-24 12:07                 ` Cyrill Gorcunov via Tarantool-patches
2021-02-24 12:14                   ` Serge Petrenko via Tarantool-patches
2021-02-24 22:20 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-26  8:41 ` Kirill Yukhin via Tarantool-patches
2021-02-26 20:24   ` Vladislav Shpilevoy via Tarantool-patches
2021-03-01 11:25     ` Serge Petrenko via Tarantool-patches
2021-03-01 21:24       ` Vladislav Shpilevoy via Tarantool-patches
2021-03-02  9:52       ` Kirill Yukhin via Tarantool-patches

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=57b04874-1bb7-3d62-856d-b60df700514a@tarantool.org \
    --to=tarantool-patches@dev.tarantool.org \
    --cc=gorcunov@gmail.com \
    --cc=sergepetrenko@tarantool.org \
    --cc=v.shpilevoy@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox