* [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows @ 2021-02-12 11:25 Serge Petrenko via Tarantool-patches 2021-02-12 11:37 ` Cyrill Gorcunov via Tarantool-patches ` (3 more replies) 0 siblings, 4 replies; 23+ messages in thread From: Serge Petrenko via Tarantool-patches @ 2021-02-12 11:25 UTC (permalink / raw) To: v.shpilevoy, gorcunov; +Cc: tarantool-patches While sending a WAL, relay only yields in `coio_write_xrow`, once it sees the socket isn't ready for writes. It may happen that the socket is always ready for a long period of time, and relay doesn't yield at all while recovering a whole .xlog file. This may take well more than a minute. During this period of time, relay doesn't read replica's ACKs due to relay reader fiber not being scheduled, and once the reader is finally live it times out immediately, causing the replica to reconnect. The problem is amplified by the fact that replica waits for replication_timeout to pass prior to reconnecting, which lets master pile up even more ready WALs, and effectively making it impossible for the replica to sync. To fix the problem let's yield explicitly in relay_send_row every WAL_ROWS_PER_YIELD rows. The same is already done in local recovery, and serves the same purpose: to not block the event loop for too long. Closes #5762 --- No test provided as the fix is quite obvious but rather hard to test automatically. https://github.com/tarantool/tarantool/issues/5762 https://github.com/tarantool/tarantool/tree/sp/gh-5762-relay-yield src/box/relay.cc | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/src/box/relay.cc b/src/box/relay.cc index df04f8198..afc57dfbc 100644 --- a/src/box/relay.cc +++ b/src/box/relay.cc @@ -836,11 +836,20 @@ relay_send(struct relay *relay, struct xrow_header *packet) { ERROR_INJECT_YIELD(ERRINJ_RELAY_SEND_DELAY); + static uint64_t row_cnt = 0; packet->sync = relay->sync; relay->last_row_time = ev_monotonic_now(loop()); coio_write_xrow(&relay->io, packet); fiber_gc(); + /* + * It may happen that the socket is always ready for write, so yield + * explicitly every now and then to not block the event loop. + */ + row_cnt++; + if (row_cnt % WAL_ROWS_PER_YIELD == 0) { + fiber_sleep(0); + } struct errinj *inj = errinj(ERRINJ_RELAY_TIMEOUT, ERRINJ_DOUBLE); if (inj != NULL && inj->dparam > 0) fiber_sleep(inj->dparam); -- 2.24.3 (Apple Git-128) ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows 2021-02-12 11:25 [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows Serge Petrenko via Tarantool-patches @ 2021-02-12 11:37 ` Cyrill Gorcunov via Tarantool-patches 2021-02-12 11:46 ` Cyrill Gorcunov via Tarantool-patches 2021-02-12 12:08 ` Serge Petrenko via Tarantool-patches 2021-02-12 21:48 ` Vladislav Shpilevoy via Tarantool-patches ` (2 subsequent siblings) 3 siblings, 2 replies; 23+ messages in thread From: Cyrill Gorcunov via Tarantool-patches @ 2021-02-12 11:37 UTC (permalink / raw) To: Serge Petrenko; +Cc: v.shpilevoy, tarantool-patches On Fri, Feb 12, 2021 at 02:25:41PM +0300, Serge Petrenko wrote: > @@ -836,11 +836,20 @@ relay_send(struct relay *relay, struct xrow_header *packet) > { > ERROR_INJECT_YIELD(ERRINJ_RELAY_SEND_DELAY); > > + static uint64_t row_cnt = 0; > packet->sync = relay->sync; > relay->last_row_time = ev_monotonic_now(loop()); > coio_write_xrow(&relay->io, packet); > fiber_gc(); > > + /* > + * It may happen that the socket is always ready for write, so yield > + * explicitly every now and then to not block the event loop. > + */ > + row_cnt++; > + if (row_cnt % WAL_ROWS_PER_YIELD == 0) { > + fiber_sleep(0); > + } Serge, if I'm not missing something obvious, this row counter is related to xstream->rows? So maybe define this type as size_t instead, to be consistent with another occurence of WAL_ROWS_PER_YIELD. Say, static size_t row_cnt = 0; ... if (++row_cnt % WAL_ROWS_PER_YIELD == 0) fiber_sleep(0); I'm fine with uint64_t as well, just out of curiosity. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows 2021-02-12 11:37 ` Cyrill Gorcunov via Tarantool-patches @ 2021-02-12 11:46 ` Cyrill Gorcunov via Tarantool-patches 2021-02-12 12:08 ` Serge Petrenko via Tarantool-patches 1 sibling, 0 replies; 23+ messages in thread From: Cyrill Gorcunov via Tarantool-patches @ 2021-02-12 11:46 UTC (permalink / raw) To: Serge Petrenko; +Cc: v.shpilevoy, tarantool-patches On Fri, Feb 12, 2021 at 02:37:48PM +0300, Cyrill Gorcunov wrote: > > Serge, if I'm not missing something obvious, this row counter is > related to xstream->rows? So maybe define this type as size_t instead, > to be consistent with another occurence of WAL_ROWS_PER_YIELD. Say, > > static size_t row_cnt = 0; > ... > if (++row_cnt % WAL_ROWS_PER_YIELD == 0) > fiber_sleep(0); > > I'm fine with uint64_t as well, just out of curiosity. Ack for current version as well Cyrill ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows 2021-02-12 11:37 ` Cyrill Gorcunov via Tarantool-patches 2021-02-12 11:46 ` Cyrill Gorcunov via Tarantool-patches @ 2021-02-12 12:08 ` Serge Petrenko via Tarantool-patches 2021-02-12 17:00 ` Cyrill Gorcunov via Tarantool-patches 1 sibling, 1 reply; 23+ messages in thread From: Serge Petrenko via Tarantool-patches @ 2021-02-12 12:08 UTC (permalink / raw) To: Cyrill Gorcunov; +Cc: v.shpilevoy, tarantool-patches 12.02.2021 14:37, Cyrill Gorcunov пишет: > On Fri, Feb 12, 2021 at 02:25:41PM +0300, Serge Petrenko wrote: >> @@ -836,11 +836,20 @@ relay_send(struct relay *relay, struct xrow_header *packet) >> { >> ERROR_INJECT_YIELD(ERRINJ_RELAY_SEND_DELAY); >> >> + static uint64_t row_cnt = 0; >> packet->sync = relay->sync; >> relay->last_row_time = ev_monotonic_now(loop()); >> coio_write_xrow(&relay->io, packet); >> fiber_gc(); >> >> + /* >> + * It may happen that the socket is always ready for write, so yield >> + * explicitly every now and then to not block the event loop. >> + */ >> + row_cnt++; >> + if (row_cnt % WAL_ROWS_PER_YIELD == 0) { >> + fiber_sleep(0); >> + } > Serge, if I'm not missing something obvious, this row counter is > related to xstream->rows? So maybe define this type as size_t instead, Thanks for the review! Yes, it serves the same purpose. No problem, let it be size_t, updated. > to be consistent with another occurence of WAL_ROWS_PER_YIELD. Say, > > static size_t row_cnt = 0; > ... > if (++row_cnt % WAL_ROWS_PER_YIELD == 0) > fiber_sleep(0); > > I'm fine with uint64_t as well, just out of curiosity. Looks fine. My bad, I didn't pay enough attention when writing the patch. ========================================================================= diff --git a/src/box/relay.cc b/src/box/relay.cc index afc57dfbc..1d8edf116 100644 --- a/src/box/relay.cc +++ b/src/box/relay.cc @@ -836,7 +836,7 @@ relay_send(struct relay *relay, struct xrow_header *packet) { ERROR_INJECT_YIELD(ERRINJ_RELAY_SEND_DELAY); - static uint64_t row_cnt = 0; + static size_t row_cnt = 0; packet->sync = relay->sync; relay->last_row_time = ev_monotonic_now(loop()); coio_write_xrow(&relay->io, packet); @@ -846,10 +846,9 @@ relay_send(struct relay *relay, struct xrow_header *packet) * It may happen that the socket is always ready for write, so yield * explicitly every now and then to not block the event loop. */ - row_cnt++; - if (row_cnt % WAL_ROWS_PER_YIELD == 0) { + if (++row_cnt % WAL_ROWS_PER_YIELD == 0) fiber_sleep(0); - } + struct errinj *inj = errinj(ERRINJ_RELAY_TIMEOUT, ERRINJ_DOUBLE); if (inj != NULL && inj->dparam > 0) fiber_sleep(inj->dparam); -- Serge Petrenko ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows 2021-02-12 12:08 ` Serge Petrenko via Tarantool-patches @ 2021-02-12 17:00 ` Cyrill Gorcunov via Tarantool-patches 0 siblings, 0 replies; 23+ messages in thread From: Cyrill Gorcunov via Tarantool-patches @ 2021-02-12 17:00 UTC (permalink / raw) To: Serge Petrenko; +Cc: v.shpilevoy, tarantool-patches On Fri, Feb 12, 2021 at 03:08:51PM +0300, Serge Petrenko wrote: ... > ========================================================================= > diff --git a/src/box/relay.cc b/src/box/relay.cc > index afc57dfbc..1d8edf116 100644 > --- a/src/box/relay.cc > +++ b/src/box/relay.cc > @@ -836,7 +836,7 @@ relay_send(struct relay *relay, struct xrow_header > *packet) > { Acked-by: Cyrill Gorcunov <gorcunov@gmail.com> ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows 2021-02-12 11:25 [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows Serge Petrenko via Tarantool-patches 2021-02-12 11:37 ` Cyrill Gorcunov via Tarantool-patches @ 2021-02-12 21:48 ` Vladislav Shpilevoy via Tarantool-patches 2021-02-12 22:25 ` Cyrill Gorcunov via Tarantool-patches 2021-02-15 8:40 ` Serge Petrenko via Tarantool-patches 2021-02-24 22:20 ` Vladislav Shpilevoy via Tarantool-patches 2021-02-26 8:41 ` Kirill Yukhin via Tarantool-patches 3 siblings, 2 replies; 23+ messages in thread From: Vladislav Shpilevoy via Tarantool-patches @ 2021-02-12 21:48 UTC (permalink / raw) To: Serge Petrenko, gorcunov; +Cc: tarantool-patches Hi! Thanks for the patch! On 12.02.2021 12:25, Serge Petrenko via Tarantool-patches wrote: > While sending a WAL, relay only yields in `coio_write_xrow`, once it > sees the socket isn't ready for writes. > It may happen that the socket is always ready for a long period of time, > and relay doesn't yield at all while recovering a whole .xlog file. This > may take well more than a minute. > During this period of time, relay doesn't read replica's ACKs due to > relay reader fiber not being scheduled, and once the reader is finally > live it times out immediately, causing the replica to reconnect. > > The problem is amplified by the fact that replica waits for > replication_timeout to pass prior to reconnecting, which lets master > pile up even more ready WALs, and effectively making it impossible for > the replica to sync. I couldn't understand this part. Why is it bad? Yeah, replica waits, but replica is applier, on another instance. How is it related? And relay_reader does not send anything. So why is it bad? Couldn't the problem be fixed by reading all the non-consumed data after reading WAL? The current solution also looks fine. Maybe even better because it becomes consistent with local recovery. However I still want to understand this part about replica. > To fix the problem let's yield explicitly in relay_send_row every > WAL_ROWS_PER_YIELD rows. The same is already done in local recovery, and > serves the same purpose: to not block the event loop for too long. > > Closes #5762 > --- > diff --git a/src/box/relay.cc b/src/box/relay.cc > index df04f8198..afc57dfbc 100644 > --- a/src/box/relay.cc > +++ b/src/box/relay.cc > @@ -836,11 +836,20 @@ relay_send(struct relay *relay, struct xrow_header *packet) > { > ERROR_INJECT_YIELD(ERRINJ_RELAY_SEND_DELAY); > > + static uint64_t row_cnt = 0; Relays are in threads. So this variable either should be thread-local, or be in struct relay. Otherwise you get non-atomic updates which may lead to some increments disappearing. Given that thread-local variable access is not free, I would go for having it in struct relay, but up to you. > packet->sync = relay->sync; > relay->last_row_time = ev_monotonic_now(loop()); > coio_write_xrow(&relay->io, packet); > fiber_gc(); > > + /* > + * It may happen that the socket is always ready for write, so yield > + * explicitly every now and then to not block the event loop. > + */ > + row_cnt++; > + if (row_cnt % WAL_ROWS_PER_YIELD == 0) { > + fiber_sleep(0); > + } Maybe better drop {} as the if's body is just one line. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows 2021-02-12 21:48 ` Vladislav Shpilevoy via Tarantool-patches @ 2021-02-12 22:25 ` Cyrill Gorcunov via Tarantool-patches 2021-02-15 8:45 ` Serge Petrenko via Tarantool-patches 2021-02-15 8:40 ` Serge Petrenko via Tarantool-patches 1 sibling, 1 reply; 23+ messages in thread From: Cyrill Gorcunov via Tarantool-patches @ 2021-02-12 22:25 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches On Fri, Feb 12, 2021 at 10:48:49PM +0100, Vladislav Shpilevoy wrote: > > diff --git a/src/box/relay.cc b/src/box/relay.cc > > index df04f8198..afc57dfbc 100644 > > --- a/src/box/relay.cc > > +++ b/src/box/relay.cc > > @@ -836,11 +836,20 @@ relay_send(struct relay *relay, struct xrow_header *packet) > > { > > ERROR_INJECT_YIELD(ERRINJ_RELAY_SEND_DELAY); > > > > + static uint64_t row_cnt = 0; > > Relays are in threads. So this variable either should be thread-local, > or be in struct relay. Otherwise you get non-atomic updates which may > lead to some increments disappearing. That's the good catch! Without lock/tls this gonna be completely arbritrary updates. > Given that thread-local variable access is not free, I would go for > having it in struct relay, but up to you. Actually tls access should be as cheap as regular memory access except using different base register (iirc %fs on linux). But maybe things are changed novadays. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows 2021-02-12 22:25 ` Cyrill Gorcunov via Tarantool-patches @ 2021-02-15 8:45 ` Serge Petrenko via Tarantool-patches 0 siblings, 0 replies; 23+ messages in thread From: Serge Petrenko via Tarantool-patches @ 2021-02-15 8:45 UTC (permalink / raw) To: Cyrill Gorcunov, Vladislav Shpilevoy; +Cc: tarantool-patches 13.02.2021 01:25, Cyrill Gorcunov пишет: > On Fri, Feb 12, 2021 at 10:48:49PM +0100, Vladislav Shpilevoy wrote: >>> diff --git a/src/box/relay.cc b/src/box/relay.cc >>> index df04f8198..afc57dfbc 100644 >>> --- a/src/box/relay.cc >>> +++ b/src/box/relay.cc >>> @@ -836,11 +836,20 @@ relay_send(struct relay *relay, struct xrow_header *packet) >>> { >>> ERROR_INJECT_YIELD(ERRINJ_RELAY_SEND_DELAY); >>> >>> + static uint64_t row_cnt = 0; >> Relays are in threads. So this variable either should be thread-local, >> or be in struct relay. Otherwise you get non-atomic updates which may >> lead to some increments disappearing. > That's the good catch! Without lock/tls this gonna be completely > arbritrary updates. True. My bad I failed to notice this. > >> Given that thread-local variable access is not free, I would go for >> having it in struct relay, but up to you. > Actually tls access should be as cheap as regular memory access > except using different base register (iirc %fs on linux). But > maybe things are changed novadays. I found some random article regarding tls access cost: https://software.intel.com/content/www/us/en/develop/blogs/the-hidden-performance-cost-of-accessing-thread-local-variables.html So it might be not that cheap. However I didn't dive too deep into the article. I decided to implement the counter as relay member for now. -- Serge Petrenko ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows 2021-02-12 21:48 ` Vladislav Shpilevoy via Tarantool-patches 2021-02-12 22:25 ` Cyrill Gorcunov via Tarantool-patches @ 2021-02-15 8:40 ` Serge Petrenko via Tarantool-patches 2021-02-17 21:11 ` Vladislav Shpilevoy via Tarantool-patches 1 sibling, 1 reply; 23+ messages in thread From: Serge Petrenko via Tarantool-patches @ 2021-02-15 8:40 UTC (permalink / raw) To: Vladislav Shpilevoy, gorcunov; +Cc: tarantool-patches 13.02.2021 00:48, Vladislav Shpilevoy пишет: > Hi! Thanks for the patch! > > On 12.02.2021 12:25, Serge Petrenko via Tarantool-patches wrote: >> While sending a WAL, relay only yields in `coio_write_xrow`, once it >> sees the socket isn't ready for writes. >> It may happen that the socket is always ready for a long period of time, >> and relay doesn't yield at all while recovering a whole .xlog file. This >> may take well more than a minute. >> During this period of time, relay doesn't read replica's ACKs due to >> relay reader fiber not being scheduled, and once the reader is finally >> live it times out immediately, causing the replica to reconnect. >> >> The problem is amplified by the fact that replica waits for >> replication_timeout to pass prior to reconnecting, which lets master >> pile up even more ready WALs, and effectively making it impossible for >> the replica to sync. > I couldn't understand this part. Why is it bad? Yeah, replica waits, > but replica is applier, on another instance. How is it related? And > relay_reader does not send anything. So why is it bad? Thanks for the review! I shouldn't have included this paragraph to the explanation probably. I tried to explain how this bug leads to replica not being able to sync with master when master's under load. I reworded the commit message a bit, hope it's more clear now. > > Couldn't the problem be fixed by reading all the non-consumed data after > reading WAL? Relay does read every ack received while feeding a WAL, but it reads the acks only after finishing reading WAL, so all the reads time-out. > > The current solution also looks fine. Maybe even better because it > becomes consistent with local recovery. However I still want to > understand this part about replica. > >> To fix the problem let's yield explicitly in relay_send_row every >> WAL_ROWS_PER_YIELD rows. The same is already done in local recovery, and >> serves the same purpose: to not block the event loop for too long. >> >> Closes #5762 >> --- >> diff --git a/src/box/relay.cc b/src/box/relay.cc >> index df04f8198..afc57dfbc 100644 >> --- a/src/box/relay.cc >> +++ b/src/box/relay.cc >> @@ -836,11 +836,20 @@ relay_send(struct relay *relay, struct xrow_header *packet) >> { >> ERROR_INJECT_YIELD(ERRINJ_RELAY_SEND_DELAY); >> >> + static uint64_t row_cnt = 0; > Relays are in threads. So this variable either should be thread-local, > or be in struct relay. Otherwise you get non-atomic updates which may > lead to some increments disappearing. > > Given that thread-local variable access is not free, I would go for > having it in struct relay, but up to you. Thanks for noticing! Let it be in relay then. Diff: ================================================ diff --git a/src/box/relay.cc b/src/box/relay.cc index 1d8edf116..6d9269e1d 100644 --- a/src/box/relay.cc +++ b/src/box/relay.cc @@ -117,6 +117,11 @@ struct relay { * is passed by the replica on subscribe. */ uint32_t id_filter; + /** + * How many rows has this relay sent to the replica. Used to yield once + * in a while when reading a WAL to unblock the event loop. + */ + size_t row_cnt; /** * Local vclock at the moment of subscribe, used to check * dataset on the other side and send missing data rows if any. @@ -218,6 +223,7 @@ relay_start(struct relay *relay, int fd, uint64_t sync, coio_create(&relay->io, fd); relay->sync = sync; relay->state = RELAY_FOLLOW; + relay->row_cnt = 0; relay->last_row_time = ev_monotonic_now(loop()); } @@ -836,7 +842,6 @@ relay_send(struct relay *relay, struct xrow_header *packet) { ERROR_INJECT_YIELD(ERRINJ_RELAY_SEND_DELAY); - static size_t row_cnt = 0; packet->sync = relay->sync; relay->last_row_time = ev_monotonic_now(loop()); coio_write_xrow(&relay->io, packet); @@ -846,7 +851,7 @@ relay_send(struct relay *relay, struct xrow_header *packet) * It may happen that the socket is always ready for write, so yield * explicitly every now and then to not block the event loop. */ - if (++row_cnt % WAL_ROWS_PER_YIELD == 0) + if (++relay->row_cnt % WAL_ROWS_PER_YIELD == 0) fiber_sleep(0); struct errinj *inj = errinj(ERRINJ_RELAY_TIMEOUT, ERRINJ_DOUBLE); > >> packet->sync = relay->sync; >> relay->last_row_time = ev_monotonic_now(loop()); >> coio_write_xrow(&relay->io, packet); >> fiber_gc(); >> >> + /* >> + * It may happen that the socket is always ready for write, so yield >> + * explicitly every now and then to not block the event loop. >> + */ >> + row_cnt++; >> + if (row_cnt % WAL_ROWS_PER_YIELD == 0) { >> + fiber_sleep(0); >> + } > Maybe better drop {} as the if's body is just one line. Already fixed in reply to Cyrill. -- Serge Petrenko ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows 2021-02-15 8:40 ` Serge Petrenko via Tarantool-patches @ 2021-02-17 21:11 ` Vladislav Shpilevoy via Tarantool-patches 2021-02-18 20:24 ` Serge Petrenko via Tarantool-patches 0 siblings, 1 reply; 23+ messages in thread From: Vladislav Shpilevoy via Tarantool-patches @ 2021-02-17 21:11 UTC (permalink / raw) To: Serge Petrenko, gorcunov; +Cc: tarantool-patches Hi! Thanks for the fixes! > diff --git a/src/box/relay.cc b/src/box/relay.cc > index 1d8edf116..6d9269e1d 100644 > --- a/src/box/relay.cc > +++ b/src/box/relay.cc > @@ -117,6 +117,11 @@ struct relay { > * is passed by the replica on subscribe. > */ > uint32_t id_filter; > + /** > + * How many rows has this relay sent to the replica. Used to yield once > + * in a while when reading a WAL to unblock the event loop. > + */ > + size_t row_cnt; 1. But it is not a size of anything, right? Maybe make it int64_t then? > @@ -846,7 +851,7 @@ relay_send(struct relay *relay, struct xrow_header *packet) > * It may happen that the socket is always ready for write, so yield > * explicitly every now and then to not block the event loop. > */ > - if (++row_cnt % WAL_ROWS_PER_YIELD == 0) > + if (++relay->row_cnt % WAL_ROWS_PER_YIELD == 0) 2. I just found WAL_ROWS_PER_YIELD is not a power of 2. This means '%' may be expensive. If WAL_ROWS_PER_YIELD would be a power of 2, '%' would be turned by the compiler into '&' to cut the remainder off. Maybe worth changing in a separate non-related to this bug commit? ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows 2021-02-17 21:11 ` Vladislav Shpilevoy via Tarantool-patches @ 2021-02-18 20:24 ` Serge Petrenko via Tarantool-patches 2021-02-23 22:30 ` Vladislav Shpilevoy via Tarantool-patches 0 siblings, 1 reply; 23+ messages in thread From: Serge Petrenko via Tarantool-patches @ 2021-02-18 20:24 UTC (permalink / raw) To: Vladislav Shpilevoy, gorcunov; +Cc: tarantool-patches 18.02.2021 00:11, Vladislav Shpilevoy пишет: > Hi! Thanks for the fixes! Thanks for the review! > >> diff --git a/src/box/relay.cc b/src/box/relay.cc >> index 1d8edf116..6d9269e1d 100644 >> --- a/src/box/relay.cc >> +++ b/src/box/relay.cc >> @@ -117,6 +117,11 @@ struct relay { >> * is passed by the replica on subscribe. >> */ >> uint32_t id_filter; >> + /** >> + * How many rows has this relay sent to the replica. Used to yield once >> + * in a while when reading a WAL to unblock the event loop. >> + */ >> + size_t row_cnt; > 1. But it is not a size of anything, right? Maybe make it > int64_t then? uint64_t, probably? I'm fine with it. Fixed on the branch. > >> @@ -846,7 +851,7 @@ relay_send(struct relay *relay, struct xrow_header *packet) >> * It may happen that the socket is always ready for write, so yield >> * explicitly every now and then to not block the event loop. >> */ >> - if (++row_cnt % WAL_ROWS_PER_YIELD == 0) >> + if (++relay->row_cnt % WAL_ROWS_PER_YIELD == 0) > 2. I just found WAL_ROWS_PER_YIELD is not a power of 2. This means > '%' may be expensive. If WAL_ROWS_PER_YIELD would be a power of 2, > '%' would be turned by the compiler into '&' to cut the remainder off. > > Maybe worth changing in a separate non-related to this bug commit? Good idea, thanks! Pushed on top of the original commit. -- Serge Petrenko ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows 2021-02-18 20:24 ` Serge Petrenko via Tarantool-patches @ 2021-02-23 22:30 ` Vladislav Shpilevoy via Tarantool-patches 2021-02-24 9:48 ` Serge Petrenko via Tarantool-patches 0 siblings, 1 reply; 23+ messages in thread From: Vladislav Shpilevoy via Tarantool-patches @ 2021-02-23 22:30 UTC (permalink / raw) To: Serge Petrenko, gorcunov; +Cc: tarantool-patches Thanks for the patch! >>> diff --git a/src/box/relay.cc b/src/box/relay.cc >>> index 1d8edf116..6d9269e1d 100644 >>> --- a/src/box/relay.cc >>> +++ b/src/box/relay.cc >>> @@ -117,6 +117,11 @@ struct relay { >>> * is passed by the replica on subscribe. >>> */ >>> uint32_t id_filter; >>> + /** >>> + * How many rows has this relay sent to the replica. Used to yield once >>> + * in a while when reading a WAL to unblock the event loop. >>> + */ >>> + size_t row_cnt; Name is a bit ugly, because all the other members are not contractions. They use full words. But up to you. >> 1. But it is not a size of anything, right? Maybe make it >> int64_t then? > > uint64_t, probably? Nope, int64_t. It is supposed to be 'faster'. Because it does not have defined overflow rules, and therefore the hardware does not need to handle it. But honestly, I didn't measure. For me it is more cargo cult. I just use signed integers where I can assuming that the hardware really may omit an instruction or so. Up to you. The patch about power of 2 worked btw. This is how it looks now: andq $0x7fff, %rcx ; imm = 0x7FFF this is how it used to look: movl $0x7d00, %ecx ; imm = 0x7D00 divq %rcx ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows 2021-02-23 22:30 ` Vladislav Shpilevoy via Tarantool-patches @ 2021-02-24 9:48 ` Serge Petrenko via Tarantool-patches 2021-02-24 10:15 ` Cyrill Gorcunov via Tarantool-patches 0 siblings, 1 reply; 23+ messages in thread From: Serge Petrenko via Tarantool-patches @ 2021-02-24 9:48 UTC (permalink / raw) To: Vladislav Shpilevoy, gorcunov; +Cc: tarantool-patches 24.02.2021 01:30, Vladislav Shpilevoy пишет: > Thanks for the patch! Thanks for the reivew! >>>> diff --git a/src/box/relay.cc b/src/box/relay.cc >>>> index 1d8edf116..6d9269e1d 100644 >>>> --- a/src/box/relay.cc >>>> +++ b/src/box/relay.cc >>>> @@ -117,6 +117,11 @@ struct relay { >>>> * is passed by the replica on subscribe. >>>> */ >>>> uint32_t id_filter; >>>> + /** >>>> + * How many rows has this relay sent to the replica. Used to yield once >>>> + * in a while when reading a WAL to unblock the event loop. >>>> + */ >>>> + size_t row_cnt; > Name is a bit ugly, because all the other members are not > contractions. They use full words. But up to you. No problem, changed to `row_count`. >>> 1. But it is not a size of anything, right? Maybe make it >>> int64_t then? >> uint64_t, probably? > Nope, int64_t. It is supposed to be 'faster'. Because it does > not have defined overflow rules, and therefore the hardware does > not need to handle it. > > But honestly, I didn't measure. For me it is more cargo cult. I > just use signed integers where I can assuming that the hardware > really may omit an instruction or so. > > Up to you. Long story short, I'd like to leave it as is. Besides, we have an unsigned type (size_t) in local recovery. Ok, now I see what you meant. I never thought of this, and brief googling showed no signs of a speedup with signed arithmetic vs unsigned. Actually, the standard says signed overflow is an undefined behaviour while an unsigned overflow should result in a wrap (modulo 2^64 in our case). Do you think this wrap may be costly on some architecture? > > The patch about power of 2 worked btw. This is how it looks now: > > andq $0x7fff, %rcx ; imm = 0x7FFF > > this is how it used to look: > > movl $0x7d00, %ecx ; imm = 0x7D00 > divq %rcx Yeah, that was a nice catch. -- Serge Petrenko ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows 2021-02-24 9:48 ` Serge Petrenko via Tarantool-patches @ 2021-02-24 10:15 ` Cyrill Gorcunov via Tarantool-patches 2021-02-24 10:35 ` Serge Petrenko via Tarantool-patches 0 siblings, 1 reply; 23+ messages in thread From: Cyrill Gorcunov via Tarantool-patches @ 2021-02-24 10:15 UTC (permalink / raw) To: Serge Petrenko; +Cc: Vladislav Shpilevoy, tarantool-patches On Wed, Feb 24, 2021 at 12:48:07PM +0300, Serge Petrenko wrote: ... > > > > 1. But it is not a size of anything, right? Maybe make it > > > > int64_t then? > > > uint64_t, probably? > > Nope, int64_t. It is supposed to be 'faster'. Because it does > > not have defined overflow rules, and therefore the hardware does > > not need to handle it. > > > > But honestly, I didn't measure. For me it is more cargo cult. I > > just use signed integers where I can assuming that the hardware > > really may omit an instruction or so. > > > > Up to you. > > Long story short, I'd like to leave it as is. Besides, we have an unsigned > type (size_t) in local recovery. > > Ok, now I see what you meant. > I never thought of this, and brief googling showed no signs of a speedup > with signed arithmetic vs unsigned. > > Actually, the standard says signed overflow is an undefined behaviour while > an unsigned overflow should result in a wrap (modulo 2^64 in our case). > > Do you think this wrap may be costly on some architecture? Addition on hardware level _always_ setup OF/CF flags so in this term it doesn't matter which to use int64 or uint64 (iow hw treats addition as signed integers all the time and it is up you how you gonna use this flags in later code). What is more interesting is that older compilers (at least gcc) generate more efficient code for *signed* integers, ie int64_t. I've been pretty surprised when discovered this. I dont get you a reference because I don't remember the exact versions though. In summary: rule of thumb is to use signed integers if you really need some addition in a cycle. For more rare access unsigned is more that enough and latest gcc already can handle it the same way as signed numbers. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows 2021-02-24 10:15 ` Cyrill Gorcunov via Tarantool-patches @ 2021-02-24 10:35 ` Serge Petrenko via Tarantool-patches 2021-02-24 12:07 ` Cyrill Gorcunov via Tarantool-patches 0 siblings, 1 reply; 23+ messages in thread From: Serge Petrenko via Tarantool-patches @ 2021-02-24 10:35 UTC (permalink / raw) To: Cyrill Gorcunov; +Cc: Vladislav Shpilevoy, tarantool-patches 24.02.2021 13:15, Cyrill Gorcunov пишет: > On Wed, Feb 24, 2021 at 12:48:07PM +0300, Serge Petrenko wrote: > ... >>>>> 1. But it is not a size of anything, right? Maybe make it >>>>> int64_t then? >>>> uint64_t, probably? >>> Nope, int64_t. It is supposed to be 'faster'. Because it does >>> not have defined overflow rules, and therefore the hardware does >>> not need to handle it. >>> >>> But honestly, I didn't measure. For me it is more cargo cult. I >>> just use signed integers where I can assuming that the hardware >>> really may omit an instruction or so. >>> >>> Up to you. >> Long story short, I'd like to leave it as is. Besides, we have an unsigned >> type (size_t) in local recovery. >> >> Ok, now I see what you meant. >> I never thought of this, and brief googling showed no signs of a speedup >> with signed arithmetic vs unsigned. >> >> Actually, the standard says signed overflow is an undefined behaviour while >> an unsigned overflow should result in a wrap (modulo 2^64 in our case). >> >> Do you think this wrap may be costly on some architecture? > Addition on hardware level _always_ setup OF/CF flags so in this term it > doesn't matter which to use int64 or uint64 (iow hw treats addition as > signed integers all the time and it is up you how you gonna use this > flags in later code). > > What is more interesting is that older compilers (at least gcc) generate > more efficient code for *signed* integers, ie int64_t. I've been pretty > surprised when discovered this. I dont get you a reference because I > don't remember the exact versions though. In summary: rule of thumb > is to use signed integers if you really need some addition in a cycle. > For more rare access unsigned is more that enough and latest gcc already > can handle it the same way as signed numbers. Thanks for your answer! What do you propose? Should I hange it to int64_t then? -- Serge Petrenko ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows 2021-02-24 10:35 ` Serge Petrenko via Tarantool-patches @ 2021-02-24 12:07 ` Cyrill Gorcunov via Tarantool-patches 2021-02-24 12:14 ` Serge Petrenko via Tarantool-patches 0 siblings, 1 reply; 23+ messages in thread From: Cyrill Gorcunov via Tarantool-patches @ 2021-02-24 12:07 UTC (permalink / raw) To: Serge Petrenko; +Cc: Vladislav Shpilevoy, tarantool-patches On Wed, Feb 24, 2021 at 01:35:49PM +0300, Serge Petrenko wrote: > > Thanks for your answer! > What do you propose? Should I hange it to int64_t then? Yeah, better to stick with int64_t ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows 2021-02-24 12:07 ` Cyrill Gorcunov via Tarantool-patches @ 2021-02-24 12:14 ` Serge Petrenko via Tarantool-patches 0 siblings, 0 replies; 23+ messages in thread From: Serge Petrenko via Tarantool-patches @ 2021-02-24 12:14 UTC (permalink / raw) To: Cyrill Gorcunov; +Cc: Vladislav Shpilevoy, tarantool-patches 24.02.2021 15:07, Cyrill Gorcunov пишет: > On Wed, Feb 24, 2021 at 01:35:49PM +0300, Serge Petrenko wrote: >> Thanks for your answer! >> What do you propose? Should I hange it to int64_t then? > Yeah, better to stick with int64_t Ok, no problem. Changed to int64_t. -- Serge Petrenko ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows 2021-02-12 11:25 [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows Serge Petrenko via Tarantool-patches 2021-02-12 11:37 ` Cyrill Gorcunov via Tarantool-patches 2021-02-12 21:48 ` Vladislav Shpilevoy via Tarantool-patches @ 2021-02-24 22:20 ` Vladislav Shpilevoy via Tarantool-patches 2021-02-26 8:41 ` Kirill Yukhin via Tarantool-patches 3 siblings, 0 replies; 23+ messages in thread From: Vladislav Shpilevoy via Tarantool-patches @ 2021-02-24 22:20 UTC (permalink / raw) To: Serge Petrenko, gorcunov; +Cc: tarantool-patches Hi! Thanks for the patch! LGTM. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows 2021-02-12 11:25 [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows Serge Petrenko via Tarantool-patches ` (2 preceding siblings ...) 2021-02-24 22:20 ` Vladislav Shpilevoy via Tarantool-patches @ 2021-02-26 8:41 ` Kirill Yukhin via Tarantool-patches 2021-02-26 20:24 ` Vladislav Shpilevoy via Tarantool-patches 3 siblings, 1 reply; 23+ messages in thread From: Kirill Yukhin via Tarantool-patches @ 2021-02-26 8:41 UTC (permalink / raw) To: Serge Petrenko; +Cc: tarantool-patches, v.shpilevoy Hello, On 12 фев 14:25, Serge Petrenko via Tarantool-patches wrote: > While sending a WAL, relay only yields in `coio_write_xrow`, once it > sees the socket isn't ready for writes. > It may happen that the socket is always ready for a long period of time, > and relay doesn't yield at all while recovering a whole .xlog file. This > may take well more than a minute. > During this period of time, relay doesn't read replica's ACKs due to > relay reader fiber not being scheduled, and once the reader is finally > live it times out immediately, causing the replica to reconnect. > > The problem is amplified by the fact that replica waits for > replication_timeout to pass prior to reconnecting, which lets master > pile up even more ready WALs, and effectively making it impossible for > the replica to sync. > > To fix the problem let's yield explicitly in relay_send_row every > WAL_ROWS_PER_YIELD rows. The same is already done in local recovery, and > serves the same purpose: to not block the event loop for too long. > > Closes #5762 > --- > No test provided as the fix is quite obvious but rather hard to test > automatically. > https://github.com/tarantool/tarantool/issues/5762 > https://github.com/tarantool/tarantool/tree/sp/gh-5762-relay-yield I've checked your patch into 2.6, 2.6 and master. -- Regards, Kirill Yukhin ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows 2021-02-26 8:41 ` Kirill Yukhin via Tarantool-patches @ 2021-02-26 20:24 ` Vladislav Shpilevoy via Tarantool-patches 2021-03-01 11:25 ` Serge Petrenko via Tarantool-patches 0 siblings, 1 reply; 23+ messages in thread From: Vladislav Shpilevoy via Tarantool-patches @ 2021-02-26 20:24 UTC (permalink / raw) To: Kirill Yukhin, Serge Petrenko; +Cc: tarantool-patches We forgot the changelog file. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows 2021-02-26 20:24 ` Vladislav Shpilevoy via Tarantool-patches @ 2021-03-01 11:25 ` Serge Petrenko via Tarantool-patches 2021-03-01 21:24 ` Vladislav Shpilevoy via Tarantool-patches 2021-03-02 9:52 ` Kirill Yukhin via Tarantool-patches 0 siblings, 2 replies; 23+ messages in thread From: Serge Petrenko via Tarantool-patches @ 2021-03-01 11:25 UTC (permalink / raw) To: Vladislav Shpilevoy, Kirill Yukhin; +Cc: tarantool-patches 26.02.2021 23:24, Vladislav Shpilevoy пишет: > We forgot the changelog file. Oops, here you go. Should be pushed to 2.6, 2.7 and master. ======================================================= Subject: [PATCH] Add changelog entry for patch "relay: yield explicitly every N sent rows" Follow-up #5762 --- Branch: https://github.com/tarantool/tarantool/tree/sp/gh-5762-followup changelogs/unreleased/relay-yield.md | 4 ++++ 1 file changed, 4 insertions(+) create mode 100644 changelogs/unreleased/relay-yield.md diff --git a/changelogs/unreleased/relay-yield.md b/changelogs/unreleased/relay-yield.md new file mode 100644 index 000000000..02e4e0f76 --- /dev/null +++ b/changelogs/unreleased/relay-yield.md @@ -0,0 +1,4 @@ +## bugfix/core + +* Fix a bug in relay timing out while replica is joining or syncing + with master (gh-5762). -- 2.24.3 (Apple Git-128) -- Serge Petrenko ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows 2021-03-01 11:25 ` Serge Petrenko via Tarantool-patches @ 2021-03-01 21:24 ` Vladislav Shpilevoy via Tarantool-patches 2021-03-02 9:52 ` Kirill Yukhin via Tarantool-patches 1 sibling, 0 replies; 23+ messages in thread From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-01 21:24 UTC (permalink / raw) To: Serge Petrenko, Kirill Yukhin; +Cc: tarantool-patches LGTM. On 01.03.2021 12:25, Serge Petrenko wrote: > > > 26.02.2021 23:24, Vladislav Shpilevoy пишет: >> We forgot the changelog file. > Oops, here you go. Should be pushed to 2.6, 2.7 and master. > > ======================================================= > > Subject: [PATCH] Add changelog entry for patch "relay: yield explicitly every > N sent rows" > > Follow-up #5762 > --- > Branch: https://github.com/tarantool/tarantool/tree/sp/gh-5762-followup > > > changelogs/unreleased/relay-yield.md | 4 ++++ > 1 file changed, 4 insertions(+) > create mode 100644 changelogs/unreleased/relay-yield.md > > diff --git a/changelogs/unreleased/relay-yield.md b/changelogs/unreleased/relay-yield.md > new file mode 100644 > index 000000000..02e4e0f76 > --- /dev/null > +++ b/changelogs/unreleased/relay-yield.md > @@ -0,0 +1,4 @@ > +## bugfix/core > + > +* Fix a bug in relay timing out while replica is joining or syncing > + with master (gh-5762). ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows 2021-03-01 11:25 ` Serge Petrenko via Tarantool-patches 2021-03-01 21:24 ` Vladislav Shpilevoy via Tarantool-patches @ 2021-03-02 9:52 ` Kirill Yukhin via Tarantool-patches 1 sibling, 0 replies; 23+ messages in thread From: Kirill Yukhin via Tarantool-patches @ 2021-03-02 9:52 UTC (permalink / raw) To: Serge Petrenko; +Cc: tarantool-patches, Vladislav Shpilevoy Hello, On 01 мар 14:25, Serge Petrenko wrote: > > > 26.02.2021 23:24, Vladislav Shpilevoy пишет: > > We forgot the changelog file. > Oops, here you go. Should be pushed to 2.6, 2.7 and master. > > ======================================================= > > Subject: [PATCH] Add changelog entry for patch "relay: yield explicitly > every > N sent rows" > > Follow-up #5762 > --- > Branch: https://github.com/tarantool/tarantool/tree/sp/gh-5762-followup I've checked your patch into 2.6, 2.7 and master. -- Regards, Kirill Yukhin ^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2021-03-02 9:52 UTC | newest] Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-02-12 11:25 [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows Serge Petrenko via Tarantool-patches 2021-02-12 11:37 ` Cyrill Gorcunov via Tarantool-patches 2021-02-12 11:46 ` Cyrill Gorcunov via Tarantool-patches 2021-02-12 12:08 ` Serge Petrenko via Tarantool-patches 2021-02-12 17:00 ` Cyrill Gorcunov via Tarantool-patches 2021-02-12 21:48 ` Vladislav Shpilevoy via Tarantool-patches 2021-02-12 22:25 ` Cyrill Gorcunov via Tarantool-patches 2021-02-15 8:45 ` Serge Petrenko via Tarantool-patches 2021-02-15 8:40 ` Serge Petrenko via Tarantool-patches 2021-02-17 21:11 ` Vladislav Shpilevoy via Tarantool-patches 2021-02-18 20:24 ` Serge Petrenko via Tarantool-patches 2021-02-23 22:30 ` Vladislav Shpilevoy via Tarantool-patches 2021-02-24 9:48 ` Serge Petrenko via Tarantool-patches 2021-02-24 10:15 ` Cyrill Gorcunov via Tarantool-patches 2021-02-24 10:35 ` Serge Petrenko via Tarantool-patches 2021-02-24 12:07 ` Cyrill Gorcunov via Tarantool-patches 2021-02-24 12:14 ` Serge Petrenko via Tarantool-patches 2021-02-24 22:20 ` Vladislav Shpilevoy via Tarantool-patches 2021-02-26 8:41 ` Kirill Yukhin via Tarantool-patches 2021-02-26 20:24 ` Vladislav Shpilevoy via Tarantool-patches 2021-03-01 11:25 ` Serge Petrenko via Tarantool-patches 2021-03-01 21:24 ` Vladislav Shpilevoy via Tarantool-patches 2021-03-02 9:52 ` Kirill Yukhin via Tarantool-patches
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox