* [Tarantool-patches] [PATCH 1/1] applier: send heartbeat not only on commit, but on any write @ 2020-06-22 22:56 Vladislav Shpilevoy 2020-06-23 8:45 ` Serge Petrenko 2020-06-23 22:09 ` Vladislav Shpilevoy 0 siblings, 2 replies; 6+ messages in thread From: Vladislav Shpilevoy @ 2020-06-22 22:56 UTC (permalink / raw) To: tarantool-patches, sergepetrenko Concept of 'commit' becomes not 100% matching WAL write event, when synchro replication comes. And yet applier relied on commit event when sent periodic hearbeats to tell the master the replica's new vclock. The patch makes applier send heartbeats on any write event. Even if it was not commit. For example, when a sync transaction's data was written, and the replica needs to tell the master ACK using the heartbeat. Closes #5100 --- Branch: http://github.com/tarantool/tarantool/tree/gh-4842-sync-replication Issue: https://github.com/tarantool/tarantool/issues/5100 src/box/applier.cc | 12 +++++++++--- src/box/txn_limbo.c | 1 - 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/src/box/applier.cc b/src/box/applier.cc index ab9a5ac54..37bf25ffc 100644 --- a/src/box/applier.cc +++ b/src/box/applier.cc @@ -841,7 +841,7 @@ applier_txn_commit_cb(struct trigger *trigger, void *event) * Return 0 for success or -1 in case of an error. */ static int -applier_apply_tx(struct stailq *rows) +applier_apply_tx(struct stailq *rows, struct fiber *writer) { struct xrow_header *first_row = &stailq_first_entry(rows, struct applier_tx_row, next)->row; @@ -933,7 +933,13 @@ applier_apply_tx(struct stailq *rows) trigger_create(on_commit, applier_txn_commit_cb, NULL, NULL); txn_on_commit(txn, on_commit); - + /* + * Wake the fiber when the transaction finishes writing to + * disk. In case of async transaction it is the same as + * commit event. In case of sync it happens after the data + * is written to WAL. + */ + txn->fiber = writer; if (txn_commit_async(txn) < 0) goto fail; @@ -1131,7 +1137,7 @@ applier_subscribe(struct applier *applier) if (stailq_first_entry(&rows, struct applier_tx_row, next)->row.lsn == 0) fiber_cond_signal(&applier->writer_cond); - else if (applier_apply_tx(&rows) != 0) + else if (applier_apply_tx(&rows, applier->writer) != 0) diag_raise(); if (ibuf_used(ibuf) == 0) diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c index 8c05fbb0e..58eeabf2b 100644 --- a/src/box/txn_limbo.c +++ b/src/box/txn_limbo.c @@ -233,7 +233,6 @@ txn_limbo_read_confirm(struct txn_limbo *limbo, int64_t lsn) rlist_foreach_entry_safe(e, &limbo->queue, in_queue, tmp) { if (e->lsn > lsn) break; - assert(e->txn->fiber == NULL); e->is_commit = true; txn_limbo_remove(limbo, e); txn_clear_flag(e->txn, TXN_WAIT_ACK); -- 2.21.1 (Apple Git-122.3) ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/1] applier: send heartbeat not only on commit, but on any write 2020-06-22 22:56 [Tarantool-patches] [PATCH 1/1] applier: send heartbeat not only on commit, but on any write Vladislav Shpilevoy @ 2020-06-23 8:45 ` Serge Petrenko 2020-06-23 21:26 ` Vladislav Shpilevoy 2020-06-23 22:09 ` Vladislav Shpilevoy 1 sibling, 1 reply; 6+ messages in thread From: Serge Petrenko @ 2020-06-23 8:45 UTC (permalink / raw) To: Vladislav Shpilevoy, tarantool-patches 23.06.2020 01:56, Vladislav Shpilevoy пишет: > Concept of 'commit' becomes not 100% matching WAL write event, > when synchro replication comes. > > And yet applier relied on commit event when sent periodic > hearbeats to tell the master the replica's new vclock. > > The patch makes applier send heartbeats on any write event. Even > if it was not commit. For example, when a sync transaction's > data was written, and the replica needs to tell the master ACK > using the heartbeat. > > Closes #5100 > --- > Branch: http://github.com/tarantool/tarantool/tree/gh-4842-sync-replication > Issue: https://github.com/tarantool/tarantool/issues/5100 > > src/box/applier.cc | 12 +++++++++--- > src/box/txn_limbo.c | 1 - > 2 files changed, 9 insertions(+), 4 deletions(-) > > diff --git a/src/box/applier.cc b/src/box/applier.cc > index ab9a5ac54..37bf25ffc 100644 > --- a/src/box/applier.cc > +++ b/src/box/applier.cc > @@ -841,7 +841,7 @@ applier_txn_commit_cb(struct trigger *trigger, void *event) > * Return 0 for success or -1 in case of an error. > */ > static int > -applier_apply_tx(struct stailq *rows) > +applier_apply_tx(struct stailq *rows, struct fiber *writer) > { > struct xrow_header *first_row = &stailq_first_entry(rows, > struct applier_tx_row, next)->row; > @@ -933,7 +933,13 @@ applier_apply_tx(struct stailq *rows) > > trigger_create(on_commit, applier_txn_commit_cb, NULL, NULL); > txn_on_commit(txn, on_commit); > - > + /* > + * Wake the fiber when the transaction finishes writing to > + * disk. In case of async transaction it is the same as > + * commit event. In case of sync it happens after the data > + * is written to WAL. > + */ > + txn->fiber = writer; > if (txn_commit_async(txn) < 0) > goto fail; > > @@ -1131,7 +1137,7 @@ applier_subscribe(struct applier *applier) > if (stailq_first_entry(&rows, struct applier_tx_row, > next)->row.lsn == 0) > fiber_cond_signal(&applier->writer_cond); > - else if (applier_apply_tx(&rows) != 0) > + else if (applier_apply_tx(&rows, applier->writer) != 0) > diag_raise(); > > if (ibuf_used(ibuf) == 0) > diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c > index 8c05fbb0e..58eeabf2b 100644 > --- a/src/box/txn_limbo.c > +++ b/src/box/txn_limbo.c > @@ -233,7 +233,6 @@ txn_limbo_read_confirm(struct txn_limbo *limbo, int64_t lsn) > rlist_foreach_entry_safe(e, &limbo->queue, in_queue, tmp) { > if (e->lsn > lsn) > break; > - assert(e->txn->fiber == NULL); > e->is_commit = true; > txn_limbo_remove(limbo, e); > txn_clear_flag(e->txn, TXN_WAIT_ACK); Thanks! Looks good with the following amendments (on the branch): commit 937a1580bd378b6e42a205e0210d7e42ce47cf08 Author: Serge Petrenko <sergepetrenko@tarantool.org> Date: Tue Jun 23 11:29:35 2020 +0300 fix for 'applier: send heartbeat not only on commit, but on any write' [TO BE SQUASHED INTO THE PREVIOUS COMMIT] diff --git a/src/box/txn.c b/src/box/txn.c index af4c5d536..2e2b225c0 100644 --- a/src/box/txn.c +++ b/src/box/txn.c @@ -467,8 +467,10 @@ txn_complete(struct txn *txn) * back to the fiber, owning the transaction so as * it could decide what to do next. */ - if (txn->fiber != NULL && txn->fiber != fiber()) + if (txn->fiber != NULL && txn->fiber != fiber()) { fiber_wakeup(txn->fiber); + txn->fiber = NULL; + } return; } /* @@ -772,6 +774,7 @@ txn_commit(struct txn *txn) if (is_sync) { txn_limbo_assign_lsn(&txn_limbo, limbo_entry, req->rows[req->n_rows - 1]->lsn); + txn->fiber = fiber(); if (txn_limbo_wait_complete(&txn_limbo, limbo_entry) < 0) { txn_free(txn); return -1; diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c index 46b2c73f3..d2b6e6377 100644 --- a/src/box/txn_limbo.c +++ b/src/box/txn_limbo.c @@ -233,6 +233,7 @@ txn_limbo_read_confirm(struct txn_limbo *limbo, int64_t lsn) rlist_foreach_entry_safe(e, &limbo->queue, in_queue, tmp) { if (e->lsn > lsn) break; + assert(e->txn->fiber = NULL); e->is_commit = true; txn_limbo_remove(limbo, e); txn_clear_flag(e->txn, TXN_WAIT_ACK); -- Serge Petrenko ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/1] applier: send heartbeat not only on commit, but on any write 2020-06-23 8:45 ` Serge Petrenko @ 2020-06-23 21:26 ` Vladislav Shpilevoy 0 siblings, 0 replies; 6+ messages in thread From: Vladislav Shpilevoy @ 2020-06-23 21:26 UTC (permalink / raw) To: Serge Petrenko, tarantool-patches Why are these changes needed? Are they a prerequisite for something else? ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/1] applier: send heartbeat not only on commit, but on any write 2020-06-22 22:56 [Tarantool-patches] [PATCH 1/1] applier: send heartbeat not only on commit, but on any write Vladislav Shpilevoy 2020-06-23 8:45 ` Serge Petrenko @ 2020-06-23 22:09 ` Vladislav Shpilevoy 2020-06-23 22:20 ` Vladislav Shpilevoy 1 sibling, 1 reply; 6+ messages in thread From: Vladislav Shpilevoy @ 2020-06-23 22:09 UTC (permalink / raw) To: tarantool-patches, sergepetrenko Looks like now the transactions leak. Because txn_complete() does not call txn_free() seeing txn->fiber not NULL. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/1] applier: send heartbeat not only on commit, but on any write 2020-06-23 22:09 ` Vladislav Shpilevoy @ 2020-06-23 22:20 ` Vladislav Shpilevoy 2020-06-29 23:18 ` Vladislav Shpilevoy 0 siblings, 1 reply; 6+ messages in thread From: Vladislav Shpilevoy @ 2020-06-23 22:20 UTC (permalink / raw) To: tarantool-patches, sergepetrenko Or probably they don't. Is it why you added txn->fiber = NULL; to txn_complete(), when it is not a final complete? ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/1] applier: send heartbeat not only on commit, but on any write 2020-06-23 22:20 ` Vladislav Shpilevoy @ 2020-06-29 23:18 ` Vladislav Shpilevoy 0 siblings, 0 replies; 6+ messages in thread From: Vladislav Shpilevoy @ 2020-06-29 23:18 UTC (permalink / raw) To: tarantool-patches, sergepetrenko Appeared the transactions leak afterwards anyway. The problem was that txn->fiber was not set to NULL for regular async transactions anywhere in applier. So they all leaked. I made applier on_commit/on_rollback triggers nullify txn->fiber so it could be freed when commit/rollback are done. ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2020-06-29 23:18 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-06-22 22:56 [Tarantool-patches] [PATCH 1/1] applier: send heartbeat not only on commit, but on any write Vladislav Shpilevoy 2020-06-23 8:45 ` Serge Petrenko 2020-06-23 21:26 ` Vladislav Shpilevoy 2020-06-23 22:09 ` Vladislav Shpilevoy 2020-06-23 22:20 ` Vladislav Shpilevoy 2020-06-29 23:18 ` Vladislav Shpilevoy
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox