From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp16.mail.ru (smtp16.mail.ru [94.100.176.153]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 464BD42EF5C for ; Tue, 23 Jun 2020 11:45:34 +0300 (MSK) References: <23700741fb309ed1afd98939e1bc9a2fe5b6ea88.1592866585.git.v.shpilevoy@tarantool.org> From: Serge Petrenko Message-ID: Date: Tue, 23 Jun 2020 11:45:33 +0300 MIME-Version: 1.0 In-Reply-To: <23700741fb309ed1afd98939e1bc9a2fe5b6ea88.1592866585.git.v.shpilevoy@tarantool.org> Content-Type: text/plain; charset="utf-8"; format="flowed" Content-Transfer-Encoding: 8bit Content-Language: en-GB Subject: Re: [Tarantool-patches] [PATCH 1/1] applier: send heartbeat not only on commit, but on any write List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Vladislav Shpilevoy , tarantool-patches@dev.tarantool.org 23.06.2020 01:56, Vladislav Shpilevoy пишет: > Concept of 'commit' becomes not 100% matching WAL write event, > when synchro replication comes. > > And yet applier relied on commit event when sent periodic > hearbeats to tell the master the replica's new vclock. > > The patch makes applier send heartbeats on any write event. Even > if it was not commit. For example, when a sync transaction's > data was written, and the replica needs to tell the master ACK > using the heartbeat. > > Closes #5100 > --- > Branch: http://github.com/tarantool/tarantool/tree/gh-4842-sync-replication > Issue: https://github.com/tarantool/tarantool/issues/5100 > > src/box/applier.cc | 12 +++++++++--- > src/box/txn_limbo.c | 1 - > 2 files changed, 9 insertions(+), 4 deletions(-) > > diff --git a/src/box/applier.cc b/src/box/applier.cc > index ab9a5ac54..37bf25ffc 100644 > --- a/src/box/applier.cc > +++ b/src/box/applier.cc > @@ -841,7 +841,7 @@ applier_txn_commit_cb(struct trigger *trigger, void *event) > * Return 0 for success or -1 in case of an error. > */ > static int > -applier_apply_tx(struct stailq *rows) > +applier_apply_tx(struct stailq *rows, struct fiber *writer) > { > struct xrow_header *first_row = &stailq_first_entry(rows, > struct applier_tx_row, next)->row; > @@ -933,7 +933,13 @@ applier_apply_tx(struct stailq *rows) > > trigger_create(on_commit, applier_txn_commit_cb, NULL, NULL); > txn_on_commit(txn, on_commit); > - > + /* > + * Wake the fiber when the transaction finishes writing to > + * disk. In case of async transaction it is the same as > + * commit event. In case of sync it happens after the data > + * is written to WAL. > + */ > + txn->fiber = writer; > if (txn_commit_async(txn) < 0) > goto fail; > > @@ -1131,7 +1137,7 @@ applier_subscribe(struct applier *applier) > if (stailq_first_entry(&rows, struct applier_tx_row, > next)->row.lsn == 0) > fiber_cond_signal(&applier->writer_cond); > - else if (applier_apply_tx(&rows) != 0) > + else if (applier_apply_tx(&rows, applier->writer) != 0) > diag_raise(); > > if (ibuf_used(ibuf) == 0) > diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c > index 8c05fbb0e..58eeabf2b 100644 > --- a/src/box/txn_limbo.c > +++ b/src/box/txn_limbo.c > @@ -233,7 +233,6 @@ txn_limbo_read_confirm(struct txn_limbo *limbo, int64_t lsn) > rlist_foreach_entry_safe(e, &limbo->queue, in_queue, tmp) { > if (e->lsn > lsn) > break; > - assert(e->txn->fiber == NULL); > e->is_commit = true; > txn_limbo_remove(limbo, e); > txn_clear_flag(e->txn, TXN_WAIT_ACK); Thanks! Looks good with the following amendments (on the branch): commit 937a1580bd378b6e42a205e0210d7e42ce47cf08 Author: Serge Petrenko Date:   Tue Jun 23 11:29:35 2020 +0300     fix for 'applier: send heartbeat not only on commit, but on any write'     [TO BE SQUASHED INTO THE PREVIOUS COMMIT] diff --git a/src/box/txn.c b/src/box/txn.c index af4c5d536..2e2b225c0 100644 --- a/src/box/txn.c +++ b/src/box/txn.c @@ -467,8 +467,10 @@ txn_complete(struct txn *txn)                  * back to the fiber, owning the transaction so as                  * it could decide what to do next.                  */ -               if (txn->fiber != NULL && txn->fiber != fiber()) +               if (txn->fiber != NULL && txn->fiber != fiber()) {                         fiber_wakeup(txn->fiber); +                       txn->fiber = NULL; +               }                 return;         }         /* @@ -772,6 +774,7 @@ txn_commit(struct txn *txn)         if (is_sync) {                 txn_limbo_assign_lsn(&txn_limbo, limbo_entry, req->rows[req->n_rows - 1]->lsn); +               txn->fiber = fiber();                 if (txn_limbo_wait_complete(&txn_limbo, limbo_entry) < 0) {                         txn_free(txn);                         return -1; diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c index 46b2c73f3..d2b6e6377 100644 --- a/src/box/txn_limbo.c +++ b/src/box/txn_limbo.c @@ -233,6 +233,7 @@ txn_limbo_read_confirm(struct txn_limbo *limbo, int64_t lsn)         rlist_foreach_entry_safe(e, &limbo->queue, in_queue, tmp) {                 if (e->lsn > lsn)                         break; +               assert(e->txn->fiber = NULL);                 e->is_commit = true;                 txn_limbo_remove(limbo, e);                 txn_clear_flag(e->txn, TXN_WAIT_ACK); -- Serge Petrenko