Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH 1/1] applier: send heartbeat not only on commit, but on any write
@ 2020-06-22 22:56 Vladislav Shpilevoy
  2020-06-23  8:45 ` Serge Petrenko
  2020-06-23 22:09 ` Vladislav Shpilevoy
  0 siblings, 2 replies; 6+ messages in thread
From: Vladislav Shpilevoy @ 2020-06-22 22:56 UTC (permalink / raw)
  To: tarantool-patches, sergepetrenko

Concept of 'commit' becomes not 100% matching WAL write event,
when synchro replication comes.

And yet applier relied on commit event when sent periodic
hearbeats to tell the master the replica's new vclock.

The patch makes applier send heartbeats on any write event. Even
if it was not commit. For example, when a sync transaction's
data was written, and the replica needs to tell the master ACK
using the heartbeat.

Closes #5100
---
Branch: http://github.com/tarantool/tarantool/tree/gh-4842-sync-replication
Issue: https://github.com/tarantool/tarantool/issues/5100

 src/box/applier.cc  | 12 +++++++++---
 src/box/txn_limbo.c |  1 -
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/src/box/applier.cc b/src/box/applier.cc
index ab9a5ac54..37bf25ffc 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -841,7 +841,7 @@ applier_txn_commit_cb(struct trigger *trigger, void *event)
  * Return 0 for success or -1 in case of an error.
  */
 static int
-applier_apply_tx(struct stailq *rows)
+applier_apply_tx(struct stailq *rows, struct fiber *writer)
 {
 	struct xrow_header *first_row = &stailq_first_entry(rows,
 					struct applier_tx_row, next)->row;
@@ -933,7 +933,13 @@ applier_apply_tx(struct stailq *rows)
 
 	trigger_create(on_commit, applier_txn_commit_cb, NULL, NULL);
 	txn_on_commit(txn, on_commit);
-
+	/*
+	 * Wake the fiber when the transaction finishes writing to
+	 * disk. In case of async transaction it is the same as
+	 * commit event. In case of sync it happens after the data
+	 * is written to WAL.
+	 */
+	txn->fiber = writer;
 	if (txn_commit_async(txn) < 0)
 		goto fail;
 
@@ -1131,7 +1137,7 @@ applier_subscribe(struct applier *applier)
 		if (stailq_first_entry(&rows, struct applier_tx_row,
 				       next)->row.lsn == 0)
 			fiber_cond_signal(&applier->writer_cond);
-		else if (applier_apply_tx(&rows) != 0)
+		else if (applier_apply_tx(&rows, applier->writer) != 0)
 			diag_raise();
 
 		if (ibuf_used(ibuf) == 0)
diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
index 8c05fbb0e..58eeabf2b 100644
--- a/src/box/txn_limbo.c
+++ b/src/box/txn_limbo.c
@@ -233,7 +233,6 @@ txn_limbo_read_confirm(struct txn_limbo *limbo, int64_t lsn)
 	rlist_foreach_entry_safe(e, &limbo->queue, in_queue, tmp) {
 		if (e->lsn > lsn)
 			break;
-		assert(e->txn->fiber == NULL);
 		e->is_commit = true;
 		txn_limbo_remove(limbo, e);
 		txn_clear_flag(e->txn, TXN_WAIT_ACK);
-- 
2.21.1 (Apple Git-122.3)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/1] applier: send heartbeat not only on commit, but on any write
  2020-06-22 22:56 [Tarantool-patches] [PATCH 1/1] applier: send heartbeat not only on commit, but on any write Vladislav Shpilevoy
@ 2020-06-23  8:45 ` Serge Petrenko
  2020-06-23 21:26   ` Vladislav Shpilevoy
  2020-06-23 22:09 ` Vladislav Shpilevoy
  1 sibling, 1 reply; 6+ messages in thread
From: Serge Petrenko @ 2020-06-23  8:45 UTC (permalink / raw)
  To: Vladislav Shpilevoy, tarantool-patches


23.06.2020 01:56, Vladislav Shpilevoy пишет:
> Concept of 'commit' becomes not 100% matching WAL write event,
> when synchro replication comes.
>
> And yet applier relied on commit event when sent periodic
> hearbeats to tell the master the replica's new vclock.
>
> The patch makes applier send heartbeats on any write event. Even
> if it was not commit. For example, when a sync transaction's
> data was written, and the replica needs to tell the master ACK
> using the heartbeat.
>
> Closes #5100
> ---
> Branch: http://github.com/tarantool/tarantool/tree/gh-4842-sync-replication
> Issue: https://github.com/tarantool/tarantool/issues/5100
>
>   src/box/applier.cc  | 12 +++++++++---
>   src/box/txn_limbo.c |  1 -
>   2 files changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/src/box/applier.cc b/src/box/applier.cc
> index ab9a5ac54..37bf25ffc 100644
> --- a/src/box/applier.cc
> +++ b/src/box/applier.cc
> @@ -841,7 +841,7 @@ applier_txn_commit_cb(struct trigger *trigger, void *event)
>    * Return 0 for success or -1 in case of an error.
>    */
>   static int
> -applier_apply_tx(struct stailq *rows)
> +applier_apply_tx(struct stailq *rows, struct fiber *writer)
>   {
>   	struct xrow_header *first_row = &stailq_first_entry(rows,
>   					struct applier_tx_row, next)->row;
> @@ -933,7 +933,13 @@ applier_apply_tx(struct stailq *rows)
>   
>   	trigger_create(on_commit, applier_txn_commit_cb, NULL, NULL);
>   	txn_on_commit(txn, on_commit);
> -
> +	/*
> +	 * Wake the fiber when the transaction finishes writing to
> +	 * disk. In case of async transaction it is the same as
> +	 * commit event. In case of sync it happens after the data
> +	 * is written to WAL.
> +	 */
> +	txn->fiber = writer;
>   	if (txn_commit_async(txn) < 0)
>   		goto fail;
>   
> @@ -1131,7 +1137,7 @@ applier_subscribe(struct applier *applier)
>   		if (stailq_first_entry(&rows, struct applier_tx_row,
>   				       next)->row.lsn == 0)
>   			fiber_cond_signal(&applier->writer_cond);
> -		else if (applier_apply_tx(&rows) != 0)
> +		else if (applier_apply_tx(&rows, applier->writer) != 0)
>   			diag_raise();
>   
>   		if (ibuf_used(ibuf) == 0)
> diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
> index 8c05fbb0e..58eeabf2b 100644
> --- a/src/box/txn_limbo.c
> +++ b/src/box/txn_limbo.c
> @@ -233,7 +233,6 @@ txn_limbo_read_confirm(struct txn_limbo *limbo, int64_t lsn)
>   	rlist_foreach_entry_safe(e, &limbo->queue, in_queue, tmp) {
>   		if (e->lsn > lsn)
>   			break;
> -		assert(e->txn->fiber == NULL);
>   		e->is_commit = true;
>   		txn_limbo_remove(limbo, e);
>   		txn_clear_flag(e->txn, TXN_WAIT_ACK);


Thanks!
Looks good with the following amendments (on the branch):

commit 937a1580bd378b6e42a205e0210d7e42ce47cf08
Author: Serge Petrenko <sergepetrenko@tarantool.org>
Date:   Tue Jun 23 11:29:35 2020 +0300

     fix for 'applier: send heartbeat not only on commit, but on any write'

     [TO BE SQUASHED INTO THE PREVIOUS COMMIT]

diff --git a/src/box/txn.c b/src/box/txn.c
index af4c5d536..2e2b225c0 100644
--- a/src/box/txn.c
+++ b/src/box/txn.c
@@ -467,8 +467,10 @@ txn_complete(struct txn *txn)
                  * back to the fiber, owning the transaction so as
                  * it could decide what to do next.
                  */
-               if (txn->fiber != NULL && txn->fiber != fiber())
+               if (txn->fiber != NULL && txn->fiber != fiber()) {
                         fiber_wakeup(txn->fiber);
+                       txn->fiber = NULL;
+               }
                 return;
         }
         /*
@@ -772,6 +774,7 @@ txn_commit(struct txn *txn)
         if (is_sync) {
                 txn_limbo_assign_lsn(&txn_limbo, limbo_entry,
req->rows[req->n_rows - 1]->lsn);
+               txn->fiber = fiber();
                 if (txn_limbo_wait_complete(&txn_limbo, limbo_entry) < 0) {
                         txn_free(txn);
                         return -1;
diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
index 46b2c73f3..d2b6e6377 100644
--- a/src/box/txn_limbo.c
+++ b/src/box/txn_limbo.c
@@ -233,6 +233,7 @@ txn_limbo_read_confirm(struct txn_limbo *limbo, 
int64_t lsn)
         rlist_foreach_entry_safe(e, &limbo->queue, in_queue, tmp) {
                 if (e->lsn > lsn)
                         break;
+               assert(e->txn->fiber = NULL);
                 e->is_commit = true;
                 txn_limbo_remove(limbo, e);
                 txn_clear_flag(e->txn, TXN_WAIT_ACK);

-- 
Serge Petrenko

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/1] applier: send heartbeat not only on commit, but on any write
  2020-06-23  8:45 ` Serge Petrenko
@ 2020-06-23 21:26   ` Vladislav Shpilevoy
  0 siblings, 0 replies; 6+ messages in thread
From: Vladislav Shpilevoy @ 2020-06-23 21:26 UTC (permalink / raw)
  To: Serge Petrenko, tarantool-patches

Why are these changes needed? Are they a prerequisite for
something else?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/1] applier: send heartbeat not only on commit, but on any write
  2020-06-22 22:56 [Tarantool-patches] [PATCH 1/1] applier: send heartbeat not only on commit, but on any write Vladislav Shpilevoy
  2020-06-23  8:45 ` Serge Petrenko
@ 2020-06-23 22:09 ` Vladislav Shpilevoy
  2020-06-23 22:20   ` Vladislav Shpilevoy
  1 sibling, 1 reply; 6+ messages in thread
From: Vladislav Shpilevoy @ 2020-06-23 22:09 UTC (permalink / raw)
  To: tarantool-patches, sergepetrenko

Looks like now the transactions leak. Because txn_complete() does not
call txn_free() seeing txn->fiber not NULL.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/1] applier: send heartbeat not only on commit, but on any write
  2020-06-23 22:09 ` Vladislav Shpilevoy
@ 2020-06-23 22:20   ` Vladislav Shpilevoy
  2020-06-29 23:18     ` Vladislav Shpilevoy
  0 siblings, 1 reply; 6+ messages in thread
From: Vladislav Shpilevoy @ 2020-06-23 22:20 UTC (permalink / raw)
  To: tarantool-patches, sergepetrenko

Or probably they don't. Is it why you added

    txn->fiber = NULL;

to txn_complete(), when it is not a final complete?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/1] applier: send heartbeat not only on commit, but on any write
  2020-06-23 22:20   ` Vladislav Shpilevoy
@ 2020-06-29 23:18     ` Vladislav Shpilevoy
  0 siblings, 0 replies; 6+ messages in thread
From: Vladislav Shpilevoy @ 2020-06-29 23:18 UTC (permalink / raw)
  To: tarantool-patches, sergepetrenko

Appeared the transactions leak afterwards anyway.

The problem was that txn->fiber was not set to NULL for
regular async transactions anywhere in applier. So they all
leaked.

I made applier on_commit/on_rollback triggers nullify txn->fiber
so it could be freed when commit/rollback are done.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-06-29 23:18 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-22 22:56 [Tarantool-patches] [PATCH 1/1] applier: send heartbeat not only on commit, but on any write Vladislav Shpilevoy
2020-06-23  8:45 ` Serge Petrenko
2020-06-23 21:26   ` Vladislav Shpilevoy
2020-06-23 22:09 ` Vladislav Shpilevoy
2020-06-23 22:20   ` Vladislav Shpilevoy
2020-06-29 23:18     ` Vladislav Shpilevoy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox