Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH 00/10] box/journal: redesign sync and async writes
@ 2020-03-05 12:29 Cyrill Gorcunov
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 01/10] box: recovery_journal_create -- set journal here Cyrill Gorcunov
                   ` (9 more replies)
  0 siblings, 10 replies; 28+ messages in thread
From: Cyrill Gorcunov @ 2020-03-05 12:29 UTC (permalink / raw)
  To: tml

Kostya, take a look please, once time permit. I think tearing through
the patches won't be easy since changes are too intrusive so maybe
an easier way to look into the final result by applying the series.

https://github.com/tarantool/tarantool/blob/gorcunov/gh-4031-txn_write_to_wal-10/src/box/txn.c#L590
https://github.com/tarantool/tarantool/blob/gorcunov/gh-4031-txn_write_to_wal-10/src/box/journal.c
https://github.com/tarantool/tarantool/blob/gorcunov/gh-4031-txn_write_to_wal-10/src/box/wal.c#L1186
https://gitlab.com/tarantool/tarantool/pipelines/123555588

branch gorcunov/gh-4031-txn_write_to_wal-10

Cyrill Gorcunov (10):
  box: recovery_journal_create -- set journal here
  box/txn: move fiber_set_txn to header
  box/txn: rename txn_write to txn_commit_async
  box/txn: move setup of txn start to txn_prepare
  box/txn: add txn_commit_nop helper
  box/txn: unweave txn_commit from txn_commit_async
  box/txn: clear fiber storage right before journal write
  box/txn: move journal allocation into separate routine
  box/journal: journal_entry_new -- drop setting up callbacks
  box/journal: redesign sync and async writes

 src/box/applier.cc |   2 +-
 src/box/box.cc     |   7 +-
 src/box/journal.c  |  31 ++++++---
 src/box/journal.h  |  52 +++++++++++---
 src/box/txn.c      | 164 ++++++++++++++++++++++++++++++---------------
 src/box/txn.h      |   9 ++-
 src/box/vy_log.c   |   3 +-
 src/box/wal.c      | 106 ++++++++++++++++++++++++++---
 8 files changed, 287 insertions(+), 87 deletions(-)


base-commit: ed2e14305b28f35c33d947aadcc62ddbe8c863e6
-- 
2.20.1

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Tarantool-patches] [PATCH 01/10] box: recovery_journal_create -- set journal here
  2020-03-05 12:29 [Tarantool-patches] [PATCH 00/10] box/journal: redesign sync and async writes Cyrill Gorcunov
@ 2020-03-05 12:29 ` Cyrill Gorcunov
  2020-03-06 21:27   ` Konstantin Osipov
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 02/10] box/txn: move fiber_set_txn to header Cyrill Gorcunov
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 28+ messages in thread
From: Cyrill Gorcunov @ 2020-03-05 12:29 UTC (permalink / raw)
  To: tml

Allows to eliminate code duplication.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 src/box/box.cc | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/box/box.cc b/src/box/box.cc
index 09dd67ab4..eb5931e37 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -331,6 +331,7 @@ recovery_journal_create(struct recovery_journal *journal, struct vclock *v)
 {
 	journal_create(&journal->base, recovery_journal_write, NULL);
 	journal->vclock = v;
+	journal_set(&journal->base);
 }
 
 static void
@@ -2055,7 +2056,6 @@ bootstrap_from_master(struct replica *master)
 	engine_begin_final_recovery_xc();
 	struct recovery_journal journal;
 	recovery_journal_create(&journal, &replicaset.vclock);
-	journal_set(&journal.base);
 
 	if (!replication_anon) {
 		applier_resume_to_state(applier, APPLIER_JOINED,
@@ -2221,7 +2221,6 @@ local_recovery(const struct tt_uuid *instance_uuid,
 
 	struct recovery_journal journal;
 	recovery_journal_create(&journal, &recovery->vclock);
-	journal_set(&journal.base);
 
 	/*
 	 * We explicitly request memtx to recover its
-- 
2.20.1

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Tarantool-patches] [PATCH 02/10] box/txn: move fiber_set_txn to header
  2020-03-05 12:29 [Tarantool-patches] [PATCH 00/10] box/journal: redesign sync and async writes Cyrill Gorcunov
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 01/10] box: recovery_journal_create -- set journal here Cyrill Gorcunov
@ 2020-03-05 12:29 ` Cyrill Gorcunov
  2020-03-06 21:27   ` Konstantin Osipov
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 03/10] box/txn: rename txn_write to txn_commit_async Cyrill Gorcunov
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 28+ messages in thread
From: Cyrill Gorcunov @ 2020-03-05 12:29 UTC (permalink / raw)
  To: tml

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 src/box/txn.c | 6 ------
 src/box/txn.h | 7 +++++++
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/src/box/txn.c b/src/box/txn.c
index a4ca48224..6799f6c4b 100644
--- a/src/box/txn.c
+++ b/src/box/txn.c
@@ -49,12 +49,6 @@ txn_on_yield(struct trigger *trigger, void *event);
 static void
 txn_run_rollback_triggers(struct txn *txn, struct rlist *triggers);
 
-static inline void
-fiber_set_txn(struct fiber *fiber, struct txn *txn)
-{
-	fiber->storage.txn = txn;
-}
-
 static int
 txn_add_redo(struct txn *txn, struct txn_stmt *stmt, struct request *request)
 {
diff --git a/src/box/txn.h b/src/box/txn.h
index ae2c3a58f..7a7e52954 100644
--- a/src/box/txn.h
+++ b/src/box/txn.h
@@ -256,6 +256,13 @@ in_txn(void)
 	return fiber()->storage.txn;
 }
 
+/* Set to the current transaction (if any) */
+static inline void
+fiber_set_txn(struct fiber *fiber, struct txn *txn)
+{
+	fiber->storage.txn = txn;
+}
+
 /**
  * Start a transaction explicitly.
  * @pre no transaction is active
-- 
2.20.1

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Tarantool-patches] [PATCH 03/10] box/txn: rename txn_write to txn_commit_async
  2020-03-05 12:29 [Tarantool-patches] [PATCH 00/10] box/journal: redesign sync and async writes Cyrill Gorcunov
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 01/10] box: recovery_journal_create -- set journal here Cyrill Gorcunov
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 02/10] box/txn: move fiber_set_txn to header Cyrill Gorcunov
@ 2020-03-05 12:29 ` Cyrill Gorcunov
  2020-03-06 21:28   ` Konstantin Osipov
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 04/10] box/txn: move setup of txn start to txn_prepare Cyrill Gorcunov
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 28+ messages in thread
From: Cyrill Gorcunov @ 2020-03-05 12:29 UTC (permalink / raw)
  To: tml

To reflect the fact tha we're don't waiting for
transaction to complete but rely on completion
callback.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 src/box/applier.cc | 2 +-
 src/box/txn.c      | 4 ++--
 src/box/txn.h      | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/box/applier.cc b/src/box/applier.cc
index ff40e03c6..8666a3a98 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -805,7 +805,7 @@ applier_apply_tx(struct stailq *rows)
 	trigger_create(on_commit, applier_txn_commit_cb, NULL, NULL);
 	txn_on_commit(txn, on_commit);
 
-	if (txn_write(txn) < 0)
+	if (txn_commit_async(txn) < 0)
 		goto fail;
 
 	/* Transaction was sent to journal so promote vclock. */
diff --git a/src/box/txn.c b/src/box/txn.c
index 6799f6c4b..9f61303ab 100644
--- a/src/box/txn.c
+++ b/src/box/txn.c
@@ -558,7 +558,7 @@ txn_prepare(struct txn *txn)
 }
 
 int
-txn_write(struct txn *txn)
+txn_commit_async(struct txn *txn)
 {
 	if (txn_prepare(txn) != 0) {
 		txn_rollback(txn);
@@ -586,7 +586,7 @@ txn_commit(struct txn *txn)
 {
 	txn->fiber = fiber();
 
-	if (txn_write(txn) != 0)
+	if (txn_commit_async(txn) != 0)
 		return -1;
 	/*
 	 * In case of non-yielding journal the transaction could already
diff --git a/src/box/txn.h b/src/box/txn.h
index 7a7e52954..b950e2e18 100644
--- a/src/box/txn.h
+++ b/src/box/txn.h
@@ -300,7 +300,7 @@ txn_rollback(struct txn *txn);
  * freed.
  */
 int
-txn_write(struct txn *txn);
+txn_commit_async(struct txn *txn);
 
 /**
  * Most txns don't have triggers, and txn objects
-- 
2.20.1

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Tarantool-patches] [PATCH 04/10] box/txn: move setup of txn start to txn_prepare
  2020-03-05 12:29 [Tarantool-patches] [PATCH 00/10] box/journal: redesign sync and async writes Cyrill Gorcunov
                   ` (2 preceding siblings ...)
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 03/10] box/txn: rename txn_write to txn_commit_async Cyrill Gorcunov
@ 2020-03-05 12:29 ` Cyrill Gorcunov
  2020-03-06 21:30   ` Konstantin Osipov
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 05/10] box/txn: add txn_commit_nop helper Cyrill Gorcunov
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 28+ messages in thread
From: Cyrill Gorcunov @ 2020-03-05 12:29 UTC (permalink / raw)
  To: tml

For unification sake, we will handle nop transactions
via common helper for both sync and async cases.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 src/box/txn.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/box/txn.c b/src/box/txn.c
index 9f61303ab..5dd52e6db 100644
--- a/src/box/txn.c
+++ b/src/box/txn.c
@@ -554,6 +554,15 @@ txn_prepare(struct txn *txn)
 	trigger_clear(&txn->fiber_on_stop);
 	if (!txn_has_flag(txn, TXN_CAN_YIELD))
 		trigger_clear(&txn->fiber_on_yield);
+
+	/*
+	 * It is important to set start transaction
+	 * time at the last moment, when everything
+	 * is ready to initiate commit procedure,
+	 * just to be more precise in timings to
+	 * detect long WAL writes.
+	 */
+	txn->start_tm = ev_monotonic_now(loop());
 	return 0;
 }
 
@@ -569,7 +578,6 @@ txn_commit_async(struct txn *txn)
 	 * After this point the transaction must not be used
 	 * so reset the corresponding key in the fiber storage.
 	 */
-	txn->start_tm = ev_monotonic_now(loop());
 	if (txn->n_new_rows + txn->n_applier_rows == 0) {
 		/* Nothing to do. */
 		txn->signature = 0;
-- 
2.20.1

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Tarantool-patches] [PATCH 05/10] box/txn: add txn_commit_nop helper
  2020-03-05 12:29 [Tarantool-patches] [PATCH 00/10] box/journal: redesign sync and async writes Cyrill Gorcunov
                   ` (3 preceding siblings ...)
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 04/10] box/txn: move setup of txn start to txn_prepare Cyrill Gorcunov
@ 2020-03-05 12:29 ` Cyrill Gorcunov
  2020-03-06 21:30   ` Konstantin Osipov
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 06/10] box/txn: unweave txn_commit from txn_commit_async Cyrill Gorcunov
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 28+ messages in thread
From: Cyrill Gorcunov @ 2020-03-05 12:29 UTC (permalink / raw)
  To: tml

To reuse in sync trancastion once joural
redesign is complete.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 src/box/txn.c | 26 +++++++++++++++++++-------
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/src/box/txn.c b/src/box/txn.c
index 5dd52e6db..bee43a066 100644
--- a/src/box/txn.c
+++ b/src/box/txn.c
@@ -566,6 +566,22 @@ txn_prepare(struct txn *txn)
 	return 0;
 }
 
+/**
+ * Complete transaction early if it is barely nop.
+ */
+static bool
+txn_commit_nop(struct txn *txn)
+{
+	if (txn->n_new_rows + txn->n_applier_rows == 0) {
+		txn->signature = 0;
+		txn_complete(txn);
+		fiber_set_txn(fiber(), NULL);
+		return true;
+	}
+
+	return false;
+}
+
 int
 txn_commit_async(struct txn *txn)
 {
@@ -574,17 +590,13 @@ txn_commit_async(struct txn *txn)
 		return -1;
 	}
 
+	if (txn_commit_nop(txn))
+		return 0;
+
 	/*
 	 * After this point the transaction must not be used
 	 * so reset the corresponding key in the fiber storage.
 	 */
-	if (txn->n_new_rows + txn->n_applier_rows == 0) {
-		/* Nothing to do. */
-		txn->signature = 0;
-		txn_complete(txn);
-		fiber_set_txn(fiber(), NULL);
-		return 0;
-	}
 	fiber_set_txn(fiber(), NULL);
 	return txn_write_to_wal(txn);
 }
-- 
2.20.1

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Tarantool-patches] [PATCH 06/10] box/txn: unweave txn_commit from txn_commit_async
  2020-03-05 12:29 [Tarantool-patches] [PATCH 00/10] box/journal: redesign sync and async writes Cyrill Gorcunov
                   ` (4 preceding siblings ...)
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 05/10] box/txn: add txn_commit_nop helper Cyrill Gorcunov
@ 2020-03-05 12:29 ` Cyrill Gorcunov
  2020-03-06 21:33   ` Konstantin Osipov
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 07/10] box/txn: clear fiber storage right before journal write Cyrill Gorcunov
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 28+ messages in thread
From: Cyrill Gorcunov @ 2020-03-05 12:29 UTC (permalink / raw)
  To: tml

We gonna divergence sync and async code flow thus
lets make txn_commit to not use txn_commit_async.

Fixes #4031

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 src/box/txn.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/src/box/txn.c b/src/box/txn.c
index bee43a066..1c0143703 100644
--- a/src/box/txn.c
+++ b/src/box/txn.c
@@ -604,10 +604,24 @@ txn_commit_async(struct txn *txn)
 int
 txn_commit(struct txn *txn)
 {
+	int res = -1;
+
 	txn->fiber = fiber();
 
-	if (txn_commit_async(txn) != 0)
+	if (txn_prepare(txn) != 0) {
+		txn_rollback(txn);
+		goto out;
+	}
+
+	if (txn_commit_nop(txn)) {
+		res = 0;
+		goto out;
+	}
+
+	fiber_set_txn(fiber(), NULL);
+	if (txn_write_to_wal(txn) != 0)
 		return -1;
+
 	/*
 	 * In case of non-yielding journal the transaction could already
 	 * be done and there is nothing to wait in such cases.
@@ -617,10 +631,11 @@ txn_commit(struct txn *txn)
 		fiber_yield();
 		fiber_set_cancellable(cancellable);
 	}
-	int res = txn->signature >= 0 ? 0 : -1;
+	res = txn->signature >= 0 ? 0 : -1;
 	if (res != 0)
 		diag_set(ClientError, ER_WAL_IO);
 
+out:
 	/* Synchronous transactions are freed by the calling fiber. */
 	txn_free(txn);
 	return res;
-- 
2.20.1

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Tarantool-patches] [PATCH 07/10] box/txn: clear fiber storage right before journal write
  2020-03-05 12:29 [Tarantool-patches] [PATCH 00/10] box/journal: redesign sync and async writes Cyrill Gorcunov
                   ` (5 preceding siblings ...)
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 06/10] box/txn: unweave txn_commit from txn_commit_async Cyrill Gorcunov
@ 2020-03-05 12:29 ` Cyrill Gorcunov
  2020-03-06 21:34   ` Konstantin Osipov
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 08/10] box/txn: move journal allocation into separate routine Cyrill Gorcunov
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 28+ messages in thread
From: Cyrill Gorcunov @ 2020-03-05 12:29 UTC (permalink / raw)
  To: tml

Otherwise we won't be able to make a rollback in case
of journal_entry_new allocation failure.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 src/box/txn.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/box/txn.c b/src/box/txn.c
index 1c0143703..c5f24ebdb 100644
--- a/src/box/txn.c
+++ b/src/box/txn.c
@@ -513,7 +513,13 @@ txn_write_to_wal(struct txn *txn)
 	assert(remote_row == req->rows + txn->n_applier_rows);
 	assert(local_row == remote_row + txn->n_new_rows);
 
-	/* Send the entry to the journal. */
+	/*
+	 * Send the entry to the journal.
+	*
+	 * After this point the transaction must not be used
+	 * so reset the corresponding key in the fiber storage.
+	 */
+	fiber_set_txn(fiber(), NULL);
 	if (journal_write(req) < 0) {
 		diag_set(ClientError, ER_WAL_IO);
 		diag_log();
@@ -593,11 +599,6 @@ txn_commit_async(struct txn *txn)
 	if (txn_commit_nop(txn))
 		return 0;
 
-	/*
-	 * After this point the transaction must not be used
-	 * so reset the corresponding key in the fiber storage.
-	 */
-	fiber_set_txn(fiber(), NULL);
 	return txn_write_to_wal(txn);
 }
 
@@ -618,7 +619,6 @@ txn_commit(struct txn *txn)
 		goto out;
 	}
 
-	fiber_set_txn(fiber(), NULL);
 	if (txn_write_to_wal(txn) != 0)
 		return -1;
 
-- 
2.20.1

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Tarantool-patches] [PATCH 08/10] box/txn: move journal allocation into separate routine
  2020-03-05 12:29 [Tarantool-patches] [PATCH 00/10] box/journal: redesign sync and async writes Cyrill Gorcunov
                   ` (6 preceding siblings ...)
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 07/10] box/txn: clear fiber storage right before journal write Cyrill Gorcunov
@ 2020-03-05 12:29 ` Cyrill Gorcunov
  2020-03-06 21:35   ` Konstantin Osipov
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 09/10] box/journal: journal_entry_new -- drop setting up callbacks Cyrill Gorcunov
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 10/10] box/journal: redesign sync and async writes Cyrill Gorcunov
  9 siblings, 1 reply; 28+ messages in thread
From: Cyrill Gorcunov @ 2020-03-05 12:29 UTC (permalink / raw)
  To: tml

For now we assign callbacks unconditionally but it is
done to minimize changes in each patch in the series.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 src/box/txn.c | 55 ++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 39 insertions(+), 16 deletions(-)

diff --git a/src/box/txn.c b/src/box/txn.c
index c5f24ebdb..685f476e2 100644
--- a/src/box/txn.c
+++ b/src/box/txn.c
@@ -478,41 +478,49 @@ txn_entry_complete_cb(struct journal_entry *entry, void *data)
 	fiber_set_txn(fiber(), NULL);
 }
 
-static int64_t
-txn_write_to_wal(struct txn *txn)
+static struct journal_entry *
+txn_journal_entry_new(struct txn *txn)
 {
+	struct journal_entry *req;
+	struct txn_stmt *stmt;
+
 	assert(txn->n_new_rows + txn->n_applier_rows > 0);
 
-	/* Prepare a journal entry. */
-	struct journal_entry *req = journal_entry_new(txn->n_new_rows +
-						      txn->n_applier_rows,
-						      &txn->region,
-						      txn_entry_complete_cb,
-						      txn);
-	if (req == NULL) {
-		txn_rollback(txn);
-		return -1;
-	}
+	req = journal_entry_new(txn->n_new_rows + txn->n_applier_rows,
+				&txn->region, txn_entry_complete_cb, txn);
+	if (req == NULL)
+		return NULL;
 
-	struct txn_stmt *stmt;
 	struct xrow_header **remote_row = req->rows;
 	struct xrow_header **local_row = req->rows + txn->n_applier_rows;
+
 	stailq_foreach_entry(stmt, &txn->stmts, next) {
 		if (stmt->has_triggers) {
 			txn_init_triggers(txn);
 			rlist_splice(&txn->on_commit, &stmt->on_commit);
 		}
+
+		/* A read (e.g. select) request */
 		if (stmt->row == NULL)
-			continue; /* A read (e.g. select) request */
+			continue;
+
 		if (stmt->row->replica_id == 0)
 			*local_row++ = stmt->row;
 		else
 			*remote_row++ = stmt->row;
+
 		req->approx_len += xrow_approx_len(stmt->row);
 	}
+
 	assert(remote_row == req->rows + txn->n_applier_rows);
 	assert(local_row == remote_row + txn->n_new_rows);
 
+	return req;
+}
+
+static int64_t
+txn_write_to_wal(struct journal_entry *req)
+{
 	/*
 	 * Send the entry to the journal.
 	*
@@ -591,6 +599,8 @@ txn_commit_nop(struct txn *txn)
 int
 txn_commit_async(struct txn *txn)
 {
+	struct journal_entry *req;
+
 	if (txn_prepare(txn) != 0) {
 		txn_rollback(txn);
 		return -1;
@@ -599,12 +609,19 @@ txn_commit_async(struct txn *txn)
 	if (txn_commit_nop(txn))
 		return 0;
 
-	return txn_write_to_wal(txn);
+	req = txn_journal_entry_new(txn);
+	if (req == NULL) {
+		txn_rollback(txn);
+		return -1;
+	}
+
+	return txn_write_to_wal(req);
 }
 
 int
 txn_commit(struct txn *txn)
 {
+	struct journal_entry *req;
 	int res = -1;
 
 	txn->fiber = fiber();
@@ -619,7 +636,13 @@ txn_commit(struct txn *txn)
 		goto out;
 	}
 
-	if (txn_write_to_wal(txn) != 0)
+	req = txn_journal_entry_new(txn);
+	if (req == NULL) {
+		txn_rollback(txn);
+		goto out;
+	}
+
+	if (txn_write_to_wal(req) != 0)
 		return -1;
 
 	/*
-- 
2.20.1

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Tarantool-patches] [PATCH 09/10] box/journal: journal_entry_new -- drop setting up callbacks
  2020-03-05 12:29 [Tarantool-patches] [PATCH 00/10] box/journal: redesign sync and async writes Cyrill Gorcunov
                   ` (7 preceding siblings ...)
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 08/10] box/txn: move journal allocation into separate routine Cyrill Gorcunov
@ 2020-03-05 12:29 ` Cyrill Gorcunov
  2020-03-06 21:37   ` Konstantin Osipov
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 10/10] box/journal: redesign sync and async writes Cyrill Gorcunov
  9 siblings, 1 reply; 28+ messages in thread
From: Cyrill Gorcunov @ 2020-03-05 12:29 UTC (permalink / raw)
  To: tml

The callbacks gonna be different from sync and async
commits thus lets untangle it from journal_entry_new
interface but setup explicitly.

I think setting up NULLs for entry is a safe approach
so we can allow us to spend a few cycles.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 src/box/journal.c | 10 +++++-----
 src/box/journal.h |  4 +---
 src/box/txn.c     |  6 +++++-
 src/box/vy_log.c  |  3 +--
 4 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/src/box/journal.c b/src/box/journal.c
index 11e78990d..266ee5d1f 100644
--- a/src/box/journal.c
+++ b/src/box/journal.c
@@ -53,9 +53,7 @@ static struct journal dummy_journal = {
 struct journal *current_journal = &dummy_journal;
 
 struct journal_entry *
-journal_entry_new(size_t n_rows, struct region *region,
-		  journal_entry_complete_cb on_complete_cb,
-		  void *on_complete_cb_data)
+journal_entry_new(size_t n_rows, struct region *region)
 {
 	struct journal_entry *entry;
 
@@ -68,11 +66,13 @@ journal_entry_new(size_t n_rows, struct region *region,
 		diag_set(OutOfMemory, size, "region", "struct journal_entry");
 		return NULL;
 	}
+
 	entry->approx_len = 0;
 	entry->n_rows = n_rows;
 	entry->res = -1;
-	entry->on_complete_cb = on_complete_cb;
-	entry->on_complete_cb_data = on_complete_cb_data;
+	entry->on_complete_cb = NULL;
+	entry->on_complete_cb_data = NULL;
+
 	return entry;
 }
 
diff --git a/src/box/journal.h b/src/box/journal.h
index 64f167c6f..e74c69910 100644
--- a/src/box/journal.h
+++ b/src/box/journal.h
@@ -93,9 +93,7 @@ struct region;
  * @return NULL if out of memory, fiber diagnostics area is set
  */
 struct journal_entry *
-journal_entry_new(size_t n_rows, struct region *region,
-		  journal_entry_complete_cb on_complete_cb,
-		  void *on_complete_cb_data);
+journal_entry_new(size_t n_rows, struct region *region);
 
 /**
  * Finalize a single entry.
diff --git a/src/box/txn.c b/src/box/txn.c
index 685f476e2..613da181b 100644
--- a/src/box/txn.c
+++ b/src/box/txn.c
@@ -487,7 +487,7 @@ txn_journal_entry_new(struct txn *txn)
 	assert(txn->n_new_rows + txn->n_applier_rows > 0);
 
 	req = journal_entry_new(txn->n_new_rows + txn->n_applier_rows,
-				&txn->region, txn_entry_complete_cb, txn);
+				&txn->region);
 	if (req == NULL)
 		return NULL;
 
@@ -614,6 +614,8 @@ txn_commit_async(struct txn *txn)
 		txn_rollback(txn);
 		return -1;
 	}
+	req->on_complete_cb = txn_entry_complete_cb;
+	req->on_complete_cb_data = txn;
 
 	return txn_write_to_wal(req);
 }
@@ -641,6 +643,8 @@ txn_commit(struct txn *txn)
 		txn_rollback(txn);
 		goto out;
 	}
+	req->on_complete_cb = txn_entry_complete_cb;
+	req->on_complete_cb_data = txn;
 
 	if (txn_write_to_wal(req) != 0)
 		return -1;
diff --git a/src/box/vy_log.c b/src/box/vy_log.c
index cb291f3c8..276c5303e 100644
--- a/src/box/vy_log.c
+++ b/src/box/vy_log.c
@@ -815,8 +815,7 @@ vy_log_tx_flush(struct vy_log_tx *tx)
 		tx_size++;
 
 	size_t used = region_used(&fiber()->gc);
-	struct journal_entry *entry = journal_entry_new(tx_size, &fiber()->gc,
-							NULL, NULL);
+	struct journal_entry *entry = journal_entry_new(tx_size, &fiber()->gc);
 	if (entry == NULL)
 		goto err;
 
-- 
2.20.1

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Tarantool-patches] [PATCH 10/10] box/journal: redesign sync and async writes
  2020-03-05 12:29 [Tarantool-patches] [PATCH 00/10] box/journal: redesign sync and async writes Cyrill Gorcunov
                   ` (8 preceding siblings ...)
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 09/10] box/journal: journal_entry_new -- drop setting up callbacks Cyrill Gorcunov
@ 2020-03-05 12:29 ` Cyrill Gorcunov
  2020-03-06 21:48   ` Konstantin Osipov
  9 siblings, 1 reply; 28+ messages in thread
From: Cyrill Gorcunov @ 2020-03-05 12:29 UTC (permalink / raw)
  To: tml

Currently the journal provides only one method -- write,
which implies a callback to trigger upon write completion
(in contrary with 1.10 series where all commits were
processing in synchronous way).

Lets make difference between sync and async writes more
notable: provide journal::write_async method which takes
a callback and callback data as an argument, while
journal:write handle transaction in synchronous way.

Redesing notes:

1) Both journal_write and journal_write_async require
   the caller to pass valid fiber->storage.txn, on
   error the txn must not be reset but preserved so
   callers would be able to run txn_rollback

2) txn_commit and txn_commit_async call txn_rollback
   where appropriate

3) no need to call journal_entry_complete on sync
   writes anymore, it is handled by txn_commit
   by self

4) wal_write_in_wal_mode_none is too long, renamed
   to wal_write_none

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 src/box/box.cc    |   4 +-
 src/box/journal.c |  21 +++++++--
 src/box/journal.h |  48 ++++++++++++++++++---
 src/box/txn.c     | 102 ++++++++++++++++++++++----------------------
 src/box/wal.c     | 106 ++++++++++++++++++++++++++++++++++++++++++----
 5 files changed, 211 insertions(+), 70 deletions(-)

diff --git a/src/box/box.cc b/src/box/box.cc
index eb5931e37..03510eef9 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -322,14 +322,14 @@ recovery_journal_write(struct journal *base,
 {
 	struct recovery_journal *journal = (struct recovery_journal *) base;
 	entry->res = vclock_sum(journal->vclock);
-	journal_entry_complete(entry);
 	return 0;
 }
 
 static inline void
 recovery_journal_create(struct recovery_journal *journal, struct vclock *v)
 {
-	journal_create(&journal->base, recovery_journal_write, NULL);
+	journal_create(&journal->base, journal_no_write_async,
+		       recovery_journal_write, NULL);
 	journal->vclock = v;
 	journal_set(&journal->base);
 }
diff --git a/src/box/journal.c b/src/box/journal.c
index 266ee5d1f..2f78a4bc4 100644
--- a/src/box/journal.c
+++ b/src/box/journal.c
@@ -32,6 +32,21 @@
 #include <small/region.h>
 #include <diag.h>
 
+int
+journal_no_write_async(struct journal *journal,
+		       struct journal_entry *entry,
+		       journal_entry_complete_cb on_complete_cb,
+		       void *on_complete_cb_data)
+{
+	(void)journal;
+	(void)entry;
+	(void)on_complete_cb;
+	(void)on_complete_cb_data;
+
+	say_error("journal: write_async called from invalid context");
+	return -1;
+}
+
 /**
  * Used to load from a memtx snapshot. LSN is not used,
  * but txn_commit() must work.
@@ -41,13 +56,12 @@ dummy_journal_write(struct journal *journal, struct journal_entry *entry)
 {
 	(void) journal;
 	entry->res = 0;
-	journal_entry_complete(entry);
 	return 0;
 }
 
 static struct journal dummy_journal = {
-	dummy_journal_write,
-	NULL,
+	.write_async	= journal_no_write_async,
+	.write		= dummy_journal_write,
 };
 
 struct journal *current_journal = &dummy_journal;
@@ -75,4 +89,3 @@ journal_entry_new(size_t n_rows, struct region *region)
 
 	return entry;
 }
-
diff --git a/src/box/journal.h b/src/box/journal.h
index e74c69910..fac4d4e78 100644
--- a/src/box/journal.h
+++ b/src/box/journal.h
@@ -34,6 +34,7 @@
 #include <stdbool.h>
 #include "salad/stailq.h"
 #include "fiber.h"
+#include "txn.h"
 
 #if defined(__cplusplus)
 extern "C" {
@@ -110,6 +111,10 @@ journal_entry_complete(struct journal_entry *entry)
  * synchronous replication.
  */
 struct journal {
+	int (*write_async)(struct journal *journal,
+			   struct journal_entry *entry,
+			   journal_entry_complete_cb on_complete_cb,
+			   void *on_complete_cb_data);
 	int (*write)(struct journal *journal,
 		     struct journal_entry *req);
 	void (*destroy)(struct journal *journal);
@@ -122,16 +127,33 @@ struct journal {
 extern struct journal *current_journal;
 
 /**
- * Send a single entry to write.
+ * Write a single entry to the journal in synchronous way.
  *
- * @return 0 if write was scheduled or -1 in case of an error.
+ * @return 0 if write was processed by a backend or -1 in case of an error.
  */
 static inline int
 journal_write(struct journal_entry *entry)
 {
+	assert(in_txn() != NULL);
 	return current_journal->write(current_journal, entry);
 }
 
+/**
+ * Queue a single entry to the journal in asynchronous way.
+ *
+ * @return 0 if write was queued to a backend or -1 in case of an error.
+ */
+static inline int
+journal_write_async(struct journal_entry *entry,
+		    journal_entry_complete_cb on_complete_cb,
+		    void *on_complete_cb_data)
+{
+	assert(in_txn() != NULL);
+	return current_journal->write_async(current_journal, entry,
+					    on_complete_cb,
+					    on_complete_cb_data);
+}
+
 /**
  * Change the current implementation of the journaling API.
  * Happens during life cycle of an instance:
@@ -163,17 +185,33 @@ journal_set(struct journal *new_journal)
 
 static inline void
 journal_create(struct journal *journal,
+	       int (*write_async)(struct journal *journal,
+				  struct journal_entry *entry,
+				  journal_entry_complete_cb on_complete_cb,
+				  void *on_complete_cb_data),
 	       int (*write)(struct journal *, struct journal_entry *),
 	       void (*destroy)(struct journal *))
 {
-	journal->write = write;
-	journal->destroy = destroy;
+	journal->write_async	= write_async,
+	journal->write		= write;
+	journal->destroy	= destroy;
 }
 
+/**
+ * A stub to issue an error in case if asynchronous
+ * write is diabled in the backend.
+ */
+extern int
+journal_no_write_async(struct journal *journal,
+		       struct journal_entry *entry,
+		       journal_entry_complete_cb on_complete_cb,
+		       void *on_complete_cb_data);
+
 static inline bool
 journal_is_initialized(struct journal *journal)
 {
-	return journal->write != NULL;
+	return journal->write != NULL &&
+		journal->write_async != NULL;
 }
 
 #if defined(__cplusplus)
diff --git a/src/box/txn.c b/src/box/txn.c
index 613da181b..27aa3d35e 100644
--- a/src/box/txn.c
+++ b/src/box/txn.c
@@ -463,7 +463,7 @@ txn_complete(struct txn *txn)
 }
 
 static void
-txn_entry_complete_cb(struct journal_entry *entry, void *data)
+txn_async_complete(struct journal_entry *entry, void *data)
 {
 	struct txn *txn = data;
 	txn->signature = entry->res;
@@ -478,6 +478,10 @@ txn_entry_complete_cb(struct journal_entry *entry, void *data)
 	fiber_set_txn(fiber(), NULL);
 }
 
+/**
+ * Allocate new journal entry with transaction
+ * data to write.
+ */
 static struct journal_entry *
 txn_journal_entry_new(struct txn *txn)
 {
@@ -518,24 +522,6 @@ txn_journal_entry_new(struct txn *txn)
 	return req;
 }
 
-static int64_t
-txn_write_to_wal(struct journal_entry *req)
-{
-	/*
-	 * Send the entry to the journal.
-	*
-	 * After this point the transaction must not be used
-	 * so reset the corresponding key in the fiber storage.
-	 */
-	fiber_set_txn(fiber(), NULL);
-	if (journal_write(req) < 0) {
-		diag_set(ClientError, ER_WAL_IO);
-		diag_log();
-		return -1;
-	}
-	return 0;
-}
-
 /*
  * Prepare a transaction using engines.
  */
@@ -596,42 +582,51 @@ txn_commit_nop(struct txn *txn)
 	return false;
 }
 
+/**
+ * Commit a transaction asynchronously, the
+ * completion is processed by a callback.
+ */
 int
 txn_commit_async(struct txn *txn)
 {
 	struct journal_entry *req;
 
-	if (txn_prepare(txn) != 0) {
-		txn_rollback(txn);
-		return -1;
-	}
+	if (txn_prepare(txn) != 0)
+		goto out_rollback;
 
 	if (txn_commit_nop(txn))
 		return 0;
 
 	req = txn_journal_entry_new(txn);
-	if (req == NULL) {
-		txn_rollback(txn);
-		return -1;
+	if (req == NULL)
+		goto out_rollback;
+
+	if (journal_write_async(req, txn_async_complete, txn) != 0) {
+		diag_set(ClientError, ER_WAL_IO);
+		diag_log();
+		goto out_rollback;
 	}
-	req->on_complete_cb = txn_entry_complete_cb;
-	req->on_complete_cb_data = txn;
 
-	return txn_write_to_wal(req);
+	return 0;
+
+out_rollback:
+	txn_rollback(txn);
+	return -1;
 }
 
+/**
+ * Commit a transaction synchronously.
+ */
 int
 txn_commit(struct txn *txn)
 {
 	struct journal_entry *req;
-	int res = -1;
+	int res;
 
 	txn->fiber = fiber();
 
-	if (txn_prepare(txn) != 0) {
-		txn_rollback(txn);
-		goto out;
-	}
+	if (txn_prepare(txn) != 0)
+		goto out_rollback;
 
 	if (txn_commit_nop(txn)) {
 		res = 0;
@@ -639,33 +634,40 @@ txn_commit(struct txn *txn)
 	}
 
 	req = txn_journal_entry_new(txn);
-	if (req == NULL) {
-		txn_rollback(txn);
-		goto out;
-	}
-	req->on_complete_cb = txn_entry_complete_cb;
-	req->on_complete_cb_data = txn;
-
-	if (txn_write_to_wal(req) != 0)
-		return -1;
+	if (req == NULL)
+		goto out_rollback;
 
 	/*
-	 * In case of non-yielding journal the transaction could already
-	 * be done and there is nothing to wait in such cases.
+	 * FIXME: Move error setup inside the
+	 * journal engine itself. The ClientError
+	 * here is too general.
 	 */
-	if (!txn_has_flag(txn, TXN_IS_DONE)) {
-		bool cancellable = fiber_set_cancellable(false);
-		fiber_yield();
-		fiber_set_cancellable(cancellable);
+
+	if (journal_write(req) != 0) {
+		diag_set(ClientError, ER_WAL_IO);
+		diag_log();
+		goto out_rollback;
 	}
+
+	txn->signature = req->res;
 	res = txn->signature >= 0 ? 0 : -1;
-	if (res != 0)
+	if (res != 0) {
 		diag_set(ClientError, ER_WAL_IO);
+		diag_log();
+	}
 
+	txn_complete(txn);
+	fiber_set_txn(fiber(), NULL);
 out:
+
 	/* Synchronous transactions are freed by the calling fiber. */
 	txn_free(txn);
 	return res;
+
+out_rollback:
+	res = -1;
+	txn_rollback(txn);
+	goto out;
 }
 
 void
diff --git a/src/box/wal.c b/src/box/wal.c
index 1668c9348..dd9563f31 100644
--- a/src/box/wal.c
+++ b/src/box/wal.c
@@ -32,6 +32,7 @@
 
 #include "vclock.h"
 #include "fiber.h"
+#include "txn.h"
 #include "fio.h"
 #include "errinj.h"
 #include "error.h"
@@ -60,11 +61,19 @@ const char *wal_mode_STRS[] = { "none", "write", "fsync", NULL };
 
 int wal_dir_lock = -1;
 
+static int
+wal_write_async(struct journal *, struct journal_entry *,
+		journal_entry_complete_cb, void *);
+
 static int
 wal_write(struct journal *, struct journal_entry *);
 
 static int
-wal_write_in_wal_mode_none(struct journal *, struct journal_entry *);
+wal_write_none_async(struct journal *, struct journal_entry *,
+		     journal_entry_complete_cb, void *);
+
+static int
+wal_write_none(struct journal *, struct journal_entry *);
 
 /*
  * WAL writer - maintain a Write Ahead Log for every change
@@ -349,8 +358,12 @@ wal_writer_create(struct wal_writer *writer, enum wal_mode wal_mode,
 {
 	writer->wal_mode = wal_mode;
 	writer->wal_max_size = wal_max_size;
-	journal_create(&writer->base, wal_mode == WAL_NONE ?
-		       wal_write_in_wal_mode_none : wal_write, NULL);
+	journal_create(&writer->base,
+		       wal_mode == WAL_NONE ?
+		       wal_write_none_async : wal_write_async,
+		       wal_mode == WAL_NONE ?
+		       wal_write_none : wal_write,
+		       NULL);
 
 	struct xlog_opts opts = xlog_opts_default;
 	opts.sync_is_async = true;
@@ -1170,9 +1183,21 @@ wal_writer_f(va_list ap)
  * to be written to disk.
  */
 static int
-wal_write(struct journal *journal, struct journal_entry *entry)
+wal_write_async(struct journal *journal, struct journal_entry *entry,
+		journal_entry_complete_cb on_complete_cb,
+		void *on_complete_cb_data)
 {
 	struct wal_writer *writer = (struct wal_writer *) journal;
+	struct txn *txn = in_txn();
+
+	/*
+	 * After this point the transaction will
+	 * live on its own and processed via callbacks,
+	 * so reset the fiber storage.
+	 */
+	entry->on_complete_cb = on_complete_cb;
+	entry->on_complete_cb_data = on_complete_cb_data;
+	fiber_set_txn(fiber(), NULL);
 
 	ERROR_INJECT(ERRINJ_WAL_IO, {
 		goto fail;
@@ -1220,27 +1245,90 @@ wal_write(struct journal *journal, struct journal_entry *entry)
 	return 0;
 
 fail:
+	/*
+	 * Don't forget to restore transaction
+	 * in a fiber storage: the caller should
+	 * be able to run a rollback procedure.
+	 */
+	fiber_set_txn(fiber(), txn);
 	entry->res = -1;
-	journal_entry_complete(entry);
+	txn->signature = -1;
 	return -1;
 }
 
+static void
+wal_write_cb(struct journal_entry *entry, void *data)
+{
+	struct txn *txn = data;
+	(void)entry;
+
+	/*
+	 * On synchronous write just wake up
+	 * the waiter which will complete the
+	 * transaction.
+	 */
+	fiber_wakeup(txn->fiber);
+}
+
+/*
+ * Queue entry to write and wait until it processed.
+ */
 static int
-wal_write_in_wal_mode_none(struct journal *journal,
-			   struct journal_entry *entry)
+wal_write(struct journal *journal, struct journal_entry *entry)
 {
-	struct wal_writer *writer = (struct wal_writer *) journal;
+	struct txn *txn = in_txn();
+
+	/*
+	 * Lets reuse async WAL engine to shrink code a bit.
+	 */
+	if (wal_write_async(journal, entry, wal_write_cb, txn) != 0)
+		return -1;
+
+	bool cancellable = fiber_set_cancellable(false);
+	fiber_yield();
+	fiber_set_cancellable(cancellable);
+
+	/*
+	 * Unlike async write we preserve the transaction
+	 * in a fiber storage where the caller should finish
+	 * the transaction.
+	 */
+	fiber_set_txn(fiber(), txn);
+	return 0;
+}
+
+static int
+wal_write_none_async(struct journal *journal,
+		     struct journal_entry *entry,
+		     journal_entry_complete_cb on_complete_cb,
+		     void *on_complete_cb_data)
+{
+	struct wal_writer *writer = (struct wal_writer *)journal;
 	struct vclock vclock_diff;
+	struct txn *txn = in_txn();
+
+	(void)on_complete_cb;
+	(void)on_complete_cb_data;
+
+	fiber_set_txn(fiber(), NULL);
+
 	vclock_create(&vclock_diff);
 	wal_assign_lsn(&vclock_diff, &writer->vclock, entry->rows,
 		       entry->rows + entry->n_rows);
 	vclock_merge(&writer->vclock, &vclock_diff);
 	vclock_copy(&replicaset.vclock, &writer->vclock);
 	entry->res = vclock_sum(&writer->vclock);
-	journal_entry_complete(entry);
+
+	txn->signature = entry->res;
 	return 0;
 }
 
+static int
+wal_write_none(struct journal *journal, struct journal_entry *entry)
+{
+	return wal_write_none_async(journal, entry, NULL, NULL);
+}
+
 void
 wal_init_vy_log()
 {
-- 
2.20.1

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Tarantool-patches] [PATCH 01/10] box: recovery_journal_create -- set journal here
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 01/10] box: recovery_journal_create -- set journal here Cyrill Gorcunov
@ 2020-03-06 21:27   ` Konstantin Osipov
  0 siblings, 0 replies; 28+ messages in thread
From: Konstantin Osipov @ 2020-03-06 21:27 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml

* Cyrill Gorcunov <gorcunov@gmail.com> [20/03/05 15:32]:
> Allows to eliminate code duplication.

Yet recovery_journal_create() now does two things - creates and
sets a journal.
Then the next step is to move the declaration of struct
recovery_journal to this function and make this declaration static? 

static recovery_journal journal;
journal->vclock = v;
journal_set(&journal);


Anyway, this is unimportant, so lgtm.

> 
> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
> ---
>  src/box/box.cc | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/src/box/box.cc b/src/box/box.cc
> index 09dd67ab4..eb5931e37 100644
> --- a/src/box/box.cc
> +++ b/src/box/box.cc
> @@ -331,6 +331,7 @@ recovery_journal_create(struct recovery_journal *journal, struct vclock *v)
>  {
>  	journal_create(&journal->base, recovery_journal_write, NULL);
>  	journal->vclock = v;
> +	journal_set(&journal->base);
>  }
>  
>  static void
> @@ -2055,7 +2056,6 @@ bootstrap_from_master(struct replica *master)
>  	engine_begin_final_recovery_xc();
>  	struct recovery_journal journal;
>  	recovery_journal_create(&journal, &replicaset.vclock);
> -	journal_set(&journal.base);
>  
>  	if (!replication_anon) {
>  		applier_resume_to_state(applier, APPLIER_JOINED,
> @@ -2221,7 +2221,6 @@ local_recovery(const struct tt_uuid *instance_uuid,
>  
>  	struct recovery_journal journal;
>  	recovery_journal_create(&journal, &recovery->vclock);
> -	journal_set(&journal.base);
>  
>  	/*
>  	 * We explicitly request memtx to recover its

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Tarantool-patches] [PATCH 02/10] box/txn: move fiber_set_txn to header
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 02/10] box/txn: move fiber_set_txn to header Cyrill Gorcunov
@ 2020-03-06 21:27   ` Konstantin Osipov
  0 siblings, 0 replies; 28+ messages in thread
From: Konstantin Osipov @ 2020-03-06 21:27 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml

* Cyrill Gorcunov <gorcunov@gmail.com> [20/03/05 15:32]:

Would be nice to explain in the commit comment why you need it.

Assuming you do need it, lgtm.


-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Tarantool-patches] [PATCH 03/10] box/txn: rename txn_write to txn_commit_async
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 03/10] box/txn: rename txn_write to txn_commit_async Cyrill Gorcunov
@ 2020-03-06 21:28   ` Konstantin Osipov
  0 siblings, 0 replies; 28+ messages in thread
From: Konstantin Osipov @ 2020-03-06 21:28 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml

* Cyrill Gorcunov <gorcunov@gmail.com> [20/03/05 15:32]:
> To reflect the fact tha we're don't waiting for
> transaction to complete but rely on completion
> callback.

Finally. LGTM.

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Tarantool-patches] [PATCH 04/10] box/txn: move setup of txn start to txn_prepare
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 04/10] box/txn: move setup of txn start to txn_prepare Cyrill Gorcunov
@ 2020-03-06 21:30   ` Konstantin Osipov
  2020-03-18 12:38     ` Cyrill Gorcunov
  0 siblings, 1 reply; 28+ messages in thread
From: Konstantin Osipov @ 2020-03-06 21:30 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml

* Cyrill Gorcunov <gorcunov@gmail.com> [20/03/05 15:32]:
> For unification sake, we will handle nop transactions
> via common helper for both sync and async cases.
> 
> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
> ---
>  src/box/txn.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/src/box/txn.c b/src/box/txn.c
> index 9f61303ab..5dd52e6db 100644
> --- a/src/box/txn.c
> +++ b/src/box/txn.c
> @@ -554,6 +554,15 @@ txn_prepare(struct txn *txn)
>  	trigger_clear(&txn->fiber_on_stop);
>  	if (!txn_has_flag(txn, TXN_CAN_YIELD))
>  		trigger_clear(&txn->fiber_on_yield);
> +
> +	/*
> +	 * It is important to set start transaction
> +	 * time at the last moment, when everything
> +	 * is ready to initiate commit procedure,
> +	 * just to be more precise in timings to
> +	 * detect long WAL writes.
> +	 */

I think this comment is misleading. There are no yields
between start of txn_prepare() and end. There may appear yields
when vinyl lock manager is more smart, but there are no yields
now. So ev_monotonic_now returns exactly the same value regardless 
of where you call it.

Otherwise lgtm.

> +	txn->start_tm = ev_monotonic_now(loop());
>  	return 0;
>  }
>  

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Tarantool-patches] [PATCH 05/10] box/txn: add txn_commit_nop helper
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 05/10] box/txn: add txn_commit_nop helper Cyrill Gorcunov
@ 2020-03-06 21:30   ` Konstantin Osipov
  0 siblings, 0 replies; 28+ messages in thread
From: Konstantin Osipov @ 2020-03-06 21:30 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml

* Cyrill Gorcunov <gorcunov@gmail.com> [20/03/05 15:32]:
> To reuse in sync trancastion once joural

journal

> redesign is complete.

lgtm


-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Tarantool-patches] [PATCH 06/10] box/txn: unweave txn_commit from txn_commit_async
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 06/10] box/txn: unweave txn_commit from txn_commit_async Cyrill Gorcunov
@ 2020-03-06 21:33   ` Konstantin Osipov
  0 siblings, 0 replies; 28+ messages in thread
From: Konstantin Osipov @ 2020-03-06 21:33 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml

* Cyrill Gorcunov <gorcunov@gmail.com> [20/03/05 15:32]:
> We gonna divergence sync and async code flow thus

divergence->diverge. Or rather, split.

> lets make txn_commit to not use txn_commit_async.
> 
> +++ b/src/box/txn.c
> @@ -604,10 +604,24 @@ txn_commit_async(struct txn *txn)
>  int
>  txn_commit(struct txn *txn)
>  {
> +	int res = -1;
> +
>  	txn->fiber = fiber();
>  
> -	if (txn_commit_async(txn) != 0)
> +	if (txn_prepare(txn) != 0) {
> +		txn_rollback(txn);
> +		goto out;
> +	}
> +
> +	if (txn_commit_nop(txn)) {
> +		res = 0;
> +		goto out;
> +	}
> +
> +	fiber_set_txn(fiber(), NULL);
> +	if (txn_write_to_wal(txn) != 0)
>  		return -1;

I know this function will keep changing, but 
this specific place when you don't use res 
make the code temporarily ugly.

I suggest not having out: branch, and duplicate 
txn_free() in the above two branches.  No big deal.

But it's also minor.

So lgtm.


> +
>  	/*
>  	 * In case of non-yielding journal the transaction could already
>  	 * be done and there is nothing to wait in such cases.
> @@ -617,10 +631,11 @@ txn_commit(struct txn *txn)
>  		fiber_yield();
>  		fiber_set_cancellable(cancellable);
>  	}
> -	int res = txn->signature >= 0 ? 0 : -1;
> +	res = txn->signature >= 0 ? 0 : -1;
>  	if (res != 0)
>  		diag_set(ClientError, ER_WAL_IO);
>  
> +out:
>  	/* Synchronous transactions are freed by the calling fiber. */
>  	txn_free(txn);
>  	return res;
> -- 
> 2.20.1
> 

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Tarantool-patches] [PATCH 07/10] box/txn: clear fiber storage right before journal write
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 07/10] box/txn: clear fiber storage right before journal write Cyrill Gorcunov
@ 2020-03-06 21:34   ` Konstantin Osipov
  0 siblings, 0 replies; 28+ messages in thread
From: Konstantin Osipov @ 2020-03-06 21:34 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml

* Cyrill Gorcunov <gorcunov@gmail.com> [20/03/05 15:32]:
> Otherwise we won't be able to make a rollback in case
> of journal_entry_new allocation failure.

LGTM.


-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Tarantool-patches] [PATCH 08/10] box/txn: move journal allocation into separate routine
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 08/10] box/txn: move journal allocation into separate routine Cyrill Gorcunov
@ 2020-03-06 21:35   ` Konstantin Osipov
  0 siblings, 0 replies; 28+ messages in thread
From: Konstantin Osipov @ 2020-03-06 21:35 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml

* Cyrill Gorcunov <gorcunov@gmail.com> [20/03/05 15:32]:
> For now we assign callbacks unconditionally but it is
> done to minimize changes in each patch in the series.

Nice!

LGTM.


-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Tarantool-patches] [PATCH 09/10] box/journal: journal_entry_new -- drop setting up callbacks
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 09/10] box/journal: journal_entry_new -- drop setting up callbacks Cyrill Gorcunov
@ 2020-03-06 21:37   ` Konstantin Osipov
  2020-03-06 21:41     ` Cyrill Gorcunov
  0 siblings, 1 reply; 28+ messages in thread
From: Konstantin Osipov @ 2020-03-06 21:37 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml

* Cyrill Gorcunov <gorcunov@gmail.com> [20/03/05 15:32]:
> The callbacks gonna be different from sync and async
> commits thus lets untangle it from journal_entry_new
> interface but setup explicitly.
> 
> I think setting up NULLs for entry is a safe approach
> so we can allow us to spend a few cycles.

This is a hot path. 

I don't see much value in this patch, but I don't think it
makes much of a difference either way.

My preference is to drop this patch, just pass different
call backs from different call sites.


-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Tarantool-patches] [PATCH 09/10] box/journal: journal_entry_new -- drop setting up callbacks
  2020-03-06 21:37   ` Konstantin Osipov
@ 2020-03-06 21:41     ` Cyrill Gorcunov
  2020-03-06 21:51       ` Konstantin Osipov
  0 siblings, 1 reply; 28+ messages in thread
From: Cyrill Gorcunov @ 2020-03-06 21:41 UTC (permalink / raw)
  To: Konstantin Osipov, tml

On Sat, Mar 07, 2020 at 12:37:43AM +0300, Konstantin Osipov wrote:
> * Cyrill Gorcunov <gorcunov@gmail.com> [20/03/05 15:32]:
> > The callbacks gonna be different from sync and async
> > commits thus lets untangle it from journal_entry_new
> > interface but setup explicitly.
> > 
> > I think setting up NULLs for entry is a safe approach
> > so we can allow us to spend a few cycles.
> 
> This is a hot path. 

You mean to not setup NULL here? Maybe with Debug build only?

> I don't see much value in this patch, but I don't think it
> makes much of a difference either way.
> 
> My preference is to drop this patch, just pass different
> call backs from different call sites.

I don't mind to rework.

Kotya, there is a problem in this series. In one test case

[cyrill@uranus test] ./test-run.py replication/gc.test.lua
...
[002] replication/gc.test.lua                         memtx           [ pass ]
...

But vinyl test case fails

2020-03-07 00:35:51.116 [6494] main/136/applier/unix/:/home/cyrill/sda1 I> subscribed
2020-03-07 00:35:51.116 [6494] main/136/applier/unix/:/home/cyrill/sda1 I> remote vclock {1: 782} local vclock {1: 382}
tarantool: /home/cyrill/sda1/tarantool/tarantool.git/src/box/vy_tx.c:803: void vy_tx_commit(struct vy_tx *, int64_t): Assertion `xm->lsn <= lsn' failed.
[001] replication/gc.test.lua                         vinyl           
[001] 
[001] [Instance "replica" killed by signal: 6 (SIGABRT)]
[001] Found assertion fail in the results file [/home/cyrill/sda1/tarantool/tarantool.git/test/var/001_replication/replica.log]:
[001] [ fail ]
[Main process] Got failed test; gently terminate all workers...
---

Actually somtimes it pass, sometime fails. And I didn't find out why, yet. Investigating...

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Tarantool-patches] [PATCH 10/10] box/journal: redesign sync and async writes
  2020-03-05 12:29 ` [Tarantool-patches] [PATCH 10/10] box/journal: redesign sync and async writes Cyrill Gorcunov
@ 2020-03-06 21:48   ` Konstantin Osipov
  0 siblings, 0 replies; 28+ messages in thread
From: Konstantin Osipov @ 2020-03-06 21:48 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml

You're on track overall, please see comments inline.

* Cyrill Gorcunov <gorcunov@gmail.com> [20/03/05 15:32]:
> --- a/src/box/box.cc
> +++ b/src/box/box.cc
> @@ -322,14 +322,14 @@ recovery_journal_write(struct journal *base,
>  {
>  	struct recovery_journal *journal = (struct recovery_journal *) base;
>  	entry->res = vclock_sum(journal->vclock);
> -	journal_entry_complete(entry);
>  	return 0;

Yes!

>  }
>  
>  static inline void
>  recovery_journal_create(struct recovery_journal *journal, struct vclock *v)
>  {
> -	journal_create(&journal->base, recovery_journal_write, NULL);
> +	journal_create(&journal->base, journal_no_write_async,
> +		       recovery_journal_write, NULL);

Nice.

>  	journal->vclock = v;
>  	journal_set(&journal->base);
>  }
> diff --git a/src/box/journal.c b/src/box/journal.c
> index 266ee5d1f..2f78a4bc4 100644
> --- a/src/box/journal.c
> +++ b/src/box/journal.c
> @@ -32,6 +32,21 @@
>  #include <small/region.h>
>  #include <diag.h>
>  
> +int
> +journal_no_write_async(struct journal *journal,
> +		       struct journal_entry *entry,
> +		       journal_entry_complete_cb on_complete_cb,
> +		       void *on_complete_cb_data)
> +{
> +	(void)journal;
> +	(void)entry;
> +	(void)on_complete_cb;
> +	(void)on_complete_cb_data;
> +
> +	say_error("journal: write_async called from invalid context");

Please panic.

> +	return -1;
> +}
> +
>  /**
>   * Used to load from a memtx snapshot. LSN is not used,
>   * but txn_commit() must work.
> @@ -41,13 +56,12 @@ dummy_journal_write(struct journal *journal, struct journal_entry *entry)
>  {
>  	(void) journal;
>  	entry->res = 0;
> -	journal_entry_complete(entry);

Nice!

> --- a/src/box/journal.h
> +++ b/src/box/journal.h
> @@ -34,6 +34,7 @@
>  #include <stdbool.h>
>  #include "salad/stailq.h"
>  #include "fiber.h"
> +#include "txn.h"

Stray include. Please remove.

>  /**
> - * Send a single entry to write.
> + * Write a single entry to the journal in synchronous way.
>   *
> - * @return 0 if write was scheduled or -1 in case of an error.
> + * @return 0 if write was processed by a backend or -1 in case of an error.
>   */
>  static inline int
>  journal_write(struct journal_entry *entry)
>  {
> +	assert(in_txn() != NULL);

You should not create a dependency loop just because of an assert.

>  static inline bool
>  journal_is_initialized(struct journal *journal)
>  {
> -	return journal->write != NULL;
> +	return journal->write != NULL &&
> +		journal->write_async != NULL;

This is an unnecessary change. They are always set together.
Please feel free to add a flag is_initialized if you don't like
checking a single member.

>  static void
> -txn_entry_complete_cb(struct journal_entry *entry, void *data)
> +txn_async_complete(struct journal_entry *entry, void *data)
>  {

Yes!

>  	struct txn *txn = data;
>  	txn->signature = entry->res;
> @@ -478,6 +478,10 @@ txn_entry_complete_cb(struct journal_entry *entry, void *data)
>  	fiber_set_txn(fiber(), NULL);
>  }
>  
> +/**
> + * Allocate new journal entry with transaction
> + * data to write.
> + */
>  static struct journal_entry *
>  txn_journal_entry_new(struct txn *txn)
>  {
> @@ -518,24 +522,6 @@ txn_journal_entry_new(struct txn *txn)
>  	return req;
>  }
>  
> -static int64_t
> -txn_write_to_wal(struct journal_entry *req)
> -{
> -	/*
> -	 * Send the entry to the journal.
> -	*
> -	 * After this point the transaction must not be used
> -	 * so reset the corresponding key in the fiber storage.
> -	 */
> -	fiber_set_txn(fiber(), NULL);
> -	if (journal_write(req) < 0) {
> -		diag_set(ClientError, ER_WAL_IO);
> -		diag_log();
> -		return -1;
> -	}
> -	return 0;
> -}

Awesome.

> -
>  /*
>   * Prepare a transaction using engines.
>   */
> @@ -596,42 +582,51 @@ txn_commit_nop(struct txn *txn)
>  	return false;
>  }
>  
> +/**
> + * Commit a transaction asynchronously, the
> + * completion is processed by a callback.
> + */
>  int
>  txn_commit_async(struct txn *txn)
>  {
>  	struct journal_entry *req;
>  
> -	if (txn_prepare(txn) != 0) {
> -		txn_rollback(txn);
> -		return -1;
> -	}
> +	if (txn_prepare(txn) != 0)
> +		goto out_rollback;

Why not simply rollback: or out:
out_rollback suggests there more out_s.

>  
>  	if (txn_commit_nop(txn))
>  		return 0;
>  
>  	req = txn_journal_entry_new(txn);
> -	if (req == NULL) {
> -		txn_rollback(txn);
> -		return -1;
> +	if (req == NULL)
> +		goto out_rollback;
> +
> +	if (journal_write_async(req, txn_async_complete, txn) != 0) {
> +		diag_set(ClientError, ER_WAL_IO);
> +		diag_log();
> +		goto out_rollback;
>  	}
> -	req->on_complete_cb = txn_entry_complete_cb;
> -	req->on_complete_cb_data = txn;
>  
> -	return txn_write_to_wal(req);
> +	return 0;
> +
> +out_rollback:
> +	txn_rollback(txn);
> +	return -1;
>  }
>  
> +/**
> + * Commit a transaction synchronously.
> + */
>  int
>  txn_commit(struct txn *txn)
>  {
>  	struct journal_entry *req;
> -	int res = -1;
> +	int res;
>  
>  	txn->fiber = fiber();

I believe you don't have to set it twice, it's already set.

>  
> -	if (txn_prepare(txn) != 0) {
> -		txn_rollback(txn);
> -		goto out;
> -	}
> +	if (txn_prepare(txn) != 0)
> +		goto out_rollback;
>  
>  	if (txn_commit_nop(txn)) {
>  		res = 0;
> @@ -639,33 +634,40 @@ txn_commit(struct txn *txn)
>  	}
>  
>  	req = txn_journal_entry_new(txn);
> -	if (req == NULL) {
> -		txn_rollback(txn);
> -		goto out;
> -	}
> -	req->on_complete_cb = txn_entry_complete_cb;
> -	req->on_complete_cb_data = txn;
> -
> -	if (txn_write_to_wal(req) != 0)
> -		return -1;
> +	if (req == NULL)
> +		goto out_rollback;
>  
>  	/*
> -	 * In case of non-yielding journal the transaction could already
> -	 * be done and there is nothing to wait in such cases.
> +	 * FIXME: Move error setup inside the
> +	 * journal engine itself. The ClientError
> +	 * here is too general.
>  	 */
> -	if (!txn_has_flag(txn, TXN_IS_DONE)) {
> -		bool cancellable = fiber_set_cancellable(false);
> -		fiber_yield();
> -		fiber_set_cancellable(cancellable);
> +
> +	if (journal_write(req) != 0) {
> +		diag_set(ClientError, ER_WAL_IO);
> +		diag_log();
> +		goto out_rollback;
>  	}
> +
> +	txn->signature = req->res;
>  	res = txn->signature >= 0 ? 0 : -1;
> -	if (res != 0)
> +	if (res != 0) {
>  		diag_set(ClientError, ER_WAL_IO);
> +		diag_log();
> +	}
>  
> +	txn_complete(txn);
> +	fiber_set_txn(fiber(), NULL);
>  out:
> +
>  	/* Synchronous transactions are freed by the calling fiber. */
>  	txn_free(txn);
>  	return res;
> +
> +out_rollback:
> +	res = -1;
> +	txn_rollback(txn);
> +	goto out;
>  }


>  
>  void
> diff --git a/src/box/wal.c b/src/box/wal.c
> index 1668c9348..dd9563f31 100644
> --- a/src/box/wal.c
> +++ b/src/box/wal.c
> @@ -32,6 +32,7 @@
>  
>  #include "vclock.h"
>  #include "fiber.h"
> +#include "txn.h"

Noooo.

Please avoid dependency loops.

>  #include "fio.h"
>  #include "errinj.h"
>  #include "error.h"
> @@ -60,11 +61,19 @@ const char *wal_mode_STRS[] = { "none", "write", "fsync", NULL };
>  
>  int wal_dir_lock = -1;
>  
> +static int
> +wal_write_async(struct journal *, struct journal_entry *,
> +		journal_entry_complete_cb, void *);
> +
>  static int
>  wal_write(struct journal *, struct journal_entry *);
>  
>  static int
> -wal_write_in_wal_mode_none(struct journal *, struct journal_entry *);
> +wal_write_none_async(struct journal *, struct journal_entry *,
> +		     journal_entry_complete_cb, void *);
> +
> +static int
> +wal_write_none(struct journal *, struct journal_entry *);
>  
>  /*
>   * WAL writer - maintain a Write Ahead Log for every change
> @@ -349,8 +358,12 @@ wal_writer_create(struct wal_writer *writer, enum wal_mode wal_mode,
>  {
>  	writer->wal_mode = wal_mode;
>  	writer->wal_max_size = wal_max_size;
> -	journal_create(&writer->base, wal_mode == WAL_NONE ?
> -		       wal_write_in_wal_mode_none : wal_write, NULL);
> +	journal_create(&writer->base,
> +		       wal_mode == WAL_NONE ?
> +		       wal_write_none_async : wal_write_async,
> +		       wal_mode == WAL_NONE ?
> +		       wal_write_none : wal_write,
> +		       NULL);
>  
>  	struct xlog_opts opts = xlog_opts_default;
>  	opts.sync_is_async = true;
> @@ -1170,9 +1183,21 @@ wal_writer_f(va_list ap)
>   * to be written to disk.
>   */
>  static int
> -wal_write(struct journal *journal, struct journal_entry *entry)
> +wal_write_async(struct journal *journal, struct journal_entry *entry,
> +		journal_entry_complete_cb on_complete_cb,
> +		void *on_complete_cb_data)
>  {
>  	struct wal_writer *writer = (struct wal_writer *) journal;
> +	struct txn *txn = in_txn();
> +
> +	/*
> +	 * After this point the transaction will
> +	 * live on its own and processed via callbacks,
> +	 * so reset the fiber storage.
> +	 */
> +	entry->on_complete_cb = on_complete_cb;
> +	entry->on_complete_cb_data = on_complete_cb_data;
> +	fiber_set_txn(fiber(), NULL);

This should be in txn.cc, why did you move it here?

>  
>  	ERROR_INJECT(ERRINJ_WAL_IO, {
>  		goto fail;
> @@ -1220,27 +1245,90 @@ wal_write(struct journal *journal, struct journal_entry *entry)
>  	return 0;
>  
>  fail:
> +	/*
> +	 * Don't forget to restore transaction
> +	 * in a fiber storage: the caller should
> +	 * be able to run a rollback procedure.
> +	 */
> +	fiber_set_txn(fiber(), txn);

This should not be needed now.

>  	entry->res = -1;
> -	journal_entry_complete(entry);
> +	txn->signature = -1;
>  	return -1;
>  }
>  
> +static void
> +wal_write_cb(struct journal_entry *entry, void *data)
> +{
> +	struct txn *txn = data;
> +	(void)entry;
> +
> +	/*
> +	 * On synchronous write just wake up
> +	 * the waiter which will complete the
> +	 * transaction.
> +	 */
> +	fiber_wakeup(txn->fiber);

Why not pass fiber as data, this would avoid having to include
txn.h.

> +}
> +
> +/*
> + * Queue entry to write and wait until it processed.
> + */
>  static int
> -wal_write_in_wal_mode_none(struct journal *journal,
> -			   struct journal_entry *entry)
> +wal_write(struct journal *journal, struct journal_entry *entry)
>  {
> -	struct wal_writer *writer = (struct wal_writer *) journal;
> +	struct txn *txn = in_txn();
> +
> +	/*
> +	 * Lets reuse async WAL engine to shrink code a bit.
> +	 */
> +	if (wal_write_async(journal, entry, wal_write_cb, txn) != 0)

txn -> fiber_self().
> +		return -1;
> +
> +	bool cancellable = fiber_set_cancellable(false);
> +	fiber_yield();
> +	fiber_set_cancellable(cancellable);
> +
> +	/*
> +	 * Unlike async write we preserve the transaction
> +	 * in a fiber storage where the caller should finish
> +	 * the transaction.
> +	 */
> +	fiber_set_txn(fiber(), txn);
> +	return 0;
> +}
> +
> +static int
> +wal_write_none_async(struct journal *journal,
> +		     struct journal_entry *entry,
> +		     journal_entry_complete_cb on_complete_cb,
> +		     void *on_complete_cb_data)
> +{
> +	struct wal_writer *writer = (struct wal_writer *)journal;
>  	struct vclock vclock_diff;
> +	struct txn *txn = in_txn();
> +
> +	(void)on_complete_cb;
> +	(void)on_complete_cb_data;
> +
> +	fiber_set_txn(fiber(), NULL);
> +
>  	vclock_create(&vclock_diff);
>  	wal_assign_lsn(&vclock_diff, &writer->vclock, entry->rows,
>  		       entry->rows + entry->n_rows);
>  	vclock_merge(&writer->vclock, &vclock_diff);
>  	vclock_copy(&replicaset.vclock, &writer->vclock);
>  	entry->res = vclock_sum(&writer->vclock);
> -	journal_entry_complete(entry);
> +
> +	txn->signature = entry->res;


Please move back this assignment to txn.cc.

>  	return 0;
>  }
>  
> +static int
> +wal_write_none(struct journal *journal, struct journal_entry *entry)
> +{
> +	return wal_write_none_async(journal, entry, NULL, NULL);
> +}
> +

> 

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Tarantool-patches] [PATCH 09/10] box/journal: journal_entry_new -- drop setting up callbacks
  2020-03-06 21:41     ` Cyrill Gorcunov
@ 2020-03-06 21:51       ` Konstantin Osipov
  2020-03-06 21:57         ` Cyrill Gorcunov
  0 siblings, 1 reply; 28+ messages in thread
From: Konstantin Osipov @ 2020-03-06 21:51 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml

* Cyrill Gorcunov <gorcunov@gmail.com> [20/03/07 00:44]:
> But vinyl test case fails
> 
> 2020-03-07 00:35:51.116 [6494] main/136/applier/unix/:/home/cyrill/sda1 I> subscribed
> 2020-03-07 00:35:51.116 [6494] main/136/applier/unix/:/home/cyrill/sda1 I> remote vclock {1: 782} local vclock {1: 382}
> tarantool: /home/cyrill/sda1/tarantool/tarantool.git/src/box/vy_tx.c:803: void vy_tx_commit(struct vy_tx *, int64_t): Assertion `xm->lsn <= lsn' failed.
> [001] replication/gc.test.lua                         vinyl           

This has to do most likely with out-of-order invocation of the
wakeup callbacks.

Are these sync callbacks or async callbacks?

When a batch of transactions are submitted to wal, they got to be
scheduled to commit in the order of their lsn.

the order gets broken here for some reason. Please find out why.

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Tarantool-patches] [PATCH 09/10] box/journal: journal_entry_new -- drop setting up callbacks
  2020-03-06 21:51       ` Konstantin Osipov
@ 2020-03-06 21:57         ` Cyrill Gorcunov
  2020-03-06 22:04           ` Konstantin Osipov
  0 siblings, 1 reply; 28+ messages in thread
From: Cyrill Gorcunov @ 2020-03-06 21:57 UTC (permalink / raw)
  To: Konstantin Osipov, tml

On Sat, Mar 07, 2020 at 12:51:03AM +0300, Konstantin Osipov wrote:
> * Cyrill Gorcunov <gorcunov@gmail.com> [20/03/07 00:44]:
> > But vinyl test case fails
> > 
> > 2020-03-07 00:35:51.116 [6494] main/136/applier/unix/:/home/cyrill/sda1 I> subscribed
> > 2020-03-07 00:35:51.116 [6494] main/136/applier/unix/:/home/cyrill/sda1 I> remote vclock {1: 782} local vclock {1: 382}
> > tarantool: /home/cyrill/sda1/tarantool/tarantool.git/src/box/vy_tx.c:803: void vy_tx_commit(struct vy_tx *, int64_t): Assertion `xm->lsn <= lsn' failed.
> > [001] replication/gc.test.lua                         vinyl           
> 
> This has to do most likely with out-of-order invocation of the
> wakeup callbacks.
> 
> Are these sync callbacks or async callbacks?

To be honest -- I don't know at the moment. This is replication test
if I understand correctly and I presume these are async calls, but
not sure yet. What is interesting is that vinyl internally use
journal entries and calls cbus directly to write the data (again,
if I understand correctly 'cause I know nothing about vinyl internals)
and this might be the reason.

> When a batch of transactions are submitted to wal, they got to be
> scheduled to commit in the order of their lsn.
> 
> the order gets broken here for some reason. Please find out why.

Yes. Will continue the next week. And thanks a huge for all your
comments. I'll try to address them in next series once I finish
hunting this test issue.

You know this series of patches is a bit ugly but once we have
a working set of patches we will beautify them the way we like.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Tarantool-patches] [PATCH 09/10] box/journal: journal_entry_new -- drop setting up callbacks
  2020-03-06 21:57         ` Cyrill Gorcunov
@ 2020-03-06 22:04           ` Konstantin Osipov
  0 siblings, 0 replies; 28+ messages in thread
From: Konstantin Osipov @ 2020-03-06 22:04 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml

* Cyrill Gorcunov <gorcunov@gmail.com> [20/03/07 00:59]:
> To be honest -- I don't know at the moment. This is replication test
> if I understand correctly and I presume these are async calls, but
> not sure yet. What is interesting is that vinyl internally use
> journal entries and calls cbus directly to write the data (again,
> if I understand correctly 'cause I know nothing about vinyl internals)
> and this might be the reason.

it just reuses the wal thread infrastructure, but this is a completely
different code path, not related to your changes at all.

your crash is in txn_complete(), it seems, which is not called for
vinyl vy_log entry at all, there are no callbacks and it uses
cbus_call(), so there is no entry batching either.


-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Tarantool-patches] [PATCH 04/10] box/txn: move setup of txn start to txn_prepare
  2020-03-06 21:30   ` Konstantin Osipov
@ 2020-03-18 12:38     ` Cyrill Gorcunov
  2020-03-18 13:58       ` Konstantin Osipov
  0 siblings, 1 reply; 28+ messages in thread
From: Cyrill Gorcunov @ 2020-03-18 12:38 UTC (permalink / raw)
  To: Konstantin Osipov, tml

On Sat, Mar 07, 2020 at 12:30:03AM +0300, Konstantin Osipov wrote:
> > +
> > +	/*
> > +	 * It is important to set start transaction
> > +	 * time at the last moment, when everything
> > +	 * is ready to initiate commit procedure,
> > +	 * just to be more precise in timings to
> > +	 * detect long WAL writes.
> > +	 */
> 
> I think this comment is misleading. There are no yields
> between start of txn_prepare() and end. There may appear yields
> when vinyl lock manager is more smart, but there are no yields
> now. So ev_monotonic_now returns exactly the same value regardless 
> of where you call it.

Kostya, I happen to miss this comment. Look, as far as I understand
this is not about yields, but this timing is used to detect if the
write itself takes too long time.

	if (stop_tm - txn->start_tm > too_long_threshold) {
		int n_rows = txn->n_new_rows + txn->n_applier_rows;
		say_warn_ratelimited("too long WAL write: %d rows at "
				     "LSN %lld: %.3f sec", n_rows,
				     txn->signature - n_rows + 1,
				     stop_tm - txn->start_tm);
	}

That's why I put into the comment that we should save time value
at the very end of commit procedure.

> 
> Otherwise lgtm.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Tarantool-patches] [PATCH 04/10] box/txn: move setup of txn start to txn_prepare
  2020-03-18 12:38     ` Cyrill Gorcunov
@ 2020-03-18 13:58       ` Konstantin Osipov
  2020-03-18 14:09         ` Cyrill Gorcunov
  0 siblings, 1 reply; 28+ messages in thread
From: Konstantin Osipov @ 2020-03-18 13:58 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml

* Cyrill Gorcunov <gorcunov@gmail.com> [20/03/18 15:42]:
> On Sat, Mar 07, 2020 at 12:30:03AM +0300, Konstantin Osipov wrote:
> > > +
> > > +	/*
> > > +	 * It is important to set start transaction
> > > +	 * time at the last moment, when everything
> > > +	 * is ready to initiate commit procedure,
> > > +	 * just to be more precise in timings to
> > > +	 * detect long WAL writes.
> > > +	 */
> > 
> > I think this comment is misleading. There are no yields
> > between start of txn_prepare() and end. There may appear yields
> > when vinyl lock manager is more smart, but there are no yields
> > now. So ev_monotonic_now returns exactly the same value regardless 
> > of where you call it.
> 
> Kostya, I happen to miss this comment. Look, as far as I understand
> this is not about yields, but this timing is used to detect if the
> write itself takes too long time.

ev time is only updated once per event loop cycle, so as long as
you don't yield, your time is the same.


-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Tarantool-patches] [PATCH 04/10] box/txn: move setup of txn start to txn_prepare
  2020-03-18 13:58       ` Konstantin Osipov
@ 2020-03-18 14:09         ` Cyrill Gorcunov
  0 siblings, 0 replies; 28+ messages in thread
From: Cyrill Gorcunov @ 2020-03-18 14:09 UTC (permalink / raw)
  To: Konstantin Osipov, tml

On Wed, Mar 18, 2020 at 04:58:15PM +0300, Konstantin Osipov wrote:
> > 
> > Kostya, I happen to miss this comment. Look, as far as I understand
> > this is not about yields, but this timing is used to detect if the
> > write itself takes too long time.
> 
> ev time is only updated once per event loop cycle, so as long as
> you don't yield, your time is the same.

Yeah, indeed. Just found its implementation. I'll drop this
misleading comment. Thanks!

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2020-03-18 14:09 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-05 12:29 [Tarantool-patches] [PATCH 00/10] box/journal: redesign sync and async writes Cyrill Gorcunov
2020-03-05 12:29 ` [Tarantool-patches] [PATCH 01/10] box: recovery_journal_create -- set journal here Cyrill Gorcunov
2020-03-06 21:27   ` Konstantin Osipov
2020-03-05 12:29 ` [Tarantool-patches] [PATCH 02/10] box/txn: move fiber_set_txn to header Cyrill Gorcunov
2020-03-06 21:27   ` Konstantin Osipov
2020-03-05 12:29 ` [Tarantool-patches] [PATCH 03/10] box/txn: rename txn_write to txn_commit_async Cyrill Gorcunov
2020-03-06 21:28   ` Konstantin Osipov
2020-03-05 12:29 ` [Tarantool-patches] [PATCH 04/10] box/txn: move setup of txn start to txn_prepare Cyrill Gorcunov
2020-03-06 21:30   ` Konstantin Osipov
2020-03-18 12:38     ` Cyrill Gorcunov
2020-03-18 13:58       ` Konstantin Osipov
2020-03-18 14:09         ` Cyrill Gorcunov
2020-03-05 12:29 ` [Tarantool-patches] [PATCH 05/10] box/txn: add txn_commit_nop helper Cyrill Gorcunov
2020-03-06 21:30   ` Konstantin Osipov
2020-03-05 12:29 ` [Tarantool-patches] [PATCH 06/10] box/txn: unweave txn_commit from txn_commit_async Cyrill Gorcunov
2020-03-06 21:33   ` Konstantin Osipov
2020-03-05 12:29 ` [Tarantool-patches] [PATCH 07/10] box/txn: clear fiber storage right before journal write Cyrill Gorcunov
2020-03-06 21:34   ` Konstantin Osipov
2020-03-05 12:29 ` [Tarantool-patches] [PATCH 08/10] box/txn: move journal allocation into separate routine Cyrill Gorcunov
2020-03-06 21:35   ` Konstantin Osipov
2020-03-05 12:29 ` [Tarantool-patches] [PATCH 09/10] box/journal: journal_entry_new -- drop setting up callbacks Cyrill Gorcunov
2020-03-06 21:37   ` Konstantin Osipov
2020-03-06 21:41     ` Cyrill Gorcunov
2020-03-06 21:51       ` Konstantin Osipov
2020-03-06 21:57         ` Cyrill Gorcunov
2020-03-06 22:04           ` Konstantin Osipov
2020-03-05 12:29 ` [Tarantool-patches] [PATCH 10/10] box/journal: redesign sync and async writes Cyrill Gorcunov
2020-03-06 21:48   ` Konstantin Osipov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox