Tarantool development patches archive
* [Tarantool-patches] [PATCH 0/5] qsync: write confirm/rollback without txn engine
@ 2020-07-22 15:33 Cyrill Gorcunov
  2020-07-22 15:33 ` [Tarantool-patches] [PATCH 1/5] journal: drop redundant declaration Cyrill Gorcunov
                   ` (5 more replies)
  0 siblings, 6 replies; 8+ messages in thread
From: Cyrill Gorcunov @ 2020-07-22 15:33 UTC (permalink / raw)
  To: tml; +Cc: Vladislav Shpilevoy

Vlad, please take a look once time permits. Note that the series is on
top of your series "[PATCH 0/2] Make txn_commit() simpler".

I think in our WAL engine we should go further and eliminate completion
calls for synchronous writes completely (internally WAL could set up
completion to fiber_wakeup and reuse the async engine), but from the API
side it would never call a completion associated with transactions. But
such a redesign is definitely for another series.

Anyway, in this series we make a first attempt at not using the txn engine
when we simply have to write a confirm/rollback record into a journal.

issue https://github.com/tarantool/tarantool/issues/5129
branch gorcunov/gh-5129-journal

Cyrill Gorcunov (5):
  journal: drop redundant declaration
  wal: bind asynchronous write completion to an entry
  journal: add journal_entry_create helper
  qsync: implement direct write of confirm/rollback into a journal
  qsync: fix release build

 src/box/box.cc             |  14 ++---
 src/box/iproto_constants.h |  24 +++++++++
 src/box/journal.c          |   8 ++-
 src/box/journal.h          |  39 ++++++++++----
 src/box/txn.c              |   2 +-
 src/box/txn_limbo.c        | 101 ++++++++++++++++++++-----------------
 src/box/vy_log.c           |   2 +-
 src/box/wal.c              |  20 ++++----
 src/box/wal.h              |   4 +-
 src/box/xrow.c             |  46 ++++-------------
 src/box/xrow.h             |  31 ++++--------
 11 files changed, 151 insertions(+), 140 deletions(-)

-- 
2.26.2


* [Tarantool-patches] [PATCH 1/5] journal: drop redundant declaration
  2020-07-22 15:33 [Tarantool-patches] [PATCH 0/5] qsync: write confirm/rollback without txn engine Cyrill Gorcunov
@ 2020-07-22 15:33 ` Cyrill Gorcunov
  2020-07-22 15:33 ` [Tarantool-patches] [PATCH 2/5] wal: bind asynchronous write completion to an entry Cyrill Gorcunov
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Cyrill Gorcunov @ 2020-07-22 15:33 UTC (permalink / raw)
  To: tml; +Cc: Vladislav Shpilevoy

journal_entry is defined right below, so there is no need
for a separate forward declaration.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 src/box/journal.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/box/journal.h b/src/box/journal.h
index 1a10e66c3..9049a2ce0 100644
--- a/src/box/journal.h
+++ b/src/box/journal.h
@@ -40,7 +40,6 @@ extern "C" {
 #endif /* defined(__cplusplus) */
 
 struct xrow_header;
-struct journal_entry;
 
 /**
  * An entry for an abstract journal.
-- 
2.26.2


* [Tarantool-patches] [PATCH 2/5] wal: bind asynchronous write completion to an entry
  2020-07-22 15:33 [Tarantool-patches] [PATCH 0/5] qsync: write confirm/rollback without txn engine Cyrill Gorcunov
  2020-07-22 15:33 ` [Tarantool-patches] [PATCH 1/5] journal: drop redundant declaration Cyrill Gorcunov
@ 2020-07-22 15:33 ` Cyrill Gorcunov
  2020-07-22 15:33 ` [Tarantool-patches] [PATCH 3/5] journal: add journal_entry_create helper Cyrill Gorcunov
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Cyrill Gorcunov @ 2020-07-22 15:33 UTC (permalink / raw)
  To: tml; +Cc: Vladislav Shpilevoy

In commit 77ba0e3504464131fe81c672d508d0275be2173a we redesigned
WAL journal operations such that the asynchronous write completion
is a single instance per journal.

It turned out that this simplification is too restrictive and doesn't
allow us to pass entries with custom completions into the journal.

Thus let's bring this ability back. We will need it to be able
to write "confirm" records into the WAL directly without touching
the transaction code at all.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 src/box/box.cc    | 14 ++++++++------
 src/box/journal.c |  2 ++
 src/box/journal.h | 18 +++++++++---------
 src/box/txn.c     |  2 +-
 src/box/vy_log.c  |  2 +-
 src/box/wal.c     | 20 +++++++++-----------
 src/box/wal.h     |  4 ++--
 7 files changed, 32 insertions(+), 30 deletions(-)

diff --git a/src/box/box.cc b/src/box/box.cc
index 83eef5d98..7d61f2ed2 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -348,7 +348,7 @@ recovery_journal_write(struct journal *base,
 	 * Since there're no actual writes, fire a
 	 * journal_async_complete callback right away.
 	 */
-	journal_async_complete(base, entry);
+	journal_async_complete(entry);
 	return 0;
 }
 
@@ -357,7 +357,7 @@ recovery_journal_create(struct vclock *v)
 {
 	static struct recovery_journal journal;
 	journal_create(&journal.base, recovery_journal_write,
-		       txn_complete_async, recovery_journal_write);
+		       recovery_journal_write);
 	journal.vclock = v;
 	journal_set(&journal.base);
 }
@@ -2193,8 +2193,10 @@ engine_init()
 static int
 bootstrap_journal_write(struct journal *base, struct journal_entry *entry)
 {
+	(void)base;
+
 	entry->res = 0;
-	journal_async_complete(base, entry);
+	journal_async_complete(entry);
 	return 0;
 }
 
@@ -2580,8 +2582,8 @@ box_cfg_xc(void)
 
 	int64_t wal_max_size = box_check_wal_max_size(cfg_geti64("wal_max_size"));
 	enum wal_mode wal_mode = box_check_wal_mode(cfg_gets("wal_mode"));
-	if (wal_init(wal_mode, txn_complete_async, cfg_gets("wal_dir"),
-		     wal_max_size, &INSTANCE_UUID, on_wal_garbage_collection,
+	if (wal_init(wal_mode, cfg_gets("wal_dir"), wal_max_size,
+		     &INSTANCE_UUID, on_wal_garbage_collection,
 		     on_wal_checkpoint_threshold) != 0) {
 		diag_raise();
 	}
@@ -2628,7 +2630,7 @@ box_cfg_xc(void)
 	}
 
 	struct journal bootstrap_journal;
-	journal_create(&bootstrap_journal, NULL, txn_complete_async,
+	journal_create(&bootstrap_journal, bootstrap_journal_write,
 		       bootstrap_journal_write);
 	journal_set(&bootstrap_journal);
 	auto bootstrap_journal_guard = make_scoped_guard([] {
diff --git a/src/box/journal.c b/src/box/journal.c
index f1e89aaa2..fb81acb39 100644
--- a/src/box/journal.c
+++ b/src/box/journal.c
@@ -36,6 +36,7 @@ struct journal *current_journal = NULL;
 
 struct journal_entry *
 journal_entry_new(size_t n_rows, struct region *region,
+		  void (*write_async_cb)(struct journal_entry *entry),
 		  void *complete_data)
 {
 	struct journal_entry *entry;
@@ -50,6 +51,7 @@ journal_entry_new(size_t n_rows, struct region *region,
 		return NULL;
 	}
 
+	entry->write_async_cb = write_async_cb;
 	entry->complete_data = complete_data;
 	entry->approx_len = 0;
 	entry->n_rows = n_rows;
diff --git a/src/box/journal.h b/src/box/journal.h
index 9049a2ce0..74a5eb050 100644
--- a/src/box/journal.h
+++ b/src/box/journal.h
@@ -60,6 +60,10 @@ struct journal_entry {
 	 * A journal entry completion callback argument.
 	 */
 	void *complete_data;
+	/**
+	 * Asynchronous write completion function.
+	 */
+	void (*write_async_cb)(struct journal_entry *entry);
 	/**
 	 * Approximate size of this request when encoded.
 	 */
@@ -83,6 +87,7 @@ struct region;
  */
 struct journal_entry *
 journal_entry_new(size_t n_rows, struct region *region,
+		  void (*write_async_cb)(struct journal_entry *entry),
 		  void *complete_data);
 
 /**
@@ -95,22 +100,19 @@ struct journal {
 	int (*write_async)(struct journal *journal,
 			   struct journal_entry *entry);
 
-	/** Asynchronous write completion */
-	void (*write_async_cb)(struct journal_entry *entry);
-
 	/** Synchronous write */
 	int (*write)(struct journal *journal,
 		     struct journal_entry *entry);
 };
 
 /**
- * Finalize a single entry.
+ * Complete asynchronous write.
  */
 static inline void
-journal_async_complete(struct journal *journal, struct journal_entry *entry)
+journal_async_complete(struct journal_entry *entry)
 {
-	assert(journal->write_async_cb != NULL);
-	journal->write_async_cb(entry);
+	assert(entry->write_async_cb != NULL);
+	entry->write_async_cb(entry);
 }
 
 /**
@@ -172,12 +174,10 @@ static inline void
 journal_create(struct journal *journal,
 	       int (*write_async)(struct journal *journal,
 				  struct journal_entry *entry),
-	       void (*write_async_cb)(struct journal_entry *entry),
 	       int (*write)(struct journal *journal,
 			    struct journal_entry *entry))
 {
 	journal->write_async	= write_async;
-	journal->write_async_cb	= write_async_cb;
 	journal->write		= write;
 }
 
diff --git a/src/box/txn.c b/src/box/txn.c
index 9c21258c5..cc1f496c5 100644
--- a/src/box/txn.c
+++ b/src/box/txn.c
@@ -551,7 +551,7 @@ txn_journal_entry_new(struct txn *txn)
 
 	/* Save space for an additional NOP row just in case. */
 	req = journal_entry_new(txn->n_new_rows + txn->n_applier_rows + 1,
-				&txn->region, txn);
+				&txn->region, txn_complete_async, txn);
 	if (req == NULL)
 		return NULL;
 
diff --git a/src/box/vy_log.c b/src/box/vy_log.c
index 311985c72..de4c5205c 100644
--- a/src/box/vy_log.c
+++ b/src/box/vy_log.c
@@ -818,7 +818,7 @@ vy_log_tx_flush(struct vy_log_tx *tx)
 	size_t used = region_used(&fiber()->gc);
 
 	struct journal_entry *entry;
-	entry = journal_entry_new(tx_size, &fiber()->gc, NULL);
+	entry = journal_entry_new(tx_size, &fiber()->gc, NULL, NULL);
 	if (entry == NULL)
 		goto err;
 
diff --git a/src/box/wal.c b/src/box/wal.c
index 37a8bd483..4e6025104 100644
--- a/src/box/wal.c
+++ b/src/box/wal.c
@@ -266,10 +266,9 @@ xlog_write_entry(struct xlog *l, struct journal_entry *entry)
 static void
 tx_schedule_queue(struct stailq *queue)
 {
-	struct wal_writer *writer = &wal_writer_singleton;
 	struct journal_entry *req, *tmp;
 	stailq_foreach_entry_safe(req, tmp, queue, fifo)
-		journal_async_complete(&writer->base, req);
+		journal_async_complete(req);
 }
 
 /**
@@ -403,9 +402,8 @@ tx_notify_checkpoint(struct cmsg *msg)
  */
 static void
 wal_writer_create(struct wal_writer *writer, enum wal_mode wal_mode,
-		  void (*wall_async_cb)(struct journal_entry *entry),
-		  const char *wal_dirname,
-		  int64_t wal_max_size, const struct tt_uuid *instance_uuid,
+		  const char *wal_dirname, int64_t wal_max_size,
+		  const struct tt_uuid *instance_uuid,
 		  wal_on_garbage_collection_f on_garbage_collection,
 		  wal_on_checkpoint_threshold_f on_checkpoint_threshold)
 {
@@ -415,7 +413,6 @@ wal_writer_create(struct wal_writer *writer, enum wal_mode wal_mode,
 	journal_create(&writer->base,
 		       wal_mode == WAL_NONE ?
 		       wal_write_none_async : wal_write_async,
-		       wall_async_cb,
 		       wal_mode == WAL_NONE ?
 		       wal_write_none : wal_write);
 
@@ -525,15 +522,15 @@ wal_open(struct wal_writer *writer)
 }
 
 int
-wal_init(enum wal_mode wal_mode, void (*wall_async_cb)(struct journal_entry *entry),
-	 const char *wal_dirname, int64_t wal_max_size, const struct tt_uuid *instance_uuid,
+wal_init(enum wal_mode wal_mode, const char *wal_dirname,
+	 int64_t wal_max_size, const struct tt_uuid *instance_uuid,
 	 wal_on_garbage_collection_f on_garbage_collection,
 	 wal_on_checkpoint_threshold_f on_checkpoint_threshold)
 {
 	/* Initialize the state. */
 	struct wal_writer *writer = &wal_writer_singleton;
-	wal_writer_create(writer, wal_mode, wall_async_cb, wal_dirname,
-			  wal_max_size, instance_uuid, on_garbage_collection,
+	wal_writer_create(writer, wal_mode, wal_dirname, wal_max_size,
+			  instance_uuid, on_garbage_collection,
 			  on_checkpoint_threshold);
 
 	/* Start WAL thread. */
@@ -1304,7 +1301,8 @@ wal_write_none_async(struct journal *journal,
 	vclock_merge(&writer->vclock, &vclock_diff);
 	vclock_copy(&replicaset.vclock, &writer->vclock);
 	entry->res = vclock_sum(&writer->vclock);
-	journal_async_complete(journal, entry);
+
+	journal_async_complete(entry);
 	return 0;
 }
 
diff --git a/src/box/wal.h b/src/box/wal.h
index f348dc636..9d0cada46 100644
--- a/src/box/wal.h
+++ b/src/box/wal.h
@@ -81,8 +81,8 @@ typedef void (*wal_on_checkpoint_threshold_f)(void);
  * Start WAL thread and initialize WAL writer.
  */
 int
-wal_init(enum wal_mode wal_mode, void (*wall_async_cb)(struct journal_entry *entry),
-	 const char *wal_dirname, int64_t wal_max_size, const struct tt_uuid *instance_uuid,
+wal_init(enum wal_mode wal_mode, const char *wal_dirname,
+	 int64_t wal_max_size, const struct tt_uuid *instance_uuid,
 	 wal_on_garbage_collection_f on_garbage_collection,
 	 wal_on_checkpoint_threshold_f on_checkpoint_threshold);
 
-- 
2.26.2
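
The move from a per-journal completion to a per-entry completion can be illustrated with a minimal, standalone sketch. The names below (`demo_entry`, `demo_async_complete`, `demo_schedule`, and both callbacks) are hypothetical stand-ins and not Tarantool's actual definitions; the sketch only shows that once the callback lives in the entry, one batch can mix entries with different completions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct journal_entry. */
struct demo_entry {
	/* Argument handed over to the completion callback. */
	void *complete_data;
	/* Per-entry completion, in the spirit of this patch. */
	void (*write_async_cb)(struct demo_entry *entry);
	/* Write result; a negative value means failure. */
	long res;
};

/* Same shape as journal_async_complete() after the patch. */
static inline void
demo_async_complete(struct demo_entry *entry)
{
	assert(entry->write_async_cb != NULL);
	entry->write_async_cb(entry);
}

/* Completion for a transaction-like entry: count completions. */
static void
demo_txn_cb(struct demo_entry *entry)
{
	int *counter = entry->complete_data;
	(*counter)++;
}

/* Completion for a raw entry: publish the write result. */
static void
demo_raw_cb(struct demo_entry *entry)
{
	long *out = entry->complete_data;
	*out = entry->res;
}

/* Complete every entry of a batch, whatever its callback is. */
static void
demo_schedule(struct demo_entry **batch, size_t n)
{
	for (size_t i = 0; i < n; i++)
		demo_async_complete(batch[i]);
}
```

With a single callback stored in the journal, the second kind of entry would have no way to bypass the transaction completion; here the scheduler does not need to know which kind it is completing.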


* [Tarantool-patches] [PATCH 3/5] journal: add journal_entry_create helper
  2020-07-22 15:33 [Tarantool-patches] [PATCH 0/5] qsync: write confirm/rollback without txn engine Cyrill Gorcunov
  2020-07-22 15:33 ` [Tarantool-patches] [PATCH 1/5] journal: drop redundant declaration Cyrill Gorcunov
  2020-07-22 15:33 ` [Tarantool-patches] [PATCH 2/5] wal: bind asynchronous write completion to an entry Cyrill Gorcunov
@ 2020-07-22 15:33 ` Cyrill Gorcunov
  2020-07-22 15:33 ` [Tarantool-patches] [PATCH 4/5] qsync: implement direct write of confirm/rollback into a journal Cyrill Gorcunov
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Cyrill Gorcunov @ 2020-07-22 15:33 UTC (permalink / raw)
  To: tml; +Cc: Vladislav Shpilevoy

Add a helper to initialize raw journal entries in place.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 src/box/journal.c |  8 ++------
 src/box/journal.h | 20 ++++++++++++++++++++
 2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/src/box/journal.c b/src/box/journal.c
index fb81acb39..159e32ff3 100644
--- a/src/box/journal.c
+++ b/src/box/journal.c
@@ -51,11 +51,7 @@ journal_entry_new(size_t n_rows, struct region *region,
 		return NULL;
 	}
 
-	entry->write_async_cb = write_async_cb;
-	entry->complete_data = complete_data;
-	entry->approx_len = 0;
-	entry->n_rows = n_rows;
-	entry->res = -1;
-
+	journal_entry_create(entry, n_rows, 0, write_async_cb,
+			     complete_data);
 	return entry;
 }
diff --git a/src/box/journal.h b/src/box/journal.h
index 74a5eb050..6e1160ad8 100644
--- a/src/box/journal.h
+++ b/src/box/journal.h
@@ -80,6 +80,26 @@ struct journal_entry {
 
 struct region;
 
+/**
+ * Initialize a new journal entry.
+ */
+static inline void
+journal_entry_create(struct journal_entry *entry, size_t n_rows,
+		     size_t approx_len,
+		     void (*write_async_cb)(struct journal_entry *entry),
+		     void *complete_data)
+{
+	/*
+	 * The fifo member is left untouched since it is for
+	 * internal use by the journal engine.
+	 */
+	entry->write_async_cb	= write_async_cb;
+	entry->complete_data	= complete_data;
+	entry->approx_len	= approx_len;
+	entry->n_rows		= n_rows;
+	entry->res		= -1;
+}
+
 /**
  * Create a new journal entry.
  *
-- 
2.26.2
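
The split between an allocating constructor and an in-place initializer can be sketched in isolation as follows. This is a simplified illustration under stated assumptions: `demo_entry`, `demo_entry_create` and `demo_entry_new` are hypothetical names, and `malloc()` stands in for the region allocator used by the real `journal_entry_new()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_row {
	int type;
};

/* Hypothetical, simplified stand-in for struct journal_entry. */
struct demo_entry {
	size_t approx_len;
	size_t n_rows;
	long res;
	/* Row pointers follow the header (flexible array member). */
	struct demo_row *rows[];
};

/* In-place initializer: works on any storage the caller provides. */
static inline void
demo_entry_create(struct demo_entry *entry, size_t n_rows,
		  size_t approx_len)
{
	entry->approx_len = approx_len;
	entry->n_rows = n_rows;
	/* Not written yet, so the result starts as a failure. */
	entry->res = -1;
}

/* Allocating constructor: gets the memory, then reuses the initializer. */
static struct demo_entry *
demo_entry_new(size_t n_rows)
{
	size_t size = sizeof(struct demo_entry) +
		      n_rows * sizeof(struct demo_row *);
	struct demo_entry *entry = malloc(size);
	if (entry == NULL)
		return NULL;
	demo_entry_create(entry, n_rows, 0);
	return entry;
}
```

Keeping the field assignments in one helper means a caller that owns its storage (a stack buffer, for instance) ends up with the same initial state as a caller that uses the allocating constructor.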


* [Tarantool-patches] [PATCH 4/5] qsync: implement direct write of confirm/rollback into a journal
  2020-07-22 15:33 [Tarantool-patches] [PATCH 0/5] qsync: write confirm/rollback without txn engine Cyrill Gorcunov
                   ` (2 preceding siblings ...)
  2020-07-22 15:33 ` [Tarantool-patches] [PATCH 3/5] journal: add journal_entry_create helper Cyrill Gorcunov
@ 2020-07-22 15:33 ` Cyrill Gorcunov
  2020-07-22 15:41   ` Cyrill Gorcunov
  2020-07-22 15:33 ` [Tarantool-patches] [PATCH 5/5] qsync: fix release build Cyrill Gorcunov
  2020-07-23 11:41 ` [Tarantool-patches] [PATCH 0/5] qsync: write confirm/rollback without txn engine Cyrill Gorcunov
  5 siblings, 1 reply; 8+ messages in thread
From: Cyrill Gorcunov @ 2020-07-22 15:33 UTC (permalink / raw)
  To: tml; +Cc: Vladislav Shpilevoy

When we need to write a CONFIRM or ROLLBACK message (which is just
a binary record in msgpack format) into a journal, we use txn code
to allocate a new transaction, encode the message there, and make it
walk the long txn path before it hits the journal. This not only
wastes resources but is also somewhat strange from an architectural
point of view.

Instead, let's encode the record on the stack and write it
directly to the journal.

Closes #5129

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 src/box/iproto_constants.h | 24 ++++++++++
 src/box/txn_limbo.c        | 93 +++++++++++++++++++++-----------------
 src/box/xrow.c             | 46 ++++---------------
 src/box/xrow.h             | 31 ++++---------
 4 files changed, 94 insertions(+), 100 deletions(-)

diff --git a/src/box/iproto_constants.h b/src/box/iproto_constants.h
index 6b850f101..8f0c06981 100644
--- a/src/box/iproto_constants.h
+++ b/src/box/iproto_constants.h
@@ -328,6 +328,30 @@ iproto_type_is_synchro_request(uint32_t type)
 	return type == IPROTO_CONFIRM || type == IPROTO_ROLLBACK;
 }
 
+/** CONFIRM/ROLLBACK entries encoded in MsgPack. */
+struct PACKED request_synchro_body {
+	uint8_t m_body;
+	uint8_t k_replica_id;
+	uint8_t m_replica_id;
+	uint32_t v_replica_id;
+	uint8_t k_lsn;
+	uint8_t m_lsn;
+	uint64_t v_lsn;
+};
+
+static inline void
+request_synchro_body_create(struct request_synchro_body *body,
+			    uint32_t replica_id, int64_t lsn)
+{
+	body->m_body = 0x80 | 2;
+	body->k_replica_id = IPROTO_REPLICA_ID;
+	body->m_replica_id = 0xce;
+	body->v_replica_id = mp_bswap_u32(replica_id);
+	body->k_lsn = IPROTO_LSN;
+	body->m_lsn = 0xcf;
+	body->v_lsn = mp_bswap_u64(lsn);
+}
+
 /** This is an error. */
 static inline bool
 iproto_type_is_error(uint32_t type)
diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
index a3936c569..de043f53d 100644
--- a/src/box/txn_limbo.c
+++ b/src/box/txn_limbo.c
@@ -32,6 +32,9 @@
 #include "txn_limbo.h"
 #include "replication.h"
 
+#include "iproto_constants.h"
+#include "journal.h"
+
 struct txn_limbo txn_limbo;
 
 static inline void
@@ -238,62 +241,70 @@ txn_limbo_wait_complete(struct txn_limbo *limbo, struct txn_limbo_entry *entry)
 	return 0;
 }
 
+static void
+txn_limbo_write_cb(struct journal_entry *entry)
+{
+	/*
+	 * Since we don't know from which context
+	 * we will be called (real wal engine or
+	 * some other non-context switching) we
+	 * might not need to wake up.
+	 */
+	if (entry->complete_data != fiber())
+		fiber_wakeup(entry->complete_data);
+}
+
+/**
+ * Write CONFIRM or ROLLBACK message to a journal directly
+ * without involving transaction engine because using txn
+ * engine is far from being cheap while we only need to
+ * write a small message.
+ */
 static int
-txn_limbo_write_confirm_rollback(struct txn_limbo *limbo, int64_t lsn,
-				 bool is_confirm)
+txn_limbo_write(uint32_t replica_id, int64_t lsn, int type)
 {
+	assert(replica_id != REPLICA_ID_NIL);
+	assert(type == IPROTO_CONFIRM || type == IPROTO_ROLLBACK);
 	assert(lsn > 0);
 
-	struct xrow_header row;
-	struct request request = {
-		.header = &row,
-	};
+	char buf[sizeof(struct journal_entry) +
+		 sizeof(struct xrow_header *) +
+		 sizeof(struct xrow_header)];
 
-	struct txn *txn = txn_begin();
-	if (txn == NULL)
-		return -1;
+	struct journal_entry *entry = (void *)buf;
+	struct xrow_header *row = (void *)&entry->rows[1];
+	entry->rows[0] = row;
 
-	int res = 0;
-	if (is_confirm) {
-		res = xrow_encode_confirm(&row, &txn->region,
-					  limbo->instance_id, lsn);
-	} else {
-		/*
-		 * This LSN is the first to be rolled back, so
-		 * the last "safe" lsn is lsn - 1.
-		 */
-		res = xrow_encode_rollback(&row, &txn->region,
-					   limbo->instance_id, lsn);
+	struct request_synchro_body body;
+	xrow_encode_confirm_rollback(row, &body, replica_id,
+				     lsn, type);
+
+	journal_entry_create(entry, 1, xrow_approx_len(row),
+			     txn_limbo_write_cb, fiber());
+
+	if (journal_write(entry) != 0) {
+		diag_set(ClientError, ER_WAL_IO);
+		diag_log();
+		return -1;
 	}
-	if (res == -1)
-		goto rollback;
-	/*
-	 * This is not really a transaction. It just uses txn API
-	 * to put the data into WAL. And obviously it should not
-	 * go to the limbo and block on the very same sync
-	 * transaction which it tries to confirm now.
-	 */
-	txn_set_flag(txn, TXN_FORCE_ASYNC);
 
-	if (txn_begin_stmt(txn, NULL) != 0)
-		goto rollback;
-	if (txn_commit_stmt(txn, &request) != 0)
-		goto rollback;
+	if (entry->res < 0) {
+		diag_set(ClientError, ER_WAL_IO);
+		diag_log();
+		return -1;
+	}
 
-	return txn_commit(txn);
-rollback:
-	txn_rollback(txn);
-	return -1;
+	return 0;
 }
 
 /**
  * Write a confirmation entry to WAL. After it's written all the
  * transactions waiting for confirmation may be finished.
  */
-static int
+static inline int
 txn_limbo_write_confirm(struct txn_limbo *limbo, int64_t lsn)
 {
-	return txn_limbo_write_confirm_rollback(limbo, lsn, true);
+	return txn_limbo_write(limbo->instance_id, lsn, IPROTO_CONFIRM);
 }
 
 void
@@ -339,10 +350,10 @@ txn_limbo_read_confirm(struct txn_limbo *limbo, int64_t lsn)
  * transactions following the current one and waiting for
  * confirmation must be rolled back.
  */
-static int
+static inline int
 txn_limbo_write_rollback(struct txn_limbo *limbo, int64_t lsn)
 {
-	return txn_limbo_write_confirm_rollback(limbo, lsn, false);
+	return txn_limbo_write(limbo->instance_id, lsn, IPROTO_ROLLBACK);
 }
 
 void
diff --git a/src/box/xrow.c b/src/box/xrow.c
index 0c797a9d5..bba4ea9e2 100644
--- a/src/box/xrow.c
+++ b/src/box/xrow.c
@@ -893,51 +893,23 @@ xrow_encode_dml(const struct request *request, struct region *region,
 	return iovcnt;
 }
 
-static int
-xrow_encode_confirm_rollback(struct xrow_header *row, struct region *region,
-			     uint32_t replica_id, int64_t lsn, int type)
+void
+xrow_encode_confirm_rollback(struct xrow_header *row,
+			     struct request_synchro_body *body,
+			     uint32_t replica_id, int64_t lsn,
+			     int type)
 {
-	size_t len = mp_sizeof_map(2) + mp_sizeof_uint(IPROTO_REPLICA_ID) +
-		     mp_sizeof_uint(replica_id) + mp_sizeof_uint(IPROTO_LSN) +
-		     mp_sizeof_uint(lsn);
-	char *buf = (char *)region_alloc(region, len);
-	if (buf == NULL) {
-		diag_set(OutOfMemory, len, "region_alloc", "buf");
-		return -1;
-	}
-	char *pos = buf;
+	assert(type == IPROTO_CONFIRM || type == IPROTO_ROLLBACK);
 
-	pos = mp_encode_map(pos, 2);
-	pos = mp_encode_uint(pos, IPROTO_REPLICA_ID);
-	pos = mp_encode_uint(pos, replica_id);
-	pos = mp_encode_uint(pos, IPROTO_LSN);
-	pos = mp_encode_uint(pos, lsn);
+	request_synchro_body_create(body, replica_id, lsn);
 
 	memset(row, 0, sizeof(*row));
 
-	row->body[0].iov_base = buf;
-	row->body[0].iov_len = len;
+	row->body[0].iov_base = body;
+	row->body[0].iov_len = sizeof(*body);
 	row->bodycnt = 1;
 
 	row->type = type;
-
-	return 0;
-}
-
-int
-xrow_encode_confirm(struct xrow_header *row, struct region *region,
-		    uint32_t replica_id, int64_t lsn)
-{
-	return xrow_encode_confirm_rollback(row, region, replica_id, lsn,
-					    IPROTO_CONFIRM);
-}
-
-int
-xrow_encode_rollback(struct xrow_header *row, struct region *region,
-		     uint32_t replica_id, int64_t lsn)
-{
-	return xrow_encode_confirm_rollback(row, region, replica_id, lsn,
-					    IPROTO_ROLLBACK);
 }
 
 static int
diff --git a/src/box/xrow.h b/src/box/xrow.h
index e21ede5a3..68fb4e8ef 100644
--- a/src/box/xrow.h
+++ b/src/box/xrow.h
@@ -54,6 +54,7 @@ enum {
 	IPROTO_SELECT_HEADER_LEN = IPROTO_HEADER_LEN + 7,
 };
 
+struct request_synchro_body;
 struct region;
 
 struct xrow_header {
@@ -216,18 +217,18 @@ xrow_encode_dml(const struct request *request, struct region *region,
 		struct iovec *iov);
 
 /**
- * Encode the CONFIRM to row body and set row type to
- * IPROTO_CONFIRM.
+ * Encode the CONFIRM or ROLLBACK to row body.
  * @param row xrow header.
- * @param region Region to use to encode the confirmation body.
+ * @param body encoded body.
  * @param replica_id master's instance id.
  * @param lsn last confirmed lsn.
- * @retval -1 on error.
- * @retval 0 success.
+ * @param type IPROTO_CONFIRM or IPROTO_ROLLBACK.
  */
-int
-xrow_encode_confirm(struct xrow_header *row, struct region *region,
-		    uint32_t replica_id, int64_t lsn);
+void
+xrow_encode_confirm_rollback(struct xrow_header *row,
+			     struct request_synchro_body *body,
+			     uint32_t replica_id, int64_t lsn,
+			     int type);
 
 /**
  * Decode the CONFIRM request body.
@@ -240,20 +241,6 @@ xrow_encode_confirm(struct xrow_header *row, struct region *region,
 int
 xrow_decode_confirm(struct xrow_header *row, uint32_t *replica_id, int64_t *lsn);
 
-/**
- * Encode the ROLLBACK row body and set row type to
- * IPROTO_ROLLBACK.
- * @param row xrow header.
- * @param region Region to use to encode the rollback body.
- * @param replica_id master's instance id.
- * @param lsn lsn to rollback from, including it.
- * @retval -1  on error.
- * @retval 0 success.
- */
-int
-xrow_encode_rollback(struct xrow_header *row, struct region *region,
-		     uint32_t replica_id, int64_t lsn);
-
 /**
  * Decode the ROLLBACK row body.
  * @param row xrow header.
-- 
2.26.2
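
The fixed-layout body introduced in this patch relies on MsgPack's fixed-width integer formats: `0x80 | 2` is a two-element fixmap, `0xce` introduces a big-endian uint32 and `0xcf` a big-endian uint64. The sketch below is a standalone illustration of that byte layout, not the patch's code. The key values `0x02` and `0x03` are assumptions made for the example, the `demo_` names are hypothetical, and bytes are stored explicitly in big-endian order instead of calling `mp_bswap_u32()`/`mp_bswap_u64()`. The `packed` attribute is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed key values, for illustration only. */
enum {
	DEMO_KEY_REPLICA_ID = 0x02,
	DEMO_KEY_LSN = 0x03,
};

/* Fixed-size MsgPack map: {replica_id: uint32, lsn: uint64}. */
struct __attribute__((packed)) demo_synchro_body {
	uint8_t m_body;
	uint8_t k_replica_id;
	uint8_t m_replica_id;
	uint32_t v_replica_id;
	uint8_t k_lsn;
	uint8_t m_lsn;
	uint64_t v_lsn;
};

static void
demo_synchro_body_create(struct demo_synchro_body *body,
			 uint32_t replica_id, int64_t lsn)
{
	uint8_t be32[4], be64[8];
	/* Serialize both values in network (big-endian) byte order. */
	for (int i = 0; i < 4; i++)
		be32[i] = (uint8_t)(replica_id >> (8 * (3 - i)));
	for (int i = 0; i < 8; i++)
		be64[i] = (uint8_t)((uint64_t)lsn >> (8 * (7 - i)));

	body->m_body = 0x80 | 2;	/* fixmap with 2 pairs */
	body->k_replica_id = DEMO_KEY_REPLICA_ID;
	body->m_replica_id = 0xce;	/* uint32 marker */
	body->k_lsn = DEMO_KEY_LSN;
	body->m_lsn = 0xcf;		/* uint64 marker */
	/* Copy by offset to avoid taking addresses of packed members. */
	memcpy((char *)body + offsetof(struct demo_synchro_body, v_replica_id),
	       be32, sizeof(be32));
	memcpy((char *)body + offsetof(struct demo_synchro_body, v_lsn),
	       be64, sizeof(be64));
}
```

Unlike `mp_encode_uint()`, which picks the shortest encoding for a value, this layout always uses the widest markers. The result is still valid MsgPack, and its size is a compile-time constant, which is what allows the body to live on the stack without a region allocation.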


* [Tarantool-patches] [PATCH 5/5] qsync: fix release build
  2020-07-22 15:33 [Tarantool-patches] [PATCH 0/5] qsync: write confirm/rollback without txn engine Cyrill Gorcunov
                   ` (3 preceding siblings ...)
  2020-07-22 15:33 ` [Tarantool-patches] [PATCH 4/5] qsync: implement direct write of confirm/rollback into a journal Cyrill Gorcunov
@ 2020-07-22 15:33 ` Cyrill Gorcunov
  2020-07-23 11:41 ` [Tarantool-patches] [PATCH 0/5] qsync: write confirm/rollback without txn engine Cyrill Gorcunov
  5 siblings, 0 replies; 8+ messages in thread
From: Cyrill Gorcunov @ 2020-07-22 15:33 UTC (permalink / raw)
  To: tml; +Cc: Vladislav Shpilevoy

The txn variable is used for debug purposes only
(in assert() calls), so it triggers an unused
variable issue in the release build.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 src/box/txn_limbo.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
index de043f53d..9c12f4fa6 100644
--- a/src/box/txn_limbo.c
+++ b/src/box/txn_limbo.c
@@ -167,13 +167,13 @@ txn_limbo_write_rollback(struct txn_limbo *limbo, int64_t lsn);
 int
 txn_limbo_wait_complete(struct txn_limbo *limbo, struct txn_limbo_entry *entry)
 {
-	struct txn *txn = entry->txn;
 	assert(entry->lsn > 0 || !txn_has_flag(entry->txn, TXN_WAIT_ACK));
 	if (txn_limbo_entry_is_complete(entry))
 		goto complete;
 
-	assert(!txn_has_flag(txn, TXN_IS_DONE));
-	assert(txn_has_flag(txn, TXN_WAIT_SYNC));
+	assert(!txn_has_flag(entry->txn, TXN_IS_DONE));
+	assert(txn_has_flag(entry->txn, TXN_WAIT_SYNC));
+
 	double start_time = fiber_clock();
 	while (true) {
 		double deadline = start_time + replication_synchro_timeout;
@@ -229,7 +229,7 @@ txn_limbo_wait_complete(struct txn_limbo *limbo, struct txn_limbo_entry *entry)
 	 * installed the commit/rollback flag.
 	 */
 	assert(rlist_empty(&entry->in_queue));
-	assert(txn_has_flag(txn, TXN_IS_DONE));
+	assert(txn_has_flag(entry->txn, TXN_IS_DONE));
 	/*
 	 * The first tx to be rolled back already performed all
 	 * the necessary cleanups for us.
-- 
2.26.2
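
The problem fixed by this patch can be reproduced outside Tarantool: when `NDEBUG` is defined, `assert()` expands to nothing, so a local that is referenced only inside assertions becomes unused. Below is a minimal sketch with hypothetical `demo_` names that shows the approach taken by the patch and one common alternative.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_txn {
	unsigned flags;
};

struct demo_limbo_entry {
	struct demo_txn *txn;
};

enum {
	DEMO_TXN_IS_DONE = 1 << 0,
};

static bool
demo_txn_has_flag(const struct demo_txn *txn, unsigned flag)
{
	return (txn->flags & flag) != 0;
}

/* Approach of the patch: no local, dereference inside assert(). */
static int
demo_wait_complete(struct demo_limbo_entry *entry)
{
	assert(demo_txn_has_flag(entry->txn, DEMO_TXN_IS_DONE));
	return 0;
}

/* Alternative: keep the local and mark it as used explicitly. */
static int
demo_wait_complete_alt(struct demo_limbo_entry *entry)
{
	struct demo_txn *txn = entry->txn;
	(void)txn;
	assert(demo_txn_has_flag(txn, DEMO_TXN_IS_DONE));
	return 0;
}
```

Both variants behave the same in a debug build. The first one avoids an extra statement, while the second keeps the assertions shorter when the variable is referenced many times.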


* Re: [Tarantool-patches] [PATCH 4/5] qsync: implement direct write of confirm/rollback into a journal
  2020-07-22 15:33 ` [Tarantool-patches] [PATCH 4/5] qsync: implement direct write of confirm/rollback into a journal Cyrill Gorcunov
@ 2020-07-22 15:41   ` Cyrill Gorcunov
  0 siblings, 0 replies; 8+ messages in thread
From: Cyrill Gorcunov @ 2020-07-22 15:41 UTC (permalink / raw)
  To: tml; +Cc: Vladislav Shpilevoy

On Wed, Jul 22, 2020 at 06:33:58PM +0300, Cyrill Gorcunov wrote:
> When we need to write a CONFIRM or ROLLBACK message (which is just
> a binary record in msgpack format) into a journal, we use txn code
> to allocate a new transaction, encode the message there, and make it
> walk the long txn path before it hits the journal. This not only
> wastes resources but is also somewhat strange from an architectural
> point of view.
> 
> Instead, let's encode the record on the stack and write it
> directly to the journal.
> 
> Closes #5129
> 
> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

It might be hard to review the diff, so the final result is
https://github.com/tarantool/tarantool/blob/gorcunov/gh-5129-journal/src/box/txn_limbo.c#L264
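
For readers following the diff, the stack layout used by txn_limbo_write() (entry header, then a one-slot row pointer array, then the row itself) can be sketched in isolation. The names below are hypothetical and the structures are heavily simplified; the buffer alignment is left to the caller, which is an assumption of this sketch and not something taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_row {
	uint32_t type;
};

/* Hypothetical, simplified stand-in for struct journal_entry. */
struct demo_entry {
	int n_rows;
	/* Row pointers follow the header (flexible array member). */
	struct demo_row *rows[];
};

/* Header + one row pointer + the row stored right after it. */
enum {
	DEMO_BUF_SIZE = sizeof(struct demo_entry) +
			sizeof(struct demo_row *) +
			sizeof(struct demo_row),
};

/*
 * Lay out a single-row entry in caller-provided storage.
 * The buffer must hold at least DEMO_BUF_SIZE bytes and be
 * suitably aligned for struct demo_entry.
 */
static struct demo_entry *
demo_entry_on_buffer(void *buf, uint32_t type)
{
	struct demo_entry *entry = buf;
	/* The row lives just past the single slot of the array. */
	struct demo_row *row = (void *)&entry->rows[1];
	entry->rows[0] = row;
	entry->n_rows = 1;
	row->type = type;
	return entry;
}
```

A caller would declare something like `_Alignas(max_align_t) char buf[DEMO_BUF_SIZE];` on its stack and pass it in. Everything the entry points to then has the lifetime of the calling frame, so the write has to complete before the function returns.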


* Re: [Tarantool-patches] [PATCH 0/5] qsync: write confirm/rollback without txn engine
  2020-07-22 15:33 [Tarantool-patches] [PATCH 0/5] qsync: write confirm/rollback without txn engine Cyrill Gorcunov
                   ` (4 preceding siblings ...)
  2020-07-22 15:33 ` [Tarantool-patches] [PATCH 5/5] qsync: fix release build Cyrill Gorcunov
@ 2020-07-23 11:41 ` Cyrill Gorcunov
  5 siblings, 0 replies; 8+ messages in thread
From: Cyrill Gorcunov @ 2020-07-23 11:41 UTC (permalink / raw)
  To: tml; +Cc: Vladislav Shpilevoy

On Wed, Jul 22, 2020 at 06:33:54PM +0300, Cyrill Gorcunov wrote:
> Vlad, please take a look once time permits. Note that the series is on
> top of your series "[PATCH 0/2] Make txn_commit() simpler".
> 
> I think in our WAL engine we should go further and eliminate completion
> calls for synchronous writes completely (internally WAL could set up
> completion to fiber_wakeup and reuse the async engine), but from the API
> side it would never call a completion associated with transactions. But
> such a redesign is definitely for another series.
> 
> Anyway, in this series we make a first attempt at not using the txn engine
> when we simply have to write a confirm/rollback record into a journal.

Drop this series, please. v2 will be published.

