[Tarantool-patches] [PATCH v2 6/7] qsync: implement direct write of CONFIRM/ROLLBACK into a journal

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Fri Jul 24 01:10:58 MSK 2020


Thanks for the patch!

See 5 comments below.

On 23.07.2020 14:29, Cyrill Gorcunov wrote:
> When we need to write CONFIRM or ROLLBACK message (which is just
> a binary record in msgpack format) into a journal we use txn code
> to allocate a new transaction, encode there a message and pass it
> to walk the long txn path before it hit the journal. This is not
> only resource wasting but also somehow strange from arhitectural

1. arhitectural -> architectural.

> point of view.
> 
> Instead lets encode a record on the stack and write it
> directly to the journal.
> 
> Closes #5129
> 
> Signed-off-by: Cyrill Gorcunov <gorcunov at gmail.com>
> ---
>  src/box/txn_limbo.c | 92 +++++++++++++++++++++++++--------------------
>  1 file changed, 51 insertions(+), 41 deletions(-)
> 
> diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
> index a74bfe244..3eca244dc 100644
> --- a/src/box/txn_limbo.c
> +++ b/src/box/txn_limbo.c
> @@ -32,6 +32,9 @@
>  #include "txn_limbo.h"
>  #include "replication.h"
>  
> +#include "iproto_constants.h"
> +#include "journal.h"
> +
>  struct txn_limbo txn_limbo;
>  
>  static inline void
> @@ -237,62 +240,69 @@ txn_limbo_wait_complete(struct txn_limbo *limbo, struct txn_limbo_entry *entry)
>  	return 0;
>  }
>  
> +static void
> +txn_limbo_write_cb(struct journal_entry *entry)
> +{
> +	/*
> +	 * It is safe to call fiber_wakeup on
> +	 * active fiber but we need to call wakeup
> +	 * in case if we're sitting in WAL thread.

2. Actually it is not safe to call wakeup on the active fiber. It
leads to something being broken in the scheduler. This is why we
have so many checks like

	if (txn->fiber != fiber()) fiber_wakeup(txn->fiber);

But yeah, in this case this is fine - the fiber is sleeping for
sure.

> +	 */
> +	assert(entry->complete_data != NULL);
> +	fiber_wakeup(entry->complete_data);
> +}
> +
> +/**
> + * Write CONFIRM or ROLLBACK message to a journal directly
> + * without involving transaction engine because using txn
> + * engine is far from being cheap while we only need to
> + * write a small message.
> + */
>  static int
> -txn_limbo_write_confirm_rollback(struct txn_limbo *limbo, int64_t lsn,
> -				 bool is_confirm)
> +txn_limbo_write(uint32_t replica_id, int64_t lsn, int type)
>  {
> +	assert(replica_id != REPLICA_ID_NIL);
> +	assert(type == IPROTO_CONFIRM || type == IPROTO_ROLLBACK);
>  	assert(lsn > 0);
>  
> -	struct xrow_header row;
> -	struct request request = {
> -		.header = &row,
> -	};
> +	char buf[sizeof(struct journal_entry) +
> +		 sizeof(struct xrow_header *) +
> +		 sizeof(struct xrow_header)];

3. The last addition is an overkill a bit. Why can't
you declare {struct xrow_header row;} on stack separately? I
understand

	sizeof(struct journal_entry) + sizeof(struct xrow_header *)

- they should be one object. But the third addition seems not
necessarily to be done in the same buffer variable.


4. I think all these changes are going to interfere with
https://github.com/tarantool/tarantool/issues/5151.

What are you going to do to request_synchro_body to implement 5151? Will they be
two separate structs? Or will the struct synchro_request contain the body? - this
would lead to data duplication though, because the point of synchro_request is to
provide the synchro fields not encoded nor bswaped.

Would be good to think through how will these changes live with 5151.

> -	struct txn *txn = txn_begin();
> -	if (txn == NULL)
> -		return -1;
> +	struct journal_entry *entry = (void *)buf;
> +	struct xrow_header *row = (void *)&entry->rows[1];
> +	entry->rows[0] = row;
>  
> -	int res = 0;
> -	if (is_confirm) {
> -		res = xrow_encode_confirm(&row, &txn->region,
> -					  limbo->instance_id, lsn);
> -	} else {
> -		/*
> -		 * This LSN is the first to be rolled back, so
> -		 * the last "safe" lsn is lsn - 1.
> -		 */
> -		res = xrow_encode_rollback(&row, &txn->region,
> -					   limbo->instance_id, lsn);
> +	struct request_synchro_body body;

5. It looks like you could make things simpler, if you would create a
struct containing the journal_entry, body, *xrow_header[1], and
xrow_header. Would simplify things IMO. You wouldn't need the stack
manipulations like you did above. 

> +	xrow_encode_confirm_rollback(row, &body, replica_id,
> +				     lsn, type);
> +
> +	journal_entry_create(entry, 1, xrow_approx_len(row),
> +			     txn_limbo_write_cb, fiber());
> +
> +	if (journal_write(entry) != 0) {
> +		diag_set(ClientError, ER_WAL_IO);
> +		diag_log();
> +		return -1;
>  	}
> -	if (res == -1)
> -		goto rollback;
> -	/*
> -	 * This is not really a transaction. It just uses txn API
> -	 * to put the data into WAL. And obviously it should not
> -	 * go to the limbo and block on the very same sync
> -	 * transaction which it tries to confirm now.
> -	 */
> -	txn_set_flag(txn, TXN_FORCE_ASYNC);
>  
> -	if (txn_begin_stmt(txn, NULL) != 0)
> -		goto rollback;
> -	if (txn_commit_stmt(txn, &request) != 0)
> -		goto rollback;
> +	if (entry->res < 0) {
> +		diag_set(ClientError, ER_WAL_IO);
> +		diag_log();
> +		return -1;
> +	}
>  
> -	return txn_commit(txn);
> -rollback:
> -	txn_rollback(txn);
> -	return -1;
> +	return 0;
>  }


More information about the Tarantool-patches mailing list