[Tarantool-patches] [PATCH v3 10/13] tmx: introduce conflict tracker

Thu Jul 16 03:16:02 MSK 2020

Thanks for the patch!

See 11 comments below.

On 15.07.2020 15:55, Aleksandr Lyapunov wrote:
> There are situations when we have to track that if some TX is
> committed then some others must be aborted due to conflict.
> The common case is that one r/w TX have read some value while the
> second is about to overwrite the value; is the second is committed,

1. 'is the second' -> 'if the second'.

> the first must be aborted.
> Thus we have to store many-to-many TX relations between breaker
> TX and victim TX.
> The patch implements that.
> 
> Part of #4897
> ---
>  src/box/txn.c | 156 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  src/box/txn.h |  32 ++++++++++++
>  2 files changed, 187 insertions(+), 1 deletion(-)
> 
> diff --git a/src/box/txn.c b/src/box/txn.c
> index 7a15986..ba81b58 100644
> --- a/src/box/txn.c
> +++ b/src/box/txn.c
> @@ -42,6 +42,12 @@ struct tx_manager
>  {
>  	/** Last prepare-sequence-number that was assigned to prepared TX. */
>  	int64_t last_psn;
> +	/**
> +	 * List of all transactions that are in a read view.
> +	 * New transactions are added to the tail of this list,
> +	 * so the list is ordered by rv_psn.
> +	 */
> +	struct rlist read_view_txs;

2. How is it going to be used? I see you create this list, but it
is not used for anything meaningful (in the branch version on this
commit). You add something, remove, but never search in there.

The same for rv_psn. It is assigned, but never used (on top of
this commit).

>  };
>  
>  /** That's a definition, see declaration for description. */
> @@ -50,6 +56,21 @@ bool tx_manager_use_mvcc_engine = false;
>  /** The one and only instance of tx_manager. */
>  static struct tx_manager txm;
>  
> +/**
> + * Record that links two transactions, breaker and victim.
> + * See txm_cause_conflict for details.
> + */
> +struct tx_conflict_tracker {
> +	/** TX that aborts victim on commit. */
> +	struct txn *breaker;
> +	/** TX that aborts will be aborted on breaker's commit. */

3. 'aborts' seems like copy-paste artifact. I guess you meant:

    TX that will be aborted on breaker's commit

> +	struct txn *victim;
> +	/** Link in breaker->conflict_list. */
> +	struct rlist in_conflict_list;
> +	/** Link in victim->conflicted_by_list. */
> +	struct rlist in_conflicted_by_list;
> +};
> +
>  double too_long_threshold;
>  
>  /* Txn cache. */
> @@ -647,6 +699,32 @@ txn_journal_entry_new(struct txn *txn)
>  	return req;
>  }
>  
> +/**
> + * Handle conflict when @breaker transaction is prepared.
> + * The conflict is happened if @victim have read something that @breaker
> + * overwrites.
> + * If @victim is read-only or haven't made any changes, it should be send

4. send -> sent.

> + * to read view, in which is will not see @breaker.
> + * Otherwise @vistim must be marked as conflicted.

5. vistim -> victim.

Also please take a look at https://github.com/tarantool/tarantool/wiki/Code-review-procedure
for correct parameters referencing in the comments. Grep by '@a'.
Apply the fixes to this place and all the others in all the
commits.

> + */
> +static void
> +txn_handle_conflict(struct txn *breaker, struct txn *victim)
> +{
> +	assert(breaker->psn != 0);
> +	if (!victim->status != TXN_INPROGRESS) {

6. I assume the first '!' is redundant. And is even a bug.
(You wrote '!victim->status' instead of 'victim->status').

> +		/* Was conflicted by somebody else. */
> +		return;
> +	}
> +	if (stailq_empty(&victim->stmts)) {

7. Unfortunately, the statement list emptiness is not the
only sign of the transaction being read-only. The list can
be not empty, and still the transaction can be read-only.
For example, it had some statements, but rolled them back.
Or its statements were turned into NOP in a before_replace
trigger.

> +		/* Send to read view. */
> +		victim->rv_psn = breaker->psn;
> +		rlist_add_tail(&txm.read_view_txs, &victim->in_read_view_txs);
> +	} else {
> +		/* Mark as conflicted. */
> +		victim->status = TXN_CONFLICTED;

8. What if tx is conflicted, but the next thing it does
is rollback of all its statements, and then calls
box.commit()? In that case the transaction is empty, and
technically nothing should prevent it from committing. It
won't change anything already.

More general question - what if transaction is conflicted,
but rolled back the conflicted statements? And then tried
to commit the rest.

> +	}
> +}
> +
>  /*
>   * Prepare a transaction using engines.
>   */
> @@ -670,6 +748,17 @@ txn_prepare(struct txn *txn)
>  		diag_set(ClientError, ER_FOREIGN_KEY_CONSTRAINT);
>  		return -1;
>  	}
> +
> +	/*
> +	 * Somebody else has written some value that we have read.
> +	 * The transaction is not possible.
> +	 */
> +	if (txn->status == TXN_CONFLICTED || txn->status == TXN_IN_READ_VIEW) {

9. What if it is a read-only transaction? It shouldn't fail
on box.commit() I think. Because it didn't do anything. But
from this code it seems it will fail, because of TXN_IN_READ_VIEW.

Hard to tell, because this patch does not assign TXN_IN_READ_VIEW
anywhere.

> +		diag_set(ClientError, ER_TRANSACTION_CONFLICT);
> +		return -1;
> +	}
> +	assert(txn->status == TXN_INPROGRESS);
> +
>  	/*
>  	 * Perform transaction conflict resolution. Engine == NULL when
>  	 * we have a bunch of IPROTO_NOP statements.
> @@ -1184,10 +1293,55 @@ txn_on_yield(struct trigger *trigger, void *event)
>  void
>  tx_manager_init()
>  {
> -	(void)txm;
> +	rlist_create(&txm.read_view_txs);
>  }
>  
>  void
>  tx_manager_free()
>  {
>  }
> +
> +int
> +txm_cause_conflict(struct txn *breaker, struct txn *victim)
> +{
> +	struct tx_conflict_tracker *tracker = NULL;
> +	struct rlist *r1 = breaker->conflict_list.next;
> +	struct rlist *r2 = victim->conflicted_by_list.next;
> +	while (r1 != &breaker->conflict_list &&
> +	       r2 != &victim->conflicted_by_list) {
> +		tracker = rlist_entry(r1, struct tx_conflict_tracker,
> +				      in_conflict_list);
> +		assert(tracker->breaker == breaker);
> +		if (tracker->victim == victim)
> +			break;
> +		tracker = rlist_entry(r2, struct tx_conflict_tracker,
> +				      in_conflicted_by_list);
> +		assert(tracker->victim == victim);
> +		if (tracker->breaker == breaker)
> +			break;
> +		tracker = NULL;
> +		r1 = r1->next;
> +		r2 = r2->next;
> +	}
> +	if (tracker != NULL) {
> +		/* Move to the beginning of a list
> +		 * for a case of subsequent lookups */

10. Lets use the correct code style and put /* and */
on separate lines. Also put a dot in the end of the
sentence.

> +		rlist_del(&tracker->in_conflict_list);
> +		rlist_del(&tracker->in_conflicted_by_list);
> +	} else {
> +		size_t size;
> +		tracker = region_alloc_object(&victim->region,
> +					      struct tx_conflict_tracker,
> +					      &size);
> +		if (tracker == NULL) {
> +			diag_set(OutOfMemory, size, "tx region",
> +				 "conflict_tracker");
> +			return -1;
> +		}
> +		tracker->breaker = breaker;
> +		tracker->victim = victim;
> +	}
> +	rlist_add(&breaker->conflict_list, &tracker->in_conflict_list);
> +	rlist_add(&victim->conflicted_by_list, &tracker->in_conflicted_by_list);
> +	return 0;
> +}
> diff --git a/src/box/txn.h b/src/box/txn.h
> index b5b94fd..03ccb76 100644
> --- a/src/box/txn.h
> +++ b/src/box/txn.h
> @@ -717,6 +738,17 @@ tx_manager_init();
>  void
>  tx_manager_free();
>  
> +/**
> + * Notify TX manager that if transaction @breaker is committed then the
> + * transactions @victim must be aborted due to conflict.

11. transactions -> transaction.

> + * For example: there's two rw transaction in progress, one have read
> + * some value while the second is about to overwrite it. If the second
> + * is committed first, the first must be aborted.
> + * @return 0 on success, -1 on memory error.
> + */
> +int
> +txm_cause_conflict(struct txn *breaker, struct txn *victim);
> +
>  #if defined(__cplusplus)
>  } /* extern "C" */
>  #endif /* defined(__cplusplus) */
>