[Tarantool-patches] [PATCH] applier: follow vclock to the last tx row

Serge Petrenko sergepetrenko at tarantool.org
Thu Apr 23 12:53:30 MSK 2020


> On 23 Apr 2020, at 12:41, Cyrill Gorcunov <gorcunov at gmail.com> wrote:
> 
> On Wed, Apr 22, 2020 at 09:28:10PM +0300, Serge Petrenko wrote:
>> Since the introduction of transaction boundaries in the replication
>> protocol, appliers follow replicaset.applier.vclock to the lsn of the
>> first row in an arrived batch. This is enough and doesn't lead to errors
>> when replicating from other instances that respect transaction
>> boundaries (instances with version 2.1.2 and up). However, if there's a
>> 1.10 instance in a 2.1.2+ cluster, it sends every single tx row as a
>> separate transaction. This breaks the comparison with
>> replicaset.applier.vclock and makes the applier re-apply part of the
>> changes it has already applied when processing the full transaction
>> coming from another 2.x instance. Such behaviour leads to
>> ER_TUPLE_FOUND errors in the scenario described above.
>> In order to guard against such cases, follow replicaset.applier.vclock
>> to the lsn of the last row in the tx.
>> 
>> Closes #4924
> 
> Serge, can we please put this into the code comment itself? Say, like
> this (please check that I didn't miss something):
> ---
> diff --git a/src/box/applier.cc b/src/box/applier.cc
> index 68de3c08c..495bc7393 100644
> --- a/src/box/applier.cc
> +++ b/src/box/applier.cc
> @@ -736,6 +736,7 @@ applier_apply_tx(struct stailq *rows)
> {
>        struct xrow_header *first_row = &stailq_first_entry(rows,
>                                        struct applier_tx_row, next)->row;
> +       struct xrow_header *last_row;
>        struct replica *replica = replica_by_id(first_row->replica_id);
>        /*
>         * In a full mesh topology, the same set of changes
> @@ -826,9 +827,16 @@ applier_apply_tx(struct stailq *rows)
>        if (txn_commit_async(txn) < 0)
>                goto fail;
> 
> -       /* Transaction was sent to journal so promote vclock. */
> -       vclock_follow(&replicaset.applier.vclock,
> -                     first_row->replica_id, first_row->lsn);
> +       /*
> +        * The transaction was sent to the journal so promote vclock.
> +        *
> +        * Use the lsn of the last row here for backward compatibility
> +        * with 1.10 series where we sent every single tx in a row as
> +        * a separate transaction.
> +        */
> +       last_row = &stailq_last_entry(rows, struct applier_tx_row, next)->row;
> +       vclock_follow(&replicaset.applier.vclock, last_row->replica_id,
> +                     last_row->lsn);
>        latch_unlock(latch);
>        return 0;
> rollback:

Hi! Thanks for the review! I’ve added a slightly different comment:

diff --git a/src/box/applier.cc b/src/box/applier.cc
index eb0297f73..42a154a33 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -827,7 +827,13 @@ applier_apply_tx(struct stailq *rows)
 	if (txn_commit_async(txn) < 0)
 		goto fail;
 
-	/* Transaction was sent to journal so promote vclock. */
+	/*
+	 * The transaction was sent to journal so promote vclock.
+	 *
+	 * Use the lsn of the last row to guard from 1.10
+	 * instances, which send every single tx row as a separate
+	 * transaction.
+	 */
 	last_row = &stailq_last_entry(rows, struct applier_tx_row, next)->row;
 	vclock_follow(&replicaset.applier.vclock, last_row->replica_id,
 		      last_row->lsn);
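
For illustration, here is a toy model of the scenario from the commit
message. It is only a sketch under simplifying assumptions: toy_vclock,
toy_row, toy_apply_tx and toy_count_duplicates are hypothetical names, not
the real applier code, and the skip check is reduced to comparing the lsn of
the first row against the vclock. A 3-row transaction arrives whole from a
2.x peer and then row by row from a 1.10 peer.

```c
#include <assert.h>
#include <stdint.h>

enum { VCLOCK_MAX = 32 };

/* Hypothetical stand-in for struct vclock: one lsn per replica id. */
struct toy_vclock {
	int64_t lsn[VCLOCK_MAX];
};

/* Hypothetical stand-in for struct xrow_header. */
struct toy_row {
	uint32_t replica_id;
	int64_t lsn;
};

enum follow_mode { FOLLOW_FIRST_ROW, FOLLOW_LAST_ROW };

/*
 * Apply one batch of rows as a single transaction.
 * The batch is skipped when the lsn of its first row is already
 * covered by the vclock. Returns the number of rows applied.
 */
static int
toy_apply_tx(struct toy_vclock *vclock, const struct toy_row *rows,
	     int count, enum follow_mode mode)
{
	assert(count > 0 && rows[0].replica_id < VCLOCK_MAX);
	const struct toy_row *first = &rows[0];
	if (first->lsn <= vclock->lsn[first->replica_id])
		return 0;
	/* Promote the vclock to the first or to the last row of the tx. */
	const struct toy_row *promote =
		mode == FOLLOW_LAST_ROW ? &rows[count - 1] : first;
	vclock->lsn[promote->replica_id] = promote->lsn;
	return count;
}

/*
 * A tx with lsns 1..3 from replica 1 arrives whole (2.x peer) and
 * then one row at a time (1.10 peer). Returns how many rows were
 * applied a second time.
 */
static int
toy_count_duplicates(enum follow_mode mode)
{
	struct toy_vclock vclock = {{0}};
	const struct toy_row tx[] = {{1, 1}, {1, 2}, {1, 3}};
	int applied = toy_apply_tx(&vclock, tx, 3, mode);
	for (int i = 0; i < 3; i++)
		applied += toy_apply_tx(&vclock, &tx[i], 1, mode);
	return applied - 3;
}
```

In this model toy_count_duplicates(FOLLOW_FIRST_ROW) returns 2, because the
rows with lsn 2 and 3 pass the check again, while
toy_count_duplicates(FOLLOW_LAST_ROW) returns 0.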

--
Serge Petrenko
sergepetrenko at tarantool.org

