[Tarantool-patches] [PATCH] applier: follow vclock to the last tx row
Serge Petrenko
sergepetrenko at tarantool.org
Thu Apr 23 12:53:30 MSK 2020
> On 23 Apr 2020, at 12:41, Cyrill Gorcunov <gorcunov at gmail.com> wrote:
>
> On Wed, Apr 22, 2020 at 09:28:10PM +0300, Serge Petrenko wrote:
>> Since the introduction of transaction boundaries in the replication
>> protocol, appliers follow replicaset.applier.vclock to the lsn of the
>> first row in an arrived batch. This is enough and doesn't lead to errors
>> when replicating from other instances that respect transaction boundaries
>> (instances with version 2.1.2 and up). However, if there's a 1.10
>> instance in a 2.1.2+ cluster, it sends every single tx row as a separate
>> transaction. This breaks the comparison with replicaset.applier.vclock and
>> makes the applier re-apply part of the changes it has already applied
>> when processing a full transaction coming from another 2.x instance.
>> Such behaviour leads to ER_TUPLE_FOUND errors in the scenario described
>> above.
>> In order to guard against such cases, follow replicaset.applier.vclock to
>> the lsn of the last row in the tx.
>>
>> Closes #4924
>
> Serge, can we please put this into the code comment itself? Say, like this
> (please check that I didn't miss something)
> ---
> diff --git a/src/box/applier.cc b/src/box/applier.cc
> index 68de3c08c..495bc7393 100644
> --- a/src/box/applier.cc
> +++ b/src/box/applier.cc
> @@ -736,6 +736,7 @@ applier_apply_tx(struct stailq *rows)
> {
> struct xrow_header *first_row = &stailq_first_entry(rows,
> struct applier_tx_row, next)->row;
> + struct xrow_header *last_row;
> struct replica *replica = replica_by_id(first_row->replica_id);
> /*
> * In a full mesh topology, the same set of changes
> @@ -826,9 +827,16 @@ applier_apply_tx(struct stailq *rows)
> if (txn_commit_async(txn) < 0)
> goto fail;
>
> - /* Transaction was sent to journal so promote vclock. */
> - vclock_follow(&replicaset.applier.vclock,
> - first_row->replica_id, first_row->lsn);
> + /*
> + * The transaction was sent to the journal so promote vclock.
> + *
> + * Use the lsn of the last row here for backward compatibility
> + * with 1.10 series where we sent every single row of a tx as
> + * a separate transaction.
> + */
> + last_row = &stailq_last_entry(rows, struct applier_tx_row, next)->row;
> + vclock_follow(&replicaset.applier.vclock, last_row->replica_id,
> + last_row->lsn);
> latch_unlock(latch);
> return 0;
> rollback:
Hi! Thanks for the review! I’ve added a slightly different comment:
diff --git a/src/box/applier.cc b/src/box/applier.cc
index eb0297f73..42a154a33 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -827,7 +827,13 @@ applier_apply_tx(struct stailq *rows)
if (txn_commit_async(txn) < 0)
goto fail;
- /* Transaction was sent to journal so promote vclock. */
+ /*
+ * The transaction was sent to journal so promote vclock.
+ *
+ * Use the lsn of the last row to guard from 1.10
+ * instances, which send every single tx row as a separate
+ * transaction.
+ */
last_row = &stailq_last_entry(rows, struct applier_tx_row, next)->row;
vclock_follow(&replicaset.applier.vclock, last_row->replica_id,
last_row->lsn);
--
Serge Petrenko
sergepetrenko at tarantool.org