From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gorcunov@gmail.com>
Received: from mail-lf1-f68.google.com (mail-lf1-f68.google.com
 [209.85.167.68])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (No client certificate requested)
 by dev.tarantool.org (Postfix) with ESMTPS id CD8D74696C3
 for <tarantool-patches@dev.tarantool.org>;
 Thu, 23 Apr 2020 12:41:14 +0300 (MSK)
Received: by mail-lf1-f68.google.com with SMTP id t11so4216070lfe.4
 for <tarantool-patches@dev.tarantool.org>;
 Thu, 23 Apr 2020 02:41:14 -0700 (PDT)
Date: Thu, 23 Apr 2020 12:41:12 +0300
From: Cyrill Gorcunov <gorcunov@gmail.com>
Message-ID: <20200423094112.GD3072@uranus>
References: <20200422182810.79257-1-sergepetrenko@tarantool.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20200422182810.79257-1-sergepetrenko@tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH] applier: follow vclock to the last
	tx row
List-Id: Tarantool development patches <tarantool-patches.dev.tarantool.org>
List-Unsubscribe: <https://lists.tarantool.org/mailman/options/tarantool-patches>, 
 <mailto:tarantool-patches-request@dev.tarantool.org?subject=unsubscribe>
List-Archive: <https://lists.tarantool.org/pipermail/tarantool-patches/>
List-Post: <mailto:tarantool-patches@dev.tarantool.org>
List-Help: <mailto:tarantool-patches-request@dev.tarantool.org?subject=help>
List-Subscribe: <https://lists.tarantool.org/mailman/listinfo/tarantool-patches>, 
 <mailto:tarantool-patches-request@dev.tarantool.org?subject=subscribe>
To: Serge Petrenko <sergepetrenko@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org, v.shpilevoy@tarantool.org

On Wed, Apr 22, 2020 at 09:28:10PM +0300, Serge Petrenko wrote:
> Since the introduction of transaction boundaries in replication
> protocol, appliers follow replicaset.applier.vclock to the lsn of the
> first row in an arrived batch. This is enough and doesn't lead to errors
> when replicating from other instances, respecting transaction boundaries
> (instances with version 2.1.2 and up). However, if there's a 1.10
> instance in 2.1.2+ cluster, it sends every single tx row as a separate
> transaction, breaking the comparison with replicaset.applier.vclock and
> making the applier apply part of the changes, it has already applied
> when processing a full transaction coming from another 2.x instance.
> Such behaviour leads to ER_TUPLE_FOUND errors in the scenario described
> above.
> In order to guard from such cases, follow replicaset.applier.vclock to
> the lsn of the last row in tx.
> 
> Closes #4924

Serge, can we please put this into the code comment itself? Something like
this (please check that I didn't miss something)
---
diff --git a/src/box/applier.cc b/src/box/applier.cc
index 68de3c08c..495bc7393 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -736,6 +736,7 @@ applier_apply_tx(struct stailq *rows)
 {
        struct xrow_header *first_row = &stailq_first_entry(rows,
                                        struct applier_tx_row, next)->row;
+       struct xrow_header *last_row;
        struct replica *replica = replica_by_id(first_row->replica_id);
        /*
         * In a full mesh topology, the same set of changes
@@ -826,9 +827,16 @@ applier_apply_tx(struct stailq *rows)
        if (txn_commit_async(txn) < 0)
                goto fail;
 
-       /* Transaction was sent to journal so promote vclock. */
-       vclock_follow(&replicaset.applier.vclock,
-                     first_row->replica_id, first_row->lsn);
+       /*
+        * The transaction was sent to the journal so promote vclock.
+        *
+        * Use the lsn of the last row here for backward compatibility
+        * with the 1.10 series, which sends every single row of a tx as
+        * a separate transaction.
+        */
+       last_row = &stailq_last_entry(rows, struct applier_tx_row, next)->row;
+       vclock_follow(&replicaset.applier.vclock, last_row->replica_id,
+                     last_row->lsn);
        latch_unlock(latch);
        return 0;
 rollback:
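
To illustrate the scenario from the commit message, here is a toy model
(a sketch only, not the real applier code). The names toy_vclock, toy_row,
toy_apply_tx and toy_count_duplicates are made up for this example, and
the "already applied" check is assumed to compare the lsn of the first row
against the vclock component of its replica id.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a vclock: one lsn per replica id. */
enum { TOY_VCLOCK_MAX = 32 };

struct toy_vclock {
	int64_t lsn[TOY_VCLOCK_MAX];
};

/* Hypothetical stand-in for a row header. */
struct toy_row {
	uint32_t replica_id;
	int64_t lsn;
};

/*
 * Apply a batch of rows as one transaction.
 * Returns the number of rows applied, or 0 if the batch is skipped
 * because its first row is already covered by the vclock.
 */
static size_t
toy_apply_tx(struct toy_vclock *vclock, const struct toy_row *rows,
	     size_t count, bool follow_last)
{
	const struct toy_row *first = &rows[0];
	const struct toy_row *last = &rows[count - 1];

	/* Assumed skip check: the first row decides for the whole batch. */
	if (first->lsn <= vclock->lsn[first->replica_id])
		return 0;

	/* Promote the vclock to either the first or the last row. */
	const struct toy_row *follow = follow_last ? last : first;
	vclock->lsn[follow->replica_id] = follow->lsn;
	return count;
}

/*
 * A 2.x peer sends rows 1..3 as one transaction, then a 1.10 peer
 * sends the same rows one by one. Returns how many rows get applied
 * a second time.
 */
static size_t
toy_count_duplicates(bool follow_last)
{
	struct toy_vclock vclock = {{0}};
	const struct toy_row tx[] = {{1, 1}, {1, 2}, {1, 3}};
	size_t duplicates = 0;

	toy_apply_tx(&vclock, tx, 3, follow_last);
	for (size_t i = 0; i < 3; i++)
		duplicates += toy_apply_tx(&vclock, &tx[i], 1, follow_last);
	return duplicates;
}
```

In this model, following the first row leaves rows 2 and 3 uncovered, so
toy_count_duplicates(false) reports two re-applied rows, while following
the last row makes toy_count_duplicates(true) report none.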