From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lj1-f176.google.com (mail-lj1-f176.google.com [209.85.208.176]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 8A713469719 for ; Fri, 14 Feb 2020 10:19:54 +0300 (MSK) Received: by mail-lj1-f176.google.com with SMTP id x7so9557891ljc.1 for ; Thu, 13 Feb 2020 23:19:54 -0800 (PST) Date: Fri, 14 Feb 2020 10:19:52 +0300 From: Konstantin Osipov Message-ID: <20200214071952.GA15237@atlas> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Subject: Re: [Tarantool-patches] [PATCH v2 2/4] replication: check for rows to skip in applier correctly List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: sergepetrenko Cc: tarantool-patches@dev.tarantool.org, v.shpilevoy@tarantool.org * sergepetrenko [20/02/14 09:46]: > From: Serge Petrenko > > Remove applier vclock initialization from replication_init(), where it > is zeroed-out, and place it in the end of box_cfg_xc(), where replicaset > vclock already has a meaningful value. > Do not apply rows originating form the current instance if replication > sync has ended. > > Closes #4739 > --- > src/box/applier.cc | 17 +++++++++++++++-- > src/box/box.cc | 6 ++++++ > src/box/replication.cc | 1 - > 3 files changed, 21 insertions(+), 3 deletions(-) > > diff --git a/src/box/applier.cc b/src/box/applier.cc > index ae3d281a5..e931e1595 100644 > --- a/src/box/applier.cc > +++ b/src/box/applier.cc > @@ -731,8 +731,21 @@ applier_apply_tx(struct stailq *rows) > struct latch *latch = (replica ? &replica->order_latch : > &replicaset.applier.order_latch); > latch_lock(latch); > - if (vclock_get(&replicaset.applier.vclock, > - first_row->replica_id) >= first_row->lsn) { > + /* > + * Skip remote rows either if one of the appliers has > + * sent them to write or if the rows originate from the > + * local instance and we've already synced with the > + * replica. The latter is important because relay gets > + * notified about WAL write before tx does, so it is > + * possible that a remote instance receives our rows > + * via replication before we update replicaset vclock and > + * even sends these rows back to us. An attemt to apply > + * such rows will lead to having entries with duplicate > + * LSNs in WAL. > + */ > + if (vclock_get(&replicaset.applier.vclock, first_row->replica_id) >= > + first_row->lsn || (first_row->replica_id == instance_id && > + !box_is_orphan())) { > latch_unlock(latch); > return 0; I think you should patch SUBSCRIBE iproto command, not the filter itself. Basically, if it's *re*configuraiton, not first replication configuration, SUBSCRIBE should set local VCLOCK component to infinity (check out variable local_vclock_at_subscribe, how it is assigned and how it used by the relay). In other words, I think the filter should be on the relay side. -- Konstantin Osipov, Moscow, Russia https://scylladb.com