[Tarantool-patches] [PATCH v2 2/4] replication: check for rows to skip in applier correctly

Konstantin Osipov kostja.osipov at gmail.com
Fri Feb 14 10:19:52 MSK 2020


* sergepetrenko <sergepetrenko at tarantool.org> [20/02/14 09:46]:
> From: Serge Petrenko <sergepetrenko at tarantool.org>
> 
> Remove applier vclock initialization from replication_init(), where it
> is zeroed out, and place it at the end of box_cfg_xc(), where replicaset
> vclock already has a meaningful value.
> Do not apply rows originating from the current instance if replication
> sync has ended.


> 
> Closes #4739
> ---
>  src/box/applier.cc     | 17 +++++++++++++++--
>  src/box/box.cc         |  6 ++++++
>  src/box/replication.cc |  1 -
>  3 files changed, 21 insertions(+), 3 deletions(-)
> 
> diff --git a/src/box/applier.cc b/src/box/applier.cc
> index ae3d281a5..e931e1595 100644
> --- a/src/box/applier.cc
> +++ b/src/box/applier.cc
> @@ -731,8 +731,21 @@ applier_apply_tx(struct stailq *rows)
>  	struct latch *latch = (replica ? &replica->order_latch :
>  			       &replicaset.applier.order_latch);
>  	latch_lock(latch);
> -	if (vclock_get(&replicaset.applier.vclock,
> -		       first_row->replica_id) >= first_row->lsn) {
> +	/*
> +	 * Skip remote rows either if one of the appliers has
> +	 * sent them to write or if the rows originate from the
> +	 * local instance and we've already synced with the
> +	 * replica. The latter is important because relay gets
> +	 * notified about WAL write before tx does, so it is
> +	 * possible that a remote instance receives our rows
> +	 * via replication before we update replicaset vclock and
> +	 * even sends these rows back to us. An attempt to apply
> +	 * such rows will lead to having entries with duplicate
> +	 * LSNs in WAL.
> +	 */
> +	if (vclock_get(&replicaset.applier.vclock, first_row->replica_id) >=
> +	    first_row->lsn || (first_row->replica_id == instance_id &&
> +	    !box_is_orphan())) {
>  		latch_unlock(latch);
>  		return 0;

I think you should patch the SUBSCRIBE iproto command, not the filter
itself.

Basically, if it's *re*configuration, not the first replication
configuration, SUBSCRIBE should set the local VCLOCK component to
infinity (check out the variable local_vclock_at_subscribe, how it is
assigned and how it is used by the relay).

In other words, I think the filter should be on the relay side.
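
To make the idea concrete, here is a toy model in C of what filtering on
the relay side could look like. It is only a sketch, not the actual relay
code: every toy_* name is made up for illustration, the vclock is reduced
to a plain array, and "infinity" is assumed to be INT64_MAX.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { TOY_VCLOCK_MAX = 32 };

/* Simplified stand-in for a vclock: one LSN per replica id. */
struct toy_vclock {
	int64_t lsn[TOY_VCLOCK_MAX];
};

/* Simplified stand-in for a replicated row header. */
struct toy_row {
	uint32_t replica_id;
	int64_t lsn;
};

/*
 * Hypothetical SUBSCRIBE handling: remember the vclock the
 * subscriber sent. On reconfiguration (assumption: the caller
 * knows this is not the first subscribe), raise the subscriber's
 * own component to "infinity" so that none of its own rows can
 * ever pass the relay filter below.
 */
static void
toy_subscribe(struct toy_vclock *at_subscribe,
	      const struct toy_vclock *sent,
	      uint32_t subscriber_id, bool is_reconfiguration)
{
	assert(subscriber_id < TOY_VCLOCK_MAX);
	*at_subscribe = *sent;
	if (is_reconfiguration)
		at_subscribe->lsn[subscriber_id] = INT64_MAX;
}

/*
 * Relay-side filter: send a row only if it is newer than what
 * the subscriber reported at subscribe time for that origin.
 */
static bool
toy_relay_should_send(const struct toy_vclock *at_subscribe,
		      const struct toy_row *row)
{
	assert(row->replica_id < TOY_VCLOCK_MAX);
	return row->lsn > at_subscribe->lsn[row->replica_id];
}
```

With the subscriber's own component at infinity, its rows are never sent
back to it, so the applier needs no extra condition. On a first subscribe
the component is left as sent, so the instance can still recover its own
rows from the remote peer.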


-- 
Konstantin Osipov, Moscow, Russia
https://scylladb.com

