Tarantool development patches archive
* [Tarantool-patches] [PATCH] applier: fix tx boundary check for half-applied txns
@ 2020-07-05 11:57 Serge Petrenko
  2020-07-05 16:11 ` Vladislav Shpilevoy
  2020-07-06  6:33 ` Kirill Yukhin
  0 siblings, 2 replies; 3+ messages in thread
From: Serge Petrenko @ 2020-07-05 11:57 UTC
  To: v.shpilevoy, gorcunov; +Cc: tarantool-patches

Consider a full-mesh cluster with 2 "new" instances running
tarantool 2.2+, a master and a replica, and one "old" instance running
an earlier tarantool version. It may happen that the "new" replica
receives part of a tx from the "old" instance, and the remaining part
from a "new" instance.

Since "new" instances preserve tx boundaries, the "new" replica skips
the rest of the tx: once it has applied the first tx row, it assumes
it has already applied the full tx. This leads to gaps in the "new"
replica's WAL and to skipping the remaining part of the tx forever.

Fix this behaviour: in mixed clusters, apply the remaining part of the
tx even if its beginning is already applied.

Closes #5125
---

https://github.com/tarantool/tarantool/issues/5125
https://github.com/tarantool/tarantool/tree/sp/gh-5125-applier-tx-boundaries

 src/box/applier.cc | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/src/box/applier.cc b/src/box/applier.cc
index df48b4796..6ca4cca94 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -737,6 +737,7 @@ applier_apply_tx(struct stailq *rows)
 	struct xrow_header *first_row = &stailq_first_entry(rows,
 					struct applier_tx_row, next)->row;
 	struct xrow_header *last_row;
+	last_row = &stailq_last_entry(rows, struct applier_tx_row, next)->row;
 	struct replica *replica = replica_by_id(first_row->replica_id);
 	/*
 	 * In a full mesh topology, the same set of changes
@@ -748,9 +749,28 @@ applier_apply_tx(struct stailq *rows)
 			       &replicaset.applier.order_latch);
 	latch_lock(latch);
 	if (vclock_get(&replicaset.applier.vclock,
-		       first_row->replica_id) >= first_row->lsn) {
+		       last_row->replica_id) >= last_row->lsn) {
 		latch_unlock(latch);
 		return 0;
+	} else if (vclock_get(&replicaset.applier.vclock,
+			      first_row->replica_id) >= first_row->lsn) {
+		/*
+		 * We've received part of the tx from an old
+		 * instance not knowing of tx boundaries.
+		 * Skip the already applied part.
+		 */
+		struct xrow_header *tmp;
+		while (true) {
+			tmp = &stailq_first_entry(rows,
+						  struct applier_tx_row,
+						  next)->row;
+			if (tmp->lsn <= vclock_get(&replicaset.applier.vclock,
+						   tmp->replica_id)) {
+				stailq_shift(rows);
+			} else {
+				break;
+			}
+		}
 	}
 
 	/**
@@ -835,7 +855,6 @@ applier_apply_tx(struct stailq *rows)
 	 * instances, which send every single tx row as a separate
 	 * transaction.
 	 */
-	last_row = &stailq_last_entry(rows, struct applier_tx_row, next)->row;
 	vclock_follow(&replicaset.applier.vclock, last_row->replica_id,
 		      last_row->lsn);
 	latch_unlock(latch);
-- 
2.24.3 (Apple Git-128)
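
The three cases the patched check distinguishes (whole tx already
applied, only a prefix applied, nothing applied) can be modeled outside
the applier. The sketch below is an illustration only, not the code
from applier.cc: `Row`, `Vclock` and `trim_applied_rows` are
hypothetical stand-ins for `xrow_header`, the applier vclock and the
new branch, and `vclock_get` here is a local helper with a simplified
signature. It assumes all rows of a tx come from one replica with
ascending LSNs.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <map>

// Hypothetical stand-in for struct xrow_header: only the fields
// the boundary check looks at.
struct Row {
	uint32_t replica_id;
	int64_t lsn;
};

// Hypothetical stand-in for the applier vclock: replica_id -> LSN.
using Vclock = std::map<uint32_t, int64_t>;

// Return the LSN recorded for replica_id, or 0 if none is known.
int64_t
vclock_get(const Vclock &vclock, uint32_t replica_id)
{
	auto it = vclock.find(replica_id);
	return it == vclock.end() ? 0 : it->second;
}

// Drop the rows of a tx that the vclock already covers.
// Returns the number of rows still to be applied.
size_t
trim_applied_rows(std::deque<Row> &rows, const Vclock &vclock)
{
	if (rows.empty())
		return 0;
	const Row &last = rows.back();
	if (vclock_get(vclock, last.replica_id) >= last.lsn) {
		// The last row is applied, so the whole tx is: skip it.
		rows.clear();
		return 0;
	}
	// The last row is not applied, so under the ascending-LSN
	// assumption at least one row survives this loop. The
	// emptiness check is only a safety net.
	while (!rows.empty() &&
	       rows.front().lsn <= vclock_get(vclock,
					      rows.front().replica_id))
		rows.pop_front();
	return rows.size();
}
```

For example, with a vclock of {1: 11} and a tx whose rows carry LSNs
10, 11 and 12 from replica 1, the first two rows are dropped and only
the row with LSN 12 is left to apply. A check on the first row alone
would have skipped all three.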
