Tarantool development patches archive
* [Tarantool-patches] [PATCH v10 0/2] relay: provide downstream lag information
@ 2021-06-22 15:22 Cyrill Gorcunov via Tarantool-patches
  2021-06-22 15:22 ` [Tarantool-patches] [PATCH v10 1/2] applier: send transaction's first row WAL time in the applier_writer_f Cyrill Gorcunov via Tarantool-patches
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-06-22 15:22 UTC (permalink / raw)
  To: tml; +Cc: Vladislav Shpilevoy

Guys, here is the latest version of the "downstream lag" series; hopefully
everything is fine now. I list only the changes against v9, which can be
found here

https://lists.tarantool.org/tarantool-patches/20210617154835.315576-1-gorcunov@gmail.com/

v10:
 - since we're using error injection, exclude the test from the release
   build (by Vlad)
 - provide an @applier argument for the applier_apply_tx helper, since we
   need to pass replica_id into apply_synchro_row/apply_plain_tx (the
   argument had been removed in b35e4708e8419759bf2ec024d1f91ae7a356af79
   because we didn't need it at that moment).

issue https://github.com/tarantool/tarantool/issues/5447
branch gorcunov/gh-5447-relay-lag-10

Cyrill Gorcunov (2):
  applier: send transaction's first row WAL time in the applier_writer_f
  relay: provide information about downstream lag

 .../unreleased/gh-5447-downstream-lag.md      |   6 +
 src/box/applier.cc                            | 101 +++++++++++---
 src/box/lua/info.c                            |   3 +
 src/box/relay.cc                              |  41 ++++++
 src/box/relay.h                               |   6 +
 src/box/replication.cc                        |   1 +
 src/box/replication.h                         |   5 +
 .../replication/gh-5447-downstream-lag.result | 124 ++++++++++++++++++
 .../gh-5447-downstream-lag.test.lua           |  56 ++++++++
 test/replication/suite.ini                    |   2 +-
 10 files changed, 327 insertions(+), 18 deletions(-)
 create mode 100644 changelogs/unreleased/gh-5447-downstream-lag.md
 create mode 100644 test/replication/gh-5447-downstream-lag.result
 create mode 100644 test/replication/gh-5447-downstream-lag.test.lua


base-commit: e6b74fd44c4e3d9304363ef3d367600bb9b0538d
-- 
2.31.1



* [Tarantool-patches] [PATCH v10 1/2] applier: send transaction's first row WAL time in the applier_writer_f
  2021-06-22 15:22 [Tarantool-patches] [PATCH v10 0/2] relay: provide downstream lag information Cyrill Gorcunov via Tarantool-patches
@ 2021-06-22 15:22 ` Cyrill Gorcunov via Tarantool-patches
  2021-06-22 15:22 ` [Tarantool-patches] [PATCH v10 2/2] relay: provide information about downstream lag Cyrill Gorcunov via Tarantool-patches
  2021-06-23 22:19 ` [Tarantool-patches] [PATCH v10 0/2] relay: provide downstream lag information Vladislav Shpilevoy via Tarantool-patches
  2 siblings, 0 replies; 5+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-06-22 15:22 UTC (permalink / raw)
  To: tml; +Cc: Vladislav Shpilevoy

The applier fiber sends the node's current vclock to the remote relay
reader, indicating the current state of the fetched WAL data, so the relay
knows which new data should be sent. The packet the applier sends carries
the xrow_header::tm field as zero, but we can reuse it to provide the
timestamp of the first row of a transaction we wrote to our WAL. Since old
Tarantool instances simply ignore this field, such an extension won't cause
any problems.

The timestamp will be needed to account for the lag of downstream replicas,
which is useful for informational purposes and cluster health monitoring.

We update the applier statistics in WAL callbacks, but since both
apply_synchro_row and apply_plain_tx are used not only for real data
application but also at the final join stage (where we are not writing the
data yet), apply_synchro_row is extended with a replica_id argument which
is nonzero when the applier is subscribed.

The calculation of the downstream lag itself will be addressed in the
next patch, because sending the timestamp and observing it are
independent actions.

Part-of #5447

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 src/box/applier.cc     | 101 ++++++++++++++++++++++++++++++++++-------
 src/box/replication.cc |   1 +
 src/box/replication.h  |   5 ++
 3 files changed, 90 insertions(+), 17 deletions(-)

diff --git a/src/box/applier.cc b/src/box/applier.cc
index ee5c05c0d..07fe7f5c7 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -163,6 +163,9 @@ applier_writer_f(va_list ap)
 	struct ev_io io;
 	coio_create(&io, applier->io.fd);
 
+	/* ID is permanent while applier is alive */
+	uint32_t replica_id = applier->instance_id;
+
 	while (!fiber_is_cancelled()) {
 		/*
 		 * Tarantool >= 1.7.7 sends periodic heartbeat
@@ -193,6 +196,16 @@ applier_writer_f(va_list ap)
 			applier->has_acks_to_send = false;
 			struct xrow_header xrow;
 			xrow_encode_vclock(&xrow, &replicaset.vclock);
+			/*
+			 * For relay lag statistics we report the last
+			 * written transaction timestamp in the tm field.
+			 *
+			 * If the user deletes the node from the _cluster
+			 * space, we obtain a NULL pointer here.
+			 */
+			struct replica *r = replica_by_id(replica_id);
+			if (likely(r != NULL))
+				xrow.tm = r->applier_txn_last_tm;
 			coio_write_xrow(&io, &xrow);
 			ERROR_INJECT(ERRINJ_APPLIER_SLOW_ACK, {
 				fiber_sleep(0.01);
@@ -490,7 +503,7 @@ static uint64_t
 applier_read_tx(struct applier *applier, struct stailq *rows, double timeout);
 
 static int
-apply_final_join_tx(struct stailq *rows);
+apply_final_join_tx(uint32_t replica_id, struct stailq *rows);
 
 /**
  * A helper struct to link xrow objects in a list.
@@ -535,7 +548,7 @@ applier_wait_register(struct applier *applier, uint64_t row_count)
 						  next)->row);
 			break;
 		}
-		if (apply_final_join_tx(&rows) != 0)
+		if (apply_final_join_tx(applier->instance_id, &rows) != 0)
 			diag_raise();
 	}
 
@@ -751,11 +764,35 @@ applier_txn_rollback_cb(struct trigger *trigger, void *event)
 	return 0;
 }
 
+struct replica_cb_data {
+	/** Replica ID the data belongs to. */
+	uint32_t replica_id;
+	/**
+	 * Timestamp of a transaction to be accounted
+	 * for relay lag. Usually it is the timestamp
+	 * of the last row in a transaction.
+	 */
+	double txn_last_tm;
+};
+
+/** Update replica associated data once write is complete. */
+static void
+replica_txn_wal_write_cb(struct replica_cb_data *rcb)
+{
+	struct replica *r = replica_by_id(rcb->replica_id);
+	if (likely(r != NULL))
+		r->applier_txn_last_tm = rcb->txn_last_tm;
+}
+
 static int
 applier_txn_wal_write_cb(struct trigger *trigger, void *event)
 {
-	(void) trigger;
 	(void) event;
+
+	struct replica_cb_data *rcb =
+		(struct replica_cb_data *)trigger->data;
+	replica_txn_wal_write_cb(rcb);
+
 	/* Broadcast the WAL write across all appliers. */
 	trigger_run(&replicaset.applier.on_wal_write, NULL);
 	return 0;
@@ -766,6 +803,8 @@ struct synchro_entry {
 	struct synchro_request *req;
 	/** Fiber created the entry. To wakeup when WAL write is done. */
 	struct fiber *owner;
+	/** Replica associated data. */
+	struct replica_cb_data *rcb;
 	/**
 	 * The base journal entry. It has unsized array and then must be the
 	 * last entry in the structure. But can workaround it via a union
@@ -789,6 +828,7 @@ apply_synchro_row_cb(struct journal_entry *entry)
 	if (entry->res < 0) {
 		applier_rollback_by_wal_io(entry->res);
 	} else {
+		replica_txn_wal_write_cb(synchro_entry->rcb);
 		txn_limbo_process(&txn_limbo, synchro_entry->req);
 		trigger_run(&replicaset.applier.on_wal_write, NULL);
 	}
@@ -797,7 +837,7 @@ apply_synchro_row_cb(struct journal_entry *entry)
 
 /** Process a synchro request. */
 static int
-apply_synchro_row(struct xrow_header *row)
+apply_synchro_row(uint32_t replica_id, struct xrow_header *row)
 {
 	assert(iproto_type_is_synchro_request(row->type));
 
@@ -805,6 +845,7 @@ apply_synchro_row(struct xrow_header *row)
 	if (xrow_decode_synchro(row, &req) != 0)
 		goto err;
 
+	struct replica_cb_data rcb_data;
 	struct synchro_entry entry;
 	/*
 	 * Rows array is cast from *[] to **, because otherwise g++ complains
@@ -817,6 +858,11 @@ apply_synchro_row(struct xrow_header *row)
 			     apply_synchro_row_cb, &entry);
 	entry.req = &req;
 	entry.owner = fiber();
+
+	rcb_data.replica_id = replica_id;
+	rcb_data.txn_last_tm = row->tm;
+	entry.rcb = &rcb_data;
+
 	/*
 	 * The WAL write is blocking. Otherwise it might happen that a CONFIRM
 	 * or ROLLBACK is sent to WAL, and it would empty the limbo, but before
@@ -864,8 +910,9 @@ applier_handle_raft(struct applier *applier, struct xrow_header *row)
 	return box_raft_process(&req, applier->instance_id);
 }
 
-static inline int
-apply_plain_tx(struct stailq *rows, bool skip_conflict, bool use_triggers)
+static int
+apply_plain_tx(uint32_t replica_id, struct stailq *rows,
+	       bool skip_conflict, bool use_triggers)
 {
 	/*
 	 * Explicitly begin the transaction so that we can
@@ -933,10 +980,28 @@ apply_plain_tx(struct stailq *rows, bool skip_conflict, bool use_triggers)
 			goto fail;
 		}
 
+		struct replica_cb_data *rcb;
+		rcb = region_alloc_object(&txn->region, typeof(*rcb), &size);
+		if (rcb == NULL) {
+			diag_set(OutOfMemory, size, "region_alloc_object", "rcb");
+			goto fail;
+		}
+
 		trigger_create(on_rollback, applier_txn_rollback_cb, NULL, NULL);
 		txn_on_rollback(txn, on_rollback);
 
-		trigger_create(on_wal_write, applier_txn_wal_write_cb, NULL, NULL);
+		/*
+		 * We use the *last* entry's timestamp because the ack
+		 * covers all entries up to the last one in the transaction.
+		 * At the same time this gives a more precise result, since
+		 * we're interested in how long the transaction traversed
+		 * the network + remote WAL before the ack was received.
+		 */
+		item = stailq_last_entry(rows, struct applier_tx_row, next);
+		rcb->replica_id = replica_id;
+		rcb->txn_last_tm = item->row.tm;
+
+		trigger_create(on_wal_write, applier_txn_wal_write_cb, rcb, NULL);
 		txn_on_wal_write(txn, on_wal_write);
 	}
 
@@ -948,7 +1013,7 @@ apply_plain_tx(struct stailq *rows, bool skip_conflict, bool use_triggers)
 
 /** A simpler version of applier_apply_tx() for final join stage. */
 static int
-apply_final_join_tx(struct stailq *rows)
+apply_final_join_tx(uint32_t replica_id, struct stailq *rows)
 {
 	struct xrow_header *first_row =
 		&stailq_first_entry(rows, struct applier_tx_row, next)->row;
@@ -959,9 +1024,9 @@ apply_final_join_tx(struct stailq *rows)
 	vclock_follow_xrow(&replicaset.vclock, last_row);
 	if (unlikely(iproto_type_is_synchro_request(first_row->type))) {
 		assert(first_row == last_row);
-		rc = apply_synchro_row(first_row);
+		rc = apply_synchro_row(replica_id, first_row);
 	} else {
-		rc = apply_plain_tx(rows, false, false);
+		rc = apply_plain_tx(replica_id, rows, false, false);
 	}
 	fiber_gc();
 	return rc;
@@ -1029,7 +1094,7 @@ nopify:;
  * Return 0 for success or -1 in case of an error.
  */
 static int
-applier_apply_tx(struct stailq *rows)
+applier_apply_tx(struct applier *applier, struct stailq *rows)
 {
 	/*
 	 * Initially we've been filtering out data if it came from
@@ -1093,12 +1158,14 @@ applier_apply_tx(struct stailq *rows)
 		 * each other.
 		 */
 		assert(first_row == last_row);
-		if ((rc = apply_synchro_row(first_row)) != 0)
-			goto finish;
-	} else if ((rc = apply_plain_tx(rows, replication_skip_conflict,
-					true)) != 0) {
-		goto finish;
+		rc = apply_synchro_row(applier->instance_id, first_row);
+	} else {
+		rc = apply_plain_tx(applier->instance_id, rows,
+				    replication_skip_conflict, true);
 	}
+	if (rc != 0)
+		goto finish;
+
 	vclock_follow(&replicaset.applier.vclock, last_row->replica_id,
 		      last_row->lsn);
 finish:
@@ -1317,7 +1384,7 @@ applier_subscribe(struct applier *applier)
 					diag_raise();
 			}
 			applier_signal_ack(applier);
-		} else if (applier_apply_tx(&rows) != 0) {
+		} else if (applier_apply_tx(applier, &rows) != 0) {
 			diag_raise();
 		}
 
diff --git a/src/box/replication.cc b/src/box/replication.cc
index 903390686..a0b3e0186 100644
--- a/src/box/replication.cc
+++ b/src/box/replication.cc
@@ -184,6 +184,7 @@ replica_new(void)
 	trigger_create(&replica->on_applier_state,
 		       replica_on_applier_state_f, NULL, NULL);
 	replica->applier_sync_state = APPLIER_DISCONNECTED;
+	replica->applier_txn_last_tm = 0;
 	latch_create(&replica->order_latch);
 	return replica;
 }
diff --git a/src/box/replication.h b/src/box/replication.h
index 5cc380373..57e0f10ae 100644
--- a/src/box/replication.h
+++ b/src/box/replication.h
@@ -331,6 +331,11 @@ struct replica {
 	 * separate from applier.
 	 */
 	enum applier_state applier_sync_state;
+	/**
+	 * Timestamp of the last transaction the applier wrote to WAL.
+	 * Needed for relay lag statistics.
+	 */
+	double applier_txn_last_tm;
 	/* The latch is used to order replication requests. */
 	struct latch order_latch;
 };
-- 
2.31.1



* [Tarantool-patches] [PATCH v10 2/2] relay: provide information about downstream lag
  2021-06-22 15:22 [Tarantool-patches] [PATCH v10 0/2] relay: provide downstream lag information Cyrill Gorcunov via Tarantool-patches
  2021-06-22 15:22 ` [Tarantool-patches] [PATCH v10 1/2] applier: send transaction's first row WAL time in the applier_writer_f Cyrill Gorcunov via Tarantool-patches
@ 2021-06-22 15:22 ` Cyrill Gorcunov via Tarantool-patches
  2021-06-23 22:19 ` [Tarantool-patches] [PATCH v10 0/2] relay: provide downstream lag information Vladislav Shpilevoy via Tarantool-patches
  2 siblings, 0 replies; 5+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-06-22 15:22 UTC (permalink / raw)
  To: tml; +Cc: Vladislav Shpilevoy

We already have the `box.replication.upstream.lag` entry for monitoring
purposes. At the same time, in synchronous replication timeouts are key
properties of the quorum gathering procedure. Thus we would like to know
how long it takes a transaction to traverse the `initiator WAL -> network ->
remote applier -> initiator ACK reception` path.

Typical output is

 | tarantool> box.info.replication[2].downstream
 | ---
 | - status: follow
 |   idle: 0.61753897101153
 |   vclock: {1: 147}
 |   lag: 0
 | ...
 | tarantool> box.space.sync:insert{69}
 | ---
 | - [69]
 | ...
 |
 | tarantool> box.info.replication[2].downstream
 | ---
 | - status: follow
 |   idle: 0.75324084801832
 |   vclock: {1: 151}
 |   lag: 0.0011014938354492
 | ...
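
As a usage illustration (a sketch only, not part of the patch; the
threshold value and the alert hook are made up for the example), a
monitoring script could poll the new member like any other `box.info`
field:

 | local LAG_THRESHOLD = 1.0 -- seconds, an arbitrary example value
 | for id, r in pairs(box.info.replication) do
 |     local d = r.downstream
 |     -- `lag` is reported while the relay is in 'follow' state.
 |     if d ~= nil and d.status == 'follow' and d.lag > LAG_THRESHOLD then
 |         alert_slow_replica(id, d.lag) -- hypothetical user-defined hook
 |     end
 | end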

Closes #5447

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

@TarantoolBot document
Title: Add `box.info.replication[n].downstream.lag` entry

`replication[n].downstream.lag` represents the lag between the moment the
main node writes a certain transaction to its own WAL and the moment it
receives an ack for this transaction from a replica.
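
A hedged usage sketch relating the new member to the synchronous
replication timeouts mentioned above (the 0.5 safety factor is an
arbitrary example value; `replication_synchro_timeout` is assumed to be
configured in `box.cfg`):

 | local log = require('log')
 | local timeout = box.cfg.replication_synchro_timeout
 | for id, r in pairs(box.info.replication) do
 |     local d = r.downstream
 |     if d ~= nil and d.status == 'follow' and d.lag > timeout * 0.5 then
 |         -- The ACK round trip eats into the time budget available
 |         -- for collecting a synchronous quorum.
 |         log.warn('replica %d lag %.3fs vs synchro timeout %.3fs',
 |                  id, d.lag, timeout)
 |     end
 | end
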
---
 .../unreleased/gh-5447-downstream-lag.md      |   6 +
 src/box/lua/info.c                            |   3 +
 src/box/relay.cc                              |  41 ++++++
 src/box/relay.h                               |   6 +
 .../replication/gh-5447-downstream-lag.result | 124 ++++++++++++++++++
 .../gh-5447-downstream-lag.test.lua           |  56 ++++++++
 test/replication/suite.ini                    |   2 +-
 7 files changed, 237 insertions(+), 1 deletion(-)
 create mode 100644 changelogs/unreleased/gh-5447-downstream-lag.md
 create mode 100644 test/replication/gh-5447-downstream-lag.result
 create mode 100644 test/replication/gh-5447-downstream-lag.test.lua

diff --git a/changelogs/unreleased/gh-5447-downstream-lag.md b/changelogs/unreleased/gh-5447-downstream-lag.md
new file mode 100644
index 000000000..f937ce35e
--- /dev/null
+++ b/changelogs/unreleased/gh-5447-downstream-lag.md
@@ -0,0 +1,6 @@
+#feature/replication
+
+ * Introduced the `box.info.replication[n].downstream.lag` field to monitor
+   the state of replication. This member represents the lag between the
+   moment the main node writes a certain transaction to its own WAL and the
+   moment it receives an ack for this transaction from a replica (gh-5447).
diff --git a/src/box/lua/info.c b/src/box/lua/info.c
index 0eb48b823..f201b25e3 100644
--- a/src/box/lua/info.c
+++ b/src/box/lua/info.c
@@ -143,6 +143,9 @@ lbox_pushrelay(lua_State *L, struct relay *relay)
 		lua_pushnumber(L, ev_monotonic_now(loop()) -
 			       relay_last_row_time(relay));
 		lua_settable(L, -3);
+		lua_pushstring(L, "lag");
+		lua_pushnumber(L, relay_txn_lag(relay));
+		lua_settable(L, -3);
 		break;
 	case RELAY_STOPPED:
 	{
diff --git a/src/box/relay.cc b/src/box/relay.cc
index b1571b361..115037fc3 100644
--- a/src/box/relay.cc
+++ b/src/box/relay.cc
@@ -68,6 +68,8 @@ struct relay_status_msg {
 	struct relay *relay;
 	/** Replica vclock. */
 	struct vclock vclock;
+	/** Downstream lag of the last replicated transaction. */
+	double txn_lag;
 };
 
 /**
@@ -158,6 +160,14 @@ struct relay {
 	struct stailq pending_gc;
 	/** Time when last row was sent to peer. */
 	double last_row_time;
+	/**
+	 * The time difference between the moment when we
+	 * wrote a transaction to the local WAL and the
+	 * moment when this transaction was replicated to
+	 * the remote node (i.e. written to the node's WAL)
+	 * and its ACK was received.
+	 */
+	double txn_lag;
 	/** Relay sync state. */
 	enum relay_state state;
 
@@ -166,6 +176,11 @@ struct relay {
 		alignas(CACHELINE_SIZE)
 		/** Known relay vclock. */
 		struct vclock vclock;
+		/**
+		 * Transaction downstream lag to be accessed
+		 * from TX thread only.
+		 */
+		double txn_lag;
 		/**
 		 * True if the relay needs Raft updates. It can live fine
 		 * without sending Raft updates, if it is a relay to an
@@ -217,6 +232,12 @@ relay_last_row_time(const struct relay *relay)
 	return relay->last_row_time;
 }
 
+double
+relay_txn_lag(const struct relay *relay)
+{
+	return relay->tx.txn_lag;
+}
+
 static void
 relay_send(struct relay *relay, struct xrow_header *packet);
 static void
@@ -336,6 +357,12 @@ relay_stop(struct relay *relay)
 	 * upon cord_create().
 	 */
 	relay->cord.id = 0;
+	/*
+	 * If the relay is stopped, reset the lag statistics;
+	 * they will be updated once new ACK packets arrive.
+	 */
+	relay->txn_lag = 0;
+	relay->tx.txn_lag = 0;
 }
 
 void
@@ -483,6 +510,8 @@ tx_status_update(struct cmsg *msg)
 {
 	struct relay_status_msg *status = (struct relay_status_msg *)msg;
 	vclock_copy(&status->relay->tx.vclock, &status->vclock);
+	status->relay->tx.txn_lag = status->txn_lag;
+
 	struct replication_ack ack;
 	ack.source = status->relay->replica->id;
 	ack.vclock = &status->vclock;
@@ -629,6 +658,17 @@ relay_reader_f(va_list ap)
 			/* vclock is followed while decoding, zeroing it. */
 			vclock_create(&relay->recv_vclock);
 			xrow_decode_vclock_xc(&xrow, &relay->recv_vclock);
+			/*
+			 * The replica sends us the timestamp of the last
+			 * replicated transaction, which is needed for relay
+			 * lag monitoring. Note that this transaction was
+			 * written to WAL with our current realtime clock
+			 * value, thus when it gets reported back we can
+			 * compute the time spent regardless of the clock
+			 * value on the remote replica.
+			 */
+			if (xrow.tm != 0)
+				relay->txn_lag = ev_now(loop()) - xrow.tm;
 			fiber_cond_signal(&relay->reader_cond);
 		}
 	} catch (Exception *e) {
@@ -838,6 +878,7 @@ relay_subscribe_f(va_list ap)
 		};
 		cmsg_init(&relay->status_msg.msg, route);
 		vclock_copy(&relay->status_msg.vclock, send_vclock);
+		relay->status_msg.txn_lag = relay->txn_lag;
 		relay->status_msg.relay = relay;
 		cpipe_push(&relay->tx_pipe, &relay->status_msg.msg);
 	}
diff --git a/src/box/relay.h b/src/box/relay.h
index b32e2ea2a..615ffb75d 100644
--- a/src/box/relay.h
+++ b/src/box/relay.h
@@ -93,6 +93,12 @@ relay_vclock(const struct relay *relay);
 double
 relay_last_row_time(const struct relay *relay);
 
+/**
+ * Return the relay's last known transaction lag.
+ */
+double
+relay_txn_lag(const struct relay *relay);
+
 /**
  * Send a Raft update request to the relay channel. It is not
  * guaranteed that it will be delivered. The connection may break.
diff --git a/test/replication/gh-5447-downstream-lag.result b/test/replication/gh-5447-downstream-lag.result
new file mode 100644
index 000000000..0d5de2564
--- /dev/null
+++ b/test/replication/gh-5447-downstream-lag.result
@@ -0,0 +1,124 @@
+-- test-run result file version 2
+--
+-- gh-5447: Test for box.info.replication[n].downstream.lag.
+-- We need to be sure that slow ACK delivery can be
+-- caught by monitoring tools.
+--
+
+fiber = require('fiber')
+ | ---
+ | ...
+test_run = require('test_run').new()
+ | ---
+ | ...
+engine = test_run:get_cfg('engine')
+ | ---
+ | ...
+
+box.schema.user.grant('guest', 'replication')
+ | ---
+ | ...
+
+test_run:cmd('create server replica with rpl_master=default, \
+             script="replication/replica.lua"')
+ | ---
+ | - true
+ | ...
+test_run:cmd('start server replica')
+ | ---
+ | - true
+ | ...
+
+replica_id = test_run:get_server_id('replica')
+ | ---
+ | ...
+
+--
+-- Upon replica startup there are no ACKs to process.
+assert(box.info.replication[replica_id].downstream.lag == 0)
+ | ---
+ | - true
+ | ...
+
+s = box.schema.space.create('test', {engine = engine})
+ | ---
+ | ...
+_ = s:create_index('pk')
+ | ---
+ | ...
+
+--
+-- The replica should wait some time before writing data
+-- to the WAL, otherwise we might not even notice the lag
+-- if the media is too fast. Before disabling the WAL we
+-- need to wait until the space gets propagated.
+test_run:switch('replica')
+ | ---
+ | - true
+ | ...
+test_run:wait_lsn('replica', 'default')
+ | ---
+ | ...
+box.error.injection.set("ERRINJ_WAL_DELAY", true)
+ | ---
+ | - ok
+ | ...
+
+--
+-- Insert a record and wake up the replica's WAL to process the data.
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+box.space.test:insert({1})
+ | ---
+ | - [1]
+ | ...
+-- The record is written on the master node.
+test_run:switch('replica')
+ | ---
+ | - true
+ | ...
+box.error.injection.set("ERRINJ_WAL_DELAY", false)
+ | ---
+ | - ok
+ | ...
+
+--
+-- Wait for the record to be ACKed; the lag value should be nonzero.
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+test_run:wait_lsn('replica', 'default')
+ | ---
+ | ...
+assert(box.info.replication[replica_id].downstream.lag > 0)
+ | ---
+ | - true
+ | ...
+
+--
+-- Cleanup everything.
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+box.space.test:drop()
+ | ---
+ | ...
+box.schema.user.revoke('guest', 'replication')
+ | ---
+ | ...
+test_run:cmd('stop server replica')
+ | ---
+ | - true
+ | ...
+test_run:cmd('cleanup server replica')
+ | ---
+ | - true
+ | ...
+test_run:cmd('delete server replica')
+ | ---
+ | - true
+ | ...
diff --git a/test/replication/gh-5447-downstream-lag.test.lua b/test/replication/gh-5447-downstream-lag.test.lua
new file mode 100644
index 000000000..dd1d2e2c9
--- /dev/null
+++ b/test/replication/gh-5447-downstream-lag.test.lua
@@ -0,0 +1,56 @@
+--
+-- gh-5447: Test for box.info.replication[n].downstream.lag.
+-- We need to be sure that slow ACK delivery can be
+-- caught by monitoring tools.
+--
+
+fiber = require('fiber')
+test_run = require('test_run').new()
+engine = test_run:get_cfg('engine')
+
+box.schema.user.grant('guest', 'replication')
+
+test_run:cmd('create server replica with rpl_master=default, \
+             script="replication/replica.lua"')
+test_run:cmd('start server replica')
+
+replica_id = test_run:get_server_id('replica')
+
+--
+-- Upon replica startup there are no ACKs to process.
+assert(box.info.replication[replica_id].downstream.lag == 0)
+
+s = box.schema.space.create('test', {engine = engine})
+_ = s:create_index('pk')
+
+--
+-- The replica should wait some time before writing data
+-- to the WAL, otherwise we might not even notice the lag
+-- if the media is too fast. Before disabling the WAL we
+-- need to wait until the space gets propagated.
+test_run:switch('replica')
+test_run:wait_lsn('replica', 'default')
+box.error.injection.set("ERRINJ_WAL_DELAY", true)
+
+--
+-- Insert a record and wake up the replica's WAL to process the data.
+test_run:switch('default')
+box.space.test:insert({1})
+-- The record is written on the master node.
+test_run:switch('replica')
+box.error.injection.set("ERRINJ_WAL_DELAY", false)
+
+--
+-- Wait for the record to be ACKed; the lag value should be nonzero.
+test_run:switch('default')
+test_run:wait_lsn('replica', 'default')
+assert(box.info.replication[replica_id].downstream.lag > 0)
+
+--
+-- Cleanup everything.
+test_run:switch('default')
+box.space.test:drop()
+box.schema.user.revoke('guest', 'replication')
+test_run:cmd('stop server replica')
+test_run:cmd('cleanup server replica')
+test_run:cmd('delete server replica')
diff --git a/test/replication/suite.ini b/test/replication/suite.ini
index 40daa713b..7e478f7d0 100644
--- a/test/replication/suite.ini
+++ b/test/replication/suite.ini
@@ -3,7 +3,7 @@ core = tarantool
 script =  master.lua
 description = tarantool/box, replication
 disabled = consistent.test.lua
-release_disabled = catch.test.lua errinj.test.lua gc.test.lua gc_no_space.test.lua before_replace.test.lua qsync_advanced.test.lua qsync_errinj.test.lua quorum.test.lua recover_missing_xlog.test.lua sync.test.lua long_row_timeout.test.lua gh-4739-vclock-assert.test.lua gh-4730-applier-rollback.test.lua gh-5140-qsync-casc-rollback.test.lua gh-5144-qsync-dup-confirm.test.lua gh-5167-qsync-rollback-snap.test.lua gh-5506-election-on-off.test.lua gh-5536-wal-limit.test.lua hang_on_synchro_fail.test.lua anon_register_gap.test.lua gh-5213-qsync-applier-order.test.lua gh-5213-qsync-applier-order-3.test.lua gh-6027-applier-error-show.test.lua gh-6032-promote-wal-write.test.lua gh-6057-qsync-confirm-async-no-wal.test.lua
+release_disabled = catch.test.lua errinj.test.lua gc.test.lua gc_no_space.test.lua before_replace.test.lua qsync_advanced.test.lua qsync_errinj.test.lua quorum.test.lua recover_missing_xlog.test.lua sync.test.lua long_row_timeout.test.lua gh-4739-vclock-assert.test.lua gh-4730-applier-rollback.test.lua gh-5140-qsync-casc-rollback.test.lua gh-5144-qsync-dup-confirm.test.lua gh-5167-qsync-rollback-snap.test.lua gh-5506-election-on-off.test.lua gh-5536-wal-limit.test.lua hang_on_synchro_fail.test.lua anon_register_gap.test.lua gh-5213-qsync-applier-order.test.lua gh-5213-qsync-applier-order-3.test.lua gh-6027-applier-error-show.test.lua gh-6032-promote-wal-write.test.lua gh-6057-qsync-confirm-async-no-wal.test.lua gh-5447-downstream-lag.test.lua
 config = suite.cfg
 lua_libs = lua/fast_replica.lua lua/rlimit.lua
 use_unix_sockets = True
-- 
2.31.1



* Re: [Tarantool-patches] [PATCH v10 0/2] relay: provide downstream lag information
  2021-06-22 15:22 [Tarantool-patches] [PATCH v10 0/2] relay: provide downstream lag information Cyrill Gorcunov via Tarantool-patches
  2021-06-22 15:22 ` [Tarantool-patches] [PATCH v10 1/2] applier: send transaction's first row WAL time in the applier_writer_f Cyrill Gorcunov via Tarantool-patches
  2021-06-22 15:22 ` [Tarantool-patches] [PATCH v10 2/2] relay: provide information about downstream lag Cyrill Gorcunov via Tarantool-patches
@ 2021-06-23 22:19 ` Vladislav Shpilevoy via Tarantool-patches
  2021-06-23 22:21   ` Cyrill Gorcunov via Tarantool-patches
  2 siblings, 1 reply; 5+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-06-23 22:19 UTC (permalink / raw)
  To: Cyrill Gorcunov, tml

Hi! Thanks for the patchset!

Pushed to master.


* Re: [Tarantool-patches] [PATCH v10 0/2] relay: provide downstream lag information
  2021-06-23 22:19 ` [Tarantool-patches] [PATCH v10 0/2] relay: provide downstream lag information Vladislav Shpilevoy via Tarantool-patches
@ 2021-06-23 22:21   ` Cyrill Gorcunov via Tarantool-patches
  0 siblings, 0 replies; 5+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-06-23 22:21 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tml

On Thu, Jun 24, 2021 at 12:19:27AM +0200, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patchset!
> 
> Pushed to master.

Thanks a lot for the review and all the comments!

