[Tarantool-patches] [RFC v7 0/2] relay: provide downstream lag information

Tarantool development patches archive
 help / color / mirror / Atom feed

* [Tarantool-patches] [RFC v7 0/2] relay: provide downstream lag information
@ 2021-06-04 17:06 Cyrill Gorcunov via Tarantool-patches
  2021-06-04 17:06 ` [Tarantool-patches] [RFC v7 1/2] applier: send transaction's first row WAL time in the applier_writer_f Cyrill Gorcunov via Tarantool-patches
  2021-06-04 17:06 ` [Tarantool-patches] [RFC v7 2/2] relay: provide information about downstream lag Cyrill Gorcunov via Tarantool-patches
  0 siblings, 2 replies; 3+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-06-04 17:06 UTC (permalink / raw)
  To: tml; +Cc: Vladislav Shpilevoy

Guys, take a look once time permit. Previous version is here
https://lists.tarantool.org/tarantool-patches/20210506214132.533913-1-gorcunov@gmail.com/

Not for merging yet! I think instead of applier_wal_stat structure we might need some
_commonly shared_ statistics structure, probably bound to WAL code and all other
threads will update it in a lockless way, because we might need to collect more
detais on WAL processing in future. I though of something like

	enum {
		WAL_ICNT__APPLIER_TXN_START_TM,

		WAL_ICNT__MAX,
	};

	struct wal_stat {
		int64_t	icnt[WAL_ICNT__MAX];
	} wal_st[VCLOCK_MAX];

and introduce

	wal_st__icnt_read(unsigned id);
	wal_st__icnt_write(unsigned id);

then applier will simply push last timestamp to WAL_ICNT__APPLIER_TXN_START_TM
counter, and later when we need to send ACK we use wal_st__icnt_read() to
fetch it back. We won't need to allocate some dynamic memory for it but
rather use statically preallocated shared between threads. Not sure though.

v4 (by Vlad):
 - add a test case
 - add docbot request
 - dropped off xrow_encode_vclock_timed, we use opencoded assignment
   for tm value when send ack
 - struct awstat renamed to applier_wal_stat. Vlad I think this is
   better name than "applier_lag" because this is statistics on WAL,
   we simply track remote WAL propagation here, so more general name
   is better for grep sake and for future extensions
 - instead of passing applier structure we pass replica_id
 - the real keeper of this statistics comes into "replica" structure
   thus unbound of applier itself
 - for synchro entries we pass a pointer to the applier_wal_stat instead
   of using replica_id = 0 as a sign that we don't need to update statistics
   for initial and final join cases
 - to write and read statistics we provide wal_stat_update and wal_stat_ack
   helpers to cover the case where single ACK spans several transactions

v7:
 - reworked the idea, so we always send last applied transaction's timestamp
   inside applier's ACK message
 - fixed changelong
 - in replica structure opencode the applier_txn_start_tm member
 - drop multiple ifs in applier_apply_tx
 - drop if statement in apply_synchro_row

Vlad you pointed

 > 4. Lag is updated in the relay thread, therefore you can't
 > simply read it in TX thread like you do in the diff block
 > above.

actually I can read the relay's lag in box.info() output, if the
relay object is removed then it won't have RELAY_FOLLOW state
so we're safe. This is what you meant?

branch gorcunov/gh-5447-relay-lag-7-notest
issue https://github.com/tarantool/tarantool/issues/5447

Cyrill Gorcunov (2):
  applier: send transaction's first row WAL time in the applier_writer_f
  relay: provide information about downstream lag

 .../unreleased/gh-5447-downstream-lag.md      |  6 ++
 src/box/applier.cc                            | 74 ++++++++++++---
 src/box/applier.h                             | 14 +++
 src/box/lua/info.c                            |  3 +
 src/box/relay.cc                              | 51 ++++++++++
 src/box/relay.h                               |  6 ++
 src/box/replication.cc                        |  1 +
 src/box/replication.h                         |  5 +
 .../replication/gh-5447-downstream-lag.result | 93 +++++++++++++++++++
 .../gh-5447-downstream-lag.test.lua           | 41 ++++++++
 10 files changed, 279 insertions(+), 15 deletions(-)
 create mode 100644 changelogs/unreleased/gh-5447-downstream-lag.md
 create mode 100644 test/replication/gh-5447-downstream-lag.result
 create mode 100644 test/replication/gh-5447-downstream-lag.test.lua

-- 
2.31.1


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Tarantool-patches] [RFC v7 1/2] applier: send transaction's first row WAL time in the applier_writer_f
  2021-06-04 17:06 [Tarantool-patches] [RFC v7 0/2] relay: provide downstream lag information Cyrill Gorcunov via Tarantool-patches
@ 2021-06-04 17:06 ` Cyrill Gorcunov via Tarantool-patches
  2021-06-04 17:06 ` [Tarantool-patches] [RFC v7 2/2] relay: provide information about downstream lag Cyrill Gorcunov via Tarantool-patches
  1 sibling, 0 replies; 3+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-06-04 17:06 UTC (permalink / raw)
  To: tml; +Cc: Vladislav Shpilevoy

Applier fiber sends current vclock of the node to remote relay reader,
pointing current state of fetched WAL data so the relay will know which
new data should be sent. The packet applier sends carries xrow_header::tm
field as a zero but we can reuse it to provide information about first
timestamp in a transaction we wrote to our WAL. Since old instances of
Tarantool simply ignore this field such extension won't cause any
problems.

The timestamp will be needed to account lag of downstream replicas
suitable for information purpose and cluster health monitoring.

We update applier statistics in WAL callbacks but since both
apply_synchro_row and apply_plain_tx are used not only in real data
application but in final join stage as well (in this stage we're not
writing the data yet) the apply_synchro_row is extended with replica_id
argument which is non zero when applier is subscribed.

The calculation of the downstream lag itself lag will be addressed
in next patch because sending the timestamp and its observation
are independent actions.

Part-of #5447

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 src/box/applier.cc     | 74 +++++++++++++++++++++++++++++++++---------
 src/box/applier.h      | 14 ++++++++
 src/box/replication.cc |  1 +
 src/box/replication.h  |  5 +++
 4 files changed, 79 insertions(+), 15 deletions(-)

diff --git a/src/box/applier.cc b/src/box/applier.cc
index 33181fdbf..94f247298 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -148,6 +148,22 @@ applier_check_sync(struct applier *applier)
 	}
 }
 
+static void
+wal_stat_update(struct applier_wal_stat *wal_st)
+{
+	struct replica *r = replica_by_id(wal_st->replica_id);
+	if (likely(r != NULL))
+		r->applier_txn_start_tm = wal_st->txn_start_tm;
+}
+
+static void
+wal_stat_encode(uint32_t replica_id, struct xrow_header *xrow)
+{
+	struct replica *r = replica_by_id(replica_id);
+	if (likely(r != NULL))
+		xrow->tm = r->applier_txn_start_tm;
+}
+
 /*
  * Fiber function to write vclock to replication master.
  * To track connection status, replica answers master
@@ -193,6 +209,7 @@ applier_writer_f(va_list ap)
 			applier->has_acks_to_send = false;
 			struct xrow_header xrow;
 			xrow_encode_vclock(&xrow, &replicaset.vclock);
+			wal_stat_encode(applier->instance_id, &xrow);
 			coio_write_xrow(&io, &xrow);
 			ERROR_INJECT(ERRINJ_APPLIER_SLOW_ACK, {
 				fiber_sleep(0.01);
@@ -490,7 +507,7 @@ static uint64_t
 applier_read_tx(struct applier *applier, struct stailq *rows, double timeout);
 
 static int
-apply_final_join_tx(struct stailq *rows);
+apply_final_join_tx(uint32_t replica_id, struct stailq *rows);
 
 /**
  * A helper struct to link xrow objects in a list.
@@ -535,7 +552,7 @@ applier_wait_register(struct applier *applier, uint64_t row_count)
 						  next)->row);
 			break;
 		}
-		if (apply_final_join_tx(&rows) != 0)
+		if (apply_final_join_tx(applier->instance_id, &rows) != 0)
 			diag_raise();
 	}
 
@@ -754,8 +771,12 @@ applier_txn_rollback_cb(struct trigger *trigger, void *event)
 static int
 applier_txn_wal_write_cb(struct trigger *trigger, void *event)
 {
-	(void) trigger;
 	(void) event;
+
+	struct applier_wal_stat *wal_st =
+		(struct applier_wal_stat *)trigger->data;
+	wal_stat_update(wal_st);
+
 	/* Broadcast the WAL write across all appliers. */
 	trigger_run(&replicaset.applier.on_wal_write, NULL);
 	return 0;
@@ -766,6 +787,8 @@ struct synchro_entry {
 	struct synchro_request *req;
 	/** Fiber created the entry. To wakeup when WAL write is done. */
 	struct fiber *owner;
+	/** WAL bound statistics. */
+	struct applier_wal_stat *wal_st;
 	/**
 	 * The base journal entry. It has unsized array and then must be the
 	 * last entry in the structure. But can workaround it via a union
@@ -789,6 +812,7 @@ apply_synchro_row_cb(struct journal_entry *entry)
 	if (entry->res < 0) {
 		applier_rollback_by_wal_io();
 	} else {
+		wal_stat_update(synchro_entry->wal_st);
 		txn_limbo_process(&txn_limbo, synchro_entry->req);
 		trigger_run(&replicaset.applier.on_wal_write, NULL);
 	}
@@ -797,7 +821,7 @@ apply_synchro_row_cb(struct journal_entry *entry)
 
 /** Process a synchro request. */
 static int
-apply_synchro_row(struct xrow_header *row)
+apply_synchro_row(uint32_t replica_id, struct xrow_header *row)
 {
 	assert(iproto_type_is_synchro_request(row->type));
 
@@ -805,6 +829,7 @@ apply_synchro_row(struct xrow_header *row)
 	if (xrow_decode_synchro(row, &req) != 0)
 		goto err;
 
+	struct applier_wal_stat wal_st;
 	struct synchro_entry entry;
 	/*
 	 * Rows array is cast from *[] to **, because otherwise g++ complains
@@ -817,6 +842,11 @@ apply_synchro_row(struct xrow_header *row)
 			     apply_synchro_row_cb, &entry);
 	entry.req = &req;
 	entry.owner = fiber();
+
+	wal_st.replica_id = replica_id;
+	wal_st.txn_start_tm = row->tm;
+	entry.wal_st = &wal_st;
+
 	/*
 	 * The WAL write is blocking. Otherwise it might happen that a CONFIRM
 	 * or ROLLBACK is sent to WAL, and it would empty the limbo, but before
@@ -862,8 +892,9 @@ applier_handle_raft(struct applier *applier, struct xrow_header *row)
 	return box_raft_process(&req, applier->instance_id);
 }
 
-static inline int
-apply_plain_tx(struct stailq *rows, bool skip_conflict, bool use_triggers)
+static int
+apply_plain_tx(uint32_t replica_id, struct stailq *rows,
+	       bool skip_conflict, bool use_triggers)
 {
 	/*
 	 * Explicitly begin the transaction so that we can
@@ -931,10 +962,21 @@ apply_plain_tx(struct stailq *rows, bool skip_conflict, bool use_triggers)
 			goto fail;
 		}
 
+		struct applier_wal_stat *wal_st;
+		wal_st = region_alloc_object(&txn->region, typeof(*wal_st), &size);
+		if (wal_st == NULL) {
+			diag_set(OutOfMemory, size, "region_alloc_object", "wal_st");
+			goto fail;
+		}
+
 		trigger_create(on_rollback, applier_txn_rollback_cb, NULL, NULL);
 		txn_on_rollback(txn, on_rollback);
 
-		trigger_create(on_wal_write, applier_txn_wal_write_cb, NULL, NULL);
+		item = stailq_first_entry(rows, struct applier_tx_row, next);
+		wal_st->replica_id = replica_id;
+		wal_st->txn_start_tm = item->row.tm;
+
+		trigger_create(on_wal_write, applier_txn_wal_write_cb, wal_st, NULL);
 		txn_on_wal_write(txn, on_wal_write);
 	}
 
@@ -946,7 +988,7 @@ apply_plain_tx(struct stailq *rows, bool skip_conflict, bool use_triggers)
 
 /** A simpler version of applier_apply_tx() for final join stage. */
 static int
-apply_final_join_tx(struct stailq *rows)
+apply_final_join_tx(uint32_t replica_id, struct stailq *rows)
 {
 	struct xrow_header *first_row =
 		&stailq_first_entry(rows, struct applier_tx_row, next)->row;
@@ -957,9 +999,9 @@ apply_final_join_tx(struct stailq *rows)
 	vclock_follow_xrow(&replicaset.vclock, last_row);
 	if (unlikely(iproto_type_is_synchro_request(first_row->type))) {
 		assert(first_row == last_row);
-		rc = apply_synchro_row(first_row);
+		rc = apply_synchro_row(replica_id, first_row);
 	} else {
-		rc = apply_plain_tx(rows, false, false);
+		rc = apply_plain_tx(replica_id, rows, false, false);
 	}
 	fiber_gc();
 	return rc;
@@ -1088,12 +1130,14 @@ applier_apply_tx(struct applier *applier, struct stailq *rows)
 		 * each other.
 		 */
 		assert(first_row == last_row);
-		if ((rc = apply_synchro_row(first_row)) != 0)
-			goto finish;
-	} else if ((rc = apply_plain_tx(rows, replication_skip_conflict,
-					true)) != 0) {
-		goto finish;
+		rc = apply_synchro_row(applier->instance_id, first_row);
+	} else {
+		rc = apply_plain_tx(applier->instance_id, rows,
+				    replication_skip_conflict, true);
 	}
+	if (rc != 0)
+		goto finish;
+
 	vclock_follow(&replicaset.applier.vclock, last_row->replica_id,
 		      last_row->lsn);
 finish:
diff --git a/src/box/applier.h b/src/box/applier.h
index 15ca1fcfd..00afa7247 100644
--- a/src/box/applier.h
+++ b/src/box/applier.h
@@ -133,6 +133,20 @@ struct applier {
 	struct vclock remote_vclock_at_subscribe;
 };
 
+/**
+ * WAL related statistics.
+ */
+struct applier_wal_stat {
+	/** Replica ID initiated a transaction. */
+	uint32_t replica_id;
+	/**
+	 * Timestamp of a transaction to be accounted
+	 * for relay lag. Usually it is a first row in
+	 * a transaction.
+	 */
+	double txn_start_tm;
+};
+
 /**
  * Start a client to a remote master using a background fiber.
  *
diff --git a/src/box/replication.cc b/src/box/replication.cc
index aefb812b3..c97c1fc04 100644
--- a/src/box/replication.cc
+++ b/src/box/replication.cc
@@ -184,6 +184,7 @@ replica_new(void)
 	trigger_create(&replica->on_applier_state,
 		       replica_on_applier_state_f, NULL, NULL);
 	replica->applier_sync_state = APPLIER_DISCONNECTED;
+	replica->applier_txn_start_tm = 0;
 	latch_create(&replica->order_latch);
 	return replica;
 }
diff --git a/src/box/replication.h b/src/box/replication.h
index 2ad1cbf66..d9817d4ff 100644
--- a/src/box/replication.h
+++ b/src/box/replication.h
@@ -331,6 +331,11 @@ struct replica {
 	 * separate from applier.
 	 */
 	enum applier_state applier_sync_state;
+	/**
+	 * Applier's last written to WAL transaction timestamp.
+	 * Needed for relay lagging statistics.
+	 */
+	double applier_txn_start_tm;
 	/* The latch is used to order replication requests. */
 	struct latch order_latch;
 };
-- 
2.31.1


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Tarantool-patches] [RFC v7 2/2] relay: provide information about downstream lag
  2021-06-04 17:06 [Tarantool-patches] [RFC v7 0/2] relay: provide downstream lag information Cyrill Gorcunov via Tarantool-patches
  2021-06-04 17:06 ` [Tarantool-patches] [RFC v7 1/2] applier: send transaction's first row WAL time in the applier_writer_f Cyrill Gorcunov via Tarantool-patches
@ 2021-06-04 17:06 ` Cyrill Gorcunov via Tarantool-patches
  1 sibling, 0 replies; 3+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-06-04 17:06 UTC (permalink / raw)
  To: tml; +Cc: Vladislav Shpilevoy

We already have `box.replication.upstream.lag` entry for monitoring
sake. Same time in synchronous replication timeouts are key properties
for quorum gathering procedure. Thus we would like to know how long
it took of a transaction to traverse `initiator WAL -> network ->
remote applier -> ACK` path.

Typical ouput is

 | tarantool> box.info.replication[2].downstream
 | ---
 | - status: follow
 |   idle: 0.61753897101153
 |   vclock: {1: 147}
 |   lag: 0
 | ...
 | tarantool> box.space.sync:insert{69}
 | ---
 | - [69]
 | ...
 |
 | tarantool> box.info.replication[2].downstream
 | ---
 | - status: follow
 |   idle: 0.75324084801832
 |   vclock: {1: 151}
 |   lag: 0.0011014938354492
 | ...

Closes #5447

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

@TarantoolBot document
Title: Add `box.info.replication[n].downstream.lag` entry

`replication[n].downstream.lag` is the time difference between
last transactions been written to the WAL journal of the transaction
initiator and the transaction written to WAL on the `n` replica.

In other words this is a lag in seconds between the main node writes
data to own WAL and replica `n` get this data replicated to the own
WAL journal.
---
 .../unreleased/gh-5447-downstream-lag.md      |  6 ++
 src/box/lua/info.c                            |  3 +
 src/box/relay.cc                              | 51 ++++++++++
 src/box/relay.h                               |  6 ++
 .../replication/gh-5447-downstream-lag.result | 93 +++++++++++++++++++
 .../gh-5447-downstream-lag.test.lua           | 41 ++++++++
 6 files changed, 200 insertions(+)
 create mode 100644 changelogs/unreleased/gh-5447-downstream-lag.md
 create mode 100644 test/replication/gh-5447-downstream-lag.result
 create mode 100644 test/replication/gh-5447-downstream-lag.test.lua

diff --git a/changelogs/unreleased/gh-5447-downstream-lag.md b/changelogs/unreleased/gh-5447-downstream-lag.md
new file mode 100644
index 000000000..726175c6c
--- /dev/null
+++ b/changelogs/unreleased/gh-5447-downstream-lag.md
@@ -0,0 +1,6 @@
+#feature/replication
+
+ * Introduced `box.info.replication[n].downstream.lag` field to monitor
+   state of replication. This member represents time spent between
+   transaction been written to initiator's WAL file and reached WAL
+   file of a replica (gh-5447).
diff --git a/src/box/lua/info.c b/src/box/lua/info.c
index 0eb48b823..f201b25e3 100644
--- a/src/box/lua/info.c
+++ b/src/box/lua/info.c
@@ -143,6 +143,9 @@ lbox_pushrelay(lua_State *L, struct relay *relay)
 		lua_pushnumber(L, ev_monotonic_now(loop()) -
 			       relay_last_row_time(relay));
 		lua_settable(L, -3);
+		lua_pushstring(L, "lag");
+		lua_pushnumber(L, relay_txn_lag(relay));
+		lua_settable(L, -3);
 		break;
 	case RELAY_STOPPED:
 	{
diff --git a/src/box/relay.cc b/src/box/relay.cc
index b1571b361..18c1ac06b 100644
--- a/src/box/relay.cc
+++ b/src/box/relay.cc
@@ -158,6 +158,19 @@ struct relay {
 	struct stailq pending_gc;
 	/** Time when last row was sent to peer. */
 	double last_row_time;
+	/**
+	 * A time difference between the moment when we
+	 * wrote a transaction to the local WAL and when
+	 * this transaction has been replicated to remote
+	 * node (ie written to node's WAL).
+	 */
+	double txn_lag;
+	/**
+	 * Last timestamp observed from remote node to
+	 * persist @a txn_lag value.
+	 */
+	double txn_acked_tm;
+
 	/** Relay sync state. */
 	enum relay_state state;
 
@@ -217,6 +230,12 @@ relay_last_row_time(const struct relay *relay)
 	return relay->last_row_time;
 }
 
+double
+relay_txn_lag(const struct relay *relay)
+{
+	return relay->txn_lag;
+}
+
 static void
 relay_send(struct relay *relay, struct xrow_header *packet);
 static void
@@ -284,6 +303,16 @@ relay_start(struct relay *relay, int fd, uint64_t sync,
 	relay->state = RELAY_FOLLOW;
 	relay->row_count = 0;
 	relay->last_row_time = ev_monotonic_now(loop());
+	relay->txn_lag = 0;
+	/*
+	 * We assume that previously written rows in WAL
+	 * are older than current node real time which allows
+	 * to simplify @a txn_lag calculation. In worst
+	 * scenario when runtime has been adjusted backwards
+	 * between restart we simply get some big value in
+	 * @a txn_lag until next transaction get replicated.
+	 */
+	relay->txn_acked_tm = ev_now(loop());
 }
 
 void
@@ -336,6 +365,8 @@ relay_stop(struct relay *relay)
 	 * upon cord_create().
 	 */
 	relay->cord.id = 0;
+	relay->txn_lag = 0;
+	relay->txn_acked_tm = ev_now(loop());
 }
 
 void
@@ -629,6 +660,26 @@ relay_reader_f(va_list ap)
 			/* vclock is followed while decoding, zeroing it. */
 			vclock_create(&relay->recv_vclock);
 			xrow_decode_vclock_xc(&xrow, &relay->recv_vclock);
+			/*
+			 * Replica send us last replicated transaction
+			 * timestamp which is needed for relay lag
+			 * monitoring. Note that this transaction has
+			 * been written to WAL with our current realtime
+			 * clock value, thus when it get reported back we
+			 * can compute time spent regardless of the clock
+			 * value on remote replica.
+			 *
+			 * An interesting moment is replica restart - it will
+			 * send us value 0 after that but we can preserve
+			 * old reported value here since we *assume* that
+			 * timestamp is not going backwards on properly
+			 * set up nodes, otherwise the lag get raised.
+			 * After all this is a not tamper-proof value.
+			 */
+			if (relay->txn_acked_tm < xrow.tm) {
+				relay->txn_acked_tm = xrow.tm;
+				relay->txn_lag = ev_now(loop()) - xrow.tm;
+			}
 			fiber_cond_signal(&relay->reader_cond);
 		}
 	} catch (Exception *e) {
diff --git a/src/box/relay.h b/src/box/relay.h
index b32e2ea2a..615ffb75d 100644
--- a/src/box/relay.h
+++ b/src/box/relay.h
@@ -93,6 +93,12 @@ relay_vclock(const struct relay *relay);
 double
 relay_last_row_time(const struct relay *relay);
 
+/**
+ * Returns relay's transaction's lag.
+ */
+double
+relay_txn_lag(const struct relay *relay);
+
 /**
  * Send a Raft update request to the relay channel. It is not
  * guaranteed that it will be delivered. The connection may break.
diff --git a/test/replication/gh-5447-downstream-lag.result b/test/replication/gh-5447-downstream-lag.result
new file mode 100644
index 000000000..8586d0ed3
--- /dev/null
+++ b/test/replication/gh-5447-downstream-lag.result
@@ -0,0 +1,93 @@
+-- test-run result file version 2
+--
+-- gh-5447: Test for box.info.replication[n].downstream.lag.
+-- We need to be sure that if replica start been back of
+-- master node reports own lagging and cluster admin would
+-- be able to detect such situation.
+--
+
+fiber = require('fiber')
+ | ---
+ | ...
+test_run = require('test_run').new()
+ | ---
+ | ...
+engine = test_run:get_cfg('engine')
+ | ---
+ | ...
+
+box.schema.user.grant('guest', 'replication')
+ | ---
+ | ...
+
+test_run:cmd('create server replica with rpl_master=default, \
+             script="replication/replica.lua"')
+ | ---
+ | - true
+ | ...
+test_run:cmd('start server replica')
+ | ---
+ | - true
+ | ...
+
+s = box.schema.space.create('test', {engine = engine})
+ | ---
+ | ...
+_ = s:create_index('pk')
+ | ---
+ | ...
+
+--
+-- The replica should wait some time (wal delay is 1 second
+-- by default) so we would be able to detect the lag, since
+-- on local instances the lag is minimal and usually transactions
+-- are handled instantly.
+test_run:switch('replica')
+ | ---
+ | - true
+ | ...
+box.error.injection.set("ERRINJ_WAL_DELAY", true)
+ | ---
+ | - ok
+ | ...
+
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+box.space.test:insert({1})
+ | ---
+ | - [1]
+ | ...
+test_run:wait_cond(function() return box.info.replication[2].downstream.lag ~= 0 end, 10)
+ | ---
+ | - true
+ | ...
+
+test_run:switch('replica')
+ | ---
+ | - true
+ | ...
+box.error.injection.set("ERRINJ_WAL_DELAY", false)
+ | ---
+ | - ok
+ | ...
+--
+-- Cleanup everything.
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+
+test_run:cmd('stop server replica')
+ | ---
+ | - true
+ | ...
+test_run:cmd('cleanup server replica')
+ | ---
+ | - true
+ | ...
+test_run:cmd('delete server replica')
+ | ---
+ | - true
+ | ...
diff --git a/test/replication/gh-5447-downstream-lag.test.lua b/test/replication/gh-5447-downstream-lag.test.lua
new file mode 100644
index 000000000..650e8e215
--- /dev/null
+++ b/test/replication/gh-5447-downstream-lag.test.lua
@@ -0,0 +1,41 @@
+--
+-- gh-5447: Test for box.info.replication[n].downstream.lag.
+-- We need to be sure that if replica start been back of
+-- master node reports own lagging and cluster admin would
+-- be able to detect such situation.
+--
+
+fiber = require('fiber')
+test_run = require('test_run').new()
+engine = test_run:get_cfg('engine')
+
+box.schema.user.grant('guest', 'replication')
+
+test_run:cmd('create server replica with rpl_master=default, \
+             script="replication/replica.lua"')
+test_run:cmd('start server replica')
+
+s = box.schema.space.create('test', {engine = engine})
+_ = s:create_index('pk')
+
+--
+-- The replica should wait some time (wal delay is 1 second
+-- by default) so we would be able to detect the lag, since
+-- on local instances the lag is minimal and usually transactions
+-- are handled instantly.
+test_run:switch('replica')
+box.error.injection.set("ERRINJ_WAL_DELAY", true)
+
+test_run:switch('default')
+box.space.test:insert({1})
+test_run:wait_cond(function() return box.info.replication[2].downstream.lag ~= 0 end, 10)
+
+test_run:switch('replica')
+box.error.injection.set("ERRINJ_WAL_DELAY", false)
+--
+-- Cleanup everything.
+test_run:switch('default')
+
+test_run:cmd('stop server replica')
+test_run:cmd('cleanup server replica')
+test_run:cmd('delete server replica')
-- 
2.31.1


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-06-04 17:07 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-04 17:06 [Tarantool-patches] [RFC v7 0/2] relay: provide downstream lag information Cyrill Gorcunov via Tarantool-patches
2021-06-04 17:06 ` [Tarantool-patches] [RFC v7 1/2] applier: send transaction's first row WAL time in the applier_writer_f Cyrill Gorcunov via Tarantool-patches
2021-06-04 17:06 ` [Tarantool-patches] [RFC v7 2/2] relay: provide information about downstream lag Cyrill Gorcunov via Tarantool-patches

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox