From: Konstantin Belyavskiy <k.belyavskiy@tarantool.org>
To: georgy@tarantool.org, kostja@tarantool.org
Cc: tarantool-patches@freelists.org
Subject: [tarantool-patches] [PATCH 3/3] replication: display downstream status at upstream
Date: Wed, 16 May 2018 14:32:27 +0300
Message-ID: <566d0d43f6dc7126408d67c34e29369c5e3598c0.1526469555.git.k.belyavskiy@tarantool.org>
In-Reply-To: <cover.1526469555.git.k.belyavskiy@tarantool.org>

Improve 'box.info.replication' output.
If a downstream fails and therefore disconnects from its upstream,
report 'status: stopped' and the error message in the 'downstream'
section on the master, so that the failure is visible on both
sides (master and replica).

Closes #3365
---
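Note for reviewers: below is a minimal sketch of how the new
'downstream' fields might be consumed. The helper name
stopped_downstreams is hypothetical and not part of this patch; it
only assumes a table shaped like box.info.replication, where a
stopped relay is reported as downstream.status == 'stopped' with an
optional downstream.message.

```lua
-- Hypothetical helper, not part of this patch: given a table shaped
-- like box.info.replication, collect the peers whose relay has stopped.
local function stopped_downstreams(replication)
    local result = {}
    for id, peer in pairs(replication) do
        local ds = peer.downstream
        if ds ~= nil and ds.status == 'stopped' then
            -- 'message' is set only if the relay recorded an error.
            table.insert(result, {id = id, message = ds.message})
        end
    end
    -- pairs() order is unspecified, so sort by replica id.
    table.sort(result, function(a, b) return a.id < b.id end)
    return result
end
```

On a running instance this would presumably be called as
stopped_downstreams(box.info.replication).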
 src/box/lua/info.c                                 |  17 +++
 src/box/relay.cc                                   |   8 ++
 src/box/relay.h                                    |   3 +
 test/replication/show_error_on_disconnect.result   | 120 +++++++++++++++++++++
 test/replication/show_error_on_disconnect.test.lua |  38 +++++++
 5 files changed, 186 insertions(+)
 create mode 100644 test/replication/show_error_on_disconnect.result
 create mode 100644 test/replication/show_error_on_disconnect.test.lua

diff --git a/src/box/lua/info.c b/src/box/lua/info.c
index 9dbc3f92c..8f358d04e 100644
--- a/src/box/lua/info.c
+++ b/src/box/lua/info.c
@@ -148,6 +148,23 @@ lbox_pushreplica(lua_State *L, struct replica *replica)
 	if (relay_get_state(replica->relay) == RELAY_FOLLOW) {
 		lua_pushstring(L, "downstream");
 		lbox_pushrelay(L, relay);
+		lua_settable(L, -3);
+	} else if (relay_get_state(replica->relay) == RELAY_STOPPED) {
+		lua_pushstring(L, "downstream");
+
+		lua_newtable(L);
+		lua_pushstring(L, "status");
+		lua_pushstring(L, "stopped");
+		lua_settable(L, -3);
+
+		assert(replica->relay);
+		struct error *e = diag_last_error(relay_get_diag(replica->relay));
+		if (e != NULL) {
+			lua_pushstring(L, "message");
+			lua_pushstring(L, e->errmsg);
+			lua_settable(L, -3);
+		}
+
 		lua_settable(L, -3);
 	}
 }
diff --git a/src/box/relay.cc b/src/box/relay.cc
index 49835bcb2..92dcd68ba 100644
--- a/src/box/relay.cc
+++ b/src/box/relay.cc
@@ -142,6 +142,12 @@ struct relay {
 	} tx;
 };
 
+struct diag*
+relay_get_diag(struct relay *relay)
+{
+	return &relay->diag;
+}
+
 enum relay_state
 relay_get_state(const struct relay *relay)
 {
@@ -536,6 +542,8 @@ relay_subscribe_f(va_list ap)
 	if (!diag_is_empty(&relay->diag)) {
 		/* An error has occurred while reading ACKs of xlog. */
 		diag_move(&relay->diag, diag_get());
+		/* Reference the diag in the status. */
+		diag_add_error(&relay->diag, diag_last_error(diag_get()));
 	}
 	struct errinj *inj = errinj(ERRINJ_RELAY_EXIT_DELAY, ERRINJ_DOUBLE);
 	if (inj != NULL && inj->dparam > 0)
diff --git a/src/box/relay.h b/src/box/relay.h
index e20d4cd13..3fd83bc53 100644
--- a/src/box/relay.h
+++ b/src/box/relay.h
@@ -63,6 +63,9 @@ relay_new(void);
 void
 relay_destroy(struct relay *relay);
 
+struct diag*
+relay_get_diag(struct relay *relay);
+
 enum relay_state
 relay_get_state(const struct relay *relay);
 
diff --git a/test/replication/show_error_on_disconnect.result b/test/replication/show_error_on_disconnect.result
new file mode 100644
index 000000000..c5a91c004
--- /dev/null
+++ b/test/replication/show_error_on_disconnect.result
@@ -0,0 +1,120 @@
+--
+-- gh-3365: display an error in upstream on downstream failure.
+-- Create a gap in LSN to cause replica's failure.
+-- The goal here is to see same error message on both side.
+--
+test_run = require('test_run').new()
+---
+...
+SERVERS = {'master_quorum1', 'master_quorum2'}
+---
+...
+-- Deploy a cluster.
+test_run:create_cluster(SERVERS)
+---
+...
+test_run:wait_fullmesh(SERVERS)
+---
+...
+test_run:cmd("switch master_quorum1")
+---
+- true
+...
+repl = box.cfg.replication
+---
+...
+box.cfg{replication = ""}
+---
+...
+test_run:cmd("switch master_quorum2")
+---
+- true
+...
+box.space.test:insert{1}
+---
+- [1]
+...
+box.snapshot()
+---
+- ok
+...
+box.space.test:insert{2}
+---
+- [2]
+...
+box.snapshot()
+---
+- ok
+...
+test_run:cmd("switch default")
+---
+- true
+...
+fio = require('fio')
+---
+...
+fio.unlink(fio.pathjoin(fio.abspath("."), string.format('master_quorum2/%020d.xlog', 5)))
+---
+- true
+...
+test_run:cmd("switch master_quorum1")
+---
+- true
+...
+box.cfg{replication = repl}
+---
+...
+require('fiber').sleep(0.1)
+---
+...
+box.space.test:select()
+---
+- []
+...
+other_id = box.info.id % 2 + 1
+---
+...
+box.info.replication[other_id].upstream.status
+---
+- stopped
+...
+box.info.replication[other_id].upstream.message:match("Missing")
+---
+- Missing
+...
+test_run:cmd("switch master_quorum2")
+---
+- true
+...
+box.space.test:select()
+---
+- - [1]
+  - [2]
+...
+other_id = box.info.id % 2 + 1
+---
+...
+box.info.replication[other_id].upstream.status
+---
+- follow
+...
+box.info.replication[other_id].upstream.message
+---
+- null
+...
+box.info.replication[other_id].downstream.status
+---
+- stopped
+...
+box.info.replication[other_id].downstream.message:match("Missing")
+---
+- Missing
+...
+test_run:cmd("switch default")
+---
+- true
+...
+-- Cleanup.
+test_run:drop_cluster(SERVERS)
+---
+...
diff --git a/test/replication/show_error_on_disconnect.test.lua b/test/replication/show_error_on_disconnect.test.lua
new file mode 100644
index 000000000..64a750256
--- /dev/null
+++ b/test/replication/show_error_on_disconnect.test.lua
@@ -0,0 +1,38 @@
+--
+-- gh-3365: display an error in upstream on downstream failure.
+-- Create a gap in LSN to cause replica's failure.
+-- The goal here is to see same error message on both side.
+--
+test_run = require('test_run').new()
+SERVERS = {'master_quorum1', 'master_quorum2'}
+-- Deploy a cluster.
+test_run:create_cluster(SERVERS)
+test_run:wait_fullmesh(SERVERS)
+test_run:cmd("switch master_quorum1")
+repl = box.cfg.replication
+box.cfg{replication = ""}
+test_run:cmd("switch master_quorum2")
+box.space.test:insert{1}
+box.snapshot()
+box.space.test:insert{2}
+box.snapshot()
+test_run:cmd("switch default")
+fio = require('fio')
+fio.unlink(fio.pathjoin(fio.abspath("."), string.format('master_quorum2/%020d.xlog', 5)))
+test_run:cmd("switch master_quorum1")
+box.cfg{replication = repl}
+require('fiber').sleep(0.1)
+box.space.test:select()
+other_id = box.info.id % 2 + 1
+box.info.replication[other_id].upstream.status
+box.info.replication[other_id].upstream.message:match("Missing")
+test_run:cmd("switch master_quorum2")
+box.space.test:select()
+other_id = box.info.id % 2 + 1
+box.info.replication[other_id].upstream.status
+box.info.replication[other_id].upstream.message
+box.info.replication[other_id].downstream.status
+box.info.replication[other_id].downstream.message:match("Missing")
+test_run:cmd("switch default")
+-- Cleanup.
+test_run:drop_cluster(SERVERS)
-- 
2.14.3 (Apple Git-98)
