Tarantool development patches archive
From: Vladimir Davydov <vdavydov.dev@gmail.com>
To: kostja@tarantool.org
Cc: tarantool-patches@freelists.org
Subject: [PATCH v2 01/11] box: retrieve instance uuid before starting local recovery
Date: Fri,  8 Jun 2018 20:34:19 +0300
Message-ID: <ce1933bee558935133f908e766a7970c68dcea92.1528478913.git.vdavydov.dev@gmail.com>
In-Reply-To: <cover.1528478913.git.vdavydov.dev@gmail.com>

To find out whether the current instance has fallen too far behind its
peers in the cluster and so needs to be rebootstrapped, we need to
connect it to remote peers before proceeding to local recovery. The
problem is that box.cfg.replication may contain an entry corresponding
to the instance itself, so before connecting we have to start listening
for incoming connections. Since an instance is supposed to send its UUID
in the greeting message, we also have to initialize INSTANCE_UUID early,
before we start local recovery. So this patch makes the memtx engine
constructor not only scan the snapshot directory, but also read the
header of the most recent snapshot to initialize INSTANCE_UUID.

Needed for #461
---
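
The ordering constraint described above can be sketched as a small
standalone model. This is only an illustration, not code from the
patch: struct uuid, struct instance and the instance_*() helpers are
hypothetical names, and refusing to listen while the UUID is nil is a
simplification made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct tt_uuid: 16 raw bytes. */
struct uuid {
	unsigned char bytes[16];
};

/* Hypothetical model of the startup state of one instance. */
struct instance {
	struct uuid uuid;	/* plays the role of INSTANCE_UUID */
	bool is_listening;
	bool is_connected;
	bool is_recovered;
};

static bool
uuid_is_nil(const struct uuid *uuid)
{
	static const struct uuid nil = {{0}};
	return memcmp(uuid, &nil, sizeof(nil)) == 0;
}

/* Engine constructor step: take the UUID from the snapshot header. */
static void
instance_init_uuid(struct instance *inst, const struct uuid *snap_uuid)
{
	assert(inst != NULL);
	/* No snapshot means nothing to read; the UUID stays as is. */
	if (snap_uuid != NULL)
		inst->uuid = *snap_uuid;
}

/* The greeting carries the UUID, so it must be known by now. */
static int
instance_listen(struct instance *inst)
{
	if (uuid_is_nil(&inst->uuid))
		return -1;
	inst->is_listening = true;
	return 0;
}

/* The peer list may name the instance itself, so listen first. */
static int
instance_connect(struct instance *inst)
{
	if (!inst->is_listening)
		return -1;
	inst->is_connected = true;
	return 0;
}

/* Local recovery runs last, after the peers have been contacted. */
static int
instance_recover(struct instance *inst)
{
	if (!inst->is_connected)
		return -1;
	inst->is_recovered = true;
	return 0;
}
```

In this model, calling instance_listen() on a zeroed instance fails,
while instance_init_uuid(), instance_listen(), instance_connect() and
instance_recover() called in that order all succeed.
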
 src/box/box.cc         | 18 ++++++++++--------
 src/box/memtx_engine.c | 21 ++++++++++++++++++++-
 2 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/src/box/box.cc b/src/box/box.cc
index 61bfa117..e1bf3934 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -1839,6 +1839,15 @@ box_cfg_xc(void)
 	}
 	bool is_bootstrap_leader = false;
 	if (last_checkpoint_lsn >= 0) {
+		/* Check instance UUID. */
+		assert(!tt_uuid_is_nil(&INSTANCE_UUID));
+		if (!tt_uuid_is_nil(&instance_uuid) &&
+		    !tt_uuid_is_equal(&instance_uuid, &INSTANCE_UUID)) {
+			tnt_raise(ClientError, ER_INSTANCE_UUID_MISMATCH,
+				  tt_uuid_str(&instance_uuid),
+				  tt_uuid_str(&INSTANCE_UUID));
+		}
+
 		struct wal_stream wal_stream;
 		wal_stream_create(&wal_stream, cfg_geti64("rows_per_wal"));
 
@@ -1882,7 +1891,6 @@ box_cfg_xc(void)
 				      cfg_getd("wal_dir_rescan_delay"));
 		title("hot_standby");
 
-		assert(!tt_uuid_is_nil(&INSTANCE_UUID));
 		/*
 		 * Leave hot standby mode, if any, only
 		 * after acquiring the lock.
@@ -1902,13 +1910,7 @@ box_cfg_xc(void)
 		recovery_finalize(recovery, &wal_stream.base);
 		engine_end_recovery_xc();
 
-		/* Check replica set and instance UUID. */
-		if (!tt_uuid_is_nil(&instance_uuid) &&
-		    !tt_uuid_is_equal(&instance_uuid, &INSTANCE_UUID)) {
-			tnt_raise(ClientError, ER_INSTANCE_UUID_MISMATCH,
-				  tt_uuid_str(&instance_uuid),
-				  tt_uuid_str(&INSTANCE_UUID));
-		}
+		/* Check replica set UUID. */
 		if (!tt_uuid_is_nil(&replicaset_uuid) &&
 		    !tt_uuid_is_equal(&replicaset_uuid, &REPLICASET_UUID)) {
 			tnt_raise(ClientError, ER_REPLICASET_UUID_MISMATCH,
diff --git a/src/box/memtx_engine.c b/src/box/memtx_engine.c
index fac84ce1..de9fd1ba 100644
--- a/src/box/memtx_engine.c
+++ b/src/box/memtx_engine.c
@@ -164,7 +164,6 @@ memtx_engine_recover_snapshot(struct memtx_engine *memtx,
 	struct xlog_cursor cursor;
 	if (xlog_cursor_open(&cursor, filename) < 0)
 		return -1;
-	INSTANCE_UUID = cursor.meta.instance_uuid;
 
 	int rc;
 	struct xrow_header row;
@@ -1001,6 +1000,26 @@ memtx_engine_new(const char *snap_dirname, bool force_recovery,
 	if (xdir_scan(&memtx->snap_dir) != 0)
 		goto fail;
 
+	/*
+	 * To check if the instance needs to be rebootstrapped, we
+	 * need to connect it to remote peers before proceeding to
+	 * local recovery. In order to do that, we have to start
+	 * listening for incoming connections, because one of remote
+	 * peers may be self. This, in turn, requires us to know the
+	 * instance UUID, as it is a part of a greeting message.
+	 * So if the local directory isn't empty, read the snapshot
+	 * signature right now to initialize the instance UUID.
+	 */
+	int64_t snap_signature = xdir_last_vclock(&memtx->snap_dir, NULL);
+	if (snap_signature >= 0) {
+		struct xlog_cursor cursor;
+		if (xdir_open_cursor(&memtx->snap_dir,
+				     snap_signature, &cursor) != 0)
+			goto fail;
+		INSTANCE_UUID = cursor.meta.instance_uuid;
+		xlog_cursor_close(&cursor, false);
+	}
+
 	stailq_create(&memtx->gc_queue);
 	memtx->gc_fiber = fiber_new("memtx.gc", memtx_engine_gc_f);
 	if (memtx->gc_fiber == NULL)
-- 
2.11.0
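
For readers without the Tarantool tree at hand, the two pieces of logic
touched by the diff (taking the UUID from the newest snapshot, then
comparing it with the configured one) can be modelled in isolation.
This is a minimal sketch under stated assumptions: struct uuid,
struct snap_meta, snap_dir_last(), snap_dir_read_uuid() and
check_instance_uuid() are hypothetical names, and an in-memory array
stands in for the snapshot directory that the patch accesses through
the xdir and xlog cursor calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct tt_uuid: 16 raw bytes. */
struct uuid {
	unsigned char bytes[16];
};

/* Hypothetical model of one snapshot: signature plus header UUID. */
struct snap_meta {
	int64_t signature;
	struct uuid instance_uuid;
};

static bool
uuid_is_nil(const struct uuid *uuid)
{
	static const struct uuid nil = {{0}};
	return memcmp(uuid, &nil, sizeof(nil)) == 0;
}

static bool
uuid_is_equal(const struct uuid *a, const struct uuid *b)
{
	return memcmp(a, b, sizeof(*a)) == 0;
}

/*
 * Return the index of the snapshot with the largest signature,
 * or -1 if there are no snapshots. The -1 case corresponds to
 * the "snap_signature >= 0" test failing in the patch.
 */
static int
snap_dir_last(const struct snap_meta *snaps, size_t count)
{
	int last = -1;
	for (size_t i = 0; i < count; i++) {
		if (last < 0 || snaps[i].signature > snaps[last].signature)
			last = (int)i;
	}
	return last;
}

/*
 * Copy the UUID from the header of the newest snapshot.
 * Returns false and leaves *uuid untouched if there is none.
 */
static bool
snap_dir_read_uuid(const struct snap_meta *snaps, size_t count,
		   struct uuid *uuid)
{
	int last = snap_dir_last(snaps, count);
	if (last < 0)
		return false;
	*uuid = snaps[last].instance_uuid;
	return true;
}

/*
 * Same shape as the check moved to the top of the recovery
 * branch in box.cc: a nil configured UUID means "not set" and
 * always passes. Returns 0 on success, -1 on mismatch.
 */
static int
check_instance_uuid(const struct uuid *configured,
		    const struct uuid *recovered)
{
	assert(!uuid_is_nil(recovered));
	if (!uuid_is_nil(configured) &&
	    !uuid_is_equal(configured, recovered))
		return -1;
	return 0;
}
```

With this split, the mismatch check depends only on the snapshot header,
so in the sketch it can run before any rows are replayed, which is the
property the box.cc hunk relies on.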

Thread overview: 34+ messages
2018-06-08 17:34 [PATCH v2 00/11] Replica rejoin Vladimir Davydov
2018-06-08 17:34 ` Vladimir Davydov [this message]
2018-06-08 17:51   ` [PATCH v2 01/11] box: retrieve instance uuid before starting local recovery Konstantin Osipov
2018-06-08 17:34 ` [PATCH v2 02/11] box: refactor hot standby recovery Vladimir Davydov
2018-06-08 17:34 ` [PATCH v2 03/11] box: retrieve end vclock before starting local recovery Vladimir Davydov
2018-06-14 12:58   ` Konstantin Osipov
2018-06-08 17:34 ` [PATCH v2 04/11] box: open the port " Vladimir Davydov
2018-06-13 20:43   ` Konstantin Osipov
2018-06-14  8:31     ` Vladimir Davydov
2018-06-14 12:59       ` Konstantin Osipov
2018-06-15 15:48         ` [PATCH 0/3] Speed up recovery in case rebootstrap is not needed Vladimir Davydov
2018-06-15 15:48           ` [PATCH 1/3] xlog: erase eof marker when reopening existing file for writing Vladimir Davydov
2018-06-27 17:09             ` Konstantin Osipov
2018-06-15 15:48           ` [PATCH 2/3] wal: rollback vclock on write failure Vladimir Davydov
2018-06-27 17:22             ` Konstantin Osipov
2018-06-15 15:48           ` [PATCH 3/3] wal: create empty xlog on shutdown Vladimir Davydov
2018-06-27 17:29             ` Konstantin Osipov
2018-06-08 17:34 ` [PATCH v2 05/11] box: connect to remote peers before starting local recovery Vladimir Davydov
2018-06-13 20:45   ` Konstantin Osipov
2018-06-14  8:34     ` Vladimir Davydov
2018-06-14 12:59       ` Konstantin Osipov
2018-06-08 17:34 ` [PATCH v2 06/11] box: factor out local recovery function Vladimir Davydov
2018-06-13 20:50   ` Konstantin Osipov
2018-06-08 17:34 ` [PATCH v2 07/11] applier: inquire oldest vclock on connect Vladimir Davydov
2018-06-13 20:51   ` Konstantin Osipov
2018-06-14  8:40     ` Vladimir Davydov
2018-06-08 17:34 ` [PATCH v2 08/11] replication: rebootstrap instance on startup if it fell behind Vladimir Davydov
2018-06-13 20:55   ` Konstantin Osipov
2018-06-14  8:58     ` Vladimir Davydov
2018-06-08 17:34 ` [PATCH v2 09/11] vinyl: simplify vylog recovery from backup Vladimir Davydov
2018-06-08 17:34 ` [PATCH v2 10/11] vinyl: pass flags to vy_recovery_new Vladimir Davydov
2018-06-13 20:56   ` Konstantin Osipov
2018-06-08 17:34 ` [PATCH v2 11/11] vinyl: implement rebootstrap support Vladimir Davydov
2018-06-10 12:02   ` Vladimir Davydov
