Tarantool development patches archive
From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: tarantool-patches@freelists.org
Cc: kostja@tarantool.org
Subject: [tarantool-patches] [PATCH 3/5] test: speed up swim big cluster failure detection
Date: Fri,  5 Apr 2019 14:57:29 +0300
Message-ID: <13b859581390bf1f202096fbaa2405204fcbb4dc.1554465150.git.v.shpilevoy@tarantool.org>
In-Reply-To: <cover.1554465150.git.v.shpilevoy@tarantool.org>

The test checks that when a member fails in a big cluster, it is
eventually deleted from all instances. But the test takes too
much real time despite using virtual time.

This is because total deletion of a member takes
O(N + ack_timeout * 5) time: N to wait until every member has
pinged the failed one at least once, + 3 * ack_timeout to learn
that it is dead, and + 2 * ack_timeout to drop it. Of course,
this is an upper bound, and usually it is faster, but not by
much. For example, in a cluster of size 50 it easily takes 55
virtual seconds.

In contrast, just learning on every instance that a member is
dead takes O(log(N)) according to the SWIM paper. In the same
test with a 50-instance cluster it takes ~15 virtual seconds to
disseminate the 'dead' status of the failed member to every
instance, and that is even without the dissemination component,
with anti-entropy only.

Looking ahead: in the subsequent patches it is tested that with
the dissemination component it takes only ~6 virtual seconds.

In summary, it is much faster, and loses no test coverage, to
turn off SWIM GC and wait until the failed member looks dead on
all instances.

Part of #3234
---
 test/unit/swim.c            | 52 ++++++++++++++++++++-------------
 test/unit/swim.result       |  5 ++--
 test/unit/swim_test_utils.c | 57 +++++++++++++++++++++++++++++++++++++
 test/unit/swim_test_utils.h | 20 +++++++++++++
 4 files changed, 112 insertions(+), 22 deletions(-)

diff --git a/test/unit/swim.c b/test/unit/swim.c
index 860d3211e..d77225f6c 100644
--- a/test/unit/swim.c
+++ b/test/unit/swim.c
@@ -374,33 +374,45 @@ swim_test_refute(void)
 static void
 swim_test_too_big_packet(void)
 {
-	swim_start_test(2);
+	swim_start_test(3);
 	int size = 50;
+	double ack_timeout = 1;
+	double first_dead_timeout = 20;
+	double everywhere_dead_timeout = size * 3;
+	int drop_id = size / 2;
+
 	struct swim_cluster *cluster = swim_cluster_new(size);
 	for (int i = 1; i < size; ++i)
 		swim_cluster_add_link(cluster, 0, i);
-	is(swim_cluster_wait_fullmesh(cluster, size), 0, "despite S1 can not "\
-	   "send all the %d members in a one packet, fullmesh is eventually "\
-	   "reached", size);
-	swim_cluster_set_ack_timeout(cluster, 1);
-	int drop_id = size / 2;
+
+	is(swim_cluster_wait_fullmesh(cluster, size * 2), 0, "despite S1 can "\
+	   "not send all the %d members in a one packet, fullmesh is "\
+	   "eventually reached", size);
+
+	swim_cluster_set_ack_timeout(cluster, ack_timeout);
 	swim_cluster_set_drop(cluster, drop_id, true);
+	is(swim_cluster_wait_status_anywhere(cluster, drop_id, MEMBER_DEAD,
+					     first_dead_timeout), 0,
+	   "a dead member is detected in time not depending on cluster size");
 	/*
-	 * Dissemination of a detected failure takes long time
-	 * without help of the component, intended for that.
+	 * GC is off to simplify and speed up the checks. Without
+	 * GC it is safe to wait for MEMBER_DEAD everywhere,
+	 * because a member can not be considered dead on one
+	 * instance while already deleted from another. Also,
+	 * total member deletion takes linear time, because a
+	 * member is deleted from an instance only after *that*
+	 * instance itself fails to receive direct acks from the
+	 * member. Deletion and additional pings are not
+	 * triggered when a member's dead status is received
+	 * indirectly via dissemination or anti-entropy.
+	 * Otherwise they could produce linear network load on
+	 * the already weak member.
 	 */
-	double timeout = size * 10;
-	int i = 0;
-	for (; i < size; ++i) {
-		double start = swim_time();
-		if (i != drop_id &&
-		   swim_cluster_wait_status(cluster, i, drop_id,
-					    swim_member_status_MAX, timeout) != 0)
-			break;
-		timeout -= swim_time() - start;
-	}
-	is(i, size, "S%d drops all the packets - it should become dead",
-	   drop_id + 1);
+	swim_cluster_set_gc(cluster, SWIM_GC_OFF);
+	is(swim_cluster_wait_status_everywhere(cluster, drop_id, MEMBER_DEAD,
+					       everywhere_dead_timeout), 0,
+	   "S%d death is eventually learned by everyone", drop_id + 1);
+
 	swim_cluster_delete(cluster);
 	swim_finish_test();
 }
diff --git a/test/unit/swim.result b/test/unit/swim.result
index 904f061f6..3393870c2 100644
--- a/test/unit/swim.result
+++ b/test/unit/swim.result
@@ -94,9 +94,10 @@ ok 8 - subtests
 ok 9 - subtests
 	*** swim_test_basic_gossip: done ***
 	*** swim_test_too_big_packet ***
-    1..2
+    1..3
     ok 1 - despite S1 can not send all the 50 members in a one packet, fullmesh is eventually reached
-    ok 2 - S26 drops all the packets - it should become dead
+    ok 2 - a dead member is detected in time not depending on cluster size
+    ok 3 - S26 death is eventually learned by everyone
 ok 10 - subtests
 	*** swim_test_too_big_packet: done ***
 	*** swim_test_undead ***
diff --git a/test/unit/swim_test_utils.c b/test/unit/swim_test_utils.c
index bb413372c..277a73498 100644
--- a/test/unit/swim_test_utils.c
+++ b/test/unit/swim_test_utils.c
@@ -361,6 +361,39 @@ swim_loop_check_member(struct swim_cluster *cluster, void *data)
 	return true;
 }
 
+/**
+ * Callback to check that a member matches a template on any
+ * instance in the cluster.
+ */
+static bool
+swim_loop_check_member_anywhere(struct swim_cluster *cluster, void *data)
+{
+	struct swim_member_template *t = (struct swim_member_template *) data;
+	for (t->node_id = 0; t->node_id < cluster->size; ++t->node_id) {
+		if (t->node_id != t->member_id &&
+		    swim_loop_check_member(cluster, data))
+			return true;
+	}
+	return false;
+}
+
+/**
+ * Callback to check that a member matches a template on every
+ * instance in the cluster.
+ */
+static bool
+swim_loop_check_member_everywhere(struct swim_cluster *cluster, void *data)
+{
+	struct swim_member_template *t = (struct swim_member_template *) data;
+	for (t->node_id = 0; t->node_id < cluster->size; ++t->node_id) {
+		if (t->node_id != t->member_id &&
+		    !swim_loop_check_member(cluster, data))
+			return false;
+	}
+	return true;
+}
+
+
 int
 swim_cluster_wait_status(struct swim_cluster *cluster, int node_id,
 			 int member_id, enum swim_member_status status,
@@ -383,6 +416,30 @@ swim_cluster_wait_incarnation(struct swim_cluster *cluster, int node_id,
 	return swim_wait_timeout(timeout, cluster, swim_loop_check_member, &t);
 }
 
+int
+swim_cluster_wait_status_anywhere(struct swim_cluster *cluster, int member_id,
+				  enum swim_member_status status,
+				  double timeout)
+{
+	struct swim_member_template t;
+	swim_member_template_create(&t, -1, member_id);
+	swim_member_template_set_status(&t, status);
+	return swim_wait_timeout(timeout, cluster,
+				 swim_loop_check_member_anywhere, &t);
+}
+
+int
+swim_cluster_wait_status_everywhere(struct swim_cluster *cluster, int member_id,
+				    enum swim_member_status status,
+				    double timeout)
+{
+	struct swim_member_template t;
+	swim_member_template_create(&t, -1, member_id);
+	swim_member_template_set_status(&t, status);
+	return swim_wait_timeout(timeout, cluster,
+				 swim_loop_check_member_everywhere, &t);
+}
+
 bool
 swim_error_check_match(const char *msg)
 {
diff --git a/test/unit/swim_test_utils.h b/test/unit/swim_test_utils.h
index d2ef00817..6e99b4879 100644
--- a/test/unit/swim_test_utils.h
+++ b/test/unit/swim_test_utils.h
@@ -118,6 +118,26 @@ swim_cluster_wait_status(struct swim_cluster *cluster, int node_id,
 			 int member_id, enum swim_member_status status,
 			 double timeout);
 
+/**
+ * Wait until a member with id @a member_id is seen with @a status
+ * in the membership table of any instance in @a cluster. At most
+ * @a timeout seconds.
+ */
+int
+swim_cluster_wait_status_anywhere(struct swim_cluster *cluster, int member_id,
+				  enum swim_member_status status,
+				  double timeout);
+
+/**
+ * Wait until a member with id @a member_id is seen with @a status
+ * in the membership table of every instance in @a cluster. At
+ * most @a timeout seconds.
+ */
+int
+swim_cluster_wait_status_everywhere(struct swim_cluster *cluster, int member_id,
+				    enum swim_member_status status,
+				    double timeout);
+
 /**
  * Wait until a member with id @a member_id is seen with @a
  * incarnation in the membership table of a member with id @a
-- 
2.17.2 (Apple Git-113)
