From: Konstantin Osipov <kostja@tarantool.org>
To: tarantool-patches@freelists.org
Subject: [tarantool-patches] Re: [PATCH 3/5] test: speed up swim big cluster failure detection
Date: Tue, 9 Apr 2019 11:43:03 +0300
Message-ID: <20190409084303.GA16539@chai>
In-Reply-To: <13b859581390bf1f202096fbaa2405204fcbb4dc.1554465150.git.v.shpilevoy@tarantool.org>
* Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [19/04/05 16:12]:
The more I look at the tests, the more confusing the test function
names become.
The namespace of the test functions clearly clashes with the main
swim namespace, which makes the tests hard to read and follow:
it's unclear which function belongs to the test harness and which
to swim itself. Please come up with a harness API prefix.
Feel free to do it in a subsequent patch.
> The test checks that if a member has failed in a big cluster, it
> is eventually deleted from all instances. But it takes too much
> real time despite the use of virtual time.
>
> This is because total deletion of a member takes
> O(N + ack_timeout * 5) time: N to wait until every member has
> pinged the failed one at least once, + 3 * ack_timeout to learn
> that it is dead, and + 2 * ack_timeout to drop it. Of course, this
> is an upper bound, and usually it is faster, but not by much. For
> example, on a cluster of size 50 it easily takes 55 virtual
> seconds.
>
> In contrast, just learning that a member is dead on every
> instance takes O(log(N)) according to the SWIM paper. In the
> same test with a 50-instance cluster it takes ~15 virtual seconds
> to disseminate the 'dead' status of the failed member to every
> instance, and that is even without the dissemination component,
> with anti-entropy only.
>
> Looking ahead, the subsequent patches show that with
> the dissemination component it takes only ~6 virtual seconds.
>
> In summary, without losing test coverage it is much faster to
> turn off SWIM GC and wait until the failed member looks dead on
> all instances.
>
> Part of #3234
> ---
> test/unit/swim.c | 52 ++++++++++++++++++++-------------
> test/unit/swim.result | 5 ++--
> test/unit/swim_test_utils.c | 57 +++++++++++++++++++++++++++++++++++++
> test/unit/swim_test_utils.h | 20 +++++++++++++
> 4 files changed, 112 insertions(+), 22 deletions(-)
>
> diff --git a/test/unit/swim.c b/test/unit/swim.c
> index 860d3211e..d77225f6c 100644
> --- a/test/unit/swim.c
> +++ b/test/unit/swim.c
> @@ -374,33 +374,45 @@ swim_test_refute(void)
> static void
> swim_test_too_big_packet(void)
> {
> - swim_start_test(2);
> + swim_start_test(3);
> int size = 50;
> + double ack_timeout = 1;
> + double first_dead_timeout = 20;
> + double everywhere_dead_timeout = size * 3;
> + int drop_id = size / 2;
> +
> struct swim_cluster *cluster = swim_cluster_new(size);
> for (int i = 1; i < size; ++i)
> swim_cluster_add_link(cluster, 0, i);
> - is(swim_cluster_wait_fullmesh(cluster, size), 0, "despite S1 can not "\
> - "send all the %d members in a one packet, fullmesh is eventually "\
> - "reached", size);
> - swim_cluster_set_ack_timeout(cluster, 1);
> - int drop_id = size / 2;
> +
> + is(swim_cluster_wait_fullmesh(cluster, size * 2), 0, "despite S1 can "\
> + "not send all the %d members in a one packet, fullmesh is "\
> + "eventually reached", size);
> +
> + swim_cluster_set_ack_timeout(cluster, ack_timeout);
> swim_cluster_set_drop(cluster, drop_id, true);
> + is(swim_cluster_wait_status_anywhere(cluster, drop_id, MEMBER_DEAD,
> + first_dead_timeout), 0,
> + "a dead member is detected in time not depending on cluster size");
> /*
> - * Dissemination of a detected failure takes long time
> - * without help of the component, intended for that.
> + * GC is off to simplify and speed up checks. When no GC
> + * the test is sure that it is safe to check for
> + * MEMBER_DEAD everywhere, because it is impossible that a
> + * member is considered dead in one place, but already
> + * deleted on another. Also, total member deletion takes
> + * linear time, because a member is deleted from an
> + * instance only when *that* instance will not receive
> + * some direct acks from the member. Deletion and
> + * additional pings are not triggered if a member dead
> + * status is received indirectly via dissemination or
> + * anti-entropy. Otherwise it could produce linear network
> + * load on the already weak member.
> */
> - double timeout = size * 10;
> - int i = 0;
> - for (; i < size; ++i) {
> - double start = swim_time();
> - if (i != drop_id &&
> - swim_cluster_wait_status(cluster, i, drop_id,
> - swim_member_status_MAX, timeout) != 0)
> - break;
> - timeout -= swim_time() - start;
> - }
> - is(i, size, "S%d drops all the packets - it should become dead",
> - drop_id + 1);
> + swim_cluster_set_gc(cluster, SWIM_GC_OFF);
> + is(swim_cluster_wait_status_everywhere(cluster, drop_id, MEMBER_DEAD,
> + everywhere_dead_timeout), 0,
> + "S%d death is eventually learned by everyone", drop_id + 1);
> +
> swim_cluster_delete(cluster);
> swim_finish_test();
> }
> diff --git a/test/unit/swim.result b/test/unit/swim.result
> index 904f061f6..3393870c2 100644
> --- a/test/unit/swim.result
> +++ b/test/unit/swim.result
> @@ -94,9 +94,10 @@ ok 8 - subtests
> ok 9 - subtests
> *** swim_test_basic_gossip: done ***
> *** swim_test_too_big_packet ***
> - 1..2
> + 1..3
> ok 1 - despite S1 can not send all the 50 members in a one packet, fullmesh is eventually reached
> - ok 2 - S26 drops all the packets - it should become dead
> + ok 2 - a dead member is detected in time not depending on cluster size
> + ok 3 - S26 death is eventually learned by everyone
> ok 10 - subtests
> *** swim_test_too_big_packet: done ***
> *** swim_test_undead ***
> diff --git a/test/unit/swim_test_utils.c b/test/unit/swim_test_utils.c
> index bb413372c..277a73498 100644
> --- a/test/unit/swim_test_utils.c
> +++ b/test/unit/swim_test_utils.c
> @@ -361,6 +361,39 @@ swim_loop_check_member(struct swim_cluster *cluster, void *data)
> return true;
> }
>
> +/**
> + * Callback to check that a member matches a template on any
> + * instance in the cluster.
> + */
> +static bool
> +swim_loop_check_member_anywhere(struct swim_cluster *cluster, void *data)
> +{
> + struct swim_member_template *t = (struct swim_member_template *) data;
> + for (t->node_id = 0; t->node_id < cluster->size; ++t->node_id) {
> + if (t->node_id != t->member_id &&
> + swim_loop_check_member(cluster, data))
> + return true;
> + }
> + return false;
> +}
> +
> +/**
> + * Callback to check that a member matches a template on every
> + * instance in the cluster.
> + */
> +static bool
> +swim_loop_check_member_everywhere(struct swim_cluster *cluster, void *data)
> +{
> + struct swim_member_template *t = (struct swim_member_template *) data;
> + for (t->node_id = 0; t->node_id < cluster->size; ++t->node_id) {
> + if (t->node_id != t->member_id &&
> + !swim_loop_check_member(cluster, data))
> + return false;
> + }
> + return true;
> +}
> +
> +
> int
> swim_cluster_wait_status(struct swim_cluster *cluster, int node_id,
> int member_id, enum swim_member_status status,
> @@ -383,6 +416,30 @@ swim_cluster_wait_incarnation(struct swim_cluster *cluster, int node_id,
> return swim_wait_timeout(timeout, cluster, swim_loop_check_member, &t);
> }
>
> +int
> +swim_cluster_wait_status_anywhere(struct swim_cluster *cluster, int member_id,
> + enum swim_member_status status,
> + double timeout)
> +{
> + struct swim_member_template t;
> + swim_member_template_create(&t, -1, member_id);
> + swim_member_template_set_status(&t, status);
> + return swim_wait_timeout(timeout, cluster,
> + swim_loop_check_member_anywhere, &t);
> +}
> +
> +int
> +swim_cluster_wait_status_everywhere(struct swim_cluster *cluster, int member_id,
> + enum swim_member_status status,
> + double timeout)
> +{
> + struct swim_member_template t;
> + swim_member_template_create(&t, -1, member_id);
> + swim_member_template_set_status(&t, status);
> + return swim_wait_timeout(timeout, cluster,
> + swim_loop_check_member_everywhere, &t);
> +}
> +
> bool
> swim_error_check_match(const char *msg)
> {
> diff --git a/test/unit/swim_test_utils.h b/test/unit/swim_test_utils.h
> index d2ef00817..6e99b4879 100644
> --- a/test/unit/swim_test_utils.h
> +++ b/test/unit/swim_test_utils.h
> @@ -118,6 +118,26 @@ swim_cluster_wait_status(struct swim_cluster *cluster, int node_id,
> int member_id, enum swim_member_status status,
> double timeout);
>
> +/**
> + * Wait until a member with id @a member_id is seen with @a status
> + * in the membership table of any instance in @a cluster. At most
> + * @a timeout seconds.
> + */
> +int
> +swim_cluster_wait_status_anywhere(struct swim_cluster *cluster, int member_id,
> + enum swim_member_status status,
> + double timeout);
> +
> +/**
> + * Wait until a member with id @a member_id is seen with @a status
> + * in the membership table of every instance in @a cluster. At
> + * most @a timeout seconds.
> + */
> +int
> +swim_cluster_wait_status_everywhere(struct swim_cluster *cluster, int member_id,
> + enum swim_member_status status,
> + double timeout);
> +
> /**
> * Wait until a member with id @a member_id is seen with @a
> * incarnation in the membership table of a member with id @a
> --
> 2.17.2 (Apple Git-113)
>
--
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov