From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: tarantool-patches@freelists.org
Cc: kostja@tarantool.org
Subject: [tarantool-patches] [PATCH v2 4/6] test: speed up swim big cluster failure detection
Date: Tue, 9 Apr 2019 14:46:35 +0300
Message-ID: <78c5e189d17dd6d75cbf0a6865f56482e307de82.1554809200.git.v.shpilevoy@tarantool.org>
In-Reply-To: <cover.1554809200.git.v.shpilevoy@tarantool.org>

The test checks that if a member fails in a big cluster, it is
eventually deleted from all instances. But the test takes too much
real time despite using virtual time. This is because total
deletion of a member takes O(N + ack_timeout * 5) time: N to wait
until every member has pinged the failed one at least once,
+ 3 * ack_timeout to learn that it is dead, and + 2 * ack_timeout
to drop it. Of course, this is an upper bound, and usually it is
faster, but not by much. For example, on a cluster of size 50 it
easily takes 55 virtual seconds.

In contrast, just learning that a member is dead on every instance
takes O(log(N)) according to the SWIM paper. In the same test with
a 50-instance cluster it takes ~15 virtual seconds to disseminate
the 'dead' status of the failed member to every instance, and that
is even without the dissemination component, with anti-entropy
only. Looking ahead, it was tested with the subsequent patches that
with the dissemination component it takes only ~6 virtual seconds.

In summary, without losing test coverage it is much faster to turn
off SWIM GC and wait until the failed member looks dead on all
instances.

Part of #3234
---
 test/unit/swim.c            | 52 ++++++++++++++++++++-------------
 test/unit/swim.result       |  5 ++--
 test/unit/swim_test_utils.c | 57 +++++++++++++++++++++++++++++++++++++
 test/unit/swim_test_utils.h | 20 +++++++++++++
 4 files changed, 112 insertions(+), 22 deletions(-)

diff --git a/test/unit/swim.c b/test/unit/swim.c
index 5b7b08ae1..2542eac1d 100644
--- a/test/unit/swim.c
+++ b/test/unit/swim.c
@@ -377,33 +377,45 @@ swim_test_refute(void)
 static void
 swim_test_too_big_packet(void)
 {
-	swim_test_start(2);
+	swim_test_start(3);
 	int size = 50;
+	double ack_timeout = 1;
+	double first_dead_timeout = 20;
+	double everywhere_dead_timeout = size * 3;
+	int drop_id = size / 2;
+
 	struct swim_cluster *cluster = swim_cluster_new(size);
 	for (int i = 1; i < size; ++i)
 		swim_cluster_add_link(cluster, 0, i);
-	is(swim_cluster_wait_fullmesh(cluster, size), 0, "despite S1 can not "\
-	   "send all the %d members in a one packet, fullmesh is eventually "\
-	   "reached", size);
-	swim_cluster_set_ack_timeout(cluster, 1);
-	int drop_id = size / 2;
+
+	is(swim_cluster_wait_fullmesh(cluster, size * 2), 0, "despite S1 can "\
+	   "not send all the %d members in a one packet, fullmesh is "\
+	   "eventually reached", size);
+
+	swim_cluster_set_ack_timeout(cluster, ack_timeout);
 	swim_cluster_set_drop(cluster, drop_id, true);
+	is(swim_cluster_wait_status_anywhere(cluster, drop_id, MEMBER_DEAD,
+					     first_dead_timeout), 0,
+	   "a dead member is detected in time not depending on cluster size");
 	/*
-	 * Dissemination of a detected failure takes long time
-	 * without help of the component, intended for that.
+	 * GC is off to simplify and speed up checks. When no GC
+	 * the test is sure that it is safe to check for
+	 * MEMBER_DEAD everywhere, because it is impossible that a
+	 * member is considered dead in one place, but already
+	 * deleted on another. Also, total member deletion takes
+	 * linear time, because a member is deleted from an
+	 * instance only when *that* instance will not receive
+	 * some direct acks from the member. Deletion and
+	 * additional pings are not triggered if a member dead
+	 * status is received indirectly via dissemination or
+	 * anti-entropy. Otherwise it could produce linear network
+	 * load on the already weak member.
 	 */
-	double timeout = size * 10;
-	int i = 0;
-	for (; i < size; ++i) {
-		double start = swim_time();
-		if (i != drop_id &&
-		    swim_cluster_wait_status(cluster, i, drop_id,
-					     swim_member_status_MAX, timeout) != 0)
-			break;
-		timeout -= swim_time() - start;
-	}
-	is(i, size, "S%d drops all the packets - it should become dead",
-	   drop_id + 1);
+	swim_cluster_set_gc(cluster, SWIM_GC_OFF);
+	is(swim_cluster_wait_status_everywhere(cluster, drop_id, MEMBER_DEAD,
+					       everywhere_dead_timeout), 0,
+	   "S%d death is eventually learned by everyone", drop_id + 1);
+
 	swim_cluster_delete(cluster);
 	swim_test_finish();
 }
diff --git a/test/unit/swim.result b/test/unit/swim.result
index 904f061f6..3393870c2 100644
--- a/test/unit/swim.result
+++ b/test/unit/swim.result
@@ -94,9 +94,10 @@ ok 8 - subtests
 ok 9 - subtests
 	*** swim_test_basic_gossip: done ***
 	*** swim_test_too_big_packet ***
-    1..2
+    1..3
     ok 1 - despite S1 can not send all the 50 members in a one packet, fullmesh is eventually reached
-    ok 2 - S26 drops all the packets - it should become dead
+    ok 2 - a dead member is detected in time not depending on cluster size
+    ok 3 - S26 death is eventually learned by everyone
 ok 10 - subtests
 	*** swim_test_too_big_packet: done ***
 	*** swim_test_undead ***
diff --git a/test/unit/swim_test_utils.c b/test/unit/swim_test_utils.c
index 006c7446f..02149f256 100644
--- a/test/unit/swim_test_utils.c
+++ b/test/unit/swim_test_utils.c
@@ -361,6 +361,39 @@ swim_loop_check_member(struct swim_cluster *cluster, void *data)
 	return true;
 }
 
+/**
+ * Callback to check that a member matches a template on any
+ * instance in the cluster.
+ */
+static bool
+swim_loop_check_member_anywhere(struct swim_cluster *cluster, void *data)
+{
+	struct swim_member_template *t = (struct swim_member_template *) data;
+	for (t->node_id = 0; t->node_id < cluster->size; ++t->node_id) {
+		if (t->node_id != t->member_id &&
+		    swim_loop_check_member(cluster, data))
+			return true;
+	}
+	return false;
+}
+
+/**
+ * Callback to check that a member matches a template on every
+ * instance in the cluster.
+ */
+static bool
+swim_loop_check_member_everywhere(struct swim_cluster *cluster, void *data)
+{
+	struct swim_member_template *t = (struct swim_member_template *) data;
+	for (t->node_id = 0; t->node_id < cluster->size; ++t->node_id) {
+		if (t->node_id != t->member_id &&
+		    !swim_loop_check_member(cluster, data))
+			return false;
+	}
+	return true;
+}
+
+
 int
 swim_cluster_wait_status(struct swim_cluster *cluster, int node_id,
 			 int member_id, enum swim_member_status status,
@@ -383,6 +416,30 @@ swim_cluster_wait_incarnation(struct swim_cluster *cluster, int node_id,
 	return swim_wait_timeout(timeout, cluster, swim_loop_check_member, &t);
 }
 
+int
+swim_cluster_wait_status_anywhere(struct swim_cluster *cluster, int member_id,
+				  enum swim_member_status status,
+				  double timeout)
+{
+	struct swim_member_template t;
+	swim_member_template_create(&t, -1, member_id);
+	swim_member_template_set_status(&t, status);
+	return swim_wait_timeout(timeout, cluster,
+				 swim_loop_check_member_anywhere, &t);
+}
+
+int
+swim_cluster_wait_status_everywhere(struct swim_cluster *cluster, int member_id,
+				    enum swim_member_status status,
+				    double timeout)
+{
+	struct swim_member_template t;
+	swim_member_template_create(&t, -1, member_id);
+	swim_member_template_set_status(&t, status);
+	return swim_wait_timeout(timeout, cluster,
+				 swim_loop_check_member_everywhere, &t);
+}
+
 bool
 swim_test_error_check_match(const char *msg)
 {
diff --git a/test/unit/swim_test_utils.h b/test/unit/swim_test_utils.h
index 4a1ff4cb8..13781d037 100644
--- a/test/unit/swim_test_utils.h
+++ b/test/unit/swim_test_utils.h
@@ -118,6 +118,26 @@ swim_cluster_wait_status(struct swim_cluster *cluster, int node_id,
 			 int member_id, enum swim_member_status status,
 			 double timeout);
 
+/**
+ * Wait until a member with id @a member_id is seen with @a status
+ * in the membership table of any instance in @a cluster. At most
+ * @a timeout seconds.
+ */
+int
+swim_cluster_wait_status_anywhere(struct swim_cluster *cluster, int member_id,
+				  enum swim_member_status status,
+				  double timeout);
+
+/**
+ * Wait until a member with id @a member_id is seen with @a status
+ * in the membership table of every instance in @a cluster. At
+ * most @a timeout seconds.
+ */
+int
+swim_cluster_wait_status_everywhere(struct swim_cluster *cluster, int member_id,
+				    enum swim_member_status status,
+				    double timeout);
+
 /**
  * Wait until a member with id @a member_id is seen with @a
  * incarnation in the membership table of a member with id @a
-- 
2.17.2 (Apple Git-113)
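
For readers without the Tarantool tree at hand, the "anywhere" and "everywhere" checks from the patch can be illustrated with a toy model. This is only a sketch under stated assumptions: every `toy_*` name is hypothetical and not part of the SWIM test utilities, the cluster is reduced to a plain table of statuses, and there is no event loop or virtual time. The last function simply evaluates the worst-case estimate quoted in the commit message (N + 3 * ack_timeout + 2 * ack_timeout).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the real SWIM types, for illustration only. */
enum toy_status { TOY_ALIVE, TOY_DEAD };

enum { TOY_SIZE_MAX = 64 };

/* view[i][j] is the status of member j as seen by instance i. */
struct toy_cluster {
	int size;
	enum toy_status view[TOY_SIZE_MAX][TOY_SIZE_MAX];
};

/* Start with a cluster where everybody considers everybody alive. */
static void
toy_cluster_create(struct toy_cluster *c, int size)
{
	assert(size > 0 && size <= TOY_SIZE_MAX);
	c->size = size;
	for (int i = 0; i < size; ++i) {
		for (int j = 0; j < size; ++j)
			c->view[i][j] = TOY_ALIVE;
	}
}

/*
 * True if at least one instance other than the member itself sees
 * the member with the given status.
 */
static bool
toy_status_anywhere(const struct toy_cluster *c, int member_id,
		    enum toy_status status)
{
	for (int i = 0; i < c->size; ++i) {
		if (i != member_id && c->view[i][member_id] == status)
			return true;
	}
	return false;
}

/*
 * True if all instances other than the member itself see the
 * member with the given status.
 */
static bool
toy_status_everywhere(const struct toy_cluster *c, int member_id,
		      enum toy_status status)
{
	for (int i = 0; i < c->size; ++i) {
		if (i != member_id && c->view[i][member_id] != status)
			return false;
	}
	return true;
}

/*
 * Worst-case full deletion time as estimated in the commit message:
 * N to get pinged by everyone, 3 acks to be declared dead, 2 more
 * to be dropped.
 */
static double
toy_full_deletion_bound(int size, double ack_timeout)
{
	return size + 3 * ack_timeout + 2 * ack_timeout;
}
```

With size = 50 and ack_timeout = 1 the estimate evaluates to 55, which matches the 55 virtual seconds mentioned in the commit message. Both checks skip the member's own row, as the callbacks in the patch do with `t->node_id != t->member_id`.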
Thread overview: 8+ messages
2019-04-09 11:46 [tarantool-patches] [PATCH v2 0/6] swim dissemination Vladislav Shpilevoy
2019-04-09 11:46 ` [tarantool-patches] [PATCH v2 1/6] swim: encapsulate member bin info into a 'passport' Vladislav Shpilevoy
2019-04-09 11:46 ` [tarantool-patches] [PATCH v2 2/6] swim: make members array decoder be a separate function Vladislav Shpilevoy
2019-04-09 11:46 ` [tarantool-patches] [PATCH v2 3/6] test: rename some swim test methods and macros Vladislav Shpilevoy
2019-04-09 11:46 ` Vladislav Shpilevoy [this message]
2019-04-09 11:46 ` [tarantool-patches] [PATCH v2 5/6] test: set packet drop rate instead of flag in swim tests Vladislav Shpilevoy
2019-04-09 11:46 ` [tarantool-patches] [PATCH v2 6/6] swim: introduce dissemination component Vladislav Shpilevoy
2019-04-09 13:47 ` [tarantool-patches] " Konstantin Osipov