From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id A3F6D2B274 for ; Wed, 10 Apr 2019 15:19:50 -0400 (EDT) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id nlTRJShLQ3H0 for ; Wed, 10 Apr 2019 15:19:50 -0400 (EDT) Received: from smtpng1.m.smailru.net (smtpng1.m.smailru.net [94.100.181.251]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id DB91C1FD28 for ; Wed, 10 Apr 2019 15:19:49 -0400 (EDT) From: Vladislav Shpilevoy Subject: [tarantool-patches] [PATCH 1/1] swim: on address update increment incarnation Date: Wed, 10 Apr 2019 22:19:47 +0300 Message-Id: Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-Help: List-Unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-Subscribe: List-Owner: List-post: List-Archive: To: tarantool-patches@freelists.org Cc: kostja@tarantool.org In the original SWIM paper the incarnation is just a way of refuting old statuses, nothing more. It is not designed as a versioning system of a member and its non-status attributes. But Tarantool harnesses the incarnation for wider range of tasks. In Tarantool's implementation the incarnation (in theory) refutes old statuses, old payloads, old addresses. But appeared, that before the patch an address update did not touch incarnation. Because of that it was possible to rewrite a new address with the old one back. The patch fixes it with a mere increment of incarnation on each address update. The fix is simple because the current SWIM implementation always carries the tuple {incarnation, status, address} together, as a one big attribute. It is not so for payloads, so for them an analogous fix will be much more tricky. Follow-up for f510dc6f46a96d1ff9898519fd1c73167b38d655 (swim: introduce failure detection component) --- Branch: https://github.com/tarantool/tarantool/tree/gerold103/swim-incarnation-uri-change Issue: https://github.com/tarantool/tarantool/issues/3234 src/lib/swim/swim.c | 13 ++++-- test/unit/swim.c | 70 ++++++++++++++++++++++++++++++++- test/unit/swim.result | 8 +++- test/unit/swim_test_transport.c | 4 +- 4 files changed, 88 insertions(+), 7 deletions(-) diff --git a/src/lib/swim/swim.c b/src/lib/swim/swim.c index c64b8df3a..f2b09ef23 100644 --- a/src/lib/swim/swim.c +++ b/src/lib/swim/swim.c @@ -243,7 +243,11 @@ struct swim_member { * * Failure detection component */ - /** A monotonically growing number to refute old messages. */ + /** + * A monotonically growing number to refute old member's + * state, characterized by a triplet + * {incarnation, status, address}. + */ uint64_t incarnation; /** * How many recent pings did not receive an ack while the @@ -1026,9 +1030,10 @@ swim_check_acks(struct ev_loop *loop, struct ev_timer *t, int events) /** Update member's address.*/ static inline void swim_update_member_addr(struct swim *swim, struct swim_member *member, - const struct sockaddr_in *addr) + const struct sockaddr_in *addr, int incarnation_inc) { if (! swim_sockaddr_in_eq(addr, &member->addr)) { + member->incarnation += incarnation_inc; member->addr = *addr; swim_on_member_update(swim, member); } @@ -1078,7 +1083,7 @@ swim_upsert_member(struct swim *swim, const struct swim_member_def *def, if (member != self) { if (def->incarnation < member->incarnation) goto skip; - swim_update_member_addr(swim, member, &def->addr); + swim_update_member_addr(swim, member, &def->addr, 0); swim_update_member_inc_status(swim, member, def->status, def->incarnation); return 0; @@ -1451,7 +1456,7 @@ swim_cfg(struct swim *swim, const char *uri, double heartbeat_rate, swim_on_member_update(swim, swim->self); swim->self = new_self; } - swim_update_member_addr(swim, swim->self, &addr); + swim_update_member_addr(swim, swim->self, &addr, 1); if (gc_mode != SWIM_GC_DEFAULT) swim->gc_mode = gc_mode; return 0; diff --git a/test/unit/swim.c b/test/unit/swim.c index 499b2ef52..e4c9cc5a6 100644 --- a/test/unit/swim.c +++ b/test/unit/swim.c @@ -552,10 +552,77 @@ swim_test_quit(void) swim_finish_test(); } +static void +swim_test_uri_update(void) +{ + swim_start_test(2); + /* + * The test checks how a member address is updated. There + * is a cluster of 3 members: S1, S2, S3, and links: + * S1 <-> S2, S3 -> S1, S3 -> S2. S1 updates its address. + * The new address is sent to S2 and is updated here. Then + * S3 wakes up and disseminates the old address of S1. + * Member S2 should ignore that old address. It is + * achievable only via new incarnation on each address + * update. + */ + struct swim_cluster *cluster = swim_cluster_new(3); + swim_cluster_interconnect(cluster, 0, 1); + /* + * S3 should not accept packets so as to keep old address + * of S1. + */ + swim_cluster_set_drop(cluster, 2, 100); + swim_cluster_add_link(cluster, 2, 1); + swim_cluster_add_link(cluster, 2, 0); + + struct swim *s0 = swim_cluster_node(cluster, 0); + const struct swim_member *s0_self = swim_self(s0); + const char *new_s0_uri = "127.0.0.5:1"; + fail_if(swim_cfg(s0, "127.0.0.5:1", -1, -1, -1, NULL) != 0); + /* + * Since S1 knows about S2 only, one round step is enough. + */ + swim_run_for(1); + struct swim *s1 = swim_cluster_node(cluster, 1); + const struct swim_member *s0_view = + swim_member_by_uuid(s1, swim_member_uuid(s0_self)); + is(strcmp(new_s0_uri, swim_member_uri(s0_view)), 0, + "S1 updated its URI and S2 sees that"); + /* + * S2 should not manage to send the new address to S3, but + * should accept S3 packets later - therefore block is + * needed. + */ + swim_cluster_block_io(cluster, 1); + /* + * S1 should not send the new address to S3 - drop its + * packets. + */ + swim_cluster_set_drop(cluster, 0, 100); + /* + * Main part of the test - S3 sends the old address to S1. + */ + swim_cluster_set_drop(cluster, 2, 0); + swim_run_for(3); + swim_cluster_set_drop(cluster, 2, 100); + /* + * S2 absorbs the packets, but should ignore the old + * address. + */ + swim_cluster_unblock_io(cluster, 1); + swim_run_for(2); + is(strcmp(new_s0_uri, swim_member_uri(s0_view)), 0, + "S2 still keeps new S1's URI, even received the old one from S3"); + + swim_cluster_delete(cluster); + swim_finish_test(); +} + static int main_f(va_list ap) { - swim_start_test(13); + swim_start_test(14); (void) ap; swim_test_ev_init(); @@ -574,6 +641,7 @@ main_f(va_list ap) swim_test_undead(); swim_test_packet_loss(); swim_test_quit(); + swim_test_uri_update(); swim_test_transport_free(); swim_test_ev_free(); diff --git a/test/unit/swim.result b/test/unit/swim.result index 7277f2ee6..9dae810a0 100644 --- a/test/unit/swim.result +++ b/test/unit/swim.result @@ -1,5 +1,5 @@ *** main_f *** -1..13 +1..14 *** swim_test_one_link *** 1..6 ok 1 - no rounds - no fullmesh @@ -131,4 +131,10 @@ ok 12 - subtests ok 9 - and still is not added to S2 - left members can not be added ok 13 - subtests *** swim_test_quit: done *** + *** swim_test_uri_update *** + 1..2 + ok 1 - S1 updated its URI and S2 sees that + ok 2 - S2 still keeps new S1's URI, even received the old one from S3 +ok 14 - subtests + *** swim_test_uri_update: done *** *** main_f: done *** diff --git a/test/unit/swim_test_transport.c b/test/unit/swim_test_transport.c index 78fda587a..cda984779 100644 --- a/test/unit/swim_test_transport.c +++ b/test/unit/swim_test_transport.c @@ -230,8 +230,10 @@ swim_transport_bind(struct swim_transport *transport, const struct sockaddr_in *new_addr = (const struct sockaddr_in *) addr; int new_fd = ntohs(new_addr->sin_port) + FAKE_FD_BASE; int old_fd = transport->fd; - if (old_fd == new_fd) + if (old_fd == new_fd) { + transport->addr = *new_addr; return 0; + } if (swim_fd_open(&swim_fd[new_fd - FAKE_FD_BASE]) != 0) return -1; transport->fd = new_fd; -- 2.17.2 (Apple Git-113)