[tarantool-patches] Re: [PATCH 6/6] [RAW] swim: introduce failure detection component

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Tue Apr 2 15:25:53 MSK 2019



On 29/03/2019 21:59, Konstantin Osipov wrote:
> * Vladislav Shpilevoy <v.shpilevoy at tarantool.org> [19/03/20 14:11]:
>>  
>>  
>> +/**
>> + * Decode a failure detection message. Schedule acks, process
>> + * acks.
>> + */
>> +static int
>> +swim_process_failure_detection(struct swim *swim, const char **pos,
>> +			       const char *end, const struct sockaddr_in *src,
>> +			       const struct tt_uuid *uuid)
>> +{
>> +	const char *prefix = "invalid failure detection message:";
>> +	struct swim_failure_detection_def def;
>> +	struct swim_member_def mdef;
>> +	if (swim_failure_detection_def_decode(&def, pos, end, prefix) != 0)
>> +		return -1;
>> +	say_verbose("SWIM %d: process failure detection's %s", swim_fd(swim),
>> +		    swim_fd_msg_type_strs[def.type]);
>> +	swim_member_def_create(&mdef);
>> +	mdef.addr = *src;
>> +	mdef.incarnation = def.incarnation;
>> +	mdef.uuid = *uuid;
>> +	struct swim_member *member = swim_upsert_member(swim, &mdef);
>> +	if (member == NULL)
>> +		return -1;
>> +	if (def.incarnation < member->incarnation)
>> +		return 0;
> 
> I think this branch a comment. Why do you not respond to pings
> from older incarnations?

Firstly, I am trying to be consistent in neglect of old messages. I
just pretend that they do not exist. Despite their type and purpose.

Secondly, if we know a bigger incarnation of the member, then we can
be sure, that after this stray ping we have already had some interaction
with that member, and it is not necessary to garbage the network with
that apparently excessive ACK.

I added a comment with the things I said above.

> 
>> +/**
>> + * Send a ping to this address. If an ACK is received, the member
>> + * will be added.
> 
> Could you also please explain in the comment why this is useful?
> E.g. why would I want to add a member only after a successful
> ping/ack?

It is not about successful ping/ack, but rather for a case
when you don't know UUID. This method allows you to discover
UUID knowing only URI.

==============================================================================
diff --git a/src/lib/swim/swim.h b/src/lib/swim/swim.h
index 3a2a878fc..9659f03b8 100644
--- a/src/lib/swim/swim.h
+++ b/src/lib/swim/swim.h
@@ -123,7 +123,9 @@ swim_remove_member(struct swim *swim, const struct tt_uuid *uuid);
 
 /**
  * Send a ping to this address. If an ACK is received, the member
- * will be added.
+ * will be added. The main purpose of the method is to add a new
+ * member manually but without knowledge of its UUID. The member
+ * will send it with an ACK.
  */
 int
 swim_probe_member(struct swim *swim, const char *uri);
==============================================================================

> 
>> + */
>> +int
>> +swim_probe_member(struct swim *swim, const char *uri);
>> +
>> +	SWIM_FD_MSG_TYPE,
> 
> The patch is generally OK to push after fixing the minor comments
> above. I also changed a few comments on the branch.

Thanks, the fixes are applied. Partially though. For example,
in one place you changed 'the SWIM paper' to 'SWIM paper'. But
there are no the single SWIM paper in the world. And I used 'the'
to emphasize one concrete SWIM paper among many. Besides, you
left other 'the SWIM paper' as is.

Also, I removed empty lines from swim_update_member_inc_status.
We do not use the empty lines for such multi-line conditions.

In addition, fixed some places out of 66 border.

> 
>> +enum {
>> +	/**
>> +	 * Value being used to say that unacknowledged pings
>> +	 * should not affect a certain decision about member. For
>> +	 * example, regardless of number of unacked pings, never
>> +	 * drop a member.
>> +	 */
>> +	SWIM_NO_ACKS_IGNORE = INT_MAX,
>> +};
> 
> OK, now I get it. You use it to pin members. I would actually use
> bool is_pinned for that. It seems you make no_acks_to_gc a
> variable only to use it for pinning. Let's add a separate varaible
> for pins.

Discussed verbally. Pinning is necessary when members addition and
removal is managed by user and its external systems, while SWIM is
used for failure detection only. We decided to enclose no_acks_to_gc
and provide a user with pinning API only. But there is a problem that
when a new member is added implicitly, via anti-entropy, for example,
then it is not pinned. A user would need to track every new member
and pin it manually. So for that case we decided to introduce a flag
'auto_pin'. With that flag set every new member is pinned automatically.

But once I started implementation of thar idea, I faced with a dubious
moment - what to do with existing members when auto_pin = true? And do
we need point pinning now? I decided, that

    - we can postpone point pinning until a specific request for that;

    - instead of auto_pin it is better to add an explicit GC mode, because
      in a nutshell auto-pin is actually a GC turned off. Lets call a spade
      a spade.

With that simplifications I managed to make the incremental diff quite
small and simple for that comment. See below:

==============================================================================
diff --git a/src/lib/swim/swim.c b/src/lib/swim/swim.c
index 188e5bd9a..472137db3 100644
--- a/src/lib/swim/swim.c
+++ b/src/lib/swim/swim.c
@@ -160,7 +160,7 @@ enum {
 	 * 2) disseminate the status via dissemination and
 	 * anti-entropy components.
 	 */
-	NO_ACKS_TO_GC_DEFAULT = 2,
+	NO_ACKS_TO_GC = 2,
 };
 
 /**
@@ -362,12 +362,8 @@ struct swim {
 	heap_t wait_ack_heap;
 	/** Generator of ack checking events. */
 	struct ev_timer wait_ack_tick;
-	/**
-	 * How many pings to a dead member should be
-	 * unacknowledged for it to be deleted from the member
-	 * table.
-	 */
-	int no_acks_to_gc;
+	/** GC state saying how to remove dead members. */
+	enum swim_gc_mode gc_mode;
 };
 
 /** Put the member into a list of ACK waiters. */
@@ -875,7 +871,8 @@ swim_check_acks(struct ev_loop *loop, struct ev_timer *t, int events)
 			}
 			break;
 		case MEMBER_DEAD:
-			if (m->unacknowledged_pings >= swim->no_acks_to_gc) {
+			if (m->unacknowledged_pings >= NO_ACKS_TO_GC &&
+			    swim->gc_mode == SWIM_GC_ON) {
 				swim_delete_member(swim, m);
 				continue;
 			}
@@ -968,8 +965,7 @@ swim_upsert_member(struct swim *swim, const struct swim_member_def *def,
 {
 	struct swim_member *member = swim_find_member(swim, &def->uuid);
 	if (member == NULL) {
-		if (def->status == MEMBER_DEAD &&
-		    swim->no_acks_to_gc != SWIM_NO_ACKS_IGNORE) {
+		if (def->status == MEMBER_DEAD && swim->gc_mode == SWIM_GC_ON) {
 			/*
 			 * Do not 'resurrect' dead members to
 			 * prevent 'ghost' members. Ghost member
@@ -1185,7 +1181,7 @@ swim_new(void)
 	swim_ev_timer_init(&swim->wait_ack_tick, swim_check_acks,
 			   ACK_TIMEOUT_DEFAULT, 0);
 	swim->wait_ack_tick.data = (void *) swim;
-	swim->no_acks_to_gc = NO_ACKS_TO_GC_DEFAULT;
+	swim->gc_mode = SWIM_GC_ON;
 	return swim;
 }
 
@@ -1216,7 +1212,8 @@ swim_uri_to_addr(const char *uri, struct sockaddr_in *addr,
 
 int
 swim_cfg(struct swim *swim, const char *uri, double heartbeat_rate,
-	 double ack_timeout, int no_acks_to_gc, const struct tt_uuid *uuid)
+	 double ack_timeout, enum swim_gc_mode gc_mode,
+	 const struct tt_uuid *uuid)
 {
 	const char *prefix = "swim.cfg:";
 	struct sockaddr_in addr;
@@ -1284,8 +1281,8 @@ swim_cfg(struct swim *swim, const char *uri, double heartbeat_rate,
 	/* Reserved above. */
 	assert(rc == 0);
 	(void) rc;
-	if (no_acks_to_gc >= 0)
-		swim->no_acks_to_gc = no_acks_to_gc;
+	if (gc_mode != SWIM_GC_DEFAULT)
+		swim->gc_mode = gc_mode;
 	return 0;
 }
 
diff --git a/src/lib/swim/swim.h b/src/lib/swim/swim.h
index 9659f03b8..31150a18e 100644
--- a/src/lib/swim/swim.h
+++ b/src/lib/swim/swim.h
@@ -31,7 +31,6 @@
  * SUCH DAMAGE.
  */
 #include <stdbool.h>
-#include <limits.h>
 #include <stdint.h>
 #include "swim_constants.h"
 
@@ -39,22 +38,19 @@
 extern "C" {
 #endif
 
-enum {
-	/**
-	 * Value being used to say that unacknowledged pings
-	 * should not affect a certain decision about member. For
-	 * example, regardless of number of unacked pings, never
-	 * drop a member.
-	 */
-	SWIM_NO_ACKS_IGNORE = INT_MAX,
-};
-
 struct info_handler;
 struct swim;
 struct tt_uuid;
 struct swim_iterator;
 struct swim_member;
 
+/** Types of SWIM dead member deletion policy. */
+enum swim_gc_mode {
+	/** Just keep the current mode as is. */
+	SWIM_GC_DEFAULT = -1,
+	/**
+	 * Turn GC off. With that mode dead members are never
+	 * deleted automatically.
+	 */
+	SWIM_GC_OFF = 0,
+	/**
+	 * Turn GC on. Normal classical SWIM GC mode. Dead members
+	 * are deleted automatically after a number of
+	 * unacknowledged pings.
+	 */
+	SWIM_GC_ON = 1,
+};
+
 /**
  * Create a new SWIM instance. Do not bind to a port or set any
  * parameters. Allocation and initialization only.
@@ -78,12 +74,11 @@ swim_is_configured(const struct swim *swim);
  *        @heartbeat_rate.
  * @param ack_timeout Time in seconds after which a ping is
  *        considered to be unacknowledged.
- * @param no_acks_to_gc How many pings to a dead member should be
- *        unacknowledged to delete it from the member table. Big
- *        or even almost infinite (SWIM_NO_ACKS_IGNORE) values
- *        could be useful, if SWIM is used mainly for monitoring
- *        of existing nodes with manual removal of dead ones, and
- *        probably with only single initial discovery.
+ * @param gc_mode Says if members should never be deleted due to
+ *        too many unacknowledged pings. It could be useful, if
+ *        SWIM is used mainly for monitoring of existing nodes
+ *        with manual removal of dead ones, and probably with only
+ *        a single initial discovery.
  * @param uuid UUID of this instance. Must be unique over the
  *        cluster.
  *
@@ -92,7 +87,8 @@ swim_is_configured(const struct swim *swim);
  */
 int
 swim_cfg(struct swim *swim, const char *uri, double heartbeat_rate,
-	 double ack_timeout, int no_acks_to_gc, const struct tt_uuid *uuid);
+	 double ack_timeout, enum swim_gc_mode gc_mode,
+	 const struct tt_uuid *uuid);
 
 /** SWIM's ACK timeout, previously set via @sa swim_cfg. */
 double
diff --git a/test/unit/swim.c b/test/unit/swim.c
index f3a31f992..32d223a7a 100644
--- a/test/unit/swim.c
+++ b/test/unit/swim.c
@@ -410,7 +410,7 @@ swim_test_undead(void)
 {
 	swim_start_test(2);
 	struct swim_cluster *cluster = swim_cluster_new(2);
-	swim_cluster_set_no_acks_to_gc(cluster, SWIM_NO_ACKS_IGNORE);
+	swim_cluster_set_gc(cluster, SWIM_GC_OFF);
 	swim_cluster_set_ack_timeout(cluster, 1);
 	swim_cluster_add_link(cluster, 0, 1);
 	swim_cluster_add_link(cluster, 1, 0);
diff --git a/test/unit/swim_test_utils.c b/test/unit/swim_test_utils.c
index 57faf803a..bb413372c 100644
--- a/test/unit/swim_test_utils.c
+++ b/test/unit/swim_test_utils.c
@@ -84,9 +84,9 @@ swim_cluster_set_ack_timeout(struct swim_cluster *cluster, double ack_timeout)
 }
 
 void
-swim_cluster_set_no_acks_to_gc(struct swim_cluster *cluster, int no_acks_to_gc)
+swim_cluster_set_gc(struct swim_cluster *cluster, enum swim_gc_mode gc_mode)
 {
-	swim_cluster_set_cfg(cluster, NULL, -1, -1, no_acks_to_gc, NULL);
+	swim_cluster_set_cfg(cluster, NULL, -1, -1, gc_mode, NULL);
 }
 
 void
diff --git a/test/unit/swim_test_utils.h b/test/unit/swim_test_utils.h
index 494808df0..c0ecf27e0 100644
--- a/test/unit/swim_test_utils.h
+++ b/test/unit/swim_test_utils.h
@@ -53,7 +53,7 @@ swim_cluster_set_ack_timeout(struct swim_cluster *cluster, double ack_timeout);
  * of all the instances in the cluster.
  */
 void
-swim_cluster_set_no_acks_to_gc(struct swim_cluster *cluster, int no_acks_to_gc);
+swim_cluster_set_gc(struct swim_cluster *cluster, enum swim_gc_mode gc_mode);
 
 /** Delete all the SWIM instances, and the cluster itself. */
 void
==============================================================================




More information about the Tarantool-patches mailing list