[tarantool-patches] Re: [PATCH 1/1] swim: introduce SWIM's anti-entropy component

Konstantin Osipov kostja at tarantool.org
Thu Mar 7 13:22:31 MSK 2019


* Vladislav Shpilevoy <v.shpilevoy at tarantool.org> [19/03/07 11:19]:

> diff --git a/cmake/profile.cmake b/cmake/profile.cmake
> index 0ba31fa2c..70fa74694 100644
> --- a/cmake/profile.cmake
> +++ b/cmake/profile.cmake
> @@ -1,5 +1,5 @@
>  set(CMAKE_REQUIRED_FLAGS "-fprofile-arcs -ftest-coverage")
> -check_library_exists("" __gcov_flush "" HAVE_GCOV)
> +set(HAVE_GCOV 1)
>  set(CMAKE_REQUIRED_FLAGS "")

Stray change, please remove.

> +	/**
> +	 * Swim instance is its own iterator. Such iterators are
> +	 * obviously unstable and fragile, but it is good enough
> +	 * for a fast non-yielding scan of members table. Each new
> +	 * public API iterator iterates through member table using
> +	 * this variable as a storage of the current position.
> +	 */

This comment is unclear, Since you rejected my previous edition of
this comment, please check with Vladimir Davydov and write a
better comment than that.

> +/**
> + * Encode anti-entropy header and random members data as many as
> + * possible to the end of the packet.
> + * @retval 0 Not error, but nothing is encoded.
> + * @retval 1 Something is encoded.
> + */

This comment is confusing. You actually return the number of
members added to messagepack map. You can add 0 or 1 member. 
Please rewrite. 

> +static int
> +swim_encode_anti_entropy(struct swim *swim, struct swim_packet *packet)


> +	struct swim_member_bin member_bin;
> +	int size = sizeof(ae_header_bin);
> +	char *header = swim_packet_reserve(packet, size);
> +	if (header == NULL)
> +		return 0;

I don't understand why this error can ever be ignored.  The result of
this check does not depend on environment: each time you invoke
this function your header data won't fit in the packet. This
violates the basic assumptions you make in the protocol, i.e. that
you can fit at least one member in the packet. Please either
explain how you can ignore such a basic problem as when you first
ever header assert/panic.

> +	char *pos = header;
> +	swim_member_bin_create(&member_bin);
> +	struct mh_swim_table_t *t = swim->members;
> +	int i = 0, member_count = mh_size(t);
> +	int rnd = swim_scaled_rand(0, member_count - 1);
> +	for (mh_int_t rc = mh_swim_table_random(t, rnd), end = mh_end(t);
> +	     i < member_count; ++i) {
> +		struct swim_member *m = *mh_swim_table_node(t, rc);
> +		int new_size = size + sizeof(member_bin);
> +		if (swim_packet_reserve(packet, new_size) == NULL)
> +			break;

Same here. You should assert/panic that you can fit at least one member
in the datagram.  Otherwise the protocol is unusable, and this
error condition does not depend on environment - it's a
programming bug.

> +/**
> + * Encode source UUID.
> + * @retval 0 Not error, but nothing is encoded.
> + * @retval 1 Something is encoded.
> + */
> +static inline int
> +swim_encode_src_uuid(struct swim *swim, struct swim_packet *packet)
> +{

This comment is also confusing. Take a look at the usage:
Please update the comment.

> +	int map_size = 0;
> +	map_size += swim_encode_src_uuid(swim, packet);
> +	map_size += swim_encode_anti_entropy(swim, packet);

> +
> +/**
> + * Update or create a member by its definition, received from a

Update or create is "replace", or  upsert in Tarantool vocabulary. 
Update does nothing if no member exists. 
The choice of the name is also confusing since you chose to return
NULL on error.  
Please rename.

> +		if (swim_update_member(swim, &def) == NULL) {
> +			/*
> +			 * Not a critical error. Other members
> +			 * still can be updated.
> +			 */
> +			diag_log();

You're not using the return value.  I assume it will be used in a
subsequent patch. Otherwise please return {0,-1}

> +		}
> +	}
> +	return 0;
> +	if (size == 0) {
> +		diag_set(SwimError, "%s body can not be empty", prefix);
> +		goto error;
> +	}

Are you sure you want to use a single exception for both protocol
and transport errors? Why would you ever have a  protocol error
in production? It suggests that protocol errors, generally, should
be treated differently from transport errors. Nothing to be done
in this patch about it, to be addressed in a subsequent patch.

> +void
> +swim_delete(struct swim *swim)
> +{
> +	swim_scheduler_destroy(&swim->scheduler);
> +	swim_ev_timer_stop(loop(), &swim->round_tick);
> +	swim_task_destroy(&swim->round_step_task);
> +	mh_int_t node = mh_first(swim->members);
> +	while (node != mh_end(swim->members)) {
> +		struct swim_member *m =
> +			*mh_swim_table_node(swim->members, node);
> +		swim_delete_member(swim, m);
> +		node = mh_first(swim->members);
> +	}

This is O(n^2) from the number of members, since the hash table is
not generally shrinking after you delete a member from it. Please
make it linear: go over all mhash members, delete them with
swim_member_delete(), then delete the mhash itself. 

> +struct swim_iterator *
> +swim_iterator_open(struct swim *swim)
> +{
> +	assert(swim_is_configured(swim));
> +	swim->iterator = mh_first(swim->members);
> +	return (struct swim_iterator *) swim;
> +}

What are you trying to save on by fusing the iterator in? A
separate data structure? What's wrong with a separate iterator
object pointing copied by-value? This code is more complicated
than necessary. To be addressed in a separate patch.

> +static inline int
> +swim_decode_ip(struct sockaddr_in *address, const char **pos, const char *end,
> +	       const char *prefix, const char *param_name)
> +{
> +	uint64_t ip;
> +	if (swim_decode_uint(pos, end, &ip, prefix, param_name) != 0)
> +		return -1;
> +	if (ip > UINT32_MAX) {
> +		diag_set(SwimError, "%s %s is an invalid IP address", prefix,
> +			 param_name);
> +		return -1;
> +	}
> +	address->sin_addr.s_addr = ip;

I believe you should convert host order to network order when
assigning. Please add htons.

> +static int
> +main_f(va_list ap)
> +{
> +	swim_start_test(5);
> +
> +	(void) ap;
> +	struct ev_loop *loop = loop();
> +	swim_test_ev_init();
> +	swim_test_transport_init();
> +
> +	swim_test_one_link();
> +	swim_test_sequence();
> +	swim_test_uuid_update();
> +	swim_test_cfg();
> +	swim_test_add_remove();
> +
> +	swim_test_transport_free();
> +	swim_test_ev_free();
> +
> +	test_result = check_plan();
> +	footer();
> +	return 0;
> +}

Please also add to the test plan (in a separate patch):

- hitting the boundary of udp datagram (i.e. case when
  swim_packet_reserve() returns error)
- packet loss in the network - 5%, 10%, 20%, 50%, 90%
- packet loss from a single swim member
- one or few members "lagging" behind the rest.
- speed at which a detected failure spreads through the network
- a "flaky" member which is on and off

I understand anti-entropy itself does not exhibit too much
testable properties, so this test plan is good enough for the
first patch.

Please fix the above comments and push.

-- 
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov




More information about the Tarantool-patches mailing list