[tarantool-patches] Re: [PATCH 6/6] swim: introduce suspicion

Konstantin Osipov kostja at tarantool.org
Wed Apr 24 20:01:01 MSK 2019


* Vladislav Shpilevoy <v.shpilevoy at tarantool.org> [19/04/24 18:50]:
> +	/**
> +	 * Number of attempts to reach out a member via another
> +	 * members when it did not answer on a regular ping .
> +	 */
> +	INDIRECT_PING_COUNT = 2,

As far as I can see it's not the number of attempts, it's the
number of cluster members involved as proxies for sending an
indirect ping. The fact that you attempt an indirect ping only
once before switching member state to dead is secondary.

This has to do with lack of a description in changeset or code
comment of indirect pings strategy. 

> +/**
> + * Put the member into a list of ACK waiters. @a hop_count says
> + * how many hops from one member to another the ACK is expected to
> + * do.
> + */
>  static void
> -swim_wait_ack(struct swim *swim, struct swim_member *member)
> +swim_wait_ack(struct swim *swim, struct swim_member *member, int hop_count)
>  {
>  	if (heap_node_is_stray(&member->in_wait_ack_heap)) {
> -		member->ping_deadline = swim_time() + swim->wait_ack_tick.at;
> +		double timeout = swim->wait_ack_tick.at * hop_count;

Is hop_count ever greater than two? If not, I would not use int
hop_count, but use bool is_indirect_ping, and add a comment
explaining that in case of indirect pings the number of hops is
exactly two, so we need to increase the timeout.

> +/**
> + * When indirect pings are sent, each is represented by this
> + * structure.
> + */
> +struct swim_iping_task {
> +	/** Base task used by the scheduler. */
> +	struct swim_task base;
> +	/**
> +	 * Reference to a block of other indirect ping tasks sent
> +	 * at the same moment. Used to decide whether it is needed
> +	 * to start waiting for an ACK, and on which member.
> +	 */
> +	struct swim_iping_block *block;

I don't understand this comment, why do you need a reference?

I would not use it at all, seems like we have a separate data
structure when existing structures would do just fine.


> +};
> +
> +/**
> + * An array of indirect ping tasks sent simultaneously via
> + * different proxies. The block contains meta information allowing
> + * to 1) start waiting for an ACK only after first successful
> + * send; 2) determine which member has sent the pings - UUID is
> + * needed for that, inet address is not enough.
> + *
> + * The block is deleted when the last task is finished.
> + */
> +struct swim_iping_block {
> +	/** Array of indirect ping tasks. */
> +	struct swim_iping_task tasks[INDIRECT_PING_COUNT];
> +	/**
> +	 * UUID of the destination member. Used to set ping
> +	 * deadline in that member when at least one is sent
> +	 * successfully.
> +	 */
> +	struct tt_uuid dst_uuid;
> +	/**
> +	 * The flag is used to wait for an ACK only once after a
> +	 * first ping is sent to protect from a case, when an ACK
> +	 * is received faster, than the last ping is sent. Then
> +	 * the whole indirect ping block should be considered
> +	 * acked.
> +	 */

Why care if indirect ping block is acknowledged or not? What
matters is member status, and it is changed when we get an ack.
In other words, I'd drop the whole idea with blocks and just send
multiple independent pings.

>  	switch (def.type) {
>  	case SWIM_FD_MSG_PING:
> -		if (! swim_task_is_scheduled(&member->ack_task))
> +		if (proxy != NULL) {
> +			if (swim_send_indirect_ack(swim, &member->addr,
> +						   proxy) != 0)
> +				diag_log();
> +		} else if (! swim_task_is_scheduled(&member->ack_task)) {
>  			swim_send_ack(swim, &member->ack_task, &member->addr);
> +		}
>  		break;

Why do you ever need to distinguish direct and indirect acks? Why not
respond to the sender address, while preserving the routing table
from the ping request, assuming the sender can figure out whether
the response is addressed to him directly or it should proxy it
further based on the received proxy section of the response?

-- 
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov




More information about the Tarantool-patches mailing list