From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 637552C2A6 for ; Thu, 25 Apr 2019 09:50:11 -0400 (EDT) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Cc70E_Wt9keB for ; Thu, 25 Apr 2019 09:50:11 -0400 (EDT) Received: from smtpng1.m.smailru.net (smtpng1.m.smailru.net [94.100.181.251]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 1FA0E2ADF9 for ; Thu, 25 Apr 2019 09:50:11 -0400 (EDT) Subject: [tarantool-patches] Re: [PATCH 5/6] swim: introduce routing References: <6592ce57d8b6c7ef46202551f858a37ca2e18a2c.1556116199.git.v.shpilevoy@tarantool.org> <20190424164659.GE13687@atlas> <52489504-ffa3-abbc-eb5c-8919d2da8d33@tarantool.org> <20190425103949.GE29257@atlas> From: Vladislav Shpilevoy Message-ID: <921181b4-450c-9d83-1060-8d472303ddd0@tarantool.org> Date: Thu, 25 Apr 2019 16:50:08 +0300 MIME-Version: 1.0 In-Reply-To: <20190425103949.GE29257@atlas> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-Help: List-Unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-Subscribe: List-Owner: List-post: List-Archive: To: Konstantin Osipov Cc: tarantool-patches@freelists.org > Please document this decision and the reasons for it as you stated > above. Done in the last commit. ========================================================================= @@ -59,11 +59,22 @@ * ping. Replies are processed out of the main cycle, * asynchronously. * - * Random selection provides even network load of ~1 message on - * each member per one protocol step regardless of the cluster - * size. Without randomness each member would receive a network - * load of N messages in each protocol step, where N is the - * cluster size. + * When a member unacknowledged too many pings, its status is + * changed to 'suspected'. The SWIM paper describes suspicion + * subcomponent as a protection against false-positive detection + * of alive members as dead. It happens when a member is + * overloaded and responds to pings too slow, or when the network + * is in trouble and packets can not go through some channels. + * When a member is suspected, another instance pings it + * indirectly via other members. It sends a fixed number of pings + * to the suspected one in parallel via additional hops selected + * randomly among other members. + * + * Random selection in all the components provides even network + * load of ~1 message on each member per one protocol step + * regardless of the cluster size. Without randomness each member + * would receive a network load of N messages in each protocol + * step, where N is the cluster size. * * To speed up propagation of new information by means of a few * random messages SWIM proposes a kind of fairness: when @@ -1111,7 +1122,12 @@ swim_send_ack(struct swim *swim, struct swim_task *task, swim_send_fd_msg(swim, task, dst, SWIM_FD_MSG_ACK, NULL); } -/** Schedule an indirect ack through @a proxy. */ +/** + * Schedule an indirect ack through @a proxy. Indirect ACK is sent + * only when this instance receives an indirect ping. It means + * that another member tries to reach this one via other nodes, + * and inexplicably failed to do it directly. + */ static inline int swim_send_indirect_ack(struct swim *swim, const struct sockaddr_in *dst, const struct sockaddr_in *proxy) @@ -1155,7 +1171,16 @@ finish: swim_task_delete_cb(task, scheduler, rc); } -/** Schedule a number of indirect pings to a member @a dst. */ +/** + * Schedule a number of indirect pings to a member @a dst. + * Indirect pings are used when direct pings are not acked too + * long. The SWIM paper explains that it is a protection against + * false-positive failure detection when a node sends ACKs too + * slow, or the network is in trouble. Then other nodes can try to + * access it via different channels and members. The algorithm is + * simple - choose a fixed number of random members and send pings + * to the suspected member via them in parallel. + */ static inline int swim_send_indirect_pings(struct swim *swim, const struct swim_member *dst) { ========================================================================= >>>> +void >>>> +swim_task_proxy(struct swim_task *task, const struct sockaddr_in *proxy) >>> >>> Why not "swim_task_set_proxy_addr"? >> >> 1) Because the transport level operates on addresses only. It is excessive >> to say everywhere '_addr' suffix. It is impossible to send to UUID or >> anything else except inet address; >> >> 2) For consistency - 'swim_task_send()' is not named >> 'swim_task_send_to_addr()'; >> >> 3) '_addr' suffix is too long. I tried it on the first SWIM version, >> but it does not help at all, only pads the code out. > > Then swim_task_set_proxy(). A method should have a verb in its > name, otherwise the name looks like a constructor. Done. ========================================================================= void -swim_task_proxy(struct swim_task *task, const struct sockaddr_in *proxy) +swim_task_set_proxy(struct swim_task *task, const struct sockaddr_in *proxy) { /* * Route meta should be reserved before body encoding is diff --git a/src/lib/swim/swim_io.h b/src/lib/swim/swim_io.h index 977859db7..03f268bc1 100644 --- a/src/lib/swim/swim_io.h +++ b/src/lib/swim/swim_io.h @@ -243,7 +243,7 @@ swim_task_is_scheduled(struct swim_task *task) * into metadata section. */ void -swim_task_proxy(struct swim_task *task, const struct sockaddr_in *proxy); +swim_task_set_proxy(struct swim_task *task, const struct sockaddr_in *proxy); =========================================================================