From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from localhost (localhost [127.0.0.1])
 by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 0039E22FD6
 for ; Fri, 27 Apr 2018 09:05:33 -0400 (EDT)
Received: from turing.freelists.org ([127.0.0.1])
 by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id lVwM0WZJefSJ for ; Fri, 27 Apr 2018 09:05:32 -0400 (EDT)
Received: from smtp52.i.mail.ru (smtp52.i.mail.ru [94.100.177.112])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 4079222FD2
 for ; Fri, 27 Apr 2018 09:05:31 -0400 (EDT)
From: Konstantin Belyavskiy
Subject: [tarantool-patches] [PATCH v2] replication: do not fetch records twice in a full mesh
Date: Fri, 27 Apr 2018 16:05:26 +0300
Message-Id: <20180427130526.98757-1-k.belyavskiy@tarantool.org>
Sender: tarantool-patches-bounce@freelists.org
Errors-to: tarantool-patches-bounce@freelists.org
Reply-To: tarantool-patches@freelists.org
List-help:
List-unsubscribe:
List-software: Ecartis version 1.0.0
List-Id: tarantool-patches
List-subscribe:
List-owner:
List-post:
List-archive:
To: georgy@tarantool.org
Cc: tarantool-patches@freelists.org

---
Ticket: https://github.com/tarantool/tarantool/issues/3294
Branch: https://github.com/tarantool/tarantool/compare/gh-3294-do-not-fetch-records-twice
Link: https://github.com/tarantool/tarantool/blob/gh-3294-do-not-fetch-records-twice/doc/rfc/3294-implement-selective-subscribe-to-avoid-fetch-record-twice.md

 ...ective-subscribe-to-avoid-fetch-record-twice.md | 26 ++++++++++++++++++++++
 1 file changed, 26 insertions(+)
 create mode 100644 doc/rfc/3294-implement-selective-subscribe-to-avoid-fetch-record-twice.md

diff --git a/doc/rfc/3294-implement-selective-subscribe-to-avoid-fetch-record-twice.md b/doc/rfc/3294-implement-selective-subscribe-to-avoid-fetch-record-twice.md
new file mode 100644
index 000000000..578f17d53
--- /dev/null
+++ b/doc/rfc/3294-implement-selective-subscribe-to-avoid-fetch-record-twice.md
@@ -0,0 +1,26 @@
+# Implement selective subscribe and subscription daemon
+
+* **Status**: In progress
+* **Start date**: 25-04-2018
+* **Authors**: Konstantin Belyavskiy @kbelyavs k.belyavskiy@tarantool.org, Georgy Kirichenko @georgy georgy@tarantool.org, Konstantin Osipov @kostja kostja@tarantool.org
+* **Issues**: [#3294](https://github.com/tarantool/tarantool/issues/3294)
+
+## Summary
+
+Extend the SUBSCRIBE command with a list of server UUIDs for which SUBSCRIBE should fetch changes. In a full mesh configuration, only download records originating from the immediate peer. Do not download the records from other peers twice.
+Implement a subscription daemon. Each time a server responsible for feeding more than one server id is dropped, we need to re-subscribe to some other peer and reassign the dropped ids to that peer. Each time a server connects again, we need to rebalance.
+
+## Background and motivation
+
+Currently each Tarantool instance downloads from all of its peers all records in their WALs, except records whose instance id equals its own. For example, in a full mesh of 3 replicas, every record is fetched twice. Instead, an instance could send each peer a subscribe request listing only the server ids that are not present in its other subscribe requests.
+
+## Detailed design
+
+1. Extend the IPROTO_SUBSCRIBE command with a list of server UUIDs for which SUBSCRIBE should fetch changes. Store these UUIDs in the applier's internal data structure. By default, SUBSCRIBE is issued with an empty list, which means no filtering at all.
+2. Implement white-list filtering in the relay. After processing a SUBSCRIBE request, the relay has a list of UUIDs. Extract the associated peer ids and fill in a filter. By default, transmit all records, unless SUBSCRIBE was done with at least one server UUID. In the latter case, drop all records except those originating from replicas in this list.
+3. After issuing REQUEST_VOTE to all peers, the subscription daemon knows a map of server ids, their peers and their vclocks. Sort the map by server id. Iterate over each server in the list of peers and assign its id to this server's SUBSCRIBE request. Assign all the remaining ids to the last peer (alternatively, if there are many ids in the remainder, keep going through the list of servers and assign "orphan" ids in round-robin fashion).
+Issue the subscribe request.
+After this feature is implemented, each time a server responsible for feeding more than one server id is dropped, we need to re-subscribe to some other peer and reassign the dropped ids to that peer. Each time a server connects again, we need to rebalance.
+4. Rebalancing. Connect/disconnect should trigger the daemon to start the reassignment process.
+  - On disconnect, first get a list of all UUIDs, then iterate through the appliers to find "orphan" UUIDs, and finally reassign these UUIDs to the last peer by issuing SUBSCRIBE for it.
+  - On connect, iterate through the appliers list to find stolen UUIDs, reassign them to the correct applier, and issue SUBSCRIBE both for the recently connected applier and for the one from which these UUIDs are taken back.
-- 
2.14.3 (Apple Git-98)
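
The assignment in step 3 of the detailed design, and the rebalancing in step 4, can be illustrated with a small sketch. It is not part of the patch and is not Tarantool code: the function name `assign_subscriptions`, its parameters, and the use of plain integer server ids instead of UUIDs are assumptions made for illustration only.

```python
from itertools import cycle


def assign_subscriptions(all_ids, peers, self_id, round_robin=False):
    """Map each connected peer id to the server ids requested from it.

    Hypothetical helper: all_ids is every server id known in the cluster,
    peers is the set of ids we currently have a connection to, and
    self_id is our own id (our own records are never fetched).
    """
    peers = sorted(peers)
    if not peers:
        return {}
    # Every connected peer feeds the records it originated itself.
    plan = {peer: [peer] for peer in peers}
    # Ids that are neither ours nor directly connected are "orphans".
    orphans = sorted(set(all_ids) - set(peers) - {self_id})
    if round_robin:
        # Spread orphans over the sorted peer list, one at a time.
        for orphan, peer in zip(orphans, cycle(peers)):
            plan[peer].append(orphan)
    else:
        # Simplest policy from the RFC: the last peer takes the remainder.
        plan[peers[-1]].extend(orphans)
    return plan
```

Under these assumptions, rebalancing amounts to calling the function again with the updated peer set after a connect or disconnect and re-issuing SUBSCRIBE only to the peers whose list changed. In a full mesh of three replicas, each peer ends up feeding only its own id, so no record is fetched twice.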