Tarantool development patches archive
From: Konstantin Belyavskiy <k.belyavskiy@tarantool.org>
To: georgy@tarantool.org
Cc: tarantool-patches@freelists.org
Subject: [tarantool-patches] [PATCH v2] replication: do not fetch records twice in a full mesh
Date: Fri, 27 Apr 2018 16:05:26 +0300
Message-ID: <20180427130526.98757-1-k.belyavskiy@tarantool.org>

---
Ticket: https://github.com/tarantool/tarantool/issues/3294
Branch: https://github.com/tarantool/tarantool/compare/gh-3294-do-not-fetch-records-twice
Link: https://github.com/tarantool/tarantool/blob/gh-3294-do-not-fetch-records-twice/doc/rfc/3294-implement-selective-subscribe-to-avoid-fetch-record-twice.md
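
Illustration only, not part of the patch: the id assignment described in item 3 of the RFC's detailed design could be sketched roughly as below. The function name `assign_subscriptions`, its parameters, and the use of plain integer server ids in place of UUIDs are assumptions made for this sketch; the RFC does not prescribe any of them.

```python
def assign_subscriptions(all_ids, connected_peers, self_id, round_robin=False):
    """Return {peer_id: [server ids to request from that peer]}.

    Hypothetical helper: all_ids is every server id known after
    REQUEST_VOTE, connected_peers is the subset we can subscribe to.
    """
    # Sort peers by server id and never subscribe to ourselves.
    peers = sorted(set(connected_peers) - {self_id})
    # Each connected peer feeds its own records.
    assignment = {peer: [peer] for peer in peers}
    if not peers:
        return assignment
    # "Orphan" ids belong to servers we are not directly connected to.
    orphans = sorted(set(all_ids) - set(peers) - {self_id})
    for i, orphan in enumerate(orphans):
        # Default: give every orphan to the last peer.
        # Alternative: spread orphans over the peers in round-robin order.
        target = peers[i % len(peers)] if round_robin else peers[-1]
        assignment[target].append(orphan)
    return assignment


if __name__ == "__main__":
    # Instance 1 is connected to 2 and 3; servers 4 and 5 are unreachable.
    print(assign_subscriptions([1, 2, 3, 4, 5], [2, 3], self_id=1))
    print(assign_subscriptions([1, 2, 3, 4, 5], [2, 3], self_id=1,
                               round_robin=True))
```

Under these assumptions, rebalancing on connect or disconnect amounts to calling the helper again with the updated peer list and re-issuing SUBSCRIBE to every peer whose list changed.
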
 ...ective-subscribe-to-avoid-fetch-record-twice.md | 26 ++++++++++++++++++++++
 1 file changed, 26 insertions(+)
 create mode 100644 doc/rfc/3294-implement-selective-subscribe-to-avoid-fetch-record-twice.md

diff --git a/doc/rfc/3294-implement-selective-subscribe-to-avoid-fetch-record-twice.md b/doc/rfc/3294-implement-selective-subscribe-to-avoid-fetch-record-twice.md
new file mode 100644
index 000000000..578f17d53
--- /dev/null
+++ b/doc/rfc/3294-implement-selective-subscribe-to-avoid-fetch-record-twice.md
@@ -0,0 +1,26 @@
+# Implement selective subscribe and subscription daemon
+
+* **Status**: In progress
+* **Start date**: 25-04-2018
+* **Authors**: Konstantin Belyavskiy @kbelyavs k.belyavskiy@tarantool.org, Georgy Kirichenko @georgy georgy@tarantool.org, Konstantin Osipov @kostja kostja@tarantool.org
+* **Issues**: [#3294](https://github.com/tarantool/tarantool/issues/3294)
+
+## Summary
+
+Extend the SUBSCRIBE command with a list of server UUIDs for which SUBSCRIBE should fetch changes. In a full mesh configuration, only download records originating from the immediate peer. Do not download the records from other peers twice.
+Implement a subscription daemon. Each time a server responsible for feeding more than one server id is dropped, the daemon needs to re-subscribe to some other peer and reassign the dropped ids to that peer. Each time a server connects again, the daemon needs to rebalance again.
+
+## Background and motivation
+
+Currently each Tarantool instance downloads from every peer all records in that peer's WAL, except records whose instance id equals its own. For example, in a full mesh of 3 replicas every record is fetched twice. Instead, an instance could send each peer a subscribe request listing only the server ids that are not present in its other subscribe requests.
+
+## Detailed design
+
+1. Extend the IPROTO_SUBSCRIBE command with a list of server UUIDs for which SUBSCRIBE should fetch changes. Store these UUIDs in the applier's internal data structure. By default, SUBSCRIBE is issued with an empty list, which means no filtering at all.
+2. Implement white-list filtering in relay. After processing a SUBSCRIBE request, relay has a list of UUIDs. Extract the associated peer ids and fill in a filter. By default transmit all records, unless SUBSCRIBE was done with at least one server UUID. In the latter case drop all records except those originating from replicas in this list.
+3. After issuing REQUEST_VOTE to all peers, the subscription daemon knows a map of server ids, their peers and their vclocks. Sort the map by server id. Iterate over each server in the list of peers and assign its id to this server's SUBSCRIBE request. Assign all the remaining ids to the last peer (alternatively, if there are many ids in the remainder, keep going through the list of servers and assign "orphan" ids in round-robin fashion).
+Issue the subscribe request.
+After this feature is implemented, each time a server responsible for feeding more than one server id is dropped, we need to re-subscribe to some other peer and reassign the dropped ids to that peer. Each time a server connects again, we need to rebalance again.
+4. Rebalancing. A connect or disconnect should trigger the daemon to start the reassignment process.
+ - On disconnect, first get a list of all UUIDs, then iterate through the appliers to find "orphan" UUIDs, and finally reassign these UUIDs to the last peer by issuing SUBSCRIBE for it.
+ - On connect, iterate through the appliers list to find stolen UUIDs, reassign them to the correct applier, and issue SUBSCRIBE both for the recently connected applier and for the one from which these UUIDs are taken back.
-- 
2.14.3 (Apple Git-98)
