From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: Ilya Markov <imarkov@tarantool.org>, "tarantool-patches@freelists.org" <tarantool-patches@freelists.org>, Konstantin Osipov <kostja@tarantool.org>
Subject: [tarantool-patches] Re: [commits] [tarantool] 01/04: Add mapping rfc
Date: Tue, 10 Jul 2018 18:16:37 +0300
Message-ID: <e50bb7dc-4dce-625c-2ea6-f01329ae4916@tarantool.org>
In-Reply-To: <1527877378.906931598.53020600671932000@mxpdd4.i.mail.ru>

On 01/06/2018 21:35, Ilya Markov wrote:
> This is an automated email from the git hooks/post-receive script.
>
> IlyaMarkovMipt pushed a commit to branch gh-3098-remapping-replicas
> in repository tarantool.
>
> commit 54600183d2a6e9d2ff44c92284259b50a4bc3d46
> Author: Ilya Markov <imarkov@tarantool.org>
> AuthorDate: Fri Jun 1 13:08:28 2018 +0300
>
> Add mapping rfc
> ---
> doc/rfc/3098-replicas-id-remapping.md | 120 ++++++++++++++++++++++++++++++++++
> 1 file changed, 120 insertions(+)
>
> diff --git a/doc/rfc/3098-replicas-id-remapping.md b/doc/rfc/3098-replicas-id-remapping.md
> new file mode 100644
> index 0000000..3ae1254
> --- /dev/null
> +++ b/doc/rfc/3098-replicas-id-remapping.md
> @@ -0,0 +1,120 @@
> +## Problems and ways to overcome them
> +
> +1. Problem with the primary key in _cluster. Currently the primary key in _cluster is replica_id.
> +Since we want to rewrite tuples inside before triggers according to our local replica id assignment,
> + we need to update this field. However, updating a primary key field inside before triggers is prohibited.
> +
> +*Solution*:
> + That's why we alter the primary index to index the uuid field, and alter the second index to index replica_id.
> + Why not just delete the old record and insert a new one?
> +
> +2. Problem with simultaneous appliers. When several appliers exist at the same time, several triggers
> +are set and each of them is called.
> +The problem is that when a new tuple is delivered,
> +we want to handle it only once, namely by the trigger set by the applier
> + through which the tuple arrived.
> +Therefore, we need to map tuples to appliers inside triggers.
> +
> +*Solution*:
> +The idea we decided to implement is to add a third field to the tuples, holding the uuid of the replica that sent the tuple.
> +With that we can decide whether the tuple was sent to the applier whose trigger was called
> +by simply comparing the third field of the tuple with applier->uuid.
> +
> +3. Before triggers are not called on the join operation, so some of our _cluster metadata is not updated.
> + It's not a problem for mappings, because the joining replica doesn't have _cluster at all.
> + But it is a problem for the local replica id counter, which should be updated each time a new replica is added.
> +
> + *Solution*: On a call of the _cluster trigger (the one that is not assigned to any applier),
> + we check whether we have already updated the local replica id counter.
> + If yes, we use its value.
> + Otherwise, we use the maximum replica id from the _cluster table.
> +
> + Another problem here is that the third field is not updated on join.
> + But since such non-updated tuples are written to snapshots and can later be handled only by another join,
> + this field will be unnecessary.
> +
> +4. When should we set up the triggers? The initial data reception at the join phase does not require mapping
> + because during that phase the node's _cluster is still empty. But on subscribe or on recovery the triggers are required.
> +
> +*Solution*: The trigger used for the global counter is set on bootstrap;
> +the others are set either in join after the initial data is received, or in the subscribe phase.
> +
> +5. How to handle the global counter? The global counter is used to assign ids to new replicas.
> +Each assigned id has to be unique so that it does not overlap with the ids of other replicas, alive or disabled.
> +
> +*Solution*: Let's introduce a replica counter `RC`.
> +On new replica registration we calculate `RC = max(max_id(_cluster), RC) + 1`.
> +This formula takes into account the fact that triggers are not called on initial data reception during the join phase,
> +and the fact that replicas may be deleted.
> +
> +6. Another issue is tuples whose third field (source uuid) is unknown to the replica.
> +
> +In this case we would spoil _cluster, because we don't have a trigger to handle such a tuple.
> +
> +*Solution*: Skip such tuples. We need these tuples from _cluster mostly for tracking vclocks.
> +But if the replica doesn't have an applier for the replica with such a uuid, then that replica should not be represented in the vclock.
> +
> +## Alternatives
> +
> +A possible alternative was to use the uniqueness of UUIDs and
> + store the uuid instead of the replica id in vclocks and xrows. That way there would be no need for remapping, as we could easily distinguish the replicas.
> + But that approach consumes much more memory and increases message size compared to the one above.
> + A uuid is considerably larger than a simple identifier.
>
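
For reference, items 2, 5 and 6 of the quoted RFC can be sketched as a small Python model. This is only an illustration of the rules as they are worded above, not the patch's implementation: `Node`, `next_replica_id`, `on_cluster_tuple` and the `(replica_id, replica_uuid, source_uuid)` tuple layout are hypothetical names chosen for the sketch, and the real code works on Tarantool tuples and before triggers rather than on a dict.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical model of a _cluster tuple: (replica_id, replica_uuid, source_uuid).
ClusterTuple = Tuple[int, str, str]


@dataclass
class Node:
    """Toy stand-in for a replica; none of these names come from Tarantool."""

    rc: int = 0  # local replica id counter, `RC` in the RFC
    # uuid -> locally assigned replica id, modelling _cluster keyed by uuid
    cluster: Dict[str, int] = field(default_factory=dict)

    def next_replica_id(self) -> int:
        # RC = max(max_id(_cluster), RC) + 1
        max_id = max(self.cluster.values(), default=0)
        self.rc = max(max_id, self.rc) + 1
        return self.rc

    def on_cluster_tuple(
        self, applier_uuid: str, tup: ClusterTuple
    ) -> Optional[ClusterTuple]:
        """Model of the trigger set by the applier connected to applier_uuid.

        Returns the remapped tuple, or None when the tuple did not come
        through this applier and must not be handled by this trigger.
        """
        _remote_id, replica_uuid, source_uuid = tup
        if source_uuid != applier_uuid:
            return None
        local_id = self.cluster.get(replica_uuid)
        if local_id is None:
            local_id = self.next_replica_id()
            self.cluster[replica_uuid] = local_id
        # The remote replica id is replaced by the locally assigned one.
        return (local_id, replica_uuid, source_uuid)
```

In this model a node that received `_cluster` during join (so `rc` is still 0) and then sees a new replica through one of its appliers assigns `max(3, 0) + 1 = 4` when the largest known id is 3. If that replica is later deleted, the next id is `max(3, 4) + 1 = 5`, so the id of the deleted replica is not reused. A tuple whose source uuid matches no applier yields `None` from every trigger, which corresponds to skipping it as in item 6.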