Tarantool development patches archive
From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: Ilya Markov <imarkov@tarantool.org>,
	"tarantool-patches@freelists.org"
	<tarantool-patches@freelists.org>,
	Konstantin Osipov <kostja@tarantool.org>
Subject: [tarantool-patches] Re: [commits] [tarantool] 01/04: Add mapping rfc
Date: Tue, 10 Jul 2018 18:16:37 +0300
Message-ID: <e50bb7dc-4dce-625c-2ea6-f01329ae4916@tarantool.org>
In-Reply-To: <1527877378.906931598.53020600671932000@mxpdd4.i.mail.ru>



On 01/06/2018 21:35, Ilya Markov wrote:
> This is an automated email from the git hooks/post-receive script.
> 
> IlyaMarkovMipt pushed a commit to branch gh-3098-remapping-replicas
> in repository tarantool.
> 
> commit 54600183d2a6e9d2ff44c92284259b50a4bc3d46
> Author: Ilya Markov <imarkov@tarantool.org>
> AuthorDate: Fri Jun 1 13:08:28 2018 +0300
> 
>      Add mapping rfc
> ---
>   doc/rfc/3098-replicas-id-remapping.md | 120 ++++++++++++++++++++++++++++++++++
>   1 file changed, 120 insertions(+)
> 
> diff --git a/doc/rfc/3098-replicas-id-remapping.md b/doc/rfc/3098-replicas-id-remapping.md
> new file mode 100644
> index 0000000..3ae1254
> --- /dev/null
> +++ b/doc/rfc/3098-replicas-id-remapping.md
> @@ -0,0 +1,120 @@
> +## Problems and ways to overcome them
> +
> +1. Problem with the primary key in _cluster. Currently the primary key in _cluster is replica_id.
> +But since we want to rewrite tuples inside before triggers according to our local replica id assignment,
> + we need to update this field. However, updating a primary key field inside before triggers is prohibited.
> +
> +*Solution*:
> + We therefore alter the primary index to cover the uuid field and the secondary index to cover replica_id.
> +

Why not just delete the old and insert new record?

> +
> +2. Problem with simultaneous appliers. When several appliers exist at the same moment, several triggers
> +are set and each of them is called. The problem is that when a new tuple is delivered,
> +we want to handle it only once, namely by the trigger set by the applier
> + through which the tuple arrived.
> +Therefore, we need to map tuples to appliers inside triggers.
> +
> +*Solution*:
> +The idea we decided to implement is to add a third field to the tuples, holding the uuid of the replica that sent the tuple.
> +With that we can decide whether the tuple was sent to the applier for which the trigger was called,
> +simply by comparing the third field of the tuple with applier->uuid.
> +
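
For illustration only, the check described in item 2 can be sketched as
below. The sketch is in Python for brevity; `Applier`,
`on_cluster_replace` and the tuple layout
`(replica_id, replica_uuid, source_uuid)` are hypothetical names chosen
for the example, not Tarantool's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class Applier:
    """Hypothetical stand-in for an applier; only the uuid matters here."""
    uuid: str
    handled: list = field(default_factory=list)


def on_cluster_replace(applier, tup):
    """Before-trigger body owned by one applier.

    tup is (replica_id, replica_uuid, source_uuid); the third field is
    assumed to be the uuid of the replica the tuple was received from.
    """
    replica_id, replica_uuid, source_uuid = tup
    if source_uuid != applier.uuid:
        # The tuple arrived through another applier: not ours to handle.
        return False
    applier.handled.append((replica_id, replica_uuid))
    return True


# Two appliers are alive, so both triggers fire for the same tuple,
# but only the one whose uuid matches the third field processes it.
appliers = [Applier("uuid-a"), Applier("uuid-b")]
incoming = (2, "uuid-c", "uuid-b")
results = [on_cluster_replace(a, incoming) for a in appliers]
```

Every trigger still runs, but each tuple is processed exactly once.
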
> +3. Before triggers are not called on the join operation, so we don't update some of our _cluster metadata.
> + It's not a problem for mappings, because the joining replica doesn't have _cluster at all.
> + But it's a problem for the local replica id counter, which should be updated each time a new replica is added.
> +
> + *Solution*: When the _cluster trigger is called (the one that is not assigned to any applier),
> +  we check whether we have already updated the local replica id counter.
> +  If yes, we use its value.
> +  Otherwise, we use the maximum replica id from the _cluster table.
> +
> +  Another problem here is that the third field is not updated on join.
> +  But since such non-updated tuples are written to snapshots and can later be handled only during a join again,
> +  this field is unnecessary for them.
> +
> +4. When should we set up the triggers? The initial data reception at the join phase does not require mapping
> + because within that phase the node's _cluster is still empty. But on subscribe or on recovery the triggers are required.
> +
> +*Solution*: The trigger used for the global counter is set on bootstrap;
> +the others are set either in join, after the initial data is received, or in the subscribe phase.
> +
> +5. How to handle the global counter? The global counter is used to assign ids to new replicas.
> +Each id must be unique so that it does not collide with the ids of other live or disabled replicas.
> +
> +*Solution*: Let's denote the replica counter by `RC`.
> +On new replica registration we calculate `RC = max(max_id(_cluster), RC) + 1`.
> +This formula takes into account the fact that triggers are not called on initial data reception during the join phase,
> +and the fact that replicas may be deleted.
> +
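
As a worked example of the formula in item 5, here is a minimal sketch
in Python. `next_replica_counter` is a hypothetical helper, and using 0
for "RC was never updated" is an assumption of the example, not
something the RFC states.

```python
def next_replica_counter(cluster_ids, rc):
    """Return the new RC = max(max_id(_cluster), RC) + 1.

    cluster_ids: replica ids currently present in _cluster (may be empty).
    rc: the current local counter; 0 stands for "never updated" here.
    """
    max_id = max(cluster_ids, default=0)
    return max(max_id, rc) + 1


# After a join the triggers were never called, so RC is stale (0)
# while _cluster already holds ids 1..3: max(3, 0) + 1.
after_join = next_replica_counter({1, 2, 3}, rc=0)

# Replicas 4 and 5 were registered and later deleted, so RC (5) is
# ahead of _cluster: max(3, 5) + 1.
after_delete = next_replica_counter({1, 2, 3}, rc=5)
```

The first call covers the case where the counter missed updates during
join, the second the case where deleted replicas must not have their
ids reused.
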
> +6. Another issue is tuples whose third field (source uuid) is unknown to the replica.
> +
> +In this case we would spoil _cluster, because we don't have a trigger to handle such a tuple.
> +
> +*Solution*: Skip such tuples. We need these tuples from _cluster mostly for tracking vclocks.
> +But if the replica doesn't have an applier for the replica with such a uuid, then that replica should not be represented in the vclock.
> +
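
The skip rule from item 6 could look roughly like the following Python
sketch. `dispatch_cluster_tuple`, the `handlers_by_uuid` mapping and the
local id 7 are hypothetical and exist only for the illustration.

```python
def dispatch_cluster_tuple(handlers_by_uuid, tup):
    """Route a _cluster tuple to the handler of the applier it came from.

    tup is (replica_id, replica_uuid, source_uuid). If no applier is
    known for source_uuid, the tuple is skipped and None is returned.
    """
    source_uuid = tup[2]
    handler = handlers_by_uuid.get(source_uuid)
    if handler is None:
        # Unknown source: no trigger can remap this tuple, so skip it
        # instead of writing it to _cluster as is.
        return None
    return handler(tup)


# One known applier whose handler remaps the remote id to local id 7.
handlers_by_uuid = {"uuid-b": lambda tup: (7,) + tup[1:]}

applied = dispatch_cluster_tuple(handlers_by_uuid, (2, "uuid-c", "uuid-b"))
skipped = dispatch_cluster_tuple(handlers_by_uuid, (3, "uuid-d", "uuid-x"))
```
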
> +## Alternatives
> +
> +A possible alternative was to rely on the uniqueness of UUIDs and
> + store the uuid instead of the replica id in vclocks and xrows. This way there would be no need for remapping, as we could easily distinguish the replicas.
> + But this approach consumes much more memory and increases message size compared to the proposed one.
> + UUIDs are much larger than simple integer identifiers.
> 
