Tarantool development patches archive
From: Vladislav Shpilevoy via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Cyrill Gorcunov <gorcunov@gmail.com>,
	tml <tarantool-patches@dev.tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH v9 4/5] limbo: filter incoming synchro requests
Date: Tue, 3 Aug 2021 01:50:49 +0200
Message-ID: <ff186dbc-1da2-7654-0159-cca5645c63b9@tarantool.org>
In-Reply-To: <20210730113539.563318-5-gorcunov@gmail.com>

Thanks for the patch!

On 30.07.2021 13:35, Cyrill Gorcunov wrote:
> When we receive synchro requests we can't just apply
> them blindly because in worse case they may come from
> split-brain configuration (where a cluster splitted into

splitted -> split.

> several subclusters and each one has own leader elected,
> then subclisters are trying to merge back into original

subclisters -> subclusters.

> cluster). We need to do our best to detect such configs
> and force these nodes to rejoin from the scratch for
> data consistency sake.
> Thus when we're processing requests we pass them to the
> packet filter first which validates their contents and
> refuse to apply if they are not matched.
> Depending on request type each packet traverse an
> appropriate chain(s)
>  - Common chain for any synchro packet. We verify
>    that if replica_id is nil then it shall be
>    PROMOTE request with lsn 0 to migrate limbo owner

How can it be 0 for non PROMOTE/DEMOTE requests?
Do we ever encode such rows at all? Why isn't this

>  - Both confirm and rollback requests shall not come
>    with empty limbo since it measn the synchro queue

measn -> means.

>    is already processed and the peer didn't notice
>    that

Is that the only issue? What about a ROLLBACK arriving at
an instance that has already made PROMOTE over the rolled-back
data? That is part of the original problem in the ticket.

>  - Promote request should come in with new terms only,
>    otherwise it means the peer didn't notice election
>  - If limbo's confirmed_lsn is equal to promote LSN then
>    it is a valid request to process
>  - If limbo's confirmed_lsn is bigger than requested then
>    it is valid in one case only -- limbo migration so the
>    queue shall be empty

I don't understand. How is it valid? PROMOTE(lsn) rolls
back everything > lsn. If the local confirmed_lsn > lsn, it
means that data can't be rolled back now and the data becomes

>  - If limbo's confirmed_lsn is less than promote LSN then
>    - If queue is empty then it means the transactions are
>      already rolled back and request is invalid
>    - If queue is not empty then its first entry might be
>      greater than promote LSN and it means that old data
>      either committed or rolled back already and request
>      is invalid

If the first entry's LSN in the limbo > promote LSN, it
means it wasn't committed yet. The promote will roll it back
and it is fine. This will make the data consistent.

The problem appears if some other sync txns were rolled
back, or even committed with quorum=1, before this hanging
txn. And I don't remember that we figured out a way to
distinguish between these situations. Did we?
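
To illustrate what PROMOTE(lsn) does to the pending queue, here is a
minimal sketch. The names promote_fate() and promote_count_rolled_back()
are hypothetical and do not come from the patch or from the limbo code.
It also assumes that entries up to and including the promote LSN are
confirmed, which is my understanding of PROMOTE and is not stated above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcome of one queued synchronous transaction. */
enum txn_fate { TXN_CONFIRMED, TXN_ROLLED_BACK };

/*
 * Classify one pending entry against PROMOTE(promote_lsn).
 * Assumption: entries with LSN <= promote_lsn are confirmed,
 * everything above promote_lsn is rolled back.
 */
static enum txn_fate
promote_fate(int64_t entry_lsn, int64_t promote_lsn)
{
	return entry_lsn <= promote_lsn ? TXN_CONFIRMED : TXN_ROLLED_BACK;
}

/* Count how many queued LSNs PROMOTE(promote_lsn) would roll back. */
size_t
promote_count_rolled_back(const int64_t *queue_lsns, size_t count,
			  int64_t promote_lsn)
{
	size_t rolled_back = 0;
	for (size_t i = 0; i < count; i++) {
		if (promote_fate(queue_lsns[i], promote_lsn) ==
		    TXN_ROLLED_BACK)
			rolled_back++;
	}
	return rolled_back;
}
```

With a queue whose first entry is already above the promote LSN, every
entry is rolled back, which is the consistent outcome described above.
The sketch says nothing about txns that left the queue earlier, and
that is exactly the case we can't tell apart.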

I didn't get to the code yet. Will do later.

>  - NOP, reserved for future use
> Closes #6036
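
For reference while reading the rules, here is a minimal sketch of the
chains exactly as the commit message words them, not as the patch
implements them (I have not read the code yet). Every name below
(struct limbo_view, struct synchro_req, filter_*) is hypothetical, a nil
replica_id is modelled as 0, and "new terms only" is read as "strictly
greater than the greatest term seen", which is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum synchro_type { SYNCHRO_CONFIRM, SYNCHRO_ROLLBACK, SYNCHRO_PROMOTE };

/* Hypothetical, simplified synchro request. */
struct synchro_req {
	enum synchro_type type;
	uint32_t replica_id;	/* 0 stands for nil */
	uint64_t term;
	int64_t lsn;
};

/* Hypothetical, simplified view of the local limbo state. */
struct limbo_view {
	uint64_t greatest_term;		/* greatest PROMOTE term seen */
	int64_t confirmed_lsn;
	bool is_empty;
	int64_t first_entry_lsn;	/* meaningful only if !is_empty */
};

/* Common chain: nil replica_id is allowed only for PROMOTE with lsn 0. */
static bool
filter_common(const struct synchro_req *req)
{
	if (req->replica_id != 0)
		return true;
	return req->type == SYNCHRO_PROMOTE && req->lsn == 0;
}

/* CONFIRM/ROLLBACK chain: must not arrive when the limbo is empty. */
static bool
filter_confirm_rollback(const struct limbo_view *limbo)
{
	return !limbo->is_empty;
}

/* PROMOTE chain, following the confirmed_lsn cases of the commit message. */
static bool
filter_promote(const struct limbo_view *limbo, const struct synchro_req *req)
{
	if (req->term <= limbo->greatest_term)
		return false;	/* the peer did not notice the election */
	if (limbo->confirmed_lsn == req->lsn)
		return true;
	if (limbo->confirmed_lsn > req->lsn)
		return limbo->is_empty;	/* limbo migration only */
	/* confirmed_lsn < promote LSN from here on. */
	if (limbo->is_empty)
		return false;
	/* The rule questioned in this review: first entry > promote LSN. */
	return limbo->first_entry_lsn <= req->lsn;
}

/* Run the common chain, then the chain matching the request type. */
bool
synchro_req_is_valid(const struct limbo_view *limbo,
		     const struct synchro_req *req)
{
	assert(limbo != NULL && req != NULL);
	if (!filter_common(req))
		return false;
	switch (req->type) {
	case SYNCHRO_CONFIRM:
	case SYNCHRO_ROLLBACK:
		return filter_confirm_rollback(limbo);
	case SYNCHRO_PROMOTE:
		return filter_promote(limbo, req);
	default:
		return false;
	}
}
```

Written out this way, the two branches I question above are easy to
spot: the confirmed_lsn > lsn case that is accepted on an empty queue,
and the last return in filter_promote().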
