From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <sergepetrenko@tarantool.org>
Received: from smtp60.i.mail.ru (smtp60.i.mail.ru [217.69.128.40])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by dev.tarantool.org (Postfix) with ESMTPS id F0ADB45C304
 for <tarantool-patches@dev.tarantool.org>;
 Mon, 30 Nov 2020 12:40:29 +0300 (MSK)
References: <20201124131820.32981-1-sergepetrenko@tarantool.org>
 <e09d6a82-ae01-c9ae-e30e-4357ac3f778b@tarantool.org>
 <aaff0f19-5360-a858-fd33-2d33c77ff42d@tarantool.org>
 <22c104d3-bb7b-fd4d-ebd0-730133e8ded9@tarantool.org>
From: Serge Petrenko <sergepetrenko@tarantool.org>
Message-ID: <459c1422-cb8d-45df-742f-4f720e8ec9e7@tarantool.org>
Date: Mon, 30 Nov 2020 12:40:28 +0300
MIME-Version: 1.0
In-Reply-To: <22c104d3-bb7b-fd4d-ebd0-730133e8ded9@tarantool.org>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Transfer-Encoding: 8bit
Content-Language: en-GB
Subject: Re: [Tarantool-patches] [PATCH] raft: make sure the leader stays ro
 till it clears the limbo
List-Id: Tarantool development patches <tarantool-patches.dev.tarantool.org>
List-Unsubscribe: <https://lists.tarantool.org/mailman/options/tarantool-patches>, 
 <mailto:tarantool-patches-request@dev.tarantool.org?subject=unsubscribe>
List-Archive: <https://lists.tarantool.org/pipermail/tarantool-patches/>
List-Post: <mailto:tarantool-patches@dev.tarantool.org>
List-Help: <mailto:tarantool-patches-request@dev.tarantool.org?subject=help>
List-Subscribe: <https://lists.tarantool.org/mailman/listinfo/tarantool-patches>, 
 <mailto:tarantool-patches-request@dev.tarantool.org?subject=subscribe>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>, gorcunov@gmail.com
Cc: tarantool-patches@dev.tarantool.org


28.11.2020 01:10, Vladislav Shpilevoy пишет:
>>> On 24.11.2020 14:18, Serge Petrenko wrote:
>>>> When running a cluster with leader election, its useful to wait till the
>>>> instance is writeable to determine that it has become a leader. However,
>>>> sometimes the instance fails to write data right after transitioning to
>>>> leader because its limbo still contains pending transactions from the
>>>> old leader. Make sure the instance deals with pending transactions first
>>>> and becomes writeable only once the limbo is empty.
>>> I just realized one thing. We can add a function txn_limbo_is_ro(),
>>> like we did with raft_is_ro(), account it in box_update_ro_summary(),
>>> and call box_update_ro_summary() when we see that the limbo is emptied,
>>> or when its ownership changes to a different instance.
>>>
>>> Probably would be simpler, and also we could make it work with manual
>>> election! So users could call box.ctl.wait_rw() even without using raft!
>>>
>>> To show concrete error if somebody still tries to write, we could
>>> patch box_check_writable() to show the reason why the instance is not
>>> writable. We will do it anyway for raft, to tell the users the real
>>> leader in case they are trying to write on a replica. In scope of
>>> https://github.com/tarantool/tarantool/issues/5568.
>>>
>>> Your version of the patch also looks good.
>>>
>>> What do you think?
>> Thanks for your answer!
>>
>> Your proposal looks good. One question though. What about multimaster
>> synchro? Are we planning to support it one day? If yes, then limbo
>> emptiness will mean nothing.
> Indeed. But I have no idea if we will ever support it, and if
> yes - when. It is possible in theory, but we never tried to
> elaborate so far. I wouldn't expect it happening in the next
> year or so.
Ok, I see.
>> So, there're two options:
>>
>> 1) we may leave this patch as is. Then one won't be
>>     able to call wait_rw() with manual election. That's a pity, since
>>     your proposal looks quite logical, especially from the user's point
>>     of view. Having a single error for all these cases would be good.
> It also bothers me, that now box.ctl.wait_rw() actually does not wait
> for rw, strictly speaking. So it is probably even a bug, not a feature.
> A user can get wait_rw() true, but still won't be able to write.
> Value of such helper becomes zero with synchronous replication, in
> essence.


Ok.


>
>> 2) Make limbo affect is_ro. Then everything's good for now, but we'll
>>     have to rewrite it back once (and if) we decide to implement
>>     multimaster synchro. Then the patch will look exactly like it does
>>     now.
> Perhaps. If nothing will change in raft and limbo significantly.
>
>> I'm not sure whether we're planning to make multimaster synchro work,
>> so I can't choose between these options and leave you to decide.
> We discuss it, but there is no really a plan. No customer request, and
> no ticket AFAIR. I assume some big customer must ask for this for us to
> even start planning.
>
>> Besides, looks like if we take option 2 the patch will differ in a single
>> line: `box_update_ro_summary()` after `box_clear_synchro_queue`
> Why? You will need to patch the limbo as well, not just remove
> box_update_ro_summary from after box_clear_synchro_queue. We will
> need to add a method like txn_limbo_is_ro() (similar to raft_is_ro()),
> use it in box_update_ro_summary(), and call the latter from some
> places in the limbo.


Never mind, I don't know what I was thinking at the time of writing that.


>
> I vote for the option 2. Mostly because I am afraid it is rather a bug.
> wait_rw just does not guarantee anything now.


Done. Please see "[PATCH v2] box: make instace ro while limbo is not empty"

-- 
Serge Petrenko