Tarantool development patches archive
From: Serge Petrenko <sergepetrenko@tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>,
	tarantool-patches@dev.tarantool.org, gorcunov@gmail.com
Subject: Re: [Tarantool-patches] [PATCH 10/12] raft: move box_update_ro_summary to update trigger
Date: Thu, 19 Nov 2020 13:08:33 +0300
Message-ID: <b971015c-5466-3999-d837-3bdcb6de3546@tarantool.org>
In-Reply-To: <169cf614-cd6f-a62c-07e7-adde82334a6e@tarantool.org>


19.11.2020 02:21, Vladislav Shpilevoy wrote:
> Hi! Thanks for the review!
>
>> Raft uses synchronous WAL write, correct?
>>
>> So there's a yield in raft_worker_handle_io(). Now there's a period of time when
>> an instance is a follower, but it isn't read-only.
>>
>> When you reconfigure a leader to become voter, everything's fine, since no
>> writing to disk is involved.
>>
>> However, if an existing leader receives a message with a term greater than its own,
>> it'll first persist this term, and thus yield, and only later proceed to broadcast and
>> switch to ro.
>>
>> So now it's possible that a follower is writeable for some period of time.
> You are a savior. Thanks for the deep review and for noticing this.
>
> Indeed. The issue exists. And it is much deeper, it seems.
>
> I also glanced at your patch about RO-update vs limbo-clear order.
> Unfortunately, it does not work, and neither does my idea of running on_update
> triggers from raft_schedule_broadcast().
>
> The reason is that, in both our solutions, the on_update trigger currently
> yields because it calls box_clear_synchro_queue(). And this is exactly
> what I was trying to avoid by using a fiber in the first place. The state
> machine must not yield.
>
> There is also another issue, not related to the patch, which I spotted
> while working on it today: we can't cancel limbo clearance.
> The node could be demoted to a follower while waiting for confirms, but it
> will still wait for them, and we can't stop it.
>
> I started thinking that we could resolve both these issues if box/raft.c could
> run update triggers without yields right away, but schedule async work, like
> limbo clear, into a separate fiber, cancellable.
>
> And I realized that we already have this fiber - raft.worker.
>
> The idea is that we can move the worker fiber to box/raft.c. And in libraft
> instead of a fiber we will have 2 methods:
>
> 	- raft_vtab.async_f - virtual method called by Raft, when it wants
> 	  to schedule some async heavy work (network, disk). We will call it
> 	  instead of raft_worker_wakeup().
>
> 	- raft_process_async - normal method, which the Raft owner should call
> 	  to handle all async events. In a separate fiber. This is the same
> 	  as raft_worker_f(), but not depending on fiber, and finite.
>
> In box/raft.c we have a worker fiber. To libraft we give async_f, which
> creates the fiber on demand, and wakes it up. No yields. Like now, but in
> box/raft.c.
>
> The fiber in its body will call raft_process_async() and fiber_yield() until
> it is cancelled.
>
> Now, how does this fix the update triggers? We fire on_update triggers in
> raft_schedule_broadcast(), but in the trigger box/raft.c will only update the
> RO summary. It won't yield. For the limbo clearance it will wake up the worker
> fiber, which now belongs to box/raft.c, so it is totally fine. The worker
> will call raft_process_async() and will clear the limbo when it is time.
>
> Also the worker fiber can be cancelled/interrupted somehow if we want to
> stop limbo clear when the node is not a leader anymore.
>
> I started working on this already, and it seems to be good. Raft is simplified
> even more, and we delete the ugly hack in box_raft_free() about changing struct
> raft with box_raft_global.worker = NULL. We still nullify the fiber, but we
> don't change struct raft.
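
The split proposed above can be sketched roughly as follows. This is a
minimal single-threaded model under stated assumptions, not the actual patch:
only the names raft_vtab.async_f, raft_process_async() and
raft_schedule_broadcast() come from the message. Every struct field, every
box_* helper and raft_process_term() are hypothetical, and the worker fiber
is replaced by a flag plus an explicit step function so the sketch does not
depend on the fiber library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum raft_state { RAFT_STATE_FOLLOWER, RAFT_STATE_LEADER };

struct raft;

/* Only async_f is named in the message; the layout is an assumption. */
struct raft_vtab {
	/* Raft asks its owner to schedule heavy work. Must not yield. */
	void (*async_f)(struct raft *raft);
};

struct raft {
	enum raft_state state;
	unsigned long term;
	/* Hypothetical flag: the new term is not persisted yet. */
	bool is_write_pending;
	const struct raft_vtab *vtab;
	/* A single trigger stands in for the on_update trigger list. */
	void (*on_update)(struct raft *raft);
};

/* Owner side (the box/raft.c role). All names here are hypothetical. */
bool box_is_ro = true;
bool box_worker_is_scheduled = false;

/* The trigger only recomputes the RO summary. It never yields. */
void
box_raft_on_update(struct raft *raft)
{
	box_is_ro = raft->state != RAFT_STATE_LEADER;
}

/* Stands in for "create the worker fiber on demand and wake it up". */
void
box_raft_async_f(struct raft *raft)
{
	(void)raft;
	box_worker_is_scheduled = true;
}

const struct raft_vtab box_raft_vtab = { .async_f = box_raft_async_f };

/* Library side: fire triggers right away, defer the heavy work. */
void
raft_schedule_broadcast(struct raft *raft)
{
	assert(raft->vtab != NULL && raft->vtab->async_f != NULL);
	if (raft->on_update != NULL)
		raft->on_update(raft);
	raft->vtab->async_f(raft);
}

void
raft_become_leader(struct raft *raft)
{
	raft->state = RAFT_STATE_LEADER;
	raft_schedule_broadcast(raft);
}

/* A bigger term demotes the node; the disk write is only scheduled. */
void
raft_process_term(struct raft *raft, unsigned long new_term)
{
	if (new_term <= raft->term)
		return;
	raft->term = new_term;
	raft->state = RAFT_STATE_FOLLOWER;
	raft->is_write_pending = true;
	raft_schedule_broadcast(raft);
}

/* Finite: handles the pending events once and returns. */
void
raft_process_async(struct raft *raft)
{
	if (raft->is_write_pending) {
		/* The real code would write to WAL here and may yield. */
		raft->is_write_pending = false;
	}
}

/*
 * One iteration of the owner's worker loop. A real fiber would repeat
 * this and yield in between until it is cancelled; limbo clearance
 * would also live here.
 */
void
box_raft_worker_step(struct raft *raft)
{
	if (!box_worker_is_scheduled)
		return;
	box_worker_is_scheduled = false;
	raft_process_async(raft);
}
```

In this model a leader that sees a bigger term becomes read-only inside
raft_process_term() itself, before the term is persisted, so the window with
a writable follower described earlier in the thread does not appear. The
slow part runs only when the owner calls box_raft_worker_step().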


Thanks for the answer & investigation!

Looks good at first glance.

-- 
Serge Petrenko

