From: Serge Petrenko via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>, gorcunov@gmail.com
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH v4 13/12] replication: send accumulated Raft messages after relay start
Date: Tue, 20 Apr 2021 13:38:46 +0300
Message-ID: <fb530ad6-d3a4-3cd4-b852-921d9a1cc767@tarantool.org>
In-Reply-To: <35351452-fbd8-926f-886b-8210ccb8f74e@tarantool.org>

20.04.2021 01:36, Vladislav Shpilevoy wrote:
> Thanks for the patch!
>
> See 2 comments below.
>
>> diff --git a/src/box/relay.cc b/src/box/relay.cc
>> index 7be33ee31..85f335cd7 100644
>> --- a/src/box/relay.cc
>> +++ b/src/box/relay.cc
>> @@ -628,13 +659,38 @@ struct relay_is_raft_enabled_msg {
>>  	bool is_finished;
>>  };
>>
>> +static void
>> +relay_push_raft_msg(struct relay *relay, bool do_restart_recovery)
> 1. Why is the recovery restart flag ignored if a message is already
> sent? This might lead to recovery restart loss if I am not mistaken.

I think it's okay. As long as the message is pushed from
relay_push_raft() rather than from tx_set_is_raft_enabled(), we may
freely restart the recovery.

So, we only care whether do_restart_recovery is set when the message
gets pushed in the same call. We don't care whether do_restart_recovery
is set or not when the call exits without pushing the message. The next
call will have the correct value for do_restart_recovery anyway.

Please see a more detailed explanation below.
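To make the point about the ignored flag concrete, here is a minimal standalone model of the early-return check in relay_push_raft_msg(). It is only a sketch, not the Tarantool code: `struct tx_model`, `struct raft_msg_model`, `tx_model_push()`, `tx_model_demo()` and the `pushed` counter are hypothetical names, cpipe_push() is replaced by a counter, and the moment the in-flight message returns to tx is assumed rather than taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct relay_raft_msg: only the flag matters. */
struct raft_msg_model {
	bool do_restart_recovery;
};

/* Hypothetical stand-in for the relay->tx fields touched by the hunk. */
struct tx_model {
	bool is_raft_enabled;
	bool is_raft_push_sent;
	bool is_raft_push_pending;
	int raft_ready_msg;
	struct raft_msg_model raft_msgs[2];
	int pushed; /* Counts simulated cpipe_push() calls. */
};

/*
 * Same gating and slot rotation as the quoted function. The boolean
 * return value (true only when a message was pushed) is an addition
 * of this model.
 */
bool
tx_model_push(struct tx_model *tx, bool do_restart_recovery)
{
	if (!tx->is_raft_enabled || tx->is_raft_push_sent)
		return false;
	struct raft_msg_model *msg = &tx->raft_msgs[tx->raft_ready_msg];
	msg->do_restart_recovery = do_restart_recovery;
	tx->pushed++;
	tx->raft_ready_msg = (tx->raft_ready_msg + 1) % 2;
	tx->is_raft_push_sent = true;
	tx->is_raft_push_pending = false;
	return true;
}

/* Walks through the situation from comment 1. */
void
tx_model_demo(void)
{
	struct tx_model tx = {0};
	tx.is_raft_enabled = true;
	/* The first call pushes slot 0 and records its flag. */
	assert(tx_model_push(&tx, false));
	/* A message is in flight: the call exits early, the flag is dropped. */
	assert(!tx_model_push(&tx, true));
	assert(!tx.raft_msgs[0].do_restart_recovery);
	assert(!tx.raft_msgs[1].do_restart_recovery);
	/* Assumption: the in-flight message has returned to tx. */
	tx.is_raft_push_sent = false;
	/* The next call pushes slot 1 with the flag it is given. */
	assert(tx_model_push(&tx, true));
	assert(tx.raft_msgs[1].do_restart_recovery);
}
```

Calling tx_model_demo() from a main() runs the walk-through. The only thing it is meant to show is that the flag is stored solely by the call that actually pushes, so a dropped flag never reaches the relay thread.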
>
>> +{
>> +	if (!relay->tx.is_raft_enabled || relay->tx.is_raft_push_sent)
>> +		return;
>> +	struct relay_raft_msg *msg =
>> +		&relay->tx.raft_msgs[relay->tx.raft_ready_msg];
>> +	msg->do_restart_recovery = do_restart_recovery;
>> +	cpipe_push(&relay->relay_pipe, &msg->base);
>> +	relay->tx.raft_ready_msg = (relay->tx.raft_ready_msg + 1) % 2;
>> +	relay->tx.is_raft_push_sent = true;
>> +	relay->tx.is_raft_push_pending = false;
>> +}
>> +
>>  /** TX thread part of the Raft flag setting, first hop. */
>>  static void
>>  tx_set_is_raft_enabled(struct cmsg *base)
>>  {
>>  	struct relay_is_raft_enabled_msg *msg =
>>  		(struct relay_is_raft_enabled_msg *)base;
>> -	msg->relay->tx.is_raft_enabled = msg->value;
>> +	struct relay *relay = msg->relay;
>> +	relay->tx.is_raft_enabled = msg->value;
>> +	/*
>> +	 * Send saved raft message as soon as relay becomes operational.
>> +	 * Do not restart recovery upon the message arrival. Recovery is
>> +	 * positioned at replica_clock initially, i.e. already "restarted" and
>> +	 * restarting it once again would position it at the oldest xlog
>> +	 * possible, because relay reader hasn't received replica vclock yet.
>> +	 */
>> +	if (relay->tx.is_raft_push_pending) {
>> +		relay_push_raft_msg(msg->relay, false);
> 2. I don't understand. Why wasn't there such a problem before? Recovery
> must be restarted when the node becomes a leader. If you do not restart
> it, the data would be ignored by the replicas. How do you know it is
> positioned right now at replica_clock? You are in tx thread, you can't
> tell. What do I miss?

This is because this `relay_push_raft_msg` is delivered before
`relay_set_is_raft_enabled`. And both these messages get processed by
the cbus_process() loop waiting for `relay_set_is_raft_enabled`.

This happens in relay_send_is_raft_enabled() even before the relay
reader fiber is created, so recv_vclock is zero. Restarting recovery
here would lead to it being reset to the first ever WAL this instance
has, which is wrong.

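The ordering argument above can be sketched with a toy model. This is an illustration under stated assumptions, not the relay implementation: vclocks are reduced to a single LSN, `struct relay_model` and the three `relay_model_*()` functions are hypothetical, and a restart is assumed to reposition recovery at the last position received by the reader, which is still zero before the reader fiber exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one LSN stands in for a whole vclock. */
struct relay_model {
	int64_t replica_lsn;  /* Position the replica asked to start from. */
	int64_t recv_lsn;     /* Last position seen by the reader, 0 before it runs. */
	int64_t recovery_lsn; /* Position recovery currently reads from. */
};

/* Recovery starts positioned at the replica's own position. */
void
relay_model_init(struct relay_model *r, int64_t replica_lsn)
{
	r->replica_lsn = replica_lsn;
	r->recv_lsn = 0;
	r->recovery_lsn = replica_lsn;
}

/* Models the reader fiber receiving a position from the replica. */
void
relay_model_reader_ack(struct relay_model *r, int64_t lsn)
{
	r->recv_lsn = lsn;
}

/* Models the relay thread handling one Raft message. */
void
relay_model_handle_raft_msg(struct relay_model *r, bool do_restart_recovery)
{
	/* Assumption: a restart repositions recovery at recv_lsn. */
	if (do_restart_recovery)
		r->recovery_lsn = r->recv_lsn;
}
```

In this model a message handled with do_restart_recovery set before any relay_model_reader_ack() call moves recovery_lsn to 0, the analogue of the oldest xlog, while the same message handled with the flag cleared leaves recovery at replica_lsn. Once the reader has reported a position, a restart becomes harmless.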
Such a problem might have existed before, but it was extremely hard to
catch: relay_push_raft_msg() wasn't called until
relay->tx.is_raft_enabled was set. And when tx.is_raft_enabled was set,
it most probably meant that relay_set_is_raft_enabled was already
delivered and the relay had exited this first cbus_process() loop, which
ran before the reader fiber creation.

To solve the problem some other way, I would need to make
relay_push_raft_msg() deliver the message to the second cbus_process()
loop, the main one. And I couldn't come up with an idea of how to do
that. The message should be pushed right in tx_set_is_raft_enabled, and
this means it'll get delivered before relay_set_is_raft_enabled.

-- 
Serge Petrenko