From: Serge Petrenko via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>, gorcunov@gmail.com
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH v4 10/12] election: support manual elections in clear_synchro_queue()
Date: Sun, 18 Apr 2021 12:26:41 +0300
Message-ID: <5b3ec61c-5790-b27e-9a8c-9964bbcb6b4f@tarantool.org>
In-Reply-To: <05b14638-53d8-51db-4a85-3adafd611db6@tarantool.org>
17.04.2021 01:24, Vladislav Shpilevoy wrote:
> Thanks for the patch!
>
> See 1 comment below.
>
>> diff --git a/src/box/box.cc b/src/box/box.cc
>> index d5a55a30a..fcd812c09 100644
>> --- a/src/box/box.cc
>> +++ b/src/box/box.cc
>> @@ -1521,12 +1521,75 @@ box_clear_synchro_queue(bool try_wait)
>> if (!is_box_configured ||
>> raft_node_term(box_raft(), instance_id) == box_raft()->term)
>> return 0;
>> +
>> + bool run_elections = false;
>> +
>> + switch (box_election_mode) {
>> + case ELECTION_MODE_OFF:
>> + break;
>> + case ELECTION_MODE_VOTER:
>> + assert(box_raft()->state == RAFT_STATE_FOLLOWER);
>> + diag_set(ClientError, ER_UNSUPPORTED, "election_mode='voter'",
>> + "manual elections");
>> + return -1;
>> + case ELECTION_MODE_MANUAL:
>> + assert(box_raft()->state == RAFT_STATE_FOLLOWER);
>> + run_elections = true;
>> + try_wait = false;
>> + break;
>> + case ELECTION_MODE_CANDIDATE:
>> + /*
>> + * Leader elections are enabled, and this instance is allowed to
>> + * promote only if it's already an elected leader. No manual
>> + * elections.
>> + */
>> + if (box_raft()->state != RAFT_STATE_LEADER) {
>> + diag_set(ClientError, ER_UNSUPPORTED, "election_mode="
>> + "'candidate'", "manual elections");
>> + return -1;
>> + }
>> + break;
>> + default:
>> + unreachable();
>> + }
>> +
>> uint32_t former_leader_id = txn_limbo.owner_id;
>> int64_t wait_lsn = txn_limbo.confirmed_lsn;
>> int rc = 0;
>> int quorum = replication_synchro_quorum;
>> in_clear_synchro_queue = true;
>>
>> + if (run_elections) {
>> + /*
>> + * Make this instance a candidate and run until some leader, not
>> + * necessarily this instance, emerges.
>> + */
>> + raft_start_candidate(box_raft());
>> + /*
>> + * Trigger new elections without waiting for an old leader to
>> + * disappear.
>> + */
>> + raft_new_term(box_raft());
>> + box_raft_wait_leader_found();
> Shouldn't we wait for election_timeout?
I think not. Let's wait for however long it takes to elect a leader.
Several terms may pass before the leader is finally elected.
I mean, IMO it would be simpler for the user to do:
```
box.ctl.promote()
-- term1, split vote
-- term2, split vote
-- term3, leader found
-- success
```
rather than
```
box.ctl.promote()
-- error, split vote
box.ctl.promote()
-- error, split vote
box.ctl.promote()
-- success
```
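To make the difference concrete, here is a minimal standalone sketch in C. It is a toy model only: `enum term_result`, `promote_blocking()`, `promote_single_term()` and `promote_with_retries()` are hypothetical names invented for illustration and are not part of the patch or of Tarantool. The array of scripted term outcomes stands in for real elections.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcome of one election term in this toy model. */
enum term_result { TERM_SPLIT_VOTE, TERM_LEADER_FOUND };

/*
 * Blocking variant: keep going through terms until one of them
 * produces a leader. Returns the number of terms consumed, or -1 if
 * the scripted outcomes run out (stands in for "still waiting").
 */
static int
promote_blocking(const enum term_result *terms, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (terms[i] == TERM_LEADER_FOUND)
			return (int)(i + 1);
	}
	return -1;
}

/*
 * Per-term variant: give up after a single term. Returns 0 on success
 * and -1 on a split vote, so the caller has to retry.
 */
static int
promote_single_term(const enum term_result *terms, size_t count, size_t *pos)
{
	assert(*pos < count);
	return terms[(*pos)++] == TERM_LEADER_FOUND ? 0 : -1;
}

/* What the user would have to write on top of the per-term variant. */
static int
promote_with_retries(const enum term_result *terms, size_t count)
{
	size_t pos = 0;
	int calls = 0;
	while (pos < count) {
		calls++;
		if (promote_single_term(terms, count, &pos) == 0)
			return calls;
	}
	return -1;
}
```

With two split votes followed by a successful term, the blocking variant needs one call from the user, while the per-term variant needs three calls plus the retry loop around them.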
>
> Also what if the fiber is canceled before the leader is found? It
> seems box_raft_wait_leader_found() would fail on an assertion because
> raft is still enabled, but leader_id is nil.
Thanks for noticing! Will fix.
Diff:
==================================
diff --git a/src/box/box.cc b/src/box/box.cc
index 962f649c3..797aa86b5 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -1572,13 +1572,17 @@ box_clear_synchro_queue(bool try_wait)
* disappear.
*/
raft_new_term(box_raft());
- box_raft_wait_leader_found();
+ rc = box_raft_wait_leader_found();
/*
* Do not reset raft mode if it was changed while running the
* elections.
*/
if (box_election_mode == ELECTION_MODE_MANUAL)
raft_stop_candidate(box_raft(), false);
+ if (rc != 0) {
+ in_clear_synchro_queue = false;
+ return -1;
+ }
if (!box_raft()->is_enabled) {
diag_set(ClientError, ER_RAFT_DISABLED);
in_clear_synchro_queue = false;
diff --git a/src/box/raft.c b/src/box/raft.c
index 425353207..61fa9f91b 100644
--- a/src/box/raft.c
+++ b/src/box/raft.c
@@ -347,15 +347,20 @@ box_raft_wait_leader_found_f(struct trigger *trig, void *event)
return 0;
}
-void
+int
box_raft_wait_leader_found(void)
{
struct trigger trig;
trigger_create(&trig, box_raft_wait_leader_found_f, fiber(), NULL);
raft_on_update(box_raft(), &trig);
fiber_yield();
- assert(box_raft()->leader != REPLICA_ID_NIL || !box_raft()->is_enabled);
trigger_clear(&trig);
+ if (fiber_is_cancelled()) {
+ diag_set(FiberIsCancelled);
+ return -1;
+ }
+ assert(box_raft()->leader != REPLICA_ID_NIL || !box_raft()->is_enabled);
+ return 0;
}
void
diff --git a/src/box/raft.h b/src/box/raft.h
index 8fce423e1..6b6136510 100644
--- a/src/box/raft.h
+++ b/src/box/raft.h
@@ -97,7 +97,8 @@ box_raft_checkpoint_remote(struct raft_request *req);
int
box_raft_process(struct raft_request *req, uint32_t source);
-void
+/** Block this fiber until Raft leader is known. */
+int
box_raft_wait_leader_found();
void
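The ordering in the box.cc hunk matters: the temporary candidate state is undone before the error from the wait is propagated. Below is a minimal standalone sketch of that pattern, assuming a toy model. `struct toy_node`, `toy_wait_leader_found()` and `toy_promote()` are hypothetical and are not the real Tarantool structures or functions. Unlike the real function, the sketch clears its in-progress flag right after the election step, because it models nothing beyond that step.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the node state; not the real struct raft. */
struct toy_node {
	bool is_candidate;   /* set while manual elections are running */
	bool is_cancelled;   /* models fiber_is_cancelled() */
	bool in_promote;     /* models in_clear_synchro_queue */
	int leader_id;       /* 0 models REPLICA_ID_NIL */
};

/*
 * Models the wait after wakeup: the fiber was either cancelled or
 * woken because a leader became known.
 */
static int
toy_wait_leader_found(const struct toy_node *node)
{
	if (node->is_cancelled)
		return -1;
	/* Only safe to assert once cancellation has been ruled out. */
	assert(node->leader_id != 0);
	return 0;
}

static int
toy_promote(struct toy_node *node)
{
	node->in_promote = true;
	node->is_candidate = true;
	int rc = toy_wait_leader_found(node);
	/* Undo the temporary state first, however the wait ended. */
	node->is_candidate = false;
	if (rc != 0) {
		node->in_promote = false;
		return -1;
	}
	/* The toy model ends here; the real function keeps going. */
	node->in_promote = false;
	return 0;
}
```

A cancelled wait returns -1 without tripping the assertion, and the node is left neither a candidate nor marked as promoting.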
>
>> + /*
>> + * Do not reset raft mode if it was changed while running the
>> + * elections.
>> + */
>> + if (box_election_mode == ELECTION_MODE_MANUAL)
>> + raft_stop_candidate(box_raft(), false);
>> + if (!box_raft()->is_enabled) {
>> + diag_set(ClientError, ER_RAFT_DISABLED);
>> + in_clear_synchro_queue = false;
>> + return -1;
>> + }
>> + if (box_raft()->state != RAFT_STATE_LEADER) {
>> + diag_set(ClientError, ER_INTERFERING_PROMOTE,
>> + box_raft()->leader);
>> + in_clear_synchro_queue = false;
>> + return -1;
>> + }
>> + }
>> +
>> if (txn_limbo_is_empty(&txn_limbo))
>> goto promote;
>>
--
Serge Petrenko