From: Sergey Petrenko via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>, gorcunov@gmail.com
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH v3 07/12] box: introduce `box.ctl.demote`
Date: Fri, 23 Jul 2021 09:44:36 +0200
Message-ID: <036d7d32-54bc-3239-9291-d713b062f324@tarantool.org>
In-Reply-To: <9ae5765f-ffcb-af66-469b-b0a99d71c331@tarantool.org>

22.07.2021 01:28, Vladislav Shpilevoy wrote:
> Thanks for the fixes!
>
>>> 2. On top of this commit and on top of the branch too I tried to
>>> promote a candidate and got a strange error in the logs, although
>>> the promotion was successful:
>>>
>>> --
>>> -- Instance 1
>>> --
>>>
>>> -- Step 1
>>> box.cfg{
>>>     listen = 3313,
>>>     election_mode = 'candidate',
>>>     replication_synchro_quorum = 2,
>>>     replication = {'localhost:3314', 'localhost:3313'}
>>> }
>>> box.schema.user.grant('guest', 'super')
>>>
>>>
>>> --
>>> -- Instance 2
>>> --
>>>
>>> -- Step 2
>>> box.cfg{
>>>     listen = 3314,
>>>     election_mode = 'voter',
>>>     replication_synchro_quorum = 2,
>>>     replication = {'localhost:3314', 'localhost:3313'},
>>>     read_only = true,
>>> }
>>>
>>> -- Step 3
>>> box.cfg{read_only = false, election_mode = 'candidate'}
>>>
>>> -- Step 4
>>> box.ctl.promote()
>>>
>>> main/112/raft_worker box.cc:1538 E> ER_UNSUPPORTED: box.ctl.promote does not support simultaneous invocations
>>> ---
>>> ...
>> That's because once a candidate becomes the leader, it tries to issue `box.ctl.promote()`, and fails,
>> since we're already in `box.ctl.promote()` call.
>> I'm not sure how to handle that properly. This doesn't break anything though.
> Still, the error message looks really not good. There is an option -
> make box_promote() for candidate node just call raft_promote() and set
> box_in_promote = false. Then wait for the term outcome. Will it work?
> You would need to rebase your patchset on master branch then.

Hmm, I don't like that. It would work, but it would complicate things in
box_promote() even more. One would then have to keep in mind that a manual
promote call has 2 phases: one when it is issued manually, and the other when
it is called automatically when (and if) the instance becomes the leader.

I think it'd be better to check whether we should run box_promote() right in
box_raft_update_synchro_queue().

Check out the diff (mostly for commit "box: allow calling promote on a candidate"):

=====================================

diff --git a/src/box/box.cc b/src/box/box.cc
index a30e4f78d..5cdca4bd4 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -1692,20 +1692,24 @@ box_issue_demote(uint32_t prev_leader_id, int64_t promote_lsn)
 }

 /* A guard to block multiple simultaneous promote()/demote() invocations. */
-static bool box_in_promote = false;
+static bool in_box_promote = false;
+
+bool box_in_promote(void) {
+	return in_box_promote;
+}

 int
 box_promote(void)
 {
-	if (box_in_promote) {
+	if (in_box_promote) {
 		diag_set(ClientError, ER_UNSUPPORTED, "box.ctl.promote",
 			 "simultaneous invocations");
 		return -1;
 	}
 	struct raft *raft = box_raft();
-	box_in_promote = true;
+	in_box_promote = true;
 	auto promote_guard = make_scoped_guard([&] {
-		box_in_promote = false;
+		in_box_promote = false;
 	});

 	if (!is_box_configured)
@@ -1757,14 +1761,14 @@ box_promote(void)
 int
 box_demote(void)
 {
-	if (box_in_promote) {
+	if (in_box_promote) {
 		diag_set(ClientError, ER_UNSUPPORTED, "box.ctl.demote",
 			 "simultaneous invocations");
 		return -1;
 	}
-	box_in_promote = true;
+	in_box_promote = true;
 	auto promote_guard = make_scoped_guard([&] {
-		box_in_promote = false;
+		in_box_promote = false;
 	});

 	if (!is_box_configured)
diff --git a/src/box/box.h b/src/box/box.h
index aaf20d9dd..344ed90f2 100644
--- a/src/box/box.h
+++ b/src/box/box.h
@@ -273,6 +273,9 @@ extern "C" {

 typedef struct tuple box_tuple_t;

+bool
+box_in_promote(void);
+
 int
 box_promote(void);

diff --git a/src/box/raft.c b/src/box/raft.c
index 35c471f58..5e496c2e4 100644
--- a/src/box/raft.c
+++ b/src/box/raft.c
@@ -88,11 +88,11 @@ box_raft_update_synchro_queue(struct raft *raft)
 {
 	assert(raft == box_raft());
 	/*
-	 * In case these are manual elections, we are already in the middle of a
-	 * `promote` call. No need to call it once again.
+	 * In case the elections were triggered manually, we are already in
+	 * the middle of a `promote` call. No need to call it once again.
 	 */
 	if (raft->state == RAFT_STATE_LEADER &&
-	    box_election_mode != ELECTION_MODE_MANUAL) {
+	    !box_in_promote()) {
 		int rc = 0;
 		uint32_t errcode = 0;
 		do {

=====================================

>
> It might be a little easier to do if you apply the diff below. (Warning:
> I didn't test it.) The motivation is that one of the main reasons why I
> wanted box_promote() simplified was the strange meaning of some
> flags. In particular, the try_wait flag for some reason triggered elections
> before the waiting, which is not obvious at all. How does 'wait' come to
> 'elections'?
>
> In the diff I tried to remove these flags entirely. And now you have a
> single place in the code of box_promote(), where ELECTION_MODE_CANDIDATE
> stuff is handled. Here you could try the proposal I gave above.

Thanks for the help! Your diff looks good, I've reworked my patches to comply.

> ====================
> diff --git a/src/box/box.cc b/src/box/box.cc
> index f68fffcab..e7765b657 100644
> --- a/src/box/box.cc
> +++ b/src/box/box.cc
> @@ -1698,34 +1698,6 @@ box_issue_demote(uint32_t prev_leader_id, int64_t promote_lsn)
>  	assert(txn_limbo_is_empty(&txn_limbo));
>  }
>
> -/**
> - * Check whether this instance may run a promote() and set promote parameters
> - * according to its election mode.
> - */
> -static int
> -box_check_promote_election_mode(bool *try_wait, bool *run_elections)
> -{
> -	switch (box_election_mode) {
> -	case ELECTION_MODE_OFF:
> -		if (try_wait != NULL)
> -			*try_wait = true;
> -		break;
> -	case ELECTION_MODE_VOTER:
> -		assert(box_raft()->state == RAFT_STATE_FOLLOWER);
> -		diag_set(ClientError, ER_UNSUPPORTED, "election_mode='voter'",
> -			 "manual elections");
> -		return -1;
> -	case ELECTION_MODE_MANUAL:
> -	case ELECTION_MODE_CANDIDATE:
> -		if (run_elections != NULL)
> -			*run_elections = box_raft()->state != RAFT_STATE_LEADER;
> -		break;
> -	default:
> -		unreachable();
> -	}
> -	return 0;
> -}
> -
>  /* A guard to block multiple simultaneous promote()/demote() invocations. */
>  static bool box_in_promote = false;
>
> @@ -1757,27 +1729,35 @@ box_promote(void)
>  	if (is_leader)
>  		return 0;
>
> -	bool run_elections = false;
> -	bool try_wait = false;
> -
> -	if (box_check_promote_election_mode(&try_wait, &run_elections) < 0)
> -		return -1;
> -
> -	int64_t wait_lsn = -1;
> -
> -	if (run_elections && box_run_elections() < 0)
> -		return -1;
> -	if (try_wait) {
> -		if (box_try_wait_confirm(2 * replication_synchro_timeout) < 0)
> +	switch (box_election_mode) {
> +	case ELECTION_MODE_OFF:
> +		if (box_try_wait_confirm(2 * replication_synchro_timeout) != 0)
> +			return -1;
> +		if (box_trigger_elections() != 0)
>  			return -1;
> -		if (box_trigger_elections() < 0)
> +		break;
> +	case ELECTION_MODE_VOTER:
> +		assert(box_raft()->state == RAFT_STATE_FOLLOWER);
> +		diag_set(ClientError, ER_UNSUPPORTED, "election_mode='voter'",
> +			 "manual elections");
> +		return -1;
> +	case ELECTION_MODE_MANUAL:
> +	case ELECTION_MODE_CANDIDATE:
> +		if (box_raft()->state != RAFT_STATE_LEADER &&
> +		    box_run_elections() != 0)
>  			return -1;
> +		break;
> +	default:
> +		unreachable();
>  	}
> -	if ((wait_lsn = box_wait_limbo_acked()) < 0)
> +	if (box_check_promote_election_mode(&try_wait, &run_elections) < 0)
>  		return -1;
>
> -	box_issue_promote(txn_limbo.owner_id, wait_lsn);
> +	int64_t wait_lsn = box_wait_limbo_acked();
> +	if (wait_lsn < 0)
> +		return -1;
>
> +	box_issue_promote(txn_limbo.owner_id, wait_lsn);
>  	return 0;
>  }
>
> @@ -1804,29 +1784,16 @@ box_demote(void)
>  	is_leader = is_leader && box_raft()->state == RAFT_STATE_LEADER;
>  	if (!is_leader)
>  		return 0;
> -
> -	bool try_wait = false;
> -
> -	if (box_check_promote_election_mode(&try_wait, NULL) < 0)
> -		return -1;
> -
> -	int64_t wait_lsn = -1;
> -
>  	if (box_trigger_elections() < 0)
>  		return -1;
> -
>  	if (box_election_mode != ELECTION_MODE_OFF)
>  		return 0;
> -
> -	if (try_wait &&
> -	    box_try_wait_confirm(2 * replication_synchro_timeout) < 0)
> +	if (box_try_wait_confirm(2 * replication_synchro_timeout) != 0)
>  		return -1;
> -
> -	if ((wait_lsn = box_wait_limbo_acked()) < 0)
> +	int64_t wait_lsn = box_wait_limbo_acked();
> +	if (wait_lsn < 0)
>  		return -1;
> -
>  	box_issue_demote(txn_limbo.owner_id, wait_lsn);
> -
>  	return 0;
>  }
>
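
For readers who want to see the re-entrancy issue in isolation, here is a minimal, self-contained sketch of the idea behind the box_in_promote() check in box_raft_update_synchro_queue(). It is not Tarantool code: ScopeExit, toy_promote(), on_elected_leader() and promote_in_progress() are hypothetical stand-ins (for make_scoped_guard(), box_promote(), the raft worker callback and box_in_promote() respectively), and the election itself is reduced to a boolean argument.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical scope-exit helper standing in for make_scoped_guard().
class ScopeExit {
public:
	explicit ScopeExit(std::function<void()> f) : f_(std::move(f)) {}
	~ScopeExit() { f_(); }
	ScopeExit(const ScopeExit &) = delete;
	ScopeExit &operator=(const ScopeExit &) = delete;
private:
	std::function<void()> f_;
};

static bool in_promote = false; // true while a promote call is running
static int promote_runs = 0;    // promote bodies that ran to completion
static int rejected_calls = 0;  // calls refused by the re-entrancy guard

// Read-only accessor, analogous in spirit to box_in_promote() in the diff.
bool promote_in_progress() { return in_promote; }

void on_elected_leader(bool check_flag);

// Toy promote: `wins_election` simulates becoming the leader mid-call,
// which fires the leader callback while the guard flag is still set.
int toy_promote(bool wins_election, bool callback_checks_flag)
{
	if (in_promote) {
		++rejected_calls; // the "simultaneous invocations" case
		return -1;
	}
	in_promote = true;
	ScopeExit guard([] { in_promote = false; });
	if (wins_election)
		on_elected_leader(callback_checks_flag);
	assert(in_promote); // nested calls never clear the flag
	++promote_runs;
	return 0;
}

// Toy leader callback, standing in for box_raft_update_synchro_queue().
void on_elected_leader(bool check_flag)
{
	if (check_flag && promote_in_progress())
		return; // a promote is already running and will finish the job
	toy_promote(false, check_flag);
}
```

With callback_checks_flag set to false, the nested call is rejected by the guard, which corresponds to the spurious ER_UNSUPPORTED line in the log; with it set to true, the callback returns early and the outer call completes without a rejected invocation. A callback fired outside of any promote call (automatic elections) still runs the promote body as before.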