[Tarantool-patches] [PATCH v3 07/12] box: introduce `box.ctl.demote`
Sergey Petrenko
sergepetrenko at tarantool.org
Thu Jul 29 23:56:45 MSK 2021
27.07.2021 02:50, Vladislav Shpilevoy пишет:
> Thanks for the fixes!
>
> On 23.07.2021 09:44, Sergey Petrenko wrote:
>> 22.07.2021 01:28, Vladislav Shpilevoy пишет:
>>> Thanks for the fixes!
>>>
>>>>> 2. On top of this commit and on top of the branch too I tried to
>>>>> promote a candidate and got a strange error in the logs, although
>>>>> the promotion was successful:
>>>>>
>>>>> --
>>>>> -- Instance 1
>>>>> --
>>>>>
>>>>> -- Step 1
>>>>> box.cfg{
>>>>> listen = 3313,
>>>>> election_mode = 'candidate',
>>>>> replication_synchro_quorum = 2,
>>>>> replication = {'localhost:3314', 'localhost:3313'}
>>>>> }
>>>>> box.schema.user.grant('guest', 'super')
>>>>>
>>>>>
>>>>> --
>>>>> -- Instance 2
>>>>> --
>>>>>
>>>>> -- Step 2
>>>>> box.cfg{
>>>>> listen = 3314,
>>>>> election_mode = 'voter',
>>>>> replication_synchro_quorum = 2,
>>>>> replication = {'localhost:3314', 'localhost:3313'},
>>>>> read_only = true,
>>>>> }
>>>>>
>>>>> -- Step 3
>>>>> box.cfg{read_only = false, election_mode = 'candidate'}
>>>>>
>>>>> -- Step 4
>>>>> box.ctl.promote()
>>>>>
>>>>> main/112/raft_worker box.cc:1538 E> ER_UNSUPPORTED: box.ctl.promote does not support simultaneous invocations
>>>>> ---
>>>>> ...
>>>> That's because once a candidate becomes the leader, it tries to issue `box.ctl.promote()`, and fails,
>>>> since we're already in `box.ctl.promote()` call.
>>>> I'm not sure how to handle that properly. This doesn't break anything though.
>>> Still, the error message looks really not good. There is an option -
>>> make box_promote() for candidate node just call raft_promote() and set
>>> box_in_promote = false. Then wait for the term outcome. Will it work?
>>> You would need to rebase your patchset on master branch then.
>> Hmm, I don't like that. This will work, but it will complicate things in
>>
>> box_promote even more. Now one has to keep in mind that a manual
>>
>> promote call has 2 phases, one when it is issued manually and the other
>>
>> when it's called automatically when (and if) the instance becomes the leader.
> But is it different in the new version on the branch? You still need to check
> if you are "in promote", but instead of it being encapsulated into box_promote()
> we now need to check it potentially in all box_promote() usage places. Which is
> only 2 now though, and only 1 needs this. But you can see that now you need to
> think if you should ignore being already in promote.
>> diff --git a/src/box/box.cc b/src/box/box.cc
>> index a30e4f78d..5cdca4bd4 100644
>> --- a/src/box/box.cc
>> +++ b/src/box/box.cc
>> @@ -1692,20 +1692,24 @@ box_issue_demote(uint32_t prev_leader_id, int64_t promote_lsn)
>> }
>>
>> /* A guard to block multiple simultaneous promote()/demote() invocations. */
>> -static bool box_in_promote = false;
>> +static bool in_box_promote = false;
>> +
>> +bool box_in_promote(void) {
> Could you please put the return value on a separate line and use 'is'
> as a suffix? For instance, `box_is_in_promote()`.
>> diff --git a/src/box/box.h b/src/box/box.h
>> index aaf20d9dd..344ed90f2 100644
>> --- a/src/box/box.h
>> +++ b/src/box/box.h
>> @@ -88,11 +88,11 @@ box_raft_update_synchro_queue(struct raft *raft)
>> {
>> assert(raft == box_raft());
>> /*
>> - * In case these are manual elections, we are already in the middle of a
>> - * `promote` call. No need to call it once again.
>> + * In case the elections were triggered manually, we are already in
>> + * the middle of a `promote` call. No need to call it once again.
>> */
>> if (raft->state == RAFT_STATE_LEADER &&
>> - box_election_mode != ELECTION_MODE_MANUAL) {
>> + !box_in_promote()) {
> It just seems to me that it should not be of raft's business if it was
> promoted manually or not.
Ok, I see your point. I'll apply your diff.
>> int rc = 0;
>> uint32_t errcode = 0;
>> do {
>> =====================================
>>
>>
>>> It might be a little easier to do if you apply the diff below. (Warning:
>>> I didn't test it.) The motivation is that one of the main reasons why I
>>> wanted box_promote() simplified was because of the strange meaning of some
>>> flags. In particular, try_wait flag somewhy triggered elections before the
>>> waiting which is super not obvious why. How does 'wait' come to 'elections'?
>>>
>>> In the diff I tried to remove these flags entirely. And now you have a
>>> single place in the code of box_promote(), where ELECTION_MODE_CANDIDATE
>>> stuff is handled. Here you could try the proposal I gave above.
>> Thanks for the help! Your diff looks good, I've reworked my patches to comply.
> Wouldn't it be too hard to merge this diff into the previous
> commits? Because it could be done like that from the first
> box_promote-refactoring commit it looks to me.
No problem.
>> diff --git a/src/box/box.cc b/src/box/box.cc
>> index af39f66f4..36302310b 100644
>> --- a/src/box/box.cc
>> +++ b/src/box/box.cc
>> @@ -1701,14 +1701,14 @@ box_promote(void)
>> if (!is_box_configured)
>> return 0;
>> if (txn_limbo_replica_term(&txn_limbo, instance_id) ==
>> - box_raft()->term)
>> + raft->term)
>> return 0;
>> - bool run_elections = false;
>> - bool try_wait = false;
>> -
>> switch (box_election_mode) {
>> case ELECTION_MODE_OFF:
>> - try_wait = true;
>> + if (box_try_wait_confirm(2 * replication_synchro_timeout) != 0)
>> + return -1;
>> + if (box_trigger_elections() != 0)
>> + return -1;
>> break;
>> case ELECTION_MODE_VOTER:
>> assert(raft->state == RAFT_STATE_FOLLOWER);
>> @@ -1717,23 +1717,17 @@ box_promote(void)
>> return -1;
>> case ELECTION_MODE_MANUAL:
>> case ELECTION_MODE_CANDIDATE:
>> - run_elections = raft->state != RAFT_STATE_LEADER;
>> + if (raft->state != RAFT_STATE_LEADER &&
>> + box_run_elections() != 0) {
>> + return -1;
>> + }
>> break;
>> default:
>> unreachable();
>> }
>>
>> - int64_t wait_lsn = -1;
>> -
>> - if (run_elections && box_run_elections() != 0)
>> - return -1;
>> - if (try_wait) {
>> - if (box_try_wait_confirm(2 * replication_synchro_timeout) != 0)
>> - return -1;
>> - if (box_trigger_elections() != 0)
>> - return -1;
>> - }
>> - if ((wait_lsn = box_wait_limbo_acked()) < 0)
>> + int64_t wait_lsn = box_wait_limbo_acked();
>> + if (wait_lsn < 0)
>> return -1;
>>
>> box_issue_promote(txn_limbo.owner_id, wait_lsn);
> I thought more about how the manual promotion conflicts with the
> automatic one and I tried to separate them. See my diff below, it
> is done on top of this commit, but I didn't push it, because I
> didn't run the tests except the one which used to log an error
> in the first version of the patchset.
>
> Thanks to the new small helper functions, I implemented a special
> function for automatic promote which does not conflict with
> the manual one. It you will apply it though, you might want to
> do it earlier that this commit.
Thanks for the help! I've applied your diff with minor changes to
patches
box: split manual and automatic promotion
box: allow calling promote on a candidate
>
> ====================
> diff --git a/src/box/box.cc b/src/box/box.cc
> index 36302310b..65d064615 100644
> --- a/src/box/box.cc
> +++ b/src/box/box.cc
> @@ -1599,7 +1599,7 @@ box_try_wait_confirm(double timeout)
> * Return lsn of the last limbo entry on success, -1 on error.
> */
> static int64_t
> -box_wait_limbo_acked(void)
> +box_wait_limbo_acked(double timeout)
> {
> if (txn_limbo_is_empty(&txn_limbo))
> return txn_limbo.confirmed_lsn;
> @@ -1629,7 +1629,7 @@ box_wait_limbo_acked(void)
> int64_t wait_lsn = last_entry->lsn;
>
> if (box_wait_quorum(txn_limbo.owner_id, wait_lsn, quorum,
> - replication_synchro_timeout) != 0)
> + timeout) != 0)
> return -1;
>
> if (box_check_promote_term_intact(promote_term) != 0)
> @@ -1675,9 +1675,23 @@ box_issue_promote(uint32_t prev_leader_id, int64_t promote_lsn)
> /* A guard to block multiple simultaneous box_promote() invocations. */
> static bool in_box_promote = false;
>
> -bool box_in_promote(void)
> +int
> +box_promote_qsync(void)
> {
> - return in_box_promote;
> + assert(!in_box_promote);
> + assert(is_box_configured);
> + struct raft *raft = box_raft();
> + if (raft->state != RAFT_STATE_LEADER)
> + return 0;
> + assert(box_election_mode == ELECTION_MODE_MANUAL ||
> + box_election_mode == ELECTION_MODE_CANDIDATE);
> + if (txn_limbo_replica_term(&txn_limbo, instance_id) == raft->term)
> + return 0;
> + int64_t wait_lsn = box_wait_limbo_acked(TIMEOUT_INFINITY);
> + if (wait_lsn < 0)
> + return -1;
> + box_issue_promote(txn_limbo.owner_id, wait_lsn);
> + return 0;
> }
>
> int
> @@ -1700,8 +1714,8 @@ box_promote(void)
> */
> if (!is_box_configured)
> return 0;
> - if (txn_limbo_replica_term(&txn_limbo, instance_id) ==
> - raft->term)
> + if (txn_limbo_replica_term(&txn_limbo, instance_id) == raft->term ||
> + raft->state == RAFT_STATE_LEADER)
> return 0;
> switch (box_election_mode) {
> case ELECTION_MODE_OFF:
> @@ -1715,18 +1729,16 @@ box_promote(void)
> diag_set(ClientError, ER_UNSUPPORTED, "election_mode='voter'",
> "manual elections");
> return -1;
> - case ELECTION_MODE_MANUAL:
> case ELECTION_MODE_CANDIDATE:
> - if (raft->state != RAFT_STATE_LEADER &&
> - box_run_elections() != 0) {
> - return -1;
> - }
> - break;
> + case ELECTION_MODE_MANUAL:
> + /* Let the automatic elections finish the promotion properly. */
> + in_box_promote = false;
> + return box_run_elections();
> default:
> unreachable();
> }
>
> - int64_t wait_lsn = box_wait_limbo_acked();
> + int64_t wait_lsn = box_wait_limbo_acked(replication_synchro_timeout);
> if (wait_lsn < 0)
> return -1;
>
> diff --git a/src/box/box.h b/src/box/box.h
> index db636b058..5e5d0bf6d 100644
> --- a/src/box/box.h
> +++ b/src/box/box.h
> @@ -273,12 +273,12 @@ extern "C" {
>
> typedef struct tuple box_tuple_t;
>
> -bool
> -box_in_promote(void);
> -
> int
> box_promote(void);
>
> +int
> +box_promote_qsync(void);
> +
> /* box_select is private and used only by FFI */
> API_EXPORT int
> box_select(uint32_t space_id, uint32_t index_id,
> diff --git a/src/box/raft.c b/src/box/raft.c
> index 5e496c2e4..07501157f 100644
> --- a/src/box/raft.c
> +++ b/src/box/raft.c
> @@ -83,30 +83,6 @@ box_raft_request_to_msg(const struct raft_request *req, struct raft_msg *msg)
> };
> }
>
> -static void
> -box_raft_update_synchro_queue(struct raft *raft)
> -{
> - assert(raft == box_raft());
> - /*
> - * In case the elections were triggered manually, we are already in
> - * the middle of a `promote` call. No need to call it once again.
> - */
> - if (raft->state == RAFT_STATE_LEADER &&
> - !box_in_promote()) {
> - int rc = 0;
> - uint32_t errcode = 0;
> - do {
> - rc = box_promote();
> - if (rc != 0) {
> - struct error *err = diag_last_error(diag_get());
> - errcode = box_error_code(err);
> - diag_log();
> - }
> - } while (rc != 0 && errcode == ER_QUORUM_WAIT &&
> - !fiber_is_cancelled());
> - }
> -}
> -
> static int
> box_raft_worker_f(va_list args)
> {
> @@ -117,7 +93,12 @@ box_raft_worker_f(va_list args)
> box_raft_has_work = false;
>
> raft_process_async(raft);
> - box_raft_update_synchro_queue(raft);
> + /*
> + * XXX: perhaps it should not ever fail. Or at least need a
> + * proper support for failures instead of the ignorance.
> + */
> + if (box_promote_qsync() != 0)
> + diag_log();
>
> if (!box_raft_has_work)
> fiber_yield();
More information about the Tarantool-patches
mailing list