Tarantool development patches archive
From: Sergey Petrenko via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>, gorcunov@gmail.com
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH v3 07/12] box: introduce `box.ctl.demote`
Date: Fri, 23 Jul 2021 09:44:36 +0200
Message-ID: <036d7d32-54bc-3239-9291-d713b062f324@tarantool.org>
In-Reply-To: <9ae5765f-ffcb-af66-469b-b0a99d71c331@tarantool.org>


22.07.2021 01:28, Vladislav Shpilevoy wrote:
> Thanks for the fixes!
>
>>> 2. On top of this commit and on top of the branch too I tried to
>>> promote a candidate and got a strange error in the logs, although
>>> the promotion was successful:
>>>
>>> -- 
>>> -- Instance 1
>>> -- 
>>>
>>> -- Step 1
>>> box.cfg{
>>>       listen = 3313,
>>>       election_mode = 'candidate',
>>>       replication_synchro_quorum = 2,
>>>       replication = {'localhost:3314', 'localhost:3313'}
>>> }
>>> box.schema.user.grant('guest', 'super')
>>>
>>>
>>> -- 
>>> -- Instance 2
>>> -- 
>>>
>>> -- Step 2
>>> box.cfg{
>>>       listen = 3314,
>>>       election_mode = 'voter',
>>>       replication_synchro_quorum = 2,
>>>       replication = {'localhost:3314', 'localhost:3313'},
>>>       read_only = true,
>>> }
>>>
>>> -- Step 3
>>> box.cfg{read_only = false, election_mode = 'candidate'}
>>>
>>> -- Step 4
>>> box.ctl.promote()
>>>
>>> main/112/raft_worker box.cc:1538 E> ER_UNSUPPORTED: box.ctl.promote does not support simultaneous invocations
>>> ---
>>> ...
>> That's because once a candidate becomes the leader, it tries to issue `box.ctl.promote()`, and fails,
>> since we're already in `box.ctl.promote()` call.
>> I'm not sure how to handle that properly. This doesn't break anything though.
> Still, the error message looks really bad. One option is to make
> box_promote() on a candidate node just call raft_promote() and set
> box_in_promote = false, then wait for the term outcome. Will it work?
> You would need to rebase your patchset on the master branch then.


Hmm, I don't like that. This will work, but it will complicate things in
box_promote even more. Now one has to keep in mind that a manual
promote call has 2 phases, one when it is issued manually and the other
when it's called automatically when (and if) the instance becomes the
leader.


I think it'd be better to check whether we should run box_promote()
right in box_raft_update_synchro_queue().

Check out the diff (mostly for commit "box: allow calling promote on a
candidate"):

=====================================

diff --git a/src/box/box.cc b/src/box/box.cc
index a30e4f78d..5cdca4bd4 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -1692,20 +1692,24 @@ box_issue_demote(uint32_t prev_leader_id, int64_t promote_lsn)
  }

  /* A guard to block multiple simultaneous promote()/demote() invocations. */
-static bool box_in_promote = false;
+static bool in_box_promote = false;
+
+bool box_in_promote(void) {
+    return in_box_promote;
+}

  int
  box_promote(void)
  {
-    if (box_in_promote) {
+    if (in_box_promote) {
          diag_set(ClientError, ER_UNSUPPORTED, "box.ctl.promote",
               "simultaneous invocations");
          return -1;
      }
      struct raft *raft = box_raft();
-    box_in_promote = true;
+    in_box_promote = true;
      auto promote_guard = make_scoped_guard([&] {
-        box_in_promote = false;
+        in_box_promote = false;
      });

      if (!is_box_configured)
@@ -1757,14 +1761,14 @@ box_promote(void)
  int
  box_demote(void)
  {
-    if (box_in_promote) {
+    if (in_box_promote) {
          diag_set(ClientError, ER_UNSUPPORTED, "box.ctl.demote",
               "simultaneous invocations");
          return -1;
      }
-    box_in_promote = true;
+    in_box_promote = true;
      auto promote_guard = make_scoped_guard([&] {
-        box_in_promote = false;
+        in_box_promote = false;
      });

      if (!is_box_configured)
diff --git a/src/box/box.h b/src/box/box.h
index aaf20d9dd..344ed90f2 100644
--- a/src/box/box.h
+++ b/src/box/box.h
@@ -273,6 +273,9 @@ extern "C" {

  typedef struct tuple box_tuple_t;

+bool
+box_in_promote(void);
+
  int
  box_promote(void);

diff --git a/src/box/raft.c b/src/box/raft.c
index 35c471f58..5e496c2e4 100644
--- a/src/box/raft.c
+++ b/src/box/raft.c
@@ -88,11 +88,11 @@ box_raft_update_synchro_queue(struct raft *raft)
  {
      assert(raft == box_raft());
      /*
-     * In case these are manual elections, we are already in the middle of a
-     * `promote` call. No need to call it once again.
+     * In case the elections were triggered manually, we are already in
+     * the middle of a `promote` call. No need to call it once again.
       */
      if (raft->state == RAFT_STATE_LEADER &&
-        box_election_mode != ELECTION_MODE_MANUAL) {
+        !box_in_promote()) {
          int rc = 0;
          uint32_t errcode = 0;
          do {
=====================================
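
To illustrate the idea behind the diff above, here is a minimal standalone
sketch of the guard plus its accessor. It is a model only, not Tarantool
code: ScopeExit is a hypothetical stand-in for make_scoped_guard(), the
model_* names are made up for this example, and the raft worker is modelled
as a plain callback that runs while promote is "in progress" (in the real
code it is a separate fiber).

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical stand-in for make_scoped_guard(): runs a callback on scope exit.
struct ScopeExit {
    std::function<void()> fn;
    explicit ScopeExit(std::function<void()> f) : fn(std::move(f)) {}
    ~ScopeExit() { fn(); }
    ScopeExit(const ScopeExit &) = delete;
    ScopeExit &operator=(const ScopeExit &) = delete;
};

// Models the in_box_promote flag from the diff.
static bool in_promote = false;

// Models the box_in_promote() accessor exported to raft.c.
bool model_in_promote() { return in_promote; }

// Models box_promote(): rejects nested calls, clears the flag on every exit.
// `while_waiting` (an assumption of this model) stands for whatever happens
// while the real function waits for the election outcome.
int model_promote(const std::function<void()> &while_waiting = nullptr)
{
    if (in_promote)
        return -1; // "simultaneous invocations"
    in_promote = true;
    ScopeExit guard([] { in_promote = false; });
    if (while_waiting)
        while_waiting();
    return 0;
}

// Models box_raft_update_synchro_queue(): the worker promotes on its own
// only when no promote call is already running.
int model_on_became_leader()
{
    if (model_in_promote())
        return 0; // the running promote will finish the job
    int rc = model_promote();
    assert(!model_in_promote());
    return rc;
}
```

With the accessor check in place, the worker skips the nested call instead
of hitting the "simultaneous invocations" error, regardless of which
election mode the instance is in.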


>
> It might be a little easier to do if you apply the diff below. (Warning:
> I didn't test it.) The motivation is that one of the main reasons why I
> wanted box_promote() simplified was the strange meaning of some
> flags. In particular, the try_wait flag for some reason triggered elections
> before the waiting, and it is not at all obvious why. How does 'wait'
> relate to 'elections'?
>
> In the diff I tried to remove these flags entirely. And now you have a
> single place in the code of box_promote(), where ELECTION_MODE_CANDIDATE
> stuff is handled. Here you could try the proposal I gave above.


Thanks for the help! Your diff looks good, I've reworked my patches to
comply.


> ====================
> diff --git a/src/box/box.cc b/src/box/box.cc
> index f68fffcab..e7765b657 100644
> --- a/src/box/box.cc
> +++ b/src/box/box.cc
> @@ -1698,34 +1698,6 @@ box_issue_demote(uint32_t prev_leader_id, int64_t promote_lsn)
>   	assert(txn_limbo_is_empty(&txn_limbo));
>   }
>   
> -/**
> - * Check whether this instance may run a promote() and set promote parameters
> - * according to its election mode.
> - */
> -static int
> -box_check_promote_election_mode(bool *try_wait, bool *run_elections)
> -{
> -	switch (box_election_mode) {
> -	case ELECTION_MODE_OFF:
> -		if (try_wait != NULL)
> -			*try_wait = true;
> -		break;
> -	case ELECTION_MODE_VOTER:
> -		assert(box_raft()->state == RAFT_STATE_FOLLOWER);
> -		diag_set(ClientError, ER_UNSUPPORTED, "election_mode='voter'",
> -			 "manual elections");
> -		return -1;
> -	case ELECTION_MODE_MANUAL:
> -	case ELECTION_MODE_CANDIDATE:
> -		if (run_elections != NULL)
> -			*run_elections = box_raft()->state != RAFT_STATE_LEADER;
> -		break;
> -	default:
> -		unreachable();
> -	}
> -	return 0;
> -}
> -
>   /* A guard to block multiple simultaneous promote()/demote() invocations. */
>   static bool box_in_promote = false;
>   
> @@ -1757,27 +1729,35 @@ box_promote(void)
>   	if (is_leader)
>   		return 0;
>   
> -	bool run_elections = false;
> -	bool try_wait = false;
> -
> -	if (box_check_promote_election_mode(&try_wait, &run_elections) < 0)
> -		return -1;
> -
> -	int64_t wait_lsn = -1;
> -
> -	if (run_elections && box_run_elections() < 0)
> -		return -1;
> -	if (try_wait) {
> -		if (box_try_wait_confirm(2 * replication_synchro_timeout) < 0)
> +	switch (box_election_mode) {
> +	case ELECTION_MODE_OFF:
> +		if (box_try_wait_confirm(2 * replication_synchro_timeout) != 0)
> +			return -1;
> +		if (box_trigger_elections() != 0)
>   			return -1;
> -		if (box_trigger_elections() < 0)
> +		break;
> +	case ELECTION_MODE_VOTER:
> +		assert(box_raft()->state == RAFT_STATE_FOLLOWER);
> +		diag_set(ClientError, ER_UNSUPPORTED, "election_mode='voter'",
> +			 "manual elections");
> +		return -1;
> +	case ELECTION_MODE_MANUAL:
> +	case ELECTION_MODE_CANDIDATE:
> +		if (box_raft()->state != RAFT_STATE_LEADER &&
> +		    box_run_elections() != 0)
>   			return -1;
> +		break;
> +	default:
> +		unreachable();
>   	}
> -	if ((wait_lsn = box_wait_limbo_acked()) < 0)
> +	if (box_check_promote_election_mode(&try_wait, &run_elections) < 0)
>   		return -1;
>   
> -	box_issue_promote(txn_limbo.owner_id, wait_lsn);
> +	int64_t wait_lsn = box_wait_limbo_acked();
> +	if (wait_lsn < 0)
> +		return -1;
>   
> +	box_issue_promote(txn_limbo.owner_id, wait_lsn);
>   	return 0;
>   }
>   
> @@ -1804,29 +1784,16 @@ box_demote(void)
>   		is_leader = is_leader && box_raft()->state == RAFT_STATE_LEADER;
>   	if (!is_leader)
>   		return 0;
> -
> -	bool try_wait = false;
> -
> -	if (box_check_promote_election_mode(&try_wait, NULL) < 0)
> -		return -1;
> -
> -	int64_t wait_lsn = -1;
> -
>   	if (box_trigger_elections() < 0)
>   		return -1;
> -
>   	if (box_election_mode != ELECTION_MODE_OFF)
>   		return 0;
> -
> -	if (try_wait &&
> -	    box_try_wait_confirm(2 * replication_synchro_timeout) < 0)
> +	if (box_try_wait_confirm(2 * replication_synchro_timeout) != 0)
>   		return -1;
> -
> -	if ((wait_lsn = box_wait_limbo_acked()) < 0)
> +	int64_t wait_lsn = box_wait_limbo_acked();
> +	if (wait_lsn < 0)
>   		return -1;
> -
>   	box_issue_demote(txn_limbo.owner_id, wait_lsn);
> -
>   	return 0;
>   }
>   
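
For reference, the per-mode flow of box_promote() in the quoted diff can be
summarised with a small standalone sketch. It only records which steps would
run, as strings named after the functions in the diff; ElectionMode and
promote_steps() are hypothetical names for this example, and the leftover
call to box_check_promote_election_mode() in the quoted hunk is assumed to be
unintended and is not modelled.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical mirror of the ELECTION_MODE_* constants.
enum class ElectionMode { Off, Voter, Manual, Candidate };

// Fills `steps` with the calls box_promote() would make for the given mode.
// Returns false where the real function would fail with ER_UNSUPPORTED.
bool promote_steps(ElectionMode mode, bool raft_is_leader,
                   std::vector<std::string> &steps)
{
    steps.clear();
    switch (mode) {
    case ElectionMode::Off:
        // No elections: wait for pending confirms, then bump the term.
        steps.push_back("box_try_wait_confirm");
        steps.push_back("box_trigger_elections");
        break;
    case ElectionMode::Voter:
        // A voter can never become a leader, so promote is rejected.
        return false;
    case ElectionMode::Manual:
    case ElectionMode::Candidate:
        // Run elections unless this instance already is the raft leader.
        if (!raft_is_leader)
            steps.push_back("box_run_elections");
        break;
    }
    // Common tail for every mode that got this far.
    steps.push_back("box_wait_limbo_acked");
    steps.push_back("box_issue_promote");
    assert(!steps.empty());
    return true;
}
```

The point of the layout is that each election mode is handled in exactly one
place, with no try_wait or run_elections flags carried between functions.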

Thread overview: 42+ messages
2021-06-28 22:12 [Tarantool-patches] [PATCH v3 00/12] forbid implicit limbo ownership transition Serge Petrenko via Tarantool-patches
2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 01/12] replication: always send raft state to subscribers Serge Petrenko via Tarantool-patches
2021-07-04 12:12   ` Vladislav Shpilevoy via Tarantool-patches
2021-07-09  9:43     ` Serge Petrenko via Tarantool-patches
2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 02/12] txn_limbo: fix promote term filtering Serge Petrenko via Tarantool-patches
2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 03/12] raft: refactor raft_new_term() Serge Petrenko via Tarantool-patches
2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 04/12] box: make promote always bump the term Serge Petrenko via Tarantool-patches
2021-07-04 12:14   ` Vladislav Shpilevoy via Tarantool-patches
2021-07-14 18:26     ` Serge Petrenko via Tarantool-patches
2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 05/12] replication: forbid implicit limbo owner transition Serge Petrenko via Tarantool-patches
2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 06/12] box: allow calling promote on a candidate Serge Petrenko via Tarantool-patches
2021-07-04 12:14   ` Vladislav Shpilevoy via Tarantool-patches
2021-07-14 18:26     ` Serge Petrenko via Tarantool-patches
2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 07/12] box: introduce `box.ctl.demote` Serge Petrenko via Tarantool-patches
2021-07-04 12:27   ` Vladislav Shpilevoy via Tarantool-patches
2021-07-14 18:28     ` Serge Petrenko via Tarantool-patches
2021-07-21 23:28       ` Vladislav Shpilevoy via Tarantool-patches
2021-07-23  7:44         ` Sergey Petrenko via Tarantool-patches [this message]
2021-07-26 23:50           ` Vladislav Shpilevoy via Tarantool-patches
2021-07-29 20:56             ` Sergey Petrenko via Tarantool-patches
2021-08-01 16:19               ` Vladislav Shpilevoy via Tarantool-patches
2021-08-03  7:56                 ` Serge Petrenko via Tarantool-patches
2021-08-03 23:25                   ` Vladislav Shpilevoy via Tarantool-patches
2021-08-04 13:08                     ` Serge Petrenko via Tarantool-patches
2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 08/12] txn_limbo: persist the latest effective promote in snapshot Serge Petrenko via Tarantool-patches
2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 09/12] replication: encode version in JOIN request Serge Petrenko via Tarantool-patches
2021-07-04 12:27   ` Vladislav Shpilevoy via Tarantool-patches
2021-07-14 18:28     ` Serge Petrenko via Tarantool-patches
2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 10/12] replication: add META stage to JOIN Serge Petrenko via Tarantool-patches
2021-07-04 12:28   ` Vladislav Shpilevoy via Tarantool-patches
2021-07-14 18:28     ` Serge Petrenko via Tarantool-patches
2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 11/12] replication: send latest effective promote in initial join Serge Petrenko via Tarantool-patches
2021-07-04 12:28   ` Vladislav Shpilevoy via Tarantool-patches
2021-07-14 18:28     ` Serge Petrenko via Tarantool-patches
2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 12/12] replication: send current Raft term in join response Serge Petrenko via Tarantool-patches
2021-07-04 12:29   ` Vladislav Shpilevoy via Tarantool-patches
2021-07-14 18:28     ` Serge Petrenko via Tarantool-patches
2021-08-04 22:41 ` [Tarantool-patches] [PATCH v3 00/12] forbid implicit limbo ownership transition Vladislav Shpilevoy via Tarantool-patches
2021-08-06  7:54   ` Vitaliia Ioffe via Tarantool-patches
2021-08-06  8:31 ` Kirill Yukhin via Tarantool-patches
2021-08-08 10:46   ` Vladislav Shpilevoy via Tarantool-patches
2021-08-09  7:14     ` Kirill Yukhin via Tarantool-patches
