From: Vladislav Shpilevoy via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Serge Petrenko <sergepetrenko@tarantool.org>, gorcunov@gmail.com
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH v3 07/12] box: introduce `box.ctl.demote`
Date: Sun, 4 Jul 2021 14:27:19 +0200 [thread overview]
Message-ID: <cb9cbc91-a1ea-2778-75d4-38315e9aa808@tarantool.org> (raw)
In-Reply-To: <e0673678226484155fb5dd7dc49b0dee184db185.1624918077.git.sergepetrenko@tarantool.org>
Thanks for working on this!
See 11 comments below.
1. On this commit a lot of tests fail. Why?
2. On top of this commit and on top of the branch too I tried to
promote a candidate and got a strange error in the logs, although
the promotion was successful:
--
-- Instance 1
--
-- Step 1
box.cfg{
listen = 3313,
election_mode = 'candidate',
replication_synchro_quorum = 2,
replication = {'localhost:3314', 'localhost:3313'}
}
box.schema.user.grant('guest', 'super')
--
-- Instance 2
--
-- Step 2
box.cfg{
listen = 3314,
election_mode = 'voter',
replication_synchro_quorum = 2,
replication = {'localhost:3314', 'localhost:3313'},
read_only = true,
}
-- Step 3
box.cfg{read_only = false, election_mode = 'candidate'}
-- Step 4
box.ctl.promote()
main/112/raft_worker box.cc:1538 E> ER_UNSUPPORTED: box.ctl.promote does not support simultaneous invocations
---
...
3. On top of the branch I tried demote on a leader with election mode
candidate 2 times. Demote didn't do anything on the second invocation.
The test:
--
-- Instance 1
--
-- Step 1
box.cfg{
listen = 3313,
election_mode = 'candidate',
replication_synchro_quorum = 2,
replication = {'localhost:3314', 'localhost:3313'}
}
box.schema.user.grant('guest', 'super')
-- Step 3
-- I demote self but become leader again because I am a
-- candidate. This is fine.
box.ctl.demote()
-- Step 4
tarantool> box.info.election
---
- state: leader
vote: 1
leader: 1
term: 3
...
-- Step 5
-- But now I try to demote again and nothing happens.
-- I expected the same behaviour - bump term and become
-- a leader again.
tarantool> box.ctl.demote()
---
...
tarantool> box.info.election
---
- state: leader
vote: 1
leader: 1
term: 3 <- the term is the same, so demote didn't do anything.
...
--
-- Instance 2
--
-- Step 2
box.cfg{
listen = 3314,
election_mode = 'voter',
replication_synchro_quorum = 2,
replication = {'localhost:3314', 'localhost:3313'},
read_only = true,
}
4. I tried switching between candidate and manual, trying promote/demote
in different combinations and got promote not doing anything. The test
is the same as above but steps >= 3 are different:
tarantool> box.ctl.demote()
2021-07-04 14:22:49.550 [588] main/103/interactive I> RAFT: bump term to 3, follow
2021-07-04 14:22:49.550 [588] main/113/raft_worker I> RAFT: persisted state {term: 3}
2021-07-04 14:22:49.550 [588] main/113/raft_worker I> RAFT: vote for 1, follow
2021-07-04 14:22:49.551 [588] main/113/raft_worker I> RAFT: persisted state {term: 3, vote: 1}
2021-07-04 14:22:49.551 [588] main/113/raft_worker I> RAFT: enter candidate state with 1 self vote
---
...
tarantool> 2021-07-04 14:22:49.552 [588] main/110/applier/localhost:3314 I> RAFT: message {term: 3, state: follower} from 2
2021-07-04 14:22:49.552 [588] main/110/applier/localhost:3314 I> RAFT: message {term: 3, vote: 1, state: follower} from 2
2021-07-04 14:22:49.552 [588] main/110/applier/localhost:3314 I> RAFT: enter leader state with quorum 2
2021-07-04 14:22:49.552 [588] main/110/applier/localhost:3314 I> RAFT: message {term: 3, vote: 1, state: follower} from 2
2021-07-04 14:22:49.552 [588] main/110/applier/localhost:3314 I> RAFT: vote request is skipped - the leader is already known - 1
2021-07-04 14:22:49.553 [588] relay/[::1]:54006/101/main I> recover from `./00000000000000000000.xlog'
---
...
tarantool> box.cfg{election_mode='manual'}
2021-07-04 14:22:55.391 [588] main/103/interactive I> set 'election_mode' configuration option to "manual"
---
...
tarantool> box.info.election
---
- state: follower
vote: 1
leader: 0
term: 3
...
tarantool> box.ctl.promote()
---
...
tarantool> box.info.election
---
- state: follower
vote: 1
leader: 0
term: 3
...
As you can see, the last promote() didn't do anything. The node
stayed a follower, and no errors were raised.
From the comments 2-4 it seems to me the complexity of
box_clear_synchro_queue()/box_promote() is getting out of control with
the number of ifs and flags, sometimes having unexpected meaning (for
example, try_wait flag causes term bump somewhy). It might be good to
split it into separate functions and make promotion/demotion out of each
election mode linear and straight.
> @TarantoolBot document
> Title: box.ctl.demote
>
> `box.ctl.demote()` is a new function, which works exactly like
> `box.ctl.promote()`, with one exception that it results in the instance
> writing DEMOTE request to WAL instead of a PROMOTE request.
>
> A DEMOTE request (DEMOTE = 32) copies PROMOTE behaviour (it clears the
> limbo as well), but clears the synchronous transaction queue ownership instead
> of assigning it to a new instance.
5. It is worth mentioning that demote can only be called on a master.
> diff --git a/src/box/box.cc b/src/box/box.cc
> index 1d894be97..86c5967b9 100644
> --- a/src/box/box.cc
> +++ b/src/box/box.cc
> @@ -1674,15 +1680,22 @@ box_promote(void)
> rc = -1;
> } else {
> promote:
> - if (try_wait) {
> + if (try_wait || demote) {
> raft_new_term(box_raft());
> if (box_raft_wait_persisted() < 0)
> return -1;
6. What if now some other node won the elections of the term
you just bumped? Then you wouldn't have the right to perform
DEMOTE as the term is not yours.
> }
> uint64_t term = box_raft()->term;
> - txn_limbo_write_promote(&txn_limbo, wait_lsn, term);
> + if (demote) {
> + txn_limbo_write_demote(&txn_limbo, wait_lsn,
> + term);
> + } else {
> + txn_limbo_write_promote(&txn_limbo, wait_lsn,
> + term);
> + }
> + uint16_t type = demote ? IPROTO_DEMOTE : IPROTO_PROMOTE;
> struct synchro_request req = {
> - .type = IPROTO_PROMOTE,
> + .type = type,
> .replica_id = former_leader_id,
> .origin_id = instance_id,
> .lsn = wait_lsn,
> diff --git a/test/replication/gh-5446-qsync-eval-quorum.test.lua b/test/replication/gh-5446-qsync-eval-quorum.test.lua
> index 6b9e324ed..b969df836 100644
> --- a/test/replication/gh-5446-qsync-eval-quorum.test.lua
> +++ b/test/replication/gh-5446-qsync-eval-quorum.test.lua
> @@ -135,3 +136,5 @@ box.cfg{
> replication_synchro_quorum = old_synchro_quorum, \
> replication_synchro_timeout = old_synchro_timeout, \
> }
> +box.ctl.demote()
> +
7. Git diff highlights this new line with red as unnecessary.
> diff --git a/test/replication/gh-6034-limbo-ownership.result b/test/replication/gh-6034-limbo-ownership.result
> new file mode 100644
> index 000000000..e412b8d53
> --- /dev/null
> +++ b/test/replication/gh-6034-limbo-ownership.result
8. Could you please write qsync tests with gh-####-qsync-...
pattern? It is useful when you want to run all qsync tests
and can do `python test-run.py qsync` then.
> @@ -0,0 +1,189 @@
> +-- test-run result file version 2
> +test_run = require('test_run').new()
> + | ---
> + | ...
> +fiber = require('fiber')
9. Fiber is not used in the test.
> + | ---
> + | ...
> +
<...>
> +assert(box.info.synchro.queue.owner == test_run:eval('default', 'return box.info.id')[1])
10. There is test_run:get_server_id().
> + | ---
> + | - true
> + | ...
> +box.space.async:insert{2} -- failure.
> + | ---
> + | - error: Can't modify data because this instance is in read-only mode.
> + | ...
> +
> +-- Promotion on the other node. Default should become ro.
> +box.ctl.promote()
> + | ---
> + | ...
> +assert(not box.info.ro)
> + | ---
> + | - true
> + | ...
> +assert(box.info.synchro.queue.owner == box.info.id)
> + | ---
> + | - true
> + | ...
> +box.space.sync:insert{2} -- success.
> + | ---
> + | - [2]
> + | ...
> +
> +test_run:switch('default')
> + | ---
> + | - true
> + | ...
> +assert(box.info.ro)
> + | ---
> + | - true
> + | ...
> +assert(box.info.synchro.queue.owner == test_run:eval('replica', 'return box.info.id')[1])
11. Ditto.
next prev parent reply other threads:[~2021-07-04 12:27 UTC|newest]
Thread overview: 42+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-28 22:12 [Tarantool-patches] [PATCH v3 00/12] forbid implicit limbo ownership transition Serge Petrenko via Tarantool-patches
2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 01/12] replication: always send raft state to subscribers Serge Petrenko via Tarantool-patches
2021-07-04 12:12 ` Vladislav Shpilevoy via Tarantool-patches
2021-07-09 9:43 ` Serge Petrenko via Tarantool-patches
2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 02/12] txn_limbo: fix promote term filtering Serge Petrenko via Tarantool-patches
2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 03/12] raft: refactor raft_new_term() Serge Petrenko via Tarantool-patches
2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 04/12] box: make promote always bump the term Serge Petrenko via Tarantool-patches
2021-07-04 12:14 ` Vladislav Shpilevoy via Tarantool-patches
2021-07-14 18:26 ` Serge Petrenko via Tarantool-patches
2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 05/12] replication: forbid implicit limbo owner transition Serge Petrenko via Tarantool-patches
2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 06/12] box: allow calling promote on a candidate Serge Petrenko via Tarantool-patches
2021-07-04 12:14 ` Vladislav Shpilevoy via Tarantool-patches
2021-07-14 18:26 ` Serge Petrenko via Tarantool-patches
2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 07/12] box: introduce `box.ctl.demote` Serge Petrenko via Tarantool-patches
2021-07-04 12:27 ` Vladislav Shpilevoy via Tarantool-patches [this message]
2021-07-14 18:28 ` Serge Petrenko via Tarantool-patches
2021-07-21 23:28 ` Vladislav Shpilevoy via Tarantool-patches
2021-07-23 7:44 ` Sergey Petrenko via Tarantool-patches
2021-07-26 23:50 ` Vladislav Shpilevoy via Tarantool-patches
2021-07-29 20:56 ` Sergey Petrenko via Tarantool-patches
2021-08-01 16:19 ` Vladislav Shpilevoy via Tarantool-patches
2021-08-03 7:56 ` Serge Petrenko via Tarantool-patches
2021-08-03 23:25 ` Vladislav Shpilevoy via Tarantool-patches
2021-08-04 13:08 ` Serge Petrenko via Tarantool-patches
2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 08/12] txn_limbo: persist the latest effective promote in snapshot Serge Petrenko via Tarantool-patches
2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 09/12] replication: encode version in JOIN request Serge Petrenko via Tarantool-patches
2021-07-04 12:27 ` Vladislav Shpilevoy via Tarantool-patches
2021-07-14 18:28 ` Serge Petrenko via Tarantool-patches
2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 10/12] replication: add META stage to JOIN Serge Petrenko via Tarantool-patches
2021-07-04 12:28 ` Vladislav Shpilevoy via Tarantool-patches
2021-07-14 18:28 ` Serge Petrenko via Tarantool-patches
2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 11/12] replication: send latest effective promote in initial join Serge Petrenko via Tarantool-patches
2021-07-04 12:28 ` Vladislav Shpilevoy via Tarantool-patches
2021-07-14 18:28 ` Serge Petrenko via Tarantool-patches
2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 12/12] replication: send current Raft term in join response Serge Petrenko via Tarantool-patches
2021-07-04 12:29 ` Vladislav Shpilevoy via Tarantool-patches
2021-07-14 18:28 ` Serge Petrenko via Tarantool-patches
2021-08-04 22:41 ` [Tarantool-patches] [PATCH v3 00/12] forbid implicit limbo ownership transition Vladislav Shpilevoy via Tarantool-patches
2021-08-06 7:54 ` Vitaliia Ioffe via Tarantool-patches
2021-08-06 8:31 ` Kirill Yukhin via Tarantool-patches
2021-08-08 10:46 ` Vladislav Shpilevoy via Tarantool-patches
2021-08-09 7:14 ` Kirill Yukhin via Tarantool-patches
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cb9cbc91-a1ea-2778-75d4-38315e9aa808@tarantool.org \
--to=tarantool-patches@dev.tarantool.org \
--cc=gorcunov@gmail.com \
--cc=sergepetrenko@tarantool.org \
--cc=v.shpilevoy@tarantool.org \
--subject='Re: [Tarantool-patches] [PATCH v3 07/12] box: introduce `box.ctl.demote`' \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox