[Tarantool-patches] [PATCH v3 07/12] box: introduce `box.ctl.demote`
Vladislav Shpilevoy
v.shpilevoy at tarantool.org
Sun Jul 4 15:27:19 MSK 2021
Thanks for working on this!
See 11 comments below.
1. On this commit a lot of tests fail. Why?
2. On top of this commit and on top of the branch too I tried to
promote a candidate and got a strange error in the logs, although
the promotion was successful:
--
-- Instance 1
--
-- Step 1
box.cfg{
listen = 3313,
election_mode = 'candidate',
replication_synchro_quorum = 2,
replication = {'localhost:3314', 'localhost:3313'}
}
box.schema.user.grant('guest', 'super')
--
-- Instance 2
--
-- Step 2
box.cfg{
listen = 3314,
election_mode = 'voter',
replication_synchro_quorum = 2,
replication = {'localhost:3314', 'localhost:3313'},
read_only = true,
}
-- Step 3
box.cfg{read_only = false, election_mode = 'candidate'}
-- Step 4
box.ctl.promote()
main/112/raft_worker box.cc:1538 E> ER_UNSUPPORTED: box.ctl.promote does not support simultaneous invocations
---
...
3. On top of the branch I tried demote on a leader with election mode
candidate 2 times. Demote didn't do anything on the second invocation.
The test:
--
-- Instance 1
--
-- Step 1
box.cfg{
listen = 3313,
election_mode = 'candidate',
replication_synchro_quorum = 2,
replication = {'localhost:3314', 'localhost:3313'}
}
box.schema.user.grant('guest', 'super')
-- Step 3
-- I demote self but become leader again because I am a
-- candidate. This is fine.
box.ctl.demote()
-- Step 4
tarantool> box.info.election
---
- state: leader
vote: 1
leader: 1
term: 3
...
-- Step 5
-- But now I try to demote again and nothing happens.
-- I expected the same behaviour - bump term and become
-- a leader again.
tarantool> box.ctl.demote()
---
...
tarantool> box.info.election
---
- state: leader
vote: 1
leader: 1
term: 3 <- the term is the same, so demote didn't do anything.
...
--
-- Instance 2
--
-- Step 2
box.cfg{
listen = 3314,
election_mode = 'voter',
replication_synchro_quorum = 2,
replication = {'localhost:3314', 'localhost:3313'},
read_only = true,
}
4. I tried switching between candidate and manual, trying promote/demote
in different combinations and got promote not doing anything. The test
is the same as above but steps >= 3 are different:
tarantool> box.ctl.demote()
2021-07-04 14:22:49.550 [588] main/103/interactive I> RAFT: bump term to 3, follow
2021-07-04 14:22:49.550 [588] main/113/raft_worker I> RAFT: persisted state {term: 3}
2021-07-04 14:22:49.550 [588] main/113/raft_worker I> RAFT: vote for 1, follow
2021-07-04 14:22:49.551 [588] main/113/raft_worker I> RAFT: persisted state {term: 3, vote: 1}
2021-07-04 14:22:49.551 [588] main/113/raft_worker I> RAFT: enter candidate state with 1 self vote
---
...
tarantool> 2021-07-04 14:22:49.552 [588] main/110/applier/localhost:3314 I> RAFT: message {term: 3, state: follower} from 2
2021-07-04 14:22:49.552 [588] main/110/applier/localhost:3314 I> RAFT: message {term: 3, vote: 1, state: follower} from 2
2021-07-04 14:22:49.552 [588] main/110/applier/localhost:3314 I> RAFT: enter leader state with quorum 2
2021-07-04 14:22:49.552 [588] main/110/applier/localhost:3314 I> RAFT: message {term: 3, vote: 1, state: follower} from 2
2021-07-04 14:22:49.552 [588] main/110/applier/localhost:3314 I> RAFT: vote request is skipped - the leader is already known - 1
2021-07-04 14:22:49.553 [588] relay/[::1]:54006/101/main I> recover from `./00000000000000000000.xlog'
---
...
tarantool> box.cfg{election_mode='manual'}
2021-07-04 14:22:55.391 [588] main/103/interactive I> set 'election_mode' configuration option to "manual"
---
...
tarantool> box.info.election
---
- state: follower
vote: 1
leader: 0
term: 3
...
tarantool> box.ctl.promote()
---
...
tarantool> box.info.election
---
- state: follower
vote: 1
leader: 0
term: 3
...
As you can see, the last promote() didn't do anything. The node
stayed a follower, and no errors were raised.
>From the comments 2-4 it seems to me the complexity of
box_clear_synchro_queue()/box_promote() is getting out of control with
the number of ifs and flags, sometimes having unexpected meaning (for
example, try_wait flag causes term bump somewhy). It might be good to
split it into separate functions and make promotion/demotion out of each
election mode linear and straight.
> @TarantoolBot document
> Title: box.ctl.demote
>
> `box.ctl.demote()` is a new function, which works exactly like
> `box.ctl.promote()`, with one exception that it results in the instance
> writing DEMOTE request to WAL instead of a PROMOTE request.
>
> A DEMOTE request (DEMOTE = 32) copies PROMOTE behaviour (it clears the
> limbo as well), but clears the synchronous transaction queue ownership instead
> of assigning it to a new instance.
5. It is worth mentioning that demote can only be called on a master.
> diff --git a/src/box/box.cc b/src/box/box.cc
> index 1d894be97..86c5967b9 100644
> --- a/src/box/box.cc
> +++ b/src/box/box.cc
> @@ -1674,15 +1680,22 @@ box_promote(void)
> rc = -1;
> } else {
> promote:
> - if (try_wait) {
> + if (try_wait || demote) {
> raft_new_term(box_raft());
> if (box_raft_wait_persisted() < 0)
> return -1;
6. What if now some other node won the elections of the term
you just bumped? Then you wouldn't have the right to perform
DEMOTE as the term is not yours.
> }
> uint64_t term = box_raft()->term;
> - txn_limbo_write_promote(&txn_limbo, wait_lsn, term);
> + if (demote) {
> + txn_limbo_write_demote(&txn_limbo, wait_lsn,
> + term);
> + } else {
> + txn_limbo_write_promote(&txn_limbo, wait_lsn,
> + term);
> + }
> + uint16_t type = demote ? IPROTO_DEMOTE : IPROTO_PROMOTE;
> struct synchro_request req = {
> - .type = IPROTO_PROMOTE,
> + .type = type,
> .replica_id = former_leader_id,
> .origin_id = instance_id,
> .lsn = wait_lsn,
> diff --git a/test/replication/gh-5446-qsync-eval-quorum.test.lua b/test/replication/gh-5446-qsync-eval-quorum.test.lua
> index 6b9e324ed..b969df836 100644
> --- a/test/replication/gh-5446-qsync-eval-quorum.test.lua
> +++ b/test/replication/gh-5446-qsync-eval-quorum.test.lua
> @@ -135,3 +136,5 @@ box.cfg{
> replication_synchro_quorum = old_synchro_quorum, \
> replication_synchro_timeout = old_synchro_timeout, \
> }
> +box.ctl.demote()
> +
7. Git diff highlights this new line with red as unnecessary.
> diff --git a/test/replication/gh-6034-limbo-ownership.result b/test/replication/gh-6034-limbo-ownership.result
> new file mode 100644
> index 000000000..e412b8d53
> --- /dev/null
> +++ b/test/replication/gh-6034-limbo-ownership.result
8. Could you please write qsync tests with gh-####-qsync-...
pattern? It is useful when you want to run all qsync tests
and can do `python test-run.py qsync` then.
> @@ -0,0 +1,189 @@
> +-- test-run result file version 2
> +test_run = require('test_run').new()
> + | ---
> + | ...
> +fiber = require('fiber')
9. Fiber is not used in the test.
> + | ---
> + | ...
> +
<...>
> +assert(box.info.synchro.queue.owner == test_run:eval('default', 'return box.info.id')[1])
10. There is test_run:get_server_id().
> + | ---
> + | - true
> + | ...
> +box.space.async:insert{2} -- failure.
> + | ---
> + | - error: Can't modify data because this instance is in read-only mode.
> + | ...
> +
> +-- Promotion on the other node. Default should become ro.
> +box.ctl.promote()
> + | ---
> + | ...
> +assert(not box.info.ro)
> + | ---
> + | - true
> + | ...
> +assert(box.info.synchro.queue.owner == box.info.id)
> + | ---
> + | - true
> + | ...
> +box.space.sync:insert{2} -- success.
> + | ---
> + | - [2]
> + | ...
> +
> +test_run:switch('default')
> + | ---
> + | - true
> + | ...
> +assert(box.info.ro)
> + | ---
> + | - true
> + | ...
> +assert(box.info.synchro.queue.owner == test_run:eval('replica', 'return box.info.id')[1])
11. Ditto.
More information about the Tarantool-patches
mailing list