From: Vladislav Shpilevoy via Tarantool-patches <tarantool-patches@dev.tarantool.org> To: Serge Petrenko <sergepetrenko@tarantool.org>, gorcunov@gmail.com Cc: tarantool-patches@dev.tarantool.org Subject: Re: [Tarantool-patches] [PATCH v3 07/12] box: introduce `box.ctl.demote` Date: Sun, 4 Jul 2021 14:27:19 +0200 [thread overview] Message-ID: <cb9cbc91-a1ea-2778-75d4-38315e9aa808@tarantool.org> (raw) In-Reply-To: <e0673678226484155fb5dd7dc49b0dee184db185.1624918077.git.sergepetrenko@tarantool.org> Thanks for working on this! See 11 comments below. 1. On this commit a lot of tests fail. Why? 2. On top of this commit and on top of the branch too I tried to promote a candidate and got a strange error in the logs, although the promotion was successful: -- -- Instance 1 -- -- Step 1 box.cfg{ listen = 3313, election_mode = 'candidate', replication_synchro_quorum = 2, replication = {'localhost:3314', 'localhost:3313'} } box.schema.user.grant('guest', 'super') -- -- Instance 2 -- -- Step 2 box.cfg{ listen = 3314, election_mode = 'voter', replication_synchro_quorum = 2, replication = {'localhost:3314', 'localhost:3313'}, read_only = true, } -- Step 3 box.cfg{read_only = false, election_mode = 'candidate'} -- Step 4 box.ctl.promote() main/112/raft_worker box.cc:1538 E> ER_UNSUPPORTED: box.ctl.promote does not support simultaneous invocations --- ... 3. On top of the branch I tried demote on a leader with election mode candidate 2 times. Demote didn't do anything on the second invocation. The test: -- -- Instance 1 -- -- Step 1 box.cfg{ listen = 3313, election_mode = 'candidate', replication_synchro_quorum = 2, replication = {'localhost:3314', 'localhost:3313'} } box.schema.user.grant('guest', 'super') -- Step 3 -- I demote self but become leader again because I am a -- candidate. This is fine. box.ctl.demote() -- Step 4 tarantool> box.info.election --- - state: leader vote: 1 leader: 1 term: 3 ... -- Step 5 -- But now I try to demote again and nothing happens. -- I expected the same behaviour - bump term and become -- a leader again. tarantool> box.ctl.demote() --- ... tarantool> box.info.election --- - state: leader vote: 1 leader: 1 term: 3 <- the term is the same, so demote didn't do anything. ... -- -- Instance 2 -- -- Step 2 box.cfg{ listen = 3314, election_mode = 'voter', replication_synchro_quorum = 2, replication = {'localhost:3314', 'localhost:3313'}, read_only = true, } 4. I tried switching between candidate and manual, trying promote/demote in different combinations and got promote not doing anything. The test is the same as above but steps >= 3 are different: tarantool> box.ctl.demote() 2021-07-04 14:22:49.550 [588] main/103/interactive I> RAFT: bump term to 3, follow 2021-07-04 14:22:49.550 [588] main/113/raft_worker I> RAFT: persisted state {term: 3} 2021-07-04 14:22:49.550 [588] main/113/raft_worker I> RAFT: vote for 1, follow 2021-07-04 14:22:49.551 [588] main/113/raft_worker I> RAFT: persisted state {term: 3, vote: 1} 2021-07-04 14:22:49.551 [588] main/113/raft_worker I> RAFT: enter candidate state with 1 self vote --- ... tarantool> 2021-07-04 14:22:49.552 [588] main/110/applier/localhost:3314 I> RAFT: message {term: 3, state: follower} from 2 2021-07-04 14:22:49.552 [588] main/110/applier/localhost:3314 I> RAFT: message {term: 3, vote: 1, state: follower} from 2 2021-07-04 14:22:49.552 [588] main/110/applier/localhost:3314 I> RAFT: enter leader state with quorum 2 2021-07-04 14:22:49.552 [588] main/110/applier/localhost:3314 I> RAFT: message {term: 3, vote: 1, state: follower} from 2 2021-07-04 14:22:49.552 [588] main/110/applier/localhost:3314 I> RAFT: vote request is skipped - the leader is already known - 1 2021-07-04 14:22:49.553 [588] relay/[::1]:54006/101/main I> recover from `./00000000000000000000.xlog' --- ... tarantool> box.cfg{election_mode='manual'} 2021-07-04 14:22:55.391 [588] main/103/interactive I> set 'election_mode' configuration option to "manual" --- ... tarantool> box.info.election --- - state: follower vote: 1 leader: 0 term: 3 ... tarantool> box.ctl.promote() --- ... tarantool> box.info.election --- - state: follower vote: 1 leader: 0 term: 3 ... As you can see, the last promote() didn't do anything. The node stayed a follower, and no errors were raised. From the comments 2-4 it seems to me the complexity of box_clear_synchro_queue()/box_promote() is getting out of control with the number of ifs and flags, sometimes having unexpected meaning (for example, try_wait flag causes term bump somewhy). It might be good to split it into separate functions and make promotion/demotion out of each election mode linear and straight. > @TarantoolBot document > Title: box.ctl.demote > > `box.ctl.demote()` is a new function, which works exactly like > `box.ctl.promote()`, with one exception that it results in the instance > writing DEMOTE request to WAL instead of a PROMOTE request. > > A DEMOTE request (DEMOTE = 32) copies PROMOTE behaviour (it clears the > limbo as well), but clears the synchronous transaction queue ownership instead > of assigning it to a new instance. 5. It is worth mentioning that demote can only be called on a master. > diff --git a/src/box/box.cc b/src/box/box.cc > index 1d894be97..86c5967b9 100644 > --- a/src/box/box.cc > +++ b/src/box/box.cc > @@ -1674,15 +1680,22 @@ box_promote(void) > rc = -1; > } else { > promote: > - if (try_wait) { > + if (try_wait || demote) { > raft_new_term(box_raft()); > if (box_raft_wait_persisted() < 0) > return -1; 6. What if now some other node won the elections of the term you just bumped? Then you wouldn't have the right to perform DEMOTE as the term is not yours. > } > uint64_t term = box_raft()->term; > - txn_limbo_write_promote(&txn_limbo, wait_lsn, term); > + if (demote) { > + txn_limbo_write_demote(&txn_limbo, wait_lsn, > + term); > + } else { > + txn_limbo_write_promote(&txn_limbo, wait_lsn, > + term); > + } > + uint16_t type = demote ? IPROTO_DEMOTE : IPROTO_PROMOTE; > struct synchro_request req = { > - .type = IPROTO_PROMOTE, > + .type = type, > .replica_id = former_leader_id, > .origin_id = instance_id, > .lsn = wait_lsn, > diff --git a/test/replication/gh-5446-qsync-eval-quorum.test.lua b/test/replication/gh-5446-qsync-eval-quorum.test.lua > index 6b9e324ed..b969df836 100644 > --- a/test/replication/gh-5446-qsync-eval-quorum.test.lua > +++ b/test/replication/gh-5446-qsync-eval-quorum.test.lua > @@ -135,3 +136,5 @@ box.cfg{ > replication_synchro_quorum = old_synchro_quorum, \ > replication_synchro_timeout = old_synchro_timeout, \ > } > +box.ctl.demote() > + 7. Git diff highlights this new line with red as unnecessary. > diff --git a/test/replication/gh-6034-limbo-ownership.result b/test/replication/gh-6034-limbo-ownership.result > new file mode 100644 > index 000000000..e412b8d53 > --- /dev/null > +++ b/test/replication/gh-6034-limbo-ownership.result 8. Could you please write qsync tests with gh-####-qsync-... pattern? It is useful when you want to run all qsync tests and can do `python test-run.py qsync` then. > @@ -0,0 +1,189 @@ > +-- test-run result file version 2 > +test_run = require('test_run').new() > + | --- > + | ... > +fiber = require('fiber') 9. Fiber is not used in the test. > + | --- > + | ... > + <...> > +assert(box.info.synchro.queue.owner == test_run:eval('default', 'return box.info.id')[1]) 10. There is test_run:get_server_id(). > + | --- > + | - true > + | ... > +box.space.async:insert{2} -- failure. > + | --- > + | - error: Can't modify data because this instance is in read-only mode. > + | ... > + > +-- Promotion on the other node. Default should become ro. > +box.ctl.promote() > + | --- > + | ... > +assert(not box.info.ro) > + | --- > + | - true > + | ... > +assert(box.info.synchro.queue.owner == box.info.id) > + | --- > + | - true > + | ... > +box.space.sync:insert{2} -- success. > + | --- > + | - [2] > + | ... > + > +test_run:switch('default') > + | --- > + | - true > + | ... > +assert(box.info.ro) > + | --- > + | - true > + | ... > +assert(box.info.synchro.queue.owner == test_run:eval('replica', 'return box.info.id')[1]) 11. Ditto.
next prev parent reply other threads:[~2021-07-04 12:27 UTC|newest] Thread overview: 42+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-06-28 22:12 [Tarantool-patches] [PATCH v3 00/12] forbid implicit limbo ownership transition Serge Petrenko via Tarantool-patches 2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 01/12] replication: always send raft state to subscribers Serge Petrenko via Tarantool-patches 2021-07-04 12:12 ` Vladislav Shpilevoy via Tarantool-patches 2021-07-09 9:43 ` Serge Petrenko via Tarantool-patches 2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 02/12] txn_limbo: fix promote term filtering Serge Petrenko via Tarantool-patches 2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 03/12] raft: refactor raft_new_term() Serge Petrenko via Tarantool-patches 2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 04/12] box: make promote always bump the term Serge Petrenko via Tarantool-patches 2021-07-04 12:14 ` Vladislav Shpilevoy via Tarantool-patches 2021-07-14 18:26 ` Serge Petrenko via Tarantool-patches 2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 05/12] replication: forbid implicit limbo owner transition Serge Petrenko via Tarantool-patches 2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 06/12] box: allow calling promote on a candidate Serge Petrenko via Tarantool-patches 2021-07-04 12:14 ` Vladislav Shpilevoy via Tarantool-patches 2021-07-14 18:26 ` Serge Petrenko via Tarantool-patches 2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 07/12] box: introduce `box.ctl.demote` Serge Petrenko via Tarantool-patches 2021-07-04 12:27 ` Vladislav Shpilevoy via Tarantool-patches [this message] 2021-07-14 18:28 ` Serge Petrenko via Tarantool-patches 2021-07-21 23:28 ` Vladislav Shpilevoy via Tarantool-patches 2021-07-23 7:44 ` Sergey Petrenko via Tarantool-patches 2021-07-26 23:50 ` Vladislav Shpilevoy via Tarantool-patches 2021-07-29 20:56 ` Sergey Petrenko via Tarantool-patches 2021-08-01 16:19 ` Vladislav Shpilevoy via Tarantool-patches 2021-08-03 7:56 ` Serge Petrenko via Tarantool-patches 2021-08-03 23:25 ` Vladislav Shpilevoy via Tarantool-patches 2021-08-04 13:08 ` Serge Petrenko via Tarantool-patches 2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 08/12] txn_limbo: persist the latest effective promote in snapshot Serge Petrenko via Tarantool-patches 2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 09/12] replication: encode version in JOIN request Serge Petrenko via Tarantool-patches 2021-07-04 12:27 ` Vladislav Shpilevoy via Tarantool-patches 2021-07-14 18:28 ` Serge Petrenko via Tarantool-patches 2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 10/12] replication: add META stage to JOIN Serge Petrenko via Tarantool-patches 2021-07-04 12:28 ` Vladislav Shpilevoy via Tarantool-patches 2021-07-14 18:28 ` Serge Petrenko via Tarantool-patches 2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 11/12] replication: send latest effective promote in initial join Serge Petrenko via Tarantool-patches 2021-07-04 12:28 ` Vladislav Shpilevoy via Tarantool-patches 2021-07-14 18:28 ` Serge Petrenko via Tarantool-patches 2021-06-28 22:12 ` [Tarantool-patches] [PATCH v3 12/12] replication: send current Raft term in join response Serge Petrenko via Tarantool-patches 2021-07-04 12:29 ` Vladislav Shpilevoy via Tarantool-patches 2021-07-14 18:28 ` Serge Petrenko via Tarantool-patches 2021-08-04 22:41 ` [Tarantool-patches] [PATCH v3 00/12] forbid implicit limbo ownership transition Vladislav Shpilevoy via Tarantool-patches 2021-08-06 7:54 ` Vitaliia Ioffe via Tarantool-patches 2021-08-06 8:31 ` Kirill Yukhin via Tarantool-patches 2021-08-08 10:46 ` Vladislav Shpilevoy via Tarantool-patches 2021-08-09 7:14 ` Kirill Yukhin via Tarantool-patches
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=cb9cbc91-a1ea-2778-75d4-38315e9aa808@tarantool.org \ --to=tarantool-patches@dev.tarantool.org \ --cc=gorcunov@gmail.com \ --cc=sergepetrenko@tarantool.org \ --cc=v.shpilevoy@tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH v3 07/12] box: introduce `box.ctl.demote`' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox