From: Serge Petrenko via Tarantool-patches
Reply-To: Serge Petrenko
To: v.shpilevoy@tarantool.org, gorcunov@gmail.com
Cc: tarantool-patches@dev.tarantool.org
Date: Wed, 14 Apr 2021 17:18:06 +0300
Subject: [Tarantool-patches] [PATCH v3 10/10] box.ctl: rename clear_synchro_queue to promote

New function name will be `box.ctl.promote()`. It's much shorter and closer to
the function's now enriched functionality.

Old name `box.ctl.clear_synchro_queue()` remains in Lua for the sake of
backward compatibility.

Follow-up #5445
Closes #3055

@TarantoolBot document
Title: deprecate `box.ctl.clear_synchro_queue()` in favor of `box.ctl.promote()`

Replace all the mentions of `box.ctl.clear_synchro_queue()` with
`box.ctl.promote()` and add a note that `box.ctl.clear_synchro_queue()` is a
deprecated alias to `box.ctl.promote()`
---
 changelogs/unreleased/box-ctl-promote.md      |   8 ++
 src/box/box.cc                                |  20 ++--
 src/box/box.h                                 |   2 +-
 src/box/lua/ctl.c                             |   8 +-
 src/box/raft.c                                |   4 +-
 test/replication/election_basic.result        |  25 +++++
 test/replication/election_basic.test.lua      |  10 ++
 .../gh-3055-election-promote.result           | 105 ++++++++++++++++++
 .../gh-3055-election-promote.test.lua         |  43 +++++++
 test/replication/suite.cfg                    |   1 +
 10 files changed, 210 insertions(+), 16 deletions(-)
 create mode 100644 changelogs/unreleased/box-ctl-promote.md
 create mode 100644 test/replication/gh-3055-election-promote.result
 create mode 100644 test/replication/gh-3055-election-promote.test.lua

diff --git a/changelogs/unreleased/box-ctl-promote.md b/changelogs/unreleased/box-ctl-promote.md
new file mode 100644
index 000000000..15f6fb206
--- /dev/null
+++ b/changelogs/unreleased/box-ctl-promote.md
@@ -0,0 +1,8 @@
+## feature/replication
+
+* Introduce `box.ctl.promote()` and the concept of manual elections (enabled
+  with `election_mode='manual'`). Once the instance is in `manual` election
+  mode, it acts like a `voter` most of the time, but may trigger elections and
+  become a leader, once `box.ctl.promote()` is called.
+  When `election_mode ~= 'manual'`, `box.ctl.promote()` replaces
+  `box.ctl.clear_synchro_queue()`, which is now deprecated (gh-3055).
diff --git a/src/box/box.cc b/src/box/box.cc
index 9b663f54a..0b0c38cd5 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -1509,12 +1509,12 @@ box_wait_quorum(uint32_t lead_id, int64_t target_lsn, int quorum,
 }
 
 int
-box_clear_synchro_queue(void)
+box_promote(void)
 {
 	/* A guard to block multiple simultaneous function invocations. */
-	static bool in_clear_synchro_queue = false;
-	if (in_clear_synchro_queue) {
-		diag_set(ClientError, ER_UNSUPPORTED, "clear_synchro_queue",
+	static bool in_promote = false;
+	if (in_promote) {
+		diag_set(ClientError, ER_UNSUPPORTED, "box.ctl.promote",
 			 "simultaneous invocations");
 		return -1;
 	}
@@ -1567,7 +1567,7 @@ box_clear_synchro_queue(void)
 	int64_t wait_lsn = txn_limbo.confirmed_lsn;
 	int rc = 0;
 	int quorum = replication_synchro_quorum;
-	in_clear_synchro_queue = true;
+	in_promote = true;
 
 	if (run_elections) {
 		/*
@@ -1584,13 +1584,13 @@ box_clear_synchro_queue(void)
 		raft_cfg_is_candidate(box_raft(), false, false);
 		if (!box_raft()->is_enabled) {
 			diag_set(ClientError, ER_RAFT_DISABLED);
-			in_clear_synchro_queue = false;
+			in_promote = false;
 			return -1;
 		}
 		if (box_raft()->state != RAFT_STATE_LEADER) {
 			diag_set(ClientError, ER_INTERFERING_PROMOTE,
 				 box_raft()->leader);
-			in_clear_synchro_queue = false;
+			in_promote = false;
 			return -1;
 		}
 	}
@@ -1614,13 +1614,13 @@ box_clear_synchro_queue(void)
 		if (former_leader_id != txn_limbo.owner_id) {
 			diag_set(ClientError, ER_INTERFERING_PROMOTE,
 				 txn_limbo.owner_id);
-			in_clear_synchro_queue = false;
+			in_promote = false;
 			return -1;
 		}
 	}
 
 	/*
-	 * clear_synchro_queue() is a no-op on the limbo owner, so all the rows
+	 * promote() is a no-op on the limbo owner, so all the rows
 	 * in the limbo must've come through the applier meaning they already
 	 * have an lsn assigned, even if their WAL write hasn't finished yet.
 	 */
@@ -1657,7 +1657,7 @@ promote:
 				 req.term);
 		}
 	}
-	in_clear_synchro_queue = false;
+	in_promote = false;
 	return rc;
 }
 
diff --git a/src/box/box.h b/src/box/box.h
index 90facd189..04bdd397d 100644
--- a/src/box/box.h
+++ b/src/box/box.h
@@ -274,7 +274,7 @@ extern "C" {
 typedef struct tuple box_tuple_t;
 
 int
-box_clear_synchro_queue(void);
+box_promote(void);
 
 /* box_select is private and used only by FFI */
 API_EXPORT int
diff --git a/src/box/lua/ctl.c b/src/box/lua/ctl.c
index 5b8d0d0e4..368b9ab60 100644
--- a/src/box/lua/ctl.c
+++ b/src/box/lua/ctl.c
@@ -82,9 +82,9 @@ lbox_ctl_on_schema_init(struct lua_State *L)
 }
 
 static int
-lbox_ctl_clear_synchro_queue(struct lua_State *L)
+lbox_ctl_promote(struct lua_State *L)
 {
-	if (box_clear_synchro_queue() != 0)
+	if (box_promote() != 0)
 		return luaT_error(L);
 	return 0;
 }
@@ -124,7 +124,9 @@ static const struct luaL_Reg lbox_ctl_lib[] = {
 	{"wait_rw", lbox_ctl_wait_rw},
 	{"on_shutdown", lbox_ctl_on_shutdown},
 	{"on_schema_init", lbox_ctl_on_schema_init},
-	{"clear_synchro_queue", lbox_ctl_clear_synchro_queue},
+	{"promote", lbox_ctl_promote},
+	/* An old alias. */
+	{"clear_synchro_queue", lbox_ctl_promote},
 	{"is_recovery_finished", lbox_ctl_is_recovery_finished},
 	{"set_on_shutdown_timeout", lbox_ctl_set_on_shutdown_timeout},
 	{NULL, NULL}
diff --git a/src/box/raft.c b/src/box/raft.c
index e8c9f3d2c..e357772a5 100644
--- a/src/box/raft.c
+++ b/src/box/raft.c
@@ -89,14 +89,14 @@ box_raft_update_synchro_queue(struct raft *raft)
 	assert(raft == box_raft());
 	/*
 	 * In case these are manual elections, we are already in the middle of a
-	 * `clear_synchro_queue` call. No need to call it once again.
+	 * `promote` call. No need to call it once again.
 	 */
 	if (raft->state == RAFT_STATE_LEADER &&
 	    box_election_mode != ELECTION_MODE_MANUAL) {
 		int rc = 0;
 		uint32_t errcode = 0;
 		do {
-			rc = box_clear_synchro_queue();
+			rc = box_promote();
 			if (rc != 0) {
 				struct error *err = diag_last_error(diag_get());
 				errcode = box_error_code(err);
diff --git a/test/replication/election_basic.result b/test/replication/election_basic.result
index d5320b3ff..78c911245 100644
--- a/test/replication/election_basic.result
+++ b/test/replication/election_basic.result
@@ -108,6 +108,31 @@ assert(box.info.election.leader == box.info.id)
  | - true
  | ...
 
+-- Manual election mode. A voter most of the time, a leader once
+-- `box.ctl.promote()` is called.
+box.cfg{election_mode = 'manual'}
+ | ---
+ | ...
+
+assert(box.info.election.state == 'follower')
+ | ---
+ | - true
+ | ...
+term = box.info.election.term
+ | ---
+ | ...
+box.ctl.promote()
+ | ---
+ | ...
+assert(box.info.election.state == 'leader')
+ | ---
+ | - error: assertion failed!
+ | ...
+assert(box.info.election.term > term)
+ | ---
+ | - error: assertion failed!
+ | ...
+
 box.cfg{                                        \
     election_mode = 'off',                      \
     election_timeout = old_election_timeout     \
diff --git a/test/replication/election_basic.test.lua b/test/replication/election_basic.test.lua
index 821f73cea..5fc398848 100644
--- a/test/replication/election_basic.test.lua
+++ b/test/replication/election_basic.test.lua
@@ -39,6 +39,16 @@ assert(box.info.election.term > term)
 assert(box.info.election.vote == box.info.id)
 assert(box.info.election.leader == box.info.id)
 
+-- Manual election mode. A voter most of the time, a leader once
+-- `box.ctl.promote()` is called.
+box.cfg{election_mode = 'manual'}
+
+assert(box.info.election.state == 'follower')
+term = box.info.election.term
+box.ctl.promote()
+assert(box.info.election.state == 'leader')
+assert(box.info.election.term > term)
+
 box.cfg{                                        \
     election_mode = 'off',                      \
     election_timeout = old_election_timeout     \
diff --git a/test/replication/gh-3055-election-promote.result b/test/replication/gh-3055-election-promote.result
new file mode 100644
index 000000000..6f5af13bc
--- /dev/null
+++ b/test/replication/gh-3055-election-promote.result
@@ -0,0 +1,105 @@
+-- test-run result file version 2
+test_run = require('test_run').new()
+ | ---
+ | ...
+
+--
+-- gh-3055 box.ctl.promote(). Call on instance with election_mode='manual'
+-- in order to promote it to leader.
+SERVERS = {'election_replica1', 'election_replica2', 'election_replica3'}
+ | ---
+ | ...
+-- Start in candidate state in order for bootstrap to work.
+test_run:create_cluster(SERVERS, 'replication', {args='2 0.1 candidate'})
+ | ---
+ | ...
+test_run:wait_fullmesh(SERVERS)
+ | ---
+ | ...
+
+cfg_set_manual =\
+    "box.cfg{election_mode='manual'} "..\
+    "assert(box.info.election.state == 'follower') "..\
+    "assert(box.info.ro)"
+ | ---
+ | ...
+
+for _, server in pairs(SERVERS) do\
+    ok, res = test_run:eval(server, cfg_set_manual)\
+    assert(ok)\
+end
+ | ---
+ | ...
+
+-- Promote without living leader.
+test_run:switch('election_replica1')
+ | ---
+ | - true
+ | ...
+assert(box.info.election.state == 'follower')
+ | ---
+ | - true
+ | ...
+term = box.info.election.term
+ | ---
+ | ...
+box.ctl.promote()
+ | ---
+ | ...
+assert(box.info.election.state == 'leader')
+ | ---
+ | - true
+ | ...
+assert(not box.info.ro)
+ | ---
+ | - true
+ | ...
+assert(box.info.election.term > term)
+ | ---
+ | - true
+ | ...
+
+-- Test promote when there's a live leader.
+test_run:switch('election_replica2')
+ | ---
+ | - true
+ | ...
+term = box.info.election.term
+ | ---
+ | ...
+assert(box.info.election.state == 'follower')
+ | ---
+ | - true
+ | ...
+assert(box.info.ro)
+ | ---
+ | - true
+ | ...
+assert(box.info.election.leader ~= 0)
+ | ---
+ | - true
+ | ...
+box.ctl.promote()
+ | ---
+ | ...
+assert(box.info.election.state == 'leader')
+ | ---
+ | - true
+ | ...
+assert(not box.info.ro)
+ | ---
+ | - true
+ | ...
+assert(box.info.election.term > term)
+ | ---
+ | - true
+ | ...
+
+-- Cleanup.
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+test_run:drop_cluster(SERVERS)
+ | ---
+ | ...
diff --git a/test/replication/gh-3055-election-promote.test.lua b/test/replication/gh-3055-election-promote.test.lua
new file mode 100644
index 000000000..cbc3ed206
--- /dev/null
+++ b/test/replication/gh-3055-election-promote.test.lua
@@ -0,0 +1,43 @@
+test_run = require('test_run').new()
+
+--
+-- gh-3055 box.ctl.promote(). Call on instance with election_mode='manual'
+-- in order to promote it to leader.
+SERVERS = {'election_replica1', 'election_replica2', 'election_replica3'}
+-- Start in candidate state in order for bootstrap to work.
+test_run:create_cluster(SERVERS, 'replication', {args='2 0.1 candidate'})
+test_run:wait_fullmesh(SERVERS)
+
+cfg_set_manual =\
+    "box.cfg{election_mode='manual'} "..\
+    "assert(box.info.election.state == 'follower') "..\
+    "assert(box.info.ro)"
+
+for _, server in pairs(SERVERS) do\
+    ok, res = test_run:eval(server, cfg_set_manual)\
+    assert(ok)\
+end
+
+-- Promote without living leader.
+test_run:switch('election_replica1')
+assert(box.info.election.state == 'follower')
+term = box.info.election.term
+box.ctl.promote()
+assert(box.info.election.state == 'leader')
+assert(not box.info.ro)
+assert(box.info.election.term > term)
+
+-- Test promote when there's a live leader.
+test_run:switch('election_replica2')
+term = box.info.election.term
+assert(box.info.election.state == 'follower')
+assert(box.info.ro)
+assert(box.info.election.leader ~= 0)
+box.ctl.promote()
+assert(box.info.election.state == 'leader')
+assert(not box.info.ro)
+assert(box.info.election.term > term)
+
+-- Cleanup.
+test_run:switch('default')
+test_run:drop_cluster(SERVERS)
diff --git a/test/replication/suite.cfg b/test/replication/suite.cfg
index 8b185ce7e..dc39e2f74 100644
--- a/test/replication/suite.cfg
+++ b/test/replication/suite.cfg
@@ -2,6 +2,7 @@
     "anon.test.lua": {},
     "anon_register_gap.test.lua": {},
     "gh-2991-misc-asserts-on-update.test.lua": {},
+    "gh-3055-election-promote.test.lua": {},
     "gh-3111-misc-rebootstrap-from-ro-master.test.lua": {},
     "gh-3160-misc-heartbeats-on-master-changes.test.lua": {},
     "gh-3247-misc-iproto-sequence-value-not-replicated.test.lua": {},
-- 
2.24.3 (Apple Git-128)
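
For readers skimming the `lbox_ctl_lib[]` hunk: the backward compatibility described in the commit message comes down to two names sharing one handler in a name-to-function table. A minimal standalone sketch of that idea follows. It is not Tarantool code, and `ctl_entry`, `ctl_lookup`, `demo_promote` and `promote_calls` are hypothetical names made up for illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical handler type; stands in for a Lua C function here. */
typedef int (*ctl_handler_f)(void);

/* Hypothetical name-to-handler entry, loosely shaped like luaL_Reg. */
struct ctl_entry {
	const char *name;
	ctl_handler_f handler;
};

/* Counts how many times the illustrative handler was invoked. */
static int promote_calls = 0;

/* Illustrative handler: records the call and reports success. */
static int
demo_promote(void)
{
	promote_calls++;
	return 0;
}

/* Both names map to the same handler, so the old name keeps working. */
static const struct ctl_entry ctl_lib[] = {
	{"promote", demo_promote},
	/* An old alias. */
	{"clear_synchro_queue", demo_promote},
	{NULL, NULL}
};

/* Find a handler by name; returns NULL when the name is not registered. */
static ctl_handler_f
ctl_lookup(const char *name)
{
	for (const struct ctl_entry *e = ctl_lib; e->name != NULL; e++) {
		if (strcmp(e->name, name) == 0)
			return e->handler;
	}
	return NULL;
}
```

Looking up either name yields the same function pointer, so callers of the old name get the new behaviour without any separate wrapper to maintain.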