To: v.shpilevoy@tarantool.org, gorcunov@gmail.com
Date: Wed, 14 Jul 2021 21:25:29 +0300
X-Mailer: git-send-email 2.30.1 (Apple Git-130)
Subject: [Tarantool-patches] [PATCH v4 01/16] replication: always send raft state to subscribers
List-Id: Tarantool development patches
From: Serge Petrenko via Tarantool-patches
Reply-To: Serge Petrenko
Cc: tarantool-patches@dev.tarantool.org

Tarantool used to send out Raft state on subscribe only when Raft was
enabled. This was a safeguard against partially-upgraded clusters, where
some nodes had no clue about Raft messages and couldn't handle them
properly.

Actually, Raft state should be sent out always. For example, promote
will be changed to bump Raft term even when Raft is disabled, and it's
important that everyone in the cluster has the same term for the sake of
promote at least.

So, send out Raft state to every subscriber with version >= 2.6.0
(that's when Raft was introduced).
Do the same for Raft broadcasts. They should be sent only to replicas
with version >= 2.6.0.

Closes #5438
---
 src/box/box.cc                                | 11 ++--
 src/box/relay.cc                              |  4 +-
 .../replication/gh-5438-election-state.result | 63 +++++++++++++++++++
 .../gh-5438-election-state.test.lua           | 28 +++++++++
 test/replication/suite.cfg                    |  1 +
 5 files changed, 100 insertions(+), 7 deletions(-)
 create mode 100644 test/replication/gh-5438-election-state.result
 create mode 100644 test/replication/gh-5438-election-state.test.lua

diff --git a/src/box/box.cc b/src/box/box.cc
index eeb57b04e..5dcf5b460 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -82,6 +82,7 @@
 #include "msgpack.h"
 #include "raft.h"
 #include "trivia/util.h"
+#include "version.h"
 
 enum {
 	IPROTO_THREADS_MAX = 1000,
@@ -2831,13 +2832,13 @@ box_process_subscribe(struct ev_io *io, struct xrow_header *header)
 		 tt_uuid_str(&replica_uuid), sio_socketname(io->fd));
 	say_info("remote vclock %s local vclock %s",
 		 vclock_to_string(&replica_clock), vclock_to_string(&vclock));
-	if (raft_is_enabled(box_raft())) {
+	if (replica_version_id >= version_id(2, 6, 0) && !anon) {
 		/*
 		 * Send out the current raft state of the instance. Don't do
-		 * that if Raft is disabled. It can be that a part of the
-		 * cluster still contains old versions, which can't handle Raft
-		 * messages. So when it is disabled, its network footprint
-		 * should be 0.
+		 * that if the remote instance is old. It can be that a part of
+		 * the cluster still contains old versions, which can't handle
+		 * Raft messages. Raft's network footprint should be 0 as seen
+		 * by such instances.
 		 */
 		struct raft_request req;
 		box_raft_checkpoint_remote(&req);
diff --git a/src/box/relay.cc b/src/box/relay.cc
index 115037fc3..60f527b7f 100644
--- a/src/box/relay.cc
+++ b/src/box/relay.cc
@@ -800,7 +800,7 @@ relay_subscribe_f(va_list ap)
 			  &relay->relay_pipe, NULL, NULL, cbus_process);
 
 	struct relay_is_raft_enabled_msg raft_enabler;
-	if (!relay->replica->anon)
+	if (!relay->replica->anon && relay->version_id >= version_id(2, 6, 0))
 		relay_send_is_raft_enabled(relay, &raft_enabler, true);
 
 	/*
@@ -883,7 +883,7 @@ relay_subscribe_f(va_list ap)
 		cpipe_push(&relay->tx_pipe, &relay->status_msg.msg);
 	}
 
-	if (!relay->replica->anon)
+	if (!relay->replica->anon && relay->version_id >= version_id(2, 6, 0))
 		relay_send_is_raft_enabled(relay, &raft_enabler, false);
 
 	/*
diff --git a/test/replication/gh-5438-election-state.result b/test/replication/gh-5438-election-state.result
new file mode 100644
index 000000000..6985f026a
--- /dev/null
+++ b/test/replication/gh-5438-election-state.result
@@ -0,0 +1,63 @@
+-- test-run result file version 2
+test_run = require('test_run').new()
+ | ---
+ | ...
+
+--
+-- gh-5438 send out Raft state to subscribers, even when Raft is disabled.
+--
+-- Bump Raft term while the replica's offline.
+term = box.info.election.term
+ | ---
+ | ...
+old_election_mode = box.cfg.election_mode
+ | ---
+ | ...
+box.cfg{election_mode = 'candidate'}
+ | ---
+ | ...
+test_run:wait_cond(function() return box.info.election.term > term end)
+ | ---
+ | - true
+ | ...
+
+-- Make sure the replica receives new term on subscribe.
+box.cfg{election_mode = 'off'}
+ | ---
+ | ...
+
+box.schema.user.grant('guest', 'replication')
+ | ---
+ | ...
+test_run:cmd('create server replica with rpl_master=default,\
+              script="replication/replica.lua"')
+ | ---
+ | - true
+ | ...
+test_run:cmd('start server replica')
+ | ---
+ | - true
+ | ...
+test_run:wait_cond(function()\
+    return test_run:eval('replica', 'return box.info.election.term')[1] ==\
+           box.info.election.term\
+end)
+ | ---
+ | - true
+ | ...
+
+-- Cleanup.
+box.cfg{election_mode = old_election_mode}
+ | ---
+ | ...
+test_run:cmd('stop server replica')
+ | ---
+ | - true
+ | ...
+test_run:cmd('delete server replica')
+ | ---
+ | - true
+ | ...
+box.schema.user.revoke('guest', 'replication')
+ | ---
+ | ...
diff --git a/test/replication/gh-5438-election-state.test.lua b/test/replication/gh-5438-election-state.test.lua
new file mode 100644
index 000000000..60c3366c1
--- /dev/null
+++ b/test/replication/gh-5438-election-state.test.lua
@@ -0,0 +1,28 @@
+test_run = require('test_run').new()
+
+--
+-- gh-5438 send out Raft state to subscribers, even when Raft is disabled.
+--
+-- Bump Raft term while the replica's offline.
+term = box.info.election.term
+old_election_mode = box.cfg.election_mode
+box.cfg{election_mode = 'candidate'}
+test_run:wait_cond(function() return box.info.election.term > term end)
+
+-- Make sure the replica receives new term on subscribe.
+box.cfg{election_mode = 'off'}
+
+box.schema.user.grant('guest', 'replication')
+test_run:cmd('create server replica with rpl_master=default,\
+              script="replication/replica.lua"')
+test_run:cmd('start server replica')
+test_run:wait_cond(function()\
+    return test_run:eval('replica', 'return box.info.election.term')[1] ==\
+           box.info.election.term\
+end)
+
+-- Cleanup.
+box.cfg{election_mode = old_election_mode}
+test_run:cmd('stop server replica')
+test_run:cmd('delete server replica')
+box.schema.user.revoke('guest', 'replication')
diff --git a/test/replication/suite.cfg b/test/replication/suite.cfg
index 69f2f3511..ae146c366 100644
--- a/test/replication/suite.cfg
+++ b/test/replication/suite.cfg
@@ -19,6 +19,7 @@
     "gh-5213-qsync-applier-order-3.test.lua": {},
     "gh-5426-election-on-off.test.lua": {},
     "gh-5433-election-restart-recovery.test.lua": {},
+    "gh-5438-election-state.test.lua": {},
     "gh-5445-leader-inconsistency.test.lua": {},
     "gh-5506-election-on-off.test.lua": {},
     "once.test.lua": {},
-- 
2.30.1 (Apple Git-130)
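
For readers following the commit message, the version gate it describes can be sketched in isolation. This is only an illustration under stated assumptions, not code from the patch: sketch_version_id(), struct sketch_subscriber and sketch_should_send_raft() are hypothetical names, and the one-byte-per-component packing is an assumption about how a version_id()-style helper makes versions comparable as plain integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: each version component is packed into its own byte, so
 * that plain integer comparison orders versions correctly. This mirrors
 * the intent of version_id() used in the patch; it is not copied from
 * Tarantool.
 */
static inline uint32_t
sketch_version_id(unsigned major, unsigned minor, unsigned patch)
{
	assert(major < 256 && minor < 256 && patch < 256);
	return ((((uint32_t)major << 8) | minor) << 8) | patch;
}

/* Hypothetical stand-in for the per-subscriber fields the patch checks. */
struct sketch_subscriber {
	uint32_t version_id;
	bool is_anon;
};

/*
 * Decide whether Raft state may be sent to a subscriber. The local
 * election mode is deliberately not an input: only the remote version
 * and the anonymity flag matter.
 */
static inline bool
sketch_should_send_raft(const struct sketch_subscriber *s)
{
	return !s->is_anon && s->version_id >= sketch_version_id(2, 6, 0);
}
```

With this predicate a 2.5.x replica and an anonymous replica see no Raft traffic at all, while any non-anonymous replica at 2.6.0 or newer receives the term regardless of election_mode on the sender.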