From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from [87.239.111.99] (localhost [127.0.0.1]) by dev.tarantool.org (Postfix) with ESMTP id DE87A6EC40; Thu, 3 Jun 2021 20:15:33 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org DE87A6EC40 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev; t=1622740534; bh=nMyg9GRzYicTNwSAUDdgS8ABNIoL47cj7MJcwYy5frk=; h=To:Date:Subject:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=S6ntKddeksNHMkTYnT2nysNS9w/DQiUDe11MndMOXnV6qOz/4A9iBdrsUWvekysjr 8rNCqsRAko9Fw+gUyY7V37iHrv/p0Sy28wZ//9nNdNps3W+oW1cjh2dexBT7TT88vK QSLlyrFEXL0VI9MBj4C4q0lYHr3oML+ZoAUlBhP8= Received: from smtpng2.m.smailru.net (smtpng2.m.smailru.net [94.100.179.3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 5E1396EC40 for ; Thu, 3 Jun 2021 20:15:32 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 5E1396EC40 Received: by smtpng2.m.smailru.net with esmtpa (envelope-from ) id 1loqwN-0000d6-8a; Thu, 03 Jun 2021 20:15:31 +0300 To: tarantool-patches@dev.tarantool.org, sergepetrenko@tarantool.org, gorcunov@gmail.com Date: Thu, 3 Jun 2021 19:15:30 +0200 Message-Id: <94aeed6e00578cf917cf009b537342d6823d1f01.1622740090.git.v.shpilevoy@tarantool.org> X-Mailer: git-send-email 2.24.3 (Apple Git-128) MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-7564579A: 646B95376F6C166E X-77F55803: 4F1203BC0FB41BD9D5B0DA836B685C54F4BC37E91F2690B85F43D7652182C513182A05F538085040DC628114CC66C8BEB05087AF49FA16AF6960A7B3237F6041AA9550279EFB7F31 X-7FA49CB5: FF5795518A3D127A4AD6D5ED66289B5278DA827A17800CE7B4F69EC2502AC40EEA1F7E6F0F101C67BD4B6F7A4D31EC0BCC500DACC3FED6E28638F802B75D45FF8AA50765F7900637B3D52627AD81B52CEA1F7E6F0F101C6723150C8DA25C47586E58E00D9D99D84E1BDDB23E98D2D38BD6CF32B5F8F9D40455AEA6011DFE13FFA4BC30668D6B56FBCC7F00164DA146DAFE8445B8C89999728AA50765F790063783E00425F71A4181389733CBF5DBD5E9C8A9BA7A39EFB766F5D81C698A659EA7CC7F00164DA146DA9985D098DBDEAEC81D471462564A2E19F6B57BC7E6449061A352F6E88A58FB86F5D81C698A659EA7E827F84554CEF5019E625A9149C048EE9ECD01F8117BC8BEE2021AF6380DFAD18AA50765F790063735872C767BF85DA227C277FBC8AE2E8B9F5955FECEF5819E75ECD9A6C639B01B4E70A05D1297E1BBCB5012B2E24CD356 X-C1DE0DAB: 8BD88D57C5CADBC8B2710865C386751094C72BDDC9A8ED5CA3B1A56EE2B804F6B226C914C996894645FD9D8A29397D6EFF55FE49A3C2BFCFC8C84E951CD0BE2F296C473AB1E14218EA052B563B0B06C67866D6147AF826D8E64B759B80A134C9AE12A9558E65F7FCF972CCD2F8FE1EF1CFC4036BBF6A4EA9BCF3CAC6AE8A31D7 X-C8649E89: 4E36BF7865823D7055A7F0CF078B5EC49A30900B95165D34829444FF2D8CB89B9ECDED8993E1781911898D4DDFD0A4D399FA8C424CD3F3CF869164E6F243D15C1D7E09C32AA3244C159017B63D1A0FC15CCCEB64D1EAAD61E3D93501275E802FFACE5A9C96DEB163 X-D57D3AED: 3ZO7eAau8CL7WIMRKs4sN3D3tLDjz0dLbV79QFUyzQ2Ujvy7cMT6pYYqY16iZVKkSc3dCLJ7zSJH7+u4VD18S7Vl4ZUrpaVfd2+vE6kuoey4m4VkSEu530nj6fImhcD4MUrOEAnl0W826KZ9Q+tr5ycPtXkTV4k65bRjmOUUP8cvGozZ33TWg5HZplvhhXbhDGzqmQDTd6OAevLeAnq3Ra9uf7zvY2zzsIhlcp/Y7m53TZgf2aB4JOg4gkr2biojNbQUdF9mq4E5st2+z1lIqg== X-Mailru-Sender: 689FA8AB762F73936BC43F508A063822A53B3609E9C4E193241314C6B147DE9E3841015FED1DE5223CC9A89AB576DD93FB559BB5D741EB963CF37A108A312F5C27E8A8C3839CE0E267EA787935ED9F1B X-Mras: Ok Subject: [Tarantool-patches] [PATCH 1/1] replication: prevent boot when rs uuid mismatches X-BeenThere: tarantool-patches@dev.tarantool.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Vladislav Shpilevoy via Tarantool-patches Reply-To: Vladislav Shpilevoy Errors-To: tarantool-patches-bounces@dev.tarantool.org Sender: "Tarantool-patches" When an instance was being bootstrapped, it didn't check if its replication sources had the same replicaset UUID. As a result, if they didn't match, it used to boot from any of them almost randomly (using selection among non-read-only nodes, and min uuid among these) and raise an error about the mismatching ones later. Obviously, there is something wrong, such replication config is not valid and the node should not boot at all. The patch tries to prevent such instance's bootstrap if it sees at least 2 replicas with different replicaset UUID. It does not solve the problem for all the cases because still one of the replicas simply could be not available. Then the new node would give up and boot from the other node successfully, and notice replicaset UUID mismatch only when the connection to the first node is restored. But it should help for most of the boot attempts. Closes #5613 @TarantoolBot document Title: New field in IPROTO_BALLOT `IPROTO_BALLOT(0x29)` is a response for `IPROTO_VOTE(0x44)`. It used to contain 5 fields (is_ro, vclock, gc_vclock, etc). Now it also contains `IPROTO_BALLOT_REPLICASET_UUID(0x06)`. It is a UUID string showing replicaset UUID of the sender. It can be nil UUID (when all digits are 0) when not known. It is optional. When omitted, it is assumed to be not known. --- Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-5613-cross-boot-uuid Issue: https://github.com/tarantool/tarantool/issues/5613 I am not sure this bug is worth fixing really. Can't even say it is a bug. Sending it on a review to get more opinions. I am 50/50 for closing it as wontfix and for pushing this patch. It would be good if there was a way to fix it without changing the protocol. Then I wouldn't doubt. But I couldn't find a way except patch the ballot. I also thought about adding replicaset UUID in the greeting, but it has plain text format, which means it is quite fragile when it comes to any changes. In terms of backward compatibility. I am open to any ideas how to fix it alternatively. And to opinions that we don't want it "fixed". .../unreleased/gh-5613-cross-boot-uuid.md | 6 +++ src/box/box.cc | 1 + src/box/errcode.h | 1 + src/box/iproto_constants.h | 1 + src/box/replication.cc | 14 +++++ src/box/xrow.c | 16 ++++-- src/box/xrow.h | 2 + test/box/error.result | 1 + .../gh-5613-cross-bootstrap.result | 54 +++++++++++++++++++ .../gh-5613-cross-bootstrap.test.lua | 16 ++++++ test/replication/gh-5613-master1.lua | 4 ++ test/replication/gh-5613-master2.lua | 4 ++ test/replication/gh-5613-replica.lua | 4 ++ test/replication/suite.cfg | 1 + 14 files changed, 122 insertions(+), 3 deletions(-) create mode 100644 changelogs/unreleased/gh-5613-cross-boot-uuid.md create mode 100644 test/replication/gh-5613-cross-bootstrap.result create mode 100644 test/replication/gh-5613-cross-bootstrap.test.lua create mode 100644 test/replication/gh-5613-master1.lua create mode 100644 test/replication/gh-5613-master2.lua create mode 100644 test/replication/gh-5613-replica.lua diff --git a/changelogs/unreleased/gh-5613-cross-boot-uuid.md b/changelogs/unreleased/gh-5613-cross-boot-uuid.md new file mode 100644 index 000000000..99b9e4428 --- /dev/null +++ b/changelogs/unreleased/gh-5613-cross-boot-uuid.md @@ -0,0 +1,6 @@ +## bugfix/replication + +* Fixed an error when a replica at attempt to boot from instances of different + replicasets (with not the same replicaset UUID) used to boot from one of the + instances and raise an error about the others. Now it won't boot at all if + detects this situation (gh-5613). diff --git a/src/box/box.cc b/src/box/box.cc index 6dc991dc8..13747cb7b 100644 --- a/src/box/box.cc +++ b/src/box/box.cc @@ -2872,6 +2872,7 @@ box_process_vote(struct ballot *ballot) ballot->is_loading = is_ro; vclock_copy(&ballot->vclock, &replicaset.vclock); vclock_copy(&ballot->gc_vclock, &gc.vclock); + ballot->replicaset_uuid = REPLICASET_UUID; } /** Insert a new cluster into _schema */ diff --git a/src/box/errcode.h b/src/box/errcode.h index d93820e96..4002d27ee 100644 --- a/src/box/errcode.h +++ b/src/box/errcode.h @@ -277,6 +277,7 @@ struct errcode_record { /*222 */_(ER_QUORUM_WAIT, "Couldn't wait for quorum %d: %s") \ /*223 */_(ER_INTERFERING_PROMOTE, "Instance with replica id %u was promoted first") \ /*224 */_(ER_RAFT_DISABLED, "Elections were turned off while running box.ctl.promote()")\ + /*225 */_(ER_REPLICASET_UUID_CONFLICT, "Conflicting replicaset UUIDs %s and %s") \ /* * !IMPORTANT! Please follow instructions at start of the file diff --git a/src/box/iproto_constants.h b/src/box/iproto_constants.h index 7362ddaf1..7ea7e4a14 100644 --- a/src/box/iproto_constants.h +++ b/src/box/iproto_constants.h @@ -167,6 +167,7 @@ enum iproto_ballot_key { IPROTO_BALLOT_GC_VCLOCK = 0x03, IPROTO_BALLOT_IS_LOADING = 0x04, IPROTO_BALLOT_IS_ANON = 0x05, + IPROTO_BALLOT_REPLICASET_UUID = 0x06, }; #define bit(c) (1ULL<state != APPLIER_CONNECTED) applier_stop(applier); + struct tt_uuid *uuid = &applier->ballot.replicaset_uuid; + if (tt_uuid_is_nil(uuid)) + continue; + if (tt_uuid_is_nil(&replicaset_uuid)) { + replicaset_uuid = *uuid; + continue; + } + if (tt_uuid_is_equal(&replicaset_uuid, uuid)) + continue; + diag_set(ClientError, ER_REPLICASET_UUID_CONFLICT, + tt_uuid_str(&replicaset_uuid), tt_uuid_str(uuid)); + goto error; } /* Now all the appliers are connected, update the replica set. */ diff --git a/src/box/xrow.c b/src/box/xrow.c index 2e364cea5..286a6323f 100644 --- a/src/box/xrow.c +++ b/src/box/xrow.c @@ -449,7 +449,7 @@ iproto_reply_vote(struct obuf *out, const struct ballot *ballot, uint64_t sync, uint32_t schema_version) { size_t max_size = IPROTO_HEADER_LEN + mp_sizeof_map(1) + - mp_sizeof_uint(UINT32_MAX) + mp_sizeof_map(5) + + mp_sizeof_uint(UINT32_MAX) + mp_sizeof_map(6) + mp_sizeof_uint(UINT32_MAX) + mp_sizeof_bool(ballot->is_ro) + mp_sizeof_uint(UINT32_MAX) + mp_sizeof_bool(ballot->is_loading) + mp_sizeof_uint(IPROTO_BALLOT_IS_ANON) + @@ -457,7 +457,9 @@ iproto_reply_vote(struct obuf *out, const struct ballot *ballot, mp_sizeof_uint(UINT32_MAX) + mp_sizeof_vclock_ignore0(&ballot->vclock) + mp_sizeof_uint(UINT32_MAX) + - mp_sizeof_vclock_ignore0(&ballot->gc_vclock); + mp_sizeof_vclock_ignore0(&ballot->gc_vclock) + + mp_sizeof_uint(UINT32_MAX) + + mp_sizeof_str(UUID_STR_LEN); char *buf = obuf_reserve(out, max_size); if (buf == NULL) { @@ -469,7 +471,7 @@ iproto_reply_vote(struct obuf *out, const struct ballot *ballot, char *data = buf + IPROTO_HEADER_LEN; data = mp_encode_map(data, 1); data = mp_encode_uint(data, IPROTO_BALLOT); - data = mp_encode_map(data, 5); + data = mp_encode_map(data, 6); data = mp_encode_uint(data, IPROTO_BALLOT_IS_RO); data = mp_encode_bool(data, ballot->is_ro); data = mp_encode_uint(data, IPROTO_BALLOT_IS_LOADING); @@ -480,6 +482,8 @@ iproto_reply_vote(struct obuf *out, const struct ballot *ballot, data = mp_encode_vclock_ignore0(data, &ballot->vclock); data = mp_encode_uint(data, IPROTO_BALLOT_GC_VCLOCK); data = mp_encode_vclock_ignore0(data, &ballot->gc_vclock); + data = mp_encode_uint(data, IPROTO_BALLOT_REPLICASET_UUID); + data = xrow_encode_uuid(data, &ballot->replicaset_uuid); size_t size = data - buf; assert(size <= max_size); @@ -1361,6 +1365,7 @@ xrow_decode_ballot(struct xrow_header *row, struct ballot *ballot) ballot->is_loading = false; ballot->is_anon = false; vclock_create(&ballot->vclock); + ballot->replicaset_uuid = uuid_nil; const char *start = NULL; const char *end = NULL; @@ -1424,6 +1429,11 @@ xrow_decode_ballot(struct xrow_header *row, struct ballot *ballot) &ballot->gc_vclock) != 0) goto err; break; + case IPROTO_BALLOT_REPLICASET_UUID: + if (xrow_decode_uuid(&data, + &ballot->replicaset_uuid) != 0) + goto err; + break; default: mp_next(&data); } diff --git a/src/box/xrow.h b/src/box/xrow.h index b3c664be2..4148a2314 100644 --- a/src/box/xrow.h +++ b/src/box/xrow.h @@ -382,6 +382,8 @@ struct ballot { struct vclock vclock; /** Oldest vclock available on the instance. */ struct vclock gc_vclock; + /** Replicaset UUID of the sender. Nil when unknown. */ + struct tt_uuid replicaset_uuid; }; /** diff --git a/test/box/error.result b/test/box/error.result index cc8cbaaa9..56e1bf8ab 100644 --- a/test/box/error.result +++ b/test/box/error.result @@ -443,6 +443,7 @@ t; | 222: box.error.QUORUM_WAIT | 223: box.error.INTERFERING_PROMOTE | 224: box.error.RAFT_DISABLED + | 225: box.error.REPLICASET_UUID_CONFLICT | ... test_run:cmd("setopt delimiter ''"); diff --git a/test/replication/gh-5613-cross-bootstrap.result b/test/replication/gh-5613-cross-bootstrap.result new file mode 100644 index 000000000..e45422f62 --- /dev/null +++ b/test/replication/gh-5613-cross-bootstrap.result @@ -0,0 +1,54 @@ +-- test-run result file version 2 +test_run = require('test_run').new() + | --- + | ... + +test_run:cmd('create server master1 with script="replication/gh-5613-master1.lua"') + | --- + | - true + | ... +test_run:cmd('start server master1') + | --- + | - true + | ... +test_run:cmd('create server master2 with script="replication/gh-5613-master2.lua"') + | --- + | - true + | ... +test_run:cmd('start server master2') + | --- + | - true + | ... + +test_run:cmd('create server replica with script="replication/gh-5613-replica.lua"') + | --- + | - true + | ... +test_run:cmd('start server replica with crash_expected=True') + | --- + | - false + | ... +opts = {filename = 'gh-5613-replica.log'} + | --- + | ... +assert(test_run:grep_log(nil, 'ER_REPLICASET_UUID_CONFLICT', nil, opts) ~= nil) + | --- + | - true + | ... + +test_run:cmd('stop server master2') + | --- + | - true + | ... +test_run:cmd('delete server master2') + | --- + | - true + | ... +test_run:cmd('stop server master1') + | --- + | - true + | ... +test_run:cmd('delete server master1') + | --- + | - true + | ... diff --git a/test/replication/gh-5613-cross-bootstrap.test.lua b/test/replication/gh-5613-cross-bootstrap.test.lua new file mode 100644 index 000000000..1af9c49fd --- /dev/null +++ b/test/replication/gh-5613-cross-bootstrap.test.lua @@ -0,0 +1,16 @@ +test_run = require('test_run').new() + +test_run:cmd('create server master1 with script="replication/gh-5613-master1.lua"') +test_run:cmd('start server master1') +test_run:cmd('create server master2 with script="replication/gh-5613-master2.lua"') +test_run:cmd('start server master2') + +test_run:cmd('create server replica with script="replication/gh-5613-replica.lua"') +test_run:cmd('start server replica with crash_expected=True') +opts = {filename = 'gh-5613-replica.log'} +assert(test_run:grep_log(nil, 'ER_REPLICASET_UUID_CONFLICT', nil, opts) ~= nil) + +test_run:cmd('stop server master2') +test_run:cmd('delete server master2') +test_run:cmd('stop server master1') +test_run:cmd('delete server master1') diff --git a/test/replication/gh-5613-master1.lua b/test/replication/gh-5613-master1.lua new file mode 100644 index 000000000..3fb77dd0c --- /dev/null +++ b/test/replication/gh-5613-master1.lua @@ -0,0 +1,4 @@ +#!/usr/bin/env tarantool + +require('console').listen(os.getenv('ADMIN')) +box.cfg({listen = 'unix/:./master1.sock'}) diff --git a/test/replication/gh-5613-master2.lua b/test/replication/gh-5613-master2.lua new file mode 100644 index 000000000..8a9bd4ea9 --- /dev/null +++ b/test/replication/gh-5613-master2.lua @@ -0,0 +1,4 @@ +#!/usr/bin/env tarantool + +require('console').listen(os.getenv('ADMIN')) +box.cfg({listen = 'unix/:./master2.sock'}) diff --git a/test/replication/gh-5613-replica.lua b/test/replication/gh-5613-replica.lua new file mode 100644 index 000000000..ec9edf0cf --- /dev/null +++ b/test/replication/gh-5613-replica.lua @@ -0,0 +1,4 @@ +#!/usr/bin/env tarantool + +require('console').listen(os.getenv('ADMIN')) +box.cfg({replication = {'unix/:./master1.sock', 'unix/:./master2.sock'}}) diff --git a/test/replication/suite.cfg b/test/replication/suite.cfg index 27eab20c2..bcaedc7e1 100644 --- a/test/replication/suite.cfg +++ b/test/replication/suite.cfg @@ -44,6 +44,7 @@ "gh-5435-qsync-clear-synchro-queue-commit-all.test.lua": {}, "gh-5536-wal-limit.test.lua": {}, "gh-5566-final-join-synchro.test.lua": {}, + "gh-5613-cross-bootstrap.test.lua": {}, "gh-6032-promote-wal-write.test.lua": {}, "gh-6057-qsync-confirm-async-no-wal.test.lua": {}, "gh-6094-rs-uuid-mismatch.test.lua": {}, -- 2.24.3 (Apple Git-128)