From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org> To: tarantool-patches@dev.tarantool.org, sergepetrenko@tarantool.org, gorcunov@gmail.com Subject: [Tarantool-patches] [PATCH 4/4] replication: do not register outgoing connections Date: Sat, 12 Sep 2020 19:25:56 +0200 [thread overview] Message-ID: <a1fb9a174fdcf4a7824e96b5e438cf444efabb9e.1599931123.git.v.shpilevoy@tarantool.org> (raw) In-Reply-To: <cover.1599931123.git.v.shpilevoy@tarantool.org> Replication protocol's first stage for non-anonymous replicas is that the replica should be registered in _cluster to get a unique ID number. That happens, when replica connects to a writable node, which performs the registration. So it means, registration always happens on the master node when appears an *incoming* request for it, explicitly asking for a registration. Only relay can do that. That wasn't the case for bootstrap. If box.cfg.replication wasn't empty on the master node doing the cluster bootstrap, it registered all the outgoing connections in _cluster. Note, the target node could be even anonymous, but still was registered. That breaks the protocol, and leads to registration of anon replicas sometimes. The patch drops it. After the registration was dropped, appeared that it could lead to an anonymous replica stored on the master as non-anonymous, with an unknown anon status. That was because box.cfg.replication addresses were added to struct replicaset in order to create their appliers, even before they managed to connect back. As a result, when they did connect back, they didn't update the anon flag, and even got a garbage collection consumer object, preventing WAL files deletion on the master. Another motivation here is Raft cluster bootstrap specifics. During Raft bootstrap it is going to be very important that non-joined replicas should not be registered in _cluster. A replica can only register after its JOIN request was accepted, and its snapshot download has started. Closes #5287 --- src/box/applier.cc | 13 ++++ src/box/box.cc | 39 +++++++---- src/box/errcode.h | 1 + src/box/replication.cc | 12 ++++ src/box/replication.h | 8 +++ test/box/error.result | 1 + test/replication/autobootstrap_anon.lua | 25 +++++++ test/replication/autobootstrap_anon1.lua | 1 + test/replication/autobootstrap_anon2.lua | 1 + test/replication/gh-5287-boot-anon.result | 77 +++++++++++++++++++++ test/replication/gh-5287-boot-anon.test.lua | 29 ++++++++ test/replication/prune.result | 18 ++++- test/replication/prune.test.lua | 7 +- 13 files changed, 215 insertions(+), 17 deletions(-) create mode 100644 test/replication/autobootstrap_anon.lua create mode 120000 test/replication/autobootstrap_anon1.lua create mode 120000 test/replication/autobootstrap_anon2.lua create mode 100644 test/replication/gh-5287-boot-anon.result create mode 100644 test/replication/gh-5287-boot-anon.test.lua diff --git a/src/box/applier.cc b/src/box/applier.cc index 96dd48c0d..e272a7af6 100644 --- a/src/box/applier.cc +++ b/src/box/applier.cc @@ -107,6 +107,7 @@ applier_log_error(struct applier *applier, struct error *e) case ER_UNKNOWN_REPLICA: case ER_PASSWORD_MISMATCH: case ER_XLOG_GAP: + case ER_TOO_EARLY_SUBSCRIBE: say_info("will retry every %.2lf second", replication_reconnect_interval()); break; @@ -1306,6 +1307,18 @@ applier_f(va_list ap) applier_log_error(applier, e); applier_disconnect(applier, APPLIER_LOADING); goto reconnect; + } else if (e->errcode() == ER_TOO_EARLY_SUBSCRIBE) { + /* + * The instance is not anonymous, and is + * registered, but its ID is not delivered to + * all the nodes in the cluster yet, and some + * nodes may ask to retry connection later, + * until they receive _cluster record of this + * instance. From some third node, for example. + */ + applier_log_error(applier, e); + applier_disconnect(applier, APPLIER_LOADING); + goto reconnect; } else if (e->errcode() == ER_SYNC_QUORUM_TIMEOUT || e->errcode() == ER_SYNC_ROLLBACK) { /* diff --git a/src/box/box.cc b/src/box/box.cc index 3f2999fc0..fed2681f6 100644 --- a/src/box/box.cc +++ b/src/box/box.cc @@ -1992,9 +1992,33 @@ box_process_subscribe(struct ev_io *io, struct xrow_header *header) struct replica *replica = replica_by_uuid(&replica_uuid); if (!anon && (replica == NULL || replica->id == REPLICA_ID_NIL)) { - tnt_raise(ClientError, ER_UNKNOWN_REPLICA, - tt_uuid_str(&replica_uuid), - tt_uuid_str(&REPLICASET_UUID)); + /* + * The instance is not anonymous, and is registered (at least it + * claims so), but its ID is not delivered to the current + * instance yet. Need to wait until its _cluster record arrives + * from some third node. Likely to happen on bootstrap, when + * there is a fullmesh and 1 leader doing all the _cluster + * registrations. Not all of them are delivered to the other + * nodes yet. + * Also can happen when the replica is deleted from _cluster, + * but still tries to subscribe. It won't have an ID here. + */ + tnt_raise(ClientError, ER_TOO_EARLY_SUBSCRIBE, + tt_uuid_str(&replica_uuid)); + } + if (anon && replica != NULL && replica->id != REPLICA_ID_NIL) { + tnt_raise(ClientError, ER_PROTOCOL, "Can't subscribe an " + "anonymous replica having an ID assigned"); + } + if (anon && replica != NULL && + replica->id_status == REPLICA_ID_STATUS_UNKNOWN) { + /* + * A replica can already exist even before a first subscribe + * request, if it is specified in box.cfg.replication of this + * instance. In that case it was added with an unknown ID + * status to wait for ID or anon flag. Turns out it is anon. + */ + replica_set_anon(replica); } if (replica == NULL) replica = replicaset_add_anon(&replica_uuid); @@ -2207,15 +2231,6 @@ bootstrap_master(const struct tt_uuid *replicaset_uuid) box_register_replica(replica_id, &INSTANCE_UUID); assert(replica_by_uuid(&INSTANCE_UUID)->id == 1); - /* Register other cluster members */ - replicaset_foreach(replica) { - if (tt_uuid_is_equal(&replica->uuid, &INSTANCE_UUID)) - continue; - assert(replica->applier != NULL); - box_register_replica(++replica_id, &replica->uuid); - assert(replica->id == replica_id); - } - /* Set UUID of a new replica set */ box_set_replicaset_uuid(replicaset_uuid); diff --git a/src/box/errcode.h b/src/box/errcode.h index 9a240a431..e6957d612 100644 --- a/src/box/errcode.h +++ b/src/box/errcode.h @@ -272,6 +272,7 @@ struct errcode_record { /*217 */_(ER_SYNC_ROLLBACK, "A rollback for a synchronous transaction is received") \ /*218 */_(ER_TUPLE_METADATA_IS_TOO_BIG, "Can't create tuple: metadata size %u is too big") \ /*219 */_(ER_XLOG_GAP, "%s") \ + /*220 */_(ER_TOO_EARLY_SUBSCRIBE, "Can't subscribe non-anonymous replica %s until join is done") \ /* * !IMPORTANT! Please follow instructions at start of the file diff --git a/src/box/replication.cc b/src/box/replication.cc index 4cd08e6ff..3c95fe5df 100644 --- a/src/box/replication.cc +++ b/src/box/replication.cc @@ -254,6 +254,18 @@ replica_set_id(struct replica *replica, uint32_t replica_id) replica->id_status = REPLICA_ID_STATUS_REGISTERED; } +void +replica_set_anon(struct replica *replica) +{ + assert(replica->id == REPLICA_ID_NIL); + assert(replica->id_status == REPLICA_ID_STATUS_UNKNOWN); + assert(replica->gc == NULL); + replica->id_status = REPLICA_ID_STATUS_ANON; + replicaset.anon_count++; + say_info("replica %s is defined as anonymous", + tt_uuid_str(&replica->uuid)); +} + void replica_clear_id(struct replica *replica) { diff --git a/src/box/replication.h b/src/box/replication.h index 3e34d4544..5fbe39135 100644 --- a/src/box/replication.h +++ b/src/box/replication.h @@ -376,6 +376,14 @@ replicaset_next(struct replica *replica); void replica_set_id(struct replica *replica, uint32_t id); +/** + * Set the replica to be anonymous. That can only happen if the current ID + * status of the replica is unknown. A registered replica can't be downgraded to + * an anon replica so far. + */ +void +replica_set_anon(struct replica *replica); + /* * Clear the numeric replica-set-local id of a replica. * diff --git a/test/box/error.result b/test/box/error.result index 600517316..4d85b9e55 100644 --- a/test/box/error.result +++ b/test/box/error.result @@ -438,6 +438,7 @@ t; | 217: box.error.SYNC_ROLLBACK | 218: box.error.TUPLE_METADATA_IS_TOO_BIG | 219: box.error.XLOG_GAP + | 220: box.error.TOO_EARLY_SUBSCRIBE | ... test_run:cmd("setopt delimiter ''"); diff --git a/test/replication/autobootstrap_anon.lua b/test/replication/autobootstrap_anon.lua new file mode 100644 index 000000000..2e96d9d1a --- /dev/null +++ b/test/replication/autobootstrap_anon.lua @@ -0,0 +1,25 @@ +#!/usr/bin/env tarantool + +local INSTANCE_ID = string.match(arg[0], "%d") +local SOCKET_DIR = require('fio').cwd() +local ANON = arg[1] == 'true' + +local function instance_uri(instance_id) + return SOCKET_DIR..'/autobootstrap'..instance_id..'.sock'; +end + +require('console').listen(os.getenv('ADMIN')) + +box.cfg({ + listen = instance_uri(INSTANCE_ID), + replication = { + instance_uri(1), + instance_uri(2), + }; + replication_anon = ANON, + read_only = ANON, +}) + +box.once("bootstrap", function() + box.schema.user.grant('guest', 'super') +end) diff --git a/test/replication/autobootstrap_anon1.lua b/test/replication/autobootstrap_anon1.lua new file mode 120000 index 000000000..27e0fec36 --- /dev/null +++ b/test/replication/autobootstrap_anon1.lua @@ -0,0 +1 @@ +autobootstrap_anon.lua \ No newline at end of file diff --git a/test/replication/autobootstrap_anon2.lua b/test/replication/autobootstrap_anon2.lua new file mode 120000 index 000000000..27e0fec36 --- /dev/null +++ b/test/replication/autobootstrap_anon2.lua @@ -0,0 +1 @@ +autobootstrap_anon.lua \ No newline at end of file diff --git a/test/replication/gh-5287-boot-anon.result b/test/replication/gh-5287-boot-anon.result new file mode 100644 index 000000000..4df1bce8c --- /dev/null +++ b/test/replication/gh-5287-boot-anon.result @@ -0,0 +1,77 @@ +-- test-run result file version 2 +test_run = require('test_run').new() + | --- + | ... + +-- +-- gh-5287: when a cluster contained an anonymous replica during bootstrap, it +-- could be registered anyway. +-- + +test_run:cmd("create server replica1 with script='replication/autobootstrap_anon1.lua'") + | --- + | - true + | ... +test_run:cmd("start server replica1 with wait=False") + | --- + | - true + | ... + +test_run:cmd("create server replica2 with script='replication/autobootstrap_anon2.lua'") + | --- + | - true + | ... +test_run:cmd("start server replica2 with args='true', wait=False") + | --- + | - true + | ... + +test_run:switch('replica2') + | --- + | - true + | ... +-- Without box.info.replication test-run fails to wait a cond. +test_run:wait_cond(function() return next(box.info.replication) ~= nil end) + | --- + | - true + | ... +test_run:wait_upstream(1, {status = 'follow'}) + | --- + | - true + | ... + +test_run:switch('replica1') + | --- + | - true + | ... +-- The anonymous replica wasn't registered. +box.space._cluster:len() + | --- + | - 1 + | ... +box.info.gc().consumers + | --- + | - [] + | ... + +test_run:switch('default') + | --- + | - true + | ... + +test_run:cmd("stop server replica1") + | --- + | - true + | ... +test_run:cmd("delete server replica1") + | --- + | - true + | ... +test_run:cmd("stop server replica2") + | --- + | - true + | ... +test_run:cmd("delete server replica2") + | --- + | - true + | ... diff --git a/test/replication/gh-5287-boot-anon.test.lua b/test/replication/gh-5287-boot-anon.test.lua new file mode 100644 index 000000000..eb535a326 --- /dev/null +++ b/test/replication/gh-5287-boot-anon.test.lua @@ -0,0 +1,29 @@ +test_run = require('test_run').new() + +-- +-- gh-5287: when a cluster contained an anonymous replica during bootstrap, it +-- could be registered anyway. +-- + +test_run:cmd("create server replica1 with script='replication/autobootstrap_anon1.lua'") +test_run:cmd("start server replica1 with wait=False") + +test_run:cmd("create server replica2 with script='replication/autobootstrap_anon2.lua'") +test_run:cmd("start server replica2 with args='true', wait=False") + +test_run:switch('replica2') +-- Without box.info.replication test-run fails to wait a cond. +test_run:wait_cond(function() return next(box.info.replication) ~= nil end) +test_run:wait_upstream(1, {status = 'follow'}) + +test_run:switch('replica1') +-- The anonymous replica wasn't registered. +box.space._cluster:len() +box.info.gc().consumers + +test_run:switch('default') + +test_run:cmd("stop server replica1") +test_run:cmd("delete server replica1") +test_run:cmd("stop server replica2") +test_run:cmd("delete server replica2") diff --git a/test/replication/prune.result b/test/replication/prune.result index 1a130df40..67fd62990 100644 --- a/test/replication/prune.result +++ b/test/replication/prune.result @@ -133,7 +133,19 @@ test_run:cmd('eval replica1 "box.info.replication[1].upstream.message"') - ['The local instance id 2 is read-only'] ... -- restart replica and check that replica isn't able to join to cluster -test_run:cmd('restart server replica1') +test_run:cmd('stop server replica1') +--- +- true +... +test_run:cmd('start server replica1 with args="true 0"') +--- +- true +... +test_run:cmd('switch replica1') +--- +- true +... +test_run:wait_upstream(1, {message_re = "Can't subscribe non%-anonymous replica"}) --- - true ... @@ -147,9 +159,9 @@ box.space._cluster:len() == 1 ... test_run:cmd('eval replica1 "box.info.replication[1].upstream.status"') --- -- ['stopped'] +- ['loading'] ... -test_run:cmd('eval replica1 "box.info.replication[1].upstream.message"')[1]:match("is not registered with replica set") ~= nil +test_run:cmd('eval replica1 "box.info.replication[1].upstream.message"')[1]:match("Can't subscribe non%-anonymous replica") ~= nil --- - true ... diff --git a/test/replication/prune.test.lua b/test/replication/prune.test.lua index 80847325b..ea8b0b3c3 100644 --- a/test/replication/prune.test.lua +++ b/test/replication/prune.test.lua @@ -65,11 +65,14 @@ while test_run:cmd('eval replica1 "box.info.replication[1].upstream.status"')[1] test_run:cmd('eval replica1 "box.info.replication[1].upstream.message"') -- restart replica and check that replica isn't able to join to cluster -test_run:cmd('restart server replica1') +test_run:cmd('stop server replica1') +test_run:cmd('start server replica1 with args="true 0"') +test_run:cmd('switch replica1') +test_run:wait_upstream(1, {message_re = "Can't subscribe non%-anonymous replica"}) test_run:cmd('switch default') box.space._cluster:len() == 1 test_run:cmd('eval replica1 "box.info.replication[1].upstream.status"') -test_run:cmd('eval replica1 "box.info.replication[1].upstream.message"')[1]:match("is not registered with replica set") ~= nil +test_run:cmd('eval replica1 "box.info.replication[1].upstream.message"')[1]:match("Can't subscribe non%-anonymous replica") ~= nil replica_set.delete(test_run, 2) box.space.test:drop() -- 2.21.1 (Apple Git-122.3)
next prev parent reply other threads:[~2020-09-12 17:26 UTC|newest] Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-09-12 17:25 [Tarantool-patches] [PATCH 0/4] Boot with anon Vladislav Shpilevoy 2020-09-12 17:25 ` [Tarantool-patches] [PATCH 1/4] replication: replace anon flag with enum Vladislav Shpilevoy 2020-09-14 10:09 ` Cyrill Gorcunov 2020-09-12 17:25 ` [Tarantool-patches] [PATCH 2/4] xlog: introduce an error code for XlogGapError Vladislav Shpilevoy 2020-09-14 10:18 ` Cyrill Gorcunov 2020-09-12 17:25 ` [Tarantool-patches] [PATCH 3/4] replication: retry in case of XlogGapError Vladislav Shpilevoy 2020-09-14 12:27 ` Cyrill Gorcunov 2020-09-12 17:25 ` Vladislav Shpilevoy [this message] 2020-09-12 17:32 ` [Tarantool-patches] [PATCH 0/4] Boot with anon Vladislav Shpilevoy 2020-09-13 16:03 ` Vladislav Shpilevoy
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=a1fb9a174fdcf4a7824e96b5e438cf444efabb9e.1599931123.git.v.shpilevoy@tarantool.org \ --to=v.shpilevoy@tarantool.org \ --cc=gorcunov@gmail.com \ --cc=sergepetrenko@tarantool.org \ --cc=tarantool-patches@dev.tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH 4/4] replication: do not register outgoing connections' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox