From mboxrd@z Thu Jan 1 00:00:00 1970
To: v.shpilevoy@tarantool.org, gorcunov@gmail.com
Date: Wed, 24 Mar 2021 15:24:16 +0300
Message-Id: <14137c7d7aeea238e2f876bec388923840557cc7.1616588119.git.sergepetrenko@tarantool.org>
X-Mailer: git-send-email 2.24.3 (Apple Git-128)
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [Tarantool-patches] [PATCH v2 6/7] replication: tolerate synchro rollback during final join
List-Id: Tarantool development patches
From: Serge Petrenko via Tarantool-patches
Reply-To: Serge Petrenko
Cc: tarantool-patches@dev.tarantool.org

Both box_process_register and box_process_join had guards ensuring that
not a single rollback occurred for transactions residing in the WAL around
the replica's _cluster registration. Both functions would raise an error on
a rollback and make the replica retry final join.

The reason was that the replica couldn't process synchronous transactions
correctly during final join, because it applied the final join stream
row-by-row.

This path with retrying final join was a dead end: even if the master
managed to receive no ROLLBACK messages around the N-th retry of
box.space._cluster:insert{}, the replica would still have to receive and
process all the data dating back to its first _cluster registration
attempt. In other words, the guard against sending synchronous rows to the
replica didn't work.

Let's remove the guard altogether, since the replica is now capable of
processing synchronous txs in the final join stream and even of retrying
final join in case the _cluster registration was rolled back.

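For readers unfamiliar with the guard being removed, here is a minimal, self-contained sketch of the snapshot-and-compare pattern it followed. This is an illustration only and not Tarantool code: `LimboSketch`, `register_with_guard` and `register_without_guard` are hypothetical names, and the real functions raise ER_SYNC_ROLLBACK rather than returning a flag.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical stand-in for the limbo's rollback counter (not the real
// txn_limbo structure).
struct LimboSketch {
	std::int64_t rollback_count = 0;
};

// Old behaviour, sketched: remember the counter, run the registration
// step, and report failure if any rollback happened in between. In the
// patch this failure is the ER_SYNC_ROLLBACK error.
inline bool
register_with_guard(LimboSketch &limbo,
		    const std::function<void()> &do_register)
{
	const std::int64_t before = limbo.rollback_count;
	do_register();
	return limbo.rollback_count == before;
}

// New behaviour, sketched: no guard on the sending side. Rollbacks that
// happen meanwhile are left for the receiving side to process.
inline bool
register_without_guard(LimboSketch &limbo,
		       const std::function<void()> &do_register)
{
	(void) limbo; // the counter is intentionally not consulted
	do_register();
	return true;
}
```

The sketch only shows why the guarded variant fails whenever a rollback lands between the two counter reads; it does not model the WAL range that is fed to the replica afterwards.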
Closes #5566
---
 changelogs/unreleased/synchro-final-join.md   |   4 +
 src/box/applier.cc                            |   2 +
 src/box/box.cc                                |  24 ---
 src/box/relay.cc                              |   1 +
 .../gh-5566-final-join-synchro.result         | 139 ++++++++++++++++++
 .../gh-5566-final-join-synchro.test.lua       |  61 ++++++++
 test/replication/suite.cfg                    |   1 +
 7 files changed, 208 insertions(+), 24 deletions(-)
 create mode 100644 changelogs/unreleased/synchro-final-join.md
 create mode 100644 test/replication/gh-5566-final-join-synchro.result
 create mode 100644 test/replication/gh-5566-final-join-synchro.test.lua

diff --git a/changelogs/unreleased/synchro-final-join.md b/changelogs/unreleased/synchro-final-join.md
new file mode 100644
index 000000000..cef77df87
--- /dev/null
+++ b/changelogs/unreleased/synchro-final-join.md
@@ -0,0 +1,4 @@
+## bugfix/core
+
+* Fix a bug in applier erroring with `Unknown request type 40` during final join
+  when master has synchronous spaces (gh-5566).
diff --git a/src/box/applier.cc b/src/box/applier.cc
index 9a8b0f0fc..0d1b4d28d 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -109,6 +109,8 @@ applier_log_error(struct applier *applier, struct error *e)
 	case ER_PASSWORD_MISMATCH:
 	case ER_XLOG_GAP:
 	case ER_TOO_EARLY_SUBSCRIBE:
+	case ER_SYNC_QUORUM_TIMEOUT:
+	case ER_SYNC_ROLLBACK:
 		say_info("will retry every %.2lf second",
 			 replication_reconnect_interval());
 		break;
diff --git a/src/box/box.cc b/src/box/box.cc
index cc59564e1..292a54213 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -2163,8 +2163,6 @@ box_process_register(struct ev_io *io, struct xrow_header *header)
 	say_info("registering replica %s at %s",
 		 tt_uuid_str(&instance_uuid), sio_socketname(io->fd));
 
-	/* See box_process_join() */
-	int64_t limbo_rollback_count = txn_limbo.rollback_count;
 	struct vclock start_vclock;
 	vclock_copy(&start_vclock, &replicaset.vclock);
 
@@ -2180,12 +2178,6 @@ box_process_register(struct ev_io *io, struct xrow_header *header)
 	struct vclock stop_vclock;
 	vclock_copy(&stop_vclock, &replicaset.vclock);
 
-	if (txn_limbo.rollback_count != limbo_rollback_count)
-		tnt_raise(ClientError, ER_SYNC_ROLLBACK);
-
-	if (txn_limbo_wait_confirm(&txn_limbo) != 0)
-		diag_raise();
-
 	/*
 	 * Feed replica with WALs in range
 	 * (start_vclock, stop_vclock) so that it gets its
@@ -2307,15 +2299,6 @@ box_process_join(struct ev_io *io, struct xrow_header *header)
 	say_info("joining replica %s at %s",
 		 tt_uuid_str(&instance_uuid), sio_socketname(io->fd));
 
-	/*
-	 * In order to join a replica, master has to make sure it
-	 * doesn't send unconfirmed data. We have to check that
-	 * there are no rolled back transactions between
-	 * start_vclock and stop_vclock, and that the data right
-	 * before stop_vclock is confirmed, before we can proceed
-	 * to final join.
-	 */
-	int64_t limbo_rollback_count = txn_limbo.rollback_count;
 	/*
 	 * Initial stream: feed replica with dirty data from engines.
 	 */
@@ -2336,13 +2319,6 @@ box_process_join(struct ev_io *io, struct xrow_header *header)
 	/* Remember master's vclock after the last request */
 	struct vclock stop_vclock;
 	vclock_copy(&stop_vclock, &replicaset.vclock);
-
-	if (txn_limbo.rollback_count != limbo_rollback_count)
-		tnt_raise(ClientError, ER_SYNC_ROLLBACK);
-
-	if (txn_limbo_wait_confirm(&txn_limbo) != 0)
-		diag_raise();
-
 	/* Send end of initial stage data marker */
 	struct xrow_header row;
 	xrow_encode_vclock_xc(&row, &stop_vclock);
diff --git a/src/box/relay.cc b/src/box/relay.cc
index 41f949e8e..dd7a167e4 100644
--- a/src/box/relay.cc
+++ b/src/box/relay.cc
@@ -1035,6 +1035,7 @@ relay_send_row(struct xstream *stream, struct xrow_header *packet)
 				  ERRINJ_INT);
 	if (inj != NULL && packet->lsn == inj->iparam) {
 		packet->lsn = inj->iparam - 1;
+		packet->tsn = packet->lsn;
 		say_warn("injected broken lsn: %lld",
 			 (long long) packet->lsn);
 	}
diff --git a/test/replication/gh-5566-final-join-synchro.result b/test/replication/gh-5566-final-join-synchro.result
new file mode 100644
index 000000000..32749bf12
--- /dev/null
+++ b/test/replication/gh-5566-final-join-synchro.result
@@ -0,0 +1,139 @@
+-- test-run result file version 2
+test_run = require('test_run').new()
+ | ---
+ | ...
+
+--
+-- gh-5566 replica tolerates synchronous transactions in final join stream.
+--
+_ = box.schema.space.create('sync', {is_sync=true})
+ | ---
+ | ...
+_ = box.space.sync:create_index('pk')
+ | ---
+ | ...
+
+box.schema.user.grant('guest', 'replication')
+ | ---
+ | ...
+box.schema.user.grant('guest', 'write', 'space', 'sync')
+ | ---
+ | ...
+
+-- Part 1. Make sure a joining instance tolerates synchronous rows in final join
+-- stream.
+trig = function()\
+    box.space.sync:replace{1}\
+end
+ | ---
+ | ...
+-- The trigger will generate synchronous rows each time a replica joins.
+_ = box.space._cluster:on_replace(trig)
+ | ---
+ | ...
+
+orig_synchro_quorum = box.cfg.replication_synchro_quorum
+ | ---
+ | ...
+box.cfg{replication_synchro_quorum=1}
+ | ---
+ | ...
+
+test_run:cmd('create server replica with rpl_master=default,\
+              script="replication/replica.lua"')
+ | ---
+ | - true
+ | ...
+test_run:cmd('start server replica')
+ | ---
+ | - true
+ | ...
+
+test_run:switch('replica')
+ | ---
+ | - true
+ | ...
+test_run:wait_upstream(1, {status='follow'})
+ | ---
+ | - true
+ | ...
+
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+test_run:cmd('stop server replica')
+ | ---
+ | - true
+ | ...
+test_run:cmd('delete server replica')
+ | ---
+ | - true
+ | ...
+
+-- Part 2. Make sure master aborts final join if insert to _cluster is rolled
+-- back and replica is capable of retrying it.
+orig_synchro_timeout = box.cfg.replication_synchro_timeout
+ | ---
+ | ...
+-- Make the trigger we used above fail with no quorum.
+box.cfg{replication_synchro_quorum=2, replication_synchro_timeout=0.01}
+ | ---
+ | ...
+-- Try to join the replica once again.
+test_run:cmd('create server replica with rpl_master=default,\
+              script="replication/replica.lua"')
+ | ---
+ | - true
+ | ...
+test_run:cmd('start server replica with wait=False')
+ | ---
+ | - true
+ | ...
+
+test_run:wait_log('replica', 'ER_SYNC_QUORUM_TIMEOUT', nil, 10)
+ | ---
+ | - ER_SYNC_QUORUM_TIMEOUT
+ | ...
+-- Remove the trigger to let the replica connect.
+box.space._cluster:on_replace(nil, trig)
+ | ---
+ | ...
+
+test_run:switch('replica')
+ | ---
+ | - true
+ | ...
+test_run:wait_upstream(1, {status='follow'})
+ | ---
+ | - true
+ | ...
+
+-- Cleanup.
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+test_run:cmd('stop server replica')
+ | ---
+ | - true
+ | ...
+test_run:cmd('delete server replica')
+ | ---
+ | - true
+ | ...
+box.cfg{\
+    replication_synchro_quorum=orig_synchro_quorum,\
+    replication_synchro_timeout=orig_synchro_timeout\
+}
+ | ---
+ | ...
+box.space.sync:drop()
+ | ---
+ | ...
+test_run:cleanup_cluster()
+ | ---
+ | ...
+box.schema.user.revoke('guest', 'replication')
+ | ---
+ | ...
diff --git a/test/replication/gh-5566-final-join-synchro.test.lua b/test/replication/gh-5566-final-join-synchro.test.lua
new file mode 100644
index 000000000..14302f6e6
--- /dev/null
+++ b/test/replication/gh-5566-final-join-synchro.test.lua
@@ -0,0 +1,61 @@
+test_run = require('test_run').new()
+
+--
+-- gh-5566 replica tolerates synchronous transactions in final join stream.
+--
+_ = box.schema.space.create('sync', {is_sync=true})
+_ = box.space.sync:create_index('pk')
+
+box.schema.user.grant('guest', 'replication')
+box.schema.user.grant('guest', 'write', 'space', 'sync')
+
+-- Part 1. Make sure a joining instance tolerates synchronous rows in final join
+-- stream.
+trig = function()\
+    box.space.sync:replace{1}\
+end
+-- The trigger will generate synchronous rows each time a replica joins.
+_ = box.space._cluster:on_replace(trig)
+
+orig_synchro_quorum = box.cfg.replication_synchro_quorum
+box.cfg{replication_synchro_quorum=1}
+
+test_run:cmd('create server replica with rpl_master=default,\
+              script="replication/replica.lua"')
+test_run:cmd('start server replica')
+
+test_run:switch('replica')
+test_run:wait_upstream(1, {status='follow'})
+
+test_run:switch('default')
+test_run:cmd('stop server replica')
+test_run:cmd('delete server replica')
+
+-- Part 2. Make sure master aborts final join if insert to _cluster is rolled
+-- back and replica is capable of retrying it.
+orig_synchro_timeout = box.cfg.replication_synchro_timeout
+-- Make the trigger we used above fail with no quorum.
+box.cfg{replication_synchro_quorum=2, replication_synchro_timeout=0.01}
+-- Try to join the replica once again.
+test_run:cmd('create server replica with rpl_master=default,\
+              script="replication/replica.lua"')
+test_run:cmd('start server replica with wait=False')
+
+test_run:wait_log('replica', 'ER_SYNC_QUORUM_TIMEOUT', nil, 10)
+-- Remove the trigger to let the replica connect.
+box.space._cluster:on_replace(nil, trig)
+
+test_run:switch('replica')
+test_run:wait_upstream(1, {status='follow'})
+
+-- Cleanup.
+test_run:switch('default')
+test_run:cmd('stop server replica')
+test_run:cmd('delete server replica')
+box.cfg{\
+    replication_synchro_quorum=orig_synchro_quorum,\
+    replication_synchro_timeout=orig_synchro_timeout\
+}
+box.space.sync:drop()
+test_run:cleanup_cluster()
+box.schema.user.revoke('guest', 'replication')
diff --git a/test/replication/suite.cfg b/test/replication/suite.cfg
index 7e7004592..04a3c4bb2 100644
--- a/test/replication/suite.cfg
+++ b/test/replication/suite.cfg
@@ -38,6 +38,7 @@
     "gh-5440-qsync-ro.test.lua": {},
     "gh-5435-qsync-clear-synchro-queue-commit-all.test.lua": {},
     "gh-5536-wal-limit.test.lua": {},
+    "gh-5566-final-join-synchro.test.lua": {},
     "*": {
         "memtx": {"engine": "memtx"},
         "vinyl": {"engine": "vinyl"}
-- 
2.24.3 (Apple Git-128)