From: Serge Petrenko via Tarantool-patches
Reply-To: Serge Petrenko
To: v.shpilevoy@tarantool.org, gorcunov@gmail.com
Cc: tarantool-patches@dev.tarantool.org
Date: Tue, 29 Jun 2021 01:12:51 +0300
Message-Id: <4cfb9acadb67f85f0493515e2664b21954428875.1624918077.git.sergepetrenko@tarantool.org>
Subject: [Tarantool-patches] [PATCH v3 05/12] replication: forbid implicit limbo owner transition

Forbid limbo ownership transition without an explicit promote.
Allow synchronous transactions to be committed only after the limbo
has been claimed by some instance via a PROMOTE request.

Make everyone but the limbo owner read-only even when the limbo is
empty.

Part-of #6034

@TarantoolBot document
Title: synchronous replication changes

`box.info.synchro.queue` receives a new field: `owner`. It is the
replica id of the instance owning the synchronous transaction queue.

Once some instance owns the queue, every other instance becomes
read-only. When the queue is unclaimed, i.e.
`box.info.synchro.queue.owner` is `0`, every instance may be writable,
but none of them can create synchronous transactions.

In order to claim or re-claim the queue, you have to issue
`box.ctl.promote()` on the instance you wish to promote.

When elections are enabled, the instance issues `box.ctl.promote()`
automatically once it wins the elections; no additional actions are
required.
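
The ownership rules described above can be illustrated with a small
standalone model. This is only a sketch, not Tarantool code: `struct
limbo_model`, the `limbo_model_*` functions and `enum append_result` are
hypothetical names introduced for illustration, and promotion is
simplified to just emptying the queue, whereas the real PROMOTE handling
first confirms or rolls back the pending entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the "no owner" replica id (0 in box.info). */
enum { REPLICA_ID_NIL = 0 };

/* Hypothetical result codes mirroring the errors raised by the patch. */
enum append_result {
	APPEND_OK,
	APPEND_QUEUE_UNCLAIMED,		/* cf. ER_SYNC_QUEUE_UNCLAIMED */
	APPEND_QUEUE_FOREIGN,		/* cf. ER_SYNC_QUEUE_FOREIGN */
	APPEND_UNCOMMITTED_FOREIGN,	/* cf. ER_UNCOMMITTED_FOREIGN_SYNC_TXNS */
};

/* Simplified model of the limbo: only the owner and the queue length. */
struct limbo_model {
	uint32_t owner_id;
	int len;
};

/* An instance is read-only iff the queue is owned by someone else. */
static bool
limbo_model_is_ro(const struct limbo_model *limbo, uint32_t instance_id)
{
	return limbo->owner_id != REPLICA_ID_NIL &&
	       limbo->owner_id != instance_id;
}

/* A synchronous transaction is accepted only from the queue owner. */
static enum append_result
limbo_model_append(struct limbo_model *limbo, uint32_t id)
{
	if (limbo->owner_id == REPLICA_ID_NIL)
		return APPEND_QUEUE_UNCLAIMED;
	if (limbo->owner_id != id) {
		return limbo->len == 0 ? APPEND_QUEUE_FOREIGN :
					 APPEND_UNCOMMITTED_FOREIGN;
	}
	limbo->len++;
	return APPEND_OK;
}

/* Simplification: a promote empties the queue and changes the owner. */
static void
limbo_model_promote(struct limbo_model *limbo, uint32_t replica_id)
{
	assert(replica_id != REPLICA_ID_NIL);
	limbo->len = 0;
	limbo->owner_id = replica_id;
}
```

In this model an unclaimed queue leaves every instance writable but
rejects synchronous transactions, and an explicit promote is the only
way the owner ever changes.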
---
 src/box/errcode.h                          |   2 +
 src/box/lua/info.c                         |   4 +-
 src/box/txn_limbo.c                        |  32 ++---
 test/box/alter.result                      |   2 +-
 test/box/error.result                      |   1 +
 test/replication/gh-5440-qsync-ro.result   | 133 ---------------------
 test/replication/gh-5440-qsync-ro.test.lua |  53 --------
 test/replication/suite.cfg                 |   1 -
 8 files changed, 18 insertions(+), 210 deletions(-)
 delete mode 100644 test/replication/gh-5440-qsync-ro.result
 delete mode 100644 test/replication/gh-5440-qsync-ro.test.lua

diff --git a/src/box/errcode.h b/src/box/errcode.h
index 49aec4bf6..d42b64ef4 100644
--- a/src/box/errcode.h
+++ b/src/box/errcode.h
@@ -278,6 +278,8 @@ struct errcode_record {
 	/*223 */_(ER_INTERFERING_PROMOTE,	"Instance with replica id %u was promoted first") \
 	/*224 */_(ER_RAFT_DISABLED,		"Elections were turned off while running box.ctl.promote()")\
 	/*225 */_(ER_TXN_ROLLBACK,		"Transaction was rolled back") \
+	/*226 */_(ER_SYNC_QUEUE_UNCLAIMED,	"The synchronous transaction queue doesn't belong to any instance")\
+	/*227 */_(ER_SYNC_QUEUE_FOREIGN,	"The synchronous transaction queue belongs to other instance with id %u")\
 
 /*
  * !IMPORTANT! Please follow instructions at start of the file
diff --git a/src/box/lua/info.c b/src/box/lua/info.c
index f201b25e3..211d2baea 100644
--- a/src/box/lua/info.c
+++ b/src/box/lua/info.c
@@ -614,9 +614,11 @@ lbox_info_synchro(struct lua_State *L)
 
 	/* Queue information. */
 	struct txn_limbo *queue = &txn_limbo;
-	lua_createtable(L, 0, 1);
+	lua_createtable(L, 0, 2);
 	lua_pushnumber(L, queue->len);
 	lua_setfield(L, -2, "len");
+	lua_pushnumber(L, queue->owner_id);
+	lua_setfield(L, -2, "owner");
 	lua_setfield(L, -2, "queue");
 
 	return 1;
diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
index 16181b8a0..996f1a3fc 100644
--- a/src/box/txn_limbo.c
+++ b/src/box/txn_limbo.c
@@ -55,7 +55,8 @@ txn_limbo_create(struct txn_limbo *limbo)
 bool
 txn_limbo_is_ro(struct txn_limbo *limbo)
 {
-	return limbo->owner_id != instance_id && !txn_limbo_is_empty(limbo);
+	return limbo->owner_id != REPLICA_ID_NIL &&
+	       limbo->owner_id != instance_id;
 }
 
 struct txn_limbo_entry *
@@ -95,18 +96,18 @@ txn_limbo_append(struct txn_limbo *limbo, uint32_t id, struct txn *txn)
 	}
 	if (id == 0)
 		id = instance_id;
-	bool make_ro = false;
-	if (limbo->owner_id != id) {
-		if (rlist_empty(&limbo->queue)) {
-			limbo->owner_id = id;
-			limbo->confirmed_lsn = 0;
-			if (id != instance_id)
-				make_ro = true;
+	if (limbo->owner_id == REPLICA_ID_NIL) {
+		diag_set(ClientError, ER_SYNC_QUEUE_UNCLAIMED);
+		return NULL;
+	} else if (limbo->owner_id != id) {
+		if (txn_limbo_is_empty(limbo)) {
+			diag_set(ClientError, ER_SYNC_QUEUE_FOREIGN,
+				 limbo->owner_id);
 		} else {
 			diag_set(ClientError, ER_UNCOMMITTED_FOREIGN_SYNC_TXNS,
 				 limbo->owner_id);
-			return NULL;
 		}
+		return NULL;
 	}
 	size_t size;
 	struct txn_limbo_entry *e = region_alloc_object(&txn->region,
@@ -122,12 +123,6 @@ txn_limbo_append(struct txn_limbo *limbo, uint32_t id, struct txn *txn)
 	e->is_rollback = false;
 	rlist_add_tail_entry(&limbo->queue, e, in_queue);
 	limbo->len++;
-	/*
-	 * We added new entries from a remote instance to an empty limbo.
-	 * Time to make this instance read-only.
-	 */
-	if (make_ro)
-		box_update_ro_summary();
 	return e;
 }
 
@@ -432,9 +427,6 @@ txn_limbo_read_confirm(struct txn_limbo *limbo, int64_t lsn)
 		assert(e->txn->signature >= 0);
 		txn_complete_success(e->txn);
 	}
-	/* Update is_ro once the limbo is clear. */
-	if (txn_limbo_is_empty(limbo))
-		box_update_ro_summary();
 }
 
 /**
@@ -482,9 +474,6 @@ txn_limbo_read_rollback(struct txn_limbo *limbo, int64_t lsn)
 		if (e == last_rollback)
 			break;
 	}
-	/* Update is_ro once the limbo is clear. */
-	if (txn_limbo_is_empty(limbo))
-		box_update_ro_summary();
 }
 
 void
@@ -515,6 +504,7 @@ txn_limbo_read_promote(struct txn_limbo *limbo, uint32_t replica_id,
 	txn_limbo_read_rollback(limbo, lsn + 1);
 	assert(txn_limbo_is_empty(&txn_limbo));
 	limbo->owner_id = replica_id;
+	box_update_ro_summary();
 	limbo->confirmed_lsn = 0;
 }
 
diff --git a/test/box/alter.result b/test/box/alter.result
index a7bffce10..6a64f6b84 100644
--- a/test/box/alter.result
+++ b/test/box/alter.result
@@ -1464,7 +1464,7 @@ assert(s.is_sync)
 ...
 s:replace{1}
 ---
-- error: Quorum collection for a synchronous transaction is timed out
+- error: The synchronous transaction queue doesn't belong to any instance
 ...
 -- When not specified or nil - ignored.
 s:alter({is_sync = nil})
diff --git a/test/box/error.result b/test/box/error.result
index 062a90399..dfe593dc2 100644
--- a/test/box/error.result
+++ b/test/box/error.result
@@ -444,6 +444,7 @@ t;
  |   223: box.error.INTERFERING_PROMOTE
  |   224: box.error.RAFT_DISABLED
  |   225: box.error.TXN_ROLLBACK
+ |   226: box.error.SYNC_QUEUE_UNCLAIMED
  | ...
 
 test_run:cmd("setopt delimiter ''");
diff --git a/test/replication/gh-5440-qsync-ro.result b/test/replication/gh-5440-qsync-ro.result
deleted file mode 100644
index 1ece26a42..000000000
--- a/test/replication/gh-5440-qsync-ro.result
+++ /dev/null
@@ -1,133 +0,0 @@
--- test-run result file version 2
---
--- gh-5440 everyone but the limbo owner is read-only on non-empty limbo.
---
-env = require('test_run')
- | ---
- | ...
-test_run = env.new()
- | ---
- | ...
-fiber = require('fiber')
- | ---
- | ...
-
-box.schema.user.grant('guest', 'replication')
- | ---
- | ...
-test_run:cmd('create server replica with rpl_master=default, script="replication/replica.lua"')
- | ---
- | - true
- | ...
-test_run:cmd('start server replica with wait=True, wait_load=True')
- | ---
- | - true
- | ...
-
-_ = box.schema.space.create('test', {is_sync=true})
- | ---
- | ...
-_ = box.space.test:create_index('pk')
- | ---
- | ...
-
-old_synchro_quorum = box.cfg.replication_synchro_quorum
- | ---
- | ...
-old_synchro_timeout = box.cfg.replication_synchro_timeout
- | ---
- | ...
-
--- Make sure that the master stalls on commit leaving the limbo non-empty.
-box.cfg{replication_synchro_quorum=3, replication_synchro_timeout=1000}
- | ---
- | ...
-
-f = fiber.new(function() box.space.test:insert{1} end)
- | ---
- | ...
-f:status()
- | ---
- | - suspended
- | ...
-
--- Wait till replica's limbo is non-empty.
-test_run:wait_lsn('replica', 'default')
- | ---
- | ...
-test_run:cmd('switch replica')
- | ---
- | - true
- | ...
-
-box.info.ro
- | ---
- | - true
- | ...
-box.space.test:insert{2}
- | ---
- | - error: Can't modify data because this instance is in read-only mode.
- | ...
-success = false
- | ---
- | ...
-f = require('fiber').new(function() box.ctl.wait_rw() success = true end)
- | ---
- | ...
-f:status()
- | ---
- | - suspended
- | ...
-
-test_run:cmd('switch default')
- | ---
- | - true
- | ...
-
--- Empty the limbo.
-box.cfg{replication_synchro_quorum=2}
- | ---
- | ...
-
-test_run:cmd('switch replica')
- | ---
- | - true
- | ...
-
-test_run:wait_cond(function() return success end)
- | ---
- | - true
- | ...
-box.info.ro
- | ---
- | - false
- | ...
--- Should succeed now.
-box.space.test:insert{2}
- | ---
- | - [2]
- | ...
-
--- Cleanup.
-test_run:cmd('switch default')
- | ---
- | - true
- | ...
-box.cfg{replication_synchro_quorum=old_synchro_quorum,\
-        replication_synchro_timeout=old_synchro_timeout}
- | ---
- | ...
-box.space.test:drop()
- | ---
- | ...
-test_run:cmd('stop server replica')
- | ---
- | - true
- | ...
-test_run:cmd('delete server replica')
- | ---
- | - true
- | ...
-box.schema.user.revoke('guest', 'replication')
- | ---
- | ...
diff --git a/test/replication/gh-5440-qsync-ro.test.lua b/test/replication/gh-5440-qsync-ro.test.lua
deleted file mode 100644
index d63ec9c1e..000000000
--- a/test/replication/gh-5440-qsync-ro.test.lua
+++ /dev/null
@@ -1,53 +0,0 @@
---
--- gh-5440 everyone but the limbo owner is read-only on non-empty limbo.
---
-env = require('test_run')
-test_run = env.new()
-fiber = require('fiber')
-
-box.schema.user.grant('guest', 'replication')
-test_run:cmd('create server replica with rpl_master=default, script="replication/replica.lua"')
-test_run:cmd('start server replica with wait=True, wait_load=True')
-
-_ = box.schema.space.create('test', {is_sync=true})
-_ = box.space.test:create_index('pk')
-
-old_synchro_quorum = box.cfg.replication_synchro_quorum
-old_synchro_timeout = box.cfg.replication_synchro_timeout
-
--- Make sure that the master stalls on commit leaving the limbo non-empty.
-box.cfg{replication_synchro_quorum=3, replication_synchro_timeout=1000}
-
-f = fiber.new(function() box.space.test:insert{1} end)
-f:status()
-
--- Wait till replica's limbo is non-empty.
-test_run:wait_lsn('replica', 'default')
-test_run:cmd('switch replica')
-
-box.info.ro
-box.space.test:insert{2}
-success = false
-f = require('fiber').new(function() box.ctl.wait_rw() success = true end)
-f:status()
-
-test_run:cmd('switch default')
-
--- Empty the limbo.
-box.cfg{replication_synchro_quorum=2}
-
-test_run:cmd('switch replica')
-
-test_run:wait_cond(function() return success end)
-box.info.ro
--- Should succeed now.
-box.space.test:insert{2}
-
--- Cleanup.
-test_run:cmd('switch default')
-box.cfg{replication_synchro_quorum=old_synchro_quorum,\
-        replication_synchro_timeout=old_synchro_timeout}
-box.space.test:drop()
-test_run:cmd('stop server replica')
-test_run:cmd('delete server replica')
-box.schema.user.revoke('guest', 'replication')
diff --git a/test/replication/suite.cfg b/test/replication/suite.cfg
index 496b2e104..eb88b9420 100644
--- a/test/replication/suite.cfg
+++ b/test/replication/suite.cfg
@@ -41,7 +41,6 @@
     "gh-4739-vclock-assert.test.lua": {},
     "gh-4730-applier-rollback.test.lua": {},
     "gh-4928-tx-boundaries.test.lua": {},
-    "gh-5440-qsync-ro.test.lua": {},
     "gh-5435-qsync-clear-synchro-queue-commit-all.test.lua": {},
     "gh-5536-wal-limit.test.lua": {},
     "gh-5566-final-join-synchro.test.lua": {},
-- 
2.30.1 (Apple Git-130)