From mboxrd@z Thu Jan 1 00:00:00 1970
From: Serge Petrenko via Tarantool-patches
Reply-To: Serge Petrenko
To: v.shpilevoy@tarantool.org, gorcunov@gmail.com
Cc: tarantool-patches@dev.tarantool.org
Date: Fri, 18 Jun 2021 00:07:39 +0300
Message-Id: <2bdcf01c818330231b4818a6fb41deee8788113b.1623963649.git.sergepetrenko@tarantool.org>
X-Mailer: git-send-email 2.30.1 (Apple Git-130)
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id: Tarantool development patches
Subject: [Tarantool-patches] [PATCH v2 5/8] replication: forbid implicit limbo owner transition

Forbid limbo ownership transition without an explicit promote.
Make it so that synchronous transactions may be committed only after the
limbo is claimed by some instance via a PROMOTE request.

Make everyone but the limbo owner read-only even when the limbo is
empty.

Part-of #6034

@TarantoolBot document
Title: synchronous replication changes

`box.info.synchro.queue` receives a new field: `owner`. It's the replica id
of the instance owning the synchronous transaction queue.

Once some instance owns the queue, every other instance becomes
read-only. When the queue is unclaimed, i.e.
`box.info.synchro.queue.owner` is `0`, everyone may be writeable, but
cannot create synchronous transactions.

In order to claim or re-claim the queue, you have to issue
`box.ctl.promote()` on the instance you wish to promote.

When elections are enabled, the instance issues `box.ctl.promote()`
automatically once it wins the elections, so no additional actions are
required.
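
For reviewers, the ownership rules described above can be sketched as a small
standalone C model. This is only an illustration under stated assumptions, not
the code from the patch: `struct limbo_model`, `enum append_result` and the
`limbo_model_*()` helpers are hypothetical names invented here, and the promote
helper simply empties the queue instead of confirming or rolling back pending
entries the way the real limbo does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the real constant: id 0 means "no owner". */
enum { REPLICA_ID_NIL = 0 };

/* Hypothetical result codes modelling the two errors used by the patch. */
enum append_result {
	APPEND_OK,
	APPEND_ERR_UNCLAIMED,	/* models ER_SYNCHRO_QUEUE_UNCLAIMED */
	APPEND_ERR_FOREIGN,	/* models ER_UNCOMMITTED_FOREIGN_SYNC_TXNS */
};

/* A toy limbo: only the owner id and the queue length are tracked. */
struct limbo_model {
	uint32_t owner_id;
	int len;
};

/* An instance is read-only iff the queue is claimed by someone else. */
static bool
limbo_model_is_ro(const struct limbo_model *limbo, uint32_t instance_id)
{
	return limbo->owner_id != REPLICA_ID_NIL &&
	       limbo->owner_id != instance_id;
}

/* Appending never changes the owner: it is either allowed or rejected. */
static enum append_result
limbo_model_append(struct limbo_model *limbo, uint32_t id)
{
	assert(limbo != NULL);
	if (limbo->owner_id == REPLICA_ID_NIL)
		return APPEND_ERR_UNCLAIMED;
	if (limbo->owner_id != id)
		return APPEND_ERR_FOREIGN;
	limbo->len++;
	return APPEND_OK;
}

/*
 * Simplification: the model just drops pending entries before the owner
 * changes. Only an explicit promote may change the owner.
 */
static void
limbo_model_promote(struct limbo_model *limbo, uint32_t replica_id)
{
	limbo->len = 0;
	limbo->owner_id = replica_id;
}
```

In this model an unclaimed limbo leaves every instance writeable but rejects
appends, and after `limbo_model_promote(&limbo, 1)` only instance 1 may append
while instance 2 is reported as read-only.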
---
 src/box/errcode.h                          |   1 +
 src/box/lua/info.c                         |   4 +-
 src/box/txn_limbo.c                        |  35 ++----
 test/box/alter.result                      |   2 +-
 test/box/error.result                      |   1 +
 test/replication/gh-5440-qsync-ro.result   | 133 ---------------------
 test/replication/gh-5440-qsync-ro.test.lua |  53 --------
 7 files changed, 16 insertions(+), 213 deletions(-)
 delete mode 100644 test/replication/gh-5440-qsync-ro.result
 delete mode 100644 test/replication/gh-5440-qsync-ro.test.lua

diff --git a/src/box/errcode.h b/src/box/errcode.h
index 49aec4bf6..e3943c01d 100644
--- a/src/box/errcode.h
+++ b/src/box/errcode.h
@@ -278,6 +278,7 @@ struct errcode_record {
 	/*223 */_(ER_INTERFERING_PROMOTE,	"Instance with replica id %u was promoted first") \
 	/*224 */_(ER_RAFT_DISABLED,		"Elections were turned off while running box.ctl.promote()")\
 	/*225 */_(ER_TXN_ROLLBACK,		"Transaction was rolled back") \
+	/*226 */_(ER_SYNCHRO_QUEUE_UNCLAIMED,	"The synchronous transaction queue doesn't belong to any instance")\
 
 /*
  * !IMPORTANT! Please follow instructions at start of the file
diff --git a/src/box/lua/info.c b/src/box/lua/info.c
index 0eb48b823..7e3cd0b7d 100644
--- a/src/box/lua/info.c
+++ b/src/box/lua/info.c
@@ -611,9 +611,11 @@ lbox_info_synchro(struct lua_State *L)
 
 	/* Queue information. */
 	struct txn_limbo *queue = &txn_limbo;
-	lua_createtable(L, 0, 1);
+	lua_createtable(L, 0, 2);
 	lua_pushnumber(L, queue->len);
 	lua_setfield(L, -2, "len");
+	lua_pushnumber(L, queue->owner_id);
+	lua_setfield(L, -2, "owner");
 	lua_setfield(L, -2, "queue");
 
 	return 1;
diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
index a5d1df00c..203dbe856 100644
--- a/src/box/txn_limbo.c
+++ b/src/box/txn_limbo.c
@@ -55,7 +55,8 @@ txn_limbo_create(struct txn_limbo *limbo)
 bool
 txn_limbo_is_ro(struct txn_limbo *limbo)
 {
-	return limbo->owner_id != instance_id && !txn_limbo_is_empty(limbo);
+	return limbo->owner_id != REPLICA_ID_NIL &&
+	       limbo->owner_id != instance_id;
 }
 
 struct txn_limbo_entry *
@@ -95,18 +96,13 @@ txn_limbo_append(struct txn_limbo *limbo, uint32_t id, struct txn *txn)
 	}
 	if (id == 0)
 		id = instance_id;
-	bool make_ro = false;
-	if (limbo->owner_id != id) {
-		if (rlist_empty(&limbo->queue)) {
-			limbo->owner_id = id;
-			limbo->confirmed_lsn = 0;
-			if (id != instance_id)
-				make_ro = true;
-		} else {
-			diag_set(ClientError, ER_UNCOMMITTED_FOREIGN_SYNC_TXNS,
-				 limbo->owner_id);
-			return NULL;
-		}
+	if (limbo->owner_id == REPLICA_ID_NIL) {
+		diag_set(ClientError, ER_SYNCHRO_QUEUE_UNCLAIMED);
+		return NULL;
+	} else if (limbo->owner_id != id) {
+		diag_set(ClientError, ER_UNCOMMITTED_FOREIGN_SYNC_TXNS,
+			 limbo->owner_id);
+		return NULL;
 	}
 	size_t size;
 	struct txn_limbo_entry *e = region_alloc_object(&txn->region,
@@ -122,12 +118,6 @@ txn_limbo_append(struct txn_limbo *limbo, uint32_t id, struct txn *txn)
 	e->is_rollback = false;
 	rlist_add_tail_entry(&limbo->queue, e, in_queue);
 	limbo->len++;
-	/*
-	 * We added new entries from a remote instance to an empty limbo.
-	 * Time to make this instance read-only.
-	 */
-	if (make_ro)
-		box_update_ro_summary();
 	return e;
 }
 
@@ -432,9 +422,6 @@ txn_limbo_read_confirm(struct txn_limbo *limbo, int64_t lsn)
 		assert(e->txn->signature >= 0);
 		txn_complete_success(e->txn);
 	}
-	/* Update is_ro once the limbo is clear. */
-	if (txn_limbo_is_empty(limbo))
-		box_update_ro_summary();
 }
 
 /**
@@ -482,9 +469,6 @@ txn_limbo_read_rollback(struct txn_limbo *limbo, int64_t lsn)
 		if (e == last_rollback)
 			break;
 	}
-	/* Update is_ro once the limbo is clear. */
-	if (txn_limbo_is_empty(limbo))
-		box_update_ro_summary();
 }
 
 void
@@ -515,6 +499,7 @@ txn_limbo_read_promote(struct txn_limbo *limbo, uint32_t replica_id,
 	txn_limbo_read_rollback(limbo, lsn + 1);
 	assert(txn_limbo_is_empty(&txn_limbo));
 	limbo->owner_id = replica_id;
+	box_update_ro_summary();
 	limbo->confirmed_lsn = 0;
 }
 
diff --git a/test/box/alter.result b/test/box/alter.result
index a7bffce10..6a64f6b84 100644
--- a/test/box/alter.result
+++ b/test/box/alter.result
@@ -1464,7 +1464,7 @@ assert(s.is_sync)
 ...
 s:replace{1}
 ---
-- error: Quorum collection for a synchronous transaction is timed out
+- error: The synchronous transaction queue doesn't belong to any instance
 ...
 -- When not specified or nil - ignored.
 s:alter({is_sync = nil})
diff --git a/test/box/error.result b/test/box/error.result
index 062a90399..574521a14 100644
--- a/test/box/error.result
+++ b/test/box/error.result
@@ -444,6 +444,7 @@ t;
  |   223: box.error.INTERFERING_PROMOTE
  |   224: box.error.RAFT_DISABLED
  |   225: box.error.TXN_ROLLBACK
+ |   226: box.error.SYNCHRO_QUEUE_UNCLAIMED
  | ...
 
 test_run:cmd("setopt delimiter ''");
diff --git a/test/replication/gh-5440-qsync-ro.result b/test/replication/gh-5440-qsync-ro.result
deleted file mode 100644
index 1ece26a42..000000000
--- a/test/replication/gh-5440-qsync-ro.result
+++ /dev/null
@@ -1,133 +0,0 @@
--- test-run result file version 2
---
--- gh-5440 everyone but the limbo owner is read-only on non-empty limbo.
---
-env = require('test_run')
- | ---
- | ...
-test_run = env.new()
- | ---
- | ...
-fiber = require('fiber')
- | ---
- | ...
-
-box.schema.user.grant('guest', 'replication')
- | ---
- | ...
-test_run:cmd('create server replica with rpl_master=default, script="replication/replica.lua"')
- | ---
- | - true
- | ...
-test_run:cmd('start server replica with wait=True, wait_load=True')
- | ---
- | - true
- | ...
-
-_ = box.schema.space.create('test', {is_sync=true})
- | ---
- | ...
-_ = box.space.test:create_index('pk')
- | ---
- | ...
-
-old_synchro_quorum = box.cfg.replication_synchro_quorum
- | ---
- | ...
-old_synchro_timeout = box.cfg.replication_synchro_timeout
- | ---
- | ...
-
--- Make sure that the master stalls on commit leaving the limbo non-empty.
-box.cfg{replication_synchro_quorum=3, replication_synchro_timeout=1000}
- | ---
- | ...
-
-f = fiber.new(function() box.space.test:insert{1} end)
- | ---
- | ...
-f:status()
- | ---
- | - suspended
- | ...
-
--- Wait till replica's limbo is non-empty.
-test_run:wait_lsn('replica', 'default')
- | ---
- | ...
-test_run:cmd('switch replica')
- | ---
- | - true
- | ...
-
-box.info.ro
- | ---
- | - true
- | ...
-box.space.test:insert{2}
- | ---
- | - error: Can't modify data because this instance is in read-only mode.
- | ...
-success = false
- | ---
- | ...
-f = require('fiber').new(function() box.ctl.wait_rw() success = true end)
- | ---
- | ...
-f:status()
- | ---
- | - suspended
- | ...
-
-test_run:cmd('switch default')
- | ---
- | - true
- | ...
-
--- Empty the limbo.
-box.cfg{replication_synchro_quorum=2}
- | ---
- | ...
-
-test_run:cmd('switch replica')
- | ---
- | - true
- | ...
-
-test_run:wait_cond(function() return success end)
- | ---
- | - true
- | ...
-box.info.ro
- | ---
- | - false
- | ...
--- Should succeed now.
-box.space.test:insert{2}
- | ---
- | - [2]
- | ...
-
--- Cleanup.
-test_run:cmd('switch default')
- | ---
- | - true
- | ...
-box.cfg{replication_synchro_quorum=old_synchro_quorum,\
-        replication_synchro_timeout=old_synchro_timeout}
- | ---
- | ...
-box.space.test:drop()
- | ---
- | ...
-test_run:cmd('stop server replica')
- | ---
- | - true
- | ...
-test_run:cmd('delete server replica')
- | ---
- | - true
- | ...
-box.schema.user.revoke('guest', 'replication')
- | ---
- | ...
diff --git a/test/replication/gh-5440-qsync-ro.test.lua b/test/replication/gh-5440-qsync-ro.test.lua
deleted file mode 100644
index d63ec9c1e..000000000
--- a/test/replication/gh-5440-qsync-ro.test.lua
+++ /dev/null
@@ -1,53 +0,0 @@
---
--- gh-5440 everyone but the limbo owner is read-only on non-empty limbo.
---
-env = require('test_run')
-test_run = env.new()
-fiber = require('fiber')
-
-box.schema.user.grant('guest', 'replication')
-test_run:cmd('create server replica with rpl_master=default, script="replication/replica.lua"')
-test_run:cmd('start server replica with wait=True, wait_load=True')
-
-_ = box.schema.space.create('test', {is_sync=true})
-_ = box.space.test:create_index('pk')
-
-old_synchro_quorum = box.cfg.replication_synchro_quorum
-old_synchro_timeout = box.cfg.replication_synchro_timeout
-
--- Make sure that the master stalls on commit leaving the limbo non-empty.
-box.cfg{replication_synchro_quorum=3, replication_synchro_timeout=1000}
-
-f = fiber.new(function() box.space.test:insert{1} end)
-f:status()
-
--- Wait till replica's limbo is non-empty.
-test_run:wait_lsn('replica', 'default')
-test_run:cmd('switch replica')
-
-box.info.ro
-box.space.test:insert{2}
-success = false
-f = require('fiber').new(function() box.ctl.wait_rw() success = true end)
-f:status()
-
-test_run:cmd('switch default')
-
--- Empty the limbo.
-box.cfg{replication_synchro_quorum=2}
-
-test_run:cmd('switch replica')
-
-test_run:wait_cond(function() return success end)
-box.info.ro
--- Should succeed now.
-box.space.test:insert{2}
-
--- Cleanup.
-test_run:cmd('switch default')
-box.cfg{replication_synchro_quorum=old_synchro_quorum,\
-        replication_synchro_timeout=old_synchro_timeout}
-box.space.test:drop()
-test_run:cmd('stop server replica')
-test_run:cmd('delete server replica')
-box.schema.user.revoke('guest', 'replication')
-- 
2.30.1 (Apple Git-130)