From: Serge Petrenko via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: v.shpilevoy@tarantool.org, gorcunov@gmail.com
Cc: tarantool-patches@dev.tarantool.org
Subject: [Tarantool-patches] [PATCH 2/7] replication: forbid implicit limbo owner transition
Date: Thu, 10 Jun 2021 16:32:52 +0300
Message-ID: <fd68d2657ec9d4974b2c7641f6faacde9a99d39a.1623331925.git.sergepetrenko@tarantool.org>
In-Reply-To: <cover.1623331925.git.sergepetrenko@tarantool.org>

Forbid limbo ownership transition without an explicit promote.
Make it so that synchronous transactions may be committed only after
the limbo is claimed by some instance via a PROMOTE request.

Make everyone but the limbo owner read-only even when the limbo is
empty.

Part-of #6034

@TarantoolBot document
Title: synchronous replication changes

`box.info.synchro.queue` receives a new field: `owner`. It's the replica id
of the instance owning the synchronous transaction limbo.

Once some instance owns the limbo, every other instance becomes read-only.
When the limbo is unclaimed, i.e. `box.info.synchro.queue.owner` is `0`,
everyone may be writeable, but cannot create synchronous transactions.

In order to claim or re-claim the limbo, you have to issue
`box.ctl.promote()` on the instance you wish to promote.

When elections are enabled, the instance issues `box.ctl.promote()`
automatically once it wins the elections.
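
The ownership rules described above can be modelled in a few lines of C.
This is a simplified, self-contained sketch and not the Tarantool source:
`struct limbo_model`, the `limbo_model_*` functions and
`enum limbo_append_result` are hypothetical names introduced for
illustration, the instance id is kept in the struct instead of a global,
and errors are returned as enum values rather than set via `diag_set()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Owner id of an unclaimed limbo; the commit message documents it as 0. */
enum { REPLICA_ID_NIL = 0 };

/* Hypothetical result codes standing in for the real error codes. */
enum limbo_append_result {
	LIMBO_APPEND_OK,
	LIMBO_APPEND_UNCLAIMED,	/* models ER_LIMBO_UNCLAIMED */
	LIMBO_APPEND_FOREIGN,	/* models ER_UNCOMMITTED_FOREIGN_SYNC_TXNS */
};

/* Toy limbo: only the fields needed to show the ownership rules. */
struct limbo_model {
	uint32_t owner_id;	/* REPLICA_ID_NIL while unclaimed */
	uint32_t instance_id;	/* id of the local instance (assumption) */
	int len;		/* number of queued synchronous txns */
};

/* Read-only iff somebody else owns the limbo, even if it is empty. */
bool
limbo_model_is_ro(const struct limbo_model *limbo)
{
	return limbo->owner_id != REPLICA_ID_NIL &&
	       limbo->owner_id != limbo->instance_id;
}

/* Appending never changes the owner: only a promote may do that. */
enum limbo_append_result
limbo_model_append(struct limbo_model *limbo, uint32_t id)
{
	if (id == 0)
		id = limbo->instance_id;
	if (limbo->owner_id == REPLICA_ID_NIL)
		return LIMBO_APPEND_UNCLAIMED;
	if (limbo->owner_id != id)
		return LIMBO_APPEND_FOREIGN;
	limbo->len++;
	return LIMBO_APPEND_OK;
}

/* Models the effect of an applied PROMOTE request on the owner. */
void
limbo_model_promote(struct limbo_model *limbo, uint32_t replica_id)
{
	limbo->owner_id = replica_id;
}
```

In this model an unclaimed limbo leaves the instance writeable but
rejects synchronous transactions, and a promote by another replica makes
the local instance read-only regardless of the queue length.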
---
 src/box/errcode.h                             |   1 +
 src/box/lua/info.c                            |   4 +-
 src/box/txn_limbo.c                           |  35 ++---
 test/box/alter.result                         |   2 +-
 test/box/error.result                         |   1 +
 .../gh-5140-qsync-casc-rollback.result        |   3 +
 .../gh-5140-qsync-casc-rollback.test.lua      |   1 +
 .../gh-5144-qsync-dup-confirm.result          |   3 +
 .../gh-5144-qsync-dup-confirm.test.lua        |   1 +
 .../gh-5163-qsync-restart-crash.result        |   3 +
 .../gh-5163-qsync-restart-crash.test.lua      |   1 +
 .../gh-5167-qsync-rollback-snap.result        |   3 +
 .../gh-5167-qsync-rollback-snap.test.lua      |   1 +
 .../gh-5195-qsync-replica-write.result        |   7 +-
 .../gh-5195-qsync-replica-write.test.lua      |   5 +-
 .../gh-5213-qsync-applier-order-3.result      |   6 +
 .../gh-5213-qsync-applier-order-3.test.lua    |   2 +
 .../gh-5213-qsync-applier-order.result        |   3 +
 .../gh-5213-qsync-applier-order.test.lua      |   1 +
 .../replication/gh-5288-qsync-recovery.result |   3 +
 .../gh-5288-qsync-recovery.test.lua           |   1 +
 .../gh-5298-qsync-recovery-snap.result        |   3 +
 .../gh-5298-qsync-recovery-snap.test.lua      |   1 +
 ...sync-clear-synchro-queue-commit-all.result |   3 +
 ...nc-clear-synchro-queue-commit-all.test.lua |   1 +
 test/replication/gh-5440-qsync-ro.result      | 133 ------------------
 test/replication/gh-5440-qsync-ro.test.lua    |  53 -------
 .../gh-5446-qsync-eval-quorum.result          |   3 +
 .../gh-5446-qsync-eval-quorum.test.lua        |   1 +
 .../gh-5566-final-join-synchro.result         |   3 +
 .../gh-5566-final-join-synchro.test.lua       |   1 +
 .../gh-5874-qsync-txn-recovery.result         |   3 +
 .../gh-5874-qsync-txn-recovery.test.lua       |   1 +
 .../gh-6057-qsync-confirm-async-no-wal.result |   4 +
 ...h-6057-qsync-confirm-async-no-wal.test.lua |   2 +
 test/replication/hang_on_synchro_fail.result  |   3 +
 .../replication/hang_on_synchro_fail.test.lua |   1 +
 test/replication/qsync_advanced.result        |   9 ++
 test/replication/qsync_advanced.test.lua      |   3 +
 test/replication/qsync_basic.result           |  64 +--------
 test/replication/qsync_basic.test.lua         |  24 +---
 test/replication/qsync_errinj.result          |   3 +
 test/replication/qsync_errinj.test.lua        |   1 +
 test/replication/qsync_snapshots.result       |   3 +
 test/replication/qsync_snapshots.test.lua     |   1 +
 test/replication/qsync_with_anon.result       |   3 +
 test/replication/qsync_with_anon.test.lua     |   1 +
 47 files changed, 114 insertions(+), 301 deletions(-)
 delete mode 100644 test/replication/gh-5440-qsync-ro.result
 delete mode 100644 test/replication/gh-5440-qsync-ro.test.lua

diff --git a/src/box/errcode.h b/src/box/errcode.h
index d93820e96..e75f54a01 100644
--- a/src/box/errcode.h
+++ b/src/box/errcode.h
@@ -277,6 +277,7 @@ struct errcode_record {
 	/*222 */_(ER_QUORUM_WAIT,		"Couldn't wait for quorum %d: %s") \
 	/*223 */_(ER_INTERFERING_PROMOTE,	"Instance with replica id %u was promoted first") \
 	/*224 */_(ER_RAFT_DISABLED,		"Elections were turned off while running box.ctl.promote()")\
+	/*225 */_(ER_LIMBO_UNCLAIMED,		"Synchronous transaction limbo doesn't belong to any instance")\
 
 /*
  * !IMPORTANT! Please follow instructions at start of the file
diff --git a/src/box/lua/info.c b/src/box/lua/info.c
index 0eb48b823..7e3cd0b7d 100644
--- a/src/box/lua/info.c
+++ b/src/box/lua/info.c
@@ -611,9 +611,11 @@ lbox_info_synchro(struct lua_State *L)
 
 	/* Queue information. */
 	struct txn_limbo *queue = &txn_limbo;
-	lua_createtable(L, 0, 1);
+	lua_createtable(L, 0, 2);
 	lua_pushnumber(L, queue->len);
 	lua_setfield(L, -2, "len");
+	lua_pushnumber(L, queue->owner_id);
+	lua_setfield(L, -2, "owner");
 	lua_setfield(L, -2, "queue");
 
 	return 1;
diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
index dae6d2df4..53233add3 100644
--- a/src/box/txn_limbo.c
+++ b/src/box/txn_limbo.c
@@ -55,7 +55,8 @@ txn_limbo_create(struct txn_limbo *limbo)
 bool
 txn_limbo_is_ro(struct txn_limbo *limbo)
 {
-	return limbo->owner_id != instance_id && !txn_limbo_is_empty(limbo);
+	return limbo->owner_id != REPLICA_ID_NIL &&
+	       limbo->owner_id != instance_id;
 }
 
 struct txn_limbo_entry *
@@ -94,18 +95,13 @@ txn_limbo_append(struct txn_limbo *limbo, uint32_t id, struct txn *txn)
 	}
 	if (id == 0)
 		id = instance_id;
-	bool make_ro = false;
-	if (limbo->owner_id != id) {
-		if (rlist_empty(&limbo->queue)) {
-			limbo->owner_id = id;
-			limbo->confirmed_lsn = 0;
-			if (id != instance_id)
-				make_ro = true;
-		} else {
-			diag_set(ClientError, ER_UNCOMMITTED_FOREIGN_SYNC_TXNS,
-				 limbo->owner_id);
-			return NULL;
-		}
+	if (limbo->owner_id == REPLICA_ID_NIL) {
+		diag_set(ClientError, ER_LIMBO_UNCLAIMED);
+		return NULL;
+	} else if (limbo->owner_id != id) {
+		diag_set(ClientError, ER_UNCOMMITTED_FOREIGN_SYNC_TXNS,
+			 limbo->owner_id);
+		return NULL;
 	}
 	size_t size;
 	struct txn_limbo_entry *e = region_alloc_object(&txn->region,
@@ -121,12 +117,6 @@ txn_limbo_append(struct txn_limbo *limbo, uint32_t id, struct txn *txn)
 	e->is_rollback = false;
 	rlist_add_tail_entry(&limbo->queue, e, in_queue);
 	limbo->len++;
-	/*
-	 * We added new entries from a remote instance to an empty limbo.
-	 * Time to make this instance read-only.
-	 */
-	if (make_ro)
-		box_update_ro_summary();
 	return e;
 }
 
@@ -427,9 +417,6 @@ txn_limbo_read_confirm(struct txn_limbo *limbo, int64_t lsn)
 			assert(e->txn->signature >= 0);
 		txn_complete_success(e->txn);
 	}
-	/* Update is_ro once the limbo is clear. */
-	if (txn_limbo_is_empty(limbo))
-		box_update_ro_summary();
 }
 
 /**
@@ -477,9 +464,6 @@ txn_limbo_read_rollback(struct txn_limbo *limbo, int64_t lsn)
 		if (e == last_rollback)
 			break;
 	}
-	/* Update is_ro once the limbo is clear. */
-	if (txn_limbo_is_empty(limbo))
-		box_update_ro_summary();
 }
 
 void
@@ -510,6 +494,7 @@ txn_limbo_read_promote(struct txn_limbo *limbo, uint32_t replica_id,
 	txn_limbo_read_rollback(limbo, lsn + 1);
 	assert(txn_limbo_is_empty(&txn_limbo));
 	limbo->owner_id = replica_id;
+	box_update_ro_summary();
 	limbo->confirmed_lsn = 0;
 }
 
diff --git a/test/box/alter.result b/test/box/alter.result
index a7bffce10..b903bf0d0 100644
--- a/test/box/alter.result
+++ b/test/box/alter.result
@@ -1464,7 +1464,7 @@ assert(s.is_sync)
 ...
 s:replace{1}
 ---
-- error: Quorum collection for a synchronous transaction is timed out
+- error: Synchronous transaction limbo doesn't belong to any instance
 ...
 -- When not specified or nil - ignored.
 s:alter({is_sync = nil})
diff --git a/test/box/error.result b/test/box/error.result
index cc8cbaaa9..6420f92bc 100644
--- a/test/box/error.result
+++ b/test/box/error.result
@@ -443,6 +443,7 @@ t;
  |   222: box.error.QUORUM_WAIT
  |   223: box.error.INTERFERING_PROMOTE
  |   224: box.error.RAFT_DISABLED
+ |   225: box.error.LIMBO_UNCLAIMED
  | ...
 
 test_run:cmd("setopt delimiter ''");
diff --git a/test/replication/gh-5140-qsync-casc-rollback.result b/test/replication/gh-5140-qsync-casc-rollback.result
index da77631dd..0eec3f10a 100644
--- a/test/replication/gh-5140-qsync-casc-rollback.result
+++ b/test/replication/gh-5140-qsync-casc-rollback.result
@@ -73,6 +73,9 @@ _ = box.schema.space.create('async', {is_sync=false, engine = engine})
 _ = _:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 -- Write something to flush the master state to replica.
box.space.sync:replace{1} | --- diff --git a/test/replication/gh-5140-qsync-casc-rollback.test.lua b/test/replication/gh-5140-qsync-casc-rollback.test.lua index 69fc9ad02..d45390aa6 100644 --- a/test/replication/gh-5140-qsync-casc-rollback.test.lua +++ b/test/replication/gh-5140-qsync-casc-rollback.test.lua @@ -48,6 +48,7 @@ _ = box.schema.space.create('sync', {is_sync = true, engine = engine}) _ = _:create_index('pk') _ = box.schema.space.create('async', {is_sync=false, engine = engine}) _ = _:create_index('pk') +box.ctl.promote() -- Write something to flush the master state to replica. box.space.sync:replace{1} diff --git a/test/replication/gh-5144-qsync-dup-confirm.result b/test/replication/gh-5144-qsync-dup-confirm.result index 9d265d9ff..01c7501f1 100644 --- a/test/replication/gh-5144-qsync-dup-confirm.result +++ b/test/replication/gh-5144-qsync-dup-confirm.result @@ -46,6 +46,9 @@ _ = box.schema.space.create('sync', {is_sync = true, engine = engine}) _ = _:create_index('pk') | --- | ... +box.ctl.promote() + | --- + | ... -- Remember the current LSN. In the end, when the following synchronous -- transaction is committed, result LSN should be this value +2: for the diff --git a/test/replication/gh-5144-qsync-dup-confirm.test.lua b/test/replication/gh-5144-qsync-dup-confirm.test.lua index 01a8351e0..9a295192a 100644 --- a/test/replication/gh-5144-qsync-dup-confirm.test.lua +++ b/test/replication/gh-5144-qsync-dup-confirm.test.lua @@ -19,6 +19,7 @@ box.cfg{replication_synchro_quorum = 2, replication_synchro_timeout = 1000} _ = box.schema.space.create('sync', {is_sync = true, engine = engine}) _ = _:create_index('pk') +box.ctl.promote() -- Remember the current LSN. 
In the end, when the following synchronous -- transaction is committed, result LSN should be this value +2: for the diff --git a/test/replication/gh-5163-qsync-restart-crash.result b/test/replication/gh-5163-qsync-restart-crash.result index e57bc76d1..8c9d43a9b 100644 --- a/test/replication/gh-5163-qsync-restart-crash.result +++ b/test/replication/gh-5163-qsync-restart-crash.result @@ -16,6 +16,9 @@ _ = box.schema.space.create('sync', {is_sync=true, engine=engine}) _ = box.space.sync:create_index('pk') | --- | ... +box.ctl.promote() + | --- + | ... box.space.sync:replace{1} | --- diff --git a/test/replication/gh-5163-qsync-restart-crash.test.lua b/test/replication/gh-5163-qsync-restart-crash.test.lua index d5aca4749..a0c2533c6 100644 --- a/test/replication/gh-5163-qsync-restart-crash.test.lua +++ b/test/replication/gh-5163-qsync-restart-crash.test.lua @@ -7,6 +7,7 @@ engine = test_run:get_cfg('engine') -- _ = box.schema.space.create('sync', {is_sync=true, engine=engine}) _ = box.space.sync:create_index('pk') +box.ctl.promote() box.space.sync:replace{1} test_run:cmd('restart server default') diff --git a/test/replication/gh-5167-qsync-rollback-snap.result b/test/replication/gh-5167-qsync-rollback-snap.result index 06f58526c..ddb3212ad 100644 --- a/test/replication/gh-5167-qsync-rollback-snap.result +++ b/test/replication/gh-5167-qsync-rollback-snap.result @@ -41,6 +41,9 @@ _ = box.schema.space.create('sync', {is_sync = true, engine = engine}) _ = box.space.sync:create_index('pk') | --- | ... +box.ctl.promote() + | --- + | ... -- Write something to flush the current master's state to replica. 
_ = box.space.sync:insert{1} | --- diff --git a/test/replication/gh-5167-qsync-rollback-snap.test.lua b/test/replication/gh-5167-qsync-rollback-snap.test.lua index 475727e61..1590a515a 100644 --- a/test/replication/gh-5167-qsync-rollback-snap.test.lua +++ b/test/replication/gh-5167-qsync-rollback-snap.test.lua @@ -16,6 +16,7 @@ fiber = require('fiber') box.cfg{replication_synchro_quorum = 2, replication_synchro_timeout = 1000} _ = box.schema.space.create('sync', {is_sync = true, engine = engine}) _ = box.space.sync:create_index('pk') +box.ctl.promote() -- Write something to flush the current master's state to replica. _ = box.space.sync:insert{1} _ = box.space.sync:delete{1} diff --git a/test/replication/gh-5195-qsync-replica-write.result b/test/replication/gh-5195-qsync-replica-write.result index 85e00e6ed..fcf96cd1a 100644 --- a/test/replication/gh-5195-qsync-replica-write.result +++ b/test/replication/gh-5195-qsync-replica-write.result @@ -40,6 +40,9 @@ _ = box.schema.space.create('sync', {engine = engine, is_sync = true}) _ = box.space.sync:create_index('pk') | --- | ... +box.ctl.promote() + | --- + | ... box.cfg{replication_synchro_timeout = 1000, replication_synchro_quorum = 3} | --- @@ -71,12 +74,12 @@ test_run:wait_lsn('replica', 'default') | --- | ... -- Normal DML is blocked - the limbo is not empty and does not belong to the --- replica. But synchro queue cleanup also does a WAL write, and propagates LSN +-- replica. But promote also does a WAL write, and propagates LSN -- of the instance. box.cfg{replication_synchro_timeout = 0.001} | --- | ... -box.ctl.clear_synchro_queue() +box.ctl.promote() | --- | ... 
diff --git a/test/replication/gh-5195-qsync-replica-write.test.lua b/test/replication/gh-5195-qsync-replica-write.test.lua index 64c48be99..751ceee69 100644 --- a/test/replication/gh-5195-qsync-replica-write.test.lua +++ b/test/replication/gh-5195-qsync-replica-write.test.lua @@ -17,6 +17,7 @@ test_run:cmd('start server replica with wait=True, wait_load=True') -- _ = box.schema.space.create('sync', {engine = engine, is_sync = true}) _ = box.space.sync:create_index('pk') +box.ctl.promote() box.cfg{replication_synchro_timeout = 1000, replication_synchro_quorum = 3} lsn = box.info.lsn @@ -30,10 +31,10 @@ test_run:wait_cond(function() return box.info.lsn == lsn end) test_run:switch('replica') test_run:wait_lsn('replica', 'default') -- Normal DML is blocked - the limbo is not empty and does not belong to the --- replica. But synchro queue cleanup also does a WAL write, and propagates LSN +-- replica. But promote also does a WAL write, and propagates LSN -- of the instance. box.cfg{replication_synchro_timeout = 0.001} -box.ctl.clear_synchro_queue() +box.ctl.promote() test_run:switch('default') -- Wait second ACK receipt. diff --git a/test/replication/gh-5213-qsync-applier-order-3.result b/test/replication/gh-5213-qsync-applier-order-3.result index bcb18b5c0..90608dcdc 100644 --- a/test/replication/gh-5213-qsync-applier-order-3.result +++ b/test/replication/gh-5213-qsync-applier-order-3.result @@ -45,6 +45,9 @@ s = box.schema.space.create('test', {is_sync = true}) _ = s:create_index('pk') | --- | ... +box.ctl.promote() + | --- + | ... test_run:cmd('create server replica1 with rpl_master=default,\ script="replication/replica1.lua"') @@ -179,6 +182,9 @@ box.cfg{ -- Replica2 takes the limbo ownership and sends the transaction to the replica1. -- Along with the CONFIRM from the default node, which is still not applied -- on the replica1. +box.ctl.promote() + | --- + | ... fiber = require('fiber') | --- | ... 
diff --git a/test/replication/gh-5213-qsync-applier-order-3.test.lua b/test/replication/gh-5213-qsync-applier-order-3.test.lua index 37b569da7..669ab3677 100644 --- a/test/replication/gh-5213-qsync-applier-order-3.test.lua +++ b/test/replication/gh-5213-qsync-applier-order-3.test.lua @@ -30,6 +30,7 @@ box.schema.user.grant('guest', 'super') s = box.schema.space.create('test', {is_sync = true}) _ = s:create_index('pk') +box.ctl.promote() test_run:cmd('create server replica1 with rpl_master=default,\ script="replication/replica1.lua"') @@ -90,6 +91,7 @@ box.cfg{ -- Replica2 takes the limbo ownership and sends the transaction to the replica1. -- Along with the CONFIRM from the default node, which is still not applied -- on the replica1. +box.ctl.promote() fiber = require('fiber') f = fiber.new(function() box.space.test:replace{2} end) diff --git a/test/replication/gh-5213-qsync-applier-order.result b/test/replication/gh-5213-qsync-applier-order.result index a8c24c289..3e1f2871b 100644 --- a/test/replication/gh-5213-qsync-applier-order.result +++ b/test/replication/gh-5213-qsync-applier-order.result @@ -29,6 +29,9 @@ s = box.schema.space.create('test', {is_sync = true}) _ = s:create_index('pk') | --- | ... +box.ctl.promote() + | --- + | ... 
test_run:cmd('create server replica with rpl_master=default,\ script="replication/gh-5213-replica.lua"') diff --git a/test/replication/gh-5213-qsync-applier-order.test.lua b/test/replication/gh-5213-qsync-applier-order.test.lua index f1eccfa84..010c08ca7 100644 --- a/test/replication/gh-5213-qsync-applier-order.test.lua +++ b/test/replication/gh-5213-qsync-applier-order.test.lua @@ -14,6 +14,7 @@ box.schema.user.grant('guest', 'super') s = box.schema.space.create('test', {is_sync = true}) _ = s:create_index('pk') +box.ctl.promote() test_run:cmd('create server replica with rpl_master=default,\ script="replication/gh-5213-replica.lua"') diff --git a/test/replication/gh-5288-qsync-recovery.result b/test/replication/gh-5288-qsync-recovery.result index dc0babef6..42662cd38 100644 --- a/test/replication/gh-5288-qsync-recovery.result +++ b/test/replication/gh-5288-qsync-recovery.result @@ -12,6 +12,9 @@ s = box.schema.space.create('sync', {is_sync = true}) _ = s:create_index('pk') | --- | ... +box.ctl.promote() + | --- + | ... s:insert{1} | --- | - [1] diff --git a/test/replication/gh-5288-qsync-recovery.test.lua b/test/replication/gh-5288-qsync-recovery.test.lua index 00bff7b87..24727d59e 100644 --- a/test/replication/gh-5288-qsync-recovery.test.lua +++ b/test/replication/gh-5288-qsync-recovery.test.lua @@ -5,6 +5,7 @@ test_run = require('test_run').new() -- s = box.schema.space.create('sync', {is_sync = true}) _ = s:create_index('pk') +box.ctl.promote() s:insert{1} box.snapshot() test_run:cmd('restart server default') diff --git a/test/replication/gh-5298-qsync-recovery-snap.result b/test/replication/gh-5298-qsync-recovery-snap.result index 922831552..ac6ccbc36 100644 --- a/test/replication/gh-5298-qsync-recovery-snap.result +++ b/test/replication/gh-5298-qsync-recovery-snap.result @@ -17,6 +17,9 @@ _ = box.schema.space.create('sync', {is_sync = true, engine = engine}) _ = box.space.sync:create_index('pk') | --- | ... +box.ctl.promote() + | --- + | ... 
for i = 1, 10 do box.space.sync:replace{i} end | --- | ... diff --git a/test/replication/gh-5298-qsync-recovery-snap.test.lua b/test/replication/gh-5298-qsync-recovery-snap.test.lua index 187f60d75..30f975c6c 100644 --- a/test/replication/gh-5298-qsync-recovery-snap.test.lua +++ b/test/replication/gh-5298-qsync-recovery-snap.test.lua @@ -8,6 +8,7 @@ engine = test_run:get_cfg('engine') -- _ = box.schema.space.create('sync', {is_sync = true, engine = engine}) _ = box.space.sync:create_index('pk') +box.ctl.promote() for i = 1, 10 do box.space.sync:replace{i} end -- Local rows could affect this by increasing the signature. diff --git a/test/replication/gh-5435-qsync-clear-synchro-queue-commit-all.result b/test/replication/gh-5435-qsync-clear-synchro-queue-commit-all.result index 2699231e5..20fab4072 100644 --- a/test/replication/gh-5435-qsync-clear-synchro-queue-commit-all.result +++ b/test/replication/gh-5435-qsync-clear-synchro-queue-commit-all.result @@ -49,6 +49,9 @@ _ = box.schema.space.create('test', {is_sync=true}) _ = box.space.test:create_index('pk') | --- | ... +box.ctl.promote() + | --- + | ... -- Fill the limbo with pending entries. 3 mustn't receive them yet. test_run:cmd('stop server election_replica3') diff --git a/test/replication/gh-5435-qsync-clear-synchro-queue-commit-all.test.lua b/test/replication/gh-5435-qsync-clear-synchro-queue-commit-all.test.lua index 03705d96c..ec0f1d77e 100644 --- a/test/replication/gh-5435-qsync-clear-synchro-queue-commit-all.test.lua +++ b/test/replication/gh-5435-qsync-clear-synchro-queue-commit-all.test.lua @@ -21,6 +21,7 @@ box.ctl.wait_rw() _ = box.schema.space.create('test', {is_sync=true}) _ = box.space.test:create_index('pk') +box.ctl.promote() -- Fill the limbo with pending entries. 3 mustn't receive them yet. 
test_run:cmd('stop server election_replica3') diff --git a/test/replication/gh-5440-qsync-ro.result b/test/replication/gh-5440-qsync-ro.result deleted file mode 100644 index 1ece26a42..000000000 --- a/test/replication/gh-5440-qsync-ro.result +++ /dev/null @@ -1,133 +0,0 @@ --- test-run result file version 2 --- --- gh-5440 everyone but the limbo owner is read-only on non-empty limbo. --- -env = require('test_run') - | --- - | ... -test_run = env.new() - | --- - | ... -fiber = require('fiber') - | --- - | ... - -box.schema.user.grant('guest', 'replication') - | --- - | ... -test_run:cmd('create server replica with rpl_master=default, script="replication/replica.lua"') - | --- - | - true - | ... -test_run:cmd('start server replica with wait=True, wait_load=True') - | --- - | - true - | ... - -_ = box.schema.space.create('test', {is_sync=true}) - | --- - | ... -_ = box.space.test:create_index('pk') - | --- - | ... - -old_synchro_quorum = box.cfg.replication_synchro_quorum - | --- - | ... -old_synchro_timeout = box.cfg.replication_synchro_timeout - | --- - | ... - --- Make sure that the master stalls on commit leaving the limbo non-empty. -box.cfg{replication_synchro_quorum=3, replication_synchro_timeout=1000} - | --- - | ... - -f = fiber.new(function() box.space.test:insert{1} end) - | --- - | ... -f:status() - | --- - | - suspended - | ... - --- Wait till replica's limbo is non-empty. -test_run:wait_lsn('replica', 'default') - | --- - | ... -test_run:cmd('switch replica') - | --- - | - true - | ... - -box.info.ro - | --- - | - true - | ... -box.space.test:insert{2} - | --- - | - error: Can't modify data because this instance is in read-only mode. - | ... -success = false - | --- - | ... -f = require('fiber').new(function() box.ctl.wait_rw() success = true end) - | --- - | ... -f:status() - | --- - | - suspended - | ... - -test_run:cmd('switch default') - | --- - | - true - | ... - --- Empty the limbo. -box.cfg{replication_synchro_quorum=2} - | --- - | ... 
- -test_run:cmd('switch replica') - | --- - | - true - | ... - -test_run:wait_cond(function() return success end) - | --- - | - true - | ... -box.info.ro - | --- - | - false - | ... --- Should succeed now. -box.space.test:insert{2} - | --- - | - [2] - | ... - --- Cleanup. -test_run:cmd('switch default') - | --- - | - true - | ... -box.cfg{replication_synchro_quorum=old_synchro_quorum,\ - replication_synchro_timeout=old_synchro_timeout} - | --- - | ... -box.space.test:drop() - | --- - | ... -test_run:cmd('stop server replica') - | --- - | - true - | ... -test_run:cmd('delete server replica') - | --- - | - true - | ... -box.schema.user.revoke('guest', 'replication') - | --- - | ... diff --git a/test/replication/gh-5440-qsync-ro.test.lua b/test/replication/gh-5440-qsync-ro.test.lua deleted file mode 100644 index d63ec9c1e..000000000 --- a/test/replication/gh-5440-qsync-ro.test.lua +++ /dev/null @@ -1,53 +0,0 @@ --- --- gh-5440 everyone but the limbo owner is read-only on non-empty limbo. --- -env = require('test_run') -test_run = env.new() -fiber = require('fiber') - -box.schema.user.grant('guest', 'replication') -test_run:cmd('create server replica with rpl_master=default, script="replication/replica.lua"') -test_run:cmd('start server replica with wait=True, wait_load=True') - -_ = box.schema.space.create('test', {is_sync=true}) -_ = box.space.test:create_index('pk') - -old_synchro_quorum = box.cfg.replication_synchro_quorum -old_synchro_timeout = box.cfg.replication_synchro_timeout - --- Make sure that the master stalls on commit leaving the limbo non-empty. -box.cfg{replication_synchro_quorum=3, replication_synchro_timeout=1000} - -f = fiber.new(function() box.space.test:insert{1} end) -f:status() - --- Wait till replica's limbo is non-empty. 
-test_run:wait_lsn('replica', 'default') -test_run:cmd('switch replica') - -box.info.ro -box.space.test:insert{2} -success = false -f = require('fiber').new(function() box.ctl.wait_rw() success = true end) -f:status() - -test_run:cmd('switch default') - --- Empty the limbo. -box.cfg{replication_synchro_quorum=2} - -test_run:cmd('switch replica') - -test_run:wait_cond(function() return success end) -box.info.ro --- Should succeed now. -box.space.test:insert{2} - --- Cleanup. -test_run:cmd('switch default') -box.cfg{replication_synchro_quorum=old_synchro_quorum,\ - replication_synchro_timeout=old_synchro_timeout} -box.space.test:drop() -test_run:cmd('stop server replica') -test_run:cmd('delete server replica') -box.schema.user.revoke('guest', 'replication') diff --git a/test/replication/gh-5446-qsync-eval-quorum.result b/test/replication/gh-5446-qsync-eval-quorum.result index 5f83b248c..fe9868651 100644 --- a/test/replication/gh-5446-qsync-eval-quorum.result +++ b/test/replication/gh-5446-qsync-eval-quorum.result @@ -88,6 +88,9 @@ s = box.schema.space.create('sync', {is_sync = true, engine = engine}) _ = s:create_index('pk') | --- | ... +box.ctl.promote() + | --- + | ... 
-- Only one master node -> 1/2 + 1 = 1 s:insert{1} -- should pass diff --git a/test/replication/gh-5446-qsync-eval-quorum.test.lua b/test/replication/gh-5446-qsync-eval-quorum.test.lua index 6b9e324ed..d8fe6cccf 100644 --- a/test/replication/gh-5446-qsync-eval-quorum.test.lua +++ b/test/replication/gh-5446-qsync-eval-quorum.test.lua @@ -37,6 +37,7 @@ end -- Create a sync space we will operate on s = box.schema.space.create('sync', {is_sync = true, engine = engine}) _ = s:create_index('pk') +box.ctl.promote() -- Only one master node -> 1/2 + 1 = 1 s:insert{1} -- should pass diff --git a/test/replication/gh-5566-final-join-synchro.result b/test/replication/gh-5566-final-join-synchro.result index a09882ba6..e43e03b1c 100644 --- a/test/replication/gh-5566-final-join-synchro.result +++ b/test/replication/gh-5566-final-join-synchro.result @@ -12,6 +12,9 @@ _ = box.schema.space.create('sync', {is_sync=true}) _ = box.space.sync:create_index('pk') | --- | ... +box.ctl.promote() + | --- + | ... box.schema.user.grant('guest', 'replication') | --- diff --git a/test/replication/gh-5566-final-join-synchro.test.lua b/test/replication/gh-5566-final-join-synchro.test.lua index 2db2c742f..0f22a0321 100644 --- a/test/replication/gh-5566-final-join-synchro.test.lua +++ b/test/replication/gh-5566-final-join-synchro.test.lua @@ -5,6 +5,7 @@ test_run = require('test_run').new() -- _ = box.schema.space.create('sync', {is_sync=true}) _ = box.space.sync:create_index('pk') +box.ctl.promote() box.schema.user.grant('guest', 'replication') box.schema.user.grant('guest', 'write', 'space', 'sync') diff --git a/test/replication/gh-5874-qsync-txn-recovery.result b/test/replication/gh-5874-qsync-txn-recovery.result index 73f903ca7..85eb89e04 100644 --- a/test/replication/gh-5874-qsync-txn-recovery.result +++ b/test/replication/gh-5874-qsync-txn-recovery.result @@ -31,6 +31,9 @@ sync = box.schema.create_space('sync', {is_sync = true, engine = engine}) _ = sync:create_index('pk') | --- | ... 
+box.ctl.promote() + | --- + | ... -- The transaction fails, but is written to the log anyway. box.begin() async:insert{1} sync:insert{1} box.commit() diff --git a/test/replication/gh-5874-qsync-txn-recovery.test.lua b/test/replication/gh-5874-qsync-txn-recovery.test.lua index f35eb68de..cf9753851 100644 --- a/test/replication/gh-5874-qsync-txn-recovery.test.lua +++ b/test/replication/gh-5874-qsync-txn-recovery.test.lua @@ -12,6 +12,7 @@ async = box.schema.create_space('async', {engine = engine}) _ = async:create_index('pk') sync = box.schema.create_space('sync', {is_sync = true, engine = engine}) _ = sync:create_index('pk') +box.ctl.promote() -- The transaction fails, but is written to the log anyway. box.begin() async:insert{1} sync:insert{1} box.commit() diff --git a/test/replication/gh-6057-qsync-confirm-async-no-wal.result b/test/replication/gh-6057-qsync-confirm-async-no-wal.result index 23c77729b..4e4fc7576 100644 --- a/test/replication/gh-6057-qsync-confirm-async-no-wal.result +++ b/test/replication/gh-6057-qsync-confirm-async-no-wal.result @@ -40,6 +40,10 @@ _ = s2:create_index('pk') | --- | ... +box.ctl.promote() + | --- + | ... + errinj = box.error.injection | --- | ... 
diff --git a/test/replication/gh-6057-qsync-confirm-async-no-wal.test.lua b/test/replication/gh-6057-qsync-confirm-async-no-wal.test.lua
index a11ddc042..c2d9d290b 100644
--- a/test/replication/gh-6057-qsync-confirm-async-no-wal.test.lua
+++ b/test/replication/gh-6057-qsync-confirm-async-no-wal.test.lua
@@ -21,6 +21,8 @@ _ = s:create_index('pk')
 s2 = box.schema.create_space('test2')
 _ = s2:create_index('pk')
 
+box.ctl.promote()
+
 errinj = box.error.injection
 
 function create_hanging_async_after_confirm(sync_key, async_key1, async_key2) \
diff --git a/test/replication/hang_on_synchro_fail.result b/test/replication/hang_on_synchro_fail.result
index 9f6fac00b..b73406368 100644
--- a/test/replication/hang_on_synchro_fail.result
+++ b/test/replication/hang_on_synchro_fail.result
@@ -19,6 +19,9 @@ _ = box.schema.space.create('sync', {is_sync=true})
 _ = box.space.sync:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 
 old_synchro_quorum = box.cfg.replication_synchro_quorum
  | ---
diff --git a/test/replication/hang_on_synchro_fail.test.lua b/test/replication/hang_on_synchro_fail.test.lua
index 6c3b09fab..549ae17f6 100644
--- a/test/replication/hang_on_synchro_fail.test.lua
+++ b/test/replication/hang_on_synchro_fail.test.lua
@@ -8,6 +8,7 @@ box.schema.user.grant('guest', 'replication')
 
 _ = box.schema.space.create('sync', {is_sync=true})
 _ = box.space.sync:create_index('pk')
+box.ctl.promote()
 
 old_synchro_quorum = box.cfg.replication_synchro_quorum
 box.cfg{replication_synchro_quorum=3}
diff --git a/test/replication/qsync_advanced.result b/test/replication/qsync_advanced.result
index 94b19b1f2..509e5c2a7 100644
--- a/test/replication/qsync_advanced.result
+++ b/test/replication/qsync_advanced.result
@@ -72,6 +72,9 @@ _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
 _ = box.space.sync:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 -- Testcase body.
 box.space.sync:insert{1} -- success
  | ---
@@ -468,6 +471,9 @@ box.space.sync:select{} -- 1
 box.cfg{read_only=false} -- promote replica to master
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 test_run:switch('default')
  | ---
  | - true
@@ -508,6 +514,9 @@ test_run:switch('default')
 box.cfg{read_only=false}
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 test_run:switch('replica')
  | ---
  | - true
diff --git a/test/replication/qsync_advanced.test.lua b/test/replication/qsync_advanced.test.lua
index 058ece602..5bc54794b 100644
--- a/test/replication/qsync_advanced.test.lua
+++ b/test/replication/qsync_advanced.test.lua
@@ -30,6 +30,7 @@ test_run:switch('default')
 box.cfg{replication_synchro_quorum=NUM_INSTANCES, replication_synchro_timeout=1000}
 _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
 _ = box.space.sync:create_index('pk')
+box.ctl.promote()
 -- Testcase body.
 box.space.sync:insert{1} -- success
 test_run:cmd('switch replica')
@@ -170,6 +171,7 @@ box.space.sync:select{} -- 1
 test_run:switch('replica')
 box.space.sync:select{} -- 1
 box.cfg{read_only=false} -- promote replica to master
+box.ctl.promote()
 test_run:switch('default')
 box.cfg{read_only=true} -- demote master to replica
 test_run:switch('replica')
@@ -181,6 +183,7 @@ box.space.sync:select{} -- 1, 2
 -- Revert cluster configuration.
 test_run:switch('default')
 box.cfg{read_only=false}
+box.ctl.promote()
 test_run:switch('replica')
 box.cfg{read_only=true}
 -- Testcase cleanup.
diff --git a/test/replication/qsync_basic.result b/test/replication/qsync_basic.result
index 7e711ba13..08474269e 100644
--- a/test/replication/qsync_basic.result
+++ b/test/replication/qsync_basic.result
@@ -14,6 +14,9 @@ s1.is_sync
 pk = s1:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 box.begin() s1:insert({1}) s1:insert({2}) box.commit()
  | ---
  | ...
@@ -637,67 +640,6 @@ box.space.sync:count()
  | - 0
  | ...
 
---
--- gh-5445: NOPs bypass the limbo for the sake of vclock bumps from foreign
--- instances, but also works for local rows.
---
-test_run:switch('default')
- | ---
- | - true
- | ...
-box.cfg{replication_synchro_quorum = 3, replication_synchro_timeout = 1000}
- | ---
- | ...
-f = fiber.create(function() box.space.sync:replace{1} end)
- | ---
- | ...
-test_run:wait_lsn('replica', 'default')
- | ---
- | ...
-
-test_run:switch('replica')
- | ---
- | - true
- | ...
-function skip_row() return nil end
- | ---
- | ...
-old_lsn = box.info.lsn
- | ---
- | ...
-_ = box.space.sync:before_replace(skip_row)
- | ---
- | ...
-box.space.sync:replace{2}
- | ---
- | ...
-box.space.sync:before_replace(nil, skip_row)
- | ---
- | ...
-assert(box.space.sync:get{2} == nil)
- | ---
- | - true
- | ...
-assert(box.space.sync:get{1} ~= nil)
- | ---
- | - true
- | ...
-
-test_run:switch('default')
- | ---
- | - true
- | ...
-box.cfg{replication_synchro_quorum = 2}
- | ---
- | ...
-test_run:wait_cond(function() return f:status() == 'dead' end)
- | ---
- | - true
- | ...
-box.space.sync:truncate()
- | ---
- | ...
-
 --
 -- gh-5191: test box.info.synchro interface. For
 -- this sake we stop the replica and initiate data
diff --git a/test/replication/qsync_basic.test.lua b/test/replication/qsync_basic.test.lua
index 75c9b222b..6a49e2b01 100644
--- a/test/replication/qsync_basic.test.lua
+++ b/test/replication/qsync_basic.test.lua
@@ -6,6 +6,7 @@
 s1 = box.schema.create_space('test1', {is_sync = true})
 s1.is_sync
 pk = s1:create_index('pk')
+box.ctl.promote()
 box.begin() s1:insert({1}) s1:insert({2}) box.commit()
 s1:select{}
 
@@ -248,29 +249,6 @@ for i = 1, 100 do box.space.sync:delete{i} end
 test_run:cmd('switch replica')
 box.space.sync:count()
 
---
--- gh-5445: NOPs bypass the limbo for the sake of vclock bumps from foreign
--- instances, but also works for local rows.
---
-test_run:switch('default')
-box.cfg{replication_synchro_quorum = 3, replication_synchro_timeout = 1000}
-f = fiber.create(function() box.space.sync:replace{1} end)
-test_run:wait_lsn('replica', 'default')
-
-test_run:switch('replica')
-function skip_row() return nil end
-old_lsn = box.info.lsn
-_ = box.space.sync:before_replace(skip_row)
-box.space.sync:replace{2}
-box.space.sync:before_replace(nil, skip_row)
-assert(box.space.sync:get{2} == nil)
-assert(box.space.sync:get{1} ~= nil)
-
-test_run:switch('default')
-box.cfg{replication_synchro_quorum = 2}
-test_run:wait_cond(function() return f:status() == 'dead' end)
-box.space.sync:truncate()
-
 --
 -- gh-5191: test box.info.synchro interface. For
 -- this sake we stop the replica and initiate data
diff --git a/test/replication/qsync_errinj.result b/test/replication/qsync_errinj.result
index 635bcf939..837802847 100644
--- a/test/replication/qsync_errinj.result
+++ b/test/replication/qsync_errinj.result
@@ -35,6 +35,9 @@ _ = box.schema.space.create('sync', {is_sync = true, engine = engine})
 _ = box.space.sync:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 
 --
 -- gh-5100: slow ACK sending shouldn't stun replica for the
diff --git a/test/replication/qsync_errinj.test.lua b/test/replication/qsync_errinj.test.lua
index 6a9fd3e1a..556c897c4 100644
--- a/test/replication/qsync_errinj.test.lua
+++ b/test/replication/qsync_errinj.test.lua
@@ -12,6 +12,7 @@ test_run:cmd('start server replica with wait=True, wait_load=True')
 
 _ = box.schema.space.create('sync', {is_sync = true, engine = engine})
 _ = box.space.sync:create_index('pk')
+box.ctl.promote()
 
 --
 -- gh-5100: slow ACK sending shouldn't stun replica for the
diff --git a/test/replication/qsync_snapshots.result b/test/replication/qsync_snapshots.result
index cafdd63c8..f6b86ed70 100644
--- a/test/replication/qsync_snapshots.result
+++ b/test/replication/qsync_snapshots.result
@@ -57,6 +57,9 @@ _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
 _ = box.space.sync:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 -- Testcase body.
 box.space.sync:insert{1}
  | ---
diff --git a/test/replication/qsync_snapshots.test.lua b/test/replication/qsync_snapshots.test.lua
index 590610974..e5f70e46a 100644
--- a/test/replication/qsync_snapshots.test.lua
+++ b/test/replication/qsync_snapshots.test.lua
@@ -23,6 +23,7 @@ test_run:switch('default')
 box.cfg{replication_synchro_quorum=NUM_INSTANCES, replication_synchro_timeout=1000}
 _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
 _ = box.space.sync:create_index('pk')
+box.ctl.promote()
 -- Testcase body.
 box.space.sync:insert{1}
 box.space.sync:select{} -- 1
diff --git a/test/replication/qsync_with_anon.result b/test/replication/qsync_with_anon.result
index 6a2952a32..5ec99c1ef 100644
--- a/test/replication/qsync_with_anon.result
+++ b/test/replication/qsync_with_anon.result
@@ -57,6 +57,9 @@ _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
 _ = box.space.sync:create_index('pk')
  | ---
  | ...
+box.ctl.promote()
+ | ---
+ | ...
 -- Testcase body.
 test_run:switch('default')
  | ---
diff --git a/test/replication/qsync_with_anon.test.lua b/test/replication/qsync_with_anon.test.lua
index d7ecaa107..28e08697e 100644
--- a/test/replication/qsync_with_anon.test.lua
+++ b/test/replication/qsync_with_anon.test.lua
@@ -22,6 +22,7 @@ test_run:switch('default')
 box.cfg{replication_synchro_quorum=NUM_INSTANCES, replication_synchro_timeout=1000}
 _ = box.schema.space.create('sync', {is_sync=true, engine=engine})
 _ = box.space.sync:create_index('pk')
+box.ctl.promote()
 -- Testcase body.
 test_run:switch('default')
 box.space.sync:insert{1} -- success
-- 
2.30.1 (Apple Git-130)
Thread overview: 34+ messages
2021-06-10 13:32 [Tarantool-patches] [PATCH 0/7] forbid implicit limbo ownership transition Serge Petrenko via Tarantool-patches
2021-06-10 13:32 ` [Tarantool-patches] [PATCH 1/7] replication: always send raft state to subscribers Serge Petrenko via Tarantool-patches
2021-06-10 16:47 ` Cyrill Gorcunov via Tarantool-patches
2021-06-11 8:43 ` Serge Petrenko via Tarantool-patches
2021-06-11 8:44 ` Cyrill Gorcunov via Tarantool-patches
2021-06-15 20:53 ` Vladislav Shpilevoy via Tarantool-patches
2021-06-17 21:00 ` Serge Petrenko via Tarantool-patches
2021-06-10 13:32 ` Serge Petrenko via Tarantool-patches [this message]
2021-06-15 20:55 ` [Tarantool-patches] [PATCH 2/7] replication: forbid implicit limbo owner transition Vladislav Shpilevoy via Tarantool-patches
2021-06-17 21:00 ` Serge Petrenko via Tarantool-patches
2021-06-18 22:49 ` Vladislav Shpilevoy via Tarantool-patches
2021-06-21 10:13 ` Serge Petrenko via Tarantool-patches
2021-06-10 13:32 ` [Tarantool-patches] [PATCH 3/7] txn_limbo: fix promote term filtering Serge Petrenko via Tarantool-patches
2021-06-15 20:57 ` Vladislav Shpilevoy via Tarantool-patches
2021-06-17 21:00 ` Serge Petrenko via Tarantool-patches
2021-06-18 22:49 ` Vladislav Shpilevoy via Tarantool-patches
2021-06-21 8:55 ` Serge Petrenko via Tarantool-patches
2021-06-10 13:32 ` [Tarantool-patches] [PATCH 4/7] txn_limbo: persist the latest effective promote in snapshot Serge Petrenko via Tarantool-patches
2021-06-15 20:59 ` Vladislav Shpilevoy via Tarantool-patches
2021-06-17 21:00 ` Serge Petrenko via Tarantool-patches
2021-06-10 13:32 ` [Tarantool-patches] [PATCH 5/7] replication: send latest effective promote in initial join Serge Petrenko via Tarantool-patches
2021-06-15 21:00 ` Vladislav Shpilevoy via Tarantool-patches
2021-06-17 21:00 ` Serge Petrenko via Tarantool-patches
2021-06-18 22:52 ` Vladislav Shpilevoy via Tarantool-patches
2021-06-21 10:12 ` Serge Petrenko via Tarantool-patches
2021-06-10 13:32 ` [Tarantool-patches] [PATCH 6/7] box: introduce `box.ctl.demote` Serge Petrenko via Tarantool-patches
2021-06-18 22:52 ` Vladislav Shpilevoy via Tarantool-patches
2021-06-21 14:56 ` Serge Petrenko via Tarantool-patches
2021-06-10 13:32 ` [Tarantool-patches] [PATCH 7/7] box: make promote/demote always bump the term Serge Petrenko via Tarantool-patches
2021-06-15 21:00 ` Vladislav Shpilevoy via Tarantool-patches
2021-06-17 21:00 ` Serge Petrenko via Tarantool-patches
2021-06-18 22:53 ` Vladislav Shpilevoy via Tarantool-patches
2021-06-21 15:02 ` Serge Petrenko via Tarantool-patches
2021-06-15 20:53 ` [Tarantool-patches] [PATCH 0/7] forbid implicit limbo ownership transition Vladislav Shpilevoy via Tarantool-patches