From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from [87.239.111.99] (localhost [127.0.0.1]) by dev.tarantool.org (Postfix) with ESMTP id A3EF16C1AE; Thu, 20 May 2021 12:04:09 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org A3EF16C1AE DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev; t=1621501449; bh=ltIXqUC49DvJPtwTNk60rR5F+qa6Llc1n1UywF6rToo=; h=To:Date:In-Reply-To:References:Subject:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=tlDEV7q64l8LYRApFKEvnt4YjciGGBHLoPAFGQbAGTRS1sNWsPUzXSekdHjy5Giqc Uf72sHow04R9np9GF3XVI05jXOW3yh4uPEufeu2fWy+zKEYrP4CTsUtDDOwph8+kuT RMG97kUBcA9Om9iDU0dMhmPVCy1IKGB3uinznhOM= Received: from smtp52.i.mail.ru (smtp52.i.mail.ru [94.100.177.112]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 8CC416C1AE for ; Thu, 20 May 2021 12:02:43 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 8CC416C1AE Received: by smtp52.i.mail.ru with esmtpa (envelope-from ) id 1ljeZm-0000ap-Q6; Thu, 20 May 2021 12:02:43 +0300 To: v.shpilevoy@tarantool.org, gorcunov@gmail.com Date: Thu, 20 May 2021 12:02:36 +0300 Message-Id: X-Mailer: git-send-email 2.30.1 (Apple Git-130) In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-7564579A: B8F34718100C35BD X-77F55803: 4F1203BC0FB41BD91B019B01C53E51AF07EDA974EE11B688BBB65244C6A6BC9F00894C459B0CD1B979B1B686A005E0F6DB0E44B2FACFC5B1377725EC486409A205836D1F9953231F X-7FA49CB5: FF5795518A3D127A4AD6D5ED66289B5278DA827A17800CE7EE826B7DF942E3A1EA1F7E6F0F101C67BD4B6F7A4D31EC0BCC500DACC3FED6E28638F802B75D45FF8AA50765F79006376DFCC587F0A0398D8638F802B75D45FF914D58D5BE9E6BC1A93B80C6DEB9DEE97C6FB206A91F05B27CAC1E594986DF8DB249A9DFE96E2A936FD1C55BDD38FC3FD2E47CDBA5A96583C09775C1D3CA48CFCA5A41EBD8A3A0199FA2833FD35BB23D2EF20D2F80756B5F868A13BD56FB6657A471835C12D1D977725E5C173C3A84C317B107DEF921CE79117882F4460429728AD0CFFFB425014E868A13BD56FB6657D81D268191BDAD3DC09775C1D3CA48CF6F560EC7065075D7BA3038C0950A5D36C8A9BA7A39EFB766EC990983EF5C0329BA3038C0950A5D36D5E8D9A59859A8B60D0E8DD5A242ADE476E601842F6C81A1F004C906525384307823802FF610243DF43C7A68FF6260569E8FC8737B5C2249EC8D19AE6D49635B68655334FD4449CB9ECD01F8117BC8BEAAAE862A0553A39223F8577A6DFFEA7CA1D607EB49F9BFEF43847C11F186F3C59DAA53EE0834AAEE X-C1DE0DAB: C20DE7B7AB408E4181F030C43753B8183A4AFAF3EA6BDC44C234C8B12C006B7A289833858AF3DF23A62152B99C291687CBD94F60714F5334B1881A6453793CE9C32612AADDFBE061C801D989C91DAA47C32612AADDFBE0618C81B5738AFAA7AF9510FB958DCE06DB6ED91DBE5ABE359A805C47957401F4811B2EFE7B39F7738393EDB24507CE13387DFF0A840B692CF8 X-C8649E89: 4E36BF7865823D7055A7F0CF078B5EC49A30900B95165D34E54F8089C01448AA05942BFBBD2B2142B920E61BD3971487271FB9F7A4479683649D361831B09C951D7E09C32AA3244C1E2D60239807A6520DA874CA8EE23B03CE0B41342B755BCD927AC6DF5659F194 X-D57D3AED: 3ZO7eAau8CL7WIMRKs4sN3D3tLDjz0dLbV79QFUyzQ2Ujvy7cMT6pYYqY16iZVKkSc3dCLJ7zSJH7+u4VD18S7Vl4ZUrpaVfd2+vE6kuoey4m4VkSEu530nj6fImhcD4MUrOEAnl0W826KZ9Q+tr5ycPtXkTV4k65bRjmOUUP8cvGozZ33TWg5HZplvhhXbhDGzqmQDTd6OAevLeAnq3Ra9uf7zvY2zzsIhlcp/Y7m53TZgf2aB4JOg4gkr2biojbL9S8ysBdXjRs6glMQaGniKd3pxAeIs+ X-Mailru-Sender: 583F1D7ACE8F49BD95918038521BA2AAB49FED91A72D1289A6D7A0B3C822AA79D2D2178C8611F775424AE0EB1F3D1D21E2978F233C3FAE6EE63DB1732555E4A8EE80603BA4A5B0BC112434F685709FCF0DA7A0AF5A3A8387 X-Mras: Ok Subject: [Tarantool-patches] [PATCH 3/3] box: fix an assertion failure in box.ctl.promote() X-BeenThere: tarantool-patches@dev.tarantool.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Serge Petrenko via Tarantool-patches Reply-To: Serge Petrenko Cc: tarantool-patches@dev.tarantool.org Errors-To: tarantool-patches-bounces@dev.tarantool.org Sender: "Tarantool-patches" box.ctl.promote() used to assume that the last synchronous entry is already written to WAL by the time it's called. This is not the case when promote is executed on the limbo owner. The last synchronous entry might still be en route to WAL. In order to fix the issue, introduce a new method: txn_limbo_wait_lsn_assigned(), which waits until the last synchronous entry receives its lsn. After this happens, it's safe to proceed to gathering quorum in promote. Closes #6032 --- src/box/box.cc | 26 ++--- src/box/txn_limbo.c | 42 +++++++ src/box/txn_limbo.h | 8 ++ .../gh-6032-promote-wal-write.result | 108 ++++++++++++++++++ .../gh-6032-promote-wal-write.test.lua | 41 +++++++ test/replication/suite.cfg | 1 + 6 files changed, 213 insertions(+), 13 deletions(-) create mode 100644 test/replication/gh-6032-promote-wal-write.result create mode 100644 test/replication/gh-6032-promote-wal-write.test.lua diff --git a/src/box/box.cc b/src/box/box.cc index c10e0d8bf..1b1e7eec0 100644 --- a/src/box/box.cc +++ b/src/box/box.cc @@ -1442,17 +1442,22 @@ box_quorum_on_ack_f(struct trigger *trigger, void *event) } /** - * Wait until at least @a quorum of nodes confirm @a target_lsn from the node - * with id @a lead_id. + * Wait until at least @a quorum of nodes confirm the last available synchronous + * entry from the node with id @a lead_id. */ static int -box_wait_quorum(uint32_t lead_id, int64_t target_lsn, int quorum, +box_wait_quorum(uint32_t lead_id, struct txn_limbo_entry **entry, int quorum, double timeout) { struct box_quorum_trigger t; memset(&t, 0, sizeof(t)); vclock_create(&t.vclock); + *entry = txn_limbo_wait_lsn_assigned(&txn_limbo); + if (*entry == NULL) + return -1; + int64_t target_lsn = (*entry)->lsn; + /* Take this node into account immediately. */ int ack_count = vclock_get(box_vclock, lead_id) >= target_lsn; replicaset_foreach(replica) { @@ -1622,22 +1627,17 @@ box_promote(void) } } - /* - * promote() is a no-op on the limbo owner, so all the rows - * in the limbo must've come through the applier meaning they already - * have an lsn assigned, even if their WAL write hasn't finished yet. - */ - wait_lsn = txn_limbo_last_synchro_entry(&txn_limbo)->lsn; - assert(wait_lsn > 0); - - rc = box_wait_quorum(former_leader_id, wait_lsn, quorum, + struct txn_limbo_entry *last_entry; + rc = box_wait_quorum(former_leader_id,&last_entry, quorum, replication_synchro_timeout); if (rc == 0) { + wait_lsn = last_entry->lsn; if (quorum < replication_synchro_quorum) { diag_set(ClientError, ER_QUORUM_WAIT, quorum, "quorum was increased while waiting"); rc = -1; - } else if (wait_lsn < txn_limbo_last_synchro_entry(&txn_limbo)->lsn) { + } else if (last_entry != + txn_limbo_last_synchro_entry(&txn_limbo)) { diag_set(ClientError, ER_QUORUM_WAIT, quorum, "new synchronous transactions appeared"); rc = -1; diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c index f287369a2..406f2de89 100644 --- a/src/box/txn_limbo.c +++ b/src/box/txn_limbo.c @@ -69,6 +69,48 @@ txn_limbo_last_synchro_entry(struct txn_limbo *limbo) return NULL; } +static int +txn_limbo_wait_lsn_assigned_f(struct trigger *trig, void *event) +{ + (void)event; + struct fiber *fiber = trig->data; + fiber_wakeup(fiber); + return 0; +} + +struct txn_limbo_entry * +txn_limbo_wait_lsn_assigned(struct txn_limbo *limbo) +{ + assert(!txn_limbo_is_empty(limbo)); + struct txn_limbo_entry *entry = txn_limbo_last_synchro_entry(limbo); + if (entry->lsn >= 0) + return entry; + + struct trigger write_trigger, rollback_trigger; + trigger_create(&write_trigger, txn_limbo_wait_lsn_assigned_f, fiber(), + NULL); + trigger_create(&rollback_trigger, txn_limbo_wait_lsn_assigned_f, + fiber(), NULL); + txn_on_wal_write(entry->txn, &write_trigger); + txn_on_rollback(entry->txn, &rollback_trigger); + do { + fiber_yield(); + if (fiber_is_cancelled()) { + diag_set(FiberIsCancelled); + entry = NULL; + break; + } + if (entry->txn->signature < 0) { + diag_set(ClientError, ER_SYNC_ROLLBACK); + entry = NULL; + break; + } + } while (entry->lsn == -1); + trigger_clear(&write_trigger); + trigger_clear(&rollback_trigger); + return entry; +} + struct txn_limbo_entry * txn_limbo_append(struct txn_limbo *limbo, uint32_t id, struct txn *txn) { diff --git a/src/box/txn_limbo.h b/src/box/txn_limbo.h index e409ac657..24aa68114 100644 --- a/src/box/txn_limbo.h +++ b/src/box/txn_limbo.h @@ -240,6 +240,14 @@ txn_limbo_is_replica_outdated(const struct txn_limbo *limbo, struct txn_limbo_entry * txn_limbo_last_synchro_entry(struct txn_limbo *limbo); +/** + * Wait until the last synchronous transaction in the limbo gets its LSN + * (i.e. is written to WAL). Return the entry or NULL in case of error. + */ +struct txn_limbo_entry * +txn_limbo_wait_lsn_assigned(struct txn_limbo *limbo); + + /** * Allocate, create, and append a new transaction to the limbo. * The limbo entry is allocated on the transaction's region. diff --git a/test/replication/gh-6032-promote-wal-write.result b/test/replication/gh-6032-promote-wal-write.result new file mode 100644 index 000000000..461d6416e --- /dev/null +++ b/test/replication/gh-6032-promote-wal-write.result @@ -0,0 +1,108 @@ +-- test-run result file version 2 +test_run = require('test_run').new() + | --- + | ... +fiber = require('fiber') + | --- + | ... + +replication_synchro_timeout = box.cfg.replication_synchro_timeout + | --- + | ... +box.cfg{\ + replication_synchro_timeout = 0.001,\ +} + | --- + | ... + +_ = box.schema.create_space('sync', {is_sync = true}):create_index('pk') + | --- + | ... + +box.error.injection.set('ERRINJ_WAL_DELAY', true) + | --- + | - ok + | ... +_ = fiber.create(function() box.space.sync:replace{1} end) + | --- + | ... +ok, err = nil, nil + | --- + | ... + +-- Test that the fiber actually waits for a WAL write to happen. +f = fiber.create(function() ok, err = pcall(box.ctl.promote) end) + | --- + | ... +fiber.sleep(0.1) + | --- + | ... +f:status() + | --- + | - suspended + | ... +box.error.injection.set('ERRINJ_WAL_DELAY', false) + | --- + | - ok + | ... +test_run:wait_cond(function() return f:status() == 'dead' end) + | --- + | - true + | ... +ok + | --- + | - true + | ... +err + | --- + | - null + | ... + +box.error.injection.set('ERRINJ_WAL_DELAY', true) + | --- + | - ok + | ... +_ = fiber.create(function() box.space.sync:replace{2} end) + | --- + | ... + +-- Test that the fiber is cancellable. +f = fiber.create(function() ok, err = pcall(box.ctl.promote) end) + | --- + | ... +fiber.sleep(0.1) + | --- + | ... +f:status() + | --- + | - suspended + | ... +f:cancel() + | --- + | ... +test_run:wait_cond(function() return f:status() == 'dead' end) + | --- + | - true + | ... +ok + | --- + | - false + | ... +err + | --- + | - fiber is cancelled + | ... + +-- Cleanup. +box.error.injection.set('ERRINJ_WAL_DELAY', false) + | --- + | - ok + | ... +box.cfg{\ + replication_synchro_timeout = replication_synchro_timeout,\ +} + | --- + | ... +box.space.sync:drop() + | --- + | ... diff --git a/test/replication/gh-6032-promote-wal-write.test.lua b/test/replication/gh-6032-promote-wal-write.test.lua new file mode 100644 index 000000000..7725330f5 --- /dev/null +++ b/test/replication/gh-6032-promote-wal-write.test.lua @@ -0,0 +1,41 @@ +test_run = require('test_run').new() +fiber = require('fiber') + +replication_synchro_timeout = box.cfg.replication_synchro_timeout +box.cfg{\ + replication_synchro_timeout = 0.001,\ +} + +_ = box.schema.create_space('sync', {is_sync = true}):create_index('pk') + +box.error.injection.set('ERRINJ_WAL_DELAY', true) +_ = fiber.create(function() box.space.sync:replace{1} end) +ok, err = nil, nil + +-- Test that the fiber actually waits for a WAL write to happen. +f = fiber.create(function() ok, err = pcall(box.ctl.promote) end) +fiber.sleep(0.1) +f:status() +box.error.injection.set('ERRINJ_WAL_DELAY', false) +test_run:wait_cond(function() return f:status() == 'dead' end) +ok +err + +box.error.injection.set('ERRINJ_WAL_DELAY', true) +_ = fiber.create(function() box.space.sync:replace{2} end) + +-- Test that the fiber is cancellable. +f = fiber.create(function() ok, err = pcall(box.ctl.promote) end) +fiber.sleep(0.1) +f:status() +f:cancel() +test_run:wait_cond(function() return f:status() == 'dead' end) +ok +err + +-- Cleanup. +box.error.injection.set('ERRINJ_WAL_DELAY', false) +box.cfg{\ + replication_synchro_timeout = replication_synchro_timeout,\ +} +box.space.sync:drop() diff --git a/test/replication/suite.cfg b/test/replication/suite.cfg index dc39e2f74..dfe4be9ae 100644 --- a/test/replication/suite.cfg +++ b/test/replication/suite.cfg @@ -45,6 +45,7 @@ "gh-5435-qsync-clear-synchro-queue-commit-all.test.lua": {}, "gh-5536-wal-limit.test.lua": {}, "gh-5566-final-join-synchro.test.lua": {}, + "gh-6032-promote-wal-write.test.lua": {}, "*": { "memtx": {"engine": "memtx"}, "vinyl": {"engine": "vinyl"} -- 2.30.1 (Apple Git-130)