From mboxrd@z Thu Jan 1 00:00:00 1970
To: tarantool-patches@dev.tarantool.org, gorcunov@gmail.com, sergepetrenko@tarantool.org
Date: Tue, 13 Jul 2021 00:20:07 +0200
X-Mailer: git-send-email 2.24.3 (Apple Git-128)
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [Tarantool-patches] [PATCH 1/1] qsync: remove polling from box_promote()
X-BeenThere: tarantool-patches@dev.tarantool.org
Precedence: list
List-Id: Tarantool development patches
From: Vladislav Shpilevoy via Tarantool-patches
Reply-To: Vladislav
 Shpilevoy
Errors-To: tarantool-patches-bounces@dev.tarantool.org
Sender: "Tarantool-patches"

box_promote() when called manually used to wait for the existing
transactions from a foreign limbo to end during a timeout. Giving
them a chance to end on their terms.

The waiting was done via polling like

	while (!done)
		sleep(small_timeout);

Polling is almost always super bad both for execution time and for
CPU usage. The patch replaces it with proper waiting based on events
happening in the limbo.

Closes #5190
---
Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-5190-qsync-polling
Issue: https://github.com/tarantool/tarantool/issues/5190

 src/box/box.cc      |  7 +----
 src/box/txn_limbo.c | 74 ++++++++++++++++++++++++++++++++++-----------
 src/box/txn_limbo.h |  4 +++
 3 files changed, 62 insertions(+), 23 deletions(-)

diff --git a/src/box/box.cc b/src/box/box.cc
index ab7d983c9..eeb57b04e 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -1627,12 +1627,7 @@ box_promote(void)
 	if (try_wait) {
 		/* Wait until pending confirmations/rollbacks reach us. */
 		double timeout = 2 * replication_synchro_timeout;
-		double start_tm = fiber_clock();
-		while (!txn_limbo_is_empty(&txn_limbo)) {
-			if (fiber_clock() - start_tm > timeout)
-				break;
-			fiber_sleep(0.001);
-		}
+		txn_limbo_wait_empty(&txn_limbo, timeout);
 		/*
 		 * Our mission was to clear the limbo from former leader's
 		 * transactions. Exit in case someone did that for us.
diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
index 51dc2a186..fdea287c7 100644
--- a/src/box/txn_limbo.c
+++ b/src/box/txn_limbo.c
@@ -612,11 +612,14 @@ txn_rollback_cb(struct trigger *trigger, void *event)
 	return 0;
 }
 
-int
-txn_limbo_wait_confirm(struct txn_limbo *limbo)
+/**
+ * Wait until the last transaction in the limbo is finished and get its result.
+ */
+static int
+txn_limbo_wait_last_txn(struct txn_limbo *limbo, bool *is_rollback,
+			double timeout)
 {
-	if (txn_limbo_is_empty(limbo))
-		return 0;
+	assert(!txn_limbo_is_empty(limbo));
 
 	/* initialization of a waitpoint. */
 	struct confirm_waitpoint cwp;
@@ -632,27 +635,42 @@ txn_limbo_wait_confirm(struct txn_limbo *limbo)
 	struct txn_limbo_entry *tle = txn_limbo_last_entry(limbo);
 	txn_on_commit(tle->txn, &on_complete);
 	txn_on_rollback(tle->txn, &on_rollback);
-	double start_time = fiber_clock();
+	double deadline = fiber_clock() + timeout;
+	int rc;
 	while (true) {
-		double deadline = start_time + replication_synchro_timeout;
+		if (timeout < 0) {
+			rc = -1;
+			break;
+		}
 		bool cancellable = fiber_set_cancellable(false);
-		double timeout = deadline - fiber_clock();
-		int rc = fiber_cond_wait_timeout(&limbo->wait_cond, timeout);
+		rc = fiber_cond_wait_timeout(&limbo->wait_cond, timeout);
 		fiber_set_cancellable(cancellable);
-		if (cwp.is_confirm || cwp.is_rollback)
-			goto complete;
+		if (cwp.is_confirm || cwp.is_rollback) {
+			*is_rollback = cwp.is_rollback;
+			rc = 0;
+			break;
+		}
 		if (rc != 0)
-			goto timed_out;
+			break;
+		timeout = deadline - fiber_clock();
 	}
-timed_out:
-	/* Clear the triggers if the timeout has been reached. */
 	trigger_clear(&on_complete);
 	trigger_clear(&on_rollback);
-	diag_set(ClientError, ER_SYNC_QUORUM_TIMEOUT);
-	return -1;
+	return rc;
+}
 
-complete:
-	if (!cwp.is_confirm) {
+int
+txn_limbo_wait_confirm(struct txn_limbo *limbo)
+{
+	if (txn_limbo_is_empty(limbo))
+		return 0;
+	bool is_rollback;
+	if (txn_limbo_wait_last_txn(limbo, &is_rollback,
+				    replication_synchro_timeout) != 0) {
+		diag_set(ClientError, ER_SYNC_QUORUM_TIMEOUT);
+		return -1;
+	}
+	if (is_rollback) {
 		/* The transaction has been rolled back. */
 		diag_set(ClientError, ER_SYNC_ROLLBACK);
 		return -1;
@@ -660,6 +678,28 @@ complete:
 	return 0;
 }
 
+int
+txn_limbo_wait_empty(struct txn_limbo *limbo, double timeout)
+{
+	if (txn_limbo_is_empty(limbo))
+		return 0;
+	bool is_rollback;
+	double deadline = fiber_clock() + timeout;
+	/*
+	 * Retry in the loop. More transactions might be added while waiting for
+	 * the last one.
+	 */
+	do {
+		if (txn_limbo_wait_last_txn(limbo, &is_rollback,
+					    timeout) != 0) {
+			diag_set(ClientError, ER_TIMEOUT);
+			return -1;
+		}
+		timeout = deadline - fiber_clock();
+	} while (!txn_limbo_is_empty(limbo));
+	return 0;
+}
+
 void
 txn_limbo_process(struct txn_limbo *limbo, const struct synchro_request *req)
 {
diff --git a/src/box/txn_limbo.h b/src/box/txn_limbo.h
index e409ac657..7debbc0b9 100644
--- a/src/box/txn_limbo.h
+++ b/src/box/txn_limbo.h
@@ -311,6 +311,10 @@ txn_limbo_process(struct txn_limbo *limbo, const struct synchro_request *req);
 int
 txn_limbo_wait_confirm(struct txn_limbo *limbo);
 
+/** Wait until the limbo is empty. Regardless of how its transactions end. */
+int
+txn_limbo_wait_empty(struct txn_limbo *limbo, double timeout);
+
 /**
  * Write a PROMOTE request, which has the same effect as CONFIRM(@a lsn) and
  * ROLLBACK(@a lsn + 1) combined.
-- 
2.24.3 (Apple Git-128)
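
The deadline bookkeeping used by txn_limbo_wait_empty() can be illustrated
outside Tarantool with a minimal single-threaded sketch. Everything in it is a
hypothetical stand-in: struct fake_limbo, fake_wait_last_txn() and
fake_wait_empty() are not Tarantool APIs, and a simulated clock takes the place
of fiber_clock() and fiber_cond_wait_timeout(). It only shows how one absolute
deadline is computed up front and the remaining timeout is recomputed after each
wait, so that retries never extend the total waiting time.

```c
#include <assert.h>

/* Hypothetical model of a limbo; not the real struct txn_limbo. */
struct fake_limbo {
	/* Number of transactions that have not finished yet. */
	int pending;
	/* Simulated monotonic clock, in seconds. */
	double now;
	/* Simulated time one transaction needs to finish. */
	double step;
};

/*
 * Stand-in for an event-based wait on the last transaction.
 * Returns 0 if it finished within the timeout, -1 otherwise.
 */
int
fake_wait_last_txn(struct fake_limbo *limbo, double timeout)
{
	assert(limbo->pending > 0);
	if (timeout < 0)
		return -1;
	if (limbo->step > timeout) {
		/* The whole remaining budget is spent waiting. */
		limbo->now += timeout;
		return -1;
	}
	limbo->now += limbo->step;
	limbo->pending--;
	return 0;
}

/*
 * Wait until the model limbo is empty. The deadline is fixed once;
 * each retry gets only what is left of the original budget.
 */
int
fake_wait_empty(struct fake_limbo *limbo, double timeout)
{
	if (limbo->pending == 0)
		return 0;
	double deadline = limbo->now + timeout;
	do {
		if (fake_wait_last_txn(limbo, timeout) != 0)
			return -1;
		timeout = deadline - limbo->now;
	} while (limbo->pending != 0);
	return 0;
}
```

In this model, three pending transactions that each take 1.0 simulated second
are drained within a 5.0 second budget. If each takes 2.0 seconds, the third
wait is given only the remaining 1.0 second and fails, leaving one transaction
pending.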