From: Serge Petrenko via Tarantool-patches
Reply-To: Serge Petrenko
To: Vladislav Shpilevoy, tarantool-patches@dev.tarantool.org, gorcunov@gmail.com
Date: Tue, 13 Jul 2021 13:01:37 +0300
Subject: Re: [Tarantool-patches] [PATCH 1/1] qsync: remove polling from box_promote()
List-Id: Tarantool development patches

On 13.07.2021 01:20, Vladislav Shpilevoy wrote:
> box_promote() when called manually used to wait for the existing
> transactions from a foreign limbo to end during a timeout. Giving
> them a chance to end on their terms.
>
> The waiting was done via polling like
>
> 	while (!done)
> 		sleep(small_timeout);
>
> Polling is almost always super bad both for execution time and
> for CPU usage. The patch replaces it with proper waiting based on
> events happening in the limbo.
>
> Closes #5190
> ---
> Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-5190-qsync-polling
> Issue: https://github.com/tarantool/tarantool/issues/5190
>
>  src/box/box.cc      |  7 +----
>  src/box/txn_limbo.c | 74 ++++++++++++++++++++++++++++++++++----------
>  src/box/txn_limbo.h |  4 +++
>  3 files changed, 62 insertions(+), 23 deletions(-)
>
> diff --git a/src/box/box.cc b/src/box/box.cc
> index ab7d983c9..eeb57b04e 100644
> --- a/src/box/box.cc
> +++ b/src/box/box.cc
> @@ -1627,12 +1627,7 @@ box_promote(void)
>  	if (try_wait) {
>  		/* Wait until pending confirmations/rollbacks reach us. */
>  		double timeout = 2 * replication_synchro_timeout;
> -		double start_tm = fiber_clock();
> -		while (!txn_limbo_is_empty(&txn_limbo)) {
> -			if (fiber_clock() - start_tm > timeout)
> -				break;
> -			fiber_sleep(0.001);
> -		}
> +		txn_limbo_wait_empty(&txn_limbo, timeout);
>  		/*
>  		 * Our mission was to clear the limbo from former leader's
>  		 * transactions. Exit in case someone did that for us.
> diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
> index 51dc2a186..fdea287c7 100644
> --- a/src/box/txn_limbo.c
> +++ b/src/box/txn_limbo.c
> @@ -612,11 +612,14 @@ txn_rollback_cb(struct trigger *trigger, void *event)
>  	return 0;
>  }
>
> -int
> -txn_limbo_wait_confirm(struct txn_limbo *limbo)
> +/**
> + * Wait until the last transaction in the limbo is finished and get its result.
> + */
> +static int
> +txn_limbo_wait_last_txn(struct txn_limbo *limbo, bool *is_rollback,
> +			double timeout)
>  {
> -	if (txn_limbo_is_empty(limbo))
> -		return 0;
> +	assert(!txn_limbo_is_empty(limbo));
>
>  	/* initialization of a waitpoint. */
>  	struct confirm_waitpoint cwp;
> @@ -632,27 +635,42 @@ txn_limbo_wait_confirm(struct txn_limbo *limbo)
>  	struct txn_limbo_entry *tle = txn_limbo_last_entry(limbo);
>  	txn_on_commit(tle->txn, &on_complete);
>  	txn_on_rollback(tle->txn, &on_rollback);
> -	double start_time = fiber_clock();
> +	double deadline = fiber_clock() + timeout;
> +	int rc;
>  	while (true) {
> -		double deadline = start_time + replication_synchro_timeout;
> +		if (timeout < 0) {
> +			rc = -1;
> +			break;
> +		}
>  		bool cancellable = fiber_set_cancellable(false);
> -		double timeout = deadline - fiber_clock();
> -		int rc = fiber_cond_wait_timeout(&limbo->wait_cond, timeout);
> +		rc = fiber_cond_wait_timeout(&limbo->wait_cond, timeout);
>  		fiber_set_cancellable(cancellable);
> -		if (cwp.is_confirm || cwp.is_rollback)
> -			goto complete;
> +		if (cwp.is_confirm || cwp.is_rollback) {
> +			*is_rollback = cwp.is_rollback;
> +			rc = 0;
> +			break;
> +		}
>  		if (rc != 0)
> -			goto timed_out;
> +			break;
> +		timeout = deadline - fiber_clock();
>  	}
> -timed_out:
> -	/* Clear the triggers if the timeout has been reached. */
>  	trigger_clear(&on_complete);
>  	trigger_clear(&on_rollback);
> -	diag_set(ClientError, ER_SYNC_QUORUM_TIMEOUT);
> -	return -1;
> +	return rc;
> +}
>
> -complete:
> -	if (!cwp.is_confirm) {
> +int
> +txn_limbo_wait_confirm(struct txn_limbo *limbo)
> +{
> +	if (txn_limbo_is_empty(limbo))
> +		return 0;
> +	bool is_rollback;
> +	if (txn_limbo_wait_last_txn(limbo, &is_rollback,
> +				    replication_synchro_timeout) != 0) {
> +		diag_set(ClientError, ER_SYNC_QUORUM_TIMEOUT);
> +		return -1;
> +	}
> +	if (is_rollback) {
>  		/* The transaction has been rolled back. */
>  		diag_set(ClientError, ER_SYNC_ROLLBACK);
>  		return -1;
> @@ -660,6 +678,28 @@ complete:
>  	return 0;
>  }
>
> +int
> +txn_limbo_wait_empty(struct txn_limbo *limbo, double timeout)
> +{
> +	if (txn_limbo_is_empty(limbo))
> +		return 0;
> +	bool is_rollback;
> +	double deadline = fiber_clock() + timeout;
> +	/*
> +	 * Retry in the loop. More transactions might be added while waiting for
> +	 * the last one.
> +	 */
> +	do {
> +		if (txn_limbo_wait_last_txn(limbo, &is_rollback,
> +					    timeout) != 0) {
> +			diag_set(ClientError, ER_TIMEOUT);
> +			return -1;
> +		}
> +		timeout = deadline - fiber_clock();
> +	} while (!txn_limbo_is_empty(limbo));
> +	return 0;
> +}
> +
>  void
>  txn_limbo_process(struct txn_limbo *limbo, const struct synchro_request *req)
>  {
> diff --git a/src/box/txn_limbo.h b/src/box/txn_limbo.h
> index e409ac657..7debbc0b9 100644
> --- a/src/box/txn_limbo.h
> +++ b/src/box/txn_limbo.h
> @@ -311,6 +311,10 @@ txn_limbo_process(struct txn_limbo *limbo, const struct synchro_request *req);
>  int
>  txn_limbo_wait_confirm(struct txn_limbo *limbo);
>
> +/** Wait until the limbo is empty. Regardless of how its transactions end. */
> +int
> +txn_limbo_wait_empty(struct txn_limbo *limbo, double timeout);
> +
>  /**
>   * Write a PROMOTE request, which has the same effect as CONFIRM(@a lsn) and
>   * ROLLBACK(@a lsn + 1) combined.

-- 
Serge Petrenko
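
P.S. For readers who want to try the idea outside of Tarantool, below is a minimal sketch of the same pattern: wait for an event against a single deadline instead of sleeping in small steps. It is only an illustration under assumptions, not code from the patch. It uses POSIX threads rather than fibers, and struct queue, queue_create(), queue_complete_one(), queue_wait_empty(), worker() and run_demo() are hypothetical names invented for this example. One difference from the patch: pthread_cond_timedwait() takes an absolute deadline, so the remaining timeout does not have to be recomputed after each wakeup, while fiber_cond_wait_timeout() in the patch is called with a relative timeout that the loop refreshes from the deadline.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical stand-in for the limbo: a counter of pending entries. */
struct queue {
	pthread_mutex_t mutex;
	pthread_cond_t cond;
	int pending;
};

static void
queue_create(struct queue *q, int pending)
{
	pthread_mutex_init(&q->mutex, NULL);
	pthread_cond_init(&q->cond, NULL);
	q->pending = pending;
}

/* Finish one entry and wake up everyone waiting on the queue. */
static void
queue_complete_one(struct queue *q)
{
	pthread_mutex_lock(&q->mutex);
	if (q->pending > 0)
		q->pending--;
	pthread_cond_broadcast(&q->cond);
	pthread_mutex_unlock(&q->mutex);
}

/*
 * Wait until the queue is empty or the timeout (in seconds) expires.
 * Returns 0 if the queue is empty, -1 otherwise.
 */
static int
queue_wait_empty(struct queue *q, double timeout)
{
	assert(timeout >= 0);
	/* Compute the absolute deadline once, before the loop. */
	struct timespec deadline;
	clock_gettime(CLOCK_REALTIME, &deadline);
	time_t sec = (time_t)timeout;
	deadline.tv_sec += sec;
	deadline.tv_nsec += (long)((timeout - (double)sec) * 1e9);
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}
	int rc = 0;
	pthread_mutex_lock(&q->mutex);
	/* Re-check the predicate after every wakeup, spurious or not. */
	while (q->pending > 0 && rc == 0)
		rc = pthread_cond_timedwait(&q->cond, &q->mutex, &deadline);
	int result = q->pending == 0 ? 0 : -1;
	pthread_mutex_unlock(&q->mutex);
	return result;
}

/* Plays the role of the peer that confirms entries one by one. */
static void *
worker(void *arg)
{
	struct queue *q = arg;
	struct timespec pause = {0, 10 * 1000 * 1000};
	for (int i = 0; i < 3; i++) {
		nanosleep(&pause, NULL);
		queue_complete_one(q);
	}
	return NULL;
}

/* Start the worker and wait for the queue to drain. */
static int
run_demo(void)
{
	struct queue q;
	pthread_t thread;
	queue_create(&q, 3);
	if (pthread_create(&thread, NULL, worker, &q) != 0)
		return -1;
	int rc = queue_wait_empty(&q, 5.0);
	pthread_join(thread, NULL);
	return rc;
}
```

Calling run_demo() from a main() (built with -pthread where the platform needs it) should return 0 once the worker has completed all three entries. The waiter is woken by each completion rather than on a fixed 1 ms tick, which is the point of replacing the polling loop.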