From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp55.i.mail.ru (smtp55.i.mail.ru [217.69.128.35]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 1C23642EF5C for ; Mon, 22 Jun 2020 21:55:33 +0300 (MSK) From: Serge Petrenko References: <46ed0cc08e6ecf7ee7c9250d9cdfc3d68aa5c764.1592589312.git.sergepetrenko@tarantool.org> <30f75001-3ecc-9d7f-9cba-88c824087f4e@tarantool.org> Message-ID: Date: Mon, 22 Jun 2020 21:55:31 +0300 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="utf-8"; format="flowed" Content-Transfer-Encoding: 8bit Content-Language: en-GB Subject: Re: [Tarantool-patches] [PATCH 1/2] box: rework local_recovery to use async txn_commit List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Vladislav Shpilevoy , sergos@tarantool.org, gorcunov@tarantool.org, lvasiliev@tarantool.org Cc: tarantool-patches@dev.tarantool.org 22.06.2020 14:19, Serge Petrenko пишет: > > 21.06.2020 19:25, Vladislav Shpilevoy пишет: >> Hi! Thanks for the patch! >> >>> diff --git a/src/box/box.cc b/src/box/box.cc >>> index f80d6f8e6..0fe7625fb 100644 >>> --- a/src/box/box.cc >>> +++ b/src/box/box.cc >>> @@ -222,6 +222,12 @@ box_process_rw(struct request *request, struct >>> space *space, >>>                   */ >>>                  if (is_local_recovery) { >>>                          res = txn_commit_async(txn); >>> +                       /* >>> +                        * Hack: remove the unnecessary trigger. >>> +                        * I don't know of a better place to do >>> +                        * it. >>> +                        */ >> Why is it necessary to remove it? > > The trigger is set after journal_write_async() returns. > The idea is that journal_write_async() just submits the request for > writing, and later, journal_async_complete(), which is > txn_complete_async() > completes the tx processing. The key word here is "later", meaning that > txn_complete_async() is called after txn_commit async() returns. > > This is why txn_complete_async() clears the trigger set on write failure: > if write fails, we rollback the tx and remove the entry from the limbo. > If writing doesn't fail, it's limbo's business to remove an entry. The > tx still > may be rolled back due to a timeout, then limbo will remove the entry, > and if > an on_rollback trigger isn't cleared, it'll try to remove the entry > again. > > Now, back to this case: I reworked recovery_journal to use write_async(). > recovery_journal_write_async() calls journal_async_complete() right away, > since there're no actual writes and no one else can call > journal_async_complete(). > > So, once journal_write_async() returns, the trigger is cleared, but > later it's > reset at the end of txn_commit_async(), because txn_commit_async() > assumes that > the write hasn't happened yet, and journal_async_complete() is yet to > be called. > > One option is to call journal_async_complete() after > txn_commit_async(), this will > solve the problem, but journal_async_complete() receives `struct > journal_entry *`, > and we have no knowledge of the journal entry outside of > `txn_commit_async()`. > > Another option is to set a txn_flag, like > "TXN_DONT_EXPECT_WRITE_FAILURE", so that > the txn doesn't set an on_write_failure trigger when we don't need it. > But this > also looks ugly. > > That's why I unconditionally reset the trigger after txn_commit_async(). After further discussion, we decided to use different values of txn->signature < 0 as reasons for rollbackand instead of resetting the on_write_failure triggers, we'll just make them work onlyfor some rollback reasons (like failed write). > >> >>> + trigger_clear(&txn->on_write_failure); >>>                  } else { >>>                          res = txn_commit(txn); >>>                  } > -- Serge Petrenko