[Tarantool-patches] [PATCH 1/2] box: rework local_recovery to use async txn_commit

Serge Petrenko sergepetrenko at tarantool.org
Mon Jun 22 21:55:31 MSK 2020


22.06.2020 14:19, Serge Petrenko пишет:
>
> 21.06.2020 19:25, Vladislav Shpilevoy пишет:
>> Hi! Thanks for the patch!
>>
>>> diff --git a/src/box/box.cc b/src/box/box.cc
>>> index f80d6f8e6..0fe7625fb 100644
>>> --- a/src/box/box.cc
>>> +++ b/src/box/box.cc
>>> @@ -222,6 +222,12 @@ box_process_rw(struct request *request, struct 
>>> space *space,
>>>                   */
>>>                  if (is_local_recovery) {
>>>                          res = txn_commit_async(txn);
>>> +                       /*
>>> +                        * Hack: remove the unnecessary trigger.
>>> +                        * I don't know of a better place to do
>>> +                        * it.
>>> +                        */
>> Why is it necessary to remove it?
>
> The trigger is set after journal_write_async() returns.
> The idea is that journal_write_async() just submits the request for
> writing, and later, journal_async_complete(), which is 
> txn_complete_async()
> completes the tx processing. The key word here is "later", meaning that
> txn_complete_async() is called after txn_commit async() returns.
>
> This is why txn_complete_async() clears the trigger set on write failure:
> if write fails, we rollback the tx and remove the entry from the limbo.
> If writing doesn't fail, it's limbo's business to remove an entry. The 
> tx still
> may be rolled back due to a timeout, then limbo will remove the entry, 
> and if
> an on_rollback trigger isn't cleared, it'll try to remove the entry 
> again.
>
> Now, back to this case: I reworked recovery_journal to use write_async().
> recovery_journal_write_async() calls journal_async_complete() right away,
> since there're no actual writes and no one else can call 
> journal_async_complete().
>
> So, once journal_write_async() returns, the trigger is cleared, but 
> later it's
> reset at the end of txn_commit_async(), because txn_commit_async() 
> assumes that
> the write hasn't happened yet, and journal_async_complete() is yet to 
> be called.
>
> One option is to call journal_async_complete() after 
> txn_commit_async(), this will
> solve the problem, but journal_async_complete() receives `struct 
> journal_entry *`,
> and we have no knowledge of the journal entry outside of 
> `txn_commit_async()`.
>
> Another option is to set a txn_flag, like 
> "TXN_DONT_EXPECT_WRITE_FAILURE", so that
> the txn doesn't set an on_write_failure trigger when we don't need it. 
> But this
> also looks ugly.
>
> That's why I unconditionally reset the trigger after txn_commit_async().

After further discussion, we decided to use different values of 
txn->signature < 0
as reasons for rollbackand instead of resetting the on_write_failure 
triggers,
we'll just make them work onlyfor some rollback reasons (like failed write).

>
>>
>>> + trigger_clear(&txn->on_write_failure);
>>>                  } else {
>>>                          res = txn_commit(txn);
>>>                  }
>
-- 
Serge Petrenko



More information about the Tarantool-patches mailing list