[Tarantool-patches] [PATCH 1/2] box: rework local_recovery to use async txn_commit

Serge Petrenko sergepetrenko at tarantool.org
Mon Jun 22 14:19:47 MSK 2020


21.06.2020 19:25, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patch!
>
>> diff --git a/src/box/box.cc b/src/box/box.cc
>> index f80d6f8e6..0fe7625fb 100644
>> --- a/src/box/box.cc
>> +++ b/src/box/box.cc
>> @@ -222,6 +222,12 @@ box_process_rw(struct request *request, struct space *space,
>>                   */
>>                  if (is_local_recovery) {
>>                          res = txn_commit_async(txn);
>> +                       /*
>> +                        * Hack: remove the unnecessary trigger.
>> +                        * I don't know of a better place to do
>> +                        * it.
>> +                        */
> Why is it necessary to remove it?

The trigger is set after journal_write_async() returns.
The idea is that journal_write_async() only submits the request for
writing, and journal_async_complete(), which is txn_complete_async(),
finishes the tx processing later. The key word here is "later":
txn_complete_async() is called after txn_commit_async() returns.

This is why txn_complete_async() clears the on_write_failure trigger:
if the write fails, we roll back the tx and remove the entry from the limbo.
If the write doesn't fail, it's the limbo's business to remove the entry.
The tx may still be rolled back due to a timeout. In that case the limbo
removes the entry, and if the on_rollback trigger isn't cleared, it'll try
to remove the entry again.
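
To illustrate the double removal, here is a toy model of the above. It is
only a sketch: limbo_demo, txn_demo and all the functions are made-up
stand-ins, not the real limbo or txn code, and the "entry" is reduced to a
removal counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; not Tarantool's real structures. */
struct limbo_demo {
	int removals; /* How many times the tx's entry was removed. */
};

struct txn_demo {
	struct limbo_demo *limbo;
	bool on_write_failure_set; /* Models txn->on_write_failure. */
};

/* Models the trigger body: remove the tx's entry from the limbo. */
static void
txn_demo_run_rollback_triggers(struct txn_demo *txn)
{
	if (txn->on_write_failure_set) {
		txn->limbo->removals++;
		txn->on_write_failure_set = false;
	}
}

/*
 * Models the completion callback. On failure the tx is rolled back and
 * the trigger removes the entry. On success the trigger is cleared,
 * unless clear_on_success is false (the hypothetical buggy variant).
 */
static void
txn_demo_complete_async(struct txn_demo *txn, bool write_ok,
			bool clear_on_success)
{
	if (!write_ok)
		txn_demo_run_rollback_triggers(txn);
	else if (clear_on_success)
		txn->on_write_failure_set = false;
}

/* Models a timeout: the limbo removes the entry, then rolls back the tx. */
static void
limbo_demo_timeout(struct txn_demo *txn)
{
	assert(txn != NULL && txn->limbo != NULL);
	txn->limbo->removals++;
	txn_demo_run_rollback_triggers(txn);
}

/* Successful write followed by a timeout; returns the removal count. */
int
limbo_demo_removals_after_timeout(bool clear_on_success)
{
	struct limbo_demo limbo = { 0 };
	struct txn_demo txn = { &limbo, true };
	txn_demo_complete_async(&txn, true, clear_on_success);
	limbo_demo_timeout(&txn);
	return limbo.removals;
}

/* Failed write; returns the removal count. */
int
limbo_demo_removals_after_write_failure(void)
{
	struct limbo_demo limbo = { 0 };
	struct txn_demo txn = { &limbo, true };
	txn_demo_complete_async(&txn, false, true);
	return limbo.removals;
}
```

With the trigger cleared on success the entry is removed once; if it is
left in place, the timeout path removes it twice.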

Now, back to this case: I reworked recovery_journal to use write_async().
recovery_journal_write_async() calls journal_async_complete() right away,
since there are no actual writes and no one else can call
journal_async_complete().

So, by the time journal_write_async() returns, the trigger is already
cleared, but it's set again at the end of txn_commit_async(), because
txn_commit_async() assumes that the write hasn't happened yet and that
journal_async_complete() is yet to be called.
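
The ordering can be sketched with a toy model. Again, everything named
order_demo_* is hypothetical and only mimics the call sequence described
above; the real txn_commit_async() does much more.

```c
#include <stdbool.h>

/* Toy stand-in for illustration only; not Tarantool's struct txn. */
struct order_demo_txn {
	bool on_write_failure_set; /* Models txn->on_write_failure. */
};

typedef void (*order_demo_write_f)(struct order_demo_txn *txn);

/* Models the completion callback: the write is done, drop the trigger. */
static void
order_demo_complete_async(struct order_demo_txn *txn)
{
	txn->on_write_failure_set = false;
}

/* Models a WAL-like journal: only queues the write, completes later. */
static void
order_demo_wal_write_async(struct order_demo_txn *txn)
{
	(void)txn;
}

/* Models the recovery journal: nothing to write, completes right away. */
static void
order_demo_recovery_write_async(struct order_demo_txn *txn)
{
	order_demo_complete_async(txn);
}

/*
 * Models the commit path: submit the write, then set the trigger,
 * assuming that the completion callback has not run yet.
 */
static void
order_demo_commit_async(struct order_demo_txn *txn, order_demo_write_f write)
{
	write(txn);
	txn->on_write_failure_set = true;
}

/* WAL-like case: completion runs after the commit call returns. */
bool
order_demo_trigger_left_after_wal_commit(void)
{
	struct order_demo_txn txn = { false };
	order_demo_commit_async(&txn, order_demo_wal_write_async);
	order_demo_complete_async(&txn);
	return txn.on_write_failure_set;
}

/* Recovery case: completion runs inside the write call. */
bool
order_demo_trigger_left_after_recovery_commit(bool apply_workaround)
{
	struct order_demo_txn txn = { false };
	order_demo_commit_async(&txn, order_demo_recovery_write_async);
	if (apply_workaround) {
		/* Models trigger_clear(&txn->on_write_failure). */
		txn.on_write_failure_set = false;
	}
	return txn.on_write_failure_set;
}
```

In the WAL-like case the trigger is gone once completion runs. In the
recovery case it stays set unless it is cleared explicitly after the
commit call.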

One option is to call journal_async_complete() after txn_commit_async().
This would solve the problem, but journal_async_complete() receives a
`struct journal_entry *`, and we have no knowledge of the journal entry
outside of `txn_commit_async()`.

Another option is to set a txn flag, like "TXN_DONT_EXPECT_WRITE_FAILURE",
so that the txn doesn't set an on_write_failure trigger when we don't need
it. But this also looks ugly.

That's why I unconditionally clear the trigger after txn_commit_async().

>
>> +                       trigger_clear(&txn->on_write_failure);
>>                  } else {
>>                          res = txn_commit(txn);
>>                  }

-- 
Serge Petrenko


