[Tarantool-patches] [PATCH v2 2/2] box: fix an assertion failure in box.ctl.promote()

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Wed May 26 21:46:20 MSK 2021


>>> @@ -1618,14 +1618,29 @@ box_promote(void)
>>>                    txn_limbo.owner_id);
>>>               return -1;
>>>           }
>>> +        if (txn_limbo_is_empty(&txn_limbo)) {
>>> +            wait_lsn = txn_limbo.confirmed_lsn;
>>> +            goto promote;
>>> +        }
>>>       }
>>>   -    /*
>>> -     * promote() is a no-op on the limbo owner, so all the rows
>>> -     * in the limbo must've come through the applier meaning they already
>>> -     * have an lsn assigned, even if their WAL write hasn't finished yet.
>>> -     */
>>> -    wait_lsn = txn_limbo_last_synchro_entry(&txn_limbo)->lsn;
>>> +    struct txn_limbo_entry *last_entry;
>>> +    last_entry = txn_limbo_last_synchro_entry(&txn_limbo);
>>> +    /* Wait for the last entries WAL write. */
>>> +    if (last_entry->lsn < 0) {
>>> +        if (wal_sync(NULL) < 0)
>>> +            return -1;
>>> +        if (txn_limbo_is_empty(&txn_limbo)) {
>>> +            wait_lsn = txn_limbo.confirmed_lsn;
>>> +            goto promote;
>>> +        }
>>> +        if (last_entry != txn_limbo_last_synchro_entry(&txn_limbo)) {
>> This is a bit dangerous. We cache a pointer and then go to fiber_yield,
>> which switches context, at this moment the pointer become dangling one
>> and we simply can't be sure if it _were_ reused. IOW, Serge are we
>> 100% sure that the same pointer with same address but with new data
>> won't appear here as last entry in limbo?
> 
> I agree this solution is not perfect.
> 
> An alternative would be to do the following:
> 1) Check that the limbo owner hasn't changed
> 2) Check that the last entry has positive lsn (e.g. it's not a new entry which
>     wasn't yet written to WAL). And that this lsn is equal to the lsn of our entry.
> 
> But what if our entry was confirmed and destroyed during wal_sync()? We can't compare
> other entries lsn with this ones.

As decided in the chat, you can use txn->id. It is unique until
restart and should help to detect if the last transaction has
changed.


More information about the Tarantool-patches mailing list