[Tarantool-patches] [PATCH 1/5] [tosquash] replication: fix multiple rollbacks

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Sun Jul 5 18:13:30 MSK 2020


>>   src/box/txn_limbo.c                 | 25 +++++++++++++++++++++++++
>>   test/replication/qsync_basic.result |  2 +-
>>   2 files changed, 26 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
>> index 0402664cb..2cb687f4d 100644
>> --- a/src/box/txn_limbo.c
>> +++ b/src/box/txn_limbo.c
>> @@ -150,6 +157,24 @@ txn_limbo_wait_complete(struct txn_limbo *limbo, struct txn_limbo_entry *entry)
>>       bool timed_out = fiber_yield_timeout(txn_limbo_confirm_timeout(limbo));
>>       fiber_set_cancellable(cancellable);
>>       if (timed_out) {
>> +        assert(!txn_limbo_is_empty(limbo));
>> +        if (txn_limbo_first_entry(limbo) != entry) {
>> +            /*
>> +             * If this is not a first entry in the
>> +             * limbo, it is definitely not a first
>> +             * timed out entry. And since it managed
>> +             * to time out too, it means there is
>> +             * currently another fiber writing
>> +             * rollback. Wait when it will finish and
>> +             * wake us up.
>> +             */
> 
> Why isn't it the first timed out? Is it because once previous entry was confirmed, it
> is removed from the queue immediately?
> Looks fragile.

No, this is not related to confirms. If two entries were added to the limbo and the
second one timed out, then the first one, which has been in the limbo even longer,
must have timed out too. So it is enough to write a rollback for the first entry, and
that rolls back all the following ones.
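
To illustrate the argument, here is a minimal toy model, not the real txn_limbo code.
It assumes every entry uses the same timeout and that the timer starts when the entry
is appended; all toy_* names are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; not the real struct txn_limbo. */
struct toy_entry {
	/* Time at which the entry was appended to the queue. */
	double enqueued_at;
};

struct toy_limbo {
	/* Entries are kept in FIFO order: index 0 is the oldest. */
	struct toy_entry entries[8];
	size_t count;
	/* Assumption: one timeout shared by all entries. */
	double timeout;
};

/* Append an entry at time 'now' and return its index. */
static size_t
toy_limbo_append(struct toy_limbo *limbo, double now)
{
	assert(limbo->count < 8);
	limbo->entries[limbo->count].enqueued_at = now;
	return limbo->count++;
}

static bool
toy_entry_timed_out(const struct toy_limbo *limbo, size_t i, double now)
{
	return now - limbo->entries[i].enqueued_at >= limbo->timeout;
}

/*
 * Called for an entry that has timed out. Returns true if its
 * owner has to write the rollback, false if it should wait for
 * the owner of the first entry to do that.
 */
static bool
toy_must_write_rollback(const struct toy_limbo *limbo, size_t i, double now)
{
	assert(toy_entry_timed_out(limbo, i, now));
	/*
	 * The first entry is at least as old as entry i, so under
	 * the shared-timeout assumption it has timed out as well.
	 */
	assert(toy_entry_timed_out(limbo, 0, now));
	return i == 0;
}
```

With a timeout of 5 and entries appended at times 0 and 1, both have timed out by
time 6, and only the owner of the first entry is asked to write the rollback.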

