[Tarantool-patches] [PATCH v2 6/6] txn_limbo: ignore CONFIRM/ROLLBACK for a foreign master

Serge Petrenko sergepetrenko at tarantool.org
Thu Dec 24 19:13:45 MSK 2020



23.12.2020 20:28, Vladislav Shpilevoy wrote:
> Thanks for the patch!
>
> On 23.12.2020 12:59, Serge Petrenko via Tarantool-patches wrote:
>> We designed limbo so that it errors on receiving a CONFIRM or ROLLBACK
>> for another instance's data. Actually, this error is pointless, and even
>> harmful. Here's why:
>>
>> Imagine you have 3 instances, 1, 2 and 3.
>> First 1 writes some synchronous transactions, but dies before writing CONFIRM.
>>
>> Now 2 has to write CONFIRM instead of 1 to take limbo ownership.
>> From now on 2 is the limbo owner and in case of high enough load it constantly
>> has some data in the limbo.
>>
>> Once 1 restarts, it first recovers its xlogs, and fills its limbo with
>> its own unconfirmed transactions from the previous run. Now replication
>> between 1, 2 and 3 is started and the first thing 1 sees is that 2 and 3
>> ack its old transactions. So 1 writes CONFIRM for its own transactions
>> even before the same CONFIRM written by 2 reaches it.
>> Once the CONFIRM written by 1 is replicated to 2 and 3 they error and
>> stop replication, since their limbo contains entries from 2, not from 1.
>> Actually, there's no need to error, since it's just a really old CONFIRM
>> that both 2 and 3 have already processed.
>>
>> So, ignore CONFIRM/ROLLBACK when it references a wrong limbo owner.
>>
>> The issue was discovered with test replication/election_qsync_stress.
> The comment is good.

Thanks!
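
To illustrate the rule from the commit message, here is a minimal standalone
sketch. It is not the actual txn_limbo code: the names limbo_model,
sync_request and limbo_apply_request are hypothetical, and the limbo is
modelled as a contiguous range of pending LSNs. The sketch also assumes that
CONFIRM covers entries up to and including its LSN, and that ROLLBACK covers
entries starting from its LSN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request types standing in for CONFIRM and ROLLBACK. */
enum sync_request_type {
	SYNC_CONFIRM,
	SYNC_ROLLBACK,
};

/* Hypothetical, simplified synchronous replication request. */
struct sync_request {
	enum sync_request_type type;
	/* Instance whose transactions the request refers to. */
	uint32_t replica_id;
	/* CONFIRM: confirm up to this LSN. ROLLBACK: roll back from it. */
	int64_t lsn;
};

/*
 * Simplified limbo model (an assumption of this sketch): one owner and
 * a range of pending LSNs [first_lsn, last_lsn]. The range is empty
 * when first_lsn > last_lsn.
 */
struct limbo_model {
	uint32_t owner_id;
	int64_t first_lsn;
	int64_t last_lsn;
};

/*
 * Apply a CONFIRM or ROLLBACK to the limbo model.
 * Returns true if the request was addressed to the current owner,
 * false if it was ignored as a request for a foreign master.
 */
bool
limbo_apply_request(struct limbo_model *limbo, const struct sync_request *req)
{
	assert(limbo != NULL && req != NULL);
	if (req->replica_id != limbo->owner_id) {
		/*
		 * A stale request written for a previous owner's data.
		 * Skip it instead of raising an error, so replication
		 * keeps running.
		 */
		return false;
	}
	switch (req->type) {
	case SYNC_CONFIRM:
		/* Drop confirmed entries from the head of the range. */
		if (req->lsn >= limbo->first_lsn)
			limbo->first_lsn = req->lsn + 1;
		break;
	case SYNC_ROLLBACK:
		/* Drop rolled back entries from the tail of the range. */
		if (req->lsn <= limbo->last_lsn)
			limbo->last_lsn = req->lsn - 1;
		break;
	}
	return true;
}
```

In the scenario above, instance 2 owns the limbo, so a late CONFIRM carrying
replica_id 1 returns false and leaves the pending entries of 2 untouched.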

-- 
Serge Petrenko


