[Tarantool-patches] [PATCH v2 6/6] txn_limbo: ignore CONFIRM/ROLLBACK for a foreign master
Vladislav Shpilevoy
v.shpilevoy at tarantool.org
Wed Dec 23 20:28:06 MSK 2020
Thanks for the patch!
On 23.12.2020 12:59, Serge Petrenko via Tarantool-patches wrote:
> We designed limbo so that it errors on receiving a CONFIRM or ROLLBACK
> for another instance's data. Actually, this error is pointless, and even
> harmful. Here's why:
>
> Imagine you have 3 instances, 1, 2 and 3.
> First 1 writes some synchronous transactions, but dies before writing CONFIRM.
>
> Now 2 has to write CONFIRM instead of 1 to take limbo ownership.
> From now on 2 is the limbo owner and in case of high enough load it constantly
> has some data in the limbo.
>
> Once 1 restarts, it first recovers its xlogs, and fills its limbo with
> its own unconfirmed transactions from the previous run. Now replication
> between 1, 2 and 3 is started and the first thing 1 sees is that 2 and 3
> ack its old transactions. So 1 writes CONFIRM for its own transactions
> even before the same CONFIRM written by 2 reaches it.
> Once the CONFIRM written by 1 is replicated to 2 and 3 they error and
> stop replication, since their limbo contains entries from 2, not from 1.
> Actually, there's no need to error, since it's just a really old CONFIRM
> which has already been processed by both 2 and 3.
>
> So, ignore CONFIRM/ROLLBACK when it references a wrong limbo owner.
>
> The issue was discovered with test replication/election_qsync_stress.
The comment is good.
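
For readers following the thread, the behaviour described in the commit message can be sketched with a toy model. This is not Tarantool's C implementation: the `Limbo` class, its `owner_id` and `queue` fields, and the `apply` method are hypothetical names chosen for illustration. The sketch assumes that CONFIRM commits every queued entry up to and including the given LSN, and that ROLLBACK drops every entry starting from it.

```python
from dataclasses import dataclass, field
from typing import List

CONFIRM = "CONFIRM"
ROLLBACK = "ROLLBACK"


@dataclass
class Limbo:
    """Toy queue of synchronous transactions (hypothetical, not Tarantool's struct)."""
    owner_id: int = 0                                # instance whose entries are queued
    queue: List[int] = field(default_factory=list)  # LSNs awaiting CONFIRM/ROLLBACK

    def apply(self, kind: str, origin_id: int, lsn: int) -> bool:
        """Apply a CONFIRM/ROLLBACK request; return False if it was ignored."""
        if origin_id != self.owner_id:
            # The request references a foreign master's data. It is treated as
            # a stale record that was already processed, so it is skipped
            # instead of raising an error and stopping replication.
            return False
        if kind == CONFIRM:
            # Assumption: entries with LSN <= lsn become committed.
            self.queue = [l for l in self.queue if l > lsn]
        elif kind == ROLLBACK:
            # Assumption: entries with LSN >= lsn are rolled back.
            self.queue = [l for l in self.queue if l < lsn]
        else:
            raise ValueError(f"unknown request type: {kind}")
        return True


# Instance 2 owns the limbo and has its own pending transactions.
limbo_on_2 = Limbo(owner_id=2, queue=[10, 11, 12])

# A late CONFIRM written by the restarted instance 1 arrives: ignored.
stale_applied = limbo_on_2.apply(CONFIRM, origin_id=1, lsn=5)
queue_after_stale = list(limbo_on_2.queue)

# A CONFIRM from the actual owner is applied as usual.
own_applied = limbo_on_2.apply(CONFIRM, origin_id=2, lsn=11)
```

In this model the stale CONFIRM from instance 1 leaves the queue on instance 2 untouched, which mirrors the scenario with instances 1, 2 and 3 above.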