[Tarantool-patches] [PATCH v14 1/6] qsync: track confirmed lsn number on reads

Cyrill Gorcunov gorcunov at gmail.com
Mon Sep 13 01:18:04 MSK 2021


On Sun, Sep 12, 2021 at 05:44:00PM +0200, Vladislav Shpilevoy wrote:
> Thanks for the patch!
> 
> On 10.09.2021 17:29, Cyrill Gorcunov via Tarantool-patches wrote:
> > We will use this lsn for requests validation
> > in next patches for sake of split-brain detection.
> 
> I don't understand. How exactly will it help?

Sorry for not giving a more detailed explanation. Here it is: we have
a test, ./test-run replication/qsync_advanced.test.lua, where the limbo
owner is migrated. As a result, our filter refuses to accept the new limbo
owner:

> txn_limbo.c:872 E> RAFT: rejecting PROMOTE (31) request from origin_id 2 replica_id 1 term 3. confirmed_lsn 72 > promote_lsn 0
> ER_CLUSTER_SPLIT: Cluster split detected. Backward promote LSN

because the promote request comes in with LSN = 0 while confirmed_lsn is bigger,
which in turn happens because we update the LSN on write operations only. In this
test we have two nodes, "default" and "replica". Initially the "default" node
is the limbo owner and writes some data into a sync space. Then we wait until
this sync data gets replicated (the "default" node has confirmed_lsn > 0
because it has been writing the data). Then we jump to the "replica" node and
call box.promote() there, which initiates a PROMOTE request with lsn = 0 and
sends it back to the "default" node, which was the limbo owner before and
has confirmed_lsn = 72. When this request comes in, the filtration fails
(the replica node didn't write _anything_ locally, so its confirmed_lsn = 0,
which is what we send in the promote body).
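
To make the failing comparison concrete, here is a minimal sketch of the
check that rejects the request. The struct and function names are
hypothetical stand-ins for illustration, not the real txn_limbo code; the
only thing taken from the log above is the condition
"confirmed_lsn > promote_lsn".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of the limbo state on the receiving node. */
struct limbo_sketch {
	uint32_t owner_id;	/* current limbo owner */
	int64_t confirmed_lsn;	/* last confirmed LSN known locally */
};

/* Hypothetical, simplified view of an incoming PROMOTE request. */
struct promote_req_sketch {
	uint32_t origin_id;	/* node which issued box.promote() */
	int64_t lsn;		/* confirmed LSN carried in the promote body */
};

/*
 * Returns true if the PROMOTE passes the LSN part of the filter.
 * A request whose LSN is behind the locally confirmed one is treated
 * as a "backward promote" and rejected (assumed from the log message).
 */
static bool
promote_lsn_is_acceptable(const struct limbo_sketch *limbo,
			  const struct promote_req_sketch *req)
{
	return limbo->confirmed_lsn <= req->lsn;
}
```

With the values from the test, the "default" node has confirmed_lsn = 72
and the request from "replica" carries lsn = 0, so the check returns false
and the PROMOTE is rejected. Had the replica sent lsn = 72, the same check
would pass.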

