[Tarantool-patches] [PATCH] Trigger on vclock change

Georgy Kirichenko georgy at tarantool.org
Thu Nov 14 20:13:43 MSK 2019


On Thursday, November 14, 2019 6:26:56 PM MSK Konstantin Osipov wrote:
> * Georgy Kirichenko <kirichenkoga at gmail.com> [19/11/14 17:11]:
> > On Thursday, November 14, 2019 4:44:22 PM MSK Konstantin Osipov wrote:
> > > * Maria <maria.khaydich at tarantool.org> [19/11/14 15:59]:
> > > > This patch implements replication.on_vclock
> > > > trigger that can be useful for programming
> > > > shard-systems with redundancy.
> > > 
> > > 3808 is about being able to wait for an lsn.
> > > 
> > > Using a trigger for *waiting* is called busy waiting, and is a cpu
> > > hog, especially in a performance-critical place like the update of
> > > a replica vclock, which can happen a hundred times a second.
> > > 
> > > Why not implement a way to wait for an lsn instead?
> > 
> > Please explain your proposal in more detail.
> > Do you wish to implement a hard-coded `handler` so that each time a
> > replica vclock is updated, this handler compares the updated vclock
> > against a set of replica_id:lsn pairs organized in a list, tree
> > or something else? And if a comparison matches, the corresponding
> > handler is called?
> Yes, quite simply wait_lsn() could add the server_id, lsn that is
> being waited for to a sorted list, and whenever we update
> replicaset vclock for this lsn we also look at top of the list, if
> it is not empty, and if the current lsn is greater than the top,
> we could pop the value from the list and send a notification to
> the waiter.
> 
> I also think it's a pair of server_id, lsn, rather than entire
> vclock - usually you know what you're waiting for, and it's only
> one component of vclock, not all of them.
But there are some issues:
1. What if we wish to have a timeout?
2. What if lsns are waited for in non-strictly increasing order?
3. What if the awaiting fiber is canceled?
The approach you suggested looks to me like a bike-shed trigger
implementation, except that it can only be used to wait for an lsn.
So I would like to propose that we ask Alexander Tikhonov to provide us with
benchmark results first and then draw a conclusion about the performance impact.
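
To make the three questions above concrete, here is a minimal sketch of the
sorted waiter list from the quoted proposal. It is plain Python rather than
Tarantool code, and every name in it (LsnWaiters, wait, cancel,
on_vclock_update) is hypothetical. It assumes a waiter becomes ready once the
replica's lsn reaches the awaited value, keeps one min-heap per replica_id so
that waits may be registered in any order, and treats a timeout or a fiber
cancellation as marking the entry dead and dropping it lazily.

```python
import heapq
import itertools


class LsnWaiters:
    """Hypothetical registry of (replica_id, lsn) waiters; not Tarantool code."""

    def __init__(self):
        # replica_id -> min-heap of (lsn, seq, waiter)
        self._heaps = {}
        # Tie-breaker so that two waiter dicts are never compared directly.
        self._seq = itertools.count()

    def wait(self, replica_id, lsn, on_ready):
        """Register a waiter and return a handle that can be cancelled."""
        waiter = {"on_ready": on_ready, "active": True}
        heap = self._heaps.setdefault(replica_id, [])
        # The heap keeps the smallest lsn on top, so registration order
        # does not matter (issue 2).
        heapq.heappush(heap, (lsn, next(self._seq), waiter))
        return waiter

    @staticmethod
    def cancel(waiter):
        """Timeout or cancellation (issues 1 and 3): mark the entry dead.

        The entry stays in the heap until its lsn is reached.
        """
        waiter["active"] = False

    def on_vclock_update(self, replica_id, lsn):
        """Call on every vclock change; returns how many waiters fired.

        When nothing is ready this is one dict lookup and one comparison.
        """
        heap = self._heaps.get(replica_id)
        fired = 0
        while heap and heap[0][0] <= lsn:
            _, _, waiter = heapq.heappop(heap)
            if waiter["active"]:
                waiter["active"] = False
                waiter["on_ready"](replica_id, lsn)
                fired += 1
        return fired
```

Even in this reduced form the structure is a list of callbacks run on a
vclock update, which is the similarity to a trigger list that I mean above.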

> 
> Going forward I think one is better off using synchronous
> replication, not wait-lsn, since wait_lsn doesn't roll back the
> transaction on failure.
> 
> Why did you decide to do this ticket at all?
Because some applications (like sls) do not require every transaction to be
replicated synchronously. Also, this approach is more flexible.
> 
> > Anyway, we will need such a trigger in order to make the applier able
> > to report the local replica's WAL and committed vclocks in the scope of
> > the synchronous replication issue.
> 
> This has to happen in WAL thread, not in main thread, and has to
> watch relay-from-memory vclock, not async-replication vclock. And
> it also needs to roll back the transaction locally on failure,
> i.e. write some sort of undo records to the WAL.
This will work in an applier, which lives in the TX cord, as an applier
processes incoming transactions through the TX. And an applier should be able
to answer with two vclocks: the committed one and the written one. Yes, WAL will
batch such vclock updates, but this is still hundreds of events per second.
Unfortunately, there is no point in moving an applier to the WAL thread, because
a transaction cannot be validated without TX.
