[Tarantool-patches] [PATCH] Trigger on vclock change

Konstantin Osipov kostja.osipov at gmail.com
Thu Nov 14 20:33:38 MSK 2019


* Georgy Kirichenko <georgy at tarantool.org> [19/11/14 20:14]:
> > I also think it's a pair of server_id, lsn, rather than entire
> > vclock - usually you know what you're waiting for, and it's only
> > one component of vclock, not all of them.
> But there are some issues
> 1. what if we wish to have a timeout
> 2. what if lsns are awaited in non-strictly increasing order
> 3. what if awaiting fiber is canceled
> The approach you suggested looks to me like a bike-shed trigger
> implementation, but one that can only be used to wait for an lsn.
> So I would like to propose that we ask Alexander Tikhonov to provide us with
> benchmark results first and then draw a conclusion about the performance impact.

Maybe you're right. But isn't the entire idea of wait_lsn()
bike-shed, as you put it, because we don't have sync replication?
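
For reference, waiting on a single (server_id, lsn) pair could look
roughly like the sketch below. It is only an illustration: all names
(vclock_sketch, waiter_set, waiter_add, waiter_cancel, vclock_advance)
are made up and are not Tarantool's API, and a fiber wakeup is replaced
with a plain state flag. A timeout or a fiber cancel both map to
waiter_cancel(), and waiters registered in non-increasing lsn order are
handled by scanning the whole set. The caller is assumed to check the
current vclock before registering a waiter.

```c
#include <assert.h>
#include <stdint.h>

enum { VCLOCK_MAX = 32, WAITERS_MAX = 16 };
enum waiter_state { W_FREE = 0, W_WAITING, W_DONE, W_CANCELLED };

/* Hypothetical vclock: one LSN per server id. */
struct vclock_sketch {
	int64_t lsn[VCLOCK_MAX];
};

/* A waiter cares about one vclock component only. */
struct lsn_waiter {
	enum waiter_state state;
	uint32_t server_id;
	int64_t lsn;
};

struct waiter_set {
	struct lsn_waiter w[WAITERS_MAX];
};

/* Register a waiter; return its slot, or -1 if the set is full. */
int
waiter_add(struct waiter_set *set, uint32_t server_id, int64_t lsn)
{
	assert(server_id < VCLOCK_MAX);
	for (int i = 0; i < WAITERS_MAX; i++) {
		if (set->w[i].state != W_FREE)
			continue;
		set->w[i].state = W_WAITING;
		set->w[i].server_id = server_id;
		set->w[i].lsn = lsn;
		return i;
	}
	return -1;
}

/* Used for both a timeout and a cancelled fiber. */
void
waiter_cancel(struct waiter_set *set, int slot)
{
	assert(slot >= 0 && slot < WAITERS_MAX);
	if (set->w[slot].state == W_WAITING)
		set->w[slot].state = W_CANCELLED;
}

/*
 * Advance one vclock component and mark every waiter whose target
 * has been reached. Returns the number of waiters woken up.
 */
int
vclock_advance(struct vclock_sketch *vc, struct waiter_set *set,
	       uint32_t server_id, int64_t lsn)
{
	assert(server_id < VCLOCK_MAX);
	assert(lsn >= vc->lsn[server_id]);
	vc->lsn[server_id] = lsn;
	int woken = 0;
	for (int i = 0; i < WAITERS_MAX; i++) {
		struct lsn_waiter *w = &set->w[i];
		if (w->state == W_WAITING && w->server_id == server_id &&
		    w->lsn <= lsn) {
			w->state = W_DONE;
			woken++;
		}
	}
	return woken;
}
```

The linear scan is chosen for brevity; a real implementation would
more likely keep the waiters of each server id ordered by lsn.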

> > > Anyway, we will need to have such a trigger in order to make the applier
> > > able to report the local replica's wal and committed vclocks in the scope
> > > of the synchronous replication issue.
> > 
> > This has to happen in WAL thread, not in main thread, and has to
> > watch relay-from-memory vclock, not async-replication vclock. And
> > it also needs to roll back the transaction locally on failure,
> > i.e. write some sort of undo records to the WAL.
> This will work in an applier which lives in the TX cord, as an applier
> processes incoming transactions through the TX. And an applier should be able
> to answer with two vclocks - committed and written ones. Yes, WAL will batch
> such vclock updates but this is still hundreds of events per second.
> Unfortunately there is no point in moving an applier to the WAL thread because
> a transaction could not be validated without TX.

OK, now I get where you're heading. I think sending acks from
tx thread has the following disadvantages:
- we mix up the "committed" event and the "written to the commit
  log" event. They become indistinguishable in the tx thread. Per
  Raft, we should send back acks as soon as we write to the local
  commit log, and when the leader gets acks from enough commit
  logs it sends another message which makes the local transaction
  commit. If you 'ack' when you commit the local transaction, how
  would you be able to roll it back on leader change or majority
  failure?

  So the event you need to be acknowledging is not the event the
  trigger in question is capturing.

- the second issue is latency. tx/wal scheduling delay can be in
  hundreds of microseconds, and this is close to networking
  delays on fast networks within the same rack/data center.
  So acknowledging commit log writes from the WAL thread will
  also speed up the leader quite a bit, since the round trip
  will be shorter.
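
To make the first point concrete, here is a minimal sketch of the two
separate events, under stated assumptions: the names (quorum_tracker,
leader_on_ack, follower_log, follower_on_write, follower_on_confirm,
follower_rollback) are hypothetical and are not Tarantool's API,
networking and the WAL itself are left out, and the quorum is assumed
to be a simple majority. The follower acks as soon as a row is in its
local commit log, commits only when the leader confirms, and can
therefore drop the written-but-unconfirmed tail on leader change.

```c
#include <assert.h>
#include <stdint.h>

enum { REPLICAS_MAX = 8 };

/* Leader side: the highest LSN each replica reported as written. */
struct quorum_tracker {
	int replica_count;	/* cluster size, leader included */
	int64_t written_lsn[REPLICAS_MAX];
	int64_t committed_lsn;
};

/*
 * Record a "written to the commit log" ack and return the highest
 * LSN which is now written on a majority of replicas.
 */
int64_t
leader_on_ack(struct quorum_tracker *t, int replica, int64_t lsn)
{
	assert(replica >= 0 && replica < t->replica_count);
	if (lsn > t->written_lsn[replica])
		t->written_lsn[replica] = lsn;
	int quorum = t->replica_count / 2 + 1;
	for (int i = 0; i < t->replica_count; i++) {
		int64_t candidate = t->written_lsn[i];
		int count = 0;
		for (int j = 0; j < t->replica_count; j++)
			count += t->written_lsn[j] >= candidate;
		if (count >= quorum && candidate > t->committed_lsn)
			t->committed_lsn = candidate;
	}
	return t->committed_lsn;
}

/* Follower side: "written" and "committed" are tracked separately. */
struct follower_log {
	int64_t written_lsn;	/* in the local commit log, acked */
	int64_t committed_lsn;	/* confirmed by the leader */
};

/* Called once a row is in the local log; returns the LSN to ack. */
int64_t
follower_on_write(struct follower_log *f, int64_t lsn)
{
	assert(lsn >= f->written_lsn);
	f->written_lsn = lsn;
	return lsn;
}

/* Called when the leader's confirmation message arrives. */
void
follower_on_confirm(struct follower_log *f, int64_t lsn)
{
	assert(lsn <= f->written_lsn);
	if (lsn > f->committed_lsn)
		f->committed_lsn = lsn;
}

/*
 * Leader change or majority failure: drop the unconfirmed tail.
 * Returns the number of LSNs rolled back.
 */
int64_t
follower_rollback(struct follower_log *f)
{
	int64_t undone = f->written_lsn - f->committed_lsn;
	f->written_lsn = f->committed_lsn;
	return undone;
}
```

If the follower acked only at commit time, written_lsn and
committed_lsn would always be equal and follower_rollback() would
have nothing left to undo.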

To sum up, I still think you should not use this trigger to
acknowledge commit log writes. Better have a separate socket for
this altogether, or move the write end of the existing socket to
the WAL thread, while keeping the read end where it is now, in
tx/applier.

-- 
Konstantin Osipov, Moscow, Russia

