[Tarantool-patches] [PATCH v3 2/4] recovery: allow to ignore rows coming from a certain instance

Serge Petrenko sergepetrenko at tarantool.org
Wed Feb 19 11:43:11 MSK 2020


> On 18 Feb 2020, at 22:03, Konstantin Osipov <kostja.osipov at gmail.com> wrote:
> 
> * Serge Petrenko <sergepetrenko at tarantool.org> [20/02/18 20:38]:
>> Prerequisite #4739
> 
> This is a strange way to mute rows from self. Why not set vclock
> component to infinity as I suggested multiple times? Why not
> respond to me with an objection if my suggestion cannot be done?

I responded with a patch, so now we can discuss both your suggestion and mine.

If I understood you correctly, you suggested setting the replica’s own LSN to infinity
(on the master’s side), so that recovery on the master’s side would skip the replica’s rows.

I tried your approach and initialized recovery with vclock[replica_id] = INT64_MAX.
This does let you skip the replica’s rows, but it breaks the vclock signature, which
overflows immediately. Vclock signatures are used to order gc consumers, and the gc
consumer corresponding to a replica gets its vclock from relay recovery.
OK, you could suggest resetting vclock[replica_id] to some meaningful value, but where
would you get this value from? You cannot set gc message vclock[replica_id] =
replica ack vclock[replica_id], because the replica may have some rows you still
don’t have. Then the replica ack vclock signature may get too big, and you will delete
other logs containing your changes.
You also cannot set gc vclock[replica_id] to 0, because it would hold logs that the
replica no longer needs for too long.
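
To illustrate the overflow, here is a small Python model (not Tarantool code).
It assumes the signature is the sum of all vclock components kept in a signed
64-bit integer, and it assumes the sum wraps around on overflow. The names
wrap_int64 and signature are made up for this sketch.

```python
INT64_MAX = 2**63 - 1


def wrap_int64(x):
    """Model two's-complement wraparound of a signed 64-bit integer (assumption)."""
    return (x + 2**63) % 2**64 - 2**63


def signature(vclock):
    """Sum of all vclock components, as a signed 64-bit sum would behave."""
    total = 0
    for lsn in vclock.values():
        total = wrap_int64(total + lsn)
    return total


# Instance 1 is the master, instance 2 is the replica (hypothetical ids).
normal = {1: 100, 2: 50}
muted = {1: 100, 2: INT64_MAX}  # replica component set to "infinity"

sig_normal = signature(normal)
sig_muted = signature(muted)

print(sig_normal)  # 150
print(sig_muted)   # negative: the sum no longer fits in 64 bits
```

Once the sum wraps, ordering gc consumers by signature stops being meaningful.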

This is why I decided to implement the skipping mechanism from this patch.
It allows tracking the exact vclock of the last recovered row, and it allows
skipping the replica’s rows, just like we wanted.
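
Roughly, the idea can be modelled like this in Python. This is only a sketch,
not the code from the patch: the recover function, the ignore_id parameter and
the (replica_id, lsn, body) row layout are all invented for illustration, and
I am assuming here that the vclock is advanced for skipped rows as well.

```python
def recover(rows, vclock, ignore_id=None):
    """Replay rows in order, skipping those that originate from ignore_id.

    The vclock follows every row that is read (assumption), so it always
    holds real LSNs and its signature stays small.
    """
    delivered = []
    for replica_id, lsn, body in rows:
        vclock[replica_id] = lsn
        if replica_id == ignore_id:
            continue  # the row came from the ignored instance: do not send it
        delivered.append((replica_id, lsn, body))
    return delivered


rows = [(1, 1, "a"), (2, 1, "b"), (1, 2, "c")]
vclock = {}
sent = recover(rows, vclock, ignore_id=2)

print(sent)    # [(1, 1, 'a'), (1, 2, 'c')]
print(vclock)  # {1: 2, 2: 1}
```

No component is ever set to INT64_MAX, so the vclock handed to the gc consumer
is an exact position in the log.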


--
Serge Petrenko
sergepetrenko at tarantool.org



