[Tarantool-patches] [PATCH v2 1/5] box: introduce matrix clock

Serge Petrenko sergepetrenko at tarantool.org
Thu Mar 19 14:59:24 MSK 2020


> On 19 March 2020, at 14:56, Konstantin Osipov <kostja.osipov at gmail.com> wrote:
> 
> * Serge Petrenko <sergepetrenko at tarantool.org> [20/03/19 14:29]:
>> First of all, let me describe the issue with the current WAL GC:
>> it tracks consumers (i.e. remote replicas) by their vclock signature,
>> which is the sum of all vclock components.
>> It has always been wrong (since the introduction of vclocks, at least):
>> Say, you have 2 masters, A and B with ids 1 and 2 respectively, and a replica C with id 3.
>> The example will be a little synthetic, but it illustrates the problem:
>> Say C replicates from both A and B, and there is no replication between A and B (say, the
>> instances were reconfigured to not replicate from each other).
>> Now, say replica C has followed A and B to vclock {1:5, 2:13}. At the same time, A has lsn 10
>> and B has lsn 15. A and B do not know about each other’s changes, so A’s vclock is {1:10} and
>> B’s vclock is {2:15}. Now imagine A does a snapshot and creates a new xlog with signature 10.
>> A’s directory will look like: 00…000.xlog 00…010.snap 00…010.xlog
>> Replica C reports its vclock {1:5, 2:13} to A, A uses the vclock to update the corresponding GC
>> consumer. Since signatures are used, the GC consumer is assigned a signature of 13 + 5 = 18.
>> This is greater than the signature of the last xlog on A (10), so the previous xlog (00…000.xlog) can be
>> deleted (at least A assumes it can be). In fact, the replica still needs 00…000.xlog, because it contains
>> the rows corresponding to vclocks {1:6} through {1:10}, which haven’t been replicated yet.
>> 
>> If GC consumers used vclocks instead of vclock signatures, this problem wouldn’t arise.
>> The replica would report its vclock {1:5, 2:13}. That vclock is NOT strictly greater than A’s most recent
>> xlog vclock ({1:10}), so the previous log is kept until the replica reports a vclock {1:10, 2:something}
>> (or {1:11, …} and so on).
> 
> This explanation belongs in the commit comment.
> 

Ok, will amend.
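
To make the scenario above easy to check, here is a minimal sketch of the two
comparisons. It is a toy Python model, not Tarantool's C code: the helper names
`signature` and `vclock_ge` are made up for illustration, a vclock is modelled
as a plain dict of {replica id: lsn}, and a missing component is assumed to
count as 0.

```python
def signature(vclock):
    """Sum of all vclock components (the value compared when signatures are used)."""
    return sum(vclock.values())


def vclock_ge(a, b):
    """True if a >= b in every component; a missing component counts as 0."""
    return all(a.get(replica_id, 0) >= lsn for replica_id, lsn in b.items())


# Values taken from the example: replica C and the newest xlog on A.
replica_c = {1: 5, 2: 13}
newest_xlog_on_a = {1: 10}

# Signature comparison: 18 >= 10, so the older xlog looks safe to delete.
by_signature = signature(replica_c) >= signature(newest_xlog_on_a)

# Component-wise comparison: 5 < 10 in component 1, so the older xlog is kept.
by_vclock = vclock_ge(replica_c, newest_xlog_on_a)

print(by_signature)  # True
print(by_vclock)     # False
```

With the component-wise check, the older xlog only becomes collectable once the
replica reports something like {1:10, 2:13}.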

> 
> 
> -- 
> Konstantin Osipov, Moscow, Russia

--
Serge Petrenko
sergepetrenko at tarantool.org

