[Tarantool-patches] [RFC] gc/xlog: delay xlog cleanup until relays are subscribed

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Thu Mar 18 23:54:12 MSK 2021


On 18.03.2021 21:45, Cyrill Gorcunov wrote:
> On Thu, Mar 18, 2021 at 09:36:33PM +0100, Vladislav Shpilevoy wrote:
>>>>    b) If replicas are not connected and timeout is
>>>>       expired we kick the cleanup fiber;
>>>
>>> I mean this.
>>
>> Then it should have 'replication_' prefix, not 'wal_'. Because
>> it is ignored if replicas connect before the timeout expires.
> 
> Replication is one of the reasons, while the main gamer here is
> "wal". In the series I sent recently I named it "wal_cleanup_delay".
> In the future we might introduce some topology detector, as you've
> been suggesting, so I think it is better not to stick to the
> "replication" prefix.

What you said is just another argument for the 'replication' prefix,
because topology is not about WAL either.

WAL is just a container; it is not a gamer. What you fix here is GC,
which is driven by replication. Replication forces all the
decisions.

If WAL owned the option, it would have to work regardless of the
topology, i.e. keep the logs for the entire timeout even if all
replicas are connected.

But it is the other way around: the decision on when to drop the logs
is made by replication, and it is replication that gives the command
"now you can delete the old logs".

If you want the 'wal' prefix so badly, the name should at least state
that this is not an exact timeout. It is a max timeout, which may end
much earlier for several reasons, e.g. running out of disk space or a
full replication sync.
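
The difference between the two readings of the option can be shown with
a minimal Python sketch. This is not Tarantool code: the function names
and parameters are hypothetical and only model the semantics discussed
in this thread.

```python
def strict_timeout_expired(elapsed, delay):
    """WAL-owned reading (hypothetical): logs are kept for the whole
    delay, no matter what the topology looks like."""
    return elapsed >= delay


def may_start_cleanup(elapsed, max_delay, all_replicas_subscribed,
                      out_of_disk_space=False):
    """Replication-driven reading (hypothetical): max_delay is only an
    upper bound on how long the cleanup is postponed."""
    if out_of_disk_space:
        # Assumed early exit: a lack of disk space ends the wait.
        return True
    if all_replicas_subscribed:
        # Full replication sync: replication allows dropping old logs.
        return True
    # Otherwise wait until the max timeout expires.
    return elapsed >= max_delay


# 10 seconds after start, with a one-hour delay configured:
print(strict_timeout_expired(10, 3600))                           # False
print(may_start_cleanup(10, 3600, all_replicas_subscribed=True))  # True
```

With all replicas subscribed, the second function allows the cleanup
long before the delay has passed, which is why the value behaves as a
max timeout rather than an exact one.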

