[tarantool-patches] Re: [PATCH v2 1/4] relay: adjust gc state on relay status update

Fri Sep 20 10:26:50 MSK 2019

* Georgy Kirichenko <georgy at tarantool.org> [19/09/19 16:15]:
> On Thursday, September 19, 2019 12:45:47 AM MSK Vladislav Shpilevoy wrote:
> > Thanks for the patch!
> 
> Thanks for the review. I'll try to explain the patch here.
> A relay collects ACKs from a replica. Before the parallel applier was
> implemented there was one ACK packet per transaction, and it was too expensive
> to update the gc state for each transaction. To work around this issue, a relay
> used a trigger which fires when recovery finishes with a file. So, when a relay
> received a close-log event, it waited for the first ACK greater than the
> 'closed' vclock and then advanced gc. In the case of in-memory replication we
> definitely can't rely on file boundaries and the on_close trigger. Because we
> already have the parallel applier, we shouldn't have too many ACK packets, so I
> decided not to use on_close_log any more and to pass ACKs directly to the
> garbage collector.
> 
> In other words, a relay still advances gc in both modes (file or memory) but
> does it after each ACK. This also required changing the vclock comparison
> because gc vclocks are not aligned with the local xlog vclock timeline.
> 
> Yes, this changed the gc behavior: now gc keeps only local changes (because
> an INSTANCE_ID is used). Though, this could/would be changed back when we move
> a relay to the wal thread (which is needed for synchronous replication).

By tracking only local changes you make it impossible to recover a
replica which has fallen behind in a 3-node setup. Essentially
each replica is now responsible for keeping track of its own
changes only - which means it will happily delete xlogs with
changes of a lost peer even if these changes are not synced to
other peers yet.
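
To illustrate, here is a toy model of the problem. It is not Tarantool's
actual gc code: the function names, the row layout and the numbers are all
made up for the example. A vclock is modelled as a map from replica id to the
last LSN seen from that replica, and an xlog as a list of (replica id, LSN)
rows.

```python
def can_delete_local_only(xlog_rows, ack, instance_id):
    """Hypothetical check: only this instance's own rows must be acked."""
    return all(lsn <= ack.get(rid, 0)
               for rid, lsn in xlog_rows
               if rid == instance_id)


def can_delete_full(xlog_rows, ack):
    """Hypothetical check: every row in the xlog must be acked."""
    return all(lsn <= ack.get(rid, 0) for rid, lsn in xlog_rows)


# An xlog on node 1 holding its own rows plus rows relayed from node 3,
# which has since been lost.
xlog_rows = [(1, 9), (3, 3), (1, 10), (3, 8)]

# Node 2 has fallen behind: it has all of node 1's rows, but has seen
# node 3's changes only up to LSN 2.
ack_from_node2 = {1: 10, 2: 5, 3: 2}

local_only = can_delete_local_only(xlog_rows, ack_from_node2, instance_id=1)
full = can_delete_full(xlog_rows, ack_from_node2)

print("local-only check allows deletion:", local_only)
print("full vclock check allows deletion:", full)
```

In this model the local-only check lets node 1 drop the xlog, although it is
the only remaining copy of node 3's rows 3..8 that node 2 still needs. The
full vclock check keeps the file.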

-- 
Konstantin Osipov, Moscow, Russia