[tarantool-patches] Re: [PATCH v2 1/4] relay: adjust gc state on relay status update

Georgy Kirichenko georgy at tarantool.org
Thu Sep 19 16:14:07 MSK 2019


On Thursday, September 19, 2019 12:45:47 AM MSK Vladislav Shpilevoy wrote:
> Thanks for the patch!

Thanks for the review. I'll try to explain the patch here.
A relay collects ACK's from replica. Before a parallel applier was implemented 
there was one ACK packet per transaction. And it was to expensive to update gc 
state for each transaction. To overwhelm this issue a relay used a trigger 
which fires when recovery finished with a file. So, when a relay received a 
close-log event, it waits for the first ACK greather than 'closed' vclock and 
then advances a gc.  In case of in-memory replication we definitely couldn't 
rely on file boundaries and on_close trigger. Because we already have parallel 
applier we shouldn't have to much ACK packets I decided to not to use 
on_close_log more and pass ACK direct to garbage collector.

In other words, a relay still continues gc advancing in both modes (file or 
memory) but does it after each ACK. Also this required to change vclock 
comparison because gc vclocks are not aligned by local xlog vclock timeline.

Yes, this changed the gc behavior - now gc keeps only local changes (because 
an INSTANCE_ID is used). Though, this could/would be changed back when we move 
a relay to the wal thread (what is needed for synchronous replication 
purposes)

> 
> [NOTE: review is not finished yet, I will send more emails]
> 
> On 18/09/2019 11:36, Georgy Kirichenko wrote:
> > Don't use on_close_log trigger to track xlog file boundaries.
> 
> How xlog boundaries were used before? Please, take into account,
> that I am not familiar with relay and gc code, as well as all
> other members of the team. Explain role of on_close_log and xlog
> boundaries here in more details, please.
> 
> > As we
> > intend implement in-memory replication relay could have no more xlog
> > file operations and couldn't rely on previous trigger invocations.
> 
> In the 0-email to this patchset you said, that relay will be able to
> use files as a source. Cite:
> 
>     "When relay went out of wal memory window it turns back to file mode."
> 
> How is it going to keep these files not deleted if it is not linked
> with GC anymore? Or it is linked? I don't understand.
> 
> > Now the consumer state is advanced together with relay vclock.
> 
> And what was happening before? As far as I understand, GC consumer state
> was always updated with relay vclock, and other vclocks. There is no other
> way to update a GC state. You always need to pass vclock.
> 
> > After parallel applier implementation relay wouldn't receive an ACK packet
> > for each transaction (because an applier groups them) so it should not be
> > too expensive to advance gc on each relay vclock update.
> > 
> > Prerequisites: #3794
> 
> We use 'Part of' for such patches, and don't write ':'.
> 
> > ---
> > 
> >  src/box/gc.c     |   7 +--
> >  src/box/relay.cc | 108 +----------------------------------------------
> >  2 files changed, 6 insertions(+), 109 deletions(-)
> > 
> > diff --git a/src/box/gc.c b/src/box/gc.c
> > index e0df92473..649932801 100644
> > --- a/src/box/gc.c
> > +++ b/src/box/gc.c
> > @@ -57,6 +57,7 @@
> > 
> >  #include "engine.h"		/* engine_collect_garbage() */
> >  #include "wal.h"		/* wal_collect_garbage() */
> >  #include "checkpoint_schedule.h"
> > 
> > +#include "replication.h"
> > 
> >  struct gc_state gc;
> > 
> > @@ -72,9 +73,9 @@ gc_checkpoint_fiber_f(va_list);
> > 
> >  static inline int
> >  gc_consumer_cmp(const struct gc_consumer *a, const struct gc_consumer *b)
> >  {
> > 
> > -	if (vclock_sum(&a->vclock) < vclock_sum(&b->vclock))
> > +	if (vclock_get(&a->vclock, instance_id) < vclock_get(&b->vclock,
> > instance_id))> 
> >  		return -1;
> > 
> > -	if (vclock_sum(&a->vclock) > vclock_sum(&b->vclock))
> > +	if (vclock_get(&a->vclock, instance_id) > vclock_get(&b->vclock,
> > instance_id))> 
> >  		return 1;
> 
> Why 'sum' was used before your patch? If we can just always use 'get', then
> why is not it a separate patch? The same for the hunk below.
> 
> >  	if ((intptr_t)a < (intptr_t)b)
> >  	
> >  		return -1;
> > 
> > @@ -562,7 +563,7 @@ gc_consumer_advance(struct gc_consumer *consumer,
> > const struct vclock *vclock)> 
> >  	 */
> >  	
> >  	struct gc_consumer *next = gc_tree_next(&gc.consumers, consumer);
> >  	bool update_tree = (next != NULL &&
> > 
> > -			    signature >= vclock_sum(&next->vclock));
> > +			    vclock_get(vclock, instance_id) >= 
vclock_get(&next->vclock,
> > instance_id));> 
> >  	if (update_tree)
> >  	
> >  		gc_tree_remove(&gc.consumers, consumer);

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: This is a digitally signed message part.
URL: <https://lists.tarantool.org/pipermail/tarantool-patches/attachments/20190919/53006d27/attachment.sig>


More information about the Tarantool-patches mailing list