[tarantool-patches] Re: [PATCH v2 1/4] relay: adjust gc state on relay status update

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Thu Sep 19 00:45:47 MSK 2019


Thanks for the patch!

[NOTE: review is not finished yet, I will send more emails]

On 18/09/2019 11:36, Georgy Kirichenko wrote:
> Don't use on_close_log trigger to track xlog file boundaries.

How xlog boundaries were used before? Please, take into account,
that I am not familiar with relay and gc code, as well as all
other members of the team. Explain role of on_close_log and xlog
boundaries here in more details, please.

> As we
> intend implement in-memory replication relay could have no more xlog
> file operations and couldn't rely on previous trigger invocations.

In the 0-email to this patchset you said, that relay will be able to
use files as a source. Cite:

    "When relay went out of wal memory window it turns back to file mode."

How is it going to keep these files not deleted if it is not linked
with GC anymore? Or it is linked? I don't understand.

> Now the consumer state is advanced together with relay vclock.

And what was happening before? As far as I understand, GC consumer state
was always updated with relay vclock, and other vclocks. There is no other
way to update a GC state. You always need to pass vclock.

> After parallel applier implementation relay wouldn't receive an ACK packet for
> each transaction (because an applier groups them) so it should not be
> too expensive to advance gc on each relay vclock update.
> 
> Prerequisites: #3794

We use 'Part of' for such patches, and don't write ':'.

> ---
>  src/box/gc.c     |   7 +--
>  src/box/relay.cc | 108 +----------------------------------------------
>  2 files changed, 6 insertions(+), 109 deletions(-)
> 
> diff --git a/src/box/gc.c b/src/box/gc.c
> index e0df92473..649932801 100644
> --- a/src/box/gc.c
> +++ b/src/box/gc.c
> @@ -57,6 +57,7 @@
>  #include "engine.h"		/* engine_collect_garbage() */
>  #include "wal.h"		/* wal_collect_garbage() */
>  #include "checkpoint_schedule.h"
> +#include "replication.h"
>  
>  struct gc_state gc;
>  
> @@ -72,9 +73,9 @@ gc_checkpoint_fiber_f(va_list);
>  static inline int
>  gc_consumer_cmp(const struct gc_consumer *a, const struct gc_consumer *b)
>  {
> -	if (vclock_sum(&a->vclock) < vclock_sum(&b->vclock))
> +	if (vclock_get(&a->vclock, instance_id) < vclock_get(&b->vclock, instance_id))
>  		return -1;
> -	if (vclock_sum(&a->vclock) > vclock_sum(&b->vclock))
> +	if (vclock_get(&a->vclock, instance_id) > vclock_get(&b->vclock, instance_id))
>  		return 1;

Why 'sum' was used before your patch? If we can just always use 'get', then
why is not it a separate patch? The same for the hunk below.

>  	if ((intptr_t)a < (intptr_t)b)
>  		return -1;
> @@ -562,7 +563,7 @@ gc_consumer_advance(struct gc_consumer *consumer, const struct vclock *vclock)
>  	 */
>  	struct gc_consumer *next = gc_tree_next(&gc.consumers, consumer);
>  	bool update_tree = (next != NULL &&
> -			    signature >= vclock_sum(&next->vclock));
> +			    vclock_get(vclock, instance_id) >= vclock_get(&next->vclock, instance_id));
>  
>  	if (update_tree)
>  		gc_tree_remove(&gc.consumers, consumer);




More information about the Tarantool-patches mailing list