[Tarantool-patches] [PATCH v2 1/3] gc/xlog: delay xlog cleanup until relays are subscribed

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Tue Mar 23 00:40:34 MSK 2021


Hi! Thanks for working on this!

See 8 comments below.

> Current state of `*.xlog` garbage collector can be found in
> `box.info.gc()` output. For example
> 
> ``` Lua
>  tarantool> box.info.gc()
>  ---
>    ...
>    cleanup_is_paused: false
> ```

1. Isn't it 'is_paused' instead of 'cleanup_is_paused'?

> The `cleanup_is_paused` shows if cleanup fiber is paused or not.
> ---
> diff --git a/src/box/box.cc b/src/box/box.cc
> index cc59564e1..b9fd7af00 100644
> --- a/src/box/box.cc
> +++ b/src/box/box.cc
> @@ -771,6 +771,29 @@ box_check_wal_queue_max_size(void)
>  	return size;
>  }
>  
> +static double
> +box_check_wal_cleanup_delay(void)
> +{
> +	double value = cfg_geti("wal_cleanup_delay");

2. cfg_geti() returns an integer. Please, use cfg_getd().

> +	if (value < 0) {
> +		diag_set(ClientError, ER_CFG, "wal_cleanup_delay",
> +			 "the value must be >= 0");
> +		return -1;
> +	}
> +
> +	/*
> +	 * Anonymous replicas do not require
> +	 * delay the cleanup procedure since they
> +	 * are read only.
> +	 */
> +	if (cfg_geti("replication_anon") != 0) {
> +		if (value != 0)
> +			value = 0;

3. Still it makes sense to check that if the option is set,
it is valid. Regardless of what is the anon value.

I also think the function should return the delay exactly as
it is set, not corrected by the anon value. Because these
box_check functions are not for configuring. They are for
checking a single option.

Also you should use 'replication_anon' global variable instead
of the config, which might be not installed at this moment yet.

> +	}
> +
> +	return value;
> +}
> @@ -1465,6 +1490,16 @@ box_set_wal_queue_max_size(void)
>  	return 0;
>  }
>  
> +int
> +box_set_wal_cleanup_delay(void)
> +{
> +	double delay = box_check_wal_cleanup_delay();
> +	if (delay < 0)
> +		return -1;

4. Here you could do the correction. Look at the option
value and at 'replication_anon' global variable. Not at
the anon config.

> +	gc_set_wal_cleanup_delay(delay);
> +	return 0;
> +}
> +
>  void
>  box_set_vinyl_memory(void)
>  {
> @@ -3000,7 +3035,7 @@ box_cfg_xc(void)
>  	rmean_box = rmean_new(iproto_type_strs, IPROTO_TYPE_STAT_MAX);
>  	rmean_error = rmean_new(rmean_error_strings, RMEAN_ERROR_LAST);
>  
> -	gc_init();
> +	gc_init(box_check_wal_cleanup_delay());

5. Here is the problem: you rely on replication_anon setting, but it is installed
below, after GC is initialized.

Perhaps you could set the delay to infinite and after the non-dynamic options
are done, box_set_wal_cleanup_delay() would be called by load_cfg.lua and
would lower the delay to 0 or whatever is configured.

>  	engine_init();
>  	schema_init();
>  	replication_init();
> diff --git a/src/box/gc.c b/src/box/gc.c
> index 9af4ef958..5418fd31d 100644
> --- a/src/box/gc.c
> +++ b/src/box/gc.c
> @@ -46,6 +46,7 @@
>  #include <small/rlist.h>
>  #include <tarantool_ev.h>
>  
> +#include "cfg.h"

6. Not necessary anymore.

>  #include "diag.h"
>  #include "errcode.h"
>  #include "fiber.h"
> @@ -238,6 +246,47 @@ static int
>  gc_cleanup_fiber_f(va_list ap)
>  {
>  	(void)ap;
> +
> +	/*
> +	 * Stage 1 (optional): in case if we're booting
> +	 * up with cleanup disabled lets do wait in a
> +	 * separate cycle to minimize branching on stage 2.
> +	 */
> +	if (gc.is_paused) {
> +		ev_tstamp timeout = gc.wal_cleanup_delay;

7. Please, use double for timestamps and timeouts. Because it
is used in almost all the other code working with time.

> +		while (!fiber_is_cancelled()) {
> +			ev_tstamp clock_start = fiber_clock();
> +			if (fiber_yield_timeout(timeout)) {
> +				say_info("wal/engine cleanup is resumed "
> +					 "due to timeout expiration");
> +				gc.is_paused = false;
> +				gc.delay_ref = 0;
> +				break;
> +			}
> +
> +			/*
> +			 * If a last reference is dropped
> +			 * we can exit out early.
> +			 */
> +			if (!gc.is_paused)
> +				break;
> +
> +			ev_tstamp elapsed = fiber_clock() - clock_start;
> +			timeout = gc.wal_cleanup_delay - elapsed;

8. In the previous review I said about remembering a timestamp not
accidentally. But because otherwise I can make this loop roll
forever if I update wal_cleanup_delay with some period < timeout.

For instance, it was 5. You remember clock_start. Then 2.5 secs
pass and I update it to 4.99999. elapsed is 2.5 secs. Timeout is
4.99999 - 2.5 = ~2.5. It starts again. I update the timeout to
4.99998 after another 2.5 secs. Elapsed is 2.5 secs again, and the
next timeout is about 2.5 secs. And so on.

In your patch it will happen even if I increase the timeout, not
only decrease. The fluctuations might be accidental in case I
calculate it from something in Lua, get precision loss, and call
box.cfg{} with a certain period until the instance accepts
requests.

This might be fixed if clock_start is moved out of the loop.

> +			if (timeout <= 0) {
> +				say_info("wal/engine cleanup is resumed "
> +					 "due to reconfigured timeout "
> +					 "expiration");
> +				gc.is_paused = false;
> +				gc.delay_ref = 0;
> +				break;
> +			}
> +		}
> +	}


More information about the Tarantool-patches mailing list