[Tarantool-patches] [PATCH v2 1/3] gc/xlog: delay xlog cleanup until relays are subscribed

Cyrill Gorcunov gorcunov at gmail.com
Tue Mar 23 10:28:46 MSK 2021


On Mon, Mar 22, 2021 at 10:40:34PM +0100, Vladislav Shpilevoy wrote:
> Hi! Thanks for working on this!
> 
> See 8 comments below.
> 
> > Current state of `*.xlog` garbage collector can be found in
> > `box.info.gc()` output. For example
> > 
> > ``` Lua
> >  tarantool> box.info.gc()
> >  ---
> >    ...
> >    cleanup_is_paused: false
> > ```
> 
> 1. Isn't it 'is_paused' instead of 'cleanup_is_paused'?

Yeah, sorry. Thanks!

> >  
> > +static double
> > +box_check_wal_cleanup_delay(void)
> > +{
> > +	double value = cfg_geti("wal_cleanup_delay");
> 
> 2. cfg_geti() returns an integer. Please, use cfg_getd().

I did it on purpose: we accept time in whole seconds,
not milliseconds, etc. But sure, no problem, I will change
it to cfg_getd.

> > +
> > +	/*
> > +	 * Anonymous replicas do not require
> > +	 * delay the cleanup procedure since they
> > +	 * are read only.
> > +	 */
> > +	if (cfg_geti("replication_anon") != 0) {
> > +		if (value != 0)
> > +			value = 0;
> 
> 3. Still it makes sense to check that if the option is set,
> it is valid. Regardless of what is the anon value.

OK

> 
> I also think the function should return the delay exactly as
> it is set, not corrected by the anon value. Because these
> box_check functions are not for configuring. They are for
> checking a single option.

OK

> 
> Also you should use 'replication_anon' global variable instead
> of the config, which might be not installed at this moment yet.

What would happen if one sets both 'wal_cleanup_delay' and
'replication_anon' in the config at once? Which C replication_anon
value would I be reading? The C replication_anon is set after
the cfg procedure completes, so since I operate on values obtained
from the Lua level I need to use cfg_geti("replication_anon"), because
at this moment only the Lua level is consistent and replication_anon
may hold a value from a previous box.cfg call.

> >  
> > +int
> > +box_set_wal_cleanup_delay(void)
> > +{
> > +	double delay = box_check_wal_cleanup_delay();
> > +	if (delay < 0)
> > +		return -1;
> 
> 4. Here you could do the correction. Look at the option
> value and at 'replication_anon' global variable. Not at
> the anon config.

I need to think about this point, thanks!
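
For comments 3 and 4, here is a rough sketch of the split as I read it:
the check only validates the option, and the setter applies the
anonymous-replica correction. All names below are hypothetical
stand-ins, not code from the patch, and where `is_anon` should come
from (the config or the global variable) is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical check step: validate the option and return it
 * exactly as configured, or -1 if it is invalid.
 */
static double
check_wal_cleanup_delay(double value)
{
	return value < 0 ? -1 : value;
}

/*
 * Hypothetical set step: an already validated delay is
 * corrected to 0 for an anonymous replica.
 */
static double
effective_wal_cleanup_delay(double checked, bool is_anon)
{
	assert(checked >= 0);
	return is_anon ? 0 : checked;
}
```

With this split an invalid value is rejected even when the replica
is anonymous.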

> 
> > +	gc_set_wal_cleanup_delay(delay);
> > +	return 0;
> > +}
> > +
> >  void
> >  box_set_vinyl_memory(void)
> >  {
> > @@ -3000,7 +3035,7 @@ box_cfg_xc(void)
> >  	rmean_box = rmean_new(iproto_type_strs, IPROTO_TYPE_STAT_MAX);
> >  	rmean_error = rmean_new(rmean_error_strings, RMEAN_ERROR_LAST);
> >  
> > -	gc_init();
> > +	gc_init(box_check_wal_cleanup_delay());
> 
> 5. Here is the problem: you rely on replication_anon setting, but it is installed
> below, after GC is initialized.

Exactly, that's why I use raw cfg_geti("replication_anon") instead of
global replication_anon variable.

> Perhaps you could set the delay to infinite and after the non-dynamic options
> are done, box_set_wal_cleanup_delay() would be called by load_cfg.lua and
> would lower the delay to 0 or whatever is configured.
> 
> >  	engine_init();
> >  	schema_init();
> >  	replication_init();
> > diff --git a/src/box/gc.c b/src/box/gc.c
> > index 9af4ef958..5418fd31d 100644
> > --- a/src/box/gc.c
> > +++ b/src/box/gc.c
> > @@ -46,6 +46,7 @@
> >  #include <small/rlist.h>
> >  #include <tarantool_ev.h>
> >  
> > +#include "cfg.h"
> 
> 6. Not necessary anymore.

+1

> >  gc_cleanup_fiber_f(va_list ap)
> >  {
> >  	(void)ap;
> > +
> > +	/*
> > +	 * Stage 1 (optional): in case if we're booting
> > +	 * up with cleanup disabled lets do wait in a
> > +	 * separate cycle to minimize branching on stage 2.
> > +	 */
> > +	if (gc.is_paused) {
> > +		ev_tstamp timeout = gc.wal_cleanup_delay;
> 
> 7. Please, use double for timestamps and timeouts. Because it
> is used in almost all the other code working with time.

Sure

> > +		while (!fiber_is_cancelled()) {
> > +			ev_tstamp clock_start = fiber_clock();
> > +			if (fiber_yield_timeout(timeout)) {
> > +				say_info("wal/engine cleanup is resumed "
> > +					 "due to timeout expiration");
> > +				gc.is_paused = false;
> > +				gc.delay_ref = 0;
> > +				break;
> > +			}
> > +
> > +			/*
> > +			 * If a last reference is dropped
> > +			 * we can exit out early.
> > +			 */
> > +			if (!gc.is_paused)
> > +				break;
> > +
> > +			ev_tstamp elapsed = fiber_clock() - clock_start;
> > +			timeout = gc.wal_cleanup_delay - elapsed;
> 
> 8. In the previous review I said about remembering a timestamp not
> accidentally. But because otherwise I can make this loop roll
> forever if I update wal_cleanup_delay with some period < timeout.
> 
> For instance, it was 5. You remember clock_start. Then 2.5 secs
> pass and I update it to 4.99999. elapsed is 2.5 secs. Timeout is
> 4.99999 - 2.5 = ~2.5. It starts again. I update the timeout to
> 4.99998 after another 2.5 secs. Elapsed is 2.5 secs again, and the
> next timeout is about 2.5 secs. And so on.
> 
> In your patch it will happen even if I increase the timeout, not
> only decrease. The fluctuations might be accidental in case I
> calculate it from something in Lua, get precision loss, and call
> box.cfg{} with a certain period until the instance accepts
> requests.
> 
> This might be fixed if clock_start is moved out of the loop.

What you're talking about is FP rounding errors, and you would have
to work really hard to get into such a loop. But I see your point
and will update. Btw, the use of an FP value for seconds is a big
mistake in architecture, not only here but all over the code,
and I think we should get rid of this very strange approach
(and I must confess I don't understand *why* on earth
FP is used for delays in the ev library).
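
For the loop in comment 8, a small stand-alone model of the two
variants, assuming the fiber is woken every `period` seconds by a
config update while the cleanup stays paused. `count_wakeups()` and its
parameters are made up for illustration; this is not the gc.c code.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Count how many wakeups happen before the wait expires.
 * restart_clock == true models taking clock_start inside the
 * loop (the v2 patch); false models taking it once before it.
 */
static int
count_wakeups(double delay, double period, bool restart_clock,
	      int max_wakeups)
{
	assert(delay >= 0 && period > 0);
	double now = 0;
	double clock_start = now;
	double timeout = delay;
	int wakeups = 0;
	while (wakeups < max_wakeups) {
		if (restart_clock)
			clock_start = now;
		/* The timeout fires before the next wakeup. */
		if (period >= timeout)
			return wakeups;
		/* Woken early: recompute the remaining time. */
		now += period;
		wakeups++;
		timeout = delay - (now - clock_start);
	}
	return wakeups;
}
```

With delay = 5 and period = 2.4 the fixed clock_start variant expires
after two wakeups, while the restarted one never expires and runs into
max_wakeups.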

	Cyrill

