[Tarantool-patches] [PATCH v2 1/3] gc/xlog: delay xlog cleanup until relays are subscribed
Cyrill Gorcunov
gorcunov at gmail.com
Tue Mar 23 10:28:46 MSK 2021
On Mon, Mar 22, 2021 at 10:40:34PM +0100, Vladislav Shpilevoy wrote:
> Hi! Thanks for working on this!
>
> See 8 comments below.
>
> > Current state of `*.xlog` garbage collector can be found in
> > `box.info.gc()` output. For example
> >
> > ``` Lua
> > tarantool> box.info.gc()
> > ---
> > ...
> > cleanup_is_paused: false
> > ```
>
> 1. Isn't it 'is_paused' instead of 'cleanup_is_paused'?
Yeah, sorry. Thanks!
> >
> > +static double
> > +box_check_wal_cleanup_delay(void)
> > +{
> > + double value = cfg_geti("wal_cleanup_delay");
>
> 2. cfg_geti() returns an integer. Please, use cfg_getd().
I did it on a purpose: we accepts time in whole seconds,
not milliseconds and etc. But sure, no problem will change
it to cfg_getd.
> > +
> > + /*
> > + * Anonymous replicas do not require
> > + * delay the cleanup procedure since they
> > + * are read only.
> > + */
> > + if (cfg_geti("replication_anon") != 0) {
> > + if (value != 0)
> > + value = 0;
>
> 3. Still it makes sense to check that if the option is set,
> it is valid. Regardless of what is the anon value.
OK
>
> I also think the function should return the delay exactly as
> it is set, not corrected by the anon value. Because these
> box_check functions are not for configuring. They are for
> checking a single option.
OK
>
> Also you should use 'replication_anon' global variable instead
> of the config, which might be not installed at this moment yet.
What would happen if one setup both 'wal_cleanup_delay' and
'replication_anon' in config at once. Which C's replication_anon
value I will be reading? The C's replication_anon is set after
the cfg procedure complete, so since I operate on values obrained
from Lua level I need to use cfg_geti("replication_anon") because
at this moment only Lua level is consistent and replication_anon
may have a value from previous box.cfg call.
> >
> > +int
> > +box_set_wal_cleanup_delay(void)
> > +{
> > + double delay = box_check_wal_cleanup_delay();
> > + if (delay < 0)
> > + return -1;
>
> 4. Here you could do the correction. Look at the option
> value and at 'replication_anon' global variable. Not at
> the anon config.
Need to think about this moment, thanks!
>
> > + gc_set_wal_cleanup_delay(delay);
> > + return 0;
> > +}
> > +
> > void
> > box_set_vinyl_memory(void)
> > {
> > @@ -3000,7 +3035,7 @@ box_cfg_xc(void)
> > rmean_box = rmean_new(iproto_type_strs, IPROTO_TYPE_STAT_MAX);
> > rmean_error = rmean_new(rmean_error_strings, RMEAN_ERROR_LAST);
> >
> > - gc_init();
> > + gc_init(box_check_wal_cleanup_delay());
>
> 5. Here is the problem: you rely on replication_anon setting, but it is installed
> below, after GC is initialized.
Exactly, that's why I use raw cfg_geti("replication_anon") instead of
global replication_anon variable.
> Perhaps you could set the delay to infinite and after the non-dynamic options
> are done, box_set_wal_cleanup_delay() would be called by load_cfg.lua and
> would lower the delay to 0 or whatever is configured.
>
> > engine_init();
> > schema_init();
> > replication_init();
> > diff --git a/src/box/gc.c b/src/box/gc.c
> > index 9af4ef958..5418fd31d 100644
> > --- a/src/box/gc.c
> > +++ b/src/box/gc.c
> > @@ -46,6 +46,7 @@
> > #include <small/rlist.h>
> > #include <tarantool_ev.h>
> >
> > +#include "cfg.h"
>
> 6. Not necessary anymore.
+1
> > gc_cleanup_fiber_f(va_list ap)
> > {
> > (void)ap;
> > +
> > + /*
> > + * Stage 1 (optional): in case if we're booting
> > + * up with cleanup disabled lets do wait in a
> > + * separate cycle to minimize branching on stage 2.
> > + */
> > + if (gc.is_paused) {
> > + ev_tstamp timeout = gc.wal_cleanup_delay;
>
> 7. Please, use double for timestamps and timeouts. Because it
> is used in almost all the other code working with time.
Sure
> > + while (!fiber_is_cancelled()) {
> > + ev_tstamp clock_start = fiber_clock();
> > + if (fiber_yield_timeout(timeout)) {
> > + say_info("wal/engine cleanup is resumed "
> > + "due to timeout expiration");
> > + gc.is_paused = false;
> > + gc.delay_ref = 0;
> > + break;
> > + }
> > +
> > + /*
> > + * If a last reference is dropped
> > + * we can exit out early.
> > + */
> > + if (!gc.is_paused)
> > + break;
> > +
> > + ev_tstamp elapsed = fiber_clock() - clock_start;
> > + timeout = gc.wal_cleanup_delay - elapsed;
>
> 8. In the previous review I said about remembering a timestamp not
> accidentally. But because otherwise I can make this loop roll
> forever if I update wal_cleanup_delay with some period < timeout.
>
> For instance, it was 5. You remember clock_start. Then 2.5 secs
> pass and I update it to 4.99999. elapsed is 2.5 secs. Timeout is
> 4.99999 - 2.5 = ~2.5. It starts again. I update the timeout to
> 4.99998 after another 2.5 secs. Elapsed is 2.5 secs again, and the
> next timeout is about 2.5 secs. And so on.
>
> In your patch it will happen even if I increase the timeout, not
> only decrease. The fluctuations might be accidental in case I
> calculate it from something in Lua, get precision loss, and call
> box.cfg{} with a certain period until the instance accepts
> requests.
>
> This might be fixed if clock_start is moved out of the loop.
What you're talking about is fp rouding errors and you're to
work really hard to enter into a such loop. But I see your point
and will update. Btw the use of FP value for seconds is a big
mistake in architecure, not only here but all over the code
and I think we should get rid of this very strange approach
(and I must confess I don't understand *why* on earth the
FP is used for delays in ev library).
Cyrill
More information about the Tarantool-patches
mailing list