[Tarantool-patches] [PATCH] vinyl: unthrottle scheduler on checkpoint
Konstantin Osipov
kostja.osipov at gmail.com
Wed Apr 22 03:32:28 MSK 2020
* Vladislav Shpilevoy <v.shpilevoy at tarantool.org> [20/04/22 02:17]:
Vinyl dumps are triggered by memtable state, not by a schedule.
If every dump unthrottles the scheduler, then what's the point
of throttling? Having a nice infinite loop on ENOSPC?
> Hi! Thanks for the patch!
>
> On 31/03/2020 16:42, Nikita Pettik wrote:
> > Before this patch box.snapshot() bails out immediately if it sees that
> > the scheduler is throttled due to errors. For instance:
> >
> > box.error.injection.set('ERRINJ_VY_RUN_WRITE', true)
> > snapshot() -- fails due to ERRINJ_VY_RUN_WRITE
> > box.error.injection.set('ERRINJ_VY_RUN_WRITE', false)
> > snapshot() -- still fails despite the fact that injected error is unset
> >
> > As a result, one has to wait up to a minute to make a snapshot. The
> > reason why throttling was introduced was to avoid flooding the log
> > in case of repeating disk errors.
> > On the other hand, the checkpoint process is either started manually
> > or on schedule. What is more, to deal with scheduler throttling in
> > tests, we had to introduce a new error injection
> > (ERRINJ_VY_SCHED_TIMEOUT). It shortens the period during which the
> > scheduler remains throttled, which is ugly and race prone.
> > So, let's unthrottle scheduler when checkpoint process is launched.
> >
> > Closes #3519
> > ---
> > Note that VY_SCHED_TIMEOUT error injection is not completely removed
> > from tests, since at least one of them fails (instance crashes) without
> > it (vinyl/errinj_stat.test.lua). It's not a problem of this patch -
> > reproducer is extracted and filed in gh-4821.
> >
> > Branch: https://github.com/tarantool/tarantool/tree/np/gh-3519-unthrottle-sched
> > Issue: https://github.com/tarantool/tarantool/issues/3519
> >
> > src/box/vy_scheduler.c | 14 +++++---------
> > test/box/errinj.result | 8 --------
> > test/box/errinj.test.lua | 2 --
> > test/vinyl/errinj.result | 12 ++----------
> > test/vinyl/errinj.test.lua | 5 +----
> > test/vinyl/errinj_vylog.result | 14 --------------
> > test/vinyl/errinj_vylog.test.lua | 4 ----
> > 7 files changed, 8 insertions(+), 51 deletions(-)
> >
> > diff --git a/src/box/vy_scheduler.c b/src/box/vy_scheduler.c
> > index 9dba93d34..f57f10119 100644
> > --- a/src/box/vy_scheduler.c
> > +++ b/src/box/vy_scheduler.c
> > @@ -687,17 +687,13 @@ vy_scheduler_begin_checkpoint(struct vy_scheduler *scheduler)
> > assert(!scheduler->checkpoint_in_progress);
> >
> > /*
> > - * If the scheduler is throttled due to errors, do not wait
> > - * until it wakes up as it may take quite a while. Instead
> > - * fail checkpoint immediately with the last error seen by
> > - * the scheduler.
> > + * It makes no sense throttling checkpoint process since
> > + * it can be either ran manually or due to timeout. At this
> > + * point let's ignore it.
>
> Kostja's question is fair. Can it be done for manual snapshots only?
> Automated checkpoints already have problems with killing the disk,
> when multiple instances on the same machine start them at the same
> time. With an unthrottled scheduler it is going to get worse.
>
> However, if this is hard to detect, then it is ok. Just do a quick
> check if it is possible.
>
> > */
> > if (scheduler->is_throttled) {
> > - struct error *e = diag_last_error(&scheduler->diag);
> > - diag_add_error(diag_get(), e);
> > - say_error("cannot checkpoint vinyl, "
> > - "scheduler is throttled with: %s", e->errmsg);
> > - return -1;
> > + say_info("scheduler is unthrottled due to checkpoint process");
> > + scheduler->is_throttled = false;
>
> Is there a simple way to let the throttling continue after the
> checkpoint is finished?
>
> > }
> >
> > if (!vy_scheduler_dump_in_progress(scheduler)) {
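
To make the two questions above concrete (manual snapshots only, and
throttling that resumes after the checkpoint), here is a rough sketch of
one possible shape. It is a toy model, not the code in vy_scheduler.c:
struct toy_scheduler, the toy_* functions, the is_manual flag and the
1..60 second backoff bounds are all assumptions made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical backoff bounds in seconds (assumed, not from the patch). */
enum { TOY_TIMEOUT_MIN = 1, TOY_TIMEOUT_MAX = 60 };

/* Toy stand-in for the scheduler state; not the real struct vy_scheduler. */
struct toy_scheduler {
	bool is_throttled;           /* set after a failed task */
	bool checkpoint_in_progress;
	double timeout;              /* current backoff, 0 when healthy */
};

/* Record a failure: throttle and double the backoff up to the cap. */
static void
toy_scheduler_fail(struct toy_scheduler *s)
{
	if (s->timeout < TOY_TIMEOUT_MIN)
		s->timeout = TOY_TIMEOUT_MIN;
	else
		s->timeout *= 2;
	if (s->timeout > TOY_TIMEOUT_MAX)
		s->timeout = TOY_TIMEOUT_MAX;
	s->is_throttled = true;
}

/* Return 0 if the checkpoint may start, -1 if it is refused. */
static int
toy_scheduler_begin_checkpoint(struct toy_scheduler *s, bool is_manual)
{
	assert(!s->checkpoint_in_progress);
	/* Automated checkpoints keep the old fail-fast behaviour. */
	if (s->is_throttled && !is_manual)
		return -1;
	/*
	 * Manual checkpoint: lift the throttle for this attempt only.
	 * The backoff value is kept, so a repeated failure continues
	 * from where it left off instead of restarting from the minimum.
	 */
	s->is_throttled = false;
	s->checkpoint_in_progress = true;
	return 0;
}

/* Finish a checkpoint; a failure re-arms the throttle. */
static void
toy_scheduler_end_checkpoint(struct toy_scheduler *s, bool ok)
{
	s->checkpoint_in_progress = false;
	if (ok)
		s->timeout = 0;
	else
		toy_scheduler_fail(s);
}
```

With this shape a manual snapshot never waits for the timeout, while a
disk that is still failing (ENOSPC, say) puts the scheduler straight back
into a throttled state with a longer backoff, so there is no tight retry
loop. Whether the real code can tell a manual checkpoint from an
automated one at this point is exactly the open question.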
--
Konstantin Osipov, Moscow, Russia