Tarantool development patches archive
 help / color / mirror / Atom feed
From: Konstantin Osipov <kostja.osipov@gmail.com>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH] vinyl: unthrottle scheduler on checkpoint
Date: Wed, 22 Apr 2020 03:32:28 +0300	[thread overview]
Message-ID: <20200422003228.GA19601@atlas> (raw)
In-Reply-To: <3f1e18cc-cc18-b586-825b-3e4e3a9c8e3f@tarantool.org>

* Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [20/04/22 02:17]:

Vinyl dump is based on memtable state, not a schedule.
If each dump is unthrottling the scheduler, then what's the point
of throttling? Having a nice infinite loop on ENOSPC?

> Hi! Thanks for the patch!
> 
> On 31/03/2020 16:42, Nikita Pettik wrote:
> > Before this patch box.snapshot() bails out immediately if it sees that
> > the scheduler is throttled due to errors. For instance:
> > 
> > box.error.injection.set('ERRINJ_VY_RUN_WRITE', true)
> > snapshot() -- fails due to ERRINJ_VY_RUN_WRITE
> > box.error.injection.set('ERRINJ_VY_RUN_WRITE', false)
> > snapshot() -- still fails despite the fact that injected error is unset
> > 
> > As a result, one has to wait up to a minute to make a snapshot. The
> > reason why throttling was introduced was to avoid flooding the log
> > in case of repeating disk errors.
> > On the other hand, checkpoint process is either called manually or on
> > schedule. What is more, to deal with schedule throttling in tests, we
> > had to introduce a new error injection (ERRINJ_VY_SCHED_TIMEOUT).
> > It reduces time duration during which the scheduler remains throttled,
> > which is ugly and race prone.
> > So, let's unthrottle scheduler when checkpoint process is launched.
> > 
> > Closes #3519
> > ---
> > Note that VY_SCHED_TIMEOUT error injection is not completely removed
> > from tests, since at least one of them fails (instance crashes) without
> > it (vinyl/errinj_stat.test.lua). It's not a problem of this patch -
> > reproducer is extracted and filed in gh-4821.
> > 
> > Branch: https://github.com/tarantool/tarantool/tree/np/gh-3519-unthrottle-sched
> > Issue: https://github.com/tarantool/tarantool/issues/3519
> > 
> >  src/box/vy_scheduler.c           | 14 +++++---------
> >  test/box/errinj.result           |  8 --------
> >  test/box/errinj.test.lua         |  2 --
> >  test/vinyl/errinj.result         | 12 ++----------
> >  test/vinyl/errinj.test.lua       |  5 +----
> >  test/vinyl/errinj_vylog.result   | 14 --------------
> >  test/vinyl/errinj_vylog.test.lua |  4 ----
> >  7 files changed, 8 insertions(+), 51 deletions(-)
> > 
> > diff --git a/src/box/vy_scheduler.c b/src/box/vy_scheduler.c
> > index 9dba93d34..f57f10119 100644
> > --- a/src/box/vy_scheduler.c
> > +++ b/src/box/vy_scheduler.c
> > @@ -687,17 +687,13 @@ vy_scheduler_begin_checkpoint(struct vy_scheduler *scheduler)
> >  	assert(!scheduler->checkpoint_in_progress);
> >  
> >  	/*
> > -	 * If the scheduler is throttled due to errors, do not wait
> > -	 * until it wakes up as it may take quite a while. Instead
> > -	 * fail checkpoint immediately with the last error seen by
> > -	 * the scheduler.
> > +	 * It makes no sense throttling checkpoint process since
> > +	 * it can be either ran manually or due to timeout. At this
> > +	 * point let's ignore it.
> 
> Kostja's questio is fair. Can it be done for manual snapshots only?
> Automated checkpoints already have problems with killing the disk,
> when multiple instances on the same machine start them at the same
> time. With unthrotled scheduler it is going to get worse.
> 
> However, if this is hard to detect, then it is ok. Just do a quick
> check if it is possible.
> 
> >  	 */
> >  	if (scheduler->is_throttled) {
> > -		struct error *e = diag_last_error(&scheduler->diag);
> > -		diag_add_error(diag_get(), e);
> > -		say_error("cannot checkpoint vinyl, "
> > -			  "scheduler is throttled with: %s", e->errmsg);
> > -		return -1;
> > +		say_info("scheduler is unthrottled due to checkpoint process");
> > +		scheduler->is_throttled = false;
> 
> Is there a simple way to let the throttling continue after the
> checkpoint is finished?
> 
> >  	}
> >  
> >  	if (!vy_scheduler_dump_in_progress(scheduler)) {

-- 
Konstantin Osipov, Moscow, Russia

  reply	other threads:[~2020-04-22  0:32 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-31 14:42 Nikita Pettik
2020-03-31 14:53 ` Konstantin Osipov
2020-04-21 23:13 ` Vladislav Shpilevoy
2020-04-22  0:32   ` Konstantin Osipov [this message]
2020-04-27 12:17   ` Nikita Pettik
2020-04-27 12:39     ` Konstantin Osipov
2020-04-27 15:18       ` Nikita Pettik
2020-04-27 15:20         ` Konstantin Osipov
2020-04-27 15:37           ` Nikita Pettik
2020-04-27 15:48             ` Konstantin Osipov
2020-04-27 15:52               ` Nikita Pettik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200422003228.GA19601@atlas \
    --to=kostja.osipov@gmail.com \
    --cc=tarantool-patches@dev.tarantool.org \
    --cc=v.shpilevoy@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH] vinyl: unthrottle scheduler on checkpoint' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox