Tarantool development patches archive
 help / color / mirror / Atom feed
From: Vladimir Davydov <vdavydov.dev@gmail.com>
To: Konstantin Osipov <kostja@tarantool.org>
Cc: tarantool-patches@freelists.org
Subject: Re: [tarantool-patches] Re: [PATCH 7/9] box: rewrite checkpoint daemon in C
Date: Wed, 5 Dec 2018 19:21:19 +0300	[thread overview]
Message-ID: <20181205162119.62o4nwoqdfr4afj6@esperanza> (raw)
In-Reply-To: <20181130094141.nukl64wxtxrduvdp@esperanza>

Discussed verbally. For the record, see below.

On Fri, Nov 30, 2018 at 12:41:41PM +0300, Vladimir Davydov wrote:
> On Fri, Nov 30, 2018 at 11:58:10AM +0300, Konstantin Osipov wrote:
> > * Vladimir Davydov <vdavydov.dev@gmail.com> [18/11/28 19:16]:
> > > Long time ago, when the checkpoint daemon was added to Tarantool, it was
> > > responsible not only for making periodic checkpoints, but also for
> > > maintaining the configured number of checkpoints and removing old snap
> > > and xlog times, so it was much easier to implement it in Lua than in C.
> > files 
> > > However, over time, all its responsibilities have been reimplemented in
> > > C and moved to the server code so that now it just calls box.snapshot()
> > > periodically. Let's rewrite this simple procedure in C as well - this
> > > will allow us to easily add more complex logic there, e.g. triggering
> > > checkpoint when WAL files exceed a configured threshold.
> > > ---
> > >  src/box/CMakeLists.txt               |   1 -
> > >  src/box/box.cc                       | 102 ++++++++++++++++++++++++
> > >  src/box/box.h                        |   1 +
> > >  src/box/lua/cfg.cc                   |  12 +++
> > >  src/box/lua/checkpoint_daemon.lua    | 136 --------------------------------
> > >  src/box/lua/init.c                   |   2 -
> > >  src/box/lua/load_cfg.lua             |   2 +-
> > >  test/xlog/checkpoint_daemon.result   | 145 -----------------------------------
> > 
> > Could we please move this fiber to gc.c at least? Let's not
> > pollute box?
> 
> If we moved it to gc.c, there would be a cross-dependency between gc.c
> and box.cc - the daemon needs box_checkpoint defined in box.cc while
> box.cc needs gc methods to register consumers. One way to solve this
> problem is move box_checkpoint to gc.c, however, it would look weird
> IMO as checkpointing isn't really a part of garbage collection. I guess
> we have to introduce a separate file for the checkpoint daemon and
> box_checkpoint after all.

Agreed to move checkpoint daemon to gc.c. In order not to introduce
cross-reference between box.cc and gc.c, gc_make_checkpoint() will be
introduced. The new method will do the work currently done by
box_checkpoint(), namely call engine and WAL checkpoint callbacks.

> 
> > 
> > > +		if (box_checkpoint_is_in_progress) {
> > > +			/*
> > > +			 * The next checkpoint will be scheduled
> > > +			 * by the concurrent box_checkpoint().
> > > +			 */
> > 
> > This is rather fragile. Can we make the interaction between
> > box-checkpoint and checkpoint daemon less obvious? Imagine in
> > future we can perform checkpoints according to a cron. 
> > 
> > I think it would be better if the logic of next checkpoint time
> > calculation is consolidated in the checkpoint daemon alone. If the
> > daemon sees that a checkpoint is in progress it can skip the
> > current checkpoint, but not delay the next checkpoint till
> > infinity.
> 
> But then box_checkpoint triggered by exceeding checkpoint_wal_threshold
> wouldn't reset the timer set by the checkpoint daemon and the time
> interval between checkpoints could be less than configured at times,
> which would probably raise questions. Would it be OK? Or we'd better
> notify the checkpoint daemon about checkpoints done out of schedule (by
> the user or on exceeding checkpoin_wal_threshold)?

We must notify the checkpoint daemon to reschedule the next checkpoint
whenever the user makes a checkpoint with box.snapshot(). That is, there
should be gc_schedule_checkpoint() helper that will schedule the next
checkpoint. This method will be used both by the checkpoint daemon and
by box.snapshot() in order to schedule the next checkpoint according to
box.cfg.checkpoint_interval.

> 
> > 
> > > +			next_checkpoint_time = now + TIMEOUT_INFINITY;
> > > +			continue;
> > > +		}
> > > +		box_checkpoint();
> > > +	}
> > > +	checkpoint_daemon = NULL;
> > > +	return 0;
> > > +}
> > > +
> > > --- a/test/xlog/checkpoint_daemon.test.lua
> > > +++ b/test/xlog/checkpoint_daemon.test.lua
> > 
> > Please keep the test. I don't understand what's wrong with it,
> > neither it is obvious from changeset comments.
> 
> The test accesses checkpoint_daemon fiber directly to check the time of
> the next scheduled checkpoint. To keep the test we would have to export
> the time of the next scheduled checkpoint to box.info. Do we want it?

Checkpoint scheduling logic should be separated to a separate
independent module. We should write a unit test for this module.

  reply	other threads:[~2018-12-05 16:21 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-28 16:14 [PATCH 0/9] Allow to limit size of WAL files Vladimir Davydov
2018-11-28 16:14 ` [PATCH 1/9] wal: separate checkpoint and flush paths Vladimir Davydov
2018-11-29 16:24   ` [tarantool-patches] " Konstantin Osipov
2018-11-28 16:14 ` [PATCH 2/9] wal: remove files needed for recovery from backup checkpoints on ENOSPC Vladimir Davydov
2018-11-29 16:31   ` [tarantool-patches] " Konstantin Osipov
2018-11-29 17:42     ` Vladimir Davydov
2018-11-28 16:14 ` [PATCH 3/9] recovery: restore garbage collector vclock after restart Vladimir Davydov
2018-11-29 16:37   ` [tarantool-patches] " Konstantin Osipov
2018-11-29 17:42     ` Vladimir Davydov
2018-11-28 16:14 ` [PATCH 4/9] gc: run garbage collection in background Vladimir Davydov
2018-11-29 16:42   ` [tarantool-patches] " Konstantin Osipov
2018-11-29 17:43     ` Vladimir Davydov
2018-11-28 16:14 ` [PATCH 5/9] gc: do not use WAL watcher API for deactivating stale consumers Vladimir Davydov
2018-11-29 17:02   ` [tarantool-patches] " Konstantin Osipov
2018-11-28 16:14 ` [PATCH 6/9] wal: simplify watcher API Vladimir Davydov
2018-11-29 17:33   ` [tarantool-patches] " Konstantin Osipov
2018-11-28 16:14 ` [PATCH 7/9] box: rewrite checkpoint daemon in C Vladimir Davydov
2018-11-30  8:58   ` [tarantool-patches] " Konstantin Osipov
2018-11-30  9:41     ` Vladimir Davydov
2018-12-05 16:21       ` Vladimir Davydov [this message]
2018-11-28 16:14 ` [PATCH 8/9] wal: pass struct instead of vclock to checkpoint methods Vladimir Davydov
2018-11-30  9:00   ` [tarantool-patches] " Konstantin Osipov
2018-11-30  9:43     ` Vladimir Davydov
2018-12-03 20:20       ` Konstantin Osipov
2018-11-28 16:14 ` [PATCH 9/9] wal: trigger checkpoint if there are too many WALs Vladimir Davydov
2018-12-03 20:34   ` [tarantool-patches] " Konstantin Osipov
2018-12-04 11:25     ` Vladimir Davydov
2018-12-04 12:53       ` Konstantin Osipov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181205162119.62o4nwoqdfr4afj6@esperanza \
    --to=vdavydov.dev@gmail.com \
    --cc=kostja@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --subject='Re: [tarantool-patches] Re: [PATCH 7/9] box: rewrite checkpoint daemon in C' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox