From: Vladimir Davydov <vdavydov.dev@gmail.com>
To: Konstantin Osipov <kostja@tarantool.org>
Cc: tarantool-patches@freelists.org
Subject: Re: [tarantool-patches] Re: [PATCH 7/9] box: rewrite checkpoint daemon in C
Date: Wed, 5 Dec 2018 19:21:19 +0300 [thread overview]
Message-ID: <20181205162119.62o4nwoqdfr4afj6@esperanza> (raw)
In-Reply-To: <20181130094141.nukl64wxtxrduvdp@esperanza>
Discussed verbally. For the record, see below.
On Fri, Nov 30, 2018 at 12:41:41PM +0300, Vladimir Davydov wrote:
> On Fri, Nov 30, 2018 at 11:58:10AM +0300, Konstantin Osipov wrote:
> > * Vladimir Davydov <vdavydov.dev@gmail.com> [18/11/28 19:16]:
> > > Long time ago, when the checkpoint daemon was added to Tarantool, it was
> > > responsible not only for making periodic checkpoints, but also for
> > > maintaining the configured number of checkpoints and removing old snap
> > > and xlog times, so it was much easier to implement it in Lua than in C.
> > files
> > > However, over time, all its responsibilities have been reimplemented in
> > > C and moved to the server code so that now it just calls box.snapshot()
> > > periodically. Let's rewrite this simple procedure in C as well - this
> > > will allow us to easily add more complex logic there, e.g. triggering
> > > checkpoint when WAL files exceed a configured threshold.
> > > ---
> > > src/box/CMakeLists.txt | 1 -
> > > src/box/box.cc | 102 ++++++++++++++++++++++++
> > > src/box/box.h | 1 +
> > > src/box/lua/cfg.cc | 12 +++
> > > src/box/lua/checkpoint_daemon.lua | 136 --------------------------------
> > > src/box/lua/init.c | 2 -
> > > src/box/lua/load_cfg.lua | 2 +-
> > > test/xlog/checkpoint_daemon.result | 145 -----------------------------------
> >
> > Could we please move this fiber to gc.c at least? Let's not
> > pollute box?
>
> If we moved it to gc.c, there would be a cross-dependency between gc.c
> and box.cc - the daemon needs box_checkpoint defined in box.cc while
> box.cc needs gc methods to register consumers. One way to solve this
> problem is move box_checkpoint to gc.c, however, it would look weird
> IMO as checkpointing isn't really a part of garbage collection. I guess
> we have to introduce a separate file for the checkpoint daemon and
> box_checkpoint after all.
Agreed to move checkpoint daemon to gc.c. In order not to introduce
cross-reference between box.cc and gc.c, gc_make_checkpoint() will be
introduced. The new method will do the work currently done by
box_checkpoint(), namely call engine and WAL checkpoint callbacks.
>
> >
> > > + if (box_checkpoint_is_in_progress) {
> > > + /*
> > > + * The next checkpoint will be scheduled
> > > + * by the concurrent box_checkpoint().
> > > + */
> >
> > This is rather fragile. Can we make the interaction between
> > box-checkpoint and checkpoint daemon less obvious? Imagine in
> > future we can perform checkpoints according to a cron.
> >
> > I think it would be better if the logic of next checkpoint time
> > calculation is consolidated in the checkpoint daemon alone. If the
> > daemon sees that a checkpoint is in progress it can skip the
> > current checkpoint, but not delay the next checkpoint till
> > infinity.
>
> But then box_checkpoint triggered by exceeding checkpoint_wal_threshold
> wouldn't reset the timer set by the checkpoint daemon and the time
> interval between checkpoints could be less than configured at times,
> which would probably raise questions. Would it be OK? Or we'd better
> notify the checkpoint daemon about checkpoints done out of schedule (by
> the user or on exceeding checkpoin_wal_threshold)?
We must notify the checkpoint daemon to reschedule the next checkpoint
whenever the user makes a checkpoint with box.snapshot(). That is, there
should be gc_schedule_checkpoint() helper that will schedule the next
checkpoint. This method will be used both by the checkpoint daemon and
by box.snapshot() in order to schedule the next checkpoint according to
box.cfg.checkpoint_interval.
>
> >
> > > + next_checkpoint_time = now + TIMEOUT_INFINITY;
> > > + continue;
> > > + }
> > > + box_checkpoint();
> > > + }
> > > + checkpoint_daemon = NULL;
> > > + return 0;
> > > +}
> > > +
> > > --- a/test/xlog/checkpoint_daemon.test.lua
> > > +++ b/test/xlog/checkpoint_daemon.test.lua
> >
> > Please keep the test. I don't understand what's wrong with it,
> > neither it is obvious from changeset comments.
>
> The test accesses checkpoint_daemon fiber directly to check the time of
> the next scheduled checkpoint. To keep the test we would have to export
> the time of the next scheduled checkpoint to box.info. Do we want it?
Checkpoint scheduling logic should be separated to a separate
independent module. We should write a unit test for this module.
next prev parent reply other threads:[~2018-12-05 16:21 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-11-28 16:14 [PATCH 0/9] Allow to limit size of WAL files Vladimir Davydov
2018-11-28 16:14 ` [PATCH 1/9] wal: separate checkpoint and flush paths Vladimir Davydov
2018-11-29 16:24 ` [tarantool-patches] " Konstantin Osipov
2018-11-28 16:14 ` [PATCH 2/9] wal: remove files needed for recovery from backup checkpoints on ENOSPC Vladimir Davydov
2018-11-29 16:31 ` [tarantool-patches] " Konstantin Osipov
2018-11-29 17:42 ` Vladimir Davydov
2018-11-28 16:14 ` [PATCH 3/9] recovery: restore garbage collector vclock after restart Vladimir Davydov
2018-11-29 16:37 ` [tarantool-patches] " Konstantin Osipov
2018-11-29 17:42 ` Vladimir Davydov
2018-11-28 16:14 ` [PATCH 4/9] gc: run garbage collection in background Vladimir Davydov
2018-11-29 16:42 ` [tarantool-patches] " Konstantin Osipov
2018-11-29 17:43 ` Vladimir Davydov
2018-11-28 16:14 ` [PATCH 5/9] gc: do not use WAL watcher API for deactivating stale consumers Vladimir Davydov
2018-11-29 17:02 ` [tarantool-patches] " Konstantin Osipov
2018-11-28 16:14 ` [PATCH 6/9] wal: simplify watcher API Vladimir Davydov
2018-11-29 17:33 ` [tarantool-patches] " Konstantin Osipov
2018-11-28 16:14 ` [PATCH 7/9] box: rewrite checkpoint daemon in C Vladimir Davydov
2018-11-30 8:58 ` [tarantool-patches] " Konstantin Osipov
2018-11-30 9:41 ` Vladimir Davydov
2018-12-05 16:21 ` Vladimir Davydov [this message]
2018-11-28 16:14 ` [PATCH 8/9] wal: pass struct instead of vclock to checkpoint methods Vladimir Davydov
2018-11-30 9:00 ` [tarantool-patches] " Konstantin Osipov
2018-11-30 9:43 ` Vladimir Davydov
2018-12-03 20:20 ` Konstantin Osipov
2018-11-28 16:14 ` [PATCH 9/9] wal: trigger checkpoint if there are too many WALs Vladimir Davydov
2018-12-03 20:34 ` [tarantool-patches] " Konstantin Osipov
2018-12-04 11:25 ` Vladimir Davydov
2018-12-04 12:53 ` Konstantin Osipov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181205162119.62o4nwoqdfr4afj6@esperanza \
--to=vdavydov.dev@gmail.com \
--cc=kostja@tarantool.org \
--cc=tarantool-patches@freelists.org \
--subject='Re: [tarantool-patches] Re: [PATCH 7/9] box: rewrite checkpoint daemon in C' \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox