Tarantool development patches archive
 help / color / mirror / Atom feed
From: Vladimir Davydov <vdavydov.dev@gmail.com>
To: Konstantin Osipov <kostja@tarantool.org>
Cc: tarantool-patches@freelists.org
Subject: Re: [tarantool-patches] Re: [PATCH 7/9] vinyl: randomize range compaction to avoid IO load spikes
Date: Wed, 6 Feb 2019 14:49:05 +0300	[thread overview]
Message-ID: <20190206114905.qq6wwcdexal4gy3j@esperanza> (raw)
In-Reply-To: <20190206110609.GA24382@chai>

On Wed, Feb 06, 2019 at 02:06:09PM +0300, Konstantin Osipov wrote:
> * Vladimir Davydov <vdavydov.dev@gmail.com> [19/02/06 13:57]:
> > On Wed, Feb 06, 2019 at 01:44:19PM +0300, Konstantin Osipov wrote:
> > > * Vladimir Davydov <vdavydov.dev@gmail.com> [19/02/06 13:31]:
> > > > Over how many dumps? What do we do after restart, when there's no
> > > > history and perhaps even no level 1?
> > > 
> > > A am thinking about something along these lines: 
> > > f(n+1) = (f(n) + x*k)(1+k) - where k is the weight used to scale
> > > the next input
> > 
> > What should be k equal to?
> 
> I don't know.

Neither do I. That's why I don't want to involve any kind of averaging.

> 
> > Also, what we should do after restart when there's no history?
> 
> Seed the function with the top run size.

> If there is no top run, seed with 0.

This way the moving average will grow very reluctantly - we'll need more
than k dumps to adapt. During that time, compaction priority calculation
will be unstable.

> > 
> > Why is using the last level size as reference bad?
> 
> Because you don't know if it's last or not?

We know which run is last. Provided the workload is stable, i.e. have
stopped growing its dataset, it will be roughly the same. Besides, the
last level run size changes only on major compaction, which is
infrequent.  After a major compaction, it's OK to use a different first
level size - the point is in order not to break LSM algorithm, we have
to maintain stable level sizing between major compactions.

  reply	other threads:[~2019-02-06 11:49 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-20 21:16 [PATCH 0/9] vinyl: compaction randomization and throttling Vladimir Davydov
2019-01-20 21:17 ` [PATCH 1/9] vinyl: update lsm->range_heap in one go on dump completion Vladimir Davydov
2019-01-24 16:55   ` Vladimir Davydov
2019-02-05 16:37   ` [tarantool-patches] " Konstantin Osipov
2019-01-20 21:17 ` [PATCH 2/9] vinyl: ignore unknown .run, .index and .vylog keys Vladimir Davydov
2019-01-24 16:56   ` Vladimir Davydov
2019-01-20 21:17 ` [PATCH 3/9] vinyl: use uncompressed run size for range split/coalesce/compaction Vladimir Davydov
2019-01-21  9:42   ` Vladimir Davydov
2019-02-05 16:49     ` [tarantool-patches] " Konstantin Osipov
2019-02-06  8:55       ` Vladimir Davydov
2019-02-06 10:46         ` Konstantin Osipov
2019-02-06 10:55           ` Vladimir Davydov
2019-02-05 16:43   ` Konstantin Osipov
2019-02-06 16:48     ` Vladimir Davydov
2019-01-20 21:17 ` [PATCH 4/9] vinyl: rename lsm->range_heap to max_compaction_priority Vladimir Davydov
2019-01-20 21:17 ` [PATCH 5/9] vinyl: keep track of dumps per compaction for each LSM tree Vladimir Davydov
2019-02-05 16:58   ` [tarantool-patches] " Konstantin Osipov
2019-02-06  9:20     ` Vladimir Davydov
2019-02-06 16:54       ` Vladimir Davydov
2019-01-20 21:17 ` [PATCH 6/9] vinyl: set range size automatically Vladimir Davydov
2019-01-22  9:17   ` Vladimir Davydov
2019-02-05 17:09   ` [tarantool-patches] " Konstantin Osipov
2019-02-06  9:23     ` Vladimir Davydov
2019-02-06 17:04       ` Vladimir Davydov
2019-01-20 21:17 ` [PATCH 7/9] vinyl: randomize range compaction to avoid IO load spikes Vladimir Davydov
2019-01-22 12:54   ` Vladimir Davydov
2019-02-05 17:39     ` [tarantool-patches] " Konstantin Osipov
2019-02-06  8:53       ` Vladimir Davydov
2019-02-06 10:44         ` Konstantin Osipov
2019-02-06 10:52           ` Vladimir Davydov
2019-02-06 11:06             ` Konstantin Osipov
2019-02-06 11:49               ` Vladimir Davydov [this message]
2019-02-06 13:43                 ` Konstantin Osipov
2019-02-06 14:00                   ` Vladimir Davydov
2019-02-05 17:14   ` Konstantin Osipov
2019-01-20 21:17 ` [PATCH 8/9] vinyl: introduce quota consumer types Vladimir Davydov
2019-01-20 21:17 ` [PATCH 9/9] vinyl: throttle tx to ensure compaction keeps up with dumps Vladimir Davydov
2019-01-21 14:14   ` Vladimir Davydov
2019-01-22  9:09   ` Vladimir Davydov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190206114905.qq6wwcdexal4gy3j@esperanza \
    --to=vdavydov.dev@gmail.com \
    --cc=kostja@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --subject='Re: [tarantool-patches] Re: [PATCH 7/9] vinyl: randomize range compaction to avoid IO load spikes' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox