[tarantool-patches] Re: [PATCH 7/9] vinyl: randomize range compaction to avoid IO load spikes
vdavydov.dev at gmail.com
Wed Feb 6 17:00:48 MSK 2019
On Wed, Feb 06, 2019 at 04:43:41PM +0300, Konstantin Osipov wrote:
> * Vladimir Davydov <vdavydov.dev at gmail.com> [19/02/06 14:53]:
> > We know which run is last. Provided the workload is stable, i.e. has
> > stopped growing its dataset, its size will stay roughly the same. Besides,
> > the last level run size changes only on major compaction, which is
> > infrequent. After a major compaction, it's OK to use a different first
> > level size - the point is that, in order not to break the LSM algorithm,
> > we have to maintain stable level sizing between major compactions.
> The problem is that you can't use the last level as-is. You need a
> divider, e.g. dumps_per_compaction, which is meaningless until the
> first major compaction has happened.
Why? Level sizing isn't tight. Sizes of runs at level L range from
X*run_size_ratio^L to X*run_size_ratio^(L+1). We can choose X as we
please without breaking the LSM algorithm's assumptions, can't we?
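
To make the range argument concrete, here is a minimal sketch in Python. It
is not vinyl code: the helper names level_bounds and level_of, and the sample
values for X and run_size_ratio, are assumptions made up for illustration. It
only shows that the ratio between adjacent level boundaries stays equal to
run_size_ratio whatever X is, while the level a given run falls into depends
on X.

```python
def level_bounds(x, run_size_ratio, level):
    """Return the half-open size range [low, high) for runs at `level`.

    x is the (freely chosen) base size, i.e. the lower bound of level 0.
    """
    low = x * run_size_ratio ** level
    high = x * run_size_ratio ** (level + 1)
    return low, high


def level_of(run_size, x, run_size_ratio):
    """Return the level L such that low <= run_size < high.

    Uses repeated multiplication instead of logarithms to avoid
    floating-point rounding at the level boundaries.
    """
    if run_size < x:
        raise ValueError("run is smaller than the base size x")
    level = 0
    upper = x * run_size_ratio
    while run_size >= upper:
        upper *= run_size_ratio
        level += 1
    return level


if __name__ == "__main__":
    ratio = 4  # hypothetical run_size_ratio
    for x in (100, 150):  # two arbitrary choices of X
        low, high = level_bounds(x, ratio, 2)
        # high / low equals the ratio for any X
        print(x, low, high, high / low, level_of(1600, x, ratio))
```

With these sample values, a run of size 1600 lands at level 2 for X = 100
and at level 1 for X = 150, yet in both cases each level spans exactly one
factor of run_size_ratio.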