From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Date: Wed, 6 Feb 2019 17:00:48 +0300
From: Vladimir Davydov
Subject: Re: [tarantool-patches] Re: [PATCH 7/9] vinyl: randomize range compaction to avoid IO load spikes
Message-ID: <20190206140048.in7tcapprvfkt2xo@esperanza>
References: <44f34fbaf09af5d1054f2e4843a77e095afe1e71.1548017258.git.vdavydov.dev@gmail.com>
 <20190122125458.cutoz5rtfd2sb6el@esperanza>
 <20190205173958.GG6811@chai>
 <20190206085302.3xzjz2udfvdin5ld@esperanza>
 <20190206104419.GD19953@chai>
 <20190206105244.c4gkhb3xsn2pkqmp@esperanza>
 <20190206110609.GA24382@chai>
 <20190206114905.qq6wwcdexal4gy3j@esperanza>
 <20190206134341.GB24382@chai>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20190206134341.GB24382@chai>
To: Konstantin Osipov
Cc: tarantool-patches@freelists.org
List-ID:

On Wed, Feb 06, 2019 at 04:43:41PM +0300, Konstantin Osipov wrote:
> * Vladimir Davydov [19/02/06 14:53]:
> > We know which run is last. Provided the workload is stable, i.e. have
> > stopped growing its dataset, it will be roughly the same. Besides, the
> > last level run size changes only on major compaction, which is
> > infrequent. After a major compaction, it's OK to use a different first
> > level size - the point is in order not to break LSM algorithm, we have
> > to maintain stable level sizing between major compactions.
>
> The problem is that you can't use the last level as-is. You need a
> divider, e.g. dumps_per_compaction, which is meaningless until the
> first major compaction has happened.

Why? Level sizing isn't tight. Sizes of runs at level L range from
X*run_size_ratio^L to X*run_size_ratio^(L+1). We can choose X as we
please without breaking LSM algorithm assumptions, can't we?
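
To make the level sizing argument concrete, here is a minimal Python
sketch. It is not vinyl code: the helper names level_bounds and level_of
are hypothetical, integer sizes are used for simplicity, and treating
each level as a half-open interval [lower, upper) is an assumption. It
only shows that whatever base size X is picked, every level still spans
a factor of run_size_ratio.

```python
def level_bounds(x, run_size_ratio, level):
    """Return (lower, upper) run size bounds for the given level.

    Follows the formula from the message: runs at level L range from
    x * run_size_ratio**L to x * run_size_ratio**(L + 1).
    Assumption: the interval is half-open, [lower, upper).
    """
    lower = x * run_size_ratio ** level
    return lower, lower * run_size_ratio


def level_of(run_size, x, run_size_ratio):
    """Return the level whose bounds contain run_size.

    Hypothetical helper: walks the levels upwards until the upper
    bound exceeds run_size. Runs smaller than x are rejected here;
    real code would have to decide how to treat them.
    """
    if x <= 0 or run_size_ratio <= 1:
        raise ValueError("need x > 0 and run_size_ratio > 1")
    if run_size < x:
        raise ValueError("run is smaller than the base size x")
    level = 0
    upper = x * run_size_ratio
    while run_size >= upper:
        upper *= run_size_ratio
        level += 1
    return level


if __name__ == "__main__":
    # Two different choices of X with the same ratio: the boundaries
    # shift, but each level still spans a factor of run_size_ratio.
    for x in (10, 16):
        for level in range(3):
            print(x, level, level_bounds(x, 4, level))
```

With run_size_ratio = 4, choosing X = 16 puts level 1 at [64, 256),
while X = 10 puts it at [40, 160). In both cases the upper bound is
exactly run_size_ratio times the lower bound, which is the property the
reply relies on.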