Date: Wed, 6 Feb 2019 11:55:06 +0300
From: Vladimir Davydov
Subject: Re: [tarantool-patches] Re: [PATCH 3/9] vinyl: use uncompressed run size for range split/coalesce/compaction
Message-ID: <20190206085506.aoj3chculmmqoexr@esperanza>
References: <20190121094213.yzfawcs7mgkqxi3e@esperanza> <20190205164905.GC6811@chai>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20190205164905.GC6811@chai>
To: Konstantin Osipov
Cc: tarantool-patches@freelists.org

On Tue, Feb 05, 2019 at 07:49:05PM +0300, Konstantin Osipov wrote:
> * Vladimir Davydov [19/01/21 13:15]:
> > On Mon, Jan 21, 2019 at 12:17:02AM +0300, Vladimir Davydov wrote:
> > > Historically, when considering splitting or coalescing a range or
> > > updating compaction priority, we use sizes of compressed runs (see
> > > bytes_compressed). This makes the algorithms dependent on whether
> > > compression is used or not and how effective it is, which is weird,
> > > because compression is a way of storing data on disk - it shouldn't
> > > affect the way data is partitioned. E.g. if we turned off compression
> > > at the first LSM tree level, which would make sense, because it's
> > > relatively small, we would affect the compaction algorithm because
> > > of this.
> > >
> > > That said, let's use uncompressed run sizes when considering range
> > > tree transformations.
> >
> > This results in occasional failures of vinyl/deferred_delete.test.lua.
> > I amended the patch on the branch to fix this. Here's the diff:
> >
>
> I don't understand why you had to replace random data with
> string.rep('z', ...). Otherwise lgtm.

Only because after the patch I don't need to use random() to generate
data that would trigger compaction - compaction logic uses uncompressed
sizes now so we can use string.rep(), which is faster.
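
To illustrate the point, here is a minimal sketch in Python. It is not code
from the patch or from the test. zlib stands in for whatever compressor the
engine uses, and THRESHOLD and should_trigger() are hypothetical names for
some size-based trigger. It only shows why a payload of repeated 'z' bytes
can reach a size threshold when uncompressed sizes are counted, while a
trigger based on compressed sizes needs incompressible (random) data.

```python
import os
import zlib

N = 64 * 1024          # payload size in bytes (arbitrary, for illustration)
THRESHOLD = 16 * 1024  # hypothetical size at which some action is triggered


def sizes(payload: bytes) -> tuple:
    """Return (uncompressed_size, compressed_size) of a payload in bytes."""
    return len(payload), len(zlib.compress(payload))


def should_trigger(size: int) -> bool:
    """Hypothetical size-based trigger: fire once the size hits THRESHOLD."""
    return size >= THRESHOLD


# Analogue of string.rep('z', N): cheap to generate, highly compressible.
repeated = b"z" * N
# Analogue of random data: slower to generate, practically incompressible.
random_data = os.urandom(N)

rep_raw, rep_packed = sizes(repeated)
rnd_raw, rnd_packed = sizes(random_data)

print("repeated: raw=%d compressed=%d" % (rep_raw, rep_packed))
print("random:   raw=%d compressed=%d" % (rnd_raw, rnd_packed))

# Counting compressed bytes, only the random payload reaches the threshold.
print("compressed-size trigger:",
      should_trigger(rep_packed), should_trigger(rnd_packed))
# Counting uncompressed bytes, both payloads reach it.
print("uncompressed-size trigger:",
      should_trigger(rep_raw), should_trigger(rnd_raw))
```

With uncompressed sizes both payloads behave the same, so a test is free to
pick the one that is cheaper to generate.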