From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Date: Wed, 6 Feb 2019 13:55:51 +0300
From: Vladimir Davydov
Subject: Re: [tarantool-patches] Re: [PATCH 3/9] vinyl: use uncompressed run size for range split/coalesce/compaction
Message-ID: <20190206105551.gqycbyywdltslei3@esperanza>
References: <20190121094213.yzfawcs7mgkqxi3e@esperanza> <20190205164905.GC6811@chai> <20190206085506.aoj3chculmmqoexr@esperanza> <20190206104628.GE19953@chai>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20190206104628.GE19953@chai>
To: Konstantin Osipov
Cc: tarantool-patches@freelists.org
List-ID:

On Wed, Feb 06, 2019 at 01:46:28PM +0300, Konstantin Osipov wrote:
> * Vladimir Davydov [19/02/06 13:31]:
> > Only because after the patch I don't need to use random() to generate
> > data that would trigger compaction - compaction logic uses uncompressed
> > sizes now so we can use string.rep(), which is faster.
>
> It is faster but it may impact other properties of the runs - like
> physical run size. For example you begin getting much more pages per
> run, for all runs except the top-level one.

No. We don't take compression into account when splitting a run into
pages. Never did. Actually, after this patch, it doesn't matter whether
we use data that's easily compressed or not in tests - the result will
be virtually the same.

> I would avoid using string.rep() anywhere in vinyl test data,
> unless I intentionally want to check the effects of inserting data
> with high compression rate.

Okay. I can change that if you wish. It's just that there's no difference.
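
The difference between compressed and uncompressed size discussed in this
thread can be illustrated with a minimal sketch. It is written in Python
rather than the Lua used in Tarantool tests, and zlib is only a stand-in
compressor; neither is taken from the vinyl code. The names sizes,
repeated and random_data are hypothetical and exist only for this example.

```python
import os
import zlib


def sizes(payload: bytes) -> tuple:
    """Return (uncompressed, compressed) sizes of a payload in bytes."""
    return len(payload), len(zlib.compress(payload))


N = 64 * 1024  # payload size in bytes, chosen arbitrarily for the sketch

# Analogue of string.rep("x", N): one byte repeated, highly compressible.
repeated = b"x" * N
# Analogue of data generated with random(): essentially incompressible.
random_data = os.urandom(N)

rep_raw, rep_packed = sizes(repeated)
rnd_raw, rnd_packed = sizes(random_data)

print("repeated: raw=%d compressed=%d" % (rep_raw, rep_packed))
print("random:   raw=%d compressed=%d" % (rnd_raw, rnd_packed))
```

Both payloads have the same uncompressed size, so any logic driven by
uncompressed size treats them identically, while their size after
compression differs greatly.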