Date: Tue, 5 Feb 2019 20:09:09 +0300
From: Konstantin Osipov
Subject: [tarantool-patches] Re: [PATCH 6/9] vinyl: set range size automatically
Message-ID: <20190205170909.GE6811@chai>
In-Reply-To: <6cd378d1640f87d46a6e40f1c51e4ae62a70c209.1548017258.git.vdavydov.dev@gmail.com>
To: tarantool-patches@freelists.org

* Vladimir Davydov [19/01/21 06:58]:
> +int64_t
> +vy_lsm_range_size(struct vy_lsm *lsm)
> +{
> +	/* Use the configured range size if available. */
> +	if (lsm->opts.range_size > 0)
> +		return lsm->opts.range_size;
> +	/*
> +	 * It doesn't make much sense to create too small ranges.
> +	 * Limit the max number of ranges per index to 1000 and
> +	 * never create ranges smaller than 16 MB.
> +	 */
> +	enum { MIN_RANGE_SIZE = 16 * 1024 * 1024 };
> +	enum { MAX_RANGE_COUNT = 1000 };
> +	/*
> +	 * Ideally, we want to compact roughly the same amount of
> +	 * data after each dump so as to avoid IO bursts caused by
> +	 * simultaneous major compaction of a bunch of ranges,
> +	 * because such IO bursts can lead to a deviation of the
> +	 * LSM tree from the configured shape and, as a result,
> +	 * increased read amplification. To achieve that, we need
> +	 * to have at least as many ranges as the number of dumps
> +	 * it takes to trigger major compaction in a range.
> +	 */
> +	int range_count = vy_lsm_dumps_per_compaction(lsm);
> +	range_count = MIN(range_count, MAX_RANGE_COUNT);
> +	int64_t range_size = lsm->stat.disk.last_level_count.bytes /
> +					(range_count + 1);
> +	range_size = MAX(range_size, MIN_RANGE_SIZE);
> +	return range_size;
> +}

OK, you could say the value is rarely used, so it can be calculated
each time it is needed, but why not recalculate it on each major
compaction instead? This would spare us technical debt and having to
think about a potential performance bottleneck in the future.

-- 
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov
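
To make the suggestion concrete, here is a minimal sketch of caching the
range size and refreshing it when a major compaction completes. It is not
taken from the patch: struct lsm_sketch, the cached_range_size field and
the lsm_sketch_*() functions are hypothetical stand-ins, and the
dumps_per_compaction field merely stands in for the result of
vy_lsm_dumps_per_compaction(). Only the arithmetic mirrors the quoted
vy_lsm_range_size().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant parts of struct vy_lsm. */
struct lsm_sketch {
	/* Stands in for opts.range_size; 0 means "pick automatically". */
	int64_t configured_range_size;
	/* Stands in for stat.disk.last_level_count.bytes. */
	int64_t last_level_bytes;
	/* Stands in for the result of vy_lsm_dumps_per_compaction(). */
	int dumps_per_compaction;
	/* Hypothetical cache, refreshed on major compaction. */
	int64_t cached_range_size;
};

enum { MIN_RANGE_SIZE = 16 * 1024 * 1024 };
enum { MAX_RANGE_COUNT = 1000 };

/* Same arithmetic as the quoted vy_lsm_range_size(). */
static int64_t
lsm_sketch_calc_range_size(const struct lsm_sketch *lsm)
{
	if (lsm->configured_range_size > 0)
		return lsm->configured_range_size;
	assert(lsm->dumps_per_compaction >= 0);
	int range_count = lsm->dumps_per_compaction;
	if (range_count > MAX_RANGE_COUNT)
		range_count = MAX_RANGE_COUNT;
	int64_t range_size = lsm->last_level_bytes / (range_count + 1);
	if (range_size < MIN_RANGE_SIZE)
		range_size = MIN_RANGE_SIZE;
	return range_size;
}

/* Hypothetical hook: call once a major compaction has completed. */
static void
lsm_sketch_on_major_compaction(struct lsm_sketch *lsm)
{
	lsm->cached_range_size = lsm_sketch_calc_range_size(lsm);
}

/* Readers take the cached value instead of recomputing it. */
static int64_t
lsm_sketch_range_size(const struct lsm_sketch *lsm)
{
	return lsm->cached_range_size;
}
```

With this shape the cost of the calculation is paid once per major
compaction rather than on every lookup. The sketch assumes the cache is
also initialised once before the first read (for example on index open),
which is not shown here.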