Tarantool development patches archive
From: Vladimir Davydov <vdavydov.dev@gmail.com>
To: tarantool-patches@freelists.org
Subject: [PATCH 3/9] vinyl: use uncompressed run size for range split/coalesce/compaction
Date: Mon, 21 Jan 2019 00:17:02 +0300
Message-ID: <fff9b3eddaeeb73ae19c2ff277b1ec49052bc030.1548017258.git.vdavydov.dev@gmail.com>
In-Reply-To: <cover.1548017258.git.vdavydov.dev@gmail.com>

Historically, when deciding whether to split or coalesce a range or
when updating its compaction priority, we have used the sizes of
compressed runs (see bytes_compressed). This makes the algorithms
depend on whether compression is enabled and how effective it is,
which is wrong: compression is a way of storing data on disk and
shouldn't affect the way data is partitioned. E.g. if we turned off
compression at the first LSM tree level, which would make sense
because that level is relatively small, we would change the behavior
of the compaction algorithm as a side effect.

So let's use uncompressed run sizes (see bytes) when considering range
tree transformations.
---
 src/box/vy_range.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/box/vy_range.c b/src/box/vy_range.c
index f649aff7..87c4c6b9 100644
--- a/src/box/vy_range.c
+++ b/src/box/vy_range.c
@@ -329,7 +329,7 @@ vy_range_update_compaction_priority(struct vy_range *range,
 
 	struct vy_slice *slice;
 	rlist_foreach_entry(slice, &range->slices, in_range) {
-		uint64_t size = slice->count.bytes_compressed;
+		uint64_t size = slice->count.bytes;
 		/*
 		 * The size of the first level is defined by
 		 * the size of the most recent run.
@@ -377,7 +377,7 @@ vy_range_update_compaction_priority(struct vy_range *range,
 			 */
 			range->compaction_priority = total_run_count;
 			range->compaction_queue = total_stmt_count;
-			est_new_run_size = total_stmt_count.bytes_compressed;
+			est_new_run_size = total_stmt_count.bytes;
 		}
 	}
 
@@ -419,7 +419,7 @@ vy_range_needs_split(struct vy_range *range, const struct index_opts *opts,
 	slice = rlist_last_entry(&range->slices, struct vy_slice, in_range);
 
 	/* The range is too small to be split. */
-	if (slice->count.bytes_compressed < opts->range_size * 4 / 3)
+	if (slice->count.bytes < opts->range_size * 4 / 3)
 		return false;
 
 	/* Find the median key in the oldest run (approximately). */
@@ -481,7 +481,7 @@ vy_range_needs_coalesce(struct vy_range *range, vy_range_tree_t *tree,
 	struct vy_range *it;
 
 	/* Size of the coalesced range. */
-	uint64_t total_size = range->count.bytes_compressed;
+	uint64_t total_size = range->count.bytes;
 	/* Coalesce ranges until total_size > max_size. */
 	uint64_t max_size = opts->range_size / 2;
 
@@ -496,7 +496,7 @@ vy_range_needs_coalesce(struct vy_range *range, vy_range_tree_t *tree,
 	for (it = vy_range_tree_next(tree, range);
 	     it != NULL && !vy_range_is_scheduled(it);
 	     it = vy_range_tree_next(tree, it)) {
-		uint64_t size = it->count.bytes_compressed;
+		uint64_t size = it->count.bytes;
 		if (total_size + size > max_size)
 			break;
 		total_size += size;
@@ -505,7 +505,7 @@ vy_range_needs_coalesce(struct vy_range *range, vy_range_tree_t *tree,
 	for (it = vy_range_tree_prev(tree, range);
 	     it != NULL && !vy_range_is_scheduled(it);
 	     it = vy_range_tree_prev(tree, it)) {
-		uint64_t size = it->count.bytes_compressed;
+		uint64_t size = it->count.bytes;
 		if (total_size + size > max_size)
 			break;
 		total_size += size;
-- 
2.11.0
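
To illustrate why the choice of size matters, here is a minimal sketch of
the two size checks touched by this patch. It is not the vinyl code: the
function names big_enough_to_split() and fits_into_coalesced_range() are
hypothetical, and only the 4/3 and 1/2 thresholds are taken from the diff
above. The real vy_range_needs_split() also goes on to look for a median
key, which is omitted here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Size precheck modeled on vy_range_needs_split(): a range is
 * considered too small to be split if the given size is below
 * 4/3 of range_size. (Hypothetical helper, not part of vinyl.)
 */
bool
big_enough_to_split(uint64_t size, uint64_t range_size)
{
	return size >= range_size * 4 / 3;
}

/*
 * Accumulation check modeled on vy_range_needs_coalesce(): a
 * neighbor is added to the coalesced range only while the total
 * stays within range_size / 2. (Hypothetical helper as well.)
 */
bool
fits_into_coalesced_range(uint64_t total_size, uint64_t size,
			  uint64_t range_size)
{
	uint64_t max_size = range_size / 2;
	return total_size + size <= max_size;
}
```

Assuming range_size is 1 GiB, a run holding 2 GiB of data that compresses
to 512 MiB fails the split precheck when its compressed size is passed in
and passes it with the uncompressed size. Likewise, two neighbors of
300 MiB each that compress to 100 MiB each fit into one coalesced range by
compressed size but not by uncompressed size.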

Thread overview: 39+ messages
2019-01-20 21:16 [PATCH 0/9] vinyl: compaction randomization and throttling Vladimir Davydov
2019-01-20 21:17 ` [PATCH 1/9] vinyl: update lsm->range_heap in one go on dump completion Vladimir Davydov
2019-01-24 16:55   ` Vladimir Davydov
2019-02-05 16:37   ` [tarantool-patches] " Konstantin Osipov
2019-01-20 21:17 ` [PATCH 2/9] vinyl: ignore unknown .run, .index and .vylog keys Vladimir Davydov
2019-01-24 16:56   ` Vladimir Davydov
2019-01-20 21:17 ` Vladimir Davydov [this message]
2019-01-21  9:42   ` [PATCH 3/9] vinyl: use uncompressed run size for range split/coalesce/compaction Vladimir Davydov
2019-02-05 16:49     ` [tarantool-patches] " Konstantin Osipov
2019-02-06  8:55       ` Vladimir Davydov
2019-02-06 10:46         ` Konstantin Osipov
2019-02-06 10:55           ` Vladimir Davydov
2019-02-05 16:43   ` Konstantin Osipov
2019-02-06 16:48     ` Vladimir Davydov
2019-01-20 21:17 ` [PATCH 4/9] vinyl: rename lsm->range_heap to max_compaction_priority Vladimir Davydov
2019-01-20 21:17 ` [PATCH 5/9] vinyl: keep track of dumps per compaction for each LSM tree Vladimir Davydov
2019-02-05 16:58   ` [tarantool-patches] " Konstantin Osipov
2019-02-06  9:20     ` Vladimir Davydov
2019-02-06 16:54       ` Vladimir Davydov
2019-01-20 21:17 ` [PATCH 6/9] vinyl: set range size automatically Vladimir Davydov
2019-01-22  9:17   ` Vladimir Davydov
2019-02-05 17:09   ` [tarantool-patches] " Konstantin Osipov
2019-02-06  9:23     ` Vladimir Davydov
2019-02-06 17:04       ` Vladimir Davydov
2019-01-20 21:17 ` [PATCH 7/9] vinyl: randomize range compaction to avoid IO load spikes Vladimir Davydov
2019-01-22 12:54   ` Vladimir Davydov
2019-02-05 17:39     ` [tarantool-patches] " Konstantin Osipov
2019-02-06  8:53       ` Vladimir Davydov
2019-02-06 10:44         ` Konstantin Osipov
2019-02-06 10:52           ` Vladimir Davydov
2019-02-06 11:06             ` Konstantin Osipov
2019-02-06 11:49               ` Vladimir Davydov
2019-02-06 13:43                 ` Konstantin Osipov
2019-02-06 14:00                   ` Vladimir Davydov
2019-02-05 17:14   ` Konstantin Osipov
2019-01-20 21:17 ` [PATCH 8/9] vinyl: introduce quota consumer types Vladimir Davydov
2019-01-20 21:17 ` [PATCH 9/9] vinyl: throttle tx to ensure compaction keeps up with dumps Vladimir Davydov
2019-01-21 14:14   ` Vladimir Davydov
2019-01-22  9:09   ` Vladimir Davydov
