[PATCH 3/9] vinyl: use uncompressed run size for range split/coalesce/compaction

Vladimir Davydov vdavydov.dev at gmail.com
Mon Jan 21 12:42:13 MSK 2019


On Mon, Jan 21, 2019 at 12:17:02AM +0300, Vladimir Davydov wrote:
> Historically, when deciding whether to split or coalesce a range or
> when updating compaction priority, we have used the sizes of
> compressed runs (see bytes_compressed). This makes the algorithms
> depend on whether compression is enabled and how effective it is,
> which is odd, because compression is merely a way of storing data on
> disk - it shouldn't affect how the data is partitioned. E.g. if we
> turned off compression at the first LSM tree level, which would make
> sense because it's relatively small, we would thereby change the
> compaction behavior.
> 
> So let's use uncompressed run sizes when considering range tree
> transformations.

Switching to uncompressed run sizes results in occasional failures of
vinyl/deferred_delete.test.lua, so I amended the patch on the branch to
fix the test. Here's the diff:

diff --git a/test/vinyl/deferred_delete.result b/test/vinyl/deferred_delete.result
index 29945f8d..61f81ce2 100644
--- a/test/vinyl/deferred_delete.result
+++ b/test/vinyl/deferred_delete.result
@@ -668,16 +668,13 @@ test_run:cmd("switch test")
 fiber = require('fiber')
 ---
 ...
-digest = require('digest')
----
-...
 s = box.schema.space.create('test', {engine = 'vinyl'})
 ---
 ...
-pk = s:create_index('pk', {run_count_per_level = 10})
+pk = s:create_index('pk', {run_count_per_level = 10, run_size_ratio = 2})
 ---
 ...
-sk = s:create_index('sk', {run_count_per_level = 10, parts = {2, 'unsigned', 3, 'string'}, unique = false})
+sk = s:create_index('sk', {run_count_per_level = 10, run_size_ratio = 2, parts = {2, 'unsigned', 3, 'string'}, unique = false})
 ---
 ...
 -- Write a run big enough to prevent major compaction from kicking in
@@ -685,13 +682,25 @@ sk = s:create_index('sk', {run_count_per_level = 10, parts = {2, 'unsigned', 3,
 dummy_rows = 100
 ---
 ...
-for i = 1, dummy_rows do s:replace{i + 1000, i + 1000, digest.urandom(100)} end
+pad = string.rep('z', 50 * 1024)
+---
+...
+for i = 1, dummy_rows do s:replace{i + 1000, i + 1000, pad} end
 ---
 ...
 box.snapshot()
 ---
 - ok
 ...
+pk:compact()
+---
+...
+sk:compact()
+---
+...
+while box.stat.vinyl().scheduler.compaction_queue > 0 do fiber.sleep(0.001) end
+---
+...
 pad = string.rep('x', 10 * 1024)
 ---
 ...
diff --git a/test/vinyl/deferred_delete.test.lua b/test/vinyl/deferred_delete.test.lua
index d38802da..93b5b358 100644
--- a/test/vinyl/deferred_delete.test.lua
+++ b/test/vinyl/deferred_delete.test.lua
@@ -252,17 +252,20 @@ test_run:cmd("start server test with args='1048576'")
 test_run:cmd("switch test")
 
 fiber = require('fiber')
-digest = require('digest')
 
 s = box.schema.space.create('test', {engine = 'vinyl'})
-pk = s:create_index('pk', {run_count_per_level = 10})
-sk = s:create_index('sk', {run_count_per_level = 10, parts = {2, 'unsigned', 3, 'string'}, unique = false})
+pk = s:create_index('pk', {run_count_per_level = 10, run_size_ratio = 2})
+sk = s:create_index('sk', {run_count_per_level = 10, run_size_ratio = 2, parts = {2, 'unsigned', 3, 'string'}, unique = false})
 
 -- Write a run big enough to prevent major compaction from kicking in
 -- (run_count_per_level is ignored on the last level - see gh-3657).
 dummy_rows = 100
-for i = 1, dummy_rows do s:replace{i + 1000, i + 1000, digest.urandom(100)} end
+pad = string.rep('z', 50 * 1024)
+for i = 1, dummy_rows do s:replace{i + 1000, i + 1000, pad} end
 box.snapshot()
+pk:compact()
+sk:compact()
+while box.stat.vinyl().scheduler.compaction_queue > 0 do fiber.sleep(0.001) end
 
 pad = string.rep('x', 10 * 1024)
 for i = 1, 120 do s:replace{i, i, pad} end
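
If other tests end up needing the same "compact and wait" dance, the loop
on scheduler.compaction_queue could be factored into a small helper. A
sketch along the same lines (the helper name is mine, nothing like it
exists in the tree):

    fiber = require('fiber')

    -- Block until the vinyl scheduler has no compaction work queued,
    -- with a timeout so a stuck scheduler fails the test instead of
    -- hanging it.
    function wait_compaction(timeout)
        local deadline = fiber.clock() + (timeout or 10)
        while box.stat.vinyl().scheduler.compaction_queue > 0 do
            if fiber.clock() > deadline then
                error('timed out waiting for vinyl compaction')
            end
            fiber.sleep(0.001)
        end
    end

The test would then call pk:compact(), sk:compact(), wait_compaction().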


