From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Date: Mon, 21 Jan 2019 12:42:13 +0300
From: Vladimir Davydov
Subject: Re: [PATCH 3/9] vinyl: use uncompressed run size for range split/coalesce/compaction
Message-ID: <20190121094213.yzfawcs7mgkqxi3e@esperanza>
References:
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
To: tarantool-patches@freelists.org
List-ID:

On Mon, Jan 21, 2019 at 12:17:02AM +0300, Vladimir Davydov wrote:
> Historically, when considering splitting or coalescing a range or
> updating compaction priority, we use sizes of compressed runs (see
> bytes_compressed). This makes the algorithms dependent on whether
> compression is used or not and how effective it is, which is weird,
> because compression is a way of storing data on disk - it shouldn't
> affect the way data is partitioned. E.g. if we turned off compression
> at the first LSM tree level, which would make sense, because it's
> relatively small, we would affect the compaction algorithm because
> of this.
>
> That said, let's use uncompressed run sizes when considering range
> tree transformations.

This results in occasional failures of vinyl/deferred_delete.test.lua.
I amended the patch on the branch to fix this. Here's the diff:

diff --git a/test/vinyl/deferred_delete.result b/test/vinyl/deferred_delete.result
index 29945f8d..61f81ce2 100644
--- a/test/vinyl/deferred_delete.result
+++ b/test/vinyl/deferred_delete.result
@@ -668,16 +668,13 @@ test_run:cmd("switch test")
 fiber = require('fiber')
 ---
 ...
-digest = require('digest')
----
-...
 s = box.schema.space.create('test', {engine = 'vinyl'})
 ---
 ...
-pk = s:create_index('pk', {run_count_per_level = 10})
+pk = s:create_index('pk', {run_count_per_level = 10, run_size_ratio = 2})
 ---
 ...
-sk = s:create_index('sk', {run_count_per_level = 10, parts = {2, 'unsigned', 3, 'string'}, unique = false})
+sk = s:create_index('sk', {run_count_per_level = 10, run_size_ratio = 2, parts = {2, 'unsigned', 3, 'string'}, unique = false})
 ---
 ...
 -- Write a run big enough to prevent major compaction from kicking in
@@ -685,13 +682,25 @@ sk = s:create_index('sk', {run_count_per_level = 10, parts = {2, 'unsigned', 3,
 dummy_rows = 100
 ---
 ...
-for i = 1, dummy_rows do s:replace{i + 1000, i + 1000, digest.urandom(100)} end
+pad = string.rep('z', 50 * 1024)
+---
+...
+for i = 1, dummy_rows do s:replace{i + 1000, i + 1000, pad} end
 ---
 ...
 box.snapshot()
 ---
 - ok
 ...
+pk:compact()
+---
+...
+sk:compact()
+---
+...
+while box.stat.vinyl().scheduler.compaction_queue > 0 do fiber.sleep(0.001) end
+---
+...
 pad = string.rep('x', 10 * 1024)
 ---
 ...
diff --git a/test/vinyl/deferred_delete.test.lua b/test/vinyl/deferred_delete.test.lua
index d38802da..93b5b358 100644
--- a/test/vinyl/deferred_delete.test.lua
+++ b/test/vinyl/deferred_delete.test.lua
@@ -252,17 +252,20 @@ test_run:cmd("start server test with args='1048576'")
 test_run:cmd("switch test")

 fiber = require('fiber')
-digest = require('digest')

 s = box.schema.space.create('test', {engine = 'vinyl'})
-pk = s:create_index('pk', {run_count_per_level = 10})
-sk = s:create_index('sk', {run_count_per_level = 10, parts = {2, 'unsigned', 3, 'string'}, unique = false})
+pk = s:create_index('pk', {run_count_per_level = 10, run_size_ratio = 2})
+sk = s:create_index('sk', {run_count_per_level = 10, run_size_ratio = 2, parts = {2, 'unsigned', 3, 'string'}, unique = false})

 -- Write a run big enough to prevent major compaction from kicking in
 -- (run_count_per_level is ignored on the last level - see gh-3657).
 dummy_rows = 100
-for i = 1, dummy_rows do s:replace{i + 1000, i + 1000, digest.urandom(100)} end
+pad = string.rep('z', 50 * 1024)
+for i = 1, dummy_rows do s:replace{i + 1000, i + 1000, pad} end
 box.snapshot()
+pk:compact()
+sk:compact()
+while box.stat.vinyl().scheduler.compaction_queue > 0 do fiber.sleep(0.001) end

 pad = string.rep('x', 10 * 1024)
 for i = 1, 120 do s:replace{i, i, pad} end