* [PATCH] vinyl: remove runs not referenced by any checkpoint immediately
From: Vladimir Davydov @ 2018-05-17 16:09 UTC (permalink / raw)
To: kostja; +Cc: tarantool-patches
If a compacted run was created after the last checkpoint, it is not
needed for recovery from any checkpoint and hence can be deleted right
away to save disk space.
Closes #3407
---
https://github.com/tarantool/tarantool/issues/3407
https://github.com/tarantool/tarantool/commits/gh-3407-vy-remove-unreferenced-runs-immediately
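To illustrate the rule, here is a minimal standalone sketch of the check,
not the actual vinyl code. The struct and all sketch_* names are
hypothetical; only the comparison run->dump_lsn > gc_lsn is taken from the
patch. Assuming dump_lsn is never negative, a negative gc_lsn (no
checkpoint yet) makes every run removable, so the old gc_lsn < 0 special
case appears to be covered by the same comparison.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a compacted run: only what the check needs. */
struct sketch_run {
	int64_t id;
	int64_t dump_lsn; /* LSN at which the run was dumped */
};

/*
 * A run dumped after the last checkpoint (gc_lsn) is not referenced
 * by any checkpoint, so its files may be removed right away.
 */
static bool
sketch_run_is_unreferenced(const struct sketch_run *run, int64_t gc_lsn)
{
	return run->dump_lsn > gc_lsn;
}

/* Count the runs in the array that could be removed immediately. */
static size_t
sketch_count_removable(const struct sketch_run *runs, size_t n,
		       int64_t gc_lsn)
{
	size_t count = 0;
	for (size_t i = 0; i < n; i++) {
		if (sketch_run_is_unreferenced(&runs[i], gc_lsn))
			count++;
	}
	return count;
}
```

A run whose dump_lsn equals gc_lsn is kept, because the comparison is
strict.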
src/box/vy_scheduler.c | 27 +++++++++--------
test/vinyl/gc.result | 78 ++++++++++++++++++++++++++++++++++++++++++++++++++
test/vinyl/gc.test.lua | 35 ++++++++++++++++++++++
3 files changed, 126 insertions(+), 14 deletions(-)
diff --git a/src/box/vy_scheduler.c b/src/box/vy_scheduler.c
index e1853e5d..4c9103cf 100644
--- a/src/box/vy_scheduler.c
+++ b/src/box/vy_scheduler.c
@@ -1139,22 +1139,21 @@ vy_task_compact_complete(struct vy_scheduler *scheduler, struct vy_task *task)
 		return -1;
 	}
 
-	if (gc_lsn < 0) {
-		/*
-		 * If there is no last snapshot, i.e. we are in
-		 * the middle of join, we can delete compacted
-		 * run files right away.
-		 */
-		vy_log_tx_begin();
-		rlist_foreach_entry(run, &unused_runs, in_unused) {
-			if (vy_run_remove_files(index->env->path,
-						index->space_id, index->id,
-						run->id) == 0) {
-				vy_log_forget_run(run->id);
-			}
+	/*
+	 * Remove compacted run files that were created after
+	 * the last checkpoint (and hence are not referenced
+	 * by any checkpoint) immediately to save disk space.
+	 */
+	vy_log_tx_begin();
+	rlist_foreach_entry(run, &unused_runs, in_unused) {
+		if (run->dump_lsn > gc_lsn &&
+		    vy_run_remove_files(index->env->path,
+					index->space_id, index->id,
+					run->id) == 0) {
+			vy_log_forget_run(run->id);
 		}
-		vy_log_tx_try_commit();
 	}
+	vy_log_tx_try_commit();
 
 	/*
 	 * Account the new run if it is not empty,
diff --git a/test/vinyl/gc.result b/test/vinyl/gc.result
index f88b3996..b709135c 100644
--- a/test/vinyl/gc.result
+++ b/test/vinyl/gc.result
@@ -126,3 +126,81 @@ temp:drop()
box.cfg{checkpoint_count = default_checkpoint_count}
---
...
+--
+-- Check that compacted run files that are not referenced
+-- by any checkpoint are deleted immediately (gh-3407).
+--
+test_run:cmd("create server test with script='vinyl/low_quota.lua'")
+---
+- true
+...
+test_run:cmd("start server test with args='1048576'")
+---
+- true
+...
+test_run:cmd('switch test')
+---
+- true
+...
+box.cfg{checkpoint_count = 2}
+---
+...
+fio = require('fio')
+---
+...
+fiber = require('fiber')
+---
+...
+s = box.schema.space.create('test', {engine = 'vinyl'})
+---
+...
+_ = s:create_index('pk', {run_count_per_level = 3})
+---
+...
+function count_runs() return #fio.glob(fio.pathjoin(box.cfg.vinyl_dir, s.id, s.index.pk.id, '*.run')) end
+---
+...
+_ = s:replace{1}
+---
+...
+box.snapshot()
+---
+- ok
+...
+_ = s:replace{2}
+---
+...
+box.snapshot()
+---
+- ok
+...
+count_runs() -- 2
+---
+- 2
+...
+for i = 1, 20 do s:replace{i, string.rep('x', 100 * 1024)} end
+---
+...
+while s.index.pk:info().disk.compact.count < 1 do fiber.sleep(0.001) end
+---
+...
+s.index.pk:info().disk.compact.count -- 1
+---
+- 1
+...
+count_runs() -- 3 (compacted runs created after checkpoint are deleted)
+---
+- 3
+...
+test_run:cmd('switch default')
+---
+- true
+...
+test_run:cmd("stop server test")
+---
+- true
+...
+test_run:cmd("cleanup server test")
+---
+- true
+...
diff --git a/test/vinyl/gc.test.lua b/test/vinyl/gc.test.lua
index 3974048b..32078f00 100644
--- a/test/vinyl/gc.test.lua
+++ b/test/vinyl/gc.test.lua
@@ -61,3 +61,38 @@ files = ls_vylog()
temp:drop()
box.cfg{checkpoint_count = default_checkpoint_count}
+
+--
+-- Check that compacted run files that are not referenced
+-- by any checkpoint are deleted immediately (gh-3407).
+--
+test_run:cmd("create server test with script='vinyl/low_quota.lua'")
+test_run:cmd("start server test with args='1048576'")
+test_run:cmd('switch test')
+
+box.cfg{checkpoint_count = 2}
+
+fio = require('fio')
+fiber = require('fiber')
+
+s = box.schema.space.create('test', {engine = 'vinyl'})
+_ = s:create_index('pk', {run_count_per_level = 3})
+
+function count_runs() return #fio.glob(fio.pathjoin(box.cfg.vinyl_dir, s.id, s.index.pk.id, '*.run')) end
+
+_ = s:replace{1}
+box.snapshot()
+_ = s:replace{2}
+box.snapshot()
+
+count_runs() -- 2
+
+for i = 1, 20 do s:replace{i, string.rep('x', 100 * 1024)} end
+while s.index.pk:info().disk.compact.count < 1 do fiber.sleep(0.001) end
+s.index.pk:info().disk.compact.count -- 1
+
+count_runs() -- 3 (compacted runs created after checkpoint are deleted)
+
+test_run:cmd('switch default')
+test_run:cmd("stop server test")
+test_run:cmd("cleanup server test")
--
2.11.0
* Re: [PATCH] vinyl: remove runs not referenced by any checkpoint immediately
From: Vladimir Davydov @ 2018-05-17 16:35 UTC (permalink / raw)
To: kostja; +Cc: tarantool-patches
On Thu, May 17, 2018 at 07:09:52PM +0300, Vladimir Davydov wrote:
> If a compacted run was created after the last checkpoint, it is not
> needed to recover from any checkpoint and hence can be deleted right
> away to save disk space.
>
> Closes #3407
> ---
> https://github.com/tarantool/tarantool/issues/3407
> https://github.com/tarantool/tarantool/commits/gh-3407-vy-remove-unreferenced-runs-immediately
>
> src/box/vy_scheduler.c | 27 +++++++++--------
> test/vinyl/gc.result | 78 ++++++++++++++++++++++++++++++++++++++++++++++++++
> test/vinyl/gc.test.lua | 35 ++++++++++++++++++++++
> 3 files changed, 126 insertions(+), 14 deletions(-)
>
> diff --git a/src/box/vy_scheduler.c b/src/box/vy_scheduler.c
> index e1853e5d..4c9103cf 100644
> --- a/src/box/vy_scheduler.c
> +++ b/src/box/vy_scheduler.c
> @@ -1139,22 +1139,21 @@ vy_task_compact_complete(struct vy_scheduler *scheduler, struct vy_task *task)
>  		return -1;
>  	}
>  
> -	if (gc_lsn < 0) {
> -		/*
> -		 * If there is no last snapshot, i.e. we are in
> -		 * the middle of join, we can delete compacted
> -		 * run files right away.
> -		 */
> -		vy_log_tx_begin();
> -		rlist_foreach_entry(run, &unused_runs, in_unused) {
> -			if (vy_run_remove_files(index->env->path,
> -						index->space_id, index->id,
> -						run->id) == 0) {
> -				vy_log_forget_run(run->id);
> -			}
> +	/*
> +	 * Remove compacted run files that were created after
> +	 * the last checkpoint (and hence are not referenced
> +	 * by any checkpoint) immediately to save disk space.
> +	 */
> +	vy_log_tx_begin();
> +	rlist_foreach_entry(run, &unused_runs, in_unused) {
> +		if (run->dump_lsn > gc_lsn &&
> +		    vy_run_remove_files(index->env->path,
> +					index->space_id, index->id,
> +					run->id) == 0) {
> +			vy_log_forget_run(run->id);
>  		}
> -		vy_log_tx_try_commit();
>  	}
> +	vy_log_tx_try_commit();
>  
>  	/*
>  	 * Account the new run if it is not empty,
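Oops, this patch breaks the vinyl/errinj_gc test, which doesn't expect
compaction to remove any files if ERRINJ_VY_GC is set. Fixed on the
branch by moving the error injection from vy_gc_cb() into
vy_run_remove_files(), so that compaction honors it too. Here's the
incremental diff: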
diff --git a/src/box/vinyl.c b/src/box/vinyl.c
index 55a68104..552d42ba 100644
--- a/src/box/vinyl.c
+++ b/src/box/vinyl.c
@@ -3389,10 +3389,6 @@ vy_gc_cb(const struct vy_log_record *record, void *cb_arg)
 		goto out;
 	}
 
-	ERROR_INJECT(ERRINJ_VY_GC,
-		     {say_error("error injection: vinyl run %lld not deleted",
-				(long long)record->run_id); goto out;});
-
 	/* Try to delete files. */
 	if (vy_run_remove_files(arg->env->path, arg->space_id,
 				arg->index_id, record->run_id) != 0)
diff --git a/src/box/vy_run.c b/src/box/vy_run.c
index edae5270..980bc4d2 100644
--- a/src/box/vy_run.c
+++ b/src/box/vy_run.c
@@ -2434,6 +2434,9 @@ int
 vy_run_remove_files(const char *dir, uint32_t space_id,
 		    uint32_t iid, int64_t run_id)
 {
+	ERROR_INJECT(ERRINJ_VY_GC,
+		     {say_error("error injection: vinyl run %lld not deleted",
+				(long long)run_id); return -1;});
 	int ret = 0;
 	char path[PATH_MAX];
 	for (int type = 0; type < vy_file_MAX; type++) {
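The effect of moving the injection can be shown with a small standalone
sketch. It is only an illustration under stated assumptions: the flag, the
counter and all sketch_* functions are hypothetical stand-ins, not the
real ERROR_INJECT machinery or vinyl functions. The point is that once the
check lives in the file removal helper, both the GC path and the
compaction path see the injected failure.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the ERRINJ_VY_GC switch. */
static bool sketch_errinj_gc = false;

/* Number of simulated run removals that actually happened. */
static int sketch_removed = 0;

/* Stand-in for the removal helper: the injection is checked here. */
static int
sketch_run_remove_files(int64_t run_id)
{
	(void)run_id;
	if (sketch_errinj_gc)
		return -1; /* injected failure: nothing is deleted */
	sketch_removed++;
	return 0;
}

/* GC path: simply reports whether the removal succeeded. */
static int
sketch_gc(int64_t run_id)
{
	return sketch_run_remove_files(run_id);
}

/*
 * Compaction path: returns 1 if the run was removed and can be
 * forgotten, 0 if it is left for later garbage collection.
 */
static int
sketch_compact_complete(int64_t run_id, int64_t dump_lsn, int64_t gc_lsn)
{
	if (dump_lsn > gc_lsn && sketch_run_remove_files(run_id) == 0)
		return 1;
	return 0;
}
```

With the flag set, neither path removes anything, which matches what the
commit text says vinyl/errinj_gc expects.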
* Re: [PATCH] vinyl: remove runs not referenced by any checkpoint immediately
From: Konstantin Osipov @ 2018-05-17 20:40 UTC (permalink / raw)
To: Vladimir Davydov; +Cc: tarantool-patches
* Vladimir Davydov <vdavydov.dev@gmail.com> [18/05/17 19:13]:
> If a compacted run was created after the last checkpoint, it is not
> needed to recover from any checkpoint and hence can be deleted right
> away to save disk space.
Thank you for the patch.
This will hopefully solve the space amplification problem at
cloud.mail.ru.
>
> Closes #3407
> ---
> https://github.com/tarantool/tarantool/issues/3407
> https://github.com/tarantool/tarantool/commits/gh-3407-vy-remove-unreferenced-runs-immediately
>
--
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov
* Re: [PATCH] vinyl: remove runs not referenced by any checkpoint immediately
From: Konstantin Osipov @ 2018-05-17 22:39 UTC (permalink / raw)
To: Vladimir Davydov; +Cc: tarantool-patches
* Konstantin Osipov <kostja@tarantool.org> [18/05/17 23:40]:
> * Vladimir Davydov <vdavydov.dev@gmail.com> [18/05/17 19:13]:
> > If a compacted run was created after the last checkpoint, it is not
> > needed to recover from any checkpoint and hence can be deleted right
> > away to save disk space.
>
> Thank you for the patch.
>
> This will hopefully solve the space amplification problem at
> cloud.mail.ru.
I was not able to push my merge into 1.10. Apparently there are
more tests that rely on the error injection you moved.
Please merge.
--
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov