From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
From: Vladimir Davydov
Subject: [PATCH 00/13] box: garbage collection refactoring and fixes
Date: Thu, 4 Oct 2018 20:20:02 +0300
Message-Id:
To: kostja@tarantool.org
Cc: tarantool-patches@freelists.org
List-ID:

Commit 9c5d851d7830 ("replication: remove old snapshot files not needed
by replicas") introduced gc_consumer types so that a consumer could pin
either WALs or checkpoints, not necessarily both. This makes sense,
because a replica doesn't need to pin any checkpoints. However, the
implementation looks rather dubious: consumers of all kinds are stored
in the same binary search tree, so to find the consumer that needs the
oldest checkpoint we have to scan this tree linearly, which is
inefficient and ugly (see gc_tree_first_checkpoint). This also
complicates further work on the garbage collector, in particular
auto-deletion of WAL files on ENOSPC (#3397) and persistent garbage
collector state (#3442).

So this patch set separates WAL consumers from checkpoint references:
gc_consumer can now only be used to pin WALs, while to pin a checkpoint
one has to use gc_checkpoint_ref, which has a more lightweight API and
implementation (e.g. it has no "advance" method, because it doesn't
make sense to advance a checkpoint consumer). Along the way, it does
some related cleanups and fixes bug #3708, which was also introduced by
the above-mentioned commit.
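To illustrate the idea of the split, here is a minimal standalone sketch.
It is not the code from this patch set: every name in it (gc_sketch,
checkpoint_sketch, the gc_sketch_* helpers, the fixed-size arrays) is
hypothetical and only models the concept. Checkpoints are kept oldest
first and pinned by a plain reference counter, WAL consumers are kept
apart from them, so the oldest pinned checkpoint is found without
scanning the WAL consumers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { SKETCH_MAX = 8 };

/* Hypothetical checkpoint record: pinned while refs > 0. */
struct checkpoint_sketch {
	int64_t signature;
	int refs;
};

/* Hypothetical GC state: checkpoints and WAL consumers kept apart. */
struct gc_sketch {
	struct checkpoint_sketch checkpoints[SKETCH_MAX]; /* oldest first */
	size_t checkpoint_count;
	size_t min_checkpoint_count;   /* checkpoints that are always kept */
	int64_t consumers[SKETCH_MAX]; /* WAL position each consumer needs */
	size_t consumer_count;
};

/* Record a new checkpoint; returns its index or -1 if there is no room. */
static int
gc_sketch_add_checkpoint(struct gc_sketch *gc, int64_t signature)
{
	if (gc->checkpoint_count == SKETCH_MAX)
		return -1;
	gc->checkpoints[gc->checkpoint_count].signature = signature;
	gc->checkpoints[gc->checkpoint_count].refs = 0;
	return (int)gc->checkpoint_count++;
}

/* Pin (delta = 1) or unpin (delta = -1) a checkpoint by signature. */
static int
gc_sketch_ref_checkpoint(struct gc_sketch *gc, int64_t signature, int delta)
{
	assert(delta == 1 || delta == -1);
	for (size_t i = 0; i < gc->checkpoint_count; i++) {
		if (gc->checkpoints[i].signature == signature) {
			gc->checkpoints[i].refs += delta;
			return 0;
		}
	}
	return -1;
}

/* Register a WAL consumer; returns its slot or -1 if there is no room. */
static int
gc_sketch_register_consumer(struct gc_sketch *gc, int64_t signature)
{
	if (gc->consumer_count == SKETCH_MAX)
		return -1;
	gc->consumers[gc->consumer_count] = signature;
	return (int)gc->consumer_count++;
}

/*
 * Drop unreferenced checkpoints from the old end while more than
 * min_checkpoint_count remain, then return the signature below which
 * WAL files are needed by neither a checkpoint nor a consumer.
 */
static int64_t
gc_sketch_run(struct gc_sketch *gc)
{
	size_t drop = 0;
	while (gc->checkpoint_count - drop > gc->min_checkpoint_count &&
	       gc->checkpoints[drop].refs == 0)
		drop++;
	for (size_t i = drop; i < gc->checkpoint_count; i++)
		gc->checkpoints[i - drop] = gc->checkpoints[i];
	gc->checkpoint_count -= drop;

	int64_t wal_bound = INT64_MAX;
	if (gc->checkpoint_count > 0)
		wal_bound = gc->checkpoints[0].signature;
	for (size_t i = 0; i < gc->consumer_count; i++) {
		if (gc->consumers[i] < wal_bound)
			wal_bound = gc->consumers[i];
	}
	return wal_bound;
}
```

In this model a checkpoint reference has nothing to advance: it is taken
with delta = 1 and released with delta = -1, while a WAL consumer is
moved forward by updating its slot in consumers[]. The arrays are used
only to keep the sketch short; the actual containers may well differ.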

https://github.com/tarantool/tarantool/issues/3708
https://github.com/tarantool/tarantool/tree/dv/gh-3708-box-gc-fixes

Vladimir Davydov (13):
  vinyl: fix master crash on replica join failure
  vinyl: force deletion of runs left from unfinished indexes on restart
  gc: make gc_consumer and gc_state structs transparent
  gc: use fixed length buffer for storing consumer name
  gc: fold gc_consumer_new and gc_consumer_delete
  gc: format consumer name in gc_consumer_register
  gc: rename checkpoint_count to min_checkpoint_count
  gc: keep track of available checkpoints
  gc: cleanup garbage collection procedure
  gc: improve box.info.gc output
  gc: separate checkpoint references from wal consumers
  gc: call gc_run unconditionally when consumer is advanced
  replication: ref checkpoint needed to join replica

 src/box/CMakeLists.txt             |   1 -
 src/box/box.cc                     | 103 ++++++------
 src/box/checkpoint.c               |  72 ---------
 src/box/checkpoint.h               |  97 ------------
 src/box/gc.c                       | 312 ++++++++++++++++---------------------
 src/box/gc.h                       | 187 +++++++++++++++++-----
 src/box/lua/info.c                 |  44 ++++--
 src/box/memtx_engine.c             |   7 +
 src/box/relay.cc                   |   5 +-
 src/box/vinyl.c                    |  11 +-
 src/box/vy_scheduler.c             |   1 -
 test/replication/gc.result         |  19 ++-
 test/replication/gc.test.lua       |   8 +-
 test/vinyl/errinj.result           |  51 ++++++
 test/vinyl/errinj.test.lua         |  18 +++
 test/vinyl/errinj_gc.result        |   4 -
 test/vinyl/errinj_gc.test.lua      |   1 -
 test/vinyl/replica_rejoin.result   |   7 -
 test/vinyl/replica_rejoin.test.lua |   2 -
 19 files changed, 480 insertions(+), 470 deletions(-)
 delete mode 100644 src/box/checkpoint.c
 delete mode 100644 src/box/checkpoint.h

--
2.11.0