From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
From: Vladimir Davydov
Subject: [PATCH 0/9] Allow to limit size of WAL files
Date: Wed, 28 Nov 2018 19:14:38 +0300
Message-Id:
To: kostja@tarantool.org
Cc: tarantool-patches@freelists.org
List-ID:

Tarantool makes a checkpoint every box.cfg.checkpoint_interval seconds
and keeps the last box.cfg.checkpoint_count checkpoints. It also keeps
all intermediate WAL files. Currently, it isn't possible to set a
checkpoint trigger based on the total size of WAL files, which makes it
difficult to estimate the minimal amount of disk space that needs to be
allotted to a Tarantool instance for storing WALs to eliminate the
possibility of ENOSPC errors. For example, under normal conditions a
Tarantool instance may write 1 GB of WAL files every
box.cfg.checkpoint_interval seconds, and so one may assume that 1 GB
times box.cfg.checkpoint_count should be enough for the WAL partition,
but there's no guarantee it won't write 10 GB between checkpoints when
the load is extreme.

So we've agreed that we must provide users with one more configuration
option that can be used to impose a limit on the total size of WAL
files. The new option is called box.cfg.checkpoint_wal_threshold. Once
the configured threshold is exceeded, the WAL thread notifies the
checkpoint daemon that it's time to make a new checkpoint and delete
old WAL files.

Note that the new option only limits the size of WAL files created
since the last checkpoint, because backup WAL files are not needed for
recovery and can be deleted in case of an emergency ENOSPC.
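
To illustrate the idea of a size-based trigger, here is a toy model in
Python. It is not the code from this series: the class, its fields and
the simulate() helper are hypothetical names, and it assumes that a
checkpoint completes instantly, whereas a real checkpoint takes time
and WAL writes continue while it runs.

```python
MIB = 1024 * 1024


class WalSizeTracker:
    """Toy model of a size-based checkpoint trigger (hypothetical names)."""

    def __init__(self, threshold_bytes):
        self.threshold_bytes = threshold_bytes
        self.bytes_since_checkpoint = 0
        self.peak_bytes = 0
        self.checkpoints_requested = 0

    def on_wal_write(self, nbytes):
        """Account for a WAL write; return True if a checkpoint was requested."""
        self.bytes_since_checkpoint += nbytes
        # Remember the largest amount of WAL data seen between checkpoints.
        self.peak_bytes = max(self.peak_bytes, self.bytes_since_checkpoint)
        if self.bytes_since_checkpoint > self.threshold_bytes:
            self._request_checkpoint()
            return True
        return False

    def _request_checkpoint(self):
        # Assumption: the checkpoint finishes immediately, so the counter
        # of bytes written since the last checkpoint drops back to zero.
        self.checkpoints_requested += 1
        self.bytes_since_checkpoint = 0


def simulate(write_sizes, threshold_bytes):
    """Feed a sequence of write sizes to a tracker and return the tracker."""
    tracker = WalSizeTracker(threshold_bytes)
    for nbytes in write_sizes:
        tracker.on_wal_write(nbytes)
    return tracker


if __name__ == "__main__":
    # A burst of ten 256 MiB writes against a 1 GiB threshold.
    result = simulate([256 * MIB] * 10, 1024 * MIB)
    print("checkpoints requested:", result.checkpoints_requested)
    print("peak MiB between checkpoints:", result.peak_bytes // MIB)
```

In this model the amount of WAL data accumulated between checkpoints
never exceeds the threshold by more than the size of one write,
regardless of how fast the data arrives. With a purely time-based
trigger no such bound exists.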

https://github.com/tarantool/tarantool/issues/1082
https://github.com/tarantool/tarantool/commits/dv/gh-1082-wal-checkpoint-threshold

Vladimir Davydov (9):
  wal: separate checkpoint and flush paths
  wal: remove files needed for recovery from backup checkpoints on ENOSPC
  recovery: restore garbage collector vclock after restart
  gc: run garbage collection in background
  gc: do not use WAL watcher API for deactivating stale consumers
  wal: simplify watcher API
  box: rewrite checkpoint daemon in C
  wal: pass struct instead of vclock to checkpoint methods
  wal: trigger checkpoint if there are too many WALs

 src/box/CMakeLists.txt                  |   1 -
 src/box/box.cc                          | 164 ++++++++++++++++--
 src/box/box.h                           |   2 +
 src/box/gc.c                            | 115 +++++++-----
 src/box/gc.h                            |  58 ++++---
 src/box/lua/cfg.cc                      |  24 +++
 src/box/lua/checkpoint_daemon.lua       | 136 ---------------
 src/box/lua/init.c                      |   2 -
 src/box/lua/load_cfg.lua                |   5 +-
 src/box/recovery.cc                     |  14 +-
 src/box/recovery.h                      |   5 +-
 src/box/relay.cc                        |  12 +-
 src/box/vinyl.c                         |   5 +-
 src/box/wal.c                           | 298 +++++++++++++++++++++-----------
 src/box/wal.h                           | 110 +++++++-----
 test/app-tap/init_script.result         |  87 +++++-----
 test/box/admin.result                   |   2 +
 test/box/cfg.result                     |   4 +
 test/replication/gc_no_space.result     |  62 +++++--
 test/replication/gc_no_space.test.lua   |  30 +++-
 test/xlog/checkpoint_daemon.result      | 145 ----------------
 test/xlog/checkpoint_daemon.test.lua    |  56 ------
 test/xlog/checkpoint_threshold.result   | 112 ++++++++++++
 test/xlog/checkpoint_threshold.test.lua |  62 +++++++
 test/xlog/panic_on_wal_error.result     |   2 +-
 test/xlog/panic_on_wal_error.test.lua   |   2 +-
 test/xlog/suite.ini                     |   2 +-
 27 files changed, 876 insertions(+), 641 deletions(-)
 delete mode 100644 src/box/lua/checkpoint_daemon.lua
 create mode 100644 test/xlog/checkpoint_threshold.result
 create mode 100644 test/xlog/checkpoint_threshold.test.lua

-- 
2.11.0