[Tarantool-patches] [PATCH v5 1/3] gc/xlog: delay xlog cleanup until relays are subscribed
sergepetrenko at tarantool.org
Fri Mar 26 16:42:00 MSK 2021
26.03.2021 15:06, Cyrill Gorcunov wrote:
> If a replica falls far behind the master node (so that a
> number of xlog files are present after the master's last
> snapshot), then once the master node is restarted it may
> clean up the xlogs the replica needs to subscribe quickly,
> and the replica will instead have to rejoin, rereading
> a large amount of data.
> Let's address this by delaying xlog file cleanup until
> the replicas are subscribed and the relays are up and
> running. To that end, the cleanup fiber starts out
> spinning in a nop cycle ("paused" mode), and a delay
> counter is used to wait until the relays decrement it.
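For readers following the thread, the paused start and the delay counter described above can be sketched roughly as follows. This is only a minimal model under assumptions: the names `cleanup_gate`, `cleanup_gate_init` and `cleanup_gate_relay_subscribed` are hypothetical and are not taken from the patch, and the real code runs inside a fiber rather than on a plain struct.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the paused cleanup state; not the patch's real structures. */
struct cleanup_gate {
	/* True while xlog cleanup must not run. */
	bool is_paused;
	/* Registered replicas whose relays have not subscribed yet. */
	int delay_ref;
};

/* Start paused only if at least one registered replica is expected. */
void
cleanup_gate_init(struct cleanup_gate *gate, int registered_replicas)
{
	assert(registered_replicas >= 0);
	gate->delay_ref = registered_replicas;
	gate->is_paused = registered_replicas > 0;
}

/* Called once per relay when its replica subscribes. */
void
cleanup_gate_relay_subscribed(struct cleanup_gate *gate)
{
	/* Cleanup may already be running, e.g. after a timeout. */
	if (!gate->is_paused)
		return;
	assert(gate->delay_ref > 0);
	/* The last subscribing relay resumes the cleanup. */
	if (--gate->delay_ref == 0)
		gate->is_paused = false;
}
```

With two registered replicas the gate stays paused after the first relay subscribes and opens after the second; with no registered replicas it is never paused.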
> This implies that if the `_cluster` system space is not empty
> upon restart and a registered replica has vanished
> completely and will never come back, then the node
> administrator has to drop this replica from `_cluster`.
> Note that this delayed cleanup start doesn't prevent
> the WAL engine from removing old files if there is no
> space left on the storage device. The WAL will simply
> drop old data without asking.
> Some administrators might not need this functionality
> at all, so we introduce the "wal_cleanup_delay"
> configuration option, which makes it possible to
> enable or disable the delay.
> Closes #5806
> Signed-off-by: Cyrill Gorcunov <gorcunov at gmail.com>
> @TarantoolBot document
> Title: Add wal_cleanup_delay configuration parameter
> The `wal_cleanup_delay` option defines a delay in seconds
> before write-ahead log files (`*.xlog`) start being
> pruned after a node restart.
> This option is ignored if the node is running as
> an anonymous replica (`replication_anon = true`). Similarly,
> if replication is unused and there are no plans to use
> it, this option need not be considered.
> The problem this solves: a node may operate so fast that its
> replicas cannot keep up with its state. If the node is restarted
> at such a moment (for example, due to a power outage), `*.xlog`
> files might be pruned during the restart. As a result, the replicas
> will not find these files on the main node and will have to reread
> all the data, which is a very expensive procedure.
> Since replicas are tracked via the `_cluster` system space, we use
> its content to count subscribed replicas, and once all of them are
> up and running the cleanup procedure is enabled automatically, even
> if `wal_cleanup_delay` has not expired.
> The `wal_cleanup_delay` should be set to:
> - `0` to disable the cleanup delay;
> - `> 0` to wait for the specified number of seconds.
> By default it is set to `14400` seconds (i.e. `4` hours).
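To make the documented semantics concrete, here is a hedged sketch of how the option value, the elapsed time and the subscription state could combine into one decision. The helper `wal_cleanup_may_start` and its parameters are hypothetical names used for illustration only; the patch itself may structure this check differently.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether xlog cleanup may start.
 * delay_sec      - value of wal_cleanup_delay (0 disables the delay)
 * elapsed_sec    - seconds passed since the node restarted
 * pending_relays - registered replicas that have not subscribed yet
 */
bool
wal_cleanup_may_start(double delay_sec, double elapsed_sec, int pending_relays)
{
	assert(delay_sec >= 0 && elapsed_sec >= 0 && pending_relays >= 0);
	/* The delay is disabled: cleanup starts right away. */
	if (delay_sec == 0)
		return true;
	/* All registered replicas are subscribed: no need to wait longer. */
	if (pending_relays == 0)
		return true;
	/* Otherwise wait until the configured delay has expired. */
	return elapsed_sec >= delay_sec;
}
```

Under these assumptions, a node with the default `14400` second delay and one replica still missing keeps its xlogs until four hours have passed, while the same node starts cleanup immediately once that replica subscribes.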
> If a registered replica is lost forever and the timeout is set to
> infinity, the preferred way to enable the cleanup procedure is not to
> set a small timeout value but rather to delete this replica from the
> `_cluster` space manually.
> Note that the option does *not* prevent the WAL engine from removing
> old `*.xlog` files if there is no space left on the storage device;
> the WAL engine can remove them forcibly.
> The current state of the `*.xlog` garbage collector can be found in
> the `box.info.gc()` output. For example:
> ``` Lua
> tarantool> box.info.gc()
> is_paused: false
> ```
> The `is_paused` field shows whether the cleanup fiber is paused.
> .../unreleased/add-wal_cleanup_delay.md | 5 +
> src/box/box.cc | 41 ++++++++
> src/box/box.h | 1 +
> src/box/gc.c | 95 ++++++++++++++++++-
> src/box/gc.h | 36 +++++++
> src/box/lua/cfg.cc | 9 ++
> src/box/lua/info.c | 4 +
> src/box/lua/load_cfg.lua | 5 +
> src/box/relay.cc | 1 +
> src/box/replication.cc | 2 +
> test/app-tap/init_script.result | 1 +
> test/box/admin.result | 2 +
> test/box/cfg.result | 4 +
> test/replication/replica_rejoin.lua | 22 +++++
> test/replication/replica_rejoin.result | 18 +++-
> test/replication/replica_rejoin.test.lua | 11 ++-
> test/vinyl/replica_rejoin.lua | 5 +-
> test/vinyl/replica_rejoin.result | 13 +++
> test/vinyl/replica_rejoin.test.lua | 8 ++
> 19 files changed, 275 insertions(+), 8 deletions(-)
> create mode 100644 changelogs/unreleased/add-wal_cleanup_delay.md
> create mode 100644 test/replication/replica_rejoin.lua
Thanks for the patch! LGTM.