[Tarantool-patches] [PATCH 1/2] gc/xlog: delay xlog cleanup until relays are subscribed
Serge Petrenko
sergepetrenko at tarantool.org
Fri Mar 19 16:50:38 MSK 2021
18.03.2021 21:41, Cyrill Gorcunov пишет:
> In case if replica managed to be far behind the master node
> (so there are a number of xlog files present after the last
...
> src/box/replication.cc | 2 +
> test/app-tap/init_script.result | 1 +
> test/box/admin.result | 2 +
> test/box/cfg.result | 4 ++
> 10 files changed, 153 insertions(+), 3 deletions(-)
Hi! Thanks for the patch!
Please find 2 minor comments below.
>
> diff --git a/src/box/box.cc b/src/box/box.cc
> index b4a1a5e07..ee017c2dc 100644
> --- a/src/box/box.cc
> +++ b/src/box/box.cc
> @@ -754,6 +754,15 @@ box_check_wal_mode(const char *mode_name)
> return (enum wal_mode) mode;
> }
>
> +static void
> +box_check_wal_cleanup_delay(int tmo)
> +{
> + if (tmo < 0 && tmo != -1) {
> + tnt_raise(ClientError, ER_CFG, "wal_cleanup_delay",
> + "the value must be either >= 0 or -1");
> + }
> +}
> +
1. AFAICS all other delay and timeout parameters are 'double'.
Let's make this one also a double.
> static void
> box_check_readahead(int readahead)
> {
> @@ -899,6 +908,7 @@ box_check_config(void)
> box_check_checkpoint_count(cfg_geti("checkpoint_count"));
> box_check_wal_max_size(cfg_geti64("wal_max_size"));
> box_check_wal_mode(cfg_gets("wal_mode"));
> + box_check_wal_cleanup_delay(cfg_geti("wal_cleanup_delay"));
> if (box_check_memory_quota("memtx_memory") < 0)
> diag_raise();
> box_check_memtx_min_tuple_size(cfg_geti64("memtx_min_tuple_size"));
> @@ -3045,6 +3055,7 @@ box_cfg_xc(void)
> bootstrap(&instance_uuid, &replicaset_uuid,
> &is_bootstrap_leader);
> }
> + gc_delay_unref();
> fiber_gc();
>
> bootstrap_journal_guard.is_active = false;
...
> diff --git a/src/box/relay.cc b/src/box/relay.cc
> index 41f949e8e..5d09690ad 100644
> --- a/src/box/relay.cc
> +++ b/src/box/relay.cc
> @@ -668,6 +668,13 @@ relay_send_is_raft_enabled(struct relay *relay,
> }
> }
>
> +static void
> +relay_gc_delay_unref(struct cmsg *msg)
> +{
> + (void)msg;
> + gc_delay_unref();
> +}
> +
> /**
> * A libev callback invoked when a relay client socket is ready
> * for read. This currently only happens when the client closes
> @@ -721,6 +728,19 @@ relay_subscribe_f(va_list ap)
> */
> relay_send_heartbeat(relay);
>
> + /*
> + * Now we can resume wal/engine gc as relay
> + * is up and running.
> + */
> + if (!relay->replica->anon) {
> + static const struct cmsg_hop gc_delay_route[] = {
> + {relay_gc_delay_unref, NULL}
> + };
> + struct cmsg gc_delay_msg;
> + cmsg_init(&gc_delay_msg, gc_delay_route);
> + cpipe_push(&relay->tx_pipe, &gc_delay_msg);
> + }
> +
2. I agree with Vlad here. No need to call gc_delay_unref() from the
relay thread.
We need to wait until a corresponding gc consumer is registered.
This is done
in relay_subscribe(), which is in tx thread.
In fact, you may as well move the gc_delay_unref() call directly to
gc_consumer_register().
This would look even better, IMO.
> /*
> * Run the event loop until the connection is broken
> * or an error occurs.
> diff --git a/src/box/replication.cc b/src/box/replication.cc
> index 1fa8843e7..aefb812b3 100644
> --- a/src/box/replication.cc
> +++ b/src/box/replication.cc
> @@ -250,6 +250,7 @@ replica_set_id(struct replica *replica, uint32_t replica_id)
> tt_uuid_str(&replica->uuid));
> }
> replicaset.replica_by_id[replica_id] = replica;
> + gc_delay_ref();
> ++replicaset.registered_count;
> say_info("assigned id %d to replica %s",
> replica->id, tt_uuid_str(&replica->uuid));
> @@ -273,6 +274,7 @@ replica_clear_id(struct replica *replica)
> replicaset.replica_by_id[replica->id] = NULL;
> assert(replicaset.registered_count > 0);
> --replicaset.registered_count;
> + gc_delay_unref();
> if (replica->id == instance_id) {
> /* See replica_check_id(). */
> assert(replicaset.is_joining);
>
...
--
Serge Petrenko
More information about the Tarantool-patches
mailing list