Tarantool development patches archive
* [Tarantool-patches] [RFC] gc/xlog: delay xlog cleanup until relays are subscribed
@ 2021-03-17 18:57 Cyrill Gorcunov via Tarantool-patches
  2021-03-17 18:59 ` Cyrill Gorcunov via Tarantool-patches
  2021-03-17 21:23 ` Konstantin Osipov via Tarantool-patches
  0 siblings, 2 replies; 16+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-03-17 18:57 UTC (permalink / raw)
  To: tml; +Cc: Mons Anderson, Vladislav Shpilevoy

If a replica falls far behind the master node (so that
a number of xlog files are present after the last
master snapshot), then once the master node gets
restarted it may clean up the xlogs needed by the
replica for a fast subscribe, and instead the replica
will have to rejoin, reading a large amount of data back.

Let's address this by delaying xlog file cleanup
until the replicas have subscribed and the relays are
up and running. For this sake we start the cleanup
fiber spinning in a nop cycle ("paused" mode) and use
a delay counter which the relays decrement once they
subscribe.

This implies that if the `_cluster` system space is not
empty upon restart and a registered replica has somehow
vanished completely and won't ever come back, then the
node administrator has to drop this replica from
`_cluster` manually.

Note that this delayed cleanup start doesn't prevent
the WAL engine from removing old files if there is no
space left on the storage device. The WAL will simply
drop old data without asking.

Closes #5806

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
Guys, take a look please. This is an RFC on purpose: I would
like to gather more comments. For instance, should we provide
an option to force the node to start with cleanup enabled, etc.?
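
Just to illustrate the administrator's side of it, here is a
sketch assuming this patch is applied (the new box.info.gc()
fields come from the info.c hunk below; the replica id 2 is
hypothetical):

  -- Tarantool console, on the restarted master.
  local gc = box.info.gc()
  -- cleanup_is_paused stays true until every replica registered
  -- in `_cluster` has subscribed; delay_ref counts the replicas
  -- we are still waiting for.
  if gc.cleanup_is_paused and gc.delay_ref > 0 then
      -- If a registered replica is gone for good, drop it manually;
      -- this decrements the delay counter and resumes the cleanup.
      box.space._cluster:delete(2)
  end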

 src/box/box.cc         |  1 +
 src/box/gc.c           | 61 +++++++++++++++++++++++++++++++++++++++---
 src/box/gc.h           | 26 ++++++++++++++++++
 src/box/lua/info.c     |  8 ++++++
 src/box/relay.cc       | 23 +++++++++++++---
 src/box/replication.cc |  2 ++
 6 files changed, 115 insertions(+), 6 deletions(-)

diff --git a/src/box/box.cc b/src/box/box.cc
index b4a1a5e07..a39f56cd9 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -3045,6 +3045,7 @@ box_cfg_xc(void)
 		bootstrap(&instance_uuid, &replicaset_uuid,
 			  &is_bootstrap_leader);
 	}
+	gc_delay_unref();
 	fiber_gc();
 
 	bootstrap_journal_guard.is_active = false;
diff --git a/src/box/gc.c b/src/box/gc.c
index 9af4ef958..1b1093dd7 100644
--- a/src/box/gc.c
+++ b/src/box/gc.c
@@ -46,6 +46,7 @@
 #include <small/rlist.h>
 #include <tarantool_ev.h>
 
+#include "cfg.h"
 #include "diag.h"
 #include "errcode.h"
 #include "fiber.h"
@@ -107,6 +108,18 @@ gc_init(void)
 	/* Don't delete any files until recovery is complete. */
 	gc.min_checkpoint_count = INT_MAX;
 
+	gc.delay_ref = 0;
+	if (cfg_geti("replication_anon") != 0) {
+		/*
+		 * Anonymous replicas are read-only, so
+		 * there is no need to keep XLOGs for them.
+		 */
+		gc.cleanup_is_paused = false;
+	} else {
+		say_info("gc: wal/engine cleanup is paused");
+		gc.cleanup_is_paused = true;
+	}
+
 	vclock_create(&gc.vclock);
 	rlist_create(&gc.checkpoints);
 	gc_tree_new(&gc.consumers);
@@ -238,6 +251,23 @@ static int
 gc_cleanup_fiber_f(va_list ap)
 {
 	(void)ap;
+
+	/*
+	 * Stage 1 (optional): if we are booting up with
+	 * cleanup paused, wait in a separate cycle to
+	 * minimize branching in stage 2.
+	 */
+	if (gc.cleanup_is_paused) {
+		while (!fiber_is_cancelled()) {
+			fiber_sleep(TIMEOUT_INFINITY);
+			if (!gc.cleanup_is_paused)
+				break;
+		}
+	}
+
+	/*
+	 * Stage 2: a regular cleanup cycle.
+	 */
 	while (!fiber_is_cancelled()) {
 		int64_t delta = gc.cleanup_scheduled - gc.cleanup_completed;
 		if (delta == 0) {
@@ -253,6 +283,29 @@ gc_cleanup_fiber_f(va_list ap)
 	return 0;
 }
 
+void
+gc_delay_ref(void)
+{
+	if (gc.cleanup_is_paused) {
+		assert(gc.delay_ref >= 0);
+		gc.delay_ref++;
+	}
+}
+
+void
+gc_delay_unref(void)
+{
+	if (gc.cleanup_is_paused) {
+		assert(gc.delay_ref > 0);
+		gc.delay_ref--;
+		if (gc.delay_ref == 0) {
+			say_info("gc: wal/engine cleanup is resumed");
+			gc.cleanup_is_paused = false;
+			fiber_wakeup(gc.cleanup_fiber);
+		}
+	}
+}
+
 /**
  * Trigger asynchronous garbage collection.
  */
@@ -278,9 +331,11 @@ gc_schedule_cleanup(void)
 static void
 gc_wait_cleanup(void)
 {
-	int64_t scheduled = gc.cleanup_scheduled;
-	while (gc.cleanup_completed < scheduled)
-		fiber_cond_wait(&gc.cleanup_cond);
+	if (!gc.cleanup_is_paused) {
+		int64_t scheduled = gc.cleanup_scheduled;
+		while (gc.cleanup_completed < scheduled)
+			fiber_cond_wait(&gc.cleanup_cond);
+	}
 }
 
 void
diff --git a/src/box/gc.h b/src/box/gc.h
index 2a568c5f9..780da6540 100644
--- a/src/box/gc.h
+++ b/src/box/gc.h
@@ -147,6 +147,20 @@ struct gc_state {
 	 * taken at that moment of time.
 	 */
 	int64_t cleanup_completed, cleanup_scheduled;
+	/**
+	 * A counter to wait until all replicas have managed
+	 * to subscribe, so that we can enable the cleanup
+	 * fiber to remove old XLOGs. Otherwise some replicas
+	 * might be far behind the master node, and after the
+	 * master node has been restarted they would have to
+	 * reread all data back due to XlogGapError, i.e. too
+	 * early deleted XLOGs.
+	 */
+	int64_t delay_ref;
+	/**
+	 * When set, the cleanup fiber is paused.
+	 */
+	bool cleanup_is_paused;
 	/**
 	 * Set if there's a fiber making a checkpoint right now.
 	 */
@@ -206,6 +220,18 @@ gc_init(void);
 void
 gc_free(void);
 
+/**
+ * Increment a reference to the delay counter.
+ */
+void
+gc_delay_ref(void);
+
+/**
+ * Decrement a reference from the delay counter.
+ */
+void
+gc_delay_unref(void);
+
 /**
  * Advance the garbage collector vclock to the given position.
  * Deactivate WAL consumers that need older data.
diff --git a/src/box/lua/info.c b/src/box/lua/info.c
index c4c9fa0a0..3230b5bbb 100644
--- a/src/box/lua/info.c
+++ b/src/box/lua/info.c
@@ -445,6 +445,14 @@ lbox_info_gc_call(struct lua_State *L)
 	lua_pushboolean(L, gc.checkpoint_is_in_progress);
 	lua_settable(L, -3);
 
+	lua_pushstring(L, "cleanup_is_paused");
+	lua_pushboolean(L, gc.cleanup_is_paused);
+	lua_settable(L, -3);
+
+	lua_pushstring(L, "delay_ref");
+	luaL_pushint64(L, gc.delay_ref);
+	lua_settable(L, -3);
+
 	lua_pushstring(L, "checkpoints");
 	lua_newtable(L);
 
diff --git a/src/box/relay.cc b/src/box/relay.cc
index 41f949e8e..ee2e1fec8 100644
--- a/src/box/relay.cc
+++ b/src/box/relay.cc
@@ -668,6 +668,13 @@ relay_send_is_raft_enabled(struct relay *relay,
 	}
 }
 
+static void
+relay_gc_delay_unref(struct cmsg *msg)
+{
+	(void)msg;
+	gc_delay_unref();
+}
+
 /**
  * A libev callback invoked when a relay client socket is ready
  * for read. This currently only happens when the client closes
@@ -721,6 +728,19 @@ relay_subscribe_f(va_list ap)
 	 */
 	relay_send_heartbeat(relay);
 
+	/*
+	 * Now we can resume the wal/engine gc as the
+	 * relay is up and running.
+	 */
+	if (!relay->replica->anon) {
+		static const struct cmsg_hop gc_delay_route[] = {
+			{relay_gc_delay_unref, NULL}
+		};
+		struct cmsg gc_delay_msg;
+		cmsg_init(&gc_delay_msg, gc_delay_route);
+		cpipe_push(&relay->tx_pipe, &gc_delay_msg);
+	}
+
 	/*
 	 * Run the event loop until the connection is broken
 	 * or an error occurs.
@@ -771,9 +791,6 @@ relay_subscribe_f(va_list ap)
 		cpipe_push(&relay->tx_pipe, &relay->status_msg.msg);
 	}
 
-	if (!relay->replica->anon)
-		relay_send_is_raft_enabled(relay, &raft_enabler, false);
-
 	/*
 	 * Clear garbage collector trigger and WAL watcher.
 	 * trigger_clear() does nothing in case the triggers
diff --git a/src/box/replication.cc b/src/box/replication.cc
index 1fa8843e7..aefb812b3 100644
--- a/src/box/replication.cc
+++ b/src/box/replication.cc
@@ -250,6 +250,7 @@ replica_set_id(struct replica *replica, uint32_t replica_id)
 						   tt_uuid_str(&replica->uuid));
 	}
 	replicaset.replica_by_id[replica_id] = replica;
+	gc_delay_ref();
 	++replicaset.registered_count;
 	say_info("assigned id %d to replica %s",
 		 replica->id, tt_uuid_str(&replica->uuid));
@@ -273,6 +274,7 @@ replica_clear_id(struct replica *replica)
 	replicaset.replica_by_id[replica->id] = NULL;
 	assert(replicaset.registered_count > 0);
 	--replicaset.registered_count;
+	gc_delay_unref();
 	if (replica->id == instance_id) {
 		/* See replica_check_id(). */
 		assert(replicaset.is_joining);
-- 
2.30.2



* Re: [Tarantool-patches] [RFC] gc/xlog: delay xlog cleanup until relays are subscribed
  2021-03-17 18:57 [Tarantool-patches] [RFC] gc/xlog: delay xlog cleanup until relays are subscribed Cyrill Gorcunov via Tarantool-patches
@ 2021-03-17 18:59 ` Cyrill Gorcunov via Tarantool-patches
  2021-03-17 21:23 ` Konstantin Osipov via Tarantool-patches
  1 sibling, 0 replies; 16+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-03-17 18:59 UTC (permalink / raw)
  To: tml; +Cc: Mons Anderson, Vladislav Shpilevoy

On Wed, Mar 17, 2021 at 09:57:43PM +0300, Cyrill Gorcunov wrote:
> @@ -771,9 +791,6 @@ relay_subscribe_f(va_list ap)
>  		cpipe_push(&relay->tx_pipe, &relay->status_msg.msg);
>  	}
>  
> -	if (!relay->replica->anon)
> -		relay_send_is_raft_enabled(relay, &raft_enabler, false);
> -

Ignore this hunk, it sneaked in accidentally :-)


* Re: [Tarantool-patches] [RFC] gc/xlog: delay xlog cleanup until relays are subscribed
  2021-03-17 18:57 [Tarantool-patches] [RFC] gc/xlog: delay xlog cleanup until relays are subscribed Cyrill Gorcunov via Tarantool-patches
  2021-03-17 18:59 ` Cyrill Gorcunov via Tarantool-patches
@ 2021-03-17 21:23 ` Konstantin Osipov via Tarantool-patches
  2021-03-17 21:53   ` Cyrill Gorcunov via Tarantool-patches
  1 sibling, 1 reply; 16+ messages in thread
From: Konstantin Osipov via Tarantool-patches @ 2021-03-17 21:23 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml, Mons Anderson, Vladislav Shpilevoy

* Cyrill Gorcunov via Tarantool-patches <tarantool-patches@dev.tarantool.org> [21/03/17 22:01]:
> If a replica falls far behind the master node (so that
> a number of xlog files are present after the last
> master snapshot), then once the master node gets
> restarted it may clean up the xlogs needed by the
> replica for a fast subscribe, and instead the replica
> will have to rejoin, reading a large amount of data back.

Should be in 1.10 as well.

-- 
Konstantin Osipov, Moscow, Russia


* Re: [Tarantool-patches] [RFC] gc/xlog: delay xlog cleanup until relays are subscribed
  2021-03-17 21:23 ` Konstantin Osipov via Tarantool-patches
@ 2021-03-17 21:53   ` Cyrill Gorcunov via Tarantool-patches
  2021-03-18  5:18     ` Konstantin Osipov via Tarantool-patches
  0 siblings, 1 reply; 16+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-03-17 21:53 UTC (permalink / raw)
  To: Konstantin Osipov; +Cc: Mons Anderson, tml, Vladislav Shpilevoy

On Thu, Mar 18, 2021 at 12:23:50AM +0300, Konstantin Osipov wrote:
> * Cyrill Gorcunov via Tarantool-patches <tarantool-patches@dev.tarantool.org> [21/03/17 22:01]:
>
> > If a replica falls far behind the master node (so that
> > a number of xlog files are present after the last
> > master snapshot), then once the master node gets
> > restarted it may clean up the xlogs needed by the
> > replica for a fast subscribe, and instead the replica
> > will have to rejoin, reading a large amount of data back.
> 
> Should be in 1.10 as well.

Sure, but first I need to be sure the current RFC is
acceptable in general and that I didn't miss something. I
suspect we might need to extend this code, and it is better
not to make design mistakes which are gonna be hard to
resolve later. I have a test case for this, I simply haven't
posted it yet.


* Re: [Tarantool-patches] [RFC] gc/xlog: delay xlog cleanup until relays are subscribed
  2021-03-17 21:53   ` Cyrill Gorcunov via Tarantool-patches
@ 2021-03-18  5:18     ` Konstantin Osipov via Tarantool-patches
  2021-03-18  7:41       ` Cyrill Gorcunov via Tarantool-patches
  0 siblings, 1 reply; 16+ messages in thread
From: Konstantin Osipov via Tarantool-patches @ 2021-03-18  5:18 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: Mons Anderson, tml, Vladislav Shpilevoy

* Cyrill Gorcunov <gorcunov@gmail.com> [21/03/18 00:54]:
> On Thu, Mar 18, 2021 at 12:23:50AM +0300, Konstantin Osipov wrote:
> > * Cyrill Gorcunov via Tarantool-patches <tarantool-patches@dev.tarantool.org> [21/03/17 22:01]:
> >
> > > If a replica falls far behind the master node (so that
> > > a number of xlog files are present after the last
> > > master snapshot), then once the master node gets
> > > restarted it may clean up the xlogs needed by the
> > > replica for a fast subscribe, and instead the replica
> > > will have to rejoin, reading a large amount of data back.
> > 
> > Should be in 1.10 as well.
> 
> Sure, but first I need to be sure the current RFC is
> acceptable in general and that I didn't miss something. I
> suspect we might need to extend this code, and it is better
> not to make design mistakes which are gonna be hard to
> resolve later. I have a test case for this, I simply haven't
> posted it yet.

This is a case where a configuration setting would be suitable. I
don't think it's necessary to keep these logs forever by default;
4-6 hours should be a good default for many setups. Some will set
the value to, say, 5 minutes, essentially to keep the old
behaviour, and some can set it to infinity, to get the current
behaviour of the patch.

All Tarantool settings follow the convention of starting with the
subsystem name, so it has got to be wal_{something}, e.g.
wal_keep_logs.
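
Something like this, purely as a sketch (the option does not
exist yet; its name and default are exactly what is being
discussed here):

  -- hypothetical setting, nothing is implemented yet
  box.cfg{
      wal_keep_logs = 4 * 3600, -- seconds; keep xlogs for up to 4 hours
  }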

-- 
Konstantin Osipov, Moscow, Russia


* Re: [Tarantool-patches] [RFC] gc/xlog: delay xlog cleanup until relays are subscribed
  2021-03-18  5:18     ` Konstantin Osipov via Tarantool-patches
@ 2021-03-18  7:41       ` Cyrill Gorcunov via Tarantool-patches
  2021-03-18  7:51         ` Konstantin Osipov via Tarantool-patches
  0 siblings, 1 reply; 16+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-03-18  7:41 UTC (permalink / raw)
  To: Konstantin Osipov; +Cc: Mons Anderson, tml, Vladislav Shpilevoy

On Thu, Mar 18, 2021 at 08:18:12AM +0300, Konstantin Osipov wrote:
> > > 
> > > Should be in 1.10 as well.
> > 
> > Sure, but first I need to be sure the current RFC is
> > acceptable in general and that I didn't miss something. I
> > suspect we might need to extend this code, and it is better
> > not to make design mistakes which are gonna be hard to
> > resolve later. I have a test case for this, I simply haven't
> > posted it yet.
> 
> This is a case where a configuration setting would be suitable. I
> don't think it's necessary to keep these logs forever by default;
> 4-6 hours should be a good default for many setups. Some will set
> the value to, say, 5 minutes, essentially to keep the old
> behaviour, and some can set it to infinity, to get the current
> behaviour of the patch.
> 
> All Tarantool settings follow the convention of starting with the
> subsystem name, so it has got to be wal_{something}, e.g.
> wal_keep_logs.

Good point, thanks! Kostya, let's clarify some moments:

1) We introduce a "wal_keep_logs" option which defines
   a timeout after which to kick the cleanup fiber.
2) If the node is an anonymous replica we simply ignore
   this option.
3) If this option is set then we have a few subcases:
   a) The `_cluster` space is not empty, thus once
      all replicas are subscribed _before_ the timeout
      expires we trigger the cleanup fiber since it is
      safe to proceed;
   b) If the replicas are not connected and the timeout
      has expired we kick the cleanup fiber.

Or do you mean to always consider the "wal_keep_logs" option
and never trigger the cleanup until it has expired?
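
In code terms, the semantics of (1)-(3) would be roughly the
following Lua-style sketch (purely illustrative, the real code
is in C and all the names here are hypothetical):

  local fiber = require('fiber')

  -- Decide whether the paused cleanup fiber may be resumed.
  local function may_resume_cleanup(state, wal_keep_logs, started_at)
      if state.is_anon_replica then
          return true -- case 2: anonymous replica, the option is ignored
      end
      if state.delay_ref == 0 then
          return true -- case 3a: all registered replicas subscribed in time
      end
      -- case 3b: the timeout expired while some replicas never connected
      return fiber.clock() - started_at >= wal_keep_logs
  end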


* Re: [Tarantool-patches] [RFC] gc/xlog: delay xlog cleanup until relays are subscribed
  2021-03-18  7:41       ` Cyrill Gorcunov via Tarantool-patches
@ 2021-03-18  7:51         ` Konstantin Osipov via Tarantool-patches
  2021-03-18  7:56           ` Cyrill Gorcunov via Tarantool-patches
  2021-03-18 20:36           ` Vladislav Shpilevoy via Tarantool-patches
  0 siblings, 2 replies; 16+ messages in thread
From: Konstantin Osipov via Tarantool-patches @ 2021-03-18  7:51 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: Mons Anderson, tml, Vladislav Shpilevoy

* Cyrill Gorcunov <gorcunov@gmail.com> [21/03/18 10:42]:
> Good point, thanks! Kostya, let's clarify some moments:
> 
> 1) We introduce a "wal_keep_logs" option which defines
>    a timeout after which to kick the cleanup fiber.
> 2) If the node is an anonymous replica we simply ignore
>    this option.
> 3) If this option is set then we have a few subcases:
>    a) The `_cluster` space is not empty, thus once
>       all replicas are subscribed _before_ the timeout
>       expires we trigger the cleanup fiber since it is
>       safe to proceed;
>    b) If the replicas are not connected and the timeout
>       has expired we kick the cleanup fiber.

I mean this.

> Or do you mean to always consider the "wal_keep_logs" option
> and never trigger the cleanup until it has expired?

No.

-- 
Konstantin Osipov, Moscow, Russia


* Re: [Tarantool-patches] [RFC] gc/xlog: delay xlog cleanup until relays are subscribed
  2021-03-18  7:51         ` Konstantin Osipov via Tarantool-patches
@ 2021-03-18  7:56           ` Cyrill Gorcunov via Tarantool-patches
  2021-03-18 20:36           ` Vladislav Shpilevoy via Tarantool-patches
  1 sibling, 0 replies; 16+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-03-18  7:56 UTC (permalink / raw)
  To: Konstantin Osipov; +Cc: Mons Anderson, tml, Vladislav Shpilevoy

On Thu, Mar 18, 2021 at 10:51:27AM +0300, Konstantin Osipov wrote:
> > 3) If this option is set then we have a few subcases:
> >    a) The `_cluster` space is not empty, thus once
> >       all replicas are subscribed _before_ the timeout
> >       expires we trigger the cleanup fiber since it is
> >       safe to proceed;
> >    b) If the replicas are not connected and the timeout
> >       has expired we kick the cleanup fiber.
> 
> I mean this.

Thanks a lot! I'll make it so then.


* Re: [Tarantool-patches] [RFC] gc/xlog: delay xlog cleanup until relays are subscribed
  2021-03-18  7:51         ` Konstantin Osipov via Tarantool-patches
  2021-03-18  7:56           ` Cyrill Gorcunov via Tarantool-patches
@ 2021-03-18 20:36           ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-18 20:45             ` Cyrill Gorcunov via Tarantool-patches
  2021-03-19  7:40             ` Konstantin Osipov via Tarantool-patches
  1 sibling, 2 replies; 16+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-18 20:36 UTC (permalink / raw)
  To: Konstantin Osipov, Cyrill Gorcunov, tml, Mons Anderson,
	Serge Petrenko, Kirill Yukhin

On 18.03.2021 08:51, Konstantin Osipov via Tarantool-patches wrote:
> * Cyrill Gorcunov <gorcunov@gmail.com> [21/03/18 10:42]:
>> Good point, thanks! Kostya, let's clarify some moments:
>>
>> 1) We introduce a "wal_keep_logs" option which defines
>>    a timeout after which to kick the cleanup fiber.
>> 2) If the node is an anonymous replica we simply ignore
>>    this option.
>> 3) If this option is set then we have a few subcases:
>>    a) The `_cluster` space is not empty, thus once
>>       all replicas are subscribed _before_ the timeout
>>       expires we trigger the cleanup fiber since it is
>>       safe to proceed;
>>    b) If the replicas are not connected and the timeout
>>       has expired we kick the cleanup fiber.
> 
> I mean this.

Then it should have the 'replication_' prefix, not 'wal_',
because it is ignored if replicas connect before the timeout expires.


* Re: [Tarantool-patches] [RFC] gc/xlog: delay xlog cleanup until relays are subscribed
  2021-03-18 20:36           ` Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-18 20:45             ` Cyrill Gorcunov via Tarantool-patches
  2021-03-18 20:54               ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-19  7:40             ` Konstantin Osipov via Tarantool-patches
  1 sibling, 1 reply; 16+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-03-18 20:45 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: Mons Anderson, tml

On Thu, Mar 18, 2021 at 09:36:33PM +0100, Vladislav Shpilevoy wrote:
> >>    b) If the replicas are not connected and the timeout
> >>       has expired we kick the cleanup fiber.
> > 
> > I mean this.
> 
> Then it should have the 'replication_' prefix, not 'wal_',
> because it is ignored if replicas connect before the timeout expires.

Replication is one of the reasons, while the main player is "wal"
here. In the series I sent recently I named it "wal_cleanup_delay".
In the future we might introduce some topology detector, as you've
been suggesting, so better not to stick to the "replication"
prefix, I think.


* Re: [Tarantool-patches] [RFC] gc/xlog: delay xlog cleanup until relays are subscribed
  2021-03-18 20:45             ` Cyrill Gorcunov via Tarantool-patches
@ 2021-03-18 20:54               ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-18 21:31                 ` Cyrill Gorcunov via Tarantool-patches
  0 siblings, 1 reply; 16+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-18 20:54 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: Mons Anderson, tml

On 18.03.2021 21:45, Cyrill Gorcunov wrote:
> On Thu, Mar 18, 2021 at 09:36:33PM +0100, Vladislav Shpilevoy wrote:
>>>>    b) If the replicas are not connected and the timeout
>>>>       has expired we kick the cleanup fiber.
>>>
>>> I mean this.
>>
>> Then it should have the 'replication_' prefix, not 'wal_',
>> because it is ignored if replicas connect before the timeout expires.
> 
> Replication is one of the reasons, while the main player is "wal"
> here. In the series I sent recently I named it "wal_cleanup_delay".
> In the future we might introduce some topology detector, as you've
> been suggesting, so better not to stick to the "replication"
> prefix, I think.

What you said is just another argument for the 'replication'
prefix, because topology also is not about WAL.

WAL is just a container, it is not a player. The thing you fix here
is GC, which is driven by the replication. The replication forces
all the decisions.

If the option's owner were WAL, then it would work regardless
of the topology, i.e. keep the logs for the entire timeout even
if everything is connected.

But it is the other way around: the decision when to drop the logs
is made by the replication. And it is the replication which gives
the command "now you can delete the old logs".

If you want the wal prefix so badly, at least the name should state
that this is not an exact timeout. It is a max timeout, which may
end much earlier for any reason: out of disk space, full replication
sync.


* Re: [Tarantool-patches] [RFC] gc/xlog: delay xlog cleanup until relays are subscribed
  2021-03-18 20:54               ` Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-18 21:31                 ` Cyrill Gorcunov via Tarantool-patches
  2021-03-18 23:04                   ` Vladislav Shpilevoy via Tarantool-patches
  0 siblings, 1 reply; 16+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-03-18 21:31 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: Mons Anderson, tml

On Thu, Mar 18, 2021 at 09:54:12PM +0100, Vladislav Shpilevoy wrote:
> >>
> >> Then it should have the 'replication_' prefix, not 'wal_',
> >> because it is ignored if replicas connect before the timeout expires.
> > 
> > Replication is one of the reasons, while the main player is "wal"
> > here. In the series I sent recently I named it "wal_cleanup_delay".
> > In the future we might introduce some topology detector, as you've
> > been suggesting, so better not to stick to the "replication"
> > prefix, I think.
> 
> What you said is just another argument for the 'replication'
> prefix, because topology also is not about WAL.

Topology is another _reason_ to defer wal cleanup, and there might
be other reasons unrelated to replication at all. On the other
hand I see what you mean and won't insist. Should I name it
replication_cleanup_delay then?


* Re: [Tarantool-patches] [RFC] gc/xlog: delay xlog cleanup until relays are subscribed
  2021-03-18 21:31                 ` Cyrill Gorcunov via Tarantool-patches
@ 2021-03-18 23:04                   ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-19  7:13                     ` Cyrill Gorcunov via Tarantool-patches
  2021-03-19  8:00                     ` Konstantin Osipov via Tarantool-patches
  0 siblings, 2 replies; 16+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-18 23:04 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: Mons Anderson, tml



On 18.03.2021 22:31, Cyrill Gorcunov wrote:
> On Thu, Mar 18, 2021 at 09:54:12PM +0100, Vladislav Shpilevoy wrote:
>>>>
>>>> Then it should have the 'replication_' prefix, not 'wal_',
>>>> because it is ignored if replicas connect before the timeout expires.
>>>
>>> Replication is one of the reasons, while the main player is "wal"
>>> here. In the series I sent recently I named it "wal_cleanup_delay".
>>> In the future we might introduce some topology detector, as you've
>>> been suggesting, so better not to stick to the "replication"
>>> prefix, I think.
>>
>> What you said is just another argument for the 'replication'
>> prefix, because topology also is not about WAL.
> 
> Topology is another _reason_ to defer wal cleanup, and there might
> be other reasons unrelated to replication at all.

Topology is a part of replication. It **is** replication.

If there were other reasons, your option wouldn't work, because
replication simply drops it when all nodes are connected.

> On the other
> hand I see what you mean and won't insist. Should I name it
> replication_cleanup_delay then?

I don't know yet. I would call it replication_max_gc_delay probably.
But you need to ask Mons for a final decision. Perhaps he will
insist on wal_cleanup_delay. See my other points in the v2 thread.


* Re: [Tarantool-patches] [RFC] gc/xlog: delay xlog cleanup until relays are subscribed
  2021-03-18 23:04                   ` Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-19  7:13                     ` Cyrill Gorcunov via Tarantool-patches
  2021-03-19  8:00                     ` Konstantin Osipov via Tarantool-patches
  1 sibling, 0 replies; 16+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-03-19  7:13 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: Mons Anderson, tml

On Fri, Mar 19, 2021 at 12:04:04AM +0100, Vladislav Shpilevoy wrote:
> 
> > On the other
> > hand I see what you mean and won't insist. Should I name it
> > replication_cleanup_delay then?
> 
> I don't know yet. I would call it replication_max_gc_delay probably.
> But you need to ask Mons for a final decision. Perhaps he will
> insist on wal_cleanup_delay. See my other points in the v2 thread.

OK, thanks for comments!


* Re: [Tarantool-patches] [RFC] gc/xlog: delay xlog cleanup until relays are subscribed
  2021-03-18 20:36           ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-18 20:45             ` Cyrill Gorcunov via Tarantool-patches
@ 2021-03-19  7:40             ` Konstantin Osipov via Tarantool-patches
  1 sibling, 0 replies; 16+ messages in thread
From: Konstantin Osipov via Tarantool-patches @ 2021-03-19  7:40 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: Mons Anderson, tml

* Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [21/03/18 23:40]:
> >> Good point, thanks! Kostya, let's clarify some moments:
> >>
> >> 1) We introduce a "wal_keep_logs" option which defines
> >>    a timeout after which to kick the cleanup fiber.
> >> 2) If the node is an anonymous replica we simply ignore
> >>    this option.
> >> 3) If this option is set then we have a few subcases:
> >>    a) The `_cluster` space is not empty, thus once
> >>       all replicas are subscribed _before_ the timeout
> >>       expires we trigger the cleanup fiber since it is
> >>       safe to proceed;
> >>    b) If the replicas are not connected and the timeout
> >>       has expired we kick the cleanup fiber.
> > 
> > I mean this.
> 
> Then it should have the 'replication_' prefix, not 'wal_',
> because it is ignored if replicas connect before the timeout expires.

It defines a property of the wal subsystem. The naming scheme is
object-oriented, not subject-oriented.

-- 
Konstantin Osipov, Moscow, Russia
https://scylladb.com


* Re: [Tarantool-patches] [RFC] gc/xlog: delay xlog cleanup until relays are subscribed
  2021-03-18 23:04                   ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-19  7:13                     ` Cyrill Gorcunov via Tarantool-patches
@ 2021-03-19  8:00                     ` Konstantin Osipov via Tarantool-patches
  1 sibling, 0 replies; 16+ messages in thread
From: Konstantin Osipov via Tarantool-patches @ 2021-03-19  8:00 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: Mons Anderson, tml

* Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [21/03/19 09:50]:
> I don't know yet. I would call it replication_max_gc_delay probably.

The option does set a gc property; gc is not part of replication.

> But you need to ask Mons for a final decision. Perhaps he will
> insist on wal_cleanup_delay. See my other points in the v2 thread.

-- 
Konstantin Osipov, Moscow, Russia
https://scylladb.com

