[tarantool-patches] [PATCH v6 0/2] force gc on running out of disk space

* [tarantool-patches] [PATCH v6 0/2] force gc on running out of disk space
@ 2018-07-12 14:44 Konstantin Belyavskiy
  2018-07-12 14:44 ` [tarantool-patches] [PATCH v6 1/2] replication: rename thread from tx to tx_prio Konstantin Belyavskiy
  2018-07-12 14:44 ` [tarantool-patches] [PATCH v6 2/2] replication: force gc to clean xdir on ENOSPC err Konstantin Belyavskiy
  0 siblings, 2 replies; 4+ messages in thread
From: Konstantin Belyavskiy @ 2018-07-12 14:44 UTC (permalink / raw)
  To: tarantool-patches

The garbage collector does not delete an xlog until the replica
notifies the master with a newer vclock. This can lead to the
master running out of disk space, which is not the right behaviour
since it stops the master.
Fix it by forcing gc to clean xlogs for the replica with the highest lag.
Add an error injection and a test.
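
For illustration only, the error injection idea can be sketched as a
write wrapper that reports ENOSPC when a test switch is set. The names
inject_no_disk_space and checked_writev are hypothetical and stand in
for ERRINJ_NO_DISK_SPACE and fio_writev(). Unlike the patch, which sets
the error after writev() has been called, this sketch skips the write.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical test switch standing in for ERRINJ_NO_DISK_SPACE. */
static bool inject_no_disk_space = false;

/*
 * Hypothetical wrapper: behaves like writev(), but reports ENOSPC
 * without touching the file when the injection switch is set.
 */
static ssize_t
checked_writev(int fd, const struct iovec *iov, int iovcnt)
{
	if (inject_no_disk_space) {
		errno = ENOSPC;
		return -1;
	}
	return writev(fd, iov, iovcnt);
}
```

A test would set the switch, attempt a write, and check that the caller
sees -1 with errno == ENOSPC, which is the condition the WAL thread
reacts to.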

Changes in V2:
- Promoting error from wal_thread to tx via cpipe.
Changes in V3:
- Delete consumers, but only those of replicas (not backups).
Changes in V4:
- Bug fix and small changes according to review.
Changes in V5:
- Compare signatures of the oldest replica and the oldest snapshot
  to keep to prevent deletion if it will not free any disk space.
- Add say_crit on consumer deletion with a little information.
Changes in V6:
- Rebase to latest 1.10.
- Update test.
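
The eviction rule described in the V3 and V5 notes can be sketched as
follows. This is a simplified model under stated assumptions, not the
code from the patch: struct consumer, oldest_consumer() and
evict_lagging_replicas() are hypothetical names, consumers live in a
plain array instead of a tree, and the one-second rate limit and the
gc run itself are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a gc consumer. */
struct consumer {
	int64_t signature;	/* vclock signature still needed */
	bool is_replica;	/* false for e.g. a backup */
	bool alive;		/* false once evicted */
};

/* Index of the live consumer with the smallest signature, or -1. */
static int
oldest_consumer(const struct consumer *c, size_t n)
{
	int best = -1;
	for (size_t i = 0; i < n; i++) {
		if (!c[i].alive)
			continue;
		if (best < 0 || c[i].signature < c[best].signature)
			best = (int)i;
	}
	return best;
}

/*
 * Evict the most lagging replica(s) if that can free disk space.
 * oldest_kept_checkpoint is the signature of the oldest checkpoint
 * that must be preserved. Returns the number of evicted consumers.
 */
static int
evict_lagging_replicas(struct consumer *c, size_t n,
		       int64_t oldest_kept_checkpoint)
{
	int idx = oldest_consumer(c, n);
	/* Nothing to evict, or the oldest consumer is not a replica. */
	if (idx < 0 || !c[idx].is_replica)
		return 0;
	/* Files this old are pinned by a checkpoint anyway. */
	if (c[idx].signature >= oldest_kept_checkpoint)
		return 0;
	int64_t signature = c[idx].signature;
	int evicted = 0;
	/* Evict every replica that shares the oldest signature. */
	while (idx >= 0 && c[idx].is_replica &&
	       c[idx].signature == signature) {
		c[idx].alive = false;
		evicted++;
		idx = oldest_consumer(c, n);
	}
	return evicted;
}
```

The checkpoint comparison is what keeps a replica from being dropped
when its xlogs would have to stay on disk for a preserved snapshot
anyway.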

Ticket: https://github.com/tarantool/tarantool/issues/3397
Branch: https://github.com/tarantool/tarantool/compare/kbelyavs/gh-3397-force-del-logs-on-no-disk-space

Konstantin Belyavskiy (2):
  replication: rename thread from tx to tx_prio
  replication: force gc to clean xdir on ENOSPC err

 src/box/box.cc                                     |   1 +
 src/box/gc.c                                       |  62 +++++++
 src/box/gc.h                                       |  17 ++
 src/box/relay.cc                                   |   1 +
 src/box/wal.c                                      |  49 ++++-
 src/errinj.h                                       |   1 +
 src/fio.c                                          |   7 +
 test/box/errinj.result                             |   4 +-
 test/replication/kick_dead_replica_on_enspc.result | 204 +++++++++++++++++++++
 .../kick_dead_replica_on_enspc.test.lua            |  91 +++++++++
 test/replication/suite.ini                         |   2 +-
 11 files changed, 427 insertions(+), 12 deletions(-)
 create mode 100644 test/replication/kick_dead_replica_on_enspc.result
 create mode 100644 test/replication/kick_dead_replica_on_enspc.test.lua

-- 
2.14.3 (Apple Git-98)

* [tarantool-patches] [PATCH v6 1/2] replication: rename thread from tx to tx_prio
  2018-07-12 14:44 [tarantool-patches] [PATCH v6 0/2] force gc on running out of disk space Konstantin Belyavskiy
@ 2018-07-12 14:44 ` Konstantin Belyavskiy
  2018-07-12 14:44 ` [tarantool-patches] [PATCH v6 2/2] replication: force gc to clean xdir on ENOSPC err Konstantin Belyavskiy
  1 sibling, 0 replies; 4+ messages in thread
From: Konstantin Belyavskiy @ 2018-07-12 14:44 UTC (permalink / raw)
  To: tarantool-patches

There are two different threads: 'tx' and 'tx_prio'. The latter
does not support yield(). Rename it to avoid misunderstanding.

Needed for #3397
---
 src/box/wal.c | 30 +++++++++++++++++-------------
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/src/box/wal.c b/src/box/wal.c
index 19c9138ee..b88353f36 100644
--- a/src/box/wal.c
+++ b/src/box/wal.c
@@ -61,8 +61,11 @@ struct wal_thread {
 	struct cord cord;
 	/** A pipe from 'tx' thread to 'wal' */
 	struct cpipe wal_pipe;
-	/** Return pipe from 'wal' to tx' */
-	struct cpipe tx_pipe;
+	/**
+	 * Return pipe from 'wal' to 'tx'. This is a
+	 * priority pipe and DOES NOT support yield.
+	 */
+	struct cpipe tx_prio_pipe;
 };
 
 /*
@@ -157,7 +160,7 @@ static void
 tx_schedule_commit(struct cmsg *msg);
 
 static struct cmsg_hop wal_request_route[] = {
-	{wal_write_to_disk, &wal_thread.tx_pipe},
+	{wal_write_to_disk, &wal_thread.tx_prio_pipe},
 	{tx_schedule_commit, NULL},
 };
 
@@ -349,7 +352,7 @@ wal_open(struct wal_writer *writer)
 	 * thread.
 	 */
 	struct cbus_call_msg msg;
-	if (cbus_call(&wal_thread.wal_pipe, &wal_thread.tx_pipe, &msg,
+	if (cbus_call(&wal_thread.wal_pipe, &wal_thread.tx_prio_pipe, &msg,
 		      wal_open_f, NULL, TIMEOUT_INFINITY) == 0) {
 		/*
 		 * Success: we can now append to
@@ -491,7 +494,7 @@ wal_checkpoint(struct vclock *vclock, bool rotate)
 		return 0;
 	}
 	static struct cmsg_hop wal_checkpoint_route[] = {
-		{wal_checkpoint_f, &wal_thread.tx_pipe},
+		{wal_checkpoint_f, &wal_thread.tx_prio_pipe},
 		{wal_checkpoint_done_f, NULL},
 	};
 	vclock_create(vclock);
@@ -531,7 +534,7 @@ wal_collect_garbage(int64_t lsn)
 	struct wal_gc_msg msg;
 	msg.lsn = lsn;
 	bool cancellable = fiber_set_cancellable(false);
-	cbus_call(&wal_thread.wal_pipe, &wal_thread.tx_pipe, &msg.base,
+	cbus_call(&wal_thread.wal_pipe, &wal_thread.tx_prio_pipe, &msg.base,
 		  wal_collect_garbage_f, NULL, TIMEOUT_INFINITY);
 	fiber_set_cancellable(cancellable);
 }
@@ -622,7 +625,7 @@ wal_writer_begin_rollback(struct wal_writer *writer)
 		 * list.
 		 */
 		{ wal_writer_clear_bus, &wal_thread.wal_pipe },
-		{ wal_writer_clear_bus, &wal_thread.tx_pipe },
+		{ wal_writer_clear_bus, &wal_thread.tx_prio_pipe },
 		/*
 		 * Step 2: writer->rollback queue contains all
 		 * messages which need to be rolled back,
@@ -640,7 +643,7 @@ wal_writer_begin_rollback(struct wal_writer *writer)
 	 * all input until rollback mode is off.
 	 */
 	cmsg_init(&writer->in_rollback, rollback_route);
-	cpipe_push(&wal_thread.tx_pipe, &writer->in_rollback);
+	cpipe_push(&wal_thread.tx_prio_pipe, &writer->in_rollback);
 }
 
 static void
@@ -770,7 +773,7 @@ wal_thread_f(va_list ap)
 	 * endpoint, to ensure that WAL messages are delivered
 	 * even when tx fiber pool is used up by net messages.
 	 */
-	cpipe_create(&wal_thread.tx_pipe, "tx_prio");
+	cpipe_create(&wal_thread.tx_prio_pipe, "tx_prio");
 
 	cbus_loop(&endpoint);
 
@@ -799,7 +802,7 @@ wal_thread_f(va_list ap)
 	if (xlog_is_open(&vy_log_writer.xlog))
 		xlog_close(&vy_log_writer.xlog, false);
 
-	cpipe_destroy(&wal_thread.tx_pipe);
+	cpipe_destroy(&wal_thread.tx_prio_pipe);
 	return 0;
 }
 
@@ -944,8 +947,9 @@ wal_write_vy_log(struct journal_entry *entry)
 	struct wal_write_vy_log_msg msg;
 	msg.entry= entry;
 	bool cancellable = fiber_set_cancellable(false);
-	int rc = cbus_call(&wal_thread.wal_pipe, &wal_thread.tx_pipe, &msg.base,
-			   wal_write_vy_log_f, NULL, TIMEOUT_INFINITY);
+	int rc = cbus_call(&wal_thread.wal_pipe, &wal_thread.tx_prio_pipe,
+			   &msg.base, wal_write_vy_log_f, NULL,
+			   TIMEOUT_INFINITY);
 	fiber_set_cancellable(cancellable);
 	return rc;
 }
@@ -964,7 +968,7 @@ wal_rotate_vy_log()
 {
 	struct cbus_call_msg msg;
 	bool cancellable = fiber_set_cancellable(false);
-	cbus_call(&wal_thread.wal_pipe, &wal_thread.tx_pipe, &msg,
+	cbus_call(&wal_thread.wal_pipe, &wal_thread.tx_prio_pipe, &msg,
 		  wal_rotate_vy_log_f, NULL, TIMEOUT_INFINITY);
 	fiber_set_cancellable(cancellable);
 }
-- 
2.14.3 (Apple Git-98)

* [tarantool-patches] [PATCH v6 2/2] replication: force gc to clean xdir on ENOSPC err
  2018-07-12 14:44 [tarantool-patches] [PATCH v6 0/2] force gc on running out of disk space Konstantin Belyavskiy
  2018-07-12 14:44 ` [tarantool-patches] [PATCH v6 1/2] replication: rename thread from tx to tx_prio Konstantin Belyavskiy
@ 2018-07-12 14:44 ` Konstantin Belyavskiy
  2018-07-13  8:48   ` [tarantool-patches] " Kirill Yukhin
  1 sibling, 1 reply; 4+ messages in thread
From: Konstantin Belyavskiy @ 2018-07-12 14:44 UTC (permalink / raw)
  To: tarantool-patches

The garbage collector does not delete an xlog until the replica
notifies the master with a newer vclock. This can lead to the
master running out of disk space, which is not the right behaviour
since it stops the master.
Fix it by forcing gc to clean xlogs for the replica with the highest lag.
Add an error injection and a test.

Changes in V2:
- Promoting error from wal_thread to tx via cpipe.
Changes in V3:
- Delete consumers, but only those of replicas (not backups).
Changes in V4:
- Bug fix and small changes according to review.
Changes in V5:
- Compare signatures of the oldest replica and the oldest snapshot
  to keep to prevent deletion if it will not free any disk space.
- Add say_crit on consumer deletion with a little information.
Changes in V6:
- Rebase to latest 1.10.
- Update test.

Closes #3397
---
 src/box/box.cc                                     |   1 +
 src/box/gc.c                                       |  62 +++++++
 src/box/gc.h                                       |  17 ++
 src/box/relay.cc                                   |   1 +
 src/box/wal.c                                      |  25 +++
 src/errinj.h                                       |   1 +
 src/fio.c                                          |   7 +
 test/box/errinj.result                             |   4 +-
 test/replication/kick_dead_replica_on_enspc.result | 204 +++++++++++++++++++++
 .../kick_dead_replica_on_enspc.test.lua            |  91 +++++++++
 test/replication/suite.ini                         |   2 +-
 11 files changed, 413 insertions(+), 2 deletions(-)
 create mode 100644 test/replication/kick_dead_replica_on_enspc.result
 create mode 100644 test/replication/kick_dead_replica_on_enspc.test.lua

diff --git a/src/box/box.cc b/src/box/box.cc
index ba0af95e4..3d7f43f09 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -1436,6 +1436,7 @@ box_process_join(struct ev_io *io, struct xrow_header *header)
 	replica = replica_by_uuid(&instance_uuid);
 	assert(replica != NULL);
 	replica->gc = gc;
+	gc_consumer_set_replica(gc, replica);
 	gc_guard.is_active = false;
 
 	/* Remember master's vclock after the last request */
diff --git a/src/box/gc.c b/src/box/gc.c
index 6a05b2983..f3831e5ad 100644
--- a/src/box/gc.c
+++ b/src/box/gc.c
@@ -65,6 +65,8 @@ struct gc_consumer {
 	 * WAL files, or both - SNAP and WAL.
 	 */
 	enum gc_consumer_type type;
+	/** Replica associated with consumer (if any). */
+	struct replica *replica;
 };
 
 typedef rb_tree(struct gc_consumer) gc_tree_t;
@@ -131,10 +133,18 @@ gc_consumer_new(const char *name, int64_t signature,
 	return consumer;
 }
 
+void
+gc_consumer_set_replica(struct gc_consumer *gc, struct replica *replica)
+{
+	gc->replica = replica;
+}
+
 /** Free a consumer object. */
 static void
 gc_consumer_delete(struct gc_consumer *consumer)
 {
+	if (consumer->replica != NULL)
+		consumer->replica->gc = NULL;
 	free(consumer->name);
 	TRASH(consumer);
 	free(consumer);
@@ -250,6 +260,58 @@ gc_set_checkpoint_count(int checkpoint_count)
 	gc.checkpoint_count = checkpoint_count;
 }
 
+void
+gc_xdir_clean_notify()
+{
+	/*
+	 * Compare the current time with the time of the last run.
+	 * This is needed in case of multiple failures to prevent
+	 * from deleting all replicas.
+	 */
+	static double prev_time = 0.;
+	double cur_time = ev_monotonic_time();
+	if (cur_time - prev_time < 1.)
+		return;
+	prev_time = cur_time;
+	struct gc_consumer *leftmost =
+	    gc_tree_first(&gc.consumers);
+	/*
+	 * Exit if no consumers left or if this consumer is
+	 * not associated with replica (backup for example).
+	 */
+	if (leftmost == NULL || leftmost->replica == NULL)
+		return;
+	/*
+	 * We have to maintain @checkpoint_count oldest snapshots,
+	 * plus we can't remove snapshots that are still in use.
+	 * So if the leftmost replica has a signature greater than
+	 * or equal to the oldest checkpoint that must be preserved,
+	 * there is nothing to do.
+	 */
+	struct checkpoint_iterator checkpoints;
+	checkpoint_iterator_init(&checkpoints);
+	assert(gc.checkpoint_count > 0);
+	const struct vclock *vclock;
+	for (int i = 0; i < gc.checkpoint_count; i++)
+		if ((vclock = checkpoint_iterator_prev(&checkpoints)) == NULL)
+			return;
+	if (leftmost->signature >= vclock_sum(vclock))
+		return;
+	int64_t signature = leftmost->signature;
+	while (true) {
+		say_crit("remove replica with the oldest signature = %lld"
+		         " and uuid = %s", (long long)signature,
+			 tt_uuid_str(&leftmost->replica->uuid));
+		gc_consumer_unregister(leftmost);
+		leftmost = gc_tree_first(&gc.consumers);
+		if (leftmost == NULL || leftmost->replica == NULL ||
+		    leftmost->signature > signature) {
+			gc_run();
+			return;
+		}
+	}
+}
+
 struct gc_consumer *
 gc_consumer_register(const char *name, int64_t signature,
 		     enum gc_consumer_type type)
diff --git a/src/box/gc.h b/src/box/gc.h
index 6a890b7b7..a5fba8df0 100644
--- a/src/box/gc.h
+++ b/src/box/gc.h
@@ -35,6 +35,8 @@
 #include <stdint.h>
 #include <stdbool.h>
 
+#include "replication.h"
+
 #if defined(__cplusplus)
 extern "C" {
 #endif /* defined(__cplusplus) */
@@ -92,6 +94,12 @@ struct gc_consumer *
 gc_consumer_register(const char *name, int64_t signature,
 		     enum gc_consumer_type type);
 
+/**
+ * Bind consumer with associated replica (if any).
+ */
+void
+gc_consumer_set_replica(struct gc_consumer *gc, struct replica *replica);
+
 /**
  * Unregister a consumer and invoke garbage collection
  * if needed.
@@ -99,6 +107,15 @@ gc_consumer_register(const char *name, int64_t signature,
 void
 gc_consumer_unregister(struct gc_consumer *consumer);
 
+/**
+ * Delete consumer with the least recent vclock and start
+ * garbage collection. If nothing to delete find next
+ * consumer etc. Originally created for cases with running
+ * out of disk space because of disconnected replica.
+ */
+void
+gc_xdir_clean_notify();
+
 /**
  * Advance the vclock signature tracked by a consumer and
  * invoke garbage collection if needed.
diff --git a/src/box/relay.cc b/src/box/relay.cc
index c91e5aed3..32255d655 100644
--- a/src/box/relay.cc
+++ b/src/box/relay.cc
@@ -581,6 +581,7 @@ relay_subscribe(struct replica *replica, int fd, uint64_t sync,
 			vclock_sum(replica_clock), GC_CONSUMER_WAL);
 		if (replica->gc == NULL)
 			diag_raise();
+		gc_consumer_set_replica(replica->gc, replica);
 	}
 
 	relay_start(relay, fd, sync, relay_send_row);
diff --git a/src/box/wal.c b/src/box/wal.c
index b88353f36..6008ff39d 100644
--- a/src/box/wal.c
+++ b/src/box/wal.c
@@ -43,6 +43,7 @@
 #include "cbus.h"
 #include "coio_task.h"
 #include "replication.h"
+#include "gc.h"
 
 
 const char *wal_mode_STRS[] = { "none", "write", "fsync", NULL };
@@ -66,6 +67,8 @@ struct wal_thread {
 	 * priority pipe and DOES NOT support yield.
 	 */
 	struct cpipe tx_prio_pipe;
+	/** Return pipe from 'wal' to 'tx' */
+	struct cpipe tx_pipe;
 };
 
 /*
@@ -662,6 +665,13 @@ wal_assign_lsn(struct wal_writer *writer, struct xrow_header **row,
 	}
 }
 
+static void
+gc_status_update(struct cmsg *msg)
+{
+	gc_xdir_clean_notify();
+	free(msg);
+}
+
 static void
 wal_write_to_disk(struct cmsg *msg)
 {
@@ -734,6 +744,19 @@ done:
 		/* Until we can pass the error to tx, log it and clear. */
 		error_log(error);
 		diag_clear(diag_get());
+		if (errno == ENOSPC) {
+			struct cmsg *msg =
+			    (struct cmsg*)calloc(1, sizeof(struct cmsg));
+			if (msg == NULL) {
+				say_error("failed to allocate cmsg");
+			} else {
+				static const struct cmsg_hop route[] = {
+					{gc_status_update, NULL}
+				};
+				cmsg_init(msg, route);
+				cpipe_push(&wal_thread.tx_pipe, msg);
+			}
+		}
 	}
 	/*
 	 * We need to start rollback from the first request
@@ -774,6 +797,7 @@ wal_thread_f(va_list ap)
 	 * even when tx fiber pool is used up by net messages.
 	 */
 	cpipe_create(&wal_thread.tx_prio_pipe, "tx_prio");
+	cpipe_create(&wal_thread.tx_pipe, "tx");
 
 	cbus_loop(&endpoint);
 
@@ -803,6 +827,7 @@ wal_thread_f(va_list ap)
 		xlog_close(&vy_log_writer.xlog, false);
 
 	cpipe_destroy(&wal_thread.tx_prio_pipe);
+	cpipe_destroy(&wal_thread.tx_pipe);
 	return 0;
 }
 
diff --git a/src/errinj.h b/src/errinj.h
index cde58d485..9c114b67c 100644
--- a/src/errinj.h
+++ b/src/errinj.h
@@ -117,6 +117,7 @@ struct errinj {
 	_(ERRINJ_VY_LOG_FILE_RENAME, ERRINJ_BOOL, {.bparam = false}) \
 	_(ERRINJ_VY_RUN_FILE_RENAME, ERRINJ_BOOL, {.bparam = false}) \
 	_(ERRINJ_VY_INDEX_FILE_RENAME, ERRINJ_BOOL, {.bparam = false}) \
+	_(ERRINJ_NO_DISK_SPACE, ERRINJ_BOOL, {.bparam = false}) \
 
 ENUM0(errinj_id, ERRINJ_LIST);
 extern struct errinj errinjs[];
diff --git a/src/fio.c b/src/fio.c
index b79d3d058..cdea11e87 100644
--- a/src/fio.c
+++ b/src/fio.c
@@ -29,6 +29,7 @@
  * SUCH DAMAGE.
  */
 #include "fio.h"
+#include "errinj.h"
 
 #include <sys/types.h>
 
@@ -141,6 +142,12 @@ fio_writev(int fd, struct iovec *iov, int iovcnt)
 	ssize_t nwr;
 restart:
 	nwr = writev(fd, iov, iovcnt);
+	/* Simulate running out of disk space to force the gc to clean logs. */
+	struct errinj *inj = errinj(ERRINJ_NO_DISK_SPACE, ERRINJ_BOOL);
+	if (inj != NULL && inj->bparam) {
+		errno = ENOSPC;
+		nwr = -1;
+	}
 	if (nwr < 0) {
 		if (errno == EINTR) {
 			errno = 0;
diff --git a/test/box/errinj.result b/test/box/errinj.result
index 54b6d578f..ca2f48f3d 100644
--- a/test/box/errinj.result
+++ b/test/box/errinj.result
@@ -60,9 +60,11 @@ errinj.info()
     state: false
   ERRINJ_WAL_WRITE_DISK:
     state: false
+  ERRINJ_VY_LOG_FILE_RENAME:
+    state: false
   ERRINJ_VY_RUN_WRITE:
     state: false
-  ERRINJ_VY_LOG_FILE_RENAME:
+  ERRINJ_NO_DISK_SPACE:
     state: false
   ERRINJ_VY_LOG_FLUSH_DELAY:
     state: false
diff --git a/test/replication/kick_dead_replica_on_enspc.result b/test/replication/kick_dead_replica_on_enspc.result
new file mode 100644
index 000000000..7c648311b
--- /dev/null
+++ b/test/replication/kick_dead_replica_on_enspc.result
@@ -0,0 +1,204 @@
+env = require('test_run')
+---
+...
+vclock_diff = require('fast_replica').vclock_diff
+---
+...
+test_run = env.new()
+---
+...
+SERVERS = { 'autobootstrap1', 'autobootstrap2', 'autobootstrap3' }
+---
+...
+--
+-- Start servers
+--
+test_run:create_cluster(SERVERS)
+---
+...
+--
+-- Wait for full mesh
+--
+test_run:wait_fullmesh(SERVERS)
+---
+...
+--
+-- Check vclock
+--
+vclock1 = test_run:get_vclock('autobootstrap1')
+---
+...
+vclock_diff(vclock1, test_run:get_vclock('autobootstrap2'))
+---
+- 0
+...
+vclock_diff(vclock1, test_run:get_vclock('autobootstrap3'))
+---
+- 0
+...
+--
+-- Switch off second replica
+--
+test_run:cmd("switch autobootstrap2")
+---
+- true
+...
+repl = box.cfg.replication
+---
+...
+box.cfg{replication = ""}
+---
+...
+--
+-- Insert rows
+--
+test_run:cmd("switch autobootstrap1")
+---
+- true
+...
+s = box.space.test
+---
+...
+for i = 1, 5 do s:insert{i} box.snapshot() end
+---
+...
+s:select()
+---
+- - [1]
+  - [2]
+  - [3]
+  - [4]
+  - [5]
+...
+fio = require('fio')
+---
+...
+path = fio.pathjoin(fio.abspath("."), 'autobootstrap1/*.xlog')
+---
+...
+-- Depend on first master is a leader or not it should be 5 or 6.
+#fio.glob(path) >= 5
+---
+- true
+...
+--
+-- Switch off third replica
+--
+test_run:cmd("switch autobootstrap3")
+---
+- true
+...
+repl = box.cfg.replication
+---
+...
+box.cfg{replication = ""}
+---
+...
+--
+-- Insert more rows
+--
+test_run:cmd("switch autobootstrap1")
+---
+- true
+...
+for i = 6, 10 do s:insert{i} box.snapshot() end
+---
+...
+s:select()
+---
+- - [1]
+  - [2]
+  - [3]
+  - [4]
+  - [5]
+  - [6]
+  - [7]
+  - [8]
+  - [9]
+  - [10]
+...
+fio = require('fio')
+---
+...
+path = fio.pathjoin(fio.abspath("."), 'autobootstrap1/*.xlog')
+---
+...
+-- Depend on if the first master is a leader or not it should be 10 or 11.
+#fio.glob(path) >= 10
+---
+- true
+...
+errinj = box.error.injection
+---
+...
+errinj.set("ERRINJ_NO_DISK_SPACE", true)
+---
+- ok
+...
+function insert(a) s:insert(a) end
+---
+...
+_, err = pcall(insert, {11})
+---
+...
+err:match("ailed to write")
+---
+- ailed to write
+...
+--
+-- Switch off third replica
+--
+test_run:cmd("switch autobootstrap3")
+---
+- true
+...
+box.cfg{replication = repl}
+---
+...
+--
+-- Wait untill the third replica will catch up with the first one.
+--
+test_run:cmd("switch autobootstrap1")
+---
+- true
+...
+fiber = require('fiber')
+---
+...
+while #fio.glob(path) ~= 2 do fiber.sleep(0.01) end
+---
+...
+#fio.glob(path)
+---
+- 2
+...
+--
+-- Check data integrity on the third replica.
+--
+test_run:cmd("switch autobootstrap3")
+---
+- true
+...
+box.space.test:select{}
+---
+- - [1]
+  - [2]
+  - [3]
+  - [4]
+  - [5]
+  - [6]
+  - [7]
+  - [8]
+  - [9]
+  - [10]
+...
+--
+-- Stop servers
+--
+test_run:cmd("switch default")
+---
+- true
+...
+test_run:drop_cluster(SERVERS)
+---
+...
diff --git a/test/replication/kick_dead_replica_on_enspc.test.lua b/test/replication/kick_dead_replica_on_enspc.test.lua
new file mode 100644
index 000000000..ffab9f3f9
--- /dev/null
+++ b/test/replication/kick_dead_replica_on_enspc.test.lua
@@ -0,0 +1,91 @@
+env = require('test_run')
+vclock_diff = require('fast_replica').vclock_diff
+test_run = env.new()
+
+
+SERVERS = { 'autobootstrap1', 'autobootstrap2', 'autobootstrap3' }
+
+--
+-- Start servers
+--
+test_run:create_cluster(SERVERS)
+
+--
+-- Wait for full mesh
+--
+test_run:wait_fullmesh(SERVERS)
+
+--
+-- Check vclock
+--
+vclock1 = test_run:get_vclock('autobootstrap1')
+vclock_diff(vclock1, test_run:get_vclock('autobootstrap2'))
+vclock_diff(vclock1, test_run:get_vclock('autobootstrap3'))
+
+--
+-- Switch off second replica
+--
+test_run:cmd("switch autobootstrap2")
+repl = box.cfg.replication
+box.cfg{replication = ""}
+
+--
+-- Insert rows
+--
+test_run:cmd("switch autobootstrap1")
+s = box.space.test
+for i = 1, 5 do s:insert{i} box.snapshot() end
+s:select()
+fio = require('fio')
+path = fio.pathjoin(fio.abspath("."), 'autobootstrap1/*.xlog')
+-- Depend on first master is a leader or not it should be 5 or 6.
+#fio.glob(path) >= 5
+
+--
+-- Switch off third replica
+--
+test_run:cmd("switch autobootstrap3")
+repl = box.cfg.replication
+box.cfg{replication = ""}
+
+--
+-- Insert more rows
+--
+test_run:cmd("switch autobootstrap1")
+for i = 6, 10 do s:insert{i} box.snapshot() end
+s:select()
+fio = require('fio')
+path = fio.pathjoin(fio.abspath("."), 'autobootstrap1/*.xlog')
+-- Depend on if the first master is a leader or not it should be 10 or 11.
+#fio.glob(path) >= 10
+errinj = box.error.injection
+errinj.set("ERRINJ_NO_DISK_SPACE", true)
+function insert(a) s:insert(a) end
+_, err = pcall(insert, {11})
+err:match("ailed to write")
+
+--
+-- Switch off third replica
+--
+test_run:cmd("switch autobootstrap3")
+box.cfg{replication = repl}
+
+--
+-- Wait untill the third replica will catch up with the first one.
+--
+test_run:cmd("switch autobootstrap1")
+fiber = require('fiber')
+while #fio.glob(path) ~= 2 do fiber.sleep(0.01) end
+#fio.glob(path)
+
+--
+-- Check data integrity on the third replica.
+--
+test_run:cmd("switch autobootstrap3")
+box.space.test:select{}
+
+--
+-- Stop servers
+--
+test_run:cmd("switch default")
+test_run:drop_cluster(SERVERS)
diff --git a/test/replication/suite.ini b/test/replication/suite.ini
index b489add58..27815acb6 100644
--- a/test/replication/suite.ini
+++ b/test/replication/suite.ini
@@ -3,7 +3,7 @@ core = tarantool
 script =  master.lua
 description = tarantool/box, replication
 disabled = consistent.test.lua
-release_disabled = catch.test.lua errinj.test.lua gc.test.lua before_replace.test.lua quorum.test.lua recover_missing_xlog.test.lua
+release_disabled = catch.test.lua errinj.test.lua gc.test.lua before_replace.test.lua kick_dead_replica_on_enspc.test.lua quorum.test.lua recover_missing_xlog.test.lua
 config = suite.cfg
 lua_libs = lua/fast_replica.lua
 long_run = prune.test.lua
-- 
2.14.3 (Apple Git-98)

* [tarantool-patches] Re: [PATCH v6 2/2] replication: force gc to clean xdir on ENOSPC err
  2018-07-12 14:44 ` [tarantool-patches] [PATCH v6 2/2] replication: force gc to clean xdir on ENOSPC err Konstantin Belyavskiy
@ 2018-07-13  8:48   ` Kirill Yukhin
  0 siblings, 0 replies; 4+ messages in thread
From: Kirill Yukhin @ 2018-07-13  8:48 UTC (permalink / raw)
  To: tarantool-patches

Hello,
On 12 Jul 17:44, Konstantin Belyavskiy wrote:
> The garbage collector does not delete an xlog until the replica
> notifies the master with a newer vclock. This can lead to the
> master running out of disk space, which is not the right behaviour
> since it stops the master.
> Fix it by forcing gc to clean xlogs for the replica with the highest lag.
> Add an error injection and a test.
> 
> Changes in V2:
> - Promoting error from wal_thread to tx via cpipe.
> Changes in V3:
> - Delete consumers, but only those of replicas (not backups).
> Changes in V4:
> - Bug fix and small changes according to review.
> Changes in V5:
> - Compare signatures of the oldest replica and the oldest snapshot
>   to keep to prevent deletion if it will not free any disk space.
> - Add say_crit on consumer deletion with a little information.
> Changes in V6:
> - Rebase to latest 1.10.
> - Update test.
Why did you put a ChangeLog entry into the commit message?

--
Regards, Kirill Yukhin
