Re[2]: [tarantool-patches] [PATCH v3] replication: force gc to clean xdir on ENOSPC err

Konstantin Belyavskiy k.belyavskiy at tarantool.org
Wed Jun 27 19:11:55 MSK 2018


Please take a look at updated patch

ticket: https://github.com/tarantool/tarantool/issues/3397
branch: https://github.com/tarantool/tarantool/tree/kbelyavs/gh-3397-force-del-logs-on-no-disk-space

>Понедельник, 18 июня 2018, 15:35 +03:00 от Vladimir Davydov <vdavydov.dev at gmail.com>:
>
>On Sat, Jun 09, 2018 at 05:05:10PM +0300, Konstantin Belyavskiy wrote:
>> Garbage collector do not delete xlog unless replica do not notify
>> master with newer vclock. This can lead to running out of disk
>> space error and this is not right behaviour since it will stop the
>> master.
>
>AFAIU this is the second iteration of this patch. If so, you should
>have appended the version number to the subject prefix (i.e. PATCH v2)
>according to our guideline:
>
>https://tarantool.io/en/doc/1.9/dev_guide/developer_guidelines.html#how-to-submit-a-patch-for-review
>
>> 
>> List of changes:
>> - Promoting error from wal_thread to tx via cpipe.
>> - Introduce states for gc_consumers (OK and OFF). GC does not
>> take into account consumers with state OFF.
>> - Add an error injection and a test.
>
>The list of changes between different versions of the same patch should
>go after the link to the branch and the ticket. For example see
>
>https://www.freelists.org/post/tarantool-patches/PATCH-v2-box-make-gc-info-public
>
>> 
>> Closes #3397
>> ---
>> ticket:  https://github.com/tarantool/tarantool/issues/3397
>> branch:  https://github.com/tarantool/tarantool/tree/kbelyavs/gh-3397-force-del-logs-on-no-disk-space
>> 
>>  src/box/gc.c                                      |  52 ++++++++++-
>>  src/box/gc.h                                      |  13 +++
>>  src/box/wal.cc                                    |  56 +++++++++--
>>  src/errinj.h                                      |   1 +
>>  src/fio.c                                         |   7 ++
>>  test/box/errinj.result                            |   2 +
>>  test/replication/force_gc_on_err_nospace.result   | 107 ++++++++++++++++++++++
>>  test/replication/force_gc_on_err_nospace.test.lua |  38 ++++++++
>>  8 files changed, 265 insertions(+), 11 deletions(-)
>>  create mode 100644 test/replication/force_gc_on_err_nospace.result
>>  create mode 100644 test/replication/force_gc_on_err_nospace.test.lua
>> 
>> diff --git a/src/box/gc.c b/src/box/gc.c
>> index 12e68f3dc..efe033aaf 100644
>> --- a/src/box/gc.c
>> +++ b/src/box/gc.c
>> @@ -59,6 +59,8 @@ struct gc_consumer {
>>  	gc_node_t node;
>>  	/** Human-readable name. */
>>  	char *name;
>> +	/** Dead or alive. GC ignores dead consumers. */
>> +	enum gc_consumer_state state;
>
>Keeping dead consumers around is pointless. A consumer must be removed
>from the tree as soon as it is evicted. 
Fixed
>
>> +void
>> +gc_xdir_clean_notify(bool force_clean)
>> +{
>> +	if (force_clean == gc.force_clean_flag)
>> +		return; // nothing to do
>> +	else if (force_clean) { // xdir clean required
>> +		gc.force_clean_flag = true;
>
>We don't use C++ style comments in the code. 
Fixed
>
>> +		/**
>> +		 * Mark consumer with least recent vclock as "dead" and
>> +		 * invoke garbage collection. If nothing to delete find
>> +		 * next alive consumer etc. Originally created for
>> +		 * cases with running out of disk space because of
>> +		 * disconnected replica.
>> +		 */
>> +		struct gc_consumer *leftmost =
>> +		    gc_tree_first_alive(&gc.consumers);
>> +		if (leftmost == NULL)
>> +			return; // do nothing
>> +		int64_t signature = leftmost->signature;
>> +		while (true) {
>> +			leftmost->state = CONSUMER_OFF;
>> +			leftmost = gc_tree_first_alive(&gc.consumers);
>> +			if (leftmost == NULL ||
>> +			    leftmost->signature > signature) {
>> +				gc_run();
>> +				return;
>
>A consumer may correspond to a backup procedure, in which case it must
>not be evicted, even on ENOSPC. 
Fixed
>
>
>> +			}
>> +		}
>> +	} else
>> +		gc.force_clean_flag = false;
>> +}
>
>> diff --git a/src/box/wal.cc b/src/box/wal.cc
>> index 099c70caa..9dd8347d5 100644
>> --- a/src/box/wal.cc
>> +++ b/src/box/wal.cc
>> @@ -59,6 +62,11 @@ struct wal_thread {
>>  	struct cord cord;
>>  	/** A pipe from 'tx' thread to 'wal' */
>>  	struct cpipe wal_pipe;
>> +	/**
>> +	 * Return pipe from 'wal' to tx'. This is a
>> +	 * priority pipe and DOES NOT support yield.
>> +	 */
>> +	struct cpipe tx_prio_pipe;
>>  	/** Return pipe from 'wal' to tx' */
>>  	struct cpipe tx_pipe;
>>  };
>> @@ -154,7 +166,7 @@ static void
>>  tx_schedule_commit(struct cmsg *msg);
>> 
>>  static struct cmsg_hop wal_request_route[] = {
>> -	{wal_write_to_disk, &wal_thread.tx_pipe},
>> +	{wal_write_to_disk, &wal_thread.tx_prio_pipe},
>
>AFAICS you rename wal_thread.tx_pipe to tx_prio_pipe and create a new
>pipe called wal_thread.tx_pipe to be used for garbage collection, all in
>one patch. This is very difficult to review. Provided you really need
>it, renaming should be done in a separate patch with a good explanation
>why you're doing what you're doing. Currently, I fail to see a reason.
>
>> +static void
>> +gc_status_update(struct cmsg *msg)
>> +{
>> +	msg->route = NULL;
>> +	say_info("QQQ: gc_status_update");
>
>What is this? 
Sorry, my fault (use to find in logs).
>
>
>> +	gc_xdir_clean_notify(static_cast<gc_msg*>(msg)->io_err);
>> +}
>> +
>>  static void
>>  wal_write_to_disk(struct cmsg *msg)
>>  {
>> @@ -647,11 +667,29 @@ wal_write_to_disk(struct cmsg *msg)
>>  	last_committed = stailq_last(&wal_msg->commit);
>> 
>>  done:
>> +	bool need_send = false;
>>  	struct error *error = diag_last_error(diag_get());
>>  	if (error) {
>>  		/* Until we can pass the error to tx, log it and clear. */
>>  		error_log(error);
>>  		diag_clear(diag_get());
>> +		if (errno == ENOSPC) {
>> +			err_nospace = true;
>> +			need_send = true;
>> +		}
>> +	} else if (err_nospace) {
>> +		/** Clear flag and send message to gc. */
>> +		err_nospace = false;
>> +		need_send = true;
>> +	}
>> +	if (need_send) {
>
>This joggling with flags looks confusing. What are you trying to achieve
>here? Avoid invoking garbage collector twice on ENOSPC error? At the
>very least, this code should be accompanied by a comment.
>
>Also, AFAICS you fail a transaction that hit ENOSPC and triggered
>garbage collection. I don't think it's good. I see two ways of
>rectifying this: either retry WAL write after invoking garbage
>collection or use fallocate to reserve some space beyond the end of
>the file before writing to WAL.
Fixed by comparing current timestamp with the previous one.
>
>
>> +		struct gc_msg msg;
>> +		static const struct cmsg_hop route[] = {
>> +			{gc_status_update, NULL}
>> +		};
>> +		cmsg_init(&msg, route);
>> +		msg.io_err = err_nospace;
>> +		cpipe_push(&wal_thread.tx_pipe, &msg);
>>  	}
>
>Pushing a message defined on stack to a pipe and not waiting for it to
>be delivered doesn't look right. 
Fixed

Best regards,
Konstantin Belyavskiy
k.belyavskiy at tarantool.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.tarantool.org/pipermail/tarantool-patches/attachments/20180627/af3b5930/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0000-cover-letter.patch
Type: application/x-patch
Size: 1479 bytes
Desc: not available
URL: <https://lists.tarantool.org/pipermail/tarantool-patches/attachments/20180627/af3b5930/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-replication-rename-thread-from-tx-to-tx_prio.patch
Type: application/x-patch
Size: 4042 bytes
Desc: not available
URL: <https://lists.tarantool.org/pipermail/tarantool-patches/attachments/20180627/af3b5930/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-replication-force-gc-to-clean-xdir-on-ENOSPC-err.patch
Type: application/x-patch
Size: 13071 bytes
Desc: not available
URL: <https://lists.tarantool.org/pipermail/tarantool-patches/attachments/20180627/af3b5930/attachment-0002.bin>


More information about the Tarantool-patches mailing list