Re[2]: [tarantool-patches] [PATCH v3] replication: force gc to clean xdir on ENOSPC err
Konstantin Belyavskiy
k.belyavskiy at tarantool.org
Wed Jun 27 19:11:55 MSK 2018
Please take a look at updated patch
ticket: https://github.com/tarantool/tarantool/issues/3397
branch: https://github.com/tarantool/tarantool/tree/kbelyavs/gh-3397-force-del-logs-on-no-disk-space
>Понедельник, 18 июня 2018, 15:35 +03:00 от Vladimir Davydov <vdavydov.dev at gmail.com>:
>
>On Sat, Jun 09, 2018 at 05:05:10PM +0300, Konstantin Belyavskiy wrote:
>> Garbage collector do not delete xlog unless replica do not notify
>> master with newer vclock. This can lead to running out of disk
>> space error and this is not right behaviour since it will stop the
>> master.
>
>AFAIU this is the second iteration of this patch. If so, you should
>have appended the version number to the subject prefix (i.e. PATCH v2)
>according to our guideline:
>
>https://tarantool.io/en/doc/1.9/dev_guide/developer_guidelines.html#how-to-submit-a-patch-for-review
>
>>
>> List of changes:
>> - Promoting error from wal_thread to tx via cpipe.
>> - Introduce states for gc_consumers (OK and OFF). GC does not
>> take into account consumers with state OFF.
>> - Add an error injection and a test.
>
>The list of changes between different versions of the same patch should
>go after the link to the branch and the ticket. For example see
>
>https://www.freelists.org/post/tarantool-patches/PATCH-v2-box-make-gc-info-public
>
>>
>> Closes #3397
>> ---
>> ticket: https://github.com/tarantool/tarantool/issues/3397
>> branch: https://github.com/tarantool/tarantool/tree/kbelyavs/gh-3397-force-del-logs-on-no-disk-space
>>
>> src/box/gc.c | 52 ++++++++++-
>> src/box/gc.h | 13 +++
>> src/box/wal.cc | 56 +++++++++--
>> src/errinj.h | 1 +
>> src/fio.c | 7 ++
>> test/box/errinj.result | 2 +
>> test/replication/force_gc_on_err_nospace.result | 107 ++++++++++++++++++++++
>> test/replication/force_gc_on_err_nospace.test.lua | 38 ++++++++
>> 8 files changed, 265 insertions(+), 11 deletions(-)
>> create mode 100644 test/replication/force_gc_on_err_nospace.result
>> create mode 100644 test/replication/force_gc_on_err_nospace.test.lua
>>
>> diff --git a/src/box/gc.c b/src/box/gc.c
>> index 12e68f3dc..efe033aaf 100644
>> --- a/src/box/gc.c
>> +++ b/src/box/gc.c
>> @@ -59,6 +59,8 @@ struct gc_consumer {
>> gc_node_t node;
>> /** Human-readable name. */
>> char *name;
>> + /** Dead or alive. GC ignores dead consumers. */
>> + enum gc_consumer_state state;
>
>Keeping dead consumers around is pointless. A consumer must be removed
>from the tree as soon as it is evicted.
Fixed
>
>> +void
>> +gc_xdir_clean_notify(bool force_clean)
>> +{
>> + if (force_clean == gc.force_clean_flag)
>> + return; // nothing to do
>> + else if (force_clean) { // xdir clean required
>> + gc.force_clean_flag = true;
>
>We don't use C++ style comments in the code.
Fixed
>
>> + /**
>> + * Mark consumer with least recent vclock as "dead" and
>> + * invoke garbage collection. If nothing to delete find
>> + * next alive consumer etc. Originally created for
>> + * cases with running out of disk space because of
>> + * disconnected replica.
>> + */
>> + struct gc_consumer *leftmost =
>> + gc_tree_first_alive(&gc.consumers);
>> + if (leftmost == NULL)
>> + return; // do nothing
>> + int64_t signature = leftmost->signature;
>> + while (true) {
>> + leftmost->state = CONSUMER_OFF;
>> + leftmost = gc_tree_first_alive(&gc.consumers);
>> + if (leftmost == NULL ||
>> + leftmost->signature > signature) {
>> + gc_run();
>> + return;
>
>A consumer may correspond to a backup procedure, in which case it must
>not be evicted, even on ENOSPC.
Fixed
>
>
>> + }
>> + }
>> + } else
>> + gc.force_clean_flag = false;
>> +}
>
>> diff --git a/src/box/wal.cc b/src/box/wal.cc
>> index 099c70caa..9dd8347d5 100644
>> --- a/src/box/wal.cc
>> +++ b/src/box/wal.cc
>> @@ -59,6 +62,11 @@ struct wal_thread {
>> struct cord cord;
>> /** A pipe from 'tx' thread to 'wal' */
>> struct cpipe wal_pipe;
>> + /**
>> + * Return pipe from 'wal' to tx'. This is a
>> + * priority pipe and DOES NOT support yield.
>> + */
>> + struct cpipe tx_prio_pipe;
>> /** Return pipe from 'wal' to tx' */
>> struct cpipe tx_pipe;
>> };
>> @@ -154,7 +166,7 @@ static void
>> tx_schedule_commit(struct cmsg *msg);
>>
>> static struct cmsg_hop wal_request_route[] = {
>> - {wal_write_to_disk, &wal_thread.tx_pipe},
>> + {wal_write_to_disk, &wal_thread.tx_prio_pipe},
>
>AFAICS you rename wal_thread.tx_pipe to tx_prio_pipe and create a new
>pipe called wal_thread.tx_pipe to be used for garbage collection, all in
>one patch. This is very difficult to review. Provided you really need
>it, renaming should be done in a separate patch with a good explanation
>why you're doing what you're doing. Currently, I fail to see a reason.
>
>> +static void
>> +gc_status_update(struct cmsg *msg)
>> +{
>> + msg->route = NULL;
>> + say_info("QQQ: gc_status_update");
>
>What is this?
Sorry, my fault (use to find in logs).
>
>
>> + gc_xdir_clean_notify(static_cast<gc_msg*>(msg)->io_err);
>> +}
>> +
>> static void
>> wal_write_to_disk(struct cmsg *msg)
>> {
>> @@ -647,11 +667,29 @@ wal_write_to_disk(struct cmsg *msg)
>> last_committed = stailq_last(&wal_msg->commit);
>>
>> done:
>> + bool need_send = false;
>> struct error *error = diag_last_error(diag_get());
>> if (error) {
>> /* Until we can pass the error to tx, log it and clear. */
>> error_log(error);
>> diag_clear(diag_get());
>> + if (errno == ENOSPC) {
>> + err_nospace = true;
>> + need_send = true;
>> + }
>> + } else if (err_nospace) {
>> + /** Clear flag and send message to gc. */
>> + err_nospace = false;
>> + need_send = true;
>> + }
>> + if (need_send) {
>
>This joggling with flags looks confusing. What are you trying to achieve
>here? Avoid invoking garbage collector twice on ENOSPC error? At the
>very least, this code should be accompanied by a comment.
>
>Also, AFAICS you fail a transaction that hit ENOSPC and triggered
>garbage collection. I don't think it's good. I see two ways of
>rectifying this: either retry WAL write after invoking garbage
>collection or use fallocate to reserve some space beyond the end of
>the file before writing to WAL.
Fixed by comparing current timestamp with the previous one.
>
>
>> + struct gc_msg msg;
>> + static const struct cmsg_hop route[] = {
>> + {gc_status_update, NULL}
>> + };
>> + cmsg_init(&msg, route);
>> + msg.io_err = err_nospace;
>> + cpipe_push(&wal_thread.tx_pipe, &msg);
>> }
>
>Pushing a message defined on stack to a pipe and not waiting for it to
>be delivered doesn't look right.
Fixed
Best regards,
Konstantin Belyavskiy
k.belyavskiy at tarantool.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.tarantool.org/pipermail/tarantool-patches/attachments/20180627/af3b5930/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0000-cover-letter.patch
Type: application/x-patch
Size: 1479 bytes
Desc: not available
URL: <https://lists.tarantool.org/pipermail/tarantool-patches/attachments/20180627/af3b5930/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-replication-rename-thread-from-tx-to-tx_prio.patch
Type: application/x-patch
Size: 4042 bytes
Desc: not available
URL: <https://lists.tarantool.org/pipermail/tarantool-patches/attachments/20180627/af3b5930/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-replication-force-gc-to-clean-xdir-on-ENOSPC-err.patch
Type: application/x-patch
Size: 13071 bytes
Desc: not available
URL: <https://lists.tarantool.org/pipermail/tarantool-patches/attachments/20180627/af3b5930/attachment-0002.bin>
More information about the Tarantool-patches
mailing list