Tarantool development patches archive
From: Serge Petrenko <sergepetrenko@tarantool.org>
To: Konstantin Osipov <kostja.osipov@gmail.com>
Cc: tarantool-patches@dev.tarantool.org,
	Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH v4 3/4] gc: rely on minimal vclock components instead of signatures
Date: Mon, 30 Mar 2020 14:02:50 +0300
Message-ID: <1ACFD89D-9FB2-4BA1-A577-6EE741783FA9@tarantool.org>
In-Reply-To: <20200328060352.GB23207@atlas>



> On 28 March 2020, at 09:03, Konstantin Osipov <kostja.osipov@gmail.com> wrote:
> 
> * Serge Petrenko <sergepetrenko@tarantool.org> [20/03/27 18:08]:
>> +	struct vclock min_vclock;
>> +	struct gc_consumer *consumer = gc_tree_first(&gc.consumers);
> 
> The code would be easier to follow if every vclock API call used
> here were the ignore0 variant:
> 
>> +	/*
>> +	 * Vclock of the oldest WAL row to keep is a by-component
>> +	 * minimum of all consumer vclocks and the oldest
>> +	 * checkpoint vclock. This ensures that all rows needed by
>> +	 * at least one consumer are kept.
>> +	 */
>> +	vclock_copy(&min_vclock, &checkpoint->vclock);
> 
> E.g. use vclock_copy_ignore0 here
> 
>> +	while (consumer != NULL) {
>> +		/*
>> +		 * Consumers will never need rows signed
>> +		 * with a zero instance id (local rows).
>> +		 */
>> +		vclock_min_ignore0(&min_vclock, &consumer->vclock);
>> +		consumer = gc_tree_next(&gc.consumers, consumer);
>> +	}
>> +
>> +	if (vclock_sum(&min_vclock) > vclock_sum(&gc.vclock)) {
> 
> Please use vclock_sum_ignore0
>> +		vclock_copy(&gc.vclock, &min_vclock);
> 
> Please use vclock_copy_ignore0

I can’t. wal_collect_garbage() searches for the WAL file whose vclock is
less than or equal to the one provided, so the 0-th clock component
cannot be zeroed out; otherwise no logs would ever be deleted.
All the other components are taken as a minimum over all consumers and
the oldest snapshot, but the 0-th component is copied directly from the
oldest snapshot, since we have to keep all WALs starting from the oldest
snapshot. This is stated in the comment a few lines above. I have now
moved it to a more relevant place.

So gc.vclock must contain a valid 0-th component. That’s why I use the
‘local’ (non-ignore0) vclock_sum() and vclock_copy() here.
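
To make the reasoning above concrete, here is a minimal sketch in C. It does
not use the real vclock API: struct sketch_vclock, SKETCH_VCLOCK_MAX and all
sketch_* functions are hypothetical stand-ins, and the fixed-size array is a
simplification I am assuming for illustration only. It shows the 0-th
component being taken from the checkpoint while the remaining components are
a by-component minimum over the consumers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct vclock. */
enum { SKETCH_VCLOCK_MAX = 4 };

struct sketch_vclock {
	int64_t lsn[SKETCH_VCLOCK_MAX];
};

/* Sum of all components, including the 0-th (local) one. */
static int64_t
sketch_vclock_sum(const struct sketch_vclock *v)
{
	int64_t sum = 0;
	for (int i = 0; i < SKETCH_VCLOCK_MAX; i++)
		sum += v->lsn[i];
	return sum;
}

/* By-component minimum written to dst; dst->lsn[0] is left untouched. */
static void
sketch_vclock_min_ignore0(struct sketch_vclock *dst,
			  const struct sketch_vclock *src)
{
	for (int i = 1; i < SKETCH_VCLOCK_MAX; i++) {
		if (src->lsn[i] < dst->lsn[i])
			dst->lsn[i] = src->lsn[i];
	}
}

/*
 * Vclock of the oldest row to keep: start from the checkpoint (so
 * the 0-th component is the checkpoint's), then lower every other
 * component to the minimum over all consumers.
 */
static struct sketch_vclock
sketch_gc_min_vclock(const struct sketch_vclock *checkpoint,
		     const struct sketch_vclock *consumers, int count)
{
	assert(checkpoint != NULL && count >= 0);
	struct sketch_vclock min = *checkpoint;
	for (int i = 0; i < count; i++)
		sketch_vclock_min_ignore0(&min, &consumers[i]);
	return min;
}
```

With a checkpoint at {10, 5, 7, 0} and consumers at {0, 3, 9, 0} and
{0, 6, 2, 0}, the result is {10, 3, 2, 0}: the 0-th component survives, so a
full (non-ignore0) sum of the result can still be compared against gc.vclock.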

> 
> The goal is to switch as many places as possible to ignore0 api,
> then see the few cases left where we don't, and flip around:
> rename ignore0 api to the default naming scheme, and no-ignore0 to
> vclock_copy_local(), vclock_compare_local(), and so on.
> 
> This doesn't have to be part of your patch set, but 
> would be nice to get to.

Sounds good, but I don’t want to do it now. Let it be part of a
follow-up patch.

> 
> Ideally ignore0 vclock should be a distinct data type
> with an explicit conversion to and from non-ignore0 vclock
> and no implicit assignment (using vclock_copy already more
> or less ensures that).
> 
> Other than that you seem to be on track with the patch.
> 
> -- 
> Konstantin Osipov, Moscow, Russia
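
For the follow-up, the "distinct data type" idea quoted above could look
roughly like the sketch below. Nothing here exists in the tree: struct
full_vclock, struct replica_vclock and the conversion helpers are
hypothetical names, and the fixed-size array is an assumption made to keep
the example short. The point is only that wrapping the vclock in a separate
struct makes the compiler reject implicit assignment between the two kinds.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified vclock that keeps the 0-th component. */
enum { FULL_VCLOCK_MAX = 4 };

struct full_vclock {
	int64_t lsn[FULL_VCLOCK_MAX];
};

/*
 * Hypothetical ignore0 flavour. Being a different struct type, it
 * cannot be assigned to or from struct full_vclock by accident.
 */
struct replica_vclock {
	struct full_vclock v;
};

/* Explicit conversion: drop the local (0-th) component. */
static struct replica_vclock
full_to_replica(const struct full_vclock *full)
{
	struct replica_vclock out;
	out.v = *full;
	out.v.lsn[0] = 0;
	return out;
}

/* Explicit conversion back: the caller must supply the local LSN. */
static struct full_vclock
replica_to_full(const struct replica_vclock *src, int64_t local_lsn)
{
	assert(local_lsn >= 0);
	struct full_vclock out = src->v;
	out.lsn[0] = local_lsn;
	return out;
}

/* Sum over components 1..N-1 only. */
static int64_t
replica_vclock_sum(const struct replica_vclock *r)
{
	int64_t sum = 0;
	for (int i = 1; i < FULL_VCLOCK_MAX; i++)
		sum += r->v.lsn[i];
	return sum;
}
```

In such a scheme gc.vclock would stay a full vclock, since it needs a valid
0-th component, while consumer vclocks could be the ignore0 type.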

=============================================
diff --git a/src/box/gc.c b/src/box/gc.c
index 4eae6ef3b..a2d0a515c 100644
--- a/src/box/gc.c
+++ b/src/box/gc.c
@@ -186,11 +186,7 @@ gc_run_cleanup(void)
 	/* At least one checkpoint must always be available. */
 	assert(checkpoint != NULL);
 
-	/*
-	 * Find the vclock of the oldest WAL row to keep.
-	 * Note, we must keep all WALs created after the
-	 * oldest checkpoint, even if no consumer needs them.
-	 */
+	/* Find the vclock of the oldest WAL row to keep. */
 	struct vclock min_vclock;
 	struct gc_consumer *consumer = gc_tree_first(&gc.consumers);
 	/*
@@ -198,6 +194,8 @@ gc_run_cleanup(void)
 	 * minimum of all consumer vclocks and the oldest
 	 * checkpoint vclock. This ensures that all rows needed by
 	 * at least one consumer are kept.
+	 * Note, we must keep all WALs created after the
+	 * oldest checkpoint, even if no consumer needs them.
 	 */
 	vclock_copy(&min_vclock, &checkpoint->vclock);
 	while (consumer != NULL) {


--
Serge Petrenko
sergepetrenko@tarantool.org

Thread overview: 14+ messages
2020-03-27 10:20 [Tarantool-patches] [PATCH v4 0/4] replication: fix local space tracking Serge Petrenko
2020-03-27 10:20 ` [Tarantool-patches] [PATCH v4 1/4] vclock: add an ability to reset individual clock components Serge Petrenko
2020-03-27 10:20 ` [Tarantool-patches] [PATCH v4 2/4] replication: hide 0-th vclock components in replication responses Serge Petrenko
2020-03-28  5:57   ` Konstantin Osipov
2020-03-30 11:02     ` Serge Petrenko
2020-03-30 12:52       ` Konstantin Osipov
2020-03-27 10:20 ` [Tarantool-patches] [PATCH v4 3/4] gc: rely on minimal vclock components instead of signatures Serge Petrenko
2020-03-28  6:03   ` Konstantin Osipov
2020-03-30 11:02     ` Serge Petrenko [this message]
2020-03-30 12:54       ` Konstantin Osipov
2020-03-27 10:20 ` [Tarantool-patches] [PATCH v4 4/4] box: start counting local space requests separately Serge Petrenko
2020-03-28  6:17   ` Konstantin Osipov
2020-03-30 11:02     ` Serge Petrenko
2020-03-28 16:23   ` Konstantin Osipov
