[Tarantool-patches] [PATCH v4 3/4] gc: rely on minimal vclock components instead of signatures
Serge Petrenko
sergepetrenko at tarantool.org
Mon Mar 30 14:02:50 MSK 2020
> On 28 March 2020, at 09:03, Konstantin Osipov <kostja.osipov at gmail.com> wrote:
>
> * Serge Petrenko <sergepetrenko at tarantool.org> [20/03/27 18:08]:
>> + struct vclock min_vclock;
>> + struct gc_consumer *consumer = gc_tree_first(&gc.consumers);
>
> The code would be easier to follow if every vclock API call used
> here were the ignore0 variant:
>
>> + /*
>> + * Vclock of the oldest WAL row to keep is a by-component
>> + * minimum of all consumer vclocks and the oldest
>> + * checkpoint vclock. This ensures that all rows needed by
>> + * at least one consumer are kept.
>> + */
>> + vclock_copy(&min_vclock, &checkpoint->vclock);
>
> E.g. use vclock_copy_ignore0 here
>
>> + while (consumer != NULL) {
>> + /*
>> + * Consumers will never need rows signed
>> + * with a zero instance id (local rows).
>> + */
>> + vclock_min_ignore0(&min_vclock, &consumer->vclock);
>> + consumer = gc_tree_next(&gc.consumers, consumer);
>> + }
>> +
>> + if (vclock_sum(&min_vclock) > vclock_sum(&gc.vclock)) {
>
> Please use vclock_sum_ignore0
>> + vclock_copy(&gc.vclock, &min_vclock);
>
> Please use vclock_copy_ignore0
I can’t. wal_collect_garbage() searches for the WAL file whose vclock is
less than or equal to the one provided, so the 0th vclock component
cannot be zeroed out; otherwise no logs would ever be deleted.
All the other components are taken as a minimum over all consumers and
the oldest snapshot, but the 0th component is copied directly from the
oldest snapshot, since we have to keep all WALs starting from the oldest
snapshot. This is stated in the comment a few lines above. I have now
moved it to a more relevant place.
So gc.vclock must contain a valid 0th component. That’s why I use the
‘local’ vclock_sum() and vclock_copy() here.
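
To illustrate the point, here is a minimal standalone sketch. It does not
use the real Tarantool API: struct toy_vclock and all toy_* functions are
hypothetical names made up for this example, and the WAL lookup is only
modelled as a component-wise "less than or equal" check, which is an
assumption about how the search behaves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy vclock; NOT Tarantool's struct vclock. */
enum { TOY_VCLOCK_MAX = 4 };

struct toy_vclock {
	int64_t lsn[TOY_VCLOCK_MAX];
};

/*
 * Component-wise minimum over components 1..N-1.
 * Component 0 (local rows) of dst is left untouched.
 */
static void
toy_vclock_min_ignore0(struct toy_vclock *dst, const struct toy_vclock *src)
{
	for (int i = 1; i < TOY_VCLOCK_MAX; i++) {
		if (src->lsn[i] < dst->lsn[i])
			dst->lsn[i] = src->lsn[i];
	}
}

/* True if every component of a is <= the matching component of b. */
static bool
toy_vclock_le(const struct toy_vclock *a, const struct toy_vclock *b)
{
	for (int i = 0; i < TOY_VCLOCK_MAX; i++) {
		if (a->lsn[i] > b->lsn[i])
			return false;
	}
	return true;
}

/*
 * Start from the oldest checkpoint vclock (this keeps its 0th
 * component) and fold in every consumer vclock, ignoring component 0.
 */
static struct toy_vclock
toy_gc_min_vclock(const struct toy_vclock *checkpoint,
		  const struct toy_vclock *consumers, int n_consumers)
{
	assert(n_consumers >= 0);
	struct toy_vclock min = *checkpoint;
	for (int i = 0; i < n_consumers; i++)
		toy_vclock_min_ignore0(&min, &consumers[i]);
	return min;
}
```

With a checkpoint at {10, 7, 9, 0} and consumers at {0, 5, 12, 0} and
{0, 8, 3, 0}, the result is {10, 5, 3, 0}. A WAL file at {4, 5, 3, 0}
compares as "less than or equal" to it, but would not if component 0 of
the result were zeroed, so in this model nothing would be collected.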
>
> The goal is to switch as many places as possible to ignore0 api,
> then see the few cases left where we don't, and flip around:
> rename ignore0 api to the default naming scheme, and no-ignore0 to
> vclock_copy_local(), vclock_compare_local().
>
> This doesn't have to be part of your patch set, but
> would be nice to get to.
Sounds good, but I don’t want to do it now. Let it be part of a
follow-up patch.
>
> Ideally ignore0 vclock should be a distinct data type
> with an explicit conversion to and from non-ignore0 vclock
> and no implicit assignment (using vclock_copy already more
> or less ensures that).
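
A rough sketch of what such a distinct type could look like, under stated
assumptions: the toy_* names below are hypothetical and are not part of
the existing vclock API. Because the two structs are different C types,
assigning one to the other does not compile, so every conversion has to
go through an explicit function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy types; NOT Tarantool's struct vclock. */
enum { TOY_VCLOCK_MAX = 4 };

/* Full vclock: component 0 counts local rows. */
struct toy_vclock {
	int64_t lsn[TOY_VCLOCK_MAX];
};

/* Replication view: component 0 is always zero by construction. */
struct toy_vclock_ignore0 {
	int64_t lsn[TOY_VCLOCK_MAX];
};

/* Explicit conversion that drops the local component. */
static struct toy_vclock_ignore0
toy_vclock_to_ignore0(const struct toy_vclock *full)
{
	struct toy_vclock_ignore0 out;
	out.lsn[0] = 0;
	for (int i = 1; i < TOY_VCLOCK_MAX; i++)
		out.lsn[i] = full->lsn[i];
	return out;
}

/*
 * Explicit conversion back: the caller must supply the local
 * component, since the ignore0 view does not carry it.
 */
static struct toy_vclock
toy_vclock_from_ignore0(const struct toy_vclock_ignore0 *view,
			int64_t local_lsn)
{
	assert(local_lsn >= 0);
	struct toy_vclock out;
	out.lsn[0] = local_lsn;
	for (int i = 1; i < TOY_VCLOCK_MAX; i++)
		out.lsn[i] = view->lsn[i];
	return out;
}

/* Sum of the non-local components only. */
static int64_t
toy_vclock_ignore0_sum(const struct toy_vclock_ignore0 *view)
{
	int64_t sum = 0;
	for (int i = 1; i < TOY_VCLOCK_MAX; i++)
		sum += view->lsn[i];
	return sum;
}
```

The conversion back takes the local component as an argument on purpose:
it makes visible the places, like gc.vclock, that need a valid 0th
component.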
>
> Other than that you seem to be on track with the patch.
>
> --
> Konstantin Osipov, Moscow, Russia
=============================================
diff --git a/src/box/gc.c b/src/box/gc.c
index 4eae6ef3b..a2d0a515c 100644
--- a/src/box/gc.c
+++ b/src/box/gc.c
@@ -186,11 +186,7 @@ gc_run_cleanup(void)
/* At least one checkpoint must always be available. */
assert(checkpoint != NULL);
- /*
- * Find the vclock of the oldest WAL row to keep.
- * Note, we must keep all WALs created after the
- * oldest checkpoint, even if no consumer needs them.
- */
+ /* Find the vclock of the oldest WAL row to keep. */
struct vclock min_vclock;
struct gc_consumer *consumer = gc_tree_first(&gc.consumers);
/*
@@ -198,6 +194,8 @@ gc_run_cleanup(void)
* minimum of all consumer vclocks and the oldest
* checkpoint vclock. This ensures that all rows needed by
* at least one consumer are kept.
+ * Note, we must keep all WALs created after the
+ * oldest checkpoint, even if no consumer needs them.
*/
vclock_copy(&min_vclock, &checkpoint->vclock);
while (consumer != NULL) {
--
Serge Petrenko
sergepetrenko at tarantool.org