From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp60.i.mail.ru (smtp60.i.mail.ru [217.69.128.40]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id BAAAE441847 for ; Mon, 30 Mar 2020 14:02:50 +0300 (MSK) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 13.0 \(3608.40.2.2.4\)) From: Serge Petrenko In-Reply-To: <20200328060352.GB23207@atlas> Date: Mon, 30 Mar 2020 14:02:50 +0300 Content-Transfer-Encoding: quoted-printable Message-Id: <1ACFD89D-9FB2-4BA1-A577-6EE741783FA9@tarantool.org> References: <20200328060352.GB23207@atlas> Subject: Re: [Tarantool-patches] [PATCH v4 3/4] gc: rely on minimal vclock components instead of signatures List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Konstantin Osipov Cc: tarantool-patches@dev.tarantool.org, Vladislav Shpilevoy > 28 =D0=BC=D0=B0=D1=80=D1=82=D0=B0 2020 =D0=B3., =D0=B2 09:03, = Konstantin Osipov =D0=BD=D0=B0=D0=BF=D0=B8=D1=81= =D0=B0=D0=BB(=D0=B0): >=20 > * Serge Petrenko [20/03/27 18:08]: >> + struct vclock min_vclock; >> + struct gc_consumer *consumer =3D gc_tree_first(&gc.consumers); >=20 > The code would be easier to follow if the entire vclock api used > would be ignore0: >=20 >> + /* >> + * Vclock of the oldest WAL row to keep is a by-component >> + * minimum of all consumer vclocks and the oldest >> + * checkpoint vclock. This ensures that all rows needed by >> + * at least one consumer are kept. >> + */ >> + vclock_copy(&min_vclock, &checkpoint->vclock); >=20 > E.g. use vclock_copy_ignore0 here >=20 >> + while (consumer !=3D NULL) { >> + /* >> + * Consumers will never need rows signed >> + * with a zero instance id (local rows). >> + */ >> + vclock_min_ignore0(&min_vclock, &consumer->vclock); >> + consumer =3D gc_tree_next(&gc.consumers, consumer); >> + } >> + >> + if (vclock_sum(&min_vclock) > vclock_sum(&gc.vclock)) { >=20 > Please use vclock_sum_ignore0 >> + vclock_copy(&gc.vclock, &min_vclock); >=20 > Please use vclock_copy_ignore0 I can=E2=80=99t. wal_collect_garbage() searches for the wal file with = vclock strictly less than or equal to the one provided, so 0-th clock component cannot be zeroed out, otherwise no logs will ever be deleted. While all the other components are taken as a minimum between all consumers and the oldest snapshot, the 0-th component is copied directly from the oldest snapshot, since we have to keep all WALs starting from = the oldest snapshot. It is stated in the comment a few lines above. Now i = moved it to a more relevant place. So, gc.vclock must contain a valid 0th component. That=E2=80=99s why I = use =E2=80=98local=E2=80=99 vclock_sum() and vclock_copy() here. >=20 > The goal is to switch as many places as possible to ignore0 api, > then see the few cases left where we don't, and flip around:=20 > rename ignore0 api to the default naming scheme, and no-ignore0 to > vlock_copy_local(), vclock_compare_local() vclock_copy_local(). >=20 > This doesn't have to be part of your patch set, but=20 > would be nice to get to. Sounds good, I don=E2=80=99t want to do it now though. Let it be part of = a follow up patch. >=20 > Ideally ignore0 vclock should be a distinct data type > with an explicit conversion to and from non-ignore0 vclock > and no implicit assignment (using vclock_copy already more > or less ensures that). >=20 > Other than that you seem to be on track with the patch. >=20 > --=20 > Konstantin Osipov, Moscow, Russia =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D diff --git a/src/box/gc.c b/src/box/gc.c index 4eae6ef3b..a2d0a515c 100644 --- a/src/box/gc.c +++ b/src/box/gc.c @@ -186,11 +186,7 @@ gc_run_cleanup(void) /* At least one checkpoint must always be available. */ assert(checkpoint !=3D NULL); =20 - /* - * Find the vclock of the oldest WAL row to keep. - * Note, we must keep all WALs created after the - * oldest checkpoint, even if no consumer needs them. - */ + /* Find the vclock of the oldest WAL row to keep. */ struct vclock min_vclock; struct gc_consumer *consumer =3D gc_tree_first(&gc.consumers); /* @@ -198,6 +194,8 @@ gc_run_cleanup(void) * minimum of all consumer vclocks and the oldest * checkpoint vclock. This ensures that all rows needed by * at least one consumer are kept. + * Note, we must keep all WALs created after the + * oldest checkpoint, even if no consumer needs them. */ vclock_copy(&min_vclock, &checkpoint->vclock); while (consumer !=3D NULL) { -- Serge Petrenko sergepetrenko@tarantool.org