From: Serge Petrenko
Date: Thu, 19 Mar 2020 14:59:24 +0300
Subject: Re: [Tarantool-patches] [PATCH v2 1/5] box: introduce matrix clock
To: Konstantin Osipov
Cc: tarantool-patches@dev.tarantool.org, Vladislav Shpilevoy, kirichenkoga@gmail.com

> On 19 March 2020, at 14:56, Konstantin Osipov wrote:
>
> * Serge Petrenko [20/03/19 14:29]:
>> First of all, let me describe the issue with the current WAL GC:
>> it tracks consumers (i.e. remote replicas) by their vclock signature,
>> which is the sum of all vclock components.
>> It has always been wrong (since the introduction of vclocks, at least):
>> Say, you have 2 masters, A and B with ids 1 and 2 respectively, and a replica C with id 3.
>> The example will be a little synthetic, but it illustrates the problem:
>> Say C replicates from both A and B, and there is no replication between A and B (say, the
>> instances were reconfigured to not replicate from each other).
>> Now, say replica C has followed A and B to vclock {1:5, 2:13}. At the same time, A has lsn 10
>> and B has lsn 15. A and B do not know about each other’s changes, so A’s vclock is {1:10} and
>> B’s vclock is {2:15}. Now imagine A does a snapshot and creates a new xlog with signature 10.
>> A’s directory will look like: 00…000.xlog 00…010.snap 00….010.xlog
>> Replica C reports its vclock {1:5, 2:13} to A, A uses the vclock to update the corresponding GC
>> consumer. Since signatures are used, GC consumer is assigned a signature = 13 + 5 = 18.
>> This is greater than the signature of the last xlog on A (10), so the previous xlog (00…00.xlog) can be
>> deleted (at least A assumes it can be). Actually, replica still needs 00…00.xlog, because it contains
>> rows corresponding to vclocks {1:6} - {1:10}, which haven’t been replicated yet.
>>
>> If instead of using vclock signatures, gc consumers used vclocks, such a problem wouldn’t arise.
>> Replica would report its vclock {1:5, 2:13}. The vclock is NOT strictly greater than A’s most recent
>> xlog vclock ({1:10}), so the previous log is kept until replica reports a vclock {1:10, 2:something}.
>> (or {1:11, …} and so on).
>
> This explanation belongs to the commit comment.
>

Ok, will amend.

>
>
> --
> Konstantin Osipov, Moscow, Russia

--
Serge Petrenko
sergepetrenko@tarantool.org
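
For illustration only, the A/B/C example quoted above can be replayed in a few lines of Python. This is a rough sketch, not Tarantool's GC code: a vclock is modelled as a plain dict of replica id to lsn, and the helper names (signature, covers, may_delete_by_signature, may_delete_by_vclock) are made up for this example. The component-wise ">=" check is an assumption based on the statement that the old xlog is kept until the replica reports {1:10, 2:something}.

```python
# Sketch only: vclocks are dicts {replica_id: lsn}; all names are hypothetical.

def signature(vclock):
    """Vclock signature: the sum of all vclock components."""
    return sum(vclock.values())

def covers(consumer, xlog_vclock):
    """True if the consumer has reached xlog_vclock in every component."""
    return all(consumer.get(rid, 0) >= lsn for rid, lsn in xlog_vclock.items())

def may_delete_by_signature(consumer, newest_xlog_vclock):
    """Signature-based check: compares sums only, ignoring which ids they come from."""
    return signature(consumer) >= signature(newest_xlog_vclock)

def may_delete_by_vclock(consumer, newest_xlog_vclock):
    """Vclock-based check: the previous xlog goes only once the consumer covers the newest one."""
    return covers(consumer, newest_xlog_vclock)

replica_c = {1: 5, 2: 13}      # what C reports to A
newest_xlog_on_a = {1: 10}     # vclock of A's most recent xlog

print(may_delete_by_signature(replica_c, newest_xlog_on_a))    # True: 18 >= 10, old xlog dropped too early
print(may_delete_by_vclock(replica_c, newest_xlog_on_a))       # False: C is still at 1:5
print(may_delete_by_vclock({1: 10, 2: 13}, newest_xlog_on_a))  # True: C has caught up with A
```

The signature check passes only because the lsn 13 that C received from B inflates the sum; the component-wise check ignores it, since A's xlog has no rows from id 2.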