Date: Thu, 19 Mar 2020 14:56:24 +0300
From: Konstantin Osipov
To: Serge Petrenko
Cc: tarantool-patches@dev.tarantool.org, Vladislav Shpilevoy, kirichenkoga@gmail.com
Subject: Re: [Tarantool-patches] [PATCH v2 1/5] box: introduce matrix clock
Message-ID: <20200319115624.GA17950@atlas>
In-Reply-To: <022BA1E5-1F84-400F-BF09-D363338CC296@tarantool.org>
References: <680467d22cb2864fb4c2d18ac33c4cccb272ebbb.1584558067.git.sergepetrenko@tarantool.org>
 <20200318200846.GB17681@atlas>
 <09ba01d5fdc5$f5e1fab0$e1a5f010$@tarantool.org>
 <20200319084144.GD5707@atlas>
 <20200319091706.GH188@tarantool.org>
 <022BA1E5-1F84-400F-BF09-D363338CC296@tarantool.org>
List-Id: Tarantool development patches

* Serge Petrenko [20/03/19 14:29]:
> First of all, let me describe the issue with the current WAL GC:
> it tracks consumers (i.e. remote replicas) by their vclock signature,
> which is the sum of all vclock components.
> It has always been wrong (since the introduction of vclocks, at least):
> Say, you have 2 masters, A and B with ids 1 and 2 respectively, and a replica C with id 3.
> The example will be a little synthetic, but it illustrates the problem:
> Say C replicates from both A and B, and there is no replication between A and B (say, the
> instances were reconfigured to not replicate from each other).
> Now, say replica C has followed A and B to vclock {1:5, 2:13}. At the same time, A has lsn 10
> and B has lsn 15. A and B do not know about each other’s changes, so A’s vclock is {1:10} and
> B’s vclock is {2:15}. Now imagine A does a snapshot and creates a new xlog with signature 10.
> A’s directory will look like: 00…000.xlog 00…010.snap 00…010.xlog
> Replica C reports its vclock {1:5, 2:13} to A, and A uses the vclock to update the corresponding GC
> consumer. Since signatures are used, the GC consumer is assigned a signature = 13 + 5 = 18.
> This is greater than the signature of the last xlog on A (10), so the previous xlog (00…000.xlog) can be
> deleted (at least A assumes it can be). Actually, the replica still needs 00…000.xlog, because it contains
> rows corresponding to vclocks {1:6} - {1:10}, which haven’t been replicated yet.
>
> If instead of using vclock signatures, gc consumers used vclocks, such a problem wouldn’t arise.
> The replica would report its vclock {1:5, 2:13}. The vclock is NOT strictly greater than A’s most recent
> xlog vclock ({1:10}), so the previous log is kept until the replica reports a vclock {1:10, 2:something}
> (or {1:11, …} and so on).

This explanation belongs in the commit comment.

-- 
Konstantin Osipov, Moscow, Russia
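
For reference, the arithmetic in the quoted example can be checked with a small Python sketch. It is only an illustration of the two comparison rules under the assumptions stated in the quote; the function names (signature, vclock_ge, can_delete_by_signature, can_delete_by_vclock) are hypothetical and are not Tarantool's actual GC API. A vclock is modelled as a dict mapping replica id to lsn, with missing components treated as 0.

```python
# Hypothetical sketch: names and data layout are illustrative only,
# not Tarantool's real implementation.

def signature(vclock):
    """Vclock signature: the sum of all vclock components."""
    return sum(vclock.values())

def vclock_ge(a, b):
    """True if a >= b component-wise; a missing component counts as 0."""
    return all(a.get(replica_id, 0) >= lsn for replica_id, lsn in b.items())

def can_delete_by_signature(consumer, next_xlog):
    """Signature-based rule: the consumer's signature has reached the next xlog's."""
    return signature(consumer) >= signature(next_xlog)

def can_delete_by_vclock(consumer, next_xlog):
    """Vclock-based rule: every component of the consumer has reached the next xlog's."""
    return vclock_ge(consumer, next_xlog)

# Values from the example: replica C is at {1:5, 2:13}, and A's newest
# xlog starts at vclock {1:10}.
replica_c = {1: 5, 2: 13}
new_xlog_on_a = {1: 10}

print(signature(replica_c))                               # 18
print(can_delete_by_signature(replica_c, new_xlog_on_a))  # True: old xlog dropped too early
print(can_delete_by_vclock(replica_c, new_xlog_on_a))     # False: old xlog is kept
```

With the component-wise rule, the previous xlog only becomes collectable once the replica reports something like {1:10, 2:13}, which matches the behaviour described in the quote.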