From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Date: Wed, 13 Jun 2018 23:55:27 +0300
From: Konstantin Osipov
Subject: Re: [PATCH v2 08/11] replication: rebootstrap instance on startup if it fell behind
Message-ID: <20180613205527.GE10632@chai>
References: <7864f122209e681a81d4eb59ba3f52188e5051e9.1528478913.git.vdavydov.dev@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <7864f122209e681a81d4eb59ba3f52188e5051e9.1528478913.git.vdavydov.dev@gmail.com>
To: Vladimir Davydov
Cc: tarantool-patches@freelists.org
List-ID:

* Vladimir Davydov [18/06/08 20:38]:
> If a replica fell too much behind its peers in the cluster and xlog
> files needed for it to get up to speed have been removed, it won't be
> able to proceed without rebootstrap. This patch makes the recovery
> procedure detect such cases and initiate rebootstrap procedure if
> necessary.
>
> Note, rebootstrap is currently only supported by memtx engine. If there
> are vinyl spaces on the replica, rebootstrap will fail. This is fixed by
> the following patches.

A nitpick, but this undermines the whole point of factoring out local
recovery. If local_recovery() can fall back to bootstrap_from_master(),
then the name is misleading. Please make sure the control flow and
decision making stay in box_cfg().

>
> +bool
> +replicaset_needs_rejoin(struct replica **master)
> +{
> +	replicaset_foreach(replica) {
> +		if (replica->applier != NULL &&
> +		    vclock_compare(&replica->applier->gc_vclock,
> +				   &replicaset.vclock) > 0) {
> +			*master = replica;
> +			return true;
> +		}
> +	}
> +	*master = NULL;
> +	return false;
> +}

Intuitively, this function should return true only if *none* of the
masters can provide the replica with the necessary logs, not as soon as
*any* one of them cannot.

-- 
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov
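
For illustration only, the "none of the masters" semantics requested
in the review could look roughly like the sketch below. It is not taken
from the patch or from the Tarantool sources. To keep it self-contained,
struct vclock, struct applier, struct replica, sketch_vclock_compare()
and needs_rejoin() are all simplified, hypothetical stand-ins, and the
replicas are passed in as an array instead of being walked with
replicaset_foreach(). It also assumes that a master whose gc vclock is
incomparable with the local vclock cannot provide the logs.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* All types below are hypothetical stand-ins, not Tarantool's. */
enum { VCLOCK_MAX = 4, ORDER_UNDEFINED = INT_MAX };

/* Simplified vector clock: one LSN per instance id. */
struct vclock { long lsn[VCLOCK_MAX]; };

/* Only the master's garbage collection vclock is modelled. */
struct applier { struct vclock gc_vclock; };

/* applier is NULL when there is no connection to this replica. */
struct replica { struct applier *applier; };

/* Returns -1, 0 or 1, or ORDER_UNDEFINED if a and b are incomparable. */
static int
sketch_vclock_compare(const struct vclock *a, const struct vclock *b)
{
	bool le = true, ge = true;
	for (int i = 0; i < VCLOCK_MAX; i++) {
		if (a->lsn[i] < b->lsn[i])
			ge = false;
		if (a->lsn[i] > b->lsn[i])
			le = false;
	}
	if (le && ge)
		return 0;
	if (le)
		return -1;
	return ge ? 1 : ORDER_UNDEFINED;
}

/*
 * Return true only if no connected master still has the logs the
 * local instance needs. On true, *master is the first connected
 * master seen (a candidate to rejoin from); otherwise it is NULL.
 */
static bool
needs_rejoin(struct replica *replicas, size_t count,
	     const struct vclock *local, struct replica **master)
{
	assert(master != NULL);
	struct replica *candidate = NULL;
	for (size_t i = 0; i < count; i++) {
		struct replica *r = &replicas[i];
		if (r->applier == NULL)
			continue;
		if (sketch_vclock_compare(&r->applier->gc_vclock,
					  local) <= 0) {
			/* This master can still feed us: no rejoin. */
			*master = NULL;
			return false;
		}
		if (candidate == NULL)
			candidate = r;
	}
	*master = candidate;
	return candidate != NULL;
}
```

With no connected masters at all, this sketch returns false, since
there is nobody to rejoin from. Whether that is the right behaviour
for the real function is a separate design question.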