From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Date: Mon, 10 Jun 2019 18:24:50 +0300
From: Vladimir Davydov
Subject: Re: [tarantool-patches] Re: [PATCH 03/10] vinyl: move vylog recovery to vylog thread
Message-ID: <20190610152450.etzc4ynus3yvmmno@esperanza>
References: <69dfb6ed09c655e842ae9200598fce9c62176998.1558103547.git.vdavydov.dev@gmail.com> <20190601083600.GD29429@atlas> <20190606102322.ulfupnnts23gnyvs@esperanza> <20190607133954.GB31327@atlas>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20190607133954.GB31327@atlas>
To: Konstantin Osipov
Cc: tarantool-patches@freelists.org
List-ID:

On Fri, Jun 07, 2019 at 04:39:54PM +0300, Konstantin Osipov wrote:
> * Vladimir Davydov [19/06/06 13:24]:
> > > > We used coio, because vylog was written from a WAL thread, which
> > > > shouldn't be used for such a heavy operation as vylog recovery.
> > > > Now, we can move it to the dedicated vylog thread. This allows
> > > > us to simplify rotation logic as well: now most of work is done
> > > > from the same function (vy_log_rotate_f) executed by vylog thread,
> > > > not scattered between coio and WAL, as it used to be.
> > >
> > > Why do we need to lock out the scheduler while rotating the log in
> > > the first place?
> >
> > We rotate vylog by first reading the old vylog and forming a recovery
> > context, then dumping this recovery context to the new vylog. If a new
> > record appears in the old vylog in between, it will be missing in the
> > new vylog. That's why we lock out writers.
>
> We have two layers of abstractions intermixed here. During
> snapshotting, when we really rotate the vylog, no DDL can happen,
> it's locked out. So no one can take the problematic latch
> anyway.

Except compaction, which isn't locked out by checkpointing.

> So there is, strictly speaking, no problem at all. But
> since we're using a low level latch, and not a centralized
> mechanism to lock out writers, we wouldn't know.
>
> One option could be to append the writes to vylog which happen
> during checkpointing to the vylog buffer, and not flush them to
> the vylog file which is about-to-become-obsolete.

We must flush those records to disk, otherwise we risk losing data.

>
> Anyway, I keep thinking that if you want to kill a latch, there is
> a dozen of ways of killing it, not an own thread.

What's so wrong about the new thread? Could you please give some insight
into why we should avoid introducing a separate thread for vylog at all
costs at this point?
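
To make the rotation race described above concrete, here is a toy,
single-threaded sketch. It is not the vylog code: every name in it
(toy_log, toy_vylog, toy_vylog_write, toy_vylog_rotate) is hypothetical,
and a plain array stands in for the log file. It only models the order
of events: the old log is read into a recovery context, a writer shows
up, and then the context is dumped to the new log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_LOG_CAP 16

/* Toy append-only log: an array of record ids (hypothetical type). */
struct toy_log {
	int rec[TOY_LOG_CAP];
	size_t len;
};

/* Toy stand-in for vylog: the current log file plus a writer lock. */
struct toy_vylog {
	struct toy_log cur;
	bool writers_locked;
};

static bool
toy_log_contains(const struct toy_log *log, int rec)
{
	for (size_t i = 0; i < log->len; i++) {
		if (log->rec[i] == rec)
			return true;
	}
	return false;
}

/*
 * Append a record to the current log. Returns false if writers are
 * locked out (a real writer would wait and retry) or the log is full.
 */
static bool
toy_vylog_write(struct toy_vylog *vl, int rec)
{
	assert(vl != NULL);
	if (vl->writers_locked || vl->cur.len == TOY_LOG_CAP)
		return false;
	vl->cur.rec[vl->cur.len++] = rec;
	return true;
}

/*
 * Rotate the log while a writer tries to append concurrent_rec in the
 * middle of rotation. Returns true if that write was accepted into the
 * old log, i.e. landed in a file that is about to be replaced.
 */
static bool
toy_vylog_rotate(struct toy_vylog *vl, bool lock_writers, int concurrent_rec)
{
	vl->writers_locked = lock_writers;
	/* Step 1: read the old log and form the recovery context. */
	struct toy_log ctx = vl->cur;
	/* A writer shows up between the read and the dump. */
	bool accepted = toy_vylog_write(vl, concurrent_rec);
	/* Step 2: dump the context to the new log and switch to it. */
	vl->cur = ctx;
	vl->writers_locked = false;
	return accepted;
}
```

With lock_writers set to false, the mid-rotation write is accepted into
the old log and is then absent from the new one. With lock_writers set
to true, the write is refused during rotation, and a retry after
rotation lands in the new log.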