From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Date: Tue, 4 Dec 2018 14:25:20 +0300
From: Vladimir Davydov
Subject: Re: [tarantool-patches] Re: [PATCH 9/9] wal: trigger checkpoint if there are too many WALs
Message-ID: <20181204112520.2di4acmhts24oj32@esperanza>
References: <20181203203417.GI2890@chai>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20181203203417.GI2890@chai>
To: Konstantin Osipov
Cc: tarantool-patches@freelists.org
List-ID:

On Mon, Dec 03, 2018 at 11:34:17PM +0300, Konstantin Osipov wrote:
> * Vladimir Davydov [18/11/28 19:16]:
>
> Please avoid using 0 for infinity: Tarantool doesn't use 0 to mean
> anything special.

As a matter of fact, we do - setting checkpoint_interval/count to 0
results in infinite checkpoint interval/count. I want to make
checkpoint_wal_threshold consistent with those configuration options.

Anyway, if 0 doesn't mean infinity, what value should one set
checkpoint_wal_threshold to in order to disable this feature? A very
large value? What value? 100 GB, 100 TB? Would look weird in box.cfg IMO.

>
> > Closes #1082
> >
> > @TarantoolBot document
> > Title: Document box.cfg.checkpoint_wal_threshold
>
> Please document the default value of the new variable.

OK.

>
> Please add checks for the range of valid values of the new
> variable, as well as tests for these.

We don't check checkpoint_interval - setting it to a value <= 0 means
infinite timeout. I thought, why bother about checkpoint_wal_threshold
then?

>
> > +	int64_t checkpoint_wal_size;
> > +	/**
> > +	 * If greater than 0
>
> Ugh.
>
> +	 , this variable sets a limit on the
> > +	 * total size of WAL files written since the last checkpoint.
> > +	 * Exceeding it will trigger auto checkpointing in tx.
> > +	 */
> > +	int64_t checkpoint_threshold;
>
>
> > +	bool checkpoint_threshold_signalled;
> > +	bool checkpoint_threshold_exceeded;
>
> If you had the checkpoint object with all the messages in
> the wal writer singleton, then the entire checkpoint state,
> including this variable, could be easily observed in a single
> place. Now that I see this flag I'm more inclined to insist
> on having a singleton wal_checkpoint object, inside struct
> wal_writer or standalone.

I'll remove checkpoint_threshold_exceeded and will use a separate
message for this kind of notification instead of piggybacking on a WAL
request, as we agreed.

Regarding checkpoint_threshold_signalled, quite frankly, I don't think
that introducing a new checkpoint state struct and putting it in there
would make the code look any better. This flag isn't really bound to
checkpointing - it merely indicates whether we've already triggered a
checkpoint, while checkpointing may or may not be in progress.

>
> > +void
> > +wal_set_checkpoint_threshold(int64_t checkpoint_threshold)
> > +{
> > +	struct wal_writer *writer = &wal_writer_singleton;
> > +	if (writer->wal_mode == WAL_NONE)
> > +		return;
> > +	struct wal_set_checkpoint_threshold_msg msg;
> > +	msg.checkpoint_threshold = checkpoint_threshold;
> > +	bool cancellable = fiber_set_cancellable(false);
> > +	cbus_call(&wal_thread.wal_pipe, &wal_thread.tx_prio_pipe,
> > +		  &msg.base, wal_set_checkpoint_threshold_f, NULL,
> > +		  TIMEOUT_INFINITY);
> > +	fiber_set_cancellable(cancellable);
> > +}
>
> Please add a comment explaining that WAL_NONE is also set when wal
> is not yet initialized.

OK.

>
> I don't see where you calculate the value of the variable upon
> server start. Did I miss this hunk?

No, it is set by load_cfg.lua, just like box.cfg.checkpoint_interval.
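
To illustrate the point about checkpoint_threshold_signalled made above,
here is a rough standalone sketch of how such a flag could work. It is
not the code from the patch: struct wal_ckpt_state, wal_ckpt_account()
and wal_ckpt_reset() are hypothetical names, and clearing the flag when
a checkpoint completes is an assumption made for the sketch. The field
names and the "greater than 0 sets a limit" rule are taken from the
quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant fields of struct wal_writer. */
struct wal_ckpt_state {
	/* Bytes written to WAL files since the last checkpoint. */
	int64_t checkpoint_wal_size;
	/* Limit on the above; a value <= 0 disables the trigger. */
	int64_t checkpoint_threshold;
	/* Set once tx has been notified; says nothing about progress. */
	bool checkpoint_threshold_signalled;
};

/*
 * Account for a completed WAL write of 'written' bytes.
 * Returns true if the caller should notify tx that the threshold
 * has been exceeded. At most one notification is requested until
 * the state is reset.
 */
static bool
wal_ckpt_account(struct wal_ckpt_state *s, int64_t written)
{
	assert(written >= 0);
	s->checkpoint_wal_size += written;
	if (s->checkpoint_threshold <= 0)
		return false; /* feature disabled */
	if (s->checkpoint_threshold_signalled)
		return false; /* already notified */
	if (s->checkpoint_wal_size <= s->checkpoint_threshold)
		return false; /* limit not exceeded yet */
	s->checkpoint_threshold_signalled = true;
	return true;
}

/* Assumed to be called once a checkpoint has been made. */
static void
wal_ckpt_reset(struct wal_ckpt_state *s)
{
	s->checkpoint_wal_size = 0;
	s->checkpoint_threshold_signalled = false;
}
```

The flag only records that a notification has been sent, so repeated
writes above the limit do not flood tx with messages. Whether a
checkpoint is actually running at that moment is tracked elsewhere.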