[tarantool-patches] Re: [PATCH 9/9] wal: trigger checkpoint if there are too many WALs

Vladimir Davydov vdavydov.dev at gmail.com
Tue Dec 4 14:25:20 MSK 2018


On Mon, Dec 03, 2018 at 11:34:17PM +0300, Konstantin Osipov wrote:
> * Vladimir Davydov <vdavydov.dev at gmail.com> [18/11/28 19:16]:
> 
> Please avoid using 0 for infinity: Tarantool doesn't use 0 to mean
> anything special.

As a matter of fact, we do - setting checkpoint_interval/count to 0
results in infinite checkpoint interval/count. I want to make
checkpoint_wal_threshold consistent with those configuration options.

Anyway, if 0 doesn't mean infinity, what should one set
checkpoint_wal_threshold to to disable this feature? A very large value?
What value? 100 GB, 100 TB? Would look weird in box.cfg IMO.
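
To illustrate the semantics I have in mind, here is a minimal sketch. The
helper name wal_threshold_exceeded is hypothetical and is not part of the
patch; it only shows how a non-positive value could be treated as "no
limit", in line with checkpoint_interval/count.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch: returns true if the
 * total WAL size written since the last checkpoint exceeds the
 * configured threshold. A threshold <= 0 is assumed to mean
 * "no limit", so the check never fires in that case.
 */
static bool
wal_threshold_exceeded(int64_t wal_size, int64_t threshold)
{
	if (threshold <= 0)
		return false;
	return wal_size > threshold;
}
```

With this convention, disabling the feature is a matter of setting the
option to 0 rather than picking an arbitrarily large number.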

> 
> > Closes #1082
> > 
> > @TarantoolBot document
> > Title: Document box.cfg.checkpoint_wal_threshold
> 
> Please document the default value of the new variable. 

OK.

> 
> Please add checks for the range of valid values of the new
> variable, as well as tests for these.

We don't check checkpoint_interval - setting it to a value <= 0 means
infinite timeout. So I thought, why bother checking
checkpoint_wal_threshold then?

> 
> > +	int64_t checkpoint_wal_size;
> > +	/**
> > +	 * If greater than 0
> 
> Ugh.
> > + , this variable sets a limit on the
> > +	 * total size of WAL files written since the last checkpoint.
> > +	 * Exceeding it will trigger auto checkpointing in tx.
> > +	 */
> > +	int64_t checkpoint_threshold;
> 
> 
> > +	bool checkpoint_threshold_signalled;
> > +	bool checkpoint_threshold_exceeded;
> 
> If you had the checkpoint object with all the messages in
> the wal writer singleton, then the entire checkpoint state,
> including this variable, could be easily observed in a single
> place. Now that I see this flag I'm more inclined to insist
> on having a singleton wal_checkpoint object, inside struct
> wal_writer or standalone.

I'll remove checkpoint_threshold_exceeded and will use a separate
message for this kind of notifications instead of piggybacking on
a WAL request, as we agreed.

Regarding checkpoint_threshold_signalled, quite frankly, I don't think
that introducing a new checkpoint state struct and putting it in there
would make the code look any better. This flag isn't really bound to
checkpointing - it merely indicates whether we've already triggered a
checkpoint, while checkpointing itself may or may not be in progress.
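
Roughly, the flag acts as a one-shot latch. The sketch below is not the
patch code: the struct wal_threshold_state and both function names are
hypothetical, and resetting the flag when a checkpoint completes is my
assumption for the purpose of the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state, loosely modelled on the fields in the patch. */
struct wal_threshold_state {
	/* Bytes written to WAL since the last checkpoint. */
	int64_t checkpoint_wal_size;
	/* Limit on the above; <= 0 is assumed to mean "no limit". */
	int64_t checkpoint_threshold;
	/* Set once tx has been notified, to avoid repeated signals. */
	bool checkpoint_threshold_signalled;
};

/*
 * Account for newly written bytes. Returns true only the first time
 * the threshold is exceeded, i.e. when tx should be notified.
 */
static bool
wal_threshold_on_write(struct wal_threshold_state *s, int64_t bytes)
{
	s->checkpoint_wal_size += bytes;
	if (s->checkpoint_threshold <= 0 ||
	    s->checkpoint_threshold_signalled)
		return false;
	if (s->checkpoint_wal_size <= s->checkpoint_threshold)
		return false;
	s->checkpoint_threshold_signalled = true;
	return true;
}

/* Assumed reset point: called once a checkpoint has completed. */
static void
wal_threshold_on_checkpoint(struct wal_threshold_state *s)
{
	s->checkpoint_wal_size = 0;
	s->checkpoint_threshold_signalled = false;
}
```

Note that the latch says nothing about whether a checkpoint is actually
running, which is why I don't see it as part of a checkpoint state object.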

> 
> > +void
> > +wal_set_checkpoint_threshold(int64_t checkpoint_threshold)
> > +{
> > +	struct wal_writer *writer = &wal_writer_singleton;
> > +	if (writer->wal_mode == WAL_NONE)
> > +		return;
> > +	struct wal_set_checkpoint_threshold_msg msg;
> > +	msg.checkpoint_threshold = checkpoint_threshold;
> > +	bool cancellable = fiber_set_cancellable(false);
> > +	cbus_call(&wal_thread.wal_pipe, &wal_thread.tx_prio_pipe,
> > +		  &msg.base, wal_set_checkpoint_threshold_f, NULL,
> > +		  TIMEOUT_INFINITY);
> > +	fiber_set_cancellable(cancellable);
> > +}
> 
> Please add a comment explaining that WAL_NONE is also set when wal
> is not yet initialized.

OK.

> 
> I don't see where you calculate the value of the variable upon
> server start. Did I miss this hunk?

No, it is set by load_cfg.lua, just like box.cfg.checkpoint_interval.


