Tarantool development patches archive
 help / color / mirror / Atom feed
From: Konstantin Osipov <kostja@tarantool.org>
To: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: tarantool-patches@freelists.org
Subject: Re: [PATCH v2 2/4] wal: preallocate disk space before writing rows
Date: Tue, 23 Oct 2018 21:33:22 +0300	[thread overview]
Message-ID: <20181023183322.GB29694@chai> (raw)
In-Reply-To: <484c921eb04a0a94563dcccf058bb5e8f96d3a49.1540314925.git.vdavydov.dev@gmail.com>

* Vladimir Davydov <vdavydov.dev@gmail.com> [18/10/23 20:30]:
> This function introduces a new xlog method xlog_fallocate() that makes
> sure that the requested amount of disk space is available at the current
> write position. It does that with posix_fallocate(). The new method is
> called before writing anything to WAL, see wal_fallocate(). In order not
> to invoke the system call too often, wal_fallocate() allocates disk
> space in big chunks (1 MB).
> 
> The reason why I'm doing this is that I want to have a single and
> clearly defined point in the code to handle ENOSPC errors, where
> I could delete old WALs and retry (this is what #3397 is about).
> 
> Needed for #3397

Please consider comment about member names. 
The patch is OK to push.
> index 1d64a7bd..fc495547 100644
> --- a/src/box/journal.h
> +++ b/src/box/journal.h
> @@ -59,6 +59,10 @@ struct journal_entry {
>  	 */
>  	struct fiber *fiber;
>  	/**
> +	 * Max size the rows are going to take when encoded.
> +	 */
> +	size_t len;

Since this is not an exact value, I suggest to call it est_len.
max_len is also OK but worse than est_len.

Any name which would indicate that it's not an exact len is fine:

footprint
xlog_max_len
xlog_len

> +	/**
>  	 * The number of rows in the request.
>  	 */
>  	int n_rows;
> diff --git a/src/box/txn.c b/src/box/txn.c
> index 17d97d76..9b465561 100644
> --- a/src/box/txn.c
> +++ b/src/box/txn.c
> @@ -272,6 +272,7 @@ txn_write_to_wal(struct txn *txn)
>  		if (stmt->row == NULL)
>  			continue; /* A read (e.g. select) request */
>  		*row++ = stmt->row;
> +		req->len += xrow_len_max(stmt->row);

req->est_len += xrow_est_len();
req->max_len += xrow_max_len();

>  
>  struct wal_msg {
>  	struct cmsg base;
> +	/**
> +	 * Max size the committed requests are going to take when
> +	 * written to disk.
> +	 */
> +	size_t len;

Same here.

> +	/*
> +	 * The actual write size can be greater than the sum size
> +	 * of encoded rows (compression, fixheaders). Double the
> +	 * given length to get a rough upper bound estimate.
> +	 */
> +	len *= 2;
> +
> +ssize_t
> +xlog_fallocate(struct xlog *log, size_t len)
> +{
> +#ifdef HAVE_POSIX_FALLOCATE
> +	int rc = posix_fallocate(log->fd, log->offset + log->alloc_len, len);
> +	if (rc != 0) {
> +		errno = rc;
> +		diag_set(SystemError, "%s: can't allocate disk space",
> +			 log->filename);
> +		return -1;
> +	}
> +	log->alloc_len += len;

I don't like this name either, but don't know of a better one.

log->reserved perhaps? or simply log->allocated? 

> +	return 0;
> +#else
> +	(void)log;
> +	(void)len;
> +	return 0;
> +#endif /* HAVE_POSIX_FALLOCATE */
> +}
> +
> +	if (log->alloc_len > (size_t)written)
> +		log->alloc_len -= written;

Imagine the name is allocated:

 if (log->allocated > written)
      log->allocated -= written;

Seems to be more clear?

> +	else
> +		log->alloc_len = 0;
>  	log->offset += written;
>  	log->rows += log->tx_rows;
>  	log->tx_rows = 0;
> @@ -1378,6 +1403,17 @@ xlog_write_eof(struct xlog *l)
>  		diag_set(ClientError, ER_INJECTION, "xlog write injection");
>  		return -1;
>  	});
> +
> +	/*
> +	 * Free disk space preallocated with xlog_fallocate().
> +	 * Don't write the eof marker if this fails, otherwise
> +	 * we'll get "data after eof marker" error on recovery.

I would simply remove this check - I don't see how it's useful as
long as we never append to an existing xlog.

> +	 */
> +	if (l->alloc_len > 0 && ftruncate(l->fd, l->offset) < 0) {
> +		diag_set(SystemError, "ftruncate() failed");
> +		return -1;
> +	}
> +
>  	if (fio_writen(l->fd, &eof_marker, sizeof(eof_marker)) < 0) {
>  		diag_set(SystemError, "write() failed");
>  		return -1;
> @@ -1793,6 +1829,15 @@ xlog_cursor_next_tx(struct xlog_cursor *i)
>  		return -1;
>  	if (rc > 0)
>  		return 1;
> +	if (load_u32(i->rbuf.rpos) == 0) {
> +		/*
> +		 * Space preallocated with xlog_fallocate().
> +		 * Treat as eof and clear the buffer.
> +		 */
> +		i->read_offset -= ibuf_used(&i->rbuf);
> +		ibuf_reset(&i->rbuf);

Especially since you added this. 

> +		return 1;
> +	}

-- 
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov

  reply	other threads:[~2018-10-23 18:33 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-23 17:26 [PATCH v2 0/4] Delete old WAL files if running out of disk space Vladimir Davydov
2018-10-23 17:26 ` [PATCH v2 1/4] xlog: turn use_coio argument of xdir_collect_garbage to flags Vladimir Davydov
2018-10-23 18:17   ` Konstantin Osipov
2018-10-24 11:21     ` Vladimir Davydov
2018-10-23 17:26 ` [PATCH v2 2/4] wal: preallocate disk space before writing rows Vladimir Davydov
2018-10-23 18:33   ` Konstantin Osipov [this message]
2018-10-24  9:54     ` Vladimir Davydov
2018-10-23 17:26 ` [PATCH v2 3/4] wal: notify watchers about wal file removal Vladimir Davydov
2018-10-23 17:26 ` [PATCH v2 4/4] wal: delete old wal files when running out of disk space Vladimir Davydov
2018-10-23 18:46   ` Konstantin Osipov
2018-10-24  9:51     ` Vladimir Davydov
2018-10-24 16:53       ` Konstantin Osipov
2018-10-25  8:31         ` Vladimir Davydov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181023183322.GB29694@chai \
    --to=kostja@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --cc=vdavydov.dev@gmail.com \
    --subject='Re: [PATCH v2 2/4] wal: preallocate disk space before writing rows' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox