[PATCH v2 2/4] wal: preallocate disk space before writing rows
Konstantin Osipov
kostja at tarantool.org
Tue Oct 23 21:33:22 MSK 2018
* Vladimir Davydov <vdavydov.dev at gmail.com> [18/10/23 20:30]:
> This function introduces a new xlog method xlog_fallocate() that makes
> sure that the requested amount of disk space is available at the current
> write position. It does that with posix_fallocate(). The new method is
> called before writing anything to WAL, see wal_fallocate(). In order not
> to invoke the system call too often, wal_fallocate() allocates disk
> space in big chunks (1 MB).
>
> The reason why I'm doing this is that I want to have a single and
> clearly defined point in the code to handle ENOSPC errors, where
> I could delete old WALs and retry (this is what #3397 is about).
>
> Needed for #3397
Please consider comment about member names.
The patch is OK to push.
> index 1d64a7bd..fc495547 100644
> --- a/src/box/journal.h
> +++ b/src/box/journal.h
> @@ -59,6 +59,10 @@ struct journal_entry {
> */
> struct fiber *fiber;
> /**
> + * Max size the rows are going to take when encoded.
> + */
> + size_t len;
Since this is not an exact value, I suggest to call it est_len.
max_len is also OK but worse than est_len.
Any name which would indicate that it's not an exact len is fine:
footprint
xlog_max_len
xlog_len
> + /**
> * The number of rows in the request.
> */
> int n_rows;
> diff --git a/src/box/txn.c b/src/box/txn.c
> index 17d97d76..9b465561 100644
> --- a/src/box/txn.c
> +++ b/src/box/txn.c
> @@ -272,6 +272,7 @@ txn_write_to_wal(struct txn *txn)
> if (stmt->row == NULL)
> continue; /* A read (e.g. select) request */
> *row++ = stmt->row;
> + req->len += xrow_len_max(stmt->row);
req->est_len += xrow_est_len();
req->max_len += xrow_max_len();
>
> struct wal_msg {
> struct cmsg base;
> + /**
> + * Max size the committed requests are going to take when
> + * written to disk.
> + */
> + size_t len;
Same here.
> + /*
> + * The actual write size can be greater than the sum size
> + * of encoded rows (compression, fixheaders). Double the
> + * given length to get a rough upper bound estimate.
> + */
> + len *= 2;
> +
> +ssize_t
> +xlog_fallocate(struct xlog *log, size_t len)
> +{
> +#ifdef HAVE_POSIX_FALLOCATE
> + int rc = posix_fallocate(log->fd, log->offset + log->alloc_len, len);
> + if (rc != 0) {
> + errno = rc;
> + diag_set(SystemError, "%s: can't allocate disk space",
> + log->filename);
> + return -1;
> + }
> + log->alloc_len += len;
I don't like this name either, but don't know of a better one.
log->reserved perhaps? or simply log->allocated?
> + return 0;
> +#else
> + (void)log;
> + (void)len;
> + return 0;
> +#endif /* HAVE_POSIX_FALLOCATE */
> +}
> +
> + if (log->alloc_len > (size_t)written)
> + log->alloc_len -= written;
Imagine the name is allocated:
if (log->allocated > written)
log->allocated -= written;
Seems to be more clear?
> + else
> + log->alloc_len = 0;
> log->offset += written;
> log->rows += log->tx_rows;
> log->tx_rows = 0;
> @@ -1378,6 +1403,17 @@ xlog_write_eof(struct xlog *l)
> diag_set(ClientError, ER_INJECTION, "xlog write injection");
> return -1;
> });
> +
> + /*
> + * Free disk space preallocated with xlog_fallocate().
> + * Don't write the eof marker if this fails, otherwise
> + * we'll get "data after eof marker" error on recovery.
I would simply remove this check - I don't see how it's useful as
long as we never append to an existing xlog.
> + */
> + if (l->alloc_len > 0 && ftruncate(l->fd, l->offset) < 0) {
> + diag_set(SystemError, "ftruncate() failed");
> + return -1;
> + }
> +
> if (fio_writen(l->fd, &eof_marker, sizeof(eof_marker)) < 0) {
> diag_set(SystemError, "write() failed");
> return -1;
> @@ -1793,6 +1829,15 @@ xlog_cursor_next_tx(struct xlog_cursor *i)
> return -1;
> if (rc > 0)
> return 1;
> + if (load_u32(i->rbuf.rpos) == 0) {
> + /*
> + * Space preallocated with xlog_fallocate().
> + * Treat as eof and clear the buffer.
> + */
> + i->read_offset -= ibuf_used(&i->rbuf);
> + ibuf_reset(&i->rbuf);
Especially since you added this.
> + return 1;
> + }
--
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov
More information about the Tarantool-patches
mailing list