From: Konstantin Osipov <kostja@tarantool.org> To: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: tarantool-patches@freelists.org Subject: Re: [PATCH v2 2/4] wal: preallocate disk space before writing rows Date: Tue, 23 Oct 2018 21:33:22 +0300 [thread overview] Message-ID: <20181023183322.GB29694@chai> (raw) In-Reply-To: <484c921eb04a0a94563dcccf058bb5e8f96d3a49.1540314925.git.vdavydov.dev@gmail.com> * Vladimir Davydov <vdavydov.dev@gmail.com> [18/10/23 20:30]: > This function introduces a new xlog method xlog_fallocate() that makes > sure that the requested amount of disk space is available at the current > write position. It does that with posix_fallocate(). The new method is > called before writing anything to WAL, see wal_fallocate(). In order not > to invoke the system call too often, wal_fallocate() allocates disk > space in big chunks (1 MB). > > The reason why I'm doing this is that I want to have a single and > clearly defined point in the code to handle ENOSPC errors, where > I could delete old WALs and retry (this is what #3397 is about). > > Needed for #3397 Please consider comment about member names. The patch is OK to push. > index 1d64a7bd..fc495547 100644 > --- a/src/box/journal.h > +++ b/src/box/journal.h > @@ -59,6 +59,10 @@ struct journal_entry { > */ > struct fiber *fiber; > /** > + * Max size the rows are going to take when encoded. > + */ > + size_t len; Since this is not an exact value, I suggest to call it est_len. max_len is also OK but worse than est_len. Any name which would indicate that it's not an exact len is fine: footprint xlog_max_len xlog_len > + /** > * The number of rows in the request. > */ > int n_rows; > diff --git a/src/box/txn.c b/src/box/txn.c > index 17d97d76..9b465561 100644 > --- a/src/box/txn.c > +++ b/src/box/txn.c > @@ -272,6 +272,7 @@ txn_write_to_wal(struct txn *txn) > if (stmt->row == NULL) > continue; /* A read (e.g. select) request */ > *row++ = stmt->row; > + req->len += xrow_len_max(stmt->row); req->est_len += xrow_est_len(); req->max_len += xrow_max_len(); > > struct wal_msg { > struct cmsg base; > + /** > + * Max size the committed requests are going to take when > + * written to disk. > + */ > + size_t len; Same here. > + /* > + * The actual write size can be greater than the sum size > + * of encoded rows (compression, fixheaders). Double the > + * given length to get a rough upper bound estimate. > + */ > + len *= 2; > + > +ssize_t > +xlog_fallocate(struct xlog *log, size_t len) > +{ > +#ifdef HAVE_POSIX_FALLOCATE > + int rc = posix_fallocate(log->fd, log->offset + log->alloc_len, len); > + if (rc != 0) { > + errno = rc; > + diag_set(SystemError, "%s: can't allocate disk space", > + log->filename); > + return -1; > + } > + log->alloc_len += len; I don't like this name either, but don't know of a better one. log->reserved perhaps? or simply log->allocated? > + return 0; > +#else > + (void)log; > + (void)len; > + return 0; > +#endif /* HAVE_POSIX_FALLOCATE */ > +} > + > + if (log->alloc_len > (size_t)written) > + log->alloc_len -= written; Imagine the name is allocated: if (log->allocated > written) log->allocated -= written; Seems to be more clear? > + else > + log->alloc_len = 0; > log->offset += written; > log->rows += log->tx_rows; > log->tx_rows = 0; > @@ -1378,6 +1403,17 @@ xlog_write_eof(struct xlog *l) > diag_set(ClientError, ER_INJECTION, "xlog write injection"); > return -1; > }); > + > + /* > + * Free disk space preallocated with xlog_fallocate(). > + * Don't write the eof marker if this fails, otherwise > + * we'll get "data after eof marker" error on recovery. I would simply remove this check - I don't see how it's useful as long as we never append to an existing xlog. > + */ > + if (l->alloc_len > 0 && ftruncate(l->fd, l->offset) < 0) { > + diag_set(SystemError, "ftruncate() failed"); > + return -1; > + } > + > if (fio_writen(l->fd, &eof_marker, sizeof(eof_marker)) < 0) { > diag_set(SystemError, "write() failed"); > return -1; > @@ -1793,6 +1829,15 @@ xlog_cursor_next_tx(struct xlog_cursor *i) > return -1; > if (rc > 0) > return 1; > + if (load_u32(i->rbuf.rpos) == 0) { > + /* > + * Space preallocated with xlog_fallocate(). > + * Treat as eof and clear the buffer. > + */ > + i->read_offset -= ibuf_used(&i->rbuf); > + ibuf_reset(&i->rbuf); Especially since you added this. > + return 1; > + } -- Konstantin Osipov, Moscow, Russia, +7 903 626 22 32 http://tarantool.io - www.twitter.com/kostja_osipov
next prev parent reply other threads:[~2018-10-23 18:33 UTC|newest] Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top 2018-10-23 17:26 [PATCH v2 0/4] Delete old WAL files if running out of disk space Vladimir Davydov 2018-10-23 17:26 ` [PATCH v2 1/4] xlog: turn use_coio argument of xdir_collect_garbage to flags Vladimir Davydov 2018-10-23 18:17 ` Konstantin Osipov 2018-10-24 11:21 ` Vladimir Davydov 2018-10-23 17:26 ` [PATCH v2 2/4] wal: preallocate disk space before writing rows Vladimir Davydov 2018-10-23 18:33 ` Konstantin Osipov [this message] 2018-10-24 9:54 ` Vladimir Davydov 2018-10-23 17:26 ` [PATCH v2 3/4] wal: notify watchers about wal file removal Vladimir Davydov 2018-10-23 17:26 ` [PATCH v2 4/4] wal: delete old wal files when running out of disk space Vladimir Davydov 2018-10-23 18:46 ` Konstantin Osipov 2018-10-24 9:51 ` Vladimir Davydov 2018-10-24 16:53 ` Konstantin Osipov 2018-10-25 8:31 ` Vladimir Davydov
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20181023183322.GB29694@chai \ --to=kostja@tarantool.org \ --cc=tarantool-patches@freelists.org \ --cc=vdavydov.dev@gmail.com \ --subject='Re: [PATCH v2 2/4] wal: preallocate disk space before writing rows' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox