From: Vladimir Davydov <vdavydov.dev@gmail.com> To: Georgy Kirichenko <georgy@tarantool.org> Cc: tarantool-patches@freelists.org Subject: Re: [tarantool-patches] [PATCH 1/3] Do not promote wal vclock for failed writes Date: Wed, 6 Feb 2019 16:56:56 +0300 [thread overview] Message-ID: <20190206135656.rebkdloo6plr7ytx@esperanza> (raw) In-Reply-To: <10dda7d827c22bb8af33de31646f5ad34576f760.1549441084.git.georgy@tarantool.org> On Wed, Feb 06, 2019 at 11:29:57AM +0300, Georgy Kirichenko wrote: > Increase replica lsn only if row was successfully written to disk. This > prevents wal from lsn gaps in case of IO errors. Why are gaps that bad? Please say a few words of explanation in the commit message. > > Obsoletes xlog/panic_on_lsn_gap.test. > > Needs for #2283 Needed for > --- > src/box/wal.c | 19 +- > test/box/errinj.result | 23 ++ > test/box/errinj.test.lua | 8 + > test/xlog/errinj.result | 1 - > test/xlog/panic_on_lsn_gap.result | 377 ---------------------------- > test/xlog/panic_on_lsn_gap.test.lua | 136 ---------- > 6 files changed, 44 insertions(+), 520 deletions(-) > delete mode 100644 test/xlog/panic_on_lsn_gap.result > delete mode 100644 test/xlog/panic_on_lsn_gap.test.lua > > diff --git a/src/box/wal.c b/src/box/wal.c > index 3b50d3629..a55b544aa 100644 > --- a/src/box/wal.c > +++ b/src/box/wal.c > @@ -901,16 +901,16 @@ wal_writer_begin_rollback(struct wal_writer *writer) > } > > static void > -wal_assign_lsn(struct wal_writer *writer, struct xrow_header **row, > +wal_assign_lsn(struct vclock *vclock, struct xrow_header **row, > struct xrow_header **end) > { > /** Assign LSN to all local rows. */ > for ( ; row < end; row++) { > if ((*row)->replica_id == 0) { > - (*row)->lsn = vclock_inc(&writer->vclock, instance_id); > + (*row)->lsn = vclock_inc(vclock, instance_id); > (*row)->replica_id = instance_id; > } else { > - vclock_follow_xrow(&writer->vclock, *row); > + vclock_follow_xrow(vclock, *row); > } > } > } > @@ -922,6 +922,11 @@ wal_write_to_disk(struct cmsg *msg) > struct wal_msg *wal_msg = (struct wal_msg *) msg; > struct error *error; > > + /* Local vclock copy. */ Bad comment. Can you elaborate why you need it, please? E.g. We don't want to promote the WAL vclock if write fails, because that would result in LSN gaps, which are difficult to handle in relay. So we promote a local copy instead and sync it only if entries have been successfully flushed to disk. Or something like that. > + struct vclock vclock; > + vclock_create(&vclock); > + vclock_copy(&vclock, &writer->vclock); > + > struct errinj *inj = errinj(ERRINJ_WAL_DELAY, ERRINJ_BOOL); > while (inj != NULL && inj->bparam) > usleep(10); > @@ -974,14 +979,15 @@ wal_write_to_disk(struct cmsg *msg) > struct journal_entry *entry; > struct stailq_entry *last_committed = NULL; > stailq_foreach_entry(entry, &wal_msg->commit, fifo) { > - wal_assign_lsn(writer, entry->rows, entry->rows + entry->n_rows); > - entry->res = vclock_sum(&writer->vclock); > + wal_assign_lsn(&vclock, entry->rows, entry->rows + entry->n_rows); > + entry->res = vclock_sum(&vclock); > rc = xlog_write_entry(l, entry); > if (rc < 0) > goto done; > if (rc > 0) { > writer->checkpoint_wal_size += rc; > last_committed = &entry->fifo; > + vclock_copy(&writer->vclock, &vclock); > } > /* rc == 0: the write is buffered in xlog_tx */ > } > @@ -991,6 +997,7 @@ wal_write_to_disk(struct cmsg *msg) > > writer->checkpoint_wal_size += rc; > last_committed = stailq_last(&wal_msg->commit); > + vclock_copy(&writer->vclock, &vclock); > > /* > * Notify TX if the checkpoint threshold has been exceeded. > @@ -1185,7 +1192,7 @@ wal_write_in_wal_mode_none(struct journal *journal, > struct journal_entry *entry) > { > struct wal_writer *writer = (struct wal_writer *) journal; > - wal_assign_lsn(writer, entry->rows, entry->rows + entry->n_rows); > + wal_assign_lsn(&writer->vclock, entry->rows, entry->rows + entry->n_rows); > int64_t old_lsn = vclock_get(&replicaset.vclock, instance_id); > int64_t new_lsn = vclock_get(&writer->vclock, instance_id); > if (new_lsn > old_lsn) { > diff --git a/test/box/errinj.result b/test/box/errinj.result > index 12303670e..1d9a16d8d 100644 > --- a/test/box/errinj.result > +++ b/test/box/errinj.result > @@ -141,6 +141,15 @@ errinj.set("ERRINJ_TESTING", false) > --- > - ok > ... > +env = require('test_run') > +--- > +... > +test_run = env.new() > +--- > +... Why? You don't seem to need it. > +lsn1 = box.info.vclock[box.info.id] > +--- > +... > -- Check how well we handle a failed log write > errinj.set("ERRINJ_WAL_IO", true) > --- > @@ -161,6 +170,11 @@ space:insert{1} > --- > - [1] > ... > +-- Check vclock was promoted only one time > +box.info.vclock[box.info.id] == lsn1 + 1 > +--- > +- true > +... > errinj.set("ERRINJ_WAL_IO", true) > --- > - ok > @@ -180,6 +194,15 @@ errinj.set("ERRINJ_WAL_IO", false) > --- > - ok > ... > +space:update(1, {{'=', 2, 2}}) > +--- > +- [1, 2] > +... > +-- Check vclock was promoted only two times > +box.info.vclock[box.info.id] == lsn1 + 2 > +--- > +- true > +... This test will pass with and without your patch, because we already don't promote box.info.lsn on failed WAL write. I think you need to restart the server to check this properly. Other than that, the patch is fine by me.
next prev parent reply other threads:[~2019-02-06 13:56 UTC|newest] Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-02-06 8:29 [tarantool-patches] [PATCH 0/3] Promote vclock only for successful writes Georgy Kirichenko 2019-02-06 8:29 ` [tarantool-patches] [PATCH 1/3] Do not promote wal vclock for failed writes Georgy Kirichenko 2019-02-06 13:56 ` Vladimir Davydov [this message] 2019-02-06 8:29 ` [tarantool-patches] [PATCH 2/3] Enforce applier out of order protection Georgy Kirichenko 2019-02-06 14:13 ` Vladimir Davydov 2019-02-06 8:29 ` [tarantool-patches] [PATCH 3/3] Promote replicaset.vclock only after wal Georgy Kirichenko 2019-02-06 14:45 ` Vladimir Davydov 2019-02-06 13:50 ` [tarantool-patches] [PATCH 0/3] Promote vclock only for successful writes Vladimir Davydov
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20190206135656.rebkdloo6plr7ytx@esperanza \ --to=vdavydov.dev@gmail.com \ --cc=georgy@tarantool.org \ --cc=tarantool-patches@freelists.org \ --subject='Re: [tarantool-patches] [PATCH 1/3] Do not promote wal vclock for failed writes' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox