From: Vladimir Davydov <vdavydov.dev@gmail.com>
To: Konstantin Belyavskiy <k.belyavskiy@tarantool.org>
Cc: georgy <georgy@tarantool.org>, tarantool-patches@freelists.org
Subject: Re: Fwd: Re[2]: [patches] [PATCH] [replication] [recovery] recover missing data
Date: Wed, 28 Mar 2018 12:25:18 +0300
Message-ID: <20180328092518.ladp4mvyiufc6uto@esperanza>
In-Reply-To: <1522077876.462276429@f459.i.mail.ru>

On Mon, Mar 26, 2018 at 06:24:36PM +0300, Konstantin Belyavskiy wrote:
> Please check most recent version.
> branch: gh-3210-recover-missing-local-data-master-master

Please don't send or submit a patch for 1.6 until we commit it to the
trunk.

I failed to find the patch in the mailing list. Pasting it here for
review.

> From 391448a496fd769ff6724cadab4d333d37a9088e Mon Sep 17 00:00:00 2001
> From: Konstantin Belyavskiy <k.belyavskiy@tarantool.org>
> Date: Tue, 13 Mar 2018 17:51:52 +0300
> Subject: [PATCH] [replication] [recovery] recover missing data
>
> Recover missing local data from replica.
> In case of sudden power-loss, if data was not written to WAL but
> already sent to remote replica, local can't recover properly and
> we have different datasets.
> Fix it by using remote replica's data and LSN comparison.
> Based on @GeorgyKirichenko proposal and @locker race free check.
> Switch off replication/catch.test.lua

Why? If the test is broken, please fix it. If you find the test
pointless, delete it with a proper explanation.

>
> Closes #3210
>
> diff --git a/src/box/applier.cc b/src/box/applier.cc
> index 6bfe5a99..5f0b3069 100644
> --- a/src/box/applier.cc
> +++ b/src/box/applier.cc
> @@ -453,7 +453,8 @@ applier_subscribe(struct applier *applier)
> 	}
>
> 	if (applier->state == APPLIER_SYNC &&
> -	    applier->lag <= replication_sync_lag) {
> +	    applier->lag <= replication_sync_lag &&
> +	    vclock_compare(&applier->vclock, &replicaset.vclock) <= 0) {

First, you use a wrong vclock - applier->vclock is the vclock at
connect (the name is rather misleading though, true, we should probably
rename it to remote_vclock_at_connect). You should use the vclock
received in the SUBSCRIBE request.

Second, this new condition could use a comment.

Third, this is a worthwhile change as is, so I think it should be
submitted in a separate patch.
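Roughly what I mean (untested sketch; remote_vclock_at_subscribe is a
name I'm making up here for the vclock decoded from the master's reply
to SUBSCRIBE):

	/*
	 * Stay in SYNC until we have fetched everything the
	 * master had at the time of SUBSCRIBE. This keeps the
	 * instance read-only until all rows it may have lost
	 * locally are re-fetched.
	 */
	if (applier->state == APPLIER_SYNC &&
	    applier->lag <= replication_sync_lag &&
	    vclock_compare(&remote_vclock_at_subscribe,
			   &replicaset.vclock) <= 0) {
		/* Applier is synced, switch to "follow". */
		applier_set_state(applier, APPLIER_FOLLOW);
	}

Whether <= 0 is the right treatment of incomparable vclocks here is
worth double-checking, too.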
> 		/* Applier is synced, switch to "follow". */
> 		applier_set_state(applier, APPLIER_FOLLOW);
> 	}
> diff --git a/src/box/relay.cc b/src/box/relay.cc
> index 2bd05ad5..344a8e01 100644
> --- a/src/box/relay.cc
> +++ b/src/box/relay.cc
> @@ -110,6 +110,11 @@ struct relay {
> 	struct vclock recv_vclock;
> 	/** Replicatoin slave version. */
> 	uint32_t version_id;
> +	/**
> +	 * Local master's LSN at the moment of subscribe, used to check
> +	 * dataset on the other side and send missing data rows if any.
> +	 */
> +	int64_t masters_lsn_at_subscribe;

Why did you change the member name? I only asked to update the comment.
'local_lsn_at_subscribe' is not perfect, but still a better name IMO.

Actually, I'm thinking about storing a whole vclock here instead of
just one LSN - that would help avoid confusion:

	/** Local vclock at the time of subscribe. */
	struct vclock local_vclock_at_subscribe;

>
> 	/** Relay endpoint */
> 	struct cbus_endpoint endpoint;
> diff --git a/src/box/wal.cc b/src/box/wal.cc
> index 4576cfe0..4a43775d 100644
> --- a/src/box/wal.cc
> +++ b/src/box/wal.cc
> @@ -768,8 +768,15 @@ wal_write(struct journal *journal, struct journal_entry *entry)
> 		/*
> 		 * Find last row from local instance id
> 		 * and promote vclock.
> +		 * In master-master configuration, during sudden
> +		 * power-loss if data was not written to WAL but
> +		 * already sent to others they will send back.
> +		 * In this case we should update only local
> +		 * vclock but not the replicaset one. Could be
> +		 * checked by simple lsn comparison.

I still don't understand this comment.

> 		 */
> -		if ((*last)->replica_id == instance_id) {
> +		if ((*last)->replica_id == instance_id &&
> +		    replicaset.vclock.lsn[instance_id] < (*last)->lsn) {

Use vclock_get() for this.

Also, we agreed to move this to applier AFAIR.
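I.e. something along these lines (untested, same logic, just without
poking into vclock internals):

	/* Promote the local component only if the row is ahead of it. */
	if ((*last)->replica_id == instance_id &&
	    vclock_get(&replicaset.vclock, instance_id) < (*last)->lsn) {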
> 			vclock_follow(&replicaset.vclock, instance_id,
> 				      (*last)->lsn);
> 			break;
> diff --git a/test/replication/on_replace.lua b/test/replication/on_replace.lua
> index 7e49efe1..c5855892 100644
> --- a/test/replication/on_replace.lua
> +++ b/test/replication/on_replace.lua
> @@ -22,13 +22,10 @@ box.cfg({
>     };
> })
>
> -env = require('test_run')
> -test_run = env.new()
> -engine = test_run:get_cfg('engine')
> -
>  box.once("bootstrap", function()
> +    local test_run = require('test_run').new()
>      box.schema.user.create(USER, { password = PASSWORD })
>      box.schema.user.grant(USER, 'replication')
> -    box.schema.space.create('test', {engine = engine})
> +    box.schema.space.create('test', {engine = test_run:get_cfg('engine')})
>      box.space.test:create_index('primary')
>  end)
> diff --git a/test/replication/recover_missing.test.lua b/test/replication/recover_missing.test.lua
> new file mode 100644
> index 00000000..d5b0e0ad
> --- /dev/null
> +++ b/test/replication/recover_missing.test.lua
> @@ -0,0 +1,42 @@
> +env = require('test_run')
> +test_run = env.new()
> +
> +SERVERS = { 'on_replace1', 'on_replace2' }

Please use autobootstrap.lua - I want to see how it works with multiple
masters.

> +-- Start servers
> +test_run:create_cluster(SERVERS)
> +-- Wait for full mesh
> +test_run:wait_fullmesh(SERVERS)
> +
> +test_run:cmd("switch on_replace1")
> +for i = 0, 9 do box.space.test:insert{i, 'test' .. i} end

Nit: please start counting from 1, as is common in Lua.

> +box.space.test:count()
> +
> +test_run:cmd('switch default')
> +vclock1 = test_run:get_vclock('on_replace1')
> +vclock2 = test_run:wait_cluster_vclock(SERVERS, vclock1)
> +
> +test_run:cmd("switch on_replace2")
> +box.space.test:count()
> +box.error.injection.set("ERRINJ_RELAY_TIMEOUT", 0.1)

Decrease the timeout to speed up test execution.

> +test_run:cmd("stop server on_replace1")
> +fio = require('fio')
> +-- This test checks ability to recover missing local data
> +-- from remote replica. See #3210.
> +-- Delete data on first master and test that after restart,
> +-- due to difference in vclock it will be able to recover
> +-- all missing data from replica.
> +-- Also check that there is no concurrency, i.e. master is
> +-- in 'read-only' mode unless it receives all data.
> +fio.unlink(fio.pathjoin(fio.abspath("."), string.format('on_replace1/%020d.xlog', 8)))
> +test_run:cmd("start server on_replace1")
> +
> +test_run:cmd("switch on_replace1")
> +for i = 10, 19 do box.space.test:insert{i, 'test' .. i} end
> +fiber = require('fiber')
> +fiber.sleep(1)
> +box.space.test:count()

Use 'select' here to make sure the data received are correct.

> +
> +-- Cleanup.
> +test_run:cmd('switch default')
> +test_run:drop_cluster(SERVERS)
> +

Nit: extra line at the end of the file.
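Putting the test comments together, I'd expect the beginning of the
test to look roughly like this (untested sketch; I'm assuming the
autobootstrap* instance files create the 'test' space the same way
on_replace.lua does):

	env = require('test_run')
	test_run = env.new()

	SERVERS = { 'autobootstrap1', 'autobootstrap2', 'autobootstrap3' }
	-- Start servers and wait for full mesh.
	test_run:create_cluster(SERVERS)
	test_run:wait_fullmesh(SERVERS)

	test_run:cmd("switch autobootstrap1")
	-- Count from 1, as is common in Lua.
	for i = 1, 10 do box.space.test:insert{i, 'test' .. i} end
	-- 'select' instead of 'count', to check the data itself,
	-- not just the number of rows.
	box.space.test:select()

	test_run:cmd("switch autobootstrap2")
	-- A smaller timeout keeps the test fast.
	box.error.injection.set("ERRINJ_RELAY_TIMEOUT", 0.01)

The rest of the scenario (stop the server, unlink the xlog, restart,
insert more rows) can stay as it is, modulo the renames.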