From: Olga Arkhangelskaia <krishtal.olja@gmail.com>
To: tarantool-patches@freelists.org
Cc: Olga Arkhangelskaia <krishtal.olja@gmail.com>
Subject: [tarantool-patches] [PATCH v4 2/2] box: adds replication sync after cfg. update
Date: Tue, 28 Aug 2018 14:43:28 +0300
Message-ID: <20180828114328.25702-2-krishtal.olja@gmail.com>
In-Reply-To: <20180828114328.25702-1-krishtal.olja@gmail.com>

When a replica reconnects to a replica set not for the first time (that
is, on a replication configuration update), it does not synchronize with
the replica set. As a result, it can give away outdated data.

Closes #3427
---
https://github.com/tarantool/tarantool/issues/3427
https://github.com/tarantool/tarantool/tree/OKriw/replication_no_sync-1.9

v1: https://www.freelists.org/post/tarantool-patches/PATCH-replication-adds-replication-sync-after-cfg-update
v2: https://www.freelists.org/post/tarantool-patches/PATCH-v2-replication-adds-replication-sync-after-cfg-update
v3: https://www.freelists.org/post/tarantool-patches/PATCH-v3-box-adds-replication-sync-after-cfg-update

Changes in v2:
- fixed test
- changed replicaset_sync

Changes in v3:
- now we raise the exception when sync is not successful.
- fixed test
- renamed test

Changes in v4:
- fixed test
- replication_sync_lag is made dynamic in a separate patch
- removed unnecessary error type
- moved say_crit to another place
- in case of sync error we rollback to prev.
  config

 src/box/box.cc                 |  8 ++++-
 src/box/replication.cc         |  8 ++---
 src/box/replication.h          |  6 ++--
 test/replication/sync.result   | 81 ++++++++++++++++++++++++++++++++++++++++++
 test/replication/sync.test.lua | 38 ++++++++++++++++++++
 5 files changed, 133 insertions(+), 8 deletions(-)
 create mode 100644 test/replication/sync.result
 create mode 100644 test/replication/sync.test.lua

diff --git a/src/box/box.cc b/src/box/box.cc
index be5077da8..aaae4219f 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -634,6 +634,11 @@ box_set_replication(void)
 	box_sync_replication(true);
 	/* Follow replica */
 	replicaset_follow();
+	/* Sync replica up to quorum */
+	if (!replicaset_sync()) {
+		tnt_raise(ClientError, ER_CFG, "replication",
+			  "failed to connect to one or more replicas");
+	}
 }
 
 void
@@ -1948,7 +1953,8 @@ box_cfg_xc(void)
 	is_box_configured = true;
 
 	if (!is_bootstrap_leader)
-		replicaset_sync();
+		if (!replicaset_sync())
+			say_crit("entering orphan mode");
 
 	say_info("ready to accept requests");
 }
diff --git a/src/box/replication.cc b/src/box/replication.cc
index 861ce34ea..9d3b1094c 100644
--- a/src/box/replication.cc
+++ b/src/box/replication.cc
@@ -661,13 +661,13 @@ replicaset_follow(void)
 	}
 }
 
-void
+bool
 replicaset_sync(void)
 {
 	int quorum = replicaset_quorum();
 
 	if (quorum == 0)
-		return;
+		return true;
 
 	say_verbose("synchronizing with %d replicas", quorum);
 
@@ -686,12 +686,12 @@ replicaset_sync(void)
 		 * Do not stall configuration, leave the instance
 		 * in 'orphan' state.
 		 */
-		say_crit("entering orphan mode");
-		return;
+		return false;
 	}
 
 	say_crit("replica set sync complete, quorum of %d "
 		 "replicas formed", quorum);
+	return true;
 }
 
 void
diff --git a/src/box/replication.h b/src/box/replication.h
index 06a2867b6..d4e6f7e3e 100644
--- a/src/box/replication.h
+++ b/src/box/replication.h
@@ -373,10 +373,10 @@ replicaset_follow(void);
 
 /**
  * Wait until a replication quorum is formed.
- * Return immediately if a quorum cannot be
- * formed because of errors.
+ * @return true in case of success.
+ * @return false if a quorum cannot be formed because of errors.
  */
-void
+bool
 replicaset_sync(void);
 
 /**
diff --git a/test/replication/sync.result b/test/replication/sync.result
new file mode 100644
index 000000000..f6ddb02e0
--- /dev/null
+++ b/test/replication/sync.result
@@ -0,0 +1,81 @@
+--
+-- gh-3427: no sync after configuration update
+--
+env = require('test_run')
+---
+...
+test_run = env.new()
+---
+...
+engine = test_run:get_cfg('engine')
+---
+...
+box.schema.user.grant('guest', 'replication')
+---
+...
+test_run:cmd("create server replica with rpl_master=default, script='replication/replica.lua'")
+---
+- true
+...
+test_run:cmd("start server replica")
+---
+- true
+...
+s = box.schema.space.create('test', {engine = engine})
+---
+...
+index = s:create_index('primary')
+---
+...
+-- change replica configuration
+test_run:cmd("switch replica")
+---
+- true
+...
+box.cfg{replication_sync_lag = 0.1}
+---
+...
+replication = box.cfg.replication
+---
+...
+box.cfg{replication={}}
+---
+...
+test_run:cmd("switch default")
+---
+- true
+...
+-- insert values on the master while replica is unconfigured
+a = 3000 box.begin() while a > 0 do a = a-1 box.space.test:insert{a,a} end box.commit()
+---
+...
+test_run:cmd("switch replica")
+---
+- true
+...
+box.cfg{replication = replication}
+---
+...
+box.space.test:count() == 3000
+---
+- true
+...
+test_run:cmd("switch default")
+---
+- true
+...
+-- cleanup
+test_run:cmd("stop server replica")
+---
+- true
+...
+test_run:cmd("cleanup server replica")
+---
+- true
+...
+box.space.test:drop()
+---
+...
+box.schema.user.revoke('guest', 'replication')
+---
+...
diff --git a/test/replication/sync.test.lua b/test/replication/sync.test.lua
new file mode 100644
index 000000000..4c2b55af8
--- /dev/null
+++ b/test/replication/sync.test.lua
@@ -0,0 +1,38 @@
+--
+-- gh-3427: no sync after configuration update
+--
+
+env = require('test_run')
+test_run = env.new()
+engine = test_run:get_cfg('engine')
+
+box.schema.user.grant('guest', 'replication')
+
+test_run:cmd("create server replica with rpl_master=default, script='replication/replica.lua'")
+test_run:cmd("start server replica")
+
+s = box.schema.space.create('test', {engine = engine})
+index = s:create_index('primary')
+
+-- change replica configuration
+test_run:cmd("switch replica")
+box.cfg{replication_sync_lag = 0.1}
+replication = box.cfg.replication
+box.cfg{replication={}}
+
+test_run:cmd("switch default")
+-- insert values on the master while replica is unconfigured
+a = 3000 box.begin() while a > 0 do a = a-1 box.space.test:insert{a,a} end box.commit()
+
+test_run:cmd("switch replica")
+box.cfg{replication = replication}
+
+box.space.test:count() == 3000
+
+test_run:cmd("switch default")
+
+-- cleanup
+test_run:cmd("stop server replica")
+test_run:cmd("cleanup server replica")
+box.space.test:drop()
+box.schema.user.revoke('guest', 'replication')
-- 
2.14.3 (Apple Git-98)
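
For readers skimming the patch, the control flow it introduces can be sketched in isolation. The following is only a minimal standalone model, not Tarantool code: `Instance`, `sync_with_quorum`, `configure_at_startup`, `set_replication` and the `reachable` flag are hypothetical names invented for illustration, and the rollback is modeled in the simplest possible way, which may differ from how the real configuration code restores the previous value. It shows the idea that the sync routine reports success as a bool, that a failed sync at startup only leaves the instance in orphan mode, and that a failed sync on reconfiguration raises an error and keeps the previous configuration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the instance state; not Tarantool's real types.
struct Instance {
	std::vector<std::string> replication; // currently applied upstream URIs
	bool orphan = false;                  // set when startup sync fails
};

// Stand-in for replicaset_sync(): returns true if a quorum was formed.
// Whether the upstreams are reachable is passed in by the caller here,
// purely so that the sketch stays self-contained.
static bool
sync_with_quorum(const std::vector<std::string> &upstreams, bool reachable)
{
	if (upstreams.empty())
		return true; // nothing to wait for, like quorum == 0
	return reachable;
}

// Startup path: a failed sync is not fatal, the instance becomes orphan.
static void
configure_at_startup(Instance &inst, bool reachable)
{
	if (!sync_with_quorum(inst.replication, reachable))
		inst.orphan = true;
}

// Reconfiguration path: a failed sync raises an error and the previous
// replication setting is restored before the error propagates.
static void
set_replication(Instance &inst, std::vector<std::string> next, bool reachable)
{
	std::vector<std::string> prev = inst.replication;
	inst.replication = std::move(next);
	if (!sync_with_quorum(inst.replication, reachable)) {
		inst.replication = std::move(prev);
		throw std::runtime_error(
			"failed to connect to one or more replicas");
	}
}
```

Returning a bool rather than logging inside the sync routine lets each caller decide what a failed sync means, which is the reason the say_crit call moves into box_cfg_xc in the diff above.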