Tarantool development patches archive
 help / color / mirror / Atom feed
From: Vladimir Davydov <vdavydov.dev@gmail.com>
To: Olga Arkhangelskaia <krishtal.olja@gmail.com>
Cc: tarantool-patches@freelists.org
Subject: Re: [tarantool-patches] [PATCH v5 3/3] box: adds replication sync after cfg. update
Date: Thu, 30 Aug 2018 13:11:36 +0300	[thread overview]
Message-ID: <20180830101136.tegvxuxd7vtqi72e@esperanza> (raw)
In-Reply-To: <20180829185642.49479-3-krishtal.olja@gmail.com>

On Wed, Aug 29, 2018 at 09:56:42PM +0300, Olga Arkhangelskaia wrote:
> When replica reconnects to replica set not for the first time, we
> suffer from absence of synchronization. Such behavior leads to giving
> away outdated data.
> 
> @TarantoolBot document
> Title: Orphan status after configuration update or initial bootstrap.
> In case of initial bootstrap or after configuration update we can get
> an orphan status in two cases. If we synced up with number of replicas
> that is smaller than quorum or if we failed to sync up during the time
> specified in replication_sync_lag_timeout.
> 
> Closes #3427

'Closes ####' must be before TarantoolBot request.

> diff --git a/src/box/box.cc b/src/box/box.cc
> index 0f8364ebc..6bae9ea78 100644
> --- a/src/box/box.cc
> +++ b/src/box/box.cc
> @@ -646,6 +646,10 @@ box_set_replication(void)
>  	box_sync_replication(true);
>  	/* Follow replica */
>  	replicaset_follow();
> +	/* Sync replica up to quorum */
> +	if (!replicaset_sync()) {
> +		say_crit("entering orphan mode");
> +	}

I don't see where you enter the orphan state.

>  }
>  
>  void
> @@ -1967,7 +1971,8 @@ box_cfg_xc(void)
>  	is_box_configured = true;
>  
>  	if (!is_bootstrap_leader)
> -		replicaset_sync();
> +		if (!replicaset_sync())
> +			say_crit("entering orphan mode");

Apparently, you don't need to return true/false from replicaset_sync
anymore as box_set_replication and box_cfg_xc do the same in case of
synchronization failure.

> diff --git a/test/replication/sync.test.lua b/test/replication/sync.test.lua
> new file mode 100644
> index 000000000..e63edd0d3
> --- /dev/null
> +++ b/test/replication/sync.test.lua
> @@ -0,0 +1,57 @@
> +--
> +-- gh-3427: no sync after configuration update
> +--
> +
> +--
> +-- successful sync
> +--
> +
> +env = require('test_run')
> +test_run = env.new()
> +engine = test_run:get_cfg('engine')
> +
> +box.schema.user.grant('guest', 'replication')
> +
> +test_run:cmd("create server replica with rpl_master=default, script='replication/replica.lua'")
> +test_run:cmd("start server replica")
> +
> +s = box.schema.space.create('test', {engine = engine})
> +index = s:create_index('primary')
> +
> +-- change replica configuration
> +test_run:cmd("switch replica")
> +box.cfg{replication_sync_lag = 0.1}

The test passes without this line, that is you don't check that
replication_sync_lag actually works. Please add a test case for this.

> +replication = box.cfg.replication
> +box.cfg{replication={}}
> +
> +test_run:cmd("switch default")
> +-- insert values on the master while replica is unconfigured
> +a = 100 box.begin() while a > 0 do a = a-1 box.space.test:insert{a,a} end box.commit()

for i = 1, 100 do ... end

looks much more readable.

> +
> +test_run:cmd("switch replica")
> +box.cfg{replication = replication}
> +
> +box.space.test:count() == 100
> +
> +
> +--
> +-- unsuccessful sync
> +--
> +box.cfg{replication={}}
> +box.cfg{replication_sync_lag_timeout = 0.001}
> +
> +test_run:cmd("switch default")
> +-- insert values on the master while replica is unconfigured
> +a = 200 box.begin() while a > 100 do a = a-1 box.space.test:insert{a,a} end box.commit()
> +
> +test_run:cmd("switch replica")
> +box.cfg{replication = replication}
> +box.space.test:count() < 200

You should check that tarantool enters the orphan state in this case and
leaves it once it is synchronized.

> +
> +test_run:cmd("switch default")
> +
> +-- cleanup
> +test_run:cmd("stop server replica")
> +test_run:cmd("cleanup server replica")
> +box.space.test:drop()
> +box.schema.user.revoke('guest', 'replication')

  reply	other threads:[~2018-08-30 10:11 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-29 18:56 [tarantool-patches] [PATCH v2 1/3] box: make replication_sync_lag option dynamic Olga Arkhangelskaia
2018-08-29 18:56 ` [tarantool-patches] [PATCH 2/3] box: add replication_sync_lag_timeout Olga Arkhangelskaia
2018-08-30 10:02   ` Vladimir Davydov
2018-08-29 18:56 ` [tarantool-patches] [PATCH v5 3/3] box: adds replication sync after cfg. update Olga Arkhangelskaia
2018-08-30 10:11   ` Vladimir Davydov [this message]
2018-08-30  9:48 ` [tarantool-patches] [PATCH v2 1/3] box: make replication_sync_lag option dynamic Vladimir Davydov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180830101136.tegvxuxd7vtqi72e@esperanza \
    --to=vdavydov.dev@gmail.com \
    --cc=krishtal.olja@gmail.com \
    --cc=tarantool-patches@freelists.org \
    --subject='Re: [tarantool-patches] [PATCH v5 3/3] box: adds replication sync after cfg. update' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox