* [Tarantool-patches] [PATCH] replication: fix rebootstrap in case the instance is listed in box.cfg.replication
@ 2020-02-28 17:01 Serge Petrenko
  2020-02-28 23:43 ` Vladislav Shpilevoy
  2020-03-03  8:25 ` Kirill Yukhin
  0 siblings, 2 replies; 9+ messages in thread
From: Serge Petrenko @ 2020-02-28 17:01 UTC (permalink / raw)
  To: v.shpilevoy, kostja.osipov, kirichenkoga; +Cc: tarantool-patches
When checking wheter rejoin is needed, replica loops through all the
instances in box.cfg.replication, which makes it believe that there is a
master holding files, needed by it, since it accounts itself just like
all other instances.
So make replica skip itself when finding an instance which holds files
needed by it, and determining whether rebootstrap is needed.
We already have a working test for the issue, it missed the issue due to
replica.lua replication settings. Fix replica.lua to optionally include
itself in box.cfg.replication, so that the corresponding test works
correctly.
Closes #4759
---
https://github.com/tarantool/tarantool/issues/4759
https://github.com/tarantool/tarantool/tree/sp/gh-4759-rebootstrap-fix
@ChangeLog
 - fix rebootstrap procedure not working in case replica itself
   is listed in `box.cfg.replication`
 src/box/replication.cc                   | 13 ++++++++++++-
 test/replication/replica.lua             | 11 ++++++++++-
 test/replication/replica_rejoin.result   | 12 ++++++------
 test/replication/replica_rejoin.test.lua | 12 ++++++------
 4 files changed, 34 insertions(+), 14 deletions(-)
diff --git a/src/box/replication.cc b/src/box/replication.cc
index e7bfa22ab..01edc0fb2 100644
--- a/src/box/replication.cc
+++ b/src/box/replication.cc
@@ -768,8 +768,19 @@ replicaset_needs_rejoin(struct replica **master)
 	struct replica *leader = NULL;
 	replicaset_foreach(replica) {
 		struct applier *applier = replica->applier;
-		if (applier == NULL)
+		/*
+		 * The function is called right after
+		 * box_sync_replication(), which in turn calls
+		 * replicaset_connect(), which ensures that
+		 * appliers are either stopped (APPLIER OFF) or
+		 * connected.
+		 * Also ignore self, as self applier might not
+		 * have disconnected yet.
+		 */
+		if (applier == NULL || applier->state == APPLIER_OFF ||
+		    tt_uuid_is_equal(&replica->uuid, &INSTANCE_UUID))
 			continue;
+		assert(applier->state == APPLIER_CONNECTED);
 
 		const struct ballot *ballot = &applier->ballot;
 		if (vclock_compare(&ballot->gc_vclock,
diff --git a/test/replication/replica.lua b/test/replication/replica.lua
index 20ac064e1..1cd60e062 100644
--- a/test/replication/replica.lua
+++ b/test/replication/replica.lua
@@ -1,8 +1,17 @@
 #!/usr/bin/env tarantool
 
+repl_include_self = arg[1] and arg[1] == 'true' or false
+repl_list = nil
+
+if repl_include_self then
+    repl_list = {os.getenv("MASTER"), os.getenv("LISTEN")}
+else
+    repl_list = os.getenv("MASTER")
+end
+
 box.cfg({
     listen              = os.getenv("LISTEN"),
-    replication         = os.getenv("MASTER"),
+    replication         = repl_list,
     memtx_memory        = 107374182,
     replication_timeout = 0.1,
     replication_connect_timeout = 0.5,
diff --git a/test/replication/replica_rejoin.result b/test/replication/replica_rejoin.result
index f71292da1..b8ed79f14 100644
--- a/test/replication/replica_rejoin.result
+++ b/test/replication/replica_rejoin.result
@@ -37,7 +37,7 @@ test_run:cmd("create server replica with rpl_master=default, script='replication
 ---
 - true
 ...
-test_run:cmd("start server replica")
+test_run:cmd("start server replica with args='true'")
 ---
 - true
 ...
@@ -111,7 +111,7 @@ box.cfg{checkpoint_count = checkpoint_count}
 ...
 -- Restart the replica. Since xlogs have been removed,
 -- it is supposed to rejoin without changing id.
-test_run:cmd("start server replica")
+test_run:cmd("start server replica with args='true'")
 ---
 - true
 ...
@@ -158,7 +158,7 @@ box.space.test:select()
   - [30, 30]
 ...
 -- Check that restart works as usual.
-test_run:cmd("restart server replica")
+test_run:cmd("restart server replica with args='true'")
 box.info.replication[1].upstream.status == 'follow' or box.info
 ---
 - true
@@ -210,7 +210,7 @@ test_run:wait_cond(function() return #fio.glob(fio.pathjoin(box.cfg.wal_dir, '*.
 box.cfg{checkpoint_count = checkpoint_count}
 ---
 ...
-test_run:cmd("start server replica")
+test_run:cmd("start server replica with args='true'")
 ---
 - true
 ...
@@ -252,7 +252,7 @@ test_run:cleanup_cluster()
 box.space.test:truncate()
 ---
 ...
-test_run:cmd("start server replica")
+test_run:cmd("start server replica with args='true'")
 ---
 - true
 ...
@@ -349,7 +349,7 @@ _ = test_run:wait_vclock('default', vclock)
 ---
 ...
 -- Restart the replica. It should successfully rebootstrap.
-test_run:cmd("restart server replica")
+test_run:cmd("restart server replica with args='true'")
 box.space.test:select()
 ---
 - - [1]
diff --git a/test/replication/replica_rejoin.test.lua b/test/replication/replica_rejoin.test.lua
index 22a91d8d7..25c0849ec 100644
--- a/test/replication/replica_rejoin.test.lua
+++ b/test/replication/replica_rejoin.test.lua
@@ -17,7 +17,7 @@ _ = box.space.test:insert{3}
 
 -- Join a replica, then stop it.
 test_run:cmd("create server replica with rpl_master=default, script='replication/replica.lua'")
-test_run:cmd("start server replica")
+test_run:cmd("start server replica with args='true'")
 test_run:cmd("switch replica")
 box.info.replication[1].upstream.status == 'follow' or box.info
 box.space.test:select()
@@ -45,7 +45,7 @@ box.cfg{checkpoint_count = checkpoint_count}
 
 -- Restart the replica. Since xlogs have been removed,
 -- it is supposed to rejoin without changing id.
-test_run:cmd("start server replica")
+test_run:cmd("start server replica with args='true'")
 box.info.replication[2].downstream.vclock ~= nil or box.info
 test_run:cmd("switch replica")
 box.info.replication[1].upstream.status == 'follow' or box.info
@@ -60,7 +60,7 @@ test_run:cmd("switch replica")
 box.space.test:select()
 
 -- Check that restart works as usual.
-test_run:cmd("restart server replica")
+test_run:cmd("restart server replica with args='true'")
 box.info.replication[1].upstream.status == 'follow' or box.info
 box.space.test:select()
 
@@ -78,7 +78,7 @@ for i = 1, 3 do box.space.test:insert{i * 100} end
 fio = require('fio')
 test_run:wait_cond(function() return #fio.glob(fio.pathjoin(box.cfg.wal_dir, '*.xlog')) == 1 end) or fio.pathjoin(box.cfg.wal_dir, '*.xlog')
 box.cfg{checkpoint_count = checkpoint_count}
-test_run:cmd("start server replica")
+test_run:cmd("start server replica with args='true'")
 test_run:cmd("switch replica")
 box.info.status -- orphan
 box.space.test:select()
@@ -94,7 +94,7 @@ test_run:cmd("stop server replica")
 test_run:cmd("cleanup server replica")
 test_run:cleanup_cluster()
 box.space.test:truncate()
-test_run:cmd("start server replica")
+test_run:cmd("start server replica with args='true'")
 -- Subscribe the master to the replica.
 replica_listen = test_run:cmd("eval replica 'return box.cfg.listen'")
 replica_listen ~= nil
@@ -128,7 +128,7 @@ for i = 1, 10 do box.space.test:replace{2} end
 vclock = test_run:get_vclock('replica')
 _ = test_run:wait_vclock('default', vclock)
 -- Restart the replica. It should successfully rebootstrap.
-test_run:cmd("restart server replica")
+test_run:cmd("restart server replica with args='true'")
 box.space.test:select()
 box.snapshot()
 box.space.test:replace{2}
-- 
2.21.1 (Apple Git-122.3)
^ permalink raw reply	[flat|nested] 9+ messages in thread- * Re: [Tarantool-patches] [PATCH] replication: fix rebootstrap in case the instance is listed in box.cfg.replication
  2020-02-28 17:01 [Tarantool-patches] [PATCH] replication: fix rebootstrap in case the instance is listed in box.cfg.replication Serge Petrenko
@ 2020-02-28 23:43 ` Vladislav Shpilevoy
  2020-02-29  8:54   ` Konstantin Osipov
  2020-02-29  9:52   ` Serge Petrenko
  2020-03-03  8:25 ` Kirill Yukhin
  1 sibling, 2 replies; 9+ messages in thread
From: Vladislav Shpilevoy @ 2020-02-28 23:43 UTC (permalink / raw)
  To: Serge Petrenko, kostja.osipov, kirichenkoga; +Cc: tarantool-patches
Thanks for the patch!
On 28/02/2020 18:01, Serge Petrenko wrote:
> When checking wheter rejoin is needed, replica loops through all the
> instances in box.cfg.replication, which makes it believe that there is a
> master holding files, needed by it, since it accounts itself just like
> all other instances.
> So make replica skip itself when finding an instance which holds files
> needed by it, and determining whether rebootstrap is needed.
> 
> We already have a working test for the issue, it missed the issue due to
> replica.lua replication settings. Fix replica.lua to optionally include
> itself in box.cfg.replication, so that the corresponding test works
> correctly.
> 
> Closes #4759
> ---
> https://github.com/tarantool/tarantool/issues/4759
> https://github.com/tarantool/tarantool/tree/sp/gh-4759-rebootstrap-fix
> 
> @ChangeLog
>  - fix rebootstrap procedure not working in case replica itself
>    is listed in `box.cfg.replication`
> 
>  src/box/replication.cc                   | 13 ++++++++++++-
>  test/replication/replica.lua             | 11 ++++++++++-
>  test/replication/replica_rejoin.result   | 12 ++++++------
>  test/replication/replica_rejoin.test.lua | 12 ++++++------
>  4 files changed, 34 insertions(+), 14 deletions(-)
> 
> diff --git a/src/box/replication.cc b/src/box/replication.cc
> index e7bfa22ab..01edc0fb2 100644
> --- a/src/box/replication.cc
> +++ b/src/box/replication.cc
> @@ -768,8 +768,19 @@ replicaset_needs_rejoin(struct replica **master)
>  	struct replica *leader = NULL;
>  	replicaset_foreach(replica) {
>  		struct applier *applier = replica->applier;
> -		if (applier == NULL)
> +		/*
> +		 * The function is called right after
> +		 * box_sync_replication(), which in turn calls
> +		 * replicaset_connect(), which ensures that
> +		 * appliers are either stopped (APPLIER OFF) or
> +		 * connected.
> +		 * Also ignore self, as self applier might not
> +		 * have disconnected yet.
> +		 */
> +		if (applier == NULL || applier->state == APPLIER_OFF ||
> +		    tt_uuid_is_equal(&replica->uuid, &INSTANCE_UUID))
>  			continue;
> +		assert(applier->state == APPLIER_CONNECTED);
>  
Could you please understand one thing? Below I see this:
> 		const struct ballot *ballot = &applier->ballot;
> 		if (vclock_compare(&ballot->gc_vclock,
> 				   &replicaset.vclock) <= 0) {
> 			/*
> 			 * There's at least one master that still stores
> 			 * WALs needed by this instance. Proceed to local
> 			 * recovery.
> 			 */
> 			return false;
> 		}
Question is why do we need rebootstrap if some remote node's
vclock is bigger than ours? It does not mean anything. It doesn't
say whether that remote instance still keeps any xlogs. All it
tells is that the remote instance committed some more data since
our restart.
^ permalink raw reply	[flat|nested] 9+ messages in thread
- * Re: [Tarantool-patches] [PATCH] replication: fix rebootstrap in case the instance is listed in box.cfg.replication
  2020-02-28 23:43 ` Vladislav Shpilevoy
@ 2020-02-29  8:54   ` Konstantin Osipov
  2020-02-29  9:52   ` Serge Petrenko
  1 sibling, 0 replies; 9+ messages in thread
From: Konstantin Osipov @ 2020-02-29  8:54 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: kirichenkoga, tarantool-patches
* Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [20/02/29 08:50]:
> Thanks for the patch!
> 
> On 28/02/2020 18:01, Serge Petrenko wrote:
> > When checking wheter rejoin is needed, replica loops through all the
> > instances in box.cfg.replication, which makes it believe that there is a
> > master holding files, needed by it, since it accounts itself just like
> > all other instances.
> > So make replica skip itself when finding an instance which holds files
> > needed by it, and determining whether rebootstrap is needed.
> > 
> > We already have a working test for the issue, it missed the issue due to
> > replica.lua replication settings. Fix replica.lua to optionally include
> > itself in box.cfg.replication, so that the corresponding test works
> > correctly.
> > 
> > Closes #4759
> > ---
> > https://github.com/tarantool/tarantool/issues/4759
> > https://github.com/tarantool/tarantool/tree/sp/gh-4759-rebootstrap-fix
> > 
> > @ChangeLog
> >  - fix rebootstrap procedure not working in case replica itself
> >    is listed in `box.cfg.replication`
> > 
> >  src/box/replication.cc                   | 13 ++++++++++++-
> >  test/replication/replica.lua             | 11 ++++++++++-
> >  test/replication/replica_rejoin.result   | 12 ++++++------
> >  test/replication/replica_rejoin.test.lua | 12 ++++++------
> >  4 files changed, 34 insertions(+), 14 deletions(-)
> > 
> > diff --git a/src/box/replication.cc b/src/box/replication.cc
> > index e7bfa22ab..01edc0fb2 100644
> > --- a/src/box/replication.cc
> > +++ b/src/box/replication.cc
> > @@ -768,8 +768,19 @@ replicaset_needs_rejoin(struct replica **master)
> >  	struct replica *leader = NULL;
> >  	replicaset_foreach(replica) {
> >  		struct applier *applier = replica->applier;
> > -		if (applier == NULL)
> > +		/*
> > +		 * The function is called right after
> > +		 * box_sync_replication(), which in turn calls
> > +		 * replicaset_connect(), which ensures that
> > +		 * appliers are either stopped (APPLIER OFF) or
> > +		 * connected.
> > +		 * Also ignore self, as self applier might not
> > +		 * have disconnected yet.
> > +		 */
> > +		if (applier == NULL || applier->state == APPLIER_OFF ||
> > +		    tt_uuid_is_equal(&replica->uuid, &INSTANCE_UUID))
> >  			continue;
> > +		assert(applier->state == APPLIER_CONNECTED);
> >  
> 
> Could you please understand one thing? Below I see this:
> 
> > 		const struct ballot *ballot = &applier->ballot;
> > 		if (vclock_compare(&ballot->gc_vclock,
> > 				   &replicaset.vclock) <= 0) {
> > 			/*
> > 			 * There's at least one master that still stores
> > 			 * WALs needed by this instance. Proceed to local
> > 			 * recovery.
> > 			 */
> > 			return false;
> > 		}
> 
> Question is why do we need rebootstrap if some remote node's
> vclock is bigger than ours? It does not mean anything. It doesn't
> say whether that remote instance still keeps any xlogs. All it
> tells is that the remote instance committed some more data since
> our restart.
During rebootstrap we don't care about WAL logs too much, we
bootstrap off the latest checkpoint, and then feed the logs since
the latest checkpoint.
Could be added in a comment.
-- 
Konstantin Osipov, Moscow, Russia
^ permalink raw reply	[flat|nested] 9+ messages in thread
- * Re: [Tarantool-patches] [PATCH] replication: fix rebootstrap in case the instance is listed in box.cfg.replication
  2020-02-28 23:43 ` Vladislav Shpilevoy
  2020-02-29  8:54   ` Konstantin Osipov
@ 2020-02-29  9:52   ` Serge Petrenko
  2020-02-29 14:19     ` Vladislav Shpilevoy
  1 sibling, 1 reply; 9+ messages in thread
From: Serge Petrenko @ 2020-02-29  9:52 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: kirichenkoga, tarantool-patches
[-- Attachment #1: Type: text/plain, Size: 3318 bytes --]
Hi!  
>Суббота, 29 февраля 2020, 2:43 +03:00 от Vladislav Shpilevoy <v.shpilevoy@tarantool.org>:
> 
>Thanks for the patch!
>
>On 28/02/2020 18:01, Serge Petrenko wrote:
>> When checking wheter rejoin is needed, replica loops through all the
>> instances in box.cfg.replication, which makes it believe that there is a
>> master holding files, needed by it, since it accounts itself just like
>> all other instances.
>> So make replica skip itself when finding an instance which holds files
>> needed by it, and determining whether rebootstrap is needed.
>>
>> We already have a working test for the issue, it missed the issue due to
>> replica.lua replication settings. Fix replica.lua to optionally include
>> itself in box.cfg.replication, so that the corresponding test works
>> correctly.
>>
>> Closes #4759
>> ---
>>  https://github.com/tarantool/tarantool/issues/4759
>>  https://github.com/tarantool/tarantool/tree/sp/gh-4759-rebootstrap-fix
>>
>> @ChangeLog
>> - fix rebootstrap procedure not working in case replica itself
>> is listed in `box.cfg.replication`
>>
>> src/box/replication.cc | 13 ++++++++++++-
>> test/replication/replica.lua | 11 ++++++++++-
>> test/replication/replica_rejoin.result | 12 ++++++------
>> test/replication/replica_rejoin.test.lua | 12 ++++++------
>> 4 files changed, 34 insertions(+), 14 deletions(-)
>>
>> diff --git a/src/box/replication.cc b/src/box/replication.cc
>> index e7bfa22ab..01edc0fb2 100644
>> --- a/src/box/replication.cc
>> +++ b/src/box/replication.cc
>> @@ -768,8 +768,19 @@ replicaset_needs_rejoin(struct replica **master)
>> struct replica *leader = NULL;
>> replicaset_foreach(replica) {
>> struct applier *applier = replica->applier;
>> - if (applier == NULL)
>> + /*
>> + * The function is called right after
>> + * box_sync_replication(), which in turn calls
>> + * replicaset_connect(), which ensures that
>> + * appliers are either stopped (APPLIER OFF) or
>> + * connected.
>> + * Also ignore self, as self applier might not
>> + * have disconnected yet.
>> + */
>> + if (applier == NULL || applier->state == APPLIER_OFF ||
>> + tt_uuid_is_equal(&replica->uuid, &INSTANCE_UUID))
>> continue;
>> + assert(applier->state == APPLIER_CONNECTED);
>>
>
>Could you please understand one thing? Below I see this:
>
>> const struct ballot *ballot = &applier->ballot;
>> if (vclock_compare(&ballot->gc_vclock,
>> &replicaset.vclock) <= 0) {
>> /*
>> * There's at least one master that still stores
>> * WALs needed by this instance. Proceed to local
>> * recovery.
>> */
>> return false;
>> }
>
>Question is why do we need rebootstrap if some remote node's
>vclock is bigger than ours? It does not mean anything. It doesn't
>say whether that remote instance still keeps any xlogs. All it
>tells is that the remote instance committed some more data since
>our restart.
 
In the piece above we check for master’s gc vclock, which is vclock of the
oldest WAL the instance still has. If master’s gc vclock is less than our vclock,
no rebootstrap is needed. Master’ll be able to send us all the fresh data.
 Later we check whether our vclock is greater or incompatible with the remote
instance’s vclock. If it is, we cannot rebootstrap from the instance, since we have
some rows, that the instance doesn’t have.
 
--
Serge Petrenko
[-- Attachment #2: Type: text/html, Size: 4271 bytes --]
^ permalink raw reply	[flat|nested] 9+ messages in thread
- * Re: [Tarantool-patches] [PATCH] replication: fix rebootstrap in case the instance is listed in box.cfg.replication
  2020-02-29  9:52   ` Serge Petrenko
@ 2020-02-29 14:19     ` Vladislav Shpilevoy
  2020-03-02 17:18       ` Serge Petrenko
  0 siblings, 1 reply; 9+ messages in thread
From: Vladislav Shpilevoy @ 2020-02-29 14:19 UTC (permalink / raw)
  To: Serge Petrenko; +Cc: kirichenkoga, tarantool-patches
On 29/02/2020 10:52, Serge Petrenko wrote:
> Hi! 
> 
>     Суббота, 29 февраля 2020, 2:43 +03:00 от Vladislav Shpilevoy <v.shpilevoy@tarantool.org>:
>      
>     Thanks for the patch!
> 
>     On 28/02/2020 18:01, Serge Petrenko wrote:
>     > When checking wheter rejoin is needed, replica loops through all the
>     > instances in box.cfg.replication, which makes it believe that there is a
>     > master holding files, needed by it, since it accounts itself just like
>     > all other instances.
>     > So make replica skip itself when finding an instance which holds files
>     > needed by it, and determining whether rebootstrap is needed.
>     >
>     > We already have a working test for the issue, it missed the issue due to
>     > replica.lua replication settings. Fix replica.lua to optionally include
>     > itself in box.cfg.replication, so that the corresponding test works
>     > correctly.
>     >
>     > Closes #4759
>     > ---
>     > https://github.com/tarantool/tarantool/issues/4759
>     > https://github.com/tarantool/tarantool/tree/sp/gh-4759-rebootstrap-fix
>     >
>     > @ChangeLog
>     > - fix rebootstrap procedure not working in case replica itself
>     > is listed in `box.cfg.replication`
>     >
>     > src/box/replication.cc | 13 ++++++++++++-
>     > test/replication/replica.lua | 11 ++++++++++-
>     > test/replication/replica_rejoin.result | 12 ++++++------
>     > test/replication/replica_rejoin.test.lua | 12 ++++++------
>     > 4 files changed, 34 insertions(+), 14 deletions(-)
>     >
>     > diff --git a/src/box/replication.cc b/src/box/replication.cc
>     > index e7bfa22ab..01edc0fb2 100644
>     > --- a/src/box/replication.cc
>     > +++ b/src/box/replication.cc
>     > @@ -768,8 +768,19 @@ replicaset_needs_rejoin(struct replica **master)
>     > struct replica *leader = NULL;
>     > replicaset_foreach(replica) {
>     > struct applier *applier = replica->applier;
>     > - if (applier == NULL)
>     > + /*
>     > + * The function is called right after
>     > + * box_sync_replication(), which in turn calls
>     > + * replicaset_connect(), which ensures that
>     > + * appliers are either stopped (APPLIER OFF) or
>     > + * connected.
>     > + * Also ignore self, as self applier might not
>     > + * have disconnected yet.
>     > + */
>     > + if (applier == NULL || applier->state == APPLIER_OFF ||
>     > + tt_uuid_is_equal(&replica->uuid, &INSTANCE_UUID))
>     > continue;
>     > + assert(applier->state == APPLIER_CONNECTED);
>     >
> 
>     Could you please understand one thing? Below I see this:
> 
>     > const struct ballot *ballot = &applier->ballot;
>     > if (vclock_compare(&ballot->gc_vclock,
>     > &replicaset.vclock) <= 0) {
>     > /*
>     > * There's at least one master that still stores
>     > * WALs needed by this instance. Proceed to local
>     > * recovery.
>     > */
>     > return false;
>     > }
> 
>     Question is why do we need rebootstrap if some remote node's
>     vclock is bigger than ours? It does not mean anything. It doesn't
>     say whether that remote instance still keeps any xlogs. All it
>     tells is that the remote instance committed some more data since
>     our restart.
> 
>  
> In the piece above we check for master’s gc vclock, which is vclock of the
> oldest WAL the instance still has. If master’s gc vclock is less than our vclock,
> no rebootstrap is needed. Master’ll be able to send us all the fresh data.
>  Later we check whether our vclock is greater or incompatible with the remote
> instance’s vclock. If it is, we cannot rebootstrap from the instance, since we have
> some rows, that the instance doesn’t have.
Thanks for the explanation. Then I would change the comment to say
more about the issue. Currently it says about self only
    "Also ignore self, as self applier might not have disconnected yet."
This does not look related to the actual problem, that we just should
not look at our gc vclock.
Also why do we need 'applier->state == APPLIER_OFF' check? How is it
related to the issue? If this is a separate bug, is it possible to
make it a separate patch?
^ permalink raw reply	[flat|nested] 9+ messages in thread
- * Re: [Tarantool-patches] [PATCH] replication: fix rebootstrap in case the instance is listed in box.cfg.replication
  2020-02-29 14:19     ` Vladislav Shpilevoy
@ 2020-03-02 17:18       ` Serge Petrenko
  2020-03-02 21:26         ` Vladislav Shpilevoy
  0 siblings, 1 reply; 9+ messages in thread
From: Serge Petrenko @ 2020-03-02 17:18 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: kirichenkoga, tarantool-patches
[-- Attachment #1: Type: text/plain, Size: 5382 bytes --]
  
>Суббота, 29 февраля 2020, 17:19 +03:00 от Vladislav Shpilevoy <v.shpilevoy@tarantool.org>:
> 
>On 29/02/2020 10:52, Serge Petrenko wrote:
>> Hi! 
>>
>> Суббота, 29 февраля 2020, 2:43 +03:00 от Vladislav Shpilevoy < v.shpilevoy@tarantool.org >:
>>  
>> Thanks for the patch!
>>
>> On 28/02/2020 18:01, Serge Petrenko wrote:
>> > When checking wheter rejoin is needed, replica loops through all the
>> > instances in box.cfg.replication, which makes it believe that there is a
>> > master holding files, needed by it, since it accounts itself just like
>> > all other instances.
>> > So make replica skip itself when finding an instance which holds files
>> > needed by it, and determining whether rebootstrap is needed.
>> >
>> > We already have a working test for the issue, it missed the issue due to
>> > replica.lua replication settings. Fix replica.lua to optionally include
>> > itself in box.cfg.replication, so that the corresponding test works
>> > correctly.
>> >
>> > Closes #4759
>> > ---
>> >  https://github.com/tarantool/tarantool/issues/4759
>> >  https://github.com/tarantool/tarantool/tree/sp/gh-4759-rebootstrap-fix
>> >
>> > @ChangeLog
>> > - fix rebootstrap procedure not working in case replica itself
>> > is listed in `box.cfg.replication`
>> >
>> > src/box/replication.cc | 13 ++++++++++++-
>> > test/replication/replica.lua | 11 ++++++++++-
>> > test/replication/replica_rejoin.result | 12 ++++++------
>> > test/replication/replica_rejoin.test.lua | 12 ++++++------
>> > 4 files changed, 34 insertions(+), 14 deletions(-)
>> >
>> > diff --git a/src/box/replication.cc b/src/box/replication.cc
>> > index e7bfa22ab..01edc0fb2 100644
>> > --- a/src/box/replication.cc
>> > +++ b/src/box/replication.cc
>> > @@ -768,8 +768,19 @@ replicaset_needs_rejoin(struct replica **master)
>> > struct replica *leader = NULL;
>> > replicaset_foreach(replica) {
>> > struct applier *applier = replica->applier;
>> > - if (applier == NULL)
>> > + /*
>> > + * The function is called right after
>> > + * box_sync_replication(), which in turn calls
>> > + * replicaset_connect(), which ensures that
>> > + * appliers are either stopped (APPLIER OFF) or
>> > + * connected.
>> > + * Also ignore self, as self applier might not
>> > + * have disconnected yet.
>> > + */
>> > + if (applier == NULL || applier->state == APPLIER_OFF ||
>> > + tt_uuid_is_equal(&replica->uuid, &INSTANCE_UUID))
>> > continue;
>> > + assert(applier->state == APPLIER_CONNECTED);
>> >
>>
>> Could you please understand one thing? Below I see this:
>>
>> > const struct ballot *ballot = &applier->ballot;
>> > if (vclock_compare(&ballot->gc_vclock,
>> > &replicaset.vclock) <= 0) {
>> > /*
>> > * There's at least one master that still stores
>> > * WALs needed by this instance. Proceed to local
>> > * recovery.
>> > */
>> > return false;
>> > }
>>
>> Question is why do we need rebootstrap if some remote node's
>> vclock is bigger than ours? It does not mean anything. It doesn't
>> say whether that remote instance still keeps any xlogs. All it
>> tells is that the remote instance committed some more data since
>> our restart.
>>
>>  
>> In the piece above we check for master’s gc vclock, which is vclock of the
>> oldest WAL the instance still has. If master’s gc vclock is less than our vclock,
>> no rebootstrap is needed. Master’ll be able to send us all the fresh data.
>>  Later we check whether our vclock is greater or incompatible with the remote
>> instance’s vclock. If it is, we cannot rebootstrap from the instance, since we have
>> some rows, that the instance doesn’t have.
>Thanks for the explanation. Then I would change the comment to say
>more about the issue. Currently it says about self only
>
>    "Also ignore self, as self applier might not have disconnected yet."
>
>This does not look related to the actual problem, that we just should
>not look at our gc vclock.
 
Ok, I changed the comment. The diff is below.
 
>
>Also why do we need 'applier->state == APPLIER_OFF' check? How is it
>related to the issue? If this is a separate bug, is it possible to
>make it a separate patch?
 
We needn’t this check, after all. replicaset_update() at the end of replicaset_connect()
ensures that all the replicas in the replicaset have appliers in state APPLIER_CONNECTED.
Thanks for pointing this out.
 
diff --git a/src/box/replication.cc b/src/box/replication.cc
index 01edc0fb2..1345f189b 100644
--- a/src/box/replication.cc
+++ b/src/box/replication.cc
@@ -769,18 +769,12 @@ replicaset_needs_rejoin(struct replica **master)
  replicaset_foreach(replica) {
  struct applier *applier = replica->applier;
  /*
- * The function is called right after
- * box_sync_replication(), which in turn calls
- * replicaset_connect(), which ensures that
- * appliers are either stopped (APPLIER OFF) or
- * connected.
- * Also ignore self, as self applier might not
- * have disconnected yet.
+ * Skip the local instance, we shouldn't perform a
+ * check against our own gc vclock.
  */
- if (applier == NULL || applier->state == APPLIER_OFF ||
-     tt_uuid_is_equal(&replica->uuid, &INSTANCE_UUID))
+ if (applier == NULL || tt_uuid_is_equal(&replica->uuid,
+ &INSTANCE_UUID))
  continue;
- assert(applier->state == APPLIER_CONNECTED);
 
  const struct ballot *ballot = &applier->ballot;
 
--
Serge Petrenko
 
[-- Attachment #2: Type: text/html, Size: 7208 bytes --]
^ permalink raw reply	[flat|nested] 9+ messages in thread
- * Re: [Tarantool-patches] [PATCH] replication: fix rebootstrap in case the instance is listed in box.cfg.replication
  2020-03-02 17:18       ` Serge Petrenko
@ 2020-03-02 21:26         ` Vladislav Shpilevoy
  0 siblings, 0 replies; 9+ messages in thread
From: Vladislav Shpilevoy @ 2020-03-02 21:26 UTC (permalink / raw)
  To: Serge Petrenko; +Cc: kirichenkoga, tarantool-patches
LGTM.
On 02/03/2020 18:18, Serge Petrenko wrote:
>  
> 
>     Суббота, 29 февраля 2020, 17:19 +03:00 от Vladislav Shpilevoy <v.shpilevoy@tarantool.org>:
>      
>     On 29/02/2020 10:52, Serge Petrenko wrote:
>     > Hi! 
>     >
>     > Суббота, 29 февраля 2020, 2:43 +03:00 от Vladislav Shpilevoy <v.shpilevoy@tarantool.org </compose?To=v.shpilevoy@tarantool.org>>:
>     >  
>     > Thanks for the patch!
>     >
>     > On 28/02/2020 18:01, Serge Petrenko wrote:
>     > > When checking wheter rejoin is needed, replica loops through all the
>     > > instances in box.cfg.replication, which makes it believe that there is a
>     > > master holding files, needed by it, since it accounts itself just like
>     > > all other instances.
>     > > So make replica skip itself when finding an instance which holds files
>     > > needed by it, and determining whether rebootstrap is needed.
>     > >
>     > > We already have a working test for the issue, it missed the issue due to
>     > > replica.lua replication settings. Fix replica.lua to optionally include
>     > > itself in box.cfg.replication, so that the corresponding test works
>     > > correctly.
>     > >
>     > > Closes #4759
>     > > ---
>     > > https://github.com/tarantool/tarantool/issues/4759
>     > > https://github.com/tarantool/tarantool/tree/sp/gh-4759-rebootstrap-fix
>     > >
>     > > @ChangeLog
>     > > - fix rebootstrap procedure not working in case replica itself
>     > > is listed in `box.cfg.replication`
>     > >
>     > > src/box/replication.cc | 13 ++++++++++++-
>     > > test/replication/replica.lua | 11 ++++++++++-
>     > > test/replication/replica_rejoin.result | 12 ++++++------
>     > > test/replication/replica_rejoin.test.lua | 12 ++++++------
>     > > 4 files changed, 34 insertions(+), 14 deletions(-)
>     > >
>     > > diff --git a/src/box/replication.cc b/src/box/replication.cc
>     > > index e7bfa22ab..01edc0fb2 100644
>     > > --- a/src/box/replication.cc
>     > > +++ b/src/box/replication.cc
>     > > @@ -768,8 +768,19 @@ replicaset_needs_rejoin(struct replica **master)
>     > > struct replica *leader = NULL;
>     > > replicaset_foreach(replica) {
>     > > struct applier *applier = replica->applier;
>     > > - if (applier == NULL)
>     > > + /*
>     > > + * The function is called right after
>     > > + * box_sync_replication(), which in turn calls
>     > > + * replicaset_connect(), which ensures that
>     > > + * appliers are either stopped (APPLIER OFF) or
>     > > + * connected.
>     > > + * Also ignore self, as self applier might not
>     > > + * have disconnected yet.
>     > > + */
>     > > + if (applier == NULL || applier->state == APPLIER_OFF ||
>     > > + tt_uuid_is_equal(&replica->uuid, &INSTANCE_UUID))
>     > > continue;
>     > > + assert(applier->state == APPLIER_CONNECTED);
>     > >
>     >
>     > Could you please understand one thing? Below I see this:
>     >
>     > > const struct ballot *ballot = &applier->ballot;
>     > > if (vclock_compare(&ballot->gc_vclock,
>     > > &replicaset.vclock) <= 0) {
>     > > /*
>     > > * There's at least one master that still stores
>     > > * WALs needed by this instance. Proceed to local
>     > > * recovery.
>     > > */
>     > > return false;
>     > > }
>     >
>     > Question is why do we need rebootstrap if some remote node's
>     > vclock is bigger than ours? It does not mean anything. It doesn't
>     > say whether that remote instance still keeps any xlogs. All it
>     > tells is that the remote instance committed some more data since
>     > our restart.
>     >
>     >  
>     > In the piece above we check for master’s gc vclock, which is vclock of the
>     > oldest WAL the instance still has. If master’s gc vclock is less than our vclock,
>     > no rebootstrap is needed. Master’ll be able to send us all the fresh data.
>     >  Later we check whether our vclock is greater or incompatible with the remote
>     > instance’s vclock. If it is, we cannot rebootstrap from the instance, since we have
>     > some rows, that the instance doesn’t have.
> 
>     Thanks for the explanation. Then I would change the comment to say
>     more about the issue. Currently it says about self only
> 
>         "Also ignore self, as self applier might not have disconnected yet."
> 
>     This does not look related to the actual problem, that we just should
>     not look at our gc vclock.
> 
>  
> Ok, I changed the comment. The diff is below.
>  
> 
> 
>     Also why do we need 'applier->state == APPLIER_OFF' check? How is it
>     related to the issue? If this is a separate bug, is it possible to
>     make it a separate patch?
> 
>  
> We needn’t this check, after all. replicaset_update() at the end of replicaset_connect()
> ensures that all the replicas in the replicaset have appliers in state APPLIER_CONNECTED.
> Thanks for pointing this out.
>  
> 
> diff --git a/src/box/replication.cc b/src/box/replication.cc
> 
> index 01edc0fb2..1345f189b 100644
> 
> --- a/src/box/replication.cc
> 
> +++ b/src/box/replication.cc
> 
> @@ -769,18 +769,12 @@ replicaset_needs_rejoin(struct replica **master)
> 
>   replicaset_foreach(replica) {
> 
>   struct applier *applier = replica->applier;
> 
>   /*
> 
> - * The function is called right after
> 
> - * box_sync_replication(), which in turn calls
> 
> - * replicaset_connect(), which ensures that
> 
> - * appliers are either stopped (APPLIER OFF) or
> 
> - * connected.
> 
> - * Also ignore self, as self applier might not
> 
> - * have disconnected yet.
> 
> + * Skip the local instance, we shouldn't perform a
> 
> + * check against our own gc vclock.
> 
>   */
> 
> - if (applier == NULL || applier->state == APPLIER_OFF ||
> 
> -     tt_uuid_is_equal(&replica->uuid, &INSTANCE_UUID))
> 
> + if (applier == NULL || tt_uuid_is_equal(&replica->uuid,
> 
> + &INSTANCE_UUID))
> 
>   continue;
> 
> - assert(applier->state == APPLIER_CONNECTED);
> 
>  
> 
>   const struct ballot *ballot = &applier->ballot;
> 
>  
> --
> Serge Petrenko
>  
^ permalink raw reply	[flat|nested] 9+ messages in thread
 
 
 
 
- * Re: [Tarantool-patches] [PATCH] replication: fix rebootstrap in case the instance is listed in box.cfg.replication
  2020-02-28 17:01 [Tarantool-patches] [PATCH] replication: fix rebootstrap in case the instance is listed in box.cfg.replication Serge Petrenko
  2020-02-28 23:43 ` Vladislav Shpilevoy
@ 2020-03-03  8:25 ` Kirill Yukhin
  2020-03-05  4:52   ` Kirill Yukhin
  1 sibling, 1 reply; 9+ messages in thread
From: Kirill Yukhin @ 2020-03-03  8:25 UTC (permalink / raw)
  To: Serge Petrenko; +Cc: kirichenkoga, tarantool-patches, v.shpilevoy
Hello,
On 28 фев 20:01, Serge Petrenko wrote:
> When checking wheter rejoin is needed, replica loops through all the
> instances in box.cfg.replication, which makes it believe that there is a
> master holding files, needed by it, since it accounts itself just like
> all other instances.
> So make replica skip itself when finding an instance which holds files
> needed by it, and determining whether rebootstrap is needed.
> 
> We already have a working test for the issue, it missed the issue due to
> replica.lua replication settings. Fix replica.lua to optionally include
> itself in box.cfg.replication, so that the corresponding test works
> correctly.
> 
> Closes #4759
> ---
> https://github.com/tarantool/tarantool/issues/4759
> https://github.com/tarantool/tarantool/tree/sp/gh-4759-rebootstrap-fix
THe patch LGTM. I've checked it into 2.2, 2.3 and master.
--
Regards, Kirill Yukhin
^ permalink raw reply	[flat|nested] 9+ messages in thread 
- * Re: [Tarantool-patches] [PATCH] replication: fix rebootstrap in case the instance is listed in box.cfg.replication
  2020-03-03  8:25 ` Kirill Yukhin
@ 2020-03-05  4:52   ` Kirill Yukhin
  0 siblings, 0 replies; 9+ messages in thread
From: Kirill Yukhin @ 2020-03-05  4:52 UTC (permalink / raw)
  To: Serge Petrenko; +Cc: kirichenkoga, tarantool-patches, v.shpilevoy
Hello,
On 03 мар 11:25, Kirill Yukhin wrote:
> Hello,
> 
> On 28 фев 20:01, Serge Petrenko wrote:
> > When checking wheter rejoin is needed, replica loops through all the
> > instances in box.cfg.replication, which makes it believe that there is a
> > master holding files, needed by it, since it accounts itself just like
> > all other instances.
> > So make replica skip itself when finding an instance which holds files
> > needed by it, and determining whether rebootstrap is needed.
> > 
> > We already have a working test for the issue, it missed the issue due to
> > replica.lua replication settings. Fix replica.lua to optionally include
> > itself in box.cfg.replication, so that the corresponding test works
> > correctly.
> > 
> > Closes #4759
> > ---
> > https://github.com/tarantool/tarantool/issues/4759
> > https://github.com/tarantool/tarantool/tree/sp/gh-4759-rebootstrap-fix
> 
> THe patch LGTM. I've checked it into 2.2, 2.3 and master.
CHecked into 1.10 as well.
--
Regards, Kirill Yukhin
^ permalink raw reply	[flat|nested] 9+ messages in thread 
 
end of thread, other threads:[~2020-03-05  4:52 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-28 17:01 [Tarantool-patches] [PATCH] replication: fix rebootstrap in case the instance is listed in box.cfg.replication Serge Petrenko
2020-02-28 23:43 ` Vladislav Shpilevoy
2020-02-29  8:54   ` Konstantin Osipov
2020-02-29  9:52   ` Serge Petrenko
2020-02-29 14:19     ` Vladislav Shpilevoy
2020-03-02 17:18       ` Serge Petrenko
2020-03-02 21:26         ` Vladislav Shpilevoy
2020-03-03  8:25 ` Kirill Yukhin
2020-03-05  4:52   ` Kirill Yukhin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox