Tarantool development patches archive
* [PATCH v2 0/2] replication: fix a failing assert in replica_on_applier_disconnect()
@ 2018-08-17  6:59 Serge Petrenko
  2018-08-17  6:59 ` [PATCH v2 1/2] Update test-run Serge Petrenko
  2018-08-17  6:59 ` [PATCH v2 2/2] replication: fix a failing assert in replica_on_applier_disconnect() Serge Petrenko
  0 siblings, 2 replies; 5+ messages in thread
From: Serge Petrenko @ 2018-08-17  6:59 UTC (permalink / raw)
  To: vdavydov.dev; +Cc: kyukhin, tarantool-patches, Serge Petrenko

https://github.com/tarantool/tarantool/issues/3510
https://github.com/tarantool/tarantool/tree/sergepetrenko/gh-3510-replication-asserts-fail

Changes in v2:
  - update test-run to pick up a fix that prevents
    a test hang.

Serge Petrenko (2):
  Update test-run
  replication: fix a failing assert in replica_on_applier_disconnect()

 src/box/replication.cc         |  4 ++++
 test-run                       |  2 +-
 test/replication/er_load.lua   | 25 +++++++++++++++++++++++++
 test/replication/er_load1.lua  |  1 +
 test/replication/er_load2.lua  |  1 +
 test/replication/misc.result   | 32 ++++++++++++++++++++++++++++++++
 test/replication/misc.test.lua | 12 ++++++++++++
 7 files changed, 76 insertions(+), 1 deletion(-)
 create mode 100644 test/replication/er_load.lua
 create mode 120000 test/replication/er_load1.lua
 create mode 120000 test/replication/er_load2.lua

-- 
2.15.2 (Apple Git-101.1)

* [PATCH v2 1/2] Update test-run
  2018-08-17  6:59 [PATCH v2 0/2] replication: fix a failing assert in replica_on_applier_disconnect() Serge Petrenko
@ 2018-08-17  6:59 ` Serge Petrenko
  2018-08-17 11:41   ` [tarantool-patches] " Alexander Turenko
  2018-08-17  6:59 ` [PATCH v2 2/2] replication: fix a failing assert in replica_on_applier_disconnect() Serge Petrenko
  1 sibling, 1 reply; 5+ messages in thread
From: Serge Petrenko @ 2018-08-17  6:59 UTC (permalink / raw)
  To: vdavydov.dev; +Cc: kyukhin, tarantool-patches, Serge Petrenko

Fix a bug where the crash_expected option led to a test hang.
---
 test-run | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/test-run b/test-run
index 0aa25ae8a..822eed379 160000
--- a/test-run
+++ b/test-run
@@ -1 +1 @@
-Subproject commit 0aa25ae8a9d4af977b3c3478cba3ccdc4ef81d35
+Subproject commit 822eed379ce04edf7b0f586e1b2c061687a67e92
-- 
2.15.2 (Apple Git-101.1)

* [PATCH v2 2/2] replication: fix a failing assert in replica_on_applier_disconnect()
  2018-08-17  6:59 [PATCH v2 0/2] replication: fix a failing assert in replica_on_applier_disconnect() Serge Petrenko
  2018-08-17  6:59 ` [PATCH v2 1/2] Update test-run Serge Petrenko
@ 2018-08-17  6:59 ` Serge Petrenko
  1 sibling, 0 replies; 5+ messages in thread
From: Serge Petrenko @ 2018-08-17  6:59 UTC (permalink / raw)
  To: vdavydov.dev; +Cc: kyukhin, tarantool-patches, Serge Petrenko

The case where two applier errors happen one after another wasn't
handled in replica_on_applier_disconnect(), which led to occasional
test failures and crashes. Handle this case and add a regression
test.

Part of #3510
---
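Note: the counter bookkeeping this patch touches can be modelled
roughly as below. This is a simplified sketch, not the Tarantool
source. The state names and the two counters mirror the hunk in
src/box/replication.cc; the structs, the replica_enter_state() helper,
the reset of the state on disconnect, and the "return false" default
branch are assumptions made only for illustration.

```cpp
#include <cassert>

// Simplified model, not the Tarantool source: names mirror the patch,
// the structure around them is assumed for illustration.
enum applier_state {
	APPLIER_DISCONNECTED,
	APPLIER_LOADING,
	APPLIER_CONNECTED,
};

struct applier_counters {
	int connected = 0;
	int loading = 0;
};

struct replica_model {
	applier_state state = APPLIER_DISCONNECTED;
};

// Hypothetical helper: record that the applier entered a counted state.
static void
replica_enter_state(applier_counters &c, replica_model &r, applier_state s)
{
	assert(r.state == APPLIER_DISCONNECTED);
	if (s == APPLIER_CONNECTED)
		c.connected++;
	else if (s == APPLIER_LOADING)
		c.loading++;
	r.state = s;
}

// Model of the disconnect handler. Returns false for a state it cannot
// account for, where the real code would fall into its default branch.
static bool
replica_on_disconnect(applier_counters &c, replica_model &r)
{
	switch (r.state) {
	case APPLIER_CONNECTED:
		assert(c.connected > 0);
		c.connected--;
		break;
	case APPLIER_LOADING:
		// The case added by this patch: undo the "loading" accounting.
		assert(c.loading > 0);
		c.loading--;
		break;
	case APPLIER_DISCONNECTED:
		// Nothing was counted for this replica, nothing to undo.
		break;
	default:
		return false;
	}
	r.state = APPLIER_DISCONNECTED;
	return true;
}
```

In this model a replica that disconnects while loading gives back its
"loading" slot, and a second disconnect right after it is a no-op
because the state is already APPLIER_DISCONNECTED.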
 src/box/replication.cc         |  4 ++++
 test/replication/er_load.lua   | 25 +++++++++++++++++++++++++
 test/replication/er_load1.lua  |  1 +
 test/replication/er_load2.lua  |  1 +
 test/replication/misc.result   | 32 ++++++++++++++++++++++++++++++++
 test/replication/misc.test.lua | 12 ++++++++++++
 6 files changed, 75 insertions(+)
 create mode 100644 test/replication/er_load.lua
 create mode 120000 test/replication/er_load1.lua
 create mode 120000 test/replication/er_load2.lua

diff --git a/src/box/replication.cc b/src/box/replication.cc
index 4270911ef..c9283bd82 100644
--- a/src/box/replication.cc
+++ b/src/box/replication.cc
@@ -358,6 +358,10 @@ replica_on_applier_disconnect(struct replica *replica)
 		assert(replicaset.applier.connected > 0);
 		replicaset.applier.connected--;
 		break;
+	case APPLIER_LOADING:
+		assert(replicaset.applier.loading > 0);
+		replicaset.applier.loading--;
+		break;
 	case APPLIER_DISCONNECTED:
 		break;
 	default:
diff --git a/test/replication/er_load.lua b/test/replication/er_load.lua
new file mode 100644
index 000000000..0515b3cce
--- /dev/null
+++ b/test/replication/er_load.lua
@@ -0,0 +1,25 @@
+#!/usr/bin/env tarantool
+
+-- get instance id from filename (er_load1.lua => 1)
+local INSTANCE_ID = string.match(arg[0], '%d')
+
+local SOCKET_DIR =  require('fio').cwd()
+local function instance_uri(instance_id)
+    return SOCKET_DIR..'/er_load'..instance_id..'.sock'
+end
+
+require('console').listen(os.getenv('ADMIN'))
+
+box.cfg{
+    listen = instance_uri(INSTANCE_ID);
+    replication = {
+	instance_uri(INSTANCE_ID),
+	instance_uri(INSTANCE_ID % 2 + 1)
+    },
+    replication_timeout = 0.01,
+    read_only = INSTANCE_ID == '2'
+}
+box.once('bootstrap', function()
+    box.schema.user.grant('guest', 'replication')
+    box.space._cluster:delete(2)
+end)
diff --git a/test/replication/er_load1.lua b/test/replication/er_load1.lua
new file mode 120000
index 000000000..18f7ffa5a
--- /dev/null
+++ b/test/replication/er_load1.lua
@@ -0,0 +1 @@
+er_load.lua
\ No newline at end of file
diff --git a/test/replication/er_load2.lua b/test/replication/er_load2.lua
new file mode 120000
index 000000000..18f7ffa5a
--- /dev/null
+++ b/test/replication/er_load2.lua
@@ -0,0 +1 @@
+er_load.lua
\ No newline at end of file
diff --git a/test/replication/misc.result b/test/replication/misc.result
index 9df2a2c4b..16b0fc362 100644
--- a/test/replication/misc.result
+++ b/test/replication/misc.result
@@ -232,3 +232,35 @@ test_run:drop_cluster(SERVERS)
 box.schema.user.revoke('guest', 'replication')
 ---
 ...
+-- gh-3510 assertion failure in replica_on_applier_disconnect()
+test_run:cmd('create server er_load1 with script="replication/er_load1.lua"')
+---
+- true
+...
+test_run:cmd('create server er_load2 with script="replication/er_load2.lua"')
+---
+- true
+...
+test_run:cmd('start server er_load1 with wait=False, wait_load=False')
+---
+- true
+...
+-- instance er_load2 will fail with error ER_READONLY. this is ok.
+-- We only test here that er_load1 doesn't assert.
+test_run:cmd('start server er_load2 with wait=True, wait_load=True, crash_expected = True')
+---
+- false
+...
+test_run:cmd('stop server er_load1')
+---
+- true
+...
+-- er_load2 exits automatically.
+test_run:cmd('cleanup server er_load1')
+---
+- true
+...
+test_run:cmd('cleanup server er_load2')
+---
+- true
+...
diff --git a/test/replication/misc.test.lua b/test/replication/misc.test.lua
index 979c5d58c..852d7374b 100644
--- a/test/replication/misc.test.lua
+++ b/test/replication/misc.test.lua
@@ -91,3 +91,15 @@ test_run:cmd("switch default")
 test_run:drop_cluster(SERVERS)
 
 box.schema.user.revoke('guest', 'replication')
+
+-- gh-3510 assertion failure in replica_on_applier_disconnect()
+test_run:cmd('create server er_load1 with script="replication/er_load1.lua"')
+test_run:cmd('create server er_load2 with script="replication/er_load2.lua"')
+test_run:cmd('start server er_load1 with wait=False, wait_load=False')
+-- instance er_load2 will fail with error ER_READONLY. this is ok.
+-- We only test here that er_load1 doesn't assert.
+test_run:cmd('start server er_load2 with wait=True, wait_load=True, crash_expected = True')
+test_run:cmd('stop server er_load1')
+-- er_load2 exits automatically.
+test_run:cmd('cleanup server er_load1')
+test_run:cmd('cleanup server er_load2')
-- 
2.15.2 (Apple Git-101.1)

* Re: [tarantool-patches] [PATCH v2 1/2] Update test-run
  2018-08-17  6:59 ` [PATCH v2 1/2] Update test-run Serge Petrenko
@ 2018-08-17 11:41   ` Alexander Turenko
  2018-08-17 12:15     ` Vladimir Davydov
  0 siblings, 1 reply; 5+ messages in thread
From: Alexander Turenko @ 2018-08-17 11:41 UTC (permalink / raw)
  To: tarantool-patches; +Cc: vdavydov.dev, kyukhin, Serge Petrenko

Hi!

The comments on Sergey V.'s patch 'test: update test-run submodule'
apply here too. I think it would be convenient to push Sergey V.'s
patch first and rebase this patch on top of it.

WBR, Alexander Turenko.

On Fri, Aug 17, 2018 at 09:59:50AM +0300, Serge Petrenko wrote:
> Fix a bug where the crash_expected option led to a test hang.
> ---
>  test-run | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/test-run b/test-run
> index 0aa25ae8a..822eed379 160000
> --- a/test-run
> +++ b/test-run
> @@ -1 +1 @@
> -Subproject commit 0aa25ae8a9d4af977b3c3478cba3ccdc4ef81d35
> +Subproject commit 822eed379ce04edf7b0f586e1b2c061687a67e92
> -- 
> 2.15.2 (Apple Git-101.1)
> 
> 

* Re: [tarantool-patches] [PATCH v2 1/2] Update test-run
  2018-08-17 11:41   ` [tarantool-patches] " Alexander Turenko
@ 2018-08-17 12:15     ` Vladimir Davydov
  0 siblings, 0 replies; 5+ messages in thread
From: Vladimir Davydov @ 2018-08-17 12:15 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: tarantool-patches, kyukhin, Serge Petrenko

Sorry, I've already pushed this one to 1.10 (see reply to v2.1)

On Fri, Aug 17, 2018 at 02:41:59PM +0300, Alexander Turenko wrote:
> Hi!
> 
> The comments on Sergey V.'s patch 'test: update test-run submodule'
> apply here too. I think it would be convenient to push Sergey V.'s
> patch first and rebase this patch on top of it.
> 
> WBR, Alexander Turenko.
> 
> On Fri, Aug 17, 2018 at 09:59:50AM +0300, Serge Petrenko wrote:
> > Fix a bug where the crash_expected option led to a test hang.
> > ---
> >  test-run | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/test-run b/test-run
> > index 0aa25ae8a..822eed379 160000
> > --- a/test-run
> > +++ b/test-run
> > @@ -1 +1 @@
> > -Subproject commit 0aa25ae8a9d4af977b3c3478cba3ccdc4ef81d35
> > +Subproject commit 822eed379ce04edf7b0f586e1b2c061687a67e92
> > -- 
> > 2.15.2 (Apple Git-101.1)
> > 
> > 
