* [PATCH v2 0/2] replication: fix a failing assert in replica_on_applier_disconnect()
From: Serge Petrenko @ 2018-08-17 6:59 UTC
To: vdavydov.dev; +Cc: kyukhin, tarantool-patches, Serge Petrenko

https://github.com/tarantool/tarantool/issues/3510
https://github.com/tarantool/tarantool/tree/sergepetrenko/gh-3510-replication-asserts-fail

Changes in v2:
  - update test run with a fix to prevent test hang.

Serge Petrenko (2):
  Update test-run
  replication: fix a failing assert in replica_on_applier_disconnect()

 src/box/replication.cc         |  4 ++++
 test-run                       |  2 +-
 test/replication/er_load.lua   | 25 +++++++++++++++++++++++++
 test/replication/er_load1.lua  |  1 +
 test/replication/er_load2.lua  |  1 +
 test/replication/misc.result   | 32 ++++++++++++++++++++++++++++++++
 test/replication/misc.test.lua | 12 ++++++++++++
 7 files changed, 76 insertions(+), 1 deletion(-)
 create mode 100644 test/replication/er_load.lua
 create mode 120000 test/replication/er_load1.lua
 create mode 120000 test/replication/er_load2.lua

-- 
2.15.2 (Apple Git-101.1)
* [PATCH v2 1/2] Update test-run
From: Serge Petrenko @ 2018-08-17 6:59 UTC
To: vdavydov.dev; +Cc: kyukhin, tarantool-patches, Serge Petrenko

Fix a bug where the crash_expected option led to a test hang.
---
 test-run | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/test-run b/test-run
index 0aa25ae8a..822eed379 160000
--- a/test-run
+++ b/test-run
@@ -1 +1 @@
-Subproject commit 0aa25ae8a9d4af977b3c3478cba3ccdc4ef81d35
+Subproject commit 822eed379ce04edf7b0f586e1b2c061687a67e92
-- 
2.15.2 (Apple Git-101.1)
* Re: [tarantool-patches] [PATCH v2 1/2] Update test-run
From: Alexander Turenko @ 2018-08-17 11:41 UTC
To: tarantool-patches; +Cc: vdavydov.dev, kyukhin, Serge Petrenko

Hi!

Comments on Sergey V.'s patch 'test: update test-run submodule' are
relevant here too. But I think it would be convenient to push
Sergey V.'s patch first and rebase this patch on top of it.

WBR, Alexander Turenko.

On Fri, Aug 17, 2018 at 09:59:50AM +0300, Serge Petrenko wrote:
> Fix a bug where the crash_expected option led to a test hang.
> ---
>  test-run | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/test-run b/test-run
> index 0aa25ae8a..822eed379 160000
> --- a/test-run
> +++ b/test-run
> @@ -1 +1 @@
> -Subproject commit 0aa25ae8a9d4af977b3c3478cba3ccdc4ef81d35
> +Subproject commit 822eed379ce04edf7b0f586e1b2c061687a67e92
> --
> 2.15.2 (Apple Git-101.1)
* Re: [tarantool-patches] [PATCH v2 1/2] Update test-run
From: Vladimir Davydov @ 2018-08-17 12:15 UTC
To: Alexander Turenko; +Cc: tarantool-patches, kyukhin, Serge Petrenko

Sorry, I've already pushed this one to 1.10 (see reply to v2.1)

On Fri, Aug 17, 2018 at 02:41:59PM +0300, Alexander Turenko wrote:
> Hi!
>
> Comments on Sergey V.'s patch 'test: update test-run submodule' are
> relevant here too. But I think it would be convenient to push
> Sergey V.'s patch first and rebase this patch on top of it.
>
> WBR, Alexander Turenko.
>
> On Fri, Aug 17, 2018 at 09:59:50AM +0300, Serge Petrenko wrote:
> > Fix a bug where the crash_expected option led to a test hang.
> > ---
> >  test-run | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/test-run b/test-run
> > index 0aa25ae8a..822eed379 160000
> > --- a/test-run
> > +++ b/test-run
> > @@ -1 +1 @@
> > -Subproject commit 0aa25ae8a9d4af977b3c3478cba3ccdc4ef81d35
> > +Subproject commit 822eed379ce04edf7b0f586e1b2c061687a67e92
> > --
> > 2.15.2 (Apple Git-101.1)
* [PATCH v2 2/2] replication: fix a failing assert in replica_on_applier_disconnect()
From: Serge Petrenko @ 2018-08-17 6:59 UTC
To: vdavydov.dev; +Cc: kyukhin, tarantool-patches, Serge Petrenko

One possible case, when two applier errors happen one after another,
wasn't handled in replica_on_applier_disconnect(), which led to
occasional test failures and crashes. Handle this case and add a
regression test.

Part of #3510
---
 src/box/replication.cc         |  4 ++++
 test/replication/er_load.lua   | 25 +++++++++++++++++++++++++
 test/replication/er_load1.lua  |  1 +
 test/replication/er_load2.lua  |  1 +
 test/replication/misc.result   | 32 ++++++++++++++++++++++++++++++++
 test/replication/misc.test.lua | 12 ++++++++++++
 6 files changed, 75 insertions(+)
 create mode 100644 test/replication/er_load.lua
 create mode 120000 test/replication/er_load1.lua
 create mode 120000 test/replication/er_load2.lua

diff --git a/src/box/replication.cc b/src/box/replication.cc
index 4270911ef..c9283bd82 100644
--- a/src/box/replication.cc
+++ b/src/box/replication.cc
@@ -358,6 +358,10 @@ replica_on_applier_disconnect(struct replica *replica)
 		assert(replicaset.applier.connected > 0);
 		replicaset.applier.connected--;
 		break;
+	case APPLIER_LOADING:
+		assert(replicaset.applier.loading > 0);
+		replicaset.applier.loading--;
+		break;
 	case APPLIER_DISCONNECTED:
 		break;
 	default:
diff --git a/test/replication/er_load.lua b/test/replication/er_load.lua
new file mode 100644
index 000000000..0515b3cce
--- /dev/null
+++ b/test/replication/er_load.lua
@@ -0,0 +1,25 @@
+#!/usr/bin/env tarantool
+
+-- get instance id from filename (er_load1.lua => 1)
+local INSTANCE_ID = string.match(arg[0], '%d')
+
+local SOCKET_DIR = require('fio').cwd()
+local function instance_uri(instance_id)
+    return SOCKET_DIR..'/er_load'..instance_id..'.sock'
+end
+
+require('console').listen(os.getenv('ADMIN'))
+
+box.cfg{
+    listen = instance_uri(INSTANCE_ID);
+    replication = {
+        instance_uri(INSTANCE_ID),
+        instance_uri(INSTANCE_ID % 2 + 1)
+    },
+    replication_timeout = 0.01,
+    read_only = INSTANCE_ID == '2'
+}
+box.once('bootstrap', function()
+    box.schema.user.grant('guest', 'replication')
+    box.space._cluster:delete(2)
+end)
diff --git a/test/replication/er_load1.lua b/test/replication/er_load1.lua
new file mode 120000
index 000000000..18f7ffa5a
--- /dev/null
+++ b/test/replication/er_load1.lua
@@ -0,0 +1 @@
+er_load.lua
\ No newline at end of file
diff --git a/test/replication/er_load2.lua b/test/replication/er_load2.lua
new file mode 120000
index 000000000..18f7ffa5a
--- /dev/null
+++ b/test/replication/er_load2.lua
@@ -0,0 +1 @@
+er_load.lua
\ No newline at end of file
diff --git a/test/replication/misc.result b/test/replication/misc.result
index 9df2a2c4b..16b0fc362 100644
--- a/test/replication/misc.result
+++ b/test/replication/misc.result
@@ -232,3 +232,35 @@ test_run:drop_cluster(SERVERS)
 box.schema.user.revoke('guest', 'replication')
 ---
 ...
+-- gh-3510 assertion failure in replica_on_applier_disconnect()
+test_run:cmd('create server er_load1 with script="replication/er_load1.lua"')
+---
+- true
+...
+test_run:cmd('create server er_load2 with script="replication/er_load2.lua"')
+---
+- true
+...
+test_run:cmd('start server er_load1 with wait=False, wait_load=False')
+---
+- true
+...
+-- instance er_load2 will fail with error ER_READONLY. this is ok.
+-- We only test here that er_load1 doesn't assert.
+test_run:cmd('start server er_load2 with wait=True, wait_load=True, crash_expected = True')
+---
+- false
+...
+test_run:cmd('stop server er_load1')
+---
+- true
+...
+-- er_load2 exits automatically.
+test_run:cmd('cleanup server er_load1')
+---
+- true
+...
+test_run:cmd('cleanup server er_load2')
+---
+- true
+...
diff --git a/test/replication/misc.test.lua b/test/replication/misc.test.lua
index 979c5d58c..852d7374b 100644
--- a/test/replication/misc.test.lua
+++ b/test/replication/misc.test.lua
@@ -91,3 +91,15 @@ test_run:cmd("switch default")
 
 test_run:drop_cluster(SERVERS)
 box.schema.user.revoke('guest', 'replication')
+
+-- gh-3510 assertion failure in replica_on_applier_disconnect()
+test_run:cmd('create server er_load1 with script="replication/er_load1.lua"')
+test_run:cmd('create server er_load2 with script="replication/er_load2.lua"')
+test_run:cmd('start server er_load1 with wait=False, wait_load=False')
+-- instance er_load2 will fail with error ER_READONLY. this is ok.
+-- We only test here that er_load1 doesn't assert.
+test_run:cmd('start server er_load2 with wait=True, wait_load=True, crash_expected = True')
+test_run:cmd('stop server er_load1')
+-- er_load2 exits automatically.
+test_run:cmd('cleanup server er_load1')
+test_run:cmd('cleanup server er_load2')
-- 
2.15.2 (Apple Git-101.1)
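Editorial sketch (not part of the patch): the replication.cc hunk in patch 2/2 can be illustrated with a small standalone model. It only shows the idea of the fix, namely that a replica whose applier disconnects while still in the loading state should give back its slot in the `loading` counter instead of reaching the `default:` branch. The types `applier_counters` and `replica_model`, the function `model_on_applier_disconnect()`, and the final reset of the state to `APPLIER_DISCONNECTED` are hypothetical simplifications; the real Tarantool structures and the rest of the real function are not shown in this thread and may well differ.

```cpp
#include <cassert>

// Hypothetical, reduced set of applier states; the real enum in
// Tarantool has more members than the three named in the hunk.
enum applier_state {
	APPLIER_DISCONNECTED,
	APPLIER_LOADING,
	APPLIER_CONNECTED,
};

// Hypothetical stand-in for the replicaset.applier counters.
struct applier_counters {
	int connected = 0;
	int loading = 0;
};

// Hypothetical stand-in for struct replica: only the state that the
// switch in the patch dispatches on.
struct replica_model {
	applier_state applier_sync_state = APPLIER_DISCONNECTED;
};

// Mirrors the shape of the patched switch: every state that holds a
// slot in a counter releases it on disconnect.
void
model_on_applier_disconnect(applier_counters &counters,
			    replica_model &replica)
{
	switch (replica.applier_sync_state) {
	case APPLIER_CONNECTED:
		assert(counters.connected > 0);
		counters.connected--;
		break;
	case APPLIER_LOADING:
		// The case added by the patch.
		assert(counters.loading > 0);
		counters.loading--;
		break;
	case APPLIER_DISCONNECTED:
		break;
	default:
		assert(false && "unexpected applier state");
	}
	// Assumption of this model only: the replica is marked
	// disconnected afterwards, so a repeated call is a no-op.
	replica.applier_sync_state = APPLIER_DISCONNECTED;
}
```

In this model, a replica in `APPLIER_LOADING` with `loading == 1` ends up with `loading == 0` after one call, and a second call in a row leaves both counters untouched, which is the "two errors one after another" situation the commit message describes.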