From: Sergey Petrenko <sergepetrenko@tarantool.org>
To: tarantool-patches@freelists.org
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>,
	Kirill Yukhin <kyukhin@tarantool.org>
Subject: Re: [tarantool-patches] [PATCH] replication: fix a failing assert in replica_on_applier_disconnect()
Date: Mon, 6 Aug 2018 17:14:05 +0300
Message-ID: <81999702-C603-423E-92C9-199CE605FED4@tarantool.org>
In-Reply-To: <20180803155745.tmndjr52n6igtdno@tarantool.org>

Hi!

> On 3 Aug 2018, at 18:57, Kirill Yukhin <kyukhin@tarantool.org> wrote:
>
> Hello Serge,
> On 03 Aug 08:59, Serge Petrenko wrote:
>> One possible case when two applier errors happen one after another
>> wasn't handled in replica_on_applier_disconnect(), which led to
>> occasional test failures and crashes. Handle this case.
>>
>> Part of #3510
>> ---
>> This patch fixes an assertion fail, submitted by @locker in issue comments.
>> I wasn't able to reproduce 2 failures reported in the issue itself, and asked
>> for comments, but got no answer. I also couldn't fix the latter 2
>> failures just by looking at code.
>>
>> https://github.com/tarantool/tarantool/tree/sergepetrenko/gh-3510-replication-asserts-fail
>> https://github.com/tarantool/tarantool/issues/3510
> Could you pls prepare a regression test as well?

Added a test. It fails with assertion(0) before my patch and passes with my patch.
Here's the new diff:

 src/box/replication.cc         |  4 ++++
 test/replication/er_load.lua   | 23 +++++++++++++++++++++++
 test/replication/er_load1.lua  |  1 +
 test/replication/er_load2.lua  |  1 +
 test/replication/misc.result   | 39 +++++++++++++++++++++++++++++++++++++++
 test/replication/misc.test.lua | 12 ++++++++++++
 6 files changed, 80 insertions(+)
 create mode 100644 test/replication/er_load.lua
 create mode 120000 test/replication/er_load1.lua
 create mode 120000 test/replication/er_load2.lua

diff --git a/src/box/replication.cc b/src/box/replication.cc
index 26bbbe32a..0efbd7c0e 100644
--- a/src/box/replication.cc
+++ b/src/box/replication.cc
@@ -350,6 +350,10 @@ replica_on_applier_disconnect(struct replica *replica)
 		assert(replicaset.applier.connected > 0);
 		replicaset.applier.connected--;
 		break;
+	case APPLIER_LOADING:
+		assert(replicaset.applier.loading > 0);
+		replicaset.applier.loading--;
+		break;
 	case APPLIER_DISCONNECTED:
 		break;
 	default:
diff --git a/test/replication/er_load.lua b/test/replication/er_load.lua
new file mode 100644
index 000000000..0db8c9cfa
--- /dev/null
+++ b/test/replication/er_load.lua
@@ -0,0 +1,23 @@
+#!/usr/bin/env tarantool
+
+-- get instance id from filename (er_load1.lua => 1)
+local INSTANCE_ID = string.match(arg[0], '%d')
+
+local SOCKET_DIR = require('fio').cwd()
+local function instance_uri(instance_id)
+    return SOCKET_DIR..'/er_load'..instance_id..'.sock'
+end
+
+require('console').listen(os.getenv('ADMIN'))
+
+box.cfg{
+    listen = instance_uri(INSTANCE_ID);
+    replication = {
+        instance_uri(INSTANCE_ID),
+        'noone:pass@'..instance_uri(INSTANCE_ID % 2 + 1)
+    }
+}
+
+box.once("leader", function()
+    box.schema.user.grant('guest', 'replication')
+end)
diff --git a/test/replication/er_load1.lua b/test/replication/er_load1.lua
new file mode 120000
index 000000000..18f7ffa5a
--- /dev/null
+++ b/test/replication/er_load1.lua
@@ -0,0 +1 @@
+er_load.lua
\ No newline at end of file
diff --git a/test/replication/er_load2.lua b/test/replication/er_load2.lua
new file mode 120000
index 000000000..18f7ffa5a
--- /dev/null
+++ b/test/replication/er_load2.lua
@@ -0,0 +1 @@
+er_load.lua
\ No newline at end of file
diff --git a/test/replication/misc.result b/test/replication/misc.result
index ff0dbf549..35b51085f 100644
--- a/test/replication/misc.result
+++ b/test/replication/misc.result
@@ -208,3 +208,42 @@ test_run:drop_cluster(SERVERS)
 box.schema.user.revoke('guest', 'replication')
 ---
 ...
+-- gh-3510 assertion failure in replica_on_applier_disconnect()
+test_run:cmd('create server er_load1 with script="replication/er_load1.lua"')
+---
+- true
+...
+test_run:cmd('create server er_load2 with script="replication/er_load2.lua"')
+---
+- true
+...
+test_run:cmd('start server er_load1 with wait=False, wait_load=False')
+---
+- true
+...
+test_run:cmd('start server er_load2 with wait=False, wait_load=False')
+---
+- true
+...
+require('fiber').sleep(0.5)
+---
+...
+test_run:cmd('stop server er_load1')
+---
+- true
+...
+require('fiber').sleep(1)
+---
+...
+test_run:cmd('stop server er_load2')
+---
+- true
+...
+test_run:cmd('cleanup server er_load1')
+---
+- true
+...
+test_run:cmd('cleanup server er_load2')
+---
+- true
+...
diff --git a/test/replication/misc.test.lua b/test/replication/misc.test.lua
index c05e52165..27c1a4821 100644
--- a/test/replication/misc.test.lua
+++ b/test/replication/misc.test.lua
@@ -81,3 +81,15 @@ test_run:cmd("switch default")
 test_run:drop_cluster(SERVERS)
 box.schema.user.revoke('guest', 'replication')
+
+-- gh-3510 assertion failure in replica_on_applier_disconnect()
+test_run:cmd('create server er_load1 with script="replication/er_load1.lua"')
+test_run:cmd('create server er_load2 with script="replication/er_load2.lua"')
+test_run:cmd('start server er_load1 with wait=False, wait_load=False')
+test_run:cmd('start server er_load2 with wait=False, wait_load=False')
+require('fiber').sleep(0.5)
+test_run:cmd('stop server er_load1')
+require('fiber').sleep(1)
+test_run:cmd('stop server er_load2')
+test_run:cmd('cleanup server er_load1')
+test_run:cmd('cleanup server er_load2')
-- 
2.15.2 (Apple Git-101.1)
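
For readers who want the bookkeeping in isolation, here is a rough, self-contained sketch of the idea behind the replication.cc hunk: on disconnect, undo the counter for whichever state the replica was last counted in, and treat a second disconnect in a row as a no-op. This is a simplified model, not the Tarantool source. The names struct applier_counters, struct replica_model, model_enter() and model_on_disconnect() are hypothetical, and the enum lists only the three states that appear in the diff.

```c
#include <assert.h>

/* Hypothetical, reduced set of applier states (only those in the diff). */
enum applier_state {
	APPLIER_DISCONNECTED,
	APPLIER_LOADING,
	APPLIER_CONNECTED,
};

/* Stand-in for the per-state counters (assumed shape, for illustration). */
struct applier_counters {
	int connected;
	int loading;
};

/* Hypothetical replica: remembers the state it is currently counted in. */
struct replica_model {
	enum applier_state counted_state;
};

/*
 * Count the replica under a new state.
 * Assumes the replica is not counted anywhere at the moment.
 */
static void
model_enter(struct applier_counters *c, struct replica_model *r,
	    enum applier_state state)
{
	if (state == APPLIER_CONNECTED)
		c->connected++;
	else if (state == APPLIER_LOADING)
		c->loading++;
	r->counted_state = state;
}

/* Undo the bookkeeping for the state the replica was counted in. */
static void
model_on_disconnect(struct applier_counters *c, struct replica_model *r)
{
	switch (r->counted_state) {
	case APPLIER_CONNECTED:
		assert(c->connected > 0);
		c->connected--;
		break;
	case APPLIER_LOADING:
		/* The case the patch adds: an error while still loading. */
		assert(c->loading > 0);
		c->loading--;
		break;
	case APPLIER_DISCONNECTED:
		/* Second error in a row: nothing left to undo. */
		break;
	default:
		assert(0);
	}
	r->counted_state = APPLIER_DISCONNECTED;
}
```

In this model, without the APPLIER_LOADING branch a replica that fails while loading would fall through to the default branch and hit assert(0), which matches the failure the new test reproduces before the patch.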