From: Vladimir Davydov <vdavydov.dev@gmail.com>
To: Serge Petrenko <sergepetrenko@tarantool.org>
Cc: tarantool-patches@freelists.org
Subject: Re: [PATCH v3] replication: fix exit with ER_NO_SUCH_USER during bootstrap
Date: Fri, 24 Aug 2018 19:32:09 +0300
Message-ID: <20180824163209.2agiibwwtowoizzw@esperanza>
In-Reply-To: <20180824115645.43531-1-sergepetrenko@tarantool.org>
Pushed to 1.9, here's the final version:
From 33950162f3e766d413567cd75aaa7e6c384831bd Mon Sep 17 00:00:00 2001
From: Serge Petrenko <sergepetrenko@tarantool.org>
Date: Thu, 23 Aug 2018 14:08:51 +0300
Subject: [PATCH] replication: fix exit with ER_NO_SUCH_USER during bootstrap
When replication is configured via a user created in a box.once()
function and box.once() takes more than replication_timeout seconds
to execute, appliers receive an ER_NO_SUCH_USER error, which they don't
handle. This leads to occasional test failures in the replication suite.
Fix this by handling the aforementioned case in applier_f() and add a
test case.
Closes #3637
diff --git a/src/box/applier.cc b/src/box/applier.cc
index b9f041d8..16a87389 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -596,7 +596,8 @@ applier_f(va_list ap)
applier_log_error(applier, e);
applier_disconnect(applier, APPLIER_LOADING);
goto reconnect;
- } else if (e->errcode() == ER_ACCESS_DENIED) {
+ } else if (e->errcode() == ER_ACCESS_DENIED ||
+ e->errcode() == ER_NO_SUCH_USER) {
/* Invalid configuration */
applier_log_error(applier, e);
applier_disconnect(applier, APPLIER_DISCONNECTED);
diff --git a/test/replication/misc.result b/test/replication/misc.result
index 9df2a2c4..76e7fd5e 100644
--- a/test/replication/misc.result
+++ b/test/replication/misc.result
@@ -232,3 +232,55 @@ test_run:drop_cluster(SERVERS)
box.schema.user.revoke('guest', 'replication')
---
...
+--
+-- Test case for gh-3637. Before the fix replica would exit with
+-- an error. Now check that we don't hang and successfully connect.
+--
+fiber = require('fiber')
+---
+...
+test_run:cleanup_cluster()
+---
+...
+test_run:cmd("create server replica_auth with rpl_master=default, script='replication/replica_auth.lua'")
+---
+- true
+...
+test_run:cmd("start server replica_auth with wait=False, wait_load=False, args='cluster:pass 0.05'")
+---
+- true
+...
+-- Wait a bit to make sure replica waits till user is created.
+fiber.sleep(0.1)
+---
+...
+box.schema.user.create('cluster', {password='pass'})
+---
+...
+box.schema.user.grant('cluster', 'replication')
+---
+...
+while box.info.replication[2] == nil do fiber.sleep(0.01) end
+---
+...
+vclock = test_run:get_vclock('default')
+---
+...
+_ = test_run:wait_vclock('replica_auth', vclock)
+---
+...
+test_run:cmd("stop server replica_auth")
+---
+- true
+...
+test_run:cmd("cleanup server replica_auth")
+---
+- true
+...
+test_run:cmd("delete server replica_auth")
+---
+- true
+...
+box.schema.user.drop('cluster')
+---
+...
diff --git a/test/replication/misc.test.lua b/test/replication/misc.test.lua
index 979c5d58..c60adf5a 100644
--- a/test/replication/misc.test.lua
+++ b/test/replication/misc.test.lua
@@ -91,3 +91,28 @@ test_run:cmd("switch default")
test_run:drop_cluster(SERVERS)
box.schema.user.revoke('guest', 'replication')
+
+--
+-- Test case for gh-3637. Before the fix replica would exit with
+-- an error. Now check that we don't hang and successfully connect.
+--
+fiber = require('fiber')
+
+test_run:cleanup_cluster()
+
+test_run:cmd("create server replica_auth with rpl_master=default, script='replication/replica_auth.lua'")
+test_run:cmd("start server replica_auth with wait=False, wait_load=False, args='cluster:pass 0.05'")
+-- Wait a bit to make sure replica waits till user is created.
+fiber.sleep(0.1)
+box.schema.user.create('cluster', {password='pass'})
+box.schema.user.grant('cluster', 'replication')
+
+while box.info.replication[2] == nil do fiber.sleep(0.01) end
+vclock = test_run:get_vclock('default')
+_ = test_run:wait_vclock('replica_auth', vclock)
+
+test_run:cmd("stop server replica_auth")
+test_run:cmd("cleanup server replica_auth")
+test_run:cmd("delete server replica_auth")
+
+box.schema.user.drop('cluster')
diff --git a/test/replication/replica_auth.lua b/test/replication/replica_auth.lua
new file mode 100644
index 00000000..22ba9146
--- /dev/null
+++ b/test/replication/replica_auth.lua
@@ -0,0 +1,14 @@
+#!/usr/bin/env tarantool
+
+local USER_PASS = arg[1]
+local TIMEOUT = arg[2] and tonumber(arg[2]) or 0.1
+local CON_TIMEOUT = arg[3] and tonumber(arg[3]) or 30.0
+
+require('console').listen(os.getenv('ADMIN'))
+
+box.cfg({
+ listen = os.getenv("LISTEN"),
+ replication = USER_PASS .. "@" .. os.getenv("MASTER"),
+ replication_timeout = TIMEOUT,
+ replication_connect_timeout = CON_TIMEOUT
+})
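
For readers skimming the applier.cc hunk, the change in behaviour can be modelled roughly as below. This is a minimal sketch, not the Tarantool implementation: the enum values, the Action type and the classify() helper are hypothetical, and it assumes that the "Invalid configuration" branch ends in a reconnect attempt, which the excerpt above does not show.

```cpp
// Illustrative model only: the error names mirror the patch, but the enum
// values, Action and classify() are hypothetical, not Tarantool's sources.
enum ErrCode { ER_ACCESS_DENIED, ER_NO_SUCH_USER, ER_SOMETHING_ELSE };

enum class Action {
    LogAndRetry, // log the error, disconnect, then try to connect again
    Unhandled    // the error propagates out of the applier loop
};

// Classify an error code the way the changed condition does.
// `patched` selects the condition before (false) or after (true) the fix.
inline Action classify(ErrCode code, bool patched)
{
    if (code == ER_ACCESS_DENIED)
        return Action::LogAndRetry;
    // The fix adds ER_NO_SUCH_USER to the same branch.
    if (patched && code == ER_NO_SUCH_USER)
        return Action::LogAndRetry;
    return Action::Unhandled;
}
```

Under this model, classify(ER_NO_SUCH_USER, false) is Unhandled while classify(ER_NO_SUCH_USER, true) is LogAndRetry, which is the difference the new test in misc.test.lua exercises by creating the 'cluster' user only after the replica has started.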