[Tarantool-patches] [PATCH 1/1] replication: auto reconnect if password is invalid

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Fri Oct 18 00:16:37 MSK 2019


Before this patch there was a race in replication
password configuration: a replica could connect to a
master with a custom password before that password was
actually set on the master. The replica treated the
resulting ER_PASSWORD_MISMATCH error as critical and
exited.

But in fact it is not critical. A replica can already
withstand the absence of a user and keeps reconnecting.
A wrong password arises from the same problem of
non-atomic configuration and is fixed the same way:
keep retrying the connection while the password is
wrong.

Closes #4550
---
Branch: https://github.com/tarantool/tarantool/tree/gerold103/gh-4550-replication-password-cfg
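
For reviewers, here is a minimal standalone sketch of the retry
decision this patch extends. It is not the applier code: the
Outcome enum, is_retryable() and connect_with_retries() are
hypothetical names that only mirror the error codes listed in the
diff below, and the real applier sleeps for
replication_reconnect_interval() between attempts instead of
walking a scripted list.

```cpp
#include <vector>

// Hypothetical stand-ins for the result of one connection attempt.
// The error values mirror ER_CFG, ER_ACCESS_DENIED, ER_NO_SUCH_USER
// and ER_PASSWORD_MISMATCH; Fatal stands for any other error.
enum class Outcome {
    Ok,
    Cfg,
    AccessDenied,
    NoSuchUser,
    PasswordMismatch,
    Fatal
};

// Errors that may only mean the master is not fully configured yet.
// PasswordMismatch is the case this patch adds to the list.
inline bool is_retryable(Outcome o) {
    switch (o) {
    case Outcome::Cfg:
    case Outcome::AccessDenied:
    case Outcome::NoSuchUser:
    case Outcome::PasswordMismatch:
        return true;
    default:
        return false;
    }
}

// Walk a scripted list of attempt outcomes.
// Returns the 1-based number of the attempt that connected,
// -1 if a non-retryable error was hit,
// 0 if the script ended while the replica was still retrying.
inline int connect_with_retries(const std::vector<Outcome> &script) {
    int attempt = 0;
    for (Outcome o : script) {
        ++attempt;
        if (o == Outcome::Ok)
            return attempt;
        if (!is_retryable(o))
            return -1;
        // A real applier would wait for the reconnect interval here.
    }
    return 0;
}
```

Under these assumptions, the scripted sequence NoSuchUser,
PasswordMismatch, AccessDenied, Ok roughly follows the updated
test: the user is missing, then exists without the expected
password, then lacks the replication grant, and finally the
connection succeeds on the fourth attempt.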

 src/box/applier.cc             |  4 +++-
 test/replication/misc.result   | 16 +++++++++++++---
 test/replication/misc.test.lua | 12 +++++++++---
 3 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/src/box/applier.cc b/src/box/applier.cc
index 6239fcfd3..7d4a670d7 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -101,6 +101,7 @@ applier_log_error(struct applier *applier, struct error *e)
 	case ER_NO_SUCH_USER:
 	case ER_SYSTEM:
 	case ER_UNKNOWN_REPLICA:
+	case ER_PASSWORD_MISMATCH:
 		say_info("will retry every %.2lf second",
 			 replication_reconnect_interval());
 		break;
@@ -979,7 +980,8 @@ applier_f(va_list ap)
 				goto reconnect;
 			} else if (e->errcode() == ER_CFG ||
 				   e->errcode() == ER_ACCESS_DENIED ||
-				   e->errcode() == ER_NO_SUCH_USER) {
+				   e->errcode() == ER_NO_SUCH_USER ||
+				   e->errcode() == ER_PASSWORD_MISMATCH) {
 				/* Invalid configuration */
 				applier_log_error(applier, e);
 				applier_disconnect(applier, APPLIER_LOADING);
diff --git a/test/replication/misc.result b/test/replication/misc.result
index f7098aac8..b63d72846 100644
--- a/test/replication/misc.result
+++ b/test/replication/misc.result
@@ -374,8 +374,10 @@ test_run:cleanup_cluster()
 ---
 ...
 --
--- Test case for gh-3637. Before the fix replica would exit with
--- an error. Now check that we don't hang and successfully connect.
+-- Test case for gh-3637, gh-4550. Before the fix replica would
+-- exit with an error if a user does not exist or a password is
+-- incorrect. Now check that we don't hang/panic and successfully
+-- connect.
 --
 fiber = require('fiber')
 ---
@@ -392,7 +394,15 @@ test_run:cmd("start server replica_auth with wait=False, wait_load=False, args='
 fiber.sleep(0.1)
 ---
 ...
-box.schema.user.create('cluster', {password='pass'})
+box.schema.user.create('cluster')
+---
+...
+-- The user is created. Let the replica fail auth request due to
+-- a wrong password.
+fiber.sleep(0.1)
+---
+...
+box.schema.user.passwd('cluster', 'pass')
 ---
 ...
 box.schema.user.grant('cluster', 'replication')
diff --git a/test/replication/misc.test.lua b/test/replication/misc.test.lua
index c4ddbdb47..c454a0992 100644
--- a/test/replication/misc.test.lua
+++ b/test/replication/misc.test.lua
@@ -153,15 +153,21 @@ test_run:cmd('delete server er_load2')
 test_run:cleanup_cluster()
 
 --
--- Test case for gh-3637. Before the fix replica would exit with
--- an error. Now check that we don't hang and successfully connect.
+-- Test case for gh-3637, gh-4550. Before the fix replica would
+-- exit with an error if a user does not exist or a password is
+-- incorrect. Now check that we don't hang/panic and successfully
+-- connect.
 --
 fiber = require('fiber')
 test_run:cmd("create server replica_auth with rpl_master=default, script='replication/replica_auth.lua'")
 test_run:cmd("start server replica_auth with wait=False, wait_load=False, args='cluster:pass 0.05'")
 -- Wait a bit to make sure replica waits till user is created.
 fiber.sleep(0.1)
-box.schema.user.create('cluster', {password='pass'})
+box.schema.user.create('cluster')
+-- The user is created. Let the replica fail auth request due to
+-- a wrong password.
+fiber.sleep(0.1)
+box.schema.user.passwd('cluster', 'pass')
 box.schema.user.grant('cluster', 'replication')
 
 while box.info.replication[2] == nil do fiber.sleep(0.01) end
-- 
2.21.0 (Apple Git-122)


