[tarantool-patches] [sergepetrenko at tarantool.org: [server-dev] [PATCH] replication: disallow bootstrap of read-only masters]

Kirill Yukhin kyukhin at tarantool.org
Thu Sep 12 07:54:29 MSK 2019


Hello,

----- Forwarded message from Serge Petrenko <sergepetrenko at tarantool.org> -----

Date: Tue,  3 Sep 2019 20:06:41 +0300
From: Serge Petrenko <sergepetrenko at tarantool.org>
To: georgy at tarantool.org
Cc: server-dev at tarantool.org, Serge Petrenko <sergepetrenko at tarantool.org>
Subject: [server-dev] [PATCH] replication: disallow bootstrap of read-only masters
X-Mailer: git-send-email 2.20.1 (Apple Git-117)

In a configuration with several read-only and read-write instances, if
replication_connect_quorum is not greater than the number of read-only
instances and replication_connect_timeout is small enough, some
read-only instances may form a quorum and exceed the timeout before
any of the read-write instances start. In that case these read-only
instances will choose a read-only bootstrap leader from among
themselves.
This 'leader' will successfully bootstrap itself, but will fail to
register any of the other instances in the _cluster table, since it
isn't writable. As a result, some of the read-only instances will just
die, unable to bootstrap from a read-only bootstrap leader. When the
read-write instances are finally up, they will not choose the
bootstrapped read-only instance as the bootstrap leader and will
instead bootstrap from one of their read-write peers. The lone
read-only instance that managed to bootstrap itself then gets a
REPLICASET_UUID_MISMATCH error.
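
The race described above can be illustrated with a toy model. This is
only a sketch under simplifying assumptions, not Tarantool's actual
leader selection: the Instance type, the function names and the
tie-break by name are hypothetical, and the connect timeout is measured
from a single shared start time for all instances.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    name: str
    read_only: bool
    start_time: float  # seconds after the first instance starts


def visible_peers(instances, deadline):
    """Instances that are already up when the connect timeout expires."""
    return [i for i in instances if i.start_time <= deadline]


def pick_bootstrap_leader(peers):
    """Toy rule: prefer a writable peer, else fall back to a read-only one."""
    writable = [p for p in peers if not p.read_only]
    candidates = writable or peers
    return min(candidates, key=lambda p: p.name)


def simulate(instances, connect_quorum, connect_timeout):
    """Return the chosen leader, or None if the quorum was not reached."""
    peers = visible_peers(instances, connect_timeout)
    if len(peers) < connect_quorum:
        return None
    return pick_bootstrap_leader(peers)


cluster = [
    Instance("ro1", True, 0.0),
    Instance("ro2", True, 1.0),
    Instance("rw1", False, 10.0),  # the read-write instance starts late
]

# Short timeout: the two read-only instances form a quorum on their own.
leader_short = simulate(cluster, connect_quorum=2, connect_timeout=5.0)
# Longer timeout: the read-write instance is visible and is preferred.
leader_long = simulate(cluster, connect_quorum=2, connect_timeout=30.0)

print(leader_short.name, leader_short.read_only)  # ro1 True
print(leader_long.name, leader_long.read_only)    # rw1 False
```

With the short timeout the model ends up with a read-only leader, which
is the problematic case; with the longer timeout a read-write instance
is picked.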

The described situation is clearly not what the user hoped for, so
throw an error when a read-only instance tries to initiate the
bootstrap. The error gives the user a hint that
replication_connect_timeout should be increased.
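
The idea of the check can be sketched as follows. This is a minimal
illustration, not the code of the patch: BootstrapError, bootstrap()
and the error text are hypothetical names chosen for the example, and
leader_name being None stands for "no remote bootstrap leader found".

```python
class BootstrapError(Exception):
    """Hypothetical error type; not Tarantool's actual error code."""


def bootstrap(instance_name, read_only, leader_name):
    """Toy bootstrap step.

    leader_name is None when no remote leader was found, meaning the
    instance would have to initiate the bootstrap itself.
    """
    initiates = leader_name is None or leader_name == instance_name
    if initiates:
        if read_only:
            # Refuse instead of creating a replica set nobody can join.
            raise BootstrapError(
                f"{instance_name}: read-only instance cannot initiate "
                "bootstrap; consider increasing replication_connect_timeout"
            )
        return f"{instance_name} bootstraps a new replica set"
    return f"{instance_name} joins the replica set of {leader_name}"


rw_result = bootstrap("rw1", False, None)
ro_join_result = bootstrap("ro1", True, "rw1")

try:
    bootstrap("ro1", True, None)
    ro_error = None
except BootstrapError as exc:
    ro_error = str(exc)

print(rw_result)
print(ro_join_result)
print("error:", ro_error)
```

A read-only instance can still join a replica set bootstrapped by a
read-write leader; only initiating the bootstrap is rejected.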

Closes #4321

This patch was reviewed during ML downtime.
Checked in to 1.10, 2.1, 2.2 and master.

--
Regards, Kirill Yukhin