From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Date: Wed, 29 Aug 2018 13:00:57 +0300 From: Vladimir Davydov Subject: Re: [tarantool-patches] [PATCH] box: fix assertion with duplication in repl. source Message-ID: <20180829100057.fpc6xoac4diytjtq@esperanza> References: <20180828162226.53550-1-krishtal.olja@gmail.com> <20180829090325.6drp4hqojae67prf@esperanza> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: To: Olga Krishtal Cc: tarantool-patches@freelists.org List-ID: On Wed, Aug 29, 2018 at 12:36:33PM +0300, Olga Krishtal wrote: > > > diff --git a/src/box/box.cc b/src/box/box.cc > > > index 8d7454d1f..3a571ae3c 100644 > > > --- a/src/box/box.cc > > > +++ b/src/box/box.cc > > > @@ -369,9 +369,23 @@ static void > > > box_check_replication(void) > > > { > > > int count = cfg_getarr_size("replication"); > > > + char *repl[count-1]; > > > for (int i = 0; i < count; i++) { > > > const char *source = cfg_getarr_elem("replication", i); > > > box_check_uri(source, "replication"); > > > + repl[i] = strdup(source); > > > + if (repl[i] == NULL) { > > > + tnt_raise(OutOfMemory, sizeof(*source), "source", > > "malloc"); > > > + } > > > + for (int j = i; j >= 1; j--) { > > > + if (strcmp(repl[i], repl[j-1]) == 0) { > > > + tnt_raise(ClientError, ER_CFG, > > "replication", > > > + "duplication of replication > > source"); > > > + } > > > + } > > > + } > > > + for (int i = 0; i < count; i++) { > > > + free(repl[i]); > > > > This is totally wrong, because different URLs can point to the same > > instance, e.g. > > > > Instance 1: box.cfg{listen = 12345} > > > > Instance 2: box.cfg{replication = {12345, 'localhost:12345'}} > > > > Crash. > > > > All you're supposed to do is fix the checks in replication.cc > > > > > I am a bit lost. We have to raise an exception when have duplication uri, > or or just skip duplication? Before replication_connect_quorum was introduced, we checked for duplicate connections when configuring replication and raised exception on error (see replicaset_update). Now, due to replication_connect_quorum, we may be unable to detect duplicate connections, because we can fail to connect to some masters within replication_connect_timeout, before box.cfg{} returns. So we allow the configuration anyway and print a warning if later on, when a master is connected, we find it to be a duplicate. I guess we should preserve this behavior and just fix the crash.