[tarantool-patches] [PATCH] box: fix assertion with duplication in repl. source

Vladimir Davydov vdavydov.dev at gmail.com
Wed Aug 29 13:00:57 MSK 2018


On Wed, Aug 29, 2018 at 12:36:33PM +0300, Olga Krishtal wrote:
> > > diff --git a/src/box/box.cc b/src/box/box.cc
> > > index 8d7454d1f..3a571ae3c 100644
> > > --- a/src/box/box.cc
> > > +++ b/src/box/box.cc
> > > @@ -369,9 +369,23 @@ static void
> > >  box_check_replication(void)
> > >  {
> > >       int count = cfg_getarr_size("replication");
> > > +     char *repl[count-1];
> > >       for (int i = 0; i < count; i++) {
> > >               const char *source = cfg_getarr_elem("replication", i);
> > >               box_check_uri(source, "replication");
> > > +             repl[i] = strdup(source);
> > > +             if (repl[i] == NULL) {
> > > +                     tnt_raise(OutOfMemory, sizeof(*source), "source", "malloc");
> > > +             }
> > > +             for (int j = i; j >= 1; j--) {
> > > +                     if (strcmp(repl[i], repl[j-1]) == 0) {
> > > +                             tnt_raise(ClientError, ER_CFG, "replication",
> > > +                                       "duplication of replication source");
> > > +                     }
> > > +             }
> > > +     }
> > > +     for (int i = 0; i < count; i++) {
> > > +             free(repl[i]);
> >
> > This is totally wrong, because different URLs can point to the same
> > instance, e.g.
> >
> > Instance 1: box.cfg{listen = 12345}
> >
> > Instance 2: box.cfg{replication = {12345, 'localhost:12345'}}
> >
> > Crash.
> >
> > All you're supposed to do is fix the checks in replication.cc
> >
> 
> 
> I am a bit lost. Do we have to raise an exception when there is a duplicate URI,
> or just skip the duplicate?

Before replication_connect_quorum was introduced, we checked for
duplicate connections when configuring replication and raised an
exception on error (see replicaset_update).

Now, due to replication_connect_quorum, we may be unable to detect
duplicate connections, because we can fail to connect to some masters
within replication_connect_timeout, before box.cfg{} returns. So we
allow the configuration anyway and print a warning if later on, when a
master is connected, we find it to be a duplicate.

I guess we should preserve this behavior and just fix the crash.
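
To illustrate the idea, here is a minimal standalone sketch, not the
actual replication.cc code. It assumes duplicates are detected by the
instance UUID a master reports once it is connected, rather than by
comparing URI strings. Source, ReplicaSet, on_connected and the UUID
strings are all hypothetical names made up for this example.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one configured replication source.
struct Source {
	std::string uri;           // URI as written in box.cfg{replication = ...}
	std::string instance_uuid; // known only after the connection is established
};

// Hypothetical registry of connected masters, keyed by instance UUID.
struct ReplicaSet {
	std::map<std::string, std::string> uri_by_uuid;
	std::vector<std::string> warnings; // stands in for the log

	// Called when a source has connected and reported its UUID.
	// Returns true if the source was registered and false if it is
	// a duplicate. A duplicate is only warned about, never fatal.
	bool on_connected(const Source &src)
	{
		auto res = uri_by_uuid.emplace(src.instance_uuid, src.uri);
		if (res.second)
			return true;
		warnings.push_back("duplicate connection to " +
				   src.instance_uuid + " via " + src.uri +
				   " (already connected via " +
				   res.first->second + ")");
		return false;
	}
};
```

With this shape, 12345 and 'localhost:12345' differ as strings but
resolve to the same UUID, so the second connection is reported as a
duplicate with a warning and the configuration itself is still accepted.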


