From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lf1-f42.google.com (mail-lf1-f42.google.com [209.85.167.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 3FC5545C304 for ; Fri, 18 Dec 2020 10:25:32 +0300 (MSK) Received: by mail-lf1-f42.google.com with SMTP id h205so2954011lfd.5 for ; Thu, 17 Dec 2020 23:25:32 -0800 (PST) Date: Fri, 18 Dec 2020 10:25:29 +0300 From: Cyrill Gorcunov Message-ID: <20201218072529.GF14556@grain> References: <20201214113935.1040421-1-gorcunov@gmail.com> <20201214113935.1040421-3-gorcunov@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Subject: Re: [Tarantool-patches] [PATCH v4 2/3] cfg: support symbolic evaluation of replication_synchro_quorum List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Vladislav Shpilevoy Cc: Mons Anderson , tml On Fri, Dec 18, 2020 at 12:17:56AM +0100, Vladislav Shpilevoy wrote: > > > > +/** > > + * Evaluate replication syncro quorum number from a formula. > > + */ > > +static int > > +box_eval_replication_synchro_quorum(int nr_replicas) > > 1. I see you decided to never pass 0 here. Then I suggest to > add an assertion nr_replicas > 0 and < VCLOCK_MAX. OK, I will force push an update. > > + /* > > + * At least we should have 1 node to sync, the weird > > + * formulas such as N-2 do not guarantee quorums thus > > + * return an error. > > + * > > + * Since diag_set doesn't allow to show the valid range > > + * lets print a warning too. > > 2. Specifically for such cases we use tt_sprintf(). See usage > examples throughout the sources. Good idea, thanks! > > static int > > box_check_replication_synchro_quorum(void) > > { > > - int quorum = cfg_geti("replication_synchro_quorum"); > > + int quorum = 0; > > + > > + if (!cfg_isnumber("replication_synchro_quorum")) { > > + /* > > + * The formula uses symbolic name 'N' as > > + * a number of currently registered replicas. > > + * > > + * When we're in "checking" mode we should walk > > + * over all possible number of replicas to make > > + * sure the formula is correct. > > + * > > + * Note that currently VCLOCK_MAX is pretty small > > + * value but if we gonna increase this limit make > > + * sure that the cycle won't take too much time. > > + */ > > + for (int i = 1; i < VCLOCK_MAX; i++) { > > + quorum = box_eval_replication_synchro_quorum(i); > > + if (quorum < 0) > > + return -1; > > + } > > + /* > > + * Just to make clear the number we return here doesn't > > + * have any special meaning, only errors are matter. > > + * The real value is dynamic and will be updated on demand. > > + */ > > 3. Wtf? This function before your patch was supposed to return the > new quorum value. Like all cfg 'check()' functions. If it can't do that > now (but it can - just evaluate with the current number of replicas), > then the function must return only 0 or -1, like all 'binary result' > functions. Now it simply returns some random number in case the quorum > is an expression. As I pointer in the comment there is no special meaning in the return value, since we update the quorum on demand. Moreover once we manage to validate the formula we call box_update_replication_synchro_quorum which reevaluates the quorum with current number of replicas, thus to not make same work twice I will return 0 here. --- diff --git a/src/box/box.cc b/src/box/box.cc index 1b7643bfd..452e1a6ec 100644 --- a/src/box/box.cc +++ b/src/box/box.cc @@ -560,6 +560,8 @@ box_check_replication_sync_lag(void) static int box_eval_replication_synchro_quorum(int nr_replicas) { + assert(nr_replicas > 0 && nr_replicas < VCLOCK_MAX); + const char fmt[] = "local expr = [[%s]]\n" "local f, err = loadstring('return ('..expr..')')\n" @@ -621,13 +623,14 @@ box_eval_replication_synchro_quorum(int nr_replicas) * lets print a warning too. */ if (quorum <= 0 || quorum >= VCLOCK_MAX) { - say_warn("the replication_synchro_quorum formula " - "is evaluated to the quorum %d for replica " - "number %d, which is out of range [%d;%d]", - quorum, nr_replicas, 1, VCLOCK_MAX - 1); + const char *msg = + tt_sprintf("the formula is evaluated " + "to the quorum %d for replica " + "number %d, which is out of range " + "[%d;%d]", + quorum, nr_replicas, 1, VCLOCK_MAX - 1); diag_set(ClientError, ER_CFG, - "replication_synchro_quorum", - "evaluated value is out of range"); + "replication_synchro_quorum", msg); return -1; } @@ -662,7 +665,7 @@ box_check_replication_synchro_quorum(void) * have any special meaning, only errors are matter. * The real value is dynamic and will be updated on demand. */ - quorum = 1; + quorum = 0; } else { quorum = cfg_geti("replication_synchro_quorum"); }