[Tarantool-patches] [PATCH v4 2/3] cfg: support symbolic evaluation of replication_synchro_quorum
Vladislav Shpilevoy
v.shpilevoy at tarantool.org
Fri Dec 18 02:17:56 MSK 2020
Hi! Thanks for the patch!
> @TarantoolBot document
> Title: Support dynamic evaluation of synchronous replication quorum
>
> Setting `replication_synchro_quorum` option to an explicit integer
> value was introduced rather for simplicity sake mostly. For example
> if the cluster's size is not a constant value and new replicas are
> connected in dynamically then an administrator might need to increase
> the option by hands or by some other external tool.
>
> Instead one can use a dynamic evaluation of a quorum value via formal
> representation using symbol `N` as a current number of registered replicas
> in a cluster.
>
> For example the canonical definition for a quorum (ie majority
> of members in a set) of `N` replicas is `N/2+1`. For such configuration
> define
>
> ```
> box.cfg {replication_synchro_quorum = "N/2+1"}
> ```
>
> The formal statement allows to provide a flexible configuration but keep
> in mind that only canonical quorum (and bigger values, say `N` for all
> replicas) guarantees data reliability and various weird forms such as
> `N/3+1` while allowed may lead to unexpected results.
Now the description is good.
See 3 comments below.
> ---
> src/box/box.cc | 147 +++++++++++++++++++++++++++++++++++++--
> src/box/box.h | 1 +
> src/box/lua/load_cfg.lua | 2 +-
> src/box/replication.cc | 4 +-
> 4 files changed, 147 insertions(+), 7 deletions(-)
>
> diff --git a/src/box/box.cc b/src/box/box.cc
> index a8bc3471d..b820af5d0 100644
> --- a/src/box/box.cc
> +++ b/src/box/box.cc
> @@ -554,10 +554,119 @@ box_check_replication_sync_lag(void)
> return lag;
> }
>
> +/**
> + * Evaluate replication syncro quorum number from a formula.
> + */
> +static int
> +box_eval_replication_synchro_quorum(int nr_replicas)
1. I see you decided to never pass 0 here. Then I suggest to
add an assertion nr_replicas > 0 and < VCLOCK_MAX.
> +{
> + const char fmt[] =
> + "local expr = [[%s]]\n"
> + "local f, err = loadstring('return ('..expr..')')\n"
> + "if not f then "
> + "error(string.format('Failed to load \%\%s:"
> + "\%\%s', expr, err)) "
> + "end\n"
> + "setfenv(f, {N = %d, math = {"
> + "ceil = math.ceil,"
> + "floor = math.floor,"
> + "abs = math.abs,"
> + "random = math.random,"
> + "min = math.min,"
> + "max = math.abs,"
> + "sqrt = math.sqrt,"
> + "fmod = math.fmod,"
> + "}})\n"
> + "local res = f()\n"
> + "if type(res) ~= 'number' then\n"
> + "error('Expression should return a number')\n"
> + "end\n"
> + "return math.floor(res)\n";
> + const char *expr = cfg_gets("replication_synchro_quorum");
> + int quorum = -1;
> +
> + /*
> + * cfg_gets uses static buffer as well so we need a local
> + * one, 1K should be enough to carry arbitrary but sane
> + * formula.
> + */
> + char buf[1024];
> + int len = snprintf(buf, sizeof(buf), fmt, expr,
> + nr_replicas);
> + if (len >= (int)sizeof(buf)) {
> + diag_set(ClientError, ER_CFG,
> + "replication_synchro_quorum",
> + "the formula is too big");
> + return -1;
> + }
> +
> + luaL_loadstring(tarantool_L, buf);
> + if (lua_pcall(tarantool_L, 0, 1, 0) != 0) {
> + diag_set(ClientError, ER_CFG,
> + "replication_synchro_quorum",
> + lua_tostring(tarantool_L, -1));
> + return -1;
> + }
> +
> + if (lua_isnumber(tarantool_L, -1))
> + quorum = (int)lua_tonumber(tarantool_L, -1);
> + lua_pop(tarantool_L, 1);
> +
> + /*
> + * At least we should have 1 node to sync, the weird
> + * formulas such as N-2 do not guarantee quorums thus
> + * return an error.
> + *
> + * Since diag_set doesn't allow to show the valid range
> + * lets print a warning too.
2. Specifically for such cases we use tt_sprintf(). See usage
examples throughout the sources.
> + */
> + if (quorum <= 0 || quorum >= VCLOCK_MAX) {
> + say_warn("the replication_synchro_quorum formula "
> + "is evaluated to the quorum %d for replica "
> + "number %d, which is out of range [%d;%d]",
> + quorum, nr_replicas, 1, VCLOCK_MAX - 1);
> + diag_set(ClientError, ER_CFG,
> + "replication_synchro_quorum",
> + "evaluated value is out of range");
> + return -1;
> + }
> +
> + return quorum;
> +}
> +
> static int
> box_check_replication_synchro_quorum(void)
> {
> - int quorum = cfg_geti("replication_synchro_quorum");
> + int quorum = 0;
> +
> + if (!cfg_isnumber("replication_synchro_quorum")) {
> + /*
> + * The formula uses symbolic name 'N' as
> + * a number of currently registered replicas.
> + *
> + * When we're in "checking" mode we should walk
> + * over all possible number of replicas to make
> + * sure the formula is correct.
> + *
> + * Note that currently VCLOCK_MAX is pretty small
> + * value but if we gonna increase this limit make
> + * sure that the cycle won't take too much time.
> + */
> + for (int i = 1; i < VCLOCK_MAX; i++) {
> + quorum = box_eval_replication_synchro_quorum(i);
> + if (quorum < 0)
> + return -1;
> + }
> + /*
> + * Just to make clear the number we return here doesn't
> + * have any special meaning, only errors are matter.
> + * The real value is dynamic and will be updated on demand.
> + */
3. Wtf? This function before your patch was supposed to return the
new quorum value. Like all cfg 'check()' functions. If it can't do that
now (but it can - just evaluate with the current number of replicas),
then the function must return only 0 or -1, like all 'binary result'
functions. Now it simply returns some random number in case the quorum
is an expression.
> + quorum = 1;
> + } else {
> + quorum = cfg_geti("replication_synchro_quorum");
> + }
> +
> if (quorum <= 0 || quorum >= VCLOCK_MAX) {
> diag_set(ClientError, ER_CFG, "replication_synchro_quorum",
> "the value must be greater than zero and less than "
More information about the Tarantool-patches
mailing list