* [Tarantool-patches] [RFC 0/4] qsync: evaluate replication_synchro_quorum dynamically @ 2020-11-19 19:40 Cyrill Gorcunov 2020-11-19 19:40 ` [Tarantool-patches] [RFC 1/4] cfg: add cfg_isnumber helper Cyrill Gorcunov ` (3 more replies) 0 siblings, 4 replies; 21+ messages in thread From: Cyrill Gorcunov @ 2020-11-19 19:40 UTC (permalink / raw) To: tml; +Cc: Vladislav Shpilevoy From the issue description: > The reason is that users will likely not understand how the option > should be configured properly, and will break something accidentally. > So this idea about allowance to write an expression on the cluster > size allows to specify the canonical N/2 + 1 formula, and not update > it manually on all instances, when a new node is added, or an existing > one is deleted. The series provides a first draft/rfc for review. As to me current pucture is muddy. The whole cfg subsystem is too overloaded - some parts are configured on lua level then passed to C level where the values are finally stored. I think we need to keep configuration core on C level, simply providing a thin interface to lua this will allow to share settings (for example I've had a huge problems when been implementing separate log module config). Anyway in the series we start to allow cofigure replication_synchro_quorum settings via symbolic representation. Please take a look. There is NO tests yet, no docs update just an early draft to share. Maybe you share some ideas how to make this code less messy since now I'm far from being happy with it. Thus any comments are highly appreciated! issue https://github.com/tarantool/tarantool/issues/5446 branch gorcunov/gh-5446-eval-quorum Cyrill Gorcunov (4): cfg: add cfg_isnumber helper qsync: move synchro quorum update to separate routine cfg: prepare symbolic evaluation of replication_synchro_quorum qsync: allow to specify replication_synchro_quorum as a formula src/box/alter.cc | 2 + src/box/box.cc | 81 ++++++++++++++++++++++++++++++++++++++-- src/box/box.h | 1 + src/box/lua/load_cfg.lua | 3 +- src/box/replication.cc | 49 ++++++++++++++++++++++++ src/box/replication.h | 12 ++++++ src/cfg.c | 9 +++++ src/cfg.h | 6 +++ 8 files changed, 157 insertions(+), 6 deletions(-) -- 2.26.2 ^ permalink raw reply [flat|nested] 21+ messages in thread
* [Tarantool-patches] [RFC 1/4] cfg: add cfg_isnumber helper 2020-11-19 19:40 [Tarantool-patches] [RFC 0/4] qsync: evaluate replication_synchro_quorum dynamically Cyrill Gorcunov @ 2020-11-19 19:40 ` Cyrill Gorcunov 2020-11-20 9:53 ` Serge Petrenko 2020-11-19 19:40 ` [Tarantool-patches] [RFC 2/4] qsync: move synchro quorum update to separate routine Cyrill Gorcunov ` (2 subsequent siblings) 3 siblings, 1 reply; 21+ messages in thread From: Cyrill Gorcunov @ 2020-11-19 19:40 UTC (permalink / raw) To: tml; +Cc: Vladislav Shpilevoy We will need it to figure out if parameter is a numeric value when doing configuration check. Part-of #5446 Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> --- src/cfg.c | 9 +++++++++ src/cfg.h | 6 ++++++ 2 files changed, 15 insertions(+) diff --git a/src/cfg.c b/src/cfg.c index 46cff1999..cf07d5817 100644 --- a/src/cfg.c +++ b/src/cfg.c @@ -57,6 +57,15 @@ cfg_geti(const char *param) return val; } +bool +cfg_isnumber(const char *param) +{ + cfg_get(param); + bool ret = !!lua_isnumber(tarantool_L, -1); + lua_pop(tarantool_L, 1); + return ret; +} + int cfg_getb(const char *param) { diff --git a/src/cfg.h b/src/cfg.h index 140bdddb8..e2955e6b2 100644 --- a/src/cfg.h +++ b/src/cfg.h @@ -40,6 +40,12 @@ extern "C" { int cfg_geti(const char *param); +/** + * Test if cfg parameter is a number. + */ +bool +cfg_isnumber(const char *param); + /** * Gets boolean parameter of cfg. * Returns -1 in case of nil -- 2.26.2 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [RFC 1/4] cfg: add cfg_isnumber helper 2020-11-19 19:40 ` [Tarantool-patches] [RFC 1/4] cfg: add cfg_isnumber helper Cyrill Gorcunov @ 2020-11-20 9:53 ` Serge Petrenko 0 siblings, 0 replies; 21+ messages in thread From: Serge Petrenko @ 2020-11-20 9:53 UTC (permalink / raw) To: Cyrill Gorcunov, tml; +Cc: Vladislav Shpilevoy 19.11.2020 22:40, Cyrill Gorcunov пишет: > We will need it to figure out if parameter > is a numeric value when doing configuration > check. > > Part-of #5446 > > Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> > --- > src/cfg.c | 9 +++++++++ > src/cfg.h | 6 ++++++ > 2 files changed, 15 insertions(+) > > diff --git a/src/cfg.c b/src/cfg.c > index 46cff1999..cf07d5817 100644 > --- a/src/cfg.c > +++ b/src/cfg.c > @@ -57,6 +57,15 @@ cfg_geti(const char *param) > return val; > } > > +bool > +cfg_isnumber(const char *param) > +{ > + cfg_get(param); > + bool ret = !!lua_isnumber(tarantool_L, -1); > + lua_pop(tarantool_L, 1); > + return ret; > +} > + > int > cfg_getb(const char *param) > { > diff --git a/src/cfg.h b/src/cfg.h > index 140bdddb8..e2955e6b2 100644 > --- a/src/cfg.h > +++ b/src/cfg.h > @@ -40,6 +40,12 @@ extern "C" { > int > cfg_geti(const char *param); > > +/** > + * Test if cfg parameter is a number. > + */ > +bool > +cfg_isnumber(const char *param); > + > /** > * Gets boolean parameter of cfg. > * Returns -1 in case of nil LGTM. -- Serge Petrenko ^ permalink raw reply [flat|nested] 21+ messages in thread
* [Tarantool-patches] [RFC 2/4] qsync: move synchro quorum update to separate routine 2020-11-19 19:40 [Tarantool-patches] [RFC 0/4] qsync: evaluate replication_synchro_quorum dynamically Cyrill Gorcunov 2020-11-19 19:40 ` [Tarantool-patches] [RFC 1/4] cfg: add cfg_isnumber helper Cyrill Gorcunov @ 2020-11-19 19:40 ` Cyrill Gorcunov 2020-11-20 10:06 ` Serge Petrenko 2020-11-19 19:40 ` [Tarantool-patches] [RFC 3/4] cfg: prepare symbolic evaluation of replication_synchro_quorum Cyrill Gorcunov 2020-11-19 19:41 ` [Tarantool-patches] [RFC 4/4] qsync: allow to specify replication_synchro_quorum as a formula Cyrill Gorcunov 3 siblings, 1 reply; 21+ messages in thread From: Cyrill Gorcunov @ 2020-11-19 19:40 UTC (permalink / raw) To: tml; +Cc: Vladislav Shpilevoy We will be updating replication_synchro_quorum parameter dynamically once we add support for number evaluation via symbolic formula, thus the update will be called not only from manual configuration changes but from cluster updates as well. Thus to reuse the code lets gather it into a separate routine. Part-of #5446 Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> --- src/box/box.cc | 4 +--- src/box/replication.cc | 18 ++++++++++++++++++ src/box/replication.h | 3 +++ 3 files changed, 22 insertions(+), 3 deletions(-) diff --git a/src/box/box.cc b/src/box/box.cc index 1f7dec362..5fcf28cb3 100644 --- a/src/box/box.cc +++ b/src/box/box.cc @@ -924,9 +924,7 @@ box_set_replication_synchro_quorum(void) int value = box_check_replication_synchro_quorum(); if (value < 0) return -1; - replication_synchro_quorum = value; - txn_limbo_on_parameters_change(&txn_limbo); - raft_cfg_election_quorum(box_raft()); + replication_synchro_quorum_update(value); return 0; } diff --git a/src/box/replication.cc b/src/box/replication.cc index 65512cf0f..c83392f81 100644 --- a/src/box/replication.cc +++ b/src/box/replication.cc @@ -40,7 +40,9 @@ #include "gc.h" #include "error.h" #include "relay.h" +#include "raft.h" #include "sio.h" +#include "txn_limbo.h" uint32_t instance_id = REPLICA_ID_NIL; struct tt_uuid INSTANCE_UUID; @@ -82,6 +84,22 @@ replicaset_quorum(void) return MIN(replication_connect_quorum, replicaset.applier.total); } +/** + * Update the synchro quorum number and propagate the + * change to dependent subsystems. + */ +void +replication_synchro_quorum_update(int value) +{ + assert(value > 0 && value < VCLOCK_MAX); + + say_info("replication: replication_synchro_quorum = %d", value); + + replication_synchro_quorum = value; + txn_limbo_on_parameters_change(&txn_limbo); + raft_cfg_election_quorum(box_raft()); +} + void replication_init(void) { diff --git a/src/box/replication.h b/src/box/replication.h index e57912848..ced519612 100644 --- a/src/box/replication.h +++ b/src/box/replication.h @@ -175,6 +175,9 @@ replication_disconnect_timeout(void) return replication_timeout * 4; } +void +replication_synchro_quorum_update(int value); + void replication_init(void); -- 2.26.2 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [RFC 2/4] qsync: move synchro quorum update to separate routine 2020-11-19 19:40 ` [Tarantool-patches] [RFC 2/4] qsync: move synchro quorum update to separate routine Cyrill Gorcunov @ 2020-11-20 10:06 ` Serge Petrenko 2020-11-20 11:01 ` Cyrill Gorcunov 0 siblings, 1 reply; 21+ messages in thread From: Serge Petrenko @ 2020-11-20 10:06 UTC (permalink / raw) To: Cyrill Gorcunov, tml; +Cc: Vladislav Shpilevoy 19.11.2020 22:40, Cyrill Gorcunov пишет: > We will be updating replication_synchro_quorum parameter > dynamically once we add support for number evaluation > via symbolic formula, thus the update will be called > not only from manual configuration changes but from > cluster updates as well. > > Thus to reuse the code lets gather it into a separate > routine. > > Part-of #5446 Hi! Thanks for the patch! Please find my comments below. > Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> > --- > src/box/box.cc | 4 +--- > src/box/replication.cc | 18 ++++++++++++++++++ > src/box/replication.h | 3 +++ > 3 files changed, 22 insertions(+), 3 deletions(-) > > diff --git a/src/box/box.cc b/src/box/box.cc > index 1f7dec362..5fcf28cb3 100644 > --- a/src/box/box.cc > +++ b/src/box/box.cc > @@ -924,9 +924,7 @@ box_set_replication_synchro_quorum(void) > int value = box_check_replication_synchro_quorum(); > if (value < 0) > return -1; > - replication_synchro_quorum = value; > - txn_limbo_on_parameters_change(&txn_limbo); > - raft_cfg_election_quorum(box_raft()); > + replication_synchro_quorum_update(value); > return 0; > } IMO it'd be better to name the new function somewhat similar to `box_update_replication_synchro_quorum` and leave it in box.cc This way you avoid adding 2 new dependencies to replication.cc and leave all the code dealing with reconfiguration at one place. Besides, replication.cc already depends on box, so it won't be a problem to call update_replication_synchro_quorum there. > > diff --git a/src/box/replication.cc b/src/box/replication.cc > index 65512cf0f..c83392f81 100644 > --- a/src/box/replication.cc > +++ b/src/box/replication.cc > @@ -40,7 +40,9 @@ > #include "gc.h" > #include "error.h" > #include "relay.h" > +#include "raft.h" > #include "sio.h" > +#include "txn_limbo.h" > > uint32_t instance_id = REPLICA_ID_NIL; > struct tt_uuid INSTANCE_UUID; > @@ -82,6 +84,22 @@ replicaset_quorum(void) > return MIN(replication_connect_quorum, replicaset.applier.total); > } > > +/** > + * Update the synchro quorum number and propagate the > + * change to dependent subsystems. > + */ > +void > +replication_synchro_quorum_update(int value) > +{ > + assert(value > 0 && value < VCLOCK_MAX); > + > + say_info("replication: replication_synchro_quorum = %d", value); Load_cfg.lua will say something similar, when replication_synchro_quorum is a number: ``` log.info("set '%s' configuration option to %s", key, json.encode(val)) ``` So this say_info belongs to the trigger you invoke on replica register/unregister. > + > + replication_synchro_quorum = value; > + txn_limbo_on_parameters_change(&txn_limbo); > + raft_cfg_election_quorum(box_raft()); > +} > + > void > replication_init(void) > { > diff --git a/src/box/replication.h b/src/box/replication.h > index e57912848..ced519612 100644 > --- a/src/box/replication.h > +++ b/src/box/replication.h > @@ -175,6 +175,9 @@ replication_disconnect_timeout(void) > return replication_timeout * 4; > } > > +void > +replication_synchro_quorum_update(int value); > + > void > replication_init(void); > -- Serge Petrenko ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [RFC 2/4] qsync: move synchro quorum update to separate routine 2020-11-20 10:06 ` Serge Petrenko @ 2020-11-20 11:01 ` Cyrill Gorcunov 2020-11-20 11:39 ` Serge Petrenko 0 siblings, 1 reply; 21+ messages in thread From: Cyrill Gorcunov @ 2020-11-20 11:01 UTC (permalink / raw) To: Serge Petrenko; +Cc: tml, Vladislav Shpilevoy On Fri, Nov 20, 2020 at 01:06:55PM +0300, Serge Petrenko wrote: > > IMO it'd be better to name the new function somewhat similar to > `box_update_replication_synchro_quorum` and leave it in box.cc > > > This way you avoid adding 2 new dependencies to replication.cc > and leave all the code dealing with reconfiguration at one place. > > Besides, replication.cc already depends on box, so it won't be a problem > to call update_replication_synchro_quorum there. Yeah, I don't mind, thanks! > Load_cfg.lua will say something similar, when replication_synchro_quorum is > a number: > > ``` > > log.info("set '%s' configuration option to %s", key, > json.encode(val)) > > ``` > > So this say_info belongs to the trigger you invoke on replica > register/unregister. It will say "set ... to n+2/1" while we're printing the evaluated integer value which is more informative. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [RFC 2/4] qsync: move synchro quorum update to separate routine 2020-11-20 11:01 ` Cyrill Gorcunov @ 2020-11-20 11:39 ` Serge Petrenko 2020-11-20 11:47 ` Cyrill Gorcunov 0 siblings, 1 reply; 21+ messages in thread From: Serge Petrenko @ 2020-11-20 11:39 UTC (permalink / raw) To: Cyrill Gorcunov; +Cc: tml, Vladislav Shpilevoy 20.11.2020 14:01, Cyrill Gorcunov пишет: > On Fri, Nov 20, 2020 at 01:06:55PM +0300, Serge Petrenko wrote: >> IMO it'd be better to name the new function somewhat similar to >> `box_update_replication_synchro_quorum` and leave it in box.cc >> >> >> This way you avoid adding 2 new dependencies to replication.cc >> and leave all the code dealing with reconfiguration at one place. >> >> Besides, replication.cc already depends on box, so it won't be a problem >> to call update_replication_synchro_quorum there. > Yeah, I don't mind, thanks! > >> Load_cfg.lua will say something similar, when replication_synchro_quorum is >> a number: >> >> ``` >> >> log.info("set '%s' configuration option to %s", key, >> json.encode(val)) >> >> ``` >> >> So this say_info belongs to the trigger you invoke on replica >> register/unregister. > It will say "set ... to n+2/1" while we're printing the evaluated > integer value which is more informative. Yes, but when it's actually configured to number, there'll be duplicate messages. You already have the needed log message in the next commit. -- Serge Petrenko ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [RFC 2/4] qsync: move synchro quorum update to separate routine 2020-11-20 11:39 ` Serge Petrenko @ 2020-11-20 11:47 ` Cyrill Gorcunov 0 siblings, 0 replies; 21+ messages in thread From: Cyrill Gorcunov @ 2020-11-20 11:47 UTC (permalink / raw) To: Serge Petrenko; +Cc: tml, Vladislav Shpilevoy On Fri, Nov 20, 2020 at 02:39:03PM +0300, Serge Petrenko wrote: > > It will say "set ... to n+2/1" while we're printing the evaluated > > integer value which is more informative. > > Yes, but when it's actually configured to number, there'll be duplicate > messages. > > You already have the needed log message in the next commit. +1, thanks! ^ permalink raw reply [flat|nested] 21+ messages in thread
* [Tarantool-patches] [RFC 3/4] cfg: prepare symbolic evaluation of replication_synchro_quorum 2020-11-19 19:40 [Tarantool-patches] [RFC 0/4] qsync: evaluate replication_synchro_quorum dynamically Cyrill Gorcunov 2020-11-19 19:40 ` [Tarantool-patches] [RFC 1/4] cfg: add cfg_isnumber helper Cyrill Gorcunov 2020-11-19 19:40 ` [Tarantool-patches] [RFC 2/4] qsync: move synchro quorum update to separate routine Cyrill Gorcunov @ 2020-11-19 19:40 ` Cyrill Gorcunov 2020-11-20 10:32 ` Serge Petrenko 2020-11-26 14:38 ` Mons Anderson 2020-11-19 19:41 ` [Tarantool-patches] [RFC 4/4] qsync: allow to specify replication_synchro_quorum as a formula Cyrill Gorcunov 3 siblings, 2 replies; 21+ messages in thread From: Cyrill Gorcunov @ 2020-11-19 19:40 UTC (permalink / raw) To: tml; +Cc: Vladislav Shpilevoy Prepare scaffolds to evaluate replication_synchro_quorum, since we don't yet update the undeneath engines we accept and verify the formulas but refuse to proceed. In next patch we will support dynamic evaluation of the quorum number. Part-of #5446 Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> --- src/box/box.cc | 82 +++++++++++++++++++++++++++++++++++++++- src/box/box.h | 1 + src/box/lua/load_cfg.lua | 3 +- 3 files changed, 83 insertions(+), 3 deletions(-) diff --git a/src/box/box.cc b/src/box/box.cc index 5fcf28cb3..5f7ddfa99 100644 --- a/src/box/box.cc +++ b/src/box/box.cc @@ -562,10 +562,90 @@ box_check_replication_sync_lag(void) return lag; } +/** + * Evaluate replicaion syncro quorum number from a formula. + */ +int +eval_replication_synchro_quorum(int nr_replicas) +{ + const char fmt[] = + "local f, err = loadstring(\"return (%s)\")\n" + "if not f then return 'failed to load \"%s\"' end\n" + "setfenv(f, { n = %d })\n" + "local ok, res = pcall(f)\n" + "if not ok then return res end\n" + "return math.floor(res)\n"; + char buf[512]; + int value = -1; + + errno = 0; + + const char *expr = cfg_gets("replication_synchro_quorum"); + size_t ret = snprintf(buf, sizeof(buf), fmt, expr, + expr, nr_replicas); + if (ret >= sizeof(buf)) { + errno = EINVAL; + diag_set(ClientError, ER_CFG, + "replication_synchro_quorum", + "the expression is too big"); + return -1; + } + + luaL_loadstring(tarantool_L, buf); + lua_call(tarantool_L, 0, 1); + + if (lua_isnumber(tarantool_L, -1)) { + value = (int)lua_tonumber(tarantool_L, -1); + } else { + assert(lua_isstring(tarantool_L, -1)); + errno = EINVAL; + diag_set(ClientError, ER_CFG, + "replication_synchro_quorum", + lua_tostring(tarantool_L, -1)); + } + lua_pop(tarantool_L, 1); + return value; +} + static int box_check_replication_synchro_quorum(void) { - int quorum = cfg_geti("replication_synchro_quorum"); + int quorum = 0; + + if (!cfg_isnumber("replication_synchro_quorum")) { + /* + * When validating a formula it must return a + * positive value for a single node and maximum + * possible replicas because the quorum will be + * evaluated on each new replica registration, + * starting from a single node. + */ + int v[] = {1, VCLOCK_MAX-1}; + for (size_t i = 0; i < lengthof(v); i++) { + quorum = eval_replication_synchro_quorum(v[i]); + if (quorum < 0 && errno == EINVAL) + return -1; + } + + /* + * Once syntax is valid we should pass the real + * default value from replication module itself + * to evaluate the actual value to use. + */ + int value = replication_synchro_quorum; + quorum = eval_replication_synchro_quorum(value); + /* + * FIXME: Until we get full support. + */ + diag_set(ClientError, ER_CFG, + "replication_synchro_quorum", + "symbolic evaluation is not yet supported"); + diag_log(); + quorum = -1; + } else { + quorum = cfg_geti("replication_synchro_quorum"); + } + if (quorum <= 0 || quorum >= VCLOCK_MAX) { diag_set(ClientError, ER_CFG, "replication_synchro_quorum", "the value must be greater than zero and less than " diff --git a/src/box/box.h b/src/box/box.h index b47a220b7..8f438faab 100644 --- a/src/box/box.h +++ b/src/box/box.h @@ -252,6 +252,7 @@ void box_set_replication_connect_timeout(void); void box_set_replication_connect_quorum(void); void box_set_replication_sync_lag(void); int box_set_replication_synchro_quorum(void); +int eval_replication_synchro_quorum(int nr_replicas); int box_set_replication_synchro_timeout(void); void box_set_replication_sync_timeout(void); void box_set_replication_skip_conflict(void); diff --git a/src/box/lua/load_cfg.lua b/src/box/lua/load_cfg.lua index 76e2e92c2..26725e08d 100644 --- a/src/box/lua/load_cfg.lua +++ b/src/box/lua/load_cfg.lua @@ -172,7 +172,7 @@ local template_cfg = { replication_timeout = 'number', replication_sync_lag = 'number', replication_sync_timeout = 'number', - replication_synchro_quorum = 'number', + replication_synchro_quorum = 'string, number', replication_synchro_timeout = 'number', replication_connect_timeout = 'number', replication_connect_quorum = 'number', @@ -368,7 +368,6 @@ local dynamic_cfg_skip_at_load = { replication_connect_quorum = true, replication_sync_lag = true, replication_sync_timeout = true, - replication_synchro_quorum = true, replication_synchro_timeout = true, replication_skip_conflict = true, replication_anon = true, -- 2.26.2 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [RFC 3/4] cfg: prepare symbolic evaluation of replication_synchro_quorum 2020-11-19 19:40 ` [Tarantool-patches] [RFC 3/4] cfg: prepare symbolic evaluation of replication_synchro_quorum Cyrill Gorcunov @ 2020-11-20 10:32 ` Serge Petrenko 2020-11-20 11:34 ` Cyrill Gorcunov 2020-11-26 14:38 ` Mons Anderson 1 sibling, 1 reply; 21+ messages in thread From: Serge Petrenko @ 2020-11-20 10:32 UTC (permalink / raw) To: Cyrill Gorcunov, tml; +Cc: Vladislav Shpilevoy 19.11.2020 22:40, Cyrill Gorcunov пишет: > Prepare scaffolds to evaluate replication_synchro_quorum, > since we don't yet update the undeneath engines we accept > and verify the formulas but refuse to proceed. > > In next patch we will support dynamic evaluation of > the quorum number. > > Part-of #5446 Hi! Thanks for the patch! Please find my comments below. > > Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> > --- > src/box/box.cc | 82 +++++++++++++++++++++++++++++++++++++++- > src/box/box.h | 1 + > src/box/lua/load_cfg.lua | 3 +- > 3 files changed, 83 insertions(+), 3 deletions(-) > > diff --git a/src/box/box.cc b/src/box/box.cc > index 5fcf28cb3..5f7ddfa99 100644 > --- a/src/box/box.cc > +++ b/src/box/box.cc > @@ -562,10 +562,90 @@ box_check_replication_sync_lag(void) > return lag; > } > > +/** > + * Evaluate replicaion syncro quorum number from a formula. > + */ > +int > +eval_replication_synchro_quorum(int nr_replicas) > +{ > + const char fmt[] = > + "local f, err = loadstring(\"return (%s)\")\n" > + "if not f then return 'failed to load \"%s\"' end\n" > + "setfenv(f, { n = %d })\n" > + "local ok, res = pcall(f)\n" > + "if not ok then return res end\n" > + "return math.floor(res)\n"; > + char buf[512]; > + int value = -1; > + > + errno = 0; > + > + const char *expr = cfg_gets("replication_synchro_quorum"); > + size_t ret = snprintf(buf, sizeof(buf), fmt, expr, > + expr, nr_replicas); > + if (ret >= sizeof(buf)) { > + errno = EINVAL; > + diag_set(ClientError, ER_CFG, > + "replication_synchro_quorum", > + "the expression is too big"); > + return -1; > + } > + > + luaL_loadstring(tarantool_L, buf); > + lua_call(tarantool_L, 0, 1); > + > + if (lua_isnumber(tarantool_L, -1)) { > + value = (int)lua_tonumber(tarantool_L, -1); > + } else { > + assert(lua_isstring(tarantool_L, -1)); > + errno = EINVAL; > + diag_set(ClientError, ER_CFG, > + "replication_synchro_quorum", > + lua_tostring(tarantool_L, -1)); > + } > + lua_pop(tarantool_L, 1); > + return value; > +} Better make the function return -1 on an error, and when the function evaluates to something negative, log the eval result and return 1. This way you won't need errno. > + > static int > box_check_replication_synchro_quorum(void) > { > - int quorum = cfg_geti("replication_synchro_quorum"); > + int quorum = 0; > + > + if (!cfg_isnumber("replication_synchro_quorum")) { > + /* > + * When validating a formula it must return a > + * positive value for a single node and maximum > + * possible replicas because the quorum will be > + * evaluated on each new replica registration, > + * starting from a single node. > + */ > + int v[] = {1, VCLOCK_MAX-1}; > + for (size_t i = 0; i < lengthof(v); i++) { > + quorum = eval_replication_synchro_quorum(v[i]); > + if (quorum < 0 && errno == EINVAL) > + return -1; > + } You should either check every possible value, from 1 to VCLOCK_MAX - 1, to make sure, say, that no division by zero is involved for some input. or check a single value, say, 1, or 2 or whatever, to make sure that the expression in replication_synchro_quorum is at least valid Lua code. Why check only min and max inputs? > + > + /* > + * Once syntax is valid we should pass the real > + * default value from replication module itself > + * to evaluate the actual value to use. > + */ > + int value = replication_synchro_quorum; > + quorum = eval_replication_synchro_quorum(value); > + /* > + * FIXME: Until we get full support. > + */ > + diag_set(ClientError, ER_CFG, > + "replication_synchro_quorum", > + "symbolic evaluation is not yet supported"); > + diag_log(); > + quorum = -1; > + } else { > + quorum = cfg_geti("replication_synchro_quorum"); > + } > + > if (quorum <= 0 || quorum >= VCLOCK_MAX) { > diag_set(ClientError, ER_CFG, "replication_synchro_quorum", > "the value must be greater than zero and less than " > diff --git a/src/box/box.h b/src/box/box.h > index b47a220b7..8f438faab 100644 > --- a/src/box/box.h > +++ b/src/box/box.h > @@ -252,6 +252,7 @@ void box_set_replication_connect_timeout(void); > void box_set_replication_connect_quorum(void); > void box_set_replication_sync_lag(void); > int box_set_replication_synchro_quorum(void); > +int eval_replication_synchro_quorum(int nr_replicas); > int box_set_replication_synchro_timeout(void); > void box_set_replication_sync_timeout(void); > void box_set_replication_skip_conflict(void); > diff --git a/src/box/lua/load_cfg.lua b/src/box/lua/load_cfg.lua > index 76e2e92c2..26725e08d 100644 > --- a/src/box/lua/load_cfg.lua > +++ b/src/box/lua/load_cfg.lua > @@ -172,7 +172,7 @@ local template_cfg = { > replication_timeout = 'number', > replication_sync_lag = 'number', > replication_sync_timeout = 'number', > - replication_synchro_quorum = 'number', > + replication_synchro_quorum = 'string, number', > replication_synchro_timeout = 'number', > replication_connect_timeout = 'number', > replication_connect_quorum = 'number', > @@ -368,7 +368,6 @@ local dynamic_cfg_skip_at_load = { > replication_connect_quorum = true, > replication_sync_lag = true, > replication_sync_timeout = true, > - replication_synchro_quorum = true, You shouldn't remove replication_synchro_quorum from here. This table lists the options which are set directly from `box_cfg` in specific order. I.e. not lua calls box_cfg_set_..., but box_cfg() itself does. But only on the first box.cfg call. > replication_synchro_timeout = true, > replication_skip_conflict = true, > replication_anon = true, -- Serge Petrenko ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [RFC 3/4] cfg: prepare symbolic evaluation of replication_synchro_quorum 2020-11-20 10:32 ` Serge Petrenko @ 2020-11-20 11:34 ` Cyrill Gorcunov 2020-11-20 11:56 ` Serge Petrenko 0 siblings, 1 reply; 21+ messages in thread From: Cyrill Gorcunov @ 2020-11-20 11:34 UTC (permalink / raw) To: Serge Petrenko; +Cc: tml, Vladislav Shpilevoy On Fri, Nov 20, 2020 at 01:32:42PM +0300, Serge Petrenko wrote: > > +/** > > + * Evaluate replicaion syncro quorum number from a formula. > > + */ > > +int > > +eval_replication_synchro_quorum(int nr_replicas) > > +{ > > + const char fmt[] = > > + "local f, err = loadstring(\"return (%s)\")\n" > > + "if not f then return 'failed to load \"%s\"' end\n" > > + "setfenv(f, { n = %d })\n" > > + "local ok, res = pcall(f)\n" > > + "if not ok then return res end\n" > > + "return math.floor(res)\n"; > > + char buf[512]; > > + int value = -1; > > + > > + errno = 0; > > + > > + const char *expr = cfg_gets("replication_synchro_quorum"); > > + size_t ret = snprintf(buf, sizeof(buf), fmt, expr, > > + expr, nr_replicas); > > + if (ret >= sizeof(buf)) { > > + errno = EINVAL; > > + diag_set(ClientError, ER_CFG, > > + "replication_synchro_quorum", > > + "the expression is too big"); > > + return -1; > > + } > > + > > + luaL_loadstring(tarantool_L, buf); > > + lua_call(tarantool_L, 0, 1); > > + > > + if (lua_isnumber(tarantool_L, -1)) { > > + value = (int)lua_tonumber(tarantool_L, -1); > > + } else { > > + assert(lua_isstring(tarantool_L, -1)); > > + errno = EINVAL; > > + diag_set(ClientError, ER_CFG, > > + "replication_synchro_quorum", > > + lua_tostring(tarantool_L, -1)); > > + } > > + lua_pop(tarantool_L, 1); > > + return value; > > +} > > Better make the function return -1 on an error, and when the function > evaluates to something negative, > log the eval result and return 1. This way you won't need errno. We need to distinguish wrong syntax and wrong evaluated values somehow which means using -1 for anything wont work here. - wrong syntax (which includes too long formula): -1 && errno - on correct syntax we simply return the result the caller has to check if returned value fits the allowed range. Or I'm missing something. > You should either check every possible value, from 1 to VCLOCK_MAX - 1, > to make sure, say, that no division by zero is involved for some input. That's the good point. Another question if we should allow formulas like n-2, and while n <= 2 assume the quorum to be 1? Ie max(1, eval(n)) > or check a single value, say, 1, or 2 or whatever, to make sure that the > expression in replication_synchro_quorum is at least valid Lua code. > > Why check only min and max inputs? Lets consider the evaluation a bit closer. Denote arguments as x, thus we will have x ∈ N, where are x's are strictly ordered, thus to get f(x) < 0 between min/max values we need to use nonlinear f. I must confess I don't know what is allowed inside Lua evaluator and really don't like that we're allowing to pass completely arbitrary formula here. Moreover what would happen if f would be using some weird data so that if will be linear on bootstrap procedure but then switched to something more complex? All this arbitrary evaluation looks very very fishy and I bet there will be problems because this is the data we do not control completely. Back to the former question -- initially I assume the f gonna be linear and eval in min/max will be enough. But of course this is not correct. You know I can pass all N's here but still this doesn't guarantee anything :( That's why I'm for more strict rules here: - allow some symbolic names such as "all" -> (alias for f(x) = n) "canonical" -> (alias for f(x) = n/2 + 1) - implement own syntax analizyer/evaluator which would allow only basic operation {+,/} over the set of {n} (which will give us a linear functioni by definition). This of course will require a way more code so not right now. > > @@ -368,7 +368,6 @@ local dynamic_cfg_skip_at_load = { > > replication_connect_quorum = true, > > replication_sync_lag = true, > > replication_sync_timeout = true, > > - replication_synchro_quorum = true, > > You shouldn't remove replication_synchro_quorum from here. > This table lists the options which are set directly from `box_cfg` in > specific order. No, this table is to _skip_ evaluation on bootup. But we have to verify the default value to evaluate. > I.e. not lua calls box_cfg_set_..., but box_cfg() itself does. But only on > the first box.cfg call. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [RFC 3/4] cfg: prepare symbolic evaluation of replication_synchro_quorum 2020-11-20 11:34 ` Cyrill Gorcunov @ 2020-11-20 11:56 ` Serge Petrenko 2020-11-20 12:14 ` Cyrill Gorcunov 0 siblings, 1 reply; 21+ messages in thread From: Serge Petrenko @ 2020-11-20 11:56 UTC (permalink / raw) To: Cyrill Gorcunov; +Cc: tml, Vladislav Shpilevoy 20.11.2020 14:34, Cyrill Gorcunov пишет: > On Fri, Nov 20, 2020 at 01:32:42PM +0300, Serge Petrenko wrote: >>> +/** >>> + * Evaluate replicaion syncro quorum number from a formula. >>> + */ >>> +int >>> +eval_replication_synchro_quorum(int nr_replicas) >>> +{ >>> + const char fmt[] = >>> + "local f, err = loadstring(\"return (%s)\")\n" >>> + "if not f then return 'failed to load \"%s\"' end\n" >>> + "setfenv(f, { n = %d })\n" >>> + "local ok, res = pcall(f)\n" >>> + "if not ok then return res end\n" >>> + "return math.floor(res)\n"; >>> + char buf[512]; >>> + int value = -1; >>> + >>> + errno = 0; >>> + >>> + const char *expr = cfg_gets("replication_synchro_quorum"); >>> + size_t ret = snprintf(buf, sizeof(buf), fmt, expr, >>> + expr, nr_replicas); >>> + if (ret >= sizeof(buf)) { >>> + errno = EINVAL; >>> + diag_set(ClientError, ER_CFG, >>> + "replication_synchro_quorum", >>> + "the expression is too big"); >>> + return -1; >>> + } >>> + >>> + luaL_loadstring(tarantool_L, buf); >>> + lua_call(tarantool_L, 0, 1); >>> + >>> + if (lua_isnumber(tarantool_L, -1)) { >>> + value = (int)lua_tonumber(tarantool_L, -1); >>> + } else { >>> + assert(lua_isstring(tarantool_L, -1)); >>> + errno = EINVAL; >>> + diag_set(ClientError, ER_CFG, >>> + "replication_synchro_quorum", >>> + lua_tostring(tarantool_L, -1)); >>> + } >>> + lua_pop(tarantool_L, 1); >>> + return value; >>> +} >> Better make the function return -1 on an error, and when the function >> evaluates to something negative, >> log the eval result and return 1. This way you won't need errno. > We need to distinguish wrong syntax and wrong evaluated values somehow > which means using -1 for anything wont work here. > > - wrong syntax (which includes too long formula): -1 && errno > - on correct syntax we simply return the result the caller has > to check if returned value fits the allowed range. > > Or I'm missing something. I was thinking that we're gonna return something like max(1, min(value, 31)) so that any evaluated number is correct. Lets better discuss this verbally then. > >> You should either check every possible value, from 1 to VCLOCK_MAX - 1, >> to make sure, say, that no division by zero is involved for some input. > That's the good point. Another question if we should allow formulas like > n-2, and while n <= 2 assume the quorum to be 1? Ie max(1, eval(n)) Yes, that's what I was speaking about above. So that when the formula may be evaluated correctly (i.e. without division by zero or syntax errors) its result will automatically be correct. > >> or check a single value, say, 1, or 2 or whatever, to make sure that the >> expression in replication_synchro_quorum is at least valid Lua code. >> >> Why check only min and max inputs? > Lets consider the evaluation a bit closer. Denote arguments as x, thus > we will have x ∈ N, where are x's are strictly ordered, thus to get > f(x) < 0 between min/max values we need to use nonlinear f. I must confess > I don't know what is allowed inside Lua evaluator and really don't like > that we're allowing to pass completely arbitrary formula here. Moreover > what would happen if f would be using some weird data so that if will be > linear on bootstrap procedure but then switched to something more complex? > All this arbitrary evaluation looks very very fishy and I bet there will > be problems because this is the data we do not control completely. I guess we shouldn't be this crazy about what is allowed in this formula and what's not. If a user has access to box.cfg{}, he may evaluate any expression he wishes anyway. Anyway, this is subject of a verbal discussion. > > Back to the former question -- initially I assume the f gonna be linear > and eval in min/max will be enough. But of course this is not correct. > > You know I can pass all N's here but still this doesn't guarantee anything :( > That's why I'm for more strict rules here: > > - allow some symbolic names such as > "all" -> (alias for f(x) = n) > "canonical" -> (alias for f(x) = n/2 + 1) Sounds good to me. AFAIR others were agains it, though. > > - implement own syntax analizyer/evaluator which would allow only > basic operation {+,/} over the set of {n} (which will give us > a linear functioni by definition). This of course will require > a way more code so not right now. > >>> @@ -368,7 +368,6 @@ local dynamic_cfg_skip_at_load = { >>> replication_connect_quorum = true, >>> replication_sync_lag = true, >>> replication_sync_timeout = true, >>> - replication_synchro_quorum = true, >> You shouldn't remove replication_synchro_quorum from here. >> This table lists the options which are set directly from `box_cfg` in >> specific order. > No, this table is to _skip_ evaluation on bootup. But we have to verify > the default value to evaluate. Yes, that's what I'm talking about. Even if the cfg option from this list is 'skipped' in lua, it's referenced directly from box_cfg_xc(). Othervise the `box_cfg_set_...` will be called twice. Once from box_cfg_xc(), second time from this lua code. To be more verbose, all the setters from dynamic_cfg_skip_at_load are called on bootstrap. But from box_cfg_xc() in C, not from Lua. If you remove an entry from dynamic_cfg_skip_at_load, the corresponding setter will be called twice. > >> I.e. not lua calls box_cfg_set_..., but box_cfg() itself does. But only on >> the first box.cfg call. -- Serge Petrenko ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [RFC 3/4] cfg: prepare symbolic evaluation of replication_synchro_quorum 2020-11-20 11:56 ` Serge Petrenko @ 2020-11-20 12:14 ` Cyrill Gorcunov 0 siblings, 0 replies; 21+ messages in thread From: Cyrill Gorcunov @ 2020-11-20 12:14 UTC (permalink / raw) To: Serge Petrenko; +Cc: tml, Vladislav Shpilevoy On Fri, Nov 20, 2020 at 02:56:12PM +0300, Serge Petrenko wrote: > > I was thinking that we're gonna return something like max(1, min(value, 31)) > > so that any evaluated number is correct. Lets better discuss this verbally > then. Ah, I see. Actually I don't mind to change it this way. Looks reasonable. > > > > You should either check every possible value, from 1 to VCLOCK_MAX - 1, > > > to make sure, say, that no division by zero is involved for some input. > > That's the good point. Another question if we should allow formulas like > > n-2, and while n <= 2 assume the quorum to be 1? Ie max(1, eval(n)) > > Yes, that's what I was speaking about above. So that when the formula may > be evaluated correctly (i.e. without division by zero or syntax errors) its > result will automatically be correct. +1 > I guess we shouldn't be this crazy about what is allowed in this formula and > what's not. > If a user has access to box.cfg{}, he may evaluate any expression he wishes > anyway. > > Anyway, this is subject of a verbal discussion. Yes, better discuss. All this formalism is done in a sake "lets provide users options to make sync replication guaranteed" and this contradict the requirements with ability to run arbitrary formula :( > > Back to the former question -- initially I assume the f gonna be linear > > and eval in min/max will be enough. But of course this is not correct. > > > > You know I can pass all N's here but still this doesn't guarantee anything :( > > That's why I'm for more strict rules here: > > > > - allow some symbolic names such as > > "all" -> (alias for f(x) = n) > > "canonical" -> (alias for f(x) = n/2 + 1) > > Sounds good to me. AFAIR others were agains it, though. Yeah, except noone gave a good reason how manually defined formulas are better than predefined ones. Users usually doesn't care about config specifics they simply need a guaranteed replication to not loose their data. > > > You shouldn't remove replication_synchro_quorum from here. > > > This table lists the options which are set directly from `box_cfg` in > > > specific order. > > No, this table is to _skip_ evaluation on bootup. But we have to verify > > the default value to evaluate. > > > Yes, that's what I'm talking about. > Even if the cfg option from this list is 'skipped' in lua, it's > referenced > directly from box_cfg_xc(). Othervise the `box_cfg_set_...` will be called > twice. > Once from box_cfg_xc(), second time from this lua code. > > To be more verbose, all the setters from dynamic_cfg_skip_at_load are called > on > bootstrap. But from box_cfg_xc() in C, not from Lua. If you remove an entry > from dynamic_cfg_skip_at_load, the corresponding setter will be called > twice. I'll recheck, thanks Serge! ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [RFC 3/4] cfg: prepare symbolic evaluation of replication_synchro_quorum 2020-11-19 19:40 ` [Tarantool-patches] [RFC 3/4] cfg: prepare symbolic evaluation of replication_synchro_quorum Cyrill Gorcunov 2020-11-20 10:32 ` Serge Petrenko @ 2020-11-26 14:38 ` Mons Anderson 2020-11-26 14:44 ` Cyrill Gorcunov 1 sibling, 1 reply; 21+ messages in thread From: Mons Anderson @ 2020-11-26 14:38 UTC (permalink / raw) To: tarantool-patches Hi, thanks! Some comments below. On 19.11.2020 22:40, Cyrill Gorcunov wrote: > > +/** > + * Evaluate replicaion syncro quorum number from a formula. > + */ > +int > +eval_replication_synchro_quorum(int nr_replicas) > +{ > + const char fmt[] = > + "local f, err = loadstring(\"return (%s)\")\n" > + "if not f then return 'failed to load \"%s\"' end\n" > + "setfenv(f, { n = %d })\n" > + "local ok, res = pcall(f)\n" > + "if not ok then return res end\n" > + "return math.floor(res)\n"; For the formula evaluation I'd propose the following snippet: local expr = [[%s]] local f, err = loadstring('return ('..expr..')') if not f then error(string.format('Failed to load %%s: %%s',expr, err)) end setfenv(f, { N = %d, math = math }) return math.floor( f() ) The reasons are: - [[ ]] quotes are not interpolated, no need to repeat substitution again - You need to handle errors anyway, so no need to return errors as return values. Throw them. - Also, since error handling (like a pcall from C layer) is required, no need to do pcall inside - Use of uppercase N, not lowercase is desired. Lowercase is used for variables, uppercase for constants. Here N is more constant than variable. (Or we may include eihter N or n) - math module may be used for formulas, so pass it also. > + luaL_loadstring(tarantool_L, buf); > + lua_call(tarantool_L, 0, 1); Use lua_pcall instead lua_call. So then check for absence of error and lua_isnumber > + if (lua_isnumber(tarantool_L, -1)) { > + value = (int)lua_tonumber(tarantool_L, -1); > + } else { > + assert(lua_isstring(tarantool_L, -1)); > + errno = EINVAL; > + diag_set(ClientError, ER_CFG, > + "replication_synchro_quorum", > + lua_tostring(tarantool_L, -1)); > + } ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [RFC 3/4] cfg: prepare symbolic evaluation of replication_synchro_quorum 2020-11-26 14:38 ` Mons Anderson @ 2020-11-26 14:44 ` Cyrill Gorcunov 2020-11-26 16:01 ` Mons Anderson 0 siblings, 1 reply; 21+ messages in thread From: Cyrill Gorcunov @ 2020-11-26 14:44 UTC (permalink / raw) To: Mons Anderson; +Cc: tarantool-patches On Thu, Nov 26, 2020 at 05:38:32PM +0300, Mons Anderson wrote: > > For the formula evaluation I'd propose the following snippet: > > local expr = [[%s]] > local f, err = loadstring('return ('..expr..')') > if not f then error(string.format('Failed to load %%s: %%s',expr, err)) end > setfenv(f, { N = %d, math = math }) > return math.floor( f() ) Will do. And will address the rest of comments. Thanks for feedback! ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [RFC 3/4] cfg: prepare symbolic evaluation of replication_synchro_quorum 2020-11-26 14:44 ` Cyrill Gorcunov @ 2020-11-26 16:01 ` Mons Anderson 0 siblings, 0 replies; 21+ messages in thread From: Mons Anderson @ 2020-11-26 16:01 UTC (permalink / raw) To: Cyrill Gorcunov; +Cc: tarantool-patches Some feedback on behalf of Vlad Grubov: Quote: Good idea, but it is not secure to export whole math to the configuration. User must not have access to math table itself because it can modify it's function which will take effect through entire application. Moreover user must not have access to math.randomseed for security reasons. User is allowed to have access to ceil, floor, min, max, fmod, random, abs, sqrt. So, I'd chande setfenv with: setfenv(f, { N = %d, math = { ceil = math.ceil, floor = math.floor, abs = math.abs, random = math.random, min = math.min, max = math.abs, sqrt = math.sqrt, fmod = math.fmod } }) On 26.11.2020 17:44, Cyrill Gorcunov wrote: > On Thu, Nov 26, 2020 at 05:38:32PM +0300, Mons Anderson wrote: >> For the formula evaluation I'd propose the following snippet: >> >> local expr = [[%s]] >> local f, err = loadstring('return ('..expr..')') >> if not f then error(string.format('Failed to load %%s: %%s',expr, err)) end >> setfenv(f, { N = %d, math = math }) >> return math.floor( f() ) > Will do. And will address the rest of comments. Thanks > for feedback! ^ permalink raw reply [flat|nested] 21+ messages in thread
* [Tarantool-patches] [RFC 4/4] qsync: allow to specify replication_synchro_quorum as a formula 2020-11-19 19:40 [Tarantool-patches] [RFC 0/4] qsync: evaluate replication_synchro_quorum dynamically Cyrill Gorcunov ` (2 preceding siblings ...) 2020-11-19 19:40 ` [Tarantool-patches] [RFC 3/4] cfg: prepare symbolic evaluation of replication_synchro_quorum Cyrill Gorcunov @ 2020-11-19 19:41 ` Cyrill Gorcunov 2020-11-20 10:50 ` Serge Petrenko 3 siblings, 1 reply; 21+ messages in thread From: Cyrill Gorcunov @ 2020-11-19 19:41 UTC (permalink / raw) To: tml; +Cc: Vladislav Shpilevoy When synchronous replication is used we prefer a user to specify a quorum number, ie the number of replicas where data must be replicated before the master node continue accepting new operations. This is not that convenient since a user may not know initially how many replicas will be used. Moreover the number of replicas may variaty dynamically. For this sake we allow to specify the number of quorum in symbolic way. For example box.cfg { replication_synchro_quorum = "n/2+1", } where n is number of registered replicas in the cluster. Once new replica attached or old one detached the number is evaluated and propagated. Internally on each "_cluster" system space update the trigger runs replication_on_cluster_update() helper which counts the number of registered replicas and update the quorum value notifying limbo and raft about a change. Closes #5446 Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> --- src/box/alter.cc | 2 ++ src/box/box.cc | 11 +++-------- src/box/replication.cc | 31 +++++++++++++++++++++++++++++++ src/box/replication.h | 9 +++++++++ 4 files changed, 45 insertions(+), 8 deletions(-) diff --git a/src/box/alter.cc b/src/box/alter.cc index 075b79d33..bd291ad4f 100644 --- a/src/box/alter.cc +++ b/src/box/alter.cc @@ -4196,6 +4196,7 @@ register_replica(struct trigger *trigger, void * /* event */) } else { try { replica = replicaset_add(id, &uuid); + replication_on_cluster_update(); /* Can't throw exceptions from on_commit trigger */ } catch(Exception *e) { panic("Can't register replica: %s", e->errmsg); @@ -4216,6 +4217,7 @@ unregister_replica(struct trigger *trigger, void * /* event */) struct replica *replica = replica_by_uuid(&old_uuid); assert(replica != NULL); replica_clear_id(replica); + replication_on_cluster_update(); return 0; } diff --git a/src/box/box.cc b/src/box/box.cc index 5f7ddfa99..558a71468 100644 --- a/src/box/box.cc +++ b/src/box/box.cc @@ -634,14 +634,6 @@ box_check_replication_synchro_quorum(void) */ int value = replication_synchro_quorum; quorum = eval_replication_synchro_quorum(value); - /* - * FIXME: Until we get full support. - */ - diag_set(ClientError, ER_CFG, - "replication_synchro_quorum", - "symbolic evaluation is not yet supported"); - diag_log(); - quorum = -1; } else { quorum = cfg_geti("replication_synchro_quorum"); } @@ -1004,6 +996,9 @@ box_set_replication_synchro_quorum(void) int value = box_check_replication_synchro_quorum(); if (value < 0) return -1; + + bool isnumber = cfg_isnumber("replication_synchro_quorum"); + replication_synchro_quorum_eval = !isnumber; replication_synchro_quorum_update(value); return 0; } diff --git a/src/box/replication.cc b/src/box/replication.cc index c83392f81..bde850c1c 100644 --- a/src/box/replication.cc +++ b/src/box/replication.cc @@ -53,6 +53,7 @@ double replication_connect_timeout = 30.0; /* seconds */ int replication_connect_quorum = REPLICATION_CONNECT_QUORUM_ALL; double replication_sync_lag = 10.0; /* seconds */ int replication_synchro_quorum = 1; +bool replication_synchro_quorum_eval = false; double replication_synchro_timeout = 5.0; /* seconds */ double replication_sync_timeout = 300.0; /* seconds */ bool replication_skip_conflict = false; @@ -100,6 +101,36 @@ replication_synchro_quorum_update(int value) raft_cfg_election_quorum(box_raft()); } +/** + * Evaluate the new synchro quorum number when replica + * get registered/unregistered and the quorum depends on + * their amount via formula in config. + */ +void +replication_on_cluster_update(void) +{ + if (!replication_synchro_quorum_eval) + return; + + /* + * Account only registered replicas when evaluating + * quorum number from a fromula present in config. + */ + int value = replicaset.registered_count - replicaset.anon_count; + int quorum = eval_replication_synchro_quorum(value); + + /* + * Upon node bootstrap we verify configuration so there + * must never be a value out of bounds, still better to + * be sure since evaluation code lays far from here. + */ + if (quorum <= 0 || quorum >= VCLOCK_MAX) + panic("Unexpected result for replication_synchro_quorum eval"); + + say_info("replication: evaluated quorum is %d", quorum); + replication_synchro_quorum_update(quorum); +} + void replication_init(void) { diff --git a/src/box/replication.h b/src/box/replication.h index ced519612..fa651d1c5 100644 --- a/src/box/replication.h +++ b/src/box/replication.h @@ -131,6 +131,12 @@ extern double replication_sync_lag; */ extern int replication_synchro_quorum; +/** + * A flag to point that replication_synchro_quorum needs + * to be evaluated as a formula. + */ +extern bool replication_synchro_quorum_eval; + /** * Time in seconds which the master node is able to wait for ACKs * for a synchronous transaction until it is rolled back. @@ -178,6 +184,9 @@ replication_disconnect_timeout(void) void replication_synchro_quorum_update(int value); +void +replication_on_cluster_update(void); + void replication_init(void); -- 2.26.2 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [RFC 4/4] qsync: allow to specify replication_synchro_quorum as a formula 2020-11-19 19:41 ` [Tarantool-patches] [RFC 4/4] qsync: allow to specify replication_synchro_quorum as a formula Cyrill Gorcunov @ 2020-11-20 10:50 ` Serge Petrenko 2020-11-20 12:01 ` Cyrill Gorcunov 0 siblings, 1 reply; 21+ messages in thread From: Serge Petrenko @ 2020-11-20 10:50 UTC (permalink / raw) To: Cyrill Gorcunov, tml; +Cc: Vladislav Shpilevoy 19.11.2020 22:41, Cyrill Gorcunov пишет: > When synchronous replication is used we prefer a user to specify > a quorum number, ie the number of replicas where data must be > replicated before the master node continue accepting new operations. > > This is not that convenient since a user may not know initially > how many replicas will be used. Moreover the number of replicas > may variaty dynamically. For this sake we allow to specify the > number of quorum in symbolic way. > > For example > > box.cfg { > replication_synchro_quorum = "n/2+1", > } > > where n is number of registered replicas in the cluster. > Once new replica attached or old one detached the number > is evaluated and propagated. > > Internally on each "_cluster" system space update the trigger > runs replication_on_cluster_update() helper which counts the > number of registered replicas and update the quorum value notifying > limbo and raft about a change. > > Closes #5446 > > Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> > --- > src/box/alter.cc | 2 ++ > src/box/box.cc | 11 +++-------- > src/box/replication.cc | 31 +++++++++++++++++++++++++++++++ > src/box/replication.h | 9 +++++++++ > 4 files changed, 45 insertions(+), 8 deletions(-) > > diff --git a/src/box/alter.cc b/src/box/alter.cc > index 075b79d33..bd291ad4f 100644 > --- a/src/box/alter.cc > +++ b/src/box/alter.cc > @@ -4196,6 +4196,7 @@ register_replica(struct trigger *trigger, void * /* event */) > } else { > try { > replica = replicaset_add(id, &uuid); > + replication_on_cluster_update(); > /* Can't throw exceptions from on_commit trigger */ > } catch(Exception *e) { > panic("Can't register replica: %s", e->errmsg); > @@ -4216,6 +4217,7 @@ unregister_replica(struct trigger *trigger, void * /* event */) > struct replica *replica = replica_by_uuid(&old_uuid); > assert(replica != NULL); > replica_clear_id(replica); > + replication_on_cluster_update(); > return 0; > } > We usually perform all the work related to replica register/unregister directly in replica_set_id and replica_clear_id. Besides, these are the places where replicaset.registered_count is updated, so it'd be nice to call replication_on_cluster_update from there. > diff --git a/src/box/box.cc b/src/box/box.cc > index 5f7ddfa99..558a71468 100644 > --- a/src/box/box.cc > +++ b/src/box/box.cc > @@ -634,14 +634,6 @@ box_check_replication_synchro_quorum(void) > */ > int value = replication_synchro_quorum; > quorum = eval_replication_synchro_quorum(value); Pass replicaset.registered_count instead of replication_synchro_quorum here. > - /* > - * FIXME: Until we get full support. > - */ > - diag_set(ClientError, ER_CFG, > - "replication_synchro_quorum", > - "symbolic evaluation is not yet supported"); > - diag_log(); > - quorum = -1; > } else { > quorum = cfg_geti("replication_synchro_quorum"); > } > @@ -1004,6 +996,9 @@ box_set_replication_synchro_quorum(void) > int value = box_check_replication_synchro_quorum(); > if (value < 0) > return -1; > + > + bool isnumber = cfg_isnumber("replication_synchro_quorum"); > + replication_synchro_quorum_eval = !isnumber; > replication_synchro_quorum_update(value); > return 0; > } > diff --git a/src/box/replication.cc b/src/box/replication.cc > index c83392f81..bde850c1c 100644 > --- a/src/box/replication.cc > +++ b/src/box/replication.cc > @@ -53,6 +53,7 @@ double replication_connect_timeout = 30.0; /* seconds */ > int replication_connect_quorum = REPLICATION_CONNECT_QUORUM_ALL; > double replication_sync_lag = 10.0; /* seconds */ > int replication_synchro_quorum = 1; > +bool replication_synchro_quorum_eval = false; > double replication_synchro_timeout = 5.0; /* seconds */ > double replication_sync_timeout = 300.0; /* seconds */ > bool replication_skip_conflict = false; > @@ -100,6 +101,36 @@ replication_synchro_quorum_update(int value) > raft_cfg_election_quorum(box_raft()); > } > > +/** > + * Evaluate the new synchro quorum number when replica > + * get registered/unregistered and the quorum depends on > + * their amount via formula in config. > + */ > +void > +replication_on_cluster_update(void) > +{ > + if (!replication_synchro_quorum_eval) > + return; > + > + /* > + * Account only registered replicas when evaluating > + * quorum number from a fromula present in config. > + */ > + int value = replicaset.registered_count - replicaset.anon_count; registered_count stands for 'normal' replica count, so no need to subtract anon_count from it. > + int quorum = eval_replication_synchro_quorum(value); > + > + /* > + * Upon node bootstrap we verify configuration so there > + * must never be a value out of bounds, still better to > + * be sure since evaluation code lays far from here. > + */ > + if (quorum <= 0 || quorum >= VCLOCK_MAX) > + panic("Unexpected result for replication_synchro_quorum eval"); > + > + say_info("replication: evaluated quorum is %d", quorum); quorum -> replication_synchro_quorum > + replication_synchro_quorum_update(quorum); > +} > + > void > replication_init(void) > { > diff --git a/src/box/replication.h b/src/box/replication.h > index ced519612..fa651d1c5 100644 > --- a/src/box/replication.h > +++ b/src/box/replication.h > @@ -131,6 +131,12 @@ extern double replication_sync_lag; > */ > extern int replication_synchro_quorum; > > +/** > + * A flag to point that replication_synchro_quorum needs > + * to be evaluated as a formula. > + */ > +extern bool replication_synchro_quorum_eval; > + > /** > * Time in seconds which the master node is able to wait for ACKs > * for a synchronous transaction until it is rolled back. > @@ -178,6 +184,9 @@ replication_disconnect_timeout(void) > void > replication_synchro_quorum_update(int value); > > +void > +replication_on_cluster_update(void); > + > void > replication_init(void); > -- Serge Petrenko ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [RFC 4/4] qsync: allow to specify replication_synchro_quorum as a formula 2020-11-20 10:50 ` Serge Petrenko @ 2020-11-20 12:01 ` Cyrill Gorcunov 2020-11-20 12:41 ` Serge Petrenko 0 siblings, 1 reply; 21+ messages in thread From: Cyrill Gorcunov @ 2020-11-20 12:01 UTC (permalink / raw) To: Serge Petrenko; +Cc: tml, Vladislav Shpilevoy On Fri, Nov 20, 2020 at 01:50:50PM +0300, Serge Petrenko wrote: > > @@ -4216,6 +4217,7 @@ unregister_replica(struct trigger *trigger, void * /* event */) > > struct replica *replica = replica_by_uuid(&old_uuid); > > assert(replica != NULL); > > replica_clear_id(replica); > > + replication_on_cluster_update(); > > return 0; > > } > > We usually perform all the work related to replica register/unregister > directly in replica_set_id > and replica_clear_id. > > Besides, these are the places where replicaset.registered_count is updated, > so it'd be nice to call replication_on_cluster_update from there. Sure. > > --- a/src/box/box.cc > > +++ b/src/box/box.cc > > @@ -634,14 +634,6 @@ box_check_replication_synchro_quorum(void) > > */ > > int value = replication_synchro_quorum; > > quorum = eval_replication_synchro_quorum(value); > > > Pass replicaset.registered_count instead of replication_synchro_quorum here. Wait, this is bootstrap, replicaset.registered_count is 0 at this moment, no? > > } > > +/** > > + * Evaluate the new synchro quorum number when replica > > + * get registered/unregistered and the quorum depends on > > + * their amount via formula in config. > > + */ > > +void > > +replication_on_cluster_update(void) > > +{ > > + if (!replication_synchro_quorum_eval) > > + return; > > + > > + /* > > + * Account only registered replicas when evaluating > > + * quorum number from a fromula present in config. > > + */ > > + int value = replicaset.registered_count - replicaset.anon_count; > > > registered_count stands for 'normal' replica count, so no need to subtract > anon_count from it. Yeah, thanks! > > + > > + say_info("replication: evaluated quorum is %d", quorum); > > > quorum -> replication_synchro_quorum ok ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [RFC 4/4] qsync: allow to specify replication_synchro_quorum as a formula 2020-11-20 12:01 ` Cyrill Gorcunov @ 2020-11-20 12:41 ` Serge Petrenko 2020-11-20 15:00 ` Cyrill Gorcunov 0 siblings, 1 reply; 21+ messages in thread From: Serge Petrenko @ 2020-11-20 12:41 UTC (permalink / raw) To: Cyrill Gorcunov; +Cc: tml, Vladislav Shpilevoy 20.11.2020 15:01, Cyrill Gorcunov пишет: > On Fri, Nov 20, 2020 at 01:50:50PM +0300, Serge Petrenko wrote: >>> @@ -4216,6 +4217,7 @@ unregister_replica(struct trigger *trigger, void * /* event */) >>> struct replica *replica = replica_by_uuid(&old_uuid); >>> assert(replica != NULL); >>> replica_clear_id(replica); >>> + replication_on_cluster_update(); >>> return 0; >>> } >> We usually perform all the work related to replica register/unregister >> directly in replica_set_id >> and replica_clear_id. >> >> Besides, these are the places where replicaset.registered_count is updated, >> so it'd be nice to call replication_on_cluster_update from there. > Sure. > >>> --- a/src/box/box.cc >>> +++ b/src/box/box.cc >>> @@ -634,14 +634,6 @@ box_check_replication_synchro_quorum(void) >>> */ >>> int value = replication_synchro_quorum; >>> quorum = eval_replication_synchro_quorum(value); >>> Pass replicaset.registered_count instead of replication_synchro_quorum here. > Wait, this is bootstrap, replicaset.registered_count is 0 at this moment, no? Hm, I didn't think of this. Yes, this is either bootstrap or reconfiguration. Well, if we have a max(1, value) guard nothing bad should happen even if we pass 0 on bootstrap. The formula will be reevaluated each time a replica is registered anyway. Even when the node registers itself. Anyway, passing replication_synchro_quorum as parameter to evaluate quorum is even more random. > >>> } >>> +/** >>> + * Evaluate the new synchro quorum number when replica >>> + * get registered/unregistered and the quorum depends on >>> + * their amount via formula in config. >>> + */ >>> +void >>> +replication_on_cluster_update(void) >>> +{ >>> + if (!replication_synchro_quorum_eval) >>> + return; >>> + >>> + /* >>> + * Account only registered replicas when evaluating >>> + * quorum number from a fromula present in config. >>> + */ >>> + int value = replicaset.registered_count - replicaset.anon_count; >> >> registered_count stands for 'normal' replica count, so no need to subtract >> anon_count from it. > Yeah, thanks! > >>> + >>> + say_info("replication: evaluated quorum is %d", quorum); >> >> quorum -> replication_synchro_quorum > ok -- Serge Petrenko ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [RFC 4/4] qsync: allow to specify replication_synchro_quorum as a formula 2020-11-20 12:41 ` Serge Petrenko @ 2020-11-20 15:00 ` Cyrill Gorcunov 0 siblings, 0 replies; 21+ messages in thread From: Cyrill Gorcunov @ 2020-11-20 15:00 UTC (permalink / raw) To: Serge Petrenko; +Cc: tml, Vladislav Shpilevoy On Fri, Nov 20, 2020 at 03:41:12PM +0300, Serge Petrenko wrote: > > > > Pass replicaset.registered_count instead of replication_synchro_quorum here. > > Wait, this is bootstrap, replicaset.registered_count is 0 at this moment, no? > > Hm, I didn't think of this. > > Yes, this is either bootstrap or reconfiguration. > Well, if we have a max(1, value) guard nothing bad should happen even if we > pass 0 on bootstrap. The formula will be reevaluated each time > a replica is registered anyway. Even when the node registers itself. > > Anyway, passing replication_synchro_quorum as parameter to evaluate quorum > is even more random. ok, thanks! ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2020-11-26 16:01 UTC | newest] Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-11-19 19:40 [Tarantool-patches] [RFC 0/4] qsync: evaluate replication_synchro_quorum dynamically Cyrill Gorcunov 2020-11-19 19:40 ` [Tarantool-patches] [RFC 1/4] cfg: add cfg_isnumber helper Cyrill Gorcunov 2020-11-20 9:53 ` Serge Petrenko 2020-11-19 19:40 ` [Tarantool-patches] [RFC 2/4] qsync: move synchro quorum update to separate routine Cyrill Gorcunov 2020-11-20 10:06 ` Serge Petrenko 2020-11-20 11:01 ` Cyrill Gorcunov 2020-11-20 11:39 ` Serge Petrenko 2020-11-20 11:47 ` Cyrill Gorcunov 2020-11-19 19:40 ` [Tarantool-patches] [RFC 3/4] cfg: prepare symbolic evaluation of replication_synchro_quorum Cyrill Gorcunov 2020-11-20 10:32 ` Serge Petrenko 2020-11-20 11:34 ` Cyrill Gorcunov 2020-11-20 11:56 ` Serge Petrenko 2020-11-20 12:14 ` Cyrill Gorcunov 2020-11-26 14:38 ` Mons Anderson 2020-11-26 14:44 ` Cyrill Gorcunov 2020-11-26 16:01 ` Mons Anderson 2020-11-19 19:41 ` [Tarantool-patches] [RFC 4/4] qsync: allow to specify replication_synchro_quorum as a formula Cyrill Gorcunov 2020-11-20 10:50 ` Serge Petrenko 2020-11-20 12:01 ` Cyrill Gorcunov 2020-11-20 12:41 ` Serge Petrenko 2020-11-20 15:00 ` Cyrill Gorcunov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox