From: Cyrill Gorcunov <gorcunov@gmail.com>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Cc: Mons Anderson <v.perepelitsa@corp.mail.ru>,
	tml <tarantool-patches@dev.tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH v3 2/3] cfg: support symbolic evaluation of replication_synchro_quorum
Date: Fri, 11 Dec 2020 15:25:58 +0300
Message-ID: <20201211122558.GF544004@grain>
In-Reply-To: <4091c2f3-9990-80a2-1096-5c195903dac9@tarantool.org>

On Thu, Dec 10, 2020 at 12:22:28AM +0100, Vladislav Shpilevoy wrote:
> >
> > @TarantoolBot document
> > Title: Synchronous replication
>
> 1. Please, be more specific. Imagine if all github tickets about
> qsync would have the same title 'Synchronous replication'.
>
> > The plain integer value might be convenient for simple scenarios.
>
> The plain integer for what? Please, keep in mind that the docteam
> will see everything below "@TarantoolBot document" mark. It means,
> at this sentence the reader is already lost, because no context at
> all. This text will be read not by developers, and out of the
> commit message context.
>
> Only by replication_synchro_quorum below the reader may assume,
> that as a 'plain integer' you mean the old way of specifying
> replication_synchro_quorum. Which we know, but the docteam does
> not remember, that replication_synchro_quorum was an integer before,
> and now can be a string.
>
> State explicitly what the request is about, how it worked before,
> what changed now, and why. 'Why' part is good - that it handles the
> 'dynamic' part for you.

OK, thanks! I suppose the plain integers are allowed for simplicity
mostly, right?

> > +/**
> > + * Evaluate replication syncro quorum number from a formula.
> > + */
> > +static int
> > +eval_replication_synchro_quorum(int nr_replicas)
>
> 2. Number of replicas is always passed as replicaset.registered_count.
> I suggest you to make this function take no args, and read
> replicaset.registered_count internally.
>
> Also would be good to rename it to box_eval_... . Because you
> touch box things here. Such as cfg_gets("replication_synchro_quorum"),
> for example. Which reads box.cfg.

Done, I'll resend a new version.

> > +	int value = -1;
> > +	const char *expr = cfg_gets("replication_synchro_quorum");
> > +	const char *buf = tt_sprintf(fmt, expr, nr_replicas);
>
> 3. What if the result is >= TT_STATIC_BUF_LEN? I suspect a user will
> get a surprising error message, or will even get no error, but the
> expression will be just truncated. Does not look good.
>
> Oh, shit. I just found that cfg_gets() also uses the static buffer.
> Besides, it just truncates the string value to 256 chars. So whatever
> you specify as replication_synchro_quorum, if it is longer than 256,
> it is silently truncated. Also does not look good. But don't know
> how to fix it, and if we want to fix it now.

For now I have simply reverted to a local 1K stack buffer for formula
evaluation, as it was before. I think 1K is more than enough, and it
lets us detect whether truncation happened.

	/*
	 * cfg_gets uses static buffer as well so we need a local
	 * one, 1K should be enough to carry arbitrary but sane
	 * formula.
	 */
	char buf[1024];
	int len = snprintf(buf, sizeof(buf), fmt, expr,
			   replicaset.registered_count);
	if (len >= (int)sizeof(buf)) {
		diag_set(ClientError, ER_CFG,
			 "replication_synchro_quorum",
			 "the formula is too big");
		return -1;
	}

> > +	/*
> > +	 * At least we should have 1 node to sync, thus
> > +	 * if the formula has evaluated to some negative
> > +	 * value (say it was n-2) do not treat it as an
> > +	 * error but just yield a minimum valid magnitude.
> > +	 */
> > +	if (value >= VCLOCK_MAX) {
> > +		const int value_max = VCLOCK_MAX - 1;
> > +		say_warn("replication_synchro_quorum evaluated "
> > +			 "to value %d, set to %d",
> > +			 value, value_max);
> > +		value = value_max;
>
> 4. When I said I want to see a warning when the quorum is
> evaluated to 0, while number of replicas is > 0, I didn't mean to
> delete the validation at all.
>
> Your example about 'n-2' is a proof that a negative value means
> an issue in user's code. Because if node count is 3, the quorum
> will be 1, and synchro guarantees simply don't work.

This is no different from using plain integers here. You know, I have
told Mons several times already -- I don't like this "formula" approach
at all. I don't think users are going to use complex formulas here, and
I don't understand where they might be needed. When one starts using
synchronous replication, the only thing he is interested in is data
consistency, i.e. the canonical N/2+1 quorum. And that's all. Instead we
provide a strange interface that forces the user to figure out exactly
how many nodes he needs to guarantee that there won't be data loss :(
I think this is simply not needed. But since I didn't manage to convince
Mons, we do have to implement formula evaluation.

> But since we are here, there are two options:
>
> - Delete the upper bound validation as well. Because it makes no
>   sense to check it if we allow to overflow the other bound. This
>   warning does not warn about all invalid values anyway. Moreover,
>   a "<= 0" quorum is more dangerous than a too big quorum IMO, as the
>   user will commit data but with weaker guarantees.
>
> - Return the lower bound check and properly catch the case when the
>   quorum is 0 illegally. It is easy. If the formula returned a negative
>   value, it is always a warning. If the formula returned 0, but the
>   number of replicas is > 0, then this is a warning. Everything else is
>   correct (if no upper overflow). Including the case when the formula
>   returned 0, and the number of replicas is 0 (happens at bootstrap).
>   Also we could warn even in the latter case (0 quorum, 0 replicas).
>   Because it signals that the user does not use N/2+1 formula.
>
> If user will want to do strange things like N*3/4 or N-2, then he
> will see warnings, will think more, and will add 'if's into his
> code or min/max calls, or will fix an issue in his code.
>
> Another third special option is kinda stupid, but reliable as fuck.
>
> When formula is changed, you can try it with all replica counts
> from 0 to 31. And if any returns an out of range value, we return
> an error saying on which size the bad value was returned. Then
> during cluster size changes we will never get a bad value. And the
> user won't need to read the logs to see the errors. Personally, I
> would just do it from the beginning. Box.cfg is rare, and all 32
> values will be checked in the order of ones of microseconds I think.

OK, give me some time to think about it. Thanks!

> > +/**
> > + * Renew replication_synchro_quorum value if defined
> > + * as a formula and we need to recalculate it.
> > + */
> > +void
> > +box_update_replication_synchro_quorum(void)
> > +{
> > +	if (cfg_isnumber("replication_synchro_quorum")) {
> > +		/*
> > +		 * Even if replication_synchro_quorum is a constant
> > +		 * number the RAFT engine should be notified on
> > +		 * change of replicas amount.
> > +		 */
> > +		box_raft_update_election_quorum();
>
> 5. Why don't you update the limbo? And why don't you change
> replication_synchro_quorum global variable? It is not changed
> anywhere now.

Good catch, thanks!

> > +
> > +	/*
> > +	 * The formula has been verified already on the bootstrap
> > +	 * stage (and on dynamic reconfig as well), still there
> > +	 * is a Lua call inside, heck knowns what could go wrong
>
> 6. knowns -> knows.

+1
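
For illustration only, the "try it with all replica counts from 0 to 31"
option discussed above could look roughly like the sketch below. It is
not code from the patch. The names quorum_eval_f, quorum_is_valid and
quorum_formula_first_bad_count are hypothetical, the plain C callback
stands in for the real formula evaluation (which goes through Lua), and
the upper limit of 31 is an assumption taken from the "0 to 31" range
mentioned in the review. The validity rules follow the review: a
negative quorum is always bad, and a zero quorum is acceptable only when
there are zero replicas.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: replica counts and quorums are limited to 0..31. */
enum { REPLICA_COUNT_MAX = 31 };

/* Hypothetical evaluator: maps a replica count to a quorum value. */
typedef int (*quorum_eval_f)(int nr_replicas);

/*
 * Return 1 if the quorum is acceptable for the given replica count:
 * it must be within [0, REPLICA_COUNT_MAX], and it may be 0 only
 * when there are no replicas yet (the bootstrap case).
 */
static int
quorum_is_valid(int quorum, int nr_replicas)
{
	if (quorum < 0 || quorum > REPLICA_COUNT_MAX)
		return 0;
	if (quorum == 0 && nr_replicas > 0)
		return 0;
	return 1;
}

/*
 * Evaluate the formula for every replica count. Return -1 if all
 * results are valid, otherwise the first replica count for which
 * the formula produced a bad value.
 */
static int
quorum_formula_first_bad_count(quorum_eval_f eval)
{
	assert(eval != NULL);
	for (int n = 0; n <= REPLICA_COUNT_MAX; n++) {
		if (!quorum_is_valid(eval(n), n))
			return n;
	}
	return -1;
}

/* Example formulas mentioned in the thread. */
static int canonical_quorum(int n) { return n / 2 + 1; }
static int n_minus_two(int n) { return n - 2; }
static int three_quarters(int n) { return n * 3 / 4; }
```

With such a check, a formula like N/2+1 passes for every size, while N-2
is rejected at size 0 and N*3/4 at size 1, so the error can name the
offending cluster size when the option is set rather than later in the
logs.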