Tarantool development patches archive
From: Cyrill Gorcunov <gorcunov@gmail.com>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Cc: Mons Anderson <v.perepelitsa@corp.mail.ru>,
	tml <tarantool-patches@dev.tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH v3 2/3] cfg: support symbolic evaluation of replication_synchro_quorum
Date: Fri, 11 Dec 2020 15:25:58 +0300
Message-ID: <20201211122558.GF544004@grain>
In-Reply-To: <4091c2f3-9990-80a2-1096-5c195903dac9@tarantool.org>

On Thu, Dec 10, 2020 at 12:22:28AM +0100, Vladislav Shpilevoy wrote:
> > 
> > @TarantoolBot document
> > Title: Synchronous replication
> 
> 1. Please, be more specific. Imagine if all github tickets about
> qsync would have the same title 'Synchronous replication'.
> 
> > The plain integer value might be convenient for simple scenarios.
> 
> The plain integer for what? Please, keep in mind that the docteam
> will see everything below "@TarantoolBot document" mark. It means,
> at this sentence the reader is already lost, because no context at
> all. This text will be read not by developers, and out of the
> commit message context.
> 
> Only by replication_synchro_quorum below the reader may assume,
> that as a 'plain integer' you mean the old way of specifying
> replication_synchro_quorum. Which we know, but the docteam does
> not remember, that replication_synchro_quorum was an integer before,
> and now can be a string.
> 
> State explicitly what the request is about, how it worked before,
> what changed now, and why. 'Why' part is good - that it handles the
> 'dynamic' part for you.

OK, thanks! I suppose the plain integers are allowed for simplicity
mostly, right?

> >  
> > +/**
> > + * Evaluate replication synchro quorum number from a formula.
> > + */
> > +static int
> > +eval_replication_synchro_quorum(int nr_replicas)
> 
> 2. Number of replicas is always passed as replicaset.registered_count.
> I suggest you to make this function take no args, and read
> replicaset.registered_count internally.
> 
> Also would be good to rename it to box_eval_... . Because you
> touch box things here. Such as cfg_gets("replication_synchro_quorum"),
> for example. Which reads box.cfg.

Done, I'll resend a new version.

> > +	int value = -1;
> > +	const char *expr = cfg_gets("replication_synchro_quorum");
> > +	const char *buf = tt_sprintf(fmt, expr, nr_replicas);
> 
> 3. What if the result is >= TT_STATIC_BUF_LEN? I suspect a user will
> get a surprising error message, or will even get no error, but the
> expression will be just truncated. Does not look good.
> 
> Oh, shit. I just found that cfg_gets() also uses the static buffer.
> Besides, it just truncates the string value to 256 chars. So whatever
> you specify as replication_synchro_quorum, if it is longer than 256,
> it is silently truncated. Also does not look good. But don't know
> how to fix it, and if we want to fix it now.

For now I've simply reverted to a local 1K stack buffer for formula
evaluation, as it was before. I think 1K is more than enough, and it
allows us to detect whether truncation happened.

	/*
	 * cfg_gets uses static buffer as well so we need a local
	 * one, 1K should be enough to carry arbitrary but sane
	 * formula.
	 */
	char buf[1024];
	int len = snprintf(buf, sizeof(buf), fmt, expr,
			   replicaset.registered_count);
	if (len >= (int)sizeof(buf)) {
		diag_set(ClientError, ER_CFG,
			 "replication_synchro_quorum",
			 "the formula is too big");
		return -1;
	}
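
To illustrate why a local buffer makes truncation detectable, here is a
minimal standalone sketch. snprintf() returns the length the complete
output would have had, so a result at or above the buffer size means the
formula did not fit. The helper name format_formula() and the
"N = %d; return %s" template are assumptions made up for this example,
they are not taken from the patch.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper: substitute the replica count and the
 * user's expression into a template, writing into a buffer
 * supplied by the caller.
 *
 * Returns 0 on success and -1 if the formatted text would not
 * fit (or if snprintf itself reported an error).
 */
static int
format_formula(char *buf, size_t size, const char *expr, int nr_replicas)
{
	/* The template is an assumption, used for illustration only. */
	int len = snprintf(buf, size, "N = %d; return %s",
			   nr_replicas, expr);
	/*
	 * snprintf returns the number of characters it wanted to
	 * write, not counting the terminating zero, so len >= size
	 * means the output was cut.
	 */
	if (len < 0 || (size_t)len >= size)
		return -1;
	return 0;
}
```

With a 64 byte buffer the expression "N/2+1" fits and the helper returns
0; with an 8 byte buffer the same call returns -1 instead of silently
passing a cut formula further.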


> > +	/*
> > +	 * At least we should have 1 node to sync, thus
> > +	 * if the formula has evaluated to some negative
> > +	 * value (say it was n-2) do not treat it as an
> > +	 * error but just yield a minimum valid magnitude.
> > +	 */
> > +	if (value >= VCLOCK_MAX) {
> > +		const int value_max = VCLOCK_MAX - 1;
> > +		say_warn("replication_synchro_quorum evaluated "
> > +			 "to value %d, set to %d",
> > +			 value, value_max);
> > +		value = value_max;
> 
> 4. When I said I want to see a warning when the quorum is
> evaluated to 0, while number of replicas is > 0, I didn't mean to
> delete the validation at all.
> 
> Your example about 'n-2' is a proof that a negative value means
> an issue in user's code. Because if node count is 3, the quorum
> will be 1, and synchro guarantees simply don't work.

This is not anyhow different from using plain integers here. You know
I told Mons several times already -- I don't like this "formula" approach
at all. I don't think users gonna be using some complex formulas here
and I don't understand where it might be needed.

When one starts using synchronous replication, the only thing he is
interested in is data consistency, i.e. the canonical N/2+1 quorum.
And that's all.

Instead we provide some strange interface forcing the user to figure out
exactly how many nodes he needs to guarantee that there won't be data
loss :( I think this is simply not needed. But since I didn't manage to
convince Mons, we have to implement formula evaluation.
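
For reference, a tiny sketch of the arithmetic behind the two formulas
mentioned in this thread. The function names are hypothetical and plain C
integer division stands in for whatever the real evaluator does.

```c
/*
 * Canonical majority quorum: strictly more than half of the
 * n registered nodes.
 */
static int
quorum_majority(int n)
{
	return n / 2 + 1;
}

/*
 * The "n - 2" formula used as an example in the review: for
 * small clusters it drops below a majority and may even go
 * negative.
 */
static int
quorum_n_minus_2(int n)
{
	return n - 2;
}
```

For n = 3 the majority formula gives 2 while n - 2 gives 1, and for n = 1
the latter gives -1, which is the case the lower bound discussion is
about.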

> But since we are here, there are two options:
> 
> - Delete the upper bound validation as well. Because it makes no
> sense to check it if we allow to overflow the other bound. This
> warning does not warn about all invalid values anyway. Moreover,
> a "<= 0" quorum is more dangerous than a too big quorum IMO, as the
> user will commit data but with weaker guarantees.
> 
> - Return the lower bound check and properly catch the case when the
> quorum is 0 illegally. It is easy. If the formula returned a negative
> value, it is always a warning. If the formula returned 0, but the
> number of replicas is > 0, then this is a warning. Everything else is
> correct (if no upper overflow). Including the case when the formula
> returned 0, and the number of replicas is 0 (happens at bootstrap).
> Also we could warn even in the latter case (0 quorum, 0 replicas).
> Because it signals that the user does not use N/2+1 formula.
> 
> If the user wants to do strange things like N*3/4 or N-2, then he
> will see warnings, will think more, and will add 'if's into his
> code or min/max calls, or will fix an issue in his code.
> 
> Another third special option is kinda stupid, but reliable as fuck.
> 
> When formula is changed, you can try it with all replica counts
> from 0 to 31. And if any returns an out of range value, we return
> an error saying on which size the bad value was returned. Then
> during cluster size changes we will never get a bad value. And the
> user won't need to read the logs to see the errors. Personally, I
> would just do it from the beginning. Box.cfg is rare, and all 32
> values will be checked within a few microseconds, I think.

OK, give me some time to think about it. Thanks!
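
Just to have something concrete to discuss, the third option (try every
replica count once when the formula changes) could look roughly like the
sketch below. Everything here is an assumption for illustration: the
evaluator is modelled as a plain C callback instead of the real Lua call,
NODES_MAX stands in for VCLOCK_MAX and is assumed to be 32, and the
validity rules are the ones described in the review (a negative value is
always bad, 0 is acceptable only when there are 0 replicas, and the value
must stay below the upper bound).

```c
/* Assumed to mirror VCLOCK_MAX; 32 per the "0 to 31" range above. */
enum { NODES_MAX = 32 };

/* Hypothetical evaluator: maps a replica count to a quorum value. */
typedef int (*quorum_eval_f)(int nr_replicas);

/* Return 1 if the value is an acceptable quorum for this count. */
static int
quorum_is_valid(int value, int nr_replicas)
{
	if (value < 0 || value >= NODES_MAX)
		return 0;
	/* Zero quorum is tolerated only at bootstrap (no replicas). */
	if (value == 0 && nr_replicas > 0)
		return 0;
	return 1;
}

/*
 * Try the formula for every possible replica count.
 * Returns -1 if all results are valid, otherwise the first
 * replica count which produced an out of range value.
 */
static int
quorum_find_bad_size(quorum_eval_f eval)
{
	for (int n = 0; n < NODES_MAX; n++) {
		if (!quorum_is_valid(eval(n), n))
			return n;
	}
	return -1;
}

/* Sample formulas to feed into the check. */
static int eval_majority(int n) { return n / 2 + 1; }
static int eval_n_minus_2(int n) { return n - 2; }
static int eval_n_times_2(int n) { return n * 2; }
```

Under these rules N/2+1 passes for all sizes, N-2 is rejected already at
size 0, and N*2 is rejected at size 16 where it reaches the upper bound.
The returned size is what the error message could report to the user.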

> >  
> > +/**
> > + * Renew replication_synchro_quorum value if defined
> > + * as a formula and we need to recalculate it.
> > + */
> > +void
> > +box_update_replication_synchro_quorum(void)
> > +{
> > +	if (cfg_isnumber("replication_synchro_quorum")) {
> > +		/*
> > +		 * Even if replication_synchro_quorum is a constant
> > +		 * number the RAFT engine should be notified on
> > +		 * change of replicas amount.
> > +		 */
> > +		box_raft_update_election_quorum();
> 
> 5. Why don't you update the limbo? And why don't you change
> replication_synchro_quorum global variable? It is not changed
> anywhere now.

Good catch, thanks!

> > +
> > +	/*
> > +	 * The formula has been verified already on the bootstrap
> > +	 * stage (and on dynamic reconfig as well), still there
> > +	 * is a Lua call inside, heck knowns what could go wrong
> 
> 6. knowns -> knows.

+1

