Tarantool development patches archive
 help / color / mirror / Atom feed
From: Vladislav Shpilevoy via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Cyrill Gorcunov <gorcunov@gmail.com>,
	tml <tarantool-patches@dev.tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH v4 2/2] relay: provide information about downstream lag
Date: Wed, 12 May 2021 22:10:47 +0200	[thread overview]
Message-ID: <1aa8f87d-bba0-2967-352b-b751837ea2cd@tarantool.org> (raw)
In-Reply-To: <20210506214132.533913-3-gorcunov@gmail.com>

I appreciate the work you did here!

See 6 comments below.

On 06.05.2021 23:41, Cyrill Gorcunov wrote:
> We already have `box.replication.upstream.lag` entry for monitoring
> sake. Same time in synchronous replication timeouts are key properties
> for quorum gathering procedure. Thus we would like to know how long
> it took of a transaction to traverse `initiator WAL -> network ->
> remote applier -> ACK` path.
> 
> Typical ouput is

1. ouput -> output.

>  .../unreleased/gh-5447-downstream-lag.md      |  6 ++
>  src/box/lua/info.c                            |  3 +
>  src/box/relay.cc                              | 18 ++++
>  src/box/relay.h                               |  6 ++
>  .../replication/gh-5447-downstream-lag.result | 93 +++++++++++++++++++
>  .../gh-5447-downstream-lag.test.lua           | 41 ++++++++
>  6 files changed, 167 insertions(+)
>  create mode 100644 changelogs/unreleased/gh-5447-downstream-lag.md
>  create mode 100644 test/replication/gh-5447-downstream-lag.result
>  create mode 100644 test/replication/gh-5447-downstream-lag.test.lua
> 
> diff --git a/changelogs/unreleased/gh-5447-downstream-lag.md b/changelogs/unreleased/gh-5447-downstream-lag.md
> new file mode 100644
> index 000000000..3b11c50ee
> --- /dev/null
> +++ b/changelogs/unreleased/gh-5447-downstream-lag.md
> @@ -0,0 +1,6 @@
> +#feature/replication
> +
> + * Introduce `box.info.replication[n].downstream.lag` to monitor state

2. If was recently decided we must use Past Simple in changelogs.

> +   of replication which is especially important for synchronous spaces
> +   where malfunctioning replicas may prevent quorum from gathering votes
> +   to commit a transaction.

3. Please, add a reference to the ticket in the form `(gh-####)`.

> diff --git a/src/box/lua/info.c b/src/box/lua/info.c
> index 0eb48b823..5b7c25f28 100644
> --- a/src/box/lua/info.c
> +++ b/src/box/lua/info.c
> @@ -143,6 +143,9 @@ lbox_pushrelay(lua_State *L, struct relay *relay)
>  		lua_pushnumber(L, ev_monotonic_now(loop()) -
>  			       relay_last_row_time(relay));
>  		lua_settable(L, -3);
> +		lua_pushstring(L, "lag");
> +		lua_pushnumber(L, relay_lag(relay));
> +		lua_settable(L, -3);
>  		break;
>  	case RELAY_STOPPED:
>  	{
> diff --git a/src/box/relay.cc b/src/box/relay.cc
> index ff43c2fc7..1501e53c7 100644
> --- a/src/box/relay.cc
> +++ b/src/box/relay.cc
> @@ -158,6 +158,12 @@ struct relay {
>  	struct stailq pending_gc;
>  	/** Time when last row was sent to peer. */
>  	double last_row_time;
> +	/**
> +	 * A time difference between moment when we
> +	 * wrote a row in the local WAL and received
> +	 * an ACK that peer has it replicated.
> +	 */
> +	double lag;

4. Lag is updated in the relay thread, therefore you can't
simply read it in TX thread like you do in the diff block
above.

Try to deliver it to TX the same way as acked vclock is
delivered.

>  	/** Relay sync state. */
>  	enum relay_state state;
>  
> @@ -621,6 +633,12 @@ relay_reader_f(va_list ap)
>  			/* vclock is followed while decoding, zeroing it. */
>  			vclock_create(&relay->recv_vclock);
>  			xrow_decode_vclock_xc(&xrow, &relay->recv_vclock);
> +			if (xrow.tm != 0) {
> +				double delta = ev_now(loop()) - xrow.tm;

5. What if the acked row didn't come from the remote peer? But from a third
node. Node1 -> Node2 -> Node3. Does Node2, before sending the row to node3,
update its timestamp? Otherwise you are going to send an ACK to Node2 with
timestamp of Node1, which won't make much sense. I don't know. I see
xlog_write_entry() updates the timestamp, but please, double-check it. In
the debugger or via printfs.

> +				relay->lag = delta;
> +			} else {
> +				relay->lag = 0;

6. Hm. This means that if I sent a synchronous transaction, and its
quorum collection takes longer than the replication timeout, I would
see the lag 0. Moverover, it will constantly blink if I have many
synchronous transactions working slow. Between 0 and seconds. I am not
100% it looks right. Especially it won't be applicable for any monitoring,
the results won't be readable. I would rather expect the lag grow until
I receive an ACK, which will bump it to an actual value. And drop to 0
only when there is no rows to wait ACK for. Or maybe never drop to 0 like
applier does?

Drop to 0 when no rows means we will have it blinking between 0 and
seconds again if the synchronous transactions are long, but not so
frequent. So there is often no rows to ACK.

Did you try to ask Vlad G., who is going to actually use it probably?

I tried to list the options we have
https://github.com/tarantool/tarantool/issues/5447#issuecomment-840047693
and asked Vlad what he thinks. He also promised to ask more people who
use Tarantool actively.

      reply	other threads:[~2021-05-12 20:10 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-06 21:41 [Tarantool-patches] [PATCH v4 0/2] relay: provide downstream lag information Cyrill Gorcunov via Tarantool-patches
2021-05-06 21:41 ` [Tarantool-patches] [PATCH v4 1/2] applier: send transaction's first row WAL time in the applier_writer_f Cyrill Gorcunov via Tarantool-patches
2021-05-12 19:59   ` Vladislav Shpilevoy via Tarantool-patches
2021-05-06 21:41 ` [Tarantool-patches] [PATCH v4 2/2] relay: provide information about downstream lag Cyrill Gorcunov via Tarantool-patches
2021-05-12 20:10   ` Vladislav Shpilevoy via Tarantool-patches [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1aa8f87d-bba0-2967-352b-b751837ea2cd@tarantool.org \
    --to=tarantool-patches@dev.tarantool.org \
    --cc=gorcunov@gmail.com \
    --cc=v.shpilevoy@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH v4 2/2] relay: provide information about downstream lag' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox