Tarantool development patches archive
From: Serge Petrenko via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Vitaliia Ioffe <v.ioffe@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH] applier: fix upstream.lag calculations
Date: Sat, 14 Aug 2021 11:03:23 +0300
Message-ID: <10e00de4-ef9f-b8c9-ed57-ba79989daffa@tarantool.org>
In-Reply-To: <1628923373.481086215@f108.i.mail.ru>



14.08.2021 09:42, Vitaliia Ioffe wrote:
> Hi Sergey,
> I'm sorry to say it, but this fix does not fix the problem. I have to
> point out that there were failed tests:
>
> [037] replication/errinj.test.lua                 memtx           [ fail ]
>
> [037] replication/errinj.test.lua                 vinyl           [ fail ]
> You can find it here:
> https://github.com/tarantool/tarantool/runs/3322606890
> --
> Vitaliia Ioffe
Don't be sorry, I didn't check the patch thoroughly enough.

Applied the following diff and reworded the patch a bit.
Everything should be fine now.

===================================

diff --git a/test/replication/errinj.result b/test/replication/errinj.result
index 9d13f6aa7..ec251182f 100644
--- a/test/replication/errinj.result
+++ b/test/replication/errinj.result
@@ -308,7 +308,10 @@ box.info.replication[1].upstream.lag > 0
  ---
  - true
  ...
-box.info.replication[1].upstream.lag < 1
+-- Upstream lag is huge until the first row is received.
+test_run:wait_cond(function()\
+    return box.info.replication[1].upstream.lag < 1\
+end)
  ---
  - true
  ...
diff --git a/test/replication/errinj.test.lua b/test/replication/errinj.test.lua
index 19234ab35..7f6535ec1 100644
--- a/test/replication/errinj.test.lua
+++ b/test/replication/errinj.test.lua
@@ -130,7 +130,10 @@ test_run:cmd("switch replica")
  while box.info.replication[1].upstream.status ~= 'follow' do fiber.sleep(0.0001) end
  box.info.replication[1].upstream.status
  box.info.replication[1].upstream.lag > 0
-box.info.replication[1].upstream.lag < 1
+-- Upstream lag is huge until the first row is received.
+test_run:wait_cond(function()\
+    return box.info.replication[1].upstream.lag < 1\
+end)
  -- wait for ack timeout
  test_run:wait_upstream(1, {status='disconnected', message_re='unexpected EOF'})


===================================
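For readers unfamiliar with the helper: the wait_cond call in the diff
above replaces a one-shot check with a retry loop, so the test passes as
soon as the lag drops below 1 instead of failing on the first sample.
A rough C++ sketch of that polling pattern follows. It is not test-run's
code; the function name, signature and timings are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper: retry `cond` until it returns true or `timeout`
// expires, sleeping `delay` between attempts.
static bool wait_cond(const std::function<bool()> &cond,
                      std::chrono::milliseconds timeout,
                      std::chrono::milliseconds delay)
{
    assert(cond); // an empty std::function cannot be polled
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    for (;;) {
        if (cond())
            return true;   // condition finally holds
        if (std::chrono::steady_clock::now() >= deadline)
            return false;  // gave up: the condition never became true
        std::this_thread::sleep_for(delay);
    }
}
```

The condition is evaluated at least once, even with a zero timeout, so a
check that already holds returns immediately.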
>
>     Friday, 13 August 2021, 17:25 +03:00, from Serge Petrenko
>     <sergepetrenko@tarantool.org>:
>     upstream.lag is the delta between the moment when a row was written to
>     master's journal and the moment when it was received by the replica.
>     It's an important metric to check whether the replica has fallen too
>     far behind master.
>
>     Not all the rows coming from master have a valid time of creation. For
>     example, RAFT system messages don't have one, and we can't assign
>     correct time to them: these messages do not originate from the journal,
>     and assigning current time to them would lead to jumps in upstream.lag
>     results.
>
>     Stop updating upstream.lag for rows which don't have creation time
>     assigned.
>
>     This also fixes the flaky replication/errinj.test.lua
>     ---
>     https://github.com/tarantool/tarantool/tree/sp/applier-lag-fix
>
>      src/box/applier.cc | 3 ++-
>      1 file changed, 2 insertions(+), 1 deletion(-)
>
>     diff --git a/src/box/applier.cc b/src/box/applier.cc
>     index 902d0bc72..9256078e1 100644
>     --- a/src/box/applier.cc
>     +++ b/src/box/applier.cc
>     @@ -664,7 +664,8 @@ applier_read_tx_row(struct applier *applier, double timeout)
>
>       coio_read_xrow_timeout_xc(coio, ibuf, row, timeout);
>
>     - applier->lag = ev_now(loop()) - row->tm;
>     + if (row->tm > 0)
>     +     applier->lag = ev_now(loop()) - row->tm;
>       applier->last_row_time = ev_monotonic_now(loop());
>       return tx_row;
>      }
>     --
>     2.30.1 (Apple Git-130)
>
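The guard in the quoted patch can be illustrated in isolation. The sketch
below is a simplified stand-in, not the applier code: Row, Applier and
update_lag are hypothetical names, and the current time is passed in as
an argument rather than read from the event loop.

```cpp
#include <cassert>

// Hypothetical stand-ins for a replicated row and the applier state.
struct Row { double tm; };      // creation time in seconds; 0 means "not set"
struct Applier { double lag; }; // last computed upstream lag in seconds

// Update the lag only for rows that carry a creation time, and return
// the lag that is in effect afterwards.
static double update_lag(Applier &applier, const Row &row, double now)
{
    assert(now > 0); // a wall-clock reading is expected here
    if (row.tm > 0)
        applier.lag = now - row.tm;
    // Rows without a creation time leave the previous lag untouched.
    return applier.lag;
}
```

With now = 100.0, a row stamped 99.75 sets the lag to 0.25, and a
following row with tm = 0 keeps it at 0.25. Without the guard, that second
row would set the lag to 100.0, which is the kind of jump the commit
message describes.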

-- 
Serge Petrenko

