Hi Sergey, 
 
I’m sorry to say it, but this fix does not actually fix the problem. I have to point out that these tests still fail:

[037] replication/errinj.test.lua                     memtx           [ fail ]

[037] replication/errinj.test.lua                     vinyl              [ fail ]
 
You can find the failed run here:
https://github.com/tarantool/tarantool/runs/3322606890
 
 
--
Vitaliia Ioffe
 
 
Friday, 13 August 2021, 17:25 +03:00, from Serge Petrenko <sergepetrenko@tarantool.org>:
 
upstream.lag is the delta between the moment when a row was written to
master's journal and the moment when it was received by the replica.
It's an important metric to check whether the replica has fallen too far
behind master.

Not all the rows coming from master have a valid creation time. For
example, RAFT system messages don't have one, and we can't assign a
correct time to them: these messages do not originate from the journal,
and assigning the current time to them would lead to jumps in
upstream.lag results.

Stop updating upstream.lag for rows which don't have a creation time
assigned.

This also fixes the flaky replication/errinj.test.lua.
---
https://github.com/tarantool/tarantool/tree/sp/applier-lag-fix

 src/box/applier.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/box/applier.cc b/src/box/applier.cc
index 902d0bc72..9256078e1 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -664,7 +664,8 @@ applier_read_tx_row(struct applier *applier, double timeout)
 
  coio_read_xrow_timeout_xc(coio, ibuf, row, timeout);
 
- applier->lag = ev_now(loop()) - row->tm;
+ if (row->tm > 0)
+  applier->lag = ev_now(loop()) - row->tm;
  applier->last_row_time = ev_monotonic_now(loop());
  return tx_row;
 }
--
2.30.1 (Apple Git-130)
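
For reference, the guard introduced by the patch can be sketched in isolation as below. This is only an illustration, not the actual applier code: Row, Applier, and applier_update_lag are hypothetical, simplified stand-ins, and the now parameter takes the place of ev_now(loop()). The sketch assumes that a row without a creation time carries tm == 0.

```cpp
// Hypothetical, simplified stand-ins for the real applier structures.
struct Row {
	// Creation time of the row; assumed to be 0 when the row
	// carries no timestamp (e.g. a RAFT system message).
	double tm;
};

struct Applier {
	// Last computed lag between master's journal write and receipt.
	double lag = 0.0;
};

// Update the lag only for rows that have a valid creation time.
// `now` stands in for ev_now(loop()) in the real code.
inline void
applier_update_lag(Applier &applier, const Row &row, double now)
{
	if (row.tm > 0)
		applier.lag = now - row.tm;
}
```

With this guard, a row that has no timestamp leaves the previously computed lag untouched instead of producing a jump in upstream.lag.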