QA LGTM

--
Vitaliia Ioffe

>Monday, 16 August 2021, 18:15 +03:00 from Serge Petrenko:
>
>upstream.lag is the delta between the moment when a row was written to
>master's journal and the moment when it was received by the replica.
>It's an important metric to check whether the replica has fallen too far
>behind master.
>
>Not all the rows coming from master have a valid time of creation. For
>example, RAFT system messages don't have one, and we can't assign
>correct time to them: these messages do not originate from the journal,
>and assigning current time to them would lead to jumps in upstream.lag
>results.
>
>Stop updating upstream.lag for rows which don't have creation time
>assigned.
>
>The upstream.lag calculation changes were meant to fix the flaky
>replication/errinj.test:
>
> Test failed! Result content mismatch:
> --- replication/errinj.result Fri Aug 13 15:15:35 2021
> +++ /tmp/tnt/rejects/replication/errinj.reject Fri Aug 13 15:40:39 2021
> @@ -310,7 +310,7 @@
>  ...
>  box.info.replication[1].upstream.lag < 1
>  ---
> -- true
> +- false
>  ...
>
>But the changes were not enough, because now the test
>may see the initial lag value (TIMEOUT_INFINITY).
>So fix the test as well by waiting until upstream.lag becomes < 1.
>---
> src/box/applier.cc | 3 ++-
> test/replication/errinj.result | 5 ++++-
> test/replication/errinj.test.lua | 5 ++++-
> 3 files changed, 10 insertions(+), 3 deletions(-)
>
>diff --git a/src/box/applier.cc b/src/box/applier.cc
>index 902d0bc72..9256078e1 100644
>--- a/src/box/applier.cc
>+++ b/src/box/applier.cc
>@@ -664,7 +664,8 @@ applier_read_tx_row(struct applier *applier, double timeout)
> 
> 	coio_read_xrow_timeout_xc(coio, ibuf, row, timeout);
> 
>-	applier->lag = ev_now(loop()) - row->tm;
>+	if (row->tm > 0)
>+		applier->lag = ev_now(loop()) - row->tm;
> 	applier->last_row_time = ev_monotonic_now(loop());
> 	return tx_row;
> }
>diff --git a/test/replication/errinj.result b/test/replication/errinj.result
>index 9d13f6aa7..ec251182f 100644
>--- a/test/replication/errinj.result
>+++ b/test/replication/errinj.result
>@@ -308,7 +308,10 @@ box.info.replication[1].upstream.lag > 0
> ---
> - true
> ...
>-box.info.replication[1].upstream.lag < 1
>+-- Upstream lag is huge until the first row is received.
>+test_run:wait_cond(function()\
>+    return box.info.replication[1].upstream.lag < 1\
>+end)
> ---
> - true
> ...
>diff --git a/test/replication/errinj.test.lua b/test/replication/errinj.test.lua
>index 19234ab35..7f6535ec1 100644
>--- a/test/replication/errinj.test.lua
>+++ b/test/replication/errinj.test.lua
>@@ -130,7 +130,10 @@ test_run:cmd("switch replica")
> while box.info.replication[1].upstream.status ~= 'follow' do fiber.sleep(0.0001) end
> box.info.replication[1].upstream.status
> box.info.replication[1].upstream.lag > 0
>-box.info.replication[1].upstream.lag < 1
>+-- Upstream lag is huge until the first row is received.
>+test_run:wait_cond(function()\
>+    return box.info.replication[1].upstream.lag < 1\
>+end)
> -- wait for ack timeout
> test_run:wait_upstream(1, {status='disconnected', message_re='unexpected EOF'})
> 
>-- 
>2.30.1 (Apple Git-130)
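
For anyone reading the thread later, the rule the patch introduces can be sketched in isolation roughly as below. This is only an illustration, not the applier code: `Row`, `Applier`, `update_lag` and `example_lag_after_raft_row` are hypothetical stand-ins, the clock is passed in as a plain `now` argument instead of ev_now(loop()), and the initial lag is assumed to be infinity in place of TIMEOUT_INFINITY.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for a received row (not the real xrow header).
struct Row {
    double tm; // creation time on master; 0 when the row has none
};

// Hypothetical stand-in for the applier state.
struct Applier {
    // Assumption: infinity plays the role of the initial TIMEOUT_INFINITY.
    double lag = std::numeric_limits<double>::infinity();
};

// Update the lag only for rows that carry a creation time. Rows without
// one (for example RAFT system messages) leave the previous value alone.
inline void update_lag(Applier &applier, const Row &row, double now) {
    if (row.tm > 0)
        applier.lag = now - row.tm;
}

// Usage: a journal row sets the lag, a later timestamp-less row keeps it.
inline double example_lag_after_raft_row() {
    Applier applier;
    update_lag(applier, Row{99.5}, 100.0); // journal row: lag is set
    update_lag(applier, Row{0.0}, 200.0);  // no timestamp: lag unchanged
    return applier.lag;
}
```

The sketch also shows why the test needed wait_cond: until the first timestamped row arrives, the lag keeps its initial huge value, so an immediate `lag < 1` check can fail.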