From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Date: Thu, 14 Mar 2019 19:43:51 +0300
From: Vladimir Davydov
Subject: Re: [tarantool-patches] [PATCH] evio: fix timeout calculations
Message-ID: <20190314164351.rjdljd5p2wg4f36i@esperanza>
References: <20190314151641.26876-1-sergepetrenko@tarantool.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To:
To: Serge Petrenko
Cc: tarantool-patches@freelists.org
List-ID:

On Thu, Mar 14, 2019 at 06:57:12PM +0300, Serge Petrenko wrote:
>
>
> > On 14 March 2019, at 18:16, Serge Petrenko wrote:
> >
> > The function evio_timeout_update() failed to update the starting time
> > point, which led to timeouts happening much faster than they should if
> > there were consecutive calls to the function.
> > This led, for example, to applier timing out while reading a several
> > megabyte-size row in 0.2 seconds even if replication_timeout was set to
> > 15 seconds.
> >
> > Closes #4042
> > ---
> > https://github.com/tarantool/tarantool/tree/sp/gh-4042-applier-timeout
> > https://github.com/tarantool/tarantool/issues/4042
> >
> >  src/box/xrow_io.cc                         |  4 +-
> >  src/lib/core/coio.cc                       | 18 ++--
> >  src/lib/core/coio.h                        |  2 +-
> >  src/lib/core/evio.h                        |  5 +-
> >  test/replication/long_row_timeout.result   | 98 ++++++++++++++++++++++
> >  test/replication/long_row_timeout.test.lua | 43 ++++++++++
> >  test/replication/replica_big.lua           | 12 +++
> >  test/replication/suite.cfg                 |  1 +
> >  8 files changed, 169 insertions(+), 14 deletions(-)
> >  create mode 100644 test/replication/long_row_timeout.result
> >  create mode 100644 test/replication/long_row_timeout.test.lua
> >  create mode 100644 test/replication/replica_big.lua

> diff --git a/test/replication/long_row_timeout.test.lua b/test/replication/long_row_timeout.test.lua
> index 21a522018..3993f1657 100644
> --- a/test/replication/long_row_timeout.test.lua
> +++ b/test/replication/long_row_timeout.test.lua
> @@ -10,13 +10,13 @@ test_run:cmd('create server replica with rpl_master=default, script="replication
>  test_run:cmd('start server replica')
>  box.info.replication[2].downstream.status
>  tup_sz = box.cfg.memtx_max_tuple_size
> -box.cfg{memtx_max_tuple_size = 21 * 1024 * 1024, memtx_memory = 1024 * 1024 * 1024}
> +box.cfg{memtx_max_tuple_size = 21 * 1024 * 1024}
>
>  -- insert some big rows which cannot be read in one go, so applier yields
>  -- on read a couple of times.
>  s = box.schema.space.create('test')
>  _ = s:create_index('pk')
> -for i = 1,5 do box.space.test:insert{i, require('digest').urandom(20 * 1024 * 1024)} end
> +for i = 1,5 do box.space.test:replace{1, digest.urandom(20 * 1024 * 1024)} collectgarbage('collect') end

After this change you don't need replica_big.lua anymore. I removed it.

Also, we need to call `box.snapshot()` at the end of this test to rotate
xlogs, otherwise the following test may fail trying to apply the huge
rows on a newly deployed replica during final join stage. Added it.

Pushed to 2.1 and 1.10.
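
For anyone reading the thread without the full patch at hand, the bug
described in the commit message can be sketched roughly as below. This is
only an illustration under assumptions, not the actual evio code: the
function names, the plain `double` timestamps and the `remaining_after()`
helper are all hypothetical, and the clock is simulated with one update
per second.

```c
/*
 * Hypothetical sketch of the timeout bookkeeping bug; the names below
 * are illustrative and are not Tarantool's evio API.
 */

/*
 * Buggy variant: `start` is passed by value and never advanced, so every
 * call subtracts the whole time since the first start from a delay that
 * earlier calls have already reduced.
 */
static void
timeout_update_buggy(double now, double start, double *delay)
{
	double elapsed = now - start;
	*delay = (elapsed >= *delay) ? 0 : *delay - elapsed;
}

/*
 * Fixed variant: `start` is moved forward on each call, so only the time
 * since the previous update is subtracted.
 */
static void
timeout_update_fixed(double now, double *start, double *delay)
{
	double elapsed = now - *start;
	*start = now;
	*delay = (elapsed >= *delay) ? 0 : *delay - elapsed;
}

/*
 * Return the remaining delay after `calls` consecutive updates made one
 * simulated second apart, starting from `timeout` seconds at time 0.
 */
double
remaining_after(int calls, double timeout, int use_fixed)
{
	double start = 0.0;
	double delay = timeout;
	for (int i = 1; i <= calls; i++) {
		double now = (double)i;
		if (use_fixed)
			timeout_update_fixed(now, &start, &delay);
		else
			timeout_update_buggy(now, start, &delay);
	}
	return delay;
}
```

With a 15 second timeout, `remaining_after(5, 15.0, 0)` reaches zero after
only five simulated seconds, while `remaining_after(5, 15.0, 1)` still has
10 seconds left. The more often the update is called (as when a large row
is read in many small chunks), the faster the buggy variant runs out.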