From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Date: Sun, 21 Oct 2018 23:41:22 +0300
From: Alexander Turenko
Subject: Re: [PATCH v2 2/5] test: errinj for pause relay_send
Message-ID: <20181021204121.c7vucdv4nhblodwy@tkn_work_nb>
References: <20181019161721.49560-1-sergw@tarantool.org> <20181019161721.49560-3-sergw@tarantool.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20181019161721.49560-3-sergw@tarantool.org>
To: Sergei Voronezhskii
Cc: tarantool-patches@freelists.org, Vladimir Davydov
List-ID:

Hi!

I don't have objections in general. Some minor comments are below.

Please, answer with fixes, don't just send the whole patch.

WBR, Alexander Turenko.

> Instead of using timeout we need just pause `relay_send`. Can't rely
> on timeout because of various system load in parallel mode. Add new
> errinj which checks boolean in loop and until it is not `True` do not
> pass the method `relay_send` to the next statement.
>
> To check the read-only mode, need to make a modification of tuple. It
> is enough to call `replace` method. Instead of `delete` and then
> useless verification that we have not delete space by using `get`
> method.
>

delete space -> delete tuple?

> +-- In the next two cases we try to replace a tuple while replica
> +-- is catching up with the master (local delete, remote delete)

delete -> replace

> +-- case

Nit: period at the end.

> --- check sync
> -errinj.set("ERRINJ_RELAY_TIMEOUT", 0)
> +-- Resume replicaton.

replicaton -> replication

> diff --git a/test/replication/gc.test.lua b/test/replication/gc.test.lua
> index 5100378b3..22921289d 100644
> --- a/test/replication/gc.test.lua
> +++ b/test/replication/gc.test.lua
> @@ -12,6 +12,7 @@ default_checkpoint_count = box.cfg.checkpoint_count
>  box.cfg{checkpoint_count = 1}
>
>  function wait_gc(n) while #box.info.gc().checkpoints > n do fiber.sleep(0.01) end end
> +function wait_xlog(n, timeout) timeout = timeout or 1.0 return test_run:wait_cond(function() return #fio.glob('./master/*.xlog') == n end, timeout) end
>

Use 'setopt delimiter' and write it on several lines.

Also, I propose below to support 'n' being a table, so that the file
count may be one of several values. You can use an auxiliary function
like the following together with a type(n) == 'table' check.

function value_in(val, arr)
    for _, elem in ipairs(arr) do
        if val == elem then
            return true
        end
    end
    return false
end

> @@ -31,7 +32,7 @@ for i = 1, 100 do s:auto_increment{} end
>
>  -- Make sure replica join will take long enough for us to
>  -- invoke garbage collection.
> -box.error.injection.set("ERRINJ_RELAY_TIMEOUT", 0.05)
> +box.error.injection.set("ERRINJ_RELAY_SEND_DELAY", true)
>
>  -- While the replica is receiving the initial data set,
>  -- make a snapshot and invoke garbage collection, then
> @@ -41,7 +42,7 @@ test_run:cmd("setopt delimiter ';'")
>  fiber.create(function()
>      fiber.sleep(0.1)
>      box.snapshot()
> -    box.error.injection.set("ERRINJ_RELAY_TIMEOUT", 0)
> +    box.error.injection.set("ERRINJ_RELAY_SEND_DELAY", false)
>  end)
>  test_run:cmd("setopt delimiter ''");
>

The entire comment:

> -- While the replica is receiving the initial data set,
> -- make a snapshot and invoke garbage collection, then
> -- remove the timeout injection so that we don't have to
> -- wait too long for the replica to start.

Proposed: then remove delay to allow replica to start.

> --- Remove the timeout injection so that the replica catches
> +-- Resume replicaton so that the replica catches

replicaton -> replication

> @@ -146,17 +147,16 @@ box.snapshot()
>  _ = s:auto_increment{}
>  box.snapshot()
>  #box.info.gc().checkpoints == 1 or box.info.gc()
> -xlog_count = #fio.glob('./master/*.xlog')
>  -- the replica may have managed to download all data
>  -- from xlog #1 before it was stopped, in which case
>  -- it's OK to collect xlog #1
> -xlog_count == 3 or xlog_count == 2 or fio.listdir('./master')
> +wait_xlog(3, 0.1) or wait_xlog(2, 0.1) or fio.listdir('./master')

You set the timeout to 1.0 for other cases, but only 0.2 in total here
(0.1 + 0.1). So, is 0.2 enough?

It is better to let the function accept a table like {2, 3} as the
file count. Use 'setopt delimiter' and update the function.
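
To illustrate the table variant, here is a rough sketch, not run against
the test suite. The name count_matches is hypothetical (my own, not from
the patch); value_in is the helper proposed above. The wait_xlog wrapper
is left as a comment because it needs test_run and fio from the test
environment, and it assumes wait_cond takes the arguments in the same
order as in your patch.

```lua
-- Return true if `val` equals one of the elements of the array `arr`.
function value_in(val, arr)
    for _, elem in ipairs(arr) do
        if val == elem then
            return true
        end
    end
    return false
end

-- Hypothetical helper: `expected` is either a single number or a table
-- of acceptable numbers such as {2, 3}.
function count_matches(count, expected)
    if type(expected) == 'table' then
        return value_in(count, expected)
    end
    return count == expected
end

-- Assumed shape of the wrapper in gc.test.lua (commented out here,
-- since test_run and fio exist only inside the test environment):
--
-- test_run:cmd("setopt delimiter ';'")
-- function wait_xlog(n, timeout)
--     timeout = timeout or 1.0
--     return test_run:wait_cond(function()
--         return count_matches(#fio.glob('./master/*.xlog'), n)
--     end, timeout)
-- end;
-- test_run:cmd("setopt delimiter ''");
```

With that, the check could become a single call such as
wait_xlog({2, 3}, 0.1) or fio.listdir('./master'), so there is one
timeout instead of two chained ones.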