Tarantool development patches archive
From: "Alexander V. Tikhonov" <avtikhon@tarantool.org>
To: Nikita Pettik <korablev@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH v1] test: fix hang of vinyl/select_consistency.* test
Date: Wed, 2 Dec 2020 21:57:43 +0300
Message-ID: <20201202185743.GA232692@hpalx>
In-Reply-To: <20201116210654.GA13996@tarantool.org>

Hi Nikita, thanks for the review. I've applied all of your suggestions;
please see my replies below.

On Mon, Nov 16, 2020 at 09:06:54PM +0000, Nikita Pettik wrote:
> On 15 Nov 21:43, Alexander V. Tikhonov wrote:
> > 
> > It happened because on heavily loaded hosts a situation may occur
> > where the previous snapshot was still in progress when a new snapshot
> > started with the same file name, *.snap.inprogress. This happens before
> > the current snapshot completes and prints "dump completed" to the log
> > file. This *.snap.inprogress file was also seen left behind during
> > manual debugging, when the test hung before this patch. To resolve the
> > test issue, the fiber sleep delay after it could be increased, but we
> > want to keep the issue reproducible. The current patch corrects the
> > test to avoid a hang on
> 
> I guess increasing sleep and filing exact repro with 'bug' label is
> enough to deal with this test.
>

OK, I've added instructions on how to reproduce the problem to the issue
and increased the sleep from 0.1 to 0.5.

> > box snapshot, to be able to continue testing after it failed. Fiber
> > sleep was even decreased after adding fiber for box.snapshot to be
> > able to reproduce the issue.
> > 
> > Needed for #4385
> > ---
> > @@ -75,8 +80,13 @@ end;
> >  ...
> >  function snap_loop()
> >      while not stop do
> > -        box.snapshot()
> > -        fiber.sleep(0.1)
> > +        local ok, err = fiber.create(function() local ok, err = pcall(box.snapshot) return ok, err end)
> 
> Why not simply wrap box.snapshot() in pcall? Why do you need another
> one separate fiber for it?
> 

I found that the test hangs without the separate fiber. I've also
reworked this part a bit.
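
For illustration only (this is a sketch of the idea, not the exact patch
text): the snapshot runs in its own fiber under pcall, so an error from
box.snapshot() is logged instead of stopping the loop. The helper name
try_snapshot and the local `stop` flag are assumptions made for this
example, and it expects box.cfg{} to have been called already.

```lua
local fiber = require('fiber')
local log = require('log')

local stop = false  -- the test sets this to true to end the loop

-- Hypothetical helper: run box.snapshot() under pcall and log any error.
local function try_snapshot()
    local ok, err = pcall(box.snapshot)
    if not ok then
        log.error('box.snapshot() failed: ' .. tostring(err))
    end
end

local function snap_loop()
    while not stop do
        -- Run the snapshot in a separate fiber so that this loop keeps
        -- going even if the snapshot takes a long time or fails.
        fiber.create(try_snapshot)
        fiber.sleep(0.5)
    end
end
```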

> > @@ -97,10 +107,7 @@ _ = fiber.create(dml_loop);
> >  _ = fiber.create(snap_loop);
> >  ---
> >  ...
> > -failed = {};
> > ----
> > -...
> > -for i = 1, 10000 do
> > +function run_iter()
> 
> I'd call it select_loop() (as dml/snap loops above).
>

Sure, changed.

> >      local val = math.random(MAX_VAL)
> >      box.begin()
> >      local res1 = s.index.i1:select({val})
> > @@ -117,25 +124,37 @@ for i = 1, 10000 do
> >                  end
> >              end
> >              if not found then
> > +                log.error("error: equal not found for #res1 = " .. #res1 .. ", #res2 = " .. #res2)
> 
> What's the point of printing number of elements in res1/res2?
> Why not print the whole result set?
> 

Right, added.

> > +end;
> > +---
> > +...
> > +for i = 1, 10000 do
> > +    if failed or not run_iter(i) then
> > +        log.error("error: failed on iteration " .. i)
> > +        failed = true
> > +        break
> > +    end
> >  end;
> > +function check_get()
> > +    for i = 1, ch:size() do
> > +        if not test_run:wait_cond(function() return ch:get() ~= nil end) then
> 
> As I see from docs, ch:get() has optional timeout param. Why not use it?
> I mean not only this particular test, but in general.
> 

Added a timeout, defined as a local constant in the test, to make it
easy to find all the tests with the same change in the future.
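
Roughly along these lines; a sketch only, where the constant name
TIMEOUT and its value are placeholders rather than the names used in the
patch, and the number of expected messages is passed in explicitly.
ch:get(timeout) returns nil when nothing arrives within the timeout.

```lua
local log = require('log')

-- Placeholder name and value, kept in one place so it is easy to grep for.
local TIMEOUT = 10

-- Take `count` messages from channel `ch`; give up if any get() times out.
local function check_get(ch, count)
    for i = 1, count do
        -- ch:get(timeout) returns nil if no message arrives in time.
        if ch:get(TIMEOUT) == nil then
            log.error('error: timed out in ch:get() on iteration ' .. i)
            return false
        end
    end
    return true
end
```

With a channel created by fiber.channel(2) and two values already put
into it, check_get(ch, 2) should return true without waiting.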

> > +            log.error("error: hanged on ch:get() on iteration " .. i)
> > +            return false
> > +	end
> > +    end
> > +    return true
> >  end;
> >  ---
> >  ...
> > @@ -143,10 +162,14 @@ test_run:cmd("setopt delimiter ''");
> >  ---
> >  - true
> >  ...
> > -#failed == 0 or failed
> > +test_run:wait_cond(function() return check_get() end)
> 
> Why do you need another one wait_cond over check_get?
> 

Right, there's no need for it now.

