From: Nikita Pettik <korablev@tarantool.org> To: "Alexander V. Tikhonov" <avtikhon@tarantool.org> Cc: tarantool-patches@dev.tarantool.org Subject: Re: [Tarantool-patches] [PATCH v2] test: fix hang of vinyl/select_consistency.* test Date: Mon, 7 Dec 2020 12:22:23 +0000 [thread overview] Message-ID: <20201207122223.GA21104@tarantool.org> (raw) In-Reply-To: <aa4847988d8cbf3f4e1a628fa7d97fb732ae95e8.1606932444.git.avtikhon@tarantool.org> On 02 Dec 21:09, Alexander V. Tikhonov wrote: > --- vinyl/select_consistency.result Thu Jun 11 11:59:48 2020 > +++ var/152_vinyl/select_consistency.result Tue Oct 20 09:13:02 2020 > @@ -137,16 +137,3 @@ > for i = 1, ch:size() do > ch:get() > end; > ---- > -... > -test_run:cmd("setopt delimiter ''"); > ---- > -- true > -... > > This issue stops the testing till the complete timeout of the test-run > will stop all the testing process. Hm, could we patch test-run so that timeout of single test won't stop the whole testing process? > It happened because on heavy loaded hosts may occure the situation > when the previous snapshot was inprogress when the new snapshot came > with the same file name *.snap.inprogress. It happens before the > current snapshot completed and printed "dump completed" in log file. > Also this file *.snap.inprogress was seen left on manual debug, when > the test hanged before this patch. To resolve the test issue fiber > sleep delay after it can be increased, but we want to save the issue > reproducable. The current patch corrects the test to avoid of hang on > box snapshot, to be able to continue testing after it failed. > > Needed for #4385 > --- > > Github: https://github.com/tarantool/tarantool/tree/avtikhon/gh-4385_hang_select_consistency > Issue: https://github.com/tarantool/tarantool/issues/4385 > > test/vinyl/select_consistency.result | 67 ++++++++++++++++++++------ > test/vinyl/select_consistency.test.lua | 61 ++++++++++++++++++----- > test/vinyl/suite.ini | 3 +- > 3 files changed, 103 insertions(+), 28 deletions(-) > > diff --git a/test/vinyl/select_consistency.result b/test/vinyl/select_consistency.result > index f27d0ec68..f6d96473d 100644 > --- a/test/vinyl/select_consistency.result > +++ b/test/vinyl/select_consistency.result > @@ -4,6 +4,9 @@ test_run = require('test_run').new() > fiber = require 'fiber' > --- > ... > +log = require 'log' > +--- > +... > math.randomseed(os.time()) > --- > ... > @@ -52,6 +55,8 @@ function gen_update() > end; > --- > ... > +failed = false > + > function dml_loop() > local insert = true > while not stop do > @@ -75,8 +80,18 @@ end; > ... > function snap_loop() > while not stop do > - box.snapshot() > - fiber.sleep(0.1) > + local ok = fiber.create(function() Still don't understand necessity of creating another one fiber. snap_loop() is itself executed in a separate fiber. I suggest you to dig a bit into this problem, it may hide another issue. > + local ok, err = pcall(box.snapshot) > + if ok == false then > + log.error("error: box.snapshot failed with error " .. err) > + end > + return ok > + end) > + if ok == false then > + failed = true > + break > + end > + fiber.sleep(0.5) > end > ch:put(true) > end; > @@ -97,10 +112,7 @@ _ = fiber.create(dml_loop); > _ = fiber.create(snap_loop); > --- > ... > -failed = {}; > ---- > -... > -for i = 1, 10000 do > +function select_loop() > local val = math.random(MAX_VAL) > box.begin() > local res1 = s.index.i1:select({val}) > @@ -117,25 +129,48 @@ for i = 1, 10000 do > end > end > if not found then > + log.error("error: equal not found for res1 and res2:") > + local ts1 = '' > + local ts2 = '' > + for _, t1 in ipairs(res1) do > + ts1 = ts1 .. " " .. t1[1] > + end > + log.info("error: equal not found for res1" .. ts1) Why the first log is error meanwhile second and third are info? I suggest to make them both error to make it consistent. > + for _, t2 in ipairs(res2) do > + ts2 = ts2 .. " " .. t2[1] > + end > + log.info("error: equal not found for res2" .. ts2) > equal = false > test_run:cmd("setopt delimiter ''"); > > -#failed == 0 or failed > +test_run:wait_cond(function() return check_get() end) > +failed In previous review iter you approved that there's no need in :wait_cond() here, but still it is presented. > - "issues": [ "gh-4385" ] > + "issues": [ "gh-4385" ], > + "checksums": [ "94d5359d451ee7bfdb35a037f80b14ff", "f7e8ebc72f4d59941b4ca10fde474963" ] Why there are two checksums? How to point out corresponding to particular checksum error? Moreover, with increased timeout this test shouldn't be flacky anymore, should it?
next prev parent reply other threads:[~2020-12-07 12:22 UTC|newest] Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-12-02 18:09 Alexander V. Tikhonov 2020-12-07 12:22 ` Nikita Pettik [this message] 2020-12-02 18:56 Alexander V. Tikhonov
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20201207122223.GA21104@tarantool.org \ --to=korablev@tarantool.org \ --cc=avtikhon@tarantool.org \ --cc=tarantool-patches@dev.tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH v2] test: fix hang of vinyl/select_consistency.* test' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox