Tarantool development patches archive
 help / color / mirror / Atom feed
From: Serge Petrenko <sergepetrenko@tarantool.org>
To: Alexander Turenko <alexander.turenko@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org,
	Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH v3 0/3] fiber.top(): minor fixup
Date: Wed, 20 Nov 2019 11:29:26 +0300	[thread overview]
Message-ID: <B94EEDDF-256A-475F-AC66-4E54FB8228F9@tarantool.org> (raw)
In-Reply-To: <20191120014409.f7r2inhv5on4watf@tkn_work_nb>

[-- Attachment #1: Type: text/plain, Size: 6581 bytes --]

Hi! Thanks for your reply!
--
Serge Petrenko
sergepetrenko@tarantool.org




> 20 нояб. 2019 г., в 4:44, Alexander Turenko <alexander.turenko@tarantool.org> написал(а):
> 
> It still shows miscompares for me:
> 
> $ cat /proc/cpuinfo | grep processor | wc -l
> 8
> $ (cd test && ./test-run.py $(yes fiber.test.lua | head -n 1000))
> 
> commit aefc64faaf7bfbb58ac53bf5abea55d1faabcbfe
> 
> The miscompare looks so:
> 
> [006] Test failed! Result content mismatch:
> [006] --- app/fiber.result	Tue Nov 19 20:39:21 2019
> [006] +++ app/fiber.reject	Wed Nov 20 04:33:39 2019
> [006] @@ -1525,7 +1525,7 @@
> [006]  -- disable the test above.
> [006]  sum_avg <= 100 or sum_avg
> [006]  ---
> [006] -- true
> [006] +- 100
> [006]  ...
> [006]  -- not exact due to accumulated integer division errors
> [006]  --sum_avg > 99 and sum_avg <= 100 or sum_avg
> 
> Maybe it is some rounding issue.
> 
> When I changed the condition to sum_avg <= 101, I got:
> 
> [015] Test failed! Result content mismatch:
> [015] --- app/fiber.result	Wed Nov 20 04:35:14 2019
> [015] +++ app/fiber.reject	Wed Nov 20 04:35:36 2019
> [015] @@ -1512,7 +1512,7 @@
> [015]  ...
> [015]  sum_inst
> [015]  ---
> [015] -- 100
> [015] +- 45.802853351342
> [015]  ...
> [015]  -- when a fiber dies, its impact on the thread moving average
> [015]  -- persists for a couple of ev loop iterations, but it is no
> 
> Sorry, but it does not look okay for me in the sense that it may lead to
> problems in testing: we already have enough ones.
> 
> Let's provide a test that will work stable or describe why it is not
> possible. In the latter case we can extract an unstable test case and
> mark it as fragile using suite.ini test-run's option. It will not
> strictly close the problem, but will lower its probability.
> 
> Maybe we can even retry such tests or allows them to fail:
> https://github.com/tarantool/test-run/issues/189
> 
> Anyway, if those fixes descrease probability of fails, I don't mind.
> Just noted that a further work may be needed.

Well, yes, they do. Just try to run this test before the patch.
Anyway, I guess you’re right that we shouldn’t leave flaky tests if
possible, so I commented the flaky pieces out and added a request to
reenable them once #4625 (List dead fibers in fiber.top <http://fiber.top/>() output) is
implemented.

The diff’s below. Now 1000 out of 1000 tests pass with 16 concurrent
jobs, at least on my machine.


> 
> WBR, Alexander Turenko.
> 
> On Tue, Nov 19, 2019 at 11:57:11PM +0100, Vladislav Shpilevoy wrote:
>> Hi! Thanks for the fixes!
>> 
>> LGTM.
>> 
>> On 18/11/2019 17:05, Serge Petrenko wrote:
>>> The first patch factors all the clock-related members of struct fiber and struct
>>> cord into appropriate structs and adds methods for their updates and resets.
>>> 
>>> The second patch fixes exponential moving average calculation so that we do not
>>> experience huge numbers in average load percentage calculations.
>>> 
>>> The third patch alters fiber.top() test to wait for correct output before
>>> testing it. 
>>> 
>>> Follow-up https://github.com/tarantool/tarantool/issues/2694
>>> Branch https://github.com/tarantool/tarantool/tree/sp/gh-2694-test-fixup
>>> 
>>> Changes in v3:
>>> - introduce a new patch which refactors all the clock stat
>>>   handling.
>>> - introduce a patch fixing EMA calculation
>>> - review fixes as per review from Vladislav
>>> 
>>> Changes in v2:
>>> - clean up all fibers clock stats on fiber.top_enable()
>>> - push 0 instead of NaN when cord clock_delta_last is 0
>>> - review fixes as per review from Vladislav
>>> 
>>> Serge Petrenko (3):
>>>  fiber.top() refactor clock and cpu time calculation
>>>  fiber.top(): alter exponential moving average calculation
>>>  app/fiber: wait till a full event loop iteration ends
>>> 
>>> src/lib/core/fiber.c    | 175 ++++++++++++++++++++++++----------------
>>> src/lib/core/fiber.h    | 109 +++++++++++++++++--------
>>> src/lua/fiber.c         |  20 +++--
>>> test/app/fiber.result   |  38 ++++++---
>>> test/app/fiber.test.lua |  36 +++++++--
>>> 5 files changed, 252 insertions(+), 126 deletions(-)
>>> 


diff --git a/test/app/fiber.result b/test/app/fiber.result
index b767bbb59..6d9604ad8 100644
--- a/test/app/fiber.result
+++ b/test/app/fiber.result
@@ -1510,25 +1510,21 @@ for k, v in pairs(a) do\
 end
 ---
 ...
-sum_inst
----
-- 100
-...
 -- when a fiber dies, its impact on the thread moving average
 -- persists for a couple of ev loop iterations, but it is no
 -- longer listed in fiber.top(). So sum_avg may way smaller than
--- 100%. See gh-4625 for details and reenable the test below as
+-- 100%. See gh-4625 for details and reenable both tests below as
 -- soon as it is implemented.
 -- In rare cases when a fiber dies on the same event loop
 -- iteration as you issue fiber.top(), sum_inst will also be
--- smaller than 100%, but it is so rare I don't even want to
--- disable the test above.
-sum_avg <= 100 or sum_avg
+-- smaller than 100%.
+-- sum_inst
+sum_avg <= 100.1 or sum_avg
 ---
 - true
 ...
 -- not exact due to accumulated integer division errors
---sum_avg > 99 and sum_avg <= 100 or sum_avg
+--sum_avg > 99 and sum_avg <= 100.1 or sum_avg
 tbl = nil
 ---
 ...
diff --git a/test/app/fiber.test.lua b/test/app/fiber.test.lua
index 7f7350acc..6df210d9c 100644
--- a/test/app/fiber.test.lua
+++ b/test/app/fiber.test.lua
@@ -658,19 +658,18 @@ for k, v in pairs(a) do\
     sum_avg = sum_avg + v["average"]\
 end
 
-sum_inst
 -- when a fiber dies, its impact on the thread moving average
 -- persists for a couple of ev loop iterations, but it is no
 -- longer listed in fiber.top(). So sum_avg may way smaller than
--- 100%. See gh-4625 for details and reenable the test below as
+-- 100%. See gh-4625 for details and reenable both tests below as
 -- soon as it is implemented.
 -- In rare cases when a fiber dies on the same event loop
 -- iteration as you issue fiber.top(), sum_inst will also be
--- smaller than 100%, but it is so rare I don't even want to
--- disable the test above.
-sum_avg <= 100 or sum_avg
+-- smaller than 100%.
+-- sum_inst
+sum_avg <= 100.1 or sum_avg
 -- not exact due to accumulated integer division errors
---sum_avg > 99 and sum_avg <= 100 or sum_avg
+--sum_avg > 99 and sum_avg <= 100.1 or sum_avg
 tbl = nil
 f = fiber.new(function()\
     local fiber_key = fiber.self().id()..'/'..fiber.self().name()\

[-- Attachment #2: Type: text/html, Size: 22357 bytes --]

  reply	other threads:[~2019-11-20  8:29 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-18 16:05 Serge Petrenko
2019-11-18 16:05 ` [Tarantool-patches] [PATCH v3 1/3] fiber.top() refactor clock and cpu time calculation Serge Petrenko
2019-11-18 22:25   ` Vladislav Shpilevoy
2019-11-19  7:14     ` Sergey Petrenko
2019-11-18 16:05 ` [Tarantool-patches] [PATCH v3 2/3] fiber.top(): alter exponential moving average calculation Serge Petrenko
2019-11-18 16:05 ` [Tarantool-patches] [PATCH v3 3/3] app/fiber: wait till a full event loop iteration ends Serge Petrenko
2019-11-19 22:57 ` [Tarantool-patches] [PATCH v3 0/3] fiber.top(): minor fixup Vladislav Shpilevoy
2019-11-20  1:44   ` Alexander Turenko
2019-11-20  8:29     ` Serge Petrenko [this message]
2019-11-20 12:19       ` Alexander Turenko
2019-11-21 17:07 ` Kirill Yukhin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=B94EEDDF-256A-475F-AC66-4E54FB8228F9@tarantool.org \
    --to=sergepetrenko@tarantool.org \
    --cc=alexander.turenko@tarantool.org \
    --cc=tarantool-patches@dev.tarantool.org \
    --cc=v.shpilevoy@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH v3 0/3] fiber.top(): minor fixup' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox