From: Serge Petrenko <sergepetrenko@tarantool.org>
To: Alexander Turenko <alexander.turenko@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org,
Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH v3 0/3] fiber.top(): minor fixup
Date: Wed, 20 Nov 2019 11:29:26 +0300 [thread overview]
Message-ID: <B94EEDDF-256A-475F-AC66-4E54FB8228F9@tarantool.org> (raw)
In-Reply-To: <20191120014409.f7r2inhv5on4watf@tkn_work_nb>
[-- Attachment #1: Type: text/plain, Size: 6581 bytes --]
Hi! Thanks for your reply!
--
Serge Petrenko
sergepetrenko@tarantool.org
> 20 нояб. 2019 г., в 4:44, Alexander Turenko <alexander.turenko@tarantool.org> написал(а):
>
> It still shows miscompares for me:
>
> $ cat /proc/cpuinfo | grep processor | wc -l
> 8
> $ (cd test && ./test-run.py $(yes fiber.test.lua | head -n 1000))
>
> commit aefc64faaf7bfbb58ac53bf5abea55d1faabcbfe
>
> The miscompare looks so:
>
> [006] Test failed! Result content mismatch:
> [006] --- app/fiber.result Tue Nov 19 20:39:21 2019
> [006] +++ app/fiber.reject Wed Nov 20 04:33:39 2019
> [006] @@ -1525,7 +1525,7 @@
> [006] -- disable the test above.
> [006] sum_avg <= 100 or sum_avg
> [006] ---
> [006] -- true
> [006] +- 100
> [006] ...
> [006] -- not exact due to accumulated integer division errors
> [006] --sum_avg > 99 and sum_avg <= 100 or sum_avg
>
> Maybe it is some rounding issue.
>
> When I changed the condition to sum_avg <= 101, I got:
>
> [015] Test failed! Result content mismatch:
> [015] --- app/fiber.result Wed Nov 20 04:35:14 2019
> [015] +++ app/fiber.reject Wed Nov 20 04:35:36 2019
> [015] @@ -1512,7 +1512,7 @@
> [015] ...
> [015] sum_inst
> [015] ---
> [015] -- 100
> [015] +- 45.802853351342
> [015] ...
> [015] -- when a fiber dies, its impact on the thread moving average
> [015] -- persists for a couple of ev loop iterations, but it is no
>
> Sorry, but it does not look okay for me in the sense that it may lead to
> problems in testing: we already have enough ones.
>
> Let's provide a test that will work stable or describe why it is not
> possible. In the latter case we can extract an unstable test case and
> mark it as fragile using suite.ini test-run's option. It will not
> strictly close the problem, but will lower its probability.
>
> Maybe we can even retry such tests or allows them to fail:
> https://github.com/tarantool/test-run/issues/189
>
> Anyway, if those fixes descrease probability of fails, I don't mind.
> Just noted that a further work may be needed.
Well, yes, they do. Just try to run this test before the patch.
Anyway, I guess you’re right that we shouldn’t leave flaky tests if
possible, so I commented the flaky pieces out and added a request to
reenable them once #4625 (List dead fibers in fiber.top <http://fiber.top/>() output) is
implemented.
The diff’s below. Now 1000 out of 1000 tests pass with 16 concurrent
jobs, at least on my machine.
>
> WBR, Alexander Turenko.
>
> On Tue, Nov 19, 2019 at 11:57:11PM +0100, Vladislav Shpilevoy wrote:
>> Hi! Thanks for the fixes!
>>
>> LGTM.
>>
>> On 18/11/2019 17:05, Serge Petrenko wrote:
>>> The first patch factors all the clock-related members of struct fiber and struct
>>> cord into appropriate structs and adds methods for their updates and resets.
>>>
>>> The second patch fixes exponential moving average calculation so that we do not
>>> experience huge numbers in average load percentage calculations.
>>>
>>> The third patch alters fiber.top() test to wait for correct output before
>>> testing it.
>>>
>>> Follow-up https://github.com/tarantool/tarantool/issues/2694
>>> Branch https://github.com/tarantool/tarantool/tree/sp/gh-2694-test-fixup
>>>
>>> Changes in v3:
>>> - introduce a new patch which refactors all the clock stat
>>> handling.
>>> - introduce a patch fixing EMA calculation
>>> - review fixes as per review from Vladislav
>>>
>>> Changes in v2:
>>> - clean up all fibers clock stats on fiber.top_enable()
>>> - push 0 instead of NaN when cord clock_delta_last is 0
>>> - review fixes as per review from Vladislav
>>>
>>> Serge Petrenko (3):
>>> fiber.top() refactor clock and cpu time calculation
>>> fiber.top(): alter exponential moving average calculation
>>> app/fiber: wait till a full event loop iteration ends
>>>
>>> src/lib/core/fiber.c | 175 ++++++++++++++++++++++++----------------
>>> src/lib/core/fiber.h | 109 +++++++++++++++++--------
>>> src/lua/fiber.c | 20 +++--
>>> test/app/fiber.result | 38 ++++++---
>>> test/app/fiber.test.lua | 36 +++++++--
>>> 5 files changed, 252 insertions(+), 126 deletions(-)
>>>
diff --git a/test/app/fiber.result b/test/app/fiber.result
index b767bbb59..6d9604ad8 100644
--- a/test/app/fiber.result
+++ b/test/app/fiber.result
@@ -1510,25 +1510,21 @@ for k, v in pairs(a) do\
end
---
...
-sum_inst
----
-- 100
-...
-- when a fiber dies, its impact on the thread moving average
-- persists for a couple of ev loop iterations, but it is no
-- longer listed in fiber.top(). So sum_avg may way smaller than
--- 100%. See gh-4625 for details and reenable the test below as
+-- 100%. See gh-4625 for details and reenable both tests below as
-- soon as it is implemented.
-- In rare cases when a fiber dies on the same event loop
-- iteration as you issue fiber.top(), sum_inst will also be
--- smaller than 100%, but it is so rare I don't even want to
--- disable the test above.
-sum_avg <= 100 or sum_avg
+-- smaller than 100%.
+-- sum_inst
+sum_avg <= 100.1 or sum_avg
---
- true
...
-- not exact due to accumulated integer division errors
---sum_avg > 99 and sum_avg <= 100 or sum_avg
+--sum_avg > 99 and sum_avg <= 100.1 or sum_avg
tbl = nil
---
...
diff --git a/test/app/fiber.test.lua b/test/app/fiber.test.lua
index 7f7350acc..6df210d9c 100644
--- a/test/app/fiber.test.lua
+++ b/test/app/fiber.test.lua
@@ -658,19 +658,18 @@ for k, v in pairs(a) do\
sum_avg = sum_avg + v["average"]\
end
-sum_inst
-- when a fiber dies, its impact on the thread moving average
-- persists for a couple of ev loop iterations, but it is no
-- longer listed in fiber.top(). So sum_avg may way smaller than
--- 100%. See gh-4625 for details and reenable the test below as
+-- 100%. See gh-4625 for details and reenable both tests below as
-- soon as it is implemented.
-- In rare cases when a fiber dies on the same event loop
-- iteration as you issue fiber.top(), sum_inst will also be
--- smaller than 100%, but it is so rare I don't even want to
--- disable the test above.
-sum_avg <= 100 or sum_avg
+-- smaller than 100%.
+-- sum_inst
+sum_avg <= 100.1 or sum_avg
-- not exact due to accumulated integer division errors
---sum_avg > 99 and sum_avg <= 100 or sum_avg
+--sum_avg > 99 and sum_avg <= 100.1 or sum_avg
tbl = nil
f = fiber.new(function()\
local fiber_key = fiber.self().id()..'/'..fiber.self().name()\
[-- Attachment #2: Type: text/html, Size: 22357 bytes --]
next prev parent reply other threads:[~2019-11-20 8:29 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-11-18 16:05 Serge Petrenko
2019-11-18 16:05 ` [Tarantool-patches] [PATCH v3 1/3] fiber.top() refactor clock and cpu time calculation Serge Petrenko
2019-11-18 22:25 ` Vladislav Shpilevoy
2019-11-19 7:14 ` Sergey Petrenko
2019-11-18 16:05 ` [Tarantool-patches] [PATCH v3 2/3] fiber.top(): alter exponential moving average calculation Serge Petrenko
2019-11-18 16:05 ` [Tarantool-patches] [PATCH v3 3/3] app/fiber: wait till a full event loop iteration ends Serge Petrenko
2019-11-19 22:57 ` [Tarantool-patches] [PATCH v3 0/3] fiber.top(): minor fixup Vladislav Shpilevoy
2019-11-20 1:44 ` Alexander Turenko
2019-11-20 8:29 ` Serge Petrenko [this message]
2019-11-20 12:19 ` Alexander Turenko
2019-11-21 17:07 ` Kirill Yukhin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=B94EEDDF-256A-475F-AC66-4E54FB8228F9@tarantool.org \
--to=sergepetrenko@tarantool.org \
--cc=alexander.turenko@tarantool.org \
--cc=tarantool-patches@dev.tarantool.org \
--cc=v.shpilevoy@tarantool.org \
--subject='Re: [Tarantool-patches] [PATCH v3 0/3] fiber.top(): minor fixup' \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox