[Tarantool-patches] [PATCH v3 0/3] fiber.top(): minor fixup
Serge Petrenko
sergepetrenko at tarantool.org
Wed Nov 20 11:29:26 MSK 2019
Hi! Thanks for your reply!
--
Serge Petrenko
sergepetrenko at tarantool.org
> 20 нояб. 2019 г., в 4:44, Alexander Turenko <alexander.turenko at tarantool.org> написал(а):
>
> It still shows miscompares for me:
>
> $ cat /proc/cpuinfo | grep processor | wc -l
> 8
> $ (cd test && ./test-run.py $(yes fiber.test.lua | head -n 1000))
>
> commit aefc64faaf7bfbb58ac53bf5abea55d1faabcbfe
>
> The miscompare looks so:
>
> [006] Test failed! Result content mismatch:
> [006] --- app/fiber.result Tue Nov 19 20:39:21 2019
> [006] +++ app/fiber.reject Wed Nov 20 04:33:39 2019
> [006] @@ -1525,7 +1525,7 @@
> [006] -- disable the test above.
> [006] sum_avg <= 100 or sum_avg
> [006] ---
> [006] -- true
> [006] +- 100
> [006] ...
> [006] -- not exact due to accumulated integer division errors
> [006] --sum_avg > 99 and sum_avg <= 100 or sum_avg
>
> Maybe it is some rounding issue.
>
> When I changed the condition to sum_avg <= 101, I got:
>
> [015] Test failed! Result content mismatch:
> [015] --- app/fiber.result Wed Nov 20 04:35:14 2019
> [015] +++ app/fiber.reject Wed Nov 20 04:35:36 2019
> [015] @@ -1512,7 +1512,7 @@
> [015] ...
> [015] sum_inst
> [015] ---
> [015] -- 100
> [015] +- 45.802853351342
> [015] ...
> [015] -- when a fiber dies, its impact on the thread moving average
> [015] -- persists for a couple of ev loop iterations, but it is no
>
> Sorry, but it does not look okay for me in the sense that it may lead to
> problems in testing: we already have enough ones.
>
> Let's provide a test that will work stable or describe why it is not
> possible. In the latter case we can extract an unstable test case and
> mark it as fragile using suite.ini test-run's option. It will not
> strictly close the problem, but will lower its probability.
>
> Maybe we can even retry such tests or allows them to fail:
> https://github.com/tarantool/test-run/issues/189
>
> Anyway, if those fixes descrease probability of fails, I don't mind.
> Just noted that a further work may be needed.
Well, yes, they do. Just try to run this test before the patch.
Anyway, I guess you’re right that we shouldn’t leave flaky tests if
possible, so I commented the flaky pieces out and added a request to
reenable them once #4625 (List dead fibers in fiber.top <http://fiber.top/>() output) is
implemented.
The diff’s below. Now 1000 out of 1000 tests pass with 16 concurrent
jobs, at least on my machine.
>
> WBR, Alexander Turenko.
>
> On Tue, Nov 19, 2019 at 11:57:11PM +0100, Vladislav Shpilevoy wrote:
>> Hi! Thanks for the fixes!
>>
>> LGTM.
>>
>> On 18/11/2019 17:05, Serge Petrenko wrote:
>>> The first patch factors all the clock-related members of struct fiber and struct
>>> cord into appropriate structs and adds methods for their updates and resets.
>>>
>>> The second patch fixes exponential moving average calculation so that we do not
>>> experience huge numbers in average load percentage calculations.
>>>
>>> The third patch alters fiber.top() test to wait for correct output before
>>> testing it.
>>>
>>> Follow-up https://github.com/tarantool/tarantool/issues/2694
>>> Branch https://github.com/tarantool/tarantool/tree/sp/gh-2694-test-fixup
>>>
>>> Changes in v3:
>>> - introduce a new patch which refactors all the clock stat
>>> handling.
>>> - introduce a patch fixing EMA calculation
>>> - review fixes as per review from Vladislav
>>>
>>> Changes in v2:
>>> - clean up all fibers clock stats on fiber.top_enable()
>>> - push 0 instead of NaN when cord clock_delta_last is 0
>>> - review fixes as per review from Vladislav
>>>
>>> Serge Petrenko (3):
>>> fiber.top() refactor clock and cpu time calculation
>>> fiber.top(): alter exponential moving average calculation
>>> app/fiber: wait till a full event loop iteration ends
>>>
>>> src/lib/core/fiber.c | 175 ++++++++++++++++++++++++----------------
>>> src/lib/core/fiber.h | 109 +++++++++++++++++--------
>>> src/lua/fiber.c | 20 +++--
>>> test/app/fiber.result | 38 ++++++---
>>> test/app/fiber.test.lua | 36 +++++++--
>>> 5 files changed, 252 insertions(+), 126 deletions(-)
>>>
diff --git a/test/app/fiber.result b/test/app/fiber.result
index b767bbb59..6d9604ad8 100644
--- a/test/app/fiber.result
+++ b/test/app/fiber.result
@@ -1510,25 +1510,21 @@ for k, v in pairs(a) do\
end
---
...
-sum_inst
----
-- 100
-...
-- when a fiber dies, its impact on the thread moving average
-- persists for a couple of ev loop iterations, but it is no
-- longer listed in fiber.top(). So sum_avg may way smaller than
--- 100%. See gh-4625 for details and reenable the test below as
+-- 100%. See gh-4625 for details and reenable both tests below as
-- soon as it is implemented.
-- In rare cases when a fiber dies on the same event loop
-- iteration as you issue fiber.top(), sum_inst will also be
--- smaller than 100%, but it is so rare I don't even want to
--- disable the test above.
-sum_avg <= 100 or sum_avg
+-- smaller than 100%.
+-- sum_inst
+sum_avg <= 100.1 or sum_avg
---
- true
...
-- not exact due to accumulated integer division errors
---sum_avg > 99 and sum_avg <= 100 or sum_avg
+--sum_avg > 99 and sum_avg <= 100.1 or sum_avg
tbl = nil
---
...
diff --git a/test/app/fiber.test.lua b/test/app/fiber.test.lua
index 7f7350acc..6df210d9c 100644
--- a/test/app/fiber.test.lua
+++ b/test/app/fiber.test.lua
@@ -658,19 +658,18 @@ for k, v in pairs(a) do\
sum_avg = sum_avg + v["average"]\
end
-sum_inst
-- when a fiber dies, its impact on the thread moving average
-- persists for a couple of ev loop iterations, but it is no
-- longer listed in fiber.top(). So sum_avg may way smaller than
--- 100%. See gh-4625 for details and reenable the test below as
+-- 100%. See gh-4625 for details and reenable both tests below as
-- soon as it is implemented.
-- In rare cases when a fiber dies on the same event loop
-- iteration as you issue fiber.top(), sum_inst will also be
--- smaller than 100%, but it is so rare I don't even want to
--- disable the test above.
-sum_avg <= 100 or sum_avg
+-- smaller than 100%.
+-- sum_inst
+sum_avg <= 100.1 or sum_avg
-- not exact due to accumulated integer division errors
---sum_avg > 99 and sum_avg <= 100 or sum_avg
+--sum_avg > 99 and sum_avg <= 100.1 or sum_avg
tbl = nil
f = fiber.new(function()\
local fiber_key = fiber.self().id()..'/'..fiber.self().name()\
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.tarantool.org/pipermail/tarantool-patches/attachments/20191120/f1d50aa2/attachment.html>
More information about the Tarantool-patches
mailing list