[Tarantool-patches] [PATCH v3 0/3] fiber.top(): minor fixup

Serge Petrenko sergepetrenko at tarantool.org
Wed Nov 20 11:29:26 MSK 2019


Hi! Thanks for your reply!
--
Serge Petrenko
sergepetrenko at tarantool.org




> 20 нояб. 2019 г., в 4:44, Alexander Turenko <alexander.turenko at tarantool.org> написал(а):
> 
> It still shows miscompares for me:
> 
> $ cat /proc/cpuinfo | grep processor | wc -l
> 8
> $ (cd test && ./test-run.py $(yes fiber.test.lua | head -n 1000))
> 
> commit aefc64faaf7bfbb58ac53bf5abea55d1faabcbfe
> 
> The miscompare looks so:
> 
> [006] Test failed! Result content mismatch:
> [006] --- app/fiber.result	Tue Nov 19 20:39:21 2019
> [006] +++ app/fiber.reject	Wed Nov 20 04:33:39 2019
> [006] @@ -1525,7 +1525,7 @@
> [006]  -- disable the test above.
> [006]  sum_avg <= 100 or sum_avg
> [006]  ---
> [006] -- true
> [006] +- 100
> [006]  ...
> [006]  -- not exact due to accumulated integer division errors
> [006]  --sum_avg > 99 and sum_avg <= 100 or sum_avg
> 
> Maybe it is some rounding issue.
> 
> When I changed the condition to sum_avg <= 101, I got:
> 
> [015] Test failed! Result content mismatch:
> [015] --- app/fiber.result	Wed Nov 20 04:35:14 2019
> [015] +++ app/fiber.reject	Wed Nov 20 04:35:36 2019
> [015] @@ -1512,7 +1512,7 @@
> [015]  ...
> [015]  sum_inst
> [015]  ---
> [015] -- 100
> [015] +- 45.802853351342
> [015]  ...
> [015]  -- when a fiber dies, its impact on the thread moving average
> [015]  -- persists for a couple of ev loop iterations, but it is no
> 
> Sorry, but it does not look okay for me in the sense that it may lead to
> problems in testing: we already have enough ones.
> 
> Let's provide a test that will work stable or describe why it is not
> possible. In the latter case we can extract an unstable test case and
> mark it as fragile using suite.ini test-run's option. It will not
> strictly close the problem, but will lower its probability.
> 
> Maybe we can even retry such tests or allows them to fail:
> https://github.com/tarantool/test-run/issues/189
> 
> Anyway, if those fixes descrease probability of fails, I don't mind.
> Just noted that a further work may be needed.

Well, yes, they do. Just try to run this test before the patch.
Anyway, I guess you’re right that we shouldn’t leave flaky tests if
possible, so I commented the flaky pieces out and added a request to
reenable them once #4625 (List dead fibers in fiber.top <http://fiber.top/>() output) is
implemented.

The diff’s below. Now 1000 out of 1000 tests pass with 16 concurrent
jobs, at least on my machine.


> 
> WBR, Alexander Turenko.
> 
> On Tue, Nov 19, 2019 at 11:57:11PM +0100, Vladislav Shpilevoy wrote:
>> Hi! Thanks for the fixes!
>> 
>> LGTM.
>> 
>> On 18/11/2019 17:05, Serge Petrenko wrote:
>>> The first patch factors all the clock-related members of struct fiber and struct
>>> cord into appropriate structs and adds methods for their updates and resets.
>>> 
>>> The second patch fixes exponential moving average calculation so that we do not
>>> experience huge numbers in average load percentage calculations.
>>> 
>>> The third patch alters fiber.top() test to wait for correct output before
>>> testing it. 
>>> 
>>> Follow-up https://github.com/tarantool/tarantool/issues/2694
>>> Branch https://github.com/tarantool/tarantool/tree/sp/gh-2694-test-fixup
>>> 
>>> Changes in v3:
>>> - introduce a new patch which refactors all the clock stat
>>>   handling.
>>> - introduce a patch fixing EMA calculation
>>> - review fixes as per review from Vladislav
>>> 
>>> Changes in v2:
>>> - clean up all fibers clock stats on fiber.top_enable()
>>> - push 0 instead of NaN when cord clock_delta_last is 0
>>> - review fixes as per review from Vladislav
>>> 
>>> Serge Petrenko (3):
>>>  fiber.top() refactor clock and cpu time calculation
>>>  fiber.top(): alter exponential moving average calculation
>>>  app/fiber: wait till a full event loop iteration ends
>>> 
>>> src/lib/core/fiber.c    | 175 ++++++++++++++++++++++++----------------
>>> src/lib/core/fiber.h    | 109 +++++++++++++++++--------
>>> src/lua/fiber.c         |  20 +++--
>>> test/app/fiber.result   |  38 ++++++---
>>> test/app/fiber.test.lua |  36 +++++++--
>>> 5 files changed, 252 insertions(+), 126 deletions(-)
>>> 


diff --git a/test/app/fiber.result b/test/app/fiber.result
index b767bbb59..6d9604ad8 100644
--- a/test/app/fiber.result
+++ b/test/app/fiber.result
@@ -1510,25 +1510,21 @@ for k, v in pairs(a) do\
 end
 ---
 ...
-sum_inst
----
-- 100
-...
 -- when a fiber dies, its impact on the thread moving average
 -- persists for a couple of ev loop iterations, but it is no
 -- longer listed in fiber.top(). So sum_avg may way smaller than
--- 100%. See gh-4625 for details and reenable the test below as
+-- 100%. See gh-4625 for details and reenable both tests below as
 -- soon as it is implemented.
 -- In rare cases when a fiber dies on the same event loop
 -- iteration as you issue fiber.top(), sum_inst will also be
--- smaller than 100%, but it is so rare I don't even want to
--- disable the test above.
-sum_avg <= 100 or sum_avg
+-- smaller than 100%.
+-- sum_inst
+sum_avg <= 100.1 or sum_avg
 ---
 - true
 ...
 -- not exact due to accumulated integer division errors
---sum_avg > 99 and sum_avg <= 100 or sum_avg
+--sum_avg > 99 and sum_avg <= 100.1 or sum_avg
 tbl = nil
 ---
 ...
diff --git a/test/app/fiber.test.lua b/test/app/fiber.test.lua
index 7f7350acc..6df210d9c 100644
--- a/test/app/fiber.test.lua
+++ b/test/app/fiber.test.lua
@@ -658,19 +658,18 @@ for k, v in pairs(a) do\
     sum_avg = sum_avg + v["average"]\
 end
 
-sum_inst
 -- when a fiber dies, its impact on the thread moving average
 -- persists for a couple of ev loop iterations, but it is no
 -- longer listed in fiber.top(). So sum_avg may way smaller than
--- 100%. See gh-4625 for details and reenable the test below as
+-- 100%. See gh-4625 for details and reenable both tests below as
 -- soon as it is implemented.
 -- In rare cases when a fiber dies on the same event loop
 -- iteration as you issue fiber.top(), sum_inst will also be
--- smaller than 100%, but it is so rare I don't even want to
--- disable the test above.
-sum_avg <= 100 or sum_avg
+-- smaller than 100%.
+-- sum_inst
+sum_avg <= 100.1 or sum_avg
 -- not exact due to accumulated integer division errors
---sum_avg > 99 and sum_avg <= 100 or sum_avg
+--sum_avg > 99 and sum_avg <= 100.1 or sum_avg
 tbl = nil
 f = fiber.new(function()\
     local fiber_key = fiber.self().id()..'/'..fiber.self().name()\
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.tarantool.org/pipermail/tarantool-patches/attachments/20191120/f1d50aa2/attachment.html>


More information about the Tarantool-patches mailing list