Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH v2 0/2] fiber.top(): minor fixup
@ 2019-11-15 14:58 Serge Petrenko
  2019-11-15 14:58 ` [Tarantool-patches] [PATCH v2 1/2] fiber: reset clock stats on fiber.top_enable() Serge Petrenko
  2019-11-15 14:58 ` [Tarantool-patches] [PATCH v2 2/2] app/fiber: wait till a full event loop iteration ends Serge Petrenko
  0 siblings, 2 replies; 9+ messages in thread
From: Serge Petrenko @ 2019-11-15 14:58 UTC (permalink / raw)
  To: v.shpilevoy; +Cc: tarantool-patches

The first patch fixes variable initialization after fiber.top_enable()
so that cord clocks don't contain garbage on first event loop iteration after
enable.

The second patch alters fiber.top() test to wait for correct output before
testing it. 

Follow-up https://github.com/tarantool/tarantool/issues/2694
Branch https://github.com/tarantool/tarantool/tree/sp/gh-2694-test-fixup

Changes in v2:
  - clean up all fibers clock stats on fiber.top_enable()
  - push 0 instead of NaN when cord clock_delta_last is 0
  - review fixes as per review from Vladislav

Serge Petrenko (2):
  fiber: reset clock stats on fiber.top_enable()
  app/fiber: wait till a full event loop iteration ends

 src/lib/core/fiber.c    | 21 +++++++++++++++++++++
 src/lua/fiber.c         | 12 ++++++++++--
 test/app/fiber.result   | 28 +++++++++++++++++++---------
 test/app/fiber.test.lua | 26 ++++++++++++++++++--------
 4 files changed, 68 insertions(+), 19 deletions(-)

-- 
2.21.0 (Apple Git-122)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Tarantool-patches] [PATCH v2 1/2] fiber: reset clock stats on fiber.top_enable()
  2019-11-15 14:58 [Tarantool-patches] [PATCH v2 0/2] fiber.top(): minor fixup Serge Petrenko
@ 2019-11-15 14:58 ` Serge Petrenko
  2019-11-15 16:11   ` Alexander Turenko
  2019-11-15 21:39   ` Vladislav Shpilevoy
  2019-11-15 14:58 ` [Tarantool-patches] [PATCH v2 2/2] app/fiber: wait till a full event loop iteration ends Serge Petrenko
  1 sibling, 2 replies; 9+ messages in thread
From: Serge Petrenko @ 2019-11-15 14:58 UTC (permalink / raw)
  To: v.shpilevoy; +Cc: tarantool-patches

We didn't refresh last remembered clock on fiber.top_enable()
This means that the fiber issuing fiber.top_enable() would get a huge
chunk of cpu time on its first yield. Fix this.
Also reset clock_delta and cpu_miss_count.

If fiber.top() is issued on the same ev loop iteration as
fiber.top_enable(), clock_delta_last is 0 for both cord and all the
fibers, so report "instant" and "average" stats per last iteration as 0
instead of NaN.

Follow-up #2694
---
 src/lib/core/fiber.c | 21 +++++++++++++++++++++
 src/lua/fiber.c      | 12 ++++++++++--
 2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
index aebaba7f0..258c094f5 100644
--- a/src/lib/core/fiber.c
+++ b/src/lib/core/fiber.c
@@ -1203,9 +1203,30 @@ fiber_top_enable()
 		ev_check_start(cord()->loop, &cord()->check_event);
 		fiber_top_enabled = true;
 
+		/*
+		 * Reset cord and fiber clock stats in order to
+		 * count from zero even on reenable.
+		 */
 		cord()->clock_acc = 0;
 		cord()->cpu_miss_count_last = 0;
 		cord()->clock_delta_last = 0;
+		cord()->clock_delta = 0;
+
+		struct fiber * fiber;
+		rlist_foreach_entry(fiber, &cord()->alive, link) {
+			fiber->clock_acc = 0;
+			fiber->clock_delta_last = 0;
+			fiber->clock_delta = 0;
+			fiber->cputime = 0;
+		}
+
+		cord()->sched.clock_acc = 0;
+		cord()->sched.clock_delta_last = 0;
+		cord()->sched.clock_delta = 0;
+		cord()->sched.cputime = 0;
+
+		cord()->clock_last = __rdtscp(&cord()->cpu_id_last);
+		cord()->cpu_miss_count = 0;
 		struct timespec ts;
 		if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts) != 0) {
 			say_debug("clock_gettime(): failed to get this"
diff --git a/src/lua/fiber.c b/src/lua/fiber.c
index 8b3b22e55..647505643 100644
--- a/src/lua/fiber.c
+++ b/src/lua/fiber.c
@@ -335,11 +335,19 @@ lbox_fiber_top_entry(struct fiber *f, void *cb_ctx)
 	lua_newtable(L);
 
 	lua_pushliteral(L, "average");
-	lua_pushnumber(L, f->clock_acc / (double)cord()->clock_acc * 100);
+	if (cord()->clock_acc)
+		lua_pushnumber(L, f->clock_acc / (double)cord()->clock_acc * 100);
+	else
+		lua_pushnumber(L, 0);
 	lua_settable(L, -3);
+
 	lua_pushliteral(L, "instant");
-	lua_pushnumber(L, f->clock_delta_last / (double)cord()->clock_delta_last * 100);
+	if (cord()->clock_delta_last)
+		lua_pushnumber(L, f->clock_delta_last / (double)cord()->clock_delta_last * 100);
+	else
+		lua_pushnumber(L, 0);
 	lua_settable(L, -3);
+
 	lua_pushliteral(L, "time");
 	lua_pushnumber(L, f->cputime / (double) FIBER_TIME_RES);
 	lua_settable(L, -3);
-- 
2.21.0 (Apple Git-122)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Tarantool-patches] [PATCH v2 2/2] app/fiber: wait till a full event loop iteration ends
  2019-11-15 14:58 [Tarantool-patches] [PATCH v2 0/2] fiber.top(): minor fixup Serge Petrenko
  2019-11-15 14:58 ` [Tarantool-patches] [PATCH v2 1/2] fiber: reset clock stats on fiber.top_enable() Serge Petrenko
@ 2019-11-15 14:58 ` Serge Petrenko
  2019-11-15 15:35   ` Serge Petrenko
  2019-11-15 16:27   ` Alexander Turenko
  1 sibling, 2 replies; 9+ messages in thread
From: Serge Petrenko @ 2019-11-15 14:58 UTC (permalink / raw)
  To: v.shpilevoy; +Cc: tarantool-patches

fiber.top() fills in statistics every event loop iteration,
so if it was just enabled, fiber.top() may contain 'inf's and
'nan's in fiber cpu usage averages because total time consumed by
the main thread was not yet accounted for.
Same stands for viewing top() results for a freshly created fiber:
its metrics will be zero since it hasn't lived a full ev loop iteration
yet.
Fix this by delaying the test till top() results are meaningful and add
minor refactoring.

Follow-up #2694
---
 test/app/fiber.result   | 28 +++++++++++++++++++---------
 test/app/fiber.test.lua | 26 ++++++++++++++++++--------
 2 files changed, 37 insertions(+), 17 deletions(-)

diff --git a/test/app/fiber.result b/test/app/fiber.result
index 4a094939f..b4e448282 100644
--- a/test/app/fiber.result
+++ b/test/app/fiber.result
@@ -1469,6 +1469,12 @@ sum = 0
 fiber.top_enable()
 ---
 ...
+-- Wait till a full event loop iteration passes, so that
+-- top() contains meaningful results. On the ev loop iteration
+-- following fiber.top_enable() results will be zero.
+while fiber.top().cpu["1/sched"].instant == 0 do fiber.yield() end
+---
+...
 a = fiber.top()
 ---
 ...
@@ -1504,9 +1510,9 @@ for k, v in pairs(a) do\
 end
 ---
 ...
-sum_inst
+sum_inst > 99 and sum_inst < 101 or sum_inst
 ---
-- 100
+- true
 ...
 -- not exact due to accumulated integer division errors
 sum_avg > 99 and sum_avg < 101 or sum_avg
@@ -1517,24 +1523,28 @@ tbl = nil
 ---
 ...
 f = fiber.new(function()\
-    for i = 1,1000 do end\
-    fiber.yield()\
-    tbl = fiber.top().cpu[fiber.self().id()..'/'..fiber.self().name()]\
+    local fiber_key = fiber.self().id()..'/'..fiber.self().name()\
+    tbl = fiber.top().cpu[fiber_key]\
+    while tbl.time == 0 do\
+        for i = 1,1000 do end\
+        fiber.yield()\
+        tbl = fiber.top().cpu[fiber_key]\
+    end\
 end)
 ---
 ...
-while f:status() ~= 'dead' do fiber.sleep(0.01) end
+while f:status() ~= 'dead' do fiber.yield() end
 ---
 ...
-tbl["average"] > 0
+tbl.average > 0
 ---
 - true
 ...
-tbl["instant"] > 0
+tbl.instant > 0
 ---
 - true
 ...
-tbl["time"] > 0
+tbl.time > 0
 ---
 - true
 ...
diff --git a/test/app/fiber.test.lua b/test/app/fiber.test.lua
index 38b85d554..e2834216b 100644
--- a/test/app/fiber.test.lua
+++ b/test/app/fiber.test.lua
@@ -634,6 +634,12 @@ sum = 0
 -- gh-2694 fiber.top()
 fiber.top_enable()
 
+
+-- Wait till a full event loop iteration passes, so that
+-- top() contains meaningful results. On the ev loop iteration
+-- following fiber.top_enable() results will be zero.
+while fiber.top().cpu["1/sched"].instant == 0 do fiber.yield() end
+
 a = fiber.top()
 type(a)
 -- scheduler is present in fiber.top()
@@ -652,19 +658,23 @@ for k, v in pairs(a) do\
     sum_avg = sum_avg + v["average"]\
 end
 
-sum_inst
+sum_inst > 99 and sum_inst < 101 or sum_inst
 -- not exact due to accumulated integer division errors
 sum_avg > 99 and sum_avg < 101 or sum_avg
 tbl = nil
 f = fiber.new(function()\
-    for i = 1,1000 do end\
-    fiber.yield()\
-    tbl = fiber.top().cpu[fiber.self().id()..'/'..fiber.self().name()]\
+    local fiber_key = fiber.self().id()..'/'..fiber.self().name()\
+    tbl = fiber.top().cpu[fiber_key]\
+    while tbl.time == 0 do\
+        for i = 1,1000 do end\
+        fiber.yield()\
+        tbl = fiber.top().cpu[fiber_key]\
+    end\
 end)
-while f:status() ~= 'dead' do fiber.sleep(0.01) end
-tbl["average"] > 0
-tbl["instant"] > 0
-tbl["time"] > 0
+while f:status() ~= 'dead' do fiber.yield() end
+tbl.average > 0
+tbl.instant > 0
+tbl.time > 0
 
 fiber.top_disable()
 fiber.top()
-- 
2.21.0 (Apple Git-122)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 2/2] app/fiber: wait till a full event loop iteration ends
  2019-11-15 14:58 ` [Tarantool-patches] [PATCH v2 2/2] app/fiber: wait till a full event loop iteration ends Serge Petrenko
@ 2019-11-15 15:35   ` Serge Petrenko
  2019-11-15 16:27   ` Alexander Turenko
  1 sibling, 0 replies; 9+ messages in thread
From: Serge Petrenko @ 2019-11-15 15:35 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

> 15 нояб. 2019 г., в 17:58, Serge Petrenko <sergepetrenko@tarantool.org> написал(а):
> 
> fiber.top() fills in statistics every event loop iteration,
> so if it was just enabled, fiber.top() may contain 'inf's and
> 'nan's in fiber cpu usage averages because total time consumed by
> the main thread was not yet accounted for.
> Same stands for viewing top() results for a freshly created fiber:
> its metrics will be zero since it hasn't lived a full ev loop iteration
> yet.
> Fix this by delaying the test till top() results are meaningful and add
> minor refactoring.
> 
> Follow-up #2694
> —

Sorry, the commit message was outdated here. Fixed it on the branch:
```
app/fiber: wait till a full event loop iteration ends
fiber.top() fills in statistics every event loop iteration,
so if it was just enabled, fiber.top() returns zero in fiber cpu
usage statistics because total time consumed by the main thread was
not yet accounted for.
Same stands for viewing top() results for a freshly created fiber:
its metrics will be zero since it hasn't lived a full ev loop iteration
yet.
Fix this by delaying the test till top() results are meaningful and add
minor refactoring.

Follow-up 
#2694

```


> test/app/fiber.result   | 28 +++++++++++++++++++---------
> test/app/fiber.test.lua | 26 ++++++++++++++++++--------
> 2 files changed, 37 insertions(+), 17 deletions(-)
> 
> diff --git a/test/app/fiber.result b/test/app/fiber.result
> index 4a094939f..b4e448282 100644
> --- a/test/app/fiber.result
> +++ b/test/app/fiber.result
> @@ -1469,6 +1469,12 @@ sum = 0
> fiber.top_enable()
> ---
> ...
> +-- Wait till a full event loop iteration passes, so that
> +-- top() contains meaningful results. On the ev loop iteration
> +-- following fiber.top_enable() results will be zero.
> +while fiber.top().cpu["1/sched"].instant == 0 do fiber.yield() end
> +---
> +...
> a = fiber.top()
> ---
> ...
> @@ -1504,9 +1510,9 @@ for k, v in pairs(a) do\
> end
> ---
> ...
> -sum_inst
> +sum_inst > 99 and sum_inst < 101 or sum_inst
> ---
> -- 100
> +- true
> ...
> -- not exact due to accumulated integer division errors
> sum_avg > 99 and sum_avg < 101 or sum_avg
> @@ -1517,24 +1523,28 @@ tbl = nil
> ---
> ...
> f = fiber.new(function()\
> -    for i = 1,1000 do end\
> -    fiber.yield()\
> -    tbl = fiber.top().cpu[fiber.self().id()..'/'..fiber.self().name()]\
> +    local fiber_key = fiber.self().id()..'/'..fiber.self().name()\
> +    tbl = fiber.top().cpu[fiber_key]\
> +    while tbl.time == 0 do\
> +        for i = 1,1000 do end\
> +        fiber.yield()\
> +        tbl = fiber.top().cpu[fiber_key]\
> +    end\
> end)
> ---
> ...
> -while f:status() ~= 'dead' do fiber.sleep(0.01) end
> +while f:status() ~= 'dead' do fiber.yield() end
> ---
> ...
> -tbl["average"] > 0
> +tbl.average > 0
> ---
> - true
> ...
> -tbl["instant"] > 0
> +tbl.instant > 0
> ---
> - true
> ...
> -tbl["time"] > 0
> +tbl.time > 0
> ---
> - true
> ...
> diff --git a/test/app/fiber.test.lua b/test/app/fiber.test.lua
> index 38b85d554..e2834216b 100644
> --- a/test/app/fiber.test.lua
> +++ b/test/app/fiber.test.lua
> @@ -634,6 +634,12 @@ sum = 0
> -- gh-2694 fiber.top()
> fiber.top_enable()
> 
> +
> +-- Wait till a full event loop iteration passes, so that
> +-- top() contains meaningful results. On the ev loop iteration
> +-- following fiber.top_enable() results will be zero.
> +while fiber.top().cpu["1/sched"].instant == 0 do fiber.yield() end
> +
> a = fiber.top()
> type(a)
> -- scheduler is present in fiber.top()
> @@ -652,19 +658,23 @@ for k, v in pairs(a) do\
>     sum_avg = sum_avg + v["average"]\
> end
> 
> -sum_inst
> +sum_inst > 99 and sum_inst < 101 or sum_inst
> -- not exact due to accumulated integer division errors
> sum_avg > 99 and sum_avg < 101 or sum_avg
> tbl = nil
> f = fiber.new(function()\
> -    for i = 1,1000 do end\
> -    fiber.yield()\
> -    tbl = fiber.top().cpu[fiber.self().id()..'/'..fiber.self().name()]\
> +    local fiber_key = fiber.self().id()..'/'..fiber.self().name()\
> +    tbl = fiber.top().cpu[fiber_key]\
> +    while tbl.time == 0 do\
> +        for i = 1,1000 do end\
> +        fiber.yield()\
> +        tbl = fiber.top().cpu[fiber_key]\
> +    end\
> end)
> -while f:status() ~= 'dead' do fiber.sleep(0.01) end
> -tbl["average"] > 0
> -tbl["instant"] > 0
> -tbl["time"] > 0
> +while f:status() ~= 'dead' do fiber.yield() end
> +tbl.average > 0
> +tbl.instant > 0
> +tbl.time > 0
> 
> fiber.top_disable()
> fiber.top()
> -- 
> 2.21.0 (Apple Git-122)
> 

--
Serge Petrenko
sergepetrenko@tarantool.org

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 1/2] fiber: reset clock stats on fiber.top_enable()
  2019-11-15 14:58 ` [Tarantool-patches] [PATCH v2 1/2] fiber: reset clock stats on fiber.top_enable() Serge Petrenko
@ 2019-11-15 16:11   ` Alexander Turenko
  2019-11-15 18:23     ` Serge Petrenko
  2019-11-15 21:39   ` Vladislav Shpilevoy
  1 sibling, 1 reply; 9+ messages in thread
From: Alexander Turenko @ 2019-11-15 16:11 UTC (permalink / raw)
  To: Serge Petrenko; +Cc: tarantool-patches, v.shpilevoy

I have no any knowledge around fiber.top(), just asked a question
because of curiosity.

WBR, Alexander Turenko.

On Fri, Nov 15, 2019 at 05:58:58PM +0300, Serge Petrenko wrote:
> We didn't refresh last remembered clock on fiber.top_enable()
> This means that the fiber issuing fiber.top_enable() would get a huge
> chunk of cpu time on its first yield. Fix this.
> Also reset clock_delta and cpu_miss_count.
> 
> If fiber.top() is issued on the same ev loop iteration as
> fiber.top_enable(), clock_delta_last is 0 for both cord and all the
> fibers, so report "instant" and "average" stats per last iteration as 0
> instead of NaN.
> 
> Follow-up #2694
> ---
>  src/lib/core/fiber.c | 21 +++++++++++++++++++++
>  src/lua/fiber.c      | 12 ++++++++++--
>  2 files changed, 31 insertions(+), 2 deletions(-)
> 
> diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
> index aebaba7f0..258c094f5 100644
> --- a/src/lib/core/fiber.c
> +++ b/src/lib/core/fiber.c
> @@ -1203,9 +1203,30 @@ fiber_top_enable()
>  		ev_check_start(cord()->loop, &cord()->check_event);
>  		fiber_top_enabled = true;
>  
> +		/*
> +		 * Reset cord and fiber clock stats in order to
> +		 * count from zero even on reenable.
> +		 */
>  		cord()->clock_acc = 0;
>  		cord()->cpu_miss_count_last = 0;
>  		cord()->clock_delta_last = 0;
> +		cord()->clock_delta = 0;
> +
> +		struct fiber * fiber;

Code style: struct fiber *fiber;

> +		rlist_foreach_entry(fiber, &cord()->alive, link) {
> +			fiber->clock_acc = 0;
> +			fiber->clock_delta_last = 0;
> +			fiber->clock_delta = 0;
> +			fiber->cputime = 0;
> +		}
> +
> +		cord()->sched.clock_acc = 0;
> +		cord()->sched.clock_delta_last = 0;
> +		cord()->sched.clock_delta = 0;
> +		cord()->sched.cputime = 0;
> +
> +		cord()->clock_last = __rdtscp(&cord()->cpu_id_last);
> +		cord()->cpu_miss_count = 0;

Should not we doing such clean up in fiber_top_disable() rather then
here?

>  		struct timespec ts;
>  		if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts) != 0) {
>  			say_debug("clock_gettime(): failed to get this"
> diff --git a/src/lua/fiber.c b/src/lua/fiber.c
> index 8b3b22e55..647505643 100644
> --- a/src/lua/fiber.c
> +++ b/src/lua/fiber.c
> @@ -335,11 +335,19 @@ lbox_fiber_top_entry(struct fiber *f, void *cb_ctx)
>  	lua_newtable(L);
>  
>  	lua_pushliteral(L, "average");
> -	lua_pushnumber(L, f->clock_acc / (double)cord()->clock_acc * 100);
> +	if (cord()->clock_acc)
> +		lua_pushnumber(L, f->clock_acc / (double)cord()->clock_acc * 100);
> +	else
> +		lua_pushnumber(L, 0);
>  	lua_settable(L, -3);
> +
>  	lua_pushliteral(L, "instant");
> -	lua_pushnumber(L, f->clock_delta_last / (double)cord()->clock_delta_last * 100);
> +	if (cord()->clock_delta_last)
> +		lua_pushnumber(L, f->clock_delta_last / (double)cord()->clock_delta_last * 100);
> +	else
> +		lua_pushnumber(L, 0);
>  	lua_settable(L, -3);

Code style: lines over 80 chars.

> +
>  	lua_pushliteral(L, "time");
>  	lua_pushnumber(L, f->cputime / (double) FIBER_TIME_RES);
>  	lua_settable(L, -3);
> -- 
> 2.21.0 (Apple Git-122)
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 2/2] app/fiber: wait till a full event loop iteration ends
  2019-11-15 14:58 ` [Tarantool-patches] [PATCH v2 2/2] app/fiber: wait till a full event loop iteration ends Serge Petrenko
  2019-11-15 15:35   ` Serge Petrenko
@ 2019-11-15 16:27   ` Alexander Turenko
  1 sibling, 0 replies; 9+ messages in thread
From: Alexander Turenko @ 2019-11-15 16:27 UTC (permalink / raw)
  To: Serge Petrenko; +Cc: tarantool-patches, v.shpilevoy

I don't mind against the patch, while I have no enough context to
evaluate its correctness.

I see that the test fails rarely after the patchset, but still do, when
I run it in 16 jobs:

$ (cd test && ./test-run.py $(yes fiber.test.lua | head -n 1000))
<...>
[016] app/fiber.test.lua                                              [ fail ]
[016]
[016] Test failed! Result content mismatch:
[016] --- app/fiber.result	Fri Nov 15 18:14:05 2019
[016] +++ app/fiber.reject	Fri Nov 15 19:15:58 2019
[016] @@ -1517,7 +1517,7 @@
[016]  -- not exact due to accumulated integer division errors
[016]  sum_avg > 99 and sum_avg < 101 or sum_avg
[016]  ---
[016] -- true
[016] +- 109.27001355629
[016]  ...
[016]  tbl = nil
[016]  ---

Values are from 81.78153580673 to 367.42178027752.

Don't sure whether it is urgent enough to push the partial fix. I'll
wait for now.

WBR, Alexander Turenko.

On Fri, Nov 15, 2019 at 05:58:59PM +0300, Serge Petrenko wrote:
> fiber.top() fills in statistics every event loop iteration,
> so if it was just enabled, fiber.top() may contain 'inf's and
> 'nan's in fiber cpu usage averages because total time consumed by
> the main thread was not yet accounted for.
> Same stands for viewing top() results for a freshly created fiber:
> its metrics will be zero since it hasn't lived a full ev loop iteration
> yet.
> Fix this by delaying the test till top() results are meaningful and add
> minor refactoring.
> 
> Follow-up #2694
> ---
>  test/app/fiber.result   | 28 +++++++++++++++++++---------
>  test/app/fiber.test.lua | 26 ++++++++++++++++++--------
>  2 files changed, 37 insertions(+), 17 deletions(-)
> 
> diff --git a/test/app/fiber.result b/test/app/fiber.result
> index 4a094939f..b4e448282 100644
> --- a/test/app/fiber.result
> +++ b/test/app/fiber.result
> @@ -1469,6 +1469,12 @@ sum = 0
>  fiber.top_enable()
>  ---
>  ...
> +-- Wait till a full event loop iteration passes, so that
> +-- top() contains meaningful results. On the ev loop iteration
> +-- following fiber.top_enable() results will be zero.
> +while fiber.top().cpu["1/sched"].instant == 0 do fiber.yield() end
> +---
> +...
>  a = fiber.top()
>  ---
>  ...
> @@ -1504,9 +1510,9 @@ for k, v in pairs(a) do\
>  end
>  ---
>  ...
> -sum_inst
> +sum_inst > 99 and sum_inst < 101 or sum_inst
>  ---
> -- 100
> +- true
>  ...
>  -- not exact due to accumulated integer division errors
>  sum_avg > 99 and sum_avg < 101 or sum_avg
> @@ -1517,24 +1523,28 @@ tbl = nil
>  ---
>  ...
>  f = fiber.new(function()\
> -    for i = 1,1000 do end\
> -    fiber.yield()\
> -    tbl = fiber.top().cpu[fiber.self().id()..'/'..fiber.self().name()]\
> +    local fiber_key = fiber.self().id()..'/'..fiber.self().name()\
> +    tbl = fiber.top().cpu[fiber_key]\
> +    while tbl.time == 0 do\
> +        for i = 1,1000 do end\
> +        fiber.yield()\
> +        tbl = fiber.top().cpu[fiber_key]\
> +    end\
>  end)
>  ---
>  ...
> -while f:status() ~= 'dead' do fiber.sleep(0.01) end
> +while f:status() ~= 'dead' do fiber.yield() end
>  ---
>  ...
> -tbl["average"] > 0
> +tbl.average > 0
>  ---
>  - true
>  ...
> -tbl["instant"] > 0
> +tbl.instant > 0
>  ---
>  - true
>  ...
> -tbl["time"] > 0
> +tbl.time > 0
>  ---
>  - true
>  ...
> diff --git a/test/app/fiber.test.lua b/test/app/fiber.test.lua
> index 38b85d554..e2834216b 100644
> --- a/test/app/fiber.test.lua
> +++ b/test/app/fiber.test.lua
> @@ -634,6 +634,12 @@ sum = 0
>  -- gh-2694 fiber.top()
>  fiber.top_enable()
>  
> +
> +-- Wait till a full event loop iteration passes, so that
> +-- top() contains meaningful results. On the ev loop iteration
> +-- following fiber.top_enable() results will be zero.
> +while fiber.top().cpu["1/sched"].instant == 0 do fiber.yield() end
> +
>  a = fiber.top()
>  type(a)
>  -- scheduler is present in fiber.top()
> @@ -652,19 +658,23 @@ for k, v in pairs(a) do\
>      sum_avg = sum_avg + v["average"]\
>  end
>  
> -sum_inst
> +sum_inst > 99 and sum_inst < 101 or sum_inst
>  -- not exact due to accumulated integer division errors
>  sum_avg > 99 and sum_avg < 101 or sum_avg
>  tbl = nil
>  f = fiber.new(function()\
> -    for i = 1,1000 do end\
> -    fiber.yield()\
> -    tbl = fiber.top().cpu[fiber.self().id()..'/'..fiber.self().name()]\
> +    local fiber_key = fiber.self().id()..'/'..fiber.self().name()\
> +    tbl = fiber.top().cpu[fiber_key]\
> +    while tbl.time == 0 do\
> +        for i = 1,1000 do end\
> +        fiber.yield()\
> +        tbl = fiber.top().cpu[fiber_key]\
> +    end\
>  end)
> -while f:status() ~= 'dead' do fiber.sleep(0.01) end
> -tbl["average"] > 0
> -tbl["instant"] > 0
> -tbl["time"] > 0
> +while f:status() ~= 'dead' do fiber.yield() end
> +tbl.average > 0
> +tbl.instant > 0
> +tbl.time > 0
>  
>  fiber.top_disable()
>  fiber.top()
> -- 
> 2.21.0 (Apple Git-122)
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 1/2] fiber: reset clock stats on fiber.top_enable()
  2019-11-15 16:11   ` Alexander Turenko
@ 2019-11-15 18:23     ` Serge Petrenko
  0 siblings, 0 replies; 9+ messages in thread
From: Serge Petrenko @ 2019-11-15 18:23 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: tarantool-patches, Vladislav Shpilevoy

[-- Attachment #1: Type: text/plain, Size: 5395 bytes --]

Hi! Thank you for your answer!
Fixed codestyle and answered your question inline.

--
Serge Petrenko
sergepetrenko@tarantool.org <mailto:sergepetrenko@tarantool.org>

> 15 нояб. 2019 г., в 19:11, Alexander Turenko <alexander.turenko@tarantool.org> написал(а):
> 
> I have no any knowledge around fiber.top(), just asked a question
> because of curiosity.
> 
> WBR, Alexander Turenko.
> 
> On Fri, Nov 15, 2019 at 05:58:58PM +0300, Serge Petrenko wrote:
>> We didn't refresh last remembered clock on fiber.top_enable()
>> This means that the fiber issuing fiber.top_enable() would get a huge
>> chunk of cpu time on its first yield. Fix this.
>> Also reset clock_delta and cpu_miss_count.
>> 
>> If fiber.top() is issued on the same ev loop iteration as
>> fiber.top_enable(), clock_delta_last is 0 for both cord and all the
>> fibers, so report "instant" and "average" stats per last iteration as 0
>> instead of NaN.
>> 
>> Follow-up #2694
>> ---
>> src/lib/core/fiber.c | 21 +++++++++++++++++++++
>> src/lua/fiber.c      | 12 ++++++++++--
>> 2 files changed, 31 insertions(+), 2 deletions(-)
>> 
>> diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
>> index aebaba7f0..258c094f5 100644
>> --- a/src/lib/core/fiber.c
>> +++ b/src/lib/core/fiber.c
>> @@ -1203,9 +1203,30 @@ fiber_top_enable()
>> 		ev_check_start(cord()->loop, &cord()->check_event);
>> 		fiber_top_enabled = true;
>> 
>> +		/*
>> +		 * Reset cord and fiber clock stats in order to
>> +		 * count from zero even on reenable.
>> +		 */
>> 		cord()->clock_acc = 0;
>> 		cord()->cpu_miss_count_last = 0;
>> 		cord()->clock_delta_last = 0;
>> +		cord()->clock_delta = 0;
>> +
>> +		struct fiber * fiber;
> 
> Code style: struct fiber *fiber;

Fixed

> 
>> +		rlist_foreach_entry(fiber, &cord()->alive, link) {
>> +			fiber->clock_acc = 0;
>> +			fiber->clock_delta_last = 0;
>> +			fiber->clock_delta = 0;
>> +			fiber->cputime = 0;
>> +		}
>> +
>> +		cord()->sched.clock_acc = 0;
>> +		cord()->sched.clock_delta_last = 0;
>> +		cord()->sched.clock_delta = 0;
>> +		cord()->sched.cputime = 0;
>> +
>> +		cord()->clock_last = __rdtscp(&cord()->cpu_id_last);
>> +		cord()->cpu_miss_count = 0;
> 
> Should not we doing such clean up in fiber_top_disable() rather then
> here?

It doesn’t matter IMO. Well, yes, you don’t need a cleanup when you call
fiber.top_enable() for the first time, but, other than that, I see no difference.

> 
>> 		struct timespec ts;
>> 		if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts) != 0) {
>> 			say_debug("clock_gettime(): failed to get this"
>> diff --git a/src/lua/fiber.c b/src/lua/fiber.c
>> index 8b3b22e55..647505643 100644
>> --- a/src/lua/fiber.c
>> +++ b/src/lua/fiber.c
>> @@ -335,11 +335,19 @@ lbox_fiber_top_entry(struct fiber *f, void *cb_ctx)
>> 	lua_newtable(L);
>> 
>> 	lua_pushliteral(L, "average");
>> -	lua_pushnumber(L, f->clock_acc / (double)cord()->clock_acc * 100);
>> +	if (cord()->clock_acc)
>> +		lua_pushnumber(L, f->clock_acc / (double)cord()->clock_acc * 100);
>> +	else
>> +		lua_pushnumber(L, 0);
>> 	lua_settable(L, -3);
>> +
>> 	lua_pushliteral(L, "instant");
>> -	lua_pushnumber(L, f->clock_delta_last / (double)cord()->clock_delta_last * 100);
>> +	if (cord()->clock_delta_last)
>> +		lua_pushnumber(L, f->clock_delta_last / (double)cord()->clock_delta_last * 100);
>> +	else
>> +		lua_pushnumber(L, 0);
>> 	lua_settable(L, -3);
> 
> Code style: lines over 80 chars.

Fixed

> 
>> +
>> 	lua_pushliteral(L, "time");
>> 	lua_pushnumber(L, f->cputime / (double) FIBER_TIME_RES);
>> 	lua_settable(L, -3);
>> -- 
>> 2.21.0 (Apple Git-122)


diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
index 258c094f5..c6aab18b2 100644
--- a/src/lib/core/fiber.c
+++ b/src/lib/core/fiber.c
@@ -1212,7 +1212,7 @@ fiber_top_enable()
                cord()->clock_delta_last = 0;
                cord()->clock_delta = 0;
 
-               struct fiber * fiber;
+               struct fiber *fiber;
                rlist_foreach_entry(fiber, &cord()->alive, link) {
                        fiber->clock_acc = 0;
                        fiber->clock_delta_last = 0;
diff --git a/src/lua/fiber.c b/src/lua/fiber.c
index 647505643..9268b0a19 100644
--- a/src/lua/fiber.c
+++ b/src/lua/fiber.c
@@ -335,17 +335,21 @@ lbox_fiber_top_entry(struct fiber *f, void *cb_ctx)
        lua_newtable(L);
 
        lua_pushliteral(L, "average");
-       if (cord()->clock_acc)
-               lua_pushnumber(L, f->clock_acc / (double)cord()->clock_acc * 100);
-       else
+       if (cord()->clock_acc) {
+               lua_pushnumber(L, f->clock_acc /
+                                 (double)cord()->clock_acc * 100);
+       } else {
                lua_pushnumber(L, 0);
+       }
        lua_settable(L, -3);
 
        lua_pushliteral(L, "instant");
-       if (cord()->clock_delta_last)
-               lua_pushnumber(L, f->clock_delta_last / (double)cord()->clock_delta_last * 100);
-       else
+       if (cord()->clock_delta_last) {
+               lua_pushnumber(L, f->clock_delta_last /
+                                 (double)cord()->clock_delta_last * 100);
+       } else {
                lua_pushnumber(L, 0);
+       }
        lua_settable(L, -3);
 
        lua_pushliteral(L, "time");


[-- Attachment #2: Type: text/html, Size: 15315 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 1/2] fiber: reset clock stats on fiber.top_enable()
  2019-11-15 14:58 ` [Tarantool-patches] [PATCH v2 1/2] fiber: reset clock stats on fiber.top_enable() Serge Petrenko
  2019-11-15 16:11   ` Alexander Turenko
@ 2019-11-15 21:39   ` Vladislav Shpilevoy
  2019-11-18 16:11     ` Serge Petrenko
  1 sibling, 1 reply; 9+ messages in thread
From: Vladislav Shpilevoy @ 2019-11-15 21:39 UTC (permalink / raw)
  To: Serge Petrenko; +Cc: tarantool-patches

Hi! Thanks for the fixes!

See 2 comments below.

On 15/11/2019 15:58, Serge Petrenko wrote:
> We didn't refresh last remembered clock on fiber.top_enable()
> This means that the fiber issuing fiber.top_enable() would get a huge
> chunk of cpu time on its first yield. Fix this.
> Also reset clock_delta and cpu_miss_count.
> 
> If fiber.top() is issued on the same ev loop iteration as
> fiber.top_enable(), clock_delta_last is 0 for both cord and all the
> fibers, so report "instant" and "average" stats per last iteration as 0
> instead of NaN.
> 
> Follow-up #2694
> ---
>  src/lib/core/fiber.c | 21 +++++++++++++++++++++
>  src/lua/fiber.c      | 12 ++++++++++--
>  2 files changed, 31 insertions(+), 2 deletions(-)
> 
> diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
> index aebaba7f0..258c094f5 100644
> --- a/src/lib/core/fiber.c
> +++ b/src/lib/core/fiber.c
> @@ -1203,9 +1203,30 @@ fiber_top_enable()
>  		ev_check_start(cord()->loop, &cord()->check_event);
>  		fiber_top_enabled = true;
>  
> +		/*
> +		 * Reset cord and fiber clock stats in order to
> +		 * count from zero even on reenable.
> +		 */
>  		cord()->clock_acc = 0;
>  		cord()->cpu_miss_count_last = 0;
>  		cord()->clock_delta_last = 0;
> +		cord()->clock_delta = 0;
> +
> +		struct fiber * fiber;
> +		rlist_foreach_entry(fiber, &cord()->alive, link) {
> +			fiber->clock_acc = 0;
> +			fiber->clock_delta_last = 0;
> +			fiber->clock_delta = 0;
> +			fiber->cputime = 0;

1. fiber_reset() should do exactly the same, right?
Then why doesn't it nullify clock_delta_last?

This mess with numerous time fields, and their reset,
looks like a necessity to create a new structure,
which would be included into cord and fiber structures.
And which would provide methods for cleaning and
updating the members.

> +		}
> +
> +		cord()->sched.clock_acc = 0;
> +		cord()->sched.clock_delta_last = 0;
> +		cord()->sched.clock_delta = 0;
> +		cord()->sched.cputime = 0;
> +
> +		cord()->clock_last = __rdtscp(&cord()->cpu_id_last);
> +		cord()->cpu_miss_count = 0;
>  		struct timespec ts;
>  		if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts) != 0) {
>  			say_debug("clock_gettime(): failed to get this"
> diff --git a/src/lua/fiber.c b/src/lua/fiber.c
> index 8b3b22e55..647505643 100644
> --- a/src/lua/fiber.c
> +++ b/src/lua/fiber.c
> @@ -335,11 +335,19 @@ lbox_fiber_top_entry(struct fiber *f, void *cb_ctx)
>  	lua_newtable(L);
>  
>  	lua_pushliteral(L, "average");
> -	lua_pushnumber(L, f->clock_acc / (double)cord()->clock_acc * 100);
> +	if (cord()->clock_acc)

2. Sorry for a nit, we usually compare with 0
explicitly, '!= 0', and avoid implicit casts.

> +		lua_pushnumber(L, f->clock_acc / (double)cord()->clock_acc * 100);
> +	else
> +		lua_pushnumber(L, 0);
>  	lua_settable(L, -3);
> +
>  	lua_pushliteral(L, "instant");
> -	lua_pushnumber(L, f->clock_delta_last / (double)cord()->clock_delta_last * 100);
> +	if (cord()->clock_delta_last)
> +		lua_pushnumber(L, f->clock_delta_last / (double)cord()->clock_delta_last * 100);
> +	else
> +		lua_pushnumber(L, 0);
>  	lua_settable(L, -3);
> +
>  	lua_pushliteral(L, "time");
>  	lua_pushnumber(L, f->cputime / (double) FIBER_TIME_RES);
>  	lua_settable(L, -3);
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 1/2] fiber: reset clock stats on fiber.top_enable()
  2019-11-15 21:39   ` Vladislav Shpilevoy
@ 2019-11-18 16:11     ` Serge Petrenko
  0 siblings, 0 replies; 9+ messages in thread
From: Serge Petrenko @ 2019-11-18 16:11 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 3836 bytes --]

Hi! Thankyou for reviewing this.

I’ve managed to finally find the cause of test failures.
It is fixed in the second patch of the newly sent series.
I addressed your comments and resent v3.

--
Serge Petrenko
sergepetrenko@tarantool.org




> 16 нояб. 2019 г., в 0:39, Vladislav Shpilevoy <v.shpilevoy@tarantool.org> написал(а):
> 
> Hi! Thanks for the fixes!
> 
> See 2 comments below.
> 
> On 15/11/2019 15:58, Serge Petrenko wrote:
>> We didn't refresh last remembered clock on fiber.top_enable()
>> This means that the fiber issuing fiber.top_enable() would get a huge
>> chunk of cpu time on its first yield. Fix this.
>> Also reset clock_delta and cpu_miss_count.
>> 
>> If fiber.top() is issued on the same ev loop iteration as
>> fiber.top_enable(), clock_delta_last is 0 for both cord and all the
>> fibers, so report "instant" and "average" stats per last iteration as 0
>> instead of NaN.
>> 
>> Follow-up #2694
>> ---
>> src/lib/core/fiber.c | 21 +++++++++++++++++++++
>> src/lua/fiber.c      | 12 ++++++++++--
>> 2 files changed, 31 insertions(+), 2 deletions(-)
>> 
>> diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
>> index aebaba7f0..258c094f5 100644
>> --- a/src/lib/core/fiber.c
>> +++ b/src/lib/core/fiber.c
>> @@ -1203,9 +1203,30 @@ fiber_top_enable()
>> 		ev_check_start(cord()->loop, &cord()->check_event);
>> 		fiber_top_enabled = true;
>> 
>> +		/*
>> +		 * Reset cord and fiber clock stats in order to
>> +		 * count from zero even on reenable.
>> +		 */
>> 		cord()->clock_acc = 0;
>> 		cord()->cpu_miss_count_last = 0;
>> 		cord()->clock_delta_last = 0;
>> +		cord()->clock_delta = 0;
>> +
>> +		struct fiber * fiber;
>> +		rlist_foreach_entry(fiber, &cord()->alive, link) {
>> +			fiber->clock_acc = 0;
>> +			fiber->clock_delta_last = 0;
>> +			fiber->clock_delta = 0;
>> +			fiber->cputime = 0;
> 
> 1. fiber_reset() should do exactly the same, right?
> Then why doesn't it nullify clock_delta_last?

I just didn’t notice that, sorry.

> 
> This mess with numerous time fields, and their reset,
> looks like a necessity to create a new structure,
> which would be included into cord and fiber structures.
> And which would provide methods for cleaning and
> updating the members.

Done.

> 
>> +		}
>> +
>> +		cord()->sched.clock_acc = 0;
>> +		cord()->sched.clock_delta_last = 0;
>> +		cord()->sched.clock_delta = 0;
>> +		cord()->sched.cputime = 0;
>> +
>> +		cord()->clock_last = __rdtscp(&cord()->cpu_id_last);
>> +		cord()->cpu_miss_count = 0;
>> 		struct timespec ts;
>> 		if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts) != 0) {
>> 			say_debug("clock_gettime(): failed to get this"
>> diff --git a/src/lua/fiber.c b/src/lua/fiber.c
>> index 8b3b22e55..647505643 100644
>> --- a/src/lua/fiber.c
>> +++ b/src/lua/fiber.c
>> @@ -335,11 +335,19 @@ lbox_fiber_top_entry(struct fiber *f, void *cb_ctx)
>> 	lua_newtable(L);
>> 
>> 	lua_pushliteral(L, "average");
>> -	lua_pushnumber(L, f->clock_acc / (double)cord()->clock_acc * 100);
>> +	if (cord()->clock_acc)
> 
> 2. Sorry for a nit, we usually compare with 0
> explicitly, '!= 0', and avoid implicit casts.

No problem. Fixed.

> 
>> +		lua_pushnumber(L, f->clock_acc / (double)cord()->clock_acc * 100);
>> +	else
>> +		lua_pushnumber(L, 0);
>> 	lua_settable(L, -3);
>> +
>> 	lua_pushliteral(L, "instant");
>> -	lua_pushnumber(L, f->clock_delta_last / (double)cord()->clock_delta_last * 100);
>> +	if (cord()->clock_delta_last)
>> +		lua_pushnumber(L, f->clock_delta_last / (double)cord()->clock_delta_last * 100);
>> +	else
>> +		lua_pushnumber(L, 0);
>> 	lua_settable(L, -3);
>> +
>> 	lua_pushliteral(L, "time");
>> 	lua_pushnumber(L, f->cputime / (double) FIBER_TIME_RES);
>> 	lua_settable(L, -3);


[-- Attachment #2: Type: text/html, Size: 23177 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-11-18 16:11 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-15 14:58 [Tarantool-patches] [PATCH v2 0/2] fiber.top(): minor fixup Serge Petrenko
2019-11-15 14:58 ` [Tarantool-patches] [PATCH v2 1/2] fiber: reset clock stats on fiber.top_enable() Serge Petrenko
2019-11-15 16:11   ` Alexander Turenko
2019-11-15 18:23     ` Serge Petrenko
2019-11-15 21:39   ` Vladislav Shpilevoy
2019-11-18 16:11     ` Serge Petrenko
2019-11-15 14:58 ` [Tarantool-patches] [PATCH v2 2/2] app/fiber: wait till a full event loop iteration ends Serge Petrenko
2019-11-15 15:35   ` Serge Petrenko
2019-11-15 16:27   ` Alexander Turenko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox