[Tarantool-patches] [tarantool-patches] [PATCH v2 2/2] lua: add fiber.top() listing fiber cpu consumption

Serge Petrenko sergepetrenko at tarantool.org
Fri Oct 25 18:13:32 MSK 2019


Hi! Thank you for review!
I addressed your comments and answered inline.
Incremental diff is below.

> 23 окт. 2019 г., в 0:18, Vladislav Shpilevoy <v.shpilevoy at tarantool.org> написал(а):
> 
> Thanks for the patch!
> 
> Sorry, the version in the email is a bit outdated. I
> pasted below the actual version from the branch.
> 
> See 9 comments below.
> 
>>    lua: add fiber.top() listing fiber cpu consumption
>> 
>>    Implement a new function in Lua fiber library: top(). It returns a table
>>    of alive fibers (including the scheduler). Each table entry has two
>>    fields: average cpu consumption, which is calculated with exponential
>>    moving average over event loop iterations, and current cpu consumption,
>>    which shows fiber's cpu usage over the last event loop iteration.
>>    The patch relies on CPU timestamp counter to measure each fiber's time
>>    share.
>> 
>>    Closes #2694
>> 
>>    @TarantoolBot document
>>    Title: fiber: new function `fiber.top()`
>> 
>>    `fiber.top()` returns a table of all alive fibers and lists their cpu
>>    consumption. Let's take a look at the example:
>>    ```
>>    tarantool> fiber.top()
>>    ---
>>    - 1:
>>        cpu average (%): 10.779696493982
>>        cpu instant (%): 10.435256168573
> 
> 1. Have you considered making these option names
> one word? 'cpu average (%)' -> 'average'. The
> same for instant. My motivation is that it could
> simplify wrappers which are going to use the top
> to show it in a GUI or something. You actually use
> it in the tests.
> 
> It would not hard readability IMO. It is obvious
> that top is in percents, and about CPU time.

No problem.

> 
>>      115:
>>        cpu average (%): 5.4571279061075
>>        cpu instant (%): 5.9653973440576
>>      120:
>>        cpu average (%): 21.944382148464
>>        cpu instant (%): 23.849021825646
>>      116:
>>        cpu average (%): 8.6603872318158
>>        cpu instant (%): 9.6812031335093
>>      119:
>>        cpu average (%): 21.933168871944
>>        cpu instant (%): 20.007540530351
>>      cpu misses: 0
>>      118:
>>        cpu average (%): 19.342901995963
>>        cpu instant (%): 16.932679820703
>>      117:
>>        cpu average (%): 11.549674814981
>>        cpu instant (%): 13.128901177161
>>    ...
>>    ```
> 
> 2. This is super cool!
> 
> Could we give names to predefined fibers? For
> example, make an alias top.scheduler = top[1].
> 
> So as in the console output I would see:
> 
>    ---
>    - scheduler:
>        cpu average (%): 98.230148883023
>        cpu instant (%): 100
> 
> Instead of '1'. Because personally I didn't
> even know that 1 is always the scheduler, and
> I would be confused to see a regular fiber
> consumed 99% of time in a console.

Hmm, what about this?

--- a/src/lua/fiber.c
+++ b/src/lua/fiber.c
@@ -325,13 +325,18 @@ lbox_fiber_top_entry(struct fiber *f, void *cb_ctx)
 {
        struct lua_State *L = (struct lua_State *) cb_ctx;
 
-       lua_pushinteger(L, f->fid);
+       /* Index system fibers by name instead of id */
+       if (f->fid <= FIBER_ID_MAX_RESERVED) {
+               lua_pushstring(L, f->name);
+       } else {
+               lua_pushinteger(L, f->fid);
+       }
        lua_newtable(L);


> 
>>    In the table above keys are fiber id's (and a single 'cpu misses' key
>>    which indicates the amount of times tx thread was rescheduled on a
>>    different cpu core. More on that later).
>>    The two metrics available for each fiber are:
>>    1) cpu instant (per cent),
>>    which indicates the share of time fiber was executing during the
>>    previous event loop iteration
>>    2) cpu average (per cent), which is calculated as an exponential moving
>>    average of `cpu instant` values over all previous event loop iterations.
>> 
>>    More info on `cpu misses` field returned by `fiber.top()`:
>>    `cpu misses` indicates the amount of times tx thread detected it was
>>    rescheduled on a different cpu core during the last event loop
>>    iteration.
>>    fiber.top() uses cpu timestamp counter to measure each fiber's execution
>>    time. However, each cpu core may have its own counter value (you can
>>    only rely on counter deltas if both measurements were taken on the same
>>    core, otherwise the delta may even get negative).
>>    When tx thread is rescheduled to a different cpu core, tarantool just
>>    assumes cpu delta was zero for the lust measurement. This loweres
>>    precision of our computations, so the bigger `cpu misses` value the
>>    lower the precision of fiber.top() results.
>> 
>>    Fiber.top() doesn't work on arm architecture at the moment.
> 
> 2. Have you done benchmarks how much top slows down fiber
> switching? It should not be hard to measure. Just switching.
> Lots of fibers which only yield. And measure loop iterations
> per second. Won't work? Or average/median duration of one
> loop iteration.

Hi! I tested context switches per second.
The results are in the issue comments:
https://github.com/tarantool/tarantool/issues/2694#issuecomment-546381304 <https://github.com/tarantool/tarantool/issues/2694#issuecomment-546381304>
Performance degradation is between 10 and 16 per cent.

>> 
>> diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
>> index b813c1739..bea49eb41 100644
>> --- a/src/lib/core/fiber.c
>> +++ b/src/lib/core/fiber.c
>> @@ -37,11 +37,14 @@
>> #include <stdlib.h>
>> #include <string.h>
>> #include <pmatomic.h>
>> -
>> #include "assoc.h"
> 
> 3. Sorry, looks like stray diff. Could you
> please drop it?
> 

Thanks for noticing! Fixed.

>> #include "memory.h"
>> #include "trigger.h"
>> @@ -82,6 +85,34 @@ static int (*fiber_invoke)(fiber_func f, va_list ap);
>> 	err;								\
>> })
>> 
>> +#if ENABLE_FIBER_TOP
>> +/**
>> + * An action performed each time a context switch happens.
>> + * Used to count each fiber's processing time.
>> + * This is a macro rather than a function, since it is used
>> + * in scheduler too.
> 
> 4. Hm. Not sure if I understand. Ok, it is used in the scheduler,
> so why does it prevent making it a function? Caller is always
> struct fiber *, so you can make it a function taking fiber *
> parameter, can't you?

You cannot call functions from scheduler cos it doesn’t have a
stack AFAIU. I can make it an inline function if you want me to.

> 
>> + */
>> +#define clock_set_on_csw(caller)					\
>> +({									\
>> +	uint64_t clock;							\
>> +	uint32_t cpu_id;		 				\
>> +	clock = __rdtscp(&cpu_id);					\
>> +									\
>> +	if (cpu_id == cord()->cpu_id_last) {				\
>> +		(caller)->clock_delta += clock - cord()->clock_last;	\
>> +		cord()->clock_delta += clock - cord()->clock_last;	\
>> +	} else {							\
>> +		cord()->cpu_id_last = cpu_id;				\
>> +		cord()->cpu_miss_count++;				\
>> +	}								\
>> +	cord()->clock_last = clock;					\
>> +})> @@ -584,6 +617,7 @@ fiber_schedule_list(struct rlist *list)
>> 	}
>> 	last->caller = fiber();
>> 	assert(fiber() == &cord()->sched);
>> +	clock_set_on_csw(fiber());
> 
> 5. Am I right, that we need to update CPU consumption each
> time a context switch happens?

Yep.

> If so, then by logic you
> would need to call clock_set_on_csw() on each fiber->csw++,
> correct?

Right. I also call clock_set_on_csw() on each event loop iteration
end.

> 
> Currently that happens in 2 places: fiber_call_impl() and
> fiber_yield(). In both these places you already call
> clock_set_on_csw():
> 
>    - You always call clock_set_on_csw() right before
>      fiber_call_impl();
> 
>    - You call clock_set_on_csw() in fiber_yield().
> 
> I would then move fiber->cws++ logic into your macros too.
> So all the context switch and CPU time tracking logic
> would be in one place.
> 
> The only problem is that your macros takes caller, but csw++
> is done for callee. Probably we could increment it for the
> caller instead, I don't anything depends on that.
> 
> Also it would lead to increment of the scheduler's csw
> counter. Not sure whether it is ok.

Loop iteration end isn’t a context switch, but I still update
clocks here. I don’t want to increment csw here as a side
effect, so maybe leave csw increment as it is?

> 
> Like that (I didn't test it):
> 
> ============================================================
> 
> diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
> index bea49eb41..e9aac99d0 100644
> --- a/src/lib/core/fiber.c
> +++ b/src/lib/core/fiber.c
> @@ -105,6 +105,7 @@ static int (*fiber_invoke)(fiber_func f, va_list ap);
> 		cord()->cpu_id_last = cpu_id;				\
> 		cord()->cpu_miss_count++;				\
> 	}								\
> +	caller->csw++;							\
> 	cord()->clock_last = clock;					\
> })
> 
> @@ -258,7 +259,7 @@ fiber_call_impl(struct fiber *callee)
> 	cord->fiber = callee;
> 
> 	callee->flags &= ~FIBER_IS_READY;
> -	callee->csw++;
> +	clock_set_on_csw(caller);
> 	ASAN_START_SWITCH_FIBER(asan_state, 1,
> 				callee->stack,
> 				callee->stack_size);
> @@ -277,7 +278,6 @@ fiber_call(struct fiber *callee)
> 	/** By convention, these triggers must not throw. */
> 	if (! rlist_empty(&caller->on_yield))
> 		trigger_run(&caller->on_yield, NULL);
> -	clock_set_on_csw(caller);
> 	callee->caller = caller;
> 	callee->flags |= FIBER_IS_READY;
> 	caller->flags |= FIBER_IS_READY;
> @@ -511,7 +511,6 @@ fiber_yield(void)
> 	assert(callee->flags & FIBER_IS_READY || callee == &cord->sched);
> 	assert(! (callee->flags & FIBER_IS_DEAD));
> 	cord->fiber = callee;
> -	callee->csw++;
> 	callee->flags &= ~FIBER_IS_READY;
> 	ASAN_START_SWITCH_FIBER(asan_state,
> 				(caller->flags & FIBER_IS_DEAD) == 0,
> @@ -617,7 +616,6 @@ fiber_schedule_list(struct rlist *list)
> 	}
> 	last->caller = fiber();
> 	assert(fiber() == &cord()->sched);
> -	clock_set_on_csw(fiber());
> 	fiber_call_impl(first);
> }
> 
> ============================================================
> 
>> 	fiber_call_impl(first);
>> }
>> 
>> @@ -1044,6 +1082,107 @@ fiber_destroy_all(struct cord *cord)> +bool
>> +fiber_top_is_enabled()
>> +{
>> +	return fiber_top_enabled;
>> +}
>> +
>> +inline void
>> +fiber_top_enable()
> 
> 6. Please, add 'static' modifier. Looks like
> this function is used inside this file only.
> The same for fiber_top_disable().

It’s used in lua fiber module.

> 
>> +{
>> +	if (!fiber_top_enabled) {
>> +		ev_prepare_start(cord()->loop, &cord()->prepare_event);
>> +		ev_check_start(cord()->loop, &cord()->check_event);
>> +		fiber_top_enabled = true;
>> +
>> +		cord()->clock_acc = 0;
>> +		cord()->cpu_miss_count_last = 0;
>> +		cord()->clock_delta_last = 0;
>> +	}
>> +}
>> @@ -1077,6 +1216,14 @@ cord_create(struct cord *cord, const char *name)
>> 	ev_async_init(&cord->wakeup_event, fiber_schedule_wakeup);
>> 
>> 	ev_idle_init(&cord->idle_event, fiber_schedule_idle);
>> +
>> +#if ENABLE_FIBER_TOP
>> +	/* fiber.top() currently works only for the main thread. */
>> +	if (cord_is_main()) {
>> +		fiber_top_init();
>> +		fiber_top_enable();
> 
> 7. I think, we need to enable top by default only
> when we are sure that it does not affect performance.

Ok, disabled it by default. User can turn it on with
fiber.top_enable(). This also resolved the issue I had
with your swim test.

Looking at perf difference, maybe we can turn it on by default?

> 
>> +	}
>> +#endif /* ENABLE_FIBER_TOP */
>> 	cord_set_name(name);
>> 
>> #if ENABLE_ASAN
>> diff --git a/src/lua/fiber.c b/src/lua/fiber.c
>> index 336be60a2..2adff1536 100644
>> --- a/src/lua/fiber.c
>> +++ b/src/lua/fiber.c
>> @@ -319,6 +319,61 @@ lbox_fiber_statof_nobt(struct fiber *f, void *cb_ctx)
>> +
>> +static int
>> +lbox_fiber_top(struct lua_State *L)
>> +{
>> +	if (!fiber_top_is_enabled()) {
>> +		luaL_error(L, "fiber.top() is disabled. enable it with"
> 
> 8. 'enable' is a first word in the sentence. So probably
> it is 'Enable’.

Yep. Done.

> 
>> +			      " fiber.top_enable() first");
>> +	}
>> +	lua_newtable(L);
>> +	lua_pushliteral(L, "cpu misses");
>> +	lua_pushnumber(L, cord()->cpu_miss_count_last);
>> +	lua_settable(L, -3);
>> +
>> +	lbox_fiber_top_entry(&cord()->sched, L);
>> +	fiber_stat(lbox_fiber_top_entry, L);
>> +
>> +	return 1;
>> +}
>> diff --git a/test/app/fiber.result b/test/app/fiber.result
>> index 3c6115a33..196fae3b7 100644
>> --- a/test/app/fiber.result
>> +++ b/test/app/fiber.result
>> @@ -1462,6 +1462,84 @@ fiber.join(fiber.self())
>> ---
>> - error: the fiber is not joinable
>> ...
>> +sum = 0
>> +---
>> +...
>> +-- gh-2694 fiber.top()
>> +a = fiber.top()
>> +---
>> +...
>> +-- scheduler is present in fiber.top()
>> +a[1] ~= nil
>> +---
>> +- true
>> +...
>> +type(a["cpu misses"]) == 'number'
>> +---
>> +- true
>> +...
>> +sum_inst = 0
>> +---
>> +...
>> +sum_avg = 0
>> +---
>> +...
>> +test_run:cmd('setopt delimiter ";"')
> 
> 9. I would consider usage of '\' instead of a
> delimiter change for such a short piece of
> multiline code. But up to you. Just in case
> you didn't know about '\', because I didn't
> know either until recently.
> 


I didn’t know about ‘\’ as well. Thanks.


Here’s the incremental diff:

diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
index bea49eb41..883fbac36 100644
--- a/src/lib/core/fiber.c
+++ b/src/lib/core/fiber.c
@@ -37,6 +37,7 @@
 #include <stdlib.h>
 #include <string.h>
 #include <pmatomic.h>
+
 #include "assoc.h"
 #include "memory.h"
 #include "trigger.h"
@@ -1221,7 +1222,6 @@ cord_create(struct cord *cord, const char *name)
 	/* fiber.top() currently works only for the main thread. */
 	if (cord_is_main()) {
 		fiber_top_init();
-		fiber_top_enable();
 	}
 #endif /* ENABLE_FIBER_TOP */
 	cord_set_name(name);
diff --git a/src/lua/fiber.c b/src/lua/fiber.c
index 62f3ccce4..c38bd886f 100644
--- a/src/lua/fiber.c
+++ b/src/lua/fiber.c
@@ -325,13 +325,18 @@ lbox_fiber_top_entry(struct fiber *f, void *cb_ctx)
 {
 	struct lua_State *L = (struct lua_State *) cb_ctx;
 
-	lua_pushinteger(L, f->fid);
+	/* Index system fibers by name instead of id */
+	if (f->fid <= FIBER_ID_MAX_RESERVED) {
+		lua_pushstring(L, f->name);
+	} else {
+		lua_pushinteger(L, f->fid);
+	}
 	lua_newtable(L);
 
-	lua_pushliteral(L, "cpu average (%)");
+	lua_pushliteral(L, "average");
 	lua_pushnumber(L, f->clock_acc / (double)cord()->clock_acc * 100);
 	lua_settable(L, -3);
-	lua_pushliteral(L, "cpu instant (%)");
+	lua_pushliteral(L, "instant");
 	lua_pushnumber(L, f->clock_delta_last / (double)cord()->clock_delta_last * 100);
 	lua_settable(L, -3);
 	lua_settable(L, -3);
@@ -343,7 +348,7 @@ static int
 lbox_fiber_top(struct lua_State *L)
 {
 	if (!fiber_top_is_enabled()) {
-		luaL_error(L, "fiber.top() is disabled. enable it with"
+		luaL_error(L, "fiber.top() is disabled. Enable it with"
 			      " fiber.top_enable() first");
 	}
 	lua_newtable(L);
diff --git a/test/app/fiber.result b/test/app/fiber.result
index 196fae3b7..e45cd7639 100644
--- a/test/app/fiber.result
+++ b/test/app/fiber.result
@@ -1466,11 +1466,19 @@ sum = 0
 ---
 ...
 -- gh-2694 fiber.top()
+fiber.top_enable()
+---
+...
 a = fiber.top()
 ---
 ...
+type(a)
+---
+- table
+...
 -- scheduler is present in fiber.top()
-a[1] ~= nil
+-- and is indexed by name
+a["sched"] ~= nil
 ---
 - true
 ...
@@ -1484,21 +1492,19 @@ sum_inst = 0
 sum_avg = 0
 ---
 ...
-test_run:cmd('setopt delimiter ";"')
----
-- true
-...
-for k, v in pairs(a) do
-    if type(v) == 'table' then
-        sum_inst = sum_inst + v["cpu instant (%)"]
-        sum_avg = sum_avg + v["cpu average (%)"]
-    end
-end;
+-- update table to make sure
+-- a full event loop iteration
+-- has ended
+a = fiber.top()
 ---
 ...
-test_run:cmd('setopt delimiter ""');
+for k, v in pairs(a) do\
+    if type(v) == 'table' then\
+        sum_inst = sum_inst + v["instant"]\
+        sum_avg = sum_avg + v["average"]\
+    end\
+end
 ---
-- true
 ...
 sum_inst
 ---
@@ -1518,11 +1524,11 @@ f = fiber.new(function() for i = 1,1000 do end fiber.yield() tbl = fiber.top()[f
 while f:status() ~= 'dead' do fiber.sleep(0.01) end
 ---
 ...
-tbl["cpu average (%)"] > 0
+tbl["average"] > 0
 ---
 - true
 ...
-tbl["cpu instant (%)"] > 0
+tbl["instant"] > 0
 ---
 - true
 ...
@@ -1531,14 +1537,7 @@ fiber.top_disable()
 ...
 fiber.top()
 ---
-- error: fiber.top() is disabled. enable it with fiber.top_enable() first
-...
-fiber.top_enable()
----
-...
-type(fiber.top())
----
-- table
+- error: fiber.top() is disabled. Enable it with fiber.top_enable() first
 ...
 -- cleanup
 test_run:cmd("clear filter")
diff --git a/test/app/fiber.test.lua b/test/app/fiber.test.lua
index 15507c2c7..e30aba77e 100644
--- a/test/app/fiber.test.lua
+++ b/test/app/fiber.test.lua
@@ -632,33 +632,39 @@ fiber.join(fiber.self())
 sum = 0
 
 -- gh-2694 fiber.top()
+fiber.top_enable()
+
 a = fiber.top()
+type(a)
 -- scheduler is present in fiber.top()
-a[1] ~= nil
+-- and is indexed by name
+a["sched"] ~= nil
 type(a["cpu misses"]) == 'number'
 sum_inst = 0
 sum_avg = 0
-test_run:cmd('setopt delimiter ";"')
-for k, v in pairs(a) do
-    if type(v) == 'table' then
-        sum_inst = sum_inst + v["cpu instant (%)"]
-        sum_avg = sum_avg + v["cpu average (%)"]
-    end
-end;
-test_run:cmd('setopt delimiter ""');
+
+-- update table to make sure
+-- a full event loop iteration
+-- has ended
+a = fiber.top()
+for k, v in pairs(a) do\
+    if type(v) == 'table' then\
+        sum_inst = sum_inst + v["instant"]\
+        sum_avg = sum_avg + v["average"]\
+    end\
+end
+
 sum_inst
 -- not exact due to accumulated integer division errors
 sum_avg > 99 and sum_avg < 101 or sum_avg
 tbl = nil
 f = fiber.new(function() for i = 1,1000 do end fiber.yield() tbl = fiber.top()[fiber.self().id()] end)
 while f:status() ~= 'dead' do fiber.sleep(0.01) end
-tbl["cpu average (%)"] > 0
-tbl["cpu instant (%)"] > 0
+tbl["average"] > 0
+tbl["instant"] > 0
 
 fiber.top_disable()
 fiber.top()
-fiber.top_enable()
-type(fiber.top())
 
 -- cleanup
 test_run:cmd("clear filter")

--
Serge Petrenko
sergepetrenko at tarantool.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.tarantool.org/pipermail/tarantool-patches/attachments/20191025/7ff142b4/attachment.html>


More information about the Tarantool-patches mailing list