From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp58.i.mail.ru (smtp58.i.mail.ru [217.69.128.38]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 5F6B743D678 for ; Wed, 23 Oct 2019 00:13:19 +0300 (MSK) References: From: Vladislav Shpilevoy Message-ID: <5b3c1fae-1c49-4b13-3a0e-0774c4add3fb@tarantool.org> Date: Tue, 22 Oct 2019 23:18:34 +0200 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [Tarantool-patches] [tarantool-patches] [PATCH v2 2/2] lua: add fiber.top() listing fiber cpu consumption List-Id: Tarantool development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: tarantool-patches@freelists.org, Serge Petrenko , georgy@tarantool.org, tarantool-patches@dev.tarantool.org Thanks for the patch! Sorry, the version in the email is a bit outdated. I pasted below the actual version from the branch. See 9 comments below. > lua: add fiber.top() listing fiber cpu consumption > > Implement a new function in Lua fiber library: top(). It returns a table > of alive fibers (including the scheduler). Each table entry has two > fields: average cpu consumption, which is calculated with exponential > moving average over event loop iterations, and current cpu consumption, > which shows fiber's cpu usage over the last event loop iteration. > The patch relies on CPU timestamp counter to measure each fiber's time > share. > > Closes #2694 > > @TarantoolBot document > Title: fiber: new function `fiber.top()` > > `fiber.top()` returns a table of all alive fibers and lists their cpu > consumption. Let's take a look at the example: > ``` > tarantool> fiber.top() > --- > - 1: > cpu average (%): 10.779696493982 > cpu instant (%): 10.435256168573 1. Have you considered making these option names one word? 'cpu average (%)' -> 'average'. The same for instant. My motivation is that it could simplify wrappers which are going to use the top to show it in a GUI or something. You actually use it in the tests. It would not hard readability IMO. It is obvious that top is in percents, and about CPU time. > 115: > cpu average (%): 5.4571279061075 > cpu instant (%): 5.9653973440576 > 120: > cpu average (%): 21.944382148464 > cpu instant (%): 23.849021825646 > 116: > cpu average (%): 8.6603872318158 > cpu instant (%): 9.6812031335093 > 119: > cpu average (%): 21.933168871944 > cpu instant (%): 20.007540530351 > cpu misses: 0 > 118: > cpu average (%): 19.342901995963 > cpu instant (%): 16.932679820703 > 117: > cpu average (%): 11.549674814981 > cpu instant (%): 13.128901177161 > ... > ``` 2. This is super cool! Could we give names to predefined fibers? For example, make an alias top.scheduler = top[1]. So as in the console output I would see: --- - scheduler: cpu average (%): 98.230148883023 cpu instant (%): 100 Instead of '1'. Because personally I didn't even know that 1 is always the scheduler, and I would be confused to see a regular fiber consumed 99% of time in a console. > In the table above keys are fiber id's (and a single 'cpu misses' key > which indicates the amount of times tx thread was rescheduled on a > different cpu core. More on that later). > The two metrics available for each fiber are: > 1) cpu instant (per cent), > which indicates the share of time fiber was executing during the > previous event loop iteration > 2) cpu average (per cent), which is calculated as an exponential moving > average of `cpu instant` values over all previous event loop iterations. > > More info on `cpu misses` field returned by `fiber.top()`: > `cpu misses` indicates the amount of times tx thread detected it was > rescheduled on a different cpu core during the last event loop > iteration. > fiber.top() uses cpu timestamp counter to measure each fiber's execution > time. However, each cpu core may have its own counter value (you can > only rely on counter deltas if both measurements were taken on the same > core, otherwise the delta may even get negative). > When tx thread is rescheduled to a different cpu core, tarantool just > assumes cpu delta was zero for the lust measurement. This loweres > precision of our computations, so the bigger `cpu misses` value the > lower the precision of fiber.top() results. > > Fiber.top() doesn't work on arm architecture at the moment. 2. Have you done benchmarks how much top slows down fiber switching? It should not be hard to measure. Just switching. Lots of fibers which only yield. And measure loop iterations per second. Won't work? Or average/median duration of one loop iteration. > > diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c > index b813c1739..bea49eb41 100644 > --- a/src/lib/core/fiber.c > +++ b/src/lib/core/fiber.c > @@ -37,11 +37,14 @@ > #include > #include > #include > - > #include "assoc.h" 3. Sorry, looks like stray diff. Could you please drop it? > #include "memory.h" > #include "trigger.h" > @@ -82,6 +85,34 @@ static int (*fiber_invoke)(fiber_func f, va_list ap); > err; \ > }) > > +#if ENABLE_FIBER_TOP > +/** > + * An action performed each time a context switch happens. > + * Used to count each fiber's processing time. > + * This is a macro rather than a function, since it is used > + * in scheduler too. 4. Hm. Not sure if I understand. Ok, it is used in the scheduler, so why does it prevent making it a function? Caller is always struct fiber *, so you can make it a function taking fiber * parameter, can't you? > + */ > +#define clock_set_on_csw(caller) \ > +({ \ > + uint64_t clock; \ > + uint32_t cpu_id; \ > + clock = __rdtscp(&cpu_id); \ > + \ > + if (cpu_id == cord()->cpu_id_last) { \ > + (caller)->clock_delta += clock - cord()->clock_last; \ > + cord()->clock_delta += clock - cord()->clock_last; \ > + } else { \ > + cord()->cpu_id_last = cpu_id; \ > + cord()->cpu_miss_count++; \ > + } \ > + cord()->clock_last = clock; \ > +})> @@ -584,6 +617,7 @@ fiber_schedule_list(struct rlist *list) > } > last->caller = fiber(); > assert(fiber() == &cord()->sched); > + clock_set_on_csw(fiber()); 5. Am I right, that we need to update CPU consumption each time a context switch happens? If so, then by logic you would need to call clock_set_on_csw() on each fiber->csw++, correct? Currently that happens in 2 places: fiber_call_impl() and fiber_yield(). In both these places you already call clock_set_on_csw(): - You always call clock_set_on_csw() right before fiber_call_impl(); - You call clock_set_on_csw() in fiber_yield(). I would then move fiber->cws++ logic into your macros too. So all the context switch and CPU time tracking logic would be in one place. The only problem is that your macros takes caller, but csw++ is done for callee. Probably we could increment it for the caller instead, I don't anything depends on that. Also it would lead to increment of the scheduler's csw counter. Not sure whether it is ok. Like that (I didn't test it): ============================================================ diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c index bea49eb41..e9aac99d0 100644 --- a/src/lib/core/fiber.c +++ b/src/lib/core/fiber.c @@ -105,6 +105,7 @@ static int (*fiber_invoke)(fiber_func f, va_list ap); cord()->cpu_id_last = cpu_id; \ cord()->cpu_miss_count++; \ } \ + caller->csw++; \ cord()->clock_last = clock; \ }) @@ -258,7 +259,7 @@ fiber_call_impl(struct fiber *callee) cord->fiber = callee; callee->flags &= ~FIBER_IS_READY; - callee->csw++; + clock_set_on_csw(caller); ASAN_START_SWITCH_FIBER(asan_state, 1, callee->stack, callee->stack_size); @@ -277,7 +278,6 @@ fiber_call(struct fiber *callee) /** By convention, these triggers must not throw. */ if (! rlist_empty(&caller->on_yield)) trigger_run(&caller->on_yield, NULL); - clock_set_on_csw(caller); callee->caller = caller; callee->flags |= FIBER_IS_READY; caller->flags |= FIBER_IS_READY; @@ -511,7 +511,6 @@ fiber_yield(void) assert(callee->flags & FIBER_IS_READY || callee == &cord->sched); assert(! (callee->flags & FIBER_IS_DEAD)); cord->fiber = callee; - callee->csw++; callee->flags &= ~FIBER_IS_READY; ASAN_START_SWITCH_FIBER(asan_state, (caller->flags & FIBER_IS_DEAD) == 0, @@ -617,7 +616,6 @@ fiber_schedule_list(struct rlist *list) } last->caller = fiber(); assert(fiber() == &cord()->sched); - clock_set_on_csw(fiber()); fiber_call_impl(first); } ============================================================ > fiber_call_impl(first); > } > > @@ -1044,6 +1082,107 @@ fiber_destroy_all(struct cord *cord)> +bool > +fiber_top_is_enabled() > +{ > + return fiber_top_enabled; > +} > + > +inline void > +fiber_top_enable() 6. Please, add 'static' modifier. Looks like this function is used inside this file only. The same for fiber_top_disable(). > +{ > + if (!fiber_top_enabled) { > + ev_prepare_start(cord()->loop, &cord()->prepare_event); > + ev_check_start(cord()->loop, &cord()->check_event); > + fiber_top_enabled = true; > + > + cord()->clock_acc = 0; > + cord()->cpu_miss_count_last = 0; > + cord()->clock_delta_last = 0; > + } > +} > @@ -1077,6 +1216,14 @@ cord_create(struct cord *cord, const char *name) > ev_async_init(&cord->wakeup_event, fiber_schedule_wakeup); > > ev_idle_init(&cord->idle_event, fiber_schedule_idle); > + > +#if ENABLE_FIBER_TOP > + /* fiber.top() currently works only for the main thread. */ > + if (cord_is_main()) { > + fiber_top_init(); > + fiber_top_enable(); 7. I think, we need to enable top by default only when we are sure that it does not affect performance. > + } > +#endif /* ENABLE_FIBER_TOP */ > cord_set_name(name); > > #if ENABLE_ASAN > diff --git a/src/lua/fiber.c b/src/lua/fiber.c > index 336be60a2..2adff1536 100644 > --- a/src/lua/fiber.c > +++ b/src/lua/fiber.c > @@ -319,6 +319,61 @@ lbox_fiber_statof_nobt(struct fiber *f, void *cb_ctx) > + > +static int > +lbox_fiber_top(struct lua_State *L) > +{ > + if (!fiber_top_is_enabled()) { > + luaL_error(L, "fiber.top() is disabled. enable it with" 8. 'enable' is a first word in the sentence. So probably it is 'Enable'. > + " fiber.top_enable() first"); > + } > + lua_newtable(L); > + lua_pushliteral(L, "cpu misses"); > + lua_pushnumber(L, cord()->cpu_miss_count_last); > + lua_settable(L, -3); > + > + lbox_fiber_top_entry(&cord()->sched, L); > + fiber_stat(lbox_fiber_top_entry, L); > + > + return 1; > +} > diff --git a/test/app/fiber.result b/test/app/fiber.result > index 3c6115a33..196fae3b7 100644 > --- a/test/app/fiber.result > +++ b/test/app/fiber.result > @@ -1462,6 +1462,84 @@ fiber.join(fiber.self()) > --- > - error: the fiber is not joinable > ... > +sum = 0 > +--- > +... > +-- gh-2694 fiber.top() > +a = fiber.top() > +--- > +... > +-- scheduler is present in fiber.top() > +a[1] ~= nil > +--- > +- true > +... > +type(a["cpu misses"]) == 'number' > +--- > +- true > +... > +sum_inst = 0 > +--- > +... > +sum_avg = 0 > +--- > +... > +test_run:cmd('setopt delimiter ";"') 9. I would consider usage of '\' instead of a delimiter change for such a short piece of multiline code. But up to you. Just in case you didn't know about '\', because I didn't know either until recently.