From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp16.mail.ru (smtp16.mail.ru [94.100.176.153]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 69BEA430D56 for ; Mon, 28 Oct 2019 19:16:28 +0300 (MSK) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.11\)) From: Serge Petrenko In-Reply-To: <4d729ef4-6b21-fa22-b379-dd670ac6fcc5@tarantool.org> Date: Mon, 28 Oct 2019 19:16:27 +0300 Content-Transfer-Encoding: quoted-printable Message-Id: <0E4E89B7-CA95-48D2-A719-F19DF6541CD0@tarantool.org> References: <5b3c1fae-1c49-4b13-3a0e-0774c4add3fb@tarantool.org> <4d729ef4-6b21-fa22-b379-dd670ac6fcc5@tarantool.org> Subject: Re: [Tarantool-patches] [tarantool-patches] [PATCH v2 2/2] lua: add fiber.top() listing fiber cpu consumption List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Vladislav Shpilevoy Cc: tarantool-patches@freelists.org, tarantool-patches@dev.tarantool.org Hi! Thanks for your answer! I=E2=80=99ve updated the branch, please take a look at the incremental = diff below. > 26 =D0=BE=D0=BA=D1=82. 2019 =D0=B3., =D0=B2 21:00, Vladislav Shpilevoy = =D0=BD=D0=B0=D0=BF=D0=B8=D1=81=D0=B0=D0=BB(=D0= =B0): >=20 > Hi! Thanks for the fixes! >=20 >>>> 115: >>>> cpu average (%): 5.4571279061075 >>>> cpu instant (%): 5.9653973440576 >>>> 120: >>>> cpu average (%): 21.944382148464 >>>> cpu instant (%): 23.849021825646 >>>> 116: >>>> cpu average (%): 8.6603872318158 >>>> cpu instant (%): 9.6812031335093 >>>> 119: >>>> cpu average (%): 21.933168871944 >>>> cpu instant (%): 20.007540530351 >>>> cpu misses: 0 >>>> 118: >>>> cpu average (%): 19.342901995963 >>>> cpu instant (%): 16.932679820703 >>>> 117: >>>> cpu average (%): 11.549674814981 >>>> cpu instant (%): 13.128901177161 >>>> ... >>>> ``` >>>=20 >>> 2. This is super cool! >>>=20 >>> Could we give names to predefined fibers? For >>> example, make an alias top.scheduler =3D top[1]. >>>=20 >>> So as in the console output I would see: >>>=20 >>> --- >>> - scheduler: >>> cpu average (%): 98.230148883023 >>> cpu instant (%): 100 >>>=20 >>> Instead of '1'. Because personally I didn't >>> even know that 1 is always the scheduler, and >>> I would be confused to see a regular fiber >>> consumed 99% of time in a console. >>=20 >> Hmm, what about this? >>=20 >> --- a/src/lua/fiber.c >> +++ b/src/lua/fiber.c >> @@ -325,13 +325,18 @@ lbox_fiber_top_entry(struct fiber *f, void = *cb_ctx) >> { >> struct lua_State *L =3D (struct lua_State *) cb_ctx; >> =20 >> - lua_pushinteger(L, f->fid); >> + /* Index system fibers by name instead of id */ >> + if (f->fid <=3D FIBER_ID_MAX_RESERVED) { >> + lua_pushstring(L, f->name); >> + } else { >> + lua_pushinteger(L, f->fid); >> + } >> lua_newtable(L); >>=20 >>=20 >=20 > What about always using a name, when it is not empty? When a > system is running, it might be hard to find to which fiber > an ID belongs. Or 'name:id' like in logs (or whatever format > we use in the logs). >=20 > For example I started a vshard recovery fiber, it somewhy > consumes 99% CPU, but I don't know its ID. And I would need > to use vshard.storage.internal.recovery_fiber:id() to find it. >=20 > Up to you, and probably to Kirill since this is public API. >=20 Let=E2=80=99s stick with `id/name` just like in the logs then. >>>> #include "memory.h" >>>> #include "trigger.h" >>>> @@ -82,6 +85,34 @@ static int (*fiber_invoke)(fiber_func f, va_list = ap); >>>> err;\ >>>> }) >>>>=20 >>>> +#if ENABLE_FIBER_TOP >>>> +/** >>>> + * An action performed each time a context switch happens. >>>> + * Used to count each fiber's processing time. >>>> + * This is a macro rather than a function, since it is used >>>> + * in scheduler too. >>>=20 >>> 4. Hm. Not sure if I understand. Ok, it is used in the scheduler, >>> so why does it prevent making it a function? Caller is always >>> struct fiber *, so you can make it a function taking fiber * >>> parameter, can't you? >>=20 >> You cannot call functions from scheduler cos it doesn=E2=80=99t have = a >> stack AFAIU. I can make it an inline function if you want me to. >>=20 >=20 > Does not have a stack? Then how does it work at all? You use > stack in your macros - you declare a couple of variables. Also > intermediate arithmetic results might be saved on the stack. >=20 Yes, you=E2=80=99re right, thanks. It was a strange belief of mine. > Talking of an 'inline' function - if that will work, then it > would look better IMO. At least it won't raise questions why > it is not a function. Will do then. >=20 >>> 5. Am I right, that we need to update CPU consumption each >>> time a context switch happens?=20 >>=20 >> Yep. >>=20 >>> If so, then by logic you >>> would need to call clock_set_on_csw() on each fiber->csw++, >>> correct? >>=20 >> Right. I also call clock_set_on_csw() on each event loop iteration >> end. >>=20 >>>=20 >>> Currently that happens in 2 places: fiber_call_impl() and >>> fiber_yield(). In both these places you already call >>> clock_set_on_csw(): >>>=20 >>> - You always call clock_set_on_csw() right before >>> fiber_call_impl(); >>>=20 >>> - You call clock_set_on_csw() in fiber_yield(). >>>=20 >>> I would then move fiber->cws++ logic into your macros too. >>> So all the context switch and CPU time tracking logic >>> would be in one place. >>>=20 >>> The only problem is that your macros takes caller, but csw++ >>> is done for callee. Probably we could increment it for the >>> caller instead, I don't anything depends on that. >>>=20 >>> Also it would lead to increment of the scheduler's csw >>> counter. Not sure whether it is ok. >>=20 >> Loop iteration end isn=E2=80=99t a context switch, but I still update >> clocks here. >=20 > Yes, you update, but not via clock_set_on_csw(). Am I wrong? > The only fiber which gets updated in the end of the loop via > clock_set_on_csw() is the scheduler. For other fibers you > manually update clock_acc, clock_delta_last, clock_delta in > loop_on_iteration_end(). Yep, I was talking about the sched. >=20 >> I don=E2=80=99t want to increment csw here as a side >> effect, so maybe leave csw increment as it is? >>=20 >=20 > If you are talking about the scheduler's csw counter, then > I don't think it matters. 'Csw' is just statistics. It is > not used for anything. Besides, why a switch to the scheduler > does not count? End of the loop =3D switch to the scheduler, so > it means its csw should be incremented, shouldn't it? Ok >=20 >>>> ev_idle_init(&cord->idle_event, fiber_schedule_idle); >>>> + >>>> +#if ENABLE_FIBER_TOP >>>> +/* fiber.top () currently works only for the = main thread. */ >>>> +if (cord_is_main()) { >>>> +fiber_top_init(); >>>> +fiber_top_enable(); >>>=20 >>> 7. I think, we need to enable top by default only >>> when we are sure that it does not affect performance. >>=20 >> Ok, disabled it by default. User can turn it on with >> fiber.top_enable(). This also resolved the issue I had >> with your swim test. >>=20 >> Looking at perf difference, maybe we can turn it on by default? >=20 > IMO, 10-16% looks quite significant taking into account, > that 99% of users won't use it at all, and others only for > debug of rare exceptional cases. For which they can call > enable(), get top(), and disable() back. I think, we should > ask Kirill. I am against default on. Ok diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c index e3fe77e21..4c646b278 100644 --- a/src/lib/core/fiber.c +++ b/src/lib/core/fiber.c @@ -87,31 +87,32 @@ static int (*fiber_invoke)(fiber_func f, va_list = ap); }) =20 #if ENABLE_FIBER_TOP +static bool fiber_top_enabled =3D false; + /** * An action performed each time a context switch happens. * Used to count each fiber's processing time. - * This is a macro rather than a function, since it is used - * in scheduler too. */ -#define clock_set_on_csw(caller) = \ -({ = \ - if (fiber_top_enabled) { = \ - uint64_t clock; = \ - uint32_t cpu_id; = \ - clock =3D __rdtscp(&cpu_id); = \ - = \ - if (cpu_id =3D=3D cord()->cpu_id_last) { = \ - (caller)->clock_delta +=3D clock - = cord()->clock_last; \ - cord()->clock_delta +=3D clock - = cord()->clock_last; \ - } else { = \ - cord()->cpu_id_last =3D cpu_id; = \ - cord()->cpu_miss_count++; = \ - } = \ - cord()->clock_last =3D clock; = \ - } = \ -}) +static inline void +clock_set_on_csw(struct fiber *caller) +{ + caller->csw++; + if (!fiber_top_enabled) + return; =20 -static bool fiber_top_enabled =3D false; + uint64_t clock; + uint32_t cpu_id; + clock =3D __rdtscp(&cpu_id); + + if (cpu_id =3D=3D cord()->cpu_id_last) { + caller->clock_delta +=3D clock - cord()->clock_last; + cord()->clock_delta +=3D clock - cord()->clock_last; + } else { + cord()->cpu_id_last =3D cpu_id; + cord()->cpu_miss_count++; + } + cord()->clock_last =3D clock; +} =20 #else #define clock_set_on_csw(caller) ; @@ -262,7 +263,6 @@ fiber_call_impl(struct fiber *callee) cord->fiber =3D callee; =20 callee->flags &=3D ~FIBER_IS_READY; - callee->csw++; ASAN_START_SWITCH_FIBER(asan_state, 1, callee->stack, callee->stack_size); @@ -515,7 +515,6 @@ fiber_yield(void) assert(callee->flags & FIBER_IS_READY || callee =3D=3D = &cord->sched); assert(! (callee->flags & FIBER_IS_DEAD)); cord->fiber =3D callee; - callee->csw++; callee->flags &=3D ~FIBER_IS_READY; ASAN_START_SWITCH_FIBER(asan_state, (caller->flags & FIBER_IS_DEAD) =3D=3D = 0, diff --git a/src/lua/fiber.c b/src/lua/fiber.c index c38bd886f..a30290bfa 100644 --- a/src/lua/fiber.c +++ b/src/lua/fiber.c @@ -324,13 +324,11 @@ static int lbox_fiber_top_entry(struct fiber *f, void *cb_ctx) { struct lua_State *L =3D (struct lua_State *) cb_ctx; + char name_buf[64]; + + snprintf(name_buf, sizeof(name_buf), "%u/%s", f->fid, f->name); + lua_pushstring(L, name_buf); =20 - /* Index system fibers by name instead of id */ - if (f->fid <=3D FIBER_ID_MAX_RESERVED) { - lua_pushstring(L, f->name); - } else { - lua_pushinteger(L, f->fid); - } lua_newtable(L); =20 lua_pushliteral(L, "average"); diff --git a/test/app/fiber.result b/test/app/fiber.result index e45cd7639..f9904d465 100644 --- a/test/app/fiber.result +++ b/test/app/fiber.result @@ -1478,7 +1478,7 @@ type(a) ... -- scheduler is present in fiber.top() -- and is indexed by name -a["sched"] ~=3D nil +a["1/sched"] ~=3D nil --- - true ... @@ -1518,7 +1518,11 @@ sum_avg > 99 and sum_avg < 101 or sum_avg tbl =3D nil --- ... -f =3D fiber.new(function() for i =3D 1,1000 do end fiber.yield() tbl =3D = fiber.top()[fiber.self().id()] end) +f =3D fiber.new(function()\ + for i =3D 1,1000 do end\ + fiber.yield()\ + tbl =3D fiber.top()[fiber.self().id()..'/'..fiber.self().name()]\ +end) --- ... while f:status() ~=3D 'dead' do fiber.sleep(0.01) end diff --git a/test/app/fiber.test.lua b/test/app/fiber.test.lua index e30aba77e..7f320e318 100644 --- a/test/app/fiber.test.lua +++ b/test/app/fiber.test.lua @@ -638,7 +638,7 @@ a =3D fiber.top() type(a) -- scheduler is present in fiber.top() -- and is indexed by name -a["sched"] ~=3D nil +a["1/sched"] ~=3D nil type(a["cpu misses"]) =3D=3D 'number' sum_inst =3D 0 sum_avg =3D 0 @@ -658,7 +658,11 @@ sum_inst -- not exact due to accumulated integer division errors sum_avg > 99 and sum_avg < 101 or sum_avg tbl =3D nil -f =3D fiber.new(function() for i =3D 1,1000 do end fiber.yield() tbl =3D = fiber.top()[fiber.self().id()] end) +f =3D fiber.new(function()\ + for i =3D 1,1000 do end\ + fiber.yield()\ + tbl =3D fiber.top()[fiber.self().id()..'/'..fiber.self().name()]\ +end) while f:status() ~=3D 'dead' do fiber.sleep(0.01) end tbl["average"] > 0 tbl["instant"] > 0