From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org> To: Serge Petrenko <sergepetrenko@tarantool.org> Cc: tarantool-patches@freelists.org, tarantool-patches@dev.tarantool.org Subject: Re: [Tarantool-patches] [tarantool-patches] [PATCH v2 2/2] lua: add fiber.top() listing fiber cpu consumption Date: Sat, 26 Oct 2019 20:00:32 +0200 [thread overview] Message-ID: <4d729ef4-6b21-fa22-b379-dd670ac6fcc5@tarantool.org> (raw) In-Reply-To: <D30BCE1C-5518-4294-BFD6-296469B5B24D@tarantool.org> Hi! Thanks for the fixes! >>> 115: >>> cpu average (%): 5.4571279061075 >>> cpu instant (%): 5.9653973440576 >>> 120: >>> cpu average (%): 21.944382148464 >>> cpu instant (%): 23.849021825646 >>> 116: >>> cpu average (%): 8.6603872318158 >>> cpu instant (%): 9.6812031335093 >>> 119: >>> cpu average (%): 21.933168871944 >>> cpu instant (%): 20.007540530351 >>> cpu misses: 0 >>> 118: >>> cpu average (%): 19.342901995963 >>> cpu instant (%): 16.932679820703 >>> 117: >>> cpu average (%): 11.549674814981 >>> cpu instant (%): 13.128901177161 >>> ... >>> ``` >> >> 2. This is super cool! >> >> Could we give names to predefined fibers? For >> example, make an alias top.scheduler = top[1]. >> >> So as in the console output I would see: >> >> --- >> - scheduler: >> cpu average (%): 98.230148883023 >> cpu instant (%): 100 >> >> Instead of '1'. Because personally I didn't >> even know that 1 is always the scheduler, and >> I would be confused to see a regular fiber >> consumed 99% of time in a console. > > Hmm, what about this? > > --- a/src/lua/fiber.c > +++ b/src/lua/fiber.c > @@ -325,13 +325,18 @@ lbox_fiber_top_entry(struct fiber *f, void *cb_ctx) > { > struct lua_State *L = (struct lua_State *) cb_ctx; > > - lua_pushinteger(L, f->fid); > + /* Index system fibers by name instead of id */ > + if (f->fid <= FIBER_ID_MAX_RESERVED) { > + lua_pushstring(L, f->name); > + } else { > + lua_pushinteger(L, f->fid); > + } > lua_newtable(L); > > What about always using a name, when it is not empty? When a system is running, it might be hard to find to which fiber an ID belongs. Or 'name:id' like in logs (or whatever format we use in the logs). For example I started a vshard recovery fiber, it somewhy consumes 99% CPU, but I don't know its ID. And I would need to use vshard.storage.internal.recovery_fiber:id() to find it. Up to you, and probably to Kirill since this is public API. >>> #include "memory.h" >>> #include "trigger.h" >>> @@ -82,6 +85,34 @@ static int (*fiber_invoke)(fiber_func f, va_list ap); >>> err;\ >>> }) >>> >>> +#if ENABLE_FIBER_TOP >>> +/** >>> + * An action performed each time a context switch happens. >>> + * Used to count each fiber's processing time. >>> + * This is a macro rather than a function, since it is used >>> + * in scheduler too. >> >> 4. Hm. Not sure if I understand. Ok, it is used in the scheduler, >> so why does it prevent making it a function? Caller is always >> struct fiber *, so you can make it a function taking fiber * >> parameter, can't you? > > You cannot call functions from scheduler cos it doesn’t have a > stack AFAIU. I can make it an inline function if you want me to. > Does not have a stack? Then how does it work at all? You use stack in your macros - you declare a couple of variables. Also intermediate arithmetic results might be saved on the stack. Talking of an 'inline' function - if that will work, then it would look better IMO. At least it won't raise questions why it is not a function. >> 5. Am I right, that we need to update CPU consumption each >> time a context switch happens? > > Yep. > >> If so, then by logic you >> would need to call clock_set_on_csw() on each fiber->csw++, >> correct? > > Right. I also call clock_set_on_csw() on each event loop iteration > end. > >> >> Currently that happens in 2 places: fiber_call_impl() and >> fiber_yield(). In both these places you already call >> clock_set_on_csw(): >> >> - You always call clock_set_on_csw() right before >> fiber_call_impl(); >> >> - You call clock_set_on_csw() in fiber_yield(). >> >> I would then move fiber->cws++ logic into your macros too. >> So all the context switch and CPU time tracking logic >> would be in one place. >> >> The only problem is that your macros takes caller, but csw++ >> is done for callee. Probably we could increment it for the >> caller instead, I don't anything depends on that. >> >> Also it would lead to increment of the scheduler's csw >> counter. Not sure whether it is ok. > > Loop iteration end isn’t a context switch, but I still update > clocks here. Yes, you update, but not via clock_set_on_csw(). Am I wrong? The only fiber which gets updated in the end of the loop via clock_set_on_csw() is the scheduler. For other fibers you manually update clock_acc, clock_delta_last, clock_delta in loop_on_iteration_end(). > I don’t want to increment csw here as a side > effect, so maybe leave csw increment as it is? > If you are talking about the scheduler's csw counter, then I don't think it matters. 'Csw' is just statistics. It is not used for anything. Besides, why a switch to the scheduler does not count? End of the loop = switch to the scheduler, so it means its csw should be incremented, shouldn't it? >>> ev_idle_init(&cord->idle_event, fiber_schedule_idle); >>> + >>> +#if ENABLE_FIBER_TOP >>> +/* fiber.top <http://fiber.top>() currently works only for the main thread. */ >>> +if (cord_is_main()) { >>> +fiber_top_init(); >>> +fiber_top_enable(); >> >> 7. I think, we need to enable top by default only >> when we are sure that it does not affect performance. > > Ok, disabled it by default. User can turn it on with > fiber.top_enable(). This also resolved the issue I had > with your swim test. > > Looking at perf difference, maybe we can turn it on by default? IMO, 10-16% looks quite significant taking into account, that 99% of users won't use it at all, and others only for debug of rare exceptional cases. For which they can call enable(), get top(), and disable() back. I think, we should ask Kirill. I am against default on.
next prev parent reply other threads:[~2019-10-26 17:55 UTC|newest] Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top [not found] <cover.1570546695.git.sergepetrenko@tarantool.org> [not found] ` <eb924da60bc66cbd93d7920f53d6c332d15ffab8.1570546695.git.sergepetrenko@tarantool.org> 2019-10-22 21:18 ` Vladislav Shpilevoy 2019-10-25 15:13 ` Serge Petrenko 2019-10-25 16:19 ` Serge Petrenko 2019-10-26 18:00 ` Vladislav Shpilevoy [this message] 2019-10-28 16:16 ` Serge Petrenko 2019-10-28 23:00 ` Vladislav Shpilevoy 2019-11-01 14:09 ` Serge Petrenko [not found] ` <52e7822bbcd802528d448c15ce9e9fbe4479c73a.1570546695.git.sergepetrenko@tarantool.org> 2019-10-22 21:20 ` [Tarantool-patches] [tarantool-patches] [PATCH v2 1/2] test: modify swim_run_test to break event loop Vladislav Shpilevoy 2019-10-25 15:15 ` Serge Petrenko
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=4d729ef4-6b21-fa22-b379-dd670ac6fcc5@tarantool.org \ --to=v.shpilevoy@tarantool.org \ --cc=sergepetrenko@tarantool.org \ --cc=tarantool-patches@dev.tarantool.org \ --cc=tarantool-patches@freelists.org \ --subject='Re: [Tarantool-patches] [tarantool-patches] [PATCH v2 2/2] lua: add fiber.top() listing fiber cpu consumption' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox