Re: [Tarantool-patches] [tarantool-patches] [PATCH v2 2/2] lua: add fiber.top() listing fiber cpu consumption

Tarantool development patches archive
 help / color / mirror / Atom feed

From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: Serge Petrenko <sergepetrenko@tarantool.org>
Cc: tarantool-patches@freelists.org, tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [tarantool-patches] [PATCH v2 2/2] lua: add fiber.top() listing fiber cpu consumption
Date: Sat, 26 Oct 2019 20:00:32 +0200	[thread overview]
Message-ID: <4d729ef4-6b21-fa22-b379-dd670ac6fcc5@tarantool.org> (raw)
In-Reply-To: <D30BCE1C-5518-4294-BFD6-296469B5B24D@tarantool.org>

Hi! Thanks for the fixes!

>>>      115:
>>>        cpu average (%): 5.4571279061075
>>>        cpu instant (%): 5.9653973440576
>>>      120:
>>>        cpu average (%): 21.944382148464
>>>        cpu instant (%): 23.849021825646
>>>      116:
>>>        cpu average (%): 8.6603872318158
>>>        cpu instant (%): 9.6812031335093
>>>      119:
>>>        cpu average (%): 21.933168871944
>>>        cpu instant (%): 20.007540530351
>>>      cpu misses: 0
>>>      118:
>>>        cpu average (%): 19.342901995963
>>>        cpu instant (%): 16.932679820703
>>>      117:
>>>        cpu average (%): 11.549674814981
>>>        cpu instant (%): 13.128901177161
>>>    ...
>>>    ```
>>
>> 2. This is super cool!
>>
>> Could we give names to predefined fibers? For
>> example, make an alias top.scheduler = top[1].
>>
>> So as in the console output I would see:
>>
>>    ---
>>    - scheduler:
>>        cpu average (%): 98.230148883023
>>        cpu instant (%): 100
>>
>> Instead of '1'. Because personally I didn't
>> even know that 1 is always the scheduler, and
>> I would be confused to see a regular fiber
>> consumed 99% of time in a console.
> 
> Hmm, what about this?
> 
> --- a/src/lua/fiber.c
> +++ b/src/lua/fiber.c
> @@ -325,13 +325,18 @@ lbox_fiber_top_entry(struct fiber *f, void *cb_ctx)
>  {
>         struct lua_State *L = (struct lua_State *) cb_ctx;
>  
> -       lua_pushinteger(L, f->fid);
> +       /* Index system fibers by name instead of id */
> +       if (f->fid <= FIBER_ID_MAX_RESERVED) {
> +               lua_pushstring(L, f->name);
> +       } else {
> +               lua_pushinteger(L, f->fid);
> +       }
>         lua_newtable(L);
> 
> 

What about always using a name, when it is not empty? When a
system is running, it might be hard to find to which fiber
an ID belongs. Or 'name:id' like in logs (or whatever format
we use in the logs).

For example I started a vshard recovery fiber, it somewhy
consumes 99% CPU, but I don't know its ID. And I would need
to use vshard.storage.internal.recovery_fiber:id() to find it.

Up to you, and probably to Kirill since this is public API.

>>> #include "memory.h"
>>> #include "trigger.h"
>>> @@ -82,6 +85,34 @@ static int (*fiber_invoke)(fiber_func f, va_list ap);
>>> err;\
>>> })
>>>
>>> +#if ENABLE_FIBER_TOP
>>> +/**
>>> + * An action performed each time a context switch happens.
>>> + * Used to count each fiber's processing time.
>>> + * This is a macro rather than a function, since it is used
>>> + * in scheduler too.
>>
>> 4. Hm. Not sure if I understand. Ok, it is used in the scheduler,
>> so why does it prevent making it a function? Caller is always
>> struct fiber *, so you can make it a function taking fiber *
>> parameter, can't you?
> 
> You cannot call functions from scheduler cos it doesn’t have a
> stack AFAIU. I can make it an inline function if you want me to.
> 

Does not have a stack? Then how does it work at all? You use
stack in your macros - you declare a couple of variables. Also
intermediate arithmetic results might be saved on the stack.

Talking of an 'inline' function - if that will work, then it
would look better IMO. At least it won't raise questions why
it is not a function.

>> 5. Am I right, that we need to update CPU consumption each
>> time a context switch happens? 
> 
> Yep.
> 
>> If so, then by logic you
>> would need to call clock_set_on_csw() on each fiber->csw++,
>> correct?
> 
> Right. I also call clock_set_on_csw() on each event loop iteration
> end.
> 
>>
>> Currently that happens in 2 places: fiber_call_impl() and
>> fiber_yield(). In both these places you already call
>> clock_set_on_csw():
>>
>>    - You always call clock_set_on_csw() right before
>>      fiber_call_impl();
>>
>>    - You call clock_set_on_csw() in fiber_yield().
>>
>> I would then move fiber->cws++ logic into your macros too.
>> So all the context switch and CPU time tracking logic
>> would be in one place.
>>
>> The only problem is that your macros takes caller, but csw++
>> is done for callee. Probably we could increment it for the
>> caller instead, I don't anything depends on that.
>>
>> Also it would lead to increment of the scheduler's csw
>> counter. Not sure whether it is ok.
> 
> Loop iteration end isn’t a context switch, but I still update
> clocks here.

Yes, you update, but not via clock_set_on_csw(). Am I wrong?
The only fiber which gets updated in the end of the loop via
clock_set_on_csw() is the scheduler. For other fibers you
manually update clock_acc, clock_delta_last, clock_delta in
loop_on_iteration_end().

> I don’t want to increment csw here as a side
> effect, so maybe leave csw increment as it is?
> 

If you are talking about the scheduler's csw counter, then
I don't think it matters. 'Csw' is just statistics. It is
not used for anything. Besides, why a switch to the scheduler
does not count? End of the loop = switch to the scheduler, so
it means its csw should be incremented, shouldn't it?

>>> ev_idle_init(&cord->idle_event, fiber_schedule_idle);
>>> +
>>> +#if ENABLE_FIBER_TOP
>>> +/* fiber.top <http://fiber.top>() currently works only for the main thread. */
>>> +if (cord_is_main()) {
>>> +fiber_top_init();
>>> +fiber_top_enable();
>>
>> 7. I think, we need to enable top by default only
>> when we are sure that it does not affect performance.
> 
> Ok, disabled it by default. User can turn it on with
> fiber.top_enable(). This also resolved the issue I had
> with your swim test.
> 
> Looking at perf difference, maybe we can turn it on by default?

IMO, 10-16% looks quite significant taking into account,
that 99% of users won't use it at all, and others only for
debug of rare exceptional cases. For which they can call
enable(), get top(), and disable() back. I think, we should
ask Kirill. I am against default on.

next prev parent reply	other threads:[~2019-10-26 17:55 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <cover.1570546695.git.sergepetrenko@tarantool.org>
     [not found] ` <eb924da60bc66cbd93d7920f53d6c332d15ffab8.1570546695.git.sergepetrenko@tarantool.org>
2019-10-22 21:18   ` Vladislav Shpilevoy
2019-10-25 15:13     ` Serge Petrenko
2019-10-25 16:19       ` Serge Petrenko
2019-10-26 18:00       ` Vladislav Shpilevoy [this message]
2019-10-28 16:16         ` Serge Petrenko
2019-10-28 23:00           ` Vladislav Shpilevoy
2019-11-01 14:09             ` Serge Petrenko
     [not found] ` <52e7822bbcd802528d448c15ce9e9fbe4479c73a.1570546695.git.sergepetrenko@tarantool.org>
2019-10-22 21:20   ` [Tarantool-patches] [tarantool-patches] [PATCH v2 1/2] test: modify swim_run_test to break event loop Vladislav Shpilevoy
2019-10-25 15:15     ` Serge Petrenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4d729ef4-6b21-fa22-b379-dd670ac6fcc5@tarantool.org \
    --to=v.shpilevoy@tarantool.org \
    --cc=sergepetrenko@tarantool.org \
    --cc=tarantool-patches@dev.tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --subject='Re: [Tarantool-patches] [tarantool-patches] [PATCH v2 2/2] lua: add fiber.top() listing fiber cpu consumption' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox