[Tarantool-patches] [tarantool-patches] [PATCH v3] lua: add fiber.top() listing fiber cpu consumption

Kirill Yukhin kyukhin at tarantool.org
Sat Nov 9 09:53:40 MSK 2019


Hello,

On 01 Nov 17:05, Serge Petrenko wrote:
> Implement a new function in Lua fiber library: top(). It returns a table
> of alive fibers (including the scheduler). Each table entry has two
> fields: average cpu consumption, which is calculated with exponential
> moving average over event loop iterations, and current cpu consumption,
> which shows fiber's cpu usage over the last event loop iteration.
> The patch relies on CPU timestamp counter to measure each fiber's time
> share.
> 
> Closes #2694
> 
> @TarantoolBot document
> Title: fiber: new function `fiber.top()`
> 
> `fiber.top()` returns a table of all alive fibers and lists their cpu
> consumption. Let's take a look at the example:
> ```
> tarantool> fiber.top()
> ---
> - 104/lua:
>     instant: 18.433514726042
>     time: 0.677505865
>     average: 21.98826143184
>   103/lua:
>     instant: 19.131392015951
>     time: 0.689521917
>     average: 20.807772656431
>   107/lua:
>     instant: 18.624600174469
>     time: 0.681585168
>     average: 17.78194117452
>   101/on_shutdown:
>     instant: 0
>     time: 0
>     average: 0
>   105/lua:
>     instant: 18.562289702156
>     time: 0.682085309
>     average: 15.513811055476
>   106/lua:
>     instant: 18.441822789017
>     time: 0.677320271
>     average: 15.427595583115
>   102/interactive:
>     instant: 0
>     time: 0.000367182
>     average: 0
>   cpu misses: 0
>   1/sched:
>     instant: 6.8063805923649
>     time: 0.253035056
>     average: 8.3479789103691
> ...
> 
> ```
> In the table above, keys are strings containing fiber ids and names.
> The only exception is the single 'cpu misses' key, which indicates the
> number of times the tx thread was rescheduled on a different cpu core
> (more on that later).
> The three metrics available for each fiber are:
> 1) instant (per cent), which indicates the share of time the fiber was
> executing during the previous event loop iteration.
> 2) average (per cent), which is calculated as an exponential moving
> average of `instant` values over all previous event loop iterations.
> 3) time (seconds), which estimates how much cpu time the fiber spent
> processing during its lifetime.
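
To illustrate how the `average` metric relates to `instant`, here is a minimal sketch of an exponential moving average updated once per event loop iteration. The smoothing factor `ALPHA` and the function name `ema_update` are assumptions made for this example; the patch description does not state which factor the implementation uses.

```lua
-- Hypothetical smoothing factor: the weight given to the newest sample.
-- The value actually used by the patch is not stated in this message.
local ALPHA = 0.25

-- Fold one new `instant` sample into the running average.
local function ema_update(average, instant)
    return ALPHA * instant + (1 - ALPHA) * average
end

-- Feed the same 20 per cent sample for four iterations, starting from zero.
-- The average climbs toward 20 without reaching it immediately.
local average = 0
for _, instant in ipairs({20, 20, 20, 20}) do
    average = ema_update(average, instant)
end
print(average)
```

With any factor between 0 and 1 the average lags behind `instant`, which is why the two columns in the example output differ for fibers with a steady load.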
> 
> More info on the `cpu misses` field returned by `fiber.top()`:
> `cpu misses` indicates the number of times the tx thread detected it was
> rescheduled on a different cpu core during the last event loop
> iteration.
> fiber.top() uses cpu timestamp counter to measure each fiber's execution
> time. However, each cpu core may have its own counter value (you can
> only rely on counter deltas if both measurements were taken on the same
> core, otherwise the delta may even get negative).
> When the tx thread is rescheduled to a different cpu core, tarantool just
> assumes the cpu delta was zero for the latest measurement. This lowers
> the precision of our computations, so the larger the `cpu misses` value,
> the lower the precision of fiber.top() results.
> 
> fiber.top() doesn't work on the ARM architecture at the moment.
> 
> Please note that enabling fiber.top() slows down fiber switching by
> about 15 per cent, so it is disabled by default.
> To enable it, call `fiber.top_enable()`.
> You can disable it again once you have finished debugging by calling
> `fiber.top_disable()`.
> A "time" entry is also added to each fiber's output in fiber.info()
> (it duplicates the "time" entry from fiber.top()).
> Note that "time" is only counted while fiber.top() is enabled.
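
As a usage illustration, the table returned by `fiber.top()` can be post-processed to list the busiest fibers. The helper below is a sketch that assumes the flat layout shown in the example output of this message (fiber entries are tables, `cpu misses` is a plain number); `busiest_fibers` is a hypothetical name, and the sample data is copied from that example so the snippet runs without Tarantool.

```lua
-- Return up to `limit` fibers sorted by descending `average` cpu share.
-- Non-table entries such as 'cpu misses' are skipped.
local function busiest_fibers(top, limit)
    local rows = {}
    for name, stats in pairs(top) do
        if type(stats) == 'table' then
            rows[#rows + 1] = { name = name, average = stats.average, time = stats.time }
        end
    end
    table.sort(rows, function(a, b) return a.average > b.average end)
    limit = limit or #rows
    for i = #rows, limit + 1, -1 do
        rows[i] = nil
    end
    return rows
end

-- Sample data taken from the example output in the commit message.
local sample = {
    ['104/lua'] = { instant = 18.433514726042, time = 0.677505865, average = 21.98826143184 },
    ['103/lua'] = { instant = 19.131392015951, time = 0.689521917, average = 20.807772656431 },
    ['1/sched'] = { instant = 6.8063805923649, time = 0.253035056, average = 8.3479789103691 },
    ['cpu misses'] = 0,
}

for _, row in ipairs(busiest_fibers(sample, 2)) do
    print(row.name, row.average)
end
```

In a Tarantool console one would call `fiber.top_enable()` first, pass the result of `fiber.top()` instead of `sample`, and call `fiber.top_disable()` afterwards.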
> ---
> https://github.com/tarantool/tarantool/issues/2694
> https://github.com/tarantool/tarantool/tree/sp/gh-2694-fiber-top

I've checked your patch into master.

--
Regards, Kirill Yukhin

