[Tarantool-patches] [PATCH v2] rfc: luajit metrics

Igor Munkin imun at tarantool.org
Thu Aug 27 22:18:37 MSK 2020


Sergey,

Thanks, this RFC is almost great! Please consider my comments below.

On 26.07.20, Sergey Kaplun wrote:
> Part of #5187
> ---
> 
> This patch adds RFC to LuaJIT metrics interfaces. Nevertheless name
> `misc` for builtin library is not good and should be discussed, because
> tons of user modules can use that name for their own libraies.
> 
> Branch: https://github.com/tarantool/tarantool/tree/skaplun/5187-luajit-metrics
> Issue: https://github.com/tarantool/tarantool/issues/5187
> 
> Changes in v2:
> - Fixed typos
> - Made comments more verbose
> - Avoided flushing any of metrics after each call of luaM_metrics()
> 
>  doc/rfc/5187-luajit-metrics.md | 126 +++++++++++++++++++++++++++++++++
>  1 file changed, 126 insertions(+)
>  create mode 100644 doc/rfc/5187-luajit-metrics.md
> 
> diff --git a/doc/rfc/5187-luajit-metrics.md b/doc/rfc/5187-luajit-metrics.md
> new file mode 100644
> index 000000000..2bd64cff4
> --- /dev/null
> +++ b/doc/rfc/5187-luajit-metrics.md
> @@ -0,0 +1,126 @@
> +# LuaJIT metrics
> +
> +* **Status**: In progress
> +* **Start date**: 17-07-2020
> +* **Authors**: Sergey Kaplun @Buristan skaplun at tarantool.org,
> +               Igor Munkin @igormunkin imun at tarantool.org,
> +               Sergey Ostanevich @sergos sergos at tarantool.org
> +* **Issues**: [#5187](https://github.com/tarantool/tarantool/issues/5187)
> +
> +## Summary
> +
> +LuaJIT metrics provide extra information about the Lua state. They consists of

Typo: s/consists/consist/.

> +GC metrics (overall amount of objects and memory usage), JIT stats (both
> +related to the compiled traces and the engine itself), string hash hits/misses.
> +
> +## Background and motivation
> +
> +One can be curious about their application performance. We are going to provide
> +various metrics about the several platform subsystems behaviour. GC pressure
> +produced by user code can weight down all application performance. Irrelevant
> +traces compiled by the JIT engine can just burn CPU time with no benefits as a
> +result. String hash collisions can lead to DoS caused by a single request. All
> +these metrics should be well monitored by users wanting to improve the
> +performance of their application.
> +
> +## Detailed design
> +
> +For C API we introduce additional extension header <lmisclib.h> that provides
> +interfaces for new LuaJIT C API extensions. The first interface in this header
> +will be the following:

I propose the following rewording:
| The additional header <lmisclib.h> is introduced to extend the existing
| LuaJIT C API with new interfaces. The first function provided via this
| header is the following:

> +
> +```
> +/* API for obtaining various metrics from the platform. */

Typo: s/metrics from the platform/platform metrics/.

> +
> +LUAM_API struct luam_Metrics *luaM_metrics(lua_State *L,
> +					   struct luam_Metrics *dst);

Please, address the comments I left regarding the function signature
here[1].

> +```
> +
> +This function fills the structure pointed by `dst` with the corresponding
> +metrics related to Lua state anchored to the given coroutine `L`. The result of
> +the function is a pointer to the filled structure (the same `dst` points to).
> +
> +The `struct luam_Metrics` has the following definition:
> +
> +```
> +struct luam_Metrics {

Please, address the comments I left regarding the structure definition
here[1].

> +	/*
> +	 * Strings amount founded in string hash
> +	 * instead of allocation of new one.
> +	 */
> +	size_t strhash_hit;
> +	/* Strings amount allocated and put into string hash. */
> +	size_t strhash_miss;
> +
> +	size_t strnum;   /* Amount of allocated string objects. */
> +	size_t tabnum;   /* Amount of allocated table objects. */
> +	size_t udatanum; /* Amount of allocated udata objects. */
> +	size_t cdatanum; /* Amount of allocated cdata objects. */
> +
> +	/* Memory currently allocated. */
> +	size_t gc_total;
> +	/* Total amount of freed memory. */
> +	size_t gc_freed;
> +	/* Total amount of allocated memory. */
> +	size_t gc_allocated;
> +
> +	/* Count of incremental GC steps per state. */
> +	size_t gc_steps_pause;
> +	size_t gc_steps_propagate;
> +	size_t gc_steps_atomic;
> +	size_t gc_steps_sweepstring;
> +	size_t gc_steps_sweep;
> +	size_t gc_steps_finalize;
> +
> +	/*
> +	 * Overall number of snap restores (and number of stopped
> +	 * trace executions) for given jit_State.
> +	 */
> +	size_t jit_snap_restore;
> +	/* Overall number of abort traces for given jit_State. */
> +	size_t jit_trace_abort;
> +	/* Total size of all allocated machine code areas. */
> +	size_t jit_mcode_size;
> +	/* Amount of JIT traces. */
> +	unsigned int jit_trace_num;
> +};
> +```
> +
> +All metrics are collected throughout the platform uptime. But some of them
> +(namely `strhash_hit`, `strhash_miss`, `gc_freed`, `gc_allocated`,
> +`gc_steps_pause`, `gc_steps_propagate`, `gc_steps_atomic`,
> +`gc_steps_sweepstring`, `gc_steps_sweep`, `gc_steps_finalize`,
> +`jit_snap_restore` and `jit_trace_abort`) increase monotonic and can overflow.

Ouch, let's list these metrics in a bullet list for better readability.

Typo: s/monotonic/monotonically/.

> +They make sense only with comparing with their value from a previous
> +`luaM_metrics()` call.
> +
> +There is also a complement introduced for Lua space -- `misc.getmetrics()`.
> +This function is just a wrapper for `luaM_metrics()` returning a Lua table with
> +the similar metrics. Its usage is quite simple:
> +```
> +$ ./src/tarantool
> +Tarantool 2.5.0-267-gbf047ad44
> +type 'help' for interactive help
> +tarantool> misc.getmetrics()
> +---
> +- tabnum: 1812
> +  gc_total: 1369927
> +  strnum: 5767
> +  jit_trace_num: 0
> +  cdatanum: 89
> +  jit_mcode_size: 0
> +  udatanum: 17
> +  jit_snap_restore: 0
> +  gc_freed: 2239391
> +  strhash_hit: 53759
> +  gc_steps_finalize: 0
> +  gc_allocated: 3609318
> +  gc_steps_atomic: 6
> +  gc_steps_sweep: 296
> +  gc_steps_sweepstring: 17920
> +  jit_trace_abort: 0
> +  strhash_miss: 6874
> +  gc_steps_propagate: 10106
> +  gc_steps_pause: 7
> +...
> +```
> -- 
> 2.24.1
> 

Otherwise, LGTM.


[1]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-August/019208.html

-- 
Best regards,
IM


More information about the Tarantool-patches mailing list