From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp33.i.mail.ru (smtp33.i.mail.ru [94.100.177.93]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id CCD6A469719 for ; Thu, 8 Oct 2020 22:30:19 +0300 (MSK) Date: Thu, 8 Oct 2020 22:29:54 +0300 From: Sergey Kaplun Message-ID: <20201008192954.GA8864@root> References: <20201005063029.31737-1-skaplun@tarantool.org> <20201008172512.GU18920@tarantool.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20201008172512.GU18920@tarantool.org> Subject: Re: [Tarantool-patches] [RFC v4] rfc: luajit metrics List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Igor Munkin Cc: tarantool-patches@dev.tarantool.org Hi, Igor! Thanks for the review! On 08.10.20, Igor Munkin wrote: > Sergey, > > Thanks for the updates! Considering the changes in the follow-up reply, > there are still several comments below. > > On 05.10.20, Sergey Kaplun wrote: > > Part of #5187 > > --- > > > > This patch adds RFC to LuaJIT metrics interfaces. Nevertheless name > > `misc` for builtin library is not good and should be discussed, because > > tons of user modules can use that name for their own libraries. > > > > Branch: https://github.com/tarantool/tarantool/tree/skaplun/5187-luajit-metrics > > Issue: https://github.com/tarantool/tarantool/issues/5187 > > > > Changes in v2: > > - Fixed typos > > - Made comments more verbose > > - Avoided flushing any of metrics after each call of luaM_metrics() > > Changes in v3: > > - Added colors count metrics description > > - Added description about how metrics are collected > > - Added benchmarks > > Changes in v3: > > - Removed colors count metrics > > - Added code block language > > - Added how to use section > > Minor: ChangeLog is misordered (the latest changes are the first entry). Thanks! Forgot to add ChangeLog to patch: @ChangeLog: * Add Lua and C API for LuaJIT platform metrics about: - overall amount of allocated tables, cdata and udata objects - number of incremental GC steps grouped by GC state - number of string hashes hits and misses - amount of allocated and freed memory - number of trace aborts, number of traces and restored snapshots > > > > > doc/rfc/5187-luajit-metrics.md | 299 +++++++++++++++++++++++++++++++++ > > 1 file changed, 299 insertions(+) > > create mode 100644 doc/rfc/5187-luajit-metrics.md > > > > diff --git a/doc/rfc/5187-luajit-metrics.md b/doc/rfc/5187-luajit-metrics.md > > new file mode 100644 > > index 000000000..02f5b559f > > --- /dev/null > > +++ b/doc/rfc/5187-luajit-metrics.md > > @@ -0,0 +1,299 @@ > > > > > +Couple of words about how metrics are collected: > > +- `strhash_*` -- whenever existing string is returned after attempt to > > + create new string there is incremented `strhash_hit` counter, if new string > > + created then `strhash_miss` is incremented instead. > > Minor: I propose to reword it the way similar you've updated it in the > luam_Metrics comments. Yep, sure! > > > +- `gc_*num`, `jit_trace_num` -- corresponding counter incremented whenever new > > Typo: s/whenever new/whenever a new/. Hawkeye! > > > + object is allocated. When object become garbage collected its counter is > > Typo: s/become garbage collected/is collected by GC/. Thanks! Fixed. > > > + decremented. > > +- `gc_total`, `gc_allocated`, `gc_freed` -- any time when allocation function > > + is called `gc_allocated` and/or `gc_freed` is increased and `gc_total` > > + increase when memory is allocated or reallocated, decrease when memory is > > Typo: s/increase/is increased/. > Typo: s/decrease/is decreased/. Thanks! Fixed. > > > + freed. > > +- `gc_steps_*` -- corresponding counter increments whenever Garbage Collector > > Typo: s/increments/is incremented/. Thanks! > > > + starts to execute 1 step of garbage collection. > > Minor: I propose s/1/an incremental/. No problem! > > > +- `jit_snap_restore` -- whenever JIT machine exits from the trace and restores > > + interpreter state `jit_snap_restore` counter is incremented. > > +- `jit_trace_abort` -- whenever JIT compiler can't record the trace in case NYI > > + BC this counter is incremented. > > Minor: NYI relates also to builtins, not only to bytecodes. Ok. Fixed. > > > +- `jit_mcode_size` -- whenever new MCode area is allocated `jit_mcode_size` is > > + increased at corresponding size in bytes. Sets to 0 when all mcode area is > > + freed. > > How does it change, when a trace is collected as a result of its flush? It doesn't. IINM, this area will be reused later for other traces. MCode area is linked with jit_State not with trace by itself. Trace just reserve MCode area that needed. > > > + > > +All metrics are collected throughout the platform uptime. These metrics > > +increase monotonically and can overflow: > > + - `strhash_hit` > > + - `strhash_miss` > > + - `gc_freed` > > + - `gc_allocated`, > > Typo: Excess comma. Hawkeye! > > > + - `gc_steps_pause` > > + - `gc_steps_propagate` > > + - `gc_steps_atomic` > > + - `gc_steps_sweepstring` > > + - `gc_steps_sweep` > > + - `gc_steps_finalize` > > + - `jit_snap_restore` > > + - `jit_trace_abort` > > > > > +## How to use > > + > > +This section describes small example of metrics usage. > > + > > +For example amount of `strhash_misses` can be shown for tracking of new string > > +objects allocations. For example if we add code like: > > +```lua > > +local function sharded_storage_func(storage_name, func_name) > > + return 'sharded_storage.storages.' .. storage_name .. '.' .. func_name > > +end > > +``` > > +increase in slope curve of `strhash_misses` means, that after your changes > > +there are more new strings allocating at runtime. Of course slope curve of > > +`strhash_misses` _should_ be less than slope curve of `strhash_hits`. If it > > +is not, you should refactor your code. > > Minor: I guess it's not a good idea to suggest 'code refactoring'. This > section is good to describe the values being observed, so the first part > about tilt angle is enough. OK, dropped it! > > > + > > +Slope curves of `gc_freed` and `gc_allocated` can be used for analysis of GC > > +pressure of your application (less is better). > > + > > +Also we can check some hacky optimization with these metrics. For example let's > > +assume that we have this code snippet: > > +```lua > > +local old_metrics = misc.getmetrics() > > +local t = {} > > +for i = 1, 513 do > > + t[i] = i > > +end > > +local new_metrics = misc.getmetrics() > > +local diff = new_metrics.gc_allocated - old_metrics.gc_allocated Found double space prior new_metrics.gc_allocated. Fixed. > > +``` > > +`diff` equals to 18879 after running of this chunk. > > + > > +But if we change table initialization to > > +```lua > > +local table_new = require "table.new" > > +local old_metrics = misc.getmetrics() > > +local t = table_new(513,0) > > +for i = 1, 513 do > > + t[i] = i > > +end > > +local new_metrics = misc.getmetrics() > > +local diff = new_metrics.gc_allocated - old_metrics.gc_allocated Here too. > > +``` > > +`diff` shows us only 5895. > > + > > +Slope curves of `gc_steps_*` can be used for tracking of GC pressure too. For > > +long time observations you will see periodic increment for `gc_steps_*` metrics > > +-- for example longer period of `gc_steps_atomic` increment is better. Also > > +additional amount of `gc_steps_propagate` in one period can be used to > > +indirectly estimate amount of objects. > > These values also correlate with the value. The amount of > incremental steps can grow, but one step can process a small amount of > GCobjects. So these metrics should be considered together with GC setup. Thanks! Add this note to RFC! > > > + > > +Amount of `gc_*num` is useful for control of memory leaks -- totally amount of > > Typo: s/totally/total/. Fixed! Thanks! > > > +these objects should not growth nonstop (you can also track `gc_total` for > > +this). Also `jit_mcode_size` can be used for tracking amount of allocated > > Typo: double space prior to "Also". Yep. Fixed! Side note: Too unpredictable autoformatting by `gq` :\ > > > +memory for traces machine code. > > + > > +Slope curves of `jit_trace_abort` shows how many times trace hasn't been > > +compiled when the attempt was made (less is better). > > + > > +Amount of `gc_trace_num` is shown how much traces was generated (_usually_ > > +more is better). > > + > > +And the last one -- `gc_snap_restores` can be used for estimation when LuaJIT > > +is stop trace execution. If slope curves of this metric growth after changing > > +old code it can mean performance degradation. > > + > > +Assumes that we have code like this: > > +```lua > > +local function foo(i) > > + return i <= 5 and i or tostring(i) > > +end > > +-- minstitch option needs to emulate nonstitching behaviour > > +jit.opt.start(0, "hotloop=2", "hotexit=2", "minstitch=15") > > + > > +local sum = 0 > > +local old_metrics = misc.getmetrics() > > +for i = 1, 10 do > > + sum = sum + foo(i) > > +end > > +local new_metrics = misc.getmetrics() > > +local diff = new_metrics.jit_snap_restore - old_metrics.jit_snap_restore > > +``` > > +`diff` equals 3 (1 side exit on loop end, 2 side exits to the interpreter > > +before trace gets hot and compiled) after this chunk of code. > > + > > +And now we decide to change `foo` function like this: > > +```lua > > +local function foo(i) > > + -- math.fmod is not yet compiled! > > + return i <= 5 and i or math.fmod(i, 11) > > +end > > +``` > > +`diff` equals 6 (1 side exit on loop end, 2 side exits to the interpreter > > +before trace gets hot and compiled an 3 side exits from the root trace could > > +not get compiled) after the same chunk of code. > > + > > +## Benchmarks > > + > > +Benchmarks was taken from repo: > > Typo: s/was/were/. Thanks, fixed! > > > +[LuaJIT-test-cleanup](https://github.com/LuaJIT/LuaJIT-test-cleanup). > > + > > +Example of usage: > > +```bash > > +/usr/bin/time -f"array3d %U" ./luajit $BENCH_DIR/array3d.lua 300 >/dev/null > > Typo: double space prior to "300". Thanks, fixed! > > > +``` > > > > > -- > > 2.28.0 > > > > -- > Best regards, > IM See iterative patch in the bottom. Branch force-pushed. =================================================================== diff --git a/doc/rfc/5187-luajit-metrics.md b/doc/rfc/5187-luajit-metrics.md index 988b049fb..0c1df6901 100644 --- a/doc/rfc/5187-luajit-metrics.md +++ b/doc/rfc/5187-luajit-metrics.md @@ -89,22 +89,23 @@ struct luam_Metrics { ``` Couple of words about how metrics are collected: -- `strhash_*` -- whenever existing string is returned after attempt to - create new string there is incremented `strhash_hit` counter, if new string - created then `strhash_miss` is incremented instead. -- `gc_*num`, `jit_trace_num` -- corresponding counter incremented whenever new - object is allocated. When object become garbage collected its counter is +- `strhash_*` -- whenever the string with the same payload is found, so a new + one is not created/allocated, there is incremented `strhash_hit` counter, if + a new one string created/allocated then `strhash_miss` is incremented + instead. +- `gc_*num`, `jit_trace_num` -- corresponding counter is incremented whenever a + new object is allocated. When object is collected by GC its counter is decremented. - `gc_total`, `gc_allocated`, `gc_freed` -- any time when allocation function - is called `gc_allocated` and/or `gc_freed` is increased and `gc_total` - increase when memory is allocated or reallocated, decrease when memory is - freed. -- `gc_steps_*` -- corresponding counter increments whenever Garbage Collector - starts to execute 1 step of garbage collection. + is called `gc_allocated` and/or `gc_freed` is increased and `gc_total` is + increased when memory is allocated or reallocated, is decreased when memory + is freed. +- `gc_steps_*` -- corresponding counter is incremented whenever Garbage + Collector starts to execute an incremental step of garbage collection. - `jit_snap_restore` -- whenever JIT machine exits from the trace and restores interpreter state `jit_snap_restore` counter is incremented. - `jit_trace_abort` -- whenever JIT compiler can't record the trace in case NYI - BC this counter is incremented. + BC or builtins this counter is incremented. - `jit_mcode_size` -- whenever new MCode area is allocated `jit_mcode_size` is increased at corresponding size in bytes. Sets to 0 when all mcode area is freed. @@ -114,7 +115,7 @@ increase monotonically and can overflow: - `strhash_hit` - `strhash_miss` - `gc_freed` - - `gc_allocated`, + - `gc_allocated` - `gc_steps_pause` - `gc_steps_propagate` - `gc_steps_atomic` @@ -173,8 +174,7 @@ end ``` increase in slope curve of `strhash_misses` means, that after your changes there are more new strings allocating at runtime. Of course slope curve of -`strhash_misses` _should_ be less than slope curve of `strhash_hits`. If it -is not, you should refactor your code. +`strhash_misses` _should_ be less than slope curve of `strhash_hits`. Slope curves of `gc_freed` and `gc_allocated` can be used for analysis of GC pressure of your application (less is better). @@ -188,7 +188,7 @@ for i = 1, 513 do t[i] = i end local new_metrics = misc.getmetrics() -local diff = new_metrics.gc_allocated - old_metrics.gc_allocated +local diff = new_metrics.gc_allocated - old_metrics.gc_allocated ``` `diff` equals to 18879 after running of this chunk. @@ -201,7 +201,7 @@ for i = 1, 513 do t[i] = i end local new_metrics = misc.getmetrics() -local diff = new_metrics.gc_allocated - old_metrics.gc_allocated +local diff = new_metrics.gc_allocated - old_metrics.gc_allocated ``` `diff` shows us only 5895. @@ -209,11 +209,14 @@ Slope curves of `gc_steps_*` can be used for tracking of GC pressure too. For long time observations you will see periodic increment for `gc_steps_*` metrics -- for example longer period of `gc_steps_atomic` increment is better. Also additional amount of `gc_steps_propagate` in one period can be used to -indirectly estimate amount of objects. +indirectly estimate amount of objects. These values also correlate with the +step multiplier of the GC. The amount of incremental steps can grow, but +one step can process a small amount of objects. So these metrics should be +considered together with GC setup. -Amount of `gc_*num` is useful for control of memory leaks -- totally amount of +Amount of `gc_*num` is useful for control of memory leaks -- total amount of these objects should not growth nonstop (you can also track `gc_total` for -this). Also `jit_mcode_size` can be used for tracking amount of allocated +this). Also `jit_mcode_size` can be used for tracking amount of allocated memory for traces machine code. Slope curves of `jit_trace_abort` shows how many times trace hasn't been @@ -258,12 +261,12 @@ not get compiled) after the same chunk of code. ## Benchmarks -Benchmarks was taken from repo: +Benchmarks were taken from repo: [LuaJIT-test-cleanup](https://github.com/LuaJIT/LuaJIT-test-cleanup). Example of usage: ```bash -/usr/bin/time -f"array3d %U" ./luajit $BENCH_DIR/array3d.lua 300 >/dev/null +/usr/bin/time -f"array3d %U" ./luajit $BENCH_DIR/array3d.lua 300 >/dev/null ``` Taking into account the measurement error ~ 2%, it can be said that there is no difference in the performance. =================================================================== -- Best regards, Sergey Kaplun