From: Sergey Kaplun <skaplun@tarantool.org> To: Igor Munkin <imun@tarantool.org> Cc: tarantool-patches@dev.tarantool.org Subject: Re: [Tarantool-patches] [PATCH 1/2] metrics: add counters for metrics interested in Date: Fri, 24 Jul 2020 15:00:00 +0300 [thread overview] Message-ID: <20200724120000.GB4086@root> (raw) In-Reply-To: <20200723201219.GQ18920@tarantool.org> Hi Igor! Thanks for the review! On 23.07.20, Igor Munkin wrote: > Sergey, > > Thanks for the patch! Please consider my comments below. > > On 21.07.20, Sergey Kaplun wrote: > > We still don't have strict prefixes, so I propose to use 'core' for this > patch. Does it look fine to you? Feel free to propose your own one. Maybe we should use 'misc' for this kind of patches? > > This patch adds counters for: > > - amount of table, cdata and udata objects, > > - GC invokations by GC states, > > - strhash misses or hits, > > - amount of freed or allocated memory, > > - amount of aborted traces and restored snapshots. > > > > C and Lua API will be added in the next patch. > > OK, let's make the commit message a bit clearer: > | core: introduce various platform metrics > | > | This patch introduces the following counters: > | - overall amount of allocated tables, cdata and udata objects > | - number of incremental GC steps grouped by GC state > | - number of string hashes hits and misses > | - amount of allocated and freed memory > | - number of trace aborts and restored snapshots > | > | Interfaces to obtain these metrics via both Lua and C API are > | introduced in the next patch. > > Feel free to change it on your own. Thanks! It looks fine for me. > > > > Part of tarantool/tarantool#5187 > > --- > > src/lj_cdata.c | 2 ++ > > src/lj_cdata.h | 2 ++ > > src/lj_gc.c | 4 ++++ > > src/lj_gc.h | 6 +----- > > src/lj_jit.h | 7 +++++++ > > src/lj_obj.h | 25 +++++++++++++++++++++++++ > > src/lj_snap.c | 1 + > > src/lj_state.c | 13 ++++++++++++- > > src/lj_str.c | 5 +++++ > > src/lj_tab.c | 2 ++ > > src/lj_trace.c | 5 ++++- > > src/lj_udata.c | 2 ++ > > 12 files changed, 67 insertions(+), 7 deletions(-) > > > > <snipped> > > > diff --git a/src/lj_jit.h b/src/lj_jit.h > > index 7eb3d2a..960a02b 100644 > > --- a/src/lj_jit.h > > +++ b/src/lj_jit.h > > @@ -475,6 +475,13 @@ typedef struct jit_State { > > size_t szmcarea; /* Size of current mcode area. */ > > size_t szallmcarea; /* Total size of all allocated mcode areas. */ > > > > + size_t nsnaprestore; /* Overall number of snap restores for all traces > > + ** "belonging" to the given jit_State > > + ** since the last call to luaM_metrics(). */ > > + size_t ntraceabort; /* Overall number of abort traces > > + ** "belonging" to the given jit_State > > + ** since the last call to luaM_metrics(). */ > > Code indentation is really strange here, like you mixed LuaJIT and uJIT > styles. Besides, I guess the remark regarding the field reset is wrong: > * These fields are not reset in this patch. > * <luaM_metrics> is introduced only in the following patch but you've > already mentioned it here. My bad! Thanks for this catch. I will fix indentation in the next patch. Naming-wise, as you can see at [1] we shouldn't use metrics since last call. So comment should gone. > > + > > TValue errinfo; /* Additional info element for trace errors. */ > > > > #if LJ_HASPROFILE > > diff --git a/src/lj_obj.h b/src/lj_obj.h > > index f368578..1fee04f 100644 > > --- a/src/lj_obj.h > > +++ b/src/lj_obj.h > > @@ -571,13 +571,28 @@ typedef enum { > > #define basemt_obj(g, o) ((g)->gcroot[GCROOT_BASEMT+itypemap(o)]) > > #define mmname_str(g, mm) (strref((g)->gcroot[GCROOT_MMNAME+(mm)])) > > > > +/* Garbage collector states. Order matters. */ > > +enum { > > + GCSpause, > > + GCSpropagate, > > + GCSatomic, > > + GCSsweepstring, > > + GCSsweep, > > + GCSfinalize, > > + GCSlast > > Minor: GCSMAX looks better, doesn't it? Feel free to ignore. Really, GCSlast is unreachable. GCSmax for this kind of naming is good enough. GCSunreachable also looks nice. What do you think? > > +}; > > + > > typedef struct GCState { > > GCSize total; /* Memory currently allocated. */ > > + size_t freed; /* Memory freed since last luaM_metrics() call. */ > > + size_t allocated; /* Memory allocated since last luaM_metrics() call. */ > > Well, why <total> field type differs from <freed> and <allocated> fields > type? And again, remark about <luaM_metrics> can be dropped (see above). GCSize for total memfory usage is defines by LJ_GC64 macro definition. | #if LJ_GC64 | typedef uint64_t GCSize; | #else | typedef uint32_t GCSize; | #endif But summarized allocation size can be many times over maximum memory size). So these field types should be independet. > > GCSize threshold; /* Memory threshold. */ > > uint8_t currentwhite; /* Current white color. */ > > uint8_t state; /* GC state. */ > > uint8_t nocdatafin; /* No cdata finalizer called. */ > > uint8_t unused2; > > + size_t state_count[GCSlast]; /* Count of GC invocations with different states > > + ** since previous call of luaM_metrics(). */ > > Again, a mess with whitespace. Furthermore, I propose to reword this > comment the following way: > | /* Count of incremental GC steps per state */ Ok, no problems! > > MSize sweepstr; /* Sweep position in string table. */ > > GCRef root; /* List of all collectable objects. */ > > MRef sweep; /* Sweep position in root list. */ > > @@ -589,6 +604,12 @@ typedef struct GCState { > > GCSize estimate; /* Estimate of memory actually in use. */ > > MSize stepmul; /* Incremental GC step granularity. */ > > MSize pause; /* Pause between successive GC cycles. */ > > + > > + size_t tabnum; /* Current amount of table objects. */ > > + size_t udatanum; /* Current amount of udata objects. */ > > Minor: These comments are a bit misleading. "Current" sounds like the > metric is reset somewhen. But you're talking about "alive" objects of > the particular type. Yes, formally (in GC terms) some of them are not > alive, but are still allocated (so called "dead") objects. So the term > "alive" is also a misleading one. I guess you can freely reword the > comments the following way: > | /* Number of allocated <GCtype> objects */ > I think that "allocated" is unnecessarily. We can omit it easily. I should reword the comments as | /* Number of <GCtype> objects. */ or | /* Amount of <GCtype> objects. */ and it seems enough. > > +#ifdef LJ_HASFFI > > + size_t cdatanum; /* Current amount of cdata objects. */ > > This counter is not incremented if cdata is allocated on trace. Consider > the following test: > | $ LUA_PATH="$HOME/.luarocks/share/lua/5.1/?.lua;;" ./luajit -joff <<EOF > | heredoc> local t = { } > | heredoc> for i = 1, 1000 do > | heredoc> table.insert(t, require('ffi').new('uint64_t', i)) > | heredoc> end > | heredoc> t = nil > | heredoc> collectgarbage('collect') > | heredoc> print(require('inspect')(misc.getmetrics())) > | heredoc> EOF > | { > | global = { > | cdatanum = 0, > | gc_total = 74722, > | jit_mcode_size = 0, > | jit_trace_num = 0, > | strnum = 534, > | tabnum = 77, > | udatanum = 5 > | }, > | incremental = { > | gc_allocated = 173092, > | gc_freed = 98370, > | gc_steps_atomic = 1, > | gc_steps_finalize = 0, > | gc_steps_pause = 3, > | gc_steps_propagate = 553, > | gc_steps_sweep = 64, > | gc_steps_sweepstring = 1024, > | jit_snap_restore = 0, > | jit_trace_abort = 0, > | strhash_hit = 4021, > | strhash_miss = 538 > | } > | } > | $ LUA_PATH="$HOME/.luarocks/share/lua/5.1/?.lua;;" ./luajit -jon <<EOF > | heredoc> local t = { } > | heredoc> for i = 1, 1000 do > | heredoc> table.insert(t, require('ffi').new('uint64_t', i)) > | heredoc> end > | heredoc> t = nil > | heredoc> collectgarbage('collect') > | heredoc> print(require('inspect')(misc.getmetrics())) > | heredoc> EOF > | { > | global = { > | cdatanum = -941, > | gc_total = 76561, > | jit_mcode_size = 65536, > | jit_trace_num = 3, > | strnum = 534, > | tabnum = 77, > | udatanum = 5 > | }, > | incremental = { > | gc_allocated = 145299, > | gc_freed = 68738, > | gc_steps_atomic = 3, > | gc_steps_finalize = 0, > | gc_steps_pause = 2, > | gc_steps_propagate = 505, > | gc_steps_sweep = 64, > | gc_steps_sweepstring = 1024, > | jit_snap_restore = 6, > | jit_trace_abort = 0, > | strhash_hit = 3085, > | strhash_miss = 538 > | } > | } > > As you can see global.cdatanum is negative when JIT is on. You can take > a look on the compiled trace by yourself. The root cause is CNEW IR > semantics: for VLA/VLS cdata types <lj_cdata_newv> call is emitted and > the counters are fine, but for other cdata types JIT compiler emits just > a <lj_mem_newgco> call that so this object is not counted. Thereby JIT > has to emit the corresponding instructions while assembling CNEW IR. > I will elaborate on this later. My fault! Ok, I will try to implement it at this week. > > +#endif > > } GCState; > > > > /* Global state, shared by all threads of a Lua universe. */ > > @@ -602,6 +623,10 @@ typedef struct global_State { > > BloomFilter next[2]; > > } strbloom; > > #endif > > + size_t strhash_hit; /* New string has been found in the storage > > + ** since last luaM_metrics() call. */ > > + size_t strhash_miss; /* New string has been added to the storage > > + ** since last luaM_metrics() call. */ > > Well, such indentation is closer to the desired one, but I would use two > spaces prior to these tabs. What do you think? Also please, don't forget > about <luaM_metrics> remark. Sorry, but ^\s+\t+\s+ stylistics looks very bad. In *.c files Mike use exactly ^\t+ style for code, so we have to use the same for comments. Thoughts? > > lua_Alloc allocf; /* Memory allocator. */ > > void *allocd; /* Memory allocator data. */ > > GCState gc; /* Garbage collector. */ > > <snipped> > > > diff --git a/src/lj_state.c b/src/lj_state.c > > index 632dd07..b71ebae 100644 > > --- a/src/lj_state.c > > +++ b/src/lj_state.c > > @@ -204,6 +204,16 @@ LUA_API lua_State *lua_newstate(lua_Alloc f, void *ud) > > setgcref(g->uvhead.prev, obj2gco(&g->uvhead)); > > setgcref(g->uvhead.next, obj2gco(&g->uvhead)); > > g->strmask = ~(MSize)0; > > + > > + memset(g->gc.state_count, 0, GCSlast * sizeof(g->gc.state_count[0])); > > + g->strhash_hit = 0; > > + g->strhash_miss = 0; > > + g->gc.tabnum = 0; > > + g->gc.udatanum = 0; > > +#if LJ_HASFFI > > + g->gc.cdatanum = 0; > > +#endif > > + > > It looks like all assignments above are excess: <GG_State> is filled > with zeros, <g> points to the corresponding <GG_State> chunk, <gc> is > allocated within <g>, ergo also initialized with zeros. Feel free to > drop this hunk. Nice catch! Thanks! > > setnilV(registry(L)); > > setnilV(&g->nilnode.val); > > setnilV(&g->nilnode.key); > > @@ -214,7 +224,8 @@ LUA_API lua_State *lua_newstate(lua_Alloc f, void *ud) > > g->gc.state = GCSpause; > > setgcref(g->gc.root, obj2gco(L)); > > setmref(g->gc.sweep, &g->gc.root); > > - g->gc.total = sizeof(GG_State); > > + g->gc.allocated = g->gc.total = sizeof(GG_State); > > + g->gc.freed = 0; > > Ditto for g->gc.freed. Sure! > > > g->gc.pause = LUAI_GCPAUSE; > > g->gc.stepmul = LUAI_GCMUL; > > lj_dispatch_init((GG_State *)L); > > <snipped> > > > -- > > 2.24.1 > > > > -- > Best regards, > IM [1]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-July/018845.html -- Best regards, Sergey Kaplun
next prev parent reply other threads:[~2020-07-24 12:00 UTC|newest] Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-07-21 11:34 [Tarantool-patches] [PATCH 0/2] Implement LuaJIT platform metrics Sergey Kaplun 2020-07-21 11:34 ` [Tarantool-patches] [PATCH 1/2] metrics: add counters for metrics interested in Sergey Kaplun 2020-07-23 20:12 ` Igor Munkin 2020-07-24 12:00 ` Sergey Kaplun [this message] 2020-07-24 20:05 ` Igor Munkin 2020-07-25 10:00 ` Sergey Kaplun 2020-07-27 2:25 ` Sergey Kaplun 2020-08-12 9:39 ` Igor Munkin 2020-07-21 11:34 ` [Tarantool-patches] [PATCH 2/2] metrics: add C and Lua API Sergey Kaplun 2020-07-21 11:37 ` [Tarantool-patches] [RFC] rfc: luajit metrics Sergey Kaplun 2020-07-23 10:15 ` Sergey Ostanevich 2020-07-24 11:18 ` Sergey Kaplun 2020-07-24 11:22 ` Sergey Kaplun 2020-07-24 17:20 ` Sergey Ostanevich 2020-07-27 2:18 ` Sergey Kaplun 2020-08-11 20:13 ` Igor Munkin
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20200724120000.GB4086@root \ --to=skaplun@tarantool.org \ --cc=imun@tarantool.org \ --cc=tarantool-patches@dev.tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH 1/2] metrics: add counters for metrics interested in' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox