[Tarantool-patches] [PATCH v4 0/2] Implement LuaJIT platform metrics

Tarantool development patches archive
 help / color / mirror / Atom feed

* [Tarantool-patches] [PATCH v4 0/2] Implement LuaJIT platform metrics
@ 2020-10-05  6:30 Sergey Kaplun
  2020-10-05  6:30 ` [Tarantool-patches] [PATCH v4 1/2] core: introduce various " Sergey Kaplun
                   ` (4 more replies)
  0 siblings, 5 replies; 26+ messages in thread
From: Sergey Kaplun @ 2020-10-05  6:30 UTC (permalink / raw)
  To: Igor Munkin, Sergey Ostanevich; +Cc: tarantool-patches

The series consists of 2 patches. The first one adds corresponding
counters to LuaJIT internal structures. The second provides C and Lua
API using this counters to collect metrics.

Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-5187-luajit-metrics
Issue: https://github.com/tarantool/tarantool/issues/5187

Changes in v2:
- Fixed naming and comments
- Fixed padding in struct GCState
- Dropped unnecessary initialisations inside lua_newstate()
- Avoided flushing any of metrics after each call of luaM_metrics()

Changes in v3:
- Cleaned up mess in Makefile.dep
- Fixed naming and comments
- Fixed padding in struct GCState for 64-bit architectures
- Fixed counting amount of JIT traces
- Fixed objects counting at trace recording
- Added counting of colors
- Added C and Lua tests

Changes in v4:
- Removed counting of colors
- Changed global_State structure correspondingly for 32-bit arm build

Sergey Kaplun (2):
  core: introduce various platform metrics
  misc: add C and Lua API for platform metrics

 Makefile                                      |   2 +-
 src/Makefile                                  |   5 +-
 src/Makefile.dep                              |  14 +-
 src/lib_init.c                                |   2 +
 src/lib_misc.c                                |  72 +++
 src/lj_asm.c                                  |   2 +
 src/lj_asm_arm.h                              |   7 +
 src/lj_asm_arm64.h                            |   7 +
 src/lj_asm_mips.h                             |   4 +
 src/lj_asm_ppc.h                              |   3 +
 src/lj_asm_x86.h                              |   4 +
 src/lj_cdata.c                                |   2 +
 src/lj_cdata.h                                |   2 +
 src/lj_gc.c                                   |   4 +
 src/lj_gc.h                                   |   6 +-
 src/lj_jit.h                                  |   3 +
 src/lj_mapi.c                                 |  61 +++
 src/lj_obj.h                                  |  28 +-
 src/lj_snap.c                                 |   1 +
 src/lj_state.c                                |   2 +-
 src/lj_str.c                                  |   5 +
 src/lj_tab.c                                  |   2 +
 src/lj_trace.c                                |   6 +-
 src/lj_udata.c                                |   2 +
 src/ljamalg.c                                 |   2 +
 src/lmisclib.h                                |  64 +++
 src/luaconf.h                                 |   1 +
 test/clib-misclib-getmetrics.test.lua         | 174 ++++++++
 test/clib-misclib-getmetrics/CMakeLists.txt   |   1 +
 test/clib-misclib-getmetrics/testgetmetrics.c | 242 ++++++++++
 test/lib-misc-getmetrics.test.lua             | 418 ++++++++++++++++++
 31 files changed, 1130 insertions(+), 18 deletions(-)
 create mode 100644 src/lib_misc.c
 create mode 100644 src/lj_mapi.c
 create mode 100644 src/lmisclib.h
 create mode 100755 test/clib-misclib-getmetrics.test.lua
 create mode 100644 test/clib-misclib-getmetrics/CMakeLists.txt
 create mode 100644 test/clib-misclib-getmetrics/testgetmetrics.c
 create mode 100755 test/lib-misc-getmetrics.test.lua

-- 
2.28.0

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Tarantool-patches] [PATCH v4 1/2] core: introduce various platform metrics
  2020-10-05  6:30 [Tarantool-patches] [PATCH v4 0/2] Implement LuaJIT platform metrics Sergey Kaplun
@ 2020-10-05  6:30 ` Sergey Kaplun
  2020-10-07 14:11   ` Igor Munkin
  2020-10-05  6:30 ` [Tarantool-patches] [PATCH v4 2/2] misc: add C and Lua API for " Sergey Kaplun
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 26+ messages in thread
From: Sergey Kaplun @ 2020-10-05  6:30 UTC (permalink / raw)
  To: Igor Munkin, Sergey Ostanevich; +Cc: tarantool-patches

This patch introduces the following counters:
  - overall amount of allocated tables, cdata and udata objects
  - number of incremental GC steps grouped by GC state
  - number of string hashes hits and misses
  - amount of allocated and freed memory
  - number of trace aborts and restored snapshots

Interfaces to obtain these metrics via both Lua and C API are
introduced in the next patch.

Part of tarantool/tarantool#5187
---
 src/lj_asm.c       |  2 ++
 src/lj_asm_arm.h   |  7 +++++++
 src/lj_asm_arm64.h |  7 +++++++
 src/lj_asm_mips.h  |  4 ++++
 src/lj_asm_ppc.h   |  3 +++
 src/lj_asm_x86.h   |  4 ++++
 src/lj_cdata.c     |  2 ++
 src/lj_cdata.h     |  2 ++
 src/lj_gc.c        |  4 ++++
 src/lj_gc.h        |  6 +-----
 src/lj_jit.h       |  3 +++
 src/lj_obj.h       | 28 +++++++++++++++++++++++++---
 src/lj_snap.c      |  1 +
 src/lj_state.c     |  2 +-
 src/lj_str.c       |  5 +++++
 src/lj_tab.c       |  2 ++
 src/lj_trace.c     |  6 +++++-
 src/lj_udata.c     |  2 ++
 18 files changed, 80 insertions(+), 10 deletions(-)

diff --git a/src/lj_asm.c b/src/lj_asm.c
index c2cf5a9..8fb3ccd 100644
--- a/src/lj_asm.c
+++ b/src/lj_asm.c
@@ -2273,6 +2273,7 @@ void lj_asm_trace(jit_State *J, GCtrace *T)
   as->J = J;
   as->T = T;
   J->curfinal = lj_trace_alloc(J->L, T);  /* This copies the IR, too. */
+  J->tracenum++;
   as->flags = J->flags;
   as->loopref = J->loopref;
   as->realign = NULL;
@@ -2379,6 +2380,7 @@ void lj_asm_trace(jit_State *J, GCtrace *T)
     lj_trace_free(J2G(J), J->curfinal);
     J->curfinal = NULL;  /* In case lj_trace_alloc() OOMs. */
     J->curfinal = lj_trace_alloc(J->L, T);
+    J->tracenum++;
     as->realign = NULL;
   }
 
diff --git a/src/lj_asm_arm.h b/src/lj_asm_arm.h
index 37bfa40..4fd08b9 100644
--- a/src/lj_asm_arm.h
+++ b/src/lj_asm_arm.h
@@ -1257,9 +1257,16 @@ static void asm_cnew(ASMState *as, IRIns *ir)
   {
     uint32_t k = emit_isk12(ARMI_MOV, id);
     Reg r = k ? RID_R1 : ra_allock(as, id, allow);
+    allow = rset_exclude(allow, r);
+    Reg gr = ra_allock(as, i32ptr(J2G(as->J)), allow);
     emit_lso(as, ARMI_STRB, RID_TMP, RID_RET, offsetof(GCcdata, gct));
     emit_lsox(as, ARMI_STRH, r, RID_RET, offsetof(GCcdata, ctypeid));
     emit_d(as, ARMI_MOV|ARMI_K12|~LJ_TCDATA, RID_TMP);
+    emit_lso(as, ARMI_STR, RID_TMP, gr,
+	     (int32_t)offsetof(global_State, gc.cdatanum));
+    emit_dn(as, ARMI_ADD|ARMI_K12|1, RID_TMP, RID_TMP);
+    emit_lso(as, ARMI_LDR, RID_TMP, gr,
+	     (int32_t)offsetof(global_State, gc.cdatanum));
     if (k) emit_d(as, ARMI_MOV^k, RID_R1);
   }
   args[0] = ASMREF_L;     /* lua_State *L */
diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
index 8fd92e7..a32ba2d 100644
--- a/src/lj_asm_arm64.h
+++ b/src/lj_asm_arm64.h
@@ -1220,9 +1220,16 @@ static void asm_cnew(ASMState *as, IRIns *ir)
   /* Initialize gct and ctypeid. lj_mem_newgco() already sets marked. */
   {
     Reg r = (id < 65536) ? RID_X1 : ra_allock(as, id, allow);
+    allow = rset_exclude(allow, r);
+    Reg gr = ra_allock(as, i64ptr(J2G(as->J)), allow);
     emit_lso(as, A64I_STRB, RID_TMP, RID_RET, offsetof(GCcdata, gct));
     emit_lso(as, A64I_STRH, r, RID_RET, offsetof(GCcdata, ctypeid));
     emit_d(as, A64I_MOVZw | A64F_U16(~LJ_TCDATA), RID_TMP);
+    emit_lso(as, A64I_STRx, RID_TMP, gr,
+	     (int32_t)offsetof(global_State, gc.cdatanum));
+    emit_dn(as, (A64I_ADDx^A64I_K12) | A64F_U12(1), RID_TMP, RID_TMP);
+    emit_lso(as, A64I_LDRx, RID_TMP, gr,
+	     (int32_t)offsetof(global_State, gc.cdatanum));
     if (id < 65536) emit_d(as, A64I_MOVZw | A64F_U16(id), RID_X1);
   }
   args[0] = ASMREF_L;     /* lua_State *L */
diff --git a/src/lj_asm_mips.h b/src/lj_asm_mips.h
index affe7d8..f4b4b5d 100644
--- a/src/lj_asm_mips.h
+++ b/src/lj_asm_mips.h
@@ -1473,11 +1473,15 @@ static void asm_cnew(ASMState *as, IRIns *ir)
     return;
   }
 
+  /* Code incrementing cdatanum is sparse to avoid mips data hazards. */
+  emit_setgl(as, RID_RET+2, gc.cdatanum);
   /* Initialize gct and ctypeid. lj_mem_newgco() already sets marked. */
   emit_tsi(as, MIPSI_SB, RID_RET+1, RID_RET, offsetof(GCcdata, gct));
   emit_tsi(as, MIPSI_SH, RID_TMP, RID_RET, offsetof(GCcdata, ctypeid));
+  emit_tsi(as, MIPSI_AADDIU, RID_RET+2, RID_RET+2, 1);
   emit_ti(as, MIPSI_LI, RID_RET+1, ~LJ_TCDATA);
   emit_ti(as, MIPSI_LI, RID_TMP, id); /* Lower 16 bit used. Sign-ext ok. */
+  emit_getgl(as, RID_RET+2, gc.cdatanum);
   args[0] = ASMREF_L;     /* lua_State *L */
   args[1] = ASMREF_TMP1;  /* MSize size   */
   asm_gencall(as, ci, args);
diff --git a/src/lj_asm_ppc.h b/src/lj_asm_ppc.h
index 6daa861..aa2d45c 100644
--- a/src/lj_asm_ppc.h
+++ b/src/lj_asm_ppc.h
@@ -1057,6 +1057,9 @@ static void asm_cnew(ASMState *as, IRIns *ir)
     return;
   }
 
+  emit_setgl(as, RID_TMP, gc.cdatanum);
+  emit_tai(as, PPCI_ADDIC, RID_TMP, RID_TMP, 1);
+  emit_getgl(as, RID_TMP, gc.cdatanum);
   /* Initialize gct and ctypeid. lj_mem_newgco() already sets marked. */
   emit_tai(as, PPCI_STB, RID_RET+1, RID_RET, offsetof(GCcdata, gct));
   emit_tai(as, PPCI_STH, RID_TMP, RID_RET, offsetof(GCcdata, ctypeid));
diff --git a/src/lj_asm_x86.h b/src/lj_asm_x86.h
index 3e189b1..959fc2d 100644
--- a/src/lj_asm_x86.h
+++ b/src/lj_asm_x86.h
@@ -1835,6 +1835,10 @@ static void asm_cnew(ASMState *as, IRIns *ir)
     return;
   }
 
+  /* Increment cdatanum counter by address directly. */
+  emit_i8(as, 1);
+  emit_rmro(as, XO_ARITHi8, XOg_ADD, RID_NONE,
+	    ptr2addr(&J2G(as->J)->gc.cdatanum));
   /* Combine initialization of marked, gct and ctypeid. */
   emit_movtomro(as, RID_ECX, RID_RET, offsetof(GCcdata, marked));
   emit_gri(as, XG_ARITHi(XOg_OR), RID_ECX,
diff --git a/src/lj_cdata.c b/src/lj_cdata.c
index 68e16d7..0dd8553 100644
--- a/src/lj_cdata.c
+++ b/src/lj_cdata.c
@@ -46,6 +46,7 @@ GCcdata *lj_cdata_newv(lua_State *L, CTypeID id, CTSize sz, CTSize align)
   cd->marked |= 0x80;
   cd->gct = ~LJ_TCDATA;
   cd->ctypeid = id;
+  g->gc.cdatanum++;
   return cd;
 }
 
@@ -82,6 +83,7 @@ void LJ_FASTCALL lj_cdata_free(global_State *g, GCcdata *cd)
   } else {
     lj_mem_free(g, memcdatav(cd), sizecdatav(cd));
   }
+  g->gc.cdatanum--;
 }
 
 void lj_cdata_setfin(lua_State *L, GCcdata *cd, GCobj *obj, uint32_t it)
diff --git a/src/lj_cdata.h b/src/lj_cdata.h
index 5bb0f5d..66b023b 100644
--- a/src/lj_cdata.h
+++ b/src/lj_cdata.h
@@ -45,6 +45,7 @@ static LJ_AINLINE GCcdata *lj_cdata_new(CTState *cts, CTypeID id, CTSize sz)
   cd = (GCcdata *)lj_mem_newgco(cts->L, sizeof(GCcdata) + sz);
   cd->gct = ~LJ_TCDATA;
   cd->ctypeid = ctype_check(cts, id);
+  G(cts->L)->gc.cdatanum++;
   return cd;
 }
 
@@ -54,6 +55,7 @@ static LJ_AINLINE GCcdata *lj_cdata_new_(lua_State *L, CTypeID id, CTSize sz)
   GCcdata *cd = (GCcdata *)lj_mem_newgco(L, sizeof(GCcdata) + sz);
   cd->gct = ~LJ_TCDATA;
   cd->ctypeid = id;
+  G(L)->gc.cdatanum++;
   return cd;
 }
 
diff --git a/src/lj_gc.c b/src/lj_gc.c
index 1c8e6ce..44c8aa1 100644
--- a/src/lj_gc.c
+++ b/src/lj_gc.c
@@ -634,6 +634,7 @@ static void atomic(global_State *g, lua_State *L)
 static size_t gc_onestep(lua_State *L)
 {
   global_State *g = G(L);
+  g->gc.state_count[g->gc.state]++;
   switch (g->gc.state) {
   case GCSpause:
     gc_mark_start(g);  /* Start a new GC cycle by marking all GC roots. */
@@ -857,6 +858,8 @@ void *lj_mem_realloc(lua_State *L, void *p, GCSize osz, GCSize nsz)
   lua_assert((nsz == 0) == (p == NULL));
   lua_assert(checkptrGC(p));
   g->gc.total = (g->gc.total - osz) + nsz;
+  g->gc.allocated += nsz;
+  g->gc.freed += osz;
   return p;
 }
 
@@ -869,6 +872,7 @@ void * LJ_FASTCALL lj_mem_newgco(lua_State *L, GCSize size)
     lj_err_mem(L);
   lua_assert(checkptrGC(o));
   g->gc.total += size;
+  g->gc.allocated += size;
   setgcrefr(o->gch.nextgc, g->gc.root);
   setgcref(g->gc.root, o);
   newwhite(g, o);
diff --git a/src/lj_gc.h b/src/lj_gc.h
index 669bbe9..2051220 100644
--- a/src/lj_gc.h
+++ b/src/lj_gc.h
@@ -8,11 +8,6 @@
 
 #include "lj_obj.h"
 
-/* Garbage collector states. Order matters. */
-enum {
-  GCSpause, GCSpropagate, GCSatomic, GCSsweepstring, GCSsweep, GCSfinalize
-};
-
 /* Bitmasks for marked field of GCobj. */
 #define LJ_GC_WHITE0	0x01
 #define LJ_GC_WHITE1	0x02
@@ -117,6 +112,7 @@ LJ_FUNC void *lj_mem_grow(lua_State *L, void *p,
 static LJ_AINLINE void lj_mem_free(global_State *g, void *p, size_t osize)
 {
   g->gc.total -= (GCSize)osize;
+  g->gc.freed += osize;
   g->allocf(g->allocd, p, osize, 0);
 }
 
diff --git a/src/lj_jit.h b/src/lj_jit.h
index 7eb3d2a..d82292f 100644
--- a/src/lj_jit.h
+++ b/src/lj_jit.h
@@ -474,6 +474,9 @@ typedef struct jit_State {
   MCode *mcbot;		/* Bottom of current mcode area. */
   size_t szmcarea;	/* Size of current mcode area. */
   size_t szallmcarea;	/* Total size of all allocated mcode areas. */
+  size_t tracenum;	/* Overall number of traces. */
+  size_t nsnaprestore;	/* Overall number of snap restores. */
+  size_t ntraceabort;	/* Overall number of abort traces. */
 
   TValue errinfo;	/* Additional info element for trace errors. */
 
diff --git a/src/lj_obj.h b/src/lj_obj.h
index f368578..600b68d 100644
--- a/src/lj_obj.h
+++ b/src/lj_obj.h
@@ -571,6 +571,17 @@ typedef enum {
 #define basemt_obj(g, o)	((g)->gcroot[GCROOT_BASEMT+itypemap(o)])
 #define mmname_str(g, mm)	(strref((g)->gcroot[GCROOT_MMNAME+(mm)]))
 
+/* Garbage collector states. Order matters. */
+enum {
+  GCSpause, /* Start a GC cycle and mark the root set.*/
+  GCSpropagate, /* One gray object is processed. */
+  GCSatomic, /* Atomic transition from mark to sweep phase. */
+  GCSsweepstring, /* Sweep one chain of strings. */
+  GCSsweep, /* Sweep few objects from root. */
+  GCSfinalize, /* Finalize one userdata or cdata object. */
+  GCSmax
+};
+
 typedef struct GCState {
   GCSize total;		/* Memory currently allocated. */
   GCSize threshold;	/* Memory threshold. */
@@ -589,6 +600,15 @@ typedef struct GCState {
   GCSize estimate;	/* Estimate of memory actually in use. */
   MSize stepmul;	/* Incremental GC step granularity. */
   MSize pause;		/* Pause between successive GC cycles. */
+
+  size_t freed;		/* Total amount of freed memory. */
+  size_t allocated;	/* Total amount of allocated memory. */
+  size_t state_count[GCSmax]; /* Count of incremental GC steps per state. */
+  size_t tabnum;	/* Amount of allocated table objects. */
+  size_t udatanum;	/* Amount of allocated udata objects. */
+#ifdef LJ_HASFFI
+  size_t cdatanum;	/* Amount of allocated cdata objects. */
+#endif
 } GCState;
 
 /* Global state, shared by all threads of a Lua universe. */
@@ -602,22 +622,24 @@ typedef struct global_State {
     BloomFilter next[2];
   } strbloom;
 #endif
+  size_t strhash_hit;	/* Strings amount found in string hash. */
+  size_t strhash_miss;	/* Strings amount allocated and put into string hash. */
   lua_Alloc allocf;	/* Memory allocator. */
   void *allocd;		/* Memory allocator data. */
   GCState gc;		/* Garbage collector. */
-  volatile int32_t vmstate;  /* VM state or current JIT code trace number. */
   SBuf tmpbuf;		/* Temporary string buffer. */
   GCstr strempty;	/* Empty string. */
   uint8_t stremptyz;	/* Zero terminator of empty string. */
   uint8_t hookmask;	/* Hook mask. */
   uint8_t dispatchmode;	/* Dispatch mode. */
   uint8_t vmevmask;	/* VM event mask. */
+  int32_t hookcount;	/* Instruction hook countdown. */
   GCRef mainthref;	/* Link to main thread. */
   TValue registrytv;	/* Anchor for registry. */
-  TValue tmptv, tmptv2;	/* Temporary TValues. */
+  TValue tmptv2, tmptv;	/* Temporary TValues. */
   Node nilnode;		/* Fallback 1-element hash part (nil key and value). */
   GCupval uvhead;	/* Head of double-linked list of all open upvalues. */
-  int32_t hookcount;	/* Instruction hook countdown. */
+  volatile int32_t vmstate;  /* VM state or current JIT code trace number. */
   int32_t hookcstart;	/* Start count for instruction hook counter. */
   lua_Hook hookf;	/* Hook function. */
   lua_CFunction wrapf;	/* Wrapper for C function calls. */
diff --git a/src/lj_snap.c b/src/lj_snap.c
index 7554caf..4cf27e7 100644
--- a/src/lj_snap.c
+++ b/src/lj_snap.c
@@ -904,6 +904,7 @@ const BCIns *lj_snap_restore(jit_State *J, void *exptr)
     L->top = frame + snap->nslots;
     break;
   }
+  J->nsnaprestore++;
   return pc;
 }
 
diff --git a/src/lj_state.c b/src/lj_state.c
index 632dd07..1d9c628 100644
--- a/src/lj_state.c
+++ b/src/lj_state.c
@@ -214,7 +214,7 @@ LUA_API lua_State *lua_newstate(lua_Alloc f, void *ud)
   g->gc.state = GCSpause;
   setgcref(g->gc.root, obj2gco(L));
   setmref(g->gc.sweep, &g->gc.root);
-  g->gc.total = sizeof(GG_State);
+  g->gc.allocated = g->gc.total = sizeof(GG_State);
   g->gc.pause = LUAI_GCPAUSE;
   g->gc.stepmul = LUAI_GCMUL;
   lj_dispatch_init((GG_State *)L);
diff --git a/src/lj_str.c b/src/lj_str.c
index 24e90ca..8ff955e 100644
--- a/src/lj_str.c
+++ b/src/lj_str.c
@@ -222,6 +222,7 @@ GCstr *lj_str_new(lua_State *L, const char *str, size_t lenx)
                       str_fastcmp(str, strdata(sx), len) == 0) {
 	/* Resurrect if dead. Can only happen with fixstring() (keywords). */
 	if (isdead(g, o)) flipwhite(o);
+	g->strhash_hit++;
 	return sx;  /* Return existing string. */
       }
       o = gcnext(o);
@@ -234,6 +235,7 @@ GCstr *lj_str_new(lua_State *L, const char *str, size_t lenx)
                       memcmp(str, strdata(sx), len) == 0) {
 	/* Resurrect if dead. Can only happen with fixstring() (keywords). */
 	if (isdead(g, o)) flipwhite(o);
+	g->strhash_hit++;
 	return sx;  /* Return existing string. */
       }
       o = gcnext(o);
@@ -266,6 +268,7 @@ GCstr *lj_str_new(lua_State *L, const char *str, size_t lenx)
 	    if (sx->hash == fh && sx->len == len && str_fastcmp(str, strdata(sx), len) == 0) {
 	      /* Resurrect if dead. Can only happen with fixstring() (keywords). */
 	      if (isdead(g, o)) flipwhite(o);
+	      g->strhash_hit++;
 	      return sx;  /* Return existing string. */
 	    }
 	    o = gcnext(o);
@@ -276,6 +279,7 @@ GCstr *lj_str_new(lua_State *L, const char *str, size_t lenx)
 	    if (sx->hash == fh && sx->len == len && memcmp(str, strdata(sx), len) == 0) {
 	      /* Resurrect if dead. Can only happen with fixstring() (keywords). */
 	      if (isdead(g, o)) flipwhite(o);
+	      g->strhash_hit++;
 	      return sx;  /* Return existing string. */
 	    }
 	    o = gcnext(o);
@@ -293,6 +297,7 @@ GCstr *lj_str_new(lua_State *L, const char *str, size_t lenx)
     }
   }
 #endif
+  g->strhash_miss++;
   /* Nope, create a new string. */
   s = lj_mem_newt(L, sizeof(GCstr)+len+1, GCstr);
   newwhite(g, s);
diff --git a/src/lj_tab.c b/src/lj_tab.c
index ff216f3..c5f358e 100644
--- a/src/lj_tab.c
+++ b/src/lj_tab.c
@@ -141,6 +141,7 @@ static GCtab *newtab(lua_State *L, uint32_t asize, uint32_t hbits)
   }
   if (hbits)
     newhpart(L, t, hbits);
+  G(L)->gc.tabnum++;
   return t;
 }
 
@@ -240,6 +241,7 @@ void LJ_FASTCALL lj_tab_free(global_State *g, GCtab *t)
     lj_mem_free(g, t, sizetabcolo((uint32_t)t->colo & 0x7f));
   else
     lj_mem_freet(g, t);
+  g->gc.tabnum--;
 }
 
 /* -- Table resizing ------------------------------------------------------ */
diff --git a/src/lj_trace.c b/src/lj_trace.c
index d85b47f..9bcbce6 100644
--- a/src/lj_trace.c
+++ b/src/lj_trace.c
@@ -176,6 +176,7 @@ void LJ_FASTCALL lj_trace_free(global_State *g, GCtrace *T)
   lj_mem_free(g, T,
     ((sizeof(GCtrace)+7)&~7) + (T->nins-T->nk)*sizeof(IRIns) +
     T->nsnap*sizeof(SnapShot) + T->nsnapmap*sizeof(SnapEntry));
+  J->tracenum--;
 }
 
 /* Re-enable compiling a prototype by unpatching any modified bytecode. */
@@ -538,8 +539,10 @@ static int trace_downrec(jit_State *J)
   /* Restart recording at the return instruction. */
   lua_assert(J->pt != NULL);
   lua_assert(bc_isret(bc_op(*J->pc)));
-  if (bc_op(*J->pc) == BC_RETM)
+  if (bc_op(*J->pc) == BC_RETM) {
+    J->ntraceabort++;
     return 0;  /* NYI: down-recursion with RETM. */
+  }
   J->parent = 0;
   J->exitno = 0;
   J->state = LJ_TRACE_RECORD;
@@ -616,6 +619,7 @@ static int trace_abort(jit_State *J)
     return trace_downrec(J);
   else if (e == LJ_TRERR_MCODEAL)
     lj_trace_flushall(L);
+  J->ntraceabort++;
   return 0;
 }
 
diff --git a/src/lj_udata.c b/src/lj_udata.c
index bd0321b..70c722a 100644
--- a/src/lj_udata.c
+++ b/src/lj_udata.c
@@ -24,11 +24,13 @@ GCudata *lj_udata_new(lua_State *L, MSize sz, GCtab *env)
   /* Chain to userdata list (after main thread). */
   setgcrefr(ud->nextgc, mainthread(g)->nextgc);
   setgcref(mainthread(g)->nextgc, obj2gco(ud));
+  g->gc.udatanum++;
   return ud;
 }
 
 void LJ_FASTCALL lj_udata_free(global_State *g, GCudata *ud)
 {
+  g->gc.udatanum--;
   lj_mem_free(g, ud, sizeudata(ud));
 }
 
-- 
2.28.0

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Tarantool-patches] [PATCH v4 2/2] misc: add C and Lua API for platform metrics
  2020-10-05  6:30 [Tarantool-patches] [PATCH v4 0/2] Implement LuaJIT platform metrics Sergey Kaplun
  2020-10-05  6:30 ` [Tarantool-patches] [PATCH v4 1/2] core: introduce various " Sergey Kaplun
@ 2020-10-05  6:30 ` Sergey Kaplun
  2020-10-06 22:17   ` Igor Munkin
  2020-10-09 14:45   ` Sergey Ostanevich
  2020-10-05  6:30 ` [Tarantool-patches] [RFC v4] rfc: luajit metrics Sergey Kaplun
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 26+ messages in thread
From: Sergey Kaplun @ 2020-10-05  6:30 UTC (permalink / raw)
  To: Igor Munkin, Sergey Ostanevich; +Cc: tarantool-patches

This patch introduces both C and Lua API for LuaJIT platform
metrics implemented in scope of the previous patch. New <lmisclib.h>
header provides <luaM_metrics> C interface that fills the given
<luam_metrics> structure with the platform metrics. Additionally
<misc> module is loaded to Lua space and provides <getmetrics>
method that yields the corresponding metrics via table.

Part of tarantool/tarantool#5187
---
 Makefile                                      |   2 +-
 src/Makefile                                  |   5 +-
 src/Makefile.dep                              |  14 +-
 src/lib_init.c                                |   2 +
 src/lib_misc.c                                |  72 +++
 src/lj_mapi.c                                 |  61 +++
 src/ljamalg.c                                 |   2 +
 src/lmisclib.h                                |  64 +++
 src/luaconf.h                                 |   1 +
 test/clib-misclib-getmetrics.test.lua         | 174 ++++++++
 test/clib-misclib-getmetrics/CMakeLists.txt   |   1 +
 test/clib-misclib-getmetrics/testgetmetrics.c | 242 ++++++++++
 test/lib-misc-getmetrics.test.lua             | 418 ++++++++++++++++++
 13 files changed, 1050 insertions(+), 8 deletions(-)
 create mode 100644 src/lib_misc.c
 create mode 100644 src/lj_mapi.c
 create mode 100644 src/lmisclib.h
 create mode 100755 test/clib-misclib-getmetrics.test.lua
 create mode 100644 test/clib-misclib-getmetrics/CMakeLists.txt
 create mode 100644 test/clib-misclib-getmetrics/testgetmetrics.c
 create mode 100755 test/lib-misc-getmetrics.test.lua

diff --git a/Makefile b/Makefile
index 0f93308..4a56917 100644
--- a/Makefile
+++ b/Makefile
@@ -84,7 +84,7 @@ FILE_A= libluajit.a
 FILE_SO= libluajit.so
 FILE_MAN= luajit.1
 FILE_PC= luajit.pc
-FILES_INC= lua.h lualib.h lauxlib.h luaconf.h lua.hpp luajit.h
+FILES_INC= lua.h lualib.h lauxlib.h luaconf.h lua.hpp luajit.h lmisclib.h
 FILES_JITLIB= bc.lua bcsave.lua dump.lua p.lua v.lua zone.lua \
 	      dis_x86.lua dis_x64.lua dis_arm.lua dis_arm64.lua \
 	      dis_arm64be.lua dis_ppc.lua dis_mips.lua dis_mipsel.lua \
diff --git a/src/Makefile b/src/Makefile
index 827d4a4..2786348 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -480,13 +480,14 @@ LJVM_BOUT= $(LJVM_S)
 LJVM_MODE= elfasm
 
 LJLIB_O= lib_base.o lib_math.o lib_bit.o lib_string.o lib_table.o \
-	 lib_io.o lib_os.o lib_package.o lib_debug.o lib_jit.o lib_ffi.o
+	 lib_io.o lib_os.o lib_package.o lib_debug.o lib_jit.o lib_ffi.o \
+	 lib_misc.o
 LJLIB_C= $(LJLIB_O:.o=.c)
 
 LJCORE_O= lj_gc.o lj_err.o lj_char.o lj_bc.o lj_obj.o lj_buf.o \
 	  lj_str.o lj_tab.o lj_func.o lj_udata.o lj_meta.o lj_debug.o \
 	  lj_state.o lj_dispatch.o lj_vmevent.o lj_vmmath.o lj_strscan.o \
-	  lj_strfmt.o lj_strfmt_num.o lj_api.o lj_profile.o \
+	  lj_strfmt.o lj_strfmt_num.o lj_api.o lj_mapi.o lj_profile.o \
 	  lj_lex.o lj_parse.o lj_bcread.o lj_bcwrite.o lj_load.o \
 	  lj_ir.o lj_opt_mem.o lj_opt_fold.o lj_opt_narrow.o \
 	  lj_opt_dce.o lj_opt_loop.o lj_opt_split.o lj_opt_sink.o \
diff --git a/src/Makefile.dep b/src/Makefile.dep
index 2b1cb5e..556314e 100644
--- a/src/Makefile.dep
+++ b/src/Makefile.dep
@@ -18,7 +18,7 @@ lib_ffi.o: lib_ffi.c lua.h luaconf.h lauxlib.h lualib.h lj_obj.h lj_def.h \
  lj_ctype.h lj_cparse.h lj_cdata.h lj_cconv.h lj_carith.h lj_ccall.h \
  lj_ccallback.h lj_clib.h lj_strfmt.h lj_ff.h lj_ffdef.h lj_lib.h \
  lj_libdef.h
-lib_init.o: lib_init.c lua.h luaconf.h lauxlib.h lualib.h lj_arch.h
+lib_init.o: lib_init.c lua.h luaconf.h lauxlib.h lualib.h lmisclib.h lj_arch.h
 lib_io.o: lib_io.c lua.h luaconf.h lauxlib.h lualib.h lj_obj.h lj_def.h \
  lj_arch.h lj_gc.h lj_err.h lj_errmsg.h lj_buf.h lj_str.h lj_state.h \
  lj_strfmt.h lj_ff.h lj_ffdef.h lj_lib.h lj_libdef.h
@@ -29,6 +29,8 @@ lib_jit.o: lib_jit.c lua.h luaconf.h lauxlib.h lualib.h lj_obj.h lj_def.h \
  lj_vm.h lj_vmevent.h lj_lib.h luajit.h lj_libdef.h
 lib_math.o: lib_math.c lua.h luaconf.h lauxlib.h lualib.h lj_obj.h \
  lj_def.h lj_arch.h lj_lib.h lj_vm.h lj_libdef.h
+lib_misc.o: lib_misc.c lua.h luaconf.h lmisclib.h lj_obj.h lj_def.h lj_arch.h \
+ lj_str.h lj_tab.h lj_lib.h lj_libdef.h
 lib_os.o: lib_os.c lua.h luaconf.h lauxlib.h lualib.h lj_obj.h lj_def.h \
  lj_arch.h lj_gc.h lj_err.h lj_errmsg.h lj_buf.h lj_str.h lj_lib.h \
  lj_libdef.h
@@ -137,6 +139,8 @@ lj_lib.o: lj_lib.c lauxlib.h lua.h luaconf.h lj_obj.h lj_def.h lj_arch.h \
 lj_load.o: lj_load.c lua.h luaconf.h lauxlib.h lj_obj.h lj_def.h \
  lj_arch.h lj_gc.h lj_err.h lj_errmsg.h lj_buf.h lj_str.h lj_func.h \
  lj_frame.h lj_bc.h lj_vm.h lj_lex.h lj_bcdump.h lj_parse.h
+lj_mapi.o: lj_mapi.c lua.h luaconf.h lmisclib.h lj_obj.h lj_def.h lj_arch.h \
+ lj_dispatch.h lj_bc.h lj_jit.h lj_ir.h
 lj_mcode.o: lj_mcode.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
  lj_gc.h lj_err.h lj_errmsg.h lj_jit.h lj_ir.h lj_mcode.h lj_trace.h \
  lj_dispatch.h lj_bc.h lj_traceerr.h lj_vm.h
@@ -215,9 +219,9 @@ ljamalg.o: ljamalg.c lua.h luaconf.h lauxlib.h lj_gc.c lj_obj.h lj_def.h \
  lj_func.c lj_udata.c lj_meta.c lj_strscan.h lj_lib.h lj_debug.c \
  lj_state.c lj_lex.h lj_alloc.h luajit.h lj_dispatch.c lj_ccallback.h \
  lj_profile.h lj_vmevent.c lj_vmevent.h lj_vmmath.c lj_strscan.c \
- lj_strfmt.c lj_strfmt_num.c lj_api.c lj_profile.c lj_lex.c lualib.h \
- lj_parse.h lj_parse.c lj_bcread.c lj_bcdump.h lj_bcwrite.c lj_load.c \
- lj_ctype.c lj_cdata.c lj_cconv.h lj_cconv.c lj_ccall.c lj_ccall.h \
+ lj_strfmt.c lj_strfmt_num.c lj_api.c lj_mapi.c lmisclib.h lj_profile.c \
+ lj_lex.c lualib.h lj_parse.h lj_parse.c lj_bcread.c lj_bcdump.h lj_bcwrite.c \
+ lj_load.c lj_ctype.c lj_cdata.c lj_cconv.h lj_cconv.c lj_ccall.c lj_ccall.h \
  lj_ccallback.c lj_target.h lj_target_*.h lj_mcode.h lj_carith.c \
  lj_carith.h lj_clib.c lj_clib.h lj_cparse.c lj_cparse.h lj_lib.c lj_ir.c \
  lj_ircall.h lj_iropt.h lj_opt_mem.c lj_opt_fold.c lj_folddef.h \
@@ -227,7 +231,7 @@ ljamalg.o: ljamalg.c lua.h luaconf.h lauxlib.h lj_gc.c lj_obj.h lj_def.h \
  lj_emit_*.h lj_asm_*.h lj_trace.c lj_gdbjit.h lj_gdbjit.c lj_alloc.c \
  lib_aux.c lib_base.c lj_libdef.h lib_math.c lib_string.c lib_table.c \
  lib_io.c lib_os.c lib_package.c lib_debug.c lib_bit.c lib_jit.c \
- lib_ffi.c lib_init.c
+ lib_ffi.c lib_misc.c lib_init.c
 luajit.o: luajit.c lua.h luaconf.h lauxlib.h lualib.h luajit.h lj_arch.h
 host/buildvm.o: host/buildvm.c host/buildvm.h lj_def.h lua.h luaconf.h \
  lj_arch.h lj_obj.h lj_def.h lj_arch.h lj_gc.h lj_obj.h lj_bc.h lj_ir.h \
diff --git a/src/lib_init.c b/src/lib_init.c
index 2ed370e..664aa7d 100644
--- a/src/lib_init.c
+++ b/src/lib_init.c
@@ -12,6 +12,7 @@
 #include "lua.h"
 #include "lauxlib.h"
 #include "lualib.h"
+#include "lmisclib.h"
 
 #include "lj_arch.h"
 
@@ -26,6 +27,7 @@ static const luaL_Reg lj_lib_load[] = {
   { LUA_DBLIBNAME,	luaopen_debug },
   { LUA_BITLIBNAME,	luaopen_bit },
   { LUA_JITLIBNAME,	luaopen_jit },
+  { LUAM_MISCLIBNAME,	luaopen_misc },
   { NULL,		NULL }
 };
 
diff --git a/src/lib_misc.c b/src/lib_misc.c
new file mode 100644
index 0000000..ef11237
--- /dev/null
+++ b/src/lib_misc.c
@@ -0,0 +1,72 @@
+/*
+** Miscellaneous Lua extensions library.
+**
+** Major portions taken verbatim or adapted from the LuaVela interpreter.
+** Copyright (C) 2015-2019 IPONWEB Ltd.
+*/
+
+#define lib_misc_c
+#define LUA_LIB
+
+#include "lua.h"
+#include "lmisclib.h"
+
+#include "lj_obj.h"
+#include "lj_str.h"
+#include "lj_tab.h"
+#include "lj_lib.h"
+
+/* ------------------------------------------------------------------------ */
+
+static LJ_AINLINE void setnumfield(struct lua_State *L, GCtab *t,
+				   const char *name, int64_t val)
+{
+  setnumV(lj_tab_setstr(L, t, lj_str_newz(L, name)), (double)val);
+}
+
+#define LJLIB_MODULE_misc
+
+LJLIB_CF(misc_getmetrics)
+{
+  struct luam_Metrics metrics;
+  lua_createtable(L, 0, 22);
+  GCtab *m = tabV(L->top - 1);
+
+  luaM_metrics(L, &metrics);
+
+  setnumfield(L, m, "strhash_hit", metrics.strhash_hit);
+  setnumfield(L, m, "strhash_miss", metrics.strhash_miss);
+
+  setnumfield(L, m, "gc_strnum", metrics.gc_strnum);
+  setnumfield(L, m, "gc_tabnum", metrics.gc_tabnum);
+  setnumfield(L, m, "gc_udatanum", metrics.gc_udatanum);
+  setnumfield(L, m, "gc_cdatanum", metrics.gc_cdatanum);
+
+  setnumfield(L, m, "gc_total", metrics.gc_total);
+  setnumfield(L, m, "gc_freed", metrics.gc_freed);
+  setnumfield(L, m, "gc_allocated", metrics.gc_allocated);
+
+  setnumfield(L, m, "gc_steps_pause", metrics.gc_steps_pause);
+  setnumfield(L, m, "gc_steps_propagate", metrics.gc_steps_propagate);
+  setnumfield(L, m, "gc_steps_atomic", metrics.gc_steps_atomic);
+  setnumfield(L, m, "gc_steps_sweepstring", metrics.gc_steps_sweepstring);
+  setnumfield(L, m, "gc_steps_sweep", metrics.gc_steps_sweep);
+  setnumfield(L, m, "gc_steps_finalize", metrics.gc_steps_finalize);
+
+  setnumfield(L, m, "jit_snap_restore", metrics.jit_snap_restore);
+  setnumfield(L, m, "jit_trace_abort", metrics.jit_trace_abort);
+  setnumfield(L, m, "jit_mcode_size", metrics.jit_mcode_size);
+  setnumfield(L, m, "jit_trace_num", metrics.jit_trace_num);
+
+  return 1;
+}
+
+/* ------------------------------------------------------------------------ */
+
+#include "lj_libdef.h"
+
+LUALIB_API int luaopen_misc(struct lua_State *L)
+{
+  LJ_LIB_REG(L, LUAM_MISCLIBNAME, misc);
+  return 1;
+}
diff --git a/src/lj_mapi.c b/src/lj_mapi.c
new file mode 100644
index 0000000..7645a44
--- /dev/null
+++ b/src/lj_mapi.c
@@ -0,0 +1,61 @@
+/*
+** Miscellaneous public C API extensions.
+**
+** Major portions taken verbatim or adapted from the LuaVela.
+** Copyright (C) 2015-2019 IPONWEB Ltd.
+*/
+
+#include "lua.h"
+#include "lmisclib.h"
+
+#include "lj_obj.h"
+#include "lj_dispatch.h"
+
+#if LJ_HASJIT
+#include "lj_jit.h"
+#endif
+
+LUAMISC_API void luaM_metrics(lua_State *L, struct luam_Metrics *metrics)
+{
+  lua_assert(metrics != NULL);
+  global_State *g = G(L);
+  GCState *gc = &g->gc;
+#if LJ_HASJIT
+  jit_State *J = G2J(g);
+#endif
+
+  metrics->strhash_hit = g->strhash_hit;
+  metrics->strhash_miss = g->strhash_miss;
+
+  metrics->gc_strnum = g->strnum;
+  metrics->gc_tabnum = gc->tabnum;
+  metrics->gc_udatanum = gc->udatanum;
+#if LJ_HASFFI
+  metrics->gc_cdatanum = gc->cdatanum;
+#else
+  metrics->gc_cdatanum = 0;
+#endif
+
+  metrics->gc_total = gc->total;
+  metrics->gc_freed = gc->freed;
+  metrics->gc_allocated = gc->allocated;
+
+  metrics->gc_steps_pause = gc->state_count[GCSpause];
+  metrics->gc_steps_propagate = gc->state_count[GCSpropagate];
+  metrics->gc_steps_atomic = gc->state_count[GCSatomic];
+  metrics->gc_steps_sweepstring = gc->state_count[GCSsweepstring];
+  metrics->gc_steps_sweep = gc->state_count[GCSsweep];
+  metrics->gc_steps_finalize = gc->state_count[GCSfinalize];
+
+#if LJ_HASJIT
+  metrics->jit_snap_restore = J->nsnaprestore;
+  metrics->jit_trace_abort = J->ntraceabort;
+  metrics->jit_mcode_size = J->szallmcarea;
+  metrics->jit_trace_num = J->tracenum;
+#else
+  metrics->jit_snap_restore = 0;
+  metrics->jit_trace_abort = 0;
+  metrics->jit_mcode_size = 0;
+  metrics->jit_trace_num = 0;
+#endif
+}
diff --git a/src/ljamalg.c b/src/ljamalg.c
index f1f2862..371bbb6 100644
--- a/src/ljamalg.c
+++ b/src/ljamalg.c
@@ -48,6 +48,7 @@
 #include "lj_strfmt.c"
 #include "lj_strfmt_num.c"
 #include "lj_api.c"
+#include "lj_mapi.c"
 #include "lj_profile.c"
 #include "lj_lex.c"
 #include "lj_parse.c"
@@ -93,5 +94,6 @@
 #include "lib_bit.c"
 #include "lib_jit.c"
 #include "lib_ffi.c"
+#include "lib_misc.c"
 #include "lib_init.c"
 
diff --git a/src/lmisclib.h b/src/lmisclib.h
new file mode 100644
index 0000000..e2d1909
--- /dev/null
+++ b/src/lmisclib.h
@@ -0,0 +1,64 @@
+/*
+** Miscellaneous public C API extensions.
+**
+** Major portions taken verbatim or adapted from the LuaVela.
+** Copyright (C) 2015-2019 IPONWEB Ltd.
+*/
+
+#ifndef _LMISCLIB_H
+#define _LMISCLIB_H
+
+#include "lua.h"
+
+/* API for obtaining various platform metrics. */
+
+struct luam_Metrics {
+  /* Strings amount found in string hash instead of allocation of new one. */
+  size_t strhash_hit;
+  /* Strings amount allocated and put into string hash. */
+  size_t strhash_miss;
+
+  /* Amount of allocated string objects. */
+  size_t gc_strnum;
+  /* Amount of allocated table objects. */
+  size_t gc_tabnum;
+  /* Amount of allocated udata objects. */
+  size_t gc_udatanum;
+  /* Amount of allocated cdata objects. */
+  size_t gc_cdatanum;
+
+  /* Memory currently allocated. */
+  size_t gc_total;
+  /* Total amount of freed memory. */
+  size_t gc_freed;
+  /* Total amount of allocated memory. */
+  size_t gc_allocated;
+
+  /* Count of incremental GC steps per state. */
+  size_t gc_steps_pause;
+  size_t gc_steps_propagate;
+  size_t gc_steps_atomic;
+  size_t gc_steps_sweepstring;
+  size_t gc_steps_sweep;
+  size_t gc_steps_finalize;
+
+  /*
+  ** Overall number of snap restores (amount of guard assertions
+  ** leading to stopping trace executions and trace exits,
+  ** that are not stitching with other traces).
+  */
+  size_t jit_snap_restore;
+  /* Overall number of abort traces. */
+  size_t jit_trace_abort;
+  /* Total size of all allocated machine code areas. */
+  size_t jit_mcode_size;
+  /* Amount of JIT traces. */
+  unsigned int jit_trace_num;
+};
+
+LUAMISC_API void luaM_metrics(lua_State *L, struct luam_Metrics *metrics);
+
+#define LUAM_MISCLIBNAME "misc"
+LUALIB_API int luaopen_misc(lua_State *L);
+
+#endif /* _LMISCLIB_H */
diff --git a/src/luaconf.h b/src/luaconf.h
index 60cb928..8029040 100644
--- a/src/luaconf.h
+++ b/src/luaconf.h
@@ -144,6 +144,7 @@
 #endif
 
 #define LUALIB_API	LUA_API
+#define LUAMISC_API	LUA_API
 
 /* Support for internal assertions. */
 #if defined(LUA_USE_ASSERT) || defined(LUA_USE_APICHECK)
diff --git a/test/clib-misclib-getmetrics.test.lua b/test/clib-misclib-getmetrics.test.lua
new file mode 100755
index 0000000..34adaba
--- /dev/null
+++ b/test/clib-misclib-getmetrics.test.lua
@@ -0,0 +1,174 @@
+#!/usr/bin/env tarantool
+
+local file = debug.getinfo(1, "S").source:sub(2)
+local filepath = file:match("(.*/)")
+local soext = jit.os == "OSX" and "dylib" or "so"
+package.cpath = filepath..'clib-misclib-getmetrics/?.'..soext..";"
+                ..package.cpath
+
+local jit_opt_default_hotloop = 56
+local jit_opt_default_hotexit = 10
+local jit_opt_default_level = 3
+local jit_opt_default_minstitch = 0
+
+local tap = require('tap')
+
+local test = tap.test("clib-misc-getmetrics")
+test:plan(10)
+
+local testgetmetrics = require("testgetmetrics")
+
+test:ok(testgetmetrics.base())
+test:ok(testgetmetrics.gc_allocated_freed())
+test:ok(testgetmetrics.gc_steps())
+
+test:ok(testgetmetrics.objcount(function()
+    local table_new = require("table.new")
+    local ffi = require("ffi")
+
+    jit.opt.start(0)
+
+    local placeholder = {
+        str = {},
+        tab = {},
+        udata = {},
+        cdata = {},
+    }
+
+    -- Separate objects creations to separate jit traces.
+    for i = 1, 1000 do
+        table.insert(placeholder.str, tostring(i))
+    end
+
+    for i = 1, 1000 do
+        table.insert(placeholder.tab, table_new(i, 0))
+    end
+
+    for i = 1, 1000 do
+        table.insert(placeholder.udata, newproxy())
+    end
+
+    for i = 1, 1000 do
+        -- Check counting of VLA/VLS/aligned cdata.
+        table.insert(placeholder.cdata, ffi.new("char[?]", 4))
+    end
+
+    for i = 1, 1000 do
+        -- Check counting of non-VLA/VLS/aligned cdata.
+        table.insert(placeholder.cdata, ffi.new("uint64_t", i))
+    end
+
+    placeholder = nil
+    -- Restore default jit settings.
+    jit.opt.start(jit_opt_default_level)
+end))
+
+-- Compiled loop with a direct exit to the interpreter.
+test:ok(testgetmetrics.snap_restores(function()
+    jit.opt.start(0, "hotloop=2")
+
+    local old_metrics = misc.getmetrics()
+
+    local sum = 0
+    for i = 1, 20 do
+        sum = sum + i
+    end
+
+    local new_metrics = misc.getmetrics()
+
+    -- Restore default jit settings.
+    jit.opt.start(jit_opt_default_level, "hotloop="..jit_opt_default_hotloop)
+
+    -- A single snapshot restoration happened on loop finish.
+    return 1
+end))
+
+-- Compiled loop with a side exit which does not get compiled.
+test:ok(testgetmetrics.snap_restores(function()
+    jit.opt.start(0, "hotloop=2", "hotexit=2", "minstitch=15")
+
+    local function foo(i)
+        -- math.fmod is not yet compiled!
+        return i <= 5 and i or math.fmod(i, 11)
+    end
+
+    local sum = 0
+    for i = 1, 10 do
+        sum = sum + foo(i)
+    end
+
+    -- Restore default jit settings.
+    jit.opt.start(jit_opt_default_level, "hotloop="..jit_opt_default_hotloop,
+                  "hotexit="..jit_opt_default_hotexit,
+                  "minstitch="..jit_opt_default_minstitch)
+
+    -- Side exits from the root trace could not get compiled.
+    return 5
+end))
+
+-- Compiled loop with a side exit which gets compiled.
+test:ok(testgetmetrics.snap_restores(function()
+    -- Optimization level is important here as `loop` optimization
+    -- may unroll the loop body and insert +1 side exit.
+    jit.opt.start(0, "hotloop=5", "hotexit=5")
+
+    local function foo(i)
+        return i <= 10 and i or tostring(i)
+    end
+
+    local sum = 0
+    for i = 1, 20 do
+        sum = sum + foo(i)
+    end
+
+    -- Restore default jit settings.
+    jit.opt.start(jit_opt_default_level, "hotloop="..jit_opt_default_hotloop,
+                  "hotexit="..jit_opt_default_hotexit)
+
+    -- 5 side exits to the interpreter before trace gets hot
+    -- and compiled
+    -- 1 side exit on loop end
+    return 6
+end))
+
+-- Compiled scalar trace with a direct exit to the interpreter.
+test:ok(testgetmetrics.snap_restores(function()
+    -- For calls it will be 2 * hotloop (see lj_dispatch.{c,h}
+    -- and hotcall@vm_*.dasc).
+    jit.opt.start(3, "hotloop=2", "hotexit=3")
+
+    local function foo(i)
+        return i <= 15 and i or tostring(i)
+    end
+
+    foo(1)  -- interp only
+    foo(2)  -- interp only
+    foo(3)  -- interp only
+    foo(4)  -- compile trace during this call
+    foo(5)  -- follow the trace
+    foo(6)  -- follow the trace
+    foo(7)  -- follow the trace
+    foo(8)  -- follow the trace
+    foo(9)  -- follow the trace
+    foo(10) -- follow the trace
+
+    -- Simply 2 side exits from the trace:
+    foo(20)
+    foo(21)
+
+    -- Restore default jit settings.
+    jit.opt.start(jit_opt_default_level, "hotloop="..jit_opt_default_hotloop,
+                  "hotexit="..jit_opt_default_hotexit)
+    return 2
+end))
+
+test:ok(testgetmetrics.strhash())
+
+test:ok(testgetmetrics.tracenum_base(function()
+    local sum = 0
+    for i = 1, 200 do
+        sum = sum + i
+    end
+    -- Compiled only 1 loop as new trace.
+    return 1
+end))
diff --git a/test/clib-misclib-getmetrics/CMakeLists.txt b/test/clib-misclib-getmetrics/CMakeLists.txt
new file mode 100644
index 0000000..e7cc8f8
--- /dev/null
+++ b/test/clib-misclib-getmetrics/CMakeLists.txt
@@ -0,0 +1 @@
+build_lualib(testgetmetrics testgetmetrics.c)
diff --git a/test/clib-misclib-getmetrics/testgetmetrics.c b/test/clib-misclib-getmetrics/testgetmetrics.c
new file mode 100644
index 0000000..32802d2
--- /dev/null
+++ b/test/clib-misclib-getmetrics/testgetmetrics.c
@@ -0,0 +1,242 @@
+#include <lua.h>
+#include <luajit.h>
+#include <lauxlib.h>
+
+#include <lmisclib.h>
+
+#include <assert.h>
+
+static int base(lua_State *L)
+{
+	struct luam_Metrics metrics;
+	luaM_metrics(L, &metrics);
+
+	/* Just check API. */
+	assert((ssize_t)metrics.strhash_hit >= 0);
+	assert((ssize_t)metrics.strhash_miss >= 0);
+
+	assert((ssize_t)metrics.gc_strnum >= 0);
+	assert((ssize_t)metrics.gc_tabnum >= 0);
+	assert((ssize_t)metrics.gc_udatanum >= 0);
+	assert((ssize_t)metrics.gc_cdatanum >= 0);
+
+	assert((ssize_t)metrics.gc_total >= 0);
+	assert((ssize_t)metrics.gc_freed >= 0);
+	assert((ssize_t)metrics.gc_allocated >= 0);
+
+	assert((ssize_t)metrics.gc_steps_pause >= 0);
+	assert((ssize_t)metrics.gc_steps_propagate >= 0);
+	assert((ssize_t)metrics.gc_steps_atomic >= 0);
+	assert((ssize_t)metrics.gc_steps_sweepstring >= 0);
+	assert((ssize_t)metrics.gc_steps_sweep >= 0);
+	assert((ssize_t)metrics.gc_steps_finalize >= 0);
+
+	assert((ssize_t)metrics.jit_snap_restore >= 0);
+	assert((ssize_t)metrics.jit_trace_abort >= 0);
+	assert((ssize_t)metrics.jit_mcode_size >= 0);
+	assert((ssize_t)metrics.jit_trace_num >= 0);
+
+	lua_pushboolean(L, 1);
+	return 1;
+}
+
+static int gc_allocated_freed(lua_State *L)
+{
+	struct luam_Metrics oldm, newm;
+	/* Force up garbage collect all dead objects. */
+	lua_gc(L, LUA_GCCOLLECT, 0);
+
+	luaM_metrics(L, &oldm);
+	/* Simple garbage generation. */
+	if (luaL_dostring(L, "local i = 0 for j = 1, 10 do i = i + j end"))
+		luaL_error(L, "failed to translate Lua code snippet");
+	lua_gc(L, LUA_GCCOLLECT, 0);
+	luaM_metrics(L, &newm);
+	assert(newm.gc_allocated - oldm.gc_allocated > 0);
+	assert(newm.gc_freed - oldm.gc_freed > 0);
+
+	lua_pushboolean(L, 1);
+	return 1;
+}
+
+static int gc_steps(lua_State *L)
+{
+	struct luam_Metrics oldm, newm;
+	/*
+	 * Some garbage has already happened before the next line,
+	 * i.e. during fronted processing lua test chunk.
+	 * Let's put a full garbage collection cycle on top
+	 * of that, and confirm that non-null values are reported
+	 * (we are not yet interested in actual numbers):
+	 */
+	luaM_metrics(L, &oldm);
+	lua_gc(L, LUA_GCCOLLECT, 0);
+	assert(oldm.gc_steps_pause > 0);
+	assert(oldm.gc_steps_propagate > 0);
+	assert(oldm.gc_steps_atomic > 0);
+	assert(oldm.gc_steps_sweepstring > 0);
+	assert(oldm.gc_steps_sweep > 0);
+	/* Nothing to finalize, skipped. */
+	assert(oldm.gc_steps_finalize == 0);
+
+	/*
+	 * As long as we don't create new Lua objects
+	 * consequent call should return the same values:
+	 */
+	luaM_metrics(L, &newm);
+	assert(newm.gc_steps_pause - oldm.gc_steps_pause > 0);
+	assert(newm.gc_steps_propagate - oldm.gc_steps_propagate > 0);
+	assert(newm.gc_steps_atomic - oldm.gc_steps_atomic > 0);
+	assert(newm.gc_steps_sweepstring - oldm.gc_steps_sweepstring > 0);
+	assert(newm.gc_steps_sweep - oldm.gc_steps_sweep > 0);
+	/* Nothing to finalize, skipped. */
+	assert(newm.gc_steps_finalize == 0);
+	oldm = newm;
+
+	/*
+	 * Now the last phase: run full GC once and make sure that
+	 * everything is being reported as expected:
+	 */
+	lua_gc(L, LUA_GCCOLLECT, 0);
+	luaM_metrics(L, &newm);
+	assert(newm.gc_steps_pause - oldm.gc_steps_pause == 1);
+	assert(newm.gc_steps_propagate - oldm.gc_steps_propagate >= 1);
+	assert(newm.gc_steps_atomic - oldm.gc_steps_atomic == 1);
+	assert(newm.gc_steps_sweepstring - oldm.gc_steps_sweepstring >= 1);
+	assert(newm.gc_steps_sweep - oldm.gc_steps_sweep >= 1);
+	/* Nothing to finalize, skipped. */
+	assert(newm.gc_steps_finalize == 0);
+	oldm = newm;
+
+	/*
+	 * Now let's run three GC cycles to ensure that
+	 * zero-to-one transition was not a lucky coincidence.
+	 */
+	lua_gc(L, LUA_GCCOLLECT, 0);
+	lua_gc(L, LUA_GCCOLLECT, 0);
+	lua_gc(L, LUA_GCCOLLECT, 0);
+	luaM_metrics(L, &newm);
+	assert(newm.gc_steps_pause - oldm.gc_steps_pause == 3);
+	assert(newm.gc_steps_propagate - oldm.gc_steps_propagate >= 3);
+	assert(newm.gc_steps_atomic - oldm.gc_steps_atomic == 3);
+	assert(newm.gc_steps_sweepstring - oldm.gc_steps_sweepstring >= 3);
+	assert(newm.gc_steps_sweep - oldm.gc_steps_sweep >= 3);
+	/* Nothing to finalize, skipped. */
+	assert(newm.gc_steps_finalize == 0);
+
+	lua_pushboolean(L, 1);
+	return 1;
+}
+
+static int objcount(lua_State *L)
+{
+	struct luam_Metrics oldm, newm;
+	int n = lua_gettop(L);
+	if (n != 1 || !lua_isfunction(L, 1))
+		luaL_error(L, "incorrect argument: 1 function is required");
+
+	/* Force up garbage collect all dead objects. */
+	lua_gc(L, LUA_GCCOLLECT, 0);
+
+	luaM_metrics(L, &oldm);
+	/* Generate garbage. */
+	lua_call(L, 0, 0);
+	lua_gc(L, LUA_GCCOLLECT, 0);
+	luaM_metrics(L, &newm);
+	assert(newm.gc_strnum - oldm.gc_strnum == 0);
+	assert(newm.gc_tabnum - oldm.gc_tabnum == 0);
+	assert(newm.gc_udatanum - oldm.gc_udatanum == 0);
+	assert(newm.gc_cdatanum - oldm.gc_cdatanum == 0);
+
+	lua_pushboolean(L, 1);
+	return 1;
+}
+
+static int snap_restores(lua_State *L)
+{
+	struct luam_Metrics oldm, newm;
+	int n = lua_gettop(L);
+	if (n != 1 || !lua_isfunction(L, 1))
+		luaL_error(L, "incorrect arguments: 1 function is required");
+
+	luaM_metrics(L, &oldm);
+	/* Generate snapshots. */
+	lua_call(L, 0, 1);
+	n = lua_gettop(L);
+	if (n != 1 || !lua_isnumber(L, 1))
+		luaL_error(L, "incorrect return value: 1 number is required");
+	size_t snap_restores = lua_tonumber(L, 1);
+	luaM_metrics(L, &newm);
+	assert(newm.jit_snap_restore - oldm.jit_snap_restore == snap_restores);
+
+	lua_pushboolean(L, 1);
+	return 1;
+}
+
+static int strhash(lua_State *L)
+{
+	struct luam_Metrics oldm, newm;
+	lua_pushstring(L, "strhash_hit");
+	luaM_metrics(L, &oldm);
+	lua_pushstring(L, "strhash_hit");
+	lua_pushstring(L, "new_str");
+	luaM_metrics(L, &newm);
+	assert(newm.strhash_hit - oldm.strhash_hit == 1);
+	assert(newm.strhash_miss - oldm.strhash_miss == 1);
+	lua_pop(L, 3);
+	lua_pushboolean(L, 1);
+	return 1;
+}
+
+static int tracenum_base(lua_State *L)
+{
+	struct luam_Metrics metrics;
+	int n = lua_gettop(L);
+	if (n != 1 || !lua_isfunction(L, 1))
+		luaL_error(L, "incorrect arguments: 1 function is required");
+
+	luaJIT_setmode(L, 0, LUAJIT_MODE_OFF);
+	luaJIT_setmode(L, 0, LUAJIT_MODE_FLUSH);
+	/* Force up garbage collect all dead objects. */
+	lua_gc(L, LUA_GCCOLLECT, 0);
+
+	luaM_metrics(L, &metrics);
+	assert(metrics.jit_trace_num == 0);
+
+	luaJIT_setmode(L, 0, LUAJIT_MODE_ON);
+	/* Generate traces. */
+	lua_call(L, 0, 1);
+	n = lua_gettop(L);
+	if (n != 1 || !lua_isnumber(L, 1))
+		luaL_error(L, "incorrect return value: 1 number is required");
+	size_t jit_trace_num = lua_tonumber(L, 1);
+	luaM_metrics(L, &metrics);
+	assert(metrics.jit_trace_num == jit_trace_num);
+
+	luaJIT_setmode(L, 0, LUAJIT_MODE_FLUSH);
+	/* Force up garbage collect all dead objects. */
+	lua_gc(L, LUA_GCCOLLECT, 0);
+	luaM_metrics(L, &metrics);
+	assert(metrics.jit_trace_num == 0);
+
+	luaJIT_setmode(L, 0, LUAJIT_MODE_ON);
+	lua_pushboolean(L, 1);
+	return 1;
+}
+
+static const struct luaL_Reg testgetmetrics[] = {
+	{"base", base},
+	{"gc_allocated_freed", gc_allocated_freed},
+	{"gc_steps", gc_steps},
+	{"objcount", objcount},
+	{"snap_restores", snap_restores},
+	{"strhash", strhash},
+	{"tracenum_base", tracenum_base},
+	{NULL, NULL}
+};
+
+LUA_API int luaopen_testgetmetrics(lua_State *L)
+{
+	luaL_register(L, "testgetmetrics", testgetmetrics);
+	return 1;
+}
diff --git a/test/lib-misc-getmetrics.test.lua b/test/lib-misc-getmetrics.test.lua
new file mode 100755
index 0000000..2859e26
--- /dev/null
+++ b/test/lib-misc-getmetrics.test.lua
@@ -0,0 +1,418 @@
+#!/usr/bin/env tarantool
+
+-- This is a part of tarantool/luajit testing suite.
+-- Major portions taken verbatim or adapted from the LuaVela testing suit.
+-- Copyright (C) 2015-2019 IPONWEB Ltd.
+
+local tap = require('tap')
+
+local test = tap.test("lib-misc-getmetrics")
+test:plan(10)
+
+local jit_opt_default_hotloop = 56
+local jit_opt_default_hotexit = 10
+local jit_opt_default_level = 3
+local jit_opt_default_minstitch = 0
+
+-- Test Lua API.
+test:test("base", function(subtest)
+    subtest:plan(19)
+    local metrics = misc.getmetrics()
+    subtest:ok(metrics.strhash_hit >= 0)
+    subtest:ok(metrics.strhash_miss >= 0)
+
+    subtest:ok(metrics.gc_strnum >= 0)
+    subtest:ok(metrics.gc_tabnum >= 0)
+    subtest:ok(metrics.gc_udatanum >= 0)
+    subtest:ok(metrics.gc_cdatanum >= 0)
+
+    subtest:ok(metrics.gc_total >= 0)
+    subtest:ok(metrics.gc_freed >= 0)
+    subtest:ok(metrics.gc_allocated >= 0)
+
+    subtest:ok(metrics.gc_steps_pause >= 0)
+    subtest:ok(metrics.gc_steps_propagate >= 0)
+    subtest:ok(metrics.gc_steps_atomic >= 0)
+    subtest:ok(metrics.gc_steps_sweepstring >= 0)
+    subtest:ok(metrics.gc_steps_sweep >= 0)
+    subtest:ok(metrics.gc_steps_finalize >= 0)
+
+    subtest:ok(metrics.jit_snap_restore >= 0)
+    subtest:ok(metrics.jit_trace_abort >= 0)
+    subtest:ok(metrics.jit_mcode_size >= 0)
+    subtest:ok(metrics.jit_trace_num >= 0)
+end)
+
+test:test("gc-allocated-freed", function(subtest)
+    subtest:plan(1)
+
+    -- Force up garbage collect all dead objects.
+    collectgarbage("collect")
+
+    -- Bump getmetrincs table and string keys allocation.
+    local old_metrics = misc.getmetrics()
+
+    -- Remember allocated size for getmetrics table.
+    old_metrics = misc.getmetrics()
+
+    collectgarbage("collect")
+
+    local new_metrics = misc.getmetrics()
+
+    local diff_alloc = new_metrics.gc_allocated - old_metrics.gc_allocated
+    local getmetrics_alloc = diff_alloc
+
+    -- Do not use test:ok to avoid extra allocated/freed objects.
+    assert(getmetrics_alloc > 0, "count allocated table for getmetrics")
+    old_metrics = new_metrics
+
+    -- NB: Avoid operations that use internal global string buffer
+    -- (such as concatenation, string.format, table.concat)
+    -- while creating the string. Otherwise gc_freed/gc_allocated
+    -- relations will not be so straightforward.
+    local str = string.sub("Hello, world", 1, 5)
+    collectgarbage("collect")
+
+    new_metrics = misc.getmetrics()
+
+    diff_alloc = new_metrics.gc_allocated - old_metrics.gc_allocated
+    local diff_freed = new_metrics.gc_freed - old_metrics.gc_freed
+
+    assert(diff_alloc > getmetrics_alloc,
+           "allocated str 'Hello' and table for getmetrics")
+    assert(diff_freed == getmetrics_alloc,
+           "freed old old_metrics")
+    old_metrics = new_metrics
+
+    str = string.sub("Hello, world", 8, -1)
+
+    new_metrics = misc.getmetrics()
+
+    diff_alloc = new_metrics.gc_allocated - old_metrics.gc_allocated
+    diff_freed = new_metrics.gc_freed - old_metrics.gc_freed
+
+    assert(diff_alloc > getmetrics_alloc,
+            "allocated str 'world' and table for getmetrics")
+    assert(diff_freed == 0, "nothing to free without collectgarbage")
+    old_metrics = new_metrics
+    collectgarbage("collect")
+
+    new_metrics = misc.getmetrics()
+
+    diff_alloc = new_metrics.gc_allocated - old_metrics.gc_allocated
+    diff_freed = new_metrics.gc_freed - old_metrics.gc_freed
+
+    assert(diff_alloc == getmetrics_alloc,
+            "allocated last one table for getmetrics")
+    assert(diff_freed > 2 * getmetrics_alloc,
+            "freed str 'Hello' and 2 tables for getmetrics")
+    subtest:ok(true, "no assetion failed")
+end)
+
+test:test("gc-steps", function(subtest)
+    subtest:plan(24)
+
+    -- Some garbage has already created before the next line,
+    -- i.e. during fronted processing this chunk.
+    -- Let's put a full garbage collection cycle on top of that,
+    -- and confirm that non-null values are reported (we are not
+    -- yet interested in actual numbers):
+    collectgarbage("collect")
+    collectgarbage("stop")
+    local oldm = misc.getmetrics()
+    subtest:ok(oldm.gc_steps_pause > 0)
+    subtest:ok(oldm.gc_steps_propagate > 0)
+    subtest:ok(oldm.gc_steps_atomic > 0)
+    subtest:ok(oldm.gc_steps_sweepstring > 0)
+    subtest:ok(oldm.gc_steps_sweep > 0)
+    -- Nothing to finalize, skipped.
+    subtest:is(oldm.gc_steps_finalize, 0)
+
+    -- As long as we stopped the GC, consequent call
+    -- should return the same values:
+    local newm = misc.getmetrics()
+    subtest:is(newm.gc_steps_pause - oldm.gc_steps_pause, 0)
+    subtest:is(newm.gc_steps_propagate - oldm.gc_steps_propagate, 0)
+    subtest:is(newm.gc_steps_atomic - oldm.gc_steps_atomic, 0)
+    subtest:is(newm.gc_steps_sweepstring - oldm.gc_steps_sweepstring, 0)
+    subtest:is(newm.gc_steps_sweep - oldm.gc_steps_sweep, 0)
+    -- Nothing to finalize, skipped.
+    subtest:is(newm.gc_steps_finalize, 0)
+    oldm = newm
+
+    -- Now the last phase: run full GC once and make sure that
+    -- everything is being reported as expected:
+    collectgarbage("collect")
+    collectgarbage("stop")
+    newm = misc.getmetrics()
+    subtest:ok(newm.gc_steps_pause - oldm.gc_steps_pause == 1)
+    subtest:ok(newm.gc_steps_propagate - oldm.gc_steps_propagate >= 1)
+    subtest:ok(newm.gc_steps_atomic - oldm.gc_steps_atomic == 1)
+    subtest:ok(newm.gc_steps_sweepstring - oldm.gc_steps_sweepstring >= 1)
+    subtest:ok(newm.gc_steps_sweep  - oldm.gc_steps_sweep >= 1)
+    -- Nothing to finalize, skipped.
+    subtest:is(newm.gc_steps_finalize, 0)
+    oldm = newm
+
+    -- Now let's run three GC cycles to ensure that zero-to-one
+    -- transition was not a lucky coincidence.
+    collectgarbage("collect")
+    collectgarbage("collect")
+    collectgarbage("collect")
+    collectgarbage("stop")
+    newm = misc.getmetrics()
+    subtest:ok(newm.gc_steps_pause - oldm.gc_steps_pause == 3)
+    subtest:ok(newm.gc_steps_propagate - oldm.gc_steps_propagate >= 3)
+    subtest:ok(newm.gc_steps_atomic - oldm.gc_steps_atomic == 3)
+    subtest:ok(newm.gc_steps_sweepstring - oldm.gc_steps_sweepstring >= 3)
+    subtest:ok(newm.gc_steps_sweep  - oldm.gc_steps_sweep >= 3)
+    -- Nothing to finalize, skipped.
+    subtest:is(newm.gc_steps_finalize, 0)
+end)
+
+test:test("objcount", function(subtest)
+    subtest:plan(4)
+    local table_new = require("table.new")
+    local ffi = require("ffi")
+
+    jit.opt.start(0)
+
+    -- Remove all dead objects.
+    collectgarbage("collect")
+
+    -- Bump strings and table creation.
+    local old_metrics = misc.getmetrics()
+    old_metrics = misc.getmetrics()
+
+    local placeholder = {
+        str = {},
+        tab = {},
+        udata = {},
+        cdata = {},
+    }
+
+    -- Separate objects creations to separate jit traces.
+    for i = 1, 1000 do
+        table.insert(placeholder.str, tostring(i))
+    end
+
+    for i = 1, 1000 do
+        table.insert(placeholder.tab, table_new(i, 0))
+    end
+
+    for i = 1, 1000 do
+        table.insert(placeholder.udata, newproxy())
+    end
+
+    for i = 1, 1000 do
+        -- Check counting of VLA/VLS/aligned cdata.
+        table.insert(placeholder.cdata, ffi.new("char[?]", 4))
+    end
+
+    for i = 1, 1000 do
+        -- Check counting of non-VLA/VLS/aligned cdata.
+        table.insert(placeholder.cdata, ffi.new("uint64_t", i))
+    end
+
+    placeholder = nil
+    collectgarbage("collect")
+    local new_metrics = misc.getmetrics()
+
+    -- Check that amount of objects not increased.
+    subtest:is(new_metrics.gc_strnum, old_metrics.gc_strnum,
+               "strnum don't change")
+    subtest:is(new_metrics.gc_tabnum, old_metrics.gc_tabnum,
+               "tabnum don't change")
+    subtest:is(new_metrics.gc_udatanum, old_metrics.gc_udatanum,
+               "udatanum don't change")
+    subtest:is(new_metrics.gc_cdatanum, old_metrics.gc_cdatanum,
+               "cdatanum don't change")
+
+    -- Restore default jit settings.
+    jit.opt.start(jit_opt_default_level)
+end)
+
+test:test("snap-restores-direct-loop", function(subtest)
+    -- Compiled loop with a direct exit to the interpreter.
+    subtest:plan(1)
+
+    jit.opt.start(0, "hotloop=2")
+
+    local old_metrics = misc.getmetrics()
+
+    local sum = 0
+    for i = 1, 20 do
+        sum = sum + i
+    end
+
+    local new_metrics = misc.getmetrics()
+
+    -- A single snapshot restoration happened on loop finish:
+    subtest:is(new_metrics.jit_snap_restore - old_metrics.jit_snap_restore, 1)
+
+    -- Restore default jit settings.
+    jit.opt.start(jit_opt_default_level, "hotloop="..jit_opt_default_hotloop)
+end)
+
+test:test("snap-restores-loop-side-exit-non-compiled", function(subtest)
+    -- Compiled loop with a side exit which does not get compiled.
+    subtest:plan(1)
+
+    jit.opt.start(0, "hotloop=2", "hotexit=2", "minstitch=15")
+
+    local function foo(i)
+        -- math.fmod is not yet compiled!
+        return i <= 5 and i or math.fmod(i, 11)
+    end
+
+    local old_metrics = misc.getmetrics()
+    local sum = 0
+    for i = 1, 10 do
+        sum = sum + foo(i)
+    end
+
+    local new_metrics = misc.getmetrics()
+
+    -- Side exits from the root trace could not get compiled.
+    subtest:is(new_metrics.jit_snap_restore - old_metrics.jit_snap_restore, 5)
+
+    -- Restore default jit settings.
+    jit.opt.start(jit_opt_default_level, "hotloop="..jit_opt_default_hotloop,
+                  "hotexit="..jit_opt_default_hotexit,
+                  "minstitch="..jit_opt_default_minstitch)
+end)
+
+test:test("snap-restores-loop-side-exit", function(subtest)
+    -- Compiled loop with a side exit which gets compiled.
+    subtest:plan(1)
+
+    -- Optimization level is important here as `loop` optimization
+    -- may unroll the loop body and insert +1 side exit.
+    jit.opt.start(0, "hotloop=5", "hotexit=5")
+
+    local function foo(i)
+        return i <= 10 and i or tostring(i)
+    end
+
+    local old_metrics = misc.getmetrics()
+    local sum = 0
+    for i = 1, 20 do
+        sum = sum + foo(i)
+    end
+
+    local new_metrics = misc.getmetrics()
+
+    -- 5 side exits to the interpreter before trace gets hot
+    -- and compiled
+    -- 1 side exit on loop end
+    subtest:is(new_metrics.jit_snap_restore - old_metrics.jit_snap_restore, 6)
+
+    -- Restore default jit settings.
+    jit.opt.start(jit_opt_default_level, "hotloop="..jit_opt_default_hotloop,
+                  "hotexit="..jit_opt_default_hotexit)
+end)
+
+test:test("snap-restores-scalar", function(subtest)
+    -- Compiled scalar trace with a direct exit to the interpreter.
+    subtest:plan(2)
+
+    -- For calls it will be 2 * hotloop (see lj_dispatch.{c,h}
+    -- and hotcall@vm_*.dasc).
+    jit.opt.start(3, "hotloop=2", "hotexit=3")
+
+    local function foo(i)
+        return i <= 15 and i or tostring(i)
+    end
+
+    local old_metrics = misc.getmetrics()
+
+    foo(1)  -- interp only
+    foo(2)  -- interp only
+    foo(3)  -- interp only
+    foo(4)  -- compile trace during this call
+    foo(5)  -- follow the trace
+    foo(6)  -- follow the trace
+    foo(7)  -- follow the trace
+    foo(8)  -- follow the trace
+    foo(9)  -- follow the trace
+    foo(10) -- follow the trace
+
+    local new_metrics = misc.getmetrics()
+
+    -- No exits triggering snap restore so far: snapshot
+    -- restoration was inlined into the machine code.
+    subtest:is(new_metrics.jit_snap_restore - old_metrics.jit_snap_restore, 0)
+    old_metrics = new_metrics
+
+    -- Simply 2 side exits from the trace:
+    foo(20)
+    foo(21)
+
+    new_metrics = misc.getmetrics()
+    subtest:is(new_metrics.jit_snap_restore - old_metrics.jit_snap_restore, 2)
+
+    -- Restore default jit settings.
+    jit.opt.start(jit_opt_default_level, "hotloop="..jit_opt_default_hotloop,
+                  "hotexit="..jit_opt_default_hotexit)
+end)
+
+test:test("strhash", function(subtest)
+    subtest:plan(1)
+
+    local old_metrics = misc.getmetrics()
+
+    local new_metrics = misc.getmetrics()
+    -- Do not use test:ok to avoid extra strhash hits/misses.
+    assert(new_metrics.strhash_hit - old_metrics.strhash_hit == 19)
+    assert(new_metrics.strhash_miss - old_metrics.strhash_miss == 0)
+    old_metrics = new_metrics
+
+    local str1  = "strhash".."_hit"
+
+    new_metrics = misc.getmetrics()
+    assert(new_metrics.strhash_hit - old_metrics.strhash_hit == 20)
+    assert(new_metrics.strhash_miss - old_metrics.strhash_miss == 0)
+    old_metrics = new_metrics
+
+    new_metrics = misc.getmetrics()
+    assert(new_metrics.strhash_hit - old_metrics.strhash_hit == 19)
+    assert(new_metrics.strhash_miss - old_metrics.strhash_miss == 0)
+    old_metrics = new_metrics
+
+    local str2 = "new".."string"
+
+    new_metrics = misc.getmetrics()
+    assert(new_metrics.strhash_hit - old_metrics.strhash_hit == 19)
+    assert(new_metrics.strhash_miss - old_metrics.strhash_miss == 1)
+    subtest:ok(true, "no assertion failed")
+end)
+
+test:test("tracenum-base", function(subtest)
+    subtest:plan(3)
+
+    jit.off()
+    jit.flush()
+    collectgarbage("collect")
+    local metrics = misc.getmetrics()
+    subtest:is(metrics.jit_trace_num, 0)
+
+    jit.on()
+    local sum = 0
+    for i = 1, 100 do
+        sum = sum + i
+    end
+
+    metrics = misc.getmetrics()
+    subtest:is(metrics.jit_trace_num, 1)
+
+    jit.off()
+    jit.flush()
+    collectgarbage("collect")
+
+    metrics = misc.getmetrics()
+    subtest:is(metrics.jit_trace_num, 0)
+
+    jit.on()
+end)
+
+os.exit(test:check() and 0 or 1)
-- 
2.28.0

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Tarantool-patches] [RFC v4] rfc: luajit metrics
  2020-10-05  6:30 [Tarantool-patches] [PATCH v4 0/2] Implement LuaJIT platform metrics Sergey Kaplun
  2020-10-05  6:30 ` [Tarantool-patches] [PATCH v4 1/2] core: introduce various " Sergey Kaplun
  2020-10-05  6:30 ` [Tarantool-patches] [PATCH v4 2/2] misc: add C and Lua API for " Sergey Kaplun
@ 2020-10-05  6:30 ` Sergey Kaplun
  2020-10-07 14:46   ` Sergey Kaplun
                     ` (2 more replies)
  2020-10-08 17:33 ` [Tarantool-patches] [PATCH v4 0/2] Implement LuaJIT platform metrics Igor Munkin
  2020-10-13 13:17 ` Kirill Yukhin
  4 siblings, 3 replies; 26+ messages in thread
From: Sergey Kaplun @ 2020-10-05  6:30 UTC (permalink / raw)
  To: Igor Munkin, Sergey Ostanevich; +Cc: tarantool-patches

Part of #5187
---

This patch adds RFC to LuaJIT metrics interfaces. Nevertheless name
`misc` for builtin library is not good and should be discussed, because
tons of user modules can use that name for their own libraries.

Branch: https://github.com/tarantool/tarantool/tree/skaplun/5187-luajit-metrics
Issue: https://github.com/tarantool/tarantool/issues/5187

Changes in v2:
- Fixed typos
- Made comments more verbose
- Avoided flushing any of metrics after each call of luaM_metrics()
Changes in v3:
- Added colors count metrics description
- Added description about how metrics are collected
- Added benchmarks
Changes in v3:
- Removed colors count metrics
- Added code block language
- Added how to use section

 doc/rfc/5187-luajit-metrics.md | 299 +++++++++++++++++++++++++++++++++
 1 file changed, 299 insertions(+)
 create mode 100644 doc/rfc/5187-luajit-metrics.md

diff --git a/doc/rfc/5187-luajit-metrics.md b/doc/rfc/5187-luajit-metrics.md
new file mode 100644
index 000000000..02f5b559f
--- /dev/null
+++ b/doc/rfc/5187-luajit-metrics.md
@@ -0,0 +1,299 @@
+# LuaJIT metrics
+
+* **Status**: In progress
+* **Start date**: 17-07-2020
+* **Authors**: Sergey Kaplun @Buristan skaplun@tarantool.org,
+               Igor Munkin @igormunkin imun@tarantool.org,
+               Sergey Ostanevich @sergos sergos@tarantool.org
+* **Issues**: [#5187](https://github.com/tarantool/tarantool/issues/5187)
+
+## Summary
+
+LuaJIT metrics provide extra information about the Lua state. They consist of
+GC metrics (overall amount of objects and memory usage), JIT stats (both
+related to the compiled traces and the engine itself), string hash hits/misses.
+
+## Background and motivation
+
+One can be curious about their application performance. We are going to provide
+various metrics about the several platform subsystems behaviour. GC pressure
+produced by user code can weight down all application performance. Irrelevant
+traces compiled by the JIT engine can just burn CPU time with no benefits as a
+result. String hash collisions can lead to DoS caused by a single request. All
+these metrics should be well monitored by users wanting to improve the
+performance of their application.
+
+## Detailed design
+
+The additional header <lmisclib.h> is introduced to extend the existing LuaJIT
+C API with new interfaces. The first function provided via this header is the
+following:
+
+```c
+/* API for obtaining various platform metrics. */
+
+LUAMISC_API void luaM_metrics(lua_State *L, struct luam_Metrics *metrics);
+```
+
+This function fills the structure pointed to by `metrics` with the corresponding
+metrics related to Lua state anchored to the given coroutine `L`.
+
+The `struct luam_Metrics` has the following definition:
+
+```c
+struct luam_Metrics {
+  /* Strings amount found in string hash instead of allocation of new one. */
+  size_t strhash_hit;
+  /* Strings amount allocated and put into string hash. */
+  size_t strhash_miss;
+
+  /* Amount of allocated string objects. */
+  size_t gc_strnum;
+  /* Amount of allocated table objects. */
+  size_t gc_tabnum;
+  /* Amount of allocated udata objects. */
+  size_t gc_udatanum;
+  /* Amount of allocated cdata objects. */
+  size_t gc_cdatanum;
+
+  /* Memory currently allocated. */
+  size_t gc_total;
+  /* Total amount of freed memory. */
+  size_t gc_freed;
+  /* Total amount of allocated memory. */
+  size_t gc_allocated;
+
+  /* Count of incremental GC steps per state. */
+  size_t gc_steps_pause;
+  size_t gc_steps_propagate;
+  size_t gc_steps_atomic;
+  size_t gc_steps_sweepstring;
+  size_t gc_steps_sweep;
+  size_t gc_steps_finalize;
+
+  /*
+  ** Overall number of snap restores (amount of guard assertions
+  ** leading to stopping trace executions and trace exits,
+  ** that are not stitching with other traces).
+  */
+  size_t jit_snap_restore;
+  /* Overall number of abort traces. */
+  size_t jit_trace_abort;
+  /* Total size of all allocated machine code areas. */
+  size_t jit_mcode_size;
+  /* Amount of JIT traces. */
+  unsigned int jit_trace_num;
+};
+```
+
+Couple of words about how metrics are collected:
+- `strhash_*` -- whenever existing string is returned after attempt to
+  create new string there is incremented `strhash_hit` counter, if new string
+  created then `strhash_miss` is incremented instead.
+- `gc_*num`, `jit_trace_num` -- corresponding counter incremented whenever new
+  object is allocated. When object become garbage collected its counter is
+  decremented.
+- `gc_total`, `gc_allocated`, `gc_freed` -- any time when allocation function
+  is called `gc_allocated` and/or `gc_freed` is increased and `gc_total`
+  increase when memory is allocated or reallocated, decrease when memory is
+  freed.
+- `gc_steps_*` -- corresponding counter increments whenever Garbage Collector
+  starts to execute 1 step of garbage collection.
+- `jit_snap_restore` -- whenever JIT machine exits from the trace and restores
+  interpreter state `jit_snap_restore` counter is incremented.
+- `jit_trace_abort` -- whenever JIT compiler can't record the trace in case NYI
+  BC this counter is incremented.
+- `jit_mcode_size` -- whenever new MCode area is allocated `jit_mcode_size` is
+  increased at corresponding size in bytes. Sets to 0 when all mcode area is
+  freed.
+
+All metrics are collected throughout the platform uptime. These metrics
+increase monotonically and can overflow:
+  - `strhash_hit`
+  - `strhash_miss`
+  - `gc_freed`
+  - `gc_allocated`,
+  - `gc_steps_pause`
+  - `gc_steps_propagate`
+  - `gc_steps_atomic`
+  - `gc_steps_sweepstring`
+  - `gc_steps_sweep`
+  - `gc_steps_finalize`
+  - `jit_snap_restore`
+  - `jit_trace_abort`
+
+They make sense only with comparing with their value from a previous
+`luaM_metrics()` call.
+
+There is also a complement introduced for Lua space -- `misc.getmetrics()`.
+This function is just a wrapper for `luaM_metrics()` returning a Lua table with
+the similar metrics. All returned values are presented as numbers with cast to
+double, so there is a corresponding precision loss. Function usage is quite
+simple:
+```
+$ ./src/tarantool
+Tarantool 2.5.0-267-gbf047ad44
+type 'help' for interactive help
+tarantool> misc.getmetrics()
+---
+- gc_freed: 2245853
+  strhash_hit: 53965
+  gc_steps_atomic: 6
+  strhash_miss: 6879
+  gc_steps_sweepstring: 17920
+  gc_strnum: 5759
+  gc_tabnum: 1813
+  gc_cdatanum: 89
+  jit_snap_restore: 0
+  gc_total: 1370812
+  gc_udatanum: 17
+  gc_steps_finalize: 0
+  gc_allocated: 3616665
+  jit_trace_num: 0
+  gc_steps_sweep: 297
+  jit_trace_abort: 0
+  jit_mcode_size: 0
+  gc_steps_propagate: 10181
+  gc_steps_pause: 7
+...
+```
+
+## How to use
+
+This section describes small example of metrics usage.
+
+For example amount of `strhash_misses` can be shown for tracking of new string
+objects allocations. For example if we add code like:
+```lua
+local function sharded_storage_func(storage_name, func_name)
+    return 'sharded_storage.storages.' .. storage_name .. '.' .. func_name
+end
+```
+increase in slope curve of `strhash_misses` means, that after your changes
+there are more new strings allocating at runtime. Of course slope curve of
+`strhash_misses` _should_ be less than slope curve of `strhash_hits`. If it
+is not, you should refactor your code.
+
+Slope curves of `gc_freed` and `gc_allocated` can be used for analysis of GC
+pressure of your application (less is better).
+
+Also we can check some hacky optimization with these metrics. For example let's
+assume that we have this code snippet:
+```lua
+local old_metrics = misc.getmetrics()
+local t = {}
+for i = 1, 513 do
+    t[i] = i
+end
+local new_metrics = misc.getmetrics()
+local diff =  new_metrics.gc_allocated - old_metrics.gc_allocated
+```
+`diff` equals to 18879 after running of this chunk.
+
+But if we change table initialization to
+```lua
+local table_new = require "table.new"
+local old_metrics = misc.getmetrics()
+local t = table_new(513,0)
+for i = 1, 513 do
+    t[i] = i
+end
+local new_metrics = misc.getmetrics()
+local diff =  new_metrics.gc_allocated - old_metrics.gc_allocated
+```
+`diff` shows us only 5895.
+
+Slope curves of `gc_steps_*` can be used for tracking of GC pressure too. For
+long time observations you will see periodic increment for `gc_steps_*` metrics
+-- for example longer period of `gc_steps_atomic` increment is better. Also
+additional amount of `gc_steps_propagate` in one period can be used to
+indirectly estimate amount of objects.
+
+Amount of `gc_*num` is useful for control of memory leaks -- totally amount of
+these objects should not growth nonstop (you can also track `gc_total` for
+this).  Also `jit_mcode_size` can be used for tracking amount of allocated
+memory for traces machine code.
+
+Slope curves of `jit_trace_abort` shows how many times trace hasn't been
+compiled when the attempt was made (less is better).
+
+Amount of `gc_trace_num` is shown how much traces was generated (_usually_
+more is better).
+
+And the last one -- `gc_snap_restores` can be used for estimation when LuaJIT
+is stop trace execution. If slope curves of this metric growth after changing
+old code it can mean performance degradation.
+
+Assumes that we have code like this:
+```lua
+local function foo(i)
+    return i <= 5 and i or tostring(i)
+end
+-- minstitch option needs to emulate nonstitching behaviour
+jit.opt.start(0, "hotloop=2", "hotexit=2", "minstitch=15")
+
+local sum = 0
+local old_metrics = misc.getmetrics()
+for i = 1, 10 do
+    sum = sum + foo(i)
+end
+local new_metrics = misc.getmetrics()
+local diff = new_metrics.jit_snap_restore - old_metrics.jit_snap_restore
+```
+`diff` equals 3 (1 side exit on loop end, 2 side exits to the interpreter
+before trace gets hot and compiled) after this chunk of code.
+
+And now we decide to change `foo` function like this:
+```lua
+local function foo(i)
+    -- math.fmod is not yet compiled!
+    return i <= 5 and i or math.fmod(i, 11)
+end
+```
+`diff` equals 6 (1 side exit on loop end, 2 side exits to the interpreter
+before trace gets hot and compiled an 3 side exits from the root trace could
+not get compiled) after the same chunk of code.
+
+## Benchmarks
+
+Benchmarks was taken from repo:
+[LuaJIT-test-cleanup](https://github.com/LuaJIT/LuaJIT-test-cleanup).
+
+Example of usage:
+```bash
+/usr/bin/time -f"array3d %U" ./luajit $BENCH_DIR/array3d.lua  300 >/dev/null
+```
+Taking into account the measurement error ~ 2%, it can be said that there is no
+difference in the performance.
+
+Benchmark results after and before patch (less is better):
+```
+   Benchmark   | AFTER (s) | BEFORE (s)
+---------------+-----------+-----------
+array3d        |   0.21    |   0.20
+binary-trees   |   3.30    |   3.24
+chameneos      |   2.86    |   2.99
+coroutine-ring |   0.98    |   1.02
+euler14-bit    |   1.01    |   1.05
+fannkuch       |   6.74    |   6.81
+fasta          |   8.25    |   8.28
+life           |   0.47    |   0.46
+mandelbrot     |   2.65    |   2.68
+mandelbrot-bit |   1.96    |   1.97
+md5            |   1.58    |   1.54
+nbody          |   1.36    |   1.56
+nsieve         |   2.02    |   2.06
+nsieve-bit     |   1.47    |   1.50
+nsieve-bit-fp  |   4.37    |   4.60
+partialsums    |   0.54    |   0.55
+pidigits-nogmp |   3.46    |   3.46
+ray            |   1.62    |   1.63
+recursive-ack  |   0.19    |   0.20
+recursive-fib  |   1.63    |   1.67
+scimark-fft    |   5.76    |   5.86
+scimark-lu     |   3.58    |   3.64
+scimark-sor    |   2.33    |   2.34
+scimark-sparse |   4.88    |   4.93
+series         |   0.94    |   0.94
+spectral-norm  |   0.94    |   0.97
+```
-- 
2.28.0

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH v4 2/2] misc: add C and Lua API for platform metrics
  2020-10-05  6:30 ` [Tarantool-patches] [PATCH v4 2/2] misc: add C and Lua API for " Sergey Kaplun
@ 2020-10-06 22:17   ` Igor Munkin
  2020-10-07  5:57     ` Igor Munkin
  2020-10-07 14:35     ` Sergey Kaplun
  2020-10-09 14:45   ` Sergey Ostanevich
  1 sibling, 2 replies; 26+ messages in thread
From: Igor Munkin @ 2020-10-06 22:17 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Sergey,

Thanks for the patch! Please consider my comments below.

On 05.10.20, Sergey Kaplun wrote:
> This patch introduces both C and Lua API for LuaJIT platform
> metrics implemented in scope of the previous patch. New <lmisclib.h>
> header provides <luaM_metrics> C interface that fills the given
> <luam_metrics> structure with the platform metrics. Additionally
> <misc> module is loaded to Lua space and provides <getmetrics>
> method that yields the corresponding metrics via table.
> 
> Part of tarantool/tarantool#5187
> ---
>  Makefile                                      |   2 +-
>  src/Makefile                                  |   5 +-
>  src/Makefile.dep                              |  14 +-
>  src/lib_init.c                                |   2 +
>  src/lib_misc.c                                |  72 +++
>  src/lj_mapi.c                                 |  61 +++
>  src/ljamalg.c                                 |   2 +
>  src/lmisclib.h                                |  64 +++
>  src/luaconf.h                                 |   1 +
>  test/clib-misclib-getmetrics.test.lua         | 174 ++++++++
>  test/clib-misclib-getmetrics/CMakeLists.txt   |   1 +
>  test/clib-misclib-getmetrics/testgetmetrics.c | 242 ++++++++++
>  test/lib-misc-getmetrics.test.lua             | 418 ++++++++++++++++++
>  13 files changed, 1050 insertions(+), 8 deletions(-)
>  create mode 100644 src/lib_misc.c
>  create mode 100644 src/lj_mapi.c
>  create mode 100644 src/lmisclib.h
>  create mode 100755 test/clib-misclib-getmetrics.test.lua
>  create mode 100644 test/clib-misclib-getmetrics/CMakeLists.txt
>  create mode 100644 test/clib-misclib-getmetrics/testgetmetrics.c
>  create mode 100755 test/lib-misc-getmetrics.test.lua
> 

<snipped>

> diff --git a/src/lib_misc.c b/src/lib_misc.c
> new file mode 100644
> index 0000000..ef11237
> --- /dev/null
> +++ b/src/lib_misc.c
> @@ -0,0 +1,72 @@

<snipped>

> +#define LJLIB_MODULE_misc
> +
> +LJLIB_CF(misc_getmetrics)
> +{
> +  struct luam_Metrics metrics;
> +  lua_createtable(L, 0, 22);

19 slots are definitely enough.

> +  GCtab *m = tabV(L->top - 1);

Once more: please, declare the variables in the beginning of the block.

> +
> +  luaM_metrics(L, &metrics);
> +

<snipped>

> diff --git a/src/lj_mapi.c b/src/lj_mapi.c
> new file mode 100644
> index 0000000..7645a44
> --- /dev/null
> +++ b/src/lj_mapi.c
> @@ -0,0 +1,61 @@

<snipped>

> +
> +LUAMISC_API void luaM_metrics(lua_State *L, struct luam_Metrics *metrics)
> +{
> +  lua_assert(metrics != NULL);

Ditto.

> +  global_State *g = G(L);
> +  GCState *gc = &g->gc;
> +#if LJ_HASJIT
> +  jit_State *J = G2J(g);
> +#endif
> +

<snipped>

> diff --git a/src/lmisclib.h b/src/lmisclib.h
> new file mode 100644
> index 0000000..e2d1909
> --- /dev/null
> +++ b/src/lmisclib.h

<snipped>

> +struct luam_Metrics {
> +  /* Strings amount found in string hash instead of allocation of new one. */

If I don't know how this metric is collected, I'll never guess what this
comment mean. Let's reword it the following way:
| /*
| ** Number of strings being interned (i.e. the string with the same
| ** payload is found, so a new one is not created/allocated)
| */

> +  size_t strhash_hit;
> +  /* Strings amount allocated and put into string hash. */

Well, does it mean this metric has the same value as gc_strnum?
Not indeed. This is a number of the strings

> +  size_t strhash_miss;
> +
> +  /* Amount of allocated string objects. */
> +  size_t gc_strnum;
> +  /* Amount of allocated table objects. */
> +  size_t gc_tabnum;
> +  /* Amount of allocated udata objects. */
> +  size_t gc_udatanum;
> +  /* Amount of allocated cdata objects. */
> +  size_t gc_cdatanum;
> +
> +  /* Memory currently allocated. */
> +  size_t gc_total;
> +  /* Total amount of freed memory. */
> +  size_t gc_freed;
> +  /* Total amount of allocated memory. */
> +  size_t gc_allocated;
> +
> +  /* Count of incremental GC steps per state. */
> +  size_t gc_steps_pause;
> +  size_t gc_steps_propagate;
> +  size_t gc_steps_atomic;
> +  size_t gc_steps_sweepstring;
> +  size_t gc_steps_sweep;
> +  size_t gc_steps_finalize;
> +
> +  /*
> +  ** Overall number of snap restores (amount of guard assertions
> +  ** leading to stopping trace executions and trace exits,
> +  ** that are not stitching with other traces).

Minor: Strictly saying, it's not a stitching in LuaJIT terms.

> +  */
> +  size_t jit_snap_restore;
> +  /* Overall number of abort traces. */
> +  size_t jit_trace_abort;
> +  /* Total size of all allocated machine code areas. */
> +  size_t jit_mcode_size;
> +  /* Amount of JIT traces. */
> +  unsigned int jit_trace_num;
> +};

<snipped>

> diff --git a/test/clib-misclib-getmetrics.test.lua b/test/clib-misclib-getmetrics.test.lua

Add Github issue to the file name.

> new file mode 100755
> index 0000000..34adaba
> --- /dev/null
> +++ b/test/clib-misclib-getmetrics.test.lua
> @@ -0,0 +1,174 @@
> +#!/usr/bin/env tarantool
> +
> +local file = debug.getinfo(1, "S").source:sub(2)
> +local filepath = file:match("(.*/)")
> +local soext = jit.os == "OSX" and "dylib" or "so"
> +package.cpath = filepath..'clib-misclib-getmetrics/?.'..soext..";"
> +                ..package.cpath

I've already solved this issue, so this part can be easily borrowed from
utils.lua[1]. Consider the following:
| local path = arg[0]:gsub('%.test%.lua', '')
| local suffix = package.cpath:match('?.(%a+);')
| package.cpath = ('%s/?.%s;%s'):format(path, suffix, package.cpath)

> +
> +local jit_opt_default_hotloop = 56
> +local jit_opt_default_hotexit = 10
> +local jit_opt_default_level = 3
> +local jit_opt_default_minstitch = 0

Well, this can be done in a much more elegant way:
| local jit_opt_default = {
|   3, -- level
|   "hotloop=56",
|   "hotexit=10",
|   "minstitch=0",
| }
|
| <snipped>
|
| jit.opt.start(unpack(jit_opt_default))

> +

<snipped>

> +test:ok(testgetmetrics.objcount(function()

<snipped>

> +    for i = 1, 1000 do
> +        table.insert(placeholder.tab, table_new(i, 0))

This can be done via creating { i }, so table.new is excess here.

> +    end

<snipped>

> +-- Compiled loop with a direct exit to the interpreter.
> +test:ok(testgetmetrics.snap_restores(function()
> +    jit.opt.start(0, "hotloop=2")

Why hotloop is 2?

> +

<snipped>

> +-- Compiled loop with a side exit which does not get compiled.
> +test:ok(testgetmetrics.snap_restores(function()
> +    jit.opt.start(0, "hotloop=2", "hotexit=2", "minstitch=15")

Why hotloop is 2?

> +

<snipped>

> +-- Compiled scalar trace with a direct exit to the interpreter.
> +test:ok(testgetmetrics.snap_restores(function()
> +    -- For calls it will be 2 * hotloop (see lj_dispatch.{c,h}
> +    -- and hotcall@vm_*.dasc).
> +    jit.opt.start(3, "hotloop=2", "hotexit=3")

Why hotloop is 2?

> +

<snipped>

> diff --git a/test/clib-misclib-getmetrics/testgetmetrics.c b/test/clib-misclib-getmetrics/testgetmetrics.c

Add Github issue to the directory name.

> new file mode 100644
> index 0000000..32802d2
> --- /dev/null
> +++ b/test/clib-misclib-getmetrics/testgetmetrics.c
> @@ -0,0 +1,242 @@

<snipped>

> +static int base(lua_State *L)
> +{
> +	struct luam_Metrics metrics;
> +	luaM_metrics(L, &metrics);
> +
> +	/* Just check API. */

Minor: it worths to drop a comment why there is a <ssize_t> here. It's
quite obvious, but has come to my mind only in a few seconds.

> +	assert((ssize_t)metrics.strhash_hit >= 0);

<snipped>

> +}
> +
> +static int gc_allocated_freed(lua_State *L)
> +{
> +	struct luam_Metrics oldm, newm;
> +	/* Force up garbage collect all dead objects. */
> +	lua_gc(L, LUA_GCCOLLECT, 0);
> +
> +	luaM_metrics(L, &oldm);
> +	/* Simple garbage generation. */
> +	if (luaL_dostring(L, "local i = 0 for j = 1, 10 do i = i + j end"))
> +		luaL_error(L, "failed to translate Lua code snippet");

Side note: OK, this is a funny example: the code to be executed doesn't
generate the garbage per se, but the machinery running it does. This is
nice, since AFAICS you can calculate the exact amount to allocated and
freed, can't you? Feel free to ignore this comment or consider it as
an extracurricular activity.

> +	lua_gc(L, LUA_GCCOLLECT, 0);
> +	luaM_metrics(L, &newm);
> +	assert(newm.gc_allocated - oldm.gc_allocated > 0);
> +	assert(newm.gc_freed - oldm.gc_freed > 0);
> +
> +	lua_pushboolean(L, 1);
> +	return 1;
> +}
> +
> +static int gc_steps(lua_State *L)
> +{
> +	struct luam_Metrics oldm, newm;
> +	/*
> +	 * Some garbage has already happened before the next line,
> +	 * i.e. during fronted processing lua test chunk.

Typo: s/fronted/frontend/.
Typo: s/lua/Lua/.

> +	 * Let's put a full garbage collection cycle on top
> +	 * of that, and confirm that non-null values are reported
> +	 * (we are not yet interested in actual numbers):
> +	 */
> +	luaM_metrics(L, &oldm);
> +	lua_gc(L, LUA_GCCOLLECT, 0);

OK, this is a misguiding part. At first I was confused with the comment
below that *new values are the same*. Why the difference doesn't equals
to zero if they are the same? Later I found the <lua_gc> call after
<oldm> initialization *but* prior to these asserts. I guess the next
reordering definitely make the code clearer:
| luaM_metrics(L, &oldm);
| assert(oldm.gc_steps_pause > 0);
| <other asserts>
| lua_gc(L, LUA_GCCOLLECT, 0);

> +	assert(oldm.gc_steps_pause > 0);

<snipped>

> +	/*
> +	 * As long as we don't create new Lua objects
> +	 * consequent call should return the same values:

How come? They *are not* the same considering the asserts below (except
<gc_steps_finalize>). The comment is misguiding.

> +	 */
> +	luaM_metrics(L, &newm);
> +	assert(newm.gc_steps_pause - oldm.gc_steps_pause > 0);

<snipped>

> +}
> +
> +static int objcount(lua_State *L)
> +{
> +	struct luam_Metrics oldm, newm;
> +	int n = lua_gettop(L);
> +	if (n != 1 || !lua_isfunction(L, 1))
> +		luaL_error(L, "incorrect argument: 1 function is required");
> +
> +	/* Force up garbage collect all dead objects. */
> +	lua_gc(L, LUA_GCCOLLECT, 0);
> +
> +	luaM_metrics(L, &oldm);
> +	/* Generate garbage. */
> +	lua_call(L, 0, 0);

If loop optimization is disabled, then you can pass the number of
iterations as an argument to the function and check the metrics values
against it. Thoughts?

> +	lua_gc(L, LUA_GCCOLLECT, 0);
> +	luaM_metrics(L, &newm);

<snipped>

> +}

> +static int tracenum_base(lua_State *L)
> +{
> +	struct luam_Metrics metrics;
> +	int n = lua_gettop(L);
> +	if (n != 1 || !lua_isfunction(L, 1))
> +		luaL_error(L, "incorrect arguments: 1 function is required");
> +
> +	luaJIT_setmode(L, 0, LUAJIT_MODE_OFF);

This line looks excess.

> +	luaJIT_setmode(L, 0, LUAJIT_MODE_FLUSH);
> +	/* Force up garbage collect all dead objects. */
> +	lua_gc(L, LUA_GCCOLLECT, 0);
> +
> +	luaM_metrics(L, &metrics);
> +	assert(metrics.jit_trace_num == 0);
> +
> +	luaJIT_setmode(L, 0, LUAJIT_MODE_ON);

<snipped>

> +
> +	luaJIT_setmode(L, 0, LUAJIT_MODE_FLUSH);
> +	/* Force up garbage collect all dead objects. */
> +	lua_gc(L, LUA_GCCOLLECT, 0);
> +	luaM_metrics(L, &metrics);
> +	assert(metrics.jit_trace_num == 0);
> +
> +	luaJIT_setmode(L, 0, LUAJIT_MODE_ON);

This line looks excess.

> +	lua_pushboolean(L, 1);
> +	return 1;
> +}

<snipped>

> diff --git a/test/lib-misc-getmetrics.test.lua b/test/lib-misc-getmetrics.test.lua

Add Github issue to the file name.

> new file mode 100755
> index 0000000..2859e26
> --- /dev/null
> +++ b/test/lib-misc-getmetrics.test.lua
> @@ -0,0 +1,418 @@
> +#!/usr/bin/env tarantool
> +
> +-- This is a part of tarantool/luajit testing suite.
> +-- Major portions taken verbatim or adapted from the LuaVela testing suit.

Typo: s/suit/suite/.

> +-- Copyright (C) 2015-2019 IPONWEB Ltd.

<snipped>

> +test:test("gc-allocated-freed", function(subtest)
> +    subtest:plan(1)
> +
> +    -- Force up garbage collect all dead objects.
> +    collectgarbage("collect")
> +
> +    -- Bump getmetrincs table and string keys allocation.

Typo: s/getmetrincs/getmetrics/.

> +    local old_metrics = misc.getmetrics()
> +
> +    -- Remember allocated size for getmetrics table.
> +    old_metrics = misc.getmetrics()
> +
> +    collectgarbage("collect")
> +
> +    local new_metrics = misc.getmetrics()
> +
> +    local diff_alloc = new_metrics.gc_allocated - old_metrics.gc_allocated
> +    local getmetrics_alloc = diff_alloc

Why the difference can't be assigned right to <getmetrics_alloc>?

> +

<snipped>

> +test:test("gc-steps", function(subtest)
> +    subtest:plan(24)
> +
> +    -- Some garbage has already created before the next line,
> +    -- i.e. during fronted processing this chunk.

Typo: s/fronted/frontend/.

> +    -- Let's put a full garbage collection cycle on top of that,
> +    -- and confirm that non-null values are reported (we are not
> +    -- yet interested in actual numbers):

<snipped>

> +test:test("objcount", function(subtest)

<snipped>

> +    -- Bump strings and table creation.
> +    local old_metrics = misc.getmetrics()

OK, I finally got the math you explained to me offline, but it's too
complex for such simple case. Let's consider the newly allocated table
right in the test assertion.

> +    old_metrics = misc.getmetrics()
> +

<snipped>

> +    for i = 1, 1000 do
> +        table.insert(placeholder.tab, table_new(i, 0))

This can be done via creating { i }, so table.new is excess here.

> +    end
> +

<snipped>

> +test:test("snap-restores-direct-loop", function(subtest)
> +    -- Compiled loop with a direct exit to the interpreter.
> +    subtest:plan(1)
> +
> +    jit.opt.start(0, "hotloop=2")

Why hotloop is 2?

> +

<snipped>

> +test:test("snap-restores-loop-side-exit-non-compiled", function(subtest)
> +    -- Compiled loop with a side exit which does not get compiled.
> +    subtest:plan(1)
> +
> +    jit.opt.start(0, "hotloop=2", "hotexit=2", "minstitch=15")

Why hotloop is 2?

> +

<snipped>

> +test:test("snap-restores-scalar", function(subtest)
> +    -- Compiled scalar trace with a direct exit to the interpreter.
> +    subtest:plan(2)
> +
> +    -- For calls it will be 2 * hotloop (see lj_dispatch.{c,h}
> +    -- and hotcall@vm_*.dasc).
> +    jit.opt.start(3, "hotloop=2", "hotexit=3")

Why hotloop is 2?

> +

<snipped>

> +test:test("tracenum-base", function(subtest)
> +    subtest:plan(3)
> +
> +    jit.off()

This line looks excess.

> +    jit.flush()
> +    collectgarbage("collect")
> +    local metrics = misc.getmetrics()
> +    subtest:is(metrics.jit_trace_num, 0)
> +
> +    jit.on()
> +    local sum = 0
> +    for i = 1, 100 do
> +        sum = sum + i
> +    end
> +
> +    metrics = misc.getmetrics()
> +    subtest:is(metrics.jit_trace_num, 1)
> +
> +    jit.off()

This line looks excess.

> +    jit.flush()
> +    collectgarbage("collect")
> +
> +    metrics = misc.getmetrics()
> +    subtest:is(metrics.jit_trace_num, 0)
> +
> +    jit.on()
> +end)
> +
> +os.exit(test:check() and 0 or 1)
> -- 
> 2.28.0
> 

[1]: https://github.com/tarantool/luajit/blob/tarantool/test/utils.lua

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH v4 2/2] misc: add C and Lua API for platform metrics
  2020-10-06 22:17   ` Igor Munkin
@ 2020-10-07  5:57     ` Igor Munkin
  2020-10-07 14:35     ` Sergey Kaplun
  1 sibling, 0 replies; 26+ messages in thread
From: Igor Munkin @ 2020-10-07  5:57 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Sorry, I forgot to finish the sentence below.

On 07.10.20, Igor Munkin wrote:
> Sergey,

<snipped>

> 
> > +  size_t strhash_hit;
> > +  /* Strings amount allocated and put into string hash. */
> 
> Well, does it mean this metric has the same value as gc_strnum?
> Not indeed. This is a number of the strings

This is a total number of the strings being allocated: if the string
'qq' is created, collected later and created again, this metric is
incremented twice. At the same time the value below just shows the
amount of allocated strings at the moment.

> 
> > +  size_t strhash_miss;
> > +

<snipped>

> 
> -- 
> Best regards,
> IM

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH v4 1/2] core: introduce various platform metrics
  2020-10-05  6:30 ` [Tarantool-patches] [PATCH v4 1/2] core: introduce various " Sergey Kaplun
@ 2020-10-07 14:11   ` Igor Munkin
  2020-10-07 19:55     ` Sergey Kaplun
  0 siblings, 1 reply; 26+ messages in thread
From: Igor Munkin @ 2020-10-07 14:11 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Sergey,

Thanks for the patch! Please consider my comments below.

On 05.10.20, Sergey Kaplun wrote:
> This patch introduces the following counters:
>   - overall amount of allocated tables, cdata and udata objects
>   - number of incremental GC steps grouped by GC state
>   - number of string hashes hits and misses
>   - amount of allocated and freed memory
>   - number of trace aborts and restored snapshots

Looks like you forgot a number of alive traces.

> 
> Interfaces to obtain these metrics via both Lua and C API are
> introduced in the next patch.
> 
> Part of tarantool/tarantool#5187
> ---
>  src/lj_asm.c       |  2 ++
>  src/lj_asm_arm.h   |  7 +++++++
>  src/lj_asm_arm64.h |  7 +++++++
>  src/lj_asm_mips.h  |  4 ++++
>  src/lj_asm_ppc.h   |  3 +++
>  src/lj_asm_x86.h   |  4 ++++
>  src/lj_cdata.c     |  2 ++
>  src/lj_cdata.h     |  2 ++
>  src/lj_gc.c        |  4 ++++
>  src/lj_gc.h        |  6 +-----
>  src/lj_jit.h       |  3 +++
>  src/lj_obj.h       | 28 +++++++++++++++++++++++++---
>  src/lj_snap.c      |  1 +
>  src/lj_state.c     |  2 +-
>  src/lj_str.c       |  5 +++++
>  src/lj_tab.c       |  2 ++
>  src/lj_trace.c     |  6 +++++-
>  src/lj_udata.c     |  2 ++
>  18 files changed, 80 insertions(+), 10 deletions(-)
> 
> diff --git a/src/lj_asm.c b/src/lj_asm.c
> index c2cf5a9..8fb3ccd 100644
> --- a/src/lj_asm.c
> +++ b/src/lj_asm.c
> @@ -2273,6 +2273,7 @@ void lj_asm_trace(jit_State *J, GCtrace *T)
>    as->J = J;
>    as->T = T;
>    J->curfinal = lj_trace_alloc(J->L, T);  /* This copies the IR, too. */
> +  J->tracenum++;

Why didn't you simply put this increment right to <lj_trace_alloc> as
you did for decrement and <lj_trace_free>?

>    as->flags = J->flags;
>    as->loopref = J->loopref;
>    as->realign = NULL;
> @@ -2379,6 +2380,7 @@ void lj_asm_trace(jit_State *J, GCtrace *T)
>      lj_trace_free(J2G(J), J->curfinal);
>      J->curfinal = NULL;  /* In case lj_trace_alloc() OOMs. */
>      J->curfinal = lj_trace_alloc(J->L, T);
> +    J->tracenum++;

Ditto.

>      as->realign = NULL;
>    }
>  

<snipped>

> diff --git a/src/lj_asm_mips.h b/src/lj_asm_mips.h
> index affe7d8..f4b4b5d 100644
> --- a/src/lj_asm_mips.h
> +++ b/src/lj_asm_mips.h
> @@ -1473,11 +1473,15 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>      return;
>    }
>  
> +  /* Code incrementing cdatanum is sparse to avoid mips data hazards. */

Side note: LOL, here you are :)

> +  emit_setgl(as, RID_RET+2, gc.cdatanum);

Well, I glanced a MIPS register-usage convention and AFAICS $4 register
(RID_RET + 2) is a general-purpose (i.e. doesn't store 0 or preserved by
kernel) caller-safe one. Ergo it should be allocated it in a proper way
from scratch set, shouldn't it?

>    /* Initialize gct and ctypeid. lj_mem_newgco() already sets marked. */
>    emit_tsi(as, MIPSI_SB, RID_RET+1, RID_RET, offsetof(GCcdata, gct));
>    emit_tsi(as, MIPSI_SH, RID_TMP, RID_RET, offsetof(GCcdata, ctypeid));
> +  emit_tsi(as, MIPSI_AADDIU, RID_RET+2, RID_RET+2, 1);
>    emit_ti(as, MIPSI_LI, RID_RET+1, ~LJ_TCDATA);
>    emit_ti(as, MIPSI_LI, RID_TMP, id); /* Lower 16 bit used. Sign-ext ok. */
> +  emit_getgl(as, RID_RET+2, gc.cdatanum);
>    args[0] = ASMREF_L;     /* lua_State *L */
>    args[1] = ASMREF_TMP1;  /* MSize size   */
>    asm_gencall(as, ci, args);

<snipped>

> diff --git a/src/lj_obj.h b/src/lj_obj.h
> index f368578..600b68d 100644
> --- a/src/lj_obj.h
> +++ b/src/lj_obj.h
> @@ -571,6 +571,17 @@ typedef enum {
>  #define basemt_obj(g, o)	((g)->gcroot[GCROOT_BASEMT+itypemap(o)])
>  #define mmname_str(g, mm)	(strref((g)->gcroot[GCROOT_MMNAME+(mm)]))
>  
> +/* Garbage collector states. Order matters. */
> +enum {
> +  GCSpause, /* Start a GC cycle and mark the root set.*/
> +  GCSpropagate, /* One gray object is processed. */
> +  GCSatomic, /* Atomic transition from mark to sweep phase. */
> +  GCSsweepstring, /* Sweep one chain of strings. */
> +  GCSsweep, /* Sweep few objects from root. */
> +  GCSfinalize, /* Finalize one userdata or cdata object. */
> +  GCSmax

Adjust the comments considering the code nearby.

> +};
> +
>  typedef struct GCState {
>    GCSize total;		/* Memory currently allocated. */
>    GCSize threshold;	/* Memory threshold. */

<snipped>

> @@ -602,22 +622,24 @@ typedef struct global_State {
>      BloomFilter next[2];
>    } strbloom;
>  #endif
> +  size_t strhash_hit;	/* Strings amount found in string hash. */
> +  size_t strhash_miss;	/* Strings amount allocated and put into string hash. */
>    lua_Alloc allocf;	/* Memory allocator. */
>    void *allocd;		/* Memory allocator data. */
>    GCState gc;		/* Garbage collector. */
> -  volatile int32_t vmstate;  /* VM state or current JIT code trace number. */
>    SBuf tmpbuf;		/* Temporary string buffer. */
>    GCstr strempty;	/* Empty string. */
>    uint8_t stremptyz;	/* Zero terminator of empty string. */
>    uint8_t hookmask;	/* Hook mask. */
>    uint8_t dispatchmode;	/* Dispatch mode. */
>    uint8_t vmevmask;	/* VM event mask. */
> +  int32_t hookcount;	/* Instruction hook countdown. */
>    GCRef mainthref;	/* Link to main thread. */
>    TValue registrytv;	/* Anchor for registry. */
> -  TValue tmptv, tmptv2;	/* Temporary TValues. */
> +  TValue tmptv2, tmptv;	/* Temporary TValues. */
>    Node nilnode;		/* Fallback 1-element hash part (nil key and value). */
>    GCupval uvhead;	/* Head of double-linked list of all open upvalues. */
> -  int32_t hookcount;	/* Instruction hook countdown. */
> +  volatile int32_t vmstate;  /* VM state or current JIT code trace number. */

There is no a single word in the commit message regarding this unclear
change. Please drop a sentence about the rationale for this reordering.

>    int32_t hookcstart;	/* Start count for instruction hook counter. */
>    lua_Hook hookf;	/* Hook function. */
>    lua_CFunction wrapf;	/* Wrapper for C function calls. */

<snipped>

> -- 
> 2.28.0
> 

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH v4 2/2] misc: add C and Lua API for platform metrics
  2020-10-06 22:17   ` Igor Munkin
  2020-10-07  5:57     ` Igor Munkin
@ 2020-10-07 14:35     ` Sergey Kaplun
  2020-10-07 18:23       ` Igor Munkin
  1 sibling, 1 reply; 26+ messages in thread
From: Sergey Kaplun @ 2020-10-07 14:35 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches

Hi Igor! Thanks for the review!

On 07.10.20, Igor Munkin wrote:
> Sergey,
> 
> Thanks for the patch! Please consider my comments below.
> 

<snipped>

> 
> > +#define LJLIB_MODULE_misc
> > +
> > +LJLIB_CF(misc_getmetrics)
> > +{
> > +  struct luam_Metrics metrics;
> > +  lua_createtable(L, 0, 22);
> 
> 19 slots are definitely enough.

Nice catch! Forgot to change this line when was removing colors counting.

> 
> > +  GCtab *m = tabV(L->top - 1);
> 
> Once more: please, declare the variables in the beginning of the block.

Ooops. My bad. Fixed.

> 
> > +
> > +  luaM_metrics(L, &metrics);
> > +
> 
> <snipped>
> 
> > diff --git a/src/lj_mapi.c b/src/lj_mapi.c
> > new file mode 100644
> > index 0000000..7645a44
> > --- /dev/null
> > +++ b/src/lj_mapi.c
> > @@ -0,0 +1,61 @@
> 
> <snipped>
> 
> > +
> > +LUAMISC_API void luaM_metrics(lua_State *L, struct luam_Metrics *metrics)
> > +{
> > +  lua_assert(metrics != NULL);
> 
> Ditto.

And this too.

> 
> > +  global_State *g = G(L);
> > +  GCState *gc = &g->gc;
> > +#if LJ_HASJIT
> > +  jit_State *J = G2J(g);
> > +#endif
> > +

<snipped>

> > +struct luam_Metrics {
> > +  /* Strings amount found in string hash instead of allocation of new one. */
> 
> If I don't know how this metric is collected, I'll never guess what this
> comment mean. Let's reword it the following way:
> | /*
> | ** Number of strings being interned (i.e. the string with the same
> | ** payload is found, so a new one is not created/allocated)
> | */
> 

Thanks! Fixed.

> > +  size_t strhash_hit;
> > +  /* Strings amount allocated and put into string hash. */
> 
> Well, does it mean this metric has the same value as gc_strnum?
> Not indeed. This is a number of the strings

Ok. I rewrote this in the following way:
| /* Total number of strings allocations during the platform lifetime. */

> 
> > +  size_t strhash_miss;

<snipped>

> > +
> > +  /*
> > +  ** Overall number of snap restores (amount of guard assertions
> > +  ** leading to stopping trace executions and trace exits,
> > +  ** that are not stitching with other traces).
> 
> Minor: Strictly saying, it's not a stitching in LuaJIT terms.

Removed misleading comments.

> 
> > +  */
> > +  size_t jit_snap_restore;
> > +  /* Overall number of abort traces. */
> > +  size_t jit_trace_abort;
> > +  /* Total size of all allocated machine code areas. */
> > +  size_t jit_mcode_size;
> > +  /* Amount of JIT traces. */
> > +  unsigned int jit_trace_num;
> > +};
> 
> <snipped>
> 
> > diff --git a/test/clib-misclib-getmetrics.test.lua b/test/clib-misclib-getmetrics.test.lua
> 
> Add Github issue to the file name.

Here and below:
I thought we use gh-xxxx style for bugfix tests. Tests for new
features are named the same as a feature.
Is `misclib-getmetrics-{capi,lapi}.test.lua` good enough?

> 
> > new file mode 100755
> > index 0000000..34adaba
> > --- /dev/null
> > +++ b/test/clib-misclib-getmetrics.test.lua
> > @@ -0,0 +1,174 @@
> > +#!/usr/bin/env tarantool
> > +
> > +local file = debug.getinfo(1, "S").source:sub(2)
> > +local filepath = file:match("(.*/)")
> > +local soext = jit.os == "OSX" and "dylib" or "so"
> > +package.cpath = filepath..'clib-misclib-getmetrics/?.'..soext..";"
> > +                ..package.cpath
> 
> I've already solved this issue, so this part can be easily borrowed from
> utils.lua[1]. Consider the following:
> | local path = arg[0]:gsub('%.test%.lua', '')
> | local suffix = package.cpath:match('?.(%a+);')
> | package.cpath = ('%s/?.%s;%s'):format(path, suffix, package.cpath)
> 

Yep. See now. Thanks! We should add one small function for this
to reuse in other tests later.

> > +
> > +local jit_opt_default_hotloop = 56
> > +local jit_opt_default_hotexit = 10
> > +local jit_opt_default_level = 3
> > +local jit_opt_default_minstitch = 0
> 
> Well, this can be done in a much more elegant way:
> | local jit_opt_default = {
> |   3, -- level
> |   "hotloop=56",
> |   "hotexit=10",
> |   "minstitch=0",
> | }
> |
> | <snipped>
> |
> | jit.opt.start(unpack(jit_opt_default))
> 

Yep. Really looks better.

> > +

<snipped>

> > +    for i = 1, 1000 do
> > +        table.insert(placeholder.tab, table_new(i, 0))
> 
> This can be done via creating { i }, so table.new is excess here.

Yes! Thanks! Added here and below.

> 
> > +    end
> 
> <snipped>
> 
> > +-- Compiled loop with a direct exit to the interpreter.
> > +test:ok(testgetmetrics.snap_restores(function()
> > +    jit.opt.start(2, "hotloop=2")
> 
> Why hotloop is 2?

It has no any sacral significance. To avoid confusion, I've changed the
value to 1 here and below.

> 

<snipped>

> 
> > diff --git a/test/clib-misclib-getmetrics/testgetmetrics.c b/test/clib-misclib-getmetrics/testgetmetrics.c
> 

<snipped>

> 
> > +static int base(lua_State *L)
> > +{
> > +	struct luam_Metrics metrics;
> > +	luaM_metrics(L, &metrics);
> > +
> > +	/* Just check API. */
> 
> Minor: it worths to drop a comment why there is a <ssize_t> here. It's
> quite obvious, but has come to my mind only in a few seconds.

Sure! No problem!

> 
> > +	assert((ssize_t)metrics.strhash_hit >= 0);
> 
> <snipped>
> 
> > +}
> > +
> > +static int gc_allocated_freed(lua_State *L)
> > +{
> > +	struct luam_Metrics oldm, newm;
> > +	/* Force up garbage collect all dead objects. */
> > +	lua_gc(L, LUA_GCCOLLECT, 0);
> > +
> > +	luaM_metrics(L, &oldm);
> > +	/* Simple garbage generation. */
> > +	if (luaL_dostring(L, "local i = 0 for j = 1, 10 do i = i + j end"))
> > +		luaL_error(L, "failed to translate Lua code snippet");
> 
> Side note: OK, this is a funny example: the code to be executed doesn't
> generate the garbage per se, but the machinery running it does. This is

Yes, that was the idea.

> nice, since AFAICS you can calculate the exact amount to allocated and
> freed, can't you? Feel free to ignore this comment or consider it as
> an extracurricular activity.

I didn't set the exact value on purpose. It seemed to me that we should
not be tied to the behavior of the corresponding machinery, that can to be
changed in future.

> 
> > +	lua_gc(L, LUA_GCCOLLECT, 0);
> > +	luaM_metrics(L, &newm);
> > +	assert(newm.gc_allocated - oldm.gc_allocated > 0);
> > +	assert(newm.gc_freed - oldm.gc_freed > 0);
> > +
> > +	lua_pushboolean(L, 1);
> > +	return 1;
> > +}
> > +
> > +static int gc_steps(lua_State *L)
> > +{
> > +	struct luam_Metrics oldm, newm;
> > +	/*
> > +	 * Some garbage has already happened before the next line,
> > +	 * i.e. during fronted processing lua test chunk.
> 
> Typo: s/fronted/frontend/.
> Typo: s/lua/Lua/.

Fixed here and below.

> 
> > +	 * Let's put a full garbage collection cycle on top
> > +	 * of that, and confirm that non-null values are reported
> > +	 * (we are not yet interested in actual numbers):
> > +	 */
> > +	luaM_metrics(L, &oldm);
> > +	lua_gc(L, LUA_GCCOLLECT, 0);
> 
> OK, this is a misguiding part. At first I was confused with the comment
> below that *new values are the same*. Why the difference doesn't equals
> to zero if they are the same? Later I found the <lua_gc> call after
> <oldm> initialization *but* prior to these asserts. I guess the next
> reordering definitely make the code clearer:
> | luaM_metrics(L, &oldm);
> | assert(oldm.gc_steps_pause > 0);
> | <other asserts>
> | lua_gc(L, LUA_GCCOLLECT, 0);
> 
> > +	assert(oldm.gc_steps_pause > 0);
> 
> <snipped>
> 
> > +	/*
> > +	 * As long as we don't create new Lua objects
> > +	 * consequent call should return the same values:
> 
> How come? They *are not* the same considering the asserts below (except
> <gc_steps_finalize>). The comment is misguiding.
> 

This is bad mistake of mine. Here I want to check that results are the
same after two consecutive calls. Fixed. Thanks for your question!

> > +	 */
> > +	luaM_metrics(L, &newm);
> > +	assert(newm.gc_steps_pause - oldm.gc_steps_pause > 0);
> 
> <snipped>
> 
> > +}
> > +
> > +static int objcount(lua_State *L)
> > +{
> > +	struct luam_Metrics oldm, newm;
> > +	int n = lua_gettop(L);
> > +	if (n != 1 || !lua_isfunction(L, 1))
> > +		luaL_error(L, "incorrect argument: 1 function is required");
> > +
> > +	/* Force up garbage collect all dead objects. */
> > +	lua_gc(L, LUA_GCCOLLECT, 0);
> > +
> > +	luaM_metrics(L, &oldm);
> > +	/* Generate garbage. */
> > +	lua_call(L, 0, 0);
> 
> If loop optimization is disabled, then you can pass the number of
> iterations as an argument to the function and check the metrics values
> against it. Thoughts?

But there are more objects besides base generated one (for example
strings like "uint64_t", "ffi").
But we can steal set number of iterations here this makes test more flexible.
I've added it to patch.

> 
> > +	lua_gc(L, LUA_GCCOLLECT, 0);
> > +	luaM_metrics(L, &newm);
> 
> <snipped>
> 
> > +}
> 
> > +static int tracenum_base(lua_State *L)
> > +{
> > +	struct luam_Metrics metrics;
> > +	int n = lua_gettop(L);
> > +	if (n != 1 || !lua_isfunction(L, 1))
> > +		luaL_error(L, "incorrect arguments: 1 function is required");
> > +
> > +	luaJIT_setmode(L, 0, LUAJIT_MODE_OFF);
> 
> This line looks excess.

Removed here and below.

> 
> > +	luaJIT_setmode(L, 0, LUAJIT_MODE_FLUSH);
> > +	/* Force up garbage collect all dead objects. */
> > +	lua_gc(L, LUA_GCCOLLECT, 0);
> > +
> > +	luaM_metrics(L, &metrics);
> > +	assert(metrics.jit_trace_num == 0);
> > +
> > +	luaJIT_setmode(L, 0, LUAJIT_MODE_ON);
> 
> <snipped>
> 
> > +
> > +	luaJIT_setmode(L, 0, LUAJIT_MODE_FLUSH);
> > +	/* Force up garbage collect all dead objects. */
> > +	lua_gc(L, LUA_GCCOLLECT, 0);
> > +	luaM_metrics(L, &metrics);
> > +	assert(metrics.jit_trace_num == 0);
> > +
> > +	luaJIT_setmode(L, 0, LUAJIT_MODE_ON);
> 
> This line looks excess.

Removed here and below.

> 
> > +	lua_pushboolean(L, 1);
> > +	return 1;
> > +}
> 
> <snipped>
> 
> > diff --git a/test/lib-misc-getmetrics.test.lua b/test/lib-misc-getmetrics.test.lua
> 
> Add Github issue to the file name.
> 
> > new file mode 100755
> > index 0000000..2859e26
> > --- /dev/null
> > +++ b/test/lib-misc-getmetrics.test.lua
> > @@ -0,0 +1,418 @@
> > +#!/usr/bin/env tarantool
> > +
> > +-- This is a part of tarantool/luajit testing suite.
> > +-- Major portions taken verbatim or adapted from the LuaVela testing suit.
> 
> Typo: s/suit/suite/.

Thanks! Fixed.

> 
> > +-- Copyright (C) 2015-2019 IPONWEB Ltd.
> 
> <snipped>
> 
> > +test:test("gc-allocated-freed", function(subtest)
> > +    subtest:plan(1)
> > +
> > +    -- Force up garbage collect all dead objects.
> > +    collectgarbage("collect")
> > +
> > +    -- Bump getmetrincs table and string keys allocation.
> 
> Typo: s/getmetrincs/getmetrics/.

Thanks! Fixed!

> 
> > +    local old_metrics = misc.getmetrics()
> > +
> > +    -- Remember allocated size for getmetrics table.
> > +    old_metrics = misc.getmetrics()
> > +
> > +    collectgarbage("collect")
> > +
> > +    local new_metrics = misc.getmetrics()
> > +
> > +    local diff_alloc = new_metrics.gc_allocated - old_metrics.gc_allocated
> > +    local getmetrics_alloc = diff_alloc
> 
> Why the difference can't be assigned right to <getmetrics_alloc>?

Yep, it will be more clear. Thanks!

> 

<snipped>

> 
> > +    -- Bump strings and table creation.
> > +    local old_metrics = misc.getmetrics()
> 
> OK, I finally got the math you explained to me offline, but it's too
> complex for such simple case. Let's consider the newly allocated table
> right in the test assertion.
> 

Yep, this looks confusing. Fixed.

> > +    old_metrics = misc.getmetrics()

<snipped>

> 
> > +test:test("tracenum-base", function(subtest)
> > +    subtest:plan(3)
> > +
> > +    jit.off()
> 
> This line looks excess.

Yep. Removed.

> 
> > +    jit.flush()
> > +    collectgarbage("collect")
> > +    local metrics = misc.getmetrics()
> > +    subtest:is(metrics.jit_trace_num, 0)
> > +
> > +    jit.on()

And this too.

> > +    local sum = 0
> > +    for i = 1, 100 do
> > +        sum = sum + i
> > +    end
> > +
> > +    metrics = misc.getmetrics()
> > +    subtest:is(metrics.jit_trace_num, 1)
> > +
> > +    jit.off()
> 
> This line looks excess.

Yep. Removed.

> 
> > +    jit.flush()
> > +    collectgarbage("collect")
> > +
> > +    metrics = misc.getmetrics()
> > +    subtest:is(metrics.jit_trace_num, 0)
> > +
> > +    jit.on()

And this too.

> > +end)
> > +
> > +os.exit(test:check() and 0 or 1)
> > -- 
> > 2.28.0
> > 
> 
> [1]: https://github.com/tarantool/luajit/blob/tarantool/test/utils.lua
> 
> -- 
> Best regards,
> IM

See iterative patch in the bottom. Branch force-pushed.

===================================================================
diff --git a/src/lib_misc.c b/src/lib_misc.c
index ef11237..6f7b9a9 100644
--- a/src/lib_misc.c
+++ b/src/lib_misc.c
@@ -29,8 +29,10 @@ static LJ_AINLINE void setnumfield(struct lua_State *L, GCtab *t,
 LJLIB_CF(misc_getmetrics)
 {
   struct luam_Metrics metrics;
-  lua_createtable(L, 0, 22);
-  GCtab *m = tabV(L->top - 1);
+  GCtab *m;
+
+  lua_createtable(L, 0, 19);
+  m = tabV(L->top - 1);
 
   luaM_metrics(L, &metrics);
 
diff --git a/src/lj_mapi.c b/src/lj_mapi.c
index 7645a44..13737e0 100644
--- a/src/lj_mapi.c
+++ b/src/lj_mapi.c
@@ -17,13 +17,14 @@
 
 LUAMISC_API void luaM_metrics(lua_State *L, struct luam_Metrics *metrics)
 {
-  lua_assert(metrics != NULL);
   global_State *g = G(L);
   GCState *gc = &g->gc;
 #if LJ_HASJIT
   jit_State *J = G2J(g);
 #endif
 
+  lua_assert(metrics != NULL);
+
   metrics->strhash_hit = g->strhash_hit;
   metrics->strhash_miss = g->strhash_miss;
 
diff --git a/src/lmisclib.h b/src/lmisclib.h
index e2d1909..0c07707 100644
--- a/src/lmisclib.h
+++ b/src/lmisclib.h
@@ -13,9 +13,12 @@
 /* API for obtaining various platform metrics. */
 
 struct luam_Metrics {
-  /* Strings amount found in string hash instead of allocation of new one. */
+  /*
+  ** Number of strings being interned (i.e. the string with the
+  ** same payload is found, so a new one is not created/allocated).
+  */
   size_t strhash_hit;
-  /* Strings amount allocated and put into string hash. */
+  /* Total number of strings allocations during the platform lifetime. */
   size_t strhash_miss;
 
   /* Amount of allocated string objects. */
@@ -44,8 +47,7 @@ struct luam_Metrics {
 
   /*
   ** Overall number of snap restores (amount of guard assertions
-  ** leading to stopping trace executions and trace exits,
-  ** that are not stitching with other traces).
+  ** leading to stopping trace executions).
   */
   size_t jit_snap_restore;
   /* Overall number of abort traces. */
diff --git a/test/misclib-getmetrics-capi.test.lua b/test/misclib-getmetrics-capi.test.lua
index 34adaba..f133274 100755
--- a/test/misclib-getmetrics-capi.test.lua
+++ b/test/misclib-getmetrics-capi.test.lua
@@ -1,15 +1,15 @@
 #!/usr/bin/env tarantool
 
-local file = debug.getinfo(1, "S").source:sub(2)
-local filepath = file:match("(.*/)")
-local soext = jit.os == "OSX" and "dylib" or "so"
-package.cpath = filepath..'clib-misclib-getmetrics/?.'..soext..";"
-                ..package.cpath
+local path = arg[0]:gsub('%.test%.lua', '')
+local suffix = package.cpath:match('?.(%a+);')
+package.cpath = ('%s/?.%s'):format(path, suffix)
 
-local jit_opt_default_hotloop = 56
-local jit_opt_default_hotexit = 10
-local jit_opt_default_level = 3
-local jit_opt_default_minstitch = 0
+local jit_opt_default = {
+    3, -- level
+    "hotloop=56",
+    "hotexit=10",
+    "minstitch=0",
+}
 
 local tap = require('tap')
 
@@ -22,8 +22,7 @@ test:ok(testgetmetrics.base())
 test:ok(testgetmetrics.gc_allocated_freed())
 test:ok(testgetmetrics.gc_steps())
 
-test:ok(testgetmetrics.objcount(function()
-    local table_new = require("table.new")
+test:ok(testgetmetrics.objcount(function(iterations)
     local ffi = require("ffi")
 
     jit.opt.start(0)
@@ -36,36 +35,36 @@ test:ok(testgetmetrics.objcount(function()
     }
 
     -- Separate objects creations to separate jit traces.
-    for i = 1, 1000 do
+    for i = 1, iterations do
         table.insert(placeholder.str, tostring(i))
     end
 
-    for i = 1, 1000 do
-        table.insert(placeholder.tab, table_new(i, 0))
+    for i = 1, iterations do
+        table.insert(placeholder.tab, {i})
     end
 
-    for i = 1, 1000 do
+    for i = 1, iterations do
         table.insert(placeholder.udata, newproxy())
     end
 
-    for i = 1, 1000 do
+    for i = 1, iterations do
         -- Check counting of VLA/VLS/aligned cdata.
         table.insert(placeholder.cdata, ffi.new("char[?]", 4))
     end
 
-    for i = 1, 1000 do
+    for i = 1, iterations do
         -- Check counting of non-VLA/VLS/aligned cdata.
         table.insert(placeholder.cdata, ffi.new("uint64_t", i))
     end
 
     placeholder = nil
     -- Restore default jit settings.
-    jit.opt.start(jit_opt_default_level)
+    jit.opt.start(unpack(jit_opt_default))
 end))
 
 -- Compiled loop with a direct exit to the interpreter.
 test:ok(testgetmetrics.snap_restores(function()
-    jit.opt.start(0, "hotloop=2")
+    jit.opt.start(0, "hotloop=1")
 
     local old_metrics = misc.getmetrics()
 
@@ -77,7 +76,7 @@ test:ok(testgetmetrics.snap_restores(function()
     local new_metrics = misc.getmetrics()
 
     -- Restore default jit settings.
-    jit.opt.start(jit_opt_default_level, "hotloop="..jit_opt_default_hotloop)
+    jit.opt.start(unpack(jit_opt_default))
 
     -- A single snapshot restoration happened on loop finish.
     return 1
@@ -85,7 +84,7 @@ end))
 
 -- Compiled loop with a side exit which does not get compiled.
 test:ok(testgetmetrics.snap_restores(function()
-    jit.opt.start(0, "hotloop=2", "hotexit=2", "minstitch=15")
+    jit.opt.start(0, "hotloop=1", "hotexit=2", "minstitch=15")
 
     local function foo(i)
         -- math.fmod is not yet compiled!
@@ -98,9 +97,7 @@ test:ok(testgetmetrics.snap_restores(function()
     end
 
     -- Restore default jit settings.
-    jit.opt.start(jit_opt_default_level, "hotloop="..jit_opt_default_hotloop,
-                  "hotexit="..jit_opt_default_hotexit,
-                  "minstitch="..jit_opt_default_minstitch)
+    jit.opt.start(unpack(jit_opt_default))
 
     -- Side exits from the root trace could not get compiled.
     return 5
@@ -110,7 +107,7 @@ end))
 test:ok(testgetmetrics.snap_restores(function()
     -- Optimization level is important here as `loop` optimization
     -- may unroll the loop body and insert +1 side exit.
-    jit.opt.start(0, "hotloop=5", "hotexit=5")
+    jit.opt.start(0, "hotloop=1", "hotexit=5")
 
     local function foo(i)
         return i <= 10 and i or tostring(i)
@@ -122,8 +119,7 @@ test:ok(testgetmetrics.snap_restores(function()
     end
 
     -- Restore default jit settings.
-    jit.opt.start(jit_opt_default_level, "hotloop="..jit_opt_default_hotloop,
-                  "hotexit="..jit_opt_default_hotexit)
+    jit.opt.start(unpack(jit_opt_default))
 
     -- 5 side exits to the interpreter before trace gets hot
     -- and compiled
@@ -157,8 +153,8 @@ test:ok(testgetmetrics.snap_restores(function()
     foo(21)
 
     -- Restore default jit settings.
-    jit.opt.start(jit_opt_default_level, "hotloop="..jit_opt_default_hotloop,
-                  "hotexit="..jit_opt_default_hotexit)
+    jit.opt.start(unpack(jit_opt_default))
+
     return 2
 end))
 
diff --git a/test/misclib-getmetrics-capi/testgetmetrics.c b/test/misclib-getmetrics-capi/testgetmetrics.c
index 32802d2..4a7355a 100644
--- a/test/misclib-getmetrics-capi/testgetmetrics.c
+++ b/test/misclib-getmetrics-capi/testgetmetrics.c
@@ -11,7 +11,10 @@ static int base(lua_State *L)
 	struct luam_Metrics metrics;
 	luaM_metrics(L, &metrics);
 
-	/* Just check API. */
+	/*
+	 * Just check API.
+	 * (ssize_t) cast need for not always-true unsigned >= 0 check.
+	 */
 	assert((ssize_t)metrics.strhash_hit >= 0);
 	assert((ssize_t)metrics.strhash_miss >= 0);
 
@@ -64,13 +67,14 @@ static int gc_steps(lua_State *L)
 	struct luam_Metrics oldm, newm;
 	/*
 	 * Some garbage has already happened before the next line,
-	 * i.e. during fronted processing lua test chunk.
+	 * i.e. during frontend processing Lua test chunk.
 	 * Let's put a full garbage collection cycle on top
 	 * of that, and confirm that non-null values are reported
 	 * (we are not yet interested in actual numbers):
 	 */
-	luaM_metrics(L, &oldm);
 	lua_gc(L, LUA_GCCOLLECT, 0);
+
+	luaM_metrics(L, &oldm);
 	assert(oldm.gc_steps_pause > 0);
 	assert(oldm.gc_steps_propagate > 0);
 	assert(oldm.gc_steps_atomic > 0);
@@ -84,11 +88,11 @@ static int gc_steps(lua_State *L)
 	 * consequent call should return the same values:
 	 */
 	luaM_metrics(L, &newm);
-	assert(newm.gc_steps_pause - oldm.gc_steps_pause > 0);
-	assert(newm.gc_steps_propagate - oldm.gc_steps_propagate > 0);
-	assert(newm.gc_steps_atomic - oldm.gc_steps_atomic > 0);
-	assert(newm.gc_steps_sweepstring - oldm.gc_steps_sweepstring > 0);
-	assert(newm.gc_steps_sweep - oldm.gc_steps_sweep > 0);
+	assert(newm.gc_steps_pause - oldm.gc_steps_pause == 0);
+	assert(newm.gc_steps_propagate - oldm.gc_steps_propagate == 0);
+	assert(newm.gc_steps_atomic - oldm.gc_steps_atomic == 0);
+	assert(newm.gc_steps_sweepstring - oldm.gc_steps_sweepstring == 0);
+	assert(newm.gc_steps_sweep - oldm.gc_steps_sweep == 0);
 	/* Nothing to finalize, skipped. */
 	assert(newm.gc_steps_finalize == 0);
 	oldm = newm;
@@ -110,7 +114,7 @@ static int gc_steps(lua_State *L)
 
 	/*
 	 * Now let's run three GC cycles to ensure that
-	 * zero-to-one transition was not a lucky coincidence.
+	 * increment was not a lucky coincidence.
 	 */
 	lua_gc(L, LUA_GCCOLLECT, 0);
 	lua_gc(L, LUA_GCCOLLECT, 0);
@@ -139,8 +143,9 @@ static int objcount(lua_State *L)
 	lua_gc(L, LUA_GCCOLLECT, 0);
 
 	luaM_metrics(L, &oldm);
-	/* Generate garbage. */
-	lua_call(L, 0, 0);
+	/* Generate garbage. Argument is iterations amount. */
+	lua_pushnumber(L, 1000);
+	lua_call(L, 1, 0);
 	lua_gc(L, LUA_GCCOLLECT, 0);
 	luaM_metrics(L, &newm);
 	assert(newm.gc_strnum - oldm.gc_strnum == 0);
@@ -195,7 +200,6 @@ static int tracenum_base(lua_State *L)
 	if (n != 1 || !lua_isfunction(L, 1))
 		luaL_error(L, "incorrect arguments: 1 function is required");
 
-	luaJIT_setmode(L, 0, LUAJIT_MODE_OFF);
 	luaJIT_setmode(L, 0, LUAJIT_MODE_FLUSH);
 	/* Force up garbage collect all dead objects. */
 	lua_gc(L, LUA_GCCOLLECT, 0);
@@ -203,7 +207,6 @@ static int tracenum_base(lua_State *L)
 	luaM_metrics(L, &metrics);
 	assert(metrics.jit_trace_num == 0);
 
-	luaJIT_setmode(L, 0, LUAJIT_MODE_ON);
 	/* Generate traces. */
 	lua_call(L, 0, 1);
 	n = lua_gettop(L);
diff --git a/test/misclib-getmetrics-lapi.test.lua b/test/misclib-getmetrics-lapi.test.lua
index 2859e26..3b3d1f8 100755
--- a/test/misclib-getmetrics-lapi.test.lua
+++ b/test/misclib-getmetrics-lapi.test.lua
@@ -1,7 +1,7 @@
 #!/usr/bin/env tarantool
 
 -- This is a part of tarantool/luajit testing suite.
--- Major portions taken verbatim or adapted from the LuaVela testing suit.
+-- Major portions taken verbatim or adapted from the LuaVela testing suite.
 -- Copyright (C) 2015-2019 IPONWEB Ltd.
 
 local tap = require('tap')
@@ -9,10 +9,12 @@ local tap = require('tap')
 local test = tap.test("lib-misc-getmetrics")
 test:plan(10)
 
-local jit_opt_default_hotloop = 56
-local jit_opt_default_hotexit = 10
-local jit_opt_default_level = 3
-local jit_opt_default_minstitch = 0
+local jit_opt_default = {
+    3, -- level
+    "hotloop=56",
+    "hotexit=10",
+    "minstitch=0",
+}
 
 -- Test Lua API.
 test:test("base", function(subtest)
@@ -49,7 +51,7 @@ test:test("gc-allocated-freed", function(subtest)
     -- Force up garbage collect all dead objects.
     collectgarbage("collect")
 
-    -- Bump getmetrincs table and string keys allocation.
+    -- Bump getmetrics table and string keys allocation.
     local old_metrics = misc.getmetrics()
 
     -- Remember allocated size for getmetrics table.
@@ -59,8 +61,7 @@ test:test("gc-allocated-freed", function(subtest)
 
     local new_metrics = misc.getmetrics()
 
-    local diff_alloc = new_metrics.gc_allocated - old_metrics.gc_allocated
-    local getmetrics_alloc = diff_alloc
+    local getmetrics_alloc = new_metrics.gc_allocated - old_metrics.gc_allocated
 
     -- Do not use test:ok to avoid extra allocated/freed objects.
     assert(getmetrics_alloc > 0, "count allocated table for getmetrics")
@@ -75,7 +76,7 @@ test:test("gc-allocated-freed", function(subtest)
 
     new_metrics = misc.getmetrics()
 
-    diff_alloc = new_metrics.gc_allocated - old_metrics.gc_allocated
+    local diff_alloc = new_metrics.gc_allocated - old_metrics.gc_allocated
     local diff_freed = new_metrics.gc_freed - old_metrics.gc_freed
 
     assert(diff_alloc > getmetrics_alloc,
@@ -113,7 +114,7 @@ test:test("gc-steps", function(subtest)
     subtest:plan(24)
 
     -- Some garbage has already created before the next line,
-    -- i.e. during fronted processing this chunk.
+    -- i.e. during frontend processing this chunk.
     -- Let's put a full garbage collection cycle on top of that,
     -- and confirm that non-null values are reported (we are not
     -- yet interested in actual numbers):
@@ -154,8 +155,8 @@ test:test("gc-steps", function(subtest)
     subtest:is(newm.gc_steps_finalize, 0)
     oldm = newm
 
-    -- Now let's run three GC cycles to ensure that zero-to-one
-    -- transition was not a lucky coincidence.
+    -- Now let's run three GC cycles to ensure that increment
+    -- was not a lucky coincidence.
     collectgarbage("collect")
     collectgarbage("collect")
     collectgarbage("collect")
@@ -172,7 +173,6 @@ end)
 
 test:test("objcount", function(subtest)
     subtest:plan(4)
-    local table_new = require("table.new")
     local ffi = require("ffi")
 
     jit.opt.start(0)
@@ -180,9 +180,7 @@ test:test("objcount", function(subtest)
     -- Remove all dead objects.
     collectgarbage("collect")
 
-    -- Bump strings and table creation.
     local old_metrics = misc.getmetrics()
-    old_metrics = misc.getmetrics()
 
     local placeholder = {
         str = {},
@@ -197,7 +195,7 @@ test:test("objcount", function(subtest)
     end
 
     for i = 1, 1000 do
-        table.insert(placeholder.tab, table_new(i, 0))
+        table.insert(placeholder.tab, {i})
     end
 
     for i = 1, 1000 do
@@ -221,7 +219,12 @@ test:test("objcount", function(subtest)
     -- Check that amount of objects not increased.
     subtest:is(new_metrics.gc_strnum, old_metrics.gc_strnum,
                "strnum don't change")
-    subtest:is(new_metrics.gc_tabnum, old_metrics.gc_tabnum,
+    -- When we call getmetrics, we create table for metrics first.
+    -- So, when we save old_metrics there are x + 1 tables,
+    -- when we save new_metrics there are x + 2 tables, because
+    -- old table hasn't been collected yet (it is still
+    -- reachable).
+    subtest:is(new_metrics.gc_tabnum - old_metrics.gc_tabnum, 1,
                "tabnum don't change")
     subtest:is(new_metrics.gc_udatanum, old_metrics.gc_udatanum,
                "udatanum don't change")
@@ -229,14 +232,14 @@ test:test("objcount", function(subtest)
                "cdatanum don't change")
 
     -- Restore default jit settings.
-    jit.opt.start(jit_opt_default_level)
+    jit.opt.start(unpack(jit_opt_default))
 end)
 
 test:test("snap-restores-direct-loop", function(subtest)
     -- Compiled loop with a direct exit to the interpreter.
     subtest:plan(1)
 
-    jit.opt.start(0, "hotloop=2")
+    jit.opt.start(0, "hotloop=1")
 
     local old_metrics = misc.getmetrics()
 
@@ -251,14 +254,14 @@ test:test("snap-restores-direct-loop", function(subtest)
     subtest:is(new_metrics.jit_snap_restore - old_metrics.jit_snap_restore, 1)
 
     -- Restore default jit settings.
-    jit.opt.start(jit_opt_default_level, "hotloop="..jit_opt_default_hotloop)
+    jit.opt.start(unpack(jit_opt_default))
 end)
 
 test:test("snap-restores-loop-side-exit-non-compiled", function(subtest)
     -- Compiled loop with a side exit which does not get compiled.
     subtest:plan(1)
 
-    jit.opt.start(0, "hotloop=2", "hotexit=2", "minstitch=15")
+    jit.opt.start(0, "hotloop=1", "hotexit=2", "minstitch=15")
 
     local function foo(i)
         -- math.fmod is not yet compiled!
@@ -277,9 +280,7 @@ test:test("snap-restores-loop-side-exit-non-compiled", function(subtest)
     subtest:is(new_metrics.jit_snap_restore - old_metrics.jit_snap_restore, 5)
 
     -- Restore default jit settings.
-    jit.opt.start(jit_opt_default_level, "hotloop="..jit_opt_default_hotloop,
-                  "hotexit="..jit_opt_default_hotexit,
-                  "minstitch="..jit_opt_default_minstitch)
+    jit.opt.start(unpack(jit_opt_default))
 end)
 
 test:test("snap-restores-loop-side-exit", function(subtest)
@@ -288,7 +289,7 @@ test:test("snap-restores-loop-side-exit", function(subtest)
 
     -- Optimization level is important here as `loop` optimization
     -- may unroll the loop body and insert +1 side exit.
-    jit.opt.start(0, "hotloop=5", "hotexit=5")
+    jit.opt.start(0, "hotloop=1", "hotexit=5")
 
     local function foo(i)
         return i <= 10 and i or tostring(i)
@@ -308,8 +309,7 @@ test:test("snap-restores-loop-side-exit", function(subtest)
     subtest:is(new_metrics.jit_snap_restore - old_metrics.jit_snap_restore, 6)
 
     -- Restore default jit settings.
-    jit.opt.start(jit_opt_default_level, "hotloop="..jit_opt_default_hotloop,
-                  "hotexit="..jit_opt_default_hotexit)
+    jit.opt.start(unpack(jit_opt_default))
 end)
 
 test:test("snap-restores-scalar", function(subtest)
@@ -352,8 +352,7 @@ test:test("snap-restores-scalar", function(subtest)
     subtest:is(new_metrics.jit_snap_restore - old_metrics.jit_snap_restore, 2)
 
     -- Restore default jit settings.
-    jit.opt.start(jit_opt_default_level, "hotloop="..jit_opt_default_hotloop,
-                  "hotexit="..jit_opt_default_hotexit)
+    jit.opt.start(unpack(jit_opt_default))
 end)
 
 test:test("strhash", function(subtest)
@@ -390,13 +389,11 @@ end)
 test:test("tracenum-base", function(subtest)
     subtest:plan(3)
 
-    jit.off()
     jit.flush()
     collectgarbage("collect")
     local metrics = misc.getmetrics()
     subtest:is(metrics.jit_trace_num, 0)
 
-    jit.on()
     local sum = 0
     for i = 1, 100 do
         sum = sum + i
@@ -405,14 +402,11 @@ test:test("tracenum-base", function(subtest)
     metrics = misc.getmetrics()
     subtest:is(metrics.jit_trace_num, 1)
 
-    jit.off()
     jit.flush()
     collectgarbage("collect")
 
     metrics = misc.getmetrics()
     subtest:is(metrics.jit_trace_num, 0)
-
-    jit.on()
 end)
 
 os.exit(test:check() and 0 or 1)
===================================================================

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [RFC v4] rfc: luajit metrics
  2020-10-05  6:30 ` [Tarantool-patches] [RFC v4] rfc: luajit metrics Sergey Kaplun
@ 2020-10-07 14:46   ` Sergey Kaplun
  2020-10-08 17:25   ` Igor Munkin
  2020-12-22  9:07   ` Kirill Yukhin
  2 siblings, 0 replies; 26+ messages in thread
From: Sergey Kaplun @ 2020-10-07 14:46 UTC (permalink / raw)
  To: Igor Munkin, Sergey Ostanevich; +Cc: tarantool-patches

On 05.10.20, Sergey Kaplun wrote:
> Part of #5187
> ---
> 
> This patch adds RFC to LuaJIT metrics interfaces. Nevertheless name
> `misc` for builtin library is not good and should be discussed, because
> tons of user modules can use that name for their own libraries.
> 
> Branch: https://github.com/tarantool/tarantool/tree/skaplun/5187-luajit-metrics
> Issue: https://github.com/tarantool/tarantool/issues/5187
> 
> Changes in v2:
> - Fixed typos
> - Made comments more verbose
> - Avoided flushing any of metrics after each call of luaM_metrics()
> Changes in v3:
> - Added colors count metrics description
> - Added description about how metrics are collected
> - Added benchmarks
> Changes in v3:

v4 of course

> - Removed colors count metrics

<snipped>

> -- 
> 2.28.0
> 

Update patch considering to Igor's comments (see also [1]).
Iterative patch in the bottom. Branch force-pushed.

===================================================================
diff --git a/doc/rfc/5187-luajit-metrics.md b/doc/rfc/5187-luajit-metrics.md
index 02f5b559f..988b049fb 100644
--- a/doc/rfc/5187-luajit-metrics.md
+++ b/doc/rfc/5187-luajit-metrics.md
@@ -42,9 +42,12 @@ The `struct luam_Metrics` has the following definition:
 
 ```c
 struct luam_Metrics {
-  /* Strings amount found in string hash instead of allocation of new one. */
+  /*
+  ** Number of strings being interned (i.e. the string with the
+  ** same payload is found, so a new one is not created/allocated).
+  */
   size_t strhash_hit;
-  /* Strings amount allocated and put into string hash. */
+  /* Total number of strings allocations during the platform lifetime. */
   size_t strhash_miss;
 
   /* Amount of allocated string objects. */
@@ -73,8 +76,7 @@ struct luam_Metrics {
 
   /*
   ** Overall number of snap restores (amount of guard assertions
-  ** leading to stopping trace executions and trace exits,
-  ** that are not stitching with other traces).
+  ** leading to stopping trace executions).
   */
   size_t jit_snap_restore;
   /* Overall number of abort traces. */
===================================================================

[1]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-October/019947.html

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH v4 2/2] misc: add C and Lua API for platform metrics
  2020-10-07 14:35     ` Sergey Kaplun
@ 2020-10-07 18:23       ` Igor Munkin
  2020-10-07 20:09         ` Sergey Kaplun
  0 siblings, 1 reply; 26+ messages in thread
From: Igor Munkin @ 2020-10-07 18:23 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Sergey,

Thanks for the fixes! The patch LGTM, except two nits below.

On 07.10.20, Sergey Kaplun wrote:
> Hi Igor! Thanks for the review!
> 
> On 07.10.20, Igor Munkin wrote:
> > Sergey,
> > 
> > Thanks for the patch! Please consider my comments below.
> > 

<snipped>

> > > diff --git a/test/clib-misclib-getmetrics.test.lua b/test/clib-misclib-getmetrics.test.lua
> > 
> > Add Github issue to the file name.
> 
> Here and below:
> I thought we use gh-xxxx style for bugfix tests. Tests for new
> features are named the same as a feature.
> Is `misclib-getmetrics-{capi,lapi}.test.lua` good enough?
> 

Yes, I'm OK with these names.

> > 
> > > new file mode 100755
> > > index 0000000..34adaba
> > > --- /dev/null
> > > +++ b/test/clib-misclib-getmetrics.test.lua
> > > @@ -0,0 +1,174 @@
> > > +#!/usr/bin/env tarantool
> > > +
> > > +local file = debug.getinfo(1, "S").source:sub(2)
> > > +local filepath = file:match("(.*/)")
> > > +local soext = jit.os == "OSX" and "dylib" or "so"
> > > +package.cpath = filepath..'clib-misclib-getmetrics/?.'..soext..";"
> > > +                ..package.cpath
> > 
> > I've already solved this issue, so this part can be easily borrowed from
> > utils.lua[1]. Consider the following:
> > | local path = arg[0]:gsub('%.test%.lua', '')
> > | local suffix = package.cpath:match('?.(%a+);')
> > | package.cpath = ('%s/?.%s;%s'):format(path, suffix, package.cpath)
> > 
> 
> Yep. See now. Thanks! We should add one small function for this
> to reuse in other tests later.

Yes, this part can be moved to a separate routine in utils.lua.

> 

<snipped>

> 
> See iterative patch in the bottom. Branch force-pushed.
> 
> ===================================================================

<snipped>

> diff --git a/test/misclib-getmetrics-capi.test.lua b/test/misclib-getmetrics-capi.test.lua
> index 34adaba..f133274 100755
> --- a/test/misclib-getmetrics-capi.test.lua
> +++ b/test/misclib-getmetrics-capi.test.lua
> @@ -1,15 +1,15 @@
>  #!/usr/bin/env tarantool
>  
> -local file = debug.getinfo(1, "S").source:sub(2)
> -local filepath = file:match("(.*/)")
> -local soext = jit.os == "OSX" and "dylib" or "so"
> -package.cpath = filepath..'clib-misclib-getmetrics/?.'..soext..";"
> -                ..package.cpath
> +local path = arg[0]:gsub('%.test%.lua', '')
> +local suffix = package.cpath:match('?.(%a+);')
> +package.cpath = ('%s/?.%s'):format(path, suffix)

This line overrides the current <package.cpath>.

>  

<snipped>

> diff --git a/test/misclib-getmetrics-capi/testgetmetrics.c b/test/misclib-getmetrics-capi/testgetmetrics.c
> index 32802d2..4a7355a 100644
> --- a/test/misclib-getmetrics-capi/testgetmetrics.c
> +++ b/test/misclib-getmetrics-capi/testgetmetrics.c

<snipped>

> @@ -203,7 +207,6 @@ static int tracenum_base(lua_State *L)
>  	luaM_metrics(L, &metrics);
>  	assert(metrics.jit_trace_num == 0);
>  
> -	luaJIT_setmode(L, 0, LUAJIT_MODE_ON);

There is one more excess call at the end of this function.

>  	/* Generate traces. */
>  	lua_call(L, 0, 1);
>  	n = lua_gettop(L);

<snipped>

> ===================================================================
> 
> -- 
> Best regards,
> Sergey Kaplun

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH v4 1/2] core: introduce various platform metrics
  2020-10-07 14:11   ` Igor Munkin
@ 2020-10-07 19:55     ` Sergey Kaplun
  2020-10-07 20:16       ` Igor Munkin
  0 siblings, 1 reply; 26+ messages in thread
From: Sergey Kaplun @ 2020-10-07 19:55 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches

On 07.10.20, Igor Munkin wrote:
> Sergey,
> 
> Thanks for the patch! Please consider my comments below.
> 
> On 05.10.20, Sergey Kaplun wrote:
> > This patch introduces the following counters:
> >   - overall amount of allocated tables, cdata and udata objects
> >   - number of incremental GC steps grouped by GC state
> >   - number of string hashes hits and misses
> >   - amount of allocated and freed memory
> >   - number of trace aborts and restored snapshots
> 
> Looks like you forgot a number of alive traces.
> 
> > 
> > Interfaces to obtain these metrics via both Lua and C API are
> > introduced in the next patch.
> > 
> > Part of tarantool/tarantool#5187
> > ---
> >  src/lj_asm.c       |  2 ++
> >  src/lj_asm_arm.h   |  7 +++++++
> >  src/lj_asm_arm64.h |  7 +++++++
> >  src/lj_asm_mips.h  |  4 ++++
> >  src/lj_asm_ppc.h   |  3 +++
> >  src/lj_asm_x86.h   |  4 ++++
> >  src/lj_cdata.c     |  2 ++
> >  src/lj_cdata.h     |  2 ++
> >  src/lj_gc.c        |  4 ++++
> >  src/lj_gc.h        |  6 +-----
> >  src/lj_jit.h       |  3 +++
> >  src/lj_obj.h       | 28 +++++++++++++++++++++++++---
> >  src/lj_snap.c      |  1 +
> >  src/lj_state.c     |  2 +-
> >  src/lj_str.c       |  5 +++++
> >  src/lj_tab.c       |  2 ++
> >  src/lj_trace.c     |  6 +++++-
> >  src/lj_udata.c     |  2 ++
> >  18 files changed, 80 insertions(+), 10 deletions(-)
> > 
> > diff --git a/src/lj_asm.c b/src/lj_asm.c
> > index c2cf5a9..8fb3ccd 100644
> > --- a/src/lj_asm.c
> > +++ b/src/lj_asm.c
> > @@ -2273,6 +2273,7 @@ void lj_asm_trace(jit_State *J, GCtrace *T)
> >    as->J = J;
> >    as->T = T;
> >    J->curfinal = lj_trace_alloc(J->L, T);  /* This copies the IR, too. */
> > +  J->tracenum++;
> 
> Why didn't you simply put this increment right to <lj_trace_alloc> as
> you did for decrement and <lj_trace_free>?

I was confused by L argument. Fixed

> 
> >    as->flags = J->flags;
> >    as->loopref = J->loopref;
> >    as->realign = NULL;
> > @@ -2379,6 +2380,7 @@ void lj_asm_trace(jit_State *J, GCtrace *T)
> >      lj_trace_free(J2G(J), J->curfinal);
> >      J->curfinal = NULL;  /* In case lj_trace_alloc() OOMs. */
> >      J->curfinal = lj_trace_alloc(J->L, T);
> > +    J->tracenum++;
> 
> Ditto.

And here too.

> 

<snipped>

> > +  /* Code incrementing cdatanum is sparse to avoid mips data hazards. */
> 
> Side note: LOL, here you are :)

Finally :)

> 
> > +  emit_setgl(as, RID_RET+2, gc.cdatanum);
> 
> Well, I glanced a MIPS register-usage convention and AFAICS $4 register
> (RID_RET + 2) is a general-purpose (i.e. doesn't store 0 or preserved by
> kernel) caller-safe one. Ergo it should be allocated it in a proper way
> from scratch set, shouldn't it?
> 

AFAIK, $a0 - $a3 ($4 - $7) registers are arguments to functions - not
preserved by subprograms.
But anyway explicit allocation is better here. Added.

> >    /* Initialize gct and ctypeid. lj_mem_newgco() already sets marked. */

<snipped>

> > +/* Garbage collector states. Order matters. */
> > +enum {
> > +  GCSpause, /* Start a GC cycle and mark the root set.*/
> > +  GCSpropagate, /* One gray object is processed. */
> > +  GCSatomic, /* Atomic transition from mark to sweep phase. */
> > +  GCSsweepstring, /* Sweep one chain of strings. */
> > +  GCSsweep, /* Sweep few objects from root. */
> > +  GCSfinalize, /* Finalize one userdata or cdata object. */
> > +  GCSmax
> 
> Adjust the comments considering the code nearby.

Sure! Thanks!

> 
> > +};
> > +
> >  typedef struct GCState {
> >    GCSize total;		/* Memory currently allocated. */
> >    GCSize threshold;	/* Memory threshold. */
> 
> <snipped>
> 
> > @@ -602,22 +622,24 @@ typedef struct global_State {
> >      BloomFilter next[2];
> >    } strbloom;
> >  #endif
> > +  size_t strhash_hit;	/* Strings amount found in string hash. */
> > +  size_t strhash_miss;	/* Strings amount allocated and put into string hash. */
> >    lua_Alloc allocf;	/* Memory allocator. */
> >    void *allocd;		/* Memory allocator data. */
> >    GCState gc;		/* Garbage collector. */
> > -  volatile int32_t vmstate;  /* VM state or current JIT code trace number. */
> >    SBuf tmpbuf;		/* Temporary string buffer. */
> >    GCstr strempty;	/* Empty string. */
> >    uint8_t stremptyz;	/* Zero terminator of empty string. */
> >    uint8_t hookmask;	/* Hook mask. */
> >    uint8_t dispatchmode;	/* Dispatch mode. */
> >    uint8_t vmevmask;	/* VM event mask. */
> > +  int32_t hookcount;	/* Instruction hook countdown. */
> >    GCRef mainthref;	/* Link to main thread. */
> >    TValue registrytv;	/* Anchor for registry. */
> > -  TValue tmptv, tmptv2;	/* Temporary TValues. */
> > +  TValue tmptv2, tmptv;	/* Temporary TValues. */
> >    Node nilnode;		/* Fallback 1-element hash part (nil key and value). */
> >    GCupval uvhead;	/* Head of double-linked list of all open upvalues. */
> > -  int32_t hookcount;	/* Instruction hook countdown. */
> > +  volatile int32_t vmstate;  /* VM state or current JIT code trace number. */
> 
> There is no a single word in the commit message regarding this unclear
> change. Please drop a sentence about the rationale for this reordering.

I've changed commit message as follows:

===================================================================
core: introduce various platform metrics

This patch introduces the following counters:
  - overall amount of allocated tables, cdata and udata objects
  - number of incremental GC steps grouped by GC state
  - number of string hashes hits and misses
  - amount of allocated and freed memory
  - number of trace aborts, number of traces and restored snapshots

Also this patch fixes alignment for 64-bit architectures.

NB: MSize and BCIns are the only fixed types that equal 32 bits. GCRef,
MRef and GCSize sizes depend on LJ_GC64 define.

struct GCState is terminated by three fields: GCSize estimate, MSize
stepmul and MSize pause, which are aligned. The introduces size_t
fields do not violate the alignment too.

vmstate 32-bit field goes right after GCState field within global_State
structure. The next field tmpbuf consists of several MRef fields that
have 64-bit size each. This issue can be solved by moving vmstate field
below. However DynASM doesn't work well with unaligned memory access on
64-bit bigendian MIPS, so vmstate should be aligned to a 64-bit
boundary.

Furthermore field order has been changed to be able to compile code by
DynASM for 32-bit ARM too (see also
https://github.com/openresty/luajit2/issues/37#issuecomment-459145226).

Interfaces to obtain these metrics via both Lua and C API are
introduced in the next patch.

Part of tarantool/tarantool#5187
===================================================================

Side note: If you want read a little bit more about ARM immediate value
encoding (and play with it) see also [1].

> 

<snipped>

> > -- 
> > 2.28.0
> > 
> 
> -- 
> Best regards,
> IM

See iterative patch in the bottom. Branch force-pushed.

===================================================================
diff --git a/src/lj_asm.c b/src/lj_asm.c
index 8fb3ccd..c2cf5a9 100644
--- a/src/lj_asm.c
+++ b/src/lj_asm.c
@@ -2273,7 +2273,6 @@ void lj_asm_trace(jit_State *J, GCtrace *T)
   as->J = J;
   as->T = T;
   J->curfinal = lj_trace_alloc(J->L, T);  /* This copies the IR, too. */
-  J->tracenum++;
   as->flags = J->flags;
   as->loopref = J->loopref;
   as->realign = NULL;
@@ -2380,7 +2379,6 @@ void lj_asm_trace(jit_State *J, GCtrace *T)
     lj_trace_free(J2G(J), J->curfinal);
     J->curfinal = NULL;  /* In case lj_trace_alloc() OOMs. */
     J->curfinal = lj_trace_alloc(J->L, T);
-    J->tracenum++;
     as->realign = NULL;
   }
 
diff --git a/src/lj_asm_mips.h b/src/lj_asm_mips.h
index f4b4b5d..0341701 100644
--- a/src/lj_asm_mips.h
+++ b/src/lj_asm_mips.h
@@ -1430,7 +1430,9 @@ static void asm_cnew(ASMState *as, IRIns *ir)
   CTInfo info = lj_ctype_info(cts, id, &sz);
   const CCallInfo *ci = &lj_ir_callinfo[IRCALL_lj_mem_newgco];
   IRRef args[4];
+  RegSet allow = (RSET_GPR & ~RSET_SCRATCH);
   RegSet drop = RSET_SCRATCH;
+  Reg tmp;
   lua_assert(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL));
 
   as->gcsteps++;
@@ -1442,7 +1444,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
 
   /* Initialize immutable cdata object. */
   if (ir->o == IR_CNEWI) {
-    RegSet allow = (RSET_GPR & ~RSET_SCRATCH);
 #if LJ_32
     int32_t ofs = sizeof(GCcdata);
     if (sz == 8) {
@@ -1473,15 +1474,16 @@ static void asm_cnew(ASMState *as, IRIns *ir)
     return;
   }
 
+  tmp = ra_scratch(as, allow);
   /* Code incrementing cdatanum is sparse to avoid mips data hazards. */
-  emit_setgl(as, RID_RET+2, gc.cdatanum);
+  emit_setgl(as, tmp, gc.cdatanum);
   /* Initialize gct and ctypeid. lj_mem_newgco() already sets marked. */
   emit_tsi(as, MIPSI_SB, RID_RET+1, RID_RET, offsetof(GCcdata, gct));
   emit_tsi(as, MIPSI_SH, RID_TMP, RID_RET, offsetof(GCcdata, ctypeid));
-  emit_tsi(as, MIPSI_AADDIU, RID_RET+2, RID_RET+2, 1);
+  emit_tsi(as, MIPSI_AADDIU, tmp, tmp, 1);
   emit_ti(as, MIPSI_LI, RID_RET+1, ~LJ_TCDATA);
   emit_ti(as, MIPSI_LI, RID_TMP, id); /* Lower 16 bit used. Sign-ext ok. */
-  emit_getgl(as, RID_RET+2, gc.cdatanum);
+  emit_getgl(as, tmp, gc.cdatanum);
   args[0] = ASMREF_L;     /* lua_State *L */
   args[1] = ASMREF_TMP1;  /* MSize size   */
   asm_gencall(as, ci, args);
diff --git a/src/lj_obj.h b/src/lj_obj.h
index 600b68d..927b347 100644
--- a/src/lj_obj.h
+++ b/src/lj_obj.h
@@ -573,12 +573,12 @@ typedef enum {
 
 /* Garbage collector states. Order matters. */
 enum {
-  GCSpause, /* Start a GC cycle and mark the root set.*/
-  GCSpropagate, /* One gray object is processed. */
-  GCSatomic, /* Atomic transition from mark to sweep phase. */
-  GCSsweepstring, /* Sweep one chain of strings. */
-  GCSsweep, /* Sweep few objects from root. */
-  GCSfinalize, /* Finalize one userdata or cdata object. */
+  GCSpause,		/* Start a GC cycle and mark the root set.*/
+  GCSpropagate,		/* One gray object is processed. */
+  GCSatomic,		/* Atomic transition from mark to sweep phase. */
+  GCSsweepstring,	/* Sweep one chain of strings. */
+  GCSsweep,		/* Sweep few objects from root. */
+  GCSfinalize,		/* Finalize one userdata or cdata object. */
   GCSmax
 };
 
diff --git a/src/lj_trace.c b/src/lj_trace.c
index 9bcbce6..86563cd 100644
--- a/src/lj_trace.c
+++ b/src/lj_trace.c
@@ -136,6 +136,7 @@ GCtrace * LJ_FASTCALL lj_trace_alloc(lua_State *L, GCtrace *T)
   T2->nsnap = T->nsnap;
   T2->nsnapmap = T->nsnapmap;
   memcpy(p, T->ir + T->nk, szins);
+  L2J(L)->tracenum++;
   return T2;
 }
 
===================================================================

[1]: https://alisdair.mcdiarmid.org/arm-immediate-value-encoding/

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH v4 2/2] misc: add C and Lua API for platform metrics
  2020-10-07 18:23       ` Igor Munkin
@ 2020-10-07 20:09         ` Sergey Kaplun
  0 siblings, 0 replies; 26+ messages in thread
From: Sergey Kaplun @ 2020-10-07 20:09 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches

Thanks, again! Great comments as always!

On 07.10.20, Igor Munkin wrote:
> Sergey,
> 
> Thanks for the fixes! The patch LGTM, except two nits below.
> 
> On 07.10.20, Sergey Kaplun wrote:
> > Hi Igor! Thanks for the review!
> > 
> > On 07.10.20, Igor Munkin wrote:
> > > Sergey,
> > > 
> > > Thanks for the patch! Please consider my comments below.
> > > 

<snipped>

> > diff --git a/test/misclib-getmetrics-capi.test.lua b/test/misclib-getmetrics-capi.test.lua
> > index 34adaba..f133274 100755
> > --- a/test/misclib-getmetrics-capi.test.lua
> > +++ b/test/misclib-getmetrics-capi.test.lua
> > @@ -1,15 +1,15 @@
> >  #!/usr/bin/env tarantool
> >  
> > -local file = debug.getinfo(1, "S").source:sub(2)
> > -local filepath = file:match("(.*/)")
> > -local soext = jit.os == "OSX" and "dylib" or "so"
> > -package.cpath = filepath..'clib-misclib-getmetrics/?.'..soext..";"
> > -                ..package.cpath
> > +local path = arg[0]:gsub('%.test%.lua', '')
> > +local suffix = package.cpath:match('?.(%a+);')
> > +package.cpath = ('%s/?.%s'):format(path, suffix)
> 
> This line overrides the current <package.cpath>.

Sure! 10x!

> 
> >  
> 
> <snipped>
> 
> > diff --git a/test/misclib-getmetrics-capi/testgetmetrics.c b/test/misclib-getmetrics-capi/testgetmetrics.c
> > index 32802d2..4a7355a 100644
> > --- a/test/misclib-getmetrics-capi/testgetmetrics.c
> > +++ b/test/misclib-getmetrics-capi/testgetmetrics.c
> 
> <snipped>
> 
> > @@ -203,7 +207,6 @@ static int tracenum_base(lua_State *L)
> >  	luaM_metrics(L, &metrics);
> >  	assert(metrics.jit_trace_num == 0);
> >  
> > -	luaJIT_setmode(L, 0, LUAJIT_MODE_ON);
> 
> There is one more excess call at the end of this function.

Thanks! Nice catch!

> 
> >  	/* Generate traces. */
> >  	lua_call(L, 0, 1);
> >  	n = lua_gettop(L);
> 
> <snipped>
> 
> > ===================================================================
> > 
> > -- 
> > Best regards,
> > Sergey Kaplun
> 
> -- 
> Best regards,
> IM

See iterative patch in the bottom. Branch force-pushed.

===================================================================
diff --git a/test/misclib-getmetrics-capi.test.lua b/test/misclib-getmetrics-capi.test.lua
index f133274..1ad6958 100755
--- a/test/misclib-getmetrics-capi.test.lua
+++ b/test/misclib-getmetrics-capi.test.lua
@@ -2,7 +2,7 @@
 
 local path = arg[0]:gsub('%.test%.lua', '')
 local suffix = package.cpath:match('?.(%a+);')
-package.cpath = ('%s/?.%s'):format(path, suffix)
+package.cpath = ('%s/?.%s;'):format(path, suffix)..package.cpath
 
 local jit_opt_default = {
     3, -- level
diff --git a/test/misclib-getmetrics-capi/testgetmetrics.c b/test/misclib-getmetrics-capi/testgetmetrics.c
index 4a7355a..8844b17 100644
--- a/test/misclib-getmetrics-capi/testgetmetrics.c
+++ b/test/misclib-getmetrics-capi/testgetmetrics.c
@@ -222,7 +222,6 @@ static int tracenum_base(lua_State *L)
         luaM_metrics(L, &metrics);
         assert(metrics.jit_trace_num == 0);
 
-        luaJIT_setmode(L, 0, LUAJIT_MODE_ON);
         lua_pushboolean(L, 1);
         return 1;
 }
===================================================================

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH v4 1/2] core: introduce various platform metrics
  2020-10-07 19:55     ` Sergey Kaplun
@ 2020-10-07 20:16       ` Igor Munkin
  2020-10-08  9:28         ` Igor Munkin
  2020-10-08 10:11         ` Sergey Kaplun
  0 siblings, 2 replies; 26+ messages in thread
From: Igor Munkin @ 2020-10-07 20:16 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Sergey,

Thanks for your fixes! There is still a comment regarding CNEW
assembling and a couple minors below.

On 07.10.20, Sergey Kaplun wrote:
> On 07.10.20, Igor Munkin wrote:
> > Sergey,
> > 
> > Thanks for the patch! Please consider my comments below.
> > 
> > On 05.10.20, Sergey Kaplun wrote:

<snipped>

> 
> > 
> > > +  emit_setgl(as, RID_RET+2, gc.cdatanum);
> > 
> > Well, I glanced a MIPS register-usage convention and AFAICS $4 register
> > (RID_RET + 2) is a general-purpose (i.e. doesn't store 0 or preserved by
> > kernel) caller-safe one. Ergo it should be allocated it in a proper way
> > from scratch set, shouldn't it?
> > 
> 
> AFAIK, $a0 - $a3 ($4 - $7) registers are arguments to functions - not
> preserved by subprograms.

Yes, but there is e.g. $8, that is temporary one, isn't it? Anyway, you
can't just pick the particular register, since it can be already
allocated by RA. So it *has* to be explicitly allocated to avoid data
clash on the trace. I strongly believe the reason you see no failure on
tests is simply a lucky coincidence (or tiny traces).

> But anyway explicit allocation is better here. Added.
> 
> > >    /* Initialize gct and ctypeid. lj_mem_newgco() already sets marked. */
> 

<snipped>

> 
> I've changed commit message as follows:
> 
> ===================================================================
> core: introduce various platform metrics
> 
> This patch introduces the following counters:
>   - overall amount of allocated tables, cdata and udata objects
>   - number of incremental GC steps grouped by GC state
>   - number of string hashes hits and misses
>   - amount of allocated and freed memory
>   - number of trace aborts, number of traces and restored snapshots
> 
> Also this patch fixes alignment for 64-bit architectures.
> 
> NB: MSize and BCIns are the only fixed types that equal 32 bits. GCRef,
> MRef and GCSize sizes depend on LJ_GC64 define.
> 
> struct GCState is terminated by three fields: GCSize estimate, MSize
> stepmul and MSize pause, which are aligned. The introduces size_t

Typo: s/introduces/introduced/.

> fields do not violate the alignment too.
> 
> vmstate 32-bit field goes right after GCState field within global_State
> structure. The next field tmpbuf consists of several MRef fields that
> have 64-bit size each. This issue can be solved by moving vmstate field
> below. However DynASM doesn't work well with unaligned memory access on
> 64-bit bigendian MIPS, so vmstate should be aligned to a 64-bit
> boundary.
> 
> Furthermore field order has been changed to be able to compile code by
> DynASM for 32-bit ARM too (see also
> https://github.com/openresty/luajit2/issues/37#issuecomment-459145226).
> 
> Interfaces to obtain these metrics via both Lua and C API are
> introduced in the next patch.
> 
> Part of tarantool/tarantool#5187
> ===================================================================
> 
> Side note: If you want read a little bit more about ARM immediate value
> encoding (and play with it) see also [1].

Thanks.

> 

<snipped>

> 
> See iterative patch in the bottom. Branch force-pushed.
> 
> ===================================================================

<snipped>

> diff --git a/src/lj_asm_mips.h b/src/lj_asm_mips.h
> index f4b4b5d..0341701 100644
> --- a/src/lj_asm_mips.h
> +++ b/src/lj_asm_mips.h
> @@ -1430,7 +1430,9 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>    CTInfo info = lj_ctype_info(cts, id, &sz);
>    const CCallInfo *ci = &lj_ir_callinfo[IRCALL_lj_mem_newgco];
>    IRRef args[4];
> +  RegSet allow = (RSET_GPR & ~RSET_SCRATCH);
>    RegSet drop = RSET_SCRATCH;
> +  Reg tmp;
>    lua_assert(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL));
>  
>    as->gcsteps++;
> @@ -1442,7 +1444,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>  
>    /* Initialize immutable cdata object. */ >    if (ir->o == IR_CNEWI) {
> -    RegSet allow = (RSET_GPR & ~RSET_SCRATCH);
>  #if LJ_32
>      int32_t ofs = sizeof(GCcdata);
>      if (sz == 8) {
> @@ -1473,15 +1474,16 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>      return;
>    }
>  
> +  tmp = ra_scratch(as, allow);

Since there are registers allocated in scope of IR_CNEWI assembling
above, you need to exclude those registers from <allow> set prior to
scratching a new one.

>    /* Code incrementing cdatanum is sparse to avoid mips data hazards. */
> -  emit_setgl(as, RID_RET+2, gc.cdatanum);
> +  emit_setgl(as, tmp, gc.cdatanum);
>    /* Initialize gct and ctypeid. lj_mem_newgco() already sets marked. */
>    emit_tsi(as, MIPSI_SB, RID_RET+1, RID_RET, offsetof(GCcdata, gct));
>    emit_tsi(as, MIPSI_SH, RID_TMP, RID_RET, offsetof(GCcdata, ctypeid));
> -  emit_tsi(as, MIPSI_AADDIU, RID_RET+2, RID_RET+2, 1);
> +  emit_tsi(as, MIPSI_AADDIU, tmp, tmp, 1);
>    emit_ti(as, MIPSI_LI, RID_RET+1, ~LJ_TCDATA);
>    emit_ti(as, MIPSI_LI, RID_TMP, id); /* Lower 16 bit used. Sign-ext ok. */
> -  emit_getgl(as, RID_RET+2, gc.cdatanum);
> +  emit_getgl(as, tmp, gc.cdatanum);
>    args[0] = ASMREF_L;     /* lua_State *L */
>    args[1] = ASMREF_TMP1;  /* MSize size   */
>    asm_gencall(as, ci, args);

<snipped>

> ===================================================================
> 
> [1]: https://alisdair.mcdiarmid.org/arm-immediate-value-encoding/
> 
> -- 
> Best regards,
> Sergey Kaplun

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH v4 1/2] core: introduce various platform metrics
  2020-10-07 20:16       ` Igor Munkin
@ 2020-10-08  9:28         ` Igor Munkin
  2020-10-08 10:11         ` Sergey Kaplun
  1 sibling, 0 replies; 26+ messages in thread
From: Igor Munkin @ 2020-10-08  9:28 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Sergey,

On 07.10.20, Igor Munkin wrote:
> Sergey,
> 
> Thanks for your fixes! There is still a comment regarding CNEW
> assembling and a couple minors below.
> 

<snipped>

> 
> > diff --git a/src/lj_asm_mips.h b/src/lj_asm_mips.h
> > index f4b4b5d..0341701 100644
> > --- a/src/lj_asm_mips.h
> > +++ b/src/lj_asm_mips.h
> > @@ -1430,7 +1430,9 @@ static void asm_cnew(ASMState *as, IRIns *ir)
> >    CTInfo info = lj_ctype_info(cts, id, &sz);
> >    const CCallInfo *ci = &lj_ir_callinfo[IRCALL_lj_mem_newgco];
> >    IRRef args[4];
> > +  RegSet allow = (RSET_GPR & ~RSET_SCRATCH);
> >    RegSet drop = RSET_SCRATCH;
> > +  Reg tmp;
> >    lua_assert(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL));
> >  
> >    as->gcsteps++;
> > @@ -1442,7 +1444,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
> >  
> >    /* Initialize immutable cdata object. */ >    if (ir->o == IR_CNEWI) {
> > -    RegSet allow = (RSET_GPR & ~RSET_SCRATCH);
> >  #if LJ_32
> >      int32_t ofs = sizeof(GCcdata);
> >      if (sz == 8) {
> > @@ -1473,15 +1474,16 @@ static void asm_cnew(ASMState *as, IRIns *ir)
> >      return;
> >    }
> >  
> > +  tmp = ra_scratch(as, allow);
> 
> Since there are registers allocated in scope of IR_CNEWI assembling
> above, you need to exclude those registers from <allow> set prior to
> scratching a new one.

I glanced <ra_scratch> once more: the register to be yielded is picked
from the intersection between the not used set and allowed set (which is
quite obvious), so there is no problem. Disregard the comment above.

> 
> >    /* Code incrementing cdatanum is sparse to avoid mips data hazards. */

<snipped>

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH v4 1/2] core: introduce various platform metrics
  2020-10-07 20:16       ` Igor Munkin
  2020-10-08  9:28         ` Igor Munkin
@ 2020-10-08 10:11         ` Sergey Kaplun
  2020-10-08 12:44           ` Igor Munkin
  1 sibling, 1 reply; 26+ messages in thread
From: Sergey Kaplun @ 2020-10-08 10:11 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches

Igor,

On 07.10.20, Igor Munkin wrote:
> Sergey,
> 
> Thanks for your fixes! There is still a comment regarding CNEW
> assembling and a couple minors below.
> 
> On 07.10.20, Sergey Kaplun wrote:
> > On 07.10.20, Igor Munkin wrote:
> > > Sergey,
> > > 
> > > Thanks for the patch! Please consider my comments below.
> > > 
> > > On 05.10.20, Sergey Kaplun wrote:
> 
> <snipped>
> 
> > 
> > > 
> > > > +  emit_setgl(as, RID_RET+2, gc.cdatanum);
> > > 
> > > Well, I glanced a MIPS register-usage convention and AFAICS $4 register
> > > (RID_RET + 2) is a general-purpose (i.e. doesn't store 0 or preserved by
> > > kernel) caller-safe one. Ergo it should be allocated it in a proper way
> > > from scratch set, shouldn't it?
> > > 
> > 
> > AFAIK, $a0 - $a3 ($4 - $7) registers are arguments to functions - not
> > preserved by subprograms.
> 
> Yes, but there is e.g. $8, that is temporary one, isn't it? Anyway, you

Yes, for example.
Side note: We can omit the call to the register allocator, since these
registers can change during the call, but in order not to be tied to ABI
we can use a lightweight `ra_scratch`, indeed.

> can't just pick the particular register, since it can be already
> allocated by RA. So it *has* to be explicitly allocated to avoid data
> clash on the trace. I strongly believe the reason you see no failure on
> tests is simply a lucky coincidence (or tiny traces).
> 
> > But anyway explicit allocation is better here. Added.
> > 
> > > >    /* Initialize gct and ctypeid. lj_mem_newgco() already sets marked. */
> > 
> 
> <snipped>
> 
> > 
> > I've changed commit message as follows:
> > 
> > ===================================================================
> > core: introduce various platform metrics
> > 
> > This patch introduces the following counters:
> >   - overall amount of allocated tables, cdata and udata objects
> >   - number of incremental GC steps grouped by GC state
> >   - number of string hashes hits and misses
> >   - amount of allocated and freed memory
> >   - number of trace aborts, number of traces and restored snapshots
> > 
> > Also this patch fixes alignment for 64-bit architectures.
> > 
> > NB: MSize and BCIns are the only fixed types that equal 32 bits. GCRef,
> > MRef and GCSize sizes depend on LJ_GC64 define.
> > 
> > struct GCState is terminated by three fields: GCSize estimate, MSize
> > stepmul and MSize pause, which are aligned. The introduces size_t
> 
> Typo: s/introduces/introduced/.

Thanks! Updated on branch.

> 
> > fields do not violate the alignment too.

<snipped>

> > ===================================================================
> > 
> > [1]: https://alisdair.mcdiarmid.org/arm-immediate-value-encoding/
> > 
> > -- 
> > Best regards,
> > Sergey Kaplun
> 
> -- 
> Best regards,
> IM

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH v4 1/2] core: introduce various platform metrics
  2020-10-08 10:11         ` Sergey Kaplun
@ 2020-10-08 12:44           ` Igor Munkin
  2020-10-09 14:39             ` Sergey Ostanevich
  0 siblings, 1 reply; 26+ messages in thread
From: Igor Munkin @ 2020-10-08 12:44 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Sergey,

Thanks, this patch also LGTM.

On 08.10.20, Sergey Kaplun wrote:
> Igor,
> 

<snipped>

> > > > > +  emit_setgl(as, RID_RET+2, gc.cdatanum);
> > > > 
> > > > Well, I glanced a MIPS register-usage convention and AFAICS $4 register
> > > > (RID_RET + 2) is a general-purpose (i.e. doesn't store 0 or preserved by
> > > > kernel) caller-safe one. Ergo it should be allocated it in a proper way
> > > > from scratch set, shouldn't it?
> > > > 
> > > 
> > > AFAIK, $a0 - $a3 ($4 - $7) registers are arguments to functions - not
> > > preserved by subprograms.
> > 
> > Yes, but there is e.g. $8, that is temporary one, isn't it? Anyway, you
> 
> Yes, for example.
> Side note: We can omit the call to the register allocator, since these
> registers can change during the call, but in order not to be tied to ABI
> we can use a lightweight `ra_scratch`, indeed.

Agree here: you can pick any of caller-safe registers instead of
scratching it. All values contained in these registers should be
preserved prior to call, so nothing is broken in your previous version
(i.e. not a lucky coincidence). However, using explicit scratching here
looks at least more consistent to me.

> 
> > can't just pick the particular register, since it can be already
> > allocated by RA. So it *has* to be explicitly allocated to avoid data
> > clash on the trace. I strongly believe the reason you see no failure on
> > tests is simply a lucky coincidence (or tiny traces).
> > 
> > > But anyway explicit allocation is better here. Added.
> > > 

<snipped>

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [RFC v4] rfc: luajit metrics
  2020-10-05  6:30 ` [Tarantool-patches] [RFC v4] rfc: luajit metrics Sergey Kaplun
  2020-10-07 14:46   ` Sergey Kaplun
@ 2020-10-08 17:25   ` Igor Munkin
  2020-10-08 19:29     ` Sergey Kaplun
  2020-12-22  9:07   ` Kirill Yukhin
  2 siblings, 1 reply; 26+ messages in thread
From: Igor Munkin @ 2020-10-08 17:25 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Sergey,

Thanks for the updates! Considering the changes in the follow-up reply,
there are still several comments below.

On 05.10.20, Sergey Kaplun wrote:
> Part of #5187
> ---
> 
> This patch adds RFC to LuaJIT metrics interfaces. Nevertheless name
> `misc` for builtin library is not good and should be discussed, because
> tons of user modules can use that name for their own libraries.
> 
> Branch: https://github.com/tarantool/tarantool/tree/skaplun/5187-luajit-metrics
> Issue: https://github.com/tarantool/tarantool/issues/5187
> 
> Changes in v2:
> - Fixed typos
> - Made comments more verbose
> - Avoided flushing any of metrics after each call of luaM_metrics()
> Changes in v3:
> - Added colors count metrics description
> - Added description about how metrics are collected
> - Added benchmarks
> Changes in v3:
> - Removed colors count metrics
> - Added code block language
> - Added how to use section

Minor: ChangeLog is misordered (the latest changes are the first entry).

> 
>  doc/rfc/5187-luajit-metrics.md | 299 +++++++++++++++++++++++++++++++++
>  1 file changed, 299 insertions(+)
>  create mode 100644 doc/rfc/5187-luajit-metrics.md
> 
> diff --git a/doc/rfc/5187-luajit-metrics.md b/doc/rfc/5187-luajit-metrics.md
> new file mode 100644
> index 000000000..02f5b559f
> --- /dev/null
> +++ b/doc/rfc/5187-luajit-metrics.md
> @@ -0,0 +1,299 @@

<snipped>

> +Couple of words about how metrics are collected:
> +- `strhash_*` -- whenever existing string is returned after attempt to
> +  create new string there is incremented `strhash_hit` counter, if new string
> +  created then `strhash_miss` is incremented instead.

Minor: I propose to reword it the way similar you've updated it in the
luam_Metrics comments.

> +- `gc_*num`, `jit_trace_num` -- corresponding counter incremented whenever new

Typo: s/whenever new/whenever a new/.

> +  object is allocated. When object become garbage collected its counter is

Typo: s/become garbage collected/is collected by GC/.

> +  decremented.
> +- `gc_total`, `gc_allocated`, `gc_freed` -- any time when allocation function
> +  is called `gc_allocated` and/or `gc_freed` is increased and `gc_total`
> +  increase when memory is allocated or reallocated, decrease when memory is

Typo: s/increase/is increased/.
Typo: s/decrease/is decreased/.

> +  freed.
> +- `gc_steps_*` -- corresponding counter increments whenever Garbage Collector

Typo: s/increments/is incremented/.

> +  starts to execute 1 step of garbage collection.

Minor: I propose s/1/an incremental/.

> +- `jit_snap_restore` -- whenever JIT machine exits from the trace and restores
> +  interpreter state `jit_snap_restore` counter is incremented.
> +- `jit_trace_abort` -- whenever JIT compiler can't record the trace in case NYI
> +  BC this counter is incremented.

Minor: NYI relates also to builtins, not only to bytecodes.

> +- `jit_mcode_size` -- whenever new MCode area is allocated `jit_mcode_size` is
> +  increased at corresponding size in bytes. Sets to 0 when all mcode area is
> +  freed.

How does it change, when a trace is collected as a result of its flush?

> +
> +All metrics are collected throughout the platform uptime. These metrics
> +increase monotonically and can overflow:
> +  - `strhash_hit`
> +  - `strhash_miss`
> +  - `gc_freed`
> +  - `gc_allocated`,

Typo: Excess comma.

> +  - `gc_steps_pause`
> +  - `gc_steps_propagate`
> +  - `gc_steps_atomic`
> +  - `gc_steps_sweepstring`
> +  - `gc_steps_sweep`
> +  - `gc_steps_finalize`
> +  - `jit_snap_restore`
> +  - `jit_trace_abort`

<snipped>

> +## How to use
> +
> +This section describes small example of metrics usage.
> +
> +For example amount of `strhash_misses` can be shown for tracking of new string
> +objects allocations. For example if we add code like:
> +```lua
> +local function sharded_storage_func(storage_name, func_name)
> +    return 'sharded_storage.storages.' .. storage_name .. '.' .. func_name
> +end
> +```
> +increase in slope curve of `strhash_misses` means, that after your changes
> +there are more new strings allocating at runtime. Of course slope curve of
> +`strhash_misses` _should_ be less than slope curve of `strhash_hits`. If it
> +is not, you should refactor your code.

Minor: I guess it's not a good idea to suggest 'code refactoring'. This
section is good to describe the values being observed, so the first part
about tilt angle is enough.

> +
> +Slope curves of `gc_freed` and `gc_allocated` can be used for analysis of GC
> +pressure of your application (less is better).
> +
> +Also we can check some hacky optimization with these metrics. For example let's
> +assume that we have this code snippet:
> +```lua
> +local old_metrics = misc.getmetrics()
> +local t = {}
> +for i = 1, 513 do
> +    t[i] = i
> +end
> +local new_metrics = misc.getmetrics()
> +local diff =  new_metrics.gc_allocated - old_metrics.gc_allocated
> +```
> +`diff` equals to 18879 after running of this chunk.
> +
> +But if we change table initialization to
> +```lua
> +local table_new = require "table.new"
> +local old_metrics = misc.getmetrics()
> +local t = table_new(513,0)
> +for i = 1, 513 do
> +    t[i] = i
> +end
> +local new_metrics = misc.getmetrics()
> +local diff =  new_metrics.gc_allocated - old_metrics.gc_allocated
> +```
> +`diff` shows us only 5895.
> +
> +Slope curves of `gc_steps_*` can be used for tracking of GC pressure too. For
> +long time observations you will see periodic increment for `gc_steps_*` metrics
> +-- for example longer period of `gc_steps_atomic` increment is better. Also
> +additional amount of `gc_steps_propagate` in one period can be used to
> +indirectly estimate amount of objects.

These values also correlate with the <stepmul> value. The amount of
incremental steps can grow, but one step can process a small amount of
GCobjects. So these metrics should be considered together with GC setup.

> +
> +Amount of `gc_*num` is useful for control of memory leaks -- totally amount of

Typo: s/totally/total/.

> +these objects should not growth nonstop (you can also track `gc_total` for
> +this).  Also `jit_mcode_size` can be used for tracking amount of allocated

Typo: double space prior to "Also".

> +memory for traces machine code.
> +
> +Slope curves of `jit_trace_abort` shows how many times trace hasn't been
> +compiled when the attempt was made (less is better).
> +
> +Amount of `gc_trace_num` is shown how much traces was generated (_usually_
> +more is better).
> +
> +And the last one -- `gc_snap_restores` can be used for estimation when LuaJIT
> +is stop trace execution. If slope curves of this metric growth after changing
> +old code it can mean performance degradation.
> +
> +Assumes that we have code like this:
> +```lua
> +local function foo(i)
> +    return i <= 5 and i or tostring(i)
> +end
> +-- minstitch option needs to emulate nonstitching behaviour
> +jit.opt.start(0, "hotloop=2", "hotexit=2", "minstitch=15")
> +
> +local sum = 0
> +local old_metrics = misc.getmetrics()
> +for i = 1, 10 do
> +    sum = sum + foo(i)
> +end
> +local new_metrics = misc.getmetrics()
> +local diff = new_metrics.jit_snap_restore - old_metrics.jit_snap_restore
> +```
> +`diff` equals 3 (1 side exit on loop end, 2 side exits to the interpreter
> +before trace gets hot and compiled) after this chunk of code.
> +
> +And now we decide to change `foo` function like this:
> +```lua
> +local function foo(i)
> +    -- math.fmod is not yet compiled!
> +    return i <= 5 and i or math.fmod(i, 11)
> +end
> +```
> +`diff` equals 6 (1 side exit on loop end, 2 side exits to the interpreter
> +before trace gets hot and compiled an 3 side exits from the root trace could
> +not get compiled) after the same chunk of code.
> +
> +## Benchmarks
> +
> +Benchmarks was taken from repo:

Typo: s/was/were/.

> +[LuaJIT-test-cleanup](https://github.com/LuaJIT/LuaJIT-test-cleanup).
> +
> +Example of usage:
> +```bash
> +/usr/bin/time -f"array3d %U" ./luajit $BENCH_DIR/array3d.lua  300 >/dev/null

Typo: double space prior to "300".

> +```

<snipped>

> -- 
> 2.28.0
> 

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH v4 0/2] Implement LuaJIT platform metrics
  2020-10-05  6:30 [Tarantool-patches] [PATCH v4 0/2] Implement LuaJIT platform metrics Sergey Kaplun
                   ` (2 preceding siblings ...)
  2020-10-05  6:30 ` [Tarantool-patches] [RFC v4] rfc: luajit metrics Sergey Kaplun
@ 2020-10-08 17:33 ` Igor Munkin
  2020-10-13 13:17 ` Kirill Yukhin
  4 siblings, 0 replies; 26+ messages in thread
From: Igor Munkin @ 2020-10-08 17:33 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Sergey,

Thanks for the series. ChangeLog entry is missing.

On 05.10.20, Sergey Kaplun wrote:
> The series consists of 2 patches. The first one adds corresponding
> counters to LuaJIT internal structures. The second provides C and Lua
> API using this counters to collect metrics.
> 
> Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-5187-luajit-metrics
> Issue: https://github.com/tarantool/tarantool/issues/5187
> 

<snipped>

> 

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [RFC v4] rfc: luajit metrics
  2020-10-08 17:25   ` Igor Munkin
@ 2020-10-08 19:29     ` Sergey Kaplun
  2020-10-08 20:26       ` Igor Munkin
  0 siblings, 1 reply; 26+ messages in thread
From: Sergey Kaplun @ 2020-10-08 19:29 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches

Hi, Igor! Thanks for the review!

On 08.10.20, Igor Munkin wrote:
> Sergey,
> 
> Thanks for the updates! Considering the changes in the follow-up reply,
> there are still several comments below.
> 
> On 05.10.20, Sergey Kaplun wrote:
> > Part of #5187
> > ---
> > 
> > This patch adds RFC to LuaJIT metrics interfaces. Nevertheless name
> > `misc` for builtin library is not good and should be discussed, because
> > tons of user modules can use that name for their own libraries.
> > 
> > Branch: https://github.com/tarantool/tarantool/tree/skaplun/5187-luajit-metrics
> > Issue: https://github.com/tarantool/tarantool/issues/5187
> > 
> > Changes in v2:
> > - Fixed typos
> > - Made comments more verbose
> > - Avoided flushing any of metrics after each call of luaM_metrics()
> > Changes in v3:
> > - Added colors count metrics description
> > - Added description about how metrics are collected
> > - Added benchmarks
> > Changes in v3:
> > - Removed colors count metrics
> > - Added code block language
> > - Added how to use section
> 
> Minor: ChangeLog is misordered (the latest changes are the first entry).

Thanks!

Forgot to add ChangeLog to patch:
@ChangeLog:
 * Add Lua and C API for LuaJIT platform metrics about:
   - overall amount of allocated tables, cdata and udata objects
   - number of incremental GC steps grouped by GC state
   - number of string hashes hits and misses
   - amount of allocated and freed memory
   - number of trace aborts, number of traces and restored snapshots

> 
> > 
> >  doc/rfc/5187-luajit-metrics.md | 299 +++++++++++++++++++++++++++++++++
> >  1 file changed, 299 insertions(+)
> >  create mode 100644 doc/rfc/5187-luajit-metrics.md
> > 
> > diff --git a/doc/rfc/5187-luajit-metrics.md b/doc/rfc/5187-luajit-metrics.md
> > new file mode 100644
> > index 000000000..02f5b559f
> > --- /dev/null
> > +++ b/doc/rfc/5187-luajit-metrics.md
> > @@ -0,0 +1,299 @@
> 
> <snipped>
> 
> > +Couple of words about how metrics are collected:
> > +- `strhash_*` -- whenever existing string is returned after attempt to
> > +  create new string there is incremented `strhash_hit` counter, if new string
> > +  created then `strhash_miss` is incremented instead.
> 
> Minor: I propose to reword it the way similar you've updated it in the
> luam_Metrics comments.

Yep, sure!

> 
> > +- `gc_*num`, `jit_trace_num` -- corresponding counter incremented whenever new
> 
> Typo: s/whenever new/whenever a new/.

Hawkeye!

> 
> > +  object is allocated. When object become garbage collected its counter is
> 
> Typo: s/become garbage collected/is collected by GC/.

Thanks! Fixed.

> 
> > +  decremented.
> > +- `gc_total`, `gc_allocated`, `gc_freed` -- any time when allocation function
> > +  is called `gc_allocated` and/or `gc_freed` is increased and `gc_total`
> > +  increase when memory is allocated or reallocated, decrease when memory is
> 
> Typo: s/increase/is increased/.
> Typo: s/decrease/is decreased/.

Thanks! Fixed.

> 
> > +  freed.
> > +- `gc_steps_*` -- corresponding counter increments whenever Garbage Collector
> 
> Typo: s/increments/is incremented/.

Thanks!

> 
> > +  starts to execute 1 step of garbage collection.
> 
> Minor: I propose s/1/an incremental/.

No problem!

> 
> > +- `jit_snap_restore` -- whenever JIT machine exits from the trace and restores
> > +  interpreter state `jit_snap_restore` counter is incremented.
> > +- `jit_trace_abort` -- whenever JIT compiler can't record the trace in case NYI
> > +  BC this counter is incremented.
> 
> Minor: NYI relates also to builtins, not only to bytecodes.

Ok. Fixed.

> 
> > +- `jit_mcode_size` -- whenever new MCode area is allocated `jit_mcode_size` is
> > +  increased at corresponding size in bytes. Sets to 0 when all mcode area is
> > +  freed.
> 
> How does it change, when a trace is collected as a result of its flush?

It doesn't. IINM, this area will be reused later for other traces.
MCode area is linked with jit_State not with trace by itself. Trace just
reserve MCode area that needed.

> 
> > +
> > +All metrics are collected throughout the platform uptime. These metrics
> > +increase monotonically and can overflow:
> > +  - `strhash_hit`
> > +  - `strhash_miss`
> > +  - `gc_freed`
> > +  - `gc_allocated`,
> 
> Typo: Excess comma.

Hawkeye!

> 
> > +  - `gc_steps_pause`
> > +  - `gc_steps_propagate`
> > +  - `gc_steps_atomic`
> > +  - `gc_steps_sweepstring`
> > +  - `gc_steps_sweep`
> > +  - `gc_steps_finalize`
> > +  - `jit_snap_restore`
> > +  - `jit_trace_abort`
> 
> <snipped>
> 
> > +## How to use
> > +
> > +This section describes small example of metrics usage.
> > +
> > +For example amount of `strhash_misses` can be shown for tracking of new string
> > +objects allocations. For example if we add code like:
> > +```lua
> > +local function sharded_storage_func(storage_name, func_name)
> > +    return 'sharded_storage.storages.' .. storage_name .. '.' .. func_name
> > +end
> > +```
> > +increase in slope curve of `strhash_misses` means, that after your changes
> > +there are more new strings allocating at runtime. Of course slope curve of
> > +`strhash_misses` _should_ be less than slope curve of `strhash_hits`. If it
> > +is not, you should refactor your code.
> 
> Minor: I guess it's not a good idea to suggest 'code refactoring'. This
> section is good to describe the values being observed, so the first part
> about tilt angle is enough.

OK, dropped it!

> 
> > +
> > +Slope curves of `gc_freed` and `gc_allocated` can be used for analysis of GC
> > +pressure of your application (less is better).
> > +
> > +Also we can check some hacky optimization with these metrics. For example let's
> > +assume that we have this code snippet:
> > +```lua
> > +local old_metrics = misc.getmetrics()
> > +local t = {}
> > +for i = 1, 513 do
> > +    t[i] = i
> > +end
> > +local new_metrics = misc.getmetrics()
> > +local diff =  new_metrics.gc_allocated - old_metrics.gc_allocated

Found double space prior new_metrics.gc_allocated. Fixed.

> > +```
> > +`diff` equals to 18879 after running of this chunk.
> > +
> > +But if we change table initialization to
> > +```lua
> > +local table_new = require "table.new"
> > +local old_metrics = misc.getmetrics()
> > +local t = table_new(513,0)
> > +for i = 1, 513 do
> > +    t[i] = i
> > +end
> > +local new_metrics = misc.getmetrics()
> > +local diff =  new_metrics.gc_allocated - old_metrics.gc_allocated

Here too.

> > +```
> > +`diff` shows us only 5895.
> > +
> > +Slope curves of `gc_steps_*` can be used for tracking of GC pressure too. For
> > +long time observations you will see periodic increment for `gc_steps_*` metrics
> > +-- for example longer period of `gc_steps_atomic` increment is better. Also
> > +additional amount of `gc_steps_propagate` in one period can be used to
> > +indirectly estimate amount of objects.
> 
> These values also correlate with the <stepmul> value. The amount of
> incremental steps can grow, but one step can process a small amount of
> GCobjects. So these metrics should be considered together with GC setup.

Thanks! Add this note to RFC!

> 
> > +
> > +Amount of `gc_*num` is useful for control of memory leaks -- totally amount of
> 
> Typo: s/totally/total/.

Fixed! Thanks!

> 
> > +these objects should not growth nonstop (you can also track `gc_total` for
> > +this).  Also `jit_mcode_size` can be used for tracking amount of allocated
> 
> Typo: double space prior to "Also".

Yep. Fixed!
Side note: Too unpredictable autoformatting by `gq` :\

> 
> > +memory for traces machine code.
> > +
> > +Slope curves of `jit_trace_abort` shows how many times trace hasn't been
> > +compiled when the attempt was made (less is better).
> > +
> > +Amount of `gc_trace_num` is shown how much traces was generated (_usually_
> > +more is better).
> > +
> > +And the last one -- `gc_snap_restores` can be used for estimation when LuaJIT
> > +is stop trace execution. If slope curves of this metric growth after changing
> > +old code it can mean performance degradation.
> > +
> > +Assumes that we have code like this:
> > +```lua
> > +local function foo(i)
> > +    return i <= 5 and i or tostring(i)
> > +end
> > +-- minstitch option needs to emulate nonstitching behaviour
> > +jit.opt.start(0, "hotloop=2", "hotexit=2", "minstitch=15")
> > +
> > +local sum = 0
> > +local old_metrics = misc.getmetrics()
> > +for i = 1, 10 do
> > +    sum = sum + foo(i)
> > +end
> > +local new_metrics = misc.getmetrics()
> > +local diff = new_metrics.jit_snap_restore - old_metrics.jit_snap_restore
> > +```
> > +`diff` equals 3 (1 side exit on loop end, 2 side exits to the interpreter
> > +before trace gets hot and compiled) after this chunk of code.
> > +
> > +And now we decide to change `foo` function like this:
> > +```lua
> > +local function foo(i)
> > +    -- math.fmod is not yet compiled!
> > +    return i <= 5 and i or math.fmod(i, 11)
> > +end
> > +```
> > +`diff` equals 6 (1 side exit on loop end, 2 side exits to the interpreter
> > +before trace gets hot and compiled an 3 side exits from the root trace could
> > +not get compiled) after the same chunk of code.
> > +
> > +## Benchmarks
> > +
> > +Benchmarks was taken from repo:
> 
> Typo: s/was/were/.

Thanks, fixed!

> 
> > +[LuaJIT-test-cleanup](https://github.com/LuaJIT/LuaJIT-test-cleanup).
> > +
> > +Example of usage:
> > +```bash
> > +/usr/bin/time -f"array3d %U" ./luajit $BENCH_DIR/array3d.lua  300 >/dev/null
> 
> Typo: double space prior to "300".

Thanks, fixed!

> 
> > +```
> 
> <snipped>
> 
> > -- 
> > 2.28.0
> > 
> 
> -- 
> Best regards,
> IM

See iterative patch in the bottom. Branch force-pushed.

===================================================================
diff --git a/doc/rfc/5187-luajit-metrics.md b/doc/rfc/5187-luajit-metrics.md
index 988b049fb..0c1df6901 100644
--- a/doc/rfc/5187-luajit-metrics.md
+++ b/doc/rfc/5187-luajit-metrics.md
@@ -89,22 +89,23 @@ struct luam_Metrics {
 ```
 
 Couple of words about how metrics are collected:
-- `strhash_*` -- whenever existing string is returned after attempt to
-  create new string there is incremented `strhash_hit` counter, if new string
-  created then `strhash_miss` is incremented instead.
-- `gc_*num`, `jit_trace_num` -- corresponding counter incremented whenever new
-  object is allocated. When object become garbage collected its counter is
+- `strhash_*` -- whenever the string with the same payload is found, so a new
+  one is not created/allocated, there is incremented `strhash_hit` counter, if
+  a new one string created/allocated then `strhash_miss` is incremented
+  instead.
+- `gc_*num`, `jit_trace_num` -- corresponding counter is incremented whenever a
+  new object is allocated. When object is collected by GC its counter is
   decremented.
 - `gc_total`, `gc_allocated`, `gc_freed` -- any time when allocation function
-  is called `gc_allocated` and/or `gc_freed` is increased and `gc_total`
-  increase when memory is allocated or reallocated, decrease when memory is
-  freed.
-- `gc_steps_*` -- corresponding counter increments whenever Garbage Collector
-  starts to execute 1 step of garbage collection.
+  is called `gc_allocated` and/or `gc_freed` is increased and `gc_total` is
+  increased when memory is allocated or reallocated, is decreased when memory
+  is freed.
+- `gc_steps_*` -- corresponding counter is incremented whenever Garbage
+  Collector starts to execute an incremental step of garbage collection.
 - `jit_snap_restore` -- whenever JIT machine exits from the trace and restores
   interpreter state `jit_snap_restore` counter is incremented.
 - `jit_trace_abort` -- whenever JIT compiler can't record the trace in case NYI
-  BC this counter is incremented.
+  BC or builtins this counter is incremented.
 - `jit_mcode_size` -- whenever new MCode area is allocated `jit_mcode_size` is
   increased at corresponding size in bytes. Sets to 0 when all mcode area is
   freed.
@@ -114,7 +115,7 @@ increase monotonically and can overflow:
   - `strhash_hit`
   - `strhash_miss`
   - `gc_freed`
-  - `gc_allocated`,
+  - `gc_allocated`
   - `gc_steps_pause`
   - `gc_steps_propagate`
   - `gc_steps_atomic`
@@ -173,8 +174,7 @@ end
 ```
 increase in slope curve of `strhash_misses` means, that after your changes
 there are more new strings allocating at runtime. Of course slope curve of
-`strhash_misses` _should_ be less than slope curve of `strhash_hits`. If it
-is not, you should refactor your code.
+`strhash_misses` _should_ be less than slope curve of `strhash_hits`.
 
 Slope curves of `gc_freed` and `gc_allocated` can be used for analysis of GC
 pressure of your application (less is better).
@@ -188,7 +188,7 @@ for i = 1, 513 do
     t[i] = i
 end
 local new_metrics = misc.getmetrics()
-local diff =  new_metrics.gc_allocated - old_metrics.gc_allocated
+local diff = new_metrics.gc_allocated - old_metrics.gc_allocated
 ```
 `diff` equals to 18879 after running of this chunk.
 
@@ -201,7 +201,7 @@ for i = 1, 513 do
     t[i] = i
 end
 local new_metrics = misc.getmetrics()
-local diff =  new_metrics.gc_allocated - old_metrics.gc_allocated
+local diff = new_metrics.gc_allocated - old_metrics.gc_allocated
 ```
 `diff` shows us only 5895.
 
@@ -209,11 +209,14 @@ Slope curves of `gc_steps_*` can be used for tracking of GC pressure too. For
 long time observations you will see periodic increment for `gc_steps_*` metrics
 -- for example longer period of `gc_steps_atomic` increment is better. Also
 additional amount of `gc_steps_propagate` in one period can be used to
-indirectly estimate amount of objects.
+indirectly estimate amount of objects. These values also correlate with the
+step multiplier of the GC. The amount of incremental steps can grow, but
+one step can process a small amount of objects. So these metrics should be
+considered together with GC setup.
 
-Amount of `gc_*num` is useful for control of memory leaks -- totally amount of
+Amount of `gc_*num` is useful for control of memory leaks -- total amount of
 these objects should not growth nonstop (you can also track `gc_total` for
-this).  Also `jit_mcode_size` can be used for tracking amount of allocated
+this). Also `jit_mcode_size` can be used for tracking amount of allocated
 memory for traces machine code.
 
 Slope curves of `jit_trace_abort` shows how many times trace hasn't been
@@ -258,12 +261,12 @@ not get compiled) after the same chunk of code.
 
 ## Benchmarks
 
-Benchmarks was taken from repo:
+Benchmarks were taken from repo:
 [LuaJIT-test-cleanup](https://github.com/LuaJIT/LuaJIT-test-cleanup).
 
 Example of usage:
 ```bash
-/usr/bin/time -f"array3d %U" ./luajit $BENCH_DIR/array3d.lua  300 >/dev/null
+/usr/bin/time -f"array3d %U" ./luajit $BENCH_DIR/array3d.lua 300 >/dev/null
 ```
 Taking into account the measurement error ~ 2%, it can be said that there is no
 difference in the performance.
===================================================================

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [RFC v4] rfc: luajit metrics
  2020-10-08 19:29     ` Sergey Kaplun
@ 2020-10-08 20:26       ` Igor Munkin
  2020-10-09  6:06         ` Sergey Kaplun
  0 siblings, 1 reply; 26+ messages in thread
From: Igor Munkin @ 2020-10-08 20:26 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Sergey,

Thanks, the RFC LGTM in general now, but please consider the last minor
comments below.

On 08.10.20, Sergey Kaplun wrote:
> Hi, Igor! Thanks for the review!
> 
> On 08.10.20, Igor Munkin wrote:
> > Sergey,
> > 

<snipped>

> > 
> > Minor: ChangeLog is misordered (the latest changes are the first entry).
> 
> Thanks!
> 
> Forgot to add ChangeLog to patch:
> @ChangeLog:
>  * Add Lua and C API for LuaJIT platform metrics about:
>    - overall amount of allocated tables, cdata and udata objects
>    - number of incremental GC steps grouped by GC state
>    - number of string hashes hits and misses
>    - amount of allocated and freed memory
>    - number of trace aborts, number of traces and restored snapshots

The related issue should be also mentioned here.

> 
> > 
> > > 
> > >  doc/rfc/5187-luajit-metrics.md | 299 +++++++++++++++++++++++++++++++++
> > >  1 file changed, 299 insertions(+)
> > >  create mode 100644 doc/rfc/5187-luajit-metrics.md
> > > 
> > > diff --git a/doc/rfc/5187-luajit-metrics.md b/doc/rfc/5187-luajit-metrics.md
> > > new file mode 100644
> > > index 000000000..02f5b559f
> > > --- /dev/null
> > > +++ b/doc/rfc/5187-luajit-metrics.md
> > > @@ -0,0 +1,299 @@
> > 

<snipped>

> 
> > 
> > > +- `jit_mcode_size` -- whenever new MCode area is allocated `jit_mcode_size` is
> > > +  increased at corresponding size in bytes. Sets to 0 when all mcode area is
> > > +  freed.
> > 
> > How does it change, when a trace is collected as a result of its flush?
> 
> It doesn't. IINM, this area will be reused later for other traces.
> MCode area is linked with jit_State not with trace by itself. Trace just
> reserve MCode area that needed.

I guess this should be explicitly mentioned here then.

> 

<snipped>

> 
> -- 
> Best regards,
> Sergey Kaplun

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [RFC v4] rfc: luajit metrics
  2020-10-08 20:26       ` Igor Munkin
@ 2020-10-09  6:06         ` Sergey Kaplun
  0 siblings, 0 replies; 26+ messages in thread
From: Sergey Kaplun @ 2020-10-09  6:06 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches

Igor,

On 08.10.20, Igor Munkin wrote:
> Sergey,
> 
> Thanks, the RFC LGTM in general now, but please consider the last minor
> comments below.
> 
> On 08.10.20, Sergey Kaplun wrote:
> > Hi, Igor! Thanks for the review!
> > 
> > On 08.10.20, Igor Munkin wrote:
> > > Sergey,
> > > 
> 
> <snipped>
> 
> > > 
> > > Minor: ChangeLog is misordered (the latest changes are the first entry).
> > 
> > Thanks!
> > 
> > Forgot to add ChangeLog to patch:
> > @ChangeLog:
> >  * Add Lua and C API for LuaJIT platform metrics about:
> >    - overall amount of allocated tables, cdata and udata objects
> >    - number of incremental GC steps grouped by GC state
> >    - number of string hashes hits and misses
> >    - amount of allocated and freed memory
> >    - number of trace aborts, number of traces and restored snapshots
> 
> The related issue should be also mentioned here.

Removed bullets as we had discussed offline.

@ChangeLog:
 * Introduced LuaJIT platform metrics (gh-5187).

> 
> > 
> > > 
> > > > 
> > > >  doc/rfc/5187-luajit-metrics.md | 299 +++++++++++++++++++++++++++++++++
> > > >  1 file changed, 299 insertions(+)
> > > >  create mode 100644 doc/rfc/5187-luajit-metrics.md
> > > > 
> > > > diff --git a/doc/rfc/5187-luajit-metrics.md b/doc/rfc/5187-luajit-metrics.md
> > > > new file mode 100644
> > > > index 000000000..02f5b559f
> > > > --- /dev/null
> > > > +++ b/doc/rfc/5187-luajit-metrics.md
> > > > @@ -0,0 +1,299 @@
> > > 
> 
> <snipped>
> 
> > 
> > > 
> > > > +- `jit_mcode_size` -- whenever new MCode area is allocated `jit_mcode_size` is
> > > > +  increased at corresponding size in bytes. Sets to 0 when all mcode area is
> > > > +  freed.
> > > 
> > > How does it change, when a trace is collected as a result of its flush?
> > 
> > It doesn't. IINM, this area will be reused later for other traces.
> > MCode area is linked with jit_State not with trace by itself. Trace just
> > reserve MCode area that needed.
> 
> I guess this should be explicitly mentioned here then.

OK, added. See iterative patch in the bottom. Branch force-pushed.

> 
> > 
> 
> <snipped>
> 
> > 
> > -- 
> > Best regards,
> > Sergey Kaplun
> 
> -- 
> Best regards,
> IM

===================================================================
diff --git a/doc/rfc/5187-luajit-metrics.md b/doc/rfc/5187-luajit-metrics.md
index 0c1df6901..db9ec7ee7 100644
--- a/doc/rfc/5187-luajit-metrics.md
+++ b/doc/rfc/5187-luajit-metrics.md
@@ -108,7 +108,9 @@ Couple of words about how metrics are collected:
   BC or builtins this counter is incremented.
 - `jit_mcode_size` -- whenever new MCode area is allocated `jit_mcode_size` is
   increased at corresponding size in bytes. Sets to 0 when all mcode area is
-  freed.
+  freed. When a trace is collected by GC this value doesn't change. This area
+  will be reused later for other traces. MCode area is linked with `jit_State`
+  not with trace by itself. Traces just reserve MCode area that needed.
 
 All metrics are collected throughout the platform uptime. These metrics
 increase monotonically and can overflow:
===================================================================

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH v4 1/2] core: introduce various platform metrics
  2020-10-08 12:44           ` Igor Munkin
@ 2020-10-09 14:39             ` Sergey Ostanevich
  0 siblings, 0 replies; 26+ messages in thread
From: Sergey Ostanevich @ 2020-10-09 14:39 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches

Hi!

Thanks for the patch. Although I was puzzled by your conversation over
MIPS registers use, stalled at the first reply 'added scratch register
allocation'. And I see the same first version in the branch. 

The hand-scheduled MIPS code is cool also! 

LGTM, I didn't manage to catch any inconsistence in inc/dec of stats.

Sergos.


On 08 окт 15:44, Igor Munkin wrote:
> Sergey,
> 
> Thanks, this patch also LGTM.
> 
> On 08.10.20, Sergey Kaplun wrote:
> > Igor,
> > 
> 
> <snipped>
> 
> > > > > > +  emit_setgl(as, RID_RET+2, gc.cdatanum);
> > > > > 
> > > > > Well, I glanced a MIPS register-usage convention and AFAICS $4 register
> > > > > (RID_RET + 2) is a general-purpose (i.e. doesn't store 0 or preserved by
> > > > > kernel) caller-safe one. Ergo it should be allocated it in a proper way
> > > > > from scratch set, shouldn't it?
> > > > > 
> > > > 
> > > > AFAIK, $a0 - $a3 ($4 - $7) registers are arguments to functions - not
> > > > preserved by subprograms.
> > > 
> > > Yes, but there is e.g. $8, that is temporary one, isn't it? Anyway, you
> > 
> > Yes, for example.
> > Side note: We can omit the call to the register allocator, since these
> > registers can change during the call, but in order not to be tied to ABI
> > we can use a lightweight `ra_scratch`, indeed.
> 
> Agree here: you can pick any of caller-safe registers instead of
> scratching it. All values contained in these registers should be
> preserved prior to call, so nothing is broken in your previous version
> (i.e. not a lucky coincidence). However, using explicit scratching here
> looks at least more consistent to me.
> 
> > 
> > > can't just pick the particular register, since it can be already
> > > allocated by RA. So it *has* to be explicitly allocated to avoid data
> > > clash on the trace. I strongly believe the reason you see no failure on
> > > tests is simply a lucky coincidence (or tiny traces).
> > > 
> > > > But anyway explicit allocation is better here. Added.
> > > > 
> 
> <snipped>
> 
> -- 
> Best regards,
> IM

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH v4 2/2] misc: add C and Lua API for platform metrics
  2020-10-05  6:30 ` [Tarantool-patches] [PATCH v4 2/2] misc: add C and Lua API for " Sergey Kaplun
  2020-10-06 22:17   ` Igor Munkin
@ 2020-10-09 14:45   ` Sergey Ostanevich
  2020-10-13  6:01     ` Sergey Kaplun
  1 sibling, 1 reply; 26+ messages in thread
From: Sergey Ostanevich @ 2020-10-09 14:45 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi!

Thanks for the patch!

I think it introduces a new user-visible interface, so we need a 
@TarantoolBot document entry here?

Otherwise LGTM.

Sergos


On 05 окт 09:30, Sergey Kaplun wrote:
> This patch introduces both C and Lua API for LuaJIT platform
> metrics implemented in scope of the previous patch. New <lmisclib.h>
> header provides <luaM_metrics> C interface that fills the given
> <luam_metrics> structure with the platform metrics. Additionally
> <misc> module is loaded to Lua space and provides <getmetrics>
> method that yields the corresponding metrics via table.
> 
> Part of tarantool/tarantool#5187
> ---
>  Makefile                                      |   2 +-
>  src/Makefile                                  |   5 +-
>  src/Makefile.dep                              |  14 +-
>  src/lib_init.c                                |   2 +
>  src/lib_misc.c                                |  72 +++
>  src/lj_mapi.c                                 |  61 +++
>  src/ljamalg.c                                 |   2 +
>  src/lmisclib.h                                |  64 +++
>  src/luaconf.h                                 |   1 +
>  test/clib-misclib-getmetrics.test.lua         | 174 ++++++++
>  test/clib-misclib-getmetrics/CMakeLists.txt   |   1 +
>  test/clib-misclib-getmetrics/testgetmetrics.c | 242 ++++++++++
>  test/lib-misc-getmetrics.test.lua             | 418 ++++++++++++++++++
>  13 files changed, 1050 insertions(+), 8 deletions(-)
>  create mode 100644 src/lib_misc.c
>  create mode 100644 src/lj_mapi.c
>  create mode 100644 src/lmisclib.h
>  create mode 100755 test/clib-misclib-getmetrics.test.lua
>  create mode 100644 test/clib-misclib-getmetrics/CMakeLists.txt
>  create mode 100644 test/clib-misclib-getmetrics/testgetmetrics.c
>  create mode 100755 test/lib-misc-getmetrics.test.lua
> 
> diff --git a/Makefile b/Makefile
> index 0f93308..4a56917 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -84,7 +84,7 @@ FILE_A= libluajit.a
>  FILE_SO= libluajit.so
>  FILE_MAN= luajit.1
>  FILE_PC= luajit.pc
> -FILES_INC= lua.h lualib.h lauxlib.h luaconf.h lua.hpp luajit.h
> +FILES_INC= lua.h lualib.h lauxlib.h luaconf.h lua.hpp luajit.h lmisclib.h
>  FILES_JITLIB= bc.lua bcsave.lua dump.lua p.lua v.lua zone.lua \
>  	      dis_x86.lua dis_x64.lua dis_arm.lua dis_arm64.lua \
>  	      dis_arm64be.lua dis_ppc.lua dis_mips.lua dis_mipsel.lua \
> diff --git a/src/Makefile b/src/Makefile
> index 827d4a4..2786348 100644
> --- a/src/Makefile
> +++ b/src/Makefile
> @@ -480,13 +480,14 @@ LJVM_BOUT= $(LJVM_S)
>  LJVM_MODE= elfasm
>  
>  LJLIB_O= lib_base.o lib_math.o lib_bit.o lib_string.o lib_table.o \
> -	 lib_io.o lib_os.o lib_package.o lib_debug.o lib_jit.o lib_ffi.o
> +	 lib_io.o lib_os.o lib_package.o lib_debug.o lib_jit.o lib_ffi.o \
> +	 lib_misc.o
>  LJLIB_C= $(LJLIB_O:.o=.c)
>  
>  LJCORE_O= lj_gc.o lj_err.o lj_char.o lj_bc.o lj_obj.o lj_buf.o \
>  	  lj_str.o lj_tab.o lj_func.o lj_udata.o lj_meta.o lj_debug.o \
>  	  lj_state.o lj_dispatch.o lj_vmevent.o lj_vmmath.o lj_strscan.o \
> -	  lj_strfmt.o lj_strfmt_num.o lj_api.o lj_profile.o \
> +	  lj_strfmt.o lj_strfmt_num.o lj_api.o lj_mapi.o lj_profile.o \
>  	  lj_lex.o lj_parse.o lj_bcread.o lj_bcwrite.o lj_load.o \
>  	  lj_ir.o lj_opt_mem.o lj_opt_fold.o lj_opt_narrow.o \
>  	  lj_opt_dce.o lj_opt_loop.o lj_opt_split.o lj_opt_sink.o \
> diff --git a/src/Makefile.dep b/src/Makefile.dep
> index 2b1cb5e..556314e 100644
> --- a/src/Makefile.dep
> +++ b/src/Makefile.dep
> @@ -18,7 +18,7 @@ lib_ffi.o: lib_ffi.c lua.h luaconf.h lauxlib.h lualib.h lj_obj.h lj_def.h \
>   lj_ctype.h lj_cparse.h lj_cdata.h lj_cconv.h lj_carith.h lj_ccall.h \
>   lj_ccallback.h lj_clib.h lj_strfmt.h lj_ff.h lj_ffdef.h lj_lib.h \
>   lj_libdef.h
> -lib_init.o: lib_init.c lua.h luaconf.h lauxlib.h lualib.h lj_arch.h
> +lib_init.o: lib_init.c lua.h luaconf.h lauxlib.h lualib.h lmisclib.h lj_arch.h
>  lib_io.o: lib_io.c lua.h luaconf.h lauxlib.h lualib.h lj_obj.h lj_def.h \
>   lj_arch.h lj_gc.h lj_err.h lj_errmsg.h lj_buf.h lj_str.h lj_state.h \
>   lj_strfmt.h lj_ff.h lj_ffdef.h lj_lib.h lj_libdef.h
> @@ -29,6 +29,8 @@ lib_jit.o: lib_jit.c lua.h luaconf.h lauxlib.h lualib.h lj_obj.h lj_def.h \
>   lj_vm.h lj_vmevent.h lj_lib.h luajit.h lj_libdef.h
>  lib_math.o: lib_math.c lua.h luaconf.h lauxlib.h lualib.h lj_obj.h \
>   lj_def.h lj_arch.h lj_lib.h lj_vm.h lj_libdef.h
> +lib_misc.o: lib_misc.c lua.h luaconf.h lmisclib.h lj_obj.h lj_def.h lj_arch.h \
> + lj_str.h lj_tab.h lj_lib.h lj_libdef.h
>  lib_os.o: lib_os.c lua.h luaconf.h lauxlib.h lualib.h lj_obj.h lj_def.h \
>   lj_arch.h lj_gc.h lj_err.h lj_errmsg.h lj_buf.h lj_str.h lj_lib.h \
>   lj_libdef.h
> @@ -137,6 +139,8 @@ lj_lib.o: lj_lib.c lauxlib.h lua.h luaconf.h lj_obj.h lj_def.h lj_arch.h \
>  lj_load.o: lj_load.c lua.h luaconf.h lauxlib.h lj_obj.h lj_def.h \
>   lj_arch.h lj_gc.h lj_err.h lj_errmsg.h lj_buf.h lj_str.h lj_func.h \
>   lj_frame.h lj_bc.h lj_vm.h lj_lex.h lj_bcdump.h lj_parse.h
> +lj_mapi.o: lj_mapi.c lua.h luaconf.h lmisclib.h lj_obj.h lj_def.h lj_arch.h \
> + lj_dispatch.h lj_bc.h lj_jit.h lj_ir.h
>  lj_mcode.o: lj_mcode.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
>   lj_gc.h lj_err.h lj_errmsg.h lj_jit.h lj_ir.h lj_mcode.h lj_trace.h \
>   lj_dispatch.h lj_bc.h lj_traceerr.h lj_vm.h
> @@ -215,9 +219,9 @@ ljamalg.o: ljamalg.c lua.h luaconf.h lauxlib.h lj_gc.c lj_obj.h lj_def.h \
>   lj_func.c lj_udata.c lj_meta.c lj_strscan.h lj_lib.h lj_debug.c \
>   lj_state.c lj_lex.h lj_alloc.h luajit.h lj_dispatch.c lj_ccallback.h \
>   lj_profile.h lj_vmevent.c lj_vmevent.h lj_vmmath.c lj_strscan.c \
> - lj_strfmt.c lj_strfmt_num.c lj_api.c lj_profile.c lj_lex.c lualib.h \
> - lj_parse.h lj_parse.c lj_bcread.c lj_bcdump.h lj_bcwrite.c lj_load.c \
> - lj_ctype.c lj_cdata.c lj_cconv.h lj_cconv.c lj_ccall.c lj_ccall.h \
> + lj_strfmt.c lj_strfmt_num.c lj_api.c lj_mapi.c lmisclib.h lj_profile.c \
> + lj_lex.c lualib.h lj_parse.h lj_parse.c lj_bcread.c lj_bcdump.h lj_bcwrite.c \
> + lj_load.c lj_ctype.c lj_cdata.c lj_cconv.h lj_cconv.c lj_ccall.c lj_ccall.h \
>   lj_ccallback.c lj_target.h lj_target_*.h lj_mcode.h lj_carith.c \
>   lj_carith.h lj_clib.c lj_clib.h lj_cparse.c lj_cparse.h lj_lib.c lj_ir.c \
>   lj_ircall.h lj_iropt.h lj_opt_mem.c lj_opt_fold.c lj_folddef.h \
> @@ -227,7 +231,7 @@ ljamalg.o: ljamalg.c lua.h luaconf.h lauxlib.h lj_gc.c lj_obj.h lj_def.h \
>   lj_emit_*.h lj_asm_*.h lj_trace.c lj_gdbjit.h lj_gdbjit.c lj_alloc.c \
>   lib_aux.c lib_base.c lj_libdef.h lib_math.c lib_string.c lib_table.c \
>   lib_io.c lib_os.c lib_package.c lib_debug.c lib_bit.c lib_jit.c \
> - lib_ffi.c lib_init.c
> + lib_ffi.c lib_misc.c lib_init.c
>  luajit.o: luajit.c lua.h luaconf.h lauxlib.h lualib.h luajit.h lj_arch.h
>  host/buildvm.o: host/buildvm.c host/buildvm.h lj_def.h lua.h luaconf.h \
>   lj_arch.h lj_obj.h lj_def.h lj_arch.h lj_gc.h lj_obj.h lj_bc.h lj_ir.h \
> diff --git a/src/lib_init.c b/src/lib_init.c
> index 2ed370e..664aa7d 100644
> --- a/src/lib_init.c
> +++ b/src/lib_init.c
> @@ -12,6 +12,7 @@
>  #include "lua.h"
>  #include "lauxlib.h"
>  #include "lualib.h"
> +#include "lmisclib.h"
>  
>  #include "lj_arch.h"
>  
> @@ -26,6 +27,7 @@ static const luaL_Reg lj_lib_load[] = {
>    { LUA_DBLIBNAME,	luaopen_debug },
>    { LUA_BITLIBNAME,	luaopen_bit },
>    { LUA_JITLIBNAME,	luaopen_jit },
> +  { LUAM_MISCLIBNAME,	luaopen_misc },
>    { NULL,		NULL }
>  };
>  
> diff --git a/src/lib_misc.c b/src/lib_misc.c
> new file mode 100644
> index 0000000..ef11237
> --- /dev/null
> +++ b/src/lib_misc.c
> @@ -0,0 +1,72 @@
> +/*
> +** Miscellaneous Lua extensions library.
> +**
> +** Major portions taken verbatim or adapted from the LuaVela interpreter.
> +** Copyright (C) 2015-2019 IPONWEB Ltd.
> +*/
> +
> +#define lib_misc_c
> +#define LUA_LIB
> +
> +#include "lua.h"
> +#include "lmisclib.h"
> +
> +#include "lj_obj.h"
> +#include "lj_str.h"
> +#include "lj_tab.h"
> +#include "lj_lib.h"
> +
> +/* ------------------------------------------------------------------------ */
> +
> +static LJ_AINLINE void setnumfield(struct lua_State *L, GCtab *t,
> +				   const char *name, int64_t val)
> +{
> +  setnumV(lj_tab_setstr(L, t, lj_str_newz(L, name)), (double)val);
> +}
> +
> +#define LJLIB_MODULE_misc
> +
> +LJLIB_CF(misc_getmetrics)
> +{
> +  struct luam_Metrics metrics;
> +  lua_createtable(L, 0, 22);
> +  GCtab *m = tabV(L->top - 1);
> +
> +  luaM_metrics(L, &metrics);
> +
> +  setnumfield(L, m, "strhash_hit", metrics.strhash_hit);
> +  setnumfield(L, m, "strhash_miss", metrics.strhash_miss);
> +
> +  setnumfield(L, m, "gc_strnum", metrics.gc_strnum);
> +  setnumfield(L, m, "gc_tabnum", metrics.gc_tabnum);
> +  setnumfield(L, m, "gc_udatanum", metrics.gc_udatanum);
> +  setnumfield(L, m, "gc_cdatanum", metrics.gc_cdatanum);
> +
> +  setnumfield(L, m, "gc_total", metrics.gc_total);
> +  setnumfield(L, m, "gc_freed", metrics.gc_freed);
> +  setnumfield(L, m, "gc_allocated", metrics.gc_allocated);
> +
> +  setnumfield(L, m, "gc_steps_pause", metrics.gc_steps_pause);
> +  setnumfield(L, m, "gc_steps_propagate", metrics.gc_steps_propagate);
> +  setnumfield(L, m, "gc_steps_atomic", metrics.gc_steps_atomic);
> +  setnumfield(L, m, "gc_steps_sweepstring", metrics.gc_steps_sweepstring);
> +  setnumfield(L, m, "gc_steps_sweep", metrics.gc_steps_sweep);
> +  setnumfield(L, m, "gc_steps_finalize", metrics.gc_steps_finalize);
> +
> +  setnumfield(L, m, "jit_snap_restore", metrics.jit_snap_restore);
> +  setnumfield(L, m, "jit_trace_abort", metrics.jit_trace_abort);
> +  setnumfield(L, m, "jit_mcode_size", metrics.jit_mcode_size);
> +  setnumfield(L, m, "jit_trace_num", metrics.jit_trace_num);
> +
> +  return 1;
> +}
> +
> +/* ------------------------------------------------------------------------ */
> +
> +#include "lj_libdef.h"
> +
> +LUALIB_API int luaopen_misc(struct lua_State *L)
> +{
> +  LJ_LIB_REG(L, LUAM_MISCLIBNAME, misc);
> +  return 1;
> +}
> diff --git a/src/lj_mapi.c b/src/lj_mapi.c
> new file mode 100644
> index 0000000..7645a44
> --- /dev/null
> +++ b/src/lj_mapi.c
> @@ -0,0 +1,61 @@
> +/*
> +** Miscellaneous public C API extensions.
> +**
> +** Major portions taken verbatim or adapted from the LuaVela.
> +** Copyright (C) 2015-2019 IPONWEB Ltd.
> +*/
> +
> +#include "lua.h"
> +#include "lmisclib.h"
> +
> +#include "lj_obj.h"
> +#include "lj_dispatch.h"
> +
> +#if LJ_HASJIT
> +#include "lj_jit.h"
> +#endif
> +
> +LUAMISC_API void luaM_metrics(lua_State *L, struct luam_Metrics *metrics)
> +{
> +  lua_assert(metrics != NULL);
> +  global_State *g = G(L);
> +  GCState *gc = &g->gc;
> +#if LJ_HASJIT
> +  jit_State *J = G2J(g);
> +#endif
> +
> +  metrics->strhash_hit = g->strhash_hit;
> +  metrics->strhash_miss = g->strhash_miss;
> +
> +  metrics->gc_strnum = g->strnum;
> +  metrics->gc_tabnum = gc->tabnum;
> +  metrics->gc_udatanum = gc->udatanum;
> +#if LJ_HASFFI
> +  metrics->gc_cdatanum = gc->cdatanum;
> +#else
> +  metrics->gc_cdatanum = 0;
> +#endif
> +
> +  metrics->gc_total = gc->total;
> +  metrics->gc_freed = gc->freed;
> +  metrics->gc_allocated = gc->allocated;
> +
> +  metrics->gc_steps_pause = gc->state_count[GCSpause];
> +  metrics->gc_steps_propagate = gc->state_count[GCSpropagate];
> +  metrics->gc_steps_atomic = gc->state_count[GCSatomic];
> +  metrics->gc_steps_sweepstring = gc->state_count[GCSsweepstring];
> +  metrics->gc_steps_sweep = gc->state_count[GCSsweep];
> +  metrics->gc_steps_finalize = gc->state_count[GCSfinalize];
> +
> +#if LJ_HASJIT
> +  metrics->jit_snap_restore = J->nsnaprestore;
> +  metrics->jit_trace_abort = J->ntraceabort;
> +  metrics->jit_mcode_size = J->szallmcarea;
> +  metrics->jit_trace_num = J->tracenum;
> +#else
> +  metrics->jit_snap_restore = 0;
> +  metrics->jit_trace_abort = 0;
> +  metrics->jit_mcode_size = 0;
> +  metrics->jit_trace_num = 0;
> +#endif
> +}
> diff --git a/src/ljamalg.c b/src/ljamalg.c
> index f1f2862..371bbb6 100644
> --- a/src/ljamalg.c
> +++ b/src/ljamalg.c
> @@ -48,6 +48,7 @@
>  #include "lj_strfmt.c"
>  #include "lj_strfmt_num.c"
>  #include "lj_api.c"
> +#include "lj_mapi.c"
>  #include "lj_profile.c"
>  #include "lj_lex.c"
>  #include "lj_parse.c"
> @@ -93,5 +94,6 @@
>  #include "lib_bit.c"
>  #include "lib_jit.c"
>  #include "lib_ffi.c"
> +#include "lib_misc.c"
>  #include "lib_init.c"
>  
> diff --git a/src/lmisclib.h b/src/lmisclib.h
> new file mode 100644
> index 0000000..e2d1909
> --- /dev/null
> +++ b/src/lmisclib.h
> @@ -0,0 +1,64 @@
> +/*
> +** Miscellaneous public C API extensions.
> +**
> +** Major portions taken verbatim or adapted from the LuaVela.
> +** Copyright (C) 2015-2019 IPONWEB Ltd.
> +*/
> +
> +#ifndef _LMISCLIB_H
> +#define _LMISCLIB_H
> +
> +#include "lua.h"
> +
> +/* API for obtaining various platform metrics. */
> +
> +struct luam_Metrics {
> +  /* Strings amount found in string hash instead of allocation of new one. */
> +  size_t strhash_hit;
> +  /* Strings amount allocated and put into string hash. */
> +  size_t strhash_miss;
> +
> +  /* Amount of allocated string objects. */
> +  size_t gc_strnum;
> +  /* Amount of allocated table objects. */
> +  size_t gc_tabnum;
> +  /* Amount of allocated udata objects. */
> +  size_t gc_udatanum;
> +  /* Amount of allocated cdata objects. */
> +  size_t gc_cdatanum;
> +
> +  /* Memory currently allocated. */
> +  size_t gc_total;
> +  /* Total amount of freed memory. */
> +  size_t gc_freed;
> +  /* Total amount of allocated memory. */
> +  size_t gc_allocated;
> +
> +  /* Count of incremental GC steps per state. */
> +  size_t gc_steps_pause;
> +  size_t gc_steps_propagate;
> +  size_t gc_steps_atomic;
> +  size_t gc_steps_sweepstring;
> +  size_t gc_steps_sweep;
> +  size_t gc_steps_finalize;
> +
> +  /*
> +  ** Overall number of snap restores (amount of guard assertions
> +  ** leading to stopping trace executions and trace exits,
> +  ** that are not stitching with other traces).
> +  */
> +  size_t jit_snap_restore;
> +  /* Overall number of abort traces. */
> +  size_t jit_trace_abort;
> +  /* Total size of all allocated machine code areas. */
> +  size_t jit_mcode_size;
> +  /* Amount of JIT traces. */
> +  unsigned int jit_trace_num;
> +};
> +
> +LUAMISC_API void luaM_metrics(lua_State *L, struct luam_Metrics *metrics);
> +
> +#define LUAM_MISCLIBNAME "misc"
> +LUALIB_API int luaopen_misc(lua_State *L);
> +
> +#endif /* _LMISCLIB_H */
> diff --git a/src/luaconf.h b/src/luaconf.h
> index 60cb928..8029040 100644
> --- a/src/luaconf.h
> +++ b/src/luaconf.h
> @@ -144,6 +144,7 @@
>  #endif
>  
>  #define LUALIB_API	LUA_API
> +#define LUAMISC_API	LUA_API
>  
>  /* Support for internal assertions. */
>  #if defined(LUA_USE_ASSERT) || defined(LUA_USE_APICHECK)
> diff --git a/test/clib-misclib-getmetrics.test.lua b/test/clib-misclib-getmetrics.test.lua
> new file mode 100755
> index 0000000..34adaba
> --- /dev/null
> +++ b/test/clib-misclib-getmetrics.test.lua
> @@ -0,0 +1,174 @@
> +#!/usr/bin/env tarantool
> +
> +local file = debug.getinfo(1, "S").source:sub(2)
> +local filepath = file:match("(.*/)")
> +local soext = jit.os == "OSX" and "dylib" or "so"
> +package.cpath = filepath..'clib-misclib-getmetrics/?.'..soext..";"
> +                ..package.cpath
> +
> +local jit_opt_default_hotloop = 56
> +local jit_opt_default_hotexit = 10
> +local jit_opt_default_level = 3
> +local jit_opt_default_minstitch = 0
> +
> +local tap = require('tap')
> +
> +local test = tap.test("clib-misc-getmetrics")
> +test:plan(10)
> +
> +local testgetmetrics = require("testgetmetrics")
> +
> +test:ok(testgetmetrics.base())
> +test:ok(testgetmetrics.gc_allocated_freed())
> +test:ok(testgetmetrics.gc_steps())
> +
> +test:ok(testgetmetrics.objcount(function()
> +    local table_new = require("table.new")
> +    local ffi = require("ffi")
> +
> +    jit.opt.start(0)
> +
> +    local placeholder = {
> +        str = {},
> +        tab = {},
> +        udata = {},
> +        cdata = {},
> +    }
> +
> +    -- Separate objects creations to separate jit traces.
> +    for i = 1, 1000 do
> +        table.insert(placeholder.str, tostring(i))
> +    end
> +
> +    for i = 1, 1000 do
> +        table.insert(placeholder.tab, table_new(i, 0))
> +    end
> +
> +    for i = 1, 1000 do
> +        table.insert(placeholder.udata, newproxy())
> +    end
> +
> +    for i = 1, 1000 do
> +        -- Check counting of VLA/VLS/aligned cdata.
> +        table.insert(placeholder.cdata, ffi.new("char[?]", 4))
> +    end
> +
> +    for i = 1, 1000 do
> +        -- Check counting of non-VLA/VLS/aligned cdata.
> +        table.insert(placeholder.cdata, ffi.new("uint64_t", i))
> +    end
> +
> +    placeholder = nil
> +    -- Restore default jit settings.
> +    jit.opt.start(jit_opt_default_level)
> +end))
> +
> +-- Compiled loop with a direct exit to the interpreter.
> +test:ok(testgetmetrics.snap_restores(function()
> +    jit.opt.start(0, "hotloop=2")
> +
> +    local old_metrics = misc.getmetrics()
> +
> +    local sum = 0
> +    for i = 1, 20 do
> +        sum = sum + i
> +    end
> +
> +    local new_metrics = misc.getmetrics()
> +
> +    -- Restore default jit settings.
> +    jit.opt.start(jit_opt_default_level, "hotloop="..jit_opt_default_hotloop)
> +
> +    -- A single snapshot restoration happened on loop finish.
> +    return 1
> +end))
> +
> +-- Compiled loop with a side exit which does not get compiled.
> +test:ok(testgetmetrics.snap_restores(function()
> +    jit.opt.start(0, "hotloop=2", "hotexit=2", "minstitch=15")
> +
> +    local function foo(i)
> +        -- math.fmod is not yet compiled!
> +        return i <= 5 and i or math.fmod(i, 11)
> +    end
> +
> +    local sum = 0
> +    for i = 1, 10 do
> +        sum = sum + foo(i)
> +    end
> +
> +    -- Restore default jit settings.
> +    jit.opt.start(jit_opt_default_level, "hotloop="..jit_opt_default_hotloop,
> +                  "hotexit="..jit_opt_default_hotexit,
> +                  "minstitch="..jit_opt_default_minstitch)
> +
> +    -- Side exits from the root trace could not get compiled.
> +    return 5
> +end))
> +
> +-- Compiled loop with a side exit which gets compiled.
> +test:ok(testgetmetrics.snap_restores(function()
> +    -- Optimization level is important here as `loop` optimization
> +    -- may unroll the loop body and insert +1 side exit.
> +    jit.opt.start(0, "hotloop=5", "hotexit=5")
> +
> +    local function foo(i)
> +        return i <= 10 and i or tostring(i)
> +    end
> +
> +    local sum = 0
> +    for i = 1, 20 do
> +        sum = sum + foo(i)
> +    end
> +
> +    -- Restore default jit settings.
> +    jit.opt.start(jit_opt_default_level, "hotloop="..jit_opt_default_hotloop,
> +                  "hotexit="..jit_opt_default_hotexit)
> +
> +    -- 5 side exits to the interpreter before trace gets hot
> +    -- and compiled
> +    -- 1 side exit on loop end
> +    return 6
> +end))
> +
> +-- Compiled scalar trace with a direct exit to the interpreter.
> +test:ok(testgetmetrics.snap_restores(function()
> +    -- For calls it will be 2 * hotloop (see lj_dispatch.{c,h}
> +    -- and hotcall@vm_*.dasc).
> +    jit.opt.start(3, "hotloop=2", "hotexit=3")
> +
> +    local function foo(i)
> +        return i <= 15 and i or tostring(i)
> +    end
> +
> +    foo(1)  -- interp only
> +    foo(2)  -- interp only
> +    foo(3)  -- interp only
> +    foo(4)  -- compile trace during this call
> +    foo(5)  -- follow the trace
> +    foo(6)  -- follow the trace
> +    foo(7)  -- follow the trace
> +    foo(8)  -- follow the trace
> +    foo(9)  -- follow the trace
> +    foo(10) -- follow the trace
> +
> +    -- Simply 2 side exits from the trace:
> +    foo(20)
> +    foo(21)
> +
> +    -- Restore default jit settings.
> +    jit.opt.start(jit_opt_default_level, "hotloop="..jit_opt_default_hotloop,
> +                  "hotexit="..jit_opt_default_hotexit)
> +    return 2
> +end))
> +
> +test:ok(testgetmetrics.strhash())
> +
> +test:ok(testgetmetrics.tracenum_base(function()
> +    local sum = 0
> +    for i = 1, 200 do
> +        sum = sum + i
> +    end
> +    -- Compiled only 1 loop as new trace.
> +    return 1
> +end))
> diff --git a/test/clib-misclib-getmetrics/CMakeLists.txt b/test/clib-misclib-getmetrics/CMakeLists.txt
> new file mode 100644
> index 0000000..e7cc8f8
> --- /dev/null
> +++ b/test/clib-misclib-getmetrics/CMakeLists.txt
> @@ -0,0 +1 @@
> +build_lualib(testgetmetrics testgetmetrics.c)
> diff --git a/test/clib-misclib-getmetrics/testgetmetrics.c b/test/clib-misclib-getmetrics/testgetmetrics.c
> new file mode 100644
> index 0000000..32802d2
> --- /dev/null
> +++ b/test/clib-misclib-getmetrics/testgetmetrics.c
> @@ -0,0 +1,242 @@
> +#include <lua.h>
> +#include <luajit.h>
> +#include <lauxlib.h>
> +
> +#include <lmisclib.h>
> +
> +#include <assert.h>
> +
> +static int base(lua_State *L)
> +{
> +	struct luam_Metrics metrics;
> +	luaM_metrics(L, &metrics);
> +
> +	/* Just check API. */
> +	assert((ssize_t)metrics.strhash_hit >= 0);
> +	assert((ssize_t)metrics.strhash_miss >= 0);
> +
> +	assert((ssize_t)metrics.gc_strnum >= 0);
> +	assert((ssize_t)metrics.gc_tabnum >= 0);
> +	assert((ssize_t)metrics.gc_udatanum >= 0);
> +	assert((ssize_t)metrics.gc_cdatanum >= 0);
> +
> +	assert((ssize_t)metrics.gc_total >= 0);
> +	assert((ssize_t)metrics.gc_freed >= 0);
> +	assert((ssize_t)metrics.gc_allocated >= 0);
> +
> +	assert((ssize_t)metrics.gc_steps_pause >= 0);
> +	assert((ssize_t)metrics.gc_steps_propagate >= 0);
> +	assert((ssize_t)metrics.gc_steps_atomic >= 0);
> +	assert((ssize_t)metrics.gc_steps_sweepstring >= 0);
> +	assert((ssize_t)metrics.gc_steps_sweep >= 0);
> +	assert((ssize_t)metrics.gc_steps_finalize >= 0);
> +
> +	assert((ssize_t)metrics.jit_snap_restore >= 0);
> +	assert((ssize_t)metrics.jit_trace_abort >= 0);
> +	assert((ssize_t)metrics.jit_mcode_size >= 0);
> +	assert((ssize_t)metrics.jit_trace_num >= 0);
> +
> +	lua_pushboolean(L, 1);
> +	return 1;
> +}
> +
> +static int gc_allocated_freed(lua_State *L)
> +{
> +	struct luam_Metrics oldm, newm;
> +	/* Force up garbage collect all dead objects. */
> +	lua_gc(L, LUA_GCCOLLECT, 0);
> +
> +	luaM_metrics(L, &oldm);
> +	/* Simple garbage generation. */
> +	if (luaL_dostring(L, "local i = 0 for j = 1, 10 do i = i + j end"))
> +		luaL_error(L, "failed to translate Lua code snippet");
> +	lua_gc(L, LUA_GCCOLLECT, 0);
> +	luaM_metrics(L, &newm);
> +	assert(newm.gc_allocated - oldm.gc_allocated > 0);
> +	assert(newm.gc_freed - oldm.gc_freed > 0);
> +
> +	lua_pushboolean(L, 1);
> +	return 1;
> +}
> +
> +static int gc_steps(lua_State *L)
> +{
> +	struct luam_Metrics oldm, newm;
> +	/*
> +	 * Some garbage has already happened before the next line,
> +	 * i.e. during fronted processing lua test chunk.
> +	 * Let's put a full garbage collection cycle on top
> +	 * of that, and confirm that non-null values are reported
> +	 * (we are not yet interested in actual numbers):
> +	 */
> +	luaM_metrics(L, &oldm);
> +	lua_gc(L, LUA_GCCOLLECT, 0);
> +	assert(oldm.gc_steps_pause > 0);
> +	assert(oldm.gc_steps_propagate > 0);
> +	assert(oldm.gc_steps_atomic > 0);
> +	assert(oldm.gc_steps_sweepstring > 0);
> +	assert(oldm.gc_steps_sweep > 0);
> +	/* Nothing to finalize, skipped. */
> +	assert(oldm.gc_steps_finalize == 0);
> +
> +	/*
> +	 * As long as we don't create new Lua objects
> +	 * consequent call should return the same values:
> +	 */
> +	luaM_metrics(L, &newm);
> +	assert(newm.gc_steps_pause - oldm.gc_steps_pause > 0);
> +	assert(newm.gc_steps_propagate - oldm.gc_steps_propagate > 0);
> +	assert(newm.gc_steps_atomic - oldm.gc_steps_atomic > 0);
> +	assert(newm.gc_steps_sweepstring - oldm.gc_steps_sweepstring > 0);
> +	assert(newm.gc_steps_sweep - oldm.gc_steps_sweep > 0);
> +	/* Nothing to finalize, skipped. */
> +	assert(newm.gc_steps_finalize == 0);
> +	oldm = newm;
> +
> +	/*
> +	 * Now the last phase: run full GC once and make sure that
> +	 * everything is being reported as expected:
> +	 */
> +	lua_gc(L, LUA_GCCOLLECT, 0);
> +	luaM_metrics(L, &newm);
> +	assert(newm.gc_steps_pause - oldm.gc_steps_pause == 1);
> +	assert(newm.gc_steps_propagate - oldm.gc_steps_propagate >= 1);
> +	assert(newm.gc_steps_atomic - oldm.gc_steps_atomic == 1);
> +	assert(newm.gc_steps_sweepstring - oldm.gc_steps_sweepstring >= 1);
> +	assert(newm.gc_steps_sweep - oldm.gc_steps_sweep >= 1);
> +	/* Nothing to finalize, skipped. */
> +	assert(newm.gc_steps_finalize == 0);
> +	oldm = newm;
> +
> +	/*
> +	 * Now let's run three GC cycles to ensure that
> +	 * zero-to-one transition was not a lucky coincidence.
> +	 */
> +	lua_gc(L, LUA_GCCOLLECT, 0);
> +	lua_gc(L, LUA_GCCOLLECT, 0);
> +	lua_gc(L, LUA_GCCOLLECT, 0);
> +	luaM_metrics(L, &newm);
> +	assert(newm.gc_steps_pause - oldm.gc_steps_pause == 3);
> +	assert(newm.gc_steps_propagate - oldm.gc_steps_propagate >= 3);
> +	assert(newm.gc_steps_atomic - oldm.gc_steps_atomic == 3);
> +	assert(newm.gc_steps_sweepstring - oldm.gc_steps_sweepstring >= 3);
> +	assert(newm.gc_steps_sweep - oldm.gc_steps_sweep >= 3);
> +	/* Nothing to finalize, skipped. */
> +	assert(newm.gc_steps_finalize == 0);
> +
> +	lua_pushboolean(L, 1);
> +	return 1;
> +}
> +
> +static int objcount(lua_State *L)
> +{
> +	struct luam_Metrics oldm, newm;
> +	int n = lua_gettop(L);
> +	if (n != 1 || !lua_isfunction(L, 1))
> +		luaL_error(L, "incorrect argument: 1 function is required");
> +
> +	/* Force up garbage collect all dead objects. */
> +	lua_gc(L, LUA_GCCOLLECT, 0);
> +
> +	luaM_metrics(L, &oldm);
> +	/* Generate garbage. */
> +	lua_call(L, 0, 0);
> +	lua_gc(L, LUA_GCCOLLECT, 0);
> +	luaM_metrics(L, &newm);
> +	assert(newm.gc_strnum - oldm.gc_strnum == 0);
> +	assert(newm.gc_tabnum - oldm.gc_tabnum == 0);
> +	assert(newm.gc_udatanum - oldm.gc_udatanum == 0);
> +	assert(newm.gc_cdatanum - oldm.gc_cdatanum == 0);
> +
> +	lua_pushboolean(L, 1);
> +	return 1;
> +}
> +
> +static int snap_restores(lua_State *L)
> +{
> +	struct luam_Metrics oldm, newm;
> +	int n = lua_gettop(L);
> +	if (n != 1 || !lua_isfunction(L, 1))
> +		luaL_error(L, "incorrect arguments: 1 function is required");
> +
> +	luaM_metrics(L, &oldm);
> +	/* Generate snapshots. */
> +	lua_call(L, 0, 1);
> +	n = lua_gettop(L);
> +	if (n != 1 || !lua_isnumber(L, 1))
> +		luaL_error(L, "incorrect return value: 1 number is required");
> +	size_t snap_restores = lua_tonumber(L, 1);
> +	luaM_metrics(L, &newm);
> +	assert(newm.jit_snap_restore - oldm.jit_snap_restore == snap_restores);
> +
> +	lua_pushboolean(L, 1);
> +	return 1;
> +}
> +
> +static int strhash(lua_State *L)
> +{
> +	struct luam_Metrics oldm, newm;
> +	lua_pushstring(L, "strhash_hit");
> +	luaM_metrics(L, &oldm);
> +	lua_pushstring(L, "strhash_hit");
> +	lua_pushstring(L, "new_str");
> +	luaM_metrics(L, &newm);
> +	assert(newm.strhash_hit - oldm.strhash_hit == 1);
> +	assert(newm.strhash_miss - oldm.strhash_miss == 1);
> +	lua_pop(L, 3);
> +	lua_pushboolean(L, 1);
> +	return 1;
> +}
> +
> +static int tracenum_base(lua_State *L)
> +{
> +	struct luam_Metrics metrics;
> +	int n = lua_gettop(L);
> +	if (n != 1 || !lua_isfunction(L, 1))
> +		luaL_error(L, "incorrect arguments: 1 function is required");
> +
> +	luaJIT_setmode(L, 0, LUAJIT_MODE_OFF);
> +	luaJIT_setmode(L, 0, LUAJIT_MODE_FLUSH);
> +	/* Force up garbage collect all dead objects. */
> +	lua_gc(L, LUA_GCCOLLECT, 0);
> +
> +	luaM_metrics(L, &metrics);
> +	assert(metrics.jit_trace_num == 0);
> +
> +	luaJIT_setmode(L, 0, LUAJIT_MODE_ON);
> +	/* Generate traces. */
> +	lua_call(L, 0, 1);
> +	n = lua_gettop(L);
> +	if (n != 1 || !lua_isnumber(L, 1))
> +		luaL_error(L, "incorrect return value: 1 number is required");
> +	size_t jit_trace_num = lua_tonumber(L, 1);
> +	luaM_metrics(L, &metrics);
> +	assert(metrics.jit_trace_num == jit_trace_num);
> +
> +	luaJIT_setmode(L, 0, LUAJIT_MODE_FLUSH);
> +	/* Force up garbage collect all dead objects. */
> +	lua_gc(L, LUA_GCCOLLECT, 0);
> +	luaM_metrics(L, &metrics);
> +	assert(metrics.jit_trace_num == 0);
> +
> +	luaJIT_setmode(L, 0, LUAJIT_MODE_ON);
> +	lua_pushboolean(L, 1);
> +	return 1;
> +}
> +
> +static const struct luaL_Reg testgetmetrics[] = {
> +	{"base", base},
> +	{"gc_allocated_freed", gc_allocated_freed},
> +	{"gc_steps", gc_steps},
> +	{"objcount", objcount},
> +	{"snap_restores", snap_restores},
> +	{"strhash", strhash},
> +	{"tracenum_base", tracenum_base},
> +	{NULL, NULL}
> +};
> +
> +LUA_API int luaopen_testgetmetrics(lua_State *L)
> +{
> +	luaL_register(L, "testgetmetrics", testgetmetrics);
> +	return 1;
> +}
> diff --git a/test/lib-misc-getmetrics.test.lua b/test/lib-misc-getmetrics.test.lua
> new file mode 100755
> index 0000000..2859e26
> --- /dev/null
> +++ b/test/lib-misc-getmetrics.test.lua
> @@ -0,0 +1,418 @@
> +#!/usr/bin/env tarantool
> +
> +-- This is a part of tarantool/luajit testing suite.
> +-- Major portions taken verbatim or adapted from the LuaVela testing suit.
> +-- Copyright (C) 2015-2019 IPONWEB Ltd.
> +
> +local tap = require('tap')
> +
> +local test = tap.test("lib-misc-getmetrics")
> +test:plan(10)
> +
> +local jit_opt_default_hotloop = 56
> +local jit_opt_default_hotexit = 10
> +local jit_opt_default_level = 3
> +local jit_opt_default_minstitch = 0
> +
> +-- Test Lua API.
> +test:test("base", function(subtest)
> +    subtest:plan(19)
> +    local metrics = misc.getmetrics()
> +    subtest:ok(metrics.strhash_hit >= 0)
> +    subtest:ok(metrics.strhash_miss >= 0)
> +
> +    subtest:ok(metrics.gc_strnum >= 0)
> +    subtest:ok(metrics.gc_tabnum >= 0)
> +    subtest:ok(metrics.gc_udatanum >= 0)
> +    subtest:ok(metrics.gc_cdatanum >= 0)
> +
> +    subtest:ok(metrics.gc_total >= 0)
> +    subtest:ok(metrics.gc_freed >= 0)
> +    subtest:ok(metrics.gc_allocated >= 0)
> +
> +    subtest:ok(metrics.gc_steps_pause >= 0)
> +    subtest:ok(metrics.gc_steps_propagate >= 0)
> +    subtest:ok(metrics.gc_steps_atomic >= 0)
> +    subtest:ok(metrics.gc_steps_sweepstring >= 0)
> +    subtest:ok(metrics.gc_steps_sweep >= 0)
> +    subtest:ok(metrics.gc_steps_finalize >= 0)
> +
> +    subtest:ok(metrics.jit_snap_restore >= 0)
> +    subtest:ok(metrics.jit_trace_abort >= 0)
> +    subtest:ok(metrics.jit_mcode_size >= 0)
> +    subtest:ok(metrics.jit_trace_num >= 0)
> +end)
> +
> +test:test("gc-allocated-freed", function(subtest)
> +    subtest:plan(1)
> +
> +    -- Force up garbage collect all dead objects.
> +    collectgarbage("collect")
> +
> +    -- Bump getmetrincs table and string keys allocation.
> +    local old_metrics = misc.getmetrics()
> +
> +    -- Remember allocated size for getmetrics table.
> +    old_metrics = misc.getmetrics()
> +
> +    collectgarbage("collect")
> +
> +    local new_metrics = misc.getmetrics()
> +
> +    local diff_alloc = new_metrics.gc_allocated - old_metrics.gc_allocated
> +    local getmetrics_alloc = diff_alloc
> +
> +    -- Do not use test:ok to avoid extra allocated/freed objects.
> +    assert(getmetrics_alloc > 0, "count allocated table for getmetrics")
> +    old_metrics = new_metrics
> +
> +    -- NB: Avoid operations that use internal global string buffer
> +    -- (such as concatenation, string.format, table.concat)
> +    -- while creating the string. Otherwise gc_freed/gc_allocated
> +    -- relations will not be so straightforward.
> +    local str = string.sub("Hello, world", 1, 5)
> +    collectgarbage("collect")
> +
> +    new_metrics = misc.getmetrics()
> +
> +    diff_alloc = new_metrics.gc_allocated - old_metrics.gc_allocated
> +    local diff_freed = new_metrics.gc_freed - old_metrics.gc_freed
> +
> +    assert(diff_alloc > getmetrics_alloc,
> +           "allocated str 'Hello' and table for getmetrics")
> +    assert(diff_freed == getmetrics_alloc,
> +           "freed old old_metrics")
> +    old_metrics = new_metrics
> +
> +    str = string.sub("Hello, world", 8, -1)
> +
> +    new_metrics = misc.getmetrics()
> +
> +    diff_alloc = new_metrics.gc_allocated - old_metrics.gc_allocated
> +    diff_freed = new_metrics.gc_freed - old_metrics.gc_freed
> +
> +    assert(diff_alloc > getmetrics_alloc,
> +            "allocated str 'world' and table for getmetrics")
> +    assert(diff_freed == 0, "nothing to free without collectgarbage")
> +    old_metrics = new_metrics
> +    collectgarbage("collect")
> +
> +    new_metrics = misc.getmetrics()
> +
> +    diff_alloc = new_metrics.gc_allocated - old_metrics.gc_allocated
> +    diff_freed = new_metrics.gc_freed - old_metrics.gc_freed
> +
> +    assert(diff_alloc == getmetrics_alloc,
> +            "allocated last one table for getmetrics")
> +    assert(diff_freed > 2 * getmetrics_alloc,
> +            "freed str 'Hello' and 2 tables for getmetrics")
> +    subtest:ok(true, "no assetion failed")
> +end)
> +
> +test:test("gc-steps", function(subtest)
> +    subtest:plan(24)
> +
> +    -- Some garbage has already created before the next line,
> +    -- i.e. during fronted processing this chunk.
> +    -- Let's put a full garbage collection cycle on top of that,
> +    -- and confirm that non-null values are reported (we are not
> +    -- yet interested in actual numbers):
> +    collectgarbage("collect")
> +    collectgarbage("stop")
> +    local oldm = misc.getmetrics()
> +    subtest:ok(oldm.gc_steps_pause > 0)
> +    subtest:ok(oldm.gc_steps_propagate > 0)
> +    subtest:ok(oldm.gc_steps_atomic > 0)
> +    subtest:ok(oldm.gc_steps_sweepstring > 0)
> +    subtest:ok(oldm.gc_steps_sweep > 0)
> +    -- Nothing to finalize, skipped.
> +    subtest:is(oldm.gc_steps_finalize, 0)
> +
> +    -- As long as we stopped the GC, consequent call
> +    -- should return the same values:
> +    local newm = misc.getmetrics()
> +    subtest:is(newm.gc_steps_pause - oldm.gc_steps_pause, 0)
> +    subtest:is(newm.gc_steps_propagate - oldm.gc_steps_propagate, 0)
> +    subtest:is(newm.gc_steps_atomic - oldm.gc_steps_atomic, 0)
> +    subtest:is(newm.gc_steps_sweepstring - oldm.gc_steps_sweepstring, 0)
> +    subtest:is(newm.gc_steps_sweep - oldm.gc_steps_sweep, 0)
> +    -- Nothing to finalize, skipped.
> +    subtest:is(newm.gc_steps_finalize, 0)
> +    oldm = newm
> +
> +    -- Now the last phase: run full GC once and make sure that
> +    -- everything is being reported as expected:
> +    collectgarbage("collect")
> +    collectgarbage("stop")
> +    newm = misc.getmetrics()
> +    subtest:ok(newm.gc_steps_pause - oldm.gc_steps_pause == 1)
> +    subtest:ok(newm.gc_steps_propagate - oldm.gc_steps_propagate >= 1)
> +    subtest:ok(newm.gc_steps_atomic - oldm.gc_steps_atomic == 1)
> +    subtest:ok(newm.gc_steps_sweepstring - oldm.gc_steps_sweepstring >= 1)
> +    subtest:ok(newm.gc_steps_sweep  - oldm.gc_steps_sweep >= 1)
> +    -- Nothing to finalize, skipped.
> +    subtest:is(newm.gc_steps_finalize, 0)
> +    oldm = newm
> +
> +    -- Now let's run three GC cycles to ensure that zero-to-one
> +    -- transition was not a lucky coincidence.
> +    collectgarbage("collect")
> +    collectgarbage("collect")
> +    collectgarbage("collect")
> +    collectgarbage("stop")
> +    newm = misc.getmetrics()
> +    subtest:ok(newm.gc_steps_pause - oldm.gc_steps_pause == 3)
> +    subtest:ok(newm.gc_steps_propagate - oldm.gc_steps_propagate >= 3)
> +    subtest:ok(newm.gc_steps_atomic - oldm.gc_steps_atomic == 3)
> +    subtest:ok(newm.gc_steps_sweepstring - oldm.gc_steps_sweepstring >= 3)
> +    subtest:ok(newm.gc_steps_sweep  - oldm.gc_steps_sweep >= 3)
> +    -- Nothing to finalize, skipped.
> +    subtest:is(newm.gc_steps_finalize, 0)
> +end)
> +
> +test:test("objcount", function(subtest)
> +    subtest:plan(4)
> +    local table_new = require("table.new")
> +    local ffi = require("ffi")
> +
> +    jit.opt.start(0)
> +
> +    -- Remove all dead objects.
> +    collectgarbage("collect")
> +
> +    -- Bump strings and table creation.
> +    local old_metrics = misc.getmetrics()
> +    old_metrics = misc.getmetrics()
> +
> +    local placeholder = {
> +        str = {},
> +        tab = {},
> +        udata = {},
> +        cdata = {},
> +    }
> +
> +    -- Separate objects creations to separate jit traces.
> +    for i = 1, 1000 do
> +        table.insert(placeholder.str, tostring(i))
> +    end
> +
> +    for i = 1, 1000 do
> +        table.insert(placeholder.tab, table_new(i, 0))
> +    end
> +
> +    for i = 1, 1000 do
> +        table.insert(placeholder.udata, newproxy())
> +    end
> +
> +    for i = 1, 1000 do
> +        -- Check counting of VLA/VLS/aligned cdata.
> +        table.insert(placeholder.cdata, ffi.new("char[?]", 4))
> +    end
> +
> +    for i = 1, 1000 do
> +        -- Check counting of non-VLA/VLS/aligned cdata.
> +        table.insert(placeholder.cdata, ffi.new("uint64_t", i))
> +    end
> +
> +    placeholder = nil
> +    collectgarbage("collect")
> +    local new_metrics = misc.getmetrics()
> +
> +    -- Check that amount of objects not increased.
> +    subtest:is(new_metrics.gc_strnum, old_metrics.gc_strnum,
> +               "strnum don't change")
> +    subtest:is(new_metrics.gc_tabnum, old_metrics.gc_tabnum,
> +               "tabnum don't change")
> +    subtest:is(new_metrics.gc_udatanum, old_metrics.gc_udatanum,
> +               "udatanum don't change")
> +    subtest:is(new_metrics.gc_cdatanum, old_metrics.gc_cdatanum,
> +               "cdatanum don't change")
> +
> +    -- Restore default jit settings.
> +    jit.opt.start(jit_opt_default_level)
> +end)
> +
> +test:test("snap-restores-direct-loop", function(subtest)
> +    -- Compiled loop with a direct exit to the interpreter.
> +    subtest:plan(1)
> +
> +    jit.opt.start(0, "hotloop=2")
> +
> +    local old_metrics = misc.getmetrics()
> +
> +    local sum = 0
> +    for i = 1, 20 do
> +        sum = sum + i
> +    end
> +
> +    local new_metrics = misc.getmetrics()
> +
> +    -- A single snapshot restoration happened on loop finish:
> +    subtest:is(new_metrics.jit_snap_restore - old_metrics.jit_snap_restore, 1)
> +
> +    -- Restore default jit settings.
> +    jit.opt.start(jit_opt_default_level, "hotloop="..jit_opt_default_hotloop)
> +end)
> +
> +test:test("snap-restores-loop-side-exit-non-compiled", function(subtest)
> +    -- Compiled loop with a side exit which does not get compiled.
> +    subtest:plan(1)
> +
> +    jit.opt.start(0, "hotloop=2", "hotexit=2", "minstitch=15")
> +
> +    local function foo(i)
> +        -- math.fmod is not yet compiled!
> +        return i <= 5 and i or math.fmod(i, 11)
> +    end
> +
> +    local old_metrics = misc.getmetrics()
> +    local sum = 0
> +    for i = 1, 10 do
> +        sum = sum + foo(i)
> +    end
> +
> +    local new_metrics = misc.getmetrics()
> +
> +    -- Side exits from the root trace could not get compiled.
> +    subtest:is(new_metrics.jit_snap_restore - old_metrics.jit_snap_restore, 5)
> +
> +    -- Restore default jit settings.
> +    jit.opt.start(jit_opt_default_level, "hotloop="..jit_opt_default_hotloop,
> +                  "hotexit="..jit_opt_default_hotexit,
> +                  "minstitch="..jit_opt_default_minstitch)
> +end)
> +
> +test:test("snap-restores-loop-side-exit", function(subtest)
> +    -- Compiled loop with a side exit which gets compiled.
> +    subtest:plan(1)
> +
> +    -- Optimization level is important here as `loop` optimization
> +    -- may unroll the loop body and insert +1 side exit.
> +    jit.opt.start(0, "hotloop=5", "hotexit=5")
> +
> +    local function foo(i)
> +        return i <= 10 and i or tostring(i)
> +    end
> +
> +    local old_metrics = misc.getmetrics()
> +    local sum = 0
> +    for i = 1, 20 do
> +        sum = sum + foo(i)
> +    end
> +
> +    local new_metrics = misc.getmetrics()
> +
> +    -- 5 side exits to the interpreter before trace gets hot
> +    -- and compiled
> +    -- 1 side exit on loop end
> +    subtest:is(new_metrics.jit_snap_restore - old_metrics.jit_snap_restore, 6)
> +
> +    -- Restore default jit settings.
> +    jit.opt.start(jit_opt_default_level, "hotloop="..jit_opt_default_hotloop,
> +                  "hotexit="..jit_opt_default_hotexit)
> +end)
> +
> +test:test("snap-restores-scalar", function(subtest)
> +    -- Compiled scalar trace with a direct exit to the interpreter.
> +    subtest:plan(2)
> +
> +    -- For calls it will be 2 * hotloop (see lj_dispatch.{c,h}
> +    -- and hotcall@vm_*.dasc).
> +    jit.opt.start(3, "hotloop=2", "hotexit=3")
> +
> +    local function foo(i)
> +        return i <= 15 and i or tostring(i)
> +    end
> +
> +    local old_metrics = misc.getmetrics()
> +
> +    foo(1)  -- interp only
> +    foo(2)  -- interp only
> +    foo(3)  -- interp only
> +    foo(4)  -- compile trace during this call
> +    foo(5)  -- follow the trace
> +    foo(6)  -- follow the trace
> +    foo(7)  -- follow the trace
> +    foo(8)  -- follow the trace
> +    foo(9)  -- follow the trace
> +    foo(10) -- follow the trace
> +
> +    local new_metrics = misc.getmetrics()
> +
> +    -- No exits triggering snap restore so far: snapshot
> +    -- restoration was inlined into the machine code.
> +    subtest:is(new_metrics.jit_snap_restore - old_metrics.jit_snap_restore, 0)
> +    old_metrics = new_metrics
> +
> +    -- Simply 2 side exits from the trace:
> +    foo(20)
> +    foo(21)
> +
> +    new_metrics = misc.getmetrics()
> +    subtest:is(new_metrics.jit_snap_restore - old_metrics.jit_snap_restore, 2)
> +
> +    -- Restore default jit settings.
> +    jit.opt.start(jit_opt_default_level, "hotloop="..jit_opt_default_hotloop,
> +                  "hotexit="..jit_opt_default_hotexit)
> +end)
> +
> +test:test("strhash", function(subtest)
> +    subtest:plan(1)
> +
> +    local old_metrics = misc.getmetrics()
> +
> +    local new_metrics = misc.getmetrics()
> +    -- Do not use test:ok to avoid extra strhash hits/misses.
> +    assert(new_metrics.strhash_hit - old_metrics.strhash_hit == 19)
> +    assert(new_metrics.strhash_miss - old_metrics.strhash_miss == 0)
> +    old_metrics = new_metrics
> +
> +    local str1  = "strhash".."_hit"
> +
> +    new_metrics = misc.getmetrics()
> +    assert(new_metrics.strhash_hit - old_metrics.strhash_hit == 20)
> +    assert(new_metrics.strhash_miss - old_metrics.strhash_miss == 0)
> +    old_metrics = new_metrics
> +
> +    new_metrics = misc.getmetrics()
> +    assert(new_metrics.strhash_hit - old_metrics.strhash_hit == 19)
> +    assert(new_metrics.strhash_miss - old_metrics.strhash_miss == 0)
> +    old_metrics = new_metrics
> +
> +    local str2 = "new".."string"
> +
> +    new_metrics = misc.getmetrics()
> +    assert(new_metrics.strhash_hit - old_metrics.strhash_hit == 19)
> +    assert(new_metrics.strhash_miss - old_metrics.strhash_miss == 1)
> +    subtest:ok(true, "no assertion failed")
> +end)
> +
> +test:test("tracenum-base", function(subtest)
> +    subtest:plan(3)
> +
> +    jit.off()
> +    jit.flush()
> +    collectgarbage("collect")
> +    local metrics = misc.getmetrics()
> +    subtest:is(metrics.jit_trace_num, 0)
> +
> +    jit.on()
> +    local sum = 0
> +    for i = 1, 100 do
> +        sum = sum + i
> +    end
> +
> +    metrics = misc.getmetrics()
> +    subtest:is(metrics.jit_trace_num, 1)
> +
> +    jit.off()
> +    jit.flush()
> +    collectgarbage("collect")
> +
> +    metrics = misc.getmetrics()
> +    subtest:is(metrics.jit_trace_num, 0)
> +
> +    jit.on()
> +end)
> +
> +os.exit(test:check() and 0 or 1)
> -- 
> 2.28.0
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH v4 2/2] misc: add C and Lua API for platform metrics
  2020-10-09 14:45   ` Sergey Ostanevich
@ 2020-10-13  6:01     ` Sergey Kaplun
  0 siblings, 0 replies; 26+ messages in thread
From: Sergey Kaplun @ 2020-10-13  6:01 UTC (permalink / raw)
  To: Sergey Ostanevich; +Cc: tarantool-patches

Hi, Sergos! Thanks for the review!

On 09.10.20, Sergey Ostanevich wrote:
> Hi!
> 
> Thanks for the patch!
> 
> I think it introduces a new user-visible interface, so we need a 
> @TarantoolBot document entry here?

I've created ticket here [1].

> 
> Otherwise LGTM.
> 
> Sergos
> 
> 
> On 05 окт 09:30, Sergey Kaplun wrote:

<snipped>

> > -- 
> > 2.28.0
> > 

[1]: https://github.com/tarantool/doc/issues/1597

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH v4 0/2] Implement LuaJIT platform metrics
  2020-10-05  6:30 [Tarantool-patches] [PATCH v4 0/2] Implement LuaJIT platform metrics Sergey Kaplun
                   ` (3 preceding siblings ...)
  2020-10-08 17:33 ` [Tarantool-patches] [PATCH v4 0/2] Implement LuaJIT platform metrics Igor Munkin
@ 2020-10-13 13:17 ` Kirill Yukhin
  4 siblings, 0 replies; 26+ messages in thread
From: Kirill Yukhin @ 2020-10-13 13:17 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hello,

On 05 окт 09:30, Sergey Kaplun wrote:
> The series consists of 2 patches. The first one adds corresponding
> counters to LuaJIT internal structures. The second provides C and Lua
> API using this counters to collect metrics.
> 
> Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-5187-luajit-metrics
> Issue: https://github.com/tarantool/tarantool/issues/5187
> 
> Changes in v2:
> - Fixed naming and comments
> - Fixed padding in struct GCState
> - Dropped unnecessary initialisations inside lua_newstate()
> - Avoided flushing any of metrics after each call of luaM_metrics()
> 
> Changes in v3:
> - Cleaned up mess in Makefile.dep
> - Fixed naming and comments
> - Fixed padding in struct GCState for 64-bit architectures
> - Fixed counting amount of JIT traces
> - Fixed objects counting at trace recording
> - Added counting of colors
> - Added C and Lua tests
> 
> Changes in v4:
> - Removed counting of colors
> - Changed global_State structure correspondingly for 32-bit arm build
> 
> Sergey Kaplun (2):
>   core: introduce various platform metrics
>   misc: add C and Lua API for platform metrics

I've checked your patchset into tarantool branch of luajit repo and
bumped a new version in 1.10, 2.4, 2.5 and master.

--
Regards, Kirill Yukhin

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [RFC v4] rfc: luajit metrics
  2020-10-05  6:30 ` [Tarantool-patches] [RFC v4] rfc: luajit metrics Sergey Kaplun
  2020-10-07 14:46   ` Sergey Kaplun
  2020-10-08 17:25   ` Igor Munkin
@ 2020-12-22  9:07   ` Kirill Yukhin
  2 siblings, 0 replies; 26+ messages in thread
From: Kirill Yukhin @ 2020-12-22  9:07 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hello,

On 05 Oct 09:30, Sergey Kaplun wrote:
> Part of #5187
> ---
> 
> This patch adds RFC to LuaJIT metrics interfaces. Nevertheless name
> `misc` for builtin library is not good and should be discussed, because
> tons of user modules can use that name for their own libraries.
> 
> Branch: https://github.com/tarantool/tarantool/tree/skaplun/5187-luajit-metrics
> Issue: https://github.com/tarantool/tarantool/issues/5187

I've checked your patch into master.

--
Regards, Kirill Yukhin

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2020-12-22  9:08 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-05  6:30 [Tarantool-patches] [PATCH v4 0/2] Implement LuaJIT platform metrics Sergey Kaplun
2020-10-05  6:30 ` [Tarantool-patches] [PATCH v4 1/2] core: introduce various " Sergey Kaplun
2020-10-07 14:11   ` Igor Munkin
2020-10-07 19:55     ` Sergey Kaplun
2020-10-07 20:16       ` Igor Munkin
2020-10-08  9:28         ` Igor Munkin
2020-10-08 10:11         ` Sergey Kaplun
2020-10-08 12:44           ` Igor Munkin
2020-10-09 14:39             ` Sergey Ostanevich
2020-10-05  6:30 ` [Tarantool-patches] [PATCH v4 2/2] misc: add C and Lua API for " Sergey Kaplun
2020-10-06 22:17   ` Igor Munkin
2020-10-07  5:57     ` Igor Munkin
2020-10-07 14:35     ` Sergey Kaplun
2020-10-07 18:23       ` Igor Munkin
2020-10-07 20:09         ` Sergey Kaplun
2020-10-09 14:45   ` Sergey Ostanevich
2020-10-13  6:01     ` Sergey Kaplun
2020-10-05  6:30 ` [Tarantool-patches] [RFC v4] rfc: luajit metrics Sergey Kaplun
2020-10-07 14:46   ` Sergey Kaplun
2020-10-08 17:25   ` Igor Munkin
2020-10-08 19:29     ` Sergey Kaplun
2020-10-08 20:26       ` Igor Munkin
2020-10-09  6:06         ` Sergey Kaplun
2020-12-22  9:07   ` Kirill Yukhin
2020-10-08 17:33 ` [Tarantool-patches] [PATCH v4 0/2] Implement LuaJIT platform metrics Igor Munkin
2020-10-13 13:17 ` Kirill Yukhin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox