[Tarantool-patches] [PATCH v4 1/2] core: introduce various platform metrics
Sergey Kaplun
skaplun at tarantool.org
Wed Oct 7 22:55:58 MSK 2020
On 07.10.20, Igor Munkin wrote:
> Sergey,
>
> Thanks for the patch! Please consider my comments below.
>
> On 05.10.20, Sergey Kaplun wrote:
> > This patch introduces the following counters:
> > - overall amount of allocated tables, cdata and udata objects
> > - number of incremental GC steps grouped by GC state
> > - number of string hashes hits and misses
> > - amount of allocated and freed memory
> > - number of trace aborts and restored snapshots
>
> Looks like you forgot a number of alive traces.
>
> >
> > Interfaces to obtain these metrics via both Lua and C API are
> > introduced in the next patch.
> >
> > Part of tarantool/tarantool#5187
> > ---
> > src/lj_asm.c | 2 ++
> > src/lj_asm_arm.h | 7 +++++++
> > src/lj_asm_arm64.h | 7 +++++++
> > src/lj_asm_mips.h | 4 ++++
> > src/lj_asm_ppc.h | 3 +++
> > src/lj_asm_x86.h | 4 ++++
> > src/lj_cdata.c | 2 ++
> > src/lj_cdata.h | 2 ++
> > src/lj_gc.c | 4 ++++
> > src/lj_gc.h | 6 +-----
> > src/lj_jit.h | 3 +++
> > src/lj_obj.h | 28 +++++++++++++++++++++++++---
> > src/lj_snap.c | 1 +
> > src/lj_state.c | 2 +-
> > src/lj_str.c | 5 +++++
> > src/lj_tab.c | 2 ++
> > src/lj_trace.c | 6 +++++-
> > src/lj_udata.c | 2 ++
> > 18 files changed, 80 insertions(+), 10 deletions(-)
> >
> > diff --git a/src/lj_asm.c b/src/lj_asm.c
> > index c2cf5a9..8fb3ccd 100644
> > --- a/src/lj_asm.c
> > +++ b/src/lj_asm.c
> > @@ -2273,6 +2273,7 @@ void lj_asm_trace(jit_State *J, GCtrace *T)
> > as->J = J;
> > as->T = T;
> > J->curfinal = lj_trace_alloc(J->L, T); /* This copies the IR, too. */
> > + J->tracenum++;
>
> Why didn't you simply put this increment right to <lj_trace_alloc> as
> you did for decrement and <lj_trace_free>?
I was confused by L argument. Fixed
>
> > as->flags = J->flags;
> > as->loopref = J->loopref;
> > as->realign = NULL;
> > @@ -2379,6 +2380,7 @@ void lj_asm_trace(jit_State *J, GCtrace *T)
> > lj_trace_free(J2G(J), J->curfinal);
> > J->curfinal = NULL; /* In case lj_trace_alloc() OOMs. */
> > J->curfinal = lj_trace_alloc(J->L, T);
> > + J->tracenum++;
>
> Ditto.
And here too.
>
<snipped>
> > + /* Code incrementing cdatanum is sparse to avoid mips data hazards. */
>
> Side note: LOL, here you are :)
Finally :)
>
> > + emit_setgl(as, RID_RET+2, gc.cdatanum);
>
> Well, I glanced a MIPS register-usage convention and AFAICS $4 register
> (RID_RET + 2) is a general-purpose (i.e. doesn't store 0 or preserved by
> kernel) caller-safe one. Ergo it should be allocated it in a proper way
> from scratch set, shouldn't it?
>
AFAIK, $a0 - $a3 ($4 - $7) registers are arguments to functions - not
preserved by subprograms.
But anyway explicit allocation is better here. Added.
> > /* Initialize gct and ctypeid. lj_mem_newgco() already sets marked. */
<snipped>
> > +/* Garbage collector states. Order matters. */
> > +enum {
> > + GCSpause, /* Start a GC cycle and mark the root set.*/
> > + GCSpropagate, /* One gray object is processed. */
> > + GCSatomic, /* Atomic transition from mark to sweep phase. */
> > + GCSsweepstring, /* Sweep one chain of strings. */
> > + GCSsweep, /* Sweep few objects from root. */
> > + GCSfinalize, /* Finalize one userdata or cdata object. */
> > + GCSmax
>
> Adjust the comments considering the code nearby.
Sure! Thanks!
>
> > +};
> > +
> > typedef struct GCState {
> > GCSize total; /* Memory currently allocated. */
> > GCSize threshold; /* Memory threshold. */
>
> <snipped>
>
> > @@ -602,22 +622,24 @@ typedef struct global_State {
> > BloomFilter next[2];
> > } strbloom;
> > #endif
> > + size_t strhash_hit; /* Strings amount found in string hash. */
> > + size_t strhash_miss; /* Strings amount allocated and put into string hash. */
> > lua_Alloc allocf; /* Memory allocator. */
> > void *allocd; /* Memory allocator data. */
> > GCState gc; /* Garbage collector. */
> > - volatile int32_t vmstate; /* VM state or current JIT code trace number. */
> > SBuf tmpbuf; /* Temporary string buffer. */
> > GCstr strempty; /* Empty string. */
> > uint8_t stremptyz; /* Zero terminator of empty string. */
> > uint8_t hookmask; /* Hook mask. */
> > uint8_t dispatchmode; /* Dispatch mode. */
> > uint8_t vmevmask; /* VM event mask. */
> > + int32_t hookcount; /* Instruction hook countdown. */
> > GCRef mainthref; /* Link to main thread. */
> > TValue registrytv; /* Anchor for registry. */
> > - TValue tmptv, tmptv2; /* Temporary TValues. */
> > + TValue tmptv2, tmptv; /* Temporary TValues. */
> > Node nilnode; /* Fallback 1-element hash part (nil key and value). */
> > GCupval uvhead; /* Head of double-linked list of all open upvalues. */
> > - int32_t hookcount; /* Instruction hook countdown. */
> > + volatile int32_t vmstate; /* VM state or current JIT code trace number. */
>
> There is no a single word in the commit message regarding this unclear
> change. Please drop a sentence about the rationale for this reordering.
I've changed commit message as follows:
===================================================================
core: introduce various platform metrics
This patch introduces the following counters:
- overall amount of allocated tables, cdata and udata objects
- number of incremental GC steps grouped by GC state
- number of string hashes hits and misses
- amount of allocated and freed memory
- number of trace aborts, number of traces and restored snapshots
Also this patch fixes alignment for 64-bit architectures.
NB: MSize and BCIns are the only fixed types that equal 32 bits. GCRef,
MRef and GCSize sizes depend on LJ_GC64 define.
struct GCState is terminated by three fields: GCSize estimate, MSize
stepmul and MSize pause, which are aligned. The introduces size_t
fields do not violate the alignment too.
vmstate 32-bit field goes right after GCState field within global_State
structure. The next field tmpbuf consists of several MRef fields that
have 64-bit size each. This issue can be solved by moving vmstate field
below. However DynASM doesn't work well with unaligned memory access on
64-bit bigendian MIPS, so vmstate should be aligned to a 64-bit
boundary.
Furthermore field order has been changed to be able to compile code by
DynASM for 32-bit ARM too (see also
https://github.com/openresty/luajit2/issues/37#issuecomment-459145226).
Interfaces to obtain these metrics via both Lua and C API are
introduced in the next patch.
Part of tarantool/tarantool#5187
===================================================================
Side note: If you want read a little bit more about ARM immediate value
encoding (and play with it) see also [1].
>
<snipped>
> > --
> > 2.28.0
> >
>
> --
> Best regards,
> IM
See iterative patch in the bottom. Branch force-pushed.
===================================================================
diff --git a/src/lj_asm.c b/src/lj_asm.c
index 8fb3ccd..c2cf5a9 100644
--- a/src/lj_asm.c
+++ b/src/lj_asm.c
@@ -2273,7 +2273,6 @@ void lj_asm_trace(jit_State *J, GCtrace *T)
as->J = J;
as->T = T;
J->curfinal = lj_trace_alloc(J->L, T); /* This copies the IR, too. */
- J->tracenum++;
as->flags = J->flags;
as->loopref = J->loopref;
as->realign = NULL;
@@ -2380,7 +2379,6 @@ void lj_asm_trace(jit_State *J, GCtrace *T)
lj_trace_free(J2G(J), J->curfinal);
J->curfinal = NULL; /* In case lj_trace_alloc() OOMs. */
J->curfinal = lj_trace_alloc(J->L, T);
- J->tracenum++;
as->realign = NULL;
}
diff --git a/src/lj_asm_mips.h b/src/lj_asm_mips.h
index f4b4b5d..0341701 100644
--- a/src/lj_asm_mips.h
+++ b/src/lj_asm_mips.h
@@ -1430,7 +1430,9 @@ static void asm_cnew(ASMState *as, IRIns *ir)
CTInfo info = lj_ctype_info(cts, id, &sz);
const CCallInfo *ci = &lj_ir_callinfo[IRCALL_lj_mem_newgco];
IRRef args[4];
+ RegSet allow = (RSET_GPR & ~RSET_SCRATCH);
RegSet drop = RSET_SCRATCH;
+ Reg tmp;
lua_assert(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL));
as->gcsteps++;
@@ -1442,7 +1444,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
/* Initialize immutable cdata object. */
if (ir->o == IR_CNEWI) {
- RegSet allow = (RSET_GPR & ~RSET_SCRATCH);
#if LJ_32
int32_t ofs = sizeof(GCcdata);
if (sz == 8) {
@@ -1473,15 +1474,16 @@ static void asm_cnew(ASMState *as, IRIns *ir)
return;
}
+ tmp = ra_scratch(as, allow);
/* Code incrementing cdatanum is sparse to avoid mips data hazards. */
- emit_setgl(as, RID_RET+2, gc.cdatanum);
+ emit_setgl(as, tmp, gc.cdatanum);
/* Initialize gct and ctypeid. lj_mem_newgco() already sets marked. */
emit_tsi(as, MIPSI_SB, RID_RET+1, RID_RET, offsetof(GCcdata, gct));
emit_tsi(as, MIPSI_SH, RID_TMP, RID_RET, offsetof(GCcdata, ctypeid));
- emit_tsi(as, MIPSI_AADDIU, RID_RET+2, RID_RET+2, 1);
+ emit_tsi(as, MIPSI_AADDIU, tmp, tmp, 1);
emit_ti(as, MIPSI_LI, RID_RET+1, ~LJ_TCDATA);
emit_ti(as, MIPSI_LI, RID_TMP, id); /* Lower 16 bit used. Sign-ext ok. */
- emit_getgl(as, RID_RET+2, gc.cdatanum);
+ emit_getgl(as, tmp, gc.cdatanum);
args[0] = ASMREF_L; /* lua_State *L */
args[1] = ASMREF_TMP1; /* MSize size */
asm_gencall(as, ci, args);
diff --git a/src/lj_obj.h b/src/lj_obj.h
index 600b68d..927b347 100644
--- a/src/lj_obj.h
+++ b/src/lj_obj.h
@@ -573,12 +573,12 @@ typedef enum {
/* Garbage collector states. Order matters. */
enum {
- GCSpause, /* Start a GC cycle and mark the root set.*/
- GCSpropagate, /* One gray object is processed. */
- GCSatomic, /* Atomic transition from mark to sweep phase. */
- GCSsweepstring, /* Sweep one chain of strings. */
- GCSsweep, /* Sweep few objects from root. */
- GCSfinalize, /* Finalize one userdata or cdata object. */
+ GCSpause, /* Start a GC cycle and mark the root set.*/
+ GCSpropagate, /* One gray object is processed. */
+ GCSatomic, /* Atomic transition from mark to sweep phase. */
+ GCSsweepstring, /* Sweep one chain of strings. */
+ GCSsweep, /* Sweep few objects from root. */
+ GCSfinalize, /* Finalize one userdata or cdata object. */
GCSmax
};
diff --git a/src/lj_trace.c b/src/lj_trace.c
index 9bcbce6..86563cd 100644
--- a/src/lj_trace.c
+++ b/src/lj_trace.c
@@ -136,6 +136,7 @@ GCtrace * LJ_FASTCALL lj_trace_alloc(lua_State *L, GCtrace *T)
T2->nsnap = T->nsnap;
T2->nsnapmap = T->nsnapmap;
memcpy(p, T->ir + T->nk, szins);
+ L2J(L)->tracenum++;
return T2;
}
===================================================================
[1]: https://alisdair.mcdiarmid.org/arm-immediate-value-encoding/
--
Best regards,
Sergey Kaplun
More information about the Tarantool-patches
mailing list