[Tarantool-patches] [PATCH v4 1/2] core: introduce various platform metrics

Igor Munkin imun at tarantool.org
Wed Oct 7 23:16:01 MSK 2020


Sergey,

Thanks for your fixes! There is still one comment regarding CNEW
assembling and a couple of minor ones below.

On 07.10.20, Sergey Kaplun wrote:
> On 07.10.20, Igor Munkin wrote:
> > Sergey,
> > 
> > Thanks for the patch! Please consider my comments below.
> > 
> > On 05.10.20, Sergey Kaplun wrote:

<snipped>

> 
> > 
> > > +  emit_setgl(as, RID_RET+2, gc.cdatanum);
> > 
> > Well, I glanced at the MIPS register-usage convention and AFAICS the $4
> > register (RID_RET + 2) is a general-purpose (i.e. it neither stores 0 nor
> > is preserved by the kernel) caller-safe one. Ergo it should be allocated
> > in a proper way from the scratch set, shouldn't it?
> > 
> 
> AFAIK, $a0 - $a3 ($4 - $7) registers are arguments to functions - not
> preserved by subprograms.

Yes, but there is e.g. $8, which is a temporary one, isn't it? Anyway, you
can't just pick a particular register, since it may already be
allocated by RA. So it *has* to be explicitly allocated to avoid a data
clash on the trace. I strongly believe the reason you see no failures in
the tests is simply a lucky coincidence (or tiny traces).
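
To illustrate the clash described above, here is a toy model in C. It
is only a sketch: every name in it (ToyAlloc, toy_hold, toy_scratch,
toy_demo) is hypothetical and none of it is LuaJIT's actual register
allocator. It assumes an 8-register machine where a set bit in the mask
means "free".

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: none of these names belong to LuaJIT. */
typedef uint32_t ToyRegSet;          /* Bit r set => register r is free. */
#define TOY_NREGS 8

typedef struct ToyAlloc {
  ToyRegSet freeset;                 /* Registers not holding a live value. */
  int value[TOY_NREGS];              /* Simulated register contents. */
} ToyAlloc;

/* Record that register r already holds the live value v. */
static void toy_hold(ToyAlloc *a, int r, int v)
{
  a->freeset &= ~((ToyRegSet)1 << r);
  a->value[r] = v;
}

/* Hand out the lowest free register within `allow`, or -1 if none is left. */
static int toy_scratch(ToyAlloc *a, ToyRegSet allow)
{
  ToyRegSet pick = a->freeset & allow;
  int r;
  for (r = 0; r < TOY_NREGS; r++) {
    if (pick & ((ToyRegSet)1 << r)) {
      a->freeset &= ~((ToyRegSet)1 << r);
      return r;
    }
  }
  return -1;
}

/* Returns the scratch register chosen while r4 is live. */
static int toy_demo(void)
{
  ToyAlloc a = { 0xffu, {0} };
  int tmp;
  toy_hold(&a, 4, 42);               /* The allocator already gave out r4. */
  /* A hard-coded "use r4" here would overwrite the live 42. */
  tmp = toy_scratch(&a, 0xf0u);      /* Ask for any free register in r4..r7. */
  assert(tmp != 4);                  /* The allocator skips the live register. */
  a.value[tmp] = 1;
  assert(a.value[4] == 42);          /* The live value survives. */
  return tmp;
}
```

In this model toy_demo() picks r5, because r4 is no longer in the free
set. Writing to r4 directly would silently destroy the live value, and
that goes unnoticed only as long as nothing else happens to live in r4.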

> But anyway explicit allocation is better here. Added.
> 
> > >    /* Initialize gct and ctypeid. lj_mem_newgco() already sets marked. */
> 

<snipped>

> 
> I've changed commit message as follows:
> 
> ===================================================================
> core: introduce various platform metrics
> 
> This patch introduces the following counters:
>   - overall amount of allocated tables, cdata and udata objects
>   - number of incremental GC steps grouped by GC state
>   - number of string hashes hits and misses
>   - amount of allocated and freed memory
>   - number of trace aborts, number of traces and restored snapshots
> 
> Also this patch fixes alignment for 64-bit architectures.
> 
> NB: MSize and BCIns are the only fixed types that equal 32 bits. GCRef,
> MRef and GCSize sizes depend on LJ_GC64 define.
> 
> struct GCState is terminated by three fields: GCSize estimate, MSize
> stepmul and MSize pause, which are aligned. The introduces size_t

Typo: s/introduces/introduced/.

> fields do not violate the alignment too.
> 
> vmstate 32-bit field goes right after GCState field within global_State
> structure. The next field tmpbuf consists of several MRef fields that
> have 64-bit size each. This issue can be solved by moving vmstate field
> below. However DynASM doesn't work well with unaligned memory access on
> 64-bit bigendian MIPS, so vmstate should be aligned to a 64-bit
> boundary.
> 
> Furthermore field order has been changed to be able to compile code by
> DynASM for 32-bit ARM too (see also
> https://github.com/openresty/luajit2/issues/37#issuecomment-459145226).
> 
> Interfaces to obtain these metrics via both Lua and C API are
> introduced in the next patch.
> 
> Part of tarantool/tarantool#5187
> ===================================================================
> 
> Side note: If you want to read a little bit more about ARM immediate
> value encoding (and play with it) see also [1].

Thanks.
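
For readers who do not want to follow the link: an ARM (A32)
data-processing immediate is an 8-bit value rotated right by an even
number of bits. The sketch below checks whether a 32-bit constant fits
that scheme. The function names are made up for illustration and are
not part of LuaJIT or DynASM. My assumption is that this is why the
field order matters for the 32-bit ARM build: a field offset that ends
up as such an immediate has to be encodable.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (0 <= n < 32). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
  return n ? (v << n) | (v >> (32 - n)) : v;
}

/* Returns 1 if v is an 8-bit value rotated right by an even amount. */
static int arm_imm_encodable(uint32_t v)
{
  unsigned rot;
  for (rot = 0; rot < 32; rot += 2) {
    /* Undo a rotate-right by `rot`; the result must fit in 8 bits. */
    if ((rotl32(v, rot) & ~0xffu) == 0)
      return 1;
  }
  return 0;
}

/* A few spot checks on well-known cases. */
static void arm_imm_selfcheck(void)
{
  assert(arm_imm_encodable(0xffu));        /* Fits without rotation. */
  assert(arm_imm_encodable(0x3fcu));       /* 0xff rotated right by 30. */
  assert(!arm_imm_encodable(0x101u));      /* Set bits span 9 positions. */
  assert(!arm_imm_encodable(0x1feu));      /* Would need an odd rotation. */
}
```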

> 

<snipped>

> 
> See the iterative patch at the bottom. Branch force-pushed.
> 
> ===================================================================

<snipped>

> diff --git a/src/lj_asm_mips.h b/src/lj_asm_mips.h
> index f4b4b5d..0341701 100644
> --- a/src/lj_asm_mips.h
> +++ b/src/lj_asm_mips.h
> @@ -1430,7 +1430,9 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>    CTInfo info = lj_ctype_info(cts, id, &sz);
>    const CCallInfo *ci = &lj_ir_callinfo[IRCALL_lj_mem_newgco];
>    IRRef args[4];
> +  RegSet allow = (RSET_GPR & ~RSET_SCRATCH);
>    RegSet drop = RSET_SCRATCH;
> +  Reg tmp;
>    lua_assert(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL));
>  
>    as->gcsteps++;
> @@ -1442,7 +1444,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>  
>    /* Initialize immutable cdata object. */
>    if (ir->o == IR_CNEWI) {
> -    RegSet allow = (RSET_GPR & ~RSET_SCRATCH);
>  #if LJ_32
>      int32_t ofs = sizeof(GCcdata);
>      if (sz == 8) {
> @@ -1473,15 +1474,16 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>      return;
>    }
>  
> +  tmp = ra_scratch(as, allow);

Since registers are already allocated within the IR_CNEWI assembling
above, you need to exclude them from the <allow> set before scratching
a new one.
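
What I mean by excluding registers can be sketched on a plain bitmask.
This is not the actual patch: the names below (DemoRegSet,
demo_exclude_used, demo_pickbot) are hypothetical stand-ins, and the
real code would go through the RA helpers in the asm backend.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for a register-set bitmask; not LuaJIT's types. */
typedef uint32_t DemoRegSet;
#define DEMO_RID2RSET(r)  ((DemoRegSet)1 << (r))

/* Return `allow` with every register listed in used[0..n-1] removed. */
static DemoRegSet demo_exclude_used(DemoRegSet allow, const int *used, int n)
{
  int i;
  for (i = 0; i < n; i++)
    allow &= ~DEMO_RID2RSET(used[i]);
  return allow;
}

/* Lowest-numbered register in the set, or -1 if the set is empty. */
static int demo_pickbot(DemoRegSet rs)
{
  int r;
  for (r = 0; r < 32; r++)
    if (rs & DEMO_RID2RSET(r))
      return r;
  return -1;
}

/* Pick a scratch register only after narrowing the allowed set. */
static int demo_scratch_after_exclude(void)
{
  const int used[2] = { 16, 17 };        /* Taken by earlier allocations. */
  DemoRegSet allow = 0x00ff0000u;        /* r16..r23 are candidates. */
  assert(demo_pickbot(allow) == 16);     /* Unnarrowed set would reuse r16. */
  allow = demo_exclude_used(allow, used, 2);
  return demo_pickbot(allow);
}
```

Without the narrowing step the pick lands on a register that is already
in use. With it, the first candidate is r18 in this example.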

>    /* Code incrementing cdatanum is sparse to avoid mips data hazards. */
> -  emit_setgl(as, RID_RET+2, gc.cdatanum);
> +  emit_setgl(as, tmp, gc.cdatanum);
>    /* Initialize gct and ctypeid. lj_mem_newgco() already sets marked. */
>    emit_tsi(as, MIPSI_SB, RID_RET+1, RID_RET, offsetof(GCcdata, gct));
>    emit_tsi(as, MIPSI_SH, RID_TMP, RID_RET, offsetof(GCcdata, ctypeid));
> -  emit_tsi(as, MIPSI_AADDIU, RID_RET+2, RID_RET+2, 1);
> +  emit_tsi(as, MIPSI_AADDIU, tmp, tmp, 1);
>    emit_ti(as, MIPSI_LI, RID_RET+1, ~LJ_TCDATA);
>    emit_ti(as, MIPSI_LI, RID_TMP, id); /* Lower 16 bit used. Sign-ext ok. */
> -  emit_getgl(as, RID_RET+2, gc.cdatanum);
> +  emit_getgl(as, tmp, gc.cdatanum);
>    args[0] = ASMREF_L;     /* lua_State *L */
>    args[1] = ASMREF_TMP1;  /* MSize size   */
>    asm_gencall(as, ci, args);

<snipped>

> ===================================================================
> 
> [1]: https://alisdair.mcdiarmid.org/arm-immediate-value-encoding/
> 
> -- 
> Best regards,
> Sergey Kaplun

-- 
Best regards,
IM

