[Tarantool-patches] [PATCH luajit v3 1/7] vm: save topframe info into global_State

Sergey Kaplun skaplun at tarantool.org
Thu Apr 7 12:47:13 MSK 2022


Hi, Maxim!

Thanks for the patch!

Please consider my review comments below.

However, they are almost the same as for the previous (v1) version, since
there has been neither feedback nor related changes.

On 06.04.22, Maxim Kokryashkin wrote:
> From: Mikhail Shishatskiy <m.shishatskiy at tarantool.org>
> 
> Since commit 111d377d524e54e02187148a1832683291d620b2
> ('vm: introduce VM states for Lua and fast functions')
> the VM has LFUNC and FFUNC states. The upcoming sampling
> profiler uses these vmstates to determine if the guest
> stack is valid or not. So, we need to provide a There is an inconsistent behavior
> of the VM when the Lua stack is not valid, but the state
> is set to LFUNC. This patch is just a gross hack with which
> the profiler works fine.

What hack are you talking about?

>                          The problem is to be investigated
> more deeply :(

Typo: s/ :(/./.

Minor: I suggest dropping this line.

Minor: the commit message lines are not filled well (except the one line
that is filled to the brim :).

> ---
>  src/lj_obj.h    | 12 ++++++++++++
>  src/vm_x64.dasc | 52 +++++++++++++++++++++++++++++++++++++++----------
>  src/vm_x86.dasc | 52 +++++++++++++++++++++++++++++++++++++++----------
>  3 files changed, 96 insertions(+), 20 deletions(-)
> 
> diff --git a/src/lj_obj.h b/src/lj_obj.h
> index d26e60be..b76c3155 100644
> --- a/src/lj_obj.h
> +++ b/src/lj_obj.h
> @@ -514,6 +514,17 @@ typedef struct GCtab {
>  #define setfreetop(t, n, v)	(setmref((n)->freetop, (v)))
>  #endif
>  
> +/* -- Misc objects -------------------------------------------------------- */

Minor: I suggest naming this section Profiler instead of Misc.

> +
> +struct lj_sysprof_topframe {
> +  uint8_t ffid;          /* FFUNC: fast function id. */

Why can't this field be a part of the union?
Since vmstate is always set, we always know which union member is active.
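
A minimal sketch of the union-only layout in question, assuming the reader
always checks vmstate before touching the union. All `sketch_*` and
`SKETCH_*` names are hypothetical stand-ins, not LuaJIT definitions; the
real `TValue` and `lua_CFunction` are replaced by placeholders so the
snippet compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for LuaJIT types (not the real definitions). */
typedef struct sketch_TValue { uint64_t u64; } sketch_TValue;
typedef int (*sketch_CFunction)(void *L);

/* Hypothetical subset of VM states; the state selects the union member. */
enum sketch_vmstate {
  SKETCH_VMST_LFUNC,
  SKETCH_VMST_FFUNC,
  SKETCH_VMST_CFUNC
};

/* ffid moved inside the union: vmstate plays the role of the tag. */
struct sketch_topframe_tagged {
  union {
    uint64_t raw;               /* Raw value for context save/restore. */
    sketch_TValue *interp_base; /* LFUNC: base of the executed coroutine. */
    sketch_CFunction cf;        /* CFUNC: address of the C function. */
    uint8_t ffid;               /* FFUNC: fast function id. */
  } guesttop;
};

/* Return the payload matching the given state, widened to 64 bits. */
static uint64_t sketch_payload(const struct sketch_topframe_tagged *tf,
                               enum sketch_vmstate st)
{
  assert(tf != NULL);
  switch (st) {
  case SKETCH_VMST_FFUNC: return tf->guesttop.ffid;
  case SKETCH_VMST_LFUNC: return (uint64_t)(uintptr_t)tf->guesttop.interp_base;
  case SKETCH_VMST_CFUNC: return (uint64_t)(uintptr_t)tf->guesttop.cf;
  }
  return 0;
}
```

With this layout the structure is exactly as large as its `raw` member, so
saving and restoring `raw` covers every variant.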

> +  union {
> +    uint64_t raw;        /* Raw value for context save/restore. */
> +    TValue *interp_base; /* LFUNC: Base of the executed coroutine. */
> +    lua_CFunction cf;    /* CFUNC: Address of the C function. */


Nit: please use tabs instead of spaces for comment alignment, as is
done for other structures in this header.

> +  } guesttop;
> +};
> +
>  /* -- State objects ------------------------------------------------------- */
>  
>  /* VM states. */
> @@ -674,6 +685,7 @@ typedef struct global_State {
>    MRef jit_base;	/* Current JIT code L->base or NULL. */
>    MRef ctype_state;	/* Pointer to C type state. */
>    GCRef gcroot[GCROOT_MAX];  /* GC roots. */
> +  struct lj_sysprof_topframe top_frame;  /* Top frame for sysprof */

Nit: I suppose that this structure should be introduced only if sysprof
is enabled. OTOH, I see no reason to hide this structure, so I suggest
leaving it as is for now.

Side note: Also, I'm not sure that this field has an addressable offset
on the ARM architecture (for DynASM).

Typo: s/for sysprof/for sysprof./

>  } global_State;
>  
>  #define mainthread(g)	(&gcref(g->mainthref)->th)
> diff --git a/src/vm_x64.dasc b/src/vm_x64.dasc
> index 974047d3..c4beb5e7 100644
> --- a/src/vm_x64.dasc
> +++ b/src/vm_x64.dasc
> @@ -345,6 +345,35 @@
>  |  mov dword [DISPATCH+DISPATCH_GL(vmstate)], ~LJ_VMST_..st
>  |.endmacro
>  |
> +|// Stash interpreter's internal base and enter LFUNC VM state.
> +|// PROFILER: Each time profiler sees LFUNC state, it will inspect [BASE-1]

I suppose that this is not valid for GC64. LuaJIT uses a 2-slot frame
info here. The func value is in the BASE - 2 slot (see <src/lj_frame.h>
for details). So this part should be adjusted.

| (gdb) f 0
| #0  lj_cf_print (L=0x7ffff7c83378) at src/lib_base.c:486
| 486       ptrdiff_t i, nargs = L->top - L->base;
|
| (gdb) lj-arch
| LJ_64: True, LJ_GC64: True, LJ_DUALNUM: False
| (gdb) p gcval(L->base - 2)->fn.l.ffid
| $11 = 29 '\035' # print
| (gdb) p gcval(L->base - 1)->fn.l.ffid
| $12 = 200 '\310'
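
A simplified model of the slot arithmetic above, assuming only what the gdb
session shows: with the 2-slot frame info the function sits at base - 2,
otherwise at base - 1. The `sketch_*` types are hypothetical toys; in
particular, the real non-GC64 slot also packs the frame link next to the
function reference, which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy function object: only the field the profiler is interested in. */
typedef struct sketch_func { uint8_t ffid; } sketch_func;

/* Toy stack slot: either a function reference or a frame link word. */
typedef union sketch_slot {
  sketch_func *func;
  uint64_t ftsz;
} sketch_slot;

/*
** Locate the function of the frame whose base is `base`.
** fr2 != 0 models the 2-slot frame info (func at base-2, link at base-1);
** fr2 == 0 models the 1-slot frame info (func found via base-1).
*/
static const sketch_func *sketch_frame_func(const sketch_slot *base, int fr2)
{
  assert(base != NULL);
  return fr2 ? base[-2].func : base[-1].func;
}
```

Reading base[-1] as a function in the 2-slot case would interpret the frame
link as a pointer, which matches the garbage ffid in the second gdb print.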

Minor: please use well-known special comments instead of PROFILER.
XXX [1] looks good here, IMO.
Here and below.
| Use XXX in a comment to flag something that is bogus but works.

> +|// expecting to see a valid framelink there. So enter this state only when
> +|// BASE is stable and slots are not moved on the stack.
> +|.macro set_vmstate_lfunc
> +|  set_vmstate INTERP // Guard for non-atomic VM context restoration

Nit: missing period at the end of the sentence.
Here and below.

> +|  mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], BASE
> +|  set_vmstate LFUNC
> +|.endmacro
> +|
> +|// Stash ID of the fast function about to be executed and enter FFUNC VM state.
> +|// PROFILER: Each time profiler sees FFUNC state, it will write ffid
> +|// to the profile stream.
> +|.macro set_vmstate_ffunc
> +|  set_vmstate INTERP // Guard for non-atomic VM context restoration
> +|  mov XCHGd, dword [BASE-8]

The XCHGd register is not defined in <vm_x64.dasc>.

| /home/burii/reviews/luajit/sysprof/src/vm_x64.dasc:501: error: bad operand mode in `mov i?,i?':
|  |  mov XCHGd, L:RBa->base
| ...

Also, BASE stands for a 64-bit register (*).
| ...
| /home/burii/reviews/luajit/sysprof/src/vm_x64.dasc:467: error: mixed operand size in `mov xd,rq':
|   |  set_vmstate_cfunc
|   |    mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], BASE       [MACRO set_vmstate_cfunc (0)]

> +|  mov dword [DISPATCH+DISPATCH_GL(top_frame.ffid)], XCHGd

Don't get it. Why is it `dword` instead of `byte`?

> +|  set_vmstate FFUNC
> +|.endmacro
> +|
> +|// Stash address of the C function about to be executed and enter CFUNC VM state.

Nit: the line is longer than 80 characters.

> +|// PROFILER: Each time profiler sees CFUNC state, it will write this address
> +|// to the profile stream.
> +|.macro set_vmstate_cfunc
> +|  set_vmstate INTERP // Guard for non-atomic VM context restoration
> +|  mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], BASE

Ditto (*).

> +|  set_vmstate CFUNC
> +|.endmacro
> +|
>  |// Uses TMPRd (r10d).
>  |.macro save_vmstate
>  |.if not WIN
> @@ -435,7 +464,7 @@ static void build_subroutines(BuildCtx *ctx)
>    |  jnz ->vm_returnp
>    |
>    |  // Return to C.
> -  |  set_vmstate CFUNC
> +  |  set_vmstate_cfunc
>    |  and PC, -8
>    |  sub PC, BASE
>    |  neg PC				// Previous base = BASE - delta.
> @@ -467,6 +496,9 @@ static void build_subroutines(BuildCtx *ctx)
>    |  xor eax, eax			// Ok return status for vm_pcall.
>    |
>    |->vm_leave_unw:
> +  |  set_vmstate INTERP // Guard for non-atomic VM context restoration
> +  |  mov XCHGd, L:RBa->base

The RBa register is not defined in <vm_x64.dasc>.

> +  |  mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], XCHGd
>    |  // DISPATCH required to set properly.
>    |  restore_vmstate			// Caveat: uses TMPRd (r10d).
>    |  restoreregs
> @@ -725,7 +757,7 @@ static void build_subroutines(BuildCtx *ctx)

<snipped>

> diff --git a/src/vm_x86.dasc b/src/vm_x86.dasc
> index ab8e6f27..222754fe 100644
> --- a/src/vm_x86.dasc
> +++ b/src/vm_x86.dasc
> @@ -443,6 +443,35 @@
>  |  mov dword [DISPATCH+DISPATCH_GL(vmstate)], ~LJ_VMST_..st
>  |.endmacro
>  |
> +|// Stash interpreter's internal base and enter LFUNC VM state.
> +|// PROFILER: Each time profiler sees LFUNC state, it will inspect [BASE-1]
> +|// expecting to see a valid framelink there. So enter this state only when
> +|// BASE is stable and slots are not moved on the stack.
> +|.macro set_vmstate_lfunc
> +|  set_vmstate INTERP // Guard for non-atomic VM context restoration
> +|  mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], BASE
> +|  set_vmstate LFUNC
> +|.endmacro
> +|
> +|// Stash ID of the fast function about to be executed and enter FFUNC VM state.
> +|// PROFILER: Each time profiler sees FFUNC state, it will write ffid
> +|// to the profile stream.
> +|.macro set_vmstate_ffunc
> +|  set_vmstate INTERP // Guard for non-atomic VM context restoration
> +|  mov XCHGd, dword [BASE-8]

This register is defined only for the x64 architecture.
The error occurs when building with the following command:

| $ make CC="gcc -m32" -f Makefile.original -j
| ...
| DYNASM    host/buildvm_arch.h
| vm_x86.dasc:632: error: bad operand mode in `mov i?,x?':
|   |  mov XCHGd, L:RBa->base
| vm_x86.dasc:1544: error: bad operand mode in `mov i?,xd':
|   |.ffunc_1 assert
|   |    mov XCHGd, dword [BASE-8]        [MACRO set_vmstate_ffunc (0)]
| vm_x86.dasc:1572: error: bad operand mode in `mov i?,xd':
|   |.ffunc_1 type
|   |    mov XCHGd, dword [BASE-8]        [MACRO set_vmstate_ffunc (0)]
| vm_x86.dasc:1599: error: bad operand mode in `mov i?,xd':
|   |.ffunc_1 getmetatable
|   |    mov XCHGd, dword [BASE-8]        [MACRO set_vmstate_ffunc (0)]

Side note: Unfortunately, for now it's impossible to force a 32-bit build
via CMake. It requires proxying CMAKE_C_FLAGS to the macro in LuaJITUtils,
IINM.

> +|  mov dword [DISPATCH+DISPATCH_GL(top_frame.ffid)], XCHGd

Please clarify the following things:

1) IINM, we save not the ffid but the GCref for this function (since we
   do not load the ffid field from the function).
2) Why do we store 32 bits instead of 8? Yes, the struct is not packed and
   there is a hole in it, so it works correctly. But it is a little bit
   confusing. Also, why can't we write BASE here too and inspect ffid
   from this base later?
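
To make point 2 concrete, here is a small sketch that emulates both store
widths in C, assuming a layout that mirrors the struct from the patch. The
`sketch_*` names are hypothetical, and `memcpy` merely stands in for the
`mov dword` instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the patch's layout: one byte, a hole, a union. */
struct sketch_topframe_holed {
  uint8_t ffid;
  union {
    uint64_t raw;
    void *interp_base;
  } guesttop;
};

/* Emulate `mov dword [&tf->ffid], value`: a 4-byte store at ffid's address. */
static void sketch_store_dword(struct sketch_topframe_holed *tf, uint32_t value)
{
  /* Harmless only while the padding after ffid is at least 3 bytes wide. */
  assert(offsetof(struct sketch_topframe_holed, guesttop) >= sizeof(value));
  memcpy((unsigned char *)tf + offsetof(struct sketch_topframe_holed, ffid),
         &value, sizeof(value));
}

/* Emulate `mov byte [&tf->ffid], value`: touches exactly one byte. */
static void sketch_store_byte(struct sketch_topframe_holed *tf, uint8_t value)
{
  tf->ffid = value;
}
```

The wide store leaves the union intact only thanks to the padding, and on a
little-endian target ffid ends up holding just the low byte of whatever
32-bit value was written.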

> +|  set_vmstate FFUNC
> +|.endmacro
> +|
> +|// Stash address of the C function about to be executed and enter CFUNC VM state.
> +|// PROFILER: Each time profiler sees CFUNC state, it will write this address
> +|// to the profile stream.
> +|.macro set_vmstate_cfunc
> +|  set_vmstate INTERP // Guard for non-atomic VM context restoration
> +|  mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], BASE
> +|  set_vmstate CFUNC
> +|.endmacro
> +|
>  |// Uses spilled ecx on x86 or XCHGd (r11d) on x64.
>  |.macro save_vmstate
>  |.if not WIN
> @@ -560,7 +589,7 @@ static void build_subroutines(BuildCtx *ctx)

<snipped>

> @@ -599,6 +628,9 @@ static void build_subroutines(BuildCtx *ctx)
>    |  xor eax, eax			// Ok return status for vm_pcall.
>    |
>    |->vm_leave_unw:
> +  |  set_vmstate INTERP // Guard for non-atomic VM context restoration
> +  |  mov XCHGd, L:RBa->base

AFAICS, there is no guarantee that L:RBa is set up to `lua_State *`.

For example, during `vm_unwind_c_eh` the RB register is set to
`global_State *`.

|->vm_unwind_c_eh:			// Landing pad for external unwinder.
|  mov L:DISPATCH, SAVE_L
|  mov GL:RB, L:DISPATCH->glref
|  mov dword GL:RB->cur_L, L:DISPATCH
|  mov dword GL:RB->vmstate, ~LJ_VMST_CFUNC
|  mov DISPATCH, L:DISPATCH->glref	// Setup pointer to dispatch table.
|  add DISPATCH, GG_G2DISP
|  jmp ->vm_leave_unw

> +  |  mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], XCHGd
>    |  // DISPATCH required to set properly.
>    |  restore_vmstate			// Caveat: on x64 uses XCHGd (r11d).
>    |  restoreregs
> @@ -934,7 +966,7 @@ static void build_subroutines(BuildCtx *ctx)

<snipped>

> -- 
> 2.35.1
> 

[1]: https://www.oracle.com/java/technologies/javase/codeconventions-programmingpractices.html

-- 
Best regards,
Sergey Kaplun

