[Tarantool-patches] [PATCH luajit 1/7] core: save current frame top to lj_obj

Sergey Kaplun skaplun at tarantool.org
Mon Sep 20 20:21:31 MSK 2021


Hi, Maxim, Mikhail!

Thanks for the patch!

Please consider my comments below.

Side note: please share Tarantool branch with bumped LuaJIT version.

On 08.09.21, Maxim Kokryashkin wrote:
> From: Mikhail Shishatskiy <m.shishatskiy at tarantool.org>

| core: save current frame top to lj_obj

Minor: It is better to use "vm:" prefix here. Also, "frame top to
lj_obj" sounds a little bit confusing to me. I suggest reformulate it
like the following: "topframe info into global_State".

> 
> Since commit 111d377d524e54e02187148a1832683291d620b2

Minor: Please mention any commit in the following format:

hashhashhashhashhashhashhashhashhash ('commit subject here')

> the VM has LFUNC and FFUNC states. The upcoming sampling
> profiler uses these vmstates to determine if the guest
> stack is valid or not. There is an inconsistent behavior
> of the VM when the Lua stack is not valid, but the state
> is set to LFUNC. This patch is just a gross hack with which
> the profiler works fine.

You've said offline, that there is some problem nearby `vm_growstack`
functionality. From this commit message and commit itself I

1) don't see any related changes (I don't see any hack :))
2) have no idea about what exactly the problem that this hack solved

Also, I notice that there are two unrelated changes: refactoring of code
with general macro `set_vmstate_.func` and saving metainformation
information inside the topframe structure.

NB: May be I excessively granulate changes, and the second reviewer does
not agree with me at this point.

>                          The problem is to be investigated
> more deeply :(

Typo: s/ :(/./. Moreover, I suggest to drop this sentence.

Minor: looks like commit message lines is not fullfilled.
Nit: please add corresponding "Part of " or may be "Need for " keyword.

> ---
>  src/lj_obj.h    | 12 ++++++++++++
>  src/vm_x64.dasc | 52 +++++++++++++++++++++++++++++++++++++++----------
>  src/vm_x86.dasc | 52 +++++++++++++++++++++++++++++++++++++++----------
>  3 files changed, 96 insertions(+), 20 deletions(-)
> 
> diff --git a/src/lj_obj.h b/src/lj_obj.h
> index d26e60be..b76c3155 100644
> --- a/src/lj_obj.h
> +++ b/src/lj_obj.h
> @@ -514,6 +514,17 @@ typedef struct GCtab {
>  #define setfreetop(t, n, v)	(setmref((n)->freetop, (v)))
>  #endif
>  
> +/* -- Misc objects -------------------------------------------------------- */

Minor: I suggest not Misc, but Profiler.

> +
> +struct lj_sysprof_topframe {
> +  uint8_t ffid;          /* FFUNC: fast function id. */

Why do we separate this field from the union?

> +  union {
> +    uint64_t raw;        /* Raw value for context save/restore. */
> +    TValue *interp_base; /* LFUNC: Base of the executed coroutine. */
> +    lua_CFunction cf;    /* CFUNC: Address of the C function. */

Nit: please use tabs instead spaces for comments alignment, like it is
done for other structures in this header.

> +  } guesttop;
> +};
> +
>  /* -- State objects ------------------------------------------------------- */
>  
>  /* VM states. */
> @@ -674,6 +685,7 @@ typedef struct global_State {
>    MRef jit_base;	/* Current JIT code L->base or NULL. */
>    MRef ctype_state;	/* Pointer to C type state. */
>    GCRef gcroot[GCROOT_MAX];  /* GC roots. */
> +  struct lj_sysprof_topframe top_frame;  /* Top frame for sysprof */

Nit: I suppose that this structure should be introduced only if sysprof
is enabled. OTOH, I see no reason to hide this structure, so I suggest
to drop it as is for now.

Side note: Also, I'm not sure that this field has addressable offset for
ARM architecture (for DynASM).

Typo: s/for sysprof/for sysprof./

>  } global_State;
>  
>  #define mainthread(g)	(&gcref(g->mainthref)->th)
> diff --git a/src/vm_x64.dasc b/src/vm_x64.dasc
> index 974047d3..c4beb5e7 100644
> --- a/src/vm_x64.dasc
> +++ b/src/vm_x64.dasc
> @@ -345,6 +345,35 @@
>  |  mov dword [DISPATCH+DISPATCH_GL(vmstate)], ~LJ_VMST_..st
>  |.endmacro
>  |
> +|// Stash interpreter's internal base and enter LFUNC VM state.
> +|// PROFILER: Each time profiler sees LFUNC state, it will inspect [BASE-1]

I suppose, that this is not valid for GC64. LuaJIT uses 2-slot frame
info here. The func value is in BASE - 2 slot (see <src/lj_frame.h> for
details). So this part should be adjusted.

| (gdb) f 0
| #0  lj_cf_print (L=0x7ffff7c83378) at src/lib_base.c:486
| 486       ptrdiff_t i, nargs = L->top - L->base;
|
| (gdb) lj-arch
| LJ_64: True, LJ_GC64: True, LJ_DUALNUM: False
| (gdb) p gcval(L->base - 2)->fn.l.ffid
| $11 = 29 '\035' # print
| (gdb) p gcval(L->base - 1)->fn.l.ffid
| $12 = 200 '\310'

Minor: please use well-known special comments instead PROFILER.
XXX [1] looks good here, IMO. Here and below.
| Use XXX in a comment to flag something that is bogus but works.

> +|// expecting to see a valid framelink there. So enter this state only when
> +|// BASE is stable and slots are not moved on the stack.
> +|.macro set_vmstate_lfunc
> +|  set_vmstate INTERP // Guard for non-atomic VM context restoration

Minor: Missed dot in the end of the sentence.
Here and below.

> +|  mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], BASE

BASE stands for 64-bit register.

The error is occured, when build with the following command:

| $ cmake -DCMAKE_BUILD_TYPE=Debug -DLUAJIT_ENABLE_GC64=ON -DLUA_USE_ASSERT=ON -DLUA_USE_APICHECK=ON . && make -j
| ...
| /home/burii/reviews/luajit/sysprof/src/vm_x64.dasc:761: error: mixed operand size in `mov xd,rq':
|   |  set_vmstate_lfunc                  // LFUNC after KBASE restoration.
|   |    mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], BASE       [MACRO set_vmstate_lfunc (0)]

> +|  set_vmstate LFUNC
> +|.endmacro
> +|
> +|// Stash ID of the fast function about to be executed and enter FFUNC VM state.
> +|// PROFILER: Each time profiler sees FFUNC state, it will write ffid
> +|// to the profile stream.
> +|.macro set_vmstate_ffunc
> +|  set_vmstate INTERP // Guard for non-atomic VM context restoration
> +|  mov XCHGd, dword [BASE-8]

XCHGd register is not defined in <vm_x64.dasm>.

| /home/burii/reviews/luajit/sysprof/src/vm_x64.dasc:501: error: bad operand mode in `mov i?,i?':
|  |  mov XCHGd, L:RBa->base
| ...

Also, BASE stands for 64-bit register.
| ...
| /home/burii/reviews/luajit/sysprof/src/vm_x64.dasc:467: error: mixed operand size in `mov xd,rq':
|   |  set_vmstate_cfunc
|   |    mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], BASE       [MACRO set_vmstate_cfunc (0)]

> +|  mov dword [DISPATCH+DISPATCH_GL(top_frame.ffid)], XCHGd
> +|  set_vmstate FFUNC
> +|.endmacro
> +|
> +|// Stash address of the C function about to be executed and enter CFUNC VM state.
> +|// PROFILER: Each time profiler sees CFUNC state, it will write this address
> +|// to the profile stream.
> +|.macro set_vmstate_cfunc
> +|  set_vmstate INTERP // Guard for non-atomic VM context restoration
> +|  mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], BASE

Ditto.

> +|  set_vmstate CFUNC
> +|.endmacro
> +|
>  |// Uses TMPRd (r10d).
>  |.macro save_vmstate
>  |.if not WIN
> @@ -435,7 +464,7 @@ static void build_subroutines(BuildCtx *ctx)

<snipped>

> @@ -467,6 +496,9 @@ static void build_subroutines(BuildCtx *ctx)
>    |  xor eax, eax			// Ok return status for vm_pcall.
>    |
>    |->vm_leave_unw:
> +  |  set_vmstate INTERP // Guard for non-atomic VM context restoration
> +  |  mov XCHGd, L:RBa->base

RBa register is not defined in <vm_x64.dasm>.

> +  |  mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], XCHGd
>    |  // DISPATCH required to set properly.
>    |  restore_vmstate			// Caveat: uses TMPRd (r10d).
>    |  restoreregs
> @@ -725,7 +757,7 @@ static void build_subroutines(BuildCtx *ctx)

<snipped>

> diff --git a/src/vm_x86.dasc b/src/vm_x86.dasc
> index ab8e6f27..222754fe 100644
> --- a/src/vm_x86.dasc
> +++ b/src/vm_x86.dasc
> @@ -443,6 +443,35 @@
>  |  mov dword [DISPATCH+DISPATCH_GL(vmstate)], ~LJ_VMST_..st
>  |.endmacro
>  |
> +|// Stash interpreter's internal base and enter LFUNC VM state.
> +|// PROFILER: Each time profiler sees LFUNC state, it will inspect [BASE-1]
> +|// expecting to see a valid framelink there. So enter this state only when
> +|// BASE is stable and slots are not moved on the stack.
> +|.macro set_vmstate_lfunc
> +|  set_vmstate INTERP // Guard for non-atomic VM context restoration
> +|  mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], BASE
> +|  set_vmstate LFUNC
> +|.endmacro
> +|
> +|// Stash ID of the fast function about to be executed and enter FFUNC VM state.
> +|// PROFILER: Each time profiler sees FFUNC state, it will write ffid
> +|// to the profile stream.
> +|.macro set_vmstate_ffunc
> +|  set_vmstate INTERP // Guard for non-atomic VM context restoration
> +|  mov XCHGd, dword [BASE-8]

This register is defined only for x64 architecture.
The error is occured, when build with the following command:

| $ make CC="gcc -m32" -f Makefile.original -j
| ...
| DYNASM    host/buildvm_arch.h
| vm_x86.dasc:632: error: bad operand mode in `mov i?,x?':
|   |  mov XCHGd, L:RBa->base
| vm_x86.dasc:1544: error: bad operand mode in `mov i?,xd':
|   |.ffunc_1 assert
|   |    mov XCHGd, dword [BASE-8]        [MACRO set_vmstate_ffunc (0)]
| vm_x86.dasc:1572: error: bad operand mode in `mov i?,xd':
|   |.ffunc_1 type
|   |    mov XCHGd, dword [BASE-8]        [MACRO set_vmstate_ffunc (0)]
| vm_x86.dasc:1599: error: bad operand mode in `mov i?,xd':
|   |.ffunc_1 getmetatable
|   |    mov XCHGd, dword [BASE-8]        [MACRO set_vmstate_ffunc (0)]

Side note: Unfortunately it's impossible for now forcify x32 build via
cmake. It requires to proxy CMAKE_C_FLAGS to macro in LuaJITUtils, IINM.

> +|  mov dword [DISPATCH+DISPATCH_GL(top_frame.ffid)], XCHGd

Please clarify the following things:

1) IINM, we save not ffid but the GCref for this function (since we not
   load ffid field from function).
2) Why do we store 64 bytes instead 8? Yes, the struct is not packed and
   there is a hole in it, so it works correct. But it is a little bit
   confusing. Also, why can't we write BASE here too and inspect ffid
   from this base later?

> +|  set_vmstate FFUNC
> +|.endmacro
> +|
> +|// Stash address of the C function about to be executed and enter CFUNC VM state.
> +|// PROFILER: Each time profiler sees CFUNC state, it will write this address
> +|// to the profile stream.
> +|.macro set_vmstate_cfunc
> +|  set_vmstate INTERP // Guard for non-atomic VM context restoration
> +|  mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], BASE
> +|  set_vmstate CFUNC
> +|.endmacro
> +|
>  |// Uses spilled ecx on x86 or XCHGd (r11d) on x64.
>  |.macro save_vmstate
>  |.if not WIN
> @@ -560,7 +589,7 @@ static void build_subroutines(BuildCtx *ctx)

<snipped>

> @@ -599,6 +628,9 @@ static void build_subroutines(BuildCtx *ctx)
>    |  xor eax, eax			// Ok return status for vm_pcall.
>    |
>    |->vm_leave_unw:
> +  |  set_vmstate INTERP // Guard for non-atomic VM context restoration
> +  |  mov XCHGd, L:RBa->base

AFAIKS, there is no garantee, that L:RBa is set up to `lua_State *`.

For example, during vm_unwind_c_eh RB register is set to
`global_State *`.

|->vm_unwind_c_eh:			// Landing pad for external unwinder.
|  mov L:DISPATCH, SAVE_L
|  mov GL:RB, L:DISPATCH->glref
|  mov dword GL:RB->cur_L, L:DISPATCH
|  mov dword GL:RB->vmstate, ~LJ_VMST_CFUNC
|  mov DISPATCH, L:DISPATCH->glref	// Setup pointer to dispatch table.
|  add DISPATCH, GG_G2DISP
|  jmp ->vm_leave_unw

> +  |  mov dword [DISPATCH+DISPATCH_GL(top_frame.guesttop)], XCHGd
>    |  // DISPATCH required to set properly.
>    |  restore_vmstate			// Caveat: on x64 uses XCHGd (r11d).
>    |  restoreregs
> @@ -934,7 +966,7 @@ static void build_subroutines(BuildCtx *ctx)

<snipped>

> -- 
> 2.33.0
> 

[1]: https://www.oracle.com/java/technologies/javase/codeconventions-programmingpractices.html

-- 
Best regards,
Sergey Kaplun


More information about the Tarantool-patches mailing list