From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from [87.239.111.99] (localhost [127.0.0.1]) by dev.tarantool.org (Postfix) with ESMTP id DC60F141405; Wed, 30 Nov 2022 19:04:39 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org DC60F141405 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev; t=1669824280; bh=WXG8FmqRIE99lwTIjZIaFvNORr9m0QVp/itDwWPy5nQ=; h=In-Reply-To:Date:References:To:Subject:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=NMWS/w2S15Ao9QJoa/fEUH7c1QrlV4vLDhbFzDgYlbJQN5tUSXN3/BugFBN3ec6cR 0x+ZBCRId9J5HHI7aAAU4BqqZGLSpo7Kb9dKO/xIiIAVeEh2SPk1CJ9tGBi/j+vHMO wyceIhGcB/WFIvUbBW/GkSmRh4th8cj3dp6TGWRY= Received: from smtp60.i.mail.ru (smtp60.i.mail.ru [217.69.128.40]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id BB54713DA01 for ; Wed, 30 Nov 2022 19:04:38 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org BB54713DA01 Received: by smtp60.i.mail.ru with esmtpa (envelope-from ) id 1p0PZd-0003xA-RQ; Wed, 30 Nov 2022 19:04:38 +0300 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3731.200.110.1.12\)) In-Reply-To: <20221111101948.6773-1-m.kokryashkin@tarantool.org> Date: Wed, 30 Nov 2022 19:04:26 +0300 Content-Transfer-Encoding: quoted-printable Message-Id: <4FA392BA-CA1E-4C1B-8491-50DC91203F49@tarantool.org> References: <20221111101948.6773-1-m.kokryashkin@tarantool.org> To: Maxim Kokryashkin X-Mailer: Apple Mail (2.3731.200.110.1.12) X-Mailru-Src: smtp X-7564579A: 646B95376F6C166E X-77F55803: 4F1203BC0FB41BD9F70D9A593BA47E2A487291DD0DF16E345853D31D69E89B51182A05F5380850401DE766E45E24B6A435CE14F3181E95F0F51B339263C8770F8BBE95D8E318BFC3 X-C8649E89: 4E36BF7865823D7055A7F0CF078B5EC49A30900B95165D34FF70CCB59A6ED870F2C9838FDFFE41FE76C0D9EC4D1E9BD320A99B76531CC7B06F0CF350A9D80CF71D7E09C32AA3244C0A5F7061F292C4E5EFAEBE2D5672EA103FD9C8CA1B0515E0927AC6DF5659F194 X-D57D3AED: 3ZO7eAau8CL7WIMRKs4sN3D3tLDjz0dLbV79QFUyzQ2Ujvy7cMT6pYYqY16iZVKkSc3dCLJ7zSJH7+u4VD18S7Vl4ZUrpaVfd2+vE6kuoey4m4VkSEu530nj6fImhcD4MUrOEAnl0W826KZ9Q+tr5ycPtXkTV4k65bRjmOUUP8cvGozZ33TWg5HZplvhhXbhDGzqmQDTd6OAevLeAnq3Ra9uf7zvY2zzsIhlcp/Y7m53TZgf2aB4JOg4gkr2biojWg9qwXN3O9WwQLnkOfTT3Q== X-Mailru-Sender: 11C2EC085EDE56FA38FD4C59F7EFE407C041A2572758CA7A134491201D70AEF6D775E2315FC03E4019381EE24192DF5555834048F03EF5D4C9A814A92B2E3B1BA4250FC3964EA4964198E0F3ECE9B5443453F38A29522196 X-Mras: OK Subject: Re: [Tarantool-patches] [PATCH luajit] GC64: enable sysprof support X-BeenThere: tarantool-patches@dev.tarantool.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: sergos via Tarantool-patches Reply-To: sergos Cc: tarantool-patches@dev.tarantool.org Errors-To: tarantool-patches-bounces@dev.tarantool.org Sender: "Tarantool-patches" Hi! Thanks for the patch! LGTM. Sergos > On 11 Nov 2022, at 13:19, Maxim Kokryashkin = wrote: >=20 > This patch reverts changes introduced via commit > 71e7020637bb15024ceb7e93ce66b61870567339 ('GC64: disable > sysprof support'), fixing the incorrect behavior in vm > by adding the `LJ_HASSYSPROF` flag to the DynASM flags. >=20 > Resolves tarantool/tarantool#7919 > Related to tarantool/tarantool#781 > --- > Issue: https://github.com/tarantool/tarantool/issues/7919 > Branch: = https://github.com/tarantool/luajit/tree/fckxorg/gh-7919-sysprof-gc64 > PR: https://github.com/tarantool/tarantool/pull/7922 >=20 > cmake/SetDynASMFlags.cmake | 5 +++ > src/lj_arch.h | 2 +- > src/lj_obj.h | 2 + > src/vm_x64.dasc | 42 ++++++++++++++----- > src/vm_x86.dasc | 8 ++++ > .../misclib-sysprof-capi.test.lua | 7 ++-- > .../misclib-sysprof-lapi.test.lua | 5 +-- > 7 files changed, 52 insertions(+), 19 deletions(-) >=20 > diff --git a/cmake/SetDynASMFlags.cmake b/cmake/SetDynASMFlags.cmake > index 8e10758f..142d7e64 100644 > --- a/cmake/SetDynASMFlags.cmake > +++ b/cmake/SetDynASMFlags.cmake > @@ -113,6 +113,11 @@ if(NOT FOUND EQUAL -1) > AppendFlags(HOST_C_FLAGS -DLUAJIT_NO_UNWIND) > endif() >=20 > +string(FIND "${TESTARCH}" "LJ_HASSYSPROF 1" FOUND) > +if(NOT FOUND EQUAL -1) > + list(APPEND DYNASM_FLAGS -D LJ_HASSYSPROF) > +endif() > + > string(REGEX MATCH "LJ_ARCH_VERSION ([0-9]+)" LUAJIT_ARCH_VERSION = ${TESTARCH}) > list(APPEND DYNASM_FLAGS -D VER=3D${CMAKE_MATCH_1}) >=20 > diff --git a/src/lj_arch.h b/src/lj_arch.h > index 40129d9e..cd02fee1 100644 > --- a/src/lj_arch.h > +++ b/src/lj_arch.h > @@ -595,7 +595,7 @@ > #endif >=20 > /* Disable or enable the platform and Lua profiler. */ > -#if defined(LUAJIT_DISABLE_SYSPROF) || defined(LJ_ARCH_NOSYSPROF) || = !LJ_TARGET_LINUX || LJ_GC64 > +#if defined(LUAJIT_DISABLE_SYSPROF) || defined(LJ_ARCH_NOSYSPROF) || = !LJ_TARGET_LINUX > #define LJ_HASSYSPROF 0 > #else > #define LJ_HASSYSPROF 1 > diff --git a/src/lj_obj.h b/src/lj_obj.h > index d1451c3a..45507e0d 100644 > --- a/src/lj_obj.h > +++ b/src/lj_obj.h > @@ -674,7 +674,9 @@ typedef struct global_State { > MRef jit_base; /* Current JIT code L->base or NULL. */ > MRef ctype_state; /* Pointer to C type state. */ > GCRef gcroot[GCROOT_MAX]; /* GC roots. */ > +#ifdef LJ_HASSYSPROF > TValue *top_frame; /* Top frame for sysprof. */ > +#endif > } global_State; >=20 > #define mainthread(g) (&gcref(g->mainthref)->th) > diff --git a/src/vm_x64.dasc b/src/vm_x64.dasc > index 974047d3..59f117ba 100644 > --- a/src/vm_x64.dasc > +++ b/src/vm_x64.dasc > @@ -345,6 +345,22 @@ > | mov dword [DISPATCH+DISPATCH_GL(vmstate)], ~LJ_VMST_..st > |.endmacro > | > +|// Stash interpreter's internal base and set current VM state. > +|// XXX: Each time profiler sees LFUNC, CFUNC or FFUNC state, it = tries > +|// to dump Lua calling stack, so it needs a stack base pointer. > +|// If the sampling signal arriving during the execution of the VM = code, > +|// base pointer stored in the current lua_State can be irrelevant, = as > +|// it syncs with the BASE register only when the control is passed = to > +|// user code. So we need to sync the BASE on each vmstate change to > +|// keep it consistent. > +|.macro set_vmstate_sync_base, st > +|.if LJ_HASSYSPROF > +| set_vmstate INTERP // Guard for non-atomic VM context restoration > +| mov qword [DISPATCH+DISPATCH_GL(top_frame)], BASE > +|.endif > +| set_vmstate st > +|.endmacro > +| > |// Uses TMPRd (r10d). > |.macro save_vmstate > |.if not WIN > @@ -356,6 +372,12 @@ > |// Uses r10d. > |.macro restore_vmstate > |.if not WIN > +|.if LJ_HASSYSPROF > +| set_vmstate INTERP > +| mov TMPR, SAVE_L > +| mov TMPR, L:TMPR->base > +| mov qword [DISPATCH+DISPATCH_GL(top_frame)], TMPR > +|.endif > | mov TMPRd, SAVE_VMSTATE > | mov dword [DISPATCH+DISPATCH_GL(vmstate)], TMPRd > |.endif // WIN > @@ -435,7 +457,7 @@ static void build_subroutines(BuildCtx *ctx) > | jnz ->vm_returnp > | > | // Return to C. > - | set_vmstate CFUNC > + | set_vmstate_sync_base CFUNC > | and PC, -8 > | sub PC, BASE > | neg PC // Previous base =3D BASE - = delta. > @@ -725,7 +747,7 @@ static void build_subroutines(BuildCtx *ctx) > | cleartp LFUNC:KBASE > | mov KBASE, LFUNC:KBASE->pc > | mov KBASE, [KBASE+PC2PROTO(k)] > - | set_vmstate LFUNC // LFUNC after KBASE = restoration. > + | set_vmstate_sync_base LFUNC // LFUNC after KBASE = restoration. > | // BASE =3D base, RC =3D result, RB =3D meta base > | jmp RA // Jump to continuation. > | > @@ -1166,7 +1188,7 @@ static void build_subroutines(BuildCtx *ctx) > | > |.macro .ffunc, name > |->ff_ .. name: > - | set_vmstate FFUNC > + | set_vmstate_sync_base FFUNC > |.endmacro > | > |.macro .ffunc_1, name > @@ -1748,7 +1770,7 @@ static void build_subroutines(BuildCtx *ctx) > | movzx RAd, PC_RA > | neg RA > | lea BASE, [BASE+RA*8-16] // base =3D base - (RA+2)*8 > - | set_vmstate LFUNC // LFUNC state after = BASE restoration. > + | set_vmstate_sync_base LFUNC // LFUNC state after BASE = restoration. > | ins_next > | > |6: // Fill up results with nil. > @@ -2513,7 +2535,7 @@ static void build_subroutines(BuildCtx *ctx) > | mov KBASE, [KBASE+PC2PROTO(k)] > | mov L:RB->base, BASE > | mov qword [DISPATCH+DISPATCH_GL(jit_base)], 0 > - | set_vmstate LFUNC // LFUNC after BASE & = KBASE restoration. > + | set_vmstate_sync_base LFUNC // LFUNC after BASE & KBASE = restoration. > | // Modified copy of ins_next which handles function header = dispatch, too. > | mov RCd, [PC] > | movzx RAd, RCH > @@ -2730,7 +2752,7 @@ static void build_subroutines(BuildCtx *ctx) > | call extern lj_ccallback_enter // (CTState *cts, void *cf) > | // lua_State * returned in eax (RD). > | mov BASE, L:RD->base > - | set_vmstate LFUNC // LFUNC after BASE = restoration. > + | set_vmstate_sync_base LFUNC // LFUNC after BASE restoration. > | mov RD, L:RD->top > | sub RD, BASE > | mov LFUNC:RB, [BASE-16] > @@ -4299,7 +4321,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, = int defop) > | mov KBASE, LFUNC:KBASE->pc > | mov KBASE, [KBASE+PC2PROTO(k)] > | // LFUNC after the old BASE & KBASE is restored. > - | set_vmstate LFUNC > + | set_vmstate_sync_base LFUNC > | ins_next > | > |6: // Fill up results with nil. > @@ -4591,7 +4613,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, = int defop) > | ins_AD // BASE =3D new base, RA =3D framesize, RD =3D nargs+1 > | mov KBASE, [PC-4+PC2PROTO(k)] > | mov L:RB, SAVE_L > - | set_vmstate LFUNC // LFUNC after KBASE = restoration. > + | set_vmstate_sync_base LFUNC // LFUNC after KBASE = restoration. > | lea RA, [BASE+RA*8] // Top of frame. > | cmp RA, L:RB->maxstack > | ja ->vm_growstack_f > @@ -4629,7 +4651,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, = int defop) > | mov [RD-8], RB // Store delta + FRAME_VARG. > | mov [RD-16], LFUNC:KBASE // Store copy of LFUNC. > | mov L:RB, SAVE_L > - | set_vmstate LFUNC // LFUNC after KBASE = restoration. > + | set_vmstate_sync_base LFUNC // LFUNC after KBASE = restoration. > | lea RA, [RD+RA*8] > | cmp RA, L:RB->maxstack > | ja ->vm_growstack_v // Need to grow stack. > @@ -4685,7 +4707,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, = int defop) > | mov CARG1, L:RB // Caveat: CARG1 may be RA. > } > | ja ->vm_growstack_c // Need to grow stack. > - | set_vmstate CFUNC // CFUNC before entering C = function. > + | set_vmstate_sync_base CFUNC // CFUNC before entering C = function. > if (op =3D=3D BC_FUNCC) { > | call KBASE // (lua_State *L) > } else { > diff --git a/src/vm_x86.dasc b/src/vm_x86.dasc > index 22f69edf..f7ffe5d2 100644 > --- a/src/vm_x86.dasc > +++ b/src/vm_x86.dasc > @@ -452,8 +452,10 @@ > |// user code. So we need to sync the BASE on each vmstate change to > |// keep it consistent. > |.macro set_vmstate_sync_base, st > +|.if LJ_HASSYSPROF > | set_vmstate INTERP // Guard for non-atomic VM context restoration > | mov dword [DISPATCH+DISPATCH_GL(top_frame)], BASE > +|.endif > | set_vmstate st > |.endmacro > | > @@ -475,19 +477,25 @@ > |// Uses spilled ecx on x86 or XCHGd (r11d) on x64. > |.macro restore_vmstate > |.if not WIN > +|.if LJ_HASSYSPROF > | set_vmstate INTERP > +|.endif > |.if not X64 > | mov SPILLECX, ecx > +|.if LJ_HASSYSPROF > | mov ecx, SAVE_L > | mov ecx, L:ecx->base > | mov dword [DISPATCH+DISPATCH_GL(top_frame)], ecx > +|.endif > | mov ecx, SAVE_VMSTATE > | mov dword [DISPATCH+DISPATCH_GL(vmstate)], ecx > | mov ecx, SPILLECX > |.else // X64 > +|.if LJ_HASSYSPROF > | mov XCHGd, SAVE_L > | mov XCHGd, L:XCHGd->base > | mov dword [DISPATCH+DISPATCH_GL(top_frame)], XCHGd > +|.endif > | mov XCHGd, SAVE_VMSTATE > | mov dword [DISPATCH+DISPATCH_GL(vmstate)], XCHGd > |.endif // X64 > diff --git a/test/tarantool-tests/misclib-sysprof-capi.test.lua = b/test/tarantool-tests/misclib-sysprof-capi.test.lua > index afb99cf2..dad0fe4a 100644 > --- a/test/tarantool-tests/misclib-sysprof-capi.test.lua > +++ b/test/tarantool-tests/misclib-sysprof-capi.test.lua > @@ -1,8 +1,7 @@ > -- Sysprof is implemented for x86 and x64 architectures only. > -local ffi =3D require("ffi") > -require("utils").skipcond( > - jit.arch ~=3D "x86" and jit.arch ~=3D "x64" or jit.os ~=3D "Linux" > - or ffi.abi("gc64"), > +local utils =3D require("utils") > +utils.skipcond( > + jit.arch ~=3D "x86" and jit.arch ~=3D "x64" or jit.os ~=3D "Linux", > jit.arch.." architecture or "..jit.os.. > " OS is NIY for sysprof" > ) > diff --git a/test/tarantool-tests/misclib-sysprof-lapi.test.lua = b/test/tarantool-tests/misclib-sysprof-lapi.test.lua > index dbff41b0..4bf10e8d 100644 > --- a/test/tarantool-tests/misclib-sysprof-lapi.test.lua > +++ b/test/tarantool-tests/misclib-sysprof-lapi.test.lua > @@ -1,10 +1,7 @@ > -- Sysprof is implemented for x86 and x64 architectures only. > -local ffi =3D require("ffi") > local utils =3D require("utils") > - > utils.skipcond( > - jit.arch ~=3D "x86" and jit.arch ~=3D "x64" or jit.os ~=3D "Linux" > - or ffi.abi("gc64"), > + jit.arch ~=3D "x86" and jit.arch ~=3D "x64" or jit.os ~=3D "Linux", > jit.arch.." architecture or "..jit.os.. > " OS is NIY for sysprof" > ) > --=20 > 2.38.0 >=20