From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <9aec4ca0-09c0-2692-81e6-5d391f06fd7b@tarantool.org>
Date: Thu, 13 Jul 2023 11:12:43 +0300
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0
To: Maxim Kokryashkin, tarantool-patches@dev.tarantool.org, skaplun@tarantool.org
References: <20230710122818.22221-1-m.kokryashkin@tarantool.org>
Content-Language: en-US
In-Reply-To: <20230710122818.22221-1-m.kokryashkin@tarantool.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Tarantool-patches] [PATCH luajit v3] sysprof: fix crash during FFUNC stream
X-BeenThere: tarantool-patches@dev.tarantool.org
X-Mailman-Version:
2.1.34
Precedence: list
List-Id: Tarantool development patches
From: Sergey Bronnikov via Tarantool-patches
Reply-To: Sergey Bronnikov
Errors-To: tarantool-patches-bounces@dev.tarantool.org
Sender: "Tarantool-patches"

Hi, Max!

Thanks for the patch!

The test passes with the patch reverted, so it is useless as a
regression test.

If I understand correctly, there is a non-zero chance of catching an
inconsistency in the VM state in the future, so I propose enabling
sysprof in our fuzzing test [1] with a high sampling interval and
checking that the profiler works on generated Lua programs.
What do you think?

Also, see my comments inline.

1. https://github.com/ligurio/tarantool/commit/b8bc3293da66ded41383faf5ea913a2554d987b8

Sergey

On 7/10/23 15:28, Maxim Kokryashkin wrote:
> Sometimes, the Lua stack can be inconsistent during
> the FFUNC execution, which may lead to a sysprof
> crash during the stack unwinding.
>
> This patch replaces the `top_frame` property of `global_State`
> with `lj_sysprof_topframe` structure, which contains `top_frame`
> and `ffid` properties. `ffid` property makes sense only when the
> LuaJIT VM state is set to `FFUNC`. That property is set to the
> ffid of the fast function that VM is about to execute.
> In the same time, `top_frame` property is not updated now, so
> the top frame of the Lua stack can be streamed based on the ffid,
> and the rest of the Lua stack can be streamed as usual.
>
> Also, this patch fixes build with plain makefile, by adding
> the `LJ_HASSYSPROF` flag support to it.
>
> Resolves tarantool/tarantool#8594
> ---
> Changes in v3:
> - Fixed comments as per review by Sergey
>
> Branch: https://github.com/tarantool/luajit/tree/fckxorg/gh-8594-sysprof-ffunc-crash
> PR: https://github.com/tarantool/tarantool/pull/8737
>  src/Makefile.original                         |  3 ++
>  src/lj_obj.h                                  |  7 +++-
>  src/lj_sysprof.c                              | 26 ++++++++++++---
>  src/vm_x64.dasc                               | 22 +++++++++++--
>  src/vm_x86.dasc                               | 31 ++++++++++++++---
>  .../gh-8594-sysprof-ffunc-crash.test.lua      | 33 +++++++++++++++++++
>  6 files changed, 109 insertions(+), 13 deletions(-)
>  create mode 100644 test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua
>
> diff --git a/src/Makefile.original b/src/Makefile.original
> index aedaaa73..e21a0e56 100644
> --- a/src/Makefile.original
> +++ b/src/Makefile.original
> @@ -441,6 +441,9 @@ ifneq (,$(findstring LJ_NO_UNWIND 1,$(TARGET_TESTARCH)))
>    DASM_AFLAGS+= -D NO_UNWIND
>    TARGET_ARCH+= -DLUAJIT_NO_UNWIND
>  endif
> +ifneq (,$(findstring LJ_HASSYSPROF 1,$(TARGET_TESTARCH)))
> +  DASM_AFLAGS+= -D LJ_HASSYSPROF
> +endif
>  DASM_AFLAGS+= -D VER=$(subst LJ_ARCH_VERSION_,,$(filter LJ_ARCH_VERSION_%,$(subst LJ_ARCH_VERSION ,LJ_ARCH_VERSION_,$(TARGET_TESTARCH))))
>  ifeq (Windows,$(TARGET_SYS))
>    DASM_AFLAGS+= -D WIN
> diff --git a/src/lj_obj.h b/src/lj_obj.h
> index 45507e0d..e17316df 100644
> --- a/src/lj_obj.h
> +++ b/src/lj_obj.h
> @@ -598,6 +598,11 @@ enum {
>    GCSmax
>  };
>
> +struct lj_sysprof_topframe {
> +  uint8_t ffid; /* FFID of the fast function VM is about to execute. */
> +  TValue *top_frame; /* Top frame for sysprof. */
> +};
> +
>  typedef struct GCState {
>    GCSize total; /* Memory currently allocated. */
>    GCSize threshold; /* Memory threshold. */
> @@ -675,7 +680,7 @@ typedef struct global_State {
>    MRef ctype_state; /* Pointer to C type state. */
>    GCRef gcroot[GCROOT_MAX]; /* GC roots. */
>  #ifdef LJ_HASSYSPROF
> -  TValue *top_frame; /* Top frame for sysprof. */
> +  struct lj_sysprof_topframe top_frame_info; /* Top frame info for sysprof. */
>  #endif
>  } global_State;
>
> diff --git a/src/lj_sysprof.c b/src/lj_sysprof.c
> index 2e9ed9b3..0a341e16 100644
> --- a/src/lj_sysprof.c
> +++ b/src/lj_sysprof.c
> @@ -109,6 +109,12 @@ static void stream_epilogue(struct sysprof *sp)
>    lj_wbuf_addbyte(&sp->out, LJP_EPILOGUE_BYTE);
>  }
>
> +static void stream_ffunc_impl(struct lj_wbuf *buf, uint8_t ffid)
> +{
> +  lj_wbuf_addbyte(buf, LJP_FRAME_FFUNC);
> +  lj_wbuf_addu64(buf, ffid);
> +}
> +
>  static void stream_lfunc(struct lj_wbuf *buf, const GCfunc *func)
>  {
>    lua_assert(isluafunc(func));
> @@ -129,8 +135,7 @@ static void stream_cfunc(struct lj_wbuf *buf, const GCfunc *func)
>  static void stream_ffunc(struct lj_wbuf *buf, const GCfunc *func)
>  {
>    lua_assert(isffunc(func));
> -  lj_wbuf_addbyte(buf, LJP_FRAME_FFUNC);
> -  lj_wbuf_addu64(buf, func->c.ffid);
> +  stream_ffunc_impl(buf, func->c.ffid);
>  }
>
>  static void stream_frame_lua(struct lj_wbuf *buf, const cTValue *frame)
> @@ -148,7 +153,7 @@ static void stream_frame_lua(struct lj_wbuf *buf, const cTValue *frame)
>      lua_assert(0);
>  }
>
> -static void stream_backtrace_lua(struct sysprof *sp)
> +static void stream_backtrace_lua(struct sysprof *sp, uint32_t vmstate)
>  {
>    global_State *g = sp->g;
>    struct lj_wbuf *buf = &sp->out;
> @@ -158,8 +163,19 @@ static void stream_backtrace_lua(struct sysprof *sp)
>    lua_assert(g != NULL);
>    L = gco2th(gcref(g->cur_L));
>    lua_assert(L != NULL);
> +  /*
> +  ** Lua stack may be inconsistent during the execution of a
> +  ** fast-function, so instead of updating the `top_frame` for
> +  ** it, its `ffid` is set instead. The first frame on the
> +  ** result stack is streamed manually, and the rest of the
> +  ** stack is streamed based on the previous `top_frame` value.
> +  */
> +  if (vmstate == LJ_VMST_FFUNC) {
> +    uint8_t ffid = g->top_frame_info.ffid;
> +    stream_ffunc_impl(buf, ffid);
> +  }
>
> -  top_frame = g->top_frame - 1; //(1 + LJ_FR2)
> +  top_frame = g->top_frame_info.top_frame - 1;
>
>    bot = tvref(L->stack) + LJ_FR2;
>    /* Traverse frames backwards */
> @@ -234,7 +250,7 @@ static void stream_trace(struct sysprof *sp, uint32_t vmstate)
>  static void stream_guest(struct sysprof *sp, uint32_t vmstate)
>  {
>    lj_wbuf_addbyte(&sp->out, (uint8_t)vmstate);
> -  stream_backtrace_lua(sp);
> +  stream_backtrace_lua(sp, vmstate);
>    stream_backtrace_host(sp);
>  }
>
> diff --git a/src/vm_x64.dasc b/src/vm_x64.dasc
> index 7b04b928..2a0c3f03 100644
> --- a/src/vm_x64.dasc
> +++ b/src/vm_x64.dasc
> @@ -53,6 +53,7 @@
>  |.define RDL, RCL
>  |.define TMPR, r10
>  |.define TMPRd, r10d
> +|.define TMPRb, r10b
>  |.define ITYPE, r11
>  |.define ITYPEd, r11d
>  |
> @@ -353,14 +354,29 @@
>  |// it syncs with the BASE register only when the control is passed to
>  |// user code. So we need to sync the BASE on each vmstate change to
>  |// keep it consistent.
> +|// The only exception are FFUNCs because sometimes even internal BASE
> +|// stash is inconsistent for them. To address that issue, their ffid
> +|// is stashed instead, so the corresponding frame can be streamed
> +|// manually.
>  |.macro set_vmstate_sync_base, st
>  |.if LJ_HASSYSPROF
>  |  set_vmstate INTERP // Guard for non-atomic VM context restoration
> -|  mov qword [DISPATCH+DISPATCH_GL(top_frame)], BASE
> +|  mov qword [DISPATCH+DISPATCH_GL(top_frame_info.top_frame)], BASE
>  |.endif
>  |  set_vmstate st
>  |.endmacro
>  |
> +|.macro set_vmstate_ffunc
> +|.if LJ_HASSYSPROF
> +|  set_vmstate INTERP
> +|  mov TMPR, [BASE - 16]
> +|  cleartp LFUNC:TMPR // Obtain plain address value. Equivalent of `gcval`.
> +|  mov TMPRb, LFUNC:TMPR->ffid
> +|  mov byte [DISPATCH+DISPATCH_GL(top_frame_info.ffid)], TMPRb
> +|.endif
> +|  set_vmstate FFUNC
> +|.endmacro
> +|
>  |// Uses TMPRd (r10d).
>  |.macro save_vmstate
>  |.if not WIN
> @@ -376,7 +392,7 @@
>  |  set_vmstate INTERP
>  |  mov TMPR, SAVE_L
>  |  mov TMPR, L:TMPR->base
> -|  mov qword [DISPATCH+DISPATCH_GL(top_frame)], TMPR
> +|  mov qword [DISPATCH+DISPATCH_GL(top_frame_info.top_frame)], TMPR
>  |.endif
>  |  mov TMPRd, SAVE_VMSTATE
>  |  mov dword [DISPATCH+DISPATCH_GL(vmstate)], TMPRd
> @@ -1188,7 +1204,7 @@ static void build_subroutines(BuildCtx *ctx)
>    |
>    |.macro .ffunc, name
>    |->ff_ .. name:
> -  |  set_vmstate_sync_base FFUNC
> +  |  set_vmstate_ffunc
>    |.endmacro
>    |
>    |.macro .ffunc_1, name
> diff --git a/src/vm_x86.dasc b/src/vm_x86.dasc
> index bd1e940e..ff388d58 100644
> --- a/src/vm_x86.dasc
> +++ b/src/vm_x86.dasc
> @@ -101,6 +101,7 @@
>  |.define FCARG2, CARG2d
>  |
>  |.define XCHGd, r11d // TMP on x64, used for exchange.
> +|.define XCHGb, r11b // TMPRb on x64, used for exchange.
>  |.endif
>  |
>  |// Type definitions. Some of these are only used for documentation.
> @@ -451,14 +452,36 @@
>  |// it syncs with the BASE register only when the control is passed to
>  |// user code. So we need to sync the BASE on each vmstate change to
>  |// keep it consistent.
> +|// The only exception are FFUNCs because sometimes even internal BASE
> +|// stash is inconsistent for them. To address that issue, their ffid
> +|// is stashed instead, so the corresponding frame can be streamed
> +|// manually.
>  |.macro set_vmstate_sync_base, st
>  |.if LJ_HASSYSPROF
>  |  set_vmstate INTERP // Guard for non-atomic VM context restoration
> -|  mov dword [DISPATCH+DISPATCH_GL(top_frame)], BASE
> +|  mov dword [DISPATCH+DISPATCH_GL(top_frame_info.top_frame)], BASE
>  |.endif
>  |  set_vmstate st
>  |.endmacro
>  |
> +|.macro set_vmstate_ffunc
> +|.if LJ_HASSYSPROF
> +|  set_vmstate INTERP
> +|.if not X64
> +|  mov SPILLECX, ecx
> +|  mov LFUNC:ecx, [BASE - 4]
> +|  mov cl, LFUNC:ecx->ffid
> +|  mov byte [DISPATCH+DISPATCH_GL(top_frame_info.ffid)], cl
> +|  mov ecx, SPILLECX
> +|.else // X64
> +|  mov LFUNC:XCHGd, [BASE - 8]
> +|  mov XCHGb, LFUNC:XCHGd->ffid
> +|  mov byte [DISPATCH+DISPATCH_GL(top_frame_info.ffid)], XCHGb
> +|.endif // X64
> +|.endif // LJ_HASSYSPROF
> +|  set_vmstate FFUNC
> +|.endmacro
> +|
>  |// Uses spilled ecx on x86 or XCHGd (r11d) on x64.
>  |.macro save_vmstate
>  |.if not WIN
> @@ -485,7 +508,7 @@
>  |.if LJ_HASSYSPROF
>  |  mov ecx, SAVE_L
>  |  mov ecx, L:ecx->base
> -|  mov dword [DISPATCH+DISPATCH_GL(top_frame)], ecx
> +|  mov dword [DISPATCH+DISPATCH_GL(top_frame_info.top_frame)], ecx
>  |.endif
>  |  mov ecx, SAVE_VMSTATE
>  |  mov dword [DISPATCH+DISPATCH_GL(vmstate)], ecx
> @@ -494,7 +517,7 @@
>  |.if LJ_HASSYSPROF
>  |  mov XCHGd, SAVE_L
>  |  mov XCHGd, L:XCHGd->base
> -|  mov dword [DISPATCH+DISPATCH_GL(top_frame)], XCHGd
> +|  mov dword [DISPATCH+DISPATCH_GL(top_frame_info.top_frame)], XCHGd
>  |.endif
>  |  mov XCHGd, SAVE_VMSTATE
>  |  mov dword [DISPATCH+DISPATCH_GL(vmstate)], XCHGd
> @@ -1488,7 +1511,7 @@ static void build_subroutines(BuildCtx *ctx)
>    |
>    |.macro .ffunc, name
>    |->ff_ .. name:
> -  |  set_vmstate_sync_base FFUNC
> +  |  set_vmstate_ffunc
>    |.endmacro
>    |
>    |.macro .ffunc_1, name
> diff --git a/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua b/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua
> new file mode 100644
> index 00000000..e5cdeb07
> --- /dev/null
> +++ b/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua
> @@ -0,0 +1,33 @@
> +local tap = require('tap')
> +local test = tap.test('gh-8594-sysprof-ffunc-crash'):skipcond({
> +  ['Sysprof is implemented for x86_64 only'] = jit.arch ~= 'x86' and
> +                                               jit.arch ~= 'x64',
> +  ['Sysprof is implemented for Linux only'] = jit.os ~= 'Linux',
> +})
> +
> +test:plan(1)
> +
> +jit.off()
> +-- XXX: Run JIT tuning functions in a safe frame to avoid errors
> +-- thrown when LuaJIT is compiled with JIT engine disabled.
> +pcall(jit.flush)
> +
> +local TMP_BINFILE = '/dev/null'
> +
> +local res, err = misc.sysprof.start{
> +  mode = 'C',

The reproducer in the bug description uses "L" mode. Why do you use
"C" mode here?

> +  interval = 3,
> +  path = TMP_BINFILE,
> +}
> +assert(res, err)
> +
> +for i = 1, 1e5 do
> +  tostring(i)

Please add a comment saying that "tostring" is used here because it is
a fast function. (It may be obvious to you, but it will be clearer
for those who read the test later.)

> +end
> +
> +res, err = misc.sysprof.stop()
> +assert(res, err)
> +
> +test:ok(true, 'sysprof finished successfully')
> +
> +os.exit(test:check() and 0 or 1)
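
For readers following the thread, the streaming order described in the
commit message can be sketched as a small standalone C model. This is
only an illustration under stated assumptions, not the patch's code:
every `model_*` / `MODEL_*` name is hypothetical, frames are reduced to
plain numeric ids, and `top_frame` is an index into an array rather
than a `TValue *`. It shows the idea that, in the FFUNC state, the top
frame is emitted from the stashed `ffid` and the rest of the stack is
walked from the last consistent `top_frame`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame kinds; the real profiler uses LJP_FRAME_* bytes. */
enum model_frame_kind { MODEL_FRAME_LFUNC, MODEL_FRAME_FFUNC };

/* One streamed frame in the model. */
struct model_event {
  enum model_frame_kind kind;
  uint64_t id;
};

/*
** Simplified counterpart of `struct lj_sysprof_topframe` from the patch.
** Assumption: `top_frame` is the number of consistent frames in the
** model stack, not a pointer into the Lua stack.
*/
struct model_topframe {
  uint8_t ffid;     /* Stashed id of the fast function being executed. */
  size_t top_frame; /* Count of frames known to be consistent. */
};

/*
** Streams the model backtrace into `out` (at most `cap` events) and
** returns the number of events written. If the VM is in the FFUNC
** state, the fast function frame is emitted first from the stashed
** ffid; the remaining frames are walked from the top down.
*/
size_t model_stream_backtrace(const struct model_topframe *tf,
                              int vmstate_is_ffunc,
                              const uint64_t *stack,
                              struct model_event *out, size_t cap)
{
  size_t n = 0;
  assert(tf != NULL && stack != NULL && out != NULL);
  if (vmstate_is_ffunc && n < cap) {
    out[n].kind = MODEL_FRAME_FFUNC;
    out[n].id = tf->ffid;
    n++;
  }
  /* Traverse the consistent part of the stack backwards. */
  for (size_t i = tf->top_frame; i > 0 && n < cap; i--) {
    out[n].kind = MODEL_FRAME_LFUNC;
    out[n].id = stack[i - 1];
    n++;
  }
  return n;
}
```

In this model, a stack holding the ids 10 and 20 with a stashed ffid of
7 yields the FFUNC frame first, then 20, then 10. Outside the FFUNC
state only the two Lua frames are produced, which matches the "rest of
the stack is streamed as usual" wording of the commit message.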