From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from [87.239.111.99] (localhost [127.0.0.1]) by dev.tarantool.org (Postfix) with ESMTP id 5DB0F51E3E4; Mon, 10 Jul 2023 15:28:25 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 5DB0F51E3E4 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev; t=1688992105; bh=MHTW5b/g7gsj/e87nCx1VsQQGXFf+0XU93d4QTjQPRg=; h=To:Date:Subject:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=seX0HJpi+L/PEcrmr9K2SnyPtGHmLGhFgp/oyUbMZy2ei1JdNiAcISal6Ydkx8RGk W53i2gy4gevF/fypEd352aEYVzNm1RzN4/18A6hPX7HYL2wyWcWyOyK5aoGpO2KlIz TZ3rQ/awx0+A0mcuU31SOFoSeynbmga4SXN0Mvm4= Received: from mail-lf1-f54.google.com (mail-lf1-f54.google.com [209.85.167.54]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id E7E5B518636 for ; Mon, 10 Jul 2023 15:28:23 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org E7E5B518636 Received: by mail-lf1-f54.google.com with SMTP id 2adb3069b0e04-4fb73ba3b5dso6900981e87.1 for ; Mon, 10 Jul 2023 05:28:23 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688992103; x=1691584103; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=Og7jquWfw/gqEx4R3HIYIbIZlWs9VjmM+AaxdsDRZ24=; b=gKxIP6HU0wepRL3z7eI5TssMypNvesWKozfFUmlF9JJ4UW33mnBB1yJwIRFpNo3Isn XJQBdwuixKxBGYV1RZ5rZWZzY+4ldo64E0Z3FLMnAqfRzLMa2GHek/fN7Px9uvUW4/T7 Z1pQZ8yZidlNbJSOiLFsIvNyPfsb1+TtAJ2nh1WrfZjDd8qL/sULDzJO4CJQqMqn/e1H RRcQTvlevMZDOOFFEcrr7PQ3uJLMbClvvXVIYoFYpI3y1X22ueh9X0ylEjVpomunOZss 6f2TAS9uLRSLq3RFzf1ICmSIJj1/NU+AqAqVvUCNQ2jnfC80JYglzdPWKIBkhQpaSmT8 YptA== X-Gm-Message-State: ABy/qLaEnfZqrrdW8hV8ed73f9YzDOhOrUCpXWTNjDzc1Pa6ppihpXaF 06GVbfmxQHeCGZcw3ex0ZWPZDFGqgmaelg== X-Google-Smtp-Source: APBJJlHVbJhjkX/rhTkKZ3HC2Up2U7NraC9F7tjiWvdWxoR53YaYnpQPC9Uk2nIaVdvGLnUuc2qgVw== X-Received: by 2002:a05:6512:3253:b0:4fa:da4f:636f with SMTP id c19-20020a056512325300b004fada4f636fmr8696424lfr.32.1688992102689; Mon, 10 Jul 2023 05:28:22 -0700 (PDT) Received: from fckxorg.mail.msk ([2a00:1148:b0ba:16:a3e8:bdc1:dbed:dbc8]) by smtp.gmail.com with ESMTPSA id x23-20020ac25dd7000000b004faf6a87d63sm1682778lfq.38.2023.07.10.05.28.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 10 Jul 2023 05:28:22 -0700 (PDT) X-Google-Original-From: Maxim Kokryashkin To: tarantool-patches@dev.tarantool.org, skaplun@tarantool.org, sergeyb@tarantool.org Date: Mon, 10 Jul 2023 15:28:18 +0300 Message-Id: <20230710122818.22221-1-m.kokryashkin@tarantool.org> X-Mailer: git-send-email 2.40.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Subject: [Tarantool-patches] [PATCH luajit v3] sysprof: fix crash during FFUNC stream X-BeenThere: tarantool-patches@dev.tarantool.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Maxim Kokryashkin via Tarantool-patches Reply-To: Maxim Kokryashkin Errors-To: tarantool-patches-bounces@dev.tarantool.org Sender: "Tarantool-patches" Sometimes, the Lua stack can be inconsistent during the FFUNC execution, which may lead to a sysprof crash during the stack unwinding. This patch replaces the `top_frame` property of `global_State` with `lj_sysprof_topframe` structure, which contains `top_frame` and `ffid` properties. `ffid` property makes sense only when the LuaJIT VM state is set to `FFUNC`. That property is set to the ffid of the fast function that VM is about to execute. In the same time, `top_frame` property is not updated now, so the top frame of the Lua stack can be streamed based on the ffid, and the rest of the Lua stack can be streamed as usual. Also, this patch fixes build with plain makefile, by adding the `LJ_HASSYSPROF` flag support to it. Resolves tarantool/tarantool#8594 --- Changes in v3: - Fixed comments as per review by Sergey Branch: https://github.com/tarantool/luajit/tree/fckxorg/gh-8594-sysprof-ffunc-crash PR: https://github.com/tarantool/tarantool/pull/8737 src/Makefile.original | 3 ++ src/lj_obj.h | 7 +++- src/lj_sysprof.c | 26 ++++++++++++--- src/vm_x64.dasc | 22 +++++++++++-- src/vm_x86.dasc | 31 ++++++++++++++--- .../gh-8594-sysprof-ffunc-crash.test.lua | 33 +++++++++++++++++++ 6 files changed, 109 insertions(+), 13 deletions(-) create mode 100644 test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua diff --git a/src/Makefile.original b/src/Makefile.original index aedaaa73..e21a0e56 100644 --- a/src/Makefile.original +++ b/src/Makefile.original @@ -441,6 +441,9 @@ ifneq (,$(findstring LJ_NO_UNWIND 1,$(TARGET_TESTARCH))) DASM_AFLAGS+= -D NO_UNWIND TARGET_ARCH+= -DLUAJIT_NO_UNWIND endif +ifneq (,$(findstring LJ_HASSYSPROF 1,$(TARGET_TESTARCH))) + DASM_AFLAGS+= -D LJ_HASSYSPROF +endif DASM_AFLAGS+= -D VER=$(subst LJ_ARCH_VERSION_,,$(filter LJ_ARCH_VERSION_%,$(subst LJ_ARCH_VERSION ,LJ_ARCH_VERSION_,$(TARGET_TESTARCH)))) ifeq (Windows,$(TARGET_SYS)) DASM_AFLAGS+= -D WIN diff --git a/src/lj_obj.h b/src/lj_obj.h index 45507e0d..e17316df 100644 --- a/src/lj_obj.h +++ b/src/lj_obj.h @@ -598,6 +598,11 @@ enum { GCSmax }; +struct lj_sysprof_topframe { + uint8_t ffid; /* FFID of the fast function VM is about to execute. */ + TValue *top_frame; /* Top frame for sysprof. */ +}; + typedef struct GCState { GCSize total; /* Memory currently allocated. */ GCSize threshold; /* Memory threshold. */ @@ -675,7 +680,7 @@ typedef struct global_State { MRef ctype_state; /* Pointer to C type state. */ GCRef gcroot[GCROOT_MAX]; /* GC roots. */ #ifdef LJ_HASSYSPROF - TValue *top_frame; /* Top frame for sysprof. */ + struct lj_sysprof_topframe top_frame_info; /* Top frame info for sysprof. */ #endif } global_State; diff --git a/src/lj_sysprof.c b/src/lj_sysprof.c index 2e9ed9b3..0a341e16 100644 --- a/src/lj_sysprof.c +++ b/src/lj_sysprof.c @@ -109,6 +109,12 @@ static void stream_epilogue(struct sysprof *sp) lj_wbuf_addbyte(&sp->out, LJP_EPILOGUE_BYTE); } +static void stream_ffunc_impl(struct lj_wbuf *buf, uint8_t ffid) +{ + lj_wbuf_addbyte(buf, LJP_FRAME_FFUNC); + lj_wbuf_addu64(buf, ffid); +} + static void stream_lfunc(struct lj_wbuf *buf, const GCfunc *func) { lua_assert(isluafunc(func)); @@ -129,8 +135,7 @@ static void stream_cfunc(struct lj_wbuf *buf, const GCfunc *func) static void stream_ffunc(struct lj_wbuf *buf, const GCfunc *func) { lua_assert(isffunc(func)); - lj_wbuf_addbyte(buf, LJP_FRAME_FFUNC); - lj_wbuf_addu64(buf, func->c.ffid); + stream_ffunc_impl(buf, func->c.ffid); } static void stream_frame_lua(struct lj_wbuf *buf, const cTValue *frame) @@ -148,7 +153,7 @@ static void stream_frame_lua(struct lj_wbuf *buf, const cTValue *frame) lua_assert(0); } -static void stream_backtrace_lua(struct sysprof *sp) +static void stream_backtrace_lua(struct sysprof *sp, uint32_t vmstate) { global_State *g = sp->g; struct lj_wbuf *buf = &sp->out; @@ -158,8 +163,19 @@ static void stream_backtrace_lua(struct sysprof *sp) lua_assert(g != NULL); L = gco2th(gcref(g->cur_L)); lua_assert(L != NULL); + /* + ** Lua stack may be inconsistent during the execution of a + ** fast-function, so instead of updating the `top_frame` for + ** it, its `ffid` is set instead. The first frame on the + ** result stack is streamed manually, and the rest of the + ** stack is streamed based on the previous `top_frame` value. + */ + if (vmstate == LJ_VMST_FFUNC) { + uint8_t ffid = g->top_frame_info.ffid; + stream_ffunc_impl(buf, ffid); + } - top_frame = g->top_frame - 1; //(1 + LJ_FR2) + top_frame = g->top_frame_info.top_frame - 1; bot = tvref(L->stack) + LJ_FR2; /* Traverse frames backwards */ @@ -234,7 +250,7 @@ static void stream_trace(struct sysprof *sp, uint32_t vmstate) static void stream_guest(struct sysprof *sp, uint32_t vmstate) { lj_wbuf_addbyte(&sp->out, (uint8_t)vmstate); - stream_backtrace_lua(sp); + stream_backtrace_lua(sp, vmstate); stream_backtrace_host(sp); } diff --git a/src/vm_x64.dasc b/src/vm_x64.dasc index 7b04b928..2a0c3f03 100644 --- a/src/vm_x64.dasc +++ b/src/vm_x64.dasc @@ -53,6 +53,7 @@ |.define RDL, RCL |.define TMPR, r10 |.define TMPRd, r10d +|.define TMPRb, r10b |.define ITYPE, r11 |.define ITYPEd, r11d | @@ -353,14 +354,29 @@ |// it syncs with the BASE register only when the control is passed to |// user code. So we need to sync the BASE on each vmstate change to |// keep it consistent. +|// The only exception are FFUNCs because sometimes even internal BASE +|// stash is inconsistent for them. To address that issue, their ffid +|// is stashed instead, so the corresponding frame can be streamed +|// manually. |.macro set_vmstate_sync_base, st |.if LJ_HASSYSPROF | set_vmstate INTERP // Guard for non-atomic VM context restoration -| mov qword [DISPATCH+DISPATCH_GL(top_frame)], BASE +| mov qword [DISPATCH+DISPATCH_GL(top_frame_info.top_frame)], BASE |.endif | set_vmstate st |.endmacro | +|.macro set_vmstate_ffunc +|.if LJ_HASSYSPROF +| set_vmstate INTERP +| mov TMPR, [BASE - 16] +| cleartp LFUNC:TMPR // Obtain plain address value. Equivalent of `gcval`. +| mov TMPRb, LFUNC:TMPR->ffid +| mov byte [DISPATCH+DISPATCH_GL(top_frame_info.ffid)], TMPRb +|.endif +| set_vmstate FFUNC +|.endmacro +| |// Uses TMPRd (r10d). |.macro save_vmstate |.if not WIN @@ -376,7 +392,7 @@ | set_vmstate INTERP | mov TMPR, SAVE_L | mov TMPR, L:TMPR->base -| mov qword [DISPATCH+DISPATCH_GL(top_frame)], TMPR +| mov qword [DISPATCH+DISPATCH_GL(top_frame_info.top_frame)], TMPR |.endif | mov TMPRd, SAVE_VMSTATE | mov dword [DISPATCH+DISPATCH_GL(vmstate)], TMPRd @@ -1188,7 +1204,7 @@ static void build_subroutines(BuildCtx *ctx) | |.macro .ffunc, name |->ff_ .. name: - | set_vmstate_sync_base FFUNC + | set_vmstate_ffunc |.endmacro | |.macro .ffunc_1, name diff --git a/src/vm_x86.dasc b/src/vm_x86.dasc index bd1e940e..ff388d58 100644 --- a/src/vm_x86.dasc +++ b/src/vm_x86.dasc @@ -101,6 +101,7 @@ |.define FCARG2, CARG2d | |.define XCHGd, r11d // TMP on x64, used for exchange. +|.define XCHGb, r11b // TMPRb on x64, used for exchange. |.endif | |// Type definitions. Some of these are only used for documentation. @@ -451,14 +452,36 @@ |// it syncs with the BASE register only when the control is passed to |// user code. So we need to sync the BASE on each vmstate change to |// keep it consistent. +|// The only exception are FFUNCs because sometimes even internal BASE +|// stash is inconsistent for them. To address that issue, their ffid +|// is stashed instead, so the corresponding frame can be streamed +|// manually. |.macro set_vmstate_sync_base, st |.if LJ_HASSYSPROF | set_vmstate INTERP // Guard for non-atomic VM context restoration -| mov dword [DISPATCH+DISPATCH_GL(top_frame)], BASE +| mov dword [DISPATCH+DISPATCH_GL(top_frame_info.top_frame)], BASE |.endif | set_vmstate st |.endmacro | +|.macro set_vmstate_ffunc +|.if LJ_HASSYSPROF +| set_vmstate INTERP +|.if not X64 +| mov SPILLECX, ecx +| mov LFUNC:ecx, [BASE - 4] +| mov cl, LFUNC:ecx->ffid +| mov byte [DISPATCH+DISPATCH_GL(top_frame_info.ffid)], cl +| mov ecx, SPILLECX +|.else // X64 +| mov LFUNC:XCHGd, [BASE - 8] +| mov XCHGb, LFUNC:XCHGd->ffid +| mov byte [DISPATCH+DISPATCH_GL(top_frame_info.ffid)], XCHGb +|.endif // X64 +|.endif // LJ_HASSYSPROF +| set_vmstate FFUNC +|.endmacro +| |// Uses spilled ecx on x86 or XCHGd (r11d) on x64. |.macro save_vmstate |.if not WIN @@ -485,7 +508,7 @@ |.if LJ_HASSYSPROF | mov ecx, SAVE_L | mov ecx, L:ecx->base -| mov dword [DISPATCH+DISPATCH_GL(top_frame)], ecx +| mov dword [DISPATCH+DISPATCH_GL(top_frame_info.top_frame)], ecx |.endif | mov ecx, SAVE_VMSTATE | mov dword [DISPATCH+DISPATCH_GL(vmstate)], ecx @@ -494,7 +517,7 @@ |.if LJ_HASSYSPROF | mov XCHGd, SAVE_L | mov XCHGd, L:XCHGd->base -| mov dword [DISPATCH+DISPATCH_GL(top_frame)], XCHGd +| mov dword [DISPATCH+DISPATCH_GL(top_frame_info.top_frame)], XCHGd |.endif | mov XCHGd, SAVE_VMSTATE | mov dword [DISPATCH+DISPATCH_GL(vmstate)], XCHGd @@ -1488,7 +1511,7 @@ static void build_subroutines(BuildCtx *ctx) | |.macro .ffunc, name |->ff_ .. name: - | set_vmstate_sync_base FFUNC + | set_vmstate_ffunc |.endmacro | |.macro .ffunc_1, name diff --git a/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua b/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua new file mode 100644 index 00000000..e5cdeb07 --- /dev/null +++ b/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua @@ -0,0 +1,33 @@ +local tap = require('tap') +local test = tap.test('gh-8594-sysprof-ffunc-crash'):skipcond({ + ['Sysprof is implemented for x86_64 only'] = jit.arch ~= 'x86' and + jit.arch ~= 'x64', + ['Sysprof is implemented for Linux only'] = jit.os ~= 'Linux', +}) + +test:plan(1) + +jit.off() +-- XXX: Run JIT tuning functions in a safe frame to avoid errors +-- thrown when LuaJIT is compiled with JIT engine disabled. +pcall(jit.flush) + +local TMP_BINFILE = '/dev/null' + +local res, err = misc.sysprof.start{ + mode = 'C', + interval = 3, + path = TMP_BINFILE, +} +assert(res, err) + +for i = 1, 1e5 do + tostring(i) +end + +res, err = misc.sysprof.stop() +assert(res, err) + +test:ok(true, 'sysprof finished successfully') + +os.exit(test:check() and 0 or 1) -- 2.40.1