Tarantool development patches archive
 help / color / mirror / Atom feed
From: Maxim Kokryashkin via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: "Sergey Kaplun" <skaplun@tarantool.org>
Cc: "Maxim Kokryashkin" <max.kokryashkin@gmail.com>,
	tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches]  [PATCH luajit v3] sysprof: fix crash during FFUNC stream
Date: Mon, 17 Jul 2023 15:52:41 +0300	[thread overview]
Message-ID: <1689598361.696046955@f193.i.mail.ru> (raw)
In-Reply-To: <ZLK9ERB5xzQe/4jl@root>

[-- Attachment #1: Type: text/plain, Size: 5762 bytes --]


Hi!
Thanks for the review!
 
> 
>>Hi, Maxim!
>>Thanks for the fixes!
>>Generally LGTM, but please consider my comments below.
>>
>>On 10.07.23, Maxim Kokryashkin wrote:
>>> Sometimes, the Lua stack can be inconsistent during
>>> the FFUNC execution, which may lead to a sysprof
>>> crash during the stack unwinding.
>>>
>>> This patch replaces the `top_frame` property of `global_State`
>>> with `lj_sysprof_topframe` structure, which contains `top_frame`
>>> and `ffid` properties. `ffid` property makes sense only when the
>>> LuaJIT VM state is set to `FFUNC`. That property is set to the
>>> ffid of the fast function that VM is about to execute.
>>> In the same time, `top_frame` property is not updated now, so
>>> the top frame of the Lua stack can be streamed based on the ffid,
>>> and the rest of the Lua stack can be streamed as usual.
>>>
>>> Also, this patch fixes build with plain makefile, by adding
>>
>>Nit: I suggest to rephrase it like "original Makefile" or
>>Makefile.original.
>>
>>Feel free to ignore.
>Fixed! The branch is force-pushed.
>>
>>> the `LJ_HASSYSPROF` flag support to it.
>>>
>>> Resolves tarantool/tarantool#8594
>>> ---
>>> Changes in v3:
>>> - Fixed comments as per review by Sergey
>>>
>>> Branch:  https://github.com/tarantool/luajit/tree/fckxorg/gh-8594-sysprof-ffunc-crash
>>> PR:  https://github.com/tarantool/tarantool/pull/8737
>>> src/Makefile.original | 3 ++
>>> src/lj_obj.h | 7 +++-
>>> src/lj_sysprof.c | 26 ++++++++++++---
>>> src/vm_x64.dasc | 22 +++++++++++--
>>> src/vm_x86.dasc | 31 ++++++++++++++---
>>> .../gh-8594-sysprof-ffunc-crash.test.lua | 33 +++++++++++++++++++
>>> 6 files changed, 109 insertions(+), 13 deletions(-)
>>> create mode 100644 test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua
>>>
>>> diff --git a/src/Makefile.original b/src/Makefile.original
>>> index aedaaa73..e21a0e56 100644
>>> --- a/src/Makefile.original
>>> +++ b/src/Makefile.original
>>
>><snipped>
>>
>>> diff --git a/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua b/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua
>>> new file mode 100644
>>> index 00000000..e5cdeb07
>>> --- /dev/null
>>> +++ b/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua
>>> @@ -0,0 +1,33 @@
>>> +local tap = require('tap')
>>> +local test = tap.test('gh-8594-sysprof-ffunc-crash'):skipcond({
>>> + ['Sysprof is implemented for x86_64 only'] = jit.arch ~= 'x86' and
>>> + jit.arch ~= 'x64',
>>> + ['Sysprof is implemented for Linux only'] = jit.os ~= 'Linux',
>>> +})
>>> +
>>> +test:plan(1)
>>> +
>>> +jit.off()
>>> +-- XXX: Run JIT tuning functions in a safe frame to avoid errors
>>> +-- thrown when LuaJIT is compiled with JIT engine disabled.
>>> +pcall(jit.flush)
>>> +
>>> +local TMP_BINFILE = '/dev/null'
>>> +
>>> +local res, err = misc.sysprof.start{
>>> + mode = 'C',
>>> + interval = 3,
>>> + path = TMP_BINFILE,
>>> +}
>>> +assert(res, err)
>>> +
>>> +for i = 1, 1e5 do
>>> + tostring(i)
>>> +end
>>
>>Within these (interval/iterations) changes I hardly can see assertion failure
>>on master branch for my laptop (1/10 cases).
>>
>>| >>> git log -n1 --no-decorate --oneline
>>| 8e46d601 test: fix flaky <unit-jit-parse.test.lua>
>>| >>> LUA_PATH="src/?.lua;test/tarantool-tests/?.lua;;" src/luajit test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua
>>| TAP version 13
>>| 1..1
>>| ok - sysprof finished successfully
>>
>>I don't know how to resolve this problem with our CI at death's door...
>>OTOH, we may consider that this value is enough for our CI, so I'll see
>>the problem there.
>>So, I'll agree with Sergey's opinion about this flaky test.
>I’ve added a conditional parametrization for this test case, here is the diff:
>==========================================================
>diff --git a/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua b/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua
>index a053e41c..347bd087 100644
>--- a/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua
>+++ b/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua
>@@ -17,17 +17,30 @@ local TMP_BINFILE = '/dev/null'
> -- XXX: The best way to test the issue is to set the profile
> -- interval to be as short as possible. However, our CI is
> -- not capable of handling such intense testing, so it was a
>--- forced decision to reduce the sampling frequency. As a
>+-- forced decision to reduce the sampling frequency for it. As a
> -- result, it is now less likely to reproduce the issue
> -- statistically, but the test case is still valid.
>+
>+-- GitHub always sets[1] the `CI` environment variable to `true`
>+-- for every step in a workflow.
>+-- [1]: https://docs.github.com/en/actions/learn-github-actions/variables#default-environment-variables
>+local CI = os.getenv('CI') == 'true'
>+
>+-- Profile interval and number of iterations for CI are
>+-- empirical. Non-CI profile interval is set to be as short
>+-- as possible, so the issue is more likely to reproduce.
>+-- Non-CI number of iterations is greater for the same reason.
>+local PROFILE_INTERVAL = CI and 3 or 1
>+local N_ITERATIONS = CI and 1e5 or 1e6
>+
> local res, err = misc.sysprof.start{
>   mode = 'C',
>-  interval = 3,
>+  interval = PROFILE_INTERVAL,
>   path = TMP_BINFILE,
> }
> assert(res, err)
> 
>-for i = 1, 1e5 do
>+for i = 1, N_ITERATIONS do
>   -- XXX: `tostring` is FFUNC.
>   tostring(i)
> end
>==========================================================
>The branch is force-pushed.
>>
>>> +
>>> +res, err = misc.sysprof.stop()
>>> +assert(res, err)
>>> +
>>> +test:ok(true, 'sysprof finished successfully')
>>> +
>>> +os.exit(test:check() and 0 or 1)
>>> --
>>> 2.40.1
>>>
>>
>>--
>>Best regards,
>>Sergey Kaplun
>--
>Best regards,
>Maxim Kokryashkin

[-- Attachment #2: Type: text/html, Size: 7567 bytes --]

  reply	other threads:[~2023-07-17 12:52 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-10 12:28 Maxim Kokryashkin via Tarantool-patches
2023-07-13  8:12 ` Sergey Bronnikov via Tarantool-patches
2023-07-14 14:30   ` Maxim Kokryashkin via Tarantool-patches
2023-11-08 15:47     ` Sergey Bronnikov via Tarantool-patches
2023-11-08 23:01       ` Maxim Kokryashkin via Tarantool-patches
2023-11-09  7:06         ` Sergey Bronnikov via Tarantool-patches
2023-07-15 15:36 ` Sergey Kaplun via Tarantool-patches
2023-07-17 12:52   ` Maxim Kokryashkin via Tarantool-patches [this message]
2023-11-23  6:32 ` Igor Munkin via Tarantool-patches
  -- strict thread matches above, loose matches on Subject: below --
2023-07-05  8:56 Maxim Kokryashkin via Tarantool-patches
2023-07-09 12:21 ` Sergey Kaplun via Tarantool-patches

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1689598361.696046955@f193.i.mail.ru \
    --to=tarantool-patches@dev.tarantool.org \
    --cc=m.kokryashkin@tarantool.org \
    --cc=max.kokryashkin@gmail.com \
    --cc=skaplun@tarantool.org \
    --subject='Re: [Tarantool-patches]  [PATCH luajit v3] sysprof: fix crash during FFUNC stream' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox