[Tarantool-patches] [PATCH luajit v3] sysprof: fix crash during FFUNC stream

Sergey Kaplun skaplun at tarantool.org
Sat Jul 15 18:36:49 MSK 2023


Hi, Maxim!
Thanks for the fixes!
Generally LGTM, but please consider my comments below.

On 10.07.23, Maxim Kokryashkin wrote:
> Sometimes, the Lua stack can be inconsistent during
> the FFUNC execution, which may lead to a sysprof
> crash during the stack unwinding.
> 
> This patch replaces the `top_frame` property of `global_State`
> with `lj_sysprof_topframe` structure, which contains `top_frame`
> and `ffid` properties. `ffid` property makes sense only when the
> LuaJIT VM state is set to `FFUNC`. That property is set to the
> ffid of the fast function that VM is about to execute.
> In the same time, `top_frame` property is not updated now, so
> the top frame of the Lua stack can be streamed based on the ffid,
> and the rest of the Lua stack can be streamed as usual.
> 
> Also, this patch fixes build with plain makefile, by adding

Nit: I suggest rephrasing it as "original Makefile" or
Makefile.original.

Feel free to ignore.

> the `LJ_HASSYSPROF` flag support to it.
> 
> Resolves tarantool/tarantool#8594
> ---
> Changes in v3:
> - Fixed comments as per review by Sergey
> 
> Branch: https://github.com/tarantool/luajit/tree/fckxorg/gh-8594-sysprof-ffunc-crash
> PR: https://github.com/tarantool/tarantool/pull/8737
>  src/Makefile.original                         |  3 ++
>  src/lj_obj.h                                  |  7 +++-
>  src/lj_sysprof.c                              | 26 ++++++++++++---
>  src/vm_x64.dasc                               | 22 +++++++++++--
>  src/vm_x86.dasc                               | 31 ++++++++++++++---
>  .../gh-8594-sysprof-ffunc-crash.test.lua      | 33 +++++++++++++++++++
>  6 files changed, 109 insertions(+), 13 deletions(-)
>  create mode 100644 test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua
> 
> diff --git a/src/Makefile.original b/src/Makefile.original
> index aedaaa73..e21a0e56 100644
> --- a/src/Makefile.original
> +++ b/src/Makefile.original

<snipped>

> diff --git a/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua b/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua
> new file mode 100644
> index 00000000..e5cdeb07
> --- /dev/null
> +++ b/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua
> @@ -0,0 +1,33 @@
> +local tap = require('tap')
> +local test = tap.test('gh-8594-sysprof-ffunc-crash'):skipcond({
> +  ['Sysprof is implemented for x86_64 only'] = jit.arch ~= 'x86' and
> +                                               jit.arch ~= 'x64',
> +  ['Sysprof is implemented for Linux only'] = jit.os ~= 'Linux',
> +})
> +
> +test:plan(1)
> +
> +jit.off()
> +-- XXX: Run JIT tuning functions in a safe frame to avoid errors
> +-- thrown when LuaJIT is compiled with JIT engine disabled.
> +pcall(jit.flush)
> +
> +local TMP_BINFILE = '/dev/null'
> +
> +local res, err = misc.sysprof.start{
> +  mode = 'C',
> +  interval = 3,
> +  path = TMP_BINFILE,
> +}
> +assert(res, err)
> +
> +for i = 1, 1e5 do
> +  tostring(i)
> +end

With these interval/iteration values, I can hardly reproduce the assertion
failure on the master branch on my laptop (about 1 run in 10).

| >>> git log -n1 --no-decorate --oneline
| 8e46d601 test: fix flaky <unit-jit-parse.test.lua>
| >>> LUA_PATH="src/?.lua;test/tarantool-tests/?.lua;;" src/luajit test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua
| TAP version 13
| 1..1
| ok - sysprof finished successfully

I don't know how to resolve this problem given the state of our CI...
OTOH, these values may well be enough for our CI, so the problem should
show up there.
So I agree with Sergey's opinion about this flaky test.
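
For reference, a rough way to estimate the reproduction rate is to rerun the
test in a loop and count the non-zero exits. This is only a sketch:
`count_failures` is a hypothetical helper, not something from the repo or
the patch, and it assumes the assertion failure makes the process exit with
a non-zero status.

```shell
# Hypothetical helper (not part of the patch): run a command N times
# and print how many of the runs exited with a non-zero status.
count_failures() {
  runs=$1
  shift
  failures=0
  i=0
  while [ "$i" -lt "$runs" ]; do
    # Discard the output; only the exit status matters here.
    if ! "$@" >/dev/null 2>&1; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done
  echo "$failures"
}
```

It could be used as `count_failures 10 env LUA_PATH="src/?.lua;test/tarantool-tests/?.lua;;" src/luajit test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua`,
which prints the number of failed runs out of 10.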

> +
> +res, err = misc.sysprof.stop()
> +assert(res, err)
> +
> +test:ok(true, 'sysprof finished successfully')
> +
> +os.exit(test:check() and 0 or 1)
> -- 
> 2.40.1
> 

-- 
Best regards,
Sergey Kaplun

