[Tarantool-patches] [PATCH luajit v2] core: fix cur_L restoration on error throw

Igor Munkin imun at tarantool.org
Wed Aug 18 19:57:05 MSK 2021


Sergey,

Thanks for the patch! I'm very curious what's wrong with FreeBSD... The
first version is more portable, but since we're going to revert this
commit few patches later, I'm open to any implementation.

LGTM, with some nits.

On 18.08.21, Sergey Kaplun wrote:
> Implement cur_L restoration only for arm64 architecture, due to FreeBSD
> issue.
> 
> Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-6189-curL-v2
> Issues:
> * https://github.com/tarantool/tarantool/issues/6189
> * https://github.com/tarantool/tarantool/issues/6323
> * https://github.com/tarantool/tarantool/issues/1516
> 
> Tarantool branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-6189-curL-v2
> 
> Enable test-run tests on arm64, Odroid with bump to show their
> coverage.

Please, rebase on the current master: I believe CI should be green!

> 
> P.S. this problem is JIT-related, however, when I turn on `jit.dump()`
> in CI [1], it is disappeared :(. Also, can't reproduce it inside
> sh4/sh8 VM, test fails only in the CI. Red test-run.py suite due to
> fiber.top issue, see also [2].

Mystery. Again... Who knows, maybe you hit the root cause, why this hack
with cur_L restoring is forbidden.

> 
> I suppose it would be nice to have a FreeBSD test machine like we have
> for M1 and Odroid. It may be helpful to research the console issue [3]
> too.

Definitely. Glad Kirill is also in this thread :)

> 
> ===================================================================
> commit 0f555bf79fefa1016849577500aec52719378ca5
> Author: Sergey Kaplun <skaplun at tarantool.org>
> Date:   Sun Aug 15 15:47:13 2021 +0300
> 
> arm64: fix cur_L restoration on error throw
> 
> This change is a kind of follow-up of commits

Minor: not kind of, but just follow-up.

> ed412cd9f55fe87fd32a69c86e1732690fc5c1b0 ('Update cur_L on exceptional
> path') and 97699d9ee2467389b6aea21a098e38aff3469b5f ('Fix cur_L tracking
> on exceptional path').
> 
> When an error is thrown on the coroutine that is not the one being
> currently executed, `cur_L` is not set up. Hence, when the running trace
> exits at assertion guard right after the error is caught, Lua state is
> restored from the incorrect `cur_L`. As a result the resulting stack is
> inconsistent and the crash occurs.
> 
> Aforementioned patches fix the behaviour only for x86/x64 architectures.
> This patch updates the `cur_L` for arm64 architecture too.
> 
> Nevertheless, throwing an error at non-currently executed coroutine is a
> violation of Lua/C API. So, in the nearest possible future this patch

Minor: It would be great to refer Roberto's answer. Feel free to ignore,
since anyone can find it in your PR in LuaJIT repo.

> should be replaced within the corresponding assert in `lj_err_throw()`.

Typo: s/within/with/.

> 
> Resolves tarantool/tarantool#6189
> Relates to tarantool/tarantool#6323
> Follows up tarantool/tarantool#1516
> 

<snipped>

> diff --git a/test/tarantool-tests/gh-6189-cur_L.test.lua b/test/tarantool-tests/gh-6189-cur_L.test.lua
> new file mode 100644
> index 00000000..8521af9a
> --- /dev/null
> +++ b/test/tarantool-tests/gh-6189-cur_L.test.lua
> @@ -0,0 +1,25 @@
> +local libcur_L = require('libcur_L')
> +local tap = require('tap')
> +
> +local test = tap.test('gh-6189-cur_L')
> +test:plan(1)
> +
> +local function cbool(cond)
> +  if cond then
> +    return 1
> +  else
> +    return 0
> +  end
> +end
> +
> +-- Compile function to trace with snapshot.
> +jit.opt.start('hotloop=1')
> +cbool(true)
> +cbool(true)

Minor: Please add a comment why two calls are needed.

> +
> +pcall(libcur_L.error_from_other_thread)

Minor: It would be nice to add an assert that pcall yields false here.

> +-- Call with restoration from a snapshot with wrong cur_L.
> +cbool(false)
> +
> +test:ok(true)
> +os.exit(test:check() and 0 or 1)

<snipped>

> diff --git a/test/tarantool-tests/gh-6189-cur_L/libcur_L.c b/test/tarantool-tests/gh-6189-cur_L/libcur_L.c
> new file mode 100644
> index 00000000..2d58d2e7
> --- /dev/null
> +++ b/test/tarantool-tests/gh-6189-cur_L/libcur_L.c

<snipped>

> +static int error_from_other_thread(lua_State *L)
> +{
> +	lua_State *next_cur_L = lua_newthread(L);
> +	old_L = L;
> +	/* Remove thread. */
> +	lua_pop(L, 1);
> +	/* Do not show frame slot as return result after error. */
> +	lua_pushnil(L);
> +	lua_pushcfunction(next_cur_L, throw_error_at_old_thread);
> +	lua_call(next_cur_L, 0, 0);
> +	/* Unreachable. */

Then it's worth to add an assert here to be sure we never return here.

> +	return 0;
> +}
> +

<snipped>


> ===================================================================
> 
> [1]: https://github.com/tarantool/tarantool/runs/3349429293#step:5:4569
> [2]: https://github.com/tarantool/tarantool/pull/6303
> [3]: https://github.com/tarantool/tarantool/issues/6231
> 
> -- 
> Best regards,
> Sergey Kaplun

-- 
Best regards,
IM


More information about the Tarantool-patches mailing list