Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH luajit] core: fix cur_L restoration on error throw
@ 2021-08-16 10:19 Sergey Kaplun via Tarantool-patches
  2021-08-18  8:49 ` [Tarantool-patches] [PATCH luajit v2] " Sergey Kaplun via Tarantool-patches
  2021-08-19  7:42 ` [Tarantool-patches] [PATCH luajit] " Kirill Yukhin via Tarantool-patches
  0 siblings, 2 replies; 8+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-08-16 10:19 UTC (permalink / raw)
  To: Igor Munkin, Kirill Yukhin; +Cc: tarantool-patches

This change is a kind of revertion of commits
ed412cd9f55fe87fd32a69c86e1732690fc5c1b0 ('Update cur_L on exceptional
path') and 97699d9ee2467389b6aea21a098e38aff3469b5f ('Fix cur_L tracking
on exceptional path').

When an error is thrown on the coroutine that is not the one being
currently executed, `cur_L` is not set up. Hence, when the running trace
exits at assertion guard right after the error is caught, Lua state is
restored from the incorrect `cur_L`. As a result the resulting stack is
inconsistent and the crash occurs.

Aforementioned patches fix the behaviour only for x86/x64
architectures. This patch updates the `cur_L` within `lj_err_throw()` to
the given lua_State, where the error is raised, since this is the only
coroutine that can proceed later. Also, it removes unnecessary
restorations of `cur_L` at C and FF exception handlers in the VM.

Nevertheless, throwing an error at non-currently executed coroutine is a
violation of Lua/C API. So, in the nearest possible future this patch
should be replaced within the corresponding assert.

Resolves tarantool/tarantool#6189
Relates to tarantool/tarantool#6323
Follows up tarantool/tarantool#1516
---

Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-6189-curL
Issues:
* https://github.com/tarantool/tarantool/issues/6189
* https://github.com/tarantool/tarantool/issues/6323
* https://github.com/tarantool/tarantool/issues/1516

Tarantool branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-6189-curL

CI is red on FreeBSD, so the patch is probably not ready to merge.
I've tried to reproduce it on sh8, but I've failed. sh4 is still not
working and is used for running our CI jobs. Stay tuned!

 src/lj_err.c                                  |  6 ++++
 src/vm_x64.dasc                               |  2 --
 src/vm_x86.dasc                               |  2 --
 test/tarantool-tests/CMakeLists.txt           |  1 +
 test/tarantool-tests/gh-6189-cur_L.test.lua   | 25 +++++++++++++
 .../gh-6189-cur_L/CMakeLists.txt              |  1 +
 test/tarantool-tests/gh-6189-cur_L/libcur_L.c | 36 +++++++++++++++++++
 7 files changed, 69 insertions(+), 4 deletions(-)
 create mode 100644 test/tarantool-tests/gh-6189-cur_L.test.lua
 create mode 100644 test/tarantool-tests/gh-6189-cur_L/CMakeLists.txt
 create mode 100644 test/tarantool-tests/gh-6189-cur_L/libcur_L.c

diff --git a/src/lj_err.c b/src/lj_err.c
index b6be357e..df9755a2 100644
--- a/src/lj_err.c
+++ b/src/lj_err.c
@@ -509,6 +509,12 @@ LJ_NOINLINE void LJ_FASTCALL lj_err_throw(lua_State *L, int errcode)
   global_State *g = G(L);
   lj_trace_abort(g);
   setmref(g->jit_base, NULL);
+  /*
+  ** FIXME: Throwing error not on the currently executing coroutine is
+  ** a violation of Lua/C API. The next line should be changed to assert.
+  ** Set the running lua_State as the one where the error can be handled.
+  */
+  setgcref(g->cur_L, obj2gco(L));
   L->status = LUA_OK;
 #if LJ_UNWIND_EXT
   err_raise_ext(errcode);
diff --git a/src/vm_x64.dasc b/src/vm_x64.dasc
index 974047d3..c65e5625 100644
--- a/src/vm_x64.dasc
+++ b/src/vm_x64.dasc
@@ -513,7 +513,6 @@ static void build_subroutines(BuildCtx *ctx)
   |->vm_unwind_c_eh:			// Landing pad for external unwinder.
   |  mov L:DISPATCH, SAVE_L
   |  mov GL:RB, L:DISPATCH->glref
-  |  mov GL:RB->cur_L, L:DISPATCH
   |  mov dword GL:RB->vmstate, ~LJ_VMST_CFUNC
   |  mov DISPATCH, L:DISPATCH->glref	// Setup pointer to dispatch table.
   |  add DISPATCH, GG_G2DISP
@@ -539,7 +538,6 @@ static void build_subroutines(BuildCtx *ctx)
   |  add DISPATCH, GG_G2DISP
   |  mov PC, [BASE-8]			// Fetch PC of previous frame.
   |  mov_false RA
-  |  mov [DISPATCH+DISPATCH_GL(cur_L)], L:RB
   |  mov RB, [BASE]
   |  mov [BASE-16], RA			// Prepend false to error message.
   |  mov [BASE-8], RB
diff --git a/src/vm_x86.dasc b/src/vm_x86.dasc
index ab8e6f27..bdd5c95e 100644
--- a/src/vm_x86.dasc
+++ b/src/vm_x86.dasc
@@ -654,7 +654,6 @@ static void build_subroutines(BuildCtx *ctx)
   |->vm_unwind_c_eh:			// Landing pad for external unwinder.
   |  mov L:DISPATCH, SAVE_L
   |  mov GL:RB, L:DISPATCH->glref
-  |  mov dword GL:RB->cur_L, L:DISPATCH
   |  mov dword GL:RB->vmstate, ~LJ_VMST_CFUNC
   |  mov DISPATCH, L:DISPATCH->glref	// Setup pointer to dispatch table.
   |  add DISPATCH, GG_G2DISP
@@ -690,7 +689,6 @@ static void build_subroutines(BuildCtx *ctx)
   |  add DISPATCH, GG_G2DISP
   |  mov PC, [BASE-4]			// Fetch PC of previous frame.
   |  mov dword [BASE-4], LJ_TFALSE	// Prepend false to error message.
-  |  mov [DISPATCH+DISPATCH_GL(cur_L)], L:RB
   |  // INTERP until jump to BC_RET* or return to C.
   |  set_vmstate INTERP
   |  jmp ->vm_returnc			// Increments RD/MULTRES and returns.
diff --git a/test/tarantool-tests/CMakeLists.txt b/test/tarantool-tests/CMakeLists.txt
index 2fdb4d1f..df74a277 100644
--- a/test/tarantool-tests/CMakeLists.txt
+++ b/test/tarantool-tests/CMakeLists.txt
@@ -57,6 +57,7 @@ macro(BuildTestCLib lib sources)
 endmacro()
 
 add_subdirectory(gh-4427-ffi-sandwich)
+add_subdirectory(gh-6189-cur_L)
 add_subdirectory(lj-flush-on-trace)
 add_subdirectory(misclib-getmetrics-capi)
 
diff --git a/test/tarantool-tests/gh-6189-cur_L.test.lua b/test/tarantool-tests/gh-6189-cur_L.test.lua
new file mode 100644
index 00000000..8521af9a
--- /dev/null
+++ b/test/tarantool-tests/gh-6189-cur_L.test.lua
@@ -0,0 +1,25 @@
+local libcur_L = require('libcur_L')
+local tap = require('tap')
+
+local test = tap.test('gh-6189-cur_L')
+test:plan(1)
+
+local function cbool(cond)
+  if cond then
+    return 1
+  else
+    return 0
+  end
+end
+
+-- Compile function to trace with snapshot.
+jit.opt.start('hotloop=1')
+cbool(true)
+cbool(true)
+
+pcall(libcur_L.error_from_other_thread)
+-- Call with restoration from a snapshot with wrong cur_L.
+cbool(false)
+
+test:ok(true)
+os.exit(test:check() and 0 or 1)
diff --git a/test/tarantool-tests/gh-6189-cur_L/CMakeLists.txt b/test/tarantool-tests/gh-6189-cur_L/CMakeLists.txt
new file mode 100644
index 00000000..1e58e560
--- /dev/null
+++ b/test/tarantool-tests/gh-6189-cur_L/CMakeLists.txt
@@ -0,0 +1 @@
+BuildTestCLib(libcur_L libcur_L.c)
diff --git a/test/tarantool-tests/gh-6189-cur_L/libcur_L.c b/test/tarantool-tests/gh-6189-cur_L/libcur_L.c
new file mode 100644
index 00000000..2d58d2e7
--- /dev/null
+++ b/test/tarantool-tests/gh-6189-cur_L/libcur_L.c
@@ -0,0 +1,36 @@
+#include <lua.h>
+#include <lauxlib.h>
+
+static lua_State *old_L = NULL;
+
+int throw_error_at_old_thread(lua_State *cur_L)
+{
+	lua_error(old_L);
+	/* Unreachable. */
+	return 0;
+}
+
+static int error_from_other_thread(lua_State *L)
+{
+	lua_State *next_cur_L = lua_newthread(L);
+	old_L = L;
+	/* Remove thread. */
+	lua_pop(L, 1);
+	/* Do not show frame slot as return result after error. */
+	lua_pushnil(L);
+	lua_pushcfunction(next_cur_L, throw_error_at_old_thread);
+	lua_call(next_cur_L, 0, 0);
+	/* Unreachable. */
+	return 0;
+}
+
+static const struct luaL_Reg libcur_L[] = {
+	{"error_from_other_thread", error_from_other_thread},
+	{NULL, NULL}
+};
+
+LUA_API int luaopen_libcur_L(lua_State *L)
+{
+	luaL_register(L, "libcur_L", libcur_L);
+	return 1;
+}
-- 
2.31.0


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Tarantool-patches] [PATCH luajit v2] core: fix cur_L restoration on error throw
  2021-08-16 10:19 [Tarantool-patches] [PATCH luajit] core: fix cur_L restoration on error throw Sergey Kaplun via Tarantool-patches
@ 2021-08-18  8:49 ` Sergey Kaplun via Tarantool-patches
  2021-08-18 16:57   ` Igor Munkin via Tarantool-patches
  2021-08-19  8:23   ` Igor Munkin via Tarantool-patches
  2021-08-19  7:42 ` [Tarantool-patches] [PATCH luajit] " Kirill Yukhin via Tarantool-patches
  1 sibling, 2 replies; 8+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-08-18  8:49 UTC (permalink / raw)
  To: Igor Munkin, Kirill Yukhin; +Cc: tarantool-patches

Implement cur_L restoration only for arm64 architecture, due to FreeBSD
issue.

Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-6189-curL-v2
Issues:
* https://github.com/tarantool/tarantool/issues/6189
* https://github.com/tarantool/tarantool/issues/6323
* https://github.com/tarantool/tarantool/issues/1516

Tarantool branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-6189-curL-v2

Enable test-run tests on arm64, Odroid with bump to show their
coverage.

P.S. this problem is JIT-related, however, when I turn on `jit.dump()`
in CI [1], it is disappeared :(. Also, can't reproduce it inside
sh4/sh8 VM, test fails only in the CI. Red test-run.py suite due to
fiber.top issue, see also [2].

I suppose it would be nice to have a FreeBSD test machine like we have
for M1 and Odroid. It may be helpful to research the console issue [3]
too.

===================================================================
commit 0f555bf79fefa1016849577500aec52719378ca5
Author: Sergey Kaplun <skaplun@tarantool.org>
Date:   Sun Aug 15 15:47:13 2021 +0300

arm64: fix cur_L restoration on error throw

This change is a kind of follow-up of commits
ed412cd9f55fe87fd32a69c86e1732690fc5c1b0 ('Update cur_L on exceptional
path') and 97699d9ee2467389b6aea21a098e38aff3469b5f ('Fix cur_L tracking
on exceptional path').

When an error is thrown on the coroutine that is not the one being
currently executed, `cur_L` is not set up. Hence, when the running trace
exits at assertion guard right after the error is caught, Lua state is
restored from the incorrect `cur_L`. As a result the resulting stack is
inconsistent and the crash occurs.

Aforementioned patches fix the behaviour only for x86/x64 architectures.
This patch updates the `cur_L` for arm64 architecture too.

Nevertheless, throwing an error at non-currently executed coroutine is a
violation of Lua/C API. So, in the nearest possible future this patch
should be replaced within the corresponding assert in `lj_err_throw()`.

Resolves tarantool/tarantool#6189
Relates to tarantool/tarantool#6323
Follows up tarantool/tarantool#1516

diff --git a/src/vm_arm64.dasc b/src/vm_arm64.dasc
index 6e298255..2abf17fc 100644
--- a/src/vm_arm64.dasc
+++ b/src/vm_arm64.dasc
@@ -394,6 +394,7 @@ static void build_subroutines(BuildCtx *ctx)
   |   mv_vmstate TMP0w, CFUNC
   |  ldr GL, L->glref
   |   st_vmstate TMP0w
+  |  str L, GL->cur_L
   |  b ->vm_leave_unw
   |
   |->vm_unwind_ff:			// Unwind C stack, return from ff pcall.
@@ -409,6 +410,7 @@ static void build_subroutines(BuildCtx *ctx)
   |   ldr GL, L->glref			// Setup pointer to global state.
   |    mov_false TMP0
   |  sub RA, BASE, #8			// Results start at BASE-8.
+  |   str L, GL->cur_L
   |  ldr PC, [BASE, FRAME_PC]		// Fetch PC of previous frame.
   |    str TMP0, [BASE, #-8]		// Prepend false to error message.
   |   st_vmstate ST_INTERP
diff --git a/test/tarantool-tests/CMakeLists.txt b/test/tarantool-tests/CMakeLists.txt
index 2fdb4d1f..df74a277 100644
--- a/test/tarantool-tests/CMakeLists.txt
+++ b/test/tarantool-tests/CMakeLists.txt
@@ -57,6 +57,7 @@ macro(BuildTestCLib lib sources)
 endmacro()
 
 add_subdirectory(gh-4427-ffi-sandwich)
+add_subdirectory(gh-6189-cur_L)
 add_subdirectory(lj-flush-on-trace)
 add_subdirectory(misclib-getmetrics-capi)
 
diff --git a/test/tarantool-tests/gh-6189-cur_L.test.lua b/test/tarantool-tests/gh-6189-cur_L.test.lua
new file mode 100644
index 00000000..8521af9a
--- /dev/null
+++ b/test/tarantool-tests/gh-6189-cur_L.test.lua
@@ -0,0 +1,25 @@
+local libcur_L = require('libcur_L')
+local tap = require('tap')
+
+local test = tap.test('gh-6189-cur_L')
+test:plan(1)
+
+local function cbool(cond)
+  if cond then
+    return 1
+  else
+    return 0
+  end
+end
+
+-- Compile function to trace with snapshot.
+jit.opt.start('hotloop=1')
+cbool(true)
+cbool(true)
+
+pcall(libcur_L.error_from_other_thread)
+-- Call with restoration from a snapshot with wrong cur_L.
+cbool(false)
+
+test:ok(true)
+os.exit(test:check() and 0 or 1)
diff --git a/test/tarantool-tests/gh-6189-cur_L/CMakeLists.txt b/test/tarantool-tests/gh-6189-cur_L/CMakeLists.txt
new file mode 100644
index 00000000..1e58e560
--- /dev/null
+++ b/test/tarantool-tests/gh-6189-cur_L/CMakeLists.txt
@@ -0,0 +1 @@
+BuildTestCLib(libcur_L libcur_L.c)
diff --git a/test/tarantool-tests/gh-6189-cur_L/libcur_L.c b/test/tarantool-tests/gh-6189-cur_L/libcur_L.c
new file mode 100644
index 00000000..2d58d2e7
--- /dev/null
+++ b/test/tarantool-tests/gh-6189-cur_L/libcur_L.c
@@ -0,0 +1,36 @@
+#include <lua.h>
+#include <lauxlib.h>
+
+static lua_State *old_L = NULL;
+
+int throw_error_at_old_thread(lua_State *cur_L)
+{
+	lua_error(old_L);
+	/* Unreachable. */
+	return 0;
+}
+
+static int error_from_other_thread(lua_State *L)
+{
+	lua_State *next_cur_L = lua_newthread(L);
+	old_L = L;
+	/* Remove thread. */
+	lua_pop(L, 1);
+	/* Do not show frame slot as return result after error. */
+	lua_pushnil(L);
+	lua_pushcfunction(next_cur_L, throw_error_at_old_thread);
+	lua_call(next_cur_L, 0, 0);
+	/* Unreachable. */
+	return 0;
+}
+
+static const struct luaL_Reg libcur_L[] = {
+	{"error_from_other_thread", error_from_other_thread},
+	{NULL, NULL}
+};
+
+LUA_API int luaopen_libcur_L(lua_State *L)
+{
+	luaL_register(L, "libcur_L", libcur_L);
+	return 1;
+}
===================================================================

[1]: https://github.com/tarantool/tarantool/runs/3349429293#step:5:4569
[2]: https://github.com/tarantool/tarantool/pull/6303
[3]: https://github.com/tarantool/tarantool/issues/6231

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit v2] core: fix cur_L restoration on error throw
  2021-08-18  8:49 ` [Tarantool-patches] [PATCH luajit v2] " Sergey Kaplun via Tarantool-patches
@ 2021-08-18 16:57   ` Igor Munkin via Tarantool-patches
  2021-08-18 20:03     ` Sergey Kaplun via Tarantool-patches
  2021-08-19  8:23   ` Igor Munkin via Tarantool-patches
  1 sibling, 1 reply; 8+ messages in thread
From: Igor Munkin via Tarantool-patches @ 2021-08-18 16:57 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Sergey,

Thanks for the patch! I'm very curious what's wrong with FreeBSD... The
first version is more portable, but since we're going to revert this
commit few patches later, I'm open to any implementation.

LGTM, with some nits.

On 18.08.21, Sergey Kaplun wrote:
> Implement cur_L restoration only for arm64 architecture, due to FreeBSD
> issue.
> 
> Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-6189-curL-v2
> Issues:
> * https://github.com/tarantool/tarantool/issues/6189
> * https://github.com/tarantool/tarantool/issues/6323
> * https://github.com/tarantool/tarantool/issues/1516
> 
> Tarantool branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-6189-curL-v2
> 
> Enable test-run tests on arm64, Odroid with bump to show their
> coverage.

Please, rebase on the current master: I believe CI should be green!

> 
> P.S. this problem is JIT-related, however, when I turn on `jit.dump()`
> in CI [1], it is disappeared :(. Also, can't reproduce it inside
> sh4/sh8 VM, test fails only in the CI. Red test-run.py suite due to
> fiber.top issue, see also [2].

Mystery. Again... Who knows, maybe you hit the root cause, why this hack
with cur_L restoring is forbidden.

> 
> I suppose it would be nice to have a FreeBSD test machine like we have
> for M1 and Odroid. It may be helpful to research the console issue [3]
> too.

Definitely. Glad Kirill is also in this thread :)

> 
> ===================================================================
> commit 0f555bf79fefa1016849577500aec52719378ca5
> Author: Sergey Kaplun <skaplun@tarantool.org>
> Date:   Sun Aug 15 15:47:13 2021 +0300
> 
> arm64: fix cur_L restoration on error throw
> 
> This change is a kind of follow-up of commits

Minor: not kind of, but just follow-up.

> ed412cd9f55fe87fd32a69c86e1732690fc5c1b0 ('Update cur_L on exceptional
> path') and 97699d9ee2467389b6aea21a098e38aff3469b5f ('Fix cur_L tracking
> on exceptional path').
> 
> When an error is thrown on the coroutine that is not the one being
> currently executed, `cur_L` is not set up. Hence, when the running trace
> exits at assertion guard right after the error is caught, Lua state is
> restored from the incorrect `cur_L`. As a result the resulting stack is
> inconsistent and the crash occurs.
> 
> Aforementioned patches fix the behaviour only for x86/x64 architectures.
> This patch updates the `cur_L` for arm64 architecture too.
> 
> Nevertheless, throwing an error at non-currently executed coroutine is a
> violation of Lua/C API. So, in the nearest possible future this patch

Minor: It would be great to refer Roberto's answer. Feel free to ignore,
since anyone can find it in your PR in LuaJIT repo.

> should be replaced within the corresponding assert in `lj_err_throw()`.

Typo: s/within/with/.

> 
> Resolves tarantool/tarantool#6189
> Relates to tarantool/tarantool#6323
> Follows up tarantool/tarantool#1516
> 

<snipped>

> diff --git a/test/tarantool-tests/gh-6189-cur_L.test.lua b/test/tarantool-tests/gh-6189-cur_L.test.lua
> new file mode 100644
> index 00000000..8521af9a
> --- /dev/null
> +++ b/test/tarantool-tests/gh-6189-cur_L.test.lua
> @@ -0,0 +1,25 @@
> +local libcur_L = require('libcur_L')
> +local tap = require('tap')
> +
> +local test = tap.test('gh-6189-cur_L')
> +test:plan(1)
> +
> +local function cbool(cond)
> +  if cond then
> +    return 1
> +  else
> +    return 0
> +  end
> +end
> +
> +-- Compile function to trace with snapshot.
> +jit.opt.start('hotloop=1')
> +cbool(true)
> +cbool(true)

Minor: Please add a comment why two calls are needed.

> +
> +pcall(libcur_L.error_from_other_thread)

Minor: It would be nice to add an assert that pcall yields false here.

> +-- Call with restoration from a snapshot with wrong cur_L.
> +cbool(false)
> +
> +test:ok(true)
> +os.exit(test:check() and 0 or 1)

<snipped>

> diff --git a/test/tarantool-tests/gh-6189-cur_L/libcur_L.c b/test/tarantool-tests/gh-6189-cur_L/libcur_L.c
> new file mode 100644
> index 00000000..2d58d2e7
> --- /dev/null
> +++ b/test/tarantool-tests/gh-6189-cur_L/libcur_L.c

<snipped>

> +static int error_from_other_thread(lua_State *L)
> +{
> +	lua_State *next_cur_L = lua_newthread(L);
> +	old_L = L;
> +	/* Remove thread. */
> +	lua_pop(L, 1);
> +	/* Do not show frame slot as return result after error. */
> +	lua_pushnil(L);
> +	lua_pushcfunction(next_cur_L, throw_error_at_old_thread);
> +	lua_call(next_cur_L, 0, 0);
> +	/* Unreachable. */

Then it's worth to add an assert here to be sure we never return here.

> +	return 0;
> +}
> +

<snipped>


> ===================================================================
> 
> [1]: https://github.com/tarantool/tarantool/runs/3349429293#step:5:4569
> [2]: https://github.com/tarantool/tarantool/pull/6303
> [3]: https://github.com/tarantool/tarantool/issues/6231
> 
> -- 
> Best regards,
> Sergey Kaplun

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit v2] core: fix cur_L restoration on error throw
  2021-08-18 16:57   ` Igor Munkin via Tarantool-patches
@ 2021-08-18 20:03     ` Sergey Kaplun via Tarantool-patches
  2021-08-18 20:26       ` Igor Munkin via Tarantool-patches
  0 siblings, 1 reply; 8+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-08-18 20:03 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches

Igor,

Thanks for the review!

On 18.08.21, Igor Munkin wrote:
> Sergey,
> 
> Thanks for the patch! I'm very curious what's wrong with FreeBSD... The
> first version is more portable, but since we're going to revert this
> commit few patches later, I'm open to any implementation.
> 
> LGTM, with some nits.
> 
> On 18.08.21, Sergey Kaplun wrote:
> > Implement cur_L restoration only for arm64 architecture, due to FreeBSD
> > issue.
> > 
> > Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-6189-curL-v2
> > Issues:
> > * https://github.com/tarantool/tarantool/issues/6189
> > * https://github.com/tarantool/tarantool/issues/6323
> > * https://github.com/tarantool/tarantool/issues/1516
> > 
> > Tarantool branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-6189-curL-v2
> > 
> > Enable test-run tests on arm64, Odroid with bump to show their
> > coverage.
> 
> Please, rebase on the current master: I believe CI should be green!

Done.

> 
> > 
> > P.S. this problem is JIT-related, however, when I turn on `jit.dump()`
> > in CI [1], it is disappeared :(. Also, can't reproduce it inside
> > sh4/sh8 VM, test fails only in the CI. Red test-run.py suite due to
> > fiber.top issue, see also [2].
> 
> Mystery. Again... Who knows, maybe you hit the root cause, why this hack
> with cur_L restoring is forbidden.
> 
> > 
> > I suppose it would be nice to have a FreeBSD test machine like we have
> > for M1 and Odroid. It may be helpful to research the console issue [3]
> > too.
> 
> Definitely. Glad Kirill is also in this thread :)
> 

Fixed commit message is the following:

===================================================================
arm64: fix cur_L restoration on error throw

This change is the follow-up of commits
ed412cd9f55fe87fd32a69c86e1732690fc5c1b0 ('Update cur_L on exceptional
path') and 97699d9ee2467389b6aea21a098e38aff3469b5f ('Fix cur_L tracking
on exceptional path').

When an error is thrown on the coroutine that is not the one being
currently executed, `cur_L` is not set up. Hence, when the running trace
exits at assertion guard right after the error is caught, Lua state is
restored from the incorrect `cur_L`. As a result the resulting stack is
inconsistent and the crash occurs.

Aforementioned patches fix the behaviour only for x86/x64 architectures.
This patch updates the `cur_L` for arm64 architecture too.

Nevertheless, throwing an error at non-currently executed coroutine is a
violation of Lua/C API. So, in the nearest possible future this patch
should be replaced with the corresponding assert in `lj_err_throw()`.

Resolves tarantool/tarantool#6189
Relates to tarantool/tarantool#6323
Follows up tarantool/tarantool#1516
===================================================================

> > 
> > ===================================================================
> > commit 0f555bf79fefa1016849577500aec52719378ca5
> > Author: Sergey Kaplun <skaplun@tarantool.org>
> > Date:   Sun Aug 15 15:47:13 2021 +0300
> > 
> > arm64: fix cur_L restoration on error throw
> > 
> > This change is a kind of follow-up of commits
> 
> Minor: not kind of, but just follow-up.

Fixed.

> 
> > ed412cd9f55fe87fd32a69c86e1732690fc5c1b0 ('Update cur_L on exceptional
> > path') and 97699d9ee2467389b6aea21a098e38aff3469b5f ('Fix cur_L tracking
> > on exceptional path').
> > 
> > When an error is thrown on the coroutine that is not the one being
> > currently executed, `cur_L` is not set up. Hence, when the running trace
> > exits at assertion guard right after the error is caught, Lua state is
> > restored from the incorrect `cur_L`. As a result the resulting stack is
> > inconsistent and the crash occurs.
> > 
> > Aforementioned patches fix the behaviour only for x86/x64 architectures.
> > This patch updates the `cur_L` for arm64 architecture too.
> > 
> > Nevertheless, throwing an error at non-currently executed coroutine is a
> > violation of Lua/C API. So, in the nearest possible future this patch
> 
> Minor: It would be great to refer Roberto's answer. Feel free to ignore,
> since anyone can find it in your PR in LuaJIT repo.

It's just confirming Mike's answer.
Ignoring.

> 
> > should be replaced within the corresponding assert in `lj_err_throw()`.
> 
> Typo: s/within/with/.

Fixed.

> 
> > 
> > Resolves tarantool/tarantool#6189
> > Relates to tarantool/tarantool#6323
> > Follows up tarantool/tarantool#1516
> > 
> 
> <snipped>
> 
> > diff --git a/test/tarantool-tests/gh-6189-cur_L.test.lua b/test/tarantool-tests/gh-6189-cur_L.test.lua
> > new file mode 100644
> > index 00000000..8521af9a
> > --- /dev/null
> > +++ b/test/tarantool-tests/gh-6189-cur_L.test.lua
> > @@ -0,0 +1,25 @@
> > +local libcur_L = require('libcur_L')
> > +local tap = require('tap')
> > +
> > +local test = tap.test('gh-6189-cur_L')
> > +test:plan(1)
> > +
> > +local function cbool(cond)
> > +  if cond then
> > +    return 1
> > +  else
> > +    return 0
> > +  end
> > +end
> > +
> > +-- Compile function to trace with snapshot.
> > +jit.opt.start('hotloop=1')
> > +cbool(true)
> > +cbool(true)
> 
> Minor: Please add a comment why two calls are needed.

Added.

> 
> > +
> > +pcall(libcur_L.error_from_other_thread)
> 
> Minor: It would be nice to add an assert that pcall yields false here.

Added.

===================================================================
diff --git a/test/tarantool-tests/gh-6189-cur_L.test.lua b/test/tarantool-tests/gh-6189-cur_L.test.lua
index 8521af9a..7f2184ec 100644
--- a/test/tarantool-tests/gh-6189-cur_L.test.lua
+++ b/test/tarantool-tests/gh-6189-cur_L.test.lua
@@ -14,10 +14,13 @@ end
 
 -- Compile function to trace with snapshot.
 jit.opt.start('hotloop=1')
+-- First call makes `cbool()` hot enough to be recorded next time.
 cbool(true)
+-- Second call records `cbool()` body (i.e. `if` branch). This is
+-- a root trace for `cbool()`.
 cbool(true)
 
-pcall(libcur_L.error_from_other_thread)
+assert(pcall(libcur_L.error_from_other_thread) == false, "return from error")
 -- Call with restoration from a snapshot with wrong cur_L.
 cbool(false)
 
===================================================================

> 
> > +-- Call with restoration from a snapshot with wrong cur_L.
> > +cbool(false)
> > +
> > +test:ok(true)
> > +os.exit(test:check() and 0 or 1)
> 
> <snipped>
> 
> > diff --git a/test/tarantool-tests/gh-6189-cur_L/libcur_L.c b/test/tarantool-tests/gh-6189-cur_L/libcur_L.c
> > new file mode 100644
> > index 00000000..2d58d2e7
> > --- /dev/null
> > +++ b/test/tarantool-tests/gh-6189-cur_L/libcur_L.c
> 
> <snipped>
> 
> > +static int error_from_other_thread(lua_State *L)
> > +{
> > +	lua_State *next_cur_L = lua_newthread(L);
> > +	old_L = L;
> > +	/* Remove thread. */
> > +	lua_pop(L, 1);
> > +	/* Do not show frame slot as return result after error. */
> > +	lua_pushnil(L);
> > +	lua_pushcfunction(next_cur_L, throw_error_at_old_thread);
> > +	lua_call(next_cur_L, 0, 0);
> > +	/* Unreachable. */
> 
> Then it's worth to add an assert here to be sure we never return here.

Added.

===================================================================
diff --git a/test/tarantool-tests/gh-6189-cur_L/libcur_L.c b/test/tarantool-tests/gh-6189-cur_L/libcur_L.c
index 2d58d2e7..e39b607d 100644
--- a/test/tarantool-tests/gh-6189-cur_L/libcur_L.c
+++ b/test/tarantool-tests/gh-6189-cur_L/libcur_L.c
@@ -1,6 +1,9 @@
 #include <lua.h>
 #include <lauxlib.h>
 
+#undef NDEBUG
+#include <assert.h>
+
 static lua_State *old_L = NULL;
 
 int throw_error_at_old_thread(lua_State *cur_L)
@@ -21,6 +24,7 @@ static int error_from_other_thread(lua_State *L)
         lua_pushcfunction(next_cur_L, throw_error_at_old_thread);
         lua_call(next_cur_L, 0, 0);
         /* Unreachable. */
+        assert(0);
         return 0;
 }
===================================================================
 
> 
> > +	return 0;
> > +}
> > +
> 
> <snipped>
> 
> 
> > ===================================================================
> > 
> > [1]: https://github.com/tarantool/tarantool/runs/3349429293#step:5:4569
> > [2]: https://github.com/tarantool/tarantool/pull/6303
> > [3]: https://github.com/tarantool/tarantool/issues/6231
> > 
> > -- 
> > Best regards,
> > Sergey Kaplun
> 
> -- 
> Best regards,
> IM

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit v2] core: fix cur_L restoration on error throw
  2021-08-18 20:03     ` Sergey Kaplun via Tarantool-patches
@ 2021-08-18 20:26       ` Igor Munkin via Tarantool-patches
  0 siblings, 0 replies; 8+ messages in thread
From: Igor Munkin via Tarantool-patches @ 2021-08-18 20:26 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Sergey,

Thanks for the fixes! See both odroid CI pipelines successfully
finished, gratz!

On 18.08.21, Sergey Kaplun wrote:
> Igor,
> 
> Thanks for the review!
> 
> On 18.08.21, Igor Munkin wrote:
> > Sergey,
> > 
> > Thanks for the patch! I'm very curious what's wrong with FreeBSD... The
> > first version is more portable, but since we're going to revert this
> > commit few patches later, I'm open to any implementation.
> > 
> > LGTM, with some nits.
> > 

<snipped>

> > 
> > -- 
> > Best regards,
> > IM
> 
> -- 
> Best regards,
> Sergey Kaplun

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit] core: fix cur_L restoration on error throw
  2021-08-16 10:19 [Tarantool-patches] [PATCH luajit] core: fix cur_L restoration on error throw Sergey Kaplun via Tarantool-patches
  2021-08-18  8:49 ` [Tarantool-patches] [PATCH luajit v2] " Sergey Kaplun via Tarantool-patches
@ 2021-08-19  7:42 ` Kirill Yukhin via Tarantool-patches
  2021-08-19  7:56   ` Vitaliia Ioffe via Tarantool-patches
  1 sibling, 1 reply; 8+ messages in thread
From: Kirill Yukhin via Tarantool-patches @ 2021-08-19  7:42 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hello,

On 16 авг 13:19, Sergey Kaplun wrote:
> This change is a kind of revertion of commits
> ed412cd9f55fe87fd32a69c86e1732690fc5c1b0 ('Update cur_L on exceptional
> path') and 97699d9ee2467389b6aea21a098e38aff3469b5f ('Fix cur_L tracking
> on exceptional path').
> 
> When an error is thrown on the coroutine that is not the one being
> currently executed, `cur_L` is not set up. Hence, when the running trace
> exits at assertion guard right after the error is caught, Lua state is
> restored from the incorrect `cur_L`. As a result the resulting stack is
> inconsistent and the crash occurs.
> 
> Aforementioned patches fix the behaviour only for x86/x64
> architectures. This patch updates the `cur_L` within `lj_err_throw()` to
> the given lua_State, where the error is raised, since this is the only
> coroutine that can proceed later. Also, it removes unnecessary
> restorations of `cur_L` at C and FF exception handlers in the VM.
> 
> Nevertheless, throwing an error at non-currently executed coroutine is a
> violation of Lua/C API. So, in the nearest possible future this patch
> should be replaced within the corresponding assert.
> 
> Resolves tarantool/tarantool#6189
> Relates to tarantool/tarantool#6323
> Follows up tarantool/tarantool#1516

LGTM.

--
Regards, Kirill Yukhin

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Tarantool-patches]  [PATCH luajit] core: fix cur_L restoration on error throw
  2021-08-19  7:42 ` [Tarantool-patches] [PATCH luajit] " Kirill Yukhin via Tarantool-patches
@ 2021-08-19  7:56   ` Vitaliia Ioffe via Tarantool-patches
  0 siblings, 0 replies; 8+ messages in thread
From: Vitaliia Ioffe via Tarantool-patches @ 2021-08-19  7:56 UTC (permalink / raw)
  To: Kirill Yukhin; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 1518 bytes --]


Hi team, 
 
QA LGTM
 
 
--
Vitaliia Ioffe
 
  
>Четверг, 19 августа 2021, 10:42 +03:00 от Kirill Yukhin via Tarantool-patches <tarantool-patches@dev.tarantool.org>:
> 
>Hello,
>
>On 16 авг 13:19, Sergey Kaplun wrote:
>> This change is a kind of revertion of commits
>> ed412cd9f55fe87fd32a69c86e1732690fc5c1b0 ('Update cur_L on exceptional
>> path') and 97699d9ee2467389b6aea21a098e38aff3469b5f ('Fix cur_L tracking
>> on exceptional path').
>>
>> When an error is thrown on the coroutine that is not the one being
>> currently executed, `cur_L` is not set up. Hence, when the running trace
>> exits at assertion guard right after the error is caught, Lua state is
>> restored from the incorrect `cur_L`. As a result the resulting stack is
>> inconsistent and the crash occurs.
>>
>> Aforementioned patches fix the behaviour only for x86/x64
>> architectures. This patch updates the `cur_L` within `lj_err_throw()` to
>> the given lua_State, where the error is raised, since this is the only
>> coroutine that can proceed later. Also, it removes unnecessary
>> restorations of `cur_L` at C and FF exception handlers in the VM.
>>
>> Nevertheless, throwing an error at non-currently executed coroutine is a
>> violation of Lua/C API. So, in the nearest possible future this patch
>> should be replaced within the corresponding assert.
>>
>> Resolves tarantool/tarantool#6189
>> Relates to tarantool/tarantool#6323
>> Follows up tarantool/tarantool#1516
>LGTM.
>
>--
>Regards, Kirill Yukhin
 

[-- Attachment #2: Type: text/html, Size: 2146 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit v2] core: fix cur_L restoration on error throw
  2021-08-18  8:49 ` [Tarantool-patches] [PATCH luajit v2] " Sergey Kaplun via Tarantool-patches
  2021-08-18 16:57   ` Igor Munkin via Tarantool-patches
@ 2021-08-19  8:23   ` Igor Munkin via Tarantool-patches
  1 sibling, 0 replies; 8+ messages in thread
From: Igor Munkin via Tarantool-patches @ 2021-08-19  8:23 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Sergey,

I've checked the patch into tarantool branch in tarantool/luajit and
bumped a new version in master.

On 18.08.21, Sergey Kaplun wrote:
> Implement cur_L restoration only for arm64 architecture, due to FreeBSD
> issue.
> 
> Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-6189-curL-v2
> Issues:
> * https://github.com/tarantool/tarantool/issues/6189
> * https://github.com/tarantool/tarantool/issues/6323
> * https://github.com/tarantool/tarantool/issues/1516
> 
> Tarantool branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-6189-curL-v2
> 
> Enable test-run tests on arm64, Odroid with bump to show their
> coverage.
> 
> P.S. this problem is JIT-related, however, when I turn on `jit.dump()`
> in CI [1], it is disappeared :(. Also, can't reproduce it inside
> sh4/sh8 VM, test fails only in the CI. Red test-run.py suite due to
> fiber.top issue, see also [2].
> 
> I suppose it would be nice to have a FreeBSD test machine like we have
> for M1 and Odroid. It may be helpful to research the console issue [3]
> too.
> 
> ===================================================================
> commit 0f555bf79fefa1016849577500aec52719378ca5
> Author: Sergey Kaplun <skaplun@tarantool.org>
> Date:   Sun Aug 15 15:47:13 2021 +0300
> 
> arm64: fix cur_L restoration on error throw
> 
> This change is a kind of follow-up of commits
> ed412cd9f55fe87fd32a69c86e1732690fc5c1b0 ('Update cur_L on exceptional
> path') and 97699d9ee2467389b6aea21a098e38aff3469b5f ('Fix cur_L tracking
> on exceptional path').
> 
> When an error is thrown on the coroutine that is not the one being
> currently executed, `cur_L` is not set up. Hence, when the running trace
> exits at assertion guard right after the error is caught, Lua state is
> restored from the incorrect `cur_L`. As a result the resulting stack is
> inconsistent and the crash occurs.
> 
> Aforementioned patches fix the behaviour only for x86/x64 architectures.
> This patch updates the `cur_L` for arm64 architecture too.
> 
> Nevertheless, throwing an error at non-currently executed coroutine is a
> violation of Lua/C API. So, in the nearest possible future this patch
> should be replaced within the corresponding assert in `lj_err_throw()`.
> 
> Resolves tarantool/tarantool#6189
> Relates to tarantool/tarantool#6323
> Follows up tarantool/tarantool#1516
> 

<snipped>

> ===================================================================
> 
> [1]: https://github.com/tarantool/tarantool/runs/3349429293#step:5:4569
> [2]: https://github.com/tarantool/tarantool/pull/6303
> [3]: https://github.com/tarantool/tarantool/issues/6231
> 
> -- 
> Best regards,
> Sergey Kaplun

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-08-19  8:48 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-16 10:19 [Tarantool-patches] [PATCH luajit] core: fix cur_L restoration on error throw Sergey Kaplun via Tarantool-patches
2021-08-18  8:49 ` [Tarantool-patches] [PATCH luajit v2] " Sergey Kaplun via Tarantool-patches
2021-08-18 16:57   ` Igor Munkin via Tarantool-patches
2021-08-18 20:03     ` Sergey Kaplun via Tarantool-patches
2021-08-18 20:26       ` Igor Munkin via Tarantool-patches
2021-08-19  8:23   ` Igor Munkin via Tarantool-patches
2021-08-19  7:42 ` [Tarantool-patches] [PATCH luajit] " Kirill Yukhin via Tarantool-patches
2021-08-19  7:56   ` Vitaliia Ioffe via Tarantool-patches

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox