Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH v2 0/2] Prevent JIT engine breakage on fibers switch-over
@ 2020-09-23 19:06 Igor Munkin
  2020-09-23 19:06 ` [Tarantool-patches] [PATCH v2 1/2] fiber: introduce a callback for " Igor Munkin
                   ` (5 more replies)
  0 siblings, 6 replies; 29+ messages in thread
From: Igor Munkin @ 2020-09-23 19:06 UTC (permalink / raw)
  To: Vladislav Shpilevoy, Sergey Ostanevich; +Cc: tarantool-patches

There was a long discussion about the patch correctness and its
performance impact in v1 thread[1]. The benchmarks provided in v1 showed
this implementation as the least nerfing the platform overall
performance even for such synthetic test[2]. One can find the relevant
benchmarks results in the corresponding patches.

Changes in v2:
* implement the callback in a different way to negate its perf impact

@ChangeLog:
* Fixed fibers switch-over to prevent JIT machinery misbehaviour. Trace
  recording is aborted when fiber yields the execution. The yield
  occuring while mcode is being run leads to the platform panic
  (gh-1700, gh-4491).

Branch: https://github.com/tarantool/tarantool/tree/imun/gh-1700-abort-recording-on-fiber-switch
Issues:
* https://github.com/tarantool/tarantool/issues/1700
* https://github.com/tarantool/tarantool/issues/4491

[1]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-March/015290.html
[2]: https://gist.github.com/igormunkin/7e0cf48005bd003ffbdf30181eedb40e

Igor Munkin (2):
  fiber: introduce a callback for fibers switch-over
  lua: abort trace recording on fiber yield

 src/lib/core/fiber.c                          | 10 ++++
 src/lua/utils.c                               | 57 ++++++++++++++++++
 ...-4491-coio-wait-leads-to-segfault.test.lua | 53 +++++++++++++++++
 test/unit/CMakeLists.txt                      | 59 ++++++++++---------
 test/unit/core_test_utils.c                   | 37 ++++++++++++
 5 files changed, 188 insertions(+), 28 deletions(-)
 create mode 100755 test/app-tap/gh-4491-coio-wait-leads-to-segfault.test.lua
 create mode 100644 test/unit/core_test_utils.c

-- 
2.25.0

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [Tarantool-patches] [PATCH v2 1/2] fiber: introduce a callback for fibers switch-over
  2020-09-23 19:06 [Tarantool-patches] [PATCH v2 0/2] Prevent JIT engine breakage on fibers switch-over Igor Munkin
@ 2020-09-23 19:06 ` Igor Munkin
  2020-09-24 12:54   ` sergos
  2020-09-29 22:41   ` Vladislav Shpilevoy
  2020-09-23 19:06 ` [Tarantool-patches] [PATCH v2 2/2] lua: abort trace recording on fiber yield Igor Munkin
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 29+ messages in thread
From: Igor Munkin @ 2020-09-23 19:06 UTC (permalink / raw)
  To: Vladislav Shpilevoy, Sergey Ostanevich; +Cc: tarantool-patches

Tarantool integrates several complex environments together and there are
issues occurring at their junction leading to the platform failures.
E.g. fiber switch-over is implemented outside the Lua world, so when one
lua_State substitutes another one, main LuaJIT engines, such as JIT and
GC, are left unnotified leading to the further platform misbehaviour.

To solve this severe integration drawback <cord_on_yield> function is
introduced. This routine encloses the checks and actions to be done when
the running fiber yields the execution.

Unfortunately the way callback is implemented introduces a circular
dependency. Considering linker symbol resolving methods for static build
an auxiliary translation unit is added to the particular tests mocking
(i.e. exporting) <cord_on_yield> undefined symbol.

Part of #1700
Relates to #4491

Signed-off-by: Igor Munkin <imun@tarantool.org>
---

Here are also the benchmark results for the Release build:
* Vanilla (8477b6c) -> Patched (86bfd45) (min, median, mean, max):
| fibers: 10; iters: 100	0%	0%	0%	-3%
| fibers: 10; iters: 1000	2%	2%	1%	-1%
| fibers: 10; iters: 10000	6%	4%	4%	5%
| fibers: 10; iters: 100000	4%	3%	3%	-2%
| fibers: 100; iters: 100	0%	0%	0%	0%
| fibers: 100; iters: 1000	0%	0%	0%	-8%
| fibers: 100; iters: 10000	1%	0%	0%	1%
| fibers: 100; iters: 100000	0%	0%	1%	2%
| fibers: 1000; iters: 100	0%	0%	0%	0%
| fibers: 1000; iters: 1000	0%	0%	0%	0%
| fibers: 1000; iters: 10000	0%	0%	0%	1%
| fibers: 1000; iters: 100000	0%	0%	0%	0%
| fibers: 10000; iters: 100	-1%	-3%	-1%	0%
| fibers: 10000; iters: 1000	0%	0%	0%	0%
| fibers: 10000; iters: 10000	0%	0%	0%	0%
| fibers: 10000; iters: 100000	0%	-1%	-2%	-6%

 src/lib/core/fiber.c        | 10 +++++++
 src/lua/utils.c             |  8 +++++
 test/unit/CMakeLists.txt    | 59 +++++++++++++++++++------------------
 test/unit/core_test_utils.c | 37 +++++++++++++++++++++++
 4 files changed, 86 insertions(+), 28 deletions(-)
 create mode 100644 test/unit/core_test_utils.c

diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
index 483ae3ce1..1b7d255ff 100644
--- a/src/lib/core/fiber.c
+++ b/src/lib/core/fiber.c
@@ -46,6 +46,8 @@
 #if ENABLE_FIBER_TOP
 #include <x86intrin.h> /* __rdtscp() */
 
+extern void cord_on_yield(void);
+
 static inline void
 clock_stat_add_delta(struct clock_stat *stat, uint64_t clock_delta)
 {
@@ -416,6 +418,10 @@ fiber_call(struct fiber *callee)
 	/** By convention, these triggers must not throw. */
 	if (! rlist_empty(&caller->on_yield))
 		trigger_run(&caller->on_yield, NULL);
+
+	if (cord_is_main())
+		cord_on_yield();
+
 	clock_set_on_csw(caller);
 	callee->caller = caller;
 	callee->flags |= FIBER_IS_READY;
@@ -645,6 +651,10 @@ fiber_yield(void)
 	/** By convention, these triggers must not throw. */
 	if (! rlist_empty(&caller->on_yield))
 		trigger_run(&caller->on_yield, NULL);
+
+	if (cord_is_main())
+		cord_on_yield();
+
 	clock_set_on_csw(caller);
 
 	assert(callee->flags & FIBER_IS_READY || callee == &cord->sched);
diff --git a/src/lua/utils.c b/src/lua/utils.c
index af114b0a2..8e98f7607 100644
--- a/src/lua/utils.c
+++ b/src/lua/utils.c
@@ -1308,3 +1308,11 @@ tarantool_lua_utils_init(struct lua_State *L)
 	luaT_newthread_ref = luaL_ref(L, LUA_REGISTRYINDEX);
 	return 0;
 }
+
+/**
+ * This routine encloses the checks and actions to be done when
+ * the running fiber yields the execution.
+ */
+void cord_on_yield(void)
+{
+}
diff --git a/test/unit/CMakeLists.txt b/test/unit/CMakeLists.txt
index 419477748..aace8cf50 100644
--- a/test/unit/CMakeLists.txt
+++ b/test/unit/CMakeLists.txt
@@ -50,7 +50,7 @@ add_executable(bitset_index.test bitset_index.c)
 target_link_libraries(bitset_index.test bitset)
 add_executable(base64.test base64.c)
 target_link_libraries(base64.test misc unit)
-add_executable(uuid.test uuid.c)
+add_executable(uuid.test uuid.c core_test_utils.c)
 target_link_libraries(uuid.test uuid unit)
 
 add_executable(bps_tree.test bps_tree.cc)
@@ -69,53 +69,53 @@ add_executable(bloom.test bloom.cc)
 target_link_libraries(bloom.test salad)
 add_executable(vclock.test vclock.cc)
 target_link_libraries(vclock.test vclock unit)
-add_executable(xrow.test xrow.cc)
+add_executable(xrow.test xrow.cc core_test_utils.c)
 target_link_libraries(xrow.test xrow unit)
 add_executable(decimal.test decimal.c)
 target_link_libraries(decimal.test core unit)
-add_executable(mp_error.test mp_error.cc)
+add_executable(mp_error.test mp_error.cc core_test_utils.c)
 target_link_libraries(mp_error.test box_error core unit)
 
-add_executable(fiber.test fiber.cc)
+add_executable(fiber.test fiber.cc core_test_utils.c)
 set_source_files_properties(fiber.cc PROPERTIES COMPILE_FLAGS -O0)
 target_link_libraries(fiber.test core unit)
 
-add_executable(fiber_stack.test fiber_stack.c)
+add_executable(fiber_stack.test fiber_stack.c core_test_utils.c)
 set_source_files_properties(fiber_stack.c PROPERTIES COMPILE_FLAGS -O0)
 target_link_libraries(fiber_stack.test core unit)
 
 if (NOT ENABLE_GCOV)
     # This test is known to be broken with GCOV
-    add_executable(guard.test guard.cc)
+    add_executable(guard.test guard.cc core_test_utils.c)
     target_link_libraries(guard.test core unit)
 endif ()
 
-add_executable(fiber_stress.test fiber_stress.cc)
+add_executable(fiber_stress.test fiber_stress.cc core_test_utils.c)
 target_link_libraries(fiber_stress.test core)
 
-add_executable(fiber_cond.test fiber_cond.c unit.c)
+add_executable(fiber_cond.test fiber_cond.c unit.c core_test_utils.c)
 target_link_libraries(fiber_cond.test core)
 
-add_executable(fiber_channel.test fiber_channel.cc unit.c)
+add_executable(fiber_channel.test fiber_channel.cc unit.c core_test_utils.c)
 target_link_libraries(fiber_channel.test core)
 
-add_executable(fiber_channel_stress.test fiber_channel_stress.cc)
+add_executable(fiber_channel_stress.test fiber_channel_stress.cc core_test_utils.c)
 target_link_libraries(fiber_channel_stress.test core)
 
-add_executable(cbus_stress.test cbus_stress.c)
+add_executable(cbus_stress.test cbus_stress.c core_test_utils.c)
 target_link_libraries(cbus_stress.test core stat)
 
-add_executable(cbus.test cbus.c)
+add_executable(cbus.test cbus.c core_test_utils.c)
 target_link_libraries(cbus.test core unit stat)
 
 include(CheckSymbolExists)
 check_symbol_exists(__GLIBC__ features.h GLIBC_USED)
 if (GLIBC_USED)
-    add_executable(cbus_hang.test cbus_hang.c)
+    add_executable(cbus_hang.test cbus_hang.c core_test_utils.c)
     target_link_libraries(cbus_hang.test core unit stat)
 endif ()
 
-add_executable(coio.test coio.cc)
+add_executable(coio.test coio.cc core_test_utils.c)
 target_link_libraries(coio.test core eio bit uri unit)
 
 if (ENABLE_BUNDLED_MSGPUCK)
@@ -132,7 +132,7 @@ if (ENABLE_BUNDLED_MSGPUCK)
     target_link_libraries(msgpack.test ${MSGPUCK_LIBRARIES})
 endif ()
 
-add_executable(scramble.test scramble.c)
+add_executable(scramble.test scramble.c core_test_utils.c)
 target_link_libraries(scramble.test scramble)
 
 add_executable(guava.test guava.c)
@@ -155,7 +155,7 @@ target_link_libraries(csv.test csv)
 add_executable(json.test json.c)
 target_link_libraries(json.test json unit ${ICU_LIBRARIES})
 
-add_executable(rmean.test rmean.cc)
+add_executable(rmean.test rmean.cc core_test_utils.c)
 target_link_libraries(rmean.test stat unit)
 add_executable(histogram.test histogram.c)
 target_link_libraries(histogram.test stat unit)
@@ -170,7 +170,7 @@ target_link_libraries(luaL_iterator.test unit server coll core misc
     ${CURL_LIBRARIES} ${LIBYAML_LIBRARIES} ${READLINE_LIBRARIES}
     ${ICU_LIBRARIES} ${LUAJIT_LIBRARIES} ${LIB_DL})
 
-add_executable(say.test say.c)
+add_executable(say.test say.c core_test_utils.c)
 target_link_libraries(say.test core unit)
 
 set(ITERATOR_TEST_SOURCES
@@ -182,7 +182,7 @@ set(ITERATOR_TEST_SOURCES
     ${PROJECT_SOURCE_DIR}/src/box/vy_cache.c)
 set(ITERATOR_TEST_LIBS core tuple xrow unit)
 
-add_executable(vy_mem.test vy_mem.c ${ITERATOR_TEST_SOURCES})
+add_executable(vy_mem.test vy_mem.c ${ITERATOR_TEST_SOURCES} core_test_utils.c)
 target_link_libraries(vy_mem.test ${ITERATOR_TEST_LIBS} ${LIB_DL})
 
 add_executable(vy_point_lookup.test
@@ -204,11 +204,13 @@ add_executable(vy_point_lookup.test
     ${PROJECT_SOURCE_DIR}/src/box/index_def.c
     ${PROJECT_SOURCE_DIR}/src/box/schema_def.c
     ${PROJECT_SOURCE_DIR}/src/box/identifier.c
+    core_test_utils.c
 )
 target_link_libraries(vy_point_lookup.test core tuple xrow xlog unit ${LIB_DL})
 
 add_executable(column_mask.test
-    column_mask.c)
+    column_mask.c
+    core_test_utils.c)
 target_link_libraries(column_mask.test tuple unit)
 
 add_executable(vy_write_iterator.test
@@ -217,16 +219,17 @@ add_executable(vy_write_iterator.test
     ${PROJECT_SOURCE_DIR}/src/box/vy_upsert.c
     ${PROJECT_SOURCE_DIR}/src/box/vy_write_iterator.c
     ${ITERATOR_TEST_SOURCES}
+    core_test_utils.c
 )
 target_link_libraries(vy_write_iterator.test xlog ${ITERATOR_TEST_LIBS} ${LIB_DL})
 
-add_executable(vy_cache.test vy_cache.c ${ITERATOR_TEST_SOURCES})
+add_executable(vy_cache.test vy_cache.c ${ITERATOR_TEST_SOURCES} core_test_utils.c)
 target_link_libraries(vy_cache.test ${ITERATOR_TEST_LIBS} ${LIB_DL})
 
-add_executable(coll.test coll.cpp)
+add_executable(coll.test coll.cpp core_test_utils.c)
 target_link_libraries(coll.test coll unit misc ${LIB_DL})
 
-add_executable(tuple_bigref.test tuple_bigref.c)
+add_executable(tuple_bigref.test tuple_bigref.c core_test_utils.c)
 target_link_libraries(tuple_bigref.test tuple unit)
 
 add_executable(checkpoint_schedule.test
@@ -235,23 +238,23 @@ add_executable(checkpoint_schedule.test
 )
 target_link_libraries(checkpoint_schedule.test m unit)
 
-add_executable(sio.test sio.c)
+add_executable(sio.test sio.c core_test_utils.c)
 target_link_libraries(sio.test unit core)
 
-add_executable(crypto.test crypto.c)
+add_executable(crypto.test crypto.c core_test_utils.c)
 target_link_libraries(crypto.test crypto unit)
 
 add_executable(swim.test swim.c swim_test_transport.c swim_test_ev.c
-               swim_test_utils.c ${PROJECT_SOURCE_DIR}/src/version.c)
+               swim_test_utils.c ${PROJECT_SOURCE_DIR}/src/version.c core_test_utils.c)
 target_link_libraries(swim.test unit swim)
 
 add_executable(swim_proto.test swim_proto.c swim_test_transport.c swim_test_ev.c
-               swim_test_utils.c ${PROJECT_SOURCE_DIR}/src/version.c)
+               swim_test_utils.c ${PROJECT_SOURCE_DIR}/src/version.c core_test_utils.c)
 target_link_libraries(swim_proto.test unit swim)
 
 add_executable(swim_errinj.test swim_errinj.c swim_test_transport.c
                swim_test_ev.c swim_test_utils.c
-               ${PROJECT_SOURCE_DIR}/src/version.c)
+               ${PROJECT_SOURCE_DIR}/src/version.c core_test_utils.c)
 target_link_libraries(swim_errinj.test unit swim)
 
 add_executable(merger.test merger.test.c)
@@ -264,5 +267,5 @@ target_link_libraries(snap_quorum_delay.test box core unit)
 # Client for popen.test
 add_executable(popen-child popen-child.c)
 
-add_executable(popen.test popen.c)
+add_executable(popen.test popen.c core_test_utils.c)
 target_link_libraries(popen.test misc unit core)
diff --git a/test/unit/core_test_utils.c b/test/unit/core_test_utils.c
new file mode 100644
index 000000000..23452bbfd
--- /dev/null
+++ b/test/unit/core_test_utils.c
@@ -0,0 +1,37 @@
+/*
+ * Copyright 2010-2020, Tarantool AUTHORS, please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ *    copyright notice, this list of conditions and the
+ *    following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials
+ *    provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+/**
+ * Mock function to resolve circular dependencies in unit tests
+ */
+void cord_on_yield(void)
+{
+}
-- 
2.25.0

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [Tarantool-patches] [PATCH v2 2/2] lua: abort trace recording on fiber yield
  2020-09-23 19:06 [Tarantool-patches] [PATCH v2 0/2] Prevent JIT engine breakage on fibers switch-over Igor Munkin
  2020-09-23 19:06 ` [Tarantool-patches] [PATCH v2 1/2] fiber: introduce a callback for " Igor Munkin
@ 2020-09-23 19:06 ` Igor Munkin
  2020-09-24 13:00   ` sergos
  2020-09-29 22:41   ` Vladislav Shpilevoy
  2020-09-24 13:15 ` [Tarantool-patches] [PATCH v2 0/2] Prevent JIT engine breakage on fibers switch-over sergos
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 29+ messages in thread
From: Igor Munkin @ 2020-09-23 19:06 UTC (permalink / raw)
  To: Vladislav Shpilevoy, Sergey Ostanevich; +Cc: tarantool-patches

Since Tarantool fibers don't respect Lua coroutine switch mechanism, JIT
machinery stays unnotified when one lua_State substitutes another one.
As a result if trace recording hasn't been aborted prior to fiber
switch, the recording proceeds using the new lua_State and leads to a
failure either on any further compiler phase or while the compiled trace
is executed.

This changeset extends <cord_on_yield> routine aborting trace recording
when the fiber switches to another one. If the switch-over occurs while
mcode is being run the platform finishes its execution with EXIT_FAILURE
code and calls panic routine prior to the exit.

Closes #1700
Fixes #4491

Signed-off-by: Igor Munkin <imun@tarantool.org>
---

Here are also the benchmark results for the Release build:
* Vanilla (8477b6c) -> Patched (f1026a1) (min, median, mean, max):
| fibers: 10; iters: 100	0%	0%	0%	0%
| fibers: 10; iters: 1000	2%	2%	1%	2%
| fibers: 10; iters: 10000	6%	4%	5%	5%
| fibers: 10; iters: 100000	4%	3%	4%	1%
| fibers: 100; iters: 100	1%	0%	1%	10%
| fibers: 100; iters: 1000	2%	2%	0%	-7%
| fibers: 100; iters: 10000	2%	1%	2%	1%
| fibers: 100; iters: 100000	2%	2%	2%	5%
| fibers: 1000; iters: 100	1%	0%	0%	-2%
| fibers: 1000; iters: 1000	1%	1%	1%	0%
| fibers: 1000; iters: 10000	1%	1%	1%	2%
| fibers: 1000; iters: 100000	1%	1%	1%	1%
| fibers: 10000; iters: 100	0%	0%	0%	1%
| fibers: 10000; iters: 1000	0%	2%	3%	3%
| fibers: 10000; iters: 10000	0%	0%	0%	-2%
| fibers: 10000; iters: 100000	0%	1%	0%	-4%

 src/lua/utils.c                               | 49 +++++++++++++++++
 ...-4491-coio-wait-leads-to-segfault.test.lua | 53 +++++++++++++++++++
 2 files changed, 102 insertions(+)
 create mode 100755 test/app-tap/gh-4491-coio-wait-leads-to-segfault.test.lua

diff --git a/src/lua/utils.c b/src/lua/utils.c
index 8e98f7607..39fe4e30c 100644
--- a/src/lua/utils.c
+++ b/src/lua/utils.c
@@ -29,6 +29,7 @@
  * SUCH DAMAGE.
  */
 #include "lua/utils.h"
+#include <lj_trace.h>
 
 #include <assert.h>
 #include <errno.h>
@@ -1309,10 +1310,57 @@ tarantool_lua_utils_init(struct lua_State *L)
 	return 0;
 }
 
+/*
+ * XXX: There is already defined <panic> macro in say.h header
+ * (included in diag.h). As a result the call below is misexpanded
+ * and compilation fails with the corresponding error. To avoid
+ * this error the macro is temporary renamed and restored later.
+ * Compilation now fails for the following cases:
+ * * temporary name <_panic> is used in this translation unit
+ * * <panic> is not define, so this hack can be freely dropped
+ */
+#if defined(panic) && !defined(_panic)
+#  define _panic panic
+#  undef panic
+#else
+#  error "Can't redefine <panic> macro"
+#endif
+
 /**
  * This routine encloses the checks and actions to be done when
  * the running fiber yields the execution.
+ * Since Tarantool fibers don't switch-over the way Lua coroutines
+ * do the platform ought to notify JIT engine when one lua_State
+ * substitutes another one.
  */
 void cord_on_yield(void)
 {
+	struct global_State *g = G(tarantool_L);
+	/*
+	 * XXX: Switching fibers while running the trace leads to
+	 * code misbehaviour and failures, so stop its execution.
+	 */
+	if (unlikely(tvref(g->jit_base))) {
+		/*
+		 * XXX: mcode is executed only in scope of Lua
+		 * world and one can obtain the corresponding Lua
+		 * coroutine from the fiber storage.
+		 */
+		struct lua_State *L = fiber()->storage.lua.stack;
+		assert(L != NULL);
+		lua_pushstring(L, "Fiber is switched on the trace");
+		if (g->panic)
+			g->panic(L);
+		exit(EXIT_FAILURE);
+	}
+	/*
+	 * Unconditionally abort trace recording whether fibers
+	 * switch each other. Otherwise, further compilation may
+	 * lead to a failure on any next compiler phase.
+	 */
+	lj_trace_abort(g);
 }
+
+/* Restore <panic> macro back */
+#define panic _panic
+#undef _panic
diff --git a/test/app-tap/gh-4491-coio-wait-leads-to-segfault.test.lua b/test/app-tap/gh-4491-coio-wait-leads-to-segfault.test.lua
new file mode 100755
index 000000000..0dd8dfbee
--- /dev/null
+++ b/test/app-tap/gh-4491-coio-wait-leads-to-segfault.test.lua
@@ -0,0 +1,53 @@
+#!/usr/bin/env tarantool
+
+local tap = require('tap')
+
+local test = tap.test('gh-4491-coio-wait-leads-to-segfault')
+
+-- Test file to demonstrate platform failure due to fiber switch
+-- while trace recording, details:
+--     https://github.com/tarantool/tarantool/issues/4491
+
+local fiber = require('fiber')
+local ffi = require('ffi')
+ffi.cdef('int coio_wait(int fd, int event, double timeout);')
+
+local cfg = {
+  hotloop = arg[1] or 1,
+  fibers = arg[1] or 2,
+  timeout = { put = 1, get = 1 },
+}
+
+test:plan(cfg.fibers + 1)
+
+local args = {
+  fd      = 1   , -- STDIN file descriptor
+  event   = 0x1 , -- COIO_READ event
+  timeout = 0.05, -- Timeout value
+}
+
+local function run(iterations, channel)
+  for _ = 1, iterations do
+    ffi.C.coio_wait(args.fd, args.event, args.timeout)
+  end
+  channel:put(true, cfg.timeout.put)
+end
+
+local channels = { }
+
+jit.opt.start('3', string.format('hotloop=%d', cfg.hotloop))
+
+for _ = 1, cfg.fibers do
+  channels[_] = fiber.channel(1)
+  fiber.new(run, cfg.hotloop + 1, channels[_])
+end
+
+-- Finalize the existing fibers
+for _ = 1, cfg.fibers do
+  test:ok(channels[_]:get(cfg.timeout.get),
+          string.format('fiber #%d successfully finished', _))
+end
+
+test:ok(true, 'trace is not recorded due to fiber switch underneath coio_wait')
+
+os.exit(test:check() and 0 or 1)
-- 
2.25.0

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 1/2] fiber: introduce a callback for fibers switch-over
  2020-09-23 19:06 ` [Tarantool-patches] [PATCH v2 1/2] fiber: introduce a callback for " Igor Munkin
@ 2020-09-24 12:54   ` sergos
  2020-09-28 13:06     ` Igor Munkin
  2020-09-29 22:41   ` Vladislav Shpilevoy
  1 sibling, 1 reply; 29+ messages in thread
From: sergos @ 2020-09-24 12:54 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches, Vladislav Shpilevoy

Hi!

Thank you for the patch, see my single comment below.

Regards,
Sergos

> On 23 Sep 2020, at 22:06, Igor Munkin <imun@tarantool.org> wrote:
> 
> Tarantool integrates several complex environments together and there are
> issues occurring at their junction leading to the platform failures.
> E.g. fiber switch-over is implemented outside the Lua world, so when one
> lua_State substitutes another one, main LuaJIT engines, such as JIT and
> GC, are left unnotified leading to the further platform misbehaviour.
> 
> To solve this severe integration drawback <cord_on_yield> function is
> introduced. This routine encloses the checks and actions to be done when
> the running fiber yields the execution.
> 
> Unfortunately the way callback is implemented introduces a circular
> dependency. Considering linker symbol resolving methods for static build
> an auxiliary translation unit is added to the particular tests mocking
> (i.e. exporting) <cord_on_yield> undefined symbol.
> 
> Part of #1700
> Relates to #4491
> 
> Signed-off-by: Igor Munkin <imun@tarantool.org>
> ---
> 
> Here are also the benchmark results for the Release build:
> * Vanilla (8477b6c) -> Patched (86bfd45) (min, median, mean, max):
> | fibers: 10; iters: 100	0%	0%	0%	-3%
> | fibers: 10; iters: 1000	2%	2%	1%	-1%
> | fibers: 10; iters: 10000	6%	4%	4%	5%
> | fibers: 10; iters: 100000	4%	3%	3%	-2%
> | fibers: 100; iters: 100	0%	0%	0%	0%
> | fibers: 100; iters: 1000	0%	0%	0%	-8%
> | fibers: 100; iters: 10000	1%	0%	0%	1%
> | fibers: 100; iters: 100000	0%	0%	1%	2%
> | fibers: 1000; iters: 100	0%	0%	0%	0%
> | fibers: 1000; iters: 1000	0%	0%	0%	0%
> | fibers: 1000; iters: 10000	0%	0%	0%	1%
> | fibers: 1000; iters: 100000	0%	0%	0%	0%
> | fibers: 10000; iters: 100	-1%	-3%	-1%	0%
> | fibers: 10000; iters: 1000	0%	0%	0%	0%
> | fibers: 10000; iters: 10000	0%	0%	0%	0%
> | fibers: 10000; iters: 100000	0%	-1%	-2%	-6%
> 
> src/lib/core/fiber.c        | 10 +++++++
> src/lua/utils.c             |  8 +++++
> test/unit/CMakeLists.txt    | 59 +++++++++++++++++++------------------
> test/unit/core_test_utils.c | 37 +++++++++++++++++++++++
> 4 files changed, 86 insertions(+), 28 deletions(-)
> create mode 100644 test/unit/core_test_utils.c
> 
> diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
> index 483ae3ce1..1b7d255ff 100644
> --- a/src/lib/core/fiber.c
> +++ b/src/lib/core/fiber.c
> @@ -46,6 +46,8 @@
> #if ENABLE_FIBER_TOP
> #include <x86intrin.h> /* __rdtscp() */
> 

> +extern void cord_on_yield(void);
> +

The definition is under ENABLE_FIBER_TOP, while its use isn’t. 
The build will fail for ARM target for exmaple.
AFAICS the changes are not releveant to the fiber top in any way.

> static inline void
> clock_stat_add_delta(struct clock_stat *stat, uint64_t clock_delta)
> {
> @@ -416,6 +418,10 @@ fiber_call(struct fiber *callee)
> 	/** By convention, these triggers must not throw. */
> 	if (! rlist_empty(&caller->on_yield))
> 		trigger_run(&caller->on_yield, NULL);
> +
> +	if (cord_is_main())
> +		cord_on_yield();
> +
> 	clock_set_on_csw(caller);
> 	callee->caller = caller;
> 	callee->flags |= FIBER_IS_READY;
> @@ -645,6 +651,10 @@ fiber_yield(void)
> 	/** By convention, these triggers must not throw. */
> 	if (! rlist_empty(&caller->on_yield))
> 		trigger_run(&caller->on_yield, NULL);
> +
> +	if (cord_is_main())
> +		cord_on_yield();
> +
> 	clock_set_on_csw(caller);
> 
> 	assert(callee->flags & FIBER_IS_READY || callee == &cord->sched);
> diff --git a/src/lua/utils.c b/src/lua/utils.c
> index af114b0a2..8e98f7607 100644
> --- a/src/lua/utils.c
> +++ b/src/lua/utils.c
> @@ -1308,3 +1308,11 @@ tarantool_lua_utils_init(struct lua_State *L)
> 	luaT_newthread_ref = luaL_ref(L, LUA_REGISTRYINDEX);
> 	return 0;
> }
> +
> +/**
> + * This routine encloses the checks and actions to be done when
> + * the running fiber yields the execution.
> + */
> +void cord_on_yield(void)
> +{
> +}
> diff --git a/test/unit/CMakeLists.txt b/test/unit/CMakeLists.txt
> index 419477748..aace8cf50 100644
> --- a/test/unit/CMakeLists.txt
> +++ b/test/unit/CMakeLists.txt
> @@ -50,7 +50,7 @@ add_executable(bitset_index.test bitset_index.c)
> target_link_libraries(bitset_index.test bitset)
> add_executable(base64.test base64.c)
> target_link_libraries(base64.test misc unit)
> -add_executable(uuid.test uuid.c)
> +add_executable(uuid.test uuid.c core_test_utils.c)
> target_link_libraries(uuid.test uuid unit)
> 
> add_executable(bps_tree.test bps_tree.cc)
> @@ -69,53 +69,53 @@ add_executable(bloom.test bloom.cc)
> target_link_libraries(bloom.test salad)
> add_executable(vclock.test vclock.cc)
> target_link_libraries(vclock.test vclock unit)
> -add_executable(xrow.test xrow.cc)
> +add_executable(xrow.test xrow.cc core_test_utils.c)
> target_link_libraries(xrow.test xrow unit)
> add_executable(decimal.test decimal.c)
> target_link_libraries(decimal.test core unit)
> -add_executable(mp_error.test mp_error.cc)
> +add_executable(mp_error.test mp_error.cc core_test_utils.c)
> target_link_libraries(mp_error.test box_error core unit)
> 
> -add_executable(fiber.test fiber.cc)
> +add_executable(fiber.test fiber.cc core_test_utils.c)
> set_source_files_properties(fiber.cc PROPERTIES COMPILE_FLAGS -O0)
> target_link_libraries(fiber.test core unit)
> 
> -add_executable(fiber_stack.test fiber_stack.c)
> +add_executable(fiber_stack.test fiber_stack.c core_test_utils.c)
> set_source_files_properties(fiber_stack.c PROPERTIES COMPILE_FLAGS -O0)
> target_link_libraries(fiber_stack.test core unit)
> 
> if (NOT ENABLE_GCOV)
>     # This test is known to be broken with GCOV
> -    add_executable(guard.test guard.cc)
> +    add_executable(guard.test guard.cc core_test_utils.c)
>     target_link_libraries(guard.test core unit)
> endif ()
> 
> -add_executable(fiber_stress.test fiber_stress.cc)
> +add_executable(fiber_stress.test fiber_stress.cc core_test_utils.c)
> target_link_libraries(fiber_stress.test core)
> 
> -add_executable(fiber_cond.test fiber_cond.c unit.c)
> +add_executable(fiber_cond.test fiber_cond.c unit.c core_test_utils.c)
> target_link_libraries(fiber_cond.test core)
> 
> -add_executable(fiber_channel.test fiber_channel.cc unit.c)
> +add_executable(fiber_channel.test fiber_channel.cc unit.c core_test_utils.c)
> target_link_libraries(fiber_channel.test core)
> 
> -add_executable(fiber_channel_stress.test fiber_channel_stress.cc)
> +add_executable(fiber_channel_stress.test fiber_channel_stress.cc core_test_utils.c)
> target_link_libraries(fiber_channel_stress.test core)
> 
> -add_executable(cbus_stress.test cbus_stress.c)
> +add_executable(cbus_stress.test cbus_stress.c core_test_utils.c)
> target_link_libraries(cbus_stress.test core stat)
> 
> -add_executable(cbus.test cbus.c)
> +add_executable(cbus.test cbus.c core_test_utils.c)
> target_link_libraries(cbus.test core unit stat)
> 
> include(CheckSymbolExists)
> check_symbol_exists(__GLIBC__ features.h GLIBC_USED)
> if (GLIBC_USED)
> -    add_executable(cbus_hang.test cbus_hang.c)
> +    add_executable(cbus_hang.test cbus_hang.c core_test_utils.c)
>     target_link_libraries(cbus_hang.test core unit stat)
> endif ()
> 
> -add_executable(coio.test coio.cc)
> +add_executable(coio.test coio.cc core_test_utils.c)
> target_link_libraries(coio.test core eio bit uri unit)
> 
> if (ENABLE_BUNDLED_MSGPUCK)
> @@ -132,7 +132,7 @@ if (ENABLE_BUNDLED_MSGPUCK)
>     target_link_libraries(msgpack.test ${MSGPUCK_LIBRARIES})
> endif ()
> 
> -add_executable(scramble.test scramble.c)
> +add_executable(scramble.test scramble.c core_test_utils.c)
> target_link_libraries(scramble.test scramble)
> 
> add_executable(guava.test guava.c)
> @@ -155,7 +155,7 @@ target_link_libraries(csv.test csv)
> add_executable(json.test json.c)
> target_link_libraries(json.test json unit ${ICU_LIBRARIES})
> 
> -add_executable(rmean.test rmean.cc)
> +add_executable(rmean.test rmean.cc core_test_utils.c)
> target_link_libraries(rmean.test stat unit)
> add_executable(histogram.test histogram.c)
> target_link_libraries(histogram.test stat unit)
> @@ -170,7 +170,7 @@ target_link_libraries(luaL_iterator.test unit server coll core misc
>     ${CURL_LIBRARIES} ${LIBYAML_LIBRARIES} ${READLINE_LIBRARIES}
>     ${ICU_LIBRARIES} ${LUAJIT_LIBRARIES} ${LIB_DL})
> 
> -add_executable(say.test say.c)
> +add_executable(say.test say.c core_test_utils.c)
> target_link_libraries(say.test core unit)
> 
> set(ITERATOR_TEST_SOURCES
> @@ -182,7 +182,7 @@ set(ITERATOR_TEST_SOURCES
>     ${PROJECT_SOURCE_DIR}/src/box/vy_cache.c)
> set(ITERATOR_TEST_LIBS core tuple xrow unit)
> 
> -add_executable(vy_mem.test vy_mem.c ${ITERATOR_TEST_SOURCES})
> +add_executable(vy_mem.test vy_mem.c ${ITERATOR_TEST_SOURCES} core_test_utils.c)
> target_link_libraries(vy_mem.test ${ITERATOR_TEST_LIBS} ${LIB_DL})
> 
> add_executable(vy_point_lookup.test
> @@ -204,11 +204,13 @@ add_executable(vy_point_lookup.test
>     ${PROJECT_SOURCE_DIR}/src/box/index_def.c
>     ${PROJECT_SOURCE_DIR}/src/box/schema_def.c
>     ${PROJECT_SOURCE_DIR}/src/box/identifier.c
> +    core_test_utils.c
> )
> target_link_libraries(vy_point_lookup.test core tuple xrow xlog unit ${LIB_DL})
> 
> add_executable(column_mask.test
> -    column_mask.c)
> +    column_mask.c
> +    core_test_utils.c)
> target_link_libraries(column_mask.test tuple unit)
> 
> add_executable(vy_write_iterator.test
> @@ -217,16 +219,17 @@ add_executable(vy_write_iterator.test
>     ${PROJECT_SOURCE_DIR}/src/box/vy_upsert.c
>     ${PROJECT_SOURCE_DIR}/src/box/vy_write_iterator.c
>     ${ITERATOR_TEST_SOURCES}
> +    core_test_utils.c
> )
> target_link_libraries(vy_write_iterator.test xlog ${ITERATOR_TEST_LIBS} ${LIB_DL})
> 
> -add_executable(vy_cache.test vy_cache.c ${ITERATOR_TEST_SOURCES})
> +add_executable(vy_cache.test vy_cache.c ${ITERATOR_TEST_SOURCES} core_test_utils.c)
> target_link_libraries(vy_cache.test ${ITERATOR_TEST_LIBS} ${LIB_DL})
> 
> -add_executable(coll.test coll.cpp)
> +add_executable(coll.test coll.cpp core_test_utils.c)
> target_link_libraries(coll.test coll unit misc ${LIB_DL})
> 
> -add_executable(tuple_bigref.test tuple_bigref.c)
> +add_executable(tuple_bigref.test tuple_bigref.c core_test_utils.c)
> target_link_libraries(tuple_bigref.test tuple unit)
> 
> add_executable(checkpoint_schedule.test
> @@ -235,23 +238,23 @@ add_executable(checkpoint_schedule.test
> )
> target_link_libraries(checkpoint_schedule.test m unit)
> 
> -add_executable(sio.test sio.c)
> +add_executable(sio.test sio.c core_test_utils.c)
> target_link_libraries(sio.test unit core)
> 
> -add_executable(crypto.test crypto.c)
> +add_executable(crypto.test crypto.c core_test_utils.c)
> target_link_libraries(crypto.test crypto unit)
> 
> add_executable(swim.test swim.c swim_test_transport.c swim_test_ev.c
> -               swim_test_utils.c ${PROJECT_SOURCE_DIR}/src/version.c)
> +               swim_test_utils.c ${PROJECT_SOURCE_DIR}/src/version.c core_test_utils.c)
> target_link_libraries(swim.test unit swim)
> 
> add_executable(swim_proto.test swim_proto.c swim_test_transport.c swim_test_ev.c
> -               swim_test_utils.c ${PROJECT_SOURCE_DIR}/src/version.c)
> +               swim_test_utils.c ${PROJECT_SOURCE_DIR}/src/version.c core_test_utils.c)
> target_link_libraries(swim_proto.test unit swim)
> 
> add_executable(swim_errinj.test swim_errinj.c swim_test_transport.c
>                swim_test_ev.c swim_test_utils.c
> -               ${PROJECT_SOURCE_DIR}/src/version.c)
> +               ${PROJECT_SOURCE_DIR}/src/version.c core_test_utils.c)
> target_link_libraries(swim_errinj.test unit swim)
> 
> add_executable(merger.test merger.test.c)
> @@ -264,5 +267,5 @@ target_link_libraries(snap_quorum_delay.test box core unit)
> # Client for popen.test
> add_executable(popen-child popen-child.c)
> 
> -add_executable(popen.test popen.c)
> +add_executable(popen.test popen.c core_test_utils.c)
> target_link_libraries(popen.test misc unit core)
> diff --git a/test/unit/core_test_utils.c b/test/unit/core_test_utils.c
> new file mode 100644
> index 000000000..23452bbfd
> --- /dev/null
> +++ b/test/unit/core_test_utils.c
> @@ -0,0 +1,37 @@
> +/*
> + * Copyright 2010-2020, Tarantool AUTHORS, please see AUTHORS file.
> + *
> + * Redistribution and use in source and binary forms, with or
> + * without modification, are permitted provided that the following
> + * conditions are met:
> + *
> + * 1. Redistributions of source code must retain the above
> + *    copyright notice, this list of conditions and the
> + *    following disclaimer.
> + *
> + * 2. Redistributions in binary form must reproduce the above
> + *    copyright notice, this list of conditions and the following
> + *    disclaimer in the documentation and/or other materials
> + *    provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
> + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
> + * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
> + * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
> + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
> + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
> + * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE.
> + */
> +
> +/**
> + * Mock function to resolve circular dependencies in unit tests
> + */
> +void cord_on_yield(void)
> +{
> +}
> -- 
> 2.25.0
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 2/2] lua: abort trace recording on fiber yield
  2020-09-23 19:06 ` [Tarantool-patches] [PATCH v2 2/2] lua: abort trace recording on fiber yield Igor Munkin
@ 2020-09-24 13:00   ` sergos
  2020-09-28 13:07     ` Igor Munkin
  2020-09-29 22:41   ` Vladislav Shpilevoy
  1 sibling, 1 reply; 29+ messages in thread
From: sergos @ 2020-09-24 13:00 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches, Vladislav Shpilevoy

Hi!

Thanks for the patch, see my 2 comments below.


> On 23 Sep 2020, at 22:06, Igor Munkin <imun@tarantool.org> wrote:
> 
> Since Tarantool fibers don't respect Lua coroutine switch mechanism, JIT
> machinery stays unnotified when one lua_State substitutes another one.
> As a result if trace recording hasn't been aborted prior to fiber
> switch, the recording proceeds using the new lua_State and leads to a
> failure either on any further compiler phase or while the compiled trace
> is executed.
> 
> This changeset extends <cord_on_yield> routine aborting trace recording
> when the fiber switches to another one. If the switch-over occurs while
> mcode is being run the platform finishes its execution with EXIT_FAILURE
> code and calls panic routine prior to the exit.
> 
> Closes #1700
> Fixes #4491
> 
> Signed-off-by: Igor Munkin <imun@tarantool.org>
> ---
> 
> Here are also the benchmark results for the Release build:
> * Vanilla (8477b6c) -> Patched (f1026a1) (min, median, mean, max):
> | fibers: 10; iters: 100	0%	0%	0%	0%
> | fibers: 10; iters: 1000	2%	2%	1%	2%
> | fibers: 10; iters: 10000	6%	4%	5%	5%
> | fibers: 10; iters: 100000	4%	3%	4%	1%
> | fibers: 100; iters: 100	1%	0%	1%	10%
> | fibers: 100; iters: 1000	2%	2%	0%	-7%
> | fibers: 100; iters: 10000	2%	1%	2%	1%
> | fibers: 100; iters: 100000	2%	2%	2%	5%
> | fibers: 1000; iters: 100	1%	0%	0%	-2%
> | fibers: 1000; iters: 1000	1%	1%	1%	0%
> | fibers: 1000; iters: 10000	1%	1%	1%	2%
> | fibers: 1000; iters: 100000	1%	1%	1%	1%
> | fibers: 10000; iters: 100	0%	0%	0%	1%
> | fibers: 10000; iters: 1000	0%	2%	3%	3%
> | fibers: 10000; iters: 10000	0%	0%	0%	-2%
> | fibers: 10000; iters: 100000	0%	1%	0%	-4%
> 
> src/lua/utils.c                               | 49 +++++++++++++++++
> ...-4491-coio-wait-leads-to-segfault.test.lua | 53 +++++++++++++++++++
> 2 files changed, 102 insertions(+)
> create mode 100755 test/app-tap/gh-4491-coio-wait-leads-to-segfault.test.lua
> 
> diff --git a/src/lua/utils.c b/src/lua/utils.c
> index 8e98f7607..39fe4e30c 100644
> --- a/src/lua/utils.c
> +++ b/src/lua/utils.c
> @@ -29,6 +29,7 @@
>  * SUCH DAMAGE.
>  */
> #include "lua/utils.h"
> +#include <lj_trace.h>
> 
> #include <assert.h>
> #include <errno.h>
> @@ -1309,10 +1310,57 @@ tarantool_lua_utils_init(struct lua_State *L)
> 	return 0;
> }
> 
> +/*
> + * XXX: There is already defined <panic> macro in say.h header
> + * (included in diag.h). As a result the call below is misexpanded
> + * and compilation fails with the corresponding error. To avoid
> + * this error the macro is temporary renamed and restored later.
> + * Compilation now fails for the following cases:
> + * * temporary name <_panic> is used in this translation unit

I would propose some more sophisitcated name to avoid possible overlap 
with ‘hidden’ _something convention. Since it is Tarantool sources, a
reference to the gh will be just fine, like panic_gh1700.

> + * * <panic> is not define, so this hack can be freely dropped
> + */
> +#if defined(panic) && !defined(_panic)
> +#  define _panic panic
> +#  undef panic
> +#else
> +#  error "Can't redefine <panic> macro"
> +#endif
> +
> /**
>  * This routine encloses the checks and actions to be done when
>  * the running fiber yields the execution.
> + * Since Tarantool fibers don't switch-over the way Lua coroutines
> + * do the platform ought to notify JIT engine when one lua_State
> + * substitutes another one.
>  */
> void cord_on_yield(void)
> {
> +	struct global_State *g = G(tarantool_L);
> +	/*
> +	 * XXX: Switching fibers while running the trace leads to
> +	 * code misbehaviour and failures, so stop its execution.
> +	 */
> +	if (unlikely(tvref(g->jit_base))) {
> +		/*
> +		 * XXX: mcode is executed only in scope of Lua
> +		 * world and one can obtain the corresponding Lua
> +		 * coroutine from the fiber storage.
> +		 */
> +		struct lua_State *L = fiber()->storage.lua.stack;
> +		assert(L != NULL);
> +		lua_pushstring(L, "Fiber is switched on the trace”);

To me it is very much ’Too long WAL write’ style. Everything said,
nothing is clear. Could you please elaborate on the ‘FFI code’ and 
‘led to a yield’ and probably the fiber number.

> +		if (g->panic)
> +			g->panic(L);
> +		exit(EXIT_FAILURE);
> +	}
> +	/*
> +	 * Unconditionally abort trace recording whether fibers
> +	 * switch each other. Otherwise, further compilation may
> +	 * lead to a failure on any next compiler phase.
> +	 */
> +	lj_trace_abort(g);
> }
> +
> +/* Restore <panic> macro back */
> +#define panic _panic
> +#undef _panic
> diff --git a/test/app-tap/gh-4491-coio-wait-leads-to-segfault.test.lua b/test/app-tap/gh-4491-coio-wait-leads-to-segfault.test.lua
> new file mode 100755
> index 000000000..0dd8dfbee
> --- /dev/null
> +++ b/test/app-tap/gh-4491-coio-wait-leads-to-segfault.test.lua
> @@ -0,0 +1,53 @@
> +#!/usr/bin/env tarantool
> +
> +local tap = require('tap')
> +
> +local test = tap.test('gh-4491-coio-wait-leads-to-segfault')
> +
> +-- Test file to demonstrate platform failure due to fiber switch
> +-- while trace recording, details:
> +--     https://github.com/tarantool/tarantool/issues/4491
> +
> +local fiber = require('fiber')
> +local ffi = require('ffi')
> +ffi.cdef('int coio_wait(int fd, int event, double timeout);')
> +
> +local cfg = {
> +  hotloop = arg[1] or 1,
> +  fibers = arg[1] or 2,
> +  timeout = { put = 1, get = 1 },
> +}
> +
> +test:plan(cfg.fibers + 1)
> +
> +local args = {
> +  fd      = 1   , -- STDIN file descriptor
> +  event   = 0x1 , -- COIO_READ event
> +  timeout = 0.05, -- Timeout value
> +}
> +
> +local function run(iterations, channel)
> +  for _ = 1, iterations do
> +    ffi.C.coio_wait(args.fd, args.event, args.timeout)
> +  end
> +  channel:put(true, cfg.timeout.put)
> +end
> +
> +local channels = { }
> +
> +jit.opt.start('3', string.format('hotloop=%d', cfg.hotloop))
> +
> +for _ = 1, cfg.fibers do
> +  channels[_] = fiber.channel(1)
> +  fiber.new(run, cfg.hotloop + 1, channels[_])
> +end
> +
> +-- Finalize the existing fibers
> +for _ = 1, cfg.fibers do
> +  test:ok(channels[_]:get(cfg.timeout.get),
> +          string.format('fiber #%d successfully finished', _))
> +end
> +
> +test:ok(true, 'trace is not recorded due to fiber switch underneath coio_wait')
> +
> +os.exit(test:check() and 0 or 1)
> -- 
> 2.25.0
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 0/2] Prevent JIT engine breakage on fibers switch-over
  2020-09-23 19:06 [Tarantool-patches] [PATCH v2 0/2] Prevent JIT engine breakage on fibers switch-over Igor Munkin
  2020-09-23 19:06 ` [Tarantool-patches] [PATCH v2 1/2] fiber: introduce a callback for " Igor Munkin
  2020-09-23 19:06 ` [Tarantool-patches] [PATCH v2 2/2] lua: abort trace recording on fiber yield Igor Munkin
@ 2020-09-24 13:15 ` sergos
  2020-09-28 13:06   ` Igor Munkin
  2020-10-01 21:25 ` Vladislav Shpilevoy
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 29+ messages in thread
From: sergos @ 2020-09-24 13:15 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches, Vladislav Shpilevoy

Hi!

Thanks for the patch, please consider my 2 cents below. 

Sergos

> On 23 Sep 2020, at 22:06, Igor Munkin <imun@tarantool.org> wrote:
> 
> There was a long discussion about the patch correctness and its
> performance impact in v1 thread[1]. The benchmarks provided in v1 showed
> this implementation as the least nerfing the platform overall
> performance even for such synthetic test[2]. One can find the relevant
> benchmarks results in the corresponding patches.
> 
> Changes in v2:
> * implement the callback in a different way to negate its perf impact
> 
> @ChangeLog:
> * Fixed fibers switch-over to prevent JIT machinery misbehaviour. Trace
>  recording is aborted when fiber yields the execution. The yield
>  occuring while mcode is being run leads to the platform panic

Could you please rephrase the last sentence to be more clear, like
yielding from a function called directly from the jitted code (means -
through FFI) will cause panic. 

>  (gh-1700, gh-4491).
> 
> Branch: https://github.com/tarantool/tarantool/tree/imun/gh-1700-abort-recording-on-fiber-switch
> Issues:
> * https://github.com/tarantool/tarantool/issues/1700
> * https://github.com/tarantool/tarantool/issues/4491
> 
> [1]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-March/015290.html
> [2]: https://gist.github.com/igormunkin/7e0cf48005bd003ffbdf30181eedb40e
> 
> Igor Munkin (2):
>  fiber: introduce a callback for fibers switch-over
>  lua: abort trace recording on fiber yield
> 
> src/lib/core/fiber.c                          | 10 ++++
> src/lua/utils.c                               | 57 ++++++++++++++++++
> ...-4491-coio-wait-leads-to-segfault.test.lua | 53 +++++++++++++++++
> test/unit/CMakeLists.txt                      | 59 ++++++++++---------
> test/unit/core_test_utils.c                   | 37 ++++++++++++
> 5 files changed, 188 insertions(+), 28 deletions(-)
> create mode 100755 test/app-tap/gh-4491-coio-wait-leads-to-segfault.test.lua
> create mode 100644 test/unit/core_test_utils.c
> 
> -- 
> 2.25.0
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 0/2] Prevent JIT engine breakage on fibers switch-over
  2020-09-24 13:15 ` [Tarantool-patches] [PATCH v2 0/2] Prevent JIT engine breakage on fibers switch-over sergos
@ 2020-09-28 13:06   ` Igor Munkin
  2020-09-29  9:14     ` Sergey Ostanevich
  0 siblings, 1 reply; 29+ messages in thread
From: Igor Munkin @ 2020-09-28 13:06 UTC (permalink / raw)
  To: sergos; +Cc: tarantool-patches, Vladislav Shpilevoy

Sergos,

Thanks for your review!

On 24.09.20, sergos@tarantool.org wrote:
> Hi!
> 
> Thanks for the patch, please consider my 2 cents below.
> 
> Sergos
> 
> > On 23 Sep 2020, at 22:06, Igor Munkin <imun@tarantool.org> wrote:
> > 

<snipped>

> > 
> > @ChangeLog:
> > * Fixed fibers switch-over to prevent JIT machinery misbehaviour. Trace
> >  recording is aborted when fiber yields the execution. The yield
> >  occuring while mcode is being run leads to the platform panic
> 
> Could you please rephrase the last sentence to be more clear, like
> yielding from a function called directly from the jitted code (means -
> through FFI) will cause panic.

I guess the following wording is quite close to the one you desire:

@ChangeLog:
* Fixed fibers switch-over to prevent JIT machinery misbehaviour. Trace
  recording is aborted when fiber yields the execution. The yield
  occuring while the compiled code is being run (it's likely a function
  with a yield underneath called via LuaJIT FFI) leads to the platform
  panic (gh-1700, gh-4491).

> 
> >  (gh-1700, gh-4491).
> > 

<snipped>

> > 
> > -- 
> > 2.25.0
> > 
> 

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 1/2] fiber: introduce a callback for fibers switch-over
  2020-09-24 12:54   ` sergos
@ 2020-09-28 13:06     ` Igor Munkin
  2020-09-29  9:15       ` Sergey Ostanevich
  0 siblings, 1 reply; 29+ messages in thread
From: Igor Munkin @ 2020-09-28 13:06 UTC (permalink / raw)
  To: sergos; +Cc: tarantool-patches, Vladislav Shpilevoy

Sergos,

Thanks for your review!

On 24.09.20, sergos@tarantool.org wrote:
> Hi!
> 
> Thank you for the patch, see my single comment below.
> 
> Regards,
> Sergos
> 
> > On 23 Sep 2020, at 22:06, Igor Munkin <imun@tarantool.org> wrote:
> > 

<snipped>

> > 
> > diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
> > index 483ae3ce1..1b7d255ff 100644
> > --- a/src/lib/core/fiber.c
> > +++ b/src/lib/core/fiber.c
> > @@ -46,6 +46,8 @@
> > #if ENABLE_FIBER_TOP
> > #include <x86intrin.h> /* __rdtscp() */
> > 
> 
> > +extern void cord_on_yield(void);
> > +
> 
> The definition is under ENABLE_FIBER_TOP, while its use isn’t. 
> The build will fail for ARM target for exmaple.
> AFAICS the changes are not releveant to the fiber top in any way.

Crap... Thanks for noticing! Fixed:

================================================================================

diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
index 1b7d255..223c841 100644
--- a/src/lib/core/fiber.c
+++ b/src/lib/core/fiber.c
@@ -43,11 +43,11 @@
 #include "trigger.h"
 #include "errinj.h"
 
+extern void cord_on_yield(void);
+
 #if ENABLE_FIBER_TOP
 #include <x86intrin.h> /* __rdtscp() */
 
-extern void cord_on_yield(void);
-
 static inline void
 clock_stat_add_delta(struct clock_stat *stat, uint64_t clock_delta)
 {

================================================================================

Side note: I guess we have to reprioritize ARM testing issue[1].

> 
> > static inline void
> > clock_stat_add_delta(struct clock_stat *stat, uint64_t clock_delta)
> > {

<snipped>

> > 
> 

[1]: https://github.com/tarantool/tarantool/issues/4270

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 2/2] lua: abort trace recording on fiber yield
  2020-09-24 13:00   ` sergos
@ 2020-09-28 13:07     ` Igor Munkin
  2020-09-28 15:36       ` Igor Munkin
  0 siblings, 1 reply; 29+ messages in thread
From: Igor Munkin @ 2020-09-28 13:07 UTC (permalink / raw)
  To: sergos; +Cc: tarantool-patches, Vladislav Shpilevoy

Sergos,

Thanks for your review!

On 24.09.20, sergos@tarantool.org wrote:
> Hi!
> 
> Thanks for the patch, see my 2 comments below.
> 

<snipped>

> > 
> > diff --git a/src/lua/utils.c b/src/lua/utils.c
> > index 8e98f7607..39fe4e30c 100644
> > --- a/src/lua/utils.c
> > +++ b/src/lua/utils.c
> > @@ -29,6 +29,7 @@
> >  * SUCH DAMAGE.
> >  */
> > #include "lua/utils.h"
> > +#include <lj_trace.h>
> > 
> > #include <assert.h>
> > #include <errno.h>
> > @@ -1309,10 +1310,57 @@ tarantool_lua_utils_init(struct lua_State *L)
> > 	return 0;
> > }
> > 
> > +/*
> > + * XXX: There is already defined <panic> macro in say.h header
> > + * (included in diag.h). As a result the call below is misexpanded
> > + * and compilation fails with the corresponding error. To avoid
> > + * this error the macro is temporary renamed and restored later.
> > + * Compilation now fails for the following cases:
> > + * * temporary name <_panic> is used in this translation unit
> 
> I would propose some more sophisitcated name to avoid possible overlap 
> with ‘hidden’ _something convention. Since it is Tarantool sources, a
> reference to the gh will be just fine, like panic_gh1700.

Sounds better, thanks. Adjusted:

================================================================================

diff --git a/src/lua/utils.c b/src/lua/utils.c
index d2ad0570a..b36824ebf 100644
--- a/src/lua/utils.c
+++ b/src/lua/utils.c
@@ -1316,11 +1316,12 @@ tarantool_lua_utils_init(struct lua_State *L)
  * and compilation fails with the corresponding error. To avoid
  * this error the macro is temporary renamed and restored later.
  * Compilation now fails for the following cases:
- * * temporary name <_panic> is used in this translation unit
+ * * temporary name <gh1700_panic> is used in scope of this
+ *   translation unit
  * * <panic> is not define, so this hack can be freely dropped
  */
-#if defined(panic) && !defined(_panic)
-#  define _panic panic
+#if defined(panic) && !defined(gh1700_panic)
+#  define gh1700_panic panic
 #  undef panic
 #else
 #  error "Can't redefine <panic> macro"
@@ -1362,5 +1363,5 @@ void cord_on_yield(void)
 }
 
 /* Restore <panic> macro back */
-#define panic _panic
-#undef _panic
+#define panic gh1700_panic
+#undef gh1700_panic

================================================================================

> 
> > + * * <panic> is not define, so this hack can be freely dropped
> > + */
> > +#if defined(panic) && !defined(_panic)
> > +#  define _panic panic
> > +#  undef panic
> > +#else
> > +#  error "Can't redefine <panic> macro"
> > +#endif
> > +
> > /**
> >  * This routine encloses the checks and actions to be done when
> >  * the running fiber yields the execution.
> > + * Since Tarantool fibers don't switch-over the way Lua coroutines
> > + * do the platform ought to notify JIT engine when one lua_State
> > + * substitutes another one.
> >  */
> > void cord_on_yield(void)
> > {
> > +	struct global_State *g = G(tarantool_L);
> > +	/*
> > +	 * XXX: Switching fibers while running the trace leads to
> > +	 * code misbehaviour and failures, so stop its execution.
> > +	 */
> > +	if (unlikely(tvref(g->jit_base))) {
> > +		/*
> > +		 * XXX: mcode is executed only in scope of Lua
> > +		 * world and one can obtain the corresponding Lua
> > +		 * coroutine from the fiber storage.
> > +		 */
> > +		struct lua_State *L = fiber()->storage.lua.stack;
> > +		assert(L != NULL);
> > +		lua_pushstring(L, "Fiber is switched on the trace”);
> 
> To me it is very much ’Too long WAL write’ style. Everything said,
> nothing is clear. Could you please elaborate on the ‘FFI code’ and
> ‘led to a yield’ and probably the fiber number.

Agree, extended the error the following way:

================================================================================

diff --git a/src/lua/utils.c b/src/lua/utils.c
index b36824ebf..a8b66d34e 100644
--- a/src/lua/utils.c
+++ b/src/lua/utils.c
@@ -1349,7 +1349,10 @@ void cord_on_yield(void)
 		 */
 		struct lua_State *L = fiber()->storage.lua.stack;
 		assert(L != NULL);
-		lua_pushstring(L, "Fiber is switched on the trace");
+		lua_pushfstring(L, "fiber %d is switched while running the"
+				" compiled code (it's likely a function with"
+				" a yield underneath called via LuaJIT FFI)",
+				fiber()->fid);
 		if (g->panic)
 			g->panic(L);
 		exit(EXIT_FAILURE);

================================================================================

> 
> > +		if (g->panic)
> > +			g->panic(L);
> > +		exit(EXIT_FAILURE);
> > +	}
> > +	/*
> > +	 * Unconditionally abort trace recording whether fibers
> > +	 * switch each other. Otherwise, further compilation may
> > +	 * lead to a failure on any next compiler phase.
> > +	 */
> > +	lj_trace_abort(g);
> > }
> > +
> > +/* Restore <panic> macro back */
> > +#define panic _panic
> > +#undef _panic

<snipped>

> > -- 
> > 2.25.0
> > 
> 

Besides, I added a test for the panic case. Here is the diff:

================================================================================

diff --git a/test/app-tap/CMakeLists.txt b/test/app-tap/CMakeLists.txt
index ee67cf533..bf7d28136 100644
--- a/test/app-tap/CMakeLists.txt
+++ b/test/app-tap/CMakeLists.txt
@@ -1 +1,2 @@
 build_module(module_api module_api.c)
+build_module(libyield libyield.c)
diff --git a/test/app-tap/gh-1700-abort-recording-on-fiber-switch.test.lua b/test/app-tap/gh-1700-abort-recording-on-fiber-switch.test.lua
new file mode 100755
index 000000000..027dee219
--- /dev/null
+++ b/test/app-tap/gh-1700-abort-recording-on-fiber-switch.test.lua
@@ -0,0 +1,77 @@
+#!/usr/bin/env tarantool
+
+if #arg == 0 then
+  local checks = {
+    {
+      arg = {
+        1, -- hotloop (arg[1])
+        1, -- trigger (arg[2])
+      },
+      res = 'OK',
+      msg = 'Trace is aborted',
+    },
+    {
+      arg = {
+        1, -- hotloop (arg[1])
+        2, -- trigger (arg[2])
+      },
+      res = 'fiber %d+ is switched while running the compiled code %b()',
+      msg = 'Trace is recorded',
+    },
+  }
+
+  local tap = require('tap')
+  local test = tap.test('gh-1700-abort-recording-on-fiber-switch')
+
+  test:plan(#checks)
+
+  local vars = {
+    LUABIN = arg[-1],
+    SCRIPT = arg[0],
+    PATH   = arg[0]:gsub('/?[^/]+%.test%.lua', ''),
+    SUFFIX = package.cpath:match('?.(%a+);'),
+  }
+
+  local cmd = string.gsub('LUA_CPATH="$LUA_CPATH;<PATH>/?.<SUFFIX>" ' ..
+                          'LUA_PATH="$LUA_PATH;<PATH>/?.lua" ' ..
+                          'LD_LIBRARY_PATH=<PATH> ' ..
+                          '<LUABIN> 2>&1 <SCRIPT>', '%<(%w+)>', vars)
+
+  for _, ch in pairs(checks) do
+    local res
+    local proc = io.popen((cmd .. (' %s'):rep(#ch.arg)):format(unpack(ch.arg)))
+    for s in proc:lines() do res = s end
+    assert(res, 'proc:lines failed')
+    test:like(res, ch.res, ch.msg)
+  end
+
+  os.exit(test:check() and 0 or 1)
+end
+
+
+-- Test body.
+
+local cfg = {
+  hotloop = arg[1] or 1,
+  trigger = arg[2] or 1,
+}
+
+local ffi = require('ffi')
+local ffiyield = ffi.load('libyield')
+ffi.cdef('void yield(struct yield *state, int i)')
+
+-- Set the value to trigger <yield> call switch the running fuber.
+local yield = require('libyield')(cfg.trigger)
+
+-- Depending on trigger and hotloop values the following contexts
+-- are possible:
+-- * if trigger <= hotloop -> trace recording is aborted
+-- * if trigger >  hotloop -> trace is recorded but execution
+--   leads to panic
+jit.opt.start("3", string.format("hotloop=%d", cfg.hotloop))
+
+for i = 0, cfg.trigger + cfg.hotloop do
+  ffiyield.yield(yield, i)
+end
+-- Panic didn't occur earlier.
+print('OK')
diff --git a/test/app-tap/libyield.c b/test/app-tap/libyield.c
new file mode 100644
index 000000000..1b7221a8e
--- /dev/null
+++ b/test/app-tap/libyield.c
@@ -0,0 +1,29 @@
+#include <lua.h>
+#include <module.h>
+
+struct yield {
+	int trigger;  /* Trigger for yielding the fiber execution */
+};
+
+void yield(struct yield *state, int i)
+{
+	if (i < state->trigger)
+		return;
+
+	/* Fiber yields the execution for a jiffy */
+	fiber_sleep(0);
+}
+
+static int init(lua_State *L)
+{
+	struct yield *state = lua_newuserdata(L, sizeof(struct yield));
+
+	state->trigger = lua_tonumber(L, 1);
+	return 1;
+}
+
+LUA_API int luaopen_libyield(lua_State *L)
+{
+	lua_pushcfunction(L, init);
+	return 1;
+}

================================================================================

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 2/2] lua: abort trace recording on fiber yield
  2020-09-28 13:07     ` Igor Munkin
@ 2020-09-28 15:36       ` Igor Munkin
  2020-09-28 16:37         ` Igor Munkin
  0 siblings, 1 reply; 29+ messages in thread
From: Igor Munkin @ 2020-09-28 15:36 UTC (permalink / raw)
  To: sergos; +Cc: tarantool-patches, Vladislav Shpilevoy

There was an issue with out of source build in the newly added test[1].
Here is the fix:

================================================================================

diff --git a/test/app-tap/gh-1700-abort-recording-on-fiber-switch.test.lua b/test/app-tap/gh-1700-abort-recording-on-fiber-swit
ch.test.lua
index 027dee219..f4ceb9455 100755
--- a/test/app-tap/gh-1700-abort-recording-on-fiber-switch.test.lua
+++ b/test/app-tap/gh-1700-abort-recording-on-fiber-switch.test.lua
@@ -28,7 +28,8 @@ if #arg == 0 then
   local vars = {
     LUABIN = arg[-1],
     SCRIPT = arg[0],
-    PATH   = arg[0]:gsub('/?[^/]+%.test%.lua', ''),
+    -- To support out-of-source build use relative paths in repo
+    PATH   = arg[-1]:gsub('src/tarantool', 'test/app-tap'),
     SUFFIX = package.cpath:match('?.(%a+);'),
   }

================================================================================

[1]: https://gitlab.com/tarantool/tarantool/-/jobs/760937971

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 2/2] lua: abort trace recording on fiber yield
  2020-09-28 15:36       ` Igor Munkin
@ 2020-09-28 16:37         ` Igor Munkin
  2020-09-28 17:45           ` Igor Munkin
  0 siblings, 1 reply; 29+ messages in thread
From: Igor Munkin @ 2020-09-28 16:37 UTC (permalink / raw)
  To: sergos; +Cc: tarantool-patches, Vladislav Shpilevoy

Well, the issue still occurs for static-build[1]. Too many
'src/tarantool' in the path and more specific pattern is needed. Here
are the changes:

================================================================================

diff --git a/test/app-tap/gh-1700-abort-recording-on-fiber-switch.test.lua b/test/app-tap/gh-1700-abort-recording-on-fiber-swit
ch.test.lua
index f4ceb9455..7b9156c22 100755
--- a/test/app-tap/gh-1700-abort-recording-on-fiber-switch.test.lua
+++ b/test/app-tap/gh-1700-abort-recording-on-fiber-switch.test.lua
@@ -29,7 +29,7 @@ if #arg == 0 then
     LUABIN = arg[-1],
     SCRIPT = arg[0],
     -- To support out-of-source build use relative paths in repo
-    PATH   = arg[-1]:gsub('src/tarantool', 'test/app-tap'),
+    PATH   = arg[-1]:gsub('src/tarantool$', 'test/app-tap'),
     SUFFIX = package.cpath:match('?.(%a+);'),
   }
 

================================================================================

[1]: https://gitlab.com/tarantool/tarantool/-/jobs/761207911

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 2/2] lua: abort trace recording on fiber yield
  2020-09-28 16:37         ` Igor Munkin
@ 2020-09-28 17:45           ` Igor Munkin
  2020-09-29  9:24             ` Sergey Ostanevich
  0 siblings, 1 reply; 29+ messages in thread
From: Igor Munkin @ 2020-09-28 17:45 UTC (permalink / raw)
  To: sergos; +Cc: tarantool-patches, Vladislav Shpilevoy

Meh, the last one (I hope): I forgot there are the issues with JIT on
FreeBSD[1]. So I added skipcond file for the newly added test:

================================================================================

diff --git a/test/app-tap/gh-1700-abort-recording-on-fiber-switch.skipcond b/test/app-tap/gh-1700-abort-recording-on-fiber-switch.skipcond
new file mode 100644
index 000000000..2a2ec4d97
--- /dev/null
+++ b/test/app-tap/gh-1700-abort-recording-on-fiber-switch.skipcond
@@ -0,0 +1,7 @@
+import platform
+
+# Disabled on FreeBSD due to #4819.
+if platform.system() == 'FreeBSD':
+    self.skip = 1
+
+# vim: set ft=python:

================================================================================

[1]: https://github.com/tarantool/tarantool/issues/4819

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 0/2] Prevent JIT engine breakage on fibers switch-over
  2020-09-28 13:06   ` Igor Munkin
@ 2020-09-29  9:14     ` Sergey Ostanevich
  0 siblings, 0 replies; 29+ messages in thread
From: Sergey Ostanevich @ 2020-09-29  9:14 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches, Vladislav Shpilevoy

Hi!

Thanks for the update, LGTM.

Sergos

On 28 сен 16:06, Igor Munkin wrote:
> Sergos,
> 
> Thanks for your review!
> 
> On 24.09.20, sergos@tarantool.org wrote:
> > Hi!
> > 
> > Thanks for the patch, please consider my 2 cents below.
> > 
> > Sergos
> > 
> > > On 23 Sep 2020, at 22:06, Igor Munkin <imun@tarantool.org> wrote:
> > > 
> 
> <snipped>
> 
> > > 
> > > @ChangeLog:
> > > * Fixed fibers switch-over to prevent JIT machinery misbehaviour. Trace
> > >  recording is aborted when fiber yields the execution. The yield
> > >  occuring while mcode is being run leads to the platform panic
> > 
> > Could you please rephrase the last sentence to be more clear, like
> > yielding from a function called directly from the jitted code (means -
> > through FFI) will cause panic.
> 
> I guess the following wording is quite close to the one you desire:
> 
> @ChangeLog:
> * Fixed fibers switch-over to prevent JIT machinery misbehaviour. Trace
>   recording is aborted when fiber yields the execution. The yield
>   occuring while the compiled code is being run (it's likely a function
>   with a yield underneath called via LuaJIT FFI) leads to the platform
>   panic (gh-1700, gh-4491).
> 
> > 
> > >  (gh-1700, gh-4491).
> > > 
> 
> <snipped>
> 
> > > 
> > > -- 
> > > 2.25.0
> > > 
> > 
> 
> -- 
> Best regards,
> IM

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 1/2] fiber: introduce a callback for fibers switch-over
  2020-09-28 13:06     ` Igor Munkin
@ 2020-09-29  9:15       ` Sergey Ostanevich
  2020-09-29 10:05         ` Igor Munkin
  0 siblings, 1 reply; 29+ messages in thread
From: Sergey Ostanevich @ 2020-09-29  9:15 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches, Vladislav Shpilevoy

Hi!

Thanks for the patch, LGTM.

Sergos

On 28 сен 16:06, Igor Munkin wrote:
> Sergos,
> 
> Thanks for your review!
> 
> On 24.09.20, sergos@tarantool.org wrote:
> > Hi!
> > 
> > Thank you for the patch, see my single comment below.
> > 
> > Regards,
> > Sergos
> > 
> > > On 23 Sep 2020, at 22:06, Igor Munkin <imun@tarantool.org> wrote:
> > > 
> 
> <snipped>
> 
> > > 
> > > diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
> > > index 483ae3ce1..1b7d255ff 100644
> > > --- a/src/lib/core/fiber.c
> > > +++ b/src/lib/core/fiber.c
> > > @@ -46,6 +46,8 @@
> > > #if ENABLE_FIBER_TOP
> > > #include <x86intrin.h> /* __rdtscp() */
> > > 
> > 
> > > +extern void cord_on_yield(void);
> > > +
> > 
> > The definition is under ENABLE_FIBER_TOP, while its use isn’t. 
> > The build will fail for ARM target for exmaple.
> > AFAICS the changes are not releveant to the fiber top in any way.
> 
> Crap... Thanks for noticing! Fixed:
> 
> ================================================================================
> 
> diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
> index 1b7d255..223c841 100644
> --- a/src/lib/core/fiber.c
> +++ b/src/lib/core/fiber.c
> @@ -43,11 +43,11 @@
>  #include "trigger.h"
>  #include "errinj.h"
>  
> +extern void cord_on_yield(void);
> +
>  #if ENABLE_FIBER_TOP
>  #include <x86intrin.h> /* __rdtscp() */
>  
> -extern void cord_on_yield(void);
> -
>  static inline void
>  clock_stat_add_delta(struct clock_stat *stat, uint64_t clock_delta)
>  {
> 
> ================================================================================
> 
> Side note: I guess we have to reprioritize ARM testing issue[1].
> 
> > 
> > > static inline void
> > > clock_stat_add_delta(struct clock_stat *stat, uint64_t clock_delta)
> > > {
> 
> <snipped>
> 
> > > 
> > 
> 
> [1]: https://github.com/tarantool/tarantool/issues/4270
> 
> -- 
> Best regards,
> IM

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 2/2] lua: abort trace recording on fiber yield
  2020-09-28 17:45           ` Igor Munkin
@ 2020-09-29  9:24             ` Sergey Ostanevich
  2020-09-29 10:06               ` Igor Munkin
  0 siblings, 1 reply; 29+ messages in thread
From: Sergey Ostanevich @ 2020-09-29  9:24 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches, Vladislav Shpilevoy

Hi!

Thanks for the patch!

I wonder how many tests are failing due to #4819 and are they covered 
with skip conditions?
Also, how will we fix all those tests after FreeBSD fixup?

The patchset is LGTM to me.

Regards,
Sergos

On 28 сен 20:45, Igor Munkin wrote:
> Meh, the last one (I hope): I forgot there are the issues with JIT on
> FreeBSD[1]. So I added skipcond file for the newly added test:
> 
> ================================================================================
> 
> diff --git a/test/app-tap/gh-1700-abort-recording-on-fiber-switch.skipcond b/test/app-tap/gh-1700-abort-recording-on-fiber-switch.skipcond
> new file mode 100644
> index 000000000..2a2ec4d97
> --- /dev/null
> +++ b/test/app-tap/gh-1700-abort-recording-on-fiber-switch.skipcond
> @@ -0,0 +1,7 @@
> +import platform
> +
> +# Disabled on FreeBSD due to #4819.
> +if platform.system() == 'FreeBSD':
> +    self.skip = 1
> +
> +# vim: set ft=python:
> 
> ================================================================================
> 
> [1]: https://github.com/tarantool/tarantool/issues/4819
> 
> -- 
> Best regards,
> IM

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 1/2] fiber: introduce a callback for fibers switch-over
  2020-09-29  9:15       ` Sergey Ostanevich
@ 2020-09-29 10:05         ` Igor Munkin
  0 siblings, 0 replies; 29+ messages in thread
From: Igor Munkin @ 2020-09-29 10:05 UTC (permalink / raw)
  To: Sergey Ostanevich; +Cc: tarantool-patches, Vladislav Shpilevoy

Sergos,

Thanks for your review!

On 29.09.20, Sergey Ostanevich wrote:
> Hi!
> 
> Thanks for the patch, LGTM.

Added your tag:
| Reviewed-by: Sergey Ostanevich <sergos@tarantool.org>

> 
> Sergos
> 

<snipped>

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 2/2] lua: abort trace recording on fiber yield
  2020-09-29  9:24             ` Sergey Ostanevich
@ 2020-09-29 10:06               ` Igor Munkin
  0 siblings, 0 replies; 29+ messages in thread
From: Igor Munkin @ 2020-09-29 10:06 UTC (permalink / raw)
  To: Sergey Ostanevich; +Cc: tarantool-patches, Vladislav Shpilevoy

Sergos,

Thanks for your review!

On 29.09.20, Sergey Ostanevich wrote:
> Hi!
> 
> Thanks for the patch!
> 
> I wonder how many tests are failing due to #4819 and are they covered 
> with skip conditions?

I know only three: this one and two from LuaJIT tests (for FFI sandwich
and flushing trace while running one).

> Also, how will we fix all those tests after FreeBSD fixup?

Just remove this skipcond file. The problem is JIT fails to assemble the
recorded trace, so the mcode is not generated and the "desired" platform
panic doesn't occur. When the issue is fixed, the test should be fine.

> 
> The patchset is LGTM to me.

Added your tag:
| Reviewed-by: Sergey Ostanevich <sergos@tarantool.org>

> 
> Regards,
> Sergos
> 
> On 28 сен 20:45, Igor Munkin wrote:
> > Meh, the last one (I hope): I forgot there are the issues with JIT on
> > FreeBSD[1]. So I added skipcond file for the newly added test:
> > 

<snipped>

> > 
> > [1]: https://github.com/tarantool/tarantool/issues/4819
> > 
> > -- 
> > Best regards,
> > IM

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 1/2] fiber: introduce a callback for fibers switch-over
  2020-09-23 19:06 ` [Tarantool-patches] [PATCH v2 1/2] fiber: introduce a callback for " Igor Munkin
  2020-09-24 12:54   ` sergos
@ 2020-09-29 22:41   ` Vladislav Shpilevoy
  2020-09-30  9:30     ` Igor Munkin
  1 sibling, 1 reply; 29+ messages in thread
From: Vladislav Shpilevoy @ 2020-09-29 22:41 UTC (permalink / raw)
  To: Igor Munkin, Sergey Ostanevich; +Cc: tarantool-patches

Hi! Thanks for the patch!

> diff --git a/test/unit/CMakeLists.txt b/test/unit/CMakeLists.txt
> index 419477748..aace8cf50 100644
> --- a/test/unit/CMakeLists.txt
> +++ b/test/unit/CMakeLists.txt
> @@ -50,7 +50,7 @@ add_executable(bitset_index.test bitset_index.c)
>  target_link_libraries(bitset_index.test bitset)
>  add_executable(base64.test base64.c)
>  target_link_libraries(base64.test misc unit)
> -add_executable(uuid.test uuid.c)
> +add_executable(uuid.test uuid.c core_test_utils.c)

To make lib/core more self-sufficient we could make cord_on_yield
inlined in fiber.h under a macros like LIBCORE_USE_DEFAULT_ON_YIELD.
Which would be set by default to 1, but to 0 in the executable file.

Could help not to change the test binaries, and whatever else depends
on lib/core, but does not care about Lua.

I do not insist, but I will do that myself if someday I will need to
write a new unit test (raft, for example).

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 2/2] lua: abort trace recording on fiber yield
  2020-09-23 19:06 ` [Tarantool-patches] [PATCH v2 2/2] lua: abort trace recording on fiber yield Igor Munkin
  2020-09-24 13:00   ` sergos
@ 2020-09-29 22:41   ` Vladislav Shpilevoy
  2020-09-30  6:27     ` Igor Munkin
  1 sibling, 1 reply; 29+ messages in thread
From: Vladislav Shpilevoy @ 2020-09-29 22:41 UTC (permalink / raw)
  To: Igor Munkin, Sergey Ostanevich; +Cc: tarantool-patches

Thanks for the patch!

Functionally all looks good. See one comment related not to Lua.

> diff --git a/src/lua/utils.c b/src/lua/utils.c
> index 8e98f7607..6e38f5734 100644
> --- a/src/lua/utils.c
> +++ b/src/lua/utils.c
> @@ -1309,10 +1310,61 @@ tarantool_lua_utils_init(struct lua_State *L)
>  	return 0;
>  }
>  
> +/*
> + * XXX: There is already defined <panic> macro in say.h header
> + * (included in diag.h). As a result the call below is misexpanded
> + * and compilation fails with the corresponding error. To avoid
> + * this error the macro is temporary renamed and restored later.
> + * Compilation now fails for the following cases:
> + * * temporary name <gh1700_panic> is used in scope of this
> + *   translation unit
> + * * <panic> is not define, so this hack can be freely dropped
> + */
> +#if defined(panic) && !defined(gh1700_panic)
> +#  define gh1700_panic panic
> +#  undef panic
> +#else
> +#  error "Can't redefine <panic> macro"
> +#endif

Unfortunately, it won't work. Macros are not variables. You
can's assign them like that. I wish we could.

What you did here is you eliminated 'panic' macro, and added a
new macro 'gh1700_panic' whose value is "panic". Not what was
in the old 'panic', but exactly "panic". So the old value of 'panic'
is lost here.

Try to run gcc -E, or try to use gh1700_panic, and you will see it.
Below I tried to use it, but got an error.

	@@ -1363,6 +1363,7 @@ void cord_on_yield(void)
	 	 * lead to a failure on any next compiler phase.
	 	 */
	 	lj_trace_abort(g);
	+	gh1700_panic("message");
	 }
	tarantool/src/lua/utils.c:1366:2: error: implicit declaration of function 'panic' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
        	gh1700_panic("message");

Also when you try to restore the old 'panic', it also won't work
again. As an alternative you can just do '#undef panic', since your
code is in the end of the file anyway, it won't break shit. Or make
a separate patch, which would turn panic() into a function.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 2/2] lua: abort trace recording on fiber yield
  2020-09-29 22:41   ` Vladislav Shpilevoy
@ 2020-09-30  6:27     ` Igor Munkin
  2020-09-30 21:59       ` Vladislav Shpilevoy
  0 siblings, 1 reply; 29+ messages in thread
From: Igor Munkin @ 2020-09-30  6:27 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Vlad,

Thanks for your review!

On 30.09.20, Vladislav Shpilevoy wrote:
> Thanks for the patch!
> 
> Functionally all looks good. See one comment related not to Lua.
> 
> > diff --git a/src/lua/utils.c b/src/lua/utils.c
> > index 8e98f7607..6e38f5734 100644
> > --- a/src/lua/utils.c
> > +++ b/src/lua/utils.c
> > @@ -1309,10 +1310,61 @@ tarantool_lua_utils_init(struct lua_State *L)
> >  	return 0;
> >  }
> >  
> > +/*
> > + * XXX: There is already defined <panic> macro in say.h header
> > + * (included in diag.h). As a result the call below is misexpanded
> > + * and compilation fails with the corresponding error. To avoid
> > + * this error the macro is temporary renamed and restored later.
> > + * Compilation now fails for the following cases:
> > + * * temporary name <gh1700_panic> is used in scope of this
> > + *   translation unit
> > + * * <panic> is not define, so this hack can be freely dropped
> > + */
> > +#if defined(panic) && !defined(gh1700_panic)
> > +#  define gh1700_panic panic
> > +#  undef panic
> > +#else
> > +#  error "Can't redefine <panic> macro"
> > +#endif
> 
> Unfortunately, it won't work. Macros are not variables. You
> can's assign them like that. I wish we could.
> 
> What you did here is you eliminated 'panic' macro, and added a
> new macro 'gh1700_panic' whose value is "panic". Not what was
> in the old 'panic', but exactly "panic". So the old value of 'panic'
> is lost here.

Hell, you're right. They can be recursively expanded, but if panic macro
is undefined then it is just a bareword (as I originally wanted).

> 

<snipped>

> 
> Also when you try to restore the old 'panic', it also won't work
> again. As an alternative you can just do '#undef panic', since your
> code is in the end of the file anyway, it won't break shit. Or make
> a separate patch, which would turn panic() into a function.

I simply undefined <panic> macro (the diff is below) and filed an
issue[1] to rewrite this part.

================================================================================

diff --git a/src/lua/utils.c b/src/lua/utils.c
index 6e38f5734..bb2287162 100644
--- a/src/lua/utils.c
+++ b/src/lua/utils.c
@@ -1314,18 +1314,10 @@ tarantool_lua_utils_init(struct lua_State *L)
  * XXX: There is already defined <panic> macro in say.h header
  * (included in diag.h). As a result the call below is misexpanded
  * and compilation fails with the corresponding error. To avoid
- * this error the macro is temporary renamed and restored later.
- * Compilation now fails for the following cases:
- * * temporary name <gh1700_panic> is used in scope of this
- *   translation unit
- * * <panic> is not define, so this hack can be freely dropped
+ * this error the macro is undefined since it's not used anymore
+ * in scope of this translation unit.
  */
-#if defined(panic) && !defined(gh1700_panic)
-#  define gh1700_panic panic
-#  undef panic
-#else
-#  error "Can't redefine <panic> macro"
-#endif
+#undef panic
 
 /**
  * This routine encloses the checks and actions to be done when
@@ -1364,7 +1356,3 @@ void cord_on_yield(void)
 	 */
 	lj_trace_abort(g);
 }
-
-/* Restore <panic> macro back */
-#define panic gh1700_panic
-#undef gh1700_panic

================================================================================

[1]: https://github.com/tarantool/tarantool/issues/5365

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 1/2] fiber: introduce a callback for fibers switch-over
  2020-09-29 22:41   ` Vladislav Shpilevoy
@ 2020-09-30  9:30     ` Igor Munkin
  2020-09-30 22:00       ` Vladislav Shpilevoy
  0 siblings, 1 reply; 29+ messages in thread
From: Igor Munkin @ 2020-09-30  9:30 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Vlad,

Thanks for your review!

On 30.09.20, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patch!
> 
> > diff --git a/test/unit/CMakeLists.txt b/test/unit/CMakeLists.txt
> > index 419477748..aace8cf50 100644
> > --- a/test/unit/CMakeLists.txt
> > +++ b/test/unit/CMakeLists.txt
> > @@ -50,7 +50,7 @@ add_executable(bitset_index.test bitset_index.c)
> >  target_link_libraries(bitset_index.test bitset)
> >  add_executable(base64.test base64.c)
> >  target_link_libraries(base64.test misc unit)
> > -add_executable(uuid.test uuid.c)
> > +add_executable(uuid.test uuid.c core_test_utils.c)
> 
> To make lib/core more self-sufficient we could make cord_on_yield
> inlined in fiber.h under a macros like LIBCORE_USE_DEFAULT_ON_YIELD.
> Which would be set by default to 1, but to 0 in the executable file.

Oh, this is a nice one. It might be a solution but...

> 
> Could help not to change the test binaries, and whatever else depends
> on lib/core, but does not care about Lua.

...unfortunately it's not. At first, we need to build both libraries
with the new define set to 1 and to 0. Furthermore, this leads to link
all tests against "coretest" instead of original "core". However, tests
implicitly dependent on "core" are still linked against it, so the issue
is not solved.

> 
> I do not insist, but I will do that myself if someday I will need to
> write a new unit test (raft, for example).

I would be glad to fix the issue, so let's return to it a bit later.
I hope there is a CMake recipe or hack for this case.

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 2/2] lua: abort trace recording on fiber yield
  2020-09-30  6:27     ` Igor Munkin
@ 2020-09-30 21:59       ` Vladislav Shpilevoy
  2020-10-01  6:14         ` Igor Munkin
  0 siblings, 1 reply; 29+ messages in thread
From: Vladislav Shpilevoy @ 2020-09-30 21:59 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches

Hi! Thanks for the fixes!

They are good, it seems, but you didn't push them on the
branch.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 1/2] fiber: introduce a callback for fibers switch-over
  2020-09-30  9:30     ` Igor Munkin
@ 2020-09-30 22:00       ` Vladislav Shpilevoy
  0 siblings, 0 replies; 29+ messages in thread
From: Vladislav Shpilevoy @ 2020-09-30 22:00 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches

Thanks for the fix attempt!

>> To make lib/core more self-sufficient we could make cord_on_yield
>> inlined in fiber.h under a macros like LIBCORE_USE_DEFAULT_ON_YIELD.
>> Which would be set by default to 1, but to 0 in the executable file.
> 
> Oh, this is a nice one. It might be a solution but...
> 
>>
>> Could help not to change the test binaries, and whatever else depends
>> on lib/core, but does not care about Lua.
> 
> ...unfortunately it's not. At first, we need to build both libraries
> with the new define set to 1 and to 0. Furthermore, this leads to link
> all tests against "coretest" instead of original "core". However, tests
> implicitly dependent on "core" are still linked against it, so the issue
> is not solved.
> 
>>
>> I do not insist, but I will do that myself if someday I will need to
>> write a new unit test (raft, for example).
> 
> I would be glad to fix the issue, so let's return to it a bit later.
> I hope there is a CMake recipe or hack for this case.

Ok, then up to you what to do with it. I can't think of a simpler
solution for now, and I am ok with the current one.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 2/2] lua: abort trace recording on fiber yield
  2020-09-30 21:59       ` Vladislav Shpilevoy
@ 2020-10-01  6:14         ` Igor Munkin
  0 siblings, 0 replies; 29+ messages in thread
From: Igor Munkin @ 2020-10-01  6:14 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Vlad,

On 30.09.20, Vladislav Shpilevoy wrote:
> Hi! Thanks for the fixes!
> 
> They are good, it seems, but you didn't push them on the
> branch.

My bad. Now pushed (also to the branches for backporting).

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 0/2] Prevent JIT engine breakage on fibers switch-over
  2020-09-23 19:06 [Tarantool-patches] [PATCH v2 0/2] Prevent JIT engine breakage on fibers switch-over Igor Munkin
                   ` (2 preceding siblings ...)
  2020-09-24 13:15 ` [Tarantool-patches] [PATCH v2 0/2] Prevent JIT engine breakage on fibers switch-over sergos
@ 2020-10-01 21:25 ` Vladislav Shpilevoy
  2020-10-01 21:29   ` Igor Munkin
  2020-10-01 22:17 ` Igor Munkin
  2020-10-02 12:43 ` Kirill Yukhin
  5 siblings, 1 reply; 29+ messages in thread
From: Vladislav Shpilevoy @ 2020-10-01 21:25 UTC (permalink / raw)
  To: Igor Munkin, Sergey Ostanevich; +Cc: tarantool-patches

Hi! Thanks for the patchset!

LGTM.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 0/2] Prevent JIT engine breakage on fibers switch-over
  2020-10-01 21:25 ` Vladislav Shpilevoy
@ 2020-10-01 21:29   ` Igor Munkin
  0 siblings, 0 replies; 29+ messages in thread
From: Igor Munkin @ 2020-10-01 21:29 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Vlad,

Thanks for your review!

On 01.10.20, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patchset!
> 
> LGTM.

Added your tag to both patches:
| Reviewed-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 0/2] Prevent JIT engine breakage on fibers switch-over
  2020-09-23 19:06 [Tarantool-patches] [PATCH v2 0/2] Prevent JIT engine breakage on fibers switch-over Igor Munkin
                   ` (3 preceding siblings ...)
  2020-10-01 21:25 ` Vladislav Shpilevoy
@ 2020-10-01 22:17 ` Igor Munkin
  2020-10-02 12:43 ` Kirill Yukhin
  5 siblings, 0 replies; 29+ messages in thread
From: Igor Munkin @ 2020-10-01 22:17 UTC (permalink / raw)
  To: Kirill Yukhin; +Cc: tarantool-patches, Vladislav Shpilevoy

Kirill,

Please proceed with the series.

NB: the relevant ChangeLog entry is here[1].

Here are the branches with backported commits:
* imun/gh-1700-abort-recording-on-fiber-switch-1.10
* imun/gh-1700-abort-recording-on-fiber-switch-2.4
* imun/gh-1700-abort-recording-on-fiber-switch-2.5

On 23.09.20, Igor Munkin wrote:
> There was a long discussion about the patch correctness and its
> performance impact in v1 thread[1]. The benchmarks provided in v1 showed
> this implementation as the least nerfing the platform overall
> performance even for such synthetic test[2]. One can find the relevant
> benchmarks results in the corresponding patches.
> 
> Changes in v2:
> * implement the callback in a different way to negate its perf impact
> 
> @ChangeLog:
> * Fixed fibers switch-over to prevent JIT machinery misbehaviour. Trace
>   recording is aborted when fiber yields the execution. The yield
>   occuring while mcode is being run leads to the platform panic
>   (gh-1700, gh-4491).
> 
> Branch: https://github.com/tarantool/tarantool/tree/imun/gh-1700-abort-recording-on-fiber-switch
> Issues:
> * https://github.com/tarantool/tarantool/issues/1700
> * https://github.com/tarantool/tarantool/issues/4491
> 
> [1]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-March/015290.html
> [2]: https://gist.github.com/igormunkin/7e0cf48005bd003ffbdf30181eedb40e
> 
> Igor Munkin (2):
>   fiber: introduce a callback for fibers switch-over
>   lua: abort trace recording on fiber yield
> 
>  src/lib/core/fiber.c                          | 10 ++++
>  src/lua/utils.c                               | 57 ++++++++++++++++++
>  ...-4491-coio-wait-leads-to-segfault.test.lua | 53 +++++++++++++++++
>  test/unit/CMakeLists.txt                      | 59 ++++++++++---------
>  test/unit/core_test_utils.c                   | 37 ++++++++++++
>  5 files changed, 188 insertions(+), 28 deletions(-)
>  create mode 100755 test/app-tap/gh-4491-coio-wait-leads-to-segfault.test.lua
>  create mode 100644 test/unit/core_test_utils.c
> 
> -- 
> 2.25.0
> 

[1]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-September/019711.html

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 0/2] Prevent JIT engine breakage on fibers switch-over
  2020-09-23 19:06 [Tarantool-patches] [PATCH v2 0/2] Prevent JIT engine breakage on fibers switch-over Igor Munkin
                   ` (4 preceding siblings ...)
  2020-10-01 22:17 ` Igor Munkin
@ 2020-10-02 12:43 ` Kirill Yukhin
  2020-10-02 12:44   ` Igor Munkin
  5 siblings, 1 reply; 29+ messages in thread
From: Kirill Yukhin @ 2020-10-02 12:43 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches, Vladislav Shpilevoy

Hello,

On 23 сен 22:06, Igor Munkin wrote:
> There was a long discussion about the patch correctness and its
> performance impact in v1 thread[1]. The benchmarks provided in v1 showed
> this implementation as the least nerfing the platform overall
> performance even for such synthetic test[2]. One can find the relevant
> benchmarks results in the corresponding patches.
> 
> Changes in v2:
> * implement the callback in a different way to negate its perf impact
> 
> @ChangeLog:
> * Fixed fibers switch-over to prevent JIT machinery misbehaviour. Trace
>   recording is aborted when fiber yields the execution. The yield
>   occuring while mcode is being run leads to the platform panic
>   (gh-1700, gh-4491).
> 
> Branch: https://github.com/tarantool/tarantool/tree/imun/gh-1700-abort-recording-on-fiber-switch
> Issues:
> * https://github.com/tarantool/tarantool/issues/1700
> * https://github.com/tarantool/tarantool/issues/4491
> 
> [1]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-March/015290.html
> [2]: https://gist.github.com/igormunkin/7e0cf48005bd003ffbdf30181eedb40e

I've checked your patch set into 1.10, 2.4, 2.5 and master.

--
Regards, Kirill Yukhin

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH v2 0/2] Prevent JIT engine breakage on fibers switch-over
  2020-10-02 12:43 ` Kirill Yukhin
@ 2020-10-02 12:44   ` Igor Munkin
  0 siblings, 0 replies; 29+ messages in thread
From: Igor Munkin @ 2020-10-02 12:44 UTC (permalink / raw)
  To: Kirill Yukhin; +Cc: tarantool-patches, Vladislav Shpilevoy

Kirill,

On 02.10.20, Kirill Yukhin wrote:
> Hello,
> 
> On 23 сен 22:06, Igor Munkin wrote:
> > There was a long discussion about the patch correctness and its
> > performance impact in v1 thread[1]. The benchmarks provided in v1 showed
> > this implementation as the least nerfing the platform overall
> > performance even for such synthetic test[2]. One can find the relevant
> > benchmarks results in the corresponding patches.
> > 
> > Changes in v2:
> > * implement the callback in a different way to negate its perf impact
> > 
> > @ChangeLog:
> > * Fixed fibers switch-over to prevent JIT machinery misbehaviour. Trace
> >   recording is aborted when fiber yields the execution. The yield
> >   occuring while mcode is being run leads to the platform panic
> >   (gh-1700, gh-4491).
> > 
> > Branch: https://github.com/tarantool/tarantool/tree/imun/gh-1700-abort-recording-on-fiber-switch
> > Issues:
> > * https://github.com/tarantool/tarantool/issues/1700
> > * https://github.com/tarantool/tarantool/issues/4491
> > 
> > [1]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-March/015290.html
> > [2]: https://gist.github.com/igormunkin/7e0cf48005bd003ffbdf30181eedb40e
> 
> I've checked your patch set into 1.10, 2.4, 2.5 and master.

Thanks, I've updated the corresponding release notes.

> 
> --
> Regards, Kirill Yukhin

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2020-10-02 12:55 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-23 19:06 [Tarantool-patches] [PATCH v2 0/2] Prevent JIT engine breakage on fibers switch-over Igor Munkin
2020-09-23 19:06 ` [Tarantool-patches] [PATCH v2 1/2] fiber: introduce a callback for " Igor Munkin
2020-09-24 12:54   ` sergos
2020-09-28 13:06     ` Igor Munkin
2020-09-29  9:15       ` Sergey Ostanevich
2020-09-29 10:05         ` Igor Munkin
2020-09-29 22:41   ` Vladislav Shpilevoy
2020-09-30  9:30     ` Igor Munkin
2020-09-30 22:00       ` Vladislav Shpilevoy
2020-09-23 19:06 ` [Tarantool-patches] [PATCH v2 2/2] lua: abort trace recording on fiber yield Igor Munkin
2020-09-24 13:00   ` sergos
2020-09-28 13:07     ` Igor Munkin
2020-09-28 15:36       ` Igor Munkin
2020-09-28 16:37         ` Igor Munkin
2020-09-28 17:45           ` Igor Munkin
2020-09-29  9:24             ` Sergey Ostanevich
2020-09-29 10:06               ` Igor Munkin
2020-09-29 22:41   ` Vladislav Shpilevoy
2020-09-30  6:27     ` Igor Munkin
2020-09-30 21:59       ` Vladislav Shpilevoy
2020-10-01  6:14         ` Igor Munkin
2020-09-24 13:15 ` [Tarantool-patches] [PATCH v2 0/2] Prevent JIT engine breakage on fibers switch-over sergos
2020-09-28 13:06   ` Igor Munkin
2020-09-29  9:14     ` Sergey Ostanevich
2020-10-01 21:25 ` Vladislav Shpilevoy
2020-10-01 21:29   ` Igor Munkin
2020-10-01 22:17 ` Igor Munkin
2020-10-02 12:43 ` Kirill Yukhin
2020-10-02 12:44   ` Igor Munkin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox