Tarantool development patches archive
 help / color / mirror / Atom feed
From: Igor Munkin <imun@tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>,
	Sergey Ostanevich <sergos@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Subject: [Tarantool-patches] [PATCH v2 2/2] lua: abort trace recording on fiber yield
Date: Wed, 23 Sep 2020 22:06:19 +0300	[thread overview]
Message-ID: <6b7def49c9d2252425d3944cc4274c9a155ae9e9.1600862684.git.imun@tarantool.org> (raw)
In-Reply-To: <cover.1600862684.git.imun@tarantool.org>

Since Tarantool fibers don't respect Lua coroutine switch mechanism, JIT
machinery stays unnotified when one lua_State substitutes another one.
As a result if trace recording hasn't been aborted prior to fiber
switch, the recording proceeds using the new lua_State and leads to a
failure either on any further compiler phase or while the compiled trace
is executed.

This changeset extends <cord_on_yield> routine aborting trace recording
when the fiber switches to another one. If the switch-over occurs while
mcode is being run the platform finishes its execution with EXIT_FAILURE
code and calls panic routine prior to the exit.

Closes #1700
Fixes #4491

Signed-off-by: Igor Munkin <imun@tarantool.org>
---

Here are also the benchmark results for the Release build:
* Vanilla (8477b6c) -> Patched (f1026a1) (min, median, mean, max):
| fibers: 10; iters: 100	0%	0%	0%	0%
| fibers: 10; iters: 1000	2%	2%	1%	2%
| fibers: 10; iters: 10000	6%	4%	5%	5%
| fibers: 10; iters: 100000	4%	3%	4%	1%
| fibers: 100; iters: 100	1%	0%	1%	10%
| fibers: 100; iters: 1000	2%	2%	0%	-7%
| fibers: 100; iters: 10000	2%	1%	2%	1%
| fibers: 100; iters: 100000	2%	2%	2%	5%
| fibers: 1000; iters: 100	1%	0%	0%	-2%
| fibers: 1000; iters: 1000	1%	1%	1%	0%
| fibers: 1000; iters: 10000	1%	1%	1%	2%
| fibers: 1000; iters: 100000	1%	1%	1%	1%
| fibers: 10000; iters: 100	0%	0%	0%	1%
| fibers: 10000; iters: 1000	0%	2%	3%	3%
| fibers: 10000; iters: 10000	0%	0%	0%	-2%
| fibers: 10000; iters: 100000	0%	1%	0%	-4%

 src/lua/utils.c                               | 49 +++++++++++++++++
 ...-4491-coio-wait-leads-to-segfault.test.lua | 53 +++++++++++++++++++
 2 files changed, 102 insertions(+)
 create mode 100755 test/app-tap/gh-4491-coio-wait-leads-to-segfault.test.lua

diff --git a/src/lua/utils.c b/src/lua/utils.c
index 8e98f7607..39fe4e30c 100644
--- a/src/lua/utils.c
+++ b/src/lua/utils.c
@@ -29,6 +29,7 @@
  * SUCH DAMAGE.
  */
 #include "lua/utils.h"
+#include <lj_trace.h>
 
 #include <assert.h>
 #include <errno.h>
@@ -1309,10 +1310,57 @@ tarantool_lua_utils_init(struct lua_State *L)
 	return 0;
 }
 
+/*
+ * XXX: There is already defined <panic> macro in say.h header
+ * (included in diag.h). As a result the call below is misexpanded
+ * and compilation fails with the corresponding error. To avoid
+ * this error the macro is temporary renamed and restored later.
+ * Compilation now fails for the following cases:
+ * * temporary name <_panic> is used in this translation unit
+ * * <panic> is not define, so this hack can be freely dropped
+ */
+#if defined(panic) && !defined(_panic)
+#  define _panic panic
+#  undef panic
+#else
+#  error "Can't redefine <panic> macro"
+#endif
+
 /**
  * This routine encloses the checks and actions to be done when
  * the running fiber yields the execution.
+ * Since Tarantool fibers don't switch-over the way Lua coroutines
+ * do the platform ought to notify JIT engine when one lua_State
+ * substitutes another one.
  */
 void cord_on_yield(void)
 {
+	struct global_State *g = G(tarantool_L);
+	/*
+	 * XXX: Switching fibers while running the trace leads to
+	 * code misbehaviour and failures, so stop its execution.
+	 */
+	if (unlikely(tvref(g->jit_base))) {
+		/*
+		 * XXX: mcode is executed only in scope of Lua
+		 * world and one can obtain the corresponding Lua
+		 * coroutine from the fiber storage.
+		 */
+		struct lua_State *L = fiber()->storage.lua.stack;
+		assert(L != NULL);
+		lua_pushstring(L, "Fiber is switched on the trace");
+		if (g->panic)
+			g->panic(L);
+		exit(EXIT_FAILURE);
+	}
+	/*
+	 * Unconditionally abort trace recording whether fibers
+	 * switch each other. Otherwise, further compilation may
+	 * lead to a failure on any next compiler phase.
+	 */
+	lj_trace_abort(g);
 }
+
+/* Restore <panic> macro back */
+#define panic _panic
+#undef _panic
diff --git a/test/app-tap/gh-4491-coio-wait-leads-to-segfault.test.lua b/test/app-tap/gh-4491-coio-wait-leads-to-segfault.test.lua
new file mode 100755
index 000000000..0dd8dfbee
--- /dev/null
+++ b/test/app-tap/gh-4491-coio-wait-leads-to-segfault.test.lua
@@ -0,0 +1,53 @@
+#!/usr/bin/env tarantool
+
+local tap = require('tap')
+
+local test = tap.test('gh-4491-coio-wait-leads-to-segfault')
+
+-- Test file to demonstrate platform failure due to fiber switch
+-- while trace recording, details:
+--     https://github.com/tarantool/tarantool/issues/4491
+
+local fiber = require('fiber')
+local ffi = require('ffi')
+ffi.cdef('int coio_wait(int fd, int event, double timeout);')
+
+local cfg = {
+  hotloop = arg[1] or 1,
+  fibers = arg[1] or 2,
+  timeout = { put = 1, get = 1 },
+}
+
+test:plan(cfg.fibers + 1)
+
+local args = {
+  fd      = 1   , -- STDIN file descriptor
+  event   = 0x1 , -- COIO_READ event
+  timeout = 0.05, -- Timeout value
+}
+
+local function run(iterations, channel)
+  for _ = 1, iterations do
+    ffi.C.coio_wait(args.fd, args.event, args.timeout)
+  end
+  channel:put(true, cfg.timeout.put)
+end
+
+local channels = { }
+
+jit.opt.start('3', string.format('hotloop=%d', cfg.hotloop))
+
+for _ = 1, cfg.fibers do
+  channels[_] = fiber.channel(1)
+  fiber.new(run, cfg.hotloop + 1, channels[_])
+end
+
+-- Finalize the existing fibers
+for _ = 1, cfg.fibers do
+  test:ok(channels[_]:get(cfg.timeout.get),
+          string.format('fiber #%d successfully finished', _))
+end
+
+test:ok(true, 'trace is not recorded due to fiber switch underneath coio_wait')
+
+os.exit(test:check() and 0 or 1)
-- 
2.25.0

  parent reply	other threads:[~2020-09-23 19:16 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-23 19:06 [Tarantool-patches] [PATCH v2 0/2] Prevent JIT engine breakage on fibers switch-over Igor Munkin
2020-09-23 19:06 ` [Tarantool-patches] [PATCH v2 1/2] fiber: introduce a callback for " Igor Munkin
2020-09-24 12:54   ` sergos
2020-09-28 13:06     ` Igor Munkin
2020-09-29  9:15       ` Sergey Ostanevich
2020-09-29 10:05         ` Igor Munkin
2020-09-29 22:41   ` Vladislav Shpilevoy
2020-09-30  9:30     ` Igor Munkin
2020-09-30 22:00       ` Vladislav Shpilevoy
2020-09-23 19:06 ` Igor Munkin [this message]
2020-09-24 13:00   ` [Tarantool-patches] [PATCH v2 2/2] lua: abort trace recording on fiber yield sergos
2020-09-28 13:07     ` Igor Munkin
2020-09-28 15:36       ` Igor Munkin
2020-09-28 16:37         ` Igor Munkin
2020-09-28 17:45           ` Igor Munkin
2020-09-29  9:24             ` Sergey Ostanevich
2020-09-29 10:06               ` Igor Munkin
2020-09-29 22:41   ` Vladislav Shpilevoy
2020-09-30  6:27     ` Igor Munkin
2020-09-30 21:59       ` Vladislav Shpilevoy
2020-10-01  6:14         ` Igor Munkin
2020-09-24 13:15 ` [Tarantool-patches] [PATCH v2 0/2] Prevent JIT engine breakage on fibers switch-over sergos
2020-09-28 13:06   ` Igor Munkin
2020-09-29  9:14     ` Sergey Ostanevich
2020-10-01 21:25 ` Vladislav Shpilevoy
2020-10-01 21:29   ` Igor Munkin
2020-10-01 22:17 ` Igor Munkin
2020-10-02 12:43 ` Kirill Yukhin
2020-10-02 12:44   ` Igor Munkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6b7def49c9d2252425d3944cc4274c9a155ae9e9.1600862684.git.imun@tarantool.org \
    --to=imun@tarantool.org \
    --cc=sergos@tarantool.org \
    --cc=tarantool-patches@dev.tarantool.org \
    --cc=v.shpilevoy@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH v2 2/2] lua: abort trace recording on fiber yield' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox