[Tarantool-patches] [PATCH v2 2/2] lua: abort trace recording on fiber yield

Igor Munkin imun at tarantool.org
Mon Sep 28 16:07:25 MSK 2020


Sergos,

Thanks for your review!

On 24.09.20, sergos at tarantool.org wrote:
> Hi!
> 
> Thanks for the patch, see my 2 comments below.
> 

<snipped>

> > 
> > diff --git a/src/lua/utils.c b/src/lua/utils.c
> > index 8e98f7607..39fe4e30c 100644
> > --- a/src/lua/utils.c
> > +++ b/src/lua/utils.c
> > @@ -29,6 +29,7 @@
> >  * SUCH DAMAGE.
> >  */
> > #include "lua/utils.h"
> > +#include <lj_trace.h>
> > 
> > #include <assert.h>
> > #include <errno.h>
> > @@ -1309,10 +1310,57 @@ tarantool_lua_utils_init(struct lua_State *L)
> > 	return 0;
> > }
> > 
> > +/*
> > + * XXX: There is already defined <panic> macro in say.h header
> > + * (included in diag.h). As a result the call below is misexpanded
> > + * and compilation fails with the corresponding error. To avoid
> > + * this error the macro is temporary renamed and restored later.
> > + * Compilation now fails for the following cases:
> > + * * temporary name <_panic> is used in this translation unit
> 
> I would propose some more sophisitcated name to avoid possible overlap 
> with ‘hidden’ _something convention. Since it is Tarantool sources, a
> reference to the gh will be just fine, like panic_gh1700.

Sounds better, thanks. Adjusted:

================================================================================

diff --git a/src/lua/utils.c b/src/lua/utils.c
index d2ad0570a..b36824ebf 100644
--- a/src/lua/utils.c
+++ b/src/lua/utils.c
@@ -1316,11 +1316,12 @@ tarantool_lua_utils_init(struct lua_State *L)
  * and compilation fails with the corresponding error. To avoid
  * this error the macro is temporary renamed and restored later.
  * Compilation now fails for the following cases:
- * * temporary name <_panic> is used in this translation unit
+ * * temporary name <gh1700_panic> is used in scope of this
+ *   translation unit
  * * <panic> is not define, so this hack can be freely dropped
  */
-#if defined(panic) && !defined(_panic)
-#  define _panic panic
+#if defined(panic) && !defined(gh1700_panic)
+#  define gh1700_panic panic
 #  undef panic
 #else
 #  error "Can't redefine <panic> macro"
@@ -1362,5 +1363,5 @@ void cord_on_yield(void)
 }
 
 /* Restore <panic> macro back */
-#define panic _panic
-#undef _panic
+#define panic gh1700_panic
+#undef gh1700_panic

================================================================================

> 
> > + * * <panic> is not define, so this hack can be freely dropped
> > + */
> > +#if defined(panic) && !defined(_panic)
> > +#  define _panic panic
> > +#  undef panic
> > +#else
> > +#  error "Can't redefine <panic> macro"
> > +#endif
> > +
> > /**
> >  * This routine encloses the checks and actions to be done when
> >  * the running fiber yields the execution.
> > + * Since Tarantool fibers don't switch-over the way Lua coroutines
> > + * do the platform ought to notify JIT engine when one lua_State
> > + * substitutes another one.
> >  */
> > void cord_on_yield(void)
> > {
> > +	struct global_State *g = G(tarantool_L);
> > +	/*
> > +	 * XXX: Switching fibers while running the trace leads to
> > +	 * code misbehaviour and failures, so stop its execution.
> > +	 */
> > +	if (unlikely(tvref(g->jit_base))) {
> > +		/*
> > +		 * XXX: mcode is executed only in scope of Lua
> > +		 * world and one can obtain the corresponding Lua
> > +		 * coroutine from the fiber storage.
> > +		 */
> > +		struct lua_State *L = fiber()->storage.lua.stack;
> > +		assert(L != NULL);
> > +		lua_pushstring(L, "Fiber is switched on the trace”);
> 
> To me it is very much ’Too long WAL write’ style. Everything said,
> nothing is clear. Could you please elaborate on the ‘FFI code’ and
> ‘led to a yield’ and probably the fiber number.

Agree, extended the error the following way:

================================================================================

diff --git a/src/lua/utils.c b/src/lua/utils.c
index b36824ebf..a8b66d34e 100644
--- a/src/lua/utils.c
+++ b/src/lua/utils.c
@@ -1349,7 +1349,10 @@ void cord_on_yield(void)
 		 */
 		struct lua_State *L = fiber()->storage.lua.stack;
 		assert(L != NULL);
-		lua_pushstring(L, "Fiber is switched on the trace");
+		lua_pushfstring(L, "fiber %d is switched while running the"
+				" compiled code (it's likely a function with"
+				" a yield underneath called via LuaJIT FFI)",
+				fiber()->fid);
 		if (g->panic)
 			g->panic(L);
 		exit(EXIT_FAILURE);

================================================================================

> 
> > +		if (g->panic)
> > +			g->panic(L);
> > +		exit(EXIT_FAILURE);
> > +	}
> > +	/*
> > +	 * Unconditionally abort trace recording whether fibers
> > +	 * switch each other. Otherwise, further compilation may
> > +	 * lead to a failure on any next compiler phase.
> > +	 */
> > +	lj_trace_abort(g);
> > }
> > +
> > +/* Restore <panic> macro back */
> > +#define panic _panic
> > +#undef _panic

<snipped>

> > -- 
> > 2.25.0
> > 
> 

Besides, I added a test for the panic case. Here is the diff:

================================================================================

diff --git a/test/app-tap/CMakeLists.txt b/test/app-tap/CMakeLists.txt
index ee67cf533..bf7d28136 100644
--- a/test/app-tap/CMakeLists.txt
+++ b/test/app-tap/CMakeLists.txt
@@ -1 +1,2 @@
 build_module(module_api module_api.c)
+build_module(libyield libyield.c)
diff --git a/test/app-tap/gh-1700-abort-recording-on-fiber-switch.test.lua b/test/app-tap/gh-1700-abort-recording-on-fiber-switch.test.lua
new file mode 100755
index 000000000..027dee219
--- /dev/null
+++ b/test/app-tap/gh-1700-abort-recording-on-fiber-switch.test.lua
@@ -0,0 +1,77 @@
+#!/usr/bin/env tarantool
+
+if #arg == 0 then
+  local checks = {
+    {
+      arg = {
+        1, -- hotloop (arg[1])
+        1, -- trigger (arg[2])
+      },
+      res = 'OK',
+      msg = 'Trace is aborted',
+    },
+    {
+      arg = {
+        1, -- hotloop (arg[1])
+        2, -- trigger (arg[2])
+      },
+      res = 'fiber %d+ is switched while running the compiled code %b()',
+      msg = 'Trace is recorded',
+    },
+  }
+
+  local tap = require('tap')
+  local test = tap.test('gh-1700-abort-recording-on-fiber-switch')
+
+  test:plan(#checks)
+
+  local vars = {
+    LUABIN = arg[-1],
+    SCRIPT = arg[0],
+    PATH   = arg[0]:gsub('/?[^/]+%.test%.lua', ''),
+    SUFFIX = package.cpath:match('?.(%a+);'),
+  }
+
+  local cmd = string.gsub('LUA_CPATH="$LUA_CPATH;<PATH>/?.<SUFFIX>" ' ..
+                          'LUA_PATH="$LUA_PATH;<PATH>/?.lua" ' ..
+                          'LD_LIBRARY_PATH=<PATH> ' ..
+                          '<LUABIN> 2>&1 <SCRIPT>', '%<(%w+)>', vars)
+
+  for _, ch in pairs(checks) do
+    local res
+    local proc = io.popen((cmd .. (' %s'):rep(#ch.arg)):format(unpack(ch.arg)))
+    for s in proc:lines() do res = s end
+    assert(res, 'proc:lines failed')
+    test:like(res, ch.res, ch.msg)
+  end
+
+  os.exit(test:check() and 0 or 1)
+end
+
+
+-- Test body.
+
+local cfg = {
+  hotloop = arg[1] or 1,
+  trigger = arg[2] or 1,
+}
+
+local ffi = require('ffi')
+local ffiyield = ffi.load('libyield')
+ffi.cdef('void yield(struct yield *state, int i)')
+
+-- Set the value to trigger <yield> call switch the running fuber.
+local yield = require('libyield')(cfg.trigger)
+
+-- Depending on trigger and hotloop values the following contexts
+-- are possible:
+-- * if trigger <= hotloop -> trace recording is aborted
+-- * if trigger >  hotloop -> trace is recorded but execution
+--   leads to panic
+jit.opt.start("3", string.format("hotloop=%d", cfg.hotloop))
+
+for i = 0, cfg.trigger + cfg.hotloop do
+  ffiyield.yield(yield, i)
+end
+-- Panic didn't occur earlier.
+print('OK')
diff --git a/test/app-tap/libyield.c b/test/app-tap/libyield.c
new file mode 100644
index 000000000..1b7221a8e
--- /dev/null
+++ b/test/app-tap/libyield.c
@@ -0,0 +1,29 @@
+#include <lua.h>
+#include <module.h>
+
+struct yield {
+	int trigger;  /* Trigger for yielding the fiber execution */
+};
+
+void yield(struct yield *state, int i)
+{
+	if (i < state->trigger)
+		return;
+
+	/* Fiber yields the execution for a jiffy */
+	fiber_sleep(0);
+}
+
+static int init(lua_State *L)
+{
+	struct yield *state = lua_newuserdata(L, sizeof(struct yield));
+
+	state->trigger = lua_tonumber(L, 1);
+	return 1;
+}
+
+LUA_API int luaopen_libyield(lua_State *L)
+{
+	lua_pushcfunction(L, init);
+	return 1;
+}

================================================================================

-- 
Best regards,
IM


More information about the Tarantool-patches mailing list