[Tarantool-patches] [PATCH] lua: prohibit fiber yield when GC hook is active

Sergey Ostanevich sergos at tarantool.org
Mon Oct 5 21:37:56 MSK 2020


Hi!

Thank you for the patch, it LGTM.

Best regards,
Sergos 

Friday, 2 October 2020, 17:49 +0300 from Igor Munkin  <imun at tarantool.org>:
>While a GC hook (i.e. a __gc metamethod) is running, the garbage
>collector engine is "stopped": the memory penalty threshold is set to
>LJ_MAX_MEM, so the incremental GC step is not triggered. Ergo, yielding
>the execution in the finalizer body leaves the platform running with
>LuaJIT GC disabled. It is not re-enabled until the yielded fiber gets
>the execution back.
>
>This changeset extends the <cord_on_yield> routine with a check whether
>a GC hook is active. If the switch-over occurs in scope of a __gc
>metamethod, the platform calls the panic routine and is then forced to
>stop its execution with EXIT_FAILURE.
>
>Relates to #4518
>Follows up #4727
>
>Signed-off-by: Igor Munkin < imun at tarantool.org >
>---
>
>Vlad introduced the internal interface and local internal background
>fiber in scope of 8443bd9 ("fiber: introduce schedule_task() internal
>function") to postpone any yielding finalization (e.g. 3d5b4da ("fio:
>close unused descriptors automatically") and f073834 ("swim: use
>fiber._internal.schedule_task() for GC")). After this patch is merged we
>need to update docs and provide users a correct scenario to detect and
>fix yielding finalizers.
>
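
For readers looking for the "fix yielding finalizers" scenario mentioned
above, the general idea of postponing the work can be sketched roughly as
follows. This is not the fiber._internal.schedule_task() implementation;
the names pending_queue, finalizer_defer and pending_pop are hypothetical
and the fixed-size ring buffer is only an assumption for illustration.
The finalizer merely records what has to be cleaned up, and a context
that is allowed to yield picks the items up later.

```c
#include <assert.h>
#include <stddef.h>

enum { PENDING_MAX = 16 };

/* Hypothetical FIFO of resources whose cleanup was postponed. */
struct pending_queue {
	int items[PENDING_MAX];
	size_t head;  /* index of the oldest queued item */
	size_t count; /* number of queued items */
};

/*
 * Meant to be called from a finalizer: it only records the
 * resource and never blocks or yields.
 * Returns 0 on success, -1 if the queue is full.
 */
static int
finalizer_defer(struct pending_queue *q, int resource)
{
	if (q->count == PENDING_MAX)
		return -1;
	q->items[(q->head + q->count) % PENDING_MAX] = resource;
	q->count++;
	return 0;
}

/*
 * Meant to be called later from a context that may yield
 * (e.g. a background worker). Stores the oldest queued
 * resource in *resource and returns 1, or returns 0 if
 * nothing is pending.
 */
static int
pending_pop(struct pending_queue *q, int *resource)
{
	if (q->count == 0)
		return 0;
	*resource = q->items[q->head];
	q->head = (q->head + 1) % PENDING_MAX;
	q->count--;
	return 1;
}
```

The worker would call pending_pop() in a loop and do the actual (possibly
yielding) cleanup for each returned resource, outside of any __gc call.
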
>Here are also the benchmark results for the Release build:
>* Vanilla (2711797) -> Patched (61072ba) (min, median, mean, max):
>| fibers: 10; iters: 100	0%	0%	0%	2%
>| fibers: 10; iters: 1000	-3%	0%	0%	1%
>| fibers: 10; iters: 10000	-3%	0%	-1%	-2%
>| fibers: 10; iters: 100000	0%	0%	0%	1%
>| fibers: 100; iters: 100	-1%	-2%	-2%	-9%
>| fibers: 100; iters: 1000	0%	0%	0%	5%
>| fibers: 100; iters: 10000	0%	0%	0%	0%
>| fibers: 100; iters: 100000	0%	0%	0%	-3%
>| fibers: 1000; iters: 100	0%	0%	0%	0%
>| fibers: 1000; iters: 1000	0%	0%	0%	0%
>| fibers: 1000; iters: 10000	0%	0%	0%	1%
>| fibers: 1000; iters: 100000	0%	0%	0%	2%
>| fibers: 10000; iters: 100	0%	-3%	-1%	-2%
>| fibers: 10000; iters: 1000	0%	-2%	-3%	-2%
>| fibers: 10000; iters: 10000	0%	0%	0%	-2%
>| fibers: 10000; iters: 100000	0%	-3%	-3%	-6%
>
> src/lua/utils.c                             | 26 ++++++++++++++-
> test/app-tap/yield-in-gc-finalizer.test.lua | 36 +++++++++++++++++++++
> 2 files changed, 61 insertions(+), 1 deletion(-)
> create mode 100755 test/app-tap/yield-in-gc-finalizer.test.lua
>
>diff --git a/src/lua/utils.c b/src/lua/utils.c
>index bb2287162..399bec6c6 100644
>--- a/src/lua/utils.c
>+++ b/src/lua/utils.c
>@@ -1324,7 +1324,8 @@ tarantool_lua_utils_init(struct lua_State *L)
>  * the running fiber yields the execution.
>  * Since Tarantool fibers don't switch-over the way Lua coroutines
>  * do the platform ought to notify JIT engine when one lua_State
>- * substitutes another one.
>+ * substitutes another one. Furthermore fiber switch is forbidden
>+ * when GC hook (i.e. __gc metamethod) is running.
>  */
> void cord_on_yield(void)
> {
>@@ -1355,4 +1356,27 @@ void cord_on_yield(void)
> 	 * lead to a failure on any next compiler phase.
> 	 */
> 	lj_trace_abort(g);
>+
>+	/*
>+	 * XXX: While running GC hook (i.e. __gc  metamethod)
>+	 * garbage collector is formally "stopped" since the
>+	 * memory penalty threshold is set to its maximum value,
>+	 * ergo incremental GC step is not triggered. Thereby,
>+	 * yielding the execution at this point leads to further
>+	 * running platform with disabled LuaJIT GC. The fiber
>+	 * doesn't get the execution back until it's ready, so
>+	 * in pessimistic scenario LuaJIT OOM might occur
>+	 * earlier. As a result fiber switch is prohibited when
>+	 * GC hook is active and the platform is forced to stop.
>+	 */
>+	if (unlikely(g->hookmask & (HOOK_ACTIVE|HOOK_GC))) {
>+		struct lua_State *L = fiber()->storage.lua.stack;
>+		assert(L != NULL);
>+		lua_pushfstring(L, "fiber %d is switched while running GC"
>+				" finalizer (i.e. __gc metamethod)",
>+				fiber()->fid);
>+		if (g->panic)
>+			g->panic(L);
>+		exit(EXIT_FAILURE);
>+	}
> }
>diff --git a/test/app-tap/yield-in-gc-finalizer.test.lua b/test/app-tap/yield-in-gc-finalizer.test.lua
>new file mode 100755
>index 000000000..a7e173721
>--- /dev/null
>+++ b/test/app-tap/yield-in-gc-finalizer.test.lua
>@@ -0,0 +1,36 @@
>+#!/usr/bin/env tarantool
>+
>+if #arg == 0 then
>+  local tap = require('tap')
>+  local test = tap.test('test')
>+
>+  test:plan(1)
>+
>+  -- XXX: Shell argument <test> is necessary to differ test case
>+  -- from the test runner.
>+  local cmd = string.gsub('<LUABIN> 2>/dev/null <SCRIPT> test', '%<(%w+)>', {
>+    LUABIN = arg[-1],
>+    SCRIPT = arg[0],
>+  })
>+  test:isnt(os.execute(cmd), 0, 'fiber.yield is forbidden in __gc')
>+
>+  os.exit(test:check() and 0 or 1)
>+end
>+
>+
>+-- Test body.
>+
>+local ffi = require('ffi')
>+local fiber = require('fiber')
>+
>+ffi.cdef('struct test { int foo; };')
>+
>+local test = ffi.metatype('struct test', {
>+  __gc = function() fiber.yield() end,
>+})
>+
>+local t = test(9)
>+t = nil
>+
>+-- This call leads to the platform panic.
>+collectgarbage('collect')
>-- 
>2.25.0
>
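
A side note on the condition added to cord_on_yield: it is a plain
bitwise test, so it fires when any of the listed bits is set in
hookmask. The sketch below shows that behaviour next to a stricter
"both bits" variant, which is not part of the patch. The flag values are
assumptions for illustration (they should be checked against lj_obj.h),
and hook_any/hook_both are hypothetical helper names. Whether a hook
with only HOOK_ACTIVE set can actually reach cord_on_yield is not
something this thread establishes.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed flag values, for illustration only; see LuaJIT's lj_obj.h. */
#define HOOK_ACTIVE 0x10
#define HOOK_GC     0x40

/* Same shape as the condition in the patch: true if either bit is set. */
static int
hook_any(uint8_t hookmask)
{
	return (hookmask & (HOOK_ACTIVE | HOOK_GC)) != 0;
}

/* Stricter variant (not from the patch): true only if both bits are set. */
static int
hook_both(uint8_t hookmask)
{
	return (hookmask & (HOOK_ACTIVE | HOOK_GC)) ==
	       (HOOK_ACTIVE | HOOK_GC);
}
```

With these helpers, a mask holding only HOOK_ACTIVE satisfies hook_any()
but not hook_both(), while a mask holding both bits satisfies both.
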