[PATCH v3] Fix fiber_join() hang in case fiber_cancel() was called
Serge Petrenko
sergepetrenko at tarantool.org
Wed Feb 6 15:56:30 MSK 2019
In case a fiber joining another fiber gets cancelled, it stays suspended
forever and never finishes joining. This happens because fiber_cancel()
wakes the fiber and removes it from all execution queues.
Fix this by adding the fiber back to the wakeup queue of the joined
fiber after each yield.
Closes #3948
---
https://github.com/tarantool/tarantool/issues/3948
https://github.com/tarantool/tarantool/tree/sp/gh-3948-fiber-cancel-during-join
Changes in v3:
- rewrote the test with fiber channel
to remove scheduler dependency.
- went back to ignoring cancellation
till join is complete.
Changes in v2:
- rewrote the test completely.
- instead of continuing to join if the fiber
is cancelled make the fiber to be joined
non-joinable and exit. This solution was
discussed verbally.
- revert comment changes for fiber_yield().
It really isn't a cancellation point.
src/fiber.c | 12 ++++++++++--
test/app/fiber.result | 43 +++++++++++++++++++++++++++++++++++++++++
test/app/fiber.test.lua | 21 ++++++++++++++++++++
3 files changed, 74 insertions(+), 2 deletions(-)
diff --git a/src/fiber.c b/src/fiber.c
index 6f3d0ab78..70e992f13 100644
--- a/src/fiber.c
+++ b/src/fiber.c
@@ -392,9 +392,17 @@ fiber_join(struct fiber *fiber)
assert(fiber->flags & FIBER_IS_JOINABLE);
if (! fiber_is_dead(fiber)) {
- rlist_add_tail_entry(&fiber->wake, fiber(), state);
-
do {
+ /*
+ * In case fiber is cancelled during yield
+ * it will be remoed from wake queue by a
+ * wakeup following the cancel, so we have
+ * to put it back in.
+ * Having multiple queue entries for the
+ * same fiber doesn't hurt, since wakeup
+ * is executed only once per fiber.
+ */
+ rlist_add_tail_entry(&fiber->wake, fiber(), state);
fiber_yield();
} while (! fiber_is_dead(fiber));
}
diff --git a/test/app/fiber.result b/test/app/fiber.result
index ab7c1941b..1b72ed5da 100644
--- a/test/app/fiber.result
+++ b/test/app/fiber.result
@@ -1411,6 +1411,49 @@ l = nil
l1 = nil
---
...
+-- gh-3948 fiber.join() blocks if fiber is cancelled.
+function another_func() ch1:get() end
+---
+...
+test_run:cmd("setopt delimiter ';'")
+---
+- true
+...
+function func()
+ local fib = fiber.create(another_func)
+ fib:set_joinable(true)
+ ch2:put(1)
+ fib:join()
+end;
+---
+...
+test_run:cmd("setopt delimiter ''");
+---
+- true
+...
+ch1 = fiber.channel(1)
+---
+...
+ch2 = fiber.channel(1)
+---
+...
+f = fiber.create(func)
+---
+...
+ch2:get()
+---
+- 1
+...
+f:cancel()
+---
+...
+ch1:put(1)
+---
+- true
+...
+while f:status() ~= 'dead' do fiber.sleep(0.01) end
+---
+...
-- cleanup
test_run:cmd("clear filter")
---
diff --git a/test/app/fiber.test.lua b/test/app/fiber.test.lua
index 2762047e4..a0d1e993b 100644
--- a/test/app/fiber.test.lua
+++ b/test/app/fiber.test.lua
@@ -602,6 +602,27 @@ f = nil
l = nil
l1 = nil
+-- gh-3948 fiber.join() blocks if fiber is cancelled.
+function another_func() ch1:get() end
+test_run:cmd("setopt delimiter ';'")
+function func()
+ local fib = fiber.create(another_func)
+ fib:set_joinable(true)
+ ch2:put(1)
+ fib:join()
+end;
+test_run:cmd("setopt delimiter ''");
+
+ch1 = fiber.channel(1)
+ch2 = fiber.channel(1)
+
+f = fiber.create(func)
+ch2:get()
+f:cancel()
+ch1:put(1)
+
+while f:status() ~= 'dead' do fiber.sleep(0.01) end
+
-- cleanup
test_run:cmd("clear filter")
--
2.17.2 (Apple Git-113)
More information about the Tarantool-patches
mailing list