[PATCH v3] Fix fiber_join() hang in case fiber_cancel() was called

Serge Petrenko sergepetrenko at tarantool.org
Wed Feb 6 15:56:30 MSK 2019


If a fiber that is joining another fiber gets cancelled, it stays
suspended forever and never finishes joining. This happens because
fiber_cancel() wakes the fiber, and the wakeup removes it from every
queue it was waiting in, including the wake queue of the joined fiber.
Fix this by adding the fiber back to the wake queue of the joined
fiber on every iteration of the join loop, before each yield.

Closes #3948
---
https://github.com/tarantool/tarantool/issues/3948
https://github.com/tarantool/tarantool/tree/sp/gh-3948-fiber-cancel-during-join

Changes in v3: 
  - rewrote the test with fiber channel
    to remove scheduler dependency.
  - went back to ignoring cancellation
    till join is complete.

Changes in v2:
  - rewrote the test completely.
  - instead of continuing to join if the fiber
    is cancelled, make the fiber being joined
    non-joinable and exit. This solution was
    discussed verbally.
  - revert comment changes for fiber_yield().
    It really isn't a cancellation point.
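
For reviewers, the hang and the fix can be replayed with a toy model.
This is only a sketch, not Tarantool code: struct toy_join and the
toy_* functions are hypothetical names, and the model assumes a single
joiner, a single target, and that a wakeup always unlinks the joiner
from the queue it was waiting in.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one joiner waiting on one target; not Tarantool code. */
struct toy_join {
	bool target_dead;   /* the joined fiber has finished */
	bool joiner_queued; /* joiner sits in the target's wake queue */
	bool join_returned; /* the join loop has exited */
};

/* One pass of the join loop body, run each time the joiner gets control. */
static void
toy_joiner_run(struct toy_join *j, bool requeue_each_iteration)
{
	if (j->target_dead) {
		j->join_returned = true;
		return;
	}
	if (requeue_each_iteration)
		j->joiner_queued = true; /* patched: re-register first */
	/* ...then yield: control simply returns to the caller. */
}

/* A wakeup unlinks the joiner from its queue and lets it run. */
static void
toy_wakeup_joiner(struct toy_join *j, bool requeue_each_iteration)
{
	j->joiner_queued = false;
	toy_joiner_run(j, requeue_each_iteration);
}

/* Replay: join starts, the joiner is cancelled, then the target dies. */
static bool
toy_join_completes(bool requeue_each_iteration)
{
	struct toy_join j = { false, false, false };

	j.joiner_queued = true; /* initial registration, then yield */
	/* The cancel wakes the joiner while the target is still alive. */
	toy_wakeup_joiner(&j, requeue_each_iteration);
	assert(!j.join_returned);

	j.target_dead = true; /* the target finishes... */
	if (j.joiner_queued)  /* ...and wakes only queued waiters */
		toy_wakeup_joiner(&j, requeue_each_iteration);
	return j.join_returned;
}
```

In this model toy_join_completes(false) mirrors the old code, which
registers once before the loop, and toy_join_completes(true) mirrors
the patched loop.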

 src/fiber.c             | 12 ++++++++++--
 test/app/fiber.result   | 43 +++++++++++++++++++++++++++++++++++++++++
 test/app/fiber.test.lua | 21 ++++++++++++++++++++
 3 files changed, 74 insertions(+), 2 deletions(-)

diff --git a/src/fiber.c b/src/fiber.c
index 6f3d0ab78..70e992f13 100644
--- a/src/fiber.c
+++ b/src/fiber.c
@@ -392,9 +392,17 @@ fiber_join(struct fiber *fiber)
 	assert(fiber->flags & FIBER_IS_JOINABLE);
 
 	if (! fiber_is_dead(fiber)) {
-		rlist_add_tail_entry(&fiber->wake, fiber(), state);
-
 		do {
+			/*
+			 * In case fiber is cancelled during yield
+			 * it will be removed from wake queue by a
+			 * wakeup following the cancel, so we have
+			 * to put it back in.
+			 * Having multiple queue entries for the
+			 * same fiber doesn't hurt, since wakeup
+			 * is executed only once per fiber.
+			 */
+			rlist_add_tail_entry(&fiber->wake, fiber(), state);
 			fiber_yield();
 		} while (! fiber_is_dead(fiber));
 	}
diff --git a/test/app/fiber.result b/test/app/fiber.result
index ab7c1941b..1b72ed5da 100644
--- a/test/app/fiber.result
+++ b/test/app/fiber.result
@@ -1411,6 +1411,49 @@ l = nil
 l1 = nil
 ---
 ...
+-- gh-3948 fiber.join() blocks if fiber is cancelled.
+function another_func() ch1:get() end
+---
+...
+test_run:cmd("setopt delimiter ';'")
+---
+- true
+...
+function func()
+    local fib = fiber.create(another_func)
+    fib:set_joinable(true)
+    ch2:put(1)
+    fib:join()
+end;
+---
+...
+test_run:cmd("setopt delimiter ''");
+---
+- true
+...
+ch1 = fiber.channel(1)
+---
+...
+ch2 = fiber.channel(1)
+---
+...
+f = fiber.create(func)
+---
+...
+ch2:get()
+---
+- 1
+...
+f:cancel()
+---
+...
+ch1:put(1)
+---
+- true
+...
+while f:status() ~= 'dead' do fiber.sleep(0.01) end
+---
+...
 -- cleanup
 test_run:cmd("clear filter")
 ---
diff --git a/test/app/fiber.test.lua b/test/app/fiber.test.lua
index 2762047e4..a0d1e993b 100644
--- a/test/app/fiber.test.lua
+++ b/test/app/fiber.test.lua
@@ -602,6 +602,27 @@ f = nil
 l = nil
 l1 = nil
 
+-- gh-3948 fiber.join() blocks if fiber is cancelled.
+function another_func() ch1:get() end
+test_run:cmd("setopt delimiter ';'")
+function func()
+    local fib = fiber.create(another_func)
+    fib:set_joinable(true)
+    ch2:put(1)
+    fib:join()
+end;
+test_run:cmd("setopt delimiter ''");
+
+ch1 = fiber.channel(1)
+ch2 = fiber.channel(1)
+
+f = fiber.create(func)
+ch2:get()
+f:cancel()
+ch1:put(1)
+
+while f:status() ~= 'dead' do fiber.sleep(0.01) end
+
 -- cleanup
 test_run:cmd("clear filter")
 
-- 
2.17.2 (Apple Git-113)



