[PATCH] Fix fiber_join() hang in case fiber_cancel() was called

Serge Petrenko sergepetrenko at tarantool.org
Mon Feb 4 16:51:38 MSK 2019


In case a fiber joining another fiber gets cancelled, it stays suspended
forever and never finishes joining. This happens because fiber_cancel()
wakes the fiber and removes it from all execution queues.
Fix this by adding ourselves to the joined fibers wake queue on every
iteration. While joining the fiber ignores cancelation.

Also, while we're at it, fix comment for fiber_yield(): it obviously is
a cancellation point.

Closes #3948
---
https://github.com/tarantool/tarantool/tree/sp/gh-3948-fiber-cancel-during-join
https://github.com/tarantool/tarantool/issues/3948

 src/fiber.c            | 14 +++++++++++---
 test/unit/fiber.cc     | 37 +++++++++++++++++++++++++++++++++++++
 test/unit/fiber.result |  2 ++
 3 files changed, 50 insertions(+), 3 deletions(-)

diff --git a/src/fiber.c b/src/fiber.c
index 6f3d0ab78..4dc6b3c5a 100644
--- a/src/fiber.c
+++ b/src/fiber.c
@@ -392,9 +392,17 @@ fiber_join(struct fiber *fiber)
 	assert(fiber->flags & FIBER_IS_JOINABLE);
 
 	if (! fiber_is_dead(fiber)) {
-		rlist_add_tail_entry(&fiber->wake, fiber(), state);
 
 		do {
+			/*
+			 * In case fiber is cancelled during yield
+			 * it will be removed from wake queue by a
+			 * wakeup following the cancel.
+			 * Having multiple entries for the same fiber
+			 * doesn't hurt, since wakeup is executed only
+			 * once per fiber.
+			 */
+			rlist_add_tail_entry(&fiber->wake, fiber(), state);
 			fiber_yield();
 		} while (! fiber_is_dead(fiber));
 	}
@@ -411,8 +419,8 @@ fiber_join(struct fiber *fiber)
 }
 
 /**
- * @note: this is not a cancellation point (@sa fiber_testcancel())
- * but it is considered good practice to call testcancel()
+ * @note: this is a cancellation point (@sa fiber_testcancel())
+ * It is considered good practice to call testcancel()
  * after each yield.
  */
 void
diff --git a/test/unit/fiber.cc b/test/unit/fiber.cc
index 3e9c479c5..5c2a645f6 100644
--- a/test/unit/fiber.cc
+++ b/test/unit/fiber.cc
@@ -47,6 +47,29 @@ cancel_dead_f(va_list ap)
 	return 0;
 }
 
+static int
+yield_f(va_list ap)
+{
+	/** Let the fiber schedule. */
+	fiber_wakeup(fiber());
+	fiber_yield();
+	return 0;
+}
+
+static int
+cancel_joining_f(va_list ap)
+{
+	fiber_set_cancellable(true);
+	struct fiber *fiber = fiber_new("yield", yield_f);
+	fiber_set_joinable(fiber, true);
+	fiber_wakeup(fiber);
+	fiber_yield();
+	note("joining");
+	fiber_join(fiber);
+	note("joined");
+	return 0;
+}
+
 static size_t fiber_stack_size_default;
 
 static void NOINLINE
@@ -138,6 +161,20 @@ fiber_join_test()
 	note("big-stack fiber not crashed");
 	fiber_join(fiber);
 
+	/*
+	 * Cancelling a fiber which tries to join another fiber
+	 * lead to the joining fiber never getting invoked.
+	 * Make suse fiber joins successfully and that it doesn't
+	 * interrupt join due to a cancel.
+	 */
+	fiber = fiber_new_xc("cancel_joining", cancel_joining_f);
+	fiber_set_joinable(fiber, true);
+	fiber_wakeup(fiber);
+	fiber_wakeup(fiber());
+	fiber_yield();
+	fiber_cancel(fiber);
+	fiber_join(fiber);
+
 	footer();
 }
 
diff --git a/test/unit/fiber.result b/test/unit/fiber.result
index 4f7108ffc..e08409243 100644
--- a/test/unit/fiber.result
+++ b/test/unit/fiber.result
@@ -13,4 +13,6 @@ SystemError Failed to allocate 42 bytes in allocator for exception: Cannot alloc
 # cancel dead has started
 # by this time the fiber should be dead already
 # big-stack fiber not crashed
+# joining
+# joined
 	*** fiber_join_test: done ***
-- 
2.17.2 (Apple Git-113)




More information about the Tarantool-patches mailing list