Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH 0/2] fiber_wakeup() nop on self
@ 2021-04-23 23:15 Vladislav Shpilevoy via Tarantool-patches
  2021-04-23 23:15 ` [Tarantool-patches] [PATCH 1/2] fiber: make wakeup in Lua and C " Vladislav Shpilevoy via Tarantool-patches
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-04-23 23:15 UTC (permalink / raw)
  To: tarantool-patches, gorcunov, sergepetrenko

The patchset makes fiber_wakeup() in C and fiber.wakeup() in Lua
nop when called on the current fiber. This fixes a couple of
crashes in the public API, and prevent spurious wakeups in
certain cases.

Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-5292-6043-fiber-wakeup-nop
Issue: https://github.com/tarantool/tarantool/issues/5292
Issue: https://github.com/tarantool/tarantool/issues/6043

Vladislav Shpilevoy (2):
  fiber: make wakeup in Lua and C nop on self
  fiber: use wakeup safely on self everywhere

 .../unreleased/gh-6043-fiber-wakeup-self.md   | 33 +++++++
 src/box/applier.cc                            |  4 +-
 src/box/journal.c                             |  6 ++
 src/box/journal.h                             |  7 ++
 src/box/raft.c                                | 14 +--
 src/box/txn.c                                 |  6 +-
 src/box/txn_limbo.c                           | 14 +--
 src/lib/core/fiber.c                          | 85 +++++++++++--------
 src/lib/core/fiber.h                          |  6 +-
 src/lib/swim/swim.c                           |  1 -
 .../gh-6043-fiber-wakeup-self.test.lua        | 50 +++++++++++
 test/unit/fiber.cc                            | 33 ++++++-
 test/unit/fiber.result                        |  2 +
 test/unit/fiber_stress.cc                     |  6 +-
 14 files changed, 191 insertions(+), 76 deletions(-)
 create mode 100644 changelogs/unreleased/gh-6043-fiber-wakeup-self.md
 create mode 100755 test/app-tap/gh-6043-fiber-wakeup-self.test.lua

-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Tarantool-patches] [PATCH 1/2] fiber: make wakeup in Lua and C nop on self
  2021-04-23 23:15 [Tarantool-patches] [PATCH 0/2] fiber_wakeup() nop on self Vladislav Shpilevoy via Tarantool-patches
@ 2021-04-23 23:15 ` Vladislav Shpilevoy via Tarantool-patches
  2021-04-25  9:21   ` Serge Petrenko via Tarantool-patches
  2021-04-23 23:15 ` [Tarantool-patches] [PATCH 2/2] fiber: use wakeup safely on self everywhere Vladislav Shpilevoy via Tarantool-patches
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-04-23 23:15 UTC (permalink / raw)
  To: tarantool-patches, gorcunov, sergepetrenko

fiber.wakeup() in Lua and fiber_wakeup() in C could lead to a
crash or undefined behaviour when called on the currently running
fiber.

In particular, if after wakeup was started a new fiber in a
blocking way (fiber.create() and fiber_start()) it would crash in
debug build, and lead to unknown results in release.

If after wakeup was made a sleep with non-zero timeout or an
infinite yield (fiber_yield()), the fiber woke up in the same
event loop iteration regardless of any timeout or other wakeups.
It was a spurious wakeup, which is not expected in most of the
places internally.

The patch makes the wakeup nop on the current fiber making it safe
to use anywhere.

Closes #5292
Closes #6043

@TarantoolBot document
Title: fiber.wakeup() in Lua and fiber_wakeup() in C are nop on self
In Lua `fiber.wakeup()` being called on the current fiber does not
do anything, safe to use. The same for `fiber_wakeup()` in C.
---
 .../unreleased/gh-6043-fiber-wakeup-self.md   | 33 +++++++
 src/lib/core/fiber.c                          | 85 +++++++++++--------
 src/lib/core/fiber.h                          |  6 +-
 .../gh-6043-fiber-wakeup-self.test.lua        | 50 +++++++++++
 test/unit/fiber.cc                            | 33 ++++++-
 test/unit/fiber.result                        |  2 +
 test/unit/fiber_stress.cc                     |  6 +-
 7 files changed, 171 insertions(+), 44 deletions(-)
 create mode 100644 changelogs/unreleased/gh-6043-fiber-wakeup-self.md
 create mode 100755 test/app-tap/gh-6043-fiber-wakeup-self.test.lua

diff --git a/changelogs/unreleased/gh-6043-fiber-wakeup-self.md b/changelogs/unreleased/gh-6043-fiber-wakeup-self.md
new file mode 100644
index 000000000..5869e2696
--- /dev/null
+++ b/changelogs/unreleased/gh-6043-fiber-wakeup-self.md
@@ -0,0 +1,33 @@
+## bugfix/core
+
+* **[Breaking change]** `fiber.wakeup()` in Lua and `fiber_wakeup()` in C became
+  NOP on the currently running fiber. Previously they allowed to "ignore" the
+  next yield or sleep leading to unexpected spurious wakeups. Could lead to a
+  crash (in debug build) or undefined behaviour (in release build) if called
+  right before `fiber.create()` in Lua or `fiber_start()` in C (gh-6043).
+
+  There was a single usecase for that - reschedule in the same event loop
+  iteration which is not the same as `fiber.sleep(0)` in Lua and
+  `fiber_sleep(0)` in C. Could be done in C like that:
+  ```C
+  fiber_wakeup(fiber_self());
+  fiber_yield();
+  ```
+  and in Lua like that:
+  ```Lua
+  fiber.self():wakeup()
+  fiber.yield()
+  ```
+  Now to get the same effect in C use `fiber_reschedule()`. In Lua it is now
+  simply impossible to reschedule the current fiber in the same event loop
+  iteration directly. But still can reschedule self through a second fiber like
+  this (**never use it, please**):
+  ```Lua
+  local self = fiber.self()
+  fiber.new(function() self:wakeup() end)
+  fiber.sleep(0)
+  ```
+
+----
+Breaking change: `fiber.wakeup()` in Lua and `fiber_wakeup()` in C became NOP on
+the currently running fiber.
diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
index f8b85d99d..cb6295171 100644
--- a/src/lib/core/fiber.c
+++ b/src/lib/core/fiber.c
@@ -396,10 +396,13 @@ fiber_call_impl(struct fiber *callee)
 	assert(rlist_empty(&callee->state));
 	assert(caller);
 	assert(caller != callee);
+	assert((caller->flags & FIBER_IS_RUNNING) != 0);
+	assert((callee->flags & FIBER_IS_RUNNING) == 0);
 
+	caller->flags &= ~FIBER_IS_RUNNING;
 	cord->fiber = callee;
+	callee->flags = (callee->flags & ~FIBER_IS_READY) | FIBER_IS_RUNNING;
 
-	callee->flags &= ~FIBER_IS_READY;
 	ASAN_START_SWITCH_FIBER(asan_state, 1,
 				callee->stack,
 				callee->stack_size);
@@ -443,19 +446,9 @@ fiber_checkstack(void)
 	return false;
 }
 
-/**
- * Interrupt a synchronous wait of a fiber inside the event loop.
- * We do so by keeping an "async" event in every fiber, solely
- * for this purpose, and raising this event here.
- *
- * @note: if this is sent to self, followed by a fiber_yield()
- * call, it simply reschedules the fiber after other ready
- * fibers in the same event loop iteration.
- */
-void
-fiber_wakeup(struct fiber *f)
+static void
+fiber_make_ready(struct fiber *f)
 {
-	assert(! (f->flags & FIBER_IS_DEAD));
 	/**
 	 * Do nothing if the fiber is already in cord->ready
 	 * list *or* is in the call chain created by
@@ -464,21 +457,12 @@ fiber_wakeup(struct fiber *f)
 	 * but the same game is deadly when the fiber is in
 	 * the callee list created by fiber_schedule_list().
 	 *
-	 * To put it another way, fiber_wakeup() is a 'request' to
+	 * To put it another way, fiber_make_ready() is a 'request' to
 	 * schedule the fiber for execution, and once it is executing
-	 * a wakeup request is considered complete and it must be
+	 * the 'make ready' request is considered complete and it must be
 	 * removed.
-	 *
-	 * A dead fiber can be lingering in the cord fiber list
-	 * if it is joinable. This makes it technically possible
-	 * to schedule it. We would never make such a mistake
-	 * in our own code, hence the assert above. But as long
-	 * as fiber.wakeup() is a part of public Lua API, an
-	 * external rock can mess things up. Ignore such attempts
-	 * as well.
 	 */
-	if (f->flags & (FIBER_IS_READY | FIBER_IS_DEAD))
-		return;
+	assert((f->flags & (FIBER_IS_DEAD | FIBER_IS_READY)) == 0);
 	struct cord *cord = cord();
 	if (rlist_empty(&cord->ready)) {
 		/*
@@ -503,6 +487,23 @@ fiber_wakeup(struct fiber *f)
 	f->flags |= FIBER_IS_READY;
 }
 
+void
+fiber_wakeup(struct fiber *f)
+{
+	/*
+	 * DEAD is checked both in the assertion and in the release build
+	 * because it should not ever happen, at least internally. But in some
+	 * user modules it might happen, and better ignore such fibers.
+	 * Especially since this was allowed for quite some time in the public
+	 * API and need to keep it if it costs nothing, for backward
+	 * compatibility.
+	 */
+	assert((f->flags & FIBER_IS_DEAD) == 0);
+	const int no_flags = FIBER_IS_READY | FIBER_IS_DEAD | FIBER_IS_RUNNING;
+	if ((f->flags & no_flags) == 0)
+		fiber_make_ready(f);
+}
+
 /** Cancel the subject fiber.
 *
  * Note: cancelation is asynchronous. Use fiber_join() to wait for the
@@ -521,7 +522,6 @@ void
 fiber_cancel(struct fiber *f)
 {
 	assert(f->fid != 0);
-	struct fiber *self = fiber();
 	/**
 	 * Do nothing if the fiber is dead, since cancelling
 	 * the fiber would clear the diagnostics area and
@@ -535,10 +535,8 @@ fiber_cancel(struct fiber *f)
 	/**
 	 * Don't wake self and zombies.
 	 */
-	if (f != self) {
-		if (f->flags & FIBER_IS_CANCELLABLE)
-			fiber_wakeup(f);
-	}
+	if ((f->flags & FIBER_IS_CANCELLABLE) != 0)
+		fiber_wakeup(f);
 }
 
 /**
@@ -602,7 +600,14 @@ fiber_clock64(void)
 void
 fiber_reschedule(void)
 {
-	fiber_wakeup(fiber());
+	struct fiber *f = fiber();
+	/*
+	 * The current fiber can't be dead, the flag is set when the fiber
+	 * function returns. Can't be ready, because such status is assigned
+	 * only to the queued fibers in the ready-list.
+	 */
+	assert((f->flags & (FIBER_IS_READY | FIBER_IS_DEAD)) == 0);
+	fiber_make_ready(f);
 	fiber_yield();
 }
 
@@ -686,8 +691,13 @@ fiber_yield(void)
 
 	assert(callee->flags & FIBER_IS_READY || callee == &cord->sched);
 	assert(! (callee->flags & FIBER_IS_DEAD));
+	assert((caller->flags & FIBER_IS_RUNNING) != 0);
+	assert((callee->flags & FIBER_IS_RUNNING) == 0);
+
+	caller->flags &= ~FIBER_IS_RUNNING;
 	cord->fiber = callee;
-	callee->flags &= ~FIBER_IS_READY;
+	callee->flags = (callee->flags & ~FIBER_IS_READY) | FIBER_IS_RUNNING;
+
 	ASAN_START_SWITCH_FIBER(asan_state,
 				(caller->flags & FIBER_IS_DEAD) == 0,
 				callee->stack,
@@ -748,10 +758,6 @@ fiber_sleep(double delay)
 	if (delay == 0) {
 		ev_idle_start(loop(), &cord()->idle_event);
 	}
-	/*
-	 * We don't use fiber_wakeup() here to ensure there is
-	 * no infinite wakeup loop in case of fiber_sleep(0).
-	 */
 	fiber_yield_timeout(delay);
 
 	if (delay == 0) {
@@ -864,7 +870,11 @@ fiber_reset(struct fiber *fiber)
 {
 	rlist_create(&fiber->on_yield);
 	rlist_create(&fiber->on_stop);
-	fiber->flags = FIBER_DEFAULT_FLAGS;
+	/*
+	 * Preserve the running flag if set. Reset might be called on the
+	 * current fiber when it is recycled.
+	 */
+	fiber->flags = FIBER_DEFAULT_FLAGS | (fiber->flags & FIBER_IS_RUNNING);
 #if ENABLE_FIBER_TOP
 	clock_stat_reset(&fiber->clock_stat);
 #endif /* ENABLE_FIBER_TOP */
@@ -1449,6 +1459,7 @@ cord_create(struct cord *cord, const char *name)
 	cord->sched.name = NULL;
 	fiber_set_name(&cord->sched, "sched");
 	cord->fiber = &cord->sched;
+	cord->sched.flags |= FIBER_IS_RUNNING;
 
 	cord->max_fid = FIBER_ID_MAX_RESERVED;
 	/*
diff --git a/src/lib/core/fiber.h b/src/lib/core/fiber.h
index b9eeef697..586bc9433 100644
--- a/src/lib/core/fiber.h
+++ b/src/lib/core/fiber.h
@@ -162,6 +162,10 @@ enum {
 	 * This flag is set when fiber uses custom stack size.
 	 */
 	FIBER_CUSTOM_STACK	= 1 << 5,
+	/**
+	 * The flag is set for the fiber currently being executed by the cord.
+	 */
+	FIBER_IS_RUNNING	= 1 << 6,
 	FIBER_DEFAULT_FLAGS = FIBER_IS_CANCELLABLE
 };
 
@@ -279,7 +283,7 @@ API_EXPORT void
 fiber_start(struct fiber *callee, ...);
 
 /**
- * Interrupt a synchronous wait of a fiber
+ * Interrupt a synchronous wait of a fiber. Nop for the currently running fiber.
  *
  * \param f fiber to be woken up
  */
diff --git a/test/app-tap/gh-6043-fiber-wakeup-self.test.lua b/test/app-tap/gh-6043-fiber-wakeup-self.test.lua
new file mode 100755
index 000000000..082c1228d
--- /dev/null
+++ b/test/app-tap/gh-6043-fiber-wakeup-self.test.lua
@@ -0,0 +1,50 @@
+#!/usr/bin/env tarantool
+
+local tap = require('tap')
+local fiber = require('fiber')
+
+--
+-- gh-6043: fiber.wakeup() on self could lead to a crash if followed by
+-- fiber.create(). Also it could lead to spurious wakeup if followed by a sleep
+-- with non-zero timeout, or by, for example, an infinite yield such as WAL
+-- write.
+--
+
+local function test_wakeup_self_and_call(test)
+    test:plan(3)
+
+    local self = fiber.self()
+    fiber.wakeup(self)
+    local count = 0
+    local f = fiber.create(function()
+        count = count + 1
+        fiber.yield()
+        count = count + 1
+    end)
+    test:is(count, 1, 'created a fiber')
+    fiber.yield()
+    test:is(count, 2, 'it yielded')
+    test:is(f:status(), 'dead', 'and ended')
+end
+
+local function test_wakeup_self_and_wal_write(test)
+    test:plan(1)
+
+    local s = box.schema.create_space('test')
+    s:create_index('pk')
+
+    fiber.wakeup(fiber.self())
+    local lsn = box.info.lsn
+    s:replace{1}
+    test:is(box.info.lsn, lsn + 1, 'written to WAL')
+    s:drop()
+end
+
+box.cfg{}
+
+local test = tap.test('gh-6043-fiber-wakeup-self')
+test:plan(2)
+test:test('wakeup self + call', test_wakeup_self_and_call)
+test:test('wakeup self + wal write', test_wakeup_self_and_wal_write)
+
+os.exit(test:check() and 0 or 1)
diff --git a/test/unit/fiber.cc b/test/unit/fiber.cc
index 9c1a23bdd..a1e3ded87 100644
--- a/test/unit/fiber.cc
+++ b/test/unit/fiber.cc
@@ -138,8 +138,7 @@ fiber_join_test()
 	fiber_set_joinable(fiber, true);
 	fiber_wakeup(fiber);
 	/** Let the fiber schedule */
-	fiber_wakeup(fiber());
-	fiber_yield();
+	fiber_reschedule();
 	note("by this time the fiber should be dead already");
 	fiber_cancel(fiber);
 	fiber_join(fiber);
@@ -201,12 +200,42 @@ fiber_name_test()
 	footer();
 }
 
+static void
+fiber_wakeup_self_test()
+{
+	header();
+
+	struct fiber *f = fiber();
+
+	fiber_wakeup(f);
+	double duration = 0.001;
+	uint64_t t1 = fiber_clock64();
+	fiber_sleep(duration);
+	uint64_t t2 = fiber_clock64();
+	/*
+	 * It was a real sleep, not 0 duration. Wakeup is nop on the running
+	 * fiber.
+	 */
+	assert(t2 - t1 >= duration);
+
+	/*
+	 * Wakeup + start of a new fiber. This is different from yield but
+	 * works without crashes too.
+	 */
+	struct fiber *newf = fiber_new_xc("nop", noop_f);
+	fiber_wakeup(f);
+	fiber_start(newf);
+
+	footer();
+}
+
 static int
 main_f(va_list ap)
 {
 	fiber_name_test();
 	fiber_join_test();
 	fiber_stack_test();
+	fiber_wakeup_self_test();
 	ev_break(loop(), EVBREAK_ALL);
 	return 0;
 }
diff --git a/test/unit/fiber.result b/test/unit/fiber.result
index a61e0a2b8..36bf2a599 100644
--- a/test/unit/fiber.result
+++ b/test/unit/fiber.result
@@ -17,3 +17,5 @@ SystemError Failed to allocate 42 bytes in allocator for exception: Cannot alloc
 # normal-stack fiber not crashed
 # big-stack fiber not crashed
 	*** fiber_stack_test: done ***
+	*** fiber_wakeup_self_test ***
+	*** fiber_wakeup_self_test: done ***
diff --git a/test/unit/fiber_stress.cc b/test/unit/fiber_stress.cc
index 44ab12487..aa892d0a0 100644
--- a/test/unit/fiber_stress.cc
+++ b/test/unit/fiber_stress.cc
@@ -9,10 +9,8 @@ enum {
 static int
 yield_f(va_list ap)
 {
-	for (int i = 0; i < ITERATIONS; i++) {
-		fiber_wakeup(fiber());
-		fiber_yield();
-	}
+	for (int i = 0; i < ITERATIONS; i++)
+		fiber_reschedule();
 	return 0;
 }
 
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Tarantool-patches] [PATCH 2/2] fiber: use wakeup safely on self everywhere
  2021-04-23 23:15 [Tarantool-patches] [PATCH 0/2] fiber_wakeup() nop on self Vladislav Shpilevoy via Tarantool-patches
  2021-04-23 23:15 ` [Tarantool-patches] [PATCH 1/2] fiber: make wakeup in Lua and C " Vladislav Shpilevoy via Tarantool-patches
@ 2021-04-23 23:15 ` Vladislav Shpilevoy via Tarantool-patches
  2021-04-25  9:23   ` Serge Petrenko via Tarantool-patches
  2021-04-26 21:56 ` [Tarantool-patches] [PATCH 0/2] fiber_wakeup() nop on self Cyrill Gorcunov via Tarantool-patches
  2021-04-27 12:03 ` Kirill Yukhin via Tarantool-patches
  3 siblings, 1 reply; 9+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-04-23 23:15 UTC (permalink / raw)
  To: tarantool-patches, gorcunov, sergepetrenko

The previous commit made fiber_wakeup() safe to use on the current
fiber. Leverage the new behaviour everywhere in the source code to
remove all checks f != fiber() before fiber_wakeup(f) calls.

Follow-up #5292
---
 src/box/applier.cc  |  4 +---
 src/box/journal.c   |  6 ++++++
 src/box/journal.h   |  7 +++++++
 src/box/raft.c      | 14 +++-----------
 src/box/txn.c       |  6 ++----
 src/box/txn_limbo.c | 14 +-------------
 src/lib/swim/swim.c |  1 -
 7 files changed, 20 insertions(+), 32 deletions(-)

diff --git a/src/box/applier.cc b/src/box/applier.cc
index dc05c91d3..33181fdbf 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -792,9 +792,7 @@ apply_synchro_row_cb(struct journal_entry *entry)
 		txn_limbo_process(&txn_limbo, synchro_entry->req);
 		trigger_run(&replicaset.applier.on_wal_write, NULL);
 	}
-	/* The fiber is the same on final join. */
-	if (synchro_entry->owner != fiber())
-		fiber_wakeup(synchro_entry->owner);
+	fiber_wakeup(synchro_entry->owner);
 }
 
 /** Process a synchro request. */
diff --git a/src/box/journal.c b/src/box/journal.c
index 886a15139..df491610a 100644
--- a/src/box/journal.c
+++ b/src/box/journal.c
@@ -63,6 +63,12 @@ journal_entry_new(size_t n_rows, struct region *region,
 	return entry;
 }
 
+void
+journal_entry_fiber_wakeup_cb(struct journal_entry *entry)
+{
+	fiber_wakeup(entry->complete_data);
+}
+
 void
 journal_queue_wakeup(void)
 {
diff --git a/src/box/journal.h b/src/box/journal.h
index 8f3d56a61..4ab7e8afb 100644
--- a/src/box/journal.h
+++ b/src/box/journal.h
@@ -112,6 +112,13 @@ journal_entry_new(size_t n_rows, struct region *region,
 		  journal_write_async_f write_async_cb,
 		  void *complete_data);
 
+/**
+ * Treat complete_data like a fiber pointer and wake it up when journal write is
+ * done.
+ */
+void
+journal_entry_fiber_wakeup_cb(struct journal_entry *entry);
+
 struct journal_queue {
 	/** Maximal size of entries enqueued in journal (in bytes). */
 	int64_t max_size;
diff --git a/src/box/raft.c b/src/box/raft.c
index 47c869712..6b52c9876 100644
--- a/src/box/raft.c
+++ b/src/box/raft.c
@@ -155,8 +155,7 @@ box_raft_schedule_async(struct raft *raft)
 	 * adapted to this. Also don't wakeup the current fiber - it leads to
 	 * undefined behaviour.
 	 */
-	if ((box_raft_worker->flags & FIBER_IS_CANCELLABLE) != 0 &&
-	    fiber() != box_raft_worker)
+	if ((box_raft_worker->flags & FIBER_IS_CANCELLABLE) != 0)
 		fiber_wakeup(box_raft_worker);
 	box_raft_has_work = true;
 }
@@ -279,13 +278,6 @@ box_raft_broadcast(struct raft *raft, const struct raft_msg *msg)
 		relay_push_raft(replica->relay, &req);
 }
 
-/** Wakeup Raft state writer fiber waiting for WAL write end. */
-static void
-box_raft_write_cb(struct journal_entry *entry)
-{
-	fiber_wakeup(entry->complete_data);
-}
-
 static void
 box_raft_write(struct raft *raft, const struct raft_msg *msg)
 {
@@ -307,8 +299,8 @@ box_raft_write(struct raft *raft, const struct raft_msg *msg)
 
 	if (xrow_encode_raft(&row, region, &req) != 0)
 		goto fail;
-	journal_entry_create(entry, 1, xrow_approx_len(&row), box_raft_write_cb,
-			     fiber());
+	journal_entry_create(entry, 1, xrow_approx_len(&row),
+			     journal_entry_fiber_wakeup_cb, fiber());
 
 	/*
 	 * A non-cancelable fiber is considered non-wake-able, generally. Raft
diff --git a/src/box/txn.c b/src/box/txn.c
index 03b39e0de..07ab131a2 100644
--- a/src/box/txn.c
+++ b/src/box/txn.c
@@ -504,9 +504,7 @@ txn_free_or_wakeup(struct txn *txn)
 		txn_free(txn);
 	else {
 		txn_set_flags(txn, TXN_IS_DONE);
-		if (txn->fiber != fiber())
-			/* Wake a waiting fiber up. */
-			fiber_wakeup(txn->fiber);
+		fiber_wakeup(txn->fiber);
 	}
 }
 
@@ -578,7 +576,7 @@ txn_on_journal_write(struct journal_entry *entry)
 		txn_run_wal_write_triggers(txn);
 	if (!txn_has_flag(txn, TXN_WAIT_SYNC))
 		txn_complete_success(txn);
-	else if (txn->fiber != NULL && txn->fiber != fiber())
+	else if (txn->fiber != NULL)
 		fiber_wakeup(txn->fiber);
 finish:
 	fiber_set_txn(fiber(), NULL);
diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
index c22bd6665..5b23d1bb1 100644
--- a/src/box/txn_limbo.c
+++ b/src/box/txn_limbo.c
@@ -306,18 +306,6 @@ complete:
 	return 0;
 }
 
-/**
- * A callback for synchronous write: txn_limbo_write_synchro fiber
- * waiting to proceed once a record is written to WAL.
- */
-static void
-txn_limbo_write_cb(struct journal_entry *entry)
-{
-	assert(entry->complete_data != NULL);
-	if (fiber() != entry->complete_data)
-		fiber_wakeup(entry->complete_data);
-}
-
 static void
 txn_limbo_write_synchro(struct txn_limbo *limbo, uint16_t type, int64_t lsn,
 			uint64_t term)
@@ -346,7 +334,7 @@ txn_limbo_write_synchro(struct txn_limbo *limbo, uint16_t type, int64_t lsn,
 	xrow_encode_synchro(&row, body, &req);
 
 	journal_entry_create(entry, 1, xrow_approx_len(&row),
-			     txn_limbo_write_cb, fiber());
+			     journal_entry_fiber_wakeup_cb, fiber());
 
 	if (journal_write(entry) != 0 || entry->res < 0) {
 		diag_set(ClientError, ER_WAL_IO);
diff --git a/src/lib/swim/swim.c b/src/lib/swim/swim.c
index 1ecc90414..3955c3130 100644
--- a/src/lib/swim/swim.c
+++ b/src/lib/swim/swim.c
@@ -2219,7 +2219,6 @@ swim_kill_event_handler(struct swim *swim)
 	 * reused.
 	 */
 	swim->event_handler = NULL;
-	fiber_wakeup(f);
 	fiber_cancel(f);
 	fiber_join(f);
 }
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] fiber: make wakeup in Lua and C nop on self
  2021-04-23 23:15 ` [Tarantool-patches] [PATCH 1/2] fiber: make wakeup in Lua and C " Vladislav Shpilevoy via Tarantool-patches
@ 2021-04-25  9:21   ` Serge Petrenko via Tarantool-patches
  2021-04-25 15:53     ` Vladislav Shpilevoy via Tarantool-patches
  0 siblings, 1 reply; 9+ messages in thread
From: Serge Petrenko via Tarantool-patches @ 2021-04-25  9:21 UTC (permalink / raw)
  To: Vladislav Shpilevoy, tarantool-patches, gorcunov



24.04.2021 02:15, Vladislav Shpilevoy пишет:
> fiber.wakeup() in Lua and fiber_wakeup() in C could lead to a
> crash or undefined behaviour when called on the currently running
> fiber.
>
> In particular, if after wakeup was started a new fiber in a
> blocking way (fiber.create() and fiber_start()) it would crash in
> debug build, and lead to unknown results in release.
>
> If after wakeup was made a sleep with non-zero timeout or an
> infinite yield (fiber_yield()), the fiber woke up in the same
> event loop iteration regardless of any timeout or other wakeups.
> It was a spurious wakeup, which is not expected in most of the
> places internally.
>
> The patch makes the wakeup nop on the current fiber making it safe
> to use anywhere.
>
> Closes #5292
> Closes #6043
>
> @TarantoolBot document
> Title: fiber.wakeup() in Lua and fiber_wakeup() in C are nop on self
> In Lua `fiber.wakeup()` being called on the current fiber does not
> do anything, safe to use. The same for `fiber_wakeup()` in C.
> ---

Thanks for the patch!
LGTM, generally, with one test-related comment.


...

> diff --git a/test/app-tap/gh-6043-fiber-wakeup-self.test.lua b/test/app-tap/gh-6043-fiber-wakeup-self.test.lua
> new file mode 100755
> index 000000000..082c1228d
> --- /dev/null
> +++ b/test/app-tap/gh-6043-fiber-wakeup-self.test.lua
> @@ -0,0 +1,50 @@
> +#!/usr/bin/env tarantool
> +
> +local tap = require('tap')
> +local fiber = require('fiber')
> +
> +--
> +-- gh-6043: fiber.wakeup() on self could lead to a crash if followed by
> +-- fiber.create(). Also it could lead to spurious wakeup if followed by a sleep
> +-- with non-zero timeout, or by, for example, an infinite yield such as WAL
> +-- write.
> +--
> +
> +local function test_wakeup_self_and_call(test)
> +    test:plan(3)
> +
> +    local self = fiber.self()
> +    fiber.wakeup(self)
> +    local count = 0
> +    local f = fiber.create(function()
> +        count = count + 1
> +        fiber.yield()
> +        count = count + 1
> +    end)
> +    test:is(count, 1, 'created a fiber')
> +    fiber.yield()
> +    test:is(count, 2, 'it yielded')
> +    test:is(f:status(), 'dead', 'and ended')
> +end
> +
> +local function test_wakeup_self_and_wal_write(test)
> +    test:plan(1)
> +
> +    local s = box.schema.create_space('test')
> +    s:create_index('pk')
> +
> +    fiber.wakeup(fiber.self())
> +    local lsn = box.info.lsn
> +    s:replace{1}
> +    test:is(box.info.lsn, lsn + 1, 'written to WAL')
> +    s:drop()
> +end

Again, shouldn't everything related to space creation, WAL writes and so on
reside in box-tap ? I may be wrong, just asking.

> +
> +box.cfg{}
> +
> +local test = tap.test('gh-6043-fiber-wakeup-self')
> +test:plan(2)
> +test:test('wakeup self + call', test_wakeup_self_and_call)
> +test:test('wakeup self + wal write', test_wakeup_self_and_wal_write)
> +
> +os.exit(test:check() and 0 or 1)
> diff --git a/test/unit/fiber.cc b/test/unit/fiber.cc
> index 9c1a23bdd..a1e3ded87 100644
> --- a/test/unit/fiber.cc
> +++ b/test/unit/fiber.cc
> @@ -138,8 +138,7 @@ fiber_join_test()
>   	fiber_set_joinable(fiber, true);
>   	fiber_wakeup(fiber);
>   	/** Let the fiber schedule */
> -	fiber_wakeup(fiber());
> -	fiber_yield();
> +	fiber_reschedule();
>   	note("by this time the fiber should be dead already");
>   	fiber_cancel(fiber);
>   	fiber_join(fiber);
> @@ -201,12 +200,42 @@ fiber_name_test()
>   	footer();
>   }
>   
> +static void
> +fiber_wakeup_self_test()
> +{
> +	header();
> +
> +	struct fiber *f = fiber();
> +
> +	fiber_wakeup(f);
> +	double duration = 0.001;
> +	uint64_t t1 = fiber_clock64();
> +	fiber_sleep(duration);
> +	uint64_t t2 = fiber_clock64();
> +	/*
> +	 * It was a real sleep, not 0 duration. Wakeup is nop on the running
> +	 * fiber.
> +	 */
> +	assert(t2 - t1 >= duration);
> +
> +	/*
> +	 * Wakeup + start of a new fiber. This is different from yield but
> +	 * works without crashes too.
> +	 */
> +	struct fiber *newf = fiber_new_xc("nop", noop_f);
> +	fiber_wakeup(f);
> +	fiber_start(newf);
> +
> +	footer();
> +}
> +
>   static int
>   main_f(va_list ap)
>   {
>   	fiber_name_test();
>   	fiber_join_test();
>   	fiber_stack_test();
> +	fiber_wakeup_self_test();
>   	ev_break(loop(), EVBREAK_ALL);
>   	return 0;
>   }
> diff --git a/test/unit/fiber.result b/test/unit/fiber.result
> index a61e0a2b8..36bf2a599 100644
> --- a/test/unit/fiber.result
> +++ b/test/unit/fiber.result
> @@ -17,3 +17,5 @@ SystemError Failed to allocate 42 bytes in allocator for exception: Cannot alloc
>   # normal-stack fiber not crashed
>   # big-stack fiber not crashed
>   	*** fiber_stack_test: done ***
> +	*** fiber_wakeup_self_test ***
> +	*** fiber_wakeup_self_test: done ***
> diff --git a/test/unit/fiber_stress.cc b/test/unit/fiber_stress.cc
> index 44ab12487..aa892d0a0 100644
> --- a/test/unit/fiber_stress.cc
> +++ b/test/unit/fiber_stress.cc
> @@ -9,10 +9,8 @@ enum {
>   static int
>   yield_f(va_list ap)
>   {
> -	for (int i = 0; i < ITERATIONS; i++) {
> -		fiber_wakeup(fiber());
> -		fiber_yield();
> -	}
> +	for (int i = 0; i < ITERATIONS; i++)
> +		fiber_reschedule();
>   	return 0;
>   }
>   

-- 
Serge Petrenko


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] fiber: use wakeup safely on self everywhere
  2021-04-23 23:15 ` [Tarantool-patches] [PATCH 2/2] fiber: use wakeup safely on self everywhere Vladislav Shpilevoy via Tarantool-patches
@ 2021-04-25  9:23   ` Serge Petrenko via Tarantool-patches
  0 siblings, 0 replies; 9+ messages in thread
From: Serge Petrenko via Tarantool-patches @ 2021-04-25  9:23 UTC (permalink / raw)
  To: Vladislav Shpilevoy, tarantool-patches, gorcunov



24.04.2021 02:15, Vladislav Shpilevoy пишет:
> The previous commit made fiber_wakeup() safe to use on the current
> fiber. Leverage the new behaviour everywhere in the source code to
> remove all checks f != fiber() before fiber_wakeup(f) calls.
>
> Follow-up #5292
> ---
>


Thanks! LGTM.

-- 
Serge Petrenko


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] fiber: make wakeup in Lua and C nop on self
  2021-04-25  9:21   ` Serge Petrenko via Tarantool-patches
@ 2021-04-25 15:53     ` Vladislav Shpilevoy via Tarantool-patches
  2021-04-26  9:21       ` Serge Petrenko via Tarantool-patches
  0 siblings, 1 reply; 9+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-04-25 15:53 UTC (permalink / raw)
  To: Serge Petrenko, tarantool-patches, gorcunov

Thanks for the review!

>> +
>> +local function test_wakeup_self_and_wal_write(test)
>> +    test:plan(1)
>> +
>> +    local s = box.schema.create_space('test')
>> +    s:create_index('pk')
>> +
>> +    fiber.wakeup(fiber.self())
>> +    local lsn = box.info.lsn
>> +    s:replace{1}
>> +    test:is(box.info.lsn, lsn + 1, 'written to WAL')
>> +    s:drop()
>> +end
> 
> Again, shouldn't everything related to space creation, WAL writes and so on
> reside in box-tap ? I may be wrong, just asking.

It should, but this test is about testing the fibers, not
the box. Box and the WAL write here are secondary things.
The same as for app/fiber.test.lua (not TAP).

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] fiber: make wakeup in Lua and C nop on self
  2021-04-25 15:53     ` Vladislav Shpilevoy via Tarantool-patches
@ 2021-04-26  9:21       ` Serge Petrenko via Tarantool-patches
  0 siblings, 0 replies; 9+ messages in thread
From: Serge Petrenko via Tarantool-patches @ 2021-04-26  9:21 UTC (permalink / raw)
  To: Vladislav Shpilevoy, tarantool-patches, gorcunov



25.04.2021 18:53, Vladislav Shpilevoy пишет:
> Thanks for the review!
>
>>> +
>>> +local function test_wakeup_self_and_wal_write(test)
>>> +    test:plan(1)
>>> +
>>> +    local s = box.schema.create_space('test')
>>> +    s:create_index('pk')
>>> +
>>> +    fiber.wakeup(fiber.self())
>>> +    local lsn = box.info.lsn
>>> +    s:replace{1}
>>> +    test:is(box.info.lsn, lsn + 1, 'written to WAL')
>>> +    s:drop()
>>> +end
>> Again, shouldn't everything related to space creation, WAL writes and so on
>> reside in box-tap ? I may be wrong, just asking.
> It should, but this test is about testing the fibers, not
> the box. Box and the WAL write here are secondary things.
> The same as for app/fiber.test.lua (not TAP).

Thanks for the explanation! LGTM.


-- 
Serge Petrenko


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Tarantool-patches] [PATCH 0/2] fiber_wakeup() nop on self
  2021-04-23 23:15 [Tarantool-patches] [PATCH 0/2] fiber_wakeup() nop on self Vladislav Shpilevoy via Tarantool-patches
  2021-04-23 23:15 ` [Tarantool-patches] [PATCH 1/2] fiber: make wakeup in Lua and C " Vladislav Shpilevoy via Tarantool-patches
  2021-04-23 23:15 ` [Tarantool-patches] [PATCH 2/2] fiber: use wakeup safely on self everywhere Vladislav Shpilevoy via Tarantool-patches
@ 2021-04-26 21:56 ` Cyrill Gorcunov via Tarantool-patches
  2021-04-27 12:03 ` Kirill Yukhin via Tarantool-patches
  3 siblings, 0 replies; 9+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-04-26 21:56 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On Sat, Apr 24, 2021 at 01:15:52AM +0200, Vladislav Shpilevoy wrote:
> The patchset makes fiber_wakeup() in C and fiber.wakeup() in Lua
> nop when called on the current fiber. This fixes a couple of
> crashes in the public API, and prevent spurious wakeups in
> certain cases.
> 
> Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-5292-6043-fiber-wakeup-nop
> Issue: https://github.com/tarantool/tarantool/issues/5292
> Issue: https://github.com/tarantool/tarantool/issues/6043
> 
> Vladislav Shpilevoy (2):
>   fiber: make wakeup in Lua and C nop on self
>   fiber: use wakeup safely on self everywhere

Ack

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Tarantool-patches] [PATCH 0/2] fiber_wakeup() nop on self
  2021-04-23 23:15 [Tarantool-patches] [PATCH 0/2] fiber_wakeup() nop on self Vladislav Shpilevoy via Tarantool-patches
                   ` (2 preceding siblings ...)
  2021-04-26 21:56 ` [Tarantool-patches] [PATCH 0/2] fiber_wakeup() nop on self Cyrill Gorcunov via Tarantool-patches
@ 2021-04-27 12:03 ` Kirill Yukhin via Tarantool-patches
  3 siblings, 0 replies; 9+ messages in thread
From: Kirill Yukhin via Tarantool-patches @ 2021-04-27 12:03 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Hello,

On 24 апр 01:15, Vladislav Shpilevoy via Tarantool-patches wrote:
> The patchset makes fiber_wakeup() in C and fiber.wakeup() in Lua
> nop when called on the current fiber. This fixes a couple of
> crashes in the public API, and prevent spurious wakeups in
> certain cases.
> 
> Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-5292-6043-fiber-wakeup-nop
> Issue: https://github.com/tarantool/tarantool/issues/5292
> Issue: https://github.com/tarantool/tarantool/issues/6043
> 
> Vladislav Shpilevoy (2):
>   fiber: make wakeup in Lua and C nop on self
>   fiber: use wakeup safely on self everywhere

I've checked your patch set into 2.7, 2.8 and master.
Also, 1st patch into 1.10.

--
Regards, Kirill Yukhin

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-04-27 12:03 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-23 23:15 [Tarantool-patches] [PATCH 0/2] fiber_wakeup() nop on self Vladislav Shpilevoy via Tarantool-patches
2021-04-23 23:15 ` [Tarantool-patches] [PATCH 1/2] fiber: make wakeup in Lua and C " Vladislav Shpilevoy via Tarantool-patches
2021-04-25  9:21   ` Serge Petrenko via Tarantool-patches
2021-04-25 15:53     ` Vladislav Shpilevoy via Tarantool-patches
2021-04-26  9:21       ` Serge Petrenko via Tarantool-patches
2021-04-23 23:15 ` [Tarantool-patches] [PATCH 2/2] fiber: use wakeup safely on self everywhere Vladislav Shpilevoy via Tarantool-patches
2021-04-25  9:23   ` Serge Petrenko via Tarantool-patches
2021-04-26 21:56 ` [Tarantool-patches] [PATCH 0/2] fiber_wakeup() nop on self Cyrill Gorcunov via Tarantool-patches
2021-04-27 12:03 ` Kirill Yukhin via Tarantool-patches

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox