Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH 0/2] Fiber storage leak
@ 2019-12-08 19:28 Vladislav Shpilevoy
  2019-12-08 19:28 ` [Tarantool-patches] [PATCH 1/2] fiber: unref fiber.storage via global Lua state Vladislav Shpilevoy
  2019-12-08 19:28 ` [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto Vladislav Shpilevoy
  0 siblings, 2 replies; 26+ messages in thread
From: Vladislav Shpilevoy @ 2019-12-08 19:28 UTC (permalink / raw)
  To: tarantool-patches, kostja.osipov, imun

The patchset makes fiber.storage be always deleted. Regardless of
where was it created - in a fiber born in the Lua land, or in a
fiber serving IProto requests and accessing Lua.

That removes

  - a leak occurred each time when fiber storage was created for
    an IProto request;

  - a possibility to see previous request's artifacts left in
    fiber.storage;

And makes possible to use fiber.storage as request-local data.

Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-4662-fiber-storage-leak
Issue: https://github.com/tarantool/tarantool/issues/4662
Issue: https://github.com/tarantool/tarantool/issues/3462

Vladislav Shpilevoy (2):
  fiber: unref fiber.storage via global Lua state
  fiber: destroy fiber.storage created by iproto

 src/box/session.cc                           | 11 ++-
 src/box/session.h                            |  7 +-
 src/box/txn.c                                | 13 +--
 src/box/txn.h                                |  7 +-
 src/lib/core/fiber.c                         | 23 +++--
 src/lib/core/fiber.h                         | 13 ++-
 src/lib/core/fiber_pool.c                    |  6 ++
 src/lua/fiber.c                              | 35 ++++++--
 test/app/gh-4662-fiber-storage-leak.result   | 88 ++++++++++++++++++++
 test/app/gh-4662-fiber-storage-leak.test.lua | 43 ++++++++++
 10 files changed, 217 insertions(+), 29 deletions(-)
 create mode 100644 test/app/gh-4662-fiber-storage-leak.result
 create mode 100644 test/app/gh-4662-fiber-storage-leak.test.lua

-- 
2.21.0 (Apple Git-122.2)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Tarantool-patches] [PATCH 1/2] fiber: unref fiber.storage via global Lua state
  2019-12-08 19:28 [Tarantool-patches] [PATCH 0/2] Fiber storage leak Vladislav Shpilevoy
@ 2019-12-08 19:28 ` Vladislav Shpilevoy
  2019-12-11 23:57   ` Igor Munkin
  2019-12-08 19:28 ` [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto Vladislav Shpilevoy
  1 sibling, 1 reply; 26+ messages in thread
From: Vladislav Shpilevoy @ 2019-12-08 19:28 UTC (permalink / raw)
  To: tarantool-patches, kostja.osipov, imun

Fiber.storage is a table, available from anywhere in the fiber. It
is destroyed after fiber function is finished. That provides a
reliable fiber-local storage, similar to thread-local in C/C++.

But there is a problem that the storage may be created via one
struct lua_State, and destroyed via another. Here is an example:

    function test_storage()
        fiber.self().storage.key = 100
    end
    box.schema.func.create('test_storage')
    _ = fiber.create(function()
        box.func.test_storage:call()
    end)

There are 3 struct lua_State:
    tarantool_L - global always alive Lua state;
    L1 - stack of the fiber, created by fiber.create();
    L2 - stack created by that fiber to execute test_storage().

Fiber.storage is created on stack of L2 and referenced by global
LUA_REGISTRYINDEX. Then it is unreferenced from L1 when the fiber
is being destroyed.

That is generally ok as soon as the storage object is always in
LUA_REGISTRYINDEX, which is shared by all Lua states.

But soon during destruction of the fiber.storage there will be
only tarantool_L and the original L2. Original L2 may be already
deleted by the time the storage is being destroyed. So this patch
makes unref of the storage via reliable tarantool_L.

Needed for #4662
---
 src/lua/fiber.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/lua/fiber.c b/src/lua/fiber.c
index 53ebec9aa..a7b75f9bf 100644
--- a/src/lua/fiber.c
+++ b/src/lua/fiber.c
@@ -445,7 +445,7 @@ lua_fiber_run_f(MAYBE_UNUSED va_list ap)
 	/* Destroy local storage */
 	int storage_ref = f->storage.lua.ref;
 	if (storage_ref > 0)
-		luaL_unref(L, LUA_REGISTRYINDEX, storage_ref);
+		luaL_unref(tarantool_L, LUA_REGISTRYINDEX, storage_ref);
 	/*
 	 * If fiber is not joinable
 	 * We can unref child stack here,
-- 
2.21.0 (Apple Git-122.2)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto
  2019-12-08 19:28 [Tarantool-patches] [PATCH 0/2] Fiber storage leak Vladislav Shpilevoy
  2019-12-08 19:28 ` [Tarantool-patches] [PATCH 1/2] fiber: unref fiber.storage via global Lua state Vladislav Shpilevoy
@ 2019-12-08 19:28 ` Vladislav Shpilevoy
  2019-12-09  7:21   ` Konstantin Osipov
  2019-12-10  8:32   ` Konstantin Osipov
  1 sibling, 2 replies; 26+ messages in thread
From: Vladislav Shpilevoy @ 2019-12-08 19:28 UTC (permalink / raw)
  To: tarantool-patches, kostja.osipov, imun

Fiber.storage was not deleted when created in a fiber started from
the thread pool used by IProto requests. The problem was that
fiber.storage was created and deleted in Lua land only, assuming
that only Lua-born fibers could have it. But in fact any fiber can
create a Lua storage. Including the ones used to serve IProto
requests.

Not deletion of the storage led to a possibility of meeting a
non-empty fiber.storage in the beginning of an iproto request, and
not deletion of the memory caught by the storage until its
explicit nullification.

Now the storage destructor works for any fiber, which managed to
create the storage. The destructor unrefs and nullifies the
storage.

For destructor purposes the fiber.on_stop triggers were reworked.
Now they are on_cleanup triggers, and can be called multiple times
during fiber's lifetime. After every request done by that fiber.

Closes #4662
Closes #3462
---
 src/box/session.cc                           | 11 ++-
 src/box/session.h                            |  7 +-
 src/box/txn.c                                | 13 +--
 src/box/txn.h                                |  7 +-
 src/lib/core/fiber.c                         | 23 +++--
 src/lib/core/fiber.h                         | 13 ++-
 src/lib/core/fiber_pool.c                    |  6 ++
 src/lua/fiber.c                              | 35 ++++++--
 test/app/gh-4662-fiber-storage-leak.result   | 88 ++++++++++++++++++++
 test/app/gh-4662-fiber-storage-leak.test.lua | 43 ++++++++++
 10 files changed, 217 insertions(+), 29 deletions(-)
 create mode 100644 test/app/gh-4662-fiber-storage-leak.result
 create mode 100644 test/app/gh-4662-fiber-storage-leak.test.lua

diff --git a/src/box/session.cc b/src/box/session.cc
index 461d1cf25..a08043986 100644
--- a/src/box/session.cc
+++ b/src/box/session.cc
@@ -81,7 +81,7 @@ sid_max()
 }
 
 static int
-session_on_stop(struct trigger *trigger, void * /* event */)
+session_on_fiber_cleanup(struct trigger *trigger, void * /* event */)
 {
 	/*
 	 * Remove on_stop trigger from the fiber, otherwise the
@@ -167,11 +167,10 @@ session_create_on_demand()
 	struct session *s = session_create(SESSION_TYPE_BACKGROUND);
 	if (s == NULL)
 		return NULL;
-	s->fiber_on_stop = {
-		RLIST_LINK_INITIALIZER, session_on_stop, NULL, NULL
-	};
-	/* Add a trigger to destroy session on fiber stop */
-	trigger_add(&fiber()->on_stop, &s->fiber_on_stop);
+	trigger_create(&s->fiber_on_cleanup, session_on_fiber_cleanup,
+		       NULL, NULL);
+	/* Add a trigger to destroy session on fiber cleanup. */
+	trigger_add(&fiber()->on_cleanup, &s->fiber_on_cleanup);
 	credentials_reset(&s->credentials, admin_user);
 	fiber_set_session(fiber(), s);
 	fiber_set_user(fiber(), &s->credentials);
diff --git a/src/box/session.h b/src/box/session.h
index eff3d7a67..710e4919b 100644
--- a/src/box/session.h
+++ b/src/box/session.h
@@ -103,8 +103,11 @@ struct session {
 	union session_meta meta;
 	/** Session user id and global grants */
 	struct credentials credentials;
-	/** Trigger for fiber on_stop to cleanup created on-demand session */
-	struct trigger fiber_on_stop;
+	/**
+	 * Trigger for fiber on_cleanup to cleanup created
+	 * on-demand session.
+	 */
+	struct trigger fiber_on_cleanup;
 };
 
 struct session_vtab {
diff --git a/src/box/txn.c b/src/box/txn.c
index 963ec8eeb..7d549a6a9 100644
--- a/src/box/txn.c
+++ b/src/box/txn.c
@@ -41,7 +41,7 @@ double too_long_threshold;
 static struct stailq txn_cache = {NULL, &txn_cache.first};
 
 static int
-txn_on_stop(struct trigger *trigger, void *event);
+txn_on_fiber_cleanup(struct trigger *trigger, void *event);
 
 static int
 txn_on_yield(struct trigger *trigger, void *event);
@@ -226,8 +226,9 @@ txn_begin()
 	txn->fiber = NULL;
 	fiber_set_txn(fiber(), txn);
 	/* fiber_on_yield is initialized by engine on demand */
-	trigger_create(&txn->fiber_on_stop, txn_on_stop, NULL, NULL);
-	trigger_add(&fiber()->on_stop, &txn->fiber_on_stop);
+	trigger_create(&txn->fiber_on_cleanup, txn_on_fiber_cleanup,
+		       NULL, NULL);
+	trigger_add(&fiber()->on_cleanup, &txn->fiber_on_cleanup);
 	/*
 	 * By default all transactions may yield.
 	 * It's a responsibility of an engine to disable yields
@@ -550,7 +551,7 @@ txn_prepare(struct txn *txn)
 		if (engine_prepare(txn->engine, txn) != 0)
 			return -1;
 	}
-	trigger_clear(&txn->fiber_on_stop);
+	trigger_clear(&txn->fiber_on_cleanup);
 	if (!txn_has_flag(txn, TXN_CAN_YIELD))
 		trigger_clear(&txn->fiber_on_yield);
 	return 0;
@@ -618,7 +619,7 @@ void
 txn_rollback(struct txn *txn)
 {
 	assert(txn == in_txn());
-	trigger_clear(&txn->fiber_on_stop);
+	trigger_clear(&txn->fiber_on_cleanup);
 	if (!txn_has_flag(txn, TXN_CAN_YIELD))
 		trigger_clear(&txn->fiber_on_yield);
 	txn->signature = -1;
@@ -840,7 +841,7 @@ txn_savepoint_release(struct txn_savepoint *svp)
 }
 
 static int
-txn_on_stop(struct trigger *trigger, void *event)
+txn_on_fiber_cleanup(struct trigger *trigger, void *event)
 {
 	(void) trigger;
 	(void) event;
diff --git a/src/box/txn.h b/src/box/txn.h
index da12feebf..755c464f0 100644
--- a/src/box/txn.h
+++ b/src/box/txn.h
@@ -207,10 +207,11 @@ struct txn {
 	 */
 	struct trigger fiber_on_yield;
 	/**
-	 * Trigger on fiber stop, to rollback transaction
-	 * in case a fiber stops (all engines).
+	 * Trigger on fiber cleanup, to rollback transaction
+	 * in case a fiber is getting reset/recycled/destroyed
+	 * (all engines).
 	 */
-	struct trigger fiber_on_stop;
+	struct trigger fiber_on_cleanup;
 	/** Commit and rollback triggers. */
 	struct rlist on_commit, on_rollback;
 	/**
diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
index 00ae8cded..db8fdd5fe 100644
--- a/src/lib/core/fiber.c
+++ b/src/lib/core/fiber.c
@@ -325,6 +325,20 @@ fiber_attr_getstacksize(struct fiber_attr *fiber_attr)
 				    fiber_attr_default.stack_size;
 }
 
+void
+fiber_cleanup(struct fiber *f)
+{
+	if (trigger_run(&f->on_cleanup, f) != 0)
+		panic("Fiber cleanup can't fail");
+	/*
+	 * Double call of cleanup routines is a bad idea. It is
+	 * like calling a destructor 2 times. On_cleanup can be
+	 * called only once until the fiber is reused again, and
+	 * new triggers are set.
+	 */
+	trigger_destroy(&f->on_cleanup);
+}
+
 static void
 fiber_recycle(struct fiber *fiber);
 
@@ -787,7 +801,7 @@ static void
 fiber_reset(struct fiber *fiber)
 {
 	rlist_create(&fiber->on_yield);
-	rlist_create(&fiber->on_stop);
+	rlist_create(&fiber->on_cleanup);
 	fiber->flags = FIBER_DEFAULT_FLAGS;
 #if ENABLE_FIBER_TOP
 	clock_stat_reset(&fiber->clock_stat);
@@ -856,8 +870,7 @@ fiber_loop(MAYBE_UNUSED void *data)
 		       assert(f != fiber);
 		       fiber_wakeup(f);
 	        }
-		if (! rlist_empty(&fiber->on_stop))
-			trigger_run(&fiber->on_stop, fiber);
+	        fiber_cleanup(fiber);
 		/* reset pending wakeups */
 		rlist_del(&fiber->state);
 		if (! (fiber->flags & FIBER_IS_JOINABLE))
@@ -1161,7 +1174,7 @@ fiber_destroy(struct cord *cord, struct fiber *f)
 	assert(f != &cord->sched);
 
 	trigger_destroy(&f->on_yield);
-	trigger_destroy(&f->on_stop);
+	trigger_destroy(&f->on_cleanup);
 	rlist_del(&f->state);
 	rlist_del(&f->link);
 	region_destroy(&f->gc);
@@ -1555,7 +1568,7 @@ cord_costart_thread_func(void *arg)
 	 * Got to be in a trigger, to break the loop even
 	 * in case of an exception.
 	 */
-	trigger_add(&f->on_stop, &break_ev_loop);
+	trigger_add(&f->on_cleanup, &break_ev_loop);
 	fiber_set_joinable(f, true);
 	fiber_start(f, ctx.arg);
 	if (!fiber_is_dead(f)) {
diff --git a/src/lib/core/fiber.h b/src/lib/core/fiber.h
index c5b975513..4e21b9999 100644
--- a/src/lib/core/fiber.h
+++ b/src/lib/core/fiber.h
@@ -458,8 +458,13 @@ struct fiber {
 
 	/** Triggers invoked before this fiber yields. Must not throw. */
 	struct rlist on_yield;
-	/** Triggers invoked before this fiber stops.  Must not throw. */
-	struct rlist on_stop;
+	/**
+	 * Triggers invoked before this fiber is
+	 * stopped/reset/recycled/destroyed/reused. In other
+	 * words, each time when the fiber has finished execution
+	 * of a request.
+	 */
+	struct rlist on_cleanup;
 	/**
 	 * The list of fibers awaiting for this fiber's timely
 	 * (or untimely) death.
@@ -506,6 +511,10 @@ struct fiber {
 	char name[FIBER_NAME_MAX + 1];
 };
 
+/** Invoke cleanup triggers and delete them. */
+void
+fiber_cleanup(struct fiber *f);
+
 struct cord_on_exit;
 
 /**
diff --git a/src/lib/core/fiber_pool.c b/src/lib/core/fiber_pool.c
index 77f89c9fa..31f366fac 100644
--- a/src/lib/core/fiber_pool.c
+++ b/src/lib/core/fiber_pool.c
@@ -62,6 +62,12 @@ restart:
 			assert(f->caller->caller == &cord->sched);
 		}
 		cmsg_deliver(msg);
+		/*
+		 * Different messages = different requests. They
+		 * should not affect each other. This is why
+		 * cleanup is done here.
+		 */
+		fiber_cleanup(f);
 	}
 	/** Put the current fiber into a fiber cache. */
 	if (!fiber_is_cancelled(fiber()) && (msg != NULL ||
diff --git a/src/lua/fiber.c b/src/lua/fiber.c
index a7b75f9bf..cba625082 100644
--- a/src/lua/fiber.c
+++ b/src/lua/fiber.c
@@ -441,11 +441,6 @@ lua_fiber_run_f(MAYBE_UNUSED va_list ap)
 	int coro_ref = lua_tointeger(L, -1);
 	lua_pop(L, 1);
 	result = luaT_call(L, lua_gettop(L) - 1, LUA_MULTRET);
-
-	/* Destroy local storage */
-	int storage_ref = f->storage.lua.ref;
-	if (storage_ref > 0)
-		luaL_unref(tarantool_L, LUA_REGISTRYINDEX, storage_ref);
 	/*
 	 * If fiber is not joinable
 	 * We can unref child stack here,
@@ -606,12 +601,42 @@ lbox_fiber_name(struct lua_State *L)
 	}
 }
 
+/**
+ * Trigger invoked when the fiber has finished execution of its
+ * current request. Only purpose - delete storage.lua.ref keeping
+ * a reference of Lua fiber.storage object. Unlike Lua stack,
+ * Lua fiber storage may be created not only for fibers born from
+ * Lua land. For example, an IProto request may execute a Lua
+ * function, which can create the storage. Trigger guarantees,
+ * that even for non-Lua fibers the Lua storage is destroyed.
+ */
+static int
+lbox_fiber_on_cleanup(struct trigger *trigger, void *event)
+{
+	(void) trigger;
+	struct fiber *f = (struct fiber *) event;
+	int storage_ref = f->storage.lua.ref;
+	assert(storage_ref > 0);
+	luaL_unref(tarantool_L, LUA_REGISTRYINDEX, storage_ref);
+	f->storage.lua.ref = 0;
+	return 0;
+}
+
 static int
 lbox_fiber_storage(struct lua_State *L)
 {
 	struct fiber *f = lbox_checkfiber(L, 1);
 	int storage_ref = f->storage.lua.ref;
 	if (storage_ref <= 0) {
+		struct trigger *t = (struct trigger *)
+			malloc(sizeof(*t));
+		if (t == NULL) {
+			diag_set(OutOfMemory, sizeof(*t), "malloc", "t");
+			return luaT_error(L);
+		}
+		trigger_create(t, lbox_fiber_on_cleanup, NULL,
+			       (trigger_f0) free);
+		trigger_add(&f->on_cleanup, t);
 		lua_newtable(L); /* create local storage on demand */
 		storage_ref = luaL_ref(L, LUA_REGISTRYINDEX);
 		f->storage.lua.ref = storage_ref;
diff --git a/test/app/gh-4662-fiber-storage-leak.result b/test/app/gh-4662-fiber-storage-leak.result
new file mode 100644
index 000000000..4ade017a4
--- /dev/null
+++ b/test/app/gh-4662-fiber-storage-leak.result
@@ -0,0 +1,88 @@
+-- test-run result file version 2
+fiber = require('fiber')
+ | ---
+ | ...
+netbox = require('net.box')
+ | ---
+ | ...
+
+--
+-- gh-4662: fiber.storage was not deleted when created in a fiber
+-- started from the thread pool used by IProto requests. The
+-- problem was that fiber.storage was created and deleted in Lua
+-- land only, assuming that only Lua-born fibers could have it.
+-- But in fact any fiber can create a Lua storage. Including the
+-- ones used to serve IProto requests.
+-- The test checks if fiber.storage is really deleted, and is not
+-- shared between requests.
+--
+
+box.schema.user.grant('guest', 'execute', 'universe')
+ | ---
+ | ...
+storage = nil
+ | ---
+ | ...
+i = 0
+ | ---
+ | ...
+weak_table = setmetatable({}, {__mode = 'v'})
+ | ---
+ | ...
+object = {'object'}
+ | ---
+ | ...
+weak_table.object = object
+ | ---
+ | ...
+function ref_object_in_fiber()                  \
+    storage = fiber.self().storage              \
+    assert(next(storage) == nil)                \
+    i = i + 1                                   \
+    fiber.self().storage.key = i                \
+    fiber.self().storage.object = object        \
+end
+ | ---
+ | ...
+
+c = netbox.connect(box.cfg.listen)
+ | ---
+ | ...
+c:call('ref_object_in_fiber') c:call('ref_object_in_fiber')
+ | ---
+ | ...
+storage
+ | ---
+ | - key: 2
+ |   object:
+ |   - object
+ | ...
+i
+ | ---
+ | - 2
+ | ...
+object = nil
+ | ---
+ | ...
+storage = nil
+ | ---
+ | ...
+collectgarbage('collect')
+ | ---
+ | - 0
+ | ...
+-- The weak table should be empty, because the only two hard
+-- references were in the fibers used to serve
+-- ref_object_in_fiber() requests. And their storages should be
+-- cleaned up.
+weak_table
+ | ---
+ | - []
+ | ...
+
+c:close()
+ | ---
+ | ...
+box.schema.user.revoke('guest', 'execute', 'universe')
+ | ---
+ | ...
diff --git a/test/app/gh-4662-fiber-storage-leak.test.lua b/test/app/gh-4662-fiber-storage-leak.test.lua
new file mode 100644
index 000000000..25acf5679
--- /dev/null
+++ b/test/app/gh-4662-fiber-storage-leak.test.lua
@@ -0,0 +1,43 @@
+fiber = require('fiber')
+netbox = require('net.box')
+
+--
+-- gh-4662: fiber.storage was not deleted when created in a fiber
+-- started from the thread pool used by IProto requests. The
+-- problem was that fiber.storage was created and deleted in Lua
+-- land only, assuming that only Lua-born fibers could have it.
+-- But in fact any fiber can create a Lua storage. Including the
+-- ones used to serve IProto requests.
+-- The test checks if fiber.storage is really deleted, and is not
+-- shared between requests.
+--
+
+box.schema.user.grant('guest', 'execute', 'universe')
+storage = nil
+i = 0
+weak_table = setmetatable({}, {__mode = 'v'})
+object = {'object'}
+weak_table.object = object
+function ref_object_in_fiber()                  \
+    storage = fiber.self().storage              \
+    assert(next(storage) == nil)                \
+    i = i + 1                                   \
+    fiber.self().storage.key = i                \
+    fiber.self().storage.object = object        \
+end
+
+c = netbox.connect(box.cfg.listen)
+c:call('ref_object_in_fiber') c:call('ref_object_in_fiber')
+storage
+i
+object = nil
+storage = nil
+collectgarbage('collect')
+-- The weak table should be empty, because the only two hard
+-- references were in the fibers used to serve
+-- ref_object_in_fiber() requests. And their storages should be
+-- cleaned up.
+weak_table
+
+c:close()
+box.schema.user.revoke('guest', 'execute', 'universe')
-- 
2.21.0 (Apple Git-122.2)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto
  2019-12-08 19:28 ` [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto Vladislav Shpilevoy
@ 2019-12-09  7:21   ` Konstantin Osipov
  2019-12-09 23:31     ` Vladislav Shpilevoy
  2019-12-10  8:32   ` Konstantin Osipov
  1 sibling, 1 reply; 26+ messages in thread
From: Konstantin Osipov @ 2019-12-09  7:21 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

* Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [19/12/08 23:47]:

The danger of reporting, self-assigning and fixing a bug is that
... it might be not a bug.

Users should know that iproto fibers are pooled, not
created/destroyed on demand. So fiber local storage is simply not
making any sense for them.

A proper fix would be to disable fiber local storage for these
fibers, not try to make it work.

Even though your patch "works", it is pretty much useless.

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto
  2019-12-09  7:21   ` Konstantin Osipov
@ 2019-12-09 23:31     ` Vladislav Shpilevoy
  2019-12-10  8:21       ` Konstantin Osipov
  0 siblings, 1 reply; 26+ messages in thread
From: Vladislav Shpilevoy @ 2019-12-09 23:31 UTC (permalink / raw)
  To: Konstantin Osipov, tarantool-patches, imun

Hi! Thanks for the review!

Disclaimer: in case something looks 'toxic', this is an
accident, and is not done intentionally.

On 09/12/2019 08:21, Konstantin Osipov wrote:
> * Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [19/12/08 23:47]:
> 
> The danger of reporting, self-assigning and fixing a bug is that
> ... it might be not a bug.

I agree. Except when something is obviously a bug which is
clearly the case.

> 
> Users should know that iproto fibers are pooled, not
> created/destroyed on demand. So fiber local storage is simply not
> making any sense for them.

Here I disagree. Users should not know anything about whether
the fibers are pooled, when, and how.

If so, then please, give an example, when such a strange
fiber.storage is needed? I just really can't come up with a
usage case.

I did a public poll to understand what a behaviour is expected,
and not a single person expects, that pooled vs not pooled fiber
would affect its behaviour. Pool is like a cache. It should not
be visible. You literally said it in another thread about SQL
PREPARE.

> 
> A proper fix would be to disable fiber local storage for these
> fibers, not try to make it work.

That would lead to a problem, that functions, using fiber.storage,
would work properly being called from a background fiber, but would
fail being called from an iproto-used fiber. It is clearly not a
fix, but rather a much worse bug, sorry.

> 
> Even though your patch "works", it is pretty much useless.
> 

Well, sorry, with all respect, but I disagree here too. Appeared,
there were repetitive requests to fix this weird behaviour of
fiber.storage a couple of years ago, and again recently, which were
successfully rejected by you and by the same reason, that pooling,
as you think, should be exposed.

My patch fixes that problem, what makes it not useless.

Moreover, that fixes another problem about pooled fibers never
freeing the storages (= leak) and therefore occupying Lua
memory, which is very limited in size. That may become a real
problem, when there are many pooled fibers. Their count may be
thousands, and depends on net_msg_max.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto
  2019-12-09 23:31     ` Vladislav Shpilevoy
@ 2019-12-10  8:21       ` Konstantin Osipov
  0 siblings, 0 replies; 26+ messages in thread
From: Konstantin Osipov @ 2019-12-10  8:21 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

* Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [19/12/10 10:22]:
> > The danger of reporting, self-assigning and fixing a bug is that
> > ... it might be not a bug.
> 
> I agree. Except when something is obviously a bug which is
> clearly the case.
> 
> > 
> > Users should know that iproto fibers are pooled, not
> > created/destroyed on demand. So fiber local storage is simply not
> > making any sense for them.
> 
> Here I disagree. Users should not know anything about whether
> the fibers are pooled, when, and how.
> 
> If so, then please, give an example, when such a strange
> fiber.storage is needed? I just really can't come up with a
> usage case.
> 
> I did a public poll to understand what a behaviour is expected,
> and not a single person expects, that pooled vs not pooled fiber
> would affect its behaviour. Pool is like a cache. It should not
> be visible. You literally said it in another thread about SQL
> PREPARE.

You had a discussion with the community on the channel and many
people favour your fix. This is how you handle this! 

I will send a review.

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto
  2019-12-08 19:28 ` [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto Vladislav Shpilevoy
  2019-12-09  7:21   ` Konstantin Osipov
@ 2019-12-10  8:32   ` Konstantin Osipov
  2019-12-10 22:59     ` Vladislav Shpilevoy
  1 sibling, 1 reply; 26+ messages in thread
From: Konstantin Osipov @ 2019-12-10  8:32 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

* Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [19/12/08 23:47]:
>  static int
> -session_on_stop(struct trigger *trigger, void * /* event */)
> +session_on_fiber_cleanup(struct trigger *trigger, void * /* event */)

Please come up with a better name, and provide a comment.

> -	trigger_add(&fiber()->on_stop, &s->fiber_on_stop);
> +	trigger_create(&s->fiber_on_cleanup, session_on_fiber_cleanup,
> +		       NULL, NULL);
> +	/* Add a trigger to destroy session on fiber cleanup. */
> +	trigger_add(&fiber()->on_cleanup, &s->fiber_on_cleanup);

I don't understand why these two concerns are intermixed here.
Why is this code part of session now? 

I think if you change the semantics of the trigger, you should 
move setting/resetting it to the place which controls the relevant
object and its life cycle. Specifically, if you change
fiber.storage semantics to be per-request *or* per Lua fiber,
you should move the trigger to lua fiber create/destroy +
iproto request processing start/end. Instead you move it
to the core library, which is oblivious of these events. 
The way the core is implemented may change, and whoever is
changing it will have to be aware of the legacy behaviour you make
it depend on.

> index eff3d7a67..710e4919b 100644
> --- a/src/box/session.h
> +++ b/src/box/session.h
> @@ -103,8 +103,11 @@ struct session {
>  	union session_meta meta;
>  	/** Session user id and global grants */
>  	struct credentials credentials;
> -	/** Trigger for fiber on_stop to cleanup created on-demand session */
> -	struct trigger fiber_on_stop;
> +	/**
> +	 * Trigger for fiber on_cleanup to cleanup created
> +	 * on-demand session.
> +	 */
> +	struct trigger fiber_on_cleanup;
...

> -	trigger_clear(&txn->fiber_on_stop);
> +	trigger_clear(&txn->fiber_on_cleanup);

Please see the above comment. Fiber cleanup trigger, as defined
now, does not have anything to do with the session, or fiber, or
transaction.
It belongs to the request processing layer and should be
set/cleared from it.

> -	struct rlist on_stop;
> +	/**
> +	 * Triggers invoked before this fiber is
> +	 * stopped/reset/recycled/destroyed/reused. In other
> +	 * words, each time when the fiber has finished execution
> +	 * of a request.
> +	 */
> +	struct rlist on_cleanup;

Please make the comment describing the timing when this
trigger is called less broad and more accurate, calling out
the specific places when the trigger is called, and call
the trigger accordingly.
on_request_or_fiber_end?

> +/** Invoke cleanup triggers and delete them. */
> +void
> +fiber_cleanup(struct fiber *f);

The call for a better name for the event and the accompanying 
action applies to all of the new api
methods this patch introduces.

>  		cmsg_deliver(msg);
> +		/*
> +		 * Different messages = different requests. They
> +		 * should not affect each other. This is why
> +		 * cleanup is done here.
> +		 */
> +		fiber_cleanup(f);

Ugh. I think this is a layering violation, and the cleanup
call should be somewhere inside cmsg_deliver for the relevant
messages. The rlist with cleanup triggers is better stored in
a fiber key.

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto
  2019-12-10  8:32   ` Konstantin Osipov
@ 2019-12-10 22:59     ` Vladislav Shpilevoy
  2019-12-11  7:08       ` Konstantin Osipov
  0 siblings, 1 reply; 26+ messages in thread
From: Vladislav Shpilevoy @ 2019-12-10 22:59 UTC (permalink / raw)
  To: Konstantin Osipov, tarantool-patches

Hi! Thanks for the review!

On 10/12/2019 09:32, Konstantin Osipov wrote:
> * Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [19/12/08 23:47]:
>>  static int
>> -session_on_stop(struct trigger *trigger, void * /* event */)
>> +session_on_fiber_cleanup(struct trigger *trigger, void * /* event */)
> 
> Please come up with a better name, and provide a comment.
> 

Well, I don't know a better name. IMO this name perfectly
describes what is happening. 'Cleanup' means that the
fiber is getting cleaned up. Quite straightforward
explanation. If you have other ideas - please, bring them
up, and lets discuss.

I was thinking about 'on_reload', 'on_reuse', 'on_retire',
'on_reset', 'on_request_end', 'on_destroy'. Or keep 'on_stop'.
Reload - bad, because it is also called when the fiber goes
  to the dead list. It is not reload.
Reuse - bad by the same reason.
Retire - bad, because basically = death, but the fiber may be
  getting reused.
Reset - bad, too common.
Request_end - bad, because some fibers don't serve requests.
  For example, single fibers of background threads.
Stop/destroy - bad, because would be called after each
  request, and that != fiber stop or destruction.

Waiting for your feedback. I really don't know what an
other name to use.

Comment update:

===================================================================
+/**
+ * Destroy session of a background fiber when the
+ * fiber is getting destroyed.
+ */
 static int
 session_on_fiber_cleanup(struct trigger *trigger, void * /* event */)
 {
===================================================================

Even though I think that it is useless here. The function purpose
is obvious already.

>> -	trigger_add(&fiber()->on_stop, &s->fiber_on_stop);
>> +	trigger_create(&s->fiber_on_cleanup, session_on_fiber_cleanup,
>> +		       NULL, NULL);
>> +	/* Add a trigger to destroy session on fiber cleanup. */
>> +	trigger_add(&fiber()->on_cleanup, &s->fiber_on_cleanup);
> 
> I don't understand why these two concerns are intermixed here.
> Why is this code part of session now? 

Looks like you didn't read the patch carefully enough. This
piece of code is kept as is. I just used a proper
constructor for trigger object instead of assigning its
members manually. And added a comment. That is all. I would
not change this hunk at all, if no the 'on_stop' -> 'on_cleanup'
rename.

Talking of why is that code part of the session, *and always
was* - I guess so as to destroy only local sessions. We
can't just destroy any session in struct fiber.storage.session
when the fiber is being destroyed or reused, right in fiber.c.
Because remote sessions live longer than their fibers. This is
why local sessions schedule their own destruction via
on_stop/on_cleanup trigger in the fiber. And remote sessions
don't.

Although that problem could be solved by reference counting.
Anyway, that is not a subject to discuss in scope of that
issue.

> 
> I think if you change the semantics of the trigger, you should 
> move setting/resetting it to the place which controls the relevant
> object and its life cycle. Specifically, if you change
> fiber.storage semantics to be per-request *or* per Lua fiber,
> you should move the trigger to lua fiber create/destroy +
> iproto request processing start/end.

Well, this is not about Lua only. Background sessions can be
created from C API too. I could patch both Lua lbox_fiber_create
and API_EXPORT fiber_new(), but again certainly not here. I
would like to keep these not related changes away from the
problem of storage leaking and garbaging. I would be happy to
fix them separately.

Btw, Lua fiber storage in that patch is created and destroyed
in Lua part. So it conforms to your vision of how the trigger
should be used.

> Instead you move it
> to the core library, which is oblivious of these events. 

Nope. I don't move anything anywhere. Everything is kept
where it was.

> The way the core is implemented may change, and whoever is
> changing it will have to be aware of the legacy behaviour you make
> it depend on.
> 
>> index eff3d7a67..710e4919b 100644
>> --- a/src/box/session.h
>> +++ b/src/box/session.h
>> @@ -103,8 +103,11 @@ struct session {
>>  	union session_meta meta;
>>  	/** Session user id and global grants */
>>  	struct credentials credentials;
>> -	/** Trigger for fiber on_stop to cleanup created on-demand session */
>> -	struct trigger fiber_on_stop;
>> +	/**
>> +	 * Trigger for fiber on_cleanup to cleanup created
>> +	 * on-demand session.
>> +	 */
>> +	struct trigger fiber_on_cleanup;
> ...
> 
>> -	trigger_clear(&txn->fiber_on_stop);
>> +	trigger_clear(&txn->fiber_on_cleanup);
> 
> Please see the above comment. Fiber cleanup trigger, as defined
> now, does not have anything to do with the session, or fiber, or
> transaction.

But it does. As I said, my patch does not change anything here.
Txn and local session were destroyed in fiber cleanup, and they
still are.

> It belongs to the request processing layer and should be
> set/cleared from it.

Fiber loop and pool are the request processing layers. They process
all requests, local and remote. So cleanup was and still is
invoked there. But see my last comment below.

>> -	struct rlist on_stop;
>> +	/**
>> +	 * Triggers invoked before this fiber is
>> +	 * stopped/reset/recycled/destroyed/reused. In other
>> +	 * words, each time when the fiber has finished execution
>> +	 * of a request.
>> +	 */
>> +	struct rlist on_cleanup;
> 
> Please make the comment describing the timing when this
> trigger is called less broad and more accurate, calling out
> the specific places when the trigger is called, and call
> the trigger accordingly.
> on_request_or_fiber_end?

IMO, that names looks terrible. See my first comment.

Talking of the code comment. I am not sure what is wrong
with accuracy. I said exactly when cleanup is being done:
stop/reset/recycle/destroy/reuse/request end.

But ok:

===================================================================
@@ -463,6 +463,10 @@ struct fiber {
 	 * stopped/reset/recycled/destroyed/reused. In other
 	 * words, each time when the fiber has finished execution
 	 * of a request.
+	 * In particular, for fibers not from fiber pool the
+	 * cleanup is done before destroy and death.
+	 * Pooled fibers are cleaned up after each request, even
+	 * if they are never destroyed.
 	 */
 	struct rlist on_cleanup;
 	/**
===================================================================

I am going to stop here and not to give any source code
references in the comment, because such refs tend to
outdate.

>> +/** Invoke cleanup triggers and delete them. */
>> +void
>> +fiber_cleanup(struct fiber *f);
> 
> The call for a better name for the event and the accompanying 
> action applies to all of the new api
> methods this patch introduces.
> 

This is mutual. Propose a better name or choose one of
mine, please. I gave them in the first comment. I can't read
your mind, so if you want a particular name - say it, and lets
discuss.

>>  		cmsg_deliver(msg);
>> +		/*
>> +		 * Different messages = different requests. They
>> +		 * should not affect each other. This is why
>> +		 * cleanup is done here.
>> +		 */
>> +		fiber_cleanup(f);
> 
> Ugh. I think this is a layering violation, and the cleanup
> call should be somewhere inside cmsg_deliver for the relevant
> messages. The rlist with cleanup triggers is better stored in
> a fiber key.
> 

Sorry. I really don't do that on purpose. This is not because
I just like to disagree with everything. This is because I don't
understand you sometimes. - Here I again partially disagree :)

cmsg_deliver() is used in two places - the pooled fiber loop, and
cbus loop. Cbus loop is used in background threads such as vinyl
readers, relay, etc. They have no a concept of isolated request or
a context. The whole cbus_loop is one long request working in one
long-living fiber. So the only place needed that patch is the
fiber pool loop, which serves many requests. An attempt to call
cleanup in cbus loop would destroy that fiber's session after each
request, what looks like needless oscillation. That session is empty
anyway.

Also cbus has nothing to do with fiber control. It uses some
fiber things such as conds, but that is all.


On the other hand I agree, that fiber pooled loop probably should
not know anything about requests. Another option is to call
fiber_cleanup() right in iproto.cc after each request. We would need
to patch tx_process_misc(), tx_process_call(), tx_process1(),
tx_process_select(), tx_process_sql(), tx_process_join_subscribe().
Not so many places. What do you think?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto
  2019-12-10 22:59     ` Vladislav Shpilevoy
@ 2019-12-11  7:08       ` Konstantin Osipov
  2019-12-11 21:23         ` Vladislav Shpilevoy
  0 siblings, 1 reply; 26+ messages in thread
From: Konstantin Osipov @ 2019-12-11  7:08 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

* Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [19/12/11 10:01]:
> On the other hand I agree, that fiber pooled loop probably should
> not know anything about requests. Another option is to call
> fiber_cleanup() right in iproto.cc after each request. We would need
> to patch tx_process_misc(), tx_process_call(), tx_process1(),
> tx_process_select(), tx_process_sql(), tx_process_join_subscribe().
> Not so many places. What do you think?

This is what I suggest. Let's begin with this, then the rest of 
my comments will fit in their place.

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto
  2019-12-11  7:08       ` Konstantin Osipov
@ 2019-12-11 21:23         ` Vladislav Shpilevoy
  2019-12-12  0:00           ` Igor Munkin
  2019-12-12  8:46           ` Konstantin Osipov
  0 siblings, 2 replies; 26+ messages in thread
From: Vladislav Shpilevoy @ 2019-12-11 21:23 UTC (permalink / raw)
  To: Konstantin Osipov, tarantool-patches

Thanks for the review!

On 11/12/2019 08:08, Konstantin Osipov wrote:
> * Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [19/12/11 10:01]:
>> On the other hand I agree, that fiber pooled loop probably should
>> not know anything about requests. Another option is to call
>> fiber_cleanup() right in iproto.cc after each request. We would need
>> to patch tx_process_misc(), tx_process_call(), tx_process1(),
>> tx_process_select(), tx_process_sql(), tx_process_join_subscribe().
>> Not so many places. What do you think?
> 
> This is what I suggest. Let's begin with this, then the rest of 
> my comments will fit in their place.
> 

Here it is:

- I moved cleanup to iproto.cc, and called it tx_fiber_cleanup().
  By analogue with tx_fiber_init() which currently initializes the
  fiber;

- I made cleanup triggers always clean them by themselves to remove
  trigger_destroy() from fiber.c.

But it didn't help me to understand, why are you saying, that
tx and session should not use these triggers. Tx, in case it was
not removed from fiber before its cleanup, should be destroyed.
Session uses that trigger to destroy local one-fiber sessions,
which die together with the fiber getting recycled/killed. So
what is a problem? That trigger perfectly fits all the needs.

=========================================================================
diff --git a/src/box/iproto.cc b/src/box/iproto.cc
index c39b8e7bf..600a21ac8 100644
--- a/src/box/iproto.cc
+++ b/src/box/iproto.cc
@@ -1306,6 +1306,12 @@ static void
 tx_fiber_init(struct session *session, uint64_t sync)
 {
 	struct fiber *f = fiber();
+	/*
+	 * There should not be any not executed
+	 * destructors from a previous request
+	 * executed in that fiber.
+	 */
+	assert(rlist_empty(&f->on_cleanup));
 	f->storage.net.sync = sync;
 	/*
 	 * We do not cleanup fiber keys at the end of each request.
@@ -1321,6 +1327,17 @@ tx_fiber_init(struct session *session, uint64_t sync)
 	fiber_set_user(f, &session->credentials);
 }
 
+/**
+ * Cleanup current fiber after a request is
+ * executed to make it possible to reuse the fiber
+ * for a next request.
+ */
+static inline void
+tx_fiber_cleanup()
+{
+	fiber_cleanup(fiber());
+}
+
 static void
 tx_process_disconnect(struct cmsg *m)
 {
@@ -1331,6 +1348,7 @@ tx_process_disconnect(struct cmsg *m)
 		if (! rlist_empty(&session_on_disconnect)) {
 			tx_fiber_init(con->session, 0);
 			session_run_on_disconnect_triggers(con->session);
+			tx_fiber_cleanup();
 		}
 	}
 }
@@ -1486,6 +1504,7 @@ tx_reply_iproto_error(struct cmsg *m)
 	iproto_reply_error(out, diag_last_error(&msg->diag),
 			   msg->header.sync, ::schema_version);
 	iproto_wpos_create(&msg->wpos, out);
+	tx_fiber_cleanup();
 }
 
 /** Inject a short delay on tx request processing for testing. */
@@ -1519,9 +1538,11 @@ tx_process1(struct cmsg *m)
 	iproto_reply_select(out, &svp, msg->header.sync, ::schema_version,
 			    tuple != 0);
 	iproto_wpos_create(&msg->wpos, out);
-	return;
+	goto end;
 error:
 	tx_reply_error(msg);
+end:
+	tx_fiber_cleanup();
 }
 
 static void
@@ -1562,9 +1583,11 @@ tx_process_select(struct cmsg *m)
 	iproto_reply_select(out, &svp, msg->header.sync,
 			    ::schema_version, count);
 	iproto_wpos_create(&msg->wpos, out);
-	return;
+	goto end;
 error:
 	tx_reply_error(msg);
+end:
+	tx_fiber_cleanup();
 }
 
 static int
@@ -1652,9 +1675,11 @@ tx_process_call(struct cmsg *m)
 	iproto_reply_select(out, &svp, msg->header.sync,
 			    ::schema_version, count);
 	iproto_wpos_create(&msg->wpos, out);
-	return;
+	goto end;
 error:
 	tx_reply_error(msg);
+end:
+	tx_fiber_cleanup();
 }
 
 static void
@@ -1695,9 +1720,11 @@ tx_process_misc(struct cmsg *m)
 	} catch (Exception *e) {
 		tx_reply_error(msg);
 	}
-	return;
+	goto end;
 error:
 	tx_reply_error(msg);
+end:
+	tx_fiber_cleanup();
 }
 
 static void
@@ -1711,8 +1738,6 @@ tx_process_sql(struct cmsg *m)
 	const char *sql;
 	uint32_t len;
 
-	tx_fiber_init(msg->connection->session, msg->header.sync);
-
 	if (tx_check_schema(msg->header.schema_version))
 		goto error;
 	assert(msg->header.type == IPROTO_EXECUTE);
@@ -1746,9 +1771,11 @@ tx_process_sql(struct cmsg *m)
 	port_destroy(&port);
 	iproto_reply_sql(out, &header_svp, msg->header.sync, schema_version);
 	iproto_wpos_create(&msg->wpos, out);
-	return;
+	goto end;
 error:
 	tx_reply_error(msg);
+end:
+	tx_fiber_cleanup();
 }
 
 static void
@@ -1781,11 +1808,12 @@ tx_process_join_subscribe(struct cmsg *m)
 			unreachable();
 		}
 	} catch (SocketError *e) {
-		return; /* don't write error response to prevent SIGPIPE */
+		/* don't write error response to prevent SIGPIPE */
 	} catch (Exception *e) {
 		iproto_write_error(con->input.fd, e, ::schema_version,
 				   msg->header.sync);
 	}
+	tx_fiber_cleanup();
 }
 
 static void
@@ -1891,6 +1919,7 @@ tx_process_connect(struct cmsg *m)
 		tx_reply_error(msg);
 		msg->close_connection = true;
 	}
+	tx_fiber_cleanup();
 }
 
 /**
diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
index db8fdd5fe..d89c640aa 100644
--- a/src/lib/core/fiber.c
+++ b/src/lib/core/fiber.c
@@ -331,12 +331,12 @@ fiber_cleanup(struct fiber *f)
 	if (trigger_run(&f->on_cleanup, f) != 0)
 		panic("Fiber cleanup can't fail");
 	/*
-	 * Double call of cleanup routines is a bad idea. It is
-	 * like calling a destructor 2 times. On_cleanup can be
-	 * called only once until the fiber is reused again, and
-	 * new triggers are set.
+	 * All cleanup triggers are supposed to
+	 * remove themselves. So as no to waste
+	 * time on that here, and to make them all
+	 * work uniformly.
 	 */
-	trigger_destroy(&f->on_cleanup);
+	assert(rlist_empty(&f->on_cleanup));
 }
 
 static void
@@ -1538,8 +1538,8 @@ cord_cojoin(struct cord *cord)
 int
 break_ev_loop_f(struct trigger *trigger, void *event)
 {
-	(void) trigger;
 	(void) event;
+	trigger_clear(trigger);
 	ev_break(loop(), EVBREAK_ALL);
 	return 0;
 }
diff --git a/src/lib/core/fiber_pool.c b/src/lib/core/fiber_pool.c
index 31f366fac..77f89c9fa 100644
--- a/src/lib/core/fiber_pool.c
+++ b/src/lib/core/fiber_pool.c
@@ -62,12 +62,6 @@ restart:
 			assert(f->caller->caller == &cord->sched);
 		}
 		cmsg_deliver(msg);
-		/*
-		 * Different messages = different requests. They
-		 * should not affect each other. This is why
-		 * cleanup is done here.
-		 */
-		fiber_cleanup(f);
 	}
 	/** Put the current fiber into a fiber cache. */
 	if (!fiber_is_cancelled(fiber()) && (msg != NULL ||
diff --git a/src/lua/fiber.c b/src/lua/fiber.c
index cba625082..4d2828f1c 100644
--- a/src/lua/fiber.c
+++ b/src/lua/fiber.c
@@ -613,12 +613,13 @@ lbox_fiber_name(struct lua_State *L)
 static int
 lbox_fiber_on_cleanup(struct trigger *trigger, void *event)
 {
-	(void) trigger;
 	struct fiber *f = (struct fiber *) event;
 	int storage_ref = f->storage.lua.ref;
 	assert(storage_ref > 0);
 	luaL_unref(tarantool_L, LUA_REGISTRYINDEX, storage_ref);
 	f->storage.lua.ref = 0;
+	trigger_clear(trigger);
+	free(trigger);
 	return 0;
 }
=========================================================================

The whole new patch:

=========================================================================

    fiber: destroy fiber.storage created by iproto
    
    Fiber.storage was not deleted when created in a fiber started from
    the thread pool used by IProto requests. The problem was that
    fiber.storage was created and deleted in Lua land only, assuming
    that only Lua-born fibers could have it. But in fact any fiber can
    create a Lua storage. Including the ones used to serve IProto
    requests.
    
    Not deletion of the storage led to a possibility of meeting a
    non-empty fiber.storage in the beginning of an iproto request, and
    not deletion of the memory caught by the storage until its
    explicit nullification.
    
    Now the storage destructor works for any fiber, which managed to
    create the storage. The destructor unrefs and nullifies the
    storage.
    
    For destructor purposes the fiber.on_stop triggers were reworked.
    Now they are on_cleanup triggers, and can be called multiple times
    during fiber's lifetime. After every request done by that fiber.
    
    Closes #4662
    Closes #3462

diff --git a/src/box/iproto.cc b/src/box/iproto.cc
index c39b8e7bf..600a21ac8 100644
--- a/src/box/iproto.cc
+++ b/src/box/iproto.cc
@@ -1306,6 +1306,12 @@ static void
 tx_fiber_init(struct session *session, uint64_t sync)
 {
 	struct fiber *f = fiber();
+	/*
+	 * There should not be any not executed
+	 * destructors from a previous request
+	 * executed in that fiber.
+	 */
+	assert(rlist_empty(&f->on_cleanup));
 	f->storage.net.sync = sync;
 	/*
 	 * We do not cleanup fiber keys at the end of each request.
@@ -1321,6 +1327,17 @@ tx_fiber_init(struct session *session, uint64_t sync)
 	fiber_set_user(f, &session->credentials);
 }
 
+/**
+ * Cleanup current fiber after a request is
+ * executed to make it possible to reuse the fiber
+ * for a next request.
+ */
+static inline void
+tx_fiber_cleanup()
+{
+	fiber_cleanup(fiber());
+}
+
 static void
 tx_process_disconnect(struct cmsg *m)
 {
@@ -1331,6 +1348,7 @@ tx_process_disconnect(struct cmsg *m)
 		if (! rlist_empty(&session_on_disconnect)) {
 			tx_fiber_init(con->session, 0);
 			session_run_on_disconnect_triggers(con->session);
+			tx_fiber_cleanup();
 		}
 	}
 }
@@ -1486,6 +1504,7 @@ tx_reply_iproto_error(struct cmsg *m)
 	iproto_reply_error(out, diag_last_error(&msg->diag),
 			   msg->header.sync, ::schema_version);
 	iproto_wpos_create(&msg->wpos, out);
+	tx_fiber_cleanup();
 }
 
 /** Inject a short delay on tx request processing for testing. */
@@ -1519,9 +1538,11 @@ tx_process1(struct cmsg *m)
 	iproto_reply_select(out, &svp, msg->header.sync, ::schema_version,
 			    tuple != 0);
 	iproto_wpos_create(&msg->wpos, out);
-	return;
+	goto end;
 error:
 	tx_reply_error(msg);
+end:
+	tx_fiber_cleanup();
 }
 
 static void
@@ -1562,9 +1583,11 @@ tx_process_select(struct cmsg *m)
 	iproto_reply_select(out, &svp, msg->header.sync,
 			    ::schema_version, count);
 	iproto_wpos_create(&msg->wpos, out);
-	return;
+	goto end;
 error:
 	tx_reply_error(msg);
+end:
+	tx_fiber_cleanup();
 }
 
 static int
@@ -1652,9 +1675,11 @@ tx_process_call(struct cmsg *m)
 	iproto_reply_select(out, &svp, msg->header.sync,
 			    ::schema_version, count);
 	iproto_wpos_create(&msg->wpos, out);
-	return;
+	goto end;
 error:
 	tx_reply_error(msg);
+end:
+	tx_fiber_cleanup();
 }
 
 static void
@@ -1695,9 +1720,11 @@ tx_process_misc(struct cmsg *m)
 	} catch (Exception *e) {
 		tx_reply_error(msg);
 	}
-	return;
+	goto end;
 error:
 	tx_reply_error(msg);
+end:
+	tx_fiber_cleanup();
 }
 
 static void
@@ -1711,8 +1738,6 @@ tx_process_sql(struct cmsg *m)
 	const char *sql;
 	uint32_t len;
 
-	tx_fiber_init(msg->connection->session, msg->header.sync);
-
 	if (tx_check_schema(msg->header.schema_version))
 		goto error;
 	assert(msg->header.type == IPROTO_EXECUTE);
@@ -1746,9 +1771,11 @@ tx_process_sql(struct cmsg *m)
 	port_destroy(&port);
 	iproto_reply_sql(out, &header_svp, msg->header.sync, schema_version);
 	iproto_wpos_create(&msg->wpos, out);
-	return;
+	goto end;
 error:
 	tx_reply_error(msg);
+end:
+	tx_fiber_cleanup();
 }
 
 static void
@@ -1781,11 +1808,12 @@ tx_process_join_subscribe(struct cmsg *m)
 			unreachable();
 		}
 	} catch (SocketError *e) {
-		return; /* don't write error response to prevent SIGPIPE */
+		/* don't write error response to prevent SIGPIPE */
 	} catch (Exception *e) {
 		iproto_write_error(con->input.fd, e, ::schema_version,
 				   msg->header.sync);
 	}
+	tx_fiber_cleanup();
 }
 
 static void
@@ -1891,6 +1919,7 @@ tx_process_connect(struct cmsg *m)
 		tx_reply_error(msg);
 		msg->close_connection = true;
 	}
+	tx_fiber_cleanup();
 }
 
 /**
diff --git a/src/box/session.cc b/src/box/session.cc
index 461d1cf25..09e988148 100644
--- a/src/box/session.cc
+++ b/src/box/session.cc
@@ -80,8 +80,12 @@ sid_max()
 	return sid_max;
 }
 
+/**
+ * Destroy session of a background fiber when the
+ * fiber is getting destroyed.
+ */
 static int
-session_on_stop(struct trigger *trigger, void * /* event */)
+session_on_fiber_cleanup(struct trigger *trigger, void * /* event */)
 {
 	/*
 	 * Remove on_stop trigger from the fiber, otherwise the
@@ -167,11 +171,10 @@ session_create_on_demand()
 	struct session *s = session_create(SESSION_TYPE_BACKGROUND);
 	if (s == NULL)
 		return NULL;
-	s->fiber_on_stop = {
-		RLIST_LINK_INITIALIZER, session_on_stop, NULL, NULL
-	};
-	/* Add a trigger to destroy session on fiber stop */
-	trigger_add(&fiber()->on_stop, &s->fiber_on_stop);
+	trigger_create(&s->fiber_on_cleanup, session_on_fiber_cleanup,
+		       NULL, NULL);
+	/* Add a trigger to destroy session on fiber cleanup. */
+	trigger_add(&fiber()->on_cleanup, &s->fiber_on_cleanup);
 	credentials_reset(&s->credentials, admin_user);
 	fiber_set_session(fiber(), s);
 	fiber_set_user(fiber(), &s->credentials);
diff --git a/src/box/session.h b/src/box/session.h
index eff3d7a67..710e4919b 100644
--- a/src/box/session.h
+++ b/src/box/session.h
@@ -103,8 +103,11 @@ struct session {
 	union session_meta meta;
 	/** Session user id and global grants */
 	struct credentials credentials;
-	/** Trigger for fiber on_stop to cleanup created on-demand session */
-	struct trigger fiber_on_stop;
+	/**
+	 * Trigger for fiber on_cleanup to cleanup created
+	 * on-demand session.
+	 */
+	struct trigger fiber_on_cleanup;
 };
 
 struct session_vtab {
diff --git a/src/box/txn.c b/src/box/txn.c
index 963ec8eeb..7d549a6a9 100644
--- a/src/box/txn.c
+++ b/src/box/txn.c
@@ -41,7 +41,7 @@ double too_long_threshold;
 static struct stailq txn_cache = {NULL, &txn_cache.first};
 
 static int
-txn_on_stop(struct trigger *trigger, void *event);
+txn_on_fiber_cleanup(struct trigger *trigger, void *event);
 
 static int
 txn_on_yield(struct trigger *trigger, void *event);
@@ -226,8 +226,9 @@ txn_begin()
 	txn->fiber = NULL;
 	fiber_set_txn(fiber(), txn);
 	/* fiber_on_yield is initialized by engine on demand */
-	trigger_create(&txn->fiber_on_stop, txn_on_stop, NULL, NULL);
-	trigger_add(&fiber()->on_stop, &txn->fiber_on_stop);
+	trigger_create(&txn->fiber_on_cleanup, txn_on_fiber_cleanup,
+		       NULL, NULL);
+	trigger_add(&fiber()->on_cleanup, &txn->fiber_on_cleanup);
 	/*
 	 * By default all transactions may yield.
 	 * It's a responsibility of an engine to disable yields
@@ -550,7 +551,7 @@ txn_prepare(struct txn *txn)
 		if (engine_prepare(txn->engine, txn) != 0)
 			return -1;
 	}
-	trigger_clear(&txn->fiber_on_stop);
+	trigger_clear(&txn->fiber_on_cleanup);
 	if (!txn_has_flag(txn, TXN_CAN_YIELD))
 		trigger_clear(&txn->fiber_on_yield);
 	return 0;
@@ -618,7 +619,7 @@ void
 txn_rollback(struct txn *txn)
 {
 	assert(txn == in_txn());
-	trigger_clear(&txn->fiber_on_stop);
+	trigger_clear(&txn->fiber_on_cleanup);
 	if (!txn_has_flag(txn, TXN_CAN_YIELD))
 		trigger_clear(&txn->fiber_on_yield);
 	txn->signature = -1;
@@ -840,7 +841,7 @@ txn_savepoint_release(struct txn_savepoint *svp)
 }
 
 static int
-txn_on_stop(struct trigger *trigger, void *event)
+txn_on_fiber_cleanup(struct trigger *trigger, void *event)
 {
 	(void) trigger;
 	(void) event;
diff --git a/src/box/txn.h b/src/box/txn.h
index da12feebf..755c464f0 100644
--- a/src/box/txn.h
+++ b/src/box/txn.h
@@ -207,10 +207,11 @@ struct txn {
 	 */
 	struct trigger fiber_on_yield;
 	/**
-	 * Trigger on fiber stop, to rollback transaction
-	 * in case a fiber stops (all engines).
+	 * Trigger on fiber cleanup, to rollback transaction
+	 * in case a fiber is getting reset/recycled/destroyed
+	 * (all engines).
 	 */
-	struct trigger fiber_on_stop;
+	struct trigger fiber_on_cleanup;
 	/** Commit and rollback triggers. */
 	struct rlist on_commit, on_rollback;
 	/**
diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
index 00ae8cded..d89c640aa 100644
--- a/src/lib/core/fiber.c
+++ b/src/lib/core/fiber.c
@@ -325,6 +325,20 @@ fiber_attr_getstacksize(struct fiber_attr *fiber_attr)
 				    fiber_attr_default.stack_size;
 }
 
+void
+fiber_cleanup(struct fiber *f)
+{
+	if (trigger_run(&f->on_cleanup, f) != 0)
+		panic("Fiber cleanup can't fail");
+	/*
+	 * All cleanup triggers are supposed to
+	 * remove themselves. So as no to waste
+	 * time on that here, and to make them all
+	 * work uniformly.
+	 */
+	assert(rlist_empty(&f->on_cleanup));
+}
+
 static void
 fiber_recycle(struct fiber *fiber);
 
@@ -787,7 +801,7 @@ static void
 fiber_reset(struct fiber *fiber)
 {
 	rlist_create(&fiber->on_yield);
-	rlist_create(&fiber->on_stop);
+	rlist_create(&fiber->on_cleanup);
 	fiber->flags = FIBER_DEFAULT_FLAGS;
 #if ENABLE_FIBER_TOP
 	clock_stat_reset(&fiber->clock_stat);
@@ -856,8 +870,7 @@ fiber_loop(MAYBE_UNUSED void *data)
 		       assert(f != fiber);
 		       fiber_wakeup(f);
 	        }
-		if (! rlist_empty(&fiber->on_stop))
-			trigger_run(&fiber->on_stop, fiber);
+	        fiber_cleanup(fiber);
 		/* reset pending wakeups */
 		rlist_del(&fiber->state);
 		if (! (fiber->flags & FIBER_IS_JOINABLE))
@@ -1161,7 +1174,7 @@ fiber_destroy(struct cord *cord, struct fiber *f)
 	assert(f != &cord->sched);
 
 	trigger_destroy(&f->on_yield);
-	trigger_destroy(&f->on_stop);
+	trigger_destroy(&f->on_cleanup);
 	rlist_del(&f->state);
 	rlist_del(&f->link);
 	region_destroy(&f->gc);
@@ -1525,8 +1538,8 @@ cord_cojoin(struct cord *cord)
 int
 break_ev_loop_f(struct trigger *trigger, void *event)
 {
-	(void) trigger;
 	(void) event;
+	trigger_clear(trigger);
 	ev_break(loop(), EVBREAK_ALL);
 	return 0;
 }
@@ -1555,7 +1568,7 @@ cord_costart_thread_func(void *arg)
 	 * Got to be in a trigger, to break the loop even
 	 * in case of an exception.
 	 */
-	trigger_add(&f->on_stop, &break_ev_loop);
+	trigger_add(&f->on_cleanup, &break_ev_loop);
 	fiber_set_joinable(f, true);
 	fiber_start(f, ctx.arg);
 	if (!fiber_is_dead(f)) {
diff --git a/src/lib/core/fiber.h b/src/lib/core/fiber.h
index c5b975513..21fff8f88 100644
--- a/src/lib/core/fiber.h
+++ b/src/lib/core/fiber.h
@@ -458,8 +458,17 @@ struct fiber {
 
 	/** Triggers invoked before this fiber yields. Must not throw. */
 	struct rlist on_yield;
-	/** Triggers invoked before this fiber stops.  Must not throw. */
-	struct rlist on_stop;
+	/**
+	 * Triggers invoked before this fiber is
+	 * stopped/reset/recycled/destroyed/reused. In other
+	 * words, each time when the fiber has finished execution
+	 * of a request.
+	 * In particular, for fibers not from fiber pool the
+	 * cleanup is done before destroy and death.
+	 * Pooled fibers are cleaned up after each request, even
+	 * if they are never destroyed.
+	 */
+	struct rlist on_cleanup;
 	/**
 	 * The list of fibers awaiting for this fiber's timely
 	 * (or untimely) death.
@@ -506,6 +515,10 @@ struct fiber {
 	char name[FIBER_NAME_MAX + 1];
 };
 
+/** Invoke cleanup triggers and delete them. */
+void
+fiber_cleanup(struct fiber *f);
+
 struct cord_on_exit;
 
 /**
diff --git a/src/lua/fiber.c b/src/lua/fiber.c
index a7b75f9bf..4d2828f1c 100644
--- a/src/lua/fiber.c
+++ b/src/lua/fiber.c
@@ -441,11 +441,6 @@ lua_fiber_run_f(MAYBE_UNUSED va_list ap)
 	int coro_ref = lua_tointeger(L, -1);
 	lua_pop(L, 1);
 	result = luaT_call(L, lua_gettop(L) - 1, LUA_MULTRET);
-
-	/* Destroy local storage */
-	int storage_ref = f->storage.lua.ref;
-	if (storage_ref > 0)
-		luaL_unref(tarantool_L, LUA_REGISTRYINDEX, storage_ref);
 	/*
 	 * If fiber is not joinable
 	 * We can unref child stack here,
@@ -606,12 +601,43 @@ lbox_fiber_name(struct lua_State *L)
 	}
 }
 
+/**
+ * Trigger invoked when the fiber has finished execution of its
+ * current request. Only purpose - delete storage.lua.ref keeping
+ * a reference of Lua fiber.storage object. Unlike Lua stack,
+ * Lua fiber storage may be created not only for fibers born from
+ * Lua land. For example, an IProto request may execute a Lua
+ * function, which can create the storage. Trigger guarantees,
+ * that even for non-Lua fibers the Lua storage is destroyed.
+ */
+static int
+lbox_fiber_on_cleanup(struct trigger *trigger, void *event)
+{
+	struct fiber *f = (struct fiber *) event;
+	int storage_ref = f->storage.lua.ref;
+	assert(storage_ref > 0);
+	luaL_unref(tarantool_L, LUA_REGISTRYINDEX, storage_ref);
+	f->storage.lua.ref = 0;
+	trigger_clear(trigger);
+	free(trigger);
+	return 0;
+}
+
 static int
 lbox_fiber_storage(struct lua_State *L)
 {
 	struct fiber *f = lbox_checkfiber(L, 1);
 	int storage_ref = f->storage.lua.ref;
 	if (storage_ref <= 0) {
+		struct trigger *t = (struct trigger *)
+			malloc(sizeof(*t));
+		if (t == NULL) {
+			diag_set(OutOfMemory, sizeof(*t), "malloc", "t");
+			return luaT_error(L);
+		}
+		trigger_create(t, lbox_fiber_on_cleanup, NULL,
+			       (trigger_f0) free);
+		trigger_add(&f->on_cleanup, t);
 		lua_newtable(L); /* create local storage on demand */
 		storage_ref = luaL_ref(L, LUA_REGISTRYINDEX);
 		f->storage.lua.ref = storage_ref;
diff --git a/test/app/gh-4662-fiber-storage-leak.result b/test/app/gh-4662-fiber-storage-leak.result
new file mode 100644
index 000000000..4ade017a4
--- /dev/null
+++ b/test/app/gh-4662-fiber-storage-leak.result
@@ -0,0 +1,88 @@
+-- test-run result file version 2
+fiber = require('fiber')
+ | ---
+ | ...
+netbox = require('net.box')
+ | ---
+ | ...
+
+--
+-- gh-4662: fiber.storage was not deleted when created in a fiber
+-- started from the thread pool used by IProto requests. The
+-- problem was that fiber.storage was created and deleted in Lua
+-- land only, assuming that only Lua-born fibers could have it.
+-- But in fact any fiber can create a Lua storage. Including the
+-- ones used to serve IProto requests.
+-- The test checks if fiber.storage is really deleted, and is not
+-- shared between requests.
+--
+
+box.schema.user.grant('guest', 'execute', 'universe')
+ | ---
+ | ...
+storage = nil
+ | ---
+ | ...
+i = 0
+ | ---
+ | ...
+weak_table = setmetatable({}, {__mode = 'v'})
+ | ---
+ | ...
+object = {'object'}
+ | ---
+ | ...
+weak_table.object = object
+ | ---
+ | ...
+function ref_object_in_fiber()                  \
+    storage = fiber.self().storage              \
+    assert(next(storage) == nil)                \
+    i = i + 1                                   \
+    fiber.self().storage.key = i                \
+    fiber.self().storage.object = object        \
+end
+ | ---
+ | ...
+
+c = netbox.connect(box.cfg.listen)
+ | ---
+ | ...
+c:call('ref_object_in_fiber') c:call('ref_object_in_fiber')
+ | ---
+ | ...
+storage
+ | ---
+ | - key: 2
+ |   object:
+ |   - object
+ | ...
+i
+ | ---
+ | - 2
+ | ...
+object = nil
+ | ---
+ | ...
+storage = nil
+ | ---
+ | ...
+collectgarbage('collect')
+ | ---
+ | - 0
+ | ...
+-- The weak table should be empty, because the only two hard
+-- references were in the fibers used to serve
+-- ref_object_in_fiber() requests. And their storages should be
+-- cleaned up.
+weak_table
+ | ---
+ | - []
+ | ...
+
+c:close()
+ | ---
+ | ...
+box.schema.user.revoke('guest', 'execute', 'universe')
+ | ---
+ | ...
diff --git a/test/app/gh-4662-fiber-storage-leak.test.lua b/test/app/gh-4662-fiber-storage-leak.test.lua
new file mode 100644
index 000000000..25acf5679
--- /dev/null
+++ b/test/app/gh-4662-fiber-storage-leak.test.lua
@@ -0,0 +1,43 @@
+fiber = require('fiber')
+netbox = require('net.box')
+
+--
+-- gh-4662: fiber.storage was not deleted when created in a fiber
+-- started from the thread pool used by IProto requests. The
+-- problem was that fiber.storage was created and deleted in Lua
+-- land only, assuming that only Lua-born fibers could have it.
+-- But in fact any fiber can create a Lua storage. Including the
+-- ones used to serve IProto requests.
+-- The test checks if fiber.storage is really deleted, and is not
+-- shared between requests.
+--
+
+box.schema.user.grant('guest', 'execute', 'universe')
+storage = nil
+i = 0
+weak_table = setmetatable({}, {__mode = 'v'})
+object = {'object'}
+weak_table.object = object
+function ref_object_in_fiber()                  \
+    storage = fiber.self().storage              \
+    assert(next(storage) == nil)                \
+    i = i + 1                                   \
+    fiber.self().storage.key = i                \
+    fiber.self().storage.object = object        \
+end
+
+c = netbox.connect(box.cfg.listen)
+c:call('ref_object_in_fiber') c:call('ref_object_in_fiber')
+storage
+i
+object = nil
+storage = nil
+collectgarbage('collect')
+-- The weak table should be empty, because the only two hard
+-- references were in the fibers used to serve
+-- ref_object_in_fiber() requests. And their storages should be
+-- cleaned up.
+weak_table
+
+c:close()
+box.schema.user.revoke('guest', 'execute', 'universe')

=========================================================================

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] fiber: unref fiber.storage via global Lua state
  2019-12-08 19:28 ` [Tarantool-patches] [PATCH 1/2] fiber: unref fiber.storage via global Lua state Vladislav Shpilevoy
@ 2019-12-11 23:57   ` Igor Munkin
  2019-12-12  8:50     ` Konstantin Osipov
  2019-12-12 23:38     ` Vladislav Shpilevoy
  0 siblings, 2 replies; 26+ messages in thread
From: Igor Munkin @ 2019-12-11 23:57 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Vlad,

Thanks for the patch!

I spent some time knee deep into fibers machinery and totally agree with
this fix: the most safe approach is "unrefing" storage entity via
tarantool_L coro, taking into account the possible unref of the fiber's
one and its consequtive collection.

However, I don't get why we need to apply these changes, considering
your follow-up patch. Could you please provide a bit more detailed
rationale?

On 08.12.19, Vladislav Shpilevoy wrote:
> Fiber.storage is a table, available from anywhere in the fiber. It
> is destroyed after fiber function is finished. That provides a
> reliable fiber-local storage, similar to thread-local in C/C++.
> 
> But there is a problem that the storage may be created via one
> struct lua_State, and destroyed via another. Here is an example:
> 
>     function test_storage()
>         fiber.self().storage.key = 100
>     end
>     box.schema.func.create('test_storage')
>     _ = fiber.create(function()
>         box.func.test_storage:call()
>     end)
> 
> There are 3 struct lua_State:
>     tarantool_L - global always alive Lua state;
>     L1 - stack of the fiber, created by fiber.create();
>     L2 - stack created by that fiber to execute test_storage().

Minor: Strictly saying, lua_State is a Lua coroutine, and not just a
guest stack. Please, consider adjusting the commit message, but feel
free to ignore this remark.

> 
> Fiber.storage is created on stack of L2 and referenced by global
> LUA_REGISTRYINDEX. Then it is unreferenced from L1 when the fiber
> is being destroyed.
> 
> That is generally ok as soon as the storage object is always in
> LUA_REGISTRYINDEX, which is shared by all Lua states.
> 
> But soon during destruction of the fiber.storage there will be
> only tarantool_L and the original L2. Original L2 may be already
> deleted by the time the storage is being destroyed. So this patch
> makes unref of the storage via reliable tarantool_L.
> 
> Needed for #4662
> ---
>  src/lua/fiber.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/lua/fiber.c b/src/lua/fiber.c
> index 53ebec9aa..a7b75f9bf 100644
> --- a/src/lua/fiber.c
> +++ b/src/lua/fiber.c
> @@ -445,7 +445,7 @@ lua_fiber_run_f(MAYBE_UNUSED va_list ap)
>  	/* Destroy local storage */
>  	int storage_ref = f->storage.lua.ref;
>  	if (storage_ref > 0)
> -		luaL_unref(L, LUA_REGISTRYINDEX, storage_ref);
> +		luaL_unref(tarantool_L, LUA_REGISTRYINDEX, storage_ref);
>  	/*
>  	 * If fiber is not joinable
>  	 * We can unref child stack here,
> -- 
> 2.21.0 (Apple Git-122.2)
> 

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto
  2019-12-11 21:23         ` Vladislav Shpilevoy
@ 2019-12-12  0:00           ` Igor Munkin
  2019-12-12 23:37             ` Vladislav Shpilevoy
  2019-12-12  8:46           ` Konstantin Osipov
  1 sibling, 1 reply; 26+ messages in thread
From: Igor Munkin @ 2019-12-12  0:00 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Vlad,

Thanks for the patch!

Sorry, I needed some intro into all this machinery thus the review was
postponed a bit and you've managed to prepare a new patch regarding
Kostja's remarks. Therefore, I decided to move all my notes to the new
patch instead of splitting them into two replies.

This patch LGTM in general, but please consider several minor comments I
left below.

On 11.12.19, Vladislav Shpilevoy wrote:

<snipped>

> =========================================================================
> 
> The whole new patch:
> 
> =========================================================================
> 
>     fiber: destroy fiber.storage created by iproto
>     
>     Fiber.storage was not deleted when created in a fiber started from
>     the thread pool used by IProto requests. The problem was that
>     fiber.storage was created and deleted in Lua land only, assuming
>     that only Lua-born fibers could have it. But in fact any fiber can
>     create a Lua storage. Including the ones used to serve IProto
>     requests.
>     
>     Not deletion of the storage led to a possibility of meeting a
>     non-empty fiber.storage in the beginning of an iproto request, and
>     not deletion of the memory caught by the storage until its
>     explicit nullification.
>     
>     Now the storage destructor works for any fiber, which managed to
>     create the storage. The destructor unrefs and nullifies the
>     storage.
>     
>     For destructor purposes the fiber.on_stop triggers were reworked.
>     Now they are on_cleanup triggers, and can be called multiple times
>     during fiber's lifetime. After every request done by that fiber.
>     
>     Closes #4662
>     Closes #3462
> 
> diff --git a/src/box/iproto.cc b/src/box/iproto.cc
> index c39b8e7bf..600a21ac8 100644
> --- a/src/box/iproto.cc
> +++ b/src/box/iproto.cc
> @@ -1306,6 +1306,12 @@ static void
>  tx_fiber_init(struct session *session, uint64_t sync)
>  {
>  	struct fiber *f = fiber();
> +	/*
> +	 * There should not be any not executed
> +	 * destructors from a previous request
> +	 * executed in that fiber.
> +	 */
> +	assert(rlist_empty(&f->on_cleanup));
>  	f->storage.net.sync = sync;
>  	/*
>  	 * We do not cleanup fiber keys at the end of each request.
> @@ -1321,6 +1327,17 @@ tx_fiber_init(struct session *session, uint64_t sync)
>  	fiber_set_user(f, &session->credentials);
>  }
>  
> +/**
> + * Cleanup current fiber after a request is
> + * executed to make it possible to reuse the fiber
> + * for a next request.
> + */
> +static inline void
> +tx_fiber_cleanup()
> +{
> +	fiber_cleanup(fiber());
> +}
> +
>  static void
>  tx_process_disconnect(struct cmsg *m)
>  {
> @@ -1331,6 +1348,7 @@ tx_process_disconnect(struct cmsg *m)
>  		if (! rlist_empty(&session_on_disconnect)) {
>  			tx_fiber_init(con->session, 0);
>  			session_run_on_disconnect_triggers(con->session);
> +			tx_fiber_cleanup();
>  		}
>  	}
>  }
> @@ -1486,6 +1504,7 @@ tx_reply_iproto_error(struct cmsg *m)
>  	iproto_reply_error(out, diag_last_error(&msg->diag),
>  			   msg->header.sync, ::schema_version);
>  	iproto_wpos_create(&msg->wpos, out);
> +	tx_fiber_cleanup();
>  }
>  
>  /** Inject a short delay on tx request processing for testing. */
> @@ -1519,9 +1538,11 @@ tx_process1(struct cmsg *m)
>  	iproto_reply_select(out, &svp, msg->header.sync, ::schema_version,
>  			    tuple != 0);
>  	iproto_wpos_create(&msg->wpos, out);
> -	return;
> +	goto end;
>  error:
>  	tx_reply_error(msg);
> +end:
> +	tx_fiber_cleanup();
>  }
>  
>  static void
> @@ -1562,9 +1583,11 @@ tx_process_select(struct cmsg *m)
>  	iproto_reply_select(out, &svp, msg->header.sync,
>  			    ::schema_version, count);
>  	iproto_wpos_create(&msg->wpos, out);
> -	return;
> +	goto end;
>  error:
>  	tx_reply_error(msg);
> +end:
> +	tx_fiber_cleanup();
>  }
>  
>  static int
> @@ -1652,9 +1675,11 @@ tx_process_call(struct cmsg *m)
>  	iproto_reply_select(out, &svp, msg->header.sync,
>  			    ::schema_version, count);
>  	iproto_wpos_create(&msg->wpos, out);
> -	return;
> +	goto end;
>  error:
>  	tx_reply_error(msg);
> +end:
> +	tx_fiber_cleanup();
>  }
>  
>  static void
> @@ -1695,9 +1720,11 @@ tx_process_misc(struct cmsg *m)
>  	} catch (Exception *e) {
>  		tx_reply_error(msg);
>  	}
> -	return;
> +	goto end;
>  error:
>  	tx_reply_error(msg);
> +end:
> +	tx_fiber_cleanup();
>  }
>  
>  static void
> @@ -1711,8 +1738,6 @@ tx_process_sql(struct cmsg *m)
>  	const char *sql;
>  	uint32_t len;
>  
> -	tx_fiber_init(msg->connection->session, msg->header.sync);
> -
>  	if (tx_check_schema(msg->header.schema_version))
>  		goto error;
>  	assert(msg->header.type == IPROTO_EXECUTE);
> @@ -1746,9 +1771,11 @@ tx_process_sql(struct cmsg *m)
>  	port_destroy(&port);
>  	iproto_reply_sql(out, &header_svp, msg->header.sync, schema_version);
>  	iproto_wpos_create(&msg->wpos, out);
> -	return;
> +	goto end;
>  error:
>  	tx_reply_error(msg);
> +end:
> +	tx_fiber_cleanup();
>  }
>  
>  static void
> @@ -1781,11 +1808,12 @@ tx_process_join_subscribe(struct cmsg *m)
>  			unreachable();
>  		}
>  	} catch (SocketError *e) {
> -		return; /* don't write error response to prevent SIGPIPE */
> +		/* don't write error response to prevent SIGPIPE */
>  	} catch (Exception *e) {
>  		iproto_write_error(con->input.fd, e, ::schema_version,
>  				   msg->header.sync);
>  	}
> +	tx_fiber_cleanup();
>  }
>  
>  static void
> @@ -1891,6 +1919,7 @@ tx_process_connect(struct cmsg *m)
>  		tx_reply_error(msg);
>  		msg->close_connection = true;
>  	}
> +	tx_fiber_cleanup();
>  }
>  
>  /**
> diff --git a/src/box/session.cc b/src/box/session.cc
> index 461d1cf25..09e988148 100644
> --- a/src/box/session.cc
> +++ b/src/box/session.cc
> @@ -80,8 +80,12 @@ sid_max()
>  	return sid_max;
>  }
>  
> +/**
> + * Destroy session of a background fiber when the
> + * fiber is getting destroyed.
> + */
>  static int
> -session_on_stop(struct trigger *trigger, void * /* event */)
> +session_on_fiber_cleanup(struct trigger *trigger, void * /* event */)
>  {
>  	/*
>  	 * Remove on_stop trigger from the fiber, otherwise the
> @@ -167,11 +171,10 @@ session_create_on_demand()
>  	struct session *s = session_create(SESSION_TYPE_BACKGROUND);
>  	if (s == NULL)
>  		return NULL;
> -	s->fiber_on_stop = {
> -		RLIST_LINK_INITIALIZER, session_on_stop, NULL, NULL
> -	};
> -	/* Add a trigger to destroy session on fiber stop */
> -	trigger_add(&fiber()->on_stop, &s->fiber_on_stop);
> +	trigger_create(&s->fiber_on_cleanup, session_on_fiber_cleanup,
> +		       NULL, NULL);
> +	/* Add a trigger to destroy session on fiber cleanup. */
> +	trigger_add(&fiber()->on_cleanup, &s->fiber_on_cleanup);
>  	credentials_reset(&s->credentials, admin_user);
>  	fiber_set_session(fiber(), s);
>  	fiber_set_user(fiber(), &s->credentials);
> diff --git a/src/box/session.h b/src/box/session.h
> index eff3d7a67..710e4919b 100644
> --- a/src/box/session.h
> +++ b/src/box/session.h
> @@ -103,8 +103,11 @@ struct session {
>  	union session_meta meta;
>  	/** Session user id and global grants */
>  	struct credentials credentials;
> -	/** Trigger for fiber on_stop to cleanup created on-demand session */
> -	struct trigger fiber_on_stop;
> +	/**
> +	 * Trigger for fiber on_cleanup to cleanup created
> +	 * on-demand session.
> +	 */
> +	struct trigger fiber_on_cleanup;
>  };
>  
>  struct session_vtab {
> diff --git a/src/box/txn.c b/src/box/txn.c
> index 963ec8eeb..7d549a6a9 100644
> --- a/src/box/txn.c
> +++ b/src/box/txn.c
> @@ -41,7 +41,7 @@ double too_long_threshold;
>  static struct stailq txn_cache = {NULL, &txn_cache.first};
>  
>  static int
> -txn_on_stop(struct trigger *trigger, void *event);
> +txn_on_fiber_cleanup(struct trigger *trigger, void *event);
>  
>  static int
>  txn_on_yield(struct trigger *trigger, void *event);
> @@ -226,8 +226,9 @@ txn_begin()
>  	txn->fiber = NULL;
>  	fiber_set_txn(fiber(), txn);
>  	/* fiber_on_yield is initialized by engine on demand */
> -	trigger_create(&txn->fiber_on_stop, txn_on_stop, NULL, NULL);
> -	trigger_add(&fiber()->on_stop, &txn->fiber_on_stop);
> +	trigger_create(&txn->fiber_on_cleanup, txn_on_fiber_cleanup,
> +		       NULL, NULL);
> +	trigger_add(&fiber()->on_cleanup, &txn->fiber_on_cleanup);
>  	/*
>  	 * By default all transactions may yield.
>  	 * It's a responsibility of an engine to disable yields
> @@ -550,7 +551,7 @@ txn_prepare(struct txn *txn)
>  		if (engine_prepare(txn->engine, txn) != 0)
>  			return -1;
>  	}
> -	trigger_clear(&txn->fiber_on_stop);
> +	trigger_clear(&txn->fiber_on_cleanup);
>  	if (!txn_has_flag(txn, TXN_CAN_YIELD))
>  		trigger_clear(&txn->fiber_on_yield);
>  	return 0;
> @@ -618,7 +619,7 @@ void
>  txn_rollback(struct txn *txn)
>  {
>  	assert(txn == in_txn());
> -	trigger_clear(&txn->fiber_on_stop);
> +	trigger_clear(&txn->fiber_on_cleanup);
>  	if (!txn_has_flag(txn, TXN_CAN_YIELD))
>  		trigger_clear(&txn->fiber_on_yield);
>  	txn->signature = -1;
> @@ -840,7 +841,7 @@ txn_savepoint_release(struct txn_savepoint *svp)
>  }
>  
>  static int
> -txn_on_stop(struct trigger *trigger, void *event)
> +txn_on_fiber_cleanup(struct trigger *trigger, void *event)
>  {
>  	(void) trigger;
>  	(void) event;
> diff --git a/src/box/txn.h b/src/box/txn.h
> index da12feebf..755c464f0 100644
> --- a/src/box/txn.h
> +++ b/src/box/txn.h
> @@ -207,10 +207,11 @@ struct txn {
>  	 */
>  	struct trigger fiber_on_yield;
>  	/**
> -	 * Trigger on fiber stop, to rollback transaction
> -	 * in case a fiber stops (all engines).
> +	 * Trigger on fiber cleanup, to rollback transaction
> +	 * in case a fiber is getting reset/recycled/destroyed
> +	 * (all engines).
>  	 */
> -	struct trigger fiber_on_stop;
> +	struct trigger fiber_on_cleanup;
>  	/** Commit and rollback triggers. */
>  	struct rlist on_commit, on_rollback;
>  	/**
> diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
> index 00ae8cded..d89c640aa 100644
> --- a/src/lib/core/fiber.c
> +++ b/src/lib/core/fiber.c
> @@ -325,6 +325,20 @@ fiber_attr_getstacksize(struct fiber_attr *fiber_attr)
>  				    fiber_attr_default.stack_size;
>  }
>  
> +void
> +fiber_cleanup(struct fiber *f)
> +{
> +	if (trigger_run(&f->on_cleanup, f) != 0)
> +		panic("Fiber cleanup can't fail");
> +	/*
> +	 * All cleanup triggers are supposed to
> +	 * remove themselves. So as no to waste
> +	 * time on that here, and to make them all
> +	 * work uniformly.
> +	 */
> +	assert(rlist_empty(&f->on_cleanup));
> +}
> +
>  static void
>  fiber_recycle(struct fiber *fiber);
>  
> @@ -787,7 +801,7 @@ static void
>  fiber_reset(struct fiber *fiber)
>  {
>  	rlist_create(&fiber->on_yield);
> -	rlist_create(&fiber->on_stop);
> +	rlist_create(&fiber->on_cleanup);
>  	fiber->flags = FIBER_DEFAULT_FLAGS;
>  #if ENABLE_FIBER_TOP
>  	clock_stat_reset(&fiber->clock_stat);
> @@ -856,8 +870,7 @@ fiber_loop(MAYBE_UNUSED void *data)
>  		       assert(f != fiber);
>  		       fiber_wakeup(f);
>  	        }
> -		if (! rlist_empty(&fiber->on_stop))
> -			trigger_run(&fiber->on_stop, fiber);
> +	        fiber_cleanup(fiber);

I see you dropped the if above and fiber_cleanup doesn't have any
corresponding within. Is this removal an intentional one?

Minor: There is a mess with whitespace above. Feel free to ignore the
remark, since indentation was broken before your changes. I blamed
several lines and it looks more like a ridiculous code style, and not a
problem with editor. Thereby you just implicitly continue the mix.

>  		/* reset pending wakeups */
>  		rlist_del(&fiber->state);
>  		if (! (fiber->flags & FIBER_IS_JOINABLE))
> @@ -1161,7 +1174,7 @@ fiber_destroy(struct cord *cord, struct fiber *f)
>  	assert(f != &cord->sched);
>  
>  	trigger_destroy(&f->on_yield);
> -	trigger_destroy(&f->on_stop);
> +	trigger_destroy(&f->on_cleanup);
>  	rlist_del(&f->state);
>  	rlist_del(&f->link);
>  	region_destroy(&f->gc);
> @@ -1525,8 +1538,8 @@ cord_cojoin(struct cord *cord)
>  int
>  break_ev_loop_f(struct trigger *trigger, void *event)
>  {
> -	(void) trigger;
>  	(void) event;
> +	trigger_clear(trigger);
>  	ev_break(loop(), EVBREAK_ALL);
>  	return 0;
>  }
> @@ -1555,7 +1568,7 @@ cord_costart_thread_func(void *arg)
>  	 * Got to be in a trigger, to break the loop even
>  	 * in case of an exception.
>  	 */
> -	trigger_add(&f->on_stop, &break_ev_loop);
> +	trigger_add(&f->on_cleanup, &break_ev_loop);
>  	fiber_set_joinable(f, true);
>  	fiber_start(f, ctx.arg);
>  	if (!fiber_is_dead(f)) {
> diff --git a/src/lib/core/fiber.h b/src/lib/core/fiber.h
> index c5b975513..21fff8f88 100644
> --- a/src/lib/core/fiber.h
> +++ b/src/lib/core/fiber.h
> @@ -458,8 +458,17 @@ struct fiber {
>  
>  	/** Triggers invoked before this fiber yields. Must not throw. */
>  	struct rlist on_yield;
> -	/** Triggers invoked before this fiber stops.  Must not throw. */
> -	struct rlist on_stop;
> +	/**
> +	 * Triggers invoked before this fiber is
> +	 * stopped/reset/recycled/destroyed/reused. In other
> +	 * words, each time when the fiber has finished execution
> +	 * of a request.
> +	 * In particular, for fibers not from fiber pool the
> +	 * cleanup is done before destroy and death.
> +	 * Pooled fibers are cleaned up after each request, even
> +	 * if they are never destroyed.
> +	 */
> +	struct rlist on_cleanup;

Minor: both trigger lists above has the "Must not throw" note. Does the
introduced list respect this rule? Please, adjust the comment if yes.

>  	/**
>  	 * The list of fibers awaiting for this fiber's timely
>  	 * (or untimely) death.
> @@ -506,6 +515,10 @@ struct fiber {
>  	char name[FIBER_NAME_MAX + 1];
>  };
>  
> +/** Invoke cleanup triggers and delete them. */
> +void
> +fiber_cleanup(struct fiber *f);
> +
>  struct cord_on_exit;
>  
>  /**
> diff --git a/src/lua/fiber.c b/src/lua/fiber.c
> index a7b75f9bf..4d2828f1c 100644
> --- a/src/lua/fiber.c
> +++ b/src/lua/fiber.c
> @@ -441,11 +441,6 @@ lua_fiber_run_f(MAYBE_UNUSED va_list ap)
>  	int coro_ref = lua_tointeger(L, -1);
>  	lua_pop(L, 1);
>  	result = luaT_call(L, lua_gettop(L) - 1, LUA_MULTRET);
> -
> -	/* Destroy local storage */
> -	int storage_ref = f->storage.lua.ref;
> -	if (storage_ref > 0)
> -		luaL_unref(tarantool_L, LUA_REGISTRYINDEX, storage_ref);
>  	/*
>  	 * If fiber is not joinable
>  	 * We can unref child stack here,
> @@ -606,12 +601,43 @@ lbox_fiber_name(struct lua_State *L)
>  	}
>  }
>  
> +/**
> + * Trigger invoked when the fiber has finished execution of its
> + * current request. Only purpose - delete storage.lua.ref keeping
> + * a reference of Lua fiber.storage object. Unlike Lua stack,
> + * Lua fiber storage may be created not only for fibers born from
> + * Lua land. For example, an IProto request may execute a Lua
> + * function, which can create the storage. Trigger guarantees,
> + * that even for non-Lua fibers the Lua storage is destroyed.
> + */
> +static int
> +lbox_fiber_on_cleanup(struct trigger *trigger, void *event)
> +{
> +	struct fiber *f = (struct fiber *) event;
> +	int storage_ref = f->storage.lua.ref;
> +	assert(storage_ref > 0);
> +	luaL_unref(tarantool_L, LUA_REGISTRYINDEX, storage_ref);
> +	f->storage.lua.ref = 0;

Though 0 value is specific "ref" (a table slot, where the LRU ref is
stored) and this value cannot be yield from luaL_ref. Please, consider
using LUA_NOREF value for such cases, since it's being provided by Lua
and lua_unref does nothing for it, as well as for LUA_REFNIL.

> +	trigger_clear(trigger);
> +	free(trigger);
> +	return 0;
> +}
> +
>  static int
>  lbox_fiber_storage(struct lua_State *L)
>  {
>  	struct fiber *f = lbox_checkfiber(L, 1);
>  	int storage_ref = f->storage.lua.ref;
>  	if (storage_ref <= 0) {
> +		struct trigger *t = (struct trigger *)
> +			malloc(sizeof(*t));
> +		if (t == NULL) {
> +			diag_set(OutOfMemory, sizeof(*t), "malloc", "t");
> +			return luaT_error(L);
> +		}
> +		trigger_create(t, lbox_fiber_on_cleanup, NULL,
> +			       (trigger_f0) free);
> +		trigger_add(&f->on_cleanup, t);
>  		lua_newtable(L); /* create local storage on demand */
>  		storage_ref = luaL_ref(L, LUA_REGISTRYINDEX);
>  		f->storage.lua.ref = storage_ref;
> diff --git a/test/app/gh-4662-fiber-storage-leak.result b/test/app/gh-4662-fiber-storage-leak.result
> new file mode 100644
> index 000000000..4ade017a4
> --- /dev/null
> +++ b/test/app/gh-4662-fiber-storage-leak.result
> @@ -0,0 +1,88 @@
> +-- test-run result file version 2
> +fiber = require('fiber')
> + | ---
> + | ...
> +netbox = require('net.box')
> + | ---
> + | ...
> +
> +--
> +-- gh-4662: fiber.storage was not deleted when created in a fiber
> +-- started from the thread pool used by IProto requests. The
> +-- problem was that fiber.storage was created and deleted in Lua
> +-- land only, assuming that only Lua-born fibers could have it.
> +-- But in fact any fiber can create a Lua storage. Including the
> +-- ones used to serve IProto requests.
> +-- The test checks if fiber.storage is really deleted, and is not
> +-- shared between requests.
> +--
> +
> +box.schema.user.grant('guest', 'execute', 'universe')
> + | ---
> + | ...
> +storage = nil
> + | ---
> + | ...
> +i = 0
> + | ---
> + | ...
> +weak_table = setmetatable({}, {__mode = 'v'})
> + | ---
> + | ...
> +object = {'object'}
> + | ---
> + | ...
> +weak_table.object = object
> + | ---
> + | ...
> +function ref_object_in_fiber()                  \
> +    storage = fiber.self().storage              \
> +    assert(next(storage) == nil)                \
> +    i = i + 1                                   \
> +    fiber.self().storage.key = i                \
> +    fiber.self().storage.object = object        \
> +end
> + | ---
> + | ...
> +
> +c = netbox.connect(box.cfg.listen)
> + | ---
> + | ...
> +c:call('ref_object_in_fiber') c:call('ref_object_in_fiber')
> + | ---
> + | ...
> +storage
> + | ---
> + | - key: 2
> + |   object:
> + |   - object
> + | ...
> +i
> + | ---
> + | - 2
> + | ...
> +object = nil
> + | ---
> + | ...
> +storage = nil
> + | ---
> + | ...
> +collectgarbage('collect')
> + | ---
> + | - 0
> + | ...
> +-- The weak table should be empty, because the only two hard
> +-- references were in the fibers used to serve
> +-- ref_object_in_fiber() requests. And their storages should be
> +-- cleaned up.
> +weak_table
> + | ---
> + | - []
> + | ...
> +
> +c:close()
> + | ---
> + | ...
> +box.schema.user.revoke('guest', 'execute', 'universe')
> + | ---
> + | ...
> diff --git a/test/app/gh-4662-fiber-storage-leak.test.lua b/test/app/gh-4662-fiber-storage-leak.test.lua
> new file mode 100644
> index 000000000..25acf5679
> --- /dev/null
> +++ b/test/app/gh-4662-fiber-storage-leak.test.lua
> @@ -0,0 +1,43 @@
> +fiber = require('fiber')
> +netbox = require('net.box')
> +
> +--
> +-- gh-4662: fiber.storage was not deleted when created in a fiber
> +-- started from the thread pool used by IProto requests. The
> +-- problem was that fiber.storage was created and deleted in Lua
> +-- land only, assuming that only Lua-born fibers could have it.
> +-- But in fact any fiber can create a Lua storage. Including the
> +-- ones used to serve IProto requests.
> +-- The test checks if fiber.storage is really deleted, and is not
> +-- shared between requests.
> +--
> +
> +box.schema.user.grant('guest', 'execute', 'universe')
> +storage = nil
> +i = 0
> +weak_table = setmetatable({}, {__mode = 'v'})
> +object = {'object'}
> +weak_table.object = object
> +function ref_object_in_fiber()                  \
> +    storage = fiber.self().storage              \
> +    assert(next(storage) == nil)                \
> +    i = i + 1                                   \
> +    fiber.self().storage.key = i                \
> +    fiber.self().storage.object = object        \
> +end
> +
> +c = netbox.connect(box.cfg.listen)
> +c:call('ref_object_in_fiber') c:call('ref_object_in_fiber')
> +storage
> +i
> +object = nil
> +storage = nil
> +collectgarbage('collect')
> +-- The weak table should be empty, because the only two hard
> +-- references were in the fibers used to serve
> +-- ref_object_in_fiber() requests. And their storages should be
> +-- cleaned up.
> +weak_table
> +
> +c:close()
> +box.schema.user.revoke('guest', 'execute', 'universe')
> 
> =========================================================================

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto
  2019-12-11 21:23         ` Vladislav Shpilevoy
  2019-12-12  0:00           ` Igor Munkin
@ 2019-12-12  8:46           ` Konstantin Osipov
  2019-12-13  0:02             ` Vladislav Shpilevoy
  1 sibling, 1 reply; 26+ messages in thread
From: Konstantin Osipov @ 2019-12-12  8:46 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

* Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [19/12/12 00:27]:
> - I moved cleanup to iproto.cc, and called it tx_fiber_cleanup().
>   By analogue with tx_fiber_init() which currently initializes the
>   fiber;
> 
> - I made cleanup triggers always clean them by themselves to remove
>   trigger_destroy() from fiber.c.
> 
> But it didn't help me to understand, why are you saying, that
> tx and session should not use these triggers. Tx, in case it was
> not removed from fiber before its cleanup, should be destroyed.
> Session uses that trigger to destroy local one-fiber sessions,
> which die together with the fiber getting recycled/killed. So
> what is a problem? That trigger perfectly fits all the needs.

OK, the function itself may be a match, but it doesn't make the
idea of using the same trigger object for everything right, don't
you think?

My point is that you do  not only move where the function is
invoked, but also where the trigger list object is stored.

Can you move it to Lua fiber object and iproto_msg object?
There could be no other entry points to the request, so my point
is that the trigger is part of request state, not fiber state.

Does it make any sense?

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] fiber: unref fiber.storage via global Lua state
  2019-12-11 23:57   ` Igor Munkin
@ 2019-12-12  8:50     ` Konstantin Osipov
  2019-12-12 23:38     ` Vladislav Shpilevoy
  1 sibling, 0 replies; 26+ messages in thread
From: Konstantin Osipov @ 2019-12-12  8:50 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches, Vladislav Shpilevoy

* Igor Munkin <imun@tarantool.org> [19/12/12 03:03]:
> I spent some time knee deep into fibers machinery and totally agree with
> this fix: the most safe approach is "unrefing" storage entity via
> tarantool_L coro, taking into account the possible unref of the fiber's
> one and its consequtive collection.
> 
> However, I don't get why we need to apply these changes, considering
> your follow-up patch. Could you please provide a bit more detailed
> rationale?

I think it's worth applying them simply because the follow up
changes may end up being something completely different.

I agree this is a good fix.


-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto
  2019-12-12  0:00           ` Igor Munkin
@ 2019-12-12 23:37             ` Vladislav Shpilevoy
  2019-12-13 13:35               ` Igor Munkin
  0 siblings, 1 reply; 26+ messages in thread
From: Vladislav Shpilevoy @ 2019-12-12 23:37 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches

Thanks for the review!

>> diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
>> index 00ae8cded..d89c640aa 100644
>> --- a/src/lib/core/fiber.c
>> +++ b/src/lib/core/fiber.c
>> @@ -787,7 +801,7 @@ static void
>>  fiber_reset(struct fiber *fiber)
>>  {
>>  	rlist_create(&fiber->on_yield);
>> -	rlist_create(&fiber->on_stop);
>> +	rlist_create(&fiber->on_cleanup);
>>  	fiber->flags = FIBER_DEFAULT_FLAGS;
>>  #if ENABLE_FIBER_TOP
>>  	clock_stat_reset(&fiber->clock_stat);
>> @@ -856,8 +870,7 @@ fiber_loop(MAYBE_UNUSED void *data)
>>  		       assert(f != fiber);
>>  		       fiber_wakeup(f);
>>  	        }
>> -		if (! rlist_empty(&fiber->on_stop))
>> -			trigger_run(&fiber->on_stop, fiber);
>> +	        fiber_cleanup(fiber);
> 
> I see you dropped the if above and fiber_cleanup doesn't have any
> corresponding within. Is this removal an intentional one?

Yes. This is because trigger_run works fine with empty
rlist, so this check was useless.

> 
> Minor: There is a mess with whitespace above. Feel free to ignore the
> remark, since indentation was broken before your changes. I blamed
> several lines and it looks more like a ridiculous code style, and not a
> problem with editor. Thereby you just implicitly continue the mix.

Wow, good catch! I didn't notice it in git diff nor in the
editor. Fixed.

> 
>>  		/* reset pending wakeups */
>>  		rlist_del(&fiber->state);
>>  		if (! (fiber->flags & FIBER_IS_JOINABLE))
>> diff --git a/src/lib/core/fiber.h b/src/lib/core/fiber.h
>> index c5b975513..21fff8f88 100644
>> --- a/src/lib/core/fiber.h
>> +++ b/src/lib/core/fiber.h
>> @@ -458,8 +458,17 @@ struct fiber {
>>  
>>  	/** Triggers invoked before this fiber yields. Must not throw. */
>>  	struct rlist on_yield;
>> -	/** Triggers invoked before this fiber stops.  Must not throw. */
>> -	struct rlist on_stop;
>> +	/**
>> +	 * Triggers invoked before this fiber is
>> +	 * stopped/reset/recycled/destroyed/reused. In other
>> +	 * words, each time when the fiber has finished execution
>> +	 * of a request.
>> +	 * In particular, for fibers not from fiber pool the
>> +	 * cleanup is done before destroy and death.
>> +	 * Pooled fibers are cleaned up after each request, even
>> +	 * if they are never destroyed.
>> +	 */
>> +	struct rlist on_cleanup;
> 
> Minor: both trigger lists above has the "Must not throw" note. Does the
> introduced list respect this rule? Please, adjust the comment if yes.

Nope. These 'must not throw' are outdated. Our triggers does not
throw errors anymore by design. So I deleted it from the updated
comment.

>>  	/**
>>  	 * The list of fibers awaiting for this fiber's timely
>>  	 * (or untimely) death.
>> diff --git a/src/lua/fiber.c b/src/lua/fiber.c
>> index a7b75f9bf..4d2828f1c 100644
>> --- a/src/lua/fiber.c
>> +++ b/src/lua/fiber.c
>> @@ -606,12 +601,43 @@ lbox_fiber_name(struct lua_State *L)
>>  	}
>>  }
>>  
>> +/**
>> + * Trigger invoked when the fiber has finished execution of its
>> + * current request. Only purpose - delete storage.lua.ref keeping
>> + * a reference of Lua fiber.storage object. Unlike Lua stack,
>> + * Lua fiber storage may be created not only for fibers born from
>> + * Lua land. For example, an IProto request may execute a Lua
>> + * function, which can create the storage. Trigger guarantees,
>> + * that even for non-Lua fibers the Lua storage is destroyed.
>> + */
>> +static int
>> +lbox_fiber_on_cleanup(struct trigger *trigger, void *event)
>> +{
>> +	struct fiber *f = (struct fiber *) event;
>> +	int storage_ref = f->storage.lua.ref;
>> +	assert(storage_ref > 0);
>> +	luaL_unref(tarantool_L, LUA_REGISTRYINDEX, storage_ref);
>> +	f->storage.lua.ref = 0;
> 
> Though 0 value is specific "ref" (a table slot, where the LRU ref is
> stored) and this value cannot be yield from luaL_ref. Please, consider
> using LUA_NOREF value for such cases, since it's being provided by Lua
> and lua_unref does nothing for it, as well as for LUA_REFNIL.

Thanks, didn't know about these constants.

===================================================================
 	assert(storage_ref > 0);
 	luaL_unref(tarantool_L, LUA_REGISTRYINDEX, storage_ref);
-	f->storage.lua.ref = 0;
+	f->storage.lua.ref = LUA_NOREF;
 	trigger_clear(trigger);
 	free(trigger);
===================================================================

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] fiber: unref fiber.storage via global Lua state
  2019-12-11 23:57   ` Igor Munkin
  2019-12-12  8:50     ` Konstantin Osipov
@ 2019-12-12 23:38     ` Vladislav Shpilevoy
  2019-12-13 10:44       ` Igor Munkin
  1 sibling, 1 reply; 26+ messages in thread
From: Vladislav Shpilevoy @ 2019-12-12 23:38 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches

Hi! Thanks for the review!

On 12/12/2019 00:57, Igor Munkin wrote:
> Vlad,
> 
> Thanks for the patch!
> 
> I spent some time knee deep into fibers machinery and totally agree with
> this fix: the most safe approach is "unrefing" storage entity via
> tarantool_L coro, taking into account the possible unref of the fiber's
> one and its consequtive collection.
> 
> However, I don't get why we need to apply these changes, considering
> your follow-up patch. Could you please provide a bit more detailed
> rationale?

My follow up patch is about a different bug and I wanted to keep
these independent fixes separated. That would have been strange,
if in the second commit I had just silently replaced the old L
with tarantool_L.

> 
> On 08.12.19, Vladislav Shpilevoy wrote:
>> Fiber.storage is a table, available from anywhere in the fiber. It
>> is destroyed after fiber function is finished. That provides a
>> reliable fiber-local storage, similar to thread-local in C/C++.
>>
>> But there is a problem that the storage may be created via one
>> struct lua_State, and destroyed via another. Here is an example:
>>
>>     function test_storage()
>>         fiber.self().storage.key = 100
>>     end
>>     box.schema.func.create('test_storage')
>>     _ = fiber.create(function()
>>         box.func.test_storage:call()
>>     end)
>>
>> There are 3 struct lua_State:
>>     tarantool_L - global always alive Lua state;
>>     L1 - stack of the fiber, created by fiber.create();
>>     L2 - stack created by that fiber to execute test_storage().
> 
> Minor: Strictly saying, lua_State is a Lua coroutine, and not just a
> guest stack. Please, consider adjusting the commit message, but feel
> free to ignore this remark.
> 

Agree, here is a new version of this paragraph:

"
    There are 3 struct lua_State:
        tarantool_L - global always alive state;
        L1 - Lua coroutine of the fiber, created by fiber.create();
        L2 - Lua coroutine created by that fiber to execute
             test_storage().
"

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto
  2019-12-12  8:46           ` Konstantin Osipov
@ 2019-12-13  0:02             ` Vladislav Shpilevoy
  2019-12-13  7:58               ` Konstantin Osipov
  0 siblings, 1 reply; 26+ messages in thread
From: Vladislav Shpilevoy @ 2019-12-13  0:02 UTC (permalink / raw)
  To: Konstantin Osipov, tarantool-patches

Hi! Thanks for the review!

On 12/12/2019 09:46, Konstantin Osipov wrote:
> * Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [19/12/12 00:27]:
>> - I moved cleanup to iproto.cc, and called it tx_fiber_cleanup().
>>   By analogue with tx_fiber_init() which currently initializes the
>>   fiber;
>>
>> - I made cleanup triggers always clean them by themselves to remove
>>   trigger_destroy() from fiber.c.
>>
>> But it didn't help me to understand, why are you saying, that
>> tx and session should not use these triggers. Tx, in case it was
>> not removed from fiber before its cleanup, should be destroyed.
>> Session uses that trigger to destroy local one-fiber sessions,
>> which die together with the fiber getting recycled/killed. So
>> what is a problem? That trigger perfectly fits all the needs.
> 
> OK, the function itself may be a match, but it doesn't make the
> idea of using the same trigger object for everything right, don't
> you think?

For everything possible to imagine - no. For txn, session, and
language specific fiber-local data - yes. And by coincidence that
is pretty much everything we have now.

> 
> My point is that you do  not only move where the function is
> invoked, but also where the trigger list object is stored.

Nope. The trigger list is still in struct fiber.

> 
> Can you move it to Lua fiber object and iproto_msg object?

I am not sure I understand what you mean. I can't move lua fiber
storage destructor to iproto. I need Lua for that. So this should
be in Lua folder.

If you are talking about moving trigger list to iproto and Lua,
and somehow saying pointer at that list to a member in struct
fiber, then
  1) I would need to patch C version too, because we have public
     C API. The fact you forgot about C API says, that it is not
     a good idea to make that cleanup in each front end. Easy to
     miss something;
  2) That will give exactly the same - a member in struct fiber.
     There just no other communication channel between different
     frontends in which a fiber may participate during lifetime;
  3) We would need to add a new member to struct fiber and
     increase its size.

> There could be no other entry points to the request, so my point
> is that the trigger is part of request state, not fiber state.
'Request' idea is different for session, for txn, for iproto.
You can't define one request for all things.

While you are trying to define a 'request' concept, there is an
already defined concept of fiber function. The function which a
fiber executes. That should be the anchor. This is the 'request'
in terms of fibers. Fiber owns session, fiber owns txn, fiber
owns Lua resources. So it is his task to clean them up. And it
should be in his triggers. On_cleanup is fiber's destructor.

Yes, fiber can hand out these resources, but all access to them
goes through fiber object. Fiber owns them, and fiber should
destroy them when its current task is done. This is fiber's
responsibility.

> 
> Does it make any sense?
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto
  2019-12-13  0:02             ` Vladislav Shpilevoy
@ 2019-12-13  7:58               ` Konstantin Osipov
  2019-12-13 23:11                 ` Vladislav Shpilevoy
  0 siblings, 1 reply; 26+ messages in thread
From: Konstantin Osipov @ 2019-12-13  7:58 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

* Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [19/12/13 09:42]:
> > Can you move it to Lua fiber object and iproto_msg object?
> 
> I am not sure I understand what you mean. I can't move lua fiber
> storage destructor to iproto. I need Lua for that. So this should
> be in Lua folder.

I mean struct rlist on_request_end should be a member of
iproto_msg, not struct fiber. Does it make sense?

> If you are talking about moving trigger list to iproto and Lua,
> and somehow saying pointer at that list to a member in struct
> fiber, then

No, why do you need to store the list in struct fiber?

>   1) I would need to patch C version too, because we have public
>      C API. The fact you forgot about C API says, that it is not
>      a good idea to make that cleanup in each front end. Easy to
>      miss something;

C api will enter into Lua land (lbox_pushfiber) as soon as it
creates a Lua fiber, and the Lua land will call the trigger in 
the end of lbox_fiber_run_f.

>   2) That will give exactly the same - a member in struct fiber.
>      There just no other communication channel between different
>      frontends in which a fiber may participate during lifetime;

Why do you think there will be different frontends involved? The
storage is pure-Lua, there is no fiber storage for non-Lua fibers.

>   3) We would need to add a new member to struct fiber and
>      increase its size.

Please see above, do you get the idea now?


-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] fiber: unref fiber.storage via global Lua state
  2019-12-12 23:38     ` Vladislav Shpilevoy
@ 2019-12-13 10:44       ` Igor Munkin
  0 siblings, 0 replies; 26+ messages in thread
From: Igor Munkin @ 2019-12-13 10:44 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Vlad,

On 13.12.19, Vladislav Shpilevoy wrote:
> Hi! Thanks for the review!
> 
> On 12/12/2019 00:57, Igor Munkin wrote:
> > Vlad,
> > 
> > Thanks for the patch!
> > 
> > I spent some time knee deep into fibers machinery and totally agree with
> > this fix: the most safe approach is "unrefing" storage entity via
> > tarantool_L coro, taking into account the possible unref of the fiber's
> > one and its consequtive collection.
> > 
> > However, I don't get why we need to apply these changes, considering
> > your follow-up patch. Could you please provide a bit more detailed
> > rationale?
> 
> My follow up patch is about a different bug and I wanted to keep
> these independent fixes separated. That would have been strange,
> if in the second commit I had just silently replaced the old L
> with tarantool_L.
> 

Yes, on second thought I agree with you and Kostja, so here is my LGTM
for this patch.

> > 
> > On 08.12.19, Vladislav Shpilevoy wrote:
> >> Fiber.storage is a table, available from anywhere in the fiber. It
> >> is destroyed after fiber function is finished. That provides a
> >> reliable fiber-local storage, similar to thread-local in C/C++.
> >>
> >> But there is a problem that the storage may be created via one
> >> struct lua_State, and destroyed via another. Here is an example:
> >>
> >>     function test_storage()
> >>         fiber.self().storage.key = 100
> >>     end
> >>     box.schema.func.create('test_storage')
> >>     _ = fiber.create(function()
> >>         box.func.test_storage:call()
> >>     end)
> >>
> >> There are 3 struct lua_State:
> >>     tarantool_L - global always alive Lua state;
> >>     L1 - stack of the fiber, created by fiber.create();
> >>     L2 - stack created by that fiber to execute test_storage().
> > 
> > Minor: Strictly saying, lua_State is a Lua coroutine, and not just a
> > guest stack. Please, consider adjusting the commit message, but feel
> > free to ignore this remark.
> > 
> 
> Agree, here is a new version of this paragraph:
> 
> "
>     There are 3 struct lua_State:
>         tarantool_L - global always alive state;
>         L1 - Lua coroutine of the fiber, created by fiber.create();
>         L2 - Lua coroutine created by that fiber to execute
>              test_storage().
> "

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto
  2019-12-12 23:37             ` Vladislav Shpilevoy
@ 2019-12-13 13:35               ` Igor Munkin
  0 siblings, 0 replies; 26+ messages in thread
From: Igor Munkin @ 2019-12-13 13:35 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Vlad,

On 13.12.19, Vladislav Shpilevoy wrote:
> Thanks for the review!
> 
<snipped>
> 
> > 
> >>  		/* reset pending wakeups */
> >>  		rlist_del(&fiber->state);
> >>  		if (! (fiber->flags & FIBER_IS_JOINABLE))
> >> diff --git a/src/lib/core/fiber.h b/src/lib/core/fiber.h
> >> index c5b975513..21fff8f88 100644
> >> --- a/src/lib/core/fiber.h
> >> +++ b/src/lib/core/fiber.h
> >> @@ -458,8 +458,17 @@ struct fiber {
> >>  
> >>  	/** Triggers invoked before this fiber yields. Must not throw. */
> >>  	struct rlist on_yield;
> >> -	/** Triggers invoked before this fiber stops.  Must not throw. */
> >> -	struct rlist on_stop;
> >> +	/**
> >> +	 * Triggers invoked before this fiber is
> >> +	 * stopped/reset/recycled/destroyed/reused. In other
> >> +	 * words, each time when the fiber has finished execution
> >> +	 * of a request.
> >> +	 * In particular, for fibers not from fiber pool the
> >> +	 * cleanup is done before destroy and death.
> >> +	 * Pooled fibers are cleaned up after each request, even
> >> +	 * if they are never destroyed.
> >> +	 */
> >> +	struct rlist on_cleanup;
> > 
> > Minor: both trigger lists above has the "Must not throw" note. Does the
> > introduced list respect this rule? Please, adjust the comment if yes.
> 
> Nope. These 'must not throw' are outdated. Our triggers does not
> throw errors anymore by design. So I deleted it from the updated
> comment.
> 

OK, thanks, now it's clear. Guess we need an activity to update comments
for other trigger lists.

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto
  2019-12-13  7:58               ` Konstantin Osipov
@ 2019-12-13 23:11                 ` Vladislav Shpilevoy
  2019-12-14 12:26                   ` Konstantin Osipov
  0 siblings, 1 reply; 26+ messages in thread
From: Vladislav Shpilevoy @ 2019-12-13 23:11 UTC (permalink / raw)
  To: Konstantin Osipov, tarantool-patches



On 13/12/2019 08:58, Konstantin Osipov wrote:
> * Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [19/12/13 09:42]:
>>> Can you move it to Lua fiber object and iproto_msg object?
>>
>> I am not sure I understand what you mean. I can't move lua fiber
>> storage destructor to iproto. I need Lua for that. So this should
>> be in Lua folder.
> 
> I mean struct rlist on_request_end should be a member of
> iproto_msg, not struct fiber. Does it make sense?
> 
>> If you are talking about moving trigger list to iproto and Lua,
>> and somehow saying pointer at that list to a member in struct
>> fiber, then
> 
> No, why do you need to store the list in struct fiber?
> 

Because the only thing which is available in lua/fiber.c is the
fiber object.

>>   1) I would need to patch C version too, because we have public
>>      C API. The fact you forgot about C API says, that it is not
>>      a good idea to make that cleanup in each front end. Easy to
>>      miss something;
> 
> C api will enter into Lua land (lbox_pushfiber) as soon as it
> creates a Lua fiber, and the Lua land will call the trigger in 
> the end of lbox_fiber_run_f.

lbox_fiber_run_f is not called for C fibers. This is for Lua born
fibers only. To invoke a C function Lua is not used.

>>   2) That will give exactly the same - a member in struct fiber.
>>      There just no other communication channel between different
>>      frontends in which a fiber may participate during lifetime;
> 
> Why do you think there will be different frontends involved? The
> storage is pure-Lua, there is no fiber storage for non-Lua fibers.
> 

Because other frontends will appear.

>>   3) We would need to add a new member to struct fiber and
>>      increase its size.
> 
> Please see above, do you get the idea now?
> 
> 

No. I don't understand what you want and how.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto
  2019-12-13 23:11                 ` Vladislav Shpilevoy
@ 2019-12-14 12:26                   ` Konstantin Osipov
  2019-12-14 12:30                     ` Konstantin Osipov
  0 siblings, 1 reply; 26+ messages in thread
From: Konstantin Osipov @ 2019-12-14 12:26 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

* Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [19/12/14 02:13]:
> lbox_fiber_run_f is not called for C fibers. This is for Lua born
> fibers only. To invoke a C function Lua is not used.

It can't access fiber storage. 

> >>   2) That will give exactly the same - a member in struct fiber.
> >>      There just no other communication channel between different
> >>      frontends in which a fiber may participate during lifetime;
> > 
> > Why do you think there will be different frontends involved? The
> > storage is pure-Lua, there is no fiber storage for non-Lua fibers.
> > 
> 
> Because other frontends will appear.

request storage is specific to Lua which doesn't have Go-style deferrables,
Python-style finally() or any other working means for controlled
garbage collection.

Even in Lua it is better done as a third party library, which works
the same way as box.atomic(): wraps the request body with 
a RAII-style function.

This feature *is* broken & redundant, and instead of making the
damage controlled, you want to "fix" it by making the feature
universal and implementing it on C level.

> 
> >>   3) We would need to add a new member to struct fiber and
> >>      increase its size.
> > 
> > Please see above, do you get the idea now?
> 
> No. I don't understand what you want and how.

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto
  2019-12-14 12:26                   ` Konstantin Osipov
@ 2019-12-14 12:30                     ` Konstantin Osipov
  2019-12-14 12:33                       ` Konstantin Osipov
  0 siblings, 1 reply; 26+ messages in thread
From: Konstantin Osipov @ 2019-12-14 12:30 UTC (permalink / raw)
  To: Vladislav Shpilevoy, tarantool-patches

* Konstantin Osipov <kostja.osipov@gmail.com> [19/12/14 15:26]:
> request storage is specific to Lua which doesn't have Go-style deferrables,
> Python-style finally() or any other working means for controlled
> garbage collection.
> 
> Even in Lua it is better done as a third party library, which works
> the same way as box.atomic(): wraps the request body with 
> a RAII-style function.
> 
> This feature *is* broken & redundant, and instead of making the
> damage controlled, you want to "fix" it by making the feature
> universal and implementing it on C level.

I checked now, and Lua addeded debug.getupvalue() to Lua C API.

On other words, fiber storage now can be implemented in pure Lua,
without any triggers. 

Back when we were writing, upvalue API was only available in C
API, and AFAIR there was no way to access upvalues from Lua, this
is why we had to come up with this hack.

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto
  2019-12-14 12:30                     ` Konstantin Osipov
@ 2019-12-14 12:33                       ` Konstantin Osipov
  2019-12-14 16:49                         ` Vladislav Shpilevoy
  0 siblings, 1 reply; 26+ messages in thread
From: Konstantin Osipov @ 2019-12-14 12:33 UTC (permalink / raw)
  To: Vladislav Shpilevoy, tarantool-patches

* Konstantin Osipov <kostja.osipov@gmail.com> [19/12/14 15:30]:
> I checked now, and Lua addeded debug.getupvalue() to Lua C API.
> 
> On other words, fiber storage now can be implemented in pure Lua,
> without any triggers. 
> 
> Back when we were writing, upvalue API was only available in C
> API, and AFAIR there was no way to access upvalues from Lua, this
> is why we had to come up with this hack.

The feature can actually be deprecated. There is now a completely
legitimate and easy to use way to have a request-local object with
a user defined destructor in pure Lua.

If you do want to keep it, you should try to minimized the damage
to the core, not extend it.

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto
  2019-12-14 12:33                       ` Konstantin Osipov
@ 2019-12-14 16:49                         ` Vladislav Shpilevoy
  2019-12-14 20:59                           ` Vladislav Shpilevoy
  0 siblings, 1 reply; 26+ messages in thread
From: Vladislav Shpilevoy @ 2019-12-14 16:49 UTC (permalink / raw)
  To: Konstantin Osipov, tarantool-patches

Well, I still don't understand what you are keep saying and
what do you want, and since it does not look like I will, I
am out of this ticket.

On 14/12/2019 13:33, Konstantin Osipov wrote:
> * Konstantin Osipov <kostja.osipov@gmail.com> [19/12/14 15:30]:
>> I checked now, and Lua addeded debug.getupvalue() to Lua C API.
>>
>> On other words, fiber storage now can be implemented in pure Lua,
>> without any triggers. 
>>
>> Back when we were writing, upvalue API was only available in C
>> API, and AFAIR there was no way to access upvalues from Lua, this
>> is why we had to come up with this hack.
> 
> The feature can actually be deprecated. There is now a completely
> legitimate and easy to use way to have a request-local object with
> a user defined destructor in pure Lua.
> 
> If you do want to keep it, you should try to minimized the damage
> to the core, not extend it.
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto
  2019-12-14 16:49                         ` Vladislav Shpilevoy
@ 2019-12-14 20:59                           ` Vladislav Shpilevoy
  0 siblings, 0 replies; 26+ messages in thread
From: Vladislav Shpilevoy @ 2019-12-14 20:59 UTC (permalink / raw)
  To: Konstantin Osipov, tarantool-patches

Ok, we discussed the ticket verbally. It was
decided to keep it as is but revert all renames.

Also I will add a doc request in which I will
describe with details when and how fiber storage
behaves. For example, what about triggers, applier.

On 14/12/2019 17:49, Vladislav Shpilevoy wrote:
> Well, I still don't understand what you are keep saying and
> what do you want, and since it does not look like I will, I
> am out of this ticket.
> 
> On 14/12/2019 13:33, Konstantin Osipov wrote:
>> * Konstantin Osipov <kostja.osipov@gmail.com> [19/12/14 15:30]:
>>> I checked now, and Lua addeded debug.getupvalue() to Lua C API.
>>>
>>> On other words, fiber storage now can be implemented in pure Lua,
>>> without any triggers. 
>>>
>>> Back when we were writing, upvalue API was only available in C
>>> API, and AFAIR there was no way to access upvalues from Lua, this
>>> is why we had to come up with this hack.
>>
>> The feature can actually be deprecated. There is now a completely
>> legitimate and easy to use way to have a request-local object with
>> a user defined destructor in pure Lua.
>>
>> If you do want to keep it, you should try to minimized the damage
>> to the core, not extend it.
>>

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2019-12-14 20:59 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-08 19:28 [Tarantool-patches] [PATCH 0/2] Fiber storage leak Vladislav Shpilevoy
2019-12-08 19:28 ` [Tarantool-patches] [PATCH 1/2] fiber: unref fiber.storage via global Lua state Vladislav Shpilevoy
2019-12-11 23:57   ` Igor Munkin
2019-12-12  8:50     ` Konstantin Osipov
2019-12-12 23:38     ` Vladislav Shpilevoy
2019-12-13 10:44       ` Igor Munkin
2019-12-08 19:28 ` [Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto Vladislav Shpilevoy
2019-12-09  7:21   ` Konstantin Osipov
2019-12-09 23:31     ` Vladislav Shpilevoy
2019-12-10  8:21       ` Konstantin Osipov
2019-12-10  8:32   ` Konstantin Osipov
2019-12-10 22:59     ` Vladislav Shpilevoy
2019-12-11  7:08       ` Konstantin Osipov
2019-12-11 21:23         ` Vladislav Shpilevoy
2019-12-12  0:00           ` Igor Munkin
2019-12-12 23:37             ` Vladislav Shpilevoy
2019-12-13 13:35               ` Igor Munkin
2019-12-12  8:46           ` Konstantin Osipov
2019-12-13  0:02             ` Vladislav Shpilevoy
2019-12-13  7:58               ` Konstantin Osipov
2019-12-13 23:11                 ` Vladislav Shpilevoy
2019-12-14 12:26                   ` Konstantin Osipov
2019-12-14 12:30                     ` Konstantin Osipov
2019-12-14 12:33                       ` Konstantin Osipov
2019-12-14 16:49                         ` Vladislav Shpilevoy
2019-12-14 20:59                           ` Vladislav Shpilevoy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox