Tarantool development patches archive
 help / color / mirror / Atom feed
From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: tarantool-patches@dev.tarantool.org, kostja.osipov@gmail.com
Subject: [Tarantool-patches] [PATCH v2 2/3] fiber: destroy fiber.storage created by iproto
Date: Thu, 16 Jan 2020 22:54:22 +0100	[thread overview]
Message-ID: <31e884f2b6d2ab7b222229f1b204fa067d63ae65.1579211601.git.v.shpilevoy@tarantool.org> (raw)
In-Reply-To: <cover.1579211601.git.v.shpilevoy@tarantool.org>

Fiber.storage was not deleted when created in a fiber started from
the thread pool used by IProto requests. The problem was that
fiber.storage was created and deleted in Lua land only, assuming
that only Lua-born fibers could have it. But in fact any fiber can
create a Lua storage. Including the ones used to serve IProto
requests.

Not deletion of the storage led to a possibility of meeting a
non-empty fiber.storage in the beginning of an iproto request, and
to not deletion of the memory caught by the storage until its
explicit nullification.

Now the storage destructor works for any fiber, which managed to
create the storage. The destructor unrefs and nullifies the
storage.

For destructor purposes the fiber.on_stop triggers were reworked.
Now they can be called multiple times during fiber's lifetime.
After every request done by that fiber.

Closes #4662
Closes #3462

@TarantoolBot document
Title: Clarify fiber.storage lifetime

Fiber.storage is a Lua table created when it is first accessed. On
the site it is said that it is deleted when fiber is canceled via
fiber:cancel(). But it is not the full truth.

Fiber.storage is destroyed when the fiber is finished. Regardless
of how is it finished - via :cancel(), or the fiber's function
did 'return', it does not matter. Moreover, from that moment the
storage is cleaned up even for pooled fibers used to serve IProto
requests. Pooled fibers never really die, but nonetheless their
storage is cleaned up after each request. That makes possible to
use fiber.storage as a full featured request-local storage.

Fiber.storage may be created for a fiber no matter how the fiber
itself was created - from C, from Lua. For example, a fiber could
be created in C using fiber_new(), then it could insert into a
space, which had Lua on_replace triggers, and one of the triggers
could create fiber.storage. That storage will be deleted when the
fiber is stopped.

Another place where fiber.storage may be created - for replication
applier fiber. Applier has a fiber from which it applies
transactions from a remote instance. In case the applier fiber
somehow creates a fiber.storage (for example, from a space trigger
again), the storage won't be deleted until the applier fiber is
stopped.
---
 src/box/iproto.cc                            | 44 ++++++++--
 src/lib/core/fiber.c                         | 18 +++-
 src/lib/core/fiber.h                         | 14 +++-
 src/lua/fiber.c                              | 35 ++++++--
 test/app/gh-4662-fiber-storage-leak.result   | 88 ++++++++++++++++++++
 test/app/gh-4662-fiber-storage-leak.test.lua | 43 ++++++++++
 6 files changed, 225 insertions(+), 17 deletions(-)
 create mode 100644 test/app/gh-4662-fiber-storage-leak.result
 create mode 100644 test/app/gh-4662-fiber-storage-leak.test.lua

diff --git a/src/box/iproto.cc b/src/box/iproto.cc
index 5e83ab0ea..0526f2305 100644
--- a/src/box/iproto.cc
+++ b/src/box/iproto.cc
@@ -1310,6 +1310,11 @@ static void
 tx_fiber_init(struct session *session, uint64_t sync)
 {
 	struct fiber *f = fiber();
+	/*
+	 * There should not be any not executed on_stop triggers
+	 * from a previous request executed in that fiber.
+	 */
+	assert(rlist_empty(&f->on_stop));
 	f->storage.net.sync = sync;
 	/*
 	 * We do not cleanup fiber keys at the end of each request.
@@ -1325,6 +1330,17 @@ tx_fiber_init(struct session *session, uint64_t sync)
 	fiber_set_user(f, &session->credentials);
 }
 
+/**
+ * Stop the current fiber after a request is executed to make it
+ * possible to reuse the fiber for a next request. On_stop
+ * triggers remove all request-specific data from there.
+ */
+static inline void
+tx_fiber_on_stop()
+{
+	fiber_on_stop(fiber());
+}
+
 static void
 tx_process_disconnect(struct cmsg *m)
 {
@@ -1335,6 +1351,7 @@ tx_process_disconnect(struct cmsg *m)
 		if (! rlist_empty(&session_on_disconnect)) {
 			tx_fiber_init(con->session, 0);
 			session_run_on_disconnect_triggers(con->session);
+			tx_fiber_on_stop();
 		}
 	}
 }
@@ -1490,6 +1507,7 @@ tx_reply_iproto_error(struct cmsg *m)
 	iproto_reply_error(out, diag_last_error(&msg->diag),
 			   msg->header.sync, ::schema_version);
 	iproto_wpos_create(&msg->wpos, out);
+	tx_fiber_on_stop();
 }
 
 /** Inject a short delay on tx request processing for testing. */
@@ -1523,9 +1541,11 @@ tx_process1(struct cmsg *m)
 	iproto_reply_select(out, &svp, msg->header.sync, ::schema_version,
 			    tuple != 0);
 	iproto_wpos_create(&msg->wpos, out);
-	return;
+	goto end;
 error:
 	tx_reply_error(msg);
+end:
+	tx_fiber_on_stop();
 }
 
 static void
@@ -1566,9 +1586,11 @@ tx_process_select(struct cmsg *m)
 	iproto_reply_select(out, &svp, msg->header.sync,
 			    ::schema_version, count);
 	iproto_wpos_create(&msg->wpos, out);
-	return;
+	goto end;
 error:
 	tx_reply_error(msg);
+end:
+	tx_fiber_on_stop();
 }
 
 static int
@@ -1656,9 +1678,11 @@ tx_process_call(struct cmsg *m)
 	iproto_reply_select(out, &svp, msg->header.sync,
 			    ::schema_version, count);
 	iproto_wpos_create(&msg->wpos, out);
-	return;
+	goto end;
 error:
 	tx_reply_error(msg);
+end:
+	tx_fiber_on_stop();
 }
 
 static void
@@ -1699,9 +1723,11 @@ tx_process_misc(struct cmsg *m)
 	} catch (Exception *e) {
 		tx_reply_error(msg);
 	}
-	return;
+	goto end;
 error:
 	tx_reply_error(msg);
+end:
+	tx_fiber_on_stop();
 }
 
 static void
@@ -1716,8 +1742,6 @@ tx_process_sql(struct cmsg *m)
 	uint32_t len;
 	bool is_unprepare = false;
 
-	tx_fiber_init(msg->connection->session, msg->header.sync);
-
 	if (tx_check_schema(msg->header.schema_version))
 		goto error;
 	assert(msg->header.type == IPROTO_EXECUTE ||
@@ -1794,9 +1818,11 @@ tx_process_sql(struct cmsg *m)
 	}
 	iproto_reply_sql(out, &header_svp, msg->header.sync, schema_version);
 	iproto_wpos_create(&msg->wpos, out);
-	return;
+	goto end;
 error:
 	tx_reply_error(msg);
+end:
+	tx_fiber_on_stop();
 }
 
 static void
@@ -1835,11 +1861,12 @@ tx_process_replication(struct cmsg *m)
 			unreachable();
 		}
 	} catch (SocketError *e) {
-		return; /* don't write error response to prevent SIGPIPE */
+		/* don't write error response to prevent SIGPIPE */
 	} catch (Exception *e) {
 		iproto_write_error(con->input.fd, e, ::schema_version,
 				   msg->header.sync);
 	}
+	tx_fiber_on_stop();
 }
 
 static void
@@ -1945,6 +1972,7 @@ tx_process_connect(struct cmsg *m)
 		tx_reply_error(msg);
 		msg->close_connection = true;
 	}
+	tx_fiber_on_stop();
 }
 
 /**
diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
index 00ae8cded..634b3d1b0 100644
--- a/src/lib/core/fiber.c
+++ b/src/lib/core/fiber.c
@@ -325,6 +325,19 @@ fiber_attr_getstacksize(struct fiber_attr *fiber_attr)
 				    fiber_attr_default.stack_size;
 }
 
+void
+fiber_on_stop(struct fiber *f)
+{
+	if (trigger_run(&f->on_stop, f) != 0)
+		panic("On_stop triggers can't fail");
+	/*
+	 * All on_stop triggers are supposed to remove themselves.
+	 * So as no to waste time on that here, and to make them
+	 * all work uniformly.
+	 */
+	assert(rlist_empty(&f->on_stop));
+}
+
 static void
 fiber_recycle(struct fiber *fiber);
 
@@ -856,8 +869,7 @@ fiber_loop(MAYBE_UNUSED void *data)
 		       assert(f != fiber);
 		       fiber_wakeup(f);
 	        }
-		if (! rlist_empty(&fiber->on_stop))
-			trigger_run(&fiber->on_stop, fiber);
+		fiber_on_stop(fiber);
 		/* reset pending wakeups */
 		rlist_del(&fiber->state);
 		if (! (fiber->flags & FIBER_IS_JOINABLE))
@@ -1525,8 +1537,8 @@ cord_cojoin(struct cord *cord)
 int
 break_ev_loop_f(struct trigger *trigger, void *event)
 {
-	(void) trigger;
 	(void) event;
+	trigger_clear(trigger);
 	ev_break(loop(), EVBREAK_ALL);
 	return 0;
 }
diff --git a/src/lib/core/fiber.h b/src/lib/core/fiber.h
index faaf2e0da..8a3e5aef5 100644
--- a/src/lib/core/fiber.h
+++ b/src/lib/core/fiber.h
@@ -457,7 +457,15 @@ struct fiber {
 
 	/** Triggers invoked before this fiber yields. Must not throw. */
 	struct rlist on_yield;
-	/** Triggers invoked before this fiber stops.  Must not throw. */
+	/**
+	 * Triggers invoked before this fiber is stopped/reset/
+	 * recycled/destroyed/reused. In other words, each time
+	 * when the fiber has finished execution of a request.
+	 * In particular, for fibers not from a fiber pool the
+	 * stop event is emitted before destruction and death.
+	 * Pooled fibers receive the stop event after each
+	 * request, even if they are never destroyed.
+	 */
 	struct rlist on_stop;
 	/**
 	 * The list of fibers awaiting for this fiber's timely
@@ -505,6 +513,10 @@ struct fiber {
 	char name[FIBER_NAME_MAX + 1];
 };
 
+/** Invoke on_stop triggers and delete them. */
+void
+fiber_on_stop(struct fiber *f);
+
 struct cord_on_exit;
 
 /**
diff --git a/src/lua/fiber.c b/src/lua/fiber.c
index a7b75f9bf..575a020d0 100644
--- a/src/lua/fiber.c
+++ b/src/lua/fiber.c
@@ -441,11 +441,6 @@ lua_fiber_run_f(MAYBE_UNUSED va_list ap)
 	int coro_ref = lua_tointeger(L, -1);
 	lua_pop(L, 1);
 	result = luaT_call(L, lua_gettop(L) - 1, LUA_MULTRET);
-
-	/* Destroy local storage */
-	int storage_ref = f->storage.lua.ref;
-	if (storage_ref > 0)
-		luaL_unref(tarantool_L, LUA_REGISTRYINDEX, storage_ref);
 	/*
 	 * If fiber is not joinable
 	 * We can unref child stack here,
@@ -606,12 +601,42 @@ lbox_fiber_name(struct lua_State *L)
 	}
 }
 
+/**
+ * Trigger invoked when the fiber has stopped execution of its
+ * current request. Only purpose - delete storage.lua.ref keeping
+ * a reference of Lua fiber.storage object. Unlike Lua stack,
+ * Lua fiber storage may be created not only for fibers born from
+ * Lua land. For example, an IProto request may execute a Lua
+ * function, which can create the storage. Trigger guarantees,
+ * that even for non-Lua fibers the Lua storage is destroyed.
+ */
+static int
+lbox_fiber_on_stop(struct trigger *trigger, void *event)
+{
+	struct fiber *f = (struct fiber *) event;
+	int storage_ref = f->storage.lua.ref;
+	assert(storage_ref > 0);
+	luaL_unref(tarantool_L, LUA_REGISTRYINDEX, storage_ref);
+	f->storage.lua.ref = LUA_NOREF;
+	trigger_clear(trigger);
+	free(trigger);
+	return 0;
+}
+
 static int
 lbox_fiber_storage(struct lua_State *L)
 {
 	struct fiber *f = lbox_checkfiber(L, 1);
 	int storage_ref = f->storage.lua.ref;
 	if (storage_ref <= 0) {
+		struct trigger *t = (struct trigger *)
+			malloc(sizeof(*t));
+		if (t == NULL) {
+			diag_set(OutOfMemory, sizeof(*t), "malloc", "t");
+			return luaT_error(L);
+		}
+		trigger_create(t, lbox_fiber_on_stop, NULL, (trigger_f0) free);
+		trigger_add(&f->on_stop, t);
 		lua_newtable(L); /* create local storage on demand */
 		storage_ref = luaL_ref(L, LUA_REGISTRYINDEX);
 		f->storage.lua.ref = storage_ref;
diff --git a/test/app/gh-4662-fiber-storage-leak.result b/test/app/gh-4662-fiber-storage-leak.result
new file mode 100644
index 000000000..4ade017a4
--- /dev/null
+++ b/test/app/gh-4662-fiber-storage-leak.result
@@ -0,0 +1,88 @@
+-- test-run result file version 2
+fiber = require('fiber')
+ | ---
+ | ...
+netbox = require('net.box')
+ | ---
+ | ...
+
+--
+-- gh-4662: fiber.storage was not deleted when created in a fiber
+-- started from the thread pool used by IProto requests. The
+-- problem was that fiber.storage was created and deleted in Lua
+-- land only, assuming that only Lua-born fibers could have it.
+-- But in fact any fiber can create a Lua storage. Including the
+-- ones used to serve IProto requests.
+-- The test checks if fiber.storage is really deleted, and is not
+-- shared between requests.
+--
+
+box.schema.user.grant('guest', 'execute', 'universe')
+ | ---
+ | ...
+storage = nil
+ | ---
+ | ...
+i = 0
+ | ---
+ | ...
+weak_table = setmetatable({}, {__mode = 'v'})
+ | ---
+ | ...
+object = {'object'}
+ | ---
+ | ...
+weak_table.object = object
+ | ---
+ | ...
+function ref_object_in_fiber()                  \
+    storage = fiber.self().storage              \
+    assert(next(storage) == nil)                \
+    i = i + 1                                   \
+    fiber.self().storage.key = i                \
+    fiber.self().storage.object = object        \
+end
+ | ---
+ | ...
+
+c = netbox.connect(box.cfg.listen)
+ | ---
+ | ...
+c:call('ref_object_in_fiber') c:call('ref_object_in_fiber')
+ | ---
+ | ...
+storage
+ | ---
+ | - key: 2
+ |   object:
+ |   - object
+ | ...
+i
+ | ---
+ | - 2
+ | ...
+object = nil
+ | ---
+ | ...
+storage = nil
+ | ---
+ | ...
+collectgarbage('collect')
+ | ---
+ | - 0
+ | ...
+-- The weak table should be empty, because the only two hard
+-- references were in the fibers used to serve
+-- ref_object_in_fiber() requests. And their storages should be
+-- cleaned up.
+weak_table
+ | ---
+ | - []
+ | ...
+
+c:close()
+ | ---
+ | ...
+box.schema.user.revoke('guest', 'execute', 'universe')
+ | ---
+ | ...
diff --git a/test/app/gh-4662-fiber-storage-leak.test.lua b/test/app/gh-4662-fiber-storage-leak.test.lua
new file mode 100644
index 000000000..25acf5679
--- /dev/null
+++ b/test/app/gh-4662-fiber-storage-leak.test.lua
@@ -0,0 +1,43 @@
+fiber = require('fiber')
+netbox = require('net.box')
+
+--
+-- gh-4662: fiber.storage was not deleted when created in a fiber
+-- started from the thread pool used by IProto requests. The
+-- problem was that fiber.storage was created and deleted in Lua
+-- land only, assuming that only Lua-born fibers could have it.
+-- But in fact any fiber can create a Lua storage. Including the
+-- ones used to serve IProto requests.
+-- The test checks if fiber.storage is really deleted, and is not
+-- shared between requests.
+--
+
+box.schema.user.grant('guest', 'execute', 'universe')
+storage = nil
+i = 0
+weak_table = setmetatable({}, {__mode = 'v'})
+object = {'object'}
+weak_table.object = object
+function ref_object_in_fiber()                  \
+    storage = fiber.self().storage              \
+    assert(next(storage) == nil)                \
+    i = i + 1                                   \
+    fiber.self().storage.key = i                \
+    fiber.self().storage.object = object        \
+end
+
+c = netbox.connect(box.cfg.listen)
+c:call('ref_object_in_fiber') c:call('ref_object_in_fiber')
+storage
+i
+object = nil
+storage = nil
+collectgarbage('collect')
+-- The weak table should be empty, because the only two hard
+-- references were in the fibers used to serve
+-- ref_object_in_fiber() requests. And their storages should be
+-- cleaned up.
+weak_table
+
+c:close()
+box.schema.user.revoke('guest', 'execute', 'universe')
-- 
2.21.1 (Apple Git-122.3)

  parent reply	other threads:[~2020-01-16 21:54 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-16 21:54 [Tarantool-patches] [PATCH v2 0/3] Fiber storage leak Vladislav Shpilevoy
2020-01-16 21:54 ` [Tarantool-patches] [PATCH v2 1/3] fiber: unref fiber.storage via global Lua state Vladislav Shpilevoy
2020-01-17  7:30   ` Konstantin Osipov
2020-01-16 21:54 ` Vladislav Shpilevoy [this message]
2020-01-16 22:00   ` [Tarantool-patches] [PATCH v2 2/3] fiber: destroy fiber.storage created by iproto Cyrill Gorcunov
2020-01-17  7:47     ` Konstantin Osipov
2020-01-17  8:06       ` Cyrill Gorcunov
2020-01-17  7:45   ` Konstantin Osipov
2020-01-19 17:32     ` Vladislav Shpilevoy
2020-01-20  7:22       ` Konstantin Osipov
2020-01-20 19:15         ` Georgy Kirichenko
2020-01-21 22:21         ` Vladislav Shpilevoy
2020-01-21 22:32           ` Konstantin Osipov
2020-01-16 21:54 ` [Tarantool-patches] [PATCH v2 3/3] box: remove dead code from box_process_call/eval() Vladislav Shpilevoy
2020-01-17  7:46   ` Konstantin Osipov
2020-01-17  7:47   ` Konstantin Osipov
2020-01-17 17:41   ` Georgy Kirichenko
2020-01-19 17:32     ` Vladislav Shpilevoy
2020-01-20 19:21       ` Georgy Kirichenko
2020-01-18 19:27 ` [Tarantool-patches] [PATCH v2 0/3] Fiber storage leak Igor Munkin
2020-02-15  1:02 ` Vladislav Shpilevoy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=31e884f2b6d2ab7b222229f1b204fa067d63ae65.1579211601.git.v.shpilevoy@tarantool.org \
    --to=v.shpilevoy@tarantool.org \
    --cc=kostja.osipov@gmail.com \
    --cc=tarantool-patches@dev.tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH v2 2/3] fiber: destroy fiber.storage created by iproto' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox