* [Tarantool-patches] [PATCH v2 0/3] Fiber storage leak @ 2020-01-16 21:54 Vladislav Shpilevoy 2020-01-16 21:54 ` [Tarantool-patches] [PATCH v2 1/3] fiber: unref fiber.storage via global Lua state Vladislav Shpilevoy ` (4 more replies) 0 siblings, 5 replies; 21+ messages in thread From: Vladislav Shpilevoy @ 2020-01-16 21:54 UTC (permalink / raw) To: tarantool-patches, kostja.osipov The patchset makes fiber.storage be always deleted. Regardless of where was it created - in a fiber born in the Lua land, or in a fiber serving IProto requests and accessing Lua. That removes - a leak occurred each time when fiber storage was created for an IProto request; - a possibility to see previous request's artifacts left in fiber.storage; And makes possible to use fiber.storage as request-local data. Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-4662-fiber-storage-leak Issue: https://github.com/tarantool/tarantool/issues/4662 Changes in v2: - Renamed back fiber.on_cleanup -> fiber.on_stop; - Removed dead code from box_process_call/eval(), which is now handled by iproto. Vladislav Shpilevoy (3): fiber: unref fiber.storage via global Lua state fiber: destroy fiber.storage created by iproto box: remove dead code from box_process_call/eval() src/box/call.c | 28 +------ src/box/iproto.cc | 44 ++++++++-- src/box/txn.c | 1 + src/lib/core/fiber.c | 18 +++- src/lib/core/fiber.h | 14 +++- src/lua/fiber.c | 35 ++++++-- test/app/gh-4662-fiber-storage-leak.result | 88 ++++++++++++++++++++ test/app/gh-4662-fiber-storage-leak.test.lua | 43 ++++++++++ 8 files changed, 230 insertions(+), 41 deletions(-) create mode 100644 test/app/gh-4662-fiber-storage-leak.result create mode 100644 test/app/gh-4662-fiber-storage-leak.test.lua -- 2.21.1 (Apple Git-122.3) ^ permalink raw reply [flat|nested] 21+ messages in thread
* [Tarantool-patches] [PATCH v2 1/3] fiber: unref fiber.storage via global Lua state 2020-01-16 21:54 [Tarantool-patches] [PATCH v2 0/3] Fiber storage leak Vladislav Shpilevoy @ 2020-01-16 21:54 ` Vladislav Shpilevoy 2020-01-17 7:30 ` Konstantin Osipov 2020-01-16 21:54 ` [Tarantool-patches] [PATCH v2 2/3] fiber: destroy fiber.storage created by iproto Vladislav Shpilevoy ` (3 subsequent siblings) 4 siblings, 1 reply; 21+ messages in thread From: Vladislav Shpilevoy @ 2020-01-16 21:54 UTC (permalink / raw) To: tarantool-patches, kostja.osipov Fiber.storage is a table, available from anywhere in the fiber. It is destroyed after fiber function is finished. That provides a reliable fiber-local storage, similar to thread-local in C/C++. But there is a problem that the storage may be created via one struct lua_State, and destroyed via another. Here is an example: function test_storage() fiber.self().storage.key = 100 end box.schema.func.create('test_storage') _ = fiber.create(function() box.func.test_storage:call() end) There are 3 struct lua_State: tarantool_L - global always alive state; L1 - Lua coroutine of the fiber, created by fiber.create(); L2 - Lua coroutine created by that fiber to execute test_storage(). Fiber.storage is created on stack of L2 and referenced by global LUA_REGISTRYINDEX. Then it is unreferenced from L1 when the fiber is being destroyed. That is generally ok as soon as the storage object is always in LUA_REGISTRYINDEX, which is shared by all Lua states. But soon during destruction of the fiber.storage there will be only tarantool_L and the original L2. Original L2 may be already deleted by the time the storage is being destroyed. So this patch makes unref of the storage via reliable tarantool_L. Needed for #4662 --- src/lua/fiber.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/lua/fiber.c b/src/lua/fiber.c index 53ebec9aa..a7b75f9bf 100644 --- a/src/lua/fiber.c +++ b/src/lua/fiber.c @@ -445,7 +445,7 @@ lua_fiber_run_f(MAYBE_UNUSED va_list ap) /* Destroy local storage */ int storage_ref = f->storage.lua.ref; if (storage_ref > 0) - luaL_unref(L, LUA_REGISTRYINDEX, storage_ref); + luaL_unref(tarantool_L, LUA_REGISTRYINDEX, storage_ref); /* * If fiber is not joinable * We can unref child stack here, -- 2.21.1 (Apple Git-122.3) ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 1/3] fiber: unref fiber.storage via global Lua state 2020-01-16 21:54 ` [Tarantool-patches] [PATCH v2 1/3] fiber: unref fiber.storage via global Lua state Vladislav Shpilevoy @ 2020-01-17 7:30 ` Konstantin Osipov 0 siblings, 0 replies; 21+ messages in thread From: Konstantin Osipov @ 2020-01-17 7:30 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches * Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [20/01/17 00:57]: > Fiber.storage is a table, available from anywhere in the fiber. It > is destroyed after fiber function is finished. That provides a > reliable fiber-local storage, similar to thread-local in C/C++. lgtm -- Konstantin Osipov, Moscow, Russia ^ permalink raw reply [flat|nested] 21+ messages in thread
* [Tarantool-patches] [PATCH v2 2/3] fiber: destroy fiber.storage created by iproto 2020-01-16 21:54 [Tarantool-patches] [PATCH v2 0/3] Fiber storage leak Vladislav Shpilevoy 2020-01-16 21:54 ` [Tarantool-patches] [PATCH v2 1/3] fiber: unref fiber.storage via global Lua state Vladislav Shpilevoy @ 2020-01-16 21:54 ` Vladislav Shpilevoy 2020-01-16 22:00 ` Cyrill Gorcunov 2020-01-17 7:45 ` Konstantin Osipov 2020-01-16 21:54 ` [Tarantool-patches] [PATCH v2 3/3] box: remove dead code from box_process_call/eval() Vladislav Shpilevoy ` (2 subsequent siblings) 4 siblings, 2 replies; 21+ messages in thread From: Vladislav Shpilevoy @ 2020-01-16 21:54 UTC (permalink / raw) To: tarantool-patches, kostja.osipov Fiber.storage was not deleted when created in a fiber started from the thread pool used by IProto requests. The problem was that fiber.storage was created and deleted in Lua land only, assuming that only Lua-born fibers could have it. But in fact any fiber can create a Lua storage. Including the ones used to serve IProto requests. Not deletion of the storage led to a possibility of meeting a non-empty fiber.storage in the beginning of an iproto request, and to not deletion of the memory caught by the storage until its explicit nullification. Now the storage destructor works for any fiber, which managed to create the storage. The destructor unrefs and nullifies the storage. For destructor purposes the fiber.on_stop triggers were reworked. Now they can be called multiple times during fiber's lifetime. After every request done by that fiber. Closes #4662 Closes #3462 @TarantoolBot document Title: Clarify fiber.storage lifetime Fiber.storage is a Lua table created when it is first accessed. On the site it is said that it is deleted when fiber is canceled via fiber:cancel(). But it is not the full truth. Fiber.storage is destroyed when the fiber is finished. Regardless of how is it finished - via :cancel(), or the fiber's function did 'return', it does not matter. Moreover, from that moment the storage is cleaned up even for pooled fibers used to serve IProto requests. Pooled fibers never really die, but nonetheless their storage is cleaned up after each request. That makes possible to use fiber.storage as a full featured request-local storage. Fiber.storage may be created for a fiber no matter how the fiber itself was created - from C, from Lua. For example, a fiber could be created in C using fiber_new(), then it could insert into a space, which had Lua on_replace triggers, and one of the triggers could create fiber.storage. That storage will be deleted when the fiber is stopped. Another place where fiber.storage may be created - for replication applier fiber. Applier has a fiber from which it applies transactions from a remote instance. In case the applier fiber somehow creates a fiber.storage (for example, from a space trigger again), the storage won't be deleted until the applier fiber is stopped. --- src/box/iproto.cc | 44 ++++++++-- src/lib/core/fiber.c | 18 +++- src/lib/core/fiber.h | 14 +++- src/lua/fiber.c | 35 ++++++-- test/app/gh-4662-fiber-storage-leak.result | 88 ++++++++++++++++++++ test/app/gh-4662-fiber-storage-leak.test.lua | 43 ++++++++++ 6 files changed, 225 insertions(+), 17 deletions(-) create mode 100644 test/app/gh-4662-fiber-storage-leak.result create mode 100644 test/app/gh-4662-fiber-storage-leak.test.lua diff --git a/src/box/iproto.cc b/src/box/iproto.cc index 5e83ab0ea..0526f2305 100644 --- a/src/box/iproto.cc +++ b/src/box/iproto.cc @@ -1310,6 +1310,11 @@ static void tx_fiber_init(struct session *session, uint64_t sync) { struct fiber *f = fiber(); + /* + * There should not be any not executed on_stop triggers + * from a previous request executed in that fiber. + */ + assert(rlist_empty(&f->on_stop)); f->storage.net.sync = sync; /* * We do not cleanup fiber keys at the end of each request. @@ -1325,6 +1330,17 @@ tx_fiber_init(struct session *session, uint64_t sync) fiber_set_user(f, &session->credentials); } +/** + * Stop the current fiber after a request is executed to make it + * possible to reuse the fiber for a next request. On_stop + * triggers remove all request-specific data from there. + */ +static inline void +tx_fiber_on_stop() +{ + fiber_on_stop(fiber()); +} + static void tx_process_disconnect(struct cmsg *m) { @@ -1335,6 +1351,7 @@ tx_process_disconnect(struct cmsg *m) if (! rlist_empty(&session_on_disconnect)) { tx_fiber_init(con->session, 0); session_run_on_disconnect_triggers(con->session); + tx_fiber_on_stop(); } } } @@ -1490,6 +1507,7 @@ tx_reply_iproto_error(struct cmsg *m) iproto_reply_error(out, diag_last_error(&msg->diag), msg->header.sync, ::schema_version); iproto_wpos_create(&msg->wpos, out); + tx_fiber_on_stop(); } /** Inject a short delay on tx request processing for testing. */ @@ -1523,9 +1541,11 @@ tx_process1(struct cmsg *m) iproto_reply_select(out, &svp, msg->header.sync, ::schema_version, tuple != 0); iproto_wpos_create(&msg->wpos, out); - return; + goto end; error: tx_reply_error(msg); +end: + tx_fiber_on_stop(); } static void @@ -1566,9 +1586,11 @@ tx_process_select(struct cmsg *m) iproto_reply_select(out, &svp, msg->header.sync, ::schema_version, count); iproto_wpos_create(&msg->wpos, out); - return; + goto end; error: tx_reply_error(msg); +end: + tx_fiber_on_stop(); } static int @@ -1656,9 +1678,11 @@ tx_process_call(struct cmsg *m) iproto_reply_select(out, &svp, msg->header.sync, ::schema_version, count); iproto_wpos_create(&msg->wpos, out); - return; + goto end; error: tx_reply_error(msg); +end: + tx_fiber_on_stop(); } static void @@ -1699,9 +1723,11 @@ tx_process_misc(struct cmsg *m) } catch (Exception *e) { tx_reply_error(msg); } - return; + goto end; error: tx_reply_error(msg); +end: + tx_fiber_on_stop(); } static void @@ -1716,8 +1742,6 @@ tx_process_sql(struct cmsg *m) uint32_t len; bool is_unprepare = false; - tx_fiber_init(msg->connection->session, msg->header.sync); - if (tx_check_schema(msg->header.schema_version)) goto error; assert(msg->header.type == IPROTO_EXECUTE || @@ -1794,9 +1818,11 @@ tx_process_sql(struct cmsg *m) } iproto_reply_sql(out, &header_svp, msg->header.sync, schema_version); iproto_wpos_create(&msg->wpos, out); - return; + goto end; error: tx_reply_error(msg); +end: + tx_fiber_on_stop(); } static void @@ -1835,11 +1861,12 @@ tx_process_replication(struct cmsg *m) unreachable(); } } catch (SocketError *e) { - return; /* don't write error response to prevent SIGPIPE */ + /* don't write error response to prevent SIGPIPE */ } catch (Exception *e) { iproto_write_error(con->input.fd, e, ::schema_version, msg->header.sync); } + tx_fiber_on_stop(); } static void @@ -1945,6 +1972,7 @@ tx_process_connect(struct cmsg *m) tx_reply_error(msg); msg->close_connection = true; } + tx_fiber_on_stop(); } /** diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c index 00ae8cded..634b3d1b0 100644 --- a/src/lib/core/fiber.c +++ b/src/lib/core/fiber.c @@ -325,6 +325,19 @@ fiber_attr_getstacksize(struct fiber_attr *fiber_attr) fiber_attr_default.stack_size; } +void +fiber_on_stop(struct fiber *f) +{ + if (trigger_run(&f->on_stop, f) != 0) + panic("On_stop triggers can't fail"); + /* + * All on_stop triggers are supposed to remove themselves. + * So as no to waste time on that here, and to make them + * all work uniformly. + */ + assert(rlist_empty(&f->on_stop)); +} + static void fiber_recycle(struct fiber *fiber); @@ -856,8 +869,7 @@ fiber_loop(MAYBE_UNUSED void *data) assert(f != fiber); fiber_wakeup(f); } - if (! rlist_empty(&fiber->on_stop)) - trigger_run(&fiber->on_stop, fiber); + fiber_on_stop(fiber); /* reset pending wakeups */ rlist_del(&fiber->state); if (! (fiber->flags & FIBER_IS_JOINABLE)) @@ -1525,8 +1537,8 @@ cord_cojoin(struct cord *cord) int break_ev_loop_f(struct trigger *trigger, void *event) { - (void) trigger; (void) event; + trigger_clear(trigger); ev_break(loop(), EVBREAK_ALL); return 0; } diff --git a/src/lib/core/fiber.h b/src/lib/core/fiber.h index faaf2e0da..8a3e5aef5 100644 --- a/src/lib/core/fiber.h +++ b/src/lib/core/fiber.h @@ -457,7 +457,15 @@ struct fiber { /** Triggers invoked before this fiber yields. Must not throw. */ struct rlist on_yield; - /** Triggers invoked before this fiber stops. Must not throw. */ + /** + * Triggers invoked before this fiber is stopped/reset/ + * recycled/destroyed/reused. In other words, each time + * when the fiber has finished execution of a request. + * In particular, for fibers not from a fiber pool the + * stop event is emitted before destruction and death. + * Pooled fibers receive the stop event after each + * request, even if they are never destroyed. + */ struct rlist on_stop; /** * The list of fibers awaiting for this fiber's timely @@ -505,6 +513,10 @@ struct fiber { char name[FIBER_NAME_MAX + 1]; }; +/** Invoke on_stop triggers and delete them. */ +void +fiber_on_stop(struct fiber *f); + struct cord_on_exit; /** diff --git a/src/lua/fiber.c b/src/lua/fiber.c index a7b75f9bf..575a020d0 100644 --- a/src/lua/fiber.c +++ b/src/lua/fiber.c @@ -441,11 +441,6 @@ lua_fiber_run_f(MAYBE_UNUSED va_list ap) int coro_ref = lua_tointeger(L, -1); lua_pop(L, 1); result = luaT_call(L, lua_gettop(L) - 1, LUA_MULTRET); - - /* Destroy local storage */ - int storage_ref = f->storage.lua.ref; - if (storage_ref > 0) - luaL_unref(tarantool_L, LUA_REGISTRYINDEX, storage_ref); /* * If fiber is not joinable * We can unref child stack here, @@ -606,12 +601,42 @@ lbox_fiber_name(struct lua_State *L) } } +/** + * Trigger invoked when the fiber has stopped execution of its + * current request. Only purpose - delete storage.lua.ref keeping + * a reference of Lua fiber.storage object. Unlike Lua stack, + * Lua fiber storage may be created not only for fibers born from + * Lua land. For example, an IProto request may execute a Lua + * function, which can create the storage. Trigger guarantees, + * that even for non-Lua fibers the Lua storage is destroyed. + */ +static int +lbox_fiber_on_stop(struct trigger *trigger, void *event) +{ + struct fiber *f = (struct fiber *) event; + int storage_ref = f->storage.lua.ref; + assert(storage_ref > 0); + luaL_unref(tarantool_L, LUA_REGISTRYINDEX, storage_ref); + f->storage.lua.ref = LUA_NOREF; + trigger_clear(trigger); + free(trigger); + return 0; +} + static int lbox_fiber_storage(struct lua_State *L) { struct fiber *f = lbox_checkfiber(L, 1); int storage_ref = f->storage.lua.ref; if (storage_ref <= 0) { + struct trigger *t = (struct trigger *) + malloc(sizeof(*t)); + if (t == NULL) { + diag_set(OutOfMemory, sizeof(*t), "malloc", "t"); + return luaT_error(L); + } + trigger_create(t, lbox_fiber_on_stop, NULL, (trigger_f0) free); + trigger_add(&f->on_stop, t); lua_newtable(L); /* create local storage on demand */ storage_ref = luaL_ref(L, LUA_REGISTRYINDEX); f->storage.lua.ref = storage_ref; diff --git a/test/app/gh-4662-fiber-storage-leak.result b/test/app/gh-4662-fiber-storage-leak.result new file mode 100644 index 000000000..4ade017a4 --- /dev/null +++ b/test/app/gh-4662-fiber-storage-leak.result @@ -0,0 +1,88 @@ +-- test-run result file version 2 +fiber = require('fiber') + | --- + | ... +netbox = require('net.box') + | --- + | ... + +-- +-- gh-4662: fiber.storage was not deleted when created in a fiber +-- started from the thread pool used by IProto requests. The +-- problem was that fiber.storage was created and deleted in Lua +-- land only, assuming that only Lua-born fibers could have it. +-- But in fact any fiber can create a Lua storage. Including the +-- ones used to serve IProto requests. +-- The test checks if fiber.storage is really deleted, and is not +-- shared between requests. +-- + +box.schema.user.grant('guest', 'execute', 'universe') + | --- + | ... +storage = nil + | --- + | ... +i = 0 + | --- + | ... +weak_table = setmetatable({}, {__mode = 'v'}) + | --- + | ... +object = {'object'} + | --- + | ... +weak_table.object = object + | --- + | ... +function ref_object_in_fiber() \ + storage = fiber.self().storage \ + assert(next(storage) == nil) \ + i = i + 1 \ + fiber.self().storage.key = i \ + fiber.self().storage.object = object \ +end + | --- + | ... + +c = netbox.connect(box.cfg.listen) + | --- + | ... +c:call('ref_object_in_fiber') c:call('ref_object_in_fiber') + | --- + | ... +storage + | --- + | - key: 2 + | object: + | - object + | ... +i + | --- + | - 2 + | ... +object = nil + | --- + | ... +storage = nil + | --- + | ... +collectgarbage('collect') + | --- + | - 0 + | ... +-- The weak table should be empty, because the only two hard +-- references were in the fibers used to serve +-- ref_object_in_fiber() requests. And their storages should be +-- cleaned up. +weak_table + | --- + | - [] + | ... + +c:close() + | --- + | ... +box.schema.user.revoke('guest', 'execute', 'universe') + | --- + | ... diff --git a/test/app/gh-4662-fiber-storage-leak.test.lua b/test/app/gh-4662-fiber-storage-leak.test.lua new file mode 100644 index 000000000..25acf5679 --- /dev/null +++ b/test/app/gh-4662-fiber-storage-leak.test.lua @@ -0,0 +1,43 @@ +fiber = require('fiber') +netbox = require('net.box') + +-- +-- gh-4662: fiber.storage was not deleted when created in a fiber +-- started from the thread pool used by IProto requests. The +-- problem was that fiber.storage was created and deleted in Lua +-- land only, assuming that only Lua-born fibers could have it. +-- But in fact any fiber can create a Lua storage. Including the +-- ones used to serve IProto requests. +-- The test checks if fiber.storage is really deleted, and is not +-- shared between requests. +-- + +box.schema.user.grant('guest', 'execute', 'universe') +storage = nil +i = 0 +weak_table = setmetatable({}, {__mode = 'v'}) +object = {'object'} +weak_table.object = object +function ref_object_in_fiber() \ + storage = fiber.self().storage \ + assert(next(storage) == nil) \ + i = i + 1 \ + fiber.self().storage.key = i \ + fiber.self().storage.object = object \ +end + +c = netbox.connect(box.cfg.listen) +c:call('ref_object_in_fiber') c:call('ref_object_in_fiber') +storage +i +object = nil +storage = nil +collectgarbage('collect') +-- The weak table should be empty, because the only two hard +-- references were in the fibers used to serve +-- ref_object_in_fiber() requests. And their storages should be +-- cleaned up. +weak_table + +c:close() +box.schema.user.revoke('guest', 'execute', 'universe') -- 2.21.1 (Apple Git-122.3) ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 2/3] fiber: destroy fiber.storage created by iproto 2020-01-16 21:54 ` [Tarantool-patches] [PATCH v2 2/3] fiber: destroy fiber.storage created by iproto Vladislav Shpilevoy @ 2020-01-16 22:00 ` Cyrill Gorcunov 2020-01-17 7:47 ` Konstantin Osipov 2020-01-17 7:45 ` Konstantin Osipov 1 sibling, 1 reply; 21+ messages in thread From: Cyrill Gorcunov @ 2020-01-16 22:00 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches On Thu, Jan 16, 2020 at 10:54:22PM +0100, Vladislav Shpilevoy wrote: ... > static int > lbox_fiber_storage(struct lua_State *L) > { > struct fiber *f = lbox_checkfiber(L, 1); > int storage_ref = f->storage.lua.ref; > if (storage_ref <= 0) { > + struct trigger *t = (struct trigger *) > + malloc(sizeof(*t)); Vlad, maybe you could point me -- why do we need an explicit cast here? ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 2/3] fiber: destroy fiber.storage created by iproto 2020-01-16 22:00 ` Cyrill Gorcunov @ 2020-01-17 7:47 ` Konstantin Osipov 2020-01-17 8:06 ` Cyrill Gorcunov 0 siblings, 1 reply; 21+ messages in thread From: Konstantin Osipov @ 2020-01-17 7:47 UTC (permalink / raw) To: Cyrill Gorcunov; +Cc: tarantool-patches, Vladislav Shpilevoy * Cyrill Gorcunov <gorcunov@gmail.com> [20/01/17 01:02]: > On Thu, Jan 16, 2020 at 10:54:22PM +0100, Vladislav Shpilevoy wrote: > ... > > static int > > lbox_fiber_storage(struct lua_State *L) > > { > > struct fiber *f = lbox_checkfiber(L, 1); > > int storage_ref = f->storage.lua.ref; > > if (storage_ref <= 0) { > > + struct trigger *t = (struct trigger *) > > + malloc(sizeof(*t)); > > Vlad, maybe you could point me -- why do we need an explicit cast here? It's a C++ habit, where void * is not implicitly cast-able to any type. -- Konstantin Osipov, Moscow, Russia ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 2/3] fiber: destroy fiber.storage created by iproto 2020-01-17 7:47 ` Konstantin Osipov @ 2020-01-17 8:06 ` Cyrill Gorcunov 0 siblings, 0 replies; 21+ messages in thread From: Cyrill Gorcunov @ 2020-01-17 8:06 UTC (permalink / raw) To: Konstantin Osipov; +Cc: tarantool-patches, Vladislav Shpilevoy On Fri, Jan 17, 2020 at 10:47:56AM +0300, Konstantin Osipov wrote: > * Cyrill Gorcunov <gorcunov@gmail.com> [20/01/17 01:02]: > > On Thu, Jan 16, 2020 at 10:54:22PM +0100, Vladislav Shpilevoy wrote: > > ... > > > static int > > > lbox_fiber_storage(struct lua_State *L) > > > { > > > struct fiber *f = lbox_checkfiber(L, 1); > > > int storage_ref = f->storage.lua.ref; > > > if (storage_ref <= 0) { > > > + struct trigger *t = (struct trigger *) > > > + malloc(sizeof(*t)); > > > > Vlad, maybe you could point me -- why do we need an explicit cast here? > > It's a C++ habit, where void * is not implicitly cast-able to any > type. OK, thanks for info! ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 2/3] fiber: destroy fiber.storage created by iproto 2020-01-16 21:54 ` [Tarantool-patches] [PATCH v2 2/3] fiber: destroy fiber.storage created by iproto Vladislav Shpilevoy 2020-01-16 22:00 ` Cyrill Gorcunov @ 2020-01-17 7:45 ` Konstantin Osipov 2020-01-19 17:32 ` Vladislav Shpilevoy 1 sibling, 1 reply; 21+ messages in thread From: Konstantin Osipov @ 2020-01-17 7:45 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches * Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [20/01/17 00:57]: > Fiber.storage was not deleted when created in a fiber started from > the thread pool used by IProto requests. The problem was that > fiber.storage was created and deleted in Lua land only, assuming > that only Lua-born fibers could have it. But in fact any fiber can > create a Lua storage. Including the ones used to serve IProto > requests. > > Not deletion of the storage led to a possibility of meeting a > non-empty fiber.storage in the beginning of an iproto request, and > to not deletion of the memory caught by the storage until its > explicit nullification. > > Now the storage destructor works for any fiber, which managed to > create the storage. The destructor unrefs and nullifies the > storage. > > For destructor purposes the fiber.on_stop triggers were reworked. > Now they can be called multiple times during fiber's lifetime. > After every request done by that fiber. > > Closes #4662 > Closes #3462 I like this version of the patch much more than the previous one. It's way more clear. Instead of a generic name, you made a good comment for fiber::on_stop. > +/** > + * Stop the current fiber after a request is executed to make it > + * possible to reuse the fiber for a next request. On_stop > + * triggers remove all request-specific data from there. > + */ > +static inline void > +tx_fiber_on_stop() > +{ > + fiber_on_stop(fiber()); > +} > + > static void > tx_process_disconnect(struct cmsg *m) > { > @@ -1335,6 +1351,7 @@ tx_process_disconnect(struct cmsg *m) > if (! rlist_empty(&session_on_disconnect)) { > tx_fiber_init(con->session, 0); > session_run_on_disconnect_triggers(con->session); > + tx_fiber_on_stop(); > } > } Why did you have to add so many invocation points to fiber_on_stop() rather than simply adding fiber_on_stop invocation to fiber_pool.c? Maybe we discussed this, but it's been 3 weeks ago and I lost the context/rationale. > +void > +fiber_on_stop(struct fiber *f) > +{ > + if (trigger_run(&f->on_stop, f) != 0) > + panic("On_stop triggers can't fail"); > + /* > + * All on_stop triggers are supposed to remove themselves. > + * So as no to waste time on that here, and to make them > + * all work uniformly. > + */ Nice. > + assert(rlist_empty(&f->on_stop)); > +} > + > static void > fiber_recycle(struct fiber *fiber); > > @@ -856,8 +869,7 @@ fiber_loop(MAYBE_UNUSED void *data) > assert(f != fiber); > fiber_wakeup(f); > } > - if (! rlist_empty(&fiber->on_stop)) > - trigger_run(&fiber->on_stop, fiber); > + fiber_on_stop(fiber); This was an attempt to optimize a non-inline function call for the most common case. I would move this !rlist_empty check to fiber_on_stop and add a comment why we explicitly check for the list first. -- Konstantin Osipov, Moscow, Russia ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 2/3] fiber: destroy fiber.storage created by iproto 2020-01-17 7:45 ` Konstantin Osipov @ 2020-01-19 17:32 ` Vladislav Shpilevoy 2020-01-20 7:22 ` Konstantin Osipov 0 siblings, 1 reply; 21+ messages in thread From: Vladislav Shpilevoy @ 2020-01-19 17:32 UTC (permalink / raw) To: Konstantin Osipov, tarantool-patches Hi! Thanks for the review! >> +/** >> + * Stop the current fiber after a request is executed to make it >> + * possible to reuse the fiber for a next request. On_stop >> + * triggers remove all request-specific data from there. >> + */ >> +static inline void >> +tx_fiber_on_stop() >> +{ >> + fiber_on_stop(fiber()); >> +} >> + >> static void >> tx_process_disconnect(struct cmsg *m) >> { >> @@ -1335,6 +1351,7 @@ tx_process_disconnect(struct cmsg *m) >> if (! rlist_empty(&session_on_disconnect)) { >> tx_fiber_init(con->session, 0); >> session_run_on_disconnect_triggers(con->session); >> + tx_fiber_on_stop(); >> } >> } > > Why did you have to add so many invocation points to > fiber_on_stop() rather than simply adding fiber_on_stop invocation to > fiber_pool.c? We already discussed that in the previous patch version. We decided to move cleanup to iproto.cc, because it depends on when a request ends. Fiber pool knows nothing about requests. Iproto.cc is request processing layer, and this is the right place for request data destruction. >> static void >> fiber_recycle(struct fiber *fiber); >> >> @@ -856,8 +869,7 @@ fiber_loop(MAYBE_UNUSED void *data) >> assert(f != fiber); >> fiber_wakeup(f); >> } >> - if (! rlist_empty(&fiber->on_stop)) >> - trigger_run(&fiber->on_stop, fiber); >> + fiber_on_stop(fiber); > > This was an attempt to optimize a non-inline function call > for the most common case. > > I would move this !rlist_empty check to fiber_on_stop and add a > comment why we explicitly check for the list first. I doubt it really helps, but ok. ================================================================================ diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c index 634b3d1b0..354749549 100644 --- a/src/lib/core/fiber.c +++ b/src/lib/core/fiber.c @@ -328,6 +328,12 @@ fiber_attr_getstacksize(struct fiber_attr *fiber_attr) void fiber_on_stop(struct fiber *f) { + /* + * The most common case is when the list is empty. Do an + * inlined check before calling trigger_run(). + */ + if (rlist_empty(&f->on_stop)) + return; if (trigger_run(&f->on_stop, f) != 0) panic("On_stop triggers can't fail"); /* ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 2/3] fiber: destroy fiber.storage created by iproto 2020-01-19 17:32 ` Vladislav Shpilevoy @ 2020-01-20 7:22 ` Konstantin Osipov 2020-01-20 19:15 ` Georgy Kirichenko 2020-01-21 22:21 ` Vladislav Shpilevoy 0 siblings, 2 replies; 21+ messages in thread From: Konstantin Osipov @ 2020-01-20 7:22 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches * Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [20/01/20 09:59]: > > Why did you have to add so many invocation points to > > fiber_on_stop() rather than simply adding fiber_on_stop invocation to > > fiber_pool.c? > > We already discussed that in the previous patch version. We decided > to move cleanup to iproto.cc, because it depends on when a request > ends. Fiber pool knows nothing about requests. Iproto.cc is request > processing layer, and this is the right place for request data > destruction. True, but since you abstracted out the destruction via an opaque trigger, why not move the invocation of the trigger to fiber pool? fiber pool has most knowledge about fiber life cycle, so it seems natural to invoke the triggers in it - it will tie the *timing* to fiber pool, but not what's going on inside the trigger. Thoughts? > > I would move this !rlist_empty check to fiber_on_stop and add a > > comment why we explicitly check for the list first. > > I doubt it really helps, but ok. > > ================================================================================ > > diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c > index 634b3d1b0..354749549 100644 > --- a/src/lib/core/fiber.c > +++ b/src/lib/core/fiber.c > @@ -328,6 +328,12 @@ fiber_attr_getstacksize(struct fiber_attr *fiber_attr) > void > fiber_on_stop(struct fiber *f) > { > + /* > + * The most common case is when the list is empty. Do an > + * inlined check before calling trigger_run(). > + */ > + if (rlist_empty(&f->on_stop)) > + return; > if (trigger_run(&f->on_stop, f) != 0) > panic("On_stop triggers can't fail"); > /* thanks! -- Konstantin Osipov, Moscow, Russia ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 2/3] fiber: destroy fiber.storage created by iproto 2020-01-20 7:22 ` Konstantin Osipov @ 2020-01-20 19:15 ` Georgy Kirichenko 2020-01-21 22:21 ` Vladislav Shpilevoy 1 sibling, 0 replies; 21+ messages in thread From: Georgy Kirichenko @ 2020-01-20 19:15 UTC (permalink / raw) To: Konstantin Osipov, Vladislav Shpilevoy, tarantool-patches [-- Attachment #1: Type: text/plain, Size: 2083 bytes --] On Monday, 20 January 2020 10:22:34 MSK Konstantin Osipov wrote: > * Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [20/01/20 09:59]: > > > Why did you have to add so many invocation points to > > > fiber_on_stop() rather than simply adding fiber_on_stop invocation to > > > fiber_pool.c? > > > > We already discussed that in the previous patch version. We decided > > to move cleanup to iproto.cc, because it depends on when a request > > ends. Fiber pool knows nothing about requests. Iproto.cc is request > > processing layer, and this is the right place for request data > > destruction. > > True, but since you abstracted out the destruction via an opaque > trigger, why not move the invocation of the trigger to fiber pool? > fiber pool has most knowledge about fiber life cycle, so it seems > natural to invoke the triggers in it - it will tie the *timing* to > fiber pool, but not what's going on inside the trigger. > > Thoughts? I agree with Kostja's comment, if a fiber pool is only an optimization then there should not be any visible difference between code invocation inside a standalone fiber and a fiber pool member and this point includes fiber pool clearance. > > > > I would move this !rlist_empty check to fiber_on_stop and add a > > > comment why we explicitly check for the list first. > > > > I doubt it really helps, but ok. > > > > ========================================================================== > > ====== > > > > diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c > > index 634b3d1b0..354749549 100644 > > --- a/src/lib/core/fiber.c > > +++ b/src/lib/core/fiber.c > > @@ -328,6 +328,12 @@ fiber_attr_getstacksize(struct fiber_attr > > *fiber_attr) > > > > void > > fiber_on_stop(struct fiber *f) > > { > > > > + /* > > + * The most common case is when the list is empty. Do an > > + * inlined check before calling trigger_run(). > > + */ > > + if (rlist_empty(&f->on_stop)) > > + return; > > > > if (trigger_run(&f->on_stop, f) != 0) > > > > panic("On_stop triggers can't fail"); > > > > /* > > thanks! [-- Attachment #2: This is a digitally signed message part. --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 2/3] fiber: destroy fiber.storage created by iproto 2020-01-20 7:22 ` Konstantin Osipov 2020-01-20 19:15 ` Georgy Kirichenko @ 2020-01-21 22:21 ` Vladislav Shpilevoy 2020-01-21 22:32 ` Konstantin Osipov 1 sibling, 1 reply; 21+ messages in thread From: Vladislav Shpilevoy @ 2020-01-21 22:21 UTC (permalink / raw) To: Konstantin Osipov, tarantool-patches On 20/01/2020 08:22, Konstantin Osipov wrote: > * Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [20/01/20 09:59]: >>> Why did you have to add so many invocation points to >>> fiber_on_stop() rather than simply adding fiber_on_stop invocation to >>> fiber_pool.c? >> >> We already discussed that in the previous patch version. We decided >> to move cleanup to iproto.cc, because it depends on when a request >> ends. Fiber pool knows nothing about requests. Iproto.cc is request >> processing layer, and this is the right place for request data >> destruction. > > True, but since you abstracted out the destruction via an opaque > trigger, why not move the invocation of the trigger to fiber pool? > fiber pool has most knowledge about fiber life cycle, so it seems > natural to invoke the triggers in it - it will tie the *timing* to > fiber pool, but not what's going on inside the trigger. > > Thoughts? And we came to the exactly the same patch I had in the first version, with rolled back on_stop -> on_cleanup rename. Motivation for calling on_stop in iproto was that only iproto is interested in clearing the fiber's state after each request, and fiber pool should not decide when a request ends. From what I understood. But I am ok with both solutions. Anyway, here is the new patch. ================================================================================ commit 422068e5ea4aaf4e6f4b4dc56f252f3b5971aec2 Author: Vladislav Shpilevoy <v.shpilevoy@tarantool.org> Date: Thu Jan 16 21:47:18 2020 +0100 fiber: destroy fiber.storage created by iproto Fiber.storage was not deleted when created in a fiber started from the thread pool used by IProto requests. The problem was that fiber.storage was created and deleted in Lua land only, assuming that only Lua-born fibers could have it. But in fact any fiber can create a Lua storage. Including the ones used to serve IProto requests. Not deletion of the storage led to a possibility of meeting a non-empty fiber.storage in the beginning of an iproto request, and to not deletion of the memory caught by the storage until its explicit nullification. Now the storage destructor works for any fiber, which managed to create the storage. The destructor unrefs and nullifies the storage. For destructor purposes the fiber.on_stop triggers were reworked. Now they can be called multiple times during fiber's lifetime. After every request done by that fiber. Closes #4662 Closes #3462 @TarantoolBot document Title: Clarify fiber.storage lifetime Fiber.storage is a Lua table created when it is first accessed. On the site it is said that it is deleted when fiber is canceled via fiber:cancel(). But it is not the full truth. Fiber.storage is destroyed when the fiber is finished. Regardless of how is it finished - via :cancel(), or the fiber's function did 'return', it does not matter. Moreover, from that moment the storage is cleaned up even for pooled fibers used to serve IProto requests. Pooled fibers never really die, but nonetheless their storage is cleaned up after each request. That makes possible to use fiber.storage as a full featured request-local storage. Fiber.storage may be created for a fiber no matter how the fiber itself was created - from C, from Lua. For example, a fiber could be created in C using fiber_new(), then it could insert into a space, which had Lua on_replace triggers, and one of the triggers could create fiber.storage. That storage will be deleted when the fiber is stopped. Another place where fiber.storage may be created - for replication applier fiber. Applier has a fiber from which it applies transactions from a remote instance. In case the applier fiber somehow creates a fiber.storage (for example, from a space trigger again), the storage won't be deleted until the applier fiber is stopped. diff --git a/src/box/iproto.cc b/src/box/iproto.cc index 5e83ab0ea..f313af6ae 100644 --- a/src/box/iproto.cc +++ b/src/box/iproto.cc @@ -1310,6 +1310,11 @@ static void tx_fiber_init(struct session *session, uint64_t sync) { struct fiber *f = fiber(); + /* + * There should not be any not executed on_stop triggers + * from a previous request executed in that fiber. + */ + assert(rlist_empty(&f->on_stop)); f->storage.net.sync = sync; /* * We do not cleanup fiber keys at the end of each request. @@ -1716,8 +1721,6 @@ tx_process_sql(struct cmsg *m) uint32_t len; bool is_unprepare = false; - tx_fiber_init(msg->connection->session, msg->header.sync); - if (tx_check_schema(msg->header.schema_version)) goto error; assert(msg->header.type == IPROTO_EXECUTE || diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c index 00ae8cded..354749549 100644 --- a/src/lib/core/fiber.c +++ b/src/lib/core/fiber.c @@ -325,6 +325,25 @@ fiber_attr_getstacksize(struct fiber_attr *fiber_attr) fiber_attr_default.stack_size; } +void +fiber_on_stop(struct fiber *f) +{ + /* + * The most common case is when the list is empty. Do an + * inlined check before calling trigger_run(). + */ + if (rlist_empty(&f->on_stop)) + return; + if (trigger_run(&f->on_stop, f) != 0) + panic("On_stop triggers can't fail"); + /* + * All on_stop triggers are supposed to remove themselves. + * So as no to waste time on that here, and to make them + * all work uniformly. + */ + assert(rlist_empty(&f->on_stop)); +} + static void fiber_recycle(struct fiber *fiber); @@ -856,8 +875,7 @@ fiber_loop(MAYBE_UNUSED void *data) assert(f != fiber); fiber_wakeup(f); } - if (! rlist_empty(&fiber->on_stop)) - trigger_run(&fiber->on_stop, fiber); + fiber_on_stop(fiber); /* reset pending wakeups */ rlist_del(&fiber->state); if (! (fiber->flags & FIBER_IS_JOINABLE)) @@ -1525,8 +1543,8 @@ cord_cojoin(struct cord *cord) int break_ev_loop_f(struct trigger *trigger, void *event) { - (void) trigger; (void) event; + trigger_clear(trigger); ev_break(loop(), EVBREAK_ALL); return 0; } diff --git a/src/lib/core/fiber.h b/src/lib/core/fiber.h index faaf2e0da..8a3e5aef5 100644 --- a/src/lib/core/fiber.h +++ b/src/lib/core/fiber.h @@ -457,7 +457,15 @@ struct fiber { /** Triggers invoked before this fiber yields. Must not throw. */ struct rlist on_yield; - /** Triggers invoked before this fiber stops. Must not throw. */ + /** + * Triggers invoked before this fiber is stopped/reset/ + * recycled/destroyed/reused. In other words, each time + * when the fiber has finished execution of a request. + * In particular, for fibers not from a fiber pool the + * stop event is emitted before destruction and death. + * Pooled fibers receive the stop event after each + * request, even if they are never destroyed. + */ struct rlist on_stop; /** * The list of fibers awaiting for this fiber's timely @@ -505,6 +513,10 @@ struct fiber { char name[FIBER_NAME_MAX + 1]; }; +/** Invoke on_stop triggers and delete them. */ +void +fiber_on_stop(struct fiber *f); + struct cord_on_exit; /** diff --git a/src/lib/core/fiber_pool.c b/src/lib/core/fiber_pool.c index 77f89c9fa..d40c9bcc3 100644 --- a/src/lib/core/fiber_pool.c +++ b/src/lib/core/fiber_pool.c @@ -62,6 +62,16 @@ restart: assert(f->caller->caller == &cord->sched); } cmsg_deliver(msg); + /* + * Normally fibers die after their function + * returns, and they call on_stop() triggers. The + * pool optimization interferes into that logic + * and a fiber doesn't die after its function + * returns. But on_stop triggers still should be + * called so that the pool wouldn't affect fiber's + * visible lifecycle. + */ + fiber_on_stop(f); } /** Put the current fiber into a fiber cache. */ if (!fiber_is_cancelled(fiber()) && (msg != NULL || diff --git a/src/lua/fiber.c b/src/lua/fiber.c index a7b75f9bf..575a020d0 100644 --- a/src/lua/fiber.c +++ b/src/lua/fiber.c @@ -441,11 +441,6 @@ lua_fiber_run_f(MAYBE_UNUSED va_list ap) int coro_ref = lua_tointeger(L, -1); lua_pop(L, 1); result = luaT_call(L, lua_gettop(L) - 1, LUA_MULTRET); - - /* Destroy local storage */ - int storage_ref = f->storage.lua.ref; - if (storage_ref > 0) - luaL_unref(tarantool_L, LUA_REGISTRYINDEX, storage_ref); /* * If fiber is not joinable * We can unref child stack here, @@ -606,12 +601,42 @@ lbox_fiber_name(struct lua_State *L) } } +/** + * Trigger invoked when the fiber has stopped execution of its + * current request. Only purpose - delete storage.lua.ref keeping + * a reference of Lua fiber.storage object. Unlike Lua stack, + * Lua fiber storage may be created not only for fibers born from + * Lua land. For example, an IProto request may execute a Lua + * function, which can create the storage. Trigger guarantees, + * that even for non-Lua fibers the Lua storage is destroyed. + */ +static int +lbox_fiber_on_stop(struct trigger *trigger, void *event) +{ + struct fiber *f = (struct fiber *) event; + int storage_ref = f->storage.lua.ref; + assert(storage_ref > 0); + luaL_unref(tarantool_L, LUA_REGISTRYINDEX, storage_ref); + f->storage.lua.ref = LUA_NOREF; + trigger_clear(trigger); + free(trigger); + return 0; +} + static int lbox_fiber_storage(struct lua_State *L) { struct fiber *f = lbox_checkfiber(L, 1); int storage_ref = f->storage.lua.ref; if (storage_ref <= 0) { + struct trigger *t = (struct trigger *) + malloc(sizeof(*t)); + if (t == NULL) { + diag_set(OutOfMemory, sizeof(*t), "malloc", "t"); + return luaT_error(L); + } + trigger_create(t, lbox_fiber_on_stop, NULL, (trigger_f0) free); + trigger_add(&f->on_stop, t); lua_newtable(L); /* create local storage on demand */ storage_ref = luaL_ref(L, LUA_REGISTRYINDEX); f->storage.lua.ref = storage_ref; diff --git a/test/app/gh-4662-fiber-storage-leak.result b/test/app/gh-4662-fiber-storage-leak.result new file mode 100644 index 000000000..4ade017a4 --- /dev/null +++ b/test/app/gh-4662-fiber-storage-leak.result @@ -0,0 +1,88 @@ +-- test-run result file version 2 +fiber = require('fiber') + | --- + | ... +netbox = require('net.box') + | --- + | ... + +-- +-- gh-4662: fiber.storage was not deleted when created in a fiber +-- started from the thread pool used by IProto requests. The +-- problem was that fiber.storage was created and deleted in Lua +-- land only, assuming that only Lua-born fibers could have it. +-- But in fact any fiber can create a Lua storage. Including the +-- ones used to serve IProto requests. +-- The test checks if fiber.storage is really deleted, and is not +-- shared between requests. +-- + +box.schema.user.grant('guest', 'execute', 'universe') + | --- + | ... +storage = nil + | --- + | ... +i = 0 + | --- + | ... +weak_table = setmetatable({}, {__mode = 'v'}) + | --- + | ... +object = {'object'} + | --- + | ... +weak_table.object = object + | --- + | ... +function ref_object_in_fiber() \ + storage = fiber.self().storage \ + assert(next(storage) == nil) \ + i = i + 1 \ + fiber.self().storage.key = i \ + fiber.self().storage.object = object \ +end + | --- + | ... + +c = netbox.connect(box.cfg.listen) + | --- + | ... +c:call('ref_object_in_fiber') c:call('ref_object_in_fiber') + | --- + | ... +storage + | --- + | - key: 2 + | object: + | - object + | ... +i + | --- + | - 2 + | ... +object = nil + | --- + | ... +storage = nil + | --- + | ... +collectgarbage('collect') + | --- + | - 0 + | ... +-- The weak table should be empty, because the only two hard +-- references were in the fibers used to serve +-- ref_object_in_fiber() requests. And their storages should be +-- cleaned up. +weak_table + | --- + | - [] + | ... + +c:close() + | --- + | ... +box.schema.user.revoke('guest', 'execute', 'universe') + | --- + | ... diff --git a/test/app/gh-4662-fiber-storage-leak.test.lua b/test/app/gh-4662-fiber-storage-leak.test.lua new file mode 100644 index 000000000..25acf5679 --- /dev/null +++ b/test/app/gh-4662-fiber-storage-leak.test.lua @@ -0,0 +1,43 @@ +fiber = require('fiber') +netbox = require('net.box') + +-- +-- gh-4662: fiber.storage was not deleted when created in a fiber +-- started from the thread pool used by IProto requests. The +-- problem was that fiber.storage was created and deleted in Lua +-- land only, assuming that only Lua-born fibers could have it. +-- But in fact any fiber can create a Lua storage. Including the +-- ones used to serve IProto requests. +-- The test checks if fiber.storage is really deleted, and is not +-- shared between requests. +-- + +box.schema.user.grant('guest', 'execute', 'universe') +storage = nil +i = 0 +weak_table = setmetatable({}, {__mode = 'v'}) +object = {'object'} +weak_table.object = object +function ref_object_in_fiber() \ + storage = fiber.self().storage \ + assert(next(storage) == nil) \ + i = i + 1 \ + fiber.self().storage.key = i \ + fiber.self().storage.object = object \ +end + +c = netbox.connect(box.cfg.listen) +c:call('ref_object_in_fiber') c:call('ref_object_in_fiber') +storage +i +object = nil +storage = nil +collectgarbage('collect') +-- The weak table should be empty, because the only two hard +-- references were in the fibers used to serve +-- ref_object_in_fiber() requests. And their storages should be +-- cleaned up. +weak_table + +c:close() +box.schema.user.revoke('guest', 'execute', 'universe') ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 2/3] fiber: destroy fiber.storage created by iproto 2020-01-21 22:21 ` Vladislav Shpilevoy @ 2020-01-21 22:32 ` Konstantin Osipov 0 siblings, 0 replies; 21+ messages in thread From: Konstantin Osipov @ 2020-01-21 22:32 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches * Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [20/01/22 01:25]: > On 20/01/2020 08:22, Konstantin Osipov wrote: > > * Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [20/01/20 09:59]: > >>> Why did you have to add so many invocation points to > >>> fiber_on_stop() rather than simply adding fiber_on_stop invocation to > >>> fiber_pool.c? > >> > >> We already discussed that in the previous patch version. We decided > >> to move cleanup to iproto.cc, because it depends on when a request > >> ends. Fiber pool knows nothing about requests. Iproto.cc is request > >> processing layer, and this is the right place for request data > >> destruction. > > > > True, but since you abstracted out the destruction via an opaque > > trigger, why not move the invocation of the trigger to fiber pool? > > fiber pool has most knowledge about fiber life cycle, so it seems > > natural to invoke the triggers in it - it will tie the *timing* to > > fiber pool, but not what's going on inside the trigger. > > > > Thoughts? > > And we came to the exactly the same patch I had in the first > version, with rolled back on_stop -> on_cleanup rename. > > Motivation for calling on_stop in iproto was that only iproto > is interested in clearing the fiber's state after each request, > and fiber pool should not decide when a request ends. From what > I understood. But I am ok with both solutions. > > Anyway, here is the new patch. lgtm. -- Konstantin Osipov, Moscow, Russia ^ permalink raw reply [flat|nested] 21+ messages in thread
* [Tarantool-patches] [PATCH v2 3/3] box: remove dead code from box_process_call/eval() 2020-01-16 21:54 [Tarantool-patches] [PATCH v2 0/3] Fiber storage leak Vladislav Shpilevoy 2020-01-16 21:54 ` [Tarantool-patches] [PATCH v2 1/3] fiber: unref fiber.storage via global Lua state Vladislav Shpilevoy 2020-01-16 21:54 ` [Tarantool-patches] [PATCH v2 2/3] fiber: destroy fiber.storage created by iproto Vladislav Shpilevoy @ 2020-01-16 21:54 ` Vladislav Shpilevoy 2020-01-17 7:46 ` Konstantin Osipov ` (2 more replies) 2020-01-18 19:27 ` [Tarantool-patches] [PATCH v2 0/3] Fiber storage leak Igor Munkin 2020-02-15 1:02 ` Vladislav Shpilevoy 4 siblings, 3 replies; 21+ messages in thread From: Vladislav Shpilevoy @ 2020-01-16 21:54 UTC (permalink / raw) To: tarantool-patches, kostja.osipov box_process_call/eval() in the end check if there is an active transaction. If there is, it is rolled back, and an error is set. But rollback is not needed anymore, because anyway in the end of the request the fiber is stopped, and its not finished transaction is rolled back. Just setting of the error is enough. Follow-up #4662 --- src/box/call.c | 28 ++++------------------------ src/box/txn.c | 1 + 2 files changed, 5 insertions(+), 24 deletions(-) diff --git a/src/box/call.c b/src/box/call.c index bcaa453ea..a46a61c3c 100644 --- a/src/box/call.c +++ b/src/box/call.c @@ -122,23 +122,13 @@ box_process_call(struct call_request *request, struct port *port) SC_FUNCTION, tt_cstr(name, name_len))) == 0) { rc = box_lua_call(name, name_len, &args, port); } - - struct txn *txn = in_txn(); - if (rc != 0) { - if (txn != NULL) - txn_rollback(txn); - fiber_gc(); + if (rc != 0) return -1; - } - - if (txn != NULL) { + if (in_txn() != NULL) { diag_set(ClientError, ER_FUNCTION_TX_ACTIVE); port_destroy(port); - txn_rollback(txn); - fiber_gc(); return -1; } - return 0; } @@ -154,21 +144,11 @@ box_process_eval(struct call_request *request, struct port *port) request->args_end - request->args); const char *expr = request->expr; uint32_t expr_len = mp_decode_strl(&expr); - struct txn *txn; - if (box_lua_eval(expr, expr_len, &args, port) != 0) { - txn = in_txn(); - if (txn != NULL) - txn_rollback(txn); - fiber_gc(); + if (box_lua_eval(expr, expr_len, &args, port) != 0) return -1; - } - - txn = in_txn(); - if (txn != NULL) { + if (in_txn() != 0) { diag_set(ClientError, ER_FUNCTION_TX_ACTIVE); port_destroy(port); - txn_rollback(txn); - fiber_gc(); return -1; } return 0; diff --git a/src/box/txn.c b/src/box/txn.c index 963ec8eeb..0854bbacc 100644 --- a/src/box/txn.c +++ b/src/box/txn.c @@ -845,6 +845,7 @@ txn_on_stop(struct trigger *trigger, void *event) (void) trigger; (void) event; txn_rollback(in_txn()); /* doesn't yield or fail */ + fiber_gc(); return 0; } -- 2.21.1 (Apple Git-122.3) ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 3/3] box: remove dead code from box_process_call/eval() 2020-01-16 21:54 ` [Tarantool-patches] [PATCH v2 3/3] box: remove dead code from box_process_call/eval() Vladislav Shpilevoy @ 2020-01-17 7:46 ` Konstantin Osipov 2020-01-17 7:47 ` Konstantin Osipov 2020-01-17 17:41 ` Georgy Kirichenko 2 siblings, 0 replies; 21+ messages in thread From: Konstantin Osipov @ 2020-01-17 7:46 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches * Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [20/01/17 00:57]: > box_process_call/eval() in the end check if there is an > active transaction. If there is, it is rolled back, and > an error is set. > > But rollback is not needed anymore, because anyway in > the end of the request the fiber is stopped, and its > not finished transaction is rolled back. Just setting > of the error is enough. Nice! -- Konstantin Osipov, Moscow, Russia ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 3/3] box: remove dead code from box_process_call/eval() 2020-01-16 21:54 ` [Tarantool-patches] [PATCH v2 3/3] box: remove dead code from box_process_call/eval() Vladislav Shpilevoy 2020-01-17 7:46 ` Konstantin Osipov @ 2020-01-17 7:47 ` Konstantin Osipov 2020-01-17 17:41 ` Georgy Kirichenko 2 siblings, 0 replies; 21+ messages in thread From: Konstantin Osipov @ 2020-01-17 7:47 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches * Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [20/01/17 00:57]: > box_process_call/eval() in the end check if there is an > active transaction. If there is, it is rolled back, and > an error is set. > > But rollback is not needed anymore, because anyway in > the end of the request the fiber is stopped, and its > not finished transaction is rolled back. Just setting > of the error is enough. (and lgtm) This patch imho is necessary since it "seals" correctness of the previous patch - if the previous patch is broken, after this patch is applied, we're going to get assertion failures in txn_init(). Very nice. -- Konstantin Osipov, Moscow, Russia ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 3/3] box: remove dead code from box_process_call/eval() 2020-01-16 21:54 ` [Tarantool-patches] [PATCH v2 3/3] box: remove dead code from box_process_call/eval() Vladislav Shpilevoy 2020-01-17 7:46 ` Konstantin Osipov 2020-01-17 7:47 ` Konstantin Osipov @ 2020-01-17 17:41 ` Georgy Kirichenko 2020-01-19 17:32 ` Vladislav Shpilevoy 2 siblings, 1 reply; 21+ messages in thread From: Georgy Kirichenko @ 2020-01-17 17:41 UTC (permalink / raw) To: tarantool-patches, kostja.osipov; +Cc: Vladislav Shpilevoy [-- Attachment #1: Type: text/plain, Size: 2690 bytes --] On Friday, 17 January 2020 00:54:23 MSK Vladislav Shpilevoy wrote: > box_process_call/eval() in the end check if there is an > active transaction. If there is, it is rolled back, and > an error is set. > > But rollback is not needed anymore, because anyway in > the end of the request the fiber is stopped, and its > not finished transaction is rolled back. Just setting > of the error is enough. Hi! Thanks for the patch, but I do not think that to remove an explicit rollback is a a good idea because of broken encapsulation - call and eval handlers should not rely on its execution context - a simple fiber, iproto fiber pool member or whatever else. Also I would like to mention that box_call and box_eval are members of the public api and it is not necessary that user will stop a fiber. > > Follow-up #4662 > --- > src/box/call.c | 28 ++++------------------------ > src/box/txn.c | 1 + > 2 files changed, 5 insertions(+), 24 deletions(-) > > diff --git a/src/box/call.c b/src/box/call.c > index bcaa453ea..a46a61c3c 100644 > --- a/src/box/call.c > +++ b/src/box/call.c > @@ -122,23 +122,13 @@ box_process_call(struct call_request *request, struct > port *port) SC_FUNCTION, tt_cstr(name, name_len))) == 0) { > rc = box_lua_call(name, name_len, &args, port); > } > - > - struct txn *txn = in_txn(); > - if (rc != 0) { > - if (txn != NULL) > - txn_rollback(txn); > - fiber_gc(); > + if (rc != 0) > return -1; > - } > - > - if (txn != NULL) { > + if (in_txn() != NULL) { > diag_set(ClientError, ER_FUNCTION_TX_ACTIVE); > port_destroy(port); > - txn_rollback(txn); > - fiber_gc(); > return -1; > } > - > return 0; > } > > @@ -154,21 +144,11 @@ box_process_eval(struct call_request *request, struct > port *port) request->args_end - request->args); > const char *expr = request->expr; > uint32_t expr_len = mp_decode_strl(&expr); > - struct txn *txn; > - if (box_lua_eval(expr, expr_len, &args, port) != 0) { > - txn = in_txn(); > - if (txn != NULL) > - txn_rollback(txn); > - fiber_gc(); > + if (box_lua_eval(expr, expr_len, &args, port) != 0) > return -1; > - } > - > - txn = in_txn(); > - if (txn != NULL) { > + if (in_txn() != 0) { > diag_set(ClientError, ER_FUNCTION_TX_ACTIVE); > port_destroy(port); > - txn_rollback(txn); > - fiber_gc(); > return -1; > } > return 0; > diff --git a/src/box/txn.c b/src/box/txn.c > index 963ec8eeb..0854bbacc 100644 > --- a/src/box/txn.c > +++ b/src/box/txn.c > @@ -845,6 +845,7 @@ txn_on_stop(struct trigger *trigger, void *event) > (void) trigger; > (void) event; > txn_rollback(in_txn()); /* doesn't yield or fail */ > + fiber_gc(); > return 0; > } [-- Attachment #2: This is a digitally signed message part. --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 3/3] box: remove dead code from box_process_call/eval() 2020-01-17 17:41 ` Georgy Kirichenko @ 2020-01-19 17:32 ` Vladislav Shpilevoy 2020-01-20 19:21 ` Georgy Kirichenko 0 siblings, 1 reply; 21+ messages in thread From: Vladislav Shpilevoy @ 2020-01-19 17:32 UTC (permalink / raw) To: Georgy Kirichenko, tarantool-patches, kostja.osipov Hi! Thanks for the review! On 17/01/2020 18:41, Georgy Kirichenko wrote: > On Friday, 17 January 2020 00:54:23 MSK Vladislav Shpilevoy wrote: >> box_process_call/eval() in the end check if there is an >> active transaction. If there is, it is rolled back, and >> an error is set. >> >> But rollback is not needed anymore, because anyway in >> the end of the request the fiber is stopped, and its >> not finished transaction is rolled back. Just setting >> of the error is enough. > > Hi! > Thanks for the patch, but I do not think that to remove an explicit rollback > is a a good idea because of broken encapsulation - call and eval handlers > should not rely on its execution context - a simple fiber, iproto fiber pool > member or whatever else. Also I would like to mention that box_call and > box_eval are members of the public api and it is not necessary that user will > stop a fiber. Functions box_call and box_eval don't exist. So I don't understand what you are talking about. Assume you talked about box_process_call/eval. In that case you are wrong - they are not a part of the public API. They are always called from iproto.cc only. Besides, we will need to remove ER_FUNCTION_TX_ACTIVE and all the other txn stuff from them anyway after interactive transactions are ready. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 3/3] box: remove dead code from box_process_call/eval() 2020-01-19 17:32 ` Vladislav Shpilevoy @ 2020-01-20 19:21 ` Georgy Kirichenko 0 siblings, 0 replies; 21+ messages in thread From: Georgy Kirichenko @ 2020-01-20 19:21 UTC (permalink / raw) To: tarantool-patches, kostja.osipov, Vladislav Shpilevoy [-- Attachment #1: Type: text/plain, Size: 1735 bytes --] On Sunday, 19 January 2020 20:32:45 MSK Vladislav Shpilevoy wrote: > Hi! Thanks for the review! > > On 17/01/2020 18:41, Georgy Kirichenko wrote: > > On Friday, 17 January 2020 00:54:23 MSK Vladislav Shpilevoy wrote: > >> box_process_call/eval() in the end check if there is an > >> active transaction. If there is, it is rolled back, and > >> an error is set. > >> > >> But rollback is not needed anymore, because anyway in > >> the end of the request the fiber is stopped, and its > >> not finished transaction is rolled back. Just setting > >> of the error is enough. > > > > Hi! > > > > Thanks for the patch, but I do not think that to remove an explicit > > rollback> > > is a a good idea because of broken encapsulation - call and eval handlers > > should not rely on its execution context - a simple fiber, iproto fiber > > pool member or whatever else. Also I would like to mention that box_call > > and box_eval are members of the public api and it is not necessary that > > user will stop a fiber. > > Functions box_call and box_eval don't exist. So I don't understand what you > are talking about. > > Assume you talked about box_process_call/eval. In that case you are wrong - > they are not a part of the public API. They are always called from iproto.cc > only. Sorry, you are right, I though we already export them. > Besides, we will need to remove ER_FUNCTION_TX_ACTIVE and all the > other txn stuff from them anyway after interactive transactions are ready. This error should be removed only when we are not in context of a sequence stream of operations. This is not implemented yet so I think we should not change this behavior right now, because I pretty sure we should preserve backward compatibility. [-- Attachment #2: This is a digitally signed message part. --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 0/3] Fiber storage leak 2020-01-16 21:54 [Tarantool-patches] [PATCH v2 0/3] Fiber storage leak Vladislav Shpilevoy ` (2 preceding siblings ...) 2020-01-16 21:54 ` [Tarantool-patches] [PATCH v2 3/3] box: remove dead code from box_process_call/eval() Vladislav Shpilevoy @ 2020-01-18 19:27 ` Igor Munkin 2020-02-15 1:02 ` Vladislav Shpilevoy 4 siblings, 0 replies; 21+ messages in thread From: Igor Munkin @ 2020-01-18 19:27 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches Vlad, Thanks, the patch is neat! I've looked on the new series since you added one more patch to it. LGTM for the whole patchset. On 16.01.20, Vladislav Shpilevoy wrote: > The patchset makes fiber.storage be always deleted. Regardless of > where was it created - in a fiber born in the Lua land, or in a > fiber serving IProto requests and accessing Lua. > > That removes > > - a leak occurred each time when fiber storage was created for > an IProto request; > > - a possibility to see previous request's artifacts left in > fiber.storage; > > And makes possible to use fiber.storage as request-local data. > > Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-4662-fiber-storage-leak > Issue: https://github.com/tarantool/tarantool/issues/4662 > > Changes in v2: > - Renamed back fiber.on_cleanup -> fiber.on_stop; > - Removed dead code from box_process_call/eval(), which is now > handled by iproto. > > Vladislav Shpilevoy (3): > fiber: unref fiber.storage via global Lua state > fiber: destroy fiber.storage created by iproto > box: remove dead code from box_process_call/eval() > > src/box/call.c | 28 +------ > src/box/iproto.cc | 44 ++++++++-- > src/box/txn.c | 1 + > src/lib/core/fiber.c | 18 +++- > src/lib/core/fiber.h | 14 +++- > src/lua/fiber.c | 35 ++++++-- > test/app/gh-4662-fiber-storage-leak.result | 88 ++++++++++++++++++++ > test/app/gh-4662-fiber-storage-leak.test.lua | 43 ++++++++++ > 8 files changed, 230 insertions(+), 41 deletions(-) > create mode 100644 test/app/gh-4662-fiber-storage-leak.result > create mode 100644 test/app/gh-4662-fiber-storage-leak.test.lua > > -- > 2.21.1 (Apple Git-122.3) > -- Best regards, IM ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 0/3] Fiber storage leak 2020-01-16 21:54 [Tarantool-patches] [PATCH v2 0/3] Fiber storage leak Vladislav Shpilevoy ` (3 preceding siblings ...) 2020-01-18 19:27 ` [Tarantool-patches] [PATCH v2 0/3] Fiber storage leak Igor Munkin @ 2020-02-15 1:02 ` Vladislav Shpilevoy 4 siblings, 0 replies; 21+ messages in thread From: Vladislav Shpilevoy @ 2020-02-15 1:02 UTC (permalink / raw) To: tarantool-patches, kostja.osipov Pushed to the master, 2.3, 2.2, 1.10. In 1.10 I didn't port the last commit, because in 1.10 fiber.on_stop triggers for transactions are installed differently. @ChangeLog - fiber.storage is cleaned between requests, and can be used as a request-local storage. Previously it was possible, that fiber.storage would contain some old values in the beginning of an iproto request execution, and it needed to be nullified manually. Now it is not needed (gh-4662). On 16/01/2020 22:54, Vladislav Shpilevoy wrote: > The patchset makes fiber.storage be always deleted. Regardless of > where was it created - in a fiber born in the Lua land, or in a > fiber serving IProto requests and accessing Lua. > > That removes > > - a leak occurred each time when fiber storage was created for > an IProto request; > > - a possibility to see previous request's artifacts left in > fiber.storage; > > And makes possible to use fiber.storage as request-local data. > > Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-4662-fiber-storage-leak > Issue: https://github.com/tarantool/tarantool/issues/4662 > > Changes in v2: > - Renamed back fiber.on_cleanup -> fiber.on_stop; > - Removed dead code from box_process_call/eval(), which is now > handled by iproto. > > Vladislav Shpilevoy (3): > fiber: unref fiber.storage via global Lua state > fiber: destroy fiber.storage created by iproto > box: remove dead code from box_process_call/eval() > > src/box/call.c | 28 +------ > src/box/iproto.cc | 44 ++++++++-- > src/box/txn.c | 1 + > src/lib/core/fiber.c | 18 +++- > src/lib/core/fiber.h | 14 +++- > src/lua/fiber.c | 35 ++++++-- > test/app/gh-4662-fiber-storage-leak.result | 88 ++++++++++++++++++++ > test/app/gh-4662-fiber-storage-leak.test.lua | 43 ++++++++++ > 8 files changed, 230 insertions(+), 41 deletions(-) > create mode 100644 test/app/gh-4662-fiber-storage-leak.result > create mode 100644 test/app/gh-4662-fiber-storage-leak.test.lua > ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2020-02-15 1:02 UTC | newest] Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-01-16 21:54 [Tarantool-patches] [PATCH v2 0/3] Fiber storage leak Vladislav Shpilevoy 2020-01-16 21:54 ` [Tarantool-patches] [PATCH v2 1/3] fiber: unref fiber.storage via global Lua state Vladislav Shpilevoy 2020-01-17 7:30 ` Konstantin Osipov 2020-01-16 21:54 ` [Tarantool-patches] [PATCH v2 2/3] fiber: destroy fiber.storage created by iproto Vladislav Shpilevoy 2020-01-16 22:00 ` Cyrill Gorcunov 2020-01-17 7:47 ` Konstantin Osipov 2020-01-17 8:06 ` Cyrill Gorcunov 2020-01-17 7:45 ` Konstantin Osipov 2020-01-19 17:32 ` Vladislav Shpilevoy 2020-01-20 7:22 ` Konstantin Osipov 2020-01-20 19:15 ` Georgy Kirichenko 2020-01-21 22:21 ` Vladislav Shpilevoy 2020-01-21 22:32 ` Konstantin Osipov 2020-01-16 21:54 ` [Tarantool-patches] [PATCH v2 3/3] box: remove dead code from box_process_call/eval() Vladislav Shpilevoy 2020-01-17 7:46 ` Konstantin Osipov 2020-01-17 7:47 ` Konstantin Osipov 2020-01-17 17:41 ` Georgy Kirichenko 2020-01-19 17:32 ` Vladislav Shpilevoy 2020-01-20 19:21 ` Georgy Kirichenko 2020-01-18 19:27 ` [Tarantool-patches] [PATCH v2 0/3] Fiber storage leak Igor Munkin 2020-02-15 1:02 ` Vladislav Shpilevoy
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox