From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtpng1.m.smailru.net (smtpng1.m.smailru.net [94.100.181.251]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 6E8C1445320 for ; Wed, 22 Jul 2020 18:22:28 +0300 (MSK) Date: Wed, 22 Jul 2020 18:12:09 +0300 From: Igor Munkin Message-ID: <20200722151209.GP18920@tarantool.org> References: <20200722113055.GB894@tarantool.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20200722113055.GB894@tarantool.org> Subject: Re: [Tarantool-patches] [PATCH] lua/utils: improve luaT_newthread performance List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Sergey Ostanevich Cc: tarantool-patches@dev.tarantool.org, Vladislav Shpilevoy Sergos, Thanks for your review! On 22.07.20, Sergey Ostanevich wrote: > Hi! > > Thanks for the patch, LGTM. Added your tag: | Reviewed-by: Sergey Ostanevich > > I wonder if it can be done in a more safe way in luaT_newthread() so > that if the refernece is not initialized then initialize it, rather > assert in debug and - I suppose - fail with creation in release? I see > an overhead for condition here and making even bigger indirect > machinery for the wrappers itself won't bring a lot either. Frankly speaking, it looks an overkill to me. What are your exact concerns? I added those asserts mostly for self-check. This is a hot path in call/eval request handling, fiber creation and stored procedures calls, so Debug mode testing fails if the problem occurs. This approach is the similar to one[1] but there are no such checks. This static reference can't be unintentionally changed outside this translation unit, so its usage is well-localized. My arguments are not about performance but sanity. Do you see any flaws in this design? > > Regards, > Sergos > > > On 20 Jul 14:28, Igor Munkin wrote: > > created a new GCfunc object for the helper invoked in a > > protected frame (i.e. ) on each > > call. The change introduces a static reference to a GCfunc object for > > to be initialized on Tarantool startup to > > reduce Lua GC memory usage. > > > > Furthermore, since yields nothing on guest stack, the newly > > created Lua coroutine need to be pushed back to prevent its sweep. So > > to reduce guest stack manipulations is replaced with > > and the resulting Lua thread is obtained via guest stack. > > > > Signed-off-by: Igor Munkin > > --- > > Recently we discussed with Timur his struggling with linking his binary > > Lua module against Tarantool. The reason is LuaJIT internals usage for > > manipulations with the guest stack that are not provided by the binary. > > I glanced the current implementation and found out two > > another problems related to the platform overall performance (as it is > > proved with the corresponding benchmarks). > > > > The first problem is the similar one had prior to the > > patch[1]: creates an auxiliary GCfunc object for the > > function to be called in protected frame. However this function is the > > same throughout the platform uptime. It can be created on Taranool > > startup and I see no reason to clobber GC that way. > > > > Another problem I found are excess manipulations with the guest stack: > > one need the newly born coroutine on it to prevent it being collected by > > GC, but purges everything left on the stack in scope of the > > invoked function. As a result the obtained Lua coroutine is "pushed" > > back to the guest stack via LuaJIT internal interfaces. It's a bit > > ridiculous, since one can just use public API to get the same results: > > Lua coroutine on the guest stack and the corresponding pointer to it. > > > > I tested the platform performance with the same benchmark[2] I made for > > the patch and here are the numbers I obtained after > > the 15 runs: > > * Vanilla bleeding master (mean): > > | ===== 2.5.0-267-gbf047ad44 ===== > > | call by ref GC: 921877 Kb > > | call by ref time: 1.340172 sec > > | call GC: 476563.991667 Kb > > | eval GC: 655274.935547 Kb > > * Patched bleeding master (mean): > > | ===== 2.5.0-268-gec0eb12f4 ===== > > | call by ref GC: 859377 Kb > > | call by ref time: 1.215410 sec > > | call GC: 445313 Kb > > | eval GC: 624024 Kb > > * Relative measurements (before -> after): > > | call by ref GC: -6% (-62500 Kb) > > | call by ref time: -9% (-0.124762 sec) > > | call GC: -6% (-31250 Kb) > > | eval GC: -4% (-31250 Kb) > > > > There is one hot path I left unverified -- Lua-born fibers creation, but > > I guess the relative numbers are quite similar to the ones I mentioned > > above. However, if one wonders these results, feel free to ask me. > > > > Branch: https://github.com/tarantool/tarantool/tree/imun/experimental-luaT-newthread > > > > @ChangeLog: > > * Improved safe Lua coroutine creation for the case of fiber > > initialization. Prepared GCfunc object is used instead of temporary > > one, resulting in 3-6% garbage collection reduction. Also excess guest > > stack manipulations are removed. > > > > src/lua/utils.c | 32 ++++++++++++++++++++++++++++++++ > > src/lua/utils.h | 38 ++++++++++---------------------------- > > 2 files changed, 42 insertions(+), 28 deletions(-) > > > > [1]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-June/017834.html > > [2]: https://gist.github.com/igormunkin/c941074fa9fdf0f7a4f068498fb5e24c > > > > diff --git a/src/lua/utils.c b/src/lua/utils.c > > index 0b05d7257..23ccbc3c9 100644 > > --- a/src/lua/utils.c > > +++ b/src/lua/utils.c > > @@ -43,6 +43,8 @@ int luaL_nil_ref = LUA_REFNIL; > > int luaL_map_metatable_ref = LUA_REFNIL; > > int luaL_array_metatable_ref = LUA_REFNIL; > > > > +static int luaT_newthread_ref = LUA_NOREF; > > + > > static uint32_t CTID_STRUCT_IBUF; > > static uint32_t CTID_STRUCT_IBUF_PTR; > > static uint32_t CTID_CHAR_PTR; > > @@ -1224,6 +1226,33 @@ void luaL_iterator_delete(struct luaL_iterator *it) > > > > /* }}} */ > > > > +/** > > + * @brief A wrapper for to be called via luaT_call > > + * in luaT_newthread. Whether new Lua coroutine is created it is > > + * returned on the top of the guest stack. > > + * @param L is a Lua state > > + * @sa > > + */ > > +static int > > +luaT_newthread_wrapper(lua_State *L) > > +{ > > + (void)lua_newthread(L); > > + return 1; > > +} > > + > > +lua_State * > > +luaT_newthread(lua_State *L) > > +{ > > + assert(luaT_newthread_ref != LUA_NOREF); > > + lua_rawgeti(L, LUA_REGISTRYINDEX, luaT_newthread_ref); > > + assert(lua_isfunction(L, -1)); > > + if (luaT_call(L, 0, 1) != 0) > > + return NULL; > > + lua_State *L1 = lua_tothread(L, -1); > > + assert(L1 != NULL); > > + return L1; > > +} > > + > > int > > tarantool_lua_utils_init(struct lua_State *L) > > { > > @@ -1274,5 +1303,8 @@ tarantool_lua_utils_init(struct lua_State *L) > > (void) rc; > > CTID_UUID = luaL_ctypeid(L, "struct tt_uuid"); > > assert(CTID_UUID != 0); > > + > > + lua_pushcfunction(L, luaT_newthread_wrapper); > > + luaT_newthread_ref = luaL_ref(L, LUA_REGISTRYINDEX); > > return 0; > > } > > diff --git a/src/lua/utils.h b/src/lua/utils.h > > index b10754e4a..404eafe9b 100644 > > --- a/src/lua/utils.h > > +++ b/src/lua/utils.h > > @@ -597,34 +597,16 @@ luaL_checkfinite(struct lua_State *L, struct luaL_serializer *cfg, > > } > > > > /** > > - * @brief A wrapper for lua_newthread() to pass it into luaT_cpcall > > - * @param L is a Lua State > > - * @sa lua_newthread() > > - */ > > -static inline int > > -luaT_newthread_wrapper(lua_State *L) > > -{ > > - *(lua_State **)lua_touserdata(L, 1) = lua_newthread(L); > > - return 0; > > -} > > - > > -/** > > - * @brief Safe wrapper for lua_newthread() > > - * @param L is a Lua State > > - * @sa lua_newthread() > > - */ > > -static inline lua_State * > > -luaT_newthread(lua_State *L) > > -{ > > - lua_State *L1 = NULL; > > - if (luaT_cpcall(L, luaT_newthread_wrapper, &L1) != 0) { > > - return NULL; > > - } > > - assert(L1 != NULL); > > - setthreadV(L, L->top, L1); > > - incr_top(L); > > - return L1; > > -} > > + * @brief Creates a new Lua coroutine in a protected frame. If > > + * call underneath succeeds, the created Lua state > > + * is on the top of the guest stack and a pointer to this state is > > + * returned. Otherwise LUA_ERRMEM error is handled and the result > > + * is NULL. > > + * @param L is a Lua state > > + * @sa > > + */ > > +lua_State * > > +luaT_newthread(lua_State *L); > > > > /** > > * Check if a value on @a L stack by index @a idx is an ibuf > > -- > > 2.25.0 > > [1]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-June/017836.html -- Best regards, IM