[Tarantool-patches] [PATCH] lua/utils: improve luaT_newthread performance

Igor Munkin imun at tarantool.org
Wed Jul 22 18:12:09 MSK 2020


Sergos,

Thanks for your review!

On 22.07.20, Sergey Ostanevich wrote:
> Hi!
> 
> Thanks for the patch, LGTM.

Added your tag:
| Reviewed-by: Sergey Ostanevich <sergos at tarantool.org>

> 
> I wonder if it can be done in a more safe way in luaT_newthread() so
> that if the refernece is not initialized then initialize it, rather 
> assert in debug and - I suppose - fail with creation in release? I see
> an overhead for condition here and making even bigger indirect 
> machinery for the wrappers itself won't bring a lot either.

Frankly speaking, it looks an overkill to me. What are your exact
concerns? I added those asserts mostly for self-check. This is a hot
path in call/eval request handling, fiber creation and stored procedures
calls, so Debug mode testing fails if the problem occurs. This approach
is the similar to <box_process_lua> one[1] but there are no such checks.
This static reference can't be unintentionally changed outside this
translation unit, so its usage is well-localized. My arguments are not
about performance but sanity. Do you see any flaws in this design?

> 
> Regards,
> Sergos
> 
> 
> On 20 Jul 14:28, Igor Munkin wrote:
> > <luaT_newthread> created a new GCfunc object for the helper invoked in a
> > protected <lua_cpcall> frame (i.e. <luaT_newthread_wrapper>) on each
> > call. The change introduces a static reference to a GCfunc object for
> > <luaT_newthread_wrapper> to be initialized on Tarantool startup to
> > reduce Lua GC memory usage.
> > 
> > Furthermore, since <lua_cpcall> yields nothing on guest stack, the newly
> > created Lua coroutine need to be pushed back to prevent its sweep. So
> > to reduce guest stack manipulations <lua_cpcall> is replaced with
> > <lua_pcall> and the resulting Lua thread is obtained via guest stack.
> > 
> > Signed-off-by: Igor Munkin <imun at tarantool.org>
> > ---
> > Recently we discussed with Timur his struggling with linking his binary
> > Lua module against Tarantool. The reason is LuaJIT internals usage for
> > manipulations with the guest stack that are not provided by the binary.
> > I glanced the current <luaT_newthread> implementation and found out two
> > another problems related to the platform overall performance (as it is
> > proved with the corresponding benchmarks).
> > 
> > The first problem is the similar one <box_process_lua> had prior to the
> > patch[1]: <lua_cpcall> creates an auxiliary GCfunc object for the
> > function to be called in protected frame. However this function is the
> > same throughout the platform uptime. It can be created on Taranool
> > startup and I see no reason to clobber GC that way.
> > 
> > Another problem I found are excess manipulations with the guest stack:
> > one need the newly born coroutine on it to prevent it being collected by
> > GC, but <lua_cpcall> purges everything left on the stack in scope of the
> > invoked function. As a result the obtained Lua coroutine is "pushed"
> > back to the guest stack via LuaJIT internal interfaces. It's a bit
> > ridiculous, since one can just use public API to get the same results:
> > Lua coroutine on the guest stack and the corresponding pointer to it.
> > 
> > I tested the platform performance with the same benchmark[2] I made for
> > the <box_process_lua> patch and here are the numbers I obtained after
> > the 15 runs:
> > * Vanilla bleeding master (mean):
> > | ===== 2.5.0-267-gbf047ad44 =====
> > | call by ref GC: 921877 Kb
> > | call by ref time: 1.340172 sec
> > | call GC: 476563.991667 Kb
> > | eval GC: 655274.935547 Kb
> > * Patched bleeding master (mean):
> > | ===== 2.5.0-268-gec0eb12f4 =====
> > | call by ref GC: 859377 Kb
> > | call by ref time: 1.215410 sec
> > | call GC: 445313 Kb
> > | eval GC: 624024 Kb
> > * Relative measurements (before -> after):
> > | call by ref GC: -6% (-62500 Kb)
> > | call by ref time: -9% (-0.124762 sec)
> > | call GC: -6% (-31250 Kb)
> > | eval GC: -4% (-31250 Kb)
> > 
> > There is one hot path I left unverified -- Lua-born fibers creation, but
> > I guess the relative numbers are quite similar to the ones I mentioned
> > above. However, if one wonders these results, feel free to ask me.
> > 
> > Branch: https://github.com/tarantool/tarantool/tree/imun/experimental-luaT-newthread
> > 
> > @ChangeLog:
> > * Improved safe Lua coroutine creation for the case of fiber
> >   initialization. Prepared GCfunc object is used instead of temporary
> >   one, resulting in 3-6% garbage collection reduction. Also excess guest
> >   stack manipulations are removed.
> > 
> >  src/lua/utils.c | 32 ++++++++++++++++++++++++++++++++
> >  src/lua/utils.h | 38 ++++++++++----------------------------
> >  2 files changed, 42 insertions(+), 28 deletions(-)
> > 
> > [1]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-June/017834.html
> > [2]: https://gist.github.com/igormunkin/c941074fa9fdf0f7a4f068498fb5e24c
> > 
> > diff --git a/src/lua/utils.c b/src/lua/utils.c
> > index 0b05d7257..23ccbc3c9 100644
> > --- a/src/lua/utils.c
> > +++ b/src/lua/utils.c
> > @@ -43,6 +43,8 @@ int luaL_nil_ref = LUA_REFNIL;
> >  int luaL_map_metatable_ref = LUA_REFNIL;
> >  int luaL_array_metatable_ref = LUA_REFNIL;
> >  
> > +static int luaT_newthread_ref = LUA_NOREF;
> > +
> >  static uint32_t CTID_STRUCT_IBUF;
> >  static uint32_t CTID_STRUCT_IBUF_PTR;
> >  static uint32_t CTID_CHAR_PTR;
> > @@ -1224,6 +1226,33 @@ void luaL_iterator_delete(struct luaL_iterator *it)
> >  
> >  /* }}} */
> >  
> > +/**
> > + * @brief A wrapper for <lua_newthread> to be called via luaT_call
> > + * in luaT_newthread. Whether new Lua coroutine is created it is
> > + * returned on the top of the guest stack.
> > + * @param L is a Lua state
> > + * @sa <lua_newthread>
> > + */
> > +static int
> > +luaT_newthread_wrapper(lua_State *L)
> > +{
> > +	(void)lua_newthread(L);
> > +	return 1;
> > +}
> > +
> > +lua_State *
> > +luaT_newthread(lua_State *L)
> > +{
> > +	assert(luaT_newthread_ref != LUA_NOREF);
> > +	lua_rawgeti(L, LUA_REGISTRYINDEX, luaT_newthread_ref);
> > +	assert(lua_isfunction(L, -1));
> > +	if (luaT_call(L, 0, 1) != 0)
> > +		return NULL;
> > +	lua_State *L1 = lua_tothread(L, -1);
> > +	assert(L1 != NULL);
> > +	return L1;
> > +}
> > +
> >  int
> >  tarantool_lua_utils_init(struct lua_State *L)
> >  {
> > @@ -1274,5 +1303,8 @@ tarantool_lua_utils_init(struct lua_State *L)
> >  	(void) rc;
> >  	CTID_UUID = luaL_ctypeid(L, "struct tt_uuid");
> >  	assert(CTID_UUID != 0);
> > +
> > +	lua_pushcfunction(L, luaT_newthread_wrapper);
> > +	luaT_newthread_ref = luaL_ref(L, LUA_REGISTRYINDEX);
> >  	return 0;
> >  }
> > diff --git a/src/lua/utils.h b/src/lua/utils.h
> > index b10754e4a..404eafe9b 100644
> > --- a/src/lua/utils.h
> > +++ b/src/lua/utils.h
> > @@ -597,34 +597,16 @@ luaL_checkfinite(struct lua_State *L, struct luaL_serializer *cfg,
> >  }
> >  
> >  /**
> > - * @brief A wrapper for lua_newthread() to pass it into luaT_cpcall
> > - * @param L is a Lua State
> > - * @sa lua_newthread()
> > - */
> > -static inline int
> > -luaT_newthread_wrapper(lua_State *L)
> > -{
> > -	*(lua_State **)lua_touserdata(L, 1) = lua_newthread(L);
> > -	return 0;
> > -}
> > -
> > -/**
> > - * @brief Safe wrapper for lua_newthread()
> > - * @param L is a Lua State
> > - * @sa lua_newthread()
> > - */
> > -static inline lua_State *
> > -luaT_newthread(lua_State *L)
> > -{
> > -	lua_State *L1 = NULL;
> > -	if (luaT_cpcall(L, luaT_newthread_wrapper, &L1) != 0) {
> > -		return NULL;
> > -	}
> > -	assert(L1 != NULL);
> > -	setthreadV(L, L->top, L1);
> > -	incr_top(L);
> > -	return L1;
> > -}
> > + * @brief Creates a new Lua coroutine in a protected frame. If
> > + * <lua_newthread> call underneath succeeds, the created Lua state
> > + * is on the top of the guest stack and a pointer to this state is
> > + * returned. Otherwise LUA_ERRMEM error is handled and the result
> > + * is NULL.
> > + * @param L is a Lua state
> > + * @sa <lua_newthread>
> > + */
> > +lua_State *
> > +luaT_newthread(lua_State *L);
> >  
> >  /**
> >   * Check if a value on @a L stack by index @a idx is an ibuf
> > -- 
> > 2.25.0
> > 

[1]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-June/017836.html

-- 
Best regards,
IM


More information about the Tarantool-patches mailing list