[Tarantool-patches] [PATCH] lua/utils: improve luaT_newthread performance

Igor Munkin imun at tarantool.org
Mon Jul 20 14:28:37 MSK 2020


<luaT_newthread> created a new GCfunc object for the helper invoked in a
protected <lua_cpcall> frame (i.e. <luaT_newthread_wrapper>) on each
call. The change introduces a static reference to a GCfunc object for
<luaT_newthread_wrapper> to be initialized on Tarantool startup to
reduce Lua GC memory usage.

Furthermore, since <lua_cpcall> yields nothing on guest stack, the newly
created Lua coroutine need to be pushed back to prevent its sweep. So
to reduce guest stack manipulations <lua_cpcall> is replaced with
<lua_pcall> and the resulting Lua thread is obtained via guest stack.

Signed-off-by: Igor Munkin <imun at tarantool.org>
---
Recently we discussed with Timur his struggling with linking his binary
Lua module against Tarantool. The reason is LuaJIT internals usage for
manipulations with the guest stack that are not provided by the binary.
I glanced the current <luaT_newthread> implementation and found out two
another problems related to the platform overall performance (as it is
proved with the corresponding benchmarks).

The first problem is the similar one <box_process_lua> had prior to the
patch[1]: <lua_cpcall> creates an auxiliary GCfunc object for the
function to be called in protected frame. However this function is the
same throughout the platform uptime. It can be created on Taranool
startup and I see no reason to clobber GC that way.

Another problem I found are excess manipulations with the guest stack:
one need the newly born coroutine on it to prevent it being collected by
GC, but <lua_cpcall> purges everything left on the stack in scope of the
invoked function. As a result the obtained Lua coroutine is "pushed"
back to the guest stack via LuaJIT internal interfaces. It's a bit
ridiculous, since one can just use public API to get the same results:
Lua coroutine on the guest stack and the corresponding pointer to it.

I tested the platform performance with the same benchmark[2] I made for
the <box_process_lua> patch and here are the numbers I obtained after
the 15 runs:
* Vanilla bleeding master (mean):
| ===== 2.5.0-267-gbf047ad44 =====
| call by ref GC: 921877 Kb
| call by ref time: 1.340172 sec
| call GC: 476563.991667 Kb
| eval GC: 655274.935547 Kb
* Patched bleeding master (mean):
| ===== 2.5.0-268-gec0eb12f4 =====
| call by ref GC: 859377 Kb
| call by ref time: 1.215410 sec
| call GC: 445313 Kb
| eval GC: 624024 Kb
* Relative measurements (before -> after):
| call by ref GC: -6% (-62500 Kb)
| call by ref time: -9% (-0.124762 sec)
| call GC: -6% (-31250 Kb)
| eval GC: -4% (-31250 Kb)

There is one hot path I left unverified -- Lua-born fibers creation, but
I guess the relative numbers are quite similar to the ones I mentioned
above. However, if one wonders these results, feel free to ask me.

Branch: https://github.com/tarantool/tarantool/tree/imun/experimental-luaT-newthread

@ChangeLog:
* Improved safe Lua coroutine creation for the case of fiber
  initialization. Prepared GCfunc object is used instead of temporary
  one, resulting in 3-6% garbage collection reduction. Also excess guest
  stack manipulations are removed.

 src/lua/utils.c | 32 ++++++++++++++++++++++++++++++++
 src/lua/utils.h | 38 ++++++++++----------------------------
 2 files changed, 42 insertions(+), 28 deletions(-)

[1]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-June/017834.html
[2]: https://gist.github.com/igormunkin/c941074fa9fdf0f7a4f068498fb5e24c

diff --git a/src/lua/utils.c b/src/lua/utils.c
index 0b05d7257..23ccbc3c9 100644
--- a/src/lua/utils.c
+++ b/src/lua/utils.c
@@ -43,6 +43,8 @@ int luaL_nil_ref = LUA_REFNIL;
 int luaL_map_metatable_ref = LUA_REFNIL;
 int luaL_array_metatable_ref = LUA_REFNIL;
 
+static int luaT_newthread_ref = LUA_NOREF;
+
 static uint32_t CTID_STRUCT_IBUF;
 static uint32_t CTID_STRUCT_IBUF_PTR;
 static uint32_t CTID_CHAR_PTR;
@@ -1224,6 +1226,33 @@ void luaL_iterator_delete(struct luaL_iterator *it)
 
 /* }}} */
 
+/**
+ * @brief A wrapper for <lua_newthread> to be called via luaT_call
+ * in luaT_newthread. Whether new Lua coroutine is created it is
+ * returned on the top of the guest stack.
+ * @param L is a Lua state
+ * @sa <lua_newthread>
+ */
+static int
+luaT_newthread_wrapper(lua_State *L)
+{
+	(void)lua_newthread(L);
+	return 1;
+}
+
+lua_State *
+luaT_newthread(lua_State *L)
+{
+	assert(luaT_newthread_ref != LUA_NOREF);
+	lua_rawgeti(L, LUA_REGISTRYINDEX, luaT_newthread_ref);
+	assert(lua_isfunction(L, -1));
+	if (luaT_call(L, 0, 1) != 0)
+		return NULL;
+	lua_State *L1 = lua_tothread(L, -1);
+	assert(L1 != NULL);
+	return L1;
+}
+
 int
 tarantool_lua_utils_init(struct lua_State *L)
 {
@@ -1274,5 +1303,8 @@ tarantool_lua_utils_init(struct lua_State *L)
 	(void) rc;
 	CTID_UUID = luaL_ctypeid(L, "struct tt_uuid");
 	assert(CTID_UUID != 0);
+
+	lua_pushcfunction(L, luaT_newthread_wrapper);
+	luaT_newthread_ref = luaL_ref(L, LUA_REGISTRYINDEX);
 	return 0;
 }
diff --git a/src/lua/utils.h b/src/lua/utils.h
index b10754e4a..404eafe9b 100644
--- a/src/lua/utils.h
+++ b/src/lua/utils.h
@@ -597,34 +597,16 @@ luaL_checkfinite(struct lua_State *L, struct luaL_serializer *cfg,
 }
 
 /**
- * @brief A wrapper for lua_newthread() to pass it into luaT_cpcall
- * @param L is a Lua State
- * @sa lua_newthread()
- */
-static inline int
-luaT_newthread_wrapper(lua_State *L)
-{
-	*(lua_State **)lua_touserdata(L, 1) = lua_newthread(L);
-	return 0;
-}
-
-/**
- * @brief Safe wrapper for lua_newthread()
- * @param L is a Lua State
- * @sa lua_newthread()
- */
-static inline lua_State *
-luaT_newthread(lua_State *L)
-{
-	lua_State *L1 = NULL;
-	if (luaT_cpcall(L, luaT_newthread_wrapper, &L1) != 0) {
-		return NULL;
-	}
-	assert(L1 != NULL);
-	setthreadV(L, L->top, L1);
-	incr_top(L);
-	return L1;
-}
+ * @brief Creates a new Lua coroutine in a protected frame. If
+ * <lua_newthread> call underneath succeeds, the created Lua state
+ * is on the top of the guest stack and a pointer to this state is
+ * returned. Otherwise LUA_ERRMEM error is handled and the result
+ * is NULL.
+ * @param L is a Lua state
+ * @sa <lua_newthread>
+ */
+lua_State *
+luaT_newthread(lua_State *L);
 
 /**
  * Check if a value on @a L stack by index @a idx is an ibuf
-- 
2.25.0



More information about the Tarantool-patches mailing list