* [Tarantool-patches] [PATCH] lua: refactor port_lua_do_dump and encode_lua_call
@ 2021-06-18 18:14 Sergey Kaplun via Tarantool-patches
  2021-06-21 20:36 ` Vladislav Shpilevoy via Tarantool-patches
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-06-18 18:14 UTC (permalink / raw)
  To: Vladislav Shpilevoy, Igor Munkin; +Cc: tarantool-patches
The old code flow was the following:
1) `struct port_lua` given to `port_lua_do_dump()` has Lua stack with
   arguments to encode to MessagePack
2) The main coroutine `tarantool_L` is used with `lua_cpcall()` to call
   `encode_lua_call_16()` or `encode_lua_call()`
3) Objects on minor coroutine are encoded via `luamp_encode_call16()` or
   `luamp_encode()`. This encoding may raise an error on unprotected
   `port->L` coroutine. This coroutine has no protected frame on it
   and this call should fail in pure Lua. It is forbidden to call
   anything on unprotected coroutine [1] (Lua 5.1 sets protection only
   for specific lua_State [2] and calls a panic function if we raise an
   error on unprotected lua_State [3]). Netherless, there is no panic
   at now due to two facts. The first one is LuaJIT's support of C++
   exception handling [4] that allows to raise an error in Lua and
   catch it in C++ or vice versa. But documentation still doesn't
   permit errors on unprotected coroutines (at least we must set
   try-catch block). The second one is double monkey-patching of LuaJIT
   to restore currently executed coroutine, when C function or fast
   function raises an error [5][6] (see related issue here [7][8]).
   For these reasons, when an error occurs, the unwinder searches and
   finds the C-protected stack frame from the `lua_cpcall()` for the
   `tarantool_L` coroutine and unwinds until that point (without
   aforementioned patches LuaJIT just calls a panic function and exit).
4) If an error is raised, and `lua_cpcall()` returns not `LUA_OK`, then
   the error from `port->L` coroutine is converted into a Tarantool error
   and a diagnostic is set.
The auxiliary usage of `tarantool_L` coroutine is redundant and doesn't
respect Lua idiomatic of usage. So this patch drops it and uses only
minor coroutine instead with `lua_pcall()`.
Functions to encode are saved as entrance in the `LUA_REGISTRY` table
to reduce GC pressure, like it is done for other handlers [9].
[1]: https://www.lua.org/manual/5.2/manual.html#4.6
> If an error happens outside any protected environment, Lua calls a
> panic function
[2]: https://www.lua.org/source/5.1/lstate.h.html#lua_State
[3]: https://www.lua.org/source/5.1/ldo.c.html#luaD_throw
[4]: https://luajit.org/extensions.html#exceptions
[5]: https://github.com/tarantool/luajit/commit/ed412cd9f55fe87fd32a69c86e1732690fc5c1b0
[6]: https://github.com/tarantool/luajit/commit/97699d9ee2467389b6aea21a098e38aff3469b5f
[7]: https://github.com/tarantool/tarantool/issues/1516
[8]: https://www.freelists.org/post/luajit/Issue-with-PCALL-in-21
[9]: https://github.com/tarantool/tarantool/commit/e88c0d21ab765d4c53bed2437c49d77b3ffe4216
---
Branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-noticket-refactor-lua-call
See the benchmarks sources here [1].
Before patch:
| Encode map: 189851357 mcs, 15.8 K ps
| Encode seq: 187926351 mcs, 16.0 K ps
| Encode str: 185451675 mcs, 16.2 K ps
| Encode dig: 184833396 mcs, 16.2 K ps
After patch:
| Encode map: 187814261 mcs, 16.0 K ps
| Encode seq: 183755028 mcs, 16.3 K ps
| Encode str: 181571626 mcs, 16.5 K ps
| Encode dig: 181572998 mcs, 16.5 K ps
Looks like the perf doesn't degrade at least.
[1]: https://gist.github.com/Buristan/3e6d6bf2c722874bec55a8c5a44b98f3
 src/box/lua/call.c | 71 ++++++++++++++++++++++++++++++++++++----------
 1 file changed, 56 insertions(+), 15 deletions(-)
diff --git a/src/box/lua/call.c b/src/box/lua/call.c
index 0315e720c..3b2572096 100644
--- a/src/box/lua/call.c
+++ b/src/box/lua/call.c
@@ -61,6 +61,8 @@ enum handlers {
 	HANDLER_CALL,
 	HANDLER_CALL_BY_REF,
 	HANDLER_EVAL,
+	HANDLER_ENCODE_CALL,
+	HANDLER_ENCODE_CALL_16,
 	HANDLER_MAX,
 };
 
@@ -400,11 +402,26 @@ struct encode_lua_ctx {
 	struct mpstream *stream;
 };
 
+/**
+ * Encode call results to msgpack from Lua stack.
+ * Lua stack has the following structure -- the last element is
+ * lightuserdata pointer to encode_lua_ctx, all other values are
+ * arguments to proceed.
+ * The function encodes all given Lua objects to msgpack stream
+ * from context, sets port's size and returns no value on the Lua
+ * stack.
+ * XXX: This function *MUST* be called under lua_pcall(), because
+ * luamp_encode() may raise an error.
+ */
 static int
 encode_lua_call(lua_State *L)
 {
+	assert(lua_islightuserdata(L, -1));
 	struct encode_lua_ctx *ctx =
-		(struct encode_lua_ctx *) lua_topointer(L, 1);
+		(struct encode_lua_ctx *) lua_topointer(L, -1);
+	assert(ctx->port->L == L);
+	/* Delete ctx from the stack. */
+	lua_pop(L, 1);
 	/*
 	 * Add all elements from Lua stack to the buffer.
 	 *
@@ -413,33 +430,48 @@ encode_lua_call(lua_State *L)
 	struct luaL_serializer *cfg = luaL_msgpack_default;
 	const struct serializer_opts *opts =
 		¤t_session()->meta.serializer_opts;
-	int size = lua_gettop(ctx->port->L);
+	const int size = lua_gettop(L);
 	for (int i = 1; i <= size; ++i)
-		luamp_encode(ctx->port->L, cfg, opts, ctx->stream, i);
+		luamp_encode(L, cfg, opts, ctx->stream, i);
 	ctx->port->size = size;
 	mpstream_flush(ctx->stream);
 	return 0;
 }
 
+/**
+ * Encode call_16 results to msgpack from Lua stack.
+ * Lua stack has the following structure -- the last element is
+ * lightuserdata pointer to encode_lua_ctx, all other values are
+ * arguments to proceed.
+ * The function encodes all given Lua objects to msgpack stream
+ * from context, sets port's size and returns no value on the Lua
+ * stack.
+ * XXX: This function *MUST* be called under lua_pcall(), because
+ * luamp_encode() may raise an error.
+ */
 static int
 encode_lua_call_16(lua_State *L)
 {
+	assert(lua_islightuserdata(L, -1));
 	struct encode_lua_ctx *ctx =
-		(struct encode_lua_ctx *) lua_topointer(L, 1);
+		(struct encode_lua_ctx *) lua_topointer(L, -1);
+	/* Delete ctx from the stack. */
+	lua_pop(L, 1);
+	assert(ctx->port->L == L);
 	/*
 	 * Add all elements from Lua stack to the buffer.
 	 *
 	 * TODO: forbid explicit yield from __serialize or __index here
 	 */
 	struct luaL_serializer *cfg = luaL_msgpack_default;
-	ctx->port->size = luamp_encode_call_16(ctx->port->L, cfg, ctx->stream);
+	ctx->port->size = luamp_encode_call_16(L, cfg, ctx->stream);
 	mpstream_flush(ctx->stream);
 	return 0;
 }
 
 static inline int
 port_lua_do_dump(struct port *base, struct mpstream *stream,
-		 lua_CFunction handler)
+		 enum handlers handler)
 {
 	struct port_lua *port = (struct port_lua *) base;
 	assert(port->vtab == &port_lua_vtab);
@@ -450,13 +482,20 @@ port_lua_do_dump(struct port *base, struct mpstream *stream,
 	struct encode_lua_ctx ctx;
 	ctx.port = port;
 	ctx.stream = stream;
-	struct lua_State *L = tarantool_L;
-	int top = lua_gettop(L);
-	if (lua_cpcall(L, handler, &ctx) != 0) {
-		luaT_toerror(port->L);
+	lua_State *L = port->L;
+	/*
+	 * At the moment Lua stack holds only values to encode.
+	 * Insert corresponding encoder to the bottom and push
+	 * encode context as lightuserdata to the top.
+	 */
+	const int size = lua_gettop(L);
+	lua_rawgeti(L, LUA_REGISTRYINDEX, execute_lua_refs[handler]);
+	assert(lua_isfunction(L, -1) && lua_iscfunction(L, -1));
+	lua_insert(L, 1);
+	lua_pushlightuserdata(L, &ctx);
+	/* nargs -- all arguments + lightuserdata. */
+	if (luaT_call(L, size + 1, 0) != 0)
 		return -1;
-	}
-	lua_settop(L, top);
 	return port->size;
 }
 
@@ -467,7 +506,7 @@ port_lua_dump(struct port *base, struct obuf *out)
 	struct mpstream stream;
 	mpstream_init(&stream, out, obuf_reserve_cb, obuf_alloc_cb,
 		      luamp_error, port->L);
-	return port_lua_do_dump(base, &stream, encode_lua_call);
+	return port_lua_do_dump(base, &stream, HANDLER_ENCODE_CALL);
 }
 
 static int
@@ -477,7 +516,7 @@ port_lua_dump_16(struct port *base, struct obuf *out)
 	struct mpstream stream;
 	mpstream_init(&stream, out, obuf_reserve_cb, obuf_alloc_cb,
 		      luamp_error, port->L);
-	return port_lua_do_dump(base, &stream, encode_lua_call_16);
+	return port_lua_do_dump(base, &stream, HANDLER_ENCODE_CALL_16);
 }
 
 static void
@@ -501,7 +540,7 @@ port_lua_get_msgpack(struct port *base, uint32_t *size)
 	mpstream_init(&stream, region, region_reserve_cb, region_alloc_cb,
 		      luamp_error, port->L);
 	mpstream_encode_array(&stream, lua_gettop(port->L));
-	int rc = port_lua_do_dump(base, &stream, encode_lua_call);
+	int rc = port_lua_do_dump(base, &stream, HANDLER_ENCODE_CALL);
 	if (rc < 0) {
 		region_truncate(region, region_svp);
 		return NULL;
@@ -1049,6 +1088,8 @@ box_lua_call_init(struct lua_State *L)
 		[HANDLER_CALL] = execute_lua_call,
 		[HANDLER_CALL_BY_REF] = execute_lua_call_by_ref,
 		[HANDLER_EVAL] = execute_lua_eval,
+		[HANDLER_ENCODE_CALL] = encode_lua_call,
+		[HANDLER_ENCODE_CALL_16] = encode_lua_call_16,
 	};
 
 	for (int i = 0; i < HANDLER_MAX; i++) {
-- 
2.31.0
^ permalink raw reply	[flat|nested] 13+ messages in thread* Re: [Tarantool-patches] [PATCH] lua: refactor port_lua_do_dump and encode_lua_call 2021-06-18 18:14 [Tarantool-patches] [PATCH] lua: refactor port_lua_do_dump and encode_lua_call Sergey Kaplun via Tarantool-patches @ 2021-06-21 20:36 ` Vladislav Shpilevoy via Tarantool-patches 2021-06-22 13:38 ` Sergey Kaplun via Tarantool-patches 2021-08-01 12:34 ` Igor Munkin via Tarantool-patches ` (2 subsequent siblings) 3 siblings, 1 reply; 13+ messages in thread From: Vladislav Shpilevoy via Tarantool-patches @ 2021-06-21 20:36 UTC (permalink / raw) To: Sergey Kaplun, Igor Munkin; +Cc: tarantool-patches Hi! Thanks for the patch! > Branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-noticket-refactor-lua-call > See the benchmarks sources here [1]. > > Before patch: > | Encode map: 189851357 mcs, 15.8 K ps > | Encode seq: 187926351 mcs, 16.0 K ps > | Encode str: 185451675 mcs, 16.2 K ps > | Encode dig: 184833396 mcs, 16.2 K ps > > After patch: > | Encode map: 187814261 mcs, 16.0 K ps > | Encode seq: 183755028 mcs, 16.3 K ps > | Encode str: 181571626 mcs, 16.5 K ps > | Encode dig: 181572998 mcs, 16.5 K ps > > Looks like the perf doesn't degrade at least. > > [1]: https://gist.github.com/Buristan/3e6d6bf2c722874bec55a8c5a44b98f3 > > src/box/lua/call.c | 71 ++++++++++++++++++++++++++++++++++++---------- > 1 file changed, 56 insertions(+), 15 deletions(-) > > diff --git a/src/box/lua/call.c b/src/box/lua/call.c > index 0315e720c..3b2572096 100644 > --- a/src/box/lua/call.c > +++ b/src/box/lua/call.c > @@ -450,13 +482,20 @@ port_lua_do_dump(struct port *base, struct mpstream *stream, > struct encode_lua_ctx ctx; > ctx.port = port; > ctx.stream = stream; > - struct lua_State *L = tarantool_L; > - int top = lua_gettop(L); > - if (lua_cpcall(L, handler, &ctx) != 0) { > - luaT_toerror(port->L); > + lua_State *L = port->L; > + /* > + * At the moment Lua stack holds only values to encode. > + * Insert corresponding encoder to the bottom and push > + * encode context as lightuserdata to the top. > + */ > + const int size = lua_gettop(L); > + lua_rawgeti(L, LUA_REGISTRYINDEX, execute_lua_refs[handler]); > + assert(lua_isfunction(L, -1) && lua_iscfunction(L, -1)); > + lua_insert(L, 1); If I remember correctly, this leads to moving all the existing stack elements forward. Which might be expensive. I know from Vlad Grubov's words that they have code with hundreds of values in multireturn from stored functions. Did you bench what happens when the Lua coroutine contains a lot of values? In the bench by the link above I see only 1-element array and map. Size of the array and map does not matter though. Only multireturn is interesting here. Like 'return 1, 2, 3, ...'. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Tarantool-patches] [PATCH] lua: refactor port_lua_do_dump and encode_lua_call 2021-06-21 20:36 ` Vladislav Shpilevoy via Tarantool-patches @ 2021-06-22 13:38 ` Sergey Kaplun via Tarantool-patches 2021-06-24 20:00 ` Vladislav Shpilevoy via Tarantool-patches 0 siblings, 1 reply; 13+ messages in thread From: Sergey Kaplun via Tarantool-patches @ 2021-06-22 13:38 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches Hi! Thanks for the review! On 21.06.21, Vladislav Shpilevoy wrote: > Hi! Thanks for the patch! > > > Branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-noticket-refactor-lua-call > > See the benchmarks sources here [1]. > > > > Before patch: > > | Encode map: 189851357 mcs, 15.8 K ps > > | Encode seq: 187926351 mcs, 16.0 K ps > > | Encode str: 185451675 mcs, 16.2 K ps > > | Encode dig: 184833396 mcs, 16.2 K ps > > > > After patch: > > | Encode map: 187814261 mcs, 16.0 K ps > > | Encode seq: 183755028 mcs, 16.3 K ps > > | Encode str: 181571626 mcs, 16.5 K ps > > | Encode dig: 181572998 mcs, 16.5 K ps > > > > Looks like the perf doesn't degrade at least. > > > > [1]: https://gist.github.com/Buristan/3e6d6bf2c722874bec55a8c5a44b98f3 > > > > src/box/lua/call.c | 71 ++++++++++++++++++++++++++++++++++++---------- > > 1 file changed, 56 insertions(+), 15 deletions(-) > > > > diff --git a/src/box/lua/call.c b/src/box/lua/call.c > > index 0315e720c..3b2572096 100644 > > --- a/src/box/lua/call.c > > +++ b/src/box/lua/call.c > > @@ -450,13 +482,20 @@ port_lua_do_dump(struct port *base, struct mpstream *stream, > > struct encode_lua_ctx ctx; > > ctx.port = port; > > ctx.stream = stream; > > - struct lua_State *L = tarantool_L; > > - int top = lua_gettop(L); > > - if (lua_cpcall(L, handler, &ctx) != 0) { > > - luaT_toerror(port->L); > > + lua_State *L = port->L; > > + /* > > + * At the moment Lua stack holds only values to encode. > > + * Insert corresponding encoder to the bottom and push > > + * encode context as lightuserdata to the top. > > + */ > > + const int size = lua_gettop(L); > > + lua_rawgeti(L, LUA_REGISTRYINDEX, execute_lua_refs[handler]); > > + assert(lua_isfunction(L, -1) && lua_iscfunction(L, -1)); > > + lua_insert(L, 1); > > If I remember correctly, this leads to moving all the existing > stack elements forward. Which might be expensive. I know from > Vlad Grubov's words that they have code with hundreds of values in > multireturn from stored functions. Did you bench what happens when > the Lua coroutine contains a lot of values? In the bench by the > link above I see only 1-element array and map. Size of the array > and map does not matter though. Only multireturn is interesting > here. Like 'return 1, 2, 3, ...'. I've added this benchmark (200 numbers to return) to the gist. Local results for the bench: Master: | Encode mul: 237900953 mcs, 12.6 K ps My branch: | Encode mul: 235735350 mcs, 12.7 K ps `luamp_encode()` has the biggest impact in `port_do_lua_dump()` (`lua_insert()` costs ~0.1% of the whole runtime). -- Best regards, Sergey Kaplun ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Tarantool-patches] [PATCH] lua: refactor port_lua_do_dump and encode_lua_call 2021-06-22 13:38 ` Sergey Kaplun via Tarantool-patches @ 2021-06-24 20:00 ` Vladislav Shpilevoy via Tarantool-patches 2021-06-29 7:07 ` Sergey Kaplun via Tarantool-patches 0 siblings, 1 reply; 13+ messages in thread From: Vladislav Shpilevoy via Tarantool-patches @ 2021-06-24 20:00 UTC (permalink / raw) To: Sergey Kaplun; +Cc: tarantool-patches Thanks for the investigation! >>> diff --git a/src/box/lua/call.c b/src/box/lua/call.c >>> index 0315e720c..3b2572096 100644 >>> --- a/src/box/lua/call.c >>> +++ b/src/box/lua/call.c >>> @@ -450,13 +482,20 @@ port_lua_do_dump(struct port *base, struct mpstream *stream, >>> struct encode_lua_ctx ctx; >>> ctx.port = port; >>> ctx.stream = stream; >>> - struct lua_State *L = tarantool_L; >>> - int top = lua_gettop(L); >>> - if (lua_cpcall(L, handler, &ctx) != 0) { >>> - luaT_toerror(port->L); >>> + lua_State *L = port->L; >>> + /* >>> + * At the moment Lua stack holds only values to encode. >>> + * Insert corresponding encoder to the bottom and push >>> + * encode context as lightuserdata to the top. >>> + */ >>> + const int size = lua_gettop(L); >>> + lua_rawgeti(L, LUA_REGISTRYINDEX, execute_lua_refs[handler]); >>> + assert(lua_isfunction(L, -1) && lua_iscfunction(L, -1)); >>> + lua_insert(L, 1); >> >> If I remember correctly, this leads to moving all the existing >> stack elements forward. Which might be expensive. I know from >> Vlad Grubov's words that they have code with hundreds of values in >> multireturn from stored functions. Did you bench what happens when >> the Lua coroutine contains a lot of values? In the bench by the >> link above I see only 1-element array and map. Size of the array >> and map does not matter though. Only multireturn is interesting >> here. Like 'return 1, 2, 3, ...'. > > I've added this benchmark (200 numbers to return) to the gist. > Local results for the bench: > > Master: > | Encode mul: 237900953 mcs, 12.6 K ps > > My branch: > | Encode mul: 235735350 mcs, 12.7 K ps > > `luamp_encode()` has the biggest impact in `port_do_lua_dump()` > (`lua_insert()` costs ~0.1% of the whole runtime). That looks suspicious. Why does it have such a small impact? Did you check that multireturn really made the stack bigger? Shouldn't it have linear complexity depending on the stack size after the target index? The patch looks fine to me, if the benchmarks show there is even a slight perf win. But I think Igor should take another look. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Tarantool-patches] [PATCH] lua: refactor port_lua_do_dump and encode_lua_call 2021-06-24 20:00 ` Vladislav Shpilevoy via Tarantool-patches @ 2021-06-29 7:07 ` Sergey Kaplun via Tarantool-patches 0 siblings, 0 replies; 13+ messages in thread From: Sergey Kaplun via Tarantool-patches @ 2021-06-29 7:07 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches Thanks for the questions! On 24.06.21, Vladislav Shpilevoy wrote: > Thanks for the investigation! > > >>> diff --git a/src/box/lua/call.c b/src/box/lua/call.c > >>> index 0315e720c..3b2572096 100644 > >>> --- a/src/box/lua/call.c > >>> +++ b/src/box/lua/call.c > >>> @@ -450,13 +482,20 @@ port_lua_do_dump(struct port *base, struct mpstream *stream, > >>> struct encode_lua_ctx ctx; > >>> ctx.port = port; > >>> ctx.stream = stream; > >>> - struct lua_State *L = tarantool_L; > >>> - int top = lua_gettop(L); > >>> - if (lua_cpcall(L, handler, &ctx) != 0) { > >>> - luaT_toerror(port->L); > >>> + lua_State *L = port->L; > >>> + /* > >>> + * At the moment Lua stack holds only values to encode. > >>> + * Insert corresponding encoder to the bottom and push > >>> + * encode context as lightuserdata to the top. > >>> + */ > >>> + const int size = lua_gettop(L); > >>> + lua_rawgeti(L, LUA_REGISTRYINDEX, execute_lua_refs[handler]); > >>> + assert(lua_isfunction(L, -1) && lua_iscfunction(L, -1)); > >>> + lua_insert(L, 1); > >> > >> If I remember correctly, this leads to moving all the existing > >> stack elements forward. Which might be expensive. I know from > >> Vlad Grubov's words that they have code with hundreds of values in > >> multireturn from stored functions. Did you bench what happens when > >> the Lua coroutine contains a lot of values? In the bench by the > >> link above I see only 1-element array and map. Size of the array > >> and map does not matter though. Only multireturn is interesting > >> here. Like 'return 1, 2, 3, ...'. > > > > I've added this benchmark (200 numbers to return) to the gist. > > Local results for the bench: > > > > Master: > > | Encode mul: 237900953 mcs, 12.6 K ps > > > > My branch: > > | Encode mul: 235735350 mcs, 12.7 K ps > > > > `luamp_encode()` has the biggest impact in `port_do_lua_dump()` > > (`lua_insert()` costs ~0.1% of the whole runtime). > > That looks suspicious. Why does it have such a small impact? Did I've run the benchmark for return case only with `perf stat` (and 5K return values) and found three interesting metrics (others are pretty the same, up to 0.07%): Master: | 10,255,748,650 cache-misses # 13.686 % of all cache refs (12.49%) | 1,364,762,310 LLC-load-misses # 15.92% of all LL-cache accesses (12.50%) | 401,136,504 iTLB-load-misses # 39.68% of all iTLB cache accesses (12.51%) My branch: | 9,964,939,197 cache-misses # 13.413 % of all cache refs (12.52%) | 1,296,545,699 LLC-load-misses # 15.08% of all LL-cache accesses (12.50%) | 285,852,865 iTLB-load-misses # 29.13% of all iTLB cache accesses (12.49%) 1) LLC (last-line cache, L2 in my case) misses is slightly decreasing. I suppose that all insert movement operations use cache (L1 or L2) instead memory access, that make it rather cheaply. 2) iTLB (instruction translation look-aside buffer) [1] needs to avoid penalty of page walks to a page table. So looks like within this patch code locality is improved (may be due to the fact of using `luaT_call()` instead of 'lua_cpcall()` and 'luaT_toerror()`). > you check that multireturn really made the stack bigger? Shouldn't Yes, I did. > it have linear complexity depending on the stack size after the > target index? Yes, with 5K items to return it becomes pretty equal or even a little bit slower (up to ~1%) than the original code. But encoding is still taking the biggest part of the runtime. Side note: if user wants to return 5K values (or a table with 5K values) from Tarantool we can't stop him, but only to offer him our empathy. > > The patch looks fine to me, if the benchmarks show there is even a > slight perf win. But I think Igor should take another look. [1]: https://software.intel.com/content/www/us/en/develop/documentation/vtune-help/top/reference/cpu-metrics-reference/front-end-bound/itlb-overhead.html -- Best regards, Sergey Kaplun ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Tarantool-patches] [PATCH] lua: refactor port_lua_do_dump and encode_lua_call 2021-06-18 18:14 [Tarantool-patches] [PATCH] lua: refactor port_lua_do_dump and encode_lua_call Sergey Kaplun via Tarantool-patches 2021-06-21 20:36 ` Vladislav Shpilevoy via Tarantool-patches @ 2021-08-01 12:34 ` Igor Munkin via Tarantool-patches 2021-08-02 14:25 ` Sergey Kaplun via Tarantool-patches 2021-08-04 22:29 ` Vladislav Shpilevoy via Tarantool-patches 2021-08-14 10:16 ` Igor Munkin via Tarantool-patches 3 siblings, 1 reply; 13+ messages in thread From: Igor Munkin via Tarantool-patches @ 2021-08-01 12:34 UTC (permalink / raw) To: Sergey Kaplun; +Cc: Vladislav Shpilevoy, tarantool-patches Sergey, Thanks for the patch! The changes are fine, but please consider the comments below. On 18.06.21, Sergey Kaplun wrote: > The old code flow was the following: > 1) `struct port_lua` given to `port_lua_do_dump()` has Lua stack with > arguments to encode to MessagePack > 2) The main coroutine `tarantool_L` is used with `lua_cpcall()` to call > `encode_lua_call_16()` or `encode_lua_call()` > 3) Objects on minor coroutine are encoded via `luamp_encode_call16()` or What is this minor coroutine? To make terms more consistent in scope of this commit message, let's define this in the first bullet and use it everywhere. For me the best option is "port coroutine". > `luamp_encode()`. This encoding may raise an error on unprotected The next bullet is recommended to be started from this sentence. > `port->L` coroutine. This coroutine has no protected frame on it > and this call should fail in pure Lua. It is forbidden to call Ditto. > anything on unprotected coroutine [1] (Lua 5.1 sets protection only > for specific lua_State [2] and calls a panic function if we raise an > error on unprotected lua_State [3]). Netherless, there is no panic Ditto. > at now due to two facts. The first one is LuaJIT's support of C++ And sublist is strongly recommended for "the first" and "the second" facts. It'll make this War and Peace part more readable. > exception handling [4] that allows to raise an error in Lua and > catch it in C++ or vice versa. But documentation still doesn't > permit errors on unprotected coroutines (at least we must set > try-catch block). The second one is double monkey-patching of LuaJIT I doubt this is monkey-patching, but rather a monkey's patch. > to restore currently executed coroutine, when C function or fast > function raises an error [5][6] (see related issue here [7][8]). > For these reasons, when an error occurs, the unwinder searches and > finds the C-protected stack frame from the `lua_cpcall()` for the > `tarantool_L` coroutine and unwinds until that point (without > aforementioned patches LuaJIT just calls a panic function and exit). > 4) If an error is raised, and `lua_cpcall()` returns not `LUA_OK`, then > the error from `port->L` coroutine is converted into a Tarantool error > and a diagnostic is set. > > The auxiliary usage of `tarantool_L` coroutine is redundant and doesn't Well, as for me, the main problem is violating Lua coroutine practice (it's worth to mention, that internal unwinder used on M1 is not such flexible, so this misuse still leads to panic call), and the minor reason is tarantool_L usage redundancy. > respect Lua idiomatic of usage. So this patch drops it and uses only > minor coroutine instead with `lua_pcall()`. > > Functions to encode are saved as entrance in the `LUA_REGISTRY` table > to reduce GC pressure, like it is done for other handlers [9]. > > [1]: https://www.lua.org/manual/5.2/manual.html#4.6 > > If an error happens outside any protected environment, Lua calls a > > panic function Some copy-pasting artefacts. > [2]: https://www.lua.org/source/5.1/lstate.h.html#lua_State > [3]: https://www.lua.org/source/5.1/ldo.c.html#luaD_throw > [4]: https://luajit.org/extensions.html#exceptions > [5]: https://github.com/tarantool/luajit/commit/ed412cd9f55fe87fd32a69c86e1732690fc5c1b0 > [6]: https://github.com/tarantool/luajit/commit/97699d9ee2467389b6aea21a098e38aff3469b5f > [7]: https://github.com/tarantool/tarantool/issues/1516 > [8]: https://www.freelists.org/post/luajit/Issue-with-PCALL-in-21 > [9]: https://github.com/tarantool/tarantool/commit/e88c0d21ab765d4c53bed2437c49d77b3ffe4216 This patch resolves #6248, BTW. > --- > > Branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-noticket-refactor-lua-call > See the benchmarks sources here [1]. > > Before patch: > | Encode map: 189851357 mcs, 15.8 K ps > | Encode seq: 187926351 mcs, 16.0 K ps > | Encode str: 185451675 mcs, 16.2 K ps > | Encode dig: 184833396 mcs, 16.2 K ps > > After patch: > | Encode map: 187814261 mcs, 16.0 K ps > | Encode seq: 183755028 mcs, 16.3 K ps > | Encode str: 181571626 mcs, 16.5 K ps > | Encode dig: 181572998 mcs, 16.5 K ps > > Looks like the perf doesn't degrade at least. At first, I would like to emphasize that we have no option for merging or not the fix for this issue. Re benchmarks: It's worth to mention you're measuring two performance critical changes: <lua_insert> effect and lower GC pressure. So, it's interesting to see the following benchmarks: * one with disabled GC and GC stats * one with considerable amount of elements on Lua stack, but not triggering stack resize (AFAIU, 200 is too much) Here are my points: * There is no such huge increase as a result of reducing GC pressure * Moving 1-5 8-byte elements is neglible for performance * Moving 200(*) elements as a result of the guest stack resize affects both patched and vanilla versions * <lua_insert> measurements are affected by resizing (considering your perf stats) Anyway, though these are kinda independent changes, we see no performance degradation using both of them in a single patch, so I guess we have no reason to worry about. (*) I'm not sure about the exact amount of the elements to be moved. > > [1]: https://gist.github.com/Buristan/3e6d6bf2c722874bec55a8c5a44b98f3 > > src/box/lua/call.c | 71 ++++++++++++++++++++++++++++++++++++---------- > 1 file changed, 56 insertions(+), 15 deletions(-) > > diff --git a/src/box/lua/call.c b/src/box/lua/call.c > index 0315e720c..3b2572096 100644 > --- a/src/box/lua/call.c > +++ b/src/box/lua/call.c > @@ -61,6 +61,8 @@ enum handlers { > HANDLER_CALL, > HANDLER_CALL_BY_REF, > HANDLER_EVAL, > + HANDLER_ENCODE_CALL, > + HANDLER_ENCODE_CALL_16, Minor: Entries are not sorted. > HANDLER_MAX, > }; > > @@ -400,11 +402,26 @@ struct encode_lua_ctx { > struct mpstream *stream; > }; > > +/** > + * Encode call results to msgpack from Lua stack. > + * Lua stack has the following structure -- the last element is > + * lightuserdata pointer to encode_lua_ctx, all other values are > + * arguments to proceed. Typo: s/proceed/process/. > + * The function encodes all given Lua objects to msgpack stream > + * from context, sets port's size and returns no value on the Lua > + * stack. > + * XXX: This function *MUST* be called under lua_pcall(), because > + * luamp_encode() may raise an error. > + */ <snipped> > +/** > + * Encode call_16 results to msgpack from Lua stack. > + * Lua stack has the following structure -- the last element is > + * lightuserdata pointer to encode_lua_ctx, all other values are > + * arguments to proceed. Ditto. > + * The function encodes all given Lua objects to msgpack stream > + * from context, sets port's size and returns no value on the Lua > + * stack. > + * XXX: This function *MUST* be called under lua_pcall(), because > + * luamp_encode() may raise an error. > + */ <snipped> > @@ -450,13 +482,20 @@ port_lua_do_dump(struct port *base, struct mpstream *stream, > struct encode_lua_ctx ctx; > ctx.port = port; > ctx.stream = stream; > - struct lua_State *L = tarantool_L; > - int top = lua_gettop(L); > - if (lua_cpcall(L, handler, &ctx) != 0) { > - luaT_toerror(port->L); > + lua_State *L = port->L; > + /* > + * At the moment Lua stack holds only values to encode. > + * Insert corresponding encoder to the bottom and push > + * encode context as lightuserdata to the top. > + */ > + const int size = lua_gettop(L); > + lua_rawgeti(L, LUA_REGISTRYINDEX, execute_lua_refs[handler]); > + assert(lua_isfunction(L, -1) && lua_iscfunction(L, -1)); > + lua_insert(L, 1); Side note: I don't like this line, like all we do. The only better approach in my head is to implement something similar to VARG frame, but I doubt this approach have no pure Lua violations. Let's return to this later a bit. > + lua_pushlightuserdata(L, &ctx); > + /* nargs -- all arguments + lightuserdata. */ > + if (luaT_call(L, size + 1, 0) != 0) > return -1; > - } > - lua_settop(L, top); > return port->size; > } > <snipped> > @@ -1049,6 +1088,8 @@ box_lua_call_init(struct lua_State *L) > [HANDLER_CALL] = execute_lua_call, > [HANDLER_CALL_BY_REF] = execute_lua_call_by_ref, > [HANDLER_EVAL] = execute_lua_eval, > + [HANDLER_ENCODE_CALL] = encode_lua_call, > + [HANDLER_ENCODE_CALL_16] = encode_lua_call_16, Minor: Entries are also not sorted. > }; > > for (int i = 0; i < HANDLER_MAX; i++) { > -- > 2.31.0 > -- Best regards, IM ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Tarantool-patches] [PATCH] lua: refactor port_lua_do_dump and encode_lua_call 2021-08-01 12:34 ` Igor Munkin via Tarantool-patches @ 2021-08-02 14:25 ` Sergey Kaplun via Tarantool-patches 2021-08-12 17:35 ` Igor Munkin via Tarantool-patches 0 siblings, 1 reply; 13+ messages in thread From: Sergey Kaplun via Tarantool-patches @ 2021-08-02 14:25 UTC (permalink / raw) To: Igor Munkin; +Cc: Vladislav Shpilevoy, tarantool-patches Hi, Igor! Thanks for the review! On 01.08.21, Igor Munkin wrote: > Sergey, > > Thanks for the patch! The changes are fine, but please consider the > comments below. > > On 18.06.21, Sergey Kaplun wrote: > > The old code flow was the following: > > 1) `struct port_lua` given to `port_lua_do_dump()` has Lua stack with > > arguments to encode to MessagePack > > 2) The main coroutine `tarantool_L` is used with `lua_cpcall()` to call > > `encode_lua_call_16()` or `encode_lua_call()` > > 3) Objects on minor coroutine are encoded via `luamp_encode_call16()` or > > What is this minor coroutine? To make terms more consistent in scope of > this commit message, let's define this in the first bullet and use it > everywhere. For me the best option is "port coroutine". Fixed. > > > `luamp_encode()`. This encoding may raise an error on unprotected > > The next bullet is recommended to be started from this sentence. Fixed. > > > `port->L` coroutine. This coroutine has no protected frame on it > > and this call should fail in pure Lua. It is forbidden to call > > Ditto. Fixed. > > > anything on unprotected coroutine [1] (Lua 5.1 sets protection only > > for specific lua_State [2] and calls a panic function if we raise an > > error on unprotected lua_State [3]). Netherless, there is no panic > > Ditto. Fixed. > > > at now due to two facts. The first one is LuaJIT's support of C++ > > And sublist is strongly recommended for "the first" and "the second" > facts. It'll make this War and Peace part more readable. Done. > > > exception handling [4] that allows to raise an error in Lua and > > catch it in C++ or vice versa. But documentation still doesn't > > permit errors on unprotected coroutines (at least we must set > > try-catch block). The second one is double monkey-patching of LuaJIT > > I doubt this is monkey-patching, but rather a monkey's patch. Changed to patching. > > > to restore currently executed coroutine, when C function or fast > > function raises an error [5][6] (see related issue here [7][8]). > > For these reasons, when an error occurs, the unwinder searches and > > finds the C-protected stack frame from the `lua_cpcall()` for the > > `tarantool_L` coroutine and unwinds until that point (without > > aforementioned patches LuaJIT just calls a panic function and exit). > > 4) If an error is raised, and `lua_cpcall()` returns not `LUA_OK`, then > > the error from `port->L` coroutine is converted into a Tarantool error > > and a diagnostic is set. > > > > The auxiliary usage of `tarantool_L` coroutine is redundant and doesn't > > Well, as for me, the main problem is violating Lua coroutine practice > (it's worth to mention, that internal unwinder used on M1 is not such > flexible, so this misuse still leads to panic call), and the minor > reason is tarantool_L usage redundancy. Changed considering your comments. > > > respect Lua idiomatic of usage. So this patch drops it and uses only > > minor coroutine instead with `lua_pcall()`. > > > > Functions to encode are saved as entrance in the `LUA_REGISTRY` table > > to reduce GC pressure, like it is done for other handlers [9]. > > > > [1]: https://www.lua.org/manual/5.2/manual.html#4.6 > > > If an error happens outside any protected environment, Lua calls a > > > panic function > > Some copy-pasting artefacts. Moved upwards. > > > [2]: https://www.lua.org/source/5.1/lstate.h.html#lua_State > > [3]: https://www.lua.org/source/5.1/ldo.c.html#luaD_throw > > [4]: https://luajit.org/extensions.html#exceptions > > [5]: https://github.com/tarantool/luajit/commit/ed412cd9f55fe87fd32a69c86e1732690fc5c1b0 > > [6]: https://github.com/tarantool/luajit/commit/97699d9ee2467389b6aea21a098e38aff3469b5f > > [7]: https://github.com/tarantool/tarantool/issues/1516 > > [8]: https://www.freelists.org/post/luajit/Issue-with-PCALL-in-21 > > [9]: https://github.com/tarantool/tarantool/commit/e88c0d21ab765d4c53bed2437c49d77b3ffe4216 > > This patch resolves #6248, BTW. Added. > The new commit message is the following: =================================================================== lua: refactor port_lua_do_dump and encode_lua_call The old code flow was the following: 1) `struct port_lua` given to `port_lua_do_dump()` has Lua stack with arguments to encode to MessagePack. 2) The main coroutine `tarantool_L` is used to call `encode_lua_call()` or `encode_lua_call_16`() via `lua_cpcall()`. 3) Objects on port coroutine are encoded via `luamp_encode()` or `luamp_encode_call16()`. 4) This encoding may raise an error on unprotected `port->L` coroutine. This coroutine has no protected frame on it and this call should fail in pure Lua. Calling anything on unprotected coroutine is not allowed in Lua [1]: | If an error happens outside any protected environment, Lua calls a | panic function Lua 5.1 sets protection only for specific lua_State [2] and calls a panic function if we raise an error on unprotected lua_State [3]. Nevertheless, no panic occurs now due to two facts: * The first one is LuaJIT's support of C++ exception handling [4] that allows to raise an error in Lua and catch it in C++ or vice versa. But documentation still doesn't allow raising errors on unprotected coroutines (at least we must use try-catch block). * The second one is the patch made in LuaJIT to restore currently executed coroutine, when C function or fast function raises an error [5][6] (see the related issue here [7][8]). For these reasons, when an error occurs, the unwinder searches and finds the C-protected stack frame from the `lua_cpcall()` for `tarantool_L` coroutine and unwinds until that point (without aforementioned patches LuaJIT just calls a panic function and exit). If an error is raised, and `lua_cpcall()` returns not `LUA_OK`, then the error from `port->L` coroutine is converted into a Tarantool error and a diagnostic is set. The auxiliary usage of `tarantool_L` coroutine doesn't respect Lua idiomatic of usage. Internal unwinder used on M1 is not such flexible, so such misuse leads to panic call. Also the `tarantool_L` usage is redundant. So this patch drops it and uses only minor coroutine instead with `lua_pcall()`. Functions to encode are saved as entrance in the `LUA_REGISTRY` table to reduce GC pressure, like it is done for other handlers [9]. [1]: https://www.lua.org/manual/5.2/manual.html#4.6 [2]: https://www.lua.org/source/5.1/lstate.h.html#lua_State [3]: https://www.lua.org/source/5.1/ldo.c.html#luaD_throw [4]: https://luajit.org/extensions.html#exceptions [5]: https://github.com/tarantool/luajit/commit/ed412cd9f55fe87fd32a69c86e1732690fc5c1b0 [6]: https://github.com/tarantool/luajit/commit/97699d9ee2467389b6aea21a098e38aff3469b5f [7]: https://github.com/tarantool/tarantool/issues/1516 [8]: https://www.freelists.org/post/luajit/Issue-with-PCALL-in-21 [9]: https://github.com/tarantool/tarantool/commit/e88c0d21ab765d4c53bed2437c49d77b3ffe4216 Closes #6248 Closes #4617 =================================================================== > > --- > > > > Branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-noticket-refactor-lua-call > > See the benchmarks sources here [1]. > > > > Before patch: > > | Encode map: 189851357 mcs, 15.8 K ps > > | Encode seq: 187926351 mcs, 16.0 K ps > > | Encode str: 185451675 mcs, 16.2 K ps > > | Encode dig: 184833396 mcs, 16.2 K ps > > > > After patch: > > | Encode map: 187814261 mcs, 16.0 K ps > > | Encode seq: 183755028 mcs, 16.3 K ps > > | Encode str: 181571626 mcs, 16.5 K ps > > | Encode dig: 181572998 mcs, 16.5 K ps > > > > Looks like the perf doesn't degrade at least. > > At first, I would like to emphasize that we have no option for merging > or not the fix for this issue. > > Re benchmarks: It's worth to mention you're measuring two performance > critical changes: <lua_insert> effect and lower GC pressure. So, it's > interesting to see the following benchmarks: > * one with disabled GC and GC stats Here the results with disabled GC: Before patch: Encode map: 4679394 mcs, 21.4 K ps Encode seq: 4559824 mcs, 21.9 K ps Encode str: 4574213 mcs, 21.9 K ps Encode dig: 4595043 mcs, 21.8 K ps Encode mul: 5978444 mcs, 16.7 K ps After: Encode map: 4739110 mcs, 21.1 K ps Encode seq: 4528261 mcs, 22.1 K ps Encode str: 4576910 mcs, 21.8 K ps Encode dig: 4506142 mcs, 22.2 K ps Encode mul: 6016659 mcs, 16.6 K ps I suppose, that values are almost the same, at least within the margin of error. Note: I reduced amount of iterations 30 times. So inaccuracy increased. > * one with considerable amount of elements on Lua stack, but not > triggering stack resize (AFAIU, 200 is too much) Tried with 40 items on the stack: Without GC: Master: Encode mul: 4895280 mcs, 20.4 K ps Branch: Encode mul: 4896076 mcs, 20.4 K ps With GC: Master: Encode mul: 5123580 mcs, 19.5 K ps Branch: Encode mul: 5050863 mcs, 19.8 K ps Seems pretty equal too. > > Here are my points: > * There is no such huge increase as a result of reducing GC pressure > * Moving 1-5 8-byte elements is neglible for performance > * Moving 200(*) elements as a result of the guest stack resize affects > both patched and vanilla versions > * <lua_insert> measurements are affected by resizing (considering your > perf stats) > > Anyway, though these are kinda independent changes, we see no > performance degradation using both of them in a single patch, so I guess > we have no reason to worry about. > > (*) I'm not sure about the exact amount of the elements to be moved. Exactly 200. > > > > > [1]: https://gist.github.com/Buristan/3e6d6bf2c722874bec55a8c5a44b98f3 > > > > src/box/lua/call.c | 71 ++++++++++++++++++++++++++++++++++++---------- > > 1 file changed, 56 insertions(+), 15 deletions(-) > > > > diff --git a/src/box/lua/call.c b/src/box/lua/call.c > > index 0315e720c..3b2572096 100644 > > --- a/src/box/lua/call.c > > +++ b/src/box/lua/call.c > > @@ -61,6 +61,8 @@ enum handlers { > > HANDLER_CALL, > > HANDLER_CALL_BY_REF, > > HANDLER_EVAL, > > + HANDLER_ENCODE_CALL, > > + HANDLER_ENCODE_CALL_16, > > Minor: Entries are not sorted. Fixed here and below. See the iterative patch: =================================================================== diff --git a/src/box/lua/call.c b/src/box/lua/call.c index 3b2572096..5db17359d 100644 --- a/src/box/lua/call.c +++ b/src/box/lua/call.c @@ -60,9 +60,9 @@ enum handlers { HANDLER_CALL, HANDLER_CALL_BY_REF, - HANDLER_EVAL, HANDLER_ENCODE_CALL, HANDLER_ENCODE_CALL_16, + HANDLER_EVAL, HANDLER_MAX, }; @@ -1087,9 +1087,9 @@ box_lua_call_init(struct lua_State *L) lua_CFunction handles[] = { [HANDLER_CALL] = execute_lua_call, [HANDLER_CALL_BY_REF] = execute_lua_call_by_ref, - [HANDLER_EVAL] = execute_lua_eval, [HANDLER_ENCODE_CALL] = encode_lua_call, [HANDLER_ENCODE_CALL_16] = encode_lua_call_16, + [HANDLER_EVAL] = execute_lua_eval, }; for (int i = 0; i < HANDLER_MAX; i++) { =================================================================== > > > HANDLER_MAX, > > }; > > > > @@ -400,11 +402,26 @@ struct encode_lua_ctx { > > struct mpstream *stream; > > }; > > > > +/** > > + * Encode call results to msgpack from Lua stack. > > + * Lua stack has the following structure -- the last element is > > + * lightuserdata pointer to encode_lua_ctx, all other values are > > + * arguments to proceed. > > Typo: s/proceed/process/. Fixes. See the iterative patch below. =================================================================== diff --git a/src/box/lua/call.c b/src/box/lua/call.c index 5db17359d..857b57165 100644 --- a/src/box/lua/call.c +++ b/src/box/lua/call.c @@ -406,7 +406,7 @@ struct encode_lua_ctx { * Encode call results to msgpack from Lua stack. * Lua stack has the following structure -- the last element is * lightuserdata pointer to encode_lua_ctx, all other values are - * arguments to proceed. + * arguments to process. * The function encodes all given Lua objects to msgpack stream * from context, sets port's size and returns no value on the Lua * stack. @@ -442,7 +442,7 @@ encode_lua_call(lua_State *L) * Encode call_16 results to msgpack from Lua stack. * Lua stack has the following structure -- the last element is * lightuserdata pointer to encode_lua_ctx, all other values are - * arguments to proceed. + * arguments to process. * The function encodes all given Lua objects to msgpack stream * from context, sets port's size and returns no value on the Lua * stack. =================================================================== > > > + * The function encodes all given Lua objects to msgpack stream > > + * from context, sets port's size and returns no value on the Lua > > + * stack. > > + * XXX: This function *MUST* be called under lua_pcall(), because > > + * luamp_encode() may raise an error. > > + */ > > <snipped> > > > +/** > > + * Encode call_16 results to msgpack from Lua stack. > > + * Lua stack has the following structure -- the last element is > > + * lightuserdata pointer to encode_lua_ctx, all other values are > > + * arguments to proceed. > > Ditto. > > > + * The function encodes all given Lua objects to msgpack stream > > + * from context, sets port's size and returns no value on the Lua > > + * stack. > > + * XXX: This function *MUST* be called under lua_pcall(), because > > + * luamp_encode() may raise an error. > > + */ > > <snipped> > > > @@ -450,13 +482,20 @@ port_lua_do_dump(struct port *base, struct mpstream *stream, > > struct encode_lua_ctx ctx; > > ctx.port = port; > > ctx.stream = stream; > > - struct lua_State *L = tarantool_L; > > - int top = lua_gettop(L); > > - if (lua_cpcall(L, handler, &ctx) != 0) { > > - luaT_toerror(port->L); > > + lua_State *L = port->L; > > + /* > > + * At the moment Lua stack holds only values to encode. > > + * Insert corresponding encoder to the bottom and push > > + * encode context as lightuserdata to the top. > > + */ > > + const int size = lua_gettop(L); > > + lua_rawgeti(L, LUA_REGISTRYINDEX, execute_lua_refs[handler]); > > + assert(lua_isfunction(L, -1) && lua_iscfunction(L, -1)); > > + lua_insert(L, 1); > > Side note: I don't like this line, like all we do. The only better > approach in my head is to implement something similar to VARG frame, but > I doubt this approach have no pure Lua violations. Let's return to this > later a bit. > > > + lua_pushlightuserdata(L, &ctx); > > + /* nargs -- all arguments + lightuserdata. */ > > + if (luaT_call(L, size + 1, 0) != 0) > > return -1; > > - } > > - lua_settop(L, top); > > return port->size; > > } > > > > <snipped> > > > @@ -1049,6 +1088,8 @@ box_lua_call_init(struct lua_State *L) > > [HANDLER_CALL] = execute_lua_call, > > [HANDLER_CALL_BY_REF] = execute_lua_call_by_ref, > > [HANDLER_EVAL] = execute_lua_eval, > > + [HANDLER_ENCODE_CALL] = encode_lua_call, > > + [HANDLER_ENCODE_CALL_16] = encode_lua_call_16, > > Minor: Entries are also not sorted. > > > }; > > > > for (int i = 0; i < HANDLER_MAX; i++) { > > -- > > 2.31.0 > > > > -- > Best regards, > IM -- Best regards, Sergey Kaplun ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Tarantool-patches] [PATCH] lua: refactor port_lua_do_dump and encode_lua_call 2021-08-02 14:25 ` Sergey Kaplun via Tarantool-patches @ 2021-08-12 17:35 ` Igor Munkin via Tarantool-patches 2021-08-13 7:30 ` Sergey Kaplun via Tarantool-patches 0 siblings, 1 reply; 13+ messages in thread From: Igor Munkin via Tarantool-patches @ 2021-08-12 17:35 UTC (permalink / raw) To: Sergey Kaplun; +Cc: Vladislav Shpilevoy, tarantool-patches Sergey, Thanks for the fixes! LGTM, with a several typos in the commit message mentioned below. Moreover, please rebase your branch on the current master to check nothing is broken. On 02.08.21, Sergey Kaplun wrote: > Hi, Igor! > > Thanks for the review! > <snipped> > > The new commit message is the following: > > =================================================================== > lua: refactor port_lua_do_dump and encode_lua_call > > The old code flow was the following: > > 1) `struct port_lua` given to `port_lua_do_dump()` has Lua stack with > arguments to encode to MessagePack. > > 2) The main coroutine `tarantool_L` is used to call `encode_lua_call()` > or `encode_lua_call_16`() via `lua_cpcall()`. > > 3) Objects on port coroutine are encoded via `luamp_encode()` or > `luamp_encode_call16()`. > > 4) This encoding may raise an error on unprotected `port->L` coroutine. > This coroutine has no protected frame on it and this call should fail > in pure Lua. > > Calling anything on unprotected coroutine is not allowed in Lua [1]: > > | If an error happens outside any protected environment, Lua calls a > | panic function > > Lua 5.1 sets protection only for specific lua_State [2] and calls a > panic function if we raise an error on unprotected lua_State [3]. > > Nevertheless, no panic occurs now due to two facts: > * The first one is LuaJIT's support of C++ exception handling [4] that > allows to raise an error in Lua and catch it in C++ or vice versa. But > documentation still doesn't allow raising errors on unprotected > coroutines (at least we must use try-catch block). > * The second one is the patch made in LuaJIT to restore currently > executed coroutine, when C function or fast function raises an > error [5][6] (see the related issue here [7][8]). > > For these reasons, when an error occurs, the unwinder searches and finds > the C-protected stack frame from the `lua_cpcall()` for `tarantool_L` > coroutine and unwinds until that point (without aforementioned patches > LuaJIT just calls a panic function and exit). Typo: s/exit/exits/. > > If an error is raised, and `lua_cpcall()` returns not `LUA_OK`, then the > error from `port->L` coroutine is converted into a Tarantool error and a > diagnostic is set. > > The auxiliary usage of `tarantool_L` coroutine doesn't respect Lua > idiomatic of usage. Internal unwinder used on M1 is not such flexible, Typo: Too much "usage", so I propose the following wording for the previous sentence: | Such auxiliary usage of `tarantool_L` is not idiomatic for Lua. > so such misuse leads to panic call. Also the `tarantool_L` usage is > redundant. So this patch drops it and uses only minor coroutine instead Typo: Again, not minor coroutine, but port coroutine (as we agreed in the previous review). > with `lua_pcall()`. > > Functions to encode are saved as entrance in the `LUA_REGISTRY` table to Typo: s/saved as entrance in/saved to/. > reduce GC pressure, like it is done for other handlers [9]. > > [1]: https://www.lua.org/manual/5.2/manual.html#4.6 > [2]: https://www.lua.org/source/5.1/lstate.h.html#lua_State > [3]: https://www.lua.org/source/5.1/ldo.c.html#luaD_throw > [4]: https://luajit.org/extensions.html#exceptions > [5]: https://github.com/tarantool/luajit/commit/ed412cd9f55fe87fd32a69c86e1732690fc5c1b0 > [6]: https://github.com/tarantool/luajit/commit/97699d9ee2467389b6aea21a098e38aff3469b5f > [7]: https://github.com/tarantool/tarantool/issues/1516 > [8]: https://www.freelists.org/post/luajit/Issue-with-PCALL-in-21 > [9]: https://github.com/tarantool/tarantool/commit/e88c0d21ab765d4c53bed2437c49d77b3ffe4216 > > Closes #6248 > Closes #4617 > =================================================================== > > > > --- > > > > > > Branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-noticket-refactor-lua-call > > > See the benchmarks sources here [1]. > > > > > > Before patch: > > > | Encode map: 189851357 mcs, 15.8 K ps > > > | Encode seq: 187926351 mcs, 16.0 K ps > > > | Encode str: 185451675 mcs, 16.2 K ps > > > | Encode dig: 184833396 mcs, 16.2 K ps > > > > > > After patch: > > > | Encode map: 187814261 mcs, 16.0 K ps > > > | Encode seq: 183755028 mcs, 16.3 K ps > > > | Encode str: 181571626 mcs, 16.5 K ps > > > | Encode dig: 181572998 mcs, 16.5 K ps > > > > > > Looks like the perf doesn't degrade at least. > > > > At first, I would like to emphasize that we have no option for merging > > or not the fix for this issue. > > > > Re benchmarks: It's worth to mention you're measuring two performance > > critical changes: <lua_insert> effect and lower GC pressure. So, it's > > interesting to see the following benchmarks: > > * one with disabled GC and GC stats > > Here the results with disabled GC: > Before patch: > > Encode map: 4679394 mcs, 21.4 K ps > Encode seq: 4559824 mcs, 21.9 K ps > Encode str: 4574213 mcs, 21.9 K ps > Encode dig: 4595043 mcs, 21.8 K ps > Encode mul: 5978444 mcs, 16.7 K ps > > After: > > Encode map: 4739110 mcs, 21.1 K ps > Encode seq: 4528261 mcs, 22.1 K ps > Encode str: 4576910 mcs, 21.8 K ps > Encode dig: 4506142 mcs, 22.2 K ps > Encode mul: 6016659 mcs, 16.6 K ps > > I suppose, that values are almost the same, at least within the margin > of error. > Note: I reduced amount of iterations 30 times. So inaccuracy increased. > > > * one with considerable amount of elements on Lua stack, but not > > triggering stack resize (AFAIU, 200 is too much) > > Tried with 40 items on the stack: > > Without GC: > > Master: > Encode mul: 4895280 mcs, 20.4 K ps > > Branch: > Encode mul: 4896076 mcs, 20.4 K ps > > With GC: > > Master: > Encode mul: 5123580 mcs, 19.5 K ps > > Branch: > Encode mul: 5050863 mcs, 19.8 K ps > > Seems pretty equal too. Mystery. Anyway, the current performance is not lost and this is great! > > > > > Here are my points: > > * There is no such huge increase as a result of reducing GC pressure > > * Moving 1-5 8-byte elements is neglible for performance > > * Moving 200(*) elements as a result of the guest stack resize affects > > both patched and vanilla versions > > * <lua_insert> measurements are affected by resizing (considering your > > perf stats) > > > > Anyway, though these are kinda independent changes, we see no > > performance degradation using both of them in a single patch, so I guess > > we have no reason to worry about. > > > > (*) I'm not sure about the exact amount of the elements to be moved. > > Exactly 200. > <snipped> > > -- > Best regards, > Sergey Kaplun -- Best regards, IM ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Tarantool-patches] [PATCH] lua: refactor port_lua_do_dump and encode_lua_call 2021-08-12 17:35 ` Igor Munkin via Tarantool-patches @ 2021-08-13 7:30 ` Sergey Kaplun via Tarantool-patches 2021-08-13 7:41 ` Vitaliia Ioffe via Tarantool-patches 2021-08-13 9:27 ` Sergey Kaplun via Tarantool-patches 0 siblings, 2 replies; 13+ messages in thread From: Sergey Kaplun via Tarantool-patches @ 2021-08-13 7:30 UTC (permalink / raw) To: Igor Munkin; +Cc: Vladislav Shpilevoy, tarantool-patches Igor, On 12.08.21, Igor Munkin wrote: > Sergey, > > Thanks for the fixes! LGTM, with a several typos in the commit message > mentioned below. Moreover, please rebase your branch on the current > master to check nothing is broken. Branch is rebased, fixes are done, and branch is force-pushed :). > > On 02.08.21, Sergey Kaplun wrote: > > Hi, Igor! > > > > Thanks for the review! > > > > <snipped> > > > > > The new commit message is the following: > > > > =================================================================== > > lua: refactor port_lua_do_dump and encode_lua_call > > > > The old code flow was the following: > > > > 1) `struct port_lua` given to `port_lua_do_dump()` has Lua stack with > > arguments to encode to MessagePack. > > > > 2) The main coroutine `tarantool_L` is used to call `encode_lua_call()` > > or `encode_lua_call_16`() via `lua_cpcall()`. > > > > 3) Objects on port coroutine are encoded via `luamp_encode()` or > > `luamp_encode_call16()`. > > > > 4) This encoding may raise an error on unprotected `port->L` coroutine. > > This coroutine has no protected frame on it and this call should fail > > in pure Lua. > > > > Calling anything on unprotected coroutine is not allowed in Lua [1]: > > > > | If an error happens outside any protected environment, Lua calls a > > | panic function > > > > Lua 5.1 sets protection only for specific lua_State [2] and calls a > > panic function if we raise an error on unprotected lua_State [3]. > > > > Nevertheless, no panic occurs now due to two facts: > > * The first one is LuaJIT's support of C++ exception handling [4] that > > allows to raise an error in Lua and catch it in C++ or vice versa. But > > documentation still doesn't allow raising errors on unprotected > > coroutines (at least we must use try-catch block). > > * The second one is the patch made in LuaJIT to restore currently > > executed coroutine, when C function or fast function raises an > > error [5][6] (see the related issue here [7][8]). > > > > For these reasons, when an error occurs, the unwinder searches and finds > > the C-protected stack frame from the `lua_cpcall()` for `tarantool_L` > > coroutine and unwinds until that point (without aforementioned patches > > LuaJIT just calls a panic function and exit). > > Typo: s/exit/exits/. Fixed. > > > > > If an error is raised, and `lua_cpcall()` returns not `LUA_OK`, then the > > error from `port->L` coroutine is converted into a Tarantool error and a > > diagnostic is set. > > > > The auxiliary usage of `tarantool_L` coroutine doesn't respect Lua > > idiomatic of usage. Internal unwinder used on M1 is not such flexible, > > Typo: Too much "usage", so I propose the following wording for the > previous sentence: > | Such auxiliary usage of `tarantool_L` is not idiomatic for Lua. Done. > > > so such misuse leads to panic call. Also the `tarantool_L` usage is > > redundant. So this patch drops it and uses only minor coroutine instead > > Typo: Again, not minor coroutine, but port coroutine (as we agreed in > the previous review). Fixed. > > > with `lua_pcall()`. > > > > Functions to encode are saved as entrance in the `LUA_REGISTRY` table to > > Typo: s/saved as entrance in/saved to/. Fixed. > > > reduce GC pressure, like it is done for other handlers [9]. > > > > [1]: https://www.lua.org/manual/5.2/manual.html#4.6 > > [2]: https://www.lua.org/source/5.1/lstate.h.html#lua_State > > [3]: https://www.lua.org/source/5.1/ldo.c.html#luaD_throw > > [4]: https://luajit.org/extensions.html#exceptions > > [5]: https://github.com/tarantool/luajit/commit/ed412cd9f55fe87fd32a69c86e1732690fc5c1b0 > > [6]: https://github.com/tarantool/luajit/commit/97699d9ee2467389b6aea21a098e38aff3469b5f > > [7]: https://github.com/tarantool/tarantool/issues/1516 > > [8]: https://www.freelists.org/post/luajit/Issue-with-PCALL-in-21 > > [9]: https://github.com/tarantool/tarantool/commit/e88c0d21ab765d4c53bed2437c49d77b3ffe4216 > > > > Closes #6248 > > Closes #4617 > > =================================================================== > > > > > > --- > > > > > > > > Branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-noticket-refactor-lua-call <snipped> -- Best regards, Sergey Kaplun ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Tarantool-patches] [PATCH] lua: refactor port_lua_do_dump and encode_lua_call 2021-08-13 7:30 ` Sergey Kaplun via Tarantool-patches @ 2021-08-13 7:41 ` Vitaliia Ioffe via Tarantool-patches 2021-08-13 9:27 ` Sergey Kaplun via Tarantool-patches 1 sibling, 0 replies; 13+ messages in thread From: Vitaliia Ioffe via Tarantool-patches @ 2021-08-13 7:41 UTC (permalink / raw) To: Sergey Kaplun; +Cc: tarantool-patches, Vladislav Shpilevoy [-- Attachment #1: Type: text/plain, Size: 4552 bytes --] Hi team, QA LGTM! -- Vitaliia Ioffe >Пятница, 13 августа 2021, 10:32 +03:00 от Sergey Kaplun via Tarantool-patches <tarantool-patches@dev.tarantool.org>: > >Igor, > >On 12.08.21, Igor Munkin wrote: >> Sergey, >> >> Thanks for the fixes! LGTM, with a several typos in the commit message >> mentioned below. Moreover, please rebase your branch on the current >> master to check nothing is broken. > >Branch is rebased, fixes are done, and branch is force-pushed :). > >> >> On 02.08.21, Sergey Kaplun wrote: >> > Hi, Igor! >> > >> > Thanks for the review! >> > >> >> <snipped> >> >> > >> > The new commit message is the following: >> > >> > =================================================================== >> > lua: refactor port_lua_do_dump and encode_lua_call >> > >> > The old code flow was the following: >> > >> > 1) `struct port_lua` given to `port_lua_do_dump()` has Lua stack with >> > arguments to encode to MessagePack. >> > >> > 2) The main coroutine `tarantool_L` is used to call `encode_lua_call()` >> > or `encode_lua_call_16`() via `lua_cpcall()`. >> > >> > 3) Objects on port coroutine are encoded via `luamp_encode()` or >> > `luamp_encode_call16()`. >> > >> > 4) This encoding may raise an error on unprotected `port->L` coroutine. >> > This coroutine has no protected frame on it and this call should fail >> > in pure Lua. >> > >> > Calling anything on unprotected coroutine is not allowed in Lua [1]: >> > >> > | If an error happens outside any protected environment, Lua calls a >> > | panic function >> > >> > Lua 5.1 sets protection only for specific lua_State [2] and calls a >> > panic function if we raise an error on unprotected lua_State [3]. >> > >> > Nevertheless, no panic occurs now due to two facts: >> > * The first one is LuaJIT's support of C++ exception handling [4] that >> > allows to raise an error in Lua and catch it in C++ or vice versa. But >> > documentation still doesn't allow raising errors on unprotected >> > coroutines (at least we must use try-catch block). >> > * The second one is the patch made in LuaJIT to restore currently >> > executed coroutine, when C function or fast function raises an >> > error [5][6] (see the related issue here [7][8]). >> > >> > For these reasons, when an error occurs, the unwinder searches and finds >> > the C-protected stack frame from the `lua_cpcall()` for `tarantool_L` >> > coroutine and unwinds until that point (without aforementioned patches >> > LuaJIT just calls a panic function and exit). >> >> Typo: s/exit/exits/. > >Fixed. > >> >> > >> > If an error is raised, and `lua_cpcall()` returns not `LUA_OK`, then the >> > error from `port->L` coroutine is converted into a Tarantool error and a >> > diagnostic is set. >> > >> > The auxiliary usage of `tarantool_L` coroutine doesn't respect Lua >> > idiomatic of usage. Internal unwinder used on M1 is not such flexible, >> >> Typo: Too much "usage", so I propose the following wording for the >> previous sentence: >> | Such auxiliary usage of `tarantool_L` is not idiomatic for Lua. > >Done. > >> >> > so such misuse leads to panic call. Also the `tarantool_L` usage is >> > redundant. So this patch drops it and uses only minor coroutine instead >> >> Typo: Again, not minor coroutine, but port coroutine (as we agreed in >> the previous review). > >Fixed. > >> >> > with `lua_pcall()`. >> > >> > Functions to encode are saved as entrance in the `LUA_REGISTRY` table to >> >> Typo: s/saved as entrance in/saved to/. > >Fixed. > >> >> > reduce GC pressure, like it is done for other handlers [9]. >> > >> > [1]: https://www.lua.org/manual/5.2/manual.html#4.6 >> > [2]: https://www.lua.org/source/5.1/lstate.h.html#lua_State >> > [3]: https://www.lua.org/source/5.1/ldo.c.html#luaD_throw >> > [4]: https://luajit.org/extensions.html#exceptions >> > [5]: https://github.com/tarantool/luajit/commit/ed412cd9f55fe87fd32a69c86e1732690fc5c1b0 >> > [6]: https://github.com/tarantool/luajit/commit/97699d9ee2467389b6aea21a098e38aff3469b5f >> > [7]: https://github.com/tarantool/tarantool/issues/1516 >> > [8]: https://www.freelists.org/post/luajit/Issue-with-PCALL-in-21 >> > [9]: https://github.com/tarantool/tarantool/commit/e88c0d21ab765d4c53bed2437c49d77b3ffe4216 >> > >> > Closes #6248 >> > Closes #4617 >> > =================================================================== >> > >> > > > --- >> > > > >> > > > Branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-noticket-refactor-lua-call > ><snipped> > > >-- >Best regards, >Sergey Kaplun [-- Attachment #2: Type: text/html, Size: 6778 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Tarantool-patches] [PATCH] lua: refactor port_lua_do_dump and encode_lua_call 2021-08-13 7:30 ` Sergey Kaplun via Tarantool-patches 2021-08-13 7:41 ` Vitaliia Ioffe via Tarantool-patches @ 2021-08-13 9:27 ` Sergey Kaplun via Tarantool-patches 1 sibling, 0 replies; 13+ messages in thread From: Sergey Kaplun via Tarantool-patches @ 2021-08-13 9:27 UTC (permalink / raw) To: Igor Munkin, Vladislav Shpilevoy, Vitaliia Ioffe, tarantool-patches Also, add the branch for 1.10: https://github.com/tarantool/tarantool/tree/skaplun/gh-noticket-refactor-lua-call-1.10 The difference is that the port structure itself is pushed instead ctx structure on the stack. The patch is the following: =================================================================== commit 59eafb8e6d8f41de06bdb28b930c9ff1d3fbe63e Author: Sergey Kaplun <skaplun@tarantool.org> Date: Fri Jun 18 18:28:30 2021 +0300 lua: refactor port_lua_do_dump and encode_lua_call (cherry picked from commit 434d31bdb52d80384af55acd2c3a637e5154e257) The old code flow was the following: 1) `struct port_lua` given to `port_lua_do_dump()` has Lua stack with arguments to encode to MessagePack. 2) The main coroutine `tarantool_L` is used to call `encode_lua_call()` or `encode_lua_call_16`() via `lua_cpcall()`. 3) Objects on port coroutine are encoded via `luamp_encode()` or `luamp_encode_call16()`. 4) This encoding may raise an error on unprotected `port->L` coroutine. This coroutine has no protected frame on it and this call should fail in pure Lua. Calling anything on unprotected coroutine is not allowed in Lua [1]: | If an error happens outside any protected environment, Lua calls a | panic function Lua 5.1 sets protection only for specific lua_State [2] and calls a panic function if we raise an error on unprotected lua_State [3]. Nevertheless, no panic occurs now due to two facts: * The first one is LuaJIT's support of C++ exception handling [4] that allows to raise an error in Lua and catch it in C++ or vice versa. But documentation still doesn't allow raising errors on unprotected coroutines (at least we must use try-catch block). * The second one is the patch made in LuaJIT to restore currently executed coroutine, when C function or fast function raises an error [5][6] (see the related issue here [7][8]). For these reasons, when an error occurs, the unwinder searches and finds the C-protected stack frame from the `lua_cpcall()` for `tarantool_L` coroutine and unwinds until that point (without aforementioned patches LuaJIT just calls a panic function and exits). If an error is raised, and `lua_cpcall()` returns not `LUA_OK`, then the error from `port->L` coroutine is converted into a Tarantool error and a diagnostic is set. Such auxiliary usage of `tarantool_L` is not idiomatic for Lua. Internal unwinder used on M1 is not such flexible, so such misuse leads to panic call. Also the `tarantool_L` usage is redundant. So this patch drops it and uses only port coroutine instead with `lua_pcall()`. Functions to encode are saved to the `LUA_REGISTRY` table to reduce GC pressure, like it is done for other handlers [9]. [1]: https://www.lua.org/manual/5.2/manual.html#4.6 [2]: https://www.lua.org/source/5.1/lstate.h.html#lua_State [3]: https://www.lua.org/source/5.1/ldo.c.html#luaD_throw [4]: https://luajit.org/extensions.html#exceptions [5]: https://github.com/tarantool/luajit/commit/ed412cd9f55fe87fd32a69c86e1732690fc5c1b0 [6]: https://github.com/tarantool/luajit/commit/97699d9ee2467389b6aea21a098e38aff3469b5f [7]: https://github.com/tarantool/tarantool/issues/1516 [8]: https://www.freelists.org/post/luajit/Issue-with-PCALL-in-21 [9]: https://github.com/tarantool/tarantool/commit/e88c0d21ab765d4c53bed2437c49d77b3ffe4216 Closes #6248 Closes #4617 diff --git a/src/box/lua/call.c b/src/box/lua/call.c index 593317265..f173d97b7 100644 --- a/src/box/lua/call.c +++ b/src/box/lua/call.c @@ -53,6 +53,8 @@ */ enum handlers { HANDLER_CALL, + HANDLER_ENCODE_CALL, + HANDLER_ENCODE_CALL_16, HANDLER_EVAL, HANDLER_MAX, }; @@ -347,10 +349,26 @@ execute_lua_eval(lua_State *L) return lua_gettop(L); } + +/** + * Encode call results to msgpack from Lua stack. + * Lua stack has the following structure -- the last element is + * lightuserdata pointer to encode_lua_ctx, all other values are + * arguments to process. + * The function encodes all given Lua objects to msgpack stream + * from context, sets port's size and returns no value on the Lua + * stack. + * XXX: This function *MUST* be called under lua_pcall(), because + * luamp_encode() may raise an error. + */ static int encode_lua_call(lua_State *L) { + assert(lua_islightuserdata(L, -1)); struct port_lua *port = (struct port_lua *) lua_topointer(L, -1); + assert(port->L == L); + /* Delete port from the stack. */ + lua_pop(L, 1); /* * Add all elements from Lua stack to the buffer. * @@ -361,18 +379,33 @@ encode_lua_call(lua_State *L) luamp_error, port->L); struct luaL_serializer *cfg = luaL_msgpack_default; - int size = lua_gettop(port->L); + const int size = lua_gettop(L); for (int i = 1; i <= size; ++i) - luamp_encode(port->L, cfg, &stream, i); + luamp_encode(L, cfg, &stream, i); port->size = size; mpstream_flush(&stream); return 0; } +/** + * Encode call_16 results to msgpack from Lua stack. + * Lua stack has the following structure -- the last element is + * lightuserdata pointer to encode_lua_ctx, all other values are + * arguments to process. + * The function encodes all given Lua objects to msgpack stream + * from context, sets port's size and returns no value on the Lua + * stack. + * XXX: This function *MUST* be called under lua_pcall(), because + * luamp_encode() may raise an error. + */ static int encode_lua_call_16(lua_State *L) { + assert(lua_islightuserdata(L, -1)); struct port_lua *port = (struct port_lua *) lua_topointer(L, -1); + assert(port->L == L); + /* Delete port from the stack. */ + lua_pop(L, 1); /* * Add all elements from Lua stack to the buffer. * @@ -383,42 +416,45 @@ encode_lua_call_16(lua_State *L) luamp_error, port->L); struct luaL_serializer *cfg = luaL_msgpack_default; - port->size = luamp_encode_call_16(port->L, cfg, &stream); + port->size = luamp_encode_call_16(L, cfg, &stream); mpstream_flush(&stream); return 0; } static inline int -port_lua_do_dump(struct port *base, struct obuf *out, lua_CFunction handler) +port_lua_do_dump(struct port *base, struct obuf *out, enum handlers handler) { struct port_lua *port = (struct port_lua *)base; assert(port->vtab == &port_lua_vtab); /* Use port to pass arguments to encoder quickly. */ port->out = out; + lua_State *L = port->L; /* - * Use the same global state, assuming the encoder doesn't - * yield. + * At the moment Lua stack holds only values to encode. + * Insert corresponding encoder to the bottom and push + * encode context as lightuserdata to the top. */ - struct lua_State *L = tarantool_L; - int top = lua_gettop(L); - if (lua_cpcall(L, handler, port) != 0) { - luaT_toerror(port->L); + const int size = lua_gettop(L); + lua_rawgeti(L, LUA_REGISTRYINDEX, execute_lua_refs[handler]); + assert(lua_isfunction(L, -1) && lua_iscfunction(L, -1)); + lua_insert(L, 1); + lua_pushlightuserdata(L, port); + /* nargs -- all arguments + lightuserdata. */ + if (luaT_call(L, size + 1, 0) != 0) return -1; - } - lua_settop(L, top); return port->size; } static int port_lua_dump(struct port *base, struct obuf *out) { - return port_lua_do_dump(base, out, encode_lua_call); + return port_lua_do_dump(base, out, HANDLER_ENCODE_CALL); } static int port_lua_dump_16(struct port *base, struct obuf *out) { - return port_lua_do_dump(base, out, encode_lua_call_16); + return port_lua_do_dump(base, out, HANDLER_ENCODE_CALL_16); } static void @@ -499,6 +535,8 @@ box_lua_call_init(struct lua_State *L) lua_CFunction handles[] = { [HANDLER_CALL] = execute_lua_call, + [HANDLER_ENCODE_CALL] = encode_lua_call, + [HANDLER_ENCODE_CALL_16] = encode_lua_call_16, [HANDLER_EVAL] = execute_lua_eval, }; =================================================================== -- Best regards, Sergey Kaplun ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Tarantool-patches] [PATCH] lua: refactor port_lua_do_dump and encode_lua_call 2021-06-18 18:14 [Tarantool-patches] [PATCH] lua: refactor port_lua_do_dump and encode_lua_call Sergey Kaplun via Tarantool-patches 2021-06-21 20:36 ` Vladislav Shpilevoy via Tarantool-patches 2021-08-01 12:34 ` Igor Munkin via Tarantool-patches @ 2021-08-04 22:29 ` Vladislav Shpilevoy via Tarantool-patches 2021-08-14 10:16 ` Igor Munkin via Tarantool-patches 3 siblings, 0 replies; 13+ messages in thread From: Vladislav Shpilevoy via Tarantool-patches @ 2021-08-04 22:29 UTC (permalink / raw) To: Sergey Kaplun, Igor Munkin; +Cc: tarantool-patches Hi! Thanks for the patch! LGTM. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Tarantool-patches] [PATCH] lua: refactor port_lua_do_dump and encode_lua_call 2021-06-18 18:14 [Tarantool-patches] [PATCH] lua: refactor port_lua_do_dump and encode_lua_call Sergey Kaplun via Tarantool-patches ` (2 preceding siblings ...) 2021-08-04 22:29 ` Vladislav Shpilevoy via Tarantool-patches @ 2021-08-14 10:16 ` Igor Munkin via Tarantool-patches 3 siblings, 0 replies; 13+ messages in thread From: Igor Munkin via Tarantool-patches @ 2021-08-14 10:16 UTC (permalink / raw) To: Sergey Kaplun; +Cc: Vladislav Shpilevoy, tarantool-patches Sergey, I've checked the patch into 1.10, 2.7, 2.8 and master. On 18.06.21, Sergey Kaplun wrote: > The old code flow was the following: > 1) `struct port_lua` given to `port_lua_do_dump()` has Lua stack with > arguments to encode to MessagePack > 2) The main coroutine `tarantool_L` is used with `lua_cpcall()` to call > `encode_lua_call_16()` or `encode_lua_call()` > 3) Objects on minor coroutine are encoded via `luamp_encode_call16()` or > `luamp_encode()`. This encoding may raise an error on unprotected > `port->L` coroutine. This coroutine has no protected frame on it > and this call should fail in pure Lua. It is forbidden to call > anything on unprotected coroutine [1] (Lua 5.1 sets protection only > for specific lua_State [2] and calls a panic function if we raise an > error on unprotected lua_State [3]). Netherless, there is no panic > at now due to two facts. The first one is LuaJIT's support of C++ > exception handling [4] that allows to raise an error in Lua and > catch it in C++ or vice versa. But documentation still doesn't > permit errors on unprotected coroutines (at least we must set > try-catch block). The second one is double monkey-patching of LuaJIT > to restore currently executed coroutine, when C function or fast > function raises an error [5][6] (see related issue here [7][8]). > For these reasons, when an error occurs, the unwinder searches and > finds the C-protected stack frame from the `lua_cpcall()` for the > `tarantool_L` coroutine and unwinds until that point (without > aforementioned patches LuaJIT just calls a panic function and exit). > 4) If an error is raised, and `lua_cpcall()` returns not `LUA_OK`, then > the error from `port->L` coroutine is converted into a Tarantool error > and a diagnostic is set. > > The auxiliary usage of `tarantool_L` coroutine is redundant and doesn't > respect Lua idiomatic of usage. So this patch drops it and uses only > minor coroutine instead with `lua_pcall()`. > > Functions to encode are saved as entrance in the `LUA_REGISTRY` table > to reduce GC pressure, like it is done for other handlers [9]. > > [1]: https://www.lua.org/manual/5.2/manual.html#4.6 > > If an error happens outside any protected environment, Lua calls a > > panic function > [2]: https://www.lua.org/source/5.1/lstate.h.html#lua_State > [3]: https://www.lua.org/source/5.1/ldo.c.html#luaD_throw > [4]: https://luajit.org/extensions.html#exceptions > [5]: https://github.com/tarantool/luajit/commit/ed412cd9f55fe87fd32a69c86e1732690fc5c1b0 > [6]: https://github.com/tarantool/luajit/commit/97699d9ee2467389b6aea21a098e38aff3469b5f > [7]: https://github.com/tarantool/tarantool/issues/1516 > [8]: https://www.freelists.org/post/luajit/Issue-with-PCALL-in-21 > [9]: https://github.com/tarantool/tarantool/commit/e88c0d21ab765d4c53bed2437c49d77b3ffe4216 > --- > > Branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-noticket-refactor-lua-call > See the benchmarks sources here [1]. > > Before patch: > | Encode map: 189851357 mcs, 15.8 K ps > | Encode seq: 187926351 mcs, 16.0 K ps > | Encode str: 185451675 mcs, 16.2 K ps > | Encode dig: 184833396 mcs, 16.2 K ps > > After patch: > | Encode map: 187814261 mcs, 16.0 K ps > | Encode seq: 183755028 mcs, 16.3 K ps > | Encode str: 181571626 mcs, 16.5 K ps > | Encode dig: 181572998 mcs, 16.5 K ps > > Looks like the perf doesn't degrade at least. > > [1]: https://gist.github.com/Buristan/3e6d6bf2c722874bec55a8c5a44b98f3 > > src/box/lua/call.c | 71 ++++++++++++++++++++++++++++++++++++---------- > 1 file changed, 56 insertions(+), 15 deletions(-) > <snipped> > -- > 2.31.0 > -- Best regards, IM ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2021-08-14 10:40 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-06-18 18:14 [Tarantool-patches] [PATCH] lua: refactor port_lua_do_dump and encode_lua_call Sergey Kaplun via Tarantool-patches 2021-06-21 20:36 ` Vladislav Shpilevoy via Tarantool-patches 2021-06-22 13:38 ` Sergey Kaplun via Tarantool-patches 2021-06-24 20:00 ` Vladislav Shpilevoy via Tarantool-patches 2021-06-29 7:07 ` Sergey Kaplun via Tarantool-patches 2021-08-01 12:34 ` Igor Munkin via Tarantool-patches 2021-08-02 14:25 ` Sergey Kaplun via Tarantool-patches 2021-08-12 17:35 ` Igor Munkin via Tarantool-patches 2021-08-13 7:30 ` Sergey Kaplun via Tarantool-patches 2021-08-13 7:41 ` Vitaliia Ioffe via Tarantool-patches 2021-08-13 9:27 ` Sergey Kaplun via Tarantool-patches 2021-08-04 22:29 ` Vladislav Shpilevoy via Tarantool-patches 2021-08-14 10:16 ` Igor Munkin via Tarantool-patches
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox