[Tarantool-patches] [PATCH] lua: refactor port_lua_do_dump and encode_lua_call

Sergey Kaplun skaplun at tarantool.org
Mon Aug 2 17:25:56 MSK 2021


Hi, Igor!

Thanks for the review!

On 01.08.21, Igor Munkin wrote:
> Sergey,
> 
> Thanks for the patch! The changes are fine, but please consider the
> comments below.
> 
> On 18.06.21, Sergey Kaplun wrote:
> > The old code flow was the following:
> > 1) `struct port_lua` given to `port_lua_do_dump()` has Lua stack with
> >    arguments to encode to MessagePack
> > 2) The main coroutine `tarantool_L` is used with `lua_cpcall()` to call
> >    `encode_lua_call_16()` or `encode_lua_call()`
> > 3) Objects on minor coroutine are encoded via `luamp_encode_call16()` or
> 
> What is this minor coroutine? To make terms more consistent in scope of
> this commit message, let's define this in the first bullet and use it
> everywhere. For me the best option is "port coroutine".

Fixed.

> 
> >    `luamp_encode()`. This encoding may raise an error on unprotected
> 
> The next bullet is recommended to be started from this sentence.

Fixed.

> 
> >    `port->L` coroutine. This coroutine has no protected frame on it
> >    and this call should fail in pure Lua. It is forbidden to call
> 
> Ditto.

Fixed.

> 
> >    anything on unprotected coroutine [1] (Lua 5.1 sets protection only
> >    for specific lua_State [2] and calls a panic function if we raise an
> >    error on unprotected lua_State [3]). Netherless, there is no panic
> 
> Ditto.

Fixed.

> 
> >    at now due to two facts. The first one is LuaJIT's support of C++
> 
> And sublist is strongly recommended for "the first" and "the second"
> facts. It'll make this War and Peace part more readable.

Done.

> 
> >    exception handling [4] that allows to raise an error in Lua and
> >    catch it in C++ or vice versa. But documentation still doesn't
> >    permit errors on unprotected coroutines (at least we must set
> >    try-catch block). The second one is double monkey-patching of LuaJIT
> 
> I doubt this is monkey-patching, but rather a monkey's patch.

Changed to patching.

> 
> >    to restore currently executed coroutine, when C function or fast
> >    function raises an error [5][6] (see related issue here [7][8]).
> >    For these reasons, when an error occurs, the unwinder searches and
> >    finds the C-protected stack frame from the `lua_cpcall()` for the
> >    `tarantool_L` coroutine and unwinds until that point (without
> >    aforementioned patches LuaJIT just calls a panic function and exit).
> > 4) If an error is raised, and `lua_cpcall()` returns not `LUA_OK`, then
> >    the error from `port->L` coroutine is converted into a Tarantool error
> >    and a diagnostic is set.
> > 
> > The auxiliary usage of `tarantool_L` coroutine is redundant and doesn't
> 
> Well, as for me, the main problem is violating Lua coroutine practice
> (it's worth to mention, that internal unwinder used on M1 is not such
> flexible, so this misuse still leads to panic call), and the minor
> reason is tarantool_L usage redundancy.

Changed considering your comments.

> 
> > respect Lua idiomatic of usage. So this patch drops it and uses only
> > minor coroutine instead with `lua_pcall()`.
> > 
> > Functions to encode are saved as entrance in the `LUA_REGISTRY` table
> > to reduce GC pressure, like it is done for other handlers [9].
> > 
> > [1]: https://www.lua.org/manual/5.2/manual.html#4.6
> > > If an error happens outside any protected environment, Lua calls a
> > > panic function
> 
> Some copy-pasting artefacts.

Moved upwards.

> 
> > [2]: https://www.lua.org/source/5.1/lstate.h.html#lua_State
> > [3]: https://www.lua.org/source/5.1/ldo.c.html#luaD_throw
> > [4]: https://luajit.org/extensions.html#exceptions
> > [5]: https://github.com/tarantool/luajit/commit/ed412cd9f55fe87fd32a69c86e1732690fc5c1b0
> > [6]: https://github.com/tarantool/luajit/commit/97699d9ee2467389b6aea21a098e38aff3469b5f
> > [7]: https://github.com/tarantool/tarantool/issues/1516
> > [8]: https://www.freelists.org/post/luajit/Issue-with-PCALL-in-21
> > [9]: https://github.com/tarantool/tarantool/commit/e88c0d21ab765d4c53bed2437c49d77b3ffe4216
> 
> This patch resolves #6248, BTW.

Added.

> 

The new commit message is the following:

===================================================================
lua: refactor port_lua_do_dump and encode_lua_call

The old code flow was the following:

1) `struct port_lua` given to `port_lua_do_dump()` has Lua stack with
   arguments to encode to MessagePack.

2) The main coroutine `tarantool_L` is used to call `encode_lua_call()`
   or `encode_lua_call_16`() via `lua_cpcall()`.

3) Objects on port coroutine are encoded via `luamp_encode()` or
   `luamp_encode_call16()`.

4) This encoding may raise an error on unprotected `port->L` coroutine.
   This coroutine has no protected frame on it and this call should fail
   in pure Lua.

Calling anything on unprotected coroutine is not allowed in Lua [1]:

| If an error happens outside any protected environment, Lua calls a
| panic function

Lua 5.1 sets protection only for specific lua_State [2] and calls a
panic function if we raise an error on unprotected lua_State [3].

Nevertheless, no panic occurs now due to two facts:
* The first one is LuaJIT's support of C++ exception handling [4] that
  allows to raise an error in Lua and catch it in C++ or vice versa. But
  documentation still doesn't allow raising errors on unprotected
  coroutines (at least we must use try-catch block).
* The second one is the patch made in LuaJIT to restore currently
  executed coroutine, when C function or fast function raises an
  error [5][6] (see the related issue here [7][8]).

For these reasons, when an error occurs, the unwinder searches and finds
the C-protected stack frame from the `lua_cpcall()` for `tarantool_L`
coroutine and unwinds until that point (without aforementioned patches
LuaJIT just calls a panic function and exit).

If an error is raised, and `lua_cpcall()` returns not `LUA_OK`, then the
error from `port->L` coroutine is converted into a Tarantool error and a
diagnostic is set.

The auxiliary usage of `tarantool_L` coroutine doesn't respect Lua
idiomatic of usage. Internal unwinder used on M1 is not such flexible,
so such misuse leads to panic call. Also the `tarantool_L` usage is
redundant. So this patch drops it and uses only minor coroutine instead
with `lua_pcall()`.

Functions to encode are saved as entrance in the `LUA_REGISTRY` table to
reduce GC pressure, like it is done for other handlers [9].

[1]: https://www.lua.org/manual/5.2/manual.html#4.6
[2]: https://www.lua.org/source/5.1/lstate.h.html#lua_State
[3]: https://www.lua.org/source/5.1/ldo.c.html#luaD_throw
[4]: https://luajit.org/extensions.html#exceptions
[5]: https://github.com/tarantool/luajit/commit/ed412cd9f55fe87fd32a69c86e1732690fc5c1b0
[6]: https://github.com/tarantool/luajit/commit/97699d9ee2467389b6aea21a098e38aff3469b5f
[7]: https://github.com/tarantool/tarantool/issues/1516
[8]: https://www.freelists.org/post/luajit/Issue-with-PCALL-in-21
[9]: https://github.com/tarantool/tarantool/commit/e88c0d21ab765d4c53bed2437c49d77b3ffe4216

Closes #6248
Closes #4617
===================================================================

> > ---
> > 
> > Branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-noticket-refactor-lua-call
> > See the benchmarks sources here [1].
> > 
> > Before patch:
> > | Encode map: 189851357 mcs, 15.8 K ps
> > | Encode seq: 187926351 mcs, 16.0 K ps
> > | Encode str: 185451675 mcs, 16.2 K ps
> > | Encode dig: 184833396 mcs, 16.2 K ps
> > 
> > After patch:
> > | Encode map: 187814261 mcs, 16.0 K ps
> > | Encode seq: 183755028 mcs, 16.3 K ps
> > | Encode str: 181571626 mcs, 16.5 K ps
> > | Encode dig: 181572998 mcs, 16.5 K ps
> > 
> > Looks like the perf doesn't degrade at least.
> 
> At first, I would like to emphasize that we have no option for merging
> or not the fix for this issue.
> 
> Re benchmarks: It's worth to mention you're measuring two performance
> critical changes: <lua_insert> effect and lower GC pressure. So, it's
> interesting to see the following benchmarks:
> * one with disabled GC and GC stats

Here the results with disabled GC:
Before patch:

Encode map: 4679394 mcs, 21.4 K ps
Encode seq: 4559824 mcs, 21.9 K ps
Encode str: 4574213 mcs, 21.9 K ps
Encode dig: 4595043 mcs, 21.8 K ps
Encode mul: 5978444 mcs, 16.7 K ps

After:

Encode map: 4739110 mcs, 21.1 K ps
Encode seq: 4528261 mcs, 22.1 K ps
Encode str: 4576910 mcs, 21.8 K ps
Encode dig: 4506142 mcs, 22.2 K ps
Encode mul: 6016659 mcs, 16.6 K ps

I suppose, that values are almost the same, at least within the margin
of error.
Note: I reduced amount of iterations 30 times. So inaccuracy increased.

> * one with considerable amount of elements on Lua stack, but not
>   triggering stack resize (AFAIU, 200 is too much)

Tried with 40 items on the stack:

Without GC:

Master:
Encode mul: 4895280 mcs, 20.4 K ps

Branch:
Encode mul: 4896076 mcs, 20.4 K ps

With GC:

Master:
Encode mul: 5123580 mcs, 19.5 K ps

Branch:
Encode mul: 5050863 mcs, 19.8 K ps

Seems pretty equal too.

> 
> Here are my points:
> * There is no such huge increase as a result of reducing GC pressure
> * Moving 1-5 8-byte elements is neglible for performance
> * Moving 200(*) elements as a result of the guest stack resize affects
>   both patched and vanilla versions
> * <lua_insert> measurements are affected by resizing (considering your
>   perf stats)
> 
> Anyway, though these are kinda independent changes, we see no
> performance degradation using both of them in a single patch, so I guess
> we have no reason to worry about.
> 
> (*) I'm not sure about the exact amount of the elements to be moved.

Exactly 200.

> 
> > 
> > [1]: https://gist.github.com/Buristan/3e6d6bf2c722874bec55a8c5a44b98f3
> > 
> >  src/box/lua/call.c | 71 ++++++++++++++++++++++++++++++++++++----------
> >  1 file changed, 56 insertions(+), 15 deletions(-)
> > 
> > diff --git a/src/box/lua/call.c b/src/box/lua/call.c
> > index 0315e720c..3b2572096 100644
> > --- a/src/box/lua/call.c
> > +++ b/src/box/lua/call.c
> > @@ -61,6 +61,8 @@ enum handlers {
> >  	HANDLER_CALL,
> >  	HANDLER_CALL_BY_REF,
> >  	HANDLER_EVAL,
> > +	HANDLER_ENCODE_CALL,
> > +	HANDLER_ENCODE_CALL_16,
> 
> Minor: Entries are not sorted.

Fixed here and below. See the iterative patch:
===================================================================
diff --git a/src/box/lua/call.c b/src/box/lua/call.c
index 3b2572096..5db17359d 100644
--- a/src/box/lua/call.c
+++ b/src/box/lua/call.c
@@ -60,9 +60,9 @@
 enum handlers {
         HANDLER_CALL,
         HANDLER_CALL_BY_REF,
-        HANDLER_EVAL,
         HANDLER_ENCODE_CALL,
         HANDLER_ENCODE_CALL_16,
+        HANDLER_EVAL,
         HANDLER_MAX,
 };
 
@@ -1087,9 +1087,9 @@ box_lua_call_init(struct lua_State *L)
         lua_CFunction handles[] = {
                 [HANDLER_CALL] = execute_lua_call,
                 [HANDLER_CALL_BY_REF] = execute_lua_call_by_ref,
-                [HANDLER_EVAL] = execute_lua_eval,
                 [HANDLER_ENCODE_CALL] = encode_lua_call,
                 [HANDLER_ENCODE_CALL_16] = encode_lua_call_16,
+                [HANDLER_EVAL] = execute_lua_eval,
         };
 
         for (int i = 0; i < HANDLER_MAX; i++) {
===================================================================

> 
> >  	HANDLER_MAX,
> >  };
> >  
> > @@ -400,11 +402,26 @@ struct encode_lua_ctx {
> >  	struct mpstream *stream;
> >  };
> >  
> > +/**
> > + * Encode call results to msgpack from Lua stack.
> > + * Lua stack has the following structure -- the last element is
> > + * lightuserdata pointer to encode_lua_ctx, all other values are
> > + * arguments to proceed.
> 
> Typo: s/proceed/process/.

Fixes. See the iterative patch below.

===================================================================
diff --git a/src/box/lua/call.c b/src/box/lua/call.c
index 5db17359d..857b57165 100644
--- a/src/box/lua/call.c
+++ b/src/box/lua/call.c
@@ -406,7 +406,7 @@ struct encode_lua_ctx {
  * Encode call results to msgpack from Lua stack.
  * Lua stack has the following structure -- the last element is
  * lightuserdata pointer to encode_lua_ctx, all other values are
- * arguments to proceed.
+ * arguments to process.
  * The function encodes all given Lua objects to msgpack stream
  * from context, sets port's size and returns no value on the Lua
  * stack.
@@ -442,7 +442,7 @@ encode_lua_call(lua_State *L)
  * Encode call_16 results to msgpack from Lua stack.
  * Lua stack has the following structure -- the last element is
  * lightuserdata pointer to encode_lua_ctx, all other values are
- * arguments to proceed.
+ * arguments to process.
  * The function encodes all given Lua objects to msgpack stream
  * from context, sets port's size and returns no value on the Lua
  * stack.
===================================================================

> 
> > + * The function encodes all given Lua objects to msgpack stream
> > + * from context, sets port's size and returns no value on the Lua
> > + * stack.
> > + * XXX: This function *MUST* be called under lua_pcall(), because
> > + * luamp_encode() may raise an error.
> > + */
> 
> <snipped>
> 
> > +/**
> > + * Encode call_16 results to msgpack from Lua stack.
> > + * Lua stack has the following structure -- the last element is
> > + * lightuserdata pointer to encode_lua_ctx, all other values are
> > + * arguments to proceed.
> 
> Ditto.
> 
> > + * The function encodes all given Lua objects to msgpack stream
> > + * from context, sets port's size and returns no value on the Lua
> > + * stack.
> > + * XXX: This function *MUST* be called under lua_pcall(), because
> > + * luamp_encode() may raise an error.
> > + */
> 
> <snipped>
> 
> > @@ -450,13 +482,20 @@ port_lua_do_dump(struct port *base, struct mpstream *stream,
> >  	struct encode_lua_ctx ctx;
> >  	ctx.port = port;
> >  	ctx.stream = stream;
> > -	struct lua_State *L = tarantool_L;
> > -	int top = lua_gettop(L);
> > -	if (lua_cpcall(L, handler, &ctx) != 0) {
> > -		luaT_toerror(port->L);
> > +	lua_State *L = port->L;
> > +	/*
> > +	 * At the moment Lua stack holds only values to encode.
> > +	 * Insert corresponding encoder to the bottom and push
> > +	 * encode context as lightuserdata to the top.
> > +	 */
> > +	const int size = lua_gettop(L);
> > +	lua_rawgeti(L, LUA_REGISTRYINDEX, execute_lua_refs[handler]);
> > +	assert(lua_isfunction(L, -1) && lua_iscfunction(L, -1));
> > +	lua_insert(L, 1);
> 
> Side note: I don't like this line, like all we do. The only better
> approach in my head is to implement something similar to VARG frame, but
> I doubt this approach have no pure Lua violations. Let's return to this
> later a bit.
> 
> > +	lua_pushlightuserdata(L, &ctx);
> > +	/* nargs -- all arguments + lightuserdata. */
> > +	if (luaT_call(L, size + 1, 0) != 0)
> >  		return -1;
> > -	}
> > -	lua_settop(L, top);
> >  	return port->size;
> >  }
> >  
> 
> <snipped>
> 
> > @@ -1049,6 +1088,8 @@ box_lua_call_init(struct lua_State *L)
> >  		[HANDLER_CALL] = execute_lua_call,
> >  		[HANDLER_CALL_BY_REF] = execute_lua_call_by_ref,
> >  		[HANDLER_EVAL] = execute_lua_eval,
> > +		[HANDLER_ENCODE_CALL] = encode_lua_call,
> > +		[HANDLER_ENCODE_CALL_16] = encode_lua_call_16,
> 
> Minor: Entries are also not sorted.
> 
> >  	};
> >  
> >  	for (int i = 0; i < HANDLER_MAX; i++) {
> > -- 
> > 2.31.0
> > 
> 
> -- 
> Best regards,
> IM

-- 
Best regards,
Sergey Kaplun


More information about the Tarantool-patches mailing list