[Tarantool-patches] [PATCH] lua: refactor port_lua_do_dump and encode_lua_call
Sergey Kaplun
skaplun at tarantool.org
Mon Aug 2 17:25:56 MSK 2021
Hi, Igor!
Thanks for the review!
On 01.08.21, Igor Munkin wrote:
> Sergey,
>
> Thanks for the patch! The changes are fine, but please consider the
> comments below.
>
> On 18.06.21, Sergey Kaplun wrote:
> > The old code flow was the following:
> > 1) `struct port_lua` given to `port_lua_do_dump()` has Lua stack with
> > arguments to encode to MessagePack
> > 2) The main coroutine `tarantool_L` is used with `lua_cpcall()` to call
> > `encode_lua_call_16()` or `encode_lua_call()`
> > 3) Objects on minor coroutine are encoded via `luamp_encode_call16()` or
>
> What is this minor coroutine? To make terms more consistent in scope of
> this commit message, let's define this in the first bullet and use it
> everywhere. For me the best option is "port coroutine".
Fixed.
>
> > `luamp_encode()`. This encoding may raise an error on unprotected
>
> The next bullet is recommended to be started from this sentence.
Fixed.
>
> > `port->L` coroutine. This coroutine has no protected frame on it
> > and this call should fail in pure Lua. It is forbidden to call
>
> Ditto.
Fixed.
>
> > anything on unprotected coroutine [1] (Lua 5.1 sets protection only
> > for specific lua_State [2] and calls a panic function if we raise an
> > error on unprotected lua_State [3]). Netherless, there is no panic
>
> Ditto.
Fixed.
>
> > at now due to two facts. The first one is LuaJIT's support of C++
>
> And sublist is strongly recommended for "the first" and "the second"
> facts. It'll make this War and Peace part more readable.
Done.
>
> > exception handling [4] that allows to raise an error in Lua and
> > catch it in C++ or vice versa. But documentation still doesn't
> > permit errors on unprotected coroutines (at least we must set
> > try-catch block). The second one is double monkey-patching of LuaJIT
>
> I doubt this is monkey-patching, but rather a monkey's patch.
Changed to patching.
>
> > to restore currently executed coroutine, when C function or fast
> > function raises an error [5][6] (see related issue here [7][8]).
> > For these reasons, when an error occurs, the unwinder searches and
> > finds the C-protected stack frame from the `lua_cpcall()` for the
> > `tarantool_L` coroutine and unwinds until that point (without
> > aforementioned patches LuaJIT just calls a panic function and exit).
> > 4) If an error is raised, and `lua_cpcall()` returns not `LUA_OK`, then
> > the error from `port->L` coroutine is converted into a Tarantool error
> > and a diagnostic is set.
> >
> > The auxiliary usage of `tarantool_L` coroutine is redundant and doesn't
>
> Well, as for me, the main problem is violating Lua coroutine practice
> (it's worth to mention, that internal unwinder used on M1 is not such
> flexible, so this misuse still leads to panic call), and the minor
> reason is tarantool_L usage redundancy.
Changed considering your comments.
>
> > respect Lua idiomatic of usage. So this patch drops it and uses only
> > minor coroutine instead with `lua_pcall()`.
> >
> > Functions to encode are saved as entrance in the `LUA_REGISTRY` table
> > to reduce GC pressure, like it is done for other handlers [9].
> >
> > [1]: https://www.lua.org/manual/5.2/manual.html#4.6
> > > If an error happens outside any protected environment, Lua calls a
> > > panic function
>
> Some copy-pasting artefacts.
Moved upwards.
>
> > [2]: https://www.lua.org/source/5.1/lstate.h.html#lua_State
> > [3]: https://www.lua.org/source/5.1/ldo.c.html#luaD_throw
> > [4]: https://luajit.org/extensions.html#exceptions
> > [5]: https://github.com/tarantool/luajit/commit/ed412cd9f55fe87fd32a69c86e1732690fc5c1b0
> > [6]: https://github.com/tarantool/luajit/commit/97699d9ee2467389b6aea21a098e38aff3469b5f
> > [7]: https://github.com/tarantool/tarantool/issues/1516
> > [8]: https://www.freelists.org/post/luajit/Issue-with-PCALL-in-21
> > [9]: https://github.com/tarantool/tarantool/commit/e88c0d21ab765d4c53bed2437c49d77b3ffe4216
>
> This patch resolves #6248, BTW.
Added.
>
The new commit message is the following:
===================================================================
lua: refactor port_lua_do_dump and encode_lua_call
The old code flow was the following:
1) `struct port_lua` given to `port_lua_do_dump()` has Lua stack with
arguments to encode to MessagePack.
2) The main coroutine `tarantool_L` is used to call `encode_lua_call()`
or `encode_lua_call_16`() via `lua_cpcall()`.
3) Objects on port coroutine are encoded via `luamp_encode()` or
`luamp_encode_call16()`.
4) This encoding may raise an error on unprotected `port->L` coroutine.
This coroutine has no protected frame on it and this call should fail
in pure Lua.
Calling anything on unprotected coroutine is not allowed in Lua [1]:
| If an error happens outside any protected environment, Lua calls a
| panic function
Lua 5.1 sets protection only for specific lua_State [2] and calls a
panic function if we raise an error on unprotected lua_State [3].
Nevertheless, no panic occurs now due to two facts:
* The first one is LuaJIT's support of C++ exception handling [4] that
allows to raise an error in Lua and catch it in C++ or vice versa. But
documentation still doesn't allow raising errors on unprotected
coroutines (at least we must use try-catch block).
* The second one is the patch made in LuaJIT to restore currently
executed coroutine, when C function or fast function raises an
error [5][6] (see the related issue here [7][8]).
For these reasons, when an error occurs, the unwinder searches and finds
the C-protected stack frame from the `lua_cpcall()` for `tarantool_L`
coroutine and unwinds until that point (without aforementioned patches
LuaJIT just calls a panic function and exit).
If an error is raised, and `lua_cpcall()` returns not `LUA_OK`, then the
error from `port->L` coroutine is converted into a Tarantool error and a
diagnostic is set.
The auxiliary usage of `tarantool_L` coroutine doesn't respect Lua
idiomatic of usage. Internal unwinder used on M1 is not such flexible,
so such misuse leads to panic call. Also the `tarantool_L` usage is
redundant. So this patch drops it and uses only minor coroutine instead
with `lua_pcall()`.
Functions to encode are saved as entrance in the `LUA_REGISTRY` table to
reduce GC pressure, like it is done for other handlers [9].
[1]: https://www.lua.org/manual/5.2/manual.html#4.6
[2]: https://www.lua.org/source/5.1/lstate.h.html#lua_State
[3]: https://www.lua.org/source/5.1/ldo.c.html#luaD_throw
[4]: https://luajit.org/extensions.html#exceptions
[5]: https://github.com/tarantool/luajit/commit/ed412cd9f55fe87fd32a69c86e1732690fc5c1b0
[6]: https://github.com/tarantool/luajit/commit/97699d9ee2467389b6aea21a098e38aff3469b5f
[7]: https://github.com/tarantool/tarantool/issues/1516
[8]: https://www.freelists.org/post/luajit/Issue-with-PCALL-in-21
[9]: https://github.com/tarantool/tarantool/commit/e88c0d21ab765d4c53bed2437c49d77b3ffe4216
Closes #6248
Closes #4617
===================================================================
> > ---
> >
> > Branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-noticket-refactor-lua-call
> > See the benchmarks sources here [1].
> >
> > Before patch:
> > | Encode map: 189851357 mcs, 15.8 K ps
> > | Encode seq: 187926351 mcs, 16.0 K ps
> > | Encode str: 185451675 mcs, 16.2 K ps
> > | Encode dig: 184833396 mcs, 16.2 K ps
> >
> > After patch:
> > | Encode map: 187814261 mcs, 16.0 K ps
> > | Encode seq: 183755028 mcs, 16.3 K ps
> > | Encode str: 181571626 mcs, 16.5 K ps
> > | Encode dig: 181572998 mcs, 16.5 K ps
> >
> > Looks like the perf doesn't degrade at least.
>
> At first, I would like to emphasize that we have no option for merging
> or not the fix for this issue.
>
> Re benchmarks: It's worth to mention you're measuring two performance
> critical changes: <lua_insert> effect and lower GC pressure. So, it's
> interesting to see the following benchmarks:
> * one with disabled GC and GC stats
Here the results with disabled GC:
Before patch:
Encode map: 4679394 mcs, 21.4 K ps
Encode seq: 4559824 mcs, 21.9 K ps
Encode str: 4574213 mcs, 21.9 K ps
Encode dig: 4595043 mcs, 21.8 K ps
Encode mul: 5978444 mcs, 16.7 K ps
After:
Encode map: 4739110 mcs, 21.1 K ps
Encode seq: 4528261 mcs, 22.1 K ps
Encode str: 4576910 mcs, 21.8 K ps
Encode dig: 4506142 mcs, 22.2 K ps
Encode mul: 6016659 mcs, 16.6 K ps
I suppose, that values are almost the same, at least within the margin
of error.
Note: I reduced amount of iterations 30 times. So inaccuracy increased.
> * one with considerable amount of elements on Lua stack, but not
> triggering stack resize (AFAIU, 200 is too much)
Tried with 40 items on the stack:
Without GC:
Master:
Encode mul: 4895280 mcs, 20.4 K ps
Branch:
Encode mul: 4896076 mcs, 20.4 K ps
With GC:
Master:
Encode mul: 5123580 mcs, 19.5 K ps
Branch:
Encode mul: 5050863 mcs, 19.8 K ps
Seems pretty equal too.
>
> Here are my points:
> * There is no such huge increase as a result of reducing GC pressure
> * Moving 1-5 8-byte elements is neglible for performance
> * Moving 200(*) elements as a result of the guest stack resize affects
> both patched and vanilla versions
> * <lua_insert> measurements are affected by resizing (considering your
> perf stats)
>
> Anyway, though these are kinda independent changes, we see no
> performance degradation using both of them in a single patch, so I guess
> we have no reason to worry about.
>
> (*) I'm not sure about the exact amount of the elements to be moved.
Exactly 200.
>
> >
> > [1]: https://gist.github.com/Buristan/3e6d6bf2c722874bec55a8c5a44b98f3
> >
> > src/box/lua/call.c | 71 ++++++++++++++++++++++++++++++++++++----------
> > 1 file changed, 56 insertions(+), 15 deletions(-)
> >
> > diff --git a/src/box/lua/call.c b/src/box/lua/call.c
> > index 0315e720c..3b2572096 100644
> > --- a/src/box/lua/call.c
> > +++ b/src/box/lua/call.c
> > @@ -61,6 +61,8 @@ enum handlers {
> > HANDLER_CALL,
> > HANDLER_CALL_BY_REF,
> > HANDLER_EVAL,
> > + HANDLER_ENCODE_CALL,
> > + HANDLER_ENCODE_CALL_16,
>
> Minor: Entries are not sorted.
Fixed here and below. See the iterative patch:
===================================================================
diff --git a/src/box/lua/call.c b/src/box/lua/call.c
index 3b2572096..5db17359d 100644
--- a/src/box/lua/call.c
+++ b/src/box/lua/call.c
@@ -60,9 +60,9 @@
enum handlers {
HANDLER_CALL,
HANDLER_CALL_BY_REF,
- HANDLER_EVAL,
HANDLER_ENCODE_CALL,
HANDLER_ENCODE_CALL_16,
+ HANDLER_EVAL,
HANDLER_MAX,
};
@@ -1087,9 +1087,9 @@ box_lua_call_init(struct lua_State *L)
lua_CFunction handles[] = {
[HANDLER_CALL] = execute_lua_call,
[HANDLER_CALL_BY_REF] = execute_lua_call_by_ref,
- [HANDLER_EVAL] = execute_lua_eval,
[HANDLER_ENCODE_CALL] = encode_lua_call,
[HANDLER_ENCODE_CALL_16] = encode_lua_call_16,
+ [HANDLER_EVAL] = execute_lua_eval,
};
for (int i = 0; i < HANDLER_MAX; i++) {
===================================================================
>
> > HANDLER_MAX,
> > };
> >
> > @@ -400,11 +402,26 @@ struct encode_lua_ctx {
> > struct mpstream *stream;
> > };
> >
> > +/**
> > + * Encode call results to msgpack from Lua stack.
> > + * Lua stack has the following structure -- the last element is
> > + * lightuserdata pointer to encode_lua_ctx, all other values are
> > + * arguments to proceed.
>
> Typo: s/proceed/process/.
Fixes. See the iterative patch below.
===================================================================
diff --git a/src/box/lua/call.c b/src/box/lua/call.c
index 5db17359d..857b57165 100644
--- a/src/box/lua/call.c
+++ b/src/box/lua/call.c
@@ -406,7 +406,7 @@ struct encode_lua_ctx {
* Encode call results to msgpack from Lua stack.
* Lua stack has the following structure -- the last element is
* lightuserdata pointer to encode_lua_ctx, all other values are
- * arguments to proceed.
+ * arguments to process.
* The function encodes all given Lua objects to msgpack stream
* from context, sets port's size and returns no value on the Lua
* stack.
@@ -442,7 +442,7 @@ encode_lua_call(lua_State *L)
* Encode call_16 results to msgpack from Lua stack.
* Lua stack has the following structure -- the last element is
* lightuserdata pointer to encode_lua_ctx, all other values are
- * arguments to proceed.
+ * arguments to process.
* The function encodes all given Lua objects to msgpack stream
* from context, sets port's size and returns no value on the Lua
* stack.
===================================================================
>
> > + * The function encodes all given Lua objects to msgpack stream
> > + * from context, sets port's size and returns no value on the Lua
> > + * stack.
> > + * XXX: This function *MUST* be called under lua_pcall(), because
> > + * luamp_encode() may raise an error.
> > + */
>
> <snipped>
>
> > +/**
> > + * Encode call_16 results to msgpack from Lua stack.
> > + * Lua stack has the following structure -- the last element is
> > + * lightuserdata pointer to encode_lua_ctx, all other values are
> > + * arguments to proceed.
>
> Ditto.
>
> > + * The function encodes all given Lua objects to msgpack stream
> > + * from context, sets port's size and returns no value on the Lua
> > + * stack.
> > + * XXX: This function *MUST* be called under lua_pcall(), because
> > + * luamp_encode() may raise an error.
> > + */
>
> <snipped>
>
> > @@ -450,13 +482,20 @@ port_lua_do_dump(struct port *base, struct mpstream *stream,
> > struct encode_lua_ctx ctx;
> > ctx.port = port;
> > ctx.stream = stream;
> > - struct lua_State *L = tarantool_L;
> > - int top = lua_gettop(L);
> > - if (lua_cpcall(L, handler, &ctx) != 0) {
> > - luaT_toerror(port->L);
> > + lua_State *L = port->L;
> > + /*
> > + * At the moment Lua stack holds only values to encode.
> > + * Insert corresponding encoder to the bottom and push
> > + * encode context as lightuserdata to the top.
> > + */
> > + const int size = lua_gettop(L);
> > + lua_rawgeti(L, LUA_REGISTRYINDEX, execute_lua_refs[handler]);
> > + assert(lua_isfunction(L, -1) && lua_iscfunction(L, -1));
> > + lua_insert(L, 1);
>
> Side note: I don't like this line, like all we do. The only better
> approach in my head is to implement something similar to VARG frame, but
> I doubt this approach have no pure Lua violations. Let's return to this
> later a bit.
>
> > + lua_pushlightuserdata(L, &ctx);
> > + /* nargs -- all arguments + lightuserdata. */
> > + if (luaT_call(L, size + 1, 0) != 0)
> > return -1;
> > - }
> > - lua_settop(L, top);
> > return port->size;
> > }
> >
>
> <snipped>
>
> > @@ -1049,6 +1088,8 @@ box_lua_call_init(struct lua_State *L)
> > [HANDLER_CALL] = execute_lua_call,
> > [HANDLER_CALL_BY_REF] = execute_lua_call_by_ref,
> > [HANDLER_EVAL] = execute_lua_eval,
> > + [HANDLER_ENCODE_CALL] = encode_lua_call,
> > + [HANDLER_ENCODE_CALL_16] = encode_lua_call_16,
>
> Minor: Entries are also not sorted.
>
> > };
> >
> > for (int i = 0; i < HANDLER_MAX; i++) {
> > --
> > 2.31.0
> >
>
> --
> Best regards,
> IM
--
Best regards,
Sergey Kaplun
More information about the Tarantool-patches
mailing list