Date: Mon, 2 Aug 2021 17:25:56 +0300
From: Sergey Kaplun via Tarantool-patches
Reply-To: Sergey Kaplun
To: Igor Munkin
Cc: Vladislav Shpilevoy, tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH] lua: refactor port_lua_do_dump and encode_lua_call
References: <20210618181416.25454-1-skaplun@tarantool.org> <20210801123417.GA27855@tarantool.org>
In-Reply-To: <20210801123417.GA27855@tarantool.org>
List-Id: Tarantool development patches

Hi, Igor!

Thanks for the review!

On 01.08.21, Igor Munkin wrote:
> Sergey,
>
> Thanks for the patch! The changes are fine, but please consider the
> comments below.
>
> On 18.06.21, Sergey Kaplun wrote:
> > The old code flow was the following:
> > 1) `struct port_lua` given to `port_lua_do_dump()` has Lua stack with
> >    arguments to encode to MessagePack
> > 2) The main coroutine `tarantool_L` is used with `lua_cpcall()` to call
> >    `encode_lua_call_16()` or `encode_lua_call()`
> > 3) Objects on minor coroutine are encoded via `luamp_encode_call16()` or
>
> What is this minor coroutine? To make terms more consistent in scope of
> this commit message, let's define this in the first bullet and use it
> everywhere. For me the best option is "port coroutine".

Fixed.

>
> >    `luamp_encode()`. This encoding may raise an error on unprotected
>
> The next bullet is recommended to be started from this sentence.

Fixed.

>
> >    `port->L` coroutine. This coroutine has no protected frame on it
> >    and this call should fail in pure Lua. It is forbidden to call
>
> Ditto.

Fixed.

>
> >    anything on unprotected coroutine [1] (Lua 5.1 sets protection only
> >    for specific lua_State [2] and calls a panic function if we raise an
> >    error on unprotected lua_State [3]). Netherless, there is no panic
>
> Ditto.

Fixed.

>
> >    at now due to two facts. The first one is LuaJIT's support of C++
>
> And sublist is strongly recommended for "the first" and "the second"
> facts. It'll make this War and Peace part more readable.

Done.

>
> >    exception handling [4] that allows to raise an error in Lua and
> >    catch it in C++ or vice versa. But documentation still doesn't
> >    permit errors on unprotected coroutines (at least we must set
> >    try-catch block). The second one is double monkey-patching of LuaJIT
>
> I doubt this is monkey-patching, but rather a monkey's patch.

Changed to patching.

>
> >    to restore currently executed coroutine, when C function or fast
> >    function raises an error [5][6] (see related issue here [7][8]).
> >    For these reasons, when an error occurs, the unwinder searches and
> >    finds the C-protected stack frame from the `lua_cpcall()` for the
> >    `tarantool_L` coroutine and unwinds until that point (without
> >    aforementioned patches LuaJIT just calls a panic function and exit).
> > 4) If an error is raised, and `lua_cpcall()` returns not `LUA_OK`, then
> >    the error from `port->L` coroutine is converted into a Tarantool error
> >    and a diagnostic is set.
> >
> > The auxiliary usage of `tarantool_L` coroutine is redundant and doesn't
>
> Well, as for me, the main problem is violating Lua coroutine practice
> (it's worth to mention, that internal unwinder used on M1 is not such
> flexible, so this misuse still leads to panic call), and the minor
> reason is tarantool_L usage redundancy.

Changed considering your comments.

>
> > respect Lua idiomatic of usage. So this patch drops it and uses only
> > minor coroutine instead with `lua_pcall()`.
> >
> > Functions to encode are saved as entrance in the `LUA_REGISTRY` table
> > to reduce GC pressure, like it is done for other handlers [9].
> >
> > [1]: https://www.lua.org/manual/5.2/manual.html#4.6
> > > If an error happens outside any protected environment, Lua calls a
> > > panic function
>
> Some copy-pasting artefacts.

Moved upwards.

>
> > [2]: https://www.lua.org/source/5.1/lstate.h.html#lua_State
> > [3]: https://www.lua.org/source/5.1/ldo.c.html#luaD_throw
> > [4]: https://luajit.org/extensions.html#exceptions
> > [5]: https://github.com/tarantool/luajit/commit/ed412cd9f55fe87fd32a69c86e1732690fc5c1b0
> > [6]: https://github.com/tarantool/luajit/commit/97699d9ee2467389b6aea21a098e38aff3469b5f
> > [7]: https://github.com/tarantool/tarantool/issues/1516
> > [8]: https://www.freelists.org/post/luajit/Issue-with-PCALL-in-21
> > [9]: https://github.com/tarantool/tarantool/commit/e88c0d21ab765d4c53bed2437c49d77b3ffe4216
>
> This patch resolves #6248, BTW.

Added.

>

The new commit message is the following:

===================================================================
lua: refactor port_lua_do_dump and encode_lua_call

The old code flow was the following:

1) `struct port_lua` given to `port_lua_do_dump()` has Lua stack with
   arguments to encode to MessagePack.
2) The main coroutine `tarantool_L` is used to call `encode_lua_call()`
   or `encode_lua_call_16()` via `lua_cpcall()`.
3) Objects on port coroutine are encoded via `luamp_encode()` or
   `luamp_encode_call16()`.
4) This encoding may raise an error on unprotected `port->L` coroutine.
   This coroutine has no protected frame on it and this call should
   fail in pure Lua.
   Calling anything on unprotected coroutine is not allowed in Lua [1]:

   | If an error happens outside any protected environment, Lua calls a
   | panic function

   Lua 5.1 sets protection only for specific lua_State [2] and calls a
   panic function if we raise an error on unprotected lua_State [3].
   Nevertheless, no panic occurs now due to two facts:
   * The first one is LuaJIT's support of C++ exception handling [4]
     that allows to raise an error in Lua and catch it in C++ or vice
     versa. But documentation still doesn't allow raising errors on
     unprotected coroutines (at least we must use try-catch block).
   * The second one is the patch made in LuaJIT to restore currently
     executed coroutine, when C function or fast function raises an
     error [5][6] (see the related issue here [7][8]).

   For these reasons, when an error occurs, the unwinder searches and
   finds the C-protected stack frame from the `lua_cpcall()` for
   `tarantool_L` coroutine and unwinds until that point (without
   aforementioned patches LuaJIT just calls a panic function and exits).
   If an error is raised, and `lua_cpcall()` does not return `LUA_OK`,
   then the error from `port->L` coroutine is converted into a Tarantool
   error and a diagnostic is set.

The auxiliary usage of `tarantool_L` coroutine doesn't respect
idiomatic Lua usage. Internal unwinder used on M1 is not so flexible,
so such misuse leads to panic call. Also the `tarantool_L` usage is
redundant. So this patch drops it and uses only port coroutine instead
with `lua_pcall()`.

Functions to encode are saved as entrance in the `LUA_REGISTRY` table
to reduce GC pressure, like it is done for other handlers [9].

[1]: https://www.lua.org/manual/5.2/manual.html#4.6
[2]: https://www.lua.org/source/5.1/lstate.h.html#lua_State
[3]: https://www.lua.org/source/5.1/ldo.c.html#luaD_throw
[4]: https://luajit.org/extensions.html#exceptions
[5]: https://github.com/tarantool/luajit/commit/ed412cd9f55fe87fd32a69c86e1732690fc5c1b0
[6]: https://github.com/tarantool/luajit/commit/97699d9ee2467389b6aea21a098e38aff3469b5f
[7]: https://github.com/tarantool/tarantool/issues/1516
[8]: https://www.freelists.org/post/luajit/Issue-with-PCALL-in-21
[9]: https://github.com/tarantool/tarantool/commit/e88c0d21ab765d4c53bed2437c49d77b3ffe4216

Closes #6248
Closes #4617
===================================================================

> > ---
> >
> > Branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-noticket-refactor-lua-call
> > See the benchmarks sources here [1].
> >
> > Before patch:
> > | Encode map: 189851357 mcs, 15.8 K ps
> > | Encode seq: 187926351 mcs, 16.0 K ps
> > | Encode str: 185451675 mcs, 16.2 K ps
> > | Encode dig: 184833396 mcs, 16.2 K ps
> >
> > After patch:
> > | Encode map: 187814261 mcs, 16.0 K ps
> > | Encode seq: 183755028 mcs, 16.3 K ps
> > | Encode str: 181571626 mcs, 16.5 K ps
> > | Encode dig: 181572998 mcs, 16.5 K ps
> >
> > Looks like the perf doesn't degrade at least.
>
> At first, I would like to emphasize that we have no option for merging
> or not the fix for this issue.
>
> Re benchmarks: It's worth to mention you're measuring two performance
> critical changes: effect and lower GC pressure. So, it's
> interesting to see the following benchmarks:
> * one with disabled GC and GC stats

Here are the results with disabled GC:

Before patch:
Encode map: 4679394 mcs, 21.4 K ps
Encode seq: 4559824 mcs, 21.9 K ps
Encode str: 4574213 mcs, 21.9 K ps
Encode dig: 4595043 mcs, 21.8 K ps
Encode mul: 5978444 mcs, 16.7 K ps

After:
Encode map: 4739110 mcs, 21.1 K ps
Encode seq: 4528261 mcs, 22.1 K ps
Encode str: 4576910 mcs, 21.8 K ps
Encode dig: 4506142 mcs, 22.2 K ps
Encode mul: 6016659 mcs, 16.6 K ps

I suppose that the values are almost the same, at least within the
margin of error.

Note: I reduced the number of iterations by a factor of 30, so the
inaccuracy increased.

> * one with considerable amount of elements on Lua stack, but not
>   triggering stack resize (AFAIU, 200 is too much)

Tried with 40 items on the stack:

Without GC:
Master:
Encode mul: 4895280 mcs, 20.4 K ps
Branch:
Encode mul: 4896076 mcs, 20.4 K ps

With GC:
Master:
Encode mul: 5123580 mcs, 19.5 K ps
Branch:
Encode mul: 5050863 mcs, 19.8 K ps

Seems pretty equal too.

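To put a number on "within the margin of error", here is a small sketch in C that computes the relative change per row from the GC-disabled timings above. The names `bench_row`, `gc_off_rows`, `rel_change_pct` and `max_abs_change_pct` are made up for this illustration and are not part of the benchmark gist.

```c
#include <stddef.h>

/* One benchmark row: elapsed time in microseconds before and after the patch. */
struct bench_row {
	const char *name;
	double before_mcs;
	double after_mcs;
};

/* Timings copied from the GC-disabled run quoted in this mail. */
const struct bench_row gc_off_rows[] = {
	{"map", 4679394, 4739110},
	{"seq", 4559824, 4528261},
	{"str", 4574213, 4576910},
	{"dig", 4595043, 4506142},
	{"mul", 5978444, 6016659},
};

#define GC_OFF_ROW_COUNT (sizeof(gc_off_rows) / sizeof(gc_off_rows[0]))

/* Signed relative change in percent; positive means the patched build was slower. */
double
rel_change_pct(const struct bench_row *row)
{
	return (row->after_mcs - row->before_mcs) / row->before_mcs * 100.0;
}

/* Largest absolute relative change over all rows, in percent. */
double
max_abs_change_pct(const struct bench_row *rows, size_t count)
{
	double max = 0.0;
	for (size_t i = 0; i < count; i++) {
		double change = rel_change_pct(&rows[i]);
		if (change < 0)
			change = -change;
		if (change > max)
			max = change;
	}
	return max;
}
```

Calling `max_abs_change_pct(gc_off_rows, GC_OFF_ROW_COUNT)` on these rows gives a value just under 2 %: `map` is about 1.3 % slower after the patch, `dig` is about 1.9 % faster, and the other rows move by less than 1 %.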
>
> Here are my points:
> * There is no such huge increase as a result of reducing GC pressure
> * Moving 1-5 8-byte elements is neglible for performance
> * Moving 200(*) elements as a result of the guest stack resize affects
>   both patched and vanilla versions
> * measurements are affected by resizing (considering your
>   perf stats)
>
> Anyway, though these are kinda independent changes, we see no
> performance degradation using both of them in a single patch, so I guess
> we have no reason to worry about.
>
> (*) I'm not sure about the exact amount of the elements to be moved.

Exactly 200.

>
> >
> > [1]: https://gist.github.com/Buristan/3e6d6bf2c722874bec55a8c5a44b98f3
> >
> >  src/box/lua/call.c | 71 ++++++++++++++++++++++++++++++++++++----------
> >  1 file changed, 56 insertions(+), 15 deletions(-)
> >
> > diff --git a/src/box/lua/call.c b/src/box/lua/call.c
> > index 0315e720c..3b2572096 100644
> > --- a/src/box/lua/call.c
> > +++ b/src/box/lua/call.c
> > @@ -61,6 +61,8 @@ enum handlers {
> >  	HANDLER_CALL,
> >  	HANDLER_CALL_BY_REF,
> >  	HANDLER_EVAL,
> > +	HANDLER_ENCODE_CALL,
> > +	HANDLER_ENCODE_CALL_16,
>
> Minor: Entries are not sorted.

Fixed here and below.
See the iterative patch:

===================================================================
diff --git a/src/box/lua/call.c b/src/box/lua/call.c
index 3b2572096..5db17359d 100644
--- a/src/box/lua/call.c
+++ b/src/box/lua/call.c
@@ -60,9 +60,9 @@
 enum handlers {
 	HANDLER_CALL,
 	HANDLER_CALL_BY_REF,
-	HANDLER_EVAL,
 	HANDLER_ENCODE_CALL,
 	HANDLER_ENCODE_CALL_16,
+	HANDLER_EVAL,
 	HANDLER_MAX,
 };
 
@@ -1087,9 +1087,9 @@ box_lua_call_init(struct lua_State *L)
 	lua_CFunction handles[] = {
 		[HANDLER_CALL] = execute_lua_call,
 		[HANDLER_CALL_BY_REF] = execute_lua_call_by_ref,
-		[HANDLER_EVAL] = execute_lua_eval,
 		[HANDLER_ENCODE_CALL] = encode_lua_call,
 		[HANDLER_ENCODE_CALL_16] = encode_lua_call_16,
+		[HANDLER_EVAL] = execute_lua_eval,
 	};
 
 	for (int i = 0; i < HANDLER_MAX; i++) {
===================================================================

>
> >  	HANDLER_MAX,
> >  };
> >
> > @@ -400,11 +402,26 @@ struct encode_lua_ctx {
> >  	struct mpstream *stream;
> >  };
> >
> > +/**
> > + * Encode call results to msgpack from Lua stack.
> > + * Lua stack has the following structure -- the last element is
> > + * lightuserdata pointer to encode_lua_ctx, all other values are
> > + * arguments to proceed.
>
> Typo: s/proceed/process/.

Fixed. See the iterative patch below.

===================================================================
diff --git a/src/box/lua/call.c b/src/box/lua/call.c
index 5db17359d..857b57165 100644
--- a/src/box/lua/call.c
+++ b/src/box/lua/call.c
@@ -406,7 +406,7 @@ struct encode_lua_ctx {
  * Encode call results to msgpack from Lua stack.
  * Lua stack has the following structure -- the last element is
  * lightuserdata pointer to encode_lua_ctx, all other values are
- * arguments to proceed.
+ * arguments to process.
  * The function encodes all given Lua objects to msgpack stream
  * from context, sets port's size and returns no value on the Lua
  * stack.
@@ -442,7 +442,7 @@ encode_lua_call(lua_State *L)
  * Encode call_16 results to msgpack from Lua stack.
  * Lua stack has the following structure -- the last element is
  * lightuserdata pointer to encode_lua_ctx, all other values are
- * arguments to proceed.
+ * arguments to process.
  * The function encodes all given Lua objects to msgpack stream
  * from context, sets port's size and returns no value on the Lua
  * stack.
===================================================================

>
> > + * The function encodes all given Lua objects to msgpack stream
> > + * from context, sets port's size and returns no value on the Lua
> > + * stack.
> > + * XXX: This function *MUST* be called under lua_pcall(), because
> > + * luamp_encode() may raise an error.
> > + */
>
>
>
> > +/**
> > + * Encode call_16 results to msgpack from Lua stack.
> > + * Lua stack has the following structure -- the last element is
> > + * lightuserdata pointer to encode_lua_ctx, all other values are
> > + * arguments to proceed.
>
> Ditto.
>
> > + * The function encodes all given Lua objects to msgpack stream
> > + * from context, sets port's size and returns no value on the Lua
> > + * stack.
> > + * XXX: This function *MUST* be called under lua_pcall(), because
> > + * luamp_encode() may raise an error.
> > + */
>
>
>
> > @@ -450,13 +482,20 @@ port_lua_do_dump(struct port *base, struct mpstream *stream,
> >  	struct encode_lua_ctx ctx;
> >  	ctx.port = port;
> >  	ctx.stream = stream;
> > -	struct lua_State *L = tarantool_L;
> > -	int top = lua_gettop(L);
> > -	if (lua_cpcall(L, handler, &ctx) != 0) {
> > -		luaT_toerror(port->L);
> > +	lua_State *L = port->L;
> > +	/*
> > +	 * At the moment Lua stack holds only values to encode.
> > +	 * Insert corresponding encoder to the bottom and push
> > +	 * encode context as lightuserdata to the top.
> > +	 */
> > +	const int size = lua_gettop(L);
> > +	lua_rawgeti(L, LUA_REGISTRYINDEX, execute_lua_refs[handler]);
> > +	assert(lua_isfunction(L, -1) && lua_iscfunction(L, -1));
> > +	lua_insert(L, 1);
>
> Side note: I don't like this line, like all we do.
> The only better
> approach in my head is to implement something similar to VARG frame, but
> I doubt this approach have no pure Lua violations. Let's return to this
> later a bit.
>
> > +	lua_pushlightuserdata(L, &ctx);
> > +	/* nargs -- all arguments + lightuserdata. */
> > +	if (luaT_call(L, size + 1, 0) != 0)
> >  		return -1;
> > -	}
> > -	lua_settop(L, top);
> >  	return port->size;
> >  }
> >
>
>
>
> > @@ -1049,6 +1088,8 @@ box_lua_call_init(struct lua_State *L)
> >  		[HANDLER_CALL] = execute_lua_call,
> >  		[HANDLER_CALL_BY_REF] = execute_lua_call_by_ref,
> >  		[HANDLER_EVAL] = execute_lua_eval,
> > +		[HANDLER_ENCODE_CALL] = encode_lua_call,
> > +		[HANDLER_ENCODE_CALL_16] = encode_lua_call_16,
>
> Minor: Entries are also not sorted.
>
> >  	};
> >
> >  	for (int i = 0; i < HANDLER_MAX; i++) {
> > --
> > 2.31.0
> >
>
> --
> Best regards,
> IM

--
Best regards,
Sergey Kaplun
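
P.S. A footnote on the "entries are not sorted" remarks. The sketch below, in C, shows why reordering the enum and the table is purely cosmetic: designated initializers bind each slot to its enum name, not to its position in the source. The `handler_f` type and the dummy function bodies are simplified stand-ins invented for this illustration (the real table holds `lua_CFunction` values), and `handles_are_consistent()` is a hypothetical helper.

```c
#include <stddef.h>

/* Hypothetical stand-in for lua_CFunction, so the sketch needs no Lua headers. */
typedef int (*handler_f)(void *ctx);

enum handlers {
	HANDLER_CALL,
	HANDLER_CALL_BY_REF,
	HANDLER_ENCODE_CALL,
	HANDLER_ENCODE_CALL_16,
	HANDLER_EVAL,
	HANDLER_MAX,
};

/* Dummy handlers: each one just returns the enum value it belongs to. */
int execute_lua_call(void *ctx) { (void)ctx; return HANDLER_CALL; }
int execute_lua_call_by_ref(void *ctx) { (void)ctx; return HANDLER_CALL_BY_REF; }
int encode_lua_call(void *ctx) { (void)ctx; return HANDLER_ENCODE_CALL; }
int encode_lua_call_16(void *ctx) { (void)ctx; return HANDLER_ENCODE_CALL_16; }
int execute_lua_eval(void *ctx) { (void)ctx; return HANDLER_EVAL; }

/*
 * The entries are deliberately listed out of order: each designated
 * initializer still lands in the slot named by its enum constant.
 */
const handler_f handles[] = {
	[HANDLER_EVAL] = execute_lua_eval,
	[HANDLER_CALL] = execute_lua_call,
	[HANDLER_ENCODE_CALL_16] = encode_lua_call_16,
	[HANDLER_CALL_BY_REF] = execute_lua_call_by_ref,
	[HANDLER_ENCODE_CALL] = encode_lua_call,
};

/* Returns 1 if every slot below HANDLER_MAX is filled with its own handler. */
int
handles_are_consistent(void)
{
	if (sizeof(handles) / sizeof(handles[0]) != (size_t)HANDLER_MAX)
		return 0;
	for (int i = 0; i < HANDLER_MAX; i++) {
		if (handles[i] == NULL || handles[i](NULL) != i)
			return 0;
	}
	return 1;
}
```

Keeping both lists in the same (sorted) order is still worth it for readability, since a missing entry is easier to spot when the enum and the table line up.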