From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from [87.239.111.99] (localhost [127.0.0.1]) by dev.tarantool.org (Postfix) with ESMTP id 1FE8469F6EB; Fri, 13 Oct 2023 14:50:32 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 1FE8469F6EB DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev; t=1697197832; bh=8QaJevvJfsg4ZCZI2tDr/qJwzjjlTirBy79YZTyBNSQ=; h=Date:To:Cc:References:In-Reply-To:Subject:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=bQ9dulmAWzKVXdDZF0jocDreeaXKHqqxd6V1OagMGSykj/quOfeXQwNdMeLj2wAEG sQ/dJ+C46U16uGU/R0b/EH5UbJgDWRrlPWYJ5pHWIYt2c8r2XbRjPTQcRh6VanRdPf YJQuyufyQDN5k1cAnag4xMtLvJV5/BATJ0NBofP4= Received: from smtp60.i.mail.ru (smtp60.i.mail.ru [95.163.41.98]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id EABBC5F2B01 for ; Fri, 13 Oct 2023 14:50:29 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org EABBC5F2B01 Received: by smtp60.i.mail.ru with esmtpa (envelope-from ) id 1qrGgW-00Ez6J-1i; Fri, 13 Oct 2023 14:50:28 +0300 Date: Fri, 13 Oct 2023 14:50:28 +0300 To: Sergey Kaplun Cc: Maxim Kokryashkin , tarantool-patches@dev.tarantool.org Message-ID: References: <839195a6a6ff32a49c5a690a9f0101cc741901fd.1695968227.git.m.kokryashkin@tarantool.org> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Mailru-Src: smtp X-7564579A: 78E4E2B564C1792B X-77F55803: 4F1203BC0FB41BD93E991D2DB989D7D377E144062AB4C7705B67E90667497A57182A05F538085040F64EC8DE02105828B2D86B216CC00DD8E98CB1CF2C34278646717039FDCA1E03 X-C1DE0DAB: 0D63561A33F958A51F296569547CB2AB021F25248483DCA7768D43959CACA240F87CCE6106E1FC07E67D4AC08A07B9B0CE135D2742255B35CB5012B2E24CD356 X-C8649E89: 1C3962B70DF3F0AD5177F0B940C8B66ECE892A7B2722663E91682638B966EB3F662256BEEFA9527F69DCB3559C6DC2B8748FFA2D7775ACD3914E948823A6A9B940602CFA29137DD8F13499238A2BC08F37FD76D11AF80A0DC343AB5E4838688604A4CAE6DBB4CC44EA455F16B58544A2557BDE0DD54B3590965026E5D17F6739C77C69D99B9914278E50E1F0597A6FD5CD72808BE417F3B9E0E7457915DAA85F X-D57D3AED: 3ZO7eAau8CL7WIMRKs4sN3D3tLDjz0dLbV79QFUyzQ2Ujvy7cMT6pYYqY16iZVKkSc3dCLJ7zSJH7+u4VD18S7Vl4ZUrpaVfd2+vE6kuoey4m4VkSEu530nj6fImhcD4MUrOEAnl0W826KZ9Q+tr5ycPtXkTV4k65bRjmOUUP8cvGozZ33TWg5HZplvhhXbhDGzqmQDTd6OAevLeAnq3Ra9uf7zvY2zzsIhlcp/Y7m53TZgf2aB4JOg4gkr2biojBYtV3BjG6I7QuhPOrDtpIQ== X-Mailru-Sender: 11C2EC085EDE56FA38FD4C59F7EFE407D176F5A0B62593FD7B0A2240D4A36F4076748896E63B1A09D51284F0FE6F529ABC7555A253F5B200DF104D74F62EE79D27EC13EC74F6107F4198E0F3ECE9B5443453F38A29522196 X-Mras: OK Subject: Re: [Tarantool-patches] [PATCH luajit v2 5/5] Restore cur_L for specific Lua/C API use case. X-BeenThere: tarantool-patches@dev.tarantool.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Maxim Kokryashkin via Tarantool-patches Reply-To: Maxim Kokryashkin Errors-To: tarantool-patches-bounces@dev.tarantool.org Sender: "Tarantool-patches" Hi, Sergey! Thanks for the review! See my replies below. Fixed your comments, the branch is force-pushed. On Tue, Oct 10, 2023 at 11:57:22AM +0300, Sergey Kaplun via Tarantool-patches wrote: > Hi, Maxim! > Thanks for the patch! > LGTM, after fixing comments below. > > I suppose that this patch should be the first in the patch series > since, otherwise, the test will > fail on the corresponding architectures before this commit. Moved > > AFAIR, there are still some API misusages in the Tarantool Lua C code. > I suppose they should be fixed first before this patch is merged. > Please correct me if I am wrong. Yep, but there are two important aspects: 1. This patch actually make this misuse legal now, so it is questionable, whether it should be fixed or not. 2. The issue in tarantool is more broad -- Lua coroutines and C fibers are switched completely unaware of one another, which is not a great design choice. To provide 100% correct switching, LuaJIT API changes are required, which should be discussed separately. > > On 29.09.23, Maxim Kokryashkin wrote: > > From: Mike Pall > > > > Thanks to Peter Cawley. > > > > (cherry-picked from commit e86990f7f24a94b0897061f25a84547fe1108bed) > > > > Consider the following Lua C API function: > > > > ``` > > static int error_after_coroutine_return(lua_State *L) > > { > > lua_State *innerL = lua_newthread(L); > > luaL_loadstring(innerL, "print('inner coro')"); > > lua_pcall(innerL, 0, 0, 0); > > luaL_error(L, "my fancy error"); > > return 0; > > } > > ``` > > > > And the following Lua script: > > ``` > > local libcur_L = require('libcur_L') > > > > local function onesnap_f(var) > > if var then > > return 1 > > else > > return 0 > > end > > end > > > > -- Compile function to trace with snapshot. > > if jit then jit.opt.start('hotloop=1') end > > onesnap_f(true) > > onesnap_f(true) > > > > local r, s = pcall(libcur_L.error_after_coroutine_return) > > onesnap_f(false) > > ``` > > Minor: I prefer to describe the test case in words but not copy-paste it > from tests. I suppose it will be enough to mention the test case for > details. Fixed! > > > > > This is the only case when `cur_L` is not restored, according to > > the analysis done in https://github.com/LuaJIT/LuaJIT/issues/1066. > > > > This patch changes the error-catching routine, so now the patch > > sets the actual cur_L there. > > Now it is possible to throw errors on non-executing coroutines, > > which is a violation of the Lua C API. So, even though it is now > > possible, that behavior should be avoided anyway. > > It's worth mentioning the analysis done in the 1066 that there is no > need to patch case FRAME_CP because all work is done inside the VM, when > returning from the C function called via `lua_cpcall()`. > You may recap the analisys from 1066 here also to simplify reading of > the patch. > > > > > Maxim Kokryashkin: > > * added the description and the test for the problem > > > > Resolves tarantool/tarantool#6323 > > Missed label: > | Part of tarantool/tarantool#9145 Fixed! Here is the new commit message: === Restore cur_L for specific Lua/C API use case. Thanks to Peter Cawley. (cherry-picked from commit e86990f7f24a94b0897061f25a84547fe1108bed) There is a case of valid Lua C API usage when non-throwing code is executed on coroutine, other than one that an entry-point C function was called with, which results in a segmentation fault because cur_L is not being restored correspondingly. For a comprehensive example, see the test case added within this patch. According to the analysis done in lj-1066, to hit this issue, we first need to call into a C function. That C function needs to change cur_L by calling lua_resume/lua_p?call, and then the only problematic path is leaving the C function by throwing an error, which is caught by the fast-function x?pcall. When x?pcall then returns to its caller, we can be executing bytecode in the VM with an incorrect cur_L, and things will go badly if we enter a trace before correcting cur_L. The fix proposed in lj-740 was to set cur_L on error throw; however, the analysis from lj-1066 suggests that it can be set on the catching phase. This patch changes the error-catching routine, so now the patch sets the actual cur_L there. Now it is possible to throw errors on non-executing coroutines, which is a violation of the Lua C API. So, even though it is now possible, that behavior should be avoided anyway. Maxim Kokryashkin: * added the description and the test for the problem Resolves tarantool/tarantool#6323 Part of tarantool/tarantool#9145 === > > > --- > > src/lj_err.c | 5 ++- > > test/tarantool-tests/CMakeLists.txt | 1 + > > ...-fix-cur_L-after-coroutine-resume.test.lua | 32 +++++++++++++++++++ > > .../CMakeLists.txt | 1 + > > .../libcur_L_coroutine.c | 22 +++++++++++++ > > 5 files changed, 60 insertions(+), 1 deletion(-) > > create mode 100644 test/tarantool-tests/lj-1066-fix-cur_L-after-coroutine-resume.test.lua > > create mode 100644 test/tarantool-tests/lj-1066-fix-cur_L-after-coroutine-resume/CMakeLists.txt > > create mode 100644 test/tarantool-tests/lj-1066-fix-cur_L-after-coroutine-resume/libcur_L_coroutine.c > > > > diff --git a/src/lj_err.c b/src/lj_err.c > > > > > diff --git a/test/tarantool-tests/CMakeLists.txt b/test/tarantool-tests/CMakeLists.txt > > index c15d6037..d84072e0 100644 > > --- a/test/tarantool-tests/CMakeLists.txt > > +++ b/test/tarantool-tests/CMakeLists.txt > > > > > diff --git a/test/tarantool-tests/lj-1066-fix-cur_L-after-coroutine-resume.test.lua b/test/tarantool-tests/lj-1066-fix-cur_L-after-coroutine-resume.test.lua > > new file mode 100644 > > index 00000000..3919ae23 > > --- /dev/null > > +++ b/test/tarantool-tests/lj-1066-fix-cur_L-after-coroutine-resume.test.lua > > @@ -0,0 +1,32 @@ > > +local tap = require('tap') > > +local test = tap.test('lj-1066-fix-cur_L-after-coroutine-resume'):skipcond({ > > + ['Test requires JIT enabled'] = not jit.status(), > > +}) > > + > > +test:plan(1) > > + > > +local libcur_L_coroutine = require('libcur_L_coroutine') > > + > > +local function cbool(cond) > > + if cond then > > + return 1 > > + else > > + return 0 > > + end > > +end > > + > > +-- Compile function to trace with snapshot. > > +jit.opt.start('hotloop=1') > > +-- First call makes `cbool()` hot enough to be recorded next time. > > +cbool(true) > > +-- Second call records `cbool()` body (i.e. `if` branch). This is > > +-- a root trace for `cbool()`. > > +cbool(true) > > + > > +local res = pcall(libcur_L_coroutine.error_after_coroutine_return) > > +assert(res == false, "return from error") > > Nit: Please, use single quotes here. Fixed! > > > +-- Call with restoration from a snapshot with wrong cur_L. > > +cbool(false) > > + > > +test:ok(true) > > +test:done(true) > > diff --git a/test/tarantool-tests/lj-1066-fix-cur_L-after-coroutine-resume/CMakeLists.txt b/test/tarantool-tests/lj-1066-fix-cur_L-after-coroutine-resume/CMakeLists.txt > > > > > diff --git a/test/tarantool-tests/lj-1066-fix-cur_L-after-coroutine-resume/libcur_L_coroutine.c b/test/tarantool-tests/lj-1066-fix-cur_L-after-coroutine-resume/libcur_L_coroutine.c > > new file mode 100644 > > index 00000000..7a71d0f0 > > --- /dev/null > > +++ b/test/tarantool-tests/lj-1066-fix-cur_L-after-coroutine-resume/libcur_L_coroutine.c > > @@ -0,0 +1,22 @@ > > +#include "lua.h" > > +#include "lauxlib.h" > > + > > +static int error_after_coroutine_return(lua_State *L) > > +{ > > + lua_State *innerL = lua_newthread(L); > > + luaL_loadstring(innerL, "print('inner coro')"); > > There is no need for debug printing here, just "return" in enough, I > suppose. > Fixed! > > + lua_pcall(innerL, 0, 0, 0); > > + luaL_error(L, "my fancy error"); > > + return 0; > > +} > > + > > +static const struct luaL_Reg libcur_L_coroutine[] = { > > + {"error_after_coroutine_return", error_after_coroutine_return}, > > + {NULL, NULL} > > +}; > > + > > +LUA_API int luaopen_libcur_L_coroutine(lua_State *L) > > +{ > > + luaL_register(L, "libcur_L_coroutine", libcur_L_coroutine); > > + return 1; > > +} > > -- > > 2.42.0 Here is the diff: === diff --git a/test/tarantool-tests/lj-1066-fix-cur_L-after-coroutine-resume.test.lua b/test/tarantool-tests/lj-1066-fix-cur_L-after-coroutine-resume.test.lua index 3919ae23..7f3739bf 100644 --- a/test/tarantool-tests/lj-1066-fix-cur_L-after-coroutine-resume.test.lua +++ b/test/tarantool-tests/lj-1066-fix-cur_L-after-coroutine-resume.test.lua @@ -24,7 +24,7 @@ cbool(true) cbool(true) local res = pcall(libcur_L_coroutine.error_after_coroutine_return) -assert(res == false, "return from error") +assert(res == false, 'return from error') -- Call with restoration from a snapshot with wrong cur_L. cbool(false) diff --git a/test/tarantool-tests/lj-1066-fix-cur_L-after-coroutine-resume/libcur_L_coroutine.c b/test/tarantool-tests/lj-1066-fix-cur_L-after-coroutine-resume/libcur_L_coroutine.c index 7a71d0f0..39d90e18 100644 --- a/test/tarantool-tests/lj-1066-fix-cur_L-after-coroutine-resume/libcur_L_coroutine.c +++ b/test/tarantool-tests/lj-1066-fix-cur_L-after-coroutine-resume/libcur_L_coroutine.c @@ -4,7 +4,7 @@ static int error_after_coroutine_return(lua_State *L) { lua_State *innerL = lua_newthread(L); - luaL_loadstring(innerL, "print('inner coro')"); + luaL_loadstring(innerL, "return"); lua_pcall(innerL, 0, 0, 0); luaL_error(L, "my fancy error"); return 0; === > > > > -- > Best regards, > Sergey Kaplun