From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from [87.239.111.99] (localhost [127.0.0.1]) by dev.tarantool.org (Postfix) with ESMTP id 485D92ACC9; Fri, 11 Nov 2022 15:21:48 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 485D92ACC9 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev; t=1668169308; bh=CMnOLIsd1uzhUbXFAs+xdfCREzSIpIf2YKFGhjAbYB0=; h=Date:To:References:In-Reply-To:Subject:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=hf9WcSk5cleLUOR54ugbxhUG4h3TQtrQIfI9S32ehz51jN5zthSlHuBr8nkzv86ig KTjapXaFyvg5g07F2/YvAm5qULDoxeUpustCZW0DdPpzvlQIwbMt0C2WmLFGT98Db3 xHwjm0KzT1zTOMuB63PrndnQy4YtLt0oq+ioPOvM= Received: from smtp44.i.mail.ru (smtp44.i.mail.ru [94.100.177.104]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id C74422ACC1 for ; Fri, 11 Nov 2022 15:21:46 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org C74422ACC1 Received: by smtp44.i.mail.ru with esmtpa (envelope-from ) id 1otT2X-00032Y-Ni; Fri, 11 Nov 2022 15:21:46 +0300 Date: Fri, 11 Nov 2022 15:18:40 +0300 To: Igor Munkin Message-ID: References: <20221109174948.10952-1-skaplun@tarantool.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Mailru-Src: smtp X-4EC0790: 10 X-7564579A: B8F34718100C35BD X-77F55803: 4F1203BC0FB41BD9D880D530162779F186FAE211D4D51FAF3D23BC449FA949B1182A05F5380850402DA38CF34F9FF5A274DA79CAEB23F37A95F928F82B3DE7B1AABB81C9B359CA4A X-7FA49CB5: FF5795518A3D127A4AD6D5ED66289B5278DA827A17800CE7F16C4DE526EFCC04EA1F7E6F0F101C67BD4B6F7A4D31EC0BCC500DACC3FED6E28638F802B75D45FF8AA50765F79006379347C0682FC030B08638F802B75D45FF36EB9D2243A4F8B5A6FCA7DBDB1FC311F39EFFDF887939037866D6147AF826D8BADB871D86690E28BB16A0962F75FD98117882F4460429724CE54428C33FAD305F5C1EE8F4F765FC3A703B70628EAD7BA471835C12D1D9774AD6D5ED66289B52BA9C0B312567BB23117882F4460429728776938767073520599709FD55CB46A6C26CFBAC0749D213D2E47CDBA5A96583BA9C0B312567BB2376E601842F6C81A19E625A9149C048EEFAD5A440E159F97D9935A1E27F592749D8FC6C240DEA7642DBF02ECDB25306B2B78CF848AE20165D0A6AB1C7CE11FEE3093C2F12201C912ABA3038C0950A5D36B5C8C57E37DE458B330BD67F2E7D9AF16D1867E19FE14079C09775C1D3CA48CF3D321E7403792E342EB15956EA79C166A417C69337E82CC275ECD9A6C639B01B78DA827A17800CE78DD9044B304389D4731C566533BA786AA5CC5B56E945C8DA X-C8649E89: 4E36BF7865823D7055A7F0CF078B5EC49A30900B95165D341C998A3771F041539B13CC984BA6F335AF4A90C955B7D09BC0BAD2738F260FD3D2425CFE0D1EAB691D7E09C32AA3244C456F3A896B6C02D7DD76384055AD386839C99C45E8D137E9FACE5A9C96DEB163 X-D57D3AED: 3ZO7eAau8CL7WIMRKs4sN3D3tLDjz0dLbV79QFUyzQ2Ujvy7cMT6pYYqY16iZVKkSc3dCLJ7zSJH7+u4VD18S7Vl4ZUrpaVfd2+vE6kuoey4m4VkSEu530nj6fImhcD4MUrOEAnl0W826KZ9Q+tr5ycPtXkTV4k65bRjmOUUP8cvGozZ33TWg5HZplvhhXbhDGzqmQDTd6OAevLeAnq3Ra9uf7zvY2zzsIhlcp/Y7m53TZgf2aB4JOg4gkr2biojSqkc8Nj21UOMUAlUgJU36w== X-Mailru-Sender: 87D5297B137A96FE4AECA001A08BCABA1C857C14915AB75DB94877F27ACAF52E64C07447216EFC9D525762887713E5F1475755348978188EF9D3679FA3DE6E791CC59163FFD68303B0DAF586E7D11B3E67EA787935ED9F1B X-Mras: Ok Subject: Re: [Tarantool-patches] [PATCH luajit] Ensure correct stack top for OOM error message. X-BeenThere: tarantool-patches@dev.tarantool.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Sergey Kaplun via Tarantool-patches Reply-To: Sergey Kaplun Cc: tarantool-patches@dev.tarantool.org Errors-To: tarantool-patches-bounces@dev.tarantool.org Sender: "Tarantool-patches" Hi, Igor! Thanks for the review! Welcome back to the ML! :) Updated remote branch considering your comments. On 11.11.22, Igor Munkin wrote: > Sergey, > > Thanks for the patch! > > On 09.11.22, Sergey Kaplun wrote: > > From: Mike Pall > > > > Reported by Sergey Kaplun. > > > > (cherry picked from commit ca8d3257bb44e42100c7910c47dcdcf01f494187) > > > > `lj_err_mem()` doesn't set up `L->top` for Lua frames, but uses it for > > I believe it "doesn't set `L->top` for the current Lua frame, but...", > doesn't it? Fixed, thanks. > > > pushing error message on the stack. So, when we call some routine that > > Typo: s/some routine/a routine/ or s/some routine/arbitrary routine/. Fixed. > > > does some allocations, it can raise the OOM error (like `lj_tab_dup()` > > Strictly saying it's LUA_ERRMEM, not OOM. If you say about naming. Replaced. > > > in `BC_TDUP`) and this error may corrupt stack for unwind in situations > > when `L->top` < `L->base`. > > > > This patch restores `L->top` for Lua frames when raise the error via > > Typo: s/raise the error/the error is raised/. Fixed. > > > `lj_err_mem()`. > > > > Sergey Kaplun: > > * added the description and the test for the problem > > > > Resolves tarantool/tarantool#3840 > > Part of tarantool/tarantool#7230 > > --- > > > > Issues: > > * https://github.com/LuaJIT/LuaJIT/issues/906 > > * https://github.com/tarantool/tarantool/issues/7230 > > * https://github.com/tarantool/tarantool/issues/3840 > > PR: https://github.com/tarantool/tarantool/pull/7915 > > Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-906-fix-err-mem > > > > Red LuaJIT CI for MacOS Release builds is a known issue with self-hosted > > runners, as Igor has said before. > > Everything is fixed at the moment, please try to rebase. Rebased. It works now! Thanks! > > > > > src/lj_err.c | 1 + > > .../lj-906-fix-err-mem.test.lua | 90 +++++++++++++++++++ > > 2 files changed, 91 insertions(+) > > create mode 100644 test/tarantool-tests/lj-906-fix-err-mem.test.lua > > > > > > > diff --git a/test/tarantool-tests/lj-906-fix-err-mem.test.lua b/test/tarantool-tests/lj-906-fix-err-mem.test.lua > > new file mode 100644 > > index 00000000..a139e1c9 > > --- /dev/null > > +++ b/test/tarantool-tests/lj-906-fix-err-mem.test.lua > > @@ -0,0 +1,90 @@ > > +local tap = require('tap') > > +local ffi = require('ffi') > > +local table_new = require('table.new') > > + > > +-- Avoid test to be killed. > > +require('utils').skipcond(ffi.abi('gc64'), 'test is not GC64 only') > > + > > +local test = tap.test('lj-906-fix-err-mem') > > +test:plan(1) > > + > > +local KB = 1024 > > +local MB = 1024 * KB > > + > > +-- The maximum available table size, taking into account created > > +-- constants for one function. > > +local TNEW_SIZE = 511 > > + > > +local gc_anchor = {} > > + > > +-- This function works until raises the error. > > +local function eat_chunks(size) > > + -- Need raise the OOM error inside TDUP, not TNEW, so reserve > > + -- memory for it. > > + -- luacheck: no unused > > + local tnew_anchor = table_new(TNEW_SIZE, 0) > > + while true do > > + table.insert(gc_anchor, ffi.new('char [?]', size)) > > + end > > +end > > + > > +-- Function to format inner tab leading to TDUP emitting. > > +local function format_inner_tab() > > Minor: Maybe instead of ? > Just asking, feel free to ignore. Fixed. > > > + local inner_tab = '' > > + local inner_depth = 128 > > Why 128? Please, drop a few words. Added. > > > + -- Repeate table template for TDUP. > > + for _ = 1, inner_depth do > > + inner_tab = inner_tab .. '{a =' > > + end > > + inner_tab = inner_tab .. '{}' > > + for _ = 1, inner_depth do > > + inner_tab = inner_tab .. '},' > > + end > > + return inner_tab > > +end > > + > > Minor: Same (also think about moving these helpers to utils.*). > Maybe instead of ? IMHO, it's too specific to use this function test-wide, so I just keep it as is. If we need similar functionality somewhere else, we can move this function to later. So ignore here. > And > mention TDUP specifics in the comment. Fixed. > > Just asking too, feel free to ignore. > > > +local function format_TDUP_chunk() > > + local big_tab = 'local _ = {\n' > > + local inner_tab = format_inner_tab() > > + for _ = 1, TNEW_SIZE do > > + big_tab = big_tab .. inner_tab .. '\n' > > + end > > + big_tab = big_tab .. '}' > > + return big_tab > > +end > > + > > +local TDUP, err = loadstring(format_TDUP_chunk()) > > +assert(TDUP, err) > > + > > +local function frame_before_TDUP() > > + -- Stack slots are needed for coredump in case of misbehaviour. > > Why are they needed? Please drop few more words regarding this. Hmm, I obscure coredump without them too (because of rewriting the other stack slot). But the frame before TDUP is really needed to rewrite pcall frame. Fixed, and added the comment. All minor patches about comments above are done via the following iterative patch: =================================================================== diff --git a/test/tarantool-tests/lj-906-fix-err-mem.test.lua b/test/tarantool-tests/lj-906-fix-err-mem.test.lua index f512e802..0ec1b30e 100644 --- a/test/tarantool-tests/lj-906-fix-err-mem.test.lua +++ b/test/tarantool-tests/lj-906-fix-err-mem.test.lua @@ -12,7 +12,7 @@ local KB = 1024 local MB = 1024 * KB -- The maximum available table size, taking into account created --- constants for one function. +-- constants for one function and nested level. local TNEW_SIZE = 511 local gc_anchor = {} @@ -29,9 +29,8 @@ local function eat_chunks(size) end -- Function to format inner tab leading to TDUP emitting. -local function format_inner_tab() +local function make_flat_table(inner_depth) local inner_tab = '' - local inner_depth = 128 -- Repeate table template for TDUP. for _ = 1, inner_depth do inner_tab = inner_tab .. '{a =' @@ -43,9 +42,14 @@ local function format_inner_tab() return inner_tab end +-- The `lj_err_mem()` doesn't fix `L->top`, when called from +-- helper function (like `lj_tab_dup()`) before the patch. +-- This function creates a chunk with many BC_TDUP inside. local function format_TDUP_chunk() local big_tab = 'local _ = {\n' - local inner_tab = format_inner_tab() + -- The maximum available table size, taking into account created + -- constants for one function and nested level. + local inner_tab = make_flat_table(128) for _ = 1, TNEW_SIZE do big_tab = big_tab .. inner_tab .. '\n' end @@ -56,12 +60,10 @@ end local TDUP, err = loadstring(format_TDUP_chunk()) assert(TDUP, err) +-- Function to create the additional frame after to be rewrited +-- one in case of `lj_err_mem()` misbehaviour. local function frame_before_TDUP() - -- Stack slots are needed for coredump in case of misbehaviour. - -- luacheck: no unused - local frame_slot1, frame_slot2 TDUP() - return frame_slot1, frame_slot2 end collectgarbage() =================================================================== > > > + -- luacheck: no unused > > + local frame_slot1, frame_slot2 > > + TDUP() > > + return frame_slot1, frame_slot2 > > +end > > + > > Minor: I believe you can pack these two lines into something one can > call : > | local function janitor() > | collectgarbage('collect') > | collectgarbage('stop') > | end > > Anyway, you can leave this as is. Added, thanks! See the iterative patch below. =================================================================== diff --git a/test/tarantool-tests/lj-906-fix-err-mem.test.lua b/test/tarantool-tests/lj-906-fix-err-mem.test.lua index 519e6bf6..7dbc1614 100644 --- a/test/tarantool-tests/lj-906-fix-err-mem.test.lua +++ b/test/tarantool-tests/lj-906-fix-err-mem.test.lua @@ -66,8 +66,12 @@ local function frame_before_TDUP() TDUP() end -collectgarbage() -collectgarbage('stop') +local function janitor() + collectgarbage('collect') + collectgarbage('stop') +end + +janitor() -- Avoid OOM on traces. jit.off() @@ -75,14 +79,11 @@ jit.off() -- Stack slots are needed for coredump in case of misbehaviour. -- luacheck: no unused local r, e = pcall(eat_chunks, 8 * MB) -collectgarbage() -collectgarbage('stop') +janitor() pcall(eat_chunks, 8 * KB) -collectgarbage() -collectgarbage('stop') +janitor() pcall(eat_chunks, 8) -collectgarbage() -collectgarbage('stop') +janitor() pcall(frame_before_TDUP) =================================================================== > > > +collectgarbage() > > +collectgarbage('stop') > > + > > +-- Avoid OOM on traces. > > +jit.off() > > + > > +-- Stack slots are needed for coredump in case of misbehaviour. > > +-- luacheck: no unused > > +local r, e = pcall(eat_chunks, 8 * MB) > > +collectgarbage() > > +pcall(eat_chunks, 8 * KB) > > +collectgarbage() > > +pcall(eat_chunks, 8) > > +collectgarbage() > > + > > +pcall(frame_before_TDUP) > > Do we need to check the status of the above? I suppose no. As we disscussed offline the test may be flaky (false positive, when we can alloc all tables) due to memory allocations. Also, this part is very fragile, and we don't want to fix test later by removing this check the status of the `pcall()`. > > > + > > +-- Release memory for `tap` functions. > > +gc_anchor = nil > > +collectgarbage() > > Is this step required? I doubt. Yes it is. Without it we may get the OOM error inside `test:ok()` or `test:check()` functions (I faced it on my local machine, when rewrite this test). > > > + > > +test:ok(true, 'correctly throw memory error') > > + > > +os.exit(test:check() and 0 or 1) > > -- > > 2.34.1 > > > > -- > Best regards, > IM -- Best regards, Sergey Kaplun