From: Igor Munkin via Tarantool-patches
Date: Mon, 5 Jul 2021 00:06:47 +0300
To: Sergey Ostanevich
Subject: Re: [Tarantool-patches] [PATH luajit] GC64: fix 64-bit constant fusion
Message-ID: <20210704210647.GA6106@tarantool.org>
In-Reply-To: <804A99A3-6D0C-4DA9-A939-26FFED0EC823@tarantool.org>
References: <804A99A3-6D0C-4DA9-A939-26FFED0EC823@tarantool.org>
List-Id: Tarantool development patches
Reply-To: Igor Munkin
Cc: tarantool-patches@dev.tarantool.org

Sergos,

Thanks for the patch! Something went wrong with formatting the patch, I
guess, so I'll share the recipe I use to send backported patches:
1. cherry-pick the original patch from the upstream
2. format-patch the backported commit
3. add a "From" header with the original author of the changes at the
   very beginning of the formatted patch
4. adjust the commit message according to our backporting procedure

BTW, I've found neither the branch with the patch in tarantool/luajit
nor the branch with the green CI in tarantool/tarantool...

On 28.05.21, Sergey Ostanevich wrote:
> Author: Mike Pall
> Date: Mon Aug 28 10:43:37 2017 +0200
>
> x64/LJ_GC64: Fix fallback case of asm_fuseloadk64().
>
> Contributed by Peter Cawley.
>
> (cherry picked from commit 6b0824852677cc12570c20a3211fbfe0e4f0ce14)
>
> Code generation under LJ_GC64 missed an update to the mcode area after

Minor: s/mcode area/mcode area boundaries/. Feel free to ignore.

> a 64bit constant encoding. This lead to a corruption to the constant

Typo: s/corruption to/corruption of/.

> later on.
> The problem is rather rare, since there should be big enough (4GiB)
> distance from the currently allocated mcode to the dispatch pointer.

Minor: Sorry for being pedantic, but this is not such a trivial bug, so
I've described the semantics around it a bit. I think it's worth
mentioning the four possible encodings of a 64-bit constant on the
trace.
0. If the address of the constant fits into 32 bits, then encode it as
   a 32-bit displacement (the only option for non-GC64 mode).
1. If the offset of the constant slot from the dispatch table (pinned
   to r14, which is not changed during trace execution) fits into
   32 bits, then encode it as a 32-bit displacement relative to r14.
2. If the offset of the constant slot from the mcode (i.e. rip) fits
   into 32 bits, then encode it as a 32-bit displacement relative to
   rip (considering long mode specifics and the RID_RIP hack).
3. If none of the conditions above holds, the compiler materializes
   the 64-bit constant right at the trace bottom and encodes it the
   same way as in the previous case.

With this in place, your part regarding the 4GiB distance and the rarity
of this case follows, and the problem is much clearer to the reader.
Please adjust this part using my description above.

> This lead to a number of flaky tests, trackers are addressed.
>
> Sergey Ostanevich:
> * added the description and the test for the problem
>
> Closes: #4095, #4199, #4614

First, please split this line into separate ones with a single issue
per line. Also, please don't use ':' in GitHub tags: it doesn't follow
our commit message style. Unfortunately, this commit tags no issue,
since it is pushed to the tarantool/luajit repo, but the issues belong
to tarantool/tarantool. Hence, you need to mention the latter repo
explicitly. And last (but not least), we agreed with Sergey to use
"Resolves" instead of "Closes" in the tarantool/luajit repo and to use
"Closes" in the corresponding patches bumping the LuaJIT submodule. As
for these patches, I'd rather use "Fixes", since you have checked that
the failures related to the mentioned issues are gone.
Considering everything above, the line above transforms into the
following:
| Fixes tarantool/tarantool#4095
| Fixes tarantool/tarantool#4199
| Fixes tarantool/tarantool#4614

>
> Signed-off-by: Sergey Ostanevich
>
> diff --git a/src/lj_asm_x86.h b/src/lj_asm_x86.h
> index 767bf6f3..2850aea9 100644
> --- a/src/lj_asm_x86.h
> +++ b/src/lj_asm_x86.h
> @@ -387,6 +387,7 @@ static Reg asm_fuseloadk64(ASMState *as, IRIns *ir)
> ir->i = (int32_t)(as->mctop - as->mcbot);
> as->mcbot += 8;
> as->mclim = as->mcbot + MCLIM_REDZONE;
> + lj_mcode_commitbot(as->J, as->mcbot);
> }
> as->mrm.ofs = (int32_t)mcpofs(as, as->mctop - ir->i);
> as->mrm.base = RID_RIP;
> diff --git a/test/tarantool-tests/gh-4199-gc64-flaky.test.lua b/test/tarantool-tests/gh-4199-gc64-flaky.test.lua
> new file mode 100644
> index 00000000..3ac30427
> --- /dev/null
> +++ b/test/tarantool-tests/gh-4199-gc64-flaky.test.lua
> @@ -0,0 +1,63 @@

Regarding test code style: I don't want to point out each place where
you have violated it, so I just refer to the document[1] that we try to
follow, with one exception: we use 2 spaces for indentation to keep the
tests closer to the LuaJIT sources. Hence, please consider that I've
"implicitly" commented on every spot below where you don't follow our
code style.

> +-- the test is GC64 only
> +local ffi=require('ffi')
> +require('utils').skipcond(not ffi.abi('gc64'), 'test is GC64 only')
> +
> +local tap = require("tap")
> +local test = tap.test("gh-4199-gc64-flaky")
> +test:plan(1)
> +
> +-- first - we have to make a gap from current JIT infra to next
> +-- available mappable memory
> +-- most efficient is to grab it per-page
> +
> +
> +ffi.cdef('void * mmap(void *start, size_t length, int prot , int flags, int fd, long offset);')
> +ffi.cdef('long getpagesize();')

Minor: It's better to call ffi.cdef once with multiline declarations
instead of making several calls.
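For reference, here is roughly how the mmap prototype declared above is used from plain C with named constants. This is only a sketch: `reserve_page` is a hypothetical helper, not part of the patch. On Linux, `MAP_PRIVATE | MAP_ANONYMOUS` evaluates to 0x22 and `PROT_NONE` to 0; other platforms may use different values. Portable code passes -1 as the descriptor for anonymous mappings.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc in strict -std= modes */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
** Hypothetical helper: reserve a single inaccessible page of anonymous
** memory. Returns NULL on failure and stores the page size on success.
*/
static void *reserve_page(size_t *pagesize_out)
{
  long pagesize = sysconf(_SC_PAGESIZE);
  void *p;
  if (pagesize <= 0)
    return NULL;
  /* PROT_NONE: the page only occupies address space, it is never accessed. */
  p = mmap(NULL, (size_t)pagesize, PROT_NONE,
           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  /* mmap reports failure with MAP_FAILED, i.e. (void *)-1, not with NULL. */
  if (p == MAP_FAILED)
    return NULL;
  if (pagesize_out)
    *pagesize_out = (size_t)pagesize;
  return p;
}
```

Note that mmap signals an error with MAP_FAILED rather than a null pointer, so a check against 0 does not detect a failed mapping.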
> +
> +local pagesize = tonumber(ffi.C.getpagesize())
> +local blob = {}
> +for i=1, 4e9/pagesize do
> + blob[i] = ffi.C.mmap(ffi.cast('void*',0), pagesize, 0, 0x22, 0, 0)

0x22, 4e9 -- these are magic numbers to me. Please create a variable
with a descriptive name (and even a comment, I guess) for each of them,
to ease further maintenance.

> + assert(blob[i] ~= 0)
> +end
> +
> +-- try to chomp all memory in currently allocated gc space
> +collectgarbage('stop')

Since you decided to stop GC, you can do it right at the very
beginning, so you don't need to anchor all the mmaped memory above.

> +local dummy={'a'}
> +for i=2,30 do
> + dummy[i] = dummy[i - 1] .. dummy[i - 1]
> +end

Why do you need this loop?

> +
> +-- generate a bunch of functions and keep them stored to trigger wrong constant placement
> +
> +local s={}

Again, since GC is stopped, there is no need to anchor all the
functions generated below.

> +local pass = true
> +
> +jit.opt.start('hotloop=1’)
> +for n=1,100 do
> + local src='function f'.. n .. [[(x,y,z,f,g,h,j,k,r,c,d)
> + local a={}
> + for i=1,1e6 do
> + a[i] = x + y + z + f + g + h + j + k + r + c + d
> + if (x > 0) then a[i] = a[i] + 1.1 end
> + if (c > 0) then a[i] = a[i] + 2.2 end
> + if (z > 0) then a[i] = a[i] + 3.3 end
> + if (f > 0) then a[i] = a[i] + 4.4 end
> + x=x+r
> + y=y-c
> + z=z+d
> + end
> + return a[1]
> + end
> + return f]] .. n ..'(...)'
> +
> + s[n] = assert(load(src))
> + local res1 = s[n](1,2,3,4,5,6,7,8,9,10,11)
> + local res2 = s[n](1,2,3,4,5,6,7,8,9,10,11)

This was not obvious to me, but I've finally got it: you compare the
result yielded by the interpreter with the one yielded by the trace.
First, you don't need to run all 1e6 iterations: you have set 'hotloop'
to 1, so you need only 3 (!) iterations to successfully compile the
loop. The second call will use the compiled trace for the first loop
iteration, but will also try to compile the function itself.
AFAIU, the constant being materialized on the trace is the table
address, right? At least I see no other option, so please mention in
the comment which constant is fused and leads to the failure if the
patch is missing. Finally, I don't get why you need 100 iterations to
get the misbehaviour. Please add everything I dumped above as the
corresponding comments for this part.

> + if (res1 ~= res2) then
> + pass = false

Why don't you add the assertion about constant fusion right here?

> + break
> + end
> +end
> +
> +test:ok(pass, 'wrong IR constant fuse')

[1]: https://www.tarantool.io/en/doc/latest/dev_guide/lua_style_guide/

--
Best regards,
IM
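To make the four encodings of a 64-bit constant listed in the review easier to follow, here is a minimal C sketch of the decision order. It is not LuaJIT code: the `K64Encoding` labels, `fits_i32` and `pick_k64_encoding` are hypothetical names, addresses are passed as plain signed 64-bit integers (assumed to be user-space addresses, so the subtractions cannot overflow), and the real compiler checks more than a single offset per case.

```c
#include <stdint.h>

/* Hypothetical labels for the four cases; these are not LuaJIT identifiers. */
typedef enum {
  K64_ABS32,      /* 0: the address itself fits into 32 bits */
  K64_DISPATCH,   /* 1: 32-bit displacement relative to the dispatch table (r14) */
  K64_RIP,        /* 2: 32-bit displacement relative to the mcode (rip) */
  K64_MCODE_COPY  /* 3: constant copied to the trace bottom, then rip-relative */
} K64Encoding;

/* True if x is representable as a signed 32-bit displacement. */
static int fits_i32(int64_t x)
{
  return x >= INT32_MIN && x <= INT32_MAX;
}

/*
** Pick the encoding for a constant stored at kaddr, given the addresses
** of the dispatch table and of the current machine code.
*/
static K64Encoding pick_k64_encoding(int64_t kaddr, int64_t dispatch,
                                     int64_t mcode)
{
  if (fits_i32(kaddr))
    return K64_ABS32;
  if (fits_i32(kaddr - dispatch))
    return K64_DISPATCH;
  if (fits_i32(kaddr - mcode))
    return K64_RIP;
  /* Fallback: the constant has to be materialized next to the mcode. */
  return K64_MCODE_COPY;
}
```

The last branch is the rare case the patch is about: it is reached only when the constant is more than a 32-bit displacement away from both the dispatch table and the mcode.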