Date: Mon, 16 Aug 2021 10:20:07 +0300
From: Igor Munkin via Tarantool-patches
Reply-To: Igor Munkin
To: Sergey Kaplun
Cc: tarantool-patches@dev.tarantool.org
Message-ID: <20210816072007.GR27855@tarantool.org>
References: <20210719073632.12008-1-skaplun@tarantool.org> <20210801104318.GZ27855@tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH luajit] Fix bytecode register allocation for comparisons.

Sergey,

Thanks for the explanation! Please consider the new comments below.

On 01.08.21, Sergey Kaplun wrote:
> Hi, Igor!
>
> Thanks for the review!
>
> On 01.08.21, Igor Munkin wrote:
> > Sergey,
> >
> > Thanks for the patch! Please consider the comments below. I didn't check
> > the test yet, since I don't get the JIT peculiarities from your commit
> > message and comments. Please provide a clearer description and I'll
> > proceed with the review of the test case then.
> >
> > On 19.07.21, Sergey Kaplun wrote:
> > > From: Mike Pall
> > >
> > > (cherry picked from commit 2f3f07882fb4ad9c64967d7088461b1ca0a25d3a)
> > >
> > > When LuaJIT is build with LJ_FR2 (GC64), information about frame takes
> > > two slots -- the first takes the TValue with the function to call, the
> > > second takes the additional frame information. The recording JIT
> >
> > Minor: The second slot is the framelink in LuaJIT terms.
>
> Yes, because it takes the additional frame information. How do you want
> to modify this line?

Just say that the second slot holds the framelink; that is concise enough.

>
> >
> > > machinery works pretty the same -- the function IR_KGC is loaded in the
> > > first slot, and the second is set to TREF_FRAME value. This value
> > > should be rewritten after return from a callee. It is done either by the
> > > return values either this slot is cleared (set to zero) manually with
> > > the next bytecode with RA dst mode with the assumption, that the dst RA
> > > takes the next slot after TREF_FRAME, i.e.
> > > an earlier instruction uses the smallest possible destination register
> > > (see `lj_record_ins()` for the details).
> >
> > The main point lies in the monstrous 5-line sentence. I've read several
> > times, but still don't get it. Could you please reword it in a not such
> > complex sentence?
>
> The first option is rewrite this slot by return values from the
> function.

And this is not the failing case, right? I mean, this approach works fine
even without the patch, doesn't it?

>
> The second option is clearing slot (i.e. set to zero) manually, when
> there is no values to return. It is done by the next bytecode having RA
> dst mode. This obliges that the destination of RA takes the next slot
> after TREF_FRAME. For this an earlier instruction must use the smallest
> possible destination register (see `lj_record_ins()` for the details).

This is the failing case, got it, thanks! So I guess it's enough to reword
the commit message along the lines of the explanation above.

>
> >
> > >
> > > Bytecode allocator swaps operands for ISGT and ISGE comparisons.

I believe this should be called "bytecode emitter" or just "frontend".

> > > When it happens, the aforementioned rule for registers allocations
> > > may be violated. When it happens, and this chunk is recording, the slot
> > > with TREF_FRAME is not rewritten (but the next empty slot after
> > > TREF_FRAME is) during bytecode recording. This leads to JIT slots
> > > inconsistency and assertion failure in `rec_check_slots()` during
> > > recording the next bytecode instruction.
> > >
> > > This patch fixes bytecode register allocation by changing the register
> > > allocation order in case of ISGT and ISGE bytecodes.
> >
> > It's better to use "virtual register" or even "VM register" to avoid
> > ambiguous plain "register" usage.
>
> Changed to VM register.
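
For the second option above, a toy C model of the slot bookkeeping may help
to see why the order of destination registers matters. This is only a sketch
and not LuaJIT code: the `Rec` structure, the `SLOT_*` tags, the helper names
and the concrete slot numbers (function in slot 0, framelink in slot 1,
comparison operands in slots 2 and 3) are assumptions made for illustration,
and the clearing rule is a simplified reading of the description above rather
than a copy of `lj_record_ins()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical slot tags; SLOT_FRAME stands in for TREF_FRAME. */
enum { SLOT_EMPTY = 0, SLOT_VALUE = 1, SLOT_FRAME = 2 };
enum { NSLOTS = 8 };

/* Hypothetical, heavily reduced recorder state. */
typedef struct {
  int slot[NSLOTS];
  int maxslot;
} Rec;

/* Assumed state right after a callee at base 0 returned no values:
 * the frame marker is still in slot 1 and maxslot is reset to 0. */
void rec_after_ret0(Rec *r)
{
  memset(r->slot, 0, sizeof(r->slot));
  r->slot[0] = SLOT_VALUE;
  r->slot[1] = SLOT_FRAME;
  r->maxslot = 0;
}

/* Record a bytecode with "dst" mode for register ra. Assumed rule:
 * when ra skips past maxslot, only the slot right below ra is cleared. */
void rec_dst(Rec *r, int ra)
{
  assert(ra >= 0 && ra < NSLOTS);
  r->slot[ra] = SLOT_VALUE;
  if (ra >= r->maxslot) {
    if (ra > r->maxslot)
      r->slot[ra - 1] = SLOT_EMPTY;
    r->maxslot = ra + 1;
  }
}

/* A frame marker left below maxslot models the slot inconsistency. */
bool rec_has_stale_frame(const Rec *r)
{
  for (int i = 0; i < r->maxslot; i++)
    if (r->slot[i] == SLOT_FRAME)
      return true;
  return false;
}

/* Load the function to call into slot 0, then write the two comparison
 * operands in the given order, and report whether the marker survived. */
bool stale_frame_after(int first_dst, int second_dst)
{
  Rec r;
  rec_after_ret0(&r);
  rec_dst(&r, 0);
  rec_dst(&r, first_dst);
  rec_dst(&r, second_dst);
  return rec_has_stale_frame(&r);
}
```

In this model `stale_frame_after(2, 3)` leaves no frame marker behind, since
the write to slot 2 clears slot 1. `stale_frame_after(3, 2)` clears slot 2
instead of slot 1, so the marker survives, which corresponds to the
inconsistency reported by `rec_check_slots()`.
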
>
> >
> > >
> > > Sergey Kaplun:
> > > * added the description and the test for the problem
> > >
> > > Resolves tarantool/tarantool#6227
> >
> > Minor: Why #5629 is not mentioned?
>
> Added.
> Branch is updated and force-pushed.
>
> >
> > > ---
> > >
> > > Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-6227-fix-bytecode-allocator-for-comp
> > > Tarantool branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-6227-fix-bytecode-allocator-for-comp
> > > Issue: https://github.com/tarantool/tarantool/issues/6227
> > >
> > >  src/lj_parse.c | 7 +++-
> > >  ...ytecode-allocator-for-comparisons.test.lua | 41 +++++++++++++++++++
> > >  2 files changed, 46 insertions(+), 2 deletions(-)
> > >  create mode 100644 test/tarantool-tests/gh-6227-bytecode-allocator-for-comparisons.test.lua
> > >
> > > diff --git a/test/tarantool-tests/gh-6227-bytecode-allocator-for-comparisons.test.lua b/test/tarantool-tests/gh-6227-bytecode-allocator-for-comparisons.test.lua
> > > new file mode 100644
> > > index 00000000..66f6885e
> > > --- /dev/null
> > > +++ b/test/tarantool-tests/gh-6227-bytecode-allocator-for-comparisons.test.lua
> > > @@ -0,0 +1,41 @@
> > > +local tap = require('tap')
> > > +local test = tap.test('gh-6227-bytecode-allocator-for-comparisons')
> > > +test:plan(1)
> > > +
> > > +-- Test file to demonstrate assertion failure during recording
> > > +-- wrong allocated bytecode for comparisons.
> > > +-- See also https://github.com/tarantool/tarantool/issues/6227.
> > > +
> > > +-- Need function with RET0 bytecode to avoid reset of
> > > +-- the first JIT slot with frame info. Also need no assignments
> > > +-- by the caller.
> > > +local function empty() end
> > > +
> > > +local uv = 0
> > > +
> > > +-- This function needs to reset register enumerating.
> > > +-- Also set `J->maxslot` to zero.

Please add the reason why J->maxslot is zero (it is initialized with nargs
in ).

> > > +-- The upvalue function to call is loaded to 0 slot.
> > > +local function bump_frame()
> > > +  -- First call function with RET0 to set TREF_FRAME in the
> > > +  -- last slot.
> > > +  empty()
> > > +  -- Test ISGE or ISGT bytecode. These bytecodes swap their
> > > +  -- operands. Also, a constant is always loaded into the slot
> > > +  -- smaller than upvalue. So, if upvalue loads before KSHORT,
> > > +  -- then the difference between registers is more than 2 (2 is
> > > +  -- needed for LJ_FR2) and TREF_FRAME slot is not rewriting by
> > > +  -- the bytecode after call and return as expected. That leads

If the constant is loaded into a slot prior to the one with the upvalue,
then how can the upvalue be loaded *before* KSHORT? How does the difference
become more than 2? I don't get this math. Furthermore, what stops you from
using local variables?

> > > +  -- to recording slots inconsistency and assertion failure at
> > > +  -- `rec_check_slots()`.
> > > +  empty(1>uv)
> > > +end
> > > +
> > > +jit.opt.start('hotloop=1')

It's worth mentioning that this JIT engine tuning makes the function compile
first, and the loop below only later. As a result, the function is not
inlined into the loop body, so the fix can be checked.

> > > +
> > > +for _ = 1,3 do

Minor: A space is missing after the comma.

> > > +  bump_frame()
> > > +end
> > > +
> > > +test:ok(true)
> > > +os.exit(test:check() and 0 or 1)
> > > --
> > > 2.31.0
> > >
> >
> > --
> > Best regards,
> > IM
>
> --
> Best regards,
> Sergey Kaplun

--
Best regards,
IM