<HTML><BODY><div>QA LGTM</div><div> </div><div> </div><div data-signature-widget="container"><div data-signature-widget="content"><div>--<br>Vitaliia Ioffe</div></div></div><div> </div><div> </div><blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;">Monday, 16 August 2021, 19:52 +03:00 from Igor Munkin via Tarantool-patches <tarantool-patches@dev.tarantool.org>:<br> <div id=""><div class="js-helper js-readmsg-msg"><div><div id="style_16291327610717057272_BODY">Sergey,<br><br>Thanks for the fixes! LGTM now.<br><br>On 16.08.21, Sergey Kaplun wrote:<br>> Igor,<br>><br>> Thanks for the review!<br>><br>> See the new comment message above:<br>> ===================================================================<br>> Fix bytecode register allocation for comparisons.<br>><br>> (cherry picked from commit 2f3f07882fb4ad9c64967d7088461b1ca0a25d3a)<br>><br>> When LuaJIT is built with LJ_FR2 (e.g. with GC64 mode enabled),<br>> information about frame takes two slots -- the first takes the TValue<br>> with the function to be called, the second takes the framelink. The JIT<br>> recording machinery does pretty the same -- the function IR_KGC is<br>> loaded in the first slot, and the second is set to TREF_FRAME value.<br>> This value should be rewritten after return from a callee. This slot is<br>> cleared either by return values or manually (set to zero), when there<br>> are no values to return. The latter case is done by the next bytecode<br>> with RA dst mode. This obliges that the destination of RA takes the next<br>> slot after TREF_FRAME. Hence, this an earlier instruction must use the<br>> smallest possible destination register (see `lj_record_ins()` for the<br>> details).<br>><br>> Bytecode emitter swaps operands for ISGT and ISGE comparisons. As a<br>> result, the aforementioned rule for registers allocations may be<br>> violated. 
When it happens for a chunk being recorded, the slot with<br>> TREF_FRAME is not rewritten (but the next empty slot after TREF_FRAME<br>> is). This leads to JIT slots inconsistency and assertion failure in<br>> `rec_check_slots()` during recording of the next bytecode instruction.<br>><br>> This patch fixes bytecode register allocation by changing the VM<br>> register allocation order in case of ISGT and ISGE bytecodes.<br>><br>> Sergey Kaplun:<br>> * added the description and the test for the problem<br>><br>> Resolves tarantool/tarantool#6227<br>> Part of tarantool/tarantool#5629<br>> ===================================================================<br>><br>> On 16.08.21, Igor Munkin wrote:<br><br><snipped><br><br>><br>> ><br>> > Furthermore, what does stop you from using local variables?<br>><br>> They occupy new slots and make it harder to maintain, see the new<br>> comment below.<br><br>Meh, OK anyway :)<br><br>><br>> ><br><br><snipped><br><br>><br>> ===================================================================<br>> diff --git a/test/tarantool-tests/gh-6227-bytecode-allocator-for-comparisons.test.lua b/test/tarantool-tests/gh-6227-bytecode-allocator-for-comparisons.test.lua<br>> index 66f6885e..9788923a 100644<br>> --- a/test/tarantool-tests/gh-6227-bytecode-allocator-for-comparisons.test.lua<br>> +++ b/test/tarantool-tests/gh-6227-bytecode-allocator-for-comparisons.test.lua<br>> @@ -14,26 +14,39 @@ local function empty() end<br>> local uv = 0<br>><br>> -- This function needs to reset register enumerating.<br>> --- Also set `J->maxslot` to zero.<br>> --- The upvalue function to call is loaded to 0 slot.<br>> +-- `J->maxslot` is initialized with `nargs` (i.e. zero in this<br>> +-- case) in `rec_call_setup()`.<br>> local function bump_frame()<br>> -- First call function with RET0 to set TREF_FRAME in the<br>> -- last slot.<br>> empty()<br>> + -- The old bytecode to be recorded looks like the following:<br>> + -- 0000 . FUNCF 4<br>> + -- 0001 . 
UGET 0 0 ; empty<br>> + -- 0002 . CALL 0 1 1<br>> + -- 0000 . . JFUNCF 1 1<br>> + -- 0001 . . RET0 0 1<br>> + -- 0002 . CALL 0 1 1<br>> + -- 0003 . UGET 0 0 ; empty<br>> + -- 0004 . UGET 3 1 ; uv<br>> + -- 0005 . KSHORT 2 1<br>> + -- 0006 . ISLT 3 2<br>> -- Test ISGE or ISGT bytecode. These bytecodes swap their<br>> - -- operands. Also, a constant is always loaded into the slot<br>> - -- smaller than upvalue. So, if upvalue loads before KSHORT,<br>> - -- then the difference between registers is more than 2 (2 is<br>> - -- needed for LJ_FR2) and TREF_FRAME slot is not rewriting by<br>> - -- the bytecode after call and return as expected. That leads<br>> - -- to recording slots inconsistency and assertion failure at<br>> - -- `rec_check_slots()`.<br>> + -- operands (consider ISLT above).<br>> + -- Two calls of `empty()` function in a row is necessary for 2<br>> + -- slot gap in LJ_FR2 mode.<br>> + -- Upvalue loads before KSHORT, so the difference between slot<br>> + -- for upvalue `empty` (function to be called) and slot for<br>> + -- upvalue `uv` is more than 2. Hence, TREF_FRAME slot is not<br>> + -- rewritten by the bytecode after return from `empty()`<br>> + -- function as expected. That leads to recording slots<br>> + -- inconsistency and assertion failure at `rec_check_slots()`.<br>> empty(1>uv)<br>> end<br>><br>> jit.opt.start('hotloop=1')<br>><br>> -for _ = 1,3 do<br>> +for _ = 1, 3 do<br>> bump_frame()<br>> end<br>> ===================================================================<br>><br><br><snipped><br><br>><br>> --<br>> Best regards,<br>> Sergey Kaplun<br><br>--<br>Best regards,<br>IM</div></div></div></div></blockquote><div> </div></BODY></HTML>