<HTML><BODY><div>Hi, Sergey!</div><div>Thanks for the patch!</div><div>LGTM</div><div> </div><div data-signature-widget="container"><div data-signature-widget="content"><div>--<br>Best regards,</div><div>Maxim Kokryashkin</div></div></div><div> </div><div> </div><blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;"><div> <blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;"><div id=""><div class="js-helper js-readmsg-msg"><div><div id="style_17013299501746656570_BODY">Hi, Sergey!<br>Thanks for the review!<br>I've fixed your comment regarding test comment and force-pushed the<br>branch.<br><br>On 29.11.23, Sergey Bronnikov wrote:<br>> Hello, Sergey<br>><br>> thanks for the patch!<br>><br>> LGTM with a three minor comments below<br>><br>> On 11/28/23 15:21, Sergey Kaplun wrote:<br>> > From: Mike Pall <mike><br>> ><br>> > Reported by XmiliaH.<br>> ><br>> > (cherry-picked from commit e73916d811710ab02a4dfe447d621c99f4e7186c)<br>> ><br>> > The RETF IR has a side effect: it shifts base when returning to a lower<br>> > frame, i.e., it affects `REF_BASE` IR (0000) (thus, we can say that this<br>> > IR is violating SSA form). So any optimization of IRs with `REF_BASE` as<br>> > an operand across RETF IR may lead to incorrect optimizations (see<br>> > details in the test file).<br>> ><br>> > This patch adds rules to the folding engine to prevent CSE across `IR_RETF`<br>> > for all possible IRs containing REF_BASE.<br>> ><br>> > Sergey Kaplun:<br>> > * added the description and the test for the problem<br>> ><br>> > Part of tarantool/tarantool#9145<br>> > ---<br>> ><br>> > Branch:https://github.com/tarantool/luajit/tree/skaplun/lj-784-cse-ref-base-over-retf<br>> > Tarantool PR:https://github.com/tarantool/tarantool/pull/9421<br>> > Related issues:<br>> > *<a href="https://github.com/LuaJIT/LuaJIT/issues/784" target="_blank">https://github.com/LuaJIT/LuaJIT/issues/784</a><br>> > *<a href="https://github.com/tarantool/tarantool/issues/9145" target="_blank">https://github.com/tarantool/tarantool/issues/9145</a><br>> ><br>> > Interested reviewers can mention that only the `SUB any BASE` case is<br>> > tested.<br>> > The reason is that other cases are impossible to record in LuaJIT:<br>> > * EQ any BASE: EQ pgc REF_BASE IR for upvalues is emitted when<br>> > the open upvalue aliases a SSA slot, i.e., it belongs to the frame of<br>> > the currently executed function. In that case, if we want to emit RETF<br>> > IR, we need to leave this function. So we need to record the UCLO<br>> > bytecode, which is NIY in JIT. So, such a type of trace is impossible.<br>> > * SUB BASE any: SUB BASE fr is emitted for the recording of VARG<br>><br>> Nit: fr -> frame<br>><br>> or put in backticks if you refer to a variable in source code<br>><br>> > bytecode, in case varargs are undefined on trace. We need a vararg<br>> > function to call to create an additional frame. But returning to lower<br>> > frames from a vararg function isn't implemented in LuaJIT -- either<br>> > the trace recording is stopped or the error is rased and the trace<br>> > isn't compiled. Also, IINM, fr operands will always be different for<br>><br>> Nit: fr -> frame<br>><br>> or put in backticks if you refer to a variable in source code<br><br>Since this is only a remider for review in the ML, I've not changed it:).<br>I suppose that to mention this information in the commit message is<br>excess and important only for clarification on review.<br><br>><br>> > different frames, so there is no possible CSE here.<br>> ><br>> > So, these cases are needed to prevent any regressions in the future.<br>> ><br>> > Please correct me if I've missed something.<br>> ><br>> > src/lj_opt_fold.c | 11 +++<br>> > .../lj-784-cse-ref-base-over-retf.test.lua | 86 +++++++++++++++++++<br>> > 2 files changed, 97 insertions(+)<br>> > create mode 100644 test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua<br>> ><br>> > diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c<br>> > index c5f2232e..750f1c7e 100644<br>> > --- a/src/lj_opt_fold.c<br>> > +++ b/src/lj_opt_fold.c<br>> > @@ -2313,6 +2313,17 @@ LJFOLDF(xload_kptr)<br>> > LJFOLD(XLOAD any any)<br>> > LJFOLDX(lj_opt_fwd_xload)<br>> ><br>> > +/* -- Frame handling ------------------------------------------------------ */<br>> > +<br>> > +/* Prevent CSE of a REF_BASE operand across IR_RETF. */<br>> > +LJFOLD(SUB any BASE)<br>> > +LJFOLD(SUB BASE any)<br>> > +LJFOLD(EQ any BASE)<br>> > +LJFOLDF(fold_base)<br>> > +{<br>> > + return lj_opt_cselim(J, J->chain[IR_RETF]);<br>> > +}<br>> > +<br>> > /* -- Write barriers ------------------------------------------------------ */<br>> ><br>> > /* Write barriers are amenable to CSE, but not across any incremental<br>> > diff --git a/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua b/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua<br>> > new file mode 100644<br>> > index 00000000..095376fc<br>> > --- /dev/null<br>> > +++ b/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua<br>> > @@ -0,0 +1,86 @@<br>> > +local tap = require('tap')<br>> > +<br>> > +-- Test file to demonstrate incorrect FOLD optimization for IR<br>> > +-- with REF_BASE operand across IR RETF.<br>> > +-- See also,https://github.com/LuaJIT/LuaJIT/issues/784.<br>> > +<br>> > +local test = tap.test('lj-784-cse-ref-base-over-retf'):skipcond({<br>> > + ['Test requires JIT enabled'] = not jit.status(),<br>> > +})<br>> > +<br>> > +test:plan(1)<br>> > +<br>> > +-- The RETF IR has a side effect: it shifts base when returning to<br>> > +-- a lower frame, i.e., it affects `REF_BASE` IR (0000) (thus, we<br>> > +-- can say that this IR is violating SSA form).<br>> > +-- So any optimization of IRs with `REF_BASE` as an operand across<br>> > +-- RETF IR may lead to incorrect optimizations.<br>> > +-- In this test, SUB uref REF_BASE IR was eliminated, so instead<br>> > +-- the following trace:<br>> > +--<br>> > +-- 0004 p32 SUB 0003 0000<br>> > +-- 0005 > p32 UGT 0004 +32<br>> > +-- ...<br>> > +-- 0009 > p32 RETF proto: 0x407dc118 [0x407dc194]<br>> > +-- ...<br>> > +-- 0012 p32 SUB 0003 0000<br>> > +-- 0013 > p32 UGT 0012 +72<br>> > +--<br>> > +-- We got the following:<br>> > +--<br>> > +-- 0004 p32 SUB 0003 0000<br>> > +-- 0005 > p32 UGT 0004 +32<br>> > +-- ...<br>> > +-- 0009 > p32 RETF proto: 0x41ffe0c0 [0x41ffe13c]<br>> > +-- ...<br>> > +-- 0012 > p32 UGT 0004 +72<br>> > +--<br>> > +-- As you can see, the 0012 SUB IR is eliminated because it is the<br>> > +-- same as the 0004 IR. This leads to incorrect assertion guards<br>> > +-- in the IR below.<br>><br>> I would rephrase it to "assertion guards in the resulted IR"<br>><br>> because there is no IR below the comment.<br><br>Fixed, branch is force-pushed.<br><br>===================================================================<br>diff --git a/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua b/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua<br>index 095376fc..d6442cbb 100644<br>--- a/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua<br>+++ b/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua<br>@@ -37,7 +37,7 @@ test:plan(1)<br> --<br> -- As you can see, the 0012 SUB IR is eliminated because it is the<br> -- same as the 0004 IR. This leads to incorrect assertion guards<br>--- in the IR below.<br>+-- in the resulted IR 0012 below.<br> <br> local MAGIC = 42<br> -- XXX: simplify `jit.dump()` output.<br>===================================================================<br><br>><br>> > +<br>> > +local MAGIC = 42<br>> > +-- XXX: simplify `jit.dump()` output.<br>> > +local fmod = math.fmod<br>> > +<br>> > +local function exit_with_retf(closure)<br>> > + -- Forcify stitch. Any NYI is OK here.<br>> > + fmod(1, 1)<br>> > + -- Call the closure so that we have emitted `uref - REF_BASE`.<br>> > + closure(0)<br>> > + -- Exit with `IR_RETF`. This will change `REF_BASE`.<br>> > +end<br>> > +<br>> > +local function sub_uref_base(closure)<br>> > + local open_upvalue<br>> > + if closure == nil then<br>> > + closure = function(val)<br>> > + local old = open_upvalue<br>> > + open_upvalue = val<br>> > + return old<br>> > + end<br>> > + -- First, create an additional frame, so we got the trace,<br>> > + -- where the open upvalue reference is always < `REF_BASE`.<br>> > + sub_uref_base(closure)<br>> > + end<br>> > + for _ = 1, 4 do<br>> > + -- `closure` function is inherited from the previous frame.<br>> > + exit_with_retf(closure)<br>> > + open_upvalue = MAGIC<br>> > + -- The open upvalue guard will use CSE over `IR_RETF` for<br>> > + -- `uref - REF_BASE`. `IR_RETF` changed the value of<br>> > + -- `REF_BASE`.<br>> > + -- Thus, the guards afterwards take the wrong IR as the first<br>> > + -- operand, so they are not failed, and the wrong value is<br>> > + -- returned from the trace.<br>> > + open_upvalue = closure(0)<br>> > + end<br>> > + return open_upvalue<br>> > +end<br>> > +<br>> > +jit.opt.start('hotloop=1')<br>> > +<br>> > +local res = sub_uref_base()<br>> > +test:is(res, MAGIC, 'no SUB uref REF_BASE CSE across RETF')<br>> > +<br>> > +test:done(true)<br><br>--<br>Best regards,<br>Sergey Kaplun</div></div></div></div></blockquote><div> </div></div></blockquote></BODY></HTML>