[Tarantool-patches] [PATCH luajit] Prevent CSE of a REF_BASE operand across IR_RETF.

Maxim Kokryashkin m.kokryashkin at tarantool.org
Thu Nov 30 20:59:39 MSK 2023


Hi, Sergey!
Thanks for the patch!
LGTM
 
--
Best regards,
Maxim Kokryashkin
 
 
> 
>>Hi, Sergey!
>>Thanks for the review!
>>I've fixed your comment regarding test comment and force-pushed the
>>branch.
>>
>>On 29.11.23, Sergey Bronnikov wrote:
>>> Hello, Sergey
>>>
>>> thanks for the patch!
>>>
>>> LGTM with a three minor comments below
>>>
>>> On 11/28/23 15:21, Sergey Kaplun wrote:
>>> > From: Mike Pall <mike>
>>> >
>>> > Reported by XmiliaH.
>>> >
>>> > (cherry-picked from commit e73916d811710ab02a4dfe447d621c99f4e7186c)
>>> >
>>> > The RETF IR has a side effect: it shifts base when returning to a lower
>>> > frame, i.e., it affects `REF_BASE` IR (0000) (thus, we can say that this
>>> > IR is violating SSA form). So any optimization of IRs with `REF_BASE` as
>>> > an operand across RETF IR may lead to incorrect optimizations (see
>>> > details in the test file).
>>> >
>>> > This patch adds rules to the folding engine to prevent CSE across `IR_RETF`
>>> > for all possible IRs containing REF_BASE.
>>> >
>>> > Sergey Kaplun:
>>> > * added the description and the test for the problem
>>> >
>>> > Part of tarantool/tarantool#9145
>>> > ---
>>> >
>>> > Branch:https://github.com/tarantool/luajit/tree/skaplun/lj-784-cse-ref-base-over-retf
>>> > Tarantool PR:https://github.com/tarantool/tarantool/pull/9421
>>> > Related issues:
>>> > * https://github.com/LuaJIT/LuaJIT/issues/784
>>> > * https://github.com/tarantool/tarantool/issues/9145
>>> >
>>> > Interested reviewers can mention that only the `SUB any BASE` case is
>>> > tested.
>>> > The reason is that other cases are impossible to record in LuaJIT:
>>> > * EQ any BASE: EQ pgc REF_BASE IR for upvalues is emitted when
>>> > the open upvalue aliases a SSA slot, i.e., it belongs to the frame of
>>> > the currently executed function. In that case, if we want to emit RETF
>>> > IR, we need to leave this function. So we need to record the UCLO
>>> > bytecode, which is NIY in JIT. So, such a type of trace is impossible.
>>> > * SUB BASE any: SUB BASE fr is emitted for the recording of VARG
>>>
>>> Nit: fr -> frame
>>>
>>> or put in backticks if you refer to a variable in source code
>>>
>>> > bytecode, in case varargs are undefined on trace. We need a vararg
>>> > function to call to create an additional frame. But returning to lower
>>> > frames from a vararg function isn't implemented in LuaJIT -- either
>>> > the trace recording is stopped or the error is rased and the trace
>>> > isn't compiled. Also, IINM, fr operands will always be different for
>>>
>>> Nit: fr -> frame
>>>
>>> or put in backticks if you refer to a variable in source code
>>
>>Since this is only a remider for review in the ML, I've not changed it:).
>>I suppose that to mention this information in the commit message is
>>excess and important only for clarification on review.
>>
>>>
>>> > different frames, so there is no possible CSE here.
>>> >
>>> > So, these cases are needed to prevent any regressions in the future.
>>> >
>>> > Please correct me if I've missed something.
>>> >
>>> > src/lj_opt_fold.c | 11 +++
>>> > .../lj-784-cse-ref-base-over-retf.test.lua | 86 +++++++++++++++++++
>>> > 2 files changed, 97 insertions(+)
>>> > create mode 100644 test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua
>>> >
>>> > diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
>>> > index c5f2232e..750f1c7e 100644
>>> > --- a/src/lj_opt_fold.c
>>> > +++ b/src/lj_opt_fold.c
>>> > @@ -2313,6 +2313,17 @@ LJFOLDF(xload_kptr)
>>> > LJFOLD(XLOAD any any)
>>> > LJFOLDX(lj_opt_fwd_xload)
>>> >
>>> > +/* -- Frame handling ------------------------------------------------------ */
>>> > +
>>> > +/* Prevent CSE of a REF_BASE operand across IR_RETF. */
>>> > +LJFOLD(SUB any BASE)
>>> > +LJFOLD(SUB BASE any)
>>> > +LJFOLD(EQ any BASE)
>>> > +LJFOLDF(fold_base)
>>> > +{
>>> > + return lj_opt_cselim(J, J->chain[IR_RETF]);
>>> > +}
>>> > +
>>> > /* -- Write barriers ------------------------------------------------------ */
>>> >
>>> > /* Write barriers are amenable to CSE, but not across any incremental
>>> > diff --git a/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua b/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua
>>> > new file mode 100644
>>> > index 00000000..095376fc
>>> > --- /dev/null
>>> > +++ b/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua
>>> > @@ -0,0 +1,86 @@
>>> > +local tap = require('tap')
>>> > +
>>> > +-- Test file to demonstrate incorrect FOLD optimization for IR
>>> > +-- with REF_BASE operand across IR RETF.
>>> > +-- See also,https://github.com/LuaJIT/LuaJIT/issues/784.
>>> > +
>>> > +local test = tap.test('lj-784-cse-ref-base-over-retf'):skipcond({
>>> > + ['Test requires JIT enabled'] = not jit.status(),
>>> > +})
>>> > +
>>> > +test:plan(1)
>>> > +
>>> > +-- The RETF IR has a side effect: it shifts base when returning to
>>> > +-- a lower frame, i.e., it affects `REF_BASE` IR (0000) (thus, we
>>> > +-- can say that this IR is violating SSA form).
>>> > +-- So any optimization of IRs with `REF_BASE` as an operand across
>>> > +-- RETF IR may lead to incorrect optimizations.
>>> > +-- In this test, SUB uref REF_BASE IR was eliminated, so instead
>>> > +-- the following trace:
>>> > +--
>>> > +-- 0004 p32 SUB 0003 0000
>>> > +-- 0005 > p32 UGT 0004 +32
>>> > +-- ...
>>> > +-- 0009 > p32 RETF proto: 0x407dc118 [0x407dc194]
>>> > +-- ...
>>> > +-- 0012 p32 SUB 0003 0000
>>> > +-- 0013 > p32 UGT 0012 +72
>>> > +--
>>> > +-- We got the following:
>>> > +--
>>> > +-- 0004 p32 SUB 0003 0000
>>> > +-- 0005 > p32 UGT 0004 +32
>>> > +-- ...
>>> > +-- 0009 > p32 RETF proto: 0x41ffe0c0 [0x41ffe13c]
>>> > +-- ...
>>> > +-- 0012 > p32 UGT 0004 +72
>>> > +--
>>> > +-- As you can see, the 0012 SUB IR is eliminated because it is the
>>> > +-- same as the 0004 IR. This leads to incorrect assertion guards
>>> > +-- in the IR below.
>>>
>>> I would rephrase it to "assertion guards in the resulted IR"
>>>
>>> because there is no IR below the comment.
>>
>>Fixed, branch is force-pushed.
>>
>>===================================================================
>>diff --git a/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua b/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua
>>index 095376fc..d6442cbb 100644
>>--- a/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua
>>+++ b/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua
>>@@ -37,7 +37,7 @@ test:plan(1)
>> --
>> -- As you can see, the 0012 SUB IR is eliminated because it is the
>> -- same as the 0004 IR. This leads to incorrect assertion guards
>>--- in the IR below.
>>+-- in the resulted IR 0012 below.
>> 
>> local MAGIC = 42
>> -- XXX: simplify `jit.dump()` output.
>>===================================================================
>>
>>>
>>> > +
>>> > +local MAGIC = 42
>>> > +-- XXX: simplify `jit.dump()` output.
>>> > +local fmod = math.fmod
>>> > +
>>> > +local function exit_with_retf(closure)
>>> > + -- Forcify stitch. Any NYI is OK here.
>>> > + fmod(1, 1)
>>> > + -- Call the closure so that we have emitted `uref - REF_BASE`.
>>> > + closure(0)
>>> > + -- Exit with `IR_RETF`. This will change `REF_BASE`.
>>> > +end
>>> > +
>>> > +local function sub_uref_base(closure)
>>> > + local open_upvalue
>>> > + if closure == nil then
>>> > + closure = function(val)
>>> > + local old = open_upvalue
>>> > + open_upvalue = val
>>> > + return old
>>> > + end
>>> > + -- First, create an additional frame, so we got the trace,
>>> > + -- where the open upvalue reference is always < `REF_BASE`.
>>> > + sub_uref_base(closure)
>>> > + end
>>> > + for _ = 1, 4 do
>>> > + -- `closure` function is inherited from the previous frame.
>>> > + exit_with_retf(closure)
>>> > + open_upvalue = MAGIC
>>> > + -- The open upvalue guard will use CSE over `IR_RETF` for
>>> > + -- `uref - REF_BASE`. `IR_RETF` changed the value of
>>> > + -- `REF_BASE`.
>>> > + -- Thus, the guards afterwards take the wrong IR as the first
>>> > + -- operand, so they are not failed, and the wrong value is
>>> > + -- returned from the trace.
>>> > + open_upvalue = closure(0)
>>> > + end
>>> > + return open_upvalue
>>> > +end
>>> > +
>>> > +jit.opt.start('hotloop=1')
>>> > +
>>> > +local res = sub_uref_base()
>>> > +test:is(res, MAGIC, 'no SUB uref REF_BASE CSE across RETF')
>>> > +
>>> > +test:done(true)
>>
>>--
>>Best regards,
>>Sergey Kaplun
> 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.tarantool.org/pipermail/tarantool-patches/attachments/20231130/c222811f/attachment.htm>


More information about the Tarantool-patches mailing list