Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH luajit] Prevent CSE of a REF_BASE operand across IR_RETF.
@ 2023-11-28 12:21 Sergey Kaplun via Tarantool-patches
  2023-11-29 14:26 ` Sergey Bronnikov via Tarantool-patches
  2024-01-10  8:51 ` Igor Munkin via Tarantool-patches
  0 siblings, 2 replies; 6+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-11-28 12:21 UTC (permalink / raw)
  To: Maxim Kokryashkin, Sergey Bronnikov; +Cc: tarantool-patches

From: Mike Pall <mike>

Reported by XmiliaH.

(cherry-picked from commit e73916d811710ab02a4dfe447d621c99f4e7186c)

The RETF IR has a side effect: it shifts base when returning to a lower
frame, i.e., it affects `REF_BASE` IR (0000) (thus, we can say that this
IR is violating SSA form). So any optimization of IRs with `REF_BASE` as
an operand across RETF IR may lead to incorrect optimizations (see
details in the test file).

This patch adds rules to the folding engine to prevent CSE across `IR_RETF`
for all possible IRs containing REF_BASE.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#9145
---

Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-784-cse-ref-base-over-retf
Tarantool PR: https://github.com/tarantool/tarantool/pull/9421
Related issues:
* https://github.com/LuaJIT/LuaJIT/issues/784
* https://github.com/tarantool/tarantool/issues/9145

Interested reviewers can mention that only the `SUB any BASE` case is
tested.
The reason is that other cases are impossible to record in LuaJIT:
* EQ any BASE: EQ pgc REF_BASE IR for upvalues is emitted when
  the open upvalue aliases a SSA slot, i.e., it belongs to the frame of
  the currently executed function. In that case, if we want to emit RETF
  IR, we need to leave this function. So we need to record the UCLO
  bytecode, which is NIY in JIT. So, such a type of trace is impossible.
* SUB BASE any: SUB BASE fr is emitted for the recording of VARG
  bytecode, in case varargs are undefined on trace. We need a vararg
  function to call to create an additional frame. But returning to lower
  frames from a vararg function isn't implemented in LuaJIT -- either
  the trace recording is stopped or the error is rased and the trace
  isn't compiled. Also, IINM, fr operands will always be different for
  different frames, so there is no possible CSE here.

So, these cases are needed to prevent any regressions in the future.

Please correct me if I've missed something.

 src/lj_opt_fold.c                             | 11 +++
 .../lj-784-cse-ref-base-over-retf.test.lua    | 86 +++++++++++++++++++
 2 files changed, 97 insertions(+)
 create mode 100644 test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua

diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
index c5f2232e..750f1c7e 100644
--- a/src/lj_opt_fold.c
+++ b/src/lj_opt_fold.c
@@ -2313,6 +2313,17 @@ LJFOLDF(xload_kptr)
 LJFOLD(XLOAD any any)
 LJFOLDX(lj_opt_fwd_xload)
 
+/* -- Frame handling ------------------------------------------------------ */
+
+/* Prevent CSE of a REF_BASE operand across IR_RETF. */
+LJFOLD(SUB any BASE)
+LJFOLD(SUB BASE any)
+LJFOLD(EQ any BASE)
+LJFOLDF(fold_base)
+{
+  return lj_opt_cselim(J, J->chain[IR_RETF]);
+}
+
 /* -- Write barriers ------------------------------------------------------ */
 
 /* Write barriers are amenable to CSE, but not across any incremental
diff --git a/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua b/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua
new file mode 100644
index 00000000..095376fc
--- /dev/null
+++ b/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua
@@ -0,0 +1,86 @@
+local tap = require('tap')
+
+-- Test file to demonstrate incorrect FOLD optimization for IR
+-- with REF_BASE operand across IR RETF.
+-- See also, https://github.com/LuaJIT/LuaJIT/issues/784.
+
+local test = tap.test('lj-784-cse-ref-base-over-retf'):skipcond({
+  ['Test requires JIT enabled'] = not jit.status(),
+})
+
+test:plan(1)
+
+-- The RETF IR has a side effect: it shifts base when returning to
+-- a lower frame, i.e., it affects `REF_BASE` IR (0000) (thus, we
+-- can say that this IR is violating SSA form).
+-- So any optimization of IRs with `REF_BASE` as an operand across
+-- RETF IR may lead to incorrect optimizations.
+-- In this test, SUB uref REF_BASE IR was eliminated, so instead
+-- the following trace:
+--
+-- 0004    p32 SUB    0003  0000
+-- 0005 >  p32 UGT    0004  +32
+-- ...
+-- 0009 >  p32 RETF   proto: 0x407dc118  [0x407dc194]
+-- ...
+-- 0012    p32 SUB    0003  0000
+-- 0013 >  p32 UGT    0012  +72
+--
+-- We got the following:
+--
+-- 0004    p32 SUB    0003  0000
+-- 0005 >  p32 UGT    0004  +32
+-- ...
+-- 0009 >  p32 RETF   proto: 0x41ffe0c0  [0x41ffe13c]
+-- ...
+-- 0012 >  p32 UGT    0004  +72
+--
+-- As you can see, the 0012 SUB IR is eliminated because it is the
+-- same as the 0004 IR. This leads to incorrect assertion guards
+-- in the IR below.
+
+local MAGIC = 42
+-- XXX: simplify `jit.dump()` output.
+local fmod =  math.fmod
+
+local function exit_with_retf(closure)
+  -- Forcify stitch. Any NYI is OK here.
+  fmod(1, 1)
+  -- Call the closure so that we have emitted `uref - REF_BASE`.
+  closure(0)
+  -- Exit with `IR_RETF`. This will change `REF_BASE`.
+end
+
+local function sub_uref_base(closure)
+  local open_upvalue
+  if closure == nil then
+    closure = function(val)
+      local old = open_upvalue
+      open_upvalue = val
+      return old
+    end
+    -- First, create an additional frame, so we got the trace,
+    -- where the open upvalue reference is always < `REF_BASE`.
+    sub_uref_base(closure)
+  end
+  for _ = 1, 4 do
+    -- `closure` function is inherited from the previous frame.
+    exit_with_retf(closure)
+    open_upvalue = MAGIC
+    -- The open upvalue guard will use CSE over `IR_RETF` for
+    -- `uref - REF_BASE`. `IR_RETF` changed the value of
+    -- `REF_BASE`.
+    -- Thus, the guards afterwards take the wrong IR as the first
+    -- operand, so they are not failed, and the wrong value is
+    -- returned from the trace.
+    open_upvalue = closure(0)
+  end
+  return open_upvalue
+end
+
+jit.opt.start('hotloop=1')
+
+local res = sub_uref_base()
+test:is(res, MAGIC, 'no SUB uref REF_BASE CSE across RETF')
+
+test:done(true)
-- 
2.42.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit] Prevent CSE of a REF_BASE operand across IR_RETF.
  2023-11-28 12:21 [Tarantool-patches] [PATCH luajit] Prevent CSE of a REF_BASE operand across IR_RETF Sergey Kaplun via Tarantool-patches
@ 2023-11-29 14:26 ` Sergey Bronnikov via Tarantool-patches
  2023-11-30  7:34   ` Sergey Kaplun via Tarantool-patches
  2024-01-10  8:51 ` Igor Munkin via Tarantool-patches
  1 sibling, 1 reply; 6+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2023-11-29 14:26 UTC (permalink / raw)
  To: Sergey Kaplun, Maxim Kokryashkin; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 6549 bytes --]

Hello, Sergey

thanks for the patch!

LGTM with a three minor comments below

On 11/28/23 15:21, Sergey Kaplun wrote:
> From: Mike Pall <mike>
>
> Reported by XmiliaH.
>
> (cherry-picked from commit e73916d811710ab02a4dfe447d621c99f4e7186c)
>
> The RETF IR has a side effect: it shifts base when returning to a lower
> frame, i.e., it affects `REF_BASE` IR (0000) (thus, we can say that this
> IR is violating SSA form). So any optimization of IRs with `REF_BASE` as
> an operand across RETF IR may lead to incorrect optimizations (see
> details in the test file).
>
> This patch adds rules to the folding engine to prevent CSE across `IR_RETF`
> for all possible IRs containing REF_BASE.
>
> Sergey Kaplun:
> * added the description and the test for the problem
>
> Part of tarantool/tarantool#9145
> ---
>
> Branch:https://github.com/tarantool/luajit/tree/skaplun/lj-784-cse-ref-base-over-retf
> Tarantool PR:https://github.com/tarantool/tarantool/pull/9421
> Related issues:
> *https://github.com/LuaJIT/LuaJIT/issues/784
> *https://github.com/tarantool/tarantool/issues/9145
>
> Interested reviewers can mention that only the `SUB any BASE` case is
> tested.
> The reason is that other cases are impossible to record in LuaJIT:
> * EQ any BASE: EQ pgc REF_BASE IR for upvalues is emitted when
>    the open upvalue aliases a SSA slot, i.e., it belongs to the frame of
>    the currently executed function. In that case, if we want to emit RETF
>    IR, we need to leave this function. So we need to record the UCLO
>    bytecode, which is NIY in JIT. So, such a type of trace is impossible.
> * SUB BASE any: SUB BASE fr is emitted for the recording of VARG

Nit: fr -> frame

or put in backticks if you refer to a variable in source code

>    bytecode, in case varargs are undefined on trace. We need a vararg
>    function to call to create an additional frame. But returning to lower
>    frames from a vararg function isn't implemented in LuaJIT -- either
>    the trace recording is stopped or the error is rased and the trace
>    isn't compiled. Also, IINM, fr operands will always be different for

Nit: fr -> frame

or put in backticks if you refer to a variable in source code

>    different frames, so there is no possible CSE here.
>
> So, these cases are needed to prevent any regressions in the future.
>
> Please correct me if I've missed something.
>
>   src/lj_opt_fold.c                             | 11 +++
>   .../lj-784-cse-ref-base-over-retf.test.lua    | 86 +++++++++++++++++++
>   2 files changed, 97 insertions(+)
>   create mode 100644 test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua
>
> diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
> index c5f2232e..750f1c7e 100644
> --- a/src/lj_opt_fold.c
> +++ b/src/lj_opt_fold.c
> @@ -2313,6 +2313,17 @@ LJFOLDF(xload_kptr)
>   LJFOLD(XLOAD any any)
>   LJFOLDX(lj_opt_fwd_xload)
>   
> +/* -- Frame handling ------------------------------------------------------ */
> +
> +/* Prevent CSE of a REF_BASE operand across IR_RETF. */
> +LJFOLD(SUB any BASE)
> +LJFOLD(SUB BASE any)
> +LJFOLD(EQ any BASE)
> +LJFOLDF(fold_base)
> +{
> +  return lj_opt_cselim(J, J->chain[IR_RETF]);
> +}
> +
>   /* -- Write barriers ------------------------------------------------------ */
>   
>   /* Write barriers are amenable to CSE, but not across any incremental
> diff --git a/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua b/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua
> new file mode 100644
> index 00000000..095376fc
> --- /dev/null
> +++ b/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua
> @@ -0,0 +1,86 @@
> +local tap = require('tap')
> +
> +-- Test file to demonstrate incorrect FOLD optimization for IR
> +-- with REF_BASE operand across IR RETF.
> +-- See also,https://github.com/LuaJIT/LuaJIT/issues/784.
> +
> +local test = tap.test('lj-784-cse-ref-base-over-retf'):skipcond({
> +  ['Test requires JIT enabled'] = not jit.status(),
> +})
> +
> +test:plan(1)
> +
> +-- The RETF IR has a side effect: it shifts base when returning to
> +-- a lower frame, i.e., it affects `REF_BASE` IR (0000) (thus, we
> +-- can say that this IR is violating SSA form).
> +-- So any optimization of IRs with `REF_BASE` as an operand across
> +-- RETF IR may lead to incorrect optimizations.
> +-- In this test, SUB uref REF_BASE IR was eliminated, so instead
> +-- the following trace:
> +--
> +-- 0004    p32 SUB    0003  0000
> +-- 0005 >  p32 UGT    0004  +32
> +-- ...
> +-- 0009 >  p32 RETF   proto: 0x407dc118  [0x407dc194]
> +-- ...
> +-- 0012    p32 SUB    0003  0000
> +-- 0013 >  p32 UGT    0012  +72
> +--
> +-- We got the following:
> +--
> +-- 0004    p32 SUB    0003  0000
> +-- 0005 >  p32 UGT    0004  +32
> +-- ...
> +-- 0009 >  p32 RETF   proto: 0x41ffe0c0  [0x41ffe13c]
> +-- ...
> +-- 0012 >  p32 UGT    0004  +72
> +--
> +-- As you can see, the 0012 SUB IR is eliminated because it is the
> +-- same as the 0004 IR. This leads to incorrect assertion guards
> +-- in the IR below.

I would rephrase it to "assertion guards in the resulted IR"

because there is no IR below the comment.

> +
> +local MAGIC = 42
> +-- XXX: simplify `jit.dump()` output.
> +local fmod =  math.fmod
> +
> +local function exit_with_retf(closure)
> +  -- Forcify stitch. Any NYI is OK here.
> +  fmod(1, 1)
> +  -- Call the closure so that we have emitted `uref - REF_BASE`.
> +  closure(0)
> +  -- Exit with `IR_RETF`. This will change `REF_BASE`.
> +end
> +
> +local function sub_uref_base(closure)
> +  local open_upvalue
> +  if closure == nil then
> +    closure = function(val)
> +      local old = open_upvalue
> +      open_upvalue = val
> +      return old
> +    end
> +    -- First, create an additional frame, so we got the trace,
> +    -- where the open upvalue reference is always < `REF_BASE`.
> +    sub_uref_base(closure)
> +  end
> +  for _ = 1, 4 do
> +    -- `closure` function is inherited from the previous frame.
> +    exit_with_retf(closure)
> +    open_upvalue = MAGIC
> +    -- The open upvalue guard will use CSE over `IR_RETF` for
> +    -- `uref - REF_BASE`. `IR_RETF` changed the value of
> +    -- `REF_BASE`.
> +    -- Thus, the guards afterwards take the wrong IR as the first
> +    -- operand, so they are not failed, and the wrong value is
> +    -- returned from the trace.
> +    open_upvalue = closure(0)
> +  end
> +  return open_upvalue
> +end
> +
> +jit.opt.start('hotloop=1')
> +
> +local res = sub_uref_base()
> +test:is(res, MAGIC, 'no SUB uref REF_BASE CSE across RETF')
> +
> +test:done(true)

[-- Attachment #2: Type: text/html, Size: 8105 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit] Prevent CSE of a REF_BASE operand across IR_RETF.
  2023-11-29 14:26 ` Sergey Bronnikov via Tarantool-patches
@ 2023-11-30  7:34   ` Sergey Kaplun via Tarantool-patches
  2023-11-30  8:53     ` Sergey Bronnikov via Tarantool-patches
  2023-11-30 17:59     ` Maxim Kokryashkin via Tarantool-patches
  0 siblings, 2 replies; 6+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-11-30  7:34 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review!
I've fixed your comment regarding test comment and force-pushed the
branch.

On 29.11.23, Sergey Bronnikov wrote:
> Hello, Sergey
> 
> thanks for the patch!
> 
> LGTM with a three minor comments below
> 
> On 11/28/23 15:21, Sergey Kaplun wrote:
> > From: Mike Pall <mike>
> >
> > Reported by XmiliaH.
> >
> > (cherry-picked from commit e73916d811710ab02a4dfe447d621c99f4e7186c)
> >
> > The RETF IR has a side effect: it shifts base when returning to a lower
> > frame, i.e., it affects `REF_BASE` IR (0000) (thus, we can say that this
> > IR is violating SSA form). So any optimization of IRs with `REF_BASE` as
> > an operand across RETF IR may lead to incorrect optimizations (see
> > details in the test file).
> >
> > This patch adds rules to the folding engine to prevent CSE across `IR_RETF`
> > for all possible IRs containing REF_BASE.
> >
> > Sergey Kaplun:
> > * added the description and the test for the problem
> >
> > Part of tarantool/tarantool#9145
> > ---
> >
> > Branch:https://github.com/tarantool/luajit/tree/skaplun/lj-784-cse-ref-base-over-retf
> > Tarantool PR:https://github.com/tarantool/tarantool/pull/9421
> > Related issues:
> > *https://github.com/LuaJIT/LuaJIT/issues/784
> > *https://github.com/tarantool/tarantool/issues/9145
> >
> > Interested reviewers can mention that only the `SUB any BASE` case is
> > tested.
> > The reason is that other cases are impossible to record in LuaJIT:
> > * EQ any BASE: EQ pgc REF_BASE IR for upvalues is emitted when
> >    the open upvalue aliases a SSA slot, i.e., it belongs to the frame of
> >    the currently executed function. In that case, if we want to emit RETF
> >    IR, we need to leave this function. So we need to record the UCLO
> >    bytecode, which is NIY in JIT. So, such a type of trace is impossible.
> > * SUB BASE any: SUB BASE fr is emitted for the recording of VARG
> 
> Nit: fr -> frame
> 
> or put in backticks if you refer to a variable in source code
> 
> >    bytecode, in case varargs are undefined on trace. We need a vararg
> >    function to call to create an additional frame. But returning to lower
> >    frames from a vararg function isn't implemented in LuaJIT -- either
> >    the trace recording is stopped or the error is rased and the trace
> >    isn't compiled. Also, IINM, fr operands will always be different for
> 
> Nit: fr -> frame
> 
> or put in backticks if you refer to a variable in source code

Since this is only a remider for review in the ML, I've not changed it:).
I suppose that to mention this information in the commit message is
excess and important only for clarification on review.

> 
> >    different frames, so there is no possible CSE here.
> >
> > So, these cases are needed to prevent any regressions in the future.
> >
> > Please correct me if I've missed something.
> >
> >   src/lj_opt_fold.c                             | 11 +++
> >   .../lj-784-cse-ref-base-over-retf.test.lua    | 86 +++++++++++++++++++
> >   2 files changed, 97 insertions(+)
> >   create mode 100644 test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua
> >
> > diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
> > index c5f2232e..750f1c7e 100644
> > --- a/src/lj_opt_fold.c
> > +++ b/src/lj_opt_fold.c
> > @@ -2313,6 +2313,17 @@ LJFOLDF(xload_kptr)
> >   LJFOLD(XLOAD any any)
> >   LJFOLDX(lj_opt_fwd_xload)
> >   
> > +/* -- Frame handling ------------------------------------------------------ */
> > +
> > +/* Prevent CSE of a REF_BASE operand across IR_RETF. */
> > +LJFOLD(SUB any BASE)
> > +LJFOLD(SUB BASE any)
> > +LJFOLD(EQ any BASE)
> > +LJFOLDF(fold_base)
> > +{
> > +  return lj_opt_cselim(J, J->chain[IR_RETF]);
> > +}
> > +
> >   /* -- Write barriers ------------------------------------------------------ */
> >   
> >   /* Write barriers are amenable to CSE, but not across any incremental
> > diff --git a/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua b/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua
> > new file mode 100644
> > index 00000000..095376fc
> > --- /dev/null
> > +++ b/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua
> > @@ -0,0 +1,86 @@
> > +local tap = require('tap')
> > +
> > +-- Test file to demonstrate incorrect FOLD optimization for IR
> > +-- with REF_BASE operand across IR RETF.
> > +-- See also,https://github.com/LuaJIT/LuaJIT/issues/784.
> > +
> > +local test = tap.test('lj-784-cse-ref-base-over-retf'):skipcond({
> > +  ['Test requires JIT enabled'] = not jit.status(),
> > +})
> > +
> > +test:plan(1)
> > +
> > +-- The RETF IR has a side effect: it shifts base when returning to
> > +-- a lower frame, i.e., it affects `REF_BASE` IR (0000) (thus, we
> > +-- can say that this IR is violating SSA form).
> > +-- So any optimization of IRs with `REF_BASE` as an operand across
> > +-- RETF IR may lead to incorrect optimizations.
> > +-- In this test, SUB uref REF_BASE IR was eliminated, so instead
> > +-- the following trace:
> > +--
> > +-- 0004    p32 SUB    0003  0000
> > +-- 0005 >  p32 UGT    0004  +32
> > +-- ...
> > +-- 0009 >  p32 RETF   proto: 0x407dc118  [0x407dc194]
> > +-- ...
> > +-- 0012    p32 SUB    0003  0000
> > +-- 0013 >  p32 UGT    0012  +72
> > +--
> > +-- We got the following:
> > +--
> > +-- 0004    p32 SUB    0003  0000
> > +-- 0005 >  p32 UGT    0004  +32
> > +-- ...
> > +-- 0009 >  p32 RETF   proto: 0x41ffe0c0  [0x41ffe13c]
> > +-- ...
> > +-- 0012 >  p32 UGT    0004  +72
> > +--
> > +-- As you can see, the 0012 SUB IR is eliminated because it is the
> > +-- same as the 0004 IR. This leads to incorrect assertion guards
> > +-- in the IR below.
> 
> I would rephrase it to "assertion guards in the resulted IR"
> 
> because there is no IR below the comment.

Fixed, branch is force-pushed.

===================================================================
diff --git a/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua b/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua
index 095376fc..d6442cbb 100644
--- a/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua
+++ b/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua
@@ -37,7 +37,7 @@ test:plan(1)
 --
 -- As you can see, the 0012 SUB IR is eliminated because it is the
 -- same as the 0004 IR. This leads to incorrect assertion guards
--- in the IR below.
+-- in the resulted IR 0012 below.
 
 local MAGIC = 42
 -- XXX: simplify `jit.dump()` output.
===================================================================

> 
> > +
> > +local MAGIC = 42
> > +-- XXX: simplify `jit.dump()` output.
> > +local fmod =  math.fmod
> > +
> > +local function exit_with_retf(closure)
> > +  -- Forcify stitch. Any NYI is OK here.
> > +  fmod(1, 1)
> > +  -- Call the closure so that we have emitted `uref - REF_BASE`.
> > +  closure(0)
> > +  -- Exit with `IR_RETF`. This will change `REF_BASE`.
> > +end
> > +
> > +local function sub_uref_base(closure)
> > +  local open_upvalue
> > +  if closure == nil then
> > +    closure = function(val)
> > +      local old = open_upvalue
> > +      open_upvalue = val
> > +      return old
> > +    end
> > +    -- First, create an additional frame, so we got the trace,
> > +    -- where the open upvalue reference is always < `REF_BASE`.
> > +    sub_uref_base(closure)
> > +  end
> > +  for _ = 1, 4 do
> > +    -- `closure` function is inherited from the previous frame.
> > +    exit_with_retf(closure)
> > +    open_upvalue = MAGIC
> > +    -- The open upvalue guard will use CSE over `IR_RETF` for
> > +    -- `uref - REF_BASE`. `IR_RETF` changed the value of
> > +    -- `REF_BASE`.
> > +    -- Thus, the guards afterwards take the wrong IR as the first
> > +    -- operand, so they are not failed, and the wrong value is
> > +    -- returned from the trace.
> > +    open_upvalue = closure(0)
> > +  end
> > +  return open_upvalue
> > +end
> > +
> > +jit.opt.start('hotloop=1')
> > +
> > +local res = sub_uref_base()
> > +test:is(res, MAGIC, 'no SUB uref REF_BASE CSE across RETF')
> > +
> > +test:done(true)

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit] Prevent CSE of a REF_BASE operand across IR_RETF.
  2023-11-30  7:34   ` Sergey Kaplun via Tarantool-patches
@ 2023-11-30  8:53     ` Sergey Bronnikov via Tarantool-patches
  2023-11-30 17:59     ` Maxim Kokryashkin via Tarantool-patches
  1 sibling, 0 replies; 6+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2023-11-30  8:53 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Thanks! LGTM as I said before.

On 11/30/23 10:34, Sergey Kaplun wrote:
> Hi, Sergey!
> Thanks for the review!
> I've fixed your comment regarding test comment and force-pushed the
> branch.
>
> On 29.11.23, Sergey Bronnikov wrote:
>> Hello, Sergey
>>
>> thanks for the patch!
>>
>> LGTM with a three minor comments below
>>
>> On 11/28/23 15:21, Sergey Kaplun wrote:
>>
<snipped>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Tarantool-patches]  [PATCH luajit] Prevent CSE of a REF_BASE operand across IR_RETF.
  2023-11-30  7:34   ` Sergey Kaplun via Tarantool-patches
  2023-11-30  8:53     ` Sergey Bronnikov via Tarantool-patches
@ 2023-11-30 17:59     ` Maxim Kokryashkin via Tarantool-patches
  1 sibling, 0 replies; 6+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2023-11-30 17:59 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 8384 bytes --]


Hi, Sergey!
Thanks for the patch!
LGTM
 
--
Best regards,
Maxim Kokryashkin
 
 
> 
>>Hi, Sergey!
>>Thanks for the review!
>>I've fixed your comment regarding test comment and force-pushed the
>>branch.
>>
>>On 29.11.23, Sergey Bronnikov wrote:
>>> Hello, Sergey
>>>
>>> thanks for the patch!
>>>
>>> LGTM with a three minor comments below
>>>
>>> On 11/28/23 15:21, Sergey Kaplun wrote:
>>> > From: Mike Pall <mike>
>>> >
>>> > Reported by XmiliaH.
>>> >
>>> > (cherry-picked from commit e73916d811710ab02a4dfe447d621c99f4e7186c)
>>> >
>>> > The RETF IR has a side effect: it shifts base when returning to a lower
>>> > frame, i.e., it affects `REF_BASE` IR (0000) (thus, we can say that this
>>> > IR is violating SSA form). So any optimization of IRs with `REF_BASE` as
>>> > an operand across RETF IR may lead to incorrect optimizations (see
>>> > details in the test file).
>>> >
>>> > This patch adds rules to the folding engine to prevent CSE across `IR_RETF`
>>> > for all possible IRs containing REF_BASE.
>>> >
>>> > Sergey Kaplun:
>>> > * added the description and the test for the problem
>>> >
>>> > Part of tarantool/tarantool#9145
>>> > ---
>>> >
>>> > Branch:https://github.com/tarantool/luajit/tree/skaplun/lj-784-cse-ref-base-over-retf
>>> > Tarantool PR:https://github.com/tarantool/tarantool/pull/9421
>>> > Related issues:
>>> > * https://github.com/LuaJIT/LuaJIT/issues/784
>>> > * https://github.com/tarantool/tarantool/issues/9145
>>> >
>>> > Interested reviewers can mention that only the `SUB any BASE` case is
>>> > tested.
>>> > The reason is that other cases are impossible to record in LuaJIT:
>>> > * EQ any BASE: EQ pgc REF_BASE IR for upvalues is emitted when
>>> > the open upvalue aliases a SSA slot, i.e., it belongs to the frame of
>>> > the currently executed function. In that case, if we want to emit RETF
>>> > IR, we need to leave this function. So we need to record the UCLO
>>> > bytecode, which is NIY in JIT. So, such a type of trace is impossible.
>>> > * SUB BASE any: SUB BASE fr is emitted for the recording of VARG
>>>
>>> Nit: fr -> frame
>>>
>>> or put in backticks if you refer to a variable in source code
>>>
>>> > bytecode, in case varargs are undefined on trace. We need a vararg
>>> > function to call to create an additional frame. But returning to lower
>>> > frames from a vararg function isn't implemented in LuaJIT -- either
>>> > the trace recording is stopped or the error is rased and the trace
>>> > isn't compiled. Also, IINM, fr operands will always be different for
>>>
>>> Nit: fr -> frame
>>>
>>> or put in backticks if you refer to a variable in source code
>>
>>Since this is only a remider for review in the ML, I've not changed it:).
>>I suppose that to mention this information in the commit message is
>>excess and important only for clarification on review.
>>
>>>
>>> > different frames, so there is no possible CSE here.
>>> >
>>> > So, these cases are needed to prevent any regressions in the future.
>>> >
>>> > Please correct me if I've missed something.
>>> >
>>> > src/lj_opt_fold.c | 11 +++
>>> > .../lj-784-cse-ref-base-over-retf.test.lua | 86 +++++++++++++++++++
>>> > 2 files changed, 97 insertions(+)
>>> > create mode 100644 test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua
>>> >
>>> > diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
>>> > index c5f2232e..750f1c7e 100644
>>> > --- a/src/lj_opt_fold.c
>>> > +++ b/src/lj_opt_fold.c
>>> > @@ -2313,6 +2313,17 @@ LJFOLDF(xload_kptr)
>>> > LJFOLD(XLOAD any any)
>>> > LJFOLDX(lj_opt_fwd_xload)
>>> >
>>> > +/* -- Frame handling ------------------------------------------------------ */
>>> > +
>>> > +/* Prevent CSE of a REF_BASE operand across IR_RETF. */
>>> > +LJFOLD(SUB any BASE)
>>> > +LJFOLD(SUB BASE any)
>>> > +LJFOLD(EQ any BASE)
>>> > +LJFOLDF(fold_base)
>>> > +{
>>> > + return lj_opt_cselim(J, J->chain[IR_RETF]);
>>> > +}
>>> > +
>>> > /* -- Write barriers ------------------------------------------------------ */
>>> >
>>> > /* Write barriers are amenable to CSE, but not across any incremental
>>> > diff --git a/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua b/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua
>>> > new file mode 100644
>>> > index 00000000..095376fc
>>> > --- /dev/null
>>> > +++ b/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua
>>> > @@ -0,0 +1,86 @@
>>> > +local tap = require('tap')
>>> > +
>>> > +-- Test file to demonstrate incorrect FOLD optimization for IR
>>> > +-- with REF_BASE operand across IR RETF.
>>> > +-- See also,https://github.com/LuaJIT/LuaJIT/issues/784.
>>> > +
>>> > +local test = tap.test('lj-784-cse-ref-base-over-retf'):skipcond({
>>> > + ['Test requires JIT enabled'] = not jit.status(),
>>> > +})
>>> > +
>>> > +test:plan(1)
>>> > +
>>> > +-- The RETF IR has a side effect: it shifts base when returning to
>>> > +-- a lower frame, i.e., it affects `REF_BASE` IR (0000) (thus, we
>>> > +-- can say that this IR is violating SSA form).
>>> > +-- So any optimization of IRs with `REF_BASE` as an operand across
>>> > +-- RETF IR may lead to incorrect optimizations.
>>> > +-- In this test, SUB uref REF_BASE IR was eliminated, so instead
>>> > +-- the following trace:
>>> > +--
>>> > +-- 0004 p32 SUB 0003 0000
>>> > +-- 0005 > p32 UGT 0004 +32
>>> > +-- ...
>>> > +-- 0009 > p32 RETF proto: 0x407dc118 [0x407dc194]
>>> > +-- ...
>>> > +-- 0012 p32 SUB 0003 0000
>>> > +-- 0013 > p32 UGT 0012 +72
>>> > +--
>>> > +-- We got the following:
>>> > +--
>>> > +-- 0004 p32 SUB 0003 0000
>>> > +-- 0005 > p32 UGT 0004 +32
>>> > +-- ...
>>> > +-- 0009 > p32 RETF proto: 0x41ffe0c0 [0x41ffe13c]
>>> > +-- ...
>>> > +-- 0012 > p32 UGT 0004 +72
>>> > +--
>>> > +-- As you can see, the 0012 SUB IR is eliminated because it is the
>>> > +-- same as the 0004 IR. This leads to incorrect assertion guards
>>> > +-- in the IR below.
>>>
>>> I would rephrase it to "assertion guards in the resulted IR"
>>>
>>> because there is no IR below the comment.
>>
>>Fixed, branch is force-pushed.
>>
>>===================================================================
>>diff --git a/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua b/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua
>>index 095376fc..d6442cbb 100644
>>--- a/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua
>>+++ b/test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua
>>@@ -37,7 +37,7 @@ test:plan(1)
>> --
>> -- As you can see, the 0012 SUB IR is eliminated because it is the
>> -- same as the 0004 IR. This leads to incorrect assertion guards
>>--- in the IR below.
>>+-- in the resulted IR 0012 below.
>> 
>> local MAGIC = 42
>> -- XXX: simplify `jit.dump()` output.
>>===================================================================
>>
>>>
>>> > +
>>> > +local MAGIC = 42
>>> > +-- XXX: simplify `jit.dump()` output.
>>> > +local fmod = math.fmod
>>> > +
>>> > +local function exit_with_retf(closure)
>>> > + -- Forcify stitch. Any NYI is OK here.
>>> > + fmod(1, 1)
>>> > + -- Call the closure so that we have emitted `uref - REF_BASE`.
>>> > + closure(0)
>>> > + -- Exit with `IR_RETF`. This will change `REF_BASE`.
>>> > +end
>>> > +
>>> > +local function sub_uref_base(closure)
>>> > + local open_upvalue
>>> > + if closure == nil then
>>> > + closure = function(val)
>>> > + local old = open_upvalue
>>> > + open_upvalue = val
>>> > + return old
>>> > + end
>>> > + -- First, create an additional frame, so we got the trace,
>>> > + -- where the open upvalue reference is always < `REF_BASE`.
>>> > + sub_uref_base(closure)
>>> > + end
>>> > + for _ = 1, 4 do
>>> > + -- `closure` function is inherited from the previous frame.
>>> > + exit_with_retf(closure)
>>> > + open_upvalue = MAGIC
>>> > + -- The open upvalue guard will use CSE over `IR_RETF` for
>>> > + -- `uref - REF_BASE`. `IR_RETF` changed the value of
>>> > + -- `REF_BASE`.
>>> > + -- Thus, the guards afterwards take the wrong IR as the first
>>> > + -- operand, so they are not failed, and the wrong value is
>>> > + -- returned from the trace.
>>> > + open_upvalue = closure(0)
>>> > + end
>>> > + return open_upvalue
>>> > +end
>>> > +
>>> > +jit.opt.start('hotloop=1')
>>> > +
>>> > +local res = sub_uref_base()
>>> > +test:is(res, MAGIC, 'no SUB uref REF_BASE CSE across RETF')
>>> > +
>>> > +test:done(true)
>>
>>--
>>Best regards,
>>Sergey Kaplun
> 

[-- Attachment #2: Type: text/html, Size: 10428 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit] Prevent CSE of a REF_BASE operand across IR_RETF.
  2023-11-28 12:21 [Tarantool-patches] [PATCH luajit] Prevent CSE of a REF_BASE operand across IR_RETF Sergey Kaplun via Tarantool-patches
  2023-11-29 14:26 ` Sergey Bronnikov via Tarantool-patches
@ 2024-01-10  8:51 ` Igor Munkin via Tarantool-patches
  1 sibling, 0 replies; 6+ messages in thread
From: Igor Munkin via Tarantool-patches @ 2024-01-10  8:51 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Sergey,

I've checked the patchset into all long-term branches in
tarantool/luajit and bumped a new version in master, release/2.11 and
release/2.10.

On 28.11.23, Sergey Kaplun via Tarantool-patches wrote:
> From: Mike Pall <mike>
> 
> Reported by XmiliaH.
> 
> (cherry-picked from commit e73916d811710ab02a4dfe447d621c99f4e7186c)
> 
> The RETF IR has a side effect: it shifts base when returning to a lower
> frame, i.e., it affects `REF_BASE` IR (0000) (thus, we can say that this
> IR is violating SSA form). So any optimization of IRs with `REF_BASE` as
> an operand across RETF IR may lead to incorrect optimizations (see
> details in the test file).
> 
> This patch adds rules to the folding engine to prevent CSE across `IR_RETF`
> for all possible IRs containing REF_BASE.
> 
> Sergey Kaplun:
> * added the description and the test for the problem
> 
> Part of tarantool/tarantool#9145
> ---
> 
> Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-784-cse-ref-base-over-retf
> Tarantool PR: https://github.com/tarantool/tarantool/pull/9421
> Related issues:
> * https://github.com/LuaJIT/LuaJIT/issues/784
> * https://github.com/tarantool/tarantool/issues/9145
> 
> Interested reviewers can mention that only the `SUB any BASE` case is
> tested.
> The reason is that other cases are impossible to record in LuaJIT:
> * EQ any BASE: EQ pgc REF_BASE IR for upvalues is emitted when
>   the open upvalue aliases a SSA slot, i.e., it belongs to the frame of
>   the currently executed function. In that case, if we want to emit RETF
>   IR, we need to leave this function. So we need to record the UCLO
>   bytecode, which is NIY in JIT. So, such a type of trace is impossible.
> * SUB BASE any: SUB BASE fr is emitted for the recording of VARG
>   bytecode, in case varargs are undefined on trace. We need a vararg
>   function to call to create an additional frame. But returning to lower
>   frames from a vararg function isn't implemented in LuaJIT -- either
>   the trace recording is stopped or the error is rased and the trace
>   isn't compiled. Also, IINM, fr operands will always be different for
>   different frames, so there is no possible CSE here.
> 
> So, these cases are needed to prevent any regressions in the future.
> 
> Please correct me if I've missed something.
> 
>  src/lj_opt_fold.c                             | 11 +++
>  .../lj-784-cse-ref-base-over-retf.test.lua    | 86 +++++++++++++++++++
>  2 files changed, 97 insertions(+)
>  create mode 100644 test/tarantool-tests/lj-784-cse-ref-base-over-retf.test.lua
> 

<snipped>

> -- 
> 2.42.1
> 

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-01-10  8:58 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-28 12:21 [Tarantool-patches] [PATCH luajit] Prevent CSE of a REF_BASE operand across IR_RETF Sergey Kaplun via Tarantool-patches
2023-11-29 14:26 ` Sergey Bronnikov via Tarantool-patches
2023-11-30  7:34   ` Sergey Kaplun via Tarantool-patches
2023-11-30  8:53     ` Sergey Bronnikov via Tarantool-patches
2023-11-30 17:59     ` Maxim Kokryashkin via Tarantool-patches
2024-01-10  8:51 ` Igor Munkin via Tarantool-patches

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox