Tarantool development patches archive
 help / color / mirror / Atom feed
From: Sergey Bronnikov via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Sergey Kaplun <skaplun@tarantool.org>,
	Maxim Kokryashkin <m.kokryashkin@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH luajit] Limit CSE for IR_CARG to fix loop optimizations.
Date: Tue, 8 Oct 2024 15:39:30 +0300	[thread overview]
Message-ID: <2486af85-0af6-4073-952c-ce5ce5a86d5c@tarantool.org> (raw)
In-Reply-To: <20240925103656.14771-1-skaplun@tarantool.org>

[-- Attachment #1: Type: text/plain, Size: 3905 bytes --]

Hi, Sergey!

LGTM with a minor question below.

On 25.09.2024 13:36, Sergey Kaplun wrote:
> From: Mike Pall <mike>
>
> Thanks to Peter Cawley.
>
> (cherry picked from commit 3bdc6498c4c012a8fbf9cfa2756a5b07f56f1540)
>
> `IR_CALLXS` for the vararg function contains `IR_CARG(fptr, ctid)` as
> the second operand. The `loop_emit_phi()` scans only the first operand
> of the IR, so the second is not marked as PHI. In this case, when the IR
> appears in both the invariant and variant parts of the loop, CSE may
> remove it and thus lead to incorrect emitting results.
>
> This patch tweaks the CSE rules to avoid CSE across the `IR_LOOP`.
>
> Sergey Kaplun:
> * added the description and the test for the problem
>
> Part of tarantool/tarantool#10199
> ---
>
> Branch:https://github.com/tarantool/luajit/tree/skaplun/lj-1244-missing-phi-carg
> Related issues:
> *https://github.com/tarantool/tarantool/issues/10199
> *https://github.com/LuaJIT/LuaJIT/issues/1244
>
>   src/lj_opt_fold.c                             | 11 ++++
>   .../lj-1244-missing-phi-carg.test.lua         | 53 +++++++++++++++++++
>   2 files changed, 64 insertions(+)
>   create mode 100644 test/tarantool-tests/lj-1244-missing-phi-carg.test.lua
>
> diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
> index e2171e1b..33e5f9dd 100644
> --- a/src/lj_opt_fold.c
> +++ b/src/lj_opt_fold.c
> @@ -2406,6 +2406,17 @@ LJFOLD(XSNEW any any)
>   LJFOLD(BUFHDR any any)
>   LJFOLDX(lj_ir_emit)
>   
> +/* -- Miscellaneous ------------------------------------------------------- */
> +
> +LJFOLD(CARG any any)
> +LJFOLDF(cse_carg)
> +{
> +  TRef tr = lj_opt_cse(J);
> +  if (tref_ref(tr) < J->chain[IR_LOOP])  /* CSE across loop? */
> +    return EMITFOLD;  /* Raw emit. Assumes fins is left intact by CSE. */
> +  return tr;
> +}
> +
>   /* ------------------------------------------------------------------------ */
>   
>   /* Every entry in the generated hash table is a 32 bit pattern:
> diff --git a/test/tarantool-tests/lj-1244-missing-phi-carg.test.lua b/test/tarantool-tests/lj-1244-missing-phi-carg.test.lua
> new file mode 100644
> index 00000000..865cdd26
> --- /dev/null
> +++ b/test/tarantool-tests/lj-1244-missing-phi-carg.test.lua
> @@ -0,0 +1,53 @@
> +local ffi = require('ffi')
> +local table_new = require('table.new')
> +
> +-- Test file to demonstrate LuaJIT incorrect behaviour for
> +-- recording the FFI call to the vararg function. See also:
> +--https://github.com/LuaJIT/LuaJIT/issues/1244.
> +local tap = require('tap')
> +local test = tap.test('lj-1244-missing-phi-carg'):skipcond({
> +  ['Test requires JIT enabled'] = not jit.status(),
> +})
> +
> +-- Loop unrolls into 2 iterations. Thus means that the loop is
> +-- executed on trace on the 5th iteration (instead of the usual
> +-- 4th). Run it even number of iterations to test both, so last is
> +-- 6th.
> +local NTESTS = 6
> +
> +test:plan(NTESTS)
> +
> +ffi.cdef[[
> +  double sin(double, ...);
> +  double cos(double, ...);

Why do you use sin/cos with wrong function prototypes if you can take

a function with varargs. (printf for example)?

> +]]
> +
> +local EXPECTED = {[0] = ffi.C.sin(0), ffi.C.cos(0)}
> +
> +-- Array of 2 functions.
> +local fns = ffi.new('double (*[2])(double, ...)')
> +fns[0] = ffi.C.cos
> +fns[1] = ffi.C.sin
> +
> +-- Avoid reallocating the table on the trace.
> +local result = table_new(8, 0)
> +
> +jit.opt.start('hotloop=1')
> +
> +local fn = fns[0]
> +-- The first result is `cos()`.
> +for i = 1, NTESTS do
> +  result[i] = fn(0)
> +  fn = fns[i % 2]
> +  -- The call persists in the invariant part of the loop as well.
> +  -- Hence, XLOAD (part of the IR_CARG -- function to be called)
> +  -- should be marked as PHI, but it isn't due to CSE.
> +  fn(0)
> +end
> +
> +for i = 1, NTESTS do
> +  test:is(result[i], EXPECTED[i % 2],
> +          ('correct result on iteration %d'):format(i))
> +end
> +
> +test:done(true)

[-- Attachment #2: Type: text/html, Size: 4749 bytes --]

  reply	other threads:[~2024-10-08 12:39 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-25 10:36 Sergey Kaplun via Tarantool-patches
2024-10-08 12:39 ` Sergey Bronnikov via Tarantool-patches [this message]
2024-10-08 13:25   ` Sergey Kaplun via Tarantool-patches
2024-10-11 14:02     ` Sergey Bronnikov via Tarantool-patches
2024-10-17  7:16       ` Sergey Kaplun via Tarantool-patches
2024-10-11 18:58 ` Maxim Kokryashkin via Tarantool-patches
2024-10-18 15:19 ` Sergey Kaplun via Tarantool-patches

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2486af85-0af6-4073-952c-ce5ce5a86d5c@tarantool.org \
    --to=tarantool-patches@dev.tarantool.org \
    --cc=m.kokryashkin@tarantool.org \
    --cc=sergeyb@tarantool.org \
    --cc=skaplun@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH luajit] Limit CSE for IR_CARG to fix loop optimizations.' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox