Tarantool development patches archive
 help / color / mirror / Atom feed
From: sergos via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Sergey Kaplun <skaplun@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH luajit] x86/x64: Check for jcc when using xor r, r in emit_loadi().
Date: Wed, 6 Jul 2022 13:30:05 +0300	[thread overview]
Message-ID: <813CA8F4-3B19-4370-A859-12B956ACCF3C@tarantool.org> (raw)
In-Reply-To: <20220704093344.13522-1-skaplun@tarantool.org>

Hi!

Thanks for the patch!

I won’t veto on the patch, since it’s from upstream. Still I have got
a number of questions on it’s crutch nature. Perhaps we need to follow
up with some test cases to luajit.

Some description fixes are needed also.

Sergos


> On 4 Jul 2022, at 12:33, Sergey Kaplun <skaplun@tarantool.org> wrote:
> 
> From: Mike Pall <mike>
> 
> Thanks to Peter Cawley.
> 
> (cherry picked from commit fb5e522fbc0750c838ef6a926b11c5d870826183)
> 
> To reproduce this issue, we need:
> 1) a register which contains the constant zero value

> 2) a floating point comparison operation

The integer one will set the same flags (CF for the case of JNB) and XOR will
clear it anyways. 

> 3) the comparison operation to perform a fused load, which in
>   turn needs to allocate a register, and for there to be no
>   free registers at that moment, and for the register chosen
>   for sacrifice to be holding the constant zero.
> 
Unfortunately, it’s not clear what this register (I suppose it’s r14d) is used for.
Is it an argument, needed after the fall-through in the same trace?
Why not it sank down below the branch? 
IIRC it is a dedicated register used for dispatch, so why is it used for sacrificing?

> This leads to assembly code like the following:
> | ucomisd xmm7, [r14+0x18]
> | xor r14d, r14d
> | jnb 0x12a0e001c ->3
> 
> That xor is a big problem, as it modifies flags between the
> ucomisd and the jnb, thereby causing the jnb to do the wrong
> thing.
> 

There’s no trace after the fix, so it’s hard to grasp the resolution used.
 
> This patch forbids emitting xor in `emit_loadi()` for jcc operations.
> 
> Sergey Kaplun:
> * added the description and the test for the problem
> 
> Part of tarantool/tarantool#7230
> ---
> 
> Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-416-xor-before-jcc-full-ci
> Issues:
> * https://github.com/LuaJIT/LuaJIT/issues/416
> * https://github.com/tarantool/tarantool/issues/7230
> 
> Changelog entry (I suggest to update this entry with the each
> corresponding bump):
> ===================================================================
> ## bugfix/luajit
> 
> Backported patches from vanilla LuaJIT trunk (gh-7230). In the scope of this
> activity, the following issues have been resolved:
> * Fixed `emit_loadi()` on x86/x64 emitting xor between condition check
>  and jump instructions.
> ===================================================================
> 
> src/lj_emit_x86.h                             |  6 +-
> test/tarantool-tests/CMakeLists.txt           |  1 +
> .../lj-416-xor-before-jcc.test.lua            | 70 +++++++++++++++++++
> .../lj-416-xor-before-jcc/CMakeLists.txt      |  1 +
> .../lj-416-xor-before-jcc/testxor.c           | 14 ++++
> 5 files changed, 90 insertions(+), 2 deletions(-)
> create mode 100644 test/tarantool-tests/lj-416-xor-before-jcc.test.lua
> create mode 100644 test/tarantool-tests/lj-416-xor-before-jcc/CMakeLists.txt
> create mode 100644 test/tarantool-tests/lj-416-xor-before-jcc/testxor.c
> 
> diff --git a/src/lj_emit_x86.h b/src/lj_emit_x86.h
> index 5207f9da..6b58306b 100644
> --- a/src/lj_emit_x86.h
> +++ b/src/lj_emit_x86.h
> @@ -274,10 +274,12 @@ static void emit_movmroi(ASMState *as, Reg base, int32_t ofs, int32_t i)
> /* mov r, i / xor r, r */
> static void emit_loadi(ASMState *as, Reg r, int32_t i)
> {
> -  /* XOR r,r is shorter, but modifies the flags. This is bad for HIOP. */
> +  /* XOR r,r is shorter, but modifies the flags. This is bad for HIOP/jcc. */
>   if (i == 0 && !(LJ_32 && (IR(as->curins)->o == IR_HIOP ||
> 			    (as->curins+1 < as->T->nins &&
> -			     IR(as->curins+1)->o == IR_HIOP)))) {
> +			     IR(as->curins+1)->o == IR_HIOP))) &&
> +		!((*as->mcp == 0x0f && (as->mcp[1] & 0xf0) == XI_JCCn) ||
> +		  (*as->mcp & 0xf0) == XI_JCCs)) {

I wonder if there can be a case with two instructions emitted between the comparison
and the jump. In such a case the xor still can sneak in? 
Another idea: there could be different instructions that relies on the flags, such as
x86Arith:XOg_ADC or x86Op:XO_CMOV, so the if condition above could be bigger?

>     emit_rr(as, XO_ARITH(XOg_XOR), r, r);
>   } else {
>     MCode *p = as->mcp;
> diff --git a/test/tarantool-tests/CMakeLists.txt b/test/tarantool-tests/CMakeLists.txt
> index 5708822e..ad69e2cc 100644
> --- a/test/tarantool-tests/CMakeLists.txt
> +++ b/test/tarantool-tests/CMakeLists.txt
> @@ -65,6 +65,7 @@ add_subdirectory(gh-5813-resolving-of-c-symbols/stripped)
> add_subdirectory(gh-6098-fix-side-exit-patching-on-arm64)
> add_subdirectory(gh-6189-cur_L)
> add_subdirectory(lj-49-bad-lightuserdata)
> +add_subdirectory(lj-416-xor-before-jcc)
> add_subdirectory(lj-601-fix-gc-finderrfunc)
> add_subdirectory(lj-727-lightuserdata-itern)
> add_subdirectory(lj-flush-on-trace)
> diff --git a/test/tarantool-tests/lj-416-xor-before-jcc.test.lua b/test/tarantool-tests/lj-416-xor-before-jcc.test.lua
> new file mode 100644
> index 00000000..7c6ab2b9
> --- /dev/null
> +++ b/test/tarantool-tests/lj-416-xor-before-jcc.test.lua
> @@ -0,0 +1,70 @@
> +local ffi = require('ffi')
> +local tap = require('tap')
> +
> +local test = tap.test('lj-416-xor-before-jcc')
> +test:plan(1)
> +
> +-- To reproduce this issue, we need:
> +-- 1) a register which contains the constant zero value
> +-- 2) a floating point comparison operation
> +-- 3) the comparison operation to perform a fused load, which in
> +--    turn needs to allocate a register, and for there to be no
> +--    free registers at that moment, and for the register chosen
> +--    for sacrifice to be holding the constant zero.
> +--
> +-- This leads to assembly code like the following:
> +--   ucomisd xmm7, [r14+0x18]
> +--   xor r14d, r14d
> +--   jnb 0x12a0e001c ->3
> +--
> +-- That xor is a big problem, as it modifies flags between the
> +-- ucomisd and the jnb, thereby causing the jnb to do the wrong
> +-- thing.

Same problems with explanation, copy from <fixed> message above

> +
> +ffi.cdef[[
> +  int test_xor_func(int a, int b, int c, int d, int e, int f, void * g, int h);
> +]]
> +local testxor = ffi.load('libtestxor')

Should have a test for x86 platform, since changes are x86 only?

> +
> +local handler = setmetatable({}, {
> +  __newindex = function ()
> +    -- 0 and nil are suggested as differnt constant-zero values
> +    -- for the call and occupied different registers.
> +    testxor.test_xor_func(0, 0, 0, 0, 0, 0, nil, 0)
> +  end
> +})
> +
> +local mconf = {
> +  { use = false, value = 100 },
> +  { use = true,  value = 100 },
> +}
> +
> +local function testf()
> +  -- Generate register pressure.
> +  local value = 50
> +  for _, rule in ipairs(mconf) do
> +    if rule.use then
> +      value = rule.value
> +      break
> +    end
> +  end
> +
> +  -- This branch shouldn't be taken.
> +  if value <= 42 then
> +    return true
> +  end
> +
> +  -- Nothing to do, just call testxor with many arguments.
> +  handler[4] = 4

If it is needed to use the metatable here then why?

> +end
> +
> +-- We need to create long side trace to generate register
> +-- pressure.
> +jit.opt.start('hotloop=1', 'hotexit=1')
> +for _ = 1, 3 do
> +  -- Don't use any `test` functions here to freeze the trace.
> +  assert (not testf())
> +end
> +test:ok(true, 'imposible branch is not taken')
> +
> +os.exit(test:check() and 0 or 1)
> diff --git a/test/tarantool-tests/lj-416-xor-before-jcc/CMakeLists.txt b/test/tarantool-tests/lj-416-xor-before-jcc/CMakeLists.txt
> new file mode 100644
> index 00000000..17aa9f9b
> --- /dev/null
> +++ b/test/tarantool-tests/lj-416-xor-before-jcc/CMakeLists.txt
> @@ -0,0 +1 @@
> +BuildTestCLib(libtestxor testxor.c)
> diff --git a/test/tarantool-tests/lj-416-xor-before-jcc/testxor.c b/test/tarantool-tests/lj-416-xor-before-jcc/testxor.c
> new file mode 100644
> index 00000000..32436d42
> --- /dev/null
> +++ b/test/tarantool-tests/lj-416-xor-before-jcc/testxor.c
> @@ -0,0 +1,14 @@
> +#define UNUSED(x) ((void)x)
> +
> +int test_xor_func(int a, int b, int c, int d, int e, int f, void *g, int h)
> +{
> +	UNUSED(a);
> +	UNUSED(b);
> +	UNUSED(c);
> +	UNUSED(d);
> +	UNUSED(e);
> +	UNUSED(f);
> +	UNUSED(g);
> +	UNUSED(h);
> +	return 0;
> +}
> -- 
> 2.34.1
> 


  parent reply	other threads:[~2022-07-06 10:30 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-04  9:33 Sergey Kaplun via Tarantool-patches
2022-07-05 15:10 ` Igor Munkin via Tarantool-patches
2022-07-06 10:30 ` sergos via Tarantool-patches [this message]
2022-11-11 10:20   ` Igor Munkin via Tarantool-patches
2022-11-22  4:36     ` Sergey Kaplun via Tarantool-patches
2022-11-22  5:45       ` Igor Munkin via Tarantool-patches
2022-08-31  8:48 ` Sergey Kaplun via Tarantool-patches
2022-08-31 16:04   ` sergos via Tarantool-patches
2022-09-01 10:32     ` Sergey Kaplun via Tarantool-patches
2022-11-23  7:50 ` Igor Munkin via Tarantool-patches

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=813CA8F4-3B19-4370-A859-12B956ACCF3C@tarantool.org \
    --to=tarantool-patches@dev.tarantool.org \
    --cc=sergos@tarantool.org \
    --cc=skaplun@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH luajit] x86/x64: Check for jcc when using xor r, r in emit_loadi().' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox