Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH luajit] x86/x64: Check for jcc when using xor r, r in emit_loadi().
@ 2022-07-04  9:33 Sergey Kaplun via Tarantool-patches
  2022-07-05 15:10 ` Igor Munkin via Tarantool-patches
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2022-07-04  9:33 UTC (permalink / raw)
  To: Sergey Ostanevich, Igor Munkin; +Cc: tarantool-patches

From: Mike Pall <mike>

Thanks to Peter Cawley.

(cherry picked from commit fb5e522fbc0750c838ef6a926b11c5d870826183)

To reproduce this issue, we need:
1) a register which contains the constant zero value
2) a floating point comparison operation
3) the comparison operation to perform a fused load, which in
   turn needs to allocate a register, and for there to be no
   free registers at that moment, and for the register chosen
   for sacrifice to be holding the constant zero.

This leads to assembly code like the following:
| ucomisd xmm7, [r14+0x18]
| xor r14d, r14d
| jnb 0x12a0e001c ->3

That xor is a big problem, as it modifies flags between the
ucomisd and the jnb, thereby causing the jnb to do the wrong
thing.

This patch forbids emitting xor in `emit_loadi()` for jcc operations.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#7230
---

Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-416-xor-before-jcc-full-ci
Issues:
* https://github.com/LuaJIT/LuaJIT/issues/416
* https://github.com/tarantool/tarantool/issues/7230

Changelog entry (I suggest to update this entry with the each
corresponding bump):
===================================================================
## bugfix/luajit

Backported patches from vanilla LuaJIT trunk (gh-7230). In the scope of this
activity, the following issues have been resolved:
* Fixed `emit_loadi()` on x86/x64 emitting xor between condition check
  and jump instructions.
===================================================================

 src/lj_emit_x86.h                             |  6 +-
 test/tarantool-tests/CMakeLists.txt           |  1 +
 .../lj-416-xor-before-jcc.test.lua            | 70 +++++++++++++++++++
 .../lj-416-xor-before-jcc/CMakeLists.txt      |  1 +
 .../lj-416-xor-before-jcc/testxor.c           | 14 ++++
 5 files changed, 90 insertions(+), 2 deletions(-)
 create mode 100644 test/tarantool-tests/lj-416-xor-before-jcc.test.lua
 create mode 100644 test/tarantool-tests/lj-416-xor-before-jcc/CMakeLists.txt
 create mode 100644 test/tarantool-tests/lj-416-xor-before-jcc/testxor.c

diff --git a/src/lj_emit_x86.h b/src/lj_emit_x86.h
index 5207f9da..6b58306b 100644
--- a/src/lj_emit_x86.h
+++ b/src/lj_emit_x86.h
@@ -274,10 +274,12 @@ static void emit_movmroi(ASMState *as, Reg base, int32_t ofs, int32_t i)
 /* mov r, i / xor r, r */
 static void emit_loadi(ASMState *as, Reg r, int32_t i)
 {
-  /* XOR r,r is shorter, but modifies the flags. This is bad for HIOP. */
+  /* XOR r,r is shorter, but modifies the flags. This is bad for HIOP/jcc. */
   if (i == 0 && !(LJ_32 && (IR(as->curins)->o == IR_HIOP ||
 			    (as->curins+1 < as->T->nins &&
-			     IR(as->curins+1)->o == IR_HIOP)))) {
+			     IR(as->curins+1)->o == IR_HIOP))) &&
+		!((*as->mcp == 0x0f && (as->mcp[1] & 0xf0) == XI_JCCn) ||
+		  (*as->mcp & 0xf0) == XI_JCCs)) {
     emit_rr(as, XO_ARITH(XOg_XOR), r, r);
   } else {
     MCode *p = as->mcp;
diff --git a/test/tarantool-tests/CMakeLists.txt b/test/tarantool-tests/CMakeLists.txt
index 5708822e..ad69e2cc 100644
--- a/test/tarantool-tests/CMakeLists.txt
+++ b/test/tarantool-tests/CMakeLists.txt
@@ -65,6 +65,7 @@ add_subdirectory(gh-5813-resolving-of-c-symbols/stripped)
 add_subdirectory(gh-6098-fix-side-exit-patching-on-arm64)
 add_subdirectory(gh-6189-cur_L)
 add_subdirectory(lj-49-bad-lightuserdata)
+add_subdirectory(lj-416-xor-before-jcc)
 add_subdirectory(lj-601-fix-gc-finderrfunc)
 add_subdirectory(lj-727-lightuserdata-itern)
 add_subdirectory(lj-flush-on-trace)
diff --git a/test/tarantool-tests/lj-416-xor-before-jcc.test.lua b/test/tarantool-tests/lj-416-xor-before-jcc.test.lua
new file mode 100644
index 00000000..7c6ab2b9
--- /dev/null
+++ b/test/tarantool-tests/lj-416-xor-before-jcc.test.lua
@@ -0,0 +1,70 @@
+local ffi = require('ffi')
+local tap = require('tap')
+
+local test = tap.test('lj-416-xor-before-jcc')
+test:plan(1)
+
+-- To reproduce this issue, we need:
+-- 1) a register which contains the constant zero value
+-- 2) a floating point comparison operation
+-- 3) the comparison operation to perform a fused load, which in
+--    turn needs to allocate a register, and for there to be no
+--    free registers at that moment, and for the register chosen
+--    for sacrifice to be holding the constant zero.
+--
+-- This leads to assembly code like the following:
+--   ucomisd xmm7, [r14+0x18]
+--   xor r14d, r14d
+--   jnb 0x12a0e001c ->3
+--
+-- That xor is a big problem, as it modifies flags between the
+-- ucomisd and the jnb, thereby causing the jnb to do the wrong
+-- thing.
+
+ffi.cdef[[
+  int test_xor_func(int a, int b, int c, int d, int e, int f, void * g, int h);
+]]
+local testxor = ffi.load('libtestxor')
+
+local handler = setmetatable({}, {
+  __newindex = function ()
+    -- 0 and nil are suggested as differnt constant-zero values
+    -- for the call and occupied different registers.
+    testxor.test_xor_func(0, 0, 0, 0, 0, 0, nil, 0)
+  end
+})
+
+local mconf = {
+  { use = false, value = 100 },
+  { use = true,  value = 100 },
+}
+
+local function testf()
+  -- Generate register pressure.
+  local value = 50
+  for _, rule in ipairs(mconf) do
+    if rule.use then
+      value = rule.value
+      break
+    end
+  end
+
+  -- This branch shouldn't be taken.
+  if value <= 42 then
+    return true
+  end
+
+  -- Nothing to do, just call testxor with many arguments.
+  handler[4] = 4
+end
+
+-- We need to create long side trace to generate register
+-- pressure.
+jit.opt.start('hotloop=1', 'hotexit=1')
+for _ = 1, 3 do
+  -- Don't use any `test` functions here to freeze the trace.
+  assert (not testf())
+end
+test:ok(true, 'imposible branch is not taken')
+
+os.exit(test:check() and 0 or 1)
diff --git a/test/tarantool-tests/lj-416-xor-before-jcc/CMakeLists.txt b/test/tarantool-tests/lj-416-xor-before-jcc/CMakeLists.txt
new file mode 100644
index 00000000..17aa9f9b
--- /dev/null
+++ b/test/tarantool-tests/lj-416-xor-before-jcc/CMakeLists.txt
@@ -0,0 +1 @@
+BuildTestCLib(libtestxor testxor.c)
diff --git a/test/tarantool-tests/lj-416-xor-before-jcc/testxor.c b/test/tarantool-tests/lj-416-xor-before-jcc/testxor.c
new file mode 100644
index 00000000..32436d42
--- /dev/null
+++ b/test/tarantool-tests/lj-416-xor-before-jcc/testxor.c
@@ -0,0 +1,14 @@
+#define UNUSED(x) ((void)x)
+
+int test_xor_func(int a, int b, int c, int d, int e, int f, void *g, int h)
+{
+	UNUSED(a);
+	UNUSED(b);
+	UNUSED(c);
+	UNUSED(d);
+	UNUSED(e);
+	UNUSED(f);
+	UNUSED(g);
+	UNUSED(h);
+	return 0;
+}
-- 
2.34.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-11-23  8:04 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-04  9:33 [Tarantool-patches] [PATCH luajit] x86/x64: Check for jcc when using xor r, r in emit_loadi() Sergey Kaplun via Tarantool-patches
2022-07-05 15:10 ` Igor Munkin via Tarantool-patches
2022-07-06 10:30 ` sergos via Tarantool-patches
2022-11-11 10:20   ` Igor Munkin via Tarantool-patches
2022-11-22  4:36     ` Sergey Kaplun via Tarantool-patches
2022-11-22  5:45       ` Igor Munkin via Tarantool-patches
2022-08-31  8:48 ` Sergey Kaplun via Tarantool-patches
2022-08-31 16:04   ` sergos via Tarantool-patches
2022-09-01 10:32     ` Sergey Kaplun via Tarantool-patches
2022-11-23  7:50 ` Igor Munkin via Tarantool-patches

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox