[Tarantool-patches] [PATCH luajit] x86/x64: Check for jcc when using xor r, r in emit_loadi().
Igor Munkin
imun at tarantool.org
Tue Jul 5 18:10:23 MSK 2022
Sergey,
Thanks for the patch! Please consider my comments below.
On 04.07.22, Sergey Kaplun wrote:
> From: Mike Pall <mike>
>
> Thanks to Peter Cawley.
>
> (cherry picked from commit fb5e522fbc0750c838ef6a926b11c5d870826183)
>
> To reproduce this issue, we need:
> 1) a register which contains the constant zero value
> 2) a floating point comparison operation
> 3) the comparison operation to perform a fused load, which in
> turn needs to allocate a register, and for there to be no
> free registers at that moment, and for the register chosen
> for sacrifice to be holding the constant zero.
>
> This leads to assembly code like the following:
> | ucomisd xmm7, [r14+0x18]
> | xor r14d, r14d
> | jnb 0x12a0e001c ->3
>
> That xor is a big problem, as it modifies flags between the
> ucomisd and the jnb, thereby causing the jnb to do the wrong
> thing.
>
> This patch forbids emitting xor in `emit_loadi()` for jcc operations.
>
> Sergey Kaplun:
> * added the description and the test for the problem
>
> Part of tarantool/tarantool#7230
> ---
>
> Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-416-xor-before-jcc-full-ci
> Issues:
> * https://github.com/LuaJIT/LuaJIT/issues/416
> * https://github.com/tarantool/tarantool/issues/7230
>
> Changelog entry (I suggest to update this entry with the each
> corresponding bump):
> ===================================================================
> ## bugfix/luajit
>
> Backported patches from vanilla LuaJIT trunk (gh-7230). In the scope of this
> activity, the following issues have been resolved:
> * Fixed `emit_loadi()` on x86/x64 emitting xor between condition check
> and jump instructions.
> ===================================================================
>
> src/lj_emit_x86.h | 6 +-
> test/tarantool-tests/CMakeLists.txt | 1 +
> .../lj-416-xor-before-jcc.test.lua | 70 +++++++++++++++++++
> .../lj-416-xor-before-jcc/CMakeLists.txt | 1 +
> .../lj-416-xor-before-jcc/testxor.c | 14 ++++
> 5 files changed, 90 insertions(+), 2 deletions(-)
> create mode 100644 test/tarantool-tests/lj-416-xor-before-jcc.test.lua
> create mode 100644 test/tarantool-tests/lj-416-xor-before-jcc/CMakeLists.txt
> create mode 100644 test/tarantool-tests/lj-416-xor-before-jcc/testxor.c
>
<snipped>
> diff --git a/test/tarantool-tests/lj-416-xor-before-jcc.test.lua b/test/tarantool-tests/lj-416-xor-before-jcc.test.lua
> new file mode 100644
> index 00000000..7c6ab2b9
> --- /dev/null
> +++ b/test/tarantool-tests/lj-416-xor-before-jcc.test.lua
> @@ -0,0 +1,70 @@
> +local ffi = require('ffi')
> +local tap = require('tap')
> +
> +local test = tap.test('lj-416-xor-before-jcc')
Should this test be run only on x86_64, considering its semantics?
> +test:plan(1)
> +
> +-- To reproduce this issue, we need:
> +-- 1) a register which contains the constant zero value
> +-- 2) a floating point comparison operation
> +-- 3) the comparison operation to perform a fused load, which in
> +-- turn needs to allocate a register, and for there to be no
> +-- free registers at that moment, and for the register chosen
> +-- for sacrifice to be holding the constant zero.
> +--
> +-- This leads to assembly code like the following:
> +-- ucomisd xmm7, [r14+0x18]
> +-- xor r14d, r14d
> +-- jnb 0x12a0e001c ->3
> +--
> +-- That xor is a big problem, as it modifies flags between the
> +-- ucomisd and the jnb, thereby causing the jnb to do the wrong
> +-- thing.
> +
> +ffi.cdef[[
> + int test_xor_func(int a, int b, int c, int d, int e, int f, void * g, int h);
> +]]
> +local testxor = ffi.load('libtestxor')
> +
> +local handler = setmetatable({}, {
> + __newindex = function ()
> + -- 0 and nil are suggested as differnt constant-zero values
> + -- for the call and occupied different registers.
> + testxor.test_xor_func(0, 0, 0, 0, 0, 0, nil, 0)
Minor: Please, describe the purpose of this function.
> + end
> +})
> +
> +local mconf = {
> + { use = false, value = 100 },
> + { use = true, value = 100 },
> +}
> +
> +local function testf()
> + -- Generate register pressure.
> + local value = 50
> + for _, rule in ipairs(mconf) do
> + if rule.use then
> + value = rule.value
> + break
> + end
> + end
Minor: Could you please explain this block of code a bit?
> +
> + -- This branch shouldn't be taken.
Minor: Why this branch should be taken without the patch?
> + if value <= 42 then
> + return true
> + end
> +
> + -- Nothing to do, just call testxor with many arguments.
> + handler[4] = 4
Minor: What is 4 in both key and value senses?
> +end
> +
> +-- We need to create long side trace to generate register
> +-- pressure.
Minor: What makes the generated side trace long?
> +jit.opt.start('hotloop=1', 'hotexit=1')
> +for _ = 1, 3 do
> + -- Don't use any `test` functions here to freeze the trace.
> + assert (not testf())
Typo: s/assert (/assert(/.
> +end
> +test:ok(true, 'imposible branch is not taken')
> +
> +os.exit(test:check() and 0 or 1)
<snipped>
> --
> 2.34.1
>
--
Best regards,
IM
More information about the Tarantool-patches
mailing list