From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from [87.239.111.99] (localhost [127.0.0.1]) by dev.tarantool.org (Postfix) with ESMTP id 8338B6ECE3; Mon, 4 Jul 2022 12:36:08 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 8338B6ECE3 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev; t=1656927368; bh=4VJuztmxRBSlm5H84D2mY85iLN8tnrscHksfIdqLMwU=; h=To:Date:Subject:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:Cc:From; b=Z/YMeiu+WcWKItUFFJEhcz+w6mxb5bezh9nQU1nrxIDq8vPFh6EiyEsBkmDzULvce WVgloh1Yd2dk6s/DoDhoxxctPM22Sg+T2w8KmtImItELUOFk27RnL3ugKn+fKCZS1O nKatQ4Ofv73GAQvFrS8zLligjUEz0+LHpI8dkeTY= Received: from smtpng1.i.mail.ru (smtpng1.i.mail.ru [94.100.181.251]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 65DE56ECE3 for ; Mon, 4 Jul 2022 12:36:06 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 65DE56ECE3 Received: by smtpng1.m.smailru.net with esmtpa (envelope-from ) id 1o8IUv-00030G-9x; Mon, 04 Jul 2022 12:36:05 +0300 To: Sergey Ostanevich , Igor Munkin Date: Mon, 4 Jul 2022 12:33:44 +0300 Message-Id: <20220704093344.13522-1-skaplun@tarantool.org> X-Mailer: git-send-email 2.34.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Mailru-Src: smtp X-7564579A: 646B95376F6C166E X-77F55803: 4F1203BC0FB41BD987C0EE6E7F0A597DF6D449F86FFBCF60688355514E92E4BCCD62213F67905E7AA7A5AC251038955C0E38C4424BFB6A04910503CC6753F9CBB7AB0E7134D2E1A8 X-7FA49CB5: FF5795518A3D127A4AD6D5ED66289B5278DA827A17800CE71B1B5393BB9C1313EA1F7E6F0F101C67BD4B6F7A4D31EC0BCC500DACC3FED6E28638F802B75D45FF8AA50765F7900637E565730CA785C4918638F802B75D45FF36EB9D2243A4F8B5A6FCA7DBDB1FC311F39EFFDF887939037866D6147AF826D8EC3942E9421784CC4E202549EBC9CF0E117882F4460429724CE54428C33FAD305F5C1EE8F4F765FC8C7ADC89C2F0B2A5A471835C12D1D9774AD6D5ED66289B52BA9C0B312567BB23117882F446042972877693876707352026055571C92BF10F2CC0D3CB04F14752D2E47CDBA5A96583BA9C0B312567BB231DD303D21008E29813377AFFFEAFD269A417C69337E82CC2E827F84554CEF50127C277FBC8AE2E8BA83251EDC214901ED5E8D9A59859A8B6B1CFA6D474D4A6A4089D37D7C0E48F6C5571747095F342E88FB05168BE4CE3AF X-8FC586DF: 6EFBBC1D9D64D975 X-C1DE0DAB: C20DE7B7AB408E4181F030C43753B8186998911F362727C41E93BD56E7067354307CAA32FF218580205367B2BCC23E5B0F81E220A2C5AEF0A9D420A4CFB5DD3ED2D7854FB84A1A2F9E36249242E7E71A5E1DF3F22D3D003DD59269BC5F550898DBE8DEE28BC9005CC6B1DB194890EDFB638A446BE3E5C627BF0CFE790FC11A7261332C5CB50AE517886A5961035A09600383DAD389E261318FB05168BE4CE3AF X-C8649E89: 4E36BF7865823D7055A7F0CF078B5EC49A30900B95165D3455AC8BF8E3153BA09AF35C40BBAAE787B4C4D8FFE03A4387481391DBCA321E976E015369FD9200B11D7E09C32AA3244C0EA56AE770EC5FD5A214B6D5B5D37792F26BFA4C8A6946B8927AC6DF5659F194 X-D57D3AED: 3ZO7eAau8CL7WIMRKs4sN3D3tLDjz0dLbV79QFUyzQ2Ujvy7cMT6pYYqY16iZVKkSc3dCLJ7zSJH7+u4VD18S7Vl4ZUrpaVfd2+vE6kuoey4m4VkSEu530nj6fImhcD4MUrOEAnl0W826KZ9Q+tr5ycPtXkTV4k65bRjmOUUP8cvGozZ33TWg5HZplvhhXbhDGzqmQDTd6OAevLeAnq3Ra9uf7zvY2zzsIhlcp/Y7m53TZgf2aB4JOg4gkr2biojbL9S8ysBdXhlR8zev6ck7wQdTZei47gx X-Mailru-Sender: 689FA8AB762F7393CC2E0F076E87284EC75DC99DA08B115237FC0302342B9A620FBE9A32752B8C9C2AA642CC12EC09F1FB559BB5D741EB962F61BD320559CF1EFD657A8799238ED55FEEDEB644C299C0ED14614B50AE0675 X-Mras: Ok Subject: [Tarantool-patches] [PATCH luajit] x86/x64: Check for jcc when using xor r, r in emit_loadi(). X-BeenThere: tarantool-patches@dev.tarantool.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Sergey Kaplun via Tarantool-patches Reply-To: Sergey Kaplun Cc: tarantool-patches@dev.tarantool.org Errors-To: tarantool-patches-bounces@dev.tarantool.org Sender: "Tarantool-patches" From: Mike Pall Thanks to Peter Cawley. (cherry picked from commit fb5e522fbc0750c838ef6a926b11c5d870826183) To reproduce this issue, we need: 1) a register which contains the constant zero value 2) a floating point comparison operation 3) the comparison operation to perform a fused load, which in turn needs to allocate a register, and for there to be no free registers at that moment, and for the register chosen for sacrifice to be holding the constant zero. This leads to assembly code like the following: | ucomisd xmm7, [r14+0x18] | xor r14d, r14d | jnb 0x12a0e001c ->3 That xor is a big problem, as it modifies flags between the ucomisd and the jnb, thereby causing the jnb to do the wrong thing. This patch forbids emitting xor in `emit_loadi()` for jcc operations. Sergey Kaplun: * added the description and the test for the problem Part of tarantool/tarantool#7230 --- Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-416-xor-before-jcc-full-ci Issues: * https://github.com/LuaJIT/LuaJIT/issues/416 * https://github.com/tarantool/tarantool/issues/7230 Changelog entry (I suggest to update this entry with the each corresponding bump): =================================================================== ## bugfix/luajit Backported patches from vanilla LuaJIT trunk (gh-7230). In the scope of this activity, the following issues have been resolved: * Fixed `emit_loadi()` on x86/x64 emitting xor between condition check and jump instructions. =================================================================== src/lj_emit_x86.h | 6 +- test/tarantool-tests/CMakeLists.txt | 1 + .../lj-416-xor-before-jcc.test.lua | 70 +++++++++++++++++++ .../lj-416-xor-before-jcc/CMakeLists.txt | 1 + .../lj-416-xor-before-jcc/testxor.c | 14 ++++ 5 files changed, 90 insertions(+), 2 deletions(-) create mode 100644 test/tarantool-tests/lj-416-xor-before-jcc.test.lua create mode 100644 test/tarantool-tests/lj-416-xor-before-jcc/CMakeLists.txt create mode 100644 test/tarantool-tests/lj-416-xor-before-jcc/testxor.c diff --git a/src/lj_emit_x86.h b/src/lj_emit_x86.h index 5207f9da..6b58306b 100644 --- a/src/lj_emit_x86.h +++ b/src/lj_emit_x86.h @@ -274,10 +274,12 @@ static void emit_movmroi(ASMState *as, Reg base, int32_t ofs, int32_t i) /* mov r, i / xor r, r */ static void emit_loadi(ASMState *as, Reg r, int32_t i) { - /* XOR r,r is shorter, but modifies the flags. This is bad for HIOP. */ + /* XOR r,r is shorter, but modifies the flags. This is bad for HIOP/jcc. */ if (i == 0 && !(LJ_32 && (IR(as->curins)->o == IR_HIOP || (as->curins+1 < as->T->nins && - IR(as->curins+1)->o == IR_HIOP)))) { + IR(as->curins+1)->o == IR_HIOP))) && + !((*as->mcp == 0x0f && (as->mcp[1] & 0xf0) == XI_JCCn) || + (*as->mcp & 0xf0) == XI_JCCs)) { emit_rr(as, XO_ARITH(XOg_XOR), r, r); } else { MCode *p = as->mcp; diff --git a/test/tarantool-tests/CMakeLists.txt b/test/tarantool-tests/CMakeLists.txt index 5708822e..ad69e2cc 100644 --- a/test/tarantool-tests/CMakeLists.txt +++ b/test/tarantool-tests/CMakeLists.txt @@ -65,6 +65,7 @@ add_subdirectory(gh-5813-resolving-of-c-symbols/stripped) add_subdirectory(gh-6098-fix-side-exit-patching-on-arm64) add_subdirectory(gh-6189-cur_L) add_subdirectory(lj-49-bad-lightuserdata) +add_subdirectory(lj-416-xor-before-jcc) add_subdirectory(lj-601-fix-gc-finderrfunc) add_subdirectory(lj-727-lightuserdata-itern) add_subdirectory(lj-flush-on-trace) diff --git a/test/tarantool-tests/lj-416-xor-before-jcc.test.lua b/test/tarantool-tests/lj-416-xor-before-jcc.test.lua new file mode 100644 index 00000000..7c6ab2b9 --- /dev/null +++ b/test/tarantool-tests/lj-416-xor-before-jcc.test.lua @@ -0,0 +1,70 @@ +local ffi = require('ffi') +local tap = require('tap') + +local test = tap.test('lj-416-xor-before-jcc') +test:plan(1) + +-- To reproduce this issue, we need: +-- 1) a register which contains the constant zero value +-- 2) a floating point comparison operation +-- 3) the comparison operation to perform a fused load, which in +-- turn needs to allocate a register, and for there to be no +-- free registers at that moment, and for the register chosen +-- for sacrifice to be holding the constant zero. +-- +-- This leads to assembly code like the following: +-- ucomisd xmm7, [r14+0x18] +-- xor r14d, r14d +-- jnb 0x12a0e001c ->3 +-- +-- That xor is a big problem, as it modifies flags between the +-- ucomisd and the jnb, thereby causing the jnb to do the wrong +-- thing. + +ffi.cdef[[ + int test_xor_func(int a, int b, int c, int d, int e, int f, void * g, int h); +]] +local testxor = ffi.load('libtestxor') + +local handler = setmetatable({}, { + __newindex = function () + -- 0 and nil are suggested as differnt constant-zero values + -- for the call and occupied different registers. + testxor.test_xor_func(0, 0, 0, 0, 0, 0, nil, 0) + end +}) + +local mconf = { + { use = false, value = 100 }, + { use = true, value = 100 }, +} + +local function testf() + -- Generate register pressure. + local value = 50 + for _, rule in ipairs(mconf) do + if rule.use then + value = rule.value + break + end + end + + -- This branch shouldn't be taken. + if value <= 42 then + return true + end + + -- Nothing to do, just call testxor with many arguments. + handler[4] = 4 +end + +-- We need to create long side trace to generate register +-- pressure. +jit.opt.start('hotloop=1', 'hotexit=1') +for _ = 1, 3 do + -- Don't use any `test` functions here to freeze the trace. + assert (not testf()) +end +test:ok(true, 'imposible branch is not taken') + +os.exit(test:check() and 0 or 1) diff --git a/test/tarantool-tests/lj-416-xor-before-jcc/CMakeLists.txt b/test/tarantool-tests/lj-416-xor-before-jcc/CMakeLists.txt new file mode 100644 index 00000000..17aa9f9b --- /dev/null +++ b/test/tarantool-tests/lj-416-xor-before-jcc/CMakeLists.txt @@ -0,0 +1 @@ +BuildTestCLib(libtestxor testxor.c) diff --git a/test/tarantool-tests/lj-416-xor-before-jcc/testxor.c b/test/tarantool-tests/lj-416-xor-before-jcc/testxor.c new file mode 100644 index 00000000..32436d42 --- /dev/null +++ b/test/tarantool-tests/lj-416-xor-before-jcc/testxor.c @@ -0,0 +1,14 @@ +#define UNUSED(x) ((void)x) + +int test_xor_func(int a, int b, int c, int d, int e, int f, void *g, int h) +{ + UNUSED(a); + UNUSED(b); + UNUSED(c); + UNUSED(d); + UNUSED(e); + UNUSED(f); + UNUSED(g); + UNUSED(h); + return 0; +} -- 2.34.1