Date: Tue, 5 Jul 2022 18:10:23 +0300
To: Sergey Kaplun
References: <20220704093344.13522-1-skaplun@tarantool.org>
In-Reply-To: <20220704093344.13522-1-skaplun@tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH luajit] x86/x64: Check for jcc when using xor r, r in emit_loadi().
From: Igor Munkin via Tarantool-patches
Reply-To: Igor Munkin
Cc: tarantool-patches@dev.tarantool.org

Sergey,

Thanks for the patch! Please consider my comments below.

On 04.07.22, Sergey Kaplun wrote:
> From: Mike Pall
>
> Thanks to Peter Cawley.
>
> (cherry picked from commit fb5e522fbc0750c838ef6a926b11c5d870826183)
>
> To reproduce this issue, we need:
> 1) a register which contains the constant zero value
> 2) a floating point comparison operation
> 3) the comparison operation to perform a fused load, which in
>    turn needs to allocate a register, and for there to be no
>    free registers at that moment, and for the register chosen
>    for sacrifice to be holding the constant zero.
>
> This leads to assembly code like the following:
> | ucomisd xmm7, [r14+0x18]
> | xor r14d, r14d
> | jnb 0x12a0e001c ->3
>
> That xor is a big problem, as it modifies flags between the
> ucomisd and the jnb, thereby causing the jnb to do the wrong
> thing.
>
> This patch forbids emitting xor in `emit_loadi()` for jcc operations.
>
> Sergey Kaplun:
> * added the description and the test for the problem
>
> Part of tarantool/tarantool#7230
> ---
>
> Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-416-xor-before-jcc-full-ci
> Issues:
> * https://github.com/LuaJIT/LuaJIT/issues/416
> * https://github.com/tarantool/tarantool/issues/7230
>
> Changelog entry (I suggest to update this entry with the each
> corresponding bump):
> ===================================================================
> ## bugfix/luajit
>
> Backported patches from vanilla LuaJIT trunk (gh-7230).
> In the scope of this activity, the following issues have been resolved:
> * Fixed `emit_loadi()` on x86/x64 emitting xor between condition check
>   and jump instructions.
> ===================================================================
>
>  src/lj_emit_x86.h                             |  6 +-
>  test/tarantool-tests/CMakeLists.txt           |  1 +
>  .../lj-416-xor-before-jcc.test.lua            | 70 +++++++++++++++++++
>  .../lj-416-xor-before-jcc/CMakeLists.txt      |  1 +
>  .../lj-416-xor-before-jcc/testxor.c           | 14 ++++
>  5 files changed, 90 insertions(+), 2 deletions(-)
>  create mode 100644 test/tarantool-tests/lj-416-xor-before-jcc.test.lua
>  create mode 100644 test/tarantool-tests/lj-416-xor-before-jcc/CMakeLists.txt
>  create mode 100644 test/tarantool-tests/lj-416-xor-before-jcc/testxor.c
>
> diff --git a/test/tarantool-tests/lj-416-xor-before-jcc.test.lua b/test/tarantool-tests/lj-416-xor-before-jcc.test.lua
> new file mode 100644
> index 00000000..7c6ab2b9
> --- /dev/null
> +++ b/test/tarantool-tests/lj-416-xor-before-jcc.test.lua
> @@ -0,0 +1,70 @@
> +local ffi = require('ffi')
> +local tap = require('tap')
> +
> +local test = tap.test('lj-416-xor-before-jcc')

Should this test be run only on x86_64, considering its semantics?

> +test:plan(1)
> +
> +-- To reproduce this issue, we need:
> +-- 1) a register which contains the constant zero value
> +-- 2) a floating point comparison operation
> +-- 3) the comparison operation to perform a fused load, which in
> +--    turn needs to allocate a register, and for there to be no
> +--    free registers at that moment, and for the register chosen
> +--    for sacrifice to be holding the constant zero.
> +--
> +-- This leads to assembly code like the following:
> +--   ucomisd xmm7, [r14+0x18]
> +--   xor r14d, r14d
> +--   jnb 0x12a0e001c ->3
> +--
> +-- That xor is a big problem, as it modifies flags between the
> +-- ucomisd and the jnb, thereby causing the jnb to do the wrong
> +-- thing.
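
For reference, the rule that the comment above describes can be sketched in
C. This is only an illustration under stated assumptions, not the actual
change to src/lj_emit_x86.h: `starts_with_jcc()` and `encode_loadi()` are
hypothetical names, the caller passes the bytes of the instruction that
executes right after the load, and only registers 0..7 are handled (so no
REX prefix, which `r14d` from the example would need).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: does the code at p begin with a conditional jump?
** Short jcc opcodes are 0x70..0x7f, near jcc is 0x0f then 0x80..0x8f. */
static int starts_with_jcc(const uint8_t *p, size_t len)
{
  if (len >= 1 && (p[0] & 0xf0) == 0x70) return 1;
  if (len >= 2 && p[0] == 0x0f && (p[1] & 0xf0) == 0x80) return 1;
  return 0;
}

/* Hypothetical encoder: write "load constant i into 32-bit register r"
** to out (at least 5 bytes) and return the encoded length. next/nextlen
** describe the instruction that executes right after the load. */
static size_t encode_loadi(uint8_t *out, unsigned r, int32_t i,
                           const uint8_t *next, size_t nextlen)
{
  uint32_t u = (uint32_t)i;
  assert(r < 8);  /* Registers that need a REX prefix are out of scope. */
  if (i == 0 && !starts_with_jcc(next, nextlen)) {
    /* xor r, r: two bytes, but it overwrites the flags. */
    out[0] = 0x33;
    out[1] = (uint8_t)(0xc0 | (r << 3) | r);
    return 2;
  }
  /* mov r, imm32: five bytes, the flags stay intact. */
  out[0] = (uint8_t)(0xb8 + r);
  out[1] = (uint8_t)(u & 0xff);
  out[2] = (uint8_t)((u >> 8) & 0xff);
  out[3] = (uint8_t)((u >> 16) & 0xff);
  out[4] = (uint8_t)((u >> 24) & 0xff);
  return 5;
}
```

With a following `jnb` (bytes 0x0f 0x83 ...), the sketch falls back to the
longer mov, so the flags set by the comparison survive until the jump. With
any other following instruction, a zero constant still gets the short xor.
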
> +
> +ffi.cdef[[
> +  int test_xor_func(int a, int b, int c, int d, int e, int f, void * g, int h);
> +]]
> +local testxor = ffi.load('libtestxor')
> +
> +local handler = setmetatable({}, {
> +  __newindex = function ()
> +    -- 0 and nil are suggested as differnt constant-zero values
> +    -- for the call and occupied different registers.
> +    testxor.test_xor_func(0, 0, 0, 0, 0, 0, nil, 0)

Minor: Please describe the purpose of this function.

> +  end
> +})
> +
> +local mconf = {
> +  { use = false, value = 100 },
> +  { use = true, value = 100 },
> +}
> +
> +local function testf()
> +  -- Generate register pressure.
> +  local value = 50
> +  for _, rule in ipairs(mconf) do
> +    if rule.use then
> +      value = rule.value
> +      break
> +    end
> +  end

Minor: Could you please explain this block of code a bit?

> +
> +  -- This branch shouldn't be taken.

Minor: Why would this branch be taken without the patch?

> +  if value <= 42 then
> +    return true
> +  end
> +
> +  -- Nothing to do, just call testxor with many arguments.
> +  handler[4] = 4

Minor: What does 4 mean here, both as the key and as the value?

> +end
> +
> +-- We need to create long side trace to generate register
> +-- pressure.

Minor: What makes the generated side trace long?

> +jit.opt.start('hotloop=1', 'hotexit=1')
> +for _ = 1, 3 do
> +  -- Don't use any `test` functions here to freeze the trace.
> +  assert (not testf())

Typo: s/assert (/assert(/.

> +end
> +test:ok(true, 'imposible branch is not taken')
> +
> +os.exit(test:check() and 0 or 1)
> --
> 2.34.1
>

--
Best regards,
IM