From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from [87.239.111.99] (localhost [127.0.0.1]) by dev.tarantool.org (Postfix) with ESMTP id 514C227E46; Fri, 11 Nov 2022 13:33:16 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 514C227E46 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev; t=1668162796; bh=a7C257YA1yGMK0ClnZzC2A8RLS65cfqlzOto910RUGA=; h=Date:To:References:In-Reply-To:Subject:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=UNxnLut7PH7DA78ykoSijvFxAFso/YZgefeqCzG9EYSvKSa0IVMdSORCRIetevtei ljgWfGPz7041kWo6VuWGQqAzXrINWR7eD9isFo0jc9YanB5M6m68gu4d1GUxYjVyta 0QyMEMWviaksyJt5S5+BVXt8sfKLks48oZQTadig= Received: from smtpng1.i.mail.ru (smtpng1.i.mail.ru [94.100.181.251]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 4D52427E46 for ; Fri, 11 Nov 2022 13:33:14 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 4D52427E46 Received: by smtpng1.m.smailru.net with esmtpa (envelope-from ) id 1otRLV-0002Em-6f; Fri, 11 Nov 2022 13:33:13 +0300 Date: Fri, 11 Nov 2022 13:20:32 +0300 To: sergos Message-ID: References: <20220704093344.13522-1-skaplun@tarantool.org> <813CA8F4-3B19-4370-A859-12B956ACCF3C@tarantool.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <813CA8F4-3B19-4370-A859-12B956ACCF3C@tarantool.org> X-Clacks-Overhead: GNU Terry Pratchett X-Mailru-Src: smtp X-4EC0790: 10 X-7564579A: B8F34718100C35BD X-77F55803: 4F1203BC0FB41BD9D880D530162779F13EAC6FBDF6B7DB4C8B89852CAAF032A0182A05F538085040AF3622B320F022D8E019A61408001189941B18C935E383A914DD9F9E918497C5 X-7FA49CB5: FF5795518A3D127A4AD6D5ED66289B5278DA827A17800CE7BB17EE3498E810FEEA1F7E6F0F101C67BD4B6F7A4D31EC0BCC500DACC3FED6E28638F802B75D45FF8AA50765F7900637D4360D888D8F9BE48638F802B75D45FF36EB9D2243A4F8B5A6FCA7DBDB1FC311F39EFFDF887939037866D6147AF826D8EB0501D985F8309B0946850B4D14A46D117882F4460429724CE54428C33FAD305F5C1EE8F4F765FCE2CCD8F0CAA010FB389733CBF5DBD5E9C8A9BA7A39EFB766F5D81C698A659EA7CC7F00164DA146DA9985D098DBDEAEC8B2DECCBDF547A305117882F4460429728AD0CFFFB425014E868A13BD56FB6657E2021AF6380DFAD1A18204E546F3947CB11811A4A51E3B096D1867E19FE1407959CC434672EE6371089D37D7C0E48F6C8AA50765F7900637EC3198ECE464ADAEEFF80C71ABB335746BA297DBC24807EABDAD6C7F3747799A X-C8649E89: 4E36BF7865823D7055A7F0CF078B5EC49A30900B95165D3403182E70AE6895E0273B54448B43117FC45BDEB3DCFB7625EABCBB6B5DF949A3FBC25DC1B17284061D7E09C32AA3244C7B4F8478C07E55D50187E0D1BEF74DE31DD47778AE04E04D927AC6DF5659F194 X-D57D3AED: 3ZO7eAau8CL7WIMRKs4sN3D3tLDjz0dLbV79QFUyzQ2Ujvy7cMT6pYYqY16iZVKkSc3dCLJ7zSJH7+u4VD18S7Vl4ZUrpaVfd2+vE6kuoey4m4VkSEu530nj6fImhcD4MUrOEAnl0W826KZ9Q+tr5ycPtXkTV4k65bRjmOUUP8cvGozZ33TWg5HZplvhhXbhDGzqmQDTd6OAevLeAnq3Ra9uf7zvY2zzsIhlcp/Y7m53TZgf2aB4JOg4gkr2biojSqkc8Nj21UOstafdqbRwiA== X-DA7885C5: 2CF1C47DE1B0024BB08430800E67B1172DDD083E6A6EDC2AAAA45A4C99D0EB7C262E2D401490A4A0DB037EFA58388B346E8BC1A9835FDE71 X-Mailru-Sender: 689FA8AB762F7393CC2E0F076E87284EC8140795C17E37C110250DCAC62347E4A7C8D0F45F857DBFE9F1EFEE2F478337FB559BB5D741EB964C8C2C849690F8E70A04DAD6CC59E3365FEEDEB644C299C0ED14614B50AE0675 X-Mras: Ok Subject: Re: [Tarantool-patches] [PATCH luajit] x86/x64: Check for jcc when using xor r, r in emit_loadi(). X-BeenThere: tarantool-patches@dev.tarantool.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Igor Munkin via Tarantool-patches Reply-To: Igor Munkin Cc: tarantool-patches@dev.tarantool.org Errors-To: tarantool-patches-bounces@dev.tarantool.org Sender: "Tarantool-patches" Sergos, On 06.07.22, sergos wrote: > Hi! > > Thanks for the patch! > > I won’t veto on the patch, since it’s from upstream. Still I have got > a number of questions on it’s crutch nature. Perhaps we need to follow > up with some test cases to luajit. Thanks for your precise review. I would like to shed some light onto the patch regarding LuaJIT semantics. Even if you're familiar with all specifics, it might be useful for other guys *still* reading ML and some parts can be used either in commit message or in test comments. > > Some description fixes are needed also. > > Sergos > > > > On 4 Jul 2022, at 12:33, Sergey Kaplun wrote: > > > > From: Mike Pall > > > > Thanks to Peter Cawley. > > > > (cherry picked from commit fb5e522fbc0750c838ef6a926b11c5d870826183) > > > > To reproduce this issue, we need: > > 1) a register which contains the constant zero value > > > 2) a floating point comparison operation > > The integer one will set the same flags (CF for the case of JNB) and XOR will > clear it anyways. Strictly saying, yes, I see no reason for "float" requirement here. I believe this can be changed just to "comparison operation occurs", but Sergey can correct me if I'm wrong. > > > 3) the comparison operation to perform a fused load, which in > > turn needs to allocate a register, and for there to be no > > free registers at that moment, and for the register chosen > > for sacrifice to be holding the constant zero. > > > Unfortunately, it’s not clear what this register (I suppose it’s r14d) is used for. > Is it an argument, needed after the fall-through in the same trace? > Why not it sank down below the branch? > IIRC it is a dedicated register used for dispatch, so why is it used for sacrificing? Like many things in LuaJIT register allocation is "state-of-the-art" (or as I called this "ad-hoc") entity. Since it's implemented as a linear scan RA working in a reverse direction of the trace being recorded, register is "allocated" for IR slot being used as a source and "freed" at the moment it becomes a destination. To describe the process Sergey fit in a three-bullet list above, we have the following. 0) IR for either "internal" (e.g. type check, hmask check) or "external" (e.g. branch or loop condition) guard is begin emitted to mcode. 1) JCC to side exit is emitted to the trace mcode at the beginning. 2) Condition (i.e. comparison) is going to be emitted. 3) Fuse optimization takes its place, that ought to allocate a register for the load base. 4) There is no free registers at this point. 5) The one storing the constant zero is chosen to be sacrificed and reallocated (consider allocation cost in ra_alloc for constant materialization). 6) Before (or in the sense of trace execution, after) register is being used in the aforementioned comparison, register (r14 in our case) is reset by XOR emitted right after (before) jump instruction. 7) The comparison with fused load within is emitted. As a result flags set by comparison are reset by XOR emitted in between of condition and jump instructions. > > > This leads to assembly code like the following: > > | ucomisd xmm7, [r14+0x18] > > | xor r14d, r14d > > | jnb 0x12a0e001c ->3 > > > > That xor is a big problem, as it modifies flags between the > > ucomisd and the jnb, thereby causing the jnb to do the wrong > > thing. > > > > > > diff --git a/src/lj_emit_x86.h b/src/lj_emit_x86.h > > index 5207f9da..6b58306b 100644 > > --- a/src/lj_emit_x86.h > > +++ b/src/lj_emit_x86.h > > @@ -274,10 +274,12 @@ static void emit_movmroi(ASMState *as, Reg base, int32_t ofs, int32_t i) > > /* mov r, i / xor r, r */ > > static void emit_loadi(ASMState *as, Reg r, int32_t i) > > { > > - /* XOR r,r is shorter, but modifies the flags. This is bad for HIOP. */ > > + /* XOR r,r is shorter, but modifies the flags. This is bad for HIOP/jcc. */ > > if (i == 0 && !(LJ_32 && (IR(as->curins)->o == IR_HIOP || > > (as->curins+1 < as->T->nins && > > - IR(as->curins+1)->o == IR_HIOP)))) { > > + IR(as->curins+1)->o == IR_HIOP))) && > > + !((*as->mcp == 0x0f && (as->mcp[1] & 0xf0) == XI_JCCn) || > > + (*as->mcp & 0xf0) == XI_JCCs)) { > > I wonder if there can be a case with two instructions emitted between the comparison > and the jump. In such a case the xor still can sneak in? > Another idea: there could be different instructions that relies on the flags, such as > x86Arith:XOg_ADC or x86Op:XO_CMOV, so the if condition above could be bigger? Nice question indeed! I believe, we need to re-check this case and send the patch to upstream if the issue occurs. > > > emit_rr(as, XO_ARITH(XOg_XOR), r, r); > > } else { > > MCode *p = as->mcp; > > -- > > 2.34.1 > > > -- Best regards, IM