From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from [87.239.111.99] (localhost [127.0.0.1]) by dev.tarantool.org (Postfix) with ESMTP id EC86414F5F8C; Mon, 8 Sep 2025 15:15:45 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org EC86414F5F8C DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev; t=1757333746; bh=t7r1tSBhiWyD0a/ZHkpdUf8A1N7kF8PFSYs/qO67obE=; h=Date:To:Cc:References:In-Reply-To:Subject:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=cuAaxk76XfMcoNZ1RtJAcSSqgiwnK6fSg7jyvR93S0KDR4zoK39tIBCdJ5rpR17WZ w0KBEsBFlIMotVr3bVLFSeOX0rwPvc7bQUOdgxbxYdbAwCCi+jqFgYJDDcSZJph/tp w+fzcz4vTElZ9dUpUP6SQ8pDolKlWYjUYkYN3sRE= Received: from send197.i.mail.ru (send197.i.mail.ru [95.163.59.36]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 7059F14F5F8A for ; Mon, 8 Sep 2025 15:15:44 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 7059F14F5F8A Received: by exim-smtp-c584fb9f-kp4vq with esmtpa (envelope-from ) id 1uvamd-000000007yx-2Xvm; Mon, 08 Sep 2025 15:15:43 +0300 Content-Type: multipart/alternative; boundary="------------qO9L0IohCB4uiaAo8fs2io2B" Message-ID: <178227cb-433b-4ea1-8180-95dc2c805b70@tarantool.org> Date: Mon, 8 Sep 2025 15:15:43 +0300 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Content-Language: en-US To: Sergey Kaplun Cc: tarantool-patches@dev.tarantool.org References: <20250820114959.27378-1-skaplun@tarantool.org> In-Reply-To: <20250820114959.27378-1-skaplun@tarantool.org> X-Mailru-Src: smtp X-4EC0790: 10 X-7564579A: 646B95376F6C166E X-77F55803: 4F1203BC0FB41BD9C019336662AA2141DF6A44063013B3AFC43E2DCECF84F90C182A05F538085040CF711F901B4FAB3D3DE06ABAFEAF67058C39535355480B39736C06C7A9DE227B9799EF890AE2FEA9 X-7FA49CB5: FF5795518A3D127A4AD6D5ED66289B5278DA827A17800CE75644E22E05AA81AEB287FD4696A6DC2FA8DF7F3B2552694A4E2F5AFA99E116B42401471946AA11AFA417C69337E82CC2BCF491FFA38154B6B5C8C57E37DE458B543C44CD4AFD0029C6CDE5D1141D2B1C1C53048140BF439207921919C007305C48216D7483AA14AF193CDA023E6BD71F8941B15DA834481FA18204E546F3947C0839144E5BB460BAF6B57BC7E64490618DEB871D839B7333395957E7521B51C2DFABB839C843B9C08941B15DA834481F8AA50765F790063765D76192D7FB77C9389733CBF5DBD5E9B5C8C57E37DE458BD9DD9810294C998ED8FC6C240DEA76428AA50765F79006375F9A1B9E53A3C84ED81D268191BDAD3DBD4B6F7A4D31EC0BE2F48590F00D11D6D81D268191BDAD3D78DA827A17800CE79E777A86B99D0C02EC76A7562686271ED91E3A1F190DE8FD2E808ACE2090B5E14AD6D5ED66289B5259CC434672EE63711DD303D21008E298D5E8D9A59859A8B6B372FE9A2E580EFC725E5C173C3A84C36174550A02D153F535872C767BF85DA2F004C90652538430E4A6367B16DE6309 X-C1DE0DAB: 0D63561A33F958A55CAEF3835459B5695002B1117B3ED6967E310EB3ADBAE41C33EE06AFCD964888823CB91A9FED034534781492E4B8EEADCAFEFF123806BC82BDAD6C7F3747799A X-C8649E89: 1C3962B70DF3F0ADBF74143AD284FC7177DD89D51EBB7742424CF958EAFF5D571004E42C50DC4CA955A7F0CF078B5EC49A30900B95165D3435BBF0AC4E3A921C7E8D881414649E82D94DE2881C9277F66D7DBE822CF3FB0CC93486D36723D96E1D7E09C32AA3244C6B0D94C5E8205F5077DD89D51EBB774236DED0502F3C7B31EA455F16B58544A2E30DDF7C44BCB90DA5AE236DF995FB59978A700BF655EAEEED6A17656DB59BCAD427812AF56FC65B X-D57D3AED: 3ZO7eAau8CL7WIMRKs4sN3D3tLDjz0dLbV79QFUyzQ2Ujvy7cMT6pYYqY16iZVKkSc3dCLJ7zSJH7+u4VD18S7Vl4ZUrpaVfd2+vE6kuoey4m4VkSEu53w8ahmwBjZKM/YPHZyZHvz5uv+WouB9+ObcCpyrx6l7KImUglyhkEat/+ysWwi0gdhEs0JGjl6ggRWTy1haxBpVdbIX1nthFXMZebaIdHP2ghjoIc/363UZI6Kf1ptIMVdVMtzNxwZu5Ja/um4uFy/4= X-Mailru-Sender: 811C44EDE0507D1F797560C68D020EBDF3C2AA7C381ED7B8CE7DA15FCD53A221EDA1A9C3F7DAA78CF2DA82EE4B13C7E3645D15D82EE4B272BD6E4642A116CA93524AA66B5ACBE6721EF430B9A63E2A504198E0F3ECE9B5443453F38A29522196 X-Mras: Ok Subject: Re: [Tarantool-patches] [PATCH luajit] x86/x64: Don't use undefined MUL/IMUL zero flag. X-BeenThere: tarantool-patches@dev.tarantool.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Sergey Bronnikov via Tarantool-patches Reply-To: Sergey Bronnikov Errors-To: tarantool-patches-bounces@dev.tarantool.org Sender: "Tarantool-patches" This is a multi-part message in MIME format. --------------qO9L0IohCB4uiaAo8fs2io2B Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Hi,  Sergey, thanks for the patch! LGTM Sergey On 8/20/25 14:49, Sergey Kaplun wrote: > From: Mike Pall > > Reported by VrIgHtEr. > > (cherry picked from commit c92d0cb19263e7e302b4740ba6617a32c201c613) > > When emitting the arithmetic operations on registers via the > `asm_intarith()`, the next `test` instruction may be dropped since the > flag register is modified by the arithmetic instruction to be emitted. > > But the `imul` instruction [1] doesn't modify ZF, so its value is > undefined. This patch prevents dropping the `test` instruction if > the emitted instruction is `imul`. > > Sergey Kaplun: > * added the description and the test for the problem > > [1]:https://www.felixcloutier.com/x86/imul > > Part of tarantool/tarantool#11691 > --- > > Branch:https://github.com/tarantool/luajit/tree/skaplun/lj-1376-undefined-mul-test-flag > Related issues: > *https://github.com/tarantool/tarantool/issues/11691 > *https://github.com/LuaJIT/LuaJIT/issues/1376 > > src/lj_asm_x86.h | 3 +- > .../lj-1376-undefined-mul-test-flag.test.lua | 38 +++++++++++++++++++ > 2 files changed, 40 insertions(+), 1 deletion(-) > create mode 100644 test/tarantool-tests/lj-1376-undefined-mul-test-flag.test.lua > > diff --git a/src/lj_asm_x86.h b/src/lj_asm_x86.h > index 89e83205..0e3a473c 100644 > --- a/src/lj_asm_x86.h > +++ b/src/lj_asm_x86.h > @@ -2061,7 +2061,8 @@ static void asm_intarith(ASMState *as, IRIns *ir, x86Arith xa) > RegSet allow = RSET_GPR; > Reg dest, right; > int32_t k = 0; > - if (as->flagmcp == as->mcp) { /* Drop test r,r instruction. */ > + if (as->flagmcp == as->mcp && xa != XOg_X_IMUL) { > + /* Drop test r,r instruction. */ > MCode *p = as->mcp + ((LJ_64 && *as->mcp < XI_TESTb) ? 3 : 2); > MCode *q = p[0] == 0x0f ? p+1 : p; > if ((*q & 15) < 14) { > diff --git a/test/tarantool-tests/lj-1376-undefined-mul-test-flag.test.lua b/test/tarantool-tests/lj-1376-undefined-mul-test-flag.test.lua > new file mode 100644 > index 00000000..f6c02a00 > --- /dev/null > +++ b/test/tarantool-tests/lj-1376-undefined-mul-test-flag.test.lua > @@ -0,0 +1,38 @@ > +local tap = require('tap') > + > +-- Test file to demonstrate incorrect assembling optimization > +-- for x86/x64 CPUs. > +-- See also:https://github.com/LuaJIT/LuaJIT/issues/1376. > + > +local test = tap.test('lj-1376-undefined-mul-test-flag'):skipcond({ > + ['Test requires JIT enabled'] = not jit.status(), > +}) > + > +test:plan(1) > + > +local a, b = 0ULL, 0ULL > + > +jit.opt.start('hotloop=1') > +for _ = 1, 4 do > + -- Before the patch, the `test` instruction is dropped by > + -- assuming the `imul` instruction before it modifies the flags > + -- register. It results in the following mcode: > + -- | imul r15, rbp > + -- | jnz 0x559415b10060 ->5 > + -- Instead of the following: > + -- | imul r15, rbp > + -- | test r15, r15 > + -- | jnz 0x559415b10060 ->5 > + -- This leads to the incorrect branch being taken. > + if a * b ~= 0ULL then > +test:fail('the impossible branch is taken') > +test:done(true) > + end > + -- XXX: Need to update multiplier to stay in the variant part of > + -- the loop, since invariant contains IR_NOP (former unused > + -- IR_CNEW) between IRs, and the optimization is not applied. > + b = b + 1 > +end > + > +test:ok(true, 'no dropping of test instruction') > +test:done(true) --------------qO9L0IohCB4uiaAo8fs2io2B Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 8bit

Hi,  Sergey,

thanks for the patch! LGTM

Sergey

On 8/20/25 14:49, Sergey Kaplun wrote:
From: Mike Pall <mike>

Reported by VrIgHtEr.

(cherry picked from commit c92d0cb19263e7e302b4740ba6617a32c201c613)

When emitting the arithmetic operations on registers via the
`asm_intarith()`, the next `test` instruction may be dropped since the
flag register is modified by the arithmetic instruction to be emitted.

But the `imul` instruction [1] doesn't modify ZF, so its value is
undefined. This patch prevents dropping the `test` instruction if
the emitted instruction is `imul`.

Sergey Kaplun:
* added the description and the test for the problem

[1]: https://www.felixcloutier.com/x86/imul

Part of tarantool/tarantool#11691
---

Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-1376-undefined-mul-test-flag
Related issues:
* https://github.com/tarantool/tarantool/issues/11691
* https://github.com/LuaJIT/LuaJIT/issues/1376

 src/lj_asm_x86.h                              |  3 +-
 .../lj-1376-undefined-mul-test-flag.test.lua  | 38 +++++++++++++++++++
 2 files changed, 40 insertions(+), 1 deletion(-)
 create mode 100644 test/tarantool-tests/lj-1376-undefined-mul-test-flag.test.lua

diff --git a/src/lj_asm_x86.h b/src/lj_asm_x86.h
index 89e83205..0e3a473c 100644
--- a/src/lj_asm_x86.h
+++ b/src/lj_asm_x86.h
@@ -2061,7 +2061,8 @@ static void asm_intarith(ASMState *as, IRIns *ir, x86Arith xa)
   RegSet allow = RSET_GPR;
   Reg dest, right;
   int32_t k = 0;
-  if (as->flagmcp == as->mcp) {  /* Drop test r,r instruction. */
+  if (as->flagmcp == as->mcp && xa != XOg_X_IMUL) {
+    /* Drop test r,r instruction. */
     MCode *p = as->mcp + ((LJ_64 && *as->mcp < XI_TESTb) ? 3 : 2);
     MCode *q = p[0] == 0x0f ? p+1 : p;
     if ((*q & 15) < 14) {
diff --git a/test/tarantool-tests/lj-1376-undefined-mul-test-flag.test.lua b/test/tarantool-tests/lj-1376-undefined-mul-test-flag.test.lua
new file mode 100644
index 00000000..f6c02a00
--- /dev/null
+++ b/test/tarantool-tests/lj-1376-undefined-mul-test-flag.test.lua
@@ -0,0 +1,38 @@
+local tap = require('tap')
+
+-- Test file to demonstrate incorrect assembling optimization
+-- for x86/x64 CPUs.
+-- See also: https://github.com/LuaJIT/LuaJIT/issues/1376.
+
+local test = tap.test('lj-1376-undefined-mul-test-flag'):skipcond({
+  ['Test requires JIT enabled'] = not jit.status(),
+})
+
+test:plan(1)
+
+local a, b = 0ULL, 0ULL
+
+jit.opt.start('hotloop=1')
+for _ = 1, 4 do
+  -- Before the patch, the `test` instruction is dropped by
+  -- assuming the `imul` instruction before it modifies the flags
+  -- register. It results in the following mcode:
+  -- | imul r15, rbp
+  -- | jnz 0x559415b10060        ->5
+  -- Instead of the following:
+  -- | imul r15, rbp
+  -- | test r15, r15
+  -- | jnz 0x559415b10060        ->5
+  -- This leads to the incorrect branch being taken.
+  if a * b ~= 0ULL then
+    test:fail('the impossible branch is taken')
+    test:done(true)
+  end
+  -- XXX: Need to update multiplier to stay in the variant part of
+  -- the loop, since invariant contains IR_NOP (former unused
+  -- IR_CNEW) between IRs, and the optimization is not applied.
+  b = b + 1
+end
+
+test:ok(true, 'no dropping of test instruction')
+test:done(true)
--------------qO9L0IohCB4uiaAo8fs2io2B--