From: Sergey Bronnikov via Tarantool-patches
Reply-To: Sergey Bronnikov
To: Sergey Kaplun
Cc: tarantool-patches@dev.tarantool.org
Date: Mon, 22 Sep 2025 17:09:14 +0300
Subject: Re: [Tarantool-patches] [PATCH luajit] ARM64: Fix assembly of HREFK (again).
Message-ID: <11d91aa2-0f0c-468b-9ac2-0ac65eae18cb@tarantool.org>
In-Reply-To: <20250918135535.22756-1-skaplun@tarantool.org>

Hi, Sergey,

thanks for the patch! LGTM

Sergey

On 9/18/25 16:55, Sergey Kaplun wrote:
> From: Mike Pall <mike>
>
> Thanks to Peter Cawley.
>
> (cherry picked from commit 93ce12ee15abf28ef4cb24ae7e4b8a5b73d75c85)
>
> When assembling the HREFK IR with a huge offset of the target node
> from the table, the offset calculation and the key load from the
> node are emitted as follows:
> |  ldr   x16, [x2, 40]
> |  add   x16, x16, x21
> |  ldr   x27, [x16, 8]
> |  cmp   x27, x17
>
> Here, `x16` is the node register, `x27` is the key register, and `x21`
> is the register containing the offset.
>
> The register that holds the constant operand of the addition may be
> chosen to be the same register that contains the node address, since
> the full `RSET_GPR` is passed to `emit_opk()`. This results in the
> following invalid mcode:
> |  ldr   x27, [x2, 40]
> |  str   x27, [sp, 8]
> |  add   x16, x16, x16
> |  ldr   x16, [sp, 8]
> |  ldr   x27, [x16, 8]
> |  cmp   x27, x17
>
> It seems that in the current implementation LuaJIT's register
> allocator always prefers the register holding the key instead, so
> invalid code is not emitted in practice. Hence, it is impossible to
> come up with a valid reproducer. However, to avoid possible
> regressions in the future, this patch fixes the invalid register set
> by excluding the node register from it.
>
> Sergey Kaplun:
> * added the description for the problem
>
> Part of tarantool/tarantool#11691
> ---
>
> Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-1026-fix-ra-hrefk
> Related issues:
> * https://github.com/tarantool/tarantool/issues/11691
> * https://github.com/LuaJIT/LuaJIT/issues/1026
>
> The issue isn't reproduced even with RANDOM_RA, so I suppose we may
> apply the patch without a test case.

>  src/lj_asm_arm64.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
> index 9b3c0467..313b4a96 100644
> --- a/src/lj_asm_arm64.h
> +++ b/src/lj_asm_arm64.h
> @@ -911,7 +911,7 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
>    emit_nm(as, A64I_CMPx, key, ra_allock(as, k, rset_exclude(allow, key)));
>    emit_lso(as, A64I_LDRx, key, idx, kofs);
>    if (bigofs)
> -    emit_opk(as, A64I_ADDx, dest, node, ofs, RSET_GPR);
> +    emit_opk(as, A64I_ADDx, dest, node, ofs, rset_exclude(RSET_GPR, node));
>  }
>
>  static void asm_uref(ASMState *as, IRIns *ir)
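
For anyone following the thread, the effect of narrowing the allow set in
the hunk above can be sketched with a small standalone model. This is only
an illustration under stated assumptions: `MODEL_GPR`, `pick_reg()` and
`operand_clashes()` are hypothetical names invented for the sketch (they are
not LuaJIT's `RSET_GPR` or its register allocator), and the `rset_*` macros
are simplified bitmask helpers written here, not copied from the LuaJIT
sources. The picker is deliberately adversarial: it returns the node
register whenever the allow set permits it.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t Reg;     /* register id; the model uses x0..x30 */
typedef uint64_t RegSet;  /* one bit per register */

#define REG_NONE            ((Reg)0xff)
#define RID2RSET(r)         ((RegSet)1 << (r))
#define rset_test(rs, r)    ((int)(((rs) >> (r)) & 1))
#define rset_exclude(rs, r) ((rs) & ~RID2RSET(r))

/* Assumption: a flat model set covering x0..x30. The real RSET_GPR
 * is defined by LuaJIT and differs from this. */
#define MODEL_GPR           (((RegSet)1 << 31) - 1)

/* Hypothetical worst-case picker: takes `prefer` if the allow set
 * permits it, otherwise the lowest-numbered allowed register. */
static Reg pick_reg(RegSet allow, Reg prefer)
{
  Reg r;
  if (rset_test(allow, prefer))
    return prefer;
  for (r = 0; r < 64; r++)
    if (rset_test(allow, r))
      return r;
  return REG_NONE;  /* empty allow set */
}

/* Returns 1 if the register picked for the constant operand can
 * coincide with the node register, 0 otherwise. */
static int operand_clashes(RegSet allow, Reg node)
{
  return pick_reg(allow, node) == node;
}
```

With `node = 16` (x16 in the quoted listing), `operand_clashes(MODEL_GPR, 16)`
is 1, while `operand_clashes(rset_exclude(MODEL_GPR, 16), 16)` is 0: once the
node register is removed from the set, no picker preference can select it.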