From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from [87.239.111.99] (localhost [127.0.0.1])
 by dev.tarantool.org (Postfix) with ESMTP id A45A214D4C0D;
 Thu, 24 Jul 2025 12:04:35 +0300 (MSK)
DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org A45A214D4C0D
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev;
 t=1753347875; bh=oWM5vgqqLDAzJaadig/qhNnIhQmH1pfUzivcvJcqyD0=;
 h=To:Date:In-Reply-To:References:Subject:List-Id:List-Unsubscribe:
 List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc:
 From;
 b=C/QXcIBhpEqc3cEncCIBYl+aTc0EQnCbd7QaCtg2fuZqkJDoxqxSYNQjArupvFzzx
 Bvj6t920/5JYhT39qO1ZrMIYGbP6uz7eCUiXmrhBJjqz2DQi7rYcj08I2vcdQ1R9XN
 9UF430B1udP7e/CpPYVVbX3NI1sdzElPxsI9PkTM=
Received: from send243.i.mail.ru (send243.i.mail.ru [95.163.59.82])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by dev.tarantool.org (Postfix) with ESMTPS id D3CB014D4C0A
 for ; Thu, 24 Jul 2025 12:03:35 +0300 (MSK)
DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org D3CB014D4C0A
Received: by exim-smtp-6db95c9866-xvr86 with esmtpa (envelope-from )
 id 1uerrS-000000000SL-3LLD; Thu, 24 Jul 2025 12:03:35 +0300
To: Sergey Bronnikov
Date: Thu, 24 Jul 2025 12:03:59 +0300
Message-ID: <0183aa1f346bf87d8e626274323c87e2291e75bf.1753344905.git.skaplun@tarantool.org>
X-Mailer: git-send-email 2.50.0
In-Reply-To:
References:
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [Tarantool-patches] [PATCH luajit 2/3] ARM64: More improvements to the generation of immediates.
X-BeenThere: tarantool-patches@dev.tarantool.org
X-Mailman-Version: 2.1.34
Precedence: list
List-Id: Tarantool development patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
From: Sergey Kaplun via Tarantool-patches
Reply-To: Sergey Kaplun
Cc: tarantool-patches@dev.tarantool.org
Errors-To: tarantool-patches-bounces@dev.tarantool.org
Sender: "Tarantool-patches"

From: Mike Pall

(cherry picked from commit 69138082a3166105faa8cbb25fadb1e4298686c0)

This patch refactors the emission of immediates for the arm64
architecture. The main changes are the following:

* Use `emit_getgl()` and `emit_setgl()` instead of `emit_lso()` where
  possible, since this makes the code cleaner.

* `RID_GL` is allocated for `g` at the start of trace emission. This
  register is also considered as a candidate base for the N-step offset
  in `emit_kdelta()`.

* The address of `tmptv` is no longer rematerialized into a register
  from a constant. Instead, it is calculated by adding the corresponding
  offset to `RID_GL`.

Sergey Kaplun:
* added the description for the patch

Part of tarantool/tarantool#11691
---
 src/lj_asm.c        |  3 +++
 src/lj_asm_arm64.h  | 23 ++++++++---------------
 src/lj_emit_arm64.h |  2 +-
 3 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/src/lj_asm.c b/src/lj_asm.c
index 9e81dbc9..f163b2e3 100644
--- a/src/lj_asm.c
+++ b/src/lj_asm.c
@@ -2113,6 +2113,9 @@ static void asm_setup_regsp(ASMState *as)
 #endif
 
   ra_setup(as);
+#if LJ_TARGET_ARM64
+  ra_setkref(as, RID_GL, (intptr_t)J2G(as->J));
+#endif
 
   /* Clear reg/sp for constants. */
   for (ir = IR(T->nk), lastir = IR(REF_BASE); ir < lastir; ir++) {
diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
index a7f059a2..5a6c60b7 100644
--- a/src/lj_asm_arm64.h
+++ b/src/lj_asm_arm64.h
@@ -690,7 +690,7 @@ static void asm_tvptr(ASMState *as, Reg dest, IRRef ref)
   } else {
     /* Otherwise use g->tmptv to hold the TValue. */
     asm_tvstore64(as, dest, 0, ref);
-    ra_allockreg(as, i64ptr(&J2G(as->J)->tmptv), dest);
+    emit_dn(as, A64I_ADDx^emit_isk12(glofs(as, &J2G(as->J)->tmptv)), dest, RID_GL);
   }
 }
 
@@ -1269,17 +1269,13 @@ static void asm_tbar(ASMState *as, IRIns *ir)
 {
   Reg tab = ra_alloc1(as, ir->op1, RSET_GPR);
   Reg link = ra_scratch(as, rset_exclude(RSET_GPR, tab));
-  Reg gr = ra_allock(as, i64ptr(J2G(as->J)),
-                     rset_exclude(rset_exclude(RSET_GPR, tab), link));
   Reg mark = RID_TMP;
   MCLabel l_end = emit_label(as);
   emit_lso(as, A64I_STRx, link, tab, (int32_t)offsetof(GCtab, gclist));
   emit_lso(as, A64I_STRB, mark, tab, (int32_t)offsetof(GCtab, marked));
-  emit_lso(as, A64I_STRx, tab, gr,
-           (int32_t)offsetof(global_State, gc.grayagain));
+  emit_setgl(as, tab, gc.grayagain);
   emit_dn(as, A64I_ANDw^emit_isk13(~LJ_GC_BLACK, 0), mark, mark);
-  emit_lso(as, A64I_LDRx, link, gr,
-           (int32_t)offsetof(global_State, gc.grayagain));
+  emit_getgl(as, link, gc.grayagain);
   emit_cond_branch(as, CC_EQ, l_end);
   emit_n(as, A64I_TSTw^emit_isk13(LJ_GC_BLACK, 0), mark);
   emit_lso(as, A64I_LDRB, mark, tab, (int32_t)offsetof(GCtab, marked));
@@ -1299,7 +1295,7 @@ static void asm_obar(ASMState *as, IRIns *ir)
   args[0] = ASMREF_TMP1;  /* global_State *g */
   args[1] = ir->op1;  /* TValue *tv */
   asm_gencall(as, ci, args);
-  ra_allockreg(as, i64ptr(J2G(as->J)), ra_releasetmp(as, ASMREF_TMP1) );
+  emit_dm(as, A64I_MOVx, ra_releasetmp(as, ASMREF_TMP1), RID_GL);
   obj = IR(ir->op1)->r;
   tmp = ra_scratch(as, rset_exclude(allow, obj));
   emit_cond_branch(as, CC_EQ, l_end);
@@ -1808,7 +1804,7 @@ static void asm_gc_check(ASMState *as)
   const CCallInfo *ci = &lj_ir_callinfo[IRCALL_lj_gc_step_jit];
   IRRef args[2];
   MCLabel l_end;
-  Reg tmp1, tmp2;
+  Reg tmp2;
   ra_evictset(as, RSET_SCRATCH);
   l_end = emit_label(as);
   /* Exit trace if in GCSatomic or GCSfinalize. Avoids syncing GC objects. */
@@ -1816,17 +1812,14 @@ static void asm_gc_check(ASMState *as)
   args[0] = ASMREF_TMP1;  /* global_State *g */
   args[1] = ASMREF_TMP2;  /* MSize steps */
   asm_gencall(as, ci, args);
-  tmp1 = ra_releasetmp(as, ASMREF_TMP1);
+  emit_dm(as, A64I_MOVx, ra_releasetmp(as, ASMREF_TMP1), RID_GL);
   tmp2 = ra_releasetmp(as, ASMREF_TMP2);
   emit_loadi(as, tmp2, as->gcsteps);
   /* Jump around GC step if GC total < GC threshold. */
   emit_cond_branch(as, CC_LS, l_end);
   emit_nm(as, A64I_CMPx, RID_TMP, tmp2);
-  emit_lso(as, A64I_LDRx, tmp2, tmp1,
-           (int32_t)offsetof(global_State, gc.threshold));
-  emit_lso(as, A64I_LDRx, RID_TMP, tmp1,
-           (int32_t)offsetof(global_State, gc.total));
-  ra_allockreg(as, i64ptr(J2G(as->J)), tmp1);
+  emit_getgl(as, tmp2, gc.threshold);
+  emit_getgl(as, RID_TMP, gc.total);
   as->gcsteps = 0;
   checkmclim(as);
 }
diff --git a/src/lj_emit_arm64.h b/src/lj_emit_arm64.h
index 2bb93dd9..184a05ca 100644
--- a/src/lj_emit_arm64.h
+++ b/src/lj_emit_arm64.h
@@ -163,7 +163,7 @@ nopair:
 /* Try to find an N-step delta relative to other consts with N < lim. */
 static int emit_kdelta(ASMState *as, Reg rd, uint64_t k, int lim)
 {
-  RegSet work = ~as->freeset & RSET_GPR;
+  RegSet work = (~as->freeset & RSET_GPR) | RID2RSET(RID_GL);
   if (lim <= 1) return 0;  /* Can't beat that. */
   while (work) {
     Reg r = rset_picktop(work);
-- 
2.50.0
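
Note on the `tmptv` change: the following is a minimal standalone sketch, not code from the patch, of why the address can be formed with a single `ADD` from a base register that already holds `g`. It assumes the general AArch64 rule that an ADD/SUB immediate is a 12-bit unsigned value, optionally shifted left by 12 bits. `MockGlobalState`, `fits_addsub_imm12()` and `field_addr()` are hypothetical names used only for illustration; the struct layout does not reflect the real `global_State`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for global_State; the layout is illustrative only. */
typedef struct MockGlobalState {
  uint64_t strmask;
  uint64_t gc_total;
  uint64_t gc_threshold;
  uint64_t tmptv;
} MockGlobalState;

/* Return 1 if n can be applied by one AArch64 ADD/SUB immediate:
** |n| must be a 12-bit value, optionally shifted left by 12 bits.
** A negative n would be encoded as SUB instead of ADD. */
static inline int fits_addsub_imm12(int64_t n)
{
  uint64_t k = n < 0 ? ~(uint64_t)n + 1u : (uint64_t)n;
  return k < 0x1000u || (k & 0xfff000u) == k;
}

/* Address of a field computed as base pointer plus a constant offset,
** which is what a single "ADD dest, base, #ofs" would produce. */
static inline uintptr_t field_addr(const MockGlobalState *g, size_t ofs)
{
  return (uintptr_t)g + ofs;
}
```

For this mock layout, `fits_addsub_imm12((int64_t)offsetof(MockGlobalState, tmptv))` holds, so `field_addr(g, offsetof(MockGlobalState, tmptv))` corresponds to one instruction relative to the base register rather than a multi-instruction constant load.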