From mboxrd@z Thu Jan 1 00:00:00 1970
From: Maxim Kokryashkin via Tarantool-patches
Reply-To: Maxim Kokryashkin
To: tarantool-patches@dev.tarantool.org, imun@tarantool.org, skaplun@tarantool.org
Date: Fri, 19 Aug 2022 02:28:23 +0300
Message-Id: <20220818232823.96280-1-m.kokryashkin@tarantool.org>
X-Mailer: git-send-email 2.36.1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [Tarantool-patches] [PATCH luajit v2] Cleanup math function compilation and fix inconsistencies.
List-Id: Tarantool development patches

This patch switches `math.log10`, `math.exp`, `math.sin`, `math.cos`,
`math.tan`, `math.asin`, `math.acos`, `math.atan`, `math.sinh`,
`math.cosh` and `math.tanh` from the `math_unary`, `math_htrig` and
`math_atrig` recorders to a single `math_call` recorder. As a result,
all of them, along with `math.atan2`, are emitted into the IR as
`CALLN`; previously some of them used `FPMATH` or `ATAN2`. The `ATAN2`
instruction itself is removed, together with its fold optimization.
The patch also adds new fold optimizations for `CALLN`.

Part of tarantool/tarantool#7230
---

As agreed offline, this patch comes without tests: they would be
extremely complex and bring no real benefit.

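To illustrate the kind of `CALLN` fold this patch adds, here is a minimal standalone sketch, not the LuaJIT code itself. It assumes a simplified call-info table: every `Demo*`/`demo_*` name is hypothetical, and the two small functions stand in for libm routines so that the sketch builds without linking libm. The idea is the same as in `kfold_fpcall1`: if the argument is a constant and the table says the call returns a number, call the function at compile time and use the result as a constant; otherwise decline the fold.

```c
#include <stddef.h>

/* Hypothetical call ids; LuaJIT's real ones are the IRCALL_* enum. */
typedef enum {
  DEMO_CALL_SQUARE,
  DEMO_CALL_NEGATE,
  DEMO_CALL_OPAQUE,   /* A call that must never be folded. */
  DEMO_CALL__MAX
} DemoCallId;

typedef double (*DemoUnaryFn)(double);

/* Hypothetical, simplified analogue of CCallInfo. */
typedef struct DemoCallInfo {
  DemoUnaryFn func;   /* Target function, or NULL if not double -> double. */
  int returns_num;    /* Non-zero if the result is a floating-point number. */
} DemoCallInfo;

/* Stand-ins for libm routines such as sin() or log10(). */
static double demo_square(double x) { return x * x; }
static double demo_negate(double x) { return -x; }

/* Table indexed by DemoCallId, in enum order. */
static const DemoCallInfo demo_callinfo[DEMO_CALL__MAX] = {
  { demo_square, 1 },
  { demo_negate, 1 },
  { NULL, 0 }
};

/* Try to fold a one-argument call.
** Returns 1 and stores the folded constant in *out on success.
** Returns 0 to signal "no fold" (the analogue of NEXTFOLD).
*/
int demo_fold_call1(DemoCallId id, int arg_is_const, double arg, double *out)
{
  const DemoCallInfo *ci = &demo_callinfo[id];
  if (!arg_is_const || !ci->returns_num || ci->func == NULL)
    return 0;
  *out = ci->func(arg);  /* Evaluate the call at compile time. */
  return 1;
}
```

For example, `demo_fold_call1(DEMO_CALL_SQUARE, 1, 3.0, &r)` returns 1 and sets `r` to 9.0, while the same call with `arg_is_const == 0` returns 0 and leaves `r` untouched.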
Branch: https://github.com/tarantool/luajit/tree/fckxorg/gh-7230-cleanup-math-inconsistencies
PR: https://github.com/tarantool/tarantool/pull/7586

 src/lib_math.c      | 22 +++++++++++-----------
 src/lj_asm.c        |  6 ------
 src/lj_asm_arm.h    |  1 -
 src/lj_asm_arm64.h  |  1 -
 src/lj_asm_x86.h    |  2 --
 src/lj_ffrecord.c   | 19 ++-----------------
 src/lj_ir.h         |  4 +---
 src/lj_ircall.h     | 14 +++++++++-----
 src/lj_opt_fold.c   | 25 ++++++++++++++++++++++++-
 src/lj_opt_split.c  |  3 ---
 src/lj_target_x86.h |  6 ------
 src/lj_vmmath.c     |  6 ------
 12 files changed, 47 insertions(+), 62 deletions(-)

diff --git a/src/lib_math.c b/src/lib_math.c
index ef9dda2d..e9cea9ca 100644
--- a/src/lib_math.c
+++ b/src/lib_math.c
@@ -33,17 +33,17 @@ LJLIB_ASM(math_sqrt) LJLIB_REC(math_unary IRFPM_SQRT)
   lj_lib_checknum(L, 1);
   return FFH_RETRY;
 }
-LJLIB_ASM_(math_log10)  LJLIB_REC(math_unary IRFPM_LOG10)
-LJLIB_ASM_(math_exp)    LJLIB_REC(math_unary IRFPM_EXP)
-LJLIB_ASM_(math_sin)    LJLIB_REC(math_unary IRFPM_SIN)
-LJLIB_ASM_(math_cos)    LJLIB_REC(math_unary IRFPM_COS)
-LJLIB_ASM_(math_tan)    LJLIB_REC(math_unary IRFPM_TAN)
-LJLIB_ASM_(math_asin)   LJLIB_REC(math_atrig FF_math_asin)
-LJLIB_ASM_(math_acos)   LJLIB_REC(math_atrig FF_math_acos)
-LJLIB_ASM_(math_atan)   LJLIB_REC(math_atrig FF_math_atan)
-LJLIB_ASM_(math_sinh)   LJLIB_REC(math_htrig IRCALL_sinh)
-LJLIB_ASM_(math_cosh)   LJLIB_REC(math_htrig IRCALL_cosh)
-LJLIB_ASM_(math_tanh)   LJLIB_REC(math_htrig IRCALL_tanh)
+LJLIB_ASM_(math_log10)  LJLIB_REC(math_call IRCALL_log10)
+LJLIB_ASM_(math_exp)    LJLIB_REC(math_call IRCALL_exp)
+LJLIB_ASM_(math_sin)    LJLIB_REC(math_call IRCALL_sin)
+LJLIB_ASM_(math_cos)    LJLIB_REC(math_call IRCALL_cos)
+LJLIB_ASM_(math_tan)    LJLIB_REC(math_call IRCALL_tan)
+LJLIB_ASM_(math_asin)   LJLIB_REC(math_call IRCALL_asin)
+LJLIB_ASM_(math_acos)   LJLIB_REC(math_call IRCALL_acos)
+LJLIB_ASM_(math_atan)   LJLIB_REC(math_call IRCALL_atan)
+LJLIB_ASM_(math_sinh)   LJLIB_REC(math_call IRCALL_sinh)
+LJLIB_ASM_(math_cosh)   LJLIB_REC(math_call IRCALL_cosh)
+LJLIB_ASM_(math_tanh)   LJLIB_REC(math_call IRCALL_tanh)
 LJLIB_ASM_(math_frexp)
 LJLIB_ASM_(math_modf)   LJLIB_REC(.)
 
diff --git a/src/lj_asm.c b/src/lj_asm.c
index 10e5872b..1a7fb0c8 100644
--- a/src/lj_asm.c
+++ b/src/lj_asm.c
@@ -1660,7 +1660,6 @@ static void asm_ir(ASMState *as, IRIns *ir)
   case IR_DIV: asm_div(as, ir); break;
   case IR_POW: asm_pow(as, ir); break;
   case IR_ABS: asm_abs(as, ir); break;
-  case IR_ATAN2: asm_atan2(as, ir); break;
   case IR_LDEXP: asm_ldexp(as, ir); break;
   case IR_FPMATH: asm_fpmath(as, ir); break;
   case IR_TOBIT: asm_tobit(as, ir); break;
@@ -2150,11 +2149,6 @@ static void asm_setup_regsp(ASMState *as)
         as->modset = RSET_SCRATCH;
       break;
 #if !LJ_SOFTFP
-    case IR_ATAN2:
-#if LJ_TARGET_X86
-      if (as->evenspill < 4)  /* Leave room to call atan2(). */
-        as->evenspill = 4;
-#endif
 #if !LJ_TARGET_X86ORX64
     case IR_LDEXP:
 #endif
diff --git a/src/lj_asm_arm.h b/src/lj_asm_arm.h
index 4fd08b9e..8af19eb9 100644
--- a/src/lj_asm_arm.h
+++ b/src/lj_asm_arm.h
@@ -1502,7 +1502,6 @@ static void asm_mul(ASMState *as, IRIns *ir)
 #define asm_div(as, ir)         asm_fparith(as, ir, ARMI_VDIV_D)
 #define asm_pow(as, ir)         asm_callid(as, ir, IRCALL_lj_vm_powi)
 #define asm_abs(as, ir)         asm_fpunary(as, ir, ARMI_VABS_D)
-#define asm_atan2(as, ir)       asm_callid(as, ir, IRCALL_atan2)
 #define asm_ldexp(as, ir)       asm_callid(as, ir, IRCALL_ldexp)
 #endif
 
diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
index da0ee4bb..4aeb51f3 100644
--- a/src/lj_asm_arm64.h
+++ b/src/lj_asm_arm64.h
@@ -1450,7 +1450,6 @@ static void asm_pow(ASMState *as, IRIns *ir)
 #define asm_mulov(as, ir)       asm_mul(as, ir)
 
 #define asm_abs(as, ir)         asm_fpunary(as, ir, A64I_FABS)
-#define asm_atan2(as, ir)       asm_callid(as, ir, IRCALL_atan2)
 #define asm_ldexp(as, ir)       asm_callid(as, ir, IRCALL_ldexp)
 
 static void asm_mod(ASMState *as, IRIns *ir)
diff --git a/src/lj_asm_x86.h b/src/lj_asm_x86.h
index 2850aea9..8a4d4025 100644
--- a/src/lj_asm_x86.h
+++ b/src/lj_asm_x86.h
@@ -1971,8 +1971,6 @@ static void asm_fpmath(ASMState *as, IRIns *ir)
   }
 }
 
-#define asm_atan2(as, ir)       asm_callid(as, ir, IRCALL_atan2)
-
 static void asm_ldexp(ASMState *as, IRIns *ir)
 {
   int32_t ofs = sps_scale(ir->s);  /* Use spill slot or temp slots. */
diff --git a/src/lj_ffrecord.c b/src/lj_ffrecord.c
index be890a93..0c9c211d 100644
--- a/src/lj_ffrecord.c
+++ b/src/lj_ffrecord.c
@@ -563,7 +563,7 @@ static void LJ_FASTCALL recff_math_atan2(jit_State *J, RecordFFData *rd)
 {
   TRef tr = lj_ir_tonum(J, J->base[0]);
   TRef tr2 = lj_ir_tonum(J, J->base[1]);
-  J->base[0] = emitir(IRTN(IR_ATAN2), tr, tr2);
+  J->base[0] = lj_ir_call(J, IRCALL_atan2, tr, tr2);
   UNUSED(rd);
 }
 
@@ -580,22 +580,7 @@ static void LJ_FASTCALL recff_math_ldexp(jit_State *J, RecordFFData *rd)
   UNUSED(rd);
 }
 
-/* Record math.asin, math.acos, math.atan. */
-static void LJ_FASTCALL recff_math_atrig(jit_State *J, RecordFFData *rd)
-{
-  TRef y = lj_ir_tonum(J, J->base[0]);
-  TRef x = lj_ir_knum_one(J);
-  uint32_t ffid = rd->data;
-  if (ffid != FF_math_atan) {
-    TRef tmp = emitir(IRTN(IR_MUL), y, y);
-    tmp = emitir(IRTN(IR_SUB), x, tmp);
-    tmp = emitir(IRTN(IR_FPMATH), tmp, IRFPM_SQRT);
-    if (ffid == FF_math_asin) { x = tmp; } else { x = y; y = tmp; }
-  }
-  J->base[0] = emitir(IRTN(IR_ATAN2), y, x);
-}
-
-static void LJ_FASTCALL recff_math_htrig(jit_State *J, RecordFFData *rd)
+static void LJ_FASTCALL recff_math_call(jit_State *J, RecordFFData *rd)
 {
   TRef tr = lj_ir_tonum(J, J->base[0]);
   J->base[0] = emitir(IRTN(IR_CALLN), tr, rd->data);
diff --git a/src/lj_ir.h b/src/lj_ir.h
index 3059bf65..4bad47ed 100644
--- a/src/lj_ir.h
+++ b/src/lj_ir.h
@@ -75,7 +75,6 @@
   _(NEG,        N , ref, ref) \
   \
   _(ABS,        N , ref, ref) \
-  _(ATAN2,      N , ref, ref) \
   _(LDEXP,      N , ref, ref) \
   _(MIN,        C , ref, ref) \
   _(MAX,        C , ref, ref) \
@@ -178,8 +177,7 @@ LJ_STATIC_ASSERT((int)IR_XLOAD + IRDELTA_L2S == (int)IR_XSTORE);
 /* FPMATH sub-functions. ORDER FPM. */
 #define IRFPMDEF(_) \
   _(FLOOR) _(CEIL) _(TRUNC)  /* Must be first and in this order. */ \
-  _(SQRT) _(EXP) _(EXP2) _(LOG) _(LOG2) _(LOG10) \
-  _(SIN) _(COS) _(TAN) \
+  _(SQRT) _(EXP2) _(LOG) _(LOG2) \
   _(OTHER)
 
 typedef enum {
diff --git a/src/lj_ircall.h b/src/lj_ircall.h
index 973c36e6..aa06b273 100644
--- a/src/lj_ircall.h
+++ b/src/lj_ircall.h
@@ -21,6 +21,7 @@ typedef struct CCallInfo {
 
 #define CCI_OTSHIFT             16
 #define CCI_OPTYPE(ci)          ((ci)->flags >> CCI_OTSHIFT)  /* Get op/type. */
+#define CCI_TYPE(ci)            (((ci)->flags>>CCI_OTSHIFT) & IRT_TYPE)
 #define CCI_OPSHIFT             24
 #define CCI_OP(ci)              ((ci)->flags >> CCI_OPSHIFT)  /* Get op. */
 
@@ -158,6 +159,14 @@ typedef struct CCallInfo {
   _(ANY,        lj_mem_newgco,          2,  FS, PGC, CCI_L) \
   _(ANY,        lj_math_random_step,    1,  FS, NUM, CCI_CASTU64) \
   _(ANY,        lj_vm_modi,             2,  FN, INT, 0) \
+  _(ANY,        log10,                  1,   N, NUM, XA_FP) \
+  _(ANY,        exp,                    1,   N, NUM, XA_FP) \
+  _(ANY,        sin,                    1,   N, NUM, XA_FP) \
+  _(ANY,        cos,                    1,   N, NUM, XA_FP) \
+  _(ANY,        tan,                    1,   N, NUM, XA_FP) \
+  _(ANY,        asin,                   1,   N, NUM, XA_FP) \
+  _(ANY,        acos,                   1,   N, NUM, XA_FP) \
+  _(ANY,        atan,                   1,   N, NUM, XA_FP) \
   _(ANY,        sinh,                   1,   N, NUM, XA_FP) \
   _(ANY,        cosh,                   1,   N, NUM, XA_FP) \
   _(ANY,        tanh,                   1,   N, NUM, XA_FP) \
@@ -169,14 +178,9 @@ typedef struct CCallInfo {
   _(FPMATH,     lj_vm_ceil,             1,   N, NUM, XA_FP) \
   _(FPMATH,     lj_vm_trunc,            1,   N, NUM, XA_FP) \
   _(FPMATH,     sqrt,                   1,   N, NUM, XA_FP) \
-  _(ANY,        exp,                    1,   N, NUM, XA_FP) \
   _(ANY,        lj_vm_exp2,             1,   N, NUM, XA_FP) \
   _(ANY,        log,                    1,   N, NUM, XA_FP) \
   _(ANY,        lj_vm_log2,             1,   N, NUM, XA_FP) \
-  _(ANY,        log10,                  1,   N, NUM, XA_FP) \
-  _(ANY,        sin,                    1,   N, NUM, XA_FP) \
-  _(ANY,        cos,                    1,   N, NUM, XA_FP) \
-  _(ANY,        tan,                    1,   N, NUM, XA_FP) \
   _(ANY,        lj_vm_powi,             2,   N, NUM, XA_FP) \
   _(ANY,        pow,                    2,   N, NUM, XA2_FP) \
   _(ANY,        atan2,                  2,   N, NUM, XA2_FP) \
diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
index 276dc040..49f74996 100644
--- a/src/lj_opt_fold.c
+++ b/src/lj_opt_fold.c
@@ -173,7 +173,6 @@ LJFOLD(ADD KNUM KNUM)
 LJFOLD(SUB KNUM KNUM)
 LJFOLD(MUL KNUM KNUM)
 LJFOLD(DIV KNUM KNUM)
-LJFOLD(ATAN2 KNUM KNUM)
 LJFOLD(LDEXP KNUM KNUM)
 LJFOLD(MIN KNUM KNUM)
 LJFOLD(MAX KNUM KNUM)
@@ -213,6 +212,30 @@ LJFOLDF(kfold_fpmath)
   return lj_ir_knum(J, y);
 }
 
+LJFOLD(CALLN KNUM any)
+LJFOLDF(kfold_fpcall1)
+{
+  const CCallInfo *ci = &lj_ir_callinfo[fins->op2];
+  if (CCI_TYPE(ci) == IRT_NUM) {
+    double y = ((double (*)(double))ci->func)(knumleft);
+    return lj_ir_knum(J, y);
+  }
+  return NEXTFOLD;
+}
+
+LJFOLD(CALLN CARG IRCALL_atan2)
+LJFOLDF(kfold_fpcall2)
+{
+  if (irref_isk(fleft->op1) && irref_isk(fleft->op2)) {
+    const CCallInfo *ci = &lj_ir_callinfo[fins->op2];
+    double a = ir_knum(IR(fleft->op1))->n;
+    double b = ir_knum(IR(fleft->op2))->n;
+    double y = ((double (*)(double, double))ci->func)(a, b);
+    return lj_ir_knum(J, y);
+  }
+  return NEXTFOLD;
+}
+
 LJFOLD(POW KNUM KINT)
 LJFOLDF(kfold_numpow)
 {
diff --git a/src/lj_opt_split.c b/src/lj_opt_split.c
index fc935204..c0788106 100644
--- a/src/lj_opt_split.c
+++ b/src/lj_opt_split.c
@@ -426,9 +426,6 @@ static void split_ir(jit_State *J)
         }
         hi = split_call_l(J, hisubst, oir, ir, IRCALL_lj_vm_floor + ir->op2);
         break;
-      case IR_ATAN2:
-        hi = split_call_ll(J, hisubst, oir, ir, IRCALL_atan2);
-        break;
       case IR_LDEXP:
         hi = split_call_li(J, hisubst, oir, ir, IRCALL_ldexp);
         break;
diff --git a/src/lj_target_x86.h b/src/lj_target_x86.h
index 356f7924..194f8e70 100644
--- a/src/lj_target_x86.h
+++ b/src/lj_target_x86.h
@@ -228,16 +228,10 @@ typedef enum {
   /* Note: little-endian byte-order! */
   XI_FLDZ =     0xeed9,
   XI_FLD1 =     0xe8d9,
-  XI_FLDLG2 =   0xecd9,
-  XI_FLDLN2 =   0xedd9,
   XI_FDUP =     0xc0d9,         /* Really fld st0. */
   XI_FPOP =     0xd8dd,         /* Really fstp st0. */
   XI_FPOP1 =    0xd9dd,         /* Really fstp st1. */
   XI_FRNDINT =  0xfcd9,
-  XI_FSIN =     0xfed9,
-  XI_FCOS =     0xffd9,
-  XI_FPTAN =    0xf2d9,
-  XI_FPATAN =   0xf3d9,
   XI_FSCALE =   0xfdd9,
   XI_FYL2X =    0xf1d9,
 
diff --git a/src/lj_vmmath.c b/src/lj_vmmath.c
index b231d3e8..c04459bd 100644
--- a/src/lj_vmmath.c
+++ b/src/lj_vmmath.c
@@ -48,7 +48,6 @@ double lj_vm_foldarith(double x, double y, int op)
   case IR_NEG - IR_ADD: return -x; break;
   case IR_ABS - IR_ADD: return fabs(x); break;
 #if LJ_HASJIT
-  case IR_ATAN2 - IR_ADD: return atan2(x, y); break;
   case IR_LDEXP - IR_ADD: return ldexp(x, (int)y); break;
   case IR_MIN - IR_ADD: return x > y ? y : x; break;
   case IR_MAX - IR_ADD: return x < y ? y : x; break;
@@ -129,14 +128,9 @@ double lj_vm_foldfpm(double x, int fpm)
   case IRFPM_CEIL: return lj_vm_ceil(x);
   case IRFPM_TRUNC: return lj_vm_trunc(x);
   case IRFPM_SQRT: return sqrt(x);
-  case IRFPM_EXP: return exp(x);
   case IRFPM_EXP2: return lj_vm_exp2(x);
   case IRFPM_LOG: return log(x);
   case IRFPM_LOG2: return lj_vm_log2(x);
-  case IRFPM_LOG10: return log10(x);
-  case IRFPM_SIN: return sin(x);
-  case IRFPM_COS: return cos(x);
-  case IRFPM_TAN: return tan(x);
   default: lua_assert(0);
   }
   return 0;
-- 
2.36.1
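
Several hunks above shrink list macros such as `IRFPMDEF(_)`, which carries the comment "ORDER FPM". The standalone sketch below shows the general X-macro technique such lists rely on; it is an illustration only, and every `DEMO_*`/`demo_*` name is hypothetical. One list is expanded twice, once into an enum and once into a name table, so dropping an entry from the list renumbers everything after it and keeps the two expansions in sync.

```c
#include <stddef.h>

/* Hypothetical list in the style of IRFPMDEF, mirroring its
** post-patch contents. The order of entries defines the enum values.
*/
#define DEMO_FPMDEF(_) \
  _(FLOOR) _(CEIL) _(TRUNC) \
  _(SQRT) _(EXP2) _(LOG) _(LOG2) \
  _(OTHER)

/* First expansion: one enum constant per list entry. */
typedef enum {
#define DEMO_FPMENUM(name)  DEMO_FPM_##name,
DEMO_FPMDEF(DEMO_FPMENUM)
#undef DEMO_FPMENUM
  DEMO_FPM__MAX
} DemoFPMathOp;

/* Second expansion: one string per list entry, in the same order. */
static const char *const demo_fpm_names[] = {
#define DEMO_FPMNAME(name)  #name,
DEMO_FPMDEF(DEMO_FPMNAME)
#undef DEMO_FPMNAME
  NULL
};

/* Map an enum value back to its name, or "?" if it is out of range. */
const char *demo_fpm_name(DemoFPMathOp op)
{
  return ((unsigned int)op < (unsigned int)DEMO_FPM__MAX) ?
         demo_fpm_names[op] : "?";
}
```

With this list, `DEMO_FPM_SQRT` is 3 and `DEMO_FPM_EXP2` is 4. Any code that depends on the numeric order, such as a table indexed by these values, has to be kept consistent with the list, which is presumably why the patch touches `lj_ir.h`, `lj_ircall.h` and `lj_vmmath.c` together.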