[Tarantool-patches] [PATCH luajit v6] Fix math.min()/math.max() inconsistencies.
Maxim Kokryashkin
max.kokryashkin at gmail.com
Mon Dec 19 12:52:26 MSK 2022
From: Mike Pall <mike>
(cherry-picked from commit 03208c8162af9cc01ca76ee1676ca79e5abe9b60)
`math.min()`/`math.max()` could produce inconsistent results.
Previously, dirty values on the Lua stack could be
treated as arguments to `math.min()`/`math.max()`.
This patch adds a check for the number of arguments passed to
`math.min()`/`math.max()`, which fixes the issue.
It also adds a test case for the issue and does some
refactoring:
1. The fcc for the min/max functions in the ARM
assembly is changed from LO/HI (lower/upper or unordered) to LE/PL
(lower/upper, equal or unordered).
2. Several fold optimizations for min/max are removed
or modified.
Resolves tarantool/tarantool#6163
---
>IMHO, LO -> LT (N!=V, less than or unordered) does the same thing w/o
>changing the order.
>
>IINM, an ordered comparison checks if neither operand is NaN.
>Conversely, an unordered comparison checks if either operand is a NaN.
>
>So, looks like an attempt to fix inconsistent behaviour for NaNs in
>math.min/math.max on aarch64.
>
>Also, I found inconsistent behaviour on x86 (between LuaJIT|Lua):
>
>| # on upstream build
>| $ ./luajit -Ohotloop=1 -e 'local res = {} for i = 1,4 do res[i] = math.max(0/0, math.huge) end for i = 1, #res do print(res[i]) end'
>| inf
>| inf
>| inf
>| inf
>| $ lua -e 'local res = {} for i = 1,4 do res[i] = math.max(0/0, math.huge) end for i = 1, #res do print(res[i]) end'
>| -nan
>| -nan
>| -nan
>| -nan
>
>Can you please test some similar examples on aarch64/M1?
I've tested those on M1 and here are the results:
$ ./src/luajit -Ohotloop=1 -e 'local res = {} for i=1,4 do res[i]=math.max(0/0,math.huge) end for i =1, #res do print(res[i]) end'
inf
inf
inf
inf
$ lua -e 'local res = {} for i=1,4 do res[i]=math.max(0/0,math.huge) end for i =1, #res do print(res[i]) end'
nan
nan
nan
nan
>Also, AFAICS, some optimizations are the reason for the inconsistent
>behaviour of JIT-ed code (not the fold in this commit).
>| # on our fork
>| ./luajit -O0 -Ohotloop=1 -e 'local res = {} for i = 1,4 do res[i] = math.max(0/0, math.huge) end for i = 1, #res do print(res[i]) end'
>| inf
>| inf
>| inf
>| inf
>| # on our fork
>| ./luajit -Ohotloop=1 -e 'local res = {} for i = 1,4 do res[i] = math.max(0/0, math.huge) end for i = 1, #res do print(res[i]) end'
>| inf
>| inf
>| inf
>| nan
>
>BTW this commit doesn't fix the problem. Can you please bisect the
>commit to backport?
Works perfectly fine on our fork after this patch.
>> diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
>> index 276dc040..07a52a4d 100644
>> --- a/src/lj_opt_fold.c
>> +++ b/src/lj_opt_fold.c
>> @@ -1774,8 +1774,6 @@ LJFOLDF(reassoc_intarith_k64)
>> #endif
>> }
>>
>> -LJFOLD(MIN MIN any)
>> -LJFOLD(MAX MAX any)
>> LJFOLD(BAND BAND any)
>> LJFOLD(BOR BOR any)
>> LJFOLDF(reassoc_dup)
>> @@ -1785,6 +1783,15 @@ LJFOLDF(reassoc_dup)
>> return NEXTFOLD;
>> }
>>
>> +LJFOLD(MIN MIN any)
>> +LJFOLD(MAX MAX any)
>> +LJFOLDF(reassoc_dup_minmax)
>> +{
>> + if (fins->op2 == fleft->op2)
>> + return LEFTFOLD; /* (a o b) o b ==> a o b */
>> + return NEXTFOLD;
>> +}
>> +
>
>Do you know why the opt `(a o b) o a ==> a o b;` is omitted now?
>Are there any examples of incorrect behaviour? I suggest checking NaN
>behaviour in this case.
Added the test case for that one, but I failed to find any for the
other.
I've done test runs on all of the combinations of `{1, -1, 0, -0, 0/0,
-math.huge, math.huge}` for all of the optimizations.
I'll be glad to add another test case if you can think of any.
src/lj_asm_arm.h | 6 +--
src/lj_asm_arm64.h | 6 +--
src/lj_opt_fold.c | 53 +++++++------------
src/lj_vmmath.c | 4 +-
src/vm_arm.dasc | 4 +-
src/vm_arm64.dasc | 4 +-
src/vm_x64.dasc | 2 +-
src/vm_x86.dasc | 2 +-
test/tarantool-tests/gh-6163-min-max.test.lua | 48 +++++++++++++++++
9 files changed, 81 insertions(+), 48 deletions(-)
create mode 100644 test/tarantool-tests/gh-6163-min-max.test.lua
diff --git a/src/lj_asm_arm.h b/src/lj_asm_arm.h
index 8af19eb9..6ae6e2f2 100644
--- a/src/lj_asm_arm.h
+++ b/src/lj_asm_arm.h
@@ -1663,8 +1663,8 @@ static void asm_min_max(ASMState *as, IRIns *ir, int cc, int fcc)
asm_intmin_max(as, ir, cc);
}
-#define asm_min(as, ir) asm_min_max(as, ir, CC_GT, CC_HI)
-#define asm_max(as, ir) asm_min_max(as, ir, CC_LT, CC_LO)
+#define asm_min(as, ir) asm_min_max(as, ir, CC_GT, CC_PL)
+#define asm_max(as, ir) asm_min_max(as, ir, CC_LT, CC_LE)
/* -- Comparisons --------------------------------------------------------- */
@@ -1856,7 +1856,7 @@ static void asm_hiop(ASMState *as, IRIns *ir)
} else if ((ir-1)->o == IR_MIN || (ir-1)->o == IR_MAX) {
as->curins--; /* Always skip the loword min/max. */
if (uselo || usehi)
- asm_sfpmin_max(as, ir-1, (ir-1)->o == IR_MIN ? CC_HI : CC_LO);
+ asm_sfpmin_max(as, ir-1, (ir-1)->o == IR_MIN ? CC_PL : CC_LE);
return;
#elif LJ_HASFFI
} else if ((ir-1)->o == IR_CONV) {
diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
index 4aeb51f3..fe197700 100644
--- a/src/lj_asm_arm64.h
+++ b/src/lj_asm_arm64.h
@@ -1592,7 +1592,7 @@ static void asm_fpmin_max(ASMState *as, IRIns *ir, A64CC fcc)
Reg dest = (ra_dest(as, ir, RSET_FPR) & 31);
Reg right, left = ra_alloc2(as, ir, RSET_FPR);
right = ((left >> 8) & 31); left &= 31;
- emit_dnm(as, A64I_FCSELd | A64F_CC(fcc), dest, left, right);
+ emit_dnm(as, A64I_FCSELd | A64F_CC(fcc), dest, right, left);
emit_nm(as, A64I_FCMPd, left, right);
}
@@ -1604,8 +1604,8 @@ static void asm_min_max(ASMState *as, IRIns *ir, A64CC cc, A64CC fcc)
asm_intmin_max(as, ir, cc);
}
-#define asm_max(as, ir) asm_min_max(as, ir, CC_GT, CC_HI)
-#define asm_min(as, ir) asm_min_max(as, ir, CC_LT, CC_LO)
+#define asm_min(as, ir) asm_min_max(as, ir, CC_LT, CC_PL)
+#define asm_max(as, ir) asm_min_max(as, ir, CC_GT, CC_LE)
/* -- Comparisons --------------------------------------------------------- */
diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
index 49f74996..27e489af 100644
--- a/src/lj_opt_fold.c
+++ b/src/lj_opt_fold.c
@@ -1797,8 +1797,6 @@ LJFOLDF(reassoc_intarith_k64)
#endif
}
-LJFOLD(MIN MIN any)
-LJFOLD(MAX MAX any)
LJFOLD(BAND BAND any)
LJFOLD(BOR BOR any)
LJFOLDF(reassoc_dup)
@@ -1808,6 +1806,15 @@ LJFOLDF(reassoc_dup)
return NEXTFOLD;
}
+LJFOLD(MIN MIN any)
+LJFOLD(MAX MAX any)
+LJFOLDF(reassoc_dup_minmax)
+{
+ if (fins->op2 == fleft->op2)
+ return LEFTFOLD; /* (a o b) o b ==> a o b */
+ return NEXTFOLD;
+}
+
LJFOLD(BXOR BXOR any)
LJFOLDF(reassoc_bxor)
{
@@ -1846,23 +1853,12 @@ LJFOLDF(reassoc_shift)
return NEXTFOLD;
}
-LJFOLD(MIN MIN KNUM)
-LJFOLD(MAX MAX KNUM)
LJFOLD(MIN MIN KINT)
LJFOLD(MAX MAX KINT)
LJFOLDF(reassoc_minmax_k)
{
IRIns *irk = IR(fleft->op2);
- if (irk->o == IR_KNUM) {
- lua_Number a = ir_knum(irk)->n;
- lua_Number y = lj_vm_foldarith(a, knumright, fins->o - IR_ADD);
- if (a == y) /* (x o k1) o k2 ==> x o k1, if (k1 o k2) == k1. */
- return LEFTFOLD;
- PHIBARRIER(fleft);
- fins->op1 = fleft->op1;
- fins->op2 = (IRRef1)lj_ir_knum(J, y);
- return RETRYFOLD; /* (x o k1) o k2 ==> x o (k1 o k2) */
- } else if (irk->o == IR_KINT) {
+ if (irk->o == IR_KINT) {
int32_t a = irk->i;
int32_t y = kfold_intop(a, fright->i, fins->o);
if (a == y) /* (x o k1) o k2 ==> x o k1, if (k1 o k2) == k1. */
@@ -1875,24 +1871,6 @@ LJFOLDF(reassoc_minmax_k)
return NEXTFOLD;
}
-LJFOLD(MIN MAX any)
-LJFOLD(MAX MIN any)
-LJFOLDF(reassoc_minmax_left)
-{
- if (fins->op2 == fleft->op1 || fins->op2 == fleft->op2)
- return RIGHTFOLD; /* (b o1 a) o2 b ==> b; (a o1 b) o2 b ==> b */
- return NEXTFOLD;
-}
-
-LJFOLD(MIN any MAX)
-LJFOLD(MAX any MIN)
-LJFOLDF(reassoc_minmax_right)
-{
- if (fins->op1 == fright->op1 || fins->op1 == fright->op2)
- return LEFTFOLD; /* a o2 (a o1 b) ==> a; a o2 (b o1 a) ==> a */
- return NEXTFOLD;
-}
-
/* -- Array bounds check elimination -------------------------------------- */
/* Eliminate ABC across PHIs to handle t[i-1] forwarding case.
@@ -2018,8 +1996,6 @@ LJFOLDF(comm_comp)
LJFOLD(BAND any any)
LJFOLD(BOR any any)
-LJFOLD(MIN any any)
-LJFOLD(MAX any any)
LJFOLDF(comm_dup)
{
if (fins->op1 == fins->op2) /* x o x ==> x */
@@ -2027,6 +2003,15 @@ LJFOLDF(comm_dup)
return fold_comm_swap(J);
}
+LJFOLD(MIN any any)
+LJFOLD(MAX any any)
+LJFOLDF(comm_dup_minmax)
+{
+ if (fins->op1 == fins->op2) /* x o x ==> x */
+ return LEFTFOLD;
+ return NEXTFOLD;
+}
+
LJFOLD(BXOR any any)
LJFOLDF(comm_bxor)
{
diff --git a/src/lj_vmmath.c b/src/lj_vmmath.c
index c04459bd..ae4e0f15 100644
--- a/src/lj_vmmath.c
+++ b/src/lj_vmmath.c
@@ -49,8 +49,8 @@ double lj_vm_foldarith(double x, double y, int op)
case IR_ABS - IR_ADD: return fabs(x); break;
#if LJ_HASJIT
case IR_LDEXP - IR_ADD: return ldexp(x, (int)y); break;
- case IR_MIN - IR_ADD: return x > y ? y : x; break;
- case IR_MAX - IR_ADD: return x < y ? y : x; break;
+ case IR_MIN - IR_ADD: return x < y ? x : y; break;
+ case IR_MAX - IR_ADD: return x > y ? x : y; break;
#endif
default: return x;
}
diff --git a/src/vm_arm.dasc b/src/vm_arm.dasc
index a29292f1..89faa03e 100644
--- a/src/vm_arm.dasc
+++ b/src/vm_arm.dasc
@@ -1718,8 +1718,8 @@ static void build_subroutines(BuildCtx *ctx)
|.endif
|.endmacro
|
- | math_minmax math_min, gt, hi
- | math_minmax math_max, lt, lo
+ | math_minmax math_min, gt, pl
+ | math_minmax math_max, lt, le
|
|//-- String library -----------------------------------------------------
|
diff --git a/src/vm_arm64.dasc b/src/vm_arm64.dasc
index f517a808..2c1bb4f8 100644
--- a/src/vm_arm64.dasc
+++ b/src/vm_arm64.dasc
@@ -1494,8 +1494,8 @@ static void build_subroutines(BuildCtx *ctx)
| b <6
|.endmacro
|
- | math_minmax math_min, gt, hi
- | math_minmax math_max, lt, lo
+ | math_minmax math_min, gt, pl
+ | math_minmax math_max, lt, le
|
|//-- String library -----------------------------------------------------
|
diff --git a/src/vm_x64.dasc b/src/vm_x64.dasc
index 59f117ba..faeb5181 100644
--- a/src/vm_x64.dasc
+++ b/src/vm_x64.dasc
@@ -1896,7 +1896,7 @@ static void build_subroutines(BuildCtx *ctx)
| jmp ->fff_res
|
|.macro math_minmax, name, cmovop, sseop
- | .ffunc name
+ | .ffunc_1 name
| mov RAd, 2
|.if DUALNUM
| mov RB, [BASE]
diff --git a/src/vm_x86.dasc b/src/vm_x86.dasc
index f7ffe5d2..1c995d16 100644
--- a/src/vm_x86.dasc
+++ b/src/vm_x86.dasc
@@ -2321,7 +2321,7 @@ static void build_subroutines(BuildCtx *ctx)
| xorps xmm4, xmm4; jmp <1 // Return +-Inf and +-0.
|
|.macro math_minmax, name, cmovop, sseop
- | .ffunc name
+ | .ffunc_1 name
| mov RA, 2
| cmp dword [BASE+4], LJ_TISNUM
|.if DUALNUM
diff --git a/test/tarantool-tests/gh-6163-min-max.test.lua b/test/tarantool-tests/gh-6163-min-max.test.lua
new file mode 100644
index 00000000..1da8a259
--- /dev/null
+++ b/test/tarantool-tests/gh-6163-min-max.test.lua
@@ -0,0 +1,48 @@
+local tap = require('tap')
+local test = tap.test('gh-6163-jit-min-max')
+test:plan(3)
+--
+-- gh-6163: math.min/math.max inconsistencies.
+--
+
+local function is_consistent(res)
+ for i = 1, #res - 1 do
+ if res[i] ~= res[i + 1] then
+ return false
+ end
+ end
+ return true
+end
+
+-- This function creates dirty values on the Lua stack.
+-- The latter of them is going to be treated as an
+-- argument by the `math.min/math.max`.
+-- The first two of them are going to be overwritten
+-- by the math function itself.
+local function filler()
+ return 1, 1, 1
+end
+
+-- The call must fail with no args.
+filler()
+local r, _ = pcall(function() math.min() end)
+test:ok(false == r, 'math.min fails with no args')
+
+filler()
+r, _ = pcall(function() math.max() end)
+test:ok(false == r, 'math.max fails with no args')
+
+-- Incorrect fold optimization.
+jit.off()
+jit.flush()
+jit.opt.start('hotloop=1')
+jit.on()
+
+local res = {}
+for i = 1, 4 do
+ res[i] = math.min(math.min(0/0, math.huge), math.huge)
+end
+
+test:ok(is_consistent(res), '(a o b) o a -> a o b')
+
+os.exit(test:check() and 0 or 1)
--
2.38.1