[Tarantool-patches] [PATCH luajit v7] Fix math.min()/math.max() inconsistencies.

Fri Feb 3 12:11:21 MSK 2023

Hi, Maxim!

Thanks for the fixes and verbose comments!

LGTM, except a few nits below.

On 03.02.23, Maxim Kokryashkin wrote:
> From: Mike Pall <mike>
> 
> (cherry-picked from commit 03208c8162af9cc01ca76ee1676ca79e5abe9b60)
> 
> `math.min()`/`math.max()` could produce different results.
> Previously, dirty values on the Lua stack could be
> treated as arguments to `math.min()`/`math.max()`.
> This patch adds check for the number of arguments provided to
> math.min/max, which fixes the issue.
> 
> Also, several fold optimizations were modified or
> deleted due to inconsistencies between compiler
> and interpreter on NaN-values.
> 
> Changes in `fold_kfold_numarith()` also required
> to replace the `CC_LO/CC_HI` comparison modes with the `CC_LE/CC_PL`
> on aarhc64 platforms. The issue is thoroughly described just before
> the corresponding test.

Please, mention that mips/ppc-related changes are omitted due to
conflicts with the previous patches to backport.

> 
> Maxim Kokryashkin & Sergey Kaplun:
> * added the description and tests for the problem
> 
> Resolves tarantool/tarantool#6163
> ---
>  src/lj_asm_arm.h                              |   6 +-
>  src/lj_asm_arm64.h                            |   6 +-
>  src/lj_opt_fold.c                             |  53 ++--
>  src/lj_vmmath.c                               |   4 +-
>  src/vm_arm.dasc                               |   4 +-
>  src/vm_arm64.dasc                             |   4 +-
>  src/vm_x64.dasc                               |   2 +-
>  src/vm_x86.dasc                               |   2 +-
>  test/tarantool-tests/gh-6163-min-max.test.lua | 245 ++++++++++++++++++
>  9 files changed, 278 insertions(+), 48 deletions(-)
>  create mode 100644 test/tarantool-tests/gh-6163-min-max.test.lua
> 
> diff --git a/src/lj_asm_arm.h b/src/lj_asm_arm.h
> index 8af19eb9..6ae6e2f2 100644
> --- a/src/lj_asm_arm.h
> +++ b/src/lj_asm_arm.h

<snipped>

> diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
> index 4aeb51f3..fe197700 100644
> --- a/src/lj_asm_arm64.h
> +++ b/src/lj_asm_arm64.h

<snipped>

> diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
> index 49f74996..27e489af 100644
> --- a/src/lj_opt_fold.c
> +++ b/src/lj_opt_fold.c

<snipped>

> diff --git a/src/lj_vmmath.c b/src/lj_vmmath.c
> index c04459bd..ae4e0f15 100644
> --- a/src/lj_vmmath.c
> +++ b/src/lj_vmmath.c

<snipped>

> diff --git a/src/vm_arm.dasc b/src/vm_arm.dasc
> index a29292f1..89faa03e 100644
> --- a/src/vm_arm.dasc
> +++ b/src/vm_arm.dasc

<snipped>

> diff --git a/src/vm_arm64.dasc b/src/vm_arm64.dasc
> index f517a808..2c1bb4f8 100644
> --- a/src/vm_arm64.dasc
> +++ b/src/vm_arm64.dasc

<snipped>

> diff --git a/src/vm_x64.dasc b/src/vm_x64.dasc
> index 59f117ba..faeb5181 100644
> --- a/src/vm_x64.dasc
> +++ b/src/vm_x64.dasc

<snipped>

> diff --git a/src/vm_x86.dasc b/src/vm_x86.dasc
> index f7ffe5d2..1c995d16 100644
> --- a/src/vm_x86.dasc
> +++ b/src/vm_x86.dasc

<snipped>

> diff --git a/test/tarantool-tests/gh-6163-min-max.test.lua b/test/tarantool-tests/gh-6163-min-max.test.lua
> new file mode 100644
> index 00000000..0c9378cc
> --- /dev/null
> +++ b/test/tarantool-tests/gh-6163-min-max.test.lua
> @@ -0,0 +1,245 @@
> +local tap = require('tap')
> +local test = tap.test('gh-6163-jit-min-max')
> +local x86_64 = jit.arch == 'x86' or jit.arch == 'x64'

I suggest to create an additional ticket for this misbehaviour on
x86_64 (in our or Mike's repo -- up to you).

> +test:plan(18)
> +--
> +-- gh-6163: math.min/math.max inconsistencies.
> +--
> +
> +local function isnan(x)
> +    return x ~= x
> +end
> +
> +local function array_is_consistent(res)
> +  for i = 1, #res - 1 do
> +    if res[i] ~= res[i + 1] and not (isnan(res[i]) and isnan(res[i + 1])) then
> +      return false
> +    end
> +  end
> +  return true
> +end
> +
> +-- This function creates dirty values on the Lua stack.
> +-- The latter of them is going to be treated as an
> +-- argument by the `math.min/math.max`.
> +-- The first two of them are going to be overwritten
> +-- by the math function itself.
> +local function filler()
> +  return 1, 1, 1
> +end
> +
> +local min = math.min
> +local max = math.max
> +
> +-- It would be enough to test all cases for the
> +-- `math.min()` or for the `math.max()` only, because the
> +-- problem was in the common code. However, we shouldn't
> +-- make such assumptions in the testing code.

Please, drop a comment that cycle with
| for name, comp in pairs({min = min, max = max}) do
brokes the semantics of some tests (as we discussed offline), so to be
consistent between each other use the copy-paste approach.

> +
> +-- `math.min()/math.max()` should raise an error when
> +-- called with no arguments.
> +filler()
> +local r, _ = pcall(function() min() end)
> +test:ok(not r, 'math.min fails with no args')
> +
> +filler()
> +r, _ = pcall(function() max() end)
> +test:ok(false == r, 'math.max fails with no args')
> +
> +local nan = 0/0
> +local x = 1
> +
> +jit.opt.start('hotloop=1')
> +jit.on()

IINM, jit is already on, so there is no need in this line.

> +
> +-- Without the `(a o b) o a ==> a o b` fold optimization for
> +-- `math.min()/math.max()` the following mcode is emitted on aarch64
> +-- for the `math.min(math.min(x, nan), x)` expression:
> +--
> +-- | fcmp d2, d3 ; fcmp 1.0, nan
> +-- | fcsel d1, d2, d3, cc ; d1 == nan after this instruction
> +-- | ...
> +-- | fcmp d1, d2 ; fcmp nan, 1.0
> +-- | fcsel d0, d1, d2, cc ; d0 == 1.0 after this instruction
> +--
> +-- According to the `fcmp` docs[1], if either of the operands is NaN,
> +-- then the operands are unordered. It results in the following state
> +-- of the flags register: N=0, Z=0, C=1, V=1
> +--
> +-- According to the `fcsel` docs[2], if the condition is met, then
> +-- the first register value is taken, otherwise -- the second.
> +-- In our case, the condition is cc, which means that the `C` flag
> +-- should be clear[3], which is false. Then, the second value is taken,
> +-- which is `NaN` for the first `fcmp`-`fcsel` pair, and `1.0` for
> +-- the second.
> +--
> +-- If that fold optimization is applied, then only the first `fcmp`-`fcsel`
> +-- pair is emitted, and the result is `NaN`, which is inconsistent with
> +-- the result of the non-optimized mcode.
> +--
> +-- [1]: https://developer.arm.com/documentation/dui0801/g/A64-Floating-point-Instructions/FCMP
> +-- [2]: https://developer.arm.com/documentation/100069/0608/A64-Floating-point-Instructions/FCSEL
> +-- [3]: https://developer.arm.com/documentation/dui0068/b/ARM-Instruction-Reference/Conditional-execution
> +
> +local result = {}
> +for k = 1, 4 do
> +    result[k] = min(min(x, nan), x)
> +end

It will be nice to add 'expected: ...' comment for all tests, not really
obvious to me.
Here and below.

> +test:ok(array_is_consistent(result), 'math.min: reassoc_dup')
> +

<snipped>

> +-- XXX: The two tests below use the `0/0` constant instead of `nan`
> +-- variable is dictated by the `fold_kfold_numarith` semantics.
> +result = {}
> +for k = 1, 4 do
> +  result[k] = min(min(7.1, 0/0), 1.1)
> +end
> +test:ok(array_is_consistent(result), 'min: fold_kfold_numarith')
> +
> +result = {}
> +for k = 1, 4 do
> +  result[k] = max(max(7.1, 0/0), 1.1)
> +end
> +test:ok(array_is_consistent(result), 'max: fold_kfold_numarith')
> +
> +

Nit: excess empty line.

> +os.exit(test:check() and 0 or 1)
> -- 
> 2.39.0
> 

-- 
Best regards,
Sergey Kaplun