[Tarantool-patches] [PATCH luajit 4/5] Fix pow() optimization inconsistencies.
Maxim Kokryashkin
m.kokryashkin at tarantool.org
Mon Aug 21 12:00:37 MSK 2023
Hi, Sergey!
Thanks for the fixes!
LGTM now, see my answers below.
On Mon, Aug 21, 2023 at 11:06:32AM +0300, Sergey Kaplun wrote:
> Hi, Maxim!
> Thanks for the review!
> See my answers below.
>
> On 20.08.23, Maxim Kokryashkin wrote:
> > Hi, Sergey!
> > Thanks for the patch!
> > Please consider my comments below.
> >
> > On Tue, Aug 15, 2023 at 12:36:30PM +0300, Sergey Kaplun wrote:
> > > From: Mike Pall <mike>
> > >
> > > (cherry-picked from commit 9512d5c1aced61e13e7be2d3208ec7ae3516b458)
> > >
> > > This patch fixes different misbehaviour between JIT-compiled code and
> > Typo: s/misbehaviour/misbehaviours/
>
> Fixed.
>
> > > the interpreter for power operator with the following ways:
> > Typo: s/with the/in the/
>
> Fixed.
>
> > > * Drop folding optimizations for base ^ 0.5 => sqrt(base), as far as
> > > pow(base, 0.5) isn't interchangeable and depends on the <math.h>
> > > implementation.
> > > * Drop folding optimizations for 2 ^ int_pow => ldexp(1.0, int_pow), to
> > >   avoid dependency on the <math.h> implementation.
> > > * Now `asm_pow()` always assemble a call to the `lj_vm_powi()` function,
> > Typo: s/assemble/assembles/
>
> Fixed.
>
> > > that is general now for all CPU architectures. Using this internal
> > > function instead of toolchain-provided `pow()` guarantees consistency
> > Typo: s/of/of the/
>
> Fixed.
>
> > > between interpreter and JIT results. Also, it drops custom
> > Typo: s/drops/drops the/
>
> Fixed.
>
> > > implementation for the `vm_powi_sse()` on x86_64.
> > Typo: s/for the/for/
>
> Fixed.
>
> > > * `math_extern2` macro in the VM may take the second argument, that is
> > > used as the target function to call. The first argument is still the
> > > name for `func_nnsse` macro.
> > > * Narrowing for power operation avoids range guard for non-constant base
> > > IR. This leads to invalid result if value on trace is out of range.
> > Typo: s/to invalid/to an invalid/
>
> Fixed.
>
> > > Now it is done unconditionally.
> > >
> > > Be aware, that [220/502] lib/string/format/num.lua test [1] from
> > Typo: s/from the/from/
>
> I suppose that it should be "from the"? Fixed.
Yep, I got the order wrong, sorry.
>
> > > LuaJIT-test suite fails after this commit.
> > >
> > > [1]: https://www.exploringbinary.com/incorrect-floating-point-to-decimal-conversions/
> > >
> > > Sergey Kaplun:
> > > * added the description and the test for the problem
> > >
> > > Part of tarantool/tarantool#8825
> > > ---
>
> <snipped>
>
> > > +local res = {}
> > > +-- -0 ^ 0.5 = 0. Test sign with `tostring()`.
> > Typo: s/Test/Test the/
>
> Fixed.
>
> > > +-- XXX: use local variable to prevent folding via parser.
>
> <snipped>
>
> > > +
> > > +-- 2921 ^ 0.5 = 0x1.b05ec632536fap+5.
> > We certainly need to add some explanation here about the precision, because
> > it is not obvious why these magic numbers should cause any issues.
>
> I suppose any reader really interested in this may compare the
> behaviour of the glibc implementation of `sqrt()` and `pow()`. Also, the
> comment should mention this implementation, so it becomes too huge and
> distracts the reader from the test case itself.
Something like the comment below is sufficient:
| This number has no special meaning and is used as one that gives different
| results when its square root is obtained with glibc's `sqrt` and `pow`
| operations, thanks to their implementation nuances.
I strongly suggest adding it to make the test case more understandable.
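For anyone who wants to see the effect locally, here is a minimal sketch that
prints both results side by side. Whether they actually differ depends on the
libm in use (glibc is assumed in the discussion above), so no particular
output is claimed. The variable names are illustrative only and are not taken
from the test.

```lua
-- Illustration only: compare the two ways of taking a square root.
-- Whether the values differ depends on the libm in use, so this sketch
-- only prints them and does not claim a specific output.
local value = 2921 -- Taken from the test; it has no special meaning.
local via_sqrt = math.sqrt(value)

-- Keep the exponent in a local variable so that the parser does not
-- fold the whole expression into a constant.
local half = 0.5
local via_pow = value ^ half

-- Print with 17 significant digits to expose a last-bit difference.
print(string.format('sqrt: %.17g', via_sqrt))
print(string.format('pow:  %.17g', via_pow))
print('identical:', via_sqrt == via_pow)
```

If the two lines differ in the last digits, this is the kind of discrepancy
that folding `base ^ 0.5` into `sqrt(base)` could expose between the
interpreter and a compiled trace.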
>
> Ignoring for now.
>
> > > +res = {}
>
> <snipped>
>
> > > +test:samevalues(res, ('consistent results for folding 2921 ^ 0.5'))
> >
> > I believe it is possible to make a single function with different
> > parameters for all three cases above.
> > Something like `test_power(value, power, extra_map)`, so you can do
> > | res[i] = extra_map(value ^ power)
>
> I'm afraid that this function doesn't give any improvement in readability,
> also, it may change the trace semantics, so I prefer to leave it as is.
>
> Ignoring for now.
I expressed my suggestion unclearly, sorry. What I meant is something like
this:
| local function pow_test_case(value, power, extra_map)
|   local res = {}
|   jit.on()
|   for i = 1, 4 do
|     res[i] = extra_map(value ^ power)
|   end
|
|   -- XXX: Prevent hotcount side effects.
|   jit.off()
|   jit.flush()
|
|   test:samevalues(res, ('consistent results for <...>'))
| end
Anyway, I've checked the jit.dump output myself, and even for the simple
cases the traces are entirely different. With that in mind, I believe this
comment should be ignored, even though that is very sad.
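For reference, a rough sketch of how such a check could be reproduced. It is
a variant of the `pow_test_case` idea above that can also dump the recorded
traces to a file. The `collect` name, the `dumpfile` parameter and the
`hotloop=1` setting are my assumptions, not something taken from the patch.
The guards let the snippet run where the `jit` table or the `jit.dump` module
is unavailable.

```lua
-- Hypothetical helper, similar to the `pow_test_case` sketch above, that
-- optionally dumps the recorded traces so they can be compared by eye.
local has_jit = type(jit) == 'table' -- Plain Lua has no `jit` table.
local dump_ok, dump = pcall(require, 'jit.dump')

local function collect(value, power, extra_map, dumpfile)
  local res = {}
  local dumping = has_jit and dump_ok and dumpfile ~= nil
  if has_jit then
    -- Assumed setting: compile loops as early as possible.
    jit.opt.start('hotloop=1')
    -- Dump traces, bytecode and IR into the given file, if requested.
    if dumping then dump.on('tbi', dumpfile) end
    jit.on()
  end

  for i = 1, 4 do
    res[i] = extra_map(value ^ power)
  end

  if has_jit then
    -- Prevent hotcount side effects, as in the test itself.
    -- Note that this leaves the JIT compiler switched off.
    jit.off()
    jit.flush()
    if dumping then dump.off() end
  end
  return res
end

-- Usage: no dump file is given here, so nothing is written to disk.
local res = collect(2921, 0.5, tostring)
print(res[1], res[2], res[3], res[4])
```

Passing a file name as the last argument and comparing the output with a dump
of the original, inlined loop should show whether the helper changes what
gets recorded.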
>
> >
> > > +
>
> <snipped>
>
> > > +-- Need some value near 1, to avoid infinite result.
> > Typo: s/Need/We need/
> > Typo: s/avoid/avoid an/
>
> Fixed.
>
> See the iterative patch below.
>
> ===================================================================
> diff --git a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
> index 5129fc45..003fe957 100644
> --- a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
> +++ b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
> @@ -18,7 +18,7 @@ jit.off()
> jit.flush()
>
> local res = {}
> --- -0 ^ 0.5 = 0. Test sign with `tostring()`.
> +-- -0 ^ 0.5 = 0. Test the sign with `tostring()`.
> -- XXX: use local variable to prevent folding via parser.
> -- XXX: use stack slot out of trace to prevent constant folding.
> local minus_zero = -0
> @@ -75,7 +75,7 @@ jit.on()
> pow(1, 2)
> pow(1, 2)
>
> --- Need some value near 1, to avoid infinite result.
> +-- We need some value near 1, to avoid an infinite result.
> local base = 1.0000000001
> local power = 65536 * 3
> local resulting_value = pow(base, power)
> ===================================================================
>
> > > +local base = 1.0000000001
>
> <snipped>
>
> > > --
> > > 2.41.0
> > >
>
> --
> Best regards,
> Sergey Kaplun