* Re: [Tarantool-patches] [PATCH luajit] x64: Fix 64 bit shift code generation.
[not found] <20230601110814.577426-1-m.kokryashkin@tarantool.org>
@ 2023-06-07 7:54 ` Sergey Kaplun via Tarantool-patches
0 siblings, 0 replies; only message in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-06-07 7:54 UTC (permalink / raw)
To: Maxim Kokryashkin; +Cc: tarantool-patches
Hi, Maxim!
Thanks for the patch!
Please, consider my comments below.
On 01.06.23, Maxim Kokryashkin wrote:
> From: Mike Pall <mike>
>
> Reported by Philipp Kutin.
> Fix contributed by Peter Cawley.
>
> (cherry-picked from commit 03a7ebca4f6819658cdaa12ba3af54a17b8035e9)
>
> In a situation where a variable shift left bitwise rotation that
> has a 64-bit result is recorded on an x86 64-bit processor and
> the result is supposed to end up in the `rcx` register, that value
> could be written into the `ecx` instead, thus being truncated into
> 32 bits. This patch fixes the described behavior, so now that
> value is written into the `rcx`.
>
> Importantly, the same behavior is impossible with the right
> rotation because there is a BMI2 instruction for it, so it
> is handled differently.
I'd mention that this is impossible on machines with BMI2 support, to
avoid misunderstanding.
>
> Maxim Kokryashkin:
> * added the description and the test for the problem
>
> Part of tarantool/tarantool#8516
> ---
> PR: https://github.com/tarantool/tarantool/pull/8727
> Branch: https://github.com/tarantool/luajit/tree/fckxorg/fix-bit-shift-generation
>
> src/lj_asm_x86.h | 2 +-
> .../fix-bit-shift-generation.test.lua | 29 +++++++++++++++++++
> 2 files changed, 30 insertions(+), 1 deletion(-)
> create mode 100644 test/tarantool-tests/fix-bit-shift-generation.test.lua
>
> diff --git a/src/lj_asm_x86.h b/src/lj_asm_x86.h
> index e6c42c6d..63d332ca 100644
> --- a/src/lj_asm_x86.h
> +++ b/src/lj_asm_x86.h
<snipped>
> diff --git a/test/tarantool-tests/fix-bit-shift-generation.test.lua b/test/tarantool-tests/fix-bit-shift-generation.test.lua
> new file mode 100644
> index 00000000..f9da42cc
> --- /dev/null
> +++ b/test/tarantool-tests/fix-bit-shift-generation.test.lua
> @@ -0,0 +1,29 @@
> +local tap = require('tap')
> +local test = tap.test('fix-bit-shift-generation'):skipcond({
> + ['Test requires JIT enabled'] = not jit.status(),
> +})
> +
> +test:plan(4)
Minor: I suggest to create the constant for this, named `NTESTS` to be
used later in cycle too.
> +
> +local ffi = require('ffi')
> +local rol = require('bit').rol
> +
> +ffi.cdef[[
> +int snprintf(char* dst, size_t n, const char* format, ...);
> +]]
Minor: I suppose, that we can use the following declaration (like in
<test/tarantool-tests/lj-695-ffi-vararg-call.test.lua>):
| ffi.cdef('int sprintf(char *str, const char *format, ...)')
Feel free to ignore.
> +
> +-- Buffer size is adjsuted to fit `(1 << 36)`,
> +-- which has exactly 11 digits.
> +local BUFFER_SIZE = 12
> +local buf = ffi.new('char[?]', BUFFER_SIZE)
> +jit.opt.start('hotloop=1')
> +
> +for i = 1, 4 do
> +-- Rotation is performed beyond the 32-bit size, for truncation to
> +-- become noticeable. Sprintf is used to ensure that the result of
> +-- rotation goes into the `rcx`, corresponing to the x86_64 ABI.
Typo: wrong indentation.
> + local result = ffi.C.snprintf(buf, BUFFER_SIZE, "%llu", rol(1ULL, i + 32))
Typo: s/"%llu"/'%llu'/, according to our code style agreements.
IINM, the resulting assembler changes within the patch from this:
| rol rsi, cl
| mov ecx, esi
to the following:
| rol rsi, cl
| mov rcx, rsi
Please, mention this in the comment to the `rol(1ULL, i + 32)`, and the
commit message.
> + test:ok(result > 1, '64-bit value was not truncated')
Can we also check the buffer contents (i.e the result of the operation)?
More specifically, we can use `tonumber(ffi.string(buf))` and compare it
with the `2 ^ (i + 32)`.
Side note: Also, maybe it is better to move test check to the separate
cycle, to get "pure" example of misbehaviour and trace of the former.
So, someone to examine the test can just copy-paste the main cycle to
see `jit.dump()` output.
But the current approach is totally fine too, so feel free to ignore.
> +end
> +
> +os.exit(test:check() and 0 or 1)
> --
> 2.40.1
>
--
Best regards,
Sergey Kaplun
^ permalink raw reply [flat|nested] only message in thread