Hello, Sergey,

thanks for the patch! Please see my comments below.

On 02.10.2024 11:09, Sergey Kaplun wrote:
> From: Mike Pall <mike>
>
> Thanks to Peter Cawley.
>
> (cherry picked from commit 9e0437240f1fb4bfa7248f6ec8be0e3181016119)
>
> For `IR_BSHR`, `IR_BROL`, `IR_BROR` during `kfold_int64arith()` the left
> argument is truncated down to 32 bits, which leads to incorrect results
> if the right argument is >= 32.
typo: is >= 2,147,483,647
>
> Also, `IR_BSAR` does an unsigned shift rather than a signed shift, but
> since this case branch is unreachable, it is harmless for now.
>
> This patch fixes all misbehaviours (including possible for `IR_BSAR`) to
> preserve IR semantics.
>
> Sergey Kaplun:
> * added the description and the test for the problem
>
> Part of tarantool/tarantool#10199
> ---
>   src/lj_opt_fold.c                             |  8 +-
>   .../lj-1079-fix-64-bitshift-folds.test.lua    | 74 +++++++++++++++++++
>   2 files changed, 78 insertions(+), 4 deletions(-)
>   create mode 100644 test/tarantool-tests/lj-1079-fix-64-bitshift-folds.test.lua
>
> diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
> index e2171e1b..2702f79f 100644
> --- a/src/lj_opt_fold.c
> +++ b/src/lj_opt_fold.c
> @@ -382,10 +382,10 @@ static uint64_t kfold_int64arith(jit_State *J, uint64_t k1, uint64_t k2,
>     case IR_BOR: k1 |= k2; break;
>     case IR_BXOR: k1 ^= k2; break;
>     case IR_BSHL: k1 <<= (k2 & 63); break;
> -  case IR_BSHR: k1 = (int32_t)((uint32_t)k1 >> (k2 & 63)); break;
> -  case IR_BSAR: k1 >>= (k2 & 63); break;
> -  case IR_BROL: k1 = (int32_t)lj_rol((uint32_t)k1, (k2 & 63)); break;
> -  case IR_BROR: k1 = (int32_t)lj_ror((uint32_t)k1, (k2 & 63)); break;
> +  case IR_BSHR: k1 >>= (k2 & 63); break;
> +  case IR_BSAR: k1 = (uint64_t)((int64_t)k1 >> (k2 & 63)); break;
> +  case IR_BROL: k1 = lj_rol(k1, (k2 & 63)); break;
> +  case IR_BROR: k1 = lj_ror(k1, (k2 & 63)); break;
>     default: lj_assertJ(0, "bad IR op %d", op); break;
>     }
>   #else
> diff --git a/test/tarantool-tests/lj-1079-fix-64-bitshift-folds.test.lua b/test/tarantool-tests/lj-1079-fix-64-bitshift-folds.test.lua
> new file mode 100644
> index 00000000..6cc0b319
> --- /dev/null
> +++ b/test/tarantool-tests/lj-1079-fix-64-bitshift-folds.test.lua
> @@ -0,0 +1,74 @@
> +local tap = require('tap')
> +
> +-- Test file to demonstrate LuaJIT misbehaviour on folding
> +-- for bitshift operations.
> +-- See also,https://github.com/LuaJIT/LuaJIT/issues/1079.
> +
> +local test = tap.test('lj-1079-fix-64-bitshift-folds'):skipcond({
> +  ['Test requires JIT enabled'] = not jit.status(),
> +})
> +
> +local bit = require('bit')
> +
> +test:plan(4)
> +
> +-- Generic function for `bit.ror()`, `bit.rol()`.
> +local function bitop_rotation(bitop)

I would rename arg `bitop` to `bitop_func` to highlight the type

of the value.

> +  local r = {}
> +  for i = 1, 4 do
> +    -- (i & k1) o k2 ==> (i o k2) & (k1 o k2)
> +    local int64 = bit.band(i, 7LL)
> +    r[i] = tonumber(bitop(int64, 32))
please add comments about magic constants here and below
> +  end
> +  return r
> +end
> +
> +-- Similar function for `bit.rshift()`.
> +local function bitop_rshift_signed()
> +  local r = {}
> +  for i = 1, 4 do
> +    -- (i & k1) o k2 ==> (i o k2) & (k1 o k2)
> +    -- XXX: Use `-i` instead of `i` to prevent other folding due
> +    -- to IR difference so the IRs don't match fold rule mask.
> +    -- (-i & 7LL) < 1 << 32 => result == 0.
> +    local int64 = bit.band(-i, 7LL)
> +    r[i] = tonumber(bit.rshift(int64, 32))
> +  end
> +  return r
> +end
> +
> +-- A little bit different example, which leads to the assertion
> +-- failure due to the incorrect recording.
> +local function bitop_rshift_huge()
> +  local r = {}
> +  for i = 1, 4 do
> +    -- (i & k1) o k2 ==> (i o k2) & (k1 o k2)
> +    -- XXX: Need to use cast to the int64_t via `+ 0LL`, see the
> +    -- documentation [1] for the details.
> +    -- [1]:https://bitop.luajit.org/semantics.html
> +    local int64 = bit.band(2 ^ 33 + i, 2 ^ 33 + 0LL)
> +    r[i] = tonumber(bit.rshift(int64, 32))
> +  end
> +  return r
> +end
> +
> +local function test_64bitness(subtest, payload_func, bitop)
> +  subtest:plan(1)
> +
> +  jit.off()
> +  jit.flush()
> +  local results_joff = payload_func(bitop)
> +  jit.on()
> +  -- Reset hotcounters.
> +  jit.opt.start('hotloop=1')
> +  local results_jon = payload_func(bitop)
> +  subtest:is_deeply(results_jon, results_joff,
> +                    'same results for VM and JIT for ' .. subtest.name)
> +end
> +
> +test:test('rol', test_64bitness, bitop_rotation, bit.rol)
> +test:test('ror', test_64bitness, bitop_rotation, bit.ror)
> +test:test('rshift signed', test_64bitness, bitop_rshift_signed)
> +test:test('rshift huge',   test_64bitness, bitop_rshift_huge)
have you added additional whitespaces intentionally?
> +
> +test:done(true)