Hi, Sergey!
Thanks for the patch!
Please consider my comments below.
 
 
From: Mike Pall <mike>

(cherry picked from commit 522d2073da4be2af79db4728cbb375db0fbdfc48)

`asm_intarith()` function may try to drop `test r, r` instruction before
Please mention that "r" is the register allocated for the instruction.
the Jcc instruction. However, in case when Jcc instruction is "Jump
Typo: s/in case when/in cases where/
short if ..." instruction (i.e. has no 0F opcode prefix like "Jump near
if ..."), the `test` instruction is dropped when shouldn't be, due to
Typo: s/when/when it/
memory miss. As the result, the loop can't be realigned later in
Typo: s/memory/a memory/
Also, the part about the memory miss is unclear. Please clarify which memory
is read incorrectly and why.
`asm_loop_fixup` due to target to jump isn't aligned and the assertion
Typo: s/isn't aligned/being misaligned/
fails.

This patch adds the additional check for 0F opcode in `asm_intarith()`.
Typo: s/for 0F/for the 0F/
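To make the description easier to follow, here is a minimal sketch of the check as I understand it. It is only an illustration under my assumptions, not the LuaJIT code: the helper names `jcc_opcode_byte` and `jcc_drop_test` are hypothetical, and I assume the usual x86 encodings (short Jcc is the single byte 0x70+cc followed by rel8, near Jcc is 0x0F, 0x80+cc followed by rel32). For the short form, the byte after the opcode is the displacement, which is presumably what the old `p[1]` access was reading.

```c
#include <stdint.h>

/* Hypothetical helper (not in LuaJIT): return the byte that holds the
** condition code of the Jcc instruction starting at p. The near form has
** a 0F prefix, the short form does not. */
static uint8_t *jcc_opcode_byte(uint8_t *p)
{
  return p[0] == 0x0f ? p + 1 : p;
}

/* Hypothetical helper: decide whether a preceding `test r, r` may be
** dropped. Returns 1 and rewrites the condition in place if so, or 0
** if the condition needs OF and must keep the `test`. */
static int jcc_drop_test(uint8_t *p)
{
  uint8_t *q = jcc_opcode_byte(p);
  unsigned cc = *q & 15;
  if (cc >= 14) return 0;  /* LE/NLE cannot be expressed without OF. */
  if (cc >= 12) *q -= 4;   /* L -> S, NL -> NS. */
  return 1;
}
```

With a short `jne` whose displacement byte happens to be 0x0e, a check on `p[1]` would see "condition 14" and take the wrong branch, while the check on the opcode byte sees condition 5. If this matches the actual failure mode, a sentence along these lines in the commit message would help.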

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#8069
---
 src/lj_asm_x86.h | 5 +++--
 .../lj-556-fix-loop-realignment.test.lua | 18 ++++++++++++++++++
 2 files changed, 21 insertions(+), 2 deletions(-)
 create mode 100644 test/tarantool-tests/lj-556-fix-loop-realignment.test.lua

diff --git a/src/lj_asm_x86.h b/src/lj_asm_x86.h
index 8efda8e5..e6c42c6d 100644
--- a/src/lj_asm_x86.h
+++ b/src/lj_asm_x86.h
@@ -2068,8 +2068,9 @@ static void asm_intarith(ASMState *as, IRIns *ir, x86Arith xa)
   int32_t k = 0;
   if (as->flagmcp == as->mcp) { /* Drop test r,r instruction. */
     MCode *p = as->mcp + ((LJ_64 && *as->mcp < XI_TESTb) ? 3 : 2);
-    if ((p[1] & 15) < 14) {
-      if ((p[1] & 15) >= 12) p[1] -= 4; /* L <->S, NL <-> NS */
+    MCode *q = p[0] == 0x0f ? p+1 : p;
+    if ((*q & 15) < 14) {
+      if ((*q & 15) >= 12) *q -= 4; /* L <->S, NL <-> NS */
       as->flagmcp = NULL;
       as->mcp = p;
     } /* else: cannot transform LE/NLE to cc without use of OF. */
diff --git a/test/tarantool-tests/lj-556-fix-loop-realignment.test.lua b/test/tarantool-tests/lj-556-fix-loop-realignment.test.lua
new file mode 100644
index 00000000..9a8e6098
--- /dev/null
+++ b/test/tarantool-tests/lj-556-fix-loop-realignment.test.lua
@@ -0,0 +1,18 @@
+local tap = require('tap')
+
+local test = tap.test('lj-505-fold-icorrect-behavior')
+test:plan(1)
+
+-- Test file to demonstrate JIT misbehaviour for loop realignment
+-- in LUAJIT_NUMMODE=2. See also
+-- https://github.com/LuaJIT/LuaJIT/issues/556.
+
+jit.opt.start('hotloop=1')
+
+local s = 4
+while s > 0 do
+  s = s - 1
+end
+
+test:ok(true, 'loop is compiled and ran successfully')
+os.exit(test:check() and 0 or 1)
--
The test passes just fine with HEAD at
f7d61d96 ("ci: introduce workflow for exotic builds").
 
Tested configurations: 
LJ_64: True, LJ_GC64: True, LJ_DUALNUM: True
LJ_64: True, LJ_GC64: False, LJ_DUALNUM: True
--
Best regards,
Maxim Kokryashkin