Hi, Sergey!
Thanks for the fixes!
LGTM

>
>>Hi, Maxim!
>>
>>Thanks for the review!
>>
>>On 24.01.23, Maxim Kokryashkin wrote:
>>>
>>> Hi, Sergey!
>>> Thanks for the patch!
>>> Please consider my comments below.
>>>
>>> >
>>> >>From: Mike Pall
>>> >>
>>> >>(cherry picked from commit 522d2073da4be2af79db4728cbb375db0fbdfc48)
>>> >>
>>> >>`asm_intarith()` function may try to drop `test r, r` instruction before
>>> >Please note that "r" is an allocated register for the instruction.
>>> >>the Jcc instruction. However, in case when Jcc instruction is "Jump
>>> >Typo: s/in case when/in cases where/
>>
>>Fixed.
>>
>>> >>short if ..." instruction (i.e. has no 0F opcode prefix like "Jump near
>>> >>if ..."), the `test` instruction is dropped when shouldn't be, due to
>>> >Typo: s/when/when it/
>>> >>memory miss. As the result, the loop can't be realigned later in
>>> >Typo: s/memory/a memory/
>>> >Also, that part about the memory miss is unclear, it would be better if you
>>> >could clarify it a bit.
>>> >>`asm_loop_fixup` due to target to jump isn't aligned and the assertion
>>> >Typo: s/isn’t aligned/being misaligned/
>>
>>Fixed.
>>
>>> >>fails.
>>> >>
>>> >>This patch adds the additional check for 0F opcode in `asm_intarith()`.
>>> >Typo: s/for 0F/for the 0F/
>>
>>Fixed, thanks!
>>
>>> >>
>>> >>Sergey Kaplun:
>>> >>* added the description and the test for the problem
>>> >>
>>> >>Part of tarantool/tarantool#8069
>>
>>The new commit message is the following:
>>
>>| x86/x64: Fix loop realignment.
>>|
>>| (cherry picked from commit 522d2073da4be2af79db4728cbb375db0fbdfc48)
>>|
>>| `asm_intarith()` function may try to drop `test r, r` (where `r` is an
>>| allocated register) instruction before the Jcc instruction. However, in
>>| cases when Jcc instruction is "Jump short if ..." instruction (i.e. has
>>| no 0F opcode prefix like "Jump near if ..."), the `test` instruction is
>>| dropped when it shouldn't be, due to usage for the comparison the next
>>| byte after instruction itself. As the result, the loop can't be
>>| realigned later in `asm_loop_fixup` due to target to jump being
>>| misaligned and the assertion fails.
>>|
>>| This patch adds the additional check for the 0F opcode in
>>| `asm_intarith()`.
>>|
>>| Sergey Kaplun:
>>| * added the description and the test for the problem
>>|
>>| Part of tarantool/tarantool#8069
>>
>>Branch is force pushed.
>>
>>> >>---
>>> >> src/lj_asm_x86.h | 5 +++--
>>> >> .../lj-556-fix-loop-realignment.test.lua | 18 ++++++++++++++++++
>>> >> 2 files changed, 21 insertions(+), 2 deletions(-)
>>> >> create mode 100644 test/tarantool-tests/lj-556-fix-loop-realignment.test.lua
>>> >>
>>> >>diff --git a/src/lj_asm_x86.h b/src/lj_asm_x86.h
>>> >>index 8efda8e5..e6c42c6d 100644
>>> >>--- a/src/lj_asm_x86.h
>>> >>+++ b/src/lj_asm_x86.h
>>> >>@@ -2068,8 +2068,9 @@ static void asm_intarith(ASMState *as, IRIns *ir, x86Arith xa)
>>> >>   int32_t k = 0;
>>> >>   if (as->flagmcp == as->mcp) { /* Drop test r,r instruction. */
>>> >>     MCode *p = as->mcp + ((LJ_64 && *as->mcp < XI_TESTb) ? 3 : 2);
>>> >>-    if ((p[1] & 15) < 14) {
>>> >>-      if ((p[1] & 15) >= 12) p[1] -= 4; /* L <->S, NL <-> NS */
>>> >>+    MCode *q = p[0] == 0x0f ? p+1 : p;
>>> >>+    if ((*q & 15) < 14) {
>>> >>+      if ((*q & 15) >= 12) *q -= 4; /* L <->S, NL <-> NS */
>>> >>       as->flagmcp = NULL;
>>> >>       as->mcp = p;
>>> >>     } /* else: cannot transform LE/NLE to cc without use of OF. */
>>> >>diff --git a/test/tarantool-tests/lj-556-fix-loop-realignment.test.lua b/test/tarantool-tests/lj-556-fix-loop-realignment.test.lua
>>> >>new file mode 100644
>>> >>index 00000000..9a8e6098
>>> >>--- /dev/null
>>> >>+++ b/test/tarantool-tests/lj-556-fix-loop-realignment.test.lua
>>> >>@@ -0,0 +1,18 @@
>>> >>+local tap = require('tap')
>>> >>+
>>> >>+local test = tap.test('lj-505-fold-icorrect-behavior')
>>> >>+test:plan(1)
>>> >>+
>>> >>+-- Test file to demonstrate JIT misbehaviour for loop realignment
>>> >>+-- in LUAJIT_NUMMODE=2. See also
>>> >>+-- https://github.com/LuaJIT/LuaJIT/issues/556 .
>>> >>+
>>> >>+jit.opt.start('hotloop=1')
>>> >>+
>>> >>+local s = 4
>>> >>+while s > 0 do
>>> >>+  s = s - 1
>>> >>+end
>>> >>+
>>> >>+test:ok(true, 'loop is compiled and ran successfully')
>>> >>+os.exit(test:check() and 0 or 1)
>>> >>--
>>> >The test works just fine with HEAD on
>>> >f7d61d96  ci: introduce workflow for exotic builds.
>>> >
>>> >Tested configurations:
>>> >LJ_64: True, LJ_GC64: True, LJ_DUALNUM: True
>>> >LJ_64: True, LJ_GC64: False, LJ_DUALNUM: True
>>
>>It's strange...
>>I use the following build command:
>>| $ cmake . -DCMAKE_BUILD_TYPE=Debug -DLUA_USE_APICHECK=ON -DLUA_USE_ASSERT=ON -DLUAJIT_ENABLE_GC64=OFF -DLUAJIT_NUMMODE=2 && make -j
>>and get the following assertion:
>>| asm_loop_fixup: Assertion `((intptr_t)target & 15) == 0' failed.
>>What command do you use to build LuaJIT?
>PEBKAC, I forgot to add the LUA_USE_ASSERT flag.
>>> >--
>>> >Best regards,
>>> >Maxim Kokryashkin
>>> >
>>
>>--
>>Best regards,
>>Sergey Kaplun
>--
>Best regards,
>Maxim Kokryashkin
>
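
P.S. For readers following the thread, here is a minimal standalone sketch of the check the patch introduces. It is not LuaJIT code: the helper names `jcc_cc_byte` and `jcc_drop_test` are hypothetical, and plain `uint8_t` stands in for `MCode`. It relies only on the x86 encoding facts discussed above: a short Jcc is a single opcode byte `0x70 | cc` followed by a rel8 displacement, while a near Jcc is `0x0F` followed by `0x80 | cc` and a rel32 displacement. The condition code therefore lives in the first byte for the short form and in the second byte for the near form.

```c
#include <stdint.h>

/* Hypothetical helper (not part of LuaJIT): return a pointer to the byte
** that carries the condition code of the Jcc instruction starting at p.
** Near Jcc:  0x0F, 0x80|cc, rel32  -> condition code is in p[1].
** Short Jcc: 0x70|cc, rel8         -> condition code is in p[0].
*/
static uint8_t *jcc_cc_byte(uint8_t *p)
{
  return p[0] == 0x0f ? p + 1 : p;
}

/* Hypothetical helper: decide whether a `test r, r` in front of the Jcc
** at p may be dropped, rewriting the condition in place if needed.
** Returns 1 if the test may be dropped, 0 otherwise.
*/
static int jcc_drop_test(uint8_t *p)
{
  uint8_t *q = jcc_cc_byte(p);
  unsigned cc = *q & 15u;
  if (cc >= 14)
    return 0;  /* LE/NLE cannot be expressed without the OF flag. */
  if (cc >= 12)
    *q -= 4;   /* L -> S, NL -> NS: equivalent after test r,r (OF = 0). */
  return 1;
}
```

With the pre-patch logic, the equivalent of `p[1]` was inspected unconditionally, so for a short jump the decision was made on the rel8 displacement byte rather than on the opcode. In the sketch, a short `JL` encoded as `7C 0E` becomes `78 0E` (`JS`) with the displacement left untouched.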