Hi, Sergey,

thanks for the patch! Please see my comments below.

Sergey

On 6/12/25 12:36, Sergey Kaplun wrote:
> From: Mike Pall
>
> Reported by caohongqing.
> Fix contributed by Peter Cawley.
>
> (cherry picked from commit 8fbd576fb9414a5fa70dfa6069733d3416a78269)
>
> `asm_hrefk()` uses the check for the offset for the corresponding node
> structure. However, the target load is performed from its inner `key`
> field with the offset 8. In the case of a huge table, it is possible
> that the offset of the node (4095 * 8) is less than 4096 * 8 and can be
> emitted via the corresponding instruction as an immediate offset, but
> the offset of the `key` field is not. This leads to the corresponding
> assertion failure in `emit_lso()`.

The issue [1] contains yet another fix in the same place [2].
We decided to backport that patch separately, but please mention this
in the commit message.

1. https://github.com/LuaJIT/LuaJIT/issues/1026
2. https://github.com/LuaJIT/LuaJIT/commit/93ce12ee15abf28ef4cb24ae7e4b8a5b73d75c85

>
> This patch fixes this behaviour by the correct check.
>
> Sergey Kaplun:
> * added the description and the test for the problem
>
> Part of tarantool/tarantool#11278
> ---
>
> Related issues:
> * https://github.com/LuaJIT/LuaJIT/issues/1026
> * https://github.com/tarantool/tarantool/issues/11278
> Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-1026-arm64-invalid-hrefk-offset-check
>
>  src/lj_asm_arm64.h                            |  2 +-
>  ...-arm64-invalid-hrefk-offset-check.test.lua | 48 +++++++++++++++++++
>  2 files changed, 49 insertions(+), 1 deletion(-)
>  create mode 100644 test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua
>
> diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
> index 6c7b011f..a7f059a2 100644
> --- a/src/lj_asm_arm64.h
> +++ b/src/lj_asm_arm64.h
> @@ -885,7 +885,7 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
>    IRIns *irkey = IR(kslot->op1);
>    int32_t ofs = (int32_t)(kslot->op2 * sizeof(Node));
>    int32_t kofs = ofs + (int32_t)offsetof(Node, key);
> -  int bigofs = !emit_checkofs(A64I_LDRx, ofs);
> +  int bigofs = !emit_checkofs(A64I_LDRx, kofs);
>    Reg dest = (ra_used(ir) || bigofs) ? ra_dest(as, ir, RSET_GPR) : RID_NONE;
>    Reg node = ra_alloc1(as, ir->op1, RSET_GPR);
>    Reg key, idx = node;
> diff --git a/test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua b/test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua
> new file mode 100644
> index 00000000..de243814
> --- /dev/null
> +++ b/test/tarantool-tests/lj-1026-arm64-invalid-hrefk-offset-check.test.lua
> @@ -0,0 +1,48 @@
> +local tap = require('tap')
> +
> +-- Test file to demonstrate LuaJIT misbehaviour when assembling
> +-- HREFK instruction on arm64 with the huge offset.
> +-- See also: https://github.com/LuaJIT/LuaJIT/issues/1026.
> +local test = tap.test('lj-1026-arm64-invalid-hrefk-offset-check'):skipcond({
> +  ['Test requires JIT enabled'] = not jit.status(),

This is an ARM-specific patch; should we add a skip condition for ARM here?

> +})
> +
> +test:plan(1)
> +
> +-- The assertion fails since in HREFK we are checking the offset
> +-- from the hslots of the table of the Node structure itself

s/Node/`Node`/

> +-- instead of its inner field `key` (with additional 8 bytes).
> +-- So to test this, we generate a big table with constant keys
> +-- and compile a trace for each HREFK possible.
> +
> +local big_tab = {}
> +-- The map of the characters to generate constant string keys.
> +-- The offset of the node should be 4096 * 8. It takes at least
> +-- 1365 keys to hit this value. The maximum possible slots in the
> +-- hash part is 2048, so to fill it with the maximum density (with
> +-- the way below), we need 45 * 45 = 2025 keys.
> +local chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRS'
> +for c1 in chars:gmatch('.') do
> +  for c2 in chars:gmatch('.') do
> +    big_tab[c1 .. c2] = 1
> +  end
> +end
> +
> +jit.opt.start('hotloop=1')
> +
> +-- Generate bunch of traces.
> +for c1 in chars:gmatch('.') do
> +  for c2 in chars:gmatch('.') do
> +    loadstring([=[
> +      local t = ...
> +      for i = 1, 4 do
> +        -- HREFK generation.
> +        t[ ']=] .. c1 .. c2 .. [=[' ] = i
> +      end
> +    ]=])(big_tab)
> +  end
> +end
> +
> +test:ok(true, 'no assertion failed')

I would replace the testcase description with something like
"emitted assembly is correct". Feel free to ignore.

> +
> +test:done(true)
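
For reference, the boundary case from the commit message can be sketched in plain C. This is only a simplified model, not the LuaJIT code: `fits_ldr_x_imm()`, `node_ofs()`, `key_ofs()` and `check_boundary_slot()` are hypothetical helpers, and the check covers just the scaled unsigned 12-bit immediate form of a 64-bit load. The 24-byte node size is an assumption inferred from the numbers in the test comment (1365 * 24 = 4095 * 8); the 8-byte `key` offset is taken from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout (not taken from the LuaJIT headers): a 24-byte node,
** inferred from 1365 * 24 == 4095 * 8, with `key` 8 bytes into the node. */
enum { NODE_SIZE = 24, KEY_OFFSET = 8 };

/* Simplified model of the immediate-offset check for a 64-bit load:
** non-negative, 8-byte aligned, and below 4096 * 8. */
static int fits_ldr_x_imm(int32_t ofs)
{
  return ofs >= 0 && (ofs & 7) == 0 && ofs < 4096 * 8;
}

/* Mirror `ofs` and `kofs` from the patch for a given hash slot index. */
static int32_t node_ofs(int32_t slot) { return slot * NODE_SIZE; }
static int32_t key_ofs(int32_t slot) { return node_ofs(slot) + KEY_OFFSET; }

/* Slot 1365 is the first one where the two checks disagree. */
static void check_boundary_slot(void)
{
  assert(node_ofs(1365) == 4095 * 8);
  assert(fits_ldr_x_imm(node_ofs(1365)));  /* Old check: offset looks fine. */
  assert(!fits_ldr_x_imm(key_ofs(1365)));  /* Actual load: offset too big. */
}
```

Under these assumptions, checking `ofs` accepts the slot while the load from `kofs` needs an offset of 4096 * 8, which is why the check has to be made on `kofs`.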