[Tarantool-patches] [PATCH luajit] LJ_GC64: Fix HREFK optimization.
Sergey Bronnikov
sergeyb at tarantool.org
Tue Jan 16 11:46:24 MSK 2024
Hi, Max!
Thanks for the patch!
The test still passes after reverting the patch.
On the same build, the repro from the issue [1] fails.
1. https://github.com/LuaJIT/LuaJIT/issues/840
Sergey
On 1/12/24 16:26, Maxim Kokryashkin wrote:
> From: Mike Pall <mike>
>
> Contributed by XmiliaH.
>
> (cherry-picked from commit 91bc6b8ad1f373c1ce9003dc024b2e21fad0e444)
>
> In `lj_record_idx`, when `ix->oldv` is the global nilnode and the
> required key is not present in the table, the condition for the
> constant key lookup optimization can still be satisfied because of
> a `uint32_t` overflow. As a result, further recording incorrectly
> removes the check for the nilnode, which produces wrong results
> when the trace is executed for a different table.
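
A minimal sketch of the truncation described above, not the LuaJIT code itself. `old_check`, `new_check` and `demo_truncation` are hypothetical helpers that only mirror the two conditions from the diff, `NODE_SIZE` is an assumed stand-in for `sizeof(Node)` (24 bytes, per the test in this patch), and the distance of 4 GiB plus two nodes is a value picked purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-in for sizeof(Node) on GC64 builds (24 bytes in the test). */
#define NODE_SIZE 24u

/* Pre-patch condition: the byte distance is truncated to 32 bits. */
static inline int old_check(uint64_t dist, uint32_t hmask)
{
  uint32_t hslot = (uint32_t)dist;  /* High bits are silently dropped. */
  return hmask > 0 && hslot <= hmask * NODE_SIZE &&
         hslot <= 65535u * NODE_SIZE;
}

/* Post-patch condition: the distance keeps its full 64-bit width. */
static inline int new_check(uint64_t dist, uint32_t hmask)
{
  uint64_t hslot = dist;
  return hslot <= hmask * (uint64_t)NODE_SIZE &&
         hslot <= 65535 * (uint64_t)NODE_SIZE;
}

/* Hypothetical distance: 4 GiB plus two nodes away from the node array. */
static inline void demo_truncation(void)
{
  uint64_t dist = UINT64_C(0x100000000) + 2 * NODE_SIZE;
  assert(old_check(dist, 65535u));   /* Truncated to 48, wrongly accepted. */
  assert(!new_check(dist, 65535u));  /* Full distance, rejected. */
}
```

With the 32-bit type only the low bits of the distance (48 here) reach the comparison, so a far-away nilnode looks like a valid slot of the table; the 64-bit type keeps the real distance and the condition fails.
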
>
> Maxim Kokryashkin:
> * added the description and the test for the problem
>
> Part of tarantool/tarantool#9145
> ---
> Branch: https://github.com/tarantool/luajit/tree/fckxorg/lj-840-fix-hrefk-optimization
> PR: https://github.com/tarantool/tarantool/pull/9591
> Issues: https://github.com/LuaJIT/LuaJIT/issues/840
> https://github.com/tarantool/tarantool/issues/9145
>
> src/lj_record.c | 8 +--
> .../lj-840-fix-hrefk-optimization.test.lua | 58 +++++++++++++++++++
> 2 files changed, 62 insertions(+), 4 deletions(-)
> create mode 100644 test/tarantool-tests/lj-840-fix-hrefk-optimization.test.lua
>
> diff --git a/src/lj_record.c b/src/lj_record.c
> index a929b8aa..919e7169 100644
> --- a/src/lj_record.c
> +++ b/src/lj_record.c
> @@ -1374,16 +1374,16 @@ static TRef rec_idx_key(jit_State *J, RecordIndex *ix, IRRef *rbref,
>      key = emitir(IRTN(IR_CONV), key, IRCONV_NUM_INT);
>    if (tref_isk(key)) {
>      /* Optimize lookup of constant hash keys. */
> -    MSize hslot = (MSize)((char *)ix->oldv - (char *)&noderef(t->node)[0].val);
> -    if (t->hmask > 0 && hslot <= t->hmask*(MSize)sizeof(Node) &&
> -        hslot <= 65535*(MSize)sizeof(Node)) {
> +    GCSize hslot = (GCSize)((char *)ix->oldv-(char *)&noderef(t->node)[0].val);
> +    if (hslot <= t->hmask*(GCSize)sizeof(Node) &&
> +        hslot <= 65535*(GCSize)sizeof(Node)) {
>        TRef node, kslot, hm;
>        *rbref = J->cur.nins;  /* Mark possible rollback point. */
>        *rbguard = J->guardemit;
>        hm = emitir(IRTI(IR_FLOAD), ix->tab, IRFL_TAB_HMASK);
>        emitir(IRTGI(IR_EQ), hm, lj_ir_kint(J, (int32_t)t->hmask));
>        node = emitir(IRT(IR_FLOAD, IRT_PGC), ix->tab, IRFL_TAB_NODE);
> -      kslot = lj_ir_kslot(J, key, hslot / sizeof(Node));
> +      kslot = lj_ir_kslot(J, key, (IRRef)(hslot / sizeof(Node)));
>        return emitir(IRTG(IR_HREFK, IRT_PGC), node, kslot);
>      }
>    }
> diff --git a/test/tarantool-tests/lj-840-fix-hrefk-optimization.test.lua b/test/tarantool-tests/lj-840-fix-hrefk-optimization.test.lua
> new file mode 100644
> index 00000000..a11b91e3
> --- /dev/null
> +++ b/test/tarantool-tests/lj-840-fix-hrefk-optimization.test.lua
> @@ -0,0 +1,58 @@
> +local tap = require('tap')
> +
> +-- Test file to demonstrate incorrect HREFK optimization
> +-- in LuaJIT.
> +
> +local ffi = require('ffi')
> +local test = tap.test('lj-840-fix-hrefk-optimization'):skipcond({
> +  ['Test requires GC64 mode enabled'] = not ffi.abi('gc64'),
> +  ['Test requires JIT enabled'] = not jit.status(),
> +})
> +test:plan(1)
> +
> +local table_new = require('table.new')
> +
> +-- Size of single hash node in bytes.
> +local NODE_SIZE = 24
> +-- Number of hash nodes to allocate on each iteration
> +-- based on the condition from `rec_idx_key`
> +local HASH_NODES = 65535
> +-- The vector of hash nodes should have a raw size of
> +-- `HASH_NODES * NODE_SIZE`, which is allocated in
> +-- `lj_alloc_malloc` directly with `mmap`. However,
> +-- the LuaJIT allocator adds a bunch of small paddings
> +-- and aligns the required size to LJ_PAGESIZE, which is
> +-- 4096, so the actual allocated size includes alignment.
> +local ALIGNMENT = 4096
> +-- The vector for hash nodes in the table is allocated based on
> +-- `hbits`, so it's actually got a size of 65536 nodes.
> +local SINGLE_ITERATION_ALLOC = (HASH_NODES + 1) * NODE_SIZE + ALIGNMENT + 72
> +-- We need to overflow the 32-bit distance to the global nilnode,
> +-- so we divide 2^32 by the SINGLE_ITERATION_ALLOC. There are a
> +-- bunch of non-table.new allocations already performed, so one
> +-- iteration is subtracted to account for them.
> +local N_ITERATIONS = 0x100000000 / SINGLE_ITERATION_ALLOC - 1
> +-- Prevent anchor table from interfering with target table allocations.
> +local anchor = table.new(N_ITERATIONS, 0)
> +
> +-- Construct table.
> +for _ = 1, N_ITERATIONS do
> +  table.insert(anchor, table_new(0, HASH_NODES))
> +end
> +
> +jit.opt.start('hotloop=1')
> +local function get_n(tab)
> +  local x
> +  for _ = 1, 4 do
> +    x = tab.n
> +  end
> +  return x
> +end
> +
> +-- Record the trace for the constructed table.
> +get_n(anchor[#anchor])
> +
> +-- Check the result for the table that has the required key.
> +local result = get_n({n=1})
> +test:is(result, 1, 'correct value retrieved')
> +test:done(true)
> --
> 2.43.0
>
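
For reference, the iteration count from the test comments can be checked by hand. The sketch below only redoes that arithmetic in C under the test's own constants. `EXTRA_PADDING` is a hypothetical name for the literal 72, and the helper functions are illustrative, not part of the patch. It assumes each `table_new(0, HASH_NODES)` call really costs `SINGLE_ITERATION_ALLOC` bytes, as the test comments state.

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from the test; EXTRA_PADDING names its literal 72. */
enum {
  NODE_SIZE = 24,
  HASH_NODES = 65535,
  ALIGNMENT = 4096,
  EXTRA_PADDING = 72
};

/* Mirrors SINGLE_ITERATION_ALLOC from the test. */
static inline uint64_t single_iteration_alloc(void)
{
  return (uint64_t)(HASH_NODES + 1) * NODE_SIZE + ALIGNMENT + EXTRA_PADDING;
}

/*
** Mirrors N_ITERATIONS. The Lua value is fractional, but a numeric
** `for _ = 1, n` loop runs floor(n) times, which matches the integer
** division used here.
*/
static inline uint64_t loop_iterations(void)
{
  return UINT64_C(0x100000000) / single_iteration_alloc() - 1;
}

/* Total bytes of hash parts allocated by the loop under the assumption. */
static inline uint64_t total_hash_bytes(void)
{
  return loop_iterations() * single_iteration_alloc();
}

static inline void demo_iterations(void)
{
  assert(single_iteration_alloc() == 1577032);
  assert(loop_iterations() == 2722);
  assert(total_hash_bytes() < UINT64_C(0x100000000));
}
```

Under these constants the loop creates 2722 tables, about 4.29 * 10^9 bytes of hash parts, which is slightly less than 2^32; the remaining gap is what the "one iteration is subtracted" comment attributes to earlier allocations.
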