Tarantool development patches archive
 help / color / mirror / Atom feed
From: sergos via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Sergey Kaplun <skaplun@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH luajit] LJ_GC64: Fix ir_khash for non-string GCobj.
Date: Wed, 14 Dec 2022 14:33:51 +0300	[thread overview]
Message-ID: <D715E1A3-8D95-47DF-84F7-853F8C23E0F3@tarantool.org> (raw)
In-Reply-To: <20221208054618.9104-1-skaplun@tarantool.org>

Hi!

Thanks for the patch!

Some addition to Max’s comments. And a question on the test.

Sergos

> On 8 Dec 2022, at 08:46, Sergey Kaplun <skaplun@tarantool.org> wrote:
> 
> From: Mike Pall <mike>
> 
> Contributed by Peter Cawley.
> 
> (cherry picked from commit b4ed3219a1a98dd9fe7d1e3eeea3b82f5a780948)
> 
> When emitting `IR_HREF` for constant value to lookup the `ir_khash()`
               an                           ^^^ 
                         perhaps just ‘for a constant value lokup’?

> function is used to calculate hash for the corresponding object.
> This calculation must be the same as in the corresponding `hashkey()`
> function from <lj_tab.c>.
> 
> Hash calculating via passing two arguments `lo`, and `hi` to `hashrot()`
                                                             the

> routine. For non-string GC objects the first `lo` argument is the same
> for GC64 and not GC64 mode -- lower 32 bits of the object address. For
> GC64 mode `hi` argument is upper 32 bits of the object address,
> including specific type NaN-tag. This `hi` argument in `ir_khash()`
           a

> function is miscalculated in GC64 using non-GC64 value (`lo` +
                                 mode    a

> `HASH_BIAS`). As a result, the hash for the GC object is miscalculated
> on trace and we exit from trace due to assertion guard on the type or
                          the          an
> value check.
> 
> This patch fixes calculation of hash value on trace for GC64 mode by
> making it consistent with `hashkey()`.
                          the
> 
> Sergey Kaplun:
> * added the description and the test for the problem
> 
> Part of tarantool/tarantool#7230
> ---
> 
> Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-356-ir-khash-non-string-obj-full-ci
> Issue/PR:
> * https://github.com/tarantool/tarantool/issues/7230
> * https://github.com/LuaJIT/LuaJIT/pull/356
> Tarantool PR: https://github.com/tarantool/tarantool/pull/8020
> 
> Side note: Problems with red fuzzer jobs look irrelevant to the patch.
> 
> src/lj_asm.c                                  |  4 +
> .../lj-356-ir-khash-non-string-obj.test.lua   | 90 +++++++++++++++++++
> 2 files changed, 94 insertions(+)
> create mode 100644 test/tarantool-tests/lj-356-ir-khash-non-string-obj.test.lua
> 
> diff --git a/src/lj_asm.c b/src/lj_asm.c
> index 1a7fb0c8..a154547b 100644
> --- a/src/lj_asm.c
> +++ b/src/lj_asm.c
> @@ -1016,7 +1016,11 @@ static uint32_t ir_khash(IRIns *ir)
>   } else {
>     lua_assert(irt_isgcv(ir->t));
>     lo = u32ptr(ir_kgc(ir));
> +#if LJ_GC64
> +    hi = (uint32_t)(u64ptr(ir_kgc(ir)) >> 32) | (irt_toitype(ir->t) << 15);
> +#else
>     hi = lo + HASH_BIAS;
> +#endif
>   }
>   return hashrot(lo, hi);
> }
> diff --git a/test/tarantool-tests/lj-356-ir-khash-non-string-obj.test.lua b/test/tarantool-tests/lj-356-ir-khash-non-string-obj.test.lua
> new file mode 100644
> index 00000000..fff0b1a5
> --- /dev/null
> +++ b/test/tarantool-tests/lj-356-ir-khash-non-string-obj.test.lua
> @@ -0,0 +1,90 @@
> +local tap = require('tap')
> +local traceinfo = require('jit.util').traceinfo
> +local table_new = require('table.new')
> +
> +-- Test file to demonstrate the incorrect GC64 JIT behaviour
> +-- for `IR_HREF` for on-trace-constant key lookup.
      of an           an
> +-- See also https://github.com/LuaJIT/LuaJIT/pull/356.
> +local test = tap.test('lj-356-ir-khash-non-string-obj')
> +local N_ITERATIONS = 4
> +
> +-- Amount of iteration for trace compilation and execution and
> +-- additional check, that there is no new trace compiled.
> +test:plan(N_ITERATIONS + 1)
> +
> +-- To reproduce the issue we need to compile a trace with
> +-- `IR_HREF`, with a lookup of constant hash key GC value. To
> +-- prevent `IR_HREFK` to be emitted instead, we need a table with
             an `IR_HREFK` emission

> +-- a huge hash part. Delta of address between the start of the
> +-- hash part of the table and the current node to lookup must be
> +-- more than `(1024 * 64 - 1) * sizeof(Node)`.
> +-- See <src/lj_record.c>, for details.
> +-- XXX: This constant is well suited to prevent test to be flaky,
> +-- because the aforementioned delta is always large enough.
> +local N_HASH_FIELDS = 1024 * 1024 * 8
> +local MAGIC = 42
> +
> +local filled_tab = table_new(0, N_HASH_FIELDS + 1)
> +
> +-- The function returns constant cdata pinned to `GCproto` to be
> +-- used as a key for table lookup.
> +local function get_const_cdata()
> +  return 0LL
> +end
> +
> +-- XXX: don't set `hotexit` to prevent compilation of trace after
> +-- exiting the main test cycle.
> +jit.opt.start('hotloop=1')
> +
> +-- Prevent `get_const_cdata()` become hot and be compiled before
> +-- the main test cycle.
> +jit.off()
> +
> +filled_tab[get_const_cdata()] = MAGIC
> +
> +-- Speed up table filling-up.
> +jit.on()
> +
> +-- Filling-up the table with GC values to minimize the amount of
> +-- hash collisions and increases delta between the start of the
> +-- hash part of the table and currently stored node.
> +for i = 1, N_HASH_FIELDS do
> +  filled_tab[1LL] = i
> +end
> +
> +-- Prevent JIT misbehaviour before the main test chunk.
> +jit.off()
> +
> +-- Allocate a table with exact array part to be sure that there
> +-- is no side exit from the trace, due to table reallocation.
> +local result_tab = table_new(N_ITERATIONS, 0)
> +
> +jit.flush()
> +
> +assert(not traceinfo(1), 'no traces compiled after flush')
> +
> +jit.on()
> +
> +for _ = 1, N_ITERATIONS do
> +  -- If the hash for table lookup is miscalculated, then we get
> +  -- `nil` (most possibly) value from the table and the side exit
> +  -- will be taken and we continue execution from the call to
> +  -- `get_const_cdata()`, this function is already hot after the
> +  -- first cycle iteration, and the new trace is recorded.
> +  table.insert(result_tab, filled_tab[get_const_cdata()])
> +end
> +
> +jit.off()
> +
> +test:ok(not traceinfo(2), 'the second trace should not be compiled')

That’s not quite clear to me: a second trace generation is a side-effect
of the incorrect hash calculation. Is it always leads to the trace
generation? 

> +
> +-- No more need to prevent trace compilation.
> +jit.on()
> +
> +for i = 1, N_ITERATIONS do
> +  -- Check that that all lookups are correct and there is no
> +  -- value from other cdata stored in the table.
> +  test:ok(result_tab[i] == MAGIC, 'correct hash lookup from the table')

And this one checks what then? The hash is calculated correctly, but the value
read from the `filled_tab` is incorrect - what can lead to this?

> +end
> +
> +os.exit(test:check() and 0 or 1)
> -- 
> 2.34.1
> 


  parent reply	other threads:[~2022-12-14 11:35 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-08  5:46 Sergey Kaplun via Tarantool-patches
2022-12-12 11:44 ` Maxim Kokryashkin via Tarantool-patches
2022-12-15 10:00   ` Sergey Kaplun via Tarantool-patches
2022-12-14 11:33 ` sergos via Tarantool-patches [this message]
2022-12-15 10:13   ` Sergey Kaplun via Tarantool-patches
2022-12-15 11:46     ` Maxim Kokryashkin via Tarantool-patches
2022-12-15 15:39     ` sergos via Tarantool-patches
2023-01-12 14:55 ` Igor Munkin via Tarantool-patches

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=D715E1A3-8D95-47DF-84F7-853F8C23E0F3@tarantool.org \
    --to=tarantool-patches@dev.tarantool.org \
    --cc=sergos@tarantool.org \
    --cc=skaplun@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH luajit] LJ_GC64: Fix ir_khash for non-string GCobj.' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox