[Tarantool-patches] [PATCH luajit] x64/LJ_GC64: Fix emit_rma().
Sergey Kaplun
skaplun at tarantool.org
Wed Mar 22 11:27:39 MSK 2023
From: Mike Pall <mike>
(cherry picked from commit 7e662e4f87134f1e84f7bea80933e033c5bf53a3)
The accessing of memory address for some operation `emit_rma()` may be
encoded in one of the following ways:
a. If the offset of the accessing address from the dispatch table
(pinned to r14 that is not changed while trace execution) fits into
32-bit, then encode this as an access to 32-bit displacement
relative to r14.
b. If the offset of the accessing address from the mcode (i.e. rip)
fits into 32-bit, then encode this as an access to 32-bit
displacement relative to rip (considering long mode specifics and
`RID_RIP` hack).
c. If the address doesn't fit into 32-bit one and we use `mov` or
`movsd`, then encode 64-bit load from this address.
d. Elsewhere, encode it as an access to 32-bit (the address should fit
into 32-bit one) displacement (the only option for non-GC64 mode).
So, each instruction in GC64 mode differs from `mov` or `movsd` should
be encoded via the last option. But if we got a 64-bit address with a
big enough offset it can't be encoded and the assertion in `ptr2addr()`
will fail.
There are several cases, when `emit_rma()` is used with non `mov`
instruction:
* `IR_LDEXP` with `fld` instruction for loading constant
number `TValue` by address.
* `IR_OBAR` with the corresponding `test` instruction on
`marked` field of `GCobj`.
All these instructions require an additional register to store value by
address. We can't truly allocate a register here due to possibility to
break IR assembling which depends on specific register usage. So, we use
and restore r14 here for emitting.
Also, this patch removes `movsd` from condition from the `x86Op` type
check, as far as it never uses for the `emit_rma()` routine (see also
`emit_loadk64()` for details).
Sergey Kaplun:
* added the description and the test for the problem
Part of tarantool/tarantool#8069
---
Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-noticket-fix-emit-rma
PR: https://github.com/tarantool/tarantool/pull/8477
Related issue: https://github.com/tarantool/tarantool/issues/8069
AFAICS, other places with `emit_rma()` usage are not related to the
patch as far as they take an offset for the address of JIT constants
stored in `jit_State`, so it always be near enough to dispatch.
Side note: you may check test-correctness of the last check with GC by
changing the corresponding condition check on `GC_WHITES` in asm_obar to
CC_NZ (like it will be treated for incorrect check). Be carefull, member
that instructions are emitted from bottom to top!
src/lj_emit_x86.h | 24 ++++-
test/tarantool-tests/fix-emit-rma.test.lua | 102 +++++++++++++++++++++
2 files changed, 123 insertions(+), 3 deletions(-)
create mode 100644 test/tarantool-tests/fix-emit-rma.test.lua
diff --git a/src/lj_emit_x86.h b/src/lj_emit_x86.h
index 6b58306b..b3dc4ea5 100644
--- a/src/lj_emit_x86.h
+++ b/src/lj_emit_x86.h
@@ -345,9 +345,27 @@ static void emit_rma(ASMState *as, x86Op xo, Reg rr, const void *addr)
emit_rmro(as, xo, rr, RID_DISPATCH, (int32_t)dispofs(as, addr));
} else if (checki32(mcpofs(as, addr)) && checki32(mctopofs(as, addr))) {
emit_rmro(as, xo, rr, RID_RIP, (int32_t)mcpofs(as, addr));
- } else if (!checki32((intptr_t)addr) && (xo == XO_MOV || xo == XO_MOVSD)) {
- emit_rmro(as, xo, rr, rr, 0);
- emit_loadu64(as, rr, (uintptr_t)addr);
+ } else if (!checki32((intptr_t)addr)) {
+ Reg ra = (rr & 15);
+ if (xo != XO_MOV) {
+ /* We can't allocate a register here. Use and restore DISPATCH. Ugly. */
+ uint64_t dispaddr = (uintptr_t)J2GG(as->J)->dispatch;
+ uint8_t i8 = xo == XO_GROUP3b ? *as->mcp++ : 0;
+ ra = RID_DISPATCH;
+ if (checku32(dispaddr)) {
+ emit_loadi(as, ra, (int32_t)dispaddr);
+ } else { /* Full-size 64 bit load. */
+ MCode *p = as->mcp;
+ *(uint64_t *)(p-8) = dispaddr;
+ p[-9] = (MCode)(XI_MOVri+(ra&7));
+ p[-10] = 0x48 + ((ra>>3)&1);
+ p -= 10;
+ as->mcp = p;
+ }
+ if (xo == XO_GROUP3b) emit_i8(as, i8);
+ }
+ emit_rmro(as, xo, rr, ra, 0);
+ emit_loadu64(as, ra, (uintptr_t)addr);
} else
#endif
{
diff --git a/test/tarantool-tests/fix-emit-rma.test.lua b/test/tarantool-tests/fix-emit-rma.test.lua
new file mode 100644
index 00000000..faddfe83
--- /dev/null
+++ b/test/tarantool-tests/fix-emit-rma.test.lua
@@ -0,0 +1,102 @@
+local tap = require('tap')
+local test = tap.test('fix-emit-rma'):skipcond({
+ ['Test requires JIT enabled'] = not jit.status(),
+ ['Test requires GC64 mode enabled'] = not require('ffi').abi('gc64'),
+})
+
+-- Need to test 2 cases of `emit_rma()` particulary on x64:
+-- * `IR_LDEXP` with `fld` instruction for loading constant
+-- number `TValue` by address.
+-- * `IR_OBAR` with the corresponding `test` instruction on
+-- `marked` field of `GCobj`.
+-- Also, test correctness.
+test:plan(4)
+
+local ffi = require('ffi')
+
+collectgarbage()
+-- Chomp memory in currently allocated GC space.
+collectgarbage('stop')
+
+for _ = 1, 8 do
+ ffi.new('char[?]', 256 * 1024 * 1024)
+end
+
+jit.opt.start('hotloop=1')
+
+-- Test `IR_LDEXP`.
+
+-- Reproducer here is a little tricky.
+-- We need to generate a bunch of traces as far we reference an
+-- IR field (`TValue`) address in `emit_rma()`. The amount of
+-- traces is empirical. Usually, assert fails on ~33d iteration,
+-- so let's use 100 just to be sure.
+local test_marker
+for _ = 1, 100 do
+ test_marker = loadstring([[
+ local test_marker
+ for i = 1, 4 do
+ -- Avoid fold optimization, use `i` as the second argument.
+ -- Need some constant differs from 1 or 0 as the first
+ -- argument.
+ test_marker = math.ldexp(1.2, i)
+ end
+ return test_marker
+ ]])()
+end
+
+-- If we here, it means no assertion failed during emitting.
+test:ok(true, 'IR_LDEXP emit_rma')
+test:ok(test_marker == math.ldexp(1.2, 4), 'IR_LDEXP emit_rma check result')
+
+-- Test `IR_OBAR`.
+
+-- First, create a closed upvalue.
+do
+ local uv -- luacheck: no unused
+ -- `IR_OBAR` is used for object write barrier on upvalues.
+ _G.change_uv = function(newv)
+ uv = newv
+ end
+end
+
+-- We need a constant value on trace to be referenced far enough
+-- from dispatch table. So we need to create a new function
+-- prototype with a constant string.
+-- This string should be long enough to be allocated with direct
+-- alloc far away from dispatch.
+local DEFAULT_MMAP_THRESHOLD = 128 * 1024
+local str = string.rep('x', DEFAULT_MMAP_THRESHOLD)
+local func_with_trace = loadstring([[
+ for _ = 1, 4 do
+ change_uv(']] .. str .. [[')
+ end
+]])
+func_with_trace()
+
+-- If we here, it means no assertion failed during emitting.
+test:ok(true, 'IR_OBAR emit_rma')
+
+-- Now check the correctness.
+
+-- Set GC state to GCpause.
+collectgarbage()
+
+-- We want to wait for the situation, when upvalue is black,
+-- the string is gray. Both conditions are satisfied, when the
+-- corresponding `change_uv()` function is marked, for example.
+-- We don't know on what exactly step our upvalue is marked as
+-- black and execution of trace become dangerous, so just check it
+-- at each step.
+-- Don't need to do the full GC cycle step by step.
+local old_steps_atomic = misc.getmetrics().gc_steps_atomic
+while (misc.getmetrics().gc_steps_atomic == old_steps_atomic) do
+ collectgarbage('step')
+ func_with_trace()
+end
+
+-- If we here, it means no assertion failed during `gc_mark()`,
+-- due to wrong call to `lj_gc_barrieruv()` on trace.
+test:ok(true, 'IR_OBAR emit_rma check correctness')
+
+os.exit(test:check() and 0 or 1)
--
2.34.1
More information about the Tarantool-patches
mailing list