From mboxrd@z Thu Jan  1 00:00:00 1970
To: Sergey Ostanevich, Maxim Kokryashkin
Date: Wed, 22 Mar 2023 11:27:39 +0300
Message-Id: <20230322082739.25391-1-skaplun@tarantool.org>
X-Mailer: git-send-email 2.34.1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [Tarantool-patches] [PATCH luajit] x64/LJ_GC64: Fix emit_rma().
List-Id: Tarantool development patches
From: Sergey Kaplun via Tarantool-patches
Reply-To: Sergey Kaplun
Cc: tarantool-patches@dev.tarantool.org

From: Mike Pall

(cherry picked from commit 7e662e4f87134f1e84f7bea80933e033c5bf53a3)

A memory access emitted by `emit_rma()` may be encoded in one of the
following ways:
a. If the offset of the accessed address from the dispatch table
   (pinned to r14, which does not change during trace execution) fits
   into 32 bits, then encode it as an access with a 32-bit
   displacement relative to r14.
b. If the offset of the accessed address from the mcode (i.e. rip)
   fits into 32 bits, then encode it as an access with a 32-bit
   displacement relative to rip (considering long mode specifics and
   the `RID_RIP` hack).
c. If the address doesn't fit into 32 bits and we use `mov` or
   `movsd`, then encode a 64-bit load from this address.
d. Otherwise, encode it as an access with a 32-bit displacement (the
   address should fit into 32 bits; this is the only option for
   non-GC64 mode).

So, in GC64 mode every instruction other than `mov` or `movsd` has to
be encoded via the last option. But if we get a 64-bit address with a
big enough offset, it can't be encoded this way, and the assertion in
`ptr2addr()` fails.

There are several cases when `emit_rma()` is used with a non-`mov`
instruction:
* `IR_LDEXP` with the `fld` instruction for loading a constant number
  `TValue` by address.
* `IR_OBAR` with the corresponding `test` instruction on the `marked`
  field of a `GCobj`.

All these instructions require an additional register to hold the
address. We can't really allocate a register here, since that may
break IR assembling, which depends on specific register usage. So, we
use and restore r14 here for emitting.

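The order in which these encodings are tried can be sketched in plain C. This is only an illustration of the decision as it looks with the patch applied, not the LuaJIT code: the enum, `fits_i32()` and `choose_rma_encoding()` are hypothetical names, and addresses are passed as plain integers instead of being read from `ASMState`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical labels for the four encodings listed above. */
typedef enum {
  ENC_DISPATCH_REL,  /* (a) 32-bit displacement relative to r14. */
  ENC_RIP_REL,       /* (b) 32-bit displacement relative to rip. */
  ENC_LOAD_ADDR64,   /* (c) load the 64-bit address into a register first. */
  ENC_ABS32          /* (d) absolute 32-bit displacement. */
} RmaEncoding;

/* Does the value fit into a signed 32-bit displacement? */
static int fits_i32(int64_t x)
{
  return x >= INT32_MIN && x <= INT32_MAX;
}

/*
** Pick an encoding for an access to `addr`.
** `dispatch` stands for the dispatch table address held in r14,
** `mcp` and `mctop` for the current and the top machine code positions.
** All parameters are assumptions made for this sketch.
*/
static RmaEncoding choose_rma_encoding(uint64_t addr, uint64_t dispatch,
                                       uint64_t mcp, uint64_t mctop)
{
  assert(mcp <= mctop);  /* Machine code is emitted from top to bottom. */
  if (fits_i32((int64_t)(addr - dispatch)))
    return ENC_DISPATCH_REL;
  if (fits_i32((int64_t)(addr - mcp)) && fits_i32((int64_t)(addr - mctop)))
    return ENC_RIP_REL;
  if (!fits_i32((int64_t)addr))
    return ENC_LOAD_ADDR64;
  return ENC_ABS32;
}
```

With the patch applied, case (c) is no longer limited to `mov`: other instructions borrow r14 as the base register and reload the dispatch address afterwards.
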
Also, this patch removes `movsd` from the `x86Op` type check, since it
is never used with the `emit_rma()` routine (see also
`emit_loadk64()` for details).

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#8069
---
Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-noticket-fix-emit-rma
PR: https://github.com/tarantool/tarantool/pull/8477
Related issue: https://github.com/tarantool/tarantool/issues/8069

AFAICS, other places where `emit_rma()` is used are not related to the
patch, since they take an offset for the address of JIT constants
stored in `jit_State`, so it is always near enough to the dispatch
table.

Side note: you may verify that the last test really checks GC
correctness by changing the corresponding condition check on
`GC_WHITES` in `asm_obar()` to `CC_NZ` (this is how an incorrect check
would behave). Be careful, remember that instructions are emitted
from bottom to top!

 src/lj_emit_x86.h                          |  24 ++++-
 test/tarantool-tests/fix-emit-rma.test.lua | 102 +++++++++++++++++++++
 2 files changed, 123 insertions(+), 3 deletions(-)
 create mode 100644 test/tarantool-tests/fix-emit-rma.test.lua

diff --git a/src/lj_emit_x86.h b/src/lj_emit_x86.h
index 6b58306b..b3dc4ea5 100644
--- a/src/lj_emit_x86.h
+++ b/src/lj_emit_x86.h
@@ -345,9 +345,27 @@ static void emit_rma(ASMState *as, x86Op xo, Reg rr, const void *addr)
     emit_rmro(as, xo, rr, RID_DISPATCH, (int32_t)dispofs(as, addr));
   } else if (checki32(mcpofs(as, addr)) && checki32(mctopofs(as, addr))) {
     emit_rmro(as, xo, rr, RID_RIP, (int32_t)mcpofs(as, addr));
-  } else if (!checki32((intptr_t)addr) && (xo == XO_MOV || xo == XO_MOVSD)) {
-    emit_rmro(as, xo, rr, rr, 0);
-    emit_loadu64(as, rr, (uintptr_t)addr);
+  } else if (!checki32((intptr_t)addr)) {
+    Reg ra = (rr & 15);
+    if (xo != XO_MOV) {
+      /* We can't allocate a register here. Use and restore DISPATCH. Ugly. */
+      uint64_t dispaddr = (uintptr_t)J2GG(as->J)->dispatch;
+      uint8_t i8 = xo == XO_GROUP3b ? *as->mcp++ : 0;
+      ra = RID_DISPATCH;
+      if (checku32(dispaddr)) {
+        emit_loadi(as, ra, (int32_t)dispaddr);
+      } else {  /* Full-size 64 bit load. */
+        MCode *p = as->mcp;
+        *(uint64_t *)(p-8) = dispaddr;
+        p[-9] = (MCode)(XI_MOVri+(ra&7));
+        p[-10] = 0x48 + ((ra>>3)&1);
+        p -= 10;
+        as->mcp = p;
+      }
+      if (xo == XO_GROUP3b) emit_i8(as, i8);
+    }
+    emit_rmro(as, xo, rr, ra, 0);
+    emit_loadu64(as, ra, (uintptr_t)addr);
   } else
 #endif
   {
diff --git a/test/tarantool-tests/fix-emit-rma.test.lua b/test/tarantool-tests/fix-emit-rma.test.lua
new file mode 100644
index 00000000..faddfe83
--- /dev/null
+++ b/test/tarantool-tests/fix-emit-rma.test.lua
@@ -0,0 +1,102 @@
+local tap = require('tap')
+local test = tap.test('fix-emit-rma'):skipcond({
+  ['Test requires JIT enabled'] = not jit.status(),
+  ['Test requires GC64 mode enabled'] = not require('ffi').abi('gc64'),
+})
+
+-- Need to test 2 cases of `emit_rma()` particularly on x64:
+-- * `IR_LDEXP` with `fld` instruction for loading constant
+--   number `TValue` by address.
+-- * `IR_OBAR` with the corresponding `test` instruction on
+--   `marked` field of `GCobj`.
+-- Also, test correctness.
+test:plan(4)
+
+local ffi = require('ffi')
+
+collectgarbage()
+-- Chomp memory in currently allocated GC space.
+collectgarbage('stop')
+
+for _ = 1, 8 do
+  ffi.new('char[?]', 256 * 1024 * 1024)
+end
+
+jit.opt.start('hotloop=1')
+
+-- Test `IR_LDEXP`.
+
+-- Reproducer here is a little tricky.
+-- We need to generate a bunch of traces as far as we reference an
+-- IR field (`TValue`) address in `emit_rma()`. The amount of
+-- traces is empirical. Usually, assert fails on ~33rd iteration,
+-- so let's use 100 just to be sure.
+local test_marker
+for _ = 1, 100 do
+  test_marker = loadstring([[
+    local test_marker
+    for i = 1, 4 do
+      -- Avoid fold optimization, use `i` as the second argument.
+      -- Need some constant that differs from 1 or 0 as the first
+      -- argument.
+      test_marker = math.ldexp(1.2, i)
+    end
+    return test_marker
+  ]])()
+end
+
+-- If we are here, it means no assertion failed during emitting.
+test:ok(true, 'IR_LDEXP emit_rma')
+test:ok(test_marker == math.ldexp(1.2, 4), 'IR_LDEXP emit_rma check result')
+
+-- Test `IR_OBAR`.
+
+-- First, create a closed upvalue.
+do
+  local uv -- luacheck: no unused
+  -- `IR_OBAR` is used for object write barrier on upvalues.
+  _G.change_uv = function(newv)
+    uv = newv
+  end
+end
+
+-- We need a constant value on trace to be referenced far enough
+-- from dispatch table. So we need to create a new function
+-- prototype with a constant string.
+-- This string should be long enough to be allocated with direct
+-- alloc far away from dispatch.
+local DEFAULT_MMAP_THRESHOLD = 128 * 1024
+local str = string.rep('x', DEFAULT_MMAP_THRESHOLD)
+local func_with_trace = loadstring([[
+  for _ = 1, 4 do
+    change_uv(']] .. str .. [[')
+  end
+]])
+func_with_trace()
+
+-- If we are here, it means no assertion failed during emitting.
+test:ok(true, 'IR_OBAR emit_rma')
+
+-- Now check the correctness.
+
+-- Set GC state to GCpause.
+collectgarbage()
+
+-- We want to wait for the situation, when upvalue is black,
+-- the string is gray. Both conditions are satisfied, when the
+-- corresponding `change_uv()` function is marked, for example.
+-- We don't know on what exactly step our upvalue is marked as
+-- black and execution of trace becomes dangerous, so just check it
+-- at each step.
+-- Don't need to do the full GC cycle step by step.
+local old_steps_atomic = misc.getmetrics().gc_steps_atomic
+while (misc.getmetrics().gc_steps_atomic == old_steps_atomic) do
+  collectgarbage('step')
+  func_with_trace()
+end
+
+-- If we are here, it means no assertion failed during `gc_mark()`,
+-- due to wrong call to `lj_gc_barrieruv()` on trace.
+test:ok(true, 'IR_OBAR emit_rma check correctness')
+
+os.exit(test:check() and 0 or 1)
-- 
2.34.1
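
The "full-size 64 bit load" branch of the patch writes a `mov r64, imm64` instruction byte by byte, below the current machine code position. A minimal standalone sketch of that encoding follows. The function name and the buffer handling are assumptions made for illustration; the opcode value 0xb8 is the standard x86 `mov r, imm` opcode, which `XI_MOVri` is assumed to denote here.

```c
#include <assert.h>
#include <stdint.h>

/*
** Write `mov r64, imm64` so that it ends right before `p` and return
** the new (lower) position, mirroring the backwards emission.
** `reg` is the hardware register number (0..15), e.g. 14 for r14.
*/
static uint8_t *emit_mov_r64_imm64(uint8_t *p, unsigned reg, uint64_t imm)
{
  int i;
  assert(reg < 16);
  /* imm64 occupies the last 8 bytes, least significant byte first. */
  for (i = 0; i < 8; i++)
    p[i - 8] = (uint8_t)(imm >> (8 * i));
  /* Opcode B8+rd: the low 3 bits of the register go into the opcode. */
  p[-9] = (uint8_t)(0xb8 + (reg & 7));
  /* REX.W (0x48) selects the 64-bit operand; REX.B extends to r8-r15. */
  p[-10] = (uint8_t)(0x48 + ((reg >> 3) & 1));
  return p - 10;
}
```

Since instructions are emitted from bottom to top, the load that restores r14 is emitted first and therefore executes last, after the instruction that borrowed r14 as its base register.
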