From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 16 Aug 2023 12:01:08 +0300
From: Maxim Kokryashkin via Tarantool-patches
Reply-To: Maxim Kokryashkin
To: Sergey Kaplun
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH luajit 17/19] MIPS64: Fix register allocation in assembly of HREF.
References: <37c2435a3529beb36c2e428f9c8e8b5c007c68e7.1691592488.git.skaplun@tarantool.org>
In-Reply-To: <37c2435a3529beb36c2e428f9c8e8b5c007c68e7.1691592488.git.skaplun@tarantool.org>
List-Id: Tarantool development patches
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii

Hi, Sergey!
Thanks for the patch!
Please consider my comments below.

On Wed, Aug 09, 2023 at 06:36:06PM +0300, Sergey Kaplun via Tarantool-patches wrote:
> From: Mike Pall
>
> Contributed by James Cowgill.
>
> (cherry-picked from commit 99cdfbf6a1e8856f64908072ef10443a7eab14f2)
>
> The issue is observed for the following merged IRs:
> | p64 HREF 0001 "a" ; or other keys
> | > p64 EQ 0002 [0x4002d0c528] ; nilnode
> Sometimes, when we need to rematerialize a constant during evicting of
Typo: s/during evicting/during the eviction/
> the register. So, the instruction related to constant rematerialization
Sometimes what happens? The sentence looks kind of chopped.
> is placed in the delay branch slot, which suppose to contain the loads
Typo: s/which suppose/which is supposed/
> of trace exit number to the `$ra` register. The resulting assembly is
Typo: s/number/numbers/ (because of `loads` being in the plural form)
> the following (for example):
> | beq ra, r1, 0x400abee9b0 ->exit
> | lui r1, 65531 ; delay slot without setting of the `ra`
> This leading to the assertion failure during trace exit in
Typo: s/leading/leads/
> `lj_trace_exit()`, since a trace number is incorrect.
>
> This patch moves the constant register allocations above the main
> instruction emitting code in `asm_href()`.
AFAICS, it is not just moved: the register allocation logic has changed too.
Before the patch, there were a few cases of in-place emissions, which
disappeared after the patch. I believe it is important to mention that too,
along with a more detailed description of the logic changes.
>
> Sergey Kaplun:
> * added the description and the test for the problem
>
> Part of tarantool/tarantool#8825
> ---
>  src/lj_asm_mips.h | 42 +++++---
>  ...-mips64-href-delay-slot-side-exit.test.lua | 101 ++++++++++++++++++
>  2 files changed, 126 insertions(+), 17 deletions(-)
>  create mode 100644 test/tarantool-tests/lj-362-mips64-href-delay-slot-side-exit.test.lua
>
> diff --git a/src/lj_asm_mips.h b/src/lj_asm_mips.h
> index c27d8413..23ffc3aa 100644
> --- a/src/lj_asm_mips.h
> +++ b/src/lj_asm_mips.h
> @@ -859,6 +859,9 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>    Reg dest = ra_dest(as, ir, allow);
>    Reg tab = ra_alloc1(as, ir->op1, rset_clear(allow, dest));
>    Reg key = RID_NONE, type = RID_NONE, tmpnum = RID_NONE, tmp1 = RID_TMP, tmp2;
> +#if LJ_64
> +  Reg cmp64 = RID_NONE;
> +#endif
>    IRRef refkey = ir->op2;
>    IRIns *irkey = IR(refkey);
>    int isk = irref_isk(refkey);
> @@ -901,6 +904,26 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>  #endif
>    tmp2 = ra_scratch(as, allow);
>    rset_clear(allow, tmp2);
> +#if LJ_64
> +  if (LJ_SOFTFP ||
!irt_isnum(kt)) {
> +    /* Allocate cmp64 register used for 64-bit comparisons */
> +    if (LJ_SOFTFP && irt_isnum(kt)) {
> +      cmp64 = key;
> +    } else if (!isk && irt_isaddr(kt)) {
> +      cmp64 = tmp2;
> +    } else {
> +      int64_t k;
> +      if (isk && irt_isaddr(kt)) {
> +        k = ((int64_t)irt_toitype(irkey->t) << 47) | irkey[1].tv.u64;
> +      } else {
> +        lua_assert(irt_ispri(kt) && !irt_isnil(kt));
> +        k = ~((int64_t)~irt_toitype(ir->t) << 47);
> +      }
> +      cmp64 = ra_allock(as, k, allow);
> +      rset_clear(allow, cmp64);
> +    }
> +  }
> +#endif
>
>    /* Key not found in chain: jump to exit (if merged) or load niltv. */
>    l_end = emit_label(as);
> @@ -943,24 +966,9 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>      emit_dta(as, MIPSI_DSRA32, tmp1, tmp1, 15);
>      emit_tg(as, MIPSI_DMTC1, tmp1, tmpnum);
>      emit_tsi(as, MIPSI_LD, tmp1, dest, (int32_t)offsetof(Node, key.u64));
> -  } else if (LJ_SOFTFP && irt_isnum(kt)) {
> -    emit_branch(as, MIPSI_BEQ, tmp1, key, l_end);
> -    emit_tsi(as, MIPSI_LD, tmp1, dest, (int32_t)offsetof(Node, key.u64));
> -  } else if (irt_isaddr(kt)) {
> -    Reg refk = tmp2;
> -    if (isk) {
> -      int64_t k = ((int64_t)irt_toitype(irkey->t) << 47) | irkey[1].tv.u64;
> -      refk = ra_allock(as, k, allow);
> -      rset_clear(allow, refk);
> -    }
> -    emit_branch(as, MIPSI_BEQ, tmp1, refk, l_end);
> -    emit_tsi(as, MIPSI_LD, tmp1, dest, offsetof(Node, key));
>    } else {
> -    Reg pri = ra_allock(as, ~((int64_t)~irt_toitype(ir->t) << 47), allow);
> -    rset_clear(allow, pri);
> -    lua_assert(irt_ispri(kt) && !irt_isnil(kt));
> -    emit_branch(as, MIPSI_BEQ, tmp1, pri, l_end);
> -    emit_tsi(as, MIPSI_LD, tmp1, dest, offsetof(Node, key));
> +    emit_branch(as, MIPSI_BEQ, tmp1, cmp64, l_end);
> +    emit_tsi(as, MIPSI_LD, tmp1, dest, (int32_t)offsetof(Node, key.u64));
>    }
>    *l_loop = MIPSI_BNE | MIPSF_S(tmp1) | ((as->mcp-l_loop-1) & 0xffffu);
>    if (!isk && irt_isaddr(kt)) {
> diff --git a/test/tarantool-tests/lj-362-mips64-href-delay-slot-side-exit.test.lua
b/test/tarantool-tests/lj-362-mips64-href-delay-slot-side-exit.test.lua
> new file mode 100644
> index 00000000..8c75e69c
> --- /dev/null
> +++ b/test/tarantool-tests/lj-362-mips64-href-delay-slot-side-exit.test.lua
> @@ -0,0 +1,101 @@
> +local tap = require('tap')
> +-- Test file to demonstrate the incorrect JIT behaviour for HREF
> +-- IR compilation on mips64.
> +-- See also https://github.com/LuaJIT/LuaJIT/pull/362.
> +local test = tap.test('lj-362-mips64-href-delay-slot-side-exit'):skipcond({
> +  ['Test requires JIT enabled'] = not jit.status(),
> +})
> +
> +test:plan(1)
> +
> +-- To reproduce the issue we need to compile a trace with
> +-- `IR_HREF`, with a lookup of constant hash key GC value. To
Typo: s/constant/a constant/
> +-- prevent an `IR_HREFK` to be emitted instead, we need a table
Typo: s/to be/from being/
> +-- with a huge hash part. Delta of address between the start of
Typo: s/Delta/The delta/
> +-- the hash part of the table and the current node to lookup must
> +-- be more than `(1024 * 64 - 1) * sizeof(Node)`.
Typo: s/more/greater/
> +-- See , for details.
> +-- XXX: This constant is well suited to prevent test to be flaky,
Typo: s/to be/from being/
> +-- because the aforementioned delta is always large enough.
> +-- Also, this constant avoids table rehashing, when inserting new
> +-- keys.
> +local N_HASH_FIELDS = 2 ^ 16 + 2 ^ 15
> +
> +-- XXX: don't set `hotexit` to prevent compilation of trace after
> +-- exiting the main test cycle.
I suggest rephrasing it the following way:
| The `hotexit` option is not set to prevent the compilation of traces
| after the emission of the main test cycle.
> +jit.opt.start('hotloop=1')
> +
> +-- Don't use `table.new()`, here by intence -- this leads to the
Typo: s/Don't use `table.new()`, here by intence/`table.new()` is not used here by intention/
> +-- allocation failure for the mcode memory, so traces are not
> +-- compiled.
> +local filled_tab = {}
> +-- Filling-up the table with GC values to minimize the amount of
Typo: s/Filling-up/Fill up/
> +-- hash collisions and increase delta between the start of the
Typo: s/delta/the delta/
> +-- hash part of the table and currently stored node.
Typo: s/currently/the currently/
> +for _ = 1, N_HASH_FIELDS do
> +  filled_tab[1LL] = 1
> +end
> +
> +-- luacheck: no unused
> +local tab_value_a
> +local tab_value_b
> +local tab_value_c
> +local tab_value_d
> +local tab_value_e
> +local tab_value_f
> +local tab_value_g
> +local tab_value_h
> +local tab_value_i
> +
> +-- The function for this trace has a bunch of the following IRs:
> +-- p64 HREF 0001 "a" ; or other keys
> +-- > p64 EQ 0002 [0x4002d0c528] ; nilnode
> +-- Sometimes, when we need to rematerialize a constant during
> +-- evicting of the register. So, the instruction related to
Typo: s/evicting/the eviction/
Again, sometimes what happens?
> +-- constant rematerialization is placed in the delay branch slot,
> +-- which suppose to contain the loads of trace exit number to the
Typo: s/which suppose/which is supposed/
Typo: s/number/numbers/
> +-- `$ra` register. This leading to the assertion failure during
Typo: s/leading/leads/
> +-- trace exit in `lj_trace_exit()`, since a trace number is
> +-- incorrect. The amount of the side exit to check is empirical
Typo: s/exit/exits/
> +-- (even a little bit more, than necessary just in case).
Typo: s/more/greater/
> +local function href_const(tab)
> +  tab_value_a = tab.a
> +  tab_value_b = tab.b
> +  tab_value_c = tab.c
> +  tab_value_d = tab.d
> +  tab_value_e = tab.e
> +  tab_value_f = tab.f
> +  tab_value_g = tab.g
> +  tab_value_h = tab.h
> +  tab_value_i = tab.i
> +end
> +
> +-- Compile main trace first.
Typo: s/main/the main/
> +href_const(filled_tab)
> +href_const(filled_tab)
> +
> +-- Now brute-force side exits to check that they are compiled
> +-- correct.
Take side exits in the reverse order to take a new
Typo: s/correct/correctly/
Typo: s/the reverse/reverse/
> +-- side exit each time.
> +filled_tab.i = 'i'
> +href_const(filled_tab)
> +filled_tab.h = 'h'
> +href_const(filled_tab)
> +filled_tab.g = 'g'
> +href_const(filled_tab)
> +filled_tab.f = 'f'
> +href_const(filled_tab)
> +filled_tab.e = 'e'
> +href_const(filled_tab)
> +filled_tab.d = 'd'
> +href_const(filled_tab)
> +filled_tab.c = 'c'
> +href_const(filled_tab)
> +filled_tab.b = 'b'
> +href_const(filled_tab)
> +filled_tab.a = 'a'
> +href_const(filled_tab)
> +
> +test:ok(true, 'no assertion failures during trace exits')
> +
> +test:done(true)
> --
> 2.41.0
>
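
As a side note on the `cmp64` hunk quoted above, here is a minimal standalone
sketch of how the two 64-bit comparison constants in the diff appear to be
formed. It is not LuaJIT code: `ITYPE_FALSE`, `ITYPE_STR`, `tag_addr()`,
`tag_pri()` and `tag_selfcheck()` are hypothetical names, the tag values are
assumed to follow the `~n` convention, and unsigned arithmetic is used instead
of the `int64_t` casts from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-ins for internal type tags (~1 and ~4); the real
** definitions live in the LuaJIT headers and are not copied here. */
#define ITYPE_FALSE (~(uint32_t)1)
#define ITYPE_STR   (~(uint32_t)4)

/* Constant GC key: the low 17 bits of the tag land in bits 47..63 and
** the address fills the bits below. Mirrors the expression
** ((int64_t)itype << 47) | u64 from the quoted diff. */
uint64_t tag_addr(uint32_t itype, uint64_t addr)
{
  return ((uint64_t)itype << 47) | addr;
}

/* Primitive key: tag in the upper bits, all ones below. Mirrors the
** expression ~((int64_t)~itype << 47) from the quoted diff. */
uint64_t tag_pri(uint32_t itype)
{
  return ~((uint64_t)(uint32_t)~itype << 47);
}

/* Both constants keep the low 17 bits of the tag in bits 47..63. */
void tag_selfcheck(void)
{
  assert((tag_addr(ITYPE_STR, 0) >> 47) == (ITYPE_STR & 0x1ffffu));
  assert((tag_pri(ITYPE_FALSE) >> 47) == (ITYPE_FALSE & 0x1ffffu));
}
```

Either value is a plain 64-bit constant, so, per the quoted commit message, the
patch can allocate its register ahead of the main instruction emitting code in
`asm_href()` rather than next to the branch.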