Date: Wed, 16 Aug 2023 12:16:51 +0300
From: Maxim Kokryashkin via Tarantool-patches
Reply-To: Maxim Kokryashkin
To: Sergey Kaplun
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH luajit 19/19] MIPS: Add MIPS64 R6 port.
In-Reply-To: <876736e650dcb70ce32a272f92cc3ba034a4dd3b.1691592488.git.skaplun@tarantool.org>

Hi, Sergey!
Thanks for the patch!
LGTM, except for a few nits regarding the commit message.

On Wed, Aug 09, 2023 at 06:36:08PM +0300, Sergey Kaplun via Tarantool-patches wrote:
> From: Mike Pall
>
> Contributed by Hua Zhang, YunQiang Su from Wave Computing,
> and Radovan Birdic from RT-RK.
> Sponsored by Wave Computing.
>
> (cherry-picked from commit 94d0b53004a5fa368defa4307a17edcdb87fe727)
>
> This patch adds support for MIPS Release 6 [1] for the 64-bit build.
> This includes:
> * Global `_map_def` value is set with . `MIPSR6` key
>   specifies the corresponding instruction set support. Also, `MIPSR6` is
>   defined in `DYNASM_FLAGS` (`DASM_AFLAGS`).
> * New instructions are added within , they are
>   used if the aforementioned key is set.
> * Obsolete instructions (that are no more in use in r6) are used in the
Typo: s/no more/no longer/
>   opposite case (if `MIPSR6` isn't set).
> * New opcode maps are added into .
Typo: s/into/to/
> * `map_arch` table in is refactored for more convenient
>   usage. Now each arch key contains a table with the corresponding info
>   about supported architecture:
Typo: s/about/about the/
>   - `e`: endianess; "le" or "be"
>   - `b`: bit-width of the supported architecture; 32 or 64
>   - `m`: machine specification (see `e_machine` in man elf)
>   - `f`: processor-specific flags (see `e_flags` in man elf)
>   - `p`: number that identifies the type of target machine [2] for
>     Portable Executable format [3].
> * New `LJ_TARGET_MIPSR6` define is set for MIPSR6 in .
> * The corresponding "MIPS32R6", "MIPS64R6" CPU strings are added to the
>
> * MIPSR6 instructions are added to the , some
>   obsolete instructions are removed or defined only for the non-MIPSR6
>   build.
> * All release-dependent instructions in are
>   instrumented with `LJ_TARGET_MIPSR6` macro.
> * `f20`, `f21`, `f22` FP registers are defined as `FTMP0`, `FTMP1`,
>   `FTMP2` correspondingly in the VM.
> * All release-dependent instructions in are
>   instrumented with `MIPSR6` macro.
> * `sfmin_max` macro now takes the third operand for the MIPSR6 build.
> * Fix implicit fallthrough warning for `LJ_SOFTFP && !LJ_NEED_FP64`
Typo: s/Fix/Fix the/
>   build in .
>
> Note, that 32-bit r6 targets still unsupported, because it is difficult
Typo: s/targets/targets are/
> and most available r6 CPUs are 64 bit.
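To illustrate the "new instructions if the key is set, obsolete ones otherwise" bullets, here is a minimal C sketch of how one mnemonic maps to two encodings. The two base opcodes are taken from the `mul_3` templates in the quoted dasm_mips.lua hunk ("00000098DST" for r6, "70000002DST" otherwise). The names `sketch_rtype` and `sketch_mul` are hypothetical and exist only for this example, and the runtime `is_r6` flag is an assumption made for brevity: the patch decides this at build time via `MIPSR6` / `LJ_TARGET_MIPSR6`.

```c
#include <assert.h>
#include <stdint.h>

/* Base opcodes copied from the mul_3 templates in the quoted diff. */
#define SKETCH_MUL_R6     0x00000098u  /* "00000098DST", MIPSR6 build */
#define SKETCH_MUL_LEGACY 0x70000002u  /* "70000002DST", non-MIPSR6 build */

/* Hypothetical helper: place rd (D), rs (S) and rt (T) into the usual
** MIPS R-type fields (rs at bit 21, rt at bit 16, rd at bit 11). */
static uint32_t sketch_rtype(uint32_t base, unsigned rd, unsigned rs,
                             unsigned rt)
{
  assert(rd < 32 && rs < 32 && rt < 32);  /* 5-bit register numbers. */
  return base | ((uint32_t)rs << 21) | ((uint32_t)rt << 16) |
         ((uint32_t)rd << 11);
}

/* Hypothetical helper: encode "mul rd, rs, rt" for either release.
** is_r6 stands in for the build-time MIPSR6 switch of the patch. */
static uint32_t sketch_mul(int is_r6, unsigned rd, unsigned rs, unsigned rt)
{
  return sketch_rtype(is_r6 ? SKETCH_MUL_R6 : SKETCH_MUL_LEGACY, rd, rs, rt);
}
```

For example, `sketch_mul(0, 2, 4, 5)` and `sketch_mul(1, 2, 4, 5)` encode the same `mul $2, $4, $5`, but only the operand fields of the two words match; the opcode bits differ, which is why the templates have to be selected per release.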
>
> [1]: https://www.mips.com/products/architectures/mips64/
> [2]: https://learn.microsoft.com/en-us/windows/win32/debug/pe-format#machine-types
> [3]: https://learn.microsoft.com/en-us/windows/win32/debug/pe-format
>
> Sergey Kaplun:
> * added the description for the feature
>
> Part of tarantool/tarantool#8825
> ---
>  cmake/SetDynASMFlags.cmake |   5 +
>  dynasm/dasm_mips.h         |  13 +-
>  dynasm/dasm_mips.lua       | 625 +++++++++++++++++++++++--------------
>  dynasm/dynasm.lua          |   1 +
>  src/Makefile.original      |   3 +
>  src/jit/bcsave.lua         |  84 ++---
>  src/jit/dis_mips.lua       | 293 +++++++++++++++--
>  src/jit/dis_mips64r6.lua   |  17 +
>  src/jit/dis_mips64r6el.lua |  17 +
>  src/lj_arch.h              |  29 +-
>  src/lj_asm.c               |   2 +-
>  src/lj_asm_mips.h          | 114 ++++++-
>  src/lj_emit_mips.h         |  15 +-
>  src/lj_jit.h               |   8 +
>  src/lj_target_mips.h       |  52 ++-
>  src/vm_mips64.dasc         | 370 ++++++++++++++++++++--
>  16 files changed, 1301 insertions(+), 347 deletions(-)
>  create mode 100644 src/jit/dis_mips64r6.lua
>  create mode 100644 src/jit/dis_mips64r6el.lua
>
> diff --git a/cmake/SetDynASMFlags.cmake b/cmake/SetDynASMFlags.cmake
> index 142d7e64..7eead6e9 100644
> --- a/cmake/SetDynASMFlags.cmake
> +++ b/cmake/SetDynASMFlags.cmake
> @@ -64,6 +64,11 @@ elseif(LUAJIT_ARCH STREQUAL "mips")
>    endif()
>  endif()
>
> +string(FIND "${TESTARCH}" "LJ_TARGET_MIPSR6" FOUND)
> +if(NOT FOUND EQUAL -1)
> +  AppendFlags(DYNASM_FLAGS -D MIPSR6)
> +endif()
> +
>  string(FIND "${TESTARCH}" "LJ_LE 1" FOUND)
>  if(NOT FOUND EQUAL -1)
>    list(APPEND DYNASM_FLAGS -D ENDIAN_LE)
> diff --git a/dynasm/dasm_mips.h b/dynasm/dasm_mips.h
> index 71a835b2..7d06aa72 100644
> --- a/dynasm/dasm_mips.h
> +++ b/dynasm/dasm_mips.h
> @@ -355,14 +355,15 @@ int dasm_encode(Dst_DECL, void *buffer)
>      CK(n >= 0, UNDEF_PC);
>      n = *DASM_POS2PTR(D, n);
>      if (ins & 2048)
> -      n = n - (int)((char *)cp - base);
> -    else
>        n = (n + (int)(size_t)base) & 0x0fffffff;
> -  patchrel:
> +    else
> +      n = n - (int)((char *)cp - base);
> +  patchrel: {
> +    unsigned int e = 16 + ((ins >> 12) & 15);
>      CK((n & 3) == 0 &&
> -       ((n + ((ins & 2048) ? 0x00020000 : 0)) >>
> -         ((ins & 2048) ? 18 : 28)) == 0, RANGE_REL);
> -    cp[-1] |= ((n>>2) & ((ins & 2048) ? 0x0000ffff: 0x03ffffff));
> +       ((n + ((ins & 2048) ? 0 : (1<<(e+1)))) >> (e+2)) == 0, RANGE_REL);
> +    cp[-1] |= ((n>>2) & ((1<<e)-1));
> +  }
>      break;
>    case DASM_LABEL_LG:
>      ins &= 2047; if (ins >= 20) D->globals[ins-10] = (void *)(base + n);
> diff --git a/dynasm/dasm_mips.lua b/dynasm/dasm_mips.lua
> index bd2a2b43..ccdc53cd 100644
> --- a/dynasm/dasm_mips.lua
> +++ b/dynasm/dasm_mips.lua
> @@ -6,6 +6,7 @@
>  ------------------------------------------------------------------------------
>
>  local mips64 = mips64
> +local mipsr6 = _map_def.MIPSR6
>
>  -- Module information:
>  local _info = {
> @@ -238,7 +239,6 @@ local map_op = {
>    bne_3 = "14000000STB",
>    blez_2 = "18000000SB",
>    bgtz_2 = "1c000000SB",
> -  addi_3 = "20000000TSI",
>    li_2 = "24000000TI",
>    addiu_3 = "24000000TSI",
>    slti_3 = "28000000TSI",
> @@ -248,40 +248,22 @@ local map_op = {
>    ori_3 = "34000000TSU",
>    xori_3 = "38000000TSU",
>    lui_2 = "3c000000TU",
> -  beqzl_2 = "50000000SB",
> -  beql_3 = "50000000STB",
> -  bnezl_2 = "54000000SB",
> -  bnel_3 = "54000000STB",
> -  blezl_2 = "58000000SB",
> -  bgtzl_2 = "5c000000SB",
> -  daddi_3 = mips64 and "60000000TSI",
>    daddiu_3 = mips64 and "64000000TSI",
>    ldl_2 = mips64 and "68000000TO",
>    ldr_2 = mips64 and "6c000000TO",
>    lb_2 = "80000000TO",
>    lh_2 = "84000000TO",
> -  lwl_2 = "88000000TO",
>    lw_2 = "8c000000TO",
>    lbu_2 = "90000000TO",
>    lhu_2 = "94000000TO",
> -  lwr_2 = "98000000TO",
>    lwu_2 = mips64 and "9c000000TO",
>    sb_2 = "a0000000TO",
>    sh_2 = "a4000000TO",
> -  swl_2 = "a8000000TO",
>    sw_2 = "ac000000TO",
> -  sdl_2 = mips64 and "b0000000TO",
> -  sdr_2 = mips64 and "b1000000TO",
> -  swr_2 = "b8000000TO",
> -  cache_2 = "bc000000NO",
> -  ll_2 = "c0000000TO",
>    lwc1_2 = "c4000000HO",
> -  pref_2 = "cc000000NO",
>    ldc1_2 = "d4000000HO",
>    ld_2 = mips64 and "dc000000TO",
> -  sc_2 = "e0000000TO",
>
swc1_2 = "e4000000HO", > - scd_2 = mips64 and "f0000000TO", > sdc1_2 = "f4000000HO", > sd_2 = mips64 and "fc000000TO", > > @@ -289,10 +271,6 @@ local map_op = { > nop_0 = "00000000", > sll_3 = "00000000DTA", > sextw_2 = "00000000DT", > - movf_2 = "00000001DS", > - movf_3 = "00000001DSC", > - movt_2 = "00010001DS", > - movt_3 = "00010001DSC", > srl_3 = "00000002DTA", > rotr_3 = "00200002DTA", > sra_3 = "00000003DTA", > @@ -301,31 +279,16 @@ local map_op = { > rotrv_3 = "00000046DTS", > drotrv_3 = mips64 and "00000056DTS", > srav_3 = "00000007DTS", > - jr_1 = "00000008S", > jalr_1 = "0000f809S", > jalr_2 = "00000009DS", > - movz_3 = "0000000aDST", > - movn_3 = "0000000bDST", > syscall_0 = "0000000c", > syscall_1 = "0000000cY", > break_0 = "0000000d", > break_1 = "0000000dY", > sync_0 = "0000000f", > - mfhi_1 = "00000010D", > - mthi_1 = "00000011S", > - mflo_1 = "00000012D", > - mtlo_1 = "00000013S", > dsllv_3 = mips64 and "00000014DTS", > dsrlv_3 = mips64 and "00000016DTS", > dsrav_3 = mips64 and "00000017DTS", > - mult_2 = "00000018ST", > - multu_2 = "00000019ST", > - div_2 = "0000001aST", > - divu_2 = "0000001bST", > - dmult_2 = mips64 and "0000001cST", > - dmultu_2 = mips64 and "0000001dST", > - ddiv_2 = mips64 and "0000001eST", > - ddivu_2 = mips64 and "0000001fST", > add_3 = "00000020DST", > move_2 = mips64 and "00000025DS" or "00000021DS", > addu_3 = "00000021DST", > @@ -369,32 +332,9 @@ local map_op = { > bgez_2 = "04010000SB", > bltzl_2 = "04020000SB", > bgezl_2 = "04030000SB", > - tgei_2 = "04080000SI", > - tgeiu_2 = "04090000SI", > - tlti_2 = "040a0000SI", > - tltiu_2 = "040b0000SI", > - teqi_2 = "040c0000SI", > - tnei_2 = "040e0000SI", > - bltzal_2 = "04100000SB", > bal_1 = "04110000B", > - bgezal_2 = "04110000SB", > - bltzall_2 = "04120000SB", > - bgezall_2 = "04130000SB", > synci_1 = "041f0000O", > > - -- Opcode SPECIAL2. 
> - madd_2 = "70000000ST", > - maddu_2 = "70000001ST", > - mul_3 = "70000002DST", > - msub_2 = "70000004ST", > - msubu_2 = "70000005ST", > - clz_2 = "70000020DS=", > - clo_2 = "70000021DS=", > - dclz_2 = mips64 and "70000024DS=", > - dclo_2 = mips64 and "70000025DS=", > - sdbbp_0 = "7000003f", > - sdbbp_1 = "7000003fY", > - > -- Opcode SPECIAL3. > ext_4 = "7c000000TSAM", -- Note: last arg is msbd = size-1 > dextm_4 = mips64 and "7c000001TSAM", -- Args: pos | size-1-32 > @@ -445,15 +385,6 @@ local map_op = { > ctc1_2 = "44c00000TG", > mthc1_2 = "44e00000TG", > > - bc1f_1 = "45000000B", > - bc1f_2 = "45000000CB", > - bc1t_1 = "45010000B", > - bc1t_2 = "45010000CB", > - bc1fl_1 = "45020000B", > - bc1fl_2 = "45020000CB", > - bc1tl_1 = "45030000B", > - bc1tl_2 = "45030000CB", > - > ["add.s_3"] = "46000000FGH", > ["sub.s_3"] = "46000001FGH", > ["mul.s_3"] = "46000002FGH", > @@ -470,51 +401,11 @@ local map_op = { > ["trunc.w.s_2"] = "4600000dFG", > ["ceil.w.s_2"] = "4600000eFG", > ["floor.w.s_2"] = "4600000fFG", > - ["movf.s_2"] = "46000011FG", > - ["movf.s_3"] = "46000011FGC", > - ["movt.s_2"] = "46010011FG", > - ["movt.s_3"] = "46010011FGC", > - ["movz.s_3"] = "46000012FGT", > - ["movn.s_3"] = "46000013FGT", > ["recip.s_2"] = "46000015FG", > ["rsqrt.s_2"] = "46000016FG", > ["cvt.d.s_2"] = "46000021FG", > ["cvt.w.s_2"] = "46000024FG", > ["cvt.l.s_2"] = "46000025FG", > - ["cvt.ps.s_3"] = "46000026FGH", > - ["c.f.s_2"] = "46000030GH", > - ["c.f.s_3"] = "46000030VGH", > - ["c.un.s_2"] = "46000031GH", > - ["c.un.s_3"] = "46000031VGH", > - ["c.eq.s_2"] = "46000032GH", > - ["c.eq.s_3"] = "46000032VGH", > - ["c.ueq.s_2"] = "46000033GH", > - ["c.ueq.s_3"] = "46000033VGH", > - ["c.olt.s_2"] = "46000034GH", > - ["c.olt.s_3"] = "46000034VGH", > - ["c.ult.s_2"] = "46000035GH", > - ["c.ult.s_3"] = "46000035VGH", > - ["c.ole.s_2"] = "46000036GH", > - ["c.ole.s_3"] = "46000036VGH", > - ["c.ule.s_2"] = "46000037GH", > - ["c.ule.s_3"] = "46000037VGH", > - ["c.sf.s_2"] = "46000038GH", > - 
["c.sf.s_3"] = "46000038VGH", > - ["c.ngle.s_2"] = "46000039GH", > - ["c.ngle.s_3"] = "46000039VGH", > - ["c.seq.s_2"] = "4600003aGH", > - ["c.seq.s_3"] = "4600003aVGH", > - ["c.ngl.s_2"] = "4600003bGH", > - ["c.ngl.s_3"] = "4600003bVGH", > - ["c.lt.s_2"] = "4600003cGH", > - ["c.lt.s_3"] = "4600003cVGH", > - ["c.nge.s_2"] = "4600003dGH", > - ["c.nge.s_3"] = "4600003dVGH", > - ["c.le.s_2"] = "4600003eGH", > - ["c.le.s_3"] = "4600003eVGH", > - ["c.ngt.s_2"] = "4600003fGH", > - ["c.ngt.s_3"] = "4600003fVGH", > - > ["add.d_3"] = "46200000FGH", > ["sub.d_3"] = "46200001FGH", > ["mul.d_3"] = "46200002FGH", > @@ -531,130 +422,410 @@ local map_op = { > ["trunc.w.d_2"] = "4620000dFG", > ["ceil.w.d_2"] = "4620000eFG", > ["floor.w.d_2"] = "4620000fFG", > - ["movf.d_2"] = "46200011FG", > - ["movf.d_3"] = "46200011FGC", > - ["movt.d_2"] = "46210011FG", > - ["movt.d_3"] = "46210011FGC", > - ["movz.d_3"] = "46200012FGT", > - ["movn.d_3"] = "46200013FGT", > ["recip.d_2"] = "46200015FG", > ["rsqrt.d_2"] = "46200016FG", > ["cvt.s.d_2"] = "46200020FG", > ["cvt.w.d_2"] = "46200024FG", > ["cvt.l.d_2"] = "46200025FG", > - ["c.f.d_2"] = "46200030GH", > - ["c.f.d_3"] = "46200030VGH", > - ["c.un.d_2"] = "46200031GH", > - ["c.un.d_3"] = "46200031VGH", > - ["c.eq.d_2"] = "46200032GH", > - ["c.eq.d_3"] = "46200032VGH", > - ["c.ueq.d_2"] = "46200033GH", > - ["c.ueq.d_3"] = "46200033VGH", > - ["c.olt.d_2"] = "46200034GH", > - ["c.olt.d_3"] = "46200034VGH", > - ["c.ult.d_2"] = "46200035GH", > - ["c.ult.d_3"] = "46200035VGH", > - ["c.ole.d_2"] = "46200036GH", > - ["c.ole.d_3"] = "46200036VGH", > - ["c.ule.d_2"] = "46200037GH", > - ["c.ule.d_3"] = "46200037VGH", > - ["c.sf.d_2"] = "46200038GH", > - ["c.sf.d_3"] = "46200038VGH", > - ["c.ngle.d_2"] = "46200039GH", > - ["c.ngle.d_3"] = "46200039VGH", > - ["c.seq.d_2"] = "4620003aGH", > - ["c.seq.d_3"] = "4620003aVGH", > - ["c.ngl.d_2"] = "4620003bGH", > - ["c.ngl.d_3"] = "4620003bVGH", > - ["c.lt.d_2"] = "4620003cGH", > - ["c.lt.d_3"] = 
"4620003cVGH", > - ["c.nge.d_2"] = "4620003dGH", > - ["c.nge.d_3"] = "4620003dVGH", > - ["c.le.d_2"] = "4620003eGH", > - ["c.le.d_3"] = "4620003eVGH", > - ["c.ngt.d_2"] = "4620003fGH", > - ["c.ngt.d_3"] = "4620003fVGH", > - > - ["add.ps_3"] = "46c00000FGH", > - ["sub.ps_3"] = "46c00001FGH", > - ["mul.ps_3"] = "46c00002FGH", > - ["abs.ps_2"] = "46c00005FG", > - ["mov.ps_2"] = "46c00006FG", > - ["neg.ps_2"] = "46c00007FG", > - ["movf.ps_2"] = "46c00011FG", > - ["movf.ps_3"] = "46c00011FGC", > - ["movt.ps_2"] = "46c10011FG", > - ["movt.ps_3"] = "46c10011FGC", > - ["movz.ps_3"] = "46c00012FGT", > - ["movn.ps_3"] = "46c00013FGT", > - ["cvt.s.pu_2"] = "46c00020FG", > - ["cvt.s.pl_2"] = "46c00028FG", > - ["pll.ps_3"] = "46c0002cFGH", > - ["plu.ps_3"] = "46c0002dFGH", > - ["pul.ps_3"] = "46c0002eFGH", > - ["puu.ps_3"] = "46c0002fFGH", > - ["c.f.ps_2"] = "46c00030GH", > - ["c.f.ps_3"] = "46c00030VGH", > - ["c.un.ps_2"] = "46c00031GH", > - ["c.un.ps_3"] = "46c00031VGH", > - ["c.eq.ps_2"] = "46c00032GH", > - ["c.eq.ps_3"] = "46c00032VGH", > - ["c.ueq.ps_2"] = "46c00033GH", > - ["c.ueq.ps_3"] = "46c00033VGH", > - ["c.olt.ps_2"] = "46c00034GH", > - ["c.olt.ps_3"] = "46c00034VGH", > - ["c.ult.ps_2"] = "46c00035GH", > - ["c.ult.ps_3"] = "46c00035VGH", > - ["c.ole.ps_2"] = "46c00036GH", > - ["c.ole.ps_3"] = "46c00036VGH", > - ["c.ule.ps_2"] = "46c00037GH", > - ["c.ule.ps_3"] = "46c00037VGH", > - ["c.sf.ps_2"] = "46c00038GH", > - ["c.sf.ps_3"] = "46c00038VGH", > - ["c.ngle.ps_2"] = "46c00039GH", > - ["c.ngle.ps_3"] = "46c00039VGH", > - ["c.seq.ps_2"] = "46c0003aGH", > - ["c.seq.ps_3"] = "46c0003aVGH", > - ["c.ngl.ps_2"] = "46c0003bGH", > - ["c.ngl.ps_3"] = "46c0003bVGH", > - ["c.lt.ps_2"] = "46c0003cGH", > - ["c.lt.ps_3"] = "46c0003cVGH", > - ["c.nge.ps_2"] = "46c0003dGH", > - ["c.nge.ps_3"] = "46c0003dVGH", > - ["c.le.ps_2"] = "46c0003eGH", > - ["c.le.ps_3"] = "46c0003eVGH", > - ["c.ngt.ps_2"] = "46c0003fGH", > - ["c.ngt.ps_3"] = "46c0003fVGH", > - > ["cvt.s.w_2"] = "46800020FG", 
> ["cvt.d.w_2"] = "46800021FG", > - > ["cvt.s.l_2"] = "46a00020FG", > ["cvt.d.l_2"] = "46a00021FG", > - > - -- Opcode COP1X. > - lwxc1_2 = "4c000000FX", > - ldxc1_2 = "4c000001FX", > - luxc1_2 = "4c000005FX", > - swxc1_2 = "4c000008FX", > - sdxc1_2 = "4c000009FX", > - suxc1_2 = "4c00000dFX", > - prefx_2 = "4c00000fMX", > - ["alnv.ps_4"] = "4c00001eFGHS", > - ["madd.s_4"] = "4c000020FRGH", > - ["madd.d_4"] = "4c000021FRGH", > - ["madd.ps_4"] = "4c000026FRGH", > - ["msub.s_4"] = "4c000028FRGH", > - ["msub.d_4"] = "4c000029FRGH", > - ["msub.ps_4"] = "4c00002eFRGH", > - ["nmadd.s_4"] = "4c000030FRGH", > - ["nmadd.d_4"] = "4c000031FRGH", > - ["nmadd.ps_4"] = "4c000036FRGH", > - ["nmsub.s_4"] = "4c000038FRGH", > - ["nmsub.d_4"] = "4c000039FRGH", > - ["nmsub.ps_4"] = "4c00003eFRGH", > } > > +if mipsr6 then -- Instructions added with MIPSR6. > + > + for k,v in pairs({ > + > + -- Add immediate to upper bits. > + aui_3 = "3c000000TSI", > + daui_3 = mips64 and "74000000TSI", > + dahi_2 = mips64 and "04060000SI", > + dati_2 = mips64 and "041e0000SI", > + > + -- TODO: addiupc, auipc, aluipc, lwpc, lwupc, ldpc. > + > + -- Compact branches. > + blezalc_2 = "18000000TB", -- rt != 0. > + bgezalc_2 = "18000000T=SB", -- rt != 0. > + bgtzalc_2 = "1c000000TB", -- rt != 0. > + bltzalc_2 = "1c000000T=SB", -- rt != 0. > + > + blezc_2 = "58000000TB", -- rt != 0. > + bgezc_2 = "58000000T=SB", -- rt != 0. > + bgec_3 = "58000000STB", -- rs != rt. > + blec_3 = "58000000TSB", -- rt != rs. > + > + bgtzc_2 = "5c000000TB", -- rt != 0. > + bltzc_2 = "5c000000T=SB", -- rt != 0. > + bltc_3 = "5c000000STB", -- rs != rt. > + bgtc_3 = "5c000000TSB", -- rt != rs. > + > + bgeuc_3 = "18000000STB", -- rs != rt. > + bleuc_3 = "18000000TSB", -- rt != rs. > + bltuc_3 = "1c000000STB", -- rs != rt. > + bgtuc_3 = "1c000000TSB", -- rt != rs. > + > + beqzalc_2 = "20000000TB", -- rt != 0. > + bnezalc_2 = "60000000TB", -- rt != 0. > + beqc_3 = "20000000STB", -- rs < rt. > + bnec_3 = "60000000STB", -- rs < rt. 
> + bovc_3 = "20000000STB", -- rs >= rt. > + bnvc_3 = "60000000STB", -- rs >= rt. > + > + beqzc_2 = "d8000000SK", -- rs != 0. > + bnezc_2 = "f8000000SK", -- rs != 0. > + jic_2 = "d8000000TI", > + jialc_2 = "f8000000TI", > + bc_1 = "c8000000L", > + balc_1 = "e8000000L", > + > + -- Opcode SPECIAL. > + jr_1 = "00000009S", > + sdbbp_0 = "0000000e", > + sdbbp_1 = "0000000eY", > + lsa_4 = "00000005DSTA", > + dlsa_4 = mips64 and "00000015DSTA", > + seleqz_3 = "00000035DST", > + selnez_3 = "00000037DST", > + clz_2 = "00000050DS", > + clo_2 = "00000051DS", > + dclz_2 = mips64 and "00000052DS", > + dclo_2 = mips64 and "00000053DS", > + mul_3 = "00000098DST", > + muh_3 = "000000d8DST", > + mulu_3 = "00000099DST", > + muhu_3 = "000000d9DST", > + div_3 = "0000009aDST", > + mod_3 = "000000daDST", > + divu_3 = "0000009bDST", > + modu_3 = "000000dbDST", > + dmul_3 = mips64 and "0000009cDST", > + dmuh_3 = mips64 and "000000dcDST", > + dmulu_3 = mips64 and "0000009dDST", > + dmuhu_3 = mips64 and "000000ddDST", > + ddiv_3 = mips64 and "0000009eDST", > + dmod_3 = mips64 and "000000deDST", > + ddivu_3 = mips64 and "0000009fDST", > + dmodu_3 = mips64 and "000000dfDST", > + > + -- Opcode SPECIAL3. > + align_4 = "7c000220DSTA", > + dalign_4 = mips64 and "7c000224DSTA", > + bitswap_2 = "7c000020DT", > + dbitswap_2 = mips64 and "7c000024DT", > + > + -- Opcode COP1. 
> + bc1eqz_2 = "45200000HB", > + bc1nez_2 = "45a00000HB", > + > + ["sel.s_3"] = "46000010FGH", > + ["seleqz.s_3"] = "46000014FGH", > + ["selnez.s_3"] = "46000017FGH", > + ["maddf.s_3"] = "46000018FGH", > + ["msubf.s_3"] = "46000019FGH", > + ["rint.s_2"] = "4600001aFG", > + ["class.s_2"] = "4600001bFG", > + ["min.s_3"] = "4600001cFGH", > + ["mina.s_3"] = "4600001dFGH", > + ["max.s_3"] = "4600001eFGH", > + ["maxa.s_3"] = "4600001fFGH", > + ["cmp.af.s_3"] = "46800000FGH", > + ["cmp.un.s_3"] = "46800001FGH", > + ["cmp.or.s_3"] = "46800011FGH", > + ["cmp.eq.s_3"] = "46800002FGH", > + ["cmp.une.s_3"] = "46800012FGH", > + ["cmp.ueq.s_3"] = "46800003FGH", > + ["cmp.ne.s_3"] = "46800013FGH", > + ["cmp.lt.s_3"] = "46800004FGH", > + ["cmp.ult.s_3"] = "46800005FGH", > + ["cmp.le.s_3"] = "46800006FGH", > + ["cmp.ule.s_3"] = "46800007FGH", > + ["cmp.saf.s_3"] = "46800008FGH", > + ["cmp.sun.s_3"] = "46800009FGH", > + ["cmp.sor.s_3"] = "46800019FGH", > + ["cmp.seq.s_3"] = "4680000aFGH", > + ["cmp.sune.s_3"] = "4680001aFGH", > + ["cmp.sueq.s_3"] = "4680000bFGH", > + ["cmp.sne.s_3"] = "4680001bFGH", > + ["cmp.slt.s_3"] = "4680000cFGH", > + ["cmp.sult.s_3"] = "4680000dFGH", > + ["cmp.sle.s_3"] = "4680000eFGH", > + ["cmp.sule.s_3"] = "4680000fFGH", > + > + ["sel.d_3"] = "46200010FGH", > + ["seleqz.d_3"] = "46200014FGH", > + ["selnez.d_3"] = "46200017FGH", > + ["maddf.d_3"] = "46200018FGH", > + ["msubf.d_3"] = "46200019FGH", > + ["rint.d_2"] = "4620001aFG", > + ["class.d_2"] = "4620001bFG", > + ["min.d_3"] = "4620001cFGH", > + ["mina.d_3"] = "4620001dFGH", > + ["max.d_3"] = "4620001eFGH", > + ["maxa.d_3"] = "4620001fFGH", > + ["cmp.af.d_3"] = "46a00000FGH", > + ["cmp.un.d_3"] = "46a00001FGH", > + ["cmp.or.d_3"] = "46a00011FGH", > + ["cmp.eq.d_3"] = "46a00002FGH", > + ["cmp.une.d_3"] = "46a00012FGH", > + ["cmp.ueq.d_3"] = "46a00003FGH", > + ["cmp.ne.d_3"] = "46a00013FGH", > + ["cmp.lt.d_3"] = "46a00004FGH", > + ["cmp.ult.d_3"] = "46a00005FGH", > + ["cmp.le.d_3"] = "46a00006FGH", > + 
["cmp.ule.d_3"] = "46a00007FGH", > + ["cmp.saf.d_3"] = "46a00008FGH", > + ["cmp.sun.d_3"] = "46a00009FGH", > + ["cmp.sor.d_3"] = "46a00019FGH", > + ["cmp.seq.d_3"] = "46a0000aFGH", > + ["cmp.sune.d_3"] = "46a0001aFGH", > + ["cmp.sueq.d_3"] = "46a0000bFGH", > + ["cmp.sne.d_3"] = "46a0001bFGH", > + ["cmp.slt.d_3"] = "46a0000cFGH", > + ["cmp.sult.d_3"] = "46a0000dFGH", > + ["cmp.sle.d_3"] = "46a0000eFGH", > + ["cmp.sule.d_3"] = "46a0000fFGH", > + > + }) do map_op[k] = v end > + > +else -- Instructions removed by MIPSR6. > + > + for k,v in pairs({ > + -- Traps, don't use. > + addi_3 = "20000000TSI", > + daddi_3 = mips64 and "60000000TSI", > + > + -- Branch on likely, don't use. > + beqzl_2 = "50000000SB", > + beql_3 = "50000000STB", > + bnezl_2 = "54000000SB", > + bnel_3 = "54000000STB", > + blezl_2 = "58000000SB", > + bgtzl_2 = "5c000000SB", > + > + lwl_2 = "88000000TO", > + lwr_2 = "98000000TO", > + swl_2 = "a8000000TO", > + sdl_2 = mips64 and "b0000000TO", > + sdr_2 = mips64 and "b1000000TO", > + swr_2 = "b8000000TO", > + cache_2 = "bc000000NO", > + ll_2 = "c0000000TO", > + pref_2 = "cc000000NO", > + sc_2 = "e0000000TO", > + scd_2 = mips64 and "f0000000TO", > + > + -- Opcode SPECIAL. > + movf_2 = "00000001DS", > + movf_3 = "00000001DSC", > + movt_2 = "00010001DS", > + movt_3 = "00010001DSC", > + jr_1 = "00000008S", > + movz_3 = "0000000aDST", > + movn_3 = "0000000bDST", > + mfhi_1 = "00000010D", > + mthi_1 = "00000011S", > + mflo_1 = "00000012D", > + mtlo_1 = "00000013S", > + mult_2 = "00000018ST", > + multu_2 = "00000019ST", > + div_3 = "0000001aST", > + divu_3 = "0000001bST", > + ddiv_3 = mips64 and "0000001eST", > + ddivu_3 = mips64 and "0000001fST", > + dmult_2 = mips64 and "0000001cST", > + dmultu_2 = mips64 and "0000001dST", > + > + -- Opcode REGIMM. 
> + tgei_2 = "04080000SI", > + tgeiu_2 = "04090000SI", > + tlti_2 = "040a0000SI", > + tltiu_2 = "040b0000SI", > + teqi_2 = "040c0000SI", > + tnei_2 = "040e0000SI", > + bltzal_2 = "04100000SB", > + bgezal_2 = "04110000SB", > + bltzall_2 = "04120000SB", > + bgezall_2 = "04130000SB", > + > + -- Opcode SPECIAL2. > + madd_2 = "70000000ST", > + maddu_2 = "70000001ST", > + mul_3 = "70000002DST", > + msub_2 = "70000004ST", > + msubu_2 = "70000005ST", > + clz_2 = "70000020D=TS", > + clo_2 = "70000021D=TS", > + dclz_2 = mips64 and "70000024D=TS", > + dclo_2 = mips64 and "70000025D=TS", > + sdbbp_0 = "7000003f", > + sdbbp_1 = "7000003fY", > + > + -- Opcode COP1. > + bc1f_1 = "45000000B", > + bc1f_2 = "45000000CB", > + bc1t_1 = "45010000B", > + bc1t_2 = "45010000CB", > + bc1fl_1 = "45020000B", > + bc1fl_2 = "45020000CB", > + bc1tl_1 = "45030000B", > + bc1tl_2 = "45030000CB", > + > + ["movf.s_2"] = "46000011FG", > + ["movf.s_3"] = "46000011FGC", > + ["movt.s_2"] = "46010011FG", > + ["movt.s_3"] = "46010011FGC", > + ["movz.s_3"] = "46000012FGT", > + ["movn.s_3"] = "46000013FGT", > + ["cvt.ps.s_3"] = "46000026FGH", > + ["c.f.s_2"] = "46000030GH", > + ["c.f.s_3"] = "46000030VGH", > + ["c.un.s_2"] = "46000031GH", > + ["c.un.s_3"] = "46000031VGH", > + ["c.eq.s_2"] = "46000032GH", > + ["c.eq.s_3"] = "46000032VGH", > + ["c.ueq.s_2"] = "46000033GH", > + ["c.ueq.s_3"] = "46000033VGH", > + ["c.olt.s_2"] = "46000034GH", > + ["c.olt.s_3"] = "46000034VGH", > + ["c.ult.s_2"] = "46000035GH", > + ["c.ult.s_3"] = "46000035VGH", > + ["c.ole.s_2"] = "46000036GH", > + ["c.ole.s_3"] = "46000036VGH", > + ["c.ule.s_2"] = "46000037GH", > + ["c.ule.s_3"] = "46000037VGH", > + ["c.sf.s_2"] = "46000038GH", > + ["c.sf.s_3"] = "46000038VGH", > + ["c.ngle.s_2"] = "46000039GH", > + ["c.ngle.s_3"] = "46000039VGH", > + ["c.seq.s_2"] = "4600003aGH", > + ["c.seq.s_3"] = "4600003aVGH", > + ["c.ngl.s_2"] = "4600003bGH", > + ["c.ngl.s_3"] = "4600003bVGH", > + ["c.lt.s_2"] = "4600003cGH", > + ["c.lt.s_3"] = 
"4600003cVGH", > + ["c.nge.s_2"] = "4600003dGH", > + ["c.nge.s_3"] = "4600003dVGH", > + ["c.le.s_2"] = "4600003eGH", > + ["c.le.s_3"] = "4600003eVGH", > + ["c.ngt.s_2"] = "4600003fGH", > + ["c.ngt.s_3"] = "4600003fVGH", > + ["movf.d_2"] = "46200011FG", > + ["movf.d_3"] = "46200011FGC", > + ["movt.d_2"] = "46210011FG", > + ["movt.d_3"] = "46210011FGC", > + ["movz.d_3"] = "46200012FGT", > + ["movn.d_3"] = "46200013FGT", > + ["c.f.d_2"] = "46200030GH", > + ["c.f.d_3"] = "46200030VGH", > + ["c.un.d_2"] = "46200031GH", > + ["c.un.d_3"] = "46200031VGH", > + ["c.eq.d_2"] = "46200032GH", > + ["c.eq.d_3"] = "46200032VGH", > + ["c.ueq.d_2"] = "46200033GH", > + ["c.ueq.d_3"] = "46200033VGH", > + ["c.olt.d_2"] = "46200034GH", > + ["c.olt.d_3"] = "46200034VGH", > + ["c.ult.d_2"] = "46200035GH", > + ["c.ult.d_3"] = "46200035VGH", > + ["c.ole.d_2"] = "46200036GH", > + ["c.ole.d_3"] = "46200036VGH", > + ["c.ule.d_2"] = "46200037GH", > + ["c.ule.d_3"] = "46200037VGH", > + ["c.sf.d_2"] = "46200038GH", > + ["c.sf.d_3"] = "46200038VGH", > + ["c.ngle.d_2"] = "46200039GH", > + ["c.ngle.d_3"] = "46200039VGH", > + ["c.seq.d_2"] = "4620003aGH", > + ["c.seq.d_3"] = "4620003aVGH", > + ["c.ngl.d_2"] = "4620003bGH", > + ["c.ngl.d_3"] = "4620003bVGH", > + ["c.lt.d_2"] = "4620003cGH", > + ["c.lt.d_3"] = "4620003cVGH", > + ["c.nge.d_2"] = "4620003dGH", > + ["c.nge.d_3"] = "4620003dVGH", > + ["c.le.d_2"] = "4620003eGH", > + ["c.le.d_3"] = "4620003eVGH", > + ["c.ngt.d_2"] = "4620003fGH", > + ["c.ngt.d_3"] = "4620003fVGH", > + ["add.ps_3"] = "46c00000FGH", > + ["sub.ps_3"] = "46c00001FGH", > + ["mul.ps_3"] = "46c00002FGH", > + ["abs.ps_2"] = "46c00005FG", > + ["mov.ps_2"] = "46c00006FG", > + ["neg.ps_2"] = "46c00007FG", > + ["movf.ps_2"] = "46c00011FG", > + ["movf.ps_3"] = "46c00011FGC", > + ["movt.ps_2"] = "46c10011FG", > + ["movt.ps_3"] = "46c10011FGC", > + ["movz.ps_3"] = "46c00012FGT", > + ["movn.ps_3"] = "46c00013FGT", > + ["cvt.s.pu_2"] = "46c00020FG", > + ["cvt.s.pl_2"] = "46c00028FG", > + 
["pll.ps_3"] = "46c0002cFGH", > + ["plu.ps_3"] = "46c0002dFGH", > + ["pul.ps_3"] = "46c0002eFGH", > + ["puu.ps_3"] = "46c0002fFGH", > + ["c.f.ps_2"] = "46c00030GH", > + ["c.f.ps_3"] = "46c00030VGH", > + ["c.un.ps_2"] = "46c00031GH", > + ["c.un.ps_3"] = "46c00031VGH", > + ["c.eq.ps_2"] = "46c00032GH", > + ["c.eq.ps_3"] = "46c00032VGH", > + ["c.ueq.ps_2"] = "46c00033GH", > + ["c.ueq.ps_3"] = "46c00033VGH", > + ["c.olt.ps_2"] = "46c00034GH", > + ["c.olt.ps_3"] = "46c00034VGH", > + ["c.ult.ps_2"] = "46c00035GH", > + ["c.ult.ps_3"] = "46c00035VGH", > + ["c.ole.ps_2"] = "46c00036GH", > + ["c.ole.ps_3"] = "46c00036VGH", > + ["c.ule.ps_2"] = "46c00037GH", > + ["c.ule.ps_3"] = "46c00037VGH", > + ["c.sf.ps_2"] = "46c00038GH", > + ["c.sf.ps_3"] = "46c00038VGH", > + ["c.ngle.ps_2"] = "46c00039GH", > + ["c.ngle.ps_3"] = "46c00039VGH", > + ["c.seq.ps_2"] = "46c0003aGH", > + ["c.seq.ps_3"] = "46c0003aVGH", > + ["c.ngl.ps_2"] = "46c0003bGH", > + ["c.ngl.ps_3"] = "46c0003bVGH", > + ["c.lt.ps_2"] = "46c0003cGH", > + ["c.lt.ps_3"] = "46c0003cVGH", > + ["c.nge.ps_2"] = "46c0003dGH", > + ["c.nge.ps_3"] = "46c0003dVGH", > + ["c.le.ps_2"] = "46c0003eGH", > + ["c.le.ps_3"] = "46c0003eVGH", > + ["c.ngt.ps_2"] = "46c0003fGH", > + ["c.ngt.ps_3"] = "46c0003fVGH", > + > + -- Opcode COP1X. 
> + lwxc1_2 = "4c000000FX",
> + ldxc1_2 = "4c000001FX",
> + luxc1_2 = "4c000005FX",
> + swxc1_2 = "4c000008FX",
> + sdxc1_2 = "4c000009FX",
> + suxc1_2 = "4c00000dFX",
> + prefx_2 = "4c00000fMX",
> + ["alnv.ps_4"] = "4c00001eFGHS",
> + ["madd.s_4"] = "4c000020FRGH",
> + ["madd.d_4"] = "4c000021FRGH",
> + ["madd.ps_4"] = "4c000026FRGH",
> + ["msub.s_4"] = "4c000028FRGH",
> + ["msub.d_4"] = "4c000029FRGH",
> + ["msub.ps_4"] = "4c00002eFRGH",
> + ["nmadd.s_4"] = "4c000030FRGH",
> + ["nmadd.d_4"] = "4c000031FRGH",
> + ["nmadd.ps_4"] = "4c000036FRGH",
> + ["nmsub.s_4"] = "4c000038FRGH",
> + ["nmsub.d_4"] = "4c000039FRGH",
> + ["nmsub.ps_4"] = "4c00003eFRGH",
> +
> + }) do map_op[k] = v end
> +
> +end
> +
> ------------------------------------------------------------------------------
>
> local function parse_gpr(expr)
> @@ -808,9 +979,11 @@ map_op[".template__"] = function(params, template, nparams)
> op = op + parse_disp(params[n]); n = n + 1
> elseif p == "X" then
> op = op + parse_index(params[n]); n = n + 1
> - elseif p == "B" or p == "J" then
> + elseif p == "B" or p == "J" or p == "K" or p == "L" then
> local mode, m, s = parse_label(params[n], false)
> - if p == "B" then m = m + 2048 end
> + if p == "J" then m = m + 0xa800
> + elseif p == "K" then m = m + 0x5000
> + elseif p == "L" then m = m + 0xa000 end
> waction("REL_"..mode, m, s, 1)
> n = n + 1
> elseif p == "A" then
> @@ -833,7 +1006,7 @@ map_op[".template__"] = function(params, template, nparams)
> elseif p == "Z" then
> op = op + parse_imm(params[n], 10, 6, 0, false); n = n + 1
> elseif p == "=" then
> - op = op + shl(band(op, 0xf800), 5) -- Copy D to T for clz, clo.
> + n = n - 1 -- Re-use previous parameter for next template char.
> else
> assert(false)
> end
> diff --git a/dynasm/dynasm.lua b/dynasm/dynasm.lua
> index 5ec21a79..46ebfca8 100644
> --- a/dynasm/dynasm.lua
> +++ b/dynasm/dynasm.lua
> @@ -630,6 +630,7 @@ end
> -- Load architecture-specific module.
> local function loadarch(arch)
> if not match(arch, "^[%w_]+$") then return "bad arch name" end
> + _G._map_def = map_def
> local ok, m_arch = pcall(require, "dasm_"..arch)
> if not ok then return "cannot load module: "..m_arch end
> g_arch = m_arch
> diff --git a/src/Makefile.original b/src/Makefile.original
> index aedaaa73..22d36a27 100644
> --- a/src/Makefile.original
> +++ b/src/Makefile.original
> @@ -455,6 +455,9 @@ ifeq (arm,$(TARGET_LJARCH))
> DASM_AFLAGS+= -D IOS
> endif
> else
> +ifneq (,$(findstring LJ_TARGET_MIPSR6 ,$(TARGET_TESTARCH)))
> + DASM_AFLAGS+= -D MIPSR6
> +endif
> ifeq (ppc,$(TARGET_LJARCH))
> ifneq (,$(findstring LJ_ARCH_SQRT 1,$(TARGET_TESTARCH)))
> DASM_AFLAGS+= -D SQRT
> diff --git a/src/jit/bcsave.lua b/src/jit/bcsave.lua
> index 2553d97e..41081184 100644
> --- a/src/jit/bcsave.lua
> +++ b/src/jit/bcsave.lua
> @@ -17,6 +17,10 @@ local bit = require("bit")
> -- Symbol name prefix for LuaJIT bytecode.
> local LJBC_PREFIX = "luaJIT_BC_"
>
> +local type, assert = type, assert
> +local format = string.format
> +local tremove, tconcat = table.remove, table.concat
> +
> ------------------------------------------------------------------------------
>
> local function usage()
> @@ -63,8 +67,18 @@ local map_type = {
> }
>
> local map_arch = {
> - x86 = true, x64 = true, arm = true, arm64 = true, arm64be = true,
> - ppc = true, mips = true, mipsel = true,
> + x86 = { e = "le", b = 32, m = 3, p = 0x14c, },
> + x64 = { e = "le", b = 64, m = 62, p = 0x8664, },
> + arm = { e = "le", b = 32, m = 40, p = 0x1c0, },
> + arm64 = { e = "le", b = 64, m = 183, p = 0xaa64, },
> + arm64be = { e = "be", b = 64, m = 183, },
> + ppc = { e = "be", b = 32, m = 20, },
> + mips = { e = "be", b = 32, m = 8, f = 0x50001006, },
> + mipsel = { e = "le", b = 32, m = 8, f = 0x50001006, },
> + mips64 = { e = "be", b = 64, m = 8, f = 0x80000007, },
> + mips64el = { e = "le", b = 64, m = 8, f = 0x80000007, },
> + mips64r6 = { e = "be", b = 64, m = 8, f = 0xa0000407, },
> + mips64r6el = { e = "le", b = 64, m = 8, f = 0xa0000407, },
> }
>
> local map_os = {
> @@ -73,33 +87,33 @@ local map_os = {
> }
>
> local function checkarg(str, map, err)
> - str = string.lower(str)
> + str = str:lower()
> local s = check(map[str], "unknown ", err)
> - return s == true and str or s
> + return type(s) == "string" and s or str
> end
>
> local function detecttype(str)
> - local ext = string.match(string.lower(str), "%.(%a+)$")
> + local ext = str:lower():match("%.(%a+)$")
> return map_type[ext] or "raw"
> end
>
> local function checkmodname(str)
> - check(string.match(str, "^[%w_.%-]+$"), "bad module name")
> - return string.gsub(str, "[%.%-]", "_")
> + check(str:match("^[%w_.%-]+$"), "bad module name")
> + return str:gsub("[%.%-]", "_")
> end
>
> local function detectmodname(str)
> if type(str) == "string" then
> - local tail = string.match(str, "[^/\\]+$")
> + local tail = str:match("[^/\\]+$")
> if tail then str = tail end
> - local head = string.match(str, "^(.*)%.[^.]*$")
> + local head = str:match("^(.*)%.[^.]*$")
> if head then str = head end
> - str = string.match(str, "^[%w_.%-]+")
> + str = str:match("^[%w_.%-]+")
> else
> str = nil
> end
> check(str, "cannot derive module name, use -n name")
> - return string.gsub(str, "[%.%-]", "_")
> + return str:gsub("[%.%-]", "_")
> end
>
> ------------------------------------------------------------------------------
> @@ -118,7 +132,7 @@ end
> local function bcsave_c(ctx, output, s)
> local fp = savefile(output, "w")
> if ctx.type == "c" then
> - fp:write(string.format([[
> + fp:write(format([[
> #ifdef _cplusplus
> extern "C"
> #endif
> @@ -128,7 +142,7 @@ __declspec(dllexport)
> const unsigned char %s%s[] = {
> ]], LJBC_PREFIX, ctx.modname))
> else
> - fp:write(string.format([[
> + fp:write(format([[
> #define %s%s_SIZE %d
> static const unsigned char %s%s[] = {
> ]], LJBC_PREFIX, ctx.modname, #s, LJBC_PREFIX, ctx.modname))
> @@ -138,13 +152,13 @@ static const unsigned char %s%s[] = {
> local b = tostring(string.byte(s, i))
> m = m + #b + 1
> if m > 78 then
> - fp:write(table.concat(t, ",", 1, n), ",\n")
> + fp:write(tconcat(t, ",", 1, n), ",\n")
> n, m = 0, #b + 1
> end
> n = n + 1
> t[n] = b
> end
> - bcsave_tail(fp, output, table.concat(t, ",", 1, n).."\n};\n")
> + bcsave_tail(fp, output, tconcat(t, ",", 1, n).."\n};\n")
> end
>
> local function bcsave_elfobj(ctx, output, s, ffi)
> @@ -199,12 +213,8 @@ typedef struct {
> } ELF64obj;
> ]]
> local symname = LJBC_PREFIX..ctx.modname
> - local is64, isbe = false, false
> - if ctx.arch == "x64" or ctx.arch == "arm64" or ctx.arch == "arm64be" then
> - is64 = true
> - elseif ctx.arch == "ppc" or ctx.arch == "mips" then
> - isbe = true
> - end
> + local ai = assert(map_arch[ctx.arch])
> + local is64, isbe = ai.b == 64, ai.e == "be"
>
> -- Handle different host/target endianess.
> local function f32(x) return x end
> @@ -237,10 +247,8 @@ typedef struct {
> hdr.eendian = isbe and 2 or 1
> hdr.eversion = 1
> hdr.type = f16(1)
> - hdr.machine = f16(({ x86=3, x64=62, arm=40, arm64=183, arm64be=183, ppc=20, mips=8, mipsel=8 })[ctx.arch])
> - if ctx.arch == "mips" or ctx.arch == "mipsel" then
> - hdr.flags = f32(0x50001006)
> - end
> + hdr.machine = f16(ai.m)
> + hdr.flags = f32(ai.f or 0)
> hdr.version = f32(1)
> hdr.shofs = fofs(ffi.offsetof(o, "sect"))
> hdr.ehsize = f16(ffi.sizeof(hdr))
> @@ -336,12 +344,8 @@ typedef struct {
> } PEobj;
> ]]
> local symname = LJBC_PREFIX..ctx.modname
> - local is64 = false
> - if ctx.arch == "x86" then
> - symname = "_"..symname
> - elseif ctx.arch == "x64" then
> - is64 = true
> - end
> + local ai = assert(map_arch[ctx.arch])
> + local is64 = ai.b == 64
> local symexport = " /EXPORT:"..symname..",DATA "
>
> -- The file format is always little-endian. Swap if the host is big-endian.
> @@ -355,7 +359,7 @@ typedef struct {
> -- Create PE object and fill in header.
> local o = ffi.new("PEobj")
> local hdr = o.hdr
> - hdr.arch = f16(({ x86=0x14c, x64=0x8664, arm=0x1c0, ppc=0x1f2, mips=0x366, mipsel=0x366 })[ctx.arch])
> + hdr.arch = f16(assert(ai.p))
> hdr.nsects = f16(2)
> hdr.symtabofs = f32(ffi.offsetof(o, "sym0"))
> hdr.nsyms = f32(6)
> @@ -605,16 +609,16 @@ local function docmd(...)
> local n = 1
> local list = false
> local ctx = {
> - strip = true, arch = jit.arch, os = string.lower(jit.os),
> + strip = true, arch = jit.arch, os = jit.os:lower(),
> type = false, modname = false,
> }
> while n <= #arg do
> local a = arg[n]
> - if type(a) == "string" and string.sub(a, 1, 1) == "-" and a ~= "-" then
> - table.remove(arg, n)
> + if type(a) == "string" and a:sub(1, 1) == "-" and a ~= "-" then
> + tremove(arg, n)
> if a == "--" then break end
> for m=2,#a do
> - local opt = string.sub(a, m, m)
> + local opt = a:sub(m, m)
> if opt == "l" then
> list = true
> elseif opt == "s" then
> @@ -627,13 +631,13 @@ local function docmd(...)
> if n ~= 1 then usage() end
> arg[1] = check(loadstring(arg[1]))
> elseif opt == "n" then
> - ctx.modname = checkmodname(table.remove(arg, n))
> + ctx.modname = checkmodname(tremove(arg, n))
> elseif opt == "t" then
> - ctx.type = checkarg(table.remove(arg, n), map_type, "file type")
> + ctx.type = checkarg(tremove(arg, n), map_type, "file type")
> elseif opt == "a" then
> - ctx.arch = checkarg(table.remove(arg, n), map_arch, "architecture")
> + ctx.arch = checkarg(tremove(arg, n), map_arch, "architecture")
> elseif opt == "o" then
> - ctx.os = checkarg(table.remove(arg, n), map_os, "OS name")
> + ctx.os = checkarg(tremove(arg, n), map_os, "OS name")
> else
> usage()
> end
> diff --git a/src/jit/dis_mips.lua b/src/jit/dis_mips.lua
> index a12b8e62..c003b984 100644
> --- a/src/jit/dis_mips.lua
> +++ b/src/jit/dis_mips.lua
> @@ -19,13 +19,34 @@ local band, bor, tohex = bit.band, bit.bor, bit.tohex
> local lshift, rshift, arshift = bit.lshift, bit.rshift, bit.arshift
>
> ------------------------------------------------------------------------------
> --- Primary and extended opcode maps
> +-- Extended opcode maps common to all MIPS releases
> ------------------------------------------------------------------------------
>
> -local map_movci = { shift = 16, mask = 1, [0] = "movfDSC", "movtDSC", }
> local map_srl = { shift = 21, mask = 1, [0] = "srlDTA", "rotrDTA", }
> local map_srlv = { shift = 6, mask = 1, [0] = "srlvDTS", "rotrvDTS", }
>
> +local map_cop0 = {
> + shift = 25, mask = 1,
> + [0] = {
> + shift = 21, mask = 15,
> + [0] = "mfc0TDW", [4] = "mtc0TDW",
> + [10] = "rdpgprDT",
> + [11] = { shift = 5, mask = 1, [0] = "diT0", "eiT0", },
> + [14] = "wrpgprDT",
> + }, {
> + shift = 0, mask = 63,
> + [1] = "tlbr", [2] = "tlbwi", [6] = "tlbwr", [8] = "tlbp",
> + [24] = "eret", [31] = "deret",
> + [32] = "wait",
> + },
> +}
> +
> +------------------------------------------------------------------------------
> +-- Primary and extended opcode maps for MIPS R1-R5
> +------------------------------------------------------------------------------
> +
> +local map_movci = { shift = 16, mask = 1, [0] = "movfDSC", "movtDSC", }
> +
> local map_special = {
> shift = 0, mask = 63,
> [0] = { shift = 0, mask = -1, [0] = "nop", _ = "sllDTA" },
> @@ -87,22 +108,6 @@ local map_regimm = {
> false, false, false, "synciSO",
> }
>
> -local map_cop0 = {
> - shift = 25, mask = 1,
> - [0] = {
> - shift = 21, mask = 15,
> - [0] = "mfc0TDW", [4] = "mtc0TDW",
> - [10] = "rdpgprDT",
> - [11] = { shift = 5, mask = 1, [0] = "diT0", "eiT0", },
> - [14] = "wrpgprDT",
> - }, {
> - shift = 0, mask = 63,
> - [1] = "tlbr", [2] = "tlbwi", [6] = "tlbwr", [8] = "tlbp",
> - [24] = "eret", [31] = "deret",
> - [32] = "wait",
> - },
> -}
> -
> local map_cop1s = {
> shift = 0, mask = 63,
> [0] = "add.sFGH", "sub.sFGH", "mul.sFGH", "div.sFGH",
> @@ -233,6 +238,208 @@ local map_pri = {
> false, "sdc1HSO", "sdc2TSO", "sdTSO",
> }
>
> +------------------------------------------------------------------------------
> +-- Primary and extended opcode maps for MIPS R6
> +------------------------------------------------------------------------------
> +
> +local map_mul_r6 = { shift = 6, mask = 3, [2] = "mulDST", [3] = "muhDST" }
> +local map_mulu_r6 = { shift = 6, mask = 3, [2] = "muluDST", [3] = "muhuDST" }
> +local map_div_r6 = { shift = 6, mask = 3, [2] = "divDST", [3] = "modDST" }
> +local map_divu_r6 = { shift = 6, mask = 3, [2] = "divuDST", [3] = "moduDST" }
> +local map_dmul_r6 = { shift = 6, mask = 3, [2] = "dmulDST", [3] = "dmuhDST" }
> +local map_dmulu_r6 = { shift = 6, mask = 3, [2] = "dmuluDST", [3] = "dmuhuDST" }
> +local map_ddiv_r6 = { shift = 6, mask = 3, [2] = "ddivDST", [3] = "dmodDST" }
> +local map_ddivu_r6 = { shift = 6, mask = 3, [2] = "ddivuDST", [3] = "dmoduDST" }
> +
> +local map_special_r6 = {
> + shift = 0, mask = 63,
> + [0] = { shift = 0, mask = -1, [0] = "nop", _ = "sllDTA" },
> + false, map_srl, "sraDTA",
> + "sllvDTS", false, map_srlv, "sravDTS",
> + "jrS", "jalrD1S", false, false,
> + "syscallY", "breakY", false, "sync",
> + "clzDS", "cloDS", "dclzDS", "dcloDS",
> + "dsllvDST", "dlsaDSTA", "dsrlvDST", "dsravDST",
> + map_mul_r6, map_mulu_r6, map_div_r6, map_divu_r6,
> + map_dmul_r6, map_dmulu_r6, map_ddiv_r6, map_ddivu_r6,
> + "addDST", "addu|moveDST0", "subDST", "subu|neguDS0T",
> + "andDST", "or|moveDST0", "xorDST", "nor|notDST0",
> + false, false, "sltDST", "sltuDST",
> + "daddDST", "dadduDST", "dsubDST", "dsubuDST",
> + "tgeSTZ", "tgeuSTZ", "tltSTZ", "tltuSTZ",
> + "teqSTZ", "seleqzDST", "tneSTZ", "selnezDST",
> + "dsllDTA", false, "dsrlDTA", "dsraDTA",
> + "dsll32DTA", false, "dsrl32DTA", "dsra32DTA",
> +}
> +
> +local map_bshfl_r6 = {
> + shift = 9, mask = 3,
> + [1] = "alignDSTa",
> + _ = {
> + shift = 6, mask = 31,
> + [0] = "bitswapDT",
> + [2] = "wsbhDT",
> + [16] = "sebDT",
> + [24] = "sehDT",
> + }
> +}
> +
> +local map_dbshfl_r6 = {
> + shift = 9, mask = 3,
> + [1] = "dalignDSTa",
> + _ = {
> + shift = 6, mask = 31,
> + [0] = "dbitswapDT",
> + [2] = "dsbhDT",
> + [5] = "dshdDT",
> + }
> +}
> +
> +local map_special3_r6 = {
> + shift = 0, mask = 63,
> + [0] = "extTSAK", [1] = "dextmTSAP", [3] = "dextTSAK",
> + [4] = "insTSAL", [6] = "dinsuTSEQ", [7] = "dinsTSAL",
> + [32] = map_bshfl_r6, [36] = map_dbshfl_r6, [59] = "rdhwrTD",
> +}
> +
> +local map_regimm_r6 = {
> + shift = 16, mask = 31,
> + [0] = "bltzSB", [1] = "bgezSB",
> + [6] = "dahiSI", [30] = "datiSI",
> + [23] = "sigrieI", [31] = "synciSO",
> +}
> +
> +local map_pcrel_r6 = {
> + shift = 19, mask = 3,
> + [0] = "addiupcS2", "lwpcS2", "lwupcS2", {
> + shift = 18, mask = 1,
> + [0] = "ldpcS3", { shift = 16, mask = 3, [2] = "auipcSI", [3] = "aluipcSI" }
> + }
> +}
> +
> +local map_cop1s_r6 = {
> + shift = 0, mask = 63,
> + [0] = "add.sFGH", "sub.sFGH", "mul.sFGH", "div.sFGH",
> + "sqrt.sFG", "abs.sFG", "mov.sFG", "neg.sFG",
> + "round.l.sFG", "trunc.l.sFG", "ceil.l.sFG", "floor.l.sFG",
> + "round.w.sFG", "trunc.w.sFG", "ceil.w.sFG", "floor.w.sFG",
> + "sel.sFGH", false, false, false,
> + "seleqz.sFGH", "recip.sFG", "rsqrt.sFG", "selnez.sFGH",
> + "maddf.sFGH", "msubf.sFGH", "rint.sFG", "class.sFG",
> + "min.sFGH", "mina.sFGH", "max.sFGH", "maxa.sFGH",
> + false, "cvt.d.sFG", false, false,
> + "cvt.w.sFG", "cvt.l.sFG",
> +}
> +
> +local map_cop1d_r6 = {
> + shift = 0, mask = 63,
> + [0] = "add.dFGH", "sub.dFGH", "mul.dFGH", "div.dFGH",
> + "sqrt.dFG", "abs.dFG", "mov.dFG", "neg.dFG",
> + "round.l.dFG", "trunc.l.dFG", "ceil.l.dFG", "floor.l.dFG",
> + "round.w.dFG", "trunc.w.dFG", "ceil.w.dFG", "floor.w.dFG",
> + "sel.dFGH", false, false, false,
> + "seleqz.dFGH", "recip.dFG", "rsqrt.dFG", "selnez.dFGH",
> + "maddf.dFGH", "msubf.dFGH", "rint.dFG", "class.dFG",
> + "min.dFGH", "mina.dFGH", "max.dFGH", "maxa.dFGH",
> + "cvt.s.dFG", false, false, false,
> + "cvt.w.dFG", "cvt.l.dFG",
> +}
> +
> +local map_cop1w_r6 = {
> + shift = 0, mask = 63,
> + [0] = "cmp.af.sFGH", "cmp.un.sFGH", "cmp.eq.sFGH", "cmp.ueq.sFGH",
> + "cmp.lt.sFGH", "cmp.ult.sFGH", "cmp.le.sFGH", "cmp.ule.sFGH",
> + "cmp.saf.sFGH", "cmp.sun.sFGH", "cmp.seq.sFGH", "cmp.sueq.sFGH",
> + "cmp.slt.sFGH", "cmp.sult.sFGH", "cmp.sle.sFGH", "cmp.sule.sFGH",
> + false, "cmp.or.sFGH", "cmp.une.sFGH", "cmp.ne.sFGH",
> + false, false, false, false,
> + false, "cmp.sor.sFGH", "cmp.sune.sFGH", "cmp.sne.sFGH",
> + false, false, false, false,
> + "cvt.s.wFG", "cvt.d.wFG",
> +}
> +
> +local map_cop1l_r6 = {
> + shift = 0, mask = 63,
> + [0] = "cmp.af.dFGH", "cmp.un.dFGH", "cmp.eq.dFGH", "cmp.ueq.dFGH",
> + "cmp.lt.dFGH", "cmp.ult.dFGH", "cmp.le.dFGH", "cmp.ule.dFGH",
> + "cmp.saf.dFGH", "cmp.sun.dFGH", "cmp.seq.dFGH", "cmp.sueq.dFGH",
> + "cmp.slt.dFGH", "cmp.sult.dFGH", "cmp.sle.dFGH", "cmp.sule.dFGH",
> + false, "cmp.or.dFGH", "cmp.une.dFGH", "cmp.ne.dFGH",
> + false, false, false, false,
> + false, "cmp.sor.dFGH", "cmp.sune.dFGH", "cmp.sne.dFGH",
> + false, false, false, false,
> + "cvt.s.lFG", "cvt.d.lFG",
> +}
> +
> +local map_cop1_r6 = {
> + shift = 21, mask = 31,
> + [0] = "mfc1TG", "dmfc1TG", "cfc1TG", "mfhc1TG",
> + "mtc1TG", "dmtc1TG", "ctc1TG", "mthc1TG",
> + false, "bc1eqzHB", false, false,
> + false, "bc1nezHB", false, false,
> + map_cop1s_r6, map_cop1d_r6, false, false,
> + map_cop1w_r6, map_cop1l_r6,
> +}
> +
> +local function maprs_popTS(rs, rt)
> + if rt == 0 then return 0 elseif rs == 0 then return 1
> + elseif rs == rt then return 2 else return 3 end
> +end
> +
> +local map_pop06_r6 = {
> + maprs = maprs_popTS, [0] = "blezSB", "blezalcTB", "bgezalcTB", "bgeucSTB"
> +}
> +local map_pop07_r6 = {
> + maprs = maprs_popTS, [0] = "bgtzSB", "bgtzalcTB", "bltzalcTB", "bltucSTB"
> +}
> +local map_pop26_r6 = {
> + maprs = maprs_popTS, "blezcTB", "bgezcTB", "bgecSTB"
> +}
> +local map_pop27_r6 = {
> + maprs = maprs_popTS, "bgtzcTB", "bltzcTB", "bltcSTB"
> +}
> +
> +local function maprs_popS(rs, rt)
> + if rs == 0 then return 0 else return 1 end
> +end
> +
> +local map_pop66_r6 = {
> + maprs = maprs_popS, [0] = "jicTI", "beqzcSb"
> +}
> +local map_pop76_r6 = {
> + maprs = maprs_popS, [0] = "jialcTI", "bnezcSb"
> +}
> +
> +local function maprs_popST(rs, rt)
> + if rs >= rt then return 0 elseif rs == 0 then return 1 else return 2 end
> +end
> +
> +local map_pop10_r6 = {
> + maprs = maprs_popST, [0] = "bovcSTB", "beqzalcTB", "beqcSTB"
> +}
> +local map_pop30_r6 = {
> + maprs = maprs_popST, [0] = "bnvcSTB", "bnezalcTB", "bnecSTB"
> +}
> +
> +local map_pri_r6 = {
> + [0] = map_special_r6, map_regimm_r6, "jJ", "jalJ",
> + "beq|beqz|bST00B", "bne|bnezST0B", map_pop06_r6, map_pop07_r6,
> + map_pop10_r6, "addiu|liTS0I", "sltiTSI", "sltiuTSI",
> + "andiTSU", "ori|liTS0U", "xoriTSU", "aui|luiTS0U",
> + map_cop0, map_cop1_r6, false, false,
> + false, false, map_pop26_r6, map_pop27_r6,
> + map_pop30_r6, "daddiuTSI", false, false,
> + false, "dauiTSI", false, map_special3_r6,
> + "lbTSO", "lhTSO", false, "lwTSO",
> + "lbuTSO", "lhuTSO", false, false,
> + "sbTSO", "shTSO", false, "swTSO",
> + false, false, false, false,
> + false, "lwc1HSO", "bc#", false,
> + false, "ldc1HSO", map_pop66_r6, "ldTSO",
> + false, "swc1HSO", "balc#", map_pcrel_r6,
> + false, "sdc1HSO", map_pop76_r6, "sdTSO",
> +}
> +
> ------------------------------------------------------------------------------
>
> local map_gpr = {
> @@ -287,10 +494,14 @@ local function disass_ins(ctx)
> ctx.op = op
> ctx.rel = nil
>
> - local opat = map_pri[rshift(op, 26)]
> + local opat = ctx.map_pri[rshift(op, 26)]
> while type(opat) ~= "string" do
> if not opat then return unknown(ctx) end
> - opat = opat[band(rshift(op, opat.shift), opat.mask)] or opat._
> + if opat.maprs then
> + opat = opat[opat.maprs(band(rshift(op,21),31), band(rshift(op,16),31))]
> + else
> + opat = opat[band(rshift(op, opat.shift), opat.mask)] or opat._
> + end
> end
> local name, pat = match(opat, "^([a-z0-9_.]*)(.*)")
> local altname, pat2 = match(pat, "|([a-z0-9_.|]*)(.*)")
> @@ -314,6 +525,8 @@ local function disass_ins(ctx)
> x = "f"..band(rshift(op, 21), 31)
> elseif p == "A" then
> x = band(rshift(op, 6), 31)
> + elseif p == "a" then
> + x = band(rshift(op, 6), 7)
> elseif p == "E" then
> x = band(rshift(op, 6), 31) + 32
> elseif p == "M" then
> @@ -333,6 +546,10 @@ local function disass_ins(ctx)
> x = band(rshift(op, 11), 31) - last + 33
> elseif p == "I" then
> x = arshift(lshift(op, 16), 16)
> + elseif p == "2" then
> + x = arshift(lshift(op, 13), 11)
> + elseif p == "3" then
> + x = arshift(lshift(op, 14), 11)
> elseif p == "U" then
> x = band(op, 0xffff)
> elseif p == "O" then
> @@ -342,7 +559,15 @@ local function disass_ins(ctx)
> local index = map_gpr[band(rshift(op, 16), 31)]
> operands[#operands] = format("%s(%s)", index, last)
> elseif p == "B" then
> - x = ctx.addr + ctx.pos + arshift(lshift(op, 16), 16)*4 + 4
> + x = ctx.addr + ctx.pos + arshift(lshift(op, 16), 14) + 4
> + ctx.rel = x
> + x = format("0x%08x", x)
> + elseif p == "b" then
> + x = ctx.addr + ctx.pos + arshift(lshift(op, 11), 9) + 4
> + ctx.rel = x
> + x = format("0x%08x", x)
> + elseif p == "#" then
> + x = ctx.addr + ctx.pos + arshift(lshift(op, 6), 4) + 4
> ctx.rel = x
> x = format("0x%08x", x)
> elseif p == "J" then
> @@ -408,6 +633,7 @@ local function create(code, addr, out)
> ctx.disass = disass_block
> ctx.hexdump = 8
> ctx.get = get_be
> + ctx.map_pri = map_pri
> return ctx
> end
>
> @@ -417,6 +643,19 @@ local function create_el(code, addr, out)
> return ctx
> end
>
> +local function create_r6(code, addr, out)
> + local ctx = create(code, addr, out)
> + ctx.map_pri = map_pri_r6
> + return ctx
> +end
> +
> +local function create_r6_el(code, addr, out)
> + local ctx = create(code, addr, out)
> + ctx.get = get_le
> + ctx.map_pri = map_pri_r6
> + return ctx
> +end
> +
> -- Simple API: disassemble code (a string) at address and output via out.
> local function disass(code, addr, out)
> create(code, addr, out):disass()
> @@ -426,6 +665,14 @@ local function disass_el(code, addr, out)
> create_el(code, addr, out):disass()
> end
>
> +local function disass_r6(code, addr, out)
> + create_r6(code, addr, out):disass()
> +end
> +
> +local function disass_r6_el(code, addr, out)
> + create_r6_el(code, addr, out):disass()
> +end
> +
> -- Return register name for RID.
> local function regname(r)
> if r < 32 then return map_gpr[r] end
> @@ -436,8 +683,12 @@ end
> return {
> create = create,
> create_el = create_el,
> + create_r6 = create_r6,
> + create_r6_el = create_r6_el,
> disass = disass,
> disass_el = disass_el,
> + disass_r6 = disass_r6,
> + disass_r6_el = disass_r6_el,
> regname = regname
> }
>
> diff --git a/src/jit/dis_mips64r6.lua b/src/jit/dis_mips64r6.lua
> new file mode 100644
> index 00000000..023c05ab
> --- /dev/null
> +++ b/src/jit/dis_mips64r6.lua
> @@ -0,0 +1,17 @@
> +----------------------------------------------------------------------------
> +-- LuaJIT MIPS64R6 disassembler wrapper module.
> +--
> +-- Copyright (C) 2005-2017 Mike Pall. All rights reserved.
> +-- Released under the MIT license. See Copyright Notice in luajit.h
> +----------------------------------------------------------------------------
> +-- This module just exports the r6 big-endian functions from the
> +-- MIPS disassembler module. All the interesting stuff is there.
> +------------------------------------------------------------------------------
> +
> +local dis_mips = require((string.match(..., ".*%.") or "").."dis_mips")
> +return {
> + create = dis_mips.create_r6,
> + disass = dis_mips.disass_r6,
> + regname = dis_mips.regname
> +}
> +
> diff --git a/src/jit/dis_mips64r6el.lua b/src/jit/dis_mips64r6el.lua
> new file mode 100644
> index 00000000..f2988339
> --- /dev/null
> +++ b/src/jit/dis_mips64r6el.lua
> @@ -0,0 +1,17 @@
> +----------------------------------------------------------------------------
> +-- LuaJIT MIPS64R6EL disassembler wrapper module.
> +--
> +-- Copyright (C) 2005-2017 Mike Pall. All rights reserved.
> +-- Released under the MIT license. See Copyright Notice in luajit.h
> +----------------------------------------------------------------------------
> +-- This module just exports the r6 little-endian functions from the
> +-- MIPS disassembler module. All the interesting stuff is there.
> +------------------------------------------------------------------------------
> +
> +local dis_mips = require((string.match(..., ".*%.") or "").."dis_mips")
> +return {
> + create = dis_mips.create_r6_el,
> + disass = dis_mips.disass_r6_el,
> + regname = dis_mips.regname
> +}
> +
> diff --git a/src/lj_arch.h b/src/lj_arch.h
> index 0351e046..cf31a291 100644
> --- a/src/lj_arch.h
> +++ b/src/lj_arch.h
> @@ -342,18 +342,38 @@
> #elif LUAJIT_TARGET == LUAJIT_ARCH_MIPS32 || LUAJIT_TARGET == LUAJIT_ARCH_MIPS64
>
> #if defined(__MIPSEL__) || defined(__MIPSEL) || defined(_MIPSEL)
> +#if __mips_isa_rev >= 6
> +#define LJ_TARGET_MIPSR6 1
> +#define LJ_TARGET_UNALIGNED 1
> +#endif
> #if LUAJIT_TARGET == LUAJIT_ARCH_MIPS32
> +#if LJ_TARGET_MIPSR6
> +#define LJ_ARCH_NAME "mips32r6el"
> +#else
> #define LJ_ARCH_NAME "mipsel"
> +#endif
> +#else
> +#if LJ_TARGET_MIPSR6
> +#define LJ_ARCH_NAME "mips64r6el"
> #else
> #define LJ_ARCH_NAME "mips64el"
> #endif
> +#endif
> #define LJ_ARCH_ENDIAN LUAJIT_LE
> #else
> #if LUAJIT_TARGET == LUAJIT_ARCH_MIPS32
> +#if LJ_TARGET_MIPSR6
> +#define LJ_ARCH_NAME "mips32r6"
> +#else
> #define LJ_ARCH_NAME "mips"
> +#endif
> +#else
> +#if LJ_TARGET_MIPSR6
> +#define LJ_ARCH_NAME "mips64r6"
> #else
> #define LJ_ARCH_NAME "mips64"
> #endif
> +#endif
> #define LJ_ARCH_ENDIAN LUAJIT_BE
> #endif
>
> @@ -390,7 +410,9 @@
> #define LJ_TARGET_UNIFYROT 2 /* Want only IR_BROR. */
> #define LJ_ARCH_NUMMODE LJ_NUMMODE_DUAL
>
> -#if _MIPS_ARCH_MIPS32R2 || _MIPS_ARCH_MIPS64R2
> +#if LJ_TARGET_MIPSR6
> +#define LJ_ARCH_VERSION 60
> +#elif _MIPS_ARCH_MIPS32R2 || _MIPS_ARCH_MIPS64R2
> #define LJ_ARCH_VERSION 20
> #else
> #define LJ_ARCH_VERSION 10
> @@ -472,8 +494,13 @@
> #if !((defined(_MIPS_SIM_ABI32) && _MIPS_SIM == _MIPS_SIM_ABI32) || (defined(_ABIO32) && _MIPS_SIM == _ABIO32))
> #error "Only o32 ABI supported for MIPS32"
> #endif
> +#if LJ_TARGET_MIPSR6
> +/* Not that useful, since most available r6 CPUs are 64 bit. */
> +#error "No support for MIPS32R6"
> +#endif
> #elif LJ_TARGET_MIPS64
> #if !((defined(_MIPS_SIM_ABI64) && _MIPS_SIM == _MIPS_SIM_ABI64) || (defined(_ABI64) && _MIPS_SIM == _ABI64))
> +/* MIPS32ON64 aka n32 ABI support might be desirable, but difficult. */
> #error "Only n64 ABI supported for MIPS64"
> #endif
> #endif
> diff --git a/src/lj_asm.c b/src/lj_asm.c
> index 25b96264..96b8c032 100644
> --- a/src/lj_asm.c
> +++ b/src/lj_asm.c
> @@ -2159,8 +2159,8 @@ static void asm_setup_regsp(ASMState *as)
> ir->prev = REGSP_HINT(RID_FPRET);
> continue;
> }
> - /* fallthrough */
> #endif
> + /* fallthrough */
> case IR_CALLN: case IR_CALLXS:
> #if LJ_SOFTFP
> case IR_MIN: case IR_MAX:
> diff --git a/src/lj_asm_mips.h b/src/lj_asm_mips.h
> index 23ffc3aa..4626507b 100644
> --- a/src/lj_asm_mips.h
> +++ b/src/lj_asm_mips.h
> @@ -101,7 +101,12 @@ static void asm_guard(ASMState *as, MIPSIns mi, Reg rs, Reg rt)
> as->invmcp = NULL;
> as->loopinv = 1;
> as->mcp = p+1;
> +#if !LJ_TARGET_MIPSR6
> mi = mi ^ ((mi>>28) == 1 ? 0x04000000u : 0x00010000u); /* Invert cond. */
> +#else
> + mi = mi ^ ((mi>>28) == 1 ? 0x04000000u :
> + (mi>>28) == 4 ? 0x00800000u : 0x00010000u); /* Invert cond. */
> +#endif
> target = p; /* Patch target later in asm_loop_fixup. */
> }
> emit_ti(as, MIPSI_LI, RID_TMP, as->snapno);
> @@ -410,7 +415,11 @@ static void asm_callround(ASMState *as, IRIns *ir, IRCallID id)
> {
> /* The modified regs must match with the *.dasc implementation. */
> RegSet drop = RID2RSET(RID_R1)|RID2RSET(RID_R12)|RID2RSET(RID_FPRET)|
> - RID2RSET(RID_F2)|RID2RSET(RID_F4)|RID2RSET(REGARG_FIRSTFPR);
> + RID2RSET(RID_F2)|RID2RSET(RID_F4)|RID2RSET(REGARG_FIRSTFPR)
> +#if LJ_TARGET_MIPSR6
> + |RID2RSET(RID_F21)
> +#endif
> + ;
> if (ra_hasreg(ir->r)) rset_clear(drop, ir->r);
> ra_evictset(as, drop);
> ra_destreg(as, ir, RID_FPRET);
> @@ -444,8 +453,13 @@ static void asm_tointg(ASMState *as, IRIns *ir, Reg left)
> {
> Reg tmp = ra_scratch(as, rset_exclude(RSET_FPR, left));
> Reg dest = ra_dest(as, ir, RSET_GPR);
> +#if !LJ_TARGET_MIPSR6
> asm_guard(as, MIPSI_BC1F, 0, 0);
> emit_fgh(as, MIPSI_C_EQ_D, 0, tmp, left);
> +#else
> + asm_guard(as, MIPSI_BC1EQZ, 0, (tmp&31));
> + emit_fgh(as, MIPSI_CMP_EQ_D, tmp, tmp, left);
> +#endif
> emit_fg(as, MIPSI_CVT_D_W, tmp, tmp);
> emit_tg(as, MIPSI_MFC1, dest, tmp);
> emit_fg(as, MIPSI_CVT_W_D, tmp, left);
> @@ -599,8 +613,13 @@ static void asm_conv(ASMState *as, IRIns *ir)
> (void *)&as->J->k64[LJ_K64_M2P64],
> rset_exclude(RSET_GPR, dest));
> emit_fg(as, MIPSI_TRUNC_L_D, tmp, left); /* Delay slot. */
> - emit_branch(as, MIPSI_BC1T, 0, 0, l_end);
> - emit_fgh(as, MIPSI_C_OLT_D, 0, left, tmp);
> +#if !LJ_TARGET_MIPSR6
> + emit_branch(as, MIPSI_BC1T, 0, 0, l_end);
> + emit_fgh(as, MIPSI_C_OLT_D, 0, left, tmp);
> +#else
> + emit_branch(as, MIPSI_BC1NEZ, 0, (left&31), l_end);
> + emit_fgh(as, MIPSI_CMP_LT_D, left, left, tmp);
> +#endif
> emit_lsptr(as, MIPSI_LDC1, (tmp & 31),
> (void *)&as->J->k64[LJ_K64_2P63],
> rset_exclude(RSET_GPR, dest));
> @@ -611,8 +630,13 @@ static void asm_conv(ASMState *as, IRIns *ir)
> (void *)&as->J->k32[LJ_K32_M2P64],
> rset_exclude(RSET_GPR, dest));
> emit_fg(as, MIPSI_TRUNC_L_S, tmp, left); /* Delay slot. */
> - emit_branch(as, MIPSI_BC1T, 0, 0, l_end);
> - emit_fgh(as, MIPSI_C_OLT_S, 0, left, tmp);
> +#if !LJ_TARGET_MIPSR6
> + emit_branch(as, MIPSI_BC1T, 0, 0, l_end);
> + emit_fgh(as, MIPSI_C_OLT_S, 0, left, tmp);
> +#else
> + emit_branch(as, MIPSI_BC1NEZ, 0, (left&31), l_end);
> + emit_fgh(as, MIPSI_CMP_LT_S, left, left, tmp);
> +#endif
> emit_lsptr(as, MIPSI_LWC1, (tmp & 31),
> (void *)&as->J->k32[LJ_K32_2P63],
> rset_exclude(RSET_GPR, dest));
> @@ -840,8 +864,12 @@ static void asm_aref(ASMState *as, IRIns *ir)
> }
> base = ra_alloc1(as, ir->op1, RSET_GPR);
> idx = ra_alloc1(as, ir->op2, rset_exclude(RSET_GPR, base));
> +#if !LJ_TARGET_MIPSR6
> emit_dst(as, MIPSI_AADDU, dest, RID_TMP, base);
> emit_dta(as, MIPSI_SLL, RID_TMP, idx, 3);
> +#else
> + emit_dst(as, MIPSI_ALSA | MIPSF_A(3-1), dest, idx, base);
> +#endif
> }
>
> /* Inlined hash lookup. Specialized for key type and for const keys.
> @@ -944,8 +972,13 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
> l_end = asm_exitstub_addr(as);
> }
> if (!LJ_SOFTFP && irt_isnum(kt)) {
> +#if !LJ_TARGET_MIPSR6
> emit_branch(as, MIPSI_BC1T, 0, 0, l_end);
> emit_fgh(as, MIPSI_C_EQ_D, 0, tmpnum, key);
> +#else
> + emit_branch(as, MIPSI_BC1NEZ, 0, (tmpnum&31), l_end);
> + emit_fgh(as, MIPSI_CMP_EQ_D, tmpnum, tmpnum, key);
> +#endif
> *--as->mcp = MIPSI_NOP; /* Avoid NaN comparison overhead. */
> emit_branch(as, MIPSI_BEQ, tmp1, RID_ZERO, l_next);
> emit_tsi(as, MIPSI_SLTIU, tmp1, tmp1, (int32_t)LJ_TISNUM);
> @@ -1196,7 +1229,9 @@ static MIPSIns asm_fxloadins(IRIns *ir)
> case IRT_I16: return MIPSI_LH;
> case IRT_U16: return MIPSI_LHU;
> case IRT_NUM: lua_assert(!LJ_SOFTFP32); if (!LJ_SOFTFP) return MIPSI_LDC1;
> + /* fallthrough */
> case IRT_FLOAT: if (!LJ_SOFTFP) return MIPSI_LWC1;
> + /* fallthrough */
> default: return (LJ_64 && irt_is64(ir->t)) ? MIPSI_LD : MIPSI_LW;
> }
> }
> @@ -1207,7 +1242,9 @@ static MIPSIns asm_fxstoreins(IRIns *ir)
> case IRT_I8: case IRT_U8: return MIPSI_SB;
> case IRT_I16: case IRT_U16: return MIPSI_SH;
> case IRT_NUM: lua_assert(!LJ_SOFTFP32); if (!LJ_SOFTFP) return MIPSI_SDC1;
> + /* fallthrough */
> case IRT_FLOAT: if (!LJ_SOFTFP) return MIPSI_SWC1;
> + /* fallthrough */
> default: return (LJ_64 && irt_is64(ir->t)) ? MIPSI_SD : MIPSI_SW;
> }
> }
> @@ -1253,7 +1290,7 @@ static void asm_xload(ASMState *as, IRIns *ir)
> {
> Reg dest = ra_dest(as, ir,
> (!LJ_SOFTFP && irt_isfp(ir->t)) ? RSET_FPR : RSET_GPR);
> - lua_assert(!(ir->op2 & IRXLOAD_UNALIGNED));
> + lua_assert(LJ_TARGET_UNALIGNED || !(ir->op2 & IRXLOAD_UNALIGNED));
> asm_fusexref(as, asm_fxloadins(ir), dest, ir->op1, RSET_GPR, 0);
> }
>
> @@ -1545,7 +1582,7 @@ static void asm_cnew(ASMState *as, IRIns *ir)
> ofs -= 4; if (LJ_BE) ir++; else ir--;
> }
> #else
> - emit_tsi(as, MIPSI_SD, ra_alloc1(as, ir->op2, allow),
> + emit_tsi(as, sz == 8 ? MIPSI_SD : MIPSI_SW, ra_alloc1(as, ir->op2, allow),
> RID_RET, sizeof(GCcdata));
> #endif
> lua_assert(sz == 4 || sz == 8);
> @@ -1678,6 +1715,7 @@ static void asm_add(ASMState *as, IRIns *ir)
> } else
> #endif
> {
> + /* TODO MIPSR6: Fuse ADD(BSHL(a,1-4),b) or ADD(ADD(a,a),b) to MIPSI_ALSA. */
> Reg dest = ra_dest(as, ir, RSET_GPR);
> Reg right, left = ra_hintalloc(as, ir->op1, dest, RSET_GPR);
> if (irref_isk(ir->op2)) {
> @@ -1722,8 +1760,12 @@ static void asm_mul(ASMState *as, IRIns *ir)
> Reg right, left = ra_alloc2(as, ir, RSET_GPR);
> right = (left >> 8); left &= 255;
> if (LJ_64 && irt_is64(ir->t)) {
> +#if !LJ_TARGET_MIPSR6
> emit_dst(as, MIPSI_MFLO, dest, 0, 0);
> emit_dst(as, MIPSI_DMULT, 0, left, right);
> +#else
> + emit_dst(as, MIPSI_DMUL, dest, left, right);
> +#endif
> } else {
> emit_dst(as, MIPSI_MUL, dest, left, right);
> }
> @@ -1806,6 +1848,7 @@ static void asm_abs(ASMState *as, IRIns *ir)
>
> static void asm_arithov(ASMState *as, IRIns *ir)
> {
> + /* TODO MIPSR6: bovc/bnvc. Caveat: no delay slot to load RID_TMP. */
> Reg right, left, tmp, dest = ra_dest(as, ir, RSET_GPR);
> lua_assert(!irt_is64(ir->t));
> if (irref_isk(ir->op2)) {
> @@ -1850,9 +1893,14 @@ static void asm_mulov(ASMState *as, IRIns *ir)
> right), dest));
> asm_guard(as, MIPSI_BNE, RID_TMP, tmp);
> emit_dta(as, MIPSI_SRA, RID_TMP, dest, 31);
> +#if !LJ_TARGET_MIPSR6
> emit_dst(as, MIPSI_MFHI, tmp, 0, 0);
> emit_dst(as, MIPSI_MFLO, dest, 0, 0);
> emit_dst(as, MIPSI_MULT, 0, left, right);
> +#else
> + emit_dst(as, MIPSI_MUL, dest, left, right);
> + emit_dst(as, MIPSI_MUH, tmp, left, right);
> +#endif
> }
>
> #if LJ_32 && LJ_HASFFI
> @@ -2076,6 +2124,7 @@ static void asm_min_max(ASMState *as, IRIns *ir, int ismax)
> Reg dest = ra_dest(as, ir, RSET_FPR);
> Reg right, left = ra_alloc2(as, ir, RSET_FPR);
> right = (left >> 8); left &= 255;
> +#if !LJ_TARGET_MIPSR6
> if (dest == left) {
> emit_fg(as, MIPSI_MOVT_D, dest, right);
> } else {
> @@ -2083,19 +2132,37 @@ static void asm_min_max(ASMState *as, IRIns *ir, int ismax)
> if (dest != right) emit_fg(as, MIPSI_MOV_D, dest, right);
> }
> emit_fgh(as, MIPSI_C_OLT_D, 0, ismax ? left : right, ismax ? right : left);
> +#else
> + emit_fgh(as, ismax ? MIPSI_MAX_D : MIPSI_MIN_D, dest, left, right);
> +#endif
> #endif
> } else {
> Reg dest = ra_dest(as, ir, RSET_GPR);
> Reg right, left = ra_alloc2(as, ir, RSET_GPR);
> right = (left >> 8); left &= 255;
> - if (dest == left) {
> - emit_dst(as, MIPSI_MOVN, dest, right, RID_TMP);
> + if (left == right) {
> + if (dest != left) emit_move(as, dest, left);
> } else {
> - emit_dst(as, MIPSI_MOVZ, dest, left, RID_TMP);
> - if (dest != right) emit_move(as, dest, right);
> +#if !LJ_TARGET_MIPSR6
> + if (dest == left) {
> + emit_dst(as, MIPSI_MOVN, dest, right, RID_TMP);
> + } else {
> + emit_dst(as, MIPSI_MOVZ, dest, left, RID_TMP);
> + if (dest != right) emit_move(as, dest, right);
> + }
> +#else
> + emit_dst(as, MIPSI_OR, dest, dest, RID_TMP);
> + if (dest != right) {
> + emit_dst(as, MIPSI_SELNEZ, RID_TMP, right, RID_TMP);
> + emit_dst(as, MIPSI_SELEQZ, dest, left, RID_TMP);
> + } else {
> + emit_dst(as, MIPSI_SELEQZ, RID_TMP, left, RID_TMP);
> + emit_dst(as, MIPSI_SELNEZ, dest, right, RID_TMP);
> + }
> +#endif
> + emit_dst(as, MIPSI_SLT, RID_TMP,
> + ismax ? left : right, ismax ? right : left);
> }
> - emit_dst(as, MIPSI_SLT, RID_TMP,
> - ismax ? left : right, ismax ? right : left);
> }
> }
>
> @@ -2179,10 +2246,18 @@ static void asm_comp(ASMState *as, IRIns *ir)
> #if LJ_SOFTFP
> asm_sfpcomp(as, ir);
> #else
> +#if !LJ_TARGET_MIPSR6
> Reg right, left = ra_alloc2(as, ir, RSET_FPR);
> right = (left >> 8); left &= 255;
> asm_guard(as, (op&1) ? MIPSI_BC1T : MIPSI_BC1F, 0, 0);
> emit_fgh(as, MIPSI_C_OLT_D + ((op&3) ^ ((op>>2)&1)), 0, left, right);
> +#else
> + Reg tmp, right, left = ra_alloc2(as, ir, RSET_FPR);
> + right = (left >> 8); left &= 255;
> + tmp = ra_scratch(as, rset_exclude(rset_exclude(RSET_FPR, left), right));
> + asm_guard(as, (op&1) ? MIPSI_BC1NEZ : MIPSI_BC1EQZ, 0, (tmp&31));
> + emit_fgh(as, MIPSI_CMP_LT_D + ((op&3) ^ ((op>>2)&1)), tmp, left, right);
> +#endif
> #endif
> } else {
> Reg right, left = ra_alloc1(as, ir->op1, RSET_GPR);
> @@ -2218,9 +2293,13 @@ static void asm_equal(ASMState *as, IRIns *ir)
> if (!LJ_SOFTFP32 && irt_isnum(ir->t)) {
> #if LJ_SOFTFP
> asm_sfpcomp(as, ir);
> -#else
> +#elif !LJ_TARGET_MIPSR6
> asm_guard(as, (ir->o & 1) ? MIPSI_BC1T : MIPSI_BC1F, 0, 0);
> emit_fgh(as, MIPSI_C_EQ_D, 0, left, right);
> +#else
> + Reg tmp = ra_scratch(as, rset_exclude(rset_exclude(RSET_FPR, left), right));
> + asm_guard(as, (ir->o & 1) ? MIPSI_BC1NEZ : MIPSI_BC1EQZ, 0, (tmp&31));
> + emit_fgh(as, MIPSI_CMP_EQ_D, tmp, left, right);
> #endif
> } else {
> asm_guard(as, (ir->o & 1) ? MIPSI_BEQ : MIPSI_BNE, left, right);
> @@ -2623,7 +2702,12 @@ void lj_asm_patchexit(jit_State *J, GCtrace *T, ExitNo exitno, MCode *target)
> if (((p[-1] ^ (px-p)) & 0xffffu) == 0 &&
> ((p[-1] & 0xf0000000u) == MIPSI_BEQ ||
> (p[-1] & 0xfc1e0000u) == MIPSI_BLTZ ||
> - (p[-1] & 0xffe00000u) == MIPSI_BC1F)) {
> +#if !LJ_TARGET_MIPSR6
> + (p[-1] & 0xffe00000u) == MIPSI_BC1F
> +#else
> + (p[-1] & 0xff600000u) == MIPSI_BC1EQZ
> +#endif
> + )) {
> ptrdiff_t delta = target - p;
> if (((delta + 0x8000) >> 16) == 0) { /* Patch in-range branch. */
> patchbranch:
> diff --git a/src/lj_emit_mips.h b/src/lj_emit_mips.h
> index bb6593ae..313d030a 100644
> --- a/src/lj_emit_mips.h
> +++ b/src/lj_emit_mips.h
> @@ -138,6 +138,7 @@ static void emit_loadu64(ASMState *as, Reg r, uint64_t u64)
> } else if (emit_kdelta1(as, r, (intptr_t)u64)) {
> return;
> } else {
> + /* TODO MIPSR6: Use DAHI & DATI. Caveat: sign-extension. */
> if ((u64 & 0xffff)) {
> emit_tsi(as, MIPSI_ORI, r, r, u64 & 0xffff);
> }
> @@ -236,10 +237,22 @@ static void emit_jmp(ASMState *as, MCode *target)
> static void emit_call(ASMState *as, void *target, int needcfa)
> {
> MCode *p = as->mcp;
> - *--p = MIPSI_NOP;
> +#if LJ_TARGET_MIPSR6
> + ptrdiff_t delta = (char *)target - (char *)p;
> + if ((((delta>>2) + 0x02000000) >> 26) == 0) { /* Try compact call first. */
> + *--p = MIPSI_BALC | (((uintptr_t)delta >>2) & 0x03ffffffu);
> + as->mcp = p;
> + return;
> + }
> +#endif
> + *--p = MIPSI_NOP; /* Delay slot. */
> if ((((uintptr_t)target ^ (uintptr_t)p) >> 28) == 0) {
> +#if !LJ_TARGET_MIPSR6
> *--p = (((uintptr_t)target & 1) ? MIPSI_JALX : MIPSI_JAL) |
> (((uintptr_t)target >>2) & 0x03ffffffu);
> +#else
> + *--p = MIPSI_JAL | (((uintptr_t)target >>2) & 0x03ffffffu);
> +#endif
> } else { /* Target out of range: need indirect call. */
> *--p = MIPSI_JALR | MIPSF_S(RID_CFUNCADDR);
> needcfa = 1;
> diff --git a/src/lj_jit.h b/src/lj_jit.h
> index c06829ab..a8b6f9a7 100644
> --- a/src/lj_jit.h
> +++ b/src/lj_jit.h
> @@ -51,10 +51,18 @@
> /* Names for the CPU-specific flags. Must match the order above.
*/ > #define JIT_F_CPU_FIRST JIT_F_MIPSXXR2 > #if LJ_TARGET_MIPS32 > +#if LJ_TARGET_MIPSR6 > +#define JIT_F_CPUSTRING "\010MIPS32R6" > +#else > #define JIT_F_CPUSTRING "\010MIPS32R2" > +#endif > +#else > +#if LJ_TARGET_MIPSR6 > +#define JIT_F_CPUSTRING "\010MIPS64R6" > #else > #define JIT_F_CPUSTRING "\010MIPS64R2" > #endif > +#endif > #else > #define JIT_F_CPU_FIRST 0 > #define JIT_F_CPUSTRING "" > diff --git a/src/lj_target_mips.h b/src/lj_target_mips.h > index 740687b3..84db6012 100644 > --- a/src/lj_target_mips.h > +++ b/src/lj_target_mips.h > @@ -223,6 +223,8 @@ typedef enum MIPSIns { > MIPSI_ADDIU = 0x24000000, > MIPSI_SUB = 0x00000022, > MIPSI_SUBU = 0x00000023, > + > +#if !LJ_TARGET_MIPSR6 > MIPSI_MUL = 0x70000002, > MIPSI_DIV = 0x0000001a, > MIPSI_DIVU = 0x0000001b, > @@ -232,6 +234,15 @@ typedef enum MIPSIns { > MIPSI_MFHI = 0x00000010, > MIPSI_MFLO = 0x00000012, > MIPSI_MULT = 0x00000018, > +#else > + MIPSI_MUL = 0x00000098, > + MIPSI_MUH = 0x000000d8, > + MIPSI_DIV = 0x0000009a, > + MIPSI_DIVU = 0x0000009b, > + > + MIPSI_SELEQZ = 0x00000035, > + MIPSI_SELNEZ = 0x00000037, > +#endif > > MIPSI_SLL = 0x00000000, > MIPSI_SRL = 0x00000002, > @@ -253,8 +264,13 @@ typedef enum MIPSIns { > MIPSI_B = 0x10000000, > MIPSI_J = 0x08000000, > MIPSI_JAL = 0x0c000000, > +#if !LJ_TARGET_MIPSR6 > MIPSI_JALX = 0x74000000, > MIPSI_JR = 0x00000008, > +#else > + MIPSI_JR = 0x00000009, > + MIPSI_BALC = 0xe8000000, > +#endif > MIPSI_JALR = 0x0000f809, > > MIPSI_BEQ = 0x10000000, > @@ -282,15 +298,23 @@ typedef enum MIPSIns { > > /* MIPS64 instructions. 
*/ > MIPSI_DADD = 0x0000002c, > - MIPSI_DADDI = 0x60000000, > MIPSI_DADDU = 0x0000002d, > MIPSI_DADDIU = 0x64000000, > MIPSI_DSUB = 0x0000002e, > MIPSI_DSUBU = 0x0000002f, > +#if !LJ_TARGET_MIPSR6 > MIPSI_DDIV = 0x0000001e, > MIPSI_DDIVU = 0x0000001f, > MIPSI_DMULT = 0x0000001c, > MIPSI_DMULTU = 0x0000001d, > +#else > + MIPSI_DDIV = 0x0000009e, > + MIPSI_DMOD = 0x000000de, > + MIPSI_DDIVU = 0x0000009f, > + MIPSI_DMODU = 0x000000df, > + MIPSI_DMUL = 0x0000009c, > + MIPSI_DMUH = 0x000000dc, > +#endif > > MIPSI_DSLL = 0x00000038, > MIPSI_DSRL = 0x0000003a, > @@ -308,6 +332,11 @@ typedef enum MIPSIns { > MIPSI_ASUBU = LJ_32 ? MIPSI_SUBU : MIPSI_DSUBU, > MIPSI_AL = LJ_32 ? MIPSI_LW : MIPSI_LD, > MIPSI_AS = LJ_32 ? MIPSI_SW : MIPSI_SD, > +#if LJ_TARGET_MIPSR6 > + MIPSI_LSA = 0x00000005, > + MIPSI_DLSA = 0x00000015, > + MIPSI_ALSA = LJ_32 ? MIPSI_LSA : MIPSI_DLSA, > +#endif > > /* Extract/insert instructions. */ > MIPSI_DEXTM = 0x7c000001, > @@ -317,18 +346,19 @@ typedef enum MIPSIns { > MIPSI_DINSU = 0x7c000006, > MIPSI_DINS = 0x7c000007, > > - MIPSI_RINT_D = 0x4620001a, > - MIPSI_RINT_S = 0x4600001a, > - MIPSI_RINT = 0x4400001a, > MIPSI_FLOOR_D = 0x4620000b, > - MIPSI_CEIL_D = 0x4620000a, > - MIPSI_ROUND_D = 0x46200008, > > /* FP instructions. 
*/ > MIPSI_MOV_S = 0x46000006, > MIPSI_MOV_D = 0x46200006, > +#if !LJ_TARGET_MIPSR6 > MIPSI_MOVT_D = 0x46210011, > MIPSI_MOVF_D = 0x46200011, > +#else > + MIPSI_MIN_D = 0x4620001C, > + MIPSI_MAX_D = 0x4620001E, > + MIPSI_SEL_D = 0x46200010, > +#endif > > MIPSI_ABS_D = 0x46200005, > MIPSI_NEG_D = 0x46200007, > @@ -363,15 +393,23 @@ typedef enum MIPSIns { > MIPSI_DMTC1 = 0x44a00000, > MIPSI_DMFC1 = 0x44200000, > > +#if !LJ_TARGET_MIPSR6 > MIPSI_BC1F = 0x45000000, > MIPSI_BC1T = 0x45010000, > - > MIPSI_C_EQ_D = 0x46200032, > MIPSI_C_OLT_S = 0x46000034, > MIPSI_C_OLT_D = 0x46200034, > MIPSI_C_ULT_D = 0x46200035, > MIPSI_C_OLE_D = 0x46200036, > MIPSI_C_ULE_D = 0x46200037, > +#else > + MIPSI_BC1EQZ = 0x45200000, > + MIPSI_BC1NEZ = 0x45a00000, > + MIPSI_CMP_EQ_D = 0x46a00002, > + MIPSI_CMP_LT_S = 0x46800004, > + MIPSI_CMP_LT_D = 0x46a00004, > +#endif > + > } MIPSIns; > > #endif > diff --git a/src/vm_mips64.dasc b/src/vm_mips64.dasc > index 9839b5ac..44fba36c 100644 > --- a/src/vm_mips64.dasc > +++ b/src/vm_mips64.dasc > @@ -83,6 +83,10 @@ > | > |.define FRET1, f0 > |.define FRET2, f2 > +| > +|.define FTMP0, f20 > +|.define FTMP1, f21 > +|.define FTMP2, f22 > |.endif > | > |// Stack layout while in interpreter. Must match with lj_frame.h. > @@ -310,10 +314,10 @@ > |.endmacro > | > |// Assumes DISPATCH is relative to GL. 
> -#define DISPATCH_GL(field) (GG_DISP2G + (int)offsetof(global_State, field)) > -#define DISPATCH_J(field) (GG_DISP2J + (int)offsetof(jit_State, field)) > -#define GG_DISP2GOT (GG_OFS(got) - GG_OFS(dispatch)) > -#define DISPATCH_GOT(name) (GG_DISP2GOT + sizeof(void*)*LJ_GOT_##name) > +#define DISPATCH_GL(field) (GG_DISP2G + (int)offsetof(global_State, field)) > +#define DISPATCH_J(field) (GG_DISP2J + (int)offsetof(jit_State, field)) > +#define GG_DISP2GOT (GG_OFS(got) - GG_OFS(dispatch)) > +#define DISPATCH_GOT(name) (GG_DISP2GOT + sizeof(void*)*LJ_GOT_##name) > | > #define PC2PROTO(field) ((int)offsetof(GCproto, field)-(int)sizeof(GCproto)) > | > @@ -492,8 +496,15 @@ static void build_subroutines(BuildCtx *ctx) > |7: // Less results wanted. > | subu TMP0, RD, TMP2 > | dsubu TMP0, BASE, TMP0 // Either keep top or shrink it. > + |.if MIPSR6 > + | selnez TMP0, TMP0, TMP2 // LUA_MULTRET+1 case? > + | seleqz BASE, BASE, TMP2 > + | b <3 > + |. or BASE, BASE, TMP0 > + |.else > | b <3 > |. movn BASE, TMP0, TMP2 // LUA_MULTRET+1 case? > + |.endif > | > |8: // Corner case: need to grow stack for filling up results. > | // This can happen if: > @@ -1125,11 +1136,16 @@ static void build_subroutines(BuildCtx *ctx) > |.endmacro > | > |// Inlined GC threshold check. Caveat: uses TMP0 and TMP1 and has delay slot! > + |// MIPSR6: no delay slot, but a forbidden slot. 
> |.macro ffgccheck > | ld TMP0, DISPATCH_GL(gc.total)(DISPATCH) > | ld TMP1, DISPATCH_GL(gc.threshold)(DISPATCH) > | dsubu AT, TMP0, TMP1 > + |.if MIPSR6 > + | bgezalc AT, ->fff_gcstep > + |.else > | bgezal AT, ->fff_gcstep > + |.endif > |.endmacro > | > |//-- Base library: checks ----------------------------------------------- > @@ -1157,7 +1173,13 @@ static void build_subroutines(BuildCtx *ctx) > | sltu TMP1, TISNUM, TMP0 > | not TMP2, TMP0 > | li TMP3, ~LJ_TISNUM > + |.if MIPSR6 > + | selnez TMP2, TMP2, TMP1 > + | seleqz TMP3, TMP3, TMP1 > + | or TMP2, TMP2, TMP3 > + |.else > | movz TMP2, TMP3, TMP1 > + |.endif > | dsll TMP2, TMP2, 3 > | daddu TMP2, CFUNC:RB, TMP2 > | b ->fff_restv > @@ -1169,7 +1191,11 @@ static void build_subroutines(BuildCtx *ctx) > | gettp TMP2, CARG1 > | daddiu TMP0, TMP2, -LJ_TTAB > | daddiu TMP1, TMP2, -LJ_TUDATA > + |.if MIPSR6 > + | selnez TMP0, TMP1, TMP0 > + |.else > | movn TMP0, TMP1, TMP0 > + |.endif > | bnez TMP0, >6 > |. cleartp TAB:CARG1 > |1: // Field metatable must be at same offset for GCtab and GCudata! > @@ -1208,7 +1234,13 @@ static void build_subroutines(BuildCtx *ctx) > | > |6: > | sltiu AT, TMP2, LJ_TISNUM > + |.if MIPSR6 > + | selnez TMP0, TISNUM, AT > + | seleqz AT, TMP2, AT > + | or TMP2, TMP0, AT > + |.else > | movn TMP2, TISNUM, AT > + |.endif > | dsll TMP2, TMP2, 3 > | dsubu TMP0, DISPATCH, TMP2 > | b <2 > @@ -1270,8 +1302,13 @@ static void build_subroutines(BuildCtx *ctx) > | or TMP0, TMP0, TMP1 > | bnez TMP0, ->fff_fallback > |. sd BASE, L->base // Add frame since C call can throw. > + |.if MIPSR6 > + | sd PC, SAVE_PC // Redundant (but a defined value). > + | ffgccheck > + |.else > | ffgccheck > |. sd PC, SAVE_PC // Redundant (but a defined value). 
> + |.endif > | load_got lj_strfmt_number > | move CARG1, L > | call_intern lj_strfmt_number // (lua_State *L, cTValue *o) > @@ -1441,8 +1478,15 @@ static void build_subroutines(BuildCtx *ctx) > | addiu AT, TMP0, -LUA_YIELD > | daddu CARG3, CARG2, TMP0 > | daddiu TMP3, CARG2, 8 > + |.if MIPSR6 > + | seleqz CARG2, CARG2, AT > + | selnez TMP3, TMP3, AT > + | bgtz AT, ->fff_fallback // st > LUA_YIELD? > + |. or CARG2, TMP3, CARG2 > + |.else > | bgtz AT, ->fff_fallback // st > LUA_YIELD? > |. movn CARG2, TMP3, AT > + |.endif > | xor TMP2, TMP2, CARG3 > | bnez TMP1, ->fff_fallback // cframe != 0? > |. or AT, TMP2, TMP0 > @@ -1754,7 +1798,7 @@ static void build_subroutines(BuildCtx *ctx) > | b ->fff_res > |. li RD, (2+1)*8 > | > - |.macro math_minmax, name, intins, fpins > + |.macro math_minmax, name, intins, intinsc, fpins > | .ffunc_1 name > | daddu TMP3, BASE, NARGS8:RC > | checkint CARG1, >5 > @@ -1766,7 +1810,13 @@ static void build_subroutines(BuildCtx *ctx) > |. sextw CARG1, CARG1 > | lw CARG2, LO(TMP2) > |. slt AT, CARG1, CARG2 > + |.if MIPSR6 > + | intins TMP1, CARG2, AT > + | intinsc CARG1, CARG1, AT > + | or CARG1, CARG1, TMP1 > + |.else > | intins CARG1, CARG2, AT > + |.endif > | daddiu TMP2, TMP2, 8 > | zextw CARG1, CARG1 > | b <1 > @@ -1802,13 +1852,23 @@ static void build_subroutines(BuildCtx *ctx) > |. nop > |7: > |.if FPU > + |.if MIPSR6 > + | fpins FRET1, FRET1, FARG1 > + |.else > | c.olt.d FRET1, FARG1 > | fpins FRET1, FARG1 > + |.endif > |.else > | bal ->vm_sfcmpolt > |. nop > + |.if MIPSR6 > + | intins AT, CARG2, CRET1 > + | intinsc CARG1, CARG1, CRET1 > + | or CARG1, CARG1, AT > + |.else > | intins CARG1, CARG2, CRET1 > |.endif > + |.endif > | b <6 > |. 
daddiu TMP2, TMP2, 8 > | > @@ -1828,8 +1888,13 @@ static void build_subroutines(BuildCtx *ctx) > | > |.endmacro > | > - | math_minmax math_min, movz, movf.d > - | math_minmax math_max, movn, movt.d > + |.if MIPSR6 > + | math_minmax math_min, seleqz, selnez, min.d > + | math_minmax math_max, selnez, seleqz, max.d > + |.else > + | math_minmax math_min, movz, _, movf.d > + | math_minmax math_max, movn, _, movt.d > + |.endif > | > |//-- String library ----------------------------------------------------- > | > @@ -1854,7 +1919,9 @@ static void build_subroutines(BuildCtx *ctx) > | > |.ffunc string_char // Only handle the 1-arg case here. > | ffgccheck > + |.if not MIPSR6 > |. nop > + |.endif > | ld CARG1, 0(BASE) > | gettp TMP0, CARG1 > | xori AT, NARGS8:RC, 8 // Exactly 1 argument. > @@ -1884,7 +1951,9 @@ static void build_subroutines(BuildCtx *ctx) > | > |.ffunc string_sub > | ffgccheck > + |.if not MIPSR6 > |. nop > + |.endif > | addiu AT, NARGS8:RC, -16 > | ld TMP0, 0(BASE) > | bltz AT, ->fff_fallback > @@ -1907,8 +1976,30 @@ static void build_subroutines(BuildCtx *ctx) > | addiu TMP0, CARG2, 1 > | addu TMP1, CARG4, TMP0 > | slt TMP3, CARG3, r0 > + |.if MIPSR6 > + | seleqz CARG4, CARG4, AT > + | selnez TMP1, TMP1, AT > + | or CARG4, TMP1, CARG4 // if (end < 0) end += len+1 > + |.else > | movn CARG4, TMP1, AT // if (end < 0) end += len+1 > + |.endif > | addu TMP1, CARG3, TMP0 > + |.if MIPSR6 > + | selnez TMP1, TMP1, TMP3 > + | seleqz CARG3, CARG3, TMP3 > + | or CARG3, TMP1, CARG3 // if (start < 0) start += len+1 > + | li TMP2, 1 > + | slt AT, CARG4, r0 > + | slt TMP3, r0, CARG3 > + | seleqz CARG4, CARG4, AT // if (end < 0) end = 0 > + | selnez CARG3, CARG3, TMP3 > + | seleqz TMP2, TMP2, TMP3 > + | or CARG3, TMP2, CARG3 // if (start < 1) start = 1 > + | slt AT, CARG2, CARG4 > + | seleqz CARG4, CARG4, AT > + | selnez CARG2, CARG2, AT > + | or CARG4, CARG2, CARG4 // if (end > len) end = len > + |.else > | movn CARG3, TMP1, TMP3 // if (start < 0) start += len+1 > | li 
TMP2, 1 > | slt AT, CARG4, r0 > @@ -1917,6 +2008,7 @@ static void build_subroutines(BuildCtx *ctx) > | movz CARG3, TMP2, TMP3 // if (start < 1) start = 1 > | slt AT, CARG2, CARG4 > | movn CARG4, CARG2, AT // if (end > len) end = len > + |.endif > | daddu CARG2, STR:CARG1, CARG3 > | subu CARG3, CARG4, CARG3 // len = end - start > | daddiu CARG2, CARG2, sizeof(GCstr)-1 > @@ -1978,7 +2070,13 @@ static void build_subroutines(BuildCtx *ctx) > | slt AT, CARG1, r0 > | dsrlv CRET1, TMP0, CARG3 > | dsubu TMP0, r0, CRET1 > + |.if MIPSR6 > + | selnez TMP0, TMP0, AT > + | seleqz CRET1, CRET1, AT > + | or CRET1, CRET1, TMP0 > + |.else > | movn CRET1, TMP0, AT > + |.endif > | jr ra > |. zextw CRET1, CRET1 > |1: > @@ -2001,14 +2099,28 @@ static void build_subroutines(BuildCtx *ctx) > | slt AT, CARG1, r0 > | dsrlv CRET1, CRET2, TMP0 > | dsubu CARG1, r0, CRET1 > + |.if MIPSR6 > + | seleqz CRET1, CRET1, AT > + | selnez CARG1, CARG1, AT > + | or CRET1, CRET1, CARG1 > + |.else > | movn CRET1, CARG1, AT > + |.endif > | li CARG1, 64 > | subu TMP0, CARG1, TMP0 > | dsllv CRET2, CRET2, TMP0 // Integer check. > | sextw AT, CRET1 > | xor AT, CRET1, AT // Range check. > | jr ra > + |.if MIPSR6 > + | seleqz AT, AT, CRET2 > + | selnez CRET2, CRET2, CRET2 > + | jr ra > + |. or CRET2, AT, CRET2 > + |.else > + | jr ra > |. movz CRET2, AT, CRET2 > + |.endif > |1: > | jr ra > |. li CRET2, 1 > @@ -2518,15 +2630,22 @@ static void build_subroutines(BuildCtx *ctx) > | > |// Hard-float round to integer. > |// Modifies AT, TMP0, FRET1, FRET2, f4. Keeps all others incl. FARG1. > + |// MIPSR6: Modifies FTMP1, too. > |.macro vm_round_hf, func > | lui TMP0, 0x4330 // Hiword of 2^52 (double). > | dsll TMP0, TMP0, 32 > | dmtc1 TMP0, f4 > | abs.d FRET2, FARG1 // |x| > | dmfc1 AT, FARG1 > + |.if MIPSR6 > + | cmp.lt.d FTMP1, FRET2, f4 > + | add.d FRET1, FRET2, f4 // (|x| + 2^52) - 2^52 > + | bc1eqz FTMP1, >1 // Truncate only if |x| < 2^52. 
> + |.else > | c.olt.d 0, FRET2, f4 > | add.d FRET1, FRET2, f4 // (|x| + 2^52) - 2^52 > | bc1f 0, >1 // Truncate only if |x| < 2^52. > + |.endif > |. sub.d FRET1, FRET1, f4 > | slt AT, AT, r0 > |.if "func" == "ceil" > @@ -2537,16 +2656,38 @@ static void build_subroutines(BuildCtx *ctx) > |.if "func" == "trunc" > | dsll TMP0, TMP0, 32 > | dmtc1 TMP0, f4 > + |.if MIPSR6 > + | cmp.lt.d FTMP1, FRET2, FRET1 // |x| < result? > + | sub.d FRET2, FRET1, f4 > + | sel.d FTMP1, FRET1, FRET2 // If yes, subtract +1. > + | dmtc1 AT, FRET1 > + | neg.d FRET2, FTMP1 > + | jr ra > + |. sel.d FRET1, FTMP1, FRET2 // Merge sign bit back in. > + |.else > | c.olt.d 0, FRET2, FRET1 // |x| < result? > | sub.d FRET2, FRET1, f4 > | movt.d FRET1, FRET2, 0 // If yes, subtract +1. > | neg.d FRET2, FRET1 > | jr ra > |. movn.d FRET1, FRET2, AT // Merge sign bit back in. > + |.endif > |.else > | neg.d FRET2, FRET1 > | dsll TMP0, TMP0, 32 > | dmtc1 TMP0, f4 > + |.if MIPSR6 > + | dmtc1 AT, FTMP1 > + | sel.d FTMP1, FRET1, FRET2 > + |.if "func" == "ceil" > + | cmp.lt.d FRET1, FTMP1, FARG1 // x > result? > + |.else > + | cmp.lt.d FRET1, FARG1, FTMP1 // x < result? > + |.endif > + | sub.d FRET2, FTMP1, f4 // If yes, subtract +-1. > + | jr ra > + |. sel.d FRET1, FTMP1, FRET2 > + |.else > | movn.d FRET1, FRET2, AT // Merge sign bit back in. > |.if "func" == "ceil" > | c.olt.d 0, FRET1, FARG1 // x > result? > @@ -2557,6 +2698,7 @@ static void build_subroutines(BuildCtx *ctx) > | jr ra > |. movt.d FRET1, FRET2, 0 > |.endif > + |.endif > |1: > | jr ra > |. mov.d FRET1, FARG1 > @@ -2701,7 +2843,7 @@ static void build_subroutines(BuildCtx *ctx) > |. li CRET1, 0 > |.endif > | > - |.macro sfmin_max, name, intins > + |.macro sfmin_max, name, intins, intinsc > |->vm_sf .. 
name: > |.if JIT and not FPU > | move TMP2, ra > @@ -2710,13 +2852,25 @@ static void build_subroutines(BuildCtx *ctx) > | move ra, TMP2 > | move TMP0, CRET1 > | move CRET1, CARG1 > + |.if MIPSR6 > + | intins CRET1, CRET1, TMP0 > + | intinsc TMP0, CARG2, TMP0 > + | jr ra > + |. or CRET1, CRET1, TMP0 > + |.else > | jr ra > |. intins CRET1, CARG2, TMP0 > |.endif > + |.endif > |.endmacro > | > - | sfmin_max min, movz > - | sfmin_max max, movn > + |.if MIPSR6 > + | sfmin_max min, selnez, seleqz > + | sfmin_max max, seleqz, selnez > + |.else > + | sfmin_max min, movz, _ > + | sfmin_max max, movn, _ > + |.endif > | > |//----------------------------------------------------------------------- > |//-- Miscellaneous functions -------------------------------------------- > @@ -2885,7 +3039,11 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > | lui TMP3, (-(BCBIAS_J*4 >> 16) & 65535) > | slt AT, CARG1, CARG2 > | addu TMP2, TMP2, TMP3 > + |.if MIPSR6 > + | movop TMP2, TMP2, AT > + |.else > | movop TMP2, r0, AT > + |.endif > |1: > | daddu PC, PC, TMP2 > | ins_next > @@ -2903,16 +3061,28 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > |.endif > |3: // RA and RD are both numbers. > |.if FPU > - | fcomp f20, f22 > + |.if MIPSR6 > + | fcomp FTMP0, FTMP0, FTMP2 > + | addu TMP2, TMP2, TMP3 > + | mfc1 TMP3, FTMP0 > + | b <1 > + |. fmovop TMP2, TMP2, TMP3 > + |.else > + | fcomp FTMP0, FTMP2 > | addu TMP2, TMP2, TMP3 > | b <1 > |. fmovop TMP2, r0 > + |.endif > |.else > | bal sfcomp > |. addu TMP2, TMP2, TMP3 > | b <1 > + |.if MIPSR6 > + |. movop TMP2, TMP2, CRET1 > + |.else > |. movop TMP2, r0, CRET1 > |.endif > + |.endif > | > |4: // RA is a number, RD is not a number. 
> | bne CARG4, TISNUM, ->vmeta_comp > @@ -2959,15 +3129,27 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > |.endif > |.endmacro > | > + |.if MIPSR6 > + if (op == BC_ISLT) { > + | bc_comp FTMP0, FTMP2, CARG1, CARG2, selnez, selnez, cmp.lt.d, ->vm_sfcmpolt > + } else if (op == BC_ISGE) { > + | bc_comp FTMP0, FTMP2, CARG1, CARG2, seleqz, seleqz, cmp.lt.d, ->vm_sfcmpolt > + } else if (op == BC_ISLE) { > + | bc_comp FTMP2, FTMP0, CARG2, CARG1, seleqz, seleqz, cmp.ult.d, ->vm_sfcmpult > + } else { > + | bc_comp FTMP2, FTMP0, CARG2, CARG1, selnez, selnez, cmp.ult.d, ->vm_sfcmpult > + } > + |.else > if (op == BC_ISLT) { > - | bc_comp f20, f22, CARG1, CARG2, movz, movf, c.olt.d, ->vm_sfcmpolt > + | bc_comp FTMP0, FTMP2, CARG1, CARG2, movz, movf, c.olt.d, ->vm_sfcmpolt > } else if (op == BC_ISGE) { > - | bc_comp f20, f22, CARG1, CARG2, movn, movt, c.olt.d, ->vm_sfcmpolt > + | bc_comp FTMP0, FTMP2, CARG1, CARG2, movn, movt, c.olt.d, ->vm_sfcmpolt > } else if (op == BC_ISLE) { > - | bc_comp f22, f20, CARG2, CARG1, movn, movt, c.ult.d, ->vm_sfcmpult > + | bc_comp FTMP2, FTMP0, CARG2, CARG1, movn, movt, c.ult.d, ->vm_sfcmpult > } else { > - | bc_comp f22, f20, CARG2, CARG1, movz, movf, c.ult.d, ->vm_sfcmpult > + | bc_comp FTMP2, FTMP0, CARG2, CARG1, movz, movf, c.ult.d, ->vm_sfcmpult > } > + |.endif > break; > > case BC_ISEQV: case BC_ISNEV: > @@ -3013,7 +3195,11 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > |2: // Check if the tags are the same and it's a table or userdata. > | xor AT, CARG3, CARG4 // Same type? > | sltiu TMP0, CARG3, LJ_TISTABUD+1 // Table or userdata? 
> + |.if MIPSR6 > + | seleqz TMP0, TMP0, AT > + |.else > | movn TMP0, r0, AT > + |.endif > if (vk) { > | beqz TMP0, <1 > } else { > @@ -3063,11 +3249,19 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > | lui TMP3, (-(BCBIAS_J*4 >> 16) & 65535) > | xor TMP1, CARG1, CARG2 > | addu TMP2, TMP2, TMP3 > + |.if MIPSR6 > + if (vk) { > + | seleqz TMP2, TMP2, TMP1 > + } else { > + | selnez TMP2, TMP2, TMP1 > + } > + |.else > if (vk) { > | movn TMP2, r0, TMP1 > } else { > | movz TMP2, r0, TMP1 > } > + |.endif > | daddu PC, PC, TMP2 > | ins_next > break; > @@ -3094,6 +3288,19 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > | bne CARG4, TISNUM, >6 > |. addu TMP2, TMP2, TMP3 > | xor AT, CARG1, CARG2 > + |.if MIPSR6 > + if (vk) { > + | seleqz TMP2, TMP2, AT > + |1: > + | daddu PC, PC, TMP2 > + |2: > + } else { > + | selnez TMP2, TMP2, AT > + |1: > + |2: > + | daddu PC, PC, TMP2 > + } > + |.else > if (vk) { > | movn TMP2, r0, AT > |1: > @@ -3105,6 +3312,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > |2: > | daddu PC, PC, TMP2 > } > + |.endif > | ins_next > | > |3: // RA is not an integer. > @@ -3117,30 +3325,49 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > |. addu TMP2, TMP2, TMP3 > | sltu AT, CARG4, TISNUM > |.if FPU > - | ldc1 f20, 0(RA) > - | ldc1 f22, 0(RD) > + | ldc1 FTMP0, 0(RA) > + | ldc1 FTMP2, 0(RD) > |.endif > | beqz AT, >5 > |. nop > |4: // RA and RD are both numbers. > |.if FPU > - | c.eq.d f20, f22 > + |.if MIPSR6 > + | cmp.eq.d FTMP0, FTMP0, FTMP2 > + | dmfc1 TMP1, FTMP0 > + | b <1 > + if (vk) { > + |. selnez TMP2, TMP2, TMP1 > + } else { > + |. seleqz TMP2, TMP2, TMP1 > + } > + |.else > + | c.eq.d FTMP0, FTMP2 > | b <1 > if (vk) { > |. movf TMP2, r0 > } else { > |. movt TMP2, r0 > } > + |.endif > |.else > | bal ->vm_sfcmpeq > |. nop > | b <1 > + |.if MIPSR6 > + if (vk) { > + |. selnez TMP2, TMP2, CRET1 > + } else { > + |. seleqz TMP2, TMP2, CRET1 > + } > + |.else > if (vk) { > |. 
movz TMP2, r0, CRET1 > } else { > |. movn TMP2, r0, CRET1 > } > |.endif > + |.endif > | > |5: // RA is a number, RD is not a number. > |.if FFI > @@ -3150,9 +3377,9 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > |.endif > | // RA is a number, RD is an integer. Convert RD to a number. > |.if FPU > - |. lwc1 f22, LO(RD) > + |. lwc1 FTMP2, LO(RD) > | b <4 > - |. cvt.d.w f22, f22 > + |. cvt.d.w FTMP2, FTMP2 > |.else > |. sextw CARG2, CARG2 > | bal ->vm_sfi2d_2 > @@ -3170,10 +3397,10 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > |.endif > | // RA is an integer, RD is a number. Convert RA to a number. > |.if FPU > - |. lwc1 f20, LO(RA) > - | ldc1 f22, 0(RD) > + |. lwc1 FTMP0, LO(RA) > + | ldc1 FTMP2, 0(RD) > | b <4 > - | cvt.d.w f20, f20 > + | cvt.d.w FTMP0, FTMP0 > |.else > |. sextw CARG1, CARG1 > | bal ->vm_sfi2d_1 > @@ -3216,11 +3443,19 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > | decode_RD4b TMP2 > | lui TMP3, (-(BCBIAS_J*4 >> 16) & 65535) > | addu TMP2, TMP2, TMP3 > + |.if MIPSR6 > + if (vk) { > + | seleqz TMP2, TMP2, TMP0 > + } else { > + | selnez TMP2, TMP2, TMP0 > + } > + |.else > if (vk) { > | movn TMP2, r0, TMP0 > } else { > | movz TMP2, r0, TMP0 > } > + |.endif > | daddu PC, PC, TMP2 > | ins_next > break; > @@ -3239,11 +3474,19 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > | decode_RD4b TMP2 > | lui TMP3, (-(BCBIAS_J*4 >> 16) & 65535) > | addu TMP2, TMP2, TMP3 > + |.if MIPSR6 > + if (op == BC_IST) { > + | selnez TMP2, TMP2, TMP0; > + } else { > + | seleqz TMP2, TMP2, TMP0; > + } > + |.else > if (op == BC_IST) { > | movz TMP2, r0, TMP0 > } else { > | movn TMP2, r0, TMP0 > } > + |.endif > | daddu PC, PC, TMP2 > } else { > | ld CRET1, 0(RD) > @@ -3486,9 +3729,15 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > | bltz TMP1, ->vmeta_arith > |. daddu RA, BASE, RA > |.elif "intins" == "mult" > + |.if MIPSR6 > + |. 
nop > + | mul CRET1, CARG3, CARG4 > + | muh TMP2, CARG3, CARG4 > + |.else > |. intins CARG3, CARG4 > | mflo CRET1 > | mfhi TMP2 > + |.endif > | sra TMP1, CRET1, 31 > | bne TMP1, TMP2, ->vmeta_arith > |. daddu RA, BASE, RA > @@ -3511,16 +3760,16 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > |.endif > | > |5: // Check for two numbers. > - | .FPU ldc1 f20, 0(RB) > + | .FPU ldc1 FTMP0, 0(RB) > | sltu AT, TMP0, TISNUM > | sltu TMP0, TMP1, TISNUM > - | .FPU ldc1 f22, 0(RC) > + | .FPU ldc1 FTMP2, 0(RC) > | and AT, AT, TMP0 > | beqz AT, ->vmeta_arith > |. daddu RA, BASE, RA > | > |.if FPU > - | fpins FRET1, f20, f22 > + | fpins FRET1, FTMP0, FTMP2 > |.elif "fpcall" == "sfpmod" > | sfpmod > |.else > @@ -3850,7 +4099,13 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > | li TMP0, 0x801 > | addiu AT, CARG2, -0x7ff > | srl CARG3, RD, 14 > + |.if MIPSR6 > + | seleqz TMP0, TMP0, AT > + | selnez CARG2, CARG2, AT > + | or CARG2, CARG2, TMP0 > + |.else > | movz CARG2, TMP0, AT > + |.endif > | // (lua_State *L, int32_t asize, uint32_t hbits) > | call_intern lj_tab_new > |. move CARG1, L > @@ -4131,7 +4386,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > | daddu NODE:TMP2, NODE:TMP2, TMP1 // node = tab->node + (idx*32-idx*8) > | settp STR:RC, TMP3 // Tagged key to look for. > |.if FPU > - | ldc1 f20, 0(RA) > + | ldc1 FTMP0, 0(RA) > |.else > | ld CRET1, 0(RA) > |.endif > @@ -4147,7 +4402,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > | andi AT, TMP3, LJ_GC_BLACK // isblack(table) > | bnez AT, >7 > |.if FPU > - |. sdc1 f20, NODE:TMP2->val > + |. sdc1 FTMP0, NODE:TMP2->val > |.else > |. sd CRET1, NODE:TMP2->val > |.endif > @@ -4188,7 +4443,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > | ld BASE, L->base > |.if FPU > | b <3 // No 2nd write barrier needed. > - |. sdc1 f20, 0(CRET1) > + |. sdc1 FTMP0, 0(CRET1) > |.else > | ld CARG1, 0(RA) > | b <3 // No 2nd write barrier needed. 
> @@ -4531,7 +4786,13 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > | ld CARG1, 0(RC) > | sltu AT, RC, TMP3 > | daddiu RC, RC, 8 > + |.if MIPSR6 > + | selnez CARG1, CARG1, AT > + | seleqz AT, TISNIL, AT > + | or CARG1, CARG1, AT > + |.else > | movz CARG1, TISNIL, AT > + |.endif > | sd CARG1, 0(RA) > | sltu AT, RA, TMP2 > | bnez AT, <1 > @@ -4720,7 +4981,13 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > | dext AT, CRET1, 31, 0 > | slt CRET1, CARG2, CARG3 > | slt TMP1, CARG3, CARG2 > + |.if MIPSR6 > + | selnez TMP1, TMP1, AT > + | seleqz CRET1, CRET1, AT > + | or CRET1, CRET1, TMP1 > + |.else > | movn CRET1, TMP1, AT > + |.endif > } else { > | bne CARG3, TISNUM, >5 > |. ld CARG2, FORL_STEP*8(RA) // STEP CARG2 - CARG4 type > @@ -4736,20 +5003,34 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > | slt CRET1, CRET1, CARG1 > | slt AT, CARG2, r0 > | slt TMP0, TMP0, r0 // ((y^a) & (y^b)) < 0: overflow. > + |.if MIPSR6 > + | selnez TMP1, TMP1, AT > + | seleqz CRET1, CRET1, AT > + | or CRET1, CRET1, TMP1 > + |.else > | movn CRET1, TMP1, AT > + |.endif > | or CRET1, CRET1, TMP0 > | zextw CARG1, CARG1 > | settp CARG1, TISNUM > } > |1: > if (op == BC_FORI) { > + |.if MIPSR6 > + | selnez TMP2, TMP2, CRET1 > + |.else > | movz TMP2, r0, CRET1 > + |.endif > | daddu PC, PC, TMP2 > } else if (op == BC_JFORI) { > | daddu PC, PC, TMP2 > | lhu RD, -4+OFS_RD(PC) > } else if (op == BC_IFORL) { > + |.if MIPSR6 > + | seleqz TMP2, TMP2, CRET1 > + |.else > | movn TMP2, r0, CRET1 > + |.endif > | daddu PC, PC, TMP2 > } > if (vk) { > @@ -4779,6 +5060,14 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > | and AT, AT, TMP0 > | beqz AT, ->vmeta_for > |. slt TMP3, TMP3, r0 > + |.if MIPSR6 > + | dmtc1 TMP3, FTMP2 > + | cmp.lt.d FTMP0, f0, f2 > + | cmp.lt.d FTMP1, f2, f0 > + | sel.d FTMP2, FTMP1, FTMP0 > + | b <1 > + |. 
dmfc1 CRET1, FTMP2 > + |.else > | c.ole.d 0, f0, f2 > | c.ole.d 1, f2, f0 > | li CRET1, 1 > @@ -4786,12 +5075,25 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > | movt AT, r0, 1 > | b <1 > |. movn CRET1, AT, TMP3 > + |.endif > } else { > | ldc1 f0, FORL_IDX*8(RA) > | ldc1 f4, FORL_STEP*8(RA) > | ldc1 f2, FORL_STOP*8(RA) > | ld TMP3, FORL_STEP*8(RA) > | add.d f0, f0, f4 > + |.if MIPSR6 > + | slt TMP3, TMP3, r0 > + | dmtc1 TMP3, FTMP2 > + | cmp.lt.d FTMP0, f0, f2 > + | cmp.lt.d FTMP1, f2, f0 > + | sel.d FTMP2, FTMP1, FTMP0 > + | dmfc1 CRET1, FTMP2 > + if (op == BC_IFORL) { > + | seleqz TMP2, TMP2, CRET1 > + | daddu PC, PC, TMP2 > + } > + |.else > | c.ole.d 0, f0, f2 > | c.ole.d 1, f2, f0 > | slt TMP3, TMP3, r0 > @@ -4804,6 +5106,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > | movn TMP2, r0, CRET1 > | daddu PC, PC, TMP2 > } > + |.endif > | sdc1 f0, FORL_IDX*8(RA) > | ins_next1 > | b <2 > @@ -4979,8 +5282,17 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop) > | ld TMP0, 0(RA) > | sltu AT, RA, RC // Less args than parameters? > | move CARG1, TMP0 > + |.if MIPSR6 > + | selnez TMP0, TMP0, AT > + | seleqz TMP3, TISNIL, AT > + | or TMP0, TMP0, TMP3 > + | seleqz TMP3, CARG1, AT > + | selnez CARG1, TISNIL, AT > + | or CARG1, CARG1, TMP3 > + |.else > | movz TMP0, TISNIL, AT // Clear missing parameters. > | movn CARG1, TISNIL, AT // Clear old fixarg slot (help the GC). > + |.endif > | addiu TMP2, TMP2, -1 > | sd TMP0, 16(TMP1) > | daddiu TMP1, TMP1, 8 > -- > 2.41.0 >
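Side note, not part of the patch: most of the MIPSR6 branches above replace a
single movz/movn with a seleqz/selnez/or sequence, because R6 drops the
conditional moves. Below is a minimal C sketch of why the two forms agree. It
models the instructions as plain functions on 64-bit values; the helper names
(movz_r6, movn_r6, check_cmove_lowering) are made up for illustration and do
not exist in LuaJIT.

```c
#include <assert.h>
#include <stdint.h>

/* Pre-R6 conditional moves: rd is both an input and the output. */
static uint64_t movz(uint64_t rd, uint64_t rs, uint64_t rt)
{ return rt == 0 ? rs : rd; }
static uint64_t movn(uint64_t rd, uint64_t rs, uint64_t rt)
{ return rt != 0 ? rs : rd; }

/* R6 selects: the result is either rs or zero, never the old rd. */
static uint64_t seleqz(uint64_t rs, uint64_t rt)
{ return rt == 0 ? rs : 0; }
static uint64_t selnez(uint64_t rs, uint64_t rt)
{ return rt != 0 ? rs : 0; }

/* Hypothetical helpers mirroring the three-instruction sequences:
**   movz rd, rs, rt  ->  selnez rd, rd, rt; seleqz tmp, rs, rt; or rd, rd, tmp
**   movn rd, rs, rt  ->  seleqz rd, rd, rt; selnez tmp, rs, rt; or rd, rd, tmp
** Exactly one of the two selects is non-zero, so the OR merges them.
*/
static uint64_t movz_r6(uint64_t rd, uint64_t rs, uint64_t rt)
{ return selnez(rd, rt) | seleqz(rs, rt); }
static uint64_t movn_r6(uint64_t rd, uint64_t rs, uint64_t rt)
{ return seleqz(rd, rt) | selnez(rs, rt); }

/* Compare both forms over a few sample values; returns the case count. */
static int check_cmove_lowering(void)
{
  static const uint64_t vals[] = {
    0, 1, 2, 0x7fffffffffffffffull, ~(uint64_t)0
  };
  const int n = (int)(sizeof(vals) / sizeof(vals[0]));
  int i, j, k, cases = 0;
  for (i = 0; i < n; i++)
    for (j = 0; j < n; j++)
      for (k = 0; k < n; k++) {
        uint64_t rd = vals[i], rs = vals[j], rt = vals[k];
        assert(movz_r6(rd, rs, rt) == movz(rd, rs, rt));
        assert(movn_r6(rd, rs, rt) == movn(rd, rs, rt));
        /* With rs == r0 a single select is enough, as in the patch:
        **   movn rd, r0, rt  ->  seleqz rd, rd, rt
        **   movz rd, r0, rt  ->  selnez rd, rd, rt
        */
        assert(seleqz(rd, rt) == movn(rd, 0, rt));
        assert(selnez(rd, rt) == movz(rd, 0, rt));
        cases++;
      }
  return cases;
}
```

This only models the register semantics. It says nothing about delay slots or
forbidden slots, which the patch handles separately.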