Tarantool development patches archive
From: Maxim Kokryashkin via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Sergey Kaplun <skaplun@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH luajit 19/19] MIPS: Add MIPS64 R6 port.
Date: Wed, 16 Aug 2023 12:16:51 +0300
Message-ID: <ee5dcm5qj277tujwfwljxvmjdmvrckz5yekyirvggb7jsbwspm@3pv4rldftgf2>
In-Reply-To: <876736e650dcb70ce32a272f92cc3ba034a4dd3b.1691592488.git.skaplun@tarantool.org>

Hi, Sergey!
Thanks for the patch!
LGTM, except for a few nits regarding the commit message.
On Wed, Aug 09, 2023 at 06:36:08PM +0300, Sergey Kaplun via Tarantool-patches wrote:
> From: Mike Pall <mike>
> 
> Contributed by Hua Zhang, YunQiang Su from Wave Computing,
> and Radovan Birdic from RT-RK.
> Sponsored by Wave Computing.
> 
> (cherry-picked from commit 94d0b53004a5fa368defa4307a17edcdb87fe727)
> 
> This patch adds support for MIPS Release 6 [1] for the 64-bit build.
> This includes:
> * Global `_map_def` value is set with <dynasm/dynasm.lua>. `MIPSR6` key
>   specifies the corresponding instruction set support. Also, `MIPSR6` is
>   defined in `DYNASM_FLAGS` (`DASM_AFLAGS`).
> * New instructions are added within <dynasm/dasm_mips.lua>, they are
>   used if the aforementioned key is set.
> * Obsolete instructions (that are no more in use in r6) are used in the
Typo: s/no more/no longer/
>   opposite case (if `MIPSR6` isn't set).
> * New opcode maps are added into  <src/jit/dis_mips.lua>.
Typo: s/into/to/
> * `map_arch` table in <jit/bcsave.lua> is refactored for more convenient
>   usage. Now each arch key contains a table with the corresponding info
>   about supported architecture:
Typo: s/about/about the/
>     - `e`: endianess; "le" or "be"
Typo: s/endianess/endianness/
>     - `b`: bit-width of the supported architecture; 32 or 64
>     - `m`: machine specification (see `e_machine` in man elf)
>     - `f`: processor-specific flags (see `e_flags` in man elf)
>     - `p`: number that identifies the type of target machine [2] for
>       Portable Executable format [3].
> * New `LJ_TARGET_MIPSR6` define is set for MIPSR6 in <src/lj_arch.h>.
> * The corresponding "MIPS32R6", "MIPS64R6" CPU strings are added to the
>   <src/jit.h>
Nit: judging by the diffstat, this should be <src/lj_jit.h>; the sentence also lacks a trailing period.
> * MIPSR6 instructions are added to the <src/lj_target_mips.h>, some
>   obsolete instructions are removed or defined only for the non-MIPSR6
>   build.
> * All release-dependent instructions in <src/lj_asm_mips.h> are
>   instrumented with `LJ_TARGET_MIPSR6` macro.
> * `f20`, `f21`, `f22` FP registers are defined as `FTMP0`, `FTMP1`,
>   `FTMP2` correspondingly in the VM.
> * All release-dependent instructions in <src/vm_mips64.dasm> are
Typo: s/vm_mips64.dasm/vm_mips64.dasc/
>   instrumented with `MIPSR6` macro.
> * `sfmin_max` macro now takes the third operand for the MIPSR6 build.
> * Fix implicit fallthrough warning for `LJ_SOFTFP && !LJ_NEED_FP64`
Typo: s/Fix/Fix the/
>   build in <src/lj_asm.c>.
> 
> Note, that 32-bit r6 targets still unsupported, because it is difficult
Typo: s/targets/targets are/
> and most available r6 CPUs are 64 bit.
> 
> [1]: https://www.mips.com/products/architectures/mips64/
> [2]: https://learn.microsoft.com/en-us/windows/win32/debug/pe-format#machine-types
> [3]: https://learn.microsoft.com/en-us/windows/win32/debug/pe-format
> 
> Sergey Kaplun:
> * added the description for the feature
> 
> Part of tarantool/tarantool#8825
> ---
>  cmake/SetDynASMFlags.cmake |   5 +
>  dynasm/dasm_mips.h         |  13 +-
>  dynasm/dasm_mips.lua       | 625 +++++++++++++++++++++++--------------
>  dynasm/dynasm.lua          |   1 +
>  src/Makefile.original      |   3 +
>  src/jit/bcsave.lua         |  84 ++---
>  src/jit/dis_mips.lua       | 293 +++++++++++++++--
>  src/jit/dis_mips64r6.lua   |  17 +
>  src/jit/dis_mips64r6el.lua |  17 +
>  src/lj_arch.h              |  29 +-
>  src/lj_asm.c               |   2 +-
>  src/lj_asm_mips.h          | 114 ++++++-
>  src/lj_emit_mips.h         |  15 +-
>  src/lj_jit.h               |   8 +
>  src/lj_target_mips.h       |  52 ++-
>  src/vm_mips64.dasc         | 370 ++++++++++++++++++++--
>  16 files changed, 1301 insertions(+), 347 deletions(-)
>  create mode 100644 src/jit/dis_mips64r6.lua
>  create mode 100644 src/jit/dis_mips64r6el.lua
> 
> diff --git a/cmake/SetDynASMFlags.cmake b/cmake/SetDynASMFlags.cmake
> index 142d7e64..7eead6e9 100644
> --- a/cmake/SetDynASMFlags.cmake
> +++ b/cmake/SetDynASMFlags.cmake
> @@ -64,6 +64,11 @@ elseif(LUAJIT_ARCH STREQUAL "mips")
>    endif()
>  endif()
>  
> +string(FIND "${TESTARCH}" "LJ_TARGET_MIPSR6" FOUND)
> +if(NOT FOUND EQUAL -1)
> +  AppendFlags(DYNASM_FLAGS -D MIPSR6)
> +endif()
> +
>  string(FIND "${TESTARCH}" "LJ_LE 1" FOUND)
>  if(NOT FOUND EQUAL -1)
>    list(APPEND DYNASM_FLAGS -D ENDIAN_LE)
> diff --git a/dynasm/dasm_mips.h b/dynasm/dasm_mips.h
> index 71a835b2..7d06aa72 100644
> --- a/dynasm/dasm_mips.h
> +++ b/dynasm/dasm_mips.h
> @@ -355,14 +355,15 @@ int dasm_encode(Dst_DECL, void *buffer)
>  	  CK(n >= 0, UNDEF_PC);
>  	  n = *DASM_POS2PTR(D, n);
>  	  if (ins & 2048)
> -	    n = n - (int)((char *)cp - base);
> -	  else
>  	    n = (n + (int)(size_t)base) & 0x0fffffff;
> -	patchrel:
> +	  else
> +	    n = n - (int)((char *)cp - base);
> +	patchrel: {
> +	  unsigned int e = 16 + ((ins >> 12) & 15);
>  	  CK((n & 3) == 0 &&
> -	     ((n + ((ins & 2048) ? 0x00020000 : 0)) >>
> -	       ((ins & 2048) ? 18 : 28)) == 0, RANGE_REL);
> -	  cp[-1] |= ((n>>2) & ((ins & 2048) ? 0x0000ffff: 0x03ffffff));
> +	     ((n + ((ins & 2048) ? 0 : (1<<(e+1)))) >> (e+2)) == 0, RANGE_REL);
> +	  cp[-1] |= ((n>>2) & ((1<<e)-1));
> +	  }
>  	  break;
>  	case DASM_LABEL_LG:
>  	  ins &= 2047; if (ins >= 20) D->globals[ins-10] = (void *)(base + n);
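Side note, not a change request: as far as I can tell from the hunk above, bits 12..15 of the action word now select the width of the offset field (16 plus that value), and bit 11 (2048) marks an absolute target instead of a PC-relative one. Below is a minimal standalone sketch of that check. The names `rel_field_bits` and `patch_rel` are mine, not DynASM's, and I spell the range test out as explicit bounds rather than the bias-and-shift form used in the patch, so treat it as an illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Width in bits of the offset field, selected by bits 12..15 of the
 * action word (hypothetical helper, mirrors `16 + ((ins >> 12) & 15)`). */
static unsigned int rel_field_bits(unsigned int ins)
{
  return 16 + ((ins >> 12) & 15);
}

/* Hypothetical standalone version of the patchrel check.
 * Returns 1 and stores the masked field in *field if the byte offset n
 * is encodable, otherwise returns 0 (the RANGE_REL case). */
static int patch_rel(unsigned int ins, int n, uint32_t *field)
{
  unsigned int e = rel_field_bits(ins);
  long long lo, hi;
  assert(field != 0);
  if (ins & 2048) {  /* Absolute target: unsigned range. */
    lo = 0;
    hi = 1LL << (e + 2);
  } else {           /* PC-relative target: signed range. */
    lo = -(1LL << (e + 1));
    hi = 1LL << (e + 1);
  }
  /* Targets must be word-aligned and inside [lo, hi). */
  if ((n & 3) != 0 || n < lo || n >= hi) return 0;
  /* Store the word offset, truncated to the e-bit field. */
  *field = ((uint32_t)n >> 2) & (uint32_t)((1ULL << e) - 1);
  return 1;
}
```

With the mode constants from the dasm_mips.lua part of this patch, `B` (0) gives a 16-bit field, `K` (0x5000) a 21-bit field, and `L` (0xa000) a 26-bit field, while `J` (0xa800) is the 26-bit absolute case.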
> diff --git a/dynasm/dasm_mips.lua b/dynasm/dasm_mips.lua
> index bd2a2b43..ccdc53cd 100644
> --- a/dynasm/dasm_mips.lua
> +++ b/dynasm/dasm_mips.lua
> @@ -6,6 +6,7 @@
>  ------------------------------------------------------------------------------
>  
>  local mips64 = mips64
> +local mipsr6 = _map_def.MIPSR6
>  
>  -- Module information:
>  local _info = {
> @@ -238,7 +239,6 @@ local map_op = {
>    bne_3 =	"14000000STB",
>    blez_2 =	"18000000SB",
>    bgtz_2 =	"1c000000SB",
> -  addi_3 =	"20000000TSI",
>    li_2 =	"24000000TI",
>    addiu_3 =	"24000000TSI",
>    slti_3 =	"28000000TSI",
> @@ -248,40 +248,22 @@ local map_op = {
>    ori_3 =	"34000000TSU",
>    xori_3 =	"38000000TSU",
>    lui_2 =	"3c000000TU",
> -  beqzl_2 =	"50000000SB",
> -  beql_3 =	"50000000STB",
> -  bnezl_2 =	"54000000SB",
> -  bnel_3 =	"54000000STB",
> -  blezl_2 =	"58000000SB",
> -  bgtzl_2 =	"5c000000SB",
> -  daddi_3 =	mips64 and "60000000TSI",
>    daddiu_3 =	mips64 and "64000000TSI",
>    ldl_2 =	mips64 and "68000000TO",
>    ldr_2 =	mips64 and "6c000000TO",
>    lb_2 =	"80000000TO",
>    lh_2 =	"84000000TO",
> -  lwl_2 =	"88000000TO",
>    lw_2 =	"8c000000TO",
>    lbu_2 =	"90000000TO",
>    lhu_2 =	"94000000TO",
> -  lwr_2 =	"98000000TO",
>    lwu_2 =	mips64 and "9c000000TO",
>    sb_2 =	"a0000000TO",
>    sh_2 =	"a4000000TO",
> -  swl_2 =	"a8000000TO",
>    sw_2 =	"ac000000TO",
> -  sdl_2 =	mips64 and "b0000000TO",
> -  sdr_2 =	mips64 and "b1000000TO",
> -  swr_2 =	"b8000000TO",
> -  cache_2 =	"bc000000NO",
> -  ll_2 =	"c0000000TO",
>    lwc1_2 =	"c4000000HO",
> -  pref_2 =	"cc000000NO",
>    ldc1_2 =	"d4000000HO",
>    ld_2 =	mips64 and "dc000000TO",
> -  sc_2 =	"e0000000TO",
>    swc1_2 =	"e4000000HO",
> -  scd_2 =	mips64 and "f0000000TO",
>    sdc1_2 =	"f4000000HO",
>    sd_2 =	mips64 and "fc000000TO",
>  
> @@ -289,10 +271,6 @@ local map_op = {
>    nop_0 =	"00000000",
>    sll_3 =	"00000000DTA",
>    sextw_2 =	"00000000DT",
> -  movf_2 =	"00000001DS",
> -  movf_3 =	"00000001DSC",
> -  movt_2 =	"00010001DS",
> -  movt_3 =	"00010001DSC",
>    srl_3 =	"00000002DTA",
>    rotr_3 =	"00200002DTA",
>    sra_3 =	"00000003DTA",
> @@ -301,31 +279,16 @@ local map_op = {
>    rotrv_3 =	"00000046DTS",
>    drotrv_3 =	mips64 and "00000056DTS",
>    srav_3 =	"00000007DTS",
> -  jr_1 =	"00000008S",
>    jalr_1 =	"0000f809S",
>    jalr_2 =	"00000009DS",
> -  movz_3 =	"0000000aDST",
> -  movn_3 =	"0000000bDST",
>    syscall_0 =	"0000000c",
>    syscall_1 =	"0000000cY",
>    break_0 =	"0000000d",
>    break_1 =	"0000000dY",
>    sync_0 =	"0000000f",
> -  mfhi_1 =	"00000010D",
> -  mthi_1 =	"00000011S",
> -  mflo_1 =	"00000012D",
> -  mtlo_1 =	"00000013S",
>    dsllv_3 =	mips64 and "00000014DTS",
>    dsrlv_3 =	mips64 and "00000016DTS",
>    dsrav_3 =	mips64 and "00000017DTS",
> -  mult_2 =	"00000018ST",
> -  multu_2 =	"00000019ST",
> -  div_2 =	"0000001aST",
> -  divu_2 =	"0000001bST",
> -  dmult_2 =	mips64 and "0000001cST",
> -  dmultu_2 =	mips64 and "0000001dST",
> -  ddiv_2 =	mips64 and "0000001eST",
> -  ddivu_2 =	mips64 and "0000001fST",
>    add_3 =	"00000020DST",
>    move_2 =	mips64 and "00000025DS" or "00000021DS",
>    addu_3 =	"00000021DST",
> @@ -369,32 +332,9 @@ local map_op = {
>    bgez_2 =	"04010000SB",
>    bltzl_2 =	"04020000SB",
>    bgezl_2 =	"04030000SB",
> -  tgei_2 =	"04080000SI",
> -  tgeiu_2 =	"04090000SI",
> -  tlti_2 =	"040a0000SI",
> -  tltiu_2 =	"040b0000SI",
> -  teqi_2 =	"040c0000SI",
> -  tnei_2 =	"040e0000SI",
> -  bltzal_2 =	"04100000SB",
>    bal_1 =	"04110000B",
> -  bgezal_2 =	"04110000SB",
> -  bltzall_2 =	"04120000SB",
> -  bgezall_2 =	"04130000SB",
>    synci_1 =	"041f0000O",
>  
> -  -- Opcode SPECIAL2.
> -  madd_2 =	"70000000ST",
> -  maddu_2 =	"70000001ST",
> -  mul_3 =	"70000002DST",
> -  msub_2 =	"70000004ST",
> -  msubu_2 =	"70000005ST",
> -  clz_2 =	"70000020DS=",
> -  clo_2 =	"70000021DS=",
> -  dclz_2 =	mips64 and "70000024DS=",
> -  dclo_2 =	mips64 and "70000025DS=",
> -  sdbbp_0 =	"7000003f",
> -  sdbbp_1 =	"7000003fY",
> -
>    -- Opcode SPECIAL3.
>    ext_4 =	"7c000000TSAM", -- Note: last arg is msbd = size-1
>    dextm_4 =	mips64 and "7c000001TSAM", -- Args: pos    | size-1-32
> @@ -445,15 +385,6 @@ local map_op = {
>    ctc1_2 =	"44c00000TG",
>    mthc1_2 =	"44e00000TG",
>  
> -  bc1f_1 =	"45000000B",
> -  bc1f_2 =	"45000000CB",
> -  bc1t_1 =	"45010000B",
> -  bc1t_2 =	"45010000CB",
> -  bc1fl_1 =	"45020000B",
> -  bc1fl_2 =	"45020000CB",
> -  bc1tl_1 =	"45030000B",
> -  bc1tl_2 =	"45030000CB",
> -
>    ["add.s_3"] =		"46000000FGH",
>    ["sub.s_3"] =		"46000001FGH",
>    ["mul.s_3"] =		"46000002FGH",
> @@ -470,51 +401,11 @@ local map_op = {
>    ["trunc.w.s_2"] =	"4600000dFG",
>    ["ceil.w.s_2"] =	"4600000eFG",
>    ["floor.w.s_2"] =	"4600000fFG",
> -  ["movf.s_2"] =	"46000011FG",
> -  ["movf.s_3"] =	"46000011FGC",
> -  ["movt.s_2"] =	"46010011FG",
> -  ["movt.s_3"] =	"46010011FGC",
> -  ["movz.s_3"] =	"46000012FGT",
> -  ["movn.s_3"] =	"46000013FGT",
>    ["recip.s_2"] =	"46000015FG",
>    ["rsqrt.s_2"] =	"46000016FG",
>    ["cvt.d.s_2"] =	"46000021FG",
>    ["cvt.w.s_2"] =	"46000024FG",
>    ["cvt.l.s_2"] =	"46000025FG",
> -  ["cvt.ps.s_3"] =	"46000026FGH",
> -  ["c.f.s_2"] =		"46000030GH",
> -  ["c.f.s_3"] =		"46000030VGH",
> -  ["c.un.s_2"] =	"46000031GH",
> -  ["c.un.s_3"] =	"46000031VGH",
> -  ["c.eq.s_2"] =	"46000032GH",
> -  ["c.eq.s_3"] =	"46000032VGH",
> -  ["c.ueq.s_2"] =	"46000033GH",
> -  ["c.ueq.s_3"] =	"46000033VGH",
> -  ["c.olt.s_2"] =	"46000034GH",
> -  ["c.olt.s_3"] =	"46000034VGH",
> -  ["c.ult.s_2"] =	"46000035GH",
> -  ["c.ult.s_3"] =	"46000035VGH",
> -  ["c.ole.s_2"] =	"46000036GH",
> -  ["c.ole.s_3"] =	"46000036VGH",
> -  ["c.ule.s_2"] =	"46000037GH",
> -  ["c.ule.s_3"] =	"46000037VGH",
> -  ["c.sf.s_2"] =	"46000038GH",
> -  ["c.sf.s_3"] =	"46000038VGH",
> -  ["c.ngle.s_2"] =	"46000039GH",
> -  ["c.ngle.s_3"] =	"46000039VGH",
> -  ["c.seq.s_2"] =	"4600003aGH",
> -  ["c.seq.s_3"] =	"4600003aVGH",
> -  ["c.ngl.s_2"] =	"4600003bGH",
> -  ["c.ngl.s_3"] =	"4600003bVGH",
> -  ["c.lt.s_2"] =	"4600003cGH",
> -  ["c.lt.s_3"] =	"4600003cVGH",
> -  ["c.nge.s_2"] =	"4600003dGH",
> -  ["c.nge.s_3"] =	"4600003dVGH",
> -  ["c.le.s_2"] =	"4600003eGH",
> -  ["c.le.s_3"] =	"4600003eVGH",
> -  ["c.ngt.s_2"] =	"4600003fGH",
> -  ["c.ngt.s_3"] =	"4600003fVGH",
> -
>    ["add.d_3"] =		"46200000FGH",
>    ["sub.d_3"] =		"46200001FGH",
>    ["mul.d_3"] =		"46200002FGH",
> @@ -531,130 +422,410 @@ local map_op = {
>    ["trunc.w.d_2"] =	"4620000dFG",
>    ["ceil.w.d_2"] =	"4620000eFG",
>    ["floor.w.d_2"] =	"4620000fFG",
> -  ["movf.d_2"] =	"46200011FG",
> -  ["movf.d_3"] =	"46200011FGC",
> -  ["movt.d_2"] =	"46210011FG",
> -  ["movt.d_3"] =	"46210011FGC",
> -  ["movz.d_3"] =	"46200012FGT",
> -  ["movn.d_3"] =	"46200013FGT",
>    ["recip.d_2"] =	"46200015FG",
>    ["rsqrt.d_2"] =	"46200016FG",
>    ["cvt.s.d_2"] =	"46200020FG",
>    ["cvt.w.d_2"] =	"46200024FG",
>    ["cvt.l.d_2"] =	"46200025FG",
> -  ["c.f.d_2"] =		"46200030GH",
> -  ["c.f.d_3"] =		"46200030VGH",
> -  ["c.un.d_2"] =	"46200031GH",
> -  ["c.un.d_3"] =	"46200031VGH",
> -  ["c.eq.d_2"] =	"46200032GH",
> -  ["c.eq.d_3"] =	"46200032VGH",
> -  ["c.ueq.d_2"] =	"46200033GH",
> -  ["c.ueq.d_3"] =	"46200033VGH",
> -  ["c.olt.d_2"] =	"46200034GH",
> -  ["c.olt.d_3"] =	"46200034VGH",
> -  ["c.ult.d_2"] =	"46200035GH",
> -  ["c.ult.d_3"] =	"46200035VGH",
> -  ["c.ole.d_2"] =	"46200036GH",
> -  ["c.ole.d_3"] =	"46200036VGH",
> -  ["c.ule.d_2"] =	"46200037GH",
> -  ["c.ule.d_3"] =	"46200037VGH",
> -  ["c.sf.d_2"] =	"46200038GH",
> -  ["c.sf.d_3"] =	"46200038VGH",
> -  ["c.ngle.d_2"] =	"46200039GH",
> -  ["c.ngle.d_3"] =	"46200039VGH",
> -  ["c.seq.d_2"] =	"4620003aGH",
> -  ["c.seq.d_3"] =	"4620003aVGH",
> -  ["c.ngl.d_2"] =	"4620003bGH",
> -  ["c.ngl.d_3"] =	"4620003bVGH",
> -  ["c.lt.d_2"] =	"4620003cGH",
> -  ["c.lt.d_3"] =	"4620003cVGH",
> -  ["c.nge.d_2"] =	"4620003dGH",
> -  ["c.nge.d_3"] =	"4620003dVGH",
> -  ["c.le.d_2"] =	"4620003eGH",
> -  ["c.le.d_3"] =	"4620003eVGH",
> -  ["c.ngt.d_2"] =	"4620003fGH",
> -  ["c.ngt.d_3"] =	"4620003fVGH",
> -
> -  ["add.ps_3"] =	"46c00000FGH",
> -  ["sub.ps_3"] =	"46c00001FGH",
> -  ["mul.ps_3"] =	"46c00002FGH",
> -  ["abs.ps_2"] =	"46c00005FG",
> -  ["mov.ps_2"] =	"46c00006FG",
> -  ["neg.ps_2"] =	"46c00007FG",
> -  ["movf.ps_2"] =	"46c00011FG",
> -  ["movf.ps_3"] =	"46c00011FGC",
> -  ["movt.ps_2"] =	"46c10011FG",
> -  ["movt.ps_3"] =	"46c10011FGC",
> -  ["movz.ps_3"] =	"46c00012FGT",
> -  ["movn.ps_3"] =	"46c00013FGT",
> -  ["cvt.s.pu_2"] =	"46c00020FG",
> -  ["cvt.s.pl_2"] =	"46c00028FG",
> -  ["pll.ps_3"] =	"46c0002cFGH",
> -  ["plu.ps_3"] =	"46c0002dFGH",
> -  ["pul.ps_3"] =	"46c0002eFGH",
> -  ["puu.ps_3"] =	"46c0002fFGH",
> -  ["c.f.ps_2"] =	"46c00030GH",
> -  ["c.f.ps_3"] =	"46c00030VGH",
> -  ["c.un.ps_2"] =	"46c00031GH",
> -  ["c.un.ps_3"] =	"46c00031VGH",
> -  ["c.eq.ps_2"] =	"46c00032GH",
> -  ["c.eq.ps_3"] =	"46c00032VGH",
> -  ["c.ueq.ps_2"] =	"46c00033GH",
> -  ["c.ueq.ps_3"] =	"46c00033VGH",
> -  ["c.olt.ps_2"] =	"46c00034GH",
> -  ["c.olt.ps_3"] =	"46c00034VGH",
> -  ["c.ult.ps_2"] =	"46c00035GH",
> -  ["c.ult.ps_3"] =	"46c00035VGH",
> -  ["c.ole.ps_2"] =	"46c00036GH",
> -  ["c.ole.ps_3"] =	"46c00036VGH",
> -  ["c.ule.ps_2"] =	"46c00037GH",
> -  ["c.ule.ps_3"] =	"46c00037VGH",
> -  ["c.sf.ps_2"] =	"46c00038GH",
> -  ["c.sf.ps_3"] =	"46c00038VGH",
> -  ["c.ngle.ps_2"] =	"46c00039GH",
> -  ["c.ngle.ps_3"] =	"46c00039VGH",
> -  ["c.seq.ps_2"] =	"46c0003aGH",
> -  ["c.seq.ps_3"] =	"46c0003aVGH",
> -  ["c.ngl.ps_2"] =	"46c0003bGH",
> -  ["c.ngl.ps_3"] =	"46c0003bVGH",
> -  ["c.lt.ps_2"] =	"46c0003cGH",
> -  ["c.lt.ps_3"] =	"46c0003cVGH",
> -  ["c.nge.ps_2"] =	"46c0003dGH",
> -  ["c.nge.ps_3"] =	"46c0003dVGH",
> -  ["c.le.ps_2"] =	"46c0003eGH",
> -  ["c.le.ps_3"] =	"46c0003eVGH",
> -  ["c.ngt.ps_2"] =	"46c0003fGH",
> -  ["c.ngt.ps_3"] =	"46c0003fVGH",
> -
>    ["cvt.s.w_2"] =	"46800020FG",
>    ["cvt.d.w_2"] =	"46800021FG",
> -
>    ["cvt.s.l_2"] =	"46a00020FG",
>    ["cvt.d.l_2"] =	"46a00021FG",
> -
> -  -- Opcode COP1X.
> -  lwxc1_2 =		"4c000000FX",
> -  ldxc1_2 =		"4c000001FX",
> -  luxc1_2 =		"4c000005FX",
> -  swxc1_2 =		"4c000008FX",
> -  sdxc1_2 =		"4c000009FX",
> -  suxc1_2 =		"4c00000dFX",
> -  prefx_2 =		"4c00000fMX",
> -  ["alnv.ps_4"] =	"4c00001eFGHS",
> -  ["madd.s_4"] =	"4c000020FRGH",
> -  ["madd.d_4"] =	"4c000021FRGH",
> -  ["madd.ps_4"] =	"4c000026FRGH",
> -  ["msub.s_4"] =	"4c000028FRGH",
> -  ["msub.d_4"] =	"4c000029FRGH",
> -  ["msub.ps_4"] =	"4c00002eFRGH",
> -  ["nmadd.s_4"] =	"4c000030FRGH",
> -  ["nmadd.d_4"] =	"4c000031FRGH",
> -  ["nmadd.ps_4"] =	"4c000036FRGH",
> -  ["nmsub.s_4"] =	"4c000038FRGH",
> -  ["nmsub.d_4"] =	"4c000039FRGH",
> -  ["nmsub.ps_4"] =	"4c00003eFRGH",
>  }
>  
> +if mipsr6 then -- Instructions added with MIPSR6.
> +
> +  for k,v in pairs({
> +
> +    -- Add immediate to upper bits.
> +    aui_3 =	"3c000000TSI",
> +    daui_3 =	mips64 and "74000000TSI",
> +    dahi_2 =	mips64 and "04060000SI",
> +    dati_2 =	mips64 and "041e0000SI",
> +
> +    -- TODO: addiupc, auipc, aluipc, lwpc, lwupc, ldpc.
> +
> +    -- Compact branches.
> +    blezalc_2 =	"18000000TB",	-- rt != 0.
> +    bgezalc_2 =	"18000000T=SB",	-- rt != 0.
> +    bgtzalc_2 =	"1c000000TB",	-- rt != 0.
> +    bltzalc_2 =	"1c000000T=SB",	-- rt != 0.
> +
> +    blezc_2 =	"58000000TB",	-- rt != 0.
> +    bgezc_2 =	"58000000T=SB",	-- rt != 0.
> +    bgec_3 =	"58000000STB",	-- rs != rt.
> +    blec_3 =	"58000000TSB",	-- rt != rs.
> +
> +    bgtzc_2 =	"5c000000TB",	-- rt != 0.
> +    bltzc_2 =	"5c000000T=SB",	-- rt != 0.
> +    bltc_3 =	"5c000000STB",	-- rs != rt.
> +    bgtc_3 =	"5c000000TSB",	-- rt != rs.
> +
> +    bgeuc_3 =	"18000000STB",	-- rs != rt.
> +    bleuc_3 =	"18000000TSB",	-- rt != rs.
> +    bltuc_3 =	"1c000000STB",	-- rs != rt.
> +    bgtuc_3 =	"1c000000TSB",	-- rt != rs.
> +
> +    beqzalc_2 =	"20000000TB",	-- rt != 0.
> +    bnezalc_2 =	"60000000TB",	-- rt != 0.
> +    beqc_3 =	"20000000STB",	-- rs < rt.
> +    bnec_3 =	"60000000STB",	-- rs < rt.
> +    bovc_3 =	"20000000STB",	-- rs >= rt.
> +    bnvc_3 =	"60000000STB",	-- rs >= rt.
> +
> +    beqzc_2 =	"d8000000SK",	-- rs != 0.
> +    bnezc_2 =	"f8000000SK",	-- rs != 0.
> +    jic_2 =	"d8000000TI",
> +    jialc_2 =	"f8000000TI",
> +    bc_1 =	"c8000000L",
> +    balc_1 =	"e8000000L",
> +
> +    -- Opcode SPECIAL.
> +    jr_1 =	"00000009S",
> +    sdbbp_0 =	"0000000e",
> +    sdbbp_1 =	"0000000eY",
> +    lsa_4 =	"00000005DSTA",
> +    dlsa_4 =	mips64 and "00000015DSTA",
> +    seleqz_3 =	"00000035DST",
> +    selnez_3 =	"00000037DST",
> +    clz_2 =	"00000050DS",
> +    clo_2 =	"00000051DS",
> +    dclz_2 =	mips64 and "00000052DS",
> +    dclo_2 =	mips64 and "00000053DS",
> +    mul_3 =	"00000098DST",
> +    muh_3 =	"000000d8DST",
> +    mulu_3 =	"00000099DST",
> +    muhu_3 =	"000000d9DST",
> +    div_3 =	"0000009aDST",
> +    mod_3 =	"000000daDST",
> +    divu_3 =	"0000009bDST",
> +    modu_3 =	"000000dbDST",
> +    dmul_3 =	mips64 and "0000009cDST",
> +    dmuh_3 =	mips64 and "000000dcDST",
> +    dmulu_3 =	mips64 and "0000009dDST",
> +    dmuhu_3 =	mips64 and "000000ddDST",
> +    ddiv_3 =	mips64 and "0000009eDST",
> +    dmod_3 =	mips64 and "000000deDST",
> +    ddivu_3 =	mips64 and "0000009fDST",
> +    dmodu_3 =	mips64 and "000000dfDST",
> +
> +    -- Opcode SPECIAL3.
> +    align_4 =		"7c000220DSTA",
> +    dalign_4 =		mips64 and "7c000224DSTA",
> +    bitswap_2 =		"7c000020DT",
> +    dbitswap_2 =	mips64 and "7c000024DT",
> +
> +    -- Opcode COP1.
> +    bc1eqz_2 =	"45200000HB",
> +    bc1nez_2 =	"45a00000HB",
> +
> +    ["sel.s_3"] =	"46000010FGH",
> +    ["seleqz.s_3"] =	"46000014FGH",
> +    ["selnez.s_3"] =	"46000017FGH",
> +    ["maddf.s_3"] =	"46000018FGH",
> +    ["msubf.s_3"] =	"46000019FGH",
> +    ["rint.s_2"] =	"4600001aFG",
> +    ["class.s_2"] =	"4600001bFG",
> +    ["min.s_3"] =	"4600001cFGH",
> +    ["mina.s_3"] =	"4600001dFGH",
> +    ["max.s_3"] =	"4600001eFGH",
> +    ["maxa.s_3"] =	"4600001fFGH",
> +    ["cmp.af.s_3"] =	"46800000FGH",
> +    ["cmp.un.s_3"] =	"46800001FGH",
> +    ["cmp.or.s_3"] =	"46800011FGH",
> +    ["cmp.eq.s_3"] =	"46800002FGH",
> +    ["cmp.une.s_3"] =	"46800012FGH",
> +    ["cmp.ueq.s_3"] =	"46800003FGH",
> +    ["cmp.ne.s_3"] =	"46800013FGH",
> +    ["cmp.lt.s_3"] =	"46800004FGH",
> +    ["cmp.ult.s_3"] =	"46800005FGH",
> +    ["cmp.le.s_3"] =	"46800006FGH",
> +    ["cmp.ule.s_3"] =	"46800007FGH",
> +    ["cmp.saf.s_3"] =	"46800008FGH",
> +    ["cmp.sun.s_3"] =	"46800009FGH",
> +    ["cmp.sor.s_3"] =	"46800019FGH",
> +    ["cmp.seq.s_3"] =	"4680000aFGH",
> +    ["cmp.sune.s_3"] =	"4680001aFGH",
> +    ["cmp.sueq.s_3"] =	"4680000bFGH",
> +    ["cmp.sne.s_3"] =	"4680001bFGH",
> +    ["cmp.slt.s_3"] =	"4680000cFGH",
> +    ["cmp.sult.s_3"] =	"4680000dFGH",
> +    ["cmp.sle.s_3"] =	"4680000eFGH",
> +    ["cmp.sule.s_3"] =	"4680000fFGH",
> +
> +    ["sel.d_3"] =	"46200010FGH",
> +    ["seleqz.d_3"] =	"46200014FGH",
> +    ["selnez.d_3"] =	"46200017FGH",
> +    ["maddf.d_3"] =	"46200018FGH",
> +    ["msubf.d_3"] =	"46200019FGH",
> +    ["rint.d_2"] =	"4620001aFG",
> +    ["class.d_2"] =	"4620001bFG",
> +    ["min.d_3"] =	"4620001cFGH",
> +    ["mina.d_3"] =	"4620001dFGH",
> +    ["max.d_3"] =	"4620001eFGH",
> +    ["maxa.d_3"] =	"4620001fFGH",
> +    ["cmp.af.d_3"] =	"46a00000FGH",
> +    ["cmp.un.d_3"] =	"46a00001FGH",
> +    ["cmp.or.d_3"] =	"46a00011FGH",
> +    ["cmp.eq.d_3"] =	"46a00002FGH",
> +    ["cmp.une.d_3"] =	"46a00012FGH",
> +    ["cmp.ueq.d_3"] =	"46a00003FGH",
> +    ["cmp.ne.d_3"] =	"46a00013FGH",
> +    ["cmp.lt.d_3"] =	"46a00004FGH",
> +    ["cmp.ult.d_3"] =	"46a00005FGH",
> +    ["cmp.le.d_3"] =	"46a00006FGH",
> +    ["cmp.ule.d_3"] =	"46a00007FGH",
> +    ["cmp.saf.d_3"] =	"46a00008FGH",
> +    ["cmp.sun.d_3"] =	"46a00009FGH",
> +    ["cmp.sor.d_3"] =	"46a00019FGH",
> +    ["cmp.seq.d_3"] =	"46a0000aFGH",
> +    ["cmp.sune.d_3"] =	"46a0001aFGH",
> +    ["cmp.sueq.d_3"] =	"46a0000bFGH",
> +    ["cmp.sne.d_3"] =	"46a0001bFGH",
> +    ["cmp.slt.d_3"] =	"46a0000cFGH",
> +    ["cmp.sult.d_3"] =	"46a0000dFGH",
> +    ["cmp.sle.d_3"] =	"46a0000eFGH",
> +    ["cmp.sule.d_3"] =	"46a0000fFGH",
> +
> +  }) do map_op[k] = v end
> +
> +else -- Instructions removed by MIPSR6.
> +
> +  for k,v in pairs({
> +    -- Traps, don't use.
> +    addi_3 =	"20000000TSI",
> +    daddi_3 =	mips64 and "60000000TSI",
> +
> +    -- Branch on likely, don't use.
> +    beqzl_2 =	"50000000SB",
> +    beql_3 =	"50000000STB",
> +    bnezl_2 =	"54000000SB",
> +    bnel_3 =	"54000000STB",
> +    blezl_2 =	"58000000SB",
> +    bgtzl_2 =	"5c000000SB",
> +
> +    lwl_2 =	"88000000TO",
> +    lwr_2 =	"98000000TO",
> +    swl_2 =	"a8000000TO",
> +    sdl_2 =	mips64 and "b0000000TO",
> +    sdr_2 =	mips64 and "b1000000TO",
> +    swr_2 =	"b8000000TO",
> +    cache_2 =	"bc000000NO",
> +    ll_2 =	"c0000000TO",
> +    pref_2 =	"cc000000NO",
> +    sc_2 =	"e0000000TO",
> +    scd_2 =	mips64 and "f0000000TO",
> +
> +    -- Opcode SPECIAL.
> +    movf_2 =	"00000001DS",
> +    movf_3 =	"00000001DSC",
> +    movt_2 =	"00010001DS",
> +    movt_3 =	"00010001DSC",
> +    jr_1 =	"00000008S",
> +    movz_3 =	"0000000aDST",
> +    movn_3 =	"0000000bDST",
> +    mfhi_1 =	"00000010D",
> +    mthi_1 =	"00000011S",
> +    mflo_1 =	"00000012D",
> +    mtlo_1 =	"00000013S",
> +    mult_2 =	"00000018ST",
> +    multu_2 =	"00000019ST",
> +    div_3 =	"0000001aST",
> +    divu_3 =	"0000001bST",
> +    ddiv_3 =	mips64 and "0000001eST",
> +    ddivu_3 =	mips64 and "0000001fST",
> +    dmult_2 =	mips64 and "0000001cST",
> +    dmultu_2 =	mips64 and "0000001dST",
> +
> +    -- Opcode REGIMM.
> +    tgei_2 =	"04080000SI",
> +    tgeiu_2 =	"04090000SI",
> +    tlti_2 =	"040a0000SI",
> +    tltiu_2 =	"040b0000SI",
> +    teqi_2 =	"040c0000SI",
> +    tnei_2 =	"040e0000SI",
> +    bltzal_2 =	"04100000SB",
> +    bgezal_2 =	"04110000SB",
> +    bltzall_2 =	"04120000SB",
> +    bgezall_2 =	"04130000SB",
> +
> +    -- Opcode SPECIAL2.
> +    madd_2 =	"70000000ST",
> +    maddu_2 =	"70000001ST",
> +    mul_3 =	"70000002DST",
> +    msub_2 =	"70000004ST",
> +    msubu_2 =	"70000005ST",
> +    clz_2 =	"70000020D=TS",
> +    clo_2 =	"70000021D=TS",
> +    dclz_2 =	mips64 and "70000024D=TS",
> +    dclo_2 =	mips64 and "70000025D=TS",
> +    sdbbp_0 =	"7000003f",
> +    sdbbp_1 =	"7000003fY",
> +
> +    -- Opcode COP1.
> +    bc1f_1 =	"45000000B",
> +    bc1f_2 =	"45000000CB",
> +    bc1t_1 =	"45010000B",
> +    bc1t_2 =	"45010000CB",
> +    bc1fl_1 =	"45020000B",
> +    bc1fl_2 =	"45020000CB",
> +    bc1tl_1 =	"45030000B",
> +    bc1tl_2 =	"45030000CB",
> +
> +    ["movf.s_2"] =	"46000011FG",
> +    ["movf.s_3"] =	"46000011FGC",
> +    ["movt.s_2"] =	"46010011FG",
> +    ["movt.s_3"] =	"46010011FGC",
> +    ["movz.s_3"] =	"46000012FGT",
> +    ["movn.s_3"] =	"46000013FGT",
> +    ["cvt.ps.s_3"] =	"46000026FGH",
> +    ["c.f.s_2"] =	"46000030GH",
> +    ["c.f.s_3"] =	"46000030VGH",
> +    ["c.un.s_2"] =	"46000031GH",
> +    ["c.un.s_3"] =	"46000031VGH",
> +    ["c.eq.s_2"] =	"46000032GH",
> +    ["c.eq.s_3"] =	"46000032VGH",
> +    ["c.ueq.s_2"] =	"46000033GH",
> +    ["c.ueq.s_3"] =	"46000033VGH",
> +    ["c.olt.s_2"] =	"46000034GH",
> +    ["c.olt.s_3"] =	"46000034VGH",
> +    ["c.ult.s_2"] =	"46000035GH",
> +    ["c.ult.s_3"] =	"46000035VGH",
> +    ["c.ole.s_2"] =	"46000036GH",
> +    ["c.ole.s_3"] =	"46000036VGH",
> +    ["c.ule.s_2"] =	"46000037GH",
> +    ["c.ule.s_3"] =	"46000037VGH",
> +    ["c.sf.s_2"] =	"46000038GH",
> +    ["c.sf.s_3"] =	"46000038VGH",
> +    ["c.ngle.s_2"] =	"46000039GH",
> +    ["c.ngle.s_3"] =	"46000039VGH",
> +    ["c.seq.s_2"] =	"4600003aGH",
> +    ["c.seq.s_3"] =	"4600003aVGH",
> +    ["c.ngl.s_2"] =	"4600003bGH",
> +    ["c.ngl.s_3"] =	"4600003bVGH",
> +    ["c.lt.s_2"] =	"4600003cGH",
> +    ["c.lt.s_3"] =	"4600003cVGH",
> +    ["c.nge.s_2"] =	"4600003dGH",
> +    ["c.nge.s_3"] =	"4600003dVGH",
> +    ["c.le.s_2"] =	"4600003eGH",
> +    ["c.le.s_3"] =	"4600003eVGH",
> +    ["c.ngt.s_2"] =	"4600003fGH",
> +    ["c.ngt.s_3"] =	"4600003fVGH",
> +    ["movf.d_2"] =	"46200011FG",
> +    ["movf.d_3"] =	"46200011FGC",
> +    ["movt.d_2"] =	"46210011FG",
> +    ["movt.d_3"] =	"46210011FGC",
> +    ["movz.d_3"] =	"46200012FGT",
> +    ["movn.d_3"] =	"46200013FGT",
> +    ["c.f.d_2"] =	"46200030GH",
> +    ["c.f.d_3"] =	"46200030VGH",
> +    ["c.un.d_2"] =	"46200031GH",
> +    ["c.un.d_3"] =	"46200031VGH",
> +    ["c.eq.d_2"] =	"46200032GH",
> +    ["c.eq.d_3"] =	"46200032VGH",
> +    ["c.ueq.d_2"] =	"46200033GH",
> +    ["c.ueq.d_3"] =	"46200033VGH",
> +    ["c.olt.d_2"] =	"46200034GH",
> +    ["c.olt.d_3"] =	"46200034VGH",
> +    ["c.ult.d_2"] =	"46200035GH",
> +    ["c.ult.d_3"] =	"46200035VGH",
> +    ["c.ole.d_2"] =	"46200036GH",
> +    ["c.ole.d_3"] =	"46200036VGH",
> +    ["c.ule.d_2"] =	"46200037GH",
> +    ["c.ule.d_3"] =	"46200037VGH",
> +    ["c.sf.d_2"] =	"46200038GH",
> +    ["c.sf.d_3"] =	"46200038VGH",
> +    ["c.ngle.d_2"] =	"46200039GH",
> +    ["c.ngle.d_3"] =	"46200039VGH",
> +    ["c.seq.d_2"] =	"4620003aGH",
> +    ["c.seq.d_3"] =	"4620003aVGH",
> +    ["c.ngl.d_2"] =	"4620003bGH",
> +    ["c.ngl.d_3"] =	"4620003bVGH",
> +    ["c.lt.d_2"] =	"4620003cGH",
> +    ["c.lt.d_3"] =	"4620003cVGH",
> +    ["c.nge.d_2"] =	"4620003dGH",
> +    ["c.nge.d_3"] =	"4620003dVGH",
> +    ["c.le.d_2"] =	"4620003eGH",
> +    ["c.le.d_3"] =	"4620003eVGH",
> +    ["c.ngt.d_2"] =	"4620003fGH",
> +    ["c.ngt.d_3"] =	"4620003fVGH",
> +    ["add.ps_3"] =	"46c00000FGH",
> +    ["sub.ps_3"] =	"46c00001FGH",
> +    ["mul.ps_3"] =	"46c00002FGH",
> +    ["abs.ps_2"] =	"46c00005FG",
> +    ["mov.ps_2"] =	"46c00006FG",
> +    ["neg.ps_2"] =	"46c00007FG",
> +    ["movf.ps_2"] =	"46c00011FG",
> +    ["movf.ps_3"] =	"46c00011FGC",
> +    ["movt.ps_2"] =	"46c10011FG",
> +    ["movt.ps_3"] =	"46c10011FGC",
> +    ["movz.ps_3"] =	"46c00012FGT",
> +    ["movn.ps_3"] =	"46c00013FGT",
> +    ["cvt.s.pu_2"] =	"46c00020FG",
> +    ["cvt.s.pl_2"] =	"46c00028FG",
> +    ["pll.ps_3"] =	"46c0002cFGH",
> +    ["plu.ps_3"] =	"46c0002dFGH",
> +    ["pul.ps_3"] =	"46c0002eFGH",
> +    ["puu.ps_3"] =	"46c0002fFGH",
> +    ["c.f.ps_2"] =	"46c00030GH",
> +    ["c.f.ps_3"] =	"46c00030VGH",
> +    ["c.un.ps_2"] =	"46c00031GH",
> +    ["c.un.ps_3"] =	"46c00031VGH",
> +    ["c.eq.ps_2"] =	"46c00032GH",
> +    ["c.eq.ps_3"] =	"46c00032VGH",
> +    ["c.ueq.ps_2"] =	"46c00033GH",
> +    ["c.ueq.ps_3"] =	"46c00033VGH",
> +    ["c.olt.ps_2"] =	"46c00034GH",
> +    ["c.olt.ps_3"] =	"46c00034VGH",
> +    ["c.ult.ps_2"] =	"46c00035GH",
> +    ["c.ult.ps_3"] =	"46c00035VGH",
> +    ["c.ole.ps_2"] =	"46c00036GH",
> +    ["c.ole.ps_3"] =	"46c00036VGH",
> +    ["c.ule.ps_2"] =	"46c00037GH",
> +    ["c.ule.ps_3"] =	"46c00037VGH",
> +    ["c.sf.ps_2"] =	"46c00038GH",
> +    ["c.sf.ps_3"] =	"46c00038VGH",
> +    ["c.ngle.ps_2"] =	"46c00039GH",
> +    ["c.ngle.ps_3"] =	"46c00039VGH",
> +    ["c.seq.ps_2"] =	"46c0003aGH",
> +    ["c.seq.ps_3"] =	"46c0003aVGH",
> +    ["c.ngl.ps_2"] =	"46c0003bGH",
> +    ["c.ngl.ps_3"] =	"46c0003bVGH",
> +    ["c.lt.ps_2"] =	"46c0003cGH",
> +    ["c.lt.ps_3"] =	"46c0003cVGH",
> +    ["c.nge.ps_2"] =	"46c0003dGH",
> +    ["c.nge.ps_3"] =	"46c0003dVGH",
> +    ["c.le.ps_2"] =	"46c0003eGH",
> +    ["c.le.ps_3"] =	"46c0003eVGH",
> +    ["c.ngt.ps_2"] =	"46c0003fGH",
> +    ["c.ngt.ps_3"] =	"46c0003fVGH",
> +
> +    -- Opcode COP1X.
> +    lwxc1_2 =	"4c000000FX",
> +    ldxc1_2 =	"4c000001FX",
> +    luxc1_2 =	"4c000005FX",
> +    swxc1_2 =	"4c000008FX",
> +    sdxc1_2 =	"4c000009FX",
> +    suxc1_2 =	"4c00000dFX",
> +    prefx_2 =	"4c00000fMX",
> +    ["alnv.ps_4"] =	"4c00001eFGHS",
> +    ["madd.s_4"] =	"4c000020FRGH",
> +    ["madd.d_4"] =	"4c000021FRGH",
> +    ["madd.ps_4"] =	"4c000026FRGH",
> +    ["msub.s_4"] =	"4c000028FRGH",
> +    ["msub.d_4"] =	"4c000029FRGH",
> +    ["msub.ps_4"] =	"4c00002eFRGH",
> +    ["nmadd.s_4"] =	"4c000030FRGH",
> +    ["nmadd.d_4"] =	"4c000031FRGH",
> +    ["nmadd.ps_4"] =	"4c000036FRGH",
> +    ["nmsub.s_4"] =	"4c000038FRGH",
> +    ["nmsub.d_4"] =	"4c000039FRGH",
> +    ["nmsub.ps_4"] =	"4c00003eFRGH",
> +
> +  }) do map_op[k] = v end
> +
> +end
> +
>  ------------------------------------------------------------------------------
>  
>  local function parse_gpr(expr)
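Side note: a small example of why the table is split may help future readers. `jr` exists in both branches but with different encodings ("00000008S" before R6, "00000009S" on R6, i.e. `jalr` with rd = 0). A minimal sketch, assuming the `S` template char places the register at bit 21 as in the standard MIPS rs field; `encode_jr` is a hypothetical helper, not part of DynASM:

```c
#include <stdint.h>

/* Encode `jr rs` for either ISA level, using the two opcode words
 * from the map_op tables in this patch. */
static uint32_t encode_jr(unsigned int rs, int mipsr6)
{
  uint32_t op = mipsr6 ? 0x00000009u   /* R6: jalr with rd = 0. */
                       : 0x00000008u;  /* Pre-R6: dedicated jr opcode. */
  return op | ((uint32_t)(rs & 31u) << 21);  /* rs lives in bits 21..25. */
}
```

For `jr ra` (register 31) this yields 0x03e00008 on a pre-R6 build and 0x03e00009 on an R6 build.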
> @@ -808,9 +979,11 @@ map_op[".template__"] = function(params, template, nparams)
>        op = op + parse_disp(params[n]); n = n + 1
>      elseif p == "X" then
>        op = op + parse_index(params[n]); n = n + 1
> -    elseif p == "B" or p == "J" then
> +    elseif p == "B" or p == "J" or p == "K" or p == "L" then
>        local mode, m, s = parse_label(params[n], false)
> -      if p == "B" then m = m + 2048 end
> +      if p == "J" then m = m + 0xa800
> +      elseif p == "K" then m = m + 0x5000
> +      elseif p == "L" then m = m + 0xa000 end
>        waction("REL_"..mode, m, s, 1)
>        n = n + 1
>      elseif p == "A" then
> @@ -833,7 +1006,7 @@ map_op[".template__"] = function(params, template, nparams)
>      elseif p == "Z" then
>        op = op + parse_imm(params[n], 10, 6, 0, false); n = n + 1
>      elseif p == "=" then
> -      op = op + shl(band(op, 0xf800), 5) -- Copy D to T for clz, clo.
> +      n = n - 1 -- Re-use previous parameter for next template char.
>      else
>        assert(false)
>      end
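Side note on the new meaning of `=`: instead of copying the D field into T, it now steps the parameter index back so that the next template char consumes the same parameter again. That is why the pre-R6 `clz` template becomes "D=TS" and the compact branches can use "T=SB". A minimal sketch of the idea, handling only the three register fields; `encode_regs` is a hypothetical miniature of the template loop, and the shifts (D at bit 11, T at bit 16, S at bit 21) are my reading of dasm_mips.lua:

```c
#include <assert.h>
#include <stdint.h>

/* Apply a template of register fields to a base opcode word.
 * params holds register numbers in source-operand order. */
static uint32_t encode_regs(uint32_t op, const char *tmpl, const int *params)
{
  int n = 0;  /* Index of the next parameter to consume. */
  const char *p;
  for (p = tmpl; *p; p++) {
    switch (*p) {
    case 'D': op |= (uint32_t)params[n++] << 11; break;
    case 'T': op |= (uint32_t)params[n++] << 16; break;
    case 'S': op |= (uint32_t)params[n++] << 21; break;
    case '=':  /* Re-use the previous parameter for the next char. */
      assert(n > 0);
      n--;
      break;
    default:
      assert(0 && "template char not covered by this sketch");
    }
  }
  return op;
}
```

For example, `clz r2, r3` with the pre-R6 template "D=TS" on base word 0x70000020 puts r2 into both D and T and r3 into S, giving 0x70621020.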
> diff --git a/dynasm/dynasm.lua b/dynasm/dynasm.lua
> index 5ec21a79..46ebfca8 100644
> --- a/dynasm/dynasm.lua
> +++ b/dynasm/dynasm.lua
> @@ -630,6 +630,7 @@ end
>  -- Load architecture-specific module.
>  local function loadarch(arch)
>    if not match(arch, "^[%w_]+$") then return "bad arch name" end
> +  _G._map_def = map_def
>    local ok, m_arch = pcall(require, "dasm_"..arch)
>    if not ok then return "cannot load module: "..m_arch end
>    g_arch = m_arch
> diff --git a/src/Makefile.original b/src/Makefile.original
> index aedaaa73..22d36a27 100644
> --- a/src/Makefile.original
> +++ b/src/Makefile.original
> @@ -455,6 +455,9 @@ ifeq (arm,$(TARGET_LJARCH))
>      DASM_AFLAGS+= -D IOS
>    endif
>  else
> +ifneq (,$(findstring LJ_TARGET_MIPSR6 ,$(TARGET_TESTARCH)))
> +  DASM_AFLAGS+= -D MIPSR6
> +endif
>  ifeq (ppc,$(TARGET_LJARCH))
>    ifneq (,$(findstring LJ_ARCH_SQRT 1,$(TARGET_TESTARCH)))
>      DASM_AFLAGS+= -D SQRT
> diff --git a/src/jit/bcsave.lua b/src/jit/bcsave.lua
> index 2553d97e..41081184 100644
> --- a/src/jit/bcsave.lua
> +++ b/src/jit/bcsave.lua
> @@ -17,6 +17,10 @@ local bit = require("bit")
>  -- Symbol name prefix for LuaJIT bytecode.
>  local LJBC_PREFIX = "luaJIT_BC_"
>  
> +local type, assert = type, assert
> +local format = string.format
> +local tremove, tconcat = table.remove, table.concat
> +
>  ------------------------------------------------------------------------------
>  
>  local function usage()
> @@ -63,8 +67,18 @@ local map_type = {
>  }
>  
>  local map_arch = {
> -  x86 = true, x64 = true, arm = true, arm64 = true, arm64be = true,
> -  ppc = true, mips = true, mipsel = true,
> +  x86 =		{ e = "le", b = 32, m = 3, p = 0x14c, },
> +  x64 =		{ e = "le", b = 64, m = 62, p = 0x8664, },
> +  arm =		{ e = "le", b = 32, m = 40, p = 0x1c0, },
> +  arm64 =	{ e = "le", b = 64, m = 183, p = 0xaa64, },
> +  arm64be =	{ e = "be", b = 64, m = 183, },
> +  ppc =		{ e = "be", b = 32, m = 20, },
> +  mips =	{ e = "be", b = 32, m = 8, f = 0x50001006, },
> +  mipsel =	{ e = "le", b = 32, m = 8, f = 0x50001006, },
> +  mips64 =	{ e = "be", b = 64, m = 8, f = 0x80000007, },
> +  mips64el =	{ e = "le", b = 64, m = 8, f = 0x80000007, },
> +  mips64r6 =	{ e = "be", b = 64, m = 8, f = 0xa0000407, },
> +  mips64r6el =	{ e = "le", b = 64, m = 8, f = 0xa0000407, },
>  }
>  
>  local map_os = {
> @@ -73,33 +87,33 @@ local map_os = {
>  }
>  
>  local function checkarg(str, map, err)
> -  str = string.lower(str)
> +  str = str:lower()
>    local s = check(map[str], "unknown ", err)
> -  return s == true and str or s
> +  return type(s) == "string" and s or str
>  end
>  
>  local function detecttype(str)
> -  local ext = string.match(string.lower(str), "%.(%a+)$")
> +  local ext = str:lower():match("%.(%a+)$")
>    return map_type[ext] or "raw"
>  end
>  
>  local function checkmodname(str)
> -  check(string.match(str, "^[%w_.%-]+$"), "bad module name")
> -  return string.gsub(str, "[%.%-]", "_")
> +  check(str:match("^[%w_.%-]+$"), "bad module name")
> +  return str:gsub("[%.%-]", "_")
>  end
>  
>  local function detectmodname(str)
>    if type(str) == "string" then
> -    local tail = string.match(str, "[^/\\]+$")
> +    local tail = str:match("[^/\\]+$")
>      if tail then str = tail end
> -    local head = string.match(str, "^(.*)%.[^.]*$")
> +    local head = str:match("^(.*)%.[^.]*$")
>      if head then str = head end
> -    str = string.match(str, "^[%w_.%-]+")
> +    str = str:match("^[%w_.%-]+")
>    else
>      str = nil
>    end
>    check(str, "cannot derive module name, use -n name")
> -  return string.gsub(str, "[%.%-]", "_")
> +  return str:gsub("[%.%-]", "_")
>  end
>  
>  ------------------------------------------------------------------------------
> @@ -118,7 +132,7 @@ end
>  local function bcsave_c(ctx, output, s)
>    local fp = savefile(output, "w")
>    if ctx.type == "c" then
> -    fp:write(string.format([[
> +    fp:write(format([[
>  #ifdef _cplusplus
>  extern "C"
>  #endif
> @@ -128,7 +142,7 @@ __declspec(dllexport)
>  const unsigned char %s%s[] = {
>  ]], LJBC_PREFIX, ctx.modname))
>    else
> -    fp:write(string.format([[
> +    fp:write(format([[
>  #define %s%s_SIZE %d
>  static const unsigned char %s%s[] = {
>  ]], LJBC_PREFIX, ctx.modname, #s, LJBC_PREFIX, ctx.modname))
> @@ -138,13 +152,13 @@ static const unsigned char %s%s[] = {
>      local b = tostring(string.byte(s, i))
>      m = m + #b + 1
>      if m > 78 then
> -      fp:write(table.concat(t, ",", 1, n), ",\n")
> +      fp:write(tconcat(t, ",", 1, n), ",\n")
>        n, m = 0, #b + 1
>      end
>      n = n + 1
>      t[n] = b
>    end
> -  bcsave_tail(fp, output, table.concat(t, ",", 1, n).."\n};\n")
> +  bcsave_tail(fp, output, tconcat(t, ",", 1, n).."\n};\n")
>  end
>  
>  local function bcsave_elfobj(ctx, output, s, ffi)
> @@ -199,12 +213,8 @@ typedef struct {
>  } ELF64obj;
>  ]]
>    local symname = LJBC_PREFIX..ctx.modname
> -  local is64, isbe = false, false
> -  if ctx.arch == "x64" or ctx.arch == "arm64" or ctx.arch == "arm64be" then
> -    is64 = true
> -  elseif ctx.arch == "ppc" or ctx.arch == "mips" then
> -    isbe = true
> -  end
> +  local ai = assert(map_arch[ctx.arch])
> +  local is64, isbe = ai.b == 64, ai.e == "be"
>  
>    -- Handle different host/target endianess.
>    local function f32(x) return x end
> @@ -237,10 +247,8 @@ typedef struct {
>    hdr.eendian = isbe and 2 or 1
>    hdr.eversion = 1
>    hdr.type = f16(1)
> -  hdr.machine = f16(({ x86=3, x64=62, arm=40, arm64=183, arm64be=183, ppc=20, mips=8, mipsel=8 })[ctx.arch])
> -  if ctx.arch == "mips" or ctx.arch == "mipsel" then
> -    hdr.flags = f32(0x50001006)
> -  end
> +  hdr.machine = f16(ai.m)
> +  hdr.flags = f32(ai.f or 0)
>    hdr.version = f32(1)
>    hdr.shofs = fofs(ffi.offsetof(o, "sect"))
>    hdr.ehsize = f16(ffi.sizeof(hdr))
> @@ -336,12 +344,8 @@ typedef struct {
>  } PEobj;
>  ]]
>    local symname = LJBC_PREFIX..ctx.modname
> -  local is64 = false
> -  if ctx.arch == "x86" then
> -    symname = "_"..symname
> -  elseif ctx.arch == "x64" then
> -    is64 = true
> -  end
> +  local ai = assert(map_arch[ctx.arch])
> +  local is64 = ai.b == 64
>    local symexport = "   /EXPORT:"..symname..",DATA "
>  
>    -- The file format is always little-endian. Swap if the host is big-endian.
> @@ -355,7 +359,7 @@ typedef struct {
>    -- Create PE object and fill in header.
>    local o = ffi.new("PEobj")
>    local hdr = o.hdr
> -  hdr.arch = f16(({ x86=0x14c, x64=0x8664, arm=0x1c0, ppc=0x1f2, mips=0x366, mipsel=0x366 })[ctx.arch])
> +  hdr.arch = f16(assert(ai.p))
>    hdr.nsects = f16(2)
>    hdr.symtabofs = f32(ffi.offsetof(o, "sym0"))
>    hdr.nsyms = f32(6)
> @@ -605,16 +609,16 @@ local function docmd(...)
>    local n = 1
>    local list = false
>    local ctx = {
> -    strip = true, arch = jit.arch, os = string.lower(jit.os),
> +    strip = true, arch = jit.arch, os = jit.os:lower(),
>      type = false, modname = false,
>    }
>    while n <= #arg do
>      local a = arg[n]
> -    if type(a) == "string" and string.sub(a, 1, 1) == "-" and a ~= "-" then
> -      table.remove(arg, n)
> +    if type(a) == "string" and a:sub(1, 1) == "-" and a ~= "-" then
> +      tremove(arg, n)
>        if a == "--" then break end
>        for m=2,#a do
> -	local opt = string.sub(a, m, m)
> +	local opt = a:sub(m, m)
>  	if opt == "l" then
>  	  list = true
>  	elseif opt == "s" then
> @@ -627,13 +631,13 @@ local function docmd(...)
>  	    if n ~= 1 then usage() end
>  	    arg[1] = check(loadstring(arg[1]))
>  	  elseif opt == "n" then
> -	    ctx.modname = checkmodname(table.remove(arg, n))
> +	    ctx.modname = checkmodname(tremove(arg, n))
>  	  elseif opt == "t" then
> -	    ctx.type = checkarg(table.remove(arg, n), map_type, "file type")
> +	    ctx.type = checkarg(tremove(arg, n), map_type, "file type")
>  	  elseif opt == "a" then
> -	    ctx.arch = checkarg(table.remove(arg, n), map_arch, "architecture")
> +	    ctx.arch = checkarg(tremove(arg, n), map_arch, "architecture")
>  	  elseif opt == "o" then
> -	    ctx.os = checkarg(table.remove(arg, n), map_os, "OS name")
> +	    ctx.os = checkarg(tremove(arg, n), map_os, "OS name")
>  	  else
>  	    usage()
>  	  end
> diff --git a/src/jit/dis_mips.lua b/src/jit/dis_mips.lua
> index a12b8e62..c003b984 100644
> --- a/src/jit/dis_mips.lua
> +++ b/src/jit/dis_mips.lua
> @@ -19,13 +19,34 @@ local band, bor, tohex = bit.band, bit.bor, bit.tohex
>  local lshift, rshift, arshift = bit.lshift, bit.rshift, bit.arshift
>  
>  ------------------------------------------------------------------------------
> --- Primary and extended opcode maps
> +-- Extended opcode maps common to all MIPS releases
>  ------------------------------------------------------------------------------
>  
> -local map_movci = { shift = 16, mask = 1, [0] = "movfDSC", "movtDSC", }
>  local map_srl = { shift = 21, mask = 1, [0] = "srlDTA", "rotrDTA", }
>  local map_srlv = { shift = 6, mask = 1, [0] = "srlvDTS", "rotrvDTS", }
>  
> +local map_cop0 = {
> +  shift = 25, mask = 1,
> +  [0] = {
> +    shift = 21, mask = 15,
> +    [0] = "mfc0TDW", [4] = "mtc0TDW",
> +    [10] = "rdpgprDT",
> +    [11] = { shift = 5, mask = 1, [0] = "diT0", "eiT0", },
> +    [14] = "wrpgprDT",
> +  }, {
> +    shift = 0, mask = 63,
> +    [1] = "tlbr", [2] = "tlbwi", [6] = "tlbwr", [8] = "tlbp",
> +    [24] = "eret", [31] = "deret",
> +    [32] = "wait",
> +  },
> +}
> +
> +------------------------------------------------------------------------------
> +-- Primary and extended opcode maps for MIPS R1-R5
> +------------------------------------------------------------------------------
> +
> +local map_movci = { shift = 16, mask = 1, [0] = "movfDSC", "movtDSC", }
> +
>  local map_special = {
>    shift = 0, mask = 63,
>    [0] = { shift = 0, mask = -1, [0] = "nop", _ = "sllDTA" },
> @@ -87,22 +108,6 @@ local map_regimm = {
>    false,	false,		false,		"synciSO",
>  }
>  
> -local map_cop0 = {
> -  shift = 25, mask = 1,
> -  [0] = {
> -    shift = 21, mask = 15,
> -    [0] = "mfc0TDW", [4] = "mtc0TDW",
> -    [10] = "rdpgprDT",
> -    [11] = { shift = 5, mask = 1, [0] = "diT0", "eiT0", },
> -    [14] = "wrpgprDT",
> -  }, {
> -    shift = 0, mask = 63,
> -    [1] = "tlbr", [2] = "tlbwi", [6] = "tlbwr", [8] = "tlbp",
> -    [24] = "eret", [31] = "deret",
> -    [32] = "wait",
> -  },
> -}
> -
>  local map_cop1s = {
>    shift = 0, mask = 63,
>    [0] = "add.sFGH",	"sub.sFGH",	"mul.sFGH",	"div.sFGH",
> @@ -233,6 +238,208 @@ local map_pri = {
>    false,	"sdc1HSO",	"sdc2TSO",	"sdTSO",
>  }
>  
> +------------------------------------------------------------------------------
> +-- Primary and extended opcode maps for MIPS R6
> +------------------------------------------------------------------------------
> +
> +local map_mul_r6 =   { shift = 6, mask = 3, [2] = "mulDST",   [3] = "muhDST" }
> +local map_mulu_r6 =  { shift = 6, mask = 3, [2] = "muluDST",  [3] = "muhuDST" }
> +local map_div_r6 =   { shift = 6, mask = 3, [2] = "divDST",   [3] = "modDST" }
> +local map_divu_r6 =  { shift = 6, mask = 3, [2] = "divuDST",  [3] = "moduDST" }
> +local map_dmul_r6 =  { shift = 6, mask = 3, [2] = "dmulDST",  [3] = "dmuhDST" }
> +local map_dmulu_r6 = { shift = 6, mask = 3, [2] = "dmuluDST", [3] = "dmuhuDST" }
> +local map_ddiv_r6 =  { shift = 6, mask = 3, [2] = "ddivDST",  [3] = "dmodDST" }
> +local map_ddivu_r6 = { shift = 6, mask = 3, [2] = "ddivuDST", [3] = "dmoduDST" }
> +
> +local map_special_r6 = {
> +  shift = 0, mask = 63,
> +  [0] = { shift = 0, mask = -1, [0] = "nop", _ = "sllDTA" },
> +  false,	map_srl,	"sraDTA",
> +  "sllvDTS",	false,		map_srlv,	"sravDTS",
> +  "jrS",	"jalrD1S",	false,		false,
> +  "syscallY",	"breakY",	false,		"sync",
> +  "clzDS",	"cloDS",	"dclzDS",	"dcloDS",
> +  "dsllvDST",	"dlsaDSTA",	"dsrlvDST",	"dsravDST",
> +  map_mul_r6,	map_mulu_r6,	map_div_r6,	map_divu_r6,
> +  map_dmul_r6,	map_dmulu_r6,	map_ddiv_r6,	map_ddivu_r6,
> +  "addDST",	"addu|moveDST0", "subDST",	"subu|neguDS0T",
> +  "andDST",	"or|moveDST0",	"xorDST",	"nor|notDST0",
> +  false,	false,		"sltDST",	"sltuDST",
> +  "daddDST",	"dadduDST",	"dsubDST",	"dsubuDST",
> +  "tgeSTZ",	"tgeuSTZ",	"tltSTZ",	"tltuSTZ",
> +  "teqSTZ",	"seleqzDST",	"tneSTZ",	"selnezDST",
> +  "dsllDTA",	false,		"dsrlDTA",	"dsraDTA",
> +  "dsll32DTA",	false,		"dsrl32DTA",	"dsra32DTA",
> +}
> +
> +local map_bshfl_r6 = {
> +  shift = 9, mask = 3,
> +  [1] = "alignDSTa",
> +  _ = {
> +    shift = 6, mask = 31,
> +    [0] = "bitswapDT",
> +    [2] = "wsbhDT",
> +    [16] = "sebDT",
> +    [24] = "sehDT",
> +  }
> +}
> +
> +local map_dbshfl_r6 = {
> +  shift = 9, mask = 3,
> +  [1] = "dalignDSTa",
> +  _ = {
> +    shift = 6, mask = 31,
> +    [0] = "dbitswapDT",
> +    [2] = "dsbhDT",
> +    [5] = "dshdDT",
> +  }
> +}
> +
> +local map_special3_r6 = {
> +  shift = 0, mask = 63,
> +  [0]  = "extTSAK", [1]  = "dextmTSAP", [3]  = "dextTSAK",
> +  [4]  = "insTSAL", [6]  = "dinsuTSEQ", [7]  = "dinsTSAL",
> +  [32] = map_bshfl_r6, [36] = map_dbshfl_r6,  [59] = "rdhwrTD",
> +}
> +
> +local map_regimm_r6 = {
> +  shift = 16, mask = 31,
> +  [0] = "bltzSB", [1] = "bgezSB",
> +  [6] = "dahiSI", [30] = "datiSI",
> +  [23] = "sigrieI", [31] = "synciSO",
> +}
> +
> +local map_pcrel_r6 = {
> +  shift = 19, mask = 3,
> +  [0] = "addiupcS2", "lwpcS2", "lwupcS2", {
> +    shift = 18, mask = 1,
> +    [0] = "ldpcS3", { shift = 16, mask = 3, [2] = "auipcSI", [3] = "aluipcSI" }
> +  }
> +}
> +
> +local map_cop1s_r6 = {
> +  shift = 0, mask = 63,
> +  [0] = "add.sFGH",	"sub.sFGH",	"mul.sFGH",	"div.sFGH",
> +  "sqrt.sFG",		"abs.sFG",	"mov.sFG",	"neg.sFG",
> +  "round.l.sFG",	"trunc.l.sFG",	"ceil.l.sFG",	"floor.l.sFG",
> +  "round.w.sFG",	"trunc.w.sFG",	"ceil.w.sFG",	"floor.w.sFG",
> +  "sel.sFGH",		false,		false,		false,
> +  "seleqz.sFGH",	"recip.sFG",	"rsqrt.sFG",	"selnez.sFGH",
> +  "maddf.sFGH",		"msubf.sFGH",	"rint.sFG",	"class.sFG",
> +  "min.sFGH",		"mina.sFGH",	"max.sFGH",	"maxa.sFGH",
> +  false,		"cvt.d.sFG",	false,		false,
> +  "cvt.w.sFG",		"cvt.l.sFG",
> +}
> +
> +local map_cop1d_r6 = {
> +  shift = 0, mask = 63,
> +  [0] = "add.dFGH",	"sub.dFGH",	"mul.dFGH",	"div.dFGH",
> +  "sqrt.dFG",		"abs.dFG",	"mov.dFG",	"neg.dFG",
> +  "round.l.dFG",	"trunc.l.dFG",	"ceil.l.dFG",	"floor.l.dFG",
> +  "round.w.dFG",	"trunc.w.dFG",	"ceil.w.dFG",	"floor.w.dFG",
> +  "sel.dFGH",		false,		false,		false,
> +  "seleqz.dFGH",	"recip.dFG",	"rsqrt.dFG",	"selnez.dFGH",
> +  "maddf.dFGH",		"msubf.dFGH",	"rint.dFG",	"class.dFG",
> +  "min.dFGH",		"mina.dFGH",	"max.dFGH",	"maxa.dFGH",
> +  "cvt.s.dFG",		false,		false,		false,
> +  "cvt.w.dFG",		"cvt.l.dFG",
> +}
> +
> +local map_cop1w_r6 = {
> +  shift = 0, mask = 63,
> +  [0] = "cmp.af.sFGH",	"cmp.un.sFGH",	"cmp.eq.sFGH",	"cmp.ueq.sFGH",
> +  "cmp.lt.sFGH",	"cmp.ult.sFGH",	"cmp.le.sFGH",	"cmp.ule.sFGH",
> +  "cmp.saf.sFGH",	"cmp.sun.sFGH",	"cmp.seq.sFGH",	"cmp.sueq.sFGH",
> +  "cmp.slt.sFGH",	"cmp.sult.sFGH",	"cmp.sle.sFGH",	"cmp.sule.sFGH",
> +  false,		"cmp.or.sFGH",	"cmp.une.sFGH",	"cmp.ne.sFGH",
> +  false,		false,		false,		false,
> +  false,		"cmp.sor.sFGH",	"cmp.sune.sFGH",	"cmp.sne.sFGH",
> +  false,		false,		false,		false,
> +  "cvt.s.wFG", "cvt.d.wFG",
> +}
> +
> +local map_cop1l_r6 = {
> +  shift = 0, mask = 63,
> +  [0] = "cmp.af.dFGH",	"cmp.un.dFGH",	"cmp.eq.dFGH",	"cmp.ueq.dFGH",
> +  "cmp.lt.dFGH",	"cmp.ult.dFGH",	"cmp.le.dFGH",	"cmp.ule.dFGH",
> +  "cmp.saf.dFGH",	"cmp.sun.dFGH",	"cmp.seq.dFGH",	"cmp.sueq.dFGH",
> +  "cmp.slt.dFGH",	"cmp.sult.dFGH",	"cmp.sle.dFGH",	"cmp.sule.dFGH",
> +  false,		"cmp.or.dFGH",	"cmp.une.dFGH",	"cmp.ne.dFGH",
> +  false,		false,		false,		false,
> +  false,		"cmp.sor.dFGH",	"cmp.sune.dFGH",	"cmp.sne.dFGH",
> +  false,		false,		false,		false,
> +  "cvt.s.lFG", "cvt.d.lFG",
> +}
> +
> +local map_cop1_r6 = {
> +  shift = 21, mask = 31,
> +  [0] = "mfc1TG", "dmfc1TG",	"cfc1TG",	"mfhc1TG",
> +  "mtc1TG",	"dmtc1TG",	"ctc1TG",	"mthc1TG",
> +  false,	"bc1eqzHB",	false,		false,
> +  false,	"bc1nezHB",	false,		false,
> +  map_cop1s_r6,	map_cop1d_r6,	false,		false,
> +  map_cop1w_r6,	map_cop1l_r6,
> +}
> +
> +local function maprs_popTS(rs, rt)
> +  if rt == 0 then return 0 elseif rs == 0 then return 1
> +  elseif rs == rt then return 2 else return 3 end
> +end
> +
> +local map_pop06_r6 = {
> +  maprs = maprs_popTS, [0] = "blezSB", "blezalcTB", "bgezalcTB", "bgeucSTB"
> +}
> +local map_pop07_r6 = {
> +  maprs = maprs_popTS, [0] = "bgtzSB", "bgtzalcTB", "bltzalcTB", "bltucSTB"
> +}
> +local map_pop26_r6 = {
> +  maprs = maprs_popTS, "blezcTB", "bgezcTB", "bgecSTB"
> +}
> +local map_pop27_r6 = {
> +  maprs = maprs_popTS, "bgtzcTB", "bltzcTB", "bltcSTB"
> +}
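Side note for anyone reading the POPxx maps above: the major opcode alone no longer picks the instruction in r6, so the rs/rt fields are classified first and the result indexes the map. A minimal C sketch of that lookup for POP06, assuming the field positions used in disass_ins (rs at bits 21..25, rt at bits 16..20); pop06_name is a hypothetical helper, not something in the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Same decision order as maprs_popTS in the quoted Lua. */
static int maprs_popTS(unsigned rs, unsigned rt)
{
  if (rt == 0) return 0;
  if (rs == 0) return 1;
  if (rs == rt) return 2;
  return 3;
}

/* Hypothetical helper: mnemonic for a word whose major opcode is POP06. */
static const char *pop06_name(uint32_t op)
{
  /* Order copied from map_pop06_r6, operand pattern letters dropped. */
  static const char *const names[4] = {
    "blez", "blezalc", "bgezalc", "bgeuc"
  };
  assert((op >> 26) == 6);
  unsigned rs = (op >> 21) & 31;
  unsigned rt = (op >> 16) & 31;
  return names[maprs_popTS(rs, rt)];
}
```

Note that map_pop26_r6 and map_pop27_r6 have no entry at index 0, so a word with rt == 0 in those groups ends up as an unknown instruction.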
> +
> +local function maprs_popS(rs, rt)
> +  if rs == 0 then return 0 else return 1 end
> +end
> +
> +local map_pop66_r6 = {
> +  maprs = maprs_popS, [0] = "jicTI", "beqzcSb"
> +}
> +local map_pop76_r6 = {
> +  maprs = maprs_popS, [0] = "jialcTI", "bnezcSb"
> +}
> +
> +local function maprs_popST(rs, rt)
> +  if rs >= rt then return 0 elseif rs == 0 then return 1 else return 2 end
> +end
> +
> +local map_pop10_r6 = {
> +  maprs = maprs_popST, [0] = "bovcSTB", "beqzalcTB", "beqcSTB"
> +}
> +local map_pop30_r6 = {
> +  maprs = maprs_popST, [0] = "bnvcSTB", "bnezalcTB", "bnecSTB"
> +}
> +
> +local map_pri_r6 = {
> +  [0] = map_special_r6,	map_regimm_r6,	"jJ",	"jalJ",
> +  "beq|beqz|bST00B",	"bne|bnezST0B",		map_pop06_r6,	map_pop07_r6,
> +  map_pop10_r6,	"addiu|liTS0I",	"sltiTSI",	"sltiuTSI",
> +  "andiTSU",	"ori|liTS0U",	"xoriTSU",	"aui|luiTS0U",
> +  map_cop0,	map_cop1_r6,	false,		false,
> +  false,	false,		map_pop26_r6,	map_pop27_r6,
> +  map_pop30_r6,	"daddiuTSI",	false,		false,
> +  false,	"dauiTSI",	false,		map_special3_r6,
> +  "lbTSO",	"lhTSO",	false,		"lwTSO",
> +  "lbuTSO",	"lhuTSO",	false,		false,
> +  "sbTSO",	"shTSO",	false,		"swTSO",
> +  false,	false,		false,		false,
> +  false,	"lwc1HSO",	"bc#",		false,
> +  false,	"ldc1HSO",	map_pop66_r6,	"ldTSO",
> +  false,	"swc1HSO",	"balc#",	map_pcrel_r6,
> +  false,	"sdc1HSO",	map_pop76_r6,	"sdTSO",
> +}
> +
>  ------------------------------------------------------------------------------
>  
>  local map_gpr = {
> @@ -287,10 +494,14 @@ local function disass_ins(ctx)
>    ctx.op = op
>    ctx.rel = nil
>  
> -  local opat = map_pri[rshift(op, 26)]
> +  local opat = ctx.map_pri[rshift(op, 26)]
>    while type(opat) ~= "string" do
>      if not opat then return unknown(ctx) end
> -    opat = opat[band(rshift(op, opat.shift), opat.mask)] or opat._
> +    if opat.maprs then
> +      opat = opat[opat.maprs(band(rshift(op,21),31), band(rshift(op,16),31))]
> +    else
> +      opat = opat[band(rshift(op, opat.shift), opat.mask)] or opat._
> +    end
>    end
>    local name, pat = match(opat, "^([a-z0-9_.]*)(.*)")
>    local altname, pat2 = match(pat, "|([a-z0-9_.|]*)(.*)")
> @@ -314,6 +525,8 @@ local function disass_ins(ctx)
>        x = "f"..band(rshift(op, 21), 31)
>      elseif p == "A" then
>        x = band(rshift(op, 6), 31)
> +    elseif p == "a" then
> +      x = band(rshift(op, 6), 7)
>      elseif p == "E" then
>        x = band(rshift(op, 6), 31) + 32
>      elseif p == "M" then
> @@ -333,6 +546,10 @@ local function disass_ins(ctx)
>        x = band(rshift(op, 11), 31) - last + 33
>      elseif p == "I" then
>        x = arshift(lshift(op, 16), 16)
> +    elseif p == "2" then
> +      x = arshift(lshift(op, 13), 11)
> +    elseif p == "3" then
> +      x = arshift(lshift(op, 14), 11)
>      elseif p == "U" then
>        x = band(op, 0xffff)
>      elseif p == "O" then
> @@ -342,7 +559,15 @@ local function disass_ins(ctx)
>        local index = map_gpr[band(rshift(op, 16), 31)]
>        operands[#operands] = format("%s(%s)", index, last)
>      elseif p == "B" then
> -      x = ctx.addr + ctx.pos + arshift(lshift(op, 16), 16)*4 + 4
> +      x = ctx.addr + ctx.pos + arshift(lshift(op, 16), 14) + 4
> +      ctx.rel = x
> +      x = format("0x%08x", x)
> +    elseif p == "b" then
> +      x = ctx.addr + ctx.pos + arshift(lshift(op, 11), 9) + 4
> +      ctx.rel = x
> +      x = format("0x%08x", x)
> +    elseif p == "#" then
> +      x = ctx.addr + ctx.pos + arshift(lshift(op, 6), 4) + 4
>        ctx.rel = x
>        x = format("0x%08x", x)
>      elseif p == "J" then
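For reference, the shift pairs in the "B", "b" and "#" cases (and in "2" and "3" earlier) all follow one pattern: shift left until the field's sign bit is bit 31, then shift right arithmetically by a smaller amount, so the value comes out sign-extended and already scaled. A small C sketch of the same arithmetic; field_disp and the disp_* wrappers are names made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low `bits` bits of `op` and multiply by 2^log2scale.
** Same value as arshift(lshift(op, 32-bits), 32-bits-log2scale) in the Lua. */
static int32_t field_disp(uint32_t op, int bits, int log2scale)
{
  assert(bits > 0 && bits < 32 && log2scale >= 0 && bits + log2scale <= 32);
  uint32_t sign = 1u << (bits - 1);
  uint32_t v = op & ((1u << bits) - 1u);
  /* (v ^ sign) - sign is the usual portable sign-extension identity. */
  int64_t sext = (int64_t)(v ^ sign) - (int64_t)sign;
  return (int32_t)(sext * ((int64_t)1 << log2scale));
}

/* Hypothetical wrappers named after the pattern characters. */
static int32_t disp_B(uint32_t op)    { return field_disp(op, 16, 2); }
static int32_t disp_b(uint32_t op)    { return field_disp(op, 21, 2); }
static int32_t disp_hash(uint32_t op) { return field_disp(op, 26, 2); }
```

With this, the old `arshift(lshift(op, 16), 16)*4` and the new `arshift(lshift(op, 16), 14)` give the same displacement for "B"; the branch target is then ctx.addr + ctx.pos + displacement + 4, as in the hunk.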
> @@ -408,6 +633,7 @@ local function create(code, addr, out)
>    ctx.disass = disass_block
>    ctx.hexdump = 8
>    ctx.get = get_be
> +  ctx.map_pri = map_pri
>    return ctx
>  end
>  
> @@ -417,6 +643,19 @@ local function create_el(code, addr, out)
>    return ctx
>  end
>  
> +local function create_r6(code, addr, out)
> +  local ctx = create(code, addr, out)
> +  ctx.map_pri = map_pri_r6
> +  return ctx
> +end
> +
> +local function create_r6_el(code, addr, out)
> +  local ctx = create(code, addr, out)
> +  ctx.get = get_le
> +  ctx.map_pri = map_pri_r6
> +  return ctx
> +end
> +
>  -- Simple API: disassemble code (a string) at address and output via out.
>  local function disass(code, addr, out)
>    create(code, addr, out):disass()
> @@ -426,6 +665,14 @@ local function disass_el(code, addr, out)
>    create_el(code, addr, out):disass()
>  end
>  
> +local function disass_r6(code, addr, out)
> +  create_r6(code, addr, out):disass()
> +end
> +
> +local function disass_r6_el(code, addr, out)
> +  create_r6_el(code, addr, out):disass()
> +end
> +
>  -- Return register name for RID.
>  local function regname(r)
>    if r < 32 then return map_gpr[r] end
> @@ -436,8 +683,12 @@ end
>  return {
>    create = create,
>    create_el = create_el,
> +  create_r6 = create_r6,
> +  create_r6_el = create_r6_el,
>    disass = disass,
>    disass_el = disass_el,
> +  disass_r6 = disass_r6,
> +  disass_r6_el = disass_r6_el,
>    regname = regname
>  }
>  
> diff --git a/src/jit/dis_mips64r6.lua b/src/jit/dis_mips64r6.lua
> new file mode 100644
> index 00000000..023c05ab
> --- /dev/null
> +++ b/src/jit/dis_mips64r6.lua
> @@ -0,0 +1,17 @@
> +----------------------------------------------------------------------------
> +-- LuaJIT MIPS64R6 disassembler wrapper module.
> +--
> +-- Copyright (C) 2005-2017 Mike Pall. All rights reserved.
> +-- Released under the MIT license. See Copyright Notice in luajit.h
> +----------------------------------------------------------------------------
> +-- This module just exports the r6 big-endian functions from the
> +-- MIPS disassembler module. All the interesting stuff is there.
> +------------------------------------------------------------------------------
> +
> +local dis_mips = require((string.match(..., ".*%.") or "").."dis_mips")
> +return {
> +  create = dis_mips.create_r6,
> +  disass = dis_mips.disass_r6,
> +  regname = dis_mips.regname
> +}
> +
> diff --git a/src/jit/dis_mips64r6el.lua b/src/jit/dis_mips64r6el.lua
> new file mode 100644
> index 00000000..f2988339
> --- /dev/null
> +++ b/src/jit/dis_mips64r6el.lua
> @@ -0,0 +1,17 @@
> +----------------------------------------------------------------------------
> +-- LuaJIT MIPS64R6EL disassembler wrapper module.
> +--
> +-- Copyright (C) 2005-2017 Mike Pall. All rights reserved.
> +-- Released under the MIT license. See Copyright Notice in luajit.h
> +----------------------------------------------------------------------------
> +-- This module just exports the r6 little-endian functions from the
> +-- MIPS disassembler module. All the interesting stuff is there.
> +------------------------------------------------------------------------------
> +
> +local dis_mips = require((string.match(..., ".*%.") or "").."dis_mips")
> +return {
> +  create = dis_mips.create_r6_el,
> +  disass = dis_mips.disass_r6_el,
> +  regname = dis_mips.regname
> +}
> +
> diff --git a/src/lj_arch.h b/src/lj_arch.h
> index 0351e046..cf31a291 100644
> --- a/src/lj_arch.h
> +++ b/src/lj_arch.h
> @@ -342,18 +342,38 @@
>  #elif LUAJIT_TARGET == LUAJIT_ARCH_MIPS32 || LUAJIT_TARGET == LUAJIT_ARCH_MIPS64
>  
>  #if defined(__MIPSEL__) || defined(__MIPSEL) || defined(_MIPSEL)
> +#if __mips_isa_rev >= 6
> +#define LJ_TARGET_MIPSR6	1
> +#define LJ_TARGET_UNALIGNED	1
> +#endif
>  #if LUAJIT_TARGET == LUAJIT_ARCH_MIPS32
> +#if LJ_TARGET_MIPSR6
> +#define LJ_ARCH_NAME		"mips32r6el"
> +#else
>  #define LJ_ARCH_NAME		"mipsel"
> +#endif
> +#else
> +#if LJ_TARGET_MIPSR6
> +#define LJ_ARCH_NAME		"mips64r6el"
>  #else
>  #define LJ_ARCH_NAME		"mips64el"
>  #endif
> +#endif
>  #define LJ_ARCH_ENDIAN		LUAJIT_LE
>  #else
>  #if LUAJIT_TARGET == LUAJIT_ARCH_MIPS32
> +#if LJ_TARGET_MIPSR6
> +#define LJ_ARCH_NAME		"mips32r6"
> +#else
>  #define LJ_ARCH_NAME		"mips"
> +#endif
> +#else
> +#if LJ_TARGET_MIPSR6
> +#define LJ_ARCH_NAME		"mips64r6"
>  #else
>  #define LJ_ARCH_NAME		"mips64"
>  #endif
> +#endif
>  #define LJ_ARCH_ENDIAN		LUAJIT_BE
>  #endif
>  
> @@ -390,7 +410,9 @@
>  #define LJ_TARGET_UNIFYROT	2	/* Want only IR_BROR. */
>  #define LJ_ARCH_NUMMODE		LJ_NUMMODE_DUAL
>  
> -#if _MIPS_ARCH_MIPS32R2 || _MIPS_ARCH_MIPS64R2
> +#if LJ_TARGET_MIPSR6
> +#define LJ_ARCH_VERSION		60
> +#elif _MIPS_ARCH_MIPS32R2 || _MIPS_ARCH_MIPS64R2
>  #define LJ_ARCH_VERSION		20
>  #else
>  #define LJ_ARCH_VERSION		10
> @@ -472,8 +494,13 @@
>  #if !((defined(_MIPS_SIM_ABI32) && _MIPS_SIM == _MIPS_SIM_ABI32) || (defined(_ABIO32) && _MIPS_SIM == _ABIO32))
>  #error "Only o32 ABI supported for MIPS32"
>  #endif
> +#if LJ_TARGET_MIPSR6
> +/* Not that useful, since most available r6 CPUs are 64 bit. */
> +#error "No support for MIPS32R6"
> +#endif
>  #elif LJ_TARGET_MIPS64
>  #if !((defined(_MIPS_SIM_ABI64) && _MIPS_SIM == _MIPS_SIM_ABI64) || (defined(_ABI64) && _MIPS_SIM == _ABI64))
> +/* MIPS32ON64 aka n32 ABI support might be desirable, but difficult. */
>  #error "Only n64 ABI supported for MIPS64"
>  #endif
>  #endif
> diff --git a/src/lj_asm.c b/src/lj_asm.c
> index 25b96264..96b8c032 100644
> --- a/src/lj_asm.c
> +++ b/src/lj_asm.c
> @@ -2159,8 +2159,8 @@ static void asm_setup_regsp(ASMState *as)
>  	  ir->prev = REGSP_HINT(RID_FPRET);
>  	  continue;
>  	}
> -	/* fallthrough */
>  #endif
> +      /* fallthrough */
>        case IR_CALLN: case IR_CALLXS:
>  #if LJ_SOFTFP
>        case IR_MIN: case IR_MAX:
> diff --git a/src/lj_asm_mips.h b/src/lj_asm_mips.h
> index 23ffc3aa..4626507b 100644
> --- a/src/lj_asm_mips.h
> +++ b/src/lj_asm_mips.h
> @@ -101,7 +101,12 @@ static void asm_guard(ASMState *as, MIPSIns mi, Reg rs, Reg rt)
>      as->invmcp = NULL;
>      as->loopinv = 1;
>      as->mcp = p+1;
> +#if !LJ_TARGET_MIPSR6
>      mi = mi ^ ((mi>>28) == 1 ? 0x04000000u : 0x00010000u);  /* Invert cond. */
> +#else
> +    mi = mi ^ ((mi>>28) == 1 ? 0x04000000u :
> +	       (mi>>28) == 4 ? 0x00800000u : 0x00010000u);  /* Invert cond. */
> +#endif
>      target = p;  /* Patch target later in asm_loop_fixup. */
>    }
>    emit_ti(as, MIPSI_LI, RID_TMP, as->snapno);
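A note on the extra XOR mask in the r6 branch: bc1eqz and bc1nez differ in bit 23 rather than bit 16, hence the new (mi>>28) == 4 case. The C sketch below shows the three flips; the opcode values are my reading of the MIPS encoding tables, not copied from this patch, so please treat them as assumptions:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encodings (register fields zero), SK_ prefix is illustrative. */
#define SK_BEQ     0x10000000u
#define SK_BNE     0x14000000u
#define SK_BLTZ    0x04000000u
#define SK_BGEZ    0x04010000u
#define SK_BC1F    0x45000000u
#define SK_BC1T    0x45010000u
#define SK_BC1EQZ  0x45200000u
#define SK_BC1NEZ  0x45a00000u

/* Flip the sense of a conditional branch, following asm_guard. */
static uint32_t invert_cond(uint32_t mi, int r6)
{
  uint32_t group = mi >> 28;
  /* Only the branch groups this sketch knows about. */
  assert(group == 0 || group == 1 || group == 4);
  if (group == 1) return mi ^ 0x04000000u;        /* beq <-> bne */
  if (r6 && group == 4) return mi ^ 0x00800000u;  /* bc1eqz <-> bc1nez */
  return mi ^ 0x00010000u;  /* bltz <-> bgez, pre-r6 bc1f <-> bc1t */
}
```

The register fields are untouched by any of the masks, so the inverted instruction still tests the same operands.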
> @@ -410,7 +415,11 @@ static void asm_callround(ASMState *as, IRIns *ir, IRCallID id)
>  {
>    /* The modified regs must match with the *.dasc implementation. */
>    RegSet drop = RID2RSET(RID_R1)|RID2RSET(RID_R12)|RID2RSET(RID_FPRET)|
> -		RID2RSET(RID_F2)|RID2RSET(RID_F4)|RID2RSET(REGARG_FIRSTFPR);
> +		RID2RSET(RID_F2)|RID2RSET(RID_F4)|RID2RSET(REGARG_FIRSTFPR)
> +#if LJ_TARGET_MIPSR6
> +		|RID2RSET(RID_F21)
> +#endif
> +		;
>    if (ra_hasreg(ir->r)) rset_clear(drop, ir->r);
>    ra_evictset(as, drop);
>    ra_destreg(as, ir, RID_FPRET);
> @@ -444,8 +453,13 @@ static void asm_tointg(ASMState *as, IRIns *ir, Reg left)
>  {
>    Reg tmp = ra_scratch(as, rset_exclude(RSET_FPR, left));
>    Reg dest = ra_dest(as, ir, RSET_GPR);
> +#if !LJ_TARGET_MIPSR6
>    asm_guard(as, MIPSI_BC1F, 0, 0);
>    emit_fgh(as, MIPSI_C_EQ_D, 0, tmp, left);
> +#else
> +  asm_guard(as, MIPSI_BC1EQZ, 0, (tmp&31));
> +  emit_fgh(as, MIPSI_CMP_EQ_D, tmp, tmp, left);
> +#endif
>    emit_fg(as, MIPSI_CVT_D_W, tmp, tmp);
>    emit_tg(as, MIPSI_MFC1, dest, tmp);
>    emit_fg(as, MIPSI_CVT_W_D, tmp, left);
> @@ -599,8 +613,13 @@ static void asm_conv(ASMState *as, IRIns *ir)
>  		     (void *)&as->J->k64[LJ_K64_M2P64],
>  		     rset_exclude(RSET_GPR, dest));
>  	  emit_fg(as, MIPSI_TRUNC_L_D, tmp, left);  /* Delay slot. */
> -	  emit_branch(as, MIPSI_BC1T, 0, 0, l_end);
> -	  emit_fgh(as, MIPSI_C_OLT_D, 0, left, tmp);
> +#if !LJ_TARGET_MIPSR6
> +	 emit_branch(as, MIPSI_BC1T, 0, 0, l_end);
> +	 emit_fgh(as, MIPSI_C_OLT_D, 0, left, tmp);
> +#else
> +	 emit_branch(as, MIPSI_BC1NEZ, 0, (left&31), l_end);
> +	 emit_fgh(as, MIPSI_CMP_LT_D, left, left, tmp);
> +#endif
>  	  emit_lsptr(as, MIPSI_LDC1, (tmp & 31),
>  		     (void *)&as->J->k64[LJ_K64_2P63],
>  		     rset_exclude(RSET_GPR, dest));
> @@ -611,8 +630,13 @@ static void asm_conv(ASMState *as, IRIns *ir)
>  		     (void *)&as->J->k32[LJ_K32_M2P64],
>  		     rset_exclude(RSET_GPR, dest));
>  	  emit_fg(as, MIPSI_TRUNC_L_S, tmp, left);  /* Delay slot. */
> -	  emit_branch(as, MIPSI_BC1T, 0, 0, l_end);
> -	  emit_fgh(as, MIPSI_C_OLT_S, 0, left, tmp);
> +#if !LJ_TARGET_MIPSR6
> +	 emit_branch(as, MIPSI_BC1T, 0, 0, l_end);
> +	 emit_fgh(as, MIPSI_C_OLT_S, 0, left, tmp);
> +#else
> +	 emit_branch(as, MIPSI_BC1NEZ, 0, (left&31), l_end);
> +	 emit_fgh(as, MIPSI_CMP_LT_S, left, left, tmp);
> +#endif
>  	  emit_lsptr(as, MIPSI_LWC1, (tmp & 31),
>  		     (void *)&as->J->k32[LJ_K32_2P63],
>  		     rset_exclude(RSET_GPR, dest));
> @@ -840,8 +864,12 @@ static void asm_aref(ASMState *as, IRIns *ir)
>    }
>    base = ra_alloc1(as, ir->op1, RSET_GPR);
>    idx = ra_alloc1(as, ir->op2, rset_exclude(RSET_GPR, base));
> +#if !LJ_TARGET_MIPSR6
>    emit_dst(as, MIPSI_AADDU, dest, RID_TMP, base);
>    emit_dta(as, MIPSI_SLL, RID_TMP, idx, 3);
> +#else
> +  emit_dst(as, MIPSI_ALSA | MIPSF_A(3-1), dest, idx, base);
> +#endif
>  }
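On the `MIPSF_A(3-1)` above: as far as I understand the r6 shift-and-add encoding, the 2-bit field stores the shift amount minus one, so 3-1 asks for a left shift by 3, i.e. idx*8 + base in a single instruction instead of sll + addu. A tiny C sketch of the computed value, assuming MIPSI_ALSA resolves to the pointer-sized variant on a 64-bit target; lsa64 is a hypothetical name:

```c
#include <assert.h>
#include <stdint.h>

/* Value computed by a 64-bit shift-and-add with a 2-bit shift field.
** The field stores shift-1, so field 2 means "shift left by 3". */
static uint64_t lsa64(uint64_t rs, uint64_t rt, unsigned sa_field)
{
  assert(sa_field < 4);
  return (rs << (sa_field + 1)) + rt;
}
```

For example, lsa64(5, 0x1000, 3-1) yields 0x1028, the address of the sixth 8-byte slot after a base of 0x1000.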
>  
>  /* Inlined hash lookup. Specialized for key type and for const keys.
> @@ -944,8 +972,13 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>      l_end = asm_exitstub_addr(as);
>    }
>    if (!LJ_SOFTFP && irt_isnum(kt)) {
> +#if !LJ_TARGET_MIPSR6
>      emit_branch(as, MIPSI_BC1T, 0, 0, l_end);
>      emit_fgh(as, MIPSI_C_EQ_D, 0, tmpnum, key);
> +#else
> +    emit_branch(as, MIPSI_BC1NEZ, 0, (tmpnum&31), l_end);
> +    emit_fgh(as, MIPSI_CMP_EQ_D, tmpnum, tmpnum, key);
> +#endif
>      *--as->mcp = MIPSI_NOP;  /* Avoid NaN comparison overhead. */
>      emit_branch(as, MIPSI_BEQ, tmp1, RID_ZERO, l_next);
>      emit_tsi(as, MIPSI_SLTIU, tmp1, tmp1, (int32_t)LJ_TISNUM);
> @@ -1196,7 +1229,9 @@ static MIPSIns asm_fxloadins(IRIns *ir)
>    case IRT_I16: return MIPSI_LH;
>    case IRT_U16: return MIPSI_LHU;
>    case IRT_NUM: lua_assert(!LJ_SOFTFP32); if (!LJ_SOFTFP) return MIPSI_LDC1;
> +  /* fallthrough */
>    case IRT_FLOAT: if (!LJ_SOFTFP) return MIPSI_LWC1;
> +  /* fallthrough */
>    default: return (LJ_64 && irt_is64(ir->t)) ? MIPSI_LD : MIPSI_LW;
>    }
>  }
> @@ -1207,7 +1242,9 @@ static MIPSIns asm_fxstoreins(IRIns *ir)
>    case IRT_I8: case IRT_U8: return MIPSI_SB;
>    case IRT_I16: case IRT_U16: return MIPSI_SH;
>    case IRT_NUM: lua_assert(!LJ_SOFTFP32); if (!LJ_SOFTFP) return MIPSI_SDC1;
> +  /* fallthrough */
>    case IRT_FLOAT: if (!LJ_SOFTFP) return MIPSI_SWC1;
> +  /* fallthrough */
>    default: return (LJ_64 && irt_is64(ir->t)) ? MIPSI_SD : MIPSI_SW;
>    }
>  }
> @@ -1253,7 +1290,7 @@ static void asm_xload(ASMState *as, IRIns *ir)
>  {
>    Reg dest = ra_dest(as, ir,
>      (!LJ_SOFTFP && irt_isfp(ir->t)) ? RSET_FPR : RSET_GPR);
> -  lua_assert(!(ir->op2 & IRXLOAD_UNALIGNED));
> +  lua_assert(LJ_TARGET_UNALIGNED || !(ir->op2 & IRXLOAD_UNALIGNED));
>    asm_fusexref(as, asm_fxloadins(ir), dest, ir->op1, RSET_GPR, 0);
>  }
>  
> @@ -1545,7 +1582,7 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>        ofs -= 4; if (LJ_BE) ir++; else ir--;
>      }
>  #else
> -    emit_tsi(as, MIPSI_SD, ra_alloc1(as, ir->op2, allow),
> +    emit_tsi(as, sz == 8 ? MIPSI_SD : MIPSI_SW, ra_alloc1(as, ir->op2, allow),
>  	     RID_RET, sizeof(GCcdata));
>  #endif
>      lua_assert(sz == 4 || sz == 8);
> @@ -1678,6 +1715,7 @@ static void asm_add(ASMState *as, IRIns *ir)
>    } else
>  #endif
>    {
> +    /* TODO MIPSR6: Fuse ADD(BSHL(a,1-4),b) or ADD(ADD(a,a),b) to MIPSI_ALSA. */
>      Reg dest = ra_dest(as, ir, RSET_GPR);
>      Reg right, left = ra_hintalloc(as, ir->op1, dest, RSET_GPR);
>      if (irref_isk(ir->op2)) {
> @@ -1722,8 +1760,12 @@ static void asm_mul(ASMState *as, IRIns *ir)
>      Reg right, left = ra_alloc2(as, ir, RSET_GPR);
>      right = (left >> 8); left &= 255;
>      if (LJ_64 && irt_is64(ir->t)) {
> +#if !LJ_TARGET_MIPSR6
>        emit_dst(as, MIPSI_MFLO, dest, 0, 0);
>        emit_dst(as, MIPSI_DMULT, 0, left, right);
> +#else
> +      emit_dst(as, MIPSI_DMUL, dest, left, right);
> +#endif
>      } else {
>        emit_dst(as, MIPSI_MUL, dest, left, right);
>      }
> @@ -1806,6 +1848,7 @@ static void asm_abs(ASMState *as, IRIns *ir)
>  
>  static void asm_arithov(ASMState *as, IRIns *ir)
>  {
> +  /* TODO MIPSR6: bovc/bnvc. Caveat: no delay slot to load RID_TMP. */
>    Reg right, left, tmp, dest = ra_dest(as, ir, RSET_GPR);
>    lua_assert(!irt_is64(ir->t));
>    if (irref_isk(ir->op2)) {
> @@ -1850,9 +1893,14 @@ static void asm_mulov(ASMState *as, IRIns *ir)
>  						 right), dest));
>    asm_guard(as, MIPSI_BNE, RID_TMP, tmp);
>    emit_dta(as, MIPSI_SRA, RID_TMP, dest, 31);
> +#if !LJ_TARGET_MIPSR6
>    emit_dst(as, MIPSI_MFHI, tmp, 0, 0);
>    emit_dst(as, MIPSI_MFLO, dest, 0, 0);
>    emit_dst(as, MIPSI_MULT, 0, left, right);
> +#else
> +  emit_dst(as, MIPSI_MUL, dest, left, right);
> +  emit_dst(as, MIPSI_MUH, tmp, left, right);
> +#endif
>  }
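Side note for readers of this hunk: LuaJIT's backend emits machine code backwards, so on R6 the multiply pair (MUH, MUL) runs first, then the SRA, then the guard. The overflow test itself is unchanged: the high word must equal the sign-extension of the low word. A minimal C sketch of that check follows; `mul_overflows_i32` is a hypothetical name used only for illustration and is not part of the patch.

```c
#include <stdint.h>

/* Hypothetical helper (not in the patch): models the MUL/MUH/SRA/BNE
** sequence. Returns 1 if a*b does not fit into int32_t, 0 otherwise. */
static int mul_overflows_i32(int32_t a, int32_t b)
{
  int64_t p = (int64_t)a * (int64_t)b;                   /* Full 64-bit product. */
  uint32_t lo_bits = (uint32_t)(uint64_t)p;              /* What MUL would yield. */
  uint32_t hi_bits = (uint32_t)((uint64_t)p >> 32);      /* What MUH would yield. */
  uint32_t sign = (lo_bits & 0x80000000u) ? 0xffffffffu : 0u;  /* SRA lo, 31. */
  return hi_bits != sign;                                /* Guard: BNE hi, sign. */
}
```

The sketch works on unsigned bit patterns to avoid relying on implementation-defined signed shifts; the hardware sequence has no such concern.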
>  
>  #if LJ_32 && LJ_HASFFI
> @@ -2076,6 +2124,7 @@ static void asm_min_max(ASMState *as, IRIns *ir, int ismax)
>      Reg dest = ra_dest(as, ir, RSET_FPR);
>      Reg right, left = ra_alloc2(as, ir, RSET_FPR);
>      right = (left >> 8); left &= 255;
> +#if !LJ_TARGET_MIPSR6
>      if (dest == left) {
>        emit_fg(as, MIPSI_MOVT_D, dest, right);
>      } else {
> @@ -2083,19 +2132,37 @@ static void asm_min_max(ASMState *as, IRIns *ir, int ismax)
>        if (dest != right) emit_fg(as, MIPSI_MOV_D, dest, right);
>      }
>      emit_fgh(as, MIPSI_C_OLT_D, 0, ismax ? left : right, ismax ? right : left);
> +#else
> +    emit_fgh(as, ismax ? MIPSI_MAX_D : MIPSI_MIN_D, dest, left, right);
> +#endif
>  #endif
>    } else {
>      Reg dest = ra_dest(as, ir, RSET_GPR);
>      Reg right, left = ra_alloc2(as, ir, RSET_GPR);
>      right = (left >> 8); left &= 255;
> -    if (dest == left) {
> -      emit_dst(as, MIPSI_MOVN, dest, right, RID_TMP);
> +    if (left == right) {
> +      if (dest != left) emit_move(as, dest, left);
>      } else {
> -      emit_dst(as, MIPSI_MOVZ, dest, left, RID_TMP);
> -      if (dest != right) emit_move(as, dest, right);
> +#if !LJ_TARGET_MIPSR6
> +      if (dest == left) {
> +	emit_dst(as, MIPSI_MOVN, dest, right, RID_TMP);
> +      } else {
> +	emit_dst(as, MIPSI_MOVZ, dest, left, RID_TMP);
> +	if (dest != right) emit_move(as, dest, right);
> +      }
> +#else
> +      emit_dst(as, MIPSI_OR, dest, dest, RID_TMP);
> +      if (dest != right) {
> +	emit_dst(as, MIPSI_SELNEZ, RID_TMP, right, RID_TMP);
> +	emit_dst(as, MIPSI_SELEQZ, dest, left, RID_TMP);
> +      } else {
> +	emit_dst(as, MIPSI_SELEQZ, RID_TMP, left, RID_TMP);
> +	emit_dst(as, MIPSI_SELNEZ, dest, right, RID_TMP);
> +      }
> +#endif
> +      emit_dst(as, MIPSI_SLT, RID_TMP,
> +	       ismax ? left : right, ismax ? right : left);
>      }
> -    emit_dst(as, MIPSI_SLT, RID_TMP,
> -	     ismax ? left : right, ismax ? right : left);
>    }
>  }
>  
> @@ -2179,10 +2246,18 @@ static void asm_comp(ASMState *as, IRIns *ir)
>  #if LJ_SOFTFP
>      asm_sfpcomp(as, ir);
>  #else
> +#if !LJ_TARGET_MIPSR6
>      Reg right, left = ra_alloc2(as, ir, RSET_FPR);
>      right = (left >> 8); left &= 255;
>      asm_guard(as, (op&1) ? MIPSI_BC1T : MIPSI_BC1F, 0, 0);
>      emit_fgh(as, MIPSI_C_OLT_D + ((op&3) ^ ((op>>2)&1)), 0, left, right);
> +#else
> +    Reg tmp, right, left = ra_alloc2(as, ir, RSET_FPR);
> +    right = (left >> 8); left &= 255;
> +    tmp = ra_scratch(as, rset_exclude(rset_exclude(RSET_FPR, left), right));
> +    asm_guard(as, (op&1) ? MIPSI_BC1NEZ : MIPSI_BC1EQZ, 0, (tmp&31));
> +    emit_fgh(as, MIPSI_CMP_LT_D + ((op&3) ^ ((op>>2)&1)), tmp, left, right);
> +#endif
>  #endif
>    } else {
>      Reg right, left = ra_alloc1(as, ir->op1, RSET_GPR);
> @@ -2218,9 +2293,13 @@ static void asm_equal(ASMState *as, IRIns *ir)
>    if (!LJ_SOFTFP32 && irt_isnum(ir->t)) {
>  #if LJ_SOFTFP
>      asm_sfpcomp(as, ir);
> -#else
> +#elif !LJ_TARGET_MIPSR6
>      asm_guard(as, (ir->o & 1) ? MIPSI_BC1T : MIPSI_BC1F, 0, 0);
>      emit_fgh(as, MIPSI_C_EQ_D, 0, left, right);
> +#else
> +    Reg tmp = ra_scratch(as, rset_exclude(rset_exclude(RSET_FPR, left), right));
> +    asm_guard(as, (ir->o & 1) ? MIPSI_BC1NEZ : MIPSI_BC1EQZ, 0, (tmp&31));
> +    emit_fgh(as, MIPSI_CMP_EQ_D, tmp, left, right);
>  #endif
>    } else {
>      asm_guard(as, (ir->o & 1) ? MIPSI_BEQ : MIPSI_BNE, left, right);
> @@ -2623,7 +2702,12 @@ void lj_asm_patchexit(jit_State *J, GCtrace *T, ExitNo exitno, MCode *target)
>        if (((p[-1] ^ (px-p)) & 0xffffu) == 0 &&
>  	  ((p[-1] & 0xf0000000u) == MIPSI_BEQ ||
>  	   (p[-1] & 0xfc1e0000u) == MIPSI_BLTZ ||
> -	   (p[-1] & 0xffe00000u) == MIPSI_BC1F)) {
> +#if !LJ_TARGET_MIPSR6
> +	   (p[-1] & 0xffe00000u) == MIPSI_BC1F
> +#else
> +	   (p[-1] & 0xff600000u) == MIPSI_BC1EQZ
> +#endif
> +	  )) {
>  	ptrdiff_t delta = target - p;
>  	if (((delta + 0x8000) >> 16) == 0) {  /* Patch in-range branch. */
>  	patchbranch:
> diff --git a/src/lj_emit_mips.h b/src/lj_emit_mips.h
> index bb6593ae..313d030a 100644
> --- a/src/lj_emit_mips.h
> +++ b/src/lj_emit_mips.h
> @@ -138,6 +138,7 @@ static void emit_loadu64(ASMState *as, Reg r, uint64_t u64)
>      } else if (emit_kdelta1(as, r, (intptr_t)u64)) {
>        return;
>      } else {
> +      /* TODO MIPSR6: Use DAHI & DATI. Caveat: sign-extension. */
>        if ((u64 & 0xffff)) {
>  	emit_tsi(as, MIPSI_ORI, r, r, u64 & 0xffff);
>        }
> @@ -236,10 +237,22 @@ static void emit_jmp(ASMState *as, MCode *target)
>  static void emit_call(ASMState *as, void *target, int needcfa)
>  {
>    MCode *p = as->mcp;
> -  *--p = MIPSI_NOP;
> +#if LJ_TARGET_MIPSR6
> +  ptrdiff_t delta = (char *)target - (char *)p;
> +  if ((((delta>>2) + 0x02000000) >> 26) == 0) {  /* Try compact call first. */
> +    *--p = MIPSI_BALC | (((uintptr_t)delta >>2) & 0x03ffffffu);
> +    as->mcp = p;
> +    return;
> +  }
> +#endif
> +  *--p = MIPSI_NOP;  /* Delay slot. */
>    if ((((uintptr_t)target ^ (uintptr_t)p) >> 28) == 0) {
> +#if !LJ_TARGET_MIPSR6
>      *--p = (((uintptr_t)target & 1) ? MIPSI_JALX : MIPSI_JAL) |
>  	   (((uintptr_t)target >>2) & 0x03ffffffu);
> +#else
> +    *--p = MIPSI_JAL | (((uintptr_t)target >>2) & 0x03ffffffu);
> +#endif
>    } else {  /* Target out of range: need indirect call. */
>      *--p = MIPSI_JALR | MIPSF_S(RID_CFUNCADDR);
>      needcfa = 1;
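For reference, the compact-call range test above accepts a byte delta whose word offset fits into a signed 26-bit field. Below is a small C sketch of the same check and encoding. The names `fits_balc` and `encode_balc` are hypothetical, the opcode constant is copied from the lj_target_mips.h hunk of this patch, and the sketch assumes an arithmetic right shift for negative values, as the original expression does.

```c
#include <stdint.h>

#define SKETCH_MIPSI_BALC 0xe8000000u  /* Value of MIPSI_BALC in this patch. */

/* Hypothetical helper: same range test as in emit_call(). Assumes that
** >> on a negative int64_t is an arithmetic shift (true for GCC/Clang). */
static int fits_balc(int64_t delta)
{
  return (((delta >> 2) + 0x02000000) >> 26) == 0;
}

/* Hypothetical helper: pack the word offset into the low 26 bits. */
static uint32_t encode_balc(int64_t delta)
{
  return SKETCH_MIPSI_BALC | (uint32_t)(((uint64_t)delta >> 2) & 0x03ffffffu);
}
```

With this, byte deltas from -2^27 up to 2^27 - 4 pass the test; anything outside falls through to the JAL/JALR path.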
> diff --git a/src/lj_jit.h b/src/lj_jit.h
> index c06829ab..a8b6f9a7 100644
> --- a/src/lj_jit.h
> +++ b/src/lj_jit.h
> @@ -51,10 +51,18 @@
>  /* Names for the CPU-specific flags. Must match the order above. */
>  #define JIT_F_CPU_FIRST		JIT_F_MIPSXXR2
>  #if LJ_TARGET_MIPS32
> +#if LJ_TARGET_MIPSR6
> +#define JIT_F_CPUSTRING		"\010MIPS32R6"
> +#else
>  #define JIT_F_CPUSTRING		"\010MIPS32R2"
> +#endif
> +#else
> +#if LJ_TARGET_MIPSR6
> +#define JIT_F_CPUSTRING		"\010MIPS64R6"
>  #else
>  #define JIT_F_CPUSTRING		"\010MIPS64R2"
>  #endif
> +#endif
>  #else
>  #define JIT_F_CPU_FIRST		0
>  #define JIT_F_CPUSTRING		""
> diff --git a/src/lj_target_mips.h b/src/lj_target_mips.h
> index 740687b3..84db6012 100644
> --- a/src/lj_target_mips.h
> +++ b/src/lj_target_mips.h
> @@ -223,6 +223,8 @@ typedef enum MIPSIns {
>    MIPSI_ADDIU = 0x24000000,
>    MIPSI_SUB = 0x00000022,
>    MIPSI_SUBU = 0x00000023,
> +
> +#if !LJ_TARGET_MIPSR6
>    MIPSI_MUL = 0x70000002,
>    MIPSI_DIV = 0x0000001a,
>    MIPSI_DIVU = 0x0000001b,
> @@ -232,6 +234,15 @@ typedef enum MIPSIns {
>    MIPSI_MFHI = 0x00000010,
>    MIPSI_MFLO = 0x00000012,
>    MIPSI_MULT = 0x00000018,
> +#else
> +  MIPSI_MUL = 0x00000098,
> +  MIPSI_MUH = 0x000000d8,
> +  MIPSI_DIV = 0x0000009a,
> +  MIPSI_DIVU = 0x0000009b,
> +
> +  MIPSI_SELEQZ = 0x00000035,
> +  MIPSI_SELNEZ = 0x00000037,
> +#endif
>  
>    MIPSI_SLL = 0x00000000,
>    MIPSI_SRL = 0x00000002,
> @@ -253,8 +264,13 @@ typedef enum MIPSIns {
>    MIPSI_B = 0x10000000,
>    MIPSI_J = 0x08000000,
>    MIPSI_JAL = 0x0c000000,
> +#if !LJ_TARGET_MIPSR6
>    MIPSI_JALX = 0x74000000,
>    MIPSI_JR = 0x00000008,
> +#else
> +  MIPSI_JR = 0x00000009,
> +  MIPSI_BALC = 0xe8000000,
> +#endif
>    MIPSI_JALR = 0x0000f809,
>  
>    MIPSI_BEQ = 0x10000000,
> @@ -282,15 +298,23 @@ typedef enum MIPSIns {
>  
>    /* MIPS64 instructions. */
>    MIPSI_DADD = 0x0000002c,
> -  MIPSI_DADDI = 0x60000000,
>    MIPSI_DADDU = 0x0000002d,
>    MIPSI_DADDIU = 0x64000000,
>    MIPSI_DSUB = 0x0000002e,
>    MIPSI_DSUBU = 0x0000002f,
> +#if !LJ_TARGET_MIPSR6
>    MIPSI_DDIV = 0x0000001e,
>    MIPSI_DDIVU = 0x0000001f,
>    MIPSI_DMULT = 0x0000001c,
>    MIPSI_DMULTU = 0x0000001d,
> +#else
> +  MIPSI_DDIV = 0x0000009e,
> +  MIPSI_DMOD = 0x000000de,
> +  MIPSI_DDIVU = 0x0000009f,
> +  MIPSI_DMODU = 0x000000df,
> +  MIPSI_DMUL = 0x0000009c,
> +  MIPSI_DMUH = 0x000000dc,
> +#endif
>  
>    MIPSI_DSLL = 0x00000038,
>    MIPSI_DSRL = 0x0000003a,
> @@ -308,6 +332,11 @@ typedef enum MIPSIns {
>    MIPSI_ASUBU = LJ_32 ? MIPSI_SUBU : MIPSI_DSUBU,
>    MIPSI_AL = LJ_32 ? MIPSI_LW : MIPSI_LD,
>    MIPSI_AS = LJ_32 ? MIPSI_SW : MIPSI_SD,
> +#if LJ_TARGET_MIPSR6
> +  MIPSI_LSA = 0x00000005,
> +  MIPSI_DLSA = 0x00000015,
> +  MIPSI_ALSA = LJ_32 ? MIPSI_LSA : MIPSI_DLSA,
> +#endif
>  
>    /* Extract/insert instructions. */
>    MIPSI_DEXTM = 0x7c000001,
> @@ -317,18 +346,19 @@ typedef enum MIPSIns {
>    MIPSI_DINSU = 0x7c000006,
>    MIPSI_DINS = 0x7c000007,
>  
> -  MIPSI_RINT_D = 0x4620001a,
> -  MIPSI_RINT_S = 0x4600001a,
> -  MIPSI_RINT = 0x4400001a,
>    MIPSI_FLOOR_D = 0x4620000b,
> -  MIPSI_CEIL_D = 0x4620000a,
> -  MIPSI_ROUND_D = 0x46200008,
>  
>    /* FP instructions. */
>    MIPSI_MOV_S = 0x46000006,
>    MIPSI_MOV_D = 0x46200006,
> +#if !LJ_TARGET_MIPSR6
>    MIPSI_MOVT_D = 0x46210011,
>    MIPSI_MOVF_D = 0x46200011,
> +#else
> +  MIPSI_MIN_D = 0x4620001C,
> +  MIPSI_MAX_D = 0x4620001E,
> +  MIPSI_SEL_D = 0x46200010,
> +#endif
>  
>    MIPSI_ABS_D = 0x46200005,
>    MIPSI_NEG_D = 0x46200007,
> @@ -363,15 +393,23 @@ typedef enum MIPSIns {
>    MIPSI_DMTC1 = 0x44a00000,
>    MIPSI_DMFC1 = 0x44200000,
>  
> +#if !LJ_TARGET_MIPSR6
>    MIPSI_BC1F = 0x45000000,
>    MIPSI_BC1T = 0x45010000,
> -
>    MIPSI_C_EQ_D = 0x46200032,
>    MIPSI_C_OLT_S = 0x46000034,
>    MIPSI_C_OLT_D = 0x46200034,
>    MIPSI_C_ULT_D = 0x46200035,
>    MIPSI_C_OLE_D = 0x46200036,
>    MIPSI_C_ULE_D = 0x46200037,
> +#else
> +  MIPSI_BC1EQZ = 0x45200000,
> +  MIPSI_BC1NEZ = 0x45a00000,
> +  MIPSI_CMP_EQ_D = 0x46a00002,
> +  MIPSI_CMP_LT_S = 0x46800004,
> +  MIPSI_CMP_LT_D = 0x46a00004,
> +#endif
> +
>  } MIPSIns;
>  
>  #endif
> diff --git a/src/vm_mips64.dasc b/src/vm_mips64.dasc
> index 9839b5ac..44fba36c 100644
> --- a/src/vm_mips64.dasc
> +++ b/src/vm_mips64.dasc
> @@ -83,6 +83,10 @@
>  |
>  |.define FRET1,		f0
>  |.define FRET2,		f2
> +|
> +|.define FTMP0,		f20
> +|.define FTMP1,		f21
> +|.define FTMP2,		f22
>  |.endif
>  |
>  |// Stack layout while in interpreter. Must match with lj_frame.h.
> @@ -310,10 +314,10 @@
>  |.endmacro
>  |
>  |// Assumes DISPATCH is relative to GL.
> -#define DISPATCH_GL(field)      (GG_DISP2G + (int)offsetof(global_State, field))
> -#define DISPATCH_J(field)       (GG_DISP2J + (int)offsetof(jit_State, field))
> -#define GG_DISP2GOT             (GG_OFS(got) - GG_OFS(dispatch))
> -#define DISPATCH_GOT(name)      (GG_DISP2GOT + sizeof(void*)*LJ_GOT_##name)
> +#define DISPATCH_GL(field)	(GG_DISP2G + (int)offsetof(global_State, field))
> +#define DISPATCH_J(field)	(GG_DISP2J + (int)offsetof(jit_State, field))
> +#define GG_DISP2GOT		(GG_OFS(got) - GG_OFS(dispatch))
> +#define DISPATCH_GOT(name)	(GG_DISP2GOT + sizeof(void*)*LJ_GOT_##name)
>  |
>  #define PC2PROTO(field)  ((int)offsetof(GCproto, field)-(int)sizeof(GCproto))
>  |
> @@ -492,8 +496,15 @@ static void build_subroutines(BuildCtx *ctx)
>    |7:  // Less results wanted.
>    |  subu TMP0, RD, TMP2
>    |  dsubu TMP0, BASE, TMP0		// Either keep top or shrink it.
> +  |.if MIPSR6
> +  |  selnez TMP0, TMP0, TMP2		// LUA_MULTRET+1 case?
> +  |  seleqz BASE, BASE, TMP2
> +  |  b <3
> +  |.  or BASE, BASE, TMP0
> +  |.else
>    |  b <3
>    |.  movn BASE, TMP0, TMP2		// LUA_MULTRET+1 case?
> +  |.endif
>    |
>    |8:  // Corner case: need to grow stack for filling up results.
>    |  // This can happen if:
> @@ -1125,11 +1136,16 @@ static void build_subroutines(BuildCtx *ctx)
>    |.endmacro
>    |
>    |// Inlined GC threshold check. Caveat: uses TMP0 and TMP1 and has delay slot!
> +  |// MIPSR6: no delay slot, but a forbidden slot.
>    |.macro ffgccheck
>    |  ld TMP0, DISPATCH_GL(gc.total)(DISPATCH)
>    |  ld TMP1, DISPATCH_GL(gc.threshold)(DISPATCH)
>    |  dsubu AT, TMP0, TMP1
> +  |.if MIPSR6
> +  |  bgezalc AT, ->fff_gcstep
> +  |.else
>    |  bgezal AT, ->fff_gcstep
> +  |.endif
>    |.endmacro
>    |
>    |//-- Base library: checks -----------------------------------------------
> @@ -1157,7 +1173,13 @@ static void build_subroutines(BuildCtx *ctx)
>    |  sltu TMP1, TISNUM, TMP0
>    |  not TMP2, TMP0
>    |  li TMP3, ~LJ_TISNUM
> +  |.if MIPSR6
> +  |  selnez TMP2, TMP2, TMP1
> +  |  seleqz TMP3, TMP3, TMP1
> +  |  or TMP2, TMP2, TMP3
> +  |.else
>    |  movz TMP2, TMP3, TMP1
> +  |.endif
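This selnez/seleqz/or triple is the pattern used throughout the patch to replace the conditional moves that R6 removed. A minimal C model is sketched below, assuming the usual semantics (SELNEZ yields the source if the condition is non-zero and zero otherwise; SELEQZ is the opposite). All function names here are illustrative only.

```c
#include <stdint.h>

/* Model of R6 SELNEZ: rs if rt != 0, else 0. */
static uint64_t selnez(uint64_t rs, uint64_t rt) { return rt != 0 ? rs : 0; }

/* Model of R6 SELEQZ: rs if rt == 0, else 0. */
static uint64_t seleqz(uint64_t rs, uint64_t rt) { return rt == 0 ? rs : 0; }

/* Pre-R6 "movn rd, rs, rt" (rd = rs if rt != 0) rebuilt from selects. */
static uint64_t movn_r6(uint64_t rd, uint64_t rs, uint64_t rt)
{
  return selnez(rs, rt) | seleqz(rd, rt);
}

/* Pre-R6 "movz rd, rs, rt" (rd = rs if rt == 0) rebuilt from selects. */
static uint64_t movz_r6(uint64_t rd, uint64_t rs, uint64_t rt)
{
  return seleqz(rs, rt) | selnez(rd, rt);
}
```

When the moved value is r0, as in `movn TMP2, r0, AT`, one of the two selects is always zero, so a single `seleqz TMP2, TMP2, AT` is enough. The same holds for `movz` with `selnez`.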
>    |  dsll TMP2, TMP2, 3
>    |  daddu TMP2, CFUNC:RB, TMP2
>    |  b ->fff_restv
> @@ -1169,7 +1191,11 @@ static void build_subroutines(BuildCtx *ctx)
>    |  gettp TMP2, CARG1
>    |  daddiu TMP0, TMP2, -LJ_TTAB
>    |  daddiu TMP1, TMP2, -LJ_TUDATA
> +  |.if MIPSR6
> +  |  selnez TMP0, TMP1, TMP0
> +  |.else
>    |  movn TMP0, TMP1, TMP0
> +  |.endif
>    |  bnez TMP0, >6
>    |.  cleartp TAB:CARG1
>    |1:  // Field metatable must be at same offset for GCtab and GCudata!
> @@ -1208,7 +1234,13 @@ static void build_subroutines(BuildCtx *ctx)
>    |
>    |6:
>    |  sltiu AT, TMP2, LJ_TISNUM
> +  |.if MIPSR6
> +  |  selnez TMP0, TISNUM, AT
> +  |  seleqz AT, TMP2, AT
> +  |  or TMP2, TMP0, AT
> +  |.else
>    |  movn TMP2, TISNUM, AT
> +  |.endif
>    |  dsll TMP2, TMP2, 3
>    |   dsubu TMP0, DISPATCH, TMP2
>    |  b <2
> @@ -1270,8 +1302,13 @@ static void build_subroutines(BuildCtx *ctx)
>    |  or TMP0, TMP0, TMP1
>    |  bnez TMP0, ->fff_fallback
>    |.  sd BASE, L->base			// Add frame since C call can throw.
> +  |.if MIPSR6
> +  |  sd PC, SAVE_PC			// Redundant (but a defined value).
> +  |  ffgccheck
> +  |.else
>    |  ffgccheck
>    |.  sd PC, SAVE_PC			// Redundant (but a defined value).
> +  |.endif
>    |  load_got lj_strfmt_number
>    |  move CARG1, L
>    |  call_intern lj_strfmt_number	// (lua_State *L, cTValue *o)
> @@ -1441,8 +1478,15 @@ static void build_subroutines(BuildCtx *ctx)
>    |  addiu AT, TMP0, -LUA_YIELD
>    |    daddu CARG3, CARG2, TMP0
>    |   daddiu TMP3, CARG2, 8
> +  |.if MIPSR6
> +  |  seleqz CARG2, CARG2, AT
> +  |  selnez TMP3, TMP3, AT
> +  |  bgtz AT, ->fff_fallback		// st > LUA_YIELD?
> +  |.  or CARG2, TMP3, CARG2
> +  |.else
>    |  bgtz AT, ->fff_fallback		// st > LUA_YIELD?
>    |.  movn CARG2, TMP3, AT
> +  |.endif
>    |   xor TMP2, TMP2, CARG3
>    |  bnez TMP1, ->fff_fallback		// cframe != 0?
>    |.  or AT, TMP2, TMP0
> @@ -1754,7 +1798,7 @@ static void build_subroutines(BuildCtx *ctx)
>    |  b ->fff_res
>    |.  li RD, (2+1)*8
>    |
> -  |.macro math_minmax, name, intins, fpins
> +  |.macro math_minmax, name, intins, intinsc, fpins
>    |  .ffunc_1 name
>    |  daddu TMP3, BASE, NARGS8:RC
>    |  checkint CARG1, >5
> @@ -1766,7 +1810,13 @@ static void build_subroutines(BuildCtx *ctx)
>    |.  sextw CARG1, CARG1
>    |  lw CARG2, LO(TMP2)
>    |.  slt AT, CARG1, CARG2
> +  |.if MIPSR6
> +  |  intins TMP1, CARG2, AT
> +  |  intinsc CARG1, CARG1, AT
> +  |  or CARG1, CARG1, TMP1
> +  |.else
>    |  intins CARG1, CARG2, AT
> +  |.endif
>    |  daddiu TMP2, TMP2, 8
>    |  zextw CARG1, CARG1
>    |  b <1
> @@ -1802,13 +1852,23 @@ static void build_subroutines(BuildCtx *ctx)
>    |.  nop
>    |7:
>    |.if FPU
> +  |.if MIPSR6
> +  |  fpins FRET1, FRET1, FARG1
> +  |.else
>    |  c.olt.d FRET1, FARG1
>    |  fpins FRET1, FARG1
> +  |.endif
>    |.else
>    |  bal ->vm_sfcmpolt
>    |.  nop
> +  |.if MIPSR6
> +  |  intins AT, CARG2, CRET1
> +  |  intinsc CARG1, CARG1, CRET1
> +  |  or CARG1, CARG1, AT
> +  |.else
>    |  intins CARG1, CARG2, CRET1
>    |.endif
> +  |.endif
>    |  b <6
>    |.  daddiu TMP2, TMP2, 8
>    |
> @@ -1828,8 +1888,13 @@ static void build_subroutines(BuildCtx *ctx)
>    |
>    |.endmacro
>    |
> -  |  math_minmax math_min, movz, movf.d
> -  |  math_minmax math_max, movn, movt.d
> +  |.if MIPSR6
> +  |  math_minmax math_min, seleqz, selnez, min.d
> +  |  math_minmax math_max, selnez, seleqz, max.d
> +  |.else
> +  |  math_minmax math_min, movz, _, movf.d
> +  |  math_minmax math_max, movn, _, movt.d
> +  |.endif
>    |
>    |//-- String library -----------------------------------------------------
>    |
> @@ -1854,7 +1919,9 @@ static void build_subroutines(BuildCtx *ctx)
>    |
>    |.ffunc string_char			// Only handle the 1-arg case here.
>    |  ffgccheck
> +  |.if not MIPSR6
>    |.  nop
> +  |.endif
>    |  ld CARG1, 0(BASE)
>    |  gettp TMP0, CARG1
>    |  xori AT, NARGS8:RC, 8		// Exactly 1 argument.
> @@ -1884,7 +1951,9 @@ static void build_subroutines(BuildCtx *ctx)
>    |
>    |.ffunc string_sub
>    |  ffgccheck
> +  |.if not MIPSR6
>    |.  nop
> +  |.endif
>    |  addiu AT, NARGS8:RC, -16
>    |  ld TMP0, 0(BASE)
>    |  bltz AT, ->fff_fallback
> @@ -1907,8 +1976,30 @@ static void build_subroutines(BuildCtx *ctx)
>    |  addiu TMP0, CARG2, 1
>    |  addu TMP1, CARG4, TMP0
>    |   slt TMP3, CARG3, r0
> +  |.if MIPSR6
> +  |  seleqz CARG4, CARG4, AT
> +  |  selnez TMP1, TMP1, AT
> +  |  or CARG4, TMP1, CARG4		// if (end < 0) end += len+1
> +  |.else
>    |  movn CARG4, TMP1, AT		// if (end < 0) end += len+1
> +  |.endif
>    |   addu TMP1, CARG3, TMP0
> +  |.if MIPSR6
> +  |   selnez TMP1, TMP1, TMP3
> +  |   seleqz CARG3, CARG3, TMP3
> +  |   or CARG3, TMP1, CARG3		// if (start < 0) start += len+1
> +  |   li TMP2, 1
> +  |  slt AT, CARG4, r0
> +  |   slt TMP3, r0, CARG3
> +  |  seleqz CARG4, CARG4, AT		// if (end < 0) end = 0
> +  |   selnez CARG3, CARG3, TMP3
> +  |   seleqz TMP2, TMP2, TMP3
> +  |   or CARG3, TMP2, CARG3		// if (start < 1) start = 1
> +  |  slt AT, CARG2, CARG4
> +  |  seleqz CARG4, CARG4, AT
> +  |  selnez CARG2, CARG2, AT
> +  |  or CARG4, CARG2, CARG4		// if (end > len) end = len
> +  |.else
>    |   movn CARG3, TMP1, TMP3		// if (start < 0) start += len+1
>    |   li TMP2, 1
>    |  slt AT, CARG4, r0
> @@ -1917,6 +2008,7 @@ static void build_subroutines(BuildCtx *ctx)
>    |   movz CARG3, TMP2, TMP3		// if (start < 1) start = 1
>    |  slt AT, CARG2, CARG4
>    |  movn CARG4, CARG2, AT		// if (end > len) end = len
> +  |.endif
>    |   daddu CARG2, STR:CARG1, CARG3
>    |  subu CARG3, CARG4, CARG3		// len = end - start
>    |   daddiu CARG2, CARG2, sizeof(GCstr)-1
> @@ -1978,7 +2070,13 @@ static void build_subroutines(BuildCtx *ctx)
>    |  slt AT, CARG1, r0
>    |  dsrlv CRET1, TMP0, CARG3
>    |  dsubu TMP0, r0, CRET1
> +  |.if MIPSR6
> +  |  selnez TMP0, TMP0, AT
> +  |  seleqz CRET1, CRET1, AT
> +  |  or CRET1, CRET1, TMP0
> +  |.else
>    |  movn CRET1, TMP0, AT
> +  |.endif
>    |  jr ra
>    |.  zextw CRET1, CRET1
>    |1:
> @@ -2001,14 +2099,28 @@ static void build_subroutines(BuildCtx *ctx)
>    |  slt AT, CARG1, r0
>    |  dsrlv CRET1, CRET2, TMP0
>    |  dsubu CARG1, r0, CRET1
> +  |.if MIPSR6
> +  |  seleqz CRET1, CRET1, AT
> +  |  selnez CARG1, CARG1, AT
> +  |  or CRET1, CRET1, CARG1
> +  |.else
>    |  movn CRET1, CARG1, AT
> +  |.endif
>    |  li CARG1, 64
>    |  subu TMP0, CARG1, TMP0
>    |  dsllv CRET2, CRET2, TMP0	// Integer check.
>    |  sextw AT, CRET1
>    |  xor AT, CRET1, AT		// Range check.
>    |  jr ra
> +  |.if MIPSR6
> +  |  seleqz AT, AT, CRET2
> +  |  selnez CRET2, CRET2, CRET2
> +  |  jr ra
> +  |.  or CRET2, AT, CRET2
> +  |.else
> +  |  jr ra
>    |.  movz CRET2, AT, CRET2
> +  |.endif
>    |1:
>    |  jr ra
>    |.  li CRET2, 1
> @@ -2518,15 +2630,22 @@ static void build_subroutines(BuildCtx *ctx)
>    |
>    |// Hard-float round to integer.
>    |// Modifies AT, TMP0, FRET1, FRET2, f4. Keeps all others incl. FARG1.
> +  |// MIPSR6: Modifies FTMP1, too.
>    |.macro vm_round_hf, func
>    |  lui TMP0, 0x4330			// Hiword of 2^52 (double).
>    |  dsll TMP0, TMP0, 32
>    |  dmtc1 TMP0, f4
>    |  abs.d FRET2, FARG1			// |x|
>    |    dmfc1 AT, FARG1
> +  |.if MIPSR6
> +  |  cmp.lt.d FTMP1, FRET2, f4
> +  |   add.d FRET1, FRET2, f4		// (|x| + 2^52) - 2^52
> +  |  bc1eqz FTMP1, >1			// Truncate only if |x| < 2^52.
> +  |.else
>    |  c.olt.d 0, FRET2, f4
>    |   add.d FRET1, FRET2, f4		// (|x| + 2^52) - 2^52
>    |  bc1f 0, >1				// Truncate only if |x| < 2^52.
> +  |.endif
>    |.  sub.d FRET1, FRET1, f4
>    |    slt AT, AT, r0
>    |.if "func" == "ceil"
> @@ -2537,16 +2656,38 @@ static void build_subroutines(BuildCtx *ctx)
>    |.if "func" == "trunc"
>    |   dsll TMP0, TMP0, 32
>    |   dmtc1 TMP0, f4
> +  |.if MIPSR6
> +  |  cmp.lt.d FTMP1, FRET2, FRET1	// |x| < result?
> +  |   sub.d FRET2, FRET1, f4
> +  |  sel.d  FTMP1, FRET1, FRET2		// If yes, subtract +1.
> +  |  dmtc1 AT, FRET1
> +  |  neg.d FRET2, FTMP1
> +  |  jr ra
> +  |.  sel.d FRET1, FTMP1, FRET2		// Merge sign bit back in.
> +  |.else
>    |  c.olt.d 0, FRET2, FRET1		// |x| < result?
>    |   sub.d FRET2, FRET1, f4
>    |  movt.d FRET1, FRET2, 0		// If yes, subtract +1.
>    |  neg.d FRET2, FRET1
>    |  jr ra
>    |.  movn.d FRET1, FRET2, AT		// Merge sign bit back in.
> +  |.endif
>    |.else
>    |  neg.d FRET2, FRET1
>    |   dsll TMP0, TMP0, 32
>    |   dmtc1 TMP0, f4
> +  |.if MIPSR6
> +  |  dmtc1 AT, FTMP1
> +  |  sel.d FTMP1, FRET1, FRET2
> +  |.if "func" == "ceil"
> +  |  cmp.lt.d FRET1, FTMP1, FARG1	// x > result?
> +  |.else
> +  |  cmp.lt.d FRET1, FARG1, FTMP1	// x < result?
> +  |.endif
> +  |   sub.d FRET2, FTMP1, f4		// If yes, subtract +-1.
> +  |  jr ra
> +  |.  sel.d FRET1, FTMP1, FRET2
> +  |.else
>    |  movn.d FRET1, FRET2, AT		// Merge sign bit back in.
>    |.if "func" == "ceil"
>    |  c.olt.d 0, FRET1, FARG1		// x > result?
> @@ -2557,6 +2698,7 @@ static void build_subroutines(BuildCtx *ctx)
>    |  jr ra
>    |.  movt.d FRET1, FRET2, 0
>    |.endif
> +  |.endif
>    |1:
>    |  jr ra
>    |.  mov.d FRET1, FARG1
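As a reading aid for the macro above, here is a rough C sketch of the floor variant of the 2^52 trick. It assumes the default round-to-nearest-even mode, uses a hypothetical function name, and simplifies the sign handling (it tests `x < 0` instead of the raw sign bit, so the sign of a zero result may differ from the assembly).

```c
/* Hypothetical sketch of the floor case of vm_round_hf. */
static double floor_2p52(double x)
{
  const double two52 = 4503599627370496.0;  /* 2^52. */
  double ax = x < 0 ? -x : x;               /* |x| */
  volatile double t;                        /* Keep the compiler from folding. */
  double r;
  if (!(ax < two52)) return x;              /* Already integral, inf or NaN. */
  t = ax + two52;                           /* Fraction bits are rounded away. */
  r = t - two52;                            /* Nearest integer to |x|. */
  if (x < 0) r = -r;                        /* Merge sign back in. */
  if (x < r) r -= 1.0;                      /* Rounded up? Then step down. */
  return r;
}
```

The R6 variant in the patch keeps the same arithmetic; only the compare (cmp.lt.d into an FPR) and the conditional moves (sel.d) change.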
> @@ -2701,7 +2843,7 @@ static void build_subroutines(BuildCtx *ctx)
>    |.  li CRET1, 0
>    |.endif
>    |
> -  |.macro sfmin_max, name, intins
> +  |.macro sfmin_max, name, intins, intinsc
>    |->vm_sf .. name:
>    |.if JIT and not FPU
>    |  move TMP2, ra
> @@ -2710,13 +2852,25 @@ static void build_subroutines(BuildCtx *ctx)
>    |  move ra, TMP2
>    |  move TMP0, CRET1
>    |  move CRET1, CARG1
> +  |.if MIPSR6
> +  |  intins CRET1, CRET1, TMP0
> +  |  intinsc TMP0, CARG2, TMP0
> +  |  jr ra
> +  |.  or CRET1, CRET1, TMP0
> +  |.else
>    |  jr ra
>    |.  intins CRET1, CARG2, TMP0
>    |.endif
> +  |.endif
>    |.endmacro
>    |
> -  |  sfmin_max min, movz
> -  |  sfmin_max max, movn
> +  |.if MIPSR6
> +  |  sfmin_max min, selnez, seleqz
> +  |  sfmin_max max, seleqz, selnez
> +  |.else
> +  |  sfmin_max min, movz, _
> +  |  sfmin_max max, movn, _
> +  |.endif
>    |
>    |//-----------------------------------------------------------------------
>    |//-- Miscellaneous functions --------------------------------------------
> @@ -2885,7 +3039,11 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |    lui TMP3, (-(BCBIAS_J*4 >> 16) & 65535)
>      |  slt AT, CARG1, CARG2
>      |    addu TMP2, TMP2, TMP3
> +    |.if MIPSR6
> +    |  movop TMP2, TMP2, AT
> +    |.else
>      |  movop TMP2, r0, AT
> +    |.endif
>      |1:
>      |  daddu PC, PC, TMP2
>      |  ins_next
> @@ -2903,16 +3061,28 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |.endif
>      |3:  // RA and RD are both numbers.
>      |.if FPU
> -    |  fcomp f20, f22
> +    |.if MIPSR6
> +    |  fcomp FTMP0, FTMP0, FTMP2
> +    |   addu TMP2, TMP2, TMP3
> +    |  mfc1 TMP3, FTMP0
> +    |  b <1
> +    |.  fmovop TMP2, TMP2, TMP3
> +    |.else
> +    |  fcomp FTMP0, FTMP2
>      |   addu TMP2, TMP2, TMP3
>      |  b <1
>      |.  fmovop TMP2, r0
> +    |.endif
>      |.else
>      |  bal sfcomp
>      |.   addu TMP2, TMP2, TMP3
>      |  b <1
> +    |.if MIPSR6
> +    |.  movop TMP2, TMP2, CRET1
> +    |.else
>      |.  movop TMP2, r0, CRET1
>      |.endif
> +    |.endif
>      |
>      |4:  // RA is a number, RD is not a number.
>      |  bne CARG4, TISNUM, ->vmeta_comp
> @@ -2959,15 +3129,27 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |.endif
>      |.endmacro
>      |
> +    |.if MIPSR6
> +    if (op == BC_ISLT) {
> +      |  bc_comp FTMP0, FTMP2, CARG1, CARG2, selnez, selnez, cmp.lt.d, ->vm_sfcmpolt
> +    } else if (op == BC_ISGE) {
> +      |  bc_comp FTMP0, FTMP2, CARG1, CARG2, seleqz, seleqz, cmp.lt.d, ->vm_sfcmpolt
> +    } else if (op == BC_ISLE) {
> +      |  bc_comp FTMP2, FTMP0, CARG2, CARG1, seleqz, seleqz, cmp.ult.d, ->vm_sfcmpult
> +    } else {
> +      |  bc_comp FTMP2, FTMP0, CARG2, CARG1, selnez, selnez, cmp.ult.d, ->vm_sfcmpult
> +    }
> +    |.else
>      if (op == BC_ISLT) {
> -      |  bc_comp f20, f22, CARG1, CARG2, movz, movf, c.olt.d, ->vm_sfcmpolt
> +      |  bc_comp FTMP0, FTMP2, CARG1, CARG2, movz, movf, c.olt.d, ->vm_sfcmpolt
>      } else if (op == BC_ISGE) {
> -      |  bc_comp f20, f22, CARG1, CARG2, movn, movt, c.olt.d, ->vm_sfcmpolt
> +      |  bc_comp FTMP0, FTMP2, CARG1, CARG2, movn, movt, c.olt.d, ->vm_sfcmpolt
>      } else if (op == BC_ISLE) {
> -      |  bc_comp f22, f20, CARG2, CARG1, movn, movt, c.ult.d, ->vm_sfcmpult
> +      |  bc_comp FTMP2, FTMP0, CARG2, CARG1, movn, movt, c.ult.d, ->vm_sfcmpult
>      } else {
> -      |  bc_comp f22, f20, CARG2, CARG1, movz, movf, c.ult.d, ->vm_sfcmpult
> +      |  bc_comp FTMP2, FTMP0, CARG2, CARG1, movz, movf, c.ult.d, ->vm_sfcmpult
>      }
> +    |.endif
>      break;
>  
>    case BC_ISEQV: case BC_ISNEV:
> @@ -3013,7 +3195,11 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |2:  // Check if the tags are the same and it's a table or userdata.
>      |  xor AT, CARG3, CARG4			// Same type?
>      |  sltiu TMP0, CARG3, LJ_TISTABUD+1		// Table or userdata?
> +    |.if MIPSR6
> +    |  seleqz TMP0, TMP0, AT
> +    |.else
>      |  movn TMP0, r0, AT
> +    |.endif
>      if (vk) {
>        |  beqz TMP0, <1
>      } else {
> @@ -3063,11 +3249,19 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |   lui TMP3, (-(BCBIAS_J*4 >> 16) & 65535)
>      |  xor TMP1, CARG1, CARG2
>      |   addu TMP2, TMP2, TMP3
> +    |.if MIPSR6
> +    if (vk) {
> +      |  seleqz TMP2, TMP2, TMP1
> +    } else {
> +      |  selnez TMP2, TMP2, TMP1
> +    }
> +    |.else
>      if (vk) {
>        |  movn TMP2, r0, TMP1
>      } else {
>        |  movz TMP2, r0, TMP1
>      }
> +    |.endif
>      |  daddu PC, PC, TMP2
>      |  ins_next
>      break;
> @@ -3094,6 +3288,19 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |  bne CARG4, TISNUM, >6
>      |.   addu TMP2, TMP2, TMP3
>      |  xor AT, CARG1, CARG2
> +    |.if MIPSR6
> +    if (vk) {
> +      | seleqz TMP2, TMP2, AT
> +      |1:
> +      |  daddu PC, PC, TMP2
> +      |2:
> +    } else {
> +      |  selnez TMP2, TMP2, AT
> +      |1:
> +      |2:
> +      |  daddu PC, PC, TMP2
> +    }
> +    |.else
>      if (vk) {
>        | movn TMP2, r0, AT
>        |1:
> @@ -3105,6 +3312,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>        |2:
>        |  daddu PC, PC, TMP2
>      }
> +    |.endif
>      |  ins_next
>      |
>      |3:  // RA is not an integer.
> @@ -3117,30 +3325,49 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |.   addu TMP2, TMP2, TMP3
>      |  sltu AT, CARG4, TISNUM
>      |.if FPU
> -    |  ldc1 f20, 0(RA)
> -    |   ldc1 f22, 0(RD)
> +    |  ldc1 FTMP0, 0(RA)
> +    |   ldc1 FTMP2, 0(RD)
>      |.endif
>      |  beqz AT, >5
>      |.  nop
>      |4:  // RA and RD are both numbers.
>      |.if FPU
> -    |  c.eq.d f20, f22
> +    |.if MIPSR6
> +    |  cmp.eq.d FTMP0, FTMP0, FTMP2
> +    |  dmfc1 TMP1, FTMP0
> +    |  b <1
> +    if (vk) {
> +      |.  selnez TMP2, TMP2, TMP1
> +    } else {
> +      |.  seleqz TMP2, TMP2, TMP1
> +    }
> +    |.else
> +    |  c.eq.d FTMP0, FTMP2
>      |  b <1
>      if (vk) {
>        |.  movf TMP2, r0
>      } else {
>        |.  movt TMP2, r0
>      }
> +    |.endif
>      |.else
>      |  bal ->vm_sfcmpeq
>      |.  nop
>      |  b <1
> +    |.if MIPSR6
> +    if (vk) {
> +      |.  selnez TMP2, TMP2, CRET1
> +    } else {
> +      |.  seleqz TMP2, TMP2, CRET1
> +    }
> +    |.else
>      if (vk) {
>        |.  movz TMP2, r0, CRET1
>      } else {
>        |.  movn TMP2, r0, CRET1
>      }
>      |.endif
> +    |.endif
>      |
>      |5:  // RA is a number, RD is not a number.
>      |.if FFI
> @@ -3150,9 +3377,9 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |.endif
>      |  // RA is a number, RD is an integer. Convert RD to a number.
>      |.if FPU
> -    |.  lwc1 f22, LO(RD)
> +    |.  lwc1 FTMP2, LO(RD)
>      |  b <4
> -    |.  cvt.d.w f22, f22
> +    |.  cvt.d.w FTMP2, FTMP2
>      |.else
>      |.  sextw CARG2, CARG2
>      |  bal ->vm_sfi2d_2
> @@ -3170,10 +3397,10 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |.endif
>      |  // RA is an integer, RD is a number. Convert RA to a number.
>      |.if FPU
> -    |.  lwc1 f20, LO(RA)
> -    |   ldc1 f22, 0(RD)
> +    |.  lwc1 FTMP0, LO(RA)
> +    |   ldc1 FTMP2, 0(RD)
>      |  b <4
> -    |   cvt.d.w f20, f20
> +    |   cvt.d.w FTMP0, FTMP0
>      |.else
>      |.  sextw CARG1, CARG1
>      |  bal ->vm_sfi2d_1
> @@ -3216,11 +3443,19 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |  decode_RD4b TMP2
>      |  lui TMP3, (-(BCBIAS_J*4 >> 16) & 65535)
>      |  addu TMP2, TMP2, TMP3
> +    |.if MIPSR6
> +    if (vk) {
> +      |  seleqz TMP2, TMP2, TMP0
> +    } else {
> +      |  selnez TMP2, TMP2, TMP0
> +    }
> +    |.else
>      if (vk) {
>        |  movn TMP2, r0, TMP0
>      } else {
>        |  movz TMP2, r0, TMP0
>      }
> +    |.endif
>      |  daddu PC, PC, TMP2
>      |  ins_next
>      break;
> @@ -3239,11 +3474,19 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>        |   decode_RD4b TMP2
>        |   lui TMP3, (-(BCBIAS_J*4 >> 16) & 65535)
>        |   addu TMP2, TMP2, TMP3
> +      |.if MIPSR6
> +      if (op == BC_IST) {
> +	|  selnez TMP2, TMP2, TMP0;
> +      } else {
> +	|  seleqz TMP2, TMP2, TMP0;
> +      }
> +      |.else
>        if (op == BC_IST) {
>  	|  movz TMP2, r0, TMP0
>        } else {
>  	|  movn TMP2, r0, TMP0
>        }
> +      |.endif
>        |  daddu PC, PC, TMP2
>      } else {
>        |  ld CRET1, 0(RD)
> @@ -3486,9 +3729,15 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |  bltz TMP1, ->vmeta_arith
>      |.  daddu RA, BASE, RA
>      |.elif "intins" == "mult"
> +    |.if MIPSR6
> +    |.  nop
> +    |  mul CRET1, CARG3, CARG4
> +    |  muh TMP2, CARG3, CARG4
> +    |.else
>      |.  intins CARG3, CARG4
>      |  mflo CRET1
>      |  mfhi TMP2
> +    |.endif
>      |  sra TMP1, CRET1, 31
>      |  bne TMP1, TMP2, ->vmeta_arith
>      |.  daddu RA, BASE, RA
> @@ -3511,16 +3760,16 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |.endif
>      |
>      |5:  // Check for two numbers.
> -    |  .FPU ldc1 f20, 0(RB)
> +    |  .FPU ldc1 FTMP0, 0(RB)
>      |  sltu AT, TMP0, TISNUM
>      |   sltu TMP0, TMP1, TISNUM
> -    |  .FPU ldc1 f22, 0(RC)
> +    |  .FPU ldc1 FTMP2, 0(RC)
>      |   and AT, AT, TMP0
>      |   beqz AT, ->vmeta_arith
>      |.   daddu RA, BASE, RA
>      |
>      |.if FPU
> -    |  fpins FRET1, f20, f22
> +    |  fpins FRET1, FTMP0, FTMP2
>      |.elif "fpcall" == "sfpmod"
>      |  sfpmod
>      |.else
> @@ -3850,7 +4099,13 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>        |  li TMP0, 0x801
>        |  addiu AT, CARG2, -0x7ff
>        |   srl CARG3, RD, 14
> +      |.if MIPSR6
> +      |  seleqz TMP0, TMP0, AT
> +      |  selnez CARG2, CARG2, AT
> +      |  or CARG2, CARG2, TMP0
> +      |.else
>        |  movz CARG2, TMP0, AT
> +      |.endif
>        |  // (lua_State *L, int32_t asize, uint32_t hbits)
>        |  call_intern lj_tab_new
>        |.  move CARG1, L
> @@ -4131,7 +4386,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |  daddu NODE:TMP2, NODE:TMP2, TMP1	// node = tab->node + (idx*32-idx*8)
>      |   settp STR:RC, TMP3		// Tagged key to look for.
>      |.if FPU
> -    |   ldc1 f20, 0(RA)
> +    |   ldc1 FTMP0, 0(RA)
>      |.else
>      |   ld CRET1, 0(RA)
>      |.endif
> @@ -4147,7 +4402,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |  andi AT, TMP3, LJ_GC_BLACK	// isblack(table)
>      |  bnez AT, >7
>      |.if FPU
> -    |.  sdc1 f20, NODE:TMP2->val
> +    |.  sdc1 FTMP0, NODE:TMP2->val
>      |.else
>      |.  sd CRET1, NODE:TMP2->val
>      |.endif
> @@ -4188,7 +4443,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |  ld BASE, L->base
>      |.if FPU
>      |  b <3				// No 2nd write barrier needed.
> -    |.  sdc1 f20, 0(CRET1)
> +    |.  sdc1 FTMP0, 0(CRET1)
>      |.else
>      |  ld CARG1, 0(RA)
>      |  b <3				// No 2nd write barrier needed.
> @@ -4531,7 +4786,13 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |  ld CARG1, 0(RC)
>      |  sltu AT, RC, TMP3
>      |    daddiu RC, RC, 8
> +    |.if MIPSR6
> +    |  selnez CARG1, CARG1, AT
> +    |  seleqz AT, TISNIL, AT
> +    |  or CARG1, CARG1, AT
> +    |.else
>      |  movz CARG1, TISNIL, AT
> +    |.endif
>      |  sd CARG1, 0(RA)
>      |  sltu AT, RA, TMP2
>      |  bnez AT, <1
> @@ -4720,7 +4981,13 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>        |  dext AT, CRET1, 31, 0
>        |  slt CRET1, CARG2, CARG3
>        |  slt TMP1, CARG3, CARG2
> +      |.if MIPSR6
> +      |  selnez TMP1, TMP1, AT
> +      |  seleqz CRET1, CRET1, AT
> +      |  or CRET1, CRET1, TMP1
> +      |.else
>        |  movn CRET1, TMP1, AT
> +      |.endif
>      } else {
>        |  bne CARG3, TISNUM, >5
>        |.  ld CARG2, FORL_STEP*8(RA)	// STEP CARG2 - CARG4 type
> @@ -4736,20 +5003,34 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>        |  slt CRET1, CRET1, CARG1
>        |  slt AT, CARG2, r0
>        |   slt TMP0, TMP0, r0		// ((y^a) & (y^b)) < 0: overflow.
> +      |.if MIPSR6
> +      |  selnez TMP1, TMP1, AT
> +      |  seleqz CRET1, CRET1, AT
> +      |  or CRET1, CRET1, TMP1
> +      |.else
>        |  movn CRET1, TMP1, AT
> +      |.endif
>        |   or CRET1, CRET1, TMP0
>        |  zextw CARG1, CARG1
>        |  settp CARG1, TISNUM
>      }
>      |1:
>      if (op == BC_FORI) {
> +      |.if MIPSR6
> +      |  selnez TMP2, TMP2, CRET1
> +      |.else
>        |  movz TMP2, r0, CRET1
> +      |.endif
>        |  daddu PC, PC, TMP2
>      } else if (op == BC_JFORI) {
>        |  daddu PC, PC, TMP2
>        |  lhu RD, -4+OFS_RD(PC)
>      } else if (op == BC_IFORL) {
> +      |.if MIPSR6
> +      |  seleqz TMP2, TMP2, CRET1
> +      |.else
>        |  movn TMP2, r0, CRET1
> +      |.endif
>        |  daddu PC, PC, TMP2
>      }
>      if (vk) {
> @@ -4779,6 +5060,14 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>        |  and AT, AT, TMP0
>        |  beqz AT, ->vmeta_for
>        |.  slt TMP3, TMP3, r0
> +      |.if MIPSR6
> +      |   dmtc1 TMP3, FTMP2
> +      |  cmp.lt.d FTMP0, f0, f2
> +      |  cmp.lt.d FTMP1, f2, f0
> +      |  sel.d FTMP2, FTMP1, FTMP0
> +      |  b <1
> +      |.  dmfc1 CRET1, FTMP2
> +      |.else
>        |  c.ole.d 0, f0, f2
>        |  c.ole.d 1, f2, f0
>        |  li CRET1, 1
> @@ -4786,12 +5075,25 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>        |  movt AT, r0, 1
>        |  b <1
>        |.  movn CRET1, AT, TMP3
> +      |.endif
>      } else {
>        |  ldc1 f0, FORL_IDX*8(RA)
>        |   ldc1 f4, FORL_STEP*8(RA)
>        |    ldc1 f2, FORL_STOP*8(RA)
>        |   ld TMP3, FORL_STEP*8(RA)
>        |  add.d f0, f0, f4
> +      |.if MIPSR6
> +      |   slt TMP3, TMP3, r0
> +      |   dmtc1 TMP3, FTMP2
> +      |  cmp.lt.d FTMP0, f0, f2
> +      |  cmp.lt.d FTMP1, f2, f0
> +      |  sel.d FTMP2, FTMP1, FTMP0
> +      |  dmfc1 CRET1, FTMP2
> +      if (op == BC_IFORL) {
> +	|  seleqz TMP2, TMP2, CRET1
> +	|  daddu PC, PC, TMP2
> +      }
> +      |.else
>        |  c.ole.d 0, f0, f2
>        |  c.ole.d 1, f2, f0
>        |   slt TMP3, TMP3, r0
> @@ -4804,6 +5106,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>  	|  movn TMP2, r0, CRET1
>  	|  daddu PC, PC, TMP2
>        }
> +      |.endif
>        |  sdc1 f0, FORL_IDX*8(RA)
>        |  ins_next1
>        |  b <2
> @@ -4979,8 +5282,17 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |  ld TMP0, 0(RA)
>      |  sltu AT, RA, RC			// Less args than parameters?
>      |  move CARG1, TMP0
> +    |.if MIPSR6
> +    |  selnez TMP0, TMP0, AT
> +    |  seleqz TMP3, TISNIL, AT
> +    |  or TMP0, TMP0, TMP3
> +    |  seleqz TMP3, CARG1, AT
> +    |  selnez CARG1, TISNIL, AT
> +    |  or CARG1, CARG1, TMP3
> +    |.else
>      |  movz TMP0, TISNIL, AT		// Clear missing parameters.
>      |  movn CARG1, TISNIL, AT		// Clear old fixarg slot (help the GC).
> +    |.endif
>      |    addiu TMP2, TMP2, -1
>      |  sd TMP0, 16(TMP1)
>      |    daddiu TMP1, TMP1, 8
> -- 
> 2.41.0
> 
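Side note on the movz/movn replacements in the hunks above: R6 drops the
conditional moves, so the patch rebuilds each one from seleqz/selnez plus
an `or`, and collapses the `r0` source case into a single select. Below is
a minimal C sketch of that equivalence. It is only a model of the
instruction semantics as I understand them, not code from the patch; the
helper names (`movz_r6`, `movn_r6`, `check_equivalence`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pre-R6 conditional moves. The old rd value is passed in because it is
 * kept when the condition fails. */
static uint64_t movz(uint64_t rd, uint64_t rs, uint64_t rt) { return rt == 0 ? rs : rd; }
static uint64_t movn(uint64_t rd, uint64_t rs, uint64_t rt) { return rt != 0 ? rs : rd; }

/* R6 selects: the result is either rs or zero, never the old rd. */
static uint64_t seleqz(uint64_t rs, uint64_t rt) { return rt == 0 ? rs : 0; }
static uint64_t selnez(uint64_t rs, uint64_t rt) { return rt != 0 ? rs : 0; }

/* Hypothetical helpers: the three-instruction R6 sequences from the patch. */
static uint64_t movz_r6(uint64_t rd, uint64_t rs, uint64_t rt)
{
  return selnez(rd, rt) | seleqz(rs, rt);
}

static uint64_t movn_r6(uint64_t rd, uint64_t rs, uint64_t rt)
{
  return seleqz(rd, rt) | selnez(rs, rt);
}

/* Compare both forms over a small set of sample register values. */
static void check_equivalence(void)
{
  static const uint64_t v[] = { 0, 1, 0x801, UINT64_MAX };
  const size_t n = sizeof(v) / sizeof(v[0]);
  for (size_t d = 0; d < n; d++) {
    for (size_t t = 0; t < n; t++) {
      for (size_t s = 0; s < n; s++) {
        assert(movz_r6(v[d], v[s], v[t]) == movz(v[d], v[s], v[t]));
        assert(movn_r6(v[d], v[s], v[t]) == movn(v[d], v[s], v[t]));
      }
      /* With r0 as the source, a single select is enough. */
      assert(selnez(v[d], v[t]) == movz(v[d], 0, v[t]));
      assert(seleqz(v[d], v[t]) == movn(v[d], 0, v[t]));
    }
  }
}
```

Calling `check_equivalence()` from a test driver exercises every
combination of the sample values. Exactly one of the two selects is
non-zero for any condition value, which is why the `or` is safe.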

Thread overview: 97+ messages
2023-08-09 15:35 [Tarantool-patches] [PATCH luajit 00/19] Prerequisites for improve assertions Sergey Kaplun via Tarantool-patches
2023-08-09 15:35 ` [Tarantool-patches] [PATCH luajit 01/19] MIPS: Use precise search for exit jump patching Sergey Kaplun via Tarantool-patches
2023-08-15  9:36   ` Maxim Kokryashkin via Tarantool-patches
2023-08-16 12:40     ` Sergey Kaplun via Tarantool-patches
2023-08-16 13:25   ` Sergey Bronnikov via Tarantool-patches
2023-08-09 15:35 ` [Tarantool-patches] [PATCH luajit 02/19] test: introduce mcode generator for tests Sergey Kaplun via Tarantool-patches
2023-08-15 10:14   ` Maxim Kokryashkin via Tarantool-patches
2023-08-16 12:55     ` Sergey Kaplun via Tarantool-patches
2023-08-16 13:06       ` Maxim Kokryashkin via Tarantool-patches
2023-08-16 14:32   ` Sergey Bronnikov via Tarantool-patches
2023-08-16 15:20     ` Sergey Kaplun via Tarantool-patches
2023-08-16 16:08       ` Sergey Bronnikov via Tarantool-patches
2023-08-09 15:35 ` [Tarantool-patches] [PATCH luajit 03/19] MIPS: Fix handling of spare long-range jump slots Sergey Kaplun via Tarantool-patches
2023-08-15 11:13   ` Maxim Kokryashkin via Tarantool-patches
2023-08-16 13:05     ` Sergey Kaplun via Tarantool-patches
2023-08-16 15:02   ` Sergey Bronnikov via Tarantool-patches
2023-08-16 15:32     ` Sergey Kaplun via Tarantool-patches
2023-08-16 16:08       ` Sergey Bronnikov via Tarantool-patches
2023-08-09 15:35 ` [Tarantool-patches] [PATCH luajit 04/19] MIPS64: Add soft-float support to JIT compiler backend Sergey Kaplun via Tarantool-patches
2023-08-15 11:27   ` Maxim Kokryashkin via Tarantool-patches
2023-08-16 13:10     ` Sergey Kaplun via Tarantool-patches
2023-08-16 16:07   ` Sergey Bronnikov via Tarantool-patches
2023-08-09 15:35 ` [Tarantool-patches] [PATCH luajit 05/19] PPC: Add soft-float support to interpreter Sergey Kaplun via Tarantool-patches
2023-08-15 11:40   ` Maxim Kokryashkin via Tarantool-patches
2023-08-16 13:13     ` Sergey Kaplun via Tarantool-patches
2023-08-17 14:53   ` Sergey Bronnikov via Tarantool-patches
2023-08-09 15:35 ` [Tarantool-patches] [PATCH luajit 06/19] PPC: Add soft-float support to JIT compiler backend Sergey Kaplun via Tarantool-patches
2023-08-15 11:46   ` Maxim Kokryashkin via Tarantool-patches
2023-08-16 13:21     ` Sergey Kaplun via Tarantool-patches
2023-08-17 14:33   ` Sergey Bronnikov via Tarantool-patches
2023-08-09 15:35 ` [Tarantool-patches] [PATCH luajit 07/19] build: fix non-Linux/macOS builds Sergey Kaplun via Tarantool-patches
2023-08-15 11:58   ` Maxim Kokryashkin via Tarantool-patches
2023-08-16 13:40     ` Sergey Kaplun via Tarantool-patches
2023-08-17 14:31   ` Sergey Bronnikov via Tarantool-patches
2023-08-09 15:35 ` [Tarantool-patches] [PATCH luajit 08/19] Windows: Add UWP support, part 1 Sergey Kaplun via Tarantool-patches
2023-08-15 12:09   ` Maxim Kokryashkin via Tarantool-patches
2023-08-16 13:50     ` Sergey Kaplun via Tarantool-patches
2023-08-16 16:40   ` Sergey Bronnikov via Tarantool-patches
2023-08-09 15:35 ` [Tarantool-patches] [PATCH luajit 09/19] FFI: Eliminate hardcoded string hashes Sergey Kaplun via Tarantool-patches
2023-08-15 13:07   ` Maxim Kokryashkin via Tarantool-patches
2023-08-16 13:52     ` Sergey Kaplun via Tarantool-patches
2023-08-16 17:04     ` Sergey Bronnikov via Tarantool-patches
2023-08-09 15:35 ` [Tarantool-patches] [PATCH luajit 10/19] Cleanup math function compilation and fix inconsistencies Sergey Kaplun via Tarantool-patches
2023-08-11  8:06   ` Sergey Kaplun via Tarantool-patches
2023-08-15 13:10   ` Maxim Kokryashkin via Tarantool-patches
2023-08-16 17:15   ` Sergey Bronnikov via Tarantool-patches
2023-08-09 15:36 ` [Tarantool-patches] [PATCH luajit 11/19] Fix GCC 7 -Wimplicit-fallthrough warnings Sergey Kaplun via Tarantool-patches
2023-08-15 13:17   ` Maxim Kokryashkin via Tarantool-patches
2023-08-16 13:59     ` Sergey Kaplun via Tarantool-patches
2023-08-17  7:37   ` Sergey Bronnikov via Tarantool-patches
2023-08-09 15:36 ` [Tarantool-patches] [PATCH luajit 12/19] DynASM: Fix warning Sergey Kaplun via Tarantool-patches
2023-08-15 13:21   ` Maxim Kokryashkin via Tarantool-patches
2023-08-16 14:01     ` Sergey Kaplun via Tarantool-patches
2023-08-17  7:39   ` Sergey Bronnikov via Tarantool-patches
2023-08-17  7:51     ` Sergey Bronnikov via Tarantool-patches
2023-08-17  7:58       ` Sergey Kaplun via Tarantool-patches
2023-08-09 15:36 ` [Tarantool-patches] [PATCH luajit 13/19] ARM: Fix GCC 7 -Wimplicit-fallthrough warnings Sergey Kaplun via Tarantool-patches
2023-08-15 13:25   ` Maxim Kokryashkin via Tarantool-patches
2023-08-16 14:08     ` Sergey Kaplun via Tarantool-patches
2023-08-17  7:44   ` Sergey Bronnikov via Tarantool-patches
2023-08-17  8:01     ` Sergey Kaplun via Tarantool-patches
2023-08-09 15:36 ` [Tarantool-patches] [PATCH luajit 14/19] Fix debug.getinfo() argument check Sergey Kaplun via Tarantool-patches
2023-08-15 13:35   ` Maxim Kokryashkin via Tarantool-patches
2023-08-16 14:20     ` Sergey Kaplun via Tarantool-patches
2023-08-16 20:13       ` Maxim Kokryashkin via Tarantool-patches
2023-08-17  8:29   ` Sergey Bronnikov via Tarantool-patches
2023-08-09 15:36 ` [Tarantool-patches] [PATCH luajit 15/19] Fix LJ_MAX_JSLOTS assertion in rec_check_slots() Sergey Kaplun via Tarantool-patches
2023-08-15 14:07   ` Maxim Kokryashkin via Tarantool-patches
2023-08-16 14:22     ` Sergey Kaplun via Tarantool-patches
2023-08-17  8:57   ` Sergey Bronnikov via Tarantool-patches
2023-08-17  8:57     ` Sergey Kaplun via Tarantool-patches
2023-08-09 15:36 ` [Tarantool-patches] [PATCH luajit 16/19] Prevent integer overflow while parsing long strings Sergey Kaplun via Tarantool-patches
2023-08-15 14:38   ` Maxim Kokryashkin via Tarantool-patches
2023-08-16 14:52     ` Sergey Kaplun via Tarantool-patches
2023-08-17 10:53   ` Sergey Bronnikov via Tarantool-patches
2023-08-17 13:57     ` Sergey Kaplun via Tarantool-patches
2023-08-17 14:28       ` Sergey Bronnikov via Tarantool-patches
2023-08-09 15:36 ` [Tarantool-patches] [PATCH luajit 17/19] MIPS64: Fix register allocation in assembly of HREF Sergey Kaplun via Tarantool-patches
2023-08-16  9:01   ` Maxim Kokryashkin via Tarantool-patches
2023-08-16 15:17     ` Sergey Kaplun via Tarantool-patches
2023-08-16 20:14       ` Maxim Kokryashkin via Tarantool-patches
2023-08-17 11:06   ` Sergey Bronnikov via Tarantool-patches
2023-08-17 13:50     ` Sergey Kaplun via Tarantool-patches
2023-08-17 14:30       ` Sergey Bronnikov via Tarantool-patches
2023-08-09 15:36 ` [Tarantool-patches] [PATCH luajit 18/19] DynASM/MIPS: Fix shadowed variable Sergey Kaplun via Tarantool-patches
2023-08-16  9:03   ` Maxim Kokryashkin via Tarantool-patches
2023-08-16 15:22     ` Sergey Kaplun via Tarantool-patches
2023-08-17 12:01   ` Sergey Bronnikov via Tarantool-patches
2023-08-09 15:36 ` [Tarantool-patches] [PATCH luajit 19/19] MIPS: Add MIPS64 R6 port Sergey Kaplun via Tarantool-patches
2023-08-16  9:16   ` Maxim Kokryashkin via Tarantool-patches [this message]
2023-08-16 15:24     ` Sergey Kaplun via Tarantool-patches
2023-08-17 13:03   ` Sergey Bronnikov via Tarantool-patches
2023-08-17 13:59     ` Sergey Kaplun via Tarantool-patches
2023-08-16 15:35 ` [Tarantool-patches] [PATCH luajit 00/19] Prerequisites for improve assertions Sergey Kaplun via Tarantool-patches
2023-08-17 14:06   ` Maxim Kokryashkin via Tarantool-patches
2023-08-17 14:38 ` Sergey Bronnikov via Tarantool-patches
2023-08-31 15:17 ` Igor Munkin via Tarantool-patches
