Tarantool development patches archive
* [Tarantool-patches] [PATCH luajit 0/5] Fix pow inconsistencies and improve asserts
@ 2023-08-15  9:36 Sergey Kaplun via Tarantool-patches
  2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 1/5] test: introduce `samevalues()` TAP checker Sergey Kaplun via Tarantool-patches
                   ` (6 more replies)
  0 siblings, 7 replies; 34+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-08-15  9:36 UTC (permalink / raw)
  To: Maxim Kokryashkin, Sergey Bronnikov; +Cc: tarantool-patches

This patchset fixes `^` operator inconsistencies. Since the last two
commits are based on the patch "Improve assertions." (*), it is
backported as well (it's about time).

(*) Be aware that assertions in <src/luajit.c> aren't replaced in the
upstream either.
The following functions/modules contain our own code with assertions,
so we can discuss better naming for them:
* `lj_fullhash()` in <src/lj_str.c>
* `lua_hashstring()` in <src/lj_api.c>
* <src/lib_misc.c>
* <src/lj_mapi.c>
* <src/lj_memprof.c>
* <src/lj_sysprof.c>
* <src/lj_utils_leb128.c>
* <src/lj_wbuf.c>

P.S. Unfortunately, I can't find any reproducer for dropping the
2 ^ i => ldexp(1.0, i) optimization. Please guide me if you can find
one.

Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-9-pow-inconsistencies
PR: https://github.com/tarantool/tarantool/pull/8985
Related issues:
* https://github.com/tarantool/tarantool/issues/8825
* https://github.com/LuaJIT/LuaJIT/issues/9
* https://github.com/LuaJIT/LuaJIT/issues/684
* https://github.com/LuaJIT/LuaJIT/issues/817


Mike Pall (4):
  Remove pow() splitting and cleanup backends.
  Improve assertions.
  Fix pow() optimization inconsistencies.
  Revert to trivial pow() optimizations to prevent inaccuracies.

Sergey Kaplun (1):
  test: introduce `samevalues()` TAP checker

 src/CMakeLists.txt                            |   1 +
 src/Makefile.dep.original                     |  13 +-
 src/Makefile.original                         |   4 +-
 src/lib_io.c                                  |   6 +-
 src/lib_jit.c                                 |   4 +-
 src/lib_misc.c                                |  12 +-
 src/lib_string.c                              |   6 +-
 src/lj_api.c                                  | 140 ++++++-----
 src/lj_arch.h                                 |   3 -
 src/lj_asm.c                                  | 230 +++++++++++-------
 src/lj_asm_arm.h                              | 129 +++++-----
 src/lj_asm_arm64.h                            | 134 +++++-----
 src/lj_asm_mips.h                             | 189 +++++++-------
 src/lj_asm_ppc.h                              | 122 ++++++----
 src/lj_asm_x86.h                              | 211 +++++++---------
 src/lj_assert.c                               |  28 +++
 src/lj_bcread.c                               |  20 +-
 src/lj_bcwrite.c                              |  24 +-
 src/lj_buf.c                                  |   4 +-
 src/lj_carith.c                               |  10 +-
 src/lj_ccall.c                                |  19 +-
 src/lj_ccallback.c                            |  42 ++--
 src/lj_cconv.c                                |  57 +++--
 src/lj_cconv.h                                |   5 +-
 src/lj_cdata.c                                |  27 +-
 src/lj_cdata.h                                |   7 +-
 src/lj_clib.c                                 |   6 +-
 src/lj_cparse.c                               |  25 +-
 src/lj_crecord.c                              |  19 +-
 src/lj_ctype.c                                |  13 +-
 src/lj_ctype.h                                |  14 +-
 src/lj_debug.c                                |  18 +-
 src/lj_def.h                                  |  26 +-
 src/lj_dispatch.c                             |  11 +-
 src/lj_emit_arm.h                             |  50 ++--
 src/lj_emit_arm64.h                           |  21 +-
 src/lj_emit_mips.h                            |  22 +-
 src/lj_emit_ppc.h                             |  12 +-
 src/lj_emit_x86.h                             |  22 +-
 src/lj_err.c                                  |  40 +--
 src/lj_ffrecord.c                             |   4 +-
 src/lj_func.c                                 |  18 +-
 src/lj_gc.c                                   |  78 +++---
 src/lj_gc.h                                   |   6 +-
 src/lj_gdbjit.c                               |   5 +-
 src/lj_ir.c                                   |  31 +--
 src/lj_ir.h                                   |   7 +-
 src/lj_ircall.h                               |   2 -
 src/lj_iropt.h                                |   1 -
 src/lj_jit.h                                  |   6 +
 src/lj_lex.c                                  |  14 +-
 src/lj_lex.h                                  |   6 +
 src/lj_load.c                                 |   2 +-
 src/lj_mapi.c                                 |   2 +-
 src/lj_mcode.c                                |   2 +-
 src/lj_memprof.c                              |  35 +--
 src/lj_meta.c                                 |   6 +-
 src/lj_obj.h                                  |  35 ++-
 src/lj_opt_fold.c                             | 144 +++++------
 src/lj_opt_loop.c                             |   5 +-
 src/lj_opt_mem.c                              |  15 +-
 src/lj_opt_narrow.c                           |  55 +----
 src/lj_opt_split.c                            |  45 +---
 src/lj_parse.c                                | 114 +++++----
 src/lj_record.c                               | 164 ++++++++-----
 src/lj_snap.c                                 | 100 +++++---
 src/lj_snap.h                                 |   3 +-
 src/lj_state.c                                |  18 +-
 src/lj_str.c                                  |   7 +-
 src/lj_strfmt.c                               |   4 +-
 src/lj_strfmt.h                               |   3 +-
 src/lj_strfmt_num.c                           |   6 +-
 src/lj_strscan.c                              |   9 +-
 src/lj_symtab.c                               |  11 +-
 src/lj_sysprof.c                              |  31 +--
 src/lj_tab.c                                  |  20 +-
 src/lj_target.h                               |   3 +-
 src/lj_trace.c                                |  57 +++--
 src/lj_utils_leb128.c                         |   5 +-
 src/lj_vm.h                                   |   9 -
 src/lj_vmmath.c                               |  51 +---
 src/lj_wbuf.c                                 |   3 +-
 src/ljamalg.c                                 |   1 +
 src/luaconf.h                                 |   2 +-
 src/vm_x64.dasc                               |  35 ---
 src/vm_x86.dasc                               |  35 ---
 test/tarantool-tests/gh-6163-min-max.test.lua |  52 ++--
 .../lj-684-pow-inconsistencies.test.lua       | 106 ++++++++
 .../lj-9-pow-inconsistencies.test.lua         |  65 +++++
 test/tarantool-tests/tap.lua                  |  14 ++
 90 files changed, 1729 insertions(+), 1469 deletions(-)
 create mode 100644 src/lj_assert.c
 create mode 100644 test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
 create mode 100644 test/tarantool-tests/lj-9-pow-inconsistencies.test.lua

-- 
2.41.0


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Tarantool-patches] [PATCH luajit 1/5] test: introduce `samevalues()` TAP checker
  2023-08-15  9:36 [Tarantool-patches] [PATCH luajit 0/5] Fix pow inconsistencies and improve asserts Sergey Kaplun via Tarantool-patches
@ 2023-08-15  9:36 ` Sergey Kaplun via Tarantool-patches
  2023-08-17 14:03   ` Maxim Kokryashkin via Tarantool-patches
  2023-08-18 10:43   ` Sergey Bronnikov via Tarantool-patches
  2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 2/5] Remove pow() splitting and cleanup backends Sergey Kaplun via Tarantool-patches
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 34+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-08-15  9:36 UTC (permalink / raw)
  To: Maxim Kokryashkin, Sergey Bronnikov; +Cc: tarantool-patches

The introduced `samevalues()` helper checks that the values in the range
from 1 to `table.maxn()` of the given table are exactly the same. It may
be useful for testing the consistency of JIT and VM behaviour.
Originally, the `arr_is_consistent()` function was introduced in
<tarantool-tests/gh-6163-min-max.test.lua>. `samevalues()` has the same
functionality (except that it uses `table.maxn()` instead of the `#`
operator to be sure that the table being checked isn't a sparse array).
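
Below is a minimal usage sketch (the test name is hypothetical; the
checker itself is registered in <test/tarantool-tests/tap.lua> by this
patch):

  local tap = require('tap')
  local test = tap.test('samevalues-example')
  test:plan(1)

  local result = {}
  for i = 1, 4 do result[i] = 2 ^ 0.5 end
  -- Passes if result[1] through result[table.maxn(result)] are all
  -- equal (two NaN values are treated as equal by the checker).
  test:samevalues(result, 'consistent results')

  test:done(true)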
---
 test/tarantool-tests/gh-6163-min-max.test.lua | 52 ++++++++-----------
 test/tarantool-tests/tap.lua                  | 14 +++++
 2 files changed, 37 insertions(+), 29 deletions(-)

diff --git a/test/tarantool-tests/gh-6163-min-max.test.lua b/test/tarantool-tests/gh-6163-min-max.test.lua
index 63437955..4bc6155c 100644
--- a/test/tarantool-tests/gh-6163-min-max.test.lua
+++ b/test/tarantool-tests/gh-6163-min-max.test.lua
@@ -2,25 +2,17 @@ local tap = require('tap')
 local test = tap.test('gh-6163-jit-min-max'):skipcond({
   ['Test requires JIT enabled'] = not jit.status(),
 })
+
 local x86_64 = jit.arch == 'x86' or jit.arch == 'x64'
+-- XXX: table to use for dummy check for some inconsistent results
+-- on the x86/64 architecture.
+local DUMMY_TAB = {}
+
 test:plan(18)
 --
 -- gh-6163: math.min/math.max inconsistencies.
 --
 
-local function isnan(x)
-    return x ~= x
-end
-
-local function array_is_consistent(res)
-  for i = 1, #res - 1 do
-    if res[i] ~= res[i + 1] and not (isnan(res[i]) and isnan(res[i + 1])) then
-      return false
-    end
-  end
-  return true
-end
-
 -- This function creates dirty values on the Lua stack.
 -- The latter of them is going to be treated as an
 -- argument by the `math.min/math.max`.
@@ -91,14 +83,14 @@ for k = 1, 4 do
     result[k] = min(min(x, nan), x)
 end
 -- expected: 1 1 1 1
-test:ok(array_is_consistent(result), 'math.min: reassoc_dup')
+test:samevalues(result, 'math.min: reassoc_dup')
 
 result = {}
 for k = 1, 4 do
     result[k] = max(max(x, nan), x)
 end
 -- expected: 1 1 1 1
-test:ok(array_is_consistent(result), 'math.max: reassoc_dup')
+test:samevalues(result, 'math.max: reassoc_dup')
 
 -- If one gets the expression like `math.min(x, math.min(x, nan))`,
 -- and the `comm_dup` optimization is applied, it results in the
@@ -120,7 +112,7 @@ for k = 1, 4 do
 end
 -- FIXME: results are still inconsistent for the x86/64 architecture.
 -- expected: nan nan nan nan
-test:ok(array_is_consistent(result) or x86_64, 'math.min: comm_dup_minmax')
+test:samevalues(x86_64 and DUMMY_TAB or result, 'math.min: comm_dup_minmax')
 
 result = {}
 for k = 1, 4 do
@@ -128,7 +120,7 @@ for k = 1, 4 do
 end
 -- FIXME: results are still inconsistent for the x86/64 architecture.
 -- expected: nan nan nan nan
-test:ok(array_is_consistent(result) or x86_64, 'math.max: comm_dup_minmax')
+test:samevalues(x86_64 and DUMMY_TAB or result, 'math.max: comm_dup_minmax')
 
 -- The following optimization should be disabled:
 -- (x o k1) o k2 ==> x o (k1 o k2)
@@ -139,49 +131,49 @@ for k = 1, 4 do
     result[k] = min(min(x, 0/0), 1.3)
 end
 -- expected: 1.3 1.3 1.3 1.3
-test:ok(array_is_consistent(result), 'math.min: reassoc_minmax_k')
+test:samevalues(result, 'math.min: reassoc_minmax_k')
 
 result = {}
 for k = 1, 4 do
     result[k] = max(max(x, 0/0), 1.1)
 end
 -- expected: 1.1 1.1 1.1 1.1
-test:ok(array_is_consistent(result), 'math.max: reassoc_minmax_k')
+test:samevalues(result, 'math.max: reassoc_minmax_k')
 
 result = {}
 for k = 1, 4 do
   result[k] = min(max(nan, 1), 1)
 end
 -- expected: 1 1 1 1
-test:ok(array_is_consistent(result), 'min-max-case1: reassoc_minmax_left')
+test:samevalues(result, 'min-max-case1: reassoc_minmax_left')
 
 result = {}
 for k = 1, 4 do
   result[k] = min(max(1, nan), 1)
 end
 -- expected: 1 1 1 1
-test:ok(array_is_consistent(result), 'min-max-case2: reassoc_minmax_left')
+test:samevalues(result, 'min-max-case2: reassoc_minmax_left')
 
 result = {}
 for k = 1, 4 do
   result[k] = max(min(nan, 1), 1)
 end
 -- expected: 1 1 1 1
-test:ok(array_is_consistent(result), 'max-min-case1: reassoc_minmax_left')
+test:samevalues(result, 'max-min-case1: reassoc_minmax_left')
 
 result = {}
 for k = 1, 4 do
   result[k] = max(min(1, nan), 1)
 end
 -- expected: 1 1 1 1
-test:ok(array_is_consistent(result), 'max-min-case2: reassoc_minmax_left')
+test:samevalues(result, 'max-min-case2: reassoc_minmax_left')
 
 result = {}
 for k = 1, 4 do
   result[k] = min(1, max(nan, 1))
 end
 -- expected: 1 1 1 1
-test:ok(array_is_consistent(result), 'min-max-case1: reassoc_minmax_right')
+test:samevalues(result, 'min-max-case1: reassoc_minmax_right')
 
 result = {}
 for k = 1, 4 do
@@ -189,14 +181,15 @@ for k = 1, 4 do
 end
 -- FIXME: results are still inconsistent for the x86/64 architecture.
 -- expected: nan nan nan nan
-test:ok(array_is_consistent(result) or x86_64, 'min-max-case2: reassoc_minmax_right')
+test:samevalues(x86_64 and DUMMY_TAB or result,
+                'min-max-case2: reassoc_minmax_right')
 
 result = {}
 for k = 1, 4 do
   result[k] = max(1, min(nan, 1))
 end
 -- expected: 1 1 1 1
-test:ok(array_is_consistent(result), 'max-min-case1: reassoc_minmax_right')
+test:samevalues(result, 'max-min-case1: reassoc_minmax_right')
 
 result = {}
 for k = 1, 4 do
@@ -204,7 +197,8 @@ for k = 1, 4 do
 end
 -- FIXME: results are still inconsistent for the x86/64 architecture.
 -- expected: nan nan nan nan
-test:ok(array_is_consistent(result) or x86_64, 'max-min-case2: reassoc_minmax_right')
+test:samevalues(x86_64 and DUMMY_TAB or result,
+                'max-min-case2: reassoc_minmax_right')
 
 -- XXX: If we look into the disassembled code of `lj_vm_foldarith()`
 -- we can see the following:
@@ -253,13 +247,13 @@ for k = 1, 4 do
   result[k] = min(min(7.1, 0/0), 1.1)
 end
 -- expected: 1.1 1.1 1.1 1.1
-test:ok(array_is_consistent(result), 'min: fold_kfold_numarith')
+test:samevalues(result, 'min: fold_kfold_numarith')
 
 result = {}
 for k = 1, 4 do
   result[k] = max(max(7.1, 0/0), 1.1)
 end
 -- expected: 1.1 1.1 1.1 1.1
-test:ok(array_is_consistent(result), 'max: fold_kfold_numarith')
+test:samevalues(result, 'max: fold_kfold_numarith')
 
 test:done(true)
diff --git a/test/tarantool-tests/tap.lua b/test/tarantool-tests/tap.lua
index 8559ee52..af1d4b20 100644
--- a/test/tarantool-tests/tap.lua
+++ b/test/tarantool-tests/tap.lua
@@ -254,6 +254,19 @@ local function iscdata(test, v, ctype, message, extra)
   return ok(test, ffi.istype(ctype, v), message, extra)
 end
 
+local function isnan(v)
+  return v ~= v
+end
+
+local function samevalues(test, got, message, extra)
+  for i = 1, table.maxn(got) - 1 do
+    if got[i] ~= got[i + 1] and not (isnan(got[i]) and isnan(got[i + 1])) then
+      return fail(test, message, extra)
+    end
+  end
+  return ok(test, true, message, extra)
+end
+
 local test_mt
 
 local function new(parent, name, fun, ...)
@@ -372,6 +385,7 @@ test_mt = {
     isudata    = isudata,
     iscdata    = iscdata,
     is_deeply  = is_deeply,
+    samevalues = samevalues,
     like       = like,
     unlike     = unlike,
   }
-- 
2.41.0


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Tarantool-patches] [PATCH luajit 2/5] Remove pow() splitting and cleanup backends.
  2023-08-15  9:36 [Tarantool-patches] [PATCH luajit 0/5] Fix pow inconsistencies and improve asserts Sergey Kaplun via Tarantool-patches
  2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 1/5] test: introduce `samevalues()` TAP checker Sergey Kaplun via Tarantool-patches
@ 2023-08-15  9:36 ` Sergey Kaplun via Tarantool-patches
  2023-08-17 14:52   ` Maxim Kokryashkin via Tarantool-patches
  2023-08-18 11:08   ` Sergey Bronnikov via Tarantool-patches
  2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 3/5] Improve assertions Sergey Kaplun via Tarantool-patches
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 34+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-08-15  9:36 UTC (permalink / raw)
  To: Maxim Kokryashkin, Sergey Bronnikov; +Cc: tarantool-patches

From: Mike Pall <mike>

(cherry-picked from commit b2307c8ad817e350d65cc909a579ca2f77439682)

The JIT engine tries to split b^c into exp2(c * log2(b)) and attempts
to rejoin the parts later for some backends. This adds a dependency on
the C99 exp2() and log2() functions, which aren't part of some libm
implementations. Also, in some cases under IEEE 754 we can see that
exp2(log2(x)) != x, due to the limited accuracy of the mathematical
functions and double precision restrictions. So, the values in the JIT
slots and on the Lua stack become inconsistent.
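
For illustration, here is a rough sketch of the misbehaviour (modelled
on the test added by this patch; the constant is taken from it):

  jit.opt.start('hotloop=1')
  local res = {}
  for i = 1, 3 do
    -- XXX: a local variable prevents folding via the parser.
    local b = -0.90000000001
    res[i] = 1000 ^ b
  end
  -- Before this patch, once the loop gets compiled, the trace may
  -- compute exp2(b * log2(1000)), whose result can differ in the last
  -- bits from the interpreter's pow(1000, b), so the values in res
  -- are not all the same.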

This patch removes the splitting of the pow operator, so IR_POW is
emitted for all cases (except the power of 0.5, which is replaced with
the sqrt operation).

This patch also does some refactoring:

* The functions `asm_pow()`, `asm_mod()`, `asm_ldexp()`, and `asm_div()`
  (the latter replaced with `asm_fpdiv()` in the per-architecture
  backends) are moved to <src/lj_asm.c>, since their implementation is
  generic for all architectures.
* Fusing of IR_HREF + IR_EQ/IR_NE is moved to `asm_fuseequal()`.
* The `lj_vm_exp2()` subroutine and `IRFPM_EXP2` are removed, since
  they are no longer used.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#8825
---
 src/lj_arch.h                                 |   3 -
 src/lj_asm.c                                  | 106 +++++++++++-------
 src/lj_asm_arm.h                              |  10 +-
 src/lj_asm_arm64.h                            |  39 +------
 src/lj_asm_mips.h                             |  38 +------
 src/lj_asm_ppc.h                              |   9 +-
 src/lj_asm_x86.h                              |  37 +-----
 src/lj_ir.h                                   |   2 +-
 src/lj_ircall.h                               |   1 -
 src/lj_opt_fold.c                             |  18 ++-
 src/lj_opt_narrow.c                           |  20 +---
 src/lj_opt_split.c                            |  21 ----
 src/lj_vm.h                                   |   5 -
 src/lj_vmmath.c                               |   8 --
 .../lj-9-pow-inconsistencies.test.lua         |  63 +++++++++++
 15 files changed, 158 insertions(+), 222 deletions(-)
 create mode 100644 test/tarantool-tests/lj-9-pow-inconsistencies.test.lua

diff --git a/src/lj_arch.h b/src/lj_arch.h
index cf31a291..3bdbe84e 100644
--- a/src/lj_arch.h
+++ b/src/lj_arch.h
@@ -607,9 +607,6 @@
 #if defined(__ANDROID__) || defined(__symbian__) || LJ_TARGET_XBOX360 || LJ_TARGET_WINDOWS
 #define LUAJIT_NO_LOG2
 #endif
-#if defined(__symbian__) || LJ_TARGET_WINDOWS
-#define LUAJIT_NO_EXP2
-#endif
 #if LJ_TARGET_CONSOLE || (LJ_TARGET_IOS && __IPHONE_OS_VERSION_MIN_REQUIRED >= __IPHONE_8_0)
 #define LJ_NO_SYSTEM		1
 #endif
diff --git a/src/lj_asm.c b/src/lj_asm.c
index b352fd35..a6906b19 100644
--- a/src/lj_asm.c
+++ b/src/lj_asm.c
@@ -1356,32 +1356,6 @@ static void asm_call(ASMState *as, IRIns *ir)
   asm_gencall(as, ci, args);
 }
 
-#if !LJ_SOFTFP32
-static void asm_fppow(ASMState *as, IRIns *ir, IRRef lref, IRRef rref)
-{
-  const CCallInfo *ci = &lj_ir_callinfo[IRCALL_pow];
-  IRRef args[2];
-  args[0] = lref;
-  args[1] = rref;
-  asm_setupresult(as, ir, ci);
-  asm_gencall(as, ci, args);
-}
-
-static int asm_fpjoin_pow(ASMState *as, IRIns *ir)
-{
-  IRIns *irp = IR(ir->op1);
-  if (irp == ir-1 && irp->o == IR_MUL && !ra_used(irp)) {
-    IRIns *irpp = IR(irp->op1);
-    if (irpp == ir-2 && irpp->o == IR_FPMATH &&
-	irpp->op2 == IRFPM_LOG2 && !ra_used(irpp)) {
-      asm_fppow(as, ir, irpp->op1, irp->op2);
-      return 1;
-    }
-  }
-  return 0;
-}
-#endif
-
 /* -- PHI and loop handling ----------------------------------------------- */
 
 /* Break a PHI cycle by renaming to a free register (evict if needed). */
@@ -1652,6 +1626,62 @@ static void asm_loop(ASMState *as)
 #error "Missing assembler for target CPU"
 #endif
 
+/* -- Common instruction helpers ------------------------------------------ */
+
+#if !LJ_SOFTFP32
+#if !LJ_TARGET_X86ORX64
+#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
+#define asm_fppowi(as, ir)	asm_callid(as, ir, IRCALL_lj_vm_powi)
+#endif
+
+static void asm_pow(ASMState *as, IRIns *ir)
+{
+#if LJ_64 && LJ_HASFFI
+  if (!irt_isnum(ir->t))
+    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_powi64 :
+					  IRCALL_lj_carith_powu64);
+  else
+#endif
+  if (irt_isnum(IR(ir->op2)->t))
+    asm_callid(as, ir, IRCALL_pow);
+  else
+    asm_fppowi(as, ir);
+}
+
+static void asm_div(ASMState *as, IRIns *ir)
+{
+#if LJ_64 && LJ_HASFFI
+  if (!irt_isnum(ir->t))
+    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_divi64 :
+					  IRCALL_lj_carith_divu64);
+  else
+#endif
+    asm_fpdiv(as, ir);
+}
+#endif
+
+static void asm_mod(ASMState *as, IRIns *ir)
+{
+#if LJ_64 && LJ_HASFFI
+  if (!irt_isint(ir->t))
+    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_modi64 :
+					  IRCALL_lj_carith_modu64);
+  else
+#endif
+    asm_callid(as, ir, IRCALL_lj_vm_modi);
+}
+
+static void asm_fuseequal(ASMState *as, IRIns *ir)
+{
+  /* Fuse HREF + EQ/NE. */
+  if ((ir-1)->o == IR_HREF && ir->op1 == as->curins-1) {
+    as->curins--;
+    asm_href(as, ir-1, (IROp)ir->o);
+  } else {
+    asm_equal(as, ir);
+  }
+}
+
 /* -- Instruction dispatch ------------------------------------------------ */
 
 /* Assemble a single instruction. */
@@ -1674,14 +1704,7 @@ static void asm_ir(ASMState *as, IRIns *ir)
   case IR_ABC:
     asm_comp(as, ir);
     break;
-  case IR_EQ: case IR_NE:
-    if ((ir-1)->o == IR_HREF && ir->op1 == as->curins-1) {
-      as->curins--;
-      asm_href(as, ir-1, (IROp)ir->o);
-    } else {
-      asm_equal(as, ir);
-    }
-    break;
+  case IR_EQ: case IR_NE: asm_fuseequal(as, ir); break;
 
   case IR_RETF: asm_retf(as, ir); break;
 
@@ -1750,7 +1773,13 @@ static void asm_ir(ASMState *as, IRIns *ir)
   case IR_SNEW: case IR_XSNEW: asm_snew(as, ir); break;
   case IR_TNEW: asm_tnew(as, ir); break;
   case IR_TDUP: asm_tdup(as, ir); break;
-  case IR_CNEW: case IR_CNEWI: asm_cnew(as, ir); break;
+  case IR_CNEW: case IR_CNEWI:
+#if LJ_HASFFI
+    asm_cnew(as, ir);
+#else
+    lua_assert(0);
+#endif
+    break;
 
   /* Buffer operations. */
   case IR_BUFHDR: asm_bufhdr(as, ir); break;
@@ -2215,6 +2244,10 @@ static void asm_setup_regsp(ASMState *as)
 	if (inloop)
 	  as->modset |= RSET_SCRATCH;
 #if LJ_TARGET_X86
+	if (irt_isnum(IR(ir->op2)->t)) {
+	  if (as->evenspill < 4)  /* Leave room to call pow(). */
+	    as->evenspill = 4;
+	}
 	break;
 #else
 	ir->prev = REGSP_HINT(RID_FPRET);
@@ -2240,9 +2273,6 @@ static void asm_setup_regsp(ASMState *as)
 	  continue;
 	}
 	break;
-      } else if (ir->op2 == IRFPM_EXP2 && !LJ_64) {
-	if (as->evenspill < 4)  /* Leave room to call pow(). */
-	  as->evenspill = 4;
       }
 #endif
       if (inloop)
diff --git a/src/lj_asm_arm.h b/src/lj_asm_arm.h
index 2894e5c9..29a07c80 100644
--- a/src/lj_asm_arm.h
+++ b/src/lj_asm_arm.h
@@ -1275,8 +1275,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
   ra_allockreg(as, (int32_t)(sz+sizeof(GCcdata)),
 	       ra_releasetmp(as, ASMREF_TMP1));
 }
-#else
-#define asm_cnew(as, ir)	((void)0)
 #endif
 
 /* -- Write barriers ------------------------------------------------------ */
@@ -1371,8 +1369,6 @@ static void asm_callround(ASMState *as, IRIns *ir, int id)
 
 static void asm_fpmath(ASMState *as, IRIns *ir)
 {
-  if (ir->op2 == IRFPM_EXP2 && asm_fpjoin_pow(as, ir))
-    return;
   if (ir->op2 <= IRFPM_TRUNC)
     asm_callround(as, ir, ir->op2);
   else if (ir->op2 == IRFPM_SQRT)
@@ -1499,14 +1495,10 @@ static void asm_mul(ASMState *as, IRIns *ir)
 #define asm_mulov(as, ir)	asm_mul(as, ir)
 
 #if !LJ_SOFTFP
-#define asm_div(as, ir)		asm_fparith(as, ir, ARMI_VDIV_D)
-#define asm_pow(as, ir)		asm_callid(as, ir, IRCALL_lj_vm_powi)
+#define asm_fpdiv(as, ir)	asm_fparith(as, ir, ARMI_VDIV_D)
 #define asm_abs(as, ir)		asm_fpunary(as, ir, ARMI_VABS_D)
-#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
 #endif
 
-#define asm_mod(as, ir)		asm_callid(as, ir, IRCALL_lj_vm_modi)
-
 static void asm_neg(ASMState *as, IRIns *ir)
 {
 #if !LJ_SOFTFP
diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
index aea251a9..c3d6889e 100644
--- a/src/lj_asm_arm64.h
+++ b/src/lj_asm_arm64.h
@@ -1249,8 +1249,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
   ra_allockreg(as, (int32_t)(sz+sizeof(GCcdata)),
 	       ra_releasetmp(as, ASMREF_TMP1));
 }
-#else
-#define asm_cnew(as, ir)	((void)0)
 #endif
 
 /* -- Write barriers ------------------------------------------------------ */
@@ -1327,8 +1325,6 @@ static void asm_fpmath(ASMState *as, IRIns *ir)
   } else if (fpm <= IRFPM_TRUNC) {
     asm_fpunary(as, ir, fpm == IRFPM_FLOOR ? A64I_FRINTMd :
 			fpm == IRFPM_CEIL ? A64I_FRINTPd : A64I_FRINTZd);
-  } else if (fpm == IRFPM_EXP2 && asm_fpjoin_pow(as, ir)) {
-    return;
   } else {
     asm_callid(as, ir, IRCALL_lj_vm_floor + fpm);
   }
@@ -1435,45 +1431,12 @@ static void asm_mul(ASMState *as, IRIns *ir)
   asm_intmul(as, ir);
 }
 
-static void asm_div(ASMState *as, IRIns *ir)
-{
-#if LJ_HASFFI
-  if (!irt_isnum(ir->t))
-    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_divi64 :
-					  IRCALL_lj_carith_divu64);
-  else
-#endif
-    asm_fparith(as, ir, A64I_FDIVd);
-}
-
-static void asm_pow(ASMState *as, IRIns *ir)
-{
-#if LJ_HASFFI
-  if (!irt_isnum(ir->t))
-    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_powi64 :
-					  IRCALL_lj_carith_powu64);
-  else
-#endif
-    asm_callid(as, ir, IRCALL_lj_vm_powi);
-}
-
 #define asm_addov(as, ir)	asm_add(as, ir)
 #define asm_subov(as, ir)	asm_sub(as, ir)
 #define asm_mulov(as, ir)	asm_mul(as, ir)
 
+#define asm_fpdiv(as, ir)	asm_fparith(as, ir, A64I_FDIVd)
 #define asm_abs(as, ir)		asm_fpunary(as, ir, A64I_FABS)
-#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
-
-static void asm_mod(ASMState *as, IRIns *ir)
-{
-#if LJ_HASFFI
-  if (!irt_isint(ir->t))
-    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_modi64 :
-					  IRCALL_lj_carith_modu64);
-  else
-#endif
-    asm_callid(as, ir, IRCALL_lj_vm_modi);
-}
 
 static void asm_neg(ASMState *as, IRIns *ir)
 {
diff --git a/src/lj_asm_mips.h b/src/lj_asm_mips.h
index 4626507b..0f92959b 100644
--- a/src/lj_asm_mips.h
+++ b/src/lj_asm_mips.h
@@ -1613,8 +1613,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
   ra_allockreg(as, (int32_t)(sz+sizeof(GCcdata)),
 	       ra_releasetmp(as, ASMREF_TMP1));
 }
-#else
-#define asm_cnew(as, ir)	((void)0)
 #endif
 
 /* -- Write barriers ------------------------------------------------------ */
@@ -1683,8 +1681,6 @@ static void asm_fpunary(ASMState *as, IRIns *ir, MIPSIns mi)
 #if !LJ_SOFTFP32
 static void asm_fpmath(ASMState *as, IRIns *ir)
 {
-  if (ir->op2 == IRFPM_EXP2 && asm_fpjoin_pow(as, ir))
-    return;
 #if !LJ_SOFTFP
   if (ir->op2 <= IRFPM_TRUNC)
     asm_callround(as, ir, IRCALL_lj_vm_floor + ir->op2);
@@ -1772,41 +1768,13 @@ static void asm_mul(ASMState *as, IRIns *ir)
   }
 }
 
-static void asm_mod(ASMState *as, IRIns *ir)
-{
-#if LJ_64 && LJ_HASFFI
-  if (!irt_isint(ir->t))
-    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_modi64 :
-					  IRCALL_lj_carith_modu64);
-  else
-#endif
-    asm_callid(as, ir, IRCALL_lj_vm_modi);
-}
-
 #if !LJ_SOFTFP32
-static void asm_pow(ASMState *as, IRIns *ir)
-{
-#if LJ_64 && LJ_HASFFI
-  if (!irt_isnum(ir->t))
-    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_powi64 :
-					  IRCALL_lj_carith_powu64);
-  else
-#endif
-    asm_callid(as, ir, IRCALL_lj_vm_powi);
-}
-
-static void asm_div(ASMState *as, IRIns *ir)
+static void asm_fpdiv(ASMState *as, IRIns *ir)
 {
-#if LJ_64 && LJ_HASFFI
-  if (!irt_isnum(ir->t))
-    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_divi64 :
-					  IRCALL_lj_carith_divu64);
-  else
-#endif
 #if !LJ_SOFTFP
     asm_fparith(as, ir, MIPSI_DIV_D);
 #else
-  asm_callid(as, ir, IRCALL_softfp_div);
+    asm_callid(as, ir, IRCALL_softfp_div);
 #endif
 }
 #endif
@@ -1844,8 +1812,6 @@ static void asm_abs(ASMState *as, IRIns *ir)
 }
 #endif
 
-#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
-
 static void asm_arithov(ASMState *as, IRIns *ir)
 {
   /* TODO MIPSR6: bovc/bnvc. Caveat: no delay slot to load RID_TMP. */
diff --git a/src/lj_asm_ppc.h b/src/lj_asm_ppc.h
index 6aaed058..62a5c3e2 100644
--- a/src/lj_asm_ppc.h
+++ b/src/lj_asm_ppc.h
@@ -1177,8 +1177,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
   ra_allockreg(as, (int32_t)(sz+sizeof(GCcdata)),
 	       ra_releasetmp(as, ASMREF_TMP1));
 }
-#else
-#define asm_cnew(as, ir)	((void)0)
 #endif
 
 /* -- Write barriers ------------------------------------------------------ */
@@ -1249,8 +1247,6 @@ static void asm_fpunary(ASMState *as, IRIns *ir, PPCIns pi)
 
 static void asm_fpmath(ASMState *as, IRIns *ir)
 {
-  if (ir->op2 == IRFPM_EXP2 && asm_fpjoin_pow(as, ir))
-    return;
   if (ir->op2 == IRFPM_SQRT && (as->flags & JIT_F_SQRT))
     asm_fpunary(as, ir, PPCI_FSQRT);
   else
@@ -1364,9 +1360,7 @@ static void asm_mul(ASMState *as, IRIns *ir)
   }
 }
 
-#define asm_div(as, ir)		asm_fparith(as, ir, PPCI_FDIV)
-#define asm_mod(as, ir)		asm_callid(as, ir, IRCALL_lj_vm_modi)
-#define asm_pow(as, ir)		asm_callid(as, ir, IRCALL_lj_vm_powi)
+#define asm_fpdiv(as, ir)	asm_fparith(as, ir, PPCI_FDIV)
 
 static void asm_neg(ASMState *as, IRIns *ir)
 {
@@ -1390,7 +1384,6 @@ static void asm_neg(ASMState *as, IRIns *ir)
 }
 
 #define asm_abs(as, ir)		asm_fpunary(as, ir, PPCI_FABS)
-#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
 
 static void asm_arithov(ASMState *as, IRIns *ir, PPCIns pi)
 {
diff --git a/src/lj_asm_x86.h b/src/lj_asm_x86.h
index 63d332ca..5f5fe3cf 100644
--- a/src/lj_asm_x86.h
+++ b/src/lj_asm_x86.h
@@ -1857,8 +1857,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
   asm_gencall(as, ci, args);
   emit_loadi(as, ra_releasetmp(as, ASMREF_TMP1), (int32_t)(sz+sizeof(GCcdata)));
 }
-#else
-#define asm_cnew(as, ir)	((void)0)
 #endif
 
 /* -- Write barriers ------------------------------------------------------ */
@@ -1964,8 +1962,6 @@ static void asm_fpmath(ASMState *as, IRIns *ir)
 		    fpm == IRFPM_CEIL ? lj_vm_ceil_sse : lj_vm_trunc_sse);
       ra_left(as, RID_XMM0, ir->op1);
     }
-  } else if (fpm == IRFPM_EXP2 && asm_fpjoin_pow(as, ir)) {
-    /* Rejoined to pow(). */
   } else {
     asm_callid(as, ir, IRCALL_lj_vm_floor + fpm);
   }
@@ -2000,17 +1996,6 @@ static void asm_fppowi(ASMState *as, IRIns *ir)
   ra_left(as, RID_EAX, ir->op2);
 }
 
-static void asm_pow(ASMState *as, IRIns *ir)
-{
-#if LJ_64 && LJ_HASFFI
-  if (!irt_isnum(ir->t))
-    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_powi64 :
-					  IRCALL_lj_carith_powu64);
-  else
-#endif
-    asm_fppowi(as, ir);
-}
-
 static int asm_swapops(ASMState *as, IRIns *ir)
 {
   IRIns *irl = IR(ir->op1);
@@ -2208,27 +2193,7 @@ static void asm_mul(ASMState *as, IRIns *ir)
     asm_intarith(as, ir, XOg_X_IMUL);
 }
 
-static void asm_div(ASMState *as, IRIns *ir)
-{
-#if LJ_64 && LJ_HASFFI
-  if (!irt_isnum(ir->t))
-    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_divi64 :
-					  IRCALL_lj_carith_divu64);
-  else
-#endif
-    asm_fparith(as, ir, XO_DIVSD);
-}
-
-static void asm_mod(ASMState *as, IRIns *ir)
-{
-#if LJ_64 && LJ_HASFFI
-  if (!irt_isint(ir->t))
-    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_modi64 :
-					  IRCALL_lj_carith_modu64);
-  else
-#endif
-    asm_callid(as, ir, IRCALL_lj_vm_modi);
-}
+#define asm_fpdiv(as, ir)	asm_fparith(as, ir, XO_DIVSD)
 
 static void asm_neg_not(ASMState *as, IRIns *ir, x86Group3 xg)
 {
diff --git a/src/lj_ir.h b/src/lj_ir.h
index e8bca275..43e55069 100644
--- a/src/lj_ir.h
+++ b/src/lj_ir.h
@@ -177,7 +177,7 @@ LJ_STATIC_ASSERT((int)IR_XLOAD + IRDELTA_L2S == (int)IR_XSTORE);
 /* FPMATH sub-functions. ORDER FPM. */
 #define IRFPMDEF(_) \
   _(FLOOR) _(CEIL) _(TRUNC)  /* Must be first and in this order. */ \
-  _(SQRT) _(EXP2) _(LOG) _(LOG2) \
+  _(SQRT) _(LOG) _(LOG2) \
   _(OTHER)
 
 typedef enum {
diff --git a/src/lj_ircall.h b/src/lj_ircall.h
index bbad35b1..af064a6f 100644
--- a/src/lj_ircall.h
+++ b/src/lj_ircall.h
@@ -192,7 +192,6 @@ typedef struct CCallInfo {
   _(FPMATH,	lj_vm_ceil,		1,   N, NUM, XA_FP) \
   _(FPMATH,	lj_vm_trunc,		1,   N, NUM, XA_FP) \
   _(FPMATH,	sqrt,			1,   N, NUM, XA_FP) \
-  _(ANY,	lj_vm_exp2,		1,   N, NUM, XA_FP) \
   _(ANY,	log,			1,   N, NUM, XA_FP) \
   _(ANY,	lj_vm_log2,		1,   N, NUM, XA_FP) \
   _(ANY,	lj_vm_powi,		2,   N, NUM, XA_FP) \
diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
index 27e489af..cd803d87 100644
--- a/src/lj_opt_fold.c
+++ b/src/lj_opt_fold.c
@@ -237,10 +237,11 @@ LJFOLDF(kfold_fpcall2)
 }
 
 LJFOLD(POW KNUM KINT)
+LJFOLD(POW KNUM KNUM)
 LJFOLDF(kfold_numpow)
 {
   lua_Number a = knumleft;
-  lua_Number b = (lua_Number)fright->i;
+  lua_Number b = fright->o == IR_KINT ? (lua_Number)fright->i : knumright;
   lua_Number y = lj_vm_foldarith(a, b, IR_POW - IR_ADD);
   return lj_ir_knum(J, y);
 }
@@ -1077,7 +1078,7 @@ LJFOLDF(simplify_nummuldiv_negneg)
 }
 
 LJFOLD(POW any KINT)
-LJFOLDF(simplify_numpow_xk)
+LJFOLDF(simplify_numpow_xkint)
 {
   int32_t k = fright->i;
   TRef ref = fins->op1;
@@ -1106,13 +1107,22 @@ LJFOLDF(simplify_numpow_xk)
   return ref;
 }
 
+LJFOLD(POW any KNUM)
+LJFOLDF(simplify_numpow_xknum)
+{
+  if (knumright == 0.5)  /* x ^ 0.5 ==> sqrt(x) */
+    return emitir(IRTN(IR_FPMATH), fins->op1, IRFPM_SQRT);
+  return NEXTFOLD;
+}
+
 LJFOLD(POW KNUM any)
 LJFOLDF(simplify_numpow_kx)
 {
   lua_Number n = knumleft;
-  if (n == 2.0) {  /* 2.0 ^ i ==> ldexp(1.0, tonum(i)) */
-    fins->o = IR_CONV;
+  if (n == 2.0 && irt_isint(fright->t)) {  /* 2.0 ^ i ==> ldexp(1.0, i) */
 #if LJ_TARGET_X86ORX64
+    /* Different IR_LDEXP calling convention on x86/x64 requires conversion. */
+    fins->o = IR_CONV;
     fins->op1 = fins->op2;
     fins->op2 = IRCONV_NUM_INT;
     fins->op2 = (IRRef1)lj_opt_fold(J);
diff --git a/src/lj_opt_narrow.c b/src/lj_opt_narrow.c
index bb61f97b..4f285334 100644
--- a/src/lj_opt_narrow.c
+++ b/src/lj_opt_narrow.c
@@ -593,10 +593,10 @@ TRef lj_opt_narrow_pow(jit_State *J, TRef rb, TRef rc, TValue *vb, TValue *vc)
   /* Narrowing must be unconditional to preserve (-x)^i semantics. */
   if (tvisint(vc) || numisint(numV(vc))) {
     int checkrange = 0;
-    /* Split pow is faster for bigger exponents. But do this only for (+k)^i. */
+    /* pow() is faster for bigger exponents. But do this only for (+k)^i. */
     if (tref_isk(rb) && (int32_t)ir_knum(IR(tref_ref(rb)))->u32.hi >= 0) {
       int32_t k = numberVint(vc);
-      if (!(k >= -65536 && k <= 65536)) goto split_pow;
+      if (!(k >= -65536 && k <= 65536)) goto force_pow_num;
       checkrange = 1;
     }
     if (!tref_isinteger(rc)) {
@@ -607,19 +607,11 @@ TRef lj_opt_narrow_pow(jit_State *J, TRef rb, TRef rc, TValue *vb, TValue *vc)
       TRef tmp = emitir(IRTI(IR_ADD), rc, lj_ir_kint(J, 65536));
       emitir(IRTGI(IR_ULE), tmp, lj_ir_kint(J, 2*65536));
     }
-    return emitir(IRTN(IR_POW), rb, rc);
+  } else {
+force_pow_num:
+    rc = lj_ir_tonum(J, rc);  /* Want POW(num, num), not POW(num, int). */
   }
-split_pow:
-  /* FOLD covers most cases, but some are easier to do here. */
-  if (tref_isk(rb) && tvispone(ir_knum(IR(tref_ref(rb)))))
-    return rb;  /* 1 ^ x ==> 1 */
-  rc = lj_ir_tonum(J, rc);
-  if (tref_isk(rc) && ir_knum(IR(tref_ref(rc)))->n == 0.5)
-    return emitir(IRTN(IR_FPMATH), rb, IRFPM_SQRT);  /* x ^ 0.5 ==> sqrt(x) */
-  /* Split up b^c into exp2(c*log2(b)). Assembler may rejoin later. */
-  rb = emitir(IRTN(IR_FPMATH), rb, IRFPM_LOG2);
-  rc = emitir(IRTN(IR_MUL), rb, rc);
-  return emitir(IRTN(IR_FPMATH), rc, IRFPM_EXP2);
+  return emitir(IRTN(IR_POW), rb, rc);
 }
 
 /* -- Predictive narrowing of induction variables ------------------------- */
diff --git a/src/lj_opt_split.c b/src/lj_opt_split.c
index 2fc36b8d..c10a85cb 100644
--- a/src/lj_opt_split.c
+++ b/src/lj_opt_split.c
@@ -403,27 +403,6 @@ static void split_ir(jit_State *J)
 	hi = split_call_li(J, hisubst, oir, ir, IRCALL_lj_vm_powi);
 	break;
       case IR_FPMATH:
-	/* Try to rejoin pow from EXP2, MUL and LOG2. */
-	if (nir->op2 == IRFPM_EXP2 && nir->op1 > J->loopref) {
-	  IRIns *irp = IR(nir->op1);
-	  if (irp->o == IR_CALLN && irp->op2 == IRCALL_softfp_mul) {
-	    IRIns *irm4 = IR(irp->op1);
-	    IRIns *irm3 = IR(irm4->op1);
-	    IRIns *irm12 = IR(irm3->op1);
-	    IRIns *irl1 = IR(irm12->op1);
-	    if (irm12->op1 > J->loopref && irl1->o == IR_CALLN &&
-		irl1->op2 == IRCALL_lj_vm_log2) {
-	      IRRef tmp = irl1->op1;  /* Recycle first two args from LOG2. */
-	      IRRef arg3 = irm3->op2, arg4 = irm4->op2;
-	      J->cur.nins--;
-	      tmp = split_emit(J, IRT(IR_CARG, IRT_NIL), tmp, arg3);
-	      tmp = split_emit(J, IRT(IR_CARG, IRT_NIL), tmp, arg4);
-	      ir->prev = tmp = split_emit(J, IRTI(IR_CALLN), tmp, IRCALL_pow);
-	      hi = split_emit(J, IRT(IR_HIOP, IRT_SOFTFP), tmp, tmp);
-	      break;
-	    }
-	  }
-	}
 	hi = split_call_l(J, hisubst, oir, ir, IRCALL_lj_vm_floor + ir->op2);
 	break;
       case IR_LDEXP:
diff --git a/src/lj_vm.h b/src/lj_vm.h
index 411caafa..abaa7c52 100644
--- a/src/lj_vm.h
+++ b/src/lj_vm.h
@@ -95,11 +95,6 @@ LJ_ASMF double lj_vm_trunc(double);
 LJ_ASMF double lj_vm_trunc_sf(double);
 #endif
 #endif
-#ifdef LUAJIT_NO_EXP2
-LJ_ASMF double lj_vm_exp2(double);
-#else
-#define lj_vm_exp2	exp2
-#endif
 #if LJ_HASFFI
 LJ_ASMF int lj_vm_errno(void);
 #endif
diff --git a/src/lj_vmmath.c b/src/lj_vmmath.c
index ae4e0f15..9c0d3fde 100644
--- a/src/lj_vmmath.c
+++ b/src/lj_vmmath.c
@@ -79,13 +79,6 @@ double lj_vm_log2(double a)
 }
 #endif
 
-#ifdef LUAJIT_NO_EXP2
-double lj_vm_exp2(double a)
-{
-  return exp(a * 0.6931471805599453);
-}
-#endif
-
 #if !LJ_TARGET_X86ORX64
 /* Unsigned x^k. */
 static double lj_vm_powui(double x, uint32_t k)
@@ -128,7 +121,6 @@ double lj_vm_foldfpm(double x, int fpm)
   case IRFPM_CEIL: return lj_vm_ceil(x);
   case IRFPM_TRUNC: return lj_vm_trunc(x);
   case IRFPM_SQRT: return sqrt(x);
-  case IRFPM_EXP2: return lj_vm_exp2(x);
   case IRFPM_LOG: return log(x);
   case IRFPM_LOG2: return lj_vm_log2(x);
   default: lua_assert(0);
diff --git a/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua b/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
new file mode 100644
index 00000000..21b3a0d9
--- /dev/null
+++ b/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
@@ -0,0 +1,63 @@
+local tap = require('tap')
+-- Test to demonstrate the incorrect JIT behaviour when splitting
+-- IR_POW.
+-- See also https://github.com/LuaJIT/LuaJIT/issues/9.
+local test = tap.test('lj-9-pow-inconsistencies'):skipcond({
+  ['Test requires JIT enabled'] = not jit.status(),
+})
+
+local nan = 0 / 0
+local inf = math.huge
+
+-- Table with some corner cases to check:
+local INTERESTING_VALUES = {
+  -- 0, -0, 1, -1 special cases with nan, inf, etc..
+  0, -0, 1, -1, nan, inf, -inf,
+  -- x ^  inf = 0 (inf), if |x| < 1 (|x| > 1).
+  -- x ^ -inf = inf (0), if |x| < 1 (|x| > 1).
+  0.999999, 1.000001, -0.999999, -1.000001,
+}
+test:plan(1 + (#INTERESTING_VALUES) ^ 2)
+
+jit.opt.start('hotloop=1')
+
+-- The JIT engine tries to split b^c to exp2(c * log2(b)).
+-- For some cases for IEEE754 we can see, that
+-- (double)exp2((double)log2(x)) != x, due to mathematical
+-- functions accuracy and double precision restrictions.
+-- Just use some numbers to observe this misbehaviour.
+local res = {}
+local cnt = 1
+while cnt < 4 do
+  -- XXX: use local variable to prevent folding via parser.
+  local b = -0.90000000001
+  res[cnt] = 1000 ^ b
+  cnt = cnt + 1
+end
+
+test:samevalues(res, 'consistent pow operator behaviour for corner case')
+
+-- Prevent JIT side effects for parent loops.
+jit.off()
+for i = 1, #INTERESTING_VALUES do
+  for j = 1, #INTERESTING_VALUES do
+    local b = INTERESTING_VALUES[i]
+    local c = INTERESTING_VALUES[j]
+    local results = {}
+    local counter = 1
+    jit.on()
+    while counter < 4 do
+      results[counter] = b ^ c
+      counter = counter + 1
+    end
+    -- Prevent JIT side effects.
+    jit.off()
+    jit.flush()
+    test:samevalues(
+      results,
+      ('consistent pow operator behaviour for (%s)^(%s)'):format(b, c)
+    )
+  end
+end
+
+test:done(true)
-- 
2.41.0


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Tarantool-patches] [PATCH luajit 3/5] Improve assertions.
  2023-08-15  9:36 [Tarantool-patches] [PATCH luajit 0/5] Fix pow inconsistencies and improve asserts Sergey Kaplun via Tarantool-patches
  2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 1/5] test: introduce `samevalues()` TAP checker Sergey Kaplun via Tarantool-patches
  2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 2/5] Remove pow() splitting and cleanup backends Sergey Kaplun via Tarantool-patches
@ 2023-08-15  9:36 ` Sergey Kaplun via Tarantool-patches
  2023-08-17 14:58   ` Maxim Kokryashkin via Tarantool-patches
  2023-08-18 11:20   ` Sergey Bronnikov via Tarantool-patches
  2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 4/5] Fix pow() optimization inconsistencies Sergey Kaplun via Tarantool-patches
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 34+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-08-15  9:36 UTC (permalink / raw)
  To: Maxim Kokryashkin, Sergey Bronnikov; +Cc: tarantool-patches

From: Mike Pall <mike>

(cherry-picked from commit 8ae5170cdc9c307bd81019b3e014391c9fd00581)

This commit refactors the assertions used in LuaJIT. It introduces the
new module <src/lj_assert.c> with the `lj_assert_fail()` implementation.
Wrappers of this function are used across the whole code base. Each
macro wrapper defined in the corresponding module gets the global state
(if possible) from its environment and passes it into the assertion.
For now, the global state is unused, but later it may be used for
dumping the VM state.

Sergey Kaplun:
* added the description for the feature

Part of tarantool/tarantool#8825
---
 src/CMakeLists.txt        |   1 +
 src/Makefile.dep.original |  13 +--
 src/Makefile.original     |   4 +-
 src/lib_io.c              |   6 +-
 src/lib_jit.c             |   4 +-
 src/lib_misc.c            |  12 +--
 src/lib_string.c          |   6 +-
 src/lj_api.c              | 140 +++++++++++++++++---------------
 src/lj_asm.c              | 130 ++++++++++++++++++------------
 src/lj_asm_arm.h          | 119 ++++++++++++++++------------
 src/lj_asm_arm64.h        |  95 ++++++++++++----------
 src/lj_asm_mips.h         | 151 ++++++++++++++++++++---------------
 src/lj_asm_ppc.h          | 113 +++++++++++++++-----------
 src/lj_asm_x86.h          | 161 +++++++++++++++++++++----------------
 src/lj_assert.c           |  28 +++++++
 src/lj_bcread.c           |  20 ++---
 src/lj_bcwrite.c          |  24 ++++--
 src/lj_buf.c              |   4 +-
 src/lj_carith.c           |  10 ++-
 src/lj_ccall.c            |  19 +++--
 src/lj_ccallback.c        |  42 +++++-----
 src/lj_cconv.c            |  57 ++++++++------
 src/lj_cconv.h            |   5 +-
 src/lj_cdata.c            |  27 ++++---
 src/lj_cdata.h            |   7 +-
 src/lj_clib.c             |   6 +-
 src/lj_cparse.c           |  25 +++---
 src/lj_crecord.c          |  19 +++--
 src/lj_ctype.c            |  13 +--
 src/lj_ctype.h            |  14 +++-
 src/lj_debug.c            |  18 +++--
 src/lj_def.h              |  26 ++++--
 src/lj_dispatch.c         |  11 ++-
 src/lj_emit_arm.h         |  50 ++++++------
 src/lj_emit_arm64.h       |  21 ++---
 src/lj_emit_mips.h        |  22 +++---
 src/lj_emit_ppc.h         |  12 +--
 src/lj_emit_x86.h         |  22 +++---
 src/lj_err.c              |  40 ++--------
 src/lj_func.c             |  18 +++--
 src/lj_gc.c               |  78 ++++++++++--------
 src/lj_gc.h               |   6 +-
 src/lj_gdbjit.c           |   5 +-
 src/lj_ir.c               |  31 ++++----
 src/lj_ir.h               |   5 +-
 src/lj_jit.h              |   6 ++
 src/lj_lex.c              |  14 ++--
 src/lj_lex.h              |   6 ++
 src/lj_load.c             |   2 +-
 src/lj_mapi.c             |   2 +-
 src/lj_mcode.c            |   2 +-
 src/lj_memprof.c          |  35 ++++----
 src/lj_meta.c             |   6 +-
 src/lj_obj.h              |  35 +++++---
 src/lj_opt_fold.c         |  88 ++++++++++++---------
 src/lj_opt_loop.c         |   5 +-
 src/lj_opt_mem.c          |  15 ++--
 src/lj_opt_narrow.c       |  17 ++--
 src/lj_opt_split.c        |  22 +++---
 src/lj_parse.c            | 114 +++++++++++++++------------
 src/lj_record.c           | 162 +++++++++++++++++++++++---------------
 src/lj_snap.c             | 100 ++++++++++++++---------
 src/lj_snap.h             |   3 +-
 src/lj_state.c            |  18 +++--
 src/lj_str.c              |   7 +-
 src/lj_strfmt.c           |   4 +-
 src/lj_strfmt.h           |   3 +-
 src/lj_strfmt_num.c       |   6 +-
 src/lj_strscan.c          |   9 ++-
 src/lj_symtab.c           |  11 +--
 src/lj_sysprof.c          |  31 ++++----
 src/lj_tab.c              |  20 ++---
 src/lj_target.h           |   3 +-
 src/lj_trace.c            |  57 +++++++-------
 src/lj_utils_leb128.c     |   5 +-
 src/lj_vmmath.c           |   7 +-
 src/lj_wbuf.c             |   3 +-
 src/ljamalg.c             |   1 +
 src/luaconf.h             |   2 +-
 79 files changed, 1436 insertions(+), 1025 deletions(-)
 create mode 100644 src/lj_assert.c

diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index feeccbde..03338306 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -59,6 +59,7 @@ make_source_list(SOURCES_FRONTEND
 make_source_list(SOURCES_UTILS
   SOURCES
     lj_alloc.c
+    lj_assert.c
     lj_char.c
     lj_utils_leb128.c
     lj_vmmath.c
diff --git a/src/Makefile.dep.original b/src/Makefile.dep.original
index 968805ed..d35b6d9a 100644
--- a/src/Makefile.dep.original
+++ b/src/Makefile.dep.original
@@ -54,6 +54,7 @@ lj_asm.o: lj_asm.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h lj_gc.h \
  lj_ircall.h lj_iropt.h lj_mcode.h lj_trace.h lj_dispatch.h lj_traceerr.h \
  lj_snap.h lj_asm.h lj_vm.h lj_target.h lj_target_*.h lj_emit_*.h \
  lj_asm_*.h
+lj_assert.o: lj_assert.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h
 lj_bc.o: lj_bc.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h lj_bc.h \
  lj_bcdef.h
 lj_bcread.o: lj_bcread.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
@@ -164,7 +165,7 @@ lj_opt_loop.o: lj_opt_loop.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
  lj_iropt.h lj_trace.h lj_dispatch.h lj_bc.h lj_traceerr.h lj_snap.h \
  lj_vm.h
 lj_opt_mem.o: lj_opt_mem.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
- lj_tab.h lj_ir.h lj_jit.h lj_iropt.h lj_ircall.h
+ lj_tab.h lj_ir.h lj_jit.h lj_iropt.h lj_ircall.h lj_dispatch.h lj_bc.h
 lj_opt_narrow.o: lj_opt_narrow.c lj_obj.h lua.h luaconf.h lj_def.h \
  lj_arch.h lj_bc.h lj_ir.h lj_jit.h lj_iropt.h lj_trace.h lj_dispatch.h \
  lj_traceerr.h lj_vm.h lj_strscan.h
@@ -224,15 +225,17 @@ lj_trace.o: lj_trace.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
  lmisclib.h lj_sysprof.h
 lj_udata.o: lj_udata.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
  lj_gc.h lj_udata.h
-lj_utils_leb128.o: lj_utils_leb128.c lj_utils.h lj_def.h lua.h luaconf.h
+lj_utils_leb128.o: lj_utils_leb128.c lj_utils.h lj_def.h lua.h luaconf.h \
+ lj_obj.h lj_arch.h
 lj_vmevent.o: lj_vmevent.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
  lj_str.h lj_tab.h lj_state.h lj_dispatch.h lj_bc.h lj_jit.h lj_ir.h \
  lj_vm.h lj_vmevent.h
 lj_vmmath.o: lj_vmmath.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
  lj_ir.h lj_vm.h
-lj_wbuf.o: lj_wbuf.c lj_wbuf.h lj_def.h lua.h luaconf.h lj_utils.h
-ljamalg.o: ljamalg.c lua.h luaconf.h lauxlib.h lj_gc.c lj_obj.h lj_def.h \
- lj_arch.h lj_gc.h lj_err.h lj_errmsg.h lj_buf.h lj_str.h lj_tab.h \
+lj_wbuf.o: lj_wbuf.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
+ lj_wbuf.h lj_utils.h
+ljamalg.o: ljamalg.c lua.h luaconf.h lauxlib.h lj_assert.c lj_obj.h lj_def.h \
+ lj_arch.h lj_gc.c lj_gc.h lj_err.h lj_errmsg.h lj_buf.h lj_str.h lj_tab.h \
  lj_func.h lj_udata.h lj_meta.h lj_state.h lj_frame.h lj_bc.h lj_ctype.h \
  lj_cdata.h lj_trace.h lj_jit.h lj_ir.h lj_dispatch.h lj_traceerr.h \
  lj_vm.h lj_err.c lj_debug.h lj_ff.h lj_ffdef.h lj_strfmt.h lj_char.c \
diff --git a/src/Makefile.original b/src/Makefile.original
index 22d36a27..8cfe55c2 100644
--- a/src/Makefile.original
+++ b/src/Makefile.original
@@ -499,8 +499,8 @@ LJLIB_O= lib_base.o lib_math.o lib_bit.o lib_string.o lib_table.o \
 	 lib_misc.o
 LJLIB_C= $(LJLIB_O:.o=.c)
 
-LJCORE_O= lj_gc.o lj_err.o lj_char.o lj_bc.o lj_obj.o lj_buf.o lj_wbuf.o \
-	  lj_str.o lj_tab.o lj_func.o lj_udata.o lj_meta.o lj_debug.o \
+LJCORE_O= lj_assert.o lj_gc.o lj_err.o lj_char.o lj_bc.o lj_obj.o lj_buf.o \
+	  lj_wbuf.o lj_str.o lj_tab.o lj_func.o lj_udata.o lj_meta.o lj_debug.o \
 	  lj_state.o lj_dispatch.o lj_vmevent.o lj_vmmath.o lj_strscan.o \
 	  lj_strfmt.o lj_strfmt_num.o lj_api.o lj_mapi.o lj_profile.o \
 	  lj_profile_timer.o lj_memprof.o lj_symtab.o lj_sysprof.o \
diff --git a/src/lib_io.c b/src/lib_io.c
index db995ae6..ef39e535 100644
--- a/src/lib_io.c
+++ b/src/lib_io.c
@@ -101,9 +101,6 @@ static int io_file_close(lua_State *L, IOFileUD *iof)
     stat = pclose(iof->fp);
 #elif LJ_TARGET_WINDOWS && !LJ_TARGET_XBOXONE && !LJ_TARGET_UWP
     stat = _pclose(iof->fp);
-#else
-    lua_assert(0);
-    return 0;
 #endif
 #if LJ_52
     iof->fp = NULL;
@@ -112,7 +109,8 @@ static int io_file_close(lua_State *L, IOFileUD *iof)
     ok = (stat != -1);
 #endif
   } else {
-    lua_assert((iof->type & IOFILE_TYPE_MASK) == IOFILE_TYPE_STDF);
+    lj_assertL((iof->type & IOFILE_TYPE_MASK) == IOFILE_TYPE_STDF,
+	       "close of unknown FILE* type");
     setnilV(L->top++);
     lua_pushliteral(L, "cannot close standard file");
     return 2;
diff --git a/src/lib_jit.c b/src/lib_jit.c
index 40aa2b51..b3c1c93c 100644
--- a/src/lib_jit.c
+++ b/src/lib_jit.c
@@ -227,7 +227,7 @@ LJLIB_CF(jit_util_funcbc)
   if (pc < pt->sizebc) {
     BCIns ins = proto_bc(pt)[pc];
     BCOp op = bc_op(ins);
-    lua_assert(op < BC__MAX);
+    lj_assertL(op < BC__MAX, "bad bytecode op %d", op);
     setintV(L->top, ins);
     setintV(L->top+1, lj_bc_mode[op]);
     L->top += 2;
@@ -491,7 +491,7 @@ static int jitopt_param(jit_State *J, const char *str)
   int i;
   for (i = 0; i < JIT_P__MAX; i++) {
     size_t len = *(const uint8_t *)lst;
-    lua_assert(len != 0);
+    lj_assertJ(len != 0, "bad JIT_P_STRING");
     if (strncmp(str, lst+1, len) == 0 && str[len] == '=') {
       int32_t n = 0;
       const char *p = &str[len+1];
diff --git a/src/lib_misc.c b/src/lib_misc.c
index 1913a622..ca1d1c75 100644
--- a/src/lib_misc.c
+++ b/src/lib_misc.c
@@ -109,7 +109,7 @@ static size_t buffer_writer_default(const void **buf_addr, size_t len,
   const void *data = *buf_addr;
   size_t write_total = 0;
 
-  lua_assert(len <= STREAM_BUFFER_SIZE);
+  lj_assertX(len <= STREAM_BUFFER_SIZE, "stream buffer overflow");
 
   for (;;) {
     const ssize_t written = write(fd, data, len - write_total);
@@ -127,7 +127,7 @@ static size_t buffer_writer_default(const void **buf_addr, size_t len,
     }
 
     write_total += written;
-    lua_assert(write_total <= len);
+    lj_assertX(write_total <= len, "invalid stream buffer write");
 
     if (write_total == len)
       break;
@@ -168,7 +168,7 @@ static int on_stop_cb_default(void *opt, uint8_t *buf)
 static int set_output_path(const char *path, struct luam_Sysprof_Options *opt) {
   struct profile_ctx *ctx = opt->ctx;
   int fd = 0;
-  lua_assert(path != NULL);
+  lj_assertX(path != NULL, "no file to open by sysprof");
   fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
   if(fd == -1) {
     return PROFILE_ERRIO;
@@ -280,7 +280,7 @@ static int sysprof_error(lua_State *L, int status)
       return luaL_fileresult(L, 0, NULL);
 #endif
     default:
-      lua_assert(0);
+      lj_assertL(0, "bad sysprof error %d", status);
       return 0;
   }
 }
@@ -401,7 +401,7 @@ LJLIB_CF(misc_memprof_start)
       return luaL_fileresult(L, 0, fname);
 #endif
     default:
-      lua_assert(0);
+      lj_assertL(0, "bad memprof error %d", memprof_status);
       return 0;
     }
   }
@@ -430,7 +430,7 @@ LJLIB_CF(misc_memprof_stop)
       return luaL_fileresult(L, 0, NULL);
 #endif
     default:
-      lua_assert(0);
+      lj_assertL(0, "bad memprof error %d", status);
       return 0;
     }
   }
diff --git a/src/lib_string.c b/src/lib_string.c
index 156dae66..9b9c369a 100644
--- a/src/lib_string.c
+++ b/src/lib_string.c
@@ -136,7 +136,7 @@ LJLIB_CF(string_dump)
 /* ------------------------------------------------------------------------ */
 
 /* macro to `unsign' a character */
-#define uchar(c)        ((unsigned char)(c))
+#define uchar(c)	((unsigned char)(c))
 
 #define CAP_UNFINISHED	(-1)
 #define CAP_POSITION	(-2)
@@ -645,7 +645,7 @@ static GCstr *string_fmt_tostring(lua_State *L, int arg, int retry)
 {
   TValue *o = L->base+arg-1;
   cTValue *mo;
-  lua_assert(o < L->top);  /* Caller already checks for existence. */
+  lj_assertL(o < L->top, "bad usage");  /* Caller already checks for existence. */
   if (LJ_LIKELY(tvisstr(o)))
     return strV(o);
   if (retry != 2 && !tvisnil(mo = lj_meta_lookup(L, o, MM_tostring))) {
@@ -717,7 +717,7 @@ again:
 	lj_strfmt_putptr(sb, lj_obj_ptr(G(L), L->base+arg-1));
 	break;
       default:
-	lua_assert(0);
+	lj_assertL(0, "bad string format type");
 	break;
       }
     }
diff --git a/src/lj_api.c b/src/lj_api.c
index 89998815..05e02029 100644
--- a/src/lj_api.c
+++ b/src/lj_api.c
@@ -28,8 +28,8 @@
 
 /* -- Common helper functions --------------------------------------------- */
 
-#define api_checknelems(L, n)		api_check(L, (n) <= (L->top - L->base))
-#define api_checkvalidindex(L, i)	api_check(L, (i) != niltv(L))
+#define lj_checkapi_slot(idx) \
+  lj_checkapi((idx) <= (L->top - L->base), "stack slot %d out of range", (idx))
 
 static TValue *index2adr(lua_State *L, int idx)
 {
@@ -37,7 +37,8 @@ static TValue *index2adr(lua_State *L, int idx)
     TValue *o = L->base + (idx - 1);
     return o < L->top ? o : niltv(L);
   } else if (idx > LUA_REGISTRYINDEX) {
-    api_check(L, idx != 0 && -idx <= L->top - L->base);
+    lj_checkapi(idx != 0 && -idx <= L->top - L->base,
+		"bad stack slot %d", idx);
     return L->top + idx;
   } else if (idx == LUA_GLOBALSINDEX) {
     TValue *o = &G(L)->tmptv;
@@ -47,7 +48,8 @@ static TValue *index2adr(lua_State *L, int idx)
     return registry(L);
   } else {
     GCfunc *fn = curr_func(L);
-    api_check(L, fn->c.gct == ~LJ_TFUNC && !isluafunc(fn));
+    lj_checkapi(fn->c.gct == ~LJ_TFUNC && !isluafunc(fn),
+		"calling frame is not a C function");
     if (idx == LUA_ENVIRONINDEX) {
       TValue *o = &G(L)->tmptv;
       settabV(L, o, tabref(fn->c.env));
@@ -59,13 +61,27 @@ static TValue *index2adr(lua_State *L, int idx)
   }
 }
 
-static TValue *stkindex2adr(lua_State *L, int idx)
+static LJ_AINLINE TValue *index2adr_check(lua_State *L, int idx)
+{
+  TValue *o = index2adr(L, idx);
+  lj_checkapi(o != niltv(L), "invalid stack slot %d", idx);
+  return o;
+}
+
+static TValue *index2adr_stack(lua_State *L, int idx)
 {
   if (idx > 0) {
     TValue *o = L->base + (idx - 1);
+    if (o < L->top) {
+      return o;
+    } else {
+      lj_checkapi(0, "invalid stack slot %d", idx);
+      return niltv(L);
+    }
     return o < L->top ? o : niltv(L);
   } else {
-    api_check(L, idx != 0 && -idx <= L->top - L->base);
+    lj_checkapi(idx != 0 && -idx <= L->top - L->base,
+		"invalid stack slot %d", idx);
     return L->top + idx;
   }
 }
@@ -111,17 +127,17 @@ LUALIB_API void luaL_checkstack(lua_State *L, int size, const char *msg)
     lj_err_callerv(L, LJ_ERR_STKOVM, msg);
 }
 
-LUA_API void lua_xmove(lua_State *from, lua_State *to, int n)
+LUA_API void lua_xmove(lua_State *L, lua_State *to, int n)
 {
   TValue *f, *t;
-  if (from == to) return;
-  api_checknelems(from, n);
-  api_check(from, G(from) == G(to));
+  if (L == to) return;
+  lj_checkapi_slot(n);
+  lj_checkapi(G(L) == G(to), "move across global states");
   lj_state_checkstack(to, (MSize)n);
-  f = from->top;
+  f = L->top;
   t = to->top = to->top + n;
   while (--n >= 0) copyTV(to, --t, --f);
-  from->top = f;
+  L->top = f;
 }
 
 LUA_API const lua_Number *lua_version(lua_State *L)
@@ -141,7 +157,7 @@ LUA_API int lua_gettop(lua_State *L)
 LUA_API void lua_settop(lua_State *L, int idx)
 {
   if (idx >= 0) {
-    api_check(L, idx <= tvref(L->maxstack) - L->base);
+    lj_checkapi(idx <= tvref(L->maxstack) - L->base, "bad stack slot %d", idx);
     if (L->base + idx > L->top) {
       if (L->base + idx >= tvref(L->maxstack))
 	lj_state_growstack(L, (MSize)idx - (MSize)(L->top - L->base));
@@ -150,23 +166,21 @@ LUA_API void lua_settop(lua_State *L, int idx)
       L->top = L->base + idx;
     }
   } else {
-    api_check(L, -(idx+1) <= (L->top - L->base));
+    lj_checkapi(-(idx+1) <= (L->top - L->base), "bad stack slot %d", idx);
     L->top += idx+1;  /* Shrinks top (idx < 0). */
   }
 }
 
 LUA_API void lua_remove(lua_State *L, int idx)
 {
-  TValue *p = stkindex2adr(L, idx);
-  api_checkvalidindex(L, p);
+  TValue *p = index2adr_stack(L, idx);
   while (++p < L->top) copyTV(L, p-1, p);
   L->top--;
 }
 
 LUA_API void lua_insert(lua_State *L, int idx)
 {
-  TValue *q, *p = stkindex2adr(L, idx);
-  api_checkvalidindex(L, p);
+  TValue *q, *p = index2adr_stack(L, idx);
   for (q = L->top; q > p; q--) copyTV(L, q, q-1);
   copyTV(L, p, L->top);
 }
@@ -174,19 +188,18 @@ LUA_API void lua_insert(lua_State *L, int idx)
 static void copy_slot(lua_State *L, TValue *f, int idx)
 {
   if (idx == LUA_GLOBALSINDEX) {
-    api_check(L, tvistab(f));
+    lj_checkapi(tvistab(f), "stack slot %d is not a table", idx);
     /* NOBARRIER: A thread (i.e. L) is never black. */
     setgcref(L->env, obj2gco(tabV(f)));
   } else if (idx == LUA_ENVIRONINDEX) {
     GCfunc *fn = curr_func(L);
     if (fn->c.gct != ~LJ_TFUNC)
       lj_err_msg(L, LJ_ERR_NOENV);
-    api_check(L, tvistab(f));
+    lj_checkapi(tvistab(f), "stack slot %d is not a table", idx);
     setgcref(fn->c.env, obj2gco(tabV(f)));
     lj_gc_barrier(L, fn, f);
   } else {
-    TValue *o = index2adr(L, idx);
-    api_checkvalidindex(L, o);
+    TValue *o = index2adr_check(L, idx);
     copyTV(L, o, f);
     if (idx < LUA_GLOBALSINDEX)  /* Need a barrier for upvalues. */
       lj_gc_barrier(L, curr_func(L), f);
@@ -195,7 +208,7 @@ static void copy_slot(lua_State *L, TValue *f, int idx)
 
 LUA_API void lua_replace(lua_State *L, int idx)
 {
-  api_checknelems(L, 1);
+  lj_checkapi_slot(1);
   copy_slot(L, L->top - 1, idx);
   L->top--;
 }
@@ -231,7 +244,7 @@ LUA_API int lua_type(lua_State *L, int idx)
 #else
     int tt = (int)(((t < 8 ? 0x98042110u : 0x75a06u) >> 4*(t&7)) & 15u);
 #endif
-    lua_assert(tt != LUA_TNIL || tvisnil(o));
+    lj_assertL(tt != LUA_TNIL || tvisnil(o), "bad tag conversion");
     return tt;
   }
 }
@@ -522,7 +535,7 @@ LUA_API const char *lua_tolstring(lua_State *L, int idx, size_t *len)
 LUA_API uint32_t lua_hashstring(lua_State *L, int idx)
 {
   TValue *o = index2adr(L, idx);
-  lua_assert(tvisstr(o));
+  lj_checkapi(tvisstr(o), "stack slot %d is not a string", idx);
   GCstr *s = strV(o);
   if (! strsmart(s))
     return s->hash;
@@ -699,14 +712,14 @@ LUA_API void lua_pushcclosure(lua_State *L, lua_CFunction f, int n)
 {
   GCfunc *fn;
   lj_gc_check(L);
-  api_checknelems(L, n);
+  lj_checkapi_slot(n);
   fn = lj_func_newC(L, (MSize)n, getcurrenv(L));
   fn->c.f = f;
   L->top -= n;
   while (n--)
     copyTV(L, &fn->c.upvalue[n], L->top+n);
   setfuncV(L, L->top, fn);
-  lua_assert(iswhite(obj2gco(fn)));
+  lj_assertL(iswhite(obj2gco(fn)), "new GC object is not white");
   incr_top(L);
 }
 
@@ -779,7 +792,7 @@ LUA_API void *lua_newuserdata(lua_State *L, size_t size)
 
 LUA_API void lua_concat(lua_State *L, int n)
 {
-  api_checknelems(L, n);
+  lj_checkapi_slot(n);
   if (n >= 2) {
     n--;
     do {
@@ -805,9 +818,8 @@ LUA_API void lua_concat(lua_State *L, int n)
 
 LUA_API void lua_gettable(lua_State *L, int idx)
 {
-  cTValue *v, *t = index2adr(L, idx);
-  api_checkvalidindex(L, t);
-  v = lj_meta_tget(L, t, L->top-1);
+  cTValue *t = index2adr_check(L, idx);
+  cTValue *v = lj_meta_tget(L, t, L->top-1);
   if (v == NULL) {
     L->top += 2;
     jit_secure_call(L, L->top-2, 1+1);
@@ -819,9 +831,8 @@ LUA_API void lua_gettable(lua_State *L, int idx)
 
 LUA_API void lua_getfield(lua_State *L, int idx, const char *k)
 {
-  cTValue *v, *t = index2adr(L, idx);
+  cTValue *v, *t = index2adr_check(L, idx);
   TValue key;
-  api_checkvalidindex(L, t);
   setstrV(L, &key, lj_str_newz(L, k));
   v = lj_meta_tget(L, t, &key);
   if (v == NULL) {
@@ -837,14 +848,14 @@ LUA_API void lua_getfield(lua_State *L, int idx, const char *k)
 LUA_API void lua_rawget(lua_State *L, int idx)
 {
   cTValue *t = index2adr(L, idx);
-  api_check(L, tvistab(t));
+  lj_checkapi(tvistab(t), "stack slot %d is not a table", idx);
   copyTV(L, L->top-1, lj_tab_get(L, tabV(t), L->top-1));
 }
 
 LUA_API void lua_rawgeti(lua_State *L, int idx, int n)
 {
   cTValue *v, *t = index2adr(L, idx);
-  api_check(L, tvistab(t));
+  lj_checkapi(tvistab(t), "stack slot %d is not a table", idx);
   v = lj_tab_getint(tabV(t), n);
   if (v) {
     copyTV(L, L->top, v);
@@ -886,8 +897,7 @@ LUALIB_API int luaL_getmetafield(lua_State *L, int idx, const char *field)
 
 LUA_API void lua_getfenv(lua_State *L, int idx)
 {
-  cTValue *o = index2adr(L, idx);
-  api_checkvalidindex(L, o);
+  cTValue *o = index2adr_check(L, idx);
   if (tvisfunc(o)) {
     settabV(L, L->top, tabref(funcV(o)->c.env));
   } else if (tvisudata(o)) {
@@ -904,7 +914,7 @@ LUA_API int lua_next(lua_State *L, int idx)
 {
   cTValue *t = index2adr(L, idx);
   int more;
-  api_check(L, tvistab(t));
+  lj_checkapi(tvistab(t), "stack slot %d is not a table", idx);
   more = lj_tab_next(L, tabV(t), L->top-1);
   if (more) {
     incr_top(L);  /* Return new key and value slot. */
@@ -930,7 +940,7 @@ LUA_API void *lua_upvalueid(lua_State *L, int idx, int n)
 {
   GCfunc *fn = funcV(index2adr(L, idx));
   n--;
-  api_check(L, (uint32_t)n < fn->l.nupvalues);
+  lj_checkapi((uint32_t)n < fn->l.nupvalues, "bad upvalue %d", n);
   return isluafunc(fn) ? (void *)gcref(fn->l.uvptr[n]) :
 			 (void *)&fn->c.upvalue[n];
 }
@@ -940,8 +950,10 @@ LUA_API void lua_upvaluejoin(lua_State *L, int idx1, int n1, int idx2, int n2)
   GCfunc *fn1 = funcV(index2adr(L, idx1));
   GCfunc *fn2 = funcV(index2adr(L, idx2));
   n1--; n2--;
-  api_check(L, isluafunc(fn1) && (uint32_t)n1 < fn1->l.nupvalues);
-  api_check(L, isluafunc(fn2) && (uint32_t)n2 < fn2->l.nupvalues);
+  lj_checkapi(isluafunc(fn1), "stack slot %d is not a Lua function", idx1);
+  lj_checkapi(isluafunc(fn2), "stack slot %d is not a Lua function", idx2);
+  lj_checkapi((uint32_t)n1 < fn1->l.nupvalues, "bad upvalue %d", n1+1);
+  lj_checkapi((uint32_t)n2 < fn2->l.nupvalues, "bad upvalue %d", n2+1);
   setgcrefr(fn1->l.uvptr[n1], fn2->l.uvptr[n2]);
   lj_gc_objbarrier(L, fn1, gcref(fn1->l.uvptr[n1]));
 }
@@ -970,9 +982,8 @@ LUALIB_API void *luaL_checkudata(lua_State *L, int idx, const char *tname)
 LUA_API void lua_settable(lua_State *L, int idx)
 {
   TValue *o;
-  cTValue *t = index2adr(L, idx);
-  api_checknelems(L, 2);
-  api_checkvalidindex(L, t);
+  cTValue *t = index2adr_check(L, idx);
+  lj_checkapi_slot(2);
   o = lj_meta_tset(L, t, L->top-2);
   if (o) {
     /* NOBARRIER: lj_meta_tset ensures the table is not black. */
@@ -991,9 +1002,8 @@ LUA_API void lua_setfield(lua_State *L, int idx, const char *k)
 {
   TValue *o;
   TValue key;
-  cTValue *t = index2adr(L, idx);
-  api_checknelems(L, 1);
-  api_checkvalidindex(L, t);
+  cTValue *t = index2adr_check(L, idx);
+  lj_checkapi_slot(1);
   setstrV(L, &key, lj_str_newz(L, k));
   o = lj_meta_tset(L, t, &key);
   if (o) {
@@ -1012,7 +1022,7 @@ LUA_API void lua_rawset(lua_State *L, int idx)
 {
   GCtab *t = tabV(index2adr(L, idx));
   TValue *dst, *key;
-  api_checknelems(L, 2);
+  lj_checkapi_slot(2);
   key = L->top-2;
   dst = lj_tab_set(L, t, key);
   copyTV(L, dst, key+1);
@@ -1024,7 +1034,7 @@ LUA_API void lua_rawseti(lua_State *L, int idx, int n)
 {
   GCtab *t = tabV(index2adr(L, idx));
   TValue *dst, *src;
-  api_checknelems(L, 1);
+  lj_checkapi_slot(1);
   dst = lj_tab_setint(L, t, n);
   src = L->top-1;
   copyTV(L, dst, src);
@@ -1036,13 +1046,12 @@ LUA_API int lua_setmetatable(lua_State *L, int idx)
 {
   global_State *g;
   GCtab *mt;
-  cTValue *o = index2adr(L, idx);
-  api_checknelems(L, 1);
-  api_checkvalidindex(L, o);
+  cTValue *o = index2adr_check(L, idx);
+  lj_checkapi_slot(1);
   if (tvisnil(L->top-1)) {
     mt = NULL;
   } else {
-    api_check(L, tvistab(L->top-1));
+    lj_checkapi(tvistab(L->top-1), "top stack slot is not a table");
     mt = tabV(L->top-1);
   }
   g = G(L);
@@ -1079,11 +1088,10 @@ LUALIB_API void luaL_setmetatable(lua_State *L, const char *tname)
 
 LUA_API int lua_setfenv(lua_State *L, int idx)
 {
-  cTValue *o = index2adr(L, idx);
+  cTValue *o = index2adr_check(L, idx);
   GCtab *t;
-  api_checknelems(L, 1);
-  api_checkvalidindex(L, o);
-  api_check(L, tvistab(L->top-1));
+  lj_checkapi_slot(1);
+  lj_checkapi(tvistab(L->top-1), "top stack slot is not a table");
   t = tabV(L->top-1);
   if (tvisfunc(o)) {
     setgcref(funcV(o)->c.env, obj2gco(t));
@@ -1106,7 +1114,7 @@ LUA_API const char *lua_setupvalue(lua_State *L, int idx, int n)
   TValue *val;
   GCobj *o;
   const char *name;
-  api_checknelems(L, 1);
+  lj_checkapi_slot(1);
   name = lj_debug_uvnamev(f, (uint32_t)(n-1), &val, &o);
   if (name) {
     L->top--;
@@ -1133,8 +1141,9 @@ static TValue *api_call_base(lua_State *L, int nargs)
 
 LUA_API void lua_call(lua_State *L, int nargs, int nresults)
 {
-  api_check(L, L->status == LUA_OK || L->status == LUA_ERRERR);
-  api_checknelems(L, nargs+1);
+  lj_checkapi(L->status == LUA_OK || L->status == LUA_ERRERR,
+	      "thread called in wrong state %d", L->status);
+  lj_checkapi_slot(nargs+1);
   jit_secure_call(L, api_call_base(L, nargs), nresults+1);
 }
 
@@ -1144,13 +1153,13 @@ LUA_API int lua_pcall(lua_State *L, int nargs, int nresults, int errfunc)
   uint8_t oldh = hook_save(g);
   ptrdiff_t ef;
   int status;
-  api_check(L, L->status == LUA_OK || L->status == LUA_ERRERR);
-  api_checknelems(L, nargs+1);
+  lj_checkapi(L->status == LUA_OK || L->status == LUA_ERRERR,
+	      "thread called in wrong state %d", L->status);
+  lj_checkapi_slot(nargs+1);
   if (errfunc == 0) {
     ef = 0;
   } else {
-    cTValue *o = stkindex2adr(L, errfunc);
-    api_checkvalidindex(L, o);
+    cTValue *o = index2adr_stack(L, errfunc);
     ef = savestack(L, o);
   }
   /* Forbid Lua world re-entrancy while running the trace */
@@ -1186,7 +1195,8 @@ LUA_API int lua_cpcall(lua_State *L, lua_CFunction func, void *ud)
   global_State *g = G(L);
   uint8_t oldh = hook_save(g);
   int status;
-  api_check(L, L->status == LUA_OK || L->status == LUA_ERRERR);
+  lj_checkapi(L->status == LUA_OK || L->status == LUA_ERRERR,
+	      "thread called in wrong state %d", L->status);
   /* Forbid Lua world re-entrancy while running the trace */
   if (tvref(g->jit_base)) {
     setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL));
diff --git a/src/lj_asm.c b/src/lj_asm.c
index a6906b19..d71fa8c8 100644
--- a/src/lj_asm.c
+++ b/src/lj_asm.c
@@ -100,6 +100,12 @@ typedef struct ASMState {
   uint16_t parentmap[LJ_MAX_JSLOTS];  /* Parent instruction to RegSP map. */
 } ASMState;
 
+#ifdef LUA_USE_ASSERT
+#define lj_assertA(c, ...)	lj_assertG_(J2G(as->J), (c), __VA_ARGS__)
+#else
+#define lj_assertA(c, ...)	((void)as)
+#endif
+
 #define IR(ref)			(&as->ir[(ref)])
 
 #define ASMREF_TMP1		REF_TRUE	/* Temp. register. */
@@ -131,9 +137,8 @@ static LJ_AINLINE void checkmclim(ASMState *as)
 #ifdef LUA_USE_ASSERT
   if (as->mcp + MCLIM_REDZONE < as->mcp_prev) {
     IRIns *ir = IR(as->curins+1);
-    fprintf(stderr, "RED ZONE OVERFLOW: %p IR %04d  %02d %04d %04d\n", as->mcp,
-	    as->curins+1-REF_BIAS, ir->o, ir->op1-REF_BIAS, ir->op2-REF_BIAS);
-    lua_assert(0);
+    lj_assertA(0, "red zone overflow: %p IR %04d  %02d %04d %04d\n", as->mcp,
+      as->curins+1-REF_BIAS, ir->o, ir->op1-REF_BIAS, ir->op2-REF_BIAS);
   }
 #endif
   if (LJ_UNLIKELY(as->mcp < as->mclim)) asm_mclimit(as);
@@ -247,7 +252,7 @@ static void ra_dprintf(ASMState *as, const char *fmt, ...)
 	  *p++ = *q >= 'A' && *q <= 'Z' ? *q + 0x20 : *q;
       } else {
 	*p++ = '?';
-	lua_assert(0);
+	lj_assertA(0, "bad register %d for debug format \"%s\"", r, fmt);
       }
     } else if (e[1] == 'f' || e[1] == 'i') {
       IRRef ref;
@@ -265,7 +270,7 @@ static void ra_dprintf(ASMState *as, const char *fmt, ...)
     } else if (e[1] == 'x') {
       p += sprintf(p, "%08x", va_arg(argp, int32_t));
     } else {
-      lua_assert(0);
+      lj_assertA(0, "bad debug format code");
     }
     fmt = e+2;
   }
@@ -324,7 +329,7 @@ static Reg ra_rematk(ASMState *as, IRRef ref)
   Reg r;
   if (ra_iskref(ref)) {
     r = ra_krefreg(ref);
-    lua_assert(!rset_test(as->freeset, r));
+    lj_assertA(!rset_test(as->freeset, r), "rematk of free reg %d", r);
     ra_free(as, r);
     ra_modified(as, r);
 #if LJ_64
@@ -336,7 +341,9 @@ static Reg ra_rematk(ASMState *as, IRRef ref)
   }
   ir = IR(ref);
   r = ir->r;
-  lua_assert(ra_hasreg(r) && !ra_hasspill(ir->s));
+  lj_assertA(ra_hasreg(r), "rematk of K%03d has no reg", REF_BIAS - ref);
+  lj_assertA(!ra_hasspill(ir->s),
+	     "rematk of K%03d has spill slot [%x]", REF_BIAS - ref, ir->s);
   ra_free(as, r);
   ra_modified(as, r);
   ir->r = RID_INIT;  /* Do not keep any hint. */
@@ -350,7 +357,8 @@ static Reg ra_rematk(ASMState *as, IRRef ref)
     ra_sethint(ir->r, RID_BASE);  /* Restore BASE register hint. */
     emit_getgl(as, r, jit_base);
   } else if (emit_canremat(ASMREF_L) && ir->o == IR_KPRI) {
-    lua_assert(irt_isnil(ir->t));  /* REF_NIL stores ASMREF_L register. */
+    /* REF_NIL stores ASMREF_L register. */
+    lj_assertA(irt_isnil(ir->t), "rematk of bad ASMREF_L");
     emit_getgl(as, r, cur_L);
 #if LJ_64
   } else if (ir->o == IR_KINT64) {
@@ -363,8 +371,9 @@ static Reg ra_rematk(ASMState *as, IRRef ref)
 #endif
 #endif
   } else {
-    lua_assert(ir->o == IR_KINT || ir->o == IR_KGC ||
-	       ir->o == IR_KPTR || ir->o == IR_KKPTR || ir->o == IR_KNULL);
+    lj_assertA(ir->o == IR_KINT || ir->o == IR_KGC ||
+	       ir->o == IR_KPTR || ir->o == IR_KKPTR || ir->o == IR_KNULL,
+	       "rematk of bad IR op %d", ir->o);
     emit_loadi(as, r, ir->i);
   }
   return r;
@@ -374,7 +383,8 @@ static Reg ra_rematk(ASMState *as, IRRef ref)
 static int32_t ra_spill(ASMState *as, IRIns *ir)
 {
   int32_t slot = ir->s;
-  lua_assert(ir >= as->ir + REF_TRUE);
+  lj_assertA(ir >= as->ir + REF_TRUE,
+	     "spill of K%03d", REF_BIAS - (int)(ir - as->ir));
   if (!ra_hasspill(slot)) {
     if (irt_is64(ir->t)) {
       slot = as->evenspill;
@@ -399,7 +409,9 @@ static Reg ra_releasetmp(ASMState *as, IRRef ref)
 {
   IRIns *ir = IR(ref);
   Reg r = ir->r;
-  lua_assert(ra_hasreg(r) && !ra_hasspill(ir->s));
+  lj_assertA(ra_hasreg(r), "release of TMP%d has no reg", ref-ASMREF_TMP1+1);
+  lj_assertA(!ra_hasspill(ir->s),
+	     "release of TMP%d has spill slot [%x]", ref-ASMREF_TMP1+1, ir->s);
   ra_free(as, r);
   ra_modified(as, r);
   ir->r = RID_INIT;
@@ -415,7 +427,7 @@ static Reg ra_restore(ASMState *as, IRRef ref)
     IRIns *ir = IR(ref);
     int32_t ofs = ra_spill(as, ir);  /* Force a spill slot. */
     Reg r = ir->r;
-    lua_assert(ra_hasreg(r));
+    lj_assertA(ra_hasreg(r), "restore of IR %04d has no reg", ref - REF_BIAS);
     ra_sethint(ir->r, r);  /* Keep hint. */
     ra_free(as, r);
     if (!rset_test(as->weakset, r)) {  /* Only restore non-weak references. */
@@ -444,14 +456,15 @@ static Reg ra_evict(ASMState *as, RegSet allow)
 {
   IRRef ref;
   RegCost cost = ~(RegCost)0;
-  lua_assert(allow != RSET_EMPTY);
+  lj_assertA(allow != RSET_EMPTY, "evict from empty set");
   if (RID_NUM_FPR == 0 || allow < RID2RSET(RID_MAX_GPR)) {
     GPRDEF(MINCOST)
   } else {
     FPRDEF(MINCOST)
   }
   ref = regcost_ref(cost);
-  lua_assert(ra_iskref(ref) || (ref >= as->T->nk && ref < as->T->nins));
+  lj_assertA(ra_iskref(ref) || (ref >= as->T->nk && ref < as->T->nins),
+	     "evict of out-of-range IR %04d", ref - REF_BIAS);
   /* Preferably pick any weak ref instead of a non-weak, non-const ref. */
   if (!irref_isk(ref) && (as->weakset & allow)) {
     IRIns *ir = IR(ref);
@@ -609,7 +622,8 @@ static Reg ra_allocref(ASMState *as, IRRef ref, RegSet allow)
   IRIns *ir = IR(ref);
   RegSet pick = as->freeset & allow;
   Reg r;
-  lua_assert(ra_noreg(ir->r));
+  lj_assertA(ra_noreg(ir->r),
+	     "IR %04d already has reg %d", ref - REF_BIAS, ir->r);
   if (pick) {
     /* First check register hint from propagation or PHI. */
     if (ra_hashint(ir->r)) {
@@ -673,8 +687,10 @@ static void ra_rename(ASMState *as, Reg down, Reg up)
   IRIns *ir = IR(ref);
   ir->r = (uint8_t)up;
   as->cost[down] = 0;
-  lua_assert((down < RID_MAX_GPR) == (up < RID_MAX_GPR));
-  lua_assert(!rset_test(as->freeset, down) && rset_test(as->freeset, up));
+  lj_assertA((down < RID_MAX_GPR) == (up < RID_MAX_GPR),
+	     "rename between GPR/FPR %d and %d", down, up);
+  lj_assertA(!rset_test(as->freeset, down), "rename from free reg %d", down);
+  lj_assertA(rset_test(as->freeset, up), "rename to non-free reg %d", up);
   ra_free(as, down);  /* 'down' is free ... */
   ra_modified(as, down);
   rset_clear(as->freeset, up);  /* ... and 'up' is now allocated. */
@@ -722,7 +738,7 @@ static void ra_destreg(ASMState *as, IRIns *ir, Reg r)
 {
   Reg dest = ra_dest(as, ir, RID2RSET(r));
   if (dest != r) {
-    lua_assert(rset_test(as->freeset, r));
+    lj_assertA(rset_test(as->freeset, r), "dest reg %d is not free", r);
     ra_modified(as, r);
     emit_movrr(as, ir, dest, r);
   }
@@ -755,8 +771,9 @@ static void ra_left(ASMState *as, Reg dest, IRRef lref)
 #endif
 #endif
       } else if (ir->o != IR_KPRI) {
-	lua_assert(ir->o == IR_KINT || ir->o == IR_KGC ||
-		   ir->o == IR_KPTR || ir->o == IR_KKPTR || ir->o == IR_KNULL);
+	lj_assertA(ir->o == IR_KINT || ir->o == IR_KGC ||
+		   ir->o == IR_KPTR || ir->o == IR_KKPTR || ir->o == IR_KNULL,
+		   "K%03d has bad IR op %d", REF_BIAS - lref, ir->o);
 	emit_loadi(as, dest, ir->i);
 	return;
       }
@@ -901,11 +918,14 @@ static void asm_snap_alloc1(ASMState *as, IRRef ref)
 #endif
       {  /* Allocate stored values for TNEW, TDUP and CNEW. */
 	IRIns *irs;
-	lua_assert(ir->o == IR_TNEW || ir->o == IR_TDUP || ir->o == IR_CNEW);
+	lj_assertA(ir->o == IR_TNEW || ir->o == IR_TDUP || ir->o == IR_CNEW,
+		   "sink of IR %04d has bad op %d", ref - REF_BIAS, ir->o);
 	for (irs = IR(as->snapref-1); irs > ir; irs--)
 	  if (irs->r == RID_SINK && asm_sunk_store(as, ir, irs)) {
-	    lua_assert(irs->o == IR_ASTORE || irs->o == IR_HSTORE ||
-		       irs->o == IR_FSTORE || irs->o == IR_XSTORE);
+	    lj_assertA(irs->o == IR_ASTORE || irs->o == IR_HSTORE ||
+		       irs->o == IR_FSTORE || irs->o == IR_XSTORE,
+		       "sunk store IR %04d has bad op %d",
+		       (int)(irs - as->ir) - REF_BIAS, irs->o);
 	    asm_snap_alloc1(as, irs->op2);
 	    if (LJ_32 && (irs+1)->o == IR_HIOP)
 	      asm_snap_alloc1(as, (irs+1)->op2);
@@ -953,15 +973,9 @@ static void asm_snap_alloc(ASMState *as, int snapno)
     if (!irref_isk(ref)) {
       asm_snap_alloc1(as, ref);
       if (LJ_SOFTFP && (sn & SNAP_SOFTFPNUM)) {
-	/*
-	** FIXME: The following assert was replaced with
-	** the conventional `lua_assert`.
-	**
-	** lj_assertA(irt_type(IR(ref+1)->t) == IRT_SOFTFP,
-	** "snap %d[%d] points to bad SOFTFP IR %04d",
-	** snapno, n, ref - REF_BIAS);
-	*/
-	lua_assert(irt_type(IR(ref+1)->t) == IRT_SOFTFP);
+	lj_assertA(irt_type(IR(ref+1)->t) == IRT_SOFTFP,
+		   "snap %d[%d] points to bad SOFTFP IR %04d",
+		   snapno, n, ref - REF_BIAS);
 	asm_snap_alloc1(as, ref+1);
       }
     }
@@ -1045,19 +1059,20 @@ static int32_t asm_stack_adjust(ASMState *as)
 }
 
 /* Must match with hash*() in lj_tab.c. */
-static uint32_t ir_khash(IRIns *ir)
+static uint32_t ir_khash(ASMState *as, IRIns *ir)
 {
   uint32_t lo, hi;
+  UNUSED(as);
   if (irt_isstr(ir->t)) {
     return ir_kstr(ir)->hash;
   } else if (irt_isnum(ir->t)) {
     lo = ir_knum(ir)->u32.lo;
     hi = ir_knum(ir)->u32.hi << 1;
   } else if (irt_ispri(ir->t)) {
-    lua_assert(!irt_isnil(ir->t));
+    lj_assertA(!irt_isnil(ir->t), "hash of nil key");
     return irt_type(ir->t)-IRT_FALSE;
   } else {
-    lua_assert(irt_isgcv(ir->t));
+    lj_assertA(irt_isgcv(ir->t), "hash of bad IR type %d", irt_type(ir->t));
     lo = u32ptr(ir_kgc(ir));
 #if LJ_GC64
     hi = (uint32_t)(u64ptr(ir_kgc(ir)) >> 32) | (irt_toitype(ir->t) << 15);
@@ -1168,7 +1183,8 @@ static void asm_bufput(ASMState *as, IRIns *ir)
   args[0] = ir->op1;  /* SBuf * */
   args[1] = ir->op2;  /* GCstr * */
   irs = IR(ir->op2);
-  lua_assert(irt_isstr(irs->t));
+  lj_assertA(irt_isstr(irs->t),
+	     "BUFPUT of non-string IR %04d", ir->op2 - REF_BIAS);
   if (irs->o == IR_KGC) {
     GCstr *s = ir_kstr(irs);
     if (s->len == 1) {  /* Optimize put of single-char string constant. */
@@ -1182,7 +1198,8 @@ static void asm_bufput(ASMState *as, IRIns *ir)
 	args[1] = ASMREF_TMP1;  /* TValue * */
 	ci = &lj_ir_callinfo[IRCALL_lj_strfmt_putnum];
       } else {
-	lua_assert(irt_isinteger(IR(irs->op1)->t));
+	lj_assertA(irt_isinteger(IR(irs->op1)->t),
+		   "TOSTR of non-numeric IR %04d", irs->op1);
 	args[1] = irs->op1;  /* int */
 	if (irs->op2 == IRTOSTR_INT)
 	  ci = &lj_ir_callinfo[IRCALL_lj_strfmt_putint];
@@ -1248,7 +1265,8 @@ static void asm_conv64(ASMState *as, IRIns *ir)
   IRType dt = (((ir-1)->op2 & IRCONV_DSTMASK) >> IRCONV_DSH);
   IRCallID id;
   IRRef args[2];
-  lua_assert((ir-1)->o == IR_CONV && ir->o == IR_HIOP);
+  lj_assertA((ir-1)->o == IR_CONV && ir->o == IR_HIOP,
+	     "not a CONV/HIOP pair at IR %04d", (int)(ir - as->ir) - REF_BIAS);
   args[LJ_BE] = (ir-1)->op1;
   args[LJ_LE] = ir->op1;
   if (st == IRT_NUM || st == IRT_FLOAT) {
@@ -1304,15 +1322,16 @@ static void asm_collectargs(ASMState *as, IRIns *ir,
 			    const CCallInfo *ci, IRRef *args)
 {
   uint32_t n = CCI_XNARGS(ci);
-  lua_assert(n <= CCI_NARGS_MAX*2);  /* Account for split args. */
+  /* Account for split args. */
+  lj_assertA(n <= CCI_NARGS_MAX*2, "too many args %d to collect", n);
   if ((ci->flags & CCI_L)) { *args++ = ASMREF_L; n--; }
   while (n-- > 1) {
     ir = IR(ir->op1);
-    lua_assert(ir->o == IR_CARG);
+    lj_assertA(ir->o == IR_CARG, "malformed CALL arg tree");
     args[n] = ir->op2 == REF_NIL ? 0 : ir->op2;
   }
   args[0] = ir->op1 == REF_NIL ? 0 : ir->op1;
-  lua_assert(IR(ir->op1)->o != IR_CARG);
+  lj_assertA(IR(ir->op1)->o != IR_CARG, "malformed CALL arg tree");
 }
 
 /* Reconstruct CCallInfo flags for CALLX*. */
@@ -1690,7 +1709,10 @@ static void asm_ir(ASMState *as, IRIns *ir)
   switch ((IROp)ir->o) {
   /* Miscellaneous ops. */
   case IR_LOOP: asm_loop(as); break;
-  case IR_NOP: case IR_XBAR: lua_assert(!ra_used(ir)); break;
+  case IR_NOP: case IR_XBAR:
+    lj_assertA(!ra_used(ir),
+	       "IR %04d not unused", (int)(ir - as->ir) - REF_BIAS);
+    break;
   case IR_USE:
     ra_alloc1(as, ir->op1, irt_isfp(ir->t) ? RSET_FPR : RSET_GPR); break;
   case IR_PHI: asm_phi(as, ir); break;
@@ -1729,7 +1751,9 @@ static void asm_ir(ASMState *as, IRIns *ir)
 #if LJ_SOFTFP32
   case IR_DIV: case IR_POW: case IR_ABS:
   case IR_LDEXP: case IR_FPMATH: case IR_TOBIT:
-    lua_assert(0);  /* Unused for LJ_SOFTFP32. */
+    /* Unused for LJ_SOFTFP32. */
+    lj_assertA(0, "IR %04d with unused op %d",
+		  (int)(ir - as->ir) - REF_BIAS, ir->o);
     break;
 #else
   case IR_DIV: asm_div(as, ir); break;
@@ -1777,7 +1801,8 @@ static void asm_ir(ASMState *as, IRIns *ir)
 #if LJ_HASFFI
     asm_cnew(as, ir);
 #else
-    lua_assert(0);
+    lj_assertA(0, "IR %04d with unused op %d",
+		  (int)(ir - as->ir) - REF_BIAS, ir->o);
 #endif
     break;
 
@@ -1854,8 +1879,10 @@ static void asm_head_side(ASMState *as)
   for (i = as->stopins; i > REF_BASE; i--) {
     IRIns *ir = IR(i);
     RegSP rs;
-    lua_assert((ir->o == IR_SLOAD && (ir->op2 & IRSLOAD_PARENT)) ||
-	       (LJ_SOFTFP && ir->o == IR_HIOP) || ir->o == IR_PVAL);
+    lj_assertA((ir->o == IR_SLOAD && (ir->op2 & IRSLOAD_PARENT)) ||
+	       (LJ_SOFTFP && ir->o == IR_HIOP) || ir->o == IR_PVAL,
+	       "IR %04d has bad parent op %d",
+	       (int)(ir - as->ir) - REF_BIAS, ir->o);
     rs = as->parentmap[i - REF_FIRST];
     if (ra_hasreg(ir->r)) {
       rset_clear(allow, ir->r);
@@ -2115,7 +2142,7 @@ static void asm_setup_regsp(ASMState *as)
   ir = IR(REF_FIRST);
   if (as->parent) {
     uint16_t *p;
-    lastir = lj_snap_regspmap(as->parent, as->J->exitno, ir);
+    lastir = lj_snap_regspmap(as->J, as->parent, as->J->exitno, ir);
     if (lastir - ir > LJ_MAX_JSLOTS)
       lj_trace_err(as->J, LJ_TRERR_NYICOAL);
     as->stopins = (IRRef)((lastir-1) - as->ir);
@@ -2418,7 +2445,10 @@ void lj_asm_trace(jit_State *J, GCtrace *T)
     /* Assemble a trace in linear backwards order. */
     for (as->curins--; as->curins > as->stopins; as->curins--) {
       IRIns *ir = IR(as->curins);
-      lua_assert(!(LJ_32 && irt_isint64(ir->t)));  /* Handled by SPLIT. */
+      /* 64 bit types handled by SPLIT for 32 bit archs. */
+      lj_assertA(!(LJ_32 && irt_isint64(ir->t)),
+		 "IR %04d has unsplit 64 bit type",
+		 (int)(ir - as->ir) - REF_BIAS);
       asm_snap_prev(as);
       if (!ra_used(ir) && !ir_sideeff(ir) && (as->flags & JIT_F_OPT_DCE))
 	continue;  /* Dead-code elimination can be soooo easy. */
@@ -2449,7 +2479,7 @@ void lj_asm_trace(jit_State *J, GCtrace *T)
     asm_phi_fixup(as);
 
     if (J->curfinal->nins >= T->nins) {  /* IR didn't grow? */
-      lua_assert(J->curfinal->nk == T->nk);
+      lj_assertA(J->curfinal->nk == T->nk, "unexpected IR constant growth");
       memcpy(J->curfinal->ir + as->orignins, T->ir + as->orignins,
 	     (T->nins - as->orignins) * sizeof(IRIns));  /* Copy RENAMEs. */
       T->nins = J->curfinal->nins;
diff --git a/src/lj_asm_arm.h b/src/lj_asm_arm.h
index 29a07c80..47564d2e 100644
--- a/src/lj_asm_arm.h
+++ b/src/lj_asm_arm.h
@@ -41,7 +41,7 @@ static Reg ra_scratchpair(ASMState *as, RegSet allow)
       }
     }
   }
-  lua_assert(rset_test(RSET_GPREVEN, r));
+  lj_assertA(rset_test(RSET_GPREVEN, r), "odd reg %d", r);
   ra_modified(as, r);
   ra_modified(as, r+1);
   RA_DBGX((as, "scratchpair    $r $r", r, r+1));
@@ -269,7 +269,7 @@ static void asm_fusexref(ASMState *as, ARMIns ai, Reg rd, IRRef ref,
 	return;
       }
     } else if (ir->o == IR_STRREF && !(!LJ_SOFTFP && (ai & 0x08000000))) {
-      lua_assert(ofs == 0);
+      lj_assertA(ofs == 0, "bad usage");
       ofs = (int32_t)sizeof(GCstr);
       if (irref_isk(ir->op2)) {
 	ofs += IR(ir->op2)->i;
@@ -389,9 +389,11 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
       as->freeset |= (of & RSET_RANGE(REGARG_FIRSTGPR, REGARG_LASTGPR+1));
       if (irt_isnum(ir->t)) gpr = (gpr+1) & ~1u;
       if (gpr <= REGARG_LASTGPR) {
-	lua_assert(rset_test(as->freeset, gpr));  /* Must have been evicted. */
+	lj_assertA(rset_test(as->freeset, gpr),
+		   "reg %d not free", gpr);  /* Must have been evicted. */
 	if (irt_isnum(ir->t)) {
-	  lua_assert(rset_test(as->freeset, gpr+1));  /* Ditto. */
+	  lj_assertA(rset_test(as->freeset, gpr+1),
+		     "reg %d not free", gpr+1);  /* Ditto. */
 	  emit_dnm(as, ARMI_VMOV_RR_D, gpr, gpr+1, (src & 15));
 	  gpr += 2;
 	} else {
@@ -408,7 +410,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
 #endif
     {
       if (gpr <= REGARG_LASTGPR) {
-	lua_assert(rset_test(as->freeset, gpr));  /* Must have been evicted. */
+	lj_assertA(rset_test(as->freeset, gpr),
+		   "reg %d not free", gpr);  /* Must have been evicted. */
 	if (ref) ra_leftov(as, gpr, ref);
 	gpr++;
       } else {
@@ -433,7 +436,7 @@ static void asm_setupresult(ASMState *as, IRIns *ir, const CCallInfo *ci)
     rset_clear(drop, (ir+1)->r);  /* Dest reg handled below. */
   ra_evictset(as, drop);  /* Evictions must be performed first. */
   if (ra_used(ir)) {
-    lua_assert(!irt_ispri(ir->t));
+    lj_assertA(!irt_ispri(ir->t), "PRI dest");
     if (!LJ_SOFTFP && irt_isfp(ir->t)) {
       if (LJ_ABI_SOFTFP || (ci->flags & (CCI_CASTU64|CCI_VARARG))) {
 	Reg dest = (ra_dest(as, ir, RSET_FPR) & 15);
@@ -530,13 +533,17 @@ static void asm_conv(ASMState *as, IRIns *ir)
 #endif
   IRRef lref = ir->op1;
   /* 64 bit integer conversions are handled by SPLIT. */
-  lua_assert(!irt_isint64(ir->t) && !(st == IRT_I64 || st == IRT_U64));
+  lj_assertA(!irt_isint64(ir->t) && !(st == IRT_I64 || st == IRT_U64),
+	     "IR %04d has unsplit 64 bit type",
+	     (int)(ir - as->ir) - REF_BIAS);
 #if LJ_SOFTFP
   /* FP conversions are handled by SPLIT. */
-  lua_assert(!irt_isfp(ir->t) && !(st == IRT_NUM || st == IRT_FLOAT));
+  lj_assertA(!irt_isfp(ir->t) && !(st == IRT_NUM || st == IRT_FLOAT),
+	     "IR %04d has FP type",
+	     (int)(ir - as->ir) - REF_BIAS);
   /* Can't check for same types: SPLIT uses CONV int.int + BXOR for sfp NEG. */
 #else
-  lua_assert(irt_type(ir->t) != st);
+  lj_assertA(irt_type(ir->t) != st, "inconsistent types for CONV");
   if (irt_isfp(ir->t)) {
     Reg dest = ra_dest(as, ir, RSET_FPR);
     if (stfp) {  /* FP to FP conversion. */
@@ -553,7 +560,8 @@ static void asm_conv(ASMState *as, IRIns *ir)
   } else if (stfp) {  /* FP to integer conversion. */
     if (irt_isguard(ir->t)) {
       /* Checked conversions are only supported from number to int. */
-      lua_assert(irt_isint(ir->t) && st == IRT_NUM);
+      lj_assertA(irt_isint(ir->t) && st == IRT_NUM,
+		 "bad type for checked CONV");
       asm_tointg(as, ir, ra_alloc1(as, lref, RSET_FPR));
     } else {
       Reg left = ra_alloc1(as, lref, RSET_FPR);
@@ -572,7 +580,7 @@ static void asm_conv(ASMState *as, IRIns *ir)
     Reg dest = ra_dest(as, ir, RSET_GPR);
     if (st >= IRT_I8 && st <= IRT_U16) {  /* Extend to 32 bit integer. */
       Reg left = ra_alloc1(as, lref, RSET_GPR);
-      lua_assert(irt_isint(ir->t) || irt_isu32(ir->t));
+      lj_assertA(irt_isint(ir->t) || irt_isu32(ir->t), "bad type for CONV EXT");
       if ((as->flags & JIT_F_ARMV6)) {
 	ARMIns ai = st == IRT_I8 ? ARMI_SXTB :
 		    st == IRT_U8 ? ARMI_UXTB :
@@ -667,7 +675,7 @@ static void asm_tvptr(ASMState *as, Reg dest, IRRef ref)
       ra_allockreg(as, i32ptr(ir_knum(ir)), dest);
     } else {
 #if LJ_SOFTFP
-      lua_assert(0);
+      lj_assertA(0, "unsplit FP op");
 #else
       /* Otherwise force a spill and use the spill slot. */
       emit_opk(as, ARMI_ADD, dest, RID_SP, ra_spill(as, ir), RSET_GPR);
@@ -811,7 +819,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
   *l_loop = ARMF_CC(ARMI_B, CC_NE) | ((as->mcp-l_loop-2) & 0x00ffffffu);
 
   /* Load main position relative to tab->node into dest. */
-  khash = irref_isk(refkey) ? ir_khash(irkey) : 1;
+  khash = irref_isk(refkey) ? ir_khash(as, irkey) : 1;
   if (khash == 0) {
     emit_lso(as, ARMI_LDR, dest, tab, (int32_t)offsetof(GCtab, node));
   } else {
@@ -867,7 +875,7 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
   Reg node = ra_alloc1(as, ir->op1, RSET_GPR);
   Reg key = RID_NONE, type = RID_TMP, idx = node;
   RegSet allow = rset_exclude(RSET_GPR, node);
-  lua_assert(ofs % sizeof(Node) == 0);
+  lj_assertA(ofs % sizeof(Node) == 0, "unaligned HREFK slot");
   if (ofs > 4095) {
     idx = dest;
     rset_clear(allow, dest);
@@ -934,7 +942,7 @@ static void asm_uref(ASMState *as, IRIns *ir)
 static void asm_fref(ASMState *as, IRIns *ir)
 {
   UNUSED(as); UNUSED(ir);
-  lua_assert(!ra_used(ir));
+  lj_assertA(!ra_used(ir), "unfused FREF");
 }
 
 static void asm_strref(ASMState *as, IRIns *ir)
@@ -971,25 +979,27 @@ static void asm_strref(ASMState *as, IRIns *ir)
 
 /* -- Loads and stores ---------------------------------------------------- */
 
-static ARMIns asm_fxloadins(IRIns *ir)
+static ARMIns asm_fxloadins(ASMState *as, IRIns *ir)
 {
+  UNUSED(as);
   switch (irt_type(ir->t)) {
   case IRT_I8: return ARMI_LDRSB;
   case IRT_U8: return ARMI_LDRB;
   case IRT_I16: return ARMI_LDRSH;
   case IRT_U16: return ARMI_LDRH;
-  case IRT_NUM: lua_assert(!LJ_SOFTFP); return ARMI_VLDR_D;
+  case IRT_NUM: lj_assertA(!LJ_SOFTFP, "unsplit FP op"); return ARMI_VLDR_D;
   case IRT_FLOAT: if (!LJ_SOFTFP) return ARMI_VLDR_S;  /* fallthrough */
   default: return ARMI_LDR;
   }
 }
 
-static ARMIns asm_fxstoreins(IRIns *ir)
+static ARMIns asm_fxstoreins(ASMState *as, IRIns *ir)
 {
+  UNUSED(as);
   switch (irt_type(ir->t)) {
   case IRT_I8: case IRT_U8: return ARMI_STRB;
   case IRT_I16: case IRT_U16: return ARMI_STRH;
-  case IRT_NUM: lua_assert(!LJ_SOFTFP); return ARMI_VSTR_D;
+  case IRT_NUM: lj_assertA(!LJ_SOFTFP, "unsplit FP op"); return ARMI_VSTR_D;
   case IRT_FLOAT: if (!LJ_SOFTFP) return ARMI_VSTR_S;  /* fallthrough */
   default: return ARMI_STR;
   }
@@ -997,12 +1007,13 @@ static ARMIns asm_fxstoreins(IRIns *ir)
 
 static void asm_fload(ASMState *as, IRIns *ir)
 {
-  if (ir->op1 == REF_NIL) {
-    lua_assert(!ra_used(ir));  /* We can end up here if DCE is turned off. */
+  if (ir->op1 == REF_NIL) {  /* FLOAD from GG_State with offset. */
+    /* We can end up here if DCE is turned off. */
+    lj_assertA(!ra_used(ir), "NYI FLOAD GG_State");
   } else {
     Reg dest = ra_dest(as, ir, RSET_GPR);
     Reg idx = ra_alloc1(as, ir->op1, RSET_GPR);
-    ARMIns ai = asm_fxloadins(ir);
+    ARMIns ai = asm_fxloadins(as, ir);
     int32_t ofs;
     if (ir->op2 == IRFL_TAB_ARRAY) {
       ofs = asm_fuseabase(as, ir->op1);
@@ -1026,7 +1037,7 @@ static void asm_fstore(ASMState *as, IRIns *ir)
     IRIns *irf = IR(ir->op1);
     Reg idx = ra_alloc1(as, irf->op1, rset_exclude(RSET_GPR, src));
     int32_t ofs = field_ofs[irf->op2];
-    ARMIns ai = asm_fxstoreins(ir);
+    ARMIns ai = asm_fxstoreins(as, ir);
     if ((ai & 0x04000000))
       emit_lso(as, ai, src, idx, ofs);
     else
@@ -1038,8 +1049,8 @@ static void asm_xload(ASMState *as, IRIns *ir)
 {
   Reg dest = ra_dest(as, ir,
 		     (!LJ_SOFTFP && irt_isfp(ir->t)) ? RSET_FPR : RSET_GPR);
-  lua_assert(!(ir->op2 & IRXLOAD_UNALIGNED));
-  asm_fusexref(as, asm_fxloadins(ir), dest, ir->op1, RSET_GPR, 0);
+  lj_assertA(!(ir->op2 & IRXLOAD_UNALIGNED), "unaligned XLOAD");
+  asm_fusexref(as, asm_fxloadins(as, ir), dest, ir->op1, RSET_GPR, 0);
 }
 
 static void asm_xstore_(ASMState *as, IRIns *ir, int32_t ofs)
@@ -1047,7 +1058,7 @@ static void asm_xstore_(ASMState *as, IRIns *ir, int32_t ofs)
   if (ir->r != RID_SINK) {
     Reg src = ra_alloc1(as, ir->op2,
 			(!LJ_SOFTFP && irt_isfp(ir->t)) ? RSET_FPR : RSET_GPR);
-    asm_fusexref(as, asm_fxstoreins(ir), src, ir->op1,
+    asm_fusexref(as, asm_fxstoreins(as, ir), src, ir->op1,
 		 rset_exclude(RSET_GPR, src), ofs);
   }
 }
@@ -1066,8 +1077,9 @@ static void asm_ahuvload(ASMState *as, IRIns *ir)
     rset_clear(allow, type);
   }
   if (ra_used(ir)) {
-    lua_assert((LJ_SOFTFP ? 0 : irt_isnum(ir->t)) ||
-	       irt_isint(ir->t) || irt_isaddr(ir->t));
+    lj_assertA((LJ_SOFTFP ? 0 : irt_isnum(ir->t)) ||
+	       irt_isint(ir->t) || irt_isaddr(ir->t),
+	       "bad load type %d", irt_type(ir->t));
     dest = ra_dest(as, ir, (!LJ_SOFTFP && t == IRT_NUM) ? RSET_FPR : allow);
     rset_clear(allow, dest);
   }
@@ -1133,10 +1145,13 @@ static void asm_sload(ASMState *as, IRIns *ir)
   IRType t = hiop ? IRT_NUM : irt_type(ir->t);
   Reg dest = RID_NONE, type = RID_NONE, base;
   RegSet allow = RSET_GPR;
-  lua_assert(!(ir->op2 & IRSLOAD_PARENT));  /* Handled by asm_head_side(). */
-  lua_assert(irt_isguard(ir->t) || !(ir->op2 & IRSLOAD_TYPECHECK));
+  lj_assertA(!(ir->op2 & IRSLOAD_PARENT),
+	     "bad parent SLOAD");  /* Handled by asm_head_side(). */
+  lj_assertA(irt_isguard(ir->t) || !(ir->op2 & IRSLOAD_TYPECHECK),
+	     "inconsistent SLOAD variant");
 #if LJ_SOFTFP
-  lua_assert(!(ir->op2 & IRSLOAD_CONVERT));  /* Handled by LJ_SOFTFP SPLIT. */
+  lj_assertA(!(ir->op2 & IRSLOAD_CONVERT),
+	     "unsplit SLOAD convert");  /* Handled by LJ_SOFTFP SPLIT. */
   if (hiop && ra_used(ir+1)) {
     type = ra_dest(as, ir+1, allow);
     rset_clear(allow, type);
@@ -1152,8 +1167,9 @@ static void asm_sload(ASMState *as, IRIns *ir)
     Reg tmp = RID_NONE;
     if ((ir->op2 & IRSLOAD_CONVERT))
       tmp = ra_scratch(as, t == IRT_INT ? RSET_FPR : RSET_GPR);
-    lua_assert((LJ_SOFTFP ? 0 : irt_isnum(ir->t)) ||
-	       irt_isint(ir->t) || irt_isaddr(ir->t));
+    lj_assertA((LJ_SOFTFP ? 0 : irt_isnum(ir->t)) ||
+	       irt_isint(ir->t) || irt_isaddr(ir->t),
+	       "bad SLOAD type %d", irt_type(ir->t));
     dest = ra_dest(as, ir, (!LJ_SOFTFP && t == IRT_NUM) ? RSET_FPR : allow);
     rset_clear(allow, dest);
     base = ra_alloc1(as, REF_BASE, allow);
@@ -1218,7 +1234,8 @@ static void asm_cnew(ASMState *as, IRIns *ir)
   IRRef args[4];
   RegSet allow = (RSET_GPR & ~RSET_SCRATCH);
   RegSet drop = RSET_SCRATCH;
-  lua_assert(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL));
+  lj_assertA(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL),
+	     "bad CNEW/CNEWI operands");
 
   as->gcsteps++;
   if (ra_hasreg(ir->r))
@@ -1230,10 +1247,10 @@ static void asm_cnew(ASMState *as, IRIns *ir)
   /* Initialize immutable cdata object. */
   if (ir->o == IR_CNEWI) {
     int32_t ofs = sizeof(GCcdata);
-    lua_assert(sz == 4 || sz == 8);
+    lj_assertA(sz == 4 || sz == 8, "bad CNEWI size %d", sz);
     if (sz == 8) {
       ofs += 4; ir++;
-      lua_assert(ir->o == IR_HIOP);
+      lj_assertA(ir->o == IR_HIOP, "expected HIOP for CNEWI");
     }
     for (;;) {
       Reg r = ra_alloc1(as, ir->op2, allow);
@@ -1306,7 +1323,7 @@ static void asm_obar(ASMState *as, IRIns *ir)
   MCLabel l_end;
   Reg obj, val, tmp;
   /* No need for other object barriers (yet). */
-  lua_assert(IR(ir->op1)->o == IR_UREFC);
+  lj_assertA(IR(ir->op1)->o == IR_UREFC, "bad OBAR type");
   ra_evictset(as, RSET_SCRATCH);
   l_end = emit_label(as);
   args[0] = ASMREF_TMP1;  /* global_State *g */
@@ -1580,7 +1597,7 @@ static void asm_bitshift(ASMState *as, IRIns *ir, ARMShift sh)
 #define asm_bshr(as, ir)	asm_bitshift(as, ir, ARMSH_LSR)
 #define asm_bsar(as, ir)	asm_bitshift(as, ir, ARMSH_ASR)
 #define asm_bror(as, ir)	asm_bitshift(as, ir, ARMSH_ROR)
-#define asm_brol(as, ir)	lua_assert(0)
+#define asm_brol(as, ir)	lj_assertA(0, "unexpected BROL")
 
 static void asm_intmin_max(ASMState *as, IRIns *ir, int cc)
 {
@@ -1731,7 +1748,8 @@ static void asm_intcomp(ASMState *as, IRIns *ir)
   Reg left;
   uint32_t m;
   int cmpprev0 = 0;
-  lua_assert(irt_isint(ir->t) || irt_isu32(ir->t) || irt_isaddr(ir->t));
+  lj_assertA(irt_isint(ir->t) || irt_isu32(ir->t) || irt_isaddr(ir->t),
+	     "bad comparison data type %d", irt_type(ir->t));
   if (asm_swapops(as, lref, rref)) {
     Reg tmp = lref; lref = rref; rref = tmp;
     if (cc >= CC_GE) cc ^= 7;  /* LT <-> GT, LE <-> GE */
@@ -1900,10 +1918,11 @@ static void asm_hiop(ASMState *as, IRIns *ir)
   case IR_CNEWI:
     /* Nothing to do here. Handled by lo op itself. */
     break;
-  default: lua_assert(0); break;
+  default: lj_assertA(0, "bad HIOP for op %d", (ir-1)->o); break;
   }
 #else
-  UNUSED(as); UNUSED(ir); lua_assert(0);
+  /* Unused without SOFTFP or FFI. */
+  UNUSED(as); UNUSED(ir); lj_assertA(0, "unexpected HIOP");
 #endif
 }
 
@@ -1928,7 +1947,7 @@ static void asm_stack_check(ASMState *as, BCReg topslot,
   if (irp) {
     if (!ra_hasspill(irp->s)) {
       pbase = irp->r;
-      lua_assert(ra_hasreg(pbase));
+      lj_assertA(ra_hasreg(pbase), "base reg lost");
     } else if (allow) {
       pbase = rset_pickbot(allow);
     } else {
@@ -1940,7 +1959,7 @@ static void asm_stack_check(ASMState *as, BCReg topslot,
   }
   emit_branch(as, ARMF_CC(ARMI_BL, CC_LS), exitstub_addr(as->J, exitno));
   k = emit_isk12(0, (int32_t)(8*topslot));
-  lua_assert(k);
+  lj_assertA(k, "slot offset %d does not fit in K12", 8*topslot);
   emit_n(as, ARMI_CMP^k, RID_TMP);
   emit_dnm(as, ARMI_SUB, RID_TMP, RID_TMP, pbase);
   emit_lso(as, ARMI_LDR, RID_TMP, RID_TMP,
@@ -1977,7 +1996,8 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
 #if LJ_SOFTFP
       RegSet odd = rset_exclude(RSET_GPRODD, RID_BASE);
       Reg tmp;
-      lua_assert(irref_isk(ref));  /* LJ_SOFTFP: must be a number constant. */
+      /* LJ_SOFTFP: must be a number constant. */
+      lj_assertA(irref_isk(ref), "unsplit FP op");
       tmp = ra_allock(as, (int32_t)ir_knum(ir)->u32.lo,
 		      rset_exclude(RSET_GPREVEN, RID_BASE));
       emit_lso(as, ARMI_STR, tmp, RID_BASE, ofs);
@@ -1991,7 +2011,8 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
     } else {
       RegSet odd = rset_exclude(RSET_GPRODD, RID_BASE);
       Reg type;
-      lua_assert(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t));
+      lj_assertA(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t),
+		 "restore of IR type %d", irt_type(ir->t));
       if (!irt_ispri(ir->t)) {
 	Reg src = ra_alloc1(as, ref, rset_exclude(RSET_GPREVEN, RID_BASE));
 	emit_lso(as, ARMI_STR, src, RID_BASE, ofs);
@@ -2011,7 +2032,7 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
     }
     checkmclim(as);
   }
-  lua_assert(map + nent == flinks);
+  lj_assertA(map + nent == flinks, "inconsistent frames in snapshot");
 }
 
 /* -- GC handling --------------------------------------------------------- */
@@ -2097,7 +2118,7 @@ static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
     rset_clear(allow, ra_dest(as, ir, allow));
   } else {
     Reg r = irp->r;
-    lua_assert(ra_hasreg(r));
+    lj_assertA(ra_hasreg(r), "base reg lost");
     rset_clear(allow, r);
     if (r != ir->r && !rset_test(as->freeset, r))
       ra_restore(as, regcost_ref(as->cost[r]));
@@ -2119,7 +2140,7 @@ static void asm_tail_fixup(ASMState *as, TraceNo lnk)
   } else {
     /* Patch stack adjustment. */
     uint32_t k = emit_isk12(ARMI_ADD, spadj);
-    lua_assert(k);
+    lj_assertA(k, "stack adjustment %d does not fit in K12", spadj);
     p[-2] = (ARMI_ADD^k) | ARMF_D(RID_SP) | ARMF_N(RID_SP);
   }
   /* Patch exit branch. */
@@ -2201,7 +2222,7 @@ void lj_asm_patchexit(jit_State *J, GCtrace *T, ExitNo exitno, MCode *target)
       if (!cstart) cstart = p;
     }
   }
-  lua_assert(cstart != NULL);
+  lj_assertJ(cstart != NULL, "exit stub %d not found", exitno);
   lj_mcode_sync(cstart, cend);
   lj_mcode_patch(J, mcarea, 1);
 }
diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
index c3d6889e..d1d4237b 100644
--- a/src/lj_asm_arm64.h
+++ b/src/lj_asm_arm64.h
@@ -213,7 +213,7 @@ static uint32_t asm_fuseopm(ASMState *as, A64Ins ai, IRRef ref, RegSet allow)
     return A64F_M(ir->r);
   } else if (irref_isk(ref)) {
     uint32_t m;
-    int64_t k = get_k64val(ir);
+    int64_t k = get_k64val(as, ref);
     if ((ai & 0x1f000000) == 0x0a000000)
       m = emit_isk13(k, irt_is64(ir->t));
     else
@@ -354,9 +354,9 @@ static int asm_fusemadd(ASMState *as, IRIns *ir, A64Ins ai, A64Ins air)
 static int asm_fuseandshift(ASMState *as, IRIns *ir)
 {
   IRIns *irl = IR(ir->op1);
-  lua_assert(ir->o == IR_BAND);
+  lj_assertA(ir->o == IR_BAND, "bad usage");
   if (canfuse(as, irl) && irref_isk(ir->op2)) {
-    uint64_t mask = get_k64val(IR(ir->op2));
+    uint64_t mask = get_k64val(as, ir->op2);
     if (irref_isk(irl->op2) && (irl->o == IR_BSHR || irl->o == IR_BSHL)) {
       int32_t shmask = irt_is64(irl->t) ? 63 : 31;
       int32_t shift = (IR(irl->op2)->i & shmask);
@@ -384,7 +384,7 @@ static int asm_fuseandshift(ASMState *as, IRIns *ir)
 static int asm_fuseorshift(ASMState *as, IRIns *ir)
 {
   IRIns *irl = IR(ir->op1), *irr = IR(ir->op2);
-  lua_assert(ir->o == IR_BOR);
+  lj_assertA(ir->o == IR_BOR, "bad usage");
   if (canfuse(as, irl) && canfuse(as, irr) &&
       ((irl->o == IR_BSHR && irr->o == IR_BSHL) ||
        (irl->o == IR_BSHL && irr->o == IR_BSHR))) {
@@ -428,7 +428,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
     if (ref) {
       if (irt_isfp(ir->t)) {
 	if (fpr <= REGARG_LASTFPR) {
-	  lua_assert(rset_test(as->freeset, fpr)); /* Must have been evicted. */
+	  lj_assertA(rset_test(as->freeset, fpr),
+		     "reg %d not free", fpr);  /* Must have been evicted. */
 	  ra_leftov(as, fpr, ref);
 	  fpr++;
 	} else {
@@ -438,7 +439,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
 	}
       } else {
 	if (gpr <= REGARG_LASTGPR) {
-	  lua_assert(rset_test(as->freeset, gpr)); /* Must have been evicted. */
+	  lj_assertA(rset_test(as->freeset, gpr),
+		     "reg %d not free", gpr);  /* Must have been evicted. */
 	  ra_leftov(as, gpr, ref);
 	  gpr++;
 	} else {
@@ -459,7 +461,7 @@ static void asm_setupresult(ASMState *as, IRIns *ir, const CCallInfo *ci)
     rset_clear(drop, ir->r); /* Dest reg handled below. */
   ra_evictset(as, drop); /* Evictions must be performed first. */
   if (ra_used(ir)) {
-    lua_assert(!irt_ispri(ir->t));
+    lj_assertA(!irt_ispri(ir->t), "PRI dest");
     if (irt_isfp(ir->t)) {
       if (ci->flags & CCI_CASTU64) {
 	Reg dest = ra_dest(as, ir, RSET_FPR) & 31;
@@ -546,7 +548,7 @@ static void asm_conv(ASMState *as, IRIns *ir)
   int st64 = (st == IRT_I64 || st == IRT_U64 || st == IRT_P64);
   int stfp = (st == IRT_NUM || st == IRT_FLOAT);
   IRRef lref = ir->op1;
-  lua_assert(irt_type(ir->t) != st);
+  lj_assertA(irt_type(ir->t) != st, "inconsistent types for CONV");
   if (irt_isfp(ir->t)) {
     Reg dest = ra_dest(as, ir, RSET_FPR);
     if (stfp) {  /* FP to FP conversion. */
@@ -566,7 +568,8 @@ static void asm_conv(ASMState *as, IRIns *ir)
   } else if (stfp) {  /* FP to integer conversion. */
     if (irt_isguard(ir->t)) {
       /* Checked conversions are only supported from number to int. */
-      lua_assert(irt_isint(ir->t) && st == IRT_NUM);
+      lj_assertA(irt_isint(ir->t) && st == IRT_NUM,
+		 "bad type for checked CONV");
       asm_tointg(as, ir, ra_alloc1(as, lref, RSET_FPR));
     } else {
       Reg left = ra_alloc1(as, lref, RSET_FPR);
@@ -586,7 +589,7 @@ static void asm_conv(ASMState *as, IRIns *ir)
     A64Ins ai = st == IRT_I8 ? A64I_SXTBw :
 		st == IRT_U8 ? A64I_UXTBw :
 		st == IRT_I16 ? A64I_SXTHw : A64I_UXTHw;
-    lua_assert(irt_isint(ir->t) || irt_isu32(ir->t));
+    lj_assertA(irt_isint(ir->t) || irt_isu32(ir->t), "bad type for CONV EXT");
     emit_dn(as, ai, dest, left);
   } else {
     Reg dest = ra_dest(as, ir, RSET_GPR);
@@ -650,7 +653,8 @@ static void asm_tvstore64(ASMState *as, Reg base, int32_t ofs, IRRef ref)
 {
   RegSet allow = rset_exclude(RSET_GPR, base);
   IRIns *ir = IR(ref);
-  lua_assert(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t));
+  lj_assertA(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t),
+	     "store of IR type %d", irt_type(ir->t));
   if (irref_isk(ref)) {
     TValue k;
     lj_ir_kvalue(as->J->L, &k, ir);
@@ -770,7 +774,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
     }
     rset_clear(allow, scr);
   } else {
-    lua_assert(irt_ispri(kt) && !irt_isnil(kt));
+    lj_assertA(irt_ispri(kt) && !irt_isnil(kt), "bad HREF key type");
     type = ra_allock(as, ~((int64_t)~irt_toitype(ir->t) << 47), allow);
     scr = ra_scratch(as, rset_clear(allow, type));
     rset_clear(allow, scr);
@@ -831,7 +835,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
     rset_clear(allow, type);
   }
   /* Load main position relative to tab->node into dest. */
-  khash = isk ? ir_khash(irkey) : 1;
+  khash = isk ? ir_khash(as, irkey) : 1;
   if (khash == 0) {
     emit_lso(as, A64I_LDRx, dest, tab, offsetof(GCtab, node));
   } else {
@@ -886,7 +890,7 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
   Reg key, idx = node;
   RegSet allow = rset_exclude(RSET_GPR, node);
   uint64_t k;
-  lua_assert(ofs % sizeof(Node) == 0);
+  lj_assertA(ofs % sizeof(Node) == 0, "unaligned HREFK slot");
   if (bigofs) {
     idx = dest;
     rset_clear(allow, dest);
@@ -936,7 +940,7 @@ static void asm_uref(ASMState *as, IRIns *ir)
 static void asm_fref(ASMState *as, IRIns *ir)
 {
   UNUSED(as); UNUSED(ir);
-  lua_assert(!ra_used(ir));
+  lj_assertA(!ra_used(ir), "unfused FREF");
 }
 
 static void asm_strref(ASMState *as, IRIns *ir)
@@ -988,7 +992,7 @@ static void asm_fload(ASMState *as, IRIns *ir)
   Reg idx;
   A64Ins ai = asm_fxloadins(ir);
   int32_t ofs;
-  if (ir->op1 == REF_NIL) {
+  if (ir->op1 == REF_NIL) {  /* FLOAD from GG_State with offset. */
     idx = RID_GL;
     ofs = (ir->op2 << 2) - GG_OFS(g);
   } else {
@@ -1019,7 +1023,7 @@ static void asm_fstore(ASMState *as, IRIns *ir)
 static void asm_xload(ASMState *as, IRIns *ir)
 {
   Reg dest = ra_dest(as, ir, irt_isfp(ir->t) ? RSET_FPR : RSET_GPR);
-  lua_assert(!(ir->op2 & IRXLOAD_UNALIGNED));
+  lj_assertA(!(ir->op2 & IRXLOAD_UNALIGNED), "unaligned XLOAD");
   asm_fusexref(as, asm_fxloadins(ir), dest, ir->op1, RSET_GPR);
 }
 
@@ -1037,8 +1041,9 @@ static void asm_ahuvload(ASMState *as, IRIns *ir)
   Reg idx, tmp, type;
   int32_t ofs = 0;
   RegSet gpr = RSET_GPR, allow = irt_isnum(ir->t) ? RSET_FPR : RSET_GPR;
-  lua_assert(irt_isnum(ir->t) || irt_ispri(ir->t) || irt_isaddr(ir->t) ||
-	     irt_isint(ir->t));
+  lj_assertA(irt_isnum(ir->t) || irt_ispri(ir->t) || irt_isaddr(ir->t) ||
+	     irt_isint(ir->t),
+	     "bad load type %d", irt_type(ir->t));
   if (ra_used(ir)) {
     Reg dest = ra_dest(as, ir, allow);
     tmp = irt_isnum(ir->t) ? ra_scratch(as, rset_clear(gpr, dest)) : dest;
@@ -1057,7 +1062,8 @@ static void asm_ahuvload(ASMState *as, IRIns *ir)
   /* Always do the type check, even if the load result is unused. */
   asm_guardcc(as, irt_isnum(ir->t) ? CC_LS : CC_NE);
   if (irt_type(ir->t) >= IRT_NUM) {
-    lua_assert(irt_isinteger(ir->t) || irt_isnum(ir->t));
+    lj_assertA(irt_isinteger(ir->t) || irt_isnum(ir->t),
+	       "bad load type %d", irt_type(ir->t));
     emit_nm(as, A64I_CMPx | A64F_SH(A64SH_LSR, 32),
 	    ra_allock(as, LJ_TISNUM << 15, rset_exclude(gpr, idx)), tmp);
   } else if (irt_isaddr(ir->t)) {
@@ -1122,8 +1128,10 @@ static void asm_sload(ASMState *as, IRIns *ir)
   IRType1 t = ir->t;
   Reg dest = RID_NONE, base;
   RegSet allow = RSET_GPR;
-  lua_assert(!(ir->op2 & IRSLOAD_PARENT));  /* Handled by asm_head_side(). */
-  lua_assert(irt_isguard(t) || !(ir->op2 & IRSLOAD_TYPECHECK));
+  lj_assertA(!(ir->op2 & IRSLOAD_PARENT),
+	     "bad parent SLOAD");  /* Handled by asm_head_side(). */
+  lj_assertA(irt_isguard(t) || !(ir->op2 & IRSLOAD_TYPECHECK),
+	     "inconsistent SLOAD variant");
   if ((ir->op2 & IRSLOAD_CONVERT) && irt_isguard(t) && irt_isint(t)) {
     dest = ra_scratch(as, RSET_FPR);
     asm_tointg(as, ir, dest);
@@ -1132,7 +1140,8 @@ static void asm_sload(ASMState *as, IRIns *ir)
     Reg tmp = RID_NONE;
     if ((ir->op2 & IRSLOAD_CONVERT))
       tmp = ra_scratch(as, irt_isint(t) ? RSET_FPR : RSET_GPR);
-    lua_assert((irt_isnum(t)) || irt_isint(t) || irt_isaddr(t));
+    lj_assertA((irt_isnum(t)) || irt_isint(t) || irt_isaddr(t),
+	       "bad SLOAD type %d", irt_type(t));
     dest = ra_dest(as, ir, irt_isnum(t) ? RSET_FPR : allow);
     base = ra_alloc1(as, REF_BASE, rset_clear(allow, dest));
     if (irt_isaddr(t)) {
@@ -1172,7 +1181,8 @@ dotypecheck:
     /* Need type check, even if the load result is unused. */
     asm_guardcc(as, irt_isnum(t) ? CC_LS : CC_NE);
     if (irt_type(t) >= IRT_NUM) {
-      lua_assert(irt_isinteger(t) || irt_isnum(t));
+      lj_assertA(irt_isinteger(t) || irt_isnum(t),
+		 "bad SLOAD type %d", irt_type(t));
       emit_nm(as, A64I_CMPx | A64F_SH(A64SH_LSR, 32),
 	      ra_allock(as, LJ_TISNUM << 15, allow), tmp);
     } else if (irt_isnil(t)) {
@@ -1207,7 +1217,8 @@ static void asm_cnew(ASMState *as, IRIns *ir)
   const CCallInfo *ci = &lj_ir_callinfo[IRCALL_lj_mem_newgco];
   IRRef args[4];
   RegSet allow = (RSET_GPR & ~RSET_SCRATCH);
-  lua_assert(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL));
+  lj_assertA(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL),
+	     "bad CNEW/CNEWI operands");
 
   as->gcsteps++;
   asm_setupresult(as, ir, ci);  /* GCcdata * */
@@ -1215,7 +1226,7 @@ static void asm_cnew(ASMState *as, IRIns *ir)
   if (ir->o == IR_CNEWI) {
     int32_t ofs = sizeof(GCcdata);
     Reg r = ra_alloc1(as, ir->op2, allow);
-    lua_assert(sz == 4 || sz == 8);
+    lj_assertA(sz == 4 || sz == 8, "bad CNEWI size %d", sz);
     emit_lso(as, sz == 8 ? A64I_STRx : A64I_STRw, r, RID_RET, ofs);
   } else if (ir->op2 != REF_NIL) {  /* Create VLA/VLS/aligned cdata. */
     ci = &lj_ir_callinfo[IRCALL_lj_cdata_newv];
@@ -1281,7 +1292,7 @@ static void asm_obar(ASMState *as, IRIns *ir)
   RegSet allow = RSET_GPR;
   Reg obj, val, tmp;
   /* No need for other object barriers (yet). */
-  lua_assert(IR(ir->op1)->o == IR_UREFC);
+  lj_assertA(IR(ir->op1)->o == IR_UREFC, "bad OBAR type");
   ra_evictset(as, RSET_SCRATCH);
   l_end = emit_label(as);
   args[0] = ASMREF_TMP1;  /* global_State *g */
@@ -1551,7 +1562,7 @@ static void asm_bitshift(ASMState *as, IRIns *ir, A64Ins ai, A64Shift sh)
 #define asm_bshr(as, ir)	asm_bitshift(as, ir, A64I_UBFMw, A64SH_LSR)
 #define asm_bsar(as, ir)	asm_bitshift(as, ir, A64I_SBFMw, A64SH_ASR)
 #define asm_bror(as, ir)	asm_bitshift(as, ir, A64I_EXTRw, A64SH_ROR)
-#define asm_brol(as, ir)	lua_assert(0)
+#define asm_brol(as, ir)	lj_assertA(0, "unexpected BROL")
 
 static void asm_intmin_max(ASMState *as, IRIns *ir, A64CC cc)
 {
@@ -1632,15 +1643,16 @@ static void asm_intcomp(ASMState *as, IRIns *ir)
   Reg left;
   uint32_t m;
   int cmpprev0 = 0;
-  lua_assert(irt_is64(ir->t) || irt_isint(ir->t) ||
-	     irt_isu32(ir->t) || irt_isaddr(ir->t) || irt_isu8(ir->t));
+  lj_assertA(irt_is64(ir->t) || irt_isint(ir->t) ||
+	     irt_isu32(ir->t) || irt_isaddr(ir->t) || irt_isu8(ir->t),
+	     "bad comparison data type %d", irt_type(ir->t));
   if (asm_swapops(as, lref, rref)) {
     IRRef tmp = lref; lref = rref; rref = tmp;
     if (cc >= CC_GE) cc ^= 7;  /* LT <-> GT, LE <-> GE */
     else if (cc > CC_NE) cc ^= 11;  /* LO <-> HI, LS <-> HS */
   }
   oldcc = cc;
-  if (irref_isk(rref) && get_k64val(IR(rref)) == 0) {
+  if (irref_isk(rref) && get_k64val(as, rref) == 0) {
     IRIns *irl = IR(lref);
     if (cc == CC_GE) cc = CC_PL;
     else if (cc == CC_LT) cc = CC_MI;
@@ -1655,7 +1667,7 @@ static void asm_intcomp(ASMState *as, IRIns *ir)
 	Reg tmp = blref; blref = brref; brref = tmp;
       }
       if (irref_isk(brref)) {
-	uint64_t k = get_k64val(IR(brref));
+	uint64_t k = get_k64val(as, brref);
 	if (k && !(k & (k-1)) && (cc == CC_EQ || cc == CC_NE)) {
 	  asm_guardtnb(as, cc == CC_EQ ? A64I_TBZ : A64I_TBNZ,
 		       ra_alloc1(as, blref, RSET_GPR), emit_ctz64(k));
@@ -1704,7 +1716,8 @@ static void asm_comp(ASMState *as, IRIns *ir)
 /* Hiword op of a split 64 bit op. Previous op must be the loword op. */
 static void asm_hiop(ASMState *as, IRIns *ir)
 {
-  UNUSED(as); UNUSED(ir); lua_assert(0);  /* Unused on 64 bit. */
+  UNUSED(as); UNUSED(ir);
+  lj_assertA(0, "unexpected HIOP");  /* Unused on 64 bit. */
 }
 
 /* -- Profiling ----------------------------------------------------------- */
@@ -1712,7 +1725,7 @@ static void asm_hiop(ASMState *as, IRIns *ir)
 static void asm_prof(ASMState *as, IRIns *ir)
 {
   uint32_t k = emit_isk13(HOOK_PROFILE, 0);
-  lua_assert(k != 0);
+  lj_assertA(k != 0, "HOOK_PROFILE does not fit in K13");
   UNUSED(ir);
   asm_guardcc(as, CC_NE);
   emit_n(as, A64I_TSTw^k, RID_TMP);
@@ -1730,7 +1743,7 @@ static void asm_stack_check(ASMState *as, BCReg topslot,
   if (irp) {
     if (!ra_hasspill(irp->s)) {
       pbase = irp->r;
-      lua_assert(ra_hasreg(pbase));
+      lj_assertA(ra_hasreg(pbase), "base reg lost");
     } else if (allow) {
       pbase = rset_pickbot(allow);
     } else {
@@ -1742,7 +1755,7 @@ static void asm_stack_check(ASMState *as, BCReg topslot,
   }
   emit_cond_branch(as, CC_LS, asm_exitstub_addr(as, exitno));
   k = emit_isk12((8*topslot));
-  lua_assert(k);
+  lj_assertA(k, "slot offset %d does not fit in K12", 8*topslot);
   emit_n(as, A64I_CMPx^k, RID_TMP);
   emit_dnm(as, A64I_SUBx, RID_TMP, RID_TMP, pbase);
   emit_lso(as, A64I_LDRx, RID_TMP, RID_TMP,
@@ -1783,7 +1796,7 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
     }
     checkmclim(as);
   }
-  lua_assert(map + nent == flinks);
+  lj_assertA(map + nent == flinks, "inconsistent frames in snapshot");
 }
 
 /* -- GC handling --------------------------------------------------------- */
@@ -1871,7 +1884,7 @@ static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
     rset_clear(allow, ra_dest(as, ir, allow));
   } else {
     Reg r = irp->r;
-    lua_assert(ra_hasreg(r));
+    lj_assertA(ra_hasreg(r), "base reg lost");
     rset_clear(allow, r);
     if (r != ir->r && !rset_test(as->freeset, r))
       ra_restore(as, regcost_ref(as->cost[r]));
@@ -1895,7 +1908,7 @@ static void asm_tail_fixup(ASMState *as, TraceNo lnk)
   } else {
     /* Patch stack adjustment. */
     uint32_t k = emit_isk12(spadj);
-    lua_assert(k);
+    lj_assertA(k, "stack adjustment %d does not fit in K12", spadj);
     p[-2] = (A64I_ADDx^k) | A64F_D(RID_SP) | A64F_N(RID_SP);
   }
   /* Patch exit branch. */
@@ -1981,7 +1994,7 @@ void lj_asm_patchexit(jit_State *J, GCtrace *T, ExitNo exitno, MCode *target)
     } else if ((ins & 0xfc000000u) == 0x14000000u &&
 	       ((ins ^ (px-p)) & 0x03ffffffu) == 0) {
       /* Patch b. */
-      lua_assert(A64F_S_OK(delta, 26));
+      lj_assertJ(A64F_S_OK(delta, 26), "branch target out of range");
       *p = A64I_LE((ins & 0xfc000000u) | A64F_S26(delta));
       if (!cstart) cstart = p;
     } else if ((ins & 0x7e000000u) == 0x34000000u &&
@@ -2002,7 +2015,7 @@ void lj_asm_patchexit(jit_State *J, GCtrace *T, ExitNo exitno, MCode *target)
   }
   {  /* Always patch long-range branch in exit stub itself. */
     ptrdiff_t delta = target - px;
-    lua_assert(A64F_S_OK(delta, 26));
+    lj_assertJ(A64F_S_OK(delta, 26), "branch target out of range");
     *px = A64I_B | A64F_S26(delta);
     if (!cstart) cstart = px;
   }
diff --git a/src/lj_asm_mips.h b/src/lj_asm_mips.h
index 0f92959b..ea108aab 100644
--- a/src/lj_asm_mips.h
+++ b/src/lj_asm_mips.h
@@ -23,7 +23,7 @@ static Reg ra_alloc1z(ASMState *as, IRRef ref, RegSet allow)
 {
   Reg r = IR(ref)->r;
   if (ra_noreg(r)) {
-    if (!(allow & RSET_FPR) && irref_isk(ref) && get_kval(IR(ref)) == 0)
+    if (!(allow & RSET_FPR) && irref_isk(ref) && get_kval(as, ref) == 0)
       return RID_ZERO;
     r = ra_allocref(as, ref, allow);
   } else {
@@ -66,10 +66,10 @@ static void asm_sparejump_setup(ASMState *as)
 {
   MCode *mxp = as->mcbot;
   if (((uintptr_t)mxp & (LJ_PAGESIZE-1)) == sizeof(MCLink)) {
-    lua_assert(MIPSI_NOP == 0);
+    lj_assertA(MIPSI_NOP == 0, "bad NOP");
     memset(mxp, 0, MIPS_SPAREJUMP*2*sizeof(MCode));
     mxp += MIPS_SPAREJUMP*2;
-    lua_assert(mxp < as->mctop);
+    lj_assertA(mxp < as->mctop, "MIPS_SPAREJUMP too big");
     lj_mcode_sync(as->mcbot, mxp);
     lj_mcode_commitbot(as->J, mxp);
     as->mcbot = mxp;
@@ -84,7 +84,8 @@ static void asm_exitstub_setup(ASMState *as)
   /* sw TMP, 0(sp); j ->vm_exit_handler; li TMP, traceno */
   *--mxp = MIPSI_LI|MIPSF_T(RID_TMP)|as->T->traceno;
   *--mxp = MIPSI_J|((((uintptr_t)(void *)lj_vm_exit_handler)>>2)&0x03ffffffu);
-  lua_assert(((uintptr_t)mxp ^ (uintptr_t)(void *)lj_vm_exit_handler)>>28 == 0);
+  lj_assertA(((uintptr_t)mxp ^ (uintptr_t)(void *)lj_vm_exit_handler)>>28 == 0,
+	     "branch target out of range");
   *--mxp = MIPSI_SW|MIPSF_T(RID_TMP)|MIPSF_S(RID_SP)|0;
   as->mctop = mxp;
 }
@@ -195,20 +196,20 @@ static void asm_fusexref(ASMState *as, MIPSIns mi, Reg rt, IRRef ref,
   if (ra_noreg(ir->r) && canfuse(as, ir)) {
     if (ir->o == IR_ADD) {
       intptr_t ofs2;
-      if (irref_isk(ir->op2) && (ofs2 = ofs + get_kval(IR(ir->op2)),
+      if (irref_isk(ir->op2) && (ofs2 = ofs + get_kval(as, ir->op2),
 				 checki16(ofs2))) {
 	ref = ir->op1;
 	ofs = (int32_t)ofs2;
       }
     } else if (ir->o == IR_STRREF) {
       intptr_t ofs2 = 65536;
-      lua_assert(ofs == 0);
+      lj_assertA(ofs == 0, "bad usage");
       ofs = (int32_t)sizeof(GCstr);
       if (irref_isk(ir->op2)) {
-	ofs2 = ofs + get_kval(IR(ir->op2));
+	ofs2 = ofs + get_kval(as, ir->op2);
 	ref = ir->op1;
       } else if (irref_isk(ir->op1)) {
-	ofs2 = ofs + get_kval(IR(ir->op1));
+	ofs2 = ofs + get_kval(as, ir->op1);
 	ref = ir->op2;
       }
       if (!checki16(ofs2)) {
@@ -252,7 +253,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
 #if !LJ_SOFTFP
       if (irt_isfp(ir->t) && fpr <= REGARG_LASTFPR &&
 	  !(ci->flags & CCI_VARARG)) {
-	lua_assert(rset_test(as->freeset, fpr));  /* Already evicted. */
+	lj_assertA(rset_test(as->freeset, fpr),
+		   "reg %d not free", fpr);  /* Already evicted. */
 	ra_leftov(as, fpr, ref);
 	fpr += LJ_32 ? 2 : 1;
 	gpr += (LJ_32 && irt_isnum(ir->t)) ? 2 : 1;
@@ -264,7 +266,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
 #endif
 	if (LJ_32 && irt_isnum(ir->t)) gpr = (gpr+1) & ~1;
 	if (gpr <= REGARG_LASTGPR) {
-	  lua_assert(rset_test(as->freeset, gpr));  /* Already evicted. */
+	  lj_assertA(rset_test(as->freeset, gpr),
+		     "reg %d not free", gpr);  /* Already evicted. */
 #if !LJ_SOFTFP
 	  if (irt_isfp(ir->t)) {
 	    RegSet of = as->freeset;
@@ -277,7 +280,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
 #if LJ_32
 	      emit_tg(as, MIPSI_MFC1, gpr+(LJ_BE?0:1), r+1);
 	      emit_tg(as, MIPSI_MFC1, gpr+(LJ_BE?1:0), r);
-	      lua_assert(rset_test(as->freeset, gpr+1));  /* Already evicted. */
+	      lj_assertA(rset_test(as->freeset, gpr+1),
+			 "reg %d not free", gpr+1);  /* Already evicted. */
 	      gpr += 2;
 #else
 	      emit_tg(as, MIPSI_DMFC1, gpr, r);
@@ -347,7 +351,7 @@ static void asm_setupresult(ASMState *as, IRIns *ir, const CCallInfo *ci)
 #endif
   ra_evictset(as, drop);  /* Evictions must be performed first. */
   if (ra_used(ir)) {
-    lua_assert(!irt_ispri(ir->t));
+    lj_assertA(!irt_ispri(ir->t), "PRI dest");
     if (!LJ_SOFTFP && irt_isfp(ir->t)) {
       if ((ci->flags & CCI_CASTU64)) {
 	int32_t ofs = sps_scale(ir->s);
@@ -395,7 +399,7 @@ static void asm_callx(ASMState *as, IRIns *ir)
   func = ir->op2; irf = IR(func);
   if (irf->o == IR_CARG) { func = irf->op1; irf = IR(func); }
   if (irref_isk(func)) {  /* Call to constant address. */
-    ci.func = (ASMFunction)(void *)get_kval(irf);
+    ci.func = (ASMFunction)(void *)get_kval(as, func);
   } else {  /* Need specific register for indirect calls. */
     Reg r = ra_alloc1(as, func, RID2RSET(RID_CFUNCADDR));
     MCode *p = as->mcp;
@@ -512,15 +516,19 @@ static void asm_conv(ASMState *as, IRIns *ir)
 #endif
   IRRef lref = ir->op1;
 #if LJ_32
-  lua_assert(!(irt_isint64(ir->t) ||
-	       (st == IRT_I64 || st == IRT_U64))); /* Handled by SPLIT. */
+  /* 64 bit integer conversions are handled by SPLIT. */
+  lj_assertA(!(irt_isint64(ir->t) || (st == IRT_I64 || st == IRT_U64)),
+	     "IR %04d has unsplit 64 bit type",
+	     (int)(ir - as->ir) - REF_BIAS);
 #endif
 #if LJ_SOFTFP32
   /* FP conversions are handled by SPLIT. */
-  lua_assert(!irt_isfp(ir->t) && !(st == IRT_NUM || st == IRT_FLOAT));
+  lj_assertA(!irt_isfp(ir->t) && !(st == IRT_NUM || st == IRT_FLOAT),
+	     "IR %04d has FP type",
+	     (int)(ir - as->ir) - REF_BIAS);
   /* Can't check for same types: SPLIT uses CONV int.int + BXOR for sfp NEG. */
 #else
-  lua_assert(irt_type(ir->t) != st);
+  lj_assertA(irt_type(ir->t) != st, "inconsistent types for CONV");
 #if !LJ_SOFTFP
   if (irt_isfp(ir->t)) {
     Reg dest = ra_dest(as, ir, RSET_FPR);
@@ -579,7 +587,8 @@ static void asm_conv(ASMState *as, IRIns *ir)
   } else if (stfp) {  /* FP to integer conversion. */
     if (irt_isguard(ir->t)) {
       /* Checked conversions are only supported from number to int. */
-      lua_assert(irt_isint(ir->t) && st == IRT_NUM);
+      lj_assertA(irt_isint(ir->t) && st == IRT_NUM,
+		 "bad type for checked CONV");
       asm_tointg(as, ir, ra_alloc1(as, lref, RSET_FPR));
     } else {
       Reg dest = ra_dest(as, ir, RSET_GPR);
@@ -679,7 +688,8 @@ static void asm_conv(ASMState *as, IRIns *ir)
   } else if (stfp) {  /* FP to integer conversion. */
     if (irt_isguard(ir->t)) {
       /* Checked conversions are only supported from number to int. */
-      lua_assert(irt_isint(ir->t) && st == IRT_NUM);
+      lj_assertA(irt_isint(ir->t) && st == IRT_NUM,
+		 "bad type for checked CONV");
       asm_tointg(as, ir, RID_NONE);
     } else {
       IRCallID cid = irt_is64(ir->t) ?
@@ -698,7 +708,7 @@ static void asm_conv(ASMState *as, IRIns *ir)
     Reg dest = ra_dest(as, ir, RSET_GPR);
     if (st >= IRT_I8 && st <= IRT_U16) {  /* Extend to 32 bit integer. */
       Reg left = ra_alloc1(as, ir->op1, RSET_GPR);
-      lua_assert(irt_isint(ir->t) || irt_isu32(ir->t));
+      lj_assertA(irt_isint(ir->t) || irt_isu32(ir->t), "bad type for CONV EXT");
       if ((ir->op2 & IRCONV_SEXT)) {
 	if (LJ_64 || (as->flags & JIT_F_MIPSXXR2)) {
 	  emit_dst(as, st == IRT_I8 ? MIPSI_SEB : MIPSI_SEH, dest, 0, left);
@@ -795,7 +805,8 @@ static void asm_tvstore64(ASMState *as, Reg base, int32_t ofs, IRRef ref)
 {
   RegSet allow = rset_exclude(RSET_GPR, base);
   IRIns *ir = IR(ref);
-  lua_assert(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t));
+  lj_assertA(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t),
+	     "store of IR type %d", irt_type(ir->t));
   if (irref_isk(ref)) {
     TValue k;
     lj_ir_kvalue(as->J->L, &k, ir);
@@ -944,7 +955,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
       if (isk && irt_isaddr(kt)) {
 	k = ((int64_t)irt_toitype(irkey->t) << 47) | irkey[1].tv.u64;
       } else {
-	lua_assert(irt_ispri(kt) && !irt_isnil(kt));
+	lj_assertA(irt_ispri(kt) && !irt_isnil(kt), "bad HREF key type");
 	k = ~((int64_t)~irt_toitype(ir->t) << 47);
       }
       cmp64 = ra_allock(as, k, allow);
@@ -1012,7 +1023,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
 #endif
 
   /* Load main position relative to tab->node into dest. */
-  khash = isk ? ir_khash(irkey) : 1;
+  khash = isk ? ir_khash(as, irkey) : 1;
   if (khash == 0) {
     emit_tsi(as, MIPSI_AL, dest, tab, (int32_t)offsetof(GCtab, node));
   } else {
@@ -1020,7 +1031,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
     if (isk)
       tmphash = ra_allock(as, khash, allow);
     emit_dst(as, MIPSI_AADDU, dest, dest, tmp1);
-    lua_assert(sizeof(Node) == 24);
+    lj_assertA(sizeof(Node) == 24, "bad Node size");
     emit_dst(as, MIPSI_SUBU, tmp1, tmp2, tmp1);
     emit_dta(as, MIPSI_SLL, tmp1, tmp1, 3);
     emit_dta(as, MIPSI_SLL, tmp2, tmp1, 5);
@@ -1098,7 +1109,7 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
   Reg key = ra_scratch(as, allow);
   int64_t k;
 #endif
-  lua_assert(ofs % sizeof(Node) == 0);
+  lj_assertA(ofs % sizeof(Node) == 0, "unaligned HREFK slot");
   if (ofs > 32736) {
     idx = dest;
     rset_clear(allow, dest);
@@ -1127,7 +1138,7 @@ nolo:
   emit_tsi(as, MIPSI_LW, type, idx, kofs+(LJ_BE?0:4));
 #else
   if (irt_ispri(irkey->t)) {
-    lua_assert(!irt_isnil(irkey->t));
+    lj_assertA(!irt_isnil(irkey->t), "bad HREFK key type");
     k = ~((int64_t)~irt_toitype(irkey->t) << 47);
   } else if (irt_isnum(irkey->t)) {
     k = (int64_t)ir_knum(irkey)->u64;
@@ -1166,7 +1177,7 @@ static void asm_uref(ASMState *as, IRIns *ir)
 static void asm_fref(ASMState *as, IRIns *ir)
 {
   UNUSED(as); UNUSED(ir);
-  lua_assert(!ra_used(ir));
+  lj_assertA(!ra_used(ir), "unfused FREF");
 }
 
 static void asm_strref(ASMState *as, IRIns *ir)
@@ -1221,14 +1232,17 @@ static void asm_strref(ASMState *as, IRIns *ir)
 
 /* -- Loads and stores ---------------------------------------------------- */
 
-static MIPSIns asm_fxloadins(IRIns *ir)
+static MIPSIns asm_fxloadins(ASMState *as, IRIns *ir)
 {
+  UNUSED(as);
   switch (irt_type(ir->t)) {
   case IRT_I8: return MIPSI_LB;
   case IRT_U8: return MIPSI_LBU;
   case IRT_I16: return MIPSI_LH;
   case IRT_U16: return MIPSI_LHU;
-  case IRT_NUM: lua_assert(!LJ_SOFTFP32); if (!LJ_SOFTFP) return MIPSI_LDC1;
+  case IRT_NUM:
+    lj_assertA(!LJ_SOFTFP32, "unsplit FP op");
+    if (!LJ_SOFTFP) return MIPSI_LDC1;
   /* fallthrough */
   case IRT_FLOAT: if (!LJ_SOFTFP) return MIPSI_LWC1;
   /* fallthrough */
@@ -1236,12 +1250,15 @@ static MIPSIns asm_fxloadins(IRIns *ir)
   }
 }
 
-static MIPSIns asm_fxstoreins(IRIns *ir)
+static MIPSIns asm_fxstoreins(ASMState *as, IRIns *ir)
 {
+  UNUSED(as);
   switch (irt_type(ir->t)) {
   case IRT_I8: case IRT_U8: return MIPSI_SB;
   case IRT_I16: case IRT_U16: return MIPSI_SH;
-  case IRT_NUM: lua_assert(!LJ_SOFTFP32); if (!LJ_SOFTFP) return MIPSI_SDC1;
+  case IRT_NUM:
+    lj_assertA(!LJ_SOFTFP32, "unsplit FP op");
+    if (!LJ_SOFTFP) return MIPSI_SDC1;
   /* fallthrough */
   case IRT_FLOAT: if (!LJ_SOFTFP) return MIPSI_SWC1;
   /* fallthrough */
@@ -1252,10 +1269,10 @@ static MIPSIns asm_fxstoreins(IRIns *ir)
 static void asm_fload(ASMState *as, IRIns *ir)
 {
   Reg dest = ra_dest(as, ir, RSET_GPR);
-  MIPSIns mi = asm_fxloadins(ir);
+  MIPSIns mi = asm_fxloadins(as, ir);
   Reg idx;
   int32_t ofs;
-  if (ir->op1 == REF_NIL) {
+  if (ir->op1 == REF_NIL) {  /* FLOAD from GG_State with offset. */
     idx = RID_JGL;
     ofs = (ir->op2 << 2) - 32768 - GG_OFS(g);
   } else {
@@ -1269,7 +1286,7 @@ static void asm_fload(ASMState *as, IRIns *ir)
     }
     ofs = field_ofs[ir->op2];
   }
-  lua_assert(!irt_isfp(ir->t));
+  lj_assertA(!irt_isfp(ir->t), "bad FP FLOAD");
   emit_tsi(as, mi, dest, idx, ofs);
 }
 
@@ -1280,8 +1297,8 @@ static void asm_fstore(ASMState *as, IRIns *ir)
     IRIns *irf = IR(ir->op1);
     Reg idx = ra_alloc1(as, irf->op1, rset_exclude(RSET_GPR, src));
     int32_t ofs = field_ofs[irf->op2];
-    MIPSIns mi = asm_fxstoreins(ir);
-    lua_assert(!irt_isfp(ir->t));
+    MIPSIns mi = asm_fxstoreins(as, ir);
+    lj_assertA(!irt_isfp(ir->t), "bad FP FSTORE");
     emit_tsi(as, mi, src, idx, ofs);
   }
 }
@@ -1290,8 +1307,9 @@ static void asm_xload(ASMState *as, IRIns *ir)
 {
   Reg dest = ra_dest(as, ir,
     (!LJ_SOFTFP && irt_isfp(ir->t)) ? RSET_FPR : RSET_GPR);
-  lua_assert(LJ_TARGET_UNALIGNED || !(ir->op2 & IRXLOAD_UNALIGNED));
-  asm_fusexref(as, asm_fxloadins(ir), dest, ir->op1, RSET_GPR, 0);
+  lj_assertA(LJ_TARGET_UNALIGNED || !(ir->op2 & IRXLOAD_UNALIGNED),
+	     "unaligned XLOAD");
+  asm_fusexref(as, asm_fxloadins(as, ir), dest, ir->op1, RSET_GPR, 0);
 }
 
 static void asm_xstore_(ASMState *as, IRIns *ir, int32_t ofs)
@@ -1299,7 +1317,7 @@ static void asm_xstore_(ASMState *as, IRIns *ir, int32_t ofs)
   if (ir->r != RID_SINK) {
     Reg src = ra_alloc1z(as, ir->op2,
       (!LJ_SOFTFP && irt_isfp(ir->t)) ? RSET_FPR : RSET_GPR);
-    asm_fusexref(as, asm_fxstoreins(ir), src, ir->op1,
+    asm_fusexref(as, asm_fxstoreins(as, ir), src, ir->op1,
 		 rset_exclude(RSET_GPR, src), ofs);
   }
 }
@@ -1321,8 +1339,9 @@ static void asm_ahuvload(ASMState *as, IRIns *ir)
     }
   }
   if (ra_used(ir)) {
-    lua_assert((LJ_SOFTFP32 ? 0 : irt_isnum(ir->t)) ||
-	       irt_isint(ir->t) || irt_isaddr(ir->t));
+    lj_assertA((LJ_SOFTFP32 ? 0 : irt_isnum(ir->t)) ||
+	       irt_isint(ir->t) || irt_isaddr(ir->t),
+	       "bad load type %d", irt_type(ir->t));
     dest = ra_dest(as, ir, (!LJ_SOFTFP && irt_isnum(t)) ? RSET_FPR : allow);
     rset_clear(allow, dest);
 #if LJ_64
@@ -1427,10 +1446,13 @@ static void asm_sload(ASMState *as, IRIns *ir)
 #else
   int32_t ofs = 8*((int32_t)ir->op1-2);
 #endif
-  lua_assert(!(ir->op2 & IRSLOAD_PARENT));  /* Handled by asm_head_side(). */
-  lua_assert(irt_isguard(ir->t) || !(ir->op2 & IRSLOAD_TYPECHECK));
+  lj_assertA(!(ir->op2 & IRSLOAD_PARENT),
+	     "bad parent SLOAD");  /* Handled by asm_head_side(). */
+  lj_assertA(irt_isguard(ir->t) || !(ir->op2 & IRSLOAD_TYPECHECK),
+	     "inconsistent SLOAD variant");
 #if LJ_SOFTFP32
-  lua_assert(!(ir->op2 & IRSLOAD_CONVERT));  /* Handled by LJ_SOFTFP SPLIT. */
+  lj_assertA(!(ir->op2 & IRSLOAD_CONVERT),
+	     "unsplit SLOAD convert");  /* Handled by LJ_SOFTFP SPLIT. */
   if (hiop && ra_used(ir+1)) {
     type = ra_dest(as, ir+1, allow);
     rset_clear(allow, type);
@@ -1443,8 +1465,9 @@ static void asm_sload(ASMState *as, IRIns *ir)
   } else
 #endif
   if (ra_used(ir)) {
-    lua_assert((LJ_SOFTFP32 ? 0 : irt_isnum(ir->t)) ||
-	       irt_isint(ir->t) || irt_isaddr(ir->t));
+    lj_assertA((LJ_SOFTFP32 ? 0 : irt_isnum(ir->t)) ||
+	       irt_isint(ir->t) || irt_isaddr(ir->t),
+	       "bad SLOAD type %d", irt_type(ir->t));
     dest = ra_dest(as, ir, (!LJ_SOFTFP && irt_isnum(t)) ? RSET_FPR : allow);
     rset_clear(allow, dest);
     base = ra_alloc1(as, REF_BASE, allow);
@@ -1556,7 +1579,8 @@ static void asm_cnew(ASMState *as, IRIns *ir)
   RegSet allow = (RSET_GPR & ~RSET_SCRATCH);
   RegSet drop = RSET_SCRATCH;
   Reg tmp;
-  lua_assert(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL));
+  lj_assertA(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL),
+	     "bad CNEW/CNEWI operands");
 
   as->gcsteps++;
   if (ra_hasreg(ir->r))
@@ -1571,7 +1595,7 @@ static void asm_cnew(ASMState *as, IRIns *ir)
     int32_t ofs = sizeof(GCcdata);
     if (sz == 8) {
       ofs += 4;
-      lua_assert((ir+1)->o == IR_HIOP);
+      lj_assertA((ir+1)->o == IR_HIOP, "expected HIOP for CNEWI");
       if (LJ_LE) ir++;
     }
     for (;;) {
@@ -1585,7 +1609,7 @@ static void asm_cnew(ASMState *as, IRIns *ir)
     emit_tsi(as, sz == 8 ? MIPSI_SD : MIPSI_SW, ra_alloc1(as, ir->op2, allow),
 	     RID_RET, sizeof(GCcdata));
 #endif
-    lua_assert(sz == 4 || sz == 8);
+    lj_assertA(sz == 4 || sz == 8, "bad CNEWI size %d", sz);
   } else if (ir->op2 != REF_NIL) {  /* Create VLA/VLS/aligned cdata. */
     ci = &lj_ir_callinfo[IRCALL_lj_cdata_newv];
     args[0] = ASMREF_L;     /* lua_State *L */
@@ -1640,7 +1664,7 @@ static void asm_obar(ASMState *as, IRIns *ir)
   MCLabel l_end;
   Reg obj, val, tmp;
   /* No need for other object barriers (yet). */
-  lua_assert(IR(ir->op1)->o == IR_UREFC);
+  lj_assertA(IR(ir->op1)->o == IR_UREFC, "bad OBAR type");
   ra_evictset(as, RSET_SCRATCH);
   l_end = emit_label(as);
   args[0] = ASMREF_TMP1;  /* global_State *g */
@@ -1715,7 +1739,7 @@ static void asm_add(ASMState *as, IRIns *ir)
     Reg dest = ra_dest(as, ir, RSET_GPR);
     Reg right, left = ra_hintalloc(as, ir->op1, dest, RSET_GPR);
     if (irref_isk(ir->op2)) {
-      intptr_t k = get_kval(IR(ir->op2));
+      intptr_t k = get_kval(as, ir->op2);
       if (checki16(k)) {
 	emit_tsi(as, (LJ_64 && irt_is64(t)) ? MIPSI_DADDIU : MIPSI_ADDIU, dest,
 		 left, k);
@@ -1816,7 +1840,7 @@ static void asm_arithov(ASMState *as, IRIns *ir)
 {
   /* TODO MIPSR6: bovc/bnvc. Caveat: no delay slot to load RID_TMP. */
   Reg right, left, tmp, dest = ra_dest(as, ir, RSET_GPR);
-  lua_assert(!irt_is64(ir->t));
+  lj_assertA(!irt_is64(ir->t), "bad usage");
   if (irref_isk(ir->op2)) {
     int k = IR(ir->op2)->i;
     if (ir->o == IR_SUBOV) k = -k;
@@ -2003,7 +2027,7 @@ static void asm_bitop(ASMState *as, IRIns *ir, MIPSIns mi, MIPSIns mik)
   Reg dest = ra_dest(as, ir, RSET_GPR);
   Reg right, left = ra_hintalloc(as, ir->op1, dest, RSET_GPR);
   if (irref_isk(ir->op2)) {
-    intptr_t k = get_kval(IR(ir->op2));
+    intptr_t k = get_kval(as, ir->op2);
     if (checku16(k)) {
       emit_tsi(as, mik, dest, left, k);
       return;
@@ -2036,7 +2060,7 @@ static void asm_bitshift(ASMState *as, IRIns *ir, MIPSIns mi, MIPSIns mik)
 #define asm_bshl(as, ir)	asm_bitshift(as, ir, MIPSI_SLLV, MIPSI_SLL)
 #define asm_bshr(as, ir)	asm_bitshift(as, ir, MIPSI_SRLV, MIPSI_SRL)
 #define asm_bsar(as, ir)	asm_bitshift(as, ir, MIPSI_SRAV, MIPSI_SRA)
-#define asm_brol(as, ir)	lua_assert(0)
+#define asm_brol(as, ir)	lj_assertA(0, "unexpected BROL")
 
 static void asm_bror(ASMState *as, IRIns *ir)
 {
@@ -2228,13 +2252,13 @@ static void asm_comp(ASMState *as, IRIns *ir)
   } else {
     Reg right, left = ra_alloc1(as, ir->op1, RSET_GPR);
     if (op == IR_ABC) op = IR_UGT;
-    if ((op&4) == 0 && irref_isk(ir->op2) && get_kval(IR(ir->op2)) == 0) {
+    if ((op&4) == 0 && irref_isk(ir->op2) && get_kval(as, ir->op2) == 0) {
       MIPSIns mi = (op&2) ? ((op&1) ? MIPSI_BLEZ : MIPSI_BGTZ) :
 			    ((op&1) ? MIPSI_BLTZ : MIPSI_BGEZ);
       asm_guard(as, mi, left, 0);
     } else {
       if (irref_isk(ir->op2)) {
-	intptr_t k = get_kval(IR(ir->op2));
+	intptr_t k = get_kval(as, ir->op2);
 	if ((op&2)) k++;
 	if (checki16(k)) {
 	  asm_guard(as, (op&1) ? MIPSI_BNE : MIPSI_BEQ, RID_TMP, RID_ZERO);
@@ -2390,10 +2414,11 @@ static void asm_hiop(ASMState *as, IRIns *ir)
   case IR_CNEWI:
     /* Nothing to do here. Handled by lo op itself. */
     break;
-  default: lua_assert(0); break;
+  default: lj_assertA(0, "bad HIOP for op %d", (ir-1)->o); break;
   }
 #else
-  UNUSED(as); UNUSED(ir); lua_assert(0);  /* Unused without FFI. */
+  /* Unused on MIPS64 or without SOFTFP or FFI. */
+  UNUSED(as); UNUSED(ir); lj_assertA(0, "unexpected HIOP");
 #endif
 }
 
@@ -2462,7 +2487,8 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
 #if LJ_SOFTFP32
       Reg tmp;
       RegSet allow = rset_exclude(RSET_GPR, RID_BASE);
-      lua_assert(irref_isk(ref));  /* LJ_SOFTFP: must be a number constant. */
+      /* LJ_SOFTFP: must be a number constant. */
+      lj_assertA(irref_isk(ref), "unsplit FP op");
       tmp = ra_allock(as, (int32_t)ir_knum(ir)->u32.lo, allow);
       emit_tsi(as, MIPSI_SW, tmp, RID_BASE, ofs+(LJ_BE?4:0));
       if (rset_test(as->freeset, tmp+1)) allow = RID2RSET(tmp+1);
@@ -2479,7 +2505,8 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
 #if LJ_32
       RegSet allow = rset_exclude(RSET_GPR, RID_BASE);
       Reg type;
-      lua_assert(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t));
+      lj_assertA(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t),
+		 "restore of IR type %d", irt_type(ir->t));
       if (!irt_ispri(ir->t)) {
 	Reg src = ra_alloc1(as, ref, allow);
 	rset_clear(allow, src);
@@ -2502,7 +2529,7 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
     }
     checkmclim(as);
   }
-  lua_assert(map + nent == flinks);
+  lj_assertA(map + nent == flinks, "inconsistent frames in snapshot");
 }
 
 /* -- GC handling --------------------------------------------------------- */
@@ -2700,7 +2727,7 @@ void lj_asm_patchexit(jit_State *J, GCtrace *T, ExitNo exitno, MCode *target)
 	}
       } else if (p+1 == pe) {
 	/* Patch NOP after code for inverted loop branch. Use of J is ok. */
-	lua_assert(p[1] == MIPSI_NOP);
+	lj_assertJ(p[1] == MIPSI_NOP, "expected NOP");
 	p[1] = tjump;
 	*p = MIPSI_NOP;  /* Replace the load of the exit number. */
 	cstop = p+2;
diff --git a/src/lj_asm_ppc.h b/src/lj_asm_ppc.h
index 62a5c3e2..971dcc88 100644
--- a/src/lj_asm_ppc.h
+++ b/src/lj_asm_ppc.h
@@ -181,7 +181,7 @@ static void asm_fusexref(ASMState *as, PPCIns pi, Reg rt, IRRef ref,
 	return;
       }
     } else if (ir->o == IR_STRREF) {
-      lua_assert(ofs == 0);
+      lj_assertA(ofs == 0, "bad usage");
       ofs = (int32_t)sizeof(GCstr);
       if (irref_isk(ir->op2)) {
 	ofs += IR(ir->op2)->i;
@@ -268,7 +268,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
 #if !LJ_SOFTFP
       if (irt_isfp(ir->t)) {
 	if (fpr <= REGARG_LASTFPR) {
-	  lua_assert(rset_test(as->freeset, fpr));  /* Already evicted. */
+	  lj_assertA(rset_test(as->freeset, fpr),
+		     "reg %d not free", fpr);  /* Already evicted. */
 	  ra_leftov(as, fpr, ref);
 	  fpr++;
 	} else {
@@ -281,7 +282,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
 #endif
       {
 	if (gpr <= REGARG_LASTGPR) {
-	  lua_assert(rset_test(as->freeset, gpr));  /* Already evicted. */
+	  lj_assertA(rset_test(as->freeset, gpr),
+		     "reg %d not free", gpr);  /* Already evicted. */
 	  ra_leftov(as, gpr, ref);
 	  gpr++;
 	} else {
@@ -319,7 +321,7 @@ static void asm_setupresult(ASMState *as, IRIns *ir, const CCallInfo *ci)
     rset_clear(drop, (ir+1)->r);  /* Dest reg handled below. */
   ra_evictset(as, drop);  /* Evictions must be performed first. */
   if (ra_used(ir)) {
-    lua_assert(!irt_ispri(ir->t));
+    lj_assertA(!irt_ispri(ir->t), "PRI dest");
     if (!LJ_SOFTFP && irt_isfp(ir->t)) {
       if ((ci->flags & CCI_CASTU64)) {
 	/* Use spill slot or temp slots. */
@@ -431,14 +433,18 @@ static void asm_conv(ASMState *as, IRIns *ir)
   int stfp = (st == IRT_NUM || st == IRT_FLOAT);
 #endif
   IRRef lref = ir->op1;
-  lua_assert(!(irt_isint64(ir->t) ||
-	       (st == IRT_I64 || st == IRT_U64))); /* Handled by SPLIT. */
+  /* 64 bit integer conversions are handled by SPLIT. */
+  lj_assertA(!(irt_isint64(ir->t) || (st == IRT_I64 || st == IRT_U64)),
+	     "IR %04d has unsplit 64 bit type",
+	     (int)(ir - as->ir) - REF_BIAS);
 #if LJ_SOFTFP
   /* FP conversions are handled by SPLIT. */
-  lua_assert(!irt_isfp(ir->t) && !(st == IRT_NUM || st == IRT_FLOAT));
+  lj_assertA(!irt_isfp(ir->t) && !(st == IRT_NUM || st == IRT_FLOAT),
+	     "IR %04d has FP type",
+	     (int)(ir - as->ir) - REF_BIAS);
   /* Can't check for same types: SPLIT uses CONV int.int + BXOR for sfp NEG. */
 #else
-  lua_assert(irt_type(ir->t) != st);
+  lj_assertA(irt_type(ir->t) != st, "inconsistent types for CONV");
   if (irt_isfp(ir->t)) {
     Reg dest = ra_dest(as, ir, RSET_FPR);
     if (stfp) {  /* FP to FP conversion. */
@@ -467,7 +473,8 @@ static void asm_conv(ASMState *as, IRIns *ir)
   } else if (stfp) {  /* FP to integer conversion. */
     if (irt_isguard(ir->t)) {
       /* Checked conversions are only supported from number to int. */
-      lua_assert(irt_isint(ir->t) && st == IRT_NUM);
+      lj_assertA(irt_isint(ir->t) && st == IRT_NUM,
+		 "bad type for checked CONV");
       asm_tointg(as, ir, ra_alloc1(as, lref, RSET_FPR));
     } else {
       Reg dest = ra_dest(as, ir, RSET_GPR);
@@ -503,7 +510,7 @@ static void asm_conv(ASMState *as, IRIns *ir)
     Reg dest = ra_dest(as, ir, RSET_GPR);
     if (st >= IRT_I8 && st <= IRT_U16) {  /* Extend to 32 bit integer. */
       Reg left = ra_alloc1(as, ir->op1, RSET_GPR);
-      lua_assert(irt_isint(ir->t) || irt_isu32(ir->t));
+      lj_assertA(irt_isint(ir->t) || irt_isu32(ir->t), "bad type for CONV EXT");
       if ((ir->op2 & IRCONV_SEXT))
 	emit_as(as, st == IRT_I8 ? PPCI_EXTSB : PPCI_EXTSH, dest, left);
       else
@@ -699,7 +706,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
 	    (((char *)as->mcp-(char *)l_loop) & 0xffffu);
 
   /* Load main position relative to tab->node into dest. */
-  khash = isk ? ir_khash(irkey) : 1;
+  khash = isk ? ir_khash(as, irkey) : 1;
   if (khash == 0) {
     emit_tai(as, PPCI_LWZ, dest, tab, (int32_t)offsetof(GCtab, node));
   } else {
@@ -754,7 +761,7 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
   Reg node = ra_alloc1(as, ir->op1, RSET_GPR);
   Reg key = RID_NONE, type = RID_TMP, idx = node;
   RegSet allow = rset_exclude(RSET_GPR, node);
-  lua_assert(ofs % sizeof(Node) == 0);
+  lj_assertA(ofs % sizeof(Node) == 0, "unaligned HREFK slot");
   if (ofs > 32736) {
     idx = dest;
     rset_clear(allow, dest);
@@ -813,7 +820,7 @@ static void asm_uref(ASMState *as, IRIns *ir)
 static void asm_fref(ASMState *as, IRIns *ir)
 {
   UNUSED(as); UNUSED(ir);
-  lua_assert(!ra_used(ir));
+  lj_assertA(!ra_used(ir), "unfused FREF");
 }
 
 static void asm_strref(ASMState *as, IRIns *ir)
@@ -853,25 +860,27 @@ static void asm_strref(ASMState *as, IRIns *ir)
 
 /* -- Loads and stores ---------------------------------------------------- */
 
-static PPCIns asm_fxloadins(IRIns *ir)
+static PPCIns asm_fxloadins(ASMState *as, IRIns *ir)
 {
+  UNUSED(as);
   switch (irt_type(ir->t)) {
   case IRT_I8: return PPCI_LBZ;  /* Needs sign-extension. */
   case IRT_U8: return PPCI_LBZ;
   case IRT_I16: return PPCI_LHA;
   case IRT_U16: return PPCI_LHZ;
-  case IRT_NUM: lua_assert(!LJ_SOFTFP); return PPCI_LFD;
+  case IRT_NUM: lj_assertA(!LJ_SOFTFP, "unsplit FP op"); return PPCI_LFD;
   case IRT_FLOAT: if (!LJ_SOFTFP) return PPCI_LFS;
   default: return PPCI_LWZ;
   }
 }
 
-static PPCIns asm_fxstoreins(IRIns *ir)
+static PPCIns asm_fxstoreins(ASMState *as, IRIns *ir)
 {
+  UNUSED(as);
   switch (irt_type(ir->t)) {
   case IRT_I8: case IRT_U8: return PPCI_STB;
   case IRT_I16: case IRT_U16: return PPCI_STH;
-  case IRT_NUM: lua_assert(!LJ_SOFTFP); return PPCI_STFD;
+  case IRT_NUM: lj_assertA(!LJ_SOFTFP, "unsplit FP op"); return PPCI_STFD;
   case IRT_FLOAT: if (!LJ_SOFTFP) return PPCI_STFS;
   default: return PPCI_STW;
   }
@@ -880,10 +889,10 @@ static PPCIns asm_fxstoreins(IRIns *ir)
 static void asm_fload(ASMState *as, IRIns *ir)
 {
   Reg dest = ra_dest(as, ir, RSET_GPR);
-  PPCIns pi = asm_fxloadins(ir);
+  PPCIns pi = asm_fxloadins(as, ir);
   Reg idx;
   int32_t ofs;
-  if (ir->op1 == REF_NIL) {
+  if (ir->op1 == REF_NIL) {  /* FLOAD from GG_State with offset. */
     idx = RID_JGL;
     ofs = (ir->op2 << 2) - 32768;
   } else {
@@ -897,7 +906,7 @@ static void asm_fload(ASMState *as, IRIns *ir)
     }
     ofs = field_ofs[ir->op2];
   }
-  lua_assert(!irt_isi8(ir->t));
+  lj_assertA(!irt_isi8(ir->t), "unsupported FLOAD I8");
   emit_tai(as, pi, dest, idx, ofs);
 }
 
@@ -908,7 +917,7 @@ static void asm_fstore(ASMState *as, IRIns *ir)
     IRIns *irf = IR(ir->op1);
     Reg idx = ra_alloc1(as, irf->op1, rset_exclude(RSET_GPR, src));
     int32_t ofs = field_ofs[irf->op2];
-    PPCIns pi = asm_fxstoreins(ir);
+    PPCIns pi = asm_fxstoreins(as, ir);
     emit_tai(as, pi, src, idx, ofs);
   }
 }
@@ -917,10 +926,10 @@ static void asm_xload(ASMState *as, IRIns *ir)
 {
   Reg dest = ra_dest(as, ir,
     (!LJ_SOFTFP && irt_isfp(ir->t)) ? RSET_FPR : RSET_GPR);
-  lua_assert(!(ir->op2 & IRXLOAD_UNALIGNED));
+  lj_assertA(!(ir->op2 & IRXLOAD_UNALIGNED), "unaligned XLOAD");
   if (irt_isi8(ir->t))
     emit_as(as, PPCI_EXTSB, dest, dest);
-  asm_fusexref(as, asm_fxloadins(ir), dest, ir->op1, RSET_GPR, 0);
+  asm_fusexref(as, asm_fxloadins(as, ir), dest, ir->op1, RSET_GPR, 0);
 }
 
 static void asm_xstore_(ASMState *as, IRIns *ir, int32_t ofs)
@@ -936,7 +945,7 @@ static void asm_xstore_(ASMState *as, IRIns *ir, int32_t ofs)
   } else {
     Reg src = ra_alloc1(as, ir->op2,
       (!LJ_SOFTFP && irt_isfp(ir->t)) ? RSET_FPR : RSET_GPR);
-    asm_fusexref(as, asm_fxstoreins(ir), src, ir->op1,
+    asm_fusexref(as, asm_fxstoreins(as, ir), src, ir->op1,
 		 rset_exclude(RSET_GPR, src), ofs);
   }
 }
@@ -958,8 +967,9 @@ static void asm_ahuvload(ASMState *as, IRIns *ir)
     ofs = 0;
   }
   if (ra_used(ir)) {
-    lua_assert((LJ_SOFTFP ? 0 : irt_isnum(ir->t)) ||
-	       irt_isint(ir->t) || irt_isaddr(ir->t));
+    lj_assertA((LJ_SOFTFP ? 0 : irt_isnum(ir->t)) ||
+	       irt_isint(ir->t) || irt_isaddr(ir->t),
+	       "bad load type %d", irt_type(ir->t));
     if (LJ_SOFTFP || !irt_isnum(t)) ofs = 0;
     dest = ra_dest(as, ir, (!LJ_SOFTFP && irt_isnum(t)) ? RSET_FPR : allow);
     rset_clear(allow, dest);
@@ -1042,12 +1052,16 @@ static void asm_sload(ASMState *as, IRIns *ir)
   int hiop = (LJ_SOFTFP && (ir+1)->o == IR_HIOP);
   if (hiop)
     t.irt = IRT_NUM;
-  lua_assert(!(ir->op2 & IRSLOAD_PARENT));  /* Handled by asm_head_side(). */
-  lua_assert(irt_isguard(ir->t) || !(ir->op2 & IRSLOAD_TYPECHECK));
-  lua_assert(LJ_DUALNUM ||
-	     !irt_isint(t) || (ir->op2 & (IRSLOAD_CONVERT|IRSLOAD_FRAME)));
+  lj_assertA(!(ir->op2 & IRSLOAD_PARENT),
+	     "bad parent SLOAD");  /* Handled by asm_head_side(). */
+  lj_assertA(irt_isguard(ir->t) || !(ir->op2 & IRSLOAD_TYPECHECK),
+	     "inconsistent SLOAD variant");
+  lj_assertA(LJ_DUALNUM ||
+	     !irt_isint(t) || (ir->op2 & (IRSLOAD_CONVERT|IRSLOAD_FRAME)),
+	     "bad SLOAD type");
 #if LJ_SOFTFP
-  lua_assert(!(ir->op2 & IRSLOAD_CONVERT));  /* Handled by LJ_SOFTFP SPLIT. */
+  lj_assertA(!(ir->op2 & IRSLOAD_CONVERT),
+	     "unsplit SLOAD convert");  /* Handled by LJ_SOFTFP SPLIT. */
   if (hiop && ra_used(ir+1)) {
     type = ra_dest(as, ir+1, allow);
     rset_clear(allow, type);
@@ -1060,7 +1074,8 @@ static void asm_sload(ASMState *as, IRIns *ir)
   } else
 #endif
   if (ra_used(ir)) {
-    lua_assert(irt_isnum(t) || irt_isint(t) || irt_isaddr(t));
+    lj_assertA(irt_isnum(t) || irt_isint(t) || irt_isaddr(t),
+	       "bad SLOAD type %d", irt_type(ir->t));
     dest = ra_dest(as, ir, (!LJ_SOFTFP && irt_isnum(t)) ? RSET_FPR : allow);
     rset_clear(allow, dest);
     base = ra_alloc1(as, REF_BASE, allow);
@@ -1127,7 +1142,8 @@ static void asm_cnew(ASMState *as, IRIns *ir)
   const CCallInfo *ci = &lj_ir_callinfo[IRCALL_lj_mem_newgco];
   IRRef args[4];
   RegSet drop = RSET_SCRATCH;
-  lua_assert(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL));
+  lj_assertA(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL),
+	     "bad CNEW/CNEWI operands");
 
   as->gcsteps++;
   if (ra_hasreg(ir->r))
@@ -1140,10 +1156,10 @@ static void asm_cnew(ASMState *as, IRIns *ir)
   if (ir->o == IR_CNEWI) {
     RegSet allow = (RSET_GPR & ~RSET_SCRATCH);
     int32_t ofs = sizeof(GCcdata);
-    lua_assert(sz == 4 || sz == 8);
+    lj_assertA(sz == 4 || sz == 8, "bad CNEWI size %d", sz);
     if (sz == 8) {
       ofs += 4;
-      lua_assert((ir+1)->o == IR_HIOP);
+      lj_assertA((ir+1)->o == IR_HIOP, "expected HIOP for CNEWI");
     }
     for (;;) {
       Reg r = ra_alloc1(as, ir->op2, allow);
@@ -1190,7 +1206,7 @@ static void asm_tbar(ASMState *as, IRIns *ir)
   emit_tai(as, PPCI_STW, link, tab, (int32_t)offsetof(GCtab, gclist));
   emit_tai(as, PPCI_STB, mark, tab, (int32_t)offsetof(GCtab, marked));
   emit_setgl(as, tab, gc.grayagain);
-  lua_assert(LJ_GC_BLACK == 0x04);
+  lj_assertA(LJ_GC_BLACK == 0x04, "bad LJ_GC_BLACK");
   emit_rot(as, PPCI_RLWINM, mark, mark, 0, 30, 28);  /* Clear black bit. */
   emit_getgl(as, link, gc.grayagain);
   emit_condbranch(as, PPCI_BC|PPCF_Y, CC_EQ, l_end);
@@ -1205,7 +1221,7 @@ static void asm_obar(ASMState *as, IRIns *ir)
   MCLabel l_end;
   Reg obj, val, tmp;
   /* No need for other object barriers (yet). */
-  lua_assert(IR(ir->op1)->o == IR_UREFC);
+  lj_assertA(IR(ir->op1)->o == IR_UREFC, "bad OBAR type");
   ra_evictset(as, RSET_SCRATCH);
   l_end = emit_label(as);
   args[0] = ASMREF_TMP1;  /* global_State *g */
@@ -1676,7 +1692,7 @@ static void asm_bitshift(ASMState *as, IRIns *ir, PPCIns pi, PPCIns pik)
 #define asm_brol(as, ir) \
   asm_bitshift(as, ir, PPCI_RLWNM|PPCF_MB(0)|PPCF_ME(31), \
 		       PPCI_RLWINM|PPCF_MB(0)|PPCF_ME(31))
-#define asm_bror(as, ir)	lua_assert(0)
+#define asm_bror(as, ir)	lj_assertA(0, "unexpected BROR")
 
 #if LJ_SOFTFP
 static void asm_sfpmin_max(ASMState *as, IRIns *ir)
@@ -1951,10 +1967,11 @@ static void asm_hiop(ASMState *as, IRIns *ir)
   case IR_CNEWI:
     /* Nothing to do here. Handled by lo op itself. */
     break;
-  default: lua_assert(0); break;
+  default: lj_assertA(0, "bad HIOP for op %d", (ir-1)->o); break;
   }
 #else
-  UNUSED(as); UNUSED(ir); lua_assert(0);  /* Unused without FFI. */
+  /* Unused without SOFTFP or FFI. */
+  UNUSED(as); UNUSED(ir); lj_assertA(0, "unexpected HIOP");
 #endif
 }
 
@@ -2014,7 +2031,8 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
 #if LJ_SOFTFP
       Reg tmp;
       RegSet allow = rset_exclude(RSET_GPR, RID_BASE);
-      lua_assert(irref_isk(ref));  /* LJ_SOFTFP: must be a number constant. */
+      /* LJ_SOFTFP: must be a number constant. */
+      lj_assertA(irref_isk(ref), "unsplit FP op");
       tmp = ra_allock(as, (int32_t)ir_knum(ir)->u32.lo, allow);
       emit_tai(as, PPCI_STW, tmp, RID_BASE, ofs+(LJ_BE?4:0));
       if (rset_test(as->freeset, tmp+1)) allow = RID2RSET(tmp+1);
@@ -2027,7 +2045,8 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
     } else {
       Reg type;
       RegSet allow = rset_exclude(RSET_GPR, RID_BASE);
-      lua_assert(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t));
+      lj_assertA(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t),
+		 "restore of IR type %d", irt_type(ir->t));
       if (!irt_ispri(ir->t)) {
 	Reg src = ra_alloc1(as, ref, allow);
 	rset_clear(allow, src);
@@ -2047,7 +2066,7 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
     }
     checkmclim(as);
   }
-  lua_assert(map + nent == flinks);
+  lj_assertA(map + nent == flinks, "inconsistent frames in snapshot");
 }
 
 /* -- GC handling --------------------------------------------------------- */
@@ -2145,7 +2164,7 @@ static void asm_tail_fixup(ASMState *as, TraceNo lnk)
     as->mctop = p;
   } else {
     /* Patch stack adjustment. */
-    lua_assert(checki16(CFRAME_SIZE+spadj));
+    lj_assertA(checki16(CFRAME_SIZE+spadj), "stack adjustment out of range");
     p[-3] = PPCI_ADDI | PPCF_T(RID_TMP) | PPCF_A(RID_SP) | (CFRAME_SIZE+spadj);
     p[-2] = PPCI_STWU | PPCF_T(RID_TMP) | PPCF_A(RID_SP) | spadj;
   }
@@ -2222,14 +2241,16 @@ void lj_asm_patchexit(jit_State *J, GCtrace *T, ExitNo exitno, MCode *target)
     } else if ((ins & 0xfc000000u) == PPCI_B &&
 	       ((ins ^ ((char *)px-(char *)p)) & 0x03ffffffu) == 0) {
       ptrdiff_t delta = (char *)target - (char *)p;
-      lua_assert(((delta + 0x02000000) >> 26) == 0);
+      lj_assertJ(((delta + 0x02000000) >> 26) == 0,
+		 "branch target out of range");
       *p = PPCI_B | ((uint32_t)delta & 0x03ffffffu);
       if (!cstart) cstart = p;
     }
   }
   {  /* Always patch long-range branch in exit stub itself. */
     ptrdiff_t delta = (char *)target - (char *)px - clearso;
-    lua_assert(((delta + 0x02000000) >> 26) == 0);
+    lj_assertJ(((delta + 0x02000000) >> 26) == 0,
+	       "branch target out of range");
     *px = PPCI_B | ((uint32_t)delta & 0x03ffffffu);
   }
   if (!cstart) cstart = px;
diff --git a/src/lj_asm_x86.h b/src/lj_asm_x86.h
index 5f5fe3cf..74f2d853 100644
--- a/src/lj_asm_x86.h
+++ b/src/lj_asm_x86.h
@@ -31,7 +31,7 @@ static MCode *asm_exitstub_gen(ASMState *as, ExitNo group)
 #endif
   /* Jump to exit handler which fills in the ExitState. */
   *mxp++ = XI_JMP; mxp += 4;
-  *((int32_t *)(mxp-4)) = jmprel(mxp, (MCode *)(void *)lj_vm_exit_handler);
+  *((int32_t *)(mxp-4)) = jmprel(as->J, mxp, (MCode *)(void *)lj_vm_exit_handler);
   /* Commit the code for this group (even if assembly fails later on). */
   lj_mcode_commitbot(as->J, mxp);
   as->mcbot = mxp;
@@ -60,7 +60,7 @@ static void asm_guardcc(ASMState *as, int cc)
   MCode *p = as->mcp;
   if (LJ_UNLIKELY(p == as->invmcp)) {
     as->loopinv = 1;
-    *(int32_t *)(p+1) = jmprel(p+5, target);
+    *(int32_t *)(p+1) = jmprel(as->J, p+5, target);
     target = p;
     cc ^= 1;
     if (as->realign) {
@@ -131,7 +131,7 @@ static IRRef asm_fuseabase(ASMState *as, IRRef ref)
   as->mrm.ofs = 0;
   if (irb->o == IR_FLOAD) {
     IRIns *ira = IR(irb->op1);
-    lua_assert(irb->op2 == IRFL_TAB_ARRAY);
+    lj_assertA(irb->op2 == IRFL_TAB_ARRAY, "expected FLOAD TAB_ARRAY");
     /* We can avoid the FLOAD of t->array for colocated arrays. */
     if (ira->o == IR_TNEW && ira->op1 <= LJ_MAX_COLOSIZE &&
 	!neverfuse(as) && noconflict(as, irb->op1, IR_NEWREF, 1)) {
@@ -150,7 +150,7 @@ static IRRef asm_fuseabase(ASMState *as, IRRef ref)
 static void asm_fusearef(ASMState *as, IRIns *ir, RegSet allow)
 {
   IRIns *irx;
-  lua_assert(ir->o == IR_AREF);
+  lj_assertA(ir->o == IR_AREF, "expected AREF");
   as->mrm.base = (uint8_t)ra_alloc1(as, asm_fuseabase(as, ir->op1), allow);
   irx = IR(ir->op2);
   if (irref_isk(ir->op2)) {
@@ -217,8 +217,9 @@ static void asm_fuseahuref(ASMState *as, IRRef ref, RegSet allow)
       }
       break;
     default:
-      lua_assert(ir->o == IR_HREF || ir->o == IR_NEWREF || ir->o == IR_UREFO ||
-		 ir->o == IR_KKPTR);
+      lj_assertA(ir->o == IR_HREF || ir->o == IR_NEWREF || ir->o == IR_UREFO ||
+		 ir->o == IR_KKPTR,
+		 "bad IR op %d", ir->o);
       break;
     }
   }
@@ -230,9 +231,10 @@ static void asm_fuseahuref(ASMState *as, IRRef ref, RegSet allow)
 /* Fuse FLOAD/FREF reference into memory operand. */
 static void asm_fusefref(ASMState *as, IRIns *ir, RegSet allow)
 {
-  lua_assert(ir->o == IR_FLOAD || ir->o == IR_FREF);
+  lj_assertA(ir->o == IR_FLOAD || ir->o == IR_FREF,
+	     "bad IR op %d", ir->o);
   as->mrm.idx = RID_NONE;
-  if (ir->op1 == REF_NIL) {
+  if (ir->op1 == REF_NIL) {  /* FLOAD from GG_State with offset. */
 #if LJ_GC64
     as->mrm.ofs = (int32_t)(ir->op2 << 2) - GG_OFS(dispatch);
     as->mrm.base = RID_DISPATCH;
@@ -271,7 +273,7 @@ static void asm_fusefref(ASMState *as, IRIns *ir, RegSet allow)
 static void asm_fusestrref(ASMState *as, IRIns *ir, RegSet allow)
 {
   IRIns *irr;
-  lua_assert(ir->o == IR_STRREF);
+  lj_assertA(ir->o == IR_STRREF, "bad IR op %d", ir->o);
   as->mrm.base = as->mrm.idx = RID_NONE;
   as->mrm.scale = XM_SCALE1;
   as->mrm.ofs = sizeof(GCstr);
@@ -378,9 +380,10 @@ static Reg asm_fuseloadk64(ASMState *as, IRIns *ir)
 	     checki32(mctopofs(as, k)) && checki32(mctopofs(as, k+1))) {
     as->mrm.ofs = (int32_t)mcpofs(as, k);
     as->mrm.base = RID_RIP;
-  } else {
+  } else {  /* Intern 64 bit constant at bottom of mcode. */
     if (ir->i) {
-      lua_assert(*k == *(uint64_t*)(as->mctop - ir->i));
+      lj_assertA(*k == *(uint64_t*)(as->mctop - ir->i),
+		 "bad interned 64 bit constant");
     } else {
       while ((uintptr_t)as->mcbot & 7) *as->mcbot++ = XI_INT3;
       *(uint64_t*)as->mcbot = *k;
@@ -420,12 +423,12 @@ static Reg asm_fuseload(ASMState *as, IRRef ref, RegSet allow)
   }
   if (ir->o == IR_KNUM) {
     RegSet avail = as->freeset & ~as->modset & RSET_FPR;
-    lua_assert(allow != RSET_EMPTY);
+    lj_assertA(allow != RSET_EMPTY, "no register allowed");
     if (!(avail & (avail-1)))  /* Fuse if less than two regs available. */
       return asm_fuseloadk64(as, ir);
   } else if (ref == REF_BASE || ir->o == IR_KINT64) {
     RegSet avail = as->freeset & ~as->modset & RSET_GPR;
-    lua_assert(allow != RSET_EMPTY);
+    lj_assertA(allow != RSET_EMPTY, "no register allowed");
     if (!(avail & (avail-1))) {  /* Fuse if less than two regs available. */
       if (ref == REF_BASE) {
 #if LJ_GC64
@@ -606,7 +609,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
 #endif
 	  emit_loadi(as, r, ir->i);
       } else {
-	lua_assert(rset_test(as->freeset, r));  /* Must have been evicted. */
+	/* Must have been evicted. */
+	lj_assertA(rset_test(as->freeset, r), "reg %d not free", r);
 	if (ra_hasreg(ir->r)) {
 	  ra_noweak(as, ir->r);
 	  emit_movrr(as, ir, r, ir->r);
@@ -615,7 +619,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
 	}
       }
     } else if (irt_isfp(ir->t)) {  /* FP argument is on stack. */
-      lua_assert(!(irt_isfloat(ir->t) && irref_isk(ref)));  /* No float k. */
+      lj_assertA(!(irt_isfloat(ir->t) && irref_isk(ref)),
+		 "unexpected float constant");
       if (LJ_32 && (ofs & 4) && irref_isk(ref)) {
 	/* Split stores for unaligned FP consts. */
 	emit_movmroi(as, RID_ESP, ofs, (int32_t)ir_knum(ir)->u32.lo);
@@ -691,7 +696,7 @@ static void asm_setupresult(ASMState *as, IRIns *ir, const CCallInfo *ci)
       ra_destpair(as, ir);
 #endif
     } else {
-      lua_assert(!irt_ispri(ir->t));
+      lj_assertA(!irt_ispri(ir->t), "PRI dest");
       ra_destreg(as, ir, RID_RET);
     }
   } else if (LJ_32 && irt_isfp(ir->t) && !(ci->flags & CCI_CASTU64)) {
@@ -810,8 +815,10 @@ static void asm_conv(ASMState *as, IRIns *ir)
   int st64 = (st == IRT_I64 || st == IRT_U64 || (LJ_64 && st == IRT_P64));
   int stfp = (st == IRT_NUM || st == IRT_FLOAT);
   IRRef lref = ir->op1;
-  lua_assert(irt_type(ir->t) != st);
-  lua_assert(!(LJ_32 && (irt_isint64(ir->t) || st64)));  /* Handled by SPLIT. */
+  lj_assertA(irt_type(ir->t) != st, "inconsistent types for CONV");
+  lj_assertA(!(LJ_32 && (irt_isint64(ir->t) || st64)),
+	     "IR %04d has unsplit 64 bit type",
+	     (int)(ir - as->ir) - REF_BIAS);
   if (irt_isfp(ir->t)) {
     Reg dest = ra_dest(as, ir, RSET_FPR);
     if (stfp) {  /* FP to FP conversion. */
@@ -847,7 +854,8 @@ static void asm_conv(ASMState *as, IRIns *ir)
   } else if (stfp) {  /* FP to integer conversion. */
     if (irt_isguard(ir->t)) {
       /* Checked conversions are only supported from number to int. */
-      lua_assert(irt_isint(ir->t) && st == IRT_NUM);
+      lj_assertA(irt_isint(ir->t) && st == IRT_NUM,
+		 "bad type for checked CONV");
       asm_tointg(as, ir, ra_alloc1(as, lref, RSET_FPR));
     } else {
       Reg dest = ra_dest(as, ir, RSET_GPR);
@@ -882,7 +890,7 @@ static void asm_conv(ASMState *as, IRIns *ir)
     Reg left, dest = ra_dest(as, ir, RSET_GPR);
     RegSet allow = RSET_GPR;
     x86Op op;
-    lua_assert(irt_isint(ir->t) || irt_isu32(ir->t));
+    lj_assertA(irt_isint(ir->t) || irt_isu32(ir->t), "bad type for CONV EXT");
     if (st == IRT_I8) {
       op = XO_MOVSXb; allow = RSET_GPR8; dest |= FORCE_REX;
     } else if (st == IRT_U8) {
@@ -953,7 +961,7 @@ static void asm_conv_fp_int64(ASMState *as, IRIns *ir)
     emit_sjcc(as, CC_NS, l_end);
     emit_rr(as, XO_TEST, hi, hi);  /* Check if u64 >= 2^63. */
   } else {
-    lua_assert(((ir-1)->op2 & IRCONV_SRCMASK) == IRT_I64);
+    lj_assertA(((ir-1)->op2 & IRCONV_SRCMASK) == IRT_I64, "bad type for CONV");
   }
   emit_rmro(as, XO_FILDq, XOg_FILDq, RID_ESP, 0);
   /* NYI: Avoid narrow-to-wide store-to-load forwarding stall. */
@@ -967,8 +975,8 @@ static void asm_conv_int64_fp(ASMState *as, IRIns *ir)
   IRType st = (IRType)((ir-1)->op2 & IRCONV_SRCMASK);
   IRType dt = (((ir-1)->op2 & IRCONV_DSTMASK) >> IRCONV_DSH);
   Reg lo, hi;
-  lua_assert(st == IRT_NUM || st == IRT_FLOAT);
-  lua_assert(dt == IRT_I64 || dt == IRT_U64);
+  lj_assertA(st == IRT_NUM || st == IRT_FLOAT, "bad type for CONV");
+  lj_assertA(dt == IRT_I64 || dt == IRT_U64, "bad type for CONV");
   hi = ra_dest(as, ir, RSET_GPR);
   lo = ra_dest(as, ir-1, rset_exclude(RSET_GPR, hi));
   if (ra_used(ir-1)) emit_rmro(as, XO_MOV, lo, RID_ESP, 0);
@@ -1180,13 +1188,13 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
       emit_rmro(as, XO_CMP, tmp|REX_64, dest, offsetof(Node, key.u64));
     }
   } else {
-    lua_assert(irt_ispri(kt) && !irt_isnil(kt));
+    lj_assertA(irt_ispri(kt) && !irt_isnil(kt), "bad HREF key type");
     emit_u32(as, (irt_toitype(kt)<<15)|0x7fff);
     emit_rmro(as, XO_ARITHi, XOg_CMP, dest, offsetof(Node, key.it));
 #else
   } else {
     if (!irt_ispri(kt)) {
-      lua_assert(irt_isaddr(kt));
+      lj_assertA(irt_isaddr(kt), "bad HREF key type");
       if (isk)
 	emit_gmroi(as, XG_ARITHi(XOg_CMP), dest, offsetof(Node, key.gcr),
 		   ptr2addr(ir_kgc(irkey)));
@@ -1194,7 +1202,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
 	emit_rmro(as, XO_CMP, key, dest, offsetof(Node, key.gcr));
       emit_sjcc(as, CC_NE, l_next);
     }
-    lua_assert(!irt_isnil(kt));
+    lj_assertA(!irt_isnil(kt), "bad HREF key type");
     emit_i8(as, irt_toitype(kt));
     emit_rmro(as, XO_ARITHi8, XOg_CMP, dest, offsetof(Node, key.it));
 #endif
@@ -1209,7 +1217,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
 #endif
 
   /* Load main position relative to tab->node into dest. */
-  khash = isk ? ir_khash(irkey) : 1;
+  khash = isk ? ir_khash(as, irkey) : 1;
   if (khash == 0) {
     emit_rmro(as, XO_MOV, dest|REX_GC64, tab, offsetof(GCtab, node));
   } else {
@@ -1276,7 +1284,7 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
 #if !LJ_64
   MCLabel l_exit;
 #endif
-  lua_assert(ofs % sizeof(Node) == 0);
+  lj_assertA(ofs % sizeof(Node) == 0, "unaligned HREFK slot");
   if (ra_hasreg(dest)) {
     if (ofs != 0) {
       if (dest == node && !(as->flags & JIT_F_LEA_AGU))
@@ -1293,7 +1301,8 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
     Reg key = ra_scratch(as, rset_exclude(RSET_GPR, node));
     emit_rmro(as, XO_CMP, key|REX_64, node,
 	       ofs + (int32_t)offsetof(Node, key.u64));
-    lua_assert(irt_isnum(irkey->t) || irt_isgcv(irkey->t));
+    lj_assertA(irt_isnum(irkey->t) || irt_isgcv(irkey->t),
+	       "bad HREFK key type");
     /* Assumes -0.0 is already canonicalized to +0.0. */
     emit_loadu64(as, key, irt_isnum(irkey->t) ? ir_knum(irkey)->u64 :
 #if LJ_GC64
@@ -1304,7 +1313,7 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
 			  (uint64_t)(uint32_t)ptr2addr(ir_kgc(irkey)));
 #endif
   } else {
-    lua_assert(!irt_isnil(irkey->t));
+    lj_assertA(!irt_isnil(irkey->t), "bad HREFK key type");
 #if LJ_GC64
     emit_i32(as, (irt_toitype(irkey->t)<<15)|0x7fff);
     emit_rmro(as, XO_ARITHi, XOg_CMP, node,
@@ -1328,13 +1337,13 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
 	       (int32_t)ir_knum(irkey)->u32.hi);
   } else {
     if (!irt_ispri(irkey->t)) {
-      lua_assert(irt_isgcv(irkey->t));
+      lj_assertA(irt_isgcv(irkey->t), "bad HREFK key type");
       emit_gmroi(as, XG_ARITHi(XOg_CMP), node,
 		 ofs + (int32_t)offsetof(Node, key.gcr),
 		 ptr2addr(ir_kgc(irkey)));
       emit_sjcc(as, CC_NE, l_exit);
     }
-    lua_assert(!irt_isnil(irkey->t));
+    lj_assertA(!irt_isnil(irkey->t), "bad HREFK key type");
     emit_i8(as, irt_toitype(irkey->t));
     emit_rmro(as, XO_ARITHi8, XOg_CMP, node,
 	      ofs + (int32_t)offsetof(Node, key.it));
@@ -1407,7 +1416,8 @@ static void asm_fxload(ASMState *as, IRIns *ir)
     if (LJ_64 && irt_is64(ir->t))
       dest |= REX_64;
     else
-      lua_assert(irt_isint(ir->t) || irt_isu32(ir->t) || irt_isaddr(ir->t));
+      lj_assertA(irt_isint(ir->t) || irt_isu32(ir->t) || irt_isaddr(ir->t),
+		 "unsplit 64 bit load");
     xo = XO_MOV;
     break;
   }
@@ -1452,13 +1462,16 @@ static void asm_fxstore(ASMState *as, IRIns *ir)
     case IRT_NUM: xo = XO_MOVSDto; break;
     case IRT_FLOAT: xo = XO_MOVSSto; break;
 #if LJ_64 && !LJ_GC64
-    case IRT_LIGHTUD: lua_assert(0);  /* NYI: mask 64 bit lightuserdata. */
+    case IRT_LIGHTUD:
+      /* NYI: mask 64 bit lightuserdata. */
+      lj_assertA(0, "store of lightuserdata");
 #endif
     default:
       if (LJ_64 && irt_is64(ir->t))
 	src |= REX_64;
       else
-	lua_assert(irt_isint(ir->t) || irt_isu32(ir->t) || irt_isaddr(ir->t));
+	lj_assertA(irt_isint(ir->t) || irt_isu32(ir->t) || irt_isaddr(ir->t),
+		   "unsplit 64 bit store");
       xo = XO_MOVto;
       break;
     }
@@ -1472,8 +1485,8 @@ static void asm_fxstore(ASMState *as, IRIns *ir)
       emit_i8(as, k);
       emit_mrm(as, XO_MOVmib, 0, RID_MRM);
     } else {
-      lua_assert(irt_is64(ir->t) || irt_isint(ir->t) || irt_isu32(ir->t) ||
-		 irt_isaddr(ir->t));
+      lj_assertA(irt_is64(ir->t) || irt_isint(ir->t) || irt_isu32(ir->t) ||
+		 irt_isaddr(ir->t), "bad store type");
       emit_i32(as, k);
       emit_mrm(as, XO_MOVmi, REX_64IR(ir, 0), RID_MRM);
     }
@@ -1508,8 +1521,9 @@ static void asm_ahuvload(ASMState *as, IRIns *ir)
 #if LJ_GC64
   Reg tmp = RID_NONE;
 #endif
-  lua_assert(irt_isnum(ir->t) || irt_ispri(ir->t) || irt_isaddr(ir->t) ||
-	     (LJ_DUALNUM && irt_isint(ir->t)));
+  lj_assertA(irt_isnum(ir->t) || irt_ispri(ir->t) || irt_isaddr(ir->t) ||
+	     (LJ_DUALNUM && irt_isint(ir->t)),
+	     "bad load type %d", irt_type(ir->t));
 #if LJ_64 && !LJ_GC64
   if (irt_islightud(ir->t)) {
     Reg dest = asm_load_lightud64(as, ir, 1);
@@ -1556,7 +1570,8 @@ static void asm_ahuvload(ASMState *as, IRIns *ir)
   as->mrm.ofs += 4;
   asm_guardcc(as, irt_isnum(ir->t) ? CC_AE : CC_NE);
   if (LJ_64 && irt_type(ir->t) >= IRT_NUM) {
-    lua_assert(irt_isinteger(ir->t) || irt_isnum(ir->t));
+    lj_assertA(irt_isinteger(ir->t) || irt_isnum(ir->t),
+	       "bad load type %d", irt_type(ir->t));
 #if LJ_GC64
     emit_u32(as, LJ_TISNUM << 15);
 #else
@@ -1638,13 +1653,14 @@ static void asm_ahustore(ASMState *as, IRIns *ir)
 #endif
       emit_mrm(as, XO_MOVto, src, RID_MRM);
     } else if (!irt_ispri(irr->t)) {
-      lua_assert(irt_isaddr(ir->t) || (LJ_DUALNUM && irt_isinteger(ir->t)));
+      lj_assertA(irt_isaddr(ir->t) || (LJ_DUALNUM && irt_isinteger(ir->t)),
+		 "bad store type");
       emit_i32(as, irr->i);
       emit_mrm(as, XO_MOVmi, 0, RID_MRM);
     }
     as->mrm.ofs += 4;
 #if LJ_GC64
-    lua_assert(LJ_DUALNUM && irt_isinteger(ir->t));
+    lj_assertA(LJ_DUALNUM && irt_isinteger(ir->t), "bad store type");
     emit_i32(as, LJ_TNUMX << 15);
 #else
     emit_i32(as, (int32_t)irt_toitype(ir->t));
@@ -1659,10 +1675,13 @@ static void asm_sload(ASMState *as, IRIns *ir)
 		(!LJ_FR2 && (ir->op2 & IRSLOAD_FRAME) ? 4 : 0);
   IRType1 t = ir->t;
   Reg base;
-  lua_assert(!(ir->op2 & IRSLOAD_PARENT));  /* Handled by asm_head_side(). */
-  lua_assert(irt_isguard(t) || !(ir->op2 & IRSLOAD_TYPECHECK));
-  lua_assert(LJ_DUALNUM ||
-	     !irt_isint(t) || (ir->op2 & (IRSLOAD_CONVERT|IRSLOAD_FRAME)));
+  lj_assertA(!(ir->op2 & IRSLOAD_PARENT),
+	     "bad parent SLOAD"); /* Handled by asm_head_side(). */
+  lj_assertA(irt_isguard(t) || !(ir->op2 & IRSLOAD_TYPECHECK),
+	     "inconsistent SLOAD variant");
+  lj_assertA(LJ_DUALNUM ||
+	     !irt_isint(t) || (ir->op2 & (IRSLOAD_CONVERT|IRSLOAD_FRAME)),
+	     "bad SLOAD type");
   if ((ir->op2 & IRSLOAD_CONVERT) && irt_isguard(t) && irt_isint(t)) {
     Reg left = ra_scratch(as, RSET_FPR);
     asm_tointg(as, ir, left);  /* Frees dest reg. Do this before base alloc. */
@@ -1682,7 +1701,8 @@ static void asm_sload(ASMState *as, IRIns *ir)
     RegSet allow = irt_isnum(t) ? RSET_FPR : RSET_GPR;
     Reg dest = ra_dest(as, ir, allow);
     base = ra_alloc1(as, REF_BASE, RSET_GPR);
-    lua_assert(irt_isnum(t) || irt_isint(t) || irt_isaddr(t));
+    lj_assertA(irt_isnum(t) || irt_isint(t) || irt_isaddr(t),
+	       "bad SLOAD type %d", irt_type(t));
     if ((ir->op2 & IRSLOAD_CONVERT)) {
       t.irt = irt_isint(t) ? IRT_NUM : IRT_INT;  /* Check for original type. */
       emit_rmro(as, irt_isint(t) ? XO_CVTSI2SD : XO_CVTTSD2SI, dest, base, ofs);
@@ -1728,7 +1748,8 @@ static void asm_sload(ASMState *as, IRIns *ir)
     /* Need type check, even if the load result is unused. */
     asm_guardcc(as, irt_isnum(t) ? CC_AE : CC_NE);
     if (LJ_64 && irt_type(t) >= IRT_NUM) {
-      lua_assert(irt_isinteger(t) || irt_isnum(t));
+      lj_assertA(irt_isinteger(t) || irt_isnum(t),
+		 "bad SLOAD type %d", irt_type(t));
 #if LJ_GC64
       emit_u32(as, LJ_TISNUM << 15);
 #else
@@ -1780,7 +1801,8 @@ static void asm_cnew(ASMState *as, IRIns *ir)
   CTInfo info = lj_ctype_info(cts, id, &sz);
   const CCallInfo *ci = &lj_ir_callinfo[IRCALL_lj_mem_newgco];
   IRRef args[4];
-  lua_assert(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL));
+  lj_assertA(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL),
+	     "bad CNEW/CNEWI operands");
 
   as->gcsteps++;
   asm_setupresult(as, ir, ci);  /* GCcdata * */
@@ -1810,7 +1832,7 @@ static void asm_cnew(ASMState *as, IRIns *ir)
     int32_t ofs = sizeof(GCcdata);
     if (sz == 8) {
       ofs += 4; ir++;
-      lua_assert(ir->o == IR_HIOP);
+      lj_assertA(ir->o == IR_HIOP, "missing CNEWI HIOP");
     }
     do {
       if (irref_isk(ir->op2)) {
@@ -1824,7 +1846,7 @@ static void asm_cnew(ASMState *as, IRIns *ir)
       ofs -= 4; ir--;
     } while (1);
 #endif
-    lua_assert(sz == 4 || sz == 8);
+    lj_assertA(sz == 4 || sz == 8, "bad CNEWI size %d", sz);
   } else if (ir->op2 != REF_NIL) {  /* Create VLA/VLS/aligned cdata. */
     ci = &lj_ir_callinfo[IRCALL_lj_cdata_newv];
     args[0] = ASMREF_L;     /* lua_State *L */
@@ -1883,7 +1905,7 @@ static void asm_obar(ASMState *as, IRIns *ir)
   MCLabel l_end;
   Reg obj;
   /* No need for other object barriers (yet). */
-  lua_assert(IR(ir->op1)->o == IR_UREFC);
+  lj_assertA(IR(ir->op1)->o == IR_UREFC, "bad OBAR type");
   ra_evictset(as, RSET_SCRATCH);
   l_end = emit_label(as);
   args[0] = ASMREF_TMP1;  /* global_State *g */
@@ -2000,7 +2022,7 @@ static int asm_swapops(ASMState *as, IRIns *ir)
 {
   IRIns *irl = IR(ir->op1);
   IRIns *irr = IR(ir->op2);
-  lua_assert(ra_noreg(irr->r));
+  lj_assertA(ra_noreg(irr->r), "bad usage");
   if (!irm_iscomm(lj_ir_mode[ir->o]))
     return 0;  /* Can't swap non-commutative operations. */
   if (irref_isk(ir->op2))
@@ -2391,8 +2413,9 @@ static void asm_comp(ASMState *as, IRIns *ir)
     IROp leftop = (IROp)(IR(lref)->o);
     Reg r64 = REX_64IR(ir, 0);
     int32_t imm = 0;
-    lua_assert(irt_is64(ir->t) || irt_isint(ir->t) ||
-	       irt_isu32(ir->t) || irt_isaddr(ir->t) || irt_isu8(ir->t));
+    lj_assertA(irt_is64(ir->t) || irt_isint(ir->t) ||
+	       irt_isu32(ir->t) || irt_isaddr(ir->t) || irt_isu8(ir->t),
+	       "bad comparison data type %d", irt_type(ir->t));
     /* Swap constants (only for ABC) and fusable loads to the right. */
     if (irref_isk(lref) || (!irref_isk(rref) && opisfusableload(leftop))) {
       if ((cc & 0xc) == 0xc) cc ^= 0x53;  /* L <-> G, LE <-> GE */
@@ -2474,7 +2497,7 @@ static void asm_comp(ASMState *as, IRIns *ir)
 	  /* Use test r,r instead of cmp r,0. */
 	  x86Op xo = XO_TEST;
 	  if (irt_isu8(ir->t)) {
-	    lua_assert(ir->o == IR_EQ || ir->o == IR_NE);
+	    lj_assertA(ir->o == IR_EQ || ir->o == IR_NE, "bad usage");
 	    xo = XO_TESTb;
 	    if (!rset_test(RSET_RANGE(RID_EAX, RID_EBX+1), left)) {
 	      if (LJ_64) {
@@ -2630,10 +2653,11 @@ static void asm_hiop(ASMState *as, IRIns *ir)
   case IR_CNEWI:
     /* Nothing to do here. Handled by CNEWI itself. */
     break;
-  default: lua_assert(0); break;
+  default: lj_assertA(0, "bad HIOP for op %d", (ir-1)->o); break;
   }
 #else
-  UNUSED(as); UNUSED(ir); lua_assert(0);  /* Unused on x64 or without FFI. */
+  /* Unused on x64 or without FFI. */
+  UNUSED(as); UNUSED(ir); lj_assertA(0, "unexpected HIOP");
 #endif
 }
 
@@ -2699,8 +2723,9 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
       Reg src = ra_alloc1(as, ref, RSET_FPR);
       emit_rmro(as, XO_MOVSDto, src, RID_BASE, ofs);
     } else {
-      lua_assert(irt_ispri(ir->t) || irt_isaddr(ir->t) ||
-		 (LJ_DUALNUM && irt_isinteger(ir->t)));
+      lj_assertA(irt_ispri(ir->t) || irt_isaddr(ir->t) ||
+		 (LJ_DUALNUM && irt_isinteger(ir->t)),
+		 "restore of IR type %d", irt_type(ir->t));
       if (!irref_isk(ref)) {
 	Reg src = ra_alloc1(as, ref, rset_exclude(RSET_GPR, RID_BASE));
 #if LJ_GC64
@@ -2745,7 +2770,7 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
     }
     checkmclim(as);
   }
-  lua_assert(map + nent == flinks);
+  lj_assertA(map + nent == flinks, "inconsistent frames in snapshot");
 }
 
 /* -- GC handling --------------------------------------------------------- */
@@ -2789,16 +2814,16 @@ static void asm_loop_fixup(ASMState *as)
   MCode *target = as->mcp;
   if (as->realign) {  /* Realigned loops use short jumps. */
     as->realign = NULL;  /* Stop another retry. */
-    lua_assert(((intptr_t)target & 15) == 0);
+    lj_assertA(((intptr_t)target & 15) == 0, "loop realign failed");
     if (as->loopinv) {  /* Inverted loop branch? */
       p -= 5;
       p[0] = XI_JMP;
-      lua_assert(target - p >= -128);
+      lj_assertA(target - p >= -128, "loop realign failed");
       p[-1] = (MCode)(target - p);  /* Patch sjcc. */
       if (as->loopinv == 2)
 	p[-3] = (MCode)(target - p + 2);  /* Patch opt. short jp. */
     } else {
-      lua_assert(target - p >= -128);
+      lj_assertA(target - p >= -128, "loop realign failed");
       p[-1] = (MCode)(int8_t)(target - p);  /* Patch short jmp. */
       p[-2] = XI_JMPs;
     }
@@ -2904,7 +2929,7 @@ static void asm_tail_fixup(ASMState *as, TraceNo lnk)
   }
   /* Patch exit branch. */
   target = lnk ? traceref(as->J, lnk)->mcode : (MCode *)lj_vm_exit_interp;
-  *(int32_t *)(p-4) = jmprel(p, target);
+  *(int32_t *)(p-4) = jmprel(as->J, p, target);
   p[-5] = XI_JMP;
   /* Drop unused mcode tail. Fill with NOPs to make the prefetcher happy. */
   for (q = as->mctop-1; q >= p; q--)
@@ -3077,17 +3102,17 @@ void lj_asm_patchexit(jit_State *J, GCtrace *T, ExitNo exitno, MCode *target)
   uint32_t statei = u32ptr(&J2G(J)->vmstate);
 #endif
   if (len > 5 && p[len-5] == XI_JMP && p+len-6 + *(int32_t *)(p+len-4) == px)
-    *(int32_t *)(p+len-4) = jmprel(p+len, target);
+    *(int32_t *)(p+len-4) = jmprel(J, p+len, target);
   /* Do not patch parent exit for a stack check. Skip beyond vmstate update. */
   for (; p < pe; p += asm_x86_inslen(p)) {
     intptr_t ofs = LJ_GC64 ? (p[0] & 0xf0) == 0x40 : LJ_64;
     if (*(uint32_t *)(p+2+ofs) == statei && p[ofs+LJ_GC64-LJ_64] == XI_MOVmi)
       break;
   }
-  lua_assert(p < pe);
+  lj_assertJ(p < pe, "instruction length decoder failed");
   for (; p < pe; p += asm_x86_inslen(p))
     if ((*(uint16_t *)p & 0xf0ff) == 0x800f && p + *(int32_t *)(p+2) == px)
-      *(int32_t *)(p+2) = jmprel(p+6, target);
+      *(int32_t *)(p+2) = jmprel(J, p+6, target);
   lj_mcode_sync(T->mcode, T->mcode + T->szmcode);
   lj_mcode_patch(J, mcarea, 1);
 }
diff --git a/src/lj_assert.c b/src/lj_assert.c
new file mode 100644
index 00000000..7989dbe6
--- /dev/null
+++ b/src/lj_assert.c
@@ -0,0 +1,28 @@
+/*
+** Internal assertions.
+** Copyright (C) 2005-2020 Mike Pall. See Copyright Notice in luajit.h
+*/
+
+#define lj_assert_c
+#define LUA_CORE
+
+#if defined(LUA_USE_ASSERT) || defined(LUA_USE_APICHECK)
+
+#include <stdio.h>
+
+#include "lj_obj.h"
+
+void lj_assert_fail(global_State *g, const char *file, int line,
+		    const char *func, const char *fmt, ...)
+{
+  va_list argp;
+  va_start(argp, fmt);
+  fprintf(stderr, "LuaJIT ASSERT %s:%d: %s: ", file, line, func);
+  vfprintf(stderr, fmt, argp);
+  fputc('\n', stderr);
+  va_end(argp);
+  UNUSED(g);  /* May be NULL. TODO: optionally dump state. */
+  abort();
+}
+
+#endif
diff --git a/src/lj_bcread.c b/src/lj_bcread.c
index f6c7ad25..cddf6ff1 100644
--- a/src/lj_bcread.c
+++ b/src/lj_bcread.c
@@ -53,7 +53,7 @@ static LJ_NOINLINE void bcread_error(LexState *ls, ErrMsg em)
 /* Refill buffer. */
 static LJ_NOINLINE void bcread_fill(LexState *ls, MSize len, int need)
 {
-  lua_assert(len != 0);
+  lj_assertLS(len != 0, "empty refill");
   if (len > LJ_MAX_BUF || ls->c < 0)
     bcread_error(ls, LJ_ERR_BCBAD);
   do {
@@ -63,7 +63,7 @@ static LJ_NOINLINE void bcread_fill(LexState *ls, MSize len, int need)
     MSize n = (MSize)(ls->pe - ls->p);
     if (n) {  /* Copy remainder to buffer. */
       if (sbuflen(&ls->sb)) {  /* Move down in buffer. */
-	lua_assert(ls->pe == sbufP(&ls->sb));
+	lj_assertLS(ls->pe == sbufP(&ls->sb), "bad buffer pointer");
 	if (ls->p != p) memmove(p, ls->p, n);
       } else {  /* Copy from buffer provided by reader. */
 	p = lj_buf_need(&ls->sb, len);
@@ -112,7 +112,7 @@ static LJ_AINLINE uint8_t *bcread_mem(LexState *ls, MSize len)
 {
   uint8_t *p = (uint8_t *)ls->p;
   ls->p += len;
-  lua_assert(ls->p <= ls->pe);
+  lj_assertLS(ls->p <= ls->pe, "buffer read overflow");
   return p;
 }
 
@@ -125,7 +125,7 @@ static void bcread_block(LexState *ls, void *q, MSize len)
 /* Read byte from buffer. */
 static LJ_AINLINE uint32_t bcread_byte(LexState *ls)
 {
-  lua_assert(ls->p < ls->pe);
+  lj_assertLS(ls->p < ls->pe, "buffer read overflow");
   return (uint32_t)(uint8_t)*ls->p++;
 }
 
@@ -133,7 +133,7 @@ static LJ_AINLINE uint32_t bcread_byte(LexState *ls)
 static LJ_AINLINE uint32_t bcread_uleb128(LexState *ls)
 {
   uint32_t v = lj_buf_ruleb128(&ls->p);
-  lua_assert(ls->p <= ls->pe);
+  lj_assertLS(ls->p <= ls->pe, "buffer read overflow");
   return v;
 }
 
@@ -150,7 +150,7 @@ static uint32_t bcread_uleb128_33(LexState *ls)
    } while (*p++ >= 0x80);
   }
   ls->p = (char *)p;
-  lua_assert(ls->p <= ls->pe);
+  lj_assertLS(ls->p <= ls->pe, "buffer read overflow");
   return v;
 }
 
@@ -197,7 +197,7 @@ static void bcread_ktabk(LexState *ls, TValue *o)
     o->u32.lo = bcread_uleb128(ls);
     o->u32.hi = bcread_uleb128(ls);
   } else {
-    lua_assert(tp <= BCDUMP_KTAB_TRUE);
+    lj_assertLS(tp <= BCDUMP_KTAB_TRUE, "bad constant type %d", tp);
     setpriV(o, ~tp);
   }
 }
@@ -219,7 +219,7 @@ static GCtab *bcread_ktab(LexState *ls)
     for (i = 0; i < nhash; i++) {
       TValue key;
       bcread_ktabk(ls, &key);
-      lua_assert(!tvisnil(&key));
+      lj_assertLS(!tvisnil(&key), "nil key");
       bcread_ktabk(ls, lj_tab_set(ls->L, t, &key));
     }
   }
@@ -256,7 +256,7 @@ static void bcread_kgc(LexState *ls, GCproto *pt, MSize sizekgc)
 #endif
     } else {
       lua_State *L = ls->L;
-      lua_assert(tp == BCDUMP_KGC_CHILD);
+      lj_assertLS(tp == BCDUMP_KGC_CHILD, "bad constant type %d", tp);
       if (L->top <= bcread_oldtop(L, ls))  /* Stack underflow? */
 	bcread_error(ls, LJ_ERR_BCBAD);
       L->top--;
@@ -437,7 +437,7 @@ static int bcread_header(LexState *ls)
 GCproto *lj_bcread(LexState *ls)
 {
   lua_State *L = ls->L;
-  lua_assert(ls->c == BCDUMP_HEAD1);
+  lj_assertLS(ls->c == BCDUMP_HEAD1, "bad bytecode header");
   bcread_savetop(L, ls, L->top);
   lj_buf_reset(&ls->sb);
   /* Check for a valid bytecode dump header. */
diff --git a/src/lj_bcwrite.c b/src/lj_bcwrite.c
index a86d6d00..ce5837f6 100644
--- a/src/lj_bcwrite.c
+++ b/src/lj_bcwrite.c
@@ -29,8 +29,17 @@ typedef struct BCWriteCtx {
   void *wdata;			/* Writer callback data. */
   int strip;			/* Strip debug info. */
   int status;			/* Status from writer callback. */
+#ifdef LUA_USE_ASSERT
+  global_State *g;
+#endif
 } BCWriteCtx;
 
+#ifdef LUA_USE_ASSERT
+#define lj_assertBCW(c, ...)	lj_assertG_(ctx->g, (c), __VA_ARGS__)
+#else
+#define lj_assertBCW(c, ...)	((void)ctx)
+#endif
+
 /* -- Bytecode writer ----------------------------------------------------- */
 
 /* Write a single constant key/value of a template table. */
@@ -61,7 +70,7 @@ static void bcwrite_ktabk(BCWriteCtx *ctx, cTValue *o, int narrow)
     p = lj_strfmt_wuleb128(p, o->u32.lo);
     p = lj_strfmt_wuleb128(p, o->u32.hi);
   } else {
-    lua_assert(tvispri(o));
+    lj_assertBCW(tvispri(o), "unhandled type %d", itype(o));
     *p++ = BCDUMP_KTAB_NIL+~itype(o);
   }
   setsbufP(&ctx->sb, p);
@@ -121,7 +130,7 @@ static void bcwrite_kgc(BCWriteCtx *ctx, GCproto *pt)
       tp = BCDUMP_KGC_STR + gco2str(o)->len;
       need = 5+gco2str(o)->len;
     } else if (o->gch.gct == ~LJ_TPROTO) {
-      lua_assert((pt->flags & PROTO_CHILD));
+      lj_assertBCW((pt->flags & PROTO_CHILD), "prototype has unexpected child");
       tp = BCDUMP_KGC_CHILD;
 #if LJ_HASFFI
     } else if (o->gch.gct == ~LJ_TCDATA) {
@@ -132,12 +141,14 @@ static void bcwrite_kgc(BCWriteCtx *ctx, GCproto *pt)
       } else if (id == CTID_UINT64) {
 	tp = BCDUMP_KGC_U64;
       } else {
-	lua_assert(id == CTID_COMPLEX_DOUBLE);
+	lj_assertBCW(id == CTID_COMPLEX_DOUBLE,
+		     "bad cdata constant CTID %d", id);
 	tp = BCDUMP_KGC_COMPLEX;
       }
 #endif
     } else {
-      lua_assert(o->gch.gct == ~LJ_TTAB);
+      lj_assertBCW(o->gch.gct == ~LJ_TTAB,
+		   "bad constant GC type %d", o->gch.gct);
       tp = BCDUMP_KGC_TAB;
       need = 1+2*5;
     }
@@ -289,7 +300,7 @@ static void bcwrite_proto(BCWriteCtx *ctx, GCproto *pt)
     MSize nn = (lj_fls(n)+8)*9 >> 6;
     char *q = sbufB(&ctx->sb) + (5 - nn);
     p = lj_strfmt_wuleb128(q, n);  /* Fill in final size. */
-    lua_assert(p == sbufB(&ctx->sb) + 5);
+    lj_assertBCW(p == sbufB(&ctx->sb) + 5, "bad ULEB128 write");
     ctx->status = ctx->wfunc(sbufL(&ctx->sb), q, nn+n, ctx->wdata);
   }
 }
@@ -349,6 +360,9 @@ int lj_bcwrite(lua_State *L, GCproto *pt, lua_Writer writer, void *data,
   ctx.wdata = data;
   ctx.strip = strip;
   ctx.status = 0;
+#ifdef LUA_USE_ASSERT
+  ctx.g = G(L);
+#endif
   lj_buf_init(L, &ctx.sb);
   status = lj_vm_cpcall(L, NULL, &ctx, cpwriter);
   if (status == 0) status = ctx.status;
diff --git a/src/lj_buf.c b/src/lj_buf.c
index 0dfe7f98..923f4276 100644
--- a/src/lj_buf.c
+++ b/src/lj_buf.c
@@ -30,7 +30,7 @@ static void buf_grow(SBuf *sb, MSize sz)
 
 LJ_NOINLINE char *LJ_FASTCALL lj_buf_need2(SBuf *sb, MSize sz)
 {
-  lua_assert(sz > sbufsz(sb));
+  lj_assertG_(G(sbufL(sb)), sz > sbufsz(sb), "SBuf overflow");
   if (LJ_UNLIKELY(sz > LJ_MAX_BUF))
     lj_err_mem(sbufL(sb));
   buf_grow(sb, sz);
@@ -40,7 +40,7 @@ LJ_NOINLINE char *LJ_FASTCALL lj_buf_need2(SBuf *sb, MSize sz)
 LJ_NOINLINE char *LJ_FASTCALL lj_buf_more2(SBuf *sb, MSize sz)
 {
   MSize len = sbuflen(sb);
-  lua_assert(sz > sbufleft(sb));
+  lj_assertG_(G(sbufL(sb)), sz > sbufleft(sb), "SBuf overflow");
   if (LJ_UNLIKELY(sz > LJ_MAX_BUF || len + sz > LJ_MAX_BUF))
     lj_err_mem(sbufL(sb));
   buf_grow(sb, len + sz);
diff --git a/src/lj_carith.c b/src/lj_carith.c
index 04c18054..4ae1e9ee 100644
--- a/src/lj_carith.c
+++ b/src/lj_carith.c
@@ -122,7 +122,7 @@ static int carith_ptr(lua_State *L, CTState *cts, CDArith *ca, MMS mm)
 	setboolV(L->top-1, ((uintptr_t)pp < (uintptr_t)pp2));
 	return 1;
       } else {
-	lua_assert(mm == MM_le);
+	lj_assertL(mm == MM_le, "bad metamethod %d", mm);
 	setboolV(L->top-1, ((uintptr_t)pp <= (uintptr_t)pp2));
 	return 1;
       }
@@ -208,7 +208,9 @@ static int carith_int64(lua_State *L, CTState *cts, CDArith *ca, MMS mm)
 	*up = lj_carith_powu64(u0, u1);
       break;
     case MM_unm: *up = (uint64_t)-(int64_t)u0; break;
-    default: lua_assert(0); break;
+    default:
+      lj_assertL(0, "bad metamethod %d", mm);
+      break;
     }
     lj_gc_check(L);
     return 1;
@@ -301,7 +303,9 @@ uint64_t lj_carith_shift64(uint64_t x, int32_t sh, int op)
   case IR_BSAR-IR_BSHL: x = lj_carith_sar64(x, sh); break;
   case IR_BROL-IR_BSHL: x = lj_carith_rol64(x, sh); break;
   case IR_BROR-IR_BSHL: x = lj_carith_ror64(x, sh); break;
-  default: lua_assert(0); break;
+  default:
+    lj_assertX(0, "bad shift op %d", op);
+    break;
   }
   return x;
 }
diff --git a/src/lj_ccall.c b/src/lj_ccall.c
index c1e12f56..a989f657 100644
--- a/src/lj_ccall.c
+++ b/src/lj_ccall.c
@@ -391,7 +391,8 @@
 #define CCALL_HANDLE_GPR \
   /* Try to pass argument in GPRs. */ \
   if (n > 1) { \
-    lua_assert(n == 2 || n == 4);  /* int64_t or complex (float). */ \
+    /* int64_t or complex (float). */ \
+    lj_assertL(n == 2 || n == 4, "bad GPR size %d", n); \
     if (ctype_isinteger(d->info) || ctype_isfp(d->info)) \
       ngpr = (ngpr + 1u) & ~1u;  /* Align int64_t to regpair. */ \
     else if (ngpr + n > maxgpr) \
@@ -642,7 +643,8 @@ static void ccall_classify_ct(CTState *cts, CType *ct, int *rcl, CTSize ofs)
     ccall_classify_struct(cts, ct, rcl, ofs);
   } else {
     int cl = ctype_isfp(ct->info) ? CCALL_RCL_SSE : CCALL_RCL_INT;
-    lua_assert(ctype_hassize(ct->info));
+    lj_assertCTS(ctype_hassize(ct->info),
+		 "classify ctype %08x without size", ct->info);
     if ((ofs & (ct->size-1))) cl = CCALL_RCL_MEM;  /* Unaligned. */
     rcl[(ofs >= 8)] |= cl;
   }
@@ -667,12 +669,13 @@ static int ccall_classify_struct(CTState *cts, CType *ct, int *rcl, CTSize ofs)
 }
 
 /* Try to split up a small struct into registers. */
-static int ccall_struct_reg(CCallState *cc, GPRArg *dp, int *rcl)
+static int ccall_struct_reg(CCallState *cc, CTState *cts, GPRArg *dp, int *rcl)
 {
   MSize ngpr = cc->ngpr, nfpr = cc->nfpr;
   uint32_t i;
+  UNUSED(cts);
   for (i = 0; i < 2; i++) {
-    lua_assert(!(rcl[i] & CCALL_RCL_MEM));
+    lj_assertCTS(!(rcl[i] & CCALL_RCL_MEM), "pass mem struct in reg");
     if ((rcl[i] & CCALL_RCL_INT)) {  /* Integer class takes precedence. */
       if (ngpr >= CCALL_NARG_GPR) return 1;  /* Register overflow. */
       cc->gpr[ngpr++] = dp[i];
@@ -693,7 +696,8 @@ static int ccall_struct_arg(CCallState *cc, CTState *cts, CType *d, int *rcl,
   dp[0] = dp[1] = 0;
   /* Convert to temp. struct. */
   lj_cconv_ct_tv(cts, d, (uint8_t *)dp, o, CCF_ARG(narg));
-  if (ccall_struct_reg(cc, dp, rcl)) {  /* Register overflow? Pass on stack. */
+  if (ccall_struct_reg(cc, cts, dp, rcl)) {
+    /* Register overflow? Pass on stack. */
     MSize nsp = cc->nsp, n = rcl[1] ? 2 : 1;
     if (nsp + n > CCALL_MAXSTACK) return 1;  /* Too many arguments. */
     cc->nsp = nsp + n;
@@ -989,7 +993,7 @@ static int ccall_set_args(lua_State *L, CTState *cts, CType *ct,
     if (fid) {  /* Get argument type from field. */
       CType *ctf = ctype_get(cts, fid);
       fid = ctf->sib;
-      lua_assert(ctype_isfield(ctf->info));
+      lj_assertL(ctype_isfield(ctf->info), "field expected");
       did = ctype_cid(ctf->info);
     } else {
       if (!(ct->info & CTF_VARARG))
@@ -1137,7 +1141,8 @@ static int ccall_get_results(lua_State *L, CTState *cts, CType *ct,
   CCALL_HANDLE_RET
 #endif
   /* No reference types end up here, so there's no need for the CTypeID. */
-  lua_assert(!(ctype_isrefarray(ctr->info) || ctype_isstruct(ctr->info)));
+  lj_assertL(!(ctype_isrefarray(ctr->info) || ctype_isstruct(ctr->info)),
+	     "unexpected reference ctype");
   return lj_cconv_tv_ct(cts, ctr, 0, L->top-1, sp);
 }
 
diff --git a/src/lj_ccallback.c b/src/lj_ccallback.c
index 37edd00f..3738c234 100644
--- a/src/lj_ccallback.c
+++ b/src/lj_ccallback.c
@@ -107,9 +107,9 @@ MSize lj_ccallback_ptr2slot(CTState *cts, void *p)
 /* Initialize machine code for callback function pointers. */
 #if LJ_OS_NOJIT
 /* Disabled callback support. */
-#define callback_mcode_init(g, p)	UNUSED(p)
+#define callback_mcode_init(g, p)	(p)
 #elif LJ_TARGET_X86ORX64
-static void callback_mcode_init(global_State *g, uint8_t *page)
+static void *callback_mcode_init(global_State *g, uint8_t *page)
 {
   uint8_t *p = page;
   uint8_t *target = (uint8_t *)(void *)lj_vm_ffi_callback;
@@ -143,10 +143,10 @@ static void callback_mcode_init(global_State *g, uint8_t *page)
       *p++ = XI_JMPs; *p++ = (uint8_t)((2+2)*(31-(slot&31)) - 2);
     }
   }
-  lua_assert(p - page <= CALLBACK_MCODE_SIZE);
+  return p;
 }
 #elif LJ_TARGET_ARM
-static void callback_mcode_init(global_State *g, uint32_t *page)
+static void *callback_mcode_init(global_State *g, uint32_t *page)
 {
   uint32_t *p = page;
   void *target = (void *)lj_vm_ffi_callback;
@@ -165,10 +165,10 @@ static void callback_mcode_init(global_State *g, uint32_t *page)
     *p = ARMI_B | ((page-p-2) & 0x00ffffffu);
     p++;
   }
-  lua_assert(p - page <= CALLBACK_MCODE_SIZE);
+  return p;
 }
 #elif LJ_TARGET_ARM64
-static void callback_mcode_init(global_State *g, uint32_t *page)
+static void *callback_mcode_init(global_State *g, uint32_t *page)
 {
   uint32_t *p = page;
   void *target = (void *)lj_vm_ffi_callback;
@@ -185,10 +185,10 @@ static void callback_mcode_init(global_State *g, uint32_t *page)
     *p = A64I_LE(A64I_B | A64F_S26((page-p) & 0x03ffffffu));
     p++;
   }
-  lua_assert(p - page <= CALLBACK_MCODE_SIZE);
+  return p;
 }
 #elif LJ_TARGET_PPC
-static void callback_mcode_init(global_State *g, uint32_t *page)
+static void *callback_mcode_init(global_State *g, uint32_t *page)
 {
   uint32_t *p = page;
   void *target = (void *)lj_vm_ffi_callback;
@@ -204,10 +204,10 @@ static void callback_mcode_init(global_State *g, uint32_t *page)
     *p = PPCI_B | (((page-p) & 0x00ffffffu) << 2);
     p++;
   }
-  lua_assert(p - page <= CALLBACK_MCODE_SIZE);
+  return p;
 }
 #elif LJ_TARGET_MIPS
-static void callback_mcode_init(global_State *g, uint32_t *page)
+static void *callback_mcode_init(global_State *g, uint32_t *page)
 {
   uint32_t *p = page;
   uintptr_t target = (uintptr_t)(void *)lj_vm_ffi_callback;
@@ -236,11 +236,11 @@ static void callback_mcode_init(global_State *g, uint32_t *page)
     p++;
     *p++ = MIPSI_LI | MIPSF_T(RID_R1) | slot;
   }
-  lua_assert(p - page <= CALLBACK_MCODE_SIZE);
+  return p;
 }
 #else
 /* Missing support for this architecture. */
-#define callback_mcode_init(g, p)	UNUSED(p)
+#define callback_mcode_init(g, p)	(p)
 #endif
 
 /* -- Machine code management --------------------------------------------- */
@@ -263,7 +263,7 @@ static void callback_mcode_init(global_State *g, uint32_t *page)
 static void callback_mcode_new(CTState *cts)
 {
   size_t sz = (size_t)CALLBACK_MCODE_SIZE;
-  void *p;
+  void *p, *pe;
   if (CALLBACK_MAX_SLOT == 0)
     lj_err_caller(cts->L, LJ_ERR_FFI_CBACKOV);
 #if LJ_TARGET_WINDOWS
@@ -280,7 +280,10 @@ static void callback_mcode_new(CTState *cts)
   p = lj_mem_new(cts->L, sz);
 #endif
   cts->cb.mcode = p;
-  callback_mcode_init(cts->g, p);
+  pe = callback_mcode_init(cts->g, p);
+  UNUSED(pe);
+  lj_assertCTS((size_t)((char *)pe - (char *)p) <= sz,
+	       "miscalculated CALLBACK_MAX_SLOT");
   lj_mcode_sync(p, (char *)p + sz);
 #if LJ_TARGET_WINDOWS
   {
@@ -421,8 +424,9 @@ void lj_ccallback_mcode_free(CTState *cts)
 
 #define CALLBACK_HANDLE_GPR \
   if (n > 1) { \
-    lua_assert(((LJ_ABI_SOFTFP && ctype_isnum(cta->info)) ||  /* double. */ \
-		ctype_isinteger(cta->info)) && n == 2);  /* int64_t. */ \
+    lj_assertCTS(((LJ_ABI_SOFTFP && ctype_isnum(cta->info)) ||  /* double. */ \
+		 ctype_isinteger(cta->info)) && n == 2,  /* int64_t. */ \
+		 "bad GPR type"); \
     ngpr = (ngpr + 1u) & ~1u;  /* Align int64_t to regpair. */ \
   } \
   if (ngpr + n <= maxgpr) { \
@@ -579,7 +583,7 @@ static void callback_conv_args(CTState *cts, lua_State *L)
       CTSize sz;
       int isfp;
       MSize n;
-      lua_assert(ctype_isfield(ctf->info));
+      lj_assertCTS(ctype_isfield(ctf->info), "field expected");
       cta = ctype_rawchild(cts, ctf);
       isfp = ctype_isfp(cta->info);
       sz = (cta->size + CTSIZE_PTR-1) & ~(CTSIZE_PTR-1);
@@ -671,7 +675,7 @@ lua_State * LJ_FASTCALL lj_ccallback_enter(CTState *cts, void *cf)
 {
   lua_State *L = cts->L;
   global_State *g = cts->g;
-  lua_assert(L != NULL);
+  lj_assertG(L != NULL, "uninitialized cts->L in callback");
   if (tvref(g->jit_base)) {
     setstrV(L, L->top++, lj_err_str(L, LJ_ERR_FFI_BADCBACK));
     if (g->panic) g->panic(L);
@@ -756,7 +760,7 @@ static CType *callback_checkfunc(CTState *cts, CType *ct)
       CType *ctf = ctype_get(cts, fid);
       if (!ctype_isattrib(ctf->info)) {
 	CType *cta;
-	lua_assert(ctype_isfield(ctf->info));
+	lj_assertCTS(ctype_isfield(ctf->info), "field expected");
 	cta = ctype_rawchild(cts, ctf);
 	if (!(ctype_isenum(cta->info) || ctype_isptr(cta->info) ||
 	      (ctype_isnum(cta->info) && cta->size <= 8)) ||
diff --git a/src/lj_cconv.c b/src/lj_cconv.c
index ca2a5d30..37c88852 100644
--- a/src/lj_cconv.c
+++ b/src/lj_cconv.c
@@ -122,19 +122,25 @@ void lj_cconv_ct_ct(CTState *cts, CType *d, CType *s,
   CTInfo dinfo = d->info, sinfo = s->info;
   void *tmpptr;
 
-  lua_assert(!ctype_isenum(dinfo) && !ctype_isenum(sinfo));
-  lua_assert(!ctype_isattrib(dinfo) && !ctype_isattrib(sinfo));
+  lj_assertCTS(!ctype_isenum(dinfo) && !ctype_isenum(sinfo),
+	       "unresolved enum");
+  lj_assertCTS(!ctype_isattrib(dinfo) && !ctype_isattrib(sinfo),
+	       "unstripped attribute");
 
   if (ctype_type(dinfo) > CT_MAYCONVERT || ctype_type(sinfo) > CT_MAYCONVERT)
     goto err_conv;
 
   /* Some basic sanity checks. */
-  lua_assert(!ctype_isnum(dinfo) || dsize > 0);
-  lua_assert(!ctype_isnum(sinfo) || ssize > 0);
-  lua_assert(!ctype_isbool(dinfo) || dsize == 1 || dsize == 4);
-  lua_assert(!ctype_isbool(sinfo) || ssize == 1 || ssize == 4);
-  lua_assert(!ctype_isinteger(dinfo) || (1u<<lj_fls(dsize)) == dsize);
-  lua_assert(!ctype_isinteger(sinfo) || (1u<<lj_fls(ssize)) == ssize);
+  lj_assertCTS(!ctype_isnum(dinfo) || dsize > 0, "bad size for number type");
+  lj_assertCTS(!ctype_isnum(sinfo) || ssize > 0, "bad size for number type");
+  lj_assertCTS(!ctype_isbool(dinfo) || dsize == 1 || dsize == 4,
+	       "bad size for bool type");
+  lj_assertCTS(!ctype_isbool(sinfo) || ssize == 1 || ssize == 4,
+	       "bad size for bool type");
+  lj_assertCTS(!ctype_isinteger(dinfo) || (1u<<lj_fls(dsize)) == dsize,
+	       "bad size for integer type");
+  lj_assertCTS(!ctype_isinteger(sinfo) || (1u<<lj_fls(ssize)) == ssize,
+	       "bad size for integer type");
 
   switch (cconv_idx2(dinfo, sinfo)) {
   /* Destination is a bool. */
@@ -357,7 +363,7 @@ void lj_cconv_ct_ct(CTState *cts, CType *d, CType *s,
     if ((flags & CCF_CAST) || (d->info & CTF_VLA) || d != s)
       goto err_conv;  /* Must be exact same type. */
 copyval:  /* Copy value. */
-    lua_assert(dsize == ssize);
+    lj_assertCTS(dsize == ssize, "value copy with different sizes");
     memcpy(dp, sp, dsize);
     break;
 
@@ -389,7 +395,7 @@ int lj_cconv_tv_ct(CTState *cts, CType *s, CTypeID sid,
 	lj_cconv_ct_ct(cts, ctype_get(cts, CTID_DOUBLE), s,
 		       (uint8_t *)&o->n, sp, 0);
 	/* Numbers are NOT canonicalized here! Beware of uninitialized data. */
-	lua_assert(tvisnum(o));
+	lj_assertCTS(tvisnum(o), "non-canonical NaN passed");
       }
     } else {
       uint32_t b = s->size == 1 ? (*sp != 0) : (*(int *)sp != 0);
@@ -406,7 +412,7 @@ int lj_cconv_tv_ct(CTState *cts, CType *s, CTypeID sid,
     CTSize sz;
   copyval:  /* Copy value. */
     sz = s->size;
-    lua_assert(sz != CTSIZE_INVALID);
+    lj_assertCTS(sz != CTSIZE_INVALID, "value copy with invalid size");
     /* Attributes are stripped, qualifiers are kept (but mostly ignored). */
     cd = lj_cdata_new(cts, ctype_typeid(cts, s), sz);
     setcdataV(cts->L, o, cd);
@@ -421,19 +427,22 @@ int lj_cconv_tv_bf(CTState *cts, CType *s, TValue *o, uint8_t *sp)
   CTInfo info = s->info;
   CTSize pos, bsz;
   uint32_t val;
-  lua_assert(ctype_isbitfield(info));
+  lj_assertCTS(ctype_isbitfield(info), "bitfield expected");
   /* NYI: packed bitfields may cause misaligned reads. */
   switch (ctype_bitcsz(info)) {
   case 4: val = *(uint32_t *)sp; break;
   case 2: val = *(uint16_t *)sp; break;
   case 1: val = *(uint8_t *)sp; break;
-  default: lua_assert(0); val = 0; break;
+  default:
+    lj_assertCTS(0, "bad bitfield container size %d", ctype_bitcsz(info));
+    val = 0;
+    break;
   }
   /* Check if a packed bitfield crosses a container boundary. */
   pos = ctype_bitpos(info);
   bsz = ctype_bitbsz(info);
-  lua_assert(pos < 8*ctype_bitcsz(info));
-  lua_assert(bsz > 0 && bsz <= 8*ctype_bitcsz(info));
+  lj_assertCTS(pos < 8*ctype_bitcsz(info), "bad bitfield position");
+  lj_assertCTS(bsz > 0 && bsz <= 8*ctype_bitcsz(info), "bad bitfield size");
   if (pos + bsz > 8*ctype_bitcsz(info))
     lj_err_caller(cts->L, LJ_ERR_FFI_NYIPACKBIT);
   if (!(info & CTF_BOOL)) {
@@ -449,7 +458,7 @@ int lj_cconv_tv_bf(CTState *cts, CType *s, TValue *o, uint8_t *sp)
     }
   } else {
     uint32_t b = (val >> pos) & 1;
-    lua_assert(bsz == 1);
+    lj_assertCTS(bsz == 1, "bad bool bitfield size");
     setboolV(o, b);
     setboolV(&cts->g->tmptv2, b);  /* Remember for trace recorder. */
   }
@@ -553,7 +562,7 @@ void lj_cconv_ct_tv(CTState *cts, CType *d,
     sid = cdataV(o)->ctypeid;
     s = ctype_get(cts, sid);
     if (ctype_isref(s->info)) {  /* Resolve reference for value. */
-      lua_assert(s->size == CTSIZE_PTR);
+      lj_assertCTS(s->size == CTSIZE_PTR, "ref is not pointer-sized");
       sp = *(void **)sp;
       sid = ctype_cid(s->info);
     }
@@ -571,7 +580,7 @@ void lj_cconv_ct_tv(CTState *cts, CType *d,
       CType *cct = lj_ctype_getfield(cts, d, str, &ofs);
       if (!cct || !ctype_isconstval(cct->info))
 	goto err_conv;
-      lua_assert(d->size == 4);
+      lj_assertCTS(d->size == 4, "only 32 bit enum supported");  /* NYI */
       sp = (uint8_t *)&cct->size;
       sid = ctype_cid(cct->info);
     } else if (ctype_isrefarray(d->info)) {  /* Copy string to array. */
@@ -635,10 +644,10 @@ void lj_cconv_bf_tv(CTState *cts, CType *d, uint8_t *dp, TValue *o)
   CTInfo info = d->info;
   CTSize pos, bsz;
   uint32_t val, mask;
-  lua_assert(ctype_isbitfield(info));
+  lj_assertCTS(ctype_isbitfield(info), "bitfield expected");
   if ((info & CTF_BOOL)) {
     uint8_t tmpbool;
-    lua_assert(ctype_bitbsz(info) == 1);
+    lj_assertCTS(ctype_bitbsz(info) == 1, "bad bool bitfield size");
     lj_cconv_ct_tv(cts, ctype_get(cts, CTID_BOOL), &tmpbool, o, 0);
     val = tmpbool;
   } else {
@@ -647,8 +656,8 @@ void lj_cconv_bf_tv(CTState *cts, CType *d, uint8_t *dp, TValue *o)
   }
   pos = ctype_bitpos(info);
   bsz = ctype_bitbsz(info);
-  lua_assert(pos < 8*ctype_bitcsz(info));
-  lua_assert(bsz > 0 && bsz <= 8*ctype_bitcsz(info));
+  lj_assertCTS(pos < 8*ctype_bitcsz(info), "bad bitfield position");
+  lj_assertCTS(bsz > 0 && bsz <= 8*ctype_bitcsz(info), "bad bitfield size");
   /* Check if a packed bitfield crosses a container boundary. */
   if (pos + bsz > 8*ctype_bitcsz(info))
     lj_err_caller(cts->L, LJ_ERR_FFI_NYIPACKBIT);
@@ -659,7 +668,9 @@ void lj_cconv_bf_tv(CTState *cts, CType *d, uint8_t *dp, TValue *o)
   case 4: *(uint32_t *)dp = (*(uint32_t *)dp & ~mask) | (uint32_t)val; break;
   case 2: *(uint16_t *)dp = (*(uint16_t *)dp & ~mask) | (uint16_t)val; break;
   case 1: *(uint8_t *)dp = (*(uint8_t *)dp & ~mask) | (uint8_t)val; break;
-  default: lua_assert(0); break;
+  default:
+    lj_assertCTS(0, "bad bitfield container size %d", ctype_bitcsz(info));
+    break;
   }
 }
 
diff --git a/src/lj_cconv.h b/src/lj_cconv.h
index 0a0b66c9..54a61fd4 100644
--- a/src/lj_cconv.h
+++ b/src/lj_cconv.h
@@ -27,13 +27,14 @@ enum {
 static LJ_AINLINE uint32_t cconv_idx(CTInfo info)
 {
   uint32_t idx = ((info >> 26) & 15u);  /* Dispatch bits. */
-  lua_assert(ctype_type(info) <= CT_MAYCONVERT);
+  lj_assertX(ctype_type(info) <= CT_MAYCONVERT,
+	     "cannot convert ctype %08x", info);
 #if LJ_64
   idx = ((uint32_t)(U64x(f436fff5,fff7f021) >> 4*idx) & 15u);
 #else
   idx = (((idx < 8 ? 0xfff7f021u : 0xf436fff5) >> 4*(idx & 7u)) & 15u);
 #endif
-  lua_assert(idx < 8);
+  lj_assertX(idx < 8, "cannot convert ctype %08x", info);
   return idx;
 }
 
diff --git a/src/lj_cdata.c b/src/lj_cdata.c
index d3042f24..35d0e76a 100644
--- a/src/lj_cdata.c
+++ b/src/lj_cdata.c
@@ -35,7 +35,7 @@ GCcdata *lj_cdata_newv(lua_State *L, CTypeID id, CTSize sz, CTSize align)
   uintptr_t adata = (uintptr_t)p + sizeof(GCcdataVar) + sizeof(GCcdata);
   uintptr_t almask = (1u << align) - 1u;
   GCcdata *cd = (GCcdata *)(((adata + almask) & ~almask) - sizeof(GCcdata));
-  lua_assert((char *)cd - p < 65536);
+  lj_assertL((char *)cd - p < 65536, "excessive cdata alignment");
   cdatav(cd)->offset = (uint16_t)((char *)cd - p);
   cdatav(cd)->extra = extra;
   cdatav(cd)->len = sz;
@@ -77,8 +77,8 @@ void LJ_FASTCALL lj_cdata_free(global_State *g, GCcdata *cd)
   } else if (LJ_LIKELY(!cdataisv(cd))) {
     CType *ct = ctype_raw(ctype_ctsG(g), cd->ctypeid);
     CTSize sz = ctype_hassize(ct->info) ? ct->size : CTSIZE_PTR;
-    lua_assert(ctype_hassize(ct->info) || ctype_isfunc(ct->info) ||
-	       ctype_isextern(ct->info));
+    lj_assertG(ctype_hassize(ct->info) || ctype_isfunc(ct->info) ||
+	       ctype_isextern(ct->info), "free of ctype without a size");
     lj_mem_free(g, cd, sizeof(GCcdata) + sz);
     g->gc.cdatanum--;
   } else {
@@ -118,7 +118,7 @@ CType *lj_cdata_index(CTState *cts, GCcdata *cd, cTValue *key, uint8_t **pp,
 
   /* Resolve reference for cdata object. */
   if (ctype_isref(ct->info)) {
-    lua_assert(ct->size == CTSIZE_PTR);
+    lj_assertCTS(ct->size == CTSIZE_PTR, "ref is not pointer-sized");
     p = *(uint8_t **)p;
     ct = ctype_child(cts, ct);
   }
@@ -129,7 +129,8 @@ collect_attrib:
     if (ctype_attrib(ct->info) == CTA_QUAL) *qual |= ct->size;
     ct = ctype_child(cts, ct);
   }
-  lua_assert(!ctype_isref(ct->info));  /* Interning rejects refs to refs. */
+  /* Interning rejects refs to refs. */
+  lj_assertCTS(!ctype_isref(ct->info), "bad ref of ref");
 
   if (tvisint(key)) {
     idx = (ptrdiff_t)intV(key);
@@ -215,7 +216,8 @@ collect_attrib:
 static void cdata_getconst(CTState *cts, TValue *o, CType *ct)
 {
   CType *ctt = ctype_child(cts, ct);
-  lua_assert(ctype_isinteger(ctt->info) && ctt->size <= 4);
+  lj_assertCTS(ctype_isinteger(ctt->info) && ctt->size <= 4,
+	       "only 32 bit const supported");  /* NYI */
   /* Constants are already zero-extended/sign-extended to 32 bits. */
   if ((ctt->info & CTF_UNSIGNED) && (int32_t)ct->size < 0)
     setnumV(o, (lua_Number)(uint32_t)ct->size);
@@ -236,13 +238,14 @@ int lj_cdata_get(CTState *cts, CType *s, TValue *o, uint8_t *sp)
   }
 
   /* Get child type of pointer/array/field. */
-  lua_assert(ctype_ispointer(s->info) || ctype_isfield(s->info));
+  lj_assertCTS(ctype_ispointer(s->info) || ctype_isfield(s->info),
+	       "pointer or field expected");
   sid = ctype_cid(s->info);
   s = ctype_get(cts, sid);
 
   /* Resolve reference for field. */
   if (ctype_isref(s->info)) {
-    lua_assert(s->size == CTSIZE_PTR);
+    lj_assertCTS(s->size == CTSIZE_PTR, "ref is not pointer-sized");
     sp = *(uint8_t **)sp;
     sid = ctype_cid(s->info);
     s = ctype_get(cts, sid);
@@ -269,12 +272,13 @@ void lj_cdata_set(CTState *cts, CType *d, uint8_t *dp, TValue *o, CTInfo qual)
   }
 
   /* Get child type of pointer/array/field. */
-  lua_assert(ctype_ispointer(d->info) || ctype_isfield(d->info));
+  lj_assertCTS(ctype_ispointer(d->info) || ctype_isfield(d->info),
+	       "pointer or field expected");
   d = ctype_child(cts, d);
 
   /* Resolve reference for field. */
   if (ctype_isref(d->info)) {
-    lua_assert(d->size == CTSIZE_PTR);
+    lj_assertCTS(d->size == CTSIZE_PTR, "ref is not pointer-sized");
     dp = *(uint8_t **)dp;
     d = ctype_child(cts, d);
   }
@@ -289,7 +293,8 @@ void lj_cdata_set(CTState *cts, CType *d, uint8_t *dp, TValue *o, CTInfo qual)
     d = ctype_child(cts, d);
   }
 
-  lua_assert(ctype_hassize(d->info) && !ctype_isvoid(d->info));
+  lj_assertCTS(ctype_hassize(d->info), "store to ctype without size");
+  lj_assertCTS(!ctype_isvoid(d->info), "store to void type");
 
   if (((d->info|qual) & CTF_CONST)) {
   err_const:
diff --git a/src/lj_cdata.h b/src/lj_cdata.h
index 66b023bd..193e4241 100644
--- a/src/lj_cdata.h
+++ b/src/lj_cdata.h
@@ -18,7 +18,7 @@ static LJ_AINLINE void *cdata_getptr(void *p, CTSize sz)
   if (LJ_64 && sz == 4) {  /* Support 32 bit pointers on 64 bit targets. */
     return ((void *)(uintptr_t)*(uint32_t *)p);
   } else {
-    lua_assert(sz == CTSIZE_PTR);
+    lj_assertX(sz == CTSIZE_PTR, "bad pointer size %d", sz);
     return *(void **)p;
   }
 }
@@ -29,7 +29,7 @@ static LJ_AINLINE void cdata_setptr(void *p, CTSize sz, const void *v)
   if (LJ_64 && sz == 4) {  /* Support 32 bit pointers on 64 bit targets. */
     *(uint32_t *)p = (uint32_t)(uintptr_t)v;
   } else {
-    lua_assert(sz == CTSIZE_PTR);
+    lj_assertX(sz == CTSIZE_PTR, "bad pointer size %d", sz);
     *(void **)p = (void *)v;
   }
 }
@@ -40,7 +40,8 @@ static LJ_AINLINE GCcdata *lj_cdata_new(CTState *cts, CTypeID id, CTSize sz)
   GCcdata *cd;
 #ifdef LUA_USE_ASSERT
   CType *ct = ctype_raw(cts, id);
-  lua_assert((ctype_hassize(ct->info) ? ct->size : CTSIZE_PTR) == sz);
+  lj_assertCTS((ctype_hassize(ct->info) ? ct->size : CTSIZE_PTR) == sz,
+	       "inconsistent size of fixed-size cdata alloc");
 #endif
   cd = (GCcdata *)lj_mem_newgco(cts->L, sizeof(GCcdata) + sz);
   cd->gct = ~LJ_TCDATA;
diff --git a/src/lj_clib.c b/src/lj_clib.c
index a8672052..2f11b2e9 100644
--- a/src/lj_clib.c
+++ b/src/lj_clib.c
@@ -349,7 +349,8 @@ TValue *lj_clib_index(lua_State *L, CLibrary *cl, GCstr *name)
       lj_err_callerv(L, LJ_ERR_FFI_NODECL, strdata(name));
     if (ctype_isconstval(ct->info)) {
       CType *ctt = ctype_child(cts, ct);
-      lua_assert(ctype_isinteger(ctt->info) && ctt->size <= 4);
+      lj_assertCTS(ctype_isinteger(ctt->info) && ctt->size <= 4,
+		   "only 32 bit const supported");  /* NYI */
       if ((ctt->info & CTF_UNSIGNED) && (int32_t)ct->size < 0)
 	setnumV(tv, (lua_Number)(uint32_t)ct->size);
       else
@@ -361,7 +362,8 @@ TValue *lj_clib_index(lua_State *L, CLibrary *cl, GCstr *name)
 #endif
       void *p = clib_getsym(cl, sym);
       GCcdata *cd;
-      lua_assert(ctype_isfunc(ct->info) || ctype_isextern(ct->info));
+      lj_assertCTS(ctype_isfunc(ct->info) || ctype_isextern(ct->info),
+		   "unexpected ctype %08x in clib", ct->info);
 #if LJ_TARGET_X86 && LJ_ABI_WIN
       /* Retry with decorated name for fastcall/stdcall functions. */
       if (!p && ctype_isfunc(ct->info)) {
diff --git a/src/lj_cparse.c b/src/lj_cparse.c
index cd032b8e..6d9490ca 100644
--- a/src/lj_cparse.c
+++ b/src/lj_cparse.c
@@ -28,6 +28,12 @@
 ** If in doubt, please check the input against your favorite C compiler.
 */
 
+#ifdef LUA_USE_ASSERT
+#define lj_assertCP(c, ...)	(lj_assertG_(G(cp->L), (c), __VA_ARGS__))
+#else
+#define lj_assertCP(c, ...)	((void)cp)
+#endif
+
 /* -- Miscellaneous ------------------------------------------------------- */
 
 /* Match string against a C literal. */
@@ -61,7 +67,7 @@ LJ_NORET static void cp_err(CPState *cp, ErrMsg em);
 
 static const char *cp_tok2str(CPState *cp, CPToken tok)
 {
-  lua_assert(tok < CTOK_FIRSTDECL);
+  lj_assertCP(tok < CTOK_FIRSTDECL, "bad CPToken %d", tok);
   if (tok > CTOK_OFS)
     return ctoknames[tok-CTOK_OFS-1];
   else if (!lj_char_iscntrl(tok))
@@ -392,7 +398,7 @@ static void cp_init(CPState *cp)
   cp->curpack = 0;
   cp->packstack[0] = 255;
   lj_buf_init(cp->L, &cp->sb);
-  lua_assert(cp->p != NULL);
+  lj_assertCP(cp->p != NULL, "uninitialized cp->p");
   cp_get(cp);  /* Read-ahead first char. */
   cp->tok = 0;
   cp->tmask = CPNS_DEFAULT;
@@ -853,12 +859,13 @@ static CTypeID cp_decl_intern(CPState *cp, CPDecl *decl)
     /* The cid is already part of info for copies of pointers/functions. */
     idx = ct->next;
     if (ctype_istypedef(info)) {
-      lua_assert(id == 0);
+      lj_assertCP(id == 0, "typedef not at toplevel");
       id = ctype_cid(info);
       /* Always refetch info/size, since struct/enum may have been completed. */
       cinfo = ctype_get(cp->cts, id)->info;
       csize = ctype_get(cp->cts, id)->size;
-      lua_assert(ctype_isstruct(cinfo) || ctype_isenum(cinfo));
+      lj_assertCP(ctype_isstruct(cinfo) || ctype_isenum(cinfo),
+		  "typedef of bad type");
     } else if (ctype_isfunc(info)) {  /* Intern function. */
       CType *fct;
       CTypeID fid;
@@ -891,7 +898,7 @@ static CTypeID cp_decl_intern(CPState *cp, CPDecl *decl)
       /* Inherit csize/cinfo from original type. */
     } else {
       if (ctype_isnum(info)) {  /* Handle mode/vector-size attributes. */
-	lua_assert(id == 0);
+	lj_assertCP(id == 0, "number not at toplevel");
 	if (!(info & CTF_BOOL)) {
 	  CTSize msize = ctype_msizeP(decl->attr);
 	  CTSize vsize = ctype_vsizeP(decl->attr);
@@ -946,7 +953,7 @@ static CTypeID cp_decl_intern(CPState *cp, CPDecl *decl)
 	  info = (info & ~CTF_ALIGN) | (cinfo & CTF_ALIGN);
 	info |= (cinfo & CTF_QUAL);  /* Inherit qual. */
       } else {
-	lua_assert(ctype_isvoid(info));
+	lj_assertCP(ctype_isvoid(info), "bad ctype %08x", info);
       }
       csize = size;
       cinfo = info+id;
@@ -1596,7 +1603,7 @@ end_decl:
 	cp_errmsg(cp, cp->tok, LJ_ERR_FFI_DECLSPEC);
       sz = sizeof(int);
     }
-    lua_assert(sz != 0);
+    lj_assertCP(sz != 0, "basic ctype with zero size");
     info += CTALIGN(lj_fls(sz));  /* Use natural alignment. */
     info += (decl->attr & CTF_QUAL);  /* Merge qualifiers. */
     cp_push(decl, info, sz);
@@ -1856,7 +1863,7 @@ static void cp_decl_multi(CPState *cp)
 	  /* Treat both static and extern function declarations as extern. */
 	  ct = ctype_get(cp->cts, ctypeid);
 	  /* We always get new anonymous functions (typedefs are copied). */
-	  lua_assert(gcref(ct->name) == NULL);
+	  lj_assertCP(gcref(ct->name) == NULL, "unexpected named function");
 	  id = ctypeid;  /* Just name it. */
 	} else if ((scl & CDF_STATIC)) {  /* Accept static constants. */
 	  id = cp_decl_constinit(cp, &ct, ctypeid);
@@ -1913,7 +1920,7 @@ static TValue *cpcparser(lua_State *L, lua_CFunction dummy, void *ud)
     cp_decl_single(cp);
   if (cp->param && cp->param != cp->L->top)
     cp_err(cp, LJ_ERR_FFI_NUMPARAM);
-  lua_assert(cp->depth == 0);
+  lj_assertCP(cp->depth == 0, "unbalanced cparser declaration depth");
   return NULL;
 }
 
diff --git a/src/lj_crecord.c b/src/lj_crecord.c
index 804cdbf4..e1d1110f 100644
--- a/src/lj_crecord.c
+++ b/src/lj_crecord.c
@@ -61,7 +61,8 @@ static GCcdata *argv2cdata(jit_State *J, TRef tr, cTValue *o)
 static CTypeID crec_constructor(jit_State *J, GCcdata *cd, TRef tr)
 {
   CTypeID id;
-  lua_assert(tref_iscdata(tr) && cd->ctypeid == CTID_CTYPEID);
+  lj_assertJ(tref_iscdata(tr) && cd->ctypeid == CTID_CTYPEID,
+	     "expected CTypeID cdata");
   id = *(CTypeID *)cdataptr(cd);
   tr = emitir(IRT(IR_FLOAD, IRT_INT), tr, IRFL_CDATA_INT);
   emitir(IRTG(IR_EQ, IRT_INT), tr, lj_ir_kint(J, (int32_t)id));
@@ -237,13 +238,14 @@ static void crec_copy(jit_State *J, TRef trdst, TRef trsrc, TRef trlen,
     if (len > CREC_COPY_MAXLEN) goto fallback;
     if (ct) {
       CTState *cts = ctype_ctsG(J2G(J));
-      lua_assert(ctype_isarray(ct->info) || ctype_isstruct(ct->info));
+      lj_assertJ(ctype_isarray(ct->info) || ctype_isstruct(ct->info),
+		 "copy of non-aggregate");
       if (ctype_isarray(ct->info)) {
 	CType *cct = ctype_rawchild(cts, ct);
 	tp = crec_ct2irt(cts, cct);
 	if (tp == IRT_CDATA) goto rawcopy;
 	step = lj_ir_type_size[tp];
-	lua_assert((len & (step-1)) == 0);
+	lj_assertJ((len & (step-1)) == 0, "copy of fractional size");
       } else if ((ct->info & CTF_UNION)) {
 	step = (1u << ctype_align(ct->info));
 	goto rawcopy;
@@ -629,7 +631,8 @@ static TRef crec_ct_tv(jit_State *J, CType *d, TRef dp, TRef sp, cTValue *sval)
       /* Specialize to the name of the enum constant. */
       emitir(IRTG(IR_EQ, IRT_STR), sp, lj_ir_kstr(J, str));
       if (cct && ctype_isconstval(cct->info)) {
-	lua_assert(ctype_child(cts, cct)->size == 4);
+	lj_assertJ(ctype_child(cts, cct)->size == 4,
+		   "only 32 bit const supported");  /* NYI */
 	svisnz = (void *)(intptr_t)(ofs != 0);
 	sp = lj_ir_kint(J, (int32_t)ofs);
 	sid = ctype_cid(cct->info);
@@ -756,7 +759,7 @@ static void crec_index_bf(jit_State *J, RecordFFData *rd, TRef ptr, CTInfo info)
   IRType t = IRT_I8 + 2*lj_fls(ctype_bitcsz(info)) + ((info&CTF_UNSIGNED)?1:0);
   TRef tr = emitir(IRT(IR_XLOAD, t), ptr, 0);
   CTSize pos = ctype_bitpos(info), bsz = ctype_bitbsz(info), shift = 32 - bsz;
-  lua_assert(t <= IRT_U32);  /* NYI: 64 bit bitfields. */
+  lj_assertJ(t <= IRT_U32, "only 32 bit bitfields supported");  /* NYI */
   if (rd->data == 0) {  /* __index metamethod. */
     if ((info & CTF_BOOL)) {
       tr = emitir(IRTI(IR_BAND), tr, lj_ir_kint(J, (int32_t)((1u << pos))));
@@ -768,7 +771,7 @@ static void crec_index_bf(jit_State *J, RecordFFData *rd, TRef ptr, CTInfo info)
       tr = emitir(IRTI(IR_BSHL), tr, lj_ir_kint(J, shift - pos));
       tr = emitir(IRTI(IR_BSAR), tr, lj_ir_kint(J, shift));
     } else {
-      lua_assert(bsz < 32);  /* Full-size fields cannot end up here. */
+      lj_assertJ(bsz < 32, "unexpected full bitfield index");
       tr = emitir(IRTI(IR_BSHR), tr, lj_ir_kint(J, pos));
       tr = emitir(IRTI(IR_BAND), tr, lj_ir_kint(J, (int32_t)((1u << bsz)-1)));
       /* We can omit the U32 to NUM conversion, since bsz < 32. */
@@ -883,7 +886,7 @@ again:
 	  crec_index_bf(J, rd, ptr, fct->info);
 	  return;
 	} else {
-	  lua_assert(ctype_isfield(fct->info));
+	  lj_assertJ(ctype_isfield(fct->info), "field expected");
 	  sid = ctype_cid(fct->info);
 	}
       }
@@ -1133,7 +1136,7 @@ static TRef crec_call_args(jit_State *J, RecordFFData *rd,
     if (fid) {  /* Get argument type from field. */
       CType *ctf = ctype_get(cts, fid);
       fid = ctf->sib;
-      lua_assert(ctype_isfield(ctf->info));
+      lj_assertJ(ctype_isfield(ctf->info), "field expected");
       did = ctype_cid(ctf->info);
     } else {
       if (!(ct->info & CTF_VARARG))
diff --git a/src/lj_ctype.c b/src/lj_ctype.c
index 0ea89c74..a42e3d60 100644
--- a/src/lj_ctype.c
+++ b/src/lj_ctype.c
@@ -153,7 +153,7 @@ CTypeID lj_ctype_new(CTState *cts, CType **ctp)
 {
   CTypeID id = cts->top;
   CType *ct;
-  lua_assert(cts->L);
+  lj_assertCTS(cts->L, "uninitialized cts->L");
   if (LJ_UNLIKELY(id >= cts->sizetab)) {
     if (id >= CTID_MAX) lj_err_msg(cts->L, LJ_ERR_TABOV);
 #ifdef LUAJIT_CTYPE_CHECK_ANCHOR
@@ -182,7 +182,7 @@ CTypeID lj_ctype_intern(CTState *cts, CTInfo info, CTSize size)
 {
   uint32_t h = ct_hashtype(info, size);
   CTypeID id = cts->hash[h];
-  lua_assert(cts->L);
+  lj_assertCTS(cts->L, "uninitialized cts->L");
   while (id) {
     CType *ct = ctype_get(cts, id);
     if (ct->info == info && ct->size == size)
@@ -298,9 +298,9 @@ CTSize lj_ctype_vlsize(CTState *cts, CType *ct, CTSize nelem)
     }
     ct = ctype_raw(cts, arrid);
   }
-  lua_assert(ctype_isvlarray(ct->info));  /* Must be a VLA. */
+  lj_assertCTS(ctype_isvlarray(ct->info), "VLA expected");
   ct = ctype_rawchild(cts, ct);  /* Get array element. */
-  lua_assert(ctype_hassize(ct->info));
+  lj_assertCTS(ctype_hassize(ct->info), "bad VLA without size");
   /* Calculate actual size of VLA and check for overflow. */
   xsz += (uint64_t)ct->size * nelem;
   return xsz < 0x80000000u ? (CTSize)xsz : CTSIZE_INVALID;
@@ -323,7 +323,8 @@ CTInfo lj_ctype_info(CTState *cts, CTypeID id, CTSize *szp)
     } else {
       if (!(qual & CTFP_ALIGNED)) qual |= (info & CTF_ALIGN);
       qual |= (info & ~(CTF_ALIGN|CTMASK_CID));
-      lua_assert(ctype_hassize(info) || ctype_isfunc(info));
+      lj_assertCTS(ctype_hassize(info) || ctype_isfunc(info),
+		   "ctype without size");
       *szp = ctype_isfunc(info) ? CTSIZE_INVALID : ct->size;
       break;
     }
@@ -528,7 +529,7 @@ static void ctype_repr(CTRepr *ctr, CTypeID id)
       ctype_appc(ctr, ')');
       break;
     default:
-      lua_assert(0);
+      lj_assertG_(ctr->cts->g, 0, "bad ctype %08x", info);
       break;
     }
     ct = ctype_get(ctr->cts, ctype_cid(info));
diff --git a/src/lj_ctype.h b/src/lj_ctype.h
index 0c220a88..c4f3bdde 100644
--- a/src/lj_ctype.h
+++ b/src/lj_ctype.h
@@ -260,6 +260,12 @@ typedef struct CTState {
 
 #define CT_MEMALIGN	3	/* Alignment guaranteed by memory allocator. */
 
+#ifdef LUA_USE_ASSERT
+#define lj_assertCTS(c, ...)	(lj_assertG_(cts->g, (c), __VA_ARGS__))
+#else
+#define lj_assertCTS(c, ...)	((void)cts)
+#endif
+
 /* -- Predefined types ---------------------------------------------------- */
 
 /* Target-dependent types. */
@@ -392,7 +398,8 @@ static LJ_AINLINE CTState *ctype_cts(lua_State *L)
 /* Check C type ID for validity when assertions are enabled. */
 static LJ_AINLINE CTypeID ctype_check(CTState *cts, CTypeID id)
 {
-  lua_assert(id > 0 && id < cts->top); UNUSED(cts);
+  UNUSED(cts);
+  lj_assertCTS(id > 0 && id < cts->top, "bad CTID %d", id);
   return id;
 }
 
@@ -408,8 +415,9 @@ static LJ_AINLINE CType *ctype_get(CTState *cts, CTypeID id)
 /* Get child C type. */
 static LJ_AINLINE CType *ctype_child(CTState *cts, CType *ct)
 {
-  lua_assert(!(ctype_isvoid(ct->info) || ctype_isstruct(ct->info) ||
-	     ctype_isbitfield(ct->info)));  /* These don't have children. */
+  lj_assertCTS(!(ctype_isvoid(ct->info) || ctype_isstruct(ct->info) ||
+	       ctype_isbitfield(ct->info)),
+	       "ctype %08x has no children", ct->info);
   return ctype_get(cts, ctype_cid(ct->info));
 }
 
diff --git a/src/lj_debug.c b/src/lj_debug.c
index c4edcabb..46c442c6 100644
--- a/src/lj_debug.c
+++ b/src/lj_debug.c
@@ -55,7 +55,8 @@ static BCPos debug_framepc(lua_State *L, GCfunc *fn, cTValue *nextframe)
   const BCIns *ins;
   GCproto *pt;
   BCPos pos;
-  lua_assert(fn->c.gct == ~LJ_TFUNC || fn->c.gct == ~LJ_TTHREAD);
+  lj_assertL(fn->c.gct == ~LJ_TFUNC || fn->c.gct == ~LJ_TTHREAD,
+	     "function or frame expected");
   if (!isluafunc(fn)) {  /* Cannot derive a PC for non-Lua functions. */
     return NO_BCPOS;
   } else if (nextframe == NULL) {  /* Lua function on top. */
@@ -101,7 +102,7 @@ static BCPos debug_framepc(lua_State *L, GCfunc *fn, cTValue *nextframe)
 #if LJ_HASJIT
   if (pos > pt->sizebc) {  /* Undo the effects of lj_trace_exit for JLOOP. */
     GCtrace *T = (GCtrace *)((char *)(ins-1) - offsetof(GCtrace, startins));
-    lua_assert(bc_isret(bc_op(ins[-1])));
+    lj_assertL(bc_isret(bc_op(ins[-1])), "return bytecode expected");
     pos = proto_bcpos(pt, mref(T->startpc, const BCIns));
   }
 #endif
@@ -134,7 +135,7 @@ BCLine lj_debug_frameline(lua_State *L, GCfunc *fn, cTValue *nextframe)
   BCPos pc = debug_framepc(L, fn, nextframe);
   if (pc != NO_BCPOS) {
     GCproto *pt = funcproto(fn);
-    lua_assert(pc <= pt->sizebc);
+    lj_assertL(pc <= pt->sizebc, "PC out of range");
     return lj_debug_line(pt, pc);
   }
   return -1;
@@ -215,7 +216,7 @@ static TValue *debug_localname(lua_State *L, const lua_Debug *ar,
 const char *lj_debug_uvname(GCproto *pt, uint32_t idx)
 {
   const uint8_t *p = proto_uvinfo(pt);
-  lua_assert(idx < pt->sizeuv);
+  lj_assertX(idx < pt->sizeuv, "bad upvalue index");
   if (!p) return "";
   if (idx) while (*p++ || --idx) ;
   return (const char *)p;
@@ -440,13 +441,14 @@ int lj_debug_getinfo(lua_State *L, const char *what, lj_Debug *ar, int ext)
   } else {
     uint32_t offset = (uint32_t)ar->i_ci & 0xffff;
     uint32_t size = (uint32_t)ar->i_ci >> 16;
-    lua_assert(offset != 0);
+    lj_assertL(offset != 0, "bad frame offset");
     frame = tvref(L->stack) + offset;
     if (size) nextframe = frame + size;
-    lua_assert(frame <= tvref(L->maxstack) &&
-	       (!nextframe || nextframe <= tvref(L->maxstack)));
+    lj_assertL(frame <= tvref(L->maxstack) &&
+	       (!nextframe || nextframe <= tvref(L->maxstack)),
+	       "broken frame chain");
     fn = frame_func(frame);
-    lua_assert(fn->c.gct == ~LJ_TFUNC);
+    lj_assertL(fn->c.gct == ~LJ_TFUNC, "bad frame function");
   }
   for (; *what; what++) {
     if (*what == 'S') {
diff --git a/src/lj_def.h b/src/lj_def.h
index 2d8fff66..ba4dcc9d 100644
--- a/src/lj_def.h
+++ b/src/lj_def.h
@@ -338,14 +338,28 @@ static LJ_AINLINE uint32_t lj_getu32(const void *v)
 #define LJ_FUNCA_NORET	LJ_FUNCA LJ_NORET
 #define LJ_ASMF_NORET	LJ_ASMF LJ_NORET
 
-/* Runtime assertions. */
-#ifdef lua_assert
-#define check_exp(c, e)		(lua_assert(c), (e))
-#define api_check(l, e)		lua_assert(e)
+/* Internal assertions. */
+#if defined(LUA_USE_ASSERT) || defined(LUA_USE_APICHECK)
+#define lj_assert_check(g, c, ...) \
+  ((c) ? (void)0 : \
+   (lj_assert_fail((g), __FILE__, __LINE__, __func__, __VA_ARGS__), 0))
+#define lj_checkapi(c, ...)	lj_assert_check(G(L), (c), __VA_ARGS__)
 #else
-#define lua_assert(c)		((void)0)
+#define lj_checkapi(c, ...)	((void)L)
+#endif
+
+#ifdef LUA_USE_ASSERT
+#define lj_assertG_(g, c, ...)	lj_assert_check((g), (c), __VA_ARGS__)
+#define lj_assertG(c, ...)	lj_assert_check(g, (c), __VA_ARGS__)
+#define lj_assertL(c, ...)	lj_assert_check(G(L), (c), __VA_ARGS__)
+#define lj_assertX(c, ...)	lj_assert_check(NULL, (c), __VA_ARGS__)
+#define check_exp(c, e)		(lj_assertX((c), #c), (e))
+#else
+#define lj_assertG_(g, c, ...)	((void)0)
+#define lj_assertG(c, ...)	((void)g)
+#define lj_assertL(c, ...)	((void)L)
+#define lj_assertX(c, ...)	((void)0)
 #define check_exp(c, e)		(e)
-#define api_check		luai_apicheck
 #endif
 
 /* Static assertions. */
diff --git a/src/lj_dispatch.c b/src/lj_dispatch.c
index ee735450..ddee68de 100644
--- a/src/lj_dispatch.c
+++ b/src/lj_dispatch.c
@@ -380,7 +380,7 @@ static void callhook(lua_State *L, int event, BCLine line)
     hook_enter(g);
 #endif
     hookf(L, &ar);
-    lua_assert(hook_active(g));
+    lj_assertG(hook_active(g), "active hook flag removed");
     setgcref(g->cur_L, obj2gco(L));
 #if LJ_HASPROFILE && !LJ_PROFILE_SIGPROF
     lj_profile_hook_leave(g);
@@ -428,7 +428,8 @@ void LJ_FASTCALL lj_dispatch_ins(lua_State *L, const BCIns *pc)
 #endif
       J->L = L;
       lj_trace_ins(J, pc-1);  /* The interpreter bytecode PC is offset by 1. */
-      lua_assert(L->top - L->base == delta);
+      lj_assertG(L->top - L->base == delta,
+		 "unbalanced stack after tracing of instruction");
     }
   }
 #endif
@@ -488,7 +489,8 @@ ASMFunction LJ_FASTCALL lj_dispatch_call(lua_State *L, const BCIns *pc)
 #endif
     pc = (const BCIns *)((uintptr_t)pc & ~(uintptr_t)1);
     lj_trace_hot(J, pc);
-    lua_assert(L->top - L->base == delta);
+    lj_assertG(L->top - L->base == delta,
+	       "unbalanced stack after hot call");
     goto out;
   } else if (J->state != LJ_TRACE_IDLE &&
 	     !(g->hookmask & (HOOK_GC|HOOK_VMEVENT))) {
@@ -497,7 +499,8 @@ ASMFunction LJ_FASTCALL lj_dispatch_call(lua_State *L, const BCIns *pc)
 #endif
     /* Record the FUNC* bytecodes, too. */
     lj_trace_ins(J, pc-1);  /* The interpreter bytecode PC is offset by 1. */
-    lua_assert(L->top - L->base == delta);
+    lj_assertG(L->top - L->base == delta,
+	       "unbalanced stack after hot instruction");
   }
 #endif
   if ((g->hookmask & LUA_MASKCALL)) {
diff --git a/src/lj_emit_arm.h b/src/lj_emit_arm.h
index dee8bdcc..ee299821 100644
--- a/src/lj_emit_arm.h
+++ b/src/lj_emit_arm.h
@@ -81,7 +81,8 @@ static void emit_m(ASMState *as, ARMIns ai, Reg rm)
 
 static void emit_lsox(ASMState *as, ARMIns ai, Reg rd, Reg rn, int32_t ofs)
 {
-  lua_assert(ofs >= -255 && ofs <= 255);
+  lj_assertA(ofs >= -255 && ofs <= 255,
+	     "load/store offset %d out of range", ofs);
   if (ofs < 0) ofs = -ofs; else ai |= ARMI_LS_U;
   *--as->mcp = ai | ARMI_LS_P | ARMI_LSX_I | ARMF_D(rd) | ARMF_N(rn) |
 	       ((ofs & 0xf0) << 4) | (ofs & 0x0f);
@@ -89,7 +90,8 @@ static void emit_lsox(ASMState *as, ARMIns ai, Reg rd, Reg rn, int32_t ofs)
 
 static void emit_lso(ASMState *as, ARMIns ai, Reg rd, Reg rn, int32_t ofs)
 {
-  lua_assert(ofs >= -4095 && ofs <= 4095);
+  lj_assertA(ofs >= -4095 && ofs <= 4095,
+	     "load/store offset %d out of range", ofs);
   /* Combine LDR/STR pairs to LDRD/STRD. */
   if (*as->mcp == (ai|ARMI_LS_P|ARMI_LS_U|ARMF_D(rd^1)|ARMF_N(rn)|(ofs^4)) &&
       (ai & ~(ARMI_LDR^ARMI_STR)) == ARMI_STR && rd != rn &&
@@ -106,7 +108,8 @@ static void emit_lso(ASMState *as, ARMIns ai, Reg rd, Reg rn, int32_t ofs)
 #if !LJ_SOFTFP
 static void emit_vlso(ASMState *as, ARMIns ai, Reg rd, Reg rn, int32_t ofs)
 {
-  lua_assert(ofs >= -1020 && ofs <= 1020 && (ofs&3) == 0);
+  lj_assertA(ofs >= -1020 && ofs <= 1020 && (ofs&3) == 0,
+	     "load/store offset %d out of range", ofs);
   if (ofs < 0) ofs = -ofs; else ai |= ARMI_LS_U;
   *--as->mcp = ai | ARMI_LS_P | ARMF_D(rd & 15) | ARMF_N(rn) | (ofs >> 2);
 }
@@ -124,7 +127,7 @@ static int emit_kdelta1(ASMState *as, Reg d, int32_t i)
   while (work) {
     Reg r = rset_picktop(work);
     IRRef ref = regcost_ref(as->cost[r]);
-    lua_assert(r != d);
+    lj_assertA(r != d, "dest reg not free");
     if (emit_canremat(ref)) {
       int32_t delta = i - (ra_iskref(ref) ? ra_krefk(as, ref) : IR(ref)->i);
       uint32_t k = emit_isk12(ARMI_ADD, delta);
@@ -142,13 +145,13 @@ static int emit_kdelta1(ASMState *as, Reg d, int32_t i)
 }
 
 /* Try to find a two step delta relative to another constant. */
-static int emit_kdelta2(ASMState *as, Reg d, int32_t i)
+static int emit_kdelta2(ASMState *as, Reg rd, int32_t i)
 {
   RegSet work = ~as->freeset & RSET_GPR;
   while (work) {
     Reg r = rset_picktop(work);
     IRRef ref = regcost_ref(as->cost[r]);
-    lua_assert(r != d);
+    lj_assertA(r != rd, "dest reg %d not free", rd);
     if (emit_canremat(ref)) {
       int32_t other = ra_iskref(ref) ? ra_krefk(as, ref) : IR(ref)->i;
       if (other) {
@@ -159,8 +162,8 @@ static int emit_kdelta2(ASMState *as, Reg d, int32_t i)
 	k2 = emit_isk12(0, delta & (255 << sh));
 	k = emit_isk12(0, delta & ~(255 << sh));
 	if (k) {
-	  emit_dn(as, ARMI_ADD^k2^inv, d, d);
-	  emit_dn(as, ARMI_ADD^k^inv, d, r);
+	  emit_dn(as, ARMI_ADD^k2^inv, rd, rd);
+	  emit_dn(as, ARMI_ADD^k^inv, rd, r);
 	  return 1;
 	}
       }
@@ -171,23 +174,24 @@ static int emit_kdelta2(ASMState *as, Reg d, int32_t i)
 }
 
 /* Load a 32 bit constant into a GPR. */
-static void emit_loadi(ASMState *as, Reg r, int32_t i)
+static void emit_loadi(ASMState *as, Reg rd, int32_t i)
 {
   uint32_t k = emit_isk12(ARMI_MOV, i);
-  lua_assert(rset_test(as->freeset, r) || r == RID_TMP);
+  lj_assertA(rset_test(as->freeset, rd) || rd == RID_TMP,
+	     "dest reg %d not free", rd);
   if (k) {
     /* Standard K12 constant. */
-    emit_d(as, ARMI_MOV^k, r);
+    emit_d(as, ARMI_MOV^k, rd);
   } else if ((as->flags & JIT_F_ARMV6T2) && (uint32_t)i < 0x00010000u) {
     /* 16 bit loword constant for ARMv6T2. */
-    emit_d(as, ARMI_MOVW|(i & 0x0fff)|((i & 0xf000)<<4), r);
-  } else if (emit_kdelta1(as, r, i)) {
+    emit_d(as, ARMI_MOVW|(i & 0x0fff)|((i & 0xf000)<<4), rd);
+  } else if (emit_kdelta1(as, rd, i)) {
     /* One step delta relative to another constant. */
   } else if ((as->flags & JIT_F_ARMV6T2)) {
     /* 32 bit hiword/loword constant for ARMv6T2. */
-    emit_d(as, ARMI_MOVT|((i>>16) & 0x0fff)|(((i>>16) & 0xf000)<<4), r);
-    emit_d(as, ARMI_MOVW|(i & 0x0fff)|((i & 0xf000)<<4), r);
-  } else if (emit_kdelta2(as, r, i)) {
+    emit_d(as, ARMI_MOVT|((i>>16) & 0x0fff)|(((i>>16) & 0xf000)<<4), rd);
+    emit_d(as, ARMI_MOVW|(i & 0x0fff)|((i & 0xf000)<<4), rd);
+  } else if (emit_kdelta2(as, rd, i)) {
     /* Two step delta relative to another constant. */
   } else {
     /* Otherwise construct the constant with up to 4 instructions. */
@@ -197,15 +201,15 @@ static void emit_loadi(ASMState *as, Reg r, int32_t i)
       int32_t m = i & (255 << sh);
       i &= ~(255 << sh);
       if (i == 0) {
-	emit_d(as, ARMI_MOV ^ emit_isk12(0, m), r);
+	emit_d(as, ARMI_MOV ^ emit_isk12(0, m), rd);
 	break;
       }
-      emit_dn(as, ARMI_ORR ^ emit_isk12(0, m), r, r);
+      emit_dn(as, ARMI_ORR ^ emit_isk12(0, m), rd, rd);
     }
   }
 }
 
-#define emit_loada(as, r, addr)		emit_loadi(as, (r), i32ptr((addr)))
+#define emit_loada(as, rd, addr)	emit_loadi(as, (rd), i32ptr((addr)))
 
 static Reg ra_allock(ASMState *as, intptr_t k, RegSet allow);
 
@@ -261,7 +265,7 @@ static void emit_branch(ASMState *as, ARMIns ai, MCode *target)
 {
   MCode *p = as->mcp;
   ptrdiff_t delta = (target - p) - 1;
-  lua_assert(((delta + 0x00800000) >> 24) == 0);
+  lj_assertA(((delta + 0x00800000) >> 24) == 0, "branch target out of range");
   *--p = ai | ((uint32_t)delta & 0x00ffffffu);
   as->mcp = p;
 }
@@ -289,7 +293,7 @@ static void emit_call(ASMState *as, void *target)
 static void emit_movrr(ASMState *as, IRIns *ir, Reg dst, Reg src)
 {
 #if LJ_SOFTFP
-  lua_assert(!irt_isnum(ir->t)); UNUSED(ir);
+  lj_assertA(!irt_isnum(ir->t), "unexpected FP op"); UNUSED(ir);
 #else
   if (dst >= RID_MAX_GPR) {
     emit_dm(as, irt_isnum(ir->t) ? ARMI_VMOV_D : ARMI_VMOV_S,
@@ -313,7 +317,7 @@ static void emit_movrr(ASMState *as, IRIns *ir, Reg dst, Reg src)
 static void emit_loadofs(ASMState *as, IRIns *ir, Reg r, Reg base, int32_t ofs)
 {
 #if LJ_SOFTFP
-  lua_assert(!irt_isnum(ir->t)); UNUSED(ir);
+  lj_assertA(!irt_isnum(ir->t), "unexpected FP op"); UNUSED(ir);
 #else
   if (r >= RID_MAX_GPR)
     emit_vlso(as, irt_isnum(ir->t) ? ARMI_VLDR_D : ARMI_VLDR_S, r, base, ofs);
@@ -326,7 +330,7 @@ static void emit_loadofs(ASMState *as, IRIns *ir, Reg r, Reg base, int32_t ofs)
 static void emit_storeofs(ASMState *as, IRIns *ir, Reg r, Reg base, int32_t ofs)
 {
 #if LJ_SOFTFP
-  lua_assert(!irt_isnum(ir->t)); UNUSED(ir);
+  lj_assertA(!irt_isnum(ir->t), "unexpected FP op"); UNUSED(ir);
 #else
   if (r >= RID_MAX_GPR)
     emit_vlso(as, irt_isnum(ir->t) ? ARMI_VSTR_D : ARMI_VSTR_S, r, base, ofs);
diff --git a/src/lj_emit_arm64.h b/src/lj_emit_arm64.h
index 1001b1d8..96fbab72 100644
--- a/src/lj_emit_arm64.h
+++ b/src/lj_emit_arm64.h
@@ -8,8 +8,9 @@
 
 /* -- Constant encoding --------------------------------------------------- */
 
-static uint64_t get_k64val(IRIns *ir)
+static uint64_t get_k64val(ASMState *as, IRRef ref)
 {
+  IRIns *ir = IR(ref);
   if (ir->o == IR_KINT64) {
     return ir_kint64(ir)->u64;
   } else if (ir->o == IR_KGC) {
@@ -17,7 +18,8 @@ static uint64_t get_k64val(IRIns *ir)
   } else if (ir->o == IR_KPTR || ir->o == IR_KKPTR) {
     return (uint64_t)ir_kptr(ir);
   } else {
-    lua_assert(ir->o == IR_KINT || ir->o == IR_KNULL);
+    lj_assertA(ir->o == IR_KINT || ir->o == IR_KNULL,
+	       "bad 64 bit const IR op %d", ir->o);
     return ir->i;  /* Sign-extended. */
   }
 }
@@ -122,7 +124,7 @@ static int emit_checkofs(A64Ins ai, int64_t ofs)
 static void emit_lso(ASMState *as, A64Ins ai, Reg rd, Reg rn, int64_t ofs)
 {
   int ot = emit_checkofs(ai, ofs), sc = (ai >> 30) & 3;
-  lua_assert(ot);
+  lj_assertA(ot, "load/store offset %d out of range", ofs);
   /* Combine LDR/STR pairs to LDP/STP. */
   if ((sc == 2 || sc == 3) &&
       (!(ai & 0x400000) || rd != rn) &&
@@ -166,10 +168,10 @@ static int emit_kdelta(ASMState *as, Reg rd, uint64_t k, int lim)
   while (work) {
     Reg r = rset_picktop(work);
     IRRef ref = regcost_ref(as->cost[r]);
-    lua_assert(r != rd);
+    lj_assertA(r != rd, "dest reg %d not free", rd);
     if (ref < REF_TRUE) {
       uint64_t kx = ra_iskref(ref) ? (uint64_t)ra_krefk(as, ref) :
-				     get_k64val(IR(ref));
+				     get_k64val(as, ref);
       int64_t delta = (int64_t)(k - kx);
       if (delta == 0) {
 	emit_dm(as, A64I_MOVx, rd, r);
@@ -312,7 +314,7 @@ static void emit_cond_branch(ASMState *as, A64CC cond, MCode *target)
 {
   MCode *p = --as->mcp;
   ptrdiff_t delta = target - p;
-  lua_assert(A64F_S_OK(delta, 19));
+  lj_assertA(A64F_S_OK(delta, 19), "branch target out of range");
   *p = A64I_BCC | A64F_S19(delta) | cond;
 }
 
@@ -320,7 +322,7 @@ static void emit_branch(ASMState *as, A64Ins ai, MCode *target)
 {
   MCode *p = --as->mcp;
   ptrdiff_t delta = target - p;
-  lua_assert(A64F_S_OK(delta, 26));
+  lj_assertA(A64F_S_OK(delta, 26), "branch target out of range");
   *p = ai | A64F_S26(delta);
 }
 
@@ -328,7 +330,8 @@ static void emit_tnb(ASMState *as, A64Ins ai, Reg r, uint32_t bit, MCode *target
 {
   MCode *p = --as->mcp;
   ptrdiff_t delta = target - p;
-  lua_assert(bit < 63 && A64F_S_OK(delta, 14));
+  lj_assertA(bit < 63, "bit number out of range");
+  lj_assertA(A64F_S_OK(delta, 14), "branch target out of range");
   if (bit > 31) ai |= A64I_X;
   *p = ai | A64F_BIT(bit & 31) | A64F_S14(delta) | r;
 }
@@ -337,7 +340,7 @@ static void emit_cnb(ASMState *as, A64Ins ai, Reg r, MCode *target)
 {
   MCode *p = --as->mcp;
   ptrdiff_t delta = target - p;
-  lua_assert(A64F_S_OK(delta, 19));
+  lj_assertA(A64F_S_OK(delta, 19), "branch target out of range");
   *p = ai | A64F_S19(delta) | r;
 }
 
diff --git a/src/lj_emit_mips.h b/src/lj_emit_mips.h
index 313d030a..7f0d27ca 100644
--- a/src/lj_emit_mips.h
+++ b/src/lj_emit_mips.h
@@ -4,8 +4,9 @@
 */
 
 #if LJ_64
-static intptr_t get_k64val(IRIns *ir)
+static intptr_t get_k64val(ASMState *as, IRRef ref)
 {
+  IRIns *ir = IR(ref);
   if (ir->o == IR_KINT64) {
     return (intptr_t)ir_kint64(ir)->u64;
   } else if (ir->o == IR_KGC) {
@@ -15,16 +16,17 @@ static intptr_t get_k64val(IRIns *ir)
   } else if (LJ_SOFTFP && ir->o == IR_KNUM) {
     return (intptr_t)ir_knum(ir)->u64;
   } else {
-    lua_assert(ir->o == IR_KINT || ir->o == IR_KNULL);
+    lj_assertA(ir->o == IR_KINT || ir->o == IR_KNULL,
+	       "bad 64 bit const IR op %d", ir->o);
     return ir->i;  /* Sign-extended. */
   }
 }
 #endif
 
 #if LJ_64
-#define get_kval(ir)		get_k64val(ir)
+#define get_kval(as, ref)	get_k64val(as, ref)
 #else
-#define get_kval(ir)		((ir)->i)
+#define get_kval(as, ref)	(IR((ref))->i)
 #endif
 
 /* -- Emit basic instructions --------------------------------------------- */
@@ -82,18 +84,18 @@ static void emit_tsml(ASMState *as, MIPSIns mi, Reg rt, Reg rs, uint32_t msb,
 #define emit_canremat(ref)	((ref) <= REF_BASE)
 
 /* Try to find a one step delta relative to another constant. */
-static int emit_kdelta1(ASMState *as, Reg t, intptr_t i)
+static int emit_kdelta1(ASMState *as, Reg rd, intptr_t i)
 {
   RegSet work = ~as->freeset & RSET_GPR;
   while (work) {
     Reg r = rset_picktop(work);
     IRRef ref = regcost_ref(as->cost[r]);
-    lua_assert(r != t);
+    lj_assertA(r != rd, "dest reg %d not free", rd);
     if (ref < ASMREF_L) {
       intptr_t delta = (intptr_t)((uintptr_t)i -
-	(uintptr_t)(ra_iskref(ref) ? ra_krefk(as, ref) : get_kval(IR(ref))));
+	(uintptr_t)(ra_iskref(ref) ? ra_krefk(as, ref) : get_kval(as, ref)));
       if (checki16(delta)) {
-	emit_tsi(as, MIPSI_AADDIU, t, r, delta);
+	emit_tsi(as, MIPSI_AADDIU, rd, r, delta);
 	return 1;
       }
     }
@@ -223,7 +225,7 @@ static void emit_branch(ASMState *as, MIPSIns mi, Reg rs, Reg rt, MCode *target)
 {
   MCode *p = as->mcp;
   ptrdiff_t delta = target - p;
-  lua_assert(((delta + 0x8000) >> 16) == 0);
+  lj_assertA(((delta + 0x8000) >> 16) == 0, "branch target out of range");
   *--p = mi | MIPSF_S(rs) | MIPSF_T(rt) | ((uint32_t)delta & 0xffffu);
   as->mcp = p;
 }
@@ -299,7 +301,7 @@ static void emit_storeofs(ASMState *as, IRIns *ir, Reg r, Reg base, int32_t ofs)
 static void emit_addptr(ASMState *as, Reg r, int32_t ofs)
 {
   if (ofs) {
-    lua_assert(checki16(ofs));
+    lj_assertA(checki16(ofs), "offset %d out of range", ofs);
     emit_tsi(as, MIPSI_AADDIU, r, r, ofs);
   }
 }
diff --git a/src/lj_emit_ppc.h b/src/lj_emit_ppc.h
index 21c3c2ac..ddc864cd 100644
--- a/src/lj_emit_ppc.h
+++ b/src/lj_emit_ppc.h
@@ -41,13 +41,13 @@ static void emit_rot(ASMState *as, PPCIns pi, Reg ra, Reg rs,
 
 static void emit_slwi(ASMState *as, Reg ra, Reg rs, int32_t n)
 {
-  lua_assert(n >= 0 && n < 32);
+  lj_assertA(n >= 0 && n < 32, "shift out or range");
   emit_rot(as, PPCI_RLWINM, ra, rs, n, 0, 31-n);
 }
 
 static void emit_rotlwi(ASMState *as, Reg ra, Reg rs, int32_t n)
 {
-  lua_assert(n >= 0 && n < 32);
+  lj_assertA(n >= 0 && n < 32, "shift out or range");
   emit_rot(as, PPCI_RLWINM, ra, rs, n, 0, 31);
 }
 
@@ -57,17 +57,17 @@ static void emit_rotlwi(ASMState *as, Reg ra, Reg rs, int32_t n)
 #define emit_canremat(ref)	((ref) <= REF_BASE)
 
 /* Try to find a one step delta relative to another constant. */
-static int emit_kdelta1(ASMState *as, Reg t, int32_t i)
+static int emit_kdelta1(ASMState *as, Reg rd, int32_t i)
 {
   RegSet work = ~as->freeset & RSET_GPR;
   while (work) {
     Reg r = rset_picktop(work);
     IRRef ref = regcost_ref(as->cost[r]);
-    lua_assert(r != t);
+    lj_assertA(r != rd, "dest reg %d not free", rd);
     if (ref < ASMREF_L) {
       int32_t delta = i - (ra_iskref(ref) ? ra_krefk(as, ref) : IR(ref)->i);
       if (checki16(delta)) {
-	emit_tai(as, PPCI_ADDI, t, r, delta);
+	emit_tai(as, PPCI_ADDI, rd, r, delta);
 	return 1;
       }
     }
@@ -144,7 +144,7 @@ static void emit_condbranch(ASMState *as, PPCIns pi, PPCCC cc, MCode *target)
 {
   MCode *p = --as->mcp;
   ptrdiff_t delta = (char *)target - (char *)p;
-  lua_assert(((delta + 0x8000) >> 16) == 0);
+  lj_assertA(((delta + 0x8000) >> 16) == 0, "branch target out of range");
   pi ^= (delta & 0x8000) * (PPCF_Y/0x8000);
   *p = pi | PPCF_CC(cc) | ((uint32_t)delta & 0xffffu);
 }
diff --git a/src/lj_emit_x86.h b/src/lj_emit_x86.h
index b3dc4ea5..eaef17fc 100644
--- a/src/lj_emit_x86.h
+++ b/src/lj_emit_x86.h
@@ -92,7 +92,7 @@ static void emit_rr(ASMState *as, x86Op xo, Reg r1, Reg r2)
 /* [addr] is sign-extended in x64 and must be in lower 2G (not 4G). */
 static int32_t ptr2addr(const void *p)
 {
-  lua_assert((uintptr_t)p < (uintptr_t)0x80000000);
+  lj_assertX((uintptr_t)p < (uintptr_t)0x80000000, "pointer outside 2G range");
   return i32ptr(p);
 }
 #else
@@ -208,7 +208,7 @@ static void emit_mrm(ASMState *as, x86Op xo, Reg rr, Reg rb)
       rb = RID_ESP;
 #endif
     } else if (LJ_GC64 && rb == RID_RIP) {
-      lua_assert(as->mrm.idx == RID_NONE);
+      lj_assertA(as->mrm.idx == RID_NONE, "RIP-rel mrm cannot have index");
       mode = XM_OFS0;
       p -= 4;
       *(int32_t *)p = as->mrm.ofs;
@@ -401,7 +401,8 @@ static void emit_loadk64(ASMState *as, Reg r, IRIns *ir)
     emit_rma(as, xo, r64, k);
   } else {
     if (ir->i) {
-      lua_assert(*k == *(uint64_t*)(as->mctop - ir->i));
+      lj_assertA(*k == *(uint64_t*)(as->mctop - ir->i),
+		 "bad interned 64 bit constant");
     } else if (as->curins <= as->stopins && rset_test(RSET_GPR, r)) {
       emit_loadu64(as, r, *k);
       return;
@@ -433,7 +434,7 @@ static void emit_sjmp(ASMState *as, MCLabel target)
 {
   MCode *p = as->mcp;
   ptrdiff_t delta = target - p;
-  lua_assert(delta == (int8_t)delta);
+  lj_assertA(delta == (int8_t)delta, "short jump target out of range");
   p[-1] = (MCode)(int8_t)delta;
   p[-2] = XI_JMPs;
   as->mcp = p - 2;
@@ -445,7 +446,7 @@ static void emit_sjcc(ASMState *as, int cc, MCLabel target)
 {
   MCode *p = as->mcp;
   ptrdiff_t delta = target - p;
-  lua_assert(delta == (int8_t)delta);
+  lj_assertA(delta == (int8_t)delta, "short jump target out of range");
   p[-1] = (MCode)(int8_t)delta;
   p[-2] = (MCode)(XI_JCCs+(cc&15));
   as->mcp = p - 2;
@@ -471,10 +472,11 @@ static void emit_sfixup(ASMState *as, MCLabel source)
 #define emit_label(as)		((as)->mcp)
 
 /* Compute relative 32 bit offset for jump and call instructions. */
-static LJ_AINLINE int32_t jmprel(MCode *p, MCode *target)
+static LJ_AINLINE int32_t jmprel(jit_State *J, MCode *p, MCode *target)
 {
   ptrdiff_t delta = target - p;
-  lua_assert(delta == (int32_t)delta);
+  UNUSED(J);
+  lj_assertJ(delta == (int32_t)delta, "jump target out of range");
   return (int32_t)delta;
 }
 
@@ -482,7 +484,7 @@ static LJ_AINLINE int32_t jmprel(MCode *p, MCode *target)
 static void emit_jcc(ASMState *as, int cc, MCode *target)
 {
   MCode *p = as->mcp;
-  *(int32_t *)(p-4) = jmprel(p, target);
+  *(int32_t *)(p-4) = jmprel(as->J, p, target);
   p[-5] = (MCode)(XI_JCCn+(cc&15));
   p[-6] = 0x0f;
   as->mcp = p - 6;
@@ -492,7 +494,7 @@ static void emit_jcc(ASMState *as, int cc, MCode *target)
 static void emit_jmp(ASMState *as, MCode *target)
 {
   MCode *p = as->mcp;
-  *(int32_t *)(p-4) = jmprel(p, target);
+  *(int32_t *)(p-4) = jmprel(as->J, p, target);
   p[-5] = XI_JMP;
   as->mcp = p - 5;
 }
@@ -509,7 +511,7 @@ static void emit_call_(ASMState *as, MCode *target)
     return;
   }
 #endif
-  *(int32_t *)(p-4) = jmprel(p, target);
+  *(int32_t *)(p-4) = jmprel(as->J, p, target);
   p[-5] = XI_CALL;
   as->mcp = p - 5;
 }
diff --git a/src/lj_err.c b/src/lj_err.c
index 8d7134d9..89c51e98 100644
--- a/src/lj_err.c
+++ b/src/lj_err.c
@@ -483,17 +483,10 @@ void lj_err_verify(void)
 #if !LJ_TARGET_OSX
   /* Check disabled on MacOS due to brilliant software engineering at Apple. */
   struct dwarf_eh_bases ehb;
-  /*
-  ** FIXME: The following assertions were replaced with
-  ** the conventional `lua_assert` ones.
-  **
-  ** lj_assertX(_Unwind_Find_FDE((void *)lj_err_throw, &ehb), "broken build: external frame unwinding enabled, but missing -funwind-tables");
-  ** lj_assertX(_Unwind_Find_FDE((void *)_Unwind_RaiseException, &ehb), "broken build: external frame unwinding enabled, but system libraries have no unwind tables");
-  */
-  lua_assert(_Unwind_Find_FDE((void *)lj_err_throw, &ehb));
+  lj_assertX(_Unwind_Find_FDE((void *)lj_err_throw, &ehb), "broken build: external frame unwinding enabled, but missing -funwind-tables");
 #endif
   /* Check disabled, because of broken Fedora/ARM64. See #722.
-  lua_assert(_Unwind_Find_FDE((void *)_Unwind_RaiseException, &ehb));
+  lj_assertX(_Unwind_Find_FDE((void *)_Unwind_RaiseException, &ehb), "broken build: external frame unwinding enabled, but system libraries have no unwind tables");
   */
 }
 #endif
@@ -514,13 +507,7 @@ static int err_unwind_jit(int version, int actions,
     ExitNo exitno;
     uintptr_t addr = _Unwind_GetIP(ctx);  /* Return address _after_ call. */
     uintptr_t stub = lj_trace_unwind(G2J(g), addr - sizeof(MCode), &exitno);
-    /*
-    ** FIXME: The following assert was replaced with
-    ** the conventional `lua_assert`.
-    **
-    ** lj_assertG(tvref(g->jit_base), "unexpected throw across mcode frame");
-    */
-    lua_assert(tvref(g->jit_base));
+    lj_assertG(tvref(g->jit_base), "unexpected throw across mcode frame");
     if (stub) {  /* Jump to side exit to unwind the trace. */
       G2J(g)->exitcode = LJ_UEXCLASS_ERRCODE(uexclass);
 #ifdef LJ_TARGET_MIPS
@@ -603,15 +590,8 @@ uint8_t *lj_err_register_mcode(void *base, size_t sz, uint8_t *info)
 #ifdef LUA_USE_ASSERT
   {
     struct dwarf_eh_bases ehb;
-    /*
-    ** FIXME: The following assert was replaced with
-    ** the conventional `lua_assert`.
-    **
-    ** lj_assertX(_Unwind_Find_FDE(info + sizeof(err_frame_jit_template)+1, &ehb),
-    **      "bad JIT unwind table registration");
-    */
-    lua_assert(_Unwind_Find_FDE(info + sizeof(err_frame_jit_template)+1,
-               &ehb));
+    lj_assertX(_Unwind_Find_FDE(info + sizeof(err_frame_jit_template)+1, &ehb),
+	       "bad JIT unwind table registration");
   }
 #endif
   return info + sizeof(err_frame_jit_template);
@@ -716,13 +696,7 @@ void lj_err_verify(void)
 {
   int got = 0;
   _Unwind_Backtrace((_Unwind_Trace_Fn)err_verify_bt, &got);
-  /*
-  ** FIXME: The following assert was replaced with
-  ** the conventional `lua_assert`.
-  **
-  ** lj_assertX(got == 2, "broken build: external frame unwinding enabled, but missing -funwind-tables");
-  */
-  lua_assert(got == 2);
+  lj_assertX(got == 2, "broken build: external frame unwinding enabled, but missing -funwind-tables");
 }
 #endif
 
@@ -852,7 +826,7 @@ static ptrdiff_t finderrfunc(lua_State *L)
 	return savestack(L, frame_prevd(frame)+1);  /* xpcall's errorfunc. */
       return 0;
     default:
-      lua_assert(0);
+      lj_assertL(0, "bad frame type");
       return 0;
     }
   }
diff --git a/src/lj_func.c b/src/lj_func.c
index 639dad87..2efecb0f 100644
--- a/src/lj_func.c
+++ b/src/lj_func.c
@@ -24,9 +24,11 @@ void LJ_FASTCALL lj_func_freeproto(global_State *g, GCproto *pt)
 
 /* -- Upvalues ------------------------------------------------------------ */
 
-static void unlinkuv(GCupval *uv)
+static void unlinkuv(global_State *g, GCupval *uv)
 {
-  lua_assert(uvprev(uvnext(uv)) == uv && uvnext(uvprev(uv)) == uv);
+  UNUSED(g);
+  lj_assertG(uvprev(uvnext(uv)) == uv && uvnext(uvprev(uv)) == uv,
+	     "broken upvalue chain");
   setgcrefr(uvnext(uv)->prev, uv->prev);
   setgcrefr(uvprev(uv)->next, uv->next);
 }
@@ -40,7 +42,7 @@ static GCupval *func_finduv(lua_State *L, TValue *slot)
   GCupval *uv;
   /* Search the sorted list of open upvalues. */
   while (gcref(*pp) != NULL && uvval((p = gco2uv(gcref(*pp)))) >= slot) {
-    lua_assert(!p->closed && uvval(p) != &p->tv);
+    lj_assertG(!p->closed && uvval(p) != &p->tv, "closed upvalue in chain");
     if (uvval(p) == slot) {  /* Found open upvalue pointing to same slot? */
       if (isdead(g, obj2gco(p)))  /* Resurrect it, if it's dead. */
 	flipwhite(obj2gco(p));
@@ -61,7 +63,8 @@ static GCupval *func_finduv(lua_State *L, TValue *slot)
   setgcrefr(uv->next, g->uvhead.next);
   setgcref(uvnext(uv)->prev, obj2gco(uv));
   setgcref(g->uvhead.next, obj2gco(uv));
-  lua_assert(uvprev(uvnext(uv)) == uv && uvnext(uvprev(uv)) == uv);
+  lj_assertG(uvprev(uvnext(uv)) == uv && uvnext(uvprev(uv)) == uv,
+	     "broken upvalue chain");
   return uv;
 }
 
@@ -84,12 +87,13 @@ void LJ_FASTCALL lj_func_closeuv(lua_State *L, TValue *level)
   while (gcref(L->openupval) != NULL &&
 	 uvval((uv = gco2uv(gcref(L->openupval)))) >= level) {
     GCobj *o = obj2gco(uv);
-    lua_assert(!isblack(o) && !uv->closed && uvval(uv) != &uv->tv);
+    lj_assertG(!isblack(o), "bad black upvalue");
+    lj_assertG(!uv->closed && uvval(uv) != &uv->tv, "closed upvalue in chain");
     setgcrefr(L->openupval, uv->nextgc);  /* No longer in open list. */
     if (isdead(g, o)) {
       lj_func_freeuv(g, uv);
     } else {
-      unlinkuv(uv);
+      unlinkuv(g, uv);
       lj_gc_closeuv(g, uv);
     }
   }
@@ -98,7 +102,7 @@ void LJ_FASTCALL lj_func_closeuv(lua_State *L, TValue *level)
 void LJ_FASTCALL lj_func_freeuv(global_State *g, GCupval *uv)
 {
   if (!uv->closed)
-    unlinkuv(uv);
+    unlinkuv(g, uv);
   lj_mem_freet(g, uv);
 }
 
diff --git a/src/lj_gc.c b/src/lj_gc.c
index c306047a..19d4c963 100644
--- a/src/lj_gc.c
+++ b/src/lj_gc.c
@@ -42,7 +42,8 @@
 
 /* Mark a TValue (if needed). */
 #define gc_marktv(g, tv) \
-  { lua_assert(!tvisgcv(tv) || (~itype(tv) == gcval(tv)->gch.gct)); \
+  { lj_assertG(!tvisgcv(tv) || (~itype(tv) == gcval(tv)->gch.gct), \
+	       "TValue and GC type mismatch"); \
     if (tviswhite(tv)) gc_mark(g, gcV(tv)); }
 
 /* Mark a GCobj (if needed). */
@@ -56,7 +57,8 @@
 static void gc_mark(global_State *g, GCobj *o)
 {
   int gct = o->gch.gct;
-  lua_assert(iswhite(o) && !isdead(g, o));
+  lj_assertG(iswhite(o), "mark of non-white object");
+  lj_assertG(!isdead(g, o), "mark of dead object");
   white2gray(o);
   if (LJ_UNLIKELY(gct == ~LJ_TUDATA)) {
     GCtab *mt = tabref(gco2ud(o)->metatable);
@@ -69,8 +71,9 @@ static void gc_mark(global_State *g, GCobj *o)
     if (uv->closed)
       gray2black(o);  /* Closed upvalues are never gray. */
   } else if (gct != ~LJ_TSTR && gct != ~LJ_TCDATA) {
-    lua_assert(gct == ~LJ_TFUNC || gct == ~LJ_TTAB ||
-	       gct == ~LJ_TTHREAD || gct == ~LJ_TPROTO || gct == ~LJ_TTRACE);
+    lj_assertG(gct == ~LJ_TFUNC || gct == ~LJ_TTAB ||
+	       gct == ~LJ_TTHREAD || gct == ~LJ_TPROTO || gct == ~LJ_TTRACE,
+	       "bad GC type %d", gct);
     setgcrefr(o->gch.gclist, g->gc.gray);
     setgcref(g->gc.gray, o);
   }
@@ -103,7 +106,8 @@ static void gc_mark_uv(global_State *g)
 {
   GCupval *uv;
   for (uv = uvnext(&g->uvhead); uv != &g->uvhead; uv = uvnext(uv)) {
-    lua_assert(uvprev(uvnext(uv)) == uv && uvnext(uvprev(uv)) == uv);
+    lj_assertG(uvprev(uvnext(uv)) == uv && uvnext(uvprev(uv)) == uv,
+	       "broken upvalue chain");
     if (isgray(obj2gco(uv)))
       gc_marktv(g, uvval(uv));
   }
@@ -198,7 +202,7 @@ static int gc_traverse_tab(global_State *g, GCtab *t)
     for (i = 0; i <= hmask; i++) {
       Node *n = &node[i];
       if (!tvisnil(&n->val)) {  /* Mark non-empty slot. */
-	lua_assert(!tvisnil(&n->key));
+	lj_assertG(!tvisnil(&n->key), "mark of nil key in non-empty slot");
 	if (!(weak & LJ_GC_WEAKKEY)) gc_marktv(g, &n->key);
 	if (!(weak & LJ_GC_WEAKVAL)) gc_marktv(g, &n->val);
       }
@@ -213,7 +217,8 @@ static void gc_traverse_func(global_State *g, GCfunc *fn)
   gc_markobj(g, tabref(fn->c.env));
   if (isluafunc(fn)) {
     uint32_t i;
-    lua_assert(fn->l.nupvalues <= funcproto(fn)->sizeuv);
+    lj_assertG(fn->l.nupvalues <= funcproto(fn)->sizeuv,
+	       "function upvalues out of range");
     gc_markobj(g, funcproto(fn));
     for (i = 0; i < fn->l.nupvalues; i++)  /* Mark Lua function upvalues. */
       gc_markobj(g, &gcref(fn->l.uvptr[i])->uv);
@@ -229,7 +234,7 @@ static void gc_traverse_func(global_State *g, GCfunc *fn)
 static void gc_marktrace(global_State *g, TraceNo traceno)
 {
   GCobj *o = obj2gco(traceref(G2J(g), traceno));
-  lua_assert(traceno != G2J(g)->cur.traceno);
+  lj_assertG(traceno != G2J(g)->cur.traceno, "active trace escaped");
   if (iswhite(o)) {
     white2gray(o);
     setgcrefr(o->gch.gclist, g->gc.gray);
@@ -310,7 +315,7 @@ static size_t propagatemark(global_State *g)
 {
   GCobj *o = gcref(g->gc.gray);
   int gct = o->gch.gct;
-  lua_assert(isgray(o));
+  lj_assertG(isgray(o), "propagation of non-gray object");
   gray2black(o);
   setgcrefr(g->gc.gray, o->gch.gclist);  /* Remove from gray list. */
   if (LJ_LIKELY(gct == ~LJ_TTAB)) {
@@ -342,7 +347,7 @@ static size_t propagatemark(global_State *g)
     return ((sizeof(GCtrace)+7)&~7) + (T->nins-T->nk)*sizeof(IRIns) +
 	   T->nsnap*sizeof(SnapShot) + T->nsnapmap*sizeof(SnapEntry);
 #else
-    lua_assert(0);
+    lj_assertG(0, "bad GC type %d", gct);
     return 0;
 #endif
   }
@@ -396,11 +401,13 @@ static GCRef *gc_sweep(global_State *g, GCRef *p, uint32_t lim)
     if (o->gch.gct == ~LJ_TTHREAD)  /* Need to sweep open upvalues, too. */
       gc_fullsweep(g, &gco2th(o)->openupval);
     if (((o->gch.marked ^ LJ_GC_WHITES) & ow)) {  /* Black or current white? */
-      lua_assert(!isdead(g, o) || (o->gch.marked & LJ_GC_FIXED));
+      lj_assertG(!isdead(g, o) || (o->gch.marked & LJ_GC_FIXED),
+		 "sweep of undead object");
       makewhite(g, o);  /* Value is alive, change to the current white. */
       p = &o->gch.nextgc;
     } else {  /* Otherwise value is dead, free it. */
-      lua_assert(isdead(g, o) || ow == LJ_GC_SFIXED);
+      lj_assertG(isdead(g, o) || ow == LJ_GC_SFIXED,
+		 "sweep of unlive object");
       setgcrefr(*p, o->gch.nextgc);
       if (o == gcref(g->gc.root))
 	setgcrefr(g->gc.root, o->gch.nextgc);  /* Adjust list anchor. */
@@ -418,7 +425,8 @@ static GCRef *gc_sweep_str_chain(global_State *g, GCRef *p)
   GCobj *o;
   while ((o = gcref(*p)) != NULL) {
     if (((o->gch.marked ^ LJ_GC_WHITES) & ow)) {  /* Black or current white? */
-      lua_assert(!isdead(g, o) || (o->gch.marked & LJ_GC_FIXED));
+      lj_assertG(!isdead(g, o) || (o->gch.marked & LJ_GC_FIXED),
+		 "sweep of undead string");
       makewhite(g, o);  /* Value is alive, change to the current white. */
 #if LUAJIT_SMART_STRINGS
       if (strsmart(&o->str)) {
@@ -429,7 +437,8 @@ static GCRef *gc_sweep_str_chain(global_State *g, GCRef *p)
 #endif
       p = &o->gch.nextgc;
     } else {  /* Otherwise value is dead, free it. */
-      lua_assert(isdead(g, o) || ow == LJ_GC_SFIXED);
+      lj_assertG(isdead(g, o) || ow == LJ_GC_SFIXED,
+		 "sweep of unlive string");
       setgcrefr(*p, o->gch.nextgc);
       lj_str_free(g, &o->str);
     }
@@ -454,11 +463,12 @@ static int gc_mayclear(cTValue *o, int val)
 }
 
 /* Clear collected entries from weak tables. */
-static void gc_clearweak(GCobj *o)
+static void gc_clearweak(global_State *g, GCobj *o)
 {
+  UNUSED(g);
   while (o) {
     GCtab *t = gco2tab(o);
-    lua_assert((t->marked & LJ_GC_WEAK));
+    lj_assertG((t->marked & LJ_GC_WEAK), "clear of non-weak table");
     if ((t->marked & LJ_GC_WEAKVAL)) {
       MSize i, asize = t->asize;
       for (i = 0; i < asize; i++) {
@@ -515,7 +525,7 @@ static void gc_finalize(lua_State *L)
   global_State *g = G(L);
   GCobj *o = gcnext(gcref(g->gc.mmudata));
   cTValue *mo;
-  lua_assert(tvref(g->jit_base) == NULL);  /* Must not be called on trace. */
+  lj_assertG(tvref(g->jit_base) == NULL, "finalizer called on trace");
   /* Unchain from list of userdata to be finalized. */
   if (o == gcref(g->gc.mmudata))
     setgcrefnull(g->gc.mmudata);
@@ -607,7 +617,7 @@ static void atomic(global_State *g, lua_State *L)
 
   setgcrefr(g->gc.gray, g->gc.weak);  /* Empty the list of weak tables. */
   setgcrefnull(g->gc.weak);
-  lua_assert(!iswhite(obj2gco(mainthread(g))));
+  lj_assertG(!iswhite(obj2gco(mainthread(g))), "main thread turned white");
   gc_markobj(g, L);  /* Mark running thread. */
   gc_traverse_curtrace(g);  /* Traverse current trace. */
   gc_mark_gcroot(g);  /* Mark GC roots (again). */
@@ -622,7 +632,7 @@ static void atomic(global_State *g, lua_State *L)
   udsize += gc_propagate_gray(g);  /* And propagate the marks. */
 
   /* All marking done, clear weak tables. */
-  gc_clearweak(gcref(g->gc.weak));
+  gc_clearweak(g, gcref(g->gc.weak));
 
   lj_buf_shrink(L, &g->tmpbuf);  /* Shrink temp buffer. */
 
@@ -668,14 +678,14 @@ static size_t gc_onestep(lua_State *L)
       g->strbloom.cur[1] = g->strbloom.next[1];
 #endif
     }
-    lua_assert(old >= g->gc.total);
+    lj_assertG(old >= g->gc.total, "sweep increased memory");
     g->gc.estimate -= old - g->gc.total;
     return GCSWEEPCOST;
     }
   case GCSsweep: {
     GCSize old = g->gc.total;
     setmref(g->gc.sweep, gc_sweep(g, mref(g->gc.sweep, GCRef), GCSWEEPMAX));
-    lua_assert(old >= g->gc.total);
+    lj_assertG(old >= g->gc.total, "sweep increased memory");
     g->gc.estimate -= old - g->gc.total;
     if (gcref(*mref(g->gc.sweep, GCRef)) == NULL) {
       if (g->strnum <= (g->strmask >> 2) && g->strmask > LJ_MIN_STRTAB*2-1)
@@ -708,7 +718,7 @@ static size_t gc_onestep(lua_State *L)
     g->gc.debt = 0;
     return 0;
   default:
-    lua_assert(0);
+    lj_assertG(0, "bad GC state");
     return 0;
   }
 }
@@ -782,7 +792,8 @@ void lj_gc_fullgc(lua_State *L)
   }
   while (g->gc.state == GCSsweepstring || g->gc.state == GCSsweep)
     gc_onestep(L);  /* Finish sweep. */
-  lua_assert(g->gc.state == GCSfinalize || g->gc.state == GCSpause);
+  lj_assertG(g->gc.state == GCSfinalize || g->gc.state == GCSpause,
+	     "bad GC state");
   /* Now perform a full GC. */
   g->gc.state = GCSpause;
   do { gc_onestep(L); } while (g->gc.state != GCSpause);
@@ -795,9 +806,11 @@ void lj_gc_fullgc(lua_State *L)
 /* Move the GC propagation frontier forward. */
 void lj_gc_barrierf(global_State *g, GCobj *o, GCobj *v)
 {
-  lua_assert(isblack(o) && iswhite(v) && !isdead(g, v) && !isdead(g, o));
-  lua_assert(g->gc.state != GCSfinalize && g->gc.state != GCSpause);
-  lua_assert(o->gch.gct != ~LJ_TTAB);
+  lj_assertG(isblack(o) && iswhite(v) && !isdead(g, v) && !isdead(g, o),
+	     "bad object states for forward barrier");
+  lj_assertG(g->gc.state != GCSfinalize && g->gc.state != GCSpause,
+	     "bad GC state");
+  lj_assertG(o->gch.gct != ~LJ_TTAB, "barrier object is not a table");
   /* Preserve invariant during propagation. Otherwise it doesn't matter. */
   if (g->gc.state == GCSpropagate || g->gc.state == GCSatomic)
     gc_mark(g, v);  /* Move frontier forward. */
@@ -834,7 +847,8 @@ void lj_gc_closeuv(global_State *g, GCupval *uv)
 	lj_gc_barrierf(g, o, gcV(&uv->tv));
     } else {
       makewhite(g, o);  /* Make it white, i.e. sweep the upvalue. */
-      lua_assert(g->gc.state != GCSfinalize && g->gc.state != GCSpause);
+      lj_assertG(g->gc.state != GCSfinalize && g->gc.state != GCSpause,
+		 "bad GC state");
     }
   }
 }
@@ -854,14 +868,15 @@ void lj_gc_barriertrace(global_State *g, uint32_t traceno)
 void *lj_mem_realloc(lua_State *L, void *p, GCSize osz, GCSize nsz)
 {
   global_State *g = G(L);
-  lua_assert((osz == 0) == (p == NULL));
+  lj_assertG((osz == 0) == (p == NULL), "realloc API violation");
 
   setgcref(g->mem_L, obj2gco(L));
   p = g->allocf(g->allocd, p, osz, nsz);
   if (p == NULL && nsz > 0)
     lj_err_mem(L);
-  lua_assert((nsz == 0) == (p == NULL));
-  lua_assert(checkptrGC(p));
+  lj_assertG((nsz == 0) == (p == NULL), "allocf API violation");
+  lj_assertG(checkptrGC(p),
+	     "allocated memory address %p outside required range", p);
   g->gc.total = (g->gc.total - osz) + nsz;
   g->gc.allocated += nsz;
   g->gc.freed += osz;
@@ -878,7 +893,8 @@ void * LJ_FASTCALL lj_mem_newgco(lua_State *L, GCSize size)
   o = (GCobj *)g->allocf(g->allocd, NULL, 0, size);
   if (o == NULL)
     lj_err_mem(L);
-  lua_assert(checkptrGC(o));
+  lj_assertG(checkptrGC(o),
+	     "allocated memory address %p outside required range", o);
   g->gc.total += size;
   g->gc.allocated += size;
   setgcrefr(o->gch.nextgc, g->gc.root);
diff --git a/src/lj_gc.h b/src/lj_gc.h
index 40b02cb0..bd880652 100644
--- a/src/lj_gc.h
+++ b/src/lj_gc.h
@@ -76,8 +76,10 @@ LJ_FUNC void lj_gc_barriertrace(global_State *g, uint32_t traceno);
 static LJ_AINLINE void lj_gc_barrierback(global_State *g, GCtab *t)
 {
   GCobj *o = obj2gco(t);
-  lua_assert(isblack(o) && !isdead(g, o));
-  lua_assert(g->gc.state != GCSfinalize && g->gc.state != GCSpause);
+  lj_assertG(isblack(o) && !isdead(g, o),
+	     "bad object states for backward barrier");
+  lj_assertG(g->gc.state != GCSfinalize && g->gc.state != GCSpause,
+	     "bad GC state");
   black2gray(o);
   setgcrefr(t->gclist, g->gc.grayagain);
   setgcref(g->gc.grayagain, o);
diff --git a/src/lj_gdbjit.c b/src/lj_gdbjit.c
index c219ffac..9947eacc 100644
--- a/src/lj_gdbjit.c
+++ b/src/lj_gdbjit.c
@@ -724,7 +724,7 @@ static void gdbjit_buildobj(GDBJITctx *ctx)
   SECTALIGN(ctx->p, sizeof(uintptr_t));
   gdbjit_initsect(ctx, GDBJIT_SECT_eh_frame, gdbjit_ehframe);
   ctx->objsize = (size_t)((char *)ctx->p - (char *)obj);
-  lua_assert(ctx->objsize < sizeof(GDBJITobj));
+  lj_assertX(ctx->objsize < sizeof(GDBJITobj), "GDBJITobj overflow");
 }
 
 #undef SECTALIGN
@@ -782,7 +782,8 @@ void lj_gdbjit_addtrace(jit_State *J, GCtrace *T)
   ctx.spadjp = CFRAME_SIZE_JIT +
 	       (MSize)(parent ? traceref(J, parent)->spadjust : 0);
   ctx.spadj = CFRAME_SIZE_JIT + T->spadjust;
-  lua_assert(startpc >= proto_bc(pt) && startpc < proto_bc(pt) + pt->sizebc);
+  lj_assertJ(startpc >= proto_bc(pt) && startpc < proto_bc(pt) + pt->sizebc,
+	     "start PC out of range");
   ctx.lineno = lj_debug_line(pt, proto_bcpos(pt, startpc));
   ctx.filename = proto_chunknamestr(pt);
   if (*ctx.filename == '@' || *ctx.filename == '=')
diff --git a/src/lj_ir.c b/src/lj_ir.c
index 2f7ddb24..9a51186f 100644
--- a/src/lj_ir.c
+++ b/src/lj_ir.c
@@ -38,7 +38,7 @@
 #define fins			(&J->fold.ins)
 
 /* Pass IR on to next optimization in chain (FOLD). */
-#define emitir(ot, a, b)        (lj_ir_set(J, (ot), (a), (b)), lj_opt_fold(J))
+#define emitir(ot, a, b)	(lj_ir_set(J, (ot), (a), (b)), lj_opt_fold(J))
 
 /* -- IR tables ----------------------------------------------------------- */
 
@@ -90,8 +90,9 @@ static void lj_ir_growbot(jit_State *J)
 {
   IRIns *baseir = J->irbuf + J->irbotlim;
   MSize szins = J->irtoplim - J->irbotlim;
-  lua_assert(szins != 0);
-  lua_assert(J->cur.nk == J->irbotlim || J->cur.nk-1 == J->irbotlim);
+  lj_assertJ(szins != 0, "zero IR size");
+  lj_assertJ(J->cur.nk == J->irbotlim || J->cur.nk-1 == J->irbotlim,
+	     "unexpected IR growth");
   if (J->cur.nins + (szins >> 1) < J->irtoplim) {
     /* More than half of the buffer is free on top: shift up by a quarter. */
     MSize ofs = szins >> 2;
@@ -148,9 +149,10 @@ TRef lj_ir_call(jit_State *J, IRCallID id, ...)
 /* Load field of type t from GG_State + offset. Must be 32 bit aligned. */
 LJ_FUNC TRef lj_ir_ggfload(jit_State *J, IRType t, uintptr_t ofs)
 {
-  lua_assert((ofs & 3) == 0);
+  lj_assertJ((ofs & 3) == 0, "unaligned GG_State field offset");
   ofs >>= 2;
-  lua_assert(ofs >= IRFL__MAX && ofs <= 0x3ff);  /* 10 bit FOLD key limit. */
+  lj_assertJ(ofs >= IRFL__MAX && ofs <= 0x3ff,
+	     "GG_State field offset breaks 10 bit FOLD key limit");
   lj_ir_set(J, IRT(IR_FLOAD, t), REF_NIL, ofs);
   return lj_opt_fold(J);
 }
@@ -181,7 +183,7 @@ static LJ_AINLINE IRRef ir_nextk(jit_State *J)
 static LJ_AINLINE IRRef ir_nextk64(jit_State *J)
 {
   IRRef ref = J->cur.nk - 2;
-  lua_assert(J->state != LJ_TRACE_ASM);
+  lj_assertJ(J->state != LJ_TRACE_ASM, "bad JIT state");
   if (LJ_UNLIKELY(ref < J->irbotlim)) lj_ir_growbot(J);
   J->cur.nk = ref;
   return ref;
@@ -277,7 +279,7 @@ TRef lj_ir_kgc(jit_State *J, GCobj *o, IRType t)
 {
   IRIns *ir, *cir = J->cur.ir;
   IRRef ref;
-  lua_assert(!isdead(J2G(J), o));
+  lj_assertJ(!isdead(J2G(J), o), "interning of dead GC object");
   for (ref = J->chain[IR_KGC]; ref; ref = cir[ref].prev)
     if (ir_kgc(&cir[ref]) == o)
       goto found;
@@ -299,7 +301,7 @@ TRef lj_ir_ktrace(jit_State *J)
 {
   IRRef ref = ir_nextkgc(J);
   IRIns *ir = IR(ref);
-  lua_assert(irt_toitype_(IRT_P64) == LJ_TTRACE);
+  lj_assertJ(irt_toitype_(IRT_P64) == LJ_TTRACE, "mismatched type mapping");
   ir->t.irt = IRT_P64;
   ir->o = LJ_GC64 ? IR_KNUM : IR_KNULL;  /* Not IR_KGC yet, but same size. */
   ir->op12 = 0;
@@ -313,7 +315,7 @@ TRef lj_ir_kptr_(jit_State *J, IROp op, void *ptr)
   IRIns *ir, *cir = J->cur.ir;
   IRRef ref;
 #if LJ_64 && !LJ_GC64
-  lua_assert((void *)(uintptr_t)u32ptr(ptr) == ptr);
+  lj_assertJ((void *)(uintptr_t)u32ptr(ptr) == ptr, "out-of-range GC pointer");
 #endif
   for (ref = J->chain[op]; ref; ref = cir[ref].prev)
     if (ir_kptr(&cir[ref]) == ptr)
@@ -360,7 +362,8 @@ TRef lj_ir_kslot(jit_State *J, TRef key, IRRef slot)
   IRRef2 op12 = IRREF2((IRRef1)key, (IRRef1)slot);
   IRRef ref;
   /* Const part is not touched by CSE/DCE, so 0-65535 is ok for IRMlit here. */
-  lua_assert(tref_isk(key) && slot == (IRRef)(IRRef1)slot);
+  lj_assertJ(tref_isk(key) && slot == (IRRef)(IRRef1)slot,
+	     "out-of-range key/slot");
   for (ref = J->chain[IR_KSLOT]; ref; ref = cir[ref].prev)
     if (cir[ref].op12 == op12)
       goto found;
@@ -381,7 +384,7 @@ found:
 void lj_ir_kvalue(lua_State *L, TValue *tv, const IRIns *ir)
 {
   UNUSED(L);
-  lua_assert(ir->o != IR_KSLOT);  /* Common mistake. */
+  lj_assertL(ir->o != IR_KSLOT, "unexpected KSLOT");  /* Common mistake. */
   switch (ir->o) {
   case IR_KPRI: setpriV(tv, irt_toitype(ir->t)); break;
   case IR_KINT: setintV(tv, ir->i); break;
@@ -399,7 +402,7 @@ void lj_ir_kvalue(lua_State *L, TValue *tv, const IRIns *ir)
     break;
     }
 #endif
-  default: lua_assert(0); break;
+  default: lj_assertL(0, "bad IR constant op %d", ir->o); break;
   }
 }
 
@@ -459,7 +462,7 @@ int lj_ir_numcmp(lua_Number a, lua_Number b, IROp op)
   case IR_UGE: return !(a < b);
   case IR_ULE: return !(a > b);
   case IR_UGT: return !(a <= b);
-  default: lua_assert(0); return 0;
+  default: lj_assertX(0, "bad IR op %d", op); return 0;
   }
 }
 
@@ -472,7 +475,7 @@ int lj_ir_strcmp(GCstr *a, GCstr *b, IROp op)
   case IR_GE: return (res >= 0);
   case IR_LE: return (res <= 0);
   case IR_GT: return (res > 0);
-  default: lua_assert(0); return 0;
+  default: lj_assertX(0, "bad IR op %d", op); return 0;
   }
 }
 
diff --git a/src/lj_ir.h b/src/lj_ir.h
index 43e55069..46af54e4 100644
--- a/src/lj_ir.h
+++ b/src/lj_ir.h
@@ -412,11 +412,12 @@ static LJ_AINLINE IRType itype2irt(const TValue *tv)
 
 static LJ_AINLINE uint32_t irt_toitype_(IRType t)
 {
-  lua_assert(!LJ_64 || LJ_GC64 || t != IRT_LIGHTUD);
+  lj_assertX(!LJ_64 || LJ_GC64 || t != IRT_LIGHTUD,
+	     "no plain type tag for lightuserdata");
   if (LJ_DUALNUM && t > IRT_NUM) {
     return LJ_TISNUM;
   } else {
-    lua_assert(t <= IRT_NUM);
+    lj_assertX(t <= IRT_NUM, "no plain type tag for IR type %d", t);
     return ~(uint32_t)t;
   }
 }
diff --git a/src/lj_jit.h b/src/lj_jit.h
index a8b6f9a7..361570a0 100644
--- a/src/lj_jit.h
+++ b/src/lj_jit.h
@@ -507,6 +507,12 @@ LJ_ALIGN(16)		/* For DISPATCH-relative addresses in assembler part. */
 #endif
 jit_State;
 
+#ifdef LUA_USE_ASSERT
+#define lj_assertJ(c, ...)	lj_assertG_(J2G(J), (c), __VA_ARGS__)
+#else
+#define lj_assertJ(c, ...)	((void)J)
+#endif
+
 /* Trivial PRNG e.g. used for penalty randomization. */
 static LJ_AINLINE uint32_t LJ_PRNG_BITS(jit_State *J, int bits)
 {
diff --git a/src/lj_lex.c b/src/lj_lex.c
index c66660d7..cef3c683 100644
--- a/src/lj_lex.c
+++ b/src/lj_lex.c
@@ -76,7 +76,7 @@ static LJ_AINLINE LexChar lex_savenext(LexState *ls)
 static void lex_newline(LexState *ls)
 {
   LexChar old = ls->c;
-  lua_assert(lex_iseol(ls));
+  lj_assertLS(lex_iseol(ls), "bad usage");
   lex_next(ls);  /* Skip "\n" or "\r". */
   if (lex_iseol(ls) && ls->c != old) lex_next(ls);  /* Skip "\n\r" or "\r\n". */
   if (++ls->linenumber >= LJ_MAX_LINE)
@@ -90,7 +90,7 @@ static void lex_number(LexState *ls, TValue *tv)
 {
   StrScanFmt fmt;
   LexChar c, xp = 'e';
-  lua_assert(lj_char_isdigit(ls->c));
+  lj_assertLS(lj_char_isdigit(ls->c), "bad usage");
   if ((c = ls->c) == '0' && (lex_savenext(ls) | 0x20) == 'x')
     xp = 'p';
   while (lj_char_isident(ls->c) || ls->c == '.' ||
@@ -110,7 +110,8 @@ static void lex_number(LexState *ls, TValue *tv)
   } else if (fmt != STRSCAN_ERROR) {
     lua_State *L = ls->L;
     GCcdata *cd;
-    lua_assert(fmt == STRSCAN_I64 || fmt == STRSCAN_U64 || fmt == STRSCAN_IMAG);
+    lj_assertLS(fmt == STRSCAN_I64 || fmt == STRSCAN_U64 || fmt == STRSCAN_IMAG,
+		"unexpected number format %d", fmt);
     if (!ctype_ctsG(G(L))) {
       ptrdiff_t oldtop = savestack(L, L->top);
       luaopen_ffi(L);  /* Load FFI library on-demand. */
@@ -127,7 +128,8 @@ static void lex_number(LexState *ls, TValue *tv)
     lj_parse_keepcdata(ls, tv, cd);
 #endif
   } else {
-    lua_assert(fmt == STRSCAN_ERROR);
+    lj_assertLS(fmt == STRSCAN_ERROR,
+		"unexpected number format %d", fmt);
     lj_lex_error(ls, TK_number, LJ_ERR_XNUMBER);
   }
 }
@@ -137,7 +139,7 @@ static int lex_skipeq(LexState *ls)
 {
   int count = 0;
   LexChar s = ls->c;
-  lua_assert(s == '[' || s == ']');
+  lj_assertLS(s == '[' || s == ']', "bad usage");
   while (lex_savenext(ls) == '=' && count < 0x20000000)
     count++;
   return (ls->c == s) ? count : (-count) - 1;
@@ -462,7 +464,7 @@ void lj_lex_next(LexState *ls)
 /* Look ahead for the next token. */
 LexToken lj_lex_lookahead(LexState *ls)
 {
-  lua_assert(ls->lookahead == TK_eof);
+  lj_assertLS(ls->lookahead == TK_eof, "double lookahead");
   ls->lookahead = lex_scan(ls, &ls->lookaheadval);
   return ls->lookahead;
 }
diff --git a/src/lj_lex.h b/src/lj_lex.h
index 33fa8657..ae05a954 100644
--- a/src/lj_lex.h
+++ b/src/lj_lex.h
@@ -83,4 +83,10 @@ LJ_FUNC const char *lj_lex_token2str(LexState *ls, LexToken tok);
 LJ_FUNC_NORET void lj_lex_error(LexState *ls, LexToken tok, ErrMsg em, ...);
 LJ_FUNC void lj_lex_init(lua_State *L);
 
+#ifdef LUA_USE_ASSERT
+#define lj_assertLS(c, ...)	(lj_assertG_(G(ls->L), (c), __VA_ARGS__))
+#else
+#define lj_assertLS(c, ...)	((void)ls)
+#endif
+
 #endif
diff --git a/src/lj_load.c b/src/lj_load.c
index 9a31d9a1..19ac6ba2 100644
--- a/src/lj_load.c
+++ b/src/lj_load.c
@@ -159,7 +159,7 @@ LUALIB_API int luaL_loadstring(lua_State *L, const char *s)
 LUA_API int lua_dump(lua_State *L, lua_Writer writer, void *data)
 {
   cTValue *o = L->top-1;
-  api_check(L, L->top > L->base);
+  lj_checkapi(L->top > L->base, "top slot empty");
   if (tvisfunc(o) && isluafunc(funcV(o)))
     return lj_bcwrite(L, funcproto(funcV(o)), writer, data, 0);
   else
diff --git a/src/lj_mapi.c b/src/lj_mapi.c
index 9d97c747..679ca943 100644
--- a/src/lj_mapi.c
+++ b/src/lj_mapi.c
@@ -28,7 +28,7 @@ LUAMISC_API void luaM_metrics(lua_State *L, struct luam_Metrics *metrics)
   jit_State *J = G2J(g);
 #endif
 
-  lua_assert(metrics != NULL);
+  lj_assertL(metrics != NULL, "uninitialized metrics struct");
 
   metrics->strhash_hit = g->strhash_hit;
   metrics->strhash_miss = g->strhash_miss;
diff --git a/src/lj_mcode.c b/src/lj_mcode.c
index 10db4457..808a9897 100644
--- a/src/lj_mcode.c
+++ b/src/lj_mcode.c
@@ -354,7 +354,7 @@ MCode *lj_mcode_patch(jit_State *J, MCode *ptr, int finish)
     /* Otherwise search through the list of MCode areas. */
     for (;;) {
       mc = ((MCLink *)mc)->next;
-      lua_assert(mc != NULL);
+      lj_assertJ(mc != NULL, "broken MCode area chain");
       if (ptr >= mc && ptr < (MCode *)((char *)mc + ((MCLink *)mc)->size)) {
 	if (LJ_UNLIKELY(mcode_setprot(mc, ((MCLink *)mc)->size, MCPROT_GEN)))
 	  mcode_protfail(J);
diff --git a/src/lj_memprof.c b/src/lj_memprof.c
index c600c4f0..a492cf58 100644
--- a/src/lj_memprof.c
+++ b/src/lj_memprof.c
@@ -144,7 +144,7 @@ static void memprof_write_func(struct memprof *mp, uint8_t aevent)
   else if (iscfunc(fn))
     memprof_write_cfunc(out, aevent, fn, L, &mp->lib_adds);
   else
-    lua_assert(0);
+    lj_assertL(0, "unknown function type to write by memprof");
 }
 
 #if LJ_HASJIT
@@ -164,7 +164,7 @@ static void memprof_write_trace(struct memprof *mp, uint8_t aevent)
 {
   UNUSED(mp);
   UNUSED(aevent);
-  lua_assert(0);
+  lj_assertX(0, "write trace memprof event without JIT");
 }
 
 #endif
@@ -215,10 +215,12 @@ static void *memprof_allocf(void *ud, void *ptr, size_t osize, size_t nsize)
   struct lj_wbuf *out = &mp->out;
   void *nptr;
 
-  lua_assert(MPS_PROFILE == mp->state);
-  lua_assert(oalloc->allocf != memprof_allocf);
-  lua_assert(oalloc->allocf != NULL);
-  lua_assert(ud == oalloc->state);
+  lj_assertX(MPS_PROFILE == mp->state, "bad memprof profile state");
+  lj_assertX(oalloc->allocf != memprof_allocf,
+	     "unexpected memprof old alloc function");
+  lj_assertX(oalloc->allocf != NULL,
+	     "uninitialized memprof old alloc function");
+  lj_assertX(ud == oalloc->state, "bad old memprof profile state");
 
   nptr = oalloc->allocf(ud, ptr, osize, nsize);
 
@@ -252,10 +254,10 @@ int lj_memprof_start(struct lua_State *L, const struct lj_memprof_options *opt)
   struct alloc *oalloc = &mp->orig_alloc;
   const size_t ljm_header_len = sizeof(ljm_header) / sizeof(ljm_header[0]);
 
-  lua_assert(opt->writer != NULL);
-  lua_assert(opt->on_stop != NULL);
-  lua_assert(opt->buf != NULL);
-  lua_assert(opt->len != 0);
+  lj_assertL(opt->writer != NULL, "uninitialized memprof writer");
+  lj_assertL(opt->on_stop != NULL, "uninitialized on stop memprof callback");
+  lj_assertL(opt->buf != NULL, "uninitialized memprof writer buffer");
+  lj_assertL(opt->len != 0, "bad memprof writer buffer length");
 
   if (mp->state != MPS_IDLE) {
     /* Clean up resourses. Ignore possible errors. */
@@ -293,8 +295,9 @@ int lj_memprof_start(struct lua_State *L, const struct lj_memprof_options *opt)
 
   /* Override allocating function. */
   oalloc->allocf = lua_getallocf(L, &oalloc->state);
-  lua_assert(oalloc->allocf != NULL);
-  lua_assert(oalloc->allocf != memprof_allocf);
+  lj_assertL(oalloc->allocf != NULL, "uninitialized memprof old alloc function");
+  lj_assertL(oalloc->allocf != memprof_allocf,
+	     "unexpected memprof old alloc function");
   lua_setallocf(L, memprof_allocf, oalloc->state);
 
   return PROFILE_SUCCESS;
@@ -323,10 +326,12 @@ int lj_memprof_stop(struct lua_State *L)
 
   mp->state = MPS_IDLE;
 
-  lua_assert(mp->g != NULL);
+  lj_assertL(mp->g != NULL, "uninitialized global state in memprof state");
 
-  lua_assert(memprof_allocf == lua_getallocf(L, NULL));
-  lua_assert(oalloc->allocf != NULL);
+  lj_assertL(memprof_allocf == lua_getallocf(L, NULL),
+	     "bad current allocator function on memprof stop");
+  lj_assertL(oalloc->allocf != NULL,
+	     "uninitialized old alloc function on memprof stop");
   lua_setallocf(L, oalloc->allocf, oalloc->state);
 
   if (LJ_UNLIKELY(lj_wbuf_test_flag(out, STREAM_STOP))) {
diff --git a/src/lj_meta.c b/src/lj_meta.c
index 7ef7a8e0..4cb1a261 100644
--- a/src/lj_meta.c
+++ b/src/lj_meta.c
@@ -47,7 +47,7 @@ void lj_meta_init(lua_State *L)
 cTValue *lj_meta_cache(GCtab *mt, MMS mm, GCstr *name)
 {
   cTValue *mo = lj_tab_getstr(mt, name);
-  lua_assert(mm <= MM_FAST);
+  lj_assertX(mm <= MM_FAST, "bad metamethod %d", mm);
   if (!mo || tvisnil(mo)) {  /* No metamethod? */
     mt->nomm |= (uint8_t)(1u<<mm);  /* Set negative cache flag. */
     return NULL;
@@ -363,7 +363,7 @@ TValue * LJ_FASTCALL lj_meta_equal_cd(lua_State *L, BCIns ins)
   } else if (op == BC_ISEQN) {
     o2 = &mref(curr_proto(L)->k, cTValue)[bc_d(ins)];
   } else {
-    lua_assert(op == BC_ISEQP);
+    lj_assertL(op == BC_ISEQP, "bad bytecode op %d", op);
     setpriV(&tv, ~bc_d(ins));
     o2 = &tv;
   }
@@ -426,7 +426,7 @@ void lj_meta_istype(lua_State *L, BCReg ra, BCReg tp)
 {
   L->top = curr_topL(L);
   ra++; tp--;
-  lua_assert(LJ_DUALNUM || tp != ~LJ_TNUMX);  /* ISTYPE -> ISNUM broken. */
+  lj_assertL(LJ_DUALNUM || tp != ~LJ_TNUMX, "bad type for ISTYPE");
   if (LJ_DUALNUM && tp == ~LJ_TNUMX) lj_lib_checkint(L, ra);
   else if (tp == ~LJ_TNUMX+1) lj_lib_checknum(L, ra);
   else if (tp == ~LJ_TSTR) lj_lib_checkstr(L, ra);
diff --git a/src/lj_obj.h b/src/lj_obj.h
index bf95e1eb..fb21cba9 100644
--- a/src/lj_obj.h
+++ b/src/lj_obj.h
@@ -735,6 +735,11 @@ struct lua_State {
 #define curr_topL(L)		(L->base + curr_proto(L)->framesize)
 #define curr_top(L)		(curr_funcisL(L) ? curr_topL(L) : L->top)
 
+#if defined(LUA_USE_ASSERT) || defined(LUA_USE_APICHECK)
+LJ_FUNC_NORET void lj_assert_fail(global_State *g, const char *file, int line,
+				  const char *func, const char *fmt, ...);
+#endif
+
 /* -- GC object definition and conversions -------------------------------- */
 
 /* GC header for generic access to common fields of GC objects. */
@@ -788,10 +793,6 @@ typedef union GCobj {
 
 /* -- TValue getters/setters ---------------------------------------------- */
 
-#ifdef LUA_USE_ASSERT
-#include "lj_gc.h"
-#endif
-
 /* Macros to test types. */
 #if LJ_GC64
 #define itype(o)	((uint32_t)((o)->it64 >> 47))
@@ -863,8 +864,8 @@ static LJ_AINLINE void *lightudV(global_State *g, cTValue *o)
   uint64_t u = o->u64;
   uint64_t seg = lightudseg(u);
   uint32_t *segmap = mref(g->gc.lightudseg, uint32_t);
-  lua_assert(tvislightud(o));
-  lua_assert(seg <= g->gc.lightudnum);
+  lj_assertG(tvislightud(o), "lightuserdata expected");
+  lj_assertG(seg <= g->gc.lightudnum, "bad lightuserdata segment %d", seg);
   return (void *)(((uint64_t)segmap[seg] << 32) | lightudlo(u));
 }
 #else
@@ -915,9 +916,19 @@ static LJ_AINLINE void setrawlightudV(TValue *o, void *p)
   ((o)->u64 = (uint64_t)(void *)(f) - (uint64_t)lj_vm_asm_begin)
 #endif
 
-#define tvchecklive(L, o) \
-  UNUSED(L), lua_assert(!tvisgcv(o) || \
-  ((~itype(o) == gcval(o)->gch.gct) && !isdead(G(L), gcval(o))))
+static LJ_AINLINE void checklivetv(lua_State *L, TValue *o, const char *msg)
+{
+  UNUSED(L); UNUSED(o); UNUSED(msg);
+#if LUA_USE_ASSERT
+  if (tvisgcv(o)) {
+    lj_assertL(~itype(o) == gcval(o)->gch.gct,
+	       "mismatch of TValue type %d vs GC type %d",
+	       ~itype(o), gcval(o)->gch.gct);
+    /* Copy of isdead check from lj_gc.h to avoid circular include. */
+    lj_assertL(!(gcval(o)->gch.marked & (G(L)->gc.currentwhite ^ 3) & 3), msg);
+  }
+#endif
+}
 
 static LJ_AINLINE void setgcVraw(TValue *o, GCobj *v, uint32_t itype)
 {
@@ -930,7 +941,8 @@ static LJ_AINLINE void setgcVraw(TValue *o, GCobj *v, uint32_t itype)
 
 static LJ_AINLINE void setgcV(lua_State *L, TValue *o, GCobj *v, uint32_t it)
 {
-  setgcVraw(o, v, it); tvchecklive(L, o);
+  setgcVraw(o, v, it);
+  checklivetv(L, o, "store to dead GC object");
 }
 
 #define define_setV(name, type, tag) \
@@ -977,7 +989,8 @@ static LJ_AINLINE void setint64V(TValue *o, int64_t i)
 /* Copy tagged values. */
 static LJ_AINLINE void copyTV(lua_State *L, TValue *o1, const TValue *o2)
 {
-  *o1 = *o2; tvchecklive(L, o1);
+  *o1 = *o2;
+  checklivetv(L, o1, "copy of dead GC object");
 }
 
 /* -- Number to integer conversion ---------------------------------------- */
diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
index cd803d87..0007107b 100644
--- a/src/lj_opt_fold.c
+++ b/src/lj_opt_fold.c
@@ -282,7 +282,7 @@ static int32_t kfold_intop(int32_t k1, int32_t k2, IROp op)
   case IR_BROR: k1 = (int32_t)lj_ror((uint32_t)k1, (k2 & 31)); break;
   case IR_MIN: k1 = k1 < k2 ? k1 : k2; break;
   case IR_MAX: k1 = k1 > k2 ? k1 : k2; break;
-  default: lua_assert(0); break;
+  default: lj_assertX(0, "bad IR op %d", op); break;
   }
   return k1;
 }
@@ -354,7 +354,7 @@ LJFOLDF(kfold_intcomp)
   case IR_ULE: return CONDFOLD((uint32_t)a <= (uint32_t)b);
   case IR_ABC:
   case IR_UGT: return CONDFOLD((uint32_t)a > (uint32_t)b);
-  default: lua_assert(0); return FAILFOLD;
+  default: lj_assertJ(0, "bad IR op %d", fins->o); return FAILFOLD;
   }
 }
 
@@ -368,10 +368,12 @@ LJFOLDF(kfold_intcomp0)
 
 /* -- Constant folding for 64 bit integers -------------------------------- */
 
-static uint64_t kfold_int64arith(uint64_t k1, uint64_t k2, IROp op)
+static uint64_t kfold_int64arith(jit_State *J, uint64_t k1, uint64_t k2,
+				 IROp op)
 {
-  switch (op) {
+  UNUSED(J);
 #if LJ_HASFFI
+  switch (op) {
   case IR_ADD: k1 += k2; break;
   case IR_SUB: k1 -= k2; break;
   case IR_MUL: k1 *= k2; break;
@@ -383,9 +385,12 @@ static uint64_t kfold_int64arith(uint64_t k1, uint64_t k2, IROp op)
   case IR_BSAR: k1 >>= (k2 & 63); break;
   case IR_BROL: k1 = (int32_t)lj_rol((uint32_t)k1, (k2 & 63)); break;
   case IR_BROR: k1 = (int32_t)lj_ror((uint32_t)k1, (k2 & 63)); break;
-#endif
-  default: UNUSED(k2); lua_assert(0); break;
+  default: lj_assertJ(0, "bad IR op %d", op); break;
   }
+#else
+  UNUSED(k2); UNUSED(op);
+  lj_assertJ(0, "FFI IR op without FFI");
+#endif
   return k1;
 }
 
@@ -397,7 +402,7 @@ LJFOLD(BOR KINT64 KINT64)
 LJFOLD(BXOR KINT64 KINT64)
 LJFOLDF(kfold_int64arith)
 {
-  return INT64FOLD(kfold_int64arith(ir_k64(fleft)->u64,
+  return INT64FOLD(kfold_int64arith(J, ir_k64(fleft)->u64,
 				    ir_k64(fright)->u64, (IROp)fins->o));
 }
 
@@ -419,7 +424,7 @@ LJFOLDF(kfold_int64arith2)
   }
   return INT64FOLD(k1);
 #else
-  UNUSED(J); lua_assert(0); return FAILFOLD;
+  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
 #endif
 }
 
@@ -435,7 +440,7 @@ LJFOLDF(kfold_int64shift)
   int32_t sh = (fright->i & 63);
   return INT64FOLD(lj_carith_shift64(k, sh, fins->o - IR_BSHL));
 #else
-  UNUSED(J); lua_assert(0); return FAILFOLD;
+  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
 #endif
 }
 
@@ -445,7 +450,7 @@ LJFOLDF(kfold_bnot64)
 #if LJ_HASFFI
   return INT64FOLD(~ir_k64(fleft)->u64);
 #else
-  UNUSED(J); lua_assert(0); return FAILFOLD;
+  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
 #endif
 }
 
@@ -455,7 +460,7 @@ LJFOLDF(kfold_bswap64)
 #if LJ_HASFFI
   return INT64FOLD(lj_bswap64(ir_k64(fleft)->u64));
 #else
-  UNUSED(J); lua_assert(0); return FAILFOLD;
+  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
 #endif
 }
 
@@ -480,10 +485,10 @@ LJFOLDF(kfold_int64comp)
   case IR_UGE: return CONDFOLD(a >= b);
   case IR_ULE: return CONDFOLD(a <= b);
   case IR_UGT: return CONDFOLD(a > b);
-  default: lua_assert(0); return FAILFOLD;
+  default: lj_assertJ(0, "bad IR op %d", fins->o); return FAILFOLD;
   }
 #else
-  UNUSED(J); lua_assert(0); return FAILFOLD;
+  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
 #endif
 }
 
@@ -495,7 +500,7 @@ LJFOLDF(kfold_int64comp0)
     return DROPFOLD;
   return NEXTFOLD;
 #else
-  UNUSED(J); lua_assert(0); return FAILFOLD;
+  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
 #endif
 }
 
@@ -520,7 +525,7 @@ LJFOLD(STRREF KGC KINT)
 LJFOLDF(kfold_strref)
 {
   GCstr *str = ir_kstr(fleft);
-  lua_assert((MSize)fright->i <= str->len);
+  lj_assertJ((MSize)fright->i <= str->len, "bad string ref");
   return lj_ir_kkptr(J, (char *)strdata(str) + fright->i);
 }
 
@@ -616,8 +621,9 @@ LJFOLDF(bufput_kgc)
 LJFOLD(BUFSTR any any)
 LJFOLDF(bufstr_kfold_cse)
 {
-  lua_assert(fleft->o == IR_BUFHDR || fleft->o == IR_BUFPUT ||
-	     fleft->o == IR_CALLL);
+  lj_assertJ(fleft->o == IR_BUFHDR || fleft->o == IR_BUFPUT ||
+	     fleft->o == IR_CALLL,
+	     "bad buffer constructor IR op %d", fleft->o);
   if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD)) {
     if (fleft->o == IR_BUFHDR) {  /* No put operations? */
       if (!(fleft->op2 & IRBUFHDR_APPEND))  /* Empty buffer? */
@@ -637,8 +643,9 @@ LJFOLDF(bufstr_kfold_cse)
     while (ref) {
       IRIns *irs = IR(ref), *ira = fleft, *irb = IR(irs->op1);
       while (ira->o == irb->o && ira->op2 == irb->op2) {
-	lua_assert(ira->o == IR_BUFHDR || ira->o == IR_BUFPUT ||
-		   ira->o == IR_CALLL || ira->o == IR_CARG);
+	lj_assertJ(ira->o == IR_BUFHDR || ira->o == IR_BUFPUT ||
+		   ira->o == IR_CALLL || ira->o == IR_CARG,
+		   "bad buffer constructor IR op %d", ira->o);
 	if (ira->o == IR_BUFHDR && !(ira->op2 & IRBUFHDR_APPEND))
 	  return ref;  /* CSE succeeded. */
 	if (ira->o == IR_CALLL && ira->op2 == IRCALL_lj_buf_puttab)
@@ -697,7 +704,7 @@ LJFOLD(CALLL CARG IRCALL_lj_strfmt_putfchar)
 LJFOLDF(bufput_kfold_fmt)
 {
   IRIns *irc = IR(fleft->op1);
-  lua_assert(irref_isk(irc->op2));  /* SFormat must be const. */
+  lj_assertJ(irref_isk(irc->op2), "SFormat must be const");
   if (irref_isk(fleft->op2)) {
     SFormat sf = (SFormat)IR(irc->op2)->i;
     IRIns *ira = IR(fleft->op2);
@@ -1216,10 +1223,10 @@ LJFOLDF(simplify_tobit_conv)
 {
   /* Fold even across PHI to avoid expensive num->int conversions in loop. */
   if ((fleft->op2 & IRCONV_SRCMASK) == IRT_INT) {
-    lua_assert(irt_isnum(fleft->t));
+    lj_assertJ(irt_isnum(fleft->t), "expected TOBIT number arg");
     return fleft->op1;
   } else if ((fleft->op2 & IRCONV_SRCMASK) == IRT_U32) {
-    lua_assert(irt_isnum(fleft->t));
+    lj_assertJ(irt_isnum(fleft->t), "expected TOBIT number arg");
     fins->o = IR_CONV;
     fins->op1 = fleft->op1;
     fins->op2 = (IRT_INT<<5)|IRT_U32;
@@ -1259,7 +1266,7 @@ LJFOLDF(simplify_conv_sext)
   /* Use scalar evolution analysis results to strength-reduce sign-extension. */
   if (ref == J->scev.idx) {
     IRRef lo = J->scev.dir ? J->scev.start : J->scev.stop;
-    lua_assert(irt_isint(J->scev.t));
+    lj_assertJ(irt_isint(J->scev.t), "only int SCEV supported");
     if (lo && IR(lo)->o == IR_KINT && IR(lo)->i + ofs >= 0) {
     ok_reduce:
 #if LJ_TARGET_X64
@@ -1335,7 +1342,8 @@ LJFOLDF(narrow_convert)
   /* Narrowing ignores PHIs and repeating it inside the loop is not useful. */
   if (J->chain[IR_LOOP])
     return NEXTFOLD;
-  lua_assert(fins->o != IR_CONV || (fins->op2&IRCONV_CONVMASK) != IRCONV_TOBIT);
+  lj_assertJ(fins->o != IR_CONV || (fins->op2&IRCONV_CONVMASK) != IRCONV_TOBIT,
+	     "unexpected CONV TOBIT");
   return lj_opt_narrow_convert(J);
 }
 
@@ -1441,7 +1449,7 @@ LJFOLDF(simplify_intmul_k64)
     return simplify_intmul_k(J, (int32_t)ir_kint64(fright)->u64);
   return NEXTFOLD;
 #else
-  UNUSED(J); lua_assert(0); return FAILFOLD;
+  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
 #endif
 }
 
@@ -1449,7 +1457,7 @@ LJFOLD(MOD any KINT)
 LJFOLDF(simplify_intmod_k)
 {
   int32_t k = fright->i;
-  lua_assert(k != 0);
+  lj_assertJ(k != 0, "integer mod 0");
   if (k > 0 && (k & (k-1)) == 0) {  /* i % (2^k) ==> i & (2^k-1) */
     fins->o = IR_BAND;
     fins->op2 = lj_ir_kint(J, k-1);
@@ -1699,7 +1707,8 @@ LJFOLDF(simplify_shiftk_andk)
     fins->ot = IRTI(IR_BAND);
     return RETRYFOLD;
   } else if (irk->o == IR_KINT64) {
-    uint64_t k = kfold_int64arith(ir_k64(irk)->u64, fright->i, (IROp)fins->o);
+    uint64_t k = kfold_int64arith(J, ir_k64(irk)->u64, fright->i,
+				  (IROp)fins->o);
     IROpT ot = fleft->ot;
     fins->op1 = fleft->op1;
     fins->op1 = (IRRef1)lj_opt_fold(J);
@@ -1747,8 +1756,8 @@ LJFOLDF(simplify_andor_k64)
   IRIns *irk = IR(fleft->op2);
   PHIBARRIER(fleft);
   if (irk->o == IR_KINT64) {
-    uint64_t k = kfold_int64arith(ir_k64(irk)->u64,
-				  ir_k64(fright)->u64, (IROp)fins->o);
+    uint64_t k = kfold_int64arith(J, ir_k64(irk)->u64, ir_k64(fright)->u64,
+				  (IROp)fins->o);
     /* (i | k1) & k2 ==> i & k2, if (k1 & k2) == 0. */
     /* (i & k1) | k2 ==> i | k2, if (k1 | k2) == -1. */
     if (k == (fins->o == IR_BAND ? (uint64_t)0 : ~(uint64_t)0)) {
@@ -1758,7 +1767,7 @@ LJFOLDF(simplify_andor_k64)
   }
   return NEXTFOLD;
 #else
-  UNUSED(J); lua_assert(0); return FAILFOLD;
+  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
 #endif
 }
 
@@ -1794,8 +1803,8 @@ LJFOLDF(reassoc_intarith_k64)
 #if LJ_HASFFI
   IRIns *irk = IR(fleft->op2);
   if (irk->o == IR_KINT64) {
-    uint64_t k = kfold_int64arith(ir_k64(irk)->u64,
-				  ir_k64(fright)->u64, (IROp)fins->o);
+    uint64_t k = kfold_int64arith(J, ir_k64(irk)->u64, ir_k64(fright)->u64,
+				  (IROp)fins->o);
     PHIBARRIER(fleft);
     fins->op1 = fleft->op1;
     fins->op2 = (IRRef1)lj_ir_kint64(J, k);
@@ -1803,7 +1812,7 @@ LJFOLDF(reassoc_intarith_k64)
   }
   return NEXTFOLD;
 #else
-  UNUSED(J); lua_assert(0); return FAILFOLD;
+  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
 #endif
 }
 
@@ -2058,7 +2067,7 @@ LJFOLDF(merge_eqne_snew_kgc)
 {
   GCstr *kstr = ir_kstr(fright);
   int32_t len = (int32_t)kstr->len;
-  lua_assert(irt_isstr(fins->t));
+  lj_assertJ(irt_isstr(fins->t), "bad equality IR type");
 
 #if LJ_TARGET_UNALIGNED
 #define FOLD_SNEW_MAX_LEN	4  /* Handle string lengths 0, 1, 2, 3, 4. */
@@ -2122,7 +2131,7 @@ LJFOLD(HLOAD KKPTR)
 LJFOLDF(kfold_hload_kkptr)
 {
   UNUSED(J);
-  lua_assert(ir_kptr(fleft) == niltvg(J2G(J)));
+  lj_assertJ(ir_kptr(fleft) == niltvg(J2G(J)), "expected niltv");
   return TREF_NIL;
 }
 
@@ -2333,7 +2342,7 @@ LJFOLDF(fwd_sload)
     TRef tr = lj_opt_cse(J);
     return tref_ref(tr) < J->chain[IR_RETF] ? EMITFOLD : tr;
   } else {
-    lua_assert(J->slot[fins->op1] != 0);
+    lj_assertJ(J->slot[fins->op1] != 0, "uninitialized slot accessed");
     return J->slot[fins->op1];
   }
 }
@@ -2448,8 +2457,9 @@ TRef LJ_FASTCALL lj_opt_fold(jit_State *J)
   IRRef ref;
 
   if (LJ_UNLIKELY((J->flags & JIT_F_OPT_MASK) != JIT_F_OPT_DEFAULT)) {
-    lua_assert(((JIT_F_OPT_FOLD|JIT_F_OPT_FWD|JIT_F_OPT_CSE|JIT_F_OPT_DSE) |
-		JIT_F_OPT_DEFAULT) == JIT_F_OPT_DEFAULT);
+    lj_assertJ(((JIT_F_OPT_FOLD|JIT_F_OPT_FWD|JIT_F_OPT_CSE|JIT_F_OPT_DSE) |
+		JIT_F_OPT_DEFAULT) == JIT_F_OPT_DEFAULT,
+	       "bad JIT_F_OPT_DEFAULT");
     /* Folding disabled? Chain to CSE, but not for loads/stores/allocs. */
     if (!(J->flags & JIT_F_OPT_FOLD) && irm_kind(lj_ir_mode[fins->o]) == IRM_N)
       return lj_opt_cse(J);
@@ -2511,7 +2521,7 @@ retry:
     return lj_ir_kint(J, fins->i);
   if (ref == FAILFOLD)
     lj_trace_err(J, LJ_TRERR_GFAIL);
-  lua_assert(ref == DROPFOLD);
+  lj_assertJ(ref == DROPFOLD, "bad fold result");
   return REF_DROP;
 }
 
diff --git a/src/lj_opt_loop.c b/src/lj_opt_loop.c
index 10613641..d3b0fcee 100644
--- a/src/lj_opt_loop.c
+++ b/src/lj_opt_loop.c
@@ -300,7 +300,8 @@ static void loop_unroll(LoopState *lps)
   loopmap = &J->cur.snapmap[loopsnap->mapofs];
   /* The PC of snapshot #0 and the loop snapshot must match. */
   psentinel = &loopmap[loopsnap->nent];
-  lua_assert(*psentinel == J->cur.snapmap[J->cur.snap[0].nent]);
+  lj_assertJ(*psentinel == J->cur.snapmap[J->cur.snap[0].nent],
+	     "mismatched PC for loop snapshot");
   *psentinel = SNAP(255, 0, 0);  /* Replace PC with temporary sentinel. */
 
   /* Start substitution with snapshot #1 (#0 is empty for root traces). */
@@ -371,7 +372,7 @@ static void loop_unroll(LoopState *lps)
   }
   if (!irt_isguard(J->guardemit))  /* Drop redundant snapshot. */
     J->cur.nsnapmap = (uint32_t)J->cur.snap[--J->cur.nsnap].mapofs;
-  lua_assert(J->cur.nsnapmap <= J->sizesnapmap);
+  lj_assertJ(J->cur.nsnapmap <= J->sizesnapmap, "bad snapshot map index");
   *psentinel = J->cur.snapmap[J->cur.snap[0].nent];  /* Restore PC. */
 
   loop_emit_phi(J, subst, phi, nphi, onsnap);
diff --git a/src/lj_opt_mem.c b/src/lj_opt_mem.c
index c8265b4f..59fddbdd 100644
--- a/src/lj_opt_mem.c
+++ b/src/lj_opt_mem.c
@@ -18,6 +18,7 @@
 #include "lj_jit.h"
 #include "lj_iropt.h"
 #include "lj_ircall.h"
+#include "lj_dispatch.h"
 
 /* Some local macros to save typing. Undef'd at the end. */
 #define IR(ref)		(&J->cur.ir[(ref)])
@@ -56,8 +57,8 @@ static AliasRet aa_table(jit_State *J, IRRef ta, IRRef tb)
 {
   IRIns *taba = IR(ta), *tabb = IR(tb);
   int newa, newb;
-  lua_assert(ta != tb);
-  lua_assert(irt_istab(taba->t) && irt_istab(tabb->t));
+  lj_assertJ(ta != tb, "bad usage");
+  lj_assertJ(irt_istab(taba->t) && irt_istab(tabb->t), "bad usage");
   /* Disambiguate new allocations. */
   newa = (taba->o == IR_TNEW || taba->o == IR_TDUP);
   newb = (tabb->o == IR_TNEW || tabb->o == IR_TDUP);
@@ -99,7 +100,7 @@ static AliasRet aa_ahref(jit_State *J, IRIns *refa, IRIns *refb)
     /* Disambiguate array references based on index arithmetic. */
     int32_t ofsa = 0, ofsb = 0;
     IRRef basea = ka, baseb = kb;
-    lua_assert(refb->o == IR_AREF);
+    lj_assertJ(refb->o == IR_AREF, "expected AREF");
     /* Gather base and offset from t[base] or t[base+-ofs]. */
     if (keya->o == IR_ADD && irref_isk(keya->op2)) {
       basea = keya->op1;
@@ -117,8 +118,9 @@ static AliasRet aa_ahref(jit_State *J, IRIns *refa, IRIns *refb)
       return ALIAS_NO;  /* t[base+-o1] vs. t[base+-o2] and o1 != o2. */
   } else {
     /* Disambiguate hash references based on the type of their keys. */
-    lua_assert((refa->o==IR_HREF || refa->o==IR_HREFK || refa->o==IR_NEWREF) &&
-	       (refb->o==IR_HREF || refb->o==IR_HREFK || refb->o==IR_NEWREF));
+    lj_assertJ((refa->o==IR_HREF || refa->o==IR_HREFK || refa->o==IR_NEWREF) &&
+	       (refb->o==IR_HREF || refb->o==IR_HREFK || refb->o==IR_NEWREF),
+	       "bad xREF IR op %d or %d", refa->o, refb->o);
     if (!irt_sametype(keya->t, keyb->t))
       return ALIAS_NO;  /* Different key types. */
   }
@@ -192,7 +194,8 @@ static TRef fwd_ahload(jit_State *J, IRRef xref)
 	if (key->o == IR_KSLOT) key = IR(key->op1);
 	lj_ir_kvalue(J->L, &keyv, key);
 	tv = lj_tab_get(J->L, ir_ktab(IR(ir->op1)), &keyv);
-	lua_assert(itype2irt(tv) == irt_type(fins->t));
+	lj_assertJ(itype2irt(tv) == irt_type(fins->t),
+		   "mismatched type in constant table");
 	if (irt_isnum(fins->t))
 	  return lj_ir_knum_u64(J, tv->u64);
 	else if (LJ_DUALNUM && irt_isint(fins->t))
diff --git a/src/lj_opt_narrow.c b/src/lj_opt_narrow.c
index 4f285334..2cfb775b 100644
--- a/src/lj_opt_narrow.c
+++ b/src/lj_opt_narrow.c
@@ -372,17 +372,17 @@ static IRRef narrow_conv_emit(jit_State *J, NarrowConv *nc)
     } else if (op == NARROW_CONV) {
       *sp++ = emitir_raw(convot, ref, convop2);  /* Raw emit avoids a loop. */
     } else if (op == NARROW_SEXT) {
-      lua_assert(sp >= nc->stack+1);
+      lj_assertJ(sp >= nc->stack+1, "stack underflow");
       sp[-1] = emitir(IRT(IR_CONV, IRT_I64), sp[-1],
 		      (IRT_I64<<5)|IRT_INT|IRCONV_SEXT);
     } else if (op == NARROW_INT) {
-      lua_assert(next < last);
+      lj_assertJ(next < last, "missing arg to NARROW_INT");
       *sp++ = nc->t == IRT_I64 ?
 	      lj_ir_kint64(J, (int64_t)(int32_t)*next++) :
 	      lj_ir_kint(J, *next++);
     } else {  /* Regular IROpT. Pops two operands and pushes one result. */
       IRRef mode = nc->mode;
-      lua_assert(sp >= nc->stack+2);
+      lj_assertJ(sp >= nc->stack+2, "stack underflow");
       sp--;
       /* Omit some overflow checks for array indexing. See comments above. */
       if ((mode & IRCONV_CONVMASK) == IRCONV_INDEX) {
@@ -398,7 +398,7 @@ static IRRef narrow_conv_emit(jit_State *J, NarrowConv *nc)
 	narrow_bpc_set(J, narrow_ref(ref), narrow_ref(sp[-1]), mode);
     }
   }
-  lua_assert(sp == nc->stack+1);
+  lj_assertJ(sp == nc->stack+1, "stack misalignment");
   return nc->stack[0];
 }
 
@@ -452,7 +452,7 @@ static TRef narrow_stripov(jit_State *J, TRef tr, int lastop, IRRef mode)
 TRef LJ_FASTCALL lj_opt_narrow_index(jit_State *J, TRef tr)
 {
   IRIns *ir;
-  lua_assert(tref_isnumber(tr));
+  lj_assertJ(tref_isnumber(tr), "expected number type");
   if (tref_isnum(tr))  /* Conversion may be narrowed, too. See above. */
     return emitir(IRTGI(IR_CONV), tr, IRCONV_INT_NUM|IRCONV_INDEX);
   /* Omit some overflow checks for array indexing. See comments above. */
@@ -499,7 +499,7 @@ TRef LJ_FASTCALL lj_opt_narrow_tobit(jit_State *J, TRef tr)
 /* Narrow C array index (overflow undefined). */
 TRef LJ_FASTCALL lj_opt_narrow_cindex(jit_State *J, TRef tr)
 {
-  lua_assert(tref_isnumber(tr));
+  lj_assertJ(tref_isnumber(tr), "expected number type");
   if (tref_isnum(tr))
     return emitir(IRT(IR_CONV, IRT_INTP), tr, (IRT_INTP<<5)|IRT_NUM|IRCONV_ANY);
   /* Undefined overflow semantics allow stripping of ADDOV, SUBOV and MULOV. */
@@ -627,9 +627,10 @@ static int narrow_forl(jit_State *J, cTValue *o)
 /* Narrow the FORL index type by looking at the runtime values. */
 IRType lj_opt_narrow_forl(jit_State *J, cTValue *tv)
 {
-  lua_assert(tvisnumber(&tv[FORL_IDX]) &&
+  lj_assertJ(tvisnumber(&tv[FORL_IDX]) &&
 	     tvisnumber(&tv[FORL_STOP]) &&
-	     tvisnumber(&tv[FORL_STEP]));
+	     tvisnumber(&tv[FORL_STEP]),
+	     "expected number types");
   /* Narrow only if the runtime values of start/stop/step are all integers. */
   if (narrow_forl(J, &tv[FORL_IDX]) &&
       narrow_forl(J, &tv[FORL_STOP]) &&
diff --git a/src/lj_opt_split.c b/src/lj_opt_split.c
index c10a85cb..a619d852 100644
--- a/src/lj_opt_split.c
+++ b/src/lj_opt_split.c
@@ -235,7 +235,7 @@ static IRRef split_bitshift(jit_State *J, IRRef1 *hisubst,
 	return split_emit(J, IRTI(IR_BOR), t1, t2);
       } else {
 	IRRef t1 = ir->prev, t2;
-	lua_assert(op == IR_BSHR || op == IR_BSAR);
+	lj_assertJ(op == IR_BSHR || op == IR_BSAR, "bad usage");
 	nir->o = IR_BSHR;
 	t2 = split_emit(J, IRTI(IR_BSHL), hi, lj_ir_kint(J, (-k&31)));
 	ir->prev = split_emit(J, IRTI(IR_BOR), t1, t2);
@@ -250,7 +250,7 @@ static IRRef split_bitshift(jit_State *J, IRRef1 *hisubst,
 	ir->prev = lj_ir_kint(J, 0);
 	return lo;
       } else {
-	lua_assert(op == IR_BSHR || op == IR_BSAR);
+	lj_assertJ(op == IR_BSHR || op == IR_BSAR, "bad usage");
 	if (k == 32) {
 	  J->cur.nins--;
 	  ir->prev = hi;
@@ -429,7 +429,7 @@ static void split_ir(jit_State *J)
 	hi = split_emit(J, IRT(IR_HIOP, IRT_SOFTFP), nref, nref);
 	break;
       case IR_FLOAD:
-	lua_assert(ir->op1 == REF_NIL);
+	lj_assertJ(ir->op1 == REF_NIL, "expected FLOAD from GG_State");
 	hi = lj_ir_kint(J, *(int32_t*)((char*)J2GG(J) + ir->op2 + LJ_LE*4));
 	nir->op2 += LJ_BE*4;
 	break;
@@ -465,8 +465,9 @@ static void split_ir(jit_State *J)
 	  break;
 	}
 #endif
-	lua_assert(st == IRT_INT ||
-		   (LJ_32 && LJ_HASFFI && (st == IRT_U32 || st == IRT_FLOAT)));
+	lj_assertJ(st == IRT_INT ||
+		   (LJ_32 && LJ_HASFFI && (st == IRT_U32 || st == IRT_FLOAT)),
+		   "bad source type for CONV");
 	nir->o = IR_CALLN;
 #if LJ_32 && LJ_HASFFI
 	nir->op2 = st == IRT_INT ? IRCALL_softfp_i2d :
@@ -496,7 +497,8 @@ static void split_ir(jit_State *J)
 	hi = nir->op2;
 	break;
       default:
-	lua_assert(ir->o <= IR_NE || ir->o == IR_MIN || ir->o == IR_MAX);
+	lj_assertJ(ir->o <= IR_NE || ir->o == IR_MIN || ir->o == IR_MAX,
+		   "bad IR op %d", ir->o);
 	hi = split_emit(J, IRTG(IR_HIOP, IRT_SOFTFP),
 			hisubst[ir->op1], hisubst[ir->op2]);
 	break;
@@ -553,7 +555,7 @@ static void split_ir(jit_State *J)
 	hi = split_bitshift(J, hisubst, oir, nir, ir);
 	break;
       case IR_FLOAD:
-	lua_assert(ir->op2 == IRFL_CDATA_INT64);
+	lj_assertJ(ir->op2 == IRFL_CDATA_INT64, "only INT64 supported");
 	hi = split_emit(J, IRTI(IR_FLOAD), nir->op1, IRFL_CDATA_INT64_4);
 #if LJ_BE
 	ir->prev = hi; hi = nref;
@@ -619,7 +621,7 @@ static void split_ir(jit_State *J)
 	hi = nir->op2;
 	break;
       default:
-	lua_assert(ir->o <= IR_NE);  /* Comparisons. */
+	lj_assertJ(ir->o <= IR_NE, "bad IR op %d", ir->o);  /* Comparisons. */
 	split_emit(J, IRTGI(IR_HIOP), hiref, hisubst[ir->op2]);
 	break;
       }
@@ -697,7 +699,7 @@ static void split_ir(jit_State *J)
 #if LJ_SOFTFP
       if (st == IRT_NUM || (LJ_32 && LJ_HASFFI && st == IRT_FLOAT)) {
 	if (irt_isguard(ir->t)) {
-	  lua_assert(st == IRT_NUM && irt_isint(ir->t));
+	  lj_assertJ(st == IRT_NUM && irt_isint(ir->t), "bad CONV types");
 	  J->cur.nins--;
 	  ir->prev = split_num2int(J, nir->op1, hisubst[ir->op1], 1);
 	} else {
@@ -828,7 +830,7 @@ void lj_opt_split(jit_State *J)
   if (!J->needsplit)
     J->needsplit = split_needsplit(J);
 #else
-  lua_assert(J->needsplit >= split_needsplit(J));  /* Verify flag. */
+  lj_assertJ(J->needsplit >= split_needsplit(J), "bad SPLIT state");
 #endif
   if (J->needsplit) {
     int errcode = lj_vm_cpcall(J->L, NULL, J, cpsplit);
diff --git a/src/lj_parse.c b/src/lj_parse.c
index e238afa3..3f6caaec 100644
--- a/src/lj_parse.c
+++ b/src/lj_parse.c
@@ -169,6 +169,12 @@ LJ_STATIC_ASSERT((int)BC_MULVV-(int)BC_ADDVV == (int)OPR_MUL-(int)OPR_ADD);
 LJ_STATIC_ASSERT((int)BC_DIVVV-(int)BC_ADDVV == (int)OPR_DIV-(int)OPR_ADD);
 LJ_STATIC_ASSERT((int)BC_MODVV-(int)BC_ADDVV == (int)OPR_MOD-(int)OPR_ADD);
 
+#ifdef LUA_USE_ASSERT
+#define lj_assertFS(c, ...)	(lj_assertG_(G(fs->L), (c), __VA_ARGS__))
+#else
+#define lj_assertFS(c, ...)	((void)fs)
+#endif
+
 /* -- Error handling ------------------------------------------------------ */
 
 LJ_NORET LJ_NOINLINE static void err_syntax(LexState *ls, ErrMsg em)
@@ -206,7 +212,7 @@ static BCReg const_num(FuncState *fs, ExpDesc *e)
 {
   lua_State *L = fs->L;
   TValue *o;
-  lua_assert(expr_isnumk(e));
+  lj_assertFS(expr_isnumk(e), "bad usage");
   o = lj_tab_set(L, fs->kt, &e->u.nval);
   if (tvhaskslot(o))
     return tvkslot(o);
@@ -231,7 +237,7 @@ static BCReg const_gc(FuncState *fs, GCobj *gc, uint32_t itype)
 /* Add a string constant. */
 static BCReg const_str(FuncState *fs, ExpDesc *e)
 {
-  lua_assert(expr_isstrk(e) || e->k == VGLOBAL);
+  lj_assertFS(expr_isstrk(e) || e->k == VGLOBAL, "bad usage");
   return const_gc(fs, obj2gco(e->u.sval), LJ_TSTR);
 }
 
@@ -319,7 +325,7 @@ static void jmp_patchins(FuncState *fs, BCPos pc, BCPos dest)
 {
   BCIns *jmp = &fs->bcbase[pc].ins;
   BCPos offset = dest-(pc+1)+BCBIAS_J;
-  lua_assert(dest != NO_JMP);
+  lj_assertFS(dest != NO_JMP, "uninitialized jump target");
   if (offset > BCMAX_D)
     err_syntax(fs->ls, LJ_ERR_XJUMP);
   setbc_d(jmp, offset);
@@ -368,7 +374,7 @@ static void jmp_patch(FuncState *fs, BCPos list, BCPos target)
   if (target == fs->pc) {
     jmp_tohere(fs, list);
   } else {
-    lua_assert(target < fs->pc);
+    lj_assertFS(target < fs->pc, "bad jump target");
     jmp_patchval(fs, list, target, NO_REG, target);
   }
 }
@@ -398,7 +404,7 @@ static void bcreg_free(FuncState *fs, BCReg reg)
 {
   if (reg >= fs->nactvar) {
     fs->freereg--;
-    lua_assert(reg == fs->freereg);
+    lj_assertFS(reg == fs->freereg, "bad regfree");
   }
 }
 
@@ -548,7 +554,7 @@ static void expr_toreg_nobranch(FuncState *fs, ExpDesc *e, BCReg reg)
   } else if (e->k <= VKTRUE) {
     ins = BCINS_AD(BC_KPRI, reg, const_pri(e));
   } else {
-    lua_assert(e->k == VVOID || e->k == VJMP);
+    lj_assertFS(e->k == VVOID || e->k == VJMP, "bad expr type %d", e->k);
     return;
   }
   bcemit_INS(fs, ins);
@@ -643,7 +649,7 @@ static void bcemit_store(FuncState *fs, ExpDesc *var, ExpDesc *e)
     ins = BCINS_AD(BC_GSET, ra, const_str(fs, var));
   } else {
     BCReg ra, rc;
-    lua_assert(var->k == VINDEXED);
+    lj_assertFS(var->k == VINDEXED, "bad expr type %d", var->k);
     ra = expr_toanyreg(fs, e);
     rc = var->u.s.aux;
     if ((int32_t)rc < 0) {
@@ -651,10 +657,12 @@ static void bcemit_store(FuncState *fs, ExpDesc *var, ExpDesc *e)
     } else if (rc > BCMAX_C) {
       ins = BCINS_ABC(BC_TSETB, ra, var->u.s.info, rc-(BCMAX_C+1));
     } else {
+#ifdef LUA_USE_ASSERT
       /* Free late alloced key reg to avoid assert on free of value reg. */
       /* This can only happen when called from expr_table(). */
-      lua_assert(e->k != VNONRELOC || ra < fs->nactvar ||
-		 rc < ra || (bcreg_free(fs, rc),1));
+      if (e->k == VNONRELOC && ra >= fs->nactvar && rc >= ra)
+	bcreg_free(fs, rc);
+#endif
       ins = BCINS_ABC(BC_TSETV, ra, var->u.s.info, rc);
     }
   }
@@ -669,7 +677,7 @@ static void bcemit_method(FuncState *fs, ExpDesc *e, ExpDesc *key)
   expr_free(fs, e);
   func = fs->freereg;
   bcemit_AD(fs, BC_MOV, func+1+LJ_FR2, obj);  /* Copy object to 1st argument. */
-  lua_assert(expr_isstrk(key));
+  lj_assertFS(expr_isstrk(key), "bad usage");
   idx = const_str(fs, key);
   if (idx <= BCMAX_C) {
     bcreg_reserve(fs, 2+LJ_FR2);
@@ -809,7 +817,8 @@ static void bcemit_arith(FuncState *fs, BinOpr opr, ExpDesc *e1, ExpDesc *e2)
     else
       rc = expr_toanyreg(fs, e2);
     /* 1st operand discharged by bcemit_binop_left, but need KNUM/KSHORT. */
-    lua_assert(expr_isnumk(e1) || e1->k == VNONRELOC);
+    lj_assertFS(expr_isnumk(e1) || e1->k == VNONRELOC,
+		"bad expr type %d", e1->k);
     expr_toval(fs, e1);
     /* Avoid two consts to satisfy bytecode constraints. */
     if (expr_isnumk(e1) && !expr_isnumk(e2) &&
@@ -897,19 +906,20 @@ static void bcemit_binop(FuncState *fs, BinOpr op, ExpDesc *e1, ExpDesc *e2)
   if (op <= OPR_POW) {
     bcemit_arith(fs, op, e1, e2);
   } else if (op == OPR_AND) {
-    lua_assert(e1->t == NO_JMP);  /* List must be closed. */
+    lj_assertFS(e1->t == NO_JMP, "jump list not closed");
     expr_discharge(fs, e2);
     jmp_append(fs, &e2->f, e1->f);
     *e1 = *e2;
   } else if (op == OPR_OR) {
-    lua_assert(e1->f == NO_JMP);  /* List must be closed. */
+    lj_assertFS(e1->f == NO_JMP, "jump list not closed");
     expr_discharge(fs, e2);
     jmp_append(fs, &e2->t, e1->t);
     *e1 = *e2;
   } else if (op == OPR_CONCAT) {
     expr_toval(fs, e2);
     if (e2->k == VRELOCABLE && bc_op(*bcptr(fs, e2)) == BC_CAT) {
-      lua_assert(e1->u.s.info == bc_b(*bcptr(fs, e2))-1);
+      lj_assertFS(e1->u.s.info == bc_b(*bcptr(fs, e2))-1,
+		  "bad CAT stack layout");
       expr_free(fs, e1);
       setbc_b(bcptr(fs, e2), e1->u.s.info);
       e1->u.s.info = e2->u.s.info;
@@ -921,8 +931,9 @@ static void bcemit_binop(FuncState *fs, BinOpr op, ExpDesc *e1, ExpDesc *e2)
     }
     e1->k = VRELOCABLE;
   } else {
-    lua_assert(op == OPR_NE || op == OPR_EQ ||
-	       op == OPR_LT || op == OPR_GE || op == OPR_LE || op == OPR_GT);
+    lj_assertFS(op == OPR_NE || op == OPR_EQ ||
+	       op == OPR_LT || op == OPR_GE || op == OPR_LE || op == OPR_GT,
+	       "bad binop %d", op);
     bcemit_comp(fs, op, e1, e2);
   }
 }
@@ -951,10 +962,10 @@ static void bcemit_unop(FuncState *fs, BCOp op, ExpDesc *e)
       e->u.s.info = fs->freereg-1;
       e->k = VNONRELOC;
     } else {
-      lua_assert(e->k == VNONRELOC);
+      lj_assertFS(e->k == VNONRELOC, "bad expr type %d", e->k);
     }
   } else {
-    lua_assert(op == BC_UNM || op == BC_LEN);
+    lj_assertFS(op == BC_UNM || op == BC_LEN, "bad unop %d", op);
     if (op == BC_UNM && !expr_hasjump(e)) {  /* Constant-fold negations. */
 #if LJ_HASFFI
       if (e->k == VKCDATA) {  /* Fold in-place since cdata is not interned. */
@@ -1049,8 +1060,9 @@ static void var_new(LexState *ls, BCReg n, GCstr *name)
       lj_lex_error(ls, 0, LJ_ERR_XLIMC, LJ_MAX_VSTACK);
     lj_mem_growvec(ls->L, ls->vstack, ls->sizevstack, LJ_MAX_VSTACK, VarInfo);
   }
-  lua_assert((uintptr_t)name < VARNAME__MAX ||
-	     lj_tab_getstr(fs->kt, name) != NULL);
+  lj_assertFS((uintptr_t)name < VARNAME__MAX ||
+	      lj_tab_getstr(fs->kt, name) != NULL,
+	      "unanchored variable name");
   /* NOBARRIER: name is anchored in fs->kt and ls->vstack is not a GCobj. */
   setgcref(ls->vstack[vtop].name, obj2gco(name));
   fs->varmap[fs->nactvar+n] = (uint16_t)vtop;
@@ -1105,7 +1117,7 @@ static MSize var_lookup_uv(FuncState *fs, MSize vidx, ExpDesc *e)
       return i;  /* Already exists. */
   /* Otherwise create a new one. */
   checklimit(fs, fs->nuv, LJ_MAX_UPVAL, "upvalues");
-  lua_assert(e->k == VLOCAL || e->k == VUPVAL);
+  lj_assertFS(e->k == VLOCAL || e->k == VUPVAL, "bad expr type %d", e->k);
   fs->uvmap[n] = (uint16_t)vidx;
   fs->uvtmp[n] = (uint16_t)(e->k == VLOCAL ? vidx : LJ_MAX_VSTACK+e->u.s.info);
   fs->nuv = n+1;
@@ -1156,7 +1168,8 @@ static MSize gola_new(LexState *ls, GCstr *name, uint8_t info, BCPos pc)
       lj_lex_error(ls, 0, LJ_ERR_XLIMC, LJ_MAX_VSTACK);
     lj_mem_growvec(ls->L, ls->vstack, ls->sizevstack, LJ_MAX_VSTACK, VarInfo);
   }
-  lua_assert(name == NAME_BREAK || lj_tab_getstr(fs->kt, name) != NULL);
+  lj_assertFS(name == NAME_BREAK || lj_tab_getstr(fs->kt, name) != NULL,
+	      "unanchored label name");
   /* NOBARRIER: name is anchored in fs->kt and ls->vstack is not a GCobj. */
   setgcref(ls->vstack[vtop].name, obj2gco(name));
   ls->vstack[vtop].startpc = pc;
@@ -1186,8 +1199,9 @@ static void gola_close(LexState *ls, VarInfo *vg)
   FuncState *fs = ls->fs;
   BCPos pc = vg->startpc;
   BCIns *ip = &fs->bcbase[pc].ins;
-  lua_assert(gola_isgoto(vg));
-  lua_assert(bc_op(*ip) == BC_JMP || bc_op(*ip) == BC_UCLO);
+  lj_assertFS(gola_isgoto(vg), "expected goto");
+  lj_assertFS(bc_op(*ip) == BC_JMP || bc_op(*ip) == BC_UCLO,
+	      "bad bytecode op %d", bc_op(*ip));
   setbc_a(ip, vg->slot);
   if (bc_op(*ip) == BC_JMP) {
     BCPos next = jmp_next(fs, pc);
@@ -1206,9 +1220,9 @@ static void gola_resolve(LexState *ls, FuncScope *bl, MSize idx)
     if (gcrefeq(vg->name, vl->name) && gola_isgoto(vg)) {
       if (vg->slot < vl->slot) {
 	GCstr *name = strref(var_get(ls, ls->fs, vg->slot).name);
-	lua_assert((uintptr_t)name >= VARNAME__MAX);
+	lj_assertLS((uintptr_t)name >= VARNAME__MAX, "expected goto name");
 	ls->linenumber = ls->fs->bcbase[vg->startpc].line;
-	lua_assert(strref(vg->name) != NAME_BREAK);
+	lj_assertLS(strref(vg->name) != NAME_BREAK, "unexpected break");
 	lj_lex_error(ls, 0, LJ_ERR_XGSCOPE,
 		     strdata(strref(vg->name)), strdata(name));
       }
@@ -1272,7 +1286,7 @@ static void fscope_begin(FuncState *fs, FuncScope *bl, int flags)
   bl->vstart = fs->ls->vtop;
   bl->prev = fs->bl;
   fs->bl = bl;
-  lua_assert(fs->freereg == fs->nactvar);
+  lj_assertFS(fs->freereg == fs->nactvar, "bad regalloc");
 }
 
 /* End a scope. */
@@ -1283,7 +1297,7 @@ static void fscope_end(FuncState *fs)
   fs->bl = bl->prev;
   var_remove(ls, bl->nactvar);
   fs->freereg = fs->nactvar;
-  lua_assert(bl->nactvar == fs->nactvar);
+  lj_assertFS(bl->nactvar == fs->nactvar, "bad regalloc");
   if ((bl->flags & (FSCOPE_UPVAL|FSCOPE_NOCLOSE)) == FSCOPE_UPVAL)
     bcemit_AJ(fs, BC_UCLO, bl->nactvar, 0);
   if ((bl->flags & FSCOPE_BREAK)) {
@@ -1370,13 +1384,13 @@ static void fs_fixup_k(FuncState *fs, GCproto *pt, void *kptr)
     Node *n = &node[i];
     if (tvhaskslot(&n->val)) {
       ptrdiff_t kidx = (ptrdiff_t)tvkslot(&n->val);
-      lua_assert(!tvisint(&n->key));
+      lj_assertFS(!tvisint(&n->key), "unexpected integer key");
       if (tvisnum(&n->key)) {
 	TValue *tv = &((TValue *)kptr)[kidx];
 	if (LJ_DUALNUM) {
 	  lua_Number nn = numV(&n->key);
 	  int32_t k = lj_num2int(nn);
-	  lua_assert(!tvismzero(&n->key));
+	  lj_assertFS(!tvismzero(&n->key), "unexpected -0 key");
 	  if ((lua_Number)k == nn)
 	    setintV(tv, k);
 	  else
@@ -1424,21 +1438,21 @@ static void fs_fixup_line(FuncState *fs, GCproto *pt,
     uint8_t *li = (uint8_t *)lineinfo;
     do {
       BCLine delta = base[i].line - first;
-      lua_assert(delta >= 0 && delta < 256);
+      lj_assertFS(delta >= 0 && delta < 256, "bad line delta");
       li[i] = (uint8_t)delta;
     } while (++i < n);
   } else if (LJ_LIKELY(numline < 65536)) {
     uint16_t *li = (uint16_t *)lineinfo;
     do {
       BCLine delta = base[i].line - first;
-      lua_assert(delta >= 0 && delta < 65536);
+      lj_assertFS(delta >= 0 && delta < 65536, "bad line delta");
       li[i] = (uint16_t)delta;
     } while (++i < n);
   } else {
     uint32_t *li = (uint32_t *)lineinfo;
     do {
       BCLine delta = base[i].line - first;
-      lua_assert(delta >= 0);
+      lj_assertFS(delta >= 0, "bad line delta");
       li[i] = (uint32_t)delta;
     } while (++i < n);
   }
@@ -1528,7 +1542,7 @@ static void fs_fixup_ret(FuncState *fs)
   }
   fs->bl->flags |= FSCOPE_NOCLOSE;  /* Handled above. */
   fscope_end(fs);
-  lua_assert(fs->bl == NULL);
+  lj_assertFS(fs->bl == NULL, "bad scope nesting");
   /* May need to fixup returns encoded before first function was created. */
   if (fs->flags & PROTO_FIXUP_RETURN) {
     BCPos pc;
@@ -1608,7 +1622,7 @@ static GCproto *fs_finish(LexState *ls, BCLine line)
   L->top--;  /* Pop table of constants. */
   ls->vtop = fs->vbase;  /* Reset variable stack. */
   ls->fs = fs->prev;
-  lua_assert(ls->fs != NULL || ls->tok == TK_eof);
+  lj_assertL(ls->fs != NULL || ls->tok == TK_eof, "bad parser state");
   return pt;
 }
 
@@ -1702,14 +1716,15 @@ static void expr_bracket(LexState *ls, ExpDesc *v)
 }
 
 /* Get value of constant expression. */
-static void expr_kvalue(TValue *v, ExpDesc *e)
+static void expr_kvalue(FuncState *fs, TValue *v, ExpDesc *e)
 {
+  UNUSED(fs);
   if (e->k <= VKTRUE) {
     setpriV(v, ~(uint32_t)e->k);
   } else if (e->k == VKSTR) {
     setgcVraw(v, obj2gco(e->u.sval), LJ_TSTR);
   } else {
-    lua_assert(tvisnumber(expr_numtv(e)));
+    lj_assertFS(tvisnumber(expr_numtv(e)), "bad number constant");
     *v = *expr_numtv(e);
   }
 }
@@ -1759,11 +1774,11 @@ static void expr_table(LexState *ls, ExpDesc *e)
 	fs->bcbase[pc].ins = BCINS_AD(BC_TDUP, freg-1, kidx);
       }
       vcall = 0;
-      expr_kvalue(&k, &key);
+      expr_kvalue(fs, &k, &key);
       v = lj_tab_set(fs->L, t, &k);
       lj_gc_anybarriert(fs->L, t);
       if (expr_isk_nojump(&val)) {  /* Add const key/value to template table. */
-	expr_kvalue(v, &val);
+	expr_kvalue(fs, v, &val);
       } else {  /* Otherwise create dummy string key (avoids lj_tab_newkey). */
 	settabV(fs->L, v, t);  /* Preserve key with table itself as value. */
 	fixt = 1;   /* Fix this later, after all resizes. */
@@ -1782,8 +1797,9 @@ static void expr_table(LexState *ls, ExpDesc *e)
   if (vcall) {
     BCInsLine *ilp = &fs->bcbase[fs->pc-1];
     ExpDesc en;
-    lua_assert(bc_a(ilp->ins) == freg &&
-	       bc_op(ilp->ins) == (narr > 256 ? BC_TSETV : BC_TSETB));
+    lj_assertFS(bc_a(ilp->ins) == freg &&
+		bc_op(ilp->ins) == (narr > 256 ? BC_TSETV : BC_TSETB),
+		"bad CALL code generation");
     expr_init(&en, VKNUM, 0);
     en.u.nval.u32.lo = narr-1;
     en.u.nval.u32.hi = 0x43300000;  /* Biased integer to avoid denormals. */
@@ -1813,7 +1829,7 @@ static void expr_table(LexState *ls, ExpDesc *e)
       for (i = 0; i <= hmask; i++) {
 	Node *n = &node[i];
 	if (tvistab(&n->val)) {
-	  lua_assert(tabV(&n->val) == t);
+	  lj_assertFS(tabV(&n->val) == t, "bad dummy key in template table");
 	  setnilV(&n->val);  /* Turn value into nil. */
 	}
       }
@@ -1844,7 +1860,7 @@ static BCReg parse_params(LexState *ls, int needself)
     } while (lex_opt(ls, ','));
   }
   var_add(ls, nparams);
-  lua_assert(fs->nactvar == nparams);
+  lj_assertFS(fs->nactvar == nparams, "bad regalloc");
   bcreg_reserve(fs, nparams);
   lex_check(ls, ')');
   return nparams;
@@ -1931,7 +1947,7 @@ static void parse_args(LexState *ls, ExpDesc *e)
     err_syntax(ls, LJ_ERR_XFUNARG);
     return;  /* Silence compiler. */
   }
-  lua_assert(e->k == VNONRELOC);
+  lj_assertFS(e->k == VNONRELOC, "bad expr type %d", e->k);
   base = e->u.s.info;  /* Base register for call. */
   if (args.k == VCALL) {
     ins = BCINS_ABC(BC_CALLM, base, 2, args.u.s.aux - base - 1 - LJ_FR2);
@@ -2701,8 +2717,9 @@ static void parse_chunk(LexState *ls)
   while (!islast && !parse_isend(ls->tok)) {
     islast = parse_stmt(ls);
     lex_opt(ls, ';');
-    lua_assert(ls->fs->framesize >= ls->fs->freereg &&
-	       ls->fs->freereg >= ls->fs->nactvar);
+    lj_assertLS(ls->fs->framesize >= ls->fs->freereg &&
+		ls->fs->freereg >= ls->fs->nactvar,
+		"bad regalloc");
     ls->fs->freereg = ls->fs->nactvar;  /* Free registers after each stmt. */
   }
   synlevel_end(ls);
@@ -2737,9 +2754,8 @@ GCproto *lj_parse(LexState *ls)
     err_token(ls, TK_eof);
   pt = fs_finish(ls, ls->linenumber);
   L->top--;  /* Drop chunkname. */
-  lua_assert(fs.prev == NULL);
-  lua_assert(ls->fs == NULL);
-  lua_assert(pt->sizeuv == 0);
+  lj_assertL(fs.prev == NULL && ls->fs == NULL, "mismatched frame nesting");
+  lj_assertL(pt->sizeuv == 0, "toplevel proto has upvalues");
   return pt;
 }
 
diff --git a/src/lj_record.c b/src/lj_record.c
index 6030f77c..d1332bfc 100644
--- a/src/lj_record.c
+++ b/src/lj_record.c
@@ -50,34 +50,52 @@
 static void rec_check_ir(jit_State *J)
 {
   IRRef i, nins = J->cur.nins, nk = J->cur.nk;
-  lua_assert(nk <= REF_BIAS && nins >= REF_BIAS && nins < 65536);
+  lj_assertJ(nk <= REF_BIAS && nins >= REF_BIAS && nins < 65536,
+	     "inconsistent IR layout");
   for (i = nk; i < nins; i++) {
     IRIns *ir = IR(i);
     uint32_t mode = lj_ir_mode[ir->o];
     IRRef op1 = ir->op1;
     IRRef op2 = ir->op2;
+    const char *err = NULL;
     switch (irm_op1(mode)) {
-    case IRMnone: lua_assert(op1 == 0); break;
-    case IRMref: lua_assert(op1 >= nk);
-      lua_assert(i >= REF_BIAS ? op1 < i : op1 > i); break;
+    case IRMnone:
+      if (op1 != 0) err = "IRMnone op1 used";
+      break;
+    case IRMref:
+      if (op1 < nk || (i >= REF_BIAS ? op1 >= i : op1 <= i))
+	err = "IRMref op1 out of range";
+      break;
     case IRMlit: break;
-    case IRMcst: lua_assert(i < REF_BIAS);
+    case IRMcst:
+      if (i >= REF_BIAS) { err = "constant in IR range"; break; }
       if (irt_is64(ir->t) && ir->o != IR_KNULL)
 	i++;
       continue;
     }
     switch (irm_op2(mode)) {
-    case IRMnone: lua_assert(op2 == 0); break;
-    case IRMref: lua_assert(op2 >= nk);
-      lua_assert(i >= REF_BIAS ? op2 < i : op2 > i); break;
+    case IRMnone:
+      if (op2) err = "IRMnone op2 used";
+      break;
+    case IRMref:
+      if (op2 < nk || (i >= REF_BIAS ? op2 >= i : op2 <= i))
+	err = "IRMref op2 out of range";
+      break;
     case IRMlit: break;
-    case IRMcst: lua_assert(0); break;
+    case IRMcst: err = "IRMcst op2"; break;
     }
-    if (ir->prev) {
-      lua_assert(ir->prev >= nk);
-      lua_assert(i >= REF_BIAS ? ir->prev < i : ir->prev > i);
-      lua_assert(ir->o == IR_NOP || IR(ir->prev)->o == ir->o);
+    if (!err && ir->prev) {
+      if (ir->prev < nk || (i >= REF_BIAS ? ir->prev >= i : ir->prev <= i))
+	err = "chain out of range";
+      else if (ir->o != IR_NOP && IR(ir->prev)->o != ir->o)
+	err = "chain to different op";
     }
+    lj_assertJ(!err, "bad IR %04d op %d(%04d,%04d): %s",
+	       i-REF_BIAS,
+	       ir->o,
+	       irm_op1(mode) == IRMref ? op1-REF_BIAS : op1,
+	       irm_op2(mode) == IRMref ? op2-REF_BIAS : op2,
+	       err);
   }
 }
 
@@ -87,9 +105,10 @@ static void rec_check_slots(jit_State *J)
   BCReg s, nslots = J->baseslot + J->maxslot;
   int32_t depth = 0;
   cTValue *base = J->L->base - J->baseslot;
-  lua_assert(J->baseslot >= 1+LJ_FR2);
-  lua_assert(J->baseslot == 1+LJ_FR2 || (J->slot[J->baseslot-1] & TREF_FRAME));
-  lua_assert(nslots <= LJ_MAX_JSLOTS);
+  lj_assertJ(J->baseslot >= 1+LJ_FR2, "bad baseslot");
+  lj_assertJ(J->baseslot == 1+LJ_FR2 || (J->slot[J->baseslot-1] & TREF_FRAME),
+	     "baseslot does not point to frame");
+  lj_assertJ(nslots <= LJ_MAX_JSLOTS, "slot overflow");
   for (s = 0; s < nslots; s++) {
     TRef tr = J->slot[s];
     if (tr) {
@@ -97,56 +116,65 @@ static void rec_check_slots(jit_State *J)
       IRRef ref = tref_ref(tr);
       IRIns *ir = NULL;  /* Silence compiler. */
       if (!LJ_FR2 || ref || !(tr & (TREF_FRAME | TREF_CONT))) {
-	lua_assert(ref >= J->cur.nk && ref < J->cur.nins);
+	lj_assertJ(ref >= J->cur.nk && ref < J->cur.nins,
+		   "slot %d ref %04d out of range", s, ref - REF_BIAS);
 	ir = IR(ref);
-	lua_assert(irt_t(ir->t) == tref_t(tr));
+	lj_assertJ(irt_t(ir->t) == tref_t(tr), "slot %d IR type mismatch", s);
       }
       if (s == 0) {
-	lua_assert(tref_isfunc(tr));
+	lj_assertJ(tref_isfunc(tr), "frame slot 0 is not a function");
 #if LJ_FR2
       } else if (s == 1) {
-	lua_assert((tr & ~TREF_FRAME) == 0);
+	lj_assertJ((tr & ~TREF_FRAME) == 0, "bad frame slot 1");
 #endif
       } else if ((tr & TREF_FRAME)) {
 	GCfunc *fn = gco2func(frame_gc(tv));
 	BCReg delta = (BCReg)(tv - frame_prev(tv));
 #if LJ_FR2
-	if (ref)
-	  lua_assert(ir_knum(ir)->u64 == tv->u64);
+	lj_assertJ(!ref || ir_knum(ir)->u64 == tv->u64,
+		   "frame slot %d PC mismatch", s);
 	tr = J->slot[s-1];
 	ir = IR(tref_ref(tr));
 #endif
-	lua_assert(tref_isfunc(tr));
-	if (tref_isk(tr)) lua_assert(fn == ir_kfunc(ir));
-	lua_assert(s > delta + LJ_FR2 ? (J->slot[s-delta] & TREF_FRAME)
-				      : (s == delta + LJ_FR2));
+	lj_assertJ(tref_isfunc(tr),
+		   "frame slot %d is not a function", s-LJ_FR2);
+	lj_assertJ(!tref_isk(tr) || fn == ir_kfunc(ir),
+		   "frame slot %d function mismatch", s-LJ_FR2);
+	lj_assertJ(s > delta + LJ_FR2 ? (J->slot[s-delta] & TREF_FRAME)
+				      : (s == delta + LJ_FR2),
+		   "frame slot %d broken chain", s-LJ_FR2);
 	depth++;
       } else if ((tr & TREF_CONT)) {
 #if LJ_FR2
-	if (ref)
-	  lua_assert(ir_knum(ir)->u64 == tv->u64);
+	lj_assertJ(!ref || ir_knum(ir)->u64 == tv->u64,
+		   "cont slot %d continuation mismatch", s);
 #else
-	lua_assert(ir_kptr(ir) == gcrefp(tv->gcr, void));
+	lj_assertJ(ir_kptr(ir) == gcrefp(tv->gcr, void),
+		   "cont slot %d continuation mismatch", s);
 #endif
-	lua_assert((J->slot[s+1+LJ_FR2] & TREF_FRAME));
+	lj_assertJ((J->slot[s+1+LJ_FR2] & TREF_FRAME),
+		   "cont slot %d not followed by frame", s);
 	depth++;
       } else {
-	if (tvisnumber(tv))
-	  lua_assert(tref_isnumber(tr));  /* Could be IRT_INT etc., too. */
-	else
-	  lua_assert(itype2irt(tv) == tref_type(tr));
+	/* Number repr. may differ, but other types must be the same. */
+	lj_assertJ(tvisnumber(tv) ? tref_isnumber(tr) :
+				    itype2irt(tv) == tref_type(tr),
+		   "slot %d type mismatch: stack type %d vs IR type %d",
+		   s, itypemap(tv), tref_type(tr));
 	if (tref_isk(tr)) {  /* Compare constants. */
 	  TValue tvk;
 	  lj_ir_kvalue(J->L, &tvk, ir);
-	  if (!(tvisnum(&tvk) && tvisnan(&tvk)))
-	    lua_assert(lj_obj_equal(tv, &tvk));
-	  else
-	    lua_assert(tvisnum(tv) && tvisnan(tv));
+	  lj_assertJ((tvisnum(&tvk) && tvisnan(&tvk)) ?
+		     (tvisnum(tv) && tvisnan(tv)) :
+		     lj_obj_equal(tv, &tvk),
+		     "slot %d const mismatch: stack %016llx vs IR %016llx",
+		     s, tv->u64, tvk.u64);
 	}
       }
     }
   }
-  lua_assert(J->framedepth == depth);
+  lj_assertJ(J->framedepth == depth,
+	     "frame depth mismatch %d vs %d", J->framedepth, depth);
 }
 #endif
 
@@ -182,7 +210,7 @@ static TRef getcurrf(jit_State *J)
 {
   if (J->base[-1-LJ_FR2])
     return J->base[-1-LJ_FR2];
-  lua_assert(J->baseslot == 1+LJ_FR2);
+  lj_assertJ(J->baseslot == 1+LJ_FR2, "bad baseslot");
   return sloadt(J, -1-LJ_FR2, IRT_FUNC, IRSLOAD_READONLY);
 }
 
@@ -427,7 +455,8 @@ static void rec_for_loop(jit_State *J, const BCIns *fori, ScEvEntry *scev,
   TRef stop = fori_arg(J, fori, ra+FORL_STOP, t, mode);
   TRef step = fori_arg(J, fori, ra+FORL_STEP, t, mode);
   int tc, dir = rec_for_direction(&tv[FORL_STEP]);
-  lua_assert(bc_op(*fori) == BC_FORI || bc_op(*fori) == BC_JFORI);
+  lj_assertJ(bc_op(*fori) == BC_FORI || bc_op(*fori) == BC_JFORI,
+	     "bad bytecode %d instead of FORI/JFORI", bc_op(*fori));
   scev->t.irt = t;
   scev->dir = dir;
   scev->stop = tref_ref(stop);
@@ -483,7 +512,7 @@ static LoopEvent rec_for(jit_State *J, const BCIns *fori, int isforl)
 						   IRT_NUM;
     for (i = FORL_IDX; i <= FORL_STEP; i++) {
       if (!tr[i]) sload(J, ra+i);
-      lua_assert(tref_isnumber_str(tr[i]));
+      lj_assertJ(tref_isnumber_str(tr[i]), "bad FORI argument type");
       if (tref_isstr(tr[i]))
 	tr[i] = emitir(IRTG(IR_STRTO, IRT_NUM), tr[i], 0);
       if (t == IRT_INT) {
@@ -615,7 +644,8 @@ static void rec_loop_jit(jit_State *J, TraceNo lnk, LoopEvent ev)
 static int rec_profile_need(jit_State *J, GCproto *pt, const BCIns *pc)
 {
   GCproto *ppt;
-  lua_assert(J->prof_mode == 'f' || J->prof_mode == 'l');
+  lj_assertJ(J->prof_mode == 'f' || J->prof_mode == 'l',
+	     "bad profiler mode %c", J->prof_mode);
   if (!pt)
     return 0;
   ppt = J->prev_pt;
@@ -793,7 +823,7 @@ void lj_record_ret(jit_State *J, BCReg rbase, ptrdiff_t gotresults)
     BCReg cbase = (BCReg)frame_delta(frame);
     if (--J->framedepth <= 0)
       lj_trace_err(J, LJ_TRERR_NYIRETL);
-    lua_assert(J->baseslot > 1+LJ_FR2);
+    lj_assertJ(J->baseslot > 1+LJ_FR2, "bad baseslot for return");
     gotresults++;
     rbase += cbase;
     J->baseslot -= (BCReg)cbase;
@@ -818,7 +848,7 @@ void lj_record_ret(jit_State *J, BCReg rbase, ptrdiff_t gotresults)
     BCReg cbase = (BCReg)frame_delta(frame);
     if (--J->framedepth < 0)  /* NYI: return of vararg func to lower frame. */
       lj_trace_err(J, LJ_TRERR_NYIRETL);
-    lua_assert(J->baseslot > 1+LJ_FR2);
+    lj_assertJ(J->baseslot > 1+LJ_FR2, "bad baseslot for return");
     rbase += cbase;
     J->baseslot -= (BCReg)cbase;
     J->base -= cbase;
@@ -845,7 +875,7 @@ void lj_record_ret(jit_State *J, BCReg rbase, ptrdiff_t gotresults)
     J->maxslot = cbase+(BCReg)nresults;
     if (J->framedepth > 0) {  /* Return to a frame that is part of the trace. */
       J->framedepth--;
-      lua_assert(J->baseslot > cbase+1+LJ_FR2);
+      lj_assertJ(J->baseslot > cbase+1+LJ_FR2, "bad baseslot for return");
       J->baseslot -= cbase+1+LJ_FR2;
       J->base -= cbase+1+LJ_FR2;
     } else if (J->parent == 0 && J->exitno == 0 &&
@@ -860,7 +890,7 @@ void lj_record_ret(jit_State *J, BCReg rbase, ptrdiff_t gotresults)
       emitir(IRTG(IR_RETF, IRT_PGC), trpt, trpc);
       J->retdepth++;
       J->needsnap = 1;
-      lua_assert(J->baseslot == 1+LJ_FR2);
+      lj_assertJ(J->baseslot == 1+LJ_FR2, "bad baseslot for return");
       /* Shift result slots up and clear the slots of the new frame below. */
       memmove(J->base + cbase, J->base-1-LJ_FR2, sizeof(TRef)*nresults);
       memset(J->base-1-LJ_FR2, 0, sizeof(TRef)*(cbase+1+LJ_FR2));
@@ -908,12 +938,13 @@ void lj_record_ret(jit_State *J, BCReg rbase, ptrdiff_t gotresults)
       }  /* Otherwise continue with another __concat call. */
     } else {
       /* Result type already specialized. */
-      lua_assert(cont == lj_cont_condf || cont == lj_cont_condt);
+      lj_assertJ(cont == lj_cont_condf || cont == lj_cont_condt,
+		 "bad continuation type");
     }
   } else {
     lj_trace_err(J, LJ_TRERR_NYIRETL);  /* NYI: handle return to C frame. */
   }
-  lua_assert(J->baseslot >= 1+LJ_FR2);
+  lj_assertJ(J->baseslot >= 1+LJ_FR2, "bad baseslot for return");
 }
 
 /* -- Metamethod handling ------------------------------------------------- */
@@ -1168,7 +1199,7 @@ static void rec_mm_comp_cdata(jit_State *J, RecordIndex *ix, int op, MMS mm)
     ix->tab = ix->val;
     copyTV(J->L, &ix->tabv, &ix->valv);
   } else {
-    lua_assert(tref_iscdata(ix->key));
+    lj_assertJ(tref_iscdata(ix->key), "cdata expected");
     ix->tab = ix->key;
     copyTV(J->L, &ix->tabv, &ix->keyv);
   }
@@ -1265,7 +1296,8 @@ static void rec_idx_abc(jit_State *J, TRef asizeref, TRef ikey, uint32_t asize)
     /* Got scalar evolution analysis results for this reference? */
     if (ref == J->scev.idx) {
       int32_t stop;
-      lua_assert(irt_isint(J->scev.t) && ir->o == IR_SLOAD);
+      lj_assertJ(irt_isint(J->scev.t) && ir->o == IR_SLOAD,
+		 "only int SCEV supported");
       stop = numberVint(&(J->L->base - J->baseslot)[ir->op1 + FORL_STOP]);
       /* Runtime value for stop of loop is within bounds? */
       if ((uint64_t)stop + ofs < (uint64_t)asize) {
@@ -1383,7 +1415,7 @@ TRef lj_record_idx(jit_State *J, RecordIndex *ix)
 
   while (!tref_istab(ix->tab)) { /* Handle non-table lookup. */
     /* Never call raw lj_record_idx() on non-table. */
-    lua_assert(ix->idxchain != 0);
+    lj_assertJ(ix->idxchain != 0, "bad usage");
     if (!lj_record_mm_lookup(J, ix, ix->val ? MM_newindex : MM_index))
       lj_trace_err(J, LJ_TRERR_NOMM);
   handlemm:
@@ -1467,10 +1499,10 @@ TRef lj_record_idx(jit_State *J, RecordIndex *ix)
 	emitir(IRTG(oldv == niltvg(J2G(J)) ? IR_EQ : IR_NE, IRT_PGC),
 	       xref, lj_ir_kkptr(J, niltvg(J2G(J))));
       if (ix->idxchain && lj_record_mm_lookup(J, ix, MM_newindex)) {
-	lua_assert(hasmm);
+	lj_assertJ(hasmm, "inconsistent metamethod handling");
 	goto handlemm;
       }
-      lua_assert(!hasmm);
+      lj_assertJ(!hasmm, "inconsistent metamethod handling");
       if (oldv == niltvg(J2G(J))) {  /* Need to insert a new key. */
 	TRef key = ix->key;
 	if (tref_isinteger(key))  /* NEWREF needs a TValue as a key. */
@@ -1578,7 +1610,7 @@ static TRef rec_upvalue(jit_State *J, uint32_t uv, TRef val)
   int needbarrier = 0;
   if (rec_upvalue_constify(J, uvp)) {  /* Try to constify immutable upvalue. */
     TRef tr, kfunc;
-    lua_assert(val == 0);
+    lj_assertJ(val == 0, "bad usage");
     if (!tref_isk(fn)) {  /* Late specialization of current function. */
       if (J->pt->flags >= PROTO_CLC_POLY)
 	goto noconstify;
@@ -1700,7 +1732,7 @@ static void rec_func_vararg(jit_State *J)
 {
   GCproto *pt = J->pt;
   BCReg s, fixargs, vframe = J->maxslot+1+LJ_FR2;
-  lua_assert((pt->flags & PROTO_VARARG));
+  lj_assertJ((pt->flags & PROTO_VARARG), "FUNCV in non-vararg function");
   if (J->baseslot + vframe + pt->framesize >= LJ_MAX_JSLOTS)
     lj_trace_err(J, LJ_TRERR_STACKOV);
   J->base[vframe-1-LJ_FR2] = J->base[-1-LJ_FR2];  /* Copy function up. */
@@ -1769,7 +1801,7 @@ static void rec_varg(jit_State *J, BCReg dst, ptrdiff_t nresults)
 {
   int32_t numparams = J->pt->numparams;
   ptrdiff_t nvararg = frame_delta(J->L->base-1) - numparams - 1 - LJ_FR2;
-  lua_assert(frame_isvarg(J->L->base-1));
+  lj_assertJ(frame_isvarg(J->L->base-1), "VARG in non-vararg frame");
   if (LJ_FR2 && dst > J->maxslot)
     J->base[dst-1] = 0;  /* Prevent resurrection of unrelated slot. */
   if (J->framedepth > 0) {  /* Simple case: varargs defined on-trace. */
@@ -1887,7 +1919,7 @@ static TRef rec_cat(jit_State *J, BCReg baseslot, BCReg topslot)
   TValue savetv[5];
   BCReg s;
   RecordIndex ix;
-  lua_assert(baseslot < topslot);
+  lj_assertJ(baseslot < topslot, "bad CAT arg");
   for (s = baseslot; s <= topslot; s++)
     (void)getslot(J, s);  /* Ensure all arguments have a reference. */
   if (tref_isnumber_str(top[0]) && tref_isnumber_str(top[-1])) {
@@ -2011,7 +2043,7 @@ void lj_record_ins(jit_State *J)
       if (bc_op(*J->pc) >= BC__MAX)
 	return;
       break;
-    default: lua_assert(0); break;
+    default: lj_assertJ(0, "bad post-processing mode"); break;
     }
     J->postproc = LJ_POST_NONE;
   }
@@ -2379,7 +2411,8 @@ void lj_record_ins(jit_State *J)
       J->loopref = J->cur.nins;
     break;
   case BC_JFORI:
-    lua_assert(bc_op(pc[(ptrdiff_t)rc-BCBIAS_J]) == BC_JFORL);
+    lj_assertJ(bc_op(pc[(ptrdiff_t)rc-BCBIAS_J]) == BC_JFORL,
+	       "JFORI does not point to JFORL");
     if (rec_for(J, pc, 0) != LOOPEV_LEAVE)  /* Link to existing loop. */
       lj_record_stop(J, LJ_TRLINK_ROOT, bc_d(pc[(ptrdiff_t)rc-BCBIAS_J]));
     /* Continue tracing if the loop is not entered. */
@@ -2432,7 +2465,8 @@ void lj_record_ins(jit_State *J)
     rec_func_lua(J);
     break;
   case BC_JFUNCV:
-    lua_assert(0);  /* Cannot happen. No hotcall counting for varag funcs. */
+    /* Cannot happen. No hotcall counting for vararg funcs. */
+    lj_assertJ(0, "unsupported vararg hotcall");
     break;
 
   case BC_FUNCC:
@@ -2492,11 +2526,11 @@ static const BCIns *rec_setup_root(jit_State *J)
     J->bc_min = pc;
     break;
   case BC_ITERL:
-    lua_assert(bc_op(pc[-1]) == BC_ITERC);
+    lj_assertJ(bc_op(pc[-1]) == BC_ITERC, "no ITERC before ITERL");
     J->maxslot = ra + bc_b(pc[-1]) - 1;
     J->bc_extent = (MSize)(-bc_j(ins))*sizeof(BCIns);
     pc += 1+bc_j(ins);
-    lua_assert(bc_op(pc[-1]) == BC_JMP);
+    lj_assertJ(bc_op(pc[-1]) == BC_JMP, "ITERL does not point to JMP+1");
     J->bc_min = pc;
     break;
   case BC_LOOP:
@@ -2528,7 +2562,7 @@ static const BCIns *rec_setup_root(jit_State *J)
     pc++;
     break;
   default:
-    lua_assert(0);
+    lj_assertJ(0, "bad root trace start bytecode %d", bc_op(ins));
     break;
   }
   return pc;
diff --git a/src/lj_snap.c b/src/lj_snap.c
index 9146cddc..2dc281cb 100644
--- a/src/lj_snap.c
+++ b/src/lj_snap.c
@@ -110,7 +110,7 @@ static MSize snapshot_framelinks(jit_State *J, SnapEntry *map, uint8_t *topslot)
   cTValue *ftop = isluafunc(fn) ? (frame+funcproto(fn)->framesize) : J->L->top;
 #if LJ_FR2
   uint64_t pcbase = (u64ptr(J->pc) << 8) | (J->baseslot - 2);
-  lua_assert(2 <= J->baseslot && J->baseslot <= 257);
+  lj_assertJ(2 <= J->baseslot && J->baseslot <= 257, "bad baseslot");
   memcpy(map, &pcbase, sizeof(uint64_t));
 #else
   MSize f = 0;
@@ -129,7 +129,7 @@ static MSize snapshot_framelinks(jit_State *J, SnapEntry *map, uint8_t *topslot)
 #endif
       frame = frame_prevd(frame);
     } else {
-      lua_assert(!frame_isc(frame));
+      lj_assertJ(!frame_isc(frame), "broken frame chain");
 #if !LJ_FR2
       map[f++] = SNAP_MKFTSZ(frame_ftsz(frame));
 #endif
@@ -141,10 +141,10 @@ static MSize snapshot_framelinks(jit_State *J, SnapEntry *map, uint8_t *topslot)
   }
   *topslot = (uint8_t)(ftop - lim);
 #if LJ_FR2
-  lua_assert(sizeof(SnapEntry) * 2 == sizeof(uint64_t));
+  lj_assertJ(sizeof(SnapEntry) * 2 == sizeof(uint64_t), "bad SnapEntry def");
   return 2;
 #else
-  lua_assert(f == (MSize)(1 + J->framedepth));
+  lj_assertJ(f == (MSize)(1 + J->framedepth), "miscalculated snapshot size");
   return f;
 #endif
 }
@@ -223,7 +223,8 @@ static BCReg snap_usedef(jit_State *J, uint8_t *udf,
 #define DEF_SLOT(s)		udf[(s)] *= 3
 
   /* Scan through following bytecode and check for uses/defs. */
-  lua_assert(pc >= proto_bc(J->pt) && pc < proto_bc(J->pt) + J->pt->sizebc);
+  lj_assertJ(pc >= proto_bc(J->pt) && pc < proto_bc(J->pt) + J->pt->sizebc,
+	     "snapshot PC out of range");
   for (;;) {
     BCIns ins = *pc++;
     BCOp op = bc_op(ins);
@@ -234,7 +235,7 @@ static BCReg snap_usedef(jit_State *J, uint8_t *udf,
     switch (bcmode_c(op)) {
     case BCMvar: USE_SLOT(bc_c(ins)); break;
     case BCMrbase:
-      lua_assert(op == BC_CAT);
+      lj_assertJ(op == BC_CAT, "unhandled op %d with RC rbase", op);
       for (s = bc_b(ins); s <= bc_c(ins); s++) USE_SLOT(s);
       for (; s < maxslot; s++) DEF_SLOT(s);
       break;
@@ -288,7 +289,8 @@ static BCReg snap_usedef(jit_State *J, uint8_t *udf,
       break;
     default: break;
     }
-    lua_assert(pc >= proto_bc(J->pt) && pc < proto_bc(J->pt) + J->pt->sizebc);
+    lj_assertJ(pc >= proto_bc(J->pt) && pc < proto_bc(J->pt) + J->pt->sizebc,
+	       "use/def analysis PC out of range");
   }
 
 #undef USE_SLOT
@@ -361,19 +363,20 @@ static RegSP snap_renameref(GCtrace *T, SnapNo lim, IRRef ref, RegSP rs)
 }
 
 /* Copy RegSP from parent snapshot to the parent links of the IR. */
-IRIns *lj_snap_regspmap(GCtrace *T, SnapNo snapno, IRIns *ir)
+IRIns *lj_snap_regspmap(jit_State *J, GCtrace *T, SnapNo snapno, IRIns *ir)
 {
   SnapShot *snap = &T->snap[snapno];
   SnapEntry *map = &T->snapmap[snap->mapofs];
   BloomFilter rfilt = snap_renamefilter(T, snapno);
   MSize n = 0;
   IRRef ref = 0;
+  UNUSED(J);
   for ( ; ; ir++) {
     uint32_t rs;
     if (ir->o == IR_SLOAD) {
       if (!(ir->op2 & IRSLOAD_PARENT)) break;
       for ( ; ; n++) {
-	lua_assert(n < snap->nent);
+	lj_assertJ(n < snap->nent, "slot %d not found in snapshot", ir->op1);
 	if (snap_slot(map[n]) == ir->op1) {
 	  ref = snap_ref(map[n++]);
 	  break;
@@ -390,7 +393,7 @@ IRIns *lj_snap_regspmap(GCtrace *T, SnapNo snapno, IRIns *ir)
     if (bloomtest(rfilt, ref))
       rs = snap_renameref(T, snapno, ref, rs);
     ir->prev = (uint16_t)rs;
-    lua_assert(regsp_used(rs));
+    lj_assertJ(regsp_used(rs), "unused IR %04d in snapshot", ref - REF_BIAS);
   }
   return ir;
 }
@@ -408,7 +411,7 @@ static TRef snap_replay_const(jit_State *J, IRIns *ir)
   case IR_KNUM: case IR_KINT64:
     return lj_ir_k64(J, (IROp)ir->o, ir_k64(ir)->u64);
   case IR_KPTR: return lj_ir_kptr(J, ir_kptr(ir));  /* Continuation. */
-  default: lua_assert(0); return TREF_NIL; break;
+  default: lj_assertJ(0, "bad IR constant op %d", ir->o); return TREF_NIL;
   }
 }
 
@@ -486,7 +489,7 @@ void lj_snap_replay(jit_State *J, GCtrace *T)
 	tr = snap_replay_const(J, ir);
     } else if (!regsp_used(ir->prev)) {
       pass23 = 1;
-      lua_assert(s != 0);
+      lj_assertJ(s != 0, "unused slot 0 in snapshot");
       tr = s;
     } else {
       IRType t = irt_type(ir->t);
@@ -512,8 +515,9 @@ void lj_snap_replay(jit_State *J, GCtrace *T)
       if (regsp_reg(ir->r) == RID_SUNK) {
 	if (J->slot[snap_slot(sn)] != snap_slot(sn)) continue;
 	pass23 = 1;
-	lua_assert(ir->o == IR_TNEW || ir->o == IR_TDUP ||
-		   ir->o == IR_CNEW || ir->o == IR_CNEWI);
+	lj_assertJ(ir->o == IR_TNEW || ir->o == IR_TDUP ||
+		   ir->o == IR_CNEW || ir->o == IR_CNEWI,
+		   "sunk parent IR %04d has bad op %d", refp - REF_BIAS, ir->o);
 	if (ir->op1 >= T->nk) snap_pref(J, T, map, nent, seen, ir->op1);
 	if (ir->op2 >= T->nk) snap_pref(J, T, map, nent, seen, ir->op2);
 	if (LJ_HASFFI && ir->o == IR_CNEWI) {
@@ -531,7 +535,8 @@ void lj_snap_replay(jit_State *J, GCtrace *T)
 	    }
 	}
       } else if (!irref_isk(refp) && !regsp_used(ir->prev)) {
-	lua_assert(ir->o == IR_CONV && ir->op2 == IRCONV_NUM_INT);
+	lj_assertJ(ir->o == IR_CONV && ir->op2 == IRCONV_NUM_INT,
+		   "sunk parent IR %04d has bad op %d", refp - REF_BIAS, ir->o);
 	J->slot[snap_slot(sn)] = snap_pref(J, T, map, nent, seen, ir->op1);
       }
     }
@@ -581,7 +586,9 @@ void lj_snap_replay(jit_State *J, GCtrace *T)
 	      val = snap_pref(J, T, map, nent, seen, irs->op2);
 	      if (val == 0) {
 		IRIns *irc = &T->ir[irs->op2];
-		lua_assert(irc->o == IR_CONV && irc->op2 == IRCONV_NUM_INT);
+		lj_assertJ(irc->o == IR_CONV && irc->op2 == IRCONV_NUM_INT,
+			   "sunk store for parent IR %04d with bad op %d",
+			   refp - REF_BIAS, irc->o);
 		val = snap_pref(J, T, map, nent, seen, irc->op1);
 		val = emitir(IRTN(IR_CONV), val, IRCONV_NUM_INT);
 	      } else if ((LJ_SOFTFP32 || (LJ_32 && LJ_HASFFI)) &&
@@ -634,7 +641,9 @@ static void snap_restoreval(jit_State *J, GCtrace *T, ExitState *ex,
     if (ir->o == IR_KPTR) {
       o->u64 = (uint64_t)(uintptr_t)ir_kptr(ir);
     } else {
-      lua_assert(!(ir->o == IR_KKPTR || ir->o == IR_KNULL));
+      lj_assertJ(!(ir->o == IR_KKPTR || ir->o == IR_KNULL),
+		 "restore of const from IR %04d with bad op %d",
+		 ref - REF_BIAS, ir->o);
       lj_ir_kvalue(J->L, o, ir);
     }
     return;
@@ -655,13 +664,14 @@ static void snap_restoreval(jit_State *J, GCtrace *T, ExitState *ex,
       o->u64 = *(uint64_t *)sps;
 #endif
     } else {
-      lua_assert(!irt_ispri(t));  /* PRI refs never have a spill slot. */
+      lj_assertJ(!irt_ispri(t), "PRI ref with spill slot");
       setgcV(J->L, o, (GCobj *)(uintptr_t)*(GCSize *)sps, irt_toitype(t));
     }
   } else {  /* Restore from register. */
     Reg r = regsp_reg(rs);
     if (ra_noreg(r)) {
-      lua_assert(ir->o == IR_CONV && ir->op2 == IRCONV_NUM_INT);
+      lj_assertJ(ir->o == IR_CONV && ir->op2 == IRCONV_NUM_INT,
+		 "restore from IR %04d has no reg", ref - REF_BIAS);
       snap_restoreval(J, T, ex, snapno, rfilt, ir->op1, o);
       if (LJ_DUALNUM) setnumV(o, (lua_Number)intV(o));
       return;
@@ -689,7 +699,7 @@ static void snap_restoreval(jit_State *J, GCtrace *T, ExitState *ex,
 
 #if LJ_HASFFI
 /* Restore raw data from the trace exit state. */
-static void snap_restoredata(GCtrace *T, ExitState *ex,
+static void snap_restoredata(jit_State *J, GCtrace *T, ExitState *ex,
 			     SnapNo snapno, BloomFilter rfilt,
 			     IRRef ref, void *dst, CTSize sz)
 {
@@ -697,6 +707,7 @@ static void snap_restoredata(GCtrace *T, ExitState *ex,
   RegSP rs = ir->prev;
   int32_t *src;
   uint64_t tmp;
+  UNUSED(J);
   if (irref_isk(ref)) {
     if (ir_isk64(ir)) {
       src = (int32_t *)&ir[1];
@@ -719,8 +730,9 @@ static void snap_restoredata(GCtrace *T, ExitState *ex,
       Reg r = regsp_reg(rs);
       if (ra_noreg(r)) {
 	/* Note: this assumes CNEWI is never used for SOFTFP split numbers. */
-	lua_assert(sz == 8 && ir->o == IR_CONV && ir->op2 == IRCONV_NUM_INT);
-	snap_restoredata(T, ex, snapno, rfilt, ir->op1, dst, 4);
+	lj_assertJ(sz == 8 && ir->o == IR_CONV && ir->op2 == IRCONV_NUM_INT,
+		   "restore from IR %04d has no reg", ref - REF_BIAS);
+	snap_restoredata(J, T, ex, snapno, rfilt, ir->op1, dst, 4);
 	*(lua_Number *)dst = (lua_Number)*(int32_t *)dst;
 	return;
       }
@@ -741,7 +753,8 @@ static void snap_restoredata(GCtrace *T, ExitState *ex,
       if (LJ_64 && LJ_BE && sz == 4) src++;
     }
   }
-  lua_assert(sz == 1 || sz == 2 || sz == 4 || sz == 8);
+  lj_assertJ(sz == 1 || sz == 2 || sz == 4 || sz == 8,
+	     "restore from IR %04d with bad size %d", ref - REF_BIAS, sz);
   if (sz == 4) *(int32_t *)dst = *src;
   else if (sz == 8) *(int64_t *)dst = *(int64_t *)src;
   else if (sz == 1) *(int8_t *)dst = (int8_t)*src;
@@ -754,8 +767,9 @@ static void snap_unsink(jit_State *J, GCtrace *T, ExitState *ex,
 			SnapNo snapno, BloomFilter rfilt,
 			IRIns *ir, TValue *o)
 {
-  lua_assert(ir->o == IR_TNEW || ir->o == IR_TDUP ||
-	     ir->o == IR_CNEW || ir->o == IR_CNEWI);
+  lj_assertJ(ir->o == IR_TNEW || ir->o == IR_TDUP ||
+	     ir->o == IR_CNEW || ir->o == IR_CNEWI,
+	     "sunk allocation with bad op %d", ir->o);
 #if LJ_HASFFI
   if (ir->o == IR_CNEW || ir->o == IR_CNEWI) {
     CTState *cts = ctype_cts(J->L);
@@ -766,13 +780,14 @@ static void snap_unsink(jit_State *J, GCtrace *T, ExitState *ex,
     setcdataV(J->L, o, cd);
     if (ir->o == IR_CNEWI) {
       uint8_t *p = (uint8_t *)cdataptr(cd);
-      lua_assert(sz == 4 || sz == 8);
+      lj_assertJ(sz == 4 || sz == 8, "sunk cdata with bad size %d", sz);
       if (LJ_32 && sz == 8 && ir+1 < T->ir + T->nins && (ir+1)->o == IR_HIOP) {
-	snap_restoredata(T, ex, snapno, rfilt, (ir+1)->op2, LJ_LE?p+4:p, 4);
+	snap_restoredata(J, T, ex, snapno, rfilt, (ir+1)->op2,
+			 LJ_LE ? p+4 : p, 4);
 	if (LJ_BE) p += 4;
 	sz = 4;
       }
-      snap_restoredata(T, ex, snapno, rfilt, ir->op2, p, sz);
+      snap_restoredata(J, T, ex, snapno, rfilt, ir->op2, p, sz);
     } else {
       IRIns *irs, *irlast = &T->ir[T->snap[snapno].ref];
       for (irs = ir+1; irs < irlast; irs++)
@@ -780,8 +795,11 @@ static void snap_unsink(jit_State *J, GCtrace *T, ExitState *ex,
 	  IRIns *iro = &T->ir[T->ir[irs->op1].op2];
 	  uint8_t *p = (uint8_t *)cd;
 	  CTSize szs;
-	  lua_assert(irs->o == IR_XSTORE && T->ir[irs->op1].o == IR_ADD);
-	  lua_assert(iro->o == IR_KINT || iro->o == IR_KINT64);
+	  lj_assertJ(irs->o == IR_XSTORE, "sunk store with bad op %d", irs->o);
+	  lj_assertJ(T->ir[irs->op1].o == IR_ADD,
+		     "sunk store with bad add op %d", T->ir[irs->op1].o);
+	  lj_assertJ(iro->o == IR_KINT || iro->o == IR_KINT64,
+		     "sunk store with bad const offset op %d", iro->o);
 	  if (irt_is64(irs->t)) szs = 8;
 	  else if (irt_isi8(irs->t) || irt_isu8(irs->t)) szs = 1;
 	  else if (irt_isi16(irs->t) || irt_isu16(irs->t)) szs = 2;
@@ -790,14 +808,16 @@ static void snap_unsink(jit_State *J, GCtrace *T, ExitState *ex,
 	    p += (int64_t)ir_k64(iro)->u64;
 	  else
 	    p += iro->i;
-	  lua_assert(p >= (uint8_t *)cdataptr(cd) &&
-		     p + szs <= (uint8_t *)cdataptr(cd) + sz);
+	  lj_assertJ(p >= (uint8_t *)cdataptr(cd) &&
+		     p + szs <= (uint8_t *)cdataptr(cd) + sz,
+		     "sunk store with offset out of range");
 	  if (LJ_32 && irs+1 < T->ir + T->nins && (irs+1)->o == IR_HIOP) {
-	    lua_assert(szs == 4);
-	    snap_restoredata(T, ex, snapno, rfilt, (irs+1)->op2, LJ_LE?p+4:p,4);
+	    lj_assertJ(szs == 4, "sunk store with bad size %d", szs);
+	    snap_restoredata(J, T, ex, snapno, rfilt, (irs+1)->op2,
+			     LJ_LE ? p+4 : p, 4);
 	    if (LJ_BE) p += 4;
 	  }
-	  snap_restoredata(T, ex, snapno, rfilt, irs->op2, p, szs);
+	  snap_restoredata(J, T, ex, snapno, rfilt, irs->op2, p, szs);
 	}
     }
   } else
@@ -812,10 +832,12 @@ static void snap_unsink(jit_State *J, GCtrace *T, ExitState *ex,
       if (irs->r == RID_SINK && snap_sunk_store(T, ir, irs)) {
 	IRIns *irk = &T->ir[irs->op1];
 	TValue tmp, *val;
-	lua_assert(irs->o == IR_ASTORE || irs->o == IR_HSTORE ||
-		   irs->o == IR_FSTORE);
+	lj_assertJ(irs->o == IR_ASTORE || irs->o == IR_HSTORE ||
+		   irs->o == IR_FSTORE,
+		   "sunk store with bad op %d", irs->o);
 	if (irk->o == IR_FREF) {
-	  lua_assert(irk->op2 == IRFL_TAB_META);
+	  lj_assertJ(irk->op2 == IRFL_TAB_META,
+		     "sunk store with bad field %d", irk->op2);
 	  snap_restoreval(J, T, ex, snapno, rfilt, irs->op2, &tmp);
 	  /* NOBARRIER: The table is new (marked white). */
 	  setgcref(t->metatable, obj2gco(tabV(&tmp)));
@@ -903,7 +925,7 @@ const BCIns *lj_snap_restore(jit_State *J, void *exptr)
 #if LJ_FR2
   L->base += (map[nent+LJ_BE] & 0xff);
 #endif
-  lua_assert(map + nent == flinks);
+  lj_assertJ(map + nent == flinks, "inconsistent frames in snapshot");
 
   /* Compute current stack top. */
   switch (bc_op(*pc)) {
diff --git a/src/lj_snap.h b/src/lj_snap.h
index 2c9ae3d6..4aec8509 100644
--- a/src/lj_snap.h
+++ b/src/lj_snap.h
@@ -13,7 +13,8 @@
 LJ_FUNC void lj_snap_add(jit_State *J);
 LJ_FUNC void lj_snap_purge(jit_State *J);
 LJ_FUNC void lj_snap_shrink(jit_State *J);
-LJ_FUNC IRIns *lj_snap_regspmap(GCtrace *T, SnapNo snapno, IRIns *ir);
+LJ_FUNC IRIns *lj_snap_regspmap(jit_State *J, GCtrace *T, SnapNo snapno,
+				IRIns *ir);
 LJ_FUNC void lj_snap_replay(jit_State *J, GCtrace *T);
 LJ_FUNC const BCIns *lj_snap_restore(jit_State *J, void *exptr);
 LJ_FUNC void lj_snap_grow_buf_(jit_State *J, MSize need);
diff --git a/src/lj_state.c b/src/lj_state.c
index 4add3d65..684336d5 100644
--- a/src/lj_state.c
+++ b/src/lj_state.c
@@ -70,7 +70,8 @@ static void resizestack(lua_State *L, MSize n)
   GCobj *up;
   int32_t oldvmstate = G(L)->vmstate;
 
-  lua_assert((MSize)(tvref(L->maxstack)-oldst)==L->stacksize-LJ_STACK_EXTRA-1);
+  lj_assertL((MSize)(tvref(L->maxstack)-oldst) == L->stacksize-LJ_STACK_EXTRA-1,
+	     "inconsistent stack size");
 
   /*
   ** Lua stack is inconsistent while reallocation, profilers
@@ -182,8 +183,9 @@ static void close_state(lua_State *L)
   global_State *g = G(L);
   lj_func_closeuv(L, tvref(L->stack));
   lj_gc_freeall(g);
-  lua_assert(gcref(g->gc.root) == obj2gco(L));
-  lua_assert(g->strnum == 0);
+  lj_assertG(gcref(g->gc.root) == obj2gco(L),
+	     "main thread is not first GC object");
+  lj_assertG(g->strnum == 0, "leaked %d strings", g->strnum);
   lj_trace_freestate(g);
 #if LJ_HASFFI
   lj_ctype_freestate(g);
@@ -197,7 +199,9 @@ static void close_state(lua_State *L)
     lj_mem_freevec(g, mref(g->gc.lightudseg, uint32_t), segnum, uint32_t);
   }
 #endif
-  lua_assert(g->gc.total == sizeof(GG_State));
+  lj_assertG(g->gc.total == sizeof(GG_State),
+	     "memory leak of %lld bytes",
+	     (long long)(g->gc.total - sizeof(GG_State)));
 #ifndef LUAJIT_USE_SYSMALLOC
   if (g->allocf == lj_alloc_f)
     lj_alloc_destroy(g->allocd);
@@ -315,17 +319,17 @@ lua_State *lj_state_new(lua_State *L)
   setmrefr(L1->glref, L->glref);
   setgcrefr(L1->env, L->env);
   stack_init(L1, L);  /* init stack */
-  lua_assert(iswhite(obj2gco(L1)));
+  lj_assertL(iswhite(obj2gco(L1)), "new thread object is not white");
   return L1;
 }
 
 void LJ_FASTCALL lj_state_free(global_State *g, lua_State *L)
 {
-  lua_assert(L != mainthread(g));
+  lj_assertG(L != mainthread(g), "free of main thread");
   if (obj2gco(L) == gcref(g->cur_L))
     setgcrefnull(g->cur_L);
   lj_func_closeuv(L, tvref(L->stack));
-  lua_assert(gcref(L->openupval) == NULL);
+  lj_assertG(gcref(L->openupval) == NULL, "stale open upvalues");
   lj_mem_freevec(g, tvref(L->stack), L->stacksize, TValue);
   lj_mem_freet(g, L);
 }
diff --git a/src/lj_str.c b/src/lj_str.c
index 8ff955ed..321e8c4f 100644
--- a/src/lj_str.c
+++ b/src/lj_str.c
@@ -53,8 +53,9 @@ int32_t LJ_FASTCALL lj_str_cmp(GCstr *a, GCstr *b)
 static LJ_AINLINE int str_fastcmp(const char *a, const char *b, MSize len)
 {
   MSize i = 0;
-  lua_assert(len > 0);
-  lua_assert((((uintptr_t)a+len-1) & (LJ_PAGESIZE-1)) <= LJ_PAGESIZE-4);
+  lj_assertX(len > 0, "fast string compare with zero length");
+  lj_assertX((((uintptr_t)a+len-1) & (LJ_PAGESIZE-1)) <= LJ_PAGESIZE-4,
+	     "fast string compare crossing page boundary");
   do {  /* Note: innocuous access up to end of string + 3. */
     uint32_t v = lj_getu32(a+i) ^ *(const uint32_t *)(b+i);
     if (v) {
@@ -138,7 +139,7 @@ lj_fullhash(const uint8_t *v, MSize len)
   MSize c = 0xcafedead;
   MSize d = 0xdeadbeef;
   MSize h = len;
-  lua_assert(len >= 12);
+  lj_assertX(len >= 12, "full hash calculation for too short (%d) string", len);
   for(; len>8; len-=8, v+=8) {
     a ^= lj_getu32(v);
     b ^= lj_getu32(v+4);
diff --git a/src/lj_strfmt.c b/src/lj_strfmt.c
index 237cc575..ff5568c3 100644
--- a/src/lj_strfmt.c
+++ b/src/lj_strfmt.c
@@ -320,7 +320,7 @@ SBuf *lj_strfmt_putfxint(SBuf *sb, SFormat sf, uint64_t k)
   if ((sf & STRFMT_F_LEFT))
     while (width-- > pprec) *p++ = ' ';
 
-  lua_assert(need == (MSize)(p - ps));
+  lj_assertX(need == (MSize)(p - ps), "miscalculated format size");
   setsbufP(sb, p);
   return sb;
 }
@@ -449,7 +449,7 @@ const char *lj_strfmt_pushvf(lua_State *L, const char *fmt, va_list argp)
     case STRFMT_ERR:
     default:
       lj_buf_putb(sb, '?');
-      lua_assert(0);
+      lj_assertL(0, "bad string format near offset %d", fs.len);
       break;
     }
   }
diff --git a/src/lj_strfmt.h b/src/lj_strfmt.h
index 6e1d9017..0e1d8946 100644
--- a/src/lj_strfmt.h
+++ b/src/lj_strfmt.h
@@ -79,7 +79,8 @@ static LJ_AINLINE void lj_strfmt_init(FormatState *fs, const char *p, MSize len)
 {
   fs->p = (const uint8_t *)p;
   fs->e = (const uint8_t *)p + len;
-  lua_assert(*fs->e == 0);  /* Must be NUL-terminated (may have NULs inside). */
+  /* Must be NUL-terminated. May have NULs inside, too. */
+  lj_assertX(*fs->e == 0, "format not NUL-terminated");
 }
 
 /* Raw conversions. */
diff --git a/src/lj_strfmt_num.c b/src/lj_strfmt_num.c
index 9271f68a..c26204b7 100644
--- a/src/lj_strfmt_num.c
+++ b/src/lj_strfmt_num.c
@@ -257,7 +257,7 @@ static int nd_similar(uint32_t* nd, uint32_t ndhi, uint32_t* ref, MSize hilen,
   } else {
     prec -= hilen - 9;
   }
-  lua_assert(prec < 9);
+  lj_assertX(prec < 9, "bad precision %d", prec);
   lj_strfmt_wuint9(nd9, nd[ndhi]);
   lj_strfmt_wuint9(ref9, *ref);
   return !memcmp(nd9, ref9, prec) && (nd9[prec] < '5') == (ref9[prec] < '5');
@@ -414,14 +414,14 @@ static char *lj_strfmt_wfnum(SBuf *sb, SFormat sf, lua_Number n, char *p)
 	** Rescaling was performed, but this introduced some error, and might
 	** have pushed us across a rounding boundary. We check whether this
 	** error affected the result by introducing even more error (2ulp in
-	** either direction), and seeing whether a roundary boundary was
+	** either direction), and seeing whether a rounding boundary was
 	** crossed. Having already converted the -2ulp case, we save off its
 	** most significant digits, convert the +2ulp case, and compare them.
 	*/
 	int32_t eidx = e + 70 + (ND_MUL2K_MAX_SHIFT < 29)
 			 + (t.u32.lo >= 0xfffffffe && !(~t.u32.hi << 12));
 	const int8_t *m_e = four_ulp_m_e + eidx * 2;
-	lua_assert(0 <= eidx && eidx < 128);
+	lj_assertG_(G(sbufL(sb)), 0 <= eidx && eidx < 128, "bad eidx %d", eidx);
 	nd[33] = nd[ndhi];
 	nd[32] = nd[(ndhi - 1) & 0x3f];
 	nd[31] = nd[(ndhi - 2) & 0x3f];
diff --git a/src/lj_strscan.c b/src/lj_strscan.c
index 11d341ee..bb07b251 100644
--- a/src/lj_strscan.c
+++ b/src/lj_strscan.c
@@ -93,7 +93,7 @@ static void strscan_double(uint64_t x, TValue *o, int32_t ex2, int32_t neg)
   }
 
   /* Convert to double using a signed int64_t conversion, then rescale. */
-  lua_assert((int64_t)x >= 0);
+  lj_assertX((int64_t)x >= 0, "bad double conversion");
   n = (double)(int64_t)x;
   if (neg) n = -n;
   if (ex2) n = ldexp(n, ex2);
@@ -263,7 +263,7 @@ static StrScanFmt strscan_dec(const uint8_t *p, TValue *o,
     uint32_t hi = 0, lo = (uint32_t)(xip-xi);
     int32_t ex2 = 0, idig = (int32_t)lo + (ex10 >> 1);
 
-    lua_assert(lo > 0 && (ex10 & 1) == 0);
+    lj_assertX(lo > 0 && (ex10 & 1) == 0, "bad lo %d ex10 %d", lo, ex10);
 
     /* Handle simple overflow/underflow. */
     if (idig > 310/2) { if (neg) setminfV(o); else setpinfV(o); return fmt; }
@@ -532,7 +532,7 @@ int LJ_FASTCALL lj_strscan_num(GCstr *str, TValue *o)
 {
   StrScanFmt fmt = lj_strscan_scan((const uint8_t *)strdata(str), str->len, o,
 				   STRSCAN_OPT_TONUM);
-  lua_assert(fmt == STRSCAN_ERROR || fmt == STRSCAN_NUM);
+  lj_assertX(fmt == STRSCAN_ERROR || fmt == STRSCAN_NUM, "bad scan format");
   return (fmt != STRSCAN_ERROR);
 }
 
@@ -541,7 +541,8 @@ int LJ_FASTCALL lj_strscan_number(GCstr *str, TValue *o)
 {
   StrScanFmt fmt = lj_strscan_scan((const uint8_t *)strdata(str), str->len, o,
 				   STRSCAN_OPT_TOINT);
-  lua_assert(fmt == STRSCAN_ERROR || fmt == STRSCAN_NUM || fmt == STRSCAN_INT);
+  lj_assertX(fmt == STRSCAN_ERROR || fmt == STRSCAN_NUM || fmt == STRSCAN_INT,
+	     "bad scan format");
   if (fmt == STRSCAN_INT) setitype(o, LJ_TISNUM);
   return (fmt != STRSCAN_ERROR);
 }
diff --git a/src/lj_symtab.c b/src/lj_symtab.c
index 54984c05..38b5e9e1 100644
--- a/src/lj_symtab.c
+++ b/src/lj_symtab.c
@@ -36,8 +36,8 @@ void lj_symtab_dump_trace(struct lj_wbuf *out, const GCtrace *trace)
   BCLine lineno = 0;
 
   const BCIns *startpc = mref(trace->startpc, const BCIns);
-  lua_assert(startpc >= proto_bc(pt) &&
-             startpc < proto_bc(pt) + pt->sizebc);
+  lj_assertX(startpc >= proto_bc(pt) && startpc < proto_bc(pt) + pt->sizebc,
+	     "start trace PC out of range");
 
   lineno = lj_debug_line(pt, proto_bcpos(pt, startpc));
 
@@ -354,8 +354,9 @@ static int resolve_symbolnames(struct dl_phdr_info *info, size_t info_size,
   ** Assertion was taken from the GLIBC tests:
   ** https://code.woboq.org/userspace/glibc/elf/tst-dlmodcount.c.html#37
   */
-  lua_assert(info_size > offsetof(struct dl_phdr_info, dlpi_subs)
-      + sizeof(info->dlpi_subs));
+  lj_assertL(info_size > offsetof(struct dl_phdr_info, dlpi_subs)
+			 + sizeof(info->dlpi_subs),
+	     "bad dlpi_subs");
 
   lib_cnt = info->dlpi_adds - *conf->lib_adds;
 
@@ -401,7 +402,7 @@ static int resolve_symbolnames(struct dl_phdr_info *info, size_t info_size,
       ** sysprof, unless someone have deleted the LuaJIT binary
       ** right after the start.
       */
-      lua_assert(0);
+      lj_assertL(0, "bad executed binary symtab section");
   }
 
   /*
diff --git a/src/lj_sysprof.c b/src/lj_sysprof.c
index 2e9ed9b3..52d4d2a5 100644
--- a/src/lj_sysprof.c
+++ b/src/lj_sysprof.c
@@ -111,9 +111,9 @@ static void stream_epilogue(struct sysprof *sp)
 
 static void stream_lfunc(struct lj_wbuf *buf, const GCfunc *func)
 {
-  lua_assert(isluafunc(func));
+  lj_assertX(isluafunc(func), "bad lua function in sysprof stream");
   const GCproto *pt = funcproto(func);
-  lua_assert(pt != NULL);
+  lj_assertX(pt != NULL, "bad lua function prototype in sysprof stream");
   lj_wbuf_addbyte(buf, LJP_FRAME_LFUNC);
   lj_wbuf_addu64(buf, (uintptr_t)pt);
   lj_wbuf_addu64(buf, (uint64_t)pt->firstline);
@@ -121,14 +121,14 @@ static void stream_lfunc(struct lj_wbuf *buf, const GCfunc *func)
 
 static void stream_cfunc(struct lj_wbuf *buf, const GCfunc *func)
 {
-  lua_assert(iscfunc(func));
+  lj_assertX(iscfunc(func), "bad C function in sysprof stream");
   lj_wbuf_addbyte(buf, LJP_FRAME_CFUNC);
   lj_wbuf_addu64(buf, (uintptr_t)func->c.f);
 }
 
 static void stream_ffunc(struct lj_wbuf *buf, const GCfunc *func)
 {
-  lua_assert(isffunc(func));
+  lj_assertX(isffunc(func), "bad fast function in sysprof stream");
   lj_wbuf_addbyte(buf, LJP_FRAME_FFUNC);
   lj_wbuf_addu64(buf, func->c.ffid);
 }
@@ -136,7 +136,7 @@ static void stream_ffunc(struct lj_wbuf *buf, const GCfunc *func)
 static void stream_frame_lua(struct lj_wbuf *buf, const cTValue *frame)
 {
   const GCfunc *func = frame_func(frame);
-  lua_assert(func != NULL);
+  lj_assertX(func != NULL, "bad function in sysprof stream");
   if (isluafunc(func))
     stream_lfunc(buf, func);
   else if (isffunc(func))
@@ -145,7 +145,7 @@ static void stream_frame_lua(struct lj_wbuf *buf, const cTValue *frame)
     stream_cfunc(buf, func);
   else
     /* Unreachable. */
-    lua_assert(0);
+    lj_assertX(0, "bad function type in sysprof stream");
 }
 
 static void stream_backtrace_lua(struct sysprof *sp)
@@ -155,9 +155,9 @@ static void stream_backtrace_lua(struct sysprof *sp)
   cTValue *top_frame = NULL, *frame = NULL, *bot = NULL;
   lua_State *L = NULL;
 
-  lua_assert(g != NULL);
+  lj_assertX(g != NULL, "uninitialized global state in sysprof state");
   L = gco2th(gcref(g->cur_L));
-  lua_assert(L != NULL);
+  lj_assertG(L != NULL, "uninitialized Lua state in sysprof state");
 
   top_frame = g->top_frame - 1; //(1 + LJ_FR2)
 
@@ -200,7 +200,7 @@ static void default_backtrace_host(void *(writer)(int frame_no, void *addr))
   const int depth = backtrace(backtrace_buf, max_depth);
   int level;
 
-  lua_assert(depth <= max_depth);
+  lj_assertX(depth <= max_depth, "depth of C stack is too big");
   for (level = SYSPROF_HANDLER_STACK_DEPTH; level < depth; ++level) {
     if (!writer(level - SYSPROF_HANDLER_STACK_DEPTH + 1, backtrace_buf[level]))
       return;
@@ -209,7 +209,7 @@ static void default_backtrace_host(void *(writer)(int frame_no, void *addr))
 
 static void stream_backtrace_host(struct sysprof *sp)
 {
-  lua_assert(sp->backtracer != NULL);
+  lj_assertX(sp->backtracer != NULL, "uninitialized sysprof backtracer");
   sp->backtracer(stream_frame_host);
   lj_wbuf_addu64(&sp->out, (uintptr_t)LJP_FRAME_HOST_LAST);
 }
@@ -268,9 +268,9 @@ static void stream_event(struct sysprof *sp, uint32_t vmstate)
 {
   event_streamer stream = NULL;
 
-  lua_assert(vmstfit4(vmstate));
+  lj_assertX(vmstfit4(vmstate), "vmstate don't fit in 4 bits");
   stream = event_streamers[vmstate];
-  lua_assert(NULL != stream);
+  lj_assertX(stream != NULL, "uninitialized sysprof stream");
   stream(sp, vmstate);
 }
 
@@ -282,7 +282,8 @@ static void sysprof_record_sample(struct sysprof *sp, siginfo_t *info)
   uint32_t _vmstate = ~(uint32_t)(g->vmstate);
   uint32_t vmstate = _vmstate < LJ_VMST_TRACE ? _vmstate : LJ_VMST_TRACE;
 
-  lua_assert(pthread_self() == sp->thread);
+  lj_assertX(pthread_self() == sp->thread,
+	     "bad thread during sysprof record sample");
 
   /* Caveat: order of counters must match vmstate order in <lj_obj.h>. */
   ((uint64_t *)&sp->counters)[vmstate]++;
@@ -317,7 +318,7 @@ static void sysprof_signal_handler(int sig, siginfo_t *info, void *ctx)
       break;
 
     default:
-      lua_assert(0);
+      lj_assertX(0, "bad sysprof profiler state");
       break;
   }
 }
@@ -344,7 +345,7 @@ static int sysprof_validate(struct sysprof *sp,
       return PROFILE_ERRRUN;
 
     default:
-      lua_assert(0);
+      lj_assertX(0, "bad sysprof profiler state");
       break;
   }
 
diff --git a/src/lj_tab.c b/src/lj_tab.c
index c5f358e5..1d6a4b7f 100644
--- a/src/lj_tab.c
+++ b/src/lj_tab.c
@@ -38,7 +38,7 @@ static LJ_AINLINE Node *hashmask(const GCtab *t, uint32_t hash)
 /* Hash an arbitrary key and return its anchor position in the hash table. */
 static Node *hashkey(const GCtab *t, cTValue *key)
 {
-  lua_assert(!tvisint(key));
+  lj_assertX(!tvisint(key), "attempt to hash integer");
   if (tvisstr(key))
     return hashstr(t, strV(key));
   else if (tvisnum(key))
@@ -57,7 +57,7 @@ static LJ_AINLINE void newhpart(lua_State *L, GCtab *t, uint32_t hbits)
 {
   uint32_t hsize;
   Node *node;
-  lua_assert(hbits != 0);
+  lj_assertL(hbits != 0, "zero hash size");
   if (hbits > LJ_MAX_HBITS)
     lj_err_msg(L, LJ_ERR_TABOV);
   hsize = 1u << hbits;
@@ -78,7 +78,7 @@ static LJ_AINLINE void clearhpart(GCtab *t)
 {
   uint32_t i, hmask = t->hmask;
   Node *node = noderef(t->node);
-  lua_assert(t->hmask != 0);
+  lj_assertX(t->hmask != 0, "empty hash part");
   for (i = 0; i <= hmask; i++) {
     Node *n = &node[i];
     setmref(n->next, NULL);
@@ -103,7 +103,7 @@ static GCtab *newtab(lua_State *L, uint32_t asize, uint32_t hbits)
   /* First try to colocate the array part. */
   if (LJ_MAX_COLOSIZE != 0 && asize > 0 && asize <= LJ_MAX_COLOSIZE) {
     Node *nilnode;
-    lua_assert((sizeof(GCtab) & 7) == 0);
+    lj_assertL((sizeof(GCtab) & 7) == 0, "bad GCtab size");
     t = (GCtab *)lj_mem_newgco(L, sizetabcolo(asize));
     t->gct = ~LJ_TTAB;
     t->nomm = (uint8_t)~0;
@@ -186,7 +186,8 @@ GCtab * LJ_FASTCALL lj_tab_dup(lua_State *L, const GCtab *kt)
   GCtab *t;
   uint32_t asize, hmask;
   t = newtab(L, kt->asize, kt->hmask > 0 ? lj_fls(kt->hmask)+1 : 0);
-  lua_assert(kt->asize == t->asize && kt->hmask == t->hmask);
+  lj_assertL(kt->asize == t->asize && kt->hmask == t->hmask,
+	     "mismatched size of table and template");
   t->nomm = 0;  /* Keys with metamethod names may be present. */
   asize = kt->asize;
   if (asize > 0) {
@@ -312,7 +313,7 @@ void lj_tab_resize(lua_State *L, GCtab *t, uint32_t asize, uint32_t hbits)
 
 static uint32_t countint(cTValue *key, uint32_t *bins)
 {
-  lua_assert(!tvisint(key));
+  lj_assertX(!tvisint(key), "bad integer key");
   if (tvisnum(key)) {
     lua_Number nk = numV(key);
     int32_t k = lj_num2int(nk);
@@ -465,7 +466,8 @@ TValue *lj_tab_newkey(lua_State *L, GCtab *t, cTValue *key)
   if (!tvisnil(&n->val) || t->hmask == 0) {
     Node *nodebase = noderef(t->node);
     Node *collide, *freenode = getfreetop(t, nodebase);
-    lua_assert(freenode >= nodebase && freenode <= nodebase+t->hmask+1);
+    lj_assertL(freenode >= nodebase && freenode <= nodebase+t->hmask+1,
+	       "bad freenode");
     do {
       if (freenode == nodebase) {  /* No free node found? */
 	rehashtab(L, t, key);  /* Rehash table. */
@@ -473,7 +475,7 @@ TValue *lj_tab_newkey(lua_State *L, GCtab *t, cTValue *key)
       }
     } while (!tvisnil(&(--freenode)->key));
     setfreetop(t, nodebase, freenode);
-    lua_assert(freenode != &G(L)->nilnode);
+    lj_assertL(freenode != &G(L)->nilnode, "store to fallback hash");
     collide = hashkey(t, &n->key);
     if (collide != n) {  /* Colliding node not the main node? */
       Node *nn;
@@ -555,7 +557,7 @@ TValue *lj_tab_newkey(lua_State *L, GCtab *t, cTValue *key)
   if (LJ_UNLIKELY(tvismzero(&n->key)))
     n->key.u64 = 0;
   lj_gc_anybarriert(L, t);
-  lua_assert(tvisnil(&n->val));
+  lj_assertL(tvisnil(&n->val), "new hash slot is not empty");
   return &n->val;
 }
 
diff --git a/src/lj_target.h b/src/lj_target.h
index 8dcae957..b4be6781 100644
--- a/src/lj_target.h
+++ b/src/lj_target.h
@@ -152,7 +152,8 @@ typedef uint32_t RegCost;
 /* Return the address of an exit stub. */
 static LJ_AINLINE char *exitstub_addr_(char **group, uint32_t exitno)
 {
-  lua_assert(group[exitno / EXITSTUBS_PER_GROUP] != NULL);
+  lj_assertX(group[exitno / EXITSTUBS_PER_GROUP] != NULL,
+	     "exit stub group for exit %d uninitialized", exitno);
   return (char *)group[exitno / EXITSTUBS_PER_GROUP] +
 	 EXITSTUB_SPACING*(exitno % EXITSTUBS_PER_GROUP);
 }
diff --git a/src/lj_trace.c b/src/lj_trace.c
index 17743159..236e06a0 100644
--- a/src/lj_trace.c
+++ b/src/lj_trace.c
@@ -110,7 +110,8 @@ static void perftools_addtrace(GCtrace *T)
     name++;
   else
     name = "(string)";
-  lua_assert(startpc >= proto_bc(pt) && startpc < proto_bc(pt) + pt->sizebc);
+  lj_assertX(startpc >= proto_bc(pt) && startpc < proto_bc(pt) + pt->sizebc,
+	     "trace PC out of range");
   lineno = lj_debug_line(pt, proto_bcpos(pt, startpc));
   if (!fp) {
     char fname[40];
@@ -200,7 +201,7 @@ void lj_trace_reenableproto(GCproto *pt)
 {
   if ((pt->flags & PROTO_ILOOP)) {
     BCIns *bc = proto_bc(pt);
-    BCPos i, sizebc = pt->sizebc;;
+    BCPos i, sizebc = pt->sizebc;
     pt->flags &= ~PROTO_ILOOP;
     if (bc_op(bc[0]) == BC_IFUNCF)
       setbc_op(&bc[0], BC_FUNCF);
@@ -222,27 +223,28 @@ static void trace_unpatch(jit_State *J, GCtrace *T)
     return;  /* No need to unpatch branches in parent traces (yet). */
   switch (bc_op(*pc)) {
   case BC_JFORL:
-    lua_assert(traceref(J, bc_d(*pc)) == T);
+    lj_assertJ(traceref(J, bc_d(*pc)) == T, "JFORL references other trace");
     *pc = T->startins;
     pc += bc_j(T->startins);
-    lua_assert(bc_op(*pc) == BC_JFORI);
+    lj_assertJ(bc_op(*pc) == BC_JFORI, "FORL does not point to JFORI");
     setbc_op(pc, BC_FORI);
     break;
   case BC_JITERL:
   case BC_JLOOP:
-    lua_assert(op == BC_ITERL || op == BC_LOOP || bc_isret(op));
+    lj_assertJ(op == BC_ITERL || op == BC_LOOP || bc_isret(op),
+	       "bad original bytecode %d", op);
     *pc = T->startins;
     break;
   case BC_JMP:
-    lua_assert(op == BC_ITERL);
+    lj_assertJ(op == BC_ITERL, "bad original bytecode %d", op);
     pc += bc_j(*pc)+2;
     if (bc_op(*pc) == BC_JITERL) {
-      lua_assert(traceref(J, bc_d(*pc)) == T);
+      lj_assertJ(traceref(J, bc_d(*pc)) == T, "JITERL references other trace");
       *pc = T->startins;
     }
     break;
   case BC_JFUNCF:
-    lua_assert(op == BC_FUNCF);
+    lj_assertJ(op == BC_FUNCF, "bad original bytecode %d", op);
     *pc = T->startins;
     break;
   default:  /* Already unpatched. */
@@ -254,7 +256,8 @@ static void trace_unpatch(jit_State *J, GCtrace *T)
 static void trace_flushroot(jit_State *J, GCtrace *T)
 {
   GCproto *pt = &gcref(T->startpt)->pt;
-  lua_assert(T->root == 0 && pt != NULL);
+  lj_assertJ(T->root == 0, "not a root trace");
+  lj_assertJ(pt != NULL, "trace has no prototype");
   /* First unpatch any modified bytecode. */
   trace_unpatch(J, T);
   /* Unlink root trace from chain anchored in prototype. */
@@ -370,7 +373,8 @@ void lj_trace_freestate(global_State *g)
   {  /* This assumes all traces have already been freed. */
     ptrdiff_t i;
     for (i = 1; i < (ptrdiff_t)J->sizetrace; i++)
-      lua_assert(i == (ptrdiff_t)J->cur.traceno || traceref(J, i) == NULL);
+      lj_assertG(i == (ptrdiff_t)J->cur.traceno || traceref(J, i) == NULL,
+		 "trace still allocated");
   }
 #endif
   lj_mcode_free(J);
@@ -425,8 +429,9 @@ static void trace_start(jit_State *J)
   if ((J->pt->flags & PROTO_NOJIT)) {  /* JIT disabled for this proto? */
     if (J->parent == 0 && J->exitno == 0) {
       /* Lazy bytecode patching to disable hotcount events. */
-      lua_assert(bc_op(*J->pc) == BC_FORL || bc_op(*J->pc) == BC_ITERL ||
-		 bc_op(*J->pc) == BC_LOOP || bc_op(*J->pc) == BC_FUNCF);
+      lj_assertJ(bc_op(*J->pc) == BC_FORL || bc_op(*J->pc) == BC_ITERL ||
+		 bc_op(*J->pc) == BC_LOOP || bc_op(*J->pc) == BC_FUNCF,
+		 "bad hot bytecode %d", bc_op(*J->pc));
       setbc_op(J->pc, (int)bc_op(*J->pc)+(int)BC_ILOOP-(int)BC_LOOP);
       J->pt->flags |= PROTO_ILOOP;
     }
@@ -437,7 +442,8 @@ static void trace_start(jit_State *J)
   /* Get a new trace number. */
   traceno = trace_findfree(J);
   if (LJ_UNLIKELY(traceno == 0)) {  /* No free trace? */
-    lua_assert((J2G(J)->hookmask & HOOK_GC) == 0);
+    lj_assertJ((J2G(J)->hookmask & HOOK_GC) == 0,
+	       "recorder called from GC hook");
     lj_trace_flushall(J->L);
     J->state = LJ_TRACE_IDLE;  /* Silently ignored. */
     return;
@@ -513,7 +519,7 @@ static void trace_stop(jit_State *J)
     goto addroot;
   case BC_JMP:
     /* Patch exit branch in parent to side trace entry. */
-    lua_assert(J->parent != 0 && J->cur.root != 0);
+    lj_assertJ(J->parent != 0 && J->cur.root != 0, "not a side trace");
     lj_asm_patchexit(J, traceref(J, J->parent), J->exitno, J->cur.mcode);
     /* Avoid compiling a side trace twice (stack resizing uses parent exit). */
     traceref(J, J->parent)->snap[J->exitno].count = SNAPCOUNT_DONE;
@@ -532,7 +538,7 @@ static void trace_stop(jit_State *J)
     traceref(J, J->exitno)->link = traceno;
     break;
   default:
-    lua_assert(0);
+    lj_assertJ(0, "bad stop bytecode %d", op);
     break;
   }
 
@@ -553,8 +559,8 @@ static void trace_stop(jit_State *J)
 static int trace_downrec(jit_State *J)
 {
   /* Restart recording at the return instruction. */
-  lua_assert(J->pt != NULL);
-  lua_assert(bc_isret(bc_op(*J->pc)));
+  lj_assertJ(J->pt != NULL, "no active prototype");
+  lj_assertJ(bc_isret(bc_op(*J->pc)), "not at a return bytecode");
   if (bc_op(*J->pc) == BC_RETM) {
     J->ntraceabort++;
     return 0;  /* NYI: down-recursion with RETM. */
@@ -774,7 +780,7 @@ static void trace_hotside(jit_State *J, const BCIns *pc)
       isluafunc(curr_func(J->L)) &&
       snap->count != SNAPCOUNT_DONE &&
       ++snap->count >= J->param[JIT_P_hotexit]) {
-    lua_assert(J->state == LJ_TRACE_IDLE);
+    lj_assertJ(J->state == LJ_TRACE_IDLE, "hot side exit while recording");
     /* J->parent is non-zero for a side trace. */
     J->state = LJ_TRACE_START;
     lj_trace_ins(J, pc);
@@ -848,7 +854,7 @@ static TraceNo trace_exit_find(jit_State *J, MCode *pc)
     if (T && pc >= T->mcode && pc < (MCode *)((char *)T->mcode + T->szmcode))
       return traceno;
   }
-  lua_assert(0);
+  lj_assertJ(0, "bad exit pc");
   return 0;
 }
 #endif
@@ -878,13 +884,13 @@ int LJ_FASTCALL lj_trace_exit(jit_State *J, void *exptr)
   T = traceref(J, J->parent); UNUSED(T);
 #ifdef EXITSTATE_CHECKEXIT
   if (J->exitno == T->nsnap) {  /* Treat stack check like a parent exit. */
-    lua_assert(T->root != 0);
+    lj_assertJ(T->root != 0, "stack check in root trace");
     J->exitno = T->ir[REF_BASE].op2;
     J->parent = T->ir[REF_BASE].op1;
     T = traceref(J, J->parent);
   }
 #endif
-  lua_assert(T != NULL && J->exitno < T->nsnap);
+  lj_assertJ(T != NULL && J->exitno < T->nsnap, "bad trace or exit number");
   exd.J = J;
   exd.exptr = exptr;
   errcode = lj_vm_cpcall(L, NULL, &exd, trace_exit_cp);
@@ -975,14 +981,7 @@ uintptr_t LJ_FASTCALL lj_trace_unwind(jit_State *J, uintptr_t addr, ExitNo *ep)
     return (uintptr_t)exitstub_trace_addr(T, exitno);
 #endif
   }
-  /* Cannot correlate addr with trace/exit. This will be fatal. */
-  /*
-  ** FIXME: The following assert was replaced with
-  ** the conventional `lua_assert`.
-  **
-  ** lj_assertJ(0, "bad exit pc");
-  */
-  lua_assert(0);
+  lj_assertJ(0, "bad exit pc");
   return 0;
 }
 #endif
diff --git a/src/lj_utils_leb128.c b/src/lj_utils_leb128.c
index 0d50b839..d66961da 100644
--- a/src/lj_utils_leb128.c
+++ b/src/lj_utils_leb128.c
@@ -9,6 +9,7 @@
 #define LUA_CORE
 
 #include "lj_utils.h"
+#include "lj_obj.h"
 
 #define LINK_BIT          (0x80)
 #define MIN_TWOBYTE_VALUE (0x80)
@@ -112,7 +113,7 @@ size_t LJ_FASTCALL lj_utils_write_leb128(uint8_t *buffer, int64_t value)
   /* Omit LINK_BIT in case of overflow. */
   buffer[i++] = (uint8_t)(value & PAYLOAD_MASK);
 
-  lua_assert(i <= LEB128_U64_MAXSIZE);
+  lj_assertX(i <= LEB128_U64_MAXSIZE, "bad leb128 size");
 
   return i;
 }
@@ -126,7 +127,7 @@ size_t LJ_FASTCALL lj_utils_write_uleb128(uint8_t *buffer, uint64_t value)
 
   buffer[i++] = (uint8_t)value;
 
-  lua_assert(i <= LEB128_U64_MAXSIZE);
+  lj_assertX(i <= LEB128_U64_MAXSIZE, "bad uleb128 size");
 
   return i;
 }
diff --git a/src/lj_vmmath.c b/src/lj_vmmath.c
index 9c0d3fde..14e66687 100644
--- a/src/lj_vmmath.c
+++ b/src/lj_vmmath.c
@@ -60,7 +60,8 @@ double lj_vm_foldarith(double x, double y, int op)
 int32_t LJ_FASTCALL lj_vm_modi(int32_t a, int32_t b)
 {
   uint32_t y, ua, ub;
-  lua_assert(b != 0);  /* This must be checked before using this function. */
+  /* This must be checked before using this function. */
+  lj_assertX(b != 0, "modulo with zero divisor");
   ua = a < 0 ? (uint32_t)-a : (uint32_t)a;
   ub = b < 0 ? (uint32_t)-b : (uint32_t)b;
   y = ua % ub;
@@ -84,7 +85,7 @@ double lj_vm_log2(double a)
 static double lj_vm_powui(double x, uint32_t k)
 {
   double y;
-  lua_assert(k != 0);
+  lj_assertX(k != 0, "pow with zero exponent");
   for (; (k & 1) == 0; k >>= 1) x *= x;
   y = x;
   if ((k >>= 1) != 0) {
@@ -123,7 +124,7 @@ double lj_vm_foldfpm(double x, int fpm)
   case IRFPM_SQRT: return sqrt(x);
   case IRFPM_LOG: return log(x);
   case IRFPM_LOG2: return lj_vm_log2(x);
-  default: lua_assert(0);
+  default: lj_assertX(0, "bad fpm %d", fpm);
   }
   return 0;
 }
diff --git a/src/lj_wbuf.c b/src/lj_wbuf.c
index 897ef083..0001a02e 100644
--- a/src/lj_wbuf.c
+++ b/src/lj_wbuf.c
@@ -10,6 +10,7 @@
 
 #include <errno.h>
 
+#include "lj_obj.h"
 #include "lj_wbuf.h"
 #include "lj_utils.h"
 
@@ -52,7 +53,7 @@ void LJ_FASTCALL lj_wbuf_terminate(struct lj_wbuf *buf)
 
 static LJ_AINLINE void wbuf_reserve(struct lj_wbuf *buf, size_t n)
 {
-  lua_assert(n <= buf->size);
+  lj_assertX(n <= buf->size, "wbuf overflow");
   if (LJ_UNLIKELY(wbuf_left(buf) < n))
     lj_wbuf_flush(buf);
 }
diff --git a/src/ljamalg.c b/src/ljamalg.c
index 6ad5289c..0ffc7e81 100644
--- a/src/ljamalg.c
+++ b/src/ljamalg.c
@@ -28,6 +28,7 @@
 #include "lua.h"
 #include "lauxlib.h"
 
+#include "lj_assert.c"
 #include "lj_gc.c"
 #include "lj_err.c"
 #include "lj_char.c"
diff --git a/src/luaconf.h b/src/luaconf.h
index 8029040a..38146008 100644
--- a/src/luaconf.h
+++ b/src/luaconf.h
@@ -146,7 +146,7 @@
 #define LUALIB_API	LUA_API
 #define LUAMISC_API	LUA_API
 
-/* Support for internal assertions. */
+/* Compatibility support for assertions. */
 #if defined(LUA_USE_ASSERT) || defined(LUA_USE_APICHECK)
 #include <assert.h>
 #endif
-- 
2.41.0


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Tarantool-patches] [PATCH luajit 4/5] Fix pow() optimization inconsistencies.
  2023-08-15  9:36 [Tarantool-patches] [PATCH luajit 0/5] Fix pow inconsistencies and improve asserts Sergey Kaplun via Tarantool-patches
                   ` (2 preceding siblings ...)
  2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 3/5] Improve assertions Sergey Kaplun via Tarantool-patches
@ 2023-08-15  9:36 ` Sergey Kaplun via Tarantool-patches
  2023-08-18 12:45   ` Sergey Bronnikov via Tarantool-patches
  2023-08-20  9:26   ` Maxim Kokryashkin via Tarantool-patches
  2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 5/5] Revert to trivial pow() optimizations to prevent inaccuracies Sergey Kaplun via Tarantool-patches
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 34+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-08-15  9:36 UTC (permalink / raw)
  To: Maxim Kokryashkin, Sergey Bronnikov; +Cc: tarantool-patches

From: Mike Pall <mike>

(cherry-picked from commit 9512d5c1aced61e13e7be2d3208ec7ae3516b458)

This patch fixes several cases of misbehaviour between JIT-compiled code
and the interpreter for the power operator in the following ways:
* Drop the folding optimization base ^ 0.5 => sqrt(base), since
  pow(base, 0.5) isn't interchangeable with sqrt(base) and depends on
  the <math.h> implementation (see the sketch after this list).
* Drop the folding optimization 2 ^ int_pow => ldexp(1.0, int_pow), to
  avoid a dependency on the <math.h> implementation.
* Now `asm_pow()` always assembles a call to the `lj_vm_powi()` (or
  `lj_vm_pow()`) function, which is now shared by all CPU
  architectures. Using these internal functions instead of the
  toolchain-provided `pow()` guarantees consistency between the
  interpreter and JIT results. Also, the custom `vm_powi_sse()`
  implementation for x86/x64 is dropped.
* The `math_extern2` macro in the VM may now take a second argument,
  which is used as the target function to call. The first argument is
  still the name passed to the `.ffunc_nnsse`-like macro.
* Narrowing of the power operation used to omit the range guard when
  the base IR is non-constant, which leads to an invalid result if the
  exponent value on the trace is out of range. Now the guard is emitted
  unconditionally.
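
A minimal plain-Lua sketch (not part of the patch) of why the sqrt()
fold is unsafe: the libm pow() used by the interpreter and the sqrt()
the fold would substitute on a trace disagree on the corner cases
exercised by the new test. The exact output spelling may vary by
platform:

-- Sketch only: `^` below goes through pow() in the interpreter, while
-- math.sqrt() is what the dropped fold would have emitted on a trace.
local huge = math.huge

-- pow(-0, 0.5) is +0, but sqrt(-0) keeps the sign and returns -0.
print(tostring((-0.0) ^ 0.5), tostring(math.sqrt(-0.0)))  -- "0"  "-0"

-- pow(-inf, 0.5) is +inf, but sqrt(-inf) is NaN.
print((-huge) ^ 0.5, math.sqrt(-huge))                    -- inf  nan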

Be aware that the [220/502] lib/string/format/num.lua test [1] from the
LuaJIT-tests suite fails after this commit.

[1]: https://www.exploringbinary.com/incorrect-floating-point-to-decimal-conversions/

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#8825
---
 src/lj_asm.c                                  |  7 +-
 src/lj_asm_x86.h                              | 13 ---
 src/lj_dispatch.h                             |  2 +-
 src/lj_ircall.h                               |  2 +-
 src/lj_opt_fold.c                             | 27 ------
 src/lj_opt_narrow.c                           | 12 +--
 src/lj_vm.h                                   |  7 +-
 src/lj_vmmath.c                               | 82 +++++++++--------
 src/vm_arm.dasc                               | 13 +--
 src/vm_arm64.dasc                             | 11 ++-
 src/vm_mips.dasc                              | 11 ++-
 src/vm_mips64.dasc                            | 11 ++-
 src/vm_ppc.dasc                               | 11 ++-
 src/vm_x64.dasc                               | 44 ++-------
 src/vm_x86.dasc                               | 46 ++--------
 .../lj-684-pow-inconsistencies.test.lua       | 89 +++++++++++++++++++
 .../lj-9-pow-inconsistencies.test.lua         |  2 +
 17 files changed, 195 insertions(+), 195 deletions(-)
 create mode 100644 test/tarantool-tests/lj-684-pow-inconsistencies.test.lua

diff --git a/src/lj_asm.c b/src/lj_asm.c
index d71fa8c8..65261d50 100644
--- a/src/lj_asm.c
+++ b/src/lj_asm.c
@@ -1650,7 +1650,6 @@ static void asm_loop(ASMState *as)
 #if !LJ_SOFTFP32
 #if !LJ_TARGET_X86ORX64
 #define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
-#define asm_fppowi(as, ir)	asm_callid(as, ir, IRCALL_lj_vm_powi)
 #endif
 
 static void asm_pow(ASMState *as, IRIns *ir)
@@ -1661,10 +1660,8 @@ static void asm_pow(ASMState *as, IRIns *ir)
 					  IRCALL_lj_carith_powu64);
   else
 #endif
-  if (irt_isnum(IR(ir->op2)->t))
-    asm_callid(as, ir, IRCALL_pow);
-  else
-    asm_fppowi(as, ir);
+  asm_callid(as, ir, irt_isnum(IR(ir->op2)->t) ? IRCALL_lj_vm_pow :
+						 IRCALL_lj_vm_powi);
 }
 
 static void asm_div(ASMState *as, IRIns *ir)
diff --git a/src/lj_asm_x86.h b/src/lj_asm_x86.h
index 74f2d853..2b810c8d 100644
--- a/src/lj_asm_x86.h
+++ b/src/lj_asm_x86.h
@@ -2005,19 +2005,6 @@ static void asm_ldexp(ASMState *as, IRIns *ir)
   asm_x87load(as, ir->op2);
 }
 
-static void asm_fppowi(ASMState *as, IRIns *ir)
-{
-  /* The modified regs must match with the *.dasc implementation. */
-  RegSet drop = RSET_RANGE(RID_XMM0, RID_XMM1+1)|RID2RSET(RID_EAX);
-  if (ra_hasreg(ir->r))
-    rset_clear(drop, ir->r);  /* Dest reg handled below. */
-  ra_evictset(as, drop);
-  ra_destreg(as, ir, RID_XMM0);
-  emit_call(as, lj_vm_powi_sse);
-  ra_left(as, RID_XMM0, ir->op1);
-  ra_left(as, RID_EAX, ir->op2);
-}
-
 static int asm_swapops(ASMState *as, IRIns *ir)
 {
   IRIns *irl = IR(ir->op1);
diff --git a/src/lj_dispatch.h b/src/lj_dispatch.h
index b8bc2594..af870a75 100644
--- a/src/lj_dispatch.h
+++ b/src/lj_dispatch.h
@@ -44,7 +44,7 @@ extern double __divdf3(double a, double b);
 #define GOTDEF(_) \
   _(floor) _(ceil) _(trunc) _(log) _(log10) _(exp) _(sin) _(cos) _(tan) \
   _(asin) _(acos) _(atan) _(sinh) _(cosh) _(tanh) _(frexp) _(modf) _(atan2) \
-  _(pow) _(fmod) _(ldexp) _(lj_vm_modi) \
+  _(lj_vm_pow) _(fmod) _(ldexp) _(lj_vm_modi) \
   _(lj_dispatch_call) _(lj_dispatch_ins) _(lj_dispatch_stitch) \
   _(lj_dispatch_profile) _(lj_err_throw) \
   _(lj_ffh_coroutine_wrap_err) _(lj_func_closeuv) _(lj_func_newL_gc) \
diff --git a/src/lj_ircall.h b/src/lj_ircall.h
index af064a6f..ac0888a0 100644
--- a/src/lj_ircall.h
+++ b/src/lj_ircall.h
@@ -195,7 +195,7 @@ typedef struct CCallInfo {
   _(ANY,	log,			1,   N, NUM, XA_FP) \
   _(ANY,	lj_vm_log2,		1,   N, NUM, XA_FP) \
   _(ANY,	lj_vm_powi,		2,   N, NUM, XA_FP) \
-  _(ANY,	pow,			2,   N, NUM, XA2_FP) \
+  _(ANY,	lj_vm_pow,		2,   N, NUM, XA2_FP) \
   _(ANY,	atan2,			2,   N, NUM, XA2_FP) \
   _(ANY,	ldexp,			2,   N, NUM, XA_FP) \
   _(SOFTFP,	lj_vm_tobit,		1,   N, INT, XA_FP32) \
diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
index 0007107b..7d7cc9d1 100644
--- a/src/lj_opt_fold.c
+++ b/src/lj_opt_fold.c
@@ -1114,33 +1114,6 @@ LJFOLDF(simplify_numpow_xkint)
   return ref;
 }
 
-LJFOLD(POW any KNUM)
-LJFOLDF(simplify_numpow_xknum)
-{
-  if (knumright == 0.5)  /* x ^ 0.5 ==> sqrt(x) */
-    return emitir(IRTN(IR_FPMATH), fins->op1, IRFPM_SQRT);
-  return NEXTFOLD;
-}
-
-LJFOLD(POW KNUM any)
-LJFOLDF(simplify_numpow_kx)
-{
-  lua_Number n = knumleft;
-  if (n == 2.0 && irt_isint(fright->t)) {  /* 2.0 ^ i ==> ldexp(1.0, i) */
-#if LJ_TARGET_X86ORX64
-    /* Different IR_LDEXP calling convention on x86/x64 requires conversion. */
-    fins->o = IR_CONV;
-    fins->op1 = fins->op2;
-    fins->op2 = IRCONV_NUM_INT;
-    fins->op2 = (IRRef1)lj_opt_fold(J);
-#endif
-    fins->op1 = (IRRef1)lj_ir_knum_one(J);
-    fins->o = IR_LDEXP;
-    return RETRYFOLD;
-  }
-  return NEXTFOLD;
-}
-
 /* -- Simplify conversions ------------------------------------------------ */
 
 LJFOLD(CONV CONV IRCONV_NUM_INT)  /* _NUM */
diff --git a/src/lj_opt_narrow.c b/src/lj_opt_narrow.c
index 2cfb775b..d6601f4c 100644
--- a/src/lj_opt_narrow.c
+++ b/src/lj_opt_narrow.c
@@ -590,20 +590,14 @@ TRef lj_opt_narrow_pow(jit_State *J, TRef rb, TRef rc, TValue *vb, TValue *vc)
   rb = conv_str_tonum(J, rb, vb);
   rb = lj_ir_tonum(J, rb);  /* Left arg is always treated as an FP number. */
   rc = conv_str_tonum(J, rc, vc);
-  /* Narrowing must be unconditional to preserve (-x)^i semantics. */
   if (tvisint(vc) || numisint(numV(vc))) {
-    int checkrange = 0;
-    /* pow() is faster for bigger exponents. But do this only for (+k)^i. */
-    if (tref_isk(rb) && (int32_t)ir_knum(IR(tref_ref(rb)))->u32.hi >= 0) {
-      int32_t k = numberVint(vc);
-      if (!(k >= -65536 && k <= 65536)) goto force_pow_num;
-      checkrange = 1;
-    }
+    int32_t k = numberVint(vc);
+    if (!(k >= -65536 && k <= 65536)) goto force_pow_num;
     if (!tref_isinteger(rc)) {
       /* Guarded conversion to integer! */
       rc = emitir(IRTGI(IR_CONV), rc, IRCONV_INT_NUM|IRCONV_CHECK);
     }
-    if (checkrange && !tref_isk(rc)) {  /* Range guard: -65536 <= i <= 65536 */
+    if (!tref_isk(rc)) {  /* Range guard: -65536 <= i <= 65536 */
       TRef tmp = emitir(IRTI(IR_ADD), rc, lj_ir_kint(J, 65536));
       emitir(IRTGI(IR_ULE), tmp, lj_ir_kint(J, 2*65536));
     }
diff --git a/src/lj_vm.h b/src/lj_vm.h
index abaa7c52..f6f28a08 100644
--- a/src/lj_vm.h
+++ b/src/lj_vm.h
@@ -82,10 +82,6 @@ LJ_ASMF int32_t LJ_FASTCALL lj_vm_modi(int32_t, int32_t);
 LJ_ASMF void lj_vm_floor_sse(void);
 LJ_ASMF void lj_vm_ceil_sse(void);
 LJ_ASMF void lj_vm_trunc_sse(void);
-LJ_ASMF void lj_vm_powi_sse(void);
-#define lj_vm_powi	NULL
-#else
-LJ_ASMF double lj_vm_powi(double, int32_t);
 #endif
 #if LJ_TARGET_PPC || LJ_TARGET_ARM64
 #define lj_vm_trunc	trunc
@@ -100,6 +96,9 @@ LJ_ASMF int lj_vm_errno(void);
 #endif
 #endif
 
+LJ_ASMF double lj_vm_powi(double, int32_t);
+LJ_ASMF double lj_vm_pow(double, double);
+
 /* Continuations for metamethods. */
 LJ_ASMF void lj_cont_cat(void);  /* Continue with concatenation. */
 LJ_ASMF void lj_cont_ra(void);  /* Store result in RA from instruction. */
diff --git a/src/lj_vmmath.c b/src/lj_vmmath.c
index 14e66687..539f955b 100644
--- a/src/lj_vmmath.c
+++ b/src/lj_vmmath.c
@@ -30,11 +30,51 @@ LJ_FUNCA double lj_wrap_sinh(double x) { return sinh(x); }
 LJ_FUNCA double lj_wrap_cosh(double x) { return cosh(x); }
 LJ_FUNCA double lj_wrap_tanh(double x) { return tanh(x); }
 LJ_FUNCA double lj_wrap_atan2(double x, double y) { return atan2(x, y); }
-LJ_FUNCA double lj_wrap_pow(double x, double y) { return pow(x, y); }
 LJ_FUNCA double lj_wrap_fmod(double x, double y) { return fmod(x, y); }
 #endif
 
-/* -- Helper functions for generated machine code ------------------------- */
+/* -- Helper functions ---------------------------------------------------- */
+
+/* Unsigned x^k. */
+static double lj_vm_powui(double x, uint32_t k)
+{
+  double y;
+  lj_assertX(k != 0, "pow with zero exponent");
+  for (; (k & 1) == 0; k >>= 1) x *= x;
+  y = x;
+  if ((k >>= 1) != 0) {
+    for (;;) {
+      x *= x;
+      if (k == 1) break;
+      if (k & 1) y *= x;
+      k >>= 1;
+    }
+    y *= x;
+  }
+  return y;
+}
+
+/* Signed x^k. */
+double lj_vm_powi(double x, int32_t k)
+{
+  if (k > 1)
+    return lj_vm_powui(x, (uint32_t)k);
+  else if (k == 1)
+    return x;
+  else if (k == 0)
+    return 1.0;
+  else
+    return 1.0 / lj_vm_powui(x, (uint32_t)-k);
+}
+
+double lj_vm_pow(double x, double y)
+{
+  int32_t k = lj_num2int(y);
+  if ((k >= -65536 && k <= 65536) && y == (double)k)
+    return lj_vm_powi(x, k);
+  else
+    return pow(x, y);
+}
 
 double lj_vm_foldarith(double x, double y, int op)
 {
@@ -44,7 +84,7 @@ double lj_vm_foldarith(double x, double y, int op)
   case IR_MUL - IR_ADD: return x*y; break;
   case IR_DIV - IR_ADD: return x/y; break;
   case IR_MOD - IR_ADD: return x-lj_vm_floor(x/y)*y; break;
-  case IR_POW - IR_ADD: return pow(x, y); break;
+  case IR_POW - IR_ADD: return lj_vm_pow(x, y); break;
   case IR_NEG - IR_ADD: return -x; break;
   case IR_ABS - IR_ADD: return fabs(x); break;
 #if LJ_HASJIT
@@ -56,6 +96,8 @@ double lj_vm_foldarith(double x, double y, int op)
   }
 }
 
+/* -- Helper functions for generated machine code ------------------------- */
+
 #if (LJ_HASJIT && !(LJ_TARGET_ARM || LJ_TARGET_ARM64 || LJ_TARGET_PPC)) || LJ_TARGET_MIPS
 int32_t LJ_FASTCALL lj_vm_modi(int32_t a, int32_t b)
 {
@@ -80,40 +122,6 @@ double lj_vm_log2(double a)
 }
 #endif
 
-#if !LJ_TARGET_X86ORX64
-/* Unsigned x^k. */
-static double lj_vm_powui(double x, uint32_t k)
-{
-  double y;
-  lj_assertX(k != 0, "pow with zero exponent");
-  for (; (k & 1) == 0; k >>= 1) x *= x;
-  y = x;
-  if ((k >>= 1) != 0) {
-    for (;;) {
-      x *= x;
-      if (k == 1) break;
-      if (k & 1) y *= x;
-      k >>= 1;
-    }
-    y *= x;
-  }
-  return y;
-}
-
-/* Signed x^k. */
-double lj_vm_powi(double x, int32_t k)
-{
-  if (k > 1)
-    return lj_vm_powui(x, (uint32_t)k);
-  else if (k == 1)
-    return x;
-  else if (k == 0)
-    return 1.0;
-  else
-    return 1.0 / lj_vm_powui(x, (uint32_t)-k);
-}
-#endif
-
 /* Computes fpm(x) for extended math functions. */
 double lj_vm_foldfpm(double x, int fpm)
 {
diff --git a/src/vm_arm.dasc b/src/vm_arm.dasc
index 767d31f9..792f0363 100644
--- a/src/vm_arm.dasc
+++ b/src/vm_arm.dasc
@@ -1485,11 +1485,11 @@ static void build_subroutines(BuildCtx *ctx)
   |.endif
   |.endmacro
   |
-  |.macro math_extern2, func
+  |.macro math_extern2, name, func
   |.if HFABI
-  |  .ffunc_dd math_ .. func
+  |  .ffunc_dd math_ .. name
   |.else
-  |  .ffunc_nn math_ .. func
+  |  .ffunc_nn math_ .. name
   |.endif
   |  .IOS mov RA, BASE
   |  bl extern func
@@ -1500,6 +1500,9 @@ static void build_subroutines(BuildCtx *ctx)
   |  b ->fff_restv
   |.endif
   |.endmacro
+  |.macro math_extern2, func
+  |  math_extern2 func, func
+  |.endmacro
   |
   |.if FPU
   |  .ffunc_d math_sqrt
@@ -1545,7 +1548,7 @@ static void build_subroutines(BuildCtx *ctx)
   |  math_extern sinh
   |  math_extern cosh
   |  math_extern tanh
-  |  math_extern2 pow
+  |  math_extern2 pow, lj_vm_pow
   |  math_extern2 atan2
   |  math_extern2 fmod
   |
@@ -3153,7 +3156,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
     break;
   case BC_POW:
     |  // NYI: (partial) integer arithmetic.
-    |  ins_arithfp extern, extern pow
+    |  ins_arithfp extern, extern lj_vm_pow
     break;
 
   case BC_CAT:
diff --git a/src/vm_arm64.dasc b/src/vm_arm64.dasc
index de33bde4..fb267a76 100644
--- a/src/vm_arm64.dasc
+++ b/src/vm_arm64.dasc
@@ -1391,11 +1391,14 @@ static void build_subroutines(BuildCtx *ctx)
   |  b ->fff_resn
   |.endmacro
   |
-  |.macro math_extern2, func
-  |  .ffunc_nn math_ .. func
+  |.macro math_extern2, name, func
+  |  .ffunc_nn math_ .. name
   |  bl extern func
   |  b ->fff_resn
   |.endmacro
+  |.macro math_extern2, func
+  |  math_extern2 func, func
+  |.endmacro
   |
   |.ffunc_n math_sqrt
   |  fsqrt d0, d0
@@ -1424,7 +1427,7 @@ static void build_subroutines(BuildCtx *ctx)
   |  math_extern sinh
   |  math_extern cosh
   |  math_extern tanh
-  |  math_extern2 pow
+  |  math_extern2 pow, lj_vm_pow
   |  math_extern2 atan2
   |  math_extern2 fmod
   |
@@ -2621,7 +2624,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
     |  ins_arithload FARG1, FARG2
     |  ins_arithfallback ins_arithcheck_num
     |.if "fpins" == "fpow"
-    |  bl extern pow
+    |  bl extern lj_vm_pow
     |.else
     |  fpins FARG1, FARG1, FARG2
     |.endif
diff --git a/src/vm_mips.dasc b/src/vm_mips.dasc
index 32caabf7..5664f503 100644
--- a/src/vm_mips.dasc
+++ b/src/vm_mips.dasc
@@ -1631,14 +1631,17 @@ static void build_subroutines(BuildCtx *ctx)
   |.  nop
   |.endmacro
   |
-  |.macro math_extern2, func
-  |  .ffunc_nn math_ .. func
+  |.macro math_extern2, name, func
+  |  .ffunc_nn math_ .. name
   |.  load_got func
   |  call_extern
   |.  nop
   |  b ->fff_resn
   |.  nop
   |.endmacro
+  |.macro math_extern2, func
+  |  math_extern2 func, func
+  |.endmacro
   |
   |// TODO: Return integer type if result is integer (own sf implementation).
   |.macro math_round, func
@@ -1692,7 +1695,7 @@ static void build_subroutines(BuildCtx *ctx)
   |  math_extern sinh
   |  math_extern cosh
   |  math_extern tanh
-  |  math_extern2 pow
+  |  math_extern2 pow, lj_vm_pow
   |  math_extern2 atan2
   |  math_extern2 fmod
   |
@@ -3585,7 +3588,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
     |  sltiu AT, SFARG1HI, LJ_TISNUM
     |  sltiu TMP0, SFARG2HI, LJ_TISNUM
     |  and AT, AT, TMP0
-    |  load_got pow
+    |  load_got lj_vm_pow
     |  beqz AT, ->vmeta_arith
     |.  addu RA, BASE, RA
     |.if FPU
diff --git a/src/vm_mips64.dasc b/src/vm_mips64.dasc
index 44fba36c..249605d4 100644
--- a/src/vm_mips64.dasc
+++ b/src/vm_mips64.dasc
@@ -1669,14 +1669,17 @@ static void build_subroutines(BuildCtx *ctx)
   |.  nop
   |.endmacro
   |
-  |.macro math_extern2, func
-  |  .ffunc_nn math_ .. func
+  |.macro math_extern2, name, func
+  |  .ffunc_nn math_ .. name
   |.  load_got func
   |  call_extern
   |.  nop
   |  b ->fff_resn
   |.  nop
   |.endmacro
+  |.macro math_extern2, func
+  |  math_extern2 func, func
+  |.endmacro
   |
   |// TODO: Return integer type if result is integer (own sf implementation).
   |.macro math_round, func
@@ -1730,7 +1733,7 @@ static void build_subroutines(BuildCtx *ctx)
   |  math_extern sinh
   |  math_extern cosh
   |  math_extern tanh
-  |  math_extern2 pow
+  |  math_extern2 pow, lj_vm_pow
   |  math_extern2 atan2
   |  math_extern2 fmod
   |
@@ -3823,7 +3826,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
     |  sltiu TMP0, TMP0, LJ_TISNUM
     |   sltiu TMP1, TMP1, LJ_TISNUM
     |  and AT, TMP0, TMP1
-    |  load_got pow
+    |  load_got lj_vm_pow
     |  beqz AT, ->vmeta_arith
     |.  daddu RA, BASE, RA
     |.if FPU
diff --git a/src/vm_ppc.dasc b/src/vm_ppc.dasc
index 980ad897..94af63e6 100644
--- a/src/vm_ppc.dasc
+++ b/src/vm_ppc.dasc
@@ -2032,11 +2032,14 @@ static void build_subroutines(BuildCtx *ctx)
   |  b ->fff_resn
   |.endmacro
   |
-  |.macro math_extern2, func
-  |  .ffunc_nn math_ .. func
+  |.macro math_extern2, name, func
+  |  .ffunc_nn math_ .. name
   |  blex func
   |  b ->fff_resn
   |.endmacro
+  |.macro math_extern2, func
+  |  math_extern2 func, func
+  |.endmacro
   |
   |.macro math_round, func
   |  .ffunc_1 math_ .. func
@@ -2161,7 +2164,7 @@ static void build_subroutines(BuildCtx *ctx)
   |  math_extern sinh
   |  math_extern cosh
   |  math_extern tanh
-  |  math_extern2 pow
+  |  math_extern2 pow, lj_vm_pow
   |  math_extern2 atan2
   |  math_extern2 fmod
   |
@@ -4154,7 +4157,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
     |  checknum cr1, CARG3
     |  crand 4*cr0+lt, 4*cr0+lt, 4*cr1+lt
     |  bge ->vmeta_arith_vv
-    |  blex pow
+    |  blex lj_vm_pow
     |  ins_next1
     |.if FPU
     |  stfdx FARG1, BASE, RA
diff --git a/src/vm_x64.dasc b/src/vm_x64.dasc
index 7b04b928..acbe8dc2 100644
--- a/src/vm_x64.dasc
+++ b/src/vm_x64.dasc
@@ -1825,13 +1825,16 @@ static void build_subroutines(BuildCtx *ctx)
   |  jmp ->fff_resxmm0
   |.endmacro
   |
-  |.macro math_extern2, func
-  |  .ffunc_nn math_ .. func
+  |.macro math_extern2, name, func
+  |  .ffunc_nn math_ .. name
   |  mov RB, BASE
   |  call extern func
   |  mov BASE, RB
   |  jmp ->fff_resxmm0
   |.endmacro
+  |.macro math_extern2, func
+  |  math_extern2 func, func
+  |.endmacro
   |
   |  math_extern log10
   |  math_extern exp
@@ -1844,7 +1847,7 @@ static void build_subroutines(BuildCtx *ctx)
   |  math_extern sinh
   |  math_extern cosh
   |  math_extern tanh
-  |  math_extern2 pow
+  |  math_extern2 pow, lj_vm_pow
   |  math_extern2 atan2
   |  math_extern2 fmod
   |
@@ -2649,41 +2652,6 @@ static void build_subroutines(BuildCtx *ctx)
   |  subsd xmm0, xmm1
   |  ret
   |
-  |// Args in xmm0/eax. Ret in xmm0. xmm0-xmm1 and eax modified.
-  |->vm_powi_sse:
-  |  cmp eax, 1; jle >6			// i<=1?
-  |  // Now 1 < (unsigned)i <= 0x80000000.
-  |1:  // Handle leading zeros.
-  |  test eax, 1; jnz >2
-  |  mulsd xmm0, xmm0
-  |  shr eax, 1
-  |  jmp <1
-  |2:
-  |  shr eax, 1; jz >5
-  |  movaps xmm1, xmm0
-  |3:  // Handle trailing bits.
-  |  mulsd xmm0, xmm0
-  |  shr eax, 1; jz >4
-  |  jnc <3
-  |  mulsd xmm1, xmm0
-  |  jmp <3
-  |4:
-  |  mulsd xmm0, xmm1
-  |5:
-  |  ret
-  |6:
-  |  je <5				// x^1 ==> x
-  |  jb >7				// x^0 ==> 1
-  |  neg eax
-  |  call <1
-  |  sseconst_1 xmm1, RD
-  |  divsd xmm1, xmm0
-  |  movaps xmm0, xmm1
-  |  ret
-  |7:
-  |  sseconst_1 xmm0, RD
-  |  ret
-  |
   |//-----------------------------------------------------------------------
   |//-- Miscellaneous functions --------------------------------------------
   |//-----------------------------------------------------------------------
diff --git a/src/vm_x86.dasc b/src/vm_x86.dasc
index bd1e940e..bf30cce6 100644
--- a/src/vm_x86.dasc
+++ b/src/vm_x86.dasc
@@ -2240,8 +2240,8 @@ static void build_subroutines(BuildCtx *ctx)
   |  jmp ->fff_resfp
   |.endmacro
   |
-  |.macro math_extern2, func
-  |  .ffunc_nnsse math_ .. func
+  |.macro math_extern2, name, func
+  |  .ffunc_nnsse math_ .. name
   |.if not X64
   |  movsd FPARG1, xmm0
   |  movsd FPARG3, xmm1
@@ -2251,6 +2251,9 @@ static void build_subroutines(BuildCtx *ctx)
   |  mov BASE, RB
   |  jmp ->fff_resfp
   |.endmacro
+  |.macro math_extern2, func
+  |  math_extern2 func, func
+  |.endmacro
   |
   |  math_extern log10
   |  math_extern exp
@@ -2263,7 +2266,7 @@ static void build_subroutines(BuildCtx *ctx)
   |  math_extern sinh
   |  math_extern cosh
   |  math_extern tanh
-  |  math_extern2 pow
+  |  math_extern2 pow, lj_vm_pow
   |  math_extern2 atan2
   |  math_extern2 fmod
   |
@@ -3140,41 +3143,6 @@ static void build_subroutines(BuildCtx *ctx)
   |  subsd xmm0, xmm1
   |  ret
   |
-  |// Args in xmm0/eax. Ret in xmm0. xmm0-xmm1 and eax modified.
-  |->vm_powi_sse:
-  |  cmp eax, 1; jle >6			// i<=1?
-  |  // Now 1 < (unsigned)i <= 0x80000000.
-  |1:  // Handle leading zeros.
-  |  test eax, 1; jnz >2
-  |  mulsd xmm0, xmm0
-  |  shr eax, 1
-  |  jmp <1
-  |2:
-  |  shr eax, 1; jz >5
-  |  movaps xmm1, xmm0
-  |3:  // Handle trailing bits.
-  |  mulsd xmm0, xmm0
-  |  shr eax, 1; jz >4
-  |  jnc <3
-  |  mulsd xmm1, xmm0
-  |  jmp <3
-  |4:
-  |  mulsd xmm0, xmm1
-  |5:
-  |  ret
-  |6:
-  |  je <5				// x^1 ==> x
-  |  jb >7				// x^0 ==> 1
-  |  neg eax
-  |  call <1
-  |  sseconst_1 xmm1, RDa
-  |  divsd xmm1, xmm0
-  |  movaps xmm0, xmm1
-  |  ret
-  |7:
-  |  sseconst_1 xmm0, RDa
-  |  ret
-  |
   |//-----------------------------------------------------------------------
   |//-- Miscellaneous functions --------------------------------------------
   |//-----------------------------------------------------------------------
@@ -3976,7 +3944,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
     |  movsd FPARG1, xmm0
     |  movsd FPARG3, xmm1
     |.endif
-    |  call extern pow
+    |  call extern lj_vm_pow
     |  movzx RA, PC_RA
     |  mov BASE, RB
     |.if X64
diff --git a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
new file mode 100644
index 00000000..5129fc45
--- /dev/null
+++ b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
@@ -0,0 +1,89 @@
+local tap = require('tap')
+-- Test to demonstrate the incorrect JIT behaviour for different
+-- power operation optimizations.
+-- See also:
+-- https://github.com/LuaJIT/LuaJIT/issues/684.
+local test = tap.test('lj-684-pow-inconsistencies'):skipcond({
+  ['Test requires JIT enabled'] = not jit.status(),
+})
+
+local tostring = tostring
+
+test:plan(4)
+
+jit.opt.start('hotloop=1')
+
+-- XXX: Prevent hotcount side effects.
+jit.off()
+jit.flush()
+
+local res = {}
+-- -0 ^ 0.5 = 0. Test sign with `tostring()`.
+-- XXX: use local variable to prevent folding via parser.
+-- XXX: use stack slot out of trace to prevent constant folding.
+local minus_zero = -0
+jit.on()
+for i = 1, 4 do
+  res[i] = tostring(minus_zero ^ 0.5)
+end
+
+-- XXX: Prevent hotcount side effects.
+jit.off()
+jit.flush()
+
+test:samevalues(res, ('consistent results for folding (-0) ^ 0.5'))
+
+jit.on()
+-- -inf ^ 0.5 = inf.
+res = {}
+local minus_inf = -math.huge
+jit.on()
+for i = 1, 4 do
+  res[i] = minus_inf ^ 0.5
+end
+
+-- XXX: Prevent hotcount side effects.
+jit.off()
+jit.flush()
+
+test:samevalues(res, ('consistent results for folding (-inf) ^ 0.5'))
+
+-- 2921 ^ 0.5 = 0x1.b05ec632536fap+5.
+res = {}
+-- XXX: use local variable to prevent folding via parser.
+-- XXX: use stack slot out of trace to prevent constant folding.
+local corner_case_05 = 2921
+jit.on()
+for i = 1, 4 do
+  res[i] = corner_case_05 ^ 0.5
+end
+
+-- XXX: Prevent hotcount side effects.
+jit.off()
+jit.flush()
+
+test:samevalues(res, ('consistent results for folding 2921 ^ 0.5'))
+
+-- Narrowing for non-constant base of power operation.
+local function pow(base, power)
+  return base ^ power
+end
+
+jit.on()
+
+-- Compile function first.
+pow(1, 2)
+pow(1, 2)
+
+-- Need some value near 1, to avoid infinite result.
+local base = 1.0000000001
+local power = 65536 * 3
+local resulting_value = pow(base, power)
+
+-- XXX: Prevent hotcount side effects.
+jit.off()
+jit.flush()
+
+test:is(resulting_value, base ^ power, 'guard for narrowing of power operation')
+
+test:done(true)
diff --git a/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua b/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
index 21b3a0d9..1f7f65c5 100644
--- a/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
+++ b/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
@@ -16,6 +16,8 @@ local INTERESTING_VALUES = {
   -- x ^  inf = 0 (inf), if |x| < 1 (|x| > 1).
   -- x ^ -inf = inf (0), if |x| < 1 (|x| > 1).
   0.999999, 1.000001, -0.999999, -1.000001,
+  -- Test power of even numbers optimizations.
+  2, -2, 0.5, -0.5,
 }
 test:plan(1 + (#INTERESTING_VALUES) ^ 2)
 
-- 
2.41.0


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Tarantool-patches] [PATCH luajit 5/5] Revert to trivial pow() optimizations to prevent inaccuracies.
  2023-08-15  9:36 [Tarantool-patches] [PATCH luajit 0/5] Fix pow inconsistencies and improve asserts Sergey Kaplun via Tarantool-patches
                   ` (3 preceding siblings ...)
  2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 4/5] Fix pow() optimization inconsistencies Sergey Kaplun via Tarantool-patches
@ 2023-08-15  9:36 ` Sergey Kaplun via Tarantool-patches
  2023-08-18 12:49   ` Sergey Bronnikov via Tarantool-patches
  2023-08-20  9:37   ` Maxim Kokryashkin via Tarantool-patches
  2023-08-24  7:47 ` [Tarantool-patches] [PATCH luajit 0/5] Fix pow inconsistencies and improve asserts Sergey Bronnikov via Tarantool-patches
  2023-08-31 15:18 ` Igor Munkin via Tarantool-patches
  6 siblings, 2 replies; 34+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-08-15  9:36 UTC (permalink / raw)
  To: Maxim Kokryashkin, Sergey Bronnikov; +Cc: tarantool-patches

From: Mike Pall <mike>

(cherry-picked from commit 96d6d5032098ea9f0002165394a8774dcaa0c0ce)

This patch fixes several cases of misbehaviour between JIT-compiled code
and the interpreter for the power operator in the following ways:
* Drop the folding optimization base ^ n => base * base * ..., since
  pow(base, n) isn't interchangeable with plain repeated multiplication
  and depends on the <math.h> implementation (see the sketch after this
  list).
* Since the internal power function is inaccurate for very big or very
  small powers, it is dropped, and `pow()` from the standard library is
  used instead. To keep consistency between the JIT and the VM, the
  narrowing optimization is dropped, and only trivial folding
  optimizations are kept. Also, the two-argument `math_extern2` macro
  variant is dropped, since it is no longer used.
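
A rough sketch (not part of the patch; `powui()` below is a
hypothetical plain-Lua mirror of the dropped unrolling scheme) of why
repeated multiplication isn't interchangeable with pow(): each partial
product is rounded separately, so the result may differ from pow() in
the last bits for some bases. How often they differ depends on the
libm in use:

-- Sketch only: compare the unrolled scheme against `^`, which goes
-- through the libm pow() after this patch.
local floor = math.floor

-- Precondition: k > 0 (mirrors the dropped unrolling for x ^ k).
local function powui(x, k)
  -- Square away the trailing zero bits of the exponent.
  while k % 2 == 0 do x = x * x; k = k / 2 end
  local y = x
  k = floor(k / 2)
  if k ~= 0 then
    while true do
      x = x * x
      if k == 1 then break end
      if k % 2 == 1 then y = y * x end
      k = floor(k / 2)
    end
    y = y * x
  end
  return y
end

local mismatches = 0
for i = 1, 1000 do
  local base = 1 + i * 1e-6
  if powui(base, 100) ~= base ^ 100 then
    mismatches = mismatches + 1
  end
end
print(string.format('unrolled base^100 differs from pow() for %d of 1000 bases',
                    mismatches))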

Also, this fixes the failure of the [220/502] lib/string/format/num.lua
test [1] from the LuaJIT-tests suite.

[1]: https://www.exploringbinary.com/incorrect-floating-point-to-decimal-conversions/

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#8825
---
 src/lj_asm.c                                  |  3 +-
 src/lj_dispatch.h                             |  2 +-
 src/lj_ffrecord.c                             |  4 +-
 src/lj_ircall.h                               |  3 +-
 src/lj_iropt.h                                |  1 -
 src/lj_opt_fold.c                             | 37 ++++------------
 src/lj_opt_narrow.c                           | 24 ----------
 src/lj_opt_split.c                            |  2 +-
 src/lj_record.c                               |  2 +-
 src/lj_vm.h                                   |  3 --
 src/lj_vmmath.c                               | 44 +------------------
 src/vm_arm.dasc                               | 13 +++---
 src/vm_arm64.dasc                             | 11 ++---
 src/vm_mips.dasc                              | 11 ++---
 src/vm_mips64.dasc                            | 11 ++---
 src/vm_ppc.dasc                               | 11 ++---
 src/vm_x64.dasc                               |  9 ++--
 src/vm_x86.dasc                               | 11 ++---
 .../lj-684-pow-inconsistencies.test.lua       | 21 ++++++++-
 19 files changed, 64 insertions(+), 159 deletions(-)

diff --git a/src/lj_asm.c b/src/lj_asm.c
index 65261d50..3a1909d5 100644
--- a/src/lj_asm.c
+++ b/src/lj_asm.c
@@ -1660,8 +1660,7 @@ static void asm_pow(ASMState *as, IRIns *ir)
 					  IRCALL_lj_carith_powu64);
   else
 #endif
-  asm_callid(as, ir, irt_isnum(IR(ir->op2)->t) ? IRCALL_lj_vm_pow :
-						 IRCALL_lj_vm_powi);
+  asm_callid(as, ir, IRCALL_pow);
 }
 
 static void asm_div(ASMState *as, IRIns *ir)
diff --git a/src/lj_dispatch.h b/src/lj_dispatch.h
index af870a75..b8bc2594 100644
--- a/src/lj_dispatch.h
+++ b/src/lj_dispatch.h
@@ -44,7 +44,7 @@ extern double __divdf3(double a, double b);
 #define GOTDEF(_) \
   _(floor) _(ceil) _(trunc) _(log) _(log10) _(exp) _(sin) _(cos) _(tan) \
   _(asin) _(acos) _(atan) _(sinh) _(cosh) _(tanh) _(frexp) _(modf) _(atan2) \
-  _(lj_vm_pow) _(fmod) _(ldexp) _(lj_vm_modi) \
+  _(pow) _(fmod) _(ldexp) _(lj_vm_modi) \
   _(lj_dispatch_call) _(lj_dispatch_ins) _(lj_dispatch_stitch) \
   _(lj_dispatch_profile) _(lj_err_throw) \
   _(lj_ffh_coroutine_wrap_err) _(lj_func_closeuv) _(lj_func_newL_gc) \
diff --git a/src/lj_ffrecord.c b/src/lj_ffrecord.c
index 0746ec64..99a6b918 100644
--- a/src/lj_ffrecord.c
+++ b/src/lj_ffrecord.c
@@ -590,8 +590,8 @@ static void LJ_FASTCALL recff_math_call(jit_State *J, RecordFFData *rd)
 
 static void LJ_FASTCALL recff_math_pow(jit_State *J, RecordFFData *rd)
 {
-  J->base[0] = lj_opt_narrow_pow(J, J->base[0], J->base[1],
-				 &rd->argv[0], &rd->argv[1]);
+  J->base[0] = lj_opt_narrow_arith(J, J->base[0], J->base[1],
+				   &rd->argv[0], &rd->argv[1], IR_POW);
   UNUSED(rd);
 }
 
diff --git a/src/lj_ircall.h b/src/lj_ircall.h
index ac0888a0..9c195918 100644
--- a/src/lj_ircall.h
+++ b/src/lj_ircall.h
@@ -194,8 +194,7 @@ typedef struct CCallInfo {
   _(FPMATH,	sqrt,			1,   N, NUM, XA_FP) \
   _(ANY,	log,			1,   N, NUM, XA_FP) \
   _(ANY,	lj_vm_log2,		1,   N, NUM, XA_FP) \
-  _(ANY,	lj_vm_powi,		2,   N, NUM, XA_FP) \
-  _(ANY,	lj_vm_pow,		2,   N, NUM, XA2_FP) \
+  _(ANY,	pow,			2,   N, NUM, XA2_FP) \
   _(ANY,	atan2,			2,   N, NUM, XA2_FP) \
   _(ANY,	ldexp,			2,   N, NUM, XA_FP) \
   _(SOFTFP,	lj_vm_tobit,		1,   N, INT, XA_FP32) \
diff --git a/src/lj_iropt.h b/src/lj_iropt.h
index a59ba3f4..7ee1ea86 100644
--- a/src/lj_iropt.h
+++ b/src/lj_iropt.h
@@ -144,7 +144,6 @@ LJ_FUNC TRef lj_opt_narrow_arith(jit_State *J, TRef rb, TRef rc,
 				 TValue *vb, TValue *vc, IROp op);
 LJ_FUNC TRef lj_opt_narrow_unm(jit_State *J, TRef rc, TValue *vc);
 LJ_FUNC TRef lj_opt_narrow_mod(jit_State *J, TRef rb, TRef rc, TValue *vb, TValue *vc);
-LJ_FUNC TRef lj_opt_narrow_pow(jit_State *J, TRef rb, TRef rc, TValue *vb, TValue *vc);
 LJ_FUNC IRType lj_opt_narrow_forl(jit_State *J, cTValue *forbase);
 
 /* Optimization passes. */
diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
index 7d7cc9d1..09e6c87b 100644
--- a/src/lj_opt_fold.c
+++ b/src/lj_opt_fold.c
@@ -236,14 +236,10 @@ LJFOLDF(kfold_fpcall2)
   return NEXTFOLD;
 }
 
-LJFOLD(POW KNUM KINT)
 LJFOLD(POW KNUM KNUM)
 LJFOLDF(kfold_numpow)
 {
-  lua_Number a = knumleft;
-  lua_Number b = fright->o == IR_KINT ? (lua_Number)fright->i : knumright;
-  lua_Number y = lj_vm_foldarith(a, b, IR_POW - IR_ADD);
-  return lj_ir_knum(J, y);
+  return lj_ir_knum(J, lj_vm_foldarith(knumleft, knumright, IR_POW - IR_ADD));
 }
 
 /* Must not use kfold_kref for numbers (could be NaN). */
@@ -1084,34 +1080,17 @@ LJFOLDF(simplify_nummuldiv_negneg)
   return RETRYFOLD;
 }
 
-LJFOLD(POW any KINT)
-LJFOLDF(simplify_numpow_xkint)
+LJFOLD(POW any KNUM)
+LJFOLDF(simplify_numpow_k)
 {
-  int32_t k = fright->i;
-  TRef ref = fins->op1;
-  if (k == 0)  /* x ^ 0 ==> 1 */
+  if (knumright == 0)  /* x ^ 0 ==> 1 */
     return lj_ir_knum_one(J);  /* Result must be a number, not an int. */
-  if (k == 1)  /* x ^ 1 ==> x */
+  else if (knumright == 1)  /* x ^ 1 ==> x */
     return LEFTFOLD;
-  if ((uint32_t)(k+65536) > 2*65536u)  /* Limit code explosion. */
+  else if (knumright == 2)  /* x ^ 2 ==> x * x */
+    return emitir(IRTN(IR_MUL), fins->op1, fins->op1);
+  else
     return NEXTFOLD;
-  if (k < 0) {  /* x ^ (-k) ==> (1/x) ^ k. */
-    ref = emitir(IRTN(IR_DIV), lj_ir_knum_one(J), ref);
-    k = -k;
-  }
-  /* Unroll x^k for 1 <= k <= 65536. */
-  for (; (k & 1) == 0; k >>= 1)  /* Handle leading zeros. */
-    ref = emitir(IRTN(IR_MUL), ref, ref);
-  if ((k >>= 1) != 0) {  /* Handle trailing bits. */
-    TRef tmp = emitir(IRTN(IR_MUL), ref, ref);
-    for (; k != 1; k >>= 1) {
-      if (k & 1)
-	ref = emitir(IRTN(IR_MUL), ref, tmp);
-      tmp = emitir(IRTN(IR_MUL), tmp, tmp);
-    }
-    ref = emitir(IRTN(IR_MUL), ref, tmp);
-  }
-  return ref;
 }
 
 /* -- Simplify conversions ------------------------------------------------ */
diff --git a/src/lj_opt_narrow.c b/src/lj_opt_narrow.c
index d6601f4c..db0da10f 100644
--- a/src/lj_opt_narrow.c
+++ b/src/lj_opt_narrow.c
@@ -584,30 +584,6 @@ TRef lj_opt_narrow_mod(jit_State *J, TRef rb, TRef rc, TValue *vb, TValue *vc)
   return emitir(IRTN(IR_SUB), rb, tmp);
 }
 
-/* Narrowing of power operator or math.pow. */
-TRef lj_opt_narrow_pow(jit_State *J, TRef rb, TRef rc, TValue *vb, TValue *vc)
-{
-  rb = conv_str_tonum(J, rb, vb);
-  rb = lj_ir_tonum(J, rb);  /* Left arg is always treated as an FP number. */
-  rc = conv_str_tonum(J, rc, vc);
-  if (tvisint(vc) || numisint(numV(vc))) {
-    int32_t k = numberVint(vc);
-    if (!(k >= -65536 && k <= 65536)) goto force_pow_num;
-    if (!tref_isinteger(rc)) {
-      /* Guarded conversion to integer! */
-      rc = emitir(IRTGI(IR_CONV), rc, IRCONV_INT_NUM|IRCONV_CHECK);
-    }
-    if (!tref_isk(rc)) {  /* Range guard: -65536 <= i <= 65536 */
-      TRef tmp = emitir(IRTI(IR_ADD), rc, lj_ir_kint(J, 65536));
-      emitir(IRTGI(IR_ULE), tmp, lj_ir_kint(J, 2*65536));
-    }
-  } else {
-force_pow_num:
-    rc = lj_ir_tonum(J, rc);  /* Want POW(num, num), not POW(num, int). */
-  }
-  return emitir(IRTN(IR_POW), rb, rc);
-}
-
 /* -- Predictive narrowing of induction variables ------------------------- */
 
 /* Narrow a single runtime value. */
diff --git a/src/lj_opt_split.c b/src/lj_opt_split.c
index a619d852..0dc6394f 100644
--- a/src/lj_opt_split.c
+++ b/src/lj_opt_split.c
@@ -400,7 +400,7 @@ static void split_ir(jit_State *J)
 	hi = split_call_ll(J, hisubst, oir, ir, IRCALL_softfp_div);
 	break;
       case IR_POW:
-	hi = split_call_li(J, hisubst, oir, ir, IRCALL_lj_vm_powi);
+	hi = split_call_li(J, hisubst, oir, ir, IRCALL_pow);
 	break;
       case IR_FPMATH:
 	hi = split_call_l(J, hisubst, oir, ir, IRCALL_lj_vm_floor + ir->op2);
diff --git a/src/lj_record.c b/src/lj_record.c
index d1332bfc..34d1210a 100644
--- a/src/lj_record.c
+++ b/src/lj_record.c
@@ -2268,7 +2268,7 @@ void lj_record_ins(jit_State *J)
 
   case BC_POW:
     if (tref_isnumber_str(rb) && tref_isnumber_str(rc))
-      rc = lj_opt_narrow_pow(J, rb, rc, rbv, rcv);
+      rc = lj_opt_narrow_arith(J, rb, rc, rbv, rcv, IR_POW);
     else
       rc = rec_mm_arith(J, &ix, MM_pow);
     break;
diff --git a/src/lj_vm.h b/src/lj_vm.h
index f6f28a08..79166e5e 100644
--- a/src/lj_vm.h
+++ b/src/lj_vm.h
@@ -96,9 +96,6 @@ LJ_ASMF int lj_vm_errno(void);
 #endif
 #endif
 
-LJ_ASMF double lj_vm_powi(double, int32_t);
-LJ_ASMF double lj_vm_pow(double, double);
-
 /* Continuations for metamethods. */
 LJ_ASMF void lj_cont_cat(void);  /* Continue with concatenation. */
 LJ_ASMF void lj_cont_ra(void);  /* Store result in RA from instruction. */
diff --git a/src/lj_vmmath.c b/src/lj_vmmath.c
index 539f955b..506867f8 100644
--- a/src/lj_vmmath.c
+++ b/src/lj_vmmath.c
@@ -30,52 +30,12 @@ LJ_FUNCA double lj_wrap_sinh(double x) { return sinh(x); }
 LJ_FUNCA double lj_wrap_cosh(double x) { return cosh(x); }
 LJ_FUNCA double lj_wrap_tanh(double x) { return tanh(x); }
 LJ_FUNCA double lj_wrap_atan2(double x, double y) { return atan2(x, y); }
+LJ_FUNCA double lj_wrap_pow(double x, double y) { return pow(x, y); }
 LJ_FUNCA double lj_wrap_fmod(double x, double y) { return fmod(x, y); }
 #endif
 
 /* -- Helper functions ---------------------------------------------------- */
 
-/* Unsigned x^k. */
-static double lj_vm_powui(double x, uint32_t k)
-{
-  double y;
-  lj_assertX(k != 0, "pow with zero exponent");
-  for (; (k & 1) == 0; k >>= 1) x *= x;
-  y = x;
-  if ((k >>= 1) != 0) {
-    for (;;) {
-      x *= x;
-      if (k == 1) break;
-      if (k & 1) y *= x;
-      k >>= 1;
-    }
-    y *= x;
-  }
-  return y;
-}
-
-/* Signed x^k. */
-double lj_vm_powi(double x, int32_t k)
-{
-  if (k > 1)
-    return lj_vm_powui(x, (uint32_t)k);
-  else if (k == 1)
-    return x;
-  else if (k == 0)
-    return 1.0;
-  else
-    return 1.0 / lj_vm_powui(x, (uint32_t)-k);
-}
-
-double lj_vm_pow(double x, double y)
-{
-  int32_t k = lj_num2int(y);
-  if ((k >= -65536 && k <= 65536) && y == (double)k)
-    return lj_vm_powi(x, k);
-  else
-    return pow(x, y);
-}
-
 double lj_vm_foldarith(double x, double y, int op)
 {
   switch (op) {
@@ -84,7 +44,7 @@ double lj_vm_foldarith(double x, double y, int op)
   case IR_MUL - IR_ADD: return x*y; break;
   case IR_DIV - IR_ADD: return x/y; break;
   case IR_MOD - IR_ADD: return x-lj_vm_floor(x/y)*y; break;
-  case IR_POW - IR_ADD: return lj_vm_pow(x, y); break;
+  case IR_POW - IR_ADD: return pow(x, y); break;
   case IR_NEG - IR_ADD: return -x; break;
   case IR_ABS - IR_ADD: return fabs(x); break;
 #if LJ_HASJIT
diff --git a/src/vm_arm.dasc b/src/vm_arm.dasc
index 792f0363..767d31f9 100644
--- a/src/vm_arm.dasc
+++ b/src/vm_arm.dasc
@@ -1485,11 +1485,11 @@ static void build_subroutines(BuildCtx *ctx)
   |.endif
   |.endmacro
   |
-  |.macro math_extern2, name, func
+  |.macro math_extern2, func
   |.if HFABI
-  |  .ffunc_dd math_ .. name
+  |  .ffunc_dd math_ .. func
   |.else
-  |  .ffunc_nn math_ .. name
+  |  .ffunc_nn math_ .. func
   |.endif
   |  .IOS mov RA, BASE
   |  bl extern func
@@ -1500,9 +1500,6 @@ static void build_subroutines(BuildCtx *ctx)
   |  b ->fff_restv
   |.endif
   |.endmacro
-  |.macro math_extern2, func
-  |  math_extern2 func, func
-  |.endmacro
   |
   |.if FPU
   |  .ffunc_d math_sqrt
@@ -1548,7 +1545,7 @@ static void build_subroutines(BuildCtx *ctx)
   |  math_extern sinh
   |  math_extern cosh
   |  math_extern tanh
-  |  math_extern2 pow, lj_vm_pow
+  |  math_extern2 pow
   |  math_extern2 atan2
   |  math_extern2 fmod
   |
@@ -3156,7 +3153,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
     break;
   case BC_POW:
     |  // NYI: (partial) integer arithmetic.
-    |  ins_arithfp extern, extern lj_vm_pow
+    |  ins_arithfp extern, extern pow
     break;
 
   case BC_CAT:
diff --git a/src/vm_arm64.dasc b/src/vm_arm64.dasc
index fb267a76..de33bde4 100644
--- a/src/vm_arm64.dasc
+++ b/src/vm_arm64.dasc
@@ -1391,14 +1391,11 @@ static void build_subroutines(BuildCtx *ctx)
   |  b ->fff_resn
   |.endmacro
   |
-  |.macro math_extern2, name, func
-  |  .ffunc_nn math_ .. name
+  |.macro math_extern2, func
+  |  .ffunc_nn math_ .. func
   |  bl extern func
   |  b ->fff_resn
   |.endmacro
-  |.macro math_extern2, func
-  |  math_extern2 func, func
-  |.endmacro
   |
   |.ffunc_n math_sqrt
   |  fsqrt d0, d0
@@ -1427,7 +1424,7 @@ static void build_subroutines(BuildCtx *ctx)
   |  math_extern sinh
   |  math_extern cosh
   |  math_extern tanh
-  |  math_extern2 pow, lj_vm_pow
+  |  math_extern2 pow
   |  math_extern2 atan2
   |  math_extern2 fmod
   |
@@ -2624,7 +2621,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
     |  ins_arithload FARG1, FARG2
     |  ins_arithfallback ins_arithcheck_num
     |.if "fpins" == "fpow"
-    |  bl extern lj_vm_pow
+    |  bl extern pow
     |.else
     |  fpins FARG1, FARG1, FARG2
     |.endif
diff --git a/src/vm_mips.dasc b/src/vm_mips.dasc
index 5664f503..32caabf7 100644
--- a/src/vm_mips.dasc
+++ b/src/vm_mips.dasc
@@ -1631,17 +1631,14 @@ static void build_subroutines(BuildCtx *ctx)
   |.  nop
   |.endmacro
   |
-  |.macro math_extern2, name, func
-  |  .ffunc_nn math_ .. name
+  |.macro math_extern2, func
+  |  .ffunc_nn math_ .. func
   |.  load_got func
   |  call_extern
   |.  nop
   |  b ->fff_resn
   |.  nop
   |.endmacro
-  |.macro math_extern2, func
-  |  math_extern2 func, func
-  |.endmacro
   |
   |// TODO: Return integer type if result is integer (own sf implementation).
   |.macro math_round, func
@@ -1695,7 +1692,7 @@ static void build_subroutines(BuildCtx *ctx)
   |  math_extern sinh
   |  math_extern cosh
   |  math_extern tanh
-  |  math_extern2 pow, lj_vm_pow
+  |  math_extern2 pow
   |  math_extern2 atan2
   |  math_extern2 fmod
   |
@@ -3588,7 +3585,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
     |  sltiu AT, SFARG1HI, LJ_TISNUM
     |  sltiu TMP0, SFARG2HI, LJ_TISNUM
     |  and AT, AT, TMP0
-    |  load_got lj_vm_pow
+    |  load_got pow
     |  beqz AT, ->vmeta_arith
     |.  addu RA, BASE, RA
     |.if FPU
diff --git a/src/vm_mips64.dasc b/src/vm_mips64.dasc
index 249605d4..44fba36c 100644
--- a/src/vm_mips64.dasc
+++ b/src/vm_mips64.dasc
@@ -1669,17 +1669,14 @@ static void build_subroutines(BuildCtx *ctx)
   |.  nop
   |.endmacro
   |
-  |.macro math_extern2, name, func
-  |  .ffunc_nn math_ .. name
+  |.macro math_extern2, func
+  |  .ffunc_nn math_ .. func
   |.  load_got func
   |  call_extern
   |.  nop
   |  b ->fff_resn
   |.  nop
   |.endmacro
-  |.macro math_extern2, func
-  |  math_extern2 func, func
-  |.endmacro
   |
   |// TODO: Return integer type if result is integer (own sf implementation).
   |.macro math_round, func
@@ -1733,7 +1730,7 @@ static void build_subroutines(BuildCtx *ctx)
   |  math_extern sinh
   |  math_extern cosh
   |  math_extern tanh
-  |  math_extern2 pow, lj_vm_pow
+  |  math_extern2 pow
   |  math_extern2 atan2
   |  math_extern2 fmod
   |
@@ -3826,7 +3823,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
     |  sltiu TMP0, TMP0, LJ_TISNUM
     |   sltiu TMP1, TMP1, LJ_TISNUM
     |  and AT, TMP0, TMP1
-    |  load_got lj_vm_pow
+    |  load_got pow
     |  beqz AT, ->vmeta_arith
     |.  daddu RA, BASE, RA
     |.if FPU
diff --git a/src/vm_ppc.dasc b/src/vm_ppc.dasc
index 94af63e6..980ad897 100644
--- a/src/vm_ppc.dasc
+++ b/src/vm_ppc.dasc
@@ -2032,14 +2032,11 @@ static void build_subroutines(BuildCtx *ctx)
   |  b ->fff_resn
   |.endmacro
   |
-  |.macro math_extern2, name, func
-  |  .ffunc_nn math_ .. name
+  |.macro math_extern2, func
+  |  .ffunc_nn math_ .. func
   |  blex func
   |  b ->fff_resn
   |.endmacro
-  |.macro math_extern2, func
-  |  math_extern2 func, func
-  |.endmacro
   |
   |.macro math_round, func
   |  .ffunc_1 math_ .. func
@@ -2164,7 +2161,7 @@ static void build_subroutines(BuildCtx *ctx)
   |  math_extern sinh
   |  math_extern cosh
   |  math_extern tanh
-  |  math_extern2 pow, lj_vm_pow
+  |  math_extern2 pow
   |  math_extern2 atan2
   |  math_extern2 fmod
   |
@@ -4157,7 +4154,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
     |  checknum cr1, CARG3
     |  crand 4*cr0+lt, 4*cr0+lt, 4*cr1+lt
     |  bge ->vmeta_arith_vv
-    |  blex lj_vm_pow
+    |  blex pow
     |  ins_next1
     |.if FPU
     |  stfdx FARG1, BASE, RA
diff --git a/src/vm_x64.dasc b/src/vm_x64.dasc
index acbe8dc2..09bf67e5 100644
--- a/src/vm_x64.dasc
+++ b/src/vm_x64.dasc
@@ -1825,16 +1825,13 @@ static void build_subroutines(BuildCtx *ctx)
   |  jmp ->fff_resxmm0
   |.endmacro
   |
-  |.macro math_extern2, name, func
-  |  .ffunc_nn math_ .. name
+  |.macro math_extern2, func
+  |  .ffunc_nn math_ .. func
   |  mov RB, BASE
   |  call extern func
   |  mov BASE, RB
   |  jmp ->fff_resxmm0
   |.endmacro
-  |.macro math_extern2, func
-  |  math_extern2 func, func
-  |.endmacro
   |
   |  math_extern log10
   |  math_extern exp
@@ -1847,7 +1844,7 @@ static void build_subroutines(BuildCtx *ctx)
   |  math_extern sinh
   |  math_extern cosh
   |  math_extern tanh
-  |  math_extern2 pow, lj_vm_pow
+  |  math_extern2 pow
   |  math_extern2 atan2
   |  math_extern2 fmod
   |
diff --git a/src/vm_x86.dasc b/src/vm_x86.dasc
index bf30cce6..f16ade1a 100644
--- a/src/vm_x86.dasc
+++ b/src/vm_x86.dasc
@@ -2240,8 +2240,8 @@ static void build_subroutines(BuildCtx *ctx)
   |  jmp ->fff_resfp
   |.endmacro
   |
-  |.macro math_extern2, name, func
-  |  .ffunc_nnsse math_ .. name
+  |.macro math_extern2, func
+  |  .ffunc_nnsse math_ .. func
   |.if not X64
   |  movsd FPARG1, xmm0
   |  movsd FPARG3, xmm1
@@ -2251,9 +2251,6 @@ static void build_subroutines(BuildCtx *ctx)
   |  mov BASE, RB
   |  jmp ->fff_resfp
   |.endmacro
-  |.macro math_extern2, func
-  |  math_extern2 func, func
-  |.endmacro
   |
   |  math_extern log10
   |  math_extern exp
@@ -2266,7 +2263,7 @@ static void build_subroutines(BuildCtx *ctx)
   |  math_extern sinh
   |  math_extern cosh
   |  math_extern tanh
-  |  math_extern2 pow, lj_vm_pow
+  |  math_extern2 pow
   |  math_extern2 atan2
   |  math_extern2 fmod
   |
@@ -3944,7 +3941,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
     |  movsd FPARG1, xmm0
     |  movsd FPARG3, xmm1
     |.endif
-    |  call extern lj_vm_pow
+    |  call extern pow
     |  movzx RA, PC_RA
     |  mov BASE, RB
     |.if X64
diff --git a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
index 5129fc45..ab9db3df 100644
--- a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
+++ b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
@@ -2,14 +2,15 @@ local tap = require('tap')
 -- Test to demonstrate the incorrect JIT behaviour for different
 -- power operation optimizations.
 -- See also:
--- https://github.com/LuaJIT/LuaJIT/issues/684.
+-- https://github.com/LuaJIT/LuaJIT/issues/684,
+-- https://github.com/LuaJIT/LuaJIT/issues/817.
 local test = tap.test('lj-684-pow-inconsistencies'):skipcond({
   ['Test requires JIT enabled'] = not jit.status(),
 })
 
 local tostring = tostring
 
-test:plan(4)
+test:plan(5)
 
 jit.opt.start('hotloop=1')
 
@@ -64,6 +65,22 @@ jit.flush()
 
 test:samevalues(res, ('consistent results for folding 2921 ^ 0.5'))
 
+-- -948388 ^ 3 = -0x1.7ad0e8ad7439dp+59.
+res = {}
+-- XXX: use local variable to prevent folding via parser.
+-- XXX: use stack slot out of trace to prevent constant folding.
+local corner_case_3 = -948388
+jit.on()
+for i = 1, 4 do
+  res[i] = corner_case_3 ^ 3
+end
+
+-- XXX: Prevent hotcount side effects.
+jit.off()
+jit.flush()
+
+test:samevalues(res, ('consistent results for int pow (-948388) ^ 3'))
+
 -- Narrowing for non-constant base of power operation.
 local function pow(base, power)
   return base ^ power
-- 
2.41.0


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 1/5] test: introduce `samevalues()` TAP checker
  2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 1/5] test: introduce `samevalues()` TAP checker Sergey Kaplun via Tarantool-patches
@ 2023-08-17 14:03   ` Maxim Kokryashkin via Tarantool-patches
  2023-08-17 15:03     ` Sergey Kaplun via Tarantool-patches
  2023-08-18 10:43   ` Sergey Bronnikov via Tarantool-patches
  1 sibling, 1 reply; 34+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2023-08-17 14:03 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the patch!
LGTM, except for a few comments below.

On Tue, Aug 15, 2023 at 12:36:27PM +0300, Sergey Kaplun wrote:
> The introduced `samevalues()` helper checks that values in range from
Typo: s/in range/in the range/
> 1, to `table.maxn()` of the given table are exactly the same. It may be
> usefull for test consistency of JIT and VM behaviour. Originally, the
Typo: s/usefull for test/useful to test the/
> `arr_is_consistent()` function was introduced in the
> <tarantool-tests/gh-6163-min-max.test.lua>. `samevalues()` has the same
> functionallity (except usage of `table.maxn()` instead `#` operator to
Typo: s/functionallity/functionality/
Typo: s/except/except for the/
Typo: s/instead/instead of the/
> be sure, that the table we check isn't a sparse array).
> ---
>  test/tarantool-tests/gh-6163-min-max.test.lua | 52 ++++++++-----------
>  test/tarantool-tests/tap.lua                  | 14 +++++
>  2 files changed, 37 insertions(+), 29 deletions(-)
> 
> diff --git a/test/tarantool-tests/gh-6163-min-max.test.lua b/test/tarantool-tests/gh-6163-min-max.test.lua
> index 63437955..4bc6155c 100644
> --- a/test/tarantool-tests/gh-6163-min-max.test.lua
> +++ b/test/tarantool-tests/gh-6163-min-max.test.lua
> @@ -2,25 +2,17 @@ local tap = require('tap')
>  local test = tap.test('gh-6163-jit-min-max'):skipcond({
>    ['Test requires JIT enabled'] = not jit.status(),
>  })
> +
>  local x86_64 = jit.arch == 'x86' or jit.arch == 'x64'
> +-- XXX: table to use for dummy check for some inconsistent results
> +-- on the x86/64 architecture.
> +local DUMMY_TAB = {}
> +
>  test:plan(18)
>  --
>  -- gh-6163: math.min/math.max inconsistencies.
>  --
>  
> -local function isnan(x)
> -    return x ~= x
> -end
> -
> -local function array_is_consistent(res)
> -  for i = 1, #res - 1 do
> -    if res[i] ~= res[i + 1] and not (isnan(res[i]) and isnan(res[i + 1])) then
> -      return false
> -    end
> -  end
> -  return true
> -end
> -
>  -- This function creates dirty values on the Lua stack.
>  -- The latter of them is going to be treated as an
>  -- argument by the `math.min/math.max`.
> @@ -91,14 +83,14 @@ for k = 1, 4 do
>      result[k] = min(min(x, nan), x)
>  end
>  -- expected: 1 1 1 1
> -test:ok(array_is_consistent(result), 'math.min: reassoc_dup')
> +test:samevalues(result, 'math.min: reassoc_dup')
>  
>  result = {}
>  for k = 1, 4 do
>      result[k] = max(max(x, nan), x)
>  end
>  -- expected: 1 1 1 1
> -test:ok(array_is_consistent(result), 'math.max: reassoc_dup')
> +test:samevalues(result, 'math.max: reassoc_dup')
>  
>  -- If one gets the expression like `math.min(x, math.min(x, nan))`,
>  -- and the `comm_dup` optimization is applied, it results in the
> @@ -120,7 +112,7 @@ for k = 1, 4 do
>  end
>  -- FIXME: results are still inconsistent for the x86/64 architecture.
>  -- expected: nan nan nan nan
> -test:ok(array_is_consistent(result) or x86_64, 'math.min: comm_dup_minmax')
> +test:samevalues(x86_64 and DUMMY_TAB or result, 'math.min: comm_dup_minmax')
>  
>  result = {}
>  for k = 1, 4 do
> @@ -128,7 +120,7 @@ for k = 1, 4 do
>  end
>  -- FIXME: results are still inconsistent for the x86/64 architecture.
>  -- expected: nan nan nan nan
> -test:ok(array_is_consistent(result) or x86_64, 'math.max: comm_dup_minmax')
> +test:samevalues(x86_64 and DUMMY_TAB or result, 'math.max: comm_dup_minmax')
>  
>  -- The following optimization should be disabled:
>  -- (x o k1) o k2 ==> x o (k1 o k2)
> @@ -139,49 +131,49 @@ for k = 1, 4 do
>      result[k] = min(min(x, 0/0), 1.3)
>  end
>  -- expected: 1.3 1.3 1.3 1.3
> -test:ok(array_is_consistent(result), 'math.min: reassoc_minmax_k')
> +test:samevalues(result, 'math.min: reassoc_minmax_k')
>  
>  result = {}
>  for k = 1, 4 do
>      result[k] = max(max(x, 0/0), 1.1)
>  end
>  -- expected: 1.1 1.1 1.1 1.1
> -test:ok(array_is_consistent(result), 'math.max: reassoc_minmax_k')
> +test:samevalues(result, 'math.max: reassoc_minmax_k')
>  
>  result = {}
>  for k = 1, 4 do
>    result[k] = min(max(nan, 1), 1)
>  end
>  -- expected: 1 1 1 1
> -test:ok(array_is_consistent(result), 'min-max-case1: reassoc_minmax_left')
> +test:samevalues(result, 'min-max-case1: reassoc_minmax_left')
>  
>  result = {}
>  for k = 1, 4 do
>    result[k] = min(max(1, nan), 1)
>  end
>  -- expected: 1 1 1 1
> -test:ok(array_is_consistent(result), 'min-max-case2: reassoc_minmax_left')
> +test:samevalues(result, 'min-max-case2: reassoc_minmax_left')
>  
>  result = {}
>  for k = 1, 4 do
>    result[k] = max(min(nan, 1), 1)
>  end
>  -- expected: 1 1 1 1
> -test:ok(array_is_consistent(result), 'max-min-case1: reassoc_minmax_left')
> +test:samevalues(result, 'max-min-case1: reassoc_minmax_left')
>  
>  result = {}
>  for k = 1, 4 do
>    result[k] = max(min(1, nan), 1)
>  end
>  -- expected: 1 1 1 1
> -test:ok(array_is_consistent(result), 'max-min-case2: reassoc_minmax_left')
> +test:samevalues(result, 'max-min-case2: reassoc_minmax_left')
>  
>  result = {}
>  for k = 1, 4 do
>    result[k] = min(1, max(nan, 1))
>  end
>  -- expected: 1 1 1 1
> -test:ok(array_is_consistent(result), 'min-max-case1: reassoc_minmax_right')
> +test:samevalues(result, 'min-max-case1: reassoc_minmax_right')
>  
>  result = {}
>  for k = 1, 4 do
> @@ -189,14 +181,15 @@ for k = 1, 4 do
>  end
>  -- FIXME: results are still inconsistent for the x86/64 architecture.
>  -- expected: nan nan nan nan
> -test:ok(array_is_consistent(result) or x86_64, 'min-max-case2: reassoc_minmax_right')
> +test:samevalues(x86_64 and DUMMY_TAB or result,
> +                'min-max-case2: reassoc_minmax_right')

Side note: this skipcond looks complex, but I can't come up with an alternative
better than altering the TAP-plan, which is an even worse option...
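Just to spell out what that alternative would mean (a hypothetical
sketch, not a proposal):

  -- The check would have to be dropped from the plan on x86/64:
  if not x86_64 then
    test:samevalues(result, 'min-max-case2: reassoc_minmax_right')
  end
  -- ...and the count passed to test:plan() would then have to
  -- depend on the architecture as well.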
>  
>  result = {}
>  for k = 1, 4 do
>    result[k] = max(1, min(nan, 1))
>  end
>  -- expected: 1 1 1 1
> -test:ok(array_is_consistent(result), 'max-min-case1: reassoc_minmax_right')
> +test:samevalues(result, 'max-min-case1: reassoc_minmax_right')
>  
>  result = {}
>  for k = 1, 4 do
> @@ -204,7 +197,8 @@ for k = 1, 4 do
>  end
>  -- FIXME: results are still inconsistent for the x86/64 architecture.
>  -- expected: nan nan nan nan
> -test:ok(array_is_consistent(result) or x86_64, 'max-min-case2: reassoc_minmax_right')
> +test:samevalues(x86_64 and DUMMY_TAB or result,
> +                'max-min-case2: reassoc_minmax_right')
>  
>  -- XXX: If we look into the disassembled code of `lj_vm_foldarith()`
>  -- we can see the following:
> @@ -253,13 +247,13 @@ for k = 1, 4 do
>    result[k] = min(min(7.1, 0/0), 1.1)
>  end
>  -- expected: 1.1 1.1 1.1 1.1
> -test:ok(array_is_consistent(result), 'min: fold_kfold_numarith')
> +test:samevalues(result, 'min: fold_kfold_numarith')
>  
>  result = {}
>  for k = 1, 4 do
>    result[k] = max(max(7.1, 0/0), 1.1)
>  end
>  -- expected: 1.1 1.1 1.1 1.1
> -test:ok(array_is_consistent(result), 'max: fold_kfold_numarith')
> +test:samevalues(result, 'max: fold_kfold_numarith')
>  
>  test:done(true)
> diff --git a/test/tarantool-tests/tap.lua b/test/tarantool-tests/tap.lua
> index 8559ee52..af1d4b20 100644
> --- a/test/tarantool-tests/tap.lua
> +++ b/test/tarantool-tests/tap.lua
> @@ -254,6 +254,19 @@ local function iscdata(test, v, ctype, message, extra)
>    return ok(test, ffi.istype(ctype, v), message, extra)
>  end
>  
> +local function isnan(v)
> +  return v ~= v
> +end
> +
> +local function samevalues(test, got, message, extra)
> +  for i = 1, table.maxn(got) - 1 do
> +    if got[i] ~= got[i + 1] and not (isnan(got[i]) and isnan(got[i + 1])) then
> +      return fail(test, message, extra)
> +    end
> +  end
> +  return ok(test, true, message, extra)
> +end
> +
>  local test_mt
>  
>  local function new(parent, name, fun, ...)
> @@ -372,6 +385,7 @@ test_mt = {
>      isudata    = isudata,
>      iscdata    = iscdata,
>      is_deeply  = is_deeply,
> +    samevalues = samevalues,
>      like       = like,
>      unlike     = unlike,
>    }
> -- 
> 2.41.0
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/5] Remove pow() splitting and cleanup backends.
  2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 2/5] Remove pow() splitting and cleanup backends Sergey Kaplun via Tarantool-patches
@ 2023-08-17 14:52   ` Maxim Kokryashkin via Tarantool-patches
  2023-08-17 15:33     ` Sergey Kaplun via Tarantool-patches
  2023-08-18 11:08   ` Sergey Bronnikov via Tarantool-patches
  1 sibling, 1 reply; 34+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2023-08-17 14:52 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the patch!
Please consider my comments below.

On Tue, Aug 15, 2023 at 12:36:28PM +0300, Sergey Kaplun wrote:
> From: Mike Pall <mike>
> 
> (cherry-picked from commit b2307c8ad817e350d65cc909a579ca2f77439682)
> 
> The JIT engine tries to split b^c to exp2(c * log2(b)) with attempt to
Typo: s/with attempt/with an attempt/
> rejoin them later for some backends. It adds a dependency on C99
> exp2() and log2(), which aren't part of some libm implementations.
> Also, for some cases for IEEE754 we can see, that exp2(log2(x)) != x,
> due to mathematical functions accuracy and double precision
> restrictions. So, the values on the JIT slots and Lua stack are
> inconsistent.

There is a lot to it. There are changes in emission, fold optimizations,
narrowing, etc. Maybe it is worth mentioning some key changes that
happened as a result of that? That way, this changeset is easier to absorb.
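As a side note, the inaccuracy the commit message describes can be
illustrated with a tiny snippet like this (just a sketch; whether the
values actually differ depends on the libm in use):

  local c = -0.90000000001
  -- exp2(c * log2(b)) computed by hand vs. the direct power.
  local split  = 2 ^ (c * math.log(1000) / math.log(2))
  local direct = 1000 ^ c
  print(split == direct)  -- may print false: the last bits can differ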

> 
> This patch removes splitting of pow operator, so IR_POW is emitting for
Typo: s/removes/removes the/
> all cases (except power of 0.5 replaced with sqrt operation).
Typo: s/except/except for the/
Typo: s/0.5/0.5, which is/
Typo: s/with sqrt/with the sqrt/
> 
> Also this patch does some refactoring:
> 
> * Functions `asm_pow()`, `asm_mod()`, `asm_ldexp()`, `asm_div()`
>   (replaced with `asm_fpdiv()` for CPU architectures) are moved to the
Typo: s/to the/to/
>   <src/lj_asm.c> as far as their implementation is generic for all
>   architectures.
> * Fusing of IR_HREF + IR_EQ/IR_NE moved to a `asm_fuseequal()`.
Typo: s/moved/was moved/
Typo: s/to a/to/
> * Since `lj_vm_exp2()` subroutine and `IRFPM_EXP2` are removed as no
>   longer used.
I can't understand what this sentence means, please rephrase it.
> 

What about the changes to `asm_cnew`? I think you should mention them too.
> Sergey Kaplun:
> * added the description and the test for the problem
> 
> Part of tarantool/tarantool#8825
> ---
>  src/lj_arch.h                                 |   3 -
>  src/lj_asm.c                                  | 106 +++++++++++-------
>  src/lj_asm_arm.h                              |  10 +-
>  src/lj_asm_arm64.h                            |  39 +------
>  src/lj_asm_mips.h                             |  38 +------
>  src/lj_asm_ppc.h                              |   9 +-
>  src/lj_asm_x86.h                              |  37 +-----
>  src/lj_ir.h                                   |   2 +-
>  src/lj_ircall.h                               |   1 -
>  src/lj_opt_fold.c                             |  18 ++-
>  src/lj_opt_narrow.c                           |  20 +---
>  src/lj_opt_split.c                            |  21 ----
>  src/lj_vm.h                                   |   5 -
>  src/lj_vmmath.c                               |   8 --
>  .../lj-9-pow-inconsistencies.test.lua         |  63 +++++++++++
>  15 files changed, 158 insertions(+), 222 deletions(-)
>  create mode 100644 test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
> 
> diff --git a/src/lj_arch.h b/src/lj_arch.h
> index cf31a291..3bdbe84e 100644
> --- a/src/lj_arch.h
> +++ b/src/lj_arch.h
> @@ -607,9 +607,6 @@
>  #if defined(__ANDROID__) || defined(__symbian__) || LJ_TARGET_XBOX360 || LJ_TARGET_WINDOWS
>  #define LUAJIT_NO_LOG2
>  #endif
> -#if defined(__symbian__) || LJ_TARGET_WINDOWS
> -#define LUAJIT_NO_EXP2
> -#endif
>  #if LJ_TARGET_CONSOLE || (LJ_TARGET_IOS && __IPHONE_OS_VERSION_MIN_REQUIRED >= __IPHONE_8_0)
>  #define LJ_NO_SYSTEM		1
>  #endif
> diff --git a/src/lj_asm.c b/src/lj_asm.c
> index b352fd35..a6906b19 100644
> --- a/src/lj_asm.c
> +++ b/src/lj_asm.c
> @@ -1356,32 +1356,6 @@ static void asm_call(ASMState *as, IRIns *ir)
>    asm_gencall(as, ci, args);
>  }
>  
> -#if !LJ_SOFTFP32
> -static void asm_fppow(ASMState *as, IRIns *ir, IRRef lref, IRRef rref)
> -{
> -  const CCallInfo *ci = &lj_ir_callinfo[IRCALL_pow];
> -  IRRef args[2];
> -  args[0] = lref;
> -  args[1] = rref;
> -  asm_setupresult(as, ir, ci);
> -  asm_gencall(as, ci, args);
> -}
> -
> -static int asm_fpjoin_pow(ASMState *as, IRIns *ir)
> -{
> -  IRIns *irp = IR(ir->op1);
> -  if (irp == ir-1 && irp->o == IR_MUL && !ra_used(irp)) {
> -    IRIns *irpp = IR(irp->op1);
> -    if (irpp == ir-2 && irpp->o == IR_FPMATH &&
> -	irpp->op2 == IRFPM_LOG2 && !ra_used(irpp)) {
> -      asm_fppow(as, ir, irpp->op1, irp->op2);
> -      return 1;
> -    }
> -  }
> -  return 0;
> -}
> -#endif
> -
>  /* -- PHI and loop handling ----------------------------------------------- */
>  
>  /* Break a PHI cycle by renaming to a free register (evict if needed). */
> @@ -1652,6 +1626,62 @@ static void asm_loop(ASMState *as)
>  #error "Missing assembler for target CPU"
>  #endif
>  
> +/* -- Common instruction helpers ------------------------------------------ */
> +
> +#if !LJ_SOFTFP32
> +#if !LJ_TARGET_X86ORX64
> +#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
> +#define asm_fppowi(as, ir)	asm_callid(as, ir, IRCALL_lj_vm_powi)
> +#endif
> +
> +static void asm_pow(ASMState *as, IRIns *ir)
> +{
> +#if LJ_64 && LJ_HASFFI
> +  if (!irt_isnum(ir->t))
> +    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_powi64 :
> +					  IRCALL_lj_carith_powu64);
> +  else
> +#endif
> +  if (irt_isnum(IR(ir->op2)->t))
> +    asm_callid(as, ir, IRCALL_pow);
> +  else
> +    asm_fppowi(as, ir);
> +}
> +
> +static void asm_div(ASMState *as, IRIns *ir)
> +{
> +#if LJ_64 && LJ_HASFFI
> +  if (!irt_isnum(ir->t))
> +    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_divi64 :
> +					  IRCALL_lj_carith_divu64);
> +  else
> +#endif
> +    asm_fpdiv(as, ir);
> +}
> +#endif
> +
> +static void asm_mod(ASMState *as, IRIns *ir)
> +{
> +#if LJ_64 && LJ_HASFFI
> +  if (!irt_isint(ir->t))
> +    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_modi64 :
> +					  IRCALL_lj_carith_modu64);
> +  else
> +#endif
> +    asm_callid(as, ir, IRCALL_lj_vm_modi);
> +}
> +
> +static void asm_fuseequal(ASMState *as, IRIns *ir)
> +{
> +  /* Fuse HREF + EQ/NE. */
> +  if ((ir-1)->o == IR_HREF && ir->op1 == as->curins-1) {
> +    as->curins--;
> +    asm_href(as, ir-1, (IROp)ir->o);
> +  } else {
> +    asm_equal(as, ir);
> +  }
> +}
> +
>  /* -- Instruction dispatch ------------------------------------------------ */
>  
>  /* Assemble a single instruction. */
> @@ -1674,14 +1704,7 @@ static void asm_ir(ASMState *as, IRIns *ir)
>    case IR_ABC:
>      asm_comp(as, ir);
>      break;
> -  case IR_EQ: case IR_NE:
> -    if ((ir-1)->o == IR_HREF && ir->op1 == as->curins-1) {
> -      as->curins--;
> -      asm_href(as, ir-1, (IROp)ir->o);
> -    } else {
> -      asm_equal(as, ir);
> -    }
> -    break;
> +  case IR_EQ: case IR_NE: asm_fuseequal(as, ir); break;
>  
>    case IR_RETF: asm_retf(as, ir); break;
>  
> @@ -1750,7 +1773,13 @@ static void asm_ir(ASMState *as, IRIns *ir)
>    case IR_SNEW: case IR_XSNEW: asm_snew(as, ir); break;
>    case IR_TNEW: asm_tnew(as, ir); break;
>    case IR_TDUP: asm_tdup(as, ir); break;
> -  case IR_CNEW: case IR_CNEWI: asm_cnew(as, ir); break;
> +  case IR_CNEW: case IR_CNEWI:
> +#if LJ_HASFFI
> +    asm_cnew(as, ir);
> +#else
> +    lua_assert(0);
> +#endif
> +    break;
>  
>    /* Buffer operations. */
>    case IR_BUFHDR: asm_bufhdr(as, ir); break;
> @@ -2215,6 +2244,10 @@ static void asm_setup_regsp(ASMState *as)
>  	if (inloop)
>  	  as->modset |= RSET_SCRATCH;
>  #if LJ_TARGET_X86
> +	if (irt_isnum(IR(ir->op2)->t)) {
> +	  if (as->evenspill < 4)  /* Leave room to call pow(). */
> +	    as->evenspill = 4;
> +	}
>  	break;
>  #else
>  	ir->prev = REGSP_HINT(RID_FPRET);
> @@ -2240,9 +2273,6 @@ static void asm_setup_regsp(ASMState *as)
>  	  continue;
>  	}
>  	break;
> -      } else if (ir->op2 == IRFPM_EXP2 && !LJ_64) {
> -	if (as->evenspill < 4)  /* Leave room to call pow(). */
> -	  as->evenspill = 4;
>        }
>  #endif
>        if (inloop)
> diff --git a/src/lj_asm_arm.h b/src/lj_asm_arm.h
> index 2894e5c9..29a07c80 100644
> --- a/src/lj_asm_arm.h
> +++ b/src/lj_asm_arm.h
> @@ -1275,8 +1275,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>    ra_allockreg(as, (int32_t)(sz+sizeof(GCcdata)),
>  	       ra_releasetmp(as, ASMREF_TMP1));
>  }
> -#else
> -#define asm_cnew(as, ir)	((void)0)
>  #endif
>  
>  /* -- Write barriers ------------------------------------------------------ */
> @@ -1371,8 +1369,6 @@ static void asm_callround(ASMState *as, IRIns *ir, int id)
>  
>  static void asm_fpmath(ASMState *as, IRIns *ir)
>  {
> -  if (ir->op2 == IRFPM_EXP2 && asm_fpjoin_pow(as, ir))
> -    return;
>    if (ir->op2 <= IRFPM_TRUNC)
>      asm_callround(as, ir, ir->op2);
>    else if (ir->op2 == IRFPM_SQRT)
> @@ -1499,14 +1495,10 @@ static void asm_mul(ASMState *as, IRIns *ir)
>  #define asm_mulov(as, ir)	asm_mul(as, ir)
>  
>  #if !LJ_SOFTFP
> -#define asm_div(as, ir)		asm_fparith(as, ir, ARMI_VDIV_D)
> -#define asm_pow(as, ir)		asm_callid(as, ir, IRCALL_lj_vm_powi)
> +#define asm_fpdiv(as, ir)	asm_fparith(as, ir, ARMI_VDIV_D)
>  #define asm_abs(as, ir)		asm_fpunary(as, ir, ARMI_VABS_D)
> -#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
>  #endif
>  
> -#define asm_mod(as, ir)		asm_callid(as, ir, IRCALL_lj_vm_modi)
> -
>  static void asm_neg(ASMState *as, IRIns *ir)
>  {
>  #if !LJ_SOFTFP
> diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
> index aea251a9..c3d6889e 100644
> --- a/src/lj_asm_arm64.h
> +++ b/src/lj_asm_arm64.h
> @@ -1249,8 +1249,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>    ra_allockreg(as, (int32_t)(sz+sizeof(GCcdata)),
>  	       ra_releasetmp(as, ASMREF_TMP1));
>  }
> -#else
> -#define asm_cnew(as, ir)	((void)0)
>  #endif
>  
>  /* -- Write barriers ------------------------------------------------------ */
> @@ -1327,8 +1325,6 @@ static void asm_fpmath(ASMState *as, IRIns *ir)
>    } else if (fpm <= IRFPM_TRUNC) {
>      asm_fpunary(as, ir, fpm == IRFPM_FLOOR ? A64I_FRINTMd :
>  			fpm == IRFPM_CEIL ? A64I_FRINTPd : A64I_FRINTZd);
> -  } else if (fpm == IRFPM_EXP2 && asm_fpjoin_pow(as, ir)) {
> -    return;
>    } else {
>      asm_callid(as, ir, IRCALL_lj_vm_floor + fpm);
>    }
> @@ -1435,45 +1431,12 @@ static void asm_mul(ASMState *as, IRIns *ir)
>    asm_intmul(as, ir);
>  }
>  
> -static void asm_div(ASMState *as, IRIns *ir)
> -{
> -#if LJ_HASFFI
> -  if (!irt_isnum(ir->t))
> -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_divi64 :
> -					  IRCALL_lj_carith_divu64);
> -  else
> -#endif
> -    asm_fparith(as, ir, A64I_FDIVd);
> -}
> -
> -static void asm_pow(ASMState *as, IRIns *ir)
> -{
> -#if LJ_HASFFI
> -  if (!irt_isnum(ir->t))
> -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_powi64 :
> -					  IRCALL_lj_carith_powu64);
> -  else
> -#endif
> -    asm_callid(as, ir, IRCALL_lj_vm_powi);
> -}
> -
>  #define asm_addov(as, ir)	asm_add(as, ir)
>  #define asm_subov(as, ir)	asm_sub(as, ir)
>  #define asm_mulov(as, ir)	asm_mul(as, ir)
>  
> +#define asm_fpdiv(as, ir)	asm_fparith(as, ir, A64I_FDIVd)
>  #define asm_abs(as, ir)		asm_fpunary(as, ir, A64I_FABS)
> -#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
> -
> -static void asm_mod(ASMState *as, IRIns *ir)
> -{
> -#if LJ_HASFFI
> -  if (!irt_isint(ir->t))
> -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_modi64 :
> -					  IRCALL_lj_carith_modu64);
> -  else
> -#endif
> -    asm_callid(as, ir, IRCALL_lj_vm_modi);
> -}
>  
>  static void asm_neg(ASMState *as, IRIns *ir)
>  {
> diff --git a/src/lj_asm_mips.h b/src/lj_asm_mips.h
> index 4626507b..0f92959b 100644
> --- a/src/lj_asm_mips.h
> +++ b/src/lj_asm_mips.h
> @@ -1613,8 +1613,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>    ra_allockreg(as, (int32_t)(sz+sizeof(GCcdata)),
>  	       ra_releasetmp(as, ASMREF_TMP1));
>  }
> -#else
> -#define asm_cnew(as, ir)	((void)0)
>  #endif
>  
>  /* -- Write barriers ------------------------------------------------------ */
> @@ -1683,8 +1681,6 @@ static void asm_fpunary(ASMState *as, IRIns *ir, MIPSIns mi)
>  #if !LJ_SOFTFP32
>  static void asm_fpmath(ASMState *as, IRIns *ir)
>  {
> -  if (ir->op2 == IRFPM_EXP2 && asm_fpjoin_pow(as, ir))
> -    return;
>  #if !LJ_SOFTFP
>    if (ir->op2 <= IRFPM_TRUNC)
>      asm_callround(as, ir, IRCALL_lj_vm_floor + ir->op2);
> @@ -1772,41 +1768,13 @@ static void asm_mul(ASMState *as, IRIns *ir)
>    }
>  }
>  
> -static void asm_mod(ASMState *as, IRIns *ir)
> -{
> -#if LJ_64 && LJ_HASFFI
> -  if (!irt_isint(ir->t))
> -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_modi64 :
> -					  IRCALL_lj_carith_modu64);
> -  else
> -#endif
> -    asm_callid(as, ir, IRCALL_lj_vm_modi);
> -}
> -
>  #if !LJ_SOFTFP32
> -static void asm_pow(ASMState *as, IRIns *ir)
> -{
> -#if LJ_64 && LJ_HASFFI
> -  if (!irt_isnum(ir->t))
> -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_powi64 :
> -					  IRCALL_lj_carith_powu64);
> -  else
> -#endif
> -    asm_callid(as, ir, IRCALL_lj_vm_powi);
> -}
> -
> -static void asm_div(ASMState *as, IRIns *ir)
> +static void asm_fpdiv(ASMState *as, IRIns *ir)
>  {
> -#if LJ_64 && LJ_HASFFI
> -  if (!irt_isnum(ir->t))
> -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_divi64 :
> -					  IRCALL_lj_carith_divu64);
> -  else
> -#endif
>  #if !LJ_SOFTFP
>      asm_fparith(as, ir, MIPSI_DIV_D);
>  #else
> -  asm_callid(as, ir, IRCALL_softfp_div);
> +    asm_callid(as, ir, IRCALL_softfp_div);
>  #endif
>  }
>  #endif
> @@ -1844,8 +1812,6 @@ static void asm_abs(ASMState *as, IRIns *ir)
>  }
>  #endif
>  
> -#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
> -
>  static void asm_arithov(ASMState *as, IRIns *ir)
>  {
>    /* TODO MIPSR6: bovc/bnvc. Caveat: no delay slot to load RID_TMP. */
> diff --git a/src/lj_asm_ppc.h b/src/lj_asm_ppc.h
> index 6aaed058..62a5c3e2 100644
> --- a/src/lj_asm_ppc.h
> +++ b/src/lj_asm_ppc.h
> @@ -1177,8 +1177,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>    ra_allockreg(as, (int32_t)(sz+sizeof(GCcdata)),
>  	       ra_releasetmp(as, ASMREF_TMP1));
>  }
> -#else
> -#define asm_cnew(as, ir)	((void)0)
>  #endif
>  
>  /* -- Write barriers ------------------------------------------------------ */
> @@ -1249,8 +1247,6 @@ static void asm_fpunary(ASMState *as, IRIns *ir, PPCIns pi)
>  
>  static void asm_fpmath(ASMState *as, IRIns *ir)
>  {
> -  if (ir->op2 == IRFPM_EXP2 && asm_fpjoin_pow(as, ir))
> -    return;
>    if (ir->op2 == IRFPM_SQRT && (as->flags & JIT_F_SQRT))
>      asm_fpunary(as, ir, PPCI_FSQRT);
>    else
> @@ -1364,9 +1360,7 @@ static void asm_mul(ASMState *as, IRIns *ir)
>    }
>  }
>  
> -#define asm_div(as, ir)		asm_fparith(as, ir, PPCI_FDIV)
> -#define asm_mod(as, ir)		asm_callid(as, ir, IRCALL_lj_vm_modi)
> -#define asm_pow(as, ir)		asm_callid(as, ir, IRCALL_lj_vm_powi)
> +#define asm_fpdiv(as, ir)	asm_fparith(as, ir, PPCI_FDIV)
>  
>  static void asm_neg(ASMState *as, IRIns *ir)
>  {
> @@ -1390,7 +1384,6 @@ static void asm_neg(ASMState *as, IRIns *ir)
>  }
>  
>  #define asm_abs(as, ir)		asm_fpunary(as, ir, PPCI_FABS)
> -#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
>  
>  static void asm_arithov(ASMState *as, IRIns *ir, PPCIns pi)
>  {
> diff --git a/src/lj_asm_x86.h b/src/lj_asm_x86.h
> index 63d332ca..5f5fe3cf 100644
> --- a/src/lj_asm_x86.h
> +++ b/src/lj_asm_x86.h
> @@ -1857,8 +1857,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>    asm_gencall(as, ci, args);
>    emit_loadi(as, ra_releasetmp(as, ASMREF_TMP1), (int32_t)(sz+sizeof(GCcdata)));
>  }
> -#else
> -#define asm_cnew(as, ir)	((void)0)
>  #endif
>  
>  /* -- Write barriers ------------------------------------------------------ */
> @@ -1964,8 +1962,6 @@ static void asm_fpmath(ASMState *as, IRIns *ir)
>  		    fpm == IRFPM_CEIL ? lj_vm_ceil_sse : lj_vm_trunc_sse);
>        ra_left(as, RID_XMM0, ir->op1);
>      }
> -  } else if (fpm == IRFPM_EXP2 && asm_fpjoin_pow(as, ir)) {
> -    /* Rejoined to pow(). */
>    } else {
>      asm_callid(as, ir, IRCALL_lj_vm_floor + fpm);
>    }
> @@ -2000,17 +1996,6 @@ static void asm_fppowi(ASMState *as, IRIns *ir)
>    ra_left(as, RID_EAX, ir->op2);
>  }
>  
> -static void asm_pow(ASMState *as, IRIns *ir)
> -{
> -#if LJ_64 && LJ_HASFFI
> -  if (!irt_isnum(ir->t))
> -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_powi64 :
> -					  IRCALL_lj_carith_powu64);
> -  else
> -#endif
> -    asm_fppowi(as, ir);
> -}
> -
>  static int asm_swapops(ASMState *as, IRIns *ir)
>  {
>    IRIns *irl = IR(ir->op1);
> @@ -2208,27 +2193,7 @@ static void asm_mul(ASMState *as, IRIns *ir)
>      asm_intarith(as, ir, XOg_X_IMUL);
>  }
>  
> -static void asm_div(ASMState *as, IRIns *ir)
> -{
> -#if LJ_64 && LJ_HASFFI
> -  if (!irt_isnum(ir->t))
> -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_divi64 :
> -					  IRCALL_lj_carith_divu64);
> -  else
> -#endif
> -    asm_fparith(as, ir, XO_DIVSD);
> -}
> -
> -static void asm_mod(ASMState *as, IRIns *ir)
> -{
> -#if LJ_64 && LJ_HASFFI
> -  if (!irt_isint(ir->t))
> -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_modi64 :
> -					  IRCALL_lj_carith_modu64);
> -  else
> -#endif
> -    asm_callid(as, ir, IRCALL_lj_vm_modi);
> -}
> +#define asm_fpdiv(as, ir)	asm_fparith(as, ir, XO_DIVSD)
>  
>  static void asm_neg_not(ASMState *as, IRIns *ir, x86Group3 xg)
>  {
> diff --git a/src/lj_ir.h b/src/lj_ir.h
> index e8bca275..43e55069 100644
> --- a/src/lj_ir.h
> +++ b/src/lj_ir.h
> @@ -177,7 +177,7 @@ LJ_STATIC_ASSERT((int)IR_XLOAD + IRDELTA_L2S == (int)IR_XSTORE);
>  /* FPMATH sub-functions. ORDER FPM. */
>  #define IRFPMDEF(_) \
>    _(FLOOR) _(CEIL) _(TRUNC)  /* Must be first and in this order. */ \
> -  _(SQRT) _(EXP2) _(LOG) _(LOG2) \
> +  _(SQRT) _(LOG) _(LOG2) \
>    _(OTHER)
>  
>  typedef enum {
> diff --git a/src/lj_ircall.h b/src/lj_ircall.h
> index bbad35b1..af064a6f 100644
> --- a/src/lj_ircall.h
> +++ b/src/lj_ircall.h
> @@ -192,7 +192,6 @@ typedef struct CCallInfo {
>    _(FPMATH,	lj_vm_ceil,		1,   N, NUM, XA_FP) \
>    _(FPMATH,	lj_vm_trunc,		1,   N, NUM, XA_FP) \
>    _(FPMATH,	sqrt,			1,   N, NUM, XA_FP) \
> -  _(ANY,	lj_vm_exp2,		1,   N, NUM, XA_FP) \
>    _(ANY,	log,			1,   N, NUM, XA_FP) \
>    _(ANY,	lj_vm_log2,		1,   N, NUM, XA_FP) \
>    _(ANY,	lj_vm_powi,		2,   N, NUM, XA_FP) \
> diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
> index 27e489af..cd803d87 100644
> --- a/src/lj_opt_fold.c
> +++ b/src/lj_opt_fold.c
> @@ -237,10 +237,11 @@ LJFOLDF(kfold_fpcall2)
>  }
>  
>  LJFOLD(POW KNUM KINT)
> +LJFOLD(POW KNUM KNUM)
>  LJFOLDF(kfold_numpow)
>  {
>    lua_Number a = knumleft;
> -  lua_Number b = (lua_Number)fright->i;
> +  lua_Number b = fright->o == IR_KINT ? (lua_Number)fright->i : knumright;
>    lua_Number y = lj_vm_foldarith(a, b, IR_POW - IR_ADD);
>    return lj_ir_knum(J, y);
>  }
> @@ -1077,7 +1078,7 @@ LJFOLDF(simplify_nummuldiv_negneg)
>  }
>  
>  LJFOLD(POW any KINT)
> -LJFOLDF(simplify_numpow_xk)
> +LJFOLDF(simplify_numpow_xkint)
>  {
>    int32_t k = fright->i;
>    TRef ref = fins->op1;
> @@ -1106,13 +1107,22 @@ LJFOLDF(simplify_numpow_xk)
>    return ref;
>  }
>  
> +LJFOLD(POW any KNUM)
> +LJFOLDF(simplify_numpow_xknum)
> +{
> +  if (knumright == 0.5)  /* x ^ 0.5 ==> sqrt(x) */
> +    return emitir(IRTN(IR_FPMATH), fins->op1, IRFPM_SQRT);
> +  return NEXTFOLD;
> +}
> +
>  LJFOLD(POW KNUM any)
>  LJFOLDF(simplify_numpow_kx)
>  {
>    lua_Number n = knumleft;
> -  if (n == 2.0) {  /* 2.0 ^ i ==> ldexp(1.0, tonum(i)) */
> -    fins->o = IR_CONV;
> +  if (n == 2.0 && irt_isint(fright->t)) {  /* 2.0 ^ i ==> ldexp(1.0, i) */
>  #if LJ_TARGET_X86ORX64
> +    /* Different IR_LDEXP calling convention on x86/x64 requires conversion. */
> +    fins->o = IR_CONV;
>      fins->op1 = fins->op2;
>      fins->op2 = IRCONV_NUM_INT;
>      fins->op2 = (IRRef1)lj_opt_fold(J);
> diff --git a/src/lj_opt_narrow.c b/src/lj_opt_narrow.c
> index bb61f97b..4f285334 100644
> --- a/src/lj_opt_narrow.c
> +++ b/src/lj_opt_narrow.c
> @@ -593,10 +593,10 @@ TRef lj_opt_narrow_pow(jit_State *J, TRef rb, TRef rc, TValue *vb, TValue *vc)
>    /* Narrowing must be unconditional to preserve (-x)^i semantics. */
>    if (tvisint(vc) || numisint(numV(vc))) {
>      int checkrange = 0;
> -    /* Split pow is faster for bigger exponents. But do this only for (+k)^i. */
> +    /* pow() is faster for bigger exponents. But do this only for (+k)^i. */
>      if (tref_isk(rb) && (int32_t)ir_knum(IR(tref_ref(rb)))->u32.hi >= 0) {
>        int32_t k = numberVint(vc);
> -      if (!(k >= -65536 && k <= 65536)) goto split_pow;
> +      if (!(k >= -65536 && k <= 65536)) goto force_pow_num;
>        checkrange = 1;
>      }
>      if (!tref_isinteger(rc)) {
> @@ -607,19 +607,11 @@ TRef lj_opt_narrow_pow(jit_State *J, TRef rb, TRef rc, TValue *vb, TValue *vc)
>        TRef tmp = emitir(IRTI(IR_ADD), rc, lj_ir_kint(J, 65536));
>        emitir(IRTGI(IR_ULE), tmp, lj_ir_kint(J, 2*65536));
>      }
> -    return emitir(IRTN(IR_POW), rb, rc);
> +  } else {
> +force_pow_num:
> +    rc = lj_ir_tonum(J, rc);  /* Want POW(num, num), not POW(num, int). */
>    }
> -split_pow:
> -  /* FOLD covers most cases, but some are easier to do here. */
> -  if (tref_isk(rb) && tvispone(ir_knum(IR(tref_ref(rb)))))
> -    return rb;  /* 1 ^ x ==> 1 */
> -  rc = lj_ir_tonum(J, rc);
> -  if (tref_isk(rc) && ir_knum(IR(tref_ref(rc)))->n == 0.5)
> -    return emitir(IRTN(IR_FPMATH), rb, IRFPM_SQRT);  /* x ^ 0.5 ==> sqrt(x) */
> -  /* Split up b^c into exp2(c*log2(b)). Assembler may rejoin later. */
> -  rb = emitir(IRTN(IR_FPMATH), rb, IRFPM_LOG2);
> -  rc = emitir(IRTN(IR_MUL), rb, rc);
> -  return emitir(IRTN(IR_FPMATH), rc, IRFPM_EXP2);
> +  return emitir(IRTN(IR_POW), rb, rc);
>  }
>  
>  /* -- Predictive narrowing of induction variables ------------------------- */
> diff --git a/src/lj_opt_split.c b/src/lj_opt_split.c
> index 2fc36b8d..c10a85cb 100644
> --- a/src/lj_opt_split.c
> +++ b/src/lj_opt_split.c
> @@ -403,27 +403,6 @@ static void split_ir(jit_State *J)
>  	hi = split_call_li(J, hisubst, oir, ir, IRCALL_lj_vm_powi);
>  	break;
>        case IR_FPMATH:
> -	/* Try to rejoin pow from EXP2, MUL and LOG2. */
> -	if (nir->op2 == IRFPM_EXP2 && nir->op1 > J->loopref) {
> -	  IRIns *irp = IR(nir->op1);
> -	  if (irp->o == IR_CALLN && irp->op2 == IRCALL_softfp_mul) {
> -	    IRIns *irm4 = IR(irp->op1);
> -	    IRIns *irm3 = IR(irm4->op1);
> -	    IRIns *irm12 = IR(irm3->op1);
> -	    IRIns *irl1 = IR(irm12->op1);
> -	    if (irm12->op1 > J->loopref && irl1->o == IR_CALLN &&
> -		irl1->op2 == IRCALL_lj_vm_log2) {
> -	      IRRef tmp = irl1->op1;  /* Recycle first two args from LOG2. */
> -	      IRRef arg3 = irm3->op2, arg4 = irm4->op2;
> -	      J->cur.nins--;
> -	      tmp = split_emit(J, IRT(IR_CARG, IRT_NIL), tmp, arg3);
> -	      tmp = split_emit(J, IRT(IR_CARG, IRT_NIL), tmp, arg4);
> -	      ir->prev = tmp = split_emit(J, IRTI(IR_CALLN), tmp, IRCALL_pow);
> -	      hi = split_emit(J, IRT(IR_HIOP, IRT_SOFTFP), tmp, tmp);
> -	      break;
> -	    }
> -	  }
> -	}
>  	hi = split_call_l(J, hisubst, oir, ir, IRCALL_lj_vm_floor + ir->op2);
>  	break;
>        case IR_LDEXP:
> diff --git a/src/lj_vm.h b/src/lj_vm.h
> index 411caafa..abaa7c52 100644
> --- a/src/lj_vm.h
> +++ b/src/lj_vm.h
> @@ -95,11 +95,6 @@ LJ_ASMF double lj_vm_trunc(double);
>  LJ_ASMF double lj_vm_trunc_sf(double);
>  #endif
>  #endif
> -#ifdef LUAJIT_NO_EXP2
> -LJ_ASMF double lj_vm_exp2(double);
> -#else
> -#define lj_vm_exp2	exp2
> -#endif
>  #if LJ_HASFFI
>  LJ_ASMF int lj_vm_errno(void);
>  #endif
> diff --git a/src/lj_vmmath.c b/src/lj_vmmath.c
> index ae4e0f15..9c0d3fde 100644
> --- a/src/lj_vmmath.c
> +++ b/src/lj_vmmath.c
> @@ -79,13 +79,6 @@ double lj_vm_log2(double a)
>  }
>  #endif
>  
> -#ifdef LUAJIT_NO_EXP2
> -double lj_vm_exp2(double a)
> -{
> -  return exp(a * 0.6931471805599453);
> -}
> -#endif
> -
>  #if !LJ_TARGET_X86ORX64
>  /* Unsigned x^k. */
>  static double lj_vm_powui(double x, uint32_t k)
> @@ -128,7 +121,6 @@ double lj_vm_foldfpm(double x, int fpm)
>    case IRFPM_CEIL: return lj_vm_ceil(x);
>    case IRFPM_TRUNC: return lj_vm_trunc(x);
>    case IRFPM_SQRT: return sqrt(x);
> -  case IRFPM_EXP2: return lj_vm_exp2(x);
>    case IRFPM_LOG: return log(x);
>    case IRFPM_LOG2: return lj_vm_log2(x);
>    default: lua_assert(0);
> diff --git a/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua b/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
> new file mode 100644
> index 00000000..21b3a0d9
> --- /dev/null
> +++ b/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
> @@ -0,0 +1,63 @@
> +local tap = require('tap')
> +-- Test to demonstrate the incorrect JIT behaviour when splitting
> +-- IR_POW.
> +-- See also https://github.com/LuaJIT/LuaJIT/issues/9.
> +local test = tap.test('lj-9-pow-inconsistencies'):skipcond({
> +  ['Test requires JIT enabled'] = not jit.status(),
> +})
> +
> +local nan = 0 / 0
> +local inf = math.huge
> +
> +-- Table with some corner cases to check:
> +local INTERESTING_VALUES = {
> +  -- 0, -0, 1, -1 special cases with nan, inf, etc..
> +  0, -0, 1, -1, nan, inf, -inf,
> +  -- x ^  inf = 0 (inf), if |x| < 1 (|x| > 1).
> +  -- x ^ -inf = inf (0), if |x| < 1 (|x| > 1).
> +  0.999999, 1.000001, -0.999999, -1.000001,
> +}
> +test:plan(1 + (#INTERESTING_VALUES) ^ 2)

I suggest renaming it to `CORNER_CASES`, since `INTERESTING_VALUES`
is not very formal.
Also, please mention that not all of the possible pairs are faulty
and most of them are left here for two reasons:
1. Improved readability.
2. More extensive and change-proof testing.
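Something along these lines (just a sketch, the exact wording is up to
you):

  -- XXX: Not every pair below is known to misbehave; most of the
  -- values are kept for readability and more change-proof coverage.
  local CORNER_CASES = {
    0, -0, 1, -1, nan, inf, -inf,
    0.999999, 1.000001, -0.999999, -1.000001,
  }
  test:plan(1 + (#CORNER_CASES) ^ 2)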

> +
> +jit.opt.start('hotloop=1')
> +
> +-- The JIT engine tries to split b^c to exp2(c * log2(b)).
> +-- For some cases for IEEE754 we can see, that
> +-- (double)exp2((double)log2(x)) != x, due to mathematical
> +-- functions accuracy and double precision restrictions.
> +-- Just use some numbers to observe this misbehaviour.
> +local res = {}
> +local cnt = 1
> +while cnt < 4 do
> +  -- XXX: use local variable to prevent folding via parser.
> +  local b = -0.90000000001
> +  res[cnt] = 1000 ^ b
> +  cnt = cnt + 1
> +end

Is there a specific reason you decided to use while over for?
> +
> +test:samevalues(res, 'consistent pow operator behaviour for corner case')
> +
> +-- Prevent JIT side effects for parent loops.
> +jit.off()
> +for i = 1, #INTERESTING_VALUES do
> +  for j = 1, #INTERESTING_VALUES do
> +    local b = INTERESTING_VALUES[i]
> +    local c = INTERESTING_VALUES[j]
> +    local results = {}
> +    local counter = 1
> +    jit.on()
> +    while counter < 4 do
> +      results[counter] = b ^ c
> +      counter = counter + 1
> +    end
Same question about for and while.
> +    -- Prevent JIT side effects.
> +    jit.off()
> +    jit.flush()
Also, I think we should move the part from jit.on() to jit.flush() into
a separate function.
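Roughly something like this (an untested sketch, the name is
arbitrary):

  local function pow_jit_results(b, c)
    local results = {}
    jit.on()
    for i = 1, 3 do
      results[i] = b ^ c
    end
    -- Prevent JIT side effects.
    jit.off()
    jit.flush()
    return results
  end

This would also answer the for/while question above, since the helper
can simply use a for loop.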
> +    test:samevalues(
> +      results,
> +      ('consistent pow operator behaviour for (%s)^(%s)'):format(b, c)
> +    )
> +  end
> +end
> +
> +test:done(true)
> -- 
> 2.41.0
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 3/5] Improve assertions.
  2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 3/5] Improve assertions Sergey Kaplun via Tarantool-patches
@ 2023-08-17 14:58   ` Maxim Kokryashkin via Tarantool-patches
  2023-08-18  7:56     ` Sergey Kaplun via Tarantool-patches
  2023-08-18 11:20   ` Sergey Bronnikov via Tarantool-patches
  1 sibling, 1 reply; 34+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2023-08-17 14:58 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi!
Thanks for the patch!
LGTM, except for a few comments below.

Side note: glad to see that you didn't forget to enable the
assertions we decided to replace with conventional ones, awesome work!
On Tue, Aug 15, 2023 at 12:36:29PM +0300, Sergey Kaplun wrote:
> From: Mike Pall <mike>
> 
> (cherry-picked from commit 8ae5170cdc9c307bd81019b3e014391c9fd00581)
> 
> This commit refactors assertions used in the LuaJIT. It introduces new
Typo: s/introduces/introduces the/
> module <src/lj_assert.c> with the `lj_assert_fail()` implementation.
> Wrappers of this function are used across the whole code base. Each
> macro wrapper is defined in the corresponding module gets global state
Typo: s/module/module and/
> (if possible) from its environment to be passed inside the assertion.
> For now, the global state is unused, but later it may be used for
> dumping of the VM state.
Typo: s/of the/the/
> 
> Sergey Kaplun:
> * added the description for the feature
> 
> Part of tarantool/tarantool#8825
> ---
>  src/CMakeLists.txt        |   1 +
>  src/Makefile.dep.original |  13 +--
>  src/Makefile.original     |   4 +-
>  src/lib_io.c              |   6 +-
>  src/lib_jit.c             |   4 +-
>  src/lib_misc.c            |  12 +--
>  src/lib_string.c          |   6 +-
>  src/lj_api.c              | 140 +++++++++++++++++---------------
>  src/lj_asm.c              | 130 ++++++++++++++++++------------
>  src/lj_asm_arm.h          | 119 ++++++++++++++++------------
>  src/lj_asm_arm64.h        |  95 ++++++++++++----------
>  src/lj_asm_mips.h         | 151 ++++++++++++++++++++---------------
>  src/lj_asm_ppc.h          | 113 +++++++++++++++-----------
>  src/lj_asm_x86.h          | 161 +++++++++++++++++++++----------------
>  src/lj_assert.c           |  28 +++++++
>  src/lj_bcread.c           |  20 ++---
>  src/lj_bcwrite.c          |  24 ++++--
>  src/lj_buf.c              |   4 +-
>  src/lj_carith.c           |  10 ++-
>  src/lj_ccall.c            |  19 +++--
>  src/lj_ccallback.c        |  42 +++++-----
>  src/lj_cconv.c            |  57 ++++++++------
>  src/lj_cconv.h            |   5 +-
>  src/lj_cdata.c            |  27 ++++---
>  src/lj_cdata.h            |   7 +-
>  src/lj_clib.c             |   6 +-
>  src/lj_cparse.c           |  25 +++---
>  src/lj_crecord.c          |  19 +++--
>  src/lj_ctype.c            |  13 +--
>  src/lj_ctype.h            |  14 +++-
>  src/lj_debug.c            |  18 +++--
>  src/lj_def.h              |  26 ++++--
>  src/lj_dispatch.c         |  11 ++-
>  src/lj_emit_arm.h         |  50 ++++++------
>  src/lj_emit_arm64.h       |  21 ++---
>  src/lj_emit_mips.h        |  22 +++---
>  src/lj_emit_ppc.h         |  12 +--
>  src/lj_emit_x86.h         |  22 +++---
>  src/lj_err.c              |  40 ++--------
>  src/lj_func.c             |  18 +++--
>  src/lj_gc.c               |  78 ++++++++++--------
>  src/lj_gc.h               |   6 +-
>  src/lj_gdbjit.c           |   5 +-
>  src/lj_ir.c               |  31 ++++----
>  src/lj_ir.h               |   5 +-
>  src/lj_jit.h              |   6 ++
>  src/lj_lex.c              |  14 ++--
>  src/lj_lex.h              |   6 ++
>  src/lj_load.c             |   2 +-
>  src/lj_mapi.c             |   2 +-
>  src/lj_mcode.c            |   2 +-
>  src/lj_memprof.c          |  35 ++++----
>  src/lj_meta.c             |   6 +-
>  src/lj_obj.h              |  35 +++++---
>  src/lj_opt_fold.c         |  88 ++++++++++++---------
>  src/lj_opt_loop.c         |   5 +-
>  src/lj_opt_mem.c          |  15 ++--
>  src/lj_opt_narrow.c       |  17 ++--
>  src/lj_opt_split.c        |  22 +++---
>  src/lj_parse.c            | 114 +++++++++++++++------------
>  src/lj_record.c           | 162 +++++++++++++++++++++++---------------
>  src/lj_snap.c             | 100 ++++++++++++++---------
>  src/lj_snap.h             |   3 +-
>  src/lj_state.c            |  18 +++--
>  src/lj_str.c              |   7 +-
>  src/lj_strfmt.c           |   4 +-
>  src/lj_strfmt.h           |   3 +-
>  src/lj_strfmt_num.c       |   6 +-
>  src/lj_strscan.c          |   9 ++-
>  src/lj_symtab.c           |  11 +--
>  src/lj_sysprof.c          |  31 ++++----
>  src/lj_tab.c              |  20 ++---
>  src/lj_target.h           |   3 +-
>  src/lj_trace.c            |  57 +++++++-------
>  src/lj_utils_leb128.c     |   5 +-
>  src/lj_vmmath.c           |   7 +-
>  src/lj_wbuf.c             |   3 +-
>  src/ljamalg.c             |   1 +
>  src/luaconf.h             |   2 +-
>  79 files changed, 1436 insertions(+), 1025 deletions(-)
>  create mode 100644 src/lj_assert.c
> 
> diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
> index feeccbde..03338306 100644
> --- a/src/CMakeLists.txt
> +++ b/src/CMakeLists.txt
> @@ -59,6 +59,7 @@ make_source_list(SOURCES_FRONTEND
>  make_source_list(SOURCES_UTILS
>    SOURCES
>      lj_alloc.c
> +    lj_assert.c
>      lj_char.c
>      lj_utils_leb128.c
>      lj_vmmath.c
> diff --git a/src/Makefile.dep.original b/src/Makefile.dep.original
> index 968805ed..d35b6d9a 100644
> --- a/src/Makefile.dep.original
> +++ b/src/Makefile.dep.original
> @@ -54,6 +54,7 @@ lj_asm.o: lj_asm.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h lj_gc.h \
>   lj_ircall.h lj_iropt.h lj_mcode.h lj_trace.h lj_dispatch.h lj_traceerr.h \
>   lj_snap.h lj_asm.h lj_vm.h lj_target.h lj_target_*.h lj_emit_*.h \
>   lj_asm_*.h
> +lj_assert.o: lj_assert.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h
>  lj_bc.o: lj_bc.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h lj_bc.h \
>   lj_bcdef.h
>  lj_bcread.o: lj_bcread.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
> @@ -164,7 +165,7 @@ lj_opt_loop.o: lj_opt_loop.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
>   lj_iropt.h lj_trace.h lj_dispatch.h lj_bc.h lj_traceerr.h lj_snap.h \
>   lj_vm.h
>  lj_opt_mem.o: lj_opt_mem.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
> - lj_tab.h lj_ir.h lj_jit.h lj_iropt.h lj_ircall.h
> + lj_tab.h lj_ir.h lj_jit.h lj_iropt.h lj_ircall.h lj_dispatch.h lj_bc.h
>  lj_opt_narrow.o: lj_opt_narrow.c lj_obj.h lua.h luaconf.h lj_def.h \
>   lj_arch.h lj_bc.h lj_ir.h lj_jit.h lj_iropt.h lj_trace.h lj_dispatch.h \
>   lj_traceerr.h lj_vm.h lj_strscan.h
> @@ -224,15 +225,17 @@ lj_trace.o: lj_trace.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
>   lmisclib.h lj_sysprof.h
>  lj_udata.o: lj_udata.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
>   lj_gc.h lj_udata.h
> -lj_utils_leb128.o: lj_utils_leb128.c lj_utils.h lj_def.h lua.h luaconf.h
> +lj_utils_leb128.o: lj_utils_leb128.c lj_utils.h lj_def.h lua.h luaconf.h \
> + lj_obj.h lj_arch.h
>  lj_vmevent.o: lj_vmevent.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
>   lj_str.h lj_tab.h lj_state.h lj_dispatch.h lj_bc.h lj_jit.h lj_ir.h \
>   lj_vm.h lj_vmevent.h
>  lj_vmmath.o: lj_vmmath.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
>   lj_ir.h lj_vm.h
> -lj_wbuf.o: lj_wbuf.c lj_wbuf.h lj_def.h lua.h luaconf.h lj_utils.h
> -ljamalg.o: ljamalg.c lua.h luaconf.h lauxlib.h lj_gc.c lj_obj.h lj_def.h \
> - lj_arch.h lj_gc.h lj_err.h lj_errmsg.h lj_buf.h lj_str.h lj_tab.h \
> +lj_wbuf.o: lj_wbuf.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
> + lj_wbuf.h lj_utils.h
> +ljamalg.o: ljamalg.c lua.h luaconf.h lauxlib.h lj_assert.c lj_obj.h lj_def.h \
> + lj_arch.h lj_gc.c lj_gc.h lj_err.h lj_errmsg.h lj_buf.h lj_str.h lj_tab.h \
>   lj_func.h lj_udata.h lj_meta.h lj_state.h lj_frame.h lj_bc.h lj_ctype.h \
>   lj_cdata.h lj_trace.h lj_jit.h lj_ir.h lj_dispatch.h lj_traceerr.h \
>   lj_vm.h lj_err.c lj_debug.h lj_ff.h lj_ffdef.h lj_strfmt.h lj_char.c \
> diff --git a/src/Makefile.original b/src/Makefile.original
> index 22d36a27..8cfe55c2 100644
> --- a/src/Makefile.original
> +++ b/src/Makefile.original
> @@ -499,8 +499,8 @@ LJLIB_O= lib_base.o lib_math.o lib_bit.o lib_string.o lib_table.o \
>  	 lib_misc.o
>  LJLIB_C= $(LJLIB_O:.o=.c)
>  
> -LJCORE_O= lj_gc.o lj_err.o lj_char.o lj_bc.o lj_obj.o lj_buf.o lj_wbuf.o \
> -	  lj_str.o lj_tab.o lj_func.o lj_udata.o lj_meta.o lj_debug.o \
> +LJCORE_O= lj_assert.o lj_gc.o lj_err.o lj_char.o lj_bc.o lj_obj.o lj_buf.o \
> +	  lj_wbuf.o lj_str.o lj_tab.o lj_func.o lj_udata.o lj_meta.o lj_debug.o \
>  	  lj_state.o lj_dispatch.o lj_vmevent.o lj_vmmath.o lj_strscan.o \
>  	  lj_strfmt.o lj_strfmt_num.o lj_api.o lj_mapi.o lj_profile.o \
>  	  lj_profile_timer.o lj_memprof.o lj_symtab.o lj_sysprof.o \
> diff --git a/src/lib_io.c b/src/lib_io.c
> index db995ae6..ef39e535 100644
> --- a/src/lib_io.c
> +++ b/src/lib_io.c
> @@ -101,9 +101,6 @@ static int io_file_close(lua_State *L, IOFileUD *iof)
>      stat = pclose(iof->fp);
>  #elif LJ_TARGET_WINDOWS && !LJ_TARGET_XBOXONE && !LJ_TARGET_UWP
>      stat = _pclose(iof->fp);
> -#else
> -    lua_assert(0);
> -    return 0;
>  #endif
>  #if LJ_52
>      iof->fp = NULL;
> @@ -112,7 +109,8 @@ static int io_file_close(lua_State *L, IOFileUD *iof)
>      ok = (stat != -1);
>  #endif
>    } else {
> -    lua_assert((iof->type & IOFILE_TYPE_MASK) == IOFILE_TYPE_STDF);
> +    lj_assertL((iof->type & IOFILE_TYPE_MASK) == IOFILE_TYPE_STDF,
> +	       "close of unknown FILE* type");
>      setnilV(L->top++);
>      lua_pushliteral(L, "cannot close standard file");
>      return 2;
> diff --git a/src/lib_jit.c b/src/lib_jit.c
> index 40aa2b51..b3c1c93c 100644
> --- a/src/lib_jit.c
> +++ b/src/lib_jit.c
> @@ -227,7 +227,7 @@ LJLIB_CF(jit_util_funcbc)
>    if (pc < pt->sizebc) {
>      BCIns ins = proto_bc(pt)[pc];
>      BCOp op = bc_op(ins);
> -    lua_assert(op < BC__MAX);
> +    lj_assertL(op < BC__MAX, "bad bytecode op %d", op);
>      setintV(L->top, ins);
>      setintV(L->top+1, lj_bc_mode[op]);
>      L->top += 2;
> @@ -491,7 +491,7 @@ static int jitopt_param(jit_State *J, const char *str)
>    int i;
>    for (i = 0; i < JIT_P__MAX; i++) {
>      size_t len = *(const uint8_t *)lst;
> -    lua_assert(len != 0);
> +    lj_assertJ(len != 0, "bad JIT_P_STRING");
>      if (strncmp(str, lst+1, len) == 0 && str[len] == '=') {
>        int32_t n = 0;
>        const char *p = &str[len+1];
> diff --git a/src/lib_misc.c b/src/lib_misc.c
> index 1913a622..ca1d1c75 100644
> --- a/src/lib_misc.c
> +++ b/src/lib_misc.c
> @@ -109,7 +109,7 @@ static size_t buffer_writer_default(const void **buf_addr, size_t len,
>    const void *data = *buf_addr;
>    size_t write_total = 0;
>  
> -  lua_assert(len <= STREAM_BUFFER_SIZE);
> +  lj_assertX(len <= STREAM_BUFFER_SIZE, "stream buffer overflow");
>  
>    for (;;) {
>      const ssize_t written = write(fd, data, len - write_total);
> @@ -127,7 +127,7 @@ static size_t buffer_writer_default(const void **buf_addr, size_t len,
>      }
>  
>      write_total += written;
> -    lua_assert(write_total <= len);
> +    lj_assertX(write_total <= len, "invalid stream buffer write");
>  
>      if (write_total == len)
>        break;
> @@ -168,7 +168,7 @@ static int on_stop_cb_default(void *opt, uint8_t *buf)
>  static int set_output_path(const char *path, struct luam_Sysprof_Options *opt) {
>    struct profile_ctx *ctx = opt->ctx;
>    int fd = 0;
> -  lua_assert(path != NULL);
> +  lj_assertX(path != NULL, "no file to open by sysprof");
>    fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
>    if(fd == -1) {
>      return PROFILE_ERRIO;
> @@ -280,7 +280,7 @@ static int sysprof_error(lua_State *L, int status)
>        return luaL_fileresult(L, 0, NULL);
>  #endif
>      default:
> -      lua_assert(0);
> +      lj_assertL(0, "bad sysprof error %d", status);
>        return 0;
>    }
>  }
> @@ -401,7 +401,7 @@ LJLIB_CF(misc_memprof_start)
>        return luaL_fileresult(L, 0, fname);
>  #endif
>      default:
> -      lua_assert(0);
> +      lj_assertL(0, "bad memprof error %d", memprof_status);
>        return 0;
>      }
>    }
> @@ -430,7 +430,7 @@ LJLIB_CF(misc_memprof_stop)
>        return luaL_fileresult(L, 0, NULL);
>  #endif
>      default:
> -      lua_assert(0);
> +      lj_assertL(0, "bad memprof error %d", status);
>        return 0;
>      }
>    }
> diff --git a/src/lib_string.c b/src/lib_string.c
> index 156dae66..9b9c369a 100644
> --- a/src/lib_string.c
> +++ b/src/lib_string.c
> @@ -136,7 +136,7 @@ LJLIB_CF(string_dump)
>  /* ------------------------------------------------------------------------ */
>  
>  /* macro to `unsign' a character */
> -#define uchar(c)        ((unsigned char)(c))
> +#define uchar(c)	((unsigned char)(c))
>  
>  #define CAP_UNFINISHED	(-1)
>  #define CAP_POSITION	(-2)
> @@ -645,7 +645,7 @@ static GCstr *string_fmt_tostring(lua_State *L, int arg, int retry)
>  {
>    TValue *o = L->base+arg-1;
>    cTValue *mo;
> -  lua_assert(o < L->top);  /* Caller already checks for existence. */
> +  lj_assertL(o < L->top, "bad usage");  /* Caller already checks for existence. */
>    if (LJ_LIKELY(tvisstr(o)))
>      return strV(o);
>    if (retry != 2 && !tvisnil(mo = lj_meta_lookup(L, o, MM_tostring))) {
> @@ -717,7 +717,7 @@ again:
>  	lj_strfmt_putptr(sb, lj_obj_ptr(G(L), L->base+arg-1));
>  	break;
>        default:
> -	lua_assert(0);
> +	lj_assertL(0, "bad string format type");
>  	break;
>        }
>      }
> diff --git a/src/lj_api.c b/src/lj_api.c
> index 89998815..05e02029 100644
> --- a/src/lj_api.c
> +++ b/src/lj_api.c
> @@ -28,8 +28,8 @@
>  
>  /* -- Common helper functions --------------------------------------------- */
>  
> -#define api_checknelems(L, n)		api_check(L, (n) <= (L->top - L->base))
> -#define api_checkvalidindex(L, i)	api_check(L, (i) != niltv(L))
> +#define lj_checkapi_slot(idx) \
> +  lj_checkapi((idx) <= (L->top - L->base), "stack slot %d out of range", (idx))
>  
>  static TValue *index2adr(lua_State *L, int idx)
>  {
> @@ -37,7 +37,8 @@ static TValue *index2adr(lua_State *L, int idx)
>      TValue *o = L->base + (idx - 1);
>      return o < L->top ? o : niltv(L);
>    } else if (idx > LUA_REGISTRYINDEX) {
> -    api_check(L, idx != 0 && -idx <= L->top - L->base);
> +    lj_checkapi(idx != 0 && -idx <= L->top - L->base,
> +		"bad stack slot %d", idx);
>      return L->top + idx;
>    } else if (idx == LUA_GLOBALSINDEX) {
>      TValue *o = &G(L)->tmptv;
> @@ -47,7 +48,8 @@ static TValue *index2adr(lua_State *L, int idx)
>      return registry(L);
>    } else {
>      GCfunc *fn = curr_func(L);
> -    api_check(L, fn->c.gct == ~LJ_TFUNC && !isluafunc(fn));
> +    lj_checkapi(fn->c.gct == ~LJ_TFUNC && !isluafunc(fn),
> +		"calling frame is not a C function");
>      if (idx == LUA_ENVIRONINDEX) {
>        TValue *o = &G(L)->tmptv;
>        settabV(L, o, tabref(fn->c.env));
> @@ -59,13 +61,27 @@ static TValue *index2adr(lua_State *L, int idx)
>    }
>  }
>  
> -static TValue *stkindex2adr(lua_State *L, int idx)
> +static LJ_AINLINE TValue *index2adr_check(lua_State *L, int idx)
> +{
> +  TValue *o = index2adr(L, idx);
> +  lj_checkapi(o != niltv(L), "invalid stack slot %d", idx);
> +  return o;
> +}
> +
> +static TValue *index2adr_stack(lua_State *L, int idx)
>  {
>    if (idx > 0) {
>      TValue *o = L->base + (idx - 1);
> +    if (o < L->top) {
> +      return o;
> +    } else {
> +      lj_checkapi(0, "invalid stack slot %d", idx);
> +      return niltv(L);
> +    }
>      return o < L->top ? o : niltv(L);
>    } else {
> -    api_check(L, idx != 0 && -idx <= L->top - L->base);
> +    lj_checkapi(idx != 0 && -idx <= L->top - L->base,
> +		"invalid stack slot %d", idx);
>      return L->top + idx;
>    }
>  }
> @@ -111,17 +127,17 @@ LUALIB_API void luaL_checkstack(lua_State *L, int size, const char *msg)
>      lj_err_callerv(L, LJ_ERR_STKOVM, msg);
>  }
>  
> -LUA_API void lua_xmove(lua_State *from, lua_State *to, int n)
> +LUA_API void lua_xmove(lua_State *L, lua_State *to, int n)
>  {
>    TValue *f, *t;
> -  if (from == to) return;
> -  api_checknelems(from, n);
> -  api_check(from, G(from) == G(to));
> +  if (L == to) return;
> +  lj_checkapi_slot(n);
> +  lj_checkapi(G(L) == G(to), "move across global states");
>    lj_state_checkstack(to, (MSize)n);
> -  f = from->top;
> +  f = L->top;
>    t = to->top = to->top + n;
>    while (--n >= 0) copyTV(to, --t, --f);
> -  from->top = f;
> +  L->top = f;
>  }
>  
>  LUA_API const lua_Number *lua_version(lua_State *L)
> @@ -141,7 +157,7 @@ LUA_API int lua_gettop(lua_State *L)
>  LUA_API void lua_settop(lua_State *L, int idx)
>  {
>    if (idx >= 0) {
> -    api_check(L, idx <= tvref(L->maxstack) - L->base);
> +    lj_checkapi(idx <= tvref(L->maxstack) - L->base, "bad stack slot %d", idx);
>      if (L->base + idx > L->top) {
>        if (L->base + idx >= tvref(L->maxstack))
>  	lj_state_growstack(L, (MSize)idx - (MSize)(L->top - L->base));
> @@ -150,23 +166,21 @@ LUA_API void lua_settop(lua_State *L, int idx)
>        L->top = L->base + idx;
>      }
>    } else {
> -    api_check(L, -(idx+1) <= (L->top - L->base));
> +    lj_checkapi(-(idx+1) <= (L->top - L->base), "bad stack slot %d", idx);
>      L->top += idx+1;  /* Shrinks top (idx < 0). */
>    }
>  }
>  
>  LUA_API void lua_remove(lua_State *L, int idx)
>  {
> -  TValue *p = stkindex2adr(L, idx);
> -  api_checkvalidindex(L, p);
> +  TValue *p = index2adr_stack(L, idx);
>    while (++p < L->top) copyTV(L, p-1, p);
>    L->top--;
>  }
>  
>  LUA_API void lua_insert(lua_State *L, int idx)
>  {
> -  TValue *q, *p = stkindex2adr(L, idx);
> -  api_checkvalidindex(L, p);
> +  TValue *q, *p = index2adr_stack(L, idx);
>    for (q = L->top; q > p; q--) copyTV(L, q, q-1);
>    copyTV(L, p, L->top);
>  }
> @@ -174,19 +188,18 @@ LUA_API void lua_insert(lua_State *L, int idx)
>  static void copy_slot(lua_State *L, TValue *f, int idx)
>  {
>    if (idx == LUA_GLOBALSINDEX) {
> -    api_check(L, tvistab(f));
> +    lj_checkapi(tvistab(f), "stack slot %d is not a table", idx);
>      /* NOBARRIER: A thread (i.e. L) is never black. */
>      setgcref(L->env, obj2gco(tabV(f)));
>    } else if (idx == LUA_ENVIRONINDEX) {
>      GCfunc *fn = curr_func(L);
>      if (fn->c.gct != ~LJ_TFUNC)
>        lj_err_msg(L, LJ_ERR_NOENV);
> -    api_check(L, tvistab(f));
> +    lj_checkapi(tvistab(f), "stack slot %d is not a table", idx);
>      setgcref(fn->c.env, obj2gco(tabV(f)));
>      lj_gc_barrier(L, fn, f);
>    } else {
> -    TValue *o = index2adr(L, idx);
> -    api_checkvalidindex(L, o);
> +    TValue *o = index2adr_check(L, idx);
>      copyTV(L, o, f);
>      if (idx < LUA_GLOBALSINDEX)  /* Need a barrier for upvalues. */
>        lj_gc_barrier(L, curr_func(L), f);
> @@ -195,7 +208,7 @@ static void copy_slot(lua_State *L, TValue *f, int idx)
>  
>  LUA_API void lua_replace(lua_State *L, int idx)
>  {
> -  api_checknelems(L, 1);
> +  lj_checkapi_slot(1);
>    copy_slot(L, L->top - 1, idx);
>    L->top--;
>  }
> @@ -231,7 +244,7 @@ LUA_API int lua_type(lua_State *L, int idx)
>  #else
>      int tt = (int)(((t < 8 ? 0x98042110u : 0x75a06u) >> 4*(t&7)) & 15u);
>  #endif
> -    lua_assert(tt != LUA_TNIL || tvisnil(o));
> +    lj_assertL(tt != LUA_TNIL || tvisnil(o), "bad tag conversion");
>      return tt;
>    }
>  }
> @@ -522,7 +535,7 @@ LUA_API const char *lua_tolstring(lua_State *L, int idx, size_t *len)
>  LUA_API uint32_t lua_hashstring(lua_State *L, int idx)
>  {
>    TValue *o = index2adr(L, idx);
> -  lua_assert(tvisstr(o));
> +  lj_checkapi(tvisstr(o), "stack slot %d is not a string", idx);
>    GCstr *s = strV(o);
>    if (! strsmart(s))
>      return s->hash;
> @@ -699,14 +712,14 @@ LUA_API void lua_pushcclosure(lua_State *L, lua_CFunction f, int n)
>  {
>    GCfunc *fn;
>    lj_gc_check(L);
> -  api_checknelems(L, n);
> +  lj_checkapi_slot(n);
>    fn = lj_func_newC(L, (MSize)n, getcurrenv(L));
>    fn->c.f = f;
>    L->top -= n;
>    while (n--)
>      copyTV(L, &fn->c.upvalue[n], L->top+n);
>    setfuncV(L, L->top, fn);
> -  lua_assert(iswhite(obj2gco(fn)));
> +  lj_assertL(iswhite(obj2gco(fn)), "new GC object is not white");
>    incr_top(L);
>  }
>  
> @@ -779,7 +792,7 @@ LUA_API void *lua_newuserdata(lua_State *L, size_t size)
>  
>  LUA_API void lua_concat(lua_State *L, int n)
>  {
> -  api_checknelems(L, n);
> +  lj_checkapi_slot(n);
>    if (n >= 2) {
>      n--;
>      do {
> @@ -805,9 +818,8 @@ LUA_API void lua_concat(lua_State *L, int n)
>  
>  LUA_API void lua_gettable(lua_State *L, int idx)
>  {
> -  cTValue *v, *t = index2adr(L, idx);
> -  api_checkvalidindex(L, t);
> -  v = lj_meta_tget(L, t, L->top-1);
> +  cTValue *t = index2adr_check(L, idx);
> +  cTValue *v = lj_meta_tget(L, t, L->top-1);
>    if (v == NULL) {
>      L->top += 2;
>      jit_secure_call(L, L->top-2, 1+1);
> @@ -819,9 +831,8 @@ LUA_API void lua_gettable(lua_State *L, int idx)
>  
>  LUA_API void lua_getfield(lua_State *L, int idx, const char *k)
>  {
> -  cTValue *v, *t = index2adr(L, idx);
> +  cTValue *v, *t = index2adr_check(L, idx);
>    TValue key;
> -  api_checkvalidindex(L, t);
>    setstrV(L, &key, lj_str_newz(L, k));
>    v = lj_meta_tget(L, t, &key);
>    if (v == NULL) {
> @@ -837,14 +848,14 @@ LUA_API void lua_getfield(lua_State *L, int idx, const char *k)
>  LUA_API void lua_rawget(lua_State *L, int idx)
>  {
>    cTValue *t = index2adr(L, idx);
> -  api_check(L, tvistab(t));
> +  lj_checkapi(tvistab(t), "stack slot %d is not a table", idx);
>    copyTV(L, L->top-1, lj_tab_get(L, tabV(t), L->top-1));
>  }
>  
>  LUA_API void lua_rawgeti(lua_State *L, int idx, int n)
>  {
>    cTValue *v, *t = index2adr(L, idx);
> -  api_check(L, tvistab(t));
> +  lj_checkapi(tvistab(t), "stack slot %d is not a table", idx);
>    v = lj_tab_getint(tabV(t), n);
>    if (v) {
>      copyTV(L, L->top, v);
> @@ -886,8 +897,7 @@ LUALIB_API int luaL_getmetafield(lua_State *L, int idx, const char *field)
>  
>  LUA_API void lua_getfenv(lua_State *L, int idx)
>  {
> -  cTValue *o = index2adr(L, idx);
> -  api_checkvalidindex(L, o);
> +  cTValue *o = index2adr_check(L, idx);
>    if (tvisfunc(o)) {
>      settabV(L, L->top, tabref(funcV(o)->c.env));
>    } else if (tvisudata(o)) {
> @@ -904,7 +914,7 @@ LUA_API int lua_next(lua_State *L, int idx)
>  {
>    cTValue *t = index2adr(L, idx);
>    int more;
> -  api_check(L, tvistab(t));
> +  lj_checkapi(tvistab(t), "stack slot %d is not a table", idx);
>    more = lj_tab_next(L, tabV(t), L->top-1);
>    if (more) {
>      incr_top(L);  /* Return new key and value slot. */
> @@ -930,7 +940,7 @@ LUA_API void *lua_upvalueid(lua_State *L, int idx, int n)
>  {
>    GCfunc *fn = funcV(index2adr(L, idx));
>    n--;
> -  api_check(L, (uint32_t)n < fn->l.nupvalues);
> +  lj_checkapi((uint32_t)n < fn->l.nupvalues, "bad upvalue %d", n);
>    return isluafunc(fn) ? (void *)gcref(fn->l.uvptr[n]) :
>  			 (void *)&fn->c.upvalue[n];
>  }
> @@ -940,8 +950,10 @@ LUA_API void lua_upvaluejoin(lua_State *L, int idx1, int n1, int idx2, int n2)
>    GCfunc *fn1 = funcV(index2adr(L, idx1));
>    GCfunc *fn2 = funcV(index2adr(L, idx2));
>    n1--; n2--;
> -  api_check(L, isluafunc(fn1) && (uint32_t)n1 < fn1->l.nupvalues);
> -  api_check(L, isluafunc(fn2) && (uint32_t)n2 < fn2->l.nupvalues);
> +  lj_checkapi(isluafunc(fn1), "stack slot %d is not a Lua function", idx1);
> +  lj_checkapi(isluafunc(fn2), "stack slot %d is not a Lua function", idx2);
> +  lj_checkapi((uint32_t)n1 < fn1->l.nupvalues, "bad upvalue %d", n1+1);
> +  lj_checkapi((uint32_t)n2 < fn2->l.nupvalues, "bad upvalue %d", n2+1);
>    setgcrefr(fn1->l.uvptr[n1], fn2->l.uvptr[n2]);
>    lj_gc_objbarrier(L, fn1, gcref(fn1->l.uvptr[n1]));
>  }
> @@ -970,9 +982,8 @@ LUALIB_API void *luaL_checkudata(lua_State *L, int idx, const char *tname)
>  LUA_API void lua_settable(lua_State *L, int idx)
>  {
>    TValue *o;
> -  cTValue *t = index2adr(L, idx);
> -  api_checknelems(L, 2);
> -  api_checkvalidindex(L, t);
> +  cTValue *t = index2adr_check(L, idx);
> +  lj_checkapi_slot(2);
>    o = lj_meta_tset(L, t, L->top-2);
>    if (o) {
>      /* NOBARRIER: lj_meta_tset ensures the table is not black. */
> @@ -991,9 +1002,8 @@ LUA_API void lua_setfield(lua_State *L, int idx, const char *k)
>  {
>    TValue *o;
>    TValue key;
> -  cTValue *t = index2adr(L, idx);
> -  api_checknelems(L, 1);
> -  api_checkvalidindex(L, t);
> +  cTValue *t = index2adr_check(L, idx);
> +  lj_checkapi_slot(1);
>    setstrV(L, &key, lj_str_newz(L, k));
>    o = lj_meta_tset(L, t, &key);
>    if (o) {
> @@ -1012,7 +1022,7 @@ LUA_API void lua_rawset(lua_State *L, int idx)
>  {
>    GCtab *t = tabV(index2adr(L, idx));
>    TValue *dst, *key;
> -  api_checknelems(L, 2);
> +  lj_checkapi_slot(2);
>    key = L->top-2;
>    dst = lj_tab_set(L, t, key);
>    copyTV(L, dst, key+1);
> @@ -1024,7 +1034,7 @@ LUA_API void lua_rawseti(lua_State *L, int idx, int n)
>  {
>    GCtab *t = tabV(index2adr(L, idx));
>    TValue *dst, *src;
> -  api_checknelems(L, 1);
> +  lj_checkapi_slot(1);
>    dst = lj_tab_setint(L, t, n);
>    src = L->top-1;
>    copyTV(L, dst, src);
> @@ -1036,13 +1046,12 @@ LUA_API int lua_setmetatable(lua_State *L, int idx)
>  {
>    global_State *g;
>    GCtab *mt;
> -  cTValue *o = index2adr(L, idx);
> -  api_checknelems(L, 1);
> -  api_checkvalidindex(L, o);
> +  cTValue *o = index2adr_check(L, idx);
> +  lj_checkapi_slot(1);
>    if (tvisnil(L->top-1)) {
>      mt = NULL;
>    } else {
> -    api_check(L, tvistab(L->top-1));
> +    lj_checkapi(tvistab(L->top-1), "top stack slot is not a table");
>      mt = tabV(L->top-1);
>    }
>    g = G(L);
> @@ -1079,11 +1088,10 @@ LUALIB_API void luaL_setmetatable(lua_State *L, const char *tname)
>  
>  LUA_API int lua_setfenv(lua_State *L, int idx)
>  {
> -  cTValue *o = index2adr(L, idx);
> +  cTValue *o = index2adr_check(L, idx);
>    GCtab *t;
> -  api_checknelems(L, 1);
> -  api_checkvalidindex(L, o);
> -  api_check(L, tvistab(L->top-1));
> +  lj_checkapi_slot(1);
> +  lj_checkapi(tvistab(L->top-1), "top stack slot is not a table");
>    t = tabV(L->top-1);
>    if (tvisfunc(o)) {
>      setgcref(funcV(o)->c.env, obj2gco(t));
> @@ -1106,7 +1114,7 @@ LUA_API const char *lua_setupvalue(lua_State *L, int idx, int n)
>    TValue *val;
>    GCobj *o;
>    const char *name;
> -  api_checknelems(L, 1);
> +  lj_checkapi_slot(1);
>    name = lj_debug_uvnamev(f, (uint32_t)(n-1), &val, &o);
>    if (name) {
>      L->top--;
> @@ -1133,8 +1141,9 @@ static TValue *api_call_base(lua_State *L, int nargs)
>  
>  LUA_API void lua_call(lua_State *L, int nargs, int nresults)
>  {
> -  api_check(L, L->status == LUA_OK || L->status == LUA_ERRERR);
> -  api_checknelems(L, nargs+1);
> +  lj_checkapi(L->status == LUA_OK || L->status == LUA_ERRERR,
> +	      "thread called in wrong state %d", L->status);
> +  lj_checkapi_slot(nargs+1);
>    jit_secure_call(L, api_call_base(L, nargs), nresults+1);
>  }
>  
> @@ -1144,13 +1153,13 @@ LUA_API int lua_pcall(lua_State *L, int nargs, int nresults, int errfunc)
>    uint8_t oldh = hook_save(g);
>    ptrdiff_t ef;
>    int status;
> -  api_check(L, L->status == LUA_OK || L->status == LUA_ERRERR);
> -  api_checknelems(L, nargs+1);
> +  lj_checkapi(L->status == LUA_OK || L->status == LUA_ERRERR,
> +	      "thread called in wrong state %d", L->status);
> +  lj_checkapi_slot(nargs+1);
>    if (errfunc == 0) {
>      ef = 0;
>    } else {
> -    cTValue *o = stkindex2adr(L, errfunc);
> -    api_checkvalidindex(L, o);
> +    cTValue *o = index2adr_stack(L, errfunc);
>      ef = savestack(L, o);
>    }
>    /* Forbid Lua world re-entrancy while running the trace */
> @@ -1186,7 +1195,8 @@ LUA_API int lua_cpcall(lua_State *L, lua_CFunction func, void *ud)
>    global_State *g = G(L);
>    uint8_t oldh = hook_save(g);
>    int status;
> -  api_check(L, L->status == LUA_OK || L->status == LUA_ERRERR);
> +  lj_checkapi(L->status == LUA_OK || L->status == LUA_ERRERR,
> +	      "thread called in wrong state %d", L->status);
>    /* Forbid Lua world re-entrancy while running the trace */
>    if (tvref(g->jit_base)) {
>      setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL));
> diff --git a/src/lj_asm.c b/src/lj_asm.c
> index a6906b19..d71fa8c8 100644
> --- a/src/lj_asm.c
> +++ b/src/lj_asm.c
> @@ -100,6 +100,12 @@ typedef struct ASMState {
>    uint16_t parentmap[LJ_MAX_JSLOTS];  /* Parent instruction to RegSP map. */
>  } ASMState;
>  
> +#ifdef LUA_USE_ASSERT
> +#define lj_assertA(c, ...)	lj_assertG_(J2G(as->J), (c), __VA_ARGS__)
> +#else
> +#define lj_assertA(c, ...)	((void)as)
> +#endif
> +
>  #define IR(ref)			(&as->ir[(ref)])
>  
>  #define ASMREF_TMP1		REF_TRUE	/* Temp. register. */
> @@ -131,9 +137,8 @@ static LJ_AINLINE void checkmclim(ASMState *as)
>  #ifdef LUA_USE_ASSERT
>    if (as->mcp + MCLIM_REDZONE < as->mcp_prev) {
>      IRIns *ir = IR(as->curins+1);
> -    fprintf(stderr, "RED ZONE OVERFLOW: %p IR %04d  %02d %04d %04d\n", as->mcp,
> -	    as->curins+1-REF_BIAS, ir->o, ir->op1-REF_BIAS, ir->op2-REF_BIAS);
> -    lua_assert(0);
> +    lj_assertA(0, "red zone overflow: %p IR %04d  %02d %04d %04d\n", as->mcp,
> +      as->curins+1-REF_BIAS, ir->o, ir->op1-REF_BIAS, ir->op2-REF_BIAS);
>    }
>  #endif
>    if (LJ_UNLIKELY(as->mcp < as->mclim)) asm_mclimit(as);
> @@ -247,7 +252,7 @@ static void ra_dprintf(ASMState *as, const char *fmt, ...)
>  	  *p++ = *q >= 'A' && *q <= 'Z' ? *q + 0x20 : *q;
>        } else {
>  	*p++ = '?';
> -	lua_assert(0);
> +	lj_assertA(0, "bad register %d for debug format \"%s\"", r, fmt);
>        }
>      } else if (e[1] == 'f' || e[1] == 'i') {
>        IRRef ref;
> @@ -265,7 +270,7 @@ static void ra_dprintf(ASMState *as, const char *fmt, ...)
>      } else if (e[1] == 'x') {
>        p += sprintf(p, "%08x", va_arg(argp, int32_t));
>      } else {
> -      lua_assert(0);
> +      lj_assertA(0, "bad debug format code");
>      }
>      fmt = e+2;
>    }
> @@ -324,7 +329,7 @@ static Reg ra_rematk(ASMState *as, IRRef ref)
>    Reg r;
>    if (ra_iskref(ref)) {
>      r = ra_krefreg(ref);
> -    lua_assert(!rset_test(as->freeset, r));
> +    lj_assertA(!rset_test(as->freeset, r), "rematk of free reg %d", r);
>      ra_free(as, r);
>      ra_modified(as, r);
>  #if LJ_64
> @@ -336,7 +341,9 @@ static Reg ra_rematk(ASMState *as, IRRef ref)
>    }
>    ir = IR(ref);
>    r = ir->r;
> -  lua_assert(ra_hasreg(r) && !ra_hasspill(ir->s));
> +  lj_assertA(ra_hasreg(r), "rematk of K%03d has no reg", REF_BIAS - ref);
> +  lj_assertA(!ra_hasspill(ir->s),
> +	     "rematk of K%03d has spill slot [%x]", REF_BIAS - ref, ir->s);
>    ra_free(as, r);
>    ra_modified(as, r);
>    ir->r = RID_INIT;  /* Do not keep any hint. */
> @@ -350,7 +357,8 @@ static Reg ra_rematk(ASMState *as, IRRef ref)
>      ra_sethint(ir->r, RID_BASE);  /* Restore BASE register hint. */
>      emit_getgl(as, r, jit_base);
>    } else if (emit_canremat(ASMREF_L) && ir->o == IR_KPRI) {
> -    lua_assert(irt_isnil(ir->t));  /* REF_NIL stores ASMREF_L register. */
> +    /* REF_NIL stores ASMREF_L register. */
> +    lj_assertA(irt_isnil(ir->t), "rematk of bad ASMREF_L");
>      emit_getgl(as, r, cur_L);
>  #if LJ_64
>    } else if (ir->o == IR_KINT64) {
> @@ -363,8 +371,9 @@ static Reg ra_rematk(ASMState *as, IRRef ref)
>  #endif
>  #endif
>    } else {
> -    lua_assert(ir->o == IR_KINT || ir->o == IR_KGC ||
> -	       ir->o == IR_KPTR || ir->o == IR_KKPTR || ir->o == IR_KNULL);
> +    lj_assertA(ir->o == IR_KINT || ir->o == IR_KGC ||
> +	       ir->o == IR_KPTR || ir->o == IR_KKPTR || ir->o == IR_KNULL,
> +	       "rematk of bad IR op %d", ir->o);
>      emit_loadi(as, r, ir->i);
>    }
>    return r;
> @@ -374,7 +383,8 @@ static Reg ra_rematk(ASMState *as, IRRef ref)
>  static int32_t ra_spill(ASMState *as, IRIns *ir)
>  {
>    int32_t slot = ir->s;
> -  lua_assert(ir >= as->ir + REF_TRUE);
> +  lj_assertA(ir >= as->ir + REF_TRUE,
> +	     "spill of K%03d", REF_BIAS - (int)(ir - as->ir));
>    if (!ra_hasspill(slot)) {
>      if (irt_is64(ir->t)) {
>        slot = as->evenspill;
> @@ -399,7 +409,9 @@ static Reg ra_releasetmp(ASMState *as, IRRef ref)
>  {
>    IRIns *ir = IR(ref);
>    Reg r = ir->r;
> -  lua_assert(ra_hasreg(r) && !ra_hasspill(ir->s));
> +  lj_assertA(ra_hasreg(r), "release of TMP%d has no reg", ref-ASMREF_TMP1+1);
> +  lj_assertA(!ra_hasspill(ir->s),
> +	     "release of TMP%d has spill slot [%x]", ref-ASMREF_TMP1+1, ir->s);
>    ra_free(as, r);
>    ra_modified(as, r);
>    ir->r = RID_INIT;
> @@ -415,7 +427,7 @@ static Reg ra_restore(ASMState *as, IRRef ref)
>      IRIns *ir = IR(ref);
>      int32_t ofs = ra_spill(as, ir);  /* Force a spill slot. */
>      Reg r = ir->r;
> -    lua_assert(ra_hasreg(r));
> +    lj_assertA(ra_hasreg(r), "restore of IR %04d has no reg", ref - REF_BIAS);
>      ra_sethint(ir->r, r);  /* Keep hint. */
>      ra_free(as, r);
>      if (!rset_test(as->weakset, r)) {  /* Only restore non-weak references. */
> @@ -444,14 +456,15 @@ static Reg ra_evict(ASMState *as, RegSet allow)
>  {
>    IRRef ref;
>    RegCost cost = ~(RegCost)0;
> -  lua_assert(allow != RSET_EMPTY);
> +  lj_assertA(allow != RSET_EMPTY, "evict from empty set");
>    if (RID_NUM_FPR == 0 || allow < RID2RSET(RID_MAX_GPR)) {
>      GPRDEF(MINCOST)
>    } else {
>      FPRDEF(MINCOST)
>    }
>    ref = regcost_ref(cost);
> -  lua_assert(ra_iskref(ref) || (ref >= as->T->nk && ref < as->T->nins));
> +  lj_assertA(ra_iskref(ref) || (ref >= as->T->nk && ref < as->T->nins),
> +	     "evict of out-of-range IR %04d", ref - REF_BIAS);
>    /* Preferably pick any weak ref instead of a non-weak, non-const ref. */
>    if (!irref_isk(ref) && (as->weakset & allow)) {
>      IRIns *ir = IR(ref);
> @@ -609,7 +622,8 @@ static Reg ra_allocref(ASMState *as, IRRef ref, RegSet allow)
>    IRIns *ir = IR(ref);
>    RegSet pick = as->freeset & allow;
>    Reg r;
> -  lua_assert(ra_noreg(ir->r));
> +  lj_assertA(ra_noreg(ir->r),
> +	     "IR %04d already has reg %d", ref - REF_BIAS, ir->r);
>    if (pick) {
>      /* First check register hint from propagation or PHI. */
>      if (ra_hashint(ir->r)) {
> @@ -673,8 +687,10 @@ static void ra_rename(ASMState *as, Reg down, Reg up)
>    IRIns *ir = IR(ref);
>    ir->r = (uint8_t)up;
>    as->cost[down] = 0;
> -  lua_assert((down < RID_MAX_GPR) == (up < RID_MAX_GPR));
> -  lua_assert(!rset_test(as->freeset, down) && rset_test(as->freeset, up));
> +  lj_assertA((down < RID_MAX_GPR) == (up < RID_MAX_GPR),
> +	     "rename between GPR/FPR %d and %d", down, up);
> +  lj_assertA(!rset_test(as->freeset, down), "rename from free reg %d", down);
> +  lj_assertA(rset_test(as->freeset, up), "rename to non-free reg %d", up);
>    ra_free(as, down);  /* 'down' is free ... */
>    ra_modified(as, down);
>    rset_clear(as->freeset, up);  /* ... and 'up' is now allocated. */
> @@ -722,7 +738,7 @@ static void ra_destreg(ASMState *as, IRIns *ir, Reg r)
>  {
>    Reg dest = ra_dest(as, ir, RID2RSET(r));
>    if (dest != r) {
> -    lua_assert(rset_test(as->freeset, r));
> +    lj_assertA(rset_test(as->freeset, r), "dest reg %d is not free", r);
>      ra_modified(as, r);
>      emit_movrr(as, ir, dest, r);
>    }
> @@ -755,8 +771,9 @@ static void ra_left(ASMState *as, Reg dest, IRRef lref)
>  #endif
>  #endif
>        } else if (ir->o != IR_KPRI) {
> -	lua_assert(ir->o == IR_KINT || ir->o == IR_KGC ||
> -		   ir->o == IR_KPTR || ir->o == IR_KKPTR || ir->o == IR_KNULL);
> +	lj_assertA(ir->o == IR_KINT || ir->o == IR_KGC ||
> +		   ir->o == IR_KPTR || ir->o == IR_KKPTR || ir->o == IR_KNULL,
> +		   "K%03d has bad IR op %d", REF_BIAS - lref, ir->o);
>  	emit_loadi(as, dest, ir->i);
>  	return;
>        }
> @@ -901,11 +918,14 @@ static void asm_snap_alloc1(ASMState *as, IRRef ref)
>  #endif
>        {  /* Allocate stored values for TNEW, TDUP and CNEW. */
>  	IRIns *irs;
> -	lua_assert(ir->o == IR_TNEW || ir->o == IR_TDUP || ir->o == IR_CNEW);
> +	lj_assertA(ir->o == IR_TNEW || ir->o == IR_TDUP || ir->o == IR_CNEW,
> +		   "sink of IR %04d has bad op %d", ref - REF_BIAS, ir->o);
>  	for (irs = IR(as->snapref-1); irs > ir; irs--)
>  	  if (irs->r == RID_SINK && asm_sunk_store(as, ir, irs)) {
> -	    lua_assert(irs->o == IR_ASTORE || irs->o == IR_HSTORE ||
> -		       irs->o == IR_FSTORE || irs->o == IR_XSTORE);
> +	    lj_assertA(irs->o == IR_ASTORE || irs->o == IR_HSTORE ||
> +		       irs->o == IR_FSTORE || irs->o == IR_XSTORE,
> +		       "sunk store IR %04d has bad op %d",
> +		       (int)(irs - as->ir) - REF_BIAS, irs->o);
>  	    asm_snap_alloc1(as, irs->op2);
>  	    if (LJ_32 && (irs+1)->o == IR_HIOP)
>  	      asm_snap_alloc1(as, (irs+1)->op2);
> @@ -953,15 +973,9 @@ static void asm_snap_alloc(ASMState *as, int snapno)
>      if (!irref_isk(ref)) {
>        asm_snap_alloc1(as, ref);
>        if (LJ_SOFTFP && (sn & SNAP_SOFTFPNUM)) {
> -	/*
> -	** FIXME: The following assert was replaced with
> -	** the conventional `lua_assert`.
> -	**
> -	** lj_assertA(irt_type(IR(ref+1)->t) == IRT_SOFTFP,
> -	** "snap %d[%d] points to bad SOFTFP IR %04d",
> -	** snapno, n, ref - REF_BIAS);
> -	*/
> -	lua_assert(irt_type(IR(ref+1)->t) == IRT_SOFTFP);
> +	lj_assertA(irt_type(IR(ref+1)->t) == IRT_SOFTFP,
> +		   "snap %d[%d] points to bad SOFTFP IR %04d",
> +		   snapno, n, ref - REF_BIAS);
>  	asm_snap_alloc1(as, ref+1);
>        }
>      }
> @@ -1045,19 +1059,20 @@ static int32_t asm_stack_adjust(ASMState *as)
>  }
>  
>  /* Must match with hash*() in lj_tab.c. */
> -static uint32_t ir_khash(IRIns *ir)
> +static uint32_t ir_khash(ASMState *as, IRIns *ir)
>  {
>    uint32_t lo, hi;
> +  UNUSED(as);
>    if (irt_isstr(ir->t)) {
>      return ir_kstr(ir)->hash;
>    } else if (irt_isnum(ir->t)) {
>      lo = ir_knum(ir)->u32.lo;
>      hi = ir_knum(ir)->u32.hi << 1;
>    } else if (irt_ispri(ir->t)) {
> -    lua_assert(!irt_isnil(ir->t));
> +    lj_assertA(!irt_isnil(ir->t), "hash of nil key");
>      return irt_type(ir->t)-IRT_FALSE;
>    } else {
> -    lua_assert(irt_isgcv(ir->t));
> +    lj_assertA(irt_isgcv(ir->t), "hash of bad IR type %d", irt_type(ir->t));
>      lo = u32ptr(ir_kgc(ir));
>  #if LJ_GC64
>      hi = (uint32_t)(u64ptr(ir_kgc(ir)) >> 32) | (irt_toitype(ir->t) << 15);
> @@ -1168,7 +1183,8 @@ static void asm_bufput(ASMState *as, IRIns *ir)
>    args[0] = ir->op1;  /* SBuf * */
>    args[1] = ir->op2;  /* GCstr * */
>    irs = IR(ir->op2);
> -  lua_assert(irt_isstr(irs->t));
> +  lj_assertA(irt_isstr(irs->t),
> +	     "BUFPUT of non-string IR %04d", ir->op2 - REF_BIAS);
>    if (irs->o == IR_KGC) {
>      GCstr *s = ir_kstr(irs);
>      if (s->len == 1) {  /* Optimize put of single-char string constant. */
> @@ -1182,7 +1198,8 @@ static void asm_bufput(ASMState *as, IRIns *ir)
>  	args[1] = ASMREF_TMP1;  /* TValue * */
>  	ci = &lj_ir_callinfo[IRCALL_lj_strfmt_putnum];
>        } else {
> -	lua_assert(irt_isinteger(IR(irs->op1)->t));
> +	lj_assertA(irt_isinteger(IR(irs->op1)->t),
> +		   "TOSTR of non-numeric IR %04d", irs->op1);
>  	args[1] = irs->op1;  /* int */
>  	if (irs->op2 == IRTOSTR_INT)
>  	  ci = &lj_ir_callinfo[IRCALL_lj_strfmt_putint];
> @@ -1248,7 +1265,8 @@ static void asm_conv64(ASMState *as, IRIns *ir)
>    IRType dt = (((ir-1)->op2 & IRCONV_DSTMASK) >> IRCONV_DSH);
>    IRCallID id;
>    IRRef args[2];
> -  lua_assert((ir-1)->o == IR_CONV && ir->o == IR_HIOP);
> +  lj_assertA((ir-1)->o == IR_CONV && ir->o == IR_HIOP,
> +	     "not a CONV/HIOP pair at IR %04d", (int)(ir - as->ir) - REF_BIAS);
>    args[LJ_BE] = (ir-1)->op1;
>    args[LJ_LE] = ir->op1;
>    if (st == IRT_NUM || st == IRT_FLOAT) {
> @@ -1304,15 +1322,16 @@ static void asm_collectargs(ASMState *as, IRIns *ir,
>  			    const CCallInfo *ci, IRRef *args)
>  {
>    uint32_t n = CCI_XNARGS(ci);
> -  lua_assert(n <= CCI_NARGS_MAX*2);  /* Account for split args. */
> +  /* Account for split args. */
> +  lj_assertA(n <= CCI_NARGS_MAX*2, "too many args %d to collect", n);
>    if ((ci->flags & CCI_L)) { *args++ = ASMREF_L; n--; }
>    while (n-- > 1) {
>      ir = IR(ir->op1);
> -    lua_assert(ir->o == IR_CARG);
> +    lj_assertA(ir->o == IR_CARG, "malformed CALL arg tree");
>      args[n] = ir->op2 == REF_NIL ? 0 : ir->op2;
>    }
>    args[0] = ir->op1 == REF_NIL ? 0 : ir->op1;
> -  lua_assert(IR(ir->op1)->o != IR_CARG);
> +  lj_assertA(IR(ir->op1)->o != IR_CARG, "malformed CALL arg tree");
>  }
>  
>  /* Reconstruct CCallInfo flags for CALLX*. */
> @@ -1690,7 +1709,10 @@ static void asm_ir(ASMState *as, IRIns *ir)
>    switch ((IROp)ir->o) {
>    /* Miscellaneous ops. */
>    case IR_LOOP: asm_loop(as); break;
> -  case IR_NOP: case IR_XBAR: lua_assert(!ra_used(ir)); break;
> +  case IR_NOP: case IR_XBAR:
> +    lj_assertA(!ra_used(ir),
> +	       "IR %04d not unused", (int)(ir - as->ir) - REF_BIAS);
> +    break;
>    case IR_USE:
>      ra_alloc1(as, ir->op1, irt_isfp(ir->t) ? RSET_FPR : RSET_GPR); break;
>    case IR_PHI: asm_phi(as, ir); break;
> @@ -1729,7 +1751,9 @@ static void asm_ir(ASMState *as, IRIns *ir)
>  #if LJ_SOFTFP32
>    case IR_DIV: case IR_POW: case IR_ABS:
>    case IR_LDEXP: case IR_FPMATH: case IR_TOBIT:
> -    lua_assert(0);  /* Unused for LJ_SOFTFP32. */
> +    /* Unused for LJ_SOFTFP32. */
> +    lj_assertA(0, "IR %04d with unused op %d",
> +		  (int)(ir - as->ir) - REF_BIAS, ir->o);
>      break;
>  #else
>    case IR_DIV: asm_div(as, ir); break;
> @@ -1777,7 +1801,8 @@ static void asm_ir(ASMState *as, IRIns *ir)
>  #if LJ_HASFFI
>      asm_cnew(as, ir);
>  #else
> -    lua_assert(0);
> +    lj_assertA(0, "IR %04d with unused op %d",
> +		  (int)(ir - as->ir) - REF_BIAS, ir->o);
>  #endif
>      break;
>  
> @@ -1854,8 +1879,10 @@ static void asm_head_side(ASMState *as)
>    for (i = as->stopins; i > REF_BASE; i--) {
>      IRIns *ir = IR(i);
>      RegSP rs;
> -    lua_assert((ir->o == IR_SLOAD && (ir->op2 & IRSLOAD_PARENT)) ||
> -	       (LJ_SOFTFP && ir->o == IR_HIOP) || ir->o == IR_PVAL);
> +    lj_assertA((ir->o == IR_SLOAD && (ir->op2 & IRSLOAD_PARENT)) ||
> +	       (LJ_SOFTFP && ir->o == IR_HIOP) || ir->o == IR_PVAL,
> +	       "IR %04d has bad parent op %d",
> +	       (int)(ir - as->ir) - REF_BIAS, ir->o);
>      rs = as->parentmap[i - REF_FIRST];
>      if (ra_hasreg(ir->r)) {
>        rset_clear(allow, ir->r);
> @@ -2115,7 +2142,7 @@ static void asm_setup_regsp(ASMState *as)
>    ir = IR(REF_FIRST);
>    if (as->parent) {
>      uint16_t *p;
> -    lastir = lj_snap_regspmap(as->parent, as->J->exitno, ir);
> +    lastir = lj_snap_regspmap(as->J, as->parent, as->J->exitno, ir);
>      if (lastir - ir > LJ_MAX_JSLOTS)
>        lj_trace_err(as->J, LJ_TRERR_NYICOAL);
>      as->stopins = (IRRef)((lastir-1) - as->ir);
> @@ -2418,7 +2445,10 @@ void lj_asm_trace(jit_State *J, GCtrace *T)
>      /* Assemble a trace in linear backwards order. */
>      for (as->curins--; as->curins > as->stopins; as->curins--) {
>        IRIns *ir = IR(as->curins);
> -      lua_assert(!(LJ_32 && irt_isint64(ir->t)));  /* Handled by SPLIT. */
> +      /* 64 bit types handled by SPLIT for 32 bit archs. */
> +      lj_assertA(!(LJ_32 && irt_isint64(ir->t)),
> +		 "IR %04d has unsplit 64 bit type",
> +		 (int)(ir - as->ir) - REF_BIAS);
>        asm_snap_prev(as);
>        if (!ra_used(ir) && !ir_sideeff(ir) && (as->flags & JIT_F_OPT_DCE))
>  	continue;  /* Dead-code elimination can be soooo easy. */
> @@ -2449,7 +2479,7 @@ void lj_asm_trace(jit_State *J, GCtrace *T)
>      asm_phi_fixup(as);
>  
>      if (J->curfinal->nins >= T->nins) {  /* IR didn't grow? */
> -      lua_assert(J->curfinal->nk == T->nk);
> +      lj_assertA(J->curfinal->nk == T->nk, "unexpected IR constant growth");
>        memcpy(J->curfinal->ir + as->orignins, T->ir + as->orignins,
>  	     (T->nins - as->orignins) * sizeof(IRIns));  /* Copy RENAMEs. */
>        T->nins = J->curfinal->nins;
> diff --git a/src/lj_asm_arm.h b/src/lj_asm_arm.h
> index 29a07c80..47564d2e 100644
> --- a/src/lj_asm_arm.h
> +++ b/src/lj_asm_arm.h
> @@ -41,7 +41,7 @@ static Reg ra_scratchpair(ASMState *as, RegSet allow)
>        }
>      }
>    }
> -  lua_assert(rset_test(RSET_GPREVEN, r));
> +  lj_assertA(rset_test(RSET_GPREVEN, r), "odd reg %d", r);
>    ra_modified(as, r);
>    ra_modified(as, r+1);
>    RA_DBGX((as, "scratchpair    $r $r", r, r+1));
> @@ -269,7 +269,7 @@ static void asm_fusexref(ASMState *as, ARMIns ai, Reg rd, IRRef ref,
>  	return;
>        }
>      } else if (ir->o == IR_STRREF && !(!LJ_SOFTFP && (ai & 0x08000000))) {
> -      lua_assert(ofs == 0);
> +      lj_assertA(ofs == 0, "bad usage");
>        ofs = (int32_t)sizeof(GCstr);
>        if (irref_isk(ir->op2)) {
>  	ofs += IR(ir->op2)->i;
> @@ -389,9 +389,11 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
>        as->freeset |= (of & RSET_RANGE(REGARG_FIRSTGPR, REGARG_LASTGPR+1));
>        if (irt_isnum(ir->t)) gpr = (gpr+1) & ~1u;
>        if (gpr <= REGARG_LASTGPR) {
> -	lua_assert(rset_test(as->freeset, gpr));  /* Must have been evicted. */
> +	lj_assertA(rset_test(as->freeset, gpr),
> +		   "reg %d not free", gpr);  /* Must have been evicted. */
>  	if (irt_isnum(ir->t)) {
> -	  lua_assert(rset_test(as->freeset, gpr+1));  /* Ditto. */
> +	  lj_assertA(rset_test(as->freeset, gpr+1),
> +		     "reg %d not free", gpr+1);  /* Ditto. */
>  	  emit_dnm(as, ARMI_VMOV_RR_D, gpr, gpr+1, (src & 15));
>  	  gpr += 2;
>  	} else {
> @@ -408,7 +410,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
>  #endif
>      {
>        if (gpr <= REGARG_LASTGPR) {
> -	lua_assert(rset_test(as->freeset, gpr));  /* Must have been evicted. */
> +	lj_assertA(rset_test(as->freeset, gpr),
> +		   "reg %d not free", gpr);  /* Must have been evicted. */
>  	if (ref) ra_leftov(as, gpr, ref);
>  	gpr++;
>        } else {
> @@ -433,7 +436,7 @@ static void asm_setupresult(ASMState *as, IRIns *ir, const CCallInfo *ci)
>      rset_clear(drop, (ir+1)->r);  /* Dest reg handled below. */
>    ra_evictset(as, drop);  /* Evictions must be performed first. */
>    if (ra_used(ir)) {
> -    lua_assert(!irt_ispri(ir->t));
> +    lj_assertA(!irt_ispri(ir->t), "PRI dest");
>      if (!LJ_SOFTFP && irt_isfp(ir->t)) {
>        if (LJ_ABI_SOFTFP || (ci->flags & (CCI_CASTU64|CCI_VARARG))) {
>  	Reg dest = (ra_dest(as, ir, RSET_FPR) & 15);
> @@ -530,13 +533,17 @@ static void asm_conv(ASMState *as, IRIns *ir)
>  #endif
>    IRRef lref = ir->op1;
>    /* 64 bit integer conversions are handled by SPLIT. */
> -  lua_assert(!irt_isint64(ir->t) && !(st == IRT_I64 || st == IRT_U64));
> +  lj_assertA(!irt_isint64(ir->t) && !(st == IRT_I64 || st == IRT_U64),
> +	     "IR %04d has unsplit 64 bit type",
> +	     (int)(ir - as->ir) - REF_BIAS);
>  #if LJ_SOFTFP
>    /* FP conversions are handled by SPLIT. */
> -  lua_assert(!irt_isfp(ir->t) && !(st == IRT_NUM || st == IRT_FLOAT));
> +  lj_assertA(!irt_isfp(ir->t) && !(st == IRT_NUM || st == IRT_FLOAT),
> +	     "IR %04d has FP type",
> +	     (int)(ir - as->ir) - REF_BIAS);
>    /* Can't check for same types: SPLIT uses CONV int.int + BXOR for sfp NEG. */
>  #else
> -  lua_assert(irt_type(ir->t) != st);
> +  lj_assertA(irt_type(ir->t) != st, "inconsistent types for CONV");
>    if (irt_isfp(ir->t)) {
>      Reg dest = ra_dest(as, ir, RSET_FPR);
>      if (stfp) {  /* FP to FP conversion. */
> @@ -553,7 +560,8 @@ static void asm_conv(ASMState *as, IRIns *ir)
>    } else if (stfp) {  /* FP to integer conversion. */
>      if (irt_isguard(ir->t)) {
>        /* Checked conversions are only supported from number to int. */
> -      lua_assert(irt_isint(ir->t) && st == IRT_NUM);
> +      lj_assertA(irt_isint(ir->t) && st == IRT_NUM,
> +		 "bad type for checked CONV");
>        asm_tointg(as, ir, ra_alloc1(as, lref, RSET_FPR));
>      } else {
>        Reg left = ra_alloc1(as, lref, RSET_FPR);
> @@ -572,7 +580,7 @@ static void asm_conv(ASMState *as, IRIns *ir)
>      Reg dest = ra_dest(as, ir, RSET_GPR);
>      if (st >= IRT_I8 && st <= IRT_U16) {  /* Extend to 32 bit integer. */
>        Reg left = ra_alloc1(as, lref, RSET_GPR);
> -      lua_assert(irt_isint(ir->t) || irt_isu32(ir->t));
> +      lj_assertA(irt_isint(ir->t) || irt_isu32(ir->t), "bad type for CONV EXT");
>        if ((as->flags & JIT_F_ARMV6)) {
>  	ARMIns ai = st == IRT_I8 ? ARMI_SXTB :
>  		    st == IRT_U8 ? ARMI_UXTB :
> @@ -667,7 +675,7 @@ static void asm_tvptr(ASMState *as, Reg dest, IRRef ref)
>        ra_allockreg(as, i32ptr(ir_knum(ir)), dest);
>      } else {
>  #if LJ_SOFTFP
> -      lua_assert(0);
> +      lj_assertA(0, "unsplit FP op");
>  #else
>        /* Otherwise force a spill and use the spill slot. */
>        emit_opk(as, ARMI_ADD, dest, RID_SP, ra_spill(as, ir), RSET_GPR);
> @@ -811,7 +819,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>    *l_loop = ARMF_CC(ARMI_B, CC_NE) | ((as->mcp-l_loop-2) & 0x00ffffffu);
>  
>    /* Load main position relative to tab->node into dest. */
> -  khash = irref_isk(refkey) ? ir_khash(irkey) : 1;
> +  khash = irref_isk(refkey) ? ir_khash(as, irkey) : 1;
>    if (khash == 0) {
>      emit_lso(as, ARMI_LDR, dest, tab, (int32_t)offsetof(GCtab, node));
>    } else {
> @@ -867,7 +875,7 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
>    Reg node = ra_alloc1(as, ir->op1, RSET_GPR);
>    Reg key = RID_NONE, type = RID_TMP, idx = node;
>    RegSet allow = rset_exclude(RSET_GPR, node);
> -  lua_assert(ofs % sizeof(Node) == 0);
> +  lj_assertA(ofs % sizeof(Node) == 0, "unaligned HREFK slot");
>    if (ofs > 4095) {
>      idx = dest;
>      rset_clear(allow, dest);
> @@ -934,7 +942,7 @@ static void asm_uref(ASMState *as, IRIns *ir)
>  static void asm_fref(ASMState *as, IRIns *ir)
>  {
>    UNUSED(as); UNUSED(ir);
> -  lua_assert(!ra_used(ir));
> +  lj_assertA(!ra_used(ir), "unfused FREF");
>  }
>  
>  static void asm_strref(ASMState *as, IRIns *ir)
> @@ -971,25 +979,27 @@ static void asm_strref(ASMState *as, IRIns *ir)
>  
>  /* -- Loads and stores ---------------------------------------------------- */
>  
> -static ARMIns asm_fxloadins(IRIns *ir)
> +static ARMIns asm_fxloadins(ASMState *as, IRIns *ir)
>  {
> +  UNUSED(as);
>    switch (irt_type(ir->t)) {
>    case IRT_I8: return ARMI_LDRSB;
>    case IRT_U8: return ARMI_LDRB;
>    case IRT_I16: return ARMI_LDRSH;
>    case IRT_U16: return ARMI_LDRH;
> -  case IRT_NUM: lua_assert(!LJ_SOFTFP); return ARMI_VLDR_D;
> +  case IRT_NUM: lj_assertA(!LJ_SOFTFP, "unsplit FP op"); return ARMI_VLDR_D;
>    case IRT_FLOAT: if (!LJ_SOFTFP) return ARMI_VLDR_S;  /* fallthrough */
>    default: return ARMI_LDR;
>    }
>  }
>  
> -static ARMIns asm_fxstoreins(IRIns *ir)
> +static ARMIns asm_fxstoreins(ASMState *as, IRIns *ir)
>  {
> +  UNUSED(as);
>    switch (irt_type(ir->t)) {
>    case IRT_I8: case IRT_U8: return ARMI_STRB;
>    case IRT_I16: case IRT_U16: return ARMI_STRH;
> -  case IRT_NUM: lua_assert(!LJ_SOFTFP); return ARMI_VSTR_D;
> +  case IRT_NUM: lj_assertA(!LJ_SOFTFP, "unsplit FP op"); return ARMI_VSTR_D;
>    case IRT_FLOAT: if (!LJ_SOFTFP) return ARMI_VSTR_S;  /* fallthrough */
>    default: return ARMI_STR;
>    }
> @@ -997,12 +1007,13 @@ static ARMIns asm_fxstoreins(IRIns *ir)
>  
>  static void asm_fload(ASMState *as, IRIns *ir)
>  {
> -  if (ir->op1 == REF_NIL) {
> -    lua_assert(!ra_used(ir));  /* We can end up here if DCE is turned off. */
> +  if (ir->op1 == REF_NIL) {  /* FLOAD from GG_State with offset. */
> +    /* We can end up here if DCE is turned off. */
> +    lj_assertA(!ra_used(ir), "NYI FLOAD GG_State");
>    } else {
>      Reg dest = ra_dest(as, ir, RSET_GPR);
>      Reg idx = ra_alloc1(as, ir->op1, RSET_GPR);
> -    ARMIns ai = asm_fxloadins(ir);
> +    ARMIns ai = asm_fxloadins(as, ir);
>      int32_t ofs;
>      if (ir->op2 == IRFL_TAB_ARRAY) {
>        ofs = asm_fuseabase(as, ir->op1);
> @@ -1026,7 +1037,7 @@ static void asm_fstore(ASMState *as, IRIns *ir)
>      IRIns *irf = IR(ir->op1);
>      Reg idx = ra_alloc1(as, irf->op1, rset_exclude(RSET_GPR, src));
>      int32_t ofs = field_ofs[irf->op2];
> -    ARMIns ai = asm_fxstoreins(ir);
> +    ARMIns ai = asm_fxstoreins(as, ir);
>      if ((ai & 0x04000000))
>        emit_lso(as, ai, src, idx, ofs);
>      else
> @@ -1038,8 +1049,8 @@ static void asm_xload(ASMState *as, IRIns *ir)
>  {
>    Reg dest = ra_dest(as, ir,
>  		     (!LJ_SOFTFP && irt_isfp(ir->t)) ? RSET_FPR : RSET_GPR);
> -  lua_assert(!(ir->op2 & IRXLOAD_UNALIGNED));
> -  asm_fusexref(as, asm_fxloadins(ir), dest, ir->op1, RSET_GPR, 0);
> +  lj_assertA(!(ir->op2 & IRXLOAD_UNALIGNED), "unaligned XLOAD");
> +  asm_fusexref(as, asm_fxloadins(as, ir), dest, ir->op1, RSET_GPR, 0);
>  }
>  
>  static void asm_xstore_(ASMState *as, IRIns *ir, int32_t ofs)
> @@ -1047,7 +1058,7 @@ static void asm_xstore_(ASMState *as, IRIns *ir, int32_t ofs)
>    if (ir->r != RID_SINK) {
>      Reg src = ra_alloc1(as, ir->op2,
>  			(!LJ_SOFTFP && irt_isfp(ir->t)) ? RSET_FPR : RSET_GPR);
> -    asm_fusexref(as, asm_fxstoreins(ir), src, ir->op1,
> +    asm_fusexref(as, asm_fxstoreins(as, ir), src, ir->op1,
>  		 rset_exclude(RSET_GPR, src), ofs);
>    }
>  }
> @@ -1066,8 +1077,9 @@ static void asm_ahuvload(ASMState *as, IRIns *ir)
>      rset_clear(allow, type);
>    }
>    if (ra_used(ir)) {
> -    lua_assert((LJ_SOFTFP ? 0 : irt_isnum(ir->t)) ||
> -	       irt_isint(ir->t) || irt_isaddr(ir->t));
> +    lj_assertA((LJ_SOFTFP ? 0 : irt_isnum(ir->t)) ||
> +	       irt_isint(ir->t) || irt_isaddr(ir->t),
> +	       "bad load type %d", irt_type(ir->t));
>      dest = ra_dest(as, ir, (!LJ_SOFTFP && t == IRT_NUM) ? RSET_FPR : allow);
>      rset_clear(allow, dest);
>    }
> @@ -1133,10 +1145,13 @@ static void asm_sload(ASMState *as, IRIns *ir)
>    IRType t = hiop ? IRT_NUM : irt_type(ir->t);
>    Reg dest = RID_NONE, type = RID_NONE, base;
>    RegSet allow = RSET_GPR;
> -  lua_assert(!(ir->op2 & IRSLOAD_PARENT));  /* Handled by asm_head_side(). */
> -  lua_assert(irt_isguard(ir->t) || !(ir->op2 & IRSLOAD_TYPECHECK));
> +  lj_assertA(!(ir->op2 & IRSLOAD_PARENT),
> +	     "bad parent SLOAD");  /* Handled by asm_head_side(). */
> +  lj_assertA(irt_isguard(ir->t) || !(ir->op2 & IRSLOAD_TYPECHECK),
> +	     "inconsistent SLOAD variant");
>  #if LJ_SOFTFP
> -  lua_assert(!(ir->op2 & IRSLOAD_CONVERT));  /* Handled by LJ_SOFTFP SPLIT. */
> +  lj_assertA(!(ir->op2 & IRSLOAD_CONVERT),
> +	     "unsplit SLOAD convert");  /* Handled by LJ_SOFTFP SPLIT. */
>    if (hiop && ra_used(ir+1)) {
>      type = ra_dest(as, ir+1, allow);
>      rset_clear(allow, type);
> @@ -1152,8 +1167,9 @@ static void asm_sload(ASMState *as, IRIns *ir)
>      Reg tmp = RID_NONE;
>      if ((ir->op2 & IRSLOAD_CONVERT))
>        tmp = ra_scratch(as, t == IRT_INT ? RSET_FPR : RSET_GPR);
> -    lua_assert((LJ_SOFTFP ? 0 : irt_isnum(ir->t)) ||
> -	       irt_isint(ir->t) || irt_isaddr(ir->t));
> +    lj_assertA((LJ_SOFTFP ? 0 : irt_isnum(ir->t)) ||
> +	       irt_isint(ir->t) || irt_isaddr(ir->t),
> +	       "bad SLOAD type %d", irt_type(ir->t));
>      dest = ra_dest(as, ir, (!LJ_SOFTFP && t == IRT_NUM) ? RSET_FPR : allow);
>      rset_clear(allow, dest);
>      base = ra_alloc1(as, REF_BASE, allow);
> @@ -1218,7 +1234,8 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>    IRRef args[4];
>    RegSet allow = (RSET_GPR & ~RSET_SCRATCH);
>    RegSet drop = RSET_SCRATCH;
> -  lua_assert(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL));
> +  lj_assertA(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL),
> +	     "bad CNEW/CNEWI operands");
>  
>    as->gcsteps++;
>    if (ra_hasreg(ir->r))
> @@ -1230,10 +1247,10 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>    /* Initialize immutable cdata object. */
>    if (ir->o == IR_CNEWI) {
>      int32_t ofs = sizeof(GCcdata);
> -    lua_assert(sz == 4 || sz == 8);
> +    lj_assertA(sz == 4 || sz == 8, "bad CNEWI size %d", sz);
>      if (sz == 8) {
>        ofs += 4; ir++;
> -      lua_assert(ir->o == IR_HIOP);
> +      lj_assertA(ir->o == IR_HIOP, "expected HIOP for CNEWI");
>      }
>      for (;;) {
>        Reg r = ra_alloc1(as, ir->op2, allow);
> @@ -1306,7 +1323,7 @@ static void asm_obar(ASMState *as, IRIns *ir)
>    MCLabel l_end;
>    Reg obj, val, tmp;
>    /* No need for other object barriers (yet). */
> -  lua_assert(IR(ir->op1)->o == IR_UREFC);
> +  lj_assertA(IR(ir->op1)->o == IR_UREFC, "bad OBAR type");
>    ra_evictset(as, RSET_SCRATCH);
>    l_end = emit_label(as);
>    args[0] = ASMREF_TMP1;  /* global_State *g */
> @@ -1580,7 +1597,7 @@ static void asm_bitshift(ASMState *as, IRIns *ir, ARMShift sh)
>  #define asm_bshr(as, ir)	asm_bitshift(as, ir, ARMSH_LSR)
>  #define asm_bsar(as, ir)	asm_bitshift(as, ir, ARMSH_ASR)
>  #define asm_bror(as, ir)	asm_bitshift(as, ir, ARMSH_ROR)
> -#define asm_brol(as, ir)	lua_assert(0)
> +#define asm_brol(as, ir)	lj_assertA(0, "unexpected BROL")
>  
>  static void asm_intmin_max(ASMState *as, IRIns *ir, int cc)
>  {
> @@ -1731,7 +1748,8 @@ static void asm_intcomp(ASMState *as, IRIns *ir)
>    Reg left;
>    uint32_t m;
>    int cmpprev0 = 0;
> -  lua_assert(irt_isint(ir->t) || irt_isu32(ir->t) || irt_isaddr(ir->t));
> +  lj_assertA(irt_isint(ir->t) || irt_isu32(ir->t) || irt_isaddr(ir->t),
> +	     "bad comparison data type %d", irt_type(ir->t));
>    if (asm_swapops(as, lref, rref)) {
>      Reg tmp = lref; lref = rref; rref = tmp;
>      if (cc >= CC_GE) cc ^= 7;  /* LT <-> GT, LE <-> GE */
> @@ -1900,10 +1918,11 @@ static void asm_hiop(ASMState *as, IRIns *ir)
>    case IR_CNEWI:
>      /* Nothing to do here. Handled by lo op itself. */
>      break;
> -  default: lua_assert(0); break;
> +  default: lj_assertA(0, "bad HIOP for op %d", (ir-1)->o); break;
>    }
>  #else
> -  UNUSED(as); UNUSED(ir); lua_assert(0);
> +  /* Unused without SOFTFP or FFI. */
> +  UNUSED(as); UNUSED(ir); lj_assertA(0, "unexpected HIOP");
>  #endif
>  }
>  
> @@ -1928,7 +1947,7 @@ static void asm_stack_check(ASMState *as, BCReg topslot,
>    if (irp) {
>      if (!ra_hasspill(irp->s)) {
>        pbase = irp->r;
> -      lua_assert(ra_hasreg(pbase));
> +      lj_assertA(ra_hasreg(pbase), "base reg lost");
>      } else if (allow) {
>        pbase = rset_pickbot(allow);
>      } else {
> @@ -1940,7 +1959,7 @@ static void asm_stack_check(ASMState *as, BCReg topslot,
>    }
>    emit_branch(as, ARMF_CC(ARMI_BL, CC_LS), exitstub_addr(as->J, exitno));
>    k = emit_isk12(0, (int32_t)(8*topslot));
> -  lua_assert(k);
> +  lj_assertA(k, "slot offset %d does not fit in K12", 8*topslot);
>    emit_n(as, ARMI_CMP^k, RID_TMP);
>    emit_dnm(as, ARMI_SUB, RID_TMP, RID_TMP, pbase);
>    emit_lso(as, ARMI_LDR, RID_TMP, RID_TMP,
> @@ -1977,7 +1996,8 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
>  #if LJ_SOFTFP
>        RegSet odd = rset_exclude(RSET_GPRODD, RID_BASE);
>        Reg tmp;
> -      lua_assert(irref_isk(ref));  /* LJ_SOFTFP: must be a number constant. */
> +      /* LJ_SOFTFP: must be a number constant. */
> +      lj_assertA(irref_isk(ref), "unsplit FP op");
>        tmp = ra_allock(as, (int32_t)ir_knum(ir)->u32.lo,
>  		      rset_exclude(RSET_GPREVEN, RID_BASE));
>        emit_lso(as, ARMI_STR, tmp, RID_BASE, ofs);
> @@ -1991,7 +2011,8 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
>      } else {
>        RegSet odd = rset_exclude(RSET_GPRODD, RID_BASE);
>        Reg type;
> -      lua_assert(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t));
> +      lj_assertA(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t),
> +		 "restore of IR type %d", irt_type(ir->t));
>        if (!irt_ispri(ir->t)) {
>  	Reg src = ra_alloc1(as, ref, rset_exclude(RSET_GPREVEN, RID_BASE));
>  	emit_lso(as, ARMI_STR, src, RID_BASE, ofs);
> @@ -2011,7 +2032,7 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
>      }
>      checkmclim(as);
>    }
> -  lua_assert(map + nent == flinks);
> +  lj_assertA(map + nent == flinks, "inconsistent frames in snapshot");
>  }
>  
>  /* -- GC handling --------------------------------------------------------- */
> @@ -2097,7 +2118,7 @@ static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
>      rset_clear(allow, ra_dest(as, ir, allow));
>    } else {
>      Reg r = irp->r;
> -    lua_assert(ra_hasreg(r));
> +    lj_assertA(ra_hasreg(r), "base reg lost");
>      rset_clear(allow, r);
>      if (r != ir->r && !rset_test(as->freeset, r))
>        ra_restore(as, regcost_ref(as->cost[r]));
> @@ -2119,7 +2140,7 @@ static void asm_tail_fixup(ASMState *as, TraceNo lnk)
>    } else {
>      /* Patch stack adjustment. */
>      uint32_t k = emit_isk12(ARMI_ADD, spadj);
> -    lua_assert(k);
> +    lj_assertA(k, "stack adjustment %d does not fit in K12", spadj);
>      p[-2] = (ARMI_ADD^k) | ARMF_D(RID_SP) | ARMF_N(RID_SP);
>    }
>    /* Patch exit branch. */
> @@ -2201,7 +2222,7 @@ void lj_asm_patchexit(jit_State *J, GCtrace *T, ExitNo exitno, MCode *target)
>        if (!cstart) cstart = p;
>      }
>    }
> -  lua_assert(cstart != NULL);
> +  lj_assertJ(cstart != NULL, "exit stub %d not found", exitno);
>    lj_mcode_sync(cstart, cend);
>    lj_mcode_patch(J, mcarea, 1);
>  }
> diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
> index c3d6889e..d1d4237b 100644
> --- a/src/lj_asm_arm64.h
> +++ b/src/lj_asm_arm64.h
> @@ -213,7 +213,7 @@ static uint32_t asm_fuseopm(ASMState *as, A64Ins ai, IRRef ref, RegSet allow)
>      return A64F_M(ir->r);
>    } else if (irref_isk(ref)) {
>      uint32_t m;
> -    int64_t k = get_k64val(ir);
> +    int64_t k = get_k64val(as, ref);
>      if ((ai & 0x1f000000) == 0x0a000000)
>        m = emit_isk13(k, irt_is64(ir->t));
>      else
> @@ -354,9 +354,9 @@ static int asm_fusemadd(ASMState *as, IRIns *ir, A64Ins ai, A64Ins air)
>  static int asm_fuseandshift(ASMState *as, IRIns *ir)
>  {
>    IRIns *irl = IR(ir->op1);
> -  lua_assert(ir->o == IR_BAND);
> +  lj_assertA(ir->o == IR_BAND, "bad usage");
>    if (canfuse(as, irl) && irref_isk(ir->op2)) {
> -    uint64_t mask = get_k64val(IR(ir->op2));
> +    uint64_t mask = get_k64val(as, ir->op2);
>      if (irref_isk(irl->op2) && (irl->o == IR_BSHR || irl->o == IR_BSHL)) {
>        int32_t shmask = irt_is64(irl->t) ? 63 : 31;
>        int32_t shift = (IR(irl->op2)->i & shmask);
> @@ -384,7 +384,7 @@ static int asm_fuseandshift(ASMState *as, IRIns *ir)
>  static int asm_fuseorshift(ASMState *as, IRIns *ir)
>  {
>    IRIns *irl = IR(ir->op1), *irr = IR(ir->op2);
> -  lua_assert(ir->o == IR_BOR);
> +  lj_assertA(ir->o == IR_BOR, "bad usage");
>    if (canfuse(as, irl) && canfuse(as, irr) &&
>        ((irl->o == IR_BSHR && irr->o == IR_BSHL) ||
>         (irl->o == IR_BSHL && irr->o == IR_BSHR))) {
> @@ -428,7 +428,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
>      if (ref) {
>        if (irt_isfp(ir->t)) {
>  	if (fpr <= REGARG_LASTFPR) {
> -	  lua_assert(rset_test(as->freeset, fpr)); /* Must have been evicted. */
> +	  lj_assertA(rset_test(as->freeset, fpr),
> +		     "reg %d not free", fpr);  /* Must have been evicted. */
>  	  ra_leftov(as, fpr, ref);
>  	  fpr++;
>  	} else {
> @@ -438,7 +439,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
>  	}
>        } else {
>  	if (gpr <= REGARG_LASTGPR) {
> -	  lua_assert(rset_test(as->freeset, gpr)); /* Must have been evicted. */
> +	  lj_assertA(rset_test(as->freeset, gpr),
> +		     "reg %d not free", gpr);  /* Must have been evicted. */
>  	  ra_leftov(as, gpr, ref);
>  	  gpr++;
>  	} else {
> @@ -459,7 +461,7 @@ static void asm_setupresult(ASMState *as, IRIns *ir, const CCallInfo *ci)
>      rset_clear(drop, ir->r); /* Dest reg handled below. */
>    ra_evictset(as, drop); /* Evictions must be performed first. */
>    if (ra_used(ir)) {
> -    lua_assert(!irt_ispri(ir->t));
> +    lj_assertA(!irt_ispri(ir->t), "PRI dest");
>      if (irt_isfp(ir->t)) {
>        if (ci->flags & CCI_CASTU64) {
>  	Reg dest = ra_dest(as, ir, RSET_FPR) & 31;
> @@ -546,7 +548,7 @@ static void asm_conv(ASMState *as, IRIns *ir)
>    int st64 = (st == IRT_I64 || st == IRT_U64 || st == IRT_P64);
>    int stfp = (st == IRT_NUM || st == IRT_FLOAT);
>    IRRef lref = ir->op1;
> -  lua_assert(irt_type(ir->t) != st);
> +  lj_assertA(irt_type(ir->t) != st, "inconsistent types for CONV");
>    if (irt_isfp(ir->t)) {
>      Reg dest = ra_dest(as, ir, RSET_FPR);
>      if (stfp) {  /* FP to FP conversion. */
> @@ -566,7 +568,8 @@ static void asm_conv(ASMState *as, IRIns *ir)
>    } else if (stfp) {  /* FP to integer conversion. */
>      if (irt_isguard(ir->t)) {
>        /* Checked conversions are only supported from number to int. */
> -      lua_assert(irt_isint(ir->t) && st == IRT_NUM);
> +      lj_assertA(irt_isint(ir->t) && st == IRT_NUM,
> +		 "bad type for checked CONV");
>        asm_tointg(as, ir, ra_alloc1(as, lref, RSET_FPR));
>      } else {
>        Reg left = ra_alloc1(as, lref, RSET_FPR);
> @@ -586,7 +589,7 @@ static void asm_conv(ASMState *as, IRIns *ir)
>      A64Ins ai = st == IRT_I8 ? A64I_SXTBw :
>  		st == IRT_U8 ? A64I_UXTBw :
>  		st == IRT_I16 ? A64I_SXTHw : A64I_UXTHw;
> -    lua_assert(irt_isint(ir->t) || irt_isu32(ir->t));
> +    lj_assertA(irt_isint(ir->t) || irt_isu32(ir->t), "bad type for CONV EXT");
>      emit_dn(as, ai, dest, left);
>    } else {
>      Reg dest = ra_dest(as, ir, RSET_GPR);
> @@ -650,7 +653,8 @@ static void asm_tvstore64(ASMState *as, Reg base, int32_t ofs, IRRef ref)
>  {
>    RegSet allow = rset_exclude(RSET_GPR, base);
>    IRIns *ir = IR(ref);
> -  lua_assert(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t));
> +  lj_assertA(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t),
> +	     "store of IR type %d", irt_type(ir->t));
>    if (irref_isk(ref)) {
>      TValue k;
>      lj_ir_kvalue(as->J->L, &k, ir);
> @@ -770,7 +774,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>      }
>      rset_clear(allow, scr);
>    } else {
> -    lua_assert(irt_ispri(kt) && !irt_isnil(kt));
> +    lj_assertA(irt_ispri(kt) && !irt_isnil(kt), "bad HREF key type");
>      type = ra_allock(as, ~((int64_t)~irt_toitype(ir->t) << 47), allow);
>      scr = ra_scratch(as, rset_clear(allow, type));
>      rset_clear(allow, scr);
> @@ -831,7 +835,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>      rset_clear(allow, type);
>    }
>    /* Load main position relative to tab->node into dest. */
> -  khash = isk ? ir_khash(irkey) : 1;
> +  khash = isk ? ir_khash(as, irkey) : 1;
>    if (khash == 0) {
>      emit_lso(as, A64I_LDRx, dest, tab, offsetof(GCtab, node));
>    } else {
> @@ -886,7 +890,7 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
>    Reg key, idx = node;
>    RegSet allow = rset_exclude(RSET_GPR, node);
>    uint64_t k;
> -  lua_assert(ofs % sizeof(Node) == 0);
> +  lj_assertA(ofs % sizeof(Node) == 0, "unaligned HREFK slot");
>    if (bigofs) {
>      idx = dest;
>      rset_clear(allow, dest);
> @@ -936,7 +940,7 @@ static void asm_uref(ASMState *as, IRIns *ir)
>  static void asm_fref(ASMState *as, IRIns *ir)
>  {
>    UNUSED(as); UNUSED(ir);
> -  lua_assert(!ra_used(ir));
> +  lj_assertA(!ra_used(ir), "unfused FREF");
>  }
>  
>  static void asm_strref(ASMState *as, IRIns *ir)
> @@ -988,7 +992,7 @@ static void asm_fload(ASMState *as, IRIns *ir)
>    Reg idx;
>    A64Ins ai = asm_fxloadins(ir);
>    int32_t ofs;
> -  if (ir->op1 == REF_NIL) {
> +  if (ir->op1 == REF_NIL) {  /* FLOAD from GG_State with offset. */
>      idx = RID_GL;
>      ofs = (ir->op2 << 2) - GG_OFS(g);
>    } else {
> @@ -1019,7 +1023,7 @@ static void asm_fstore(ASMState *as, IRIns *ir)
>  static void asm_xload(ASMState *as, IRIns *ir)
>  {
>    Reg dest = ra_dest(as, ir, irt_isfp(ir->t) ? RSET_FPR : RSET_GPR);
> -  lua_assert(!(ir->op2 & IRXLOAD_UNALIGNED));
> +  lj_assertA(!(ir->op2 & IRXLOAD_UNALIGNED), "unaligned XLOAD");
>    asm_fusexref(as, asm_fxloadins(ir), dest, ir->op1, RSET_GPR);
>  }
>  
> @@ -1037,8 +1041,9 @@ static void asm_ahuvload(ASMState *as, IRIns *ir)
>    Reg idx, tmp, type;
>    int32_t ofs = 0;
>    RegSet gpr = RSET_GPR, allow = irt_isnum(ir->t) ? RSET_FPR : RSET_GPR;
> -  lua_assert(irt_isnum(ir->t) || irt_ispri(ir->t) || irt_isaddr(ir->t) ||
> -	     irt_isint(ir->t));
> +  lj_assertA(irt_isnum(ir->t) || irt_ispri(ir->t) || irt_isaddr(ir->t) ||
> +	     irt_isint(ir->t),
> +	     "bad load type %d", irt_type(ir->t));
>    if (ra_used(ir)) {
>      Reg dest = ra_dest(as, ir, allow);
>      tmp = irt_isnum(ir->t) ? ra_scratch(as, rset_clear(gpr, dest)) : dest;
> @@ -1057,7 +1062,8 @@ static void asm_ahuvload(ASMState *as, IRIns *ir)
>    /* Always do the type check, even if the load result is unused. */
>    asm_guardcc(as, irt_isnum(ir->t) ? CC_LS : CC_NE);
>    if (irt_type(ir->t) >= IRT_NUM) {
> -    lua_assert(irt_isinteger(ir->t) || irt_isnum(ir->t));
> +    lj_assertA(irt_isinteger(ir->t) || irt_isnum(ir->t),
> +	       "bad load type %d", irt_type(ir->t));
>      emit_nm(as, A64I_CMPx | A64F_SH(A64SH_LSR, 32),
>  	    ra_allock(as, LJ_TISNUM << 15, rset_exclude(gpr, idx)), tmp);
>    } else if (irt_isaddr(ir->t)) {
> @@ -1122,8 +1128,10 @@ static void asm_sload(ASMState *as, IRIns *ir)
>    IRType1 t = ir->t;
>    Reg dest = RID_NONE, base;
>    RegSet allow = RSET_GPR;
> -  lua_assert(!(ir->op2 & IRSLOAD_PARENT));  /* Handled by asm_head_side(). */
> -  lua_assert(irt_isguard(t) || !(ir->op2 & IRSLOAD_TYPECHECK));
> +  lj_assertA(!(ir->op2 & IRSLOAD_PARENT),
> +	     "bad parent SLOAD");  /* Handled by asm_head_side(). */
> +  lj_assertA(irt_isguard(t) || !(ir->op2 & IRSLOAD_TYPECHECK),
> +	     "inconsistent SLOAD variant");
>    if ((ir->op2 & IRSLOAD_CONVERT) && irt_isguard(t) && irt_isint(t)) {
>      dest = ra_scratch(as, RSET_FPR);
>      asm_tointg(as, ir, dest);
> @@ -1132,7 +1140,8 @@ static void asm_sload(ASMState *as, IRIns *ir)
>      Reg tmp = RID_NONE;
>      if ((ir->op2 & IRSLOAD_CONVERT))
>        tmp = ra_scratch(as, irt_isint(t) ? RSET_FPR : RSET_GPR);
> -    lua_assert((irt_isnum(t)) || irt_isint(t) || irt_isaddr(t));
> +    lj_assertA((irt_isnum(t)) || irt_isint(t) || irt_isaddr(t),
> +	       "bad SLOAD type %d", irt_type(t));
>      dest = ra_dest(as, ir, irt_isnum(t) ? RSET_FPR : allow);
>      base = ra_alloc1(as, REF_BASE, rset_clear(allow, dest));
>      if (irt_isaddr(t)) {
> @@ -1172,7 +1181,8 @@ dotypecheck:
>      /* Need type check, even if the load result is unused. */
>      asm_guardcc(as, irt_isnum(t) ? CC_LS : CC_NE);
>      if (irt_type(t) >= IRT_NUM) {
> -      lua_assert(irt_isinteger(t) || irt_isnum(t));
> +      lj_assertA(irt_isinteger(t) || irt_isnum(t),
> +		 "bad SLOAD type %d", irt_type(t));
>        emit_nm(as, A64I_CMPx | A64F_SH(A64SH_LSR, 32),
>  	      ra_allock(as, LJ_TISNUM << 15, allow), tmp);
>      } else if (irt_isnil(t)) {
> @@ -1207,7 +1217,8 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>    const CCallInfo *ci = &lj_ir_callinfo[IRCALL_lj_mem_newgco];
>    IRRef args[4];
>    RegSet allow = (RSET_GPR & ~RSET_SCRATCH);
> -  lua_assert(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL));
> +  lj_assertA(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL),
> +	     "bad CNEW/CNEWI operands");
>  
>    as->gcsteps++;
>    asm_setupresult(as, ir, ci);  /* GCcdata * */
> @@ -1215,7 +1226,7 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>    if (ir->o == IR_CNEWI) {
>      int32_t ofs = sizeof(GCcdata);
>      Reg r = ra_alloc1(as, ir->op2, allow);
> -    lua_assert(sz == 4 || sz == 8);
> +    lj_assertA(sz == 4 || sz == 8, "bad CNEWI size %d", sz);
>      emit_lso(as, sz == 8 ? A64I_STRx : A64I_STRw, r, RID_RET, ofs);
>    } else if (ir->op2 != REF_NIL) {  /* Create VLA/VLS/aligned cdata. */
>      ci = &lj_ir_callinfo[IRCALL_lj_cdata_newv];
> @@ -1281,7 +1292,7 @@ static void asm_obar(ASMState *as, IRIns *ir)
>    RegSet allow = RSET_GPR;
>    Reg obj, val, tmp;
>    /* No need for other object barriers (yet). */
> -  lua_assert(IR(ir->op1)->o == IR_UREFC);
> +  lj_assertA(IR(ir->op1)->o == IR_UREFC, "bad OBAR type");
>    ra_evictset(as, RSET_SCRATCH);
>    l_end = emit_label(as);
>    args[0] = ASMREF_TMP1;  /* global_State *g */
> @@ -1551,7 +1562,7 @@ static void asm_bitshift(ASMState *as, IRIns *ir, A64Ins ai, A64Shift sh)
>  #define asm_bshr(as, ir)	asm_bitshift(as, ir, A64I_UBFMw, A64SH_LSR)
>  #define asm_bsar(as, ir)	asm_bitshift(as, ir, A64I_SBFMw, A64SH_ASR)
>  #define asm_bror(as, ir)	asm_bitshift(as, ir, A64I_EXTRw, A64SH_ROR)
> -#define asm_brol(as, ir)	lua_assert(0)
> +#define asm_brol(as, ir)	lj_assertA(0, "unexpected BROL")
>  
>  static void asm_intmin_max(ASMState *as, IRIns *ir, A64CC cc)
>  {
> @@ -1632,15 +1643,16 @@ static void asm_intcomp(ASMState *as, IRIns *ir)
>    Reg left;
>    uint32_t m;
>    int cmpprev0 = 0;
> -  lua_assert(irt_is64(ir->t) || irt_isint(ir->t) ||
> -	     irt_isu32(ir->t) || irt_isaddr(ir->t) || irt_isu8(ir->t));
> +  lj_assertA(irt_is64(ir->t) || irt_isint(ir->t) ||
> +	     irt_isu32(ir->t) || irt_isaddr(ir->t) || irt_isu8(ir->t),
> +	     "bad comparison data type %d", irt_type(ir->t));
>    if (asm_swapops(as, lref, rref)) {
>      IRRef tmp = lref; lref = rref; rref = tmp;
>      if (cc >= CC_GE) cc ^= 7;  /* LT <-> GT, LE <-> GE */
>      else if (cc > CC_NE) cc ^= 11;  /* LO <-> HI, LS <-> HS */
>    }
>    oldcc = cc;
> -  if (irref_isk(rref) && get_k64val(IR(rref)) == 0) {
> +  if (irref_isk(rref) && get_k64val(as, rref) == 0) {
>      IRIns *irl = IR(lref);
>      if (cc == CC_GE) cc = CC_PL;
>      else if (cc == CC_LT) cc = CC_MI;
> @@ -1655,7 +1667,7 @@ static void asm_intcomp(ASMState *as, IRIns *ir)
>  	Reg tmp = blref; blref = brref; brref = tmp;
>        }
>        if (irref_isk(brref)) {
> -	uint64_t k = get_k64val(IR(brref));
> +	uint64_t k = get_k64val(as, brref);
>  	if (k && !(k & (k-1)) && (cc == CC_EQ || cc == CC_NE)) {
>  	  asm_guardtnb(as, cc == CC_EQ ? A64I_TBZ : A64I_TBNZ,
>  		       ra_alloc1(as, blref, RSET_GPR), emit_ctz64(k));
> @@ -1704,7 +1716,8 @@ static void asm_comp(ASMState *as, IRIns *ir)
>  /* Hiword op of a split 64 bit op. Previous op must be the loword op. */
>  static void asm_hiop(ASMState *as, IRIns *ir)
>  {
> -  UNUSED(as); UNUSED(ir); lua_assert(0);  /* Unused on 64 bit. */
> +  UNUSED(as); UNUSED(ir);
> +  lj_assertA(0, "unexpected HIOP");  /* Unused on 64 bit. */
>  }
>  
>  /* -- Profiling ----------------------------------------------------------- */
> @@ -1712,7 +1725,7 @@ static void asm_hiop(ASMState *as, IRIns *ir)
>  static void asm_prof(ASMState *as, IRIns *ir)
>  {
>    uint32_t k = emit_isk13(HOOK_PROFILE, 0);
> -  lua_assert(k != 0);
> +  lj_assertA(k != 0, "HOOK_PROFILE does not fit in K13");
>    UNUSED(ir);
>    asm_guardcc(as, CC_NE);
>    emit_n(as, A64I_TSTw^k, RID_TMP);
> @@ -1730,7 +1743,7 @@ static void asm_stack_check(ASMState *as, BCReg topslot,
>    if (irp) {
>      if (!ra_hasspill(irp->s)) {
>        pbase = irp->r;
> -      lua_assert(ra_hasreg(pbase));
> +      lj_assertA(ra_hasreg(pbase), "base reg lost");
>      } else if (allow) {
>        pbase = rset_pickbot(allow);
>      } else {
> @@ -1742,7 +1755,7 @@ static void asm_stack_check(ASMState *as, BCReg topslot,
>    }
>    emit_cond_branch(as, CC_LS, asm_exitstub_addr(as, exitno));
>    k = emit_isk12((8*topslot));
> -  lua_assert(k);
> +  lj_assertA(k, "slot offset %d does not fit in K12", 8*topslot);
>    emit_n(as, A64I_CMPx^k, RID_TMP);
>    emit_dnm(as, A64I_SUBx, RID_TMP, RID_TMP, pbase);
>    emit_lso(as, A64I_LDRx, RID_TMP, RID_TMP,
> @@ -1783,7 +1796,7 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
>      }
>      checkmclim(as);
>    }
> -  lua_assert(map + nent == flinks);
> +  lj_assertA(map + nent == flinks, "inconsistent frames in snapshot");
>  }
>  
>  /* -- GC handling --------------------------------------------------------- */
> @@ -1871,7 +1884,7 @@ static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
>      rset_clear(allow, ra_dest(as, ir, allow));
>    } else {
>      Reg r = irp->r;
> -    lua_assert(ra_hasreg(r));
> +    lj_assertA(ra_hasreg(r), "base reg lost");
>      rset_clear(allow, r);
>      if (r != ir->r && !rset_test(as->freeset, r))
>        ra_restore(as, regcost_ref(as->cost[r]));
> @@ -1895,7 +1908,7 @@ static void asm_tail_fixup(ASMState *as, TraceNo lnk)
>    } else {
>      /* Patch stack adjustment. */
>      uint32_t k = emit_isk12(spadj);
> -    lua_assert(k);
> +    lj_assertA(k, "stack adjustment %d does not fit in K12", spadj);
>      p[-2] = (A64I_ADDx^k) | A64F_D(RID_SP) | A64F_N(RID_SP);
>    }
>    /* Patch exit branch. */
> @@ -1981,7 +1994,7 @@ void lj_asm_patchexit(jit_State *J, GCtrace *T, ExitNo exitno, MCode *target)
>      } else if ((ins & 0xfc000000u) == 0x14000000u &&
>  	       ((ins ^ (px-p)) & 0x03ffffffu) == 0) {
>        /* Patch b. */
> -      lua_assert(A64F_S_OK(delta, 26));
> +      lj_assertJ(A64F_S_OK(delta, 26), "branch target out of range");
>        *p = A64I_LE((ins & 0xfc000000u) | A64F_S26(delta));
>        if (!cstart) cstart = p;
>      } else if ((ins & 0x7e000000u) == 0x34000000u &&
> @@ -2002,7 +2015,7 @@ void lj_asm_patchexit(jit_State *J, GCtrace *T, ExitNo exitno, MCode *target)
>    }
>    {  /* Always patch long-range branch in exit stub itself. */
>      ptrdiff_t delta = target - px;
> -    lua_assert(A64F_S_OK(delta, 26));
> +    lj_assertJ(A64F_S_OK(delta, 26), "branch target out of range");
>      *px = A64I_B | A64F_S26(delta);
>      if (!cstart) cstart = px;
>    }
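
Another aside: besides the assert conversion, the arm64/mips hunks also switch
the constant helpers from get_k64val(IR(ref))/get_kval(IR(ref)) to
get_k64val(as, ref)/get_kval(as, ref), i.e. they now receive the ASMState.
Presumably that is what lets the helper itself raise a descriptive
lj_assertA() on the constant kind. An assumed sketch of the updated arm64
variant (not quoted from the patch, details may differ):

  static int64_t get_k64val(ASMState *as, IRRef ref)
  {
    IRIns *ir = IR(ref);  /* IR() needs `as` in scope: &as->ir[ref]. */
    if (ir->o == IR_KINT64) {
      return (int64_t)ir_kint64(ir)->u64;
    } else if (ir->o == IR_KGC) {
      return (intptr_t)ir_kgc(ir);
    } else if (ir->o == IR_KPTR || ir->o == IR_KKPTR) {
      return (intptr_t)ir_kptr(ir);
    } else {
      lj_assertA(ir->o == IR_KINT || ir->o == IR_KNULL,
                 "bad 64 bit const IR op %d", ir->o);
      return ir->i;  /* 32 bit constant, sign-extended. */
    }
  }
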
> diff --git a/src/lj_asm_mips.h b/src/lj_asm_mips.h
> index 0f92959b..ea108aab 100644
> --- a/src/lj_asm_mips.h
> +++ b/src/lj_asm_mips.h
> @@ -23,7 +23,7 @@ static Reg ra_alloc1z(ASMState *as, IRRef ref, RegSet allow)
>  {
>    Reg r = IR(ref)->r;
>    if (ra_noreg(r)) {
> -    if (!(allow & RSET_FPR) && irref_isk(ref) && get_kval(IR(ref)) == 0)
> +    if (!(allow & RSET_FPR) && irref_isk(ref) && get_kval(as, ref) == 0)
>        return RID_ZERO;
>      r = ra_allocref(as, ref, allow);
>    } else {
> @@ -66,10 +66,10 @@ static void asm_sparejump_setup(ASMState *as)
>  {
>    MCode *mxp = as->mcbot;
>    if (((uintptr_t)mxp & (LJ_PAGESIZE-1)) == sizeof(MCLink)) {
> -    lua_assert(MIPSI_NOP == 0);
> +    lj_assertA(MIPSI_NOP == 0, "bad NOP");
>      memset(mxp, 0, MIPS_SPAREJUMP*2*sizeof(MCode));
>      mxp += MIPS_SPAREJUMP*2;
> -    lua_assert(mxp < as->mctop);
> +    lj_assertA(mxp < as->mctop, "MIPS_SPAREJUMP too big");
>      lj_mcode_sync(as->mcbot, mxp);
>      lj_mcode_commitbot(as->J, mxp);
>      as->mcbot = mxp;
> @@ -84,7 +84,8 @@ static void asm_exitstub_setup(ASMState *as)
>    /* sw TMP, 0(sp); j ->vm_exit_handler; li TMP, traceno */
>    *--mxp = MIPSI_LI|MIPSF_T(RID_TMP)|as->T->traceno;
>    *--mxp = MIPSI_J|((((uintptr_t)(void *)lj_vm_exit_handler)>>2)&0x03ffffffu);
> -  lua_assert(((uintptr_t)mxp ^ (uintptr_t)(void *)lj_vm_exit_handler)>>28 == 0);
> +  lj_assertA(((uintptr_t)mxp ^ (uintptr_t)(void *)lj_vm_exit_handler)>>28 == 0,
> +	     "branch target out of range");
>    *--mxp = MIPSI_SW|MIPSF_T(RID_TMP)|MIPSF_S(RID_SP)|0;
>    as->mctop = mxp;
>  }
> @@ -195,20 +196,20 @@ static void asm_fusexref(ASMState *as, MIPSIns mi, Reg rt, IRRef ref,
>    if (ra_noreg(ir->r) && canfuse(as, ir)) {
>      if (ir->o == IR_ADD) {
>        intptr_t ofs2;
> -      if (irref_isk(ir->op2) && (ofs2 = ofs + get_kval(IR(ir->op2)),
> +      if (irref_isk(ir->op2) && (ofs2 = ofs + get_kval(as, ir->op2),
>  				 checki16(ofs2))) {
>  	ref = ir->op1;
>  	ofs = (int32_t)ofs2;
>        }
>      } else if (ir->o == IR_STRREF) {
>        intptr_t ofs2 = 65536;
> -      lua_assert(ofs == 0);
> +      lj_assertA(ofs == 0, "bad usage");
>        ofs = (int32_t)sizeof(GCstr);
>        if (irref_isk(ir->op2)) {
> -	ofs2 = ofs + get_kval(IR(ir->op2));
> +	ofs2 = ofs + get_kval(as, ir->op2);
>  	ref = ir->op1;
>        } else if (irref_isk(ir->op1)) {
> -	ofs2 = ofs + get_kval(IR(ir->op1));
> +	ofs2 = ofs + get_kval(as, ir->op1);
>  	ref = ir->op2;
>        }
>        if (!checki16(ofs2)) {
> @@ -252,7 +253,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
>  #if !LJ_SOFTFP
>        if (irt_isfp(ir->t) && fpr <= REGARG_LASTFPR &&
>  	  !(ci->flags & CCI_VARARG)) {
> -	lua_assert(rset_test(as->freeset, fpr));  /* Already evicted. */
> +	lj_assertA(rset_test(as->freeset, fpr),
> +		   "reg %d not free", fpr);  /* Already evicted. */
>  	ra_leftov(as, fpr, ref);
>  	fpr += LJ_32 ? 2 : 1;
>  	gpr += (LJ_32 && irt_isnum(ir->t)) ? 2 : 1;
> @@ -264,7 +266,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
>  #endif
>  	if (LJ_32 && irt_isnum(ir->t)) gpr = (gpr+1) & ~1;
>  	if (gpr <= REGARG_LASTGPR) {
> -	  lua_assert(rset_test(as->freeset, gpr));  /* Already evicted. */
> +	  lj_assertA(rset_test(as->freeset, gpr),
> +		     "reg %d not free", gpr);  /* Already evicted. */
>  #if !LJ_SOFTFP
>  	  if (irt_isfp(ir->t)) {
>  	    RegSet of = as->freeset;
> @@ -277,7 +280,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
>  #if LJ_32
>  	      emit_tg(as, MIPSI_MFC1, gpr+(LJ_BE?0:1), r+1);
>  	      emit_tg(as, MIPSI_MFC1, gpr+(LJ_BE?1:0), r);
> -	      lua_assert(rset_test(as->freeset, gpr+1));  /* Already evicted. */
> +	      lj_assertA(rset_test(as->freeset, gpr+1),
> +			 "reg %d not free", gpr+1);  /* Already evicted. */
>  	      gpr += 2;
>  #else
>  	      emit_tg(as, MIPSI_DMFC1, gpr, r);
> @@ -347,7 +351,7 @@ static void asm_setupresult(ASMState *as, IRIns *ir, const CCallInfo *ci)
>  #endif
>    ra_evictset(as, drop);  /* Evictions must be performed first. */
>    if (ra_used(ir)) {
> -    lua_assert(!irt_ispri(ir->t));
> +    lj_assertA(!irt_ispri(ir->t), "PRI dest");
>      if (!LJ_SOFTFP && irt_isfp(ir->t)) {
>        if ((ci->flags & CCI_CASTU64)) {
>  	int32_t ofs = sps_scale(ir->s);
> @@ -395,7 +399,7 @@ static void asm_callx(ASMState *as, IRIns *ir)
>    func = ir->op2; irf = IR(func);
>    if (irf->o == IR_CARG) { func = irf->op1; irf = IR(func); }
>    if (irref_isk(func)) {  /* Call to constant address. */
> -    ci.func = (ASMFunction)(void *)get_kval(irf);
> +    ci.func = (ASMFunction)(void *)get_kval(as, func);
>    } else {  /* Need specific register for indirect calls. */
>      Reg r = ra_alloc1(as, func, RID2RSET(RID_CFUNCADDR));
>      MCode *p = as->mcp;
> @@ -512,15 +516,19 @@ static void asm_conv(ASMState *as, IRIns *ir)
>  #endif
>    IRRef lref = ir->op1;
>  #if LJ_32
> -  lua_assert(!(irt_isint64(ir->t) ||
> -	       (st == IRT_I64 || st == IRT_U64))); /* Handled by SPLIT. */
> +  /* 64 bit integer conversions are handled by SPLIT. */
> +  lj_assertA(!(irt_isint64(ir->t) || (st == IRT_I64 || st == IRT_U64)),
> +	     "IR %04d has unsplit 64 bit type",
> +	     (int)(ir - as->ir) - REF_BIAS);
>  #endif
>  #if LJ_SOFTFP32
>    /* FP conversions are handled by SPLIT. */
> -  lua_assert(!irt_isfp(ir->t) && !(st == IRT_NUM || st == IRT_FLOAT));
> +  lj_assertA(!irt_isfp(ir->t) && !(st == IRT_NUM || st == IRT_FLOAT),
> +	     "IR %04d has FP type",
> +	     (int)(ir - as->ir) - REF_BIAS);
>    /* Can't check for same types: SPLIT uses CONV int.int + BXOR for sfp NEG. */
>  #else
> -  lua_assert(irt_type(ir->t) != st);
> +  lj_assertA(irt_type(ir->t) != st, "inconsistent types for CONV");
>  #if !LJ_SOFTFP
>    if (irt_isfp(ir->t)) {
>      Reg dest = ra_dest(as, ir, RSET_FPR);
> @@ -579,7 +587,8 @@ static void asm_conv(ASMState *as, IRIns *ir)
>    } else if (stfp) {  /* FP to integer conversion. */
>      if (irt_isguard(ir->t)) {
>        /* Checked conversions are only supported from number to int. */
> -      lua_assert(irt_isint(ir->t) && st == IRT_NUM);
> +      lj_assertA(irt_isint(ir->t) && st == IRT_NUM,
> +		 "bad type for checked CONV");
>        asm_tointg(as, ir, ra_alloc1(as, lref, RSET_FPR));
>      } else {
>        Reg dest = ra_dest(as, ir, RSET_GPR);
> @@ -679,7 +688,8 @@ static void asm_conv(ASMState *as, IRIns *ir)
>    } else if (stfp) {  /* FP to integer conversion. */
>      if (irt_isguard(ir->t)) {
>        /* Checked conversions are only supported from number to int. */
> -      lua_assert(irt_isint(ir->t) && st == IRT_NUM);
> +      lj_assertA(irt_isint(ir->t) && st == IRT_NUM,
> +		 "bad type for checked CONV");
>        asm_tointg(as, ir, RID_NONE);
>      } else {
>        IRCallID cid = irt_is64(ir->t) ?
> @@ -698,7 +708,7 @@ static void asm_conv(ASMState *as, IRIns *ir)
>      Reg dest = ra_dest(as, ir, RSET_GPR);
>      if (st >= IRT_I8 && st <= IRT_U16) {  /* Extend to 32 bit integer. */
>        Reg left = ra_alloc1(as, ir->op1, RSET_GPR);
> -      lua_assert(irt_isint(ir->t) || irt_isu32(ir->t));
> +      lj_assertA(irt_isint(ir->t) || irt_isu32(ir->t), "bad type for CONV EXT");
>        if ((ir->op2 & IRCONV_SEXT)) {
>  	if (LJ_64 || (as->flags & JIT_F_MIPSXXR2)) {
>  	  emit_dst(as, st == IRT_I8 ? MIPSI_SEB : MIPSI_SEH, dest, 0, left);
> @@ -795,7 +805,8 @@ static void asm_tvstore64(ASMState *as, Reg base, int32_t ofs, IRRef ref)
>  {
>    RegSet allow = rset_exclude(RSET_GPR, base);
>    IRIns *ir = IR(ref);
> -  lua_assert(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t));
> +  lj_assertA(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t),
> +	     "store of IR type %d", irt_type(ir->t));
>    if (irref_isk(ref)) {
>      TValue k;
>      lj_ir_kvalue(as->J->L, &k, ir);
> @@ -944,7 +955,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>        if (isk && irt_isaddr(kt)) {
>  	k = ((int64_t)irt_toitype(irkey->t) << 47) | irkey[1].tv.u64;
>        } else {
> -	lua_assert(irt_ispri(kt) && !irt_isnil(kt));
> +	lj_assertA(irt_ispri(kt) && !irt_isnil(kt), "bad HREF key type");
>  	k = ~((int64_t)~irt_toitype(ir->t) << 47);
>        }
>        cmp64 = ra_allock(as, k, allow);
> @@ -1012,7 +1023,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>  #endif
>  
>    /* Load main position relative to tab->node into dest. */
> -  khash = isk ? ir_khash(irkey) : 1;
> +  khash = isk ? ir_khash(as, irkey) : 1;
>    if (khash == 0) {
>      emit_tsi(as, MIPSI_AL, dest, tab, (int32_t)offsetof(GCtab, node));
>    } else {
> @@ -1020,7 +1031,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>      if (isk)
>        tmphash = ra_allock(as, khash, allow);
>      emit_dst(as, MIPSI_AADDU, dest, dest, tmp1);
> -    lua_assert(sizeof(Node) == 24);
> +    lj_assertA(sizeof(Node) == 24, "bad Node size");
>      emit_dst(as, MIPSI_SUBU, tmp1, tmp2, tmp1);
>      emit_dta(as, MIPSI_SLL, tmp1, tmp1, 3);
>      emit_dta(as, MIPSI_SLL, tmp2, tmp1, 5);
> @@ -1098,7 +1109,7 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
>    Reg key = ra_scratch(as, allow);
>    int64_t k;
>  #endif
> -  lua_assert(ofs % sizeof(Node) == 0);
> +  lj_assertA(ofs % sizeof(Node) == 0, "unaligned HREFK slot");
>    if (ofs > 32736) {
>      idx = dest;
>      rset_clear(allow, dest);
> @@ -1127,7 +1138,7 @@ nolo:
>    emit_tsi(as, MIPSI_LW, type, idx, kofs+(LJ_BE?0:4));
>  #else
>    if (irt_ispri(irkey->t)) {
> -    lua_assert(!irt_isnil(irkey->t));
> +    lj_assertA(!irt_isnil(irkey->t), "bad HREFK key type");
>      k = ~((int64_t)~irt_toitype(irkey->t) << 47);
>    } else if (irt_isnum(irkey->t)) {
>      k = (int64_t)ir_knum(irkey)->u64;
> @@ -1166,7 +1177,7 @@ static void asm_uref(ASMState *as, IRIns *ir)
>  static void asm_fref(ASMState *as, IRIns *ir)
>  {
>    UNUSED(as); UNUSED(ir);
> -  lua_assert(!ra_used(ir));
> +  lj_assertA(!ra_used(ir), "unfused FREF");
>  }
>  
>  static void asm_strref(ASMState *as, IRIns *ir)
> @@ -1221,14 +1232,17 @@ static void asm_strref(ASMState *as, IRIns *ir)
>  
>  /* -- Loads and stores ---------------------------------------------------- */
>  
> -static MIPSIns asm_fxloadins(IRIns *ir)
> +static MIPSIns asm_fxloadins(ASMState *as, IRIns *ir)
>  {
> +  UNUSED(as);
>    switch (irt_type(ir->t)) {
>    case IRT_I8: return MIPSI_LB;
>    case IRT_U8: return MIPSI_LBU;
>    case IRT_I16: return MIPSI_LH;
>    case IRT_U16: return MIPSI_LHU;
> -  case IRT_NUM: lua_assert(!LJ_SOFTFP32); if (!LJ_SOFTFP) return MIPSI_LDC1;
> +  case IRT_NUM:
> +    lj_assertA(!LJ_SOFTFP32, "unsplit FP op");
> +    if (!LJ_SOFTFP) return MIPSI_LDC1;
>    /* fallthrough */
>    case IRT_FLOAT: if (!LJ_SOFTFP) return MIPSI_LWC1;
>    /* fallthrough */
> @@ -1236,12 +1250,15 @@ static MIPSIns asm_fxloadins(IRIns *ir)
>    }
>  }
>  
> -static MIPSIns asm_fxstoreins(IRIns *ir)
> +static MIPSIns asm_fxstoreins(ASMState *as, IRIns *ir)
>  {
> +  UNUSED(as);
>    switch (irt_type(ir->t)) {
>    case IRT_I8: case IRT_U8: return MIPSI_SB;
>    case IRT_I16: case IRT_U16: return MIPSI_SH;
> -  case IRT_NUM: lua_assert(!LJ_SOFTFP32); if (!LJ_SOFTFP) return MIPSI_SDC1;
> +  case IRT_NUM:
> +    lj_assertA(!LJ_SOFTFP32, "unsplit FP op");
> +    if (!LJ_SOFTFP) return MIPSI_SDC1;
>    /* fallthrough */
>    case IRT_FLOAT: if (!LJ_SOFTFP) return MIPSI_SWC1;
>    /* fallthrough */
> @@ -1252,10 +1269,10 @@ static MIPSIns asm_fxstoreins(IRIns *ir)
>  static void asm_fload(ASMState *as, IRIns *ir)
>  {
>    Reg dest = ra_dest(as, ir, RSET_GPR);
> -  MIPSIns mi = asm_fxloadins(ir);
> +  MIPSIns mi = asm_fxloadins(as, ir);
>    Reg idx;
>    int32_t ofs;
> -  if (ir->op1 == REF_NIL) {
> +  if (ir->op1 == REF_NIL) {  /* FLOAD from GG_State with offset. */
>      idx = RID_JGL;
>      ofs = (ir->op2 << 2) - 32768 - GG_OFS(g);
>    } else {
> @@ -1269,7 +1286,7 @@ static void asm_fload(ASMState *as, IRIns *ir)
>      }
>      ofs = field_ofs[ir->op2];
>    }
> -  lua_assert(!irt_isfp(ir->t));
> +  lj_assertA(!irt_isfp(ir->t), "bad FP FLOAD");
>    emit_tsi(as, mi, dest, idx, ofs);
>  }
>  
> @@ -1280,8 +1297,8 @@ static void asm_fstore(ASMState *as, IRIns *ir)
>      IRIns *irf = IR(ir->op1);
>      Reg idx = ra_alloc1(as, irf->op1, rset_exclude(RSET_GPR, src));
>      int32_t ofs = field_ofs[irf->op2];
> -    MIPSIns mi = asm_fxstoreins(ir);
> -    lua_assert(!irt_isfp(ir->t));
> +    MIPSIns mi = asm_fxstoreins(as, ir);
> +    lj_assertA(!irt_isfp(ir->t), "bad FP FSTORE");
>      emit_tsi(as, mi, src, idx, ofs);
>    }
>  }
> @@ -1290,8 +1307,9 @@ static void asm_xload(ASMState *as, IRIns *ir)
>  {
>    Reg dest = ra_dest(as, ir,
>      (!LJ_SOFTFP && irt_isfp(ir->t)) ? RSET_FPR : RSET_GPR);
> -  lua_assert(LJ_TARGET_UNALIGNED || !(ir->op2 & IRXLOAD_UNALIGNED));
> -  asm_fusexref(as, asm_fxloadins(ir), dest, ir->op1, RSET_GPR, 0);
> +  lj_assertA(LJ_TARGET_UNALIGNED || !(ir->op2 & IRXLOAD_UNALIGNED),
> +	     "unaligned XLOAD");
> +  asm_fusexref(as, asm_fxloadins(as, ir), dest, ir->op1, RSET_GPR, 0);
>  }
>  
>  static void asm_xstore_(ASMState *as, IRIns *ir, int32_t ofs)
> @@ -1299,7 +1317,7 @@ static void asm_xstore_(ASMState *as, IRIns *ir, int32_t ofs)
>    if (ir->r != RID_SINK) {
>      Reg src = ra_alloc1z(as, ir->op2,
>        (!LJ_SOFTFP && irt_isfp(ir->t)) ? RSET_FPR : RSET_GPR);
> -    asm_fusexref(as, asm_fxstoreins(ir), src, ir->op1,
> +    asm_fusexref(as, asm_fxstoreins(as, ir), src, ir->op1,
>  		 rset_exclude(RSET_GPR, src), ofs);
>    }
>  }
> @@ -1321,8 +1339,9 @@ static void asm_ahuvload(ASMState *as, IRIns *ir)
>      }
>    }
>    if (ra_used(ir)) {
> -    lua_assert((LJ_SOFTFP32 ? 0 : irt_isnum(ir->t)) ||
> -	       irt_isint(ir->t) || irt_isaddr(ir->t));
> +    lj_assertA((LJ_SOFTFP32 ? 0 : irt_isnum(ir->t)) ||
> +	       irt_isint(ir->t) || irt_isaddr(ir->t),
> +	       "bad load type %d", irt_type(ir->t));
>      dest = ra_dest(as, ir, (!LJ_SOFTFP && irt_isnum(t)) ? RSET_FPR : allow);
>      rset_clear(allow, dest);
>  #if LJ_64
> @@ -1427,10 +1446,13 @@ static void asm_sload(ASMState *as, IRIns *ir)
>  #else
>    int32_t ofs = 8*((int32_t)ir->op1-2);
>  #endif
> -  lua_assert(!(ir->op2 & IRSLOAD_PARENT));  /* Handled by asm_head_side(). */
> -  lua_assert(irt_isguard(ir->t) || !(ir->op2 & IRSLOAD_TYPECHECK));
> +  lj_assertA(!(ir->op2 & IRSLOAD_PARENT),
> +	     "bad parent SLOAD");  /* Handled by asm_head_side(). */
> +  lj_assertA(irt_isguard(ir->t) || !(ir->op2 & IRSLOAD_TYPECHECK),
> +	     "inconsistent SLOAD variant");
>  #if LJ_SOFTFP32
> -  lua_assert(!(ir->op2 & IRSLOAD_CONVERT));  /* Handled by LJ_SOFTFP SPLIT. */
> +  lj_assertA(!(ir->op2 & IRSLOAD_CONVERT),
> +	     "unsplit SLOAD convert");  /* Handled by LJ_SOFTFP SPLIT. */
>    if (hiop && ra_used(ir+1)) {
>      type = ra_dest(as, ir+1, allow);
>      rset_clear(allow, type);
> @@ -1443,8 +1465,9 @@ static void asm_sload(ASMState *as, IRIns *ir)
>    } else
>  #endif
>    if (ra_used(ir)) {
> -    lua_assert((LJ_SOFTFP32 ? 0 : irt_isnum(ir->t)) ||
> -	       irt_isint(ir->t) || irt_isaddr(ir->t));
> +    lj_assertA((LJ_SOFTFP32 ? 0 : irt_isnum(ir->t)) ||
> +	       irt_isint(ir->t) || irt_isaddr(ir->t),
> +	       "bad SLOAD type %d", irt_type(ir->t));
>      dest = ra_dest(as, ir, (!LJ_SOFTFP && irt_isnum(t)) ? RSET_FPR : allow);
>      rset_clear(allow, dest);
>      base = ra_alloc1(as, REF_BASE, allow);
> @@ -1556,7 +1579,8 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>    RegSet allow = (RSET_GPR & ~RSET_SCRATCH);
>    RegSet drop = RSET_SCRATCH;
>    Reg tmp;
> -  lua_assert(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL));
> +  lj_assertA(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL),
> +	     "bad CNEW/CNEWI operands");
>  
>    as->gcsteps++;
>    if (ra_hasreg(ir->r))
> @@ -1571,7 +1595,7 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>      int32_t ofs = sizeof(GCcdata);
>      if (sz == 8) {
>        ofs += 4;
> -      lua_assert((ir+1)->o == IR_HIOP);
> +      lj_assertA((ir+1)->o == IR_HIOP, "expected HIOP for CNEWI");
>        if (LJ_LE) ir++;
>      }
>      for (;;) {
> @@ -1585,7 +1609,7 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>      emit_tsi(as, sz == 8 ? MIPSI_SD : MIPSI_SW, ra_alloc1(as, ir->op2, allow),
>  	     RID_RET, sizeof(GCcdata));
>  #endif
> -    lua_assert(sz == 4 || sz == 8);
> +    lj_assertA(sz == 4 || sz == 8, "bad CNEWI size %d", sz);
>    } else if (ir->op2 != REF_NIL) {  /* Create VLA/VLS/aligned cdata. */
>      ci = &lj_ir_callinfo[IRCALL_lj_cdata_newv];
>      args[0] = ASMREF_L;     /* lua_State *L */
> @@ -1640,7 +1664,7 @@ static void asm_obar(ASMState *as, IRIns *ir)
>    MCLabel l_end;
>    Reg obj, val, tmp;
>    /* No need for other object barriers (yet). */
> -  lua_assert(IR(ir->op1)->o == IR_UREFC);
> +  lj_assertA(IR(ir->op1)->o == IR_UREFC, "bad OBAR type");
>    ra_evictset(as, RSET_SCRATCH);
>    l_end = emit_label(as);
>    args[0] = ASMREF_TMP1;  /* global_State *g */
> @@ -1715,7 +1739,7 @@ static void asm_add(ASMState *as, IRIns *ir)
>      Reg dest = ra_dest(as, ir, RSET_GPR);
>      Reg right, left = ra_hintalloc(as, ir->op1, dest, RSET_GPR);
>      if (irref_isk(ir->op2)) {
> -      intptr_t k = get_kval(IR(ir->op2));
> +      intptr_t k = get_kval(as, ir->op2);
>        if (checki16(k)) {
>  	emit_tsi(as, (LJ_64 && irt_is64(t)) ? MIPSI_DADDIU : MIPSI_ADDIU, dest,
>  		 left, k);
> @@ -1816,7 +1840,7 @@ static void asm_arithov(ASMState *as, IRIns *ir)
>  {
>    /* TODO MIPSR6: bovc/bnvc. Caveat: no delay slot to load RID_TMP. */
>    Reg right, left, tmp, dest = ra_dest(as, ir, RSET_GPR);
> -  lua_assert(!irt_is64(ir->t));
> +  lj_assertA(!irt_is64(ir->t), "bad usage");
>    if (irref_isk(ir->op2)) {
>      int k = IR(ir->op2)->i;
>      if (ir->o == IR_SUBOV) k = -k;
> @@ -2003,7 +2027,7 @@ static void asm_bitop(ASMState *as, IRIns *ir, MIPSIns mi, MIPSIns mik)
>    Reg dest = ra_dest(as, ir, RSET_GPR);
>    Reg right, left = ra_hintalloc(as, ir->op1, dest, RSET_GPR);
>    if (irref_isk(ir->op2)) {
> -    intptr_t k = get_kval(IR(ir->op2));
> +    intptr_t k = get_kval(as, ir->op2);
>      if (checku16(k)) {
>        emit_tsi(as, mik, dest, left, k);
>        return;
> @@ -2036,7 +2060,7 @@ static void asm_bitshift(ASMState *as, IRIns *ir, MIPSIns mi, MIPSIns mik)
>  #define asm_bshl(as, ir)	asm_bitshift(as, ir, MIPSI_SLLV, MIPSI_SLL)
>  #define asm_bshr(as, ir)	asm_bitshift(as, ir, MIPSI_SRLV, MIPSI_SRL)
>  #define asm_bsar(as, ir)	asm_bitshift(as, ir, MIPSI_SRAV, MIPSI_SRA)
> -#define asm_brol(as, ir)	lua_assert(0)
> +#define asm_brol(as, ir)	lj_assertA(0, "unexpected BROL")
>  
>  static void asm_bror(ASMState *as, IRIns *ir)
>  {
> @@ -2228,13 +2252,13 @@ static void asm_comp(ASMState *as, IRIns *ir)
>    } else {
>      Reg right, left = ra_alloc1(as, ir->op1, RSET_GPR);
>      if (op == IR_ABC) op = IR_UGT;
> -    if ((op&4) == 0 && irref_isk(ir->op2) && get_kval(IR(ir->op2)) == 0) {
> +    if ((op&4) == 0 && irref_isk(ir->op2) && get_kval(as, ir->op2) == 0) {
>        MIPSIns mi = (op&2) ? ((op&1) ? MIPSI_BLEZ : MIPSI_BGTZ) :
>  			    ((op&1) ? MIPSI_BLTZ : MIPSI_BGEZ);
>        asm_guard(as, mi, left, 0);
>      } else {
>        if (irref_isk(ir->op2)) {
> -	intptr_t k = get_kval(IR(ir->op2));
> +	intptr_t k = get_kval(as, ir->op2);
>  	if ((op&2)) k++;
>  	if (checki16(k)) {
>  	  asm_guard(as, (op&1) ? MIPSI_BNE : MIPSI_BEQ, RID_TMP, RID_ZERO);
> @@ -2390,10 +2414,11 @@ static void asm_hiop(ASMState *as, IRIns *ir)
>    case IR_CNEWI:
>      /* Nothing to do here. Handled by lo op itself. */
>      break;
> -  default: lua_assert(0); break;
> +  default: lj_assertA(0, "bad HIOP for op %d", (ir-1)->o); break;
>    }
>  #else
> -  UNUSED(as); UNUSED(ir); lua_assert(0);  /* Unused without FFI. */
> +  /* Unused on MIPS64 or without SOFTFP or FFI. */
> +  UNUSED(as); UNUSED(ir); lj_assertA(0, "unexpected HIOP");
>  #endif
>  }
>  
> @@ -2462,7 +2487,8 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
>  #if LJ_SOFTFP32
>        Reg tmp;
>        RegSet allow = rset_exclude(RSET_GPR, RID_BASE);
> -      lua_assert(irref_isk(ref));  /* LJ_SOFTFP: must be a number constant. */
> +      /* LJ_SOFTFP: must be a number constant. */
> +      lj_assertA(irref_isk(ref), "unsplit FP op");
>        tmp = ra_allock(as, (int32_t)ir_knum(ir)->u32.lo, allow);
>        emit_tsi(as, MIPSI_SW, tmp, RID_BASE, ofs+(LJ_BE?4:0));
>        if (rset_test(as->freeset, tmp+1)) allow = RID2RSET(tmp+1);
> @@ -2479,7 +2505,8 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
>  #if LJ_32
>        RegSet allow = rset_exclude(RSET_GPR, RID_BASE);
>        Reg type;
> -      lua_assert(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t));
> +      lj_assertA(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t),
> +		 "restore of IR type %d", irt_type(ir->t));
>        if (!irt_ispri(ir->t)) {
>  	Reg src = ra_alloc1(as, ref, allow);
>  	rset_clear(allow, src);
> @@ -2502,7 +2529,7 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
>      }
>      checkmclim(as);
>    }
> -  lua_assert(map + nent == flinks);
> +  lj_assertA(map + nent == flinks, "inconsistent frames in snapshot");
>  }
>  
>  /* -- GC handling --------------------------------------------------------- */
> @@ -2700,7 +2727,7 @@ void lj_asm_patchexit(jit_State *J, GCtrace *T, ExitNo exitno, MCode *target)
>  	}
>        } else if (p+1 == pe) {
>  	/* Patch NOP after code for inverted loop branch. Use of J is ok. */
> -	lua_assert(p[1] == MIPSI_NOP);
> +	lj_assertJ(p[1] == MIPSI_NOP, "expected NOP");
>  	p[1] = tjump;
>  	*p = MIPSI_NOP;  /* Replace the load of the exit number. */
>  	cstop = p+2;
> diff --git a/src/lj_asm_ppc.h b/src/lj_asm_ppc.h
> index 62a5c3e2..971dcc88 100644
> --- a/src/lj_asm_ppc.h
> +++ b/src/lj_asm_ppc.h
> @@ -181,7 +181,7 @@ static void asm_fusexref(ASMState *as, PPCIns pi, Reg rt, IRRef ref,
>  	return;
>        }
>      } else if (ir->o == IR_STRREF) {
> -      lua_assert(ofs == 0);
> +      lj_assertA(ofs == 0, "bad usage");
>        ofs = (int32_t)sizeof(GCstr);
>        if (irref_isk(ir->op2)) {
>  	ofs += IR(ir->op2)->i;
> @@ -268,7 +268,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
>  #if !LJ_SOFTFP
>        if (irt_isfp(ir->t)) {
>  	if (fpr <= REGARG_LASTFPR) {
> -	  lua_assert(rset_test(as->freeset, fpr));  /* Already evicted. */
> +	  lj_assertA(rset_test(as->freeset, fpr),
> +		     "reg %d not free", fpr);  /* Already evicted. */
>  	  ra_leftov(as, fpr, ref);
>  	  fpr++;
>  	} else {
> @@ -281,7 +282,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
>  #endif
>        {
>  	if (gpr <= REGARG_LASTGPR) {
> -	  lua_assert(rset_test(as->freeset, gpr));  /* Already evicted. */
> +	  lj_assertA(rset_test(as->freeset, gpr),
> +		     "reg %d not free", gpr);  /* Already evicted. */
>  	  ra_leftov(as, gpr, ref);
>  	  gpr++;
>  	} else {
> @@ -319,7 +321,7 @@ static void asm_setupresult(ASMState *as, IRIns *ir, const CCallInfo *ci)
>      rset_clear(drop, (ir+1)->r);  /* Dest reg handled below. */
>    ra_evictset(as, drop);  /* Evictions must be performed first. */
>    if (ra_used(ir)) {
> -    lua_assert(!irt_ispri(ir->t));
> +    lj_assertA(!irt_ispri(ir->t), "PRI dest");
>      if (!LJ_SOFTFP && irt_isfp(ir->t)) {
>        if ((ci->flags & CCI_CASTU64)) {
>  	/* Use spill slot or temp slots. */
> @@ -431,14 +433,18 @@ static void asm_conv(ASMState *as, IRIns *ir)
>    int stfp = (st == IRT_NUM || st == IRT_FLOAT);
>  #endif
>    IRRef lref = ir->op1;
> -  lua_assert(!(irt_isint64(ir->t) ||
> -	       (st == IRT_I64 || st == IRT_U64))); /* Handled by SPLIT. */
> +  /* 64 bit integer conversions are handled by SPLIT. */
> +  lj_assertA(!(irt_isint64(ir->t) || (st == IRT_I64 || st == IRT_U64)),
> +	     "IR %04d has unsplit 64 bit type",
> +	     (int)(ir - as->ir) - REF_BIAS);
>  #if LJ_SOFTFP
>    /* FP conversions are handled by SPLIT. */
> -  lua_assert(!irt_isfp(ir->t) && !(st == IRT_NUM || st == IRT_FLOAT));
> +  lj_assertA(!irt_isfp(ir->t) && !(st == IRT_NUM || st == IRT_FLOAT),
> +	     "IR %04d has FP type",
> +	     (int)(ir - as->ir) - REF_BIAS);
>    /* Can't check for same types: SPLIT uses CONV int.int + BXOR for sfp NEG. */
>  #else
> -  lua_assert(irt_type(ir->t) != st);
> +  lj_assertA(irt_type(ir->t) != st, "inconsistent types for CONV");
>    if (irt_isfp(ir->t)) {
>      Reg dest = ra_dest(as, ir, RSET_FPR);
>      if (stfp) {  /* FP to FP conversion. */
> @@ -467,7 +473,8 @@ static void asm_conv(ASMState *as, IRIns *ir)
>    } else if (stfp) {  /* FP to integer conversion. */
>      if (irt_isguard(ir->t)) {
>        /* Checked conversions are only supported from number to int. */
> -      lua_assert(irt_isint(ir->t) && st == IRT_NUM);
> +      lj_assertA(irt_isint(ir->t) && st == IRT_NUM,
> +		 "bad type for checked CONV");
>        asm_tointg(as, ir, ra_alloc1(as, lref, RSET_FPR));
>      } else {
>        Reg dest = ra_dest(as, ir, RSET_GPR);
> @@ -503,7 +510,7 @@ static void asm_conv(ASMState *as, IRIns *ir)
>      Reg dest = ra_dest(as, ir, RSET_GPR);
>      if (st >= IRT_I8 && st <= IRT_U16) {  /* Extend to 32 bit integer. */
>        Reg left = ra_alloc1(as, ir->op1, RSET_GPR);
> -      lua_assert(irt_isint(ir->t) || irt_isu32(ir->t));
> +      lj_assertA(irt_isint(ir->t) || irt_isu32(ir->t), "bad type for CONV EXT");
>        if ((ir->op2 & IRCONV_SEXT))
>  	emit_as(as, st == IRT_I8 ? PPCI_EXTSB : PPCI_EXTSH, dest, left);
>        else
> @@ -699,7 +706,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>  	    (((char *)as->mcp-(char *)l_loop) & 0xffffu);
>  
>    /* Load main position relative to tab->node into dest. */
> -  khash = isk ? ir_khash(irkey) : 1;
> +  khash = isk ? ir_khash(as, irkey) : 1;
>    if (khash == 0) {
>      emit_tai(as, PPCI_LWZ, dest, tab, (int32_t)offsetof(GCtab, node));
>    } else {
> @@ -754,7 +761,7 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
>    Reg node = ra_alloc1(as, ir->op1, RSET_GPR);
>    Reg key = RID_NONE, type = RID_TMP, idx = node;
>    RegSet allow = rset_exclude(RSET_GPR, node);
> -  lua_assert(ofs % sizeof(Node) == 0);
> +  lj_assertA(ofs % sizeof(Node) == 0, "unaligned HREFK slot");
>    if (ofs > 32736) {
>      idx = dest;
>      rset_clear(allow, dest);
> @@ -813,7 +820,7 @@ static void asm_uref(ASMState *as, IRIns *ir)
>  static void asm_fref(ASMState *as, IRIns *ir)
>  {
>    UNUSED(as); UNUSED(ir);
> -  lua_assert(!ra_used(ir));
> +  lj_assertA(!ra_used(ir), "unfused FREF");
>  }
>  
>  static void asm_strref(ASMState *as, IRIns *ir)
> @@ -853,25 +860,27 @@ static void asm_strref(ASMState *as, IRIns *ir)
>  
>  /* -- Loads and stores ---------------------------------------------------- */
>  
> -static PPCIns asm_fxloadins(IRIns *ir)
> +static PPCIns asm_fxloadins(ASMState *as, IRIns *ir)
>  {
> +  UNUSED(as);
>    switch (irt_type(ir->t)) {
>    case IRT_I8: return PPCI_LBZ;  /* Needs sign-extension. */
>    case IRT_U8: return PPCI_LBZ;
>    case IRT_I16: return PPCI_LHA;
>    case IRT_U16: return PPCI_LHZ;
> -  case IRT_NUM: lua_assert(!LJ_SOFTFP); return PPCI_LFD;
> +  case IRT_NUM: lj_assertA(!LJ_SOFTFP, "unsplit FP op"); return PPCI_LFD;
>    case IRT_FLOAT: if (!LJ_SOFTFP) return PPCI_LFS;
>    default: return PPCI_LWZ;
>    }
>  }
>  
> -static PPCIns asm_fxstoreins(IRIns *ir)
> +static PPCIns asm_fxstoreins(ASMState *as, IRIns *ir)
>  {
> +  UNUSED(as);
>    switch (irt_type(ir->t)) {
>    case IRT_I8: case IRT_U8: return PPCI_STB;
>    case IRT_I16: case IRT_U16: return PPCI_STH;
> -  case IRT_NUM: lua_assert(!LJ_SOFTFP); return PPCI_STFD;
> +  case IRT_NUM: lj_assertA(!LJ_SOFTFP, "unsplit FP op"); return PPCI_STFD;
>    case IRT_FLOAT: if (!LJ_SOFTFP) return PPCI_STFS;
>    default: return PPCI_STW;
>    }
> @@ -880,10 +889,10 @@ static PPCIns asm_fxstoreins(IRIns *ir)
>  static void asm_fload(ASMState *as, IRIns *ir)
>  {
>    Reg dest = ra_dest(as, ir, RSET_GPR);
> -  PPCIns pi = asm_fxloadins(ir);
> +  PPCIns pi = asm_fxloadins(as, ir);
>    Reg idx;
>    int32_t ofs;
> -  if (ir->op1 == REF_NIL) {
> +  if (ir->op1 == REF_NIL) {  /* FLOAD from GG_State with offset. */
>      idx = RID_JGL;
>      ofs = (ir->op2 << 2) - 32768;
>    } else {
> @@ -897,7 +906,7 @@ static void asm_fload(ASMState *as, IRIns *ir)
>      }
>      ofs = field_ofs[ir->op2];
>    }
> -  lua_assert(!irt_isi8(ir->t));
> +  lj_assertA(!irt_isi8(ir->t), "unsupported FLOAD I8");
>    emit_tai(as, pi, dest, idx, ofs);
>  }
>  
> @@ -908,7 +917,7 @@ static void asm_fstore(ASMState *as, IRIns *ir)
>      IRIns *irf = IR(ir->op1);
>      Reg idx = ra_alloc1(as, irf->op1, rset_exclude(RSET_GPR, src));
>      int32_t ofs = field_ofs[irf->op2];
> -    PPCIns pi = asm_fxstoreins(ir);
> +    PPCIns pi = asm_fxstoreins(as, ir);
>      emit_tai(as, pi, src, idx, ofs);
>    }
>  }
> @@ -917,10 +926,10 @@ static void asm_xload(ASMState *as, IRIns *ir)
>  {
>    Reg dest = ra_dest(as, ir,
>      (!LJ_SOFTFP && irt_isfp(ir->t)) ? RSET_FPR : RSET_GPR);
> -  lua_assert(!(ir->op2 & IRXLOAD_UNALIGNED));
> +  lj_assertA(!(ir->op2 & IRXLOAD_UNALIGNED), "unaligned XLOAD");
>    if (irt_isi8(ir->t))
>      emit_as(as, PPCI_EXTSB, dest, dest);
> -  asm_fusexref(as, asm_fxloadins(ir), dest, ir->op1, RSET_GPR, 0);
> +  asm_fusexref(as, asm_fxloadins(as, ir), dest, ir->op1, RSET_GPR, 0);
>  }
>  
>  static void asm_xstore_(ASMState *as, IRIns *ir, int32_t ofs)
> @@ -936,7 +945,7 @@ static void asm_xstore_(ASMState *as, IRIns *ir, int32_t ofs)
>    } else {
>      Reg src = ra_alloc1(as, ir->op2,
>        (!LJ_SOFTFP && irt_isfp(ir->t)) ? RSET_FPR : RSET_GPR);
> -    asm_fusexref(as, asm_fxstoreins(ir), src, ir->op1,
> +    asm_fusexref(as, asm_fxstoreins(as, ir), src, ir->op1,
>  		 rset_exclude(RSET_GPR, src), ofs);
>    }
>  }
> @@ -958,8 +967,9 @@ static void asm_ahuvload(ASMState *as, IRIns *ir)
>      ofs = 0;
>    }
>    if (ra_used(ir)) {
> -    lua_assert((LJ_SOFTFP ? 0 : irt_isnum(ir->t)) ||
> -	       irt_isint(ir->t) || irt_isaddr(ir->t));
> +    lj_assertA((LJ_SOFTFP ? 0 : irt_isnum(ir->t)) ||
> +	       irt_isint(ir->t) || irt_isaddr(ir->t),
> +	       "bad load type %d", irt_type(ir->t));
>      if (LJ_SOFTFP || !irt_isnum(t)) ofs = 0;
>      dest = ra_dest(as, ir, (!LJ_SOFTFP && irt_isnum(t)) ? RSET_FPR : allow);
>      rset_clear(allow, dest);
> @@ -1042,12 +1052,16 @@ static void asm_sload(ASMState *as, IRIns *ir)
>    int hiop = (LJ_SOFTFP && (ir+1)->o == IR_HIOP);
>    if (hiop)
>      t.irt = IRT_NUM;
> -  lua_assert(!(ir->op2 & IRSLOAD_PARENT));  /* Handled by asm_head_side(). */
> -  lua_assert(irt_isguard(ir->t) || !(ir->op2 & IRSLOAD_TYPECHECK));
> -  lua_assert(LJ_DUALNUM ||
> -	     !irt_isint(t) || (ir->op2 & (IRSLOAD_CONVERT|IRSLOAD_FRAME)));
> +  lj_assertA(!(ir->op2 & IRSLOAD_PARENT),
> +	     "bad parent SLOAD");  /* Handled by asm_head_side(). */
> +  lj_assertA(irt_isguard(ir->t) || !(ir->op2 & IRSLOAD_TYPECHECK),
> +	     "inconsistent SLOAD variant");
> +  lj_assertA(LJ_DUALNUM ||
> +	     !irt_isint(t) || (ir->op2 & (IRSLOAD_CONVERT|IRSLOAD_FRAME)),
> +	     "bad SLOAD type");
>  #if LJ_SOFTFP
> -  lua_assert(!(ir->op2 & IRSLOAD_CONVERT));  /* Handled by LJ_SOFTFP SPLIT. */
> +  lj_assertA(!(ir->op2 & IRSLOAD_CONVERT),
> +	     "unsplit SLOAD convert");  /* Handled by LJ_SOFTFP SPLIT. */
>    if (hiop && ra_used(ir+1)) {
>      type = ra_dest(as, ir+1, allow);
>      rset_clear(allow, type);
> @@ -1060,7 +1074,8 @@ static void asm_sload(ASMState *as, IRIns *ir)
>    } else
>  #endif
>    if (ra_used(ir)) {
> -    lua_assert(irt_isnum(t) || irt_isint(t) || irt_isaddr(t));
> +    lj_assertA(irt_isnum(t) || irt_isint(t) || irt_isaddr(t),
> +	       "bad SLOAD type %d", irt_type(ir->t));
>      dest = ra_dest(as, ir, (!LJ_SOFTFP && irt_isnum(t)) ? RSET_FPR : allow);
>      rset_clear(allow, dest);
>      base = ra_alloc1(as, REF_BASE, allow);
> @@ -1127,7 +1142,8 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>    const CCallInfo *ci = &lj_ir_callinfo[IRCALL_lj_mem_newgco];
>    IRRef args[4];
>    RegSet drop = RSET_SCRATCH;
> -  lua_assert(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL));
> +  lj_assertA(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL),
> +	     "bad CNEW/CNEWI operands");
>  
>    as->gcsteps++;
>    if (ra_hasreg(ir->r))
> @@ -1140,10 +1156,10 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>    if (ir->o == IR_CNEWI) {
>      RegSet allow = (RSET_GPR & ~RSET_SCRATCH);
>      int32_t ofs = sizeof(GCcdata);
> -    lua_assert(sz == 4 || sz == 8);
> +    lj_assertA(sz == 4 || sz == 8, "bad CNEWI size %d", sz);
>      if (sz == 8) {
>        ofs += 4;
> -      lua_assert((ir+1)->o == IR_HIOP);
> +      lj_assertA((ir+1)->o == IR_HIOP, "expected HIOP for CNEWI");
>      }
>      for (;;) {
>        Reg r = ra_alloc1(as, ir->op2, allow);
> @@ -1190,7 +1206,7 @@ static void asm_tbar(ASMState *as, IRIns *ir)
>    emit_tai(as, PPCI_STW, link, tab, (int32_t)offsetof(GCtab, gclist));
>    emit_tai(as, PPCI_STB, mark, tab, (int32_t)offsetof(GCtab, marked));
>    emit_setgl(as, tab, gc.grayagain);
> -  lua_assert(LJ_GC_BLACK == 0x04);
> +  lj_assertA(LJ_GC_BLACK == 0x04, "bad LJ_GC_BLACK");
>    emit_rot(as, PPCI_RLWINM, mark, mark, 0, 30, 28);  /* Clear black bit. */
>    emit_getgl(as, link, gc.grayagain);
>    emit_condbranch(as, PPCI_BC|PPCF_Y, CC_EQ, l_end);
> @@ -1205,7 +1221,7 @@ static void asm_obar(ASMState *as, IRIns *ir)
>    MCLabel l_end;
>    Reg obj, val, tmp;
>    /* No need for other object barriers (yet). */
> -  lua_assert(IR(ir->op1)->o == IR_UREFC);
> +  lj_assertA(IR(ir->op1)->o == IR_UREFC, "bad OBAR type");
>    ra_evictset(as, RSET_SCRATCH);
>    l_end = emit_label(as);
>    args[0] = ASMREF_TMP1;  /* global_State *g */
> @@ -1676,7 +1692,7 @@ static void asm_bitshift(ASMState *as, IRIns *ir, PPCIns pi, PPCIns pik)
>  #define asm_brol(as, ir) \
>    asm_bitshift(as, ir, PPCI_RLWNM|PPCF_MB(0)|PPCF_ME(31), \
>  		       PPCI_RLWINM|PPCF_MB(0)|PPCF_ME(31))
> -#define asm_bror(as, ir)	lua_assert(0)
> +#define asm_bror(as, ir)	lj_assertA(0, "unexpected BROR")
>  
>  #if LJ_SOFTFP
>  static void asm_sfpmin_max(ASMState *as, IRIns *ir)
> @@ -1951,10 +1967,11 @@ static void asm_hiop(ASMState *as, IRIns *ir)
>    case IR_CNEWI:
>      /* Nothing to do here. Handled by lo op itself. */
>      break;
> -  default: lua_assert(0); break;
> +  default: lj_assertA(0, "bad HIOP for op %d", (ir-1)->o); break;
>    }
>  #else
> -  UNUSED(as); UNUSED(ir); lua_assert(0);  /* Unused without FFI. */
> +  /* Unused without SOFTFP or FFI. */
> +  UNUSED(as); UNUSED(ir); lj_assertA(0, "unexpected HIOP");
>  #endif
>  }
>  
> @@ -2014,7 +2031,8 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
>  #if LJ_SOFTFP
>        Reg tmp;
>        RegSet allow = rset_exclude(RSET_GPR, RID_BASE);
> -      lua_assert(irref_isk(ref));  /* LJ_SOFTFP: must be a number constant. */
> +      /* LJ_SOFTFP: must be a number constant. */
> +      lj_assertA(irref_isk(ref), "unsplit FP op");
>        tmp = ra_allock(as, (int32_t)ir_knum(ir)->u32.lo, allow);
>        emit_tai(as, PPCI_STW, tmp, RID_BASE, ofs+(LJ_BE?4:0));
>        if (rset_test(as->freeset, tmp+1)) allow = RID2RSET(tmp+1);
> @@ -2027,7 +2045,8 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
>      } else {
>        Reg type;
>        RegSet allow = rset_exclude(RSET_GPR, RID_BASE);
> -      lua_assert(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t));
> +      lj_assertA(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t),
> +		 "restore of IR type %d", irt_type(ir->t));
>        if (!irt_ispri(ir->t)) {
>  	Reg src = ra_alloc1(as, ref, allow);
>  	rset_clear(allow, src);
> @@ -2047,7 +2066,7 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
>      }
>      checkmclim(as);
>    }
> -  lua_assert(map + nent == flinks);
> +  lj_assertA(map + nent == flinks, "inconsistent frames in snapshot");
>  }
>  
>  /* -- GC handling --------------------------------------------------------- */
> @@ -2145,7 +2164,7 @@ static void asm_tail_fixup(ASMState *as, TraceNo lnk)
>      as->mctop = p;
>    } else {
>      /* Patch stack adjustment. */
> -    lua_assert(checki16(CFRAME_SIZE+spadj));
> +    lj_assertA(checki16(CFRAME_SIZE+spadj), "stack adjustment out of range");
>      p[-3] = PPCI_ADDI | PPCF_T(RID_TMP) | PPCF_A(RID_SP) | (CFRAME_SIZE+spadj);
>      p[-2] = PPCI_STWU | PPCF_T(RID_TMP) | PPCF_A(RID_SP) | spadj;
>    }
> @@ -2222,14 +2241,16 @@ void lj_asm_patchexit(jit_State *J, GCtrace *T, ExitNo exitno, MCode *target)
>      } else if ((ins & 0xfc000000u) == PPCI_B &&
>  	       ((ins ^ ((char *)px-(char *)p)) & 0x03ffffffu) == 0) {
>        ptrdiff_t delta = (char *)target - (char *)p;
> -      lua_assert(((delta + 0x02000000) >> 26) == 0);
> +      lj_assertJ(((delta + 0x02000000) >> 26) == 0,
> +		 "branch target out of range");
>        *p = PPCI_B | ((uint32_t)delta & 0x03ffffffu);
>        if (!cstart) cstart = p;
>      }
>    }
>    {  /* Always patch long-range branch in exit stub itself. */
>      ptrdiff_t delta = (char *)target - (char *)px - clearso;
> -    lua_assert(((delta + 0x02000000) >> 26) == 0);
> +    lj_assertJ(((delta + 0x02000000) >> 26) == 0,
> +	       "branch target out of range");
>      *px = PPCI_B | ((uint32_t)delta & 0x03ffffffu);
>    }
>    if (!cstart) cstart = px;
> diff --git a/src/lj_asm_x86.h b/src/lj_asm_x86.h
> index 5f5fe3cf..74f2d853 100644
> --- a/src/lj_asm_x86.h
> +++ b/src/lj_asm_x86.h
> @@ -31,7 +31,7 @@ static MCode *asm_exitstub_gen(ASMState *as, ExitNo group)
>  #endif
>    /* Jump to exit handler which fills in the ExitState. */
>    *mxp++ = XI_JMP; mxp += 4;
> -  *((int32_t *)(mxp-4)) = jmprel(mxp, (MCode *)(void *)lj_vm_exit_handler);
> +  *((int32_t *)(mxp-4)) = jmprel(as->J, mxp, (MCode *)(void *)lj_vm_exit_handler);
>    /* Commit the code for this group (even if assembly fails later on). */
>    lj_mcode_commitbot(as->J, mxp);
>    as->mcbot = mxp;
> @@ -60,7 +60,7 @@ static void asm_guardcc(ASMState *as, int cc)
>    MCode *p = as->mcp;
>    if (LJ_UNLIKELY(p == as->invmcp)) {
>      as->loopinv = 1;
> -    *(int32_t *)(p+1) = jmprel(p+5, target);
> +    *(int32_t *)(p+1) = jmprel(as->J, p+5, target);
>      target = p;
>      cc ^= 1;
>      if (as->realign) {
> @@ -131,7 +131,7 @@ static IRRef asm_fuseabase(ASMState *as, IRRef ref)
>    as->mrm.ofs = 0;
>    if (irb->o == IR_FLOAD) {
>      IRIns *ira = IR(irb->op1);
> -    lua_assert(irb->op2 == IRFL_TAB_ARRAY);
> +    lj_assertA(irb->op2 == IRFL_TAB_ARRAY, "expected FLOAD TAB_ARRAY");
>      /* We can avoid the FLOAD of t->array for colocated arrays. */
>      if (ira->o == IR_TNEW && ira->op1 <= LJ_MAX_COLOSIZE &&
>  	!neverfuse(as) && noconflict(as, irb->op1, IR_NEWREF, 1)) {
> @@ -150,7 +150,7 @@ static IRRef asm_fuseabase(ASMState *as, IRRef ref)
>  static void asm_fusearef(ASMState *as, IRIns *ir, RegSet allow)
>  {
>    IRIns *irx;
> -  lua_assert(ir->o == IR_AREF);
> +  lj_assertA(ir->o == IR_AREF, "expected AREF");
>    as->mrm.base = (uint8_t)ra_alloc1(as, asm_fuseabase(as, ir->op1), allow);
>    irx = IR(ir->op2);
>    if (irref_isk(ir->op2)) {
> @@ -217,8 +217,9 @@ static void asm_fuseahuref(ASMState *as, IRRef ref, RegSet allow)
>        }
>        break;
>      default:
> -      lua_assert(ir->o == IR_HREF || ir->o == IR_NEWREF || ir->o == IR_UREFO ||
> -		 ir->o == IR_KKPTR);
> +      lj_assertA(ir->o == IR_HREF || ir->o == IR_NEWREF || ir->o == IR_UREFO ||
> +		 ir->o == IR_KKPTR,
> +		 "bad IR op %d", ir->o);
>        break;
>      }
>    }
> @@ -230,9 +231,10 @@ static void asm_fuseahuref(ASMState *as, IRRef ref, RegSet allow)
>  /* Fuse FLOAD/FREF reference into memory operand. */
>  static void asm_fusefref(ASMState *as, IRIns *ir, RegSet allow)
>  {
> -  lua_assert(ir->o == IR_FLOAD || ir->o == IR_FREF);
> +  lj_assertA(ir->o == IR_FLOAD || ir->o == IR_FREF,
> +	     "bad IR op %d", ir->o);
>    as->mrm.idx = RID_NONE;
> -  if (ir->op1 == REF_NIL) {
> +  if (ir->op1 == REF_NIL) {  /* FLOAD from GG_State with offset. */
>  #if LJ_GC64
>      as->mrm.ofs = (int32_t)(ir->op2 << 2) - GG_OFS(dispatch);
>      as->mrm.base = RID_DISPATCH;
> @@ -271,7 +273,7 @@ static void asm_fusefref(ASMState *as, IRIns *ir, RegSet allow)
>  static void asm_fusestrref(ASMState *as, IRIns *ir, RegSet allow)
>  {
>    IRIns *irr;
> -  lua_assert(ir->o == IR_STRREF);
> +  lj_assertA(ir->o == IR_STRREF, "bad IR op %d", ir->o);
>    as->mrm.base = as->mrm.idx = RID_NONE;
>    as->mrm.scale = XM_SCALE1;
>    as->mrm.ofs = sizeof(GCstr);
> @@ -378,9 +380,10 @@ static Reg asm_fuseloadk64(ASMState *as, IRIns *ir)
>  	     checki32(mctopofs(as, k)) && checki32(mctopofs(as, k+1))) {
>      as->mrm.ofs = (int32_t)mcpofs(as, k);
>      as->mrm.base = RID_RIP;
> -  } else {
> +  } else {  /* Intern 64 bit constant at bottom of mcode. */
>      if (ir->i) {
> -      lua_assert(*k == *(uint64_t*)(as->mctop - ir->i));
> +      lj_assertA(*k == *(uint64_t*)(as->mctop - ir->i),
> +		 "bad interned 64 bit constant");
>      } else {
>        while ((uintptr_t)as->mcbot & 7) *as->mcbot++ = XI_INT3;
>        *(uint64_t*)as->mcbot = *k;
> @@ -420,12 +423,12 @@ static Reg asm_fuseload(ASMState *as, IRRef ref, RegSet allow)
>    }
>    if (ir->o == IR_KNUM) {
>      RegSet avail = as->freeset & ~as->modset & RSET_FPR;
> -    lua_assert(allow != RSET_EMPTY);
> +    lj_assertA(allow != RSET_EMPTY, "no register allowed");
>      if (!(avail & (avail-1)))  /* Fuse if less than two regs available. */
>        return asm_fuseloadk64(as, ir);
>    } else if (ref == REF_BASE || ir->o == IR_KINT64) {
>      RegSet avail = as->freeset & ~as->modset & RSET_GPR;
> -    lua_assert(allow != RSET_EMPTY);
> +    lj_assertA(allow != RSET_EMPTY, "no register allowed");
>      if (!(avail & (avail-1))) {  /* Fuse if less than two regs available. */
>        if (ref == REF_BASE) {
>  #if LJ_GC64
> @@ -606,7 +609,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
>  #endif
>  	  emit_loadi(as, r, ir->i);
>        } else {
> -	lua_assert(rset_test(as->freeset, r));  /* Must have been evicted. */
> +	/* Must have been evicted. */
> +	lj_assertA(rset_test(as->freeset, r), "reg %d not free", r);
>  	if (ra_hasreg(ir->r)) {
>  	  ra_noweak(as, ir->r);
>  	  emit_movrr(as, ir, r, ir->r);
> @@ -615,7 +619,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
>  	}
>        }
>      } else if (irt_isfp(ir->t)) {  /* FP argument is on stack. */
> -      lua_assert(!(irt_isfloat(ir->t) && irref_isk(ref)));  /* No float k. */
> +      lj_assertA(!(irt_isfloat(ir->t) && irref_isk(ref)),
> +		 "unexpected float constant");
>        if (LJ_32 && (ofs & 4) && irref_isk(ref)) {
>  	/* Split stores for unaligned FP consts. */
>  	emit_movmroi(as, RID_ESP, ofs, (int32_t)ir_knum(ir)->u32.lo);
> @@ -691,7 +696,7 @@ static void asm_setupresult(ASMState *as, IRIns *ir, const CCallInfo *ci)
>        ra_destpair(as, ir);
>  #endif
>      } else {
> -      lua_assert(!irt_ispri(ir->t));
> +      lj_assertA(!irt_ispri(ir->t), "PRI dest");
>        ra_destreg(as, ir, RID_RET);
>      }
>    } else if (LJ_32 && irt_isfp(ir->t) && !(ci->flags & CCI_CASTU64)) {
> @@ -810,8 +815,10 @@ static void asm_conv(ASMState *as, IRIns *ir)
>    int st64 = (st == IRT_I64 || st == IRT_U64 || (LJ_64 && st == IRT_P64));
>    int stfp = (st == IRT_NUM || st == IRT_FLOAT);
>    IRRef lref = ir->op1;
> -  lua_assert(irt_type(ir->t) != st);
> -  lua_assert(!(LJ_32 && (irt_isint64(ir->t) || st64)));  /* Handled by SPLIT. */
> +  lj_assertA(irt_type(ir->t) != st, "inconsistent types for CONV");
> +  lj_assertA(!(LJ_32 && (irt_isint64(ir->t) || st64)),
> +	     "IR %04d has unsplit 64 bit type",
> +	     (int)(ir - as->ir) - REF_BIAS);
>    if (irt_isfp(ir->t)) {
>      Reg dest = ra_dest(as, ir, RSET_FPR);
>      if (stfp) {  /* FP to FP conversion. */
> @@ -847,7 +854,8 @@ static void asm_conv(ASMState *as, IRIns *ir)
>    } else if (stfp) {  /* FP to integer conversion. */
>      if (irt_isguard(ir->t)) {
>        /* Checked conversions are only supported from number to int. */
> -      lua_assert(irt_isint(ir->t) && st == IRT_NUM);
> +      lj_assertA(irt_isint(ir->t) && st == IRT_NUM,
> +		 "bad type for checked CONV");
>        asm_tointg(as, ir, ra_alloc1(as, lref, RSET_FPR));
>      } else {
>        Reg dest = ra_dest(as, ir, RSET_GPR);
> @@ -882,7 +890,7 @@ static void asm_conv(ASMState *as, IRIns *ir)
>      Reg left, dest = ra_dest(as, ir, RSET_GPR);
>      RegSet allow = RSET_GPR;
>      x86Op op;
> -    lua_assert(irt_isint(ir->t) || irt_isu32(ir->t));
> +    lj_assertA(irt_isint(ir->t) || irt_isu32(ir->t), "bad type for CONV EXT");
>      if (st == IRT_I8) {
>        op = XO_MOVSXb; allow = RSET_GPR8; dest |= FORCE_REX;
>      } else if (st == IRT_U8) {
> @@ -953,7 +961,7 @@ static void asm_conv_fp_int64(ASMState *as, IRIns *ir)
>      emit_sjcc(as, CC_NS, l_end);
>      emit_rr(as, XO_TEST, hi, hi);  /* Check if u64 >= 2^63. */
>    } else {
> -    lua_assert(((ir-1)->op2 & IRCONV_SRCMASK) == IRT_I64);
> +    lj_assertA(((ir-1)->op2 & IRCONV_SRCMASK) == IRT_I64, "bad type for CONV");
>    }
>    emit_rmro(as, XO_FILDq, XOg_FILDq, RID_ESP, 0);
>    /* NYI: Avoid narrow-to-wide store-to-load forwarding stall. */
> @@ -967,8 +975,8 @@ static void asm_conv_int64_fp(ASMState *as, IRIns *ir)
>    IRType st = (IRType)((ir-1)->op2 & IRCONV_SRCMASK);
>    IRType dt = (((ir-1)->op2 & IRCONV_DSTMASK) >> IRCONV_DSH);
>    Reg lo, hi;
> -  lua_assert(st == IRT_NUM || st == IRT_FLOAT);
> -  lua_assert(dt == IRT_I64 || dt == IRT_U64);
> +  lj_assertA(st == IRT_NUM || st == IRT_FLOAT, "bad type for CONV");
> +  lj_assertA(dt == IRT_I64 || dt == IRT_U64, "bad type for CONV");
>    hi = ra_dest(as, ir, RSET_GPR);
>    lo = ra_dest(as, ir-1, rset_exclude(RSET_GPR, hi));
>    if (ra_used(ir-1)) emit_rmro(as, XO_MOV, lo, RID_ESP, 0);
> @@ -1180,13 +1188,13 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>        emit_rmro(as, XO_CMP, tmp|REX_64, dest, offsetof(Node, key.u64));
>      }
>    } else {
> -    lua_assert(irt_ispri(kt) && !irt_isnil(kt));
> +    lj_assertA(irt_ispri(kt) && !irt_isnil(kt), "bad HREF key type");
>      emit_u32(as, (irt_toitype(kt)<<15)|0x7fff);
>      emit_rmro(as, XO_ARITHi, XOg_CMP, dest, offsetof(Node, key.it));
>  #else
>    } else {
>      if (!irt_ispri(kt)) {
> -      lua_assert(irt_isaddr(kt));
> +      lj_assertA(irt_isaddr(kt), "bad HREF key type");
>        if (isk)
>  	emit_gmroi(as, XG_ARITHi(XOg_CMP), dest, offsetof(Node, key.gcr),
>  		   ptr2addr(ir_kgc(irkey)));
> @@ -1194,7 +1202,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>  	emit_rmro(as, XO_CMP, key, dest, offsetof(Node, key.gcr));
>        emit_sjcc(as, CC_NE, l_next);
>      }
> -    lua_assert(!irt_isnil(kt));
> +    lj_assertA(!irt_isnil(kt), "bad HREF key type");
>      emit_i8(as, irt_toitype(kt));
>      emit_rmro(as, XO_ARITHi8, XOg_CMP, dest, offsetof(Node, key.it));
>  #endif
> @@ -1209,7 +1217,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>  #endif
>  
>    /* Load main position relative to tab->node into dest. */
> -  khash = isk ? ir_khash(irkey) : 1;
> +  khash = isk ? ir_khash(as, irkey) : 1;
>    if (khash == 0) {
>      emit_rmro(as, XO_MOV, dest|REX_GC64, tab, offsetof(GCtab, node));
>    } else {
> @@ -1276,7 +1284,7 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
>  #if !LJ_64
>    MCLabel l_exit;
>  #endif
> -  lua_assert(ofs % sizeof(Node) == 0);
> +  lj_assertA(ofs % sizeof(Node) == 0, "unaligned HREFK slot");
>    if (ra_hasreg(dest)) {
>      if (ofs != 0) {
>        if (dest == node && !(as->flags & JIT_F_LEA_AGU))
> @@ -1293,7 +1301,8 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
>      Reg key = ra_scratch(as, rset_exclude(RSET_GPR, node));
>      emit_rmro(as, XO_CMP, key|REX_64, node,
>  	       ofs + (int32_t)offsetof(Node, key.u64));
> -    lua_assert(irt_isnum(irkey->t) || irt_isgcv(irkey->t));
> +    lj_assertA(irt_isnum(irkey->t) || irt_isgcv(irkey->t),
> +	       "bad HREFK key type");
>      /* Assumes -0.0 is already canonicalized to +0.0. */
>      emit_loadu64(as, key, irt_isnum(irkey->t) ? ir_knum(irkey)->u64 :
>  #if LJ_GC64
> @@ -1304,7 +1313,7 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
>  			  (uint64_t)(uint32_t)ptr2addr(ir_kgc(irkey)));
>  #endif
>    } else {
> -    lua_assert(!irt_isnil(irkey->t));
> +    lj_assertA(!irt_isnil(irkey->t), "bad HREFK key type");
>  #if LJ_GC64
>      emit_i32(as, (irt_toitype(irkey->t)<<15)|0x7fff);
>      emit_rmro(as, XO_ARITHi, XOg_CMP, node,
> @@ -1328,13 +1337,13 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
>  	       (int32_t)ir_knum(irkey)->u32.hi);
>    } else {
>      if (!irt_ispri(irkey->t)) {
> -      lua_assert(irt_isgcv(irkey->t));
> +      lj_assertA(irt_isgcv(irkey->t), "bad HREFK key type");
>        emit_gmroi(as, XG_ARITHi(XOg_CMP), node,
>  		 ofs + (int32_t)offsetof(Node, key.gcr),
>  		 ptr2addr(ir_kgc(irkey)));
>        emit_sjcc(as, CC_NE, l_exit);
>      }
> -    lua_assert(!irt_isnil(irkey->t));
> +    lj_assertA(!irt_isnil(irkey->t), "bad HREFK key type");
>      emit_i8(as, irt_toitype(irkey->t));
>      emit_rmro(as, XO_ARITHi8, XOg_CMP, node,
>  	      ofs + (int32_t)offsetof(Node, key.it));
> @@ -1407,7 +1416,8 @@ static void asm_fxload(ASMState *as, IRIns *ir)
>      if (LJ_64 && irt_is64(ir->t))
>        dest |= REX_64;
>      else
> -      lua_assert(irt_isint(ir->t) || irt_isu32(ir->t) || irt_isaddr(ir->t));
> +      lj_assertA(irt_isint(ir->t) || irt_isu32(ir->t) || irt_isaddr(ir->t),
> +		 "unsplit 64 bit load");
>      xo = XO_MOV;
>      break;
>    }
> @@ -1452,13 +1462,16 @@ static void asm_fxstore(ASMState *as, IRIns *ir)
>      case IRT_NUM: xo = XO_MOVSDto; break;
>      case IRT_FLOAT: xo = XO_MOVSSto; break;
>  #if LJ_64 && !LJ_GC64
> -    case IRT_LIGHTUD: lua_assert(0);  /* NYI: mask 64 bit lightuserdata. */
> +    case IRT_LIGHTUD:
> +      /* NYI: mask 64 bit lightuserdata. */
> +      lj_assertA(0, "store of lightuserdata");
>  #endif
>      default:
>        if (LJ_64 && irt_is64(ir->t))
>  	src |= REX_64;
>        else
> -	lua_assert(irt_isint(ir->t) || irt_isu32(ir->t) || irt_isaddr(ir->t));
> +	lj_assertA(irt_isint(ir->t) || irt_isu32(ir->t) || irt_isaddr(ir->t),
> +		   "unsplit 64 bit store");
>        xo = XO_MOVto;
>        break;
>      }
> @@ -1472,8 +1485,8 @@ static void asm_fxstore(ASMState *as, IRIns *ir)
>        emit_i8(as, k);
>        emit_mrm(as, XO_MOVmib, 0, RID_MRM);
>      } else {
> -      lua_assert(irt_is64(ir->t) || irt_isint(ir->t) || irt_isu32(ir->t) ||
> -		 irt_isaddr(ir->t));
> +      lj_assertA(irt_is64(ir->t) || irt_isint(ir->t) || irt_isu32(ir->t) ||
> +		 irt_isaddr(ir->t), "bad store type");
>        emit_i32(as, k);
>        emit_mrm(as, XO_MOVmi, REX_64IR(ir, 0), RID_MRM);
>      }
> @@ -1508,8 +1521,9 @@ static void asm_ahuvload(ASMState *as, IRIns *ir)
>  #if LJ_GC64
>    Reg tmp = RID_NONE;
>  #endif
> -  lua_assert(irt_isnum(ir->t) || irt_ispri(ir->t) || irt_isaddr(ir->t) ||
> -	     (LJ_DUALNUM && irt_isint(ir->t)));
> +  lj_assertA(irt_isnum(ir->t) || irt_ispri(ir->t) || irt_isaddr(ir->t) ||
> +	     (LJ_DUALNUM && irt_isint(ir->t)),
> +	     "bad load type %d", irt_type(ir->t));
>  #if LJ_64 && !LJ_GC64
>    if (irt_islightud(ir->t)) {
>      Reg dest = asm_load_lightud64(as, ir, 1);
> @@ -1556,7 +1570,8 @@ static void asm_ahuvload(ASMState *as, IRIns *ir)
>    as->mrm.ofs += 4;
>    asm_guardcc(as, irt_isnum(ir->t) ? CC_AE : CC_NE);
>    if (LJ_64 && irt_type(ir->t) >= IRT_NUM) {
> -    lua_assert(irt_isinteger(ir->t) || irt_isnum(ir->t));
> +    lj_assertA(irt_isinteger(ir->t) || irt_isnum(ir->t),
> +	       "bad load type %d", irt_type(ir->t));
>  #if LJ_GC64
>      emit_u32(as, LJ_TISNUM << 15);
>  #else
> @@ -1638,13 +1653,14 @@ static void asm_ahustore(ASMState *as, IRIns *ir)
>  #endif
>        emit_mrm(as, XO_MOVto, src, RID_MRM);
>      } else if (!irt_ispri(irr->t)) {
> -      lua_assert(irt_isaddr(ir->t) || (LJ_DUALNUM && irt_isinteger(ir->t)));
> +      lj_assertA(irt_isaddr(ir->t) || (LJ_DUALNUM && irt_isinteger(ir->t)),
> +		 "bad store type");
>        emit_i32(as, irr->i);
>        emit_mrm(as, XO_MOVmi, 0, RID_MRM);
>      }
>      as->mrm.ofs += 4;
>  #if LJ_GC64
> -    lua_assert(LJ_DUALNUM && irt_isinteger(ir->t));
> +    lj_assertA(LJ_DUALNUM && irt_isinteger(ir->t), "bad store type");
>      emit_i32(as, LJ_TNUMX << 15);
>  #else
>      emit_i32(as, (int32_t)irt_toitype(ir->t));
> @@ -1659,10 +1675,13 @@ static void asm_sload(ASMState *as, IRIns *ir)
>  		(!LJ_FR2 && (ir->op2 & IRSLOAD_FRAME) ? 4 : 0);
>    IRType1 t = ir->t;
>    Reg base;
> -  lua_assert(!(ir->op2 & IRSLOAD_PARENT));  /* Handled by asm_head_side(). */
> -  lua_assert(irt_isguard(t) || !(ir->op2 & IRSLOAD_TYPECHECK));
> -  lua_assert(LJ_DUALNUM ||
> -	     !irt_isint(t) || (ir->op2 & (IRSLOAD_CONVERT|IRSLOAD_FRAME)));
> +  lj_assertA(!(ir->op2 & IRSLOAD_PARENT),
> +	     "bad parent SLOAD"); /* Handled by asm_head_side(). */
> +  lj_assertA(irt_isguard(t) || !(ir->op2 & IRSLOAD_TYPECHECK),
> +	     "inconsistent SLOAD variant");
> +  lj_assertA(LJ_DUALNUM ||
> +	     !irt_isint(t) || (ir->op2 & (IRSLOAD_CONVERT|IRSLOAD_FRAME)),
> +	     "bad SLOAD type");
>    if ((ir->op2 & IRSLOAD_CONVERT) && irt_isguard(t) && irt_isint(t)) {
>      Reg left = ra_scratch(as, RSET_FPR);
>      asm_tointg(as, ir, left);  /* Frees dest reg. Do this before base alloc. */
> @@ -1682,7 +1701,8 @@ static void asm_sload(ASMState *as, IRIns *ir)
>      RegSet allow = irt_isnum(t) ? RSET_FPR : RSET_GPR;
>      Reg dest = ra_dest(as, ir, allow);
>      base = ra_alloc1(as, REF_BASE, RSET_GPR);
> -    lua_assert(irt_isnum(t) || irt_isint(t) || irt_isaddr(t));
> +    lj_assertA(irt_isnum(t) || irt_isint(t) || irt_isaddr(t),
> +	       "bad SLOAD type %d", irt_type(t));
>      if ((ir->op2 & IRSLOAD_CONVERT)) {
>        t.irt = irt_isint(t) ? IRT_NUM : IRT_INT;  /* Check for original type. */
>        emit_rmro(as, irt_isint(t) ? XO_CVTSI2SD : XO_CVTTSD2SI, dest, base, ofs);
> @@ -1728,7 +1748,8 @@ static void asm_sload(ASMState *as, IRIns *ir)
>      /* Need type check, even if the load result is unused. */
>      asm_guardcc(as, irt_isnum(t) ? CC_AE : CC_NE);
>      if (LJ_64 && irt_type(t) >= IRT_NUM) {
> -      lua_assert(irt_isinteger(t) || irt_isnum(t));
> +      lj_assertA(irt_isinteger(t) || irt_isnum(t),
> +		 "bad SLOAD type %d", irt_type(t));
>  #if LJ_GC64
>        emit_u32(as, LJ_TISNUM << 15);
>  #else
> @@ -1780,7 +1801,8 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>    CTInfo info = lj_ctype_info(cts, id, &sz);
>    const CCallInfo *ci = &lj_ir_callinfo[IRCALL_lj_mem_newgco];
>    IRRef args[4];
> -  lua_assert(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL));
> +  lj_assertA(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL),
> +	     "bad CNEW/CNEWI operands");
>  
>    as->gcsteps++;
>    asm_setupresult(as, ir, ci);  /* GCcdata * */
> @@ -1810,7 +1832,7 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>      int32_t ofs = sizeof(GCcdata);
>      if (sz == 8) {
>        ofs += 4; ir++;
> -      lua_assert(ir->o == IR_HIOP);
> +      lj_assertA(ir->o == IR_HIOP, "missing CNEWI HIOP");
>      }
>      do {
>        if (irref_isk(ir->op2)) {
> @@ -1824,7 +1846,7 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>        ofs -= 4; ir--;
>      } while (1);
>  #endif
> -    lua_assert(sz == 4 || sz == 8);
> +    lj_assertA(sz == 4 || sz == 8, "bad CNEWI size %d", sz);
>    } else if (ir->op2 != REF_NIL) {  /* Create VLA/VLS/aligned cdata. */
>      ci = &lj_ir_callinfo[IRCALL_lj_cdata_newv];
>      args[0] = ASMREF_L;     /* lua_State *L */
> @@ -1883,7 +1905,7 @@ static void asm_obar(ASMState *as, IRIns *ir)
>    MCLabel l_end;
>    Reg obj;
>    /* No need for other object barriers (yet). */
> -  lua_assert(IR(ir->op1)->o == IR_UREFC);
> +  lj_assertA(IR(ir->op1)->o == IR_UREFC, "bad OBAR type");
>    ra_evictset(as, RSET_SCRATCH);
>    l_end = emit_label(as);
>    args[0] = ASMREF_TMP1;  /* global_State *g */
> @@ -2000,7 +2022,7 @@ static int asm_swapops(ASMState *as, IRIns *ir)
>  {
>    IRIns *irl = IR(ir->op1);
>    IRIns *irr = IR(ir->op2);
> -  lua_assert(ra_noreg(irr->r));
> +  lj_assertA(ra_noreg(irr->r), "bad usage");
>    if (!irm_iscomm(lj_ir_mode[ir->o]))
>      return 0;  /* Can't swap non-commutative operations. */
>    if (irref_isk(ir->op2))
> @@ -2391,8 +2413,9 @@ static void asm_comp(ASMState *as, IRIns *ir)
>      IROp leftop = (IROp)(IR(lref)->o);
>      Reg r64 = REX_64IR(ir, 0);
>      int32_t imm = 0;
> -    lua_assert(irt_is64(ir->t) || irt_isint(ir->t) ||
> -	       irt_isu32(ir->t) || irt_isaddr(ir->t) || irt_isu8(ir->t));
> +    lj_assertA(irt_is64(ir->t) || irt_isint(ir->t) ||
> +	       irt_isu32(ir->t) || irt_isaddr(ir->t) || irt_isu8(ir->t),
> +	       "bad comparison data type %d", irt_type(ir->t));
>      /* Swap constants (only for ABC) and fusable loads to the right. */
>      if (irref_isk(lref) || (!irref_isk(rref) && opisfusableload(leftop))) {
>        if ((cc & 0xc) == 0xc) cc ^= 0x53;  /* L <-> G, LE <-> GE */
> @@ -2474,7 +2497,7 @@ static void asm_comp(ASMState *as, IRIns *ir)
>  	  /* Use test r,r instead of cmp r,0. */
>  	  x86Op xo = XO_TEST;
>  	  if (irt_isu8(ir->t)) {
> -	    lua_assert(ir->o == IR_EQ || ir->o == IR_NE);
> +	    lj_assertA(ir->o == IR_EQ || ir->o == IR_NE, "bad usage");
>  	    xo = XO_TESTb;
>  	    if (!rset_test(RSET_RANGE(RID_EAX, RID_EBX+1), left)) {
>  	      if (LJ_64) {
> @@ -2630,10 +2653,11 @@ static void asm_hiop(ASMState *as, IRIns *ir)
>    case IR_CNEWI:
>      /* Nothing to do here. Handled by CNEWI itself. */
>      break;
> -  default: lua_assert(0); break;
> +  default: lj_assertA(0, "bad HIOP for op %d", (ir-1)->o); break;
>    }
>  #else
> -  UNUSED(as); UNUSED(ir); lua_assert(0);  /* Unused on x64 or without FFI. */
> +  /* Unused on x64 or without FFI. */
> +  UNUSED(as); UNUSED(ir); lj_assertA(0, "unexpected HIOP");
>  #endif
>  }
>  
> @@ -2699,8 +2723,9 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
>        Reg src = ra_alloc1(as, ref, RSET_FPR);
>        emit_rmro(as, XO_MOVSDto, src, RID_BASE, ofs);
>      } else {
> -      lua_assert(irt_ispri(ir->t) || irt_isaddr(ir->t) ||
> -		 (LJ_DUALNUM && irt_isinteger(ir->t)));
> +      lj_assertA(irt_ispri(ir->t) || irt_isaddr(ir->t) ||
> +		 (LJ_DUALNUM && irt_isinteger(ir->t)),
> +		 "restore of IR type %d", irt_type(ir->t));
>        if (!irref_isk(ref)) {
>  	Reg src = ra_alloc1(as, ref, rset_exclude(RSET_GPR, RID_BASE));
>  #if LJ_GC64
> @@ -2745,7 +2770,7 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
>      }
>      checkmclim(as);
>    }
> -  lua_assert(map + nent == flinks);
> +  lj_assertA(map + nent == flinks, "inconsistent frames in snapshot");
>  }
>  
>  /* -- GC handling --------------------------------------------------------- */
> @@ -2789,16 +2814,16 @@ static void asm_loop_fixup(ASMState *as)
>    MCode *target = as->mcp;
>    if (as->realign) {  /* Realigned loops use short jumps. */
>      as->realign = NULL;  /* Stop another retry. */
> -    lua_assert(((intptr_t)target & 15) == 0);
> +    lj_assertA(((intptr_t)target & 15) == 0, "loop realign failed");
>      if (as->loopinv) {  /* Inverted loop branch? */
>        p -= 5;
>        p[0] = XI_JMP;
> -      lua_assert(target - p >= -128);
> +      lj_assertA(target - p >= -128, "loop realign failed");
>        p[-1] = (MCode)(target - p);  /* Patch sjcc. */
>        if (as->loopinv == 2)
>  	p[-3] = (MCode)(target - p + 2);  /* Patch opt. short jp. */
>      } else {
> -      lua_assert(target - p >= -128);
> +      lj_assertA(target - p >= -128, "loop realign failed");
>        p[-1] = (MCode)(int8_t)(target - p);  /* Patch short jmp. */
>        p[-2] = XI_JMPs;
>      }
> @@ -2904,7 +2929,7 @@ static void asm_tail_fixup(ASMState *as, TraceNo lnk)
>    }
>    /* Patch exit branch. */
>    target = lnk ? traceref(as->J, lnk)->mcode : (MCode *)lj_vm_exit_interp;
> -  *(int32_t *)(p-4) = jmprel(p, target);
> +  *(int32_t *)(p-4) = jmprel(as->J, p, target);
>    p[-5] = XI_JMP;
>    /* Drop unused mcode tail. Fill with NOPs to make the prefetcher happy. */
>    for (q = as->mctop-1; q >= p; q--)
> @@ -3077,17 +3102,17 @@ void lj_asm_patchexit(jit_State *J, GCtrace *T, ExitNo exitno, MCode *target)
>    uint32_t statei = u32ptr(&J2G(J)->vmstate);
>  #endif
>    if (len > 5 && p[len-5] == XI_JMP && p+len-6 + *(int32_t *)(p+len-4) == px)
> -    *(int32_t *)(p+len-4) = jmprel(p+len, target);
> +    *(int32_t *)(p+len-4) = jmprel(J, p+len, target);
>    /* Do not patch parent exit for a stack check. Skip beyond vmstate update. */
>    for (; p < pe; p += asm_x86_inslen(p)) {
>      intptr_t ofs = LJ_GC64 ? (p[0] & 0xf0) == 0x40 : LJ_64;
>      if (*(uint32_t *)(p+2+ofs) == statei && p[ofs+LJ_GC64-LJ_64] == XI_MOVmi)
>        break;
>    }
> -  lua_assert(p < pe);
> +  lj_assertJ(p < pe, "instruction length decoder failed");
>    for (; p < pe; p += asm_x86_inslen(p))
>      if ((*(uint16_t *)p & 0xf0ff) == 0x800f && p + *(int32_t *)(p+2) == px)
> -      *(int32_t *)(p+2) = jmprel(p+6, target);
> +      *(int32_t *)(p+2) = jmprel(J, p+6, target);
>    lj_mcode_sync(T->mcode, T->mcode + T->szmcode);
>    lj_mcode_patch(J, mcarea, 1);
>  }
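
A side note for review, not part of the quoted hunks: every jmprel() call in
this file now receives a jit_State pointer (jmprel(as->J, ...) or
jmprel(J, ...)). Presumably the helper itself, changed further up in this
file's diff, only uses it to assert that the branch displacement still fits
into the rel32 field. A sketch under that assumption, with the argument order
taken from the call sites above:

  /* Sketch only; the exact assertion text is an assumption. */
  static int32_t jmprel(jit_State *J, MCode *p, MCode *target)
  {
    ptrdiff_t delta = target - p;
    lj_assertJ(delta == (int32_t)delta, "jump target out of range");
    return (int32_t)delta;
  }

The call sites themselves stay behaviorally unchanged, e.g.
jmprel(as->J, p+5, target) in asm_guardcc() above.
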
> diff --git a/src/lj_assert.c b/src/lj_assert.c
> new file mode 100644
> index 00000000..7989dbe6
> --- /dev/null
> +++ b/src/lj_assert.c
> @@ -0,0 +1,28 @@
> +/*
> +** Internal assertions.
> +** Copyright (C) 2005-2020 Mike Pall. See Copyright Notice in luajit.h
> +*/
> +
> +#define lj_assert_c
> +#define LUA_CORE
> +
> +#if defined(LUA_USE_ASSERT) || defined(LUA_USE_APICHECK)
> +
> +#include <stdio.h>
> +
> +#include "lj_obj.h"
> +
> +void lj_assert_fail(global_State *g, const char *file, int line,
> +		    const char *func, const char *fmt, ...)
> +{
> +  va_list argp;
> +  va_start(argp, fmt);
> +  fprintf(stderr, "LuaJIT ASSERT %s:%d: %s: ", file, line, func);
> +  vfprintf(stderr, fmt, argp);
> +  fputc('\n', stderr);
> +  va_end(argp);
> +  UNUSED(g);  /* May be NULL. TODO: optionally dump state. */
> +  abort();
> +}
> +
> +#endif
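
Context, not part of the patch itself: all of the lj_assert*() call sites in
this series funnel into lj_assert_fail() above through a small family of
macros that differ only in how they reach the global_State. My reading of the
layering (the real definitions live in the headers, so treat this as a
sketch):

  #ifdef LUA_USE_ASSERT
  #define lj_assert_check(g, c, ...) \
    ((c) ? (void)0 : \
     (lj_assert_fail((g), __FILE__, __LINE__, __func__, __VA_ARGS__), \
      (void)0))
  #define lj_assertG_(g, c, ...)  lj_assert_check((g), (c), __VA_ARGS__)
  #define lj_assertL(c, ...)      lj_assert_check(G(L), (c), __VA_ARGS__)
  #define lj_assertX(c, ...)      lj_assert_check(NULL, (c), __VA_ARGS__)
  #else
  #define lj_assertG_(g, c, ...)  ((void)0)
  #define lj_assertL(c, ...)      ((void)L)
  #define lj_assertX(c, ...)      ((void)0)
  #endif

The per-module wrappers seen in this diff (lj_assertA via as->J, lj_assertJ
via J2G(J), lj_assertCTS via cts->g, lj_assertLS via G(ls->L), plus the
file-local lj_assertCP/lj_assertBCW further down) are thin layers over
lj_assertG_().
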
> diff --git a/src/lj_bcread.c b/src/lj_bcread.c
> index f6c7ad25..cddf6ff1 100644
> --- a/src/lj_bcread.c
> +++ b/src/lj_bcread.c
> @@ -53,7 +53,7 @@ static LJ_NOINLINE void bcread_error(LexState *ls, ErrMsg em)
>  /* Refill buffer. */
>  static LJ_NOINLINE void bcread_fill(LexState *ls, MSize len, int need)
>  {
> -  lua_assert(len != 0);
> +  lj_assertLS(len != 0, "empty refill");
>    if (len > LJ_MAX_BUF || ls->c < 0)
>      bcread_error(ls, LJ_ERR_BCBAD);
>    do {
> @@ -63,7 +63,7 @@ static LJ_NOINLINE void bcread_fill(LexState *ls, MSize len, int need)
>      MSize n = (MSize)(ls->pe - ls->p);
>      if (n) {  /* Copy remainder to buffer. */
>        if (sbuflen(&ls->sb)) {  /* Move down in buffer. */
> -	lua_assert(ls->pe == sbufP(&ls->sb));
> +	lj_assertLS(ls->pe == sbufP(&ls->sb), "bad buffer pointer");
>  	if (ls->p != p) memmove(p, ls->p, n);
>        } else {  /* Copy from buffer provided by reader. */
>  	p = lj_buf_need(&ls->sb, len);
> @@ -112,7 +112,7 @@ static LJ_AINLINE uint8_t *bcread_mem(LexState *ls, MSize len)
>  {
>    uint8_t *p = (uint8_t *)ls->p;
>    ls->p += len;
> -  lua_assert(ls->p <= ls->pe);
> +  lj_assertLS(ls->p <= ls->pe, "buffer read overflow");
>    return p;
>  }
>  
> @@ -125,7 +125,7 @@ static void bcread_block(LexState *ls, void *q, MSize len)
>  /* Read byte from buffer. */
>  static LJ_AINLINE uint32_t bcread_byte(LexState *ls)
>  {
> -  lua_assert(ls->p < ls->pe);
> +  lj_assertLS(ls->p < ls->pe, "buffer read overflow");
>    return (uint32_t)(uint8_t)*ls->p++;
>  }
>  
> @@ -133,7 +133,7 @@ static LJ_AINLINE uint32_t bcread_byte(LexState *ls)
>  static LJ_AINLINE uint32_t bcread_uleb128(LexState *ls)
>  {
>    uint32_t v = lj_buf_ruleb128(&ls->p);
> -  lua_assert(ls->p <= ls->pe);
> +  lj_assertLS(ls->p <= ls->pe, "buffer read overflow");
>    return v;
>  }
>  
> @@ -150,7 +150,7 @@ static uint32_t bcread_uleb128_33(LexState *ls)
>     } while (*p++ >= 0x80);
>    }
>    ls->p = (char *)p;
> -  lua_assert(ls->p <= ls->pe);
> +  lj_assertLS(ls->p <= ls->pe, "buffer read overflow");
>    return v;
>  }
>  
> @@ -197,7 +197,7 @@ static void bcread_ktabk(LexState *ls, TValue *o)
>      o->u32.lo = bcread_uleb128(ls);
>      o->u32.hi = bcread_uleb128(ls);
>    } else {
> -    lua_assert(tp <= BCDUMP_KTAB_TRUE);
> +    lj_assertLS(tp <= BCDUMP_KTAB_TRUE, "bad constant type %d", tp);
>      setpriV(o, ~tp);
>    }
>  }
> @@ -219,7 +219,7 @@ static GCtab *bcread_ktab(LexState *ls)
>      for (i = 0; i < nhash; i++) {
>        TValue key;
>        bcread_ktabk(ls, &key);
> -      lua_assert(!tvisnil(&key));
> +      lj_assertLS(!tvisnil(&key), "nil key");
>        bcread_ktabk(ls, lj_tab_set(ls->L, t, &key));
>      }
>    }
> @@ -256,7 +256,7 @@ static void bcread_kgc(LexState *ls, GCproto *pt, MSize sizekgc)
>  #endif
>      } else {
>        lua_State *L = ls->L;
> -      lua_assert(tp == BCDUMP_KGC_CHILD);
> +      lj_assertLS(tp == BCDUMP_KGC_CHILD, "bad constant type %d", tp);
>        if (L->top <= bcread_oldtop(L, ls))  /* Stack underflow? */
>  	bcread_error(ls, LJ_ERR_BCBAD);
>        L->top--;
> @@ -437,7 +437,7 @@ static int bcread_header(LexState *ls)
>  GCproto *lj_bcread(LexState *ls)
>  {
>    lua_State *L = ls->L;
> -  lua_assert(ls->c == BCDUMP_HEAD1);
> +  lj_assertLS(ls->c == BCDUMP_HEAD1, "bad bytecode header");
>    bcread_savetop(L, ls, L->top);
>    lj_buf_reset(&ls->sb);
>    /* Check for a valid bytecode dump header. */
> diff --git a/src/lj_bcwrite.c b/src/lj_bcwrite.c
> index a86d6d00..ce5837f6 100644
> --- a/src/lj_bcwrite.c
> +++ b/src/lj_bcwrite.c
> @@ -29,8 +29,17 @@ typedef struct BCWriteCtx {
>    void *wdata;			/* Writer callback data. */
>    int strip;			/* Strip debug info. */
>    int status;			/* Status from writer callback. */
> +#ifdef LUA_USE_ASSERT
> +  global_State *g;
> +#endif
>  } BCWriteCtx;
>  
> +#ifdef LUA_USE_ASSERT
> +#define lj_assertBCW(c, ...)	lj_assertG_(ctx->g, (c), __VA_ARGS__)
> +#else
> +#define lj_assertBCW(c, ...)	((void)ctx)
> +#endif
> +
>  /* -- Bytecode writer ----------------------------------------------------- */
>  
>  /* Write a single constant key/value of a template table. */
> @@ -61,7 +70,7 @@ static void bcwrite_ktabk(BCWriteCtx *ctx, cTValue *o, int narrow)
>      p = lj_strfmt_wuleb128(p, o->u32.lo);
>      p = lj_strfmt_wuleb128(p, o->u32.hi);
>    } else {
> -    lua_assert(tvispri(o));
> +    lj_assertBCW(tvispri(o), "unhandled type %d", itype(o));
>      *p++ = BCDUMP_KTAB_NIL+~itype(o);
>    }
>    setsbufP(&ctx->sb, p);
> @@ -121,7 +130,7 @@ static void bcwrite_kgc(BCWriteCtx *ctx, GCproto *pt)
>        tp = BCDUMP_KGC_STR + gco2str(o)->len;
>        need = 5+gco2str(o)->len;
>      } else if (o->gch.gct == ~LJ_TPROTO) {
> -      lua_assert((pt->flags & PROTO_CHILD));
> +      lj_assertBCW((pt->flags & PROTO_CHILD), "prototype has unexpected child");
>        tp = BCDUMP_KGC_CHILD;
>  #if LJ_HASFFI
>      } else if (o->gch.gct == ~LJ_TCDATA) {
> @@ -132,12 +141,14 @@ static void bcwrite_kgc(BCWriteCtx *ctx, GCproto *pt)
>        } else if (id == CTID_UINT64) {
>  	tp = BCDUMP_KGC_U64;
>        } else {
> -	lua_assert(id == CTID_COMPLEX_DOUBLE);
> +	lj_assertBCW(id == CTID_COMPLEX_DOUBLE,
> +		     "bad cdata constant CTID %d", id);
>  	tp = BCDUMP_KGC_COMPLEX;
>        }
>  #endif
>      } else {
> -      lua_assert(o->gch.gct == ~LJ_TTAB);
> +      lj_assertBCW(o->gch.gct == ~LJ_TTAB,
> +		   "bad constant GC type %d", o->gch.gct);
>        tp = BCDUMP_KGC_TAB;
>        need = 1+2*5;
>      }
> @@ -289,7 +300,7 @@ static void bcwrite_proto(BCWriteCtx *ctx, GCproto *pt)
>      MSize nn = (lj_fls(n)+8)*9 >> 6;
>      char *q = sbufB(&ctx->sb) + (5 - nn);
>      p = lj_strfmt_wuleb128(q, n);  /* Fill in final size. */
> -    lua_assert(p == sbufB(&ctx->sb) + 5);
> +    lj_assertBCW(p == sbufB(&ctx->sb) + 5, "bad ULEB128 write");
>      ctx->status = ctx->wfunc(sbufL(&ctx->sb), q, nn+n, ctx->wdata);
>    }
>  }
> @@ -349,6 +360,9 @@ int lj_bcwrite(lua_State *L, GCproto *pt, lua_Writer writer, void *data,
>    ctx.wdata = data;
>    ctx.strip = strip;
>    ctx.status = 0;
> +#ifdef LUA_USE_ASSERT
> +  ctx.g = G(L);
> +#endif
>    lj_buf_init(L, &ctx.sb);
>    status = lj_vm_cpcall(L, NULL, &ctx, cpwriter);
>    if (status == 0) status = ctx.status;
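
A small design note on the lj_bcwrite.c changes above (my own reading, not
from the patch): the file-local lj_assertBCW() routes through
lj_assertG_(ctx->g, ...), with the global_State stashed in BCWriteCtx only
under LUA_USE_ASSERT, and falls back to ((void)ctx) otherwise. That fallback
keeps ctx referenced, so a hypothetical helper whose only use of ctx is an
assertion still compiles cleanly in non-assert builds:

  /* Hypothetical example, not taken from the patch. */
  static void bcwrite_checkpri(BCWriteCtx *ctx, cTValue *o)
  {
    lj_assertBCW(tvispri(o), "unhandled type %d", itype(o));
    /* ... ctx is otherwise unused here ... */
  }

lj_cparse.c later in this patch follows the same idea but reaches the state
through G(cp->L) instead of storing an extra pointer.
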
> diff --git a/src/lj_buf.c b/src/lj_buf.c
> index 0dfe7f98..923f4276 100644
> --- a/src/lj_buf.c
> +++ b/src/lj_buf.c
> @@ -30,7 +30,7 @@ static void buf_grow(SBuf *sb, MSize sz)
>  
>  LJ_NOINLINE char *LJ_FASTCALL lj_buf_need2(SBuf *sb, MSize sz)
>  {
> -  lua_assert(sz > sbufsz(sb));
> +  lj_assertG_(G(sbufL(sb)), sz > sbufsz(sb), "SBuf overflow");
>    if (LJ_UNLIKELY(sz > LJ_MAX_BUF))
>      lj_err_mem(sbufL(sb));
>    buf_grow(sb, sz);
> @@ -40,7 +40,7 @@ LJ_NOINLINE char *LJ_FASTCALL lj_buf_need2(SBuf *sb, MSize sz)
>  LJ_NOINLINE char *LJ_FASTCALL lj_buf_more2(SBuf *sb, MSize sz)
>  {
>    MSize len = sbuflen(sb);
> -  lua_assert(sz > sbufleft(sb));
> +  lj_assertG_(G(sbufL(sb)), sz > sbufleft(sb), "SBuf overflow");
>    if (LJ_UNLIKELY(sz > LJ_MAX_BUF || len + sz > LJ_MAX_BUF))
>      lj_err_mem(sbufL(sb));
>    buf_grow(sb, len + sz);
> diff --git a/src/lj_carith.c b/src/lj_carith.c
> index 04c18054..4ae1e9ee 100644
> --- a/src/lj_carith.c
> +++ b/src/lj_carith.c
> @@ -122,7 +122,7 @@ static int carith_ptr(lua_State *L, CTState *cts, CDArith *ca, MMS mm)
>  	setboolV(L->top-1, ((uintptr_t)pp < (uintptr_t)pp2));
>  	return 1;
>        } else {
> -	lua_assert(mm == MM_le);
> +	lj_assertL(mm == MM_le, "bad metamethod %d", mm);
>  	setboolV(L->top-1, ((uintptr_t)pp <= (uintptr_t)pp2));
>  	return 1;
>        }
> @@ -208,7 +208,9 @@ static int carith_int64(lua_State *L, CTState *cts, CDArith *ca, MMS mm)
>  	*up = lj_carith_powu64(u0, u1);
>        break;
>      case MM_unm: *up = (uint64_t)-(int64_t)u0; break;
> -    default: lua_assert(0); break;
> +    default:
> +      lj_assertL(0, "bad metamethod %d", mm);
> +      break;
>      }
>      lj_gc_check(L);
>      return 1;
> @@ -301,7 +303,9 @@ uint64_t lj_carith_shift64(uint64_t x, int32_t sh, int op)
>    case IR_BSAR-IR_BSHL: x = lj_carith_sar64(x, sh); break;
>    case IR_BROL-IR_BSHL: x = lj_carith_rol64(x, sh); break;
>    case IR_BROR-IR_BSHL: x = lj_carith_ror64(x, sh); break;
> -  default: lua_assert(0); break;
> +  default:
> +    lj_assertX(0, "bad shift op %d", op);
> +    break;
>    }
>    return x;
>  }
> diff --git a/src/lj_ccall.c b/src/lj_ccall.c
> index c1e12f56..a989f657 100644
> --- a/src/lj_ccall.c
> +++ b/src/lj_ccall.c
> @@ -391,7 +391,8 @@
>  #define CCALL_HANDLE_GPR \
>    /* Try to pass argument in GPRs. */ \
>    if (n > 1) { \
> -    lua_assert(n == 2 || n == 4);  /* int64_t or complex (float). */ \
> +    /* int64_t or complex (float). */ \
> +    lj_assertL(n == 2 || n == 4, "bad GPR size %d", n); \
>      if (ctype_isinteger(d->info) || ctype_isfp(d->info)) \
>        ngpr = (ngpr + 1u) & ~1u;  /* Align int64_t to regpair. */ \
>      else if (ngpr + n > maxgpr) \
> @@ -642,7 +643,8 @@ static void ccall_classify_ct(CTState *cts, CType *ct, int *rcl, CTSize ofs)
>      ccall_classify_struct(cts, ct, rcl, ofs);
>    } else {
>      int cl = ctype_isfp(ct->info) ? CCALL_RCL_SSE : CCALL_RCL_INT;
> -    lua_assert(ctype_hassize(ct->info));
> +    lj_assertCTS(ctype_hassize(ct->info),
> +		 "classify ctype %08x without size", ct->info);
>      if ((ofs & (ct->size-1))) cl = CCALL_RCL_MEM;  /* Unaligned. */
>      rcl[(ofs >= 8)] |= cl;
>    }
> @@ -667,12 +669,13 @@ static int ccall_classify_struct(CTState *cts, CType *ct, int *rcl, CTSize ofs)
>  }
>  
>  /* Try to split up a small struct into registers. */
> -static int ccall_struct_reg(CCallState *cc, GPRArg *dp, int *rcl)
> +static int ccall_struct_reg(CCallState *cc, CTState *cts, GPRArg *dp, int *rcl)
>  {
>    MSize ngpr = cc->ngpr, nfpr = cc->nfpr;
>    uint32_t i;
> +  UNUSED(cts);
>    for (i = 0; i < 2; i++) {
> -    lua_assert(!(rcl[i] & CCALL_RCL_MEM));
> +    lj_assertCTS(!(rcl[i] & CCALL_RCL_MEM), "pass mem struct in reg");
>      if ((rcl[i] & CCALL_RCL_INT)) {  /* Integer class takes precedence. */
>        if (ngpr >= CCALL_NARG_GPR) return 1;  /* Register overflow. */
>        cc->gpr[ngpr++] = dp[i];
> @@ -693,7 +696,8 @@ static int ccall_struct_arg(CCallState *cc, CTState *cts, CType *d, int *rcl,
>    dp[0] = dp[1] = 0;
>    /* Convert to temp. struct. */
>    lj_cconv_ct_tv(cts, d, (uint8_t *)dp, o, CCF_ARG(narg));
> -  if (ccall_struct_reg(cc, dp, rcl)) {  /* Register overflow? Pass on stack. */
> +  if (ccall_struct_reg(cc, cts, dp, rcl)) {
> +    /* Register overflow? Pass on stack. */
>      MSize nsp = cc->nsp, n = rcl[1] ? 2 : 1;
>      if (nsp + n > CCALL_MAXSTACK) return 1;  /* Too many arguments. */
>      cc->nsp = nsp + n;
> @@ -989,7 +993,7 @@ static int ccall_set_args(lua_State *L, CTState *cts, CType *ct,
>      if (fid) {  /* Get argument type from field. */
>        CType *ctf = ctype_get(cts, fid);
>        fid = ctf->sib;
> -      lua_assert(ctype_isfield(ctf->info));
> +      lj_assertL(ctype_isfield(ctf->info), "field expected");
>        did = ctype_cid(ctf->info);
>      } else {
>        if (!(ct->info & CTF_VARARG))
> @@ -1137,7 +1141,8 @@ static int ccall_get_results(lua_State *L, CTState *cts, CType *ct,
>    CCALL_HANDLE_RET
>  #endif
>    /* No reference types end up here, so there's no need for the CTypeID. */
> -  lua_assert(!(ctype_isrefarray(ctr->info) || ctype_isstruct(ctr->info)));
> +  lj_assertL(!(ctype_isrefarray(ctr->info) || ctype_isstruct(ctr->info)),
> +	     "unexpected reference ctype");
>    return lj_cconv_tv_ct(cts, ctr, 0, L->top-1, sp);
>  }
>  
> diff --git a/src/lj_ccallback.c b/src/lj_ccallback.c
> index 37edd00f..3738c234 100644
> --- a/src/lj_ccallback.c
> +++ b/src/lj_ccallback.c
> @@ -107,9 +107,9 @@ MSize lj_ccallback_ptr2slot(CTState *cts, void *p)
>  /* Initialize machine code for callback function pointers. */
>  #if LJ_OS_NOJIT
>  /* Disabled callback support. */
> -#define callback_mcode_init(g, p)	UNUSED(p)
> +#define callback_mcode_init(g, p)	(p)
>  #elif LJ_TARGET_X86ORX64
> -static void callback_mcode_init(global_State *g, uint8_t *page)
> +static void *callback_mcode_init(global_State *g, uint8_t *page)
>  {
>    uint8_t *p = page;
>    uint8_t *target = (uint8_t *)(void *)lj_vm_ffi_callback;
> @@ -143,10 +143,10 @@ static void callback_mcode_init(global_State *g, uint8_t *page)
>        *p++ = XI_JMPs; *p++ = (uint8_t)((2+2)*(31-(slot&31)) - 2);
>      }
>    }
> -  lua_assert(p - page <= CALLBACK_MCODE_SIZE);
> +  return p;
>  }
>  #elif LJ_TARGET_ARM
> -static void callback_mcode_init(global_State *g, uint32_t *page)
> +static void *callback_mcode_init(global_State *g, uint32_t *page)
>  {
>    uint32_t *p = page;
>    void *target = (void *)lj_vm_ffi_callback;
> @@ -165,10 +165,10 @@ static void callback_mcode_init(global_State *g, uint32_t *page)
>      *p = ARMI_B | ((page-p-2) & 0x00ffffffu);
>      p++;
>    }
> -  lua_assert(p - page <= CALLBACK_MCODE_SIZE);
> +  return p;
>  }
>  #elif LJ_TARGET_ARM64
> -static void callback_mcode_init(global_State *g, uint32_t *page)
> +static void *callback_mcode_init(global_State *g, uint32_t *page)
>  {
>    uint32_t *p = page;
>    void *target = (void *)lj_vm_ffi_callback;
> @@ -185,10 +185,10 @@ static void callback_mcode_init(global_State *g, uint32_t *page)
>      *p = A64I_LE(A64I_B | A64F_S26((page-p) & 0x03ffffffu));
>      p++;
>    }
> -  lua_assert(p - page <= CALLBACK_MCODE_SIZE);
> +  return p;
>  }
>  #elif LJ_TARGET_PPC
> -static void callback_mcode_init(global_State *g, uint32_t *page)
> +static void *callback_mcode_init(global_State *g, uint32_t *page)
>  {
>    uint32_t *p = page;
>    void *target = (void *)lj_vm_ffi_callback;
> @@ -204,10 +204,10 @@ static void callback_mcode_init(global_State *g, uint32_t *page)
>      *p = PPCI_B | (((page-p) & 0x00ffffffu) << 2);
>      p++;
>    }
> -  lua_assert(p - page <= CALLBACK_MCODE_SIZE);
> +  return p;
>  }
>  #elif LJ_TARGET_MIPS
> -static void callback_mcode_init(global_State *g, uint32_t *page)
> +static void *callback_mcode_init(global_State *g, uint32_t *page)
>  {
>    uint32_t *p = page;
>    uintptr_t target = (uintptr_t)(void *)lj_vm_ffi_callback;
> @@ -236,11 +236,11 @@ static void callback_mcode_init(global_State *g, uint32_t *page)
>      p++;
>      *p++ = MIPSI_LI | MIPSF_T(RID_R1) | slot;
>    }
> -  lua_assert(p - page <= CALLBACK_MCODE_SIZE);
> +  return p;
>  }
>  #else
>  /* Missing support for this architecture. */
> -#define callback_mcode_init(g, p)	UNUSED(p)
> +#define callback_mcode_init(g, p)	(p)
>  #endif
>  
>  /* -- Machine code management --------------------------------------------- */
> @@ -263,7 +263,7 @@ static void callback_mcode_init(global_State *g, uint32_t *page)
>  static void callback_mcode_new(CTState *cts)
>  {
>    size_t sz = (size_t)CALLBACK_MCODE_SIZE;
> -  void *p;
> +  void *p, *pe;
>    if (CALLBACK_MAX_SLOT == 0)
>      lj_err_caller(cts->L, LJ_ERR_FFI_CBACKOV);
>  #if LJ_TARGET_WINDOWS
> @@ -280,7 +280,10 @@ static void callback_mcode_new(CTState *cts)
>    p = lj_mem_new(cts->L, sz);
>  #endif
>    cts->cb.mcode = p;
> -  callback_mcode_init(cts->g, p);
> +  pe = callback_mcode_init(cts->g, p);
> +  UNUSED(pe);
> +  lj_assertCTS((size_t)((char *)pe - (char *)p) <= sz,
> +	       "miscalculated CALLBACK_MAX_SLOT");
>    lj_mcode_sync(p, (char *)p + sz);
>  #if LJ_TARGET_WINDOWS
>    {
> @@ -421,8 +424,9 @@ void lj_ccallback_mcode_free(CTState *cts)
>  
>  #define CALLBACK_HANDLE_GPR \
>    if (n > 1) { \
> -    lua_assert(((LJ_ABI_SOFTFP && ctype_isnum(cta->info)) ||  /* double. */ \
> -		ctype_isinteger(cta->info)) && n == 2);  /* int64_t. */ \
> +    lj_assertCTS(((LJ_ABI_SOFTFP && ctype_isnum(cta->info)) ||  /* double. */ \
> +		 ctype_isinteger(cta->info)) && n == 2,  /* int64_t. */ \
> +		 "bad GPR type"); \
>      ngpr = (ngpr + 1u) & ~1u;  /* Align int64_t to regpair. */ \
>    } \
>    if (ngpr + n <= maxgpr) { \
> @@ -579,7 +583,7 @@ static void callback_conv_args(CTState *cts, lua_State *L)
>        CTSize sz;
>        int isfp;
>        MSize n;
> -      lua_assert(ctype_isfield(ctf->info));
> +      lj_assertCTS(ctype_isfield(ctf->info), "field expected");
>        cta = ctype_rawchild(cts, ctf);
>        isfp = ctype_isfp(cta->info);
>        sz = (cta->size + CTSIZE_PTR-1) & ~(CTSIZE_PTR-1);
> @@ -671,7 +675,7 @@ lua_State * LJ_FASTCALL lj_ccallback_enter(CTState *cts, void *cf)
>  {
>    lua_State *L = cts->L;
>    global_State *g = cts->g;
> -  lua_assert(L != NULL);
> +  lj_assertG(L != NULL, "uninitialized cts->L in callback");
>    if (tvref(g->jit_base)) {
>      setstrV(L, L->top++, lj_err_str(L, LJ_ERR_FFI_BADCBACK));
>      if (g->panic) g->panic(L);
> @@ -756,7 +760,7 @@ static CType *callback_checkfunc(CTState *cts, CType *ct)
>        CType *ctf = ctype_get(cts, fid);
>        if (!ctype_isattrib(ctf->info)) {
>  	CType *cta;
> -	lua_assert(ctype_isfield(ctf->info));
> +	lj_assertCTS(ctype_isfield(ctf->info), "field expected");
>  	cta = ctype_rawchild(cts, ctf);
>  	if (!(ctype_isenum(cta->info) || ctype_isptr(cta->info) ||
>  	      (ctype_isnum(cta->info) && cta->size <= 8)) ||
> diff --git a/src/lj_cconv.c b/src/lj_cconv.c
> index ca2a5d30..37c88852 100644
> --- a/src/lj_cconv.c
> +++ b/src/lj_cconv.c
> @@ -122,19 +122,25 @@ void lj_cconv_ct_ct(CTState *cts, CType *d, CType *s,
>    CTInfo dinfo = d->info, sinfo = s->info;
>    void *tmpptr;
>  
> -  lua_assert(!ctype_isenum(dinfo) && !ctype_isenum(sinfo));
> -  lua_assert(!ctype_isattrib(dinfo) && !ctype_isattrib(sinfo));
> +  lj_assertCTS(!ctype_isenum(dinfo) && !ctype_isenum(sinfo),
> +	       "unresolved enum");
> +  lj_assertCTS(!ctype_isattrib(dinfo) && !ctype_isattrib(sinfo),
> +	       "unstripped attribute");
>  
>    if (ctype_type(dinfo) > CT_MAYCONVERT || ctype_type(sinfo) > CT_MAYCONVERT)
>      goto err_conv;
>  
>    /* Some basic sanity checks. */
> -  lua_assert(!ctype_isnum(dinfo) || dsize > 0);
> -  lua_assert(!ctype_isnum(sinfo) || ssize > 0);
> -  lua_assert(!ctype_isbool(dinfo) || dsize == 1 || dsize == 4);
> -  lua_assert(!ctype_isbool(sinfo) || ssize == 1 || ssize == 4);
> -  lua_assert(!ctype_isinteger(dinfo) || (1u<<lj_fls(dsize)) == dsize);
> -  lua_assert(!ctype_isinteger(sinfo) || (1u<<lj_fls(ssize)) == ssize);
> +  lj_assertCTS(!ctype_isnum(dinfo) || dsize > 0, "bad size for number type");
> +  lj_assertCTS(!ctype_isnum(sinfo) || ssize > 0, "bad size for number type");
> +  lj_assertCTS(!ctype_isbool(dinfo) || dsize == 1 || dsize == 4,
> +	       "bad size for bool type");
> +  lj_assertCTS(!ctype_isbool(sinfo) || ssize == 1 || ssize == 4,
> +	       "bad size for bool type");
> +  lj_assertCTS(!ctype_isinteger(dinfo) || (1u<<lj_fls(dsize)) == dsize,
> +	       "bad size for integer type");
> +  lj_assertCTS(!ctype_isinteger(sinfo) || (1u<<lj_fls(ssize)) == ssize,
> +	       "bad size for integer type");
>  
>    switch (cconv_idx2(dinfo, sinfo)) {
>    /* Destination is a bool. */
> @@ -357,7 +363,7 @@ void lj_cconv_ct_ct(CTState *cts, CType *d, CType *s,
>      if ((flags & CCF_CAST) || (d->info & CTF_VLA) || d != s)
>        goto err_conv;  /* Must be exact same type. */
>  copyval:  /* Copy value. */
> -    lua_assert(dsize == ssize);
> +    lj_assertCTS(dsize == ssize, "value copy with different sizes");
>      memcpy(dp, sp, dsize);
>      break;
>  
> @@ -389,7 +395,7 @@ int lj_cconv_tv_ct(CTState *cts, CType *s, CTypeID sid,
>  	lj_cconv_ct_ct(cts, ctype_get(cts, CTID_DOUBLE), s,
>  		       (uint8_t *)&o->n, sp, 0);
>  	/* Numbers are NOT canonicalized here! Beware of uninitialized data. */
> -	lua_assert(tvisnum(o));
> +	lj_assertCTS(tvisnum(o), "non-canonical NaN passed");
>        }
>      } else {
>        uint32_t b = s->size == 1 ? (*sp != 0) : (*(int *)sp != 0);
> @@ -406,7 +412,7 @@ int lj_cconv_tv_ct(CTState *cts, CType *s, CTypeID sid,
>      CTSize sz;
>    copyval:  /* Copy value. */
>      sz = s->size;
> -    lua_assert(sz != CTSIZE_INVALID);
> +    lj_assertCTS(sz != CTSIZE_INVALID, "value copy with invalid size");
>      /* Attributes are stripped, qualifiers are kept (but mostly ignored). */
>      cd = lj_cdata_new(cts, ctype_typeid(cts, s), sz);
>      setcdataV(cts->L, o, cd);
> @@ -421,19 +427,22 @@ int lj_cconv_tv_bf(CTState *cts, CType *s, TValue *o, uint8_t *sp)
>    CTInfo info = s->info;
>    CTSize pos, bsz;
>    uint32_t val;
> -  lua_assert(ctype_isbitfield(info));
> +  lj_assertCTS(ctype_isbitfield(info), "bitfield expected");
>    /* NYI: packed bitfields may cause misaligned reads. */
>    switch (ctype_bitcsz(info)) {
>    case 4: val = *(uint32_t *)sp; break;
>    case 2: val = *(uint16_t *)sp; break;
>    case 1: val = *(uint8_t *)sp; break;
> -  default: lua_assert(0); val = 0; break;
> +  default:
> +    lj_assertCTS(0, "bad bitfield container size %d", ctype_bitcsz(info));
> +    val = 0;
> +    break;
>    }
>    /* Check if a packed bitfield crosses a container boundary. */
>    pos = ctype_bitpos(info);
>    bsz = ctype_bitbsz(info);
> -  lua_assert(pos < 8*ctype_bitcsz(info));
> -  lua_assert(bsz > 0 && bsz <= 8*ctype_bitcsz(info));
> +  lj_assertCTS(pos < 8*ctype_bitcsz(info), "bad bitfield position");
> +  lj_assertCTS(bsz > 0 && bsz <= 8*ctype_bitcsz(info), "bad bitfield size");
>    if (pos + bsz > 8*ctype_bitcsz(info))
>      lj_err_caller(cts->L, LJ_ERR_FFI_NYIPACKBIT);
>    if (!(info & CTF_BOOL)) {
> @@ -449,7 +458,7 @@ int lj_cconv_tv_bf(CTState *cts, CType *s, TValue *o, uint8_t *sp)
>      }
>    } else {
>      uint32_t b = (val >> pos) & 1;
> -    lua_assert(bsz == 1);
> +    lj_assertCTS(bsz == 1, "bad bool bitfield size");
>      setboolV(o, b);
>      setboolV(&cts->g->tmptv2, b);  /* Remember for trace recorder. */
>    }
> @@ -553,7 +562,7 @@ void lj_cconv_ct_tv(CTState *cts, CType *d,
>      sid = cdataV(o)->ctypeid;
>      s = ctype_get(cts, sid);
>      if (ctype_isref(s->info)) {  /* Resolve reference for value. */
> -      lua_assert(s->size == CTSIZE_PTR);
> +      lj_assertCTS(s->size == CTSIZE_PTR, "ref is not pointer-sized");
>        sp = *(void **)sp;
>        sid = ctype_cid(s->info);
>      }
> @@ -571,7 +580,7 @@ void lj_cconv_ct_tv(CTState *cts, CType *d,
>        CType *cct = lj_ctype_getfield(cts, d, str, &ofs);
>        if (!cct || !ctype_isconstval(cct->info))
>  	goto err_conv;
> -      lua_assert(d->size == 4);
> +      lj_assertCTS(d->size == 4, "only 32 bit enum supported");  /* NYI */
>        sp = (uint8_t *)&cct->size;
>        sid = ctype_cid(cct->info);
>      } else if (ctype_isrefarray(d->info)) {  /* Copy string to array. */
> @@ -635,10 +644,10 @@ void lj_cconv_bf_tv(CTState *cts, CType *d, uint8_t *dp, TValue *o)
>    CTInfo info = d->info;
>    CTSize pos, bsz;
>    uint32_t val, mask;
> -  lua_assert(ctype_isbitfield(info));
> +  lj_assertCTS(ctype_isbitfield(info), "bitfield expected");
>    if ((info & CTF_BOOL)) {
>      uint8_t tmpbool;
> -    lua_assert(ctype_bitbsz(info) == 1);
> +    lj_assertCTS(ctype_bitbsz(info) == 1, "bad bool bitfield size");
>      lj_cconv_ct_tv(cts, ctype_get(cts, CTID_BOOL), &tmpbool, o, 0);
>      val = tmpbool;
>    } else {
> @@ -647,8 +656,8 @@ void lj_cconv_bf_tv(CTState *cts, CType *d, uint8_t *dp, TValue *o)
>    }
>    pos = ctype_bitpos(info);
>    bsz = ctype_bitbsz(info);
> -  lua_assert(pos < 8*ctype_bitcsz(info));
> -  lua_assert(bsz > 0 && bsz <= 8*ctype_bitcsz(info));
> +  lj_assertCTS(pos < 8*ctype_bitcsz(info), "bad bitfield position");
> +  lj_assertCTS(bsz > 0 && bsz <= 8*ctype_bitcsz(info), "bad bitfield size");
>    /* Check if a packed bitfield crosses a container boundary. */
>    if (pos + bsz > 8*ctype_bitcsz(info))
>      lj_err_caller(cts->L, LJ_ERR_FFI_NYIPACKBIT);
> @@ -659,7 +668,9 @@ void lj_cconv_bf_tv(CTState *cts, CType *d, uint8_t *dp, TValue *o)
>    case 4: *(uint32_t *)dp = (*(uint32_t *)dp & ~mask) | (uint32_t)val; break;
>    case 2: *(uint16_t *)dp = (*(uint16_t *)dp & ~mask) | (uint16_t)val; break;
>    case 1: *(uint8_t *)dp = (*(uint8_t *)dp & ~mask) | (uint8_t)val; break;
> -  default: lua_assert(0); break;
> +  default:
> +    lj_assertCTS(0, "bad bitfield container size %d", ctype_bitcsz(info));
> +    break;
>    }
>  }
>  
> diff --git a/src/lj_cconv.h b/src/lj_cconv.h
> index 0a0b66c9..54a61fd4 100644
> --- a/src/lj_cconv.h
> +++ b/src/lj_cconv.h
> @@ -27,13 +27,14 @@ enum {
>  static LJ_AINLINE uint32_t cconv_idx(CTInfo info)
>  {
>    uint32_t idx = ((info >> 26) & 15u);  /* Dispatch bits. */
> -  lua_assert(ctype_type(info) <= CT_MAYCONVERT);
> +  lj_assertX(ctype_type(info) <= CT_MAYCONVERT,
> +	     "cannot convert ctype %08x", info);
>  #if LJ_64
>    idx = ((uint32_t)(U64x(f436fff5,fff7f021) >> 4*idx) & 15u);
>  #else
>    idx = (((idx < 8 ? 0xfff7f021u : 0xf436fff5) >> 4*(idx & 7u)) & 15u);
>  #endif
> -  lua_assert(idx < 8);
> +  lj_assertX(idx < 8, "cannot convert ctype %08x", info);
>    return idx;
>  }
>  
> diff --git a/src/lj_cdata.c b/src/lj_cdata.c
> index d3042f24..35d0e76a 100644
> --- a/src/lj_cdata.c
> +++ b/src/lj_cdata.c
> @@ -35,7 +35,7 @@ GCcdata *lj_cdata_newv(lua_State *L, CTypeID id, CTSize sz, CTSize align)
>    uintptr_t adata = (uintptr_t)p + sizeof(GCcdataVar) + sizeof(GCcdata);
>    uintptr_t almask = (1u << align) - 1u;
>    GCcdata *cd = (GCcdata *)(((adata + almask) & ~almask) - sizeof(GCcdata));
> -  lua_assert((char *)cd - p < 65536);
> +  lj_assertL((char *)cd - p < 65536, "excessive cdata alignment");
>    cdatav(cd)->offset = (uint16_t)((char *)cd - p);
>    cdatav(cd)->extra = extra;
>    cdatav(cd)->len = sz;
> @@ -77,8 +77,8 @@ void LJ_FASTCALL lj_cdata_free(global_State *g, GCcdata *cd)
>    } else if (LJ_LIKELY(!cdataisv(cd))) {
>      CType *ct = ctype_raw(ctype_ctsG(g), cd->ctypeid);
>      CTSize sz = ctype_hassize(ct->info) ? ct->size : CTSIZE_PTR;
> -    lua_assert(ctype_hassize(ct->info) || ctype_isfunc(ct->info) ||
> -	       ctype_isextern(ct->info));
> +    lj_assertG(ctype_hassize(ct->info) || ctype_isfunc(ct->info) ||
> +	       ctype_isextern(ct->info), "free of ctype without a size");
>      lj_mem_free(g, cd, sizeof(GCcdata) + sz);
>      g->gc.cdatanum--;
>    } else {
> @@ -118,7 +118,7 @@ CType *lj_cdata_index(CTState *cts, GCcdata *cd, cTValue *key, uint8_t **pp,
>  
>    /* Resolve reference for cdata object. */
>    if (ctype_isref(ct->info)) {
> -    lua_assert(ct->size == CTSIZE_PTR);
> +    lj_assertCTS(ct->size == CTSIZE_PTR, "ref is not pointer-sized");
>      p = *(uint8_t **)p;
>      ct = ctype_child(cts, ct);
>    }
> @@ -129,7 +129,8 @@ collect_attrib:
>      if (ctype_attrib(ct->info) == CTA_QUAL) *qual |= ct->size;
>      ct = ctype_child(cts, ct);
>    }
> -  lua_assert(!ctype_isref(ct->info));  /* Interning rejects refs to refs. */
> +  /* Interning rejects refs to refs. */
> +  lj_assertCTS(!ctype_isref(ct->info), "bad ref of ref");
>  
>    if (tvisint(key)) {
>      idx = (ptrdiff_t)intV(key);
> @@ -215,7 +216,8 @@ collect_attrib:
>  static void cdata_getconst(CTState *cts, TValue *o, CType *ct)
>  {
>    CType *ctt = ctype_child(cts, ct);
> -  lua_assert(ctype_isinteger(ctt->info) && ctt->size <= 4);
> +  lj_assertCTS(ctype_isinteger(ctt->info) && ctt->size <= 4,
> +	       "only 32 bit const supported");  /* NYI */
>    /* Constants are already zero-extended/sign-extended to 32 bits. */
>    if ((ctt->info & CTF_UNSIGNED) && (int32_t)ct->size < 0)
>      setnumV(o, (lua_Number)(uint32_t)ct->size);
> @@ -236,13 +238,14 @@ int lj_cdata_get(CTState *cts, CType *s, TValue *o, uint8_t *sp)
>    }
>  
>    /* Get child type of pointer/array/field. */
> -  lua_assert(ctype_ispointer(s->info) || ctype_isfield(s->info));
> +  lj_assertCTS(ctype_ispointer(s->info) || ctype_isfield(s->info),
> +	       "pointer or field expected");
>    sid = ctype_cid(s->info);
>    s = ctype_get(cts, sid);
>  
>    /* Resolve reference for field. */
>    if (ctype_isref(s->info)) {
> -    lua_assert(s->size == CTSIZE_PTR);
> +    lj_assertCTS(s->size == CTSIZE_PTR, "ref is not pointer-sized");
>      sp = *(uint8_t **)sp;
>      sid = ctype_cid(s->info);
>      s = ctype_get(cts, sid);
> @@ -269,12 +272,13 @@ void lj_cdata_set(CTState *cts, CType *d, uint8_t *dp, TValue *o, CTInfo qual)
>    }
>  
>    /* Get child type of pointer/array/field. */
> -  lua_assert(ctype_ispointer(d->info) || ctype_isfield(d->info));
> +  lj_assertCTS(ctype_ispointer(d->info) || ctype_isfield(d->info),
> +	       "pointer or field expected");
>    d = ctype_child(cts, d);
>  
>    /* Resolve reference for field. */
>    if (ctype_isref(d->info)) {
> -    lua_assert(d->size == CTSIZE_PTR);
> +    lj_assertCTS(d->size == CTSIZE_PTR, "ref is not pointer-sized");
>      dp = *(uint8_t **)dp;
>      d = ctype_child(cts, d);
>    }
> @@ -289,7 +293,8 @@ void lj_cdata_set(CTState *cts, CType *d, uint8_t *dp, TValue *o, CTInfo qual)
>      d = ctype_child(cts, d);
>    }
>  
> -  lua_assert(ctype_hassize(d->info) && !ctype_isvoid(d->info));
> +  lj_assertCTS(ctype_hassize(d->info), "store to ctype without size");
> +  lj_assertCTS(!ctype_isvoid(d->info), "store to void type");
>  
>    if (((d->info|qual) & CTF_CONST)) {
>    err_const:
> diff --git a/src/lj_cdata.h b/src/lj_cdata.h
> index 66b023bd..193e4241 100644
> --- a/src/lj_cdata.h
> +++ b/src/lj_cdata.h
> @@ -18,7 +18,7 @@ static LJ_AINLINE void *cdata_getptr(void *p, CTSize sz)
>    if (LJ_64 && sz == 4) {  /* Support 32 bit pointers on 64 bit targets. */
>      return ((void *)(uintptr_t)*(uint32_t *)p);
>    } else {
> -    lua_assert(sz == CTSIZE_PTR);
> +    lj_assertX(sz == CTSIZE_PTR, "bad pointer size %d", sz);
>      return *(void **)p;
>    }
>  }
> @@ -29,7 +29,7 @@ static LJ_AINLINE void cdata_setptr(void *p, CTSize sz, const void *v)
>    if (LJ_64 && sz == 4) {  /* Support 32 bit pointers on 64 bit targets. */
>      *(uint32_t *)p = (uint32_t)(uintptr_t)v;
>    } else {
> -    lua_assert(sz == CTSIZE_PTR);
> +    lj_assertX(sz == CTSIZE_PTR, "bad pointer size %d", sz);
>      *(void **)p = (void *)v;
>    }
>  }
> @@ -40,7 +40,8 @@ static LJ_AINLINE GCcdata *lj_cdata_new(CTState *cts, CTypeID id, CTSize sz)
>    GCcdata *cd;
>  #ifdef LUA_USE_ASSERT
>    CType *ct = ctype_raw(cts, id);
> -  lua_assert((ctype_hassize(ct->info) ? ct->size : CTSIZE_PTR) == sz);
> +  lj_assertCTS((ctype_hassize(ct->info) ? ct->size : CTSIZE_PTR) == sz,
> +	       "inconsistent size of fixed-size cdata alloc");
>  #endif
>    cd = (GCcdata *)lj_mem_newgco(cts->L, sizeof(GCcdata) + sz);
>    cd->gct = ~LJ_TCDATA;
> diff --git a/src/lj_clib.c b/src/lj_clib.c
> index a8672052..2f11b2e9 100644
> --- a/src/lj_clib.c
> +++ b/src/lj_clib.c
> @@ -349,7 +349,8 @@ TValue *lj_clib_index(lua_State *L, CLibrary *cl, GCstr *name)
>        lj_err_callerv(L, LJ_ERR_FFI_NODECL, strdata(name));
>      if (ctype_isconstval(ct->info)) {
>        CType *ctt = ctype_child(cts, ct);
> -      lua_assert(ctype_isinteger(ctt->info) && ctt->size <= 4);
> +      lj_assertCTS(ctype_isinteger(ctt->info) && ctt->size <= 4,
> +		   "only 32 bit const supported");  /* NYI */
>        if ((ctt->info & CTF_UNSIGNED) && (int32_t)ct->size < 0)
>  	setnumV(tv, (lua_Number)(uint32_t)ct->size);
>        else
> @@ -361,7 +362,8 @@ TValue *lj_clib_index(lua_State *L, CLibrary *cl, GCstr *name)
>  #endif
>        void *p = clib_getsym(cl, sym);
>        GCcdata *cd;
> -      lua_assert(ctype_isfunc(ct->info) || ctype_isextern(ct->info));
> +      lj_assertCTS(ctype_isfunc(ct->info) || ctype_isextern(ct->info),
> +		   "unexpected ctype %08x in clib", ct->info);
>  #if LJ_TARGET_X86 && LJ_ABI_WIN
>        /* Retry with decorated name for fastcall/stdcall functions. */
>        if (!p && ctype_isfunc(ct->info)) {
> diff --git a/src/lj_cparse.c b/src/lj_cparse.c
> index cd032b8e..6d9490ca 100644
> --- a/src/lj_cparse.c
> +++ b/src/lj_cparse.c
> @@ -28,6 +28,12 @@
>  ** If in doubt, please check the input against your favorite C compiler.
>  */
>  
> +#ifdef LUA_USE_ASSERT
> +#define lj_assertCP(c, ...)	(lj_assertG_(G(cp->L), (c), __VA_ARGS__))
> +#else
> +#define lj_assertCP(c, ...)	((void)cp)
> +#endif
> +
>  /* -- Miscellaneous ------------------------------------------------------- */
>  
>  /* Match string against a C literal. */
> @@ -61,7 +67,7 @@ LJ_NORET static void cp_err(CPState *cp, ErrMsg em);
>  
>  static const char *cp_tok2str(CPState *cp, CPToken tok)
>  {
> -  lua_assert(tok < CTOK_FIRSTDECL);
> +  lj_assertCP(tok < CTOK_FIRSTDECL, "bad CPToken %d", tok);
>    if (tok > CTOK_OFS)
>      return ctoknames[tok-CTOK_OFS-1];
>    else if (!lj_char_iscntrl(tok))
> @@ -392,7 +398,7 @@ static void cp_init(CPState *cp)
>    cp->curpack = 0;
>    cp->packstack[0] = 255;
>    lj_buf_init(cp->L, &cp->sb);
> -  lua_assert(cp->p != NULL);
> +  lj_assertCP(cp->p != NULL, "uninitialized cp->p");
>    cp_get(cp);  /* Read-ahead first char. */
>    cp->tok = 0;
>    cp->tmask = CPNS_DEFAULT;
> @@ -853,12 +859,13 @@ static CTypeID cp_decl_intern(CPState *cp, CPDecl *decl)
>      /* The cid is already part of info for copies of pointers/functions. */
>      idx = ct->next;
>      if (ctype_istypedef(info)) {
> -      lua_assert(id == 0);
> +      lj_assertCP(id == 0, "typedef not at toplevel");
>        id = ctype_cid(info);
>        /* Always refetch info/size, since struct/enum may have been completed. */
>        cinfo = ctype_get(cp->cts, id)->info;
>        csize = ctype_get(cp->cts, id)->size;
> -      lua_assert(ctype_isstruct(cinfo) || ctype_isenum(cinfo));
> +      lj_assertCP(ctype_isstruct(cinfo) || ctype_isenum(cinfo),
> +		  "typedef of bad type");
>      } else if (ctype_isfunc(info)) {  /* Intern function. */
>        CType *fct;
>        CTypeID fid;
> @@ -891,7 +898,7 @@ static CTypeID cp_decl_intern(CPState *cp, CPDecl *decl)
>        /* Inherit csize/cinfo from original type. */
>      } else {
>        if (ctype_isnum(info)) {  /* Handle mode/vector-size attributes. */
> -	lua_assert(id == 0);
> +	lj_assertCP(id == 0, "number not at toplevel");
>  	if (!(info & CTF_BOOL)) {
>  	  CTSize msize = ctype_msizeP(decl->attr);
>  	  CTSize vsize = ctype_vsizeP(decl->attr);
> @@ -946,7 +953,7 @@ static CTypeID cp_decl_intern(CPState *cp, CPDecl *decl)
>  	  info = (info & ~CTF_ALIGN) | (cinfo & CTF_ALIGN);
>  	info |= (cinfo & CTF_QUAL);  /* Inherit qual. */
>        } else {
> -	lua_assert(ctype_isvoid(info));
> +	lj_assertCP(ctype_isvoid(info), "bad ctype %08x", info);
>        }
>        csize = size;
>        cinfo = info+id;
> @@ -1596,7 +1603,7 @@ end_decl:
>  	cp_errmsg(cp, cp->tok, LJ_ERR_FFI_DECLSPEC);
>        sz = sizeof(int);
>      }
> -    lua_assert(sz != 0);
> +    lj_assertCP(sz != 0, "basic ctype with zero size");
>      info += CTALIGN(lj_fls(sz));  /* Use natural alignment. */
>      info += (decl->attr & CTF_QUAL);  /* Merge qualifiers. */
>      cp_push(decl, info, sz);
> @@ -1856,7 +1863,7 @@ static void cp_decl_multi(CPState *cp)
>  	  /* Treat both static and extern function declarations as extern. */
>  	  ct = ctype_get(cp->cts, ctypeid);
>  	  /* We always get new anonymous functions (typedefs are copied). */
> -	  lua_assert(gcref(ct->name) == NULL);
> +	  lj_assertCP(gcref(ct->name) == NULL, "unexpected named function");
>  	  id = ctypeid;  /* Just name it. */
>  	} else if ((scl & CDF_STATIC)) {  /* Accept static constants. */
>  	  id = cp_decl_constinit(cp, &ct, ctypeid);
> @@ -1913,7 +1920,7 @@ static TValue *cpcparser(lua_State *L, lua_CFunction dummy, void *ud)
>      cp_decl_single(cp);
>    if (cp->param && cp->param != cp->L->top)
>      cp_err(cp, LJ_ERR_FFI_NUMPARAM);
> -  lua_assert(cp->depth == 0);
> +  lj_assertCP(cp->depth == 0, "unbalanced cparser declaration depth");
>    return NULL;
>  }
>  
> diff --git a/src/lj_crecord.c b/src/lj_crecord.c
> index 804cdbf4..e1d1110f 100644
> --- a/src/lj_crecord.c
> +++ b/src/lj_crecord.c
> @@ -61,7 +61,8 @@ static GCcdata *argv2cdata(jit_State *J, TRef tr, cTValue *o)
>  static CTypeID crec_constructor(jit_State *J, GCcdata *cd, TRef tr)
>  {
>    CTypeID id;
> -  lua_assert(tref_iscdata(tr) && cd->ctypeid == CTID_CTYPEID);
> +  lj_assertJ(tref_iscdata(tr) && cd->ctypeid == CTID_CTYPEID,
> +	     "expected CTypeID cdata");
>    id = *(CTypeID *)cdataptr(cd);
>    tr = emitir(IRT(IR_FLOAD, IRT_INT), tr, IRFL_CDATA_INT);
>    emitir(IRTG(IR_EQ, IRT_INT), tr, lj_ir_kint(J, (int32_t)id));
> @@ -237,13 +238,14 @@ static void crec_copy(jit_State *J, TRef trdst, TRef trsrc, TRef trlen,
>      if (len > CREC_COPY_MAXLEN) goto fallback;
>      if (ct) {
>        CTState *cts = ctype_ctsG(J2G(J));
> -      lua_assert(ctype_isarray(ct->info) || ctype_isstruct(ct->info));
> +      lj_assertJ(ctype_isarray(ct->info) || ctype_isstruct(ct->info),
> +		 "copy of non-aggregate");
>        if (ctype_isarray(ct->info)) {
>  	CType *cct = ctype_rawchild(cts, ct);
>  	tp = crec_ct2irt(cts, cct);
>  	if (tp == IRT_CDATA) goto rawcopy;
>  	step = lj_ir_type_size[tp];
> -	lua_assert((len & (step-1)) == 0);
> +	lj_assertJ((len & (step-1)) == 0, "copy of fractional size");
>        } else if ((ct->info & CTF_UNION)) {
>  	step = (1u << ctype_align(ct->info));
>  	goto rawcopy;
> @@ -629,7 +631,8 @@ static TRef crec_ct_tv(jit_State *J, CType *d, TRef dp, TRef sp, cTValue *sval)
>        /* Specialize to the name of the enum constant. */
>        emitir(IRTG(IR_EQ, IRT_STR), sp, lj_ir_kstr(J, str));
>        if (cct && ctype_isconstval(cct->info)) {
> -	lua_assert(ctype_child(cts, cct)->size == 4);
> +	lj_assertJ(ctype_child(cts, cct)->size == 4,
> +		   "only 32 bit const supported");  /* NYI */
>  	svisnz = (void *)(intptr_t)(ofs != 0);
>  	sp = lj_ir_kint(J, (int32_t)ofs);
>  	sid = ctype_cid(cct->info);
> @@ -756,7 +759,7 @@ static void crec_index_bf(jit_State *J, RecordFFData *rd, TRef ptr, CTInfo info)
>    IRType t = IRT_I8 + 2*lj_fls(ctype_bitcsz(info)) + ((info&CTF_UNSIGNED)?1:0);
>    TRef tr = emitir(IRT(IR_XLOAD, t), ptr, 0);
>    CTSize pos = ctype_bitpos(info), bsz = ctype_bitbsz(info), shift = 32 - bsz;
> -  lua_assert(t <= IRT_U32);  /* NYI: 64 bit bitfields. */
> +  lj_assertJ(t <= IRT_U32, "only 32 bit bitfields supported");  /* NYI */
>    if (rd->data == 0) {  /* __index metamethod. */
>      if ((info & CTF_BOOL)) {
>        tr = emitir(IRTI(IR_BAND), tr, lj_ir_kint(J, (int32_t)((1u << pos))));
> @@ -768,7 +771,7 @@ static void crec_index_bf(jit_State *J, RecordFFData *rd, TRef ptr, CTInfo info)
>        tr = emitir(IRTI(IR_BSHL), tr, lj_ir_kint(J, shift - pos));
>        tr = emitir(IRTI(IR_BSAR), tr, lj_ir_kint(J, shift));
>      } else {
> -      lua_assert(bsz < 32);  /* Full-size fields cannot end up here. */
> +      lj_assertJ(bsz < 32, "unexpected full bitfield index");
>        tr = emitir(IRTI(IR_BSHR), tr, lj_ir_kint(J, pos));
>        tr = emitir(IRTI(IR_BAND), tr, lj_ir_kint(J, (int32_t)((1u << bsz)-1)));
>        /* We can omit the U32 to NUM conversion, since bsz < 32. */
> @@ -883,7 +886,7 @@ again:
>  	  crec_index_bf(J, rd, ptr, fct->info);
>  	  return;
>  	} else {
> -	  lua_assert(ctype_isfield(fct->info));
> +	  lj_assertJ(ctype_isfield(fct->info), "field expected");
>  	  sid = ctype_cid(fct->info);
>  	}
>        }
> @@ -1133,7 +1136,7 @@ static TRef crec_call_args(jit_State *J, RecordFFData *rd,
>      if (fid) {  /* Get argument type from field. */
>        CType *ctf = ctype_get(cts, fid);
>        fid = ctf->sib;
> -      lua_assert(ctype_isfield(ctf->info));
> +      lj_assertJ(ctype_isfield(ctf->info), "field expected");
>        did = ctype_cid(ctf->info);
>      } else {
>        if (!(ct->info & CTF_VARARG))
> diff --git a/src/lj_ctype.c b/src/lj_ctype.c
> index 0ea89c74..a42e3d60 100644
> --- a/src/lj_ctype.c
> +++ b/src/lj_ctype.c
> @@ -153,7 +153,7 @@ CTypeID lj_ctype_new(CTState *cts, CType **ctp)
>  {
>    CTypeID id = cts->top;
>    CType *ct;
> -  lua_assert(cts->L);
> +  lj_assertCTS(cts->L, "uninitialized cts->L");
>    if (LJ_UNLIKELY(id >= cts->sizetab)) {
>      if (id >= CTID_MAX) lj_err_msg(cts->L, LJ_ERR_TABOV);
>  #ifdef LUAJIT_CTYPE_CHECK_ANCHOR
> @@ -182,7 +182,7 @@ CTypeID lj_ctype_intern(CTState *cts, CTInfo info, CTSize size)
>  {
>    uint32_t h = ct_hashtype(info, size);
>    CTypeID id = cts->hash[h];
> -  lua_assert(cts->L);
> +  lj_assertCTS(cts->L, "uninitialized cts->L");
>    while (id) {
>      CType *ct = ctype_get(cts, id);
>      if (ct->info == info && ct->size == size)
> @@ -298,9 +298,9 @@ CTSize lj_ctype_vlsize(CTState *cts, CType *ct, CTSize nelem)
>      }
>      ct = ctype_raw(cts, arrid);
>    }
> -  lua_assert(ctype_isvlarray(ct->info));  /* Must be a VLA. */
> +  lj_assertCTS(ctype_isvlarray(ct->info), "VLA expected");
>    ct = ctype_rawchild(cts, ct);  /* Get array element. */
> -  lua_assert(ctype_hassize(ct->info));
> +  lj_assertCTS(ctype_hassize(ct->info), "bad VLA without size");
>    /* Calculate actual size of VLA and check for overflow. */
>    xsz += (uint64_t)ct->size * nelem;
>    return xsz < 0x80000000u ? (CTSize)xsz : CTSIZE_INVALID;
> @@ -323,7 +323,8 @@ CTInfo lj_ctype_info(CTState *cts, CTypeID id, CTSize *szp)
>      } else {
>        if (!(qual & CTFP_ALIGNED)) qual |= (info & CTF_ALIGN);
>        qual |= (info & ~(CTF_ALIGN|CTMASK_CID));
> -      lua_assert(ctype_hassize(info) || ctype_isfunc(info));
> +      lj_assertCTS(ctype_hassize(info) || ctype_isfunc(info),
> +		   "ctype without size");
>        *szp = ctype_isfunc(info) ? CTSIZE_INVALID : ct->size;
>        break;
>      }
> @@ -528,7 +529,7 @@ static void ctype_repr(CTRepr *ctr, CTypeID id)
>        ctype_appc(ctr, ')');
>        break;
>      default:
> -      lua_assert(0);
> +      lj_assertG_(ctr->cts->g, 0, "bad ctype %08x", info);
>        break;
>      }
>      ct = ctype_get(ctr->cts, ctype_cid(info));
> diff --git a/src/lj_ctype.h b/src/lj_ctype.h
> index 0c220a88..c4f3bdde 100644
> --- a/src/lj_ctype.h
> +++ b/src/lj_ctype.h
> @@ -260,6 +260,12 @@ typedef struct CTState {
>  
>  #define CT_MEMALIGN	3	/* Alignment guaranteed by memory allocator. */
>  
> +#ifdef LUA_USE_ASSERT
> +#define lj_assertCTS(c, ...)	(lj_assertG_(cts->g, (c), __VA_ARGS__))
> +#else
> +#define lj_assertCTS(c, ...)	((void)cts)
> +#endif
> +
>  /* -- Predefined types ---------------------------------------------------- */
>  
>  /* Target-dependent types. */
> @@ -392,7 +398,8 @@ static LJ_AINLINE CTState *ctype_cts(lua_State *L)
>  /* Check C type ID for validity when assertions are enabled. */
>  static LJ_AINLINE CTypeID ctype_check(CTState *cts, CTypeID id)
>  {
> -  lua_assert(id > 0 && id < cts->top); UNUSED(cts);
> +  UNUSED(cts);
> +  lj_assertCTS(id > 0 && id < cts->top, "bad CTID %d", id);
>    return id;
>  }
>  
> @@ -408,8 +415,9 @@ static LJ_AINLINE CType *ctype_get(CTState *cts, CTypeID id)
>  /* Get child C type. */
>  static LJ_AINLINE CType *ctype_child(CTState *cts, CType *ct)
>  {
> -  lua_assert(!(ctype_isvoid(ct->info) || ctype_isstruct(ct->info) ||
> -	     ctype_isbitfield(ct->info)));  /* These don't have children. */
> +  lj_assertCTS(!(ctype_isvoid(ct->info) || ctype_isstruct(ct->info) ||
> +	       ctype_isbitfield(ct->info)),
> +	       "ctype %08x has no children", ct->info);
>    return ctype_get(cts, ctype_cid(ct->info));
>  }
>  
> diff --git a/src/lj_debug.c b/src/lj_debug.c
> index c4edcabb..46c442c6 100644
> --- a/src/lj_debug.c
> +++ b/src/lj_debug.c
> @@ -55,7 +55,8 @@ static BCPos debug_framepc(lua_State *L, GCfunc *fn, cTValue *nextframe)
>    const BCIns *ins;
>    GCproto *pt;
>    BCPos pos;
> -  lua_assert(fn->c.gct == ~LJ_TFUNC || fn->c.gct == ~LJ_TTHREAD);
> +  lj_assertL(fn->c.gct == ~LJ_TFUNC || fn->c.gct == ~LJ_TTHREAD,
> +	     "function or frame expected");
>    if (!isluafunc(fn)) {  /* Cannot derive a PC for non-Lua functions. */
>      return NO_BCPOS;
>    } else if (nextframe == NULL) {  /* Lua function on top. */
> @@ -101,7 +102,7 @@ static BCPos debug_framepc(lua_State *L, GCfunc *fn, cTValue *nextframe)
>  #if LJ_HASJIT
>    if (pos > pt->sizebc) {  /* Undo the effects of lj_trace_exit for JLOOP. */
>      GCtrace *T = (GCtrace *)((char *)(ins-1) - offsetof(GCtrace, startins));
> -    lua_assert(bc_isret(bc_op(ins[-1])));
> +    lj_assertL(bc_isret(bc_op(ins[-1])), "return bytecode expected");
>      pos = proto_bcpos(pt, mref(T->startpc, const BCIns));
>    }
>  #endif
> @@ -134,7 +135,7 @@ BCLine lj_debug_frameline(lua_State *L, GCfunc *fn, cTValue *nextframe)
>    BCPos pc = debug_framepc(L, fn, nextframe);
>    if (pc != NO_BCPOS) {
>      GCproto *pt = funcproto(fn);
> -    lua_assert(pc <= pt->sizebc);
> +    lj_assertL(pc <= pt->sizebc, "PC out of range");
>      return lj_debug_line(pt, pc);
>    }
>    return -1;
> @@ -215,7 +216,7 @@ static TValue *debug_localname(lua_State *L, const lua_Debug *ar,
>  const char *lj_debug_uvname(GCproto *pt, uint32_t idx)
>  {
>    const uint8_t *p = proto_uvinfo(pt);
> -  lua_assert(idx < pt->sizeuv);
> +  lj_assertX(idx < pt->sizeuv, "bad upvalue index");
>    if (!p) return "";
>    if (idx) while (*p++ || --idx) ;
>    return (const char *)p;
> @@ -440,13 +441,14 @@ int lj_debug_getinfo(lua_State *L, const char *what, lj_Debug *ar, int ext)
>    } else {
>      uint32_t offset = (uint32_t)ar->i_ci & 0xffff;
>      uint32_t size = (uint32_t)ar->i_ci >> 16;
> -    lua_assert(offset != 0);
> +    lj_assertL(offset != 0, "bad frame offset");
>      frame = tvref(L->stack) + offset;
>      if (size) nextframe = frame + size;
> -    lua_assert(frame <= tvref(L->maxstack) &&
> -	       (!nextframe || nextframe <= tvref(L->maxstack)));
> +    lj_assertL(frame <= tvref(L->maxstack) &&
> +	       (!nextframe || nextframe <= tvref(L->maxstack)),
> +	       "broken frame chain");
>      fn = frame_func(frame);
> -    lua_assert(fn->c.gct == ~LJ_TFUNC);
> +    lj_assertL(fn->c.gct == ~LJ_TFUNC, "bad frame function");
>    }
>    for (; *what; what++) {
>      if (*what == 'S') {
> diff --git a/src/lj_def.h b/src/lj_def.h
> index 2d8fff66..ba4dcc9d 100644
> --- a/src/lj_def.h
> +++ b/src/lj_def.h
> @@ -338,14 +338,28 @@ static LJ_AINLINE uint32_t lj_getu32(const void *v)
>  #define LJ_FUNCA_NORET	LJ_FUNCA LJ_NORET
>  #define LJ_ASMF_NORET	LJ_ASMF LJ_NORET
>  
> -/* Runtime assertions. */
> -#ifdef lua_assert
> -#define check_exp(c, e)		(lua_assert(c), (e))
> -#define api_check(l, e)		lua_assert(e)
> +/* Internal assertions. */
> +#if defined(LUA_USE_ASSERT) || defined(LUA_USE_APICHECK)
> +#define lj_assert_check(g, c, ...) \
> +  ((c) ? (void)0 : \
> +   (lj_assert_fail((g), __FILE__, __LINE__, __func__, __VA_ARGS__), 0))
> +#define lj_checkapi(c, ...)	lj_assert_check(G(L), (c), __VA_ARGS__)
>  #else
> -#define lua_assert(c)		((void)0)
> +#define lj_checkapi(c, ...)	((void)L)
> +#endif
> +
> +#ifdef LUA_USE_ASSERT
> +#define lj_assertG_(g, c, ...)	lj_assert_check((g), (c), __VA_ARGS__)
> +#define lj_assertG(c, ...)	lj_assert_check(g, (c), __VA_ARGS__)
> +#define lj_assertL(c, ...)	lj_assert_check(G(L), (c), __VA_ARGS__)
> +#define lj_assertX(c, ...)	lj_assert_check(NULL, (c), __VA_ARGS__)
> +#define check_exp(c, e)		(lj_assertX((c), #c), (e))
> +#else
> +#define lj_assertG_(g, c, ...)	((void)0)
> +#define lj_assertG(c, ...)	((void)g)
> +#define lj_assertL(c, ...)	((void)L)
> +#define lj_assertX(c, ...)	((void)0)
>  #define check_exp(c, e)		(e)
> -#define api_check		luai_apicheck
>  #endif
>  
>  /* Static assertions. */
> diff --git a/src/lj_dispatch.c b/src/lj_dispatch.c
> index ee735450..ddee68de 100644
> --- a/src/lj_dispatch.c
> +++ b/src/lj_dispatch.c
> @@ -380,7 +380,7 @@ static void callhook(lua_State *L, int event, BCLine line)
>      hook_enter(g);
>  #endif
>      hookf(L, &ar);
> -    lua_assert(hook_active(g));
> +    lj_assertG(hook_active(g), "active hook flag removed");
>      setgcref(g->cur_L, obj2gco(L));
>  #if LJ_HASPROFILE && !LJ_PROFILE_SIGPROF
>      lj_profile_hook_leave(g);
> @@ -428,7 +428,8 @@ void LJ_FASTCALL lj_dispatch_ins(lua_State *L, const BCIns *pc)
>  #endif
>        J->L = L;
>        lj_trace_ins(J, pc-1);  /* The interpreter bytecode PC is offset by 1. */
> -      lua_assert(L->top - L->base == delta);
> +      lj_assertG(L->top - L->base == delta,
> +		 "unbalanced stack after tracing of instruction");
>      }
>    }
>  #endif
> @@ -488,7 +489,8 @@ ASMFunction LJ_FASTCALL lj_dispatch_call(lua_State *L, const BCIns *pc)
>  #endif
>      pc = (const BCIns *)((uintptr_t)pc & ~(uintptr_t)1);
>      lj_trace_hot(J, pc);
> -    lua_assert(L->top - L->base == delta);
> +    lj_assertG(L->top - L->base == delta,
> +	       "unbalanced stack after hot call");
>      goto out;
>    } else if (J->state != LJ_TRACE_IDLE &&
>  	     !(g->hookmask & (HOOK_GC|HOOK_VMEVENT))) {
> @@ -497,7 +499,8 @@ ASMFunction LJ_FASTCALL lj_dispatch_call(lua_State *L, const BCIns *pc)
>  #endif
>      /* Record the FUNC* bytecodes, too. */
>      lj_trace_ins(J, pc-1);  /* The interpreter bytecode PC is offset by 1. */
> -    lua_assert(L->top - L->base == delta);
> +    lj_assertG(L->top - L->base == delta,
> +	       "unbalanced stack after hot instruction");
>    }
>  #endif
>    if ((g->hookmask & LUA_MASKCALL)) {
> diff --git a/src/lj_emit_arm.h b/src/lj_emit_arm.h
> index dee8bdcc..ee299821 100644
> --- a/src/lj_emit_arm.h
> +++ b/src/lj_emit_arm.h
> @@ -81,7 +81,8 @@ static void emit_m(ASMState *as, ARMIns ai, Reg rm)
>  
>  static void emit_lsox(ASMState *as, ARMIns ai, Reg rd, Reg rn, int32_t ofs)
>  {
> -  lua_assert(ofs >= -255 && ofs <= 255);
> +  lj_assertA(ofs >= -255 && ofs <= 255,
> +	     "load/store offset %d out of range", ofs);
>    if (ofs < 0) ofs = -ofs; else ai |= ARMI_LS_U;
>    *--as->mcp = ai | ARMI_LS_P | ARMI_LSX_I | ARMF_D(rd) | ARMF_N(rn) |
>  	       ((ofs & 0xf0) << 4) | (ofs & 0x0f);
> @@ -89,7 +90,8 @@ static void emit_lsox(ASMState *as, ARMIns ai, Reg rd, Reg rn, int32_t ofs)
>  
>  static void emit_lso(ASMState *as, ARMIns ai, Reg rd, Reg rn, int32_t ofs)
>  {
> -  lua_assert(ofs >= -4095 && ofs <= 4095);
> +  lj_assertA(ofs >= -4095 && ofs <= 4095,
> +	     "load/store offset %d out of range", ofs);
>    /* Combine LDR/STR pairs to LDRD/STRD. */
>    if (*as->mcp == (ai|ARMI_LS_P|ARMI_LS_U|ARMF_D(rd^1)|ARMF_N(rn)|(ofs^4)) &&
>        (ai & ~(ARMI_LDR^ARMI_STR)) == ARMI_STR && rd != rn &&
> @@ -106,7 +108,8 @@ static void emit_lso(ASMState *as, ARMIns ai, Reg rd, Reg rn, int32_t ofs)
>  #if !LJ_SOFTFP
>  static void emit_vlso(ASMState *as, ARMIns ai, Reg rd, Reg rn, int32_t ofs)
>  {
> -  lua_assert(ofs >= -1020 && ofs <= 1020 && (ofs&3) == 0);
> +  lj_assertA(ofs >= -1020 && ofs <= 1020 && (ofs&3) == 0,
> +	     "load/store offset %d out of range", ofs);
>    if (ofs < 0) ofs = -ofs; else ai |= ARMI_LS_U;
>    *--as->mcp = ai | ARMI_LS_P | ARMF_D(rd & 15) | ARMF_N(rn) | (ofs >> 2);
>  }
> @@ -124,7 +127,7 @@ static int emit_kdelta1(ASMState *as, Reg d, int32_t i)
>    while (work) {
>      Reg r = rset_picktop(work);
>      IRRef ref = regcost_ref(as->cost[r]);
> -    lua_assert(r != d);
> +    lj_assertA(r != d, "dest reg not free");
>      if (emit_canremat(ref)) {
>        int32_t delta = i - (ra_iskref(ref) ? ra_krefk(as, ref) : IR(ref)->i);
>        uint32_t k = emit_isk12(ARMI_ADD, delta);
> @@ -142,13 +145,13 @@ static int emit_kdelta1(ASMState *as, Reg d, int32_t i)
>  }
>  
>  /* Try to find a two step delta relative to another constant. */
> -static int emit_kdelta2(ASMState *as, Reg d, int32_t i)
> +static int emit_kdelta2(ASMState *as, Reg rd, int32_t i)
>  {
>    RegSet work = ~as->freeset & RSET_GPR;
>    while (work) {
>      Reg r = rset_picktop(work);
>      IRRef ref = regcost_ref(as->cost[r]);
> -    lua_assert(r != d);
> +    lj_assertA(r != rd, "dest reg %d not free", rd);
>      if (emit_canremat(ref)) {
>        int32_t other = ra_iskref(ref) ? ra_krefk(as, ref) : IR(ref)->i;
>        if (other) {
> @@ -159,8 +162,8 @@ static int emit_kdelta2(ASMState *as, Reg d, int32_t i)
>  	k2 = emit_isk12(0, delta & (255 << sh));
>  	k = emit_isk12(0, delta & ~(255 << sh));
>  	if (k) {
> -	  emit_dn(as, ARMI_ADD^k2^inv, d, d);
> -	  emit_dn(as, ARMI_ADD^k^inv, d, r);
> +	  emit_dn(as, ARMI_ADD^k2^inv, rd, rd);
> +	  emit_dn(as, ARMI_ADD^k^inv, rd, r);
>  	  return 1;
>  	}
>        }
> @@ -171,23 +174,24 @@ static int emit_kdelta2(ASMState *as, Reg d, int32_t i)
>  }
>  
>  /* Load a 32 bit constant into a GPR. */
> -static void emit_loadi(ASMState *as, Reg r, int32_t i)
> +static void emit_loadi(ASMState *as, Reg rd, int32_t i)
>  {
>    uint32_t k = emit_isk12(ARMI_MOV, i);
> -  lua_assert(rset_test(as->freeset, r) || r == RID_TMP);
> +  lj_assertA(rset_test(as->freeset, rd) || rd == RID_TMP,
> +	     "dest reg %d not free", rd);
>    if (k) {
>      /* Standard K12 constant. */
> -    emit_d(as, ARMI_MOV^k, r);
> +    emit_d(as, ARMI_MOV^k, rd);
>    } else if ((as->flags & JIT_F_ARMV6T2) && (uint32_t)i < 0x00010000u) {
>      /* 16 bit loword constant for ARMv6T2. */
> -    emit_d(as, ARMI_MOVW|(i & 0x0fff)|((i & 0xf000)<<4), r);
> -  } else if (emit_kdelta1(as, r, i)) {
> +    emit_d(as, ARMI_MOVW|(i & 0x0fff)|((i & 0xf000)<<4), rd);
> +  } else if (emit_kdelta1(as, rd, i)) {
>      /* One step delta relative to another constant. */
>    } else if ((as->flags & JIT_F_ARMV6T2)) {
>      /* 32 bit hiword/loword constant for ARMv6T2. */
> -    emit_d(as, ARMI_MOVT|((i>>16) & 0x0fff)|(((i>>16) & 0xf000)<<4), r);
> -    emit_d(as, ARMI_MOVW|(i & 0x0fff)|((i & 0xf000)<<4), r);
> -  } else if (emit_kdelta2(as, r, i)) {
> +    emit_d(as, ARMI_MOVT|((i>>16) & 0x0fff)|(((i>>16) & 0xf000)<<4), rd);
> +    emit_d(as, ARMI_MOVW|(i & 0x0fff)|((i & 0xf000)<<4), rd);
> +  } else if (emit_kdelta2(as, rd, i)) {
>      /* Two step delta relative to another constant. */
>    } else {
>      /* Otherwise construct the constant with up to 4 instructions. */
> @@ -197,15 +201,15 @@ static void emit_loadi(ASMState *as, Reg r, int32_t i)
>        int32_t m = i & (255 << sh);
>        i &= ~(255 << sh);
>        if (i == 0) {
> -	emit_d(as, ARMI_MOV ^ emit_isk12(0, m), r);
> +	emit_d(as, ARMI_MOV ^ emit_isk12(0, m), rd);
>  	break;
>        }
> -      emit_dn(as, ARMI_ORR ^ emit_isk12(0, m), r, r);
> +      emit_dn(as, ARMI_ORR ^ emit_isk12(0, m), rd, rd);
>      }
>    }
>  }
>  
> -#define emit_loada(as, r, addr)		emit_loadi(as, (r), i32ptr((addr)))
> +#define emit_loada(as, rd, addr)	emit_loadi(as, (rd), i32ptr((addr)))
>  
>  static Reg ra_allock(ASMState *as, intptr_t k, RegSet allow);
>  
> @@ -261,7 +265,7 @@ static void emit_branch(ASMState *as, ARMIns ai, MCode *target)
>  {
>    MCode *p = as->mcp;
>    ptrdiff_t delta = (target - p) - 1;
> -  lua_assert(((delta + 0x00800000) >> 24) == 0);
> +  lj_assertA(((delta + 0x00800000) >> 24) == 0, "branch target out of range");
>    *--p = ai | ((uint32_t)delta & 0x00ffffffu);
>    as->mcp = p;
>  }
> @@ -289,7 +293,7 @@ static void emit_call(ASMState *as, void *target)
>  static void emit_movrr(ASMState *as, IRIns *ir, Reg dst, Reg src)
>  {
>  #if LJ_SOFTFP
> -  lua_assert(!irt_isnum(ir->t)); UNUSED(ir);
> +  lj_assertA(!irt_isnum(ir->t), "unexpected FP op"); UNUSED(ir);
>  #else
>    if (dst >= RID_MAX_GPR) {
>      emit_dm(as, irt_isnum(ir->t) ? ARMI_VMOV_D : ARMI_VMOV_S,
> @@ -313,7 +317,7 @@ static void emit_movrr(ASMState *as, IRIns *ir, Reg dst, Reg src)
>  static void emit_loadofs(ASMState *as, IRIns *ir, Reg r, Reg base, int32_t ofs)
>  {
>  #if LJ_SOFTFP
> -  lua_assert(!irt_isnum(ir->t)); UNUSED(ir);
> +  lj_assertA(!irt_isnum(ir->t), "unexpected FP op"); UNUSED(ir);
>  #else
>    if (r >= RID_MAX_GPR)
>      emit_vlso(as, irt_isnum(ir->t) ? ARMI_VLDR_D : ARMI_VLDR_S, r, base, ofs);
> @@ -326,7 +330,7 @@ static void emit_loadofs(ASMState *as, IRIns *ir, Reg r, Reg base, int32_t ofs)
>  static void emit_storeofs(ASMState *as, IRIns *ir, Reg r, Reg base, int32_t ofs)
>  {
>  #if LJ_SOFTFP
> -  lua_assert(!irt_isnum(ir->t)); UNUSED(ir);
> +  lj_assertA(!irt_isnum(ir->t), "unexpected FP op"); UNUSED(ir);
>  #else
>    if (r >= RID_MAX_GPR)
>      emit_vlso(as, irt_isnum(ir->t) ? ARMI_VSTR_D : ARMI_VSTR_S, r, base, ofs);
> diff --git a/src/lj_emit_arm64.h b/src/lj_emit_arm64.h
> index 1001b1d8..96fbab72 100644
> --- a/src/lj_emit_arm64.h
> +++ b/src/lj_emit_arm64.h
> @@ -8,8 +8,9 @@
>  
>  /* -- Constant encoding --------------------------------------------------- */
>  
> -static uint64_t get_k64val(IRIns *ir)
> +static uint64_t get_k64val(ASMState *as, IRRef ref)
>  {
> +  IRIns *ir = IR(ref);
>    if (ir->o == IR_KINT64) {
>      return ir_kint64(ir)->u64;
>    } else if (ir->o == IR_KGC) {
> @@ -17,7 +18,8 @@ static uint64_t get_k64val(IRIns *ir)
>    } else if (ir->o == IR_KPTR || ir->o == IR_KKPTR) {
>      return (uint64_t)ir_kptr(ir);
>    } else {
> -    lua_assert(ir->o == IR_KINT || ir->o == IR_KNULL);
> +    lj_assertA(ir->o == IR_KINT || ir->o == IR_KNULL,
> +	       "bad 64 bit const IR op %d", ir->o);
>      return ir->i;  /* Sign-extended. */
>    }
>  }
> @@ -122,7 +124,7 @@ static int emit_checkofs(A64Ins ai, int64_t ofs)
>  static void emit_lso(ASMState *as, A64Ins ai, Reg rd, Reg rn, int64_t ofs)
>  {
>    int ot = emit_checkofs(ai, ofs), sc = (ai >> 30) & 3;
> -  lua_assert(ot);
> +  lj_assertA(ot, "load/store offset %d out of range", ofs);
>    /* Combine LDR/STR pairs to LDP/STP. */
>    if ((sc == 2 || sc == 3) &&
>        (!(ai & 0x400000) || rd != rn) &&
> @@ -166,10 +168,10 @@ static int emit_kdelta(ASMState *as, Reg rd, uint64_t k, int lim)
>    while (work) {
>      Reg r = rset_picktop(work);
>      IRRef ref = regcost_ref(as->cost[r]);
> -    lua_assert(r != rd);
> +    lj_assertA(r != rd, "dest reg %d not free", rd);
>      if (ref < REF_TRUE) {
>        uint64_t kx = ra_iskref(ref) ? (uint64_t)ra_krefk(as, ref) :
> -				     get_k64val(IR(ref));
> +				     get_k64val(as, ref);
>        int64_t delta = (int64_t)(k - kx);
>        if (delta == 0) {
>  	emit_dm(as, A64I_MOVx, rd, r);
> @@ -312,7 +314,7 @@ static void emit_cond_branch(ASMState *as, A64CC cond, MCode *target)
>  {
>    MCode *p = --as->mcp;
>    ptrdiff_t delta = target - p;
> -  lua_assert(A64F_S_OK(delta, 19));
> +  lj_assertA(A64F_S_OK(delta, 19), "branch target out of range");
>    *p = A64I_BCC | A64F_S19(delta) | cond;
>  }
>  
> @@ -320,7 +322,7 @@ static void emit_branch(ASMState *as, A64Ins ai, MCode *target)
>  {
>    MCode *p = --as->mcp;
>    ptrdiff_t delta = target - p;
> -  lua_assert(A64F_S_OK(delta, 26));
> +  lj_assertA(A64F_S_OK(delta, 26), "branch target out of range");
>    *p = ai | A64F_S26(delta);
>  }
>  
> @@ -328,7 +330,8 @@ static void emit_tnb(ASMState *as, A64Ins ai, Reg r, uint32_t bit, MCode *target
>  {
>    MCode *p = --as->mcp;
>    ptrdiff_t delta = target - p;
> -  lua_assert(bit < 63 && A64F_S_OK(delta, 14));
> +  lj_assertA(bit < 63, "bit number out of range");
> +  lj_assertA(A64F_S_OK(delta, 14), "branch target out of range");
>    if (bit > 31) ai |= A64I_X;
>    *p = ai | A64F_BIT(bit & 31) | A64F_S14(delta) | r;
>  }
> @@ -337,7 +340,7 @@ static void emit_cnb(ASMState *as, A64Ins ai, Reg r, MCode *target)
>  {
>    MCode *p = --as->mcp;
>    ptrdiff_t delta = target - p;
> -  lua_assert(A64F_S_OK(delta, 19));
> +  lj_assertA(A64F_S_OK(delta, 19), "branch target out of range");
>    *p = ai | A64F_S19(delta) | r;
>  }
>  
> diff --git a/src/lj_emit_mips.h b/src/lj_emit_mips.h
> index 313d030a..7f0d27ca 100644
> --- a/src/lj_emit_mips.h
> +++ b/src/lj_emit_mips.h
> @@ -4,8 +4,9 @@
>  */
>  
>  #if LJ_64
> -static intptr_t get_k64val(IRIns *ir)
> +static intptr_t get_k64val(ASMState *as, IRRef ref)
>  {
> +  IRIns *ir = IR(ref);
>    if (ir->o == IR_KINT64) {
>      return (intptr_t)ir_kint64(ir)->u64;
>    } else if (ir->o == IR_KGC) {
> @@ -15,16 +16,17 @@ static intptr_t get_k64val(IRIns *ir)
>    } else if (LJ_SOFTFP && ir->o == IR_KNUM) {
>      return (intptr_t)ir_knum(ir)->u64;
>    } else {
> -    lua_assert(ir->o == IR_KINT || ir->o == IR_KNULL);
> +    lj_assertA(ir->o == IR_KINT || ir->o == IR_KNULL,
> +	       "bad 64 bit const IR op %d", ir->o);
>      return ir->i;  /* Sign-extended. */
>    }
>  }
>  #endif
>  
>  #if LJ_64
> -#define get_kval(ir)		get_k64val(ir)
> +#define get_kval(as, ref)	get_k64val(as, ref)
>  #else
> -#define get_kval(ir)		((ir)->i)
> +#define get_kval(as, ref)	(IR((ref))->i)
>  #endif
>  
>  /* -- Emit basic instructions --------------------------------------------- */
> @@ -82,18 +84,18 @@ static void emit_tsml(ASMState *as, MIPSIns mi, Reg rt, Reg rs, uint32_t msb,
>  #define emit_canremat(ref)	((ref) <= REF_BASE)
>  
>  /* Try to find a one step delta relative to another constant. */
> -static int emit_kdelta1(ASMState *as, Reg t, intptr_t i)
> +static int emit_kdelta1(ASMState *as, Reg rd, intptr_t i)
>  {
>    RegSet work = ~as->freeset & RSET_GPR;
>    while (work) {
>      Reg r = rset_picktop(work);
>      IRRef ref = regcost_ref(as->cost[r]);
> -    lua_assert(r != t);
> +    lj_assertA(r != rd, "dest reg %d not free", rd);
>      if (ref < ASMREF_L) {
>        intptr_t delta = (intptr_t)((uintptr_t)i -
> -	(uintptr_t)(ra_iskref(ref) ? ra_krefk(as, ref) : get_kval(IR(ref))));
> +	(uintptr_t)(ra_iskref(ref) ? ra_krefk(as, ref) : get_kval(as, ref)));
>        if (checki16(delta)) {
> -	emit_tsi(as, MIPSI_AADDIU, t, r, delta);
> +	emit_tsi(as, MIPSI_AADDIU, rd, r, delta);
>  	return 1;
>        }
>      }
> @@ -223,7 +225,7 @@ static void emit_branch(ASMState *as, MIPSIns mi, Reg rs, Reg rt, MCode *target)
>  {
>    MCode *p = as->mcp;
>    ptrdiff_t delta = target - p;
> -  lua_assert(((delta + 0x8000) >> 16) == 0);
> +  lj_assertA(((delta + 0x8000) >> 16) == 0, "branch target out of range");
>    *--p = mi | MIPSF_S(rs) | MIPSF_T(rt) | ((uint32_t)delta & 0xffffu);
>    as->mcp = p;
>  }
> @@ -299,7 +301,7 @@ static void emit_storeofs(ASMState *as, IRIns *ir, Reg r, Reg base, int32_t ofs)
>  static void emit_addptr(ASMState *as, Reg r, int32_t ofs)
>  {
>    if (ofs) {
> -    lua_assert(checki16(ofs));
> +    lj_assertA(checki16(ofs), "offset %d out of range", ofs);
>      emit_tsi(as, MIPSI_AADDIU, r, r, ofs);
>    }
>  }
> diff --git a/src/lj_emit_ppc.h b/src/lj_emit_ppc.h
> index 21c3c2ac..ddc864cd 100644
> --- a/src/lj_emit_ppc.h
> +++ b/src/lj_emit_ppc.h
> @@ -41,13 +41,13 @@ static void emit_rot(ASMState *as, PPCIns pi, Reg ra, Reg rs,
>  
>  static void emit_slwi(ASMState *as, Reg ra, Reg rs, int32_t n)
>  {
> -  lua_assert(n >= 0 && n < 32);
> +  lj_assertA(n >= 0 && n < 32, "shift out or range");
>    emit_rot(as, PPCI_RLWINM, ra, rs, n, 0, 31-n);
>  }
>  
>  static void emit_rotlwi(ASMState *as, Reg ra, Reg rs, int32_t n)
>  {
> -  lua_assert(n >= 0 && n < 32);
> +  lj_assertA(n >= 0 && n < 32, "shift out or range");
>    emit_rot(as, PPCI_RLWINM, ra, rs, n, 0, 31);
>  }
>  
> @@ -57,17 +57,17 @@ static void emit_rotlwi(ASMState *as, Reg ra, Reg rs, int32_t n)
>  #define emit_canremat(ref)	((ref) <= REF_BASE)
>  
>  /* Try to find a one step delta relative to another constant. */
> -static int emit_kdelta1(ASMState *as, Reg t, int32_t i)
> +static int emit_kdelta1(ASMState *as, Reg rd, int32_t i)
>  {
>    RegSet work = ~as->freeset & RSET_GPR;
>    while (work) {
>      Reg r = rset_picktop(work);
>      IRRef ref = regcost_ref(as->cost[r]);
> -    lua_assert(r != t);
> +    lj_assertA(r != rd, "dest reg %d not free", rd);
>      if (ref < ASMREF_L) {
>        int32_t delta = i - (ra_iskref(ref) ? ra_krefk(as, ref) : IR(ref)->i);
>        if (checki16(delta)) {
> -	emit_tai(as, PPCI_ADDI, t, r, delta);
> +	emit_tai(as, PPCI_ADDI, rd, r, delta);
>  	return 1;
>        }
>      }
> @@ -144,7 +144,7 @@ static void emit_condbranch(ASMState *as, PPCIns pi, PPCCC cc, MCode *target)
>  {
>    MCode *p = --as->mcp;
>    ptrdiff_t delta = (char *)target - (char *)p;
> -  lua_assert(((delta + 0x8000) >> 16) == 0);
> +  lj_assertA(((delta + 0x8000) >> 16) == 0, "branch target out of range");
>    pi ^= (delta & 0x8000) * (PPCF_Y/0x8000);
>    *p = pi | PPCF_CC(cc) | ((uint32_t)delta & 0xffffu);
>  }
> diff --git a/src/lj_emit_x86.h b/src/lj_emit_x86.h
> index b3dc4ea5..eaef17fc 100644
> --- a/src/lj_emit_x86.h
> +++ b/src/lj_emit_x86.h
> @@ -92,7 +92,7 @@ static void emit_rr(ASMState *as, x86Op xo, Reg r1, Reg r2)
>  /* [addr] is sign-extended in x64 and must be in lower 2G (not 4G). */
>  static int32_t ptr2addr(const void *p)
>  {
> -  lua_assert((uintptr_t)p < (uintptr_t)0x80000000);
> +  lj_assertX((uintptr_t)p < (uintptr_t)0x80000000, "pointer outside 2G range");
>    return i32ptr(p);
>  }
>  #else
> @@ -208,7 +208,7 @@ static void emit_mrm(ASMState *as, x86Op xo, Reg rr, Reg rb)
>        rb = RID_ESP;
>  #endif
>      } else if (LJ_GC64 && rb == RID_RIP) {
> -      lua_assert(as->mrm.idx == RID_NONE);
> +      lj_assertA(as->mrm.idx == RID_NONE, "RIP-rel mrm cannot have index");
>        mode = XM_OFS0;
>        p -= 4;
>        *(int32_t *)p = as->mrm.ofs;
> @@ -401,7 +401,8 @@ static void emit_loadk64(ASMState *as, Reg r, IRIns *ir)
>      emit_rma(as, xo, r64, k);
>    } else {
>      if (ir->i) {
> -      lua_assert(*k == *(uint64_t*)(as->mctop - ir->i));
> +      lj_assertA(*k == *(uint64_t*)(as->mctop - ir->i),
> +		 "bad interned 64 bit constant");
>      } else if (as->curins <= as->stopins && rset_test(RSET_GPR, r)) {
>        emit_loadu64(as, r, *k);
>        return;
> @@ -433,7 +434,7 @@ static void emit_sjmp(ASMState *as, MCLabel target)
>  {
>    MCode *p = as->mcp;
>    ptrdiff_t delta = target - p;
> -  lua_assert(delta == (int8_t)delta);
> +  lj_assertA(delta == (int8_t)delta, "short jump target out of range");
>    p[-1] = (MCode)(int8_t)delta;
>    p[-2] = XI_JMPs;
>    as->mcp = p - 2;
> @@ -445,7 +446,7 @@ static void emit_sjcc(ASMState *as, int cc, MCLabel target)
>  {
>    MCode *p = as->mcp;
>    ptrdiff_t delta = target - p;
> -  lua_assert(delta == (int8_t)delta);
> +  lj_assertA(delta == (int8_t)delta, "short jump target out of range");
>    p[-1] = (MCode)(int8_t)delta;
>    p[-2] = (MCode)(XI_JCCs+(cc&15));
>    as->mcp = p - 2;
> @@ -471,10 +472,11 @@ static void emit_sfixup(ASMState *as, MCLabel source)
>  #define emit_label(as)		((as)->mcp)
>  
>  /* Compute relative 32 bit offset for jump and call instructions. */
> -static LJ_AINLINE int32_t jmprel(MCode *p, MCode *target)
> +static LJ_AINLINE int32_t jmprel(jit_State *J, MCode *p, MCode *target)
>  {
>    ptrdiff_t delta = target - p;
> -  lua_assert(delta == (int32_t)delta);
> +  UNUSED(J);
> +  lj_assertJ(delta == (int32_t)delta, "jump target out of range");
>    return (int32_t)delta;
>  }
>  
> @@ -482,7 +484,7 @@ static LJ_AINLINE int32_t jmprel(MCode *p, MCode *target)
>  static void emit_jcc(ASMState *as, int cc, MCode *target)
>  {
>    MCode *p = as->mcp;
> -  *(int32_t *)(p-4) = jmprel(p, target);
> +  *(int32_t *)(p-4) = jmprel(as->J, p, target);
>    p[-5] = (MCode)(XI_JCCn+(cc&15));
>    p[-6] = 0x0f;
>    as->mcp = p - 6;
> @@ -492,7 +494,7 @@ static void emit_jcc(ASMState *as, int cc, MCode *target)
>  static void emit_jmp(ASMState *as, MCode *target)
>  {
>    MCode *p = as->mcp;
> -  *(int32_t *)(p-4) = jmprel(p, target);
> +  *(int32_t *)(p-4) = jmprel(as->J, p, target);
>    p[-5] = XI_JMP;
>    as->mcp = p - 5;
>  }
> @@ -509,7 +511,7 @@ static void emit_call_(ASMState *as, MCode *target)
>      return;
>    }
>  #endif
> -  *(int32_t *)(p-4) = jmprel(p, target);
> +  *(int32_t *)(p-4) = jmprel(as->J, p, target);
>    p[-5] = XI_CALL;
>    as->mcp = p - 5;
>  }
> diff --git a/src/lj_err.c b/src/lj_err.c
> index 8d7134d9..89c51e98 100644
> --- a/src/lj_err.c
> +++ b/src/lj_err.c
> @@ -483,17 +483,10 @@ void lj_err_verify(void)
>  #if !LJ_TARGET_OSX
>    /* Check disabled on MacOS due to brilliant software engineering at Apple. */
>    struct dwarf_eh_bases ehb;
> -  /*
> -  ** FIXME: The following assertions were replaced with
> -  ** the conventional `lua_assert` ones.
> -  **
> -  ** lj_assertX(_Unwind_Find_FDE((void *)lj_err_throw, &ehb), "broken build: external frame unwinding enabled, but missing -funwind-tables");
> -  ** lj_assertX(_Unwind_Find_FDE((void *)_Unwind_RaiseException, &ehb), "broken build: external frame unwinding enabled, but system libraries have no unwind tables");
> -  */
> -  lua_assert(_Unwind_Find_FDE((void *)lj_err_throw, &ehb));
> +  lj_assertX(_Unwind_Find_FDE((void *)lj_err_throw, &ehb), "broken build: external frame unwinding enabled, but missing -funwind-tables");
>  #endif
>    /* Check disabled, because of broken Fedora/ARM64. See #722.
> -  lua_assert(_Unwind_Find_FDE((void *)_Unwind_RaiseException, &ehb));
> +  lj_assertX(_Unwind_Find_FDE((void *)_Unwind_RaiseException, &ehb), "broken build: external frame unwinding enabled, but system libraries have no unwind tables");
>    */
>  }
>  #endif
> @@ -514,13 +507,7 @@ static int err_unwind_jit(int version, int actions,
>      ExitNo exitno;
>      uintptr_t addr = _Unwind_GetIP(ctx);  /* Return address _after_ call. */
>      uintptr_t stub = lj_trace_unwind(G2J(g), addr - sizeof(MCode), &exitno);
> -    /*
> -    ** FIXME: The following assert was replaced with
> -    ** the conventional `lua_assert`.
> -    **
> -    ** lj_assertG(tvref(g->jit_base), "unexpected throw across mcode frame");
> -    */
> -    lua_assert(tvref(g->jit_base));
> +    lj_assertG(tvref(g->jit_base), "unexpected throw across mcode frame");
>      if (stub) {  /* Jump to side exit to unwind the trace. */
>        G2J(g)->exitcode = LJ_UEXCLASS_ERRCODE(uexclass);
>  #ifdef LJ_TARGET_MIPS
> @@ -603,15 +590,8 @@ uint8_t *lj_err_register_mcode(void *base, size_t sz, uint8_t *info)
>  #ifdef LUA_USE_ASSERT
>    {
>      struct dwarf_eh_bases ehb;
> -    /*
> -    ** FIXME: The following assert was replaced with
> -    ** the conventional `lua_assert`.
> -    **
> -    ** lj_assertX(_Unwind_Find_FDE(info + sizeof(err_frame_jit_template)+1, &ehb),
> -    **      "bad JIT unwind table registration");
> -    */
> -    lua_assert(_Unwind_Find_FDE(info + sizeof(err_frame_jit_template)+1,
> -               &ehb));
> +    lj_assertX(_Unwind_Find_FDE(info + sizeof(err_frame_jit_template)+1, &ehb),
> +	       "bad JIT unwind table registration");
>    }
>  #endif
>    return info + sizeof(err_frame_jit_template);
> @@ -716,13 +696,7 @@ void lj_err_verify(void)
>  {
>    int got = 0;
>    _Unwind_Backtrace((_Unwind_Trace_Fn)err_verify_bt, &got);
> -  /*
> -  ** FIXME: The following assert was replaced with
> -  ** the conventional `lua_assert`.
> -  **
> -  ** lj_assertX(got == 2, "broken build: external frame unwinding enabled, but missing -funwind-tables");
> -  */
> -  lua_assert(got == 2);
> +  lj_assertX(got == 2, "broken build: external frame unwinding enabled, but missing -funwind-tables");
>  }
>  #endif
>  
> @@ -852,7 +826,7 @@ static ptrdiff_t finderrfunc(lua_State *L)
>  	return savestack(L, frame_prevd(frame)+1);  /* xpcall's errorfunc. */
>        return 0;
>      default:
> -      lua_assert(0);
> +      lj_assertL(0, "bad frame type");
>        return 0;
>      }
>    }
> diff --git a/src/lj_func.c b/src/lj_func.c
> index 639dad87..2efecb0f 100644
> --- a/src/lj_func.c
> +++ b/src/lj_func.c
> @@ -24,9 +24,11 @@ void LJ_FASTCALL lj_func_freeproto(global_State *g, GCproto *pt)
>  
>  /* -- Upvalues ------------------------------------------------------------ */
>  
> -static void unlinkuv(GCupval *uv)
> +static void unlinkuv(global_State *g, GCupval *uv)
>  {
> -  lua_assert(uvprev(uvnext(uv)) == uv && uvnext(uvprev(uv)) == uv);
> +  UNUSED(g);
> +  lj_assertG(uvprev(uvnext(uv)) == uv && uvnext(uvprev(uv)) == uv,
> +	     "broken upvalue chain");
>    setgcrefr(uvnext(uv)->prev, uv->prev);
>    setgcrefr(uvprev(uv)->next, uv->next);
>  }
> @@ -40,7 +42,7 @@ static GCupval *func_finduv(lua_State *L, TValue *slot)
>    GCupval *uv;
>    /* Search the sorted list of open upvalues. */
>    while (gcref(*pp) != NULL && uvval((p = gco2uv(gcref(*pp)))) >= slot) {
> -    lua_assert(!p->closed && uvval(p) != &p->tv);
> +    lj_assertG(!p->closed && uvval(p) != &p->tv, "closed upvalue in chain");
>      if (uvval(p) == slot) {  /* Found open upvalue pointing to same slot? */
>        if (isdead(g, obj2gco(p)))  /* Resurrect it, if it's dead. */
>  	flipwhite(obj2gco(p));
> @@ -61,7 +63,8 @@ static GCupval *func_finduv(lua_State *L, TValue *slot)
>    setgcrefr(uv->next, g->uvhead.next);
>    setgcref(uvnext(uv)->prev, obj2gco(uv));
>    setgcref(g->uvhead.next, obj2gco(uv));
> -  lua_assert(uvprev(uvnext(uv)) == uv && uvnext(uvprev(uv)) == uv);
> +  lj_assertG(uvprev(uvnext(uv)) == uv && uvnext(uvprev(uv)) == uv,
> +	     "broken upvalue chain");
>    return uv;
>  }
>  
> @@ -84,12 +87,13 @@ void LJ_FASTCALL lj_func_closeuv(lua_State *L, TValue *level)
>    while (gcref(L->openupval) != NULL &&
>  	 uvval((uv = gco2uv(gcref(L->openupval)))) >= level) {
>      GCobj *o = obj2gco(uv);
> -    lua_assert(!isblack(o) && !uv->closed && uvval(uv) != &uv->tv);
> +    lj_assertG(!isblack(o), "bad black upvalue");
> +    lj_assertG(!uv->closed && uvval(uv) != &uv->tv, "closed upvalue in chain");
>      setgcrefr(L->openupval, uv->nextgc);  /* No longer in open list. */
>      if (isdead(g, o)) {
>        lj_func_freeuv(g, uv);
>      } else {
> -      unlinkuv(uv);
> +      unlinkuv(g, uv);
>        lj_gc_closeuv(g, uv);
>      }
>    }
> @@ -98,7 +102,7 @@ void LJ_FASTCALL lj_func_closeuv(lua_State *L, TValue *level)
>  void LJ_FASTCALL lj_func_freeuv(global_State *g, GCupval *uv)
>  {
>    if (!uv->closed)
> -    unlinkuv(uv);
> +    unlinkuv(g, uv);
>    lj_mem_freet(g, uv);
>  }
>  
> diff --git a/src/lj_gc.c b/src/lj_gc.c
> index c306047a..19d4c963 100644
> --- a/src/lj_gc.c
> +++ b/src/lj_gc.c
> @@ -42,7 +42,8 @@
>  
>  /* Mark a TValue (if needed). */
>  #define gc_marktv(g, tv) \
> -  { lua_assert(!tvisgcv(tv) || (~itype(tv) == gcval(tv)->gch.gct)); \
> +  { lj_assertG(!tvisgcv(tv) || (~itype(tv) == gcval(tv)->gch.gct), \
> +	       "TValue and GC type mismatch"); \
>      if (tviswhite(tv)) gc_mark(g, gcV(tv)); }
>  
>  /* Mark a GCobj (if needed). */
> @@ -56,7 +57,8 @@
>  static void gc_mark(global_State *g, GCobj *o)
>  {
>    int gct = o->gch.gct;
> -  lua_assert(iswhite(o) && !isdead(g, o));
> +  lj_assertG(iswhite(o), "mark of non-white object");
> +  lj_assertG(!isdead(g, o), "mark of dead object");
>    white2gray(o);
>    if (LJ_UNLIKELY(gct == ~LJ_TUDATA)) {
>      GCtab *mt = tabref(gco2ud(o)->metatable);
> @@ -69,8 +71,9 @@ static void gc_mark(global_State *g, GCobj *o)
>      if (uv->closed)
>        gray2black(o);  /* Closed upvalues are never gray. */
>    } else if (gct != ~LJ_TSTR && gct != ~LJ_TCDATA) {
> -    lua_assert(gct == ~LJ_TFUNC || gct == ~LJ_TTAB ||
> -	       gct == ~LJ_TTHREAD || gct == ~LJ_TPROTO || gct == ~LJ_TTRACE);
> +    lj_assertG(gct == ~LJ_TFUNC || gct == ~LJ_TTAB ||
> +	       gct == ~LJ_TTHREAD || gct == ~LJ_TPROTO || gct == ~LJ_TTRACE,
> +	       "bad GC type %d", gct);
>      setgcrefr(o->gch.gclist, g->gc.gray);
>      setgcref(g->gc.gray, o);
>    }
> @@ -103,7 +106,8 @@ static void gc_mark_uv(global_State *g)
>  {
>    GCupval *uv;
>    for (uv = uvnext(&g->uvhead); uv != &g->uvhead; uv = uvnext(uv)) {
> -    lua_assert(uvprev(uvnext(uv)) == uv && uvnext(uvprev(uv)) == uv);
> +    lj_assertG(uvprev(uvnext(uv)) == uv && uvnext(uvprev(uv)) == uv,
> +	       "broken upvalue chain");
>      if (isgray(obj2gco(uv)))
>        gc_marktv(g, uvval(uv));
>    }
> @@ -198,7 +202,7 @@ static int gc_traverse_tab(global_State *g, GCtab *t)
>      for (i = 0; i <= hmask; i++) {
>        Node *n = &node[i];
>        if (!tvisnil(&n->val)) {  /* Mark non-empty slot. */
> -	lua_assert(!tvisnil(&n->key));
> +	lj_assertG(!tvisnil(&n->key), "mark of nil key in non-empty slot");
>  	if (!(weak & LJ_GC_WEAKKEY)) gc_marktv(g, &n->key);
>  	if (!(weak & LJ_GC_WEAKVAL)) gc_marktv(g, &n->val);
>        }
> @@ -213,7 +217,8 @@ static void gc_traverse_func(global_State *g, GCfunc *fn)
>    gc_markobj(g, tabref(fn->c.env));
>    if (isluafunc(fn)) {
>      uint32_t i;
> -    lua_assert(fn->l.nupvalues <= funcproto(fn)->sizeuv);
> +    lj_assertG(fn->l.nupvalues <= funcproto(fn)->sizeuv,
> +	       "function upvalues out of range");
>      gc_markobj(g, funcproto(fn));
>      for (i = 0; i < fn->l.nupvalues; i++)  /* Mark Lua function upvalues. */
>        gc_markobj(g, &gcref(fn->l.uvptr[i])->uv);
> @@ -229,7 +234,7 @@ static void gc_traverse_func(global_State *g, GCfunc *fn)
>  static void gc_marktrace(global_State *g, TraceNo traceno)
>  {
>    GCobj *o = obj2gco(traceref(G2J(g), traceno));
> -  lua_assert(traceno != G2J(g)->cur.traceno);
> +  lj_assertG(traceno != G2J(g)->cur.traceno, "active trace escaped");
>    if (iswhite(o)) {
>      white2gray(o);
>      setgcrefr(o->gch.gclist, g->gc.gray);
> @@ -310,7 +315,7 @@ static size_t propagatemark(global_State *g)
>  {
>    GCobj *o = gcref(g->gc.gray);
>    int gct = o->gch.gct;
> -  lua_assert(isgray(o));
> +  lj_assertG(isgray(o), "propagation of non-gray object");
>    gray2black(o);
>    setgcrefr(g->gc.gray, o->gch.gclist);  /* Remove from gray list. */
>    if (LJ_LIKELY(gct == ~LJ_TTAB)) {
> @@ -342,7 +347,7 @@ static size_t propagatemark(global_State *g)
>      return ((sizeof(GCtrace)+7)&~7) + (T->nins-T->nk)*sizeof(IRIns) +
>  	   T->nsnap*sizeof(SnapShot) + T->nsnapmap*sizeof(SnapEntry);
>  #else
> -    lua_assert(0);
> +    lj_assertG(0, "bad GC type %d", gct);
>      return 0;
>  #endif
>    }
> @@ -396,11 +401,13 @@ static GCRef *gc_sweep(global_State *g, GCRef *p, uint32_t lim)
>      if (o->gch.gct == ~LJ_TTHREAD)  /* Need to sweep open upvalues, too. */
>        gc_fullsweep(g, &gco2th(o)->openupval);
>      if (((o->gch.marked ^ LJ_GC_WHITES) & ow)) {  /* Black or current white? */
> -      lua_assert(!isdead(g, o) || (o->gch.marked & LJ_GC_FIXED));
> +      lj_assertG(!isdead(g, o) || (o->gch.marked & LJ_GC_FIXED),
> +		 "sweep of undead object");
>        makewhite(g, o);  /* Value is alive, change to the current white. */
>        p = &o->gch.nextgc;
>      } else {  /* Otherwise value is dead, free it. */
> -      lua_assert(isdead(g, o) || ow == LJ_GC_SFIXED);
> +      lj_assertG(isdead(g, o) || ow == LJ_GC_SFIXED,
> +		 "sweep of unlive object");
>        setgcrefr(*p, o->gch.nextgc);
>        if (o == gcref(g->gc.root))
>  	setgcrefr(g->gc.root, o->gch.nextgc);  /* Adjust list anchor. */
> @@ -418,7 +425,8 @@ static GCRef *gc_sweep_str_chain(global_State *g, GCRef *p)
>    GCobj *o;
>    while ((o = gcref(*p)) != NULL) {
>      if (((o->gch.marked ^ LJ_GC_WHITES) & ow)) {  /* Black or current white? */
> -      lua_assert(!isdead(g, o) || (o->gch.marked & LJ_GC_FIXED));
> +      lj_assertG(!isdead(g, o) || (o->gch.marked & LJ_GC_FIXED),
> +		 "sweep of undead string");
>        makewhite(g, o);  /* Value is alive, change to the current white. */
>  #if LUAJIT_SMART_STRINGS
>        if (strsmart(&o->str)) {
> @@ -429,7 +437,8 @@ static GCRef *gc_sweep_str_chain(global_State *g, GCRef *p)
>  #endif
>        p = &o->gch.nextgc;
>      } else {  /* Otherwise value is dead, free it. */
> -      lua_assert(isdead(g, o) || ow == LJ_GC_SFIXED);
> +      lj_assertG(isdead(g, o) || ow == LJ_GC_SFIXED,
> +		 "sweep of unlive string");
>        setgcrefr(*p, o->gch.nextgc);
>        lj_str_free(g, &o->str);
>      }
> @@ -454,11 +463,12 @@ static int gc_mayclear(cTValue *o, int val)
>  }
>  
>  /* Clear collected entries from weak tables. */
> -static void gc_clearweak(GCobj *o)
> +static void gc_clearweak(global_State *g, GCobj *o)
>  {
> +  UNUSED(g);
>    while (o) {
>      GCtab *t = gco2tab(o);
> -    lua_assert((t->marked & LJ_GC_WEAK));
> +    lj_assertG((t->marked & LJ_GC_WEAK), "clear of non-weak table");
>      if ((t->marked & LJ_GC_WEAKVAL)) {
>        MSize i, asize = t->asize;
>        for (i = 0; i < asize; i++) {
> @@ -515,7 +525,7 @@ static void gc_finalize(lua_State *L)
>    global_State *g = G(L);
>    GCobj *o = gcnext(gcref(g->gc.mmudata));
>    cTValue *mo;
> -  lua_assert(tvref(g->jit_base) == NULL);  /* Must not be called on trace. */
> +  lj_assertG(tvref(g->jit_base) == NULL, "finalizer called on trace");
>    /* Unchain from list of userdata to be finalized. */
>    if (o == gcref(g->gc.mmudata))
>      setgcrefnull(g->gc.mmudata);
> @@ -607,7 +617,7 @@ static void atomic(global_State *g, lua_State *L)
>  
>    setgcrefr(g->gc.gray, g->gc.weak);  /* Empty the list of weak tables. */
>    setgcrefnull(g->gc.weak);
> -  lua_assert(!iswhite(obj2gco(mainthread(g))));
> +  lj_assertG(!iswhite(obj2gco(mainthread(g))), "main thread turned white");
>    gc_markobj(g, L);  /* Mark running thread. */
>    gc_traverse_curtrace(g);  /* Traverse current trace. */
>    gc_mark_gcroot(g);  /* Mark GC roots (again). */
> @@ -622,7 +632,7 @@ static void atomic(global_State *g, lua_State *L)
>    udsize += gc_propagate_gray(g);  /* And propagate the marks. */
>  
>    /* All marking done, clear weak tables. */
> -  gc_clearweak(gcref(g->gc.weak));
> +  gc_clearweak(g, gcref(g->gc.weak));
>  
>    lj_buf_shrink(L, &g->tmpbuf);  /* Shrink temp buffer. */
>  
> @@ -668,14 +678,14 @@ static size_t gc_onestep(lua_State *L)
>        g->strbloom.cur[1] = g->strbloom.next[1];
>  #endif
>      }
> -    lua_assert(old >= g->gc.total);
> +    lj_assertG(old >= g->gc.total, "sweep increased memory");
>      g->gc.estimate -= old - g->gc.total;
>      return GCSWEEPCOST;
>      }
>    case GCSsweep: {
>      GCSize old = g->gc.total;
>      setmref(g->gc.sweep, gc_sweep(g, mref(g->gc.sweep, GCRef), GCSWEEPMAX));
> -    lua_assert(old >= g->gc.total);
> +    lj_assertG(old >= g->gc.total, "sweep increased memory");
>      g->gc.estimate -= old - g->gc.total;
>      if (gcref(*mref(g->gc.sweep, GCRef)) == NULL) {
>        if (g->strnum <= (g->strmask >> 2) && g->strmask > LJ_MIN_STRTAB*2-1)
> @@ -708,7 +718,7 @@ static size_t gc_onestep(lua_State *L)
>      g->gc.debt = 0;
>      return 0;
>    default:
> -    lua_assert(0);
> +    lj_assertG(0, "bad GC state");
>      return 0;
>    }
>  }
> @@ -782,7 +792,8 @@ void lj_gc_fullgc(lua_State *L)
>    }
>    while (g->gc.state == GCSsweepstring || g->gc.state == GCSsweep)
>      gc_onestep(L);  /* Finish sweep. */
> -  lua_assert(g->gc.state == GCSfinalize || g->gc.state == GCSpause);
> +  lj_assertG(g->gc.state == GCSfinalize || g->gc.state == GCSpause,
> +	     "bad GC state");
>    /* Now perform a full GC. */
>    g->gc.state = GCSpause;
>    do { gc_onestep(L); } while (g->gc.state != GCSpause);
> @@ -795,9 +806,11 @@ void lj_gc_fullgc(lua_State *L)
>  /* Move the GC propagation frontier forward. */
>  void lj_gc_barrierf(global_State *g, GCobj *o, GCobj *v)
>  {
> -  lua_assert(isblack(o) && iswhite(v) && !isdead(g, v) && !isdead(g, o));
> -  lua_assert(g->gc.state != GCSfinalize && g->gc.state != GCSpause);
> -  lua_assert(o->gch.gct != ~LJ_TTAB);
> +  lj_assertG(isblack(o) && iswhite(v) && !isdead(g, v) && !isdead(g, o),
> +	     "bad object states for forward barrier");
> +  lj_assertG(g->gc.state != GCSfinalize && g->gc.state != GCSpause,
> +	     "bad GC state");
> +  lj_assertG(o->gch.gct != ~LJ_TTAB, "barrier object is not a table");
>    /* Preserve invariant during propagation. Otherwise it doesn't matter. */
>    if (g->gc.state == GCSpropagate || g->gc.state == GCSatomic)
>      gc_mark(g, v);  /* Move frontier forward. */
> @@ -834,7 +847,8 @@ void lj_gc_closeuv(global_State *g, GCupval *uv)
>  	lj_gc_barrierf(g, o, gcV(&uv->tv));
>      } else {
>        makewhite(g, o);  /* Make it white, i.e. sweep the upvalue. */
> -      lua_assert(g->gc.state != GCSfinalize && g->gc.state != GCSpause);
> +      lj_assertG(g->gc.state != GCSfinalize && g->gc.state != GCSpause,
> +		 "bad GC state");
>      }
>    }
>  }
> @@ -854,14 +868,15 @@ void lj_gc_barriertrace(global_State *g, uint32_t traceno)
>  void *lj_mem_realloc(lua_State *L, void *p, GCSize osz, GCSize nsz)
>  {
>    global_State *g = G(L);
> -  lua_assert((osz == 0) == (p == NULL));
> +  lj_assertG((osz == 0) == (p == NULL), "realloc API violation");
>  
>    setgcref(g->mem_L, obj2gco(L));
>    p = g->allocf(g->allocd, p, osz, nsz);
>    if (p == NULL && nsz > 0)
>      lj_err_mem(L);
> -  lua_assert((nsz == 0) == (p == NULL));
> -  lua_assert(checkptrGC(p));
> +  lj_assertG((nsz == 0) == (p == NULL), "allocf API violation");
> +  lj_assertG(checkptrGC(p),
> +	     "allocated memory address %p outside required range", p);
>    g->gc.total = (g->gc.total - osz) + nsz;
>    g->gc.allocated += nsz;
>    g->gc.freed += osz;
> @@ -878,7 +893,8 @@ void * LJ_FASTCALL lj_mem_newgco(lua_State *L, GCSize size)
>    o = (GCobj *)g->allocf(g->allocd, NULL, 0, size);
>    if (o == NULL)
>      lj_err_mem(L);
> -  lua_assert(checkptrGC(o));
> +  lj_assertG(checkptrGC(o),
> +	     "allocated memory address %p outside required range", o);
>    g->gc.total += size;
>    g->gc.allocated += size;
>    setgcrefr(o->gch.nextgc, g->gc.root);
> diff --git a/src/lj_gc.h b/src/lj_gc.h
> index 40b02cb0..bd880652 100644
> --- a/src/lj_gc.h
> +++ b/src/lj_gc.h
> @@ -76,8 +76,10 @@ LJ_FUNC void lj_gc_barriertrace(global_State *g, uint32_t traceno);
>  static LJ_AINLINE void lj_gc_barrierback(global_State *g, GCtab *t)
>  {
>    GCobj *o = obj2gco(t);
> -  lua_assert(isblack(o) && !isdead(g, o));
> -  lua_assert(g->gc.state != GCSfinalize && g->gc.state != GCSpause);
> +  lj_assertG(isblack(o) && !isdead(g, o),
> +	     "bad object states for backward barrier");
> +  lj_assertG(g->gc.state != GCSfinalize && g->gc.state != GCSpause,
> +	     "bad GC state");
>    black2gray(o);
>    setgcrefr(t->gclist, g->gc.grayagain);
>    setgcref(g->gc.grayagain, o);
> diff --git a/src/lj_gdbjit.c b/src/lj_gdbjit.c
> index c219ffac..9947eacc 100644
> --- a/src/lj_gdbjit.c
> +++ b/src/lj_gdbjit.c
> @@ -724,7 +724,7 @@ static void gdbjit_buildobj(GDBJITctx *ctx)
>    SECTALIGN(ctx->p, sizeof(uintptr_t));
>    gdbjit_initsect(ctx, GDBJIT_SECT_eh_frame, gdbjit_ehframe);
>    ctx->objsize = (size_t)((char *)ctx->p - (char *)obj);
> -  lua_assert(ctx->objsize < sizeof(GDBJITobj));
> +  lj_assertX(ctx->objsize < sizeof(GDBJITobj), "GDBJITobj overflow");
>  }
>  
>  #undef SECTALIGN
> @@ -782,7 +782,8 @@ void lj_gdbjit_addtrace(jit_State *J, GCtrace *T)
>    ctx.spadjp = CFRAME_SIZE_JIT +
>  	       (MSize)(parent ? traceref(J, parent)->spadjust : 0);
>    ctx.spadj = CFRAME_SIZE_JIT + T->spadjust;
> -  lua_assert(startpc >= proto_bc(pt) && startpc < proto_bc(pt) + pt->sizebc);
> +  lj_assertJ(startpc >= proto_bc(pt) && startpc < proto_bc(pt) + pt->sizebc,
> +	     "start PC out of range");
>    ctx.lineno = lj_debug_line(pt, proto_bcpos(pt, startpc));
>    ctx.filename = proto_chunknamestr(pt);
>    if (*ctx.filename == '@' || *ctx.filename == '=')
> diff --git a/src/lj_ir.c b/src/lj_ir.c
> index 2f7ddb24..9a51186f 100644
> --- a/src/lj_ir.c
> +++ b/src/lj_ir.c
> @@ -38,7 +38,7 @@
>  #define fins			(&J->fold.ins)
>  
>  /* Pass IR on to next optimization in chain (FOLD). */
> -#define emitir(ot, a, b)        (lj_ir_set(J, (ot), (a), (b)), lj_opt_fold(J))
> +#define emitir(ot, a, b)	(lj_ir_set(J, (ot), (a), (b)), lj_opt_fold(J))
>  
>  /* -- IR tables ----------------------------------------------------------- */
>  
> @@ -90,8 +90,9 @@ static void lj_ir_growbot(jit_State *J)
>  {
>    IRIns *baseir = J->irbuf + J->irbotlim;
>    MSize szins = J->irtoplim - J->irbotlim;
> -  lua_assert(szins != 0);
> -  lua_assert(J->cur.nk == J->irbotlim || J->cur.nk-1 == J->irbotlim);
> +  lj_assertJ(szins != 0, "zero IR size");
> +  lj_assertJ(J->cur.nk == J->irbotlim || J->cur.nk-1 == J->irbotlim,
> +	     "unexpected IR growth");
>    if (J->cur.nins + (szins >> 1) < J->irtoplim) {
>      /* More than half of the buffer is free on top: shift up by a quarter. */
>      MSize ofs = szins >> 2;
> @@ -148,9 +149,10 @@ TRef lj_ir_call(jit_State *J, IRCallID id, ...)
>  /* Load field of type t from GG_State + offset. Must be 32 bit aligned. */
>  LJ_FUNC TRef lj_ir_ggfload(jit_State *J, IRType t, uintptr_t ofs)
>  {
> -  lua_assert((ofs & 3) == 0);
> +  lj_assertJ((ofs & 3) == 0, "unaligned GG_State field offset");
>    ofs >>= 2;
> -  lua_assert(ofs >= IRFL__MAX && ofs <= 0x3ff);  /* 10 bit FOLD key limit. */
> +  lj_assertJ(ofs >= IRFL__MAX && ofs <= 0x3ff,
> +	     "GG_State field offset breaks 10 bit FOLD key limit");
>    lj_ir_set(J, IRT(IR_FLOAD, t), REF_NIL, ofs);
>    return lj_opt_fold(J);
>  }
> @@ -181,7 +183,7 @@ static LJ_AINLINE IRRef ir_nextk(jit_State *J)
>  static LJ_AINLINE IRRef ir_nextk64(jit_State *J)
>  {
>    IRRef ref = J->cur.nk - 2;
> -  lua_assert(J->state != LJ_TRACE_ASM);
> +  lj_assertJ(J->state != LJ_TRACE_ASM, "bad JIT state");
>    if (LJ_UNLIKELY(ref < J->irbotlim)) lj_ir_growbot(J);
>    J->cur.nk = ref;
>    return ref;
> @@ -277,7 +279,7 @@ TRef lj_ir_kgc(jit_State *J, GCobj *o, IRType t)
>  {
>    IRIns *ir, *cir = J->cur.ir;
>    IRRef ref;
> -  lua_assert(!isdead(J2G(J), o));
> +  lj_assertJ(!isdead(J2G(J), o), "interning of dead GC object");
>    for (ref = J->chain[IR_KGC]; ref; ref = cir[ref].prev)
>      if (ir_kgc(&cir[ref]) == o)
>        goto found;
> @@ -299,7 +301,7 @@ TRef lj_ir_ktrace(jit_State *J)
>  {
>    IRRef ref = ir_nextkgc(J);
>    IRIns *ir = IR(ref);
> -  lua_assert(irt_toitype_(IRT_P64) == LJ_TTRACE);
> +  lj_assertJ(irt_toitype_(IRT_P64) == LJ_TTRACE, "mismatched type mapping");
>    ir->t.irt = IRT_P64;
>    ir->o = LJ_GC64 ? IR_KNUM : IR_KNULL;  /* Not IR_KGC yet, but same size. */
>    ir->op12 = 0;
> @@ -313,7 +315,7 @@ TRef lj_ir_kptr_(jit_State *J, IROp op, void *ptr)
>    IRIns *ir, *cir = J->cur.ir;
>    IRRef ref;
>  #if LJ_64 && !LJ_GC64
> -  lua_assert((void *)(uintptr_t)u32ptr(ptr) == ptr);
> +  lj_assertJ((void *)(uintptr_t)u32ptr(ptr) == ptr, "out-of-range GC pointer");
>  #endif
>    for (ref = J->chain[op]; ref; ref = cir[ref].prev)
>      if (ir_kptr(&cir[ref]) == ptr)
> @@ -360,7 +362,8 @@ TRef lj_ir_kslot(jit_State *J, TRef key, IRRef slot)
>    IRRef2 op12 = IRREF2((IRRef1)key, (IRRef1)slot);
>    IRRef ref;
>    /* Const part is not touched by CSE/DCE, so 0-65535 is ok for IRMlit here. */
> -  lua_assert(tref_isk(key) && slot == (IRRef)(IRRef1)slot);
> +  lj_assertJ(tref_isk(key) && slot == (IRRef)(IRRef1)slot,
> +	     "out-of-range key/slot");
>    for (ref = J->chain[IR_KSLOT]; ref; ref = cir[ref].prev)
>      if (cir[ref].op12 == op12)
>        goto found;
> @@ -381,7 +384,7 @@ found:
>  void lj_ir_kvalue(lua_State *L, TValue *tv, const IRIns *ir)
>  {
>    UNUSED(L);
> -  lua_assert(ir->o != IR_KSLOT);  /* Common mistake. */
> +  lj_assertL(ir->o != IR_KSLOT, "unexpected KSLOT");  /* Common mistake. */
>    switch (ir->o) {
>    case IR_KPRI: setpriV(tv, irt_toitype(ir->t)); break;
>    case IR_KINT: setintV(tv, ir->i); break;
> @@ -399,7 +402,7 @@ void lj_ir_kvalue(lua_State *L, TValue *tv, const IRIns *ir)
>      break;
>      }
>  #endif
> -  default: lua_assert(0); break;
> +  default: lj_assertL(0, "bad IR constant op %d", ir->o); break;
>    }
>  }
>  
> @@ -459,7 +462,7 @@ int lj_ir_numcmp(lua_Number a, lua_Number b, IROp op)
>    case IR_UGE: return !(a < b);
>    case IR_ULE: return !(a > b);
>    case IR_UGT: return !(a <= b);
> -  default: lua_assert(0); return 0;
> +  default: lj_assertX(0, "bad IR op %d", op); return 0;
>    }
>  }
>  
> @@ -472,7 +475,7 @@ int lj_ir_strcmp(GCstr *a, GCstr *b, IROp op)
>    case IR_GE: return (res >= 0);
>    case IR_LE: return (res <= 0);
>    case IR_GT: return (res > 0);
> -  default: lua_assert(0); return 0;
> +  default: lj_assertX(0, "bad IR op %d", op); return 0;
>    }
>  }
>  
> diff --git a/src/lj_ir.h b/src/lj_ir.h
> index 43e55069..46af54e4 100644
> --- a/src/lj_ir.h
> +++ b/src/lj_ir.h
> @@ -412,11 +412,12 @@ static LJ_AINLINE IRType itype2irt(const TValue *tv)
>  
>  static LJ_AINLINE uint32_t irt_toitype_(IRType t)
>  {
> -  lua_assert(!LJ_64 || LJ_GC64 || t != IRT_LIGHTUD);
> +  lj_assertX(!LJ_64 || LJ_GC64 || t != IRT_LIGHTUD,
> +	     "no plain type tag for lightuserdata");
>    if (LJ_DUALNUM && t > IRT_NUM) {
>      return LJ_TISNUM;
>    } else {
> -    lua_assert(t <= IRT_NUM);
> +    lj_assertX(t <= IRT_NUM, "no plain type tag for IR type %d", t);
>      return ~(uint32_t)t;
>    }
>  }
> diff --git a/src/lj_jit.h b/src/lj_jit.h
> index a8b6f9a7..361570a0 100644
> --- a/src/lj_jit.h
> +++ b/src/lj_jit.h
> @@ -507,6 +507,12 @@ LJ_ALIGN(16)		/* For DISPATCH-relative addresses in assembler part. */
>  #endif
>  jit_State;
>  
> +#ifdef LUA_USE_ASSERT
> +#define lj_assertJ(c, ...)	lj_assertG_(J2G(J), (c), __VA_ARGS__)
> +#else
> +#define lj_assertJ(c, ...)	((void)J)
> +#endif
> +
>  /* Trivial PRNG e.g. used for penalty randomization. */
>  static LJ_AINLINE uint32_t LJ_PRNG_BITS(jit_State *J, int bits)
>  {
> diff --git a/src/lj_lex.c b/src/lj_lex.c
> index c66660d7..cef3c683 100644
> --- a/src/lj_lex.c
> +++ b/src/lj_lex.c
> @@ -76,7 +76,7 @@ static LJ_AINLINE LexChar lex_savenext(LexState *ls)
>  static void lex_newline(LexState *ls)
>  {
>    LexChar old = ls->c;
> -  lua_assert(lex_iseol(ls));
> +  lj_assertLS(lex_iseol(ls), "bad usage");
>    lex_next(ls);  /* Skip "\n" or "\r". */
>    if (lex_iseol(ls) && ls->c != old) lex_next(ls);  /* Skip "\n\r" or "\r\n". */
>    if (++ls->linenumber >= LJ_MAX_LINE)
> @@ -90,7 +90,7 @@ static void lex_number(LexState *ls, TValue *tv)
>  {
>    StrScanFmt fmt;
>    LexChar c, xp = 'e';
> -  lua_assert(lj_char_isdigit(ls->c));
> +  lj_assertLS(lj_char_isdigit(ls->c), "bad usage");
>    if ((c = ls->c) == '0' && (lex_savenext(ls) | 0x20) == 'x')
>      xp = 'p';
>    while (lj_char_isident(ls->c) || ls->c == '.' ||
> @@ -110,7 +110,8 @@ static void lex_number(LexState *ls, TValue *tv)
>    } else if (fmt != STRSCAN_ERROR) {
>      lua_State *L = ls->L;
>      GCcdata *cd;
> -    lua_assert(fmt == STRSCAN_I64 || fmt == STRSCAN_U64 || fmt == STRSCAN_IMAG);
> +    lj_assertLS(fmt == STRSCAN_I64 || fmt == STRSCAN_U64 || fmt == STRSCAN_IMAG,
> +		"unexpected number format %d", fmt);
>      if (!ctype_ctsG(G(L))) {
>        ptrdiff_t oldtop = savestack(L, L->top);
>        luaopen_ffi(L);  /* Load FFI library on-demand. */
> @@ -127,7 +128,8 @@ static void lex_number(LexState *ls, TValue *tv)
>      lj_parse_keepcdata(ls, tv, cd);
>  #endif
>    } else {
> -    lua_assert(fmt == STRSCAN_ERROR);
> +    lj_assertLS(fmt == STRSCAN_ERROR,
> +		"unexpected number format %d", fmt);
>      lj_lex_error(ls, TK_number, LJ_ERR_XNUMBER);
>    }
>  }
> @@ -137,7 +139,7 @@ static int lex_skipeq(LexState *ls)
>  {
>    int count = 0;
>    LexChar s = ls->c;
> -  lua_assert(s == '[' || s == ']');
> +  lj_assertLS(s == '[' || s == ']', "bad usage");
>    while (lex_savenext(ls) == '=' && count < 0x20000000)
>      count++;
>    return (ls->c == s) ? count : (-count) - 1;
> @@ -462,7 +464,7 @@ void lj_lex_next(LexState *ls)
>  /* Look ahead for the next token. */
>  LexToken lj_lex_lookahead(LexState *ls)
>  {
> -  lua_assert(ls->lookahead == TK_eof);
> +  lj_assertLS(ls->lookahead == TK_eof, "double lookahead");
>    ls->lookahead = lex_scan(ls, &ls->lookaheadval);
>    return ls->lookahead;
>  }
> diff --git a/src/lj_lex.h b/src/lj_lex.h
> index 33fa8657..ae05a954 100644
> --- a/src/lj_lex.h
> +++ b/src/lj_lex.h
> @@ -83,4 +83,10 @@ LJ_FUNC const char *lj_lex_token2str(LexState *ls, LexToken tok);
>  LJ_FUNC_NORET void lj_lex_error(LexState *ls, LexToken tok, ErrMsg em, ...);
>  LJ_FUNC void lj_lex_init(lua_State *L);
>  
> +#ifdef LUA_USE_ASSERT
> +#define lj_assertLS(c, ...)	(lj_assertG_(G(ls->L), (c), __VA_ARGS__))
> +#else
> +#define lj_assertLS(c, ...)	((void)ls)
> +#endif
> +
>  #endif
> diff --git a/src/lj_load.c b/src/lj_load.c
> index 9a31d9a1..19ac6ba2 100644
> --- a/src/lj_load.c
> +++ b/src/lj_load.c
> @@ -159,7 +159,7 @@ LUALIB_API int luaL_loadstring(lua_State *L, const char *s)
>  LUA_API int lua_dump(lua_State *L, lua_Writer writer, void *data)
>  {
>    cTValue *o = L->top-1;
> -  api_check(L, L->top > L->base);
> +  lj_checkapi(L->top > L->base, "top slot empty");
>    if (tvisfunc(o) && isluafunc(funcV(o)))
>      return lj_bcwrite(L, funcproto(funcV(o)), writer, data, 0);
>    else
> diff --git a/src/lj_mapi.c b/src/lj_mapi.c
> index 9d97c747..679ca943 100644
> --- a/src/lj_mapi.c
> +++ b/src/lj_mapi.c
> @@ -28,7 +28,7 @@ LUAMISC_API void luaM_metrics(lua_State *L, struct luam_Metrics *metrics)
>    jit_State *J = G2J(g);
>  #endif
>  
> -  lua_assert(metrics != NULL);
> +  lj_assertL(metrics != NULL, "uninitialized metrics struct");
>  
>    metrics->strhash_hit = g->strhash_hit;
>    metrics->strhash_miss = g->strhash_miss;
> diff --git a/src/lj_mcode.c b/src/lj_mcode.c
> index 10db4457..808a9897 100644
> --- a/src/lj_mcode.c
> +++ b/src/lj_mcode.c
> @@ -354,7 +354,7 @@ MCode *lj_mcode_patch(jit_State *J, MCode *ptr, int finish)
>      /* Otherwise search through the list of MCode areas. */
>      for (;;) {
>        mc = ((MCLink *)mc)->next;
> -      lua_assert(mc != NULL);
> +      lj_assertJ(mc != NULL, "broken MCode area chain");
>        if (ptr >= mc && ptr < (MCode *)((char *)mc + ((MCLink *)mc)->size)) {
>  	if (LJ_UNLIKELY(mcode_setprot(mc, ((MCLink *)mc)->size, MCPROT_GEN)))
>  	  mcode_protfail(J);
> diff --git a/src/lj_memprof.c b/src/lj_memprof.c
> index c600c4f0..a492cf58 100644
> --- a/src/lj_memprof.c
> +++ b/src/lj_memprof.c
> @@ -144,7 +144,7 @@ static void memprof_write_func(struct memprof *mp, uint8_t aevent)
>    else if (iscfunc(fn))
>      memprof_write_cfunc(out, aevent, fn, L, &mp->lib_adds);
>    else
> -    lua_assert(0);
> +    lj_assertL(0, "unknown function type to write by memprof");
>  }
>  
>  #if LJ_HASJIT
> @@ -164,7 +164,7 @@ static void memprof_write_trace(struct memprof *mp, uint8_t aevent)
>  {
>    UNUSED(mp);
>    UNUSED(aevent);
> -  lua_assert(0);
> +  lj_assertX(0, "write trace memprof event without JIT");
>  }
>  
>  #endif
> @@ -215,10 +215,12 @@ static void *memprof_allocf(void *ud, void *ptr, size_t osize, size_t nsize)
>    struct lj_wbuf *out = &mp->out;
>    void *nptr;
>  
> -  lua_assert(MPS_PROFILE == mp->state);
> -  lua_assert(oalloc->allocf != memprof_allocf);
> -  lua_assert(oalloc->allocf != NULL);
> -  lua_assert(ud == oalloc->state);
> +  lj_assertX(MPS_PROFILE == mp->state, "bad memprof profile state");
> +  lj_assertX(oalloc->allocf != memprof_allocf,
> +	     "unexpected memprof old alloc function");
> +  lj_assertX(oalloc->allocf != NULL,
> +	     "uninitialized memprof old alloc function");
> +  lj_assertX(ud == oalloc->state, "bad old memprof profile state");
>  
>    nptr = oalloc->allocf(ud, ptr, osize, nsize);
>  
> @@ -252,10 +254,10 @@ int lj_memprof_start(struct lua_State *L, const struct lj_memprof_options *opt)
>    struct alloc *oalloc = &mp->orig_alloc;
>    const size_t ljm_header_len = sizeof(ljm_header) / sizeof(ljm_header[0]);
>  
> -  lua_assert(opt->writer != NULL);
> -  lua_assert(opt->on_stop != NULL);
> -  lua_assert(opt->buf != NULL);
> -  lua_assert(opt->len != 0);
> +  lj_assertL(opt->writer != NULL, "uninitialized memprof writer");
> +  lj_assertL(opt->on_stop != NULL, "uninitialized on stop memprof callback");
> +  lj_assertL(opt->buf != NULL, "uninitialized memprof writer buffer");
> +  lj_assertL(opt->len != 0, "bad memprof writer buffer length");
>  
>    if (mp->state != MPS_IDLE) {
>      /* Clean up resources. Ignore possible errors. */
> @@ -293,8 +295,9 @@ int lj_memprof_start(struct lua_State *L, const struct lj_memprof_options *opt)
>  
>    /* Override allocating function. */
>    oalloc->allocf = lua_getallocf(L, &oalloc->state);
> -  lua_assert(oalloc->allocf != NULL);
> -  lua_assert(oalloc->allocf != memprof_allocf);
> +  lj_assertL(oalloc->allocf != NULL, "uninitialized memprof old alloc function");
> +  lj_assertL(oalloc->allocf != memprof_allocf,
> +	     "unexpected memprof old alloc function");
>    lua_setallocf(L, memprof_allocf, oalloc->state);
>  
>    return PROFILE_SUCCESS;
> @@ -323,10 +326,12 @@ int lj_memprof_stop(struct lua_State *L)
>  
>    mp->state = MPS_IDLE;
>  
> -  lua_assert(mp->g != NULL);
> +  lj_assertL(mp->g != NULL, "uninitialized global state in memprof state");
>  
> -  lua_assert(memprof_allocf == lua_getallocf(L, NULL));
> -  lua_assert(oalloc->allocf != NULL);
> +  lj_assertL(memprof_allocf == lua_getallocf(L, NULL),
> +	     "bad current allocator function on memprof stop");
> +  lj_assertL(oalloc->allocf != NULL,
> +	     "uninitialized old alloc function on memprof stop");
>    lua_setallocf(L, oalloc->allocf, oalloc->state);
>  
>    if (LJ_UNLIKELY(lj_wbuf_test_flag(out, STREAM_STOP))) {
> diff --git a/src/lj_meta.c b/src/lj_meta.c
> index 7ef7a8e0..4cb1a261 100644
> --- a/src/lj_meta.c
> +++ b/src/lj_meta.c
> @@ -47,7 +47,7 @@ void lj_meta_init(lua_State *L)
>  cTValue *lj_meta_cache(GCtab *mt, MMS mm, GCstr *name)
>  {
>    cTValue *mo = lj_tab_getstr(mt, name);
> -  lua_assert(mm <= MM_FAST);
> +  lj_assertX(mm <= MM_FAST, "bad metamethod %d", mm);
>    if (!mo || tvisnil(mo)) {  /* No metamethod? */
>      mt->nomm |= (uint8_t)(1u<<mm);  /* Set negative cache flag. */
>      return NULL;
> @@ -363,7 +363,7 @@ TValue * LJ_FASTCALL lj_meta_equal_cd(lua_State *L, BCIns ins)
>    } else if (op == BC_ISEQN) {
>      o2 = &mref(curr_proto(L)->k, cTValue)[bc_d(ins)];
>    } else {
> -    lua_assert(op == BC_ISEQP);
> +    lj_assertL(op == BC_ISEQP, "bad bytecode op %d", op);
>      setpriV(&tv, ~bc_d(ins));
>      o2 = &tv;
>    }
> @@ -426,7 +426,7 @@ void lj_meta_istype(lua_State *L, BCReg ra, BCReg tp)
>  {
>    L->top = curr_topL(L);
>    ra++; tp--;
> -  lua_assert(LJ_DUALNUM || tp != ~LJ_TNUMX);  /* ISTYPE -> ISNUM broken. */
> +  lj_assertL(LJ_DUALNUM || tp != ~LJ_TNUMX, "bad type for ISTYPE");
>    if (LJ_DUALNUM && tp == ~LJ_TNUMX) lj_lib_checkint(L, ra);
>    else if (tp == ~LJ_TNUMX+1) lj_lib_checknum(L, ra);
>    else if (tp == ~LJ_TSTR) lj_lib_checkstr(L, ra);
> diff --git a/src/lj_obj.h b/src/lj_obj.h
> index bf95e1eb..fb21cba9 100644
> --- a/src/lj_obj.h
> +++ b/src/lj_obj.h
> @@ -735,6 +735,11 @@ struct lua_State {
>  #define curr_topL(L)		(L->base + curr_proto(L)->framesize)
>  #define curr_top(L)		(curr_funcisL(L) ? curr_topL(L) : L->top)
>  
> +#if defined(LUA_USE_ASSERT) || defined(LUA_USE_APICHECK)
> +LJ_FUNC_NORET void lj_assert_fail(global_State *g, const char *file, int line,
> +				  const char *func, const char *fmt, ...);
> +#endif
> +
>  /* -- GC object definition and conversions -------------------------------- */
>  
>  /* GC header for generic access to common fields of GC objects. */
> @@ -788,10 +793,6 @@ typedef union GCobj {
>  
>  /* -- TValue getters/setters ---------------------------------------------- */
>  
> -#ifdef LUA_USE_ASSERT
> -#include "lj_gc.h"
> -#endif
> -
>  /* Macros to test types. */
>  #if LJ_GC64
>  #define itype(o)	((uint32_t)((o)->it64 >> 47))
> @@ -863,8 +864,8 @@ static LJ_AINLINE void *lightudV(global_State *g, cTValue *o)
>    uint64_t u = o->u64;
>    uint64_t seg = lightudseg(u);
>    uint32_t *segmap = mref(g->gc.lightudseg, uint32_t);
> -  lua_assert(tvislightud(o));
> -  lua_assert(seg <= g->gc.lightudnum);
> +  lj_assertG(tvislightud(o), "lightuserdata expected");
> +  lj_assertG(seg <= g->gc.lightudnum, "bad lightuserdata segment %d", seg);
>    return (void *)(((uint64_t)segmap[seg] << 32) | lightudlo(u));
>  }
>  #else
> @@ -915,9 +916,19 @@ static LJ_AINLINE void setrawlightudV(TValue *o, void *p)
>    ((o)->u64 = (uint64_t)(void *)(f) - (uint64_t)lj_vm_asm_begin)
>  #endif
>  
> -#define tvchecklive(L, o) \
> -  UNUSED(L), lua_assert(!tvisgcv(o) || \
> -  ((~itype(o) == gcval(o)->gch.gct) && !isdead(G(L), gcval(o))))
> +static LJ_AINLINE void checklivetv(lua_State *L, TValue *o, const char *msg)
> +{
> +  UNUSED(L); UNUSED(o); UNUSED(msg);
> +#if LUA_USE_ASSERT
> +  if (tvisgcv(o)) {
> +    lj_assertL(~itype(o) == gcval(o)->gch.gct,
> +	       "mismatch of TValue type %d vs GC type %d",
> +	       ~itype(o), gcval(o)->gch.gct);
> +    /* Copy of isdead check from lj_gc.h to avoid circular include. */
> +    lj_assertL(!(gcval(o)->gch.marked & (G(L)->gc.currentwhite ^ 3) & 3), msg);
> +  }
> +#endif
> +}
>  
>  static LJ_AINLINE void setgcVraw(TValue *o, GCobj *v, uint32_t itype)
>  {
> @@ -930,7 +941,8 @@ static LJ_AINLINE void setgcVraw(TValue *o, GCobj *v, uint32_t itype)
>  
>  static LJ_AINLINE void setgcV(lua_State *L, TValue *o, GCobj *v, uint32_t it)
>  {
> -  setgcVraw(o, v, it); tvchecklive(L, o);
> +  setgcVraw(o, v, it);
> +  checklivetv(L, o, "store to dead GC object");
>  }
>  
>  #define define_setV(name, type, tag) \
> @@ -977,7 +989,8 @@ static LJ_AINLINE void setint64V(TValue *o, int64_t i)
>  /* Copy tagged values. */
>  static LJ_AINLINE void copyTV(lua_State *L, TValue *o1, const TValue *o2)
>  {
> -  *o1 = *o2; tvchecklive(L, o1);
> +  *o1 = *o2;
> +  checklivetv(L, o1, "copy of dead GC object");
>  }
>  
>  /* -- Number to integer conversion ---------------------------------------- */
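
Side note on `checklivetv()` above: the open-coded mark test is, as the patch
comment says, a copy of the `isdead()` check from <src/lj_gc.h>, inlined to
avoid a circular include. For comparison, a sketch of the lj_gc.h definitions
as I recall them:

  #define otherwhite(g)  (g->gc.currentwhite ^ LJ_GC_WHITES)
  #define isdead(g, v)   ((v)->gch.marked & otherwhite(g) & LJ_GC_WHITES)

with LJ_GC_WHITES == 3, which matches the literal `3` masks used inline.
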
> diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
> index cd803d87..0007107b 100644
> --- a/src/lj_opt_fold.c
> +++ b/src/lj_opt_fold.c
> @@ -282,7 +282,7 @@ static int32_t kfold_intop(int32_t k1, int32_t k2, IROp op)
>    case IR_BROR: k1 = (int32_t)lj_ror((uint32_t)k1, (k2 & 31)); break;
>    case IR_MIN: k1 = k1 < k2 ? k1 : k2; break;
>    case IR_MAX: k1 = k1 > k2 ? k1 : k2; break;
> -  default: lua_assert(0); break;
> +  default: lj_assertX(0, "bad IR op %d", op); break;
>    }
>    return k1;
>  }
> @@ -354,7 +354,7 @@ LJFOLDF(kfold_intcomp)
>    case IR_ULE: return CONDFOLD((uint32_t)a <= (uint32_t)b);
>    case IR_ABC:
>    case IR_UGT: return CONDFOLD((uint32_t)a > (uint32_t)b);
> -  default: lua_assert(0); return FAILFOLD;
> +  default: lj_assertJ(0, "bad IR op %d", fins->o); return FAILFOLD;
>    }
>  }
>  
> @@ -368,10 +368,12 @@ LJFOLDF(kfold_intcomp0)
>  
>  /* -- Constant folding for 64 bit integers -------------------------------- */
>  
> -static uint64_t kfold_int64arith(uint64_t k1, uint64_t k2, IROp op)
> +static uint64_t kfold_int64arith(jit_State *J, uint64_t k1, uint64_t k2,
> +				 IROp op)
>  {
> -  switch (op) {
> +  UNUSED(J);
>  #if LJ_HASFFI
> +  switch (op) {
>    case IR_ADD: k1 += k2; break;
>    case IR_SUB: k1 -= k2; break;
>    case IR_MUL: k1 *= k2; break;
> @@ -383,9 +385,12 @@ static uint64_t kfold_int64arith(uint64_t k1, uint64_t k2, IROp op)
>    case IR_BSAR: k1 >>= (k2 & 63); break;
>    case IR_BROL: k1 = (int32_t)lj_rol((uint32_t)k1, (k2 & 63)); break;
>    case IR_BROR: k1 = (int32_t)lj_ror((uint32_t)k1, (k2 & 63)); break;
> -#endif
> -  default: UNUSED(k2); lua_assert(0); break;
> +  default: lj_assertJ(0, "bad IR op %d", op); break;
>    }
> +#else
> +  UNUSED(k2); UNUSED(op);
> +  lj_assertJ(0, "FFI IR op without FFI");
> +#endif
>    return k1;
>  }
>  
> @@ -397,7 +402,7 @@ LJFOLD(BOR KINT64 KINT64)
>  LJFOLD(BXOR KINT64 KINT64)
>  LJFOLDF(kfold_int64arith)
>  {
> -  return INT64FOLD(kfold_int64arith(ir_k64(fleft)->u64,
> +  return INT64FOLD(kfold_int64arith(J, ir_k64(fleft)->u64,
>  				    ir_k64(fright)->u64, (IROp)fins->o));
>  }
>  
> @@ -419,7 +424,7 @@ LJFOLDF(kfold_int64arith2)
>    }
>    return INT64FOLD(k1);
>  #else
> -  UNUSED(J); lua_assert(0); return FAILFOLD;
> +  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
>  #endif
>  }
>  
> @@ -435,7 +440,7 @@ LJFOLDF(kfold_int64shift)
>    int32_t sh = (fright->i & 63);
>    return INT64FOLD(lj_carith_shift64(k, sh, fins->o - IR_BSHL));
>  #else
> -  UNUSED(J); lua_assert(0); return FAILFOLD;
> +  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
>  #endif
>  }
>  
> @@ -445,7 +450,7 @@ LJFOLDF(kfold_bnot64)
>  #if LJ_HASFFI
>    return INT64FOLD(~ir_k64(fleft)->u64);
>  #else
> -  UNUSED(J); lua_assert(0); return FAILFOLD;
> +  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
>  #endif
>  }
>  
> @@ -455,7 +460,7 @@ LJFOLDF(kfold_bswap64)
>  #if LJ_HASFFI
>    return INT64FOLD(lj_bswap64(ir_k64(fleft)->u64));
>  #else
> -  UNUSED(J); lua_assert(0); return FAILFOLD;
> +  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
>  #endif
>  }
>  
> @@ -480,10 +485,10 @@ LJFOLDF(kfold_int64comp)
>    case IR_UGE: return CONDFOLD(a >= b);
>    case IR_ULE: return CONDFOLD(a <= b);
>    case IR_UGT: return CONDFOLD(a > b);
> -  default: lua_assert(0); return FAILFOLD;
> +  default: lj_assertJ(0, "bad IR op %d", fins->o); return FAILFOLD;
>    }
>  #else
> -  UNUSED(J); lua_assert(0); return FAILFOLD;
> +  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
>  #endif
>  }
>  
> @@ -495,7 +500,7 @@ LJFOLDF(kfold_int64comp0)
>      return DROPFOLD;
>    return NEXTFOLD;
>  #else
> -  UNUSED(J); lua_assert(0); return FAILFOLD;
> +  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
>  #endif
>  }
>  
> @@ -520,7 +525,7 @@ LJFOLD(STRREF KGC KINT)
>  LJFOLDF(kfold_strref)
>  {
>    GCstr *str = ir_kstr(fleft);
> -  lua_assert((MSize)fright->i <= str->len);
> +  lj_assertJ((MSize)fright->i <= str->len, "bad string ref");
>    return lj_ir_kkptr(J, (char *)strdata(str) + fright->i);
>  }
>  
> @@ -616,8 +621,9 @@ LJFOLDF(bufput_kgc)
>  LJFOLD(BUFSTR any any)
>  LJFOLDF(bufstr_kfold_cse)
>  {
> -  lua_assert(fleft->o == IR_BUFHDR || fleft->o == IR_BUFPUT ||
> -	     fleft->o == IR_CALLL);
> +  lj_assertJ(fleft->o == IR_BUFHDR || fleft->o == IR_BUFPUT ||
> +	     fleft->o == IR_CALLL,
> +	     "bad buffer constructor IR op %d", fleft->o);
>    if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD)) {
>      if (fleft->o == IR_BUFHDR) {  /* No put operations? */
>        if (!(fleft->op2 & IRBUFHDR_APPEND))  /* Empty buffer? */
> @@ -637,8 +643,9 @@ LJFOLDF(bufstr_kfold_cse)
>      while (ref) {
>        IRIns *irs = IR(ref), *ira = fleft, *irb = IR(irs->op1);
>        while (ira->o == irb->o && ira->op2 == irb->op2) {
> -	lua_assert(ira->o == IR_BUFHDR || ira->o == IR_BUFPUT ||
> -		   ira->o == IR_CALLL || ira->o == IR_CARG);
> +	lj_assertJ(ira->o == IR_BUFHDR || ira->o == IR_BUFPUT ||
> +		   ira->o == IR_CALLL || ira->o == IR_CARG,
> +		   "bad buffer constructor IR op %d", ira->o);
>  	if (ira->o == IR_BUFHDR && !(ira->op2 & IRBUFHDR_APPEND))
>  	  return ref;  /* CSE succeeded. */
>  	if (ira->o == IR_CALLL && ira->op2 == IRCALL_lj_buf_puttab)
> @@ -697,7 +704,7 @@ LJFOLD(CALLL CARG IRCALL_lj_strfmt_putfchar)
>  LJFOLDF(bufput_kfold_fmt)
>  {
>    IRIns *irc = IR(fleft->op1);
> -  lua_assert(irref_isk(irc->op2));  /* SFormat must be const. */
> +  lj_assertJ(irref_isk(irc->op2), "SFormat must be const");
>    if (irref_isk(fleft->op2)) {
>      SFormat sf = (SFormat)IR(irc->op2)->i;
>      IRIns *ira = IR(fleft->op2);
> @@ -1216,10 +1223,10 @@ LJFOLDF(simplify_tobit_conv)
>  {
>    /* Fold even across PHI to avoid expensive num->int conversions in loop. */
>    if ((fleft->op2 & IRCONV_SRCMASK) == IRT_INT) {
> -    lua_assert(irt_isnum(fleft->t));
> +    lj_assertJ(irt_isnum(fleft->t), "expected TOBIT number arg");
>      return fleft->op1;
>    } else if ((fleft->op2 & IRCONV_SRCMASK) == IRT_U32) {
> -    lua_assert(irt_isnum(fleft->t));
> +    lj_assertJ(irt_isnum(fleft->t), "expected TOBIT number arg");
>      fins->o = IR_CONV;
>      fins->op1 = fleft->op1;
>      fins->op2 = (IRT_INT<<5)|IRT_U32;
> @@ -1259,7 +1266,7 @@ LJFOLDF(simplify_conv_sext)
>    /* Use scalar evolution analysis results to strength-reduce sign-extension. */
>    if (ref == J->scev.idx) {
>      IRRef lo = J->scev.dir ? J->scev.start : J->scev.stop;
> -    lua_assert(irt_isint(J->scev.t));
> +    lj_assertJ(irt_isint(J->scev.t), "only int SCEV supported");
>      if (lo && IR(lo)->o == IR_KINT && IR(lo)->i + ofs >= 0) {
>      ok_reduce:
>  #if LJ_TARGET_X64
> @@ -1335,7 +1342,8 @@ LJFOLDF(narrow_convert)
>    /* Narrowing ignores PHIs and repeating it inside the loop is not useful. */
>    if (J->chain[IR_LOOP])
>      return NEXTFOLD;
> -  lua_assert(fins->o != IR_CONV || (fins->op2&IRCONV_CONVMASK) != IRCONV_TOBIT);
> +  lj_assertJ(fins->o != IR_CONV || (fins->op2&IRCONV_CONVMASK) != IRCONV_TOBIT,
> +	     "unexpected CONV TOBIT");
>    return lj_opt_narrow_convert(J);
>  }
>  
> @@ -1441,7 +1449,7 @@ LJFOLDF(simplify_intmul_k64)
>      return simplify_intmul_k(J, (int32_t)ir_kint64(fright)->u64);
>    return NEXTFOLD;
>  #else
> -  UNUSED(J); lua_assert(0); return FAILFOLD;
> +  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
>  #endif
>  }
>  
> @@ -1449,7 +1457,7 @@ LJFOLD(MOD any KINT)
>  LJFOLDF(simplify_intmod_k)
>  {
>    int32_t k = fright->i;
> -  lua_assert(k != 0);
> +  lj_assertJ(k != 0, "integer mod 0");
>    if (k > 0 && (k & (k-1)) == 0) {  /* i % (2^k) ==> i & (2^k-1) */
>      fins->o = IR_BAND;
>      fins->op2 = lj_ir_kint(J, k-1);
> @@ -1699,7 +1707,8 @@ LJFOLDF(simplify_shiftk_andk)
>      fins->ot = IRTI(IR_BAND);
>      return RETRYFOLD;
>    } else if (irk->o == IR_KINT64) {
> -    uint64_t k = kfold_int64arith(ir_k64(irk)->u64, fright->i, (IROp)fins->o);
> +    uint64_t k = kfold_int64arith(J, ir_k64(irk)->u64, fright->i,
> +				  (IROp)fins->o);
>      IROpT ot = fleft->ot;
>      fins->op1 = fleft->op1;
>      fins->op1 = (IRRef1)lj_opt_fold(J);
> @@ -1747,8 +1756,8 @@ LJFOLDF(simplify_andor_k64)
>    IRIns *irk = IR(fleft->op2);
>    PHIBARRIER(fleft);
>    if (irk->o == IR_KINT64) {
> -    uint64_t k = kfold_int64arith(ir_k64(irk)->u64,
> -				  ir_k64(fright)->u64, (IROp)fins->o);
> +    uint64_t k = kfold_int64arith(J, ir_k64(irk)->u64, ir_k64(fright)->u64,
> +				  (IROp)fins->o);
>      /* (i | k1) & k2 ==> i & k2, if (k1 & k2) == 0. */
>      /* (i & k1) | k2 ==> i | k2, if (k1 | k2) == -1. */
>      if (k == (fins->o == IR_BAND ? (uint64_t)0 : ~(uint64_t)0)) {
> @@ -1758,7 +1767,7 @@ LJFOLDF(simplify_andor_k64)
>    }
>    return NEXTFOLD;
>  #else
> -  UNUSED(J); lua_assert(0); return FAILFOLD;
> +  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
>  #endif
>  }
>  
> @@ -1794,8 +1803,8 @@ LJFOLDF(reassoc_intarith_k64)
>  #if LJ_HASFFI
>    IRIns *irk = IR(fleft->op2);
>    if (irk->o == IR_KINT64) {
> -    uint64_t k = kfold_int64arith(ir_k64(irk)->u64,
> -				  ir_k64(fright)->u64, (IROp)fins->o);
> +    uint64_t k = kfold_int64arith(J, ir_k64(irk)->u64, ir_k64(fright)->u64,
> +				  (IROp)fins->o);
>      PHIBARRIER(fleft);
>      fins->op1 = fleft->op1;
>      fins->op2 = (IRRef1)lj_ir_kint64(J, k);
> @@ -1803,7 +1812,7 @@ LJFOLDF(reassoc_intarith_k64)
>    }
>    return NEXTFOLD;
>  #else
> -  UNUSED(J); lua_assert(0); return FAILFOLD;
> +  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
>  #endif
>  }
>  
> @@ -2058,7 +2067,7 @@ LJFOLDF(merge_eqne_snew_kgc)
>  {
>    GCstr *kstr = ir_kstr(fright);
>    int32_t len = (int32_t)kstr->len;
> -  lua_assert(irt_isstr(fins->t));
> +  lj_assertJ(irt_isstr(fins->t), "bad equality IR type");
>  
>  #if LJ_TARGET_UNALIGNED
>  #define FOLD_SNEW_MAX_LEN	4  /* Handle string lengths 0, 1, 2, 3, 4. */
> @@ -2122,7 +2131,7 @@ LJFOLD(HLOAD KKPTR)
>  LJFOLDF(kfold_hload_kkptr)
>  {
>    UNUSED(J);
> -  lua_assert(ir_kptr(fleft) == niltvg(J2G(J)));
> +  lj_assertJ(ir_kptr(fleft) == niltvg(J2G(J)), "expected niltv");
>    return TREF_NIL;
>  }
>  
> @@ -2333,7 +2342,7 @@ LJFOLDF(fwd_sload)
>      TRef tr = lj_opt_cse(J);
>      return tref_ref(tr) < J->chain[IR_RETF] ? EMITFOLD : tr;
>    } else {
> -    lua_assert(J->slot[fins->op1] != 0);
> +    lj_assertJ(J->slot[fins->op1] != 0, "uninitialized slot accessed");
>      return J->slot[fins->op1];
>    }
>  }
> @@ -2448,8 +2457,9 @@ TRef LJ_FASTCALL lj_opt_fold(jit_State *J)
>    IRRef ref;
>  
>    if (LJ_UNLIKELY((J->flags & JIT_F_OPT_MASK) != JIT_F_OPT_DEFAULT)) {
> -    lua_assert(((JIT_F_OPT_FOLD|JIT_F_OPT_FWD|JIT_F_OPT_CSE|JIT_F_OPT_DSE) |
> -		JIT_F_OPT_DEFAULT) == JIT_F_OPT_DEFAULT);
> +    lj_assertJ(((JIT_F_OPT_FOLD|JIT_F_OPT_FWD|JIT_F_OPT_CSE|JIT_F_OPT_DSE) |
> +		JIT_F_OPT_DEFAULT) == JIT_F_OPT_DEFAULT,
> +	       "bad JIT_F_OPT_DEFAULT");
>      /* Folding disabled? Chain to CSE, but not for loads/stores/allocs. */
>      if (!(J->flags & JIT_F_OPT_FOLD) && irm_kind(lj_ir_mode[fins->o]) == IRM_N)
>        return lj_opt_cse(J);
> @@ -2511,7 +2521,7 @@ retry:
>      return lj_ir_kint(J, fins->i);
>    if (ref == FAILFOLD)
>      lj_trace_err(J, LJ_TRERR_GFAIL);
> -  lua_assert(ref == DROPFOLD);
> +  lj_assertJ(ref == DROPFOLD, "bad fold result");
>    return REF_DROP;
>  }
>  
> diff --git a/src/lj_opt_loop.c b/src/lj_opt_loop.c
> index 10613641..d3b0fcee 100644
> --- a/src/lj_opt_loop.c
> +++ b/src/lj_opt_loop.c
> @@ -300,7 +300,8 @@ static void loop_unroll(LoopState *lps)
>    loopmap = &J->cur.snapmap[loopsnap->mapofs];
>    /* The PC of snapshot #0 and the loop snapshot must match. */
>    psentinel = &loopmap[loopsnap->nent];
> -  lua_assert(*psentinel == J->cur.snapmap[J->cur.snap[0].nent]);
> +  lj_assertJ(*psentinel == J->cur.snapmap[J->cur.snap[0].nent],
> +	     "mismatched PC for loop snapshot");
>    *psentinel = SNAP(255, 0, 0);  /* Replace PC with temporary sentinel. */
>  
>    /* Start substitution with snapshot #1 (#0 is empty for root traces). */
> @@ -371,7 +372,7 @@ static void loop_unroll(LoopState *lps)
>    }
>    if (!irt_isguard(J->guardemit))  /* Drop redundant snapshot. */
>      J->cur.nsnapmap = (uint32_t)J->cur.snap[--J->cur.nsnap].mapofs;
> -  lua_assert(J->cur.nsnapmap <= J->sizesnapmap);
> +  lj_assertJ(J->cur.nsnapmap <= J->sizesnapmap, "bad snapshot map index");
>    *psentinel = J->cur.snapmap[J->cur.snap[0].nent];  /* Restore PC. */
>  
>    loop_emit_phi(J, subst, phi, nphi, onsnap);
> diff --git a/src/lj_opt_mem.c b/src/lj_opt_mem.c
> index c8265b4f..59fddbdd 100644
> --- a/src/lj_opt_mem.c
> +++ b/src/lj_opt_mem.c
> @@ -18,6 +18,7 @@
>  #include "lj_jit.h"
>  #include "lj_iropt.h"
>  #include "lj_ircall.h"
> +#include "lj_dispatch.h"
>  
>  /* Some local macros to save typing. Undef'd at the end. */
>  #define IR(ref)		(&J->cur.ir[(ref)])
> @@ -56,8 +57,8 @@ static AliasRet aa_table(jit_State *J, IRRef ta, IRRef tb)
>  {
>    IRIns *taba = IR(ta), *tabb = IR(tb);
>    int newa, newb;
> -  lua_assert(ta != tb);
> -  lua_assert(irt_istab(taba->t) && irt_istab(tabb->t));
> +  lj_assertJ(ta != tb, "bad usage");
> +  lj_assertJ(irt_istab(taba->t) && irt_istab(tabb->t), "bad usage");
>    /* Disambiguate new allocations. */
>    newa = (taba->o == IR_TNEW || taba->o == IR_TDUP);
>    newb = (tabb->o == IR_TNEW || tabb->o == IR_TDUP);
> @@ -99,7 +100,7 @@ static AliasRet aa_ahref(jit_State *J, IRIns *refa, IRIns *refb)
>      /* Disambiguate array references based on index arithmetic. */
>      int32_t ofsa = 0, ofsb = 0;
>      IRRef basea = ka, baseb = kb;
> -    lua_assert(refb->o == IR_AREF);
> +    lj_assertJ(refb->o == IR_AREF, "expected AREF");
>      /* Gather base and offset from t[base] or t[base+-ofs]. */
>      if (keya->o == IR_ADD && irref_isk(keya->op2)) {
>        basea = keya->op1;
> @@ -117,8 +118,9 @@ static AliasRet aa_ahref(jit_State *J, IRIns *refa, IRIns *refb)
>        return ALIAS_NO;  /* t[base+-o1] vs. t[base+-o2] and o1 != o2. */
>    } else {
>      /* Disambiguate hash references based on the type of their keys. */
> -    lua_assert((refa->o==IR_HREF || refa->o==IR_HREFK || refa->o==IR_NEWREF) &&
> -	       (refb->o==IR_HREF || refb->o==IR_HREFK || refb->o==IR_NEWREF));
> +    lj_assertJ((refa->o==IR_HREF || refa->o==IR_HREFK || refa->o==IR_NEWREF) &&
> +	       (refb->o==IR_HREF || refb->o==IR_HREFK || refb->o==IR_NEWREF),
> +	       "bad xREF IR op %d or %d", refa->o, refb->o);
>      if (!irt_sametype(keya->t, keyb->t))
>        return ALIAS_NO;  /* Different key types. */
>    }
> @@ -192,7 +194,8 @@ static TRef fwd_ahload(jit_State *J, IRRef xref)
>  	if (key->o == IR_KSLOT) key = IR(key->op1);
>  	lj_ir_kvalue(J->L, &keyv, key);
>  	tv = lj_tab_get(J->L, ir_ktab(IR(ir->op1)), &keyv);
> -	lua_assert(itype2irt(tv) == irt_type(fins->t));
> +	lj_assertJ(itype2irt(tv) == irt_type(fins->t),
> +		   "mismatched type in constant table");
>  	if (irt_isnum(fins->t))
>  	  return lj_ir_knum_u64(J, tv->u64);
>  	else if (LJ_DUALNUM && irt_isint(fins->t))
> diff --git a/src/lj_opt_narrow.c b/src/lj_opt_narrow.c
> index 4f285334..2cfb775b 100644
> --- a/src/lj_opt_narrow.c
> +++ b/src/lj_opt_narrow.c
> @@ -372,17 +372,17 @@ static IRRef narrow_conv_emit(jit_State *J, NarrowConv *nc)
>      } else if (op == NARROW_CONV) {
>        *sp++ = emitir_raw(convot, ref, convop2);  /* Raw emit avoids a loop. */
>      } else if (op == NARROW_SEXT) {
> -      lua_assert(sp >= nc->stack+1);
> +      lj_assertJ(sp >= nc->stack+1, "stack underflow");
>        sp[-1] = emitir(IRT(IR_CONV, IRT_I64), sp[-1],
>  		      (IRT_I64<<5)|IRT_INT|IRCONV_SEXT);
>      } else if (op == NARROW_INT) {
> -      lua_assert(next < last);
> +      lj_assertJ(next < last, "missing arg to NARROW_INT");
>        *sp++ = nc->t == IRT_I64 ?
>  	      lj_ir_kint64(J, (int64_t)(int32_t)*next++) :
>  	      lj_ir_kint(J, *next++);
>      } else {  /* Regular IROpT. Pops two operands and pushes one result. */
>        IRRef mode = nc->mode;
> -      lua_assert(sp >= nc->stack+2);
> +      lj_assertJ(sp >= nc->stack+2, "stack underflow");
>        sp--;
>        /* Omit some overflow checks for array indexing. See comments above. */
>        if ((mode & IRCONV_CONVMASK) == IRCONV_INDEX) {
> @@ -398,7 +398,7 @@ static IRRef narrow_conv_emit(jit_State *J, NarrowConv *nc)
>  	narrow_bpc_set(J, narrow_ref(ref), narrow_ref(sp[-1]), mode);
>      }
>    }
> -  lua_assert(sp == nc->stack+1);
> +  lj_assertJ(sp == nc->stack+1, "stack misalignment");
>    return nc->stack[0];
>  }
>  
> @@ -452,7 +452,7 @@ static TRef narrow_stripov(jit_State *J, TRef tr, int lastop, IRRef mode)
>  TRef LJ_FASTCALL lj_opt_narrow_index(jit_State *J, TRef tr)
>  {
>    IRIns *ir;
> -  lua_assert(tref_isnumber(tr));
> +  lj_assertJ(tref_isnumber(tr), "expected number type");
>    if (tref_isnum(tr))  /* Conversion may be narrowed, too. See above. */
>      return emitir(IRTGI(IR_CONV), tr, IRCONV_INT_NUM|IRCONV_INDEX);
>    /* Omit some overflow checks for array indexing. See comments above. */
> @@ -499,7 +499,7 @@ TRef LJ_FASTCALL lj_opt_narrow_tobit(jit_State *J, TRef tr)
>  /* Narrow C array index (overflow undefined). */
>  TRef LJ_FASTCALL lj_opt_narrow_cindex(jit_State *J, TRef tr)
>  {
> -  lua_assert(tref_isnumber(tr));
> +  lj_assertJ(tref_isnumber(tr), "expected number type");
>    if (tref_isnum(tr))
>      return emitir(IRT(IR_CONV, IRT_INTP), tr, (IRT_INTP<<5)|IRT_NUM|IRCONV_ANY);
>    /* Undefined overflow semantics allow stripping of ADDOV, SUBOV and MULOV. */
> @@ -627,9 +627,10 @@ static int narrow_forl(jit_State *J, cTValue *o)
>  /* Narrow the FORL index type by looking at the runtime values. */
>  IRType lj_opt_narrow_forl(jit_State *J, cTValue *tv)
>  {
> -  lua_assert(tvisnumber(&tv[FORL_IDX]) &&
> +  lj_assertJ(tvisnumber(&tv[FORL_IDX]) &&
>  	     tvisnumber(&tv[FORL_STOP]) &&
> -	     tvisnumber(&tv[FORL_STEP]));
> +	     tvisnumber(&tv[FORL_STEP]),
> +	     "expected number types");
>    /* Narrow only if the runtime values of start/stop/step are all integers. */
>    if (narrow_forl(J, &tv[FORL_IDX]) &&
>        narrow_forl(J, &tv[FORL_STOP]) &&
> diff --git a/src/lj_opt_split.c b/src/lj_opt_split.c
> index c10a85cb..a619d852 100644
> --- a/src/lj_opt_split.c
> +++ b/src/lj_opt_split.c
> @@ -235,7 +235,7 @@ static IRRef split_bitshift(jit_State *J, IRRef1 *hisubst,
>  	return split_emit(J, IRTI(IR_BOR), t1, t2);
>        } else {
>  	IRRef t1 = ir->prev, t2;
> -	lua_assert(op == IR_BSHR || op == IR_BSAR);
> +	lj_assertJ(op == IR_BSHR || op == IR_BSAR, "bad usage");
>  	nir->o = IR_BSHR;
>  	t2 = split_emit(J, IRTI(IR_BSHL), hi, lj_ir_kint(J, (-k&31)));
>  	ir->prev = split_emit(J, IRTI(IR_BOR), t1, t2);
> @@ -250,7 +250,7 @@ static IRRef split_bitshift(jit_State *J, IRRef1 *hisubst,
>  	ir->prev = lj_ir_kint(J, 0);
>  	return lo;
>        } else {
> -	lua_assert(op == IR_BSHR || op == IR_BSAR);
> +	lj_assertJ(op == IR_BSHR || op == IR_BSAR, "bad usage");
>  	if (k == 32) {
>  	  J->cur.nins--;
>  	  ir->prev = hi;
> @@ -429,7 +429,7 @@ static void split_ir(jit_State *J)
>  	hi = split_emit(J, IRT(IR_HIOP, IRT_SOFTFP), nref, nref);
>  	break;
>        case IR_FLOAD:
> -	lua_assert(ir->op1 == REF_NIL);
> +	lj_assertJ(ir->op1 == REF_NIL, "expected FLOAD from GG_State");
>  	hi = lj_ir_kint(J, *(int32_t*)((char*)J2GG(J) + ir->op2 + LJ_LE*4));
>  	nir->op2 += LJ_BE*4;
>  	break;
> @@ -465,8 +465,9 @@ static void split_ir(jit_State *J)
>  	  break;
>  	}
>  #endif
> -	lua_assert(st == IRT_INT ||
> -		   (LJ_32 && LJ_HASFFI && (st == IRT_U32 || st == IRT_FLOAT)));
> +	lj_assertJ(st == IRT_INT ||
> +		   (LJ_32 && LJ_HASFFI && (st == IRT_U32 || st == IRT_FLOAT)),
> +		   "bad source type for CONV");
>  	nir->o = IR_CALLN;
>  #if LJ_32 && LJ_HASFFI
>  	nir->op2 = st == IRT_INT ? IRCALL_softfp_i2d :
> @@ -496,7 +497,8 @@ static void split_ir(jit_State *J)
>  	hi = nir->op2;
>  	break;
>        default:
> -	lua_assert(ir->o <= IR_NE || ir->o == IR_MIN || ir->o == IR_MAX);
> +	lj_assertJ(ir->o <= IR_NE || ir->o == IR_MIN || ir->o == IR_MAX,
> +		   "bad IR op %d", ir->o);
>  	hi = split_emit(J, IRTG(IR_HIOP, IRT_SOFTFP),
>  			hisubst[ir->op1], hisubst[ir->op2]);
>  	break;
> @@ -553,7 +555,7 @@ static void split_ir(jit_State *J)
>  	hi = split_bitshift(J, hisubst, oir, nir, ir);
>  	break;
>        case IR_FLOAD:
> -	lua_assert(ir->op2 == IRFL_CDATA_INT64);
> +	lj_assertJ(ir->op2 == IRFL_CDATA_INT64, "only INT64 supported");
>  	hi = split_emit(J, IRTI(IR_FLOAD), nir->op1, IRFL_CDATA_INT64_4);
>  #if LJ_BE
>  	ir->prev = hi; hi = nref;
> @@ -619,7 +621,7 @@ static void split_ir(jit_State *J)
>  	hi = nir->op2;
>  	break;
>        default:
> -	lua_assert(ir->o <= IR_NE);  /* Comparisons. */
> +	lj_assertJ(ir->o <= IR_NE, "bad IR op %d", ir->o);  /* Comparisons. */
>  	split_emit(J, IRTGI(IR_HIOP), hiref, hisubst[ir->op2]);
>  	break;
>        }
> @@ -697,7 +699,7 @@ static void split_ir(jit_State *J)
>  #if LJ_SOFTFP
>        if (st == IRT_NUM || (LJ_32 && LJ_HASFFI && st == IRT_FLOAT)) {
>  	if (irt_isguard(ir->t)) {
> -	  lua_assert(st == IRT_NUM && irt_isint(ir->t));
> +	  lj_assertJ(st == IRT_NUM && irt_isint(ir->t), "bad CONV types");
>  	  J->cur.nins--;
>  	  ir->prev = split_num2int(J, nir->op1, hisubst[ir->op1], 1);
>  	} else {
> @@ -828,7 +830,7 @@ void lj_opt_split(jit_State *J)
>    if (!J->needsplit)
>      J->needsplit = split_needsplit(J);
>  #else
> -  lua_assert(J->needsplit >= split_needsplit(J));  /* Verify flag. */
> +  lj_assertJ(J->needsplit >= split_needsplit(J), "bad SPLIT state");
>  #endif
>    if (J->needsplit) {
>      int errcode = lj_vm_cpcall(J->L, NULL, J, cpsplit);
> diff --git a/src/lj_parse.c b/src/lj_parse.c
> index e238afa3..3f6caaec 100644
> --- a/src/lj_parse.c
> +++ b/src/lj_parse.c
> @@ -169,6 +169,12 @@ LJ_STATIC_ASSERT((int)BC_MULVV-(int)BC_ADDVV == (int)OPR_MUL-(int)OPR_ADD);
>  LJ_STATIC_ASSERT((int)BC_DIVVV-(int)BC_ADDVV == (int)OPR_DIV-(int)OPR_ADD);
>  LJ_STATIC_ASSERT((int)BC_MODVV-(int)BC_ADDVV == (int)OPR_MOD-(int)OPR_ADD);
>  
> +#ifdef LUA_USE_ASSERT
> +#define lj_assertFS(c, ...)	(lj_assertG_(G(fs->L), (c), __VA_ARGS__))
> +#else
> +#define lj_assertFS(c, ...)	((void)fs)
> +#endif
> +
>  /* -- Error handling ------------------------------------------------------ */
>  
>  LJ_NORET LJ_NOINLINE static void err_syntax(LexState *ls, ErrMsg em)
> @@ -206,7 +212,7 @@ static BCReg const_num(FuncState *fs, ExpDesc *e)
>  {
>    lua_State *L = fs->L;
>    TValue *o;
> -  lua_assert(expr_isnumk(e));
> +  lj_assertFS(expr_isnumk(e), "bad usage");
>    o = lj_tab_set(L, fs->kt, &e->u.nval);
>    if (tvhaskslot(o))
>      return tvkslot(o);
> @@ -231,7 +237,7 @@ static BCReg const_gc(FuncState *fs, GCobj *gc, uint32_t itype)
>  /* Add a string constant. */
>  static BCReg const_str(FuncState *fs, ExpDesc *e)
>  {
> -  lua_assert(expr_isstrk(e) || e->k == VGLOBAL);
> +  lj_assertFS(expr_isstrk(e) || e->k == VGLOBAL, "bad usage");
>    return const_gc(fs, obj2gco(e->u.sval), LJ_TSTR);
>  }
>  
> @@ -319,7 +325,7 @@ static void jmp_patchins(FuncState *fs, BCPos pc, BCPos dest)
>  {
>    BCIns *jmp = &fs->bcbase[pc].ins;
>    BCPos offset = dest-(pc+1)+BCBIAS_J;
> -  lua_assert(dest != NO_JMP);
> +  lj_assertFS(dest != NO_JMP, "uninitialized jump target");
>    if (offset > BCMAX_D)
>      err_syntax(fs->ls, LJ_ERR_XJUMP);
>    setbc_d(jmp, offset);
> @@ -368,7 +374,7 @@ static void jmp_patch(FuncState *fs, BCPos list, BCPos target)
>    if (target == fs->pc) {
>      jmp_tohere(fs, list);
>    } else {
> -    lua_assert(target < fs->pc);
> +    lj_assertFS(target < fs->pc, "bad jump target");
>      jmp_patchval(fs, list, target, NO_REG, target);
>    }
>  }
> @@ -398,7 +404,7 @@ static void bcreg_free(FuncState *fs, BCReg reg)
>  {
>    if (reg >= fs->nactvar) {
>      fs->freereg--;
> -    lua_assert(reg == fs->freereg);
> +    lj_assertFS(reg == fs->freereg, "bad regfree");
>    }
>  }
>  
> @@ -548,7 +554,7 @@ static void expr_toreg_nobranch(FuncState *fs, ExpDesc *e, BCReg reg)
>    } else if (e->k <= VKTRUE) {
>      ins = BCINS_AD(BC_KPRI, reg, const_pri(e));
>    } else {
> -    lua_assert(e->k == VVOID || e->k == VJMP);
> +    lj_assertFS(e->k == VVOID || e->k == VJMP, "bad expr type %d", e->k);
>      return;
>    }
>    bcemit_INS(fs, ins);
> @@ -643,7 +649,7 @@ static void bcemit_store(FuncState *fs, ExpDesc *var, ExpDesc *e)
>      ins = BCINS_AD(BC_GSET, ra, const_str(fs, var));
>    } else {
>      BCReg ra, rc;
> -    lua_assert(var->k == VINDEXED);
> +    lj_assertFS(var->k == VINDEXED, "bad expr type %d", var->k);
>      ra = expr_toanyreg(fs, e);
>      rc = var->u.s.aux;
>      if ((int32_t)rc < 0) {
> @@ -651,10 +657,12 @@ static void bcemit_store(FuncState *fs, ExpDesc *var, ExpDesc *e)
>      } else if (rc > BCMAX_C) {
>        ins = BCINS_ABC(BC_TSETB, ra, var->u.s.info, rc-(BCMAX_C+1));
>      } else {
> +#ifdef LUA_USE_ASSERT
>        /* Free late alloced key reg to avoid assert on free of value reg. */
>        /* This can only happen when called from expr_table(). */
> -      lua_assert(e->k != VNONRELOC || ra < fs->nactvar ||
> -		 rc < ra || (bcreg_free(fs, rc),1));
> +      if (e->k == VNONRELOC && ra >= fs->nactvar && rc >= ra)
> +	bcreg_free(fs, rc);
> +#endif
>        ins = BCINS_ABC(BC_TSETV, ra, var->u.s.info, rc);
>      }
>    }
> @@ -669,7 +677,7 @@ static void bcemit_method(FuncState *fs, ExpDesc *e, ExpDesc *key)
>    expr_free(fs, e);
>    func = fs->freereg;
>    bcemit_AD(fs, BC_MOV, func+1+LJ_FR2, obj);  /* Copy object to 1st argument. */
> -  lua_assert(expr_isstrk(key));
> +  lj_assertFS(expr_isstrk(key), "bad usage");
>    idx = const_str(fs, key);
>    if (idx <= BCMAX_C) {
>      bcreg_reserve(fs, 2+LJ_FR2);
> @@ -809,7 +817,8 @@ static void bcemit_arith(FuncState *fs, BinOpr opr, ExpDesc *e1, ExpDesc *e2)
>      else
>        rc = expr_toanyreg(fs, e2);
>      /* 1st operand discharged by bcemit_binop_left, but need KNUM/KSHORT. */
> -    lua_assert(expr_isnumk(e1) || e1->k == VNONRELOC);
> +    lj_assertFS(expr_isnumk(e1) || e1->k == VNONRELOC,
> +		"bad expr type %d", e1->k);
>      expr_toval(fs, e1);
>      /* Avoid two consts to satisfy bytecode constraints. */
>      if (expr_isnumk(e1) && !expr_isnumk(e2) &&
> @@ -897,19 +906,20 @@ static void bcemit_binop(FuncState *fs, BinOpr op, ExpDesc *e1, ExpDesc *e2)
>    if (op <= OPR_POW) {
>      bcemit_arith(fs, op, e1, e2);
>    } else if (op == OPR_AND) {
> -    lua_assert(e1->t == NO_JMP);  /* List must be closed. */
> +    lj_assertFS(e1->t == NO_JMP, "jump list not closed");
>      expr_discharge(fs, e2);
>      jmp_append(fs, &e2->f, e1->f);
>      *e1 = *e2;
>    } else if (op == OPR_OR) {
> -    lua_assert(e1->f == NO_JMP);  /* List must be closed. */
> +    lj_assertFS(e1->f == NO_JMP, "jump list not closed");
>      expr_discharge(fs, e2);
>      jmp_append(fs, &e2->t, e1->t);
>      *e1 = *e2;
>    } else if (op == OPR_CONCAT) {
>      expr_toval(fs, e2);
>      if (e2->k == VRELOCABLE && bc_op(*bcptr(fs, e2)) == BC_CAT) {
> -      lua_assert(e1->u.s.info == bc_b(*bcptr(fs, e2))-1);
> +      lj_assertFS(e1->u.s.info == bc_b(*bcptr(fs, e2))-1,
> +		  "bad CAT stack layout");
>        expr_free(fs, e1);
>        setbc_b(bcptr(fs, e2), e1->u.s.info);
>        e1->u.s.info = e2->u.s.info;
> @@ -921,8 +931,9 @@ static void bcemit_binop(FuncState *fs, BinOpr op, ExpDesc *e1, ExpDesc *e2)
>      }
>      e1->k = VRELOCABLE;
>    } else {
> -    lua_assert(op == OPR_NE || op == OPR_EQ ||
> -	       op == OPR_LT || op == OPR_GE || op == OPR_LE || op == OPR_GT);
> +    lj_assertFS(op == OPR_NE || op == OPR_EQ ||
> +	       op == OPR_LT || op == OPR_GE || op == OPR_LE || op == OPR_GT,
> +	       "bad binop %d", op);
>      bcemit_comp(fs, op, e1, e2);
>    }
>  }
> @@ -951,10 +962,10 @@ static void bcemit_unop(FuncState *fs, BCOp op, ExpDesc *e)
>        e->u.s.info = fs->freereg-1;
>        e->k = VNONRELOC;
>      } else {
> -      lua_assert(e->k == VNONRELOC);
> +      lj_assertFS(e->k == VNONRELOC, "bad expr type %d", e->k);
>      }
>    } else {
> -    lua_assert(op == BC_UNM || op == BC_LEN);
> +    lj_assertFS(op == BC_UNM || op == BC_LEN, "bad unop %d", op);
>      if (op == BC_UNM && !expr_hasjump(e)) {  /* Constant-fold negations. */
>  #if LJ_HASFFI
>        if (e->k == VKCDATA) {  /* Fold in-place since cdata is not interned. */
> @@ -1049,8 +1060,9 @@ static void var_new(LexState *ls, BCReg n, GCstr *name)
>        lj_lex_error(ls, 0, LJ_ERR_XLIMC, LJ_MAX_VSTACK);
>      lj_mem_growvec(ls->L, ls->vstack, ls->sizevstack, LJ_MAX_VSTACK, VarInfo);
>    }
> -  lua_assert((uintptr_t)name < VARNAME__MAX ||
> -	     lj_tab_getstr(fs->kt, name) != NULL);
> +  lj_assertFS((uintptr_t)name < VARNAME__MAX ||
> +	      lj_tab_getstr(fs->kt, name) != NULL,
> +	      "unanchored variable name");
>    /* NOBARRIER: name is anchored in fs->kt and ls->vstack is not a GCobj. */
>    setgcref(ls->vstack[vtop].name, obj2gco(name));
>    fs->varmap[fs->nactvar+n] = (uint16_t)vtop;
> @@ -1105,7 +1117,7 @@ static MSize var_lookup_uv(FuncState *fs, MSize vidx, ExpDesc *e)
>        return i;  /* Already exists. */
>    /* Otherwise create a new one. */
>    checklimit(fs, fs->nuv, LJ_MAX_UPVAL, "upvalues");
> -  lua_assert(e->k == VLOCAL || e->k == VUPVAL);
> +  lj_assertFS(e->k == VLOCAL || e->k == VUPVAL, "bad expr type %d", e->k);
>    fs->uvmap[n] = (uint16_t)vidx;
>    fs->uvtmp[n] = (uint16_t)(e->k == VLOCAL ? vidx : LJ_MAX_VSTACK+e->u.s.info);
>    fs->nuv = n+1;
> @@ -1156,7 +1168,8 @@ static MSize gola_new(LexState *ls, GCstr *name, uint8_t info, BCPos pc)
>        lj_lex_error(ls, 0, LJ_ERR_XLIMC, LJ_MAX_VSTACK);
>      lj_mem_growvec(ls->L, ls->vstack, ls->sizevstack, LJ_MAX_VSTACK, VarInfo);
>    }
> -  lua_assert(name == NAME_BREAK || lj_tab_getstr(fs->kt, name) != NULL);
> +  lj_assertFS(name == NAME_BREAK || lj_tab_getstr(fs->kt, name) != NULL,
> +	      "unanchored label name");
>    /* NOBARRIER: name is anchored in fs->kt and ls->vstack is not a GCobj. */
>    setgcref(ls->vstack[vtop].name, obj2gco(name));
>    ls->vstack[vtop].startpc = pc;
> @@ -1186,8 +1199,9 @@ static void gola_close(LexState *ls, VarInfo *vg)
>    FuncState *fs = ls->fs;
>    BCPos pc = vg->startpc;
>    BCIns *ip = &fs->bcbase[pc].ins;
> -  lua_assert(gola_isgoto(vg));
> -  lua_assert(bc_op(*ip) == BC_JMP || bc_op(*ip) == BC_UCLO);
> +  lj_assertFS(gola_isgoto(vg), "expected goto");
> +  lj_assertFS(bc_op(*ip) == BC_JMP || bc_op(*ip) == BC_UCLO,
> +	      "bad bytecode op %d", bc_op(*ip));
>    setbc_a(ip, vg->slot);
>    if (bc_op(*ip) == BC_JMP) {
>      BCPos next = jmp_next(fs, pc);
> @@ -1206,9 +1220,9 @@ static void gola_resolve(LexState *ls, FuncScope *bl, MSize idx)
>      if (gcrefeq(vg->name, vl->name) && gola_isgoto(vg)) {
>        if (vg->slot < vl->slot) {
>  	GCstr *name = strref(var_get(ls, ls->fs, vg->slot).name);
> -	lua_assert((uintptr_t)name >= VARNAME__MAX);
> +	lj_assertLS((uintptr_t)name >= VARNAME__MAX, "expected goto name");
>  	ls->linenumber = ls->fs->bcbase[vg->startpc].line;
> -	lua_assert(strref(vg->name) != NAME_BREAK);
> +	lj_assertLS(strref(vg->name) != NAME_BREAK, "unexpected break");
>  	lj_lex_error(ls, 0, LJ_ERR_XGSCOPE,
>  		     strdata(strref(vg->name)), strdata(name));
>        }
> @@ -1272,7 +1286,7 @@ static void fscope_begin(FuncState *fs, FuncScope *bl, int flags)
>    bl->vstart = fs->ls->vtop;
>    bl->prev = fs->bl;
>    fs->bl = bl;
> -  lua_assert(fs->freereg == fs->nactvar);
> +  lj_assertFS(fs->freereg == fs->nactvar, "bad regalloc");
>  }
>  
>  /* End a scope. */
> @@ -1283,7 +1297,7 @@ static void fscope_end(FuncState *fs)
>    fs->bl = bl->prev;
>    var_remove(ls, bl->nactvar);
>    fs->freereg = fs->nactvar;
> -  lua_assert(bl->nactvar == fs->nactvar);
> +  lj_assertFS(bl->nactvar == fs->nactvar, "bad regalloc");
>    if ((bl->flags & (FSCOPE_UPVAL|FSCOPE_NOCLOSE)) == FSCOPE_UPVAL)
>      bcemit_AJ(fs, BC_UCLO, bl->nactvar, 0);
>    if ((bl->flags & FSCOPE_BREAK)) {
> @@ -1370,13 +1384,13 @@ static void fs_fixup_k(FuncState *fs, GCproto *pt, void *kptr)
>      Node *n = &node[i];
>      if (tvhaskslot(&n->val)) {
>        ptrdiff_t kidx = (ptrdiff_t)tvkslot(&n->val);
> -      lua_assert(!tvisint(&n->key));
> +      lj_assertFS(!tvisint(&n->key), "unexpected integer key");
>        if (tvisnum(&n->key)) {
>  	TValue *tv = &((TValue *)kptr)[kidx];
>  	if (LJ_DUALNUM) {
>  	  lua_Number nn = numV(&n->key);
>  	  int32_t k = lj_num2int(nn);
> -	  lua_assert(!tvismzero(&n->key));
> +	  lj_assertFS(!tvismzero(&n->key), "unexpected -0 key");
>  	  if ((lua_Number)k == nn)
>  	    setintV(tv, k);
>  	  else
> @@ -1424,21 +1438,21 @@ static void fs_fixup_line(FuncState *fs, GCproto *pt,
>      uint8_t *li = (uint8_t *)lineinfo;
>      do {
>        BCLine delta = base[i].line - first;
> -      lua_assert(delta >= 0 && delta < 256);
> +      lj_assertFS(delta >= 0 && delta < 256, "bad line delta");
>        li[i] = (uint8_t)delta;
>      } while (++i < n);
>    } else if (LJ_LIKELY(numline < 65536)) {
>      uint16_t *li = (uint16_t *)lineinfo;
>      do {
>        BCLine delta = base[i].line - first;
> -      lua_assert(delta >= 0 && delta < 65536);
> +      lj_assertFS(delta >= 0 && delta < 65536, "bad line delta");
>        li[i] = (uint16_t)delta;
>      } while (++i < n);
>    } else {
>      uint32_t *li = (uint32_t *)lineinfo;
>      do {
>        BCLine delta = base[i].line - first;
> -      lua_assert(delta >= 0);
> +      lj_assertFS(delta >= 0, "bad line delta");
>        li[i] = (uint32_t)delta;
>      } while (++i < n);
>    }
> @@ -1528,7 +1542,7 @@ static void fs_fixup_ret(FuncState *fs)
>    }
>    fs->bl->flags |= FSCOPE_NOCLOSE;  /* Handled above. */
>    fscope_end(fs);
> -  lua_assert(fs->bl == NULL);
> +  lj_assertFS(fs->bl == NULL, "bad scope nesting");
>    /* May need to fixup returns encoded before first function was created. */
>    if (fs->flags & PROTO_FIXUP_RETURN) {
>      BCPos pc;
> @@ -1608,7 +1622,7 @@ static GCproto *fs_finish(LexState *ls, BCLine line)
>    L->top--;  /* Pop table of constants. */
>    ls->vtop = fs->vbase;  /* Reset variable stack. */
>    ls->fs = fs->prev;
> -  lua_assert(ls->fs != NULL || ls->tok == TK_eof);
> +  lj_assertL(ls->fs != NULL || ls->tok == TK_eof, "bad parser state");
>    return pt;
>  }
>  
> @@ -1702,14 +1716,15 @@ static void expr_bracket(LexState *ls, ExpDesc *v)
>  }
>  
>  /* Get value of constant expression. */
> -static void expr_kvalue(TValue *v, ExpDesc *e)
> +static void expr_kvalue(FuncState *fs, TValue *v, ExpDesc *e)
>  {
> +  UNUSED(fs);
>    if (e->k <= VKTRUE) {
>      setpriV(v, ~(uint32_t)e->k);
>    } else if (e->k == VKSTR) {
>      setgcVraw(v, obj2gco(e->u.sval), LJ_TSTR);
>    } else {
> -    lua_assert(tvisnumber(expr_numtv(e)));
> +    lj_assertFS(tvisnumber(expr_numtv(e)), "bad number constant");
>      *v = *expr_numtv(e);
>    }
>  }
> @@ -1759,11 +1774,11 @@ static void expr_table(LexState *ls, ExpDesc *e)
>  	fs->bcbase[pc].ins = BCINS_AD(BC_TDUP, freg-1, kidx);
>        }
>        vcall = 0;
> -      expr_kvalue(&k, &key);
> +      expr_kvalue(fs, &k, &key);
>        v = lj_tab_set(fs->L, t, &k);
>        lj_gc_anybarriert(fs->L, t);
>        if (expr_isk_nojump(&val)) {  /* Add const key/value to template table. */
> -	expr_kvalue(v, &val);
> +	expr_kvalue(fs, v, &val);
>        } else {  /* Otherwise create dummy string key (avoids lj_tab_newkey). */
>  	settabV(fs->L, v, t);  /* Preserve key with table itself as value. */
>  	fixt = 1;   /* Fix this later, after all resizes. */
> @@ -1782,8 +1797,9 @@ static void expr_table(LexState *ls, ExpDesc *e)
>    if (vcall) {
>      BCInsLine *ilp = &fs->bcbase[fs->pc-1];
>      ExpDesc en;
> -    lua_assert(bc_a(ilp->ins) == freg &&
> -	       bc_op(ilp->ins) == (narr > 256 ? BC_TSETV : BC_TSETB));
> +    lj_assertFS(bc_a(ilp->ins) == freg &&
> +		bc_op(ilp->ins) == (narr > 256 ? BC_TSETV : BC_TSETB),
> +		"bad CALL code generation");
>      expr_init(&en, VKNUM, 0);
>      en.u.nval.u32.lo = narr-1;
>      en.u.nval.u32.hi = 0x43300000;  /* Biased integer to avoid denormals. */
> @@ -1813,7 +1829,7 @@ static void expr_table(LexState *ls, ExpDesc *e)
>        for (i = 0; i <= hmask; i++) {
>  	Node *n = &node[i];
>  	if (tvistab(&n->val)) {
> -	  lua_assert(tabV(&n->val) == t);
> +	  lj_assertFS(tabV(&n->val) == t, "bad dummy key in template table");
>  	  setnilV(&n->val);  /* Turn value into nil. */
>  	}
>        }
> @@ -1844,7 +1860,7 @@ static BCReg parse_params(LexState *ls, int needself)
>      } while (lex_opt(ls, ','));
>    }
>    var_add(ls, nparams);
> -  lua_assert(fs->nactvar == nparams);
> +  lj_assertFS(fs->nactvar == nparams, "bad regalloc");
>    bcreg_reserve(fs, nparams);
>    lex_check(ls, ')');
>    return nparams;
> @@ -1931,7 +1947,7 @@ static void parse_args(LexState *ls, ExpDesc *e)
>      err_syntax(ls, LJ_ERR_XFUNARG);
>      return;  /* Silence compiler. */
>    }
> -  lua_assert(e->k == VNONRELOC);
> +  lj_assertFS(e->k == VNONRELOC, "bad expr type %d", e->k);
>    base = e->u.s.info;  /* Base register for call. */
>    if (args.k == VCALL) {
>      ins = BCINS_ABC(BC_CALLM, base, 2, args.u.s.aux - base - 1 - LJ_FR2);
> @@ -2701,8 +2717,9 @@ static void parse_chunk(LexState *ls)
>    while (!islast && !parse_isend(ls->tok)) {
>      islast = parse_stmt(ls);
>      lex_opt(ls, ';');
> -    lua_assert(ls->fs->framesize >= ls->fs->freereg &&
> -	       ls->fs->freereg >= ls->fs->nactvar);
> +    lj_assertLS(ls->fs->framesize >= ls->fs->freereg &&
> +		ls->fs->freereg >= ls->fs->nactvar,
> +		"bad regalloc");
>      ls->fs->freereg = ls->fs->nactvar;  /* Free registers after each stmt. */
>    }
>    synlevel_end(ls);
> @@ -2737,9 +2754,8 @@ GCproto *lj_parse(LexState *ls)
>      err_token(ls, TK_eof);
>    pt = fs_finish(ls, ls->linenumber);
>    L->top--;  /* Drop chunkname. */
> -  lua_assert(fs.prev == NULL);
> -  lua_assert(ls->fs == NULL);
> -  lua_assert(pt->sizeuv == 0);
> +  lj_assertL(fs.prev == NULL && ls->fs == NULL, "mismatched frame nesting");
> +  lj_assertL(pt->sizeuv == 0, "toplevel proto has upvalues");
>    return pt;
>  }
>  
> diff --git a/src/lj_record.c b/src/lj_record.c
> index 6030f77c..d1332bfc 100644
> --- a/src/lj_record.c
> +++ b/src/lj_record.c
> @@ -50,34 +50,52 @@
>  static void rec_check_ir(jit_State *J)
>  {
>    IRRef i, nins = J->cur.nins, nk = J->cur.nk;
> -  lua_assert(nk <= REF_BIAS && nins >= REF_BIAS && nins < 65536);
> +  lj_assertJ(nk <= REF_BIAS && nins >= REF_BIAS && nins < 65536,
> +	     "inconsistent IR layout");
>    for (i = nk; i < nins; i++) {
>      IRIns *ir = IR(i);
>      uint32_t mode = lj_ir_mode[ir->o];
>      IRRef op1 = ir->op1;
>      IRRef op2 = ir->op2;
> +    const char *err = NULL;
>      switch (irm_op1(mode)) {
> -    case IRMnone: lua_assert(op1 == 0); break;
> -    case IRMref: lua_assert(op1 >= nk);
> -      lua_assert(i >= REF_BIAS ? op1 < i : op1 > i); break;
> +    case IRMnone:
> +      if (op1 != 0) err = "IRMnone op1 used";
> +      break;
> +    case IRMref:
> +      if (op1 < nk || (i >= REF_BIAS ? op1 >= i : op1 <= i))
> +	err = "IRMref op1 out of range";
> +      break;
>      case IRMlit: break;
> -    case IRMcst: lua_assert(i < REF_BIAS);
> +    case IRMcst:
> +      if (i >= REF_BIAS) { err = "constant in IR range"; break; }
>        if (irt_is64(ir->t) && ir->o != IR_KNULL)
>  	i++;
>        continue;
>      }
>      switch (irm_op2(mode)) {
> -    case IRMnone: lua_assert(op2 == 0); break;
> -    case IRMref: lua_assert(op2 >= nk);
> -      lua_assert(i >= REF_BIAS ? op2 < i : op2 > i); break;
> +    case IRMnone:
> +      if (op2) err = "IRMnone op2 used";
> +      break;
> +    case IRMref:
> +      if (op2 < nk || (i >= REF_BIAS ? op2 >= i : op2 <= i))
> +	err = "IRMref op2 out of range";
> +      break;
>      case IRMlit: break;
> -    case IRMcst: lua_assert(0); break;
> +    case IRMcst: err = "IRMcst op2"; break;
>      }
> -    if (ir->prev) {
> -      lua_assert(ir->prev >= nk);
> -      lua_assert(i >= REF_BIAS ? ir->prev < i : ir->prev > i);
> -      lua_assert(ir->o == IR_NOP || IR(ir->prev)->o == ir->o);
> +    if (!err && ir->prev) {
> +      if (ir->prev < nk || (i >= REF_BIAS ? ir->prev >= i : ir->prev <= i))
> +	err = "chain out of range";
> +      else if (ir->o != IR_NOP && IR(ir->prev)->o != ir->o)
> +	err = "chain to different op";
>      }
> +    lj_assertJ(!err, "bad IR %04d op %d(%04d,%04d): %s",
> +	       i-REF_BIAS,
> +	       ir->o,
> +	       irm_op1(mode) == IRMref ? op1-REF_BIAS : op1,
> +	       irm_op2(mode) == IRMref ? op2-REF_BIAS : op2,
> +	       err);
>    }
>  }
>  
> @@ -87,9 +105,10 @@ static void rec_check_slots(jit_State *J)
>    BCReg s, nslots = J->baseslot + J->maxslot;
>    int32_t depth = 0;
>    cTValue *base = J->L->base - J->baseslot;
> -  lua_assert(J->baseslot >= 1+LJ_FR2);
> -  lua_assert(J->baseslot == 1+LJ_FR2 || (J->slot[J->baseslot-1] & TREF_FRAME));
> -  lua_assert(nslots <= LJ_MAX_JSLOTS);
> +  lj_assertJ(J->baseslot >= 1+LJ_FR2, "bad baseslot");
> +  lj_assertJ(J->baseslot == 1+LJ_FR2 || (J->slot[J->baseslot-1] & TREF_FRAME),
> +	     "baseslot does not point to frame");
> +  lj_assertJ(nslots <= LJ_MAX_JSLOTS, "slot overflow");
>    for (s = 0; s < nslots; s++) {
>      TRef tr = J->slot[s];
>      if (tr) {
> @@ -97,56 +116,65 @@ static void rec_check_slots(jit_State *J)
>        IRRef ref = tref_ref(tr);
>        IRIns *ir = NULL;  /* Silence compiler. */
>        if (!LJ_FR2 || ref || !(tr & (TREF_FRAME | TREF_CONT))) {
> -	lua_assert(ref >= J->cur.nk && ref < J->cur.nins);
> +	lj_assertJ(ref >= J->cur.nk && ref < J->cur.nins,
> +		   "slot %d ref %04d out of range", s, ref - REF_BIAS);
>  	ir = IR(ref);
> -	lua_assert(irt_t(ir->t) == tref_t(tr));
> +	lj_assertJ(irt_t(ir->t) == tref_t(tr), "slot %d IR type mismatch", s);
>        }
>        if (s == 0) {
> -	lua_assert(tref_isfunc(tr));
> +	lj_assertJ(tref_isfunc(tr), "frame slot 0 is not a function");
>  #if LJ_FR2
>        } else if (s == 1) {
> -	lua_assert((tr & ~TREF_FRAME) == 0);
> +	lj_assertJ((tr & ~TREF_FRAME) == 0, "bad frame slot 1");
>  #endif
>        } else if ((tr & TREF_FRAME)) {
>  	GCfunc *fn = gco2func(frame_gc(tv));
>  	BCReg delta = (BCReg)(tv - frame_prev(tv));
>  #if LJ_FR2
> -	if (ref)
> -	  lua_assert(ir_knum(ir)->u64 == tv->u64);
> +	lj_assertJ(!ref || ir_knum(ir)->u64 == tv->u64,
> +		   "frame slot %d PC mismatch", s);
>  	tr = J->slot[s-1];
>  	ir = IR(tref_ref(tr));
>  #endif
> -	lua_assert(tref_isfunc(tr));
> -	if (tref_isk(tr)) lua_assert(fn == ir_kfunc(ir));
> -	lua_assert(s > delta + LJ_FR2 ? (J->slot[s-delta] & TREF_FRAME)
> -				      : (s == delta + LJ_FR2));
> +	lj_assertJ(tref_isfunc(tr),
> +		   "frame slot %d is not a function", s-LJ_FR2);
> +	lj_assertJ(!tref_isk(tr) || fn == ir_kfunc(ir),
> +		   "frame slot %d function mismatch", s-LJ_FR2);
> +	lj_assertJ(s > delta + LJ_FR2 ? (J->slot[s-delta] & TREF_FRAME)
> +				      : (s == delta + LJ_FR2),
> +		   "frame slot %d broken chain", s-LJ_FR2);
>  	depth++;
>        } else if ((tr & TREF_CONT)) {
>  #if LJ_FR2
> -	if (ref)
> -	  lua_assert(ir_knum(ir)->u64 == tv->u64);
> +	lj_assertJ(!ref || ir_knum(ir)->u64 == tv->u64,
> +		   "cont slot %d continuation mismatch", s);
>  #else
> -	lua_assert(ir_kptr(ir) == gcrefp(tv->gcr, void));
> +	lj_assertJ(ir_kptr(ir) == gcrefp(tv->gcr, void),
> +		   "cont slot %d continuation mismatch", s);
>  #endif
> -	lua_assert((J->slot[s+1+LJ_FR2] & TREF_FRAME));
> +	lj_assertJ((J->slot[s+1+LJ_FR2] & TREF_FRAME),
> +		   "cont slot %d not followed by frame", s);
>  	depth++;
>        } else {
> -	if (tvisnumber(tv))
> -	  lua_assert(tref_isnumber(tr));  /* Could be IRT_INT etc., too. */
> -	else
> -	  lua_assert(itype2irt(tv) == tref_type(tr));
> +	/* Number repr. may differ, but other types must be the same. */
> +	lj_assertJ(tvisnumber(tv) ? tref_isnumber(tr) :
> +				    itype2irt(tv) == tref_type(tr),
> +		   "slot %d type mismatch: stack type %d vs IR type %d",
> +		   s, itypemap(tv), tref_type(tr));
>  	if (tref_isk(tr)) {  /* Compare constants. */
>  	  TValue tvk;
>  	  lj_ir_kvalue(J->L, &tvk, ir);
> -	  if (!(tvisnum(&tvk) && tvisnan(&tvk)))
> -	    lua_assert(lj_obj_equal(tv, &tvk));
> -	  else
> -	    lua_assert(tvisnum(tv) && tvisnan(tv));
> +	  lj_assertJ((tvisnum(&tvk) && tvisnan(&tvk)) ?
> +		     (tvisnum(tv) && tvisnan(tv)) :
> +		     lj_obj_equal(tv, &tvk),
> +		     "slot %d const mismatch: stack %016llx vs IR %016llx",
> +		     s, tv->u64, tvk.u64);
>  	}
>        }
>      }
>    }
> -  lua_assert(J->framedepth == depth);
> +  lj_assertJ(J->framedepth == depth,
> +	     "frame depth mismatch %d vs %d", J->framedepth, depth);
>  }
>  #endif
>  
> @@ -182,7 +210,7 @@ static TRef getcurrf(jit_State *J)
>  {
>    if (J->base[-1-LJ_FR2])
>      return J->base[-1-LJ_FR2];
> -  lua_assert(J->baseslot == 1+LJ_FR2);
> +  lj_assertJ(J->baseslot == 1+LJ_FR2, "bad baseslot");
>    return sloadt(J, -1-LJ_FR2, IRT_FUNC, IRSLOAD_READONLY);
>  }
>  
> @@ -427,7 +455,8 @@ static void rec_for_loop(jit_State *J, const BCIns *fori, ScEvEntry *scev,
>    TRef stop = fori_arg(J, fori, ra+FORL_STOP, t, mode);
>    TRef step = fori_arg(J, fori, ra+FORL_STEP, t, mode);
>    int tc, dir = rec_for_direction(&tv[FORL_STEP]);
> -  lua_assert(bc_op(*fori) == BC_FORI || bc_op(*fori) == BC_JFORI);
> +  lj_assertJ(bc_op(*fori) == BC_FORI || bc_op(*fori) == BC_JFORI,
> +	     "bad bytecode %d instead of FORI/JFORI", bc_op(*fori));
>    scev->t.irt = t;
>    scev->dir = dir;
>    scev->stop = tref_ref(stop);
> @@ -483,7 +512,7 @@ static LoopEvent rec_for(jit_State *J, const BCIns *fori, int isforl)
>  						   IRT_NUM;
>      for (i = FORL_IDX; i <= FORL_STEP; i++) {
>        if (!tr[i]) sload(J, ra+i);
> -      lua_assert(tref_isnumber_str(tr[i]));
> +      lj_assertJ(tref_isnumber_str(tr[i]), "bad FORI argument type");
>        if (tref_isstr(tr[i]))
>  	tr[i] = emitir(IRTG(IR_STRTO, IRT_NUM), tr[i], 0);
>        if (t == IRT_INT) {
> @@ -615,7 +644,8 @@ static void rec_loop_jit(jit_State *J, TraceNo lnk, LoopEvent ev)
>  static int rec_profile_need(jit_State *J, GCproto *pt, const BCIns *pc)
>  {
>    GCproto *ppt;
> -  lua_assert(J->prof_mode == 'f' || J->prof_mode == 'l');
> +  lj_assertJ(J->prof_mode == 'f' || J->prof_mode == 'l',
> +	     "bad profiler mode %c", J->prof_mode);
>    if (!pt)
>      return 0;
>    ppt = J->prev_pt;
> @@ -793,7 +823,7 @@ void lj_record_ret(jit_State *J, BCReg rbase, ptrdiff_t gotresults)
>      BCReg cbase = (BCReg)frame_delta(frame);
>      if (--J->framedepth <= 0)
>        lj_trace_err(J, LJ_TRERR_NYIRETL);
> -    lua_assert(J->baseslot > 1+LJ_FR2);
> +    lj_assertJ(J->baseslot > 1+LJ_FR2, "bad baseslot for return");
>      gotresults++;
>      rbase += cbase;
>      J->baseslot -= (BCReg)cbase;
> @@ -818,7 +848,7 @@ void lj_record_ret(jit_State *J, BCReg rbase, ptrdiff_t gotresults)
>      BCReg cbase = (BCReg)frame_delta(frame);
>      if (--J->framedepth < 0)  /* NYI: return of vararg func to lower frame. */
>        lj_trace_err(J, LJ_TRERR_NYIRETL);
> -    lua_assert(J->baseslot > 1+LJ_FR2);
> +    lj_assertJ(J->baseslot > 1+LJ_FR2, "bad baseslot for return");
>      rbase += cbase;
>      J->baseslot -= (BCReg)cbase;
>      J->base -= cbase;
> @@ -845,7 +875,7 @@ void lj_record_ret(jit_State *J, BCReg rbase, ptrdiff_t gotresults)
>      J->maxslot = cbase+(BCReg)nresults;
>      if (J->framedepth > 0) {  /* Return to a frame that is part of the trace. */
>        J->framedepth--;
> -      lua_assert(J->baseslot > cbase+1+LJ_FR2);
> +      lj_assertJ(J->baseslot > cbase+1+LJ_FR2, "bad baseslot for return");
>        J->baseslot -= cbase+1+LJ_FR2;
>        J->base -= cbase+1+LJ_FR2;
>      } else if (J->parent == 0 && J->exitno == 0 &&
> @@ -860,7 +890,7 @@ void lj_record_ret(jit_State *J, BCReg rbase, ptrdiff_t gotresults)
>        emitir(IRTG(IR_RETF, IRT_PGC), trpt, trpc);
>        J->retdepth++;
>        J->needsnap = 1;
> -      lua_assert(J->baseslot == 1+LJ_FR2);
> +      lj_assertJ(J->baseslot == 1+LJ_FR2, "bad baseslot for return");
>        /* Shift result slots up and clear the slots of the new frame below. */
>        memmove(J->base + cbase, J->base-1-LJ_FR2, sizeof(TRef)*nresults);
>        memset(J->base-1-LJ_FR2, 0, sizeof(TRef)*(cbase+1+LJ_FR2));
> @@ -908,12 +938,13 @@ void lj_record_ret(jit_State *J, BCReg rbase, ptrdiff_t gotresults)
>        }  /* Otherwise continue with another __concat call. */
>      } else {
>        /* Result type already specialized. */
> -      lua_assert(cont == lj_cont_condf || cont == lj_cont_condt);
> +      lj_assertJ(cont == lj_cont_condf || cont == lj_cont_condt,
> +		 "bad continuation type");
>      }
>    } else {
>      lj_trace_err(J, LJ_TRERR_NYIRETL);  /* NYI: handle return to C frame. */
>    }
> -  lua_assert(J->baseslot >= 1+LJ_FR2);
> +  lj_assertJ(J->baseslot >= 1+LJ_FR2, "bad baseslot for return");
>  }
>  
>  /* -- Metamethod handling ------------------------------------------------- */
> @@ -1168,7 +1199,7 @@ static void rec_mm_comp_cdata(jit_State *J, RecordIndex *ix, int op, MMS mm)
>      ix->tab = ix->val;
>      copyTV(J->L, &ix->tabv, &ix->valv);
>    } else {
> -    lua_assert(tref_iscdata(ix->key));
> +    lj_assertJ(tref_iscdata(ix->key), "cdata expected");
>      ix->tab = ix->key;
>      copyTV(J->L, &ix->tabv, &ix->keyv);
>    }
> @@ -1265,7 +1296,8 @@ static void rec_idx_abc(jit_State *J, TRef asizeref, TRef ikey, uint32_t asize)
>      /* Got scalar evolution analysis results for this reference? */
>      if (ref == J->scev.idx) {
>        int32_t stop;
> -      lua_assert(irt_isint(J->scev.t) && ir->o == IR_SLOAD);
> +      lj_assertJ(irt_isint(J->scev.t) && ir->o == IR_SLOAD,
> +		 "only int SCEV supported");
>        stop = numberVint(&(J->L->base - J->baseslot)[ir->op1 + FORL_STOP]);
>        /* Runtime value for stop of loop is within bounds? */
>        if ((uint64_t)stop + ofs < (uint64_t)asize) {
> @@ -1383,7 +1415,7 @@ TRef lj_record_idx(jit_State *J, RecordIndex *ix)
>  
>    while (!tref_istab(ix->tab)) { /* Handle non-table lookup. */
>      /* Never call raw lj_record_idx() on non-table. */
> -    lua_assert(ix->idxchain != 0);
> +    lj_assertJ(ix->idxchain != 0, "bad usage");
>      if (!lj_record_mm_lookup(J, ix, ix->val ? MM_newindex : MM_index))
>        lj_trace_err(J, LJ_TRERR_NOMM);
>    handlemm:
> @@ -1467,10 +1499,10 @@ TRef lj_record_idx(jit_State *J, RecordIndex *ix)
>  	emitir(IRTG(oldv == niltvg(J2G(J)) ? IR_EQ : IR_NE, IRT_PGC),
>  	       xref, lj_ir_kkptr(J, niltvg(J2G(J))));
>        if (ix->idxchain && lj_record_mm_lookup(J, ix, MM_newindex)) {
> -	lua_assert(hasmm);
> +	lj_assertJ(hasmm, "inconsistent metamethod handling");
>  	goto handlemm;
>        }
> -      lua_assert(!hasmm);
> +      lj_assertJ(!hasmm, "inconsistent metamethod handling");
>        if (oldv == niltvg(J2G(J))) {  /* Need to insert a new key. */
>  	TRef key = ix->key;
>  	if (tref_isinteger(key))  /* NEWREF needs a TValue as a key. */
> @@ -1578,7 +1610,7 @@ static TRef rec_upvalue(jit_State *J, uint32_t uv, TRef val)
>    int needbarrier = 0;
>    if (rec_upvalue_constify(J, uvp)) {  /* Try to constify immutable upvalue. */
>      TRef tr, kfunc;
> -    lua_assert(val == 0);
> +    lj_assertJ(val == 0, "bad usage");
>      if (!tref_isk(fn)) {  /* Late specialization of current function. */
>        if (J->pt->flags >= PROTO_CLC_POLY)
>  	goto noconstify;
> @@ -1700,7 +1732,7 @@ static void rec_func_vararg(jit_State *J)
>  {
>    GCproto *pt = J->pt;
>    BCReg s, fixargs, vframe = J->maxslot+1+LJ_FR2;
> -  lua_assert((pt->flags & PROTO_VARARG));
> +  lj_assertJ((pt->flags & PROTO_VARARG), "FUNCV in non-vararg function");
>    if (J->baseslot + vframe + pt->framesize >= LJ_MAX_JSLOTS)
>      lj_trace_err(J, LJ_TRERR_STACKOV);
>    J->base[vframe-1-LJ_FR2] = J->base[-1-LJ_FR2];  /* Copy function up. */
> @@ -1769,7 +1801,7 @@ static void rec_varg(jit_State *J, BCReg dst, ptrdiff_t nresults)
>  {
>    int32_t numparams = J->pt->numparams;
>    ptrdiff_t nvararg = frame_delta(J->L->base-1) - numparams - 1 - LJ_FR2;
> -  lua_assert(frame_isvarg(J->L->base-1));
> +  lj_assertJ(frame_isvarg(J->L->base-1), "VARG in non-vararg frame");
>    if (LJ_FR2 && dst > J->maxslot)
>      J->base[dst-1] = 0;  /* Prevent resurrection of unrelated slot. */
>    if (J->framedepth > 0) {  /* Simple case: varargs defined on-trace. */
> @@ -1887,7 +1919,7 @@ static TRef rec_cat(jit_State *J, BCReg baseslot, BCReg topslot)
>    TValue savetv[5];
>    BCReg s;
>    RecordIndex ix;
> -  lua_assert(baseslot < topslot);
> +  lj_assertJ(baseslot < topslot, "bad CAT arg");
>    for (s = baseslot; s <= topslot; s++)
>      (void)getslot(J, s);  /* Ensure all arguments have a reference. */
>    if (tref_isnumber_str(top[0]) && tref_isnumber_str(top[-1])) {
> @@ -2011,7 +2043,7 @@ void lj_record_ins(jit_State *J)
>        if (bc_op(*J->pc) >= BC__MAX)
>  	return;
>        break;
> -    default: lua_assert(0); break;
> +    default: lj_assertJ(0, "bad post-processing mode"); break;
>      }
>      J->postproc = LJ_POST_NONE;
>    }
> @@ -2379,7 +2411,8 @@ void lj_record_ins(jit_State *J)
>        J->loopref = J->cur.nins;
>      break;
>    case BC_JFORI:
> -    lua_assert(bc_op(pc[(ptrdiff_t)rc-BCBIAS_J]) == BC_JFORL);
> +    lj_assertJ(bc_op(pc[(ptrdiff_t)rc-BCBIAS_J]) == BC_JFORL,
> +	       "JFORI does not point to JFORL");
>      if (rec_for(J, pc, 0) != LOOPEV_LEAVE)  /* Link to existing loop. */
>        lj_record_stop(J, LJ_TRLINK_ROOT, bc_d(pc[(ptrdiff_t)rc-BCBIAS_J]));
>      /* Continue tracing if the loop is not entered. */
> @@ -2432,7 +2465,8 @@ void lj_record_ins(jit_State *J)
>      rec_func_lua(J);
>      break;
>    case BC_JFUNCV:
> -    lua_assert(0);  /* Cannot happen. No hotcall counting for varag funcs. */
> +    /* Cannot happen. No hotcall counting for varag funcs. */
> +    lj_assertJ(0, "unsupported vararg hotcall");
>      break;
>  
>    case BC_FUNCC:
> @@ -2492,11 +2526,11 @@ static const BCIns *rec_setup_root(jit_State *J)
>      J->bc_min = pc;
>      break;
>    case BC_ITERL:
> -    lua_assert(bc_op(pc[-1]) == BC_ITERC);
> +    lj_assertJ(bc_op(pc[-1]) == BC_ITERC, "no ITERC before ITERL");
>      J->maxslot = ra + bc_b(pc[-1]) - 1;
>      J->bc_extent = (MSize)(-bc_j(ins))*sizeof(BCIns);
>      pc += 1+bc_j(ins);
> -    lua_assert(bc_op(pc[-1]) == BC_JMP);
> +    lj_assertJ(bc_op(pc[-1]) == BC_JMP, "ITERL does not point to JMP+1");
>      J->bc_min = pc;
>      break;
>    case BC_LOOP:
> @@ -2528,7 +2562,7 @@ static const BCIns *rec_setup_root(jit_State *J)
>      pc++;
>      break;
>    default:
> -    lua_assert(0);
> +    lj_assertJ(0, "bad root trace start bytecode %d", bc_op(ins));
>      break;
>    }
>    return pc;
> diff --git a/src/lj_snap.c b/src/lj_snap.c
> index 9146cddc..2dc281cb 100644
> --- a/src/lj_snap.c
> +++ b/src/lj_snap.c
> @@ -110,7 +110,7 @@ static MSize snapshot_framelinks(jit_State *J, SnapEntry *map, uint8_t *topslot)
>    cTValue *ftop = isluafunc(fn) ? (frame+funcproto(fn)->framesize) : J->L->top;
>  #if LJ_FR2
>    uint64_t pcbase = (u64ptr(J->pc) << 8) | (J->baseslot - 2);
> -  lua_assert(2 <= J->baseslot && J->baseslot <= 257);
> +  lj_assertJ(2 <= J->baseslot && J->baseslot <= 257, "bad baseslot");
>    memcpy(map, &pcbase, sizeof(uint64_t));
>  #else
>    MSize f = 0;
> @@ -129,7 +129,7 @@ static MSize snapshot_framelinks(jit_State *J, SnapEntry *map, uint8_t *topslot)
>  #endif
>        frame = frame_prevd(frame);
>      } else {
> -      lua_assert(!frame_isc(frame));
> +      lj_assertJ(!frame_isc(frame), "broken frame chain");
>  #if !LJ_FR2
>        map[f++] = SNAP_MKFTSZ(frame_ftsz(frame));
>  #endif
> @@ -141,10 +141,10 @@ static MSize snapshot_framelinks(jit_State *J, SnapEntry *map, uint8_t *topslot)
>    }
>    *topslot = (uint8_t)(ftop - lim);
>  #if LJ_FR2
> -  lua_assert(sizeof(SnapEntry) * 2 == sizeof(uint64_t));
> +  lj_assertJ(sizeof(SnapEntry) * 2 == sizeof(uint64_t), "bad SnapEntry def");
>    return 2;
>  #else
> -  lua_assert(f == (MSize)(1 + J->framedepth));
> +  lj_assertJ(f == (MSize)(1 + J->framedepth), "miscalculated snapshot size");
>    return f;
>  #endif
>  }
> @@ -223,7 +223,8 @@ static BCReg snap_usedef(jit_State *J, uint8_t *udf,
>  #define DEF_SLOT(s)		udf[(s)] *= 3
>  
>    /* Scan through following bytecode and check for uses/defs. */
> -  lua_assert(pc >= proto_bc(J->pt) && pc < proto_bc(J->pt) + J->pt->sizebc);
> +  lj_assertJ(pc >= proto_bc(J->pt) && pc < proto_bc(J->pt) + J->pt->sizebc,
> +	     "snapshot PC out of range");
>    for (;;) {
>      BCIns ins = *pc++;
>      BCOp op = bc_op(ins);
> @@ -234,7 +235,7 @@ static BCReg snap_usedef(jit_State *J, uint8_t *udf,
>      switch (bcmode_c(op)) {
>      case BCMvar: USE_SLOT(bc_c(ins)); break;
>      case BCMrbase:
> -      lua_assert(op == BC_CAT);
> +      lj_assertJ(op == BC_CAT, "unhandled op %d with RC rbase", op);
>        for (s = bc_b(ins); s <= bc_c(ins); s++) USE_SLOT(s);
>        for (; s < maxslot; s++) DEF_SLOT(s);
>        break;
> @@ -288,7 +289,8 @@ static BCReg snap_usedef(jit_State *J, uint8_t *udf,
>        break;
>      default: break;
>      }
> -    lua_assert(pc >= proto_bc(J->pt) && pc < proto_bc(J->pt) + J->pt->sizebc);
> +    lj_assertJ(pc >= proto_bc(J->pt) && pc < proto_bc(J->pt) + J->pt->sizebc,
> +	       "use/def analysis PC out of range");
>    }
>  
>  #undef USE_SLOT
> @@ -361,19 +363,20 @@ static RegSP snap_renameref(GCtrace *T, SnapNo lim, IRRef ref, RegSP rs)
>  }
>  
>  /* Copy RegSP from parent snapshot to the parent links of the IR. */
> -IRIns *lj_snap_regspmap(GCtrace *T, SnapNo snapno, IRIns *ir)
> +IRIns *lj_snap_regspmap(jit_State *J, GCtrace *T, SnapNo snapno, IRIns *ir)
>  {
>    SnapShot *snap = &T->snap[snapno];
>    SnapEntry *map = &T->snapmap[snap->mapofs];
>    BloomFilter rfilt = snap_renamefilter(T, snapno);
>    MSize n = 0;
>    IRRef ref = 0;
> +  UNUSED(J);
>    for ( ; ; ir++) {
>      uint32_t rs;
>      if (ir->o == IR_SLOAD) {
>        if (!(ir->op2 & IRSLOAD_PARENT)) break;
>        for ( ; ; n++) {
> -	lua_assert(n < snap->nent);
> +	lj_assertJ(n < snap->nent, "slot %d not found in snapshot", ir->op1);
>  	if (snap_slot(map[n]) == ir->op1) {
>  	  ref = snap_ref(map[n++]);
>  	  break;
> @@ -390,7 +393,7 @@ IRIns *lj_snap_regspmap(GCtrace *T, SnapNo snapno, IRIns *ir)
>      if (bloomtest(rfilt, ref))
>        rs = snap_renameref(T, snapno, ref, rs);
>      ir->prev = (uint16_t)rs;
> -    lua_assert(regsp_used(rs));
> +    lj_assertJ(regsp_used(rs), "unused IR %04d in snapshot", ref - REF_BIAS);
>    }
>    return ir;
>  }
> @@ -408,7 +411,7 @@ static TRef snap_replay_const(jit_State *J, IRIns *ir)
>    case IR_KNUM: case IR_KINT64:
>      return lj_ir_k64(J, (IROp)ir->o, ir_k64(ir)->u64);
>    case IR_KPTR: return lj_ir_kptr(J, ir_kptr(ir));  /* Continuation. */
> -  default: lua_assert(0); return TREF_NIL; break;
> +  default: lj_assertJ(0, "bad IR constant op %d", ir->o); return TREF_NIL;
>    }
>  }
>  
> @@ -486,7 +489,7 @@ void lj_snap_replay(jit_State *J, GCtrace *T)
>  	tr = snap_replay_const(J, ir);
>      } else if (!regsp_used(ir->prev)) {
>        pass23 = 1;
> -      lua_assert(s != 0);
> +      lj_assertJ(s != 0, "unused slot 0 in snapshot");
>        tr = s;
>      } else {
>        IRType t = irt_type(ir->t);
> @@ -512,8 +515,9 @@ void lj_snap_replay(jit_State *J, GCtrace *T)
>        if (regsp_reg(ir->r) == RID_SUNK) {
>  	if (J->slot[snap_slot(sn)] != snap_slot(sn)) continue;
>  	pass23 = 1;
> -	lua_assert(ir->o == IR_TNEW || ir->o == IR_TDUP ||
> -		   ir->o == IR_CNEW || ir->o == IR_CNEWI);
> +	lj_assertJ(ir->o == IR_TNEW || ir->o == IR_TDUP ||
> +		   ir->o == IR_CNEW || ir->o == IR_CNEWI,
> +		   "sunk parent IR %04d has bad op %d", refp - REF_BIAS, ir->o);
>  	if (ir->op1 >= T->nk) snap_pref(J, T, map, nent, seen, ir->op1);
>  	if (ir->op2 >= T->nk) snap_pref(J, T, map, nent, seen, ir->op2);
>  	if (LJ_HASFFI && ir->o == IR_CNEWI) {
> @@ -531,7 +535,8 @@ void lj_snap_replay(jit_State *J, GCtrace *T)
>  	    }
>  	}
>        } else if (!irref_isk(refp) && !regsp_used(ir->prev)) {
> -	lua_assert(ir->o == IR_CONV && ir->op2 == IRCONV_NUM_INT);
> +	lj_assertJ(ir->o == IR_CONV && ir->op2 == IRCONV_NUM_INT,
> +		   "sunk parent IR %04d has bad op %d", refp - REF_BIAS, ir->o);
>  	J->slot[snap_slot(sn)] = snap_pref(J, T, map, nent, seen, ir->op1);
>        }
>      }
> @@ -581,7 +586,9 @@ void lj_snap_replay(jit_State *J, GCtrace *T)
>  	      val = snap_pref(J, T, map, nent, seen, irs->op2);
>  	      if (val == 0) {
>  		IRIns *irc = &T->ir[irs->op2];
> -		lua_assert(irc->o == IR_CONV && irc->op2 == IRCONV_NUM_INT);
> +		lj_assertJ(irc->o == IR_CONV && irc->op2 == IRCONV_NUM_INT,
> +			   "sunk store for parent IR %04d with bad op %d",
> +			   refp - REF_BIAS, irc->o);
>  		val = snap_pref(J, T, map, nent, seen, irc->op1);
>  		val = emitir(IRTN(IR_CONV), val, IRCONV_NUM_INT);
>  	      } else if ((LJ_SOFTFP32 || (LJ_32 && LJ_HASFFI)) &&
> @@ -634,7 +641,9 @@ static void snap_restoreval(jit_State *J, GCtrace *T, ExitState *ex,
>      if (ir->o == IR_KPTR) {
>        o->u64 = (uint64_t)(uintptr_t)ir_kptr(ir);
>      } else {
> -      lua_assert(!(ir->o == IR_KKPTR || ir->o == IR_KNULL));
> +      lj_assertJ(!(ir->o == IR_KKPTR || ir->o == IR_KNULL),
> +		 "restore of const from IR %04d with bad op %d",
> +		 ref - REF_BIAS, ir->o);
>        lj_ir_kvalue(J->L, o, ir);
>      }
>      return;
> @@ -655,13 +664,14 @@ static void snap_restoreval(jit_State *J, GCtrace *T, ExitState *ex,
>        o->u64 = *(uint64_t *)sps;
>  #endif
>      } else {
> -      lua_assert(!irt_ispri(t));  /* PRI refs never have a spill slot. */
> +      lj_assertJ(!irt_ispri(t), "PRI ref with spill slot");
>        setgcV(J->L, o, (GCobj *)(uintptr_t)*(GCSize *)sps, irt_toitype(t));
>      }
>    } else {  /* Restore from register. */
>      Reg r = regsp_reg(rs);
>      if (ra_noreg(r)) {
> -      lua_assert(ir->o == IR_CONV && ir->op2 == IRCONV_NUM_INT);
> +      lj_assertJ(ir->o == IR_CONV && ir->op2 == IRCONV_NUM_INT,
> +		 "restore from IR %04d has no reg", ref - REF_BIAS);
>        snap_restoreval(J, T, ex, snapno, rfilt, ir->op1, o);
>        if (LJ_DUALNUM) setnumV(o, (lua_Number)intV(o));
>        return;
> @@ -689,7 +699,7 @@ static void snap_restoreval(jit_State *J, GCtrace *T, ExitState *ex,
>  
>  #if LJ_HASFFI
>  /* Restore raw data from the trace exit state. */
> -static void snap_restoredata(GCtrace *T, ExitState *ex,
> +static void snap_restoredata(jit_State *J, GCtrace *T, ExitState *ex,
>  			     SnapNo snapno, BloomFilter rfilt,
>  			     IRRef ref, void *dst, CTSize sz)
>  {
> @@ -697,6 +707,7 @@ static void snap_restoredata(GCtrace *T, ExitState *ex,
>    RegSP rs = ir->prev;
>    int32_t *src;
>    uint64_t tmp;
> +  UNUSED(J);
>    if (irref_isk(ref)) {
>      if (ir_isk64(ir)) {
>        src = (int32_t *)&ir[1];
> @@ -719,8 +730,9 @@ static void snap_restoredata(GCtrace *T, ExitState *ex,
>        Reg r = regsp_reg(rs);
>        if (ra_noreg(r)) {
>  	/* Note: this assumes CNEWI is never used for SOFTFP split numbers. */
> -	lua_assert(sz == 8 && ir->o == IR_CONV && ir->op2 == IRCONV_NUM_INT);
> -	snap_restoredata(T, ex, snapno, rfilt, ir->op1, dst, 4);
> +	lj_assertJ(sz == 8 && ir->o == IR_CONV && ir->op2 == IRCONV_NUM_INT,
> +		   "restore from IR %04d has no reg", ref - REF_BIAS);
> +	snap_restoredata(J, T, ex, snapno, rfilt, ir->op1, dst, 4);
>  	*(lua_Number *)dst = (lua_Number)*(int32_t *)dst;
>  	return;
>        }
> @@ -741,7 +753,8 @@ static void snap_restoredata(GCtrace *T, ExitState *ex,
>        if (LJ_64 && LJ_BE && sz == 4) src++;
>      }
>    }
> -  lua_assert(sz == 1 || sz == 2 || sz == 4 || sz == 8);
> +  lj_assertJ(sz == 1 || sz == 2 || sz == 4 || sz == 8,
> +	     "restore from IR %04d with bad size %d", ref - REF_BIAS, sz);
>    if (sz == 4) *(int32_t *)dst = *src;
>    else if (sz == 8) *(int64_t *)dst = *(int64_t *)src;
>    else if (sz == 1) *(int8_t *)dst = (int8_t)*src;
> @@ -754,8 +767,9 @@ static void snap_unsink(jit_State *J, GCtrace *T, ExitState *ex,
>  			SnapNo snapno, BloomFilter rfilt,
>  			IRIns *ir, TValue *o)
>  {
> -  lua_assert(ir->o == IR_TNEW || ir->o == IR_TDUP ||
> -	     ir->o == IR_CNEW || ir->o == IR_CNEWI);
> +  lj_assertJ(ir->o == IR_TNEW || ir->o == IR_TDUP ||
> +	     ir->o == IR_CNEW || ir->o == IR_CNEWI,
> +	     "sunk allocation with bad op %d", ir->o);
>  #if LJ_HASFFI
>    if (ir->o == IR_CNEW || ir->o == IR_CNEWI) {
>      CTState *cts = ctype_cts(J->L);
> @@ -766,13 +780,14 @@ static void snap_unsink(jit_State *J, GCtrace *T, ExitState *ex,
>      setcdataV(J->L, o, cd);
>      if (ir->o == IR_CNEWI) {
>        uint8_t *p = (uint8_t *)cdataptr(cd);
> -      lua_assert(sz == 4 || sz == 8);
> +      lj_assertJ(sz == 4 || sz == 8, "sunk cdata with bad size %d", sz);
>        if (LJ_32 && sz == 8 && ir+1 < T->ir + T->nins && (ir+1)->o == IR_HIOP) {
> -	snap_restoredata(T, ex, snapno, rfilt, (ir+1)->op2, LJ_LE?p+4:p, 4);
> +	snap_restoredata(J, T, ex, snapno, rfilt, (ir+1)->op2,
> +			 LJ_LE ? p+4 : p, 4);
>  	if (LJ_BE) p += 4;
>  	sz = 4;
>        }
> -      snap_restoredata(T, ex, snapno, rfilt, ir->op2, p, sz);
> +      snap_restoredata(J, T, ex, snapno, rfilt, ir->op2, p, sz);
>      } else {
>        IRIns *irs, *irlast = &T->ir[T->snap[snapno].ref];
>        for (irs = ir+1; irs < irlast; irs++)
> @@ -780,8 +795,11 @@ static void snap_unsink(jit_State *J, GCtrace *T, ExitState *ex,
>  	  IRIns *iro = &T->ir[T->ir[irs->op1].op2];
>  	  uint8_t *p = (uint8_t *)cd;
>  	  CTSize szs;
> -	  lua_assert(irs->o == IR_XSTORE && T->ir[irs->op1].o == IR_ADD);
> -	  lua_assert(iro->o == IR_KINT || iro->o == IR_KINT64);
> +	  lj_assertJ(irs->o == IR_XSTORE, "sunk store with bad op %d", irs->o);
> +	  lj_assertJ(T->ir[irs->op1].o == IR_ADD,
> +		     "sunk store with bad add op %d", T->ir[irs->op1].o);
> +	  lj_assertJ(iro->o == IR_KINT || iro->o == IR_KINT64,
> +		     "sunk store with bad const offset op %d", iro->o);
>  	  if (irt_is64(irs->t)) szs = 8;
>  	  else if (irt_isi8(irs->t) || irt_isu8(irs->t)) szs = 1;
>  	  else if (irt_isi16(irs->t) || irt_isu16(irs->t)) szs = 2;
> @@ -790,14 +808,16 @@ static void snap_unsink(jit_State *J, GCtrace *T, ExitState *ex,
>  	    p += (int64_t)ir_k64(iro)->u64;
>  	  else
>  	    p += iro->i;
> -	  lua_assert(p >= (uint8_t *)cdataptr(cd) &&
> -		     p + szs <= (uint8_t *)cdataptr(cd) + sz);
> +	  lj_assertJ(p >= (uint8_t *)cdataptr(cd) &&
> +		     p + szs <= (uint8_t *)cdataptr(cd) + sz,
> +		     "sunk store with offset out of range");
>  	  if (LJ_32 && irs+1 < T->ir + T->nins && (irs+1)->o == IR_HIOP) {
> -	    lua_assert(szs == 4);
> -	    snap_restoredata(T, ex, snapno, rfilt, (irs+1)->op2, LJ_LE?p+4:p,4);
> +	    lj_assertJ(szs == 4, "sunk store with bad size %d", szs);
> +	    snap_restoredata(J, T, ex, snapno, rfilt, (irs+1)->op2,
> +			     LJ_LE ? p+4 : p, 4);
>  	    if (LJ_BE) p += 4;
>  	  }
> -	  snap_restoredata(T, ex, snapno, rfilt, irs->op2, p, szs);
> +	  snap_restoredata(J, T, ex, snapno, rfilt, irs->op2, p, szs);
>  	}
>      }
>    } else
> @@ -812,10 +832,12 @@ static void snap_unsink(jit_State *J, GCtrace *T, ExitState *ex,
>        if (irs->r == RID_SINK && snap_sunk_store(T, ir, irs)) {
>  	IRIns *irk = &T->ir[irs->op1];
>  	TValue tmp, *val;
> -	lua_assert(irs->o == IR_ASTORE || irs->o == IR_HSTORE ||
> -		   irs->o == IR_FSTORE);
> +	lj_assertJ(irs->o == IR_ASTORE || irs->o == IR_HSTORE ||
> +		   irs->o == IR_FSTORE,
> +		   "sunk store with bad op %d", irs->o);
>  	if (irk->o == IR_FREF) {
> -	  lua_assert(irk->op2 == IRFL_TAB_META);
> +	  lj_assertJ(irk->op2 == IRFL_TAB_META,
> +		     "sunk store with bad field %d", irk->op2);
>  	  snap_restoreval(J, T, ex, snapno, rfilt, irs->op2, &tmp);
>  	  /* NOBARRIER: The table is new (marked white). */
>  	  setgcref(t->metatable, obj2gco(tabV(&tmp)));
> @@ -903,7 +925,7 @@ const BCIns *lj_snap_restore(jit_State *J, void *exptr)
>  #if LJ_FR2
>    L->base += (map[nent+LJ_BE] & 0xff);
>  #endif
> -  lua_assert(map + nent == flinks);
> +  lj_assertJ(map + nent == flinks, "inconsistent frames in snapshot");
>  
>    /* Compute current stack top. */
>    switch (bc_op(*pc)) {
> diff --git a/src/lj_snap.h b/src/lj_snap.h
> index 2c9ae3d6..4aec8509 100644
> --- a/src/lj_snap.h
> +++ b/src/lj_snap.h
> @@ -13,7 +13,8 @@
>  LJ_FUNC void lj_snap_add(jit_State *J);
>  LJ_FUNC void lj_snap_purge(jit_State *J);
>  LJ_FUNC void lj_snap_shrink(jit_State *J);
> -LJ_FUNC IRIns *lj_snap_regspmap(GCtrace *T, SnapNo snapno, IRIns *ir);
> +LJ_FUNC IRIns *lj_snap_regspmap(jit_State *J, GCtrace *T, SnapNo snapno,
> +				IRIns *ir);
>  LJ_FUNC void lj_snap_replay(jit_State *J, GCtrace *T);
>  LJ_FUNC const BCIns *lj_snap_restore(jit_State *J, void *exptr);
>  LJ_FUNC void lj_snap_grow_buf_(jit_State *J, MSize need);
> diff --git a/src/lj_state.c b/src/lj_state.c
> index 4add3d65..684336d5 100644
> --- a/src/lj_state.c
> +++ b/src/lj_state.c
> @@ -70,7 +70,8 @@ static void resizestack(lua_State *L, MSize n)
>    GCobj *up;
>    int32_t oldvmstate = G(L)->vmstate;
>  
> -  lua_assert((MSize)(tvref(L->maxstack)-oldst)==L->stacksize-LJ_STACK_EXTRA-1);
> +  lj_assertL((MSize)(tvref(L->maxstack)-oldst) == L->stacksize-LJ_STACK_EXTRA-1,
> +	     "inconsistent stack size");
>  
>    /*
>    ** Lua stack is inconsistent while reallocation, profilers
> @@ -182,8 +183,9 @@ static void close_state(lua_State *L)
>    global_State *g = G(L);
>    lj_func_closeuv(L, tvref(L->stack));
>    lj_gc_freeall(g);
> -  lua_assert(gcref(g->gc.root) == obj2gco(L));
> -  lua_assert(g->strnum == 0);
> +  lj_assertG(gcref(g->gc.root) == obj2gco(L),
> +	     "main thread is not first GC object");
> +  lj_assertG(g->strnum == 0, "leaked %d strings", g->strnum);
>    lj_trace_freestate(g);
>  #if LJ_HASFFI
>    lj_ctype_freestate(g);
> @@ -197,7 +199,9 @@ static void close_state(lua_State *L)
>      lj_mem_freevec(g, mref(g->gc.lightudseg, uint32_t), segnum, uint32_t);
>    }
>  #endif
> -  lua_assert(g->gc.total == sizeof(GG_State));
> +  lj_assertG(g->gc.total == sizeof(GG_State),
> +	     "memory leak of %lld bytes",
> +	     (long long)(g->gc.total - sizeof(GG_State)));
>  #ifndef LUAJIT_USE_SYSMALLOC
>    if (g->allocf == lj_alloc_f)
>      lj_alloc_destroy(g->allocd);
> @@ -315,17 +319,17 @@ lua_State *lj_state_new(lua_State *L)
>    setmrefr(L1->glref, L->glref);
>    setgcrefr(L1->env, L->env);
>    stack_init(L1, L);  /* init stack */
> -  lua_assert(iswhite(obj2gco(L1)));
> +  lj_assertL(iswhite(obj2gco(L1)), "new thread object is not white");
>    return L1;
>  }
>  
>  void LJ_FASTCALL lj_state_free(global_State *g, lua_State *L)
>  {
> -  lua_assert(L != mainthread(g));
> +  lj_assertG(L != mainthread(g), "free of main thread");
>    if (obj2gco(L) == gcref(g->cur_L))
>      setgcrefnull(g->cur_L);
>    lj_func_closeuv(L, tvref(L->stack));
> -  lua_assert(gcref(L->openupval) == NULL);
> +  lj_assertG(gcref(L->openupval) == NULL, "stale open upvalues");
>    lj_mem_freevec(g, tvref(L->stack), L->stacksize, TValue);
>    lj_mem_freet(g, L);
>  }
> diff --git a/src/lj_str.c b/src/lj_str.c
> index 8ff955ed..321e8c4f 100644
> --- a/src/lj_str.c
> +++ b/src/lj_str.c
> @@ -53,8 +53,9 @@ int32_t LJ_FASTCALL lj_str_cmp(GCstr *a, GCstr *b)
>  static LJ_AINLINE int str_fastcmp(const char *a, const char *b, MSize len)
>  {
>    MSize i = 0;
> -  lua_assert(len > 0);
> -  lua_assert((((uintptr_t)a+len-1) & (LJ_PAGESIZE-1)) <= LJ_PAGESIZE-4);
> +  lj_assertX(len > 0, "fast string compare with zero length");
> +  lj_assertX((((uintptr_t)a+len-1) & (LJ_PAGESIZE-1)) <= LJ_PAGESIZE-4,
> +	     "fast string compare crossing page boundary");
>    do {  /* Note: innocuous access up to end of string + 3. */
>      uint32_t v = lj_getu32(a+i) ^ *(const uint32_t *)(b+i);
>      if (v) {
> @@ -138,7 +139,7 @@ lj_fullhash(const uint8_t *v, MSize len)
>    MSize c = 0xcafedead;
>    MSize d = 0xdeadbeef;
>    MSize h = len;
> -  lua_assert(len >= 12);
> +  lj_assertX(len >= 12, "full hash calculation for too short (%d) string", len);
>    for(; len>8; len-=8, v+=8) {
>      a ^= lj_getu32(v);
>      b ^= lj_getu32(v+4);
> diff --git a/src/lj_strfmt.c b/src/lj_strfmt.c
> index 237cc575..ff5568c3 100644
> --- a/src/lj_strfmt.c
> +++ b/src/lj_strfmt.c
> @@ -320,7 +320,7 @@ SBuf *lj_strfmt_putfxint(SBuf *sb, SFormat sf, uint64_t k)
>    if ((sf & STRFMT_F_LEFT))
>      while (width-- > pprec) *p++ = ' ';
>  
> -  lua_assert(need == (MSize)(p - ps));
> +  lj_assertX(need == (MSize)(p - ps), "miscalculated format size");
>    setsbufP(sb, p);
>    return sb;
>  }
> @@ -449,7 +449,7 @@ const char *lj_strfmt_pushvf(lua_State *L, const char *fmt, va_list argp)
>      case STRFMT_ERR:
>      default:
>        lj_buf_putb(sb, '?');
> -      lua_assert(0);
> +      lj_assertL(0, "bad string format near offset %d", fs.len);
>        break;
>      }
>    }
> diff --git a/src/lj_strfmt.h b/src/lj_strfmt.h
> index 6e1d9017..0e1d8946 100644
> --- a/src/lj_strfmt.h
> +++ b/src/lj_strfmt.h
> @@ -79,7 +79,8 @@ static LJ_AINLINE void lj_strfmt_init(FormatState *fs, const char *p, MSize len)
>  {
>    fs->p = (const uint8_t *)p;
>    fs->e = (const uint8_t *)p + len;
> -  lua_assert(*fs->e == 0);  /* Must be NUL-terminated (may have NULs inside). */
> +  /* Must be NUL-terminated. May have NULs inside, too. */
> +  lj_assertX(*fs->e == 0, "format not NUL-terminated");
>  }
>  
>  /* Raw conversions. */
> diff --git a/src/lj_strfmt_num.c b/src/lj_strfmt_num.c
> index 9271f68a..c26204b7 100644
> --- a/src/lj_strfmt_num.c
> +++ b/src/lj_strfmt_num.c
> @@ -257,7 +257,7 @@ static int nd_similar(uint32_t* nd, uint32_t ndhi, uint32_t* ref, MSize hilen,
>    } else {
>      prec -= hilen - 9;
>    }
> -  lua_assert(prec < 9);
> +  lj_assertX(prec < 9, "bad precision %d", prec);
>    lj_strfmt_wuint9(nd9, nd[ndhi]);
>    lj_strfmt_wuint9(ref9, *ref);
>    return !memcmp(nd9, ref9, prec) && (nd9[prec] < '5') == (ref9[prec] < '5');
> @@ -414,14 +414,14 @@ static char *lj_strfmt_wfnum(SBuf *sb, SFormat sf, lua_Number n, char *p)
>  	** Rescaling was performed, but this introduced some error, and might
>  	** have pushed us across a rounding boundary. We check whether this
>  	** error affected the result by introducing even more error (2ulp in
> -	** either direction), and seeing whether a roundary boundary was
> +	** either direction), and seeing whether a rounding boundary was
>  	** crossed. Having already converted the -2ulp case, we save off its
>  	** most significant digits, convert the +2ulp case, and compare them.
>  	*/
>  	int32_t eidx = e + 70 + (ND_MUL2K_MAX_SHIFT < 29)
>  			 + (t.u32.lo >= 0xfffffffe && !(~t.u32.hi << 12));
>  	const int8_t *m_e = four_ulp_m_e + eidx * 2;
> -	lua_assert(0 <= eidx && eidx < 128);
> +	lj_assertG_(G(sbufL(sb)), 0 <= eidx && eidx < 128, "bad eidx %d", eidx);
>  	nd[33] = nd[ndhi];
>  	nd[32] = nd[(ndhi - 1) & 0x3f];
>  	nd[31] = nd[(ndhi - 2) & 0x3f];
> diff --git a/src/lj_strscan.c b/src/lj_strscan.c
> index 11d341ee..bb07b251 100644
> --- a/src/lj_strscan.c
> +++ b/src/lj_strscan.c
> @@ -93,7 +93,7 @@ static void strscan_double(uint64_t x, TValue *o, int32_t ex2, int32_t neg)
>    }
>  
>    /* Convert to double using a signed int64_t conversion, then rescale. */
> -  lua_assert((int64_t)x >= 0);
> +  lj_assertX((int64_t)x >= 0, "bad double conversion");
>    n = (double)(int64_t)x;
>    if (neg) n = -n;
>    if (ex2) n = ldexp(n, ex2);
> @@ -263,7 +263,7 @@ static StrScanFmt strscan_dec(const uint8_t *p, TValue *o,
>      uint32_t hi = 0, lo = (uint32_t)(xip-xi);
>      int32_t ex2 = 0, idig = (int32_t)lo + (ex10 >> 1);
>  
> -    lua_assert(lo > 0 && (ex10 & 1) == 0);
> +    lj_assertX(lo > 0 && (ex10 & 1) == 0, "bad lo %d ex10 %d", lo, ex10);
>  
>      /* Handle simple overflow/underflow. */
>      if (idig > 310/2) { if (neg) setminfV(o); else setpinfV(o); return fmt; }
> @@ -532,7 +532,7 @@ int LJ_FASTCALL lj_strscan_num(GCstr *str, TValue *o)
>  {
>    StrScanFmt fmt = lj_strscan_scan((const uint8_t *)strdata(str), str->len, o,
>  				   STRSCAN_OPT_TONUM);
> -  lua_assert(fmt == STRSCAN_ERROR || fmt == STRSCAN_NUM);
> +  lj_assertX(fmt == STRSCAN_ERROR || fmt == STRSCAN_NUM, "bad scan format");
>    return (fmt != STRSCAN_ERROR);
>  }
>  
> @@ -541,7 +541,8 @@ int LJ_FASTCALL lj_strscan_number(GCstr *str, TValue *o)
>  {
>    StrScanFmt fmt = lj_strscan_scan((const uint8_t *)strdata(str), str->len, o,
>  				   STRSCAN_OPT_TOINT);
> -  lua_assert(fmt == STRSCAN_ERROR || fmt == STRSCAN_NUM || fmt == STRSCAN_INT);
> +  lj_assertX(fmt == STRSCAN_ERROR || fmt == STRSCAN_NUM || fmt == STRSCAN_INT,
> +	     "bad scan format");
>    if (fmt == STRSCAN_INT) setitype(o, LJ_TISNUM);
>    return (fmt != STRSCAN_ERROR);
>  }
> diff --git a/src/lj_symtab.c b/src/lj_symtab.c
> index 54984c05..38b5e9e1 100644
> --- a/src/lj_symtab.c
> +++ b/src/lj_symtab.c
> @@ -36,8 +36,8 @@ void lj_symtab_dump_trace(struct lj_wbuf *out, const GCtrace *trace)
>    BCLine lineno = 0;
>  
>    const BCIns *startpc = mref(trace->startpc, const BCIns);
> -  lua_assert(startpc >= proto_bc(pt) &&
> -             startpc < proto_bc(pt) + pt->sizebc);
> +  lj_assertX(startpc >= proto_bc(pt) && startpc < proto_bc(pt) + pt->sizebc,
> +	     "start trace PC out of range");
>  
>    lineno = lj_debug_line(pt, proto_bcpos(pt, startpc));
>  
> @@ -354,8 +354,9 @@ static int resolve_symbolnames(struct dl_phdr_info *info, size_t info_size,
>    ** Assertion was taken from the GLIBC tests:
>    ** https://code.woboq.org/userspace/glibc/elf/tst-dlmodcount.c.html#37
>    */
> -  lua_assert(info_size > offsetof(struct dl_phdr_info, dlpi_subs)
> -      + sizeof(info->dlpi_subs));
> +  lj_assertL(info_size > offsetof(struct dl_phdr_info, dlpi_subs)
> +			 + sizeof(info->dlpi_subs),
> +	     "bad dlpi_subs");
>  
>    lib_cnt = info->dlpi_adds - *conf->lib_adds;
>  
> @@ -401,7 +402,7 @@ static int resolve_symbolnames(struct dl_phdr_info *info, size_t info_size,
>        ** sysprof, unless someone have deleted the LuaJIT binary
>        ** right after the start.
>        */
> -      lua_assert(0);
> +      lj_assertL(0, "bad executed binary symtab section");
>    }
>  
>    /*
> diff --git a/src/lj_sysprof.c b/src/lj_sysprof.c
> index 2e9ed9b3..52d4d2a5 100644
> --- a/src/lj_sysprof.c
> +++ b/src/lj_sysprof.c
> @@ -111,9 +111,9 @@ static void stream_epilogue(struct sysprof *sp)
>  
>  static void stream_lfunc(struct lj_wbuf *buf, const GCfunc *func)
>  {
> -  lua_assert(isluafunc(func));
> +  lj_assertX(isluafunc(func), "bad lua function in sysprof stream");
>    const GCproto *pt = funcproto(func);
> -  lua_assert(pt != NULL);
> +  lj_assertX(pt != NULL, "bad lua function prototype in sysprof stream");
>    lj_wbuf_addbyte(buf, LJP_FRAME_LFUNC);
>    lj_wbuf_addu64(buf, (uintptr_t)pt);
>    lj_wbuf_addu64(buf, (uint64_t)pt->firstline);
> @@ -121,14 +121,14 @@ static void stream_lfunc(struct lj_wbuf *buf, const GCfunc *func)
>  
>  static void stream_cfunc(struct lj_wbuf *buf, const GCfunc *func)
>  {
> -  lua_assert(iscfunc(func));
> +  lj_assertX(iscfunc(func), "bad C function in sysprof stream");
>    lj_wbuf_addbyte(buf, LJP_FRAME_CFUNC);
>    lj_wbuf_addu64(buf, (uintptr_t)func->c.f);
>  }
>  
>  static void stream_ffunc(struct lj_wbuf *buf, const GCfunc *func)
>  {
> -  lua_assert(isffunc(func));
> +  lj_assertX(isffunc(func), "bad fast function in sysprof stream");
>    lj_wbuf_addbyte(buf, LJP_FRAME_FFUNC);
>    lj_wbuf_addu64(buf, func->c.ffid);
>  }
> @@ -136,7 +136,7 @@ static void stream_ffunc(struct lj_wbuf *buf, const GCfunc *func)
>  static void stream_frame_lua(struct lj_wbuf *buf, const cTValue *frame)
>  {
>    const GCfunc *func = frame_func(frame);
> -  lua_assert(func != NULL);
> +  lj_assertX(func != NULL, "bad function in sysprof stream");
>    if (isluafunc(func))
>      stream_lfunc(buf, func);
>    else if (isffunc(func))
> @@ -145,7 +145,7 @@ static void stream_frame_lua(struct lj_wbuf *buf, const cTValue *frame)
>      stream_cfunc(buf, func);
>    else
>      /* Unreachable. */
> -    lua_assert(0);
> +    lj_assertX(0, "bad function type in sysprof stream");
>  }
>  
>  static void stream_backtrace_lua(struct sysprof *sp)
> @@ -155,9 +155,9 @@ static void stream_backtrace_lua(struct sysprof *sp)
>    cTValue *top_frame = NULL, *frame = NULL, *bot = NULL;
>    lua_State *L = NULL;
>  
> -  lua_assert(g != NULL);
> +  lj_assertX(g != NULL, "uninitialized global state in sysprof state");
>    L = gco2th(gcref(g->cur_L));
> -  lua_assert(L != NULL);
> +  lj_assertG(L != NULL, "uninitialized Lua state in sysprof state");
>  
>    top_frame = g->top_frame - 1; //(1 + LJ_FR2)
>  
> @@ -200,7 +200,7 @@ static void default_backtrace_host(void *(writer)(int frame_no, void *addr))
>    const int depth = backtrace(backtrace_buf, max_depth);
>    int level;
>  
> -  lua_assert(depth <= max_depth);
> +  lj_assertX(depth <= max_depth, "depth of C stack is too big");
>    for (level = SYSPROF_HANDLER_STACK_DEPTH; level < depth; ++level) {
>      if (!writer(level - SYSPROF_HANDLER_STACK_DEPTH + 1, backtrace_buf[level]))
>        return;
> @@ -209,7 +209,7 @@ static void default_backtrace_host(void *(writer)(int frame_no, void *addr))
>  
>  static void stream_backtrace_host(struct sysprof *sp)
>  {
> -  lua_assert(sp->backtracer != NULL);
> +  lj_assertX(sp->backtracer != NULL, "uninitialized sysprof backtracer");
>    sp->backtracer(stream_frame_host);
>    lj_wbuf_addu64(&sp->out, (uintptr_t)LJP_FRAME_HOST_LAST);
>  }
> @@ -268,9 +268,9 @@ static void stream_event(struct sysprof *sp, uint32_t vmstate)
>  {
>    event_streamer stream = NULL;
>  
> -  lua_assert(vmstfit4(vmstate));
> +  lj_assertX(vmstfit4(vmstate), "vmstate don't fit in 4 bits");
>    stream = event_streamers[vmstate];
> -  lua_assert(NULL != stream);
> +  lj_assertX(stream != NULL, "uninitialized sysprof stream");
>    stream(sp, vmstate);
>  }
>  
> @@ -282,7 +282,8 @@ static void sysprof_record_sample(struct sysprof *sp, siginfo_t *info)
>    uint32_t _vmstate = ~(uint32_t)(g->vmstate);
>    uint32_t vmstate = _vmstate < LJ_VMST_TRACE ? _vmstate : LJ_VMST_TRACE;
>  
> -  lua_assert(pthread_self() == sp->thread);
> +  lj_assertX(pthread_self() == sp->thread,
> +	     "bad thread during sysprof record sample");
>  
>    /* Caveat: order of counters must match vmstate order in <lj_obj.h>. */
>    ((uint64_t *)&sp->counters)[vmstate]++;
> @@ -317,7 +318,7 @@ static void sysprof_signal_handler(int sig, siginfo_t *info, void *ctx)
>        break;
>  
>      default:
> -      lua_assert(0);
> +      lj_assertX(0, "bad sysprof profiler state");
>        break;
>    }
>  }
> @@ -344,7 +345,7 @@ static int sysprof_validate(struct sysprof *sp,
>        return PROFILE_ERRRUN;
>  
>      default:
> -      lua_assert(0);
> +      lj_assertX(0, "bad sysprof profiler state");
>        break;
>    }
>  
> diff --git a/src/lj_tab.c b/src/lj_tab.c
> index c5f358e5..1d6a4b7f 100644
> --- a/src/lj_tab.c
> +++ b/src/lj_tab.c
> @@ -38,7 +38,7 @@ static LJ_AINLINE Node *hashmask(const GCtab *t, uint32_t hash)
>  /* Hash an arbitrary key and return its anchor position in the hash table. */
>  static Node *hashkey(const GCtab *t, cTValue *key)
>  {
> -  lua_assert(!tvisint(key));
> +  lj_assertX(!tvisint(key), "attempt to hash integer");
>    if (tvisstr(key))
>      return hashstr(t, strV(key));
>    else if (tvisnum(key))
> @@ -57,7 +57,7 @@ static LJ_AINLINE void newhpart(lua_State *L, GCtab *t, uint32_t hbits)
>  {
>    uint32_t hsize;
>    Node *node;
> -  lua_assert(hbits != 0);
> +  lj_assertL(hbits != 0, "zero hash size");
>    if (hbits > LJ_MAX_HBITS)
>      lj_err_msg(L, LJ_ERR_TABOV);
>    hsize = 1u << hbits;
> @@ -78,7 +78,7 @@ static LJ_AINLINE void clearhpart(GCtab *t)
>  {
>    uint32_t i, hmask = t->hmask;
>    Node *node = noderef(t->node);
> -  lua_assert(t->hmask != 0);
> +  lj_assertX(t->hmask != 0, "empty hash part");
>    for (i = 0; i <= hmask; i++) {
>      Node *n = &node[i];
>      setmref(n->next, NULL);
> @@ -103,7 +103,7 @@ static GCtab *newtab(lua_State *L, uint32_t asize, uint32_t hbits)
>    /* First try to colocate the array part. */
>    if (LJ_MAX_COLOSIZE != 0 && asize > 0 && asize <= LJ_MAX_COLOSIZE) {
>      Node *nilnode;
> -    lua_assert((sizeof(GCtab) & 7) == 0);
> +    lj_assertL((sizeof(GCtab) & 7) == 0, "bad GCtab size");
>      t = (GCtab *)lj_mem_newgco(L, sizetabcolo(asize));
>      t->gct = ~LJ_TTAB;
>      t->nomm = (uint8_t)~0;
> @@ -186,7 +186,8 @@ GCtab * LJ_FASTCALL lj_tab_dup(lua_State *L, const GCtab *kt)
>    GCtab *t;
>    uint32_t asize, hmask;
>    t = newtab(L, kt->asize, kt->hmask > 0 ? lj_fls(kt->hmask)+1 : 0);
> -  lua_assert(kt->asize == t->asize && kt->hmask == t->hmask);
> +  lj_assertL(kt->asize == t->asize && kt->hmask == t->hmask,
> +	     "mismatched size of table and template");
>    t->nomm = 0;  /* Keys with metamethod names may be present. */
>    asize = kt->asize;
>    if (asize > 0) {
> @@ -312,7 +313,7 @@ void lj_tab_resize(lua_State *L, GCtab *t, uint32_t asize, uint32_t hbits)
>  
>  static uint32_t countint(cTValue *key, uint32_t *bins)
>  {
> -  lua_assert(!tvisint(key));
> +  lj_assertX(!tvisint(key), "bad integer key");
>    if (tvisnum(key)) {
>      lua_Number nk = numV(key);
>      int32_t k = lj_num2int(nk);
> @@ -465,7 +466,8 @@ TValue *lj_tab_newkey(lua_State *L, GCtab *t, cTValue *key)
>    if (!tvisnil(&n->val) || t->hmask == 0) {
>      Node *nodebase = noderef(t->node);
>      Node *collide, *freenode = getfreetop(t, nodebase);
> -    lua_assert(freenode >= nodebase && freenode <= nodebase+t->hmask+1);
> +    lj_assertL(freenode >= nodebase && freenode <= nodebase+t->hmask+1,
> +	       "bad freenode");
>      do {
>        if (freenode == nodebase) {  /* No free node found? */
>  	rehashtab(L, t, key);  /* Rehash table. */
> @@ -473,7 +475,7 @@ TValue *lj_tab_newkey(lua_State *L, GCtab *t, cTValue *key)
>        }
>      } while (!tvisnil(&(--freenode)->key));
>      setfreetop(t, nodebase, freenode);
> -    lua_assert(freenode != &G(L)->nilnode);
> +    lj_assertL(freenode != &G(L)->nilnode, "store to fallback hash");
>      collide = hashkey(t, &n->key);
>      if (collide != n) {  /* Colliding node not the main node? */
>        Node *nn;
> @@ -555,7 +557,7 @@ TValue *lj_tab_newkey(lua_State *L, GCtab *t, cTValue *key)
>    if (LJ_UNLIKELY(tvismzero(&n->key)))
>      n->key.u64 = 0;
>    lj_gc_anybarriert(L, t);
> -  lua_assert(tvisnil(&n->val));
> +  lj_assertL(tvisnil(&n->val), "new hash slot is not empty");
>    return &n->val;
>  }
>  
> diff --git a/src/lj_target.h b/src/lj_target.h
> index 8dcae957..b4be6781 100644
> --- a/src/lj_target.h
> +++ b/src/lj_target.h
> @@ -152,7 +152,8 @@ typedef uint32_t RegCost;
>  /* Return the address of an exit stub. */
>  static LJ_AINLINE char *exitstub_addr_(char **group, uint32_t exitno)
>  {
> -  lua_assert(group[exitno / EXITSTUBS_PER_GROUP] != NULL);
> +  lj_assertX(group[exitno / EXITSTUBS_PER_GROUP] != NULL,
> +	     "exit stub group for exit %d uninitialized", exitno);
>    return (char *)group[exitno / EXITSTUBS_PER_GROUP] +
>  	 EXITSTUB_SPACING*(exitno % EXITSTUBS_PER_GROUP);
>  }
> diff --git a/src/lj_trace.c b/src/lj_trace.c
> index 17743159..236e06a0 100644
> --- a/src/lj_trace.c
> +++ b/src/lj_trace.c
> @@ -110,7 +110,8 @@ static void perftools_addtrace(GCtrace *T)
>      name++;
>    else
>      name = "(string)";
> -  lua_assert(startpc >= proto_bc(pt) && startpc < proto_bc(pt) + pt->sizebc);
> +  lj_assertX(startpc >= proto_bc(pt) && startpc < proto_bc(pt) + pt->sizebc,
> +	     "trace PC out of range");
>    lineno = lj_debug_line(pt, proto_bcpos(pt, startpc));
>    if (!fp) {
>      char fname[40];
> @@ -200,7 +201,7 @@ void lj_trace_reenableproto(GCproto *pt)
>  {
>    if ((pt->flags & PROTO_ILOOP)) {
>      BCIns *bc = proto_bc(pt);
> -    BCPos i, sizebc = pt->sizebc;;
> +    BCPos i, sizebc = pt->sizebc;
>      pt->flags &= ~PROTO_ILOOP;
>      if (bc_op(bc[0]) == BC_IFUNCF)
>        setbc_op(&bc[0], BC_FUNCF);
> @@ -222,27 +223,28 @@ static void trace_unpatch(jit_State *J, GCtrace *T)
>      return;  /* No need to unpatch branches in parent traces (yet). */
>    switch (bc_op(*pc)) {
>    case BC_JFORL:
> -    lua_assert(traceref(J, bc_d(*pc)) == T);
> +    lj_assertJ(traceref(J, bc_d(*pc)) == T, "JFORL references other trace");
>      *pc = T->startins;
>      pc += bc_j(T->startins);
> -    lua_assert(bc_op(*pc) == BC_JFORI);
> +    lj_assertJ(bc_op(*pc) == BC_JFORI, "FORL does not point to JFORI");
>      setbc_op(pc, BC_FORI);
>      break;
>    case BC_JITERL:
>    case BC_JLOOP:
> -    lua_assert(op == BC_ITERL || op == BC_LOOP || bc_isret(op));
> +    lj_assertJ(op == BC_ITERL || op == BC_LOOP || bc_isret(op),
> +	       "bad original bytecode %d", op);
>      *pc = T->startins;
>      break;
>    case BC_JMP:
> -    lua_assert(op == BC_ITERL);
> +    lj_assertJ(op == BC_ITERL, "bad original bytecode %d", op);
>      pc += bc_j(*pc)+2;
>      if (bc_op(*pc) == BC_JITERL) {
> -      lua_assert(traceref(J, bc_d(*pc)) == T);
> +      lj_assertJ(traceref(J, bc_d(*pc)) == T, "JITERL references other trace");
>        *pc = T->startins;
>      }
>      break;
>    case BC_JFUNCF:
> -    lua_assert(op == BC_FUNCF);
> +    lj_assertJ(op == BC_FUNCF, "bad original bytecode %d", op);
>      *pc = T->startins;
>      break;
>    default:  /* Already unpatched. */
> @@ -254,7 +256,8 @@ static void trace_unpatch(jit_State *J, GCtrace *T)
>  static void trace_flushroot(jit_State *J, GCtrace *T)
>  {
>    GCproto *pt = &gcref(T->startpt)->pt;
> -  lua_assert(T->root == 0 && pt != NULL);
> +  lj_assertJ(T->root == 0, "not a root trace");
> +  lj_assertJ(pt != NULL, "trace has no prototype");
>    /* First unpatch any modified bytecode. */
>    trace_unpatch(J, T);
>    /* Unlink root trace from chain anchored in prototype. */
> @@ -370,7 +373,8 @@ void lj_trace_freestate(global_State *g)
>    {  /* This assumes all traces have already been freed. */
>      ptrdiff_t i;
>      for (i = 1; i < (ptrdiff_t)J->sizetrace; i++)
> -      lua_assert(i == (ptrdiff_t)J->cur.traceno || traceref(J, i) == NULL);
> +      lj_assertG(i == (ptrdiff_t)J->cur.traceno || traceref(J, i) == NULL,
> +		 "trace still allocated");
>    }
>  #endif
>    lj_mcode_free(J);
> @@ -425,8 +429,9 @@ static void trace_start(jit_State *J)
>    if ((J->pt->flags & PROTO_NOJIT)) {  /* JIT disabled for this proto? */
>      if (J->parent == 0 && J->exitno == 0) {
>        /* Lazy bytecode patching to disable hotcount events. */
> -      lua_assert(bc_op(*J->pc) == BC_FORL || bc_op(*J->pc) == BC_ITERL ||
> -		 bc_op(*J->pc) == BC_LOOP || bc_op(*J->pc) == BC_FUNCF);
> +      lj_assertJ(bc_op(*J->pc) == BC_FORL || bc_op(*J->pc) == BC_ITERL ||
> +		 bc_op(*J->pc) == BC_LOOP || bc_op(*J->pc) == BC_FUNCF,
> +		 "bad hot bytecode %d", bc_op(*J->pc));
>        setbc_op(J->pc, (int)bc_op(*J->pc)+(int)BC_ILOOP-(int)BC_LOOP);
>        J->pt->flags |= PROTO_ILOOP;
>      }
> @@ -437,7 +442,8 @@ static void trace_start(jit_State *J)
>    /* Get a new trace number. */
>    traceno = trace_findfree(J);
>    if (LJ_UNLIKELY(traceno == 0)) {  /* No free trace? */
> -    lua_assert((J2G(J)->hookmask & HOOK_GC) == 0);
> +    lj_assertJ((J2G(J)->hookmask & HOOK_GC) == 0,
> +	       "recorder called from GC hook");
>      lj_trace_flushall(J->L);
>      J->state = LJ_TRACE_IDLE;  /* Silently ignored. */
>      return;
> @@ -513,7 +519,7 @@ static void trace_stop(jit_State *J)
>      goto addroot;
>    case BC_JMP:
>      /* Patch exit branch in parent to side trace entry. */
> -    lua_assert(J->parent != 0 && J->cur.root != 0);
> +    lj_assertJ(J->parent != 0 && J->cur.root != 0, "not a side trace");
>      lj_asm_patchexit(J, traceref(J, J->parent), J->exitno, J->cur.mcode);
>      /* Avoid compiling a side trace twice (stack resizing uses parent exit). */
>      traceref(J, J->parent)->snap[J->exitno].count = SNAPCOUNT_DONE;
> @@ -532,7 +538,7 @@ static void trace_stop(jit_State *J)
>      traceref(J, J->exitno)->link = traceno;
>      break;
>    default:
> -    lua_assert(0);
> +    lj_assertJ(0, "bad stop bytecode %d", op);
>      break;
>    }
>  
> @@ -553,8 +559,8 @@ static void trace_stop(jit_State *J)
>  static int trace_downrec(jit_State *J)
>  {
>    /* Restart recording at the return instruction. */
> -  lua_assert(J->pt != NULL);
> -  lua_assert(bc_isret(bc_op(*J->pc)));
> +  lj_assertJ(J->pt != NULL, "no active prototype");
> +  lj_assertJ(bc_isret(bc_op(*J->pc)), "not at a return bytecode");
>    if (bc_op(*J->pc) == BC_RETM) {
>      J->ntraceabort++;
>      return 0;  /* NYI: down-recursion with RETM. */
> @@ -774,7 +780,7 @@ static void trace_hotside(jit_State *J, const BCIns *pc)
>        isluafunc(curr_func(J->L)) &&
>        snap->count != SNAPCOUNT_DONE &&
>        ++snap->count >= J->param[JIT_P_hotexit]) {
> -    lua_assert(J->state == LJ_TRACE_IDLE);
> +    lj_assertJ(J->state == LJ_TRACE_IDLE, "hot side exit while recording");
>      /* J->parent is non-zero for a side trace. */
>      J->state = LJ_TRACE_START;
>      lj_trace_ins(J, pc);
> @@ -848,7 +854,7 @@ static TraceNo trace_exit_find(jit_State *J, MCode *pc)
>      if (T && pc >= T->mcode && pc < (MCode *)((char *)T->mcode + T->szmcode))
>        return traceno;
>    }
> -  lua_assert(0);
> +  lj_assertJ(0, "bad exit pc");
>    return 0;
>  }
>  #endif
> @@ -878,13 +884,13 @@ int LJ_FASTCALL lj_trace_exit(jit_State *J, void *exptr)
>    T = traceref(J, J->parent); UNUSED(T);
>  #ifdef EXITSTATE_CHECKEXIT
>    if (J->exitno == T->nsnap) {  /* Treat stack check like a parent exit. */
> -    lua_assert(T->root != 0);
> +    lj_assertJ(T->root != 0, "stack check in root trace");
>      J->exitno = T->ir[REF_BASE].op2;
>      J->parent = T->ir[REF_BASE].op1;
>      T = traceref(J, J->parent);
>    }
>  #endif
> -  lua_assert(T != NULL && J->exitno < T->nsnap);
> +  lj_assertJ(T != NULL && J->exitno < T->nsnap, "bad trace or exit number");
>    exd.J = J;
>    exd.exptr = exptr;
>    errcode = lj_vm_cpcall(L, NULL, &exd, trace_exit_cp);
> @@ -975,14 +981,7 @@ uintptr_t LJ_FASTCALL lj_trace_unwind(jit_State *J, uintptr_t addr, ExitNo *ep)
>      return (uintptr_t)exitstub_trace_addr(T, exitno);
>  #endif
>    }
> -  /* Cannot correlate addr with trace/exit. This will be fatal. */
> -  /*
> -  ** FIXME: The following assert was replaced with
> -  ** the conventional `lua_assert`.
> -  **
> -  ** lj_assertJ(0, "bad exit pc");
> -  */
> -  lua_assert(0);
> +  lj_assertJ(0, "bad exit pc");
>    return 0;
>  }
>  #endif
> diff --git a/src/lj_utils_leb128.c b/src/lj_utils_leb128.c
> index 0d50b839..d66961da 100644
> --- a/src/lj_utils_leb128.c
> +++ b/src/lj_utils_leb128.c
> @@ -9,6 +9,7 @@
>  #define LUA_CORE
>  
>  #include "lj_utils.h"
> +#include "lj_obj.h"
>  
>  #define LINK_BIT          (0x80)
>  #define MIN_TWOBYTE_VALUE (0x80)
> @@ -112,7 +113,7 @@ size_t LJ_FASTCALL lj_utils_write_leb128(uint8_t *buffer, int64_t value)
>    /* Omit LINK_BIT in case of overflow. */
>    buffer[i++] = (uint8_t)(value & PAYLOAD_MASK);
>  
> -  lua_assert(i <= LEB128_U64_MAXSIZE);
> +  lj_assertX(i <= LEB128_U64_MAXSIZE, "bad leb128 size");
>  
>    return i;
>  }
> @@ -126,7 +127,7 @@ size_t LJ_FASTCALL lj_utils_write_uleb128(uint8_t *buffer, uint64_t value)
>  
>    buffer[i++] = (uint8_t)value;
>  
> -  lua_assert(i <= LEB128_U64_MAXSIZE);
> +  lj_assertX(i <= LEB128_U64_MAXSIZE, "bad uleb128 size");
>  
>    return i;
>  }
> diff --git a/src/lj_vmmath.c b/src/lj_vmmath.c
> index 9c0d3fde..14e66687 100644
> --- a/src/lj_vmmath.c
> +++ b/src/lj_vmmath.c
> @@ -60,7 +60,8 @@ double lj_vm_foldarith(double x, double y, int op)
>  int32_t LJ_FASTCALL lj_vm_modi(int32_t a, int32_t b)
>  {
>    uint32_t y, ua, ub;
> -  lua_assert(b != 0);  /* This must be checked before using this function. */
> +  /* This must be checked before using this function. */
> +  lj_assertX(b != 0, "modulo with zero divisor");
>    ua = a < 0 ? (uint32_t)-a : (uint32_t)a;
>    ub = b < 0 ? (uint32_t)-b : (uint32_t)b;
>    y = ua % ub;
> @@ -84,7 +85,7 @@ double lj_vm_log2(double a)
>  static double lj_vm_powui(double x, uint32_t k)
>  {
>    double y;
> -  lua_assert(k != 0);
> +  lj_assertX(k != 0, "pow with zero exponent");
>    for (; (k & 1) == 0; k >>= 1) x *= x;
>    y = x;
>    if ((k >>= 1) != 0) {
> @@ -123,7 +124,7 @@ double lj_vm_foldfpm(double x, int fpm)
>    case IRFPM_SQRT: return sqrt(x);
>    case IRFPM_LOG: return log(x);
>    case IRFPM_LOG2: return lj_vm_log2(x);
> -  default: lua_assert(0);
> +  default: lj_assertX(0, "bad fpm %d", fpm);
>    }
>    return 0;
>  }
> diff --git a/src/lj_wbuf.c b/src/lj_wbuf.c
> index 897ef083..0001a02e 100644
> --- a/src/lj_wbuf.c
> +++ b/src/lj_wbuf.c
> @@ -10,6 +10,7 @@
>  
>  #include <errno.h>
>  
> +#include "lj_obj.h"
>  #include "lj_wbuf.h"
>  #include "lj_utils.h"
>  
> @@ -52,7 +53,7 @@ void LJ_FASTCALL lj_wbuf_terminate(struct lj_wbuf *buf)
>  
>  static LJ_AINLINE void wbuf_reserve(struct lj_wbuf *buf, size_t n)
>  {
> -  lua_assert(n <= buf->size);
> +  lj_assertX(n <= buf->size, "wbuf overflow");
>    if (LJ_UNLIKELY(wbuf_left(buf) < n))
>      lj_wbuf_flush(buf);
>  }
> diff --git a/src/ljamalg.c b/src/ljamalg.c
> index 6ad5289c..0ffc7e81 100644
> --- a/src/ljamalg.c
> +++ b/src/ljamalg.c
> @@ -28,6 +28,7 @@
>  #include "lua.h"
>  #include "lauxlib.h"
>  
> +#include "lj_assert.c"
>  #include "lj_gc.c"
>  #include "lj_err.c"
>  #include "lj_char.c"
> diff --git a/src/luaconf.h b/src/luaconf.h
> index 8029040a..38146008 100644
> --- a/src/luaconf.h
> +++ b/src/luaconf.h
> @@ -146,7 +146,7 @@
>  #define LUALIB_API	LUA_API
>  #define LUAMISC_API	LUA_API
>  
> -/* Support for internal assertions. */
> +/* Compatibility support for assertions. */
>  #if defined(LUA_USE_ASSERT) || defined(LUA_USE_APICHECK)
>  #include <assert.h>
>  #endif
> -- 
> 2.41.0
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 1/5] test: introduce `samevalues()` TAP checker
  2023-08-17 14:03   ` Maxim Kokryashkin via Tarantool-patches
@ 2023-08-17 15:03     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 34+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-08-17 15:03 UTC (permalink / raw)
  To: Maxim Kokryashkin; +Cc: tarantool-patches

Hi, Maxim!
Thanks for the review!
Updated the commit message as you suggested.

On 17.08.23, Maxim Kokryashkin wrote:
> Hi, Sergey!
> Thanks for the patch!
> LGTM, except for a few comments below.
> 
> On Tue, Aug 15, 2023 at 12:36:27PM +0300, Sergey Kaplun wrote:
> > The introduced `samevalues()` helper checks that values in range from
> Typo: s/in range/in the range/

Fixed.

> > 1, to `table.maxn()` of the given table are exactly the same. It may be
> > usefull for test consistency of JIT and VM behaviour. Originally, the
> Typo: s/usefull for test/useful to test the/

Fixed.

> > `arr_is_consistent()` function was introduced in the
> > <tarantool-tests/gh-6163-min-max.test.lua>. `samevalues()` has the same
> > functionallity (except usage of `table.maxn()` instead `#` operator to
> Typo: s/functionallity/functionality/
> Typo: s/except/except for the/
> Typo: s/instead/instead of the/

Fixed all, thanks!

> > be sure, that the table we check isn't a sparse array).
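For illustration, a rough sketch of what such a checker could look like
(hypothetical implementation, not the actual helper added by the patch):

  local function samevalues(t)
    -- table.maxn() instead of the `#` operator, so a sparse array is
    -- not silently truncated at the first nil.
    local n = table.maxn(t)
    for i = 2, n do
      if t[i] ~= t[1] then return false end
    end
    return true
  end

A test can then collect the results of the same expression from the
interpreter and from the compiled trace into one table and feed it to such
a checker.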
> > ---

<snipped>

> > 

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/5] Remove pow() splitting and cleanup backends.
  2023-08-17 14:52   ` Maxim Kokryashkin via Tarantool-patches
@ 2023-08-17 15:33     ` Sergey Kaplun via Tarantool-patches
  2023-08-20  9:48       ` Maxim Kokryashkin via Tarantool-patches
  0 siblings, 1 reply; 34+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-08-17 15:33 UTC (permalink / raw)
  To: Maxim Kokryashkin; +Cc: tarantool-patches

Hi Maxim!
Thanks for the review!
Updated considering your comments.

On 17.08.23, Maxim Kokryashkin wrote:
> Hi, Sergey!
> Thanks for the patch!
> Please consider my comments below.
> 
> On Tue, Aug 15, 2023 at 12:36:28PM +0300, Sergey Kaplun wrote:
> > From: Mike Pall <mike>
> > 
> > (cherry-picked from commit b2307c8ad817e350d65cc909a579ca2f77439682)
> > 
> > The JIT engine tries to split b^c to exp2(c * log2(b)) with attempt to
> Typo: s/with attempt/with an attempt/

Fixed.

> > rejoin them later for some backends. It adds a dependency on C99
> > exp2() and log2(), which aren't part of some libm implementations.
> > Also, for some cases for IEEE754 we can see, that exp2(log2(x)) != x,
> > due to mathematical functions accuracy and double precision
> > restrictions. So, the values on the JIT slots and Lua stack are
> > inconsistent.
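For illustration, the inconsistency described above can be probed with a
snippet along these lines (a rough sketch with arbitrary operands; whether
it actually reproduces depends on the platform's libm):

  jit.opt.start('hotloop=1')
  local b, c = 3.14, 42  -- arbitrary operands
  local results = {}
  for i = 1, 4 do
    -- The first iteration runs in the interpreter; later ones are
    -- recorded and then executed on the compiled trace.
    results[i] = b ^ c
  end
  -- With pow() splitting, an on-trace result may differ from the
  -- interpreted one by an ULP.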
> 
> There is a lot to it. There are changes in emission, fold optimizations,
> narrowing, etc. Maybe it is worth mentioning some key changes that
> happened as a result of that? That way, this changeset is easier to absorb.

It's mentioned below, unless I've misunderstood the idea.

> 
> > 
> > This patch removes splitting of pow operator, so IR_POW is emitting for
> Typo: s/removes/removes the/

Fixed.

> > all cases (except power of 0.5 replaced with sqrt operation).
> Typo: s/except/except for the/
> Typo: s/0.5/0.5, which is/
> Typo: s/with sqrt/with the sqrt/

Fixed all.

> > 
> > Also this patch does some refactoring:
> > 
> > * Functions `asm_pow()`, `asm_mod()`, `asm_ldexp()`, `asm_div()`
> >   (replaced with `asm_fpdiv()` for CPU architectures) are moved to the
> Typo: s/to the/to/

Fixed.

> >   <src/lj_asm.c> as far as their implementation is generic for all
> >   architectures.
> > * Fusing of IR_HREF + IR_EQ/IR_NE moved to a `asm_fuseequal()`.
> Typo: s/moved/was moved/
> Typo: s/to a/to/

Fixed all.

> > * Since `lj_vm_exp2()` subroutine and `IRFPM_EXP2` are removed as no
> >   longer used.
> I can't understand what this sentence means, please rephrase it.

Removed "Since" as measleading.

> > 
> 
> What about changes with `asm_cnew`? I think you should mention them too.

Added.

> > Sergey Kaplun:
> > * added the description and the test for the problem
> > 
> > Part of tarantool/tarantool#8825
> > ---
> >  src/lj_arch.h                                 |   3 -
> >  src/lj_asm.c                                  | 106 +++++++++++-------
> >  src/lj_asm_arm.h                              |  10 +-
> >  src/lj_asm_arm64.h                            |  39 +------
> >  src/lj_asm_mips.h                             |  38 +------
> >  src/lj_asm_ppc.h                              |   9 +-
> >  src/lj_asm_x86.h                              |  37 +-----
> >  src/lj_ir.h                                   |   2 +-
> >  src/lj_ircall.h                               |   1 -
> >  src/lj_opt_fold.c                             |  18 ++-
> >  src/lj_opt_narrow.c                           |  20 +---
> >  src/lj_opt_split.c                            |  21 ----
> >  src/lj_vm.h                                   |   5 -
> >  src/lj_vmmath.c                               |   8 --
> >  .../lj-9-pow-inconsistencies.test.lua         |  63 +++++++++++
> >  15 files changed, 158 insertions(+), 222 deletions(-)
> >  create mode 100644 test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
> > 
> > diff --git a/src/lj_arch.h b/src/lj_arch.h
> > index cf31a291..3bdbe84e 100644
> > --- a/src/lj_arch.h
> > +++ b/src/lj_arch.h
> > @@ -607,9 +607,6 @@
> >  #if defined(__ANDROID__) || defined(__symbian__) || LJ_TARGET_XBOX360 || LJ_TARGET_WINDOWS
> >  #define LUAJIT_NO_LOG2
> >  #endif
> > -#if defined(__symbian__) || LJ_TARGET_WINDOWS
> > -#define LUAJIT_NO_EXP2
> > -#endif
> >  #if LJ_TARGET_CONSOLE || (LJ_TARGET_IOS && __IPHONE_OS_VERSION_MIN_REQUIRED >= __IPHONE_8_0)
> >  #define LJ_NO_SYSTEM		1
> >  #endif
> > diff --git a/src/lj_asm.c b/src/lj_asm.c
> > index b352fd35..a6906b19 100644
> > --- a/src/lj_asm.c
> > +++ b/src/lj_asm.c
> > @@ -1356,32 +1356,6 @@ static void asm_call(ASMState *as, IRIns *ir)
> >    asm_gencall(as, ci, args);
> >  }
> >  
> > -#if !LJ_SOFTFP32
> > -static void asm_fppow(ASMState *as, IRIns *ir, IRRef lref, IRRef rref)
> > -{
> > -  const CCallInfo *ci = &lj_ir_callinfo[IRCALL_pow];
> > -  IRRef args[2];
> > -  args[0] = lref;
> > -  args[1] = rref;
> > -  asm_setupresult(as, ir, ci);
> > -  asm_gencall(as, ci, args);
> > -}
> > -
> > -static int asm_fpjoin_pow(ASMState *as, IRIns *ir)
> > -{
> > -  IRIns *irp = IR(ir->op1);
> > -  if (irp == ir-1 && irp->o == IR_MUL && !ra_used(irp)) {
> > -    IRIns *irpp = IR(irp->op1);
> > -    if (irpp == ir-2 && irpp->o == IR_FPMATH &&
> > -	irpp->op2 == IRFPM_LOG2 && !ra_used(irpp)) {
> > -      asm_fppow(as, ir, irpp->op1, irp->op2);
> > -      return 1;
> > -    }
> > -  }
> > -  return 0;
> > -}
> > -#endif
> > -
> >  /* -- PHI and loop handling ----------------------------------------------- */
> >  
> >  /* Break a PHI cycle by renaming to a free register (evict if needed). */
> > @@ -1652,6 +1626,62 @@ static void asm_loop(ASMState *as)
> >  #error "Missing assembler for target CPU"
> >  #endif
> >  
> > +/* -- Common instruction helpers ------------------------------------------ */
> > +
> > +#if !LJ_SOFTFP32
> > +#if !LJ_TARGET_X86ORX64
> > +#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
> > +#define asm_fppowi(as, ir)	asm_callid(as, ir, IRCALL_lj_vm_powi)
> > +#endif
> > +
> > +static void asm_pow(ASMState *as, IRIns *ir)
> > +{
> > +#if LJ_64 && LJ_HASFFI
> > +  if (!irt_isnum(ir->t))
> > +    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_powi64 :
> > +					  IRCALL_lj_carith_powu64);
> > +  else
> > +#endif
> > +  if (irt_isnum(IR(ir->op2)->t))
> > +    asm_callid(as, ir, IRCALL_pow);
> > +  else
> > +    asm_fppowi(as, ir);
> > +}
> > +
> > +static void asm_div(ASMState *as, IRIns *ir)
> > +{
> > +#if LJ_64 && LJ_HASFFI
> > +  if (!irt_isnum(ir->t))
> > +    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_divi64 :
> > +					  IRCALL_lj_carith_divu64);
> > +  else
> > +#endif
> > +    asm_fpdiv(as, ir);
> > +}
> > +#endif
> > +
> > +static void asm_mod(ASMState *as, IRIns *ir)
> > +{
> > +#if LJ_64 && LJ_HASFFI
> > +  if (!irt_isint(ir->t))
> > +    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_modi64 :
> > +					  IRCALL_lj_carith_modu64);
> > +  else
> > +#endif
> > +    asm_callid(as, ir, IRCALL_lj_vm_modi);
> > +}
> > +
> > +static void asm_fuseequal(ASMState *as, IRIns *ir)
> > +{
> > +  /* Fuse HREF + EQ/NE. */
> > +  if ((ir-1)->o == IR_HREF && ir->op1 == as->curins-1) {
> > +    as->curins--;
> > +    asm_href(as, ir-1, (IROp)ir->o);
> > +  } else {
> > +    asm_equal(as, ir);
> > +  }
> > +}
> > +
> >  /* -- Instruction dispatch ------------------------------------------------ */
> >  
> >  /* Assemble a single instruction. */
> > @@ -1674,14 +1704,7 @@ static void asm_ir(ASMState *as, IRIns *ir)
> >    case IR_ABC:
> >      asm_comp(as, ir);
> >      break;
> > -  case IR_EQ: case IR_NE:
> > -    if ((ir-1)->o == IR_HREF && ir->op1 == as->curins-1) {
> > -      as->curins--;
> > -      asm_href(as, ir-1, (IROp)ir->o);
> > -    } else {
> > -      asm_equal(as, ir);
> > -    }
> > -    break;
> > +  case IR_EQ: case IR_NE: asm_fuseequal(as, ir); break;
> >  
> >    case IR_RETF: asm_retf(as, ir); break;
> >  
> > @@ -1750,7 +1773,13 @@ static void asm_ir(ASMState *as, IRIns *ir)
> >    case IR_SNEW: case IR_XSNEW: asm_snew(as, ir); break;
> >    case IR_TNEW: asm_tnew(as, ir); break;
> >    case IR_TDUP: asm_tdup(as, ir); break;
> > -  case IR_CNEW: case IR_CNEWI: asm_cnew(as, ir); break;
> > +  case IR_CNEW: case IR_CNEWI:
> > +#if LJ_HASFFI
> > +    asm_cnew(as, ir);
> > +#else
> > +    lua_assert(0);
> > +#endif
> > +    break;
> >  
> >    /* Buffer operations. */
> >    case IR_BUFHDR: asm_bufhdr(as, ir); break;
> > @@ -2215,6 +2244,10 @@ static void asm_setup_regsp(ASMState *as)
> >  	if (inloop)
> >  	  as->modset |= RSET_SCRATCH;
> >  #if LJ_TARGET_X86
> > +	if (irt_isnum(IR(ir->op2)->t)) {
> > +	  if (as->evenspill < 4)  /* Leave room to call pow(). */
> > +	    as->evenspill = 4;
> > +	}
> >  	break;
> >  #else
> >  	ir->prev = REGSP_HINT(RID_FPRET);
> > @@ -2240,9 +2273,6 @@ static void asm_setup_regsp(ASMState *as)
> >  	  continue;
> >  	}
> >  	break;
> > -      } else if (ir->op2 == IRFPM_EXP2 && !LJ_64) {
> > -	if (as->evenspill < 4)  /* Leave room to call pow(). */
> > -	  as->evenspill = 4;
> >        }
> >  #endif
> >        if (inloop)
> > diff --git a/src/lj_asm_arm.h b/src/lj_asm_arm.h
> > index 2894e5c9..29a07c80 100644
> > --- a/src/lj_asm_arm.h
> > +++ b/src/lj_asm_arm.h
> > @@ -1275,8 +1275,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
> >    ra_allockreg(as, (int32_t)(sz+sizeof(GCcdata)),
> >  	       ra_releasetmp(as, ASMREF_TMP1));
> >  }
> > -#else
> > -#define asm_cnew(as, ir)	((void)0)
> >  #endif
> >  
> >  /* -- Write barriers ------------------------------------------------------ */
> > @@ -1371,8 +1369,6 @@ static void asm_callround(ASMState *as, IRIns *ir, int id)
> >  
> >  static void asm_fpmath(ASMState *as, IRIns *ir)
> >  {
> > -  if (ir->op2 == IRFPM_EXP2 && asm_fpjoin_pow(as, ir))
> > -    return;
> >    if (ir->op2 <= IRFPM_TRUNC)
> >      asm_callround(as, ir, ir->op2);
> >    else if (ir->op2 == IRFPM_SQRT)
> > @@ -1499,14 +1495,10 @@ static void asm_mul(ASMState *as, IRIns *ir)
> >  #define asm_mulov(as, ir)	asm_mul(as, ir)
> >  
> >  #if !LJ_SOFTFP
> > -#define asm_div(as, ir)		asm_fparith(as, ir, ARMI_VDIV_D)
> > -#define asm_pow(as, ir)		asm_callid(as, ir, IRCALL_lj_vm_powi)
> > +#define asm_fpdiv(as, ir)	asm_fparith(as, ir, ARMI_VDIV_D)
> >  #define asm_abs(as, ir)		asm_fpunary(as, ir, ARMI_VABS_D)
> > -#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
> >  #endif
> >  
> > -#define asm_mod(as, ir)		asm_callid(as, ir, IRCALL_lj_vm_modi)
> > -
> >  static void asm_neg(ASMState *as, IRIns *ir)
> >  {
> >  #if !LJ_SOFTFP
> > diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
> > index aea251a9..c3d6889e 100644
> > --- a/src/lj_asm_arm64.h
> > +++ b/src/lj_asm_arm64.h
> > @@ -1249,8 +1249,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
> >    ra_allockreg(as, (int32_t)(sz+sizeof(GCcdata)),
> >  	       ra_releasetmp(as, ASMREF_TMP1));
> >  }
> > -#else
> > -#define asm_cnew(as, ir)	((void)0)
> >  #endif
> >  
> >  /* -- Write barriers ------------------------------------------------------ */
> > @@ -1327,8 +1325,6 @@ static void asm_fpmath(ASMState *as, IRIns *ir)
> >    } else if (fpm <= IRFPM_TRUNC) {
> >      asm_fpunary(as, ir, fpm == IRFPM_FLOOR ? A64I_FRINTMd :
> >  			fpm == IRFPM_CEIL ? A64I_FRINTPd : A64I_FRINTZd);
> > -  } else if (fpm == IRFPM_EXP2 && asm_fpjoin_pow(as, ir)) {
> > -    return;
> >    } else {
> >      asm_callid(as, ir, IRCALL_lj_vm_floor + fpm);
> >    }
> > @@ -1435,45 +1431,12 @@ static void asm_mul(ASMState *as, IRIns *ir)
> >    asm_intmul(as, ir);
> >  }
> >  
> > -static void asm_div(ASMState *as, IRIns *ir)
> > -{
> > -#if LJ_HASFFI
> > -  if (!irt_isnum(ir->t))
> > -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_divi64 :
> > -					  IRCALL_lj_carith_divu64);
> > -  else
> > -#endif
> > -    asm_fparith(as, ir, A64I_FDIVd);
> > -}
> > -
> > -static void asm_pow(ASMState *as, IRIns *ir)
> > -{
> > -#if LJ_HASFFI
> > -  if (!irt_isnum(ir->t))
> > -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_powi64 :
> > -					  IRCALL_lj_carith_powu64);
> > -  else
> > -#endif
> > -    asm_callid(as, ir, IRCALL_lj_vm_powi);
> > -}
> > -
> >  #define asm_addov(as, ir)	asm_add(as, ir)
> >  #define asm_subov(as, ir)	asm_sub(as, ir)
> >  #define asm_mulov(as, ir)	asm_mul(as, ir)
> >  
> > +#define asm_fpdiv(as, ir)	asm_fparith(as, ir, A64I_FDIVd)
> >  #define asm_abs(as, ir)		asm_fpunary(as, ir, A64I_FABS)
> > -#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
> > -
> > -static void asm_mod(ASMState *as, IRIns *ir)
> > -{
> > -#if LJ_HASFFI
> > -  if (!irt_isint(ir->t))
> > -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_modi64 :
> > -					  IRCALL_lj_carith_modu64);
> > -  else
> > -#endif
> > -    asm_callid(as, ir, IRCALL_lj_vm_modi);
> > -}
> >  
> >  static void asm_neg(ASMState *as, IRIns *ir)
> >  {
> > diff --git a/src/lj_asm_mips.h b/src/lj_asm_mips.h
> > index 4626507b..0f92959b 100644
> > --- a/src/lj_asm_mips.h
> > +++ b/src/lj_asm_mips.h
> > @@ -1613,8 +1613,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
> >    ra_allockreg(as, (int32_t)(sz+sizeof(GCcdata)),
> >  	       ra_releasetmp(as, ASMREF_TMP1));
> >  }
> > -#else
> > -#define asm_cnew(as, ir)	((void)0)
> >  #endif
> >  
> >  /* -- Write barriers ------------------------------------------------------ */
> > @@ -1683,8 +1681,6 @@ static void asm_fpunary(ASMState *as, IRIns *ir, MIPSIns mi)
> >  #if !LJ_SOFTFP32
> >  static void asm_fpmath(ASMState *as, IRIns *ir)
> >  {
> > -  if (ir->op2 == IRFPM_EXP2 && asm_fpjoin_pow(as, ir))
> > -    return;
> >  #if !LJ_SOFTFP
> >    if (ir->op2 <= IRFPM_TRUNC)
> >      asm_callround(as, ir, IRCALL_lj_vm_floor + ir->op2);
> > @@ -1772,41 +1768,13 @@ static void asm_mul(ASMState *as, IRIns *ir)
> >    }
> >  }
> >  
> > -static void asm_mod(ASMState *as, IRIns *ir)
> > -{
> > -#if LJ_64 && LJ_HASFFI
> > -  if (!irt_isint(ir->t))
> > -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_modi64 :
> > -					  IRCALL_lj_carith_modu64);
> > -  else
> > -#endif
> > -    asm_callid(as, ir, IRCALL_lj_vm_modi);
> > -}
> > -
> >  #if !LJ_SOFTFP32
> > -static void asm_pow(ASMState *as, IRIns *ir)
> > -{
> > -#if LJ_64 && LJ_HASFFI
> > -  if (!irt_isnum(ir->t))
> > -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_powi64 :
> > -					  IRCALL_lj_carith_powu64);
> > -  else
> > -#endif
> > -    asm_callid(as, ir, IRCALL_lj_vm_powi);
> > -}
> > -
> > -static void asm_div(ASMState *as, IRIns *ir)
> > +static void asm_fpdiv(ASMState *as, IRIns *ir)
> >  {
> > -#if LJ_64 && LJ_HASFFI
> > -  if (!irt_isnum(ir->t))
> > -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_divi64 :
> > -					  IRCALL_lj_carith_divu64);
> > -  else
> > -#endif
> >  #if !LJ_SOFTFP
> >      asm_fparith(as, ir, MIPSI_DIV_D);
> >  #else
> > -  asm_callid(as, ir, IRCALL_softfp_div);
> > +    asm_callid(as, ir, IRCALL_softfp_div);
> >  #endif
> >  }
> >  #endif
> > @@ -1844,8 +1812,6 @@ static void asm_abs(ASMState *as, IRIns *ir)
> >  }
> >  #endif
> >  
> > -#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
> > -
> >  static void asm_arithov(ASMState *as, IRIns *ir)
> >  {
> >    /* TODO MIPSR6: bovc/bnvc. Caveat: no delay slot to load RID_TMP. */
> > diff --git a/src/lj_asm_ppc.h b/src/lj_asm_ppc.h
> > index 6aaed058..62a5c3e2 100644
> > --- a/src/lj_asm_ppc.h
> > +++ b/src/lj_asm_ppc.h
> > @@ -1177,8 +1177,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
> >    ra_allockreg(as, (int32_t)(sz+sizeof(GCcdata)),
> >  	       ra_releasetmp(as, ASMREF_TMP1));
> >  }
> > -#else
> > -#define asm_cnew(as, ir)	((void)0)
> >  #endif
> >  
> >  /* -- Write barriers ------------------------------------------------------ */
> > @@ -1249,8 +1247,6 @@ static void asm_fpunary(ASMState *as, IRIns *ir, PPCIns pi)
> >  
> >  static void asm_fpmath(ASMState *as, IRIns *ir)
> >  {
> > -  if (ir->op2 == IRFPM_EXP2 && asm_fpjoin_pow(as, ir))
> > -    return;
> >    if (ir->op2 == IRFPM_SQRT && (as->flags & JIT_F_SQRT))
> >      asm_fpunary(as, ir, PPCI_FSQRT);
> >    else
> > @@ -1364,9 +1360,7 @@ static void asm_mul(ASMState *as, IRIns *ir)
> >    }
> >  }
> >  
> > -#define asm_div(as, ir)		asm_fparith(as, ir, PPCI_FDIV)
> > -#define asm_mod(as, ir)		asm_callid(as, ir, IRCALL_lj_vm_modi)
> > -#define asm_pow(as, ir)		asm_callid(as, ir, IRCALL_lj_vm_powi)
> > +#define asm_fpdiv(as, ir)	asm_fparith(as, ir, PPCI_FDIV)
> >  
> >  static void asm_neg(ASMState *as, IRIns *ir)
> >  {
> > @@ -1390,7 +1384,6 @@ static void asm_neg(ASMState *as, IRIns *ir)
> >  }
> >  
> >  #define asm_abs(as, ir)		asm_fpunary(as, ir, PPCI_FABS)
> > -#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
> >  
> >  static void asm_arithov(ASMState *as, IRIns *ir, PPCIns pi)
> >  {
> > diff --git a/src/lj_asm_x86.h b/src/lj_asm_x86.h
> > index 63d332ca..5f5fe3cf 100644
> > --- a/src/lj_asm_x86.h
> > +++ b/src/lj_asm_x86.h
> > @@ -1857,8 +1857,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
> >    asm_gencall(as, ci, args);
> >    emit_loadi(as, ra_releasetmp(as, ASMREF_TMP1), (int32_t)(sz+sizeof(GCcdata)));
> >  }
> > -#else
> > -#define asm_cnew(as, ir)	((void)0)
> >  #endif
> >  
> >  /* -- Write barriers ------------------------------------------------------ */
> > @@ -1964,8 +1962,6 @@ static void asm_fpmath(ASMState *as, IRIns *ir)
> >  		    fpm == IRFPM_CEIL ? lj_vm_ceil_sse : lj_vm_trunc_sse);
> >        ra_left(as, RID_XMM0, ir->op1);
> >      }
> > -  } else if (fpm == IRFPM_EXP2 && asm_fpjoin_pow(as, ir)) {
> > -    /* Rejoined to pow(). */
> >    } else {
> >      asm_callid(as, ir, IRCALL_lj_vm_floor + fpm);
> >    }
> > @@ -2000,17 +1996,6 @@ static void asm_fppowi(ASMState *as, IRIns *ir)
> >    ra_left(as, RID_EAX, ir->op2);
> >  }
> >  
> > -static void asm_pow(ASMState *as, IRIns *ir)
> > -{
> > -#if LJ_64 && LJ_HASFFI
> > -  if (!irt_isnum(ir->t))
> > -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_powi64 :
> > -					  IRCALL_lj_carith_powu64);
> > -  else
> > -#endif
> > -    asm_fppowi(as, ir);
> > -}
> > -
> >  static int asm_swapops(ASMState *as, IRIns *ir)
> >  {
> >    IRIns *irl = IR(ir->op1);
> > @@ -2208,27 +2193,7 @@ static void asm_mul(ASMState *as, IRIns *ir)
> >      asm_intarith(as, ir, XOg_X_IMUL);
> >  }
> >  
> > -static void asm_div(ASMState *as, IRIns *ir)
> > -{
> > -#if LJ_64 && LJ_HASFFI
> > -  if (!irt_isnum(ir->t))
> > -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_divi64 :
> > -					  IRCALL_lj_carith_divu64);
> > -  else
> > -#endif
> > -    asm_fparith(as, ir, XO_DIVSD);
> > -}
> > -
> > -static void asm_mod(ASMState *as, IRIns *ir)
> > -{
> > -#if LJ_64 && LJ_HASFFI
> > -  if (!irt_isint(ir->t))
> > -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_modi64 :
> > -					  IRCALL_lj_carith_modu64);
> > -  else
> > -#endif
> > -    asm_callid(as, ir, IRCALL_lj_vm_modi);
> > -}
> > +#define asm_fpdiv(as, ir)	asm_fparith(as, ir, XO_DIVSD)
> >  
> >  static void asm_neg_not(ASMState *as, IRIns *ir, x86Group3 xg)
> >  {
> > diff --git a/src/lj_ir.h b/src/lj_ir.h
> > index e8bca275..43e55069 100644
> > --- a/src/lj_ir.h
> > +++ b/src/lj_ir.h
> > @@ -177,7 +177,7 @@ LJ_STATIC_ASSERT((int)IR_XLOAD + IRDELTA_L2S == (int)IR_XSTORE);
> >  /* FPMATH sub-functions. ORDER FPM. */
> >  #define IRFPMDEF(_) \
> >    _(FLOOR) _(CEIL) _(TRUNC)  /* Must be first and in this order. */ \
> > -  _(SQRT) _(EXP2) _(LOG) _(LOG2) \
> > +  _(SQRT) _(LOG) _(LOG2) \
> >    _(OTHER)
> >  
> >  typedef enum {
> > diff --git a/src/lj_ircall.h b/src/lj_ircall.h
> > index bbad35b1..af064a6f 100644
> > --- a/src/lj_ircall.h
> > +++ b/src/lj_ircall.h
> > @@ -192,7 +192,6 @@ typedef struct CCallInfo {
> >    _(FPMATH,	lj_vm_ceil,		1,   N, NUM, XA_FP) \
> >    _(FPMATH,	lj_vm_trunc,		1,   N, NUM, XA_FP) \
> >    _(FPMATH,	sqrt,			1,   N, NUM, XA_FP) \
> > -  _(ANY,	lj_vm_exp2,		1,   N, NUM, XA_FP) \
> >    _(ANY,	log,			1,   N, NUM, XA_FP) \
> >    _(ANY,	lj_vm_log2,		1,   N, NUM, XA_FP) \
> >    _(ANY,	lj_vm_powi,		2,   N, NUM, XA_FP) \
> > diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
> > index 27e489af..cd803d87 100644
> > --- a/src/lj_opt_fold.c
> > +++ b/src/lj_opt_fold.c
> > @@ -237,10 +237,11 @@ LJFOLDF(kfold_fpcall2)
> >  }
> >  
> >  LJFOLD(POW KNUM KINT)
> > +LJFOLD(POW KNUM KNUM)
> >  LJFOLDF(kfold_numpow)
> >  {
> >    lua_Number a = knumleft;
> > -  lua_Number b = (lua_Number)fright->i;
> > +  lua_Number b = fright->o == IR_KINT ? (lua_Number)fright->i : knumright;
> >    lua_Number y = lj_vm_foldarith(a, b, IR_POW - IR_ADD);
> >    return lj_ir_knum(J, y);
> >  }
> > @@ -1077,7 +1078,7 @@ LJFOLDF(simplify_nummuldiv_negneg)
> >  }
> >  
> >  LJFOLD(POW any KINT)
> > -LJFOLDF(simplify_numpow_xk)
> > +LJFOLDF(simplify_numpow_xkint)
> >  {
> >    int32_t k = fright->i;
> >    TRef ref = fins->op1;
> > @@ -1106,13 +1107,22 @@ LJFOLDF(simplify_numpow_xk)
> >    return ref;
> >  }
> >  
> > +LJFOLD(POW any KNUM)
> > +LJFOLDF(simplify_numpow_xknum)
> > +{
> > +  if (knumright == 0.5)  /* x ^ 0.5 ==> sqrt(x) */
> > +    return emitir(IRTN(IR_FPMATH), fins->op1, IRFPM_SQRT);
> > +  return NEXTFOLD;
> > +}
> > +
> >  LJFOLD(POW KNUM any)
> >  LJFOLDF(simplify_numpow_kx)
> >  {
> >    lua_Number n = knumleft;
> > -  if (n == 2.0) {  /* 2.0 ^ i ==> ldexp(1.0, tonum(i)) */
> > -    fins->o = IR_CONV;
> > +  if (n == 2.0 && irt_isint(fright->t)) {  /* 2.0 ^ i ==> ldexp(1.0, i) */
> >  #if LJ_TARGET_X86ORX64
> > +    /* Different IR_LDEXP calling convention on x86/x64 requires conversion. */
> > +    fins->o = IR_CONV;
> >      fins->op1 = fins->op2;
> >      fins->op2 = IRCONV_NUM_INT;
> >      fins->op2 = (IRRef1)lj_opt_fold(J);
> > diff --git a/src/lj_opt_narrow.c b/src/lj_opt_narrow.c
> > index bb61f97b..4f285334 100644
> > --- a/src/lj_opt_narrow.c
> > +++ b/src/lj_opt_narrow.c
> > @@ -593,10 +593,10 @@ TRef lj_opt_narrow_pow(jit_State *J, TRef rb, TRef rc, TValue *vb, TValue *vc)
> >    /* Narrowing must be unconditional to preserve (-x)^i semantics. */
> >    if (tvisint(vc) || numisint(numV(vc))) {
> >      int checkrange = 0;
> > -    /* Split pow is faster for bigger exponents. But do this only for (+k)^i. */
> > +    /* pow() is faster for bigger exponents. But do this only for (+k)^i. */
> >      if (tref_isk(rb) && (int32_t)ir_knum(IR(tref_ref(rb)))->u32.hi >= 0) {
> >        int32_t k = numberVint(vc);
> > -      if (!(k >= -65536 && k <= 65536)) goto split_pow;
> > +      if (!(k >= -65536 && k <= 65536)) goto force_pow_num;
> >        checkrange = 1;
> >      }
> >      if (!tref_isinteger(rc)) {
> > @@ -607,19 +607,11 @@ TRef lj_opt_narrow_pow(jit_State *J, TRef rb, TRef rc, TValue *vb, TValue *vc)
> >        TRef tmp = emitir(IRTI(IR_ADD), rc, lj_ir_kint(J, 65536));
> >        emitir(IRTGI(IR_ULE), tmp, lj_ir_kint(J, 2*65536));
> >      }
> > -    return emitir(IRTN(IR_POW), rb, rc);
> > +  } else {
> > +force_pow_num:
> > +    rc = lj_ir_tonum(J, rc);  /* Want POW(num, num), not POW(num, int). */
> >    }
> > -split_pow:
> > -  /* FOLD covers most cases, but some are easier to do here. */
> > -  if (tref_isk(rb) && tvispone(ir_knum(IR(tref_ref(rb)))))
> > -    return rb;  /* 1 ^ x ==> 1 */
> > -  rc = lj_ir_tonum(J, rc);
> > -  if (tref_isk(rc) && ir_knum(IR(tref_ref(rc)))->n == 0.5)
> > -    return emitir(IRTN(IR_FPMATH), rb, IRFPM_SQRT);  /* x ^ 0.5 ==> sqrt(x) */
> > -  /* Split up b^c into exp2(c*log2(b)). Assembler may rejoin later. */
> > -  rb = emitir(IRTN(IR_FPMATH), rb, IRFPM_LOG2);
> > -  rc = emitir(IRTN(IR_MUL), rb, rc);
> > -  return emitir(IRTN(IR_FPMATH), rc, IRFPM_EXP2);
> > +  return emitir(IRTN(IR_POW), rb, rc);
> >  }
> >  
> >  /* -- Predictive narrowing of induction variables ------------------------- */
> > diff --git a/src/lj_opt_split.c b/src/lj_opt_split.c
> > index 2fc36b8d..c10a85cb 100644
> > --- a/src/lj_opt_split.c
> > +++ b/src/lj_opt_split.c
> > @@ -403,27 +403,6 @@ static void split_ir(jit_State *J)
> >  	hi = split_call_li(J, hisubst, oir, ir, IRCALL_lj_vm_powi);
> >  	break;
> >        case IR_FPMATH:
> > -	/* Try to rejoin pow from EXP2, MUL and LOG2. */
> > -	if (nir->op2 == IRFPM_EXP2 && nir->op1 > J->loopref) {
> > -	  IRIns *irp = IR(nir->op1);
> > -	  if (irp->o == IR_CALLN && irp->op2 == IRCALL_softfp_mul) {
> > -	    IRIns *irm4 = IR(irp->op1);
> > -	    IRIns *irm3 = IR(irm4->op1);
> > -	    IRIns *irm12 = IR(irm3->op1);
> > -	    IRIns *irl1 = IR(irm12->op1);
> > -	    if (irm12->op1 > J->loopref && irl1->o == IR_CALLN &&
> > -		irl1->op2 == IRCALL_lj_vm_log2) {
> > -	      IRRef tmp = irl1->op1;  /* Recycle first two args from LOG2. */
> > -	      IRRef arg3 = irm3->op2, arg4 = irm4->op2;
> > -	      J->cur.nins--;
> > -	      tmp = split_emit(J, IRT(IR_CARG, IRT_NIL), tmp, arg3);
> > -	      tmp = split_emit(J, IRT(IR_CARG, IRT_NIL), tmp, arg4);
> > -	      ir->prev = tmp = split_emit(J, IRTI(IR_CALLN), tmp, IRCALL_pow);
> > -	      hi = split_emit(J, IRT(IR_HIOP, IRT_SOFTFP), tmp, tmp);
> > -	      break;
> > -	    }
> > -	  }
> > -	}
> >  	hi = split_call_l(J, hisubst, oir, ir, IRCALL_lj_vm_floor + ir->op2);
> >  	break;
> >        case IR_LDEXP:
> > diff --git a/src/lj_vm.h b/src/lj_vm.h
> > index 411caafa..abaa7c52 100644
> > --- a/src/lj_vm.h
> > +++ b/src/lj_vm.h
> > @@ -95,11 +95,6 @@ LJ_ASMF double lj_vm_trunc(double);
> >  LJ_ASMF double lj_vm_trunc_sf(double);
> >  #endif
> >  #endif
> > -#ifdef LUAJIT_NO_EXP2
> > -LJ_ASMF double lj_vm_exp2(double);
> > -#else
> > -#define lj_vm_exp2	exp2
> > -#endif
> >  #if LJ_HASFFI
> >  LJ_ASMF int lj_vm_errno(void);
> >  #endif
> > diff --git a/src/lj_vmmath.c b/src/lj_vmmath.c
> > index ae4e0f15..9c0d3fde 100644
> > --- a/src/lj_vmmath.c
> > +++ b/src/lj_vmmath.c
> > @@ -79,13 +79,6 @@ double lj_vm_log2(double a)
> >  }
> >  #endif
> >  
> > -#ifdef LUAJIT_NO_EXP2
> > -double lj_vm_exp2(double a)
> > -{
> > -  return exp(a * 0.6931471805599453);
> > -}
> > -#endif
> > -
> >  #if !LJ_TARGET_X86ORX64
> >  /* Unsigned x^k. */
> >  static double lj_vm_powui(double x, uint32_t k)
> > @@ -128,7 +121,6 @@ double lj_vm_foldfpm(double x, int fpm)
> >    case IRFPM_CEIL: return lj_vm_ceil(x);
> >    case IRFPM_TRUNC: return lj_vm_trunc(x);
> >    case IRFPM_SQRT: return sqrt(x);
> > -  case IRFPM_EXP2: return lj_vm_exp2(x);
> >    case IRFPM_LOG: return log(x);
> >    case IRFPM_LOG2: return lj_vm_log2(x);
> >    default: lua_assert(0);
> > diff --git a/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua b/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
> > new file mode 100644
> > index 00000000..21b3a0d9
> > --- /dev/null
> > +++ b/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
> > @@ -0,0 +1,63 @@
> > +local tap = require('tap')
> > +-- Test to demonstrate the incorrect JIT behaviour when splitting
> > +-- IR_POW.
> > +-- See also https://github.com/LuaJIT/LuaJIT/issues/9.
> > +local test = tap.test('lj-9-pow-inconsistencies'):skipcond({
> > +  ['Test requires JIT enabled'] = not jit.status(),
> > +})
> > +
> > +local nan = 0 / 0
> > +local inf = math.huge
> > +
> > +-- Table with some corner cases to check:
> > +local INTERESTING_VALUES = {
> > +  -- 0, -0, 1, -1 special cases with nan, inf, etc..
> > +  0, -0, 1, -1, nan, inf, -inf,
> > +  -- x ^  inf = 0 (inf), if |x| < 1 (|x| > 1).
> > +  -- x ^ -inf = inf (0), if |x| < 1 (|x| > 1).
> > +  0.999999, 1.000001, -0.999999, -1.000001,
> > +}
> > +test:plan(1 + (#INTERESTING_VALUES) ^ 2)
> 
> I suggest renaming it to `CORNER_CASES`, since `INTERESTING_VALUES`
> is not very formal.

Renamed.

> Also, please mention that not all of the possible pairs are faulty
> and most of them are left here for two reasons:
> 1. Improved readability.
> 2. More extensive and change-proof testing.

Added the comment.

> 
> > +
> > +jit.opt.start('hotloop=1')
> > +
> > +-- The JIT engine tries to split b^c to exp2(c * log2(b)).
> > +-- For some cases for IEEE754 we can see, that
> > +-- (double)exp2((double)log2(x)) != x, due to mathematical
> > +-- functions accuracy and double precision restrictions.
> > +-- Just use some numbers to observe this misbehaviour.
> > +local res = {}
> > +local cnt = 1
> > +while cnt < 4 do
> > +  -- XXX: use local variable to prevent folding via parser.
> > +  local b = -0.90000000001
> > +  res[cnt] = 1000 ^ b
> > +  cnt = cnt + 1
> > +end
> 
> Is there a specific reason you decided to use while over for?

I can't remember any specific reason, so I suppose there isn't; replaced it with `for`.

> > +
> > +test:samevalues(res, 'consistent pow operator behaviour for corner case')
> > +
> > +-- Prevent JIT side effects for parent loops.
> > +jit.off()
> > +for i = 1, #INTERESTING_VALUES do
> > +  for j = 1, #INTERESTING_VALUES do
> > +    local b = INTERESTING_VALUES[i]
> > +    local c = INTERESTING_VALUES[j]
> > +    local results = {}
> > +    local counter = 1
> > +    jit.on()
> > +    while counter < 4 do
> > +      results[counter] = b ^ c
> > +      counter = counter + 1
> > +    end
> Same question about for and while.

Fixed.

> > +    -- Prevent JIT side effects.
> > +    jit.off()
> > +    jit.flush()
> Also, I think we should move the part from jit.on() to jit.flush() into
> a separate function.

I don't agree here -- we still use lots of variables from the outer
loops, and I don't want to see any side effects of the function call in
the traces.
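
Just to illustrate (a hedged sketch with a hypothetical `run_pow()`
helper, not actual test code): the proposed extraction would look
roughly like the snippet below, and the extra call frame around the
measured loop is exactly the kind of side effect I'd like to keep away
from the traces.

-- Hypothetical helper, for illustration only.
local function run_pow(b, c)
  local results = {}
  jit.on()
  for k = 1, 4 do
    results[k] = b ^ c
  end
  -- Prevent JIT side effects.
  jit.off()
  jit.flush()
  return results
end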

See other changes in the iterative patch below:

===================================================================
diff --git a/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua b/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
index 21b3a0d9..6abba07f 100644
--- a/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
+++ b/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
@@ -10,14 +10,18 @@ local nan = 0 / 0
 local inf = math.huge
 
 -- Table with some corner cases to check:
-local INTERESTING_VALUES = {
+-- Not all of them fail on each CPU architecture, but bruteforce
+-- is better than custom enumerated usage for two reasons:
+-- * Improved readability.
+-- * More extensive and change-proof testing.
+local CORNER_CASES = {
   -- 0, -0, 1, -1 special cases with nan, inf, etc..
   0, -0, 1, -1, nan, inf, -inf,
   -- x ^  inf = 0 (inf), if |x| < 1 (|x| > 1).
   -- x ^ -inf = inf (0), if |x| < 1 (|x| > 1).
   0.999999, 1.000001, -0.999999, -1.000001,
 }
-test:plan(1 + (#INTERESTING_VALUES) ^ 2)
+test:plan(1 + (#CORNER_CASES) ^ 2)
 
 jit.opt.start('hotloop=1')
 
@@ -27,28 +31,25 @@ jit.opt.start('hotloop=1')
 -- functions accuracy and double precision restrictions.
 -- Just use some numbers to observe this misbehaviour.
 local res = {}
-local cnt = 1
-while cnt < 4 do
+for i = 1, 4 do
   -- XXX: use local variable to prevent folding via parser.
   local b = -0.90000000001
-  res[cnt] = 1000 ^ b
-  cnt = cnt + 1
+  res[i] = 1000 ^ b
 end
 
 test:samevalues(res, 'consistent pow operator behaviour for corner case')
 
 -- Prevent JIT side effects for parent loops.
 jit.off()
-for i = 1, #INTERESTING_VALUES do
-  for j = 1, #INTERESTING_VALUES do
-    local b = INTERESTING_VALUES[i]
-    local c = INTERESTING_VALUES[j]
+for i = 1, #CORNER_CASES do
+  for j = 1, #CORNER_CASES do
+    local b = CORNER_CASES[i]
+    local c = CORNER_CASES[j]
     local results = {}
     local counter = 1
     jit.on()
-    while counter < 4 do
-      results[counter] = b ^ c
-      counter = counter + 1
+    for k = 1, 4 do
+      results[k] = b ^ c
     end
     -- Prevent JIT side effects.
     jit.off()
===================================================================

<snipped>
> > -- 
> > 2.41.0
> > 

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 3/5] Improve assertions.
  2023-08-17 14:58   ` Maxim Kokryashkin via Tarantool-patches
@ 2023-08-18  7:56     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 34+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-08-18  7:56 UTC (permalink / raw)
  To: Maxim Kokryashkin; +Cc: tarantool-patches

Hi, Maxim!
Thanks for the review!
I addressed your comments and force-pushed the branch.

On 17.08.23, Maxim Kokryashkin wrote:
> Hi!
> Thanks for the patch!
> LGTM, except for a few comments below.
> 
> Side note: glad to see that you didn't forget to enable the
> assertions we decided to replace with conventional ones, awesome work!

Nice to hear:).

> On Tue, Aug 15, 2023 at 12:36:29PM +0300, Sergey Kaplun wrote:
> > From: Mike Pall <mike>
> > 
> > (cherry-picked from commit 8ae5170cdc9c307bd81019b3e014391c9fd00581)
> > 
> > This commit refactors assertions used in the LuaJIT. It introduces new
> Typo: s/introduces/introduces the/

Fixed, thanks!

> > module <src/lj_assert.c> with the `lj_assert_fail()` implementation.
> > Wrappers of this function are used across the whole code base. Each
> > macro wrapper is defined in the corresponding module gets global state
> Typo: s/module/module and/

Fixed.

> > (if possible) from its environment to be passed inside the assertion.
> > For now, the global state is unused, but later it may be used for
> > dumping of the VM state.
> Typo: s/of the/the/

Fixed.

> > 
> > Sergey Kaplun:
> > * added the description for the feature
> > 
> > Part of tarantool/tarantool#8825
> > ---

<snipped>

> > -- 
> > 2.41.0
> > 

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 1/5] test: introduce `samevalues()` TAP checker
  2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 1/5] test: introduce `samevalues()` TAP checker Sergey Kaplun via Tarantool-patches
  2023-08-17 14:03   ` Maxim Kokryashkin via Tarantool-patches
@ 2023-08-18 10:43   ` Sergey Bronnikov via Tarantool-patches
  2023-08-18 10:58     ` Sergey Kaplun via Tarantool-patches
  1 sibling, 1 reply; 34+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2023-08-18 10:43 UTC (permalink / raw)
  To: Sergey Kaplun, Maxim Kokryashkin; +Cc: tarantool-patches

Hi, Sergey


Thanks for the patch! See my comments inline.


On 8/15/23 12:36, Sergey Kaplun wrote:
> The introduced `samevalues()` helper checks that values in range from
> 1, to `table.maxn()` of the given table are exactly the same. It may be
> usefull for test consistency of JIT and VM behaviour. Originally, the
> `arr_is_consistent()` function was introduced in the
> <tarantool-tests/gh-6163-min-max.test.lua>. `samevalues()` has the same
> functionallity (except usage of `table.maxn()` instead `#` operator to
> be sure, that the table we check isn't a sparse array).

I would rename samevalues to something like assert_equals or
assert_items_equals, just because similar functions in unit testing
frameworks and helpers are named with the assert_ prefix, which is more
readable from my point of view. See the names for assertions in luatest
[1] and JUnit [2] (a popular unit testing framework).


1. https://github.com/tarantool/luatest#list-of-luatest-functions

2. https://junit.org/junit5/docs/5.0.1/api/org/junit/jupiter/api/Assertions.html

> ---
>   test/tarantool-tests/gh-6163-min-max.test.lua | 52 ++++++++-----------
>   test/tarantool-tests/tap.lua                  | 14 +++++
>   2 files changed, 37 insertions(+), 29 deletions(-)
>
> diff --git a/test/tarantool-tests/gh-6163-min-max.test.lua b/test/tarantool-tests/gh-6163-min-max.test.lua
> index 63437955..4bc6155c 100644
> --- a/test/tarantool-tests/gh-6163-min-max.test.lua
> +++ b/test/tarantool-tests/gh-6163-min-max.test.lua
> @@ -2,25 +2,17 @@ local tap = require('tap')
>   local test = tap.test('gh-6163-jit-min-max'):skipcond({
>     ['Test requires JIT enabled'] = not jit.status(),
>   })
> +
Nit: Unneeded change.

<snipped>


> diff --git a/test/tarantool-tests/tap.lua b/test/tarantool-tests/tap.lua
> index 8559ee52..af1d4b20 100644
> --- a/test/tarantool-tests/tap.lua
> +++ b/test/tarantool-tests/tap.lua
> @@ -254,6 +254,19 @@ local function iscdata(test, v, ctype, message, extra)
>     return ok(test, ffi.istype(ctype, v), message, extra)
>   end
>   
> +local function isnan(v)
> +  return v ~= v
> +end
> +

I would put isnan() into utils, because it is not actually related to
the TAP module and can be useful for other tests too; exporting it in
the tap module would be strange.


> +local function samevalues(test, got, message, extra)
> +  for i = 1, table.maxn(got) - 1 do
> +    if got[i] ~= got[i + 1] and not (isnan(got[i]) and isnan(got[i + 1])) then
> +      return fail(test, message, extra)
> +    end
> +  end
> +  return ok(test, true, message, extra)
> +end
> +
>   local test_mt
>   
>   local function new(parent, name, fun, ...)
> @@ -372,6 +385,7 @@ test_mt = {
>       isudata    = isudata,
>       iscdata    = iscdata,
>       is_deeply  = is_deeply,
> +    samevalues = samevalues,
>       like       = like,
>       unlike     = unlike,
>     }

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 1/5] test: introduce `samevalues()` TAP checker
  2023-08-18 10:43   ` Sergey Bronnikov via Tarantool-patches
@ 2023-08-18 10:58     ` Sergey Kaplun via Tarantool-patches
  2023-08-18 11:12       ` Sergey Bronnikov via Tarantool-patches
  0 siblings, 1 reply; 34+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-08-18 10:58 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review!
Please consider my comments below.

On 18.08.23, Sergey Bronnikov wrote:
> Hi, Sergey
> 
> 
> Thanks for the patch! See my comments inline.
> 
> 
> On 8/15/23 12:36, Sergey Kaplun wrote:
> > The introduced `samevalues()` helper checks that values in range from
> > 1, to `table.maxn()` of the given table are exactly the same. It may be
> > usefull for test consistency of JIT and VM behaviour. Originally, the
> > `arr_is_consistent()` function was introduced in the
> > <tarantool-tests/gh-6163-min-max.test.lua>. `samevalues()` has the same
> > functionallity (except usage of `table.maxn()` instead `#` operator to
> > be sure, that the table we check isn't a sparse array).
> 
> I would rename samevalues to something like assert_equals or 
> assert_items_equals just because
> 
> similar functions are named in unit testing frameworks and helpers with 
> prefix assert_

As you can see, we use names without _ for exported functions in the
<tap.lua> module, so an additional one with a different naming style
would be inconsistent.

Also, I discussed this naming with Igor and Max offline, and this name
is OK for them; feel free to CC Igor to discuss it as well :).

> 
> more readable from my point of view. See names for assertions in luatest 
> [1] and JUnit (popular unit testing framework).
> 
> 
> 1. https://github.com/tarantool/luatest#list-of-luatest-functions
> 
> 2. 
> https://junit.org/junit5/docs/5.0.1/api/org/junit/jupiter/api/Assertions.html
> 
> > ---
> >   test/tarantool-tests/gh-6163-min-max.test.lua | 52 ++++++++-----------
> >   test/tarantool-tests/tap.lua                  | 14 +++++
> >   2 files changed, 37 insertions(+), 29 deletions(-)
> >
> > diff --git a/test/tarantool-tests/gh-6163-min-max.test.lua b/test/tarantool-tests/gh-6163-min-max.test.lua
> > index 63437955..4bc6155c 100644
> > --- a/test/tarantool-tests/gh-6163-min-max.test.lua
> > +++ b/test/tarantool-tests/gh-6163-min-max.test.lua
> > @@ -2,25 +2,17 @@ local tap = require('tap')
> >   local test = tap.test('gh-6163-jit-min-max'):skipcond({
> >     ['Test requires JIT enabled'] = not jit.status(),
> >   })
> > +
> Nit: Unneeded change.

The code with the additional local variable `DUMMY_TABLE` becomes less
readable, so I added the empty line.
Ignoring.

> 
> <snipped>
> 
> 
> > diff --git a/test/tarantool-tests/tap.lua b/test/tarantool-tests/tap.lua
> > index 8559ee52..af1d4b20 100644
> > --- a/test/tarantool-tests/tap.lua
> > +++ b/test/tarantool-tests/tap.lua
> > @@ -254,6 +254,19 @@ local function iscdata(test, v, ctype, message, extra)
> >     return ok(test, ffi.istype(ctype, v), message, extra)
> >   end
> >   
> > +local function isnan(v)
> > +  return v ~= v
> > +end
> > +
> 
> I would put isnan() to utils, because it is not related to TAP module 
> actually and

I don't agree here: the <tap.lua> module should be self-sufficient (as
it is now), so it shouldn't require additional modules like utils.

> 
> can be useful for other tests too and exporting it in tap module will be 

On the contrary: it's OK to export it from this module as part of a
checker, since we may want to check whether some value is NaN or not.
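
For instance, here is a usage sketch (hypothetical test name, assuming
the usual `test:check()`/`os.exit()` ending used by the other tests in
the suite) where `samevalues()` relies on exactly that NaN handling:

local tap = require('tap')
local test = tap.test('nan-consistency-sketch')
test:plan(1)
local nan = 0 / 0
-- NaN ~= NaN, so samevalues() needs isnan() to treat these as equal.
test:samevalues({nan, nan, nan}, 'NaN results are considered consistent')
os.exit(test:check() and 0 or 1)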

> strange.
> 
> 
> > +local function samevalues(test, got, message, extra)
> > +  for i = 1, table.maxn(got) - 1 do
> > +    if got[i] ~= got[i + 1] and not (isnan(got[i]) and isnan(got[i + 1])) then
> > +      return fail(test, message, extra)
> > +    end
> > +  end
> > +  return ok(test, true, message, extra)
> > +end
> > +
> >   local test_mt
> >   
> >   local function new(parent, name, fun, ...)
> > @@ -372,6 +385,7 @@ test_mt = {
> >       isudata    = isudata,
> >       iscdata    = iscdata,
> >       is_deeply  = is_deeply,
> > +    samevalues = samevalues,
> >       like       = like,
> >       unlike     = unlike,
> >     }

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/5] Remove pow() splitting and cleanup backends.
  2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 2/5] Remove pow() splitting and cleanup backends Sergey Kaplun via Tarantool-patches
  2023-08-17 14:52   ` Maxim Kokryashkin via Tarantool-patches
@ 2023-08-18 11:08   ` Sergey Bronnikov via Tarantool-patches
  1 sibling, 0 replies; 34+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2023-08-18 11:08 UTC (permalink / raw)
  To: Sergey Kaplun, Maxim Kokryashkin; +Cc: tarantool-patches

Hi, Sergey


thanks for the patch! LGTM


On 8/15/23 12:36, Sergey Kaplun wrote:
> From: Mike Pall <mike>
>
> (cherry-picked from commit b2307c8ad817e350d65cc909a579ca2f77439682)
>
> The JIT engine tries to split b^c to exp2(c * log2(b)) with attempt to
> rejoin them later for some backends. It adds a dependency on C99
> exp2() and log2(), which aren't part of some libm implementations.
> Also, for some cases for IEEE754 we can see, that exp2(log2(x)) != x,
> due to mathematical functions accuracy and double precision
> restrictions. So, the values on the JIT slots and Lua stack are
> inconsistent.
>
> This patch removes splitting of pow operator, so IR_POW is emitting for
> all cases (except power of 0.5 replaced with sqrt operation).
>
> Also this patch does some refactoring:
>
> * Functions `asm_pow()`, `asm_mod()`, `asm_ldexp()`, `asm_div()`
>    (replaced with `asm_fpdiv()` for CPU architectures) are moved to the
>    <src/lj_asm.c> as far as their implementation is generic for all
>    architectures.
> * Fusing of IR_HREF + IR_EQ/IR_NE moved to a `asm_fuseequal()`.
> * Since `lj_vm_exp2()` subroutine and `IRFPM_EXP2` are removed as no
>    longer used.
>
> Sergey Kaplun:
> * added the description and the test for the problem
>
> Part of tarantool/tarantool#8825
> ---
>   src/lj_arch.h                                 |   3 -
>   src/lj_asm.c                                  | 106 +++++++++++-------
>   src/lj_asm_arm.h                              |  10 +-
>   src/lj_asm_arm64.h                            |  39 +------
>   src/lj_asm_mips.h                             |  38 +------
>   src/lj_asm_ppc.h                              |   9 +-
>   src/lj_asm_x86.h                              |  37 +-----
>   src/lj_ir.h                                   |   2 +-
>   src/lj_ircall.h                               |   1 -
>   src/lj_opt_fold.c                             |  18 ++-
>   src/lj_opt_narrow.c                           |  20 +---
>   src/lj_opt_split.c                            |  21 ----
>   src/lj_vm.h                                   |   5 -
>   src/lj_vmmath.c                               |   8 --
>   .../lj-9-pow-inconsistencies.test.lua         |  63 +++++++++++
>   15 files changed, 158 insertions(+), 222 deletions(-)
>   create mode 100644 test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
>
> diff --git a/src/lj_arch.h b/src/lj_arch.h
> index cf31a291..3bdbe84e 100644
> --- a/src/lj_arch.h
> +++ b/src/lj_arch.h
> @@ -607,9 +607,6 @@
>   #if defined(__ANDROID__) || defined(__symbian__) || LJ_TARGET_XBOX360 || LJ_TARGET_WINDOWS
>   #define LUAJIT_NO_LOG2
>   #endif
> -#if defined(__symbian__) || LJ_TARGET_WINDOWS
> -#define LUAJIT_NO_EXP2
> -#endif
>   #if LJ_TARGET_CONSOLE || (LJ_TARGET_IOS && __IPHONE_OS_VERSION_MIN_REQUIRED >= __IPHONE_8_0)
>   #define LJ_NO_SYSTEM		1
>   #endif
> diff --git a/src/lj_asm.c b/src/lj_asm.c
> index b352fd35..a6906b19 100644
> --- a/src/lj_asm.c
> +++ b/src/lj_asm.c
> @@ -1356,32 +1356,6 @@ static void asm_call(ASMState *as, IRIns *ir)
>     asm_gencall(as, ci, args);
>   }
>   
> -#if !LJ_SOFTFP32
> -static void asm_fppow(ASMState *as, IRIns *ir, IRRef lref, IRRef rref)
> -{
> -  const CCallInfo *ci = &lj_ir_callinfo[IRCALL_pow];
> -  IRRef args[2];
> -  args[0] = lref;
> -  args[1] = rref;
> -  asm_setupresult(as, ir, ci);
> -  asm_gencall(as, ci, args);
> -}
> -
> -static int asm_fpjoin_pow(ASMState *as, IRIns *ir)
> -{
> -  IRIns *irp = IR(ir->op1);
> -  if (irp == ir-1 && irp->o == IR_MUL && !ra_used(irp)) {
> -    IRIns *irpp = IR(irp->op1);
> -    if (irpp == ir-2 && irpp->o == IR_FPMATH &&
> -	irpp->op2 == IRFPM_LOG2 && !ra_used(irpp)) {
> -      asm_fppow(as, ir, irpp->op1, irp->op2);
> -      return 1;
> -    }
> -  }
> -  return 0;
> -}
> -#endif
> -
>   /* -- PHI and loop handling ----------------------------------------------- */
>   
>   /* Break a PHI cycle by renaming to a free register (evict if needed). */
> @@ -1652,6 +1626,62 @@ static void asm_loop(ASMState *as)
>   #error "Missing assembler for target CPU"
>   #endif
>   
> +/* -- Common instruction helpers ------------------------------------------ */
> +
> +#if !LJ_SOFTFP32
> +#if !LJ_TARGET_X86ORX64
> +#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
> +#define asm_fppowi(as, ir)	asm_callid(as, ir, IRCALL_lj_vm_powi)
> +#endif
> +
> +static void asm_pow(ASMState *as, IRIns *ir)
> +{
> +#if LJ_64 && LJ_HASFFI
> +  if (!irt_isnum(ir->t))
> +    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_powi64 :
> +					  IRCALL_lj_carith_powu64);
> +  else
> +#endif
> +  if (irt_isnum(IR(ir->op2)->t))
> +    asm_callid(as, ir, IRCALL_pow);
> +  else
> +    asm_fppowi(as, ir);
> +}
> +
> +static void asm_div(ASMState *as, IRIns *ir)
> +{
> +#if LJ_64 && LJ_HASFFI
> +  if (!irt_isnum(ir->t))
> +    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_divi64 :
> +					  IRCALL_lj_carith_divu64);
> +  else
> +#endif
> +    asm_fpdiv(as, ir);
> +}
> +#endif
> +
> +static void asm_mod(ASMState *as, IRIns *ir)
> +{
> +#if LJ_64 && LJ_HASFFI
> +  if (!irt_isint(ir->t))
> +    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_modi64 :
> +					  IRCALL_lj_carith_modu64);
> +  else
> +#endif
> +    asm_callid(as, ir, IRCALL_lj_vm_modi);
> +}
> +
> +static void asm_fuseequal(ASMState *as, IRIns *ir)
> +{
> +  /* Fuse HREF + EQ/NE. */
> +  if ((ir-1)->o == IR_HREF && ir->op1 == as->curins-1) {
> +    as->curins--;
> +    asm_href(as, ir-1, (IROp)ir->o);
> +  } else {
> +    asm_equal(as, ir);
> +  }
> +}
> +
>   /* -- Instruction dispatch ------------------------------------------------ */
>   
>   /* Assemble a single instruction. */
> @@ -1674,14 +1704,7 @@ static void asm_ir(ASMState *as, IRIns *ir)
>     case IR_ABC:
>       asm_comp(as, ir);
>       break;
> -  case IR_EQ: case IR_NE:
> -    if ((ir-1)->o == IR_HREF && ir->op1 == as->curins-1) {
> -      as->curins--;
> -      asm_href(as, ir-1, (IROp)ir->o);
> -    } else {
> -      asm_equal(as, ir);
> -    }
> -    break;
> +  case IR_EQ: case IR_NE: asm_fuseequal(as, ir); break;
>   
>     case IR_RETF: asm_retf(as, ir); break;
>   
> @@ -1750,7 +1773,13 @@ static void asm_ir(ASMState *as, IRIns *ir)
>     case IR_SNEW: case IR_XSNEW: asm_snew(as, ir); break;
>     case IR_TNEW: asm_tnew(as, ir); break;
>     case IR_TDUP: asm_tdup(as, ir); break;
> -  case IR_CNEW: case IR_CNEWI: asm_cnew(as, ir); break;
> +  case IR_CNEW: case IR_CNEWI:
> +#if LJ_HASFFI
> +    asm_cnew(as, ir);
> +#else
> +    lua_assert(0);
> +#endif
> +    break;
>   
>     /* Buffer operations. */
>     case IR_BUFHDR: asm_bufhdr(as, ir); break;
> @@ -2215,6 +2244,10 @@ static void asm_setup_regsp(ASMState *as)
>   	if (inloop)
>   	  as->modset |= RSET_SCRATCH;
>   #if LJ_TARGET_X86
> +	if (irt_isnum(IR(ir->op2)->t)) {
> +	  if (as->evenspill < 4)  /* Leave room to call pow(). */
> +	    as->evenspill = 4;
> +	}
>   	break;
>   #else
>   	ir->prev = REGSP_HINT(RID_FPRET);
> @@ -2240,9 +2273,6 @@ static void asm_setup_regsp(ASMState *as)
>   	  continue;
>   	}
>   	break;
> -      } else if (ir->op2 == IRFPM_EXP2 && !LJ_64) {
> -	if (as->evenspill < 4)  /* Leave room to call pow(). */
> -	  as->evenspill = 4;
>         }
>   #endif
>         if (inloop)
> diff --git a/src/lj_asm_arm.h b/src/lj_asm_arm.h
> index 2894e5c9..29a07c80 100644
> --- a/src/lj_asm_arm.h
> +++ b/src/lj_asm_arm.h
> @@ -1275,8 +1275,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>     ra_allockreg(as, (int32_t)(sz+sizeof(GCcdata)),
>   	       ra_releasetmp(as, ASMREF_TMP1));
>   }
> -#else
> -#define asm_cnew(as, ir)	((void)0)
>   #endif
>   
>   /* -- Write barriers ------------------------------------------------------ */
> @@ -1371,8 +1369,6 @@ static void asm_callround(ASMState *as, IRIns *ir, int id)
>   
>   static void asm_fpmath(ASMState *as, IRIns *ir)
>   {
> -  if (ir->op2 == IRFPM_EXP2 && asm_fpjoin_pow(as, ir))
> -    return;
>     if (ir->op2 <= IRFPM_TRUNC)
>       asm_callround(as, ir, ir->op2);
>     else if (ir->op2 == IRFPM_SQRT)
> @@ -1499,14 +1495,10 @@ static void asm_mul(ASMState *as, IRIns *ir)
>   #define asm_mulov(as, ir)	asm_mul(as, ir)
>   
>   #if !LJ_SOFTFP
> -#define asm_div(as, ir)		asm_fparith(as, ir, ARMI_VDIV_D)
> -#define asm_pow(as, ir)		asm_callid(as, ir, IRCALL_lj_vm_powi)
> +#define asm_fpdiv(as, ir)	asm_fparith(as, ir, ARMI_VDIV_D)
>   #define asm_abs(as, ir)		asm_fpunary(as, ir, ARMI_VABS_D)
> -#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
>   #endif
>   
> -#define asm_mod(as, ir)		asm_callid(as, ir, IRCALL_lj_vm_modi)
> -
>   static void asm_neg(ASMState *as, IRIns *ir)
>   {
>   #if !LJ_SOFTFP
> diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
> index aea251a9..c3d6889e 100644
> --- a/src/lj_asm_arm64.h
> +++ b/src/lj_asm_arm64.h
> @@ -1249,8 +1249,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>     ra_allockreg(as, (int32_t)(sz+sizeof(GCcdata)),
>   	       ra_releasetmp(as, ASMREF_TMP1));
>   }
> -#else
> -#define asm_cnew(as, ir)	((void)0)
>   #endif
>   
>   /* -- Write barriers ------------------------------------------------------ */
> @@ -1327,8 +1325,6 @@ static void asm_fpmath(ASMState *as, IRIns *ir)
>     } else if (fpm <= IRFPM_TRUNC) {
>       asm_fpunary(as, ir, fpm == IRFPM_FLOOR ? A64I_FRINTMd :
>   			fpm == IRFPM_CEIL ? A64I_FRINTPd : A64I_FRINTZd);
> -  } else if (fpm == IRFPM_EXP2 && asm_fpjoin_pow(as, ir)) {
> -    return;
>     } else {
>       asm_callid(as, ir, IRCALL_lj_vm_floor + fpm);
>     }
> @@ -1435,45 +1431,12 @@ static void asm_mul(ASMState *as, IRIns *ir)
>     asm_intmul(as, ir);
>   }
>   
> -static void asm_div(ASMState *as, IRIns *ir)
> -{
> -#if LJ_HASFFI
> -  if (!irt_isnum(ir->t))
> -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_divi64 :
> -					  IRCALL_lj_carith_divu64);
> -  else
> -#endif
> -    asm_fparith(as, ir, A64I_FDIVd);
> -}
> -
> -static void asm_pow(ASMState *as, IRIns *ir)
> -{
> -#if LJ_HASFFI
> -  if (!irt_isnum(ir->t))
> -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_powi64 :
> -					  IRCALL_lj_carith_powu64);
> -  else
> -#endif
> -    asm_callid(as, ir, IRCALL_lj_vm_powi);
> -}
> -
>   #define asm_addov(as, ir)	asm_add(as, ir)
>   #define asm_subov(as, ir)	asm_sub(as, ir)
>   #define asm_mulov(as, ir)	asm_mul(as, ir)
>   
> +#define asm_fpdiv(as, ir)	asm_fparith(as, ir, A64I_FDIVd)
>   #define asm_abs(as, ir)		asm_fpunary(as, ir, A64I_FABS)
> -#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
> -
> -static void asm_mod(ASMState *as, IRIns *ir)
> -{
> -#if LJ_HASFFI
> -  if (!irt_isint(ir->t))
> -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_modi64 :
> -					  IRCALL_lj_carith_modu64);
> -  else
> -#endif
> -    asm_callid(as, ir, IRCALL_lj_vm_modi);
> -}
>   
>   static void asm_neg(ASMState *as, IRIns *ir)
>   {
> diff --git a/src/lj_asm_mips.h b/src/lj_asm_mips.h
> index 4626507b..0f92959b 100644
> --- a/src/lj_asm_mips.h
> +++ b/src/lj_asm_mips.h
> @@ -1613,8 +1613,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>     ra_allockreg(as, (int32_t)(sz+sizeof(GCcdata)),
>   	       ra_releasetmp(as, ASMREF_TMP1));
>   }
> -#else
> -#define asm_cnew(as, ir)	((void)0)
>   #endif
>   
>   /* -- Write barriers ------------------------------------------------------ */
> @@ -1683,8 +1681,6 @@ static void asm_fpunary(ASMState *as, IRIns *ir, MIPSIns mi)
>   #if !LJ_SOFTFP32
>   static void asm_fpmath(ASMState *as, IRIns *ir)
>   {
> -  if (ir->op2 == IRFPM_EXP2 && asm_fpjoin_pow(as, ir))
> -    return;
>   #if !LJ_SOFTFP
>     if (ir->op2 <= IRFPM_TRUNC)
>       asm_callround(as, ir, IRCALL_lj_vm_floor + ir->op2);
> @@ -1772,41 +1768,13 @@ static void asm_mul(ASMState *as, IRIns *ir)
>     }
>   }
>   
> -static void asm_mod(ASMState *as, IRIns *ir)
> -{
> -#if LJ_64 && LJ_HASFFI
> -  if (!irt_isint(ir->t))
> -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_modi64 :
> -					  IRCALL_lj_carith_modu64);
> -  else
> -#endif
> -    asm_callid(as, ir, IRCALL_lj_vm_modi);
> -}
> -
>   #if !LJ_SOFTFP32
> -static void asm_pow(ASMState *as, IRIns *ir)
> -{
> -#if LJ_64 && LJ_HASFFI
> -  if (!irt_isnum(ir->t))
> -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_powi64 :
> -					  IRCALL_lj_carith_powu64);
> -  else
> -#endif
> -    asm_callid(as, ir, IRCALL_lj_vm_powi);
> -}
> -
> -static void asm_div(ASMState *as, IRIns *ir)
> +static void asm_fpdiv(ASMState *as, IRIns *ir)
>   {
> -#if LJ_64 && LJ_HASFFI
> -  if (!irt_isnum(ir->t))
> -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_divi64 :
> -					  IRCALL_lj_carith_divu64);
> -  else
> -#endif
>   #if !LJ_SOFTFP
>       asm_fparith(as, ir, MIPSI_DIV_D);
>   #else
> -  asm_callid(as, ir, IRCALL_softfp_div);
> +    asm_callid(as, ir, IRCALL_softfp_div);
>   #endif
>   }
>   #endif
> @@ -1844,8 +1812,6 @@ static void asm_abs(ASMState *as, IRIns *ir)
>   }
>   #endif
>   
> -#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
> -
>   static void asm_arithov(ASMState *as, IRIns *ir)
>   {
>     /* TODO MIPSR6: bovc/bnvc. Caveat: no delay slot to load RID_TMP. */
> diff --git a/src/lj_asm_ppc.h b/src/lj_asm_ppc.h
> index 6aaed058..62a5c3e2 100644
> --- a/src/lj_asm_ppc.h
> +++ b/src/lj_asm_ppc.h
> @@ -1177,8 +1177,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>     ra_allockreg(as, (int32_t)(sz+sizeof(GCcdata)),
>   	       ra_releasetmp(as, ASMREF_TMP1));
>   }
> -#else
> -#define asm_cnew(as, ir)	((void)0)
>   #endif
>   
>   /* -- Write barriers ------------------------------------------------------ */
> @@ -1249,8 +1247,6 @@ static void asm_fpunary(ASMState *as, IRIns *ir, PPCIns pi)
>   
>   static void asm_fpmath(ASMState *as, IRIns *ir)
>   {
> -  if (ir->op2 == IRFPM_EXP2 && asm_fpjoin_pow(as, ir))
> -    return;
>     if (ir->op2 == IRFPM_SQRT && (as->flags & JIT_F_SQRT))
>       asm_fpunary(as, ir, PPCI_FSQRT);
>     else
> @@ -1364,9 +1360,7 @@ static void asm_mul(ASMState *as, IRIns *ir)
>     }
>   }
>   
> -#define asm_div(as, ir)		asm_fparith(as, ir, PPCI_FDIV)
> -#define asm_mod(as, ir)		asm_callid(as, ir, IRCALL_lj_vm_modi)
> -#define asm_pow(as, ir)		asm_callid(as, ir, IRCALL_lj_vm_powi)
> +#define asm_fpdiv(as, ir)	asm_fparith(as, ir, PPCI_FDIV)
>   
>   static void asm_neg(ASMState *as, IRIns *ir)
>   {
> @@ -1390,7 +1384,6 @@ static void asm_neg(ASMState *as, IRIns *ir)
>   }
>   
>   #define asm_abs(as, ir)		asm_fpunary(as, ir, PPCI_FABS)
> -#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
>   
>   static void asm_arithov(ASMState *as, IRIns *ir, PPCIns pi)
>   {
> diff --git a/src/lj_asm_x86.h b/src/lj_asm_x86.h
> index 63d332ca..5f5fe3cf 100644
> --- a/src/lj_asm_x86.h
> +++ b/src/lj_asm_x86.h
> @@ -1857,8 +1857,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>     asm_gencall(as, ci, args);
>     emit_loadi(as, ra_releasetmp(as, ASMREF_TMP1), (int32_t)(sz+sizeof(GCcdata)));
>   }
> -#else
> -#define asm_cnew(as, ir)	((void)0)
>   #endif
>   
>   /* -- Write barriers ------------------------------------------------------ */
> @@ -1964,8 +1962,6 @@ static void asm_fpmath(ASMState *as, IRIns *ir)
>   		    fpm == IRFPM_CEIL ? lj_vm_ceil_sse : lj_vm_trunc_sse);
>         ra_left(as, RID_XMM0, ir->op1);
>       }
> -  } else if (fpm == IRFPM_EXP2 && asm_fpjoin_pow(as, ir)) {
> -    /* Rejoined to pow(). */
>     } else {
>       asm_callid(as, ir, IRCALL_lj_vm_floor + fpm);
>     }
> @@ -2000,17 +1996,6 @@ static void asm_fppowi(ASMState *as, IRIns *ir)
>     ra_left(as, RID_EAX, ir->op2);
>   }
>   
> -static void asm_pow(ASMState *as, IRIns *ir)
> -{
> -#if LJ_64 && LJ_HASFFI
> -  if (!irt_isnum(ir->t))
> -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_powi64 :
> -					  IRCALL_lj_carith_powu64);
> -  else
> -#endif
> -    asm_fppowi(as, ir);
> -}
> -
>   static int asm_swapops(ASMState *as, IRIns *ir)
>   {
>     IRIns *irl = IR(ir->op1);
> @@ -2208,27 +2193,7 @@ static void asm_mul(ASMState *as, IRIns *ir)
>       asm_intarith(as, ir, XOg_X_IMUL);
>   }
>   
> -static void asm_div(ASMState *as, IRIns *ir)
> -{
> -#if LJ_64 && LJ_HASFFI
> -  if (!irt_isnum(ir->t))
> -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_divi64 :
> -					  IRCALL_lj_carith_divu64);
> -  else
> -#endif
> -    asm_fparith(as, ir, XO_DIVSD);
> -}
> -
> -static void asm_mod(ASMState *as, IRIns *ir)
> -{
> -#if LJ_64 && LJ_HASFFI
> -  if (!irt_isint(ir->t))
> -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_modi64 :
> -					  IRCALL_lj_carith_modu64);
> -  else
> -#endif
> -    asm_callid(as, ir, IRCALL_lj_vm_modi);
> -}
> +#define asm_fpdiv(as, ir)	asm_fparith(as, ir, XO_DIVSD)
>   
>   static void asm_neg_not(ASMState *as, IRIns *ir, x86Group3 xg)
>   {
> diff --git a/src/lj_ir.h b/src/lj_ir.h
> index e8bca275..43e55069 100644
> --- a/src/lj_ir.h
> +++ b/src/lj_ir.h
> @@ -177,7 +177,7 @@ LJ_STATIC_ASSERT((int)IR_XLOAD + IRDELTA_L2S == (int)IR_XSTORE);
>   /* FPMATH sub-functions. ORDER FPM. */
>   #define IRFPMDEF(_) \
>     _(FLOOR) _(CEIL) _(TRUNC)  /* Must be first and in this order. */ \
> -  _(SQRT) _(EXP2) _(LOG) _(LOG2) \
> +  _(SQRT) _(LOG) _(LOG2) \
>     _(OTHER)
>   
>   typedef enum {
> diff --git a/src/lj_ircall.h b/src/lj_ircall.h
> index bbad35b1..af064a6f 100644
> --- a/src/lj_ircall.h
> +++ b/src/lj_ircall.h
> @@ -192,7 +192,6 @@ typedef struct CCallInfo {
>     _(FPMATH,	lj_vm_ceil,		1,   N, NUM, XA_FP) \
>     _(FPMATH,	lj_vm_trunc,		1,   N, NUM, XA_FP) \
>     _(FPMATH,	sqrt,			1,   N, NUM, XA_FP) \
> -  _(ANY,	lj_vm_exp2,		1,   N, NUM, XA_FP) \
>     _(ANY,	log,			1,   N, NUM, XA_FP) \
>     _(ANY,	lj_vm_log2,		1,   N, NUM, XA_FP) \
>     _(ANY,	lj_vm_powi,		2,   N, NUM, XA_FP) \
> diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
> index 27e489af..cd803d87 100644
> --- a/src/lj_opt_fold.c
> +++ b/src/lj_opt_fold.c
> @@ -237,10 +237,11 @@ LJFOLDF(kfold_fpcall2)
>   }
>   
>   LJFOLD(POW KNUM KINT)
> +LJFOLD(POW KNUM KNUM)
>   LJFOLDF(kfold_numpow)
>   {
>     lua_Number a = knumleft;
> -  lua_Number b = (lua_Number)fright->i;
> +  lua_Number b = fright->o == IR_KINT ? (lua_Number)fright->i : knumright;
>     lua_Number y = lj_vm_foldarith(a, b, IR_POW - IR_ADD);
>     return lj_ir_knum(J, y);
>   }
> @@ -1077,7 +1078,7 @@ LJFOLDF(simplify_nummuldiv_negneg)
>   }
>   
>   LJFOLD(POW any KINT)
> -LJFOLDF(simplify_numpow_xk)
> +LJFOLDF(simplify_numpow_xkint)
>   {
>     int32_t k = fright->i;
>     TRef ref = fins->op1;
> @@ -1106,13 +1107,22 @@ LJFOLDF(simplify_numpow_xk)
>     return ref;
>   }
>   
> +LJFOLD(POW any KNUM)
> +LJFOLDF(simplify_numpow_xknum)
> +{
> +  if (knumright == 0.5)  /* x ^ 0.5 ==> sqrt(x) */
> +    return emitir(IRTN(IR_FPMATH), fins->op1, IRFPM_SQRT);
> +  return NEXTFOLD;
> +}
> +
>   LJFOLD(POW KNUM any)
>   LJFOLDF(simplify_numpow_kx)
>   {
>     lua_Number n = knumleft;
> -  if (n == 2.0) {  /* 2.0 ^ i ==> ldexp(1.0, tonum(i)) */
> -    fins->o = IR_CONV;
> +  if (n == 2.0 && irt_isint(fright->t)) {  /* 2.0 ^ i ==> ldexp(1.0, i) */
>   #if LJ_TARGET_X86ORX64
> +    /* Different IR_LDEXP calling convention on x86/x64 requires conversion. */
> +    fins->o = IR_CONV;
>       fins->op1 = fins->op2;
>       fins->op2 = IRCONV_NUM_INT;
>       fins->op2 = (IRRef1)lj_opt_fold(J);
> diff --git a/src/lj_opt_narrow.c b/src/lj_opt_narrow.c
> index bb61f97b..4f285334 100644
> --- a/src/lj_opt_narrow.c
> +++ b/src/lj_opt_narrow.c
> @@ -593,10 +593,10 @@ TRef lj_opt_narrow_pow(jit_State *J, TRef rb, TRef rc, TValue *vb, TValue *vc)
>     /* Narrowing must be unconditional to preserve (-x)^i semantics. */
>     if (tvisint(vc) || numisint(numV(vc))) {
>       int checkrange = 0;
> -    /* Split pow is faster for bigger exponents. But do this only for (+k)^i. */
> +    /* pow() is faster for bigger exponents. But do this only for (+k)^i. */
>       if (tref_isk(rb) && (int32_t)ir_knum(IR(tref_ref(rb)))->u32.hi >= 0) {
>         int32_t k = numberVint(vc);
> -      if (!(k >= -65536 && k <= 65536)) goto split_pow;
> +      if (!(k >= -65536 && k <= 65536)) goto force_pow_num;
>         checkrange = 1;
>       }
>       if (!tref_isinteger(rc)) {
> @@ -607,19 +607,11 @@ TRef lj_opt_narrow_pow(jit_State *J, TRef rb, TRef rc, TValue *vb, TValue *vc)
>         TRef tmp = emitir(IRTI(IR_ADD), rc, lj_ir_kint(J, 65536));
>         emitir(IRTGI(IR_ULE), tmp, lj_ir_kint(J, 2*65536));
>       }
> -    return emitir(IRTN(IR_POW), rb, rc);
> +  } else {
> +force_pow_num:
> +    rc = lj_ir_tonum(J, rc);  /* Want POW(num, num), not POW(num, int). */
>     }
> -split_pow:
> -  /* FOLD covers most cases, but some are easier to do here. */
> -  if (tref_isk(rb) && tvispone(ir_knum(IR(tref_ref(rb)))))
> -    return rb;  /* 1 ^ x ==> 1 */
> -  rc = lj_ir_tonum(J, rc);
> -  if (tref_isk(rc) && ir_knum(IR(tref_ref(rc)))->n == 0.5)
> -    return emitir(IRTN(IR_FPMATH), rb, IRFPM_SQRT);  /* x ^ 0.5 ==> sqrt(x) */
> -  /* Split up b^c into exp2(c*log2(b)). Assembler may rejoin later. */
> -  rb = emitir(IRTN(IR_FPMATH), rb, IRFPM_LOG2);
> -  rc = emitir(IRTN(IR_MUL), rb, rc);
> -  return emitir(IRTN(IR_FPMATH), rc, IRFPM_EXP2);
> +  return emitir(IRTN(IR_POW), rb, rc);
>   }
>   
>   /* -- Predictive narrowing of induction variables ------------------------- */
> diff --git a/src/lj_opt_split.c b/src/lj_opt_split.c
> index 2fc36b8d..c10a85cb 100644
> --- a/src/lj_opt_split.c
> +++ b/src/lj_opt_split.c
> @@ -403,27 +403,6 @@ static void split_ir(jit_State *J)
>   	hi = split_call_li(J, hisubst, oir, ir, IRCALL_lj_vm_powi);
>   	break;
>         case IR_FPMATH:
> -	/* Try to rejoin pow from EXP2, MUL and LOG2. */
> -	if (nir->op2 == IRFPM_EXP2 && nir->op1 > J->loopref) {
> -	  IRIns *irp = IR(nir->op1);
> -	  if (irp->o == IR_CALLN && irp->op2 == IRCALL_softfp_mul) {
> -	    IRIns *irm4 = IR(irp->op1);
> -	    IRIns *irm3 = IR(irm4->op1);
> -	    IRIns *irm12 = IR(irm3->op1);
> -	    IRIns *irl1 = IR(irm12->op1);
> -	    if (irm12->op1 > J->loopref && irl1->o == IR_CALLN &&
> -		irl1->op2 == IRCALL_lj_vm_log2) {
> -	      IRRef tmp = irl1->op1;  /* Recycle first two args from LOG2. */
> -	      IRRef arg3 = irm3->op2, arg4 = irm4->op2;
> -	      J->cur.nins--;
> -	      tmp = split_emit(J, IRT(IR_CARG, IRT_NIL), tmp, arg3);
> -	      tmp = split_emit(J, IRT(IR_CARG, IRT_NIL), tmp, arg4);
> -	      ir->prev = tmp = split_emit(J, IRTI(IR_CALLN), tmp, IRCALL_pow);
> -	      hi = split_emit(J, IRT(IR_HIOP, IRT_SOFTFP), tmp, tmp);
> -	      break;
> -	    }
> -	  }
> -	}
>   	hi = split_call_l(J, hisubst, oir, ir, IRCALL_lj_vm_floor + ir->op2);
>   	break;
>         case IR_LDEXP:
> diff --git a/src/lj_vm.h b/src/lj_vm.h
> index 411caafa..abaa7c52 100644
> --- a/src/lj_vm.h
> +++ b/src/lj_vm.h
> @@ -95,11 +95,6 @@ LJ_ASMF double lj_vm_trunc(double);
>   LJ_ASMF double lj_vm_trunc_sf(double);
>   #endif
>   #endif
> -#ifdef LUAJIT_NO_EXP2
> -LJ_ASMF double lj_vm_exp2(double);
> -#else
> -#define lj_vm_exp2	exp2
> -#endif
>   #if LJ_HASFFI
>   LJ_ASMF int lj_vm_errno(void);
>   #endif
> diff --git a/src/lj_vmmath.c b/src/lj_vmmath.c
> index ae4e0f15..9c0d3fde 100644
> --- a/src/lj_vmmath.c
> +++ b/src/lj_vmmath.c
> @@ -79,13 +79,6 @@ double lj_vm_log2(double a)
>   }
>   #endif
>   
> -#ifdef LUAJIT_NO_EXP2
> -double lj_vm_exp2(double a)
> -{
> -  return exp(a * 0.6931471805599453);
> -}
> -#endif
> -
>   #if !LJ_TARGET_X86ORX64
>   /* Unsigned x^k. */
>   static double lj_vm_powui(double x, uint32_t k)
> @@ -128,7 +121,6 @@ double lj_vm_foldfpm(double x, int fpm)
>     case IRFPM_CEIL: return lj_vm_ceil(x);
>     case IRFPM_TRUNC: return lj_vm_trunc(x);
>     case IRFPM_SQRT: return sqrt(x);
> -  case IRFPM_EXP2: return lj_vm_exp2(x);
>     case IRFPM_LOG: return log(x);
>     case IRFPM_LOG2: return lj_vm_log2(x);
>     default: lua_assert(0);
> diff --git a/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua b/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
> new file mode 100644
> index 00000000..21b3a0d9
> --- /dev/null
> +++ b/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
> @@ -0,0 +1,63 @@
> +local tap = require('tap')
> +-- Test to demonstrate the incorrect JIT behaviour when splitting
> +-- IR_POW.
> +-- See also https://github.com/LuaJIT/LuaJIT/issues/9.
> +local test = tap.test('lj-9-pow-inconsistencies'):skipcond({
> +  ['Test requires JIT enabled'] = not jit.status(),
> +})
> +
> +local nan = 0 / 0
> +local inf = math.huge
> +
> +-- Table with some corner cases to check:
> +local INTERESTING_VALUES = {
> +  -- 0, -0, 1, -1 special cases with nan, inf, etc..
> +  0, -0, 1, -1, nan, inf, -inf,
> +  -- x ^  inf = 0 (inf), if |x| < 1 (|x| > 1).
> +  -- x ^ -inf = inf (0), if |x| < 1 (|x| > 1).
> +  0.999999, 1.000001, -0.999999, -1.000001,
> +}
> +test:plan(1 + (#INTERESTING_VALUES) ^ 2)
> +
> +jit.opt.start('hotloop=1')
> +
> +-- The JIT engine tries to split b^c into exp2(c * log2(b)).
> +-- For some inputs under IEEE 754 we can observe that
> +-- (double)exp2((double)log2(x)) != x due to the limited accuracy
> +-- of the mathematical functions and of double precision itself.
> +-- Just use some numbers to observe this misbehaviour.
> +local res = {}
> +local cnt = 1
> +while cnt < 4 do
> +  -- XXX: use local variable to prevent folding via parser.
> +  local b = -0.90000000001
> +  res[cnt] = 1000 ^ b
> +  cnt = cnt + 1
> +end
> +
> +test:samevalues(res, 'consistent pow operator behaviour for corner case')
> +
> +-- Prevent JIT side effects for parent loops.
> +jit.off()
> +for i = 1, #INTERESTING_VALUES do
> +  for j = 1, #INTERESTING_VALUES do
> +    local b = INTERESTING_VALUES[i]
> +    local c = INTERESTING_VALUES[j]
> +    local results = {}
> +    local counter = 1
> +    jit.on()
> +    while counter < 4 do
> +      results[counter] = b ^ c
> +      counter = counter + 1
> +    end
> +    -- Prevent JIT side effects.
> +    jit.off()
> +    jit.flush()
> +    test:samevalues(
> +      results,
> +      ('consistent pow operator behaviour for (%s)^(%s)'):format(b, c)
> +    )
> +  end
> +end
> +
> +test:done(true)
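
For reference, a tiny standalone sketch (plain LuaJIT code, not part of the
patch) of the two computations involved: the direct `b ^ c` and the split
`exp2(c * log2(b))` form used by the removed optimization. With IEEE 754
doubles the two results may differ in the last bits, which is exactly the
inconsistency the test above guards against:

  local b, c = 1000, -0.90000000001
  local direct = b ^ c                                -- pow(b, c)
  local split  = 2 ^ (c * math.log(b) / math.log(2))  -- exp2(c * log2(b))
  -- Print both with full precision; they are not guaranteed to match.
  print(('%.17g  %.17g'):format(direct, split))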

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 1/5] test: introduce `samevalues()` TAP checker
  2023-08-18 10:58     ` Sergey Kaplun via Tarantool-patches
@ 2023-08-18 11:12       ` Sergey Bronnikov via Tarantool-patches
  2023-08-21 10:47         ` Igor Munkin via Tarantool-patches
  0 siblings, 1 reply; 34+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2023-08-18 11:12 UTC (permalink / raw)
  To: Sergey Kaplun, Igor Munkin; +Cc: tarantool-patches

Hi,

On 8/18/23 13:58, Sergey Kaplun wrote:
> Hi, Sergey!
> Thanks for the review!
> Please, consider my comments below.
>
> On 18.08.23, Sergey Bronnikov wrote:
>> Hi, Sergey
>>
>>
>> Thanks for the patch! See my comments inline.
>>
>>
>> On 8/15/23 12:36, Sergey Kaplun wrote:
>>> The introduced `samevalues()` helper checks that the values in the range
>>> from 1 to `table.maxn()` of the given table are exactly the same. It may
>>> be useful to test the consistency of JIT and VM behaviour. Originally,
>>> the `arr_is_consistent()` function was introduced in
>>> <tarantool-tests/gh-6163-min-max.test.lua>. `samevalues()` has the same
>>> functionality (except the usage of `table.maxn()` instead of the `#`
>>> operator, to be sure that the table we check isn't a sparse array).
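
A tiny illustration of the `table.maxn()` vs `#` difference mentioned
above (not from the patch, just for context):

  local t = {1, nil, 3}    -- a table with a hole
  print(#t)                -- unspecified for tables with holes: may be 1 or 3
  print(table.maxn(t))     -- 3: the largest positive numeric index in use
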
>> I would rename samevalues to something like assert_equals or
>> assert_items_equals, just because
>>
>> similar functions in unit testing frameworks and helpers are named
>> with the prefix assert_,
> As you can see, we use naming without _ for the exported functions in
> the <tap.lua> module, so an additional one with a different naming style
> would be inconsistent.
>
> Also, I discussed this naming with Igor and Max offline, and this name
> is OK for them; feel free to CC Igor as well to discuss it :).
>
>> which is more readable from my point of view. See the names of
>> assertions in luatest [1] and JUnit [2] (a popular unit testing
>> framework).
>>
>>
>> 1. https://github.com/tarantool/luatest#list-of-luatest-functions
>>
>> 2.
>> https://junit.org/junit5/docs/5.0.1/api/org/junit/jupiter/api/Assertions.html
>>
>>
Igor, what do you think regarding the naming of the introduced function?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 3/5] Improve assertions.
  2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 3/5] Improve assertions Sergey Kaplun via Tarantool-patches
  2023-08-17 14:58   ` Maxim Kokryashkin via Tarantool-patches
@ 2023-08-18 11:20   ` Sergey Bronnikov via Tarantool-patches
  1 sibling, 0 replies; 34+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2023-08-18 11:20 UTC (permalink / raw)
  To: Sergey Kaplun, Maxim Kokryashkin; +Cc: tarantool-patches

Hi, Sergey

Thanks for the patch! LGTM

On 8/15/23 12:36, Sergey Kaplun wrote:
> From: Mike Pall <mike>
>
> (cherry-picked from commit 8ae5170cdc9c307bd81019b3e014391c9fd00581)
>
> This commit refactors the assertions used in LuaJIT. It introduces a new
> module <src/lj_assert.c> with the `lj_assert_fail()` implementation.
> Wrappers of this function are used across the whole code base. Each
> macro wrapper defined in the corresponding module gets the global state
> (if possible) from its environment to be passed into the assertion.
> For now, the global state is unused, but later it may be used for
> dumping the VM state.
>
> Sergey Kaplun:
> * added the description for the feature
>
> Part of tarantool/tarantool#8825
> ---
>   src/CMakeLists.txt        |   1 +
>   src/Makefile.dep.original |  13 +--
>   src/Makefile.original     |   4 +-
>   src/lib_io.c              |   6 +-
>   src/lib_jit.c             |   4 +-
>   src/lib_misc.c            |  12 +--
>   src/lib_string.c          |   6 +-
>   src/lj_api.c              | 140 +++++++++++++++++---------------
>   src/lj_asm.c              | 130 ++++++++++++++++++------------
>   src/lj_asm_arm.h          | 119 ++++++++++++++++------------
>   src/lj_asm_arm64.h        |  95 ++++++++++++----------
>   src/lj_asm_mips.h         | 151 ++++++++++++++++++++---------------
>   src/lj_asm_ppc.h          | 113 +++++++++++++++-----------
>   src/lj_asm_x86.h          | 161 +++++++++++++++++++++----------------
>   src/lj_assert.c           |  28 +++++++
>   src/lj_bcread.c           |  20 ++---
>   src/lj_bcwrite.c          |  24 ++++--
>   src/lj_buf.c              |   4 +-
>   src/lj_carith.c           |  10 ++-
>   src/lj_ccall.c            |  19 +++--
>   src/lj_ccallback.c        |  42 +++++-----
>   src/lj_cconv.c            |  57 ++++++++------
>   src/lj_cconv.h            |   5 +-
>   src/lj_cdata.c            |  27 ++++---
>   src/lj_cdata.h            |   7 +-
>   src/lj_clib.c             |   6 +-
>   src/lj_cparse.c           |  25 +++---
>   src/lj_crecord.c          |  19 +++--
>   src/lj_ctype.c            |  13 +--
>   src/lj_ctype.h            |  14 +++-
>   src/lj_debug.c            |  18 +++--
>   src/lj_def.h              |  26 ++++--
>   src/lj_dispatch.c         |  11 ++-
>   src/lj_emit_arm.h         |  50 ++++++------
>   src/lj_emit_arm64.h       |  21 ++---
>   src/lj_emit_mips.h        |  22 +++---
>   src/lj_emit_ppc.h         |  12 +--
>   src/lj_emit_x86.h         |  22 +++---
>   src/lj_err.c              |  40 ++--------
>   src/lj_func.c             |  18 +++--
>   src/lj_gc.c               |  78 ++++++++++--------
>   src/lj_gc.h               |   6 +-
>   src/lj_gdbjit.c           |   5 +-
>   src/lj_ir.c               |  31 ++++----
>   src/lj_ir.h               |   5 +-
>   src/lj_jit.h              |   6 ++
>   src/lj_lex.c              |  14 ++--
>   src/lj_lex.h              |   6 ++
>   src/lj_load.c             |   2 +-
>   src/lj_mapi.c             |   2 +-
>   src/lj_mcode.c            |   2 +-
>   src/lj_memprof.c          |  35 ++++----
>   src/lj_meta.c             |   6 +-
>   src/lj_obj.h              |  35 +++++---
>   src/lj_opt_fold.c         |  88 ++++++++++++---------
>   src/lj_opt_loop.c         |   5 +-
>   src/lj_opt_mem.c          |  15 ++--
>   src/lj_opt_narrow.c       |  17 ++--
>   src/lj_opt_split.c        |  22 +++---
>   src/lj_parse.c            | 114 +++++++++++++++------------
>   src/lj_record.c           | 162 +++++++++++++++++++++++---------------
>   src/lj_snap.c             | 100 ++++++++++++++---------
>   src/lj_snap.h             |   3 +-
>   src/lj_state.c            |  18 +++--
>   src/lj_str.c              |   7 +-
>   src/lj_strfmt.c           |   4 +-
>   src/lj_strfmt.h           |   3 +-
>   src/lj_strfmt_num.c       |   6 +-
>   src/lj_strscan.c          |   9 ++-
>   src/lj_symtab.c           |  11 +--
>   src/lj_sysprof.c          |  31 ++++----
>   src/lj_tab.c              |  20 ++---
>   src/lj_target.h           |   3 +-
>   src/lj_trace.c            |  57 +++++++-------
>   src/lj_utils_leb128.c     |   5 +-
>   src/lj_vmmath.c           |   7 +-
>   src/lj_wbuf.c             |   3 +-
>   src/ljamalg.c             |   1 +
>   src/luaconf.h             |   2 +-
>   79 files changed, 1436 insertions(+), 1025 deletions(-)
>   create mode 100644 src/lj_assert.c
>
> diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
> index feeccbde..03338306 100644
> --- a/src/CMakeLists.txt
> +++ b/src/CMakeLists.txt
> @@ -59,6 +59,7 @@ make_source_list(SOURCES_FRONTEND
>   make_source_list(SOURCES_UTILS
>     SOURCES
>       lj_alloc.c
> +    lj_assert.c
>       lj_char.c
>       lj_utils_leb128.c
>       lj_vmmath.c
> diff --git a/src/Makefile.dep.original b/src/Makefile.dep.original
> index 968805ed..d35b6d9a 100644
> --- a/src/Makefile.dep.original
> +++ b/src/Makefile.dep.original
> @@ -54,6 +54,7 @@ lj_asm.o: lj_asm.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h lj_gc.h \
>    lj_ircall.h lj_iropt.h lj_mcode.h lj_trace.h lj_dispatch.h lj_traceerr.h \
>    lj_snap.h lj_asm.h lj_vm.h lj_target.h lj_target_*.h lj_emit_*.h \
>    lj_asm_*.h
> +lj_assert.o: lj_assert.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h
>   lj_bc.o: lj_bc.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h lj_bc.h \
>    lj_bcdef.h
>   lj_bcread.o: lj_bcread.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
> @@ -164,7 +165,7 @@ lj_opt_loop.o: lj_opt_loop.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
>    lj_iropt.h lj_trace.h lj_dispatch.h lj_bc.h lj_traceerr.h lj_snap.h \
>    lj_vm.h
>   lj_opt_mem.o: lj_opt_mem.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
> - lj_tab.h lj_ir.h lj_jit.h lj_iropt.h lj_ircall.h
> + lj_tab.h lj_ir.h lj_jit.h lj_iropt.h lj_ircall.h lj_dispatch.h lj_bc.h
>   lj_opt_narrow.o: lj_opt_narrow.c lj_obj.h lua.h luaconf.h lj_def.h \
>    lj_arch.h lj_bc.h lj_ir.h lj_jit.h lj_iropt.h lj_trace.h lj_dispatch.h \
>    lj_traceerr.h lj_vm.h lj_strscan.h
> @@ -224,15 +225,17 @@ lj_trace.o: lj_trace.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
>    lmisclib.h lj_sysprof.h
>   lj_udata.o: lj_udata.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
>    lj_gc.h lj_udata.h
> -lj_utils_leb128.o: lj_utils_leb128.c lj_utils.h lj_def.h lua.h luaconf.h
> +lj_utils_leb128.o: lj_utils_leb128.c lj_utils.h lj_def.h lua.h luaconf.h \
> + lj_obj.h lj_arch.h
>   lj_vmevent.o: lj_vmevent.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
>    lj_str.h lj_tab.h lj_state.h lj_dispatch.h lj_bc.h lj_jit.h lj_ir.h \
>    lj_vm.h lj_vmevent.h
>   lj_vmmath.o: lj_vmmath.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
>    lj_ir.h lj_vm.h
> -lj_wbuf.o: lj_wbuf.c lj_wbuf.h lj_def.h lua.h luaconf.h lj_utils.h
> -ljamalg.o: ljamalg.c lua.h luaconf.h lauxlib.h lj_gc.c lj_obj.h lj_def.h \
> - lj_arch.h lj_gc.h lj_err.h lj_errmsg.h lj_buf.h lj_str.h lj_tab.h \
> +lj_wbuf.o: lj_wbuf.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \
> + lj_wbuf.h lj_utils.h
> +ljamalg.o: ljamalg.c lua.h luaconf.h lauxlib.h lj_assert.c lj_obj.h lj_def.h \
> + lj_arch.h lj_gc.c lj_gc.h lj_err.h lj_errmsg.h lj_buf.h lj_str.h lj_tab.h \
>    lj_func.h lj_udata.h lj_meta.h lj_state.h lj_frame.h lj_bc.h lj_ctype.h \
>    lj_cdata.h lj_trace.h lj_jit.h lj_ir.h lj_dispatch.h lj_traceerr.h \
>    lj_vm.h lj_err.c lj_debug.h lj_ff.h lj_ffdef.h lj_strfmt.h lj_char.c \
> diff --git a/src/Makefile.original b/src/Makefile.original
> index 22d36a27..8cfe55c2 100644
> --- a/src/Makefile.original
> +++ b/src/Makefile.original
> @@ -499,8 +499,8 @@ LJLIB_O= lib_base.o lib_math.o lib_bit.o lib_string.o lib_table.o \
>   	 lib_misc.o
>   LJLIB_C= $(LJLIB_O:.o=.c)
>   
> -LJCORE_O= lj_gc.o lj_err.o lj_char.o lj_bc.o lj_obj.o lj_buf.o lj_wbuf.o \
> -	  lj_str.o lj_tab.o lj_func.o lj_udata.o lj_meta.o lj_debug.o \
> +LJCORE_O= lj_assert.o lj_gc.o lj_err.o lj_char.o lj_bc.o lj_obj.o lj_buf.o \
> +	  lj_wbuf.o lj_str.o lj_tab.o lj_func.o lj_udata.o lj_meta.o lj_debug.o \
>   	  lj_state.o lj_dispatch.o lj_vmevent.o lj_vmmath.o lj_strscan.o \
>   	  lj_strfmt.o lj_strfmt_num.o lj_api.o lj_mapi.o lj_profile.o \
>   	  lj_profile_timer.o lj_memprof.o lj_symtab.o lj_sysprof.o \
> diff --git a/src/lib_io.c b/src/lib_io.c
> index db995ae6..ef39e535 100644
> --- a/src/lib_io.c
> +++ b/src/lib_io.c
> @@ -101,9 +101,6 @@ static int io_file_close(lua_State *L, IOFileUD *iof)
>       stat = pclose(iof->fp);
>   #elif LJ_TARGET_WINDOWS && !LJ_TARGET_XBOXONE && !LJ_TARGET_UWP
>       stat = _pclose(iof->fp);
> -#else
> -    lua_assert(0);
> -    return 0;
>   #endif
>   #if LJ_52
>       iof->fp = NULL;
> @@ -112,7 +109,8 @@ static int io_file_close(lua_State *L, IOFileUD *iof)
>       ok = (stat != -1);
>   #endif
>     } else {
> -    lua_assert((iof->type & IOFILE_TYPE_MASK) == IOFILE_TYPE_STDF);
> +    lj_assertL((iof->type & IOFILE_TYPE_MASK) == IOFILE_TYPE_STDF,
> +	       "close of unknown FILE* type");
>       setnilV(L->top++);
>       lua_pushliteral(L, "cannot close standard file");
>       return 2;
> diff --git a/src/lib_jit.c b/src/lib_jit.c
> index 40aa2b51..b3c1c93c 100644
> --- a/src/lib_jit.c
> +++ b/src/lib_jit.c
> @@ -227,7 +227,7 @@ LJLIB_CF(jit_util_funcbc)
>     if (pc < pt->sizebc) {
>       BCIns ins = proto_bc(pt)[pc];
>       BCOp op = bc_op(ins);
> -    lua_assert(op < BC__MAX);
> +    lj_assertL(op < BC__MAX, "bad bytecode op %d", op);
>       setintV(L->top, ins);
>       setintV(L->top+1, lj_bc_mode[op]);
>       L->top += 2;
> @@ -491,7 +491,7 @@ static int jitopt_param(jit_State *J, const char *str)
>     int i;
>     for (i = 0; i < JIT_P__MAX; i++) {
>       size_t len = *(const uint8_t *)lst;
> -    lua_assert(len != 0);
> +    lj_assertJ(len != 0, "bad JIT_P_STRING");
>       if (strncmp(str, lst+1, len) == 0 && str[len] == '=') {
>         int32_t n = 0;
>         const char *p = &str[len+1];
> diff --git a/src/lib_misc.c b/src/lib_misc.c
> index 1913a622..ca1d1c75 100644
> --- a/src/lib_misc.c
> +++ b/src/lib_misc.c
> @@ -109,7 +109,7 @@ static size_t buffer_writer_default(const void **buf_addr, size_t len,
>     const void *data = *buf_addr;
>     size_t write_total = 0;
>   
> -  lua_assert(len <= STREAM_BUFFER_SIZE);
> +  lj_assertX(len <= STREAM_BUFFER_SIZE, "stream buffer overflow");
>   
>     for (;;) {
>       const ssize_t written = write(fd, data, len - write_total);
> @@ -127,7 +127,7 @@ static size_t buffer_writer_default(const void **buf_addr, size_t len,
>       }
>   
>       write_total += written;
> -    lua_assert(write_total <= len);
> +    lj_assertX(write_total <= len, "invalid stream buffer write");
>   
>       if (write_total == len)
>         break;
> @@ -168,7 +168,7 @@ static int on_stop_cb_default(void *opt, uint8_t *buf)
>   static int set_output_path(const char *path, struct luam_Sysprof_Options *opt) {
>     struct profile_ctx *ctx = opt->ctx;
>     int fd = 0;
> -  lua_assert(path != NULL);
> +  lj_assertX(path != NULL, "no file to open by sysprof");
>     fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
>     if(fd == -1) {
>       return PROFILE_ERRIO;
> @@ -280,7 +280,7 @@ static int sysprof_error(lua_State *L, int status)
>         return luaL_fileresult(L, 0, NULL);
>   #endif
>       default:
> -      lua_assert(0);
> +      lj_assertL(0, "bad sysprof error %d", status);
>         return 0;
>     }
>   }
> @@ -401,7 +401,7 @@ LJLIB_CF(misc_memprof_start)
>         return luaL_fileresult(L, 0, fname);
>   #endif
>       default:
> -      lua_assert(0);
> +      lj_assertL(0, "bad memprof error %d", memprof_status);
>         return 0;
>       }
>     }
> @@ -430,7 +430,7 @@ LJLIB_CF(misc_memprof_stop)
>         return luaL_fileresult(L, 0, NULL);
>   #endif
>       default:
> -      lua_assert(0);
> +      lj_assertL(0, "bad memprof error %d", status);
>         return 0;
>       }
>     }
> diff --git a/src/lib_string.c b/src/lib_string.c
> index 156dae66..9b9c369a 100644
> --- a/src/lib_string.c
> +++ b/src/lib_string.c
> @@ -136,7 +136,7 @@ LJLIB_CF(string_dump)
>   /* ------------------------------------------------------------------------ */
>   
>   /* macro to `unsign' a character */
> -#define uchar(c)        ((unsigned char)(c))
> +#define uchar(c)	((unsigned char)(c))
>   
>   #define CAP_UNFINISHED	(-1)
>   #define CAP_POSITION	(-2)
> @@ -645,7 +645,7 @@ static GCstr *string_fmt_tostring(lua_State *L, int arg, int retry)
>   {
>     TValue *o = L->base+arg-1;
>     cTValue *mo;
> -  lua_assert(o < L->top);  /* Caller already checks for existence. */
> +  lj_assertL(o < L->top, "bad usage");  /* Caller already checks for existence. */
>     if (LJ_LIKELY(tvisstr(o)))
>       return strV(o);
>     if (retry != 2 && !tvisnil(mo = lj_meta_lookup(L, o, MM_tostring))) {
> @@ -717,7 +717,7 @@ again:
>   	lj_strfmt_putptr(sb, lj_obj_ptr(G(L), L->base+arg-1));
>   	break;
>         default:
> -	lua_assert(0);
> +	lj_assertL(0, "bad string format type");
>   	break;
>         }
>       }
> diff --git a/src/lj_api.c b/src/lj_api.c
> index 89998815..05e02029 100644
> --- a/src/lj_api.c
> +++ b/src/lj_api.c
> @@ -28,8 +28,8 @@
>   
>   /* -- Common helper functions --------------------------------------------- */
>   
> -#define api_checknelems(L, n)		api_check(L, (n) <= (L->top - L->base))
> -#define api_checkvalidindex(L, i)	api_check(L, (i) != niltv(L))
> +#define lj_checkapi_slot(idx) \
> +  lj_checkapi((idx) <= (L->top - L->base), "stack slot %d out of range", (idx))
>   
>   static TValue *index2adr(lua_State *L, int idx)
>   {
> @@ -37,7 +37,8 @@ static TValue *index2adr(lua_State *L, int idx)
>       TValue *o = L->base + (idx - 1);
>       return o < L->top ? o : niltv(L);
>     } else if (idx > LUA_REGISTRYINDEX) {
> -    api_check(L, idx != 0 && -idx <= L->top - L->base);
> +    lj_checkapi(idx != 0 && -idx <= L->top - L->base,
> +		"bad stack slot %d", idx);
>       return L->top + idx;
>     } else if (idx == LUA_GLOBALSINDEX) {
>       TValue *o = &G(L)->tmptv;
> @@ -47,7 +48,8 @@ static TValue *index2adr(lua_State *L, int idx)
>       return registry(L);
>     } else {
>       GCfunc *fn = curr_func(L);
> -    api_check(L, fn->c.gct == ~LJ_TFUNC && !isluafunc(fn));
> +    lj_checkapi(fn->c.gct == ~LJ_TFUNC && !isluafunc(fn),
> +		"calling frame is not a C function");
>       if (idx == LUA_ENVIRONINDEX) {
>         TValue *o = &G(L)->tmptv;
>         settabV(L, o, tabref(fn->c.env));
> @@ -59,13 +61,27 @@ static TValue *index2adr(lua_State *L, int idx)
>     }
>   }
>   
> -static TValue *stkindex2adr(lua_State *L, int idx)
> +static LJ_AINLINE TValue *index2adr_check(lua_State *L, int idx)
> +{
> +  TValue *o = index2adr(L, idx);
> +  lj_checkapi(o != niltv(L), "invalid stack slot %d", idx);
> +  return o;
> +}
> +
> +static TValue *index2adr_stack(lua_State *L, int idx)
>   {
>     if (idx > 0) {
>       TValue *o = L->base + (idx - 1);
> +    if (o < L->top) {
> +      return o;
> +    } else {
> +      lj_checkapi(0, "invalid stack slot %d", idx);
> +      return niltv(L);
> +    }
>       return o < L->top ? o : niltv(L);
>     } else {
> -    api_check(L, idx != 0 && -idx <= L->top - L->base);
> +    lj_checkapi(idx != 0 && -idx <= L->top - L->base,
> +		"invalid stack slot %d", idx);
>       return L->top + idx;
>     }
>   }
> @@ -111,17 +127,17 @@ LUALIB_API void luaL_checkstack(lua_State *L, int size, const char *msg)
>       lj_err_callerv(L, LJ_ERR_STKOVM, msg);
>   }
>   
> -LUA_API void lua_xmove(lua_State *from, lua_State *to, int n)
> +LUA_API void lua_xmove(lua_State *L, lua_State *to, int n)
>   {
>     TValue *f, *t;
> -  if (from == to) return;
> -  api_checknelems(from, n);
> -  api_check(from, G(from) == G(to));
> +  if (L == to) return;
> +  lj_checkapi_slot(n);
> +  lj_checkapi(G(L) == G(to), "move across global states");
>     lj_state_checkstack(to, (MSize)n);
> -  f = from->top;
> +  f = L->top;
>     t = to->top = to->top + n;
>     while (--n >= 0) copyTV(to, --t, --f);
> -  from->top = f;
> +  L->top = f;
>   }
>   
>   LUA_API const lua_Number *lua_version(lua_State *L)
> @@ -141,7 +157,7 @@ LUA_API int lua_gettop(lua_State *L)
>   LUA_API void lua_settop(lua_State *L, int idx)
>   {
>     if (idx >= 0) {
> -    api_check(L, idx <= tvref(L->maxstack) - L->base);
> +    lj_checkapi(idx <= tvref(L->maxstack) - L->base, "bad stack slot %d", idx);
>       if (L->base + idx > L->top) {
>         if (L->base + idx >= tvref(L->maxstack))
>   	lj_state_growstack(L, (MSize)idx - (MSize)(L->top - L->base));
> @@ -150,23 +166,21 @@ LUA_API void lua_settop(lua_State *L, int idx)
>         L->top = L->base + idx;
>       }
>     } else {
> -    api_check(L, -(idx+1) <= (L->top - L->base));
> +    lj_checkapi(-(idx+1) <= (L->top - L->base), "bad stack slot %d", idx);
>       L->top += idx+1;  /* Shrinks top (idx < 0). */
>     }
>   }
>   
>   LUA_API void lua_remove(lua_State *L, int idx)
>   {
> -  TValue *p = stkindex2adr(L, idx);
> -  api_checkvalidindex(L, p);
> +  TValue *p = index2adr_stack(L, idx);
>     while (++p < L->top) copyTV(L, p-1, p);
>     L->top--;
>   }
>   
>   LUA_API void lua_insert(lua_State *L, int idx)
>   {
> -  TValue *q, *p = stkindex2adr(L, idx);
> -  api_checkvalidindex(L, p);
> +  TValue *q, *p = index2adr_stack(L, idx);
>     for (q = L->top; q > p; q--) copyTV(L, q, q-1);
>     copyTV(L, p, L->top);
>   }
> @@ -174,19 +188,18 @@ LUA_API void lua_insert(lua_State *L, int idx)
>   static void copy_slot(lua_State *L, TValue *f, int idx)
>   {
>     if (idx == LUA_GLOBALSINDEX) {
> -    api_check(L, tvistab(f));
> +    lj_checkapi(tvistab(f), "stack slot %d is not a table", idx);
>       /* NOBARRIER: A thread (i.e. L) is never black. */
>       setgcref(L->env, obj2gco(tabV(f)));
>     } else if (idx == LUA_ENVIRONINDEX) {
>       GCfunc *fn = curr_func(L);
>       if (fn->c.gct != ~LJ_TFUNC)
>         lj_err_msg(L, LJ_ERR_NOENV);
> -    api_check(L, tvistab(f));
> +    lj_checkapi(tvistab(f), "stack slot %d is not a table", idx);
>       setgcref(fn->c.env, obj2gco(tabV(f)));
>       lj_gc_barrier(L, fn, f);
>     } else {
> -    TValue *o = index2adr(L, idx);
> -    api_checkvalidindex(L, o);
> +    TValue *o = index2adr_check(L, idx);
>       copyTV(L, o, f);
>       if (idx < LUA_GLOBALSINDEX)  /* Need a barrier for upvalues. */
>         lj_gc_barrier(L, curr_func(L), f);
> @@ -195,7 +208,7 @@ static void copy_slot(lua_State *L, TValue *f, int idx)
>   
>   LUA_API void lua_replace(lua_State *L, int idx)
>   {
> -  api_checknelems(L, 1);
> +  lj_checkapi_slot(1);
>     copy_slot(L, L->top - 1, idx);
>     L->top--;
>   }
> @@ -231,7 +244,7 @@ LUA_API int lua_type(lua_State *L, int idx)
>   #else
>       int tt = (int)(((t < 8 ? 0x98042110u : 0x75a06u) >> 4*(t&7)) & 15u);
>   #endif
> -    lua_assert(tt != LUA_TNIL || tvisnil(o));
> +    lj_assertL(tt != LUA_TNIL || tvisnil(o), "bad tag conversion");
>       return tt;
>     }
>   }
> @@ -522,7 +535,7 @@ LUA_API const char *lua_tolstring(lua_State *L, int idx, size_t *len)
>   LUA_API uint32_t lua_hashstring(lua_State *L, int idx)
>   {
>     TValue *o = index2adr(L, idx);
> -  lua_assert(tvisstr(o));
> +  lj_checkapi(tvisstr(o), "stack slot %d is not a string", idx);
>     GCstr *s = strV(o);
>     if (! strsmart(s))
>       return s->hash;
> @@ -699,14 +712,14 @@ LUA_API void lua_pushcclosure(lua_State *L, lua_CFunction f, int n)
>   {
>     GCfunc *fn;
>     lj_gc_check(L);
> -  api_checknelems(L, n);
> +  lj_checkapi_slot(n);
>     fn = lj_func_newC(L, (MSize)n, getcurrenv(L));
>     fn->c.f = f;
>     L->top -= n;
>     while (n--)
>       copyTV(L, &fn->c.upvalue[n], L->top+n);
>     setfuncV(L, L->top, fn);
> -  lua_assert(iswhite(obj2gco(fn)));
> +  lj_assertL(iswhite(obj2gco(fn)), "new GC object is not white");
>     incr_top(L);
>   }
>   
> @@ -779,7 +792,7 @@ LUA_API void *lua_newuserdata(lua_State *L, size_t size)
>   
>   LUA_API void lua_concat(lua_State *L, int n)
>   {
> -  api_checknelems(L, n);
> +  lj_checkapi_slot(n);
>     if (n >= 2) {
>       n--;
>       do {
> @@ -805,9 +818,8 @@ LUA_API void lua_concat(lua_State *L, int n)
>   
>   LUA_API void lua_gettable(lua_State *L, int idx)
>   {
> -  cTValue *v, *t = index2adr(L, idx);
> -  api_checkvalidindex(L, t);
> -  v = lj_meta_tget(L, t, L->top-1);
> +  cTValue *t = index2adr_check(L, idx);
> +  cTValue *v = lj_meta_tget(L, t, L->top-1);
>     if (v == NULL) {
>       L->top += 2;
>       jit_secure_call(L, L->top-2, 1+1);
> @@ -819,9 +831,8 @@ LUA_API void lua_gettable(lua_State *L, int idx)
>   
>   LUA_API void lua_getfield(lua_State *L, int idx, const char *k)
>   {
> -  cTValue *v, *t = index2adr(L, idx);
> +  cTValue *v, *t = index2adr_check(L, idx);
>     TValue key;
> -  api_checkvalidindex(L, t);
>     setstrV(L, &key, lj_str_newz(L, k));
>     v = lj_meta_tget(L, t, &key);
>     if (v == NULL) {
> @@ -837,14 +848,14 @@ LUA_API void lua_getfield(lua_State *L, int idx, const char *k)
>   LUA_API void lua_rawget(lua_State *L, int idx)
>   {
>     cTValue *t = index2adr(L, idx);
> -  api_check(L, tvistab(t));
> +  lj_checkapi(tvistab(t), "stack slot %d is not a table", idx);
>     copyTV(L, L->top-1, lj_tab_get(L, tabV(t), L->top-1));
>   }
>   
>   LUA_API void lua_rawgeti(lua_State *L, int idx, int n)
>   {
>     cTValue *v, *t = index2adr(L, idx);
> -  api_check(L, tvistab(t));
> +  lj_checkapi(tvistab(t), "stack slot %d is not a table", idx);
>     v = lj_tab_getint(tabV(t), n);
>     if (v) {
>       copyTV(L, L->top, v);
> @@ -886,8 +897,7 @@ LUALIB_API int luaL_getmetafield(lua_State *L, int idx, const char *field)
>   
>   LUA_API void lua_getfenv(lua_State *L, int idx)
>   {
> -  cTValue *o = index2adr(L, idx);
> -  api_checkvalidindex(L, o);
> +  cTValue *o = index2adr_check(L, idx);
>     if (tvisfunc(o)) {
>       settabV(L, L->top, tabref(funcV(o)->c.env));
>     } else if (tvisudata(o)) {
> @@ -904,7 +914,7 @@ LUA_API int lua_next(lua_State *L, int idx)
>   {
>     cTValue *t = index2adr(L, idx);
>     int more;
> -  api_check(L, tvistab(t));
> +  lj_checkapi(tvistab(t), "stack slot %d is not a table", idx);
>     more = lj_tab_next(L, tabV(t), L->top-1);
>     if (more) {
>       incr_top(L);  /* Return new key and value slot. */
> @@ -930,7 +940,7 @@ LUA_API void *lua_upvalueid(lua_State *L, int idx, int n)
>   {
>     GCfunc *fn = funcV(index2adr(L, idx));
>     n--;
> -  api_check(L, (uint32_t)n < fn->l.nupvalues);
> +  lj_checkapi((uint32_t)n < fn->l.nupvalues, "bad upvalue %d", n);
>     return isluafunc(fn) ? (void *)gcref(fn->l.uvptr[n]) :
>   			 (void *)&fn->c.upvalue[n];
>   }
> @@ -940,8 +950,10 @@ LUA_API void lua_upvaluejoin(lua_State *L, int idx1, int n1, int idx2, int n2)
>     GCfunc *fn1 = funcV(index2adr(L, idx1));
>     GCfunc *fn2 = funcV(index2adr(L, idx2));
>     n1--; n2--;
> -  api_check(L, isluafunc(fn1) && (uint32_t)n1 < fn1->l.nupvalues);
> -  api_check(L, isluafunc(fn2) && (uint32_t)n2 < fn2->l.nupvalues);
> +  lj_checkapi(isluafunc(fn1), "stack slot %d is not a Lua function", idx1);
> +  lj_checkapi(isluafunc(fn2), "stack slot %d is not a Lua function", idx2);
> +  lj_checkapi((uint32_t)n1 < fn1->l.nupvalues, "bad upvalue %d", n1+1);
> +  lj_checkapi((uint32_t)n2 < fn2->l.nupvalues, "bad upvalue %d", n2+1);
>     setgcrefr(fn1->l.uvptr[n1], fn2->l.uvptr[n2]);
>     lj_gc_objbarrier(L, fn1, gcref(fn1->l.uvptr[n1]));
>   }
> @@ -970,9 +982,8 @@ LUALIB_API void *luaL_checkudata(lua_State *L, int idx, const char *tname)
>   LUA_API void lua_settable(lua_State *L, int idx)
>   {
>     TValue *o;
> -  cTValue *t = index2adr(L, idx);
> -  api_checknelems(L, 2);
> -  api_checkvalidindex(L, t);
> +  cTValue *t = index2adr_check(L, idx);
> +  lj_checkapi_slot(2);
>     o = lj_meta_tset(L, t, L->top-2);
>     if (o) {
>       /* NOBARRIER: lj_meta_tset ensures the table is not black. */
> @@ -991,9 +1002,8 @@ LUA_API void lua_setfield(lua_State *L, int idx, const char *k)
>   {
>     TValue *o;
>     TValue key;
> -  cTValue *t = index2adr(L, idx);
> -  api_checknelems(L, 1);
> -  api_checkvalidindex(L, t);
> +  cTValue *t = index2adr_check(L, idx);
> +  lj_checkapi_slot(1);
>     setstrV(L, &key, lj_str_newz(L, k));
>     o = lj_meta_tset(L, t, &key);
>     if (o) {
> @@ -1012,7 +1022,7 @@ LUA_API void lua_rawset(lua_State *L, int idx)
>   {
>     GCtab *t = tabV(index2adr(L, idx));
>     TValue *dst, *key;
> -  api_checknelems(L, 2);
> +  lj_checkapi_slot(2);
>     key = L->top-2;
>     dst = lj_tab_set(L, t, key);
>     copyTV(L, dst, key+1);
> @@ -1024,7 +1034,7 @@ LUA_API void lua_rawseti(lua_State *L, int idx, int n)
>   {
>     GCtab *t = tabV(index2adr(L, idx));
>     TValue *dst, *src;
> -  api_checknelems(L, 1);
> +  lj_checkapi_slot(1);
>     dst = lj_tab_setint(L, t, n);
>     src = L->top-1;
>     copyTV(L, dst, src);
> @@ -1036,13 +1046,12 @@ LUA_API int lua_setmetatable(lua_State *L, int idx)
>   {
>     global_State *g;
>     GCtab *mt;
> -  cTValue *o = index2adr(L, idx);
> -  api_checknelems(L, 1);
> -  api_checkvalidindex(L, o);
> +  cTValue *o = index2adr_check(L, idx);
> +  lj_checkapi_slot(1);
>     if (tvisnil(L->top-1)) {
>       mt = NULL;
>     } else {
> -    api_check(L, tvistab(L->top-1));
> +    lj_checkapi(tvistab(L->top-1), "top stack slot is not a table");
>       mt = tabV(L->top-1);
>     }
>     g = G(L);
> @@ -1079,11 +1088,10 @@ LUALIB_API void luaL_setmetatable(lua_State *L, const char *tname)
>   
>   LUA_API int lua_setfenv(lua_State *L, int idx)
>   {
> -  cTValue *o = index2adr(L, idx);
> +  cTValue *o = index2adr_check(L, idx);
>     GCtab *t;
> -  api_checknelems(L, 1);
> -  api_checkvalidindex(L, o);
> -  api_check(L, tvistab(L->top-1));
> +  lj_checkapi_slot(1);
> +  lj_checkapi(tvistab(L->top-1), "top stack slot is not a table");
>     t = tabV(L->top-1);
>     if (tvisfunc(o)) {
>       setgcref(funcV(o)->c.env, obj2gco(t));
> @@ -1106,7 +1114,7 @@ LUA_API const char *lua_setupvalue(lua_State *L, int idx, int n)
>     TValue *val;
>     GCobj *o;
>     const char *name;
> -  api_checknelems(L, 1);
> +  lj_checkapi_slot(1);
>     name = lj_debug_uvnamev(f, (uint32_t)(n-1), &val, &o);
>     if (name) {
>       L->top--;
> @@ -1133,8 +1141,9 @@ static TValue *api_call_base(lua_State *L, int nargs)
>   
>   LUA_API void lua_call(lua_State *L, int nargs, int nresults)
>   {
> -  api_check(L, L->status == LUA_OK || L->status == LUA_ERRERR);
> -  api_checknelems(L, nargs+1);
> +  lj_checkapi(L->status == LUA_OK || L->status == LUA_ERRERR,
> +	      "thread called in wrong state %d", L->status);
> +  lj_checkapi_slot(nargs+1);
>     jit_secure_call(L, api_call_base(L, nargs), nresults+1);
>   }
>   
> @@ -1144,13 +1153,13 @@ LUA_API int lua_pcall(lua_State *L, int nargs, int nresults, int errfunc)
>     uint8_t oldh = hook_save(g);
>     ptrdiff_t ef;
>     int status;
> -  api_check(L, L->status == LUA_OK || L->status == LUA_ERRERR);
> -  api_checknelems(L, nargs+1);
> +  lj_checkapi(L->status == LUA_OK || L->status == LUA_ERRERR,
> +	      "thread called in wrong state %d", L->status);
> +  lj_checkapi_slot(nargs+1);
>     if (errfunc == 0) {
>       ef = 0;
>     } else {
> -    cTValue *o = stkindex2adr(L, errfunc);
> -    api_checkvalidindex(L, o);
> +    cTValue *o = index2adr_stack(L, errfunc);
>       ef = savestack(L, o);
>     }
>     /* Forbid Lua world re-entrancy while running the trace */
> @@ -1186,7 +1195,8 @@ LUA_API int lua_cpcall(lua_State *L, lua_CFunction func, void *ud)
>     global_State *g = G(L);
>     uint8_t oldh = hook_save(g);
>     int status;
> -  api_check(L, L->status == LUA_OK || L->status == LUA_ERRERR);
> +  lj_checkapi(L->status == LUA_OK || L->status == LUA_ERRERR,
> +	      "thread called in wrong state %d", L->status);
>     /* Forbid Lua world re-entrancy while running the trace */
>     if (tvref(g->jit_base)) {
>       setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL));
> diff --git a/src/lj_asm.c b/src/lj_asm.c
> index a6906b19..d71fa8c8 100644
> --- a/src/lj_asm.c
> +++ b/src/lj_asm.c
> @@ -100,6 +100,12 @@ typedef struct ASMState {
>     uint16_t parentmap[LJ_MAX_JSLOTS];  /* Parent instruction to RegSP map. */
>   } ASMState;
>   
> +#ifdef LUA_USE_ASSERT
> +#define lj_assertA(c, ...)	lj_assertG_(J2G(as->J), (c), __VA_ARGS__)
> +#else
> +#define lj_assertA(c, ...)	((void)as)
> +#endif
> +
>   #define IR(ref)			(&as->ir[(ref)])
>   
>   #define ASMREF_TMP1		REF_TRUE	/* Temp. register. */
> @@ -131,9 +137,8 @@ static LJ_AINLINE void checkmclim(ASMState *as)
>   #ifdef LUA_USE_ASSERT
>     if (as->mcp + MCLIM_REDZONE < as->mcp_prev) {
>       IRIns *ir = IR(as->curins+1);
> -    fprintf(stderr, "RED ZONE OVERFLOW: %p IR %04d  %02d %04d %04d\n", as->mcp,
> -	    as->curins+1-REF_BIAS, ir->o, ir->op1-REF_BIAS, ir->op2-REF_BIAS);
> -    lua_assert(0);
> +    lj_assertA(0, "red zone overflow: %p IR %04d  %02d %04d %04d\n", as->mcp,
> +      as->curins+1-REF_BIAS, ir->o, ir->op1-REF_BIAS, ir->op2-REF_BIAS);
>     }
>   #endif
>     if (LJ_UNLIKELY(as->mcp < as->mclim)) asm_mclimit(as);
> @@ -247,7 +252,7 @@ static void ra_dprintf(ASMState *as, const char *fmt, ...)
>   	  *p++ = *q >= 'A' && *q <= 'Z' ? *q + 0x20 : *q;
>         } else {
>   	*p++ = '?';
> -	lua_assert(0);
> +	lj_assertA(0, "bad register %d for debug format \"%s\"", r, fmt);
>         }
>       } else if (e[1] == 'f' || e[1] == 'i') {
>         IRRef ref;
> @@ -265,7 +270,7 @@ static void ra_dprintf(ASMState *as, const char *fmt, ...)
>       } else if (e[1] == 'x') {
>         p += sprintf(p, "%08x", va_arg(argp, int32_t));
>       } else {
> -      lua_assert(0);
> +      lj_assertA(0, "bad debug format code");
>       }
>       fmt = e+2;
>     }
> @@ -324,7 +329,7 @@ static Reg ra_rematk(ASMState *as, IRRef ref)
>     Reg r;
>     if (ra_iskref(ref)) {
>       r = ra_krefreg(ref);
> -    lua_assert(!rset_test(as->freeset, r));
> +    lj_assertA(!rset_test(as->freeset, r), "rematk of free reg %d", r);
>       ra_free(as, r);
>       ra_modified(as, r);
>   #if LJ_64
> @@ -336,7 +341,9 @@ static Reg ra_rematk(ASMState *as, IRRef ref)
>     }
>     ir = IR(ref);
>     r = ir->r;
> -  lua_assert(ra_hasreg(r) && !ra_hasspill(ir->s));
> +  lj_assertA(ra_hasreg(r), "rematk of K%03d has no reg", REF_BIAS - ref);
> +  lj_assertA(!ra_hasspill(ir->s),
> +	     "rematk of K%03d has spill slot [%x]", REF_BIAS - ref, ir->s);
>     ra_free(as, r);
>     ra_modified(as, r);
>     ir->r = RID_INIT;  /* Do not keep any hint. */
> @@ -350,7 +357,8 @@ static Reg ra_rematk(ASMState *as, IRRef ref)
>       ra_sethint(ir->r, RID_BASE);  /* Restore BASE register hint. */
>       emit_getgl(as, r, jit_base);
>     } else if (emit_canremat(ASMREF_L) && ir->o == IR_KPRI) {
> -    lua_assert(irt_isnil(ir->t));  /* REF_NIL stores ASMREF_L register. */
> +    /* REF_NIL stores ASMREF_L register. */
> +    lj_assertA(irt_isnil(ir->t), "rematk of bad ASMREF_L");
>       emit_getgl(as, r, cur_L);
>   #if LJ_64
>     } else if (ir->o == IR_KINT64) {
> @@ -363,8 +371,9 @@ static Reg ra_rematk(ASMState *as, IRRef ref)
>   #endif
>   #endif
>     } else {
> -    lua_assert(ir->o == IR_KINT || ir->o == IR_KGC ||
> -	       ir->o == IR_KPTR || ir->o == IR_KKPTR || ir->o == IR_KNULL);
> +    lj_assertA(ir->o == IR_KINT || ir->o == IR_KGC ||
> +	       ir->o == IR_KPTR || ir->o == IR_KKPTR || ir->o == IR_KNULL,
> +	       "rematk of bad IR op %d", ir->o);
>       emit_loadi(as, r, ir->i);
>     }
>     return r;
> @@ -374,7 +383,8 @@ static Reg ra_rematk(ASMState *as, IRRef ref)
>   static int32_t ra_spill(ASMState *as, IRIns *ir)
>   {
>     int32_t slot = ir->s;
> -  lua_assert(ir >= as->ir + REF_TRUE);
> +  lj_assertA(ir >= as->ir + REF_TRUE,
> +	     "spill of K%03d", REF_BIAS - (int)(ir - as->ir));
>     if (!ra_hasspill(slot)) {
>       if (irt_is64(ir->t)) {
>         slot = as->evenspill;
> @@ -399,7 +409,9 @@ static Reg ra_releasetmp(ASMState *as, IRRef ref)
>   {
>     IRIns *ir = IR(ref);
>     Reg r = ir->r;
> -  lua_assert(ra_hasreg(r) && !ra_hasspill(ir->s));
> +  lj_assertA(ra_hasreg(r), "release of TMP%d has no reg", ref-ASMREF_TMP1+1);
> +  lj_assertA(!ra_hasspill(ir->s),
> +	     "release of TMP%d has spill slot [%x]", ref-ASMREF_TMP1+1, ir->s);
>     ra_free(as, r);
>     ra_modified(as, r);
>     ir->r = RID_INIT;
> @@ -415,7 +427,7 @@ static Reg ra_restore(ASMState *as, IRRef ref)
>       IRIns *ir = IR(ref);
>       int32_t ofs = ra_spill(as, ir);  /* Force a spill slot. */
>       Reg r = ir->r;
> -    lua_assert(ra_hasreg(r));
> +    lj_assertA(ra_hasreg(r), "restore of IR %04d has no reg", ref - REF_BIAS);
>       ra_sethint(ir->r, r);  /* Keep hint. */
>       ra_free(as, r);
>       if (!rset_test(as->weakset, r)) {  /* Only restore non-weak references. */
> @@ -444,14 +456,15 @@ static Reg ra_evict(ASMState *as, RegSet allow)
>   {
>     IRRef ref;
>     RegCost cost = ~(RegCost)0;
> -  lua_assert(allow != RSET_EMPTY);
> +  lj_assertA(allow != RSET_EMPTY, "evict from empty set");
>     if (RID_NUM_FPR == 0 || allow < RID2RSET(RID_MAX_GPR)) {
>       GPRDEF(MINCOST)
>     } else {
>       FPRDEF(MINCOST)
>     }
>     ref = regcost_ref(cost);
> -  lua_assert(ra_iskref(ref) || (ref >= as->T->nk && ref < as->T->nins));
> +  lj_assertA(ra_iskref(ref) || (ref >= as->T->nk && ref < as->T->nins),
> +	     "evict of out-of-range IR %04d", ref - REF_BIAS);
>     /* Preferably pick any weak ref instead of a non-weak, non-const ref. */
>     if (!irref_isk(ref) && (as->weakset & allow)) {
>       IRIns *ir = IR(ref);
> @@ -609,7 +622,8 @@ static Reg ra_allocref(ASMState *as, IRRef ref, RegSet allow)
>     IRIns *ir = IR(ref);
>     RegSet pick = as->freeset & allow;
>     Reg r;
> -  lua_assert(ra_noreg(ir->r));
> +  lj_assertA(ra_noreg(ir->r),
> +	     "IR %04d already has reg %d", ref - REF_BIAS, ir->r);
>     if (pick) {
>       /* First check register hint from propagation or PHI. */
>       if (ra_hashint(ir->r)) {
> @@ -673,8 +687,10 @@ static void ra_rename(ASMState *as, Reg down, Reg up)
>     IRIns *ir = IR(ref);
>     ir->r = (uint8_t)up;
>     as->cost[down] = 0;
> -  lua_assert((down < RID_MAX_GPR) == (up < RID_MAX_GPR));
> -  lua_assert(!rset_test(as->freeset, down) && rset_test(as->freeset, up));
> +  lj_assertA((down < RID_MAX_GPR) == (up < RID_MAX_GPR),
> +	     "rename between GPR/FPR %d and %d", down, up);
> +  lj_assertA(!rset_test(as->freeset, down), "rename from free reg %d", down);
> +  lj_assertA(rset_test(as->freeset, up), "rename to non-free reg %d", up);
>     ra_free(as, down);  /* 'down' is free ... */
>     ra_modified(as, down);
>     rset_clear(as->freeset, up);  /* ... and 'up' is now allocated. */
> @@ -722,7 +738,7 @@ static void ra_destreg(ASMState *as, IRIns *ir, Reg r)
>   {
>     Reg dest = ra_dest(as, ir, RID2RSET(r));
>     if (dest != r) {
> -    lua_assert(rset_test(as->freeset, r));
> +    lj_assertA(rset_test(as->freeset, r), "dest reg %d is not free", r);
>       ra_modified(as, r);
>       emit_movrr(as, ir, dest, r);
>     }
> @@ -755,8 +771,9 @@ static void ra_left(ASMState *as, Reg dest, IRRef lref)
>   #endif
>   #endif
>         } else if (ir->o != IR_KPRI) {
> -	lua_assert(ir->o == IR_KINT || ir->o == IR_KGC ||
> -		   ir->o == IR_KPTR || ir->o == IR_KKPTR || ir->o == IR_KNULL);
> +	lj_assertA(ir->o == IR_KINT || ir->o == IR_KGC ||
> +		   ir->o == IR_KPTR || ir->o == IR_KKPTR || ir->o == IR_KNULL,
> +		   "K%03d has bad IR op %d", REF_BIAS - lref, ir->o);
>   	emit_loadi(as, dest, ir->i);
>   	return;
>         }
> @@ -901,11 +918,14 @@ static void asm_snap_alloc1(ASMState *as, IRRef ref)
>   #endif
>         {  /* Allocate stored values for TNEW, TDUP and CNEW. */
>   	IRIns *irs;
> -	lua_assert(ir->o == IR_TNEW || ir->o == IR_TDUP || ir->o == IR_CNEW);
> +	lj_assertA(ir->o == IR_TNEW || ir->o == IR_TDUP || ir->o == IR_CNEW,
> +		   "sink of IR %04d has bad op %d", ref - REF_BIAS, ir->o);
>   	for (irs = IR(as->snapref-1); irs > ir; irs--)
>   	  if (irs->r == RID_SINK && asm_sunk_store(as, ir, irs)) {
> -	    lua_assert(irs->o == IR_ASTORE || irs->o == IR_HSTORE ||
> -		       irs->o == IR_FSTORE || irs->o == IR_XSTORE);
> +	    lj_assertA(irs->o == IR_ASTORE || irs->o == IR_HSTORE ||
> +		       irs->o == IR_FSTORE || irs->o == IR_XSTORE,
> +		       "sunk store IR %04d has bad op %d",
> +		       (int)(irs - as->ir) - REF_BIAS, irs->o);
>   	    asm_snap_alloc1(as, irs->op2);
>   	    if (LJ_32 && (irs+1)->o == IR_HIOP)
>   	      asm_snap_alloc1(as, (irs+1)->op2);
> @@ -953,15 +973,9 @@ static void asm_snap_alloc(ASMState *as, int snapno)
>       if (!irref_isk(ref)) {
>         asm_snap_alloc1(as, ref);
>         if (LJ_SOFTFP && (sn & SNAP_SOFTFPNUM)) {
> -	/*
> -	** FIXME: The following assert was replaced with
> -	** the conventional `lua_assert`.
> -	**
> -	** lj_assertA(irt_type(IR(ref+1)->t) == IRT_SOFTFP,
> -	** "snap %d[%d] points to bad SOFTFP IR %04d",
> -	** snapno, n, ref - REF_BIAS);
> -	*/
> -	lua_assert(irt_type(IR(ref+1)->t) == IRT_SOFTFP);
> +	lj_assertA(irt_type(IR(ref+1)->t) == IRT_SOFTFP,
> +		   "snap %d[%d] points to bad SOFTFP IR %04d",
> +		   snapno, n, ref - REF_BIAS);
>   	asm_snap_alloc1(as, ref+1);
>         }
>       }
> @@ -1045,19 +1059,20 @@ static int32_t asm_stack_adjust(ASMState *as)
>   }
>   
>   /* Must match with hash*() in lj_tab.c. */
> -static uint32_t ir_khash(IRIns *ir)
> +static uint32_t ir_khash(ASMState *as, IRIns *ir)
>   {
>     uint32_t lo, hi;
> +  UNUSED(as);
>     if (irt_isstr(ir->t)) {
>       return ir_kstr(ir)->hash;
>     } else if (irt_isnum(ir->t)) {
>       lo = ir_knum(ir)->u32.lo;
>       hi = ir_knum(ir)->u32.hi << 1;
>     } else if (irt_ispri(ir->t)) {
> -    lua_assert(!irt_isnil(ir->t));
> +    lj_assertA(!irt_isnil(ir->t), "hash of nil key");
>       return irt_type(ir->t)-IRT_FALSE;
>     } else {
> -    lua_assert(irt_isgcv(ir->t));
> +    lj_assertA(irt_isgcv(ir->t), "hash of bad IR type %d", irt_type(ir->t));
>       lo = u32ptr(ir_kgc(ir));
>   #if LJ_GC64
>       hi = (uint32_t)(u64ptr(ir_kgc(ir)) >> 32) | (irt_toitype(ir->t) << 15);
> @@ -1168,7 +1183,8 @@ static void asm_bufput(ASMState *as, IRIns *ir)
>     args[0] = ir->op1;  /* SBuf * */
>     args[1] = ir->op2;  /* GCstr * */
>     irs = IR(ir->op2);
> -  lua_assert(irt_isstr(irs->t));
> +  lj_assertA(irt_isstr(irs->t),
> +	     "BUFPUT of non-string IR %04d", ir->op2 - REF_BIAS);
>     if (irs->o == IR_KGC) {
>       GCstr *s = ir_kstr(irs);
>       if (s->len == 1) {  /* Optimize put of single-char string constant. */
> @@ -1182,7 +1198,8 @@ static void asm_bufput(ASMState *as, IRIns *ir)
>   	args[1] = ASMREF_TMP1;  /* TValue * */
>   	ci = &lj_ir_callinfo[IRCALL_lj_strfmt_putnum];
>         } else {
> -	lua_assert(irt_isinteger(IR(irs->op1)->t));
> +	lj_assertA(irt_isinteger(IR(irs->op1)->t),
> +		   "TOSTR of non-numeric IR %04d", irs->op1);
>   	args[1] = irs->op1;  /* int */
>   	if (irs->op2 == IRTOSTR_INT)
>   	  ci = &lj_ir_callinfo[IRCALL_lj_strfmt_putint];
> @@ -1248,7 +1265,8 @@ static void asm_conv64(ASMState *as, IRIns *ir)
>     IRType dt = (((ir-1)->op2 & IRCONV_DSTMASK) >> IRCONV_DSH);
>     IRCallID id;
>     IRRef args[2];
> -  lua_assert((ir-1)->o == IR_CONV && ir->o == IR_HIOP);
> +  lj_assertA((ir-1)->o == IR_CONV && ir->o == IR_HIOP,
> +	     "not a CONV/HIOP pair at IR %04d", (int)(ir - as->ir) - REF_BIAS);
>     args[LJ_BE] = (ir-1)->op1;
>     args[LJ_LE] = ir->op1;
>     if (st == IRT_NUM || st == IRT_FLOAT) {
> @@ -1304,15 +1322,16 @@ static void asm_collectargs(ASMState *as, IRIns *ir,
>   			    const CCallInfo *ci, IRRef *args)
>   {
>     uint32_t n = CCI_XNARGS(ci);
> -  lua_assert(n <= CCI_NARGS_MAX*2);  /* Account for split args. */
> +  /* Account for split args. */
> +  lj_assertA(n <= CCI_NARGS_MAX*2, "too many args %d to collect", n);
>     if ((ci->flags & CCI_L)) { *args++ = ASMREF_L; n--; }
>     while (n-- > 1) {
>       ir = IR(ir->op1);
> -    lua_assert(ir->o == IR_CARG);
> +    lj_assertA(ir->o == IR_CARG, "malformed CALL arg tree");
>       args[n] = ir->op2 == REF_NIL ? 0 : ir->op2;
>     }
>     args[0] = ir->op1 == REF_NIL ? 0 : ir->op1;
> -  lua_assert(IR(ir->op1)->o != IR_CARG);
> +  lj_assertA(IR(ir->op1)->o != IR_CARG, "malformed CALL arg tree");
>   }
>   
>   /* Reconstruct CCallInfo flags for CALLX*. */
> @@ -1690,7 +1709,10 @@ static void asm_ir(ASMState *as, IRIns *ir)
>     switch ((IROp)ir->o) {
>     /* Miscellaneous ops. */
>     case IR_LOOP: asm_loop(as); break;
> -  case IR_NOP: case IR_XBAR: lua_assert(!ra_used(ir)); break;
> +  case IR_NOP: case IR_XBAR:
> +    lj_assertA(!ra_used(ir),
> +	       "IR %04d not unused", (int)(ir - as->ir) - REF_BIAS);
> +    break;
>     case IR_USE:
>       ra_alloc1(as, ir->op1, irt_isfp(ir->t) ? RSET_FPR : RSET_GPR); break;
>     case IR_PHI: asm_phi(as, ir); break;
> @@ -1729,7 +1751,9 @@ static void asm_ir(ASMState *as, IRIns *ir)
>   #if LJ_SOFTFP32
>     case IR_DIV: case IR_POW: case IR_ABS:
>     case IR_LDEXP: case IR_FPMATH: case IR_TOBIT:
> -    lua_assert(0);  /* Unused for LJ_SOFTFP32. */
> +    /* Unused for LJ_SOFTFP32. */
> +    lj_assertA(0, "IR %04d with unused op %d",
> +		  (int)(ir - as->ir) - REF_BIAS, ir->o);
>       break;
>   #else
>     case IR_DIV: asm_div(as, ir); break;
> @@ -1777,7 +1801,8 @@ static void asm_ir(ASMState *as, IRIns *ir)
>   #if LJ_HASFFI
>       asm_cnew(as, ir);
>   #else
> -    lua_assert(0);
> +    lj_assertA(0, "IR %04d with unused op %d",
> +		  (int)(ir - as->ir) - REF_BIAS, ir->o);
>   #endif
>       break;
>   
> @@ -1854,8 +1879,10 @@ static void asm_head_side(ASMState *as)
>     for (i = as->stopins; i > REF_BASE; i--) {
>       IRIns *ir = IR(i);
>       RegSP rs;
> -    lua_assert((ir->o == IR_SLOAD && (ir->op2 & IRSLOAD_PARENT)) ||
> -	       (LJ_SOFTFP && ir->o == IR_HIOP) || ir->o == IR_PVAL);
> +    lj_assertA((ir->o == IR_SLOAD && (ir->op2 & IRSLOAD_PARENT)) ||
> +	       (LJ_SOFTFP && ir->o == IR_HIOP) || ir->o == IR_PVAL,
> +	       "IR %04d has bad parent op %d",
> +	       (int)(ir - as->ir) - REF_BIAS, ir->o);
>       rs = as->parentmap[i - REF_FIRST];
>       if (ra_hasreg(ir->r)) {
>         rset_clear(allow, ir->r);
> @@ -2115,7 +2142,7 @@ static void asm_setup_regsp(ASMState *as)
>     ir = IR(REF_FIRST);
>     if (as->parent) {
>       uint16_t *p;
> -    lastir = lj_snap_regspmap(as->parent, as->J->exitno, ir);
> +    lastir = lj_snap_regspmap(as->J, as->parent, as->J->exitno, ir);
>       if (lastir - ir > LJ_MAX_JSLOTS)
>         lj_trace_err(as->J, LJ_TRERR_NYICOAL);
>       as->stopins = (IRRef)((lastir-1) - as->ir);
> @@ -2418,7 +2445,10 @@ void lj_asm_trace(jit_State *J, GCtrace *T)
>       /* Assemble a trace in linear backwards order. */
>       for (as->curins--; as->curins > as->stopins; as->curins--) {
>         IRIns *ir = IR(as->curins);
> -      lua_assert(!(LJ_32 && irt_isint64(ir->t)));  /* Handled by SPLIT. */
> +      /* 64 bit types handled by SPLIT for 32 bit archs. */
> +      lj_assertA(!(LJ_32 && irt_isint64(ir->t)),
> +		 "IR %04d has unsplit 64 bit type",
> +		 (int)(ir - as->ir) - REF_BIAS);
>         asm_snap_prev(as);
>         if (!ra_used(ir) && !ir_sideeff(ir) && (as->flags & JIT_F_OPT_DCE))
>   	continue;  /* Dead-code elimination can be soooo easy. */
> @@ -2449,7 +2479,7 @@ void lj_asm_trace(jit_State *J, GCtrace *T)
>       asm_phi_fixup(as);
>   
>       if (J->curfinal->nins >= T->nins) {  /* IR didn't grow? */
> -      lua_assert(J->curfinal->nk == T->nk);
> +      lj_assertA(J->curfinal->nk == T->nk, "unexpected IR constant growth");
>         memcpy(J->curfinal->ir + as->orignins, T->ir + as->orignins,
>   	     (T->nins - as->orignins) * sizeof(IRIns));  /* Copy RENAMEs. */
>         T->nins = J->curfinal->nins;
> diff --git a/src/lj_asm_arm.h b/src/lj_asm_arm.h
> index 29a07c80..47564d2e 100644
> --- a/src/lj_asm_arm.h
> +++ b/src/lj_asm_arm.h
> @@ -41,7 +41,7 @@ static Reg ra_scratchpair(ASMState *as, RegSet allow)
>         }
>       }
>     }
> -  lua_assert(rset_test(RSET_GPREVEN, r));
> +  lj_assertA(rset_test(RSET_GPREVEN, r), "odd reg %d", r);
>     ra_modified(as, r);
>     ra_modified(as, r+1);
>     RA_DBGX((as, "scratchpair    $r $r", r, r+1));
> @@ -269,7 +269,7 @@ static void asm_fusexref(ASMState *as, ARMIns ai, Reg rd, IRRef ref,
>   	return;
>         }
>       } else if (ir->o == IR_STRREF && !(!LJ_SOFTFP && (ai & 0x08000000))) {
> -      lua_assert(ofs == 0);
> +      lj_assertA(ofs == 0, "bad usage");
>         ofs = (int32_t)sizeof(GCstr);
>         if (irref_isk(ir->op2)) {
>   	ofs += IR(ir->op2)->i;
> @@ -389,9 +389,11 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
>         as->freeset |= (of & RSET_RANGE(REGARG_FIRSTGPR, REGARG_LASTGPR+1));
>         if (irt_isnum(ir->t)) gpr = (gpr+1) & ~1u;
>         if (gpr <= REGARG_LASTGPR) {
> -	lua_assert(rset_test(as->freeset, gpr));  /* Must have been evicted. */
> +	lj_assertA(rset_test(as->freeset, gpr),
> +		   "reg %d not free", gpr);  /* Must have been evicted. */
>   	if (irt_isnum(ir->t)) {
> -	  lua_assert(rset_test(as->freeset, gpr+1));  /* Ditto. */
> +	  lj_assertA(rset_test(as->freeset, gpr+1),
> +		     "reg %d not free", gpr+1);  /* Ditto. */
>   	  emit_dnm(as, ARMI_VMOV_RR_D, gpr, gpr+1, (src & 15));
>   	  gpr += 2;
>   	} else {
> @@ -408,7 +410,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
>   #endif
>       {
>         if (gpr <= REGARG_LASTGPR) {
> -	lua_assert(rset_test(as->freeset, gpr));  /* Must have been evicted. */
> +	lj_assertA(rset_test(as->freeset, gpr),
> +		   "reg %d not free", gpr);  /* Must have been evicted. */
>   	if (ref) ra_leftov(as, gpr, ref);
>   	gpr++;
>         } else {
> @@ -433,7 +436,7 @@ static void asm_setupresult(ASMState *as, IRIns *ir, const CCallInfo *ci)
>       rset_clear(drop, (ir+1)->r);  /* Dest reg handled below. */
>     ra_evictset(as, drop);  /* Evictions must be performed first. */
>     if (ra_used(ir)) {
> -    lua_assert(!irt_ispri(ir->t));
> +    lj_assertA(!irt_ispri(ir->t), "PRI dest");
>       if (!LJ_SOFTFP && irt_isfp(ir->t)) {
>         if (LJ_ABI_SOFTFP || (ci->flags & (CCI_CASTU64|CCI_VARARG))) {
>   	Reg dest = (ra_dest(as, ir, RSET_FPR) & 15);
> @@ -530,13 +533,17 @@ static void asm_conv(ASMState *as, IRIns *ir)
>   #endif
>     IRRef lref = ir->op1;
>     /* 64 bit integer conversions are handled by SPLIT. */
> -  lua_assert(!irt_isint64(ir->t) && !(st == IRT_I64 || st == IRT_U64));
> +  lj_assertA(!irt_isint64(ir->t) && !(st == IRT_I64 || st == IRT_U64),
> +	     "IR %04d has unsplit 64 bit type",
> +	     (int)(ir - as->ir) - REF_BIAS);
>   #if LJ_SOFTFP
>     /* FP conversions are handled by SPLIT. */
> -  lua_assert(!irt_isfp(ir->t) && !(st == IRT_NUM || st == IRT_FLOAT));
> +  lj_assertA(!irt_isfp(ir->t) && !(st == IRT_NUM || st == IRT_FLOAT),
> +	     "IR %04d has FP type",
> +	     (int)(ir - as->ir) - REF_BIAS);
>     /* Can't check for same types: SPLIT uses CONV int.int + BXOR for sfp NEG. */
>   #else
> -  lua_assert(irt_type(ir->t) != st);
> +  lj_assertA(irt_type(ir->t) != st, "inconsistent types for CONV");
>     if (irt_isfp(ir->t)) {
>       Reg dest = ra_dest(as, ir, RSET_FPR);
>       if (stfp) {  /* FP to FP conversion. */
> @@ -553,7 +560,8 @@ static void asm_conv(ASMState *as, IRIns *ir)
>     } else if (stfp) {  /* FP to integer conversion. */
>       if (irt_isguard(ir->t)) {
>         /* Checked conversions are only supported from number to int. */
> -      lua_assert(irt_isint(ir->t) && st == IRT_NUM);
> +      lj_assertA(irt_isint(ir->t) && st == IRT_NUM,
> +		 "bad type for checked CONV");
>         asm_tointg(as, ir, ra_alloc1(as, lref, RSET_FPR));
>       } else {
>         Reg left = ra_alloc1(as, lref, RSET_FPR);
> @@ -572,7 +580,7 @@ static void asm_conv(ASMState *as, IRIns *ir)
>       Reg dest = ra_dest(as, ir, RSET_GPR);
>       if (st >= IRT_I8 && st <= IRT_U16) {  /* Extend to 32 bit integer. */
>         Reg left = ra_alloc1(as, lref, RSET_GPR);
> -      lua_assert(irt_isint(ir->t) || irt_isu32(ir->t));
> +      lj_assertA(irt_isint(ir->t) || irt_isu32(ir->t), "bad type for CONV EXT");
>         if ((as->flags & JIT_F_ARMV6)) {
>   	ARMIns ai = st == IRT_I8 ? ARMI_SXTB :
>   		    st == IRT_U8 ? ARMI_UXTB :
> @@ -667,7 +675,7 @@ static void asm_tvptr(ASMState *as, Reg dest, IRRef ref)
>         ra_allockreg(as, i32ptr(ir_knum(ir)), dest);
>       } else {
>   #if LJ_SOFTFP
> -      lua_assert(0);
> +      lj_assertA(0, "unsplit FP op");
>   #else
>         /* Otherwise force a spill and use the spill slot. */
>         emit_opk(as, ARMI_ADD, dest, RID_SP, ra_spill(as, ir), RSET_GPR);
> @@ -811,7 +819,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>     *l_loop = ARMF_CC(ARMI_B, CC_NE) | ((as->mcp-l_loop-2) & 0x00ffffffu);
>   
>     /* Load main position relative to tab->node into dest. */
> -  khash = irref_isk(refkey) ? ir_khash(irkey) : 1;
> +  khash = irref_isk(refkey) ? ir_khash(as, irkey) : 1;
>     if (khash == 0) {
>       emit_lso(as, ARMI_LDR, dest, tab, (int32_t)offsetof(GCtab, node));
>     } else {
> @@ -867,7 +875,7 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
>     Reg node = ra_alloc1(as, ir->op1, RSET_GPR);
>     Reg key = RID_NONE, type = RID_TMP, idx = node;
>     RegSet allow = rset_exclude(RSET_GPR, node);
> -  lua_assert(ofs % sizeof(Node) == 0);
> +  lj_assertA(ofs % sizeof(Node) == 0, "unaligned HREFK slot");
>     if (ofs > 4095) {
>       idx = dest;
>       rset_clear(allow, dest);
> @@ -934,7 +942,7 @@ static void asm_uref(ASMState *as, IRIns *ir)
>   static void asm_fref(ASMState *as, IRIns *ir)
>   {
>     UNUSED(as); UNUSED(ir);
> -  lua_assert(!ra_used(ir));
> +  lj_assertA(!ra_used(ir), "unfused FREF");
>   }
>   
>   static void asm_strref(ASMState *as, IRIns *ir)
> @@ -971,25 +979,27 @@ static void asm_strref(ASMState *as, IRIns *ir)
>   
>   /* -- Loads and stores ---------------------------------------------------- */
>   
> -static ARMIns asm_fxloadins(IRIns *ir)
> +static ARMIns asm_fxloadins(ASMState *as, IRIns *ir)
>   {
> +  UNUSED(as);
>     switch (irt_type(ir->t)) {
>     case IRT_I8: return ARMI_LDRSB;
>     case IRT_U8: return ARMI_LDRB;
>     case IRT_I16: return ARMI_LDRSH;
>     case IRT_U16: return ARMI_LDRH;
> -  case IRT_NUM: lua_assert(!LJ_SOFTFP); return ARMI_VLDR_D;
> +  case IRT_NUM: lj_assertA(!LJ_SOFTFP, "unsplit FP op"); return ARMI_VLDR_D;
>     case IRT_FLOAT: if (!LJ_SOFTFP) return ARMI_VLDR_S;  /* fallthrough */
>     default: return ARMI_LDR;
>     }
>   }
>   
> -static ARMIns asm_fxstoreins(IRIns *ir)
> +static ARMIns asm_fxstoreins(ASMState *as, IRIns *ir)
>   {
> +  UNUSED(as);
>     switch (irt_type(ir->t)) {
>     case IRT_I8: case IRT_U8: return ARMI_STRB;
>     case IRT_I16: case IRT_U16: return ARMI_STRH;
> -  case IRT_NUM: lua_assert(!LJ_SOFTFP); return ARMI_VSTR_D;
> +  case IRT_NUM: lj_assertA(!LJ_SOFTFP, "unsplit FP op"); return ARMI_VSTR_D;
>     case IRT_FLOAT: if (!LJ_SOFTFP) return ARMI_VSTR_S;  /* fallthrough */
>     default: return ARMI_STR;
>     }
> @@ -997,12 +1007,13 @@ static ARMIns asm_fxstoreins(IRIns *ir)
>   
>   static void asm_fload(ASMState *as, IRIns *ir)
>   {
> -  if (ir->op1 == REF_NIL) {
> -    lua_assert(!ra_used(ir));  /* We can end up here if DCE is turned off. */
> +  if (ir->op1 == REF_NIL) {  /* FLOAD from GG_State with offset. */
> +    /* We can end up here if DCE is turned off. */
> +    lj_assertA(!ra_used(ir), "NYI FLOAD GG_State");
>     } else {
>       Reg dest = ra_dest(as, ir, RSET_GPR);
>       Reg idx = ra_alloc1(as, ir->op1, RSET_GPR);
> -    ARMIns ai = asm_fxloadins(ir);
> +    ARMIns ai = asm_fxloadins(as, ir);
>       int32_t ofs;
>       if (ir->op2 == IRFL_TAB_ARRAY) {
>         ofs = asm_fuseabase(as, ir->op1);
> @@ -1026,7 +1037,7 @@ static void asm_fstore(ASMState *as, IRIns *ir)
>       IRIns *irf = IR(ir->op1);
>       Reg idx = ra_alloc1(as, irf->op1, rset_exclude(RSET_GPR, src));
>       int32_t ofs = field_ofs[irf->op2];
> -    ARMIns ai = asm_fxstoreins(ir);
> +    ARMIns ai = asm_fxstoreins(as, ir);
>       if ((ai & 0x04000000))
>         emit_lso(as, ai, src, idx, ofs);
>       else
> @@ -1038,8 +1049,8 @@ static void asm_xload(ASMState *as, IRIns *ir)
>   {
>     Reg dest = ra_dest(as, ir,
>   		     (!LJ_SOFTFP && irt_isfp(ir->t)) ? RSET_FPR : RSET_GPR);
> -  lua_assert(!(ir->op2 & IRXLOAD_UNALIGNED));
> -  asm_fusexref(as, asm_fxloadins(ir), dest, ir->op1, RSET_GPR, 0);
> +  lj_assertA(!(ir->op2 & IRXLOAD_UNALIGNED), "unaligned XLOAD");
> +  asm_fusexref(as, asm_fxloadins(as, ir), dest, ir->op1, RSET_GPR, 0);
>   }
>   
>   static void asm_xstore_(ASMState *as, IRIns *ir, int32_t ofs)
> @@ -1047,7 +1058,7 @@ static void asm_xstore_(ASMState *as, IRIns *ir, int32_t ofs)
>     if (ir->r != RID_SINK) {
>       Reg src = ra_alloc1(as, ir->op2,
>   			(!LJ_SOFTFP && irt_isfp(ir->t)) ? RSET_FPR : RSET_GPR);
> -    asm_fusexref(as, asm_fxstoreins(ir), src, ir->op1,
> +    asm_fusexref(as, asm_fxstoreins(as, ir), src, ir->op1,
>   		 rset_exclude(RSET_GPR, src), ofs);
>     }
>   }
> @@ -1066,8 +1077,9 @@ static void asm_ahuvload(ASMState *as, IRIns *ir)
>       rset_clear(allow, type);
>     }
>     if (ra_used(ir)) {
> -    lua_assert((LJ_SOFTFP ? 0 : irt_isnum(ir->t)) ||
> -	       irt_isint(ir->t) || irt_isaddr(ir->t));
> +    lj_assertA((LJ_SOFTFP ? 0 : irt_isnum(ir->t)) ||
> +	       irt_isint(ir->t) || irt_isaddr(ir->t),
> +	       "bad load type %d", irt_type(ir->t));
>       dest = ra_dest(as, ir, (!LJ_SOFTFP && t == IRT_NUM) ? RSET_FPR : allow);
>       rset_clear(allow, dest);
>     }
> @@ -1133,10 +1145,13 @@ static void asm_sload(ASMState *as, IRIns *ir)
>     IRType t = hiop ? IRT_NUM : irt_type(ir->t);
>     Reg dest = RID_NONE, type = RID_NONE, base;
>     RegSet allow = RSET_GPR;
> -  lua_assert(!(ir->op2 & IRSLOAD_PARENT));  /* Handled by asm_head_side(). */
> -  lua_assert(irt_isguard(ir->t) || !(ir->op2 & IRSLOAD_TYPECHECK));
> +  lj_assertA(!(ir->op2 & IRSLOAD_PARENT),
> +	     "bad parent SLOAD");  /* Handled by asm_head_side(). */
> +  lj_assertA(irt_isguard(ir->t) || !(ir->op2 & IRSLOAD_TYPECHECK),
> +	     "inconsistent SLOAD variant");
>   #if LJ_SOFTFP
> -  lua_assert(!(ir->op2 & IRSLOAD_CONVERT));  /* Handled by LJ_SOFTFP SPLIT. */
> +  lj_assertA(!(ir->op2 & IRSLOAD_CONVERT),
> +	     "unsplit SLOAD convert");  /* Handled by LJ_SOFTFP SPLIT. */
>     if (hiop && ra_used(ir+1)) {
>       type = ra_dest(as, ir+1, allow);
>       rset_clear(allow, type);
> @@ -1152,8 +1167,9 @@ static void asm_sload(ASMState *as, IRIns *ir)
>       Reg tmp = RID_NONE;
>       if ((ir->op2 & IRSLOAD_CONVERT))
>         tmp = ra_scratch(as, t == IRT_INT ? RSET_FPR : RSET_GPR);
> -    lua_assert((LJ_SOFTFP ? 0 : irt_isnum(ir->t)) ||
> -	       irt_isint(ir->t) || irt_isaddr(ir->t));
> +    lj_assertA((LJ_SOFTFP ? 0 : irt_isnum(ir->t)) ||
> +	       irt_isint(ir->t) || irt_isaddr(ir->t),
> +	       "bad SLOAD type %d", irt_type(ir->t));
>       dest = ra_dest(as, ir, (!LJ_SOFTFP && t == IRT_NUM) ? RSET_FPR : allow);
>       rset_clear(allow, dest);
>       base = ra_alloc1(as, REF_BASE, allow);
> @@ -1218,7 +1234,8 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>     IRRef args[4];
>     RegSet allow = (RSET_GPR & ~RSET_SCRATCH);
>     RegSet drop = RSET_SCRATCH;
> -  lua_assert(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL));
> +  lj_assertA(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL),
> +	     "bad CNEW/CNEWI operands");
>   
>     as->gcsteps++;
>     if (ra_hasreg(ir->r))
> @@ -1230,10 +1247,10 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>     /* Initialize immutable cdata object. */
>     if (ir->o == IR_CNEWI) {
>       int32_t ofs = sizeof(GCcdata);
> -    lua_assert(sz == 4 || sz == 8);
> +    lj_assertA(sz == 4 || sz == 8, "bad CNEWI size %d", sz);
>       if (sz == 8) {
>         ofs += 4; ir++;
> -      lua_assert(ir->o == IR_HIOP);
> +      lj_assertA(ir->o == IR_HIOP, "expected HIOP for CNEWI");
>       }
>       for (;;) {
>         Reg r = ra_alloc1(as, ir->op2, allow);
> @@ -1306,7 +1323,7 @@ static void asm_obar(ASMState *as, IRIns *ir)
>     MCLabel l_end;
>     Reg obj, val, tmp;
>     /* No need for other object barriers (yet). */
> -  lua_assert(IR(ir->op1)->o == IR_UREFC);
> +  lj_assertA(IR(ir->op1)->o == IR_UREFC, "bad OBAR type");
>     ra_evictset(as, RSET_SCRATCH);
>     l_end = emit_label(as);
>     args[0] = ASMREF_TMP1;  /* global_State *g */
> @@ -1580,7 +1597,7 @@ static void asm_bitshift(ASMState *as, IRIns *ir, ARMShift sh)
>   #define asm_bshr(as, ir)	asm_bitshift(as, ir, ARMSH_LSR)
>   #define asm_bsar(as, ir)	asm_bitshift(as, ir, ARMSH_ASR)
>   #define asm_bror(as, ir)	asm_bitshift(as, ir, ARMSH_ROR)
> -#define asm_brol(as, ir)	lua_assert(0)
> +#define asm_brol(as, ir)	lj_assertA(0, "unexpected BROL")
>   
>   static void asm_intmin_max(ASMState *as, IRIns *ir, int cc)
>   {
> @@ -1731,7 +1748,8 @@ static void asm_intcomp(ASMState *as, IRIns *ir)
>     Reg left;
>     uint32_t m;
>     int cmpprev0 = 0;
> -  lua_assert(irt_isint(ir->t) || irt_isu32(ir->t) || irt_isaddr(ir->t));
> +  lj_assertA(irt_isint(ir->t) || irt_isu32(ir->t) || irt_isaddr(ir->t),
> +	     "bad comparison data type %d", irt_type(ir->t));
>     if (asm_swapops(as, lref, rref)) {
>       Reg tmp = lref; lref = rref; rref = tmp;
>       if (cc >= CC_GE) cc ^= 7;  /* LT <-> GT, LE <-> GE */
> @@ -1900,10 +1918,11 @@ static void asm_hiop(ASMState *as, IRIns *ir)
>     case IR_CNEWI:
>       /* Nothing to do here. Handled by lo op itself. */
>       break;
> -  default: lua_assert(0); break;
> +  default: lj_assertA(0, "bad HIOP for op %d", (ir-1)->o); break;
>     }
>   #else
> -  UNUSED(as); UNUSED(ir); lua_assert(0);
> +  /* Unused without SOFTFP or FFI. */
> +  UNUSED(as); UNUSED(ir); lj_assertA(0, "unexpected HIOP");
>   #endif
>   }
>   
> @@ -1928,7 +1947,7 @@ static void asm_stack_check(ASMState *as, BCReg topslot,
>     if (irp) {
>       if (!ra_hasspill(irp->s)) {
>         pbase = irp->r;
> -      lua_assert(ra_hasreg(pbase));
> +      lj_assertA(ra_hasreg(pbase), "base reg lost");
>       } else if (allow) {
>         pbase = rset_pickbot(allow);
>       } else {
> @@ -1940,7 +1959,7 @@ static void asm_stack_check(ASMState *as, BCReg topslot,
>     }
>     emit_branch(as, ARMF_CC(ARMI_BL, CC_LS), exitstub_addr(as->J, exitno));
>     k = emit_isk12(0, (int32_t)(8*topslot));
> -  lua_assert(k);
> +  lj_assertA(k, "slot offset %d does not fit in K12", 8*topslot);
>     emit_n(as, ARMI_CMP^k, RID_TMP);
>     emit_dnm(as, ARMI_SUB, RID_TMP, RID_TMP, pbase);
>     emit_lso(as, ARMI_LDR, RID_TMP, RID_TMP,
> @@ -1977,7 +1996,8 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
>   #if LJ_SOFTFP
>         RegSet odd = rset_exclude(RSET_GPRODD, RID_BASE);
>         Reg tmp;
> -      lua_assert(irref_isk(ref));  /* LJ_SOFTFP: must be a number constant. */
> +      /* LJ_SOFTFP: must be a number constant. */
> +      lj_assertA(irref_isk(ref), "unsplit FP op");
>         tmp = ra_allock(as, (int32_t)ir_knum(ir)->u32.lo,
>   		      rset_exclude(RSET_GPREVEN, RID_BASE));
>         emit_lso(as, ARMI_STR, tmp, RID_BASE, ofs);
> @@ -1991,7 +2011,8 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
>       } else {
>         RegSet odd = rset_exclude(RSET_GPRODD, RID_BASE);
>         Reg type;
> -      lua_assert(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t));
> +      lj_assertA(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t),
> +		 "restore of IR type %d", irt_type(ir->t));
>         if (!irt_ispri(ir->t)) {
>   	Reg src = ra_alloc1(as, ref, rset_exclude(RSET_GPREVEN, RID_BASE));
>   	emit_lso(as, ARMI_STR, src, RID_BASE, ofs);
> @@ -2011,7 +2032,7 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
>       }
>       checkmclim(as);
>     }
> -  lua_assert(map + nent == flinks);
> +  lj_assertA(map + nent == flinks, "inconsistent frames in snapshot");
>   }
>   
>   /* -- GC handling --------------------------------------------------------- */
> @@ -2097,7 +2118,7 @@ static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
>       rset_clear(allow, ra_dest(as, ir, allow));
>     } else {
>       Reg r = irp->r;
> -    lua_assert(ra_hasreg(r));
> +    lj_assertA(ra_hasreg(r), "base reg lost");
>       rset_clear(allow, r);
>       if (r != ir->r && !rset_test(as->freeset, r))
>         ra_restore(as, regcost_ref(as->cost[r]));
> @@ -2119,7 +2140,7 @@ static void asm_tail_fixup(ASMState *as, TraceNo lnk)
>     } else {
>       /* Patch stack adjustment. */
>       uint32_t k = emit_isk12(ARMI_ADD, spadj);
> -    lua_assert(k);
> +    lj_assertA(k, "stack adjustment %d does not fit in K12", spadj);
>       p[-2] = (ARMI_ADD^k) | ARMF_D(RID_SP) | ARMF_N(RID_SP);
>     }
>     /* Patch exit branch. */
> @@ -2201,7 +2222,7 @@ void lj_asm_patchexit(jit_State *J, GCtrace *T, ExitNo exitno, MCode *target)
>         if (!cstart) cstart = p;
>       }
>     }
> -  lua_assert(cstart != NULL);
> +  lj_assertJ(cstart != NULL, "exit stub %d not found", exitno);
>     lj_mcode_sync(cstart, cend);
>     lj_mcode_patch(J, mcarea, 1);
>   }
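For completeness: the message plus its arguments from each `lj_assertA()` /
`lj_assertJ()` end up in `lj_assert_fail()` in <src/lj_assert.c>. Roughly -- this is a
paraphrase from memory, not the verbatim file:

  /* In lj_assert.c, after the usual includes (lj_obj.h, <stdio.h>, <stdarg.h>, <stdlib.h>). */
  void lj_assert_fail(global_State *g, const char *file, int line,
                      const char *func, const char *fmt, ...)
  {
    va_list argp;
    va_start(argp, fmt);
    fprintf(stderr, "LuaJIT ASSERT %s:%d: %s: ", file, line, func);
    vfprintf(stderr, fmt, argp);
    fputc('\n', stderr);
    va_end(argp);
    UNUSED(g);
    abort();
  }

So a check like `lj_assertA(rset_test(as->freeset, gpr), "reg %d not free", gpr)` above
now reports the offending register number instead of a bare condition failure.
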
> diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
> index c3d6889e..d1d4237b 100644
> --- a/src/lj_asm_arm64.h
> +++ b/src/lj_asm_arm64.h
> @@ -213,7 +213,7 @@ static uint32_t asm_fuseopm(ASMState *as, A64Ins ai, IRRef ref, RegSet allow)
>       return A64F_M(ir->r);
>     } else if (irref_isk(ref)) {
>       uint32_t m;
> -    int64_t k = get_k64val(ir);
> +    int64_t k = get_k64val(as, ref);
>       if ((ai & 0x1f000000) == 0x0a000000)
>         m = emit_isk13(k, irt_is64(ir->t));
>       else
> @@ -354,9 +354,9 @@ static int asm_fusemadd(ASMState *as, IRIns *ir, A64Ins ai, A64Ins air)
>   static int asm_fuseandshift(ASMState *as, IRIns *ir)
>   {
>     IRIns *irl = IR(ir->op1);
> -  lua_assert(ir->o == IR_BAND);
> +  lj_assertA(ir->o == IR_BAND, "bad usage");
>     if (canfuse(as, irl) && irref_isk(ir->op2)) {
> -    uint64_t mask = get_k64val(IR(ir->op2));
> +    uint64_t mask = get_k64val(as, ir->op2);
>       if (irref_isk(irl->op2) && (irl->o == IR_BSHR || irl->o == IR_BSHL)) {
>         int32_t shmask = irt_is64(irl->t) ? 63 : 31;
>         int32_t shift = (IR(irl->op2)->i & shmask);
> @@ -384,7 +384,7 @@ static int asm_fuseandshift(ASMState *as, IRIns *ir)
>   static int asm_fuseorshift(ASMState *as, IRIns *ir)
>   {
>     IRIns *irl = IR(ir->op1), *irr = IR(ir->op2);
> -  lua_assert(ir->o == IR_BOR);
> +  lj_assertA(ir->o == IR_BOR, "bad usage");
>     if (canfuse(as, irl) && canfuse(as, irr) &&
>         ((irl->o == IR_BSHR && irr->o == IR_BSHL) ||
>          (irl->o == IR_BSHL && irr->o == IR_BSHR))) {
> @@ -428,7 +428,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
>       if (ref) {
>         if (irt_isfp(ir->t)) {
>   	if (fpr <= REGARG_LASTFPR) {
> -	  lua_assert(rset_test(as->freeset, fpr)); /* Must have been evicted. */
> +	  lj_assertA(rset_test(as->freeset, fpr),
> +		     "reg %d not free", fpr);  /* Must have been evicted. */
>   	  ra_leftov(as, fpr, ref);
>   	  fpr++;
>   	} else {
> @@ -438,7 +439,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
>   	}
>         } else {
>   	if (gpr <= REGARG_LASTGPR) {
> -	  lua_assert(rset_test(as->freeset, gpr)); /* Must have been evicted. */
> +	  lj_assertA(rset_test(as->freeset, gpr),
> +		     "reg %d not free", gpr);  /* Must have been evicted. */
>   	  ra_leftov(as, gpr, ref);
>   	  gpr++;
>   	} else {
> @@ -459,7 +461,7 @@ static void asm_setupresult(ASMState *as, IRIns *ir, const CCallInfo *ci)
>       rset_clear(drop, ir->r); /* Dest reg handled below. */
>     ra_evictset(as, drop); /* Evictions must be performed first. */
>     if (ra_used(ir)) {
> -    lua_assert(!irt_ispri(ir->t));
> +    lj_assertA(!irt_ispri(ir->t), "PRI dest");
>       if (irt_isfp(ir->t)) {
>         if (ci->flags & CCI_CASTU64) {
>   	Reg dest = ra_dest(as, ir, RSET_FPR) & 31;
> @@ -546,7 +548,7 @@ static void asm_conv(ASMState *as, IRIns *ir)
>     int st64 = (st == IRT_I64 || st == IRT_U64 || st == IRT_P64);
>     int stfp = (st == IRT_NUM || st == IRT_FLOAT);
>     IRRef lref = ir->op1;
> -  lua_assert(irt_type(ir->t) != st);
> +  lj_assertA(irt_type(ir->t) != st, "inconsistent types for CONV");
>     if (irt_isfp(ir->t)) {
>       Reg dest = ra_dest(as, ir, RSET_FPR);
>       if (stfp) {  /* FP to FP conversion. */
> @@ -566,7 +568,8 @@ static void asm_conv(ASMState *as, IRIns *ir)
>     } else if (stfp) {  /* FP to integer conversion. */
>       if (irt_isguard(ir->t)) {
>         /* Checked conversions are only supported from number to int. */
> -      lua_assert(irt_isint(ir->t) && st == IRT_NUM);
> +      lj_assertA(irt_isint(ir->t) && st == IRT_NUM,
> +		 "bad type for checked CONV");
>         asm_tointg(as, ir, ra_alloc1(as, lref, RSET_FPR));
>       } else {
>         Reg left = ra_alloc1(as, lref, RSET_FPR);
> @@ -586,7 +589,7 @@ static void asm_conv(ASMState *as, IRIns *ir)
>       A64Ins ai = st == IRT_I8 ? A64I_SXTBw :
>   		st == IRT_U8 ? A64I_UXTBw :
>   		st == IRT_I16 ? A64I_SXTHw : A64I_UXTHw;
> -    lua_assert(irt_isint(ir->t) || irt_isu32(ir->t));
> +    lj_assertA(irt_isint(ir->t) || irt_isu32(ir->t), "bad type for CONV EXT");
>       emit_dn(as, ai, dest, left);
>     } else {
>       Reg dest = ra_dest(as, ir, RSET_GPR);
> @@ -650,7 +653,8 @@ static void asm_tvstore64(ASMState *as, Reg base, int32_t ofs, IRRef ref)
>   {
>     RegSet allow = rset_exclude(RSET_GPR, base);
>     IRIns *ir = IR(ref);
> -  lua_assert(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t));
> +  lj_assertA(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t),
> +	     "store of IR type %d", irt_type(ir->t));
>     if (irref_isk(ref)) {
>       TValue k;
>       lj_ir_kvalue(as->J->L, &k, ir);
> @@ -770,7 +774,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>       }
>       rset_clear(allow, scr);
>     } else {
> -    lua_assert(irt_ispri(kt) && !irt_isnil(kt));
> +    lj_assertA(irt_ispri(kt) && !irt_isnil(kt), "bad HREF key type");
>       type = ra_allock(as, ~((int64_t)~irt_toitype(ir->t) << 47), allow);
>       scr = ra_scratch(as, rset_clear(allow, type));
>       rset_clear(allow, scr);
> @@ -831,7 +835,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>       rset_clear(allow, type);
>     }
>     /* Load main position relative to tab->node into dest. */
> -  khash = isk ? ir_khash(irkey) : 1;
> +  khash = isk ? ir_khash(as, irkey) : 1;
>     if (khash == 0) {
>       emit_lso(as, A64I_LDRx, dest, tab, offsetof(GCtab, node));
>     } else {
> @@ -886,7 +890,7 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
>     Reg key, idx = node;
>     RegSet allow = rset_exclude(RSET_GPR, node);
>     uint64_t k;
> -  lua_assert(ofs % sizeof(Node) == 0);
> +  lj_assertA(ofs % sizeof(Node) == 0, "unaligned HREFK slot");
>     if (bigofs) {
>       idx = dest;
>       rset_clear(allow, dest);
> @@ -936,7 +940,7 @@ static void asm_uref(ASMState *as, IRIns *ir)
>   static void asm_fref(ASMState *as, IRIns *ir)
>   {
>     UNUSED(as); UNUSED(ir);
> -  lua_assert(!ra_used(ir));
> +  lj_assertA(!ra_used(ir), "unfused FREF");
>   }
>   
>   static void asm_strref(ASMState *as, IRIns *ir)
> @@ -988,7 +992,7 @@ static void asm_fload(ASMState *as, IRIns *ir)
>     Reg idx;
>     A64Ins ai = asm_fxloadins(ir);
>     int32_t ofs;
> -  if (ir->op1 == REF_NIL) {
> +  if (ir->op1 == REF_NIL) {  /* FLOAD from GG_State with offset. */
>       idx = RID_GL;
>       ofs = (ir->op2 << 2) - GG_OFS(g);
>     } else {
> @@ -1019,7 +1023,7 @@ static void asm_fstore(ASMState *as, IRIns *ir)
>   static void asm_xload(ASMState *as, IRIns *ir)
>   {
>     Reg dest = ra_dest(as, ir, irt_isfp(ir->t) ? RSET_FPR : RSET_GPR);
> -  lua_assert(!(ir->op2 & IRXLOAD_UNALIGNED));
> +  lj_assertA(!(ir->op2 & IRXLOAD_UNALIGNED), "unaligned XLOAD");
>     asm_fusexref(as, asm_fxloadins(ir), dest, ir->op1, RSET_GPR);
>   }
>   
> @@ -1037,8 +1041,9 @@ static void asm_ahuvload(ASMState *as, IRIns *ir)
>     Reg idx, tmp, type;
>     int32_t ofs = 0;
>     RegSet gpr = RSET_GPR, allow = irt_isnum(ir->t) ? RSET_FPR : RSET_GPR;
> -  lua_assert(irt_isnum(ir->t) || irt_ispri(ir->t) || irt_isaddr(ir->t) ||
> -	     irt_isint(ir->t));
> +  lj_assertA(irt_isnum(ir->t) || irt_ispri(ir->t) || irt_isaddr(ir->t) ||
> +	     irt_isint(ir->t),
> +	     "bad load type %d", irt_type(ir->t));
>     if (ra_used(ir)) {
>       Reg dest = ra_dest(as, ir, allow);
>       tmp = irt_isnum(ir->t) ? ra_scratch(as, rset_clear(gpr, dest)) : dest;
> @@ -1057,7 +1062,8 @@ static void asm_ahuvload(ASMState *as, IRIns *ir)
>     /* Always do the type check, even if the load result is unused. */
>     asm_guardcc(as, irt_isnum(ir->t) ? CC_LS : CC_NE);
>     if (irt_type(ir->t) >= IRT_NUM) {
> -    lua_assert(irt_isinteger(ir->t) || irt_isnum(ir->t));
> +    lj_assertA(irt_isinteger(ir->t) || irt_isnum(ir->t),
> +	       "bad load type %d", irt_type(ir->t));
>       emit_nm(as, A64I_CMPx | A64F_SH(A64SH_LSR, 32),
>   	    ra_allock(as, LJ_TISNUM << 15, rset_exclude(gpr, idx)), tmp);
>     } else if (irt_isaddr(ir->t)) {
> @@ -1122,8 +1128,10 @@ static void asm_sload(ASMState *as, IRIns *ir)
>     IRType1 t = ir->t;
>     Reg dest = RID_NONE, base;
>     RegSet allow = RSET_GPR;
> -  lua_assert(!(ir->op2 & IRSLOAD_PARENT));  /* Handled by asm_head_side(). */
> -  lua_assert(irt_isguard(t) || !(ir->op2 & IRSLOAD_TYPECHECK));
> +  lj_assertA(!(ir->op2 & IRSLOAD_PARENT),
> +	     "bad parent SLOAD");  /* Handled by asm_head_side(). */
> +  lj_assertA(irt_isguard(t) || !(ir->op2 & IRSLOAD_TYPECHECK),
> +	     "inconsistent SLOAD variant");
>     if ((ir->op2 & IRSLOAD_CONVERT) && irt_isguard(t) && irt_isint(t)) {
>       dest = ra_scratch(as, RSET_FPR);
>       asm_tointg(as, ir, dest);
> @@ -1132,7 +1140,8 @@ static void asm_sload(ASMState *as, IRIns *ir)
>       Reg tmp = RID_NONE;
>       if ((ir->op2 & IRSLOAD_CONVERT))
>         tmp = ra_scratch(as, irt_isint(t) ? RSET_FPR : RSET_GPR);
> -    lua_assert((irt_isnum(t)) || irt_isint(t) || irt_isaddr(t));
> +    lj_assertA((irt_isnum(t)) || irt_isint(t) || irt_isaddr(t),
> +	       "bad SLOAD type %d", irt_type(t));
>       dest = ra_dest(as, ir, irt_isnum(t) ? RSET_FPR : allow);
>       base = ra_alloc1(as, REF_BASE, rset_clear(allow, dest));
>       if (irt_isaddr(t)) {
> @@ -1172,7 +1181,8 @@ dotypecheck:
>       /* Need type check, even if the load result is unused. */
>       asm_guardcc(as, irt_isnum(t) ? CC_LS : CC_NE);
>       if (irt_type(t) >= IRT_NUM) {
> -      lua_assert(irt_isinteger(t) || irt_isnum(t));
> +      lj_assertA(irt_isinteger(t) || irt_isnum(t),
> +		 "bad SLOAD type %d", irt_type(t));
>         emit_nm(as, A64I_CMPx | A64F_SH(A64SH_LSR, 32),
>   	      ra_allock(as, LJ_TISNUM << 15, allow), tmp);
>       } else if (irt_isnil(t)) {
> @@ -1207,7 +1217,8 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>     const CCallInfo *ci = &lj_ir_callinfo[IRCALL_lj_mem_newgco];
>     IRRef args[4];
>     RegSet allow = (RSET_GPR & ~RSET_SCRATCH);
> -  lua_assert(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL));
> +  lj_assertA(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL),
> +	     "bad CNEW/CNEWI operands");
>   
>     as->gcsteps++;
>     asm_setupresult(as, ir, ci);  /* GCcdata * */
> @@ -1215,7 +1226,7 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>     if (ir->o == IR_CNEWI) {
>       int32_t ofs = sizeof(GCcdata);
>       Reg r = ra_alloc1(as, ir->op2, allow);
> -    lua_assert(sz == 4 || sz == 8);
> +    lj_assertA(sz == 4 || sz == 8, "bad CNEWI size %d", sz);
>       emit_lso(as, sz == 8 ? A64I_STRx : A64I_STRw, r, RID_RET, ofs);
>     } else if (ir->op2 != REF_NIL) {  /* Create VLA/VLS/aligned cdata. */
>       ci = &lj_ir_callinfo[IRCALL_lj_cdata_newv];
> @@ -1281,7 +1292,7 @@ static void asm_obar(ASMState *as, IRIns *ir)
>     RegSet allow = RSET_GPR;
>     Reg obj, val, tmp;
>     /* No need for other object barriers (yet). */
> -  lua_assert(IR(ir->op1)->o == IR_UREFC);
> +  lj_assertA(IR(ir->op1)->o == IR_UREFC, "bad OBAR type");
>     ra_evictset(as, RSET_SCRATCH);
>     l_end = emit_label(as);
>     args[0] = ASMREF_TMP1;  /* global_State *g */
> @@ -1551,7 +1562,7 @@ static void asm_bitshift(ASMState *as, IRIns *ir, A64Ins ai, A64Shift sh)
>   #define asm_bshr(as, ir)	asm_bitshift(as, ir, A64I_UBFMw, A64SH_LSR)
>   #define asm_bsar(as, ir)	asm_bitshift(as, ir, A64I_SBFMw, A64SH_ASR)
>   #define asm_bror(as, ir)	asm_bitshift(as, ir, A64I_EXTRw, A64SH_ROR)
> -#define asm_brol(as, ir)	lua_assert(0)
> +#define asm_brol(as, ir)	lj_assertA(0, "unexpected BROL")
>   
>   static void asm_intmin_max(ASMState *as, IRIns *ir, A64CC cc)
>   {
> @@ -1632,15 +1643,16 @@ static void asm_intcomp(ASMState *as, IRIns *ir)
>     Reg left;
>     uint32_t m;
>     int cmpprev0 = 0;
> -  lua_assert(irt_is64(ir->t) || irt_isint(ir->t) ||
> -	     irt_isu32(ir->t) || irt_isaddr(ir->t) || irt_isu8(ir->t));
> +  lj_assertA(irt_is64(ir->t) || irt_isint(ir->t) ||
> +	     irt_isu32(ir->t) || irt_isaddr(ir->t) || irt_isu8(ir->t),
> +	     "bad comparison data type %d", irt_type(ir->t));
>     if (asm_swapops(as, lref, rref)) {
>       IRRef tmp = lref; lref = rref; rref = tmp;
>       if (cc >= CC_GE) cc ^= 7;  /* LT <-> GT, LE <-> GE */
>       else if (cc > CC_NE) cc ^= 11;  /* LO <-> HI, LS <-> HS */
>     }
>     oldcc = cc;
> -  if (irref_isk(rref) && get_k64val(IR(rref)) == 0) {
> +  if (irref_isk(rref) && get_k64val(as, rref) == 0) {
>       IRIns *irl = IR(lref);
>       if (cc == CC_GE) cc = CC_PL;
>       else if (cc == CC_LT) cc = CC_MI;
> @@ -1655,7 +1667,7 @@ static void asm_intcomp(ASMState *as, IRIns *ir)
>   	Reg tmp = blref; blref = brref; brref = tmp;
>         }
>         if (irref_isk(brref)) {
> -	uint64_t k = get_k64val(IR(brref));
> +	uint64_t k = get_k64val(as, brref);
>   	if (k && !(k & (k-1)) && (cc == CC_EQ || cc == CC_NE)) {
>   	  asm_guardtnb(as, cc == CC_EQ ? A64I_TBZ : A64I_TBNZ,
>   		       ra_alloc1(as, blref, RSET_GPR), emit_ctz64(k));
> @@ -1704,7 +1716,8 @@ static void asm_comp(ASMState *as, IRIns *ir)
>   /* Hiword op of a split 64 bit op. Previous op must be the loword op. */
>   static void asm_hiop(ASMState *as, IRIns *ir)
>   {
> -  UNUSED(as); UNUSED(ir); lua_assert(0);  /* Unused on 64 bit. */
> +  UNUSED(as); UNUSED(ir);
> +  lj_assertA(0, "unexpected HIOP");  /* Unused on 64 bit. */
>   }
>   
>   /* -- Profiling ----------------------------------------------------------- */
> @@ -1712,7 +1725,7 @@ static void asm_hiop(ASMState *as, IRIns *ir)
>   static void asm_prof(ASMState *as, IRIns *ir)
>   {
>     uint32_t k = emit_isk13(HOOK_PROFILE, 0);
> -  lua_assert(k != 0);
> +  lj_assertA(k != 0, "HOOK_PROFILE does not fit in K13");
>     UNUSED(ir);
>     asm_guardcc(as, CC_NE);
>     emit_n(as, A64I_TSTw^k, RID_TMP);
> @@ -1730,7 +1743,7 @@ static void asm_stack_check(ASMState *as, BCReg topslot,
>     if (irp) {
>       if (!ra_hasspill(irp->s)) {
>         pbase = irp->r;
> -      lua_assert(ra_hasreg(pbase));
> +      lj_assertA(ra_hasreg(pbase), "base reg lost");
>       } else if (allow) {
>         pbase = rset_pickbot(allow);
>       } else {
> @@ -1742,7 +1755,7 @@ static void asm_stack_check(ASMState *as, BCReg topslot,
>     }
>     emit_cond_branch(as, CC_LS, asm_exitstub_addr(as, exitno));
>     k = emit_isk12((8*topslot));
> -  lua_assert(k);
> +  lj_assertA(k, "slot offset %d does not fit in K12", 8*topslot);
>     emit_n(as, A64I_CMPx^k, RID_TMP);
>     emit_dnm(as, A64I_SUBx, RID_TMP, RID_TMP, pbase);
>     emit_lso(as, A64I_LDRx, RID_TMP, RID_TMP,
> @@ -1783,7 +1796,7 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
>       }
>       checkmclim(as);
>     }
> -  lua_assert(map + nent == flinks);
> +  lj_assertA(map + nent == flinks, "inconsistent frames in snapshot");
>   }
>   
>   /* -- GC handling --------------------------------------------------------- */
> @@ -1871,7 +1884,7 @@ static RegSet asm_head_side_base(ASMState *as, IRIns *irp, RegSet allow)
>       rset_clear(allow, ra_dest(as, ir, allow));
>     } else {
>       Reg r = irp->r;
> -    lua_assert(ra_hasreg(r));
> +    lj_assertA(ra_hasreg(r), "base reg lost");
>       rset_clear(allow, r);
>       if (r != ir->r && !rset_test(as->freeset, r))
>         ra_restore(as, regcost_ref(as->cost[r]));
> @@ -1895,7 +1908,7 @@ static void asm_tail_fixup(ASMState *as, TraceNo lnk)
>     } else {
>       /* Patch stack adjustment. */
>       uint32_t k = emit_isk12(spadj);
> -    lua_assert(k);
> +    lj_assertA(k, "stack adjustment %d does not fit in K12", spadj);
>       p[-2] = (A64I_ADDx^k) | A64F_D(RID_SP) | A64F_N(RID_SP);
>     }
>     /* Patch exit branch. */
> @@ -1981,7 +1994,7 @@ void lj_asm_patchexit(jit_State *J, GCtrace *T, ExitNo exitno, MCode *target)
>       } else if ((ins & 0xfc000000u) == 0x14000000u &&
>   	       ((ins ^ (px-p)) & 0x03ffffffu) == 0) {
>         /* Patch b. */
> -      lua_assert(A64F_S_OK(delta, 26));
> +      lj_assertJ(A64F_S_OK(delta, 26), "branch target out of range");
>         *p = A64I_LE((ins & 0xfc000000u) | A64F_S26(delta));
>         if (!cstart) cstart = p;
>       } else if ((ins & 0x7e000000u) == 0x34000000u &&
> @@ -2002,7 +2015,7 @@ void lj_asm_patchexit(jit_State *J, GCtrace *T, ExitNo exitno, MCode *target)
>     }
>     {  /* Always patch long-range branch in exit stub itself. */
>       ptrdiff_t delta = target - px;
> -    lua_assert(A64F_S_OK(delta, 26));
> +    lj_assertJ(A64F_S_OK(delta, 26), "branch target out of range");
>       *px = A64I_B | A64F_S26(delta);
>       if (!cstart) cstart = px;
>     }
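One more convention worth calling out: messages such as "IR %04d has unsplit 64 bit
type" compute the printed instruction index as `(int)(ir - as->ir) - REF_BIAS`. A tiny
illustration of what that evaluates to, assuming the usual biased IR array layout where
`as->ir` is indexed directly by IRRef:

  IRRef ref = (IRRef)(ir - as->ir);  /* Biased reference of this instruction. */
  int idx = (int)ref - REF_BIAS;     /* Unbiased index, same numbering as -jdump IR listings. */

so a failing assertion can be matched directly against the dumped trace IR.
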
> diff --git a/src/lj_asm_mips.h b/src/lj_asm_mips.h
> index 0f92959b..ea108aab 100644
> --- a/src/lj_asm_mips.h
> +++ b/src/lj_asm_mips.h
> @@ -23,7 +23,7 @@ static Reg ra_alloc1z(ASMState *as, IRRef ref, RegSet allow)
>   {
>     Reg r = IR(ref)->r;
>     if (ra_noreg(r)) {
> -    if (!(allow & RSET_FPR) && irref_isk(ref) && get_kval(IR(ref)) == 0)
> +    if (!(allow & RSET_FPR) && irref_isk(ref) && get_kval(as, ref) == 0)
>         return RID_ZERO;
>       r = ra_allocref(as, ref, allow);
>     } else {
> @@ -66,10 +66,10 @@ static void asm_sparejump_setup(ASMState *as)
>   {
>     MCode *mxp = as->mcbot;
>     if (((uintptr_t)mxp & (LJ_PAGESIZE-1)) == sizeof(MCLink)) {
> -    lua_assert(MIPSI_NOP == 0);
> +    lj_assertA(MIPSI_NOP == 0, "bad NOP");
>       memset(mxp, 0, MIPS_SPAREJUMP*2*sizeof(MCode));
>       mxp += MIPS_SPAREJUMP*2;
> -    lua_assert(mxp < as->mctop);
> +    lj_assertA(mxp < as->mctop, "MIPS_SPAREJUMP too big");
>       lj_mcode_sync(as->mcbot, mxp);
>       lj_mcode_commitbot(as->J, mxp);
>       as->mcbot = mxp;
> @@ -84,7 +84,8 @@ static void asm_exitstub_setup(ASMState *as)
>     /* sw TMP, 0(sp); j ->vm_exit_handler; li TMP, traceno */
>     *--mxp = MIPSI_LI|MIPSF_T(RID_TMP)|as->T->traceno;
>     *--mxp = MIPSI_J|((((uintptr_t)(void *)lj_vm_exit_handler)>>2)&0x03ffffffu);
> -  lua_assert(((uintptr_t)mxp ^ (uintptr_t)(void *)lj_vm_exit_handler)>>28 == 0);
> +  lj_assertA(((uintptr_t)mxp ^ (uintptr_t)(void *)lj_vm_exit_handler)>>28 == 0,
> +	     "branch target out of range");
>     *--mxp = MIPSI_SW|MIPSF_T(RID_TMP)|MIPSF_S(RID_SP)|0;
>     as->mctop = mxp;
>   }
> @@ -195,20 +196,20 @@ static void asm_fusexref(ASMState *as, MIPSIns mi, Reg rt, IRRef ref,
>     if (ra_noreg(ir->r) && canfuse(as, ir)) {
>       if (ir->o == IR_ADD) {
>         intptr_t ofs2;
> -      if (irref_isk(ir->op2) && (ofs2 = ofs + get_kval(IR(ir->op2)),
> +      if (irref_isk(ir->op2) && (ofs2 = ofs + get_kval(as, ir->op2),
>   				 checki16(ofs2))) {
>   	ref = ir->op1;
>   	ofs = (int32_t)ofs2;
>         }
>       } else if (ir->o == IR_STRREF) {
>         intptr_t ofs2 = 65536;
> -      lua_assert(ofs == 0);
> +      lj_assertA(ofs == 0, "bad usage");
>         ofs = (int32_t)sizeof(GCstr);
>         if (irref_isk(ir->op2)) {
> -	ofs2 = ofs + get_kval(IR(ir->op2));
> +	ofs2 = ofs + get_kval(as, ir->op2);
>   	ref = ir->op1;
>         } else if (irref_isk(ir->op1)) {
> -	ofs2 = ofs + get_kval(IR(ir->op1));
> +	ofs2 = ofs + get_kval(as, ir->op1);
>   	ref = ir->op2;
>         }
>         if (!checki16(ofs2)) {
> @@ -252,7 +253,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
>   #if !LJ_SOFTFP
>         if (irt_isfp(ir->t) && fpr <= REGARG_LASTFPR &&
>   	  !(ci->flags & CCI_VARARG)) {
> -	lua_assert(rset_test(as->freeset, fpr));  /* Already evicted. */
> +	lj_assertA(rset_test(as->freeset, fpr),
> +		   "reg %d not free", fpr);  /* Already evicted. */
>   	ra_leftov(as, fpr, ref);
>   	fpr += LJ_32 ? 2 : 1;
>   	gpr += (LJ_32 && irt_isnum(ir->t)) ? 2 : 1;
> @@ -264,7 +266,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
>   #endif
>   	if (LJ_32 && irt_isnum(ir->t)) gpr = (gpr+1) & ~1;
>   	if (gpr <= REGARG_LASTGPR) {
> -	  lua_assert(rset_test(as->freeset, gpr));  /* Already evicted. */
> +	  lj_assertA(rset_test(as->freeset, gpr),
> +		     "reg %d not free", gpr);  /* Already evicted. */
>   #if !LJ_SOFTFP
>   	  if (irt_isfp(ir->t)) {
>   	    RegSet of = as->freeset;
> @@ -277,7 +280,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
>   #if LJ_32
>   	      emit_tg(as, MIPSI_MFC1, gpr+(LJ_BE?0:1), r+1);
>   	      emit_tg(as, MIPSI_MFC1, gpr+(LJ_BE?1:0), r);
> -	      lua_assert(rset_test(as->freeset, gpr+1));  /* Already evicted. */
> +	      lj_assertA(rset_test(as->freeset, gpr+1),
> +			 "reg %d not free", gpr+1);  /* Already evicted. */
>   	      gpr += 2;
>   #else
>   	      emit_tg(as, MIPSI_DMFC1, gpr, r);
> @@ -347,7 +351,7 @@ static void asm_setupresult(ASMState *as, IRIns *ir, const CCallInfo *ci)
>   #endif
>     ra_evictset(as, drop);  /* Evictions must be performed first. */
>     if (ra_used(ir)) {
> -    lua_assert(!irt_ispri(ir->t));
> +    lj_assertA(!irt_ispri(ir->t), "PRI dest");
>       if (!LJ_SOFTFP && irt_isfp(ir->t)) {
>         if ((ci->flags & CCI_CASTU64)) {
>   	int32_t ofs = sps_scale(ir->s);
> @@ -395,7 +399,7 @@ static void asm_callx(ASMState *as, IRIns *ir)
>     func = ir->op2; irf = IR(func);
>     if (irf->o == IR_CARG) { func = irf->op1; irf = IR(func); }
>     if (irref_isk(func)) {  /* Call to constant address. */
> -    ci.func = (ASMFunction)(void *)get_kval(irf);
> +    ci.func = (ASMFunction)(void *)get_kval(as, func);
>     } else {  /* Need specific register for indirect calls. */
>       Reg r = ra_alloc1(as, func, RID2RSET(RID_CFUNCADDR));
>       MCode *p = as->mcp;
> @@ -512,15 +516,19 @@ static void asm_conv(ASMState *as, IRIns *ir)
>   #endif
>     IRRef lref = ir->op1;
>   #if LJ_32
> -  lua_assert(!(irt_isint64(ir->t) ||
> -	       (st == IRT_I64 || st == IRT_U64))); /* Handled by SPLIT. */
> +  /* 64 bit integer conversions are handled by SPLIT. */
> +  lj_assertA(!(irt_isint64(ir->t) || (st == IRT_I64 || st == IRT_U64)),
> +	     "IR %04d has unsplit 64 bit type",
> +	     (int)(ir - as->ir) - REF_BIAS);
>   #endif
>   #if LJ_SOFTFP32
>     /* FP conversions are handled by SPLIT. */
> -  lua_assert(!irt_isfp(ir->t) && !(st == IRT_NUM || st == IRT_FLOAT));
> +  lj_assertA(!irt_isfp(ir->t) && !(st == IRT_NUM || st == IRT_FLOAT),
> +	     "IR %04d has FP type",
> +	     (int)(ir - as->ir) - REF_BIAS);
>     /* Can't check for same types: SPLIT uses CONV int.int + BXOR for sfp NEG. */
>   #else
> -  lua_assert(irt_type(ir->t) != st);
> +  lj_assertA(irt_type(ir->t) != st, "inconsistent types for CONV");
>   #if !LJ_SOFTFP
>     if (irt_isfp(ir->t)) {
>       Reg dest = ra_dest(as, ir, RSET_FPR);
> @@ -579,7 +587,8 @@ static void asm_conv(ASMState *as, IRIns *ir)
>     } else if (stfp) {  /* FP to integer conversion. */
>       if (irt_isguard(ir->t)) {
>         /* Checked conversions are only supported from number to int. */
> -      lua_assert(irt_isint(ir->t) && st == IRT_NUM);
> +      lj_assertA(irt_isint(ir->t) && st == IRT_NUM,
> +		 "bad type for checked CONV");
>         asm_tointg(as, ir, ra_alloc1(as, lref, RSET_FPR));
>       } else {
>         Reg dest = ra_dest(as, ir, RSET_GPR);
> @@ -679,7 +688,8 @@ static void asm_conv(ASMState *as, IRIns *ir)
>     } else if (stfp) {  /* FP to integer conversion. */
>       if (irt_isguard(ir->t)) {
>         /* Checked conversions are only supported from number to int. */
> -      lua_assert(irt_isint(ir->t) && st == IRT_NUM);
> +      lj_assertA(irt_isint(ir->t) && st == IRT_NUM,
> +		 "bad type for checked CONV");
>         asm_tointg(as, ir, RID_NONE);
>       } else {
>         IRCallID cid = irt_is64(ir->t) ?
> @@ -698,7 +708,7 @@ static void asm_conv(ASMState *as, IRIns *ir)
>       Reg dest = ra_dest(as, ir, RSET_GPR);
>       if (st >= IRT_I8 && st <= IRT_U16) {  /* Extend to 32 bit integer. */
>         Reg left = ra_alloc1(as, ir->op1, RSET_GPR);
> -      lua_assert(irt_isint(ir->t) || irt_isu32(ir->t));
> +      lj_assertA(irt_isint(ir->t) || irt_isu32(ir->t), "bad type for CONV EXT");
>         if ((ir->op2 & IRCONV_SEXT)) {
>   	if (LJ_64 || (as->flags & JIT_F_MIPSXXR2)) {
>   	  emit_dst(as, st == IRT_I8 ? MIPSI_SEB : MIPSI_SEH, dest, 0, left);
> @@ -795,7 +805,8 @@ static void asm_tvstore64(ASMState *as, Reg base, int32_t ofs, IRRef ref)
>   {
>     RegSet allow = rset_exclude(RSET_GPR, base);
>     IRIns *ir = IR(ref);
> -  lua_assert(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t));
> +  lj_assertA(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t),
> +	     "store of IR type %d", irt_type(ir->t));
>     if (irref_isk(ref)) {
>       TValue k;
>       lj_ir_kvalue(as->J->L, &k, ir);
> @@ -944,7 +955,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>         if (isk && irt_isaddr(kt)) {
>   	k = ((int64_t)irt_toitype(irkey->t) << 47) | irkey[1].tv.u64;
>         } else {
> -	lua_assert(irt_ispri(kt) && !irt_isnil(kt));
> +	lj_assertA(irt_ispri(kt) && !irt_isnil(kt), "bad HREF key type");
>   	k = ~((int64_t)~irt_toitype(ir->t) << 47);
>         }
>         cmp64 = ra_allock(as, k, allow);
> @@ -1012,7 +1023,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>   #endif
>   
>     /* Load main position relative to tab->node into dest. */
> -  khash = isk ? ir_khash(irkey) : 1;
> +  khash = isk ? ir_khash(as, irkey) : 1;
>     if (khash == 0) {
>       emit_tsi(as, MIPSI_AL, dest, tab, (int32_t)offsetof(GCtab, node));
>     } else {
> @@ -1020,7 +1031,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>       if (isk)
>         tmphash = ra_allock(as, khash, allow);
>       emit_dst(as, MIPSI_AADDU, dest, dest, tmp1);
> -    lua_assert(sizeof(Node) == 24);
> +    lj_assertA(sizeof(Node) == 24, "bad Node size");
>       emit_dst(as, MIPSI_SUBU, tmp1, tmp2, tmp1);
>       emit_dta(as, MIPSI_SLL, tmp1, tmp1, 3);
>       emit_dta(as, MIPSI_SLL, tmp2, tmp1, 5);
> @@ -1098,7 +1109,7 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
>     Reg key = ra_scratch(as, allow);
>     int64_t k;
>   #endif
> -  lua_assert(ofs % sizeof(Node) == 0);
> +  lj_assertA(ofs % sizeof(Node) == 0, "unaligned HREFK slot");
>     if (ofs > 32736) {
>       idx = dest;
>       rset_clear(allow, dest);
> @@ -1127,7 +1138,7 @@ nolo:
>     emit_tsi(as, MIPSI_LW, type, idx, kofs+(LJ_BE?0:4));
>   #else
>     if (irt_ispri(irkey->t)) {
> -    lua_assert(!irt_isnil(irkey->t));
> +    lj_assertA(!irt_isnil(irkey->t), "bad HREFK key type");
>       k = ~((int64_t)~irt_toitype(irkey->t) << 47);
>     } else if (irt_isnum(irkey->t)) {
>       k = (int64_t)ir_knum(irkey)->u64;
> @@ -1166,7 +1177,7 @@ static void asm_uref(ASMState *as, IRIns *ir)
>   static void asm_fref(ASMState *as, IRIns *ir)
>   {
>     UNUSED(as); UNUSED(ir);
> -  lua_assert(!ra_used(ir));
> +  lj_assertA(!ra_used(ir), "unfused FREF");
>   }
>   
>   static void asm_strref(ASMState *as, IRIns *ir)
> @@ -1221,14 +1232,17 @@ static void asm_strref(ASMState *as, IRIns *ir)
>   
>   /* -- Loads and stores ---------------------------------------------------- */
>   
> -static MIPSIns asm_fxloadins(IRIns *ir)
> +static MIPSIns asm_fxloadins(ASMState *as, IRIns *ir)
>   {
> +  UNUSED(as);
>     switch (irt_type(ir->t)) {
>     case IRT_I8: return MIPSI_LB;
>     case IRT_U8: return MIPSI_LBU;
>     case IRT_I16: return MIPSI_LH;
>     case IRT_U16: return MIPSI_LHU;
> -  case IRT_NUM: lua_assert(!LJ_SOFTFP32); if (!LJ_SOFTFP) return MIPSI_LDC1;
> +  case IRT_NUM:
> +    lj_assertA(!LJ_SOFTFP32, "unsplit FP op");
> +    if (!LJ_SOFTFP) return MIPSI_LDC1;
>     /* fallthrough */
>     case IRT_FLOAT: if (!LJ_SOFTFP) return MIPSI_LWC1;
>     /* fallthrough */
> @@ -1236,12 +1250,15 @@ static MIPSIns asm_fxloadins(IRIns *ir)
>     }
>   }
>   
> -static MIPSIns asm_fxstoreins(IRIns *ir)
> +static MIPSIns asm_fxstoreins(ASMState *as, IRIns *ir)
>   {
> +  UNUSED(as);
>     switch (irt_type(ir->t)) {
>     case IRT_I8: case IRT_U8: return MIPSI_SB;
>     case IRT_I16: case IRT_U16: return MIPSI_SH;
> -  case IRT_NUM: lua_assert(!LJ_SOFTFP32); if (!LJ_SOFTFP) return MIPSI_SDC1;
> +  case IRT_NUM:
> +    lj_assertA(!LJ_SOFTFP32, "unsplit FP op");
> +    if (!LJ_SOFTFP) return MIPSI_SDC1;
>     /* fallthrough */
>     case IRT_FLOAT: if (!LJ_SOFTFP) return MIPSI_SWC1;
>     /* fallthrough */
> @@ -1252,10 +1269,10 @@ static MIPSIns asm_fxstoreins(IRIns *ir)
>   static void asm_fload(ASMState *as, IRIns *ir)
>   {
>     Reg dest = ra_dest(as, ir, RSET_GPR);
> -  MIPSIns mi = asm_fxloadins(ir);
> +  MIPSIns mi = asm_fxloadins(as, ir);
>     Reg idx;
>     int32_t ofs;
> -  if (ir->op1 == REF_NIL) {
> +  if (ir->op1 == REF_NIL) {  /* FLOAD from GG_State with offset. */
>       idx = RID_JGL;
>       ofs = (ir->op2 << 2) - 32768 - GG_OFS(g);
>     } else {
> @@ -1269,7 +1286,7 @@ static void asm_fload(ASMState *as, IRIns *ir)
>       }
>       ofs = field_ofs[ir->op2];
>     }
> -  lua_assert(!irt_isfp(ir->t));
> +  lj_assertA(!irt_isfp(ir->t), "bad FP FLOAD");
>     emit_tsi(as, mi, dest, idx, ofs);
>   }
>   
> @@ -1280,8 +1297,8 @@ static void asm_fstore(ASMState *as, IRIns *ir)
>       IRIns *irf = IR(ir->op1);
>       Reg idx = ra_alloc1(as, irf->op1, rset_exclude(RSET_GPR, src));
>       int32_t ofs = field_ofs[irf->op2];
> -    MIPSIns mi = asm_fxstoreins(ir);
> -    lua_assert(!irt_isfp(ir->t));
> +    MIPSIns mi = asm_fxstoreins(as, ir);
> +    lj_assertA(!irt_isfp(ir->t), "bad FP FSTORE");
>       emit_tsi(as, mi, src, idx, ofs);
>     }
>   }
> @@ -1290,8 +1307,9 @@ static void asm_xload(ASMState *as, IRIns *ir)
>   {
>     Reg dest = ra_dest(as, ir,
>       (!LJ_SOFTFP && irt_isfp(ir->t)) ? RSET_FPR : RSET_GPR);
> -  lua_assert(LJ_TARGET_UNALIGNED || !(ir->op2 & IRXLOAD_UNALIGNED));
> -  asm_fusexref(as, asm_fxloadins(ir), dest, ir->op1, RSET_GPR, 0);
> +  lj_assertA(LJ_TARGET_UNALIGNED || !(ir->op2 & IRXLOAD_UNALIGNED),
> +	     "unaligned XLOAD");
> +  asm_fusexref(as, asm_fxloadins(as, ir), dest, ir->op1, RSET_GPR, 0);
>   }
>   
>   static void asm_xstore_(ASMState *as, IRIns *ir, int32_t ofs)
> @@ -1299,7 +1317,7 @@ static void asm_xstore_(ASMState *as, IRIns *ir, int32_t ofs)
>     if (ir->r != RID_SINK) {
>       Reg src = ra_alloc1z(as, ir->op2,
>         (!LJ_SOFTFP && irt_isfp(ir->t)) ? RSET_FPR : RSET_GPR);
> -    asm_fusexref(as, asm_fxstoreins(ir), src, ir->op1,
> +    asm_fusexref(as, asm_fxstoreins(as, ir), src, ir->op1,
>   		 rset_exclude(RSET_GPR, src), ofs);
>     }
>   }
> @@ -1321,8 +1339,9 @@ static void asm_ahuvload(ASMState *as, IRIns *ir)
>       }
>     }
>     if (ra_used(ir)) {
> -    lua_assert((LJ_SOFTFP32 ? 0 : irt_isnum(ir->t)) ||
> -	       irt_isint(ir->t) || irt_isaddr(ir->t));
> +    lj_assertA((LJ_SOFTFP32 ? 0 : irt_isnum(ir->t)) ||
> +	       irt_isint(ir->t) || irt_isaddr(ir->t),
> +	       "bad load type %d", irt_type(ir->t));
>       dest = ra_dest(as, ir, (!LJ_SOFTFP && irt_isnum(t)) ? RSET_FPR : allow);
>       rset_clear(allow, dest);
>   #if LJ_64
> @@ -1427,10 +1446,13 @@ static void asm_sload(ASMState *as, IRIns *ir)
>   #else
>     int32_t ofs = 8*((int32_t)ir->op1-2);
>   #endif
> -  lua_assert(!(ir->op2 & IRSLOAD_PARENT));  /* Handled by asm_head_side(). */
> -  lua_assert(irt_isguard(ir->t) || !(ir->op2 & IRSLOAD_TYPECHECK));
> +  lj_assertA(!(ir->op2 & IRSLOAD_PARENT),
> +	     "bad parent SLOAD");  /* Handled by asm_head_side(). */
> +  lj_assertA(irt_isguard(ir->t) || !(ir->op2 & IRSLOAD_TYPECHECK),
> +	     "inconsistent SLOAD variant");
>   #if LJ_SOFTFP32
> -  lua_assert(!(ir->op2 & IRSLOAD_CONVERT));  /* Handled by LJ_SOFTFP SPLIT. */
> +  lj_assertA(!(ir->op2 & IRSLOAD_CONVERT),
> +	     "unsplit SLOAD convert");  /* Handled by LJ_SOFTFP SPLIT. */
>     if (hiop && ra_used(ir+1)) {
>       type = ra_dest(as, ir+1, allow);
>       rset_clear(allow, type);
> @@ -1443,8 +1465,9 @@ static void asm_sload(ASMState *as, IRIns *ir)
>     } else
>   #endif
>     if (ra_used(ir)) {
> -    lua_assert((LJ_SOFTFP32 ? 0 : irt_isnum(ir->t)) ||
> -	       irt_isint(ir->t) || irt_isaddr(ir->t));
> +    lj_assertA((LJ_SOFTFP32 ? 0 : irt_isnum(ir->t)) ||
> +	       irt_isint(ir->t) || irt_isaddr(ir->t),
> +	       "bad SLOAD type %d", irt_type(ir->t));
>       dest = ra_dest(as, ir, (!LJ_SOFTFP && irt_isnum(t)) ? RSET_FPR : allow);
>       rset_clear(allow, dest);
>       base = ra_alloc1(as, REF_BASE, allow);
> @@ -1556,7 +1579,8 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>     RegSet allow = (RSET_GPR & ~RSET_SCRATCH);
>     RegSet drop = RSET_SCRATCH;
>     Reg tmp;
> -  lua_assert(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL));
> +  lj_assertA(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL),
> +	     "bad CNEW/CNEWI operands");
>   
>     as->gcsteps++;
>     if (ra_hasreg(ir->r))
> @@ -1571,7 +1595,7 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>       int32_t ofs = sizeof(GCcdata);
>       if (sz == 8) {
>         ofs += 4;
> -      lua_assert((ir+1)->o == IR_HIOP);
> +      lj_assertA((ir+1)->o == IR_HIOP, "expected HIOP for CNEWI");
>         if (LJ_LE) ir++;
>       }
>       for (;;) {
> @@ -1585,7 +1609,7 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>       emit_tsi(as, sz == 8 ? MIPSI_SD : MIPSI_SW, ra_alloc1(as, ir->op2, allow),
>   	     RID_RET, sizeof(GCcdata));
>   #endif
> -    lua_assert(sz == 4 || sz == 8);
> +    lj_assertA(sz == 4 || sz == 8, "bad CNEWI size %d", sz);
>     } else if (ir->op2 != REF_NIL) {  /* Create VLA/VLS/aligned cdata. */
>       ci = &lj_ir_callinfo[IRCALL_lj_cdata_newv];
>       args[0] = ASMREF_L;     /* lua_State *L */
> @@ -1640,7 +1664,7 @@ static void asm_obar(ASMState *as, IRIns *ir)
>     MCLabel l_end;
>     Reg obj, val, tmp;
>     /* No need for other object barriers (yet). */
> -  lua_assert(IR(ir->op1)->o == IR_UREFC);
> +  lj_assertA(IR(ir->op1)->o == IR_UREFC, "bad OBAR type");
>     ra_evictset(as, RSET_SCRATCH);
>     l_end = emit_label(as);
>     args[0] = ASMREF_TMP1;  /* global_State *g */
> @@ -1715,7 +1739,7 @@ static void asm_add(ASMState *as, IRIns *ir)
>       Reg dest = ra_dest(as, ir, RSET_GPR);
>       Reg right, left = ra_hintalloc(as, ir->op1, dest, RSET_GPR);
>       if (irref_isk(ir->op2)) {
> -      intptr_t k = get_kval(IR(ir->op2));
> +      intptr_t k = get_kval(as, ir->op2);
>         if (checki16(k)) {
>   	emit_tsi(as, (LJ_64 && irt_is64(t)) ? MIPSI_DADDIU : MIPSI_ADDIU, dest,
>   		 left, k);
> @@ -1816,7 +1840,7 @@ static void asm_arithov(ASMState *as, IRIns *ir)
>   {
>     /* TODO MIPSR6: bovc/bnvc. Caveat: no delay slot to load RID_TMP. */
>     Reg right, left, tmp, dest = ra_dest(as, ir, RSET_GPR);
> -  lua_assert(!irt_is64(ir->t));
> +  lj_assertA(!irt_is64(ir->t), "bad usage");
>     if (irref_isk(ir->op2)) {
>       int k = IR(ir->op2)->i;
>       if (ir->o == IR_SUBOV) k = -k;
> @@ -2003,7 +2027,7 @@ static void asm_bitop(ASMState *as, IRIns *ir, MIPSIns mi, MIPSIns mik)
>     Reg dest = ra_dest(as, ir, RSET_GPR);
>     Reg right, left = ra_hintalloc(as, ir->op1, dest, RSET_GPR);
>     if (irref_isk(ir->op2)) {
> -    intptr_t k = get_kval(IR(ir->op2));
> +    intptr_t k = get_kval(as, ir->op2);
>       if (checku16(k)) {
>         emit_tsi(as, mik, dest, left, k);
>         return;
> @@ -2036,7 +2060,7 @@ static void asm_bitshift(ASMState *as, IRIns *ir, MIPSIns mi, MIPSIns mik)
>   #define asm_bshl(as, ir)	asm_bitshift(as, ir, MIPSI_SLLV, MIPSI_SLL)
>   #define asm_bshr(as, ir)	asm_bitshift(as, ir, MIPSI_SRLV, MIPSI_SRL)
>   #define asm_bsar(as, ir)	asm_bitshift(as, ir, MIPSI_SRAV, MIPSI_SRA)
> -#define asm_brol(as, ir)	lua_assert(0)
> +#define asm_brol(as, ir)	lj_assertA(0, "unexpected BROL")
>   
>   static void asm_bror(ASMState *as, IRIns *ir)
>   {
> @@ -2228,13 +2252,13 @@ static void asm_comp(ASMState *as, IRIns *ir)
>     } else {
>       Reg right, left = ra_alloc1(as, ir->op1, RSET_GPR);
>       if (op == IR_ABC) op = IR_UGT;
> -    if ((op&4) == 0 && irref_isk(ir->op2) && get_kval(IR(ir->op2)) == 0) {
> +    if ((op&4) == 0 && irref_isk(ir->op2) && get_kval(as, ir->op2) == 0) {
>         MIPSIns mi = (op&2) ? ((op&1) ? MIPSI_BLEZ : MIPSI_BGTZ) :
>   			    ((op&1) ? MIPSI_BLTZ : MIPSI_BGEZ);
>         asm_guard(as, mi, left, 0);
>       } else {
>         if (irref_isk(ir->op2)) {
> -	intptr_t k = get_kval(IR(ir->op2));
> +	intptr_t k = get_kval(as, ir->op2);
>   	if ((op&2)) k++;
>   	if (checki16(k)) {
>   	  asm_guard(as, (op&1) ? MIPSI_BNE : MIPSI_BEQ, RID_TMP, RID_ZERO);
> @@ -2390,10 +2414,11 @@ static void asm_hiop(ASMState *as, IRIns *ir)
>     case IR_CNEWI:
>       /* Nothing to do here. Handled by lo op itself. */
>       break;
> -  default: lua_assert(0); break;
> +  default: lj_assertA(0, "bad HIOP for op %d", (ir-1)->o); break;
>     }
>   #else
> -  UNUSED(as); UNUSED(ir); lua_assert(0);  /* Unused without FFI. */
> +  /* Unused on MIPS64 or without SOFTFP or FFI. */
> +  UNUSED(as); UNUSED(ir); lj_assertA(0, "unexpected HIOP");
>   #endif
>   }
>   
> @@ -2462,7 +2487,8 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
>   #if LJ_SOFTFP32
>         Reg tmp;
>         RegSet allow = rset_exclude(RSET_GPR, RID_BASE);
> -      lua_assert(irref_isk(ref));  /* LJ_SOFTFP: must be a number constant. */
> +      /* LJ_SOFTFP: must be a number constant. */
> +      lj_assertA(irref_isk(ref), "unsplit FP op");
>         tmp = ra_allock(as, (int32_t)ir_knum(ir)->u32.lo, allow);
>         emit_tsi(as, MIPSI_SW, tmp, RID_BASE, ofs+(LJ_BE?4:0));
>         if (rset_test(as->freeset, tmp+1)) allow = RID2RSET(tmp+1);
> @@ -2479,7 +2505,8 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
>   #if LJ_32
>         RegSet allow = rset_exclude(RSET_GPR, RID_BASE);
>         Reg type;
> -      lua_assert(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t));
> +      lj_assertA(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t),
> +		 "restore of IR type %d", irt_type(ir->t));
>         if (!irt_ispri(ir->t)) {
>   	Reg src = ra_alloc1(as, ref, allow);
>   	rset_clear(allow, src);
> @@ -2502,7 +2529,7 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
>       }
>       checkmclim(as);
>     }
> -  lua_assert(map + nent == flinks);
> +  lj_assertA(map + nent == flinks, "inconsistent frames in snapshot");
>   }
>   
>   /* -- GC handling --------------------------------------------------------- */
> @@ -2700,7 +2727,7 @@ void lj_asm_patchexit(jit_State *J, GCtrace *T, ExitNo exitno, MCode *target)
>   	}
>         } else if (p+1 == pe) {
>   	/* Patch NOP after code for inverted loop branch. Use of J is ok. */
> -	lua_assert(p[1] == MIPSI_NOP);
> +	lj_assertJ(p[1] == MIPSI_NOP, "expected NOP");
>   	p[1] = tjump;
>   	*p = MIPSI_NOP;  /* Replace the load of the exit number. */
>   	cstop = p+2;
> diff --git a/src/lj_asm_ppc.h b/src/lj_asm_ppc.h
> index 62a5c3e2..971dcc88 100644
> --- a/src/lj_asm_ppc.h
> +++ b/src/lj_asm_ppc.h
> @@ -181,7 +181,7 @@ static void asm_fusexref(ASMState *as, PPCIns pi, Reg rt, IRRef ref,
>   	return;
>         }
>       } else if (ir->o == IR_STRREF) {
> -      lua_assert(ofs == 0);
> +      lj_assertA(ofs == 0, "bad usage");
>         ofs = (int32_t)sizeof(GCstr);
>         if (irref_isk(ir->op2)) {
>   	ofs += IR(ir->op2)->i;
> @@ -268,7 +268,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
>   #if !LJ_SOFTFP
>         if (irt_isfp(ir->t)) {
>   	if (fpr <= REGARG_LASTFPR) {
> -	  lua_assert(rset_test(as->freeset, fpr));  /* Already evicted. */
> +	  lj_assertA(rset_test(as->freeset, fpr),
> +		     "reg %d not free", fpr);  /* Already evicted. */
>   	  ra_leftov(as, fpr, ref);
>   	  fpr++;
>   	} else {
> @@ -281,7 +282,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
>   #endif
>         {
>   	if (gpr <= REGARG_LASTGPR) {
> -	  lua_assert(rset_test(as->freeset, gpr));  /* Already evicted. */
> +	  lj_assertA(rset_test(as->freeset, gpr),
> +		     "reg %d not free", gpr);  /* Already evicted. */
>   	  ra_leftov(as, gpr, ref);
>   	  gpr++;
>   	} else {
> @@ -319,7 +321,7 @@ static void asm_setupresult(ASMState *as, IRIns *ir, const CCallInfo *ci)
>       rset_clear(drop, (ir+1)->r);  /* Dest reg handled below. */
>     ra_evictset(as, drop);  /* Evictions must be performed first. */
>     if (ra_used(ir)) {
> -    lua_assert(!irt_ispri(ir->t));
> +    lj_assertA(!irt_ispri(ir->t), "PRI dest");
>       if (!LJ_SOFTFP && irt_isfp(ir->t)) {
>         if ((ci->flags & CCI_CASTU64)) {
>   	/* Use spill slot or temp slots. */
> @@ -431,14 +433,18 @@ static void asm_conv(ASMState *as, IRIns *ir)
>     int stfp = (st == IRT_NUM || st == IRT_FLOAT);
>   #endif
>     IRRef lref = ir->op1;
> -  lua_assert(!(irt_isint64(ir->t) ||
> -	       (st == IRT_I64 || st == IRT_U64))); /* Handled by SPLIT. */
> +  /* 64 bit integer conversions are handled by SPLIT. */
> +  lj_assertA(!(irt_isint64(ir->t) || (st == IRT_I64 || st == IRT_U64)),
> +	     "IR %04d has unsplit 64 bit type",
> +	     (int)(ir - as->ir) - REF_BIAS);
>   #if LJ_SOFTFP
>     /* FP conversions are handled by SPLIT. */
> -  lua_assert(!irt_isfp(ir->t) && !(st == IRT_NUM || st == IRT_FLOAT));
> +  lj_assertA(!irt_isfp(ir->t) && !(st == IRT_NUM || st == IRT_FLOAT),
> +	     "IR %04d has FP type",
> +	     (int)(ir - as->ir) - REF_BIAS);
>     /* Can't check for same types: SPLIT uses CONV int.int + BXOR for sfp NEG. */
>   #else
> -  lua_assert(irt_type(ir->t) != st);
> +  lj_assertA(irt_type(ir->t) != st, "inconsistent types for CONV");
>     if (irt_isfp(ir->t)) {
>       Reg dest = ra_dest(as, ir, RSET_FPR);
>       if (stfp) {  /* FP to FP conversion. */
> @@ -467,7 +473,8 @@ static void asm_conv(ASMState *as, IRIns *ir)
>     } else if (stfp) {  /* FP to integer conversion. */
>       if (irt_isguard(ir->t)) {
>         /* Checked conversions are only supported from number to int. */
> -      lua_assert(irt_isint(ir->t) && st == IRT_NUM);
> +      lj_assertA(irt_isint(ir->t) && st == IRT_NUM,
> +		 "bad type for checked CONV");
>         asm_tointg(as, ir, ra_alloc1(as, lref, RSET_FPR));
>       } else {
>         Reg dest = ra_dest(as, ir, RSET_GPR);
> @@ -503,7 +510,7 @@ static void asm_conv(ASMState *as, IRIns *ir)
>       Reg dest = ra_dest(as, ir, RSET_GPR);
>       if (st >= IRT_I8 && st <= IRT_U16) {  /* Extend to 32 bit integer. */
>         Reg left = ra_alloc1(as, ir->op1, RSET_GPR);
> -      lua_assert(irt_isint(ir->t) || irt_isu32(ir->t));
> +      lj_assertA(irt_isint(ir->t) || irt_isu32(ir->t), "bad type for CONV EXT");
>         if ((ir->op2 & IRCONV_SEXT))
>   	emit_as(as, st == IRT_I8 ? PPCI_EXTSB : PPCI_EXTSH, dest, left);
>         else
> @@ -699,7 +706,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>   	    (((char *)as->mcp-(char *)l_loop) & 0xffffu);
>   
>     /* Load main position relative to tab->node into dest. */
> -  khash = isk ? ir_khash(irkey) : 1;
> +  khash = isk ? ir_khash(as, irkey) : 1;
>     if (khash == 0) {
>       emit_tai(as, PPCI_LWZ, dest, tab, (int32_t)offsetof(GCtab, node));
>     } else {
> @@ -754,7 +761,7 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
>     Reg node = ra_alloc1(as, ir->op1, RSET_GPR);
>     Reg key = RID_NONE, type = RID_TMP, idx = node;
>     RegSet allow = rset_exclude(RSET_GPR, node);
> -  lua_assert(ofs % sizeof(Node) == 0);
> +  lj_assertA(ofs % sizeof(Node) == 0, "unaligned HREFK slot");
>     if (ofs > 32736) {
>       idx = dest;
>       rset_clear(allow, dest);
> @@ -813,7 +820,7 @@ static void asm_uref(ASMState *as, IRIns *ir)
>   static void asm_fref(ASMState *as, IRIns *ir)
>   {
>     UNUSED(as); UNUSED(ir);
> -  lua_assert(!ra_used(ir));
> +  lj_assertA(!ra_used(ir), "unfused FREF");
>   }
>   
>   static void asm_strref(ASMState *as, IRIns *ir)
> @@ -853,25 +860,27 @@ static void asm_strref(ASMState *as, IRIns *ir)
>   
>   /* -- Loads and stores ---------------------------------------------------- */
>   
> -static PPCIns asm_fxloadins(IRIns *ir)
> +static PPCIns asm_fxloadins(ASMState *as, IRIns *ir)
>   {
> +  UNUSED(as);
>     switch (irt_type(ir->t)) {
>     case IRT_I8: return PPCI_LBZ;  /* Needs sign-extension. */
>     case IRT_U8: return PPCI_LBZ;
>     case IRT_I16: return PPCI_LHA;
>     case IRT_U16: return PPCI_LHZ;
> -  case IRT_NUM: lua_assert(!LJ_SOFTFP); return PPCI_LFD;
> +  case IRT_NUM: lj_assertA(!LJ_SOFTFP, "unsplit FP op"); return PPCI_LFD;
>     case IRT_FLOAT: if (!LJ_SOFTFP) return PPCI_LFS;
>     default: return PPCI_LWZ;
>     }
>   }
>   
> -static PPCIns asm_fxstoreins(IRIns *ir)
> +static PPCIns asm_fxstoreins(ASMState *as, IRIns *ir)
>   {
> +  UNUSED(as);
>     switch (irt_type(ir->t)) {
>     case IRT_I8: case IRT_U8: return PPCI_STB;
>     case IRT_I16: case IRT_U16: return PPCI_STH;
> -  case IRT_NUM: lua_assert(!LJ_SOFTFP); return PPCI_STFD;
> +  case IRT_NUM: lj_assertA(!LJ_SOFTFP, "unsplit FP op"); return PPCI_STFD;
>     case IRT_FLOAT: if (!LJ_SOFTFP) return PPCI_STFS;
>     default: return PPCI_STW;
>     }
> @@ -880,10 +889,10 @@ static PPCIns asm_fxstoreins(IRIns *ir)
>   static void asm_fload(ASMState *as, IRIns *ir)
>   {
>     Reg dest = ra_dest(as, ir, RSET_GPR);
> -  PPCIns pi = asm_fxloadins(ir);
> +  PPCIns pi = asm_fxloadins(as, ir);
>     Reg idx;
>     int32_t ofs;
> -  if (ir->op1 == REF_NIL) {
> +  if (ir->op1 == REF_NIL) {  /* FLOAD from GG_State with offset. */
>       idx = RID_JGL;
>       ofs = (ir->op2 << 2) - 32768;
>     } else {
> @@ -897,7 +906,7 @@ static void asm_fload(ASMState *as, IRIns *ir)
>       }
>       ofs = field_ofs[ir->op2];
>     }
> -  lua_assert(!irt_isi8(ir->t));
> +  lj_assertA(!irt_isi8(ir->t), "unsupported FLOAD I8");
>     emit_tai(as, pi, dest, idx, ofs);
>   }
>   
> @@ -908,7 +917,7 @@ static void asm_fstore(ASMState *as, IRIns *ir)
>       IRIns *irf = IR(ir->op1);
>       Reg idx = ra_alloc1(as, irf->op1, rset_exclude(RSET_GPR, src));
>       int32_t ofs = field_ofs[irf->op2];
> -    PPCIns pi = asm_fxstoreins(ir);
> +    PPCIns pi = asm_fxstoreins(as, ir);
>       emit_tai(as, pi, src, idx, ofs);
>     }
>   }
> @@ -917,10 +926,10 @@ static void asm_xload(ASMState *as, IRIns *ir)
>   {
>     Reg dest = ra_dest(as, ir,
>       (!LJ_SOFTFP && irt_isfp(ir->t)) ? RSET_FPR : RSET_GPR);
> -  lua_assert(!(ir->op2 & IRXLOAD_UNALIGNED));
> +  lj_assertA(!(ir->op2 & IRXLOAD_UNALIGNED), "unaligned XLOAD");
>     if (irt_isi8(ir->t))
>       emit_as(as, PPCI_EXTSB, dest, dest);
> -  asm_fusexref(as, asm_fxloadins(ir), dest, ir->op1, RSET_GPR, 0);
> +  asm_fusexref(as, asm_fxloadins(as, ir), dest, ir->op1, RSET_GPR, 0);
>   }
>   
>   static void asm_xstore_(ASMState *as, IRIns *ir, int32_t ofs)
> @@ -936,7 +945,7 @@ static void asm_xstore_(ASMState *as, IRIns *ir, int32_t ofs)
>     } else {
>       Reg src = ra_alloc1(as, ir->op2,
>         (!LJ_SOFTFP && irt_isfp(ir->t)) ? RSET_FPR : RSET_GPR);
> -    asm_fusexref(as, asm_fxstoreins(ir), src, ir->op1,
> +    asm_fusexref(as, asm_fxstoreins(as, ir), src, ir->op1,
>   		 rset_exclude(RSET_GPR, src), ofs);
>     }
>   }
> @@ -958,8 +967,9 @@ static void asm_ahuvload(ASMState *as, IRIns *ir)
>       ofs = 0;
>     }
>     if (ra_used(ir)) {
> -    lua_assert((LJ_SOFTFP ? 0 : irt_isnum(ir->t)) ||
> -	       irt_isint(ir->t) || irt_isaddr(ir->t));
> +    lj_assertA((LJ_SOFTFP ? 0 : irt_isnum(ir->t)) ||
> +	       irt_isint(ir->t) || irt_isaddr(ir->t),
> +	       "bad load type %d", irt_type(ir->t));
>       if (LJ_SOFTFP || !irt_isnum(t)) ofs = 0;
>       dest = ra_dest(as, ir, (!LJ_SOFTFP && irt_isnum(t)) ? RSET_FPR : allow);
>       rset_clear(allow, dest);
> @@ -1042,12 +1052,16 @@ static void asm_sload(ASMState *as, IRIns *ir)
>     int hiop = (LJ_SOFTFP && (ir+1)->o == IR_HIOP);
>     if (hiop)
>       t.irt = IRT_NUM;
> -  lua_assert(!(ir->op2 & IRSLOAD_PARENT));  /* Handled by asm_head_side(). */
> -  lua_assert(irt_isguard(ir->t) || !(ir->op2 & IRSLOAD_TYPECHECK));
> -  lua_assert(LJ_DUALNUM ||
> -	     !irt_isint(t) || (ir->op2 & (IRSLOAD_CONVERT|IRSLOAD_FRAME)));
> +  lj_assertA(!(ir->op2 & IRSLOAD_PARENT),
> +	     "bad parent SLOAD");  /* Handled by asm_head_side(). */
> +  lj_assertA(irt_isguard(ir->t) || !(ir->op2 & IRSLOAD_TYPECHECK),
> +	     "inconsistent SLOAD variant");
> +  lj_assertA(LJ_DUALNUM ||
> +	     !irt_isint(t) || (ir->op2 & (IRSLOAD_CONVERT|IRSLOAD_FRAME)),
> +	     "bad SLOAD type");
>   #if LJ_SOFTFP
> -  lua_assert(!(ir->op2 & IRSLOAD_CONVERT));  /* Handled by LJ_SOFTFP SPLIT. */
> +  lj_assertA(!(ir->op2 & IRSLOAD_CONVERT),
> +	     "unsplit SLOAD convert");  /* Handled by LJ_SOFTFP SPLIT. */
>     if (hiop && ra_used(ir+1)) {
>       type = ra_dest(as, ir+1, allow);
>       rset_clear(allow, type);
> @@ -1060,7 +1074,8 @@ static void asm_sload(ASMState *as, IRIns *ir)
>     } else
>   #endif
>     if (ra_used(ir)) {
> -    lua_assert(irt_isnum(t) || irt_isint(t) || irt_isaddr(t));
> +    lj_assertA(irt_isnum(t) || irt_isint(t) || irt_isaddr(t),
> +	       "bad SLOAD type %d", irt_type(ir->t));
>       dest = ra_dest(as, ir, (!LJ_SOFTFP && irt_isnum(t)) ? RSET_FPR : allow);
>       rset_clear(allow, dest);
>       base = ra_alloc1(as, REF_BASE, allow);
> @@ -1127,7 +1142,8 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>     const CCallInfo *ci = &lj_ir_callinfo[IRCALL_lj_mem_newgco];
>     IRRef args[4];
>     RegSet drop = RSET_SCRATCH;
> -  lua_assert(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL));
> +  lj_assertA(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL),
> +	     "bad CNEW/CNEWI operands");
>   
>     as->gcsteps++;
>     if (ra_hasreg(ir->r))
> @@ -1140,10 +1156,10 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>     if (ir->o == IR_CNEWI) {
>       RegSet allow = (RSET_GPR & ~RSET_SCRATCH);
>       int32_t ofs = sizeof(GCcdata);
> -    lua_assert(sz == 4 || sz == 8);
> +    lj_assertA(sz == 4 || sz == 8, "bad CNEWI size %d", sz);
>       if (sz == 8) {
>         ofs += 4;
> -      lua_assert((ir+1)->o == IR_HIOP);
> +      lj_assertA((ir+1)->o == IR_HIOP, "expected HIOP for CNEWI");
>       }
>       for (;;) {
>         Reg r = ra_alloc1(as, ir->op2, allow);
> @@ -1190,7 +1206,7 @@ static void asm_tbar(ASMState *as, IRIns *ir)
>     emit_tai(as, PPCI_STW, link, tab, (int32_t)offsetof(GCtab, gclist));
>     emit_tai(as, PPCI_STB, mark, tab, (int32_t)offsetof(GCtab, marked));
>     emit_setgl(as, tab, gc.grayagain);
> -  lua_assert(LJ_GC_BLACK == 0x04);
> +  lj_assertA(LJ_GC_BLACK == 0x04, "bad LJ_GC_BLACK");
>     emit_rot(as, PPCI_RLWINM, mark, mark, 0, 30, 28);  /* Clear black bit. */
>     emit_getgl(as, link, gc.grayagain);
>     emit_condbranch(as, PPCI_BC|PPCF_Y, CC_EQ, l_end);
> @@ -1205,7 +1221,7 @@ static void asm_obar(ASMState *as, IRIns *ir)
>     MCLabel l_end;
>     Reg obj, val, tmp;
>     /* No need for other object barriers (yet). */
> -  lua_assert(IR(ir->op1)->o == IR_UREFC);
> +  lj_assertA(IR(ir->op1)->o == IR_UREFC, "bad OBAR type");
>     ra_evictset(as, RSET_SCRATCH);
>     l_end = emit_label(as);
>     args[0] = ASMREF_TMP1;  /* global_State *g */
> @@ -1676,7 +1692,7 @@ static void asm_bitshift(ASMState *as, IRIns *ir, PPCIns pi, PPCIns pik)
>   #define asm_brol(as, ir) \
>     asm_bitshift(as, ir, PPCI_RLWNM|PPCF_MB(0)|PPCF_ME(31), \
>   		       PPCI_RLWINM|PPCF_MB(0)|PPCF_ME(31))
> -#define asm_bror(as, ir)	lua_assert(0)
> +#define asm_bror(as, ir)	lj_assertA(0, "unexpected BROR")
>   
>   #if LJ_SOFTFP
>   static void asm_sfpmin_max(ASMState *as, IRIns *ir)
> @@ -1951,10 +1967,11 @@ static void asm_hiop(ASMState *as, IRIns *ir)
>     case IR_CNEWI:
>       /* Nothing to do here. Handled by lo op itself. */
>       break;
> -  default: lua_assert(0); break;
> +  default: lj_assertA(0, "bad HIOP for op %d", (ir-1)->o); break;
>     }
>   #else
> -  UNUSED(as); UNUSED(ir); lua_assert(0);  /* Unused without FFI. */
> +  /* Unused without SOFTFP or FFI. */
> +  UNUSED(as); UNUSED(ir); lj_assertA(0, "unexpected HIOP");
>   #endif
>   }
>   
> @@ -2014,7 +2031,8 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
>   #if LJ_SOFTFP
>         Reg tmp;
>         RegSet allow = rset_exclude(RSET_GPR, RID_BASE);
> -      lua_assert(irref_isk(ref));  /* LJ_SOFTFP: must be a number constant. */
> +      /* LJ_SOFTFP: must be a number constant. */
> +      lj_assertA(irref_isk(ref), "unsplit FP op");
>         tmp = ra_allock(as, (int32_t)ir_knum(ir)->u32.lo, allow);
>         emit_tai(as, PPCI_STW, tmp, RID_BASE, ofs+(LJ_BE?4:0));
>         if (rset_test(as->freeset, tmp+1)) allow = RID2RSET(tmp+1);
> @@ -2027,7 +2045,8 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
>       } else {
>         Reg type;
>         RegSet allow = rset_exclude(RSET_GPR, RID_BASE);
> -      lua_assert(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t));
> +      lj_assertA(irt_ispri(ir->t) || irt_isaddr(ir->t) || irt_isinteger(ir->t),
> +		 "restore of IR type %d", irt_type(ir->t));
>         if (!irt_ispri(ir->t)) {
>   	Reg src = ra_alloc1(as, ref, allow);
>   	rset_clear(allow, src);
> @@ -2047,7 +2066,7 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
>       }
>       checkmclim(as);
>     }
> -  lua_assert(map + nent == flinks);
> +  lj_assertA(map + nent == flinks, "inconsistent frames in snapshot");
>   }
>   
>   /* -- GC handling --------------------------------------------------------- */
> @@ -2145,7 +2164,7 @@ static void asm_tail_fixup(ASMState *as, TraceNo lnk)
>       as->mctop = p;
>     } else {
>       /* Patch stack adjustment. */
> -    lua_assert(checki16(CFRAME_SIZE+spadj));
> +    lj_assertA(checki16(CFRAME_SIZE+spadj), "stack adjustment out of range");
>       p[-3] = PPCI_ADDI | PPCF_T(RID_TMP) | PPCF_A(RID_SP) | (CFRAME_SIZE+spadj);
>       p[-2] = PPCI_STWU | PPCF_T(RID_TMP) | PPCF_A(RID_SP) | spadj;
>     }
> @@ -2222,14 +2241,16 @@ void lj_asm_patchexit(jit_State *J, GCtrace *T, ExitNo exitno, MCode *target)
>       } else if ((ins & 0xfc000000u) == PPCI_B &&
>   	       ((ins ^ ((char *)px-(char *)p)) & 0x03ffffffu) == 0) {
>         ptrdiff_t delta = (char *)target - (char *)p;
> -      lua_assert(((delta + 0x02000000) >> 26) == 0);
> +      lj_assertJ(((delta + 0x02000000) >> 26) == 0,
> +		 "branch target out of range");
>         *p = PPCI_B | ((uint32_t)delta & 0x03ffffffu);
>         if (!cstart) cstart = p;
>       }
>     }
>     {  /* Always patch long-range branch in exit stub itself. */
>       ptrdiff_t delta = (char *)target - (char *)px - clearso;
> -    lua_assert(((delta + 0x02000000) >> 26) == 0);
> +    lj_assertJ(((delta + 0x02000000) >> 26) == 0,
> +	       "branch target out of range");
>       *px = PPCI_B | ((uint32_t)delta & 0x03ffffffu);
>     }
>     if (!cstart) cstart = px;
> diff --git a/src/lj_asm_x86.h b/src/lj_asm_x86.h
> index 5f5fe3cf..74f2d853 100644
> --- a/src/lj_asm_x86.h
> +++ b/src/lj_asm_x86.h
> @@ -31,7 +31,7 @@ static MCode *asm_exitstub_gen(ASMState *as, ExitNo group)
>   #endif
>     /* Jump to exit handler which fills in the ExitState. */
>     *mxp++ = XI_JMP; mxp += 4;
> -  *((int32_t *)(mxp-4)) = jmprel(mxp, (MCode *)(void *)lj_vm_exit_handler);
> +  *((int32_t *)(mxp-4)) = jmprel(as->J, mxp, (MCode *)(void *)lj_vm_exit_handler);
>     /* Commit the code for this group (even if assembly fails later on). */
>     lj_mcode_commitbot(as->J, mxp);
>     as->mcbot = mxp;
> @@ -60,7 +60,7 @@ static void asm_guardcc(ASMState *as, int cc)
>     MCode *p = as->mcp;
>     if (LJ_UNLIKELY(p == as->invmcp)) {
>       as->loopinv = 1;
> -    *(int32_t *)(p+1) = jmprel(p+5, target);
> +    *(int32_t *)(p+1) = jmprel(as->J, p+5, target);
>       target = p;
>       cc ^= 1;
>       if (as->realign) {
> @@ -131,7 +131,7 @@ static IRRef asm_fuseabase(ASMState *as, IRRef ref)
>     as->mrm.ofs = 0;
>     if (irb->o == IR_FLOAD) {
>       IRIns *ira = IR(irb->op1);
> -    lua_assert(irb->op2 == IRFL_TAB_ARRAY);
> +    lj_assertA(irb->op2 == IRFL_TAB_ARRAY, "expected FLOAD TAB_ARRAY");
>       /* We can avoid the FLOAD of t->array for colocated arrays. */
>       if (ira->o == IR_TNEW && ira->op1 <= LJ_MAX_COLOSIZE &&
>   	!neverfuse(as) && noconflict(as, irb->op1, IR_NEWREF, 1)) {
> @@ -150,7 +150,7 @@ static IRRef asm_fuseabase(ASMState *as, IRRef ref)
>   static void asm_fusearef(ASMState *as, IRIns *ir, RegSet allow)
>   {
>     IRIns *irx;
> -  lua_assert(ir->o == IR_AREF);
> +  lj_assertA(ir->o == IR_AREF, "expected AREF");
>     as->mrm.base = (uint8_t)ra_alloc1(as, asm_fuseabase(as, ir->op1), allow);
>     irx = IR(ir->op2);
>     if (irref_isk(ir->op2)) {
> @@ -217,8 +217,9 @@ static void asm_fuseahuref(ASMState *as, IRRef ref, RegSet allow)
>         }
>         break;
>       default:
> -      lua_assert(ir->o == IR_HREF || ir->o == IR_NEWREF || ir->o == IR_UREFO ||
> -		 ir->o == IR_KKPTR);
> +      lj_assertA(ir->o == IR_HREF || ir->o == IR_NEWREF || ir->o == IR_UREFO ||
> +		 ir->o == IR_KKPTR,
> +		 "bad IR op %d", ir->o);
>         break;
>       }
>     }
> @@ -230,9 +231,10 @@ static void asm_fuseahuref(ASMState *as, IRRef ref, RegSet allow)
>   /* Fuse FLOAD/FREF reference into memory operand. */
>   static void asm_fusefref(ASMState *as, IRIns *ir, RegSet allow)
>   {
> -  lua_assert(ir->o == IR_FLOAD || ir->o == IR_FREF);
> +  lj_assertA(ir->o == IR_FLOAD || ir->o == IR_FREF,
> +	     "bad IR op %d", ir->o);
>     as->mrm.idx = RID_NONE;
> -  if (ir->op1 == REF_NIL) {
> +  if (ir->op1 == REF_NIL) {  /* FLOAD from GG_State with offset. */
>   #if LJ_GC64
>       as->mrm.ofs = (int32_t)(ir->op2 << 2) - GG_OFS(dispatch);
>       as->mrm.base = RID_DISPATCH;
> @@ -271,7 +273,7 @@ static void asm_fusefref(ASMState *as, IRIns *ir, RegSet allow)
>   static void asm_fusestrref(ASMState *as, IRIns *ir, RegSet allow)
>   {
>     IRIns *irr;
> -  lua_assert(ir->o == IR_STRREF);
> +  lj_assertA(ir->o == IR_STRREF, "bad IR op %d", ir->o);
>     as->mrm.base = as->mrm.idx = RID_NONE;
>     as->mrm.scale = XM_SCALE1;
>     as->mrm.ofs = sizeof(GCstr);
> @@ -378,9 +380,10 @@ static Reg asm_fuseloadk64(ASMState *as, IRIns *ir)
>   	     checki32(mctopofs(as, k)) && checki32(mctopofs(as, k+1))) {
>       as->mrm.ofs = (int32_t)mcpofs(as, k);
>       as->mrm.base = RID_RIP;
> -  } else {
> +  } else {  /* Intern 64 bit constant at bottom of mcode. */
>       if (ir->i) {
> -      lua_assert(*k == *(uint64_t*)(as->mctop - ir->i));
> +      lj_assertA(*k == *(uint64_t*)(as->mctop - ir->i),
> +		 "bad interned 64 bit constant");
>       } else {
>         while ((uintptr_t)as->mcbot & 7) *as->mcbot++ = XI_INT3;
>         *(uint64_t*)as->mcbot = *k;
> @@ -420,12 +423,12 @@ static Reg asm_fuseload(ASMState *as, IRRef ref, RegSet allow)
>     }
>     if (ir->o == IR_KNUM) {
>       RegSet avail = as->freeset & ~as->modset & RSET_FPR;
> -    lua_assert(allow != RSET_EMPTY);
> +    lj_assertA(allow != RSET_EMPTY, "no register allowed");
>       if (!(avail & (avail-1)))  /* Fuse if less than two regs available. */
>         return asm_fuseloadk64(as, ir);
>     } else if (ref == REF_BASE || ir->o == IR_KINT64) {
>       RegSet avail = as->freeset & ~as->modset & RSET_GPR;
> -    lua_assert(allow != RSET_EMPTY);
> +    lj_assertA(allow != RSET_EMPTY, "no register allowed");
>       if (!(avail & (avail-1))) {  /* Fuse if less than two regs available. */
>         if (ref == REF_BASE) {
>   #if LJ_GC64
> @@ -606,7 +609,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
>   #endif
>   	  emit_loadi(as, r, ir->i);
>         } else {
> -	lua_assert(rset_test(as->freeset, r));  /* Must have been evicted. */
> +	/* Must have been evicted. */
> +	lj_assertA(rset_test(as->freeset, r), "reg %d not free", r);
>   	if (ra_hasreg(ir->r)) {
>   	  ra_noweak(as, ir->r);
>   	  emit_movrr(as, ir, r, ir->r);
> @@ -615,7 +619,8 @@ static void asm_gencall(ASMState *as, const CCallInfo *ci, IRRef *args)
>   	}
>         }
>       } else if (irt_isfp(ir->t)) {  /* FP argument is on stack. */
> -      lua_assert(!(irt_isfloat(ir->t) && irref_isk(ref)));  /* No float k. */
> +      lj_assertA(!(irt_isfloat(ir->t) && irref_isk(ref)),
> +		 "unexpected float constant");
>         if (LJ_32 && (ofs & 4) && irref_isk(ref)) {
>   	/* Split stores for unaligned FP consts. */
>   	emit_movmroi(as, RID_ESP, ofs, (int32_t)ir_knum(ir)->u32.lo);
> @@ -691,7 +696,7 @@ static void asm_setupresult(ASMState *as, IRIns *ir, const CCallInfo *ci)
>         ra_destpair(as, ir);
>   #endif
>       } else {
> -      lua_assert(!irt_ispri(ir->t));
> +      lj_assertA(!irt_ispri(ir->t), "PRI dest");
>         ra_destreg(as, ir, RID_RET);
>       }
>     } else if (LJ_32 && irt_isfp(ir->t) && !(ci->flags & CCI_CASTU64)) {
> @@ -810,8 +815,10 @@ static void asm_conv(ASMState *as, IRIns *ir)
>     int st64 = (st == IRT_I64 || st == IRT_U64 || (LJ_64 && st == IRT_P64));
>     int stfp = (st == IRT_NUM || st == IRT_FLOAT);
>     IRRef lref = ir->op1;
> -  lua_assert(irt_type(ir->t) != st);
> -  lua_assert(!(LJ_32 && (irt_isint64(ir->t) || st64)));  /* Handled by SPLIT. */
> +  lj_assertA(irt_type(ir->t) != st, "inconsistent types for CONV");
> +  lj_assertA(!(LJ_32 && (irt_isint64(ir->t) || st64)),
> +	     "IR %04d has unsplit 64 bit type",
> +	     (int)(ir - as->ir) - REF_BIAS);
>     if (irt_isfp(ir->t)) {
>       Reg dest = ra_dest(as, ir, RSET_FPR);
>       if (stfp) {  /* FP to FP conversion. */
> @@ -847,7 +854,8 @@ static void asm_conv(ASMState *as, IRIns *ir)
>     } else if (stfp) {  /* FP to integer conversion. */
>       if (irt_isguard(ir->t)) {
>         /* Checked conversions are only supported from number to int. */
> -      lua_assert(irt_isint(ir->t) && st == IRT_NUM);
> +      lj_assertA(irt_isint(ir->t) && st == IRT_NUM,
> +		 "bad type for checked CONV");
>         asm_tointg(as, ir, ra_alloc1(as, lref, RSET_FPR));
>       } else {
>         Reg dest = ra_dest(as, ir, RSET_GPR);
> @@ -882,7 +890,7 @@ static void asm_conv(ASMState *as, IRIns *ir)
>       Reg left, dest = ra_dest(as, ir, RSET_GPR);
>       RegSet allow = RSET_GPR;
>       x86Op op;
> -    lua_assert(irt_isint(ir->t) || irt_isu32(ir->t));
> +    lj_assertA(irt_isint(ir->t) || irt_isu32(ir->t), "bad type for CONV EXT");
>       if (st == IRT_I8) {
>         op = XO_MOVSXb; allow = RSET_GPR8; dest |= FORCE_REX;
>       } else if (st == IRT_U8) {
> @@ -953,7 +961,7 @@ static void asm_conv_fp_int64(ASMState *as, IRIns *ir)
>       emit_sjcc(as, CC_NS, l_end);
>       emit_rr(as, XO_TEST, hi, hi);  /* Check if u64 >= 2^63. */
>     } else {
> -    lua_assert(((ir-1)->op2 & IRCONV_SRCMASK) == IRT_I64);
> +    lj_assertA(((ir-1)->op2 & IRCONV_SRCMASK) == IRT_I64, "bad type for CONV");
>     }
>     emit_rmro(as, XO_FILDq, XOg_FILDq, RID_ESP, 0);
>     /* NYI: Avoid narrow-to-wide store-to-load forwarding stall. */
> @@ -967,8 +975,8 @@ static void asm_conv_int64_fp(ASMState *as, IRIns *ir)
>     IRType st = (IRType)((ir-1)->op2 & IRCONV_SRCMASK);
>     IRType dt = (((ir-1)->op2 & IRCONV_DSTMASK) >> IRCONV_DSH);
>     Reg lo, hi;
> -  lua_assert(st == IRT_NUM || st == IRT_FLOAT);
> -  lua_assert(dt == IRT_I64 || dt == IRT_U64);
> +  lj_assertA(st == IRT_NUM || st == IRT_FLOAT, "bad type for CONV");
> +  lj_assertA(dt == IRT_I64 || dt == IRT_U64, "bad type for CONV");
>     hi = ra_dest(as, ir, RSET_GPR);
>     lo = ra_dest(as, ir-1, rset_exclude(RSET_GPR, hi));
>     if (ra_used(ir-1)) emit_rmro(as, XO_MOV, lo, RID_ESP, 0);
> @@ -1180,13 +1188,13 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>         emit_rmro(as, XO_CMP, tmp|REX_64, dest, offsetof(Node, key.u64));
>       }
>     } else {
> -    lua_assert(irt_ispri(kt) && !irt_isnil(kt));
> +    lj_assertA(irt_ispri(kt) && !irt_isnil(kt), "bad HREF key type");
>       emit_u32(as, (irt_toitype(kt)<<15)|0x7fff);
>       emit_rmro(as, XO_ARITHi, XOg_CMP, dest, offsetof(Node, key.it));
>   #else
>     } else {
>       if (!irt_ispri(kt)) {
> -      lua_assert(irt_isaddr(kt));
> +      lj_assertA(irt_isaddr(kt), "bad HREF key type");
>         if (isk)
>   	emit_gmroi(as, XG_ARITHi(XOg_CMP), dest, offsetof(Node, key.gcr),
>   		   ptr2addr(ir_kgc(irkey)));
> @@ -1194,7 +1202,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>   	emit_rmro(as, XO_CMP, key, dest, offsetof(Node, key.gcr));
>         emit_sjcc(as, CC_NE, l_next);
>       }
> -    lua_assert(!irt_isnil(kt));
> +    lj_assertA(!irt_isnil(kt), "bad HREF key type");
>       emit_i8(as, irt_toitype(kt));
>       emit_rmro(as, XO_ARITHi8, XOg_CMP, dest, offsetof(Node, key.it));
>   #endif
> @@ -1209,7 +1217,7 @@ static void asm_href(ASMState *as, IRIns *ir, IROp merge)
>   #endif
>   
>     /* Load main position relative to tab->node into dest. */
> -  khash = isk ? ir_khash(irkey) : 1;
> +  khash = isk ? ir_khash(as, irkey) : 1;
>     if (khash == 0) {
>       emit_rmro(as, XO_MOV, dest|REX_GC64, tab, offsetof(GCtab, node));
>     } else {
> @@ -1276,7 +1284,7 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
>   #if !LJ_64
>     MCLabel l_exit;
>   #endif
> -  lua_assert(ofs % sizeof(Node) == 0);
> +  lj_assertA(ofs % sizeof(Node) == 0, "unaligned HREFK slot");
>     if (ra_hasreg(dest)) {
>       if (ofs != 0) {
>         if (dest == node && !(as->flags & JIT_F_LEA_AGU))
> @@ -1293,7 +1301,8 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
>       Reg key = ra_scratch(as, rset_exclude(RSET_GPR, node));
>       emit_rmro(as, XO_CMP, key|REX_64, node,
>   	       ofs + (int32_t)offsetof(Node, key.u64));
> -    lua_assert(irt_isnum(irkey->t) || irt_isgcv(irkey->t));
> +    lj_assertA(irt_isnum(irkey->t) || irt_isgcv(irkey->t),
> +	       "bad HREFK key type");
>       /* Assumes -0.0 is already canonicalized to +0.0. */
>       emit_loadu64(as, key, irt_isnum(irkey->t) ? ir_knum(irkey)->u64 :
>   #if LJ_GC64
> @@ -1304,7 +1313,7 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
>   			  (uint64_t)(uint32_t)ptr2addr(ir_kgc(irkey)));
>   #endif
>     } else {
> -    lua_assert(!irt_isnil(irkey->t));
> +    lj_assertA(!irt_isnil(irkey->t), "bad HREFK key type");
>   #if LJ_GC64
>       emit_i32(as, (irt_toitype(irkey->t)<<15)|0x7fff);
>       emit_rmro(as, XO_ARITHi, XOg_CMP, node,
> @@ -1328,13 +1337,13 @@ static void asm_hrefk(ASMState *as, IRIns *ir)
>   	       (int32_t)ir_knum(irkey)->u32.hi);
>     } else {
>       if (!irt_ispri(irkey->t)) {
> -      lua_assert(irt_isgcv(irkey->t));
> +      lj_assertA(irt_isgcv(irkey->t), "bad HREFK key type");
>         emit_gmroi(as, XG_ARITHi(XOg_CMP), node,
>   		 ofs + (int32_t)offsetof(Node, key.gcr),
>   		 ptr2addr(ir_kgc(irkey)));
>         emit_sjcc(as, CC_NE, l_exit);
>       }
> -    lua_assert(!irt_isnil(irkey->t));
> +    lj_assertA(!irt_isnil(irkey->t), "bad HREFK key type");
>       emit_i8(as, irt_toitype(irkey->t));
>       emit_rmro(as, XO_ARITHi8, XOg_CMP, node,
>   	      ofs + (int32_t)offsetof(Node, key.it));
> @@ -1407,7 +1416,8 @@ static void asm_fxload(ASMState *as, IRIns *ir)
>       if (LJ_64 && irt_is64(ir->t))
>         dest |= REX_64;
>       else
> -      lua_assert(irt_isint(ir->t) || irt_isu32(ir->t) || irt_isaddr(ir->t));
> +      lj_assertA(irt_isint(ir->t) || irt_isu32(ir->t) || irt_isaddr(ir->t),
> +		 "unsplit 64 bit load");
>       xo = XO_MOV;
>       break;
>     }
> @@ -1452,13 +1462,16 @@ static void asm_fxstore(ASMState *as, IRIns *ir)
>       case IRT_NUM: xo = XO_MOVSDto; break;
>       case IRT_FLOAT: xo = XO_MOVSSto; break;
>   #if LJ_64 && !LJ_GC64
> -    case IRT_LIGHTUD: lua_assert(0);  /* NYI: mask 64 bit lightuserdata. */
> +    case IRT_LIGHTUD:
> +      /* NYI: mask 64 bit lightuserdata. */
> +      lj_assertA(0, "store of lightuserdata");
>   #endif
>       default:
>         if (LJ_64 && irt_is64(ir->t))
>   	src |= REX_64;
>         else
> -	lua_assert(irt_isint(ir->t) || irt_isu32(ir->t) || irt_isaddr(ir->t));
> +	lj_assertA(irt_isint(ir->t) || irt_isu32(ir->t) || irt_isaddr(ir->t),
> +		   "unsplit 64 bit store");
>         xo = XO_MOVto;
>         break;
>       }
> @@ -1472,8 +1485,8 @@ static void asm_fxstore(ASMState *as, IRIns *ir)
>         emit_i8(as, k);
>         emit_mrm(as, XO_MOVmib, 0, RID_MRM);
>       } else {
> -      lua_assert(irt_is64(ir->t) || irt_isint(ir->t) || irt_isu32(ir->t) ||
> -		 irt_isaddr(ir->t));
> +      lj_assertA(irt_is64(ir->t) || irt_isint(ir->t) || irt_isu32(ir->t) ||
> +		 irt_isaddr(ir->t), "bad store type");
>         emit_i32(as, k);
>         emit_mrm(as, XO_MOVmi, REX_64IR(ir, 0), RID_MRM);
>       }
> @@ -1508,8 +1521,9 @@ static void asm_ahuvload(ASMState *as, IRIns *ir)
>   #if LJ_GC64
>     Reg tmp = RID_NONE;
>   #endif
> -  lua_assert(irt_isnum(ir->t) || irt_ispri(ir->t) || irt_isaddr(ir->t) ||
> -	     (LJ_DUALNUM && irt_isint(ir->t)));
> +  lj_assertA(irt_isnum(ir->t) || irt_ispri(ir->t) || irt_isaddr(ir->t) ||
> +	     (LJ_DUALNUM && irt_isint(ir->t)),
> +	     "bad load type %d", irt_type(ir->t));
>   #if LJ_64 && !LJ_GC64
>     if (irt_islightud(ir->t)) {
>       Reg dest = asm_load_lightud64(as, ir, 1);
> @@ -1556,7 +1570,8 @@ static void asm_ahuvload(ASMState *as, IRIns *ir)
>     as->mrm.ofs += 4;
>     asm_guardcc(as, irt_isnum(ir->t) ? CC_AE : CC_NE);
>     if (LJ_64 && irt_type(ir->t) >= IRT_NUM) {
> -    lua_assert(irt_isinteger(ir->t) || irt_isnum(ir->t));
> +    lj_assertA(irt_isinteger(ir->t) || irt_isnum(ir->t),
> +	       "bad load type %d", irt_type(ir->t));
>   #if LJ_GC64
>       emit_u32(as, LJ_TISNUM << 15);
>   #else
> @@ -1638,13 +1653,14 @@ static void asm_ahustore(ASMState *as, IRIns *ir)
>   #endif
>         emit_mrm(as, XO_MOVto, src, RID_MRM);
>       } else if (!irt_ispri(irr->t)) {
> -      lua_assert(irt_isaddr(ir->t) || (LJ_DUALNUM && irt_isinteger(ir->t)));
> +      lj_assertA(irt_isaddr(ir->t) || (LJ_DUALNUM && irt_isinteger(ir->t)),
> +		 "bad store type");
>         emit_i32(as, irr->i);
>         emit_mrm(as, XO_MOVmi, 0, RID_MRM);
>       }
>       as->mrm.ofs += 4;
>   #if LJ_GC64
> -    lua_assert(LJ_DUALNUM && irt_isinteger(ir->t));
> +    lj_assertA(LJ_DUALNUM && irt_isinteger(ir->t), "bad store type");
>       emit_i32(as, LJ_TNUMX << 15);
>   #else
>       emit_i32(as, (int32_t)irt_toitype(ir->t));
> @@ -1659,10 +1675,13 @@ static void asm_sload(ASMState *as, IRIns *ir)
>   		(!LJ_FR2 && (ir->op2 & IRSLOAD_FRAME) ? 4 : 0);
>     IRType1 t = ir->t;
>     Reg base;
> -  lua_assert(!(ir->op2 & IRSLOAD_PARENT));  /* Handled by asm_head_side(). */
> -  lua_assert(irt_isguard(t) || !(ir->op2 & IRSLOAD_TYPECHECK));
> -  lua_assert(LJ_DUALNUM ||
> -	     !irt_isint(t) || (ir->op2 & (IRSLOAD_CONVERT|IRSLOAD_FRAME)));
> +  lj_assertA(!(ir->op2 & IRSLOAD_PARENT),
> +	     "bad parent SLOAD"); /* Handled by asm_head_side(). */
> +  lj_assertA(irt_isguard(t) || !(ir->op2 & IRSLOAD_TYPECHECK),
> +	     "inconsistent SLOAD variant");
> +  lj_assertA(LJ_DUALNUM ||
> +	     !irt_isint(t) || (ir->op2 & (IRSLOAD_CONVERT|IRSLOAD_FRAME)),
> +	     "bad SLOAD type");
>     if ((ir->op2 & IRSLOAD_CONVERT) && irt_isguard(t) && irt_isint(t)) {
>       Reg left = ra_scratch(as, RSET_FPR);
>       asm_tointg(as, ir, left);  /* Frees dest reg. Do this before base alloc. */
> @@ -1682,7 +1701,8 @@ static void asm_sload(ASMState *as, IRIns *ir)
>       RegSet allow = irt_isnum(t) ? RSET_FPR : RSET_GPR;
>       Reg dest = ra_dest(as, ir, allow);
>       base = ra_alloc1(as, REF_BASE, RSET_GPR);
> -    lua_assert(irt_isnum(t) || irt_isint(t) || irt_isaddr(t));
> +    lj_assertA(irt_isnum(t) || irt_isint(t) || irt_isaddr(t),
> +	       "bad SLOAD type %d", irt_type(t));
>       if ((ir->op2 & IRSLOAD_CONVERT)) {
>         t.irt = irt_isint(t) ? IRT_NUM : IRT_INT;  /* Check for original type. */
>         emit_rmro(as, irt_isint(t) ? XO_CVTSI2SD : XO_CVTTSD2SI, dest, base, ofs);
> @@ -1728,7 +1748,8 @@ static void asm_sload(ASMState *as, IRIns *ir)
>       /* Need type check, even if the load result is unused. */
>       asm_guardcc(as, irt_isnum(t) ? CC_AE : CC_NE);
>       if (LJ_64 && irt_type(t) >= IRT_NUM) {
> -      lua_assert(irt_isinteger(t) || irt_isnum(t));
> +      lj_assertA(irt_isinteger(t) || irt_isnum(t),
> +		 "bad SLOAD type %d", irt_type(t));
>   #if LJ_GC64
>         emit_u32(as, LJ_TISNUM << 15);
>   #else
> @@ -1780,7 +1801,8 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>     CTInfo info = lj_ctype_info(cts, id, &sz);
>     const CCallInfo *ci = &lj_ir_callinfo[IRCALL_lj_mem_newgco];
>     IRRef args[4];
> -  lua_assert(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL));
> +  lj_assertA(sz != CTSIZE_INVALID || (ir->o == IR_CNEW && ir->op2 != REF_NIL),
> +	     "bad CNEW/CNEWI operands");
>   
>     as->gcsteps++;
>     asm_setupresult(as, ir, ci);  /* GCcdata * */
> @@ -1810,7 +1832,7 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>       int32_t ofs = sizeof(GCcdata);
>       if (sz == 8) {
>         ofs += 4; ir++;
> -      lua_assert(ir->o == IR_HIOP);
> +      lj_assertA(ir->o == IR_HIOP, "missing CNEWI HIOP");
>       }
>       do {
>         if (irref_isk(ir->op2)) {
> @@ -1824,7 +1846,7 @@ static void asm_cnew(ASMState *as, IRIns *ir)
>         ofs -= 4; ir--;
>       } while (1);
>   #endif
> -    lua_assert(sz == 4 || sz == 8);
> +    lj_assertA(sz == 4 || sz == 8, "bad CNEWI size %d", sz);
>     } else if (ir->op2 != REF_NIL) {  /* Create VLA/VLS/aligned cdata. */
>       ci = &lj_ir_callinfo[IRCALL_lj_cdata_newv];
>       args[0] = ASMREF_L;     /* lua_State *L */
> @@ -1883,7 +1905,7 @@ static void asm_obar(ASMState *as, IRIns *ir)
>     MCLabel l_end;
>     Reg obj;
>     /* No need for other object barriers (yet). */
> -  lua_assert(IR(ir->op1)->o == IR_UREFC);
> +  lj_assertA(IR(ir->op1)->o == IR_UREFC, "bad OBAR type");
>     ra_evictset(as, RSET_SCRATCH);
>     l_end = emit_label(as);
>     args[0] = ASMREF_TMP1;  /* global_State *g */
> @@ -2000,7 +2022,7 @@ static int asm_swapops(ASMState *as, IRIns *ir)
>   {
>     IRIns *irl = IR(ir->op1);
>     IRIns *irr = IR(ir->op2);
> -  lua_assert(ra_noreg(irr->r));
> +  lj_assertA(ra_noreg(irr->r), "bad usage");
>     if (!irm_iscomm(lj_ir_mode[ir->o]))
>       return 0;  /* Can't swap non-commutative operations. */
>     if (irref_isk(ir->op2))
> @@ -2391,8 +2413,9 @@ static void asm_comp(ASMState *as, IRIns *ir)
>       IROp leftop = (IROp)(IR(lref)->o);
>       Reg r64 = REX_64IR(ir, 0);
>       int32_t imm = 0;
> -    lua_assert(irt_is64(ir->t) || irt_isint(ir->t) ||
> -	       irt_isu32(ir->t) || irt_isaddr(ir->t) || irt_isu8(ir->t));
> +    lj_assertA(irt_is64(ir->t) || irt_isint(ir->t) ||
> +	       irt_isu32(ir->t) || irt_isaddr(ir->t) || irt_isu8(ir->t),
> +	       "bad comparison data type %d", irt_type(ir->t));
>       /* Swap constants (only for ABC) and fusable loads to the right. */
>       if (irref_isk(lref) || (!irref_isk(rref) && opisfusableload(leftop))) {
>         if ((cc & 0xc) == 0xc) cc ^= 0x53;  /* L <-> G, LE <-> GE */
> @@ -2474,7 +2497,7 @@ static void asm_comp(ASMState *as, IRIns *ir)
>   	  /* Use test r,r instead of cmp r,0. */
>   	  x86Op xo = XO_TEST;
>   	  if (irt_isu8(ir->t)) {
> -	    lua_assert(ir->o == IR_EQ || ir->o == IR_NE);
> +	    lj_assertA(ir->o == IR_EQ || ir->o == IR_NE, "bad usage");
>   	    xo = XO_TESTb;
>   	    if (!rset_test(RSET_RANGE(RID_EAX, RID_EBX+1), left)) {
>   	      if (LJ_64) {
> @@ -2630,10 +2653,11 @@ static void asm_hiop(ASMState *as, IRIns *ir)
>     case IR_CNEWI:
>       /* Nothing to do here. Handled by CNEWI itself. */
>       break;
> -  default: lua_assert(0); break;
> +  default: lj_assertA(0, "bad HIOP for op %d", (ir-1)->o); break;
>     }
>   #else
> -  UNUSED(as); UNUSED(ir); lua_assert(0);  /* Unused on x64 or without FFI. */
> +  /* Unused on x64 or without FFI. */
> +  UNUSED(as); UNUSED(ir); lj_assertA(0, "unexpected HIOP");
>   #endif
>   }
>   
> @@ -2699,8 +2723,9 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
>         Reg src = ra_alloc1(as, ref, RSET_FPR);
>         emit_rmro(as, XO_MOVSDto, src, RID_BASE, ofs);
>       } else {
> -      lua_assert(irt_ispri(ir->t) || irt_isaddr(ir->t) ||
> -		 (LJ_DUALNUM && irt_isinteger(ir->t)));
> +      lj_assertA(irt_ispri(ir->t) || irt_isaddr(ir->t) ||
> +		 (LJ_DUALNUM && irt_isinteger(ir->t)),
> +		 "restore of IR type %d", irt_type(ir->t));
>         if (!irref_isk(ref)) {
>   	Reg src = ra_alloc1(as, ref, rset_exclude(RSET_GPR, RID_BASE));
>   #if LJ_GC64
> @@ -2745,7 +2770,7 @@ static void asm_stack_restore(ASMState *as, SnapShot *snap)
>       }
>       checkmclim(as);
>     }
> -  lua_assert(map + nent == flinks);
> +  lj_assertA(map + nent == flinks, "inconsistent frames in snapshot");
>   }
>   
>   /* -- GC handling --------------------------------------------------------- */
> @@ -2789,16 +2814,16 @@ static void asm_loop_fixup(ASMState *as)
>     MCode *target = as->mcp;
>     if (as->realign) {  /* Realigned loops use short jumps. */
>       as->realign = NULL;  /* Stop another retry. */
> -    lua_assert(((intptr_t)target & 15) == 0);
> +    lj_assertA(((intptr_t)target & 15) == 0, "loop realign failed");
>       if (as->loopinv) {  /* Inverted loop branch? */
>         p -= 5;
>         p[0] = XI_JMP;
> -      lua_assert(target - p >= -128);
> +      lj_assertA(target - p >= -128, "loop realign failed");
>         p[-1] = (MCode)(target - p);  /* Patch sjcc. */
>         if (as->loopinv == 2)
>   	p[-3] = (MCode)(target - p + 2);  /* Patch opt. short jp. */
>       } else {
> -      lua_assert(target - p >= -128);
> +      lj_assertA(target - p >= -128, "loop realign failed");
>         p[-1] = (MCode)(int8_t)(target - p);  /* Patch short jmp. */
>         p[-2] = XI_JMPs;
>       }
> @@ -2904,7 +2929,7 @@ static void asm_tail_fixup(ASMState *as, TraceNo lnk)
>     }
>     /* Patch exit branch. */
>     target = lnk ? traceref(as->J, lnk)->mcode : (MCode *)lj_vm_exit_interp;
> -  *(int32_t *)(p-4) = jmprel(p, target);
> +  *(int32_t *)(p-4) = jmprel(as->J, p, target);
>     p[-5] = XI_JMP;
>     /* Drop unused mcode tail. Fill with NOPs to make the prefetcher happy. */
>     for (q = as->mctop-1; q >= p; q--)
> @@ -3077,17 +3102,17 @@ void lj_asm_patchexit(jit_State *J, GCtrace *T, ExitNo exitno, MCode *target)
>     uint32_t statei = u32ptr(&J2G(J)->vmstate);
>   #endif
>     if (len > 5 && p[len-5] == XI_JMP && p+len-6 + *(int32_t *)(p+len-4) == px)
> -    *(int32_t *)(p+len-4) = jmprel(p+len, target);
> +    *(int32_t *)(p+len-4) = jmprel(J, p+len, target);
>     /* Do not patch parent exit for a stack check. Skip beyond vmstate update. */
>     for (; p < pe; p += asm_x86_inslen(p)) {
>       intptr_t ofs = LJ_GC64 ? (p[0] & 0xf0) == 0x40 : LJ_64;
>       if (*(uint32_t *)(p+2+ofs) == statei && p[ofs+LJ_GC64-LJ_64] == XI_MOVmi)
>         break;
>     }
> -  lua_assert(p < pe);
> +  lj_assertJ(p < pe, "instruction length decoder failed");
>     for (; p < pe; p += asm_x86_inslen(p))
>       if ((*(uint16_t *)p & 0xf0ff) == 0x800f && p + *(int32_t *)(p+2) == px)
> -      *(int32_t *)(p+2) = jmprel(p+6, target);
> +      *(int32_t *)(p+2) = jmprel(J, p+6, target);
>     lj_mcode_sync(T->mcode, T->mcode + T->szmcode);
>     lj_mcode_patch(J, mcarea, 1);
>   }
> diff --git a/src/lj_assert.c b/src/lj_assert.c
> new file mode 100644
> index 00000000..7989dbe6
> --- /dev/null
> +++ b/src/lj_assert.c
> @@ -0,0 +1,28 @@
> +/*
> +** Internal assertions.
> +** Copyright (C) 2005-2020 Mike Pall. See Copyright Notice in luajit.h
> +*/
> +
> +#define lj_assert_c
> +#define LUA_CORE
> +
> +#if defined(LUA_USE_ASSERT) || defined(LUA_USE_APICHECK)
> +
> +#include <stdio.h>
> +
> +#include "lj_obj.h"
> +
> +void lj_assert_fail(global_State *g, const char *file, int line,
> +		    const char *func, const char *fmt, ...)
> +{
> +  va_list argp;
> +  va_start(argp, fmt);
> +  fprintf(stderr, "LuaJIT ASSERT %s:%d: %s: ", file, line, func);
> +  vfprintf(stderr, fmt, argp);
> +  fputc('\n', stderr);
> +  va_end(argp);
> +  UNUSED(g);  /* May be NULL. TODO: optionally dump state. */
> +  abort();
> +}
> +
> +#endif
> diff --git a/src/lj_bcread.c b/src/lj_bcread.c
> index f6c7ad25..cddf6ff1 100644
> --- a/src/lj_bcread.c
> +++ b/src/lj_bcread.c
> @@ -53,7 +53,7 @@ static LJ_NOINLINE void bcread_error(LexState *ls, ErrMsg em)
>   /* Refill buffer. */
>   static LJ_NOINLINE void bcread_fill(LexState *ls, MSize len, int need)
>   {
> -  lua_assert(len != 0);
> +  lj_assertLS(len != 0, "empty refill");
>     if (len > LJ_MAX_BUF || ls->c < 0)
>       bcread_error(ls, LJ_ERR_BCBAD);
>     do {
> @@ -63,7 +63,7 @@ static LJ_NOINLINE void bcread_fill(LexState *ls, MSize len, int need)
>       MSize n = (MSize)(ls->pe - ls->p);
>       if (n) {  /* Copy remainder to buffer. */
>         if (sbuflen(&ls->sb)) {  /* Move down in buffer. */
> -	lua_assert(ls->pe == sbufP(&ls->sb));
> +	lj_assertLS(ls->pe == sbufP(&ls->sb), "bad buffer pointer");
>   	if (ls->p != p) memmove(p, ls->p, n);
>         } else {  /* Copy from buffer provided by reader. */
>   	p = lj_buf_need(&ls->sb, len);
> @@ -112,7 +112,7 @@ static LJ_AINLINE uint8_t *bcread_mem(LexState *ls, MSize len)
>   {
>     uint8_t *p = (uint8_t *)ls->p;
>     ls->p += len;
> -  lua_assert(ls->p <= ls->pe);
> +  lj_assertLS(ls->p <= ls->pe, "buffer read overflow");
>     return p;
>   }
>   
> @@ -125,7 +125,7 @@ static void bcread_block(LexState *ls, void *q, MSize len)
>   /* Read byte from buffer. */
>   static LJ_AINLINE uint32_t bcread_byte(LexState *ls)
>   {
> -  lua_assert(ls->p < ls->pe);
> +  lj_assertLS(ls->p < ls->pe, "buffer read overflow");
>     return (uint32_t)(uint8_t)*ls->p++;
>   }
>   
> @@ -133,7 +133,7 @@ static LJ_AINLINE uint32_t bcread_byte(LexState *ls)
>   static LJ_AINLINE uint32_t bcread_uleb128(LexState *ls)
>   {
>     uint32_t v = lj_buf_ruleb128(&ls->p);
> -  lua_assert(ls->p <= ls->pe);
> +  lj_assertLS(ls->p <= ls->pe, "buffer read overflow");
>     return v;
>   }
>   
> @@ -150,7 +150,7 @@ static uint32_t bcread_uleb128_33(LexState *ls)
>      } while (*p++ >= 0x80);
>     }
>     ls->p = (char *)p;
> -  lua_assert(ls->p <= ls->pe);
> +  lj_assertLS(ls->p <= ls->pe, "buffer read overflow");
>     return v;
>   }
>   
> @@ -197,7 +197,7 @@ static void bcread_ktabk(LexState *ls, TValue *o)
>       o->u32.lo = bcread_uleb128(ls);
>       o->u32.hi = bcread_uleb128(ls);
>     } else {
> -    lua_assert(tp <= BCDUMP_KTAB_TRUE);
> +    lj_assertLS(tp <= BCDUMP_KTAB_TRUE, "bad constant type %d", tp);
>       setpriV(o, ~tp);
>     }
>   }
> @@ -219,7 +219,7 @@ static GCtab *bcread_ktab(LexState *ls)
>       for (i = 0; i < nhash; i++) {
>         TValue key;
>         bcread_ktabk(ls, &key);
> -      lua_assert(!tvisnil(&key));
> +      lj_assertLS(!tvisnil(&key), "nil key");
>         bcread_ktabk(ls, lj_tab_set(ls->L, t, &key));
>       }
>     }
> @@ -256,7 +256,7 @@ static void bcread_kgc(LexState *ls, GCproto *pt, MSize sizekgc)
>   #endif
>       } else {
>         lua_State *L = ls->L;
> -      lua_assert(tp == BCDUMP_KGC_CHILD);
> +      lj_assertLS(tp == BCDUMP_KGC_CHILD, "bad constant type %d", tp);
>         if (L->top <= bcread_oldtop(L, ls))  /* Stack underflow? */
>   	bcread_error(ls, LJ_ERR_BCBAD);
>         L->top--;
> @@ -437,7 +437,7 @@ static int bcread_header(LexState *ls)
>   GCproto *lj_bcread(LexState *ls)
>   {
>     lua_State *L = ls->L;
> -  lua_assert(ls->c == BCDUMP_HEAD1);
> +  lj_assertLS(ls->c == BCDUMP_HEAD1, "bad bytecode header");
>     bcread_savetop(L, ls, L->top);
>     lj_buf_reset(&ls->sb);
>     /* Check for a valid bytecode dump header. */
> diff --git a/src/lj_bcwrite.c b/src/lj_bcwrite.c
> index a86d6d00..ce5837f6 100644
> --- a/src/lj_bcwrite.c
> +++ b/src/lj_bcwrite.c
> @@ -29,8 +29,17 @@ typedef struct BCWriteCtx {
>     void *wdata;			/* Writer callback data. */
>     int strip;			/* Strip debug info. */
>     int status;			/* Status from writer callback. */
> +#ifdef LUA_USE_ASSERT
> +  global_State *g;
> +#endif
>   } BCWriteCtx;
>   
> +#ifdef LUA_USE_ASSERT
> +#define lj_assertBCW(c, ...)	lj_assertG_(ctx->g, (c), __VA_ARGS__)
> +#else
> +#define lj_assertBCW(c, ...)	((void)ctx)
> +#endif
> +
>   /* -- Bytecode writer ----------------------------------------------------- */
>   
>   /* Write a single constant key/value of a template table. */
> @@ -61,7 +70,7 @@ static void bcwrite_ktabk(BCWriteCtx *ctx, cTValue *o, int narrow)
>       p = lj_strfmt_wuleb128(p, o->u32.lo);
>       p = lj_strfmt_wuleb128(p, o->u32.hi);
>     } else {
> -    lua_assert(tvispri(o));
> +    lj_assertBCW(tvispri(o), "unhandled type %d", itype(o));
>       *p++ = BCDUMP_KTAB_NIL+~itype(o);
>     }
>     setsbufP(&ctx->sb, p);
> @@ -121,7 +130,7 @@ static void bcwrite_kgc(BCWriteCtx *ctx, GCproto *pt)
>         tp = BCDUMP_KGC_STR + gco2str(o)->len;
>         need = 5+gco2str(o)->len;
>       } else if (o->gch.gct == ~LJ_TPROTO) {
> -      lua_assert((pt->flags & PROTO_CHILD));
> +      lj_assertBCW((pt->flags & PROTO_CHILD), "prototype has unexpected child");
>         tp = BCDUMP_KGC_CHILD;
>   #if LJ_HASFFI
>       } else if (o->gch.gct == ~LJ_TCDATA) {
> @@ -132,12 +141,14 @@ static void bcwrite_kgc(BCWriteCtx *ctx, GCproto *pt)
>         } else if (id == CTID_UINT64) {
>   	tp = BCDUMP_KGC_U64;
>         } else {
> -	lua_assert(id == CTID_COMPLEX_DOUBLE);
> +	lj_assertBCW(id == CTID_COMPLEX_DOUBLE,
> +		     "bad cdata constant CTID %d", id);
>   	tp = BCDUMP_KGC_COMPLEX;
>         }
>   #endif
>       } else {
> -      lua_assert(o->gch.gct == ~LJ_TTAB);
> +      lj_assertBCW(o->gch.gct == ~LJ_TTAB,
> +		   "bad constant GC type %d", o->gch.gct);
>         tp = BCDUMP_KGC_TAB;
>         need = 1+2*5;
>       }
> @@ -289,7 +300,7 @@ static void bcwrite_proto(BCWriteCtx *ctx, GCproto *pt)
>       MSize nn = (lj_fls(n)+8)*9 >> 6;
>       char *q = sbufB(&ctx->sb) + (5 - nn);
>       p = lj_strfmt_wuleb128(q, n);  /* Fill in final size. */
> -    lua_assert(p == sbufB(&ctx->sb) + 5);
> +    lj_assertBCW(p == sbufB(&ctx->sb) + 5, "bad ULEB128 write");
>       ctx->status = ctx->wfunc(sbufL(&ctx->sb), q, nn+n, ctx->wdata);
>     }
>   }
> @@ -349,6 +360,9 @@ int lj_bcwrite(lua_State *L, GCproto *pt, lua_Writer writer, void *data,
>     ctx.wdata = data;
>     ctx.strip = strip;
>     ctx.status = 0;
> +#ifdef LUA_USE_ASSERT
> +  ctx.g = G(L);
> +#endif
>     lj_buf_init(L, &ctx.sb);
>     status = lj_vm_cpcall(L, NULL, &ctx, cpwriter);
>     if (status == 0) status = ctx.status;
> diff --git a/src/lj_buf.c b/src/lj_buf.c
> index 0dfe7f98..923f4276 100644
> --- a/src/lj_buf.c
> +++ b/src/lj_buf.c
> @@ -30,7 +30,7 @@ static void buf_grow(SBuf *sb, MSize sz)
>   
>   LJ_NOINLINE char *LJ_FASTCALL lj_buf_need2(SBuf *sb, MSize sz)
>   {
> -  lua_assert(sz > sbufsz(sb));
> +  lj_assertG_(G(sbufL(sb)), sz > sbufsz(sb), "SBuf overflow");
>     if (LJ_UNLIKELY(sz > LJ_MAX_BUF))
>       lj_err_mem(sbufL(sb));
>     buf_grow(sb, sz);
> @@ -40,7 +40,7 @@ LJ_NOINLINE char *LJ_FASTCALL lj_buf_need2(SBuf *sb, MSize sz)
>   LJ_NOINLINE char *LJ_FASTCALL lj_buf_more2(SBuf *sb, MSize sz)
>   {
>     MSize len = sbuflen(sb);
> -  lua_assert(sz > sbufleft(sb));
> +  lj_assertG_(G(sbufL(sb)), sz > sbufleft(sb), "SBuf overflow");
>     if (LJ_UNLIKELY(sz > LJ_MAX_BUF || len + sz > LJ_MAX_BUF))
>       lj_err_mem(sbufL(sb));
>     buf_grow(sb, len + sz);
> diff --git a/src/lj_carith.c b/src/lj_carith.c
> index 04c18054..4ae1e9ee 100644
> --- a/src/lj_carith.c
> +++ b/src/lj_carith.c
> @@ -122,7 +122,7 @@ static int carith_ptr(lua_State *L, CTState *cts, CDArith *ca, MMS mm)
>   	setboolV(L->top-1, ((uintptr_t)pp < (uintptr_t)pp2));
>   	return 1;
>         } else {
> -	lua_assert(mm == MM_le);
> +	lj_assertL(mm == MM_le, "bad metamethod %d", mm);
>   	setboolV(L->top-1, ((uintptr_t)pp <= (uintptr_t)pp2));
>   	return 1;
>         }
> @@ -208,7 +208,9 @@ static int carith_int64(lua_State *L, CTState *cts, CDArith *ca, MMS mm)
>   	*up = lj_carith_powu64(u0, u1);
>         break;
>       case MM_unm: *up = (uint64_t)-(int64_t)u0; break;
> -    default: lua_assert(0); break;
> +    default:
> +      lj_assertL(0, "bad metamethod %d", mm);
> +      break;
>       }
>       lj_gc_check(L);
>       return 1;
> @@ -301,7 +303,9 @@ uint64_t lj_carith_shift64(uint64_t x, int32_t sh, int op)
>     case IR_BSAR-IR_BSHL: x = lj_carith_sar64(x, sh); break;
>     case IR_BROL-IR_BSHL: x = lj_carith_rol64(x, sh); break;
>     case IR_BROR-IR_BSHL: x = lj_carith_ror64(x, sh); break;
> -  default: lua_assert(0); break;
> +  default:
> +    lj_assertX(0, "bad shift op %d", op);
> +    break;
>     }
>     return x;
>   }
> diff --git a/src/lj_ccall.c b/src/lj_ccall.c
> index c1e12f56..a989f657 100644
> --- a/src/lj_ccall.c
> +++ b/src/lj_ccall.c
> @@ -391,7 +391,8 @@
>   #define CCALL_HANDLE_GPR \
>     /* Try to pass argument in GPRs. */ \
>     if (n > 1) { \
> -    lua_assert(n == 2 || n == 4);  /* int64_t or complex (float). */ \
> +    /* int64_t or complex (float). */ \
> +    lj_assertL(n == 2 || n == 4, "bad GPR size %d", n); \
>       if (ctype_isinteger(d->info) || ctype_isfp(d->info)) \
>         ngpr = (ngpr + 1u) & ~1u;  /* Align int64_t to regpair. */ \
>       else if (ngpr + n > maxgpr) \
> @@ -642,7 +643,8 @@ static void ccall_classify_ct(CTState *cts, CType *ct, int *rcl, CTSize ofs)
>       ccall_classify_struct(cts, ct, rcl, ofs);
>     } else {
>       int cl = ctype_isfp(ct->info) ? CCALL_RCL_SSE : CCALL_RCL_INT;
> -    lua_assert(ctype_hassize(ct->info));
> +    lj_assertCTS(ctype_hassize(ct->info),
> +		 "classify ctype %08x without size", ct->info);
>       if ((ofs & (ct->size-1))) cl = CCALL_RCL_MEM;  /* Unaligned. */
>       rcl[(ofs >= 8)] |= cl;
>     }
> @@ -667,12 +669,13 @@ static int ccall_classify_struct(CTState *cts, CType *ct, int *rcl, CTSize ofs)
>   }
>   
>   /* Try to split up a small struct into registers. */
> -static int ccall_struct_reg(CCallState *cc, GPRArg *dp, int *rcl)
> +static int ccall_struct_reg(CCallState *cc, CTState *cts, GPRArg *dp, int *rcl)
>   {
>     MSize ngpr = cc->ngpr, nfpr = cc->nfpr;
>     uint32_t i;
> +  UNUSED(cts);
>     for (i = 0; i < 2; i++) {
> -    lua_assert(!(rcl[i] & CCALL_RCL_MEM));
> +    lj_assertCTS(!(rcl[i] & CCALL_RCL_MEM), "pass mem struct in reg");
>       if ((rcl[i] & CCALL_RCL_INT)) {  /* Integer class takes precedence. */
>         if (ngpr >= CCALL_NARG_GPR) return 1;  /* Register overflow. */
>         cc->gpr[ngpr++] = dp[i];
> @@ -693,7 +696,8 @@ static int ccall_struct_arg(CCallState *cc, CTState *cts, CType *d, int *rcl,
>     dp[0] = dp[1] = 0;
>     /* Convert to temp. struct. */
>     lj_cconv_ct_tv(cts, d, (uint8_t *)dp, o, CCF_ARG(narg));
> -  if (ccall_struct_reg(cc, dp, rcl)) {  /* Register overflow? Pass on stack. */
> +  if (ccall_struct_reg(cc, cts, dp, rcl)) {
> +    /* Register overflow? Pass on stack. */
>       MSize nsp = cc->nsp, n = rcl[1] ? 2 : 1;
>       if (nsp + n > CCALL_MAXSTACK) return 1;  /* Too many arguments. */
>       cc->nsp = nsp + n;
> @@ -989,7 +993,7 @@ static int ccall_set_args(lua_State *L, CTState *cts, CType *ct,
>       if (fid) {  /* Get argument type from field. */
>         CType *ctf = ctype_get(cts, fid);
>         fid = ctf->sib;
> -      lua_assert(ctype_isfield(ctf->info));
> +      lj_assertL(ctype_isfield(ctf->info), "field expected");
>         did = ctype_cid(ctf->info);
>       } else {
>         if (!(ct->info & CTF_VARARG))
> @@ -1137,7 +1141,8 @@ static int ccall_get_results(lua_State *L, CTState *cts, CType *ct,
>     CCALL_HANDLE_RET
>   #endif
>     /* No reference types end up here, so there's no need for the CTypeID. */
> -  lua_assert(!(ctype_isrefarray(ctr->info) || ctype_isstruct(ctr->info)));
> +  lj_assertL(!(ctype_isrefarray(ctr->info) || ctype_isstruct(ctr->info)),
> +	     "unexpected reference ctype");
>     return lj_cconv_tv_ct(cts, ctr, 0, L->top-1, sp);
>   }
>   
> diff --git a/src/lj_ccallback.c b/src/lj_ccallback.c
> index 37edd00f..3738c234 100644
> --- a/src/lj_ccallback.c
> +++ b/src/lj_ccallback.c
> @@ -107,9 +107,9 @@ MSize lj_ccallback_ptr2slot(CTState *cts, void *p)
>   /* Initialize machine code for callback function pointers. */
>   #if LJ_OS_NOJIT
>   /* Disabled callback support. */
> -#define callback_mcode_init(g, p)	UNUSED(p)
> +#define callback_mcode_init(g, p)	(p)
>   #elif LJ_TARGET_X86ORX64
> -static void callback_mcode_init(global_State *g, uint8_t *page)
> +static void *callback_mcode_init(global_State *g, uint8_t *page)
>   {
>     uint8_t *p = page;
>     uint8_t *target = (uint8_t *)(void *)lj_vm_ffi_callback;
> @@ -143,10 +143,10 @@ static void callback_mcode_init(global_State *g, uint8_t *page)
>         *p++ = XI_JMPs; *p++ = (uint8_t)((2+2)*(31-(slot&31)) - 2);
>       }
>     }
> -  lua_assert(p - page <= CALLBACK_MCODE_SIZE);
> +  return p;
>   }
>   #elif LJ_TARGET_ARM
> -static void callback_mcode_init(global_State *g, uint32_t *page)
> +static void *callback_mcode_init(global_State *g, uint32_t *page)
>   {
>     uint32_t *p = page;
>     void *target = (void *)lj_vm_ffi_callback;
> @@ -165,10 +165,10 @@ static void callback_mcode_init(global_State *g, uint32_t *page)
>       *p = ARMI_B | ((page-p-2) & 0x00ffffffu);
>       p++;
>     }
> -  lua_assert(p - page <= CALLBACK_MCODE_SIZE);
> +  return p;
>   }
>   #elif LJ_TARGET_ARM64
> -static void callback_mcode_init(global_State *g, uint32_t *page)
> +static void *callback_mcode_init(global_State *g, uint32_t *page)
>   {
>     uint32_t *p = page;
>     void *target = (void *)lj_vm_ffi_callback;
> @@ -185,10 +185,10 @@ static void callback_mcode_init(global_State *g, uint32_t *page)
>       *p = A64I_LE(A64I_B | A64F_S26((page-p) & 0x03ffffffu));
>       p++;
>     }
> -  lua_assert(p - page <= CALLBACK_MCODE_SIZE);
> +  return p;
>   }
>   #elif LJ_TARGET_PPC
> -static void callback_mcode_init(global_State *g, uint32_t *page)
> +static void *callback_mcode_init(global_State *g, uint32_t *page)
>   {
>     uint32_t *p = page;
>     void *target = (void *)lj_vm_ffi_callback;
> @@ -204,10 +204,10 @@ static void callback_mcode_init(global_State *g, uint32_t *page)
>       *p = PPCI_B | (((page-p) & 0x00ffffffu) << 2);
>       p++;
>     }
> -  lua_assert(p - page <= CALLBACK_MCODE_SIZE);
> +  return p;
>   }
>   #elif LJ_TARGET_MIPS
> -static void callback_mcode_init(global_State *g, uint32_t *page)
> +static void *callback_mcode_init(global_State *g, uint32_t *page)
>   {
>     uint32_t *p = page;
>     uintptr_t target = (uintptr_t)(void *)lj_vm_ffi_callback;
> @@ -236,11 +236,11 @@ static void callback_mcode_init(global_State *g, uint32_t *page)
>       p++;
>       *p++ = MIPSI_LI | MIPSF_T(RID_R1) | slot;
>     }
> -  lua_assert(p - page <= CALLBACK_MCODE_SIZE);
> +  return p;
>   }
>   #else
>   /* Missing support for this architecture. */
> -#define callback_mcode_init(g, p)	UNUSED(p)
> +#define callback_mcode_init(g, p)	(p)
>   #endif
>   
>   /* -- Machine code management --------------------------------------------- */
> @@ -263,7 +263,7 @@ static void callback_mcode_init(global_State *g, uint32_t *page)
>   static void callback_mcode_new(CTState *cts)
>   {
>     size_t sz = (size_t)CALLBACK_MCODE_SIZE;
> -  void *p;
> +  void *p, *pe;
>     if (CALLBACK_MAX_SLOT == 0)
>       lj_err_caller(cts->L, LJ_ERR_FFI_CBACKOV);
>   #if LJ_TARGET_WINDOWS
> @@ -280,7 +280,10 @@ static void callback_mcode_new(CTState *cts)
>     p = lj_mem_new(cts->L, sz);
>   #endif
>     cts->cb.mcode = p;
> -  callback_mcode_init(cts->g, p);
> +  pe = callback_mcode_init(cts->g, p);
> +  UNUSED(pe);
> +  lj_assertCTS((size_t)((char *)pe - (char *)p) <= sz,
> +	       "miscalculated CALLBACK_MAX_SLOT");
>     lj_mcode_sync(p, (char *)p + sz);
>   #if LJ_TARGET_WINDOWS
>     {
> @@ -421,8 +424,9 @@ void lj_ccallback_mcode_free(CTState *cts)
>   
>   #define CALLBACK_HANDLE_GPR \
>     if (n > 1) { \
> -    lua_assert(((LJ_ABI_SOFTFP && ctype_isnum(cta->info)) ||  /* double. */ \
> -		ctype_isinteger(cta->info)) && n == 2);  /* int64_t. */ \
> +    lj_assertCTS(((LJ_ABI_SOFTFP && ctype_isnum(cta->info)) ||  /* double. */ \
> +		 ctype_isinteger(cta->info)) && n == 2,  /* int64_t. */ \
> +		 "bad GPR type"); \
>       ngpr = (ngpr + 1u) & ~1u;  /* Align int64_t to regpair. */ \
>     } \
>     if (ngpr + n <= maxgpr) { \
> @@ -579,7 +583,7 @@ static void callback_conv_args(CTState *cts, lua_State *L)
>         CTSize sz;
>         int isfp;
>         MSize n;
> -      lua_assert(ctype_isfield(ctf->info));
> +      lj_assertCTS(ctype_isfield(ctf->info), "field expected");
>         cta = ctype_rawchild(cts, ctf);
>         isfp = ctype_isfp(cta->info);
>         sz = (cta->size + CTSIZE_PTR-1) & ~(CTSIZE_PTR-1);
> @@ -671,7 +675,7 @@ lua_State * LJ_FASTCALL lj_ccallback_enter(CTState *cts, void *cf)
>   {
>     lua_State *L = cts->L;
>     global_State *g = cts->g;
> -  lua_assert(L != NULL);
> +  lj_assertG(L != NULL, "uninitialized cts->L in callback");
>     if (tvref(g->jit_base)) {
>       setstrV(L, L->top++, lj_err_str(L, LJ_ERR_FFI_BADCBACK));
>       if (g->panic) g->panic(L);
> @@ -756,7 +760,7 @@ static CType *callback_checkfunc(CTState *cts, CType *ct)
>         CType *ctf = ctype_get(cts, fid);
>         if (!ctype_isattrib(ctf->info)) {
>   	CType *cta;
> -	lua_assert(ctype_isfield(ctf->info));
> +	lj_assertCTS(ctype_isfield(ctf->info), "field expected");
>   	cta = ctype_rawchild(cts, ctf);
>   	if (!(ctype_isenum(cta->info) || ctype_isptr(cta->info) ||
>   	      (ctype_isnum(cta->info) && cta->size <= 8)) ||
> diff --git a/src/lj_cconv.c b/src/lj_cconv.c
> index ca2a5d30..37c88852 100644
> --- a/src/lj_cconv.c
> +++ b/src/lj_cconv.c
> @@ -122,19 +122,25 @@ void lj_cconv_ct_ct(CTState *cts, CType *d, CType *s,
>     CTInfo dinfo = d->info, sinfo = s->info;
>     void *tmpptr;
>   
> -  lua_assert(!ctype_isenum(dinfo) && !ctype_isenum(sinfo));
> -  lua_assert(!ctype_isattrib(dinfo) && !ctype_isattrib(sinfo));
> +  lj_assertCTS(!ctype_isenum(dinfo) && !ctype_isenum(sinfo),
> +	       "unresolved enum");
> +  lj_assertCTS(!ctype_isattrib(dinfo) && !ctype_isattrib(sinfo),
> +	       "unstripped attribute");
>   
>     if (ctype_type(dinfo) > CT_MAYCONVERT || ctype_type(sinfo) > CT_MAYCONVERT)
>       goto err_conv;
>   
>     /* Some basic sanity checks. */
> -  lua_assert(!ctype_isnum(dinfo) || dsize > 0);
> -  lua_assert(!ctype_isnum(sinfo) || ssize > 0);
> -  lua_assert(!ctype_isbool(dinfo) || dsize == 1 || dsize == 4);
> -  lua_assert(!ctype_isbool(sinfo) || ssize == 1 || ssize == 4);
> -  lua_assert(!ctype_isinteger(dinfo) || (1u<<lj_fls(dsize)) == dsize);
> -  lua_assert(!ctype_isinteger(sinfo) || (1u<<lj_fls(ssize)) == ssize);
> +  lj_assertCTS(!ctype_isnum(dinfo) || dsize > 0, "bad size for number type");
> +  lj_assertCTS(!ctype_isnum(sinfo) || ssize > 0, "bad size for number type");
> +  lj_assertCTS(!ctype_isbool(dinfo) || dsize == 1 || dsize == 4,
> +	       "bad size for bool type");
> +  lj_assertCTS(!ctype_isbool(sinfo) || ssize == 1 || ssize == 4,
> +	       "bad size for bool type");
> +  lj_assertCTS(!ctype_isinteger(dinfo) || (1u<<lj_fls(dsize)) == dsize,
> +	       "bad size for integer type");
> +  lj_assertCTS(!ctype_isinteger(sinfo) || (1u<<lj_fls(ssize)) == ssize,
> +	       "bad size for integer type");
>   
>     switch (cconv_idx2(dinfo, sinfo)) {
>     /* Destination is a bool. */
> @@ -357,7 +363,7 @@ void lj_cconv_ct_ct(CTState *cts, CType *d, CType *s,
>       if ((flags & CCF_CAST) || (d->info & CTF_VLA) || d != s)
>         goto err_conv;  /* Must be exact same type. */
>   copyval:  /* Copy value. */
> -    lua_assert(dsize == ssize);
> +    lj_assertCTS(dsize == ssize, "value copy with different sizes");
>       memcpy(dp, sp, dsize);
>       break;
>   
> @@ -389,7 +395,7 @@ int lj_cconv_tv_ct(CTState *cts, CType *s, CTypeID sid,
>   	lj_cconv_ct_ct(cts, ctype_get(cts, CTID_DOUBLE), s,
>   		       (uint8_t *)&o->n, sp, 0);
>   	/* Numbers are NOT canonicalized here! Beware of uninitialized data. */
> -	lua_assert(tvisnum(o));
> +	lj_assertCTS(tvisnum(o), "non-canonical NaN passed");
>         }
>       } else {
>         uint32_t b = s->size == 1 ? (*sp != 0) : (*(int *)sp != 0);
> @@ -406,7 +412,7 @@ int lj_cconv_tv_ct(CTState *cts, CType *s, CTypeID sid,
>       CTSize sz;
>     copyval:  /* Copy value. */
>       sz = s->size;
> -    lua_assert(sz != CTSIZE_INVALID);
> +    lj_assertCTS(sz != CTSIZE_INVALID, "value copy with invalid size");
>       /* Attributes are stripped, qualifiers are kept (but mostly ignored). */
>       cd = lj_cdata_new(cts, ctype_typeid(cts, s), sz);
>       setcdataV(cts->L, o, cd);
> @@ -421,19 +427,22 @@ int lj_cconv_tv_bf(CTState *cts, CType *s, TValue *o, uint8_t *sp)
>     CTInfo info = s->info;
>     CTSize pos, bsz;
>     uint32_t val;
> -  lua_assert(ctype_isbitfield(info));
> +  lj_assertCTS(ctype_isbitfield(info), "bitfield expected");
>     /* NYI: packed bitfields may cause misaligned reads. */
>     switch (ctype_bitcsz(info)) {
>     case 4: val = *(uint32_t *)sp; break;
>     case 2: val = *(uint16_t *)sp; break;
>     case 1: val = *(uint8_t *)sp; break;
> -  default: lua_assert(0); val = 0; break;
> +  default:
> +    lj_assertCTS(0, "bad bitfield container size %d", ctype_bitcsz(info));
> +    val = 0;
> +    break;
>     }
>     /* Check if a packed bitfield crosses a container boundary. */
>     pos = ctype_bitpos(info);
>     bsz = ctype_bitbsz(info);
> -  lua_assert(pos < 8*ctype_bitcsz(info));
> -  lua_assert(bsz > 0 && bsz <= 8*ctype_bitcsz(info));
> +  lj_assertCTS(pos < 8*ctype_bitcsz(info), "bad bitfield position");
> +  lj_assertCTS(bsz > 0 && bsz <= 8*ctype_bitcsz(info), "bad bitfield size");
>     if (pos + bsz > 8*ctype_bitcsz(info))
>       lj_err_caller(cts->L, LJ_ERR_FFI_NYIPACKBIT);
>     if (!(info & CTF_BOOL)) {
> @@ -449,7 +458,7 @@ int lj_cconv_tv_bf(CTState *cts, CType *s, TValue *o, uint8_t *sp)
>       }
>     } else {
>       uint32_t b = (val >> pos) & 1;
> -    lua_assert(bsz == 1);
> +    lj_assertCTS(bsz == 1, "bad bool bitfield size");
>       setboolV(o, b);
>       setboolV(&cts->g->tmptv2, b);  /* Remember for trace recorder. */
>     }
> @@ -553,7 +562,7 @@ void lj_cconv_ct_tv(CTState *cts, CType *d,
>       sid = cdataV(o)->ctypeid;
>       s = ctype_get(cts, sid);
>       if (ctype_isref(s->info)) {  /* Resolve reference for value. */
> -      lua_assert(s->size == CTSIZE_PTR);
> +      lj_assertCTS(s->size == CTSIZE_PTR, "ref is not pointer-sized");
>         sp = *(void **)sp;
>         sid = ctype_cid(s->info);
>       }
> @@ -571,7 +580,7 @@ void lj_cconv_ct_tv(CTState *cts, CType *d,
>         CType *cct = lj_ctype_getfield(cts, d, str, &ofs);
>         if (!cct || !ctype_isconstval(cct->info))
>   	goto err_conv;
> -      lua_assert(d->size == 4);
> +      lj_assertCTS(d->size == 4, "only 32 bit enum supported");  /* NYI */
>         sp = (uint8_t *)&cct->size;
>         sid = ctype_cid(cct->info);
>       } else if (ctype_isrefarray(d->info)) {  /* Copy string to array. */
> @@ -635,10 +644,10 @@ void lj_cconv_bf_tv(CTState *cts, CType *d, uint8_t *dp, TValue *o)
>     CTInfo info = d->info;
>     CTSize pos, bsz;
>     uint32_t val, mask;
> -  lua_assert(ctype_isbitfield(info));
> +  lj_assertCTS(ctype_isbitfield(info), "bitfield expected");
>     if ((info & CTF_BOOL)) {
>       uint8_t tmpbool;
> -    lua_assert(ctype_bitbsz(info) == 1);
> +    lj_assertCTS(ctype_bitbsz(info) == 1, "bad bool bitfield size");
>       lj_cconv_ct_tv(cts, ctype_get(cts, CTID_BOOL), &tmpbool, o, 0);
>       val = tmpbool;
>     } else {
> @@ -647,8 +656,8 @@ void lj_cconv_bf_tv(CTState *cts, CType *d, uint8_t *dp, TValue *o)
>     }
>     pos = ctype_bitpos(info);
>     bsz = ctype_bitbsz(info);
> -  lua_assert(pos < 8*ctype_bitcsz(info));
> -  lua_assert(bsz > 0 && bsz <= 8*ctype_bitcsz(info));
> +  lj_assertCTS(pos < 8*ctype_bitcsz(info), "bad bitfield position");
> +  lj_assertCTS(bsz > 0 && bsz <= 8*ctype_bitcsz(info), "bad bitfield size");
>     /* Check if a packed bitfield crosses a container boundary. */
>     if (pos + bsz > 8*ctype_bitcsz(info))
>       lj_err_caller(cts->L, LJ_ERR_FFI_NYIPACKBIT);
> @@ -659,7 +668,9 @@ void lj_cconv_bf_tv(CTState *cts, CType *d, uint8_t *dp, TValue *o)
>     case 4: *(uint32_t *)dp = (*(uint32_t *)dp & ~mask) | (uint32_t)val; break;
>     case 2: *(uint16_t *)dp = (*(uint16_t *)dp & ~mask) | (uint16_t)val; break;
>     case 1: *(uint8_t *)dp = (*(uint8_t *)dp & ~mask) | (uint8_t)val; break;
> -  default: lua_assert(0); break;
> +  default:
> +    lj_assertCTS(0, "bad bitfield container size %d", ctype_bitcsz(info));
> +    break;
>     }
>   }
>   
> diff --git a/src/lj_cconv.h b/src/lj_cconv.h
> index 0a0b66c9..54a61fd4 100644
> --- a/src/lj_cconv.h
> +++ b/src/lj_cconv.h
> @@ -27,13 +27,14 @@ enum {
>   static LJ_AINLINE uint32_t cconv_idx(CTInfo info)
>   {
>     uint32_t idx = ((info >> 26) & 15u);  /* Dispatch bits. */
> -  lua_assert(ctype_type(info) <= CT_MAYCONVERT);
> +  lj_assertX(ctype_type(info) <= CT_MAYCONVERT,
> +	     "cannot convert ctype %08x", info);
>   #if LJ_64
>     idx = ((uint32_t)(U64x(f436fff5,fff7f021) >> 4*idx) & 15u);
>   #else
>     idx = (((idx < 8 ? 0xfff7f021u : 0xf436fff5) >> 4*(idx & 7u)) & 15u);
>   #endif
> -  lua_assert(idx < 8);
> +  lj_assertX(idx < 8, "cannot convert ctype %08x", info);
>     return idx;
>   }
>   
> diff --git a/src/lj_cdata.c b/src/lj_cdata.c
> index d3042f24..35d0e76a 100644
> --- a/src/lj_cdata.c
> +++ b/src/lj_cdata.c
> @@ -35,7 +35,7 @@ GCcdata *lj_cdata_newv(lua_State *L, CTypeID id, CTSize sz, CTSize align)
>     uintptr_t adata = (uintptr_t)p + sizeof(GCcdataVar) + sizeof(GCcdata);
>     uintptr_t almask = (1u << align) - 1u;
>     GCcdata *cd = (GCcdata *)(((adata + almask) & ~almask) - sizeof(GCcdata));
> -  lua_assert((char *)cd - p < 65536);
> +  lj_assertL((char *)cd - p < 65536, "excessive cdata alignment");
>     cdatav(cd)->offset = (uint16_t)((char *)cd - p);
>     cdatav(cd)->extra = extra;
>     cdatav(cd)->len = sz;
> @@ -77,8 +77,8 @@ void LJ_FASTCALL lj_cdata_free(global_State *g, GCcdata *cd)
>     } else if (LJ_LIKELY(!cdataisv(cd))) {
>       CType *ct = ctype_raw(ctype_ctsG(g), cd->ctypeid);
>       CTSize sz = ctype_hassize(ct->info) ? ct->size : CTSIZE_PTR;
> -    lua_assert(ctype_hassize(ct->info) || ctype_isfunc(ct->info) ||
> -	       ctype_isextern(ct->info));
> +    lj_assertG(ctype_hassize(ct->info) || ctype_isfunc(ct->info) ||
> +	       ctype_isextern(ct->info), "free of ctype without a size");
>       lj_mem_free(g, cd, sizeof(GCcdata) + sz);
>       g->gc.cdatanum--;
>     } else {
> @@ -118,7 +118,7 @@ CType *lj_cdata_index(CTState *cts, GCcdata *cd, cTValue *key, uint8_t **pp,
>   
>     /* Resolve reference for cdata object. */
>     if (ctype_isref(ct->info)) {
> -    lua_assert(ct->size == CTSIZE_PTR);
> +    lj_assertCTS(ct->size == CTSIZE_PTR, "ref is not pointer-sized");
>       p = *(uint8_t **)p;
>       ct = ctype_child(cts, ct);
>     }
> @@ -129,7 +129,8 @@ collect_attrib:
>       if (ctype_attrib(ct->info) == CTA_QUAL) *qual |= ct->size;
>       ct = ctype_child(cts, ct);
>     }
> -  lua_assert(!ctype_isref(ct->info));  /* Interning rejects refs to refs. */
> +  /* Interning rejects refs to refs. */
> +  lj_assertCTS(!ctype_isref(ct->info), "bad ref of ref");
>   
>     if (tvisint(key)) {
>       idx = (ptrdiff_t)intV(key);
> @@ -215,7 +216,8 @@ collect_attrib:
>   static void cdata_getconst(CTState *cts, TValue *o, CType *ct)
>   {
>     CType *ctt = ctype_child(cts, ct);
> -  lua_assert(ctype_isinteger(ctt->info) && ctt->size <= 4);
> +  lj_assertCTS(ctype_isinteger(ctt->info) && ctt->size <= 4,
> +	       "only 32 bit const supported");  /* NYI */
>     /* Constants are already zero-extended/sign-extended to 32 bits. */
>     if ((ctt->info & CTF_UNSIGNED) && (int32_t)ct->size < 0)
>       setnumV(o, (lua_Number)(uint32_t)ct->size);
> @@ -236,13 +238,14 @@ int lj_cdata_get(CTState *cts, CType *s, TValue *o, uint8_t *sp)
>     }
>   
>     /* Get child type of pointer/array/field. */
> -  lua_assert(ctype_ispointer(s->info) || ctype_isfield(s->info));
> +  lj_assertCTS(ctype_ispointer(s->info) || ctype_isfield(s->info),
> +	       "pointer or field expected");
>     sid = ctype_cid(s->info);
>     s = ctype_get(cts, sid);
>   
>     /* Resolve reference for field. */
>     if (ctype_isref(s->info)) {
> -    lua_assert(s->size == CTSIZE_PTR);
> +    lj_assertCTS(s->size == CTSIZE_PTR, "ref is not pointer-sized");
>       sp = *(uint8_t **)sp;
>       sid = ctype_cid(s->info);
>       s = ctype_get(cts, sid);
> @@ -269,12 +272,13 @@ void lj_cdata_set(CTState *cts, CType *d, uint8_t *dp, TValue *o, CTInfo qual)
>     }
>   
>     /* Get child type of pointer/array/field. */
> -  lua_assert(ctype_ispointer(d->info) || ctype_isfield(d->info));
> +  lj_assertCTS(ctype_ispointer(d->info) || ctype_isfield(d->info),
> +	       "pointer or field expected");
>     d = ctype_child(cts, d);
>   
>     /* Resolve reference for field. */
>     if (ctype_isref(d->info)) {
> -    lua_assert(d->size == CTSIZE_PTR);
> +    lj_assertCTS(d->size == CTSIZE_PTR, "ref is not pointer-sized");
>       dp = *(uint8_t **)dp;
>       d = ctype_child(cts, d);
>     }
> @@ -289,7 +293,8 @@ void lj_cdata_set(CTState *cts, CType *d, uint8_t *dp, TValue *o, CTInfo qual)
>       d = ctype_child(cts, d);
>     }
>   
> -  lua_assert(ctype_hassize(d->info) && !ctype_isvoid(d->info));
> +  lj_assertCTS(ctype_hassize(d->info), "store to ctype without size");
> +  lj_assertCTS(!ctype_isvoid(d->info), "store to void type");
>   
>     if (((d->info|qual) & CTF_CONST)) {
>     err_const:
> diff --git a/src/lj_cdata.h b/src/lj_cdata.h
> index 66b023bd..193e4241 100644
> --- a/src/lj_cdata.h
> +++ b/src/lj_cdata.h
> @@ -18,7 +18,7 @@ static LJ_AINLINE void *cdata_getptr(void *p, CTSize sz)
>     if (LJ_64 && sz == 4) {  /* Support 32 bit pointers on 64 bit targets. */
>       return ((void *)(uintptr_t)*(uint32_t *)p);
>     } else {
> -    lua_assert(sz == CTSIZE_PTR);
> +    lj_assertX(sz == CTSIZE_PTR, "bad pointer size %d", sz);
>       return *(void **)p;
>     }
>   }
> @@ -29,7 +29,7 @@ static LJ_AINLINE void cdata_setptr(void *p, CTSize sz, const void *v)
>     if (LJ_64 && sz == 4) {  /* Support 32 bit pointers on 64 bit targets. */
>       *(uint32_t *)p = (uint32_t)(uintptr_t)v;
>     } else {
> -    lua_assert(sz == CTSIZE_PTR);
> +    lj_assertX(sz == CTSIZE_PTR, "bad pointer size %d", sz);
>       *(void **)p = (void *)v;
>     }
>   }
> @@ -40,7 +40,8 @@ static LJ_AINLINE GCcdata *lj_cdata_new(CTState *cts, CTypeID id, CTSize sz)
>     GCcdata *cd;
>   #ifdef LUA_USE_ASSERT
>     CType *ct = ctype_raw(cts, id);
> -  lua_assert((ctype_hassize(ct->info) ? ct->size : CTSIZE_PTR) == sz);
> +  lj_assertCTS((ctype_hassize(ct->info) ? ct->size : CTSIZE_PTR) == sz,
> +	       "inconsistent size of fixed-size cdata alloc");
>   #endif
>     cd = (GCcdata *)lj_mem_newgco(cts->L, sizeof(GCcdata) + sz);
>     cd->gct = ~LJ_TCDATA;
> diff --git a/src/lj_clib.c b/src/lj_clib.c
> index a8672052..2f11b2e9 100644
> --- a/src/lj_clib.c
> +++ b/src/lj_clib.c
> @@ -349,7 +349,8 @@ TValue *lj_clib_index(lua_State *L, CLibrary *cl, GCstr *name)
>         lj_err_callerv(L, LJ_ERR_FFI_NODECL, strdata(name));
>       if (ctype_isconstval(ct->info)) {
>         CType *ctt = ctype_child(cts, ct);
> -      lua_assert(ctype_isinteger(ctt->info) && ctt->size <= 4);
> +      lj_assertCTS(ctype_isinteger(ctt->info) && ctt->size <= 4,
> +		   "only 32 bit const supported");  /* NYI */
>         if ((ctt->info & CTF_UNSIGNED) && (int32_t)ct->size < 0)
>   	setnumV(tv, (lua_Number)(uint32_t)ct->size);
>         else
> @@ -361,7 +362,8 @@ TValue *lj_clib_index(lua_State *L, CLibrary *cl, GCstr *name)
>   #endif
>         void *p = clib_getsym(cl, sym);
>         GCcdata *cd;
> -      lua_assert(ctype_isfunc(ct->info) || ctype_isextern(ct->info));
> +      lj_assertCTS(ctype_isfunc(ct->info) || ctype_isextern(ct->info),
> +		   "unexpected ctype %08x in clib", ct->info);
>   #if LJ_TARGET_X86 && LJ_ABI_WIN
>         /* Retry with decorated name for fastcall/stdcall functions. */
>         if (!p && ctype_isfunc(ct->info)) {
> diff --git a/src/lj_cparse.c b/src/lj_cparse.c
> index cd032b8e..6d9490ca 100644
> --- a/src/lj_cparse.c
> +++ b/src/lj_cparse.c
> @@ -28,6 +28,12 @@
>   ** If in doubt, please check the input against your favorite C compiler.
>   */
>   
> +#ifdef LUA_USE_ASSERT
> +#define lj_assertCP(c, ...)	(lj_assertG_(G(cp->L), (c), __VA_ARGS__))
> +#else
> +#define lj_assertCP(c, ...)	((void)cp)
> +#endif
> +
>   /* -- Miscellaneous ------------------------------------------------------- */
>   
>   /* Match string against a C literal. */
> @@ -61,7 +67,7 @@ LJ_NORET static void cp_err(CPState *cp, ErrMsg em);
>   
>   static const char *cp_tok2str(CPState *cp, CPToken tok)
>   {
> -  lua_assert(tok < CTOK_FIRSTDECL);
> +  lj_assertCP(tok < CTOK_FIRSTDECL, "bad CPToken %d", tok);
>     if (tok > CTOK_OFS)
>       return ctoknames[tok-CTOK_OFS-1];
>     else if (!lj_char_iscntrl(tok))
> @@ -392,7 +398,7 @@ static void cp_init(CPState *cp)
>     cp->curpack = 0;
>     cp->packstack[0] = 255;
>     lj_buf_init(cp->L, &cp->sb);
> -  lua_assert(cp->p != NULL);
> +  lj_assertCP(cp->p != NULL, "uninitialized cp->p");
>     cp_get(cp);  /* Read-ahead first char. */
>     cp->tok = 0;
>     cp->tmask = CPNS_DEFAULT;
> @@ -853,12 +859,13 @@ static CTypeID cp_decl_intern(CPState *cp, CPDecl *decl)
>       /* The cid is already part of info for copies of pointers/functions. */
>       idx = ct->next;
>       if (ctype_istypedef(info)) {
> -      lua_assert(id == 0);
> +      lj_assertCP(id == 0, "typedef not at toplevel");
>         id = ctype_cid(info);
>         /* Always refetch info/size, since struct/enum may have been completed. */
>         cinfo = ctype_get(cp->cts, id)->info;
>         csize = ctype_get(cp->cts, id)->size;
> -      lua_assert(ctype_isstruct(cinfo) || ctype_isenum(cinfo));
> +      lj_assertCP(ctype_isstruct(cinfo) || ctype_isenum(cinfo),
> +		  "typedef of bad type");
>       } else if (ctype_isfunc(info)) {  /* Intern function. */
>         CType *fct;
>         CTypeID fid;
> @@ -891,7 +898,7 @@ static CTypeID cp_decl_intern(CPState *cp, CPDecl *decl)
>         /* Inherit csize/cinfo from original type. */
>       } else {
>         if (ctype_isnum(info)) {  /* Handle mode/vector-size attributes. */
> -	lua_assert(id == 0);
> +	lj_assertCP(id == 0, "number not at toplevel");
>   	if (!(info & CTF_BOOL)) {
>   	  CTSize msize = ctype_msizeP(decl->attr);
>   	  CTSize vsize = ctype_vsizeP(decl->attr);
> @@ -946,7 +953,7 @@ static CTypeID cp_decl_intern(CPState *cp, CPDecl *decl)
>   	  info = (info & ~CTF_ALIGN) | (cinfo & CTF_ALIGN);
>   	info |= (cinfo & CTF_QUAL);  /* Inherit qual. */
>         } else {
> -	lua_assert(ctype_isvoid(info));
> +	lj_assertCP(ctype_isvoid(info), "bad ctype %08x", info);
>         }
>         csize = size;
>         cinfo = info+id;
> @@ -1596,7 +1603,7 @@ end_decl:
>   	cp_errmsg(cp, cp->tok, LJ_ERR_FFI_DECLSPEC);
>         sz = sizeof(int);
>       }
> -    lua_assert(sz != 0);
> +    lj_assertCP(sz != 0, "basic ctype with zero size");
>       info += CTALIGN(lj_fls(sz));  /* Use natural alignment. */
>       info += (decl->attr & CTF_QUAL);  /* Merge qualifiers. */
>       cp_push(decl, info, sz);
> @@ -1856,7 +1863,7 @@ static void cp_decl_multi(CPState *cp)
>   	  /* Treat both static and extern function declarations as extern. */
>   	  ct = ctype_get(cp->cts, ctypeid);
>   	  /* We always get new anonymous functions (typedefs are copied). */
> -	  lua_assert(gcref(ct->name) == NULL);
> +	  lj_assertCP(gcref(ct->name) == NULL, "unexpected named function");
>   	  id = ctypeid;  /* Just name it. */
>   	} else if ((scl & CDF_STATIC)) {  /* Accept static constants. */
>   	  id = cp_decl_constinit(cp, &ct, ctypeid);
> @@ -1913,7 +1920,7 @@ static TValue *cpcparser(lua_State *L, lua_CFunction dummy, void *ud)
>       cp_decl_single(cp);
>     if (cp->param && cp->param != cp->L->top)
>       cp_err(cp, LJ_ERR_FFI_NUMPARAM);
> -  lua_assert(cp->depth == 0);
> +  lj_assertCP(cp->depth == 0, "unbalanced cparser declaration depth");
>     return NULL;
>   }
>   
> diff --git a/src/lj_crecord.c b/src/lj_crecord.c
> index 804cdbf4..e1d1110f 100644
> --- a/src/lj_crecord.c
> +++ b/src/lj_crecord.c
> @@ -61,7 +61,8 @@ static GCcdata *argv2cdata(jit_State *J, TRef tr, cTValue *o)
>   static CTypeID crec_constructor(jit_State *J, GCcdata *cd, TRef tr)
>   {
>     CTypeID id;
> -  lua_assert(tref_iscdata(tr) && cd->ctypeid == CTID_CTYPEID);
> +  lj_assertJ(tref_iscdata(tr) && cd->ctypeid == CTID_CTYPEID,
> +	     "expected CTypeID cdata");
>     id = *(CTypeID *)cdataptr(cd);
>     tr = emitir(IRT(IR_FLOAD, IRT_INT), tr, IRFL_CDATA_INT);
>     emitir(IRTG(IR_EQ, IRT_INT), tr, lj_ir_kint(J, (int32_t)id));
> @@ -237,13 +238,14 @@ static void crec_copy(jit_State *J, TRef trdst, TRef trsrc, TRef trlen,
>       if (len > CREC_COPY_MAXLEN) goto fallback;
>       if (ct) {
>         CTState *cts = ctype_ctsG(J2G(J));
> -      lua_assert(ctype_isarray(ct->info) || ctype_isstruct(ct->info));
> +      lj_assertJ(ctype_isarray(ct->info) || ctype_isstruct(ct->info),
> +		 "copy of non-aggregate");
>         if (ctype_isarray(ct->info)) {
>   	CType *cct = ctype_rawchild(cts, ct);
>   	tp = crec_ct2irt(cts, cct);
>   	if (tp == IRT_CDATA) goto rawcopy;
>   	step = lj_ir_type_size[tp];
> -	lua_assert((len & (step-1)) == 0);
> +	lj_assertJ((len & (step-1)) == 0, "copy of fractional size");
>         } else if ((ct->info & CTF_UNION)) {
>   	step = (1u << ctype_align(ct->info));
>   	goto rawcopy;
> @@ -629,7 +631,8 @@ static TRef crec_ct_tv(jit_State *J, CType *d, TRef dp, TRef sp, cTValue *sval)
>         /* Specialize to the name of the enum constant. */
>         emitir(IRTG(IR_EQ, IRT_STR), sp, lj_ir_kstr(J, str));
>         if (cct && ctype_isconstval(cct->info)) {
> -	lua_assert(ctype_child(cts, cct)->size == 4);
> +	lj_assertJ(ctype_child(cts, cct)->size == 4,
> +		   "only 32 bit const supported");  /* NYI */
>   	svisnz = (void *)(intptr_t)(ofs != 0);
>   	sp = lj_ir_kint(J, (int32_t)ofs);
>   	sid = ctype_cid(cct->info);
> @@ -756,7 +759,7 @@ static void crec_index_bf(jit_State *J, RecordFFData *rd, TRef ptr, CTInfo info)
>     IRType t = IRT_I8 + 2*lj_fls(ctype_bitcsz(info)) + ((info&CTF_UNSIGNED)?1:0);
>     TRef tr = emitir(IRT(IR_XLOAD, t), ptr, 0);
>     CTSize pos = ctype_bitpos(info), bsz = ctype_bitbsz(info), shift = 32 - bsz;
> -  lua_assert(t <= IRT_U32);  /* NYI: 64 bit bitfields. */
> +  lj_assertJ(t <= IRT_U32, "only 32 bit bitfields supported");  /* NYI */
>     if (rd->data == 0) {  /* __index metamethod. */
>       if ((info & CTF_BOOL)) {
>         tr = emitir(IRTI(IR_BAND), tr, lj_ir_kint(J, (int32_t)((1u << pos))));
> @@ -768,7 +771,7 @@ static void crec_index_bf(jit_State *J, RecordFFData *rd, TRef ptr, CTInfo info)
>         tr = emitir(IRTI(IR_BSHL), tr, lj_ir_kint(J, shift - pos));
>         tr = emitir(IRTI(IR_BSAR), tr, lj_ir_kint(J, shift));
>       } else {
> -      lua_assert(bsz < 32);  /* Full-size fields cannot end up here. */
> +      lj_assertJ(bsz < 32, "unexpected full bitfield index");
>         tr = emitir(IRTI(IR_BSHR), tr, lj_ir_kint(J, pos));
>         tr = emitir(IRTI(IR_BAND), tr, lj_ir_kint(J, (int32_t)((1u << bsz)-1)));
>         /* We can omit the U32 to NUM conversion, since bsz < 32. */
> @@ -883,7 +886,7 @@ again:
>   	  crec_index_bf(J, rd, ptr, fct->info);
>   	  return;
>   	} else {
> -	  lua_assert(ctype_isfield(fct->info));
> +	  lj_assertJ(ctype_isfield(fct->info), "field expected");
>   	  sid = ctype_cid(fct->info);
>   	}
>         }
> @@ -1133,7 +1136,7 @@ static TRef crec_call_args(jit_State *J, RecordFFData *rd,
>       if (fid) {  /* Get argument type from field. */
>         CType *ctf = ctype_get(cts, fid);
>         fid = ctf->sib;
> -      lua_assert(ctype_isfield(ctf->info));
> +      lj_assertJ(ctype_isfield(ctf->info), "field expected");
>         did = ctype_cid(ctf->info);
>       } else {
>         if (!(ct->info & CTF_VARARG))
> diff --git a/src/lj_ctype.c b/src/lj_ctype.c
> index 0ea89c74..a42e3d60 100644
> --- a/src/lj_ctype.c
> +++ b/src/lj_ctype.c
> @@ -153,7 +153,7 @@ CTypeID lj_ctype_new(CTState *cts, CType **ctp)
>   {
>     CTypeID id = cts->top;
>     CType *ct;
> -  lua_assert(cts->L);
> +  lj_assertCTS(cts->L, "uninitialized cts->L");
>     if (LJ_UNLIKELY(id >= cts->sizetab)) {
>       if (id >= CTID_MAX) lj_err_msg(cts->L, LJ_ERR_TABOV);
>   #ifdef LUAJIT_CTYPE_CHECK_ANCHOR
> @@ -182,7 +182,7 @@ CTypeID lj_ctype_intern(CTState *cts, CTInfo info, CTSize size)
>   {
>     uint32_t h = ct_hashtype(info, size);
>     CTypeID id = cts->hash[h];
> -  lua_assert(cts->L);
> +  lj_assertCTS(cts->L, "uninitialized cts->L");
>     while (id) {
>       CType *ct = ctype_get(cts, id);
>       if (ct->info == info && ct->size == size)
> @@ -298,9 +298,9 @@ CTSize lj_ctype_vlsize(CTState *cts, CType *ct, CTSize nelem)
>       }
>       ct = ctype_raw(cts, arrid);
>     }
> -  lua_assert(ctype_isvlarray(ct->info));  /* Must be a VLA. */
> +  lj_assertCTS(ctype_isvlarray(ct->info), "VLA expected");
>     ct = ctype_rawchild(cts, ct);  /* Get array element. */
> -  lua_assert(ctype_hassize(ct->info));
> +  lj_assertCTS(ctype_hassize(ct->info), "bad VLA without size");
>     /* Calculate actual size of VLA and check for overflow. */
>     xsz += (uint64_t)ct->size * nelem;
>     return xsz < 0x80000000u ? (CTSize)xsz : CTSIZE_INVALID;
> @@ -323,7 +323,8 @@ CTInfo lj_ctype_info(CTState *cts, CTypeID id, CTSize *szp)
>       } else {
>         if (!(qual & CTFP_ALIGNED)) qual |= (info & CTF_ALIGN);
>         qual |= (info & ~(CTF_ALIGN|CTMASK_CID));
> -      lua_assert(ctype_hassize(info) || ctype_isfunc(info));
> +      lj_assertCTS(ctype_hassize(info) || ctype_isfunc(info),
> +		   "ctype without size");
>         *szp = ctype_isfunc(info) ? CTSIZE_INVALID : ct->size;
>         break;
>       }
> @@ -528,7 +529,7 @@ static void ctype_repr(CTRepr *ctr, CTypeID id)
>         ctype_appc(ctr, ')');
>         break;
>       default:
> -      lua_assert(0);
> +      lj_assertG_(ctr->cts->g, 0, "bad ctype %08x", info);
>         break;
>       }
>       ct = ctype_get(ctr->cts, ctype_cid(info));
> diff --git a/src/lj_ctype.h b/src/lj_ctype.h
> index 0c220a88..c4f3bdde 100644
> --- a/src/lj_ctype.h
> +++ b/src/lj_ctype.h
> @@ -260,6 +260,12 @@ typedef struct CTState {
>   
>   #define CT_MEMALIGN	3	/* Alignment guaranteed by memory allocator. */
>   
> +#ifdef LUA_USE_ASSERT
> +#define lj_assertCTS(c, ...)	(lj_assertG_(cts->g, (c), __VA_ARGS__))
> +#else
> +#define lj_assertCTS(c, ...)	((void)cts)
> +#endif
> +
>   /* -- Predefined types ---------------------------------------------------- */
>   
>   /* Target-dependent types. */
> @@ -392,7 +398,8 @@ static LJ_AINLINE CTState *ctype_cts(lua_State *L)
>   /* Check C type ID for validity when assertions are enabled. */
>   static LJ_AINLINE CTypeID ctype_check(CTState *cts, CTypeID id)
>   {
> -  lua_assert(id > 0 && id < cts->top); UNUSED(cts);
> +  UNUSED(cts);
> +  lj_assertCTS(id > 0 && id < cts->top, "bad CTID %d", id);
>     return id;
>   }
>   
> @@ -408,8 +415,9 @@ static LJ_AINLINE CType *ctype_get(CTState *cts, CTypeID id)
>   /* Get child C type. */
>   static LJ_AINLINE CType *ctype_child(CTState *cts, CType *ct)
>   {
> -  lua_assert(!(ctype_isvoid(ct->info) || ctype_isstruct(ct->info) ||
> -	     ctype_isbitfield(ct->info)));  /* These don't have children. */
> +  lj_assertCTS(!(ctype_isvoid(ct->info) || ctype_isstruct(ct->info) ||
> +	       ctype_isbitfield(ct->info)),
> +	       "ctype %08x has no children", ct->info);
>     return ctype_get(cts, ctype_cid(ct->info));
>   }
>   
> diff --git a/src/lj_debug.c b/src/lj_debug.c
> index c4edcabb..46c442c6 100644
> --- a/src/lj_debug.c
> +++ b/src/lj_debug.c
> @@ -55,7 +55,8 @@ static BCPos debug_framepc(lua_State *L, GCfunc *fn, cTValue *nextframe)
>     const BCIns *ins;
>     GCproto *pt;
>     BCPos pos;
> -  lua_assert(fn->c.gct == ~LJ_TFUNC || fn->c.gct == ~LJ_TTHREAD);
> +  lj_assertL(fn->c.gct == ~LJ_TFUNC || fn->c.gct == ~LJ_TTHREAD,
> +	     "function or frame expected");
>     if (!isluafunc(fn)) {  /* Cannot derive a PC for non-Lua functions. */
>       return NO_BCPOS;
>     } else if (nextframe == NULL) {  /* Lua function on top. */
> @@ -101,7 +102,7 @@ static BCPos debug_framepc(lua_State *L, GCfunc *fn, cTValue *nextframe)
>   #if LJ_HASJIT
>     if (pos > pt->sizebc) {  /* Undo the effects of lj_trace_exit for JLOOP. */
>       GCtrace *T = (GCtrace *)((char *)(ins-1) - offsetof(GCtrace, startins));
> -    lua_assert(bc_isret(bc_op(ins[-1])));
> +    lj_assertL(bc_isret(bc_op(ins[-1])), "return bytecode expected");
>       pos = proto_bcpos(pt, mref(T->startpc, const BCIns));
>     }
>   #endif
> @@ -134,7 +135,7 @@ BCLine lj_debug_frameline(lua_State *L, GCfunc *fn, cTValue *nextframe)
>     BCPos pc = debug_framepc(L, fn, nextframe);
>     if (pc != NO_BCPOS) {
>       GCproto *pt = funcproto(fn);
> -    lua_assert(pc <= pt->sizebc);
> +    lj_assertL(pc <= pt->sizebc, "PC out of range");
>       return lj_debug_line(pt, pc);
>     }
>     return -1;
> @@ -215,7 +216,7 @@ static TValue *debug_localname(lua_State *L, const lua_Debug *ar,
>   const char *lj_debug_uvname(GCproto *pt, uint32_t idx)
>   {
>     const uint8_t *p = proto_uvinfo(pt);
> -  lua_assert(idx < pt->sizeuv);
> +  lj_assertX(idx < pt->sizeuv, "bad upvalue index");
>     if (!p) return "";
>     if (idx) while (*p++ || --idx) ;
>     return (const char *)p;
> @@ -440,13 +441,14 @@ int lj_debug_getinfo(lua_State *L, const char *what, lj_Debug *ar, int ext)
>     } else {
>       uint32_t offset = (uint32_t)ar->i_ci & 0xffff;
>       uint32_t size = (uint32_t)ar->i_ci >> 16;
> -    lua_assert(offset != 0);
> +    lj_assertL(offset != 0, "bad frame offset");
>       frame = tvref(L->stack) + offset;
>       if (size) nextframe = frame + size;
> -    lua_assert(frame <= tvref(L->maxstack) &&
> -	       (!nextframe || nextframe <= tvref(L->maxstack)));
> +    lj_assertL(frame <= tvref(L->maxstack) &&
> +	       (!nextframe || nextframe <= tvref(L->maxstack)),
> +	       "broken frame chain");
>       fn = frame_func(frame);
> -    lua_assert(fn->c.gct == ~LJ_TFUNC);
> +    lj_assertL(fn->c.gct == ~LJ_TFUNC, "bad frame function");
>     }
>     for (; *what; what++) {
>       if (*what == 'S') {
> diff --git a/src/lj_def.h b/src/lj_def.h
> index 2d8fff66..ba4dcc9d 100644
> --- a/src/lj_def.h
> +++ b/src/lj_def.h
> @@ -338,14 +338,28 @@ static LJ_AINLINE uint32_t lj_getu32(const void *v)
>   #define LJ_FUNCA_NORET	LJ_FUNCA LJ_NORET
>   #define LJ_ASMF_NORET	LJ_ASMF LJ_NORET
>   
> -/* Runtime assertions. */
> -#ifdef lua_assert
> -#define check_exp(c, e)		(lua_assert(c), (e))
> -#define api_check(l, e)		lua_assert(e)
> +/* Internal assertions. */
> +#if defined(LUA_USE_ASSERT) || defined(LUA_USE_APICHECK)
> +#define lj_assert_check(g, c, ...) \
> +  ((c) ? (void)0 : \
> +   (lj_assert_fail((g), __FILE__, __LINE__, __func__, __VA_ARGS__), 0))
> +#define lj_checkapi(c, ...)	lj_assert_check(G(L), (c), __VA_ARGS__)
>   #else
> -#define lua_assert(c)		((void)0)
> +#define lj_checkapi(c, ...)	((void)L)
> +#endif
> +
> +#ifdef LUA_USE_ASSERT
> +#define lj_assertG_(g, c, ...)	lj_assert_check((g), (c), __VA_ARGS__)
> +#define lj_assertG(c, ...)	lj_assert_check(g, (c), __VA_ARGS__)
> +#define lj_assertL(c, ...)	lj_assert_check(G(L), (c), __VA_ARGS__)
> +#define lj_assertX(c, ...)	lj_assert_check(NULL, (c), __VA_ARGS__)
> +#define check_exp(c, e)		(lj_assertX((c), #c), (e))
> +#else
> +#define lj_assertG_(g, c, ...)	((void)0)
> +#define lj_assertG(c, ...)	((void)g)
> +#define lj_assertL(c, ...)	((void)L)
> +#define lj_assertX(c, ...)	((void)0)
>   #define check_exp(c, e)		(e)
> -#define api_check		luai_apicheck
>   #endif
>   
>   /* Static assertions. */
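
For reference, a minimal usage sketch of the new assertion family defined
above. The demo_*() callers below are hypothetical (not part of the patch)
and assume a build configured with -DLUA_USE_ASSERT; each variant only
differs in where it finds the global_State handle that is forwarded to
lj_assert_fail() together with the printf-style message:

  /* Hypothetical examples, not from the patch. */
  static void demo_slot(lua_State *L, int slot)
  {
    lj_assertL(slot >= 0, "negative slot %d", slot);    /* context: G(L) */
  }

  static void demo_free(global_State *g, size_t sz)
  {
    lj_assertG(sz != 0, "zero-sized free");             /* context: g */
  }

  static uint32_t demo_mask(uint32_t bits)
  {
    lj_assertX(bits <= 32, "bad bit count %u", bits);   /* no context */
    return bits ? (uint32_t)((1ull << bits) - 1) : 0;
  }

With assertions disabled, lj_assertG()/lj_assertL() expand to ((void)g) and
((void)L) respectively, which keeps the context variable referenced and
avoids unused-variable warnings in release builds.
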
> diff --git a/src/lj_dispatch.c b/src/lj_dispatch.c
> index ee735450..ddee68de 100644
> --- a/src/lj_dispatch.c
> +++ b/src/lj_dispatch.c
> @@ -380,7 +380,7 @@ static void callhook(lua_State *L, int event, BCLine line)
>       hook_enter(g);
>   #endif
>       hookf(L, &ar);
> -    lua_assert(hook_active(g));
> +    lj_assertG(hook_active(g), "active hook flag removed");
>       setgcref(g->cur_L, obj2gco(L));
>   #if LJ_HASPROFILE && !LJ_PROFILE_SIGPROF
>       lj_profile_hook_leave(g);
> @@ -428,7 +428,8 @@ void LJ_FASTCALL lj_dispatch_ins(lua_State *L, const BCIns *pc)
>   #endif
>         J->L = L;
>         lj_trace_ins(J, pc-1);  /* The interpreter bytecode PC is offset by 1. */
> -      lua_assert(L->top - L->base == delta);
> +      lj_assertG(L->top - L->base == delta,
> +		 "unbalanced stack after tracing of instruction");
>       }
>     }
>   #endif
> @@ -488,7 +489,8 @@ ASMFunction LJ_FASTCALL lj_dispatch_call(lua_State *L, const BCIns *pc)
>   #endif
>       pc = (const BCIns *)((uintptr_t)pc & ~(uintptr_t)1);
>       lj_trace_hot(J, pc);
> -    lua_assert(L->top - L->base == delta);
> +    lj_assertG(L->top - L->base == delta,
> +	       "unbalanced stack after hot call");
>       goto out;
>     } else if (J->state != LJ_TRACE_IDLE &&
>   	     !(g->hookmask & (HOOK_GC|HOOK_VMEVENT))) {
> @@ -497,7 +499,8 @@ ASMFunction LJ_FASTCALL lj_dispatch_call(lua_State *L, const BCIns *pc)
>   #endif
>       /* Record the FUNC* bytecodes, too. */
>       lj_trace_ins(J, pc-1);  /* The interpreter bytecode PC is offset by 1. */
> -    lua_assert(L->top - L->base == delta);
> +    lj_assertG(L->top - L->base == delta,
> +	       "unbalanced stack after hot instruction");
>     }
>   #endif
>     if ((g->hookmask & LUA_MASKCALL)) {
> diff --git a/src/lj_emit_arm.h b/src/lj_emit_arm.h
> index dee8bdcc..ee299821 100644
> --- a/src/lj_emit_arm.h
> +++ b/src/lj_emit_arm.h
> @@ -81,7 +81,8 @@ static void emit_m(ASMState *as, ARMIns ai, Reg rm)
>   
>   static void emit_lsox(ASMState *as, ARMIns ai, Reg rd, Reg rn, int32_t ofs)
>   {
> -  lua_assert(ofs >= -255 && ofs <= 255);
> +  lj_assertA(ofs >= -255 && ofs <= 255,
> +	     "load/store offset %d out of range", ofs);
>     if (ofs < 0) ofs = -ofs; else ai |= ARMI_LS_U;
>     *--as->mcp = ai | ARMI_LS_P | ARMI_LSX_I | ARMF_D(rd) | ARMF_N(rn) |
>   	       ((ofs & 0xf0) << 4) | (ofs & 0x0f);
> @@ -89,7 +90,8 @@ static void emit_lsox(ASMState *as, ARMIns ai, Reg rd, Reg rn, int32_t ofs)
>   
>   static void emit_lso(ASMState *as, ARMIns ai, Reg rd, Reg rn, int32_t ofs)
>   {
> -  lua_assert(ofs >= -4095 && ofs <= 4095);
> +  lj_assertA(ofs >= -4095 && ofs <= 4095,
> +	     "load/store offset %d out of range", ofs);
>     /* Combine LDR/STR pairs to LDRD/STRD. */
>     if (*as->mcp == (ai|ARMI_LS_P|ARMI_LS_U|ARMF_D(rd^1)|ARMF_N(rn)|(ofs^4)) &&
>         (ai & ~(ARMI_LDR^ARMI_STR)) == ARMI_STR && rd != rn &&
> @@ -106,7 +108,8 @@ static void emit_lso(ASMState *as, ARMIns ai, Reg rd, Reg rn, int32_t ofs)
>   #if !LJ_SOFTFP
>   static void emit_vlso(ASMState *as, ARMIns ai, Reg rd, Reg rn, int32_t ofs)
>   {
> -  lua_assert(ofs >= -1020 && ofs <= 1020 && (ofs&3) == 0);
> +  lj_assertA(ofs >= -1020 && ofs <= 1020 && (ofs&3) == 0,
> +	     "load/store offset %d out of range", ofs);
>     if (ofs < 0) ofs = -ofs; else ai |= ARMI_LS_U;
>     *--as->mcp = ai | ARMI_LS_P | ARMF_D(rd & 15) | ARMF_N(rn) | (ofs >> 2);
>   }
> @@ -124,7 +127,7 @@ static int emit_kdelta1(ASMState *as, Reg d, int32_t i)
>     while (work) {
>       Reg r = rset_picktop(work);
>       IRRef ref = regcost_ref(as->cost[r]);
> -    lua_assert(r != d);
> +    lj_assertA(r != d, "dest reg not free");
>       if (emit_canremat(ref)) {
>         int32_t delta = i - (ra_iskref(ref) ? ra_krefk(as, ref) : IR(ref)->i);
>         uint32_t k = emit_isk12(ARMI_ADD, delta);
> @@ -142,13 +145,13 @@ static int emit_kdelta1(ASMState *as, Reg d, int32_t i)
>   }
>   
>   /* Try to find a two step delta relative to another constant. */
> -static int emit_kdelta2(ASMState *as, Reg d, int32_t i)
> +static int emit_kdelta2(ASMState *as, Reg rd, int32_t i)
>   {
>     RegSet work = ~as->freeset & RSET_GPR;
>     while (work) {
>       Reg r = rset_picktop(work);
>       IRRef ref = regcost_ref(as->cost[r]);
> -    lua_assert(r != d);
> +    lj_assertA(r != rd, "dest reg %d not free", rd);
>       if (emit_canremat(ref)) {
>         int32_t other = ra_iskref(ref) ? ra_krefk(as, ref) : IR(ref)->i;
>         if (other) {
> @@ -159,8 +162,8 @@ static int emit_kdelta2(ASMState *as, Reg d, int32_t i)
>   	k2 = emit_isk12(0, delta & (255 << sh));
>   	k = emit_isk12(0, delta & ~(255 << sh));
>   	if (k) {
> -	  emit_dn(as, ARMI_ADD^k2^inv, d, d);
> -	  emit_dn(as, ARMI_ADD^k^inv, d, r);
> +	  emit_dn(as, ARMI_ADD^k2^inv, rd, rd);
> +	  emit_dn(as, ARMI_ADD^k^inv, rd, r);
>   	  return 1;
>   	}
>         }
> @@ -171,23 +174,24 @@ static int emit_kdelta2(ASMState *as, Reg d, int32_t i)
>   }
>   
>   /* Load a 32 bit constant into a GPR. */
> -static void emit_loadi(ASMState *as, Reg r, int32_t i)
> +static void emit_loadi(ASMState *as, Reg rd, int32_t i)
>   {
>     uint32_t k = emit_isk12(ARMI_MOV, i);
> -  lua_assert(rset_test(as->freeset, r) || r == RID_TMP);
> +  lj_assertA(rset_test(as->freeset, rd) || rd == RID_TMP,
> +	     "dest reg %d not free", rd);
>     if (k) {
>       /* Standard K12 constant. */
> -    emit_d(as, ARMI_MOV^k, r);
> +    emit_d(as, ARMI_MOV^k, rd);
>     } else if ((as->flags & JIT_F_ARMV6T2) && (uint32_t)i < 0x00010000u) {
>       /* 16 bit loword constant for ARMv6T2. */
> -    emit_d(as, ARMI_MOVW|(i & 0x0fff)|((i & 0xf000)<<4), r);
> -  } else if (emit_kdelta1(as, r, i)) {
> +    emit_d(as, ARMI_MOVW|(i & 0x0fff)|((i & 0xf000)<<4), rd);
> +  } else if (emit_kdelta1(as, rd, i)) {
>       /* One step delta relative to another constant. */
>     } else if ((as->flags & JIT_F_ARMV6T2)) {
>       /* 32 bit hiword/loword constant for ARMv6T2. */
> -    emit_d(as, ARMI_MOVT|((i>>16) & 0x0fff)|(((i>>16) & 0xf000)<<4), r);
> -    emit_d(as, ARMI_MOVW|(i & 0x0fff)|((i & 0xf000)<<4), r);
> -  } else if (emit_kdelta2(as, r, i)) {
> +    emit_d(as, ARMI_MOVT|((i>>16) & 0x0fff)|(((i>>16) & 0xf000)<<4), rd);
> +    emit_d(as, ARMI_MOVW|(i & 0x0fff)|((i & 0xf000)<<4), rd);
> +  } else if (emit_kdelta2(as, rd, i)) {
>       /* Two step delta relative to another constant. */
>     } else {
>       /* Otherwise construct the constant with up to 4 instructions. */
> @@ -197,15 +201,15 @@ static void emit_loadi(ASMState *as, Reg r, int32_t i)
>         int32_t m = i & (255 << sh);
>         i &= ~(255 << sh);
>         if (i == 0) {
> -	emit_d(as, ARMI_MOV ^ emit_isk12(0, m), r);
> +	emit_d(as, ARMI_MOV ^ emit_isk12(0, m), rd);
>   	break;
>         }
> -      emit_dn(as, ARMI_ORR ^ emit_isk12(0, m), r, r);
> +      emit_dn(as, ARMI_ORR ^ emit_isk12(0, m), rd, rd);
>       }
>     }
>   }
>   
> -#define emit_loada(as, r, addr)		emit_loadi(as, (r), i32ptr((addr)))
> +#define emit_loada(as, rd, addr)	emit_loadi(as, (rd), i32ptr((addr)))
>   
>   static Reg ra_allock(ASMState *as, intptr_t k, RegSet allow);
>   
> @@ -261,7 +265,7 @@ static void emit_branch(ASMState *as, ARMIns ai, MCode *target)
>   {
>     MCode *p = as->mcp;
>     ptrdiff_t delta = (target - p) - 1;
> -  lua_assert(((delta + 0x00800000) >> 24) == 0);
> +  lj_assertA(((delta + 0x00800000) >> 24) == 0, "branch target out of range");
>     *--p = ai | ((uint32_t)delta & 0x00ffffffu);
>     as->mcp = p;
>   }
> @@ -289,7 +293,7 @@ static void emit_call(ASMState *as, void *target)
>   static void emit_movrr(ASMState *as, IRIns *ir, Reg dst, Reg src)
>   {
>   #if LJ_SOFTFP
> -  lua_assert(!irt_isnum(ir->t)); UNUSED(ir);
> +  lj_assertA(!irt_isnum(ir->t), "unexpected FP op"); UNUSED(ir);
>   #else
>     if (dst >= RID_MAX_GPR) {
>       emit_dm(as, irt_isnum(ir->t) ? ARMI_VMOV_D : ARMI_VMOV_S,
> @@ -313,7 +317,7 @@ static void emit_movrr(ASMState *as, IRIns *ir, Reg dst, Reg src)
>   static void emit_loadofs(ASMState *as, IRIns *ir, Reg r, Reg base, int32_t ofs)
>   {
>   #if LJ_SOFTFP
> -  lua_assert(!irt_isnum(ir->t)); UNUSED(ir);
> +  lj_assertA(!irt_isnum(ir->t), "unexpected FP op"); UNUSED(ir);
>   #else
>     if (r >= RID_MAX_GPR)
>       emit_vlso(as, irt_isnum(ir->t) ? ARMI_VLDR_D : ARMI_VLDR_S, r, base, ofs);
> @@ -326,7 +330,7 @@ static void emit_loadofs(ASMState *as, IRIns *ir, Reg r, Reg base, int32_t ofs)
>   static void emit_storeofs(ASMState *as, IRIns *ir, Reg r, Reg base, int32_t ofs)
>   {
>   #if LJ_SOFTFP
> -  lua_assert(!irt_isnum(ir->t)); UNUSED(ir);
> +  lj_assertA(!irt_isnum(ir->t), "unexpected FP op"); UNUSED(ir);
>   #else
>     if (r >= RID_MAX_GPR)
>       emit_vlso(as, irt_isnum(ir->t) ? ARMI_VSTR_D : ARMI_VSTR_S, r, base, ofs);
> diff --git a/src/lj_emit_arm64.h b/src/lj_emit_arm64.h
> index 1001b1d8..96fbab72 100644
> --- a/src/lj_emit_arm64.h
> +++ b/src/lj_emit_arm64.h
> @@ -8,8 +8,9 @@
>   
>   /* -- Constant encoding --------------------------------------------------- */
>   
> -static uint64_t get_k64val(IRIns *ir)
> +static uint64_t get_k64val(ASMState *as, IRRef ref)
>   {
> +  IRIns *ir = IR(ref);
>     if (ir->o == IR_KINT64) {
>       return ir_kint64(ir)->u64;
>     } else if (ir->o == IR_KGC) {
> @@ -17,7 +18,8 @@ static uint64_t get_k64val(IRIns *ir)
>     } else if (ir->o == IR_KPTR || ir->o == IR_KKPTR) {
>       return (uint64_t)ir_kptr(ir);
>     } else {
> -    lua_assert(ir->o == IR_KINT || ir->o == IR_KNULL);
> +    lj_assertA(ir->o == IR_KINT || ir->o == IR_KNULL,
> +	       "bad 64 bit const IR op %d", ir->o);
>       return ir->i;  /* Sign-extended. */
>     }
>   }
> @@ -122,7 +124,7 @@ static int emit_checkofs(A64Ins ai, int64_t ofs)
>   static void emit_lso(ASMState *as, A64Ins ai, Reg rd, Reg rn, int64_t ofs)
>   {
>     int ot = emit_checkofs(ai, ofs), sc = (ai >> 30) & 3;
> -  lua_assert(ot);
> +  lj_assertA(ot, "load/store offset %d out of range", ofs);
>     /* Combine LDR/STR pairs to LDP/STP. */
>     if ((sc == 2 || sc == 3) &&
>         (!(ai & 0x400000) || rd != rn) &&
> @@ -166,10 +168,10 @@ static int emit_kdelta(ASMState *as, Reg rd, uint64_t k, int lim)
>     while (work) {
>       Reg r = rset_picktop(work);
>       IRRef ref = regcost_ref(as->cost[r]);
> -    lua_assert(r != rd);
> +    lj_assertA(r != rd, "dest reg %d not free", rd);
>       if (ref < REF_TRUE) {
>         uint64_t kx = ra_iskref(ref) ? (uint64_t)ra_krefk(as, ref) :
> -				     get_k64val(IR(ref));
> +				     get_k64val(as, ref);
>         int64_t delta = (int64_t)(k - kx);
>         if (delta == 0) {
>   	emit_dm(as, A64I_MOVx, rd, r);
> @@ -312,7 +314,7 @@ static void emit_cond_branch(ASMState *as, A64CC cond, MCode *target)
>   {
>     MCode *p = --as->mcp;
>     ptrdiff_t delta = target - p;
> -  lua_assert(A64F_S_OK(delta, 19));
> +  lj_assertA(A64F_S_OK(delta, 19), "branch target out of range");
>     *p = A64I_BCC | A64F_S19(delta) | cond;
>   }
>   
> @@ -320,7 +322,7 @@ static void emit_branch(ASMState *as, A64Ins ai, MCode *target)
>   {
>     MCode *p = --as->mcp;
>     ptrdiff_t delta = target - p;
> -  lua_assert(A64F_S_OK(delta, 26));
> +  lj_assertA(A64F_S_OK(delta, 26), "branch target out of range");
>     *p = ai | A64F_S26(delta);
>   }
>   
> @@ -328,7 +330,8 @@ static void emit_tnb(ASMState *as, A64Ins ai, Reg r, uint32_t bit, MCode *target
>   {
>     MCode *p = --as->mcp;
>     ptrdiff_t delta = target - p;
> -  lua_assert(bit < 63 && A64F_S_OK(delta, 14));
> +  lj_assertA(bit < 63, "bit number out of range");
> +  lj_assertA(A64F_S_OK(delta, 14), "branch target out of range");
>     if (bit > 31) ai |= A64I_X;
>     *p = ai | A64F_BIT(bit & 31) | A64F_S14(delta) | r;
>   }
> @@ -337,7 +340,7 @@ static void emit_cnb(ASMState *as, A64Ins ai, Reg r, MCode *target)
>   {
>     MCode *p = --as->mcp;
>     ptrdiff_t delta = target - p;
> -  lua_assert(A64F_S_OK(delta, 19));
> +  lj_assertA(A64F_S_OK(delta, 19), "branch target out of range");
>     *p = ai | A64F_S19(delta) | r;
>   }
>   
> diff --git a/src/lj_emit_mips.h b/src/lj_emit_mips.h
> index 313d030a..7f0d27ca 100644
> --- a/src/lj_emit_mips.h
> +++ b/src/lj_emit_mips.h
> @@ -4,8 +4,9 @@
>   */
>   
>   #if LJ_64
> -static intptr_t get_k64val(IRIns *ir)
> +static intptr_t get_k64val(ASMState *as, IRRef ref)
>   {
> +  IRIns *ir = IR(ref);
>     if (ir->o == IR_KINT64) {
>       return (intptr_t)ir_kint64(ir)->u64;
>     } else if (ir->o == IR_KGC) {
> @@ -15,16 +16,17 @@ static intptr_t get_k64val(IRIns *ir)
>     } else if (LJ_SOFTFP && ir->o == IR_KNUM) {
>       return (intptr_t)ir_knum(ir)->u64;
>     } else {
> -    lua_assert(ir->o == IR_KINT || ir->o == IR_KNULL);
> +    lj_assertA(ir->o == IR_KINT || ir->o == IR_KNULL,
> +	       "bad 64 bit const IR op %d", ir->o);
>       return ir->i;  /* Sign-extended. */
>     }
>   }
>   #endif
>   
>   #if LJ_64
> -#define get_kval(ir)		get_k64val(ir)
> +#define get_kval(as, ref)	get_k64val(as, ref)
>   #else
> -#define get_kval(ir)		((ir)->i)
> +#define get_kval(as, ref)	(IR((ref))->i)
>   #endif
>   
>   /* -- Emit basic instructions --------------------------------------------- */
> @@ -82,18 +84,18 @@ static void emit_tsml(ASMState *as, MIPSIns mi, Reg rt, Reg rs, uint32_t msb,
>   #define emit_canremat(ref)	((ref) <= REF_BASE)
>   
>   /* Try to find a one step delta relative to another constant. */
> -static int emit_kdelta1(ASMState *as, Reg t, intptr_t i)
> +static int emit_kdelta1(ASMState *as, Reg rd, intptr_t i)
>   {
>     RegSet work = ~as->freeset & RSET_GPR;
>     while (work) {
>       Reg r = rset_picktop(work);
>       IRRef ref = regcost_ref(as->cost[r]);
> -    lua_assert(r != t);
> +    lj_assertA(r != rd, "dest reg %d not free", rd);
>       if (ref < ASMREF_L) {
>         intptr_t delta = (intptr_t)((uintptr_t)i -
> -	(uintptr_t)(ra_iskref(ref) ? ra_krefk(as, ref) : get_kval(IR(ref))));
> +	(uintptr_t)(ra_iskref(ref) ? ra_krefk(as, ref) : get_kval(as, ref)));
>         if (checki16(delta)) {
> -	emit_tsi(as, MIPSI_AADDIU, t, r, delta);
> +	emit_tsi(as, MIPSI_AADDIU, rd, r, delta);
>   	return 1;
>         }
>       }
> @@ -223,7 +225,7 @@ static void emit_branch(ASMState *as, MIPSIns mi, Reg rs, Reg rt, MCode *target)
>   {
>     MCode *p = as->mcp;
>     ptrdiff_t delta = target - p;
> -  lua_assert(((delta + 0x8000) >> 16) == 0);
> +  lj_assertA(((delta + 0x8000) >> 16) == 0, "branch target out of range");
>     *--p = mi | MIPSF_S(rs) | MIPSF_T(rt) | ((uint32_t)delta & 0xffffu);
>     as->mcp = p;
>   }
> @@ -299,7 +301,7 @@ static void emit_storeofs(ASMState *as, IRIns *ir, Reg r, Reg base, int32_t ofs)
>   static void emit_addptr(ASMState *as, Reg r, int32_t ofs)
>   {
>     if (ofs) {
> -    lua_assert(checki16(ofs));
> +    lj_assertA(checki16(ofs), "offset %d out of range", ofs);
>       emit_tsi(as, MIPSI_AADDIU, r, r, ofs);
>     }
>   }
> diff --git a/src/lj_emit_ppc.h b/src/lj_emit_ppc.h
> index 21c3c2ac..ddc864cd 100644
> --- a/src/lj_emit_ppc.h
> +++ b/src/lj_emit_ppc.h
> @@ -41,13 +41,13 @@ static void emit_rot(ASMState *as, PPCIns pi, Reg ra, Reg rs,
>   
>   static void emit_slwi(ASMState *as, Reg ra, Reg rs, int32_t n)
>   {
> -  lua_assert(n >= 0 && n < 32);
> +  lj_assertA(n >= 0 && n < 32, "shift out of range");
>     emit_rot(as, PPCI_RLWINM, ra, rs, n, 0, 31-n);
>   }
>   
>   static void emit_rotlwi(ASMState *as, Reg ra, Reg rs, int32_t n)
>   {
> -  lua_assert(n >= 0 && n < 32);
> +  lj_assertA(n >= 0 && n < 32, "shift out of range");
>     emit_rot(as, PPCI_RLWINM, ra, rs, n, 0, 31);
>   }
>   
> @@ -57,17 +57,17 @@ static void emit_rotlwi(ASMState *as, Reg ra, Reg rs, int32_t n)
>   #define emit_canremat(ref)	((ref) <= REF_BASE)
>   
>   /* Try to find a one step delta relative to another constant. */
> -static int emit_kdelta1(ASMState *as, Reg t, int32_t i)
> +static int emit_kdelta1(ASMState *as, Reg rd, int32_t i)
>   {
>     RegSet work = ~as->freeset & RSET_GPR;
>     while (work) {
>       Reg r = rset_picktop(work);
>       IRRef ref = regcost_ref(as->cost[r]);
> -    lua_assert(r != t);
> +    lj_assertA(r != rd, "dest reg %d not free", rd);
>       if (ref < ASMREF_L) {
>         int32_t delta = i - (ra_iskref(ref) ? ra_krefk(as, ref) : IR(ref)->i);
>         if (checki16(delta)) {
> -	emit_tai(as, PPCI_ADDI, t, r, delta);
> +	emit_tai(as, PPCI_ADDI, rd, r, delta);
>   	return 1;
>         }
>       }
> @@ -144,7 +144,7 @@ static void emit_condbranch(ASMState *as, PPCIns pi, PPCCC cc, MCode *target)
>   {
>     MCode *p = --as->mcp;
>     ptrdiff_t delta = (char *)target - (char *)p;
> -  lua_assert(((delta + 0x8000) >> 16) == 0);
> +  lj_assertA(((delta + 0x8000) >> 16) == 0, "branch target out of range");
>     pi ^= (delta & 0x8000) * (PPCF_Y/0x8000);
>     *p = pi | PPCF_CC(cc) | ((uint32_t)delta & 0xffffu);
>   }
> diff --git a/src/lj_emit_x86.h b/src/lj_emit_x86.h
> index b3dc4ea5..eaef17fc 100644
> --- a/src/lj_emit_x86.h
> +++ b/src/lj_emit_x86.h
> @@ -92,7 +92,7 @@ static void emit_rr(ASMState *as, x86Op xo, Reg r1, Reg r2)
>   /* [addr] is sign-extended in x64 and must be in lower 2G (not 4G). */
>   static int32_t ptr2addr(const void *p)
>   {
> -  lua_assert((uintptr_t)p < (uintptr_t)0x80000000);
> +  lj_assertX((uintptr_t)p < (uintptr_t)0x80000000, "pointer outside 2G range");
>     return i32ptr(p);
>   }
>   #else
> @@ -208,7 +208,7 @@ static void emit_mrm(ASMState *as, x86Op xo, Reg rr, Reg rb)
>         rb = RID_ESP;
>   #endif
>       } else if (LJ_GC64 && rb == RID_RIP) {
> -      lua_assert(as->mrm.idx == RID_NONE);
> +      lj_assertA(as->mrm.idx == RID_NONE, "RIP-rel mrm cannot have index");
>         mode = XM_OFS0;
>         p -= 4;
>         *(int32_t *)p = as->mrm.ofs;
> @@ -401,7 +401,8 @@ static void emit_loadk64(ASMState *as, Reg r, IRIns *ir)
>       emit_rma(as, xo, r64, k);
>     } else {
>       if (ir->i) {
> -      lua_assert(*k == *(uint64_t*)(as->mctop - ir->i));
> +      lj_assertA(*k == *(uint64_t*)(as->mctop - ir->i),
> +		 "bad interned 64 bit constant");
>       } else if (as->curins <= as->stopins && rset_test(RSET_GPR, r)) {
>         emit_loadu64(as, r, *k);
>         return;
> @@ -433,7 +434,7 @@ static void emit_sjmp(ASMState *as, MCLabel target)
>   {
>     MCode *p = as->mcp;
>     ptrdiff_t delta = target - p;
> -  lua_assert(delta == (int8_t)delta);
> +  lj_assertA(delta == (int8_t)delta, "short jump target out of range");
>     p[-1] = (MCode)(int8_t)delta;
>     p[-2] = XI_JMPs;
>     as->mcp = p - 2;
> @@ -445,7 +446,7 @@ static void emit_sjcc(ASMState *as, int cc, MCLabel target)
>   {
>     MCode *p = as->mcp;
>     ptrdiff_t delta = target - p;
> -  lua_assert(delta == (int8_t)delta);
> +  lj_assertA(delta == (int8_t)delta, "short jump target out of range");
>     p[-1] = (MCode)(int8_t)delta;
>     p[-2] = (MCode)(XI_JCCs+(cc&15));
>     as->mcp = p - 2;
> @@ -471,10 +472,11 @@ static void emit_sfixup(ASMState *as, MCLabel source)
>   #define emit_label(as)		((as)->mcp)
>   
>   /* Compute relative 32 bit offset for jump and call instructions. */
> -static LJ_AINLINE int32_t jmprel(MCode *p, MCode *target)
> +static LJ_AINLINE int32_t jmprel(jit_State *J, MCode *p, MCode *target)
>   {
>     ptrdiff_t delta = target - p;
> -  lua_assert(delta == (int32_t)delta);
> +  UNUSED(J);
> +  lj_assertJ(delta == (int32_t)delta, "jump target out of range");
>     return (int32_t)delta;
>   }
>   
> @@ -482,7 +484,7 @@ static LJ_AINLINE int32_t jmprel(MCode *p, MCode *target)
>   static void emit_jcc(ASMState *as, int cc, MCode *target)
>   {
>     MCode *p = as->mcp;
> -  *(int32_t *)(p-4) = jmprel(p, target);
> +  *(int32_t *)(p-4) = jmprel(as->J, p, target);
>     p[-5] = (MCode)(XI_JCCn+(cc&15));
>     p[-6] = 0x0f;
>     as->mcp = p - 6;
> @@ -492,7 +494,7 @@ static void emit_jcc(ASMState *as, int cc, MCode *target)
>   static void emit_jmp(ASMState *as, MCode *target)
>   {
>     MCode *p = as->mcp;
> -  *(int32_t *)(p-4) = jmprel(p, target);
> +  *(int32_t *)(p-4) = jmprel(as->J, p, target);
>     p[-5] = XI_JMP;
>     as->mcp = p - 5;
>   }
> @@ -509,7 +511,7 @@ static void emit_call_(ASMState *as, MCode *target)
>       return;
>     }
>   #endif
> -  *(int32_t *)(p-4) = jmprel(p, target);
> +  *(int32_t *)(p-4) = jmprel(as->J, p, target);
>     p[-5] = XI_CALL;
>     as->mcp = p - 5;
>   }
> diff --git a/src/lj_err.c b/src/lj_err.c
> index 8d7134d9..89c51e98 100644
> --- a/src/lj_err.c
> +++ b/src/lj_err.c
> @@ -483,17 +483,10 @@ void lj_err_verify(void)
>   #if !LJ_TARGET_OSX
>     /* Check disabled on MacOS due to brilliant software engineering at Apple. */
>     struct dwarf_eh_bases ehb;
> -  /*
> -  ** FIXME: The following assertions were replaced with
> -  ** the conventional `lua_assert` ones.
> -  **
> -  ** lj_assertX(_Unwind_Find_FDE((void *)lj_err_throw, &ehb), "broken build: external frame unwinding enabled, but missing -funwind-tables");
> -  ** lj_assertX(_Unwind_Find_FDE((void *)_Unwind_RaiseException, &ehb), "broken build: external frame unwinding enabled, but system libraries have no unwind tables");
> -  */
> -  lua_assert(_Unwind_Find_FDE((void *)lj_err_throw, &ehb));
> +  lj_assertX(_Unwind_Find_FDE((void *)lj_err_throw, &ehb), "broken build: external frame unwinding enabled, but missing -funwind-tables");
>   #endif
>     /* Check disabled, because of broken Fedora/ARM64. See #722.
> -  lua_assert(_Unwind_Find_FDE((void *)_Unwind_RaiseException, &ehb));
> +  lj_assertX(_Unwind_Find_FDE((void *)_Unwind_RaiseException, &ehb), "broken build: external frame unwinding enabled, but system libraries have no unwind tables");
>     */
>   }
>   #endif
> @@ -514,13 +507,7 @@ static int err_unwind_jit(int version, int actions,
>       ExitNo exitno;
>       uintptr_t addr = _Unwind_GetIP(ctx);  /* Return address _after_ call. */
>       uintptr_t stub = lj_trace_unwind(G2J(g), addr - sizeof(MCode), &exitno);
> -    /*
> -    ** FIXME: The following assert was replaced with
> -    ** the conventional `lua_assert`.
> -    **
> -    ** lj_assertG(tvref(g->jit_base), "unexpected throw across mcode frame");
> -    */
> -    lua_assert(tvref(g->jit_base));
> +    lj_assertG(tvref(g->jit_base), "unexpected throw across mcode frame");
>       if (stub) {  /* Jump to side exit to unwind the trace. */
>         G2J(g)->exitcode = LJ_UEXCLASS_ERRCODE(uexclass);
>   #ifdef LJ_TARGET_MIPS
> @@ -603,15 +590,8 @@ uint8_t *lj_err_register_mcode(void *base, size_t sz, uint8_t *info)
>   #ifdef LUA_USE_ASSERT
>     {
>       struct dwarf_eh_bases ehb;
> -    /*
> -    ** FIXME: The following assert was replaced with
> -    ** the conventional `lua_assert`.
> -    **
> -    ** lj_assertX(_Unwind_Find_FDE(info + sizeof(err_frame_jit_template)+1, &ehb),
> -    **      "bad JIT unwind table registration");
> -    */
> -    lua_assert(_Unwind_Find_FDE(info + sizeof(err_frame_jit_template)+1,
> -               &ehb));
> +    lj_assertX(_Unwind_Find_FDE(info + sizeof(err_frame_jit_template)+1, &ehb),
> +	       "bad JIT unwind table registration");
>     }
>   #endif
>     return info + sizeof(err_frame_jit_template);
> @@ -716,13 +696,7 @@ void lj_err_verify(void)
>   {
>     int got = 0;
>     _Unwind_Backtrace((_Unwind_Trace_Fn)err_verify_bt, &got);
> -  /*
> -  ** FIXME: The following assert was replaced with
> -  ** the conventional `lua_assert`.
> -  **
> -  ** lj_assertX(got == 2, "broken build: external frame unwinding enabled, but missing -funwind-tables");
> -  */
> -  lua_assert(got == 2);
> +  lj_assertX(got == 2, "broken build: external frame unwinding enabled, but missing -funwind-tables");
>   }
>   #endif
>   
> @@ -852,7 +826,7 @@ static ptrdiff_t finderrfunc(lua_State *L)
>   	return savestack(L, frame_prevd(frame)+1);  /* xpcall's errorfunc. */
>         return 0;
>       default:
> -      lua_assert(0);
> +      lj_assertL(0, "bad frame type");
>         return 0;
>       }
>     }
> diff --git a/src/lj_func.c b/src/lj_func.c
> index 639dad87..2efecb0f 100644
> --- a/src/lj_func.c
> +++ b/src/lj_func.c
> @@ -24,9 +24,11 @@ void LJ_FASTCALL lj_func_freeproto(global_State *g, GCproto *pt)
>   
>   /* -- Upvalues ------------------------------------------------------------ */
>   
> -static void unlinkuv(GCupval *uv)
> +static void unlinkuv(global_State *g, GCupval *uv)
>   {
> -  lua_assert(uvprev(uvnext(uv)) == uv && uvnext(uvprev(uv)) == uv);
> +  UNUSED(g);
> +  lj_assertG(uvprev(uvnext(uv)) == uv && uvnext(uvprev(uv)) == uv,
> +	     "broken upvalue chain");
>     setgcrefr(uvnext(uv)->prev, uv->prev);
>     setgcrefr(uvprev(uv)->next, uv->next);
>   }
> @@ -40,7 +42,7 @@ static GCupval *func_finduv(lua_State *L, TValue *slot)
>     GCupval *uv;
>     /* Search the sorted list of open upvalues. */
>     while (gcref(*pp) != NULL && uvval((p = gco2uv(gcref(*pp)))) >= slot) {
> -    lua_assert(!p->closed && uvval(p) != &p->tv);
> +    lj_assertG(!p->closed && uvval(p) != &p->tv, "closed upvalue in chain");
>       if (uvval(p) == slot) {  /* Found open upvalue pointing to same slot? */
>         if (isdead(g, obj2gco(p)))  /* Resurrect it, if it's dead. */
>   	flipwhite(obj2gco(p));
> @@ -61,7 +63,8 @@ static GCupval *func_finduv(lua_State *L, TValue *slot)
>     setgcrefr(uv->next, g->uvhead.next);
>     setgcref(uvnext(uv)->prev, obj2gco(uv));
>     setgcref(g->uvhead.next, obj2gco(uv));
> -  lua_assert(uvprev(uvnext(uv)) == uv && uvnext(uvprev(uv)) == uv);
> +  lj_assertG(uvprev(uvnext(uv)) == uv && uvnext(uvprev(uv)) == uv,
> +	     "broken upvalue chain");
>     return uv;
>   }
>   
> @@ -84,12 +87,13 @@ void LJ_FASTCALL lj_func_closeuv(lua_State *L, TValue *level)
>     while (gcref(L->openupval) != NULL &&
>   	 uvval((uv = gco2uv(gcref(L->openupval)))) >= level) {
>       GCobj *o = obj2gco(uv);
> -    lua_assert(!isblack(o) && !uv->closed && uvval(uv) != &uv->tv);
> +    lj_assertG(!isblack(o), "bad black upvalue");
> +    lj_assertG(!uv->closed && uvval(uv) != &uv->tv, "closed upvalue in chain");
>       setgcrefr(L->openupval, uv->nextgc);  /* No longer in open list. */
>       if (isdead(g, o)) {
>         lj_func_freeuv(g, uv);
>       } else {
> -      unlinkuv(uv);
> +      unlinkuv(g, uv);
>         lj_gc_closeuv(g, uv);
>       }
>     }
> @@ -98,7 +102,7 @@ void LJ_FASTCALL lj_func_closeuv(lua_State *L, TValue *level)
>   void LJ_FASTCALL lj_func_freeuv(global_State *g, GCupval *uv)
>   {
>     if (!uv->closed)
> -    unlinkuv(uv);
> +    unlinkuv(g, uv);
>     lj_mem_freet(g, uv);
>   }
>   
> diff --git a/src/lj_gc.c b/src/lj_gc.c
> index c306047a..19d4c963 100644
> --- a/src/lj_gc.c
> +++ b/src/lj_gc.c
> @@ -42,7 +42,8 @@
>   
>   /* Mark a TValue (if needed). */
>   #define gc_marktv(g, tv) \
> -  { lua_assert(!tvisgcv(tv) || (~itype(tv) == gcval(tv)->gch.gct)); \
> +  { lj_assertG(!tvisgcv(tv) || (~itype(tv) == gcval(tv)->gch.gct), \
> +	       "TValue and GC type mismatch"); \
>       if (tviswhite(tv)) gc_mark(g, gcV(tv)); }
>   
>   /* Mark a GCobj (if needed). */
> @@ -56,7 +57,8 @@
>   static void gc_mark(global_State *g, GCobj *o)
>   {
>     int gct = o->gch.gct;
> -  lua_assert(iswhite(o) && !isdead(g, o));
> +  lj_assertG(iswhite(o), "mark of non-white object");
> +  lj_assertG(!isdead(g, o), "mark of dead object");
>     white2gray(o);
>     if (LJ_UNLIKELY(gct == ~LJ_TUDATA)) {
>       GCtab *mt = tabref(gco2ud(o)->metatable);
> @@ -69,8 +71,9 @@ static void gc_mark(global_State *g, GCobj *o)
>       if (uv->closed)
>         gray2black(o);  /* Closed upvalues are never gray. */
>     } else if (gct != ~LJ_TSTR && gct != ~LJ_TCDATA) {
> -    lua_assert(gct == ~LJ_TFUNC || gct == ~LJ_TTAB ||
> -	       gct == ~LJ_TTHREAD || gct == ~LJ_TPROTO || gct == ~LJ_TTRACE);
> +    lj_assertG(gct == ~LJ_TFUNC || gct == ~LJ_TTAB ||
> +	       gct == ~LJ_TTHREAD || gct == ~LJ_TPROTO || gct == ~LJ_TTRACE,
> +	       "bad GC type %d", gct);
>       setgcrefr(o->gch.gclist, g->gc.gray);
>       setgcref(g->gc.gray, o);
>     }
> @@ -103,7 +106,8 @@ static void gc_mark_uv(global_State *g)
>   {
>     GCupval *uv;
>     for (uv = uvnext(&g->uvhead); uv != &g->uvhead; uv = uvnext(uv)) {
> -    lua_assert(uvprev(uvnext(uv)) == uv && uvnext(uvprev(uv)) == uv);
> +    lj_assertG(uvprev(uvnext(uv)) == uv && uvnext(uvprev(uv)) == uv,
> +	       "broken upvalue chain");
>       if (isgray(obj2gco(uv)))
>         gc_marktv(g, uvval(uv));
>     }
> @@ -198,7 +202,7 @@ static int gc_traverse_tab(global_State *g, GCtab *t)
>       for (i = 0; i <= hmask; i++) {
>         Node *n = &node[i];
>         if (!tvisnil(&n->val)) {  /* Mark non-empty slot. */
> -	lua_assert(!tvisnil(&n->key));
> +	lj_assertG(!tvisnil(&n->key), "mark of nil key in non-empty slot");
>   	if (!(weak & LJ_GC_WEAKKEY)) gc_marktv(g, &n->key);
>   	if (!(weak & LJ_GC_WEAKVAL)) gc_marktv(g, &n->val);
>         }
> @@ -213,7 +217,8 @@ static void gc_traverse_func(global_State *g, GCfunc *fn)
>     gc_markobj(g, tabref(fn->c.env));
>     if (isluafunc(fn)) {
>       uint32_t i;
> -    lua_assert(fn->l.nupvalues <= funcproto(fn)->sizeuv);
> +    lj_assertG(fn->l.nupvalues <= funcproto(fn)->sizeuv,
> +	       "function upvalues out of range");
>       gc_markobj(g, funcproto(fn));
>       for (i = 0; i < fn->l.nupvalues; i++)  /* Mark Lua function upvalues. */
>         gc_markobj(g, &gcref(fn->l.uvptr[i])->uv);
> @@ -229,7 +234,7 @@ static void gc_traverse_func(global_State *g, GCfunc *fn)
>   static void gc_marktrace(global_State *g, TraceNo traceno)
>   {
>     GCobj *o = obj2gco(traceref(G2J(g), traceno));
> -  lua_assert(traceno != G2J(g)->cur.traceno);
> +  lj_assertG(traceno != G2J(g)->cur.traceno, "active trace escaped");
>     if (iswhite(o)) {
>       white2gray(o);
>       setgcrefr(o->gch.gclist, g->gc.gray);
> @@ -310,7 +315,7 @@ static size_t propagatemark(global_State *g)
>   {
>     GCobj *o = gcref(g->gc.gray);
>     int gct = o->gch.gct;
> -  lua_assert(isgray(o));
> +  lj_assertG(isgray(o), "propagation of non-gray object");
>     gray2black(o);
>     setgcrefr(g->gc.gray, o->gch.gclist);  /* Remove from gray list. */
>     if (LJ_LIKELY(gct == ~LJ_TTAB)) {
> @@ -342,7 +347,7 @@ static size_t propagatemark(global_State *g)
>       return ((sizeof(GCtrace)+7)&~7) + (T->nins-T->nk)*sizeof(IRIns) +
>   	   T->nsnap*sizeof(SnapShot) + T->nsnapmap*sizeof(SnapEntry);
>   #else
> -    lua_assert(0);
> +    lj_assertG(0, "bad GC type %d", gct);
>       return 0;
>   #endif
>     }
> @@ -396,11 +401,13 @@ static GCRef *gc_sweep(global_State *g, GCRef *p, uint32_t lim)
>       if (o->gch.gct == ~LJ_TTHREAD)  /* Need to sweep open upvalues, too. */
>         gc_fullsweep(g, &gco2th(o)->openupval);
>       if (((o->gch.marked ^ LJ_GC_WHITES) & ow)) {  /* Black or current white? */
> -      lua_assert(!isdead(g, o) || (o->gch.marked & LJ_GC_FIXED));
> +      lj_assertG(!isdead(g, o) || (o->gch.marked & LJ_GC_FIXED),
> +		 "sweep of undead object");
>         makewhite(g, o);  /* Value is alive, change to the current white. */
>         p = &o->gch.nextgc;
>       } else {  /* Otherwise value is dead, free it. */
> -      lua_assert(isdead(g, o) || ow == LJ_GC_SFIXED);
> +      lj_assertG(isdead(g, o) || ow == LJ_GC_SFIXED,
> +		 "sweep of unlive object");
>         setgcrefr(*p, o->gch.nextgc);
>         if (o == gcref(g->gc.root))
>   	setgcrefr(g->gc.root, o->gch.nextgc);  /* Adjust list anchor. */
> @@ -418,7 +425,8 @@ static GCRef *gc_sweep_str_chain(global_State *g, GCRef *p)
>     GCobj *o;
>     while ((o = gcref(*p)) != NULL) {
>       if (((o->gch.marked ^ LJ_GC_WHITES) & ow)) {  /* Black or current white? */
> -      lua_assert(!isdead(g, o) || (o->gch.marked & LJ_GC_FIXED));
> +      lj_assertG(!isdead(g, o) || (o->gch.marked & LJ_GC_FIXED),
> +		 "sweep of undead string");
>         makewhite(g, o);  /* Value is alive, change to the current white. */
>   #if LUAJIT_SMART_STRINGS
>         if (strsmart(&o->str)) {
> @@ -429,7 +437,8 @@ static GCRef *gc_sweep_str_chain(global_State *g, GCRef *p)
>   #endif
>         p = &o->gch.nextgc;
>       } else {  /* Otherwise value is dead, free it. */
> -      lua_assert(isdead(g, o) || ow == LJ_GC_SFIXED);
> +      lj_assertG(isdead(g, o) || ow == LJ_GC_SFIXED,
> +		 "sweep of unlive string");
>         setgcrefr(*p, o->gch.nextgc);
>         lj_str_free(g, &o->str);
>       }
> @@ -454,11 +463,12 @@ static int gc_mayclear(cTValue *o, int val)
>   }
>   
>   /* Clear collected entries from weak tables. */
> -static void gc_clearweak(GCobj *o)
> +static void gc_clearweak(global_State *g, GCobj *o)
>   {
> +  UNUSED(g);
>     while (o) {
>       GCtab *t = gco2tab(o);
> -    lua_assert((t->marked & LJ_GC_WEAK));
> +    lj_assertG((t->marked & LJ_GC_WEAK), "clear of non-weak table");
>       if ((t->marked & LJ_GC_WEAKVAL)) {
>         MSize i, asize = t->asize;
>         for (i = 0; i < asize; i++) {
> @@ -515,7 +525,7 @@ static void gc_finalize(lua_State *L)
>     global_State *g = G(L);
>     GCobj *o = gcnext(gcref(g->gc.mmudata));
>     cTValue *mo;
> -  lua_assert(tvref(g->jit_base) == NULL);  /* Must not be called on trace. */
> +  lj_assertG(tvref(g->jit_base) == NULL, "finalizer called on trace");
>     /* Unchain from list of userdata to be finalized. */
>     if (o == gcref(g->gc.mmudata))
>       setgcrefnull(g->gc.mmudata);
> @@ -607,7 +617,7 @@ static void atomic(global_State *g, lua_State *L)
>   
>     setgcrefr(g->gc.gray, g->gc.weak);  /* Empty the list of weak tables. */
>     setgcrefnull(g->gc.weak);
> -  lua_assert(!iswhite(obj2gco(mainthread(g))));
> +  lj_assertG(!iswhite(obj2gco(mainthread(g))), "main thread turned white");
>     gc_markobj(g, L);  /* Mark running thread. */
>     gc_traverse_curtrace(g);  /* Traverse current trace. */
>     gc_mark_gcroot(g);  /* Mark GC roots (again). */
> @@ -622,7 +632,7 @@ static void atomic(global_State *g, lua_State *L)
>     udsize += gc_propagate_gray(g);  /* And propagate the marks. */
>   
>     /* All marking done, clear weak tables. */
> -  gc_clearweak(gcref(g->gc.weak));
> +  gc_clearweak(g, gcref(g->gc.weak));
>   
>     lj_buf_shrink(L, &g->tmpbuf);  /* Shrink temp buffer. */
>   
> @@ -668,14 +678,14 @@ static size_t gc_onestep(lua_State *L)
>         g->strbloom.cur[1] = g->strbloom.next[1];
>   #endif
>       }
> -    lua_assert(old >= g->gc.total);
> +    lj_assertG(old >= g->gc.total, "sweep increased memory");
>       g->gc.estimate -= old - g->gc.total;
>       return GCSWEEPCOST;
>       }
>     case GCSsweep: {
>       GCSize old = g->gc.total;
>       setmref(g->gc.sweep, gc_sweep(g, mref(g->gc.sweep, GCRef), GCSWEEPMAX));
> -    lua_assert(old >= g->gc.total);
> +    lj_assertG(old >= g->gc.total, "sweep increased memory");
>       g->gc.estimate -= old - g->gc.total;
>       if (gcref(*mref(g->gc.sweep, GCRef)) == NULL) {
>         if (g->strnum <= (g->strmask >> 2) && g->strmask > LJ_MIN_STRTAB*2-1)
> @@ -708,7 +718,7 @@ static size_t gc_onestep(lua_State *L)
>       g->gc.debt = 0;
>       return 0;
>     default:
> -    lua_assert(0);
> +    lj_assertG(0, "bad GC state");
>       return 0;
>     }
>   }
> @@ -782,7 +792,8 @@ void lj_gc_fullgc(lua_State *L)
>     }
>     while (g->gc.state == GCSsweepstring || g->gc.state == GCSsweep)
>       gc_onestep(L);  /* Finish sweep. */
> -  lua_assert(g->gc.state == GCSfinalize || g->gc.state == GCSpause);
> +  lj_assertG(g->gc.state == GCSfinalize || g->gc.state == GCSpause,
> +	     "bad GC state");
>     /* Now perform a full GC. */
>     g->gc.state = GCSpause;
>     do { gc_onestep(L); } while (g->gc.state != GCSpause);
> @@ -795,9 +806,11 @@ void lj_gc_fullgc(lua_State *L)
>   /* Move the GC propagation frontier forward. */
>   void lj_gc_barrierf(global_State *g, GCobj *o, GCobj *v)
>   {
> -  lua_assert(isblack(o) && iswhite(v) && !isdead(g, v) && !isdead(g, o));
> -  lua_assert(g->gc.state != GCSfinalize && g->gc.state != GCSpause);
> -  lua_assert(o->gch.gct != ~LJ_TTAB);
> +  lj_assertG(isblack(o) && iswhite(v) && !isdead(g, v) && !isdead(g, o),
> +	     "bad object states for forward barrier");
> +  lj_assertG(g->gc.state != GCSfinalize && g->gc.state != GCSpause,
> +	     "bad GC state");
> +  lj_assertG(o->gch.gct != ~LJ_TTAB, "barrier object is not a table");
>     /* Preserve invariant during propagation. Otherwise it doesn't matter. */
>     if (g->gc.state == GCSpropagate || g->gc.state == GCSatomic)
>       gc_mark(g, v);  /* Move frontier forward. */
> @@ -834,7 +847,8 @@ void lj_gc_closeuv(global_State *g, GCupval *uv)
>   	lj_gc_barrierf(g, o, gcV(&uv->tv));
>       } else {
>         makewhite(g, o);  /* Make it white, i.e. sweep the upvalue. */
> -      lua_assert(g->gc.state != GCSfinalize && g->gc.state != GCSpause);
> +      lj_assertG(g->gc.state != GCSfinalize && g->gc.state != GCSpause,
> +		 "bad GC state");
>       }
>     }
>   }
> @@ -854,14 +868,15 @@ void lj_gc_barriertrace(global_State *g, uint32_t traceno)
>   void *lj_mem_realloc(lua_State *L, void *p, GCSize osz, GCSize nsz)
>   {
>     global_State *g = G(L);
> -  lua_assert((osz == 0) == (p == NULL));
> +  lj_assertG((osz == 0) == (p == NULL), "realloc API violation");
>   
>     setgcref(g->mem_L, obj2gco(L));
>     p = g->allocf(g->allocd, p, osz, nsz);
>     if (p == NULL && nsz > 0)
>       lj_err_mem(L);
> -  lua_assert((nsz == 0) == (p == NULL));
> -  lua_assert(checkptrGC(p));
> +  lj_assertG((nsz == 0) == (p == NULL), "allocf API violation");
> +  lj_assertG(checkptrGC(p),
> +	     "allocated memory address %p outside required range", p);
>     g->gc.total = (g->gc.total - osz) + nsz;
>     g->gc.allocated += nsz;
>     g->gc.freed += osz;
> @@ -878,7 +893,8 @@ void * LJ_FASTCALL lj_mem_newgco(lua_State *L, GCSize size)
>     o = (GCobj *)g->allocf(g->allocd, NULL, 0, size);
>     if (o == NULL)
>       lj_err_mem(L);
> -  lua_assert(checkptrGC(o));
> +  lj_assertG(checkptrGC(o),
> +	     "allocated memory address %p outside required range", o);
>     g->gc.total += size;
>     g->gc.allocated += size;
>     setgcrefr(o->gch.nextgc, g->gc.root);
> diff --git a/src/lj_gc.h b/src/lj_gc.h
> index 40b02cb0..bd880652 100644
> --- a/src/lj_gc.h
> +++ b/src/lj_gc.h
> @@ -76,8 +76,10 @@ LJ_FUNC void lj_gc_barriertrace(global_State *g, uint32_t traceno);
>   static LJ_AINLINE void lj_gc_barrierback(global_State *g, GCtab *t)
>   {
>     GCobj *o = obj2gco(t);
> -  lua_assert(isblack(o) && !isdead(g, o));
> -  lua_assert(g->gc.state != GCSfinalize && g->gc.state != GCSpause);
> +  lj_assertG(isblack(o) && !isdead(g, o),
> +	     "bad object states for backward barrier");
> +  lj_assertG(g->gc.state != GCSfinalize && g->gc.state != GCSpause,
> +	     "bad GC state");
>     black2gray(o);
>     setgcrefr(t->gclist, g->gc.grayagain);
>     setgcref(g->gc.grayagain, o);
> diff --git a/src/lj_gdbjit.c b/src/lj_gdbjit.c
> index c219ffac..9947eacc 100644
> --- a/src/lj_gdbjit.c
> +++ b/src/lj_gdbjit.c
> @@ -724,7 +724,7 @@ static void gdbjit_buildobj(GDBJITctx *ctx)
>     SECTALIGN(ctx->p, sizeof(uintptr_t));
>     gdbjit_initsect(ctx, GDBJIT_SECT_eh_frame, gdbjit_ehframe);
>     ctx->objsize = (size_t)((char *)ctx->p - (char *)obj);
> -  lua_assert(ctx->objsize < sizeof(GDBJITobj));
> +  lj_assertX(ctx->objsize < sizeof(GDBJITobj), "GDBJITobj overflow");
>   }
>   
>   #undef SECTALIGN
> @@ -782,7 +782,8 @@ void lj_gdbjit_addtrace(jit_State *J, GCtrace *T)
>     ctx.spadjp = CFRAME_SIZE_JIT +
>   	       (MSize)(parent ? traceref(J, parent)->spadjust : 0);
>     ctx.spadj = CFRAME_SIZE_JIT + T->spadjust;
> -  lua_assert(startpc >= proto_bc(pt) && startpc < proto_bc(pt) + pt->sizebc);
> +  lj_assertJ(startpc >= proto_bc(pt) && startpc < proto_bc(pt) + pt->sizebc,
> +	     "start PC out of range");
>     ctx.lineno = lj_debug_line(pt, proto_bcpos(pt, startpc));
>     ctx.filename = proto_chunknamestr(pt);
>     if (*ctx.filename == '@' || *ctx.filename == '=')
> diff --git a/src/lj_ir.c b/src/lj_ir.c
> index 2f7ddb24..9a51186f 100644
> --- a/src/lj_ir.c
> +++ b/src/lj_ir.c
> @@ -38,7 +38,7 @@
>   #define fins			(&J->fold.ins)
>   
>   /* Pass IR on to next optimization in chain (FOLD). */
> -#define emitir(ot, a, b)        (lj_ir_set(J, (ot), (a), (b)), lj_opt_fold(J))
> +#define emitir(ot, a, b)	(lj_ir_set(J, (ot), (a), (b)), lj_opt_fold(J))
>   
>   /* -- IR tables ----------------------------------------------------------- */
>   
> @@ -90,8 +90,9 @@ static void lj_ir_growbot(jit_State *J)
>   {
>     IRIns *baseir = J->irbuf + J->irbotlim;
>     MSize szins = J->irtoplim - J->irbotlim;
> -  lua_assert(szins != 0);
> -  lua_assert(J->cur.nk == J->irbotlim || J->cur.nk-1 == J->irbotlim);
> +  lj_assertJ(szins != 0, "zero IR size");
> +  lj_assertJ(J->cur.nk == J->irbotlim || J->cur.nk-1 == J->irbotlim,
> +	     "unexpected IR growth");
>     if (J->cur.nins + (szins >> 1) < J->irtoplim) {
>       /* More than half of the buffer is free on top: shift up by a quarter. */
>       MSize ofs = szins >> 2;
> @@ -148,9 +149,10 @@ TRef lj_ir_call(jit_State *J, IRCallID id, ...)
>   /* Load field of type t from GG_State + offset. Must be 32 bit aligned. */
>   LJ_FUNC TRef lj_ir_ggfload(jit_State *J, IRType t, uintptr_t ofs)
>   {
> -  lua_assert((ofs & 3) == 0);
> +  lj_assertJ((ofs & 3) == 0, "unaligned GG_State field offset");
>     ofs >>= 2;
> -  lua_assert(ofs >= IRFL__MAX && ofs <= 0x3ff);  /* 10 bit FOLD key limit. */
> +  lj_assertJ(ofs >= IRFL__MAX && ofs <= 0x3ff,
> +	     "GG_State field offset breaks 10 bit FOLD key limit");
>     lj_ir_set(J, IRT(IR_FLOAD, t), REF_NIL, ofs);
>     return lj_opt_fold(J);
>   }
> @@ -181,7 +183,7 @@ static LJ_AINLINE IRRef ir_nextk(jit_State *J)
>   static LJ_AINLINE IRRef ir_nextk64(jit_State *J)
>   {
>     IRRef ref = J->cur.nk - 2;
> -  lua_assert(J->state != LJ_TRACE_ASM);
> +  lj_assertJ(J->state != LJ_TRACE_ASM, "bad JIT state");
>     if (LJ_UNLIKELY(ref < J->irbotlim)) lj_ir_growbot(J);
>     J->cur.nk = ref;
>     return ref;
> @@ -277,7 +279,7 @@ TRef lj_ir_kgc(jit_State *J, GCobj *o, IRType t)
>   {
>     IRIns *ir, *cir = J->cur.ir;
>     IRRef ref;
> -  lua_assert(!isdead(J2G(J), o));
> +  lj_assertJ(!isdead(J2G(J), o), "interning of dead GC object");
>     for (ref = J->chain[IR_KGC]; ref; ref = cir[ref].prev)
>       if (ir_kgc(&cir[ref]) == o)
>         goto found;
> @@ -299,7 +301,7 @@ TRef lj_ir_ktrace(jit_State *J)
>   {
>     IRRef ref = ir_nextkgc(J);
>     IRIns *ir = IR(ref);
> -  lua_assert(irt_toitype_(IRT_P64) == LJ_TTRACE);
> +  lj_assertJ(irt_toitype_(IRT_P64) == LJ_TTRACE, "mismatched type mapping");
>     ir->t.irt = IRT_P64;
>     ir->o = LJ_GC64 ? IR_KNUM : IR_KNULL;  /* Not IR_KGC yet, but same size. */
>     ir->op12 = 0;
> @@ -313,7 +315,7 @@ TRef lj_ir_kptr_(jit_State *J, IROp op, void *ptr)
>     IRIns *ir, *cir = J->cur.ir;
>     IRRef ref;
>   #if LJ_64 && !LJ_GC64
> -  lua_assert((void *)(uintptr_t)u32ptr(ptr) == ptr);
> +  lj_assertJ((void *)(uintptr_t)u32ptr(ptr) == ptr, "out-of-range GC pointer");
>   #endif
>     for (ref = J->chain[op]; ref; ref = cir[ref].prev)
>       if (ir_kptr(&cir[ref]) == ptr)
> @@ -360,7 +362,8 @@ TRef lj_ir_kslot(jit_State *J, TRef key, IRRef slot)
>     IRRef2 op12 = IRREF2((IRRef1)key, (IRRef1)slot);
>     IRRef ref;
>     /* Const part is not touched by CSE/DCE, so 0-65535 is ok for IRMlit here. */
> -  lua_assert(tref_isk(key) && slot == (IRRef)(IRRef1)slot);
> +  lj_assertJ(tref_isk(key) && slot == (IRRef)(IRRef1)slot,
> +	     "out-of-range key/slot");
>     for (ref = J->chain[IR_KSLOT]; ref; ref = cir[ref].prev)
>       if (cir[ref].op12 == op12)
>         goto found;
> @@ -381,7 +384,7 @@ found:
>   void lj_ir_kvalue(lua_State *L, TValue *tv, const IRIns *ir)
>   {
>     UNUSED(L);
> -  lua_assert(ir->o != IR_KSLOT);  /* Common mistake. */
> +  lj_assertL(ir->o != IR_KSLOT, "unexpected KSLOT");  /* Common mistake. */
>     switch (ir->o) {
>     case IR_KPRI: setpriV(tv, irt_toitype(ir->t)); break;
>     case IR_KINT: setintV(tv, ir->i); break;
> @@ -399,7 +402,7 @@ void lj_ir_kvalue(lua_State *L, TValue *tv, const IRIns *ir)
>       break;
>       }
>   #endif
> -  default: lua_assert(0); break;
> +  default: lj_assertL(0, "bad IR constant op %d", ir->o); break;
>     }
>   }
>   
> @@ -459,7 +462,7 @@ int lj_ir_numcmp(lua_Number a, lua_Number b, IROp op)
>     case IR_UGE: return !(a < b);
>     case IR_ULE: return !(a > b);
>     case IR_UGT: return !(a <= b);
> -  default: lua_assert(0); return 0;
> +  default: lj_assertX(0, "bad IR op %d", op); return 0;
>     }
>   }
>   
> @@ -472,7 +475,7 @@ int lj_ir_strcmp(GCstr *a, GCstr *b, IROp op)
>     case IR_GE: return (res >= 0);
>     case IR_LE: return (res <= 0);
>     case IR_GT: return (res > 0);
> -  default: lua_assert(0); return 0;
> +  default: lj_assertX(0, "bad IR op %d", op); return 0;
>     }
>   }
>   
> diff --git a/src/lj_ir.h b/src/lj_ir.h
> index 43e55069..46af54e4 100644
> --- a/src/lj_ir.h
> +++ b/src/lj_ir.h
> @@ -412,11 +412,12 @@ static LJ_AINLINE IRType itype2irt(const TValue *tv)
>   
>   static LJ_AINLINE uint32_t irt_toitype_(IRType t)
>   {
> -  lua_assert(!LJ_64 || LJ_GC64 || t != IRT_LIGHTUD);
> +  lj_assertX(!LJ_64 || LJ_GC64 || t != IRT_LIGHTUD,
> +	     "no plain type tag for lightuserdata");
>     if (LJ_DUALNUM && t > IRT_NUM) {
>       return LJ_TISNUM;
>     } else {
> -    lua_assert(t <= IRT_NUM);
> +    lj_assertX(t <= IRT_NUM, "no plain type tag for IR type %d", t);
>       return ~(uint32_t)t;
>     }
>   }
> diff --git a/src/lj_jit.h b/src/lj_jit.h
> index a8b6f9a7..361570a0 100644
> --- a/src/lj_jit.h
> +++ b/src/lj_jit.h
> @@ -507,6 +507,12 @@ LJ_ALIGN(16)		/* For DISPATCH-relative addresses in assembler part. */
>   #endif
>   jit_State;
>   
> +#ifdef LUA_USE_ASSERT
> +#define lj_assertJ(c, ...)	lj_assertG_(J2G(J), (c), __VA_ARGS__)
> +#else
> +#define lj_assertJ(c, ...)	((void)J)
> +#endif
> +
>   /* Trivial PRNG e.g. used for penalty randomization. */
>   static LJ_AINLINE uint32_t LJ_PRNG_BITS(jit_State *J, int bits)
>   {
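
A small aside on the assert-macro family this patch introduces (lj_assertJ
above, lj_assertLS and lj_assertFS further down): with LUA_USE_ASSERT defined
each macro forwards to the common lj_assertG_() helper, recovering the global
state from whatever handle the module already has at hand; without it the
macro merely consumes that handle. A rough sketch of what a call site boils
down to in the two modes (my reading of the macros, not part of the patch):

  /* From lj_jit.h in this patch: */
  #ifdef LUA_USE_ASSERT
  #define lj_assertJ(c, ...)  lj_assertG_(J2G(J), (c), __VA_ARGS__)
  #else
  #define lj_assertJ(c, ...)  ((void)J)
  #endif

  /* So a call such as */
  lj_assertJ(ref == DROPFOLD, "bad fold result");
  /* expands with asserts enabled to
  **   lj_assertG_(J2G(J), (ref == DROPFOLD), "bad fold result");
  ** and with asserts disabled to ((void)J), i.e. no check, but also no
  ** unused-variable warning for a J that is only read by assertions.
  */
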
> diff --git a/src/lj_lex.c b/src/lj_lex.c
> index c66660d7..cef3c683 100644
> --- a/src/lj_lex.c
> +++ b/src/lj_lex.c
> @@ -76,7 +76,7 @@ static LJ_AINLINE LexChar lex_savenext(LexState *ls)
>   static void lex_newline(LexState *ls)
>   {
>     LexChar old = ls->c;
> -  lua_assert(lex_iseol(ls));
> +  lj_assertLS(lex_iseol(ls), "bad usage");
>     lex_next(ls);  /* Skip "\n" or "\r". */
>     if (lex_iseol(ls) && ls->c != old) lex_next(ls);  /* Skip "\n\r" or "\r\n". */
>     if (++ls->linenumber >= LJ_MAX_LINE)
> @@ -90,7 +90,7 @@ static void lex_number(LexState *ls, TValue *tv)
>   {
>     StrScanFmt fmt;
>     LexChar c, xp = 'e';
> -  lua_assert(lj_char_isdigit(ls->c));
> +  lj_assertLS(lj_char_isdigit(ls->c), "bad usage");
>     if ((c = ls->c) == '0' && (lex_savenext(ls) | 0x20) == 'x')
>       xp = 'p';
>     while (lj_char_isident(ls->c) || ls->c == '.' ||
> @@ -110,7 +110,8 @@ static void lex_number(LexState *ls, TValue *tv)
>     } else if (fmt != STRSCAN_ERROR) {
>       lua_State *L = ls->L;
>       GCcdata *cd;
> -    lua_assert(fmt == STRSCAN_I64 || fmt == STRSCAN_U64 || fmt == STRSCAN_IMAG);
> +    lj_assertLS(fmt == STRSCAN_I64 || fmt == STRSCAN_U64 || fmt == STRSCAN_IMAG,
> +		"unexpected number format %d", fmt);
>       if (!ctype_ctsG(G(L))) {
>         ptrdiff_t oldtop = savestack(L, L->top);
>         luaopen_ffi(L);  /* Load FFI library on-demand. */
> @@ -127,7 +128,8 @@ static void lex_number(LexState *ls, TValue *tv)
>       lj_parse_keepcdata(ls, tv, cd);
>   #endif
>     } else {
> -    lua_assert(fmt == STRSCAN_ERROR);
> +    lj_assertLS(fmt == STRSCAN_ERROR,
> +		"unexpected number format %d", fmt);
>       lj_lex_error(ls, TK_number, LJ_ERR_XNUMBER);
>     }
>   }
> @@ -137,7 +139,7 @@ static int lex_skipeq(LexState *ls)
>   {
>     int count = 0;
>     LexChar s = ls->c;
> -  lua_assert(s == '[' || s == ']');
> +  lj_assertLS(s == '[' || s == ']', "bad usage");
>     while (lex_savenext(ls) == '=' && count < 0x20000000)
>       count++;
>     return (ls->c == s) ? count : (-count) - 1;
> @@ -462,7 +464,7 @@ void lj_lex_next(LexState *ls)
>   /* Look ahead for the next token. */
>   LexToken lj_lex_lookahead(LexState *ls)
>   {
> -  lua_assert(ls->lookahead == TK_eof);
> +  lj_assertLS(ls->lookahead == TK_eof, "double lookahead");
>     ls->lookahead = lex_scan(ls, &ls->lookaheadval);
>     return ls->lookahead;
>   }
> diff --git a/src/lj_lex.h b/src/lj_lex.h
> index 33fa8657..ae05a954 100644
> --- a/src/lj_lex.h
> +++ b/src/lj_lex.h
> @@ -83,4 +83,10 @@ LJ_FUNC const char *lj_lex_token2str(LexState *ls, LexToken tok);
>   LJ_FUNC_NORET void lj_lex_error(LexState *ls, LexToken tok, ErrMsg em, ...);
>   LJ_FUNC void lj_lex_init(lua_State *L);
>   
> +#ifdef LUA_USE_ASSERT
> +#define lj_assertLS(c, ...)	(lj_assertG_(G(ls->L), (c), __VA_ARGS__))
> +#else
> +#define lj_assertLS(c, ...)	((void)ls)
> +#endif
> +
>   #endif
> diff --git a/src/lj_load.c b/src/lj_load.c
> index 9a31d9a1..19ac6ba2 100644
> --- a/src/lj_load.c
> +++ b/src/lj_load.c
> @@ -159,7 +159,7 @@ LUALIB_API int luaL_loadstring(lua_State *L, const char *s)
>   LUA_API int lua_dump(lua_State *L, lua_Writer writer, void *data)
>   {
>     cTValue *o = L->top-1;
> -  api_check(L, L->top > L->base);
> +  lj_checkapi(L->top > L->base, "top slot empty");
>     if (tvisfunc(o) && isluafunc(funcV(o)))
>       return lj_bcwrite(L, funcproto(funcV(o)), writer, data, 0);
>     else
> diff --git a/src/lj_mapi.c b/src/lj_mapi.c
> index 9d97c747..679ca943 100644
> --- a/src/lj_mapi.c
> +++ b/src/lj_mapi.c
> @@ -28,7 +28,7 @@ LUAMISC_API void luaM_metrics(lua_State *L, struct luam_Metrics *metrics)
>     jit_State *J = G2J(g);
>   #endif
>   
> -  lua_assert(metrics != NULL);
> +  lj_assertL(metrics != NULL, "uninitialized metrics struct");
>   
>     metrics->strhash_hit = g->strhash_hit;
>     metrics->strhash_miss = g->strhash_miss;
> diff --git a/src/lj_mcode.c b/src/lj_mcode.c
> index 10db4457..808a9897 100644
> --- a/src/lj_mcode.c
> +++ b/src/lj_mcode.c
> @@ -354,7 +354,7 @@ MCode *lj_mcode_patch(jit_State *J, MCode *ptr, int finish)
>       /* Otherwise search through the list of MCode areas. */
>       for (;;) {
>         mc = ((MCLink *)mc)->next;
> -      lua_assert(mc != NULL);
> +      lj_assertJ(mc != NULL, "broken MCode area chain");
>         if (ptr >= mc && ptr < (MCode *)((char *)mc + ((MCLink *)mc)->size)) {
>   	if (LJ_UNLIKELY(mcode_setprot(mc, ((MCLink *)mc)->size, MCPROT_GEN)))
>   	  mcode_protfail(J);
> diff --git a/src/lj_memprof.c b/src/lj_memprof.c
> index c600c4f0..a492cf58 100644
> --- a/src/lj_memprof.c
> +++ b/src/lj_memprof.c
> @@ -144,7 +144,7 @@ static void memprof_write_func(struct memprof *mp, uint8_t aevent)
>     else if (iscfunc(fn))
>       memprof_write_cfunc(out, aevent, fn, L, &mp->lib_adds);
>     else
> -    lua_assert(0);
> +    lj_assertL(0, "unknown function type to write by memprof");
>   }
>   
>   #if LJ_HASJIT
> @@ -164,7 +164,7 @@ static void memprof_write_trace(struct memprof *mp, uint8_t aevent)
>   {
>     UNUSED(mp);
>     UNUSED(aevent);
> -  lua_assert(0);
> +  lj_assertX(0, "write trace memprof event without JIT");
>   }
>   
>   #endif
> @@ -215,10 +215,12 @@ static void *memprof_allocf(void *ud, void *ptr, size_t osize, size_t nsize)
>     struct lj_wbuf *out = &mp->out;
>     void *nptr;
>   
> -  lua_assert(MPS_PROFILE == mp->state);
> -  lua_assert(oalloc->allocf != memprof_allocf);
> -  lua_assert(oalloc->allocf != NULL);
> -  lua_assert(ud == oalloc->state);
> +  lj_assertX(MPS_PROFILE == mp->state, "bad memprof profile state");
> +  lj_assertX(oalloc->allocf != memprof_allocf,
> +	     "unexpected memprof old alloc function");
> +  lj_assertX(oalloc->allocf != NULL,
> +	     "uninitialized memprof old alloc function");
> +  lj_assertX(ud == oalloc->state, "bad old memprof profile state");
>   
>     nptr = oalloc->allocf(ud, ptr, osize, nsize);
>   
> @@ -252,10 +254,10 @@ int lj_memprof_start(struct lua_State *L, const struct lj_memprof_options *opt)
>     struct alloc *oalloc = &mp->orig_alloc;
>     const size_t ljm_header_len = sizeof(ljm_header) / sizeof(ljm_header[0]);
>   
> -  lua_assert(opt->writer != NULL);
> -  lua_assert(opt->on_stop != NULL);
> -  lua_assert(opt->buf != NULL);
> -  lua_assert(opt->len != 0);
> +  lj_assertL(opt->writer != NULL, "uninitialized memprof writer");
> +  lj_assertL(opt->on_stop != NULL, "uninitialized on stop memprof callback");
> +  lj_assertL(opt->buf != NULL, "uninitialized memprof writer buffer");
> +  lj_assertL(opt->len != 0, "bad memprof writer buffer length");
>   
>     if (mp->state != MPS_IDLE) {
>       /* Clean up resourses. Ignore possible errors. */
> @@ -293,8 +295,9 @@ int lj_memprof_start(struct lua_State *L, const struct lj_memprof_options *opt)
>   
>     /* Override allocating function. */
>     oalloc->allocf = lua_getallocf(L, &oalloc->state);
> -  lua_assert(oalloc->allocf != NULL);
> -  lua_assert(oalloc->allocf != memprof_allocf);
> +  lj_assertL(oalloc->allocf != NULL, "uninitialized memprof old alloc function");
> +  lj_assertL(oalloc->allocf != memprof_allocf,
> +	     "unexpected memprof old alloc function");
>     lua_setallocf(L, memprof_allocf, oalloc->state);
>   
>     return PROFILE_SUCCESS;
> @@ -323,10 +326,12 @@ int lj_memprof_stop(struct lua_State *L)
>   
>     mp->state = MPS_IDLE;
>   
> -  lua_assert(mp->g != NULL);
> +  lj_assertL(mp->g != NULL, "uninitialized global state in memprof state");
>   
> -  lua_assert(memprof_allocf == lua_getallocf(L, NULL));
> -  lua_assert(oalloc->allocf != NULL);
> +  lj_assertL(memprof_allocf == lua_getallocf(L, NULL),
> +	     "bad current allocator function on memprof stop");
> +  lj_assertL(oalloc->allocf != NULL,
> +	     "uninitialized old alloc function on memprof stop");
>     lua_setallocf(L, oalloc->allocf, oalloc->state);
>   
>     if (LJ_UNLIKELY(lj_wbuf_test_flag(out, STREAM_STOP))) {
> diff --git a/src/lj_meta.c b/src/lj_meta.c
> index 7ef7a8e0..4cb1a261 100644
> --- a/src/lj_meta.c
> +++ b/src/lj_meta.c
> @@ -47,7 +47,7 @@ void lj_meta_init(lua_State *L)
>   cTValue *lj_meta_cache(GCtab *mt, MMS mm, GCstr *name)
>   {
>     cTValue *mo = lj_tab_getstr(mt, name);
> -  lua_assert(mm <= MM_FAST);
> +  lj_assertX(mm <= MM_FAST, "bad metamethod %d", mm);
>     if (!mo || tvisnil(mo)) {  /* No metamethod? */
>       mt->nomm |= (uint8_t)(1u<<mm);  /* Set negative cache flag. */
>       return NULL;
> @@ -363,7 +363,7 @@ TValue * LJ_FASTCALL lj_meta_equal_cd(lua_State *L, BCIns ins)
>     } else if (op == BC_ISEQN) {
>       o2 = &mref(curr_proto(L)->k, cTValue)[bc_d(ins)];
>     } else {
> -    lua_assert(op == BC_ISEQP);
> +    lj_assertL(op == BC_ISEQP, "bad bytecode op %d", op);
>       setpriV(&tv, ~bc_d(ins));
>       o2 = &tv;
>     }
> @@ -426,7 +426,7 @@ void lj_meta_istype(lua_State *L, BCReg ra, BCReg tp)
>   {
>     L->top = curr_topL(L);
>     ra++; tp--;
> -  lua_assert(LJ_DUALNUM || tp != ~LJ_TNUMX);  /* ISTYPE -> ISNUM broken. */
> +  lj_assertL(LJ_DUALNUM || tp != ~LJ_TNUMX, "bad type for ISTYPE");
>     if (LJ_DUALNUM && tp == ~LJ_TNUMX) lj_lib_checkint(L, ra);
>     else if (tp == ~LJ_TNUMX+1) lj_lib_checknum(L, ra);
>     else if (tp == ~LJ_TSTR) lj_lib_checkstr(L, ra);
> diff --git a/src/lj_obj.h b/src/lj_obj.h
> index bf95e1eb..fb21cba9 100644
> --- a/src/lj_obj.h
> +++ b/src/lj_obj.h
> @@ -735,6 +735,11 @@ struct lua_State {
>   #define curr_topL(L)		(L->base + curr_proto(L)->framesize)
>   #define curr_top(L)		(curr_funcisL(L) ? curr_topL(L) : L->top)
>   
> +#if defined(LUA_USE_ASSERT) || defined(LUA_USE_APICHECK)
> +LJ_FUNC_NORET void lj_assert_fail(global_State *g, const char *file, int line,
> +				  const char *func, const char *fmt, ...);
> +#endif
> +
>   /* -- GC object definition and conversions -------------------------------- */
>   
>   /* GC header for generic access to common fields of GC objects. */
> @@ -788,10 +793,6 @@ typedef union GCobj {
>   
>   /* -- TValue getters/setters ---------------------------------------------- */
>   
> -#ifdef LUA_USE_ASSERT
> -#include "lj_gc.h"
> -#endif
> -
>   /* Macros to test types. */
>   #if LJ_GC64
>   #define itype(o)	((uint32_t)((o)->it64 >> 47))
> @@ -863,8 +864,8 @@ static LJ_AINLINE void *lightudV(global_State *g, cTValue *o)
>     uint64_t u = o->u64;
>     uint64_t seg = lightudseg(u);
>     uint32_t *segmap = mref(g->gc.lightudseg, uint32_t);
> -  lua_assert(tvislightud(o));
> -  lua_assert(seg <= g->gc.lightudnum);
> +  lj_assertG(tvislightud(o), "lightuserdata expected");
> +  lj_assertG(seg <= g->gc.lightudnum, "bad lightuserdata segment %d", seg);
>     return (void *)(((uint64_t)segmap[seg] << 32) | lightudlo(u));
>   }
>   #else
> @@ -915,9 +916,19 @@ static LJ_AINLINE void setrawlightudV(TValue *o, void *p)
>     ((o)->u64 = (uint64_t)(void *)(f) - (uint64_t)lj_vm_asm_begin)
>   #endif
>   
> -#define tvchecklive(L, o) \
> -  UNUSED(L), lua_assert(!tvisgcv(o) || \
> -  ((~itype(o) == gcval(o)->gch.gct) && !isdead(G(L), gcval(o))))
> +static LJ_AINLINE void checklivetv(lua_State *L, TValue *o, const char *msg)
> +{
> +  UNUSED(L); UNUSED(o); UNUSED(msg);
> +#if LUA_USE_ASSERT
> +  if (tvisgcv(o)) {
> +    lj_assertL(~itype(o) == gcval(o)->gch.gct,
> +	       "mismatch of TValue type %d vs GC type %d",
> +	       ~itype(o), gcval(o)->gch.gct);
> +    /* Copy of isdead check from lj_gc.h to avoid circular include. */
> +    lj_assertL(!(gcval(o)->gch.marked & (G(L)->gc.currentwhite ^ 3) & 3), msg);
> +  }
> +#endif
> +}
>   
>   static LJ_AINLINE void setgcVraw(TValue *o, GCobj *v, uint32_t itype)
>   {
> @@ -930,7 +941,8 @@ static LJ_AINLINE void setgcVraw(TValue *o, GCobj *v, uint32_t itype)
>   
>   static LJ_AINLINE void setgcV(lua_State *L, TValue *o, GCobj *v, uint32_t it)
>   {
> -  setgcVraw(o, v, it); tvchecklive(L, o);
> +  setgcVraw(o, v, it);
> +  checklivetv(L, o, "store to dead GC object");
>   }
>   
>   #define define_setV(name, type, tag) \
> @@ -977,7 +989,8 @@ static LJ_AINLINE void setint64V(TValue *o, int64_t i)
>   /* Copy tagged values. */
>   static LJ_AINLINE void copyTV(lua_State *L, TValue *o1, const TValue *o2)
>   {
> -  *o1 = *o2; tvchecklive(L, o1);
> +  *o1 = *o2;
> +  checklivetv(L, o1, "copy of dead GC object");
>   }
>   
>   /* -- Number to integer conversion ---------------------------------------- */
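
On the tvchecklive() -> checklivetv() change above: making it a static inline
lets each caller pass its own message ("store to dead GC object" vs. "copy of
dead GC object"), and, as the patch's own comment notes, the dead-object test
is copied from lj_gc.h to avoid a circular include (lj_gc.h includes
lj_obj.h, so the LUA_USE_ASSERT-only include of lj_gc.h removed by this hunk
could not stay). For comparison, my rough recollection of the lj_gc.h
predicate the open-coded expression mirrors (approximate, not part of the
patch):

  /* lj_gc.h, from memory:
  ** #define otherwhite(g)  (g->gc.currentwhite ^ LJ_GC_WHITES)
  ** #define isdead(g, v)   ((v)->gch.marked & otherwhite(g) & LJ_GC_WHITES)
  ** with LJ_GC_WHITES == 3, which is why checklivetv() spells the check as
  */
  lj_assertL(!(gcval(o)->gch.marked & (G(L)->gc.currentwhite ^ 3) & 3), msg);
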
> diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
> index cd803d87..0007107b 100644
> --- a/src/lj_opt_fold.c
> +++ b/src/lj_opt_fold.c
> @@ -282,7 +282,7 @@ static int32_t kfold_intop(int32_t k1, int32_t k2, IROp op)
>     case IR_BROR: k1 = (int32_t)lj_ror((uint32_t)k1, (k2 & 31)); break;
>     case IR_MIN: k1 = k1 < k2 ? k1 : k2; break;
>     case IR_MAX: k1 = k1 > k2 ? k1 : k2; break;
> -  default: lua_assert(0); break;
> +  default: lj_assertX(0, "bad IR op %d", op); break;
>     }
>     return k1;
>   }
> @@ -354,7 +354,7 @@ LJFOLDF(kfold_intcomp)
>     case IR_ULE: return CONDFOLD((uint32_t)a <= (uint32_t)b);
>     case IR_ABC:
>     case IR_UGT: return CONDFOLD((uint32_t)a > (uint32_t)b);
> -  default: lua_assert(0); return FAILFOLD;
> +  default: lj_assertJ(0, "bad IR op %d", fins->o); return FAILFOLD;
>     }
>   }
>   
> @@ -368,10 +368,12 @@ LJFOLDF(kfold_intcomp0)
>   
>   /* -- Constant folding for 64 bit integers -------------------------------- */
>   
> -static uint64_t kfold_int64arith(uint64_t k1, uint64_t k2, IROp op)
> +static uint64_t kfold_int64arith(jit_State *J, uint64_t k1, uint64_t k2,
> +				 IROp op)
>   {
> -  switch (op) {
> +  UNUSED(J);
>   #if LJ_HASFFI
> +  switch (op) {
>     case IR_ADD: k1 += k2; break;
>     case IR_SUB: k1 -= k2; break;
>     case IR_MUL: k1 *= k2; break;
> @@ -383,9 +385,12 @@ static uint64_t kfold_int64arith(uint64_t k1, uint64_t k2, IROp op)
>     case IR_BSAR: k1 >>= (k2 & 63); break;
>     case IR_BROL: k1 = (int32_t)lj_rol((uint32_t)k1, (k2 & 63)); break;
>     case IR_BROR: k1 = (int32_t)lj_ror((uint32_t)k1, (k2 & 63)); break;
> -#endif
> -  default: UNUSED(k2); lua_assert(0); break;
> +  default: lj_assertJ(0, "bad IR op %d", op); break;
>     }
> +#else
> +  UNUSED(k2); UNUSED(op);
> +  lj_assertJ(0, "FFI IR op without FFI");
> +#endif
>     return k1;
>   }
>   
> @@ -397,7 +402,7 @@ LJFOLD(BOR KINT64 KINT64)
>   LJFOLD(BXOR KINT64 KINT64)
>   LJFOLDF(kfold_int64arith)
>   {
> -  return INT64FOLD(kfold_int64arith(ir_k64(fleft)->u64,
> +  return INT64FOLD(kfold_int64arith(J, ir_k64(fleft)->u64,
>   				    ir_k64(fright)->u64, (IROp)fins->o));
>   }
>   
> @@ -419,7 +424,7 @@ LJFOLDF(kfold_int64arith2)
>     }
>     return INT64FOLD(k1);
>   #else
> -  UNUSED(J); lua_assert(0); return FAILFOLD;
> +  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
>   #endif
>   }
>   
> @@ -435,7 +440,7 @@ LJFOLDF(kfold_int64shift)
>     int32_t sh = (fright->i & 63);
>     return INT64FOLD(lj_carith_shift64(k, sh, fins->o - IR_BSHL));
>   #else
> -  UNUSED(J); lua_assert(0); return FAILFOLD;
> +  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
>   #endif
>   }
>   
> @@ -445,7 +450,7 @@ LJFOLDF(kfold_bnot64)
>   #if LJ_HASFFI
>     return INT64FOLD(~ir_k64(fleft)->u64);
>   #else
> -  UNUSED(J); lua_assert(0); return FAILFOLD;
> +  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
>   #endif
>   }
>   
> @@ -455,7 +460,7 @@ LJFOLDF(kfold_bswap64)
>   #if LJ_HASFFI
>     return INT64FOLD(lj_bswap64(ir_k64(fleft)->u64));
>   #else
> -  UNUSED(J); lua_assert(0); return FAILFOLD;
> +  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
>   #endif
>   }
>   
> @@ -480,10 +485,10 @@ LJFOLDF(kfold_int64comp)
>     case IR_UGE: return CONDFOLD(a >= b);
>     case IR_ULE: return CONDFOLD(a <= b);
>     case IR_UGT: return CONDFOLD(a > b);
> -  default: lua_assert(0); return FAILFOLD;
> +  default: lj_assertJ(0, "bad IR op %d", fins->o); return FAILFOLD;
>     }
>   #else
> -  UNUSED(J); lua_assert(0); return FAILFOLD;
> +  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
>   #endif
>   }
>   
> @@ -495,7 +500,7 @@ LJFOLDF(kfold_int64comp0)
>       return DROPFOLD;
>     return NEXTFOLD;
>   #else
> -  UNUSED(J); lua_assert(0); return FAILFOLD;
> +  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
>   #endif
>   }
>   
> @@ -520,7 +525,7 @@ LJFOLD(STRREF KGC KINT)
>   LJFOLDF(kfold_strref)
>   {
>     GCstr *str = ir_kstr(fleft);
> -  lua_assert((MSize)fright->i <= str->len);
> +  lj_assertJ((MSize)fright->i <= str->len, "bad string ref");
>     return lj_ir_kkptr(J, (char *)strdata(str) + fright->i);
>   }
>   
> @@ -616,8 +621,9 @@ LJFOLDF(bufput_kgc)
>   LJFOLD(BUFSTR any any)
>   LJFOLDF(bufstr_kfold_cse)
>   {
> -  lua_assert(fleft->o == IR_BUFHDR || fleft->o == IR_BUFPUT ||
> -	     fleft->o == IR_CALLL);
> +  lj_assertJ(fleft->o == IR_BUFHDR || fleft->o == IR_BUFPUT ||
> +	     fleft->o == IR_CALLL,
> +	     "bad buffer constructor IR op %d", fleft->o);
>     if (LJ_LIKELY(J->flags & JIT_F_OPT_FOLD)) {
>       if (fleft->o == IR_BUFHDR) {  /* No put operations? */
>         if (!(fleft->op2 & IRBUFHDR_APPEND))  /* Empty buffer? */
> @@ -637,8 +643,9 @@ LJFOLDF(bufstr_kfold_cse)
>       while (ref) {
>         IRIns *irs = IR(ref), *ira = fleft, *irb = IR(irs->op1);
>         while (ira->o == irb->o && ira->op2 == irb->op2) {
> -	lua_assert(ira->o == IR_BUFHDR || ira->o == IR_BUFPUT ||
> -		   ira->o == IR_CALLL || ira->o == IR_CARG);
> +	lj_assertJ(ira->o == IR_BUFHDR || ira->o == IR_BUFPUT ||
> +		   ira->o == IR_CALLL || ira->o == IR_CARG,
> +		   "bad buffer constructor IR op %d", ira->o);
>   	if (ira->o == IR_BUFHDR && !(ira->op2 & IRBUFHDR_APPEND))
>   	  return ref;  /* CSE succeeded. */
>   	if (ira->o == IR_CALLL && ira->op2 == IRCALL_lj_buf_puttab)
> @@ -697,7 +704,7 @@ LJFOLD(CALLL CARG IRCALL_lj_strfmt_putfchar)
>   LJFOLDF(bufput_kfold_fmt)
>   {
>     IRIns *irc = IR(fleft->op1);
> -  lua_assert(irref_isk(irc->op2));  /* SFormat must be const. */
> +  lj_assertJ(irref_isk(irc->op2), "SFormat must be const");
>     if (irref_isk(fleft->op2)) {
>       SFormat sf = (SFormat)IR(irc->op2)->i;
>       IRIns *ira = IR(fleft->op2);
> @@ -1216,10 +1223,10 @@ LJFOLDF(simplify_tobit_conv)
>   {
>     /* Fold even across PHI to avoid expensive num->int conversions in loop. */
>     if ((fleft->op2 & IRCONV_SRCMASK) == IRT_INT) {
> -    lua_assert(irt_isnum(fleft->t));
> +    lj_assertJ(irt_isnum(fleft->t), "expected TOBIT number arg");
>       return fleft->op1;
>     } else if ((fleft->op2 & IRCONV_SRCMASK) == IRT_U32) {
> -    lua_assert(irt_isnum(fleft->t));
> +    lj_assertJ(irt_isnum(fleft->t), "expected TOBIT number arg");
>       fins->o = IR_CONV;
>       fins->op1 = fleft->op1;
>       fins->op2 = (IRT_INT<<5)|IRT_U32;
> @@ -1259,7 +1266,7 @@ LJFOLDF(simplify_conv_sext)
>     /* Use scalar evolution analysis results to strength-reduce sign-extension. */
>     if (ref == J->scev.idx) {
>       IRRef lo = J->scev.dir ? J->scev.start : J->scev.stop;
> -    lua_assert(irt_isint(J->scev.t));
> +    lj_assertJ(irt_isint(J->scev.t), "only int SCEV supported");
>       if (lo && IR(lo)->o == IR_KINT && IR(lo)->i + ofs >= 0) {
>       ok_reduce:
>   #if LJ_TARGET_X64
> @@ -1335,7 +1342,8 @@ LJFOLDF(narrow_convert)
>     /* Narrowing ignores PHIs and repeating it inside the loop is not useful. */
>     if (J->chain[IR_LOOP])
>       return NEXTFOLD;
> -  lua_assert(fins->o != IR_CONV || (fins->op2&IRCONV_CONVMASK) != IRCONV_TOBIT);
> +  lj_assertJ(fins->o != IR_CONV || (fins->op2&IRCONV_CONVMASK) != IRCONV_TOBIT,
> +	     "unexpected CONV TOBIT");
>     return lj_opt_narrow_convert(J);
>   }
>   
> @@ -1441,7 +1449,7 @@ LJFOLDF(simplify_intmul_k64)
>       return simplify_intmul_k(J, (int32_t)ir_kint64(fright)->u64);
>     return NEXTFOLD;
>   #else
> -  UNUSED(J); lua_assert(0); return FAILFOLD;
> +  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
>   #endif
>   }
>   
> @@ -1449,7 +1457,7 @@ LJFOLD(MOD any KINT)
>   LJFOLDF(simplify_intmod_k)
>   {
>     int32_t k = fright->i;
> -  lua_assert(k != 0);
> +  lj_assertJ(k != 0, "integer mod 0");
>     if (k > 0 && (k & (k-1)) == 0) {  /* i % (2^k) ==> i & (2^k-1) */
>       fins->o = IR_BAND;
>       fins->op2 = lj_ir_kint(J, k-1);
> @@ -1699,7 +1707,8 @@ LJFOLDF(simplify_shiftk_andk)
>       fins->ot = IRTI(IR_BAND);
>       return RETRYFOLD;
>     } else if (irk->o == IR_KINT64) {
> -    uint64_t k = kfold_int64arith(ir_k64(irk)->u64, fright->i, (IROp)fins->o);
> +    uint64_t k = kfold_int64arith(J, ir_k64(irk)->u64, fright->i,
> +				  (IROp)fins->o);
>       IROpT ot = fleft->ot;
>       fins->op1 = fleft->op1;
>       fins->op1 = (IRRef1)lj_opt_fold(J);
> @@ -1747,8 +1756,8 @@ LJFOLDF(simplify_andor_k64)
>     IRIns *irk = IR(fleft->op2);
>     PHIBARRIER(fleft);
>     if (irk->o == IR_KINT64) {
> -    uint64_t k = kfold_int64arith(ir_k64(irk)->u64,
> -				  ir_k64(fright)->u64, (IROp)fins->o);
> +    uint64_t k = kfold_int64arith(J, ir_k64(irk)->u64, ir_k64(fright)->u64,
> +				  (IROp)fins->o);
>       /* (i | k1) & k2 ==> i & k2, if (k1 & k2) == 0. */
>       /* (i & k1) | k2 ==> i | k2, if (k1 | k2) == -1. */
>       if (k == (fins->o == IR_BAND ? (uint64_t)0 : ~(uint64_t)0)) {
> @@ -1758,7 +1767,7 @@ LJFOLDF(simplify_andor_k64)
>     }
>     return NEXTFOLD;
>   #else
> -  UNUSED(J); lua_assert(0); return FAILFOLD;
> +  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
>   #endif
>   }
>   
> @@ -1794,8 +1803,8 @@ LJFOLDF(reassoc_intarith_k64)
>   #if LJ_HASFFI
>     IRIns *irk = IR(fleft->op2);
>     if (irk->o == IR_KINT64) {
> -    uint64_t k = kfold_int64arith(ir_k64(irk)->u64,
> -				  ir_k64(fright)->u64, (IROp)fins->o);
> +    uint64_t k = kfold_int64arith(J, ir_k64(irk)->u64, ir_k64(fright)->u64,
> +				  (IROp)fins->o);
>       PHIBARRIER(fleft);
>       fins->op1 = fleft->op1;
>       fins->op2 = (IRRef1)lj_ir_kint64(J, k);
> @@ -1803,7 +1812,7 @@ LJFOLDF(reassoc_intarith_k64)
>     }
>     return NEXTFOLD;
>   #else
> -  UNUSED(J); lua_assert(0); return FAILFOLD;
> +  UNUSED(J); lj_assertJ(0, "FFI IR op without FFI"); return FAILFOLD;
>   #endif
>   }
>   
> @@ -2058,7 +2067,7 @@ LJFOLDF(merge_eqne_snew_kgc)
>   {
>     GCstr *kstr = ir_kstr(fright);
>     int32_t len = (int32_t)kstr->len;
> -  lua_assert(irt_isstr(fins->t));
> +  lj_assertJ(irt_isstr(fins->t), "bad equality IR type");
>   
>   #if LJ_TARGET_UNALIGNED
>   #define FOLD_SNEW_MAX_LEN	4  /* Handle string lengths 0, 1, 2, 3, 4. */
> @@ -2122,7 +2131,7 @@ LJFOLD(HLOAD KKPTR)
>   LJFOLDF(kfold_hload_kkptr)
>   {
>     UNUSED(J);
> -  lua_assert(ir_kptr(fleft) == niltvg(J2G(J)));
> +  lj_assertJ(ir_kptr(fleft) == niltvg(J2G(J)), "expected niltv");
>     return TREF_NIL;
>   }
>   
> @@ -2333,7 +2342,7 @@ LJFOLDF(fwd_sload)
>       TRef tr = lj_opt_cse(J);
>       return tref_ref(tr) < J->chain[IR_RETF] ? EMITFOLD : tr;
>     } else {
> -    lua_assert(J->slot[fins->op1] != 0);
> +    lj_assertJ(J->slot[fins->op1] != 0, "uninitialized slot accessed");
>       return J->slot[fins->op1];
>     }
>   }
> @@ -2448,8 +2457,9 @@ TRef LJ_FASTCALL lj_opt_fold(jit_State *J)
>     IRRef ref;
>   
>     if (LJ_UNLIKELY((J->flags & JIT_F_OPT_MASK) != JIT_F_OPT_DEFAULT)) {
> -    lua_assert(((JIT_F_OPT_FOLD|JIT_F_OPT_FWD|JIT_F_OPT_CSE|JIT_F_OPT_DSE) |
> -		JIT_F_OPT_DEFAULT) == JIT_F_OPT_DEFAULT);
> +    lj_assertJ(((JIT_F_OPT_FOLD|JIT_F_OPT_FWD|JIT_F_OPT_CSE|JIT_F_OPT_DSE) |
> +		JIT_F_OPT_DEFAULT) == JIT_F_OPT_DEFAULT,
> +	       "bad JIT_F_OPT_DEFAULT");
>       /* Folding disabled? Chain to CSE, but not for loads/stores/allocs. */
>       if (!(J->flags & JIT_F_OPT_FOLD) && irm_kind(lj_ir_mode[fins->o]) == IRM_N)
>         return lj_opt_cse(J);
> @@ -2511,7 +2521,7 @@ retry:
>       return lj_ir_kint(J, fins->i);
>     if (ref == FAILFOLD)
>       lj_trace_err(J, LJ_TRERR_GFAIL);
> -  lua_assert(ref == DROPFOLD);
> +  lj_assertJ(ref == DROPFOLD, "bad fold result");
>     return REF_DROP;
>   }
>   
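One more pattern worth flagging before the next file: kfold_int64arith() now
takes the jit_State purely so lj_assertJ() can reach the global state (with
UNUSED(J) for release builds), and the #if LJ_HASFFI moved outside the
switch, so a non-FFI build compiles none of the 64-bit folding cases and
instead hits a single "FFI IR op without FFI" assertion. Condensed shape of
the result, as I read the hunk (sketch, not the verbatim patch):

  static uint64_t kfold_int64arith(jit_State *J, uint64_t k1, uint64_t k2,
                                   IROp op)
  {
    UNUSED(J);
  #if LJ_HASFFI
    switch (op) { /* ... 64-bit constant folding on k1/k2 ... */ }
  #else
    UNUSED(k2); UNUSED(op);
    lj_assertJ(0, "FFI IR op without FFI");
  #endif
    return k1;
  }

  /* Callers thread J through, e.g.: */
  uint64_t k = kfold_int64arith(J, ir_k64(irk)->u64, ir_k64(fright)->u64,
                                (IROp)fins->o);
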
> diff --git a/src/lj_opt_loop.c b/src/lj_opt_loop.c
> index 10613641..d3b0fcee 100644
> --- a/src/lj_opt_loop.c
> +++ b/src/lj_opt_loop.c
> @@ -300,7 +300,8 @@ static void loop_unroll(LoopState *lps)
>     loopmap = &J->cur.snapmap[loopsnap->mapofs];
>     /* The PC of snapshot #0 and the loop snapshot must match. */
>     psentinel = &loopmap[loopsnap->nent];
> -  lua_assert(*psentinel == J->cur.snapmap[J->cur.snap[0].nent]);
> +  lj_assertJ(*psentinel == J->cur.snapmap[J->cur.snap[0].nent],
> +	     "mismatched PC for loop snapshot");
>     *psentinel = SNAP(255, 0, 0);  /* Replace PC with temporary sentinel. */
>   
>     /* Start substitution with snapshot #1 (#0 is empty for root traces). */
> @@ -371,7 +372,7 @@ static void loop_unroll(LoopState *lps)
>     }
>     if (!irt_isguard(J->guardemit))  /* Drop redundant snapshot. */
>       J->cur.nsnapmap = (uint32_t)J->cur.snap[--J->cur.nsnap].mapofs;
> -  lua_assert(J->cur.nsnapmap <= J->sizesnapmap);
> +  lj_assertJ(J->cur.nsnapmap <= J->sizesnapmap, "bad snapshot map index");
>     *psentinel = J->cur.snapmap[J->cur.snap[0].nent];  /* Restore PC. */
>   
>     loop_emit_phi(J, subst, phi, nphi, onsnap);
> diff --git a/src/lj_opt_mem.c b/src/lj_opt_mem.c
> index c8265b4f..59fddbdd 100644
> --- a/src/lj_opt_mem.c
> +++ b/src/lj_opt_mem.c
> @@ -18,6 +18,7 @@
>   #include "lj_jit.h"
>   #include "lj_iropt.h"
>   #include "lj_ircall.h"
> +#include "lj_dispatch.h"
>   
>   /* Some local macros to save typing. Undef'd at the end. */
>   #define IR(ref)		(&J->cur.ir[(ref)])
> @@ -56,8 +57,8 @@ static AliasRet aa_table(jit_State *J, IRRef ta, IRRef tb)
>   {
>     IRIns *taba = IR(ta), *tabb = IR(tb);
>     int newa, newb;
> -  lua_assert(ta != tb);
> -  lua_assert(irt_istab(taba->t) && irt_istab(tabb->t));
> +  lj_assertJ(ta != tb, "bad usage");
> +  lj_assertJ(irt_istab(taba->t) && irt_istab(tabb->t), "bad usage");
>     /* Disambiguate new allocations. */
>     newa = (taba->o == IR_TNEW || taba->o == IR_TDUP);
>     newb = (tabb->o == IR_TNEW || tabb->o == IR_TDUP);
> @@ -99,7 +100,7 @@ static AliasRet aa_ahref(jit_State *J, IRIns *refa, IRIns *refb)
>       /* Disambiguate array references based on index arithmetic. */
>       int32_t ofsa = 0, ofsb = 0;
>       IRRef basea = ka, baseb = kb;
> -    lua_assert(refb->o == IR_AREF);
> +    lj_assertJ(refb->o == IR_AREF, "expected AREF");
>       /* Gather base and offset from t[base] or t[base+-ofs]. */
>       if (keya->o == IR_ADD && irref_isk(keya->op2)) {
>         basea = keya->op1;
> @@ -117,8 +118,9 @@ static AliasRet aa_ahref(jit_State *J, IRIns *refa, IRIns *refb)
>         return ALIAS_NO;  /* t[base+-o1] vs. t[base+-o2] and o1 != o2. */
>     } else {
>       /* Disambiguate hash references based on the type of their keys. */
> -    lua_assert((refa->o==IR_HREF || refa->o==IR_HREFK || refa->o==IR_NEWREF) &&
> -	       (refb->o==IR_HREF || refb->o==IR_HREFK || refb->o==IR_NEWREF));
> +    lj_assertJ((refa->o==IR_HREF || refa->o==IR_HREFK || refa->o==IR_NEWREF) &&
> +	       (refb->o==IR_HREF || refb->o==IR_HREFK || refb->o==IR_NEWREF),
> +	       "bad xREF IR op %d or %d", refa->o, refb->o);
>       if (!irt_sametype(keya->t, keyb->t))
>         return ALIAS_NO;  /* Different key types. */
>     }
> @@ -192,7 +194,8 @@ static TRef fwd_ahload(jit_State *J, IRRef xref)
>   	if (key->o == IR_KSLOT) key = IR(key->op1);
>   	lj_ir_kvalue(J->L, &keyv, key);
>   	tv = lj_tab_get(J->L, ir_ktab(IR(ir->op1)), &keyv);
> -	lua_assert(itype2irt(tv) == irt_type(fins->t));
> +	lj_assertJ(itype2irt(tv) == irt_type(fins->t),
> +		   "mismatched type in constant table");
>   	if (irt_isnum(fins->t))
>   	  return lj_ir_knum_u64(J, tv->u64);
>   	else if (LJ_DUALNUM && irt_isint(fins->t))
> diff --git a/src/lj_opt_narrow.c b/src/lj_opt_narrow.c
> index 4f285334..2cfb775b 100644
> --- a/src/lj_opt_narrow.c
> +++ b/src/lj_opt_narrow.c
> @@ -372,17 +372,17 @@ static IRRef narrow_conv_emit(jit_State *J, NarrowConv *nc)
>       } else if (op == NARROW_CONV) {
>         *sp++ = emitir_raw(convot, ref, convop2);  /* Raw emit avoids a loop. */
>       } else if (op == NARROW_SEXT) {
> -      lua_assert(sp >= nc->stack+1);
> +      lj_assertJ(sp >= nc->stack+1, "stack underflow");
>         sp[-1] = emitir(IRT(IR_CONV, IRT_I64), sp[-1],
>   		      (IRT_I64<<5)|IRT_INT|IRCONV_SEXT);
>       } else if (op == NARROW_INT) {
> -      lua_assert(next < last);
> +      lj_assertJ(next < last, "missing arg to NARROW_INT");
>         *sp++ = nc->t == IRT_I64 ?
>   	      lj_ir_kint64(J, (int64_t)(int32_t)*next++) :
>   	      lj_ir_kint(J, *next++);
>       } else {  /* Regular IROpT. Pops two operands and pushes one result. */
>         IRRef mode = nc->mode;
> -      lua_assert(sp >= nc->stack+2);
> +      lj_assertJ(sp >= nc->stack+2, "stack underflow");
>         sp--;
>         /* Omit some overflow checks for array indexing. See comments above. */
>         if ((mode & IRCONV_CONVMASK) == IRCONV_INDEX) {
> @@ -398,7 +398,7 @@ static IRRef narrow_conv_emit(jit_State *J, NarrowConv *nc)
>   	narrow_bpc_set(J, narrow_ref(ref), narrow_ref(sp[-1]), mode);
>       }
>     }
> -  lua_assert(sp == nc->stack+1);
> +  lj_assertJ(sp == nc->stack+1, "stack misalignment");
>     return nc->stack[0];
>   }
>   
> @@ -452,7 +452,7 @@ static TRef narrow_stripov(jit_State *J, TRef tr, int lastop, IRRef mode)
>   TRef LJ_FASTCALL lj_opt_narrow_index(jit_State *J, TRef tr)
>   {
>     IRIns *ir;
> -  lua_assert(tref_isnumber(tr));
> +  lj_assertJ(tref_isnumber(tr), "expected number type");
>     if (tref_isnum(tr))  /* Conversion may be narrowed, too. See above. */
>       return emitir(IRTGI(IR_CONV), tr, IRCONV_INT_NUM|IRCONV_INDEX);
>     /* Omit some overflow checks for array indexing. See comments above. */
> @@ -499,7 +499,7 @@ TRef LJ_FASTCALL lj_opt_narrow_tobit(jit_State *J, TRef tr)
>   /* Narrow C array index (overflow undefined). */
>   TRef LJ_FASTCALL lj_opt_narrow_cindex(jit_State *J, TRef tr)
>   {
> -  lua_assert(tref_isnumber(tr));
> +  lj_assertJ(tref_isnumber(tr), "expected number type");
>     if (tref_isnum(tr))
>       return emitir(IRT(IR_CONV, IRT_INTP), tr, (IRT_INTP<<5)|IRT_NUM|IRCONV_ANY);
>     /* Undefined overflow semantics allow stripping of ADDOV, SUBOV and MULOV. */
> @@ -627,9 +627,10 @@ static int narrow_forl(jit_State *J, cTValue *o)
>   /* Narrow the FORL index type by looking at the runtime values. */
>   IRType lj_opt_narrow_forl(jit_State *J, cTValue *tv)
>   {
> -  lua_assert(tvisnumber(&tv[FORL_IDX]) &&
> +  lj_assertJ(tvisnumber(&tv[FORL_IDX]) &&
>   	     tvisnumber(&tv[FORL_STOP]) &&
> -	     tvisnumber(&tv[FORL_STEP]));
> +	     tvisnumber(&tv[FORL_STEP]),
> +	     "expected number types");
>     /* Narrow only if the runtime values of start/stop/step are all integers. */
>     if (narrow_forl(J, &tv[FORL_IDX]) &&
>         narrow_forl(J, &tv[FORL_STOP]) &&
> diff --git a/src/lj_opt_split.c b/src/lj_opt_split.c
> index c10a85cb..a619d852 100644
> --- a/src/lj_opt_split.c
> +++ b/src/lj_opt_split.c
> @@ -235,7 +235,7 @@ static IRRef split_bitshift(jit_State *J, IRRef1 *hisubst,
>   	return split_emit(J, IRTI(IR_BOR), t1, t2);
>         } else {
>   	IRRef t1 = ir->prev, t2;
> -	lua_assert(op == IR_BSHR || op == IR_BSAR);
> +	lj_assertJ(op == IR_BSHR || op == IR_BSAR, "bad usage");
>   	nir->o = IR_BSHR;
>   	t2 = split_emit(J, IRTI(IR_BSHL), hi, lj_ir_kint(J, (-k&31)));
>   	ir->prev = split_emit(J, IRTI(IR_BOR), t1, t2);
> @@ -250,7 +250,7 @@ static IRRef split_bitshift(jit_State *J, IRRef1 *hisubst,
>   	ir->prev = lj_ir_kint(J, 0);
>   	return lo;
>         } else {
> -	lua_assert(op == IR_BSHR || op == IR_BSAR);
> +	lj_assertJ(op == IR_BSHR || op == IR_BSAR, "bad usage");
>   	if (k == 32) {
>   	  J->cur.nins--;
>   	  ir->prev = hi;
> @@ -429,7 +429,7 @@ static void split_ir(jit_State *J)
>   	hi = split_emit(J, IRT(IR_HIOP, IRT_SOFTFP), nref, nref);
>   	break;
>         case IR_FLOAD:
> -	lua_assert(ir->op1 == REF_NIL);
> +	lj_assertJ(ir->op1 == REF_NIL, "expected FLOAD from GG_State");
>   	hi = lj_ir_kint(J, *(int32_t*)((char*)J2GG(J) + ir->op2 + LJ_LE*4));
>   	nir->op2 += LJ_BE*4;
>   	break;
> @@ -465,8 +465,9 @@ static void split_ir(jit_State *J)
>   	  break;
>   	}
>   #endif
> -	lua_assert(st == IRT_INT ||
> -		   (LJ_32 && LJ_HASFFI && (st == IRT_U32 || st == IRT_FLOAT)));
> +	lj_assertJ(st == IRT_INT ||
> +		   (LJ_32 && LJ_HASFFI && (st == IRT_U32 || st == IRT_FLOAT)),
> +		   "bad source type for CONV");
>   	nir->o = IR_CALLN;
>   #if LJ_32 && LJ_HASFFI
>   	nir->op2 = st == IRT_INT ? IRCALL_softfp_i2d :
> @@ -496,7 +497,8 @@ static void split_ir(jit_State *J)
>   	hi = nir->op2;
>   	break;
>         default:
> -	lua_assert(ir->o <= IR_NE || ir->o == IR_MIN || ir->o == IR_MAX);
> +	lj_assertJ(ir->o <= IR_NE || ir->o == IR_MIN || ir->o == IR_MAX,
> +		   "bad IR op %d", ir->o);
>   	hi = split_emit(J, IRTG(IR_HIOP, IRT_SOFTFP),
>   			hisubst[ir->op1], hisubst[ir->op2]);
>   	break;
> @@ -553,7 +555,7 @@ static void split_ir(jit_State *J)
>   	hi = split_bitshift(J, hisubst, oir, nir, ir);
>   	break;
>         case IR_FLOAD:
> -	lua_assert(ir->op2 == IRFL_CDATA_INT64);
> +	lj_assertJ(ir->op2 == IRFL_CDATA_INT64, "only INT64 supported");
>   	hi = split_emit(J, IRTI(IR_FLOAD), nir->op1, IRFL_CDATA_INT64_4);
>   #if LJ_BE
>   	ir->prev = hi; hi = nref;
> @@ -619,7 +621,7 @@ static void split_ir(jit_State *J)
>   	hi = nir->op2;
>   	break;
>         default:
> -	lua_assert(ir->o <= IR_NE);  /* Comparisons. */
> +	lj_assertJ(ir->o <= IR_NE, "bad IR op %d", ir->o);  /* Comparisons. */
>   	split_emit(J, IRTGI(IR_HIOP), hiref, hisubst[ir->op2]);
>   	break;
>         }
> @@ -697,7 +699,7 @@ static void split_ir(jit_State *J)
>   #if LJ_SOFTFP
>         if (st == IRT_NUM || (LJ_32 && LJ_HASFFI && st == IRT_FLOAT)) {
>   	if (irt_isguard(ir->t)) {
> -	  lua_assert(st == IRT_NUM && irt_isint(ir->t));
> +	  lj_assertJ(st == IRT_NUM && irt_isint(ir->t), "bad CONV types");
>   	  J->cur.nins--;
>   	  ir->prev = split_num2int(J, nir->op1, hisubst[ir->op1], 1);
>   	} else {
> @@ -828,7 +830,7 @@ void lj_opt_split(jit_State *J)
>     if (!J->needsplit)
>       J->needsplit = split_needsplit(J);
>   #else
> -  lua_assert(J->needsplit >= split_needsplit(J));  /* Verify flag. */
> +  lj_assertJ(J->needsplit >= split_needsplit(J), "bad SPLIT state");
>   #endif
>     if (J->needsplit) {
>       int errcode = lj_vm_cpcall(J->L, NULL, J, cpsplit);
> diff --git a/src/lj_parse.c b/src/lj_parse.c
> index e238afa3..3f6caaec 100644
> --- a/src/lj_parse.c
> +++ b/src/lj_parse.c
> @@ -169,6 +169,12 @@ LJ_STATIC_ASSERT((int)BC_MULVV-(int)BC_ADDVV == (int)OPR_MUL-(int)OPR_ADD);
>   LJ_STATIC_ASSERT((int)BC_DIVVV-(int)BC_ADDVV == (int)OPR_DIV-(int)OPR_ADD);
>   LJ_STATIC_ASSERT((int)BC_MODVV-(int)BC_ADDVV == (int)OPR_MOD-(int)OPR_ADD);
>   
> +#ifdef LUA_USE_ASSERT
> +#define lj_assertFS(c, ...)	(lj_assertG_(G(fs->L), (c), __VA_ARGS__))
> +#else
> +#define lj_assertFS(c, ...)	((void)fs)
> +#endif
> +
>   /* -- Error handling ------------------------------------------------------ */
>   
>   LJ_NORET LJ_NOINLINE static void err_syntax(LexState *ls, ErrMsg em)
> @@ -206,7 +212,7 @@ static BCReg const_num(FuncState *fs, ExpDesc *e)
>   {
>     lua_State *L = fs->L;
>     TValue *o;
> -  lua_assert(expr_isnumk(e));
> +  lj_assertFS(expr_isnumk(e), "bad usage");
>     o = lj_tab_set(L, fs->kt, &e->u.nval);
>     if (tvhaskslot(o))
>       return tvkslot(o);
> @@ -231,7 +237,7 @@ static BCReg const_gc(FuncState *fs, GCobj *gc, uint32_t itype)
>   /* Add a string constant. */
>   static BCReg const_str(FuncState *fs, ExpDesc *e)
>   {
> -  lua_assert(expr_isstrk(e) || e->k == VGLOBAL);
> +  lj_assertFS(expr_isstrk(e) || e->k == VGLOBAL, "bad usage");
>     return const_gc(fs, obj2gco(e->u.sval), LJ_TSTR);
>   }
>   
> @@ -319,7 +325,7 @@ static void jmp_patchins(FuncState *fs, BCPos pc, BCPos dest)
>   {
>     BCIns *jmp = &fs->bcbase[pc].ins;
>     BCPos offset = dest-(pc+1)+BCBIAS_J;
> -  lua_assert(dest != NO_JMP);
> +  lj_assertFS(dest != NO_JMP, "uninitialized jump target");
>     if (offset > BCMAX_D)
>       err_syntax(fs->ls, LJ_ERR_XJUMP);
>     setbc_d(jmp, offset);
> @@ -368,7 +374,7 @@ static void jmp_patch(FuncState *fs, BCPos list, BCPos target)
>     if (target == fs->pc) {
>       jmp_tohere(fs, list);
>     } else {
> -    lua_assert(target < fs->pc);
> +    lj_assertFS(target < fs->pc, "bad jump target");
>       jmp_patchval(fs, list, target, NO_REG, target);
>     }
>   }
> @@ -398,7 +404,7 @@ static void bcreg_free(FuncState *fs, BCReg reg)
>   {
>     if (reg >= fs->nactvar) {
>       fs->freereg--;
> -    lua_assert(reg == fs->freereg);
> +    lj_assertFS(reg == fs->freereg, "bad regfree");
>     }
>   }
>   
> @@ -548,7 +554,7 @@ static void expr_toreg_nobranch(FuncState *fs, ExpDesc *e, BCReg reg)
>     } else if (e->k <= VKTRUE) {
>       ins = BCINS_AD(BC_KPRI, reg, const_pri(e));
>     } else {
> -    lua_assert(e->k == VVOID || e->k == VJMP);
> +    lj_assertFS(e->k == VVOID || e->k == VJMP, "bad expr type %d", e->k);
>       return;
>     }
>     bcemit_INS(fs, ins);
> @@ -643,7 +649,7 @@ static void bcemit_store(FuncState *fs, ExpDesc *var, ExpDesc *e)
>       ins = BCINS_AD(BC_GSET, ra, const_str(fs, var));
>     } else {
>       BCReg ra, rc;
> -    lua_assert(var->k == VINDEXED);
> +    lj_assertFS(var->k == VINDEXED, "bad expr type %d", var->k);
>       ra = expr_toanyreg(fs, e);
>       rc = var->u.s.aux;
>       if ((int32_t)rc < 0) {
> @@ -651,10 +657,12 @@ static void bcemit_store(FuncState *fs, ExpDesc *var, ExpDesc *e)
>       } else if (rc > BCMAX_C) {
>         ins = BCINS_ABC(BC_TSETB, ra, var->u.s.info, rc-(BCMAX_C+1));
>       } else {
> +#ifdef LUA_USE_ASSERT
>         /* Free late alloced key reg to avoid assert on free of value reg. */
>         /* This can only happen when called from expr_table(). */
> -      lua_assert(e->k != VNONRELOC || ra < fs->nactvar ||
> -		 rc < ra || (bcreg_free(fs, rc),1));
> +      if (e->k == VNONRELOC && ra >= fs->nactvar && rc >= ra)
> +	bcreg_free(fs, rc);
> +#endif
>         ins = BCINS_ABC(BC_TSETV, ra, var->u.s.info, rc);
>       }
>     }
> @@ -669,7 +677,7 @@ static void bcemit_method(FuncState *fs, ExpDesc *e, ExpDesc *key)
>     expr_free(fs, e);
>     func = fs->freereg;
>     bcemit_AD(fs, BC_MOV, func+1+LJ_FR2, obj);  /* Copy object to 1st argument. */
> -  lua_assert(expr_isstrk(key));
> +  lj_assertFS(expr_isstrk(key), "bad usage");
>     idx = const_str(fs, key);
>     if (idx <= BCMAX_C) {
>       bcreg_reserve(fs, 2+LJ_FR2);
> @@ -809,7 +817,8 @@ static void bcemit_arith(FuncState *fs, BinOpr opr, ExpDesc *e1, ExpDesc *e2)
>       else
>         rc = expr_toanyreg(fs, e2);
>       /* 1st operand discharged by bcemit_binop_left, but need KNUM/KSHORT. */
> -    lua_assert(expr_isnumk(e1) || e1->k == VNONRELOC);
> +    lj_assertFS(expr_isnumk(e1) || e1->k == VNONRELOC,
> +		"bad expr type %d", e1->k);
>       expr_toval(fs, e1);
>       /* Avoid two consts to satisfy bytecode constraints. */
>       if (expr_isnumk(e1) && !expr_isnumk(e2) &&
> @@ -897,19 +906,20 @@ static void bcemit_binop(FuncState *fs, BinOpr op, ExpDesc *e1, ExpDesc *e2)
>     if (op <= OPR_POW) {
>       bcemit_arith(fs, op, e1, e2);
>     } else if (op == OPR_AND) {
> -    lua_assert(e1->t == NO_JMP);  /* List must be closed. */
> +    lj_assertFS(e1->t == NO_JMP, "jump list not closed");
>       expr_discharge(fs, e2);
>       jmp_append(fs, &e2->f, e1->f);
>       *e1 = *e2;
>     } else if (op == OPR_OR) {
> -    lua_assert(e1->f == NO_JMP);  /* List must be closed. */
> +    lj_assertFS(e1->f == NO_JMP, "jump list not closed");
>       expr_discharge(fs, e2);
>       jmp_append(fs, &e2->t, e1->t);
>       *e1 = *e2;
>     } else if (op == OPR_CONCAT) {
>       expr_toval(fs, e2);
>       if (e2->k == VRELOCABLE && bc_op(*bcptr(fs, e2)) == BC_CAT) {
> -      lua_assert(e1->u.s.info == bc_b(*bcptr(fs, e2))-1);
> +      lj_assertFS(e1->u.s.info == bc_b(*bcptr(fs, e2))-1,
> +		  "bad CAT stack layout");
>         expr_free(fs, e1);
>         setbc_b(bcptr(fs, e2), e1->u.s.info);
>         e1->u.s.info = e2->u.s.info;
> @@ -921,8 +931,9 @@ static void bcemit_binop(FuncState *fs, BinOpr op, ExpDesc *e1, ExpDesc *e2)
>       }
>       e1->k = VRELOCABLE;
>     } else {
> -    lua_assert(op == OPR_NE || op == OPR_EQ ||
> -	       op == OPR_LT || op == OPR_GE || op == OPR_LE || op == OPR_GT);
> +    lj_assertFS(op == OPR_NE || op == OPR_EQ ||
> +	       op == OPR_LT || op == OPR_GE || op == OPR_LE || op == OPR_GT,
> +	       "bad binop %d", op);
>       bcemit_comp(fs, op, e1, e2);
>     }
>   }
> @@ -951,10 +962,10 @@ static void bcemit_unop(FuncState *fs, BCOp op, ExpDesc *e)
>         e->u.s.info = fs->freereg-1;
>         e->k = VNONRELOC;
>       } else {
> -      lua_assert(e->k == VNONRELOC);
> +      lj_assertFS(e->k == VNONRELOC, "bad expr type %d", e->k);
>       }
>     } else {
> -    lua_assert(op == BC_UNM || op == BC_LEN);
> +    lj_assertFS(op == BC_UNM || op == BC_LEN, "bad unop %d", op);
>       if (op == BC_UNM && !expr_hasjump(e)) {  /* Constant-fold negations. */
>   #if LJ_HASFFI
>         if (e->k == VKCDATA) {  /* Fold in-place since cdata is not interned. */
> @@ -1049,8 +1060,9 @@ static void var_new(LexState *ls, BCReg n, GCstr *name)
>         lj_lex_error(ls, 0, LJ_ERR_XLIMC, LJ_MAX_VSTACK);
>       lj_mem_growvec(ls->L, ls->vstack, ls->sizevstack, LJ_MAX_VSTACK, VarInfo);
>     }
> -  lua_assert((uintptr_t)name < VARNAME__MAX ||
> -	     lj_tab_getstr(fs->kt, name) != NULL);
> +  lj_assertFS((uintptr_t)name < VARNAME__MAX ||
> +	      lj_tab_getstr(fs->kt, name) != NULL,
> +	      "unanchored variable name");
>     /* NOBARRIER: name is anchored in fs->kt and ls->vstack is not a GCobj. */
>     setgcref(ls->vstack[vtop].name, obj2gco(name));
>     fs->varmap[fs->nactvar+n] = (uint16_t)vtop;
> @@ -1105,7 +1117,7 @@ static MSize var_lookup_uv(FuncState *fs, MSize vidx, ExpDesc *e)
>         return i;  /* Already exists. */
>     /* Otherwise create a new one. */
>     checklimit(fs, fs->nuv, LJ_MAX_UPVAL, "upvalues");
> -  lua_assert(e->k == VLOCAL || e->k == VUPVAL);
> +  lj_assertFS(e->k == VLOCAL || e->k == VUPVAL, "bad expr type %d", e->k);
>     fs->uvmap[n] = (uint16_t)vidx;
>     fs->uvtmp[n] = (uint16_t)(e->k == VLOCAL ? vidx : LJ_MAX_VSTACK+e->u.s.info);
>     fs->nuv = n+1;
> @@ -1156,7 +1168,8 @@ static MSize gola_new(LexState *ls, GCstr *name, uint8_t info, BCPos pc)
>         lj_lex_error(ls, 0, LJ_ERR_XLIMC, LJ_MAX_VSTACK);
>       lj_mem_growvec(ls->L, ls->vstack, ls->sizevstack, LJ_MAX_VSTACK, VarInfo);
>     }
> -  lua_assert(name == NAME_BREAK || lj_tab_getstr(fs->kt, name) != NULL);
> +  lj_assertFS(name == NAME_BREAK || lj_tab_getstr(fs->kt, name) != NULL,
> +	      "unanchored label name");
>     /* NOBARRIER: name is anchored in fs->kt and ls->vstack is not a GCobj. */
>     setgcref(ls->vstack[vtop].name, obj2gco(name));
>     ls->vstack[vtop].startpc = pc;
> @@ -1186,8 +1199,9 @@ static void gola_close(LexState *ls, VarInfo *vg)
>     FuncState *fs = ls->fs;
>     BCPos pc = vg->startpc;
>     BCIns *ip = &fs->bcbase[pc].ins;
> -  lua_assert(gola_isgoto(vg));
> -  lua_assert(bc_op(*ip) == BC_JMP || bc_op(*ip) == BC_UCLO);
> +  lj_assertFS(gola_isgoto(vg), "expected goto");
> +  lj_assertFS(bc_op(*ip) == BC_JMP || bc_op(*ip) == BC_UCLO,
> +	      "bad bytecode op %d", bc_op(*ip));
>     setbc_a(ip, vg->slot);
>     if (bc_op(*ip) == BC_JMP) {
>       BCPos next = jmp_next(fs, pc);
> @@ -1206,9 +1220,9 @@ static void gola_resolve(LexState *ls, FuncScope *bl, MSize idx)
>       if (gcrefeq(vg->name, vl->name) && gola_isgoto(vg)) {
>         if (vg->slot < vl->slot) {
>   	GCstr *name = strref(var_get(ls, ls->fs, vg->slot).name);
> -	lua_assert((uintptr_t)name >= VARNAME__MAX);
> +	lj_assertLS((uintptr_t)name >= VARNAME__MAX, "expected goto name");
>   	ls->linenumber = ls->fs->bcbase[vg->startpc].line;
> -	lua_assert(strref(vg->name) != NAME_BREAK);
> +	lj_assertLS(strref(vg->name) != NAME_BREAK, "unexpected break");
>   	lj_lex_error(ls, 0, LJ_ERR_XGSCOPE,
>   		     strdata(strref(vg->name)), strdata(name));
>         }
> @@ -1272,7 +1286,7 @@ static void fscope_begin(FuncState *fs, FuncScope *bl, int flags)
>     bl->vstart = fs->ls->vtop;
>     bl->prev = fs->bl;
>     fs->bl = bl;
> -  lua_assert(fs->freereg == fs->nactvar);
> +  lj_assertFS(fs->freereg == fs->nactvar, "bad regalloc");
>   }
>   
>   /* End a scope. */
> @@ -1283,7 +1297,7 @@ static void fscope_end(FuncState *fs)
>     fs->bl = bl->prev;
>     var_remove(ls, bl->nactvar);
>     fs->freereg = fs->nactvar;
> -  lua_assert(bl->nactvar == fs->nactvar);
> +  lj_assertFS(bl->nactvar == fs->nactvar, "bad regalloc");
>     if ((bl->flags & (FSCOPE_UPVAL|FSCOPE_NOCLOSE)) == FSCOPE_UPVAL)
>       bcemit_AJ(fs, BC_UCLO, bl->nactvar, 0);
>     if ((bl->flags & FSCOPE_BREAK)) {
> @@ -1370,13 +1384,13 @@ static void fs_fixup_k(FuncState *fs, GCproto *pt, void *kptr)
>       Node *n = &node[i];
>       if (tvhaskslot(&n->val)) {
>         ptrdiff_t kidx = (ptrdiff_t)tvkslot(&n->val);
> -      lua_assert(!tvisint(&n->key));
> +      lj_assertFS(!tvisint(&n->key), "unexpected integer key");
>         if (tvisnum(&n->key)) {
>   	TValue *tv = &((TValue *)kptr)[kidx];
>   	if (LJ_DUALNUM) {
>   	  lua_Number nn = numV(&n->key);
>   	  int32_t k = lj_num2int(nn);
> -	  lua_assert(!tvismzero(&n->key));
> +	  lj_assertFS(!tvismzero(&n->key), "unexpected -0 key");
>   	  if ((lua_Number)k == nn)
>   	    setintV(tv, k);
>   	  else
> @@ -1424,21 +1438,21 @@ static void fs_fixup_line(FuncState *fs, GCproto *pt,
>       uint8_t *li = (uint8_t *)lineinfo;
>       do {
>         BCLine delta = base[i].line - first;
> -      lua_assert(delta >= 0 && delta < 256);
> +      lj_assertFS(delta >= 0 && delta < 256, "bad line delta");
>         li[i] = (uint8_t)delta;
>       } while (++i < n);
>     } else if (LJ_LIKELY(numline < 65536)) {
>       uint16_t *li = (uint16_t *)lineinfo;
>       do {
>         BCLine delta = base[i].line - first;
> -      lua_assert(delta >= 0 && delta < 65536);
> +      lj_assertFS(delta >= 0 && delta < 65536, "bad line delta");
>         li[i] = (uint16_t)delta;
>       } while (++i < n);
>     } else {
>       uint32_t *li = (uint32_t *)lineinfo;
>       do {
>         BCLine delta = base[i].line - first;
> -      lua_assert(delta >= 0);
> +      lj_assertFS(delta >= 0, "bad line delta");
>         li[i] = (uint32_t)delta;
>       } while (++i < n);
>     }
> @@ -1528,7 +1542,7 @@ static void fs_fixup_ret(FuncState *fs)
>     }
>     fs->bl->flags |= FSCOPE_NOCLOSE;  /* Handled above. */
>     fscope_end(fs);
> -  lua_assert(fs->bl == NULL);
> +  lj_assertFS(fs->bl == NULL, "bad scope nesting");
>     /* May need to fixup returns encoded before first function was created. */
>     if (fs->flags & PROTO_FIXUP_RETURN) {
>       BCPos pc;
> @@ -1608,7 +1622,7 @@ static GCproto *fs_finish(LexState *ls, BCLine line)
>     L->top--;  /* Pop table of constants. */
>     ls->vtop = fs->vbase;  /* Reset variable stack. */
>     ls->fs = fs->prev;
> -  lua_assert(ls->fs != NULL || ls->tok == TK_eof);
> +  lj_assertL(ls->fs != NULL || ls->tok == TK_eof, "bad parser state");
>     return pt;
>   }
>   
> @@ -1702,14 +1716,15 @@ static void expr_bracket(LexState *ls, ExpDesc *v)
>   }
>   
>   /* Get value of constant expression. */
> -static void expr_kvalue(TValue *v, ExpDesc *e)
> +static void expr_kvalue(FuncState *fs, TValue *v, ExpDesc *e)
>   {
> +  UNUSED(fs);
>     if (e->k <= VKTRUE) {
>       setpriV(v, ~(uint32_t)e->k);
>     } else if (e->k == VKSTR) {
>       setgcVraw(v, obj2gco(e->u.sval), LJ_TSTR);
>     } else {
> -    lua_assert(tvisnumber(expr_numtv(e)));
> +    lj_assertFS(tvisnumber(expr_numtv(e)), "bad number constant");
>       *v = *expr_numtv(e);
>     }
>   }
> @@ -1759,11 +1774,11 @@ static void expr_table(LexState *ls, ExpDesc *e)
>   	fs->bcbase[pc].ins = BCINS_AD(BC_TDUP, freg-1, kidx);
>         }
>         vcall = 0;
> -      expr_kvalue(&k, &key);
> +      expr_kvalue(fs, &k, &key);
>         v = lj_tab_set(fs->L, t, &k);
>         lj_gc_anybarriert(fs->L, t);
>         if (expr_isk_nojump(&val)) {  /* Add const key/value to template table. */
> -	expr_kvalue(v, &val);
> +	expr_kvalue(fs, v, &val);
>         } else {  /* Otherwise create dummy string key (avoids lj_tab_newkey). */
>   	settabV(fs->L, v, t);  /* Preserve key with table itself as value. */
>   	fixt = 1;   /* Fix this later, after all resizes. */
> @@ -1782,8 +1797,9 @@ static void expr_table(LexState *ls, ExpDesc *e)
>     if (vcall) {
>       BCInsLine *ilp = &fs->bcbase[fs->pc-1];
>       ExpDesc en;
> -    lua_assert(bc_a(ilp->ins) == freg &&
> -	       bc_op(ilp->ins) == (narr > 256 ? BC_TSETV : BC_TSETB));
> +    lj_assertFS(bc_a(ilp->ins) == freg &&
> +		bc_op(ilp->ins) == (narr > 256 ? BC_TSETV : BC_TSETB),
> +		"bad CALL code generation");
>       expr_init(&en, VKNUM, 0);
>       en.u.nval.u32.lo = narr-1;
>       en.u.nval.u32.hi = 0x43300000;  /* Biased integer to avoid denormals. */
> @@ -1813,7 +1829,7 @@ static void expr_table(LexState *ls, ExpDesc *e)
>         for (i = 0; i <= hmask; i++) {
>   	Node *n = &node[i];
>   	if (tvistab(&n->val)) {
> -	  lua_assert(tabV(&n->val) == t);
> +	  lj_assertFS(tabV(&n->val) == t, "bad dummy key in template table");
>   	  setnilV(&n->val);  /* Turn value into nil. */
>   	}
>         }
> @@ -1844,7 +1860,7 @@ static BCReg parse_params(LexState *ls, int needself)
>       } while (lex_opt(ls, ','));
>     }
>     var_add(ls, nparams);
> -  lua_assert(fs->nactvar == nparams);
> +  lj_assertFS(fs->nactvar == nparams, "bad regalloc");
>     bcreg_reserve(fs, nparams);
>     lex_check(ls, ')');
>     return nparams;
> @@ -1931,7 +1947,7 @@ static void parse_args(LexState *ls, ExpDesc *e)
>       err_syntax(ls, LJ_ERR_XFUNARG);
>       return;  /* Silence compiler. */
>     }
> -  lua_assert(e->k == VNONRELOC);
> +  lj_assertFS(e->k == VNONRELOC, "bad expr type %d", e->k);
>     base = e->u.s.info;  /* Base register for call. */
>     if (args.k == VCALL) {
>       ins = BCINS_ABC(BC_CALLM, base, 2, args.u.s.aux - base - 1 - LJ_FR2);
> @@ -2701,8 +2717,9 @@ static void parse_chunk(LexState *ls)
>     while (!islast && !parse_isend(ls->tok)) {
>       islast = parse_stmt(ls);
>       lex_opt(ls, ';');
> -    lua_assert(ls->fs->framesize >= ls->fs->freereg &&
> -	       ls->fs->freereg >= ls->fs->nactvar);
> +    lj_assertLS(ls->fs->framesize >= ls->fs->freereg &&
> +		ls->fs->freereg >= ls->fs->nactvar,
> +		"bad regalloc");
>       ls->fs->freereg = ls->fs->nactvar;  /* Free registers after each stmt. */
>     }
>     synlevel_end(ls);
> @@ -2737,9 +2754,8 @@ GCproto *lj_parse(LexState *ls)
>       err_token(ls, TK_eof);
>     pt = fs_finish(ls, ls->linenumber);
>     L->top--;  /* Drop chunkname. */
> -  lua_assert(fs.prev == NULL);
> -  lua_assert(ls->fs == NULL);
> -  lua_assert(pt->sizeuv == 0);
> +  lj_assertL(fs.prev == NULL && ls->fs == NULL, "mismatched frame nesting");
> +  lj_assertL(pt->sizeuv == 0, "toplevel proto has upvalues");
>     return pt;
>   }
>   
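Side note on the recurring pattern in the parser hunks above: each bare `lua_assert(cond)` becomes `lj_assertFS(cond, "message", ...)` (or the LS/L variant), i.e. the condition now carries a printf-style explanation plus the state object needed to report it. Below is a minimal self-contained sketch of that call shape, for reference only; `assert_fail()`/`my_assert()` are illustrative stand-ins, not the actual helpers introduced by the patch, and the real macros are assumed to compile away unless LUA_USE_ASSERT is defined.

#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
** Illustrative stand-in for a formatted assertion helper: print the
** location, the formatted reason, then abort.
*/
static void assert_fail(const char *file, int line, const char *fmt, ...)
{
  va_list ap;
  va_start(ap, fmt);
  fprintf(stderr, "%s:%d: assertion failed: ", file, line);
  vfprintf(stderr, fmt, ap);
  va_end(ap);
  fputc('\n', stderr);
  abort();
}

#define my_assert(c, ...) \
  ((c) ? (void)0 : assert_fail(__FILE__, __LINE__, __VA_ARGS__))

int main(void)
{
  int k = 7;
  /* Analogous in shape to lj_assertFS(..., "bad expr type %d", e->k). */
  my_assert(k >= 0 && k < 4, "bad expr type %d", k);
  return 0;
}
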
> diff --git a/src/lj_record.c b/src/lj_record.c
> index 6030f77c..d1332bfc 100644
> --- a/src/lj_record.c
> +++ b/src/lj_record.c
> @@ -50,34 +50,52 @@
>   static void rec_check_ir(jit_State *J)
>   {
>     IRRef i, nins = J->cur.nins, nk = J->cur.nk;
> -  lua_assert(nk <= REF_BIAS && nins >= REF_BIAS && nins < 65536);
> +  lj_assertJ(nk <= REF_BIAS && nins >= REF_BIAS && nins < 65536,
> +	     "inconsistent IR layout");
>     for (i = nk; i < nins; i++) {
>       IRIns *ir = IR(i);
>       uint32_t mode = lj_ir_mode[ir->o];
>       IRRef op1 = ir->op1;
>       IRRef op2 = ir->op2;
> +    const char *err = NULL;
>       switch (irm_op1(mode)) {
> -    case IRMnone: lua_assert(op1 == 0); break;
> -    case IRMref: lua_assert(op1 >= nk);
> -      lua_assert(i >= REF_BIAS ? op1 < i : op1 > i); break;
> +    case IRMnone:
> +      if (op1 != 0) err = "IRMnone op1 used";
> +      break;
> +    case IRMref:
> +      if (op1 < nk || (i >= REF_BIAS ? op1 >= i : op1 <= i))
> +	err = "IRMref op1 out of range";
> +      break;
>       case IRMlit: break;
> -    case IRMcst: lua_assert(i < REF_BIAS);
> +    case IRMcst:
> +      if (i >= REF_BIAS) { err = "constant in IR range"; break; }
>         if (irt_is64(ir->t) && ir->o != IR_KNULL)
>   	i++;
>         continue;
>       }
>       switch (irm_op2(mode)) {
> -    case IRMnone: lua_assert(op2 == 0); break;
> -    case IRMref: lua_assert(op2 >= nk);
> -      lua_assert(i >= REF_BIAS ? op2 < i : op2 > i); break;
> +    case IRMnone:
> +      if (op2) err = "IRMnone op2 used";
> +      break;
> +    case IRMref:
> +      if (op2 < nk || (i >= REF_BIAS ? op2 >= i : op2 <= i))
> +	err = "IRMref op2 out of range";
> +      break;
>       case IRMlit: break;
> -    case IRMcst: lua_assert(0); break;
> +    case IRMcst: err = "IRMcst op2"; break;
>       }
> -    if (ir->prev) {
> -      lua_assert(ir->prev >= nk);
> -      lua_assert(i >= REF_BIAS ? ir->prev < i : ir->prev > i);
> -      lua_assert(ir->o == IR_NOP || IR(ir->prev)->o == ir->o);
> +    if (!err && ir->prev) {
> +      if (ir->prev < nk || (i >= REF_BIAS ? ir->prev >= i : ir->prev <= i))
> +	err = "chain out of range";
> +      else if (ir->o != IR_NOP && IR(ir->prev)->o != ir->o)
> +	err = "chain to different op";
>       }
> +    lj_assertJ(!err, "bad IR %04d op %d(%04d,%04d): %s",
> +	       i-REF_BIAS,
> +	       ir->o,
> +	       irm_op1(mode) == IRMref ? op1-REF_BIAS : op1,
> +	       irm_op2(mode) == IRMref ? op2-REF_BIAS : op2,
> +	       err);
>     }
>   }
>   
> @@ -87,9 +105,10 @@ static void rec_check_slots(jit_State *J)
>     BCReg s, nslots = J->baseslot + J->maxslot;
>     int32_t depth = 0;
>     cTValue *base = J->L->base - J->baseslot;
> -  lua_assert(J->baseslot >= 1+LJ_FR2);
> -  lua_assert(J->baseslot == 1+LJ_FR2 || (J->slot[J->baseslot-1] & TREF_FRAME));
> -  lua_assert(nslots <= LJ_MAX_JSLOTS);
> +  lj_assertJ(J->baseslot >= 1+LJ_FR2, "bad baseslot");
> +  lj_assertJ(J->baseslot == 1+LJ_FR2 || (J->slot[J->baseslot-1] & TREF_FRAME),
> +	     "baseslot does not point to frame");
> +  lj_assertJ(nslots <= LJ_MAX_JSLOTS, "slot overflow");
>     for (s = 0; s < nslots; s++) {
>       TRef tr = J->slot[s];
>       if (tr) {
> @@ -97,56 +116,65 @@ static void rec_check_slots(jit_State *J)
>         IRRef ref = tref_ref(tr);
>         IRIns *ir = NULL;  /* Silence compiler. */
>         if (!LJ_FR2 || ref || !(tr & (TREF_FRAME | TREF_CONT))) {
> -	lua_assert(ref >= J->cur.nk && ref < J->cur.nins);
> +	lj_assertJ(ref >= J->cur.nk && ref < J->cur.nins,
> +		   "slot %d ref %04d out of range", s, ref - REF_BIAS);
>   	ir = IR(ref);
> -	lua_assert(irt_t(ir->t) == tref_t(tr));
> +	lj_assertJ(irt_t(ir->t) == tref_t(tr), "slot %d IR type mismatch", s);
>         }
>         if (s == 0) {
> -	lua_assert(tref_isfunc(tr));
> +	lj_assertJ(tref_isfunc(tr), "frame slot 0 is not a function");
>   #if LJ_FR2
>         } else if (s == 1) {
> -	lua_assert((tr & ~TREF_FRAME) == 0);
> +	lj_assertJ((tr & ~TREF_FRAME) == 0, "bad frame slot 1");
>   #endif
>         } else if ((tr & TREF_FRAME)) {
>   	GCfunc *fn = gco2func(frame_gc(tv));
>   	BCReg delta = (BCReg)(tv - frame_prev(tv));
>   #if LJ_FR2
> -	if (ref)
> -	  lua_assert(ir_knum(ir)->u64 == tv->u64);
> +	lj_assertJ(!ref || ir_knum(ir)->u64 == tv->u64,
> +		   "frame slot %d PC mismatch", s);
>   	tr = J->slot[s-1];
>   	ir = IR(tref_ref(tr));
>   #endif
> -	lua_assert(tref_isfunc(tr));
> -	if (tref_isk(tr)) lua_assert(fn == ir_kfunc(ir));
> -	lua_assert(s > delta + LJ_FR2 ? (J->slot[s-delta] & TREF_FRAME)
> -				      : (s == delta + LJ_FR2));
> +	lj_assertJ(tref_isfunc(tr),
> +		   "frame slot %d is not a function", s-LJ_FR2);
> +	lj_assertJ(!tref_isk(tr) || fn == ir_kfunc(ir),
> +		   "frame slot %d function mismatch", s-LJ_FR2);
> +	lj_assertJ(s > delta + LJ_FR2 ? (J->slot[s-delta] & TREF_FRAME)
> +				      : (s == delta + LJ_FR2),
> +		   "frame slot %d broken chain", s-LJ_FR2);
>   	depth++;
>         } else if ((tr & TREF_CONT)) {
>   #if LJ_FR2
> -	if (ref)
> -	  lua_assert(ir_knum(ir)->u64 == tv->u64);
> +	lj_assertJ(!ref || ir_knum(ir)->u64 == tv->u64,
> +		   "cont slot %d continuation mismatch", s);
>   #else
> -	lua_assert(ir_kptr(ir) == gcrefp(tv->gcr, void));
> +	lj_assertJ(ir_kptr(ir) == gcrefp(tv->gcr, void),
> +		   "cont slot %d continuation mismatch", s);
>   #endif
> -	lua_assert((J->slot[s+1+LJ_FR2] & TREF_FRAME));
> +	lj_assertJ((J->slot[s+1+LJ_FR2] & TREF_FRAME),
> +		   "cont slot %d not followed by frame", s);
>   	depth++;
>         } else {
> -	if (tvisnumber(tv))
> -	  lua_assert(tref_isnumber(tr));  /* Could be IRT_INT etc., too. */
> -	else
> -	  lua_assert(itype2irt(tv) == tref_type(tr));
> +	/* Number repr. may differ, but other types must be the same. */
> +	lj_assertJ(tvisnumber(tv) ? tref_isnumber(tr) :
> +				    itype2irt(tv) == tref_type(tr),
> +		   "slot %d type mismatch: stack type %d vs IR type %d",
> +		   s, itypemap(tv), tref_type(tr));
>   	if (tref_isk(tr)) {  /* Compare constants. */
>   	  TValue tvk;
>   	  lj_ir_kvalue(J->L, &tvk, ir);
> -	  if (!(tvisnum(&tvk) && tvisnan(&tvk)))
> -	    lua_assert(lj_obj_equal(tv, &tvk));
> -	  else
> -	    lua_assert(tvisnum(tv) && tvisnan(tv));
> +	  lj_assertJ((tvisnum(&tvk) && tvisnan(&tvk)) ?
> +		     (tvisnum(tv) && tvisnan(tv)) :
> +		     lj_obj_equal(tv, &tvk),
> +		     "slot %d const mismatch: stack %016llx vs IR %016llx",
> +		     s, tv->u64, tvk.u64);
>   	}
>         }
>       }
>     }
> -  lua_assert(J->framedepth == depth);
> +  lj_assertJ(J->framedepth == depth,
> +	     "frame depth mismatch %d vs %d", J->framedepth, depth);
>   }
>   #endif
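A note on the two checkers above: rec_check_ir() no longer asserts on every field in place; it collects the first failure reason into `err` and raises a single lj_assertJ() that also prints the offending IR reference and its operands. A rough standalone approximation of that collect-then-assert shape follows; the struct and function names are made up, and my_assert() is the stand-in from the sketch above.

/*
** Validate one instruction: remember the first problem found, then
** report it once together with its context.
*/
typedef struct { int op1, op2; } Ins;

static void check_ins(const Ins *ins, int idx, int nk)
{
  const char *err = NULL;
  if (ins->op1 < nk)
    err = "op1 out of range";
  else if (ins->op2 < nk)
    err = "op2 out of range";
  my_assert(!err, "bad IR %04d (%d, %d): %s", idx, ins->op1, ins->op2, err);
}
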
>   
> @@ -182,7 +210,7 @@ static TRef getcurrf(jit_State *J)
>   {
>     if (J->base[-1-LJ_FR2])
>       return J->base[-1-LJ_FR2];
> -  lua_assert(J->baseslot == 1+LJ_FR2);
> +  lj_assertJ(J->baseslot == 1+LJ_FR2, "bad baseslot");
>     return sloadt(J, -1-LJ_FR2, IRT_FUNC, IRSLOAD_READONLY);
>   }
>   
> @@ -427,7 +455,8 @@ static void rec_for_loop(jit_State *J, const BCIns *fori, ScEvEntry *scev,
>     TRef stop = fori_arg(J, fori, ra+FORL_STOP, t, mode);
>     TRef step = fori_arg(J, fori, ra+FORL_STEP, t, mode);
>     int tc, dir = rec_for_direction(&tv[FORL_STEP]);
> -  lua_assert(bc_op(*fori) == BC_FORI || bc_op(*fori) == BC_JFORI);
> +  lj_assertJ(bc_op(*fori) == BC_FORI || bc_op(*fori) == BC_JFORI,
> +	     "bad bytecode %d instead of FORI/JFORI", bc_op(*fori));
>     scev->t.irt = t;
>     scev->dir = dir;
>     scev->stop = tref_ref(stop);
> @@ -483,7 +512,7 @@ static LoopEvent rec_for(jit_State *J, const BCIns *fori, int isforl)
>   						   IRT_NUM;
>       for (i = FORL_IDX; i <= FORL_STEP; i++) {
>         if (!tr[i]) sload(J, ra+i);
> -      lua_assert(tref_isnumber_str(tr[i]));
> +      lj_assertJ(tref_isnumber_str(tr[i]), "bad FORI argument type");
>         if (tref_isstr(tr[i]))
>   	tr[i] = emitir(IRTG(IR_STRTO, IRT_NUM), tr[i], 0);
>         if (t == IRT_INT) {
> @@ -615,7 +644,8 @@ static void rec_loop_jit(jit_State *J, TraceNo lnk, LoopEvent ev)
>   static int rec_profile_need(jit_State *J, GCproto *pt, const BCIns *pc)
>   {
>     GCproto *ppt;
> -  lua_assert(J->prof_mode == 'f' || J->prof_mode == 'l');
> +  lj_assertJ(J->prof_mode == 'f' || J->prof_mode == 'l',
> +	     "bad profiler mode %c", J->prof_mode);
>     if (!pt)
>       return 0;
>     ppt = J->prev_pt;
> @@ -793,7 +823,7 @@ void lj_record_ret(jit_State *J, BCReg rbase, ptrdiff_t gotresults)
>       BCReg cbase = (BCReg)frame_delta(frame);
>       if (--J->framedepth <= 0)
>         lj_trace_err(J, LJ_TRERR_NYIRETL);
> -    lua_assert(J->baseslot > 1+LJ_FR2);
> +    lj_assertJ(J->baseslot > 1+LJ_FR2, "bad baseslot for return");
>       gotresults++;
>       rbase += cbase;
>       J->baseslot -= (BCReg)cbase;
> @@ -818,7 +848,7 @@ void lj_record_ret(jit_State *J, BCReg rbase, ptrdiff_t gotresults)
>       BCReg cbase = (BCReg)frame_delta(frame);
>       if (--J->framedepth < 0)  /* NYI: return of vararg func to lower frame. */
>         lj_trace_err(J, LJ_TRERR_NYIRETL);
> -    lua_assert(J->baseslot > 1+LJ_FR2);
> +    lj_assertJ(J->baseslot > 1+LJ_FR2, "bad baseslot for return");
>       rbase += cbase;
>       J->baseslot -= (BCReg)cbase;
>       J->base -= cbase;
> @@ -845,7 +875,7 @@ void lj_record_ret(jit_State *J, BCReg rbase, ptrdiff_t gotresults)
>       J->maxslot = cbase+(BCReg)nresults;
>       if (J->framedepth > 0) {  /* Return to a frame that is part of the trace. */
>         J->framedepth--;
> -      lua_assert(J->baseslot > cbase+1+LJ_FR2);
> +      lj_assertJ(J->baseslot > cbase+1+LJ_FR2, "bad baseslot for return");
>         J->baseslot -= cbase+1+LJ_FR2;
>         J->base -= cbase+1+LJ_FR2;
>       } else if (J->parent == 0 && J->exitno == 0 &&
> @@ -860,7 +890,7 @@ void lj_record_ret(jit_State *J, BCReg rbase, ptrdiff_t gotresults)
>         emitir(IRTG(IR_RETF, IRT_PGC), trpt, trpc);
>         J->retdepth++;
>         J->needsnap = 1;
> -      lua_assert(J->baseslot == 1+LJ_FR2);
> +      lj_assertJ(J->baseslot == 1+LJ_FR2, "bad baseslot for return");
>         /* Shift result slots up and clear the slots of the new frame below. */
>         memmove(J->base + cbase, J->base-1-LJ_FR2, sizeof(TRef)*nresults);
>         memset(J->base-1-LJ_FR2, 0, sizeof(TRef)*(cbase+1+LJ_FR2));
> @@ -908,12 +938,13 @@ void lj_record_ret(jit_State *J, BCReg rbase, ptrdiff_t gotresults)
>         }  /* Otherwise continue with another __concat call. */
>       } else {
>         /* Result type already specialized. */
> -      lua_assert(cont == lj_cont_condf || cont == lj_cont_condt);
> +      lj_assertJ(cont == lj_cont_condf || cont == lj_cont_condt,
> +		 "bad continuation type");
>       }
>     } else {
>       lj_trace_err(J, LJ_TRERR_NYIRETL);  /* NYI: handle return to C frame. */
>     }
> -  lua_assert(J->baseslot >= 1+LJ_FR2);
> +  lj_assertJ(J->baseslot >= 1+LJ_FR2, "bad baseslot for return");
>   }
>   
>   /* -- Metamethod handling ------------------------------------------------- */
> @@ -1168,7 +1199,7 @@ static void rec_mm_comp_cdata(jit_State *J, RecordIndex *ix, int op, MMS mm)
>       ix->tab = ix->val;
>       copyTV(J->L, &ix->tabv, &ix->valv);
>     } else {
> -    lua_assert(tref_iscdata(ix->key));
> +    lj_assertJ(tref_iscdata(ix->key), "cdata expected");
>       ix->tab = ix->key;
>       copyTV(J->L, &ix->tabv, &ix->keyv);
>     }
> @@ -1265,7 +1296,8 @@ static void rec_idx_abc(jit_State *J, TRef asizeref, TRef ikey, uint32_t asize)
>       /* Got scalar evolution analysis results for this reference? */
>       if (ref == J->scev.idx) {
>         int32_t stop;
> -      lua_assert(irt_isint(J->scev.t) && ir->o == IR_SLOAD);
> +      lj_assertJ(irt_isint(J->scev.t) && ir->o == IR_SLOAD,
> +		 "only int SCEV supported");
>         stop = numberVint(&(J->L->base - J->baseslot)[ir->op1 + FORL_STOP]);
>         /* Runtime value for stop of loop is within bounds? */
>         if ((uint64_t)stop + ofs < (uint64_t)asize) {
> @@ -1383,7 +1415,7 @@ TRef lj_record_idx(jit_State *J, RecordIndex *ix)
>   
>     while (!tref_istab(ix->tab)) { /* Handle non-table lookup. */
>       /* Never call raw lj_record_idx() on non-table. */
> -    lua_assert(ix->idxchain != 0);
> +    lj_assertJ(ix->idxchain != 0, "bad usage");
>       if (!lj_record_mm_lookup(J, ix, ix->val ? MM_newindex : MM_index))
>         lj_trace_err(J, LJ_TRERR_NOMM);
>     handlemm:
> @@ -1467,10 +1499,10 @@ TRef lj_record_idx(jit_State *J, RecordIndex *ix)
>   	emitir(IRTG(oldv == niltvg(J2G(J)) ? IR_EQ : IR_NE, IRT_PGC),
>   	       xref, lj_ir_kkptr(J, niltvg(J2G(J))));
>         if (ix->idxchain && lj_record_mm_lookup(J, ix, MM_newindex)) {
> -	lua_assert(hasmm);
> +	lj_assertJ(hasmm, "inconsistent metamethod handling");
>   	goto handlemm;
>         }
> -      lua_assert(!hasmm);
> +      lj_assertJ(!hasmm, "inconsistent metamethod handling");
>         if (oldv == niltvg(J2G(J))) {  /* Need to insert a new key. */
>   	TRef key = ix->key;
>   	if (tref_isinteger(key))  /* NEWREF needs a TValue as a key. */
> @@ -1578,7 +1610,7 @@ static TRef rec_upvalue(jit_State *J, uint32_t uv, TRef val)
>     int needbarrier = 0;
>     if (rec_upvalue_constify(J, uvp)) {  /* Try to constify immutable upvalue. */
>       TRef tr, kfunc;
> -    lua_assert(val == 0);
> +    lj_assertJ(val == 0, "bad usage");
>       if (!tref_isk(fn)) {  /* Late specialization of current function. */
>         if (J->pt->flags >= PROTO_CLC_POLY)
>   	goto noconstify;
> @@ -1700,7 +1732,7 @@ static void rec_func_vararg(jit_State *J)
>   {
>     GCproto *pt = J->pt;
>     BCReg s, fixargs, vframe = J->maxslot+1+LJ_FR2;
> -  lua_assert((pt->flags & PROTO_VARARG));
> +  lj_assertJ((pt->flags & PROTO_VARARG), "FUNCV in non-vararg function");
>     if (J->baseslot + vframe + pt->framesize >= LJ_MAX_JSLOTS)
>       lj_trace_err(J, LJ_TRERR_STACKOV);
>     J->base[vframe-1-LJ_FR2] = J->base[-1-LJ_FR2];  /* Copy function up. */
> @@ -1769,7 +1801,7 @@ static void rec_varg(jit_State *J, BCReg dst, ptrdiff_t nresults)
>   {
>     int32_t numparams = J->pt->numparams;
>     ptrdiff_t nvararg = frame_delta(J->L->base-1) - numparams - 1 - LJ_FR2;
> -  lua_assert(frame_isvarg(J->L->base-1));
> +  lj_assertJ(frame_isvarg(J->L->base-1), "VARG in non-vararg frame");
>     if (LJ_FR2 && dst > J->maxslot)
>       J->base[dst-1] = 0;  /* Prevent resurrection of unrelated slot. */
>     if (J->framedepth > 0) {  /* Simple case: varargs defined on-trace. */
> @@ -1887,7 +1919,7 @@ static TRef rec_cat(jit_State *J, BCReg baseslot, BCReg topslot)
>     TValue savetv[5];
>     BCReg s;
>     RecordIndex ix;
> -  lua_assert(baseslot < topslot);
> +  lj_assertJ(baseslot < topslot, "bad CAT arg");
>     for (s = baseslot; s <= topslot; s++)
>       (void)getslot(J, s);  /* Ensure all arguments have a reference. */
>     if (tref_isnumber_str(top[0]) && tref_isnumber_str(top[-1])) {
> @@ -2011,7 +2043,7 @@ void lj_record_ins(jit_State *J)
>         if (bc_op(*J->pc) >= BC__MAX)
>   	return;
>         break;
> -    default: lua_assert(0); break;
> +    default: lj_assertJ(0, "bad post-processing mode"); break;
>       }
>       J->postproc = LJ_POST_NONE;
>     }
> @@ -2379,7 +2411,8 @@ void lj_record_ins(jit_State *J)
>         J->loopref = J->cur.nins;
>       break;
>     case BC_JFORI:
> -    lua_assert(bc_op(pc[(ptrdiff_t)rc-BCBIAS_J]) == BC_JFORL);
> +    lj_assertJ(bc_op(pc[(ptrdiff_t)rc-BCBIAS_J]) == BC_JFORL,
> +	       "JFORI does not point to JFORL");
>       if (rec_for(J, pc, 0) != LOOPEV_LEAVE)  /* Link to existing loop. */
>         lj_record_stop(J, LJ_TRLINK_ROOT, bc_d(pc[(ptrdiff_t)rc-BCBIAS_J]));
>       /* Continue tracing if the loop is not entered. */
> @@ -2432,7 +2465,8 @@ void lj_record_ins(jit_State *J)
>       rec_func_lua(J);
>       break;
>     case BC_JFUNCV:
> -    lua_assert(0);  /* Cannot happen. No hotcall counting for varag funcs. */
> +    /* Cannot happen. No hotcall counting for vararg funcs. */
> +    lj_assertJ(0, "unsupported vararg hotcall");
>       break;
>   
>     case BC_FUNCC:
> @@ -2492,11 +2526,11 @@ static const BCIns *rec_setup_root(jit_State *J)
>       J->bc_min = pc;
>       break;
>     case BC_ITERL:
> -    lua_assert(bc_op(pc[-1]) == BC_ITERC);
> +    lj_assertJ(bc_op(pc[-1]) == BC_ITERC, "no ITERC before ITERL");
>       J->maxslot = ra + bc_b(pc[-1]) - 1;
>       J->bc_extent = (MSize)(-bc_j(ins))*sizeof(BCIns);
>       pc += 1+bc_j(ins);
> -    lua_assert(bc_op(pc[-1]) == BC_JMP);
> +    lj_assertJ(bc_op(pc[-1]) == BC_JMP, "ITERL does not point to JMP+1");
>       J->bc_min = pc;
>       break;
>     case BC_LOOP:
> @@ -2528,7 +2562,7 @@ static const BCIns *rec_setup_root(jit_State *J)
>       pc++;
>       break;
>     default:
> -    lua_assert(0);
> +    lj_assertJ(0, "bad root trace start bytecode %d", bc_op(ins));
>       break;
>     }
>     return pc;
> diff --git a/src/lj_snap.c b/src/lj_snap.c
> index 9146cddc..2dc281cb 100644
> --- a/src/lj_snap.c
> +++ b/src/lj_snap.c
> @@ -110,7 +110,7 @@ static MSize snapshot_framelinks(jit_State *J, SnapEntry *map, uint8_t *topslot)
>     cTValue *ftop = isluafunc(fn) ? (frame+funcproto(fn)->framesize) : J->L->top;
>   #if LJ_FR2
>     uint64_t pcbase = (u64ptr(J->pc) << 8) | (J->baseslot - 2);
> -  lua_assert(2 <= J->baseslot && J->baseslot <= 257);
> +  lj_assertJ(2 <= J->baseslot && J->baseslot <= 257, "bad baseslot");
>     memcpy(map, &pcbase, sizeof(uint64_t));
>   #else
>     MSize f = 0;
> @@ -129,7 +129,7 @@ static MSize snapshot_framelinks(jit_State *J, SnapEntry *map, uint8_t *topslot)
>   #endif
>         frame = frame_prevd(frame);
>       } else {
> -      lua_assert(!frame_isc(frame));
> +      lj_assertJ(!frame_isc(frame), "broken frame chain");
>   #if !LJ_FR2
>         map[f++] = SNAP_MKFTSZ(frame_ftsz(frame));
>   #endif
> @@ -141,10 +141,10 @@ static MSize snapshot_framelinks(jit_State *J, SnapEntry *map, uint8_t *topslot)
>     }
>     *topslot = (uint8_t)(ftop - lim);
>   #if LJ_FR2
> -  lua_assert(sizeof(SnapEntry) * 2 == sizeof(uint64_t));
> +  lj_assertJ(sizeof(SnapEntry) * 2 == sizeof(uint64_t), "bad SnapEntry def");
>     return 2;
>   #else
> -  lua_assert(f == (MSize)(1 + J->framedepth));
> +  lj_assertJ(f == (MSize)(1 + J->framedepth), "miscalculated snapshot size");
>     return f;
>   #endif
>   }
> @@ -223,7 +223,8 @@ static BCReg snap_usedef(jit_State *J, uint8_t *udf,
>   #define DEF_SLOT(s)		udf[(s)] *= 3
>   
>     /* Scan through following bytecode and check for uses/defs. */
> -  lua_assert(pc >= proto_bc(J->pt) && pc < proto_bc(J->pt) + J->pt->sizebc);
> +  lj_assertJ(pc >= proto_bc(J->pt) && pc < proto_bc(J->pt) + J->pt->sizebc,
> +	     "snapshot PC out of range");
>     for (;;) {
>       BCIns ins = *pc++;
>       BCOp op = bc_op(ins);
> @@ -234,7 +235,7 @@ static BCReg snap_usedef(jit_State *J, uint8_t *udf,
>       switch (bcmode_c(op)) {
>       case BCMvar: USE_SLOT(bc_c(ins)); break;
>       case BCMrbase:
> -      lua_assert(op == BC_CAT);
> +      lj_assertJ(op == BC_CAT, "unhandled op %d with RC rbase", op);
>         for (s = bc_b(ins); s <= bc_c(ins); s++) USE_SLOT(s);
>         for (; s < maxslot; s++) DEF_SLOT(s);
>         break;
> @@ -288,7 +289,8 @@ static BCReg snap_usedef(jit_State *J, uint8_t *udf,
>         break;
>       default: break;
>       }
> -    lua_assert(pc >= proto_bc(J->pt) && pc < proto_bc(J->pt) + J->pt->sizebc);
> +    lj_assertJ(pc >= proto_bc(J->pt) && pc < proto_bc(J->pt) + J->pt->sizebc,
> +	       "use/def analysis PC out of range");
>     }
>   
>   #undef USE_SLOT
> @@ -361,19 +363,20 @@ static RegSP snap_renameref(GCtrace *T, SnapNo lim, IRRef ref, RegSP rs)
>   }
>   
>   /* Copy RegSP from parent snapshot to the parent links of the IR. */
> -IRIns *lj_snap_regspmap(GCtrace *T, SnapNo snapno, IRIns *ir)
> +IRIns *lj_snap_regspmap(jit_State *J, GCtrace *T, SnapNo snapno, IRIns *ir)
>   {
>     SnapShot *snap = &T->snap[snapno];
>     SnapEntry *map = &T->snapmap[snap->mapofs];
>     BloomFilter rfilt = snap_renamefilter(T, snapno);
>     MSize n = 0;
>     IRRef ref = 0;
> +  UNUSED(J);
>     for ( ; ; ir++) {
>       uint32_t rs;
>       if (ir->o == IR_SLOAD) {
>         if (!(ir->op2 & IRSLOAD_PARENT)) break;
>         for ( ; ; n++) {
> -	lua_assert(n < snap->nent);
> +	lj_assertJ(n < snap->nent, "slot %d not found in snapshot", ir->op1);
>   	if (snap_slot(map[n]) == ir->op1) {
>   	  ref = snap_ref(map[n++]);
>   	  break;
> @@ -390,7 +393,7 @@ IRIns *lj_snap_regspmap(GCtrace *T, SnapNo snapno, IRIns *ir)
>       if (bloomtest(rfilt, ref))
>         rs = snap_renameref(T, snapno, ref, rs);
>       ir->prev = (uint16_t)rs;
> -    lua_assert(regsp_used(rs));
> +    lj_assertJ(regsp_used(rs), "unused IR %04d in snapshot", ref - REF_BIAS);
>     }
>     return ir;
>   }
> @@ -408,7 +411,7 @@ static TRef snap_replay_const(jit_State *J, IRIns *ir)
>     case IR_KNUM: case IR_KINT64:
>       return lj_ir_k64(J, (IROp)ir->o, ir_k64(ir)->u64);
>     case IR_KPTR: return lj_ir_kptr(J, ir_kptr(ir));  /* Continuation. */
> -  default: lua_assert(0); return TREF_NIL; break;
> +  default: lj_assertJ(0, "bad IR constant op %d", ir->o); return TREF_NIL;
>     }
>   }
>   
> @@ -486,7 +489,7 @@ void lj_snap_replay(jit_State *J, GCtrace *T)
>   	tr = snap_replay_const(J, ir);
>       } else if (!regsp_used(ir->prev)) {
>         pass23 = 1;
> -      lua_assert(s != 0);
> +      lj_assertJ(s != 0, "unused slot 0 in snapshot");
>         tr = s;
>       } else {
>         IRType t = irt_type(ir->t);
> @@ -512,8 +515,9 @@ void lj_snap_replay(jit_State *J, GCtrace *T)
>         if (regsp_reg(ir->r) == RID_SUNK) {
>   	if (J->slot[snap_slot(sn)] != snap_slot(sn)) continue;
>   	pass23 = 1;
> -	lua_assert(ir->o == IR_TNEW || ir->o == IR_TDUP ||
> -		   ir->o == IR_CNEW || ir->o == IR_CNEWI);
> +	lj_assertJ(ir->o == IR_TNEW || ir->o == IR_TDUP ||
> +		   ir->o == IR_CNEW || ir->o == IR_CNEWI,
> +		   "sunk parent IR %04d has bad op %d", refp - REF_BIAS, ir->o);
>   	if (ir->op1 >= T->nk) snap_pref(J, T, map, nent, seen, ir->op1);
>   	if (ir->op2 >= T->nk) snap_pref(J, T, map, nent, seen, ir->op2);
>   	if (LJ_HASFFI && ir->o == IR_CNEWI) {
> @@ -531,7 +535,8 @@ void lj_snap_replay(jit_State *J, GCtrace *T)
>   	    }
>   	}
>         } else if (!irref_isk(refp) && !regsp_used(ir->prev)) {
> -	lua_assert(ir->o == IR_CONV && ir->op2 == IRCONV_NUM_INT);
> +	lj_assertJ(ir->o == IR_CONV && ir->op2 == IRCONV_NUM_INT,
> +		   "sunk parent IR %04d has bad op %d", refp - REF_BIAS, ir->o);
>   	J->slot[snap_slot(sn)] = snap_pref(J, T, map, nent, seen, ir->op1);
>         }
>       }
> @@ -581,7 +586,9 @@ void lj_snap_replay(jit_State *J, GCtrace *T)
>   	      val = snap_pref(J, T, map, nent, seen, irs->op2);
>   	      if (val == 0) {
>   		IRIns *irc = &T->ir[irs->op2];
> -		lua_assert(irc->o == IR_CONV && irc->op2 == IRCONV_NUM_INT);
> +		lj_assertJ(irc->o == IR_CONV && irc->op2 == IRCONV_NUM_INT,
> +			   "sunk store for parent IR %04d with bad op %d",
> +			   refp - REF_BIAS, irc->o);
>   		val = snap_pref(J, T, map, nent, seen, irc->op1);
>   		val = emitir(IRTN(IR_CONV), val, IRCONV_NUM_INT);
>   	      } else if ((LJ_SOFTFP32 || (LJ_32 && LJ_HASFFI)) &&
> @@ -634,7 +641,9 @@ static void snap_restoreval(jit_State *J, GCtrace *T, ExitState *ex,
>       if (ir->o == IR_KPTR) {
>         o->u64 = (uint64_t)(uintptr_t)ir_kptr(ir);
>       } else {
> -      lua_assert(!(ir->o == IR_KKPTR || ir->o == IR_KNULL));
> +      lj_assertJ(!(ir->o == IR_KKPTR || ir->o == IR_KNULL),
> +		 "restore of const from IR %04d with bad op %d",
> +		 ref - REF_BIAS, ir->o);
>         lj_ir_kvalue(J->L, o, ir);
>       }
>       return;
> @@ -655,13 +664,14 @@ static void snap_restoreval(jit_State *J, GCtrace *T, ExitState *ex,
>         o->u64 = *(uint64_t *)sps;
>   #endif
>       } else {
> -      lua_assert(!irt_ispri(t));  /* PRI refs never have a spill slot. */
> +      lj_assertJ(!irt_ispri(t), "PRI ref with spill slot");
>         setgcV(J->L, o, (GCobj *)(uintptr_t)*(GCSize *)sps, irt_toitype(t));
>       }
>     } else {  /* Restore from register. */
>       Reg r = regsp_reg(rs);
>       if (ra_noreg(r)) {
> -      lua_assert(ir->o == IR_CONV && ir->op2 == IRCONV_NUM_INT);
> +      lj_assertJ(ir->o == IR_CONV && ir->op2 == IRCONV_NUM_INT,
> +		 "restore from IR %04d has no reg", ref - REF_BIAS);
>         snap_restoreval(J, T, ex, snapno, rfilt, ir->op1, o);
>         if (LJ_DUALNUM) setnumV(o, (lua_Number)intV(o));
>         return;
> @@ -689,7 +699,7 @@ static void snap_restoreval(jit_State *J, GCtrace *T, ExitState *ex,
>   
>   #if LJ_HASFFI
>   /* Restore raw data from the trace exit state. */
> -static void snap_restoredata(GCtrace *T, ExitState *ex,
> +static void snap_restoredata(jit_State *J, GCtrace *T, ExitState *ex,
>   			     SnapNo snapno, BloomFilter rfilt,
>   			     IRRef ref, void *dst, CTSize sz)
>   {
> @@ -697,6 +707,7 @@ static void snap_restoredata(GCtrace *T, ExitState *ex,
>     RegSP rs = ir->prev;
>     int32_t *src;
>     uint64_t tmp;
> +  UNUSED(J);
>     if (irref_isk(ref)) {
>       if (ir_isk64(ir)) {
>         src = (int32_t *)&ir[1];
> @@ -719,8 +730,9 @@ static void snap_restoredata(GCtrace *T, ExitState *ex,
>         Reg r = regsp_reg(rs);
>         if (ra_noreg(r)) {
>   	/* Note: this assumes CNEWI is never used for SOFTFP split numbers. */
> -	lua_assert(sz == 8 && ir->o == IR_CONV && ir->op2 == IRCONV_NUM_INT);
> -	snap_restoredata(T, ex, snapno, rfilt, ir->op1, dst, 4);
> +	lj_assertJ(sz == 8 && ir->o == IR_CONV && ir->op2 == IRCONV_NUM_INT,
> +		   "restore from IR %04d has no reg", ref - REF_BIAS);
> +	snap_restoredata(J, T, ex, snapno, rfilt, ir->op1, dst, 4);
>   	*(lua_Number *)dst = (lua_Number)*(int32_t *)dst;
>   	return;
>         }
> @@ -741,7 +753,8 @@ static void snap_restoredata(GCtrace *T, ExitState *ex,
>         if (LJ_64 && LJ_BE && sz == 4) src++;
>       }
>     }
> -  lua_assert(sz == 1 || sz == 2 || sz == 4 || sz == 8);
> +  lj_assertJ(sz == 1 || sz == 2 || sz == 4 || sz == 8,
> +	     "restore from IR %04d with bad size %d", ref - REF_BIAS, sz);
>     if (sz == 4) *(int32_t *)dst = *src;
>     else if (sz == 8) *(int64_t *)dst = *(int64_t *)src;
>     else if (sz == 1) *(int8_t *)dst = (int8_t)*src;
> @@ -754,8 +767,9 @@ static void snap_unsink(jit_State *J, GCtrace *T, ExitState *ex,
>   			SnapNo snapno, BloomFilter rfilt,
>   			IRIns *ir, TValue *o)
>   {
> -  lua_assert(ir->o == IR_TNEW || ir->o == IR_TDUP ||
> -	     ir->o == IR_CNEW || ir->o == IR_CNEWI);
> +  lj_assertJ(ir->o == IR_TNEW || ir->o == IR_TDUP ||
> +	     ir->o == IR_CNEW || ir->o == IR_CNEWI,
> +	     "sunk allocation with bad op %d", ir->o);
>   #if LJ_HASFFI
>     if (ir->o == IR_CNEW || ir->o == IR_CNEWI) {
>       CTState *cts = ctype_cts(J->L);
> @@ -766,13 +780,14 @@ static void snap_unsink(jit_State *J, GCtrace *T, ExitState *ex,
>       setcdataV(J->L, o, cd);
>       if (ir->o == IR_CNEWI) {
>         uint8_t *p = (uint8_t *)cdataptr(cd);
> -      lua_assert(sz == 4 || sz == 8);
> +      lj_assertJ(sz == 4 || sz == 8, "sunk cdata with bad size %d", sz);
>         if (LJ_32 && sz == 8 && ir+1 < T->ir + T->nins && (ir+1)->o == IR_HIOP) {
> -	snap_restoredata(T, ex, snapno, rfilt, (ir+1)->op2, LJ_LE?p+4:p, 4);
> +	snap_restoredata(J, T, ex, snapno, rfilt, (ir+1)->op2,
> +			 LJ_LE ? p+4 : p, 4);
>   	if (LJ_BE) p += 4;
>   	sz = 4;
>         }
> -      snap_restoredata(T, ex, snapno, rfilt, ir->op2, p, sz);
> +      snap_restoredata(J, T, ex, snapno, rfilt, ir->op2, p, sz);
>       } else {
>         IRIns *irs, *irlast = &T->ir[T->snap[snapno].ref];
>         for (irs = ir+1; irs < irlast; irs++)
> @@ -780,8 +795,11 @@ static void snap_unsink(jit_State *J, GCtrace *T, ExitState *ex,
>   	  IRIns *iro = &T->ir[T->ir[irs->op1].op2];
>   	  uint8_t *p = (uint8_t *)cd;
>   	  CTSize szs;
> -	  lua_assert(irs->o == IR_XSTORE && T->ir[irs->op1].o == IR_ADD);
> -	  lua_assert(iro->o == IR_KINT || iro->o == IR_KINT64);
> +	  lj_assertJ(irs->o == IR_XSTORE, "sunk store with bad op %d", irs->o);
> +	  lj_assertJ(T->ir[irs->op1].o == IR_ADD,
> +		     "sunk store with bad add op %d", T->ir[irs->op1].o);
> +	  lj_assertJ(iro->o == IR_KINT || iro->o == IR_KINT64,
> +		     "sunk store with bad const offset op %d", iro->o);
>   	  if (irt_is64(irs->t)) szs = 8;
>   	  else if (irt_isi8(irs->t) || irt_isu8(irs->t)) szs = 1;
>   	  else if (irt_isi16(irs->t) || irt_isu16(irs->t)) szs = 2;
> @@ -790,14 +808,16 @@ static void snap_unsink(jit_State *J, GCtrace *T, ExitState *ex,
>   	    p += (int64_t)ir_k64(iro)->u64;
>   	  else
>   	    p += iro->i;
> -	  lua_assert(p >= (uint8_t *)cdataptr(cd) &&
> -		     p + szs <= (uint8_t *)cdataptr(cd) + sz);
> +	  lj_assertJ(p >= (uint8_t *)cdataptr(cd) &&
> +		     p + szs <= (uint8_t *)cdataptr(cd) + sz,
> +		     "sunk store with offset out of range");
>   	  if (LJ_32 && irs+1 < T->ir + T->nins && (irs+1)->o == IR_HIOP) {
> -	    lua_assert(szs == 4);
> -	    snap_restoredata(T, ex, snapno, rfilt, (irs+1)->op2, LJ_LE?p+4:p,4);
> +	    lj_assertJ(szs == 4, "sunk store with bad size %d", szs);
> +	    snap_restoredata(J, T, ex, snapno, rfilt, (irs+1)->op2,
> +			     LJ_LE ? p+4 : p, 4);
>   	    if (LJ_BE) p += 4;
>   	  }
> -	  snap_restoredata(T, ex, snapno, rfilt, irs->op2, p, szs);
> +	  snap_restoredata(J, T, ex, snapno, rfilt, irs->op2, p, szs);
>   	}
>       }
>     } else
> @@ -812,10 +832,12 @@ static void snap_unsink(jit_State *J, GCtrace *T, ExitState *ex,
>         if (irs->r == RID_SINK && snap_sunk_store(T, ir, irs)) {
>   	IRIns *irk = &T->ir[irs->op1];
>   	TValue tmp, *val;
> -	lua_assert(irs->o == IR_ASTORE || irs->o == IR_HSTORE ||
> -		   irs->o == IR_FSTORE);
> +	lj_assertJ(irs->o == IR_ASTORE || irs->o == IR_HSTORE ||
> +		   irs->o == IR_FSTORE,
> +		   "sunk store with bad op %d", irs->o);
>   	if (irk->o == IR_FREF) {
> -	  lua_assert(irk->op2 == IRFL_TAB_META);
> +	  lj_assertJ(irk->op2 == IRFL_TAB_META,
> +		     "sunk store with bad field %d", irk->op2);
>   	  snap_restoreval(J, T, ex, snapno, rfilt, irs->op2, &tmp);
>   	  /* NOBARRIER: The table is new (marked white). */
>   	  setgcref(t->metatable, obj2gco(tabV(&tmp)));
> @@ -903,7 +925,7 @@ const BCIns *lj_snap_restore(jit_State *J, void *exptr)
>   #if LJ_FR2
>     L->base += (map[nent+LJ_BE] & 0xff);
>   #endif
> -  lua_assert(map + nent == flinks);
> +  lj_assertJ(map + nent == flinks, "inconsistent frames in snapshot");
>   
>     /* Compute current stack top. */
>     switch (bc_op(*pc)) {
> diff --git a/src/lj_snap.h b/src/lj_snap.h
> index 2c9ae3d6..4aec8509 100644
> --- a/src/lj_snap.h
> +++ b/src/lj_snap.h
> @@ -13,7 +13,8 @@
>   LJ_FUNC void lj_snap_add(jit_State *J);
>   LJ_FUNC void lj_snap_purge(jit_State *J);
>   LJ_FUNC void lj_snap_shrink(jit_State *J);
> -LJ_FUNC IRIns *lj_snap_regspmap(GCtrace *T, SnapNo snapno, IRIns *ir);
> +LJ_FUNC IRIns *lj_snap_regspmap(jit_State *J, GCtrace *T, SnapNo snapno,
> +				IRIns *ir);
>   LJ_FUNC void lj_snap_replay(jit_State *J, GCtrace *T);
>   LJ_FUNC const BCIns *lj_snap_restore(jit_State *J, void *exptr);
>   LJ_FUNC void lj_snap_grow_buf_(jit_State *J, MSize need);
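One more recurring change visible in the lj_snap.c hunks above: lj_snap_regspmap() and snap_restoredata() now take `jit_State *J` solely so the assertions have a context to report through, and mark it with UNUSED(J) because the parameter becomes dead once the checks are compiled out. A trimmed illustration of that idiom, with hypothetical struct and field names and plain assert() standing in for lj_assertJ():

#include <assert.h>

#define UNUSED(x)  ((void)(x))

struct jit_ctx { int nins; };

static int load_ins(struct jit_ctx *J, const int *ir, int ref)
{
  UNUSED(J);  /* Only consumed by the assert below; dead under NDEBUG. */
  assert(ref < J->nins && "IR reference out of range");
  return ir[ref];
}

int main(void)
{
  struct jit_ctx J = { 4 };
  int ir[4] = { 10, 20, 30, 40 };
  return load_ins(&J, ir, 2) == 30 ? 0 : 1;
}
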
> diff --git a/src/lj_state.c b/src/lj_state.c
> index 4add3d65..684336d5 100644
> --- a/src/lj_state.c
> +++ b/src/lj_state.c
> @@ -70,7 +70,8 @@ static void resizestack(lua_State *L, MSize n)
>     GCobj *up;
>     int32_t oldvmstate = G(L)->vmstate;
>   
> -  lua_assert((MSize)(tvref(L->maxstack)-oldst)==L->stacksize-LJ_STACK_EXTRA-1);
> +  lj_assertL((MSize)(tvref(L->maxstack)-oldst) == L->stacksize-LJ_STACK_EXTRA-1,
> +	     "inconsistent stack size");
>   
>     /*
>     ** Lua stack is inconsistent while reallocation, profilers
> @@ -182,8 +183,9 @@ static void close_state(lua_State *L)
>     global_State *g = G(L);
>     lj_func_closeuv(L, tvref(L->stack));
>     lj_gc_freeall(g);
> -  lua_assert(gcref(g->gc.root) == obj2gco(L));
> -  lua_assert(g->strnum == 0);
> +  lj_assertG(gcref(g->gc.root) == obj2gco(L),
> +	     "main thread is not first GC object");
> +  lj_assertG(g->strnum == 0, "leaked %d strings", g->strnum);
>     lj_trace_freestate(g);
>   #if LJ_HASFFI
>     lj_ctype_freestate(g);
> @@ -197,7 +199,9 @@ static void close_state(lua_State *L)
>       lj_mem_freevec(g, mref(g->gc.lightudseg, uint32_t), segnum, uint32_t);
>     }
>   #endif
> -  lua_assert(g->gc.total == sizeof(GG_State));
> +  lj_assertG(g->gc.total == sizeof(GG_State),
> +	     "memory leak of %lld bytes",
> +	     (long long)(g->gc.total - sizeof(GG_State)));
>   #ifndef LUAJIT_USE_SYSMALLOC
>     if (g->allocf == lj_alloc_f)
>       lj_alloc_destroy(g->allocd);
> @@ -315,17 +319,17 @@ lua_State *lj_state_new(lua_State *L)
>     setmrefr(L1->glref, L->glref);
>     setgcrefr(L1->env, L->env);
>     stack_init(L1, L);  /* init stack */
> -  lua_assert(iswhite(obj2gco(L1)));
> +  lj_assertL(iswhite(obj2gco(L1)), "new thread object is not white");
>     return L1;
>   }
>   
>   void LJ_FASTCALL lj_state_free(global_State *g, lua_State *L)
>   {
> -  lua_assert(L != mainthread(g));
> +  lj_assertG(L != mainthread(g), "free of main thread");
>     if (obj2gco(L) == gcref(g->cur_L))
>       setgcrefnull(g->cur_L);
>     lj_func_closeuv(L, tvref(L->stack));
> -  lua_assert(gcref(L->openupval) == NULL);
> +  lj_assertG(gcref(L->openupval) == NULL, "stale open upvalues");
>     lj_mem_freevec(g, tvref(L->stack), L->stacksize, TValue);
>     lj_mem_freet(g, L);
>   }
> diff --git a/src/lj_str.c b/src/lj_str.c
> index 8ff955ed..321e8c4f 100644
> --- a/src/lj_str.c
> +++ b/src/lj_str.c
> @@ -53,8 +53,9 @@ int32_t LJ_FASTCALL lj_str_cmp(GCstr *a, GCstr *b)
>   static LJ_AINLINE int str_fastcmp(const char *a, const char *b, MSize len)
>   {
>     MSize i = 0;
> -  lua_assert(len > 0);
> -  lua_assert((((uintptr_t)a+len-1) & (LJ_PAGESIZE-1)) <= LJ_PAGESIZE-4);
> +  lj_assertX(len > 0, "fast string compare with zero length");
> +  lj_assertX((((uintptr_t)a+len-1) & (LJ_PAGESIZE-1)) <= LJ_PAGESIZE-4,
> +	     "fast string compare crossing page boundary");
>     do {  /* Note: innocuous access up to end of string + 3. */
>       uint32_t v = lj_getu32(a+i) ^ *(const uint32_t *)(b+i);
>       if (v) {
> @@ -138,7 +139,7 @@ lj_fullhash(const uint8_t *v, MSize len)
>     MSize c = 0xcafedead;
>     MSize d = 0xdeadbeef;
>     MSize h = len;
> -  lua_assert(len >= 12);
> +  lj_assertX(len >= 12, "full hash calculation for too short (%d) string", len);
>     for(; len>8; len-=8, v+=8) {
>       a ^= lj_getu32(v);
>       b ^= lj_getu32(v+4);
> diff --git a/src/lj_strfmt.c b/src/lj_strfmt.c
> index 237cc575..ff5568c3 100644
> --- a/src/lj_strfmt.c
> +++ b/src/lj_strfmt.c
> @@ -320,7 +320,7 @@ SBuf *lj_strfmt_putfxint(SBuf *sb, SFormat sf, uint64_t k)
>     if ((sf & STRFMT_F_LEFT))
>       while (width-- > pprec) *p++ = ' ';
>   
> -  lua_assert(need == (MSize)(p - ps));
> +  lj_assertX(need == (MSize)(p - ps), "miscalculated format size");
>     setsbufP(sb, p);
>     return sb;
>   }
> @@ -449,7 +449,7 @@ const char *lj_strfmt_pushvf(lua_State *L, const char *fmt, va_list argp)
>       case STRFMT_ERR:
>       default:
>         lj_buf_putb(sb, '?');
> -      lua_assert(0);
> +      lj_assertL(0, "bad string format near offset %d", fs.len);
>         break;
>       }
>     }
> diff --git a/src/lj_strfmt.h b/src/lj_strfmt.h
> index 6e1d9017..0e1d8946 100644
> --- a/src/lj_strfmt.h
> +++ b/src/lj_strfmt.h
> @@ -79,7 +79,8 @@ static LJ_AINLINE void lj_strfmt_init(FormatState *fs, const char *p, MSize len)
>   {
>     fs->p = (const uint8_t *)p;
>     fs->e = (const uint8_t *)p + len;
> -  lua_assert(*fs->e == 0);  /* Must be NUL-terminated (may have NULs inside). */
> +  /* Must be NUL-terminated. May have NULs inside, too. */
> +  lj_assertX(*fs->e == 0, "format not NUL-terminated");
>   }
>   
>   /* Raw conversions. */
> diff --git a/src/lj_strfmt_num.c b/src/lj_strfmt_num.c
> index 9271f68a..c26204b7 100644
> --- a/src/lj_strfmt_num.c
> +++ b/src/lj_strfmt_num.c
> @@ -257,7 +257,7 @@ static int nd_similar(uint32_t* nd, uint32_t ndhi, uint32_t* ref, MSize hilen,
>     } else {
>       prec -= hilen - 9;
>     }
> -  lua_assert(prec < 9);
> +  lj_assertX(prec < 9, "bad precision %d", prec);
>     lj_strfmt_wuint9(nd9, nd[ndhi]);
>     lj_strfmt_wuint9(ref9, *ref);
>     return !memcmp(nd9, ref9, prec) && (nd9[prec] < '5') == (ref9[prec] < '5');
> @@ -414,14 +414,14 @@ static char *lj_strfmt_wfnum(SBuf *sb, SFormat sf, lua_Number n, char *p)
>   	** Rescaling was performed, but this introduced some error, and might
>   	** have pushed us across a rounding boundary. We check whether this
>   	** error affected the result by introducing even more error (2ulp in
> -	** either direction), and seeing whether a roundary boundary was
> +	** either direction), and seeing whether a rounding boundary was
>   	** crossed. Having already converted the -2ulp case, we save off its
>   	** most significant digits, convert the +2ulp case, and compare them.
>   	*/
>   	int32_t eidx = e + 70 + (ND_MUL2K_MAX_SHIFT < 29)
>   			 + (t.u32.lo >= 0xfffffffe && !(~t.u32.hi << 12));
>   	const int8_t *m_e = four_ulp_m_e + eidx * 2;
> -	lua_assert(0 <= eidx && eidx < 128);
> +	lj_assertG_(G(sbufL(sb)), 0 <= eidx && eidx < 128, "bad eidx %d", eidx);
>   	nd[33] = nd[ndhi];
>   	nd[32] = nd[(ndhi - 1) & 0x3f];
>   	nd[31] = nd[(ndhi - 2) & 0x3f];
> diff --git a/src/lj_strscan.c b/src/lj_strscan.c
> index 11d341ee..bb07b251 100644
> --- a/src/lj_strscan.c
> +++ b/src/lj_strscan.c
> @@ -93,7 +93,7 @@ static void strscan_double(uint64_t x, TValue *o, int32_t ex2, int32_t neg)
>     }
>   
>     /* Convert to double using a signed int64_t conversion, then rescale. */
> -  lua_assert((int64_t)x >= 0);
> +  lj_assertX((int64_t)x >= 0, "bad double conversion");
>     n = (double)(int64_t)x;
>     if (neg) n = -n;
>     if (ex2) n = ldexp(n, ex2);
> @@ -263,7 +263,7 @@ static StrScanFmt strscan_dec(const uint8_t *p, TValue *o,
>       uint32_t hi = 0, lo = (uint32_t)(xip-xi);
>       int32_t ex2 = 0, idig = (int32_t)lo + (ex10 >> 1);
>   
> -    lua_assert(lo > 0 && (ex10 & 1) == 0);
> +    lj_assertX(lo > 0 && (ex10 & 1) == 0, "bad lo %d ex10 %d", lo, ex10);
>   
>       /* Handle simple overflow/underflow. */
>       if (idig > 310/2) { if (neg) setminfV(o); else setpinfV(o); return fmt; }
> @@ -532,7 +532,7 @@ int LJ_FASTCALL lj_strscan_num(GCstr *str, TValue *o)
>   {
>     StrScanFmt fmt = lj_strscan_scan((const uint8_t *)strdata(str), str->len, o,
>   				   STRSCAN_OPT_TONUM);
> -  lua_assert(fmt == STRSCAN_ERROR || fmt == STRSCAN_NUM);
> +  lj_assertX(fmt == STRSCAN_ERROR || fmt == STRSCAN_NUM, "bad scan format");
>     return (fmt != STRSCAN_ERROR);
>   }
>   
> @@ -541,7 +541,8 @@ int LJ_FASTCALL lj_strscan_number(GCstr *str, TValue *o)
>   {
>     StrScanFmt fmt = lj_strscan_scan((const uint8_t *)strdata(str), str->len, o,
>   				   STRSCAN_OPT_TOINT);
> -  lua_assert(fmt == STRSCAN_ERROR || fmt == STRSCAN_NUM || fmt == STRSCAN_INT);
> +  lj_assertX(fmt == STRSCAN_ERROR || fmt == STRSCAN_NUM || fmt == STRSCAN_INT,
> +	     "bad scan format");
>     if (fmt == STRSCAN_INT) setitype(o, LJ_TISNUM);
>     return (fmt != STRSCAN_ERROR);
>   }
> diff --git a/src/lj_symtab.c b/src/lj_symtab.c
> index 54984c05..38b5e9e1 100644
> --- a/src/lj_symtab.c
> +++ b/src/lj_symtab.c
> @@ -36,8 +36,8 @@ void lj_symtab_dump_trace(struct lj_wbuf *out, const GCtrace *trace)
>     BCLine lineno = 0;
>   
>     const BCIns *startpc = mref(trace->startpc, const BCIns);
> -  lua_assert(startpc >= proto_bc(pt) &&
> -             startpc < proto_bc(pt) + pt->sizebc);
> +  lj_assertX(startpc >= proto_bc(pt) && startpc < proto_bc(pt) + pt->sizebc,
> +	     "start trace PC out of range");
>   
>     lineno = lj_debug_line(pt, proto_bcpos(pt, startpc));
>   
> @@ -354,8 +354,9 @@ static int resolve_symbolnames(struct dl_phdr_info *info, size_t info_size,
>     ** Assertion was taken from the GLIBC tests:
>     ** https://code.woboq.org/userspace/glibc/elf/tst-dlmodcount.c.html#37
>     */
> -  lua_assert(info_size > offsetof(struct dl_phdr_info, dlpi_subs)
> -      + sizeof(info->dlpi_subs));
> +  lj_assertL(info_size > offsetof(struct dl_phdr_info, dlpi_subs)
> +			 + sizeof(info->dlpi_subs),
> +	     "bad dlpi_subs");
>   
>     lib_cnt = info->dlpi_adds - *conf->lib_adds;
>   
> @@ -401,7 +402,7 @@ static int resolve_symbolnames(struct dl_phdr_info *info, size_t info_size,
>         ** sysprof, unless someone have deleted the LuaJIT binary
>         ** right after the start.
>         */
> -      lua_assert(0);
> +      lj_assertL(0, "bad executed binary symtab section");
>     }
>   
>     /*
> diff --git a/src/lj_sysprof.c b/src/lj_sysprof.c
> index 2e9ed9b3..52d4d2a5 100644
> --- a/src/lj_sysprof.c
> +++ b/src/lj_sysprof.c
> @@ -111,9 +111,9 @@ static void stream_epilogue(struct sysprof *sp)
>   
>   static void stream_lfunc(struct lj_wbuf *buf, const GCfunc *func)
>   {
> -  lua_assert(isluafunc(func));
> +  lj_assertX(isluafunc(func), "bad lua function in sysprof stream");
>     const GCproto *pt = funcproto(func);
> -  lua_assert(pt != NULL);
> +  lj_assertX(pt != NULL, "bad lua function prototype in sysprof stream");
>     lj_wbuf_addbyte(buf, LJP_FRAME_LFUNC);
>     lj_wbuf_addu64(buf, (uintptr_t)pt);
>     lj_wbuf_addu64(buf, (uint64_t)pt->firstline);
> @@ -121,14 +121,14 @@ static void stream_lfunc(struct lj_wbuf *buf, const GCfunc *func)
>   
>   static void stream_cfunc(struct lj_wbuf *buf, const GCfunc *func)
>   {
> -  lua_assert(iscfunc(func));
> +  lj_assertX(iscfunc(func), "bad C function in sysprof stream");
>     lj_wbuf_addbyte(buf, LJP_FRAME_CFUNC);
>     lj_wbuf_addu64(buf, (uintptr_t)func->c.f);
>   }
>   
>   static void stream_ffunc(struct lj_wbuf *buf, const GCfunc *func)
>   {
> -  lua_assert(isffunc(func));
> +  lj_assertX(isffunc(func), "bad fast function in sysprof stream");
>     lj_wbuf_addbyte(buf, LJP_FRAME_FFUNC);
>     lj_wbuf_addu64(buf, func->c.ffid);
>   }
> @@ -136,7 +136,7 @@ static void stream_ffunc(struct lj_wbuf *buf, const GCfunc *func)
>   static void stream_frame_lua(struct lj_wbuf *buf, const cTValue *frame)
>   {
>     const GCfunc *func = frame_func(frame);
> -  lua_assert(func != NULL);
> +  lj_assertX(func != NULL, "bad function in sysprof stream");
>     if (isluafunc(func))
>       stream_lfunc(buf, func);
>     else if (isffunc(func))
> @@ -145,7 +145,7 @@ static void stream_frame_lua(struct lj_wbuf *buf, const cTValue *frame)
>       stream_cfunc(buf, func);
>     else
>       /* Unreachable. */
> -    lua_assert(0);
> +    lj_assertX(0, "bad function type in sysprof stream");
>   }
>   
>   static void stream_backtrace_lua(struct sysprof *sp)
> @@ -155,9 +155,9 @@ static void stream_backtrace_lua(struct sysprof *sp)
>     cTValue *top_frame = NULL, *frame = NULL, *bot = NULL;
>     lua_State *L = NULL;
>   
> -  lua_assert(g != NULL);
> +  lj_assertX(g != NULL, "uninitialized global state in sysprof state");
>     L = gco2th(gcref(g->cur_L));
> -  lua_assert(L != NULL);
> +  lj_assertG(L != NULL, "uninitialized Lua state in sysprof state");
>   
>     top_frame = g->top_frame - 1; //(1 + LJ_FR2)
>   
> @@ -200,7 +200,7 @@ static void default_backtrace_host(void *(writer)(int frame_no, void *addr))
>     const int depth = backtrace(backtrace_buf, max_depth);
>     int level;
>   
> -  lua_assert(depth <= max_depth);
> +  lj_assertX(depth <= max_depth, "depth of C stack is too big");
>     for (level = SYSPROF_HANDLER_STACK_DEPTH; level < depth; ++level) {
>       if (!writer(level - SYSPROF_HANDLER_STACK_DEPTH + 1, backtrace_buf[level]))
>         return;
> @@ -209,7 +209,7 @@ static void default_backtrace_host(void *(writer)(int frame_no, void *addr))
>   
>   static void stream_backtrace_host(struct sysprof *sp)
>   {
> -  lua_assert(sp->backtracer != NULL);
> +  lj_assertX(sp->backtracer != NULL, "uninitialized sysprof backtracer");
>     sp->backtracer(stream_frame_host);
>     lj_wbuf_addu64(&sp->out, (uintptr_t)LJP_FRAME_HOST_LAST);
>   }
> @@ -268,9 +268,9 @@ static void stream_event(struct sysprof *sp, uint32_t vmstate)
>   {
>     event_streamer stream = NULL;
>   
> -  lua_assert(vmstfit4(vmstate));
> +  lj_assertX(vmstfit4(vmstate), "vmstate doesn't fit in 4 bits");
>     stream = event_streamers[vmstate];
> -  lua_assert(NULL != stream);
> +  lj_assertX(stream != NULL, "uninitialized sysprof stream");
>     stream(sp, vmstate);
>   }
>   
> @@ -282,7 +282,8 @@ static void sysprof_record_sample(struct sysprof *sp, siginfo_t *info)
>     uint32_t _vmstate = ~(uint32_t)(g->vmstate);
>     uint32_t vmstate = _vmstate < LJ_VMST_TRACE ? _vmstate : LJ_VMST_TRACE;
>   
> -  lua_assert(pthread_self() == sp->thread);
> +  lj_assertX(pthread_self() == sp->thread,
> +	     "bad thread during sysprof record sample");
>   
>     /* Caveat: order of counters must match vmstate order in <lj_obj.h>. */
>     ((uint64_t *)&sp->counters)[vmstate]++;
> @@ -317,7 +318,7 @@ static void sysprof_signal_handler(int sig, siginfo_t *info, void *ctx)
>         break;
>   
>       default:
> -      lua_assert(0);
> +      lj_assertX(0, "bad sysprof profiler state");
>         break;
>     }
>   }
> @@ -344,7 +345,7 @@ static int sysprof_validate(struct sysprof *sp,
>         return PROFILE_ERRRUN;
>   
>       default:
> -      lua_assert(0);
> +      lj_assertX(0, "bad sysprof profiler state");
>         break;
>     }
>   
> diff --git a/src/lj_tab.c b/src/lj_tab.c
> index c5f358e5..1d6a4b7f 100644
> --- a/src/lj_tab.c
> +++ b/src/lj_tab.c
> @@ -38,7 +38,7 @@ static LJ_AINLINE Node *hashmask(const GCtab *t, uint32_t hash)
>   /* Hash an arbitrary key and return its anchor position in the hash table. */
>   static Node *hashkey(const GCtab *t, cTValue *key)
>   {
> -  lua_assert(!tvisint(key));
> +  lj_assertX(!tvisint(key), "attempt to hash integer");
>     if (tvisstr(key))
>       return hashstr(t, strV(key));
>     else if (tvisnum(key))
> @@ -57,7 +57,7 @@ static LJ_AINLINE void newhpart(lua_State *L, GCtab *t, uint32_t hbits)
>   {
>     uint32_t hsize;
>     Node *node;
> -  lua_assert(hbits != 0);
> +  lj_assertL(hbits != 0, "zero hash size");
>     if (hbits > LJ_MAX_HBITS)
>       lj_err_msg(L, LJ_ERR_TABOV);
>     hsize = 1u << hbits;
> @@ -78,7 +78,7 @@ static LJ_AINLINE void clearhpart(GCtab *t)
>   {
>     uint32_t i, hmask = t->hmask;
>     Node *node = noderef(t->node);
> -  lua_assert(t->hmask != 0);
> +  lj_assertX(t->hmask != 0, "empty hash part");
>     for (i = 0; i <= hmask; i++) {
>       Node *n = &node[i];
>       setmref(n->next, NULL);
> @@ -103,7 +103,7 @@ static GCtab *newtab(lua_State *L, uint32_t asize, uint32_t hbits)
>     /* First try to colocate the array part. */
>     if (LJ_MAX_COLOSIZE != 0 && asize > 0 && asize <= LJ_MAX_COLOSIZE) {
>       Node *nilnode;
> -    lua_assert((sizeof(GCtab) & 7) == 0);
> +    lj_assertL((sizeof(GCtab) & 7) == 0, "bad GCtab size");
>       t = (GCtab *)lj_mem_newgco(L, sizetabcolo(asize));
>       t->gct = ~LJ_TTAB;
>       t->nomm = (uint8_t)~0;
> @@ -186,7 +186,8 @@ GCtab * LJ_FASTCALL lj_tab_dup(lua_State *L, const GCtab *kt)
>     GCtab *t;
>     uint32_t asize, hmask;
>     t = newtab(L, kt->asize, kt->hmask > 0 ? lj_fls(kt->hmask)+1 : 0);
> -  lua_assert(kt->asize == t->asize && kt->hmask == t->hmask);
> +  lj_assertL(kt->asize == t->asize && kt->hmask == t->hmask,
> +	     "mismatched size of table and template");
>     t->nomm = 0;  /* Keys with metamethod names may be present. */
>     asize = kt->asize;
>     if (asize > 0) {
> @@ -312,7 +313,7 @@ void lj_tab_resize(lua_State *L, GCtab *t, uint32_t asize, uint32_t hbits)
>   
>   static uint32_t countint(cTValue *key, uint32_t *bins)
>   {
> -  lua_assert(!tvisint(key));
> +  lj_assertX(!tvisint(key), "bad integer key");
>     if (tvisnum(key)) {
>       lua_Number nk = numV(key);
>       int32_t k = lj_num2int(nk);
> @@ -465,7 +466,8 @@ TValue *lj_tab_newkey(lua_State *L, GCtab *t, cTValue *key)
>     if (!tvisnil(&n->val) || t->hmask == 0) {
>       Node *nodebase = noderef(t->node);
>       Node *collide, *freenode = getfreetop(t, nodebase);
> -    lua_assert(freenode >= nodebase && freenode <= nodebase+t->hmask+1);
> +    lj_assertL(freenode >= nodebase && freenode <= nodebase+t->hmask+1,
> +	       "bad freenode");
>       do {
>         if (freenode == nodebase) {  /* No free node found? */
>   	rehashtab(L, t, key);  /* Rehash table. */
> @@ -473,7 +475,7 @@ TValue *lj_tab_newkey(lua_State *L, GCtab *t, cTValue *key)
>         }
>       } while (!tvisnil(&(--freenode)->key));
>       setfreetop(t, nodebase, freenode);
> -    lua_assert(freenode != &G(L)->nilnode);
> +    lj_assertL(freenode != &G(L)->nilnode, "store to fallback hash");
>       collide = hashkey(t, &n->key);
>       if (collide != n) {  /* Colliding node not the main node? */
>         Node *nn;
> @@ -555,7 +557,7 @@ TValue *lj_tab_newkey(lua_State *L, GCtab *t, cTValue *key)
>     if (LJ_UNLIKELY(tvismzero(&n->key)))
>       n->key.u64 = 0;
>     lj_gc_anybarriert(L, t);
> -  lua_assert(tvisnil(&n->val));
> +  lj_assertL(tvisnil(&n->val), "new hash slot is not empty");
>     return &n->val;
>   }
>   
> diff --git a/src/lj_target.h b/src/lj_target.h
> index 8dcae957..b4be6781 100644
> --- a/src/lj_target.h
> +++ b/src/lj_target.h
> @@ -152,7 +152,8 @@ typedef uint32_t RegCost;
>   /* Return the address of an exit stub. */
>   static LJ_AINLINE char *exitstub_addr_(char **group, uint32_t exitno)
>   {
> -  lua_assert(group[exitno / EXITSTUBS_PER_GROUP] != NULL);
> +  lj_assertX(group[exitno / EXITSTUBS_PER_GROUP] != NULL,
> +	     "exit stub group for exit %d uninitialized", exitno);
>     return (char *)group[exitno / EXITSTUBS_PER_GROUP] +
>   	 EXITSTUB_SPACING*(exitno % EXITSTUBS_PER_GROUP);
>   }
> diff --git a/src/lj_trace.c b/src/lj_trace.c
> index 17743159..236e06a0 100644
> --- a/src/lj_trace.c
> +++ b/src/lj_trace.c
> @@ -110,7 +110,8 @@ static void perftools_addtrace(GCtrace *T)
>       name++;
>     else
>       name = "(string)";
> -  lua_assert(startpc >= proto_bc(pt) && startpc < proto_bc(pt) + pt->sizebc);
> +  lj_assertX(startpc >= proto_bc(pt) && startpc < proto_bc(pt) + pt->sizebc,
> +	     "trace PC out of range");
>     lineno = lj_debug_line(pt, proto_bcpos(pt, startpc));
>     if (!fp) {
>       char fname[40];
> @@ -200,7 +201,7 @@ void lj_trace_reenableproto(GCproto *pt)
>   {
>     if ((pt->flags & PROTO_ILOOP)) {
>       BCIns *bc = proto_bc(pt);
> -    BCPos i, sizebc = pt->sizebc;;
> +    BCPos i, sizebc = pt->sizebc;
>       pt->flags &= ~PROTO_ILOOP;
>       if (bc_op(bc[0]) == BC_IFUNCF)
>         setbc_op(&bc[0], BC_FUNCF);
> @@ -222,27 +223,28 @@ static void trace_unpatch(jit_State *J, GCtrace *T)
>       return;  /* No need to unpatch branches in parent traces (yet). */
>     switch (bc_op(*pc)) {
>     case BC_JFORL:
> -    lua_assert(traceref(J, bc_d(*pc)) == T);
> +    lj_assertJ(traceref(J, bc_d(*pc)) == T, "JFORL references other trace");
>       *pc = T->startins;
>       pc += bc_j(T->startins);
> -    lua_assert(bc_op(*pc) == BC_JFORI);
> +    lj_assertJ(bc_op(*pc) == BC_JFORI, "FORL does not point to JFORI");
>       setbc_op(pc, BC_FORI);
>       break;
>     case BC_JITERL:
>     case BC_JLOOP:
> -    lua_assert(op == BC_ITERL || op == BC_LOOP || bc_isret(op));
> +    lj_assertJ(op == BC_ITERL || op == BC_LOOP || bc_isret(op),
> +	       "bad original bytecode %d", op);
>       *pc = T->startins;
>       break;
>     case BC_JMP:
> -    lua_assert(op == BC_ITERL);
> +    lj_assertJ(op == BC_ITERL, "bad original bytecode %d", op);
>       pc += bc_j(*pc)+2;
>       if (bc_op(*pc) == BC_JITERL) {
> -      lua_assert(traceref(J, bc_d(*pc)) == T);
> +      lj_assertJ(traceref(J, bc_d(*pc)) == T, "JITERL references other trace");
>         *pc = T->startins;
>       }
>       break;
>     case BC_JFUNCF:
> -    lua_assert(op == BC_FUNCF);
> +    lj_assertJ(op == BC_FUNCF, "bad original bytecode %d", op);
>       *pc = T->startins;
>       break;
>     default:  /* Already unpatched. */
> @@ -254,7 +256,8 @@ static void trace_unpatch(jit_State *J, GCtrace *T)
>   static void trace_flushroot(jit_State *J, GCtrace *T)
>   {
>     GCproto *pt = &gcref(T->startpt)->pt;
> -  lua_assert(T->root == 0 && pt != NULL);
> +  lj_assertJ(T->root == 0, "not a root trace");
> +  lj_assertJ(pt != NULL, "trace has no prototype");
>     /* First unpatch any modified bytecode. */
>     trace_unpatch(J, T);
>     /* Unlink root trace from chain anchored in prototype. */
> @@ -370,7 +373,8 @@ void lj_trace_freestate(global_State *g)
>     {  /* This assumes all traces have already been freed. */
>       ptrdiff_t i;
>       for (i = 1; i < (ptrdiff_t)J->sizetrace; i++)
> -      lua_assert(i == (ptrdiff_t)J->cur.traceno || traceref(J, i) == NULL);
> +      lj_assertG(i == (ptrdiff_t)J->cur.traceno || traceref(J, i) == NULL,
> +		 "trace still allocated");
>     }
>   #endif
>     lj_mcode_free(J);
> @@ -425,8 +429,9 @@ static void trace_start(jit_State *J)
>     if ((J->pt->flags & PROTO_NOJIT)) {  /* JIT disabled for this proto? */
>       if (J->parent == 0 && J->exitno == 0) {
>         /* Lazy bytecode patching to disable hotcount events. */
> -      lua_assert(bc_op(*J->pc) == BC_FORL || bc_op(*J->pc) == BC_ITERL ||
> -		 bc_op(*J->pc) == BC_LOOP || bc_op(*J->pc) == BC_FUNCF);
> +      lj_assertJ(bc_op(*J->pc) == BC_FORL || bc_op(*J->pc) == BC_ITERL ||
> +		 bc_op(*J->pc) == BC_LOOP || bc_op(*J->pc) == BC_FUNCF,
> +		 "bad hot bytecode %d", bc_op(*J->pc));
>         setbc_op(J->pc, (int)bc_op(*J->pc)+(int)BC_ILOOP-(int)BC_LOOP);
>         J->pt->flags |= PROTO_ILOOP;
>       }
> @@ -437,7 +442,8 @@ static void trace_start(jit_State *J)
>     /* Get a new trace number. */
>     traceno = trace_findfree(J);
>     if (LJ_UNLIKELY(traceno == 0)) {  /* No free trace? */
> -    lua_assert((J2G(J)->hookmask & HOOK_GC) == 0);
> +    lj_assertJ((J2G(J)->hookmask & HOOK_GC) == 0,
> +	       "recorder called from GC hook");
>       lj_trace_flushall(J->L);
>       J->state = LJ_TRACE_IDLE;  /* Silently ignored. */
>       return;
> @@ -513,7 +519,7 @@ static void trace_stop(jit_State *J)
>       goto addroot;
>     case BC_JMP:
>       /* Patch exit branch in parent to side trace entry. */
> -    lua_assert(J->parent != 0 && J->cur.root != 0);
> +    lj_assertJ(J->parent != 0 && J->cur.root != 0, "not a side trace");
>       lj_asm_patchexit(J, traceref(J, J->parent), J->exitno, J->cur.mcode);
>       /* Avoid compiling a side trace twice (stack resizing uses parent exit). */
>       traceref(J, J->parent)->snap[J->exitno].count = SNAPCOUNT_DONE;
> @@ -532,7 +538,7 @@ static void trace_stop(jit_State *J)
>       traceref(J, J->exitno)->link = traceno;
>       break;
>     default:
> -    lua_assert(0);
> +    lj_assertJ(0, "bad stop bytecode %d", op);
>       break;
>     }
>   
> @@ -553,8 +559,8 @@ static void trace_stop(jit_State *J)
>   static int trace_downrec(jit_State *J)
>   {
>     /* Restart recording at the return instruction. */
> -  lua_assert(J->pt != NULL);
> -  lua_assert(bc_isret(bc_op(*J->pc)));
> +  lj_assertJ(J->pt != NULL, "no active prototype");
> +  lj_assertJ(bc_isret(bc_op(*J->pc)), "not at a return bytecode");
>     if (bc_op(*J->pc) == BC_RETM) {
>       J->ntraceabort++;
>       return 0;  /* NYI: down-recursion with RETM. */
> @@ -774,7 +780,7 @@ static void trace_hotside(jit_State *J, const BCIns *pc)
>         isluafunc(curr_func(J->L)) &&
>         snap->count != SNAPCOUNT_DONE &&
>         ++snap->count >= J->param[JIT_P_hotexit]) {
> -    lua_assert(J->state == LJ_TRACE_IDLE);
> +    lj_assertJ(J->state == LJ_TRACE_IDLE, "hot side exit while recording");
>       /* J->parent is non-zero for a side trace. */
>       J->state = LJ_TRACE_START;
>       lj_trace_ins(J, pc);
> @@ -848,7 +854,7 @@ static TraceNo trace_exit_find(jit_State *J, MCode *pc)
>       if (T && pc >= T->mcode && pc < (MCode *)((char *)T->mcode + T->szmcode))
>         return traceno;
>     }
> -  lua_assert(0);
> +  lj_assertJ(0, "bad exit pc");
>     return 0;
>   }
>   #endif
> @@ -878,13 +884,13 @@ int LJ_FASTCALL lj_trace_exit(jit_State *J, void *exptr)
>     T = traceref(J, J->parent); UNUSED(T);
>   #ifdef EXITSTATE_CHECKEXIT
>     if (J->exitno == T->nsnap) {  /* Treat stack check like a parent exit. */
> -    lua_assert(T->root != 0);
> +    lj_assertJ(T->root != 0, "stack check in root trace");
>       J->exitno = T->ir[REF_BASE].op2;
>       J->parent = T->ir[REF_BASE].op1;
>       T = traceref(J, J->parent);
>     }
>   #endif
> -  lua_assert(T != NULL && J->exitno < T->nsnap);
> +  lj_assertJ(T != NULL && J->exitno < T->nsnap, "bad trace or exit number");
>     exd.J = J;
>     exd.exptr = exptr;
>     errcode = lj_vm_cpcall(L, NULL, &exd, trace_exit_cp);
> @@ -975,14 +981,7 @@ uintptr_t LJ_FASTCALL lj_trace_unwind(jit_State *J, uintptr_t addr, ExitNo *ep)
>       return (uintptr_t)exitstub_trace_addr(T, exitno);
>   #endif
>     }
> -  /* Cannot correlate addr with trace/exit. This will be fatal. */
> -  /*
> -  ** FIXME: The following assert was replaced with
> -  ** the conventional `lua_assert`.
> -  **
> -  ** lj_assertJ(0, "bad exit pc");
> -  */
> -  lua_assert(0);
> +  lj_assertJ(0, "bad exit pc");
>     return 0;
>   }
>   #endif
> diff --git a/src/lj_utils_leb128.c b/src/lj_utils_leb128.c
> index 0d50b839..d66961da 100644
> --- a/src/lj_utils_leb128.c
> +++ b/src/lj_utils_leb128.c
> @@ -9,6 +9,7 @@
>   #define LUA_CORE
>   
>   #include "lj_utils.h"
> +#include "lj_obj.h"
>   
>   #define LINK_BIT          (0x80)
>   #define MIN_TWOBYTE_VALUE (0x80)
> @@ -112,7 +113,7 @@ size_t LJ_FASTCALL lj_utils_write_leb128(uint8_t *buffer, int64_t value)
>     /* Omit LINK_BIT in case of overflow. */
>     buffer[i++] = (uint8_t)(value & PAYLOAD_MASK);
>   
> -  lua_assert(i <= LEB128_U64_MAXSIZE);
> +  lj_assertX(i <= LEB128_U64_MAXSIZE, "bad leb128 size");
>   
>     return i;
>   }
> @@ -126,7 +127,7 @@ size_t LJ_FASTCALL lj_utils_write_uleb128(uint8_t *buffer, uint64_t value)
>   
>     buffer[i++] = (uint8_t)value;
>   
> -  lua_assert(i <= LEB128_U64_MAXSIZE);
> +  lj_assertX(i <= LEB128_U64_MAXSIZE, "bad uleb128 size");
>   
>     return i;
>   }
> diff --git a/src/lj_vmmath.c b/src/lj_vmmath.c
> index 9c0d3fde..14e66687 100644
> --- a/src/lj_vmmath.c
> +++ b/src/lj_vmmath.c
> @@ -60,7 +60,8 @@ double lj_vm_foldarith(double x, double y, int op)
>   int32_t LJ_FASTCALL lj_vm_modi(int32_t a, int32_t b)
>   {
>     uint32_t y, ua, ub;
> -  lua_assert(b != 0);  /* This must be checked before using this function. */
> +  /* This must be checked before using this function. */
> +  lj_assertX(b != 0, "modulo with zero divisor");
>     ua = a < 0 ? (uint32_t)-a : (uint32_t)a;
>     ub = b < 0 ? (uint32_t)-b : (uint32_t)b;
>     y = ua % ub;
> @@ -84,7 +85,7 @@ double lj_vm_log2(double a)
>   static double lj_vm_powui(double x, uint32_t k)
>   {
>     double y;
> -  lua_assert(k != 0);
> +  lj_assertX(k != 0, "pow with zero exponent");
>     for (; (k & 1) == 0; k >>= 1) x *= x;
>     y = x;
>     if ((k >>= 1) != 0) {
> @@ -123,7 +124,7 @@ double lj_vm_foldfpm(double x, int fpm)
>     case IRFPM_SQRT: return sqrt(x);
>     case IRFPM_LOG: return log(x);
>     case IRFPM_LOG2: return lj_vm_log2(x);
> -  default: lua_assert(0);
> +  default: lj_assertX(0, "bad fpm %d", fpm);
>     }
>     return 0;
>   }
> diff --git a/src/lj_wbuf.c b/src/lj_wbuf.c
> index 897ef083..0001a02e 100644
> --- a/src/lj_wbuf.c
> +++ b/src/lj_wbuf.c
> @@ -10,6 +10,7 @@
>   
>   #include <errno.h>
>   
> +#include "lj_obj.h"
>   #include "lj_wbuf.h"
>   #include "lj_utils.h"
>   
> @@ -52,7 +53,7 @@ void LJ_FASTCALL lj_wbuf_terminate(struct lj_wbuf *buf)
>   
>   static LJ_AINLINE void wbuf_reserve(struct lj_wbuf *buf, size_t n)
>   {
> -  lua_assert(n <= buf->size);
> +  lj_assertX(n <= buf->size, "wbuf overflow");
>     if (LJ_UNLIKELY(wbuf_left(buf) < n))
>       lj_wbuf_flush(buf);
>   }
> diff --git a/src/ljamalg.c b/src/ljamalg.c
> index 6ad5289c..0ffc7e81 100644
> --- a/src/ljamalg.c
> +++ b/src/ljamalg.c
> @@ -28,6 +28,7 @@
>   #include "lua.h"
>   #include "lauxlib.h"
>   
> +#include "lj_assert.c"
>   #include "lj_gc.c"
>   #include "lj_err.c"
>   #include "lj_char.c"
> diff --git a/src/luaconf.h b/src/luaconf.h
> index 8029040a..38146008 100644
> --- a/src/luaconf.h
> +++ b/src/luaconf.h
> @@ -146,7 +146,7 @@
>   #define LUALIB_API	LUA_API
>   #define LUAMISC_API	LUA_API
>   
> -/* Support for internal assertions. */
> +/* Compatibility support for assertions. */
>   #if defined(LUA_USE_ASSERT) || defined(LUA_USE_APICHECK)
>   #include <assert.h>
>   #endif

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 4/5] Fix pow() optimization inconsistencies.
  2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 4/5] Fix pow() optimization inconsistencies Sergey Kaplun via Tarantool-patches
@ 2023-08-18 12:45   ` Sergey Bronnikov via Tarantool-patches
  2023-08-21  8:07     ` Sergey Kaplun via Tarantool-patches
  2023-08-20  9:26   ` Maxim Kokryashkin via Tarantool-patches
  1 sibling, 1 reply; 34+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2023-08-18 12:45 UTC (permalink / raw)
  To: Sergey Kaplun, Maxim Kokryashkin; +Cc: tarantool-patches

Hi, Sergey!


Thanks for the patch! LGTM with two minor comments inline.


On 8/15/23 12:36, Sergey Kaplun wrote:
> From: Mike Pall <mike>
>
> (cherry-picked from commit 9512d5c1aced61e13e7be2d3208ec7ae3516b458)
>
> This patch fixes different misbehaviour between JIT-compiled code and
misbehaviour -> misbehaviours
> the interpreter for power operator with the following ways:
> * Drop folding optimizations for base ^ 0.5 => sqrt(base), as far as
>    pow(base, 0.5) isn't interchangeable and depends on the <math.h>
>    implementation.
> * Drop folding optimizations for 2 ^ int_pow => ldexp(1.0, int_pow), to
>    avoid dependcy on the <math.h> implementation.
dependcy -> dependency
> * Now `asm_pow()` always assemble a call to the `lj_vm_powi()` function,
>    that is general now for all CPU architectures. Using this internal
>    function instead of toolchain-provided `pow()` guarantees consistency
>    between interpreter and JIT results. Also, it drops custom
>    implementation for the `vm_powi_sse()` on x86_64.
> * `math_extern2` macro in the VM may take the second argument, that is
>    used as the target function to call. The first argument is still the
>    name for `func_nnsse` macro.
> * Narrowing for power operation avoids range guard for non-constant base
>    IR. This leads to invalid result if value on trace is out of range.
>    Now it is done unconditionally.
>
> Be aware, that [220/502] lib/string/format/num.lua test [1] from
> LuaJIT-test suite fails after this commit.
>
> [1]: https://www.exploringbinary.com/incorrect-floating-point-to-decimal-conversions/
>
> Sergey Kaplun:
> * added the description and the test for the problem
<snipped>
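
As a side note, the interpreter vs. JIT consistency this patch targets can be
checked with a minimal Lua sketch like the one below. It is not part of the
patch and only mirrors the approach of the added
lj-684-pow-inconsistencies.test.lua; the base value, the hotloop setting and
the iteration count are arbitrary:

  jit.opt.start('hotloop=1')

  local res = {}
  -- A local variable prevents folding the expression at parse time.
  local base = 2921
  for i = 1, 4 do
    -- Early iterations run in the interpreter, later ones on a trace.
    res[i] = base ^ 0.5
  end

  local consistent = true
  for i = 2, #res do
    if res[i] ~= res[1] then consistent = false end
  end
  print('pow results are consistent:', consistent)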

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 5/5] Revert to trival pow() optimizations to prevent inaccuracies.
  2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 5/5] Revert to trival pow() optimizations to prevent inaccuracies Sergey Kaplun via Tarantool-patches
@ 2023-08-18 12:49   ` Sergey Bronnikov via Tarantool-patches
  2023-08-21  8:16     ` Sergey Kaplun via Tarantool-patches
  2023-08-20  9:37   ` Maxim Kokryashkin via Tarantool-patches
  1 sibling, 1 reply; 34+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2023-08-18 12:49 UTC (permalink / raw)
  To: Sergey Kaplun, Maxim Kokryashkin; +Cc: tarantool-patches

Hi, Sergey


Thanks for the patch!

Typo in subj: trival -> trivial; however, it seems it is not fixable, since it
is a part of the original commit.

LGTM


On 8/15/23 12:36, Sergey Kaplun wrote:
> From: Mike Pall <mike>
>
> (cherry-picked from commit 96d6d5032098ea9f0002165394a8774dcaa0c0ce)
>
> This patch fixes different misbehaviour between JIT-compiled code and
typo: misbehaviour -> misbehaviours
> the interpreter for power operator with the following ways:
> * Drop folding optimizations for base ^ n => base * base ..., as far as
>    pow(base, n) isn't interchangeable with just multiplicity of numbers
>    and depends on the <math.h> implementation.
> * Since the internal power function is inaccurate for very big or small
>    powers, it is dropped, and `pow()` from the standard library is used
>    instead. To save consistency between JIT behaviour and the VM
>    narrowing optimization is dropped, and only trivial folding
>    optimizations are used. Also, `math_extern2` version with two
>    parameters is dropped, since it's no more used.
>
> Also, this fixes failures of the [220/502] lib/string/format/num.lua
> test [1] from LuaJIT-test suite.
>
> [1]: https://www.exploringbinary.com/incorrect-floating-point-to-decimal-conversions/
>
> Sergey Kaplun:
> * added the description and the test for the problem
>
> Part of tarantool/tarantool#8825
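
The "inaccurate for very big or small powers" point above can be illustrated
with a minimal Lua sketch (not part of the patch): it redoes the dropped
square-and-multiply loop of `lj_vm_powui()` in plain Lua and compares the
result against the libm-backed `^`; the base and the exponent below are
arbitrary:

  -- Plain Lua rendition of the dropped lj_vm_powui() loop.
  local function powui(x, k)
    assert(k > 0)
    while k % 2 == 0 do x = x * x; k = k / 2 end
    local y = x
    k = math.floor(k / 2)
    if k ~= 0 then
      while true do
        x = x * x
        if k == 1 then break end
        if k % 2 == 1 then y = y * x end
        k = math.floor(k / 2)
      end
      y = y * x
    end
    return y
  end

  local base = 1 + 2^-20
  -- The two printed values may differ in the last bits, depending on the
  -- libm implementation; that is the inconsistency the commit message
  -- refers to.
  print(string.format('square-and-multiply: %.17g', powui(base, 65536)))
  print(string.format('libm-backed pow:     %.17g', base ^ 65536))
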
> ---
>   src/lj_asm.c                                  |  3 +-
>   src/lj_dispatch.h                             |  2 +-
>   src/lj_ffrecord.c                             |  4 +-
>   src/lj_ircall.h                               |  3 +-
>   src/lj_iropt.h                                |  1 -
>   src/lj_opt_fold.c                             | 37 ++++------------
>   src/lj_opt_narrow.c                           | 24 ----------
>   src/lj_opt_split.c                            |  2 +-
>   src/lj_record.c                               |  2 +-
>   src/lj_vm.h                                   |  3 --
>   src/lj_vmmath.c                               | 44 +------------------
>   src/vm_arm.dasc                               | 13 +++---
>   src/vm_arm64.dasc                             | 11 ++---
>   src/vm_mips.dasc                              | 11 ++---
>   src/vm_mips64.dasc                            | 11 ++---
>   src/vm_ppc.dasc                               | 11 ++---
>   src/vm_x64.dasc                               |  9 ++--
>   src/vm_x86.dasc                               | 11 ++---
>   .../lj-684-pow-inconsistencies.test.lua       | 21 ++++++++-
>   19 files changed, 64 insertions(+), 159 deletions(-)
>
> diff --git a/src/lj_asm.c b/src/lj_asm.c
> index 65261d50..3a1909d5 100644
> --- a/src/lj_asm.c
> +++ b/src/lj_asm.c
> @@ -1660,8 +1660,7 @@ static void asm_pow(ASMState *as, IRIns *ir)
>   					  IRCALL_lj_carith_powu64);
>     else
>   #endif
> -  asm_callid(as, ir, irt_isnum(IR(ir->op2)->t) ? IRCALL_lj_vm_pow :
> -						 IRCALL_lj_vm_powi);
> +  asm_callid(as, ir, IRCALL_pow);
>   }
>   
>   static void asm_div(ASMState *as, IRIns *ir)
> diff --git a/src/lj_dispatch.h b/src/lj_dispatch.h
> index af870a75..b8bc2594 100644
> --- a/src/lj_dispatch.h
> +++ b/src/lj_dispatch.h
> @@ -44,7 +44,7 @@ extern double __divdf3(double a, double b);
>   #define GOTDEF(_) \
>     _(floor) _(ceil) _(trunc) _(log) _(log10) _(exp) _(sin) _(cos) _(tan) \
>     _(asin) _(acos) _(atan) _(sinh) _(cosh) _(tanh) _(frexp) _(modf) _(atan2) \
> -  _(lj_vm_pow) _(fmod) _(ldexp) _(lj_vm_modi) \
> +  _(pow) _(fmod) _(ldexp) _(lj_vm_modi) \
>     _(lj_dispatch_call) _(lj_dispatch_ins) _(lj_dispatch_stitch) \
>     _(lj_dispatch_profile) _(lj_err_throw) \
>     _(lj_ffh_coroutine_wrap_err) _(lj_func_closeuv) _(lj_func_newL_gc) \
> diff --git a/src/lj_ffrecord.c b/src/lj_ffrecord.c
> index 0746ec64..99a6b918 100644
> --- a/src/lj_ffrecord.c
> +++ b/src/lj_ffrecord.c
> @@ -590,8 +590,8 @@ static void LJ_FASTCALL recff_math_call(jit_State *J, RecordFFData *rd)
>   
>   static void LJ_FASTCALL recff_math_pow(jit_State *J, RecordFFData *rd)
>   {
> -  J->base[0] = lj_opt_narrow_pow(J, J->base[0], J->base[1],
> -				 &rd->argv[0], &rd->argv[1]);
> +  J->base[0] = lj_opt_narrow_arith(J, J->base[0], J->base[1],
> +				   &rd->argv[0], &rd->argv[1], IR_POW);
>     UNUSED(rd);
>   }
>   
> diff --git a/src/lj_ircall.h b/src/lj_ircall.h
> index ac0888a0..9c195918 100644
> --- a/src/lj_ircall.h
> +++ b/src/lj_ircall.h
> @@ -194,8 +194,7 @@ typedef struct CCallInfo {
>     _(FPMATH,	sqrt,			1,   N, NUM, XA_FP) \
>     _(ANY,	log,			1,   N, NUM, XA_FP) \
>     _(ANY,	lj_vm_log2,		1,   N, NUM, XA_FP) \
> -  _(ANY,	lj_vm_powi,		2,   N, NUM, XA_FP) \
> -  _(ANY,	lj_vm_pow,		2,   N, NUM, XA2_FP) \
> +  _(ANY,	pow,			2,   N, NUM, XA2_FP) \
>     _(ANY,	atan2,			2,   N, NUM, XA2_FP) \
>     _(ANY,	ldexp,			2,   N, NUM, XA_FP) \
>     _(SOFTFP,	lj_vm_tobit,		1,   N, INT, XA_FP32) \
> diff --git a/src/lj_iropt.h b/src/lj_iropt.h
> index a59ba3f4..7ee1ea86 100644
> --- a/src/lj_iropt.h
> +++ b/src/lj_iropt.h
> @@ -144,7 +144,6 @@ LJ_FUNC TRef lj_opt_narrow_arith(jit_State *J, TRef rb, TRef rc,
>   				 TValue *vb, TValue *vc, IROp op);
>   LJ_FUNC TRef lj_opt_narrow_unm(jit_State *J, TRef rc, TValue *vc);
>   LJ_FUNC TRef lj_opt_narrow_mod(jit_State *J, TRef rb, TRef rc, TValue *vb, TValue *vc);
> -LJ_FUNC TRef lj_opt_narrow_pow(jit_State *J, TRef rb, TRef rc, TValue *vb, TValue *vc);
>   LJ_FUNC IRType lj_opt_narrow_forl(jit_State *J, cTValue *forbase);
>   
>   /* Optimization passes. */
> diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
> index 7d7cc9d1..09e6c87b 100644
> --- a/src/lj_opt_fold.c
> +++ b/src/lj_opt_fold.c
> @@ -236,14 +236,10 @@ LJFOLDF(kfold_fpcall2)
>     return NEXTFOLD;
>   }
>   
> -LJFOLD(POW KNUM KINT)
>   LJFOLD(POW KNUM KNUM)
>   LJFOLDF(kfold_numpow)
>   {
> -  lua_Number a = knumleft;
> -  lua_Number b = fright->o == IR_KINT ? (lua_Number)fright->i : knumright;
> -  lua_Number y = lj_vm_foldarith(a, b, IR_POW - IR_ADD);
> -  return lj_ir_knum(J, y);
> +  return lj_ir_knum(J, lj_vm_foldarith(knumleft, knumright, IR_POW - IR_ADD));
>   }
>   
>   /* Must not use kfold_kref for numbers (could be NaN). */
> @@ -1084,34 +1080,17 @@ LJFOLDF(simplify_nummuldiv_negneg)
>     return RETRYFOLD;
>   }
>   
> -LJFOLD(POW any KINT)
> -LJFOLDF(simplify_numpow_xkint)
> +LJFOLD(POW any KNUM)
> +LJFOLDF(simplify_numpow_k)
>   {
> -  int32_t k = fright->i;
> -  TRef ref = fins->op1;
> -  if (k == 0)  /* x ^ 0 ==> 1 */
> +  if (knumright == 0)  /* x ^ 0 ==> 1 */
>       return lj_ir_knum_one(J);  /* Result must be a number, not an int. */
> -  if (k == 1)  /* x ^ 1 ==> x */
> +  else if (knumright == 1)  /* x ^ 1 ==> x */
>       return LEFTFOLD;
> -  if ((uint32_t)(k+65536) > 2*65536u)  /* Limit code explosion. */
> +  else if (knumright == 2)  /* x ^ 2 ==> x * x */
> +    return emitir(IRTN(IR_MUL), fins->op1, fins->op1);
> +  else
>       return NEXTFOLD;
> -  if (k < 0) {  /* x ^ (-k) ==> (1/x) ^ k. */
> -    ref = emitir(IRTN(IR_DIV), lj_ir_knum_one(J), ref);
> -    k = -k;
> -  }
> -  /* Unroll x^k for 1 <= k <= 65536. */
> -  for (; (k & 1) == 0; k >>= 1)  /* Handle leading zeros. */
> -    ref = emitir(IRTN(IR_MUL), ref, ref);
> -  if ((k >>= 1) != 0) {  /* Handle trailing bits. */
> -    TRef tmp = emitir(IRTN(IR_MUL), ref, ref);
> -    for (; k != 1; k >>= 1) {
> -      if (k & 1)
> -	ref = emitir(IRTN(IR_MUL), ref, tmp);
> -      tmp = emitir(IRTN(IR_MUL), tmp, tmp);
> -    }
> -    ref = emitir(IRTN(IR_MUL), ref, tmp);
> -  }
> -  return ref;
>   }
>   
>   /* -- Simplify conversions ------------------------------------------------ */
> diff --git a/src/lj_opt_narrow.c b/src/lj_opt_narrow.c
> index d6601f4c..db0da10f 100644
> --- a/src/lj_opt_narrow.c
> +++ b/src/lj_opt_narrow.c
> @@ -584,30 +584,6 @@ TRef lj_opt_narrow_mod(jit_State *J, TRef rb, TRef rc, TValue *vb, TValue *vc)
>     return emitir(IRTN(IR_SUB), rb, tmp);
>   }
>   
> -/* Narrowing of power operator or math.pow. */
> -TRef lj_opt_narrow_pow(jit_State *J, TRef rb, TRef rc, TValue *vb, TValue *vc)
> -{
> -  rb = conv_str_tonum(J, rb, vb);
> -  rb = lj_ir_tonum(J, rb);  /* Left arg is always treated as an FP number. */
> -  rc = conv_str_tonum(J, rc, vc);
> -  if (tvisint(vc) || numisint(numV(vc))) {
> -    int32_t k = numberVint(vc);
> -    if (!(k >= -65536 && k <= 65536)) goto force_pow_num;
> -    if (!tref_isinteger(rc)) {
> -      /* Guarded conversion to integer! */
> -      rc = emitir(IRTGI(IR_CONV), rc, IRCONV_INT_NUM|IRCONV_CHECK);
> -    }
> -    if (!tref_isk(rc)) {  /* Range guard: -65536 <= i <= 65536 */
> -      TRef tmp = emitir(IRTI(IR_ADD), rc, lj_ir_kint(J, 65536));
> -      emitir(IRTGI(IR_ULE), tmp, lj_ir_kint(J, 2*65536));
> -    }
> -  } else {
> -force_pow_num:
> -    rc = lj_ir_tonum(J, rc);  /* Want POW(num, num), not POW(num, int). */
> -  }
> -  return emitir(IRTN(IR_POW), rb, rc);
> -}
> -
>   /* -- Predictive narrowing of induction variables ------------------------- */
>   
>   /* Narrow a single runtime value. */
> diff --git a/src/lj_opt_split.c b/src/lj_opt_split.c
> index a619d852..0dc6394f 100644
> --- a/src/lj_opt_split.c
> +++ b/src/lj_opt_split.c
> @@ -400,7 +400,7 @@ static void split_ir(jit_State *J)
>   	hi = split_call_ll(J, hisubst, oir, ir, IRCALL_softfp_div);
>   	break;
>         case IR_POW:
> -	hi = split_call_li(J, hisubst, oir, ir, IRCALL_lj_vm_powi);
> +	hi = split_call_li(J, hisubst, oir, ir, IRCALL_pow);
>   	break;
>         case IR_FPMATH:
>   	hi = split_call_l(J, hisubst, oir, ir, IRCALL_lj_vm_floor + ir->op2);
> diff --git a/src/lj_record.c b/src/lj_record.c
> index d1332bfc..34d1210a 100644
> --- a/src/lj_record.c
> +++ b/src/lj_record.c
> @@ -2268,7 +2268,7 @@ void lj_record_ins(jit_State *J)
>   
>     case BC_POW:
>       if (tref_isnumber_str(rb) && tref_isnumber_str(rc))
> -      rc = lj_opt_narrow_pow(J, rb, rc, rbv, rcv);
> +      rc = lj_opt_narrow_arith(J, rb, rc, rbv, rcv, IR_POW);
>       else
>         rc = rec_mm_arith(J, &ix, MM_pow);
>       break;
> diff --git a/src/lj_vm.h b/src/lj_vm.h
> index f6f28a08..79166e5e 100644
> --- a/src/lj_vm.h
> +++ b/src/lj_vm.h
> @@ -96,9 +96,6 @@ LJ_ASMF int lj_vm_errno(void);
>   #endif
>   #endif
>   
> -LJ_ASMF double lj_vm_powi(double, int32_t);
> -LJ_ASMF double lj_vm_pow(double, double);
> -
>   /* Continuations for metamethods. */
>   LJ_ASMF void lj_cont_cat(void);  /* Continue with concatenation. */
>   LJ_ASMF void lj_cont_ra(void);  /* Store result in RA from instruction. */
> diff --git a/src/lj_vmmath.c b/src/lj_vmmath.c
> index 539f955b..506867f8 100644
> --- a/src/lj_vmmath.c
> +++ b/src/lj_vmmath.c
> @@ -30,52 +30,12 @@ LJ_FUNCA double lj_wrap_sinh(double x) { return sinh(x); }
>   LJ_FUNCA double lj_wrap_cosh(double x) { return cosh(x); }
>   LJ_FUNCA double lj_wrap_tanh(double x) { return tanh(x); }
>   LJ_FUNCA double lj_wrap_atan2(double x, double y) { return atan2(x, y); }
> +LJ_FUNCA double lj_wrap_pow(double x, double y) { return pow(x, y); }
>   LJ_FUNCA double lj_wrap_fmod(double x, double y) { return fmod(x, y); }
>   #endif
>   
>   /* -- Helper functions ---------------------------------------------------- */
>   
> -/* Unsigned x^k. */
> -static double lj_vm_powui(double x, uint32_t k)
> -{
> -  double y;
> -  lj_assertX(k != 0, "pow with zero exponent");
> -  for (; (k & 1) == 0; k >>= 1) x *= x;
> -  y = x;
> -  if ((k >>= 1) != 0) {
> -    for (;;) {
> -      x *= x;
> -      if (k == 1) break;
> -      if (k & 1) y *= x;
> -      k >>= 1;
> -    }
> -    y *= x;
> -  }
> -  return y;
> -}
> -
> -/* Signed x^k. */
> -double lj_vm_powi(double x, int32_t k)
> -{
> -  if (k > 1)
> -    return lj_vm_powui(x, (uint32_t)k);
> -  else if (k == 1)
> -    return x;
> -  else if (k == 0)
> -    return 1.0;
> -  else
> -    return 1.0 / lj_vm_powui(x, (uint32_t)-k);
> -}
> -
> -double lj_vm_pow(double x, double y)
> -{
> -  int32_t k = lj_num2int(y);
> -  if ((k >= -65536 && k <= 65536) && y == (double)k)
> -    return lj_vm_powi(x, k);
> -  else
> -    return pow(x, y);
> -}
> -
>   double lj_vm_foldarith(double x, double y, int op)
>   {
>     switch (op) {
> @@ -84,7 +44,7 @@ double lj_vm_foldarith(double x, double y, int op)
>     case IR_MUL - IR_ADD: return x*y; break;
>     case IR_DIV - IR_ADD: return x/y; break;
>     case IR_MOD - IR_ADD: return x-lj_vm_floor(x/y)*y; break;
> -  case IR_POW - IR_ADD: return lj_vm_pow(x, y); break;
> +  case IR_POW - IR_ADD: return pow(x, y); break;
>     case IR_NEG - IR_ADD: return -x; break;
>     case IR_ABS - IR_ADD: return fabs(x); break;
>   #if LJ_HASJIT
> diff --git a/src/vm_arm.dasc b/src/vm_arm.dasc
> index 792f0363..767d31f9 100644
> --- a/src/vm_arm.dasc
> +++ b/src/vm_arm.dasc
> @@ -1485,11 +1485,11 @@ static void build_subroutines(BuildCtx *ctx)
>     |.endif
>     |.endmacro
>     |
> -  |.macro math_extern2, name, func
> +  |.macro math_extern2, func
>     |.if HFABI
> -  |  .ffunc_dd math_ .. name
> +  |  .ffunc_dd math_ .. func
>     |.else
> -  |  .ffunc_nn math_ .. name
> +  |  .ffunc_nn math_ .. func
>     |.endif
>     |  .IOS mov RA, BASE
>     |  bl extern func
> @@ -1500,9 +1500,6 @@ static void build_subroutines(BuildCtx *ctx)
>     |  b ->fff_restv
>     |.endif
>     |.endmacro
> -  |.macro math_extern2, func
> -  |  math_extern2 func, func
> -  |.endmacro
>     |
>     |.if FPU
>     |  .ffunc_d math_sqrt
> @@ -1548,7 +1545,7 @@ static void build_subroutines(BuildCtx *ctx)
>     |  math_extern sinh
>     |  math_extern cosh
>     |  math_extern tanh
> -  |  math_extern2 pow, lj_vm_pow
> +  |  math_extern2 pow
>     |  math_extern2 atan2
>     |  math_extern2 fmod
>     |
> @@ -3156,7 +3153,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>       break;
>     case BC_POW:
>       |  // NYI: (partial) integer arithmetic.
> -    |  ins_arithfp extern, extern lj_vm_pow
> +    |  ins_arithfp extern, extern pow
>       break;
>   
>     case BC_CAT:
> diff --git a/src/vm_arm64.dasc b/src/vm_arm64.dasc
> index fb267a76..de33bde4 100644
> --- a/src/vm_arm64.dasc
> +++ b/src/vm_arm64.dasc
> @@ -1391,14 +1391,11 @@ static void build_subroutines(BuildCtx *ctx)
>     |  b ->fff_resn
>     |.endmacro
>     |
> -  |.macro math_extern2, name, func
> -  |  .ffunc_nn math_ .. name
> +  |.macro math_extern2, func
> +  |  .ffunc_nn math_ .. func
>     |  bl extern func
>     |  b ->fff_resn
>     |.endmacro
> -  |.macro math_extern2, func
> -  |  math_extern2 func, func
> -  |.endmacro
>     |
>     |.ffunc_n math_sqrt
>     |  fsqrt d0, d0
> @@ -1427,7 +1424,7 @@ static void build_subroutines(BuildCtx *ctx)
>     |  math_extern sinh
>     |  math_extern cosh
>     |  math_extern tanh
> -  |  math_extern2 pow, lj_vm_pow
> +  |  math_extern2 pow
>     |  math_extern2 atan2
>     |  math_extern2 fmod
>     |
> @@ -2624,7 +2621,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>       |  ins_arithload FARG1, FARG2
>       |  ins_arithfallback ins_arithcheck_num
>       |.if "fpins" == "fpow"
> -    |  bl extern lj_vm_pow
> +    |  bl extern pow
>       |.else
>       |  fpins FARG1, FARG1, FARG2
>       |.endif
> diff --git a/src/vm_mips.dasc b/src/vm_mips.dasc
> index 5664f503..32caabf7 100644
> --- a/src/vm_mips.dasc
> +++ b/src/vm_mips.dasc
> @@ -1631,17 +1631,14 @@ static void build_subroutines(BuildCtx *ctx)
>     |.  nop
>     |.endmacro
>     |
> -  |.macro math_extern2, name, func
> -  |  .ffunc_nn math_ .. name
> +  |.macro math_extern2, func
> +  |  .ffunc_nn math_ .. func
>     |.  load_got func
>     |  call_extern
>     |.  nop
>     |  b ->fff_resn
>     |.  nop
>     |.endmacro
> -  |.macro math_extern2, func
> -  |  math_extern2 func, func
> -  |.endmacro
>     |
>     |// TODO: Return integer type if result is integer (own sf implementation).
>     |.macro math_round, func
> @@ -1695,7 +1692,7 @@ static void build_subroutines(BuildCtx *ctx)
>     |  math_extern sinh
>     |  math_extern cosh
>     |  math_extern tanh
> -  |  math_extern2 pow, lj_vm_pow
> +  |  math_extern2 pow
>     |  math_extern2 atan2
>     |  math_extern2 fmod
>     |
> @@ -3588,7 +3585,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>       |  sltiu AT, SFARG1HI, LJ_TISNUM
>       |  sltiu TMP0, SFARG2HI, LJ_TISNUM
>       |  and AT, AT, TMP0
> -    |  load_got lj_vm_pow
> +    |  load_got pow
>       |  beqz AT, ->vmeta_arith
>       |.  addu RA, BASE, RA
>       |.if FPU
> diff --git a/src/vm_mips64.dasc b/src/vm_mips64.dasc
> index 249605d4..44fba36c 100644
> --- a/src/vm_mips64.dasc
> +++ b/src/vm_mips64.dasc
> @@ -1669,17 +1669,14 @@ static void build_subroutines(BuildCtx *ctx)
>     |.  nop
>     |.endmacro
>     |
> -  |.macro math_extern2, name, func
> -  |  .ffunc_nn math_ .. name
> +  |.macro math_extern2, func
> +  |  .ffunc_nn math_ .. func
>     |.  load_got func
>     |  call_extern
>     |.  nop
>     |  b ->fff_resn
>     |.  nop
>     |.endmacro
> -  |.macro math_extern2, func
> -  |  math_extern2 func, func
> -  |.endmacro
>     |
>     |// TODO: Return integer type if result is integer (own sf implementation).
>     |.macro math_round, func
> @@ -1733,7 +1730,7 @@ static void build_subroutines(BuildCtx *ctx)
>     |  math_extern sinh
>     |  math_extern cosh
>     |  math_extern tanh
> -  |  math_extern2 pow, lj_vm_pow
> +  |  math_extern2 pow
>     |  math_extern2 atan2
>     |  math_extern2 fmod
>     |
> @@ -3826,7 +3823,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>       |  sltiu TMP0, TMP0, LJ_TISNUM
>       |   sltiu TMP1, TMP1, LJ_TISNUM
>       |  and AT, TMP0, TMP1
> -    |  load_got lj_vm_pow
> +    |  load_got pow
>       |  beqz AT, ->vmeta_arith
>       |.  daddu RA, BASE, RA
>       |.if FPU
> diff --git a/src/vm_ppc.dasc b/src/vm_ppc.dasc
> index 94af63e6..980ad897 100644
> --- a/src/vm_ppc.dasc
> +++ b/src/vm_ppc.dasc
> @@ -2032,14 +2032,11 @@ static void build_subroutines(BuildCtx *ctx)
>     |  b ->fff_resn
>     |.endmacro
>     |
> -  |.macro math_extern2, name, func
> -  |  .ffunc_nn math_ .. name
> +  |.macro math_extern2, func
> +  |  .ffunc_nn math_ .. func
>     |  blex func
>     |  b ->fff_resn
>     |.endmacro
> -  |.macro math_extern2, func
> -  |  math_extern2 func, func
> -  |.endmacro
>     |
>     |.macro math_round, func
>     |  .ffunc_1 math_ .. func
> @@ -2164,7 +2161,7 @@ static void build_subroutines(BuildCtx *ctx)
>     |  math_extern sinh
>     |  math_extern cosh
>     |  math_extern tanh
> -  |  math_extern2 pow, lj_vm_pow
> +  |  math_extern2 pow
>     |  math_extern2 atan2
>     |  math_extern2 fmod
>     |
> @@ -4157,7 +4154,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>       |  checknum cr1, CARG3
>       |  crand 4*cr0+lt, 4*cr0+lt, 4*cr1+lt
>       |  bge ->vmeta_arith_vv
> -    |  blex lj_vm_pow
> +    |  blex pow
>       |  ins_next1
>       |.if FPU
>       |  stfdx FARG1, BASE, RA
> diff --git a/src/vm_x64.dasc b/src/vm_x64.dasc
> index acbe8dc2..09bf67e5 100644
> --- a/src/vm_x64.dasc
> +++ b/src/vm_x64.dasc
> @@ -1825,16 +1825,13 @@ static void build_subroutines(BuildCtx *ctx)
>     |  jmp ->fff_resxmm0
>     |.endmacro
>     |
> -  |.macro math_extern2, name, func
> -  |  .ffunc_nn math_ .. name
> +  |.macro math_extern2, func
> +  |  .ffunc_nn math_ .. func
>     |  mov RB, BASE
>     |  call extern func
>     |  mov BASE, RB
>     |  jmp ->fff_resxmm0
>     |.endmacro
> -  |.macro math_extern2, func
> -  |  math_extern2 func, func
> -  |.endmacro
>     |
>     |  math_extern log10
>     |  math_extern exp
> @@ -1847,7 +1844,7 @@ static void build_subroutines(BuildCtx *ctx)
>     |  math_extern sinh
>     |  math_extern cosh
>     |  math_extern tanh
> -  |  math_extern2 pow, lj_vm_pow
> +  |  math_extern2 pow
>     |  math_extern2 atan2
>     |  math_extern2 fmod
>     |
> diff --git a/src/vm_x86.dasc b/src/vm_x86.dasc
> index bf30cce6..f16ade1a 100644
> --- a/src/vm_x86.dasc
> +++ b/src/vm_x86.dasc
> @@ -2240,8 +2240,8 @@ static void build_subroutines(BuildCtx *ctx)
>     |  jmp ->fff_resfp
>     |.endmacro
>     |
> -  |.macro math_extern2, name, func
> -  |  .ffunc_nnsse math_ .. name
> +  |.macro math_extern2, func
> +  |  .ffunc_nnsse math_ .. func
>     |.if not X64
>     |  movsd FPARG1, xmm0
>     |  movsd FPARG3, xmm1
> @@ -2251,9 +2251,6 @@ static void build_subroutines(BuildCtx *ctx)
>     |  mov BASE, RB
>     |  jmp ->fff_resfp
>     |.endmacro
> -  |.macro math_extern2, func
> -  |  math_extern2 func, func
> -  |.endmacro
>     |
>     |  math_extern log10
>     |  math_extern exp
> @@ -2266,7 +2263,7 @@ static void build_subroutines(BuildCtx *ctx)
>     |  math_extern sinh
>     |  math_extern cosh
>     |  math_extern tanh
> -  |  math_extern2 pow, lj_vm_pow
> +  |  math_extern2 pow
>     |  math_extern2 atan2
>     |  math_extern2 fmod
>     |
> @@ -3944,7 +3941,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>       |  movsd FPARG1, xmm0
>       |  movsd FPARG3, xmm1
>       |.endif
> -    |  call extern lj_vm_pow
> +    |  call extern pow
>       |  movzx RA, PC_RA
>       |  mov BASE, RB
>       |.if X64
> diff --git a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
> index 5129fc45..ab9db3df 100644
> --- a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
> +++ b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
> @@ -2,14 +2,15 @@ local tap = require('tap')
>   -- Test to demonstrate the incorrect JIT behaviour for different
>   -- power operation optimizations.
>   -- See also:
> --- https://github.com/LuaJIT/LuaJIT/issues/684.
> +-- https://github.com/LuaJIT/LuaJIT/issues/684,
> +-- https://github.com/LuaJIT/LuaJIT/issues/817.
>   local test = tap.test('lj-684-pow-inconsistencies'):skipcond({
>     ['Test requires JIT enabled'] = not jit.status(),
>   })
>   
>   local tostring = tostring
>   
> -test:plan(4)
> +test:plan(5)
>   
>   jit.opt.start('hotloop=1')
>   
> @@ -64,6 +65,22 @@ jit.flush()
>   
>   test:samevalues(res, ('consistent results for folding 2921 ^ 0.5'))
>   
> +-- -948388 ^ 3 = -0x1.7ad0e8ad7439dp+59.
> +res = {}
> +-- XXX: use local variable to prevent folding via parser.
> +-- XXX: use stack slot out of trace to prevent constant folding.
> +local corner_case_3 = -948388
> +jit.on()
> +for i = 1, 4 do
> +  res[i] = corner_case_3 ^ 3
> +end
> +
> +-- XXX: Prevent hotcount side effects.
> +jit.off()
> +jit.flush()
> +
> +test:samevalues(res, ('consistent results for int pow (-948388) ^ 3'))
> +
>   -- Narrowing for non-constant base of power operation.
>   local function pow(base, power)
>     return base ^ power

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 4/5] Fix pow() optimization inconsistencies.
  2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 4/5] Fix pow() optimization inconsistencies Sergey Kaplun via Tarantool-patches
  2023-08-18 12:45   ` Sergey Bronnikov via Tarantool-patches
@ 2023-08-20  9:26   ` Maxim Kokryashkin via Tarantool-patches
  2023-08-21  8:06     ` Sergey Kaplun via Tarantool-patches
  1 sibling, 1 reply; 34+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2023-08-20  9:26 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the patch!
Please consider my comments below.

On Tue, Aug 15, 2023 at 12:36:30PM +0300, Sergey Kaplun wrote:
> From: Mike Pall <mike>
> 
> (cherry-picked from commit 9512d5c1aced61e13e7be2d3208ec7ae3516b458)
> 
> This patch fixes different misbehaviour between JIT-compiled code and
Typo: s/misbehaviour/misbehaviours/
> the interpreter for power operator with the following ways:
Typo: s/with the/in the/
> * Drop folding optimizations for base ^ 0.5 => sqrt(base), as far as
>   pow(base, 0.5) isn't interchangeable and depends on the <math.h>
>   implementation.
> * Drop folding optimizations for 2 ^ int_pow => ldexp(1.0, int_pow), to
>   avoid dependcy on the <math.h> implementation.
> * Now `asm_pow()` always assemble a call to the `lj_vm_powi()` function,
Typo: s/assemble/assembles/
>   that is general now for all CPU architectures. Using this internal
>   function instead of toolchain-provided `pow()` guarantees consistency
Typo: s/of/of the/
>   between interpreter and JIT results. Also, it drops custom
Typo: s/drops/drops the/
>   implementation for the `vm_powi_sse()` on x86_64.
Typo: s/for the/for/
> * `math_extern2` macro in the VM may take the second argument, that is
>   used as the target function to call. The first argument is still the
>   name for `func_nnsse` macro.
> * Narrowing for power operation avoids range guard for non-constant base
>   IR. This leads to invalid result if value on trace is out of range.
Typo: s/to invalid/to an invalid/
>   Now it is done unconditionally.
> 
> Be aware, that [220/502] lib/string/format/num.lua test [1] from
Typo: s/from/from the/
> LuaJIT-test suite fails after this commit.
> 
> [1]: https://www.exploringbinary.com/incorrect-floating-point-to-decimal-conversions/
> 
> Sergey Kaplun:
> * added the description and the test for the problem
> 
> Part of tarantool/tarantool#8825
> ---
>  src/lj_asm.c                                  |  7 +-
>  src/lj_asm_x86.h                              | 13 ---
>  src/lj_dispatch.h                             |  2 +-
>  src/lj_ircall.h                               |  2 +-
>  src/lj_opt_fold.c                             | 27 ------
>  src/lj_opt_narrow.c                           | 12 +--
>  src/lj_vm.h                                   |  7 +-
>  src/lj_vmmath.c                               | 82 +++++++++--------
>  src/vm_arm.dasc                               | 13 +--
>  src/vm_arm64.dasc                             | 11 ++-
>  src/vm_mips.dasc                              | 11 ++-
>  src/vm_mips64.dasc                            | 11 ++-
>  src/vm_ppc.dasc                               | 11 ++-
>  src/vm_x64.dasc                               | 44 ++-------
>  src/vm_x86.dasc                               | 46 ++--------
>  .../lj-684-pow-inconsistencies.test.lua       | 89 +++++++++++++++++++
>  .../lj-9-pow-inconsistencies.test.lua         |  2 +
>  17 files changed, 195 insertions(+), 195 deletions(-)
>  create mode 100644 test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
> 
> diff --git a/src/lj_asm.c b/src/lj_asm.c
> index d71fa8c8..65261d50 100644
> --- a/src/lj_asm.c
> +++ b/src/lj_asm.c
> @@ -1650,7 +1650,6 @@ static void asm_loop(ASMState *as)
>  #if !LJ_SOFTFP32
>  #if !LJ_TARGET_X86ORX64
>  #define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
> -#define asm_fppowi(as, ir)	asm_callid(as, ir, IRCALL_lj_vm_powi)
>  #endif
>  
>  static void asm_pow(ASMState *as, IRIns *ir)
> @@ -1661,10 +1660,8 @@ static void asm_pow(ASMState *as, IRIns *ir)
>  					  IRCALL_lj_carith_powu64);
>    else
>  #endif
> -  if (irt_isnum(IR(ir->op2)->t))
> -    asm_callid(as, ir, IRCALL_pow);
> -  else
> -    asm_fppowi(as, ir);
> +  asm_callid(as, ir, irt_isnum(IR(ir->op2)->t) ? IRCALL_lj_vm_pow :
> +						 IRCALL_lj_vm_powi);
>  }
>  
>  static void asm_div(ASMState *as, IRIns *ir)
> diff --git a/src/lj_asm_x86.h b/src/lj_asm_x86.h
> index 74f2d853..2b810c8d 100644
> --- a/src/lj_asm_x86.h
> +++ b/src/lj_asm_x86.h
> @@ -2005,19 +2005,6 @@ static void asm_ldexp(ASMState *as, IRIns *ir)
>    asm_x87load(as, ir->op2);
>  }
>  
> -static void asm_fppowi(ASMState *as, IRIns *ir)
> -{
> -  /* The modified regs must match with the *.dasc implementation. */
> -  RegSet drop = RSET_RANGE(RID_XMM0, RID_XMM1+1)|RID2RSET(RID_EAX);
> -  if (ra_hasreg(ir->r))
> -    rset_clear(drop, ir->r);  /* Dest reg handled below. */
> -  ra_evictset(as, drop);
> -  ra_destreg(as, ir, RID_XMM0);
> -  emit_call(as, lj_vm_powi_sse);
> -  ra_left(as, RID_XMM0, ir->op1);
> -  ra_left(as, RID_EAX, ir->op2);
> -}
> -
>  static int asm_swapops(ASMState *as, IRIns *ir)
>  {
>    IRIns *irl = IR(ir->op1);
> diff --git a/src/lj_dispatch.h b/src/lj_dispatch.h
> index b8bc2594..af870a75 100644
> --- a/src/lj_dispatch.h
> +++ b/src/lj_dispatch.h
> @@ -44,7 +44,7 @@ extern double __divdf3(double a, double b);
>  #define GOTDEF(_) \
>    _(floor) _(ceil) _(trunc) _(log) _(log10) _(exp) _(sin) _(cos) _(tan) \
>    _(asin) _(acos) _(atan) _(sinh) _(cosh) _(tanh) _(frexp) _(modf) _(atan2) \
> -  _(pow) _(fmod) _(ldexp) _(lj_vm_modi) \
> +  _(lj_vm_pow) _(fmod) _(ldexp) _(lj_vm_modi) \
>    _(lj_dispatch_call) _(lj_dispatch_ins) _(lj_dispatch_stitch) \
>    _(lj_dispatch_profile) _(lj_err_throw) \
>    _(lj_ffh_coroutine_wrap_err) _(lj_func_closeuv) _(lj_func_newL_gc) \
> diff --git a/src/lj_ircall.h b/src/lj_ircall.h
> index af064a6f..ac0888a0 100644
> --- a/src/lj_ircall.h
> +++ b/src/lj_ircall.h
> @@ -195,7 +195,7 @@ typedef struct CCallInfo {
>    _(ANY,	log,			1,   N, NUM, XA_FP) \
>    _(ANY,	lj_vm_log2,		1,   N, NUM, XA_FP) \
>    _(ANY,	lj_vm_powi,		2,   N, NUM, XA_FP) \
> -  _(ANY,	pow,			2,   N, NUM, XA2_FP) \
> +  _(ANY,	lj_vm_pow,		2,   N, NUM, XA2_FP) \
>    _(ANY,	atan2,			2,   N, NUM, XA2_FP) \
>    _(ANY,	ldexp,			2,   N, NUM, XA_FP) \
>    _(SOFTFP,	lj_vm_tobit,		1,   N, INT, XA_FP32) \
> diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
> index 0007107b..7d7cc9d1 100644
> --- a/src/lj_opt_fold.c
> +++ b/src/lj_opt_fold.c
> @@ -1114,33 +1114,6 @@ LJFOLDF(simplify_numpow_xkint)
>    return ref;
>  }
>  
> -LJFOLD(POW any KNUM)
> -LJFOLDF(simplify_numpow_xknum)
> -{
> -  if (knumright == 0.5)  /* x ^ 0.5 ==> sqrt(x) */
> -    return emitir(IRTN(IR_FPMATH), fins->op1, IRFPM_SQRT);
> -  return NEXTFOLD;
> -}
> -
> -LJFOLD(POW KNUM any)
> -LJFOLDF(simplify_numpow_kx)
> -{
> -  lua_Number n = knumleft;
> -  if (n == 2.0 && irt_isint(fright->t)) {  /* 2.0 ^ i ==> ldexp(1.0, i) */
> -#if LJ_TARGET_X86ORX64
> -    /* Different IR_LDEXP calling convention on x86/x64 requires conversion. */
> -    fins->o = IR_CONV;
> -    fins->op1 = fins->op2;
> -    fins->op2 = IRCONV_NUM_INT;
> -    fins->op2 = (IRRef1)lj_opt_fold(J);
> -#endif
> -    fins->op1 = (IRRef1)lj_ir_knum_one(J);
> -    fins->o = IR_LDEXP;
> -    return RETRYFOLD;
> -  }
> -  return NEXTFOLD;
> -}
> -
>  /* -- Simplify conversions ------------------------------------------------ */
>  
>  LJFOLD(CONV CONV IRCONV_NUM_INT)  /* _NUM */
> diff --git a/src/lj_opt_narrow.c b/src/lj_opt_narrow.c
> index 2cfb775b..d6601f4c 100644
> --- a/src/lj_opt_narrow.c
> +++ b/src/lj_opt_narrow.c
> @@ -590,20 +590,14 @@ TRef lj_opt_narrow_pow(jit_State *J, TRef rb, TRef rc, TValue *vb, TValue *vc)
>    rb = conv_str_tonum(J, rb, vb);
>    rb = lj_ir_tonum(J, rb);  /* Left arg is always treated as an FP number. */
>    rc = conv_str_tonum(J, rc, vc);
> -  /* Narrowing must be unconditional to preserve (-x)^i semantics. */
>    if (tvisint(vc) || numisint(numV(vc))) {
> -    int checkrange = 0;
> -    /* pow() is faster for bigger exponents. But do this only for (+k)^i. */
> -    if (tref_isk(rb) && (int32_t)ir_knum(IR(tref_ref(rb)))->u32.hi >= 0) {
> -      int32_t k = numberVint(vc);
> -      if (!(k >= -65536 && k <= 65536)) goto force_pow_num;
> -      checkrange = 1;
> -    }
> +    int32_t k = numberVint(vc);
> +    if (!(k >= -65536 && k <= 65536)) goto force_pow_num;
>      if (!tref_isinteger(rc)) {
>        /* Guarded conversion to integer! */
>        rc = emitir(IRTGI(IR_CONV), rc, IRCONV_INT_NUM|IRCONV_CHECK);
>      }
> -    if (checkrange && !tref_isk(rc)) {  /* Range guard: -65536 <= i <= 65536 */
> +    if (!tref_isk(rc)) {  /* Range guard: -65536 <= i <= 65536 */
>        TRef tmp = emitir(IRTI(IR_ADD), rc, lj_ir_kint(J, 65536));
>        emitir(IRTGI(IR_ULE), tmp, lj_ir_kint(J, 2*65536));
>      }
> diff --git a/src/lj_vm.h b/src/lj_vm.h
> index abaa7c52..f6f28a08 100644
> --- a/src/lj_vm.h
> +++ b/src/lj_vm.h
> @@ -82,10 +82,6 @@ LJ_ASMF int32_t LJ_FASTCALL lj_vm_modi(int32_t, int32_t);
>  LJ_ASMF void lj_vm_floor_sse(void);
>  LJ_ASMF void lj_vm_ceil_sse(void);
>  LJ_ASMF void lj_vm_trunc_sse(void);
> -LJ_ASMF void lj_vm_powi_sse(void);
> -#define lj_vm_powi	NULL
> -#else
> -LJ_ASMF double lj_vm_powi(double, int32_t);
>  #endif
>  #if LJ_TARGET_PPC || LJ_TARGET_ARM64
>  #define lj_vm_trunc	trunc
> @@ -100,6 +96,9 @@ LJ_ASMF int lj_vm_errno(void);
>  #endif
>  #endif
>  
> +LJ_ASMF double lj_vm_powi(double, int32_t);
> +LJ_ASMF double lj_vm_pow(double, double);
> +
>  /* Continuations for metamethods. */
>  LJ_ASMF void lj_cont_cat(void);  /* Continue with concatenation. */
>  LJ_ASMF void lj_cont_ra(void);  /* Store result in RA from instruction. */
> diff --git a/src/lj_vmmath.c b/src/lj_vmmath.c
> index 14e66687..539f955b 100644
> --- a/src/lj_vmmath.c
> +++ b/src/lj_vmmath.c
> @@ -30,11 +30,51 @@ LJ_FUNCA double lj_wrap_sinh(double x) { return sinh(x); }
>  LJ_FUNCA double lj_wrap_cosh(double x) { return cosh(x); }
>  LJ_FUNCA double lj_wrap_tanh(double x) { return tanh(x); }
>  LJ_FUNCA double lj_wrap_atan2(double x, double y) { return atan2(x, y); }
> -LJ_FUNCA double lj_wrap_pow(double x, double y) { return pow(x, y); }
>  LJ_FUNCA double lj_wrap_fmod(double x, double y) { return fmod(x, y); }
>  #endif
>  
> -/* -- Helper functions for generated machine code ------------------------- */
> +/* -- Helper functions ---------------------------------------------------- */
> +
> +/* Unsigned x^k. */
> +static double lj_vm_powui(double x, uint32_t k)
> +{
> +  double y;
> +  lj_assertX(k != 0, "pow with zero exponent");
> +  for (; (k & 1) == 0; k >>= 1) x *= x;
> +  y = x;
> +  if ((k >>= 1) != 0) {
> +    for (;;) {
> +      x *= x;
> +      if (k == 1) break;
> +      if (k & 1) y *= x;
> +      k >>= 1;
> +    }
> +    y *= x;
> +  }
> +  return y;
> +}
> +
> +/* Signed x^k. */
> +double lj_vm_powi(double x, int32_t k)
> +{
> +  if (k > 1)
> +    return lj_vm_powui(x, (uint32_t)k);
> +  else if (k == 1)
> +    return x;
> +  else if (k == 0)
> +    return 1.0;
> +  else
> +    return 1.0 / lj_vm_powui(x, (uint32_t)-k);
> +}
> +
> +double lj_vm_pow(double x, double y)
> +{
> +  int32_t k = lj_num2int(y);
> +  if ((k >= -65536 && k <= 65536) && y == (double)k)
> +    return lj_vm_powi(x, k);
> +  else
> +    return pow(x, y);
> +}
>  
>  double lj_vm_foldarith(double x, double y, int op)
>  {
> @@ -44,7 +84,7 @@ double lj_vm_foldarith(double x, double y, int op)
>    case IR_MUL - IR_ADD: return x*y; break;
>    case IR_DIV - IR_ADD: return x/y; break;
>    case IR_MOD - IR_ADD: return x-lj_vm_floor(x/y)*y; break;
> -  case IR_POW - IR_ADD: return pow(x, y); break;
> +  case IR_POW - IR_ADD: return lj_vm_pow(x, y); break;
>    case IR_NEG - IR_ADD: return -x; break;
>    case IR_ABS - IR_ADD: return fabs(x); break;
>  #if LJ_HASJIT
> @@ -56,6 +96,8 @@ double lj_vm_foldarith(double x, double y, int op)
>    }
>  }
>  
> +/* -- Helper functions for generated machine code ------------------------- */
> +
>  #if (LJ_HASJIT && !(LJ_TARGET_ARM || LJ_TARGET_ARM64 || LJ_TARGET_PPC)) || LJ_TARGET_MIPS
>  int32_t LJ_FASTCALL lj_vm_modi(int32_t a, int32_t b)
>  {
> @@ -80,40 +122,6 @@ double lj_vm_log2(double a)
>  }
>  #endif
>  
> -#if !LJ_TARGET_X86ORX64
> -/* Unsigned x^k. */
> -static double lj_vm_powui(double x, uint32_t k)
> -{
> -  double y;
> -  lj_assertX(k != 0, "pow with zero exponent");
> -  for (; (k & 1) == 0; k >>= 1) x *= x;
> -  y = x;
> -  if ((k >>= 1) != 0) {
> -    for (;;) {
> -      x *= x;
> -      if (k == 1) break;
> -      if (k & 1) y *= x;
> -      k >>= 1;
> -    }
> -    y *= x;
> -  }
> -  return y;
> -}
> -
> -/* Signed x^k. */
> -double lj_vm_powi(double x, int32_t k)
> -{
> -  if (k > 1)
> -    return lj_vm_powui(x, (uint32_t)k);
> -  else if (k == 1)
> -    return x;
> -  else if (k == 0)
> -    return 1.0;
> -  else
> -    return 1.0 / lj_vm_powui(x, (uint32_t)-k);
> -}
> -#endif
> -
>  /* Computes fpm(x) for extended math functions. */
>  double lj_vm_foldfpm(double x, int fpm)
>  {
> diff --git a/src/vm_arm.dasc b/src/vm_arm.dasc
> index 767d31f9..792f0363 100644
> --- a/src/vm_arm.dasc
> +++ b/src/vm_arm.dasc
> @@ -1485,11 +1485,11 @@ static void build_subroutines(BuildCtx *ctx)
>    |.endif
>    |.endmacro
>    |
> -  |.macro math_extern2, func
> +  |.macro math_extern2, name, func
>    |.if HFABI
> -  |  .ffunc_dd math_ .. func
> +  |  .ffunc_dd math_ .. name
>    |.else
> -  |  .ffunc_nn math_ .. func
> +  |  .ffunc_nn math_ .. name
>    |.endif
>    |  .IOS mov RA, BASE
>    |  bl extern func
> @@ -1500,6 +1500,9 @@ static void build_subroutines(BuildCtx *ctx)
>    |  b ->fff_restv
>    |.endif
>    |.endmacro
> +  |.macro math_extern2, func
> +  |  math_extern2 func, func
> +  |.endmacro
>    |
>    |.if FPU
>    |  .ffunc_d math_sqrt
> @@ -1545,7 +1548,7 @@ static void build_subroutines(BuildCtx *ctx)
>    |  math_extern sinh
>    |  math_extern cosh
>    |  math_extern tanh
> -  |  math_extern2 pow
> +  |  math_extern2 pow, lj_vm_pow
>    |  math_extern2 atan2
>    |  math_extern2 fmod
>    |
> @@ -3153,7 +3156,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      break;
>    case BC_POW:
>      |  // NYI: (partial) integer arithmetic.
> -    |  ins_arithfp extern, extern pow
> +    |  ins_arithfp extern, extern lj_vm_pow
>      break;
>  
>    case BC_CAT:
> diff --git a/src/vm_arm64.dasc b/src/vm_arm64.dasc
> index de33bde4..fb267a76 100644
> --- a/src/vm_arm64.dasc
> +++ b/src/vm_arm64.dasc
> @@ -1391,11 +1391,14 @@ static void build_subroutines(BuildCtx *ctx)
>    |  b ->fff_resn
>    |.endmacro
>    |
> -  |.macro math_extern2, func
> -  |  .ffunc_nn math_ .. func
> +  |.macro math_extern2, name, func
> +  |  .ffunc_nn math_ .. name
>    |  bl extern func
>    |  b ->fff_resn
>    |.endmacro
> +  |.macro math_extern2, func
> +  |  math_extern2 func, func
> +  |.endmacro
>    |
>    |.ffunc_n math_sqrt
>    |  fsqrt d0, d0
> @@ -1424,7 +1427,7 @@ static void build_subroutines(BuildCtx *ctx)
>    |  math_extern sinh
>    |  math_extern cosh
>    |  math_extern tanh
> -  |  math_extern2 pow
> +  |  math_extern2 pow, lj_vm_pow
>    |  math_extern2 atan2
>    |  math_extern2 fmod
>    |
> @@ -2621,7 +2624,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |  ins_arithload FARG1, FARG2
>      |  ins_arithfallback ins_arithcheck_num
>      |.if "fpins" == "fpow"
> -    |  bl extern pow
> +    |  bl extern lj_vm_pow
>      |.else
>      |  fpins FARG1, FARG1, FARG2
>      |.endif
> diff --git a/src/vm_mips.dasc b/src/vm_mips.dasc
> index 32caabf7..5664f503 100644
> --- a/src/vm_mips.dasc
> +++ b/src/vm_mips.dasc
> @@ -1631,14 +1631,17 @@ static void build_subroutines(BuildCtx *ctx)
>    |.  nop
>    |.endmacro
>    |
> -  |.macro math_extern2, func
> -  |  .ffunc_nn math_ .. func
> +  |.macro math_extern2, name, func
> +  |  .ffunc_nn math_ .. name
>    |.  load_got func
>    |  call_extern
>    |.  nop
>    |  b ->fff_resn
>    |.  nop
>    |.endmacro
> +  |.macro math_extern2, func
> +  |  math_extern2 func, func
> +  |.endmacro
>    |
>    |// TODO: Return integer type if result is integer (own sf implementation).
>    |.macro math_round, func
> @@ -1692,7 +1695,7 @@ static void build_subroutines(BuildCtx *ctx)
>    |  math_extern sinh
>    |  math_extern cosh
>    |  math_extern tanh
> -  |  math_extern2 pow
> +  |  math_extern2 pow, lj_vm_pow
>    |  math_extern2 atan2
>    |  math_extern2 fmod
>    |
> @@ -3585,7 +3588,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |  sltiu AT, SFARG1HI, LJ_TISNUM
>      |  sltiu TMP0, SFARG2HI, LJ_TISNUM
>      |  and AT, AT, TMP0
> -    |  load_got pow
> +    |  load_got lj_vm_pow
>      |  beqz AT, ->vmeta_arith
>      |.  addu RA, BASE, RA
>      |.if FPU
> diff --git a/src/vm_mips64.dasc b/src/vm_mips64.dasc
> index 44fba36c..249605d4 100644
> --- a/src/vm_mips64.dasc
> +++ b/src/vm_mips64.dasc
> @@ -1669,14 +1669,17 @@ static void build_subroutines(BuildCtx *ctx)
>    |.  nop
>    |.endmacro
>    |
> -  |.macro math_extern2, func
> -  |  .ffunc_nn math_ .. func
> +  |.macro math_extern2, name, func
> +  |  .ffunc_nn math_ .. name
>    |.  load_got func
>    |  call_extern
>    |.  nop
>    |  b ->fff_resn
>    |.  nop
>    |.endmacro
> +  |.macro math_extern2, func
> +  |  math_extern2 func, func
> +  |.endmacro
>    |
>    |// TODO: Return integer type if result is integer (own sf implementation).
>    |.macro math_round, func
> @@ -1730,7 +1733,7 @@ static void build_subroutines(BuildCtx *ctx)
>    |  math_extern sinh
>    |  math_extern cosh
>    |  math_extern tanh
> -  |  math_extern2 pow
> +  |  math_extern2 pow, lj_vm_pow
>    |  math_extern2 atan2
>    |  math_extern2 fmod
>    |
> @@ -3823,7 +3826,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |  sltiu TMP0, TMP0, LJ_TISNUM
>      |   sltiu TMP1, TMP1, LJ_TISNUM
>      |  and AT, TMP0, TMP1
> -    |  load_got pow
> +    |  load_got lj_vm_pow
>      |  beqz AT, ->vmeta_arith
>      |.  daddu RA, BASE, RA
>      |.if FPU
> diff --git a/src/vm_ppc.dasc b/src/vm_ppc.dasc
> index 980ad897..94af63e6 100644
> --- a/src/vm_ppc.dasc
> +++ b/src/vm_ppc.dasc
> @@ -2032,11 +2032,14 @@ static void build_subroutines(BuildCtx *ctx)
>    |  b ->fff_resn
>    |.endmacro
>    |
> -  |.macro math_extern2, func
> -  |  .ffunc_nn math_ .. func
> +  |.macro math_extern2, name, func
> +  |  .ffunc_nn math_ .. name
>    |  blex func
>    |  b ->fff_resn
>    |.endmacro
> +  |.macro math_extern2, func
> +  |  math_extern2 func, func
> +  |.endmacro
>    |
>    |.macro math_round, func
>    |  .ffunc_1 math_ .. func
> @@ -2161,7 +2164,7 @@ static void build_subroutines(BuildCtx *ctx)
>    |  math_extern sinh
>    |  math_extern cosh
>    |  math_extern tanh
> -  |  math_extern2 pow
> +  |  math_extern2 pow, lj_vm_pow
>    |  math_extern2 atan2
>    |  math_extern2 fmod
>    |
> @@ -4154,7 +4157,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |  checknum cr1, CARG3
>      |  crand 4*cr0+lt, 4*cr0+lt, 4*cr1+lt
>      |  bge ->vmeta_arith_vv
> -    |  blex pow
> +    |  blex lj_vm_pow
>      |  ins_next1
>      |.if FPU
>      |  stfdx FARG1, BASE, RA
> diff --git a/src/vm_x64.dasc b/src/vm_x64.dasc
> index 7b04b928..acbe8dc2 100644
> --- a/src/vm_x64.dasc
> +++ b/src/vm_x64.dasc
> @@ -1825,13 +1825,16 @@ static void build_subroutines(BuildCtx *ctx)
>    |  jmp ->fff_resxmm0
>    |.endmacro
>    |
> -  |.macro math_extern2, func
> -  |  .ffunc_nn math_ .. func
> +  |.macro math_extern2, name, func
> +  |  .ffunc_nn math_ .. name
>    |  mov RB, BASE
>    |  call extern func
>    |  mov BASE, RB
>    |  jmp ->fff_resxmm0
>    |.endmacro
> +  |.macro math_extern2, func
> +  |  math_extern2 func, func
> +  |.endmacro
>    |
>    |  math_extern log10
>    |  math_extern exp
> @@ -1844,7 +1847,7 @@ static void build_subroutines(BuildCtx *ctx)
>    |  math_extern sinh
>    |  math_extern cosh
>    |  math_extern tanh
> -  |  math_extern2 pow
> +  |  math_extern2 pow, lj_vm_pow
>    |  math_extern2 atan2
>    |  math_extern2 fmod
>    |
> @@ -2649,41 +2652,6 @@ static void build_subroutines(BuildCtx *ctx)
>    |  subsd xmm0, xmm1
>    |  ret
>    |
> -  |// Args in xmm0/eax. Ret in xmm0. xmm0-xmm1 and eax modified.
> -  |->vm_powi_sse:
> -  |  cmp eax, 1; jle >6			// i<=1?
> -  |  // Now 1 < (unsigned)i <= 0x80000000.
> -  |1:  // Handle leading zeros.
> -  |  test eax, 1; jnz >2
> -  |  mulsd xmm0, xmm0
> -  |  shr eax, 1
> -  |  jmp <1
> -  |2:
> -  |  shr eax, 1; jz >5
> -  |  movaps xmm1, xmm0
> -  |3:  // Handle trailing bits.
> -  |  mulsd xmm0, xmm0
> -  |  shr eax, 1; jz >4
> -  |  jnc <3
> -  |  mulsd xmm1, xmm0
> -  |  jmp <3
> -  |4:
> -  |  mulsd xmm0, xmm1
> -  |5:
> -  |  ret
> -  |6:
> -  |  je <5				// x^1 ==> x
> -  |  jb >7				// x^0 ==> 1
> -  |  neg eax
> -  |  call <1
> -  |  sseconst_1 xmm1, RD
> -  |  divsd xmm1, xmm0
> -  |  movaps xmm0, xmm1
> -  |  ret
> -  |7:
> -  |  sseconst_1 xmm0, RD
> -  |  ret
> -  |
>    |//-----------------------------------------------------------------------
>    |//-- Miscellaneous functions --------------------------------------------
>    |//-----------------------------------------------------------------------
> diff --git a/src/vm_x86.dasc b/src/vm_x86.dasc
> index bd1e940e..bf30cce6 100644
> --- a/src/vm_x86.dasc
> +++ b/src/vm_x86.dasc
> @@ -2240,8 +2240,8 @@ static void build_subroutines(BuildCtx *ctx)
>    |  jmp ->fff_resfp
>    |.endmacro
>    |
> -  |.macro math_extern2, func
> -  |  .ffunc_nnsse math_ .. func
> +  |.macro math_extern2, name, func
> +  |  .ffunc_nnsse math_ .. name
>    |.if not X64
>    |  movsd FPARG1, xmm0
>    |  movsd FPARG3, xmm1
> @@ -2251,6 +2251,9 @@ static void build_subroutines(BuildCtx *ctx)
>    |  mov BASE, RB
>    |  jmp ->fff_resfp
>    |.endmacro
> +  |.macro math_extern2, func
> +  |  math_extern2 func, func
> +  |.endmacro
>    |
>    |  math_extern log10
>    |  math_extern exp
> @@ -2263,7 +2266,7 @@ static void build_subroutines(BuildCtx *ctx)
>    |  math_extern sinh
>    |  math_extern cosh
>    |  math_extern tanh
> -  |  math_extern2 pow
> +  |  math_extern2 pow, lj_vm_pow
>    |  math_extern2 atan2
>    |  math_extern2 fmod
>    |
> @@ -3140,41 +3143,6 @@ static void build_subroutines(BuildCtx *ctx)
>    |  subsd xmm0, xmm1
>    |  ret
>    |
> -  |// Args in xmm0/eax. Ret in xmm0. xmm0-xmm1 and eax modified.
> -  |->vm_powi_sse:
> -  |  cmp eax, 1; jle >6			// i<=1?
> -  |  // Now 1 < (unsigned)i <= 0x80000000.
> -  |1:  // Handle leading zeros.
> -  |  test eax, 1; jnz >2
> -  |  mulsd xmm0, xmm0
> -  |  shr eax, 1
> -  |  jmp <1
> -  |2:
> -  |  shr eax, 1; jz >5
> -  |  movaps xmm1, xmm0
> -  |3:  // Handle trailing bits.
> -  |  mulsd xmm0, xmm0
> -  |  shr eax, 1; jz >4
> -  |  jnc <3
> -  |  mulsd xmm1, xmm0
> -  |  jmp <3
> -  |4:
> -  |  mulsd xmm0, xmm1
> -  |5:
> -  |  ret
> -  |6:
> -  |  je <5				// x^1 ==> x
> -  |  jb >7				// x^0 ==> 1
> -  |  neg eax
> -  |  call <1
> -  |  sseconst_1 xmm1, RDa
> -  |  divsd xmm1, xmm0
> -  |  movaps xmm0, xmm1
> -  |  ret
> -  |7:
> -  |  sseconst_1 xmm0, RDa
> -  |  ret
> -  |
>    |//-----------------------------------------------------------------------
>    |//-- Miscellaneous functions --------------------------------------------
>    |//-----------------------------------------------------------------------
> @@ -3976,7 +3944,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |  movsd FPARG1, xmm0
>      |  movsd FPARG3, xmm1
>      |.endif
> -    |  call extern pow
> +    |  call extern lj_vm_pow
>      |  movzx RA, PC_RA
>      |  mov BASE, RB
>      |.if X64
> diff --git a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
> new file mode 100644
> index 00000000..5129fc45
> --- /dev/null
> +++ b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
> @@ -0,0 +1,89 @@
> +local tap = require('tap')
> +-- Test to demonstrate the incorrect JIT behaviour for different
> +-- power operation optimizations.
> +-- See also:
> +-- https://github.com/LuaJIT/LuaJIT/issues/684.
> +local test = tap.test('lj-684-pow-inconsistencies'):skipcond({
> +  ['Test requires JIT enabled'] = not jit.status(),
> +})
> +
> +local tostring = tostring
> +
> +test:plan(4)
> +
> +jit.opt.start('hotloop=1')
> +
> +-- XXX: Prevent hotcount side effects.
> +jit.off()
> +jit.flush()
> +
> +local res = {}
> +-- -0 ^ 0.5 = 0. Test sign with `tostring()`.
Typo: s/Test/Test the/
> +-- XXX: use local variable to prevent folding via parser.
> +-- XXX: use stack slot out of trace to prevent constant folding.
> +local minus_zero = -0
> +jit.on()
> +for i = 1, 4 do
> +  res[i] = tostring(minus_zero ^ 0.5)
> +end
> +
> +-- XXX: Prevent hotcount side effects.
> +jit.off()
> +jit.flush()
> +
> +test:samevalues(res, ('consistent results for folding (-0) ^ 0.5'))
> +
> +jit.on()
> +-- -inf ^ 0.5 = inf.
> +res = {}
> +local minus_inf = -math.huge
> +jit.on()
> +for i = 1, 4 do
> +  res[i] = minus_inf ^ 0.5
> +end
> +
> +-- XXX: Prevent hotcount side effects.
> +jit.off()
> +jit.flush()
> +
> +test:samevalues(res, ('consistent results for folding (-inf) ^ 0.5'))
> +
> +-- 2921 ^ 0.5 = 0x1.b05ec632536fap+5.
We certainly need to add some explanation here about the precision, because
it is not obvious why these magic numbers should cause any issues.
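For instance, if the point is that a constant power of 0.5 is replaced with
sqrt() on the trace (as the patch 2/5 description mentions) while the
interpreter still goes through pow(), a tiny illustration of the possible
last-bit difference could be (purely a sketch, and whether the bits actually
differ depends on the libm in use):
| print(('%.17g'):format(math.sqrt(2921)))
| print(('%.17g'):format(2921 ^ 0.5))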
> +res = {}
> +-- XXX: use local variable to prevent folding via parser.
> +-- XXX: use stack slot out of trace to prevent constant folding.
> +local corner_case_05 = 2921
> +jit.on()
> +for i = 1, 4 do
> +  res[i] = corner_case_05 ^ 0.5
> +end
> +
> +-- XXX: Prevent hotcount side effects.
> +jit.off()
> +jit.flush()
> +
> +test:samevalues(res, ('consistent results for folding 2921 ^ 0.5'))

I believe it is possible to make a single function with different
parameters for all three cases above.
Something like `test_power(value, power, extra_map)`, so you can do
| res[i] = extra_map(value ^ power)
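Just to sketch the idea (the helper name and its shape are mine, so tweak
them freely; passing the base as an argument should, I believe, be enough to
keep it non-constant on the trace):
| local function test_power(value, power, extra_map)
|   -- XXX: identity by default; `tostring()` is only needed for the -0 case.
|   extra_map = extra_map or function(x) return x end
|   local res = {}
|   jit.on()
|   for i = 1, 4 do
|     res[i] = extra_map(value ^ power)
|   end
|   -- XXX: Prevent hotcount side effects.
|   jit.off()
|   jit.flush()
|   return res
| end
So each case collapses into a one-liner, e.g.:
| res = test_power(minus_zero, 0.5, tostring)
| test:samevalues(res, 'consistent results for folding (-0) ^ 0.5')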

> +
> +-- Narrowing for non-constant base of power operation.
> +local function pow(base, power)
> +  return base ^ power
> +end
> +
> +jit.on()
> +
> +-- Compile function first.
> +pow(1, 2)
> +pow(1, 2)
> +
> +-- Need some value near 1, to avoid infinite result.
Typo: s/Need/We need/
Typo: s/avoid/avoid an/
> +local base = 1.0000000001
> +local power = 65536 * 3
> +local resulting_value = pow(base, power)
> +
> +-- XXX: Prevent hotcount side effects.
> +jit.off()
> +jit.flush()
> +
> +test:is(resulting_value, base ^ power, 'guard for narrowing of power operation')
> +
> +test:done(true)
> diff --git a/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua b/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
> index 21b3a0d9..1f7f65c5 100644
> --- a/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
> +++ b/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
> @@ -16,6 +16,8 @@ local INTERESTING_VALUES = {
>    -- x ^  inf = 0 (inf), if |x| < 1 (|x| > 1).
>    -- x ^ -inf = inf (0), if |x| < 1 (|x| > 1).
>    0.999999, 1.000001, -0.999999, -1.000001,
> +  -- Test power of even numbers optimizations.
> +  2, -2, 0.5, -0.5,
>  }
>  test:plan(1 + (#INTERESTING_VALUES) ^ 2)
>  
> -- 
> 2.41.0
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 5/5] Revert to trival pow() optimizations to prevent inaccuracies.
  2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 5/5] Revert to trival pow() optimizations to prevent inaccuracies Sergey Kaplun via Tarantool-patches
  2023-08-18 12:49   ` Sergey Bronnikov via Tarantool-patches
@ 2023-08-20  9:37   ` Maxim Kokryashkin via Tarantool-patches
  2023-08-21  8:15     ` Sergey Kaplun via Tarantool-patches
  1 sibling, 1 reply; 34+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2023-08-20  9:37 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the patch!
Please consider my comments below.
On Tue, Aug 15, 2023 at 12:36:31PM +0300, Sergey Kaplun wrote:
> From: Mike Pall <mike>
> 
> (cherry-picked from commit 96d6d5032098ea9f0002165394a8774dcaa0c0ce)
> 
> This patch fixes different misbehaviour between JIT-compiled code and
Typo: s/misbehaviour/misbehaviours/
> the interpreter for power operator with the following ways:
Typo: s/with/in/
> * Drop folding optimizations for base ^ n => base * base ..., as far as
>   pow(base, n) isn't interchangeable with just multiplicity of numbers
>   and depends on the <math.h> implementation.
> * Since the internal power function is inaccurate for very big or small
>   powers, it is dropped, and `pow()` from the standard library is used
>   instead. To save consistency between JIT behaviour and the VM
Typo: s/VM/VM,/
>   narrowing optimization is dropped, and only trivial folding
>   optimizations are used. Also, `math_extern2` version with two
>   parameters is dropped, since it's no more used.
Typo: s/more/longer/
> 
> Also, this fixes failures of the [220/502] lib/string/format/num.lua
> test [1] from LuaJIT-test suite.
Typo: s/from/from the/
> 
> [1]: https://www.exploringbinary.com/incorrect-floating-point-to-decimal-conversions/
> 
> Sergey Kaplun:
> * added the description and the test for the problem
> 
> Part of tarantool/tarantool#8825
> ---
>  src/lj_asm.c                                  |  3 +-
>  src/lj_dispatch.h                             |  2 +-
>  src/lj_ffrecord.c                             |  4 +-
>  src/lj_ircall.h                               |  3 +-
>  src/lj_iropt.h                                |  1 -
>  src/lj_opt_fold.c                             | 37 ++++------------
>  src/lj_opt_narrow.c                           | 24 ----------
>  src/lj_opt_split.c                            |  2 +-
>  src/lj_record.c                               |  2 +-
>  src/lj_vm.h                                   |  3 --
>  src/lj_vmmath.c                               | 44 +------------------
>  src/vm_arm.dasc                               | 13 +++---
>  src/vm_arm64.dasc                             | 11 ++---
>  src/vm_mips.dasc                              | 11 ++---
>  src/vm_mips64.dasc                            | 11 ++---
>  src/vm_ppc.dasc                               | 11 ++---
>  src/vm_x64.dasc                               |  9 ++--
>  src/vm_x86.dasc                               | 11 ++---
>  .../lj-684-pow-inconsistencies.test.lua       | 21 ++++++++-
>  19 files changed, 64 insertions(+), 159 deletions(-)
> 
> diff --git a/src/lj_asm.c b/src/lj_asm.c
> index 65261d50..3a1909d5 100644
> --- a/src/lj_asm.c
> +++ b/src/lj_asm.c
> @@ -1660,8 +1660,7 @@ static void asm_pow(ASMState *as, IRIns *ir)
>  					  IRCALL_lj_carith_powu64);
>    else
>  #endif
> -  asm_callid(as, ir, irt_isnum(IR(ir->op2)->t) ? IRCALL_lj_vm_pow :
> -						 IRCALL_lj_vm_powi);
> +  asm_callid(as, ir, IRCALL_pow);
>  }
>  
>  static void asm_div(ASMState *as, IRIns *ir)
> diff --git a/src/lj_dispatch.h b/src/lj_dispatch.h
> index af870a75..b8bc2594 100644
> --- a/src/lj_dispatch.h
> +++ b/src/lj_dispatch.h
> @@ -44,7 +44,7 @@ extern double __divdf3(double a, double b);
>  #define GOTDEF(_) \
>    _(floor) _(ceil) _(trunc) _(log) _(log10) _(exp) _(sin) _(cos) _(tan) \
>    _(asin) _(acos) _(atan) _(sinh) _(cosh) _(tanh) _(frexp) _(modf) _(atan2) \
> -  _(lj_vm_pow) _(fmod) _(ldexp) _(lj_vm_modi) \
> +  _(pow) _(fmod) _(ldexp) _(lj_vm_modi) \
>    _(lj_dispatch_call) _(lj_dispatch_ins) _(lj_dispatch_stitch) \
>    _(lj_dispatch_profile) _(lj_err_throw) \
>    _(lj_ffh_coroutine_wrap_err) _(lj_func_closeuv) _(lj_func_newL_gc) \
> diff --git a/src/lj_ffrecord.c b/src/lj_ffrecord.c
> index 0746ec64..99a6b918 100644
> --- a/src/lj_ffrecord.c
> +++ b/src/lj_ffrecord.c
> @@ -590,8 +590,8 @@ static void LJ_FASTCALL recff_math_call(jit_State *J, RecordFFData *rd)
>  
>  static void LJ_FASTCALL recff_math_pow(jit_State *J, RecordFFData *rd)
>  {
> -  J->base[0] = lj_opt_narrow_pow(J, J->base[0], J->base[1],
> -				 &rd->argv[0], &rd->argv[1]);
> +  J->base[0] = lj_opt_narrow_arith(J, J->base[0], J->base[1],
> +				   &rd->argv[0], &rd->argv[1], IR_POW);
>    UNUSED(rd);
>  }
>  
> diff --git a/src/lj_ircall.h b/src/lj_ircall.h
> index ac0888a0..9c195918 100644
> --- a/src/lj_ircall.h
> +++ b/src/lj_ircall.h
> @@ -194,8 +194,7 @@ typedef struct CCallInfo {
>    _(FPMATH,	sqrt,			1,   N, NUM, XA_FP) \
>    _(ANY,	log,			1,   N, NUM, XA_FP) \
>    _(ANY,	lj_vm_log2,		1,   N, NUM, XA_FP) \
> -  _(ANY,	lj_vm_powi,		2,   N, NUM, XA_FP) \
> -  _(ANY,	lj_vm_pow,		2,   N, NUM, XA2_FP) \
> +  _(ANY,	pow,			2,   N, NUM, XA2_FP) \
>    _(ANY,	atan2,			2,   N, NUM, XA2_FP) \
>    _(ANY,	ldexp,			2,   N, NUM, XA_FP) \
>    _(SOFTFP,	lj_vm_tobit,		1,   N, INT, XA_FP32) \
> diff --git a/src/lj_iropt.h b/src/lj_iropt.h
> index a59ba3f4..7ee1ea86 100644
> --- a/src/lj_iropt.h
> +++ b/src/lj_iropt.h
> @@ -144,7 +144,6 @@ LJ_FUNC TRef lj_opt_narrow_arith(jit_State *J, TRef rb, TRef rc,
>  				 TValue *vb, TValue *vc, IROp op);
>  LJ_FUNC TRef lj_opt_narrow_unm(jit_State *J, TRef rc, TValue *vc);
>  LJ_FUNC TRef lj_opt_narrow_mod(jit_State *J, TRef rb, TRef rc, TValue *vb, TValue *vc);
> -LJ_FUNC TRef lj_opt_narrow_pow(jit_State *J, TRef rb, TRef rc, TValue *vb, TValue *vc);
>  LJ_FUNC IRType lj_opt_narrow_forl(jit_State *J, cTValue *forbase);
>  
>  /* Optimization passes. */
> diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
> index 7d7cc9d1..09e6c87b 100644
> --- a/src/lj_opt_fold.c
> +++ b/src/lj_opt_fold.c
> @@ -236,14 +236,10 @@ LJFOLDF(kfold_fpcall2)
>    return NEXTFOLD;
>  }
>  
> -LJFOLD(POW KNUM KINT)
>  LJFOLD(POW KNUM KNUM)
>  LJFOLDF(kfold_numpow)
>  {
> -  lua_Number a = knumleft;
> -  lua_Number b = fright->o == IR_KINT ? (lua_Number)fright->i : knumright;
> -  lua_Number y = lj_vm_foldarith(a, b, IR_POW - IR_ADD);
> -  return lj_ir_knum(J, y);
> +  return lj_ir_knum(J, lj_vm_foldarith(knumleft, knumright, IR_POW - IR_ADD));
>  }
>  
>  /* Must not use kfold_kref for numbers (could be NaN). */
> @@ -1084,34 +1080,17 @@ LJFOLDF(simplify_nummuldiv_negneg)
>    return RETRYFOLD;
>  }
>  
> -LJFOLD(POW any KINT)
> -LJFOLDF(simplify_numpow_xkint)
> +LJFOLD(POW any KNUM)
> +LJFOLDF(simplify_numpow_k)
>  {
> -  int32_t k = fright->i;
> -  TRef ref = fins->op1;
> -  if (k == 0)  /* x ^ 0 ==> 1 */
> +  if (knumright == 0)  /* x ^ 0 ==> 1 */
>      return lj_ir_knum_one(J);  /* Result must be a number, not an int. */
> -  if (k == 1)  /* x ^ 1 ==> x */
> +  else if (knumright == 1)  /* x ^ 1 ==> x */
>      return LEFTFOLD;
> -  if ((uint32_t)(k+65536) > 2*65536u)  /* Limit code explosion. */
> +  else if (knumright == 2)  /* x ^ 2 ==> x * x */
> +    return emitir(IRTN(IR_MUL), fins->op1, fins->op1);
> +  else
>      return NEXTFOLD;
> -  if (k < 0) {  /* x ^ (-k) ==> (1/x) ^ k. */
> -    ref = emitir(IRTN(IR_DIV), lj_ir_knum_one(J), ref);
> -    k = -k;
> -  }
> -  /* Unroll x^k for 1 <= k <= 65536. */
> -  for (; (k & 1) == 0; k >>= 1)  /* Handle leading zeros. */
> -    ref = emitir(IRTN(IR_MUL), ref, ref);
> -  if ((k >>= 1) != 0) {  /* Handle trailing bits. */
> -    TRef tmp = emitir(IRTN(IR_MUL), ref, ref);
> -    for (; k != 1; k >>= 1) {
> -      if (k & 1)
> -	ref = emitir(IRTN(IR_MUL), ref, tmp);
> -      tmp = emitir(IRTN(IR_MUL), tmp, tmp);
> -    }
> -    ref = emitir(IRTN(IR_MUL), ref, tmp);
> -  }
> -  return ref;
>  }
>  
>  /* -- Simplify conversions ------------------------------------------------ */
> diff --git a/src/lj_opt_narrow.c b/src/lj_opt_narrow.c
> index d6601f4c..db0da10f 100644
> --- a/src/lj_opt_narrow.c
> +++ b/src/lj_opt_narrow.c
> @@ -584,30 +584,6 @@ TRef lj_opt_narrow_mod(jit_State *J, TRef rb, TRef rc, TValue *vb, TValue *vc)
>    return emitir(IRTN(IR_SUB), rb, tmp);
>  }
>  
> -/* Narrowing of power operator or math.pow. */
> -TRef lj_opt_narrow_pow(jit_State *J, TRef rb, TRef rc, TValue *vb, TValue *vc)
> -{
> -  rb = conv_str_tonum(J, rb, vb);
> -  rb = lj_ir_tonum(J, rb);  /* Left arg is always treated as an FP number. */
> -  rc = conv_str_tonum(J, rc, vc);
> -  if (tvisint(vc) || numisint(numV(vc))) {
> -    int32_t k = numberVint(vc);
> -    if (!(k >= -65536 && k <= 65536)) goto force_pow_num;
> -    if (!tref_isinteger(rc)) {
> -      /* Guarded conversion to integer! */
> -      rc = emitir(IRTGI(IR_CONV), rc, IRCONV_INT_NUM|IRCONV_CHECK);
> -    }
> -    if (!tref_isk(rc)) {  /* Range guard: -65536 <= i <= 65536 */
> -      TRef tmp = emitir(IRTI(IR_ADD), rc, lj_ir_kint(J, 65536));
> -      emitir(IRTGI(IR_ULE), tmp, lj_ir_kint(J, 2*65536));
> -    }
> -  } else {
> -force_pow_num:
> -    rc = lj_ir_tonum(J, rc);  /* Want POW(num, num), not POW(num, int). */
> -  }
> -  return emitir(IRTN(IR_POW), rb, rc);
> -}
> -
>  /* -- Predictive narrowing of induction variables ------------------------- */
>  
>  /* Narrow a single runtime value. */
> diff --git a/src/lj_opt_split.c b/src/lj_opt_split.c
> index a619d852..0dc6394f 100644
> --- a/src/lj_opt_split.c
> +++ b/src/lj_opt_split.c
> @@ -400,7 +400,7 @@ static void split_ir(jit_State *J)
>  	hi = split_call_ll(J, hisubst, oir, ir, IRCALL_softfp_div);
>  	break;
>        case IR_POW:
> -	hi = split_call_li(J, hisubst, oir, ir, IRCALL_lj_vm_powi);
> +	hi = split_call_li(J, hisubst, oir, ir, IRCALL_pow);
>  	break;
>        case IR_FPMATH:
>  	hi = split_call_l(J, hisubst, oir, ir, IRCALL_lj_vm_floor + ir->op2);
> diff --git a/src/lj_record.c b/src/lj_record.c
> index d1332bfc..34d1210a 100644
> --- a/src/lj_record.c
> +++ b/src/lj_record.c
> @@ -2268,7 +2268,7 @@ void lj_record_ins(jit_State *J)
>  
>    case BC_POW:
>      if (tref_isnumber_str(rb) && tref_isnumber_str(rc))
> -      rc = lj_opt_narrow_pow(J, rb, rc, rbv, rcv);
> +      rc = lj_opt_narrow_arith(J, rb, rc, rbv, rcv, IR_POW);
>      else
>        rc = rec_mm_arith(J, &ix, MM_pow);
>      break;
> diff --git a/src/lj_vm.h b/src/lj_vm.h
> index f6f28a08..79166e5e 100644
> --- a/src/lj_vm.h
> +++ b/src/lj_vm.h
> @@ -96,9 +96,6 @@ LJ_ASMF int lj_vm_errno(void);
>  #endif
>  #endif
>  
> -LJ_ASMF double lj_vm_powi(double, int32_t);
> -LJ_ASMF double lj_vm_pow(double, double);
> -
>  /* Continuations for metamethods. */
>  LJ_ASMF void lj_cont_cat(void);  /* Continue with concatenation. */
>  LJ_ASMF void lj_cont_ra(void);  /* Store result in RA from instruction. */
> diff --git a/src/lj_vmmath.c b/src/lj_vmmath.c
> index 539f955b..506867f8 100644
> --- a/src/lj_vmmath.c
> +++ b/src/lj_vmmath.c
> @@ -30,52 +30,12 @@ LJ_FUNCA double lj_wrap_sinh(double x) { return sinh(x); }
>  LJ_FUNCA double lj_wrap_cosh(double x) { return cosh(x); }
>  LJ_FUNCA double lj_wrap_tanh(double x) { return tanh(x); }
>  LJ_FUNCA double lj_wrap_atan2(double x, double y) { return atan2(x, y); }
> +LJ_FUNCA double lj_wrap_pow(double x, double y) { return pow(x, y); }
>  LJ_FUNCA double lj_wrap_fmod(double x, double y) { return fmod(x, y); }
>  #endif
>  
>  /* -- Helper functions ---------------------------------------------------- */
>  
> -/* Unsigned x^k. */
> -static double lj_vm_powui(double x, uint32_t k)
> -{
> -  double y;
> -  lj_assertX(k != 0, "pow with zero exponent");
> -  for (; (k & 1) == 0; k >>= 1) x *= x;
> -  y = x;
> -  if ((k >>= 1) != 0) {
> -    for (;;) {
> -      x *= x;
> -      if (k == 1) break;
> -      if (k & 1) y *= x;
> -      k >>= 1;
> -    }
> -    y *= x;
> -  }
> -  return y;
> -}
> -
> -/* Signed x^k. */
> -double lj_vm_powi(double x, int32_t k)
> -{
> -  if (k > 1)
> -    return lj_vm_powui(x, (uint32_t)k);
> -  else if (k == 1)
> -    return x;
> -  else if (k == 0)
> -    return 1.0;
> -  else
> -    return 1.0 / lj_vm_powui(x, (uint32_t)-k);
> -}
> -
> -double lj_vm_pow(double x, double y)
> -{
> -  int32_t k = lj_num2int(y);
> -  if ((k >= -65536 && k <= 65536) && y == (double)k)
> -    return lj_vm_powi(x, k);
> -  else
> -    return pow(x, y);
> -}
> -
>  double lj_vm_foldarith(double x, double y, int op)
>  {
>    switch (op) {
> @@ -84,7 +44,7 @@ double lj_vm_foldarith(double x, double y, int op)
>    case IR_MUL - IR_ADD: return x*y; break;
>    case IR_DIV - IR_ADD: return x/y; break;
>    case IR_MOD - IR_ADD: return x-lj_vm_floor(x/y)*y; break;
> -  case IR_POW - IR_ADD: return lj_vm_pow(x, y); break;
> +  case IR_POW - IR_ADD: return pow(x, y); break;
>    case IR_NEG - IR_ADD: return -x; break;
>    case IR_ABS - IR_ADD: return fabs(x); break;
>  #if LJ_HASJIT
> diff --git a/src/vm_arm.dasc b/src/vm_arm.dasc
> index 792f0363..767d31f9 100644
> --- a/src/vm_arm.dasc
> +++ b/src/vm_arm.dasc
> @@ -1485,11 +1485,11 @@ static void build_subroutines(BuildCtx *ctx)
>    |.endif
>    |.endmacro
>    |
> -  |.macro math_extern2, name, func
> +  |.macro math_extern2, func
>    |.if HFABI
> -  |  .ffunc_dd math_ .. name
> +  |  .ffunc_dd math_ .. func
>    |.else
> -  |  .ffunc_nn math_ .. name
> +  |  .ffunc_nn math_ .. func
>    |.endif
>    |  .IOS mov RA, BASE
>    |  bl extern func
> @@ -1500,9 +1500,6 @@ static void build_subroutines(BuildCtx *ctx)
>    |  b ->fff_restv
>    |.endif
>    |.endmacro
> -  |.macro math_extern2, func
> -  |  math_extern2 func, func
> -  |.endmacro
>    |
>    |.if FPU
>    |  .ffunc_d math_sqrt
> @@ -1548,7 +1545,7 @@ static void build_subroutines(BuildCtx *ctx)
>    |  math_extern sinh
>    |  math_extern cosh
>    |  math_extern tanh
> -  |  math_extern2 pow, lj_vm_pow
> +  |  math_extern2 pow
>    |  math_extern2 atan2
>    |  math_extern2 fmod
>    |
> @@ -3156,7 +3153,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      break;
>    case BC_POW:
>      |  // NYI: (partial) integer arithmetic.
> -    |  ins_arithfp extern, extern lj_vm_pow
> +    |  ins_arithfp extern, extern pow
>      break;
>  
>    case BC_CAT:
> diff --git a/src/vm_arm64.dasc b/src/vm_arm64.dasc
> index fb267a76..de33bde4 100644
> --- a/src/vm_arm64.dasc
> +++ b/src/vm_arm64.dasc
> @@ -1391,14 +1391,11 @@ static void build_subroutines(BuildCtx *ctx)
>    |  b ->fff_resn
>    |.endmacro
>    |
> -  |.macro math_extern2, name, func
> -  |  .ffunc_nn math_ .. name
> +  |.macro math_extern2, func
> +  |  .ffunc_nn math_ .. func
>    |  bl extern func
>    |  b ->fff_resn
>    |.endmacro
> -  |.macro math_extern2, func
> -  |  math_extern2 func, func
> -  |.endmacro
>    |
>    |.ffunc_n math_sqrt
>    |  fsqrt d0, d0
> @@ -1427,7 +1424,7 @@ static void build_subroutines(BuildCtx *ctx)
>    |  math_extern sinh
>    |  math_extern cosh
>    |  math_extern tanh
> -  |  math_extern2 pow, lj_vm_pow
> +  |  math_extern2 pow
>    |  math_extern2 atan2
>    |  math_extern2 fmod
>    |
> @@ -2624,7 +2621,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |  ins_arithload FARG1, FARG2
>      |  ins_arithfallback ins_arithcheck_num
>      |.if "fpins" == "fpow"
> -    |  bl extern lj_vm_pow
> +    |  bl extern pow
>      |.else
>      |  fpins FARG1, FARG1, FARG2
>      |.endif
> diff --git a/src/vm_mips.dasc b/src/vm_mips.dasc
> index 5664f503..32caabf7 100644
> --- a/src/vm_mips.dasc
> +++ b/src/vm_mips.dasc
> @@ -1631,17 +1631,14 @@ static void build_subroutines(BuildCtx *ctx)
>    |.  nop
>    |.endmacro
>    |
> -  |.macro math_extern2, name, func
> -  |  .ffunc_nn math_ .. name
> +  |.macro math_extern2, func
> +  |  .ffunc_nn math_ .. func
>    |.  load_got func
>    |  call_extern
>    |.  nop
>    |  b ->fff_resn
>    |.  nop
>    |.endmacro
> -  |.macro math_extern2, func
> -  |  math_extern2 func, func
> -  |.endmacro
>    |
>    |// TODO: Return integer type if result is integer (own sf implementation).
>    |.macro math_round, func
> @@ -1695,7 +1692,7 @@ static void build_subroutines(BuildCtx *ctx)
>    |  math_extern sinh
>    |  math_extern cosh
>    |  math_extern tanh
> -  |  math_extern2 pow, lj_vm_pow
> +  |  math_extern2 pow
>    |  math_extern2 atan2
>    |  math_extern2 fmod
>    |
> @@ -3588,7 +3585,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |  sltiu AT, SFARG1HI, LJ_TISNUM
>      |  sltiu TMP0, SFARG2HI, LJ_TISNUM
>      |  and AT, AT, TMP0
> -    |  load_got lj_vm_pow
> +    |  load_got pow
>      |  beqz AT, ->vmeta_arith
>      |.  addu RA, BASE, RA
>      |.if FPU
> diff --git a/src/vm_mips64.dasc b/src/vm_mips64.dasc
> index 249605d4..44fba36c 100644
> --- a/src/vm_mips64.dasc
> +++ b/src/vm_mips64.dasc
> @@ -1669,17 +1669,14 @@ static void build_subroutines(BuildCtx *ctx)
>    |.  nop
>    |.endmacro
>    |
> -  |.macro math_extern2, name, func
> -  |  .ffunc_nn math_ .. name
> +  |.macro math_extern2, func
> +  |  .ffunc_nn math_ .. func
>    |.  load_got func
>    |  call_extern
>    |.  nop
>    |  b ->fff_resn
>    |.  nop
>    |.endmacro
> -  |.macro math_extern2, func
> -  |  math_extern2 func, func
> -  |.endmacro
>    |
>    |// TODO: Return integer type if result is integer (own sf implementation).
>    |.macro math_round, func
> @@ -1733,7 +1730,7 @@ static void build_subroutines(BuildCtx *ctx)
>    |  math_extern sinh
>    |  math_extern cosh
>    |  math_extern tanh
> -  |  math_extern2 pow, lj_vm_pow
> +  |  math_extern2 pow
>    |  math_extern2 atan2
>    |  math_extern2 fmod
>    |
> @@ -3826,7 +3823,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |  sltiu TMP0, TMP0, LJ_TISNUM
>      |   sltiu TMP1, TMP1, LJ_TISNUM
>      |  and AT, TMP0, TMP1
> -    |  load_got lj_vm_pow
> +    |  load_got pow
>      |  beqz AT, ->vmeta_arith
>      |.  daddu RA, BASE, RA
>      |.if FPU
> diff --git a/src/vm_ppc.dasc b/src/vm_ppc.dasc
> index 94af63e6..980ad897 100644
> --- a/src/vm_ppc.dasc
> +++ b/src/vm_ppc.dasc
> @@ -2032,14 +2032,11 @@ static void build_subroutines(BuildCtx *ctx)
>    |  b ->fff_resn
>    |.endmacro
>    |
> -  |.macro math_extern2, name, func
> -  |  .ffunc_nn math_ .. name
> +  |.macro math_extern2, func
> +  |  .ffunc_nn math_ .. func
>    |  blex func
>    |  b ->fff_resn
>    |.endmacro
> -  |.macro math_extern2, func
> -  |  math_extern2 func, func
> -  |.endmacro
>    |
>    |.macro math_round, func
>    |  .ffunc_1 math_ .. func
> @@ -2164,7 +2161,7 @@ static void build_subroutines(BuildCtx *ctx)
>    |  math_extern sinh
>    |  math_extern cosh
>    |  math_extern tanh
> -  |  math_extern2 pow, lj_vm_pow
> +  |  math_extern2 pow
>    |  math_extern2 atan2
>    |  math_extern2 fmod
>    |
> @@ -4157,7 +4154,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |  checknum cr1, CARG3
>      |  crand 4*cr0+lt, 4*cr0+lt, 4*cr1+lt
>      |  bge ->vmeta_arith_vv
> -    |  blex lj_vm_pow
> +    |  blex pow
>      |  ins_next1
>      |.if FPU
>      |  stfdx FARG1, BASE, RA
> diff --git a/src/vm_x64.dasc b/src/vm_x64.dasc
> index acbe8dc2..09bf67e5 100644
> --- a/src/vm_x64.dasc
> +++ b/src/vm_x64.dasc
> @@ -1825,16 +1825,13 @@ static void build_subroutines(BuildCtx *ctx)
>    |  jmp ->fff_resxmm0
>    |.endmacro
>    |
> -  |.macro math_extern2, name, func
> -  |  .ffunc_nn math_ .. name
> +  |.macro math_extern2, func
> +  |  .ffunc_nn math_ .. func
>    |  mov RB, BASE
>    |  call extern func
>    |  mov BASE, RB
>    |  jmp ->fff_resxmm0
>    |.endmacro
> -  |.macro math_extern2, func
> -  |  math_extern2 func, func
> -  |.endmacro
>    |
>    |  math_extern log10
>    |  math_extern exp
> @@ -1847,7 +1844,7 @@ static void build_subroutines(BuildCtx *ctx)
>    |  math_extern sinh
>    |  math_extern cosh
>    |  math_extern tanh
> -  |  math_extern2 pow, lj_vm_pow
> +  |  math_extern2 pow
>    |  math_extern2 atan2
>    |  math_extern2 fmod
>    |
> diff --git a/src/vm_x86.dasc b/src/vm_x86.dasc
> index bf30cce6..f16ade1a 100644
> --- a/src/vm_x86.dasc
> +++ b/src/vm_x86.dasc
> @@ -2240,8 +2240,8 @@ static void build_subroutines(BuildCtx *ctx)
>    |  jmp ->fff_resfp
>    |.endmacro
>    |
> -  |.macro math_extern2, name, func
> -  |  .ffunc_nnsse math_ .. name
> +  |.macro math_extern2, func
> +  |  .ffunc_nnsse math_ .. func
>    |.if not X64
>    |  movsd FPARG1, xmm0
>    |  movsd FPARG3, xmm1
> @@ -2251,9 +2251,6 @@ static void build_subroutines(BuildCtx *ctx)
>    |  mov BASE, RB
>    |  jmp ->fff_resfp
>    |.endmacro
> -  |.macro math_extern2, func
> -  |  math_extern2 func, func
> -  |.endmacro
>    |
>    |  math_extern log10
>    |  math_extern exp
> @@ -2266,7 +2263,7 @@ static void build_subroutines(BuildCtx *ctx)
>    |  math_extern sinh
>    |  math_extern cosh
>    |  math_extern tanh
> -  |  math_extern2 pow, lj_vm_pow
> +  |  math_extern2 pow
>    |  math_extern2 atan2
>    |  math_extern2 fmod
>    |
> @@ -3944,7 +3941,7 @@ static void build_ins(BuildCtx *ctx, BCOp op, int defop)
>      |  movsd FPARG1, xmm0
>      |  movsd FPARG3, xmm1
>      |.endif
> -    |  call extern lj_vm_pow
> +    |  call extern pow
>      |  movzx RA, PC_RA
>      |  mov BASE, RB
>      |.if X64
> diff --git a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
> index 5129fc45..ab9db3df 100644
> --- a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
> +++ b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
> @@ -2,14 +2,15 @@ local tap = require('tap')
>  -- Test to demonstrate the incorrect JIT behaviour for different
>  -- power operation optimizations.
>  -- See also:
> --- https://github.com/LuaJIT/LuaJIT/issues/684.
> +-- https://github.com/LuaJIT/LuaJIT/issues/684,
> +-- https://github.com/LuaJIT/LuaJIT/issues/817.
>  local test = tap.test('lj-684-pow-inconsistencies'):skipcond({
>    ['Test requires JIT enabled'] = not jit.status(),
>  })
>  
>  local tostring = tostring
>  
> -test:plan(4)
> +test:plan(5)
>  
>  jit.opt.start('hotloop=1')
>  
> @@ -64,6 +65,22 @@ jit.flush()
>  
>  test:samevalues(res, ('consistent results for folding 2921 ^ 0.5'))
>  
> +-- -948388 ^ 3 = -0x1.7ad0e8ad7439dp+59.
Same as in the previous patch, we need some additional commentary for
those magic numbers.
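For instance (again just a sketch, and whether the last bits really differ
depends on the libm in use), the comment could contrast pow() with the
multiplication chain that the old x^k unrolling used to emit:
| local x = -948388
| -- The power operator ends up in pow() after this patch...
| print(('%.17g'):format(x ^ 3))
| -- ...while the old unrolled x^3 was effectively this product.
| print(('%.17g'):format(x * x * x))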
> +res = {}
> +-- XXX: use local variable to prevent folding via parser.
> +-- XXX: use stack slot out of trace to prevent constant folding.
> +local corner_case_3 = -948388
The naming is misleading: it looks like a test case number, which it is not.
Please also fix this in the previous patch for `corner_case_05`.
> +jit.on()
> +for i = 1, 4 do
> +  res[i] = corner_case_3 ^ 3
> +end
> +
> +-- XXX: Prevent hotcount side effects.
> +jit.off()
> +jit.flush()

If you succeed in making that dedicated function for those test cases in the
fix-ups for the previous patch, this one should be rewritten too.
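Then, with the hypothetical `test_power()` helper from my comment to the
previous patch, this case would boil down to something like:
| res = test_power(-948388, 3)
| test:samevalues(res, 'consistent results for int pow (-948388) ^ 3')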
> +
> +test:samevalues(res, ('consistent results for int pow (-948388) ^ 3'))
> +
>  -- Narrowing for non-constant base of power operation.
>  local function pow(base, power)
>    return base ^ power
> -- 
> 2.41.0
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/5] Remove pow() splitting and cleanup backends.
  2023-08-17 15:33     ` Sergey Kaplun via Tarantool-patches
@ 2023-08-20  9:48       ` Maxim Kokryashkin via Tarantool-patches
  0 siblings, 0 replies; 34+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2023-08-20  9:48 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the fixes!
LGTM now, see my responses below.
On Thu, Aug 17, 2023 at 06:33:31PM +0300, Sergey Kaplun wrote:
> Hi Maxim!
> Thanks for the review!
> Updated considering your comments.
> 
> On 17.08.23, Maxim Kokryashkin wrote:
> > Hi, Sergey!
> > Thanks for the patch!
> > Please consider my comments below.
> > 
> > On Tue, Aug 15, 2023 at 12:36:28PM +0300, Sergey Kaplun wrote:
> > > From: Mike Pall <mike>
> > > 
> > > (cherry-picked from commit b2307c8ad817e350d65cc909a579ca2f77439682)
> > > 
> > > The JIT engine tries to split b^c to exp2(c * log2(b)) with attempt to
> > Typo: s/with attempt/with an attempt/
> 
> Fixed.
> 
> > > rejoin them later for some backends. It adds a dependency on C99
> > > exp2() and log2(), which aren't part of some libm implementations.
> > > Also, for some cases for IEEE754 we can see, that exp2(log2(x)) != x,
> > > due to mathematical functions accuracy and double precision
> > > restrictions. So, the values on the JIT slots and Lua stack are
> > > inconsistent.
> > 
> > There is a lot to it. There are changes in emission, fold optimizations,
> > narrowing, etc. Maybe it is worth mentioning some key changes that
> > happened as a result of that? That way, this changeset is easier to absorb.
> 
> It's mentioned below, or I don't understand the idea.
Well, I think my brain just short-circuited or something. Yep, everything is OK.
> 
> > 
> > > 
> > > This patch removes splitting of pow operator, so IR_POW is emitting for
> > Typo: s/removes/removes the/
> 
> Fixed.
> 
> > > all cases (except power of 0.5 replaced with sqrt operation).
> > Typo: s/except/except for the/
> > Typo: s/0.5/0.5, which is/
> > Typo: s/with sqrt/with the sqrt/
> 
> Fixed all.
> 
> > > 
> > > Also this patch does some refactoring:
> > > 
> > > * Functions `asm_pow()`, `asm_mod()`, `asm_ldexp()`, `asm_div()`
> > >   (replaced with `asm_fpdiv()` for CPU architectures) are moved to the
> > Typo: s/to the/to/
> 
> Fixed.
> 
> > >   <src/lj_asm.c> as far as their implementation is generic for all
> > >   architectures.
> > > * Fusing of IR_HREF + IR_EQ/IR_NE moved to a `asm_fuseequal()`.
> > Typo: s/moved/was moved/
> > Typo: s/to a/to/
> 
> Fixed all.
> 
> > > * Since `lj_vm_exp2()` subroutine and `IRFPM_EXP2` are removed as no
> > >   longer used.
> > I can't understand what this sentence means, please rephrase it.
> 
> Removed "Since" as measleading.
> 
> > > 
> > 
> > What about changes with `asm_cnew`? I think you should mention them too.
> 
> Added.
> 
> > > Sergey Kaplun:
> > > * added the description and the test for the problem
> > > 
> > > Part of tarantool/tarantool#8825
> > > ---
> > >  src/lj_arch.h                                 |   3 -
> > >  src/lj_asm.c                                  | 106 +++++++++++-------
> > >  src/lj_asm_arm.h                              |  10 +-
> > >  src/lj_asm_arm64.h                            |  39 +------
> > >  src/lj_asm_mips.h                             |  38 +------
> > >  src/lj_asm_ppc.h                              |   9 +-
> > >  src/lj_asm_x86.h                              |  37 +-----
> > >  src/lj_ir.h                                   |   2 +-
> > >  src/lj_ircall.h                               |   1 -
> > >  src/lj_opt_fold.c                             |  18 ++-
> > >  src/lj_opt_narrow.c                           |  20 +---
> > >  src/lj_opt_split.c                            |  21 ----
> > >  src/lj_vm.h                                   |   5 -
> > >  src/lj_vmmath.c                               |   8 --
> > >  .../lj-9-pow-inconsistencies.test.lua         |  63 +++++++++++
> > >  15 files changed, 158 insertions(+), 222 deletions(-)
> > >  create mode 100644 test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
> > > 
> > > diff --git a/src/lj_arch.h b/src/lj_arch.h
> > > index cf31a291..3bdbe84e 100644
> > > --- a/src/lj_arch.h
> > > +++ b/src/lj_arch.h
> > > @@ -607,9 +607,6 @@
> > >  #if defined(__ANDROID__) || defined(__symbian__) || LJ_TARGET_XBOX360 || LJ_TARGET_WINDOWS
> > >  #define LUAJIT_NO_LOG2
> > >  #endif
> > > -#if defined(__symbian__) || LJ_TARGET_WINDOWS
> > > -#define LUAJIT_NO_EXP2
> > > -#endif
> > >  #if LJ_TARGET_CONSOLE || (LJ_TARGET_IOS && __IPHONE_OS_VERSION_MIN_REQUIRED >= __IPHONE_8_0)
> > >  #define LJ_NO_SYSTEM		1
> > >  #endif
> > > diff --git a/src/lj_asm.c b/src/lj_asm.c
> > > index b352fd35..a6906b19 100644
> > > --- a/src/lj_asm.c
> > > +++ b/src/lj_asm.c
> > > @@ -1356,32 +1356,6 @@ static void asm_call(ASMState *as, IRIns *ir)
> > >    asm_gencall(as, ci, args);
> > >  }
> > >  
> > > -#if !LJ_SOFTFP32
> > > -static void asm_fppow(ASMState *as, IRIns *ir, IRRef lref, IRRef rref)
> > > -{
> > > -  const CCallInfo *ci = &lj_ir_callinfo[IRCALL_pow];
> > > -  IRRef args[2];
> > > -  args[0] = lref;
> > > -  args[1] = rref;
> > > -  asm_setupresult(as, ir, ci);
> > > -  asm_gencall(as, ci, args);
> > > -}
> > > -
> > > -static int asm_fpjoin_pow(ASMState *as, IRIns *ir)
> > > -{
> > > -  IRIns *irp = IR(ir->op1);
> > > -  if (irp == ir-1 && irp->o == IR_MUL && !ra_used(irp)) {
> > > -    IRIns *irpp = IR(irp->op1);
> > > -    if (irpp == ir-2 && irpp->o == IR_FPMATH &&
> > > -	irpp->op2 == IRFPM_LOG2 && !ra_used(irpp)) {
> > > -      asm_fppow(as, ir, irpp->op1, irp->op2);
> > > -      return 1;
> > > -    }
> > > -  }
> > > -  return 0;
> > > -}
> > > -#endif
> > > -
> > >  /* -- PHI and loop handling ----------------------------------------------- */
> > >  
> > >  /* Break a PHI cycle by renaming to a free register (evict if needed). */
> > > @@ -1652,6 +1626,62 @@ static void asm_loop(ASMState *as)
> > >  #error "Missing assembler for target CPU"
> > >  #endif
> > >  
> > > +/* -- Common instruction helpers ------------------------------------------ */
> > > +
> > > +#if !LJ_SOFTFP32
> > > +#if !LJ_TARGET_X86ORX64
> > > +#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
> > > +#define asm_fppowi(as, ir)	asm_callid(as, ir, IRCALL_lj_vm_powi)
> > > +#endif
> > > +
> > > +static void asm_pow(ASMState *as, IRIns *ir)
> > > +{
> > > +#if LJ_64 && LJ_HASFFI
> > > +  if (!irt_isnum(ir->t))
> > > +    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_powi64 :
> > > +					  IRCALL_lj_carith_powu64);
> > > +  else
> > > +#endif
> > > +  if (irt_isnum(IR(ir->op2)->t))
> > > +    asm_callid(as, ir, IRCALL_pow);
> > > +  else
> > > +    asm_fppowi(as, ir);
> > > +}
> > > +
> > > +static void asm_div(ASMState *as, IRIns *ir)
> > > +{
> > > +#if LJ_64 && LJ_HASFFI
> > > +  if (!irt_isnum(ir->t))
> > > +    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_divi64 :
> > > +					  IRCALL_lj_carith_divu64);
> > > +  else
> > > +#endif
> > > +    asm_fpdiv(as, ir);
> > > +}
> > > +#endif
> > > +
> > > +static void asm_mod(ASMState *as, IRIns *ir)
> > > +{
> > > +#if LJ_64 && LJ_HASFFI
> > > +  if (!irt_isint(ir->t))
> > > +    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_modi64 :
> > > +					  IRCALL_lj_carith_modu64);
> > > +  else
> > > +#endif
> > > +    asm_callid(as, ir, IRCALL_lj_vm_modi);
> > > +}
> > > +
> > > +static void asm_fuseequal(ASMState *as, IRIns *ir)
> > > +{
> > > +  /* Fuse HREF + EQ/NE. */
> > > +  if ((ir-1)->o == IR_HREF && ir->op1 == as->curins-1) {
> > > +    as->curins--;
> > > +    asm_href(as, ir-1, (IROp)ir->o);
> > > +  } else {
> > > +    asm_equal(as, ir);
> > > +  }
> > > +}
> > > +
> > >  /* -- Instruction dispatch ------------------------------------------------ */
> > >  
> > >  /* Assemble a single instruction. */
> > > @@ -1674,14 +1704,7 @@ static void asm_ir(ASMState *as, IRIns *ir)
> > >    case IR_ABC:
> > >      asm_comp(as, ir);
> > >      break;
> > > -  case IR_EQ: case IR_NE:
> > > -    if ((ir-1)->o == IR_HREF && ir->op1 == as->curins-1) {
> > > -      as->curins--;
> > > -      asm_href(as, ir-1, (IROp)ir->o);
> > > -    } else {
> > > -      asm_equal(as, ir);
> > > -    }
> > > -    break;
> > > +  case IR_EQ: case IR_NE: asm_fuseequal(as, ir); break;
> > >  
> > >    case IR_RETF: asm_retf(as, ir); break;
> > >  
> > > @@ -1750,7 +1773,13 @@ static void asm_ir(ASMState *as, IRIns *ir)
> > >    case IR_SNEW: case IR_XSNEW: asm_snew(as, ir); break;
> > >    case IR_TNEW: asm_tnew(as, ir); break;
> > >    case IR_TDUP: asm_tdup(as, ir); break;
> > > -  case IR_CNEW: case IR_CNEWI: asm_cnew(as, ir); break;
> > > +  case IR_CNEW: case IR_CNEWI:
> > > +#if LJ_HASFFI
> > > +    asm_cnew(as, ir);
> > > +#else
> > > +    lua_assert(0);
> > > +#endif
> > > +    break;
> > >  
> > >    /* Buffer operations. */
> > >    case IR_BUFHDR: asm_bufhdr(as, ir); break;
> > > @@ -2215,6 +2244,10 @@ static void asm_setup_regsp(ASMState *as)
> > >  	if (inloop)
> > >  	  as->modset |= RSET_SCRATCH;
> > >  #if LJ_TARGET_X86
> > > +	if (irt_isnum(IR(ir->op2)->t)) {
> > > +	  if (as->evenspill < 4)  /* Leave room to call pow(). */
> > > +	    as->evenspill = 4;
> > > +	}
> > >  	break;
> > >  #else
> > >  	ir->prev = REGSP_HINT(RID_FPRET);
> > > @@ -2240,9 +2273,6 @@ static void asm_setup_regsp(ASMState *as)
> > >  	  continue;
> > >  	}
> > >  	break;
> > > -      } else if (ir->op2 == IRFPM_EXP2 && !LJ_64) {
> > > -	if (as->evenspill < 4)  /* Leave room to call pow(). */
> > > -	  as->evenspill = 4;
> > >        }
> > >  #endif
> > >        if (inloop)
> > > diff --git a/src/lj_asm_arm.h b/src/lj_asm_arm.h
> > > index 2894e5c9..29a07c80 100644
> > > --- a/src/lj_asm_arm.h
> > > +++ b/src/lj_asm_arm.h
> > > @@ -1275,8 +1275,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
> > >    ra_allockreg(as, (int32_t)(sz+sizeof(GCcdata)),
> > >  	       ra_releasetmp(as, ASMREF_TMP1));
> > >  }
> > > -#else
> > > -#define asm_cnew(as, ir)	((void)0)
> > >  #endif
> > >  
> > >  /* -- Write barriers ------------------------------------------------------ */
> > > @@ -1371,8 +1369,6 @@ static void asm_callround(ASMState *as, IRIns *ir, int id)
> > >  
> > >  static void asm_fpmath(ASMState *as, IRIns *ir)
> > >  {
> > > -  if (ir->op2 == IRFPM_EXP2 && asm_fpjoin_pow(as, ir))
> > > -    return;
> > >    if (ir->op2 <= IRFPM_TRUNC)
> > >      asm_callround(as, ir, ir->op2);
> > >    else if (ir->op2 == IRFPM_SQRT)
> > > @@ -1499,14 +1495,10 @@ static void asm_mul(ASMState *as, IRIns *ir)
> > >  #define asm_mulov(as, ir)	asm_mul(as, ir)
> > >  
> > >  #if !LJ_SOFTFP
> > > -#define asm_div(as, ir)		asm_fparith(as, ir, ARMI_VDIV_D)
> > > -#define asm_pow(as, ir)		asm_callid(as, ir, IRCALL_lj_vm_powi)
> > > +#define asm_fpdiv(as, ir)	asm_fparith(as, ir, ARMI_VDIV_D)
> > >  #define asm_abs(as, ir)		asm_fpunary(as, ir, ARMI_VABS_D)
> > > -#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
> > >  #endif
> > >  
> > > -#define asm_mod(as, ir)		asm_callid(as, ir, IRCALL_lj_vm_modi)
> > > -
> > >  static void asm_neg(ASMState *as, IRIns *ir)
> > >  {
> > >  #if !LJ_SOFTFP
> > > diff --git a/src/lj_asm_arm64.h b/src/lj_asm_arm64.h
> > > index aea251a9..c3d6889e 100644
> > > --- a/src/lj_asm_arm64.h
> > > +++ b/src/lj_asm_arm64.h
> > > @@ -1249,8 +1249,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
> > >    ra_allockreg(as, (int32_t)(sz+sizeof(GCcdata)),
> > >  	       ra_releasetmp(as, ASMREF_TMP1));
> > >  }
> > > -#else
> > > -#define asm_cnew(as, ir)	((void)0)
> > >  #endif
> > >  
> > >  /* -- Write barriers ------------------------------------------------------ */
> > > @@ -1327,8 +1325,6 @@ static void asm_fpmath(ASMState *as, IRIns *ir)
> > >    } else if (fpm <= IRFPM_TRUNC) {
> > >      asm_fpunary(as, ir, fpm == IRFPM_FLOOR ? A64I_FRINTMd :
> > >  			fpm == IRFPM_CEIL ? A64I_FRINTPd : A64I_FRINTZd);
> > > -  } else if (fpm == IRFPM_EXP2 && asm_fpjoin_pow(as, ir)) {
> > > -    return;
> > >    } else {
> > >      asm_callid(as, ir, IRCALL_lj_vm_floor + fpm);
> > >    }
> > > @@ -1435,45 +1431,12 @@ static void asm_mul(ASMState *as, IRIns *ir)
> > >    asm_intmul(as, ir);
> > >  }
> > >  
> > > -static void asm_div(ASMState *as, IRIns *ir)
> > > -{
> > > -#if LJ_HASFFI
> > > -  if (!irt_isnum(ir->t))
> > > -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_divi64 :
> > > -					  IRCALL_lj_carith_divu64);
> > > -  else
> > > -#endif
> > > -    asm_fparith(as, ir, A64I_FDIVd);
> > > -}
> > > -
> > > -static void asm_pow(ASMState *as, IRIns *ir)
> > > -{
> > > -#if LJ_HASFFI
> > > -  if (!irt_isnum(ir->t))
> > > -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_powi64 :
> > > -					  IRCALL_lj_carith_powu64);
> > > -  else
> > > -#endif
> > > -    asm_callid(as, ir, IRCALL_lj_vm_powi);
> > > -}
> > > -
> > >  #define asm_addov(as, ir)	asm_add(as, ir)
> > >  #define asm_subov(as, ir)	asm_sub(as, ir)
> > >  #define asm_mulov(as, ir)	asm_mul(as, ir)
> > >  
> > > +#define asm_fpdiv(as, ir)	asm_fparith(as, ir, A64I_FDIVd)
> > >  #define asm_abs(as, ir)		asm_fpunary(as, ir, A64I_FABS)
> > > -#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
> > > -
> > > -static void asm_mod(ASMState *as, IRIns *ir)
> > > -{
> > > -#if LJ_HASFFI
> > > -  if (!irt_isint(ir->t))
> > > -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_modi64 :
> > > -					  IRCALL_lj_carith_modu64);
> > > -  else
> > > -#endif
> > > -    asm_callid(as, ir, IRCALL_lj_vm_modi);
> > > -}
> > >  
> > >  static void asm_neg(ASMState *as, IRIns *ir)
> > >  {
> > > diff --git a/src/lj_asm_mips.h b/src/lj_asm_mips.h
> > > index 4626507b..0f92959b 100644
> > > --- a/src/lj_asm_mips.h
> > > +++ b/src/lj_asm_mips.h
> > > @@ -1613,8 +1613,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
> > >    ra_allockreg(as, (int32_t)(sz+sizeof(GCcdata)),
> > >  	       ra_releasetmp(as, ASMREF_TMP1));
> > >  }
> > > -#else
> > > -#define asm_cnew(as, ir)	((void)0)
> > >  #endif
> > >  
> > >  /* -- Write barriers ------------------------------------------------------ */
> > > @@ -1683,8 +1681,6 @@ static void asm_fpunary(ASMState *as, IRIns *ir, MIPSIns mi)
> > >  #if !LJ_SOFTFP32
> > >  static void asm_fpmath(ASMState *as, IRIns *ir)
> > >  {
> > > -  if (ir->op2 == IRFPM_EXP2 && asm_fpjoin_pow(as, ir))
> > > -    return;
> > >  #if !LJ_SOFTFP
> > >    if (ir->op2 <= IRFPM_TRUNC)
> > >      asm_callround(as, ir, IRCALL_lj_vm_floor + ir->op2);
> > > @@ -1772,41 +1768,13 @@ static void asm_mul(ASMState *as, IRIns *ir)
> > >    }
> > >  }
> > >  
> > > -static void asm_mod(ASMState *as, IRIns *ir)
> > > -{
> > > -#if LJ_64 && LJ_HASFFI
> > > -  if (!irt_isint(ir->t))
> > > -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_modi64 :
> > > -					  IRCALL_lj_carith_modu64);
> > > -  else
> > > -#endif
> > > -    asm_callid(as, ir, IRCALL_lj_vm_modi);
> > > -}
> > > -
> > >  #if !LJ_SOFTFP32
> > > -static void asm_pow(ASMState *as, IRIns *ir)
> > > -{
> > > -#if LJ_64 && LJ_HASFFI
> > > -  if (!irt_isnum(ir->t))
> > > -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_powi64 :
> > > -					  IRCALL_lj_carith_powu64);
> > > -  else
> > > -#endif
> > > -    asm_callid(as, ir, IRCALL_lj_vm_powi);
> > > -}
> > > -
> > > -static void asm_div(ASMState *as, IRIns *ir)
> > > +static void asm_fpdiv(ASMState *as, IRIns *ir)
> > >  {
> > > -#if LJ_64 && LJ_HASFFI
> > > -  if (!irt_isnum(ir->t))
> > > -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_divi64 :
> > > -					  IRCALL_lj_carith_divu64);
> > > -  else
> > > -#endif
> > >  #if !LJ_SOFTFP
> > >      asm_fparith(as, ir, MIPSI_DIV_D);
> > >  #else
> > > -  asm_callid(as, ir, IRCALL_softfp_div);
> > > +    asm_callid(as, ir, IRCALL_softfp_div);
> > >  #endif
> > >  }
> > >  #endif
> > > @@ -1844,8 +1812,6 @@ static void asm_abs(ASMState *as, IRIns *ir)
> > >  }
> > >  #endif
> > >  
> > > -#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
> > > -
> > >  static void asm_arithov(ASMState *as, IRIns *ir)
> > >  {
> > >    /* TODO MIPSR6: bovc/bnvc. Caveat: no delay slot to load RID_TMP. */
> > > diff --git a/src/lj_asm_ppc.h b/src/lj_asm_ppc.h
> > > index 6aaed058..62a5c3e2 100644
> > > --- a/src/lj_asm_ppc.h
> > > +++ b/src/lj_asm_ppc.h
> > > @@ -1177,8 +1177,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
> > >    ra_allockreg(as, (int32_t)(sz+sizeof(GCcdata)),
> > >  	       ra_releasetmp(as, ASMREF_TMP1));
> > >  }
> > > -#else
> > > -#define asm_cnew(as, ir)	((void)0)
> > >  #endif
> > >  
> > >  /* -- Write barriers ------------------------------------------------------ */
> > > @@ -1249,8 +1247,6 @@ static void asm_fpunary(ASMState *as, IRIns *ir, PPCIns pi)
> > >  
> > >  static void asm_fpmath(ASMState *as, IRIns *ir)
> > >  {
> > > -  if (ir->op2 == IRFPM_EXP2 && asm_fpjoin_pow(as, ir))
> > > -    return;
> > >    if (ir->op2 == IRFPM_SQRT && (as->flags & JIT_F_SQRT))
> > >      asm_fpunary(as, ir, PPCI_FSQRT);
> > >    else
> > > @@ -1364,9 +1360,7 @@ static void asm_mul(ASMState *as, IRIns *ir)
> > >    }
> > >  }
> > >  
> > > -#define asm_div(as, ir)		asm_fparith(as, ir, PPCI_FDIV)
> > > -#define asm_mod(as, ir)		asm_callid(as, ir, IRCALL_lj_vm_modi)
> > > -#define asm_pow(as, ir)		asm_callid(as, ir, IRCALL_lj_vm_powi)
> > > +#define asm_fpdiv(as, ir)	asm_fparith(as, ir, PPCI_FDIV)
> > >  
> > >  static void asm_neg(ASMState *as, IRIns *ir)
> > >  {
> > > @@ -1390,7 +1384,6 @@ static void asm_neg(ASMState *as, IRIns *ir)
> > >  }
> > >  
> > >  #define asm_abs(as, ir)		asm_fpunary(as, ir, PPCI_FABS)
> > > -#define asm_ldexp(as, ir)	asm_callid(as, ir, IRCALL_ldexp)
> > >  
> > >  static void asm_arithov(ASMState *as, IRIns *ir, PPCIns pi)
> > >  {
> > > diff --git a/src/lj_asm_x86.h b/src/lj_asm_x86.h
> > > index 63d332ca..5f5fe3cf 100644
> > > --- a/src/lj_asm_x86.h
> > > +++ b/src/lj_asm_x86.h
> > > @@ -1857,8 +1857,6 @@ static void asm_cnew(ASMState *as, IRIns *ir)
> > >    asm_gencall(as, ci, args);
> > >    emit_loadi(as, ra_releasetmp(as, ASMREF_TMP1), (int32_t)(sz+sizeof(GCcdata)));
> > >  }
> > > -#else
> > > -#define asm_cnew(as, ir)	((void)0)
> > >  #endif
> > >  
> > >  /* -- Write barriers ------------------------------------------------------ */
> > > @@ -1964,8 +1962,6 @@ static void asm_fpmath(ASMState *as, IRIns *ir)
> > >  		    fpm == IRFPM_CEIL ? lj_vm_ceil_sse : lj_vm_trunc_sse);
> > >        ra_left(as, RID_XMM0, ir->op1);
> > >      }
> > > -  } else if (fpm == IRFPM_EXP2 && asm_fpjoin_pow(as, ir)) {
> > > -    /* Rejoined to pow(). */
> > >    } else {
> > >      asm_callid(as, ir, IRCALL_lj_vm_floor + fpm);
> > >    }
> > > @@ -2000,17 +1996,6 @@ static void asm_fppowi(ASMState *as, IRIns *ir)
> > >    ra_left(as, RID_EAX, ir->op2);
> > >  }
> > >  
> > > -static void asm_pow(ASMState *as, IRIns *ir)
> > > -{
> > > -#if LJ_64 && LJ_HASFFI
> > > -  if (!irt_isnum(ir->t))
> > > -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_powi64 :
> > > -					  IRCALL_lj_carith_powu64);
> > > -  else
> > > -#endif
> > > -    asm_fppowi(as, ir);
> > > -}
> > > -
> > >  static int asm_swapops(ASMState *as, IRIns *ir)
> > >  {
> > >    IRIns *irl = IR(ir->op1);
> > > @@ -2208,27 +2193,7 @@ static void asm_mul(ASMState *as, IRIns *ir)
> > >      asm_intarith(as, ir, XOg_X_IMUL);
> > >  }
> > >  
> > > -static void asm_div(ASMState *as, IRIns *ir)
> > > -{
> > > -#if LJ_64 && LJ_HASFFI
> > > -  if (!irt_isnum(ir->t))
> > > -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_divi64 :
> > > -					  IRCALL_lj_carith_divu64);
> > > -  else
> > > -#endif
> > > -    asm_fparith(as, ir, XO_DIVSD);
> > > -}
> > > -
> > > -static void asm_mod(ASMState *as, IRIns *ir)
> > > -{
> > > -#if LJ_64 && LJ_HASFFI
> > > -  if (!irt_isint(ir->t))
> > > -    asm_callid(as, ir, irt_isi64(ir->t) ? IRCALL_lj_carith_modi64 :
> > > -					  IRCALL_lj_carith_modu64);
> > > -  else
> > > -#endif
> > > -    asm_callid(as, ir, IRCALL_lj_vm_modi);
> > > -}
> > > +#define asm_fpdiv(as, ir)	asm_fparith(as, ir, XO_DIVSD)
> > >  
> > >  static void asm_neg_not(ASMState *as, IRIns *ir, x86Group3 xg)
> > >  {
> > > diff --git a/src/lj_ir.h b/src/lj_ir.h
> > > index e8bca275..43e55069 100644
> > > --- a/src/lj_ir.h
> > > +++ b/src/lj_ir.h
> > > @@ -177,7 +177,7 @@ LJ_STATIC_ASSERT((int)IR_XLOAD + IRDELTA_L2S == (int)IR_XSTORE);
> > >  /* FPMATH sub-functions. ORDER FPM. */
> > >  #define IRFPMDEF(_) \
> > >    _(FLOOR) _(CEIL) _(TRUNC)  /* Must be first and in this order. */ \
> > > -  _(SQRT) _(EXP2) _(LOG) _(LOG2) \
> > > +  _(SQRT) _(LOG) _(LOG2) \
> > >    _(OTHER)
> > >  
> > >  typedef enum {
> > > diff --git a/src/lj_ircall.h b/src/lj_ircall.h
> > > index bbad35b1..af064a6f 100644
> > > --- a/src/lj_ircall.h
> > > +++ b/src/lj_ircall.h
> > > @@ -192,7 +192,6 @@ typedef struct CCallInfo {
> > >    _(FPMATH,	lj_vm_ceil,		1,   N, NUM, XA_FP) \
> > >    _(FPMATH,	lj_vm_trunc,		1,   N, NUM, XA_FP) \
> > >    _(FPMATH,	sqrt,			1,   N, NUM, XA_FP) \
> > > -  _(ANY,	lj_vm_exp2,		1,   N, NUM, XA_FP) \
> > >    _(ANY,	log,			1,   N, NUM, XA_FP) \
> > >    _(ANY,	lj_vm_log2,		1,   N, NUM, XA_FP) \
> > >    _(ANY,	lj_vm_powi,		2,   N, NUM, XA_FP) \
> > > diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
> > > index 27e489af..cd803d87 100644
> > > --- a/src/lj_opt_fold.c
> > > +++ b/src/lj_opt_fold.c
> > > @@ -237,10 +237,11 @@ LJFOLDF(kfold_fpcall2)
> > >  }
> > >  
> > >  LJFOLD(POW KNUM KINT)
> > > +LJFOLD(POW KNUM KNUM)
> > >  LJFOLDF(kfold_numpow)
> > >  {
> > >    lua_Number a = knumleft;
> > > -  lua_Number b = (lua_Number)fright->i;
> > > +  lua_Number b = fright->o == IR_KINT ? (lua_Number)fright->i : knumright;
> > >    lua_Number y = lj_vm_foldarith(a, b, IR_POW - IR_ADD);
> > >    return lj_ir_knum(J, y);
> > >  }
> > > @@ -1077,7 +1078,7 @@ LJFOLDF(simplify_nummuldiv_negneg)
> > >  }
> > >  
> > >  LJFOLD(POW any KINT)
> > > -LJFOLDF(simplify_numpow_xk)
> > > +LJFOLDF(simplify_numpow_xkint)
> > >  {
> > >    int32_t k = fright->i;
> > >    TRef ref = fins->op1;
> > > @@ -1106,13 +1107,22 @@ LJFOLDF(simplify_numpow_xk)
> > >    return ref;
> > >  }
> > >  
> > > +LJFOLD(POW any KNUM)
> > > +LJFOLDF(simplify_numpow_xknum)
> > > +{
> > > +  if (knumright == 0.5)  /* x ^ 0.5 ==> sqrt(x) */
> > > +    return emitir(IRTN(IR_FPMATH), fins->op1, IRFPM_SQRT);
> > > +  return NEXTFOLD;
> > > +}
> > > +
> > >  LJFOLD(POW KNUM any)
> > >  LJFOLDF(simplify_numpow_kx)
> > >  {
> > >    lua_Number n = knumleft;
> > > -  if (n == 2.0) {  /* 2.0 ^ i ==> ldexp(1.0, tonum(i)) */
> > > -    fins->o = IR_CONV;
> > > +  if (n == 2.0 && irt_isint(fright->t)) {  /* 2.0 ^ i ==> ldexp(1.0, i) */
> > >  #if LJ_TARGET_X86ORX64
> > > +    /* Different IR_LDEXP calling convention on x86/x64 requires conversion. */
> > > +    fins->o = IR_CONV;
> > >      fins->op1 = fins->op2;
> > >      fins->op2 = IRCONV_NUM_INT;
> > >      fins->op2 = (IRRef1)lj_opt_fold(J);
> > > diff --git a/src/lj_opt_narrow.c b/src/lj_opt_narrow.c
> > > index bb61f97b..4f285334 100644
> > > --- a/src/lj_opt_narrow.c
> > > +++ b/src/lj_opt_narrow.c
> > > @@ -593,10 +593,10 @@ TRef lj_opt_narrow_pow(jit_State *J, TRef rb, TRef rc, TValue *vb, TValue *vc)
> > >    /* Narrowing must be unconditional to preserve (-x)^i semantics. */
> > >    if (tvisint(vc) || numisint(numV(vc))) {
> > >      int checkrange = 0;
> > > -    /* Split pow is faster for bigger exponents. But do this only for (+k)^i. */
> > > +    /* pow() is faster for bigger exponents. But do this only for (+k)^i. */
> > >      if (tref_isk(rb) && (int32_t)ir_knum(IR(tref_ref(rb)))->u32.hi >= 0) {
> > >        int32_t k = numberVint(vc);
> > > -      if (!(k >= -65536 && k <= 65536)) goto split_pow;
> > > +      if (!(k >= -65536 && k <= 65536)) goto force_pow_num;
> > >        checkrange = 1;
> > >      }
> > >      if (!tref_isinteger(rc)) {
> > > @@ -607,19 +607,11 @@ TRef lj_opt_narrow_pow(jit_State *J, TRef rb, TRef rc, TValue *vb, TValue *vc)
> > >        TRef tmp = emitir(IRTI(IR_ADD), rc, lj_ir_kint(J, 65536));
> > >        emitir(IRTGI(IR_ULE), tmp, lj_ir_kint(J, 2*65536));
> > >      }
> > > -    return emitir(IRTN(IR_POW), rb, rc);
> > > +  } else {
> > > +force_pow_num:
> > > +    rc = lj_ir_tonum(J, rc);  /* Want POW(num, num), not POW(num, int). */
> > >    }
> > > -split_pow:
> > > -  /* FOLD covers most cases, but some are easier to do here. */
> > > -  if (tref_isk(rb) && tvispone(ir_knum(IR(tref_ref(rb)))))
> > > -    return rb;  /* 1 ^ x ==> 1 */
> > > -  rc = lj_ir_tonum(J, rc);
> > > -  if (tref_isk(rc) && ir_knum(IR(tref_ref(rc)))->n == 0.5)
> > > -    return emitir(IRTN(IR_FPMATH), rb, IRFPM_SQRT);  /* x ^ 0.5 ==> sqrt(x) */
> > > -  /* Split up b^c into exp2(c*log2(b)). Assembler may rejoin later. */
> > > -  rb = emitir(IRTN(IR_FPMATH), rb, IRFPM_LOG2);
> > > -  rc = emitir(IRTN(IR_MUL), rb, rc);
> > > -  return emitir(IRTN(IR_FPMATH), rc, IRFPM_EXP2);
> > > +  return emitir(IRTN(IR_POW), rb, rc);
> > >  }
> > >  
> > >  /* -- Predictive narrowing of induction variables ------------------------- */
> > > diff --git a/src/lj_opt_split.c b/src/lj_opt_split.c
> > > index 2fc36b8d..c10a85cb 100644
> > > --- a/src/lj_opt_split.c
> > > +++ b/src/lj_opt_split.c
> > > @@ -403,27 +403,6 @@ static void split_ir(jit_State *J)
> > >  	hi = split_call_li(J, hisubst, oir, ir, IRCALL_lj_vm_powi);
> > >  	break;
> > >        case IR_FPMATH:
> > > -	/* Try to rejoin pow from EXP2, MUL and LOG2. */
> > > -	if (nir->op2 == IRFPM_EXP2 && nir->op1 > J->loopref) {
> > > -	  IRIns *irp = IR(nir->op1);
> > > -	  if (irp->o == IR_CALLN && irp->op2 == IRCALL_softfp_mul) {
> > > -	    IRIns *irm4 = IR(irp->op1);
> > > -	    IRIns *irm3 = IR(irm4->op1);
> > > -	    IRIns *irm12 = IR(irm3->op1);
> > > -	    IRIns *irl1 = IR(irm12->op1);
> > > -	    if (irm12->op1 > J->loopref && irl1->o == IR_CALLN &&
> > > -		irl1->op2 == IRCALL_lj_vm_log2) {
> > > -	      IRRef tmp = irl1->op1;  /* Recycle first two args from LOG2. */
> > > -	      IRRef arg3 = irm3->op2, arg4 = irm4->op2;
> > > -	      J->cur.nins--;
> > > -	      tmp = split_emit(J, IRT(IR_CARG, IRT_NIL), tmp, arg3);
> > > -	      tmp = split_emit(J, IRT(IR_CARG, IRT_NIL), tmp, arg4);
> > > -	      ir->prev = tmp = split_emit(J, IRTI(IR_CALLN), tmp, IRCALL_pow);
> > > -	      hi = split_emit(J, IRT(IR_HIOP, IRT_SOFTFP), tmp, tmp);
> > > -	      break;
> > > -	    }
> > > -	  }
> > > -	}
> > >  	hi = split_call_l(J, hisubst, oir, ir, IRCALL_lj_vm_floor + ir->op2);
> > >  	break;
> > >        case IR_LDEXP:
> > > diff --git a/src/lj_vm.h b/src/lj_vm.h
> > > index 411caafa..abaa7c52 100644
> > > --- a/src/lj_vm.h
> > > +++ b/src/lj_vm.h
> > > @@ -95,11 +95,6 @@ LJ_ASMF double lj_vm_trunc(double);
> > >  LJ_ASMF double lj_vm_trunc_sf(double);
> > >  #endif
> > >  #endif
> > > -#ifdef LUAJIT_NO_EXP2
> > > -LJ_ASMF double lj_vm_exp2(double);
> > > -#else
> > > -#define lj_vm_exp2	exp2
> > > -#endif
> > >  #if LJ_HASFFI
> > >  LJ_ASMF int lj_vm_errno(void);
> > >  #endif
> > > diff --git a/src/lj_vmmath.c b/src/lj_vmmath.c
> > > index ae4e0f15..9c0d3fde 100644
> > > --- a/src/lj_vmmath.c
> > > +++ b/src/lj_vmmath.c
> > > @@ -79,13 +79,6 @@ double lj_vm_log2(double a)
> > >  }
> > >  #endif
> > >  
> > > -#ifdef LUAJIT_NO_EXP2
> > > -double lj_vm_exp2(double a)
> > > -{
> > > -  return exp(a * 0.6931471805599453);
> > > -}
> > > -#endif
> > > -
> > >  #if !LJ_TARGET_X86ORX64
> > >  /* Unsigned x^k. */
> > >  static double lj_vm_powui(double x, uint32_t k)
> > > @@ -128,7 +121,6 @@ double lj_vm_foldfpm(double x, int fpm)
> > >    case IRFPM_CEIL: return lj_vm_ceil(x);
> > >    case IRFPM_TRUNC: return lj_vm_trunc(x);
> > >    case IRFPM_SQRT: return sqrt(x);
> > > -  case IRFPM_EXP2: return lj_vm_exp2(x);
> > >    case IRFPM_LOG: return log(x);
> > >    case IRFPM_LOG2: return lj_vm_log2(x);
> > >    default: lua_assert(0);
> > > diff --git a/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua b/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
> > > new file mode 100644
> > > index 00000000..21b3a0d9
> > > --- /dev/null
> > > +++ b/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
> > > @@ -0,0 +1,63 @@
> > > +local tap = require('tap')
> > > +-- Test to demonstrate the incorrect JIT behaviour when splitting
> > > +-- IR_POW.
> > > +-- See also https://github.com/LuaJIT/LuaJIT/issues/9.
> > > +local test = tap.test('lj-9-pow-inconsistencies'):skipcond({
> > > +  ['Test requires JIT enabled'] = not jit.status(),
> > > +})
> > > +
> > > +local nan = 0 / 0
> > > +local inf = math.huge
> > > +
> > > +-- Table with some corner cases to check:
> > > +local INTERESTING_VALUES = {
> > > +  -- 0, -0, 1, -1 special cases with nan, inf, etc..
> > > +  0, -0, 1, -1, nan, inf, -inf,
> > > +  -- x ^  inf = 0 (inf), if |x| < 1 (|x| > 1).
> > > +  -- x ^ -inf = inf (0), if |x| < 1 (|x| > 1).
> > > +  0.999999, 1.000001, -0.999999, -1.000001,
> > > +}
> > > +test:plan(1 + (#INTERESTING_VALUES) ^ 2)
> > 
> > I suggest renaming it to `CORNER_CASES`, since `INTERESTING_VALUES`
> > is not very formal.
> 
> Renamed.
> 
> > Also, please mention that not all of the possible pairs are faulty
> > and most of them are left here for two reasons:
> > 1. Improved readability.
> > 2. More extensive and change-proof testing.
> 
> Added the comment.
> 
> > 
> > > +
> > > +jit.opt.start('hotloop=1')
> > > +
> > > +-- The JIT engine tries to split b^c to exp2(c * log2(b)).
> > > +-- For some cases for IEEE754 we can see, that
> > > +-- (double)exp2((double)log2(x)) != x, due to mathematical
> > > +-- functions accuracy and double precision restrictions.
> > > +-- Just use some numbers to observe this misbehaviour.
> > > +local res = {}
> > > +local cnt = 1
> > > +while cnt < 4 do
> > > +  -- XXX: use local variable to prevent folding via parser.
> > > +  local b = -0.90000000001
> > > +  res[cnt] = 1000 ^ b
> > > +  cnt = cnt + 1
> > > +end
> > 
> > Is there a specific reason you decided to use while over for?
> 
> Since I can't remember any, I think there wasn't one, so I replaced it
> with `for`.
> 
> > > +
> > > +test:samevalues(res, 'consistent pow operator behaviour for corner case')
> > > +
> > > +-- Prevent JIT side effects for parent loops.
> > > +jit.off()
> > > +for i = 1, #INTERESTING_VALUES do
> > > +  for j = 1, #INTERESTING_VALUES do
> > > +    local b = INTERESTING_VALUES[i]
> > > +    local c = INTERESTING_VALUES[j]
> > > +    local results = {}
> > > +    local counter = 1
> > > +    jit.on()
> > > +    while counter < 4 do
> > > +      results[counter] = b ^ c
> > > +      counter = counter + 1
> > > +    end
> > Same question about for and while.
> 
> Fixed.
> 
> > > +    -- Prevent JIT side effects.
> > > +    jit.off()
> > > +    jit.flush()
> > Also, I think we should move the part from jit.on() to jit.flush() into
> > a separate function.
> 
> I don't agree here -- we still use tons of variables from the loops,
> and I don't want to see any side effects of the function call in
> traces.
Ok, that is not a big deal.
> 
> See other changes in the iterative patch below:
> 
> ===================================================================
> diff --git a/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua b/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
> index 21b3a0d9..6abba07f 100644
> --- a/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
> +++ b/test/tarantool-tests/lj-9-pow-inconsistencies.test.lua
> @@ -10,14 +10,18 @@ local nan = 0 / 0
>  local inf = math.huge
>  
>  -- Table with some corner cases to check:
> -local INTERESTING_VALUES = {
> +-- Not all of them fail on each CPU architecture, but bruteforce
> +-- is better, than custom enumerated usage for two reasons:
> +-- * Improved readability.
> +-- * More extensive and change-proof testing.
> +local CORNER_CASES = {
>    -- 0, -0, 1, -1 special cases with nan, inf, etc..
>    0, -0, 1, -1, nan, inf, -inf,
>    -- x ^  inf = 0 (inf), if |x| < 1 (|x| > 1).
>    -- x ^ -inf = inf (0), if |x| < 1 (|x| > 1).
>    0.999999, 1.000001, -0.999999, -1.000001,
>  }
> -test:plan(1 + (#INTERESTING_VALUES) ^ 2)
> +test:plan(1 + (#CORNER_CASES) ^ 2)
>  
>  jit.opt.start('hotloop=1')
>  
> @@ -27,28 +31,25 @@ jit.opt.start('hotloop=1')
>  -- functions accuracy and double precision restrictions.
>  -- Just use some numbers to observe this misbehaviour.
>  local res = {}
> -local cnt = 1
> -while cnt < 4 do
> +for i = 1, 4 do
>    -- XXX: use local variable to prevent folding via parser.
>    local b = -0.90000000001
> -  res[cnt] = 1000 ^ b
> -  cnt = cnt + 1
> +  res[i] = 1000 ^ b
>  end
>  
>  test:samevalues(res, 'consistent pow operator behaviour for corner case')
>  
>  -- Prevent JIT side effects for parent loops.
>  jit.off()
> -for i = 1, #INTERESTING_VALUES do
> -  for j = 1, #INTERESTING_VALUES do
> -    local b = INTERESTING_VALUES[i]
> -    local c = INTERESTING_VALUES[j]
> +for i = 1, #CORNER_CASES do
> +  for j = 1, #CORNER_CASES do
> +    local b = CORNER_CASES[i]
> +    local c = CORNER_CASES[j]
>      local results = {}
>      local counter = 1
>      jit.on()
> -    while counter < 4 do
> -      results[counter] = b ^ c
> -      counter = counter + 1
> +    for k = 1, 4 do
> +      results[k] = b ^ c
>      end
>      -- Prevent JIT side effects.
>      jit.off()
> ===================================================================
> 
> <snipped>
> > > -- 
> > > 2.41.0
> > > 
> 
> -- 
> Best regards,
> Sergey Kaplun

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 4/5] Fix pow() optimization inconsistencies.
  2023-08-20  9:26   ` Maxim Kokryashkin via Tarantool-patches
@ 2023-08-21  8:06     ` Sergey Kaplun via Tarantool-patches
  2023-08-21  9:00       ` Maxim Kokryashkin via Tarantool-patches
  0 siblings, 1 reply; 34+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-08-21  8:06 UTC (permalink / raw)
  To: Maxim Kokryashkin; +Cc: tarantool-patches

Hi, Maxim!
Thanks for the review!
See my answers below.

On 20.08.23, Maxim Kokryashkin wrote:
> Hi, Sergey!
> Thanks for the patch!
> Please consider my comments below.
> 
> On Tue, Aug 15, 2023 at 12:36:30PM +0300, Sergey Kaplun wrote:
> > From: Mike Pall <mike>
> > 
> > (cherry-picked from commit 9512d5c1aced61e13e7be2d3208ec7ae3516b458)
> > 
> > This patch fixes different misbehaviour between JIT-compiled code and
> Typo: s/misbehaviour/misbehaviours/

Fixed.

> > the interpreter for power operator with the following ways:
> Typo: s/with the/in the/

Fixed.

> > * Drop folding optimizations for base ^ 0.5 => sqrt(base), as far as
> >   pow(base, 0.5) isn't interchangeable and depends on the <math.h>
> >   implementation.
> > * Drop folding optimizations for 2 ^ int_pow => ldexp(1.0, int_pow), to
> >   avoid dependcy on the <math.h> implementation.
> > * Now `asm_pow()` always assemble a call to the `lj_vm_powi()` function,
> Typo: s/assemble/assembles/

Fixed.

> >   that is general now for all CPU architectures. Using this internal
> >   function instead of toolchain-provided `pow()` guarantees consistency
> Typo: s/of/of the/

Fixed.

> >   between interpreter and JIT results. Also, it drops custom
> Typo: s/drops/drops the/

Fixed.

> >   implementation for the `vm_powi_sse()` on x86_64.
> Typo: s/for the/for/

Fixed.

> > * `math_extern2` macro in the VM may take the second argument, that is
> >   used as the target function to call. The first argument is still the
> >   name for `func_nnsse` macro.
> > * Narrowing for power operation avoids range guard for non-constant base
> >   IR. This leads to invalid result if value on trace is out of range.
> Typo: s/to invalid/to an invalid/

Fixed.

> >   Now it is done unconditionally.
> > 
> > Be aware, that [220/502] lib/string/format/num.lua test [1] from
> Typo: s/from the/from/

I suppose that it should be "from the"? Fixed.

> > LuaJIT-test suite fails after this commit.
> > 
> > [1]: https://www.exploringbinary.com/incorrect-floating-point-to-decimal-conversions/
> > 
> > Sergey Kaplun:
> > * added the description and the test for the problem
> > 
> > Part of tarantool/tarantool#8825
> > ---

<snipped>

> > +local res = {}
> > +-- -0 ^ 0.5 = 0. Test sign with `tostring()`.
> Typo: s/Test/Test the/

Fixed.

> > +-- XXX: use local variable to prevent folding via parser.

<snipped>

> > +
> > +-- 2921 ^ 0.5 = 0x1.b05ec632536fap+5.
> We certainly need to add some explanation here about the precision, because
> it is not obvious why these magic numbers should cause any issues.

I suppose any reader really interested in this may compare the
behaviour of the glibc implementations of `sqrt()` and `pow()`. Also,
the comment would have to mention this implementation, so it would
become too huge and distract the reader from the test case itself.
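
For the record, the comparison I mean is roughly the following (just a
sketch, assuming both calls below end up in the corresponding libm
routines; `%a` is the string.format() extension for hex float output):

| jit.off()
| local x = 2921
| -- If sqrt() and pow() disagree for this base, the two hex dumps
| -- below differ in the last bit.
| print(('%a'):format(math.sqrt(x)))
| print(('%a'):format(math.pow(x, 0.5)))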

Ignoring for now.

> > +res = {}

<snipped>

> > +test:samevalues(res, ('consistent results for folding 2921 ^ 0.5'))
> 
> I believe it is possible to make a single function with different
> parameters for all three cases above.
> Something like `test_power(value, power, extra_map)`, so you can do
> | res[i] = extra_map(value ^ power)

I'm afraid this function doesn't give any improvement in readability;
also, it may change the trace semantics, so I prefer to leave it as is.

Ignoring for now.

> 
> > +

<snipped>

> > +-- Need some value near 1, to avoid infinite result.
> Typo: s/Need/We need/
> Typo: s/avoid/avoid an/

Fixed.

See the iterative patch below.

===================================================================
diff --git a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
index 5129fc45..003fe957 100644
--- a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
+++ b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
@@ -18,7 +18,7 @@ jit.off()
 jit.flush()
 
 local res = {}
--- -0 ^ 0.5 = 0. Test sign with `tostring()`.
+-- -0 ^ 0.5 = 0. Test the sign with `tostring()`.
 -- XXX: use local variable to prevent folding via parser.
 -- XXX: use stack slot out of trace to prevent constant folding.
 local minus_zero = -0
@@ -75,7 +75,7 @@ jit.on()
 pow(1, 2)
 pow(1, 2)
 
--- Need some value near 1, to avoid infinite result.
+-- We need some value near 1, to avoid an infinite result.
 local base = 1.0000000001
 local power = 65536 * 3
 local resulting_value = pow(base, power)
===================================================================

> > +local base = 1.0000000001

<snipped>

> > -- 
> > 2.41.0
> > 

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 4/5] Fix pow() optimization inconsistencies.
  2023-08-18 12:45   ` Sergey Bronnikov via Tarantool-patches
@ 2023-08-21  8:07     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 34+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-08-21  8:07 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review!
Fixed your comments inline.

On 18.08.23, Sergey Bronnikov wrote:
> Hi, Sergey!
> 
> 
> thanks for the patch! LGTM with two minor comments inline.
> 
> 
> On 8/15/23 12:36, Sergey Kaplun wrote:
> > From: Mike Pall <mike>
> >
> > (cherry-picked from commit 9512d5c1aced61e13e7be2d3208ec7ae3516b458)
> >
> > This patch fixes different misbehaviour between JIT-compiled code and
> misbehaviour -> misbehaviours

Fixed.

> > the interpreter for power operator with the following ways:
> > * Drop folding optimizations for base ^ 0.5 => sqrt(base), as far as
> >    pow(base, 0.5) isn't interchangeable and depends on the <math.h>
> >    implementation.
> > * Drop folding optimizations for 2 ^ int_pow => ldexp(1.0, int_pow), to
> >    avoid dependcy on the <math.h> implementation.
> dependcy -> dependency

Fixed.

> > * Now `asm_pow()` always assemble a call to the `lj_vm_powi()` function,

<snipped>

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 5/5] Revert to trival pow() optimizations to prevent inaccuracies.
  2023-08-20  9:37   ` Maxim Kokryashkin via Tarantool-patches
@ 2023-08-21  8:15     ` Sergey Kaplun via Tarantool-patches
  2023-08-21  9:06       ` Maxim Kokryashkin via Tarantool-patches
  0 siblings, 1 reply; 34+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-08-21  8:15 UTC (permalink / raw)
  To: Maxim Kokryashkin; +Cc: tarantool-patches

Hi, Maxim!
Thanks for the review!
See my answers below.

On 20.08.23, Maxim Kokryashkin wrote:
> Hi, Sergey!
> Thanks for the patch!
> Please consider my comments below.
> On Tue, Aug 15, 2023 at 12:36:31PM +0300, Sergey Kaplun wrote:
> > From: Mike Pall <mike>
> > 
> > (cherry-picked from commit 96d6d5032098ea9f0002165394a8774dcaa0c0ce)
> > 
> > This patch fixes different misbehaviour between JIT-compiled code and
> Typo: s/misbehaviour/misbehaviours/

Fixed.

> > the interpreter for power operator with the following ways:
> Typo: s/with/in/

Fixed.

> > * Drop folding optimizations for base ^ n => base * base ..., as far as
> >   pow(base, n) isn't interchangeable with just multiplicity of numbers
> >   and depends on the <math.h> implementation.
> > * Since the internal power function is inaccurate for very big or small
> >   powers, it is dropped, and `pow()` from the standard library is used
> >   instead. To save consistency between JIT behaviour and the VM
> Typo: s/VM/VM,/

Fixed.

> >   narrowing optimization is dropped, and only trivial folding
> >   optimizations are used. Also, `math_extern2` version with two
> >   parameters is dropped, since it's no more used.
> Typo: s/more/longer/

Fixed.

> > 
> > Also, this fixes failures of the [220/502] lib/string/format/num.lua
> > test [1] from LuaJIT-test suite.
> Typo: s/from/from the/

Fixed.

> > 
> > [1]: https://www.exploringbinary.com/incorrect-floating-point-to-decimal-conversions/
> > 
> > Sergey Kaplun:
> > * added the description and the test for the problem
> > 
> > Part of tarantool/tarantool#8825
> > ---

<snipped>

> >  
> > +-- -948388 ^ 3 = -0x1.7ad0e8ad7439dp+59.
> Same as in the previous patch, we need some additinal commentary for
> those magic numbers.

See my answer in the previous reply.

> > +res = {}
> > +-- XXX: use local variable to prevent folding via parser.
> > +-- XXX: use stack slot out of trace to prevent constant folding.
> > +local corner_case_3 = -948388
> Naming is misleading, it seems like it is the test case number,
> which it is not. Please also fix this in the previous patch for `corner_case_5`.

Renamed as follows:

===================================================================
diff --git a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
index 003fe957..418a1557 100644
--- a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
+++ b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
@@ -52,10 +52,10 @@ test:samevalues(res, ('consistent results for folding (-inf) ^ 0.5'))
 res = {}
 -- XXX: use local variable to prevent folding via parser.
 -- XXX: use stack slot out of trace to prevent constant folding.
-local corner_case_05 = 2921
+local corner_case_pow_05 = 2921
 jit.on()
 for i = 1, 4 do
-  res[i] = corner_case_05 ^ 0.5
+  res[i] = corner_case_pow_05 ^ 0.5
 end
 
 -- XXX: Prevent hotcount side effects.
===================================================================

===================================================================
diff --git a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
index 23dd44e8..57685a72 100644
--- a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
+++ b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
@@ -69,10 +69,10 @@ test:samevalues(res, ('consistent results for folding 2921 ^ 0.5'))
 res = {}
 -- XXX: use local variable to prevent folding via parser.
 -- XXX: use stack slot out of trace to prevent constant folding.
-local corner_case_3 = -948388
+local corner_case_pow_3 = -948388
 jit.on()
 for i = 1, 4 do
-  res[i] = corner_case_3 ^ 3
+  res[i] = corner_case_pow_3 ^ 3
 end
 
 -- XXX: Prevent hotcount side effects.
===================================================================

Branch is force-pushed.

> > +jit.on()
> > +for i = 1, 4 do
> > +  res[i] = corner_case_3 ^ 3
> > +end
> > +
> > +-- XXX: Prevent hotcount side effects.
> > +jit.off()
> > +jit.flush()
> 
> If you'll succeed making that dedicated function for those test cases in the
> previous patch fix ups, this one should be rewritten too.

See my answer in the previous reply.

> > +
> > +test:samevalues(res, ('consistent results for int pow (-948388) ^ 3'))
> > +
> >  -- Narrowing for non-constant base of power operation.
> >  local function pow(base, power)
> >    return base ^ power
> > -- 
> > 2.41.0
> > 

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 5/5] Revert to trival pow() optimizations to prevent inaccuracies.
  2023-08-18 12:49   ` Sergey Bronnikov via Tarantool-patches
@ 2023-08-21  8:16     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 34+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-08-21  8:16 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review!
Fixed your comments and force-pushed the branch.

On 18.08.23, Sergey Bronnikov wrote:
> Hi, Sergey
> 
> 
> Thanks for the patch!
> 
> typo in subj: trival -> trivial; however, it seems it is not fixable
> because it is a part of the original commit
> 
> LGTM
> 
> 
> On 8/15/23 12:36, Sergey Kaplun wrote:
> > From: Mike Pall <mike>
> >
> > (cherry-picked from commit 96d6d5032098ea9f0002165394a8774dcaa0c0ce)
> >
> > This patch fixes different misbehaviour between JIT-compiled code and
> typo: misbehaviour -> misbehaviours

Fixed.

> > the interpreter for power operator with the following ways:
> > * Drop folding optimizations for base ^ n => base * base ..., as far as
> >    pow(base, n) isn't interchangeable with just multiplicity of numbers
> >    and depends on the <math.h> implementation.
> > * Since the internal power function is inaccurate for very big or small
> >    powers, it is dropped, and `pow()` from the standard library is used
> >    instead. To save consistency between JIT behaviour and the VM
> >    narrowing optimization is dropped, and only trivial folding
> >    optimizations are used. Also, `math_extern2` version with two
> >    parameters is dropped, since it's no more used.
> >
> > Also, this fixes failures of the [220/502] lib/string/format/num.lua
> > test [1] from LuaJIT-test suite.
> >
> > [1]: https://www.exploringbinary.com/incorrect-floating-point-to-decimal-conversions/
> >
> > Sergey Kaplun:
> > * added the description and the test for the problem
> >
> > Part of tarantool/tarantool#8825
> > ---

<snipped>

> >     return base ^ power

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 4/5] Fix pow() optimization inconsistencies.
  2023-08-21  8:06     ` Sergey Kaplun via Tarantool-patches
@ 2023-08-21  9:00       ` Maxim Kokryashkin via Tarantool-patches
  2023-08-21  9:31         ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 34+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2023-08-21  9:00 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the fixes!
LGTM now, see my answers below.
On Mon, Aug 21, 2023 at 11:06:32AM +0300, Sergey Kaplun wrote:
> Hi, Maxim!
> Thanks for the review!
> See my answers below.
> 
> On 20.08.23, Maxim Kokryashkin wrote:
> > Hi, Sergey!
> > Thanks for the patch!
> > Please consider my comments below.
> > 
> > On Tue, Aug 15, 2023 at 12:36:30PM +0300, Sergey Kaplun wrote:
> > > From: Mike Pall <mike>
> > > 
> > > (cherry-picked from commit 9512d5c1aced61e13e7be2d3208ec7ae3516b458)
> > > 
> > > This patch fixes different misbehaviour between JIT-compiled code and
> > Typo: s/misbehaviour/misbehaviours/
> 
> Fixed.
> 
> > > the interpreter for power operator with the following ways:
> > Typo: s/with the/in the/
> 
> Fixed.
> 
> > > * Drop folding optimizations for base ^ 0.5 => sqrt(base), as far as
> > >   pow(base, 0.5) isn't interchangeable and depends on the <math.h>
> > >   implementation.
> > > * Drop folding optimizations for 2 ^ int_pow => ldexp(1.0, int_pow), to
> > >   avoid dependcy on the <math.h> implementation.
> > > * Now `asm_pow()` always assemble a call to the `lj_vm_powi()` function,
> > Typo: s/assemble/assembles/
> 
> Fixed.
> 
> > >   that is general now for all CPU architectures. Using this internal
> > >   function instead of toolchain-provided `pow()` guarantees consistency
> > Typo: s/of/of the/
> 
> Fixed.
> 
> > >   between interpreter and JIT results. Also, it drops custom
> > Typo: s/drops/drops the/
> 
> Fixed.
> 
> > >   implementation for the `vm_powi_sse()` on x86_64.
> > Typo: s/for the/for/
> 
> Fixed.
> 
> > > * `math_extern2` macro in the VM may take the second argument, that is
> > >   used as the target function to call. The first argument is still the
> > >   name for `func_nnsse` macro.
> > > * Narrowing for power operation avoids range guard for non-constant base
> > >   IR. This leads to invalid result if value on trace is out of range.
> > Typo: s/to invalid/to an invalid/
> 
> Fixed.
> 
> > >   Now it is done unconditionally.
> > > 
> > > Be aware, that [220/502] lib/string/format/num.lua test [1] from
> > Typo: s/from the/from/
> 
> I suppose that it should be "from the"? Fixed.
Yep, I got the order wrong, sorry.
> 
> > > LuaJIT-test suite fails after this commit.
> > > 
> > > [1]: https://www.exploringbinary.com/incorrect-floating-point-to-decimal-conversions/
> > > 
> > > Sergey Kaplun:
> > > * added the description and the test for the problem
> > > 
> > > Part of tarantool/tarantool#8825
> > > ---
> 
> <snipped>
> 
> > > +local res = {}
> > > +-- -0 ^ 0.5 = 0. Test sign with `tostring()`.
> > Typo: s/Test/Test the/
> 
> Fixed.
> 
> > > +-- XXX: use local variable to prevent folding via parser.
> 
> <snipped>
> 
> > > +
> > > +-- 2921 ^ 0.5 = 0x1.b05ec632536fap+5.
> > We certainly need to add some explanation here about the precision, because
> > it is not obvious why these magic numbers should cause any issues.
> 
> I suppose any reader really interested in this may compare the
> behaviour of the glibc implementations of `sqrt()` and `pow()`. Also,
> the comment would have to mention this implementation, so it would
> become too huge and distract the reader from the test case itself.
Something like the comment below is sufficient:
| This number has no special meaning and is used as one that gives different
| results when its square root is obtained with glibc's `sqrt` and `power`
| operations, thanks to their implementation nuances.

I strongly suggest adding it to make the test case more understandable.
> 
> Ignoring for now.
> 
> > > +res = {}
> 
> <snipped>
> 
> > > +test:samevalues(res, ('consistent results for folding 2921 ^ 0.5'))
> > 
> > I believe it is possible to make a single function with different
> > parameters for all three cases above.
> > Something like `test_power(value, power, extra_map)`, so you can do
> > | res[i] = extra_map(value ^ power)
> 
> I'm afraid this function doesn't give any improvement in readability;
> also, it may change the trace semantics, so I prefer to leave it as is.
> 
> Ignoring for now.
I've expressed my suggestion incomprehensibly, sorry. Here is what I
meant, something like this:

| local function pow_test_case(value, power, extra_map)
|   local res = {}
|   jit.on()
|   for i = 1, 4 do
|     res[i] = extra_map(value ^ power)
|   end
|
|   -- XXX: Prevent hotcount side effects.
|   jit.off()
|   jit.flush()
|
|   test:samevalues(res, ('consistent results for <...>'))
| end

Anyway, I've checked the jit.dump output myself, and even for the
simple cases the traces are entirely different. With that in mind, I
believe this comment should be ignored, even though this is very sad.

> 
> > 
> > > +
> 
> <snipped>
> 
> > > +-- Need some value near 1, to avoid infinite result.
> > Typo: s/Need/We need/
> > Typo: s/avoid/avoid an/
> 
> Fixed.
> 
> See the iterative patch below.
> 
> ===================================================================
> diff --git a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
> index 5129fc45..003fe957 100644
> --- a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
> +++ b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
> @@ -18,7 +18,7 @@ jit.off()
>  jit.flush()
>  
>  local res = {}
> --- -0 ^ 0.5 = 0. Test sign with `tostring()`.
> +-- -0 ^ 0.5 = 0. Test the sign with `tostring()`.
>  -- XXX: use local variable to prevent folding via parser.
>  -- XXX: use stack slot out of trace to prevent constant folding.
>  local minus_zero = -0
> @@ -75,7 +75,7 @@ jit.on()
>  pow(1, 2)
>  pow(1, 2)
>  
> --- Need some value near 1, to avoid infinite result.
> +-- We need some value near 1, to avoid an infinite result.
>  local base = 1.0000000001
>  local power = 65536 * 3
>  local resulting_value = pow(base, power)
> ===================================================================
> 
> > > +local base = 1.0000000001
> 
> <snipped>
> 
> > > -- 
> > > 2.41.0
> > > 
> 
> -- 
> Best regards,
> Sergey Kaplun

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 5/5] Revert to trival pow() optimizations to prevent inaccuracies.
  2023-08-21  8:15     ` Sergey Kaplun via Tarantool-patches
@ 2023-08-21  9:06       ` Maxim Kokryashkin via Tarantool-patches
  2023-08-21  9:36         ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 34+ messages in thread
From: Maxim Kokryashkin via Tarantool-patches @ 2023-08-21  9:06 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the patch!
LGTM now, except for a single comment below.
Also, see my answers too.
On Mon, Aug 21, 2023 at 11:15:22AM +0300, Sergey Kaplun wrote:
> Hi, Maxim!
> Thanks for the review!
> See my answers below.
> 
> On 20.08.23, Maxim Kokryashkin wrote:
> > Hi, Sergey!
> > Thanks for the patch!
> > Please consider my comments below.
> > On Tue, Aug 15, 2023 at 12:36:31PM +0300, Sergey Kaplun wrote:
> > > From: Mike Pall <mike>
> > > 
> > > (cherry-picked from commit 96d6d5032098ea9f0002165394a8774dcaa0c0ce)
> > > 
> > > This patch fixes different misbehaviour between JIT-compiled code and
> > Typo: s/misbehaviour/misbehaviours/
> 
> Fixed.
> 
> > > the interpreter for power operator with the following ways:
> > Typo: s/with/in/
> 
> Fixed.
> 
> > > * Drop folding optimizations for base ^ n => base * base ..., as far as
> > >   pow(base, n) isn't interchangeable with just multiplicity of numbers
> > >   and depends on the <math.h> implementation.
> > > * Since the internal power function is inaccurate for very big or small
> > >   powers, it is dropped, and `pow()` from the standard library is used
> > >   instead. To save consistency between JIT behaviour and the VM
> > Typo: s/VM/VM,/
> 
> Fixed.
> 
> > >   narrowing optimization is dropped, and only trivial folding
> > >   optimizations are used. Also, `math_extern2` version with two
> > >   parameters is dropped, since it's no more used.
> > Typo: s/more/longer/
> 
> Fixed.
> 
> > > 
> > > Also, this fixes failures of the [220/502] lib/string/format/num.lua
> > > test [1] from LuaJIT-test suite.
> > Typo: s/from/from the/
> 
> Fixed.
> 
> > > 
> > > [1]: https://www.exploringbinary.com/incorrect-floating-point-to-decimal-conversions/
> > > 
> > > Sergey Kaplun:
> > > * added the description and the test for the problem
> > > 
> > > Part of tarantool/tarantool#8825
> > > ---
> 
> <snipped>
> 
> > >  
> > > +-- -948388 ^ 3 = -0x1.7ad0e8ad7439dp+59.
> > Same as in the previous patch, we need some additinal commentary for
> > those magic numbers.
> 
> See my answer in the previous reply.
I think the same commentary as the one I suggested in my reply will do.
> 
> > > +res = {}
> > > +-- XXX: use local variable to prevent folding via parser.
> > > +-- XXX: use stack slot out of trace to prevent constant folding.
> > > +local corner_case_3 = -948388
> > Naming is misleading, it seems like it is the test case number,
> > which it is not. Please also fix this in the previous patch for `corner_case_5`.
> 
> Renamed as the following:
> 
> ===================================================================
> diff --git a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
> index 003fe957..418a1557 100644
> --- a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
> +++ b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
> @@ -52,10 +52,10 @@ test:samevalues(res, ('consistent results for folding (-inf) ^ 0.5'))
>  res = {}
>  -- XXX: use local variable to prevent folding via parser.
>  -- XXX: use stack slot out of trace to prevent constant folding.
> -local corner_case_05 = 2921
> +local corner_case_pow_05 = 2921
>  jit.on()
>  for i = 1, 4 do
> -  res[i] = corner_case_05 ^ 0.5
> +  res[i] = corner_case_pow_05 ^ 0.5
>  end
>  
>  -- XXX: Prevent hotcount side effects.
> ===================================================================
> 
> ===================================================================
> diff --git a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
> index 23dd44e8..57685a72 100644
> --- a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
> +++ b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
> @@ -69,10 +69,10 @@ test:samevalues(res, ('consistent results for folding 2921 ^ 0.5'))
>  res = {}
>  -- XXX: use local variable to prevent folding via parser.
>  -- XXX: use stack slot out of trace to prevent constant folding.
> -local corner_case_3 = -948388
> +local corner_case_pow_3 = -948388
>  jit.on()
>  for i = 1, 4 do
> -  res[i] = corner_case_3 ^ 3
> +  res[i] = corner_case_pow_3 ^ 3
>  end
>  
>  -- XXX: Prevent hotcount side effects.
> ===================================================================
> 
> Branch is force-pushed.
> 
> > > +jit.on()
> > > +for i = 1, 4 do
> > > +  res[i] = corner_case_3 ^ 3
> > > +end
> > > +
> > > +-- XXX: Prevent hotcount side effects.
> > > +jit.off()
> > > +jit.flush()
> > 
> > If you'll succeed making that dedicated function for those test cases in the
> > previous patch fix ups, this one should be rewritten too.
> 
> See my answer in the previous reply.
As I've already said in the previous reply, I've checked the proposed
change myself and it doesn't seem to work well, although it would
increase readability by a great margin. So yep, ignore it.
> 
> > > +
> > > +test:samevalues(res, ('consistent results for int pow (-948388) ^ 3'))
> > > +
> > >  -- Narrowing for non-constant base of power operation.
> > >  local function pow(base, power)
> > >    return base ^ power
> > > -- 
> > > 2.41.0
> > > 
> 
> -- 
> Best regards,
> Sergey Kaplun

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 4/5] Fix pow() optimization inconsistencies.
  2023-08-21  9:00       ` Maxim Kokryashkin via Tarantool-patches
@ 2023-08-21  9:31         ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 34+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-08-21  9:31 UTC (permalink / raw)
  To: Maxim Kokryashkin; +Cc: tarantool-patches

Hi, Maxim!
Thanks for the answers!
Fixed your comment and updated the branch.

On 21.08.23, Maxim Kokryashkin wrote:
> Hi, Sergey!
> Thanks for the fixes!
> LGTM now, see my answers below.
> On Mon, Aug 21, 2023 at 11:06:32AM +0300, Sergey Kaplun wrote:

<snipped>

> > 
> > > > +
> > > > +-- 2921 ^ 0.5 = 0x1.b05ec632536fap+5.
> > > We certainly need to add some explanation here about the precision, because
> > > it is not obvious why these magic numbers should cause any issues.
> > 
> > I suppose any reader really interested in this may compare the
> > behaviour of the glibc implementations of `sqrt()` and `pow()`. Also,
> > the comment would have to mention this implementation, so it would
> > become too huge and distract the reader from the test case itself.
> Something like the comment below is sufficient:
> | This number has no special meaning and is used as one that gives different
> | results when its square root is obtained with glibc's `sqrt` and `power`
> | operations, thanks to their implementation nuances.
> 
> I strongly suggest adding it to make the test case more understandable.

Added. See the iterative patch below:

===================================================================
diff --git a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
index 418a1557..cfd4860d 100644
--- a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
+++ b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
@@ -50,6 +50,9 @@ test:samevalues(res, ('consistent results for folding (-inf) ^ 0.5'))
 
 -- 2921 ^ 0.5 = 0x1.b05ec632536fap+5.
 res = {}
+-- This number has no special meaning and is used as one that
+-- gives different results when its square root is obtained with
+-- glibc's `sqrt()` and `pow()` operations.
 -- XXX: use local variable to prevent folding via parser.
 -- XXX: use stack slot out of trace to prevent constant folding.
 local corner_case_pow_05 = 2921
===================================================================

> > 
> > Ignoring for now.
> > 
> > > > +res = {}
> > 
> > <snipped>
> > 
> > > > +test:samevalues(res, ('consistent results for folding 2921 ^ 0.5'))
> > > 
> > > I believe it is possible to make a single function with different
> > > parameters for all three cases above.
> > > Something like `test_power(value, power, extra_map)`, so you can do
> > > | res[i] = extra_map(value ^ power)
> > 
> > I'm afraid this function doesn't give any improvement in readability;
> > also, it may change the trace semantics, so I prefer to leave it as is.
> > 
> > Ignoring for now.
> I've expressed my suggestion incomprehensively, sorry. Here is what I've meant
> someting like this:
> 
> | local function pow_test_case(value, power, extra_map)
> |   jit.on()
> |   res = {}
> |   jit.on()
> |   for i = 1, 4 do
> |     res[i] = extra_map(value ^ power)
> |   end
> |
> |   -- XXX: Prevent hotcount side effects.
> |   jit.off()
> |   jit.flush()
> |
> |   test:samevalues(res, ('consistent results for <...>'))
> | end
> 
> Anyway, I've checked the jit.dump by myself, and even for the simple
> cases traces are entirely different. With that in mind, I believe, this
> comment should be ignored, even though this is very sad.

Yes, and it also changes the trace semantics: since the power isn't a
constant on the trace, the fold optimization isn't applied.
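
Roughly speaking (just an illustration of the difference, not a literal
dump):

| -- In the test the exponent is a literal, so the recorder sees it as
| -- a numeric constant and the constant-based simplifications may
| -- kick in:
| res[i] = corner_case_pow_05 ^ 0.5
| -- In the helper the exponent is a plain function argument, i.e. a
| -- run-time value on the trace, so those simplifications don't fire:
| res[i] = extra_map(value ^ power)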

> 
> > 
> > > 
> > > > +
> > 
> > <snipped>
> > 
> > > > +-- Need some value near 1, to avoid infinite result.
> > > Typo: s/Need/We need/
> > > Typo: s/avoid/avoid an/
> > 
> > Fixed.
> > 
> > See the iterative patch below.
> > 
> > ===================================================================
> > diff --git a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
> > index 5129fc45..003fe957 100644
> > --- a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
> > +++ b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
> > @@ -18,7 +18,7 @@ jit.off()
> >  jit.flush()
> >  
> >  local res = {}
> > --- -0 ^ 0.5 = 0. Test sign with `tostring()`.
> > +-- -0 ^ 0.5 = 0. Test the sign with `tostring()`.
> >  -- XXX: use local variable to prevent folding via parser.
> >  -- XXX: use stack slot out of trace to prevent constant folding.
> >  local minus_zero = -0
> > @@ -75,7 +75,7 @@ jit.on()
> >  pow(1, 2)
> >  pow(1, 2)
> >  
> > --- Need some value near 1, to avoid infinite result.
> > +-- We need some value near 1, to avoid an infinite result.
> >  local base = 1.0000000001
> >  local power = 65536 * 3
> >  local resulting_value = pow(base, power)
> > ===================================================================
> > 
> > > > +local base = 1.0000000001
> > 
> > <snipped>
> > 
> > > > -- 
> > > > 2.41.0
> > > > 
> > 
> > -- 
> > Best regards,
> > Sergey Kaplun

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 5/5] Revert to trival pow() optimizations to prevent inaccuracies.
  2023-08-21  9:06       ` Maxim Kokryashkin via Tarantool-patches
@ 2023-08-21  9:36         ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 34+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-08-21  9:36 UTC (permalink / raw)
  To: Maxim Kokryashkin; +Cc: tarantool-patches

Hi, Maxim!
Thanks for the updates!
Fixed your comment, see the iterative patch below.
Branch is force-pushed.

On 21.08.23, Maxim Kokryashkin wrote:
> Hi, Sergey!
> Thanks for the patch!
> LGTM now, except for a single comment below.
> Also, see my answers too.

<snipped>

> > > >  
> > > > +-- -948388 ^ 3 = -0x1.7ad0e8ad7439dp+59.
> > > Same as in the previous patch, we need some additinal commentary for
> > > those magic numbers.
> > 
> > See my answer in the previous reply.
> I think the same commentary, as one I suggested in reply, will do.

Added.

===================================================================
diff --git a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
index 10984d33..c0c63cce 100644
--- a/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
+++ b/test/tarantool-tests/lj-684-pow-inconsistencies.test.lua
@@ -70,6 +70,9 @@ test:samevalues(res, ('consistent results for folding 2921 ^ 0.5'))
 
 -- -948388 ^ 3 = -0x1.7ad0e8ad7439dp+59.
 res = {}
+-- This number has no special meaning and is used as one that
+-- gives different results when its power of 3 is obtained with
+-- glibc's `pow()` and `x * x * x` operations.
 -- XXX: use local variable to prevent folding via parser.
 -- XXX: use stack slot out of trace to prevent constant folding.
 local corner_case_pow_3 = -948388
===================================================================
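
For the record, the difference itself can be eyeballed with the same
kind of spot check as for the 2921 case from the previous patch (again
only a sketch, assuming a glibc-based libm behind `math.pow()`):

| local x = -948388
| -- pow() and the plain multiplication may disagree in the last bit.
| print(('%a'):format(x * x * x))
| print(('%a'):format(math.pow(x, 3)))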

> > 

<snipped>

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 1/5] test: introduce `samevalues()` TAP checker
  2023-08-18 11:12       ` Sergey Bronnikov via Tarantool-patches
@ 2023-08-21 10:47         ` Igor Munkin via Tarantool-patches
  2023-08-24  7:44           ` Sergey Bronnikov via Tarantool-patches
  0 siblings, 1 reply; 34+ messages in thread
From: Igor Munkin via Tarantool-patches @ 2023-08-21 10:47 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Sergey,

<snipped>

> > > > The introduced `samevalues()` helper checks that values in range from
> > > > 1, to `table.maxn()` of the given table are exactly the same. It may be
> > > > usefull for test consistency of JIT and VM behaviour. Originally, the
> > > > `arr_is_consistent()` function was introduced in the
> > > > <tarantool-tests/gh-6163-min-max.test.lua>. `samevalues()` has the same
> > > > functionallity (except usage of `table.maxn()` instead `#` operator to
> > > > be sure, that the table we check isn't a sparse array).
> > > I would rename samevalues to something like assert_equals or
> > > assert_items_equals just because
> > > 
> > > similar functions are named in unit testing frameworks and helpers with
> > > prefix assert_
> > As you can see we use naming without _ for exported function in the
> > <tap.lua> module, so additional one with strange naming will be
> > inconsistent.
> > 
> > Also, discussed this naming with Igor and Max offline and this name is
> > OK for them, feel free also to CC Igor to discuss:).
> > 
> > > more readable from my point of view. See names for assertions in luatest
> > > [1] and JUnit (popular unit testing framework).
> > > 
> > > 
> > > 1. https://github.com/tarantool/luatest#list-of-luatest-functions
> > > 
> > > 2.
> > > https://junit.org/junit5/docs/5.0.1/api/org/junit/jupiter/api/Assertions.html
> > > 
> > > 
> Igor, what do you think regarding naming of the introduced function?

Frankly speaking, it was me who originally suggested this name (AFAIR,
but Sergey K. might correct me if I'm wrong), so I'm totally fine with
the naming, and here is why:
1. There are no functions named in the style you're referring to above.
   This may apply to luatest, but definitely not to our version of the
   tap.lua module.
2. All the names *except* <is_deeply> (kept that way for historical
   reasons and due to the many Python lovers who worked on Tarantool, I
   guess) are named in the so-called "Lua way" (you can see many
   examples in the Lua Reference Manual or in popular Lua modules): a
   short lowercase name with no separators like underscores. This
   applies to <samevalues> too.
3. As for me, <assert_items_equals> should validate all the items in
   the table against the *expected* one, whereas <samevalues> just
   checks that the table consists of the same values, and nobody has to
   know that particular value.

All in all, I'm OK with the current name, since it fits the current
naming policy. However, I'm open to other options regarding the
assertion module to be used in our testing suite (of course, out of the
scope of this series).

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 1/5] test: introduce `samevalues()` TAP checker
  2023-08-21 10:47         ` Igor Munkin via Tarantool-patches
@ 2023-08-24  7:44           ` Sergey Bronnikov via Tarantool-patches
  0 siblings, 0 replies; 34+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2023-08-24  7:44 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches

Hi, Igor, Sergey

On 8/21/23 13:47, Igor Munkin wrote:
> Sergey,
>
> <snipped>
>
>>>>> The introduced `samevalues()` helper checks that values in range from
>>>>> 1, to `table.maxn()` of the given table are exactly the same. It may be
>>>>> usefull for test consistency of JIT and VM behaviour. Originally, the
>>>>> `arr_is_consistent()` function was introduced in the
>>>>> <tarantool-tests/gh-6163-min-max.test.lua>. `samevalues()` has the same
>>>>> functionallity (except usage of `table.maxn()` instead `#` operator to
>>>>> be sure, that the table we check isn't a sparse array).
>>>> I would rename samevalues to something like assert_equals or
>>>> assert_items_equals just because
>>>>
>>>> similar functions are named in unit testing frameworks and helpers with
>>>> prefix assert_
>>> As you can see we use naming without _ for exported function in the
>>> <tap.lua> module, so additional one with strange naming will be
>>> inconsistent.
>>>
>>> Also, discussed this naming with Igor and Max offline and this name is
>>> OK for them, feel free also to CC Igor to discuss:).
>>>
>>>> more readable from my point of view. See names for assertions in luatest
>>>> [1] and JUnit (popular unit testing framework).
>>>>
>>>>
>>>> 1. https://github.com/tarantool/luatest#list-of-luatest-functions
>>>>
>>>> 2.
>>>> https://junit.org/junit5/docs/5.0.1/api/org/junit/jupiter/api/Assertions.html
>>>>
>>>>
>> Igor, what do you think regarding naming of the introduced function?
> Frankly speaking, it was me, who originally suggested this name (AFAIR,
> but Sergey K. might correct me if I'm wrong), so I'm totally fine with
> the naming and here why:
> 1. There are no functions named in the style you're referring to above.
>     This may relate to luatest, but definitely not to our version of
>     tap.lua module.
> 2. All the names *except* <is_deeply> for some historical reasons (and
>     due to many Python lovers, that worked in Tarantool, I guess) are
>     named in so-called "Lua-way" (you can see many examples in Lua
>     Reference Manual or in popular Lua modules): short name with in lower
>     case with no separators like underscore or other. This applies to
>     <samevalues> too.
> 3. As for me <assert_items_equals> should validate all the items in
>     table against the one *expected*, however <samevalues> just checks
>     that the table consists of the same values, but nobody has to know
>     this particular value.
>
> All in all, I'm OK with the current name, since it fits to the current
> naming policy. However, I'm open to other options regarding the
> assertion module to be used in our testing suite (of course, out of the
> scope of this series).
>

Igor, thanks for the detailed explanation. The arguments look reasonable to me.

I just wanted to make sure that the choice was not accidental or made
spontaneously.


Sergey, LGTM now.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 0/5] Fix pow inconsistencies and improve asserts
  2023-08-15  9:36 [Tarantool-patches] [PATCH luajit 0/5] Fix pow inconsistencies and improve asserts Sergey Kaplun via Tarantool-patches
                   ` (4 preceding siblings ...)
  2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 5/5] Revert to trival pow() optimizations to prevent inaccuracies Sergey Kaplun via Tarantool-patches
@ 2023-08-24  7:47 ` Sergey Bronnikov via Tarantool-patches
  2023-08-31 15:18 ` Igor Munkin via Tarantool-patches
  6 siblings, 0 replies; 34+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2023-08-24  7:47 UTC (permalink / raw)
  To: Sergey Kaplun, Maxim Kokryashkin; +Cc: tarantool-patches

Hi, Sergey


thanks for the patches!

Patch series LGTM

On 8/15/23 12:36, Sergey Kaplun wrote:
> This patchset fix `^` operator insconsistencies. Since the last 2
> commits are based on the patch "Improve assertions." (*) it is backported as
> well (it's about time).

<snipped>


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 0/5] Fix pow inconsistencies and improve asserts
  2023-08-15  9:36 [Tarantool-patches] [PATCH luajit 0/5] Fix pow inconsistencies and improve asserts Sergey Kaplun via Tarantool-patches
                   ` (5 preceding siblings ...)
  2023-08-24  7:47 ` [Tarantool-patches] [PATCH luajit 0/5] Fix pow inconsistencies and improve asserts Sergey Bronnikov via Tarantool-patches
@ 2023-08-31 15:18 ` Igor Munkin via Tarantool-patches
  6 siblings, 0 replies; 34+ messages in thread
From: Igor Munkin via Tarantool-patches @ 2023-08-31 15:18 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Sergey,

I've checked the patchset into all long-term branches in
tarantool/luajit and bumped a new version in master, release/2.11 and
release/2.10.

On 15.08.23, Sergey Kaplun via Tarantool-patches wrote:
> This patchset fix `^` operator insconsistencies. Since the last 2
> commits are based on the patch "Improve assertions." (*) it is backported as
> well (it's about time).
> 
> (*) Be aware that assertions in <src/luajit.c> aren't replaced in the
> upstream too.
> The following functions/modules contains our own code with assertions,
> so we can discuss some better namings for them:
> * `lj_fullhash()` in <src/lj_str.c>
> * `lua_hashstring()` in <src/lj_api.c>
> * <src/lib_misc.c>
> * <src/lj_mapi.c>
> * <src/lj_memprof.c>
> * <src/lj_sysprof.c>
> * <src/lj_utils_leb128.c>
> * <src/lj_wbuf.c>
> 
> P.S. Unfortunately, I can't find any reproducer for dropping
> optimization of 2 ^ i => ldexp(1.0, i). Please, guide me, if you may
> found any.
> 
> Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-9-pow-inconsistencies
> PR: https://github.com/tarantool/tarantool/pull/8985
> Related issues:
> * https://github.com/tarantool/tarantool/issues/8825
> * https://github.com/LuaJIT/LuaJIT/issues/9
> * https://github.com/LuaJIT/LuaJIT/issues/684
> * https://github.com/LuaJIT/LuaJIT/issues/817
> 
> 
> Mike Pall (4):
>   Remove pow() splitting and cleanup backends.
>   Improve assertions.
>   Fix pow() optimization inconsistencies.
>   Revert to trival pow() optimizations to prevent inaccuracies.
> 
> Sergey Kaplun (1):
>   test: introduce `samevalues()` TAP checker
> 

<snipped>

> 
> -- 
> 2.41.0
> 

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2023-08-31 15:35 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-15  9:36 [Tarantool-patches] [PATCH luajit 0/5] Fix pow inconsistencies and improve asserts Sergey Kaplun via Tarantool-patches
2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 1/5] test: introduce `samevalues()` TAP checker Sergey Kaplun via Tarantool-patches
2023-08-17 14:03   ` Maxim Kokryashkin via Tarantool-patches
2023-08-17 15:03     ` Sergey Kaplun via Tarantool-patches
2023-08-18 10:43   ` Sergey Bronnikov via Tarantool-patches
2023-08-18 10:58     ` Sergey Kaplun via Tarantool-patches
2023-08-18 11:12       ` Sergey Bronnikov via Tarantool-patches
2023-08-21 10:47         ` Igor Munkin via Tarantool-patches
2023-08-24  7:44           ` Sergey Bronnikov via Tarantool-patches
2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 2/5] Remove pow() splitting and cleanup backends Sergey Kaplun via Tarantool-patches
2023-08-17 14:52   ` Maxim Kokryashkin via Tarantool-patches
2023-08-17 15:33     ` Sergey Kaplun via Tarantool-patches
2023-08-20  9:48       ` Maxim Kokryashkin via Tarantool-patches
2023-08-18 11:08   ` Sergey Bronnikov via Tarantool-patches
2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 3/5] Improve assertions Sergey Kaplun via Tarantool-patches
2023-08-17 14:58   ` Maxim Kokryashkin via Tarantool-patches
2023-08-18  7:56     ` Sergey Kaplun via Tarantool-patches
2023-08-18 11:20   ` Sergey Bronnikov via Tarantool-patches
2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 4/5] Fix pow() optimization inconsistencies Sergey Kaplun via Tarantool-patches
2023-08-18 12:45   ` Sergey Bronnikov via Tarantool-patches
2023-08-21  8:07     ` Sergey Kaplun via Tarantool-patches
2023-08-20  9:26   ` Maxim Kokryashkin via Tarantool-patches
2023-08-21  8:06     ` Sergey Kaplun via Tarantool-patches
2023-08-21  9:00       ` Maxim Kokryashkin via Tarantool-patches
2023-08-21  9:31         ` Sergey Kaplun via Tarantool-patches
2023-08-15  9:36 ` [Tarantool-patches] [PATCH luajit 5/5] Revert to trival pow() optimizations to prevent inaccuracies Sergey Kaplun via Tarantool-patches
2023-08-18 12:49   ` Sergey Bronnikov via Tarantool-patches
2023-08-21  8:16     ` Sergey Kaplun via Tarantool-patches
2023-08-20  9:37   ` Maxim Kokryashkin via Tarantool-patches
2023-08-21  8:15     ` Sergey Kaplun via Tarantool-patches
2023-08-21  9:06       ` Maxim Kokryashkin via Tarantool-patches
2023-08-21  9:36         ` Sergey Kaplun via Tarantool-patches
2023-08-24  7:47 ` [Tarantool-patches] [PATCH luajit 0/5] Fix pow inconsistencies and improve asserts Sergey Bronnikov via Tarantool-patches
2023-08-31 15:18 ` Igor Munkin via Tarantool-patches

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox