Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH luajit 0/2] arm64: fix 48-bit addresses issues
@ 2021-07-06 17:40 Sergey Kaplun via Tarantool-patches
  2021-07-06 17:40 ` [Tarantool-patches] [PATCH luajit 1/2] Add support for full-range 64 bit lightuserdata Sergey Kaplun via Tarantool-patches
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-07-06 17:40 UTC (permalink / raw)
  To: Igor Munkin, Sergey Ostanevich; +Cc: tarantool-patches

This patchset provides changes necessary for Tarantool working in docker
on M1 (#2712), except well-known `cur_L` issue [1].

The first patch fixes issue with bad lightuserdata error.
The second patch fixes issue with memory remaping to 48-bit VA space.

LuaJIT branch: https://github.com/tarantool/luajit/tree/skaplun/gh-2712-bad-lightud
Tarantool branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-2712-bad-lightud
Issues:
* https://github.com/tarantool/tarantool/issues/2712
* https://github.com/tarantool/tarantool/issues/6154

[1]: https://github.com/tarantool/tarantool/issues/6189

Mike Pall (2):
  Add support for full-range 64 bit lightuserdata.
  Linux/ARM64: Make mremap() non-moving due to VA space woes.

 doc/status.html                               | 11 ----
 src/jit/dump.lua                              |  4 +-
 src/lib_debug.c                               | 12 ++--
 src/lib_jit.c                                 | 14 ++---
 src/lib_package.c                             |  8 +--
 src/lib_string.c                              |  2 +-
 src/lj_alloc.c                                |  2 +-
 src/lj_api.c                                  | 40 +++++++++++--
 src/lj_ccall.c                                |  2 +-
 src/lj_cconv.c                                |  2 +-
 src/lj_crecord.c                              |  6 +-
 src/lj_dispatch.c                             |  2 +-
 src/lj_ir.c                                   |  6 +-
 src/lj_obj.c                                  |  5 +-
 src/lj_obj.h                                  | 57 ++++++++++++-------
 src/lj_snap.c                                 |  7 ++-
 src/lj_state.c                                |  6 ++
 src/lj_strfmt.c                               |  2 +-
 test/tarantool-tests/CMakeLists.txt           |  1 +
 .../lj-49-bad-lightuserdata.test.lua          | 10 ++++
 .../lj-49-bad-lightuserdata/CMakeLists.txt    |  1 +
 .../testlightuserdata.c                       | 52 +++++++++++++++++
 .../lj-671-arm64-assert-after-mremap.test.lua | 24 ++++++++
 23 files changed, 208 insertions(+), 68 deletions(-)
 create mode 100644 test/tarantool-tests/lj-49-bad-lightuserdata.test.lua
 create mode 100644 test/tarantool-tests/lj-49-bad-lightuserdata/CMakeLists.txt
 create mode 100644 test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
 create mode 100644 test/tarantool-tests/lj-671-arm64-assert-after-mremap.test.lua

-- 
2.31.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Tarantool-patches] [PATCH luajit 1/2] Add support for full-range 64 bit lightuserdata.
  2021-07-06 17:40 [Tarantool-patches] [PATCH luajit 0/2] arm64: fix 48-bit addresses issues Sergey Kaplun via Tarantool-patches
@ 2021-07-06 17:40 ` Sergey Kaplun via Tarantool-patches
  2021-07-27 13:59   ` Igor Munkin via Tarantool-patches
  2021-07-06 17:40 ` [Tarantool-patches] [PATCH luajit 2/2] Linux/ARM64: Make mremap() non-moving due to VA space woes Sergey Kaplun via Tarantool-patches
  2021-08-11  7:21 ` [Tarantool-patches] [PATCH luajit 0/2] arm64: fix 48-bit addresses issues Igor Munkin via Tarantool-patches
  2 siblings, 1 reply; 20+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-07-06 17:40 UTC (permalink / raw)
  To: Igor Munkin, Sergey Ostanevich; +Cc: tarantool-patches

From: Mike Pall <mike>

(cherry picked from commit e9af1abec542e6f9851ff2368e7f196b6382a44c)

LuaJIT uses special NaN-tagging technique to store internal type on
the Lua stack. In case LJ_GC64 first 13 bits are set in special NaN
type (0xfff8...). FPU generates the only one type. The next 4 bits are
used for an internal LuaJIT type of object on stack. The next 47 bits
are used for storing this object's content. For userdata, it is its
address. In case arm64 the pointer can have more than 47 significant
bits [1]. In this case the error BADLU error is raised.

For the support of full 64-bit range lightuserdata pointers two new
fields in GCState are added:

`lightudseg` - vector of segments of lightuserdata storing 32-bit
values. MS 25 bits equal to MS bits of lightuserdata address, the
rest are filled with zeros. The lentgh of the vector is power of 2.

`lightudnum` - the length - 1 of aforementioned vector (up to 255).

When lightuserdata is pushed on the stack, if its segment is not stored
in vector new value is pushed on top. The maximum amount of segments is
256, BADLU error is raised in case when user tried to add userdata with
the new 257-th segment.

Also, in this patch all internal usage of lightuserdata (for hooks,
profilers, built-in package, ir and so on) is changed to special values
on Lua Stack.

Also, conversion of TValue to ffi C type with store is no longer
compiled for lightuserdata.

[1]: https://www.kernel.org/doc/html/latest/arm64/memory.html

Sergey Kaplun:
* added the description and the test for the problem

Needed for tarantool/tarantool#6154
Resolves tarantool/tarantool#2712
---
 doc/status.html                               | 11 ----
 src/jit/dump.lua                              |  4 +-
 src/lib_debug.c                               | 12 ++--
 src/lib_jit.c                                 | 14 ++---
 src/lib_package.c                             |  8 +--
 src/lib_string.c                              |  2 +-
 src/lj_api.c                                  | 40 +++++++++++--
 src/lj_ccall.c                                |  2 +-
 src/lj_cconv.c                                |  2 +-
 src/lj_crecord.c                              |  6 +-
 src/lj_dispatch.c                             |  2 +-
 src/lj_ir.c                                   |  6 +-
 src/lj_obj.c                                  |  5 +-
 src/lj_obj.h                                  | 57 ++++++++++++-------
 src/lj_snap.c                                 |  7 ++-
 src/lj_state.c                                |  6 ++
 src/lj_strfmt.c                               |  2 +-
 test/tarantool-tests/CMakeLists.txt           |  1 +
 .../lj-49-bad-lightuserdata.test.lua          | 10 ++++
 .../lj-49-bad-lightuserdata/CMakeLists.txt    |  1 +
 .../testlightuserdata.c                       | 52 +++++++++++++++++
 21 files changed, 183 insertions(+), 67 deletions(-)
 create mode 100644 test/tarantool-tests/lj-49-bad-lightuserdata.test.lua
 create mode 100644 test/tarantool-tests/lj-49-bad-lightuserdata/CMakeLists.txt
 create mode 100644 test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c

diff --git a/doc/status.html b/doc/status.html
index cad6ca65..c89255b6 100644
--- a/doc/status.html
+++ b/doc/status.html
@@ -97,17 +97,6 @@ handled correctly. The error may fall through an on-trace
 <tt>lua_atpanic</tt> on x64. This issue will be fixed with the new
 garbage collector.
 </li>
-<li>
-LuaJIT on 64 bit systems provides a <b>limited range</b> of 47 bits for the
-<b>legacy <tt>lightuserdata</tt></b> data type.
-This is only relevant on x64 systems which use the negative part of the
-virtual address space in user mode, e.g. Solaris/x64, and on ARM64 systems
-configured with a 48 bit or 52 bit VA.
-Avoid using <tt>lightuserdata</tt> to hold pointers that may point outside
-of that range, e.g. variables on the stack. In general, avoid this data
-type for new code and replace it with (much more performant) FFI bindings.
-FFI cdata pointers can address the full 64 bit range.
-</li>
 </ul>
 <br class="flush">
 </div>
diff --git a/src/jit/dump.lua b/src/jit/dump.lua
index 2bea652b..27b5c2ae 100644
--- a/src/jit/dump.lua
+++ b/src/jit/dump.lua
@@ -315,7 +315,9 @@ local function formatk(tr, idx, sn)
   local tn = type(k)
   local s
   if tn == "number" then
-    if band(sn or 0, 0x30000) ~= 0 then
+    if t < 12 then
+      s = k == 0 and "NULL" or format("[0x%08x]", k)
+    elseif band(sn or 0, 0x30000) ~= 0 then
       s = band(sn, 0x20000) ~= 0 and "contpc" or "ftsz"
     elseif k == 2^52+2^51 then
       s = "bias"
diff --git a/src/lib_debug.c b/src/lib_debug.c
index f112b5bc..8fdfda03 100644
--- a/src/lib_debug.c
+++ b/src/lib_debug.c
@@ -231,8 +231,8 @@ LJLIB_CF(debug_upvalueid)
   int32_t n = lj_lib_checkint(L, 2) - 1;
   if ((uint32_t)n >= fn->l.nupvalues)
     lj_err_arg(L, 2, LJ_ERR_IDXRNG);
-  setlightudV(L->top-1, isluafunc(fn) ? (void *)gcref(fn->l.uvptr[n]) :
-					(void *)&fn->c.upvalue[n]);
+  lua_pushlightuserdata(L, isluafunc(fn) ? (void *)gcref(fn->l.uvptr[n]) :
+					   (void *)&fn->c.upvalue[n]);
   return 1;
 }
 
@@ -283,13 +283,13 @@ LJLIB_CF(debug_setuservalue)
 
 /* ------------------------------------------------------------------------ */
 
-#define KEY_HOOK	((void *)0x3004)
+#define KEY_HOOK	(U64x(80000000,00000000)|'h')
 
 static void hookf(lua_State *L, lua_Debug *ar)
 {
   static const char *const hooknames[] =
     {"call", "return", "line", "count", "tail return"};
-  lua_pushlightuserdata(L, KEY_HOOK);
+  (L->top++)->u64 = KEY_HOOK;
   lua_rawget(L, LUA_REGISTRYINDEX);
   if (lua_isfunction(L, -1)) {
     lua_pushstring(L, hooknames[(int)ar->event]);
@@ -334,7 +334,7 @@ LJLIB_CF(debug_sethook)
     count = luaL_optint(L, arg+3, 0);
     func = hookf; mask = makemask(smask, count);
   }
-  lua_pushlightuserdata(L, KEY_HOOK);
+  (L->top++)->u64 = KEY_HOOK;
   lua_pushvalue(L, arg+1);
   lua_rawset(L, LUA_REGISTRYINDEX);
   lua_sethook(L, func, mask, count);
@@ -349,7 +349,7 @@ LJLIB_CF(debug_gethook)
   if (hook != NULL && hook != hookf) {  /* external hook? */
     lua_pushliteral(L, "external hook");
   } else {
-    lua_pushlightuserdata(L, KEY_HOOK);
+    (L->top++)->u64 = KEY_HOOK;
     lua_rawget(L, LUA_REGISTRYINDEX);   /* get hook */
   }
   lua_pushstring(L, unmakemask(mask, buff));
diff --git a/src/lib_jit.c b/src/lib_jit.c
index 22ca0a1a..40aa2b51 100644
--- a/src/lib_jit.c
+++ b/src/lib_jit.c
@@ -540,15 +540,15 @@ LJLIB_CF(jit_opt_start)
 
 /* Not loaded by default, use: local profile = require("jit.profile") */
 
-static const char KEY_PROFILE_THREAD = 't';
-static const char KEY_PROFILE_FUNC = 'f';
+#define KEY_PROFILE_THREAD	(U64x(80000000,00000000)|'t')
+#define KEY_PROFILE_FUNC	(U64x(80000000,00000000)|'f')
 
 static void jit_profile_callback(lua_State *L2, lua_State *L, int samples,
 				 int vmstate)
 {
   TValue key;
   cTValue *tv;
-  setlightudV(&key, (void *)&KEY_PROFILE_FUNC);
+  key.u64 = KEY_PROFILE_FUNC;
   tv = lj_tab_get(L, tabV(registry(L)), &key);
   if (tvisfunc(tv)) {
     char vmst = (char)vmstate;
@@ -575,9 +575,9 @@ LJLIB_CF(jit_profile_start)
   lua_State *L2 = lua_newthread(L);  /* Thread that runs profiler callback. */
   TValue key;
   /* Anchor thread and function in registry. */
-  setlightudV(&key, (void *)&KEY_PROFILE_THREAD);
+  key.u64 = KEY_PROFILE_THREAD;
   setthreadV(L, lj_tab_set(L, registry, &key), L2);
-  setlightudV(&key, (void *)&KEY_PROFILE_FUNC);
+  key.u64 = KEY_PROFILE_FUNC;
   setfuncV(L, lj_tab_set(L, registry, &key), func);
   lj_gc_anybarriert(L, registry);
   luaJIT_profile_start(L, mode ? strdata(mode) : "",
@@ -592,9 +592,9 @@ LJLIB_CF(jit_profile_stop)
   TValue key;
   luaJIT_profile_stop(L);
   registry = tabV(registry(L));
-  setlightudV(&key, (void *)&KEY_PROFILE_THREAD);
+  key.u64 = KEY_PROFILE_THREAD;
   setnilV(lj_tab_set(L, registry, &key));
-  setlightudV(&key, (void *)&KEY_PROFILE_FUNC);
+  key.u64 = KEY_PROFILE_FUNC;
   setnilV(lj_tab_set(L, registry, &key));
   lj_gc_anybarriert(L, registry);
   return 0;
diff --git a/src/lib_package.c b/src/lib_package.c
index a0bca572..8573b9f9 100644
--- a/src/lib_package.c
+++ b/src/lib_package.c
@@ -412,7 +412,7 @@ static int lj_cf_package_loader_preload(lua_State *L)
 
 /* ------------------------------------------------------------------------ */
 
-#define sentinel	((void *)0x4004)
+#define KEY_SENTINEL	(U64x(80000000,00000000)|'s')
 
 static int lj_cf_package_require(lua_State *L)
 {
@@ -422,7 +422,7 @@ static int lj_cf_package_require(lua_State *L)
   lua_getfield(L, LUA_REGISTRYINDEX, "_LOADED");
   lua_getfield(L, 2, name);
   if (lua_toboolean(L, -1)) {  /* is it there? */
-    if (lua_touserdata(L, -1) == sentinel)  /* check loops */
+    if ((L->top-1)->u64 == KEY_SENTINEL)  /* check loops */
       luaL_error(L, "loop or previous error loading module " LUA_QS, name);
     return 1;  /* package is already loaded */
   }
@@ -445,14 +445,14 @@ static int lj_cf_package_require(lua_State *L)
     else
       lua_pop(L, 1);
   }
-  lua_pushlightuserdata(L, sentinel);
+  (L->top++)->u64 = KEY_SENTINEL;
   lua_setfield(L, 2, name);  /* _LOADED[name] = sentinel */
   lua_pushstring(L, name);  /* pass name as argument to module */
   lua_call(L, 1, 1);  /* run loaded module */
   if (!lua_isnil(L, -1))  /* non-nil return? */
     lua_setfield(L, 2, name);  /* _LOADED[name] = returned value */
   lua_getfield(L, 2, name);
-  if (lua_touserdata(L, -1) == sentinel) {   /* module did not set a value? */
+  if ((L->top-1)->u64 == KEY_SENTINEL) {   /* module did not set a value? */
     lua_pushboolean(L, 1);  /* use true as result */
     lua_pushvalue(L, -1);  /* extra copy to be returned */
     lua_setfield(L, 2, name);  /* _LOADED[name] = true */
diff --git a/src/lib_string.c b/src/lib_string.c
index 76b0730a..156dae66 100644
--- a/src/lib_string.c
+++ b/src/lib_string.c
@@ -714,7 +714,7 @@ again:
 	lj_strfmt_putfchar(sb, sf, lj_lib_checkint(L, arg));
 	break;
       case STRFMT_PTR:  /* No formatting. */
-	lj_strfmt_putptr(sb, lj_obj_ptr(L->base+arg-1));
+	lj_strfmt_putptr(sb, lj_obj_ptr(G(L), L->base+arg-1));
 	break;
       default:
 	lua_assert(0);
diff --git a/src/lj_api.c b/src/lj_api.c
index c9b5d220..c7a0b327 100644
--- a/src/lj_api.c
+++ b/src/lj_api.c
@@ -617,7 +617,7 @@ LUA_API void *lua_touserdata(lua_State *L, int idx)
   if (tvisudata(o))
     return uddata(udataV(o));
   else if (tvislightud(o))
-    return lightudV(o);
+    return lightudV(G(L), o);
   else
     return NULL;
 }
@@ -630,7 +630,7 @@ LUA_API lua_State *lua_tothread(lua_State *L, int idx)
 
 LUA_API const void *lua_topointer(lua_State *L, int idx)
 {
-  return lj_obj_ptr(index2adr(L, idx));
+  return lj_obj_ptr(G(L), index2adr(L, idx));
 }
 
 /* -- Stack setters (object creation) ------------------------------------- */
@@ -716,9 +716,38 @@ LUA_API void lua_pushboolean(lua_State *L, int b)
   incr_top(L);
 }
 
+#if LJ_64
+static void *lightud_intern(lua_State *L, void *p)
+{
+  global_State *g = G(L);
+  uint64_t u = (uint64_t)p;
+  uint32_t up = lightudup(u);
+  uint32_t *segmap = mref(g->gc.lightudseg, uint32_t);
+  MSize segnum = g->gc.lightudnum;
+  if (segmap) {
+    MSize seg;
+    for (seg = 0; seg <= segnum; seg++)
+      if (segmap[seg] == up)  /* Fast path. */
+	return (void *)(((uint64_t)seg << LJ_LIGHTUD_BITS_LO) | lightudlo(u));
+    segnum++;
+  }
+  if (!((segnum-1) & segnum) && segnum != 1) {
+    if (segnum >= (1 << LJ_LIGHTUD_BITS_SEG)) lj_err_msg(L, LJ_ERR_BADLU);
+    lj_mem_reallocvec(L, segmap, segnum, segnum ? 2*segnum : 2u, uint32_t);
+    setmref(g->gc.lightudseg, segmap);
+  }
+  g->gc.lightudnum = segnum;
+  segmap[segnum] = up;
+  return (void *)(((uint64_t)segnum << LJ_LIGHTUD_BITS_LO) | lightudlo(u));
+}
+#endif
+
 LUA_API void lua_pushlightuserdata(lua_State *L, void *p)
 {
-  setlightudV(L->top, checklightudptr(L, p));
+#if LJ_64
+  p = lightud_intern(L, p);
+#endif
+  setrawlightudV(L->top, p);
   incr_top(L);
 }
 
@@ -1167,7 +1196,10 @@ static TValue *cpcall(lua_State *L, lua_CFunction func, void *ud)
   fn->c.f = func;
   setfuncV(L, top++, fn);
   if (LJ_FR2) setnilV(top++);
-  setlightudV(top++, checklightudptr(L, ud));
+#if LJ_64
+  ud = lightud_intern(L, ud);
+#endif
+  setrawlightudV(top++, ud);
   cframe_nres(L->cframe) = 1+0;  /* Zero results. */
   L->top = top;
   return top-1;  /* Now call the newly allocated C function. */
diff --git a/src/lj_ccall.c b/src/lj_ccall.c
index 06d1b3bd..f26f1fea 100644
--- a/src/lj_ccall.c
+++ b/src/lj_ccall.c
@@ -1150,7 +1150,7 @@ int lj_ccall_func(lua_State *L, GCcdata *cd)
     lj_vm_ffi_call(&cc);
     if (cts->cb.slot != ~0u) {  /* Blacklist function that called a callback. */
       TValue tv;
-      setlightudV(&tv, (void *)cc.func);
+      tv.u64 = ((uintptr_t)(void *)cc.func >> 2) | U64x(800000000, 00000000);
       setboolV(lj_tab_set(L, cts->miscmap, &tv), 1);
     }
     ct = (CType *)((intptr_t)ct+(intptr_t)cts->tab);  /* May be reallocated. */
diff --git a/src/lj_cconv.c b/src/lj_cconv.c
index 13b8230d..ca2a5d30 100644
--- a/src/lj_cconv.c
+++ b/src/lj_cconv.c
@@ -611,7 +611,7 @@ void lj_cconv_ct_tv(CTState *cts, CType *d,
     if (ud->udtype == UDTYPE_IO_FILE)
       tmpptr = *(void **)tmpptr;
   } else if (tvislightud(o)) {
-    tmpptr = lightudV(o);
+    tmpptr = lightudV(cts->g, o);
   } else if (tvisfunc(o)) {
     void *p = lj_ccallback_new(cts, d, funcV(o));
     if (p) {
diff --git a/src/lj_crecord.c b/src/lj_crecord.c
index e32ae23e..59186dab 100644
--- a/src/lj_crecord.c
+++ b/src/lj_crecord.c
@@ -643,8 +643,7 @@ static TRef crec_ct_tv(jit_State *J, CType *d, TRef dp, TRef sp, cTValue *sval)
     }
   } else if (tref_islightud(sp)) {
 #if LJ_64
-    sp = emitir(IRT(IR_BAND, IRT_P64), sp,
-		lj_ir_kint64(J, U64x(00007fff,ffffffff)));
+    lj_trace_err(J, LJ_TRERR_NYICONV);
 #endif
   } else {  /* NYI: tref_istab(sp). */
     IRType t;
@@ -1209,8 +1208,7 @@ static int crec_call(jit_State *J, RecordFFData *rd, GCcdata *cd)
     TRef tr;
     TValue tv;
     /* Check for blacklisted C functions that might call a callback. */
-    setlightudV(&tv,
-		cdata_getptr(cdataptr(cd), (LJ_64 && tp == IRT_P64) ? 8 : 4));
+    tv.u64 = ((uintptr_t)cdata_getptr(cdataptr(cd), (LJ_64 && tp == IRT_P64) ? 8 : 4) >> 2) | U64x(800000000, 00000000);
     if (tvistrue(lj_tab_get(J->L, cts->miscmap, &tv)))
       lj_trace_err(J, LJ_TRERR_BLACKL);
     if (ctype_isvoid(ctr->info)) {
diff --git a/src/lj_dispatch.c b/src/lj_dispatch.c
index b694d8fd..ee735450 100644
--- a/src/lj_dispatch.c
+++ b/src/lj_dispatch.c
@@ -308,7 +308,7 @@ int luaJIT_setmode(lua_State *L, int idx, int mode)
       if (idx != 0) {
 	cTValue *tv = idx > 0 ? L->base + (idx-1) : L->top + idx;
 	if (tvislightud(tv))
-	  g->wrapf = (lua_CFunction)lightudV(tv);
+	  g->wrapf = (lua_CFunction)lightudV(g, tv);
 	else
 	  return 0;  /* Failed. */
       } else {
diff --git a/src/lj_ir.c b/src/lj_ir.c
index 5baece67..2f7ddb24 100644
--- a/src/lj_ir.c
+++ b/src/lj_ir.c
@@ -386,8 +386,10 @@ void lj_ir_kvalue(lua_State *L, TValue *tv, const IRIns *ir)
   case IR_KPRI: setpriV(tv, irt_toitype(ir->t)); break;
   case IR_KINT: setintV(tv, ir->i); break;
   case IR_KGC: setgcV(L, tv, ir_kgc(ir), irt_toitype(ir->t)); break;
-  case IR_KPTR: case IR_KKPTR: setlightudV(tv, ir_kptr(ir)); break;
-  case IR_KNULL: setlightudV(tv, NULL); break;
+  case IR_KPTR: case IR_KKPTR:
+    setnumV(tv, (lua_Number)(uintptr_t)ir_kptr(ir));
+    break;
+  case IR_KNULL: setintV(tv, 0); break;
   case IR_KNUM: setnumV(tv, ir_knum(ir)->n); break;
 #if LJ_HASFFI
   case IR_KINT64: {
diff --git a/src/lj_obj.c b/src/lj_obj.c
index ee33aeb3..746da6ef 100644
--- a/src/lj_obj.c
+++ b/src/lj_obj.c
@@ -34,12 +34,13 @@ int LJ_FASTCALL lj_obj_equal(cTValue *o1, cTValue *o2)
 }
 
 /* Return pointer to object or its object data. */
-const void * LJ_FASTCALL lj_obj_ptr(cTValue *o)
+const void * LJ_FASTCALL lj_obj_ptr(global_State *g, cTValue *o)
 {
+  UNUSED(g);
   if (tvisudata(o))
     return uddata(udataV(o));
   else if (tvislightud(o))
-    return lightudV(o);
+    return lightudV(g, o);
   else if (LJ_HASFFI && tviscdata(o))
     return cdataptr(cdataV(o));
   else if (tvisgcv(o))
diff --git a/src/lj_obj.h b/src/lj_obj.h
index 4a4d77f5..d26e60be 100644
--- a/src/lj_obj.h
+++ b/src/lj_obj.h
@@ -232,7 +232,7 @@ typedef const TValue cTValue;
 **                  ---MSW---.---LSW---
 ** primitive types |  itype  |         |
 ** lightuserdata   |  itype  |  void * |  (32 bit platforms)
-** lightuserdata   |ffff|    void *    |  (64 bit platforms, 47 bit pointers)
+** lightuserdata   |ffff|seg|    ofs   |  (64 bit platforms)
 ** GC objects      |  itype  |  GCRef  |
 ** int (LJ_DUALNUM)|  itype  |   int   |
 ** number           -------double------
@@ -245,7 +245,8 @@ typedef const TValue cTValue;
 **
 **                     ------MSW------.------LSW------
 ** primitive types    |1..1|itype|1..................1|
-** GC objects/lightud |1..1|itype|-------GCRef--------|
+** GC objects         |1..1|itype|-------GCRef--------|
+** lightuserdata      |1..1|itype|seg|------ofs-------|
 ** int (LJ_DUALNUM)   |1..1|itype|0..0|-----int-------|
 ** number              ------------double-------------
 **
@@ -285,6 +286,12 @@ typedef const TValue cTValue;
 #define LJ_GCVMASK		(((uint64_t)1 << 47) - 1)
 #endif
 
+#if LJ_64
+/* To stay within 47 bits, lightuserdata is segmented. */
+#define LJ_LIGHTUD_BITS_SEG	8
+#define LJ_LIGHTUD_BITS_LO	(47 - LJ_LIGHTUD_BITS_SEG)
+#endif
+
 /* -- String object ------------------------------------------------------- */
 
 /* String object header. String payload follows. */
@@ -597,7 +604,11 @@ typedef struct GCState {
   uint8_t currentwhite;	/* Current white color. */
   uint8_t state;	/* GC state. */
   uint8_t nocdatafin;	/* No cdata finalizer called. */
-  uint8_t unused2;
+#if LJ_64
+  uint8_t lightudnum;	/* Number of lightuserdata segments - 1. */
+#else
+  uint8_t unused1;
+#endif
   MSize sweepstr;	/* Sweep position in string table. */
   GCRef root;		/* List of all collectable objects. */
   MRef sweep;		/* Sweep position in root list. */
@@ -609,6 +620,9 @@ typedef struct GCState {
   GCSize estimate;	/* Estimate of memory actually in use. */
   MSize stepmul;	/* Incremental GC step granularity. */
   MSize pause;		/* Pause between successive GC cycles. */
+#if LJ_64
+  MRef lightudseg;	/* Upper bits of lightuserdata segments. */
+#endif
 
   size_t freed;		/* Total amount of freed memory. */
   size_t allocated;	/* Total amount of allocated memory. */
@@ -834,10 +848,23 @@ typedef union GCobj {
 #endif
 #define boolV(o)	check_exp(tvisbool(o), (LJ_TFALSE - itype(o)))
 #if LJ_64
-#define lightudV(o) \
-  check_exp(tvislightud(o), (void *)((o)->u64 & U64x(00007fff,ffffffff)))
+#define lightudseg(u) \
+  (((u) >> LJ_LIGHTUD_BITS_LO) & ((1 << LJ_LIGHTUD_BITS_SEG)-1))
+#define lightudlo(u) \
+  ((u) & (((uint64_t)1 << LJ_LIGHTUD_BITS_LO) - 1))
+#define lightudup(p) \
+  ((uint32_t)(((p) >> LJ_LIGHTUD_BITS_LO) << (LJ_LIGHTUD_BITS_LO-32)))
+static LJ_AINLINE void *lightudV(global_State *g, cTValue *o)
+{
+  uint64_t u = o->u64;
+  uint64_t seg = lightudseg(u);
+  uint32_t *segmap = mref(g->gc.lightudseg, uint32_t);
+  lua_assert(tvislightud(o));
+  lua_assert(seg <= g->gc.lightudnum);
+  return (void *)(((uint64_t)segmap[seg] << 32) | lightudlo(u));
+}
 #else
-#define lightudV(o)	check_exp(tvislightud(o), gcrefp((o)->gcr, void))
+#define lightudV(g, o)	check_exp(tvislightud(o), gcrefp((o)->gcr, void))
 #endif
 #define gcV(o)		check_exp(tvisgcv(o), gcval(o))
 #define strV(o)		check_exp(tvisstr(o), &gcval(o)->str)
@@ -863,7 +890,7 @@ typedef union GCobj {
 #define setpriV(o, i)		(setitype((o), (i)))
 #endif
 
-static LJ_AINLINE void setlightudV(TValue *o, void *p)
+static LJ_AINLINE void setrawlightudV(TValue *o, void *p)
 {
 #if LJ_GC64
   o->u64 = (uint64_t)p | (((uint64_t)LJ_TLIGHTUD) << 47);
@@ -874,24 +901,14 @@ static LJ_AINLINE void setlightudV(TValue *o, void *p)
 #endif
 }
 
-#if LJ_64
-#define checklightudptr(L, p) \
-  (((uint64_t)(p) >> 47) ? (lj_err_msg(L, LJ_ERR_BADLU), NULL) : (p))
-#else
-#define checklightudptr(L, p)	(p)
-#endif
-
-#if LJ_FR2
+#if LJ_FR2 || LJ_32
 #define contptr(f)		((void *)(f))
 #define setcont(o, f)		((o)->u64 = (uint64_t)(uintptr_t)contptr(f))
-#elif LJ_64
+#else
 #define contptr(f) \
   ((void *)(uintptr_t)(uint32_t)((intptr_t)(f) - (intptr_t)lj_vm_asm_begin))
 #define setcont(o, f) \
   ((o)->u64 = (uint64_t)(void *)(f) - (uint64_t)lj_vm_asm_begin)
-#else
-#define contptr(f)		((void *)(f))
-#define setcont(o, f)		setlightudV((o), contptr(f))
 #endif
 
 #define tvchecklive(L, o) \
@@ -1014,6 +1031,6 @@ LJ_DATA const char *const lj_obj_itypename[~LJ_TNUMX+1];
 
 /* Compare two objects without calling metamethods. */
 LJ_FUNC int LJ_FASTCALL lj_obj_equal(cTValue *o1, cTValue *o2);
-LJ_FUNC const void * LJ_FASTCALL lj_obj_ptr(cTValue *o);
+LJ_FUNC const void * LJ_FASTCALL lj_obj_ptr(global_State *g, cTValue *o);
 
 #endif
diff --git a/src/lj_snap.c b/src/lj_snap.c
index 4cf27e76..2f7cf80a 100644
--- a/src/lj_snap.c
+++ b/src/lj_snap.c
@@ -626,7 +626,12 @@ static void snap_restoreval(jit_State *J, GCtrace *T, ExitState *ex,
   IRType1 t = ir->t;
   RegSP rs = ir->prev;
   if (irref_isk(ref)) {  /* Restore constant slot. */
-    lj_ir_kvalue(J->L, o, ir);
+    if (ir->o == IR_KPTR) {
+      o->u64 = (uint64_t)(uintptr_t)ir_kptr(ir);
+    } else {
+      lua_assert(!(ir->o == IR_KKPTR || ir->o == IR_KNULL));
+      lj_ir_kvalue(J->L, o, ir);
+    }
     return;
   }
   if (LJ_UNLIKELY(bloomtest(rfilt, ref)))
diff --git a/src/lj_state.c b/src/lj_state.c
index 448f5134..f82b1b5b 100644
--- a/src/lj_state.c
+++ b/src/lj_state.c
@@ -186,6 +186,12 @@ static void close_state(lua_State *L)
   lj_mem_freevec(g, g->strhash, g->strmask+1, GCRef);
   lj_buf_free(g, &g->tmpbuf);
   lj_mem_freevec(g, tvref(L->stack), L->stacksize, TValue);
+#if LJ_64
+  if (mref(g->gc.lightudseg, uint32_t)) {
+    MSize segnum = g->gc.lightudnum ? (2 << lj_fls(g->gc.lightudnum)) : 2;
+    lj_mem_freevec(g, mref(g->gc.lightudseg, uint32_t), segnum, uint32_t);
+  }
+#endif
   lua_assert(g->gc.total == sizeof(GG_State));
 #ifndef LUAJIT_USE_SYSMALLOC
   if (g->allocf == lj_alloc_f)
diff --git a/src/lj_strfmt.c b/src/lj_strfmt.c
index d7893ce9..237cc575 100644
--- a/src/lj_strfmt.c
+++ b/src/lj_strfmt.c
@@ -393,7 +393,7 @@ GCstr * LJ_FASTCALL lj_strfmt_obj(lua_State *L, cTValue *o)
       p = lj_buf_wmem(p, "builtin#", 8);
       p = lj_strfmt_wint(p, funcV(o)->c.ffid);
     } else {
-      p = lj_strfmt_wptr(p, lj_obj_ptr(o));
+      p = lj_strfmt_wptr(p, lj_obj_ptr(G(L), o));
     }
     return lj_str_new(L, buf, (size_t)(p - buf));
   }
diff --git a/test/tarantool-tests/CMakeLists.txt b/test/tarantool-tests/CMakeLists.txt
index 2fdb4d1f..3e881411 100644
--- a/test/tarantool-tests/CMakeLists.txt
+++ b/test/tarantool-tests/CMakeLists.txt
@@ -57,6 +57,7 @@ macro(BuildTestCLib lib sources)
 endmacro()
 
 add_subdirectory(gh-4427-ffi-sandwich)
+add_subdirectory(lj-49-bad-lightuserdata)
 add_subdirectory(lj-flush-on-trace)
 add_subdirectory(misclib-getmetrics-capi)
 
diff --git a/test/tarantool-tests/lj-49-bad-lightuserdata.test.lua b/test/tarantool-tests/lj-49-bad-lightuserdata.test.lua
new file mode 100644
index 00000000..111d6a70
--- /dev/null
+++ b/test/tarantool-tests/lj-49-bad-lightuserdata.test.lua
@@ -0,0 +1,10 @@
+local tap = require('tap')
+
+local test = tap.test('lj-49-bad-lightuserdata')
+test:plan(1)
+
+local testlightuserdata = require('testlightuserdata')
+
+test:ok(testlightuserdata.longptr())
+
+os.exit(test:check() and 0 or 1)
diff --git a/test/tarantool-tests/lj-49-bad-lightuserdata/CMakeLists.txt b/test/tarantool-tests/lj-49-bad-lightuserdata/CMakeLists.txt
new file mode 100644
index 00000000..ec6bb62c
--- /dev/null
+++ b/test/tarantool-tests/lj-49-bad-lightuserdata/CMakeLists.txt
@@ -0,0 +1 @@
+BuildTestCLib(testlightuserdata testlightuserdata.c)
diff --git a/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c b/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
new file mode 100644
index 00000000..801c7fe1
--- /dev/null
+++ b/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
@@ -0,0 +1,52 @@
+#include <lua.h>
+#include <lauxlib.h>
+
+#include <sys/mman.h>
+#include <unistd.h>
+
+#undef NDEBUG
+#include <assert.h>
+
+#define START ((void *)-1)
+
+static int longptr(lua_State *L)
+{
+	/*
+	 * We know that for arm64 at least 48 bits are available.
+	 * So emulate manually push of lightuseradata within
+	 * this range.
+	 */
+	void *longptr = (void *)(1llu << 48);
+	lua_pushlightuserdata(L, longptr);
+	assert(longptr == lua_topointer(L, -1));
+	/*
+	 * If start mapping address is not NULL, then the kernel
+	 * takes it as a hint about where to place the mapping, so
+	 * we try to get the highest memory address by hint
+	 * equals -1.
+	 */
+	const size_t pagesize = getpagesize();
+	void *mmaped = mmap(START, pagesize, PROT_NONE,
+				  MAP_PRIVATE | MAP_ANON, -1, 0);
+	if (mmaped != MAP_FAILED) {
+		lua_pushlightuserdata(L, mmaped);
+		assert(mmaped == lua_topointer(L, -1));
+		assert(munmap(mmaped, pagesize) == 0);
+	}
+	/* Clear our stack. */
+	lua_pop(L, 0);
+	lua_pushboolean(L, 1);
+	return 1;
+}
+
+static const struct luaL_Reg testlightuserdata[] = {
+	{"longptr", longptr},
+	{NULL, NULL}
+};
+
+LUA_API int luaopen_testlightuserdata(lua_State *L)
+{
+	luaL_register(L, "testlightuserdata", testlightuserdata);
+	return 1;
+}
+
-- 
2.31.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Tarantool-patches] [PATCH luajit 2/2] Linux/ARM64: Make mremap() non-moving due to VA space woes.
  2021-07-06 17:40 [Tarantool-patches] [PATCH luajit 0/2] arm64: fix 48-bit addresses issues Sergey Kaplun via Tarantool-patches
  2021-07-06 17:40 ` [Tarantool-patches] [PATCH luajit 1/2] Add support for full-range 64 bit lightuserdata Sergey Kaplun via Tarantool-patches
@ 2021-07-06 17:40 ` Sergey Kaplun via Tarantool-patches
  2021-07-27 15:23   ` Igor Munkin via Tarantool-patches
  2021-08-11  7:21 ` [Tarantool-patches] [PATCH luajit 0/2] arm64: fix 48-bit addresses issues Igor Munkin via Tarantool-patches
  2 siblings, 1 reply; 20+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-07-06 17:40 UTC (permalink / raw)
  To: Igor Munkin, Sergey Ostanevich; +Cc: tarantool-patches

From: Mike Pall <mike>

This reduces overall performance on ARM64, but we have no choice.
Linux kernel default userspace VA is 48 bit, but we'd need 47 bit.
mremap() ignores address hints due to a kernel API issue. The mapping
may move to an undesired address which will cause an assert or crash.

Reported by Raymond W. Ko.

(cherry picked from commit 67dbec82f4f05a416a78a560a726553beaa7a223)

47-bit VA space is required by LuaJIT for keeping a GC object pointer in
TValue. When need to reallocate to huge sized block `mrepmap()` on arm64
may move out VA space from the 47-bit range. `mremap()` accepts the
fifth argument (new address hint) only with MREMAP_FIXED flag. In that
case it unmaps any other mapping to specified address.

To avoid this behaviour this patch restricts `mremap()` to relocate
the mapping to a new virtual address by reset MREMAP_MAYMOVE flag
for arm64 architecture.

Sergey Kaplun:
* added the description and the test for the problem

Needed for tarantool/tarantool#6154
---
 src/lj_alloc.c                                |  2 +-
 .../lj-671-arm64-assert-after-mremap.test.lua | 24 +++++++++++++++++++
 2 files changed, 25 insertions(+), 1 deletion(-)
 create mode 100644 test/tarantool-tests/lj-671-arm64-assert-after-mremap.test.lua

diff --git a/src/lj_alloc.c b/src/lj_alloc.c
index 9fc761c7..ffcd019b 100644
--- a/src/lj_alloc.c
+++ b/src/lj_alloc.c
@@ -378,7 +378,7 @@ static void *CALL_MREMAP_(void *ptr, size_t osz, size_t nsz, int flags)
 #define CALL_MREMAP(addr, osz, nsz, mv) CALL_MREMAP_((addr), (osz), (nsz), (mv))
 #define CALL_MREMAP_NOMOVE	0
 #define CALL_MREMAP_MAYMOVE	1
-#if LJ_64 && !LJ_GC64
+#if LJ_64 && (!LJ_GC64 || LJ_TARGET_ARM64)
 #define CALL_MREMAP_MV		CALL_MREMAP_NOMOVE
 #else
 #define CALL_MREMAP_MV		CALL_MREMAP_MAYMOVE
diff --git a/test/tarantool-tests/lj-671-arm64-assert-after-mremap.test.lua b/test/tarantool-tests/lj-671-arm64-assert-after-mremap.test.lua
new file mode 100644
index 00000000..0be60a2d
--- /dev/null
+++ b/test/tarantool-tests/lj-671-arm64-assert-after-mremap.test.lua
@@ -0,0 +1,24 @@
+local tap = require('tap')
+
+-- Test file to demonstrate assertion after `mremap()` on arm64.
+-- See also, https://github.com/LuaJIT/LuaJIT/issues/671.
+
+local test = tap.test('lj-671-arm64-assert-after-mremap')
+test:plan(1)
+
+-- `mremap()` is used on Linux for remap directly mapped big
+-- (>=DEFAULT_MMAP_THRESHOLD) memory chunks.
+-- The simplest way to test memory move is to allocate the huge
+-- memory chunk for string buffer directly and reallocate it
+-- after.
+-- To allocate buffer exactly to threshold limit for direct chunk
+-- mapping use `string.rep()` with length equals threshold.
+-- Then concatenate result string (with length of
+-- DEFAULT_MMAP_THRESHOLD) with the other one to reallocate
+-- and remap string buffer.
+
+local DEFAULT_MMAP_THRESHOLD = 128 * 1024
+local s = string.rep('x', DEFAULT_MMAP_THRESHOLD)..'x'
+test:ok(s)
+
+os.exit(test:check() and 0 or 1)
-- 
2.31.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 1/2] Add support for full-range 64 bit lightuserdata.
  2021-07-06 17:40 ` [Tarantool-patches] [PATCH luajit 1/2] Add support for full-range 64 bit lightuserdata Sergey Kaplun via Tarantool-patches
@ 2021-07-27 13:59   ` Igor Munkin via Tarantool-patches
  2021-07-28 12:29     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 20+ messages in thread
From: Igor Munkin via Tarantool-patches @ 2021-07-27 13:59 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Sergey,

Thanks for the patch! Please consider my comments below.

On 06.07.21, Sergey Kaplun wrote:
> From: Mike Pall <mike>
> 
> (cherry picked from commit e9af1abec542e6f9851ff2368e7f196b6382a44c)
> 
> LuaJIT uses special NaN-tagging technique to store internal type on
> the Lua stack. In case LJ_GC64 first 13 bits are set in special NaN
> type (0xfff8...). FPU generates the only one type. The next 4 bits are
> used for an internal LuaJIT type of object on stack. The next 47 bits
> are used for storing this object's content. For userdata, it is its
> address. In case arm64 the pointer can have more than 47 significant
> bits [1]. In this case the error BADLU error is raised.

This is not right for (LJ_64 && !LJ_GC64) case, but AFAIU you are
writing about LJ_GC64 here since this is the only option for ARM64.

> 
> For the support of full 64-bit range lightuserdata pointers two new
> fields in GCState are added:
> 
> `lightudseg` - vector of segments of lightuserdata storing 32-bit

I don't get whether vector or segments or lightuserdata store these
32-bit values :)

> values. MS 25 bits equal to MS bits of lightuserdata address, the

It's better to use either "MSB" acronym or "most significant 25 bits".

> rest are filled with zeros. The lentgh of the vector is power of 2.
> 
> `lightudnum` - the length - 1 of aforementioned vector (up to 255).
> 
> When lightuserdata is pushed on the stack, if its segment is not stored
> in vector new value is pushed on top. The maximum amount of segments is

On top of the stack or <lightudseg>? The wording is ambiguous. AFAIU,
you're writing about the vector, so the preferable verb is "append" or
"push back". Or simply "add", since nobody cares about the location of
the particular key.

> 256, BADLU error is raised in case when user tried to add userdata with
> the new 257-th segment.

It's worth to mention that such hack with segments still doesn't cover
the full VA space, but only those areas covered with the segments. It
was unclear to me at first.

> 
> Also, in this patch all internal usage of lightuserdata (for hooks,
> profilers, built-in package, ir and so on) is changed to special values

Typo: s/ir/IR/ since this is an acronym.

> on Lua Stack.
> 
> Also, conversion of TValue to ffi C type with store is no longer

Ditto for FFI.

> compiled for lightuserdata.
> 
> [1]: https://www.kernel.org/doc/html/latest/arm64/memory.html
> 
> Sergey Kaplun:
> * added the description and the test for the problem
> 
> Needed for tarantool/tarantool#6154
> Resolves tarantool/tarantool#2712

Please move "Resolves" tag on top. BTW, it's rather "Fixes", isn't it?

> ---
>  doc/status.html                               | 11 ----
>  src/jit/dump.lua                              |  4 +-
>  src/lib_debug.c                               | 12 ++--
>  src/lib_jit.c                                 | 14 ++---
>  src/lib_package.c                             |  8 +--
>  src/lib_string.c                              |  2 +-
>  src/lj_api.c                                  | 40 +++++++++++--
>  src/lj_ccall.c                                |  2 +-
>  src/lj_cconv.c                                |  2 +-
>  src/lj_crecord.c                              |  6 +-
>  src/lj_dispatch.c                             |  2 +-
>  src/lj_ir.c                                   |  6 +-
>  src/lj_obj.c                                  |  5 +-
>  src/lj_obj.h                                  | 57 ++++++++++++-------
>  src/lj_snap.c                                 |  7 ++-
>  src/lj_state.c                                |  6 ++
>  src/lj_strfmt.c                               |  2 +-
>  test/tarantool-tests/CMakeLists.txt           |  1 +
>  .../lj-49-bad-lightuserdata.test.lua          | 10 ++++
>  .../lj-49-bad-lightuserdata/CMakeLists.txt    |  1 +
>  .../testlightuserdata.c                       | 52 +++++++++++++++++
>  21 files changed, 183 insertions(+), 67 deletions(-)
>  create mode 100644 test/tarantool-tests/lj-49-bad-lightuserdata.test.lua
>  create mode 100644 test/tarantool-tests/lj-49-bad-lightuserdata/CMakeLists.txt
>  create mode 100644 test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
> 

<snipped>

> diff --git a/src/lib_debug.c b/src/lib_debug.c
> index f112b5bc..8fdfda03 100644
> --- a/src/lib_debug.c
> +++ b/src/lib_debug.c

<snipped>

> @@ -283,13 +283,13 @@ LJLIB_CF(debug_setuservalue)
>  
>  /* ------------------------------------------------------------------------ */
>  
> -#define KEY_HOOK	((void *)0x3004)
> +#define KEY_HOOK	(U64x(80000000,00000000)|'h')
>  
>  static void hookf(lua_State *L, lua_Debug *ar)
>  {
>    static const char *const hooknames[] =
>      {"call", "return", "line", "count", "tail return"};
> -  lua_pushlightuserdata(L, KEY_HOOK);
> +  (L->top++)->u64 = KEY_HOOK;

Side note: Comment desired. I see no guarantees there was no stack
reallocations for the removed <lua_pushlightuserdata>. Fortunately,
there are 5 more slots in the "red zone" for metamethods, and the stack
will be resized within "if" block if needed, but I don't feel safe in
case there is no hook function in registry.

I leave this on you whether to add the comment for this or not.

>    lua_rawget(L, LUA_REGISTRYINDEX);
>    if (lua_isfunction(L, -1)) {
>      lua_pushstring(L, hooknames[(int)ar->event]);
> @@ -334,7 +334,7 @@ LJLIB_CF(debug_sethook)
>      count = luaL_optint(L, arg+3, 0);
>      func = hookf; mask = makemask(smask, count);
>    }
> -  lua_pushlightuserdata(L, KEY_HOOK);

Side note: Comment desired. I see no guarantees there was no stack
reallocations for the removed <lua_pushlightuserdata>. Fortunately,
there are 5 more slots in the "red zone" for metamethods, so we are safe
and the reallocation occurs in <lua_pushvalue> below if one needed.
Anyway, such hacks are fragile, IMHO.

I leave this on you whether to add the comment for this or not.

> +  (L->top++)->u64 = KEY_HOOK;
>    lua_pushvalue(L, arg+1);
>    lua_rawset(L, LUA_REGISTRYINDEX);
>    lua_sethook(L, func, mask, count);
> @@ -349,7 +349,7 @@ LJLIB_CF(debug_gethook)
>    if (hook != NULL && hook != hookf) {  /* external hook? */
>      lua_pushliteral(L, "external hook");
>    } else {
> -    lua_pushlightuserdata(L, KEY_HOOK);
> +    (L->top++)->u64 = KEY_HOOK;

Ditto.

>      lua_rawget(L, LUA_REGISTRYINDEX);   /* get hook */
>    }
>    lua_pushstring(L, unmakemask(mask, buff));

<snipped>

> diff --git a/src/lib_package.c b/src/lib_package.c
> index a0bca572..8573b9f9 100644
> --- a/src/lib_package.c
> +++ b/src/lib_package.c

<snipped>

> @@ -445,14 +445,14 @@ static int lj_cf_package_require(lua_State *L)
>      else
>        lua_pop(L, 1);
>    }
> -  lua_pushlightuserdata(L, sentinel);
> +  (L->top++)->u64 = KEY_SENTINEL;

Ditto.

>    lua_setfield(L, 2, name);  /* _LOADED[name] = sentinel */
>    lua_pushstring(L, name);  /* pass name as argument to module */
>    lua_call(L, 1, 1);  /* run loaded module */
>    if (!lua_isnil(L, -1))  /* non-nil return? */
>      lua_setfield(L, 2, name);  /* _LOADED[name] = returned value */
>    lua_getfield(L, 2, name);

<snipped>

> diff --git a/src/lj_api.c b/src/lj_api.c
> index c9b5d220..c7a0b327 100644
> --- a/src/lj_api.c
> +++ b/src/lj_api.c

<snipped>

> diff --git a/src/lj_ccall.c b/src/lj_ccall.c
> index 06d1b3bd..f26f1fea 100644
> --- a/src/lj_ccall.c
> +++ b/src/lj_ccall.c
> @@ -1150,7 +1150,7 @@ int lj_ccall_func(lua_State *L, GCcdata *cd)
>      lj_vm_ffi_call(&cc);
>      if (cts->cb.slot != ~0u) {  /* Blacklist function that called a callback. */
>        TValue tv;
> -      setlightudV(&tv, (void *)cc.func);
> +      tv.u64 = ((uintptr_t)(void *)cc.func >> 2) | U64x(800000000, 00000000);

Side note: Comment desired. The reason why this hack works is that
compilers align the functions up to the machine word (i.e. 8 bytes),
hence two LSB are *likely* be zeros. Furthermore, functions calling the
callbacks are *likely* contains of more than 8 bytes of instructions.

However, I wonder whether we can create two hand-crafted assembler
functions, that are placed one right after another within 8 bytes, so
blacklisting one of them affects another one. I believe this need to be
checked (aside of this patchset) and reported to the upstream if the
issue occurs.

I leave this on you whether to add the comment for this or not.

>        setboolV(lj_tab_set(L, cts->miscmap, &tv), 1);
>      }
>      ct = (CType *)((intptr_t)ct+(intptr_t)cts->tab);  /* May be reallocated. */

<snipped>

> diff --git a/src/lj_crecord.c b/src/lj_crecord.c
> index e32ae23e..59186dab 100644
> --- a/src/lj_crecord.c
> +++ b/src/lj_crecord.c

<snipped>

> @@ -1209,8 +1208,7 @@ static int crec_call(jit_State *J, RecordFFData *rd, GCcdata *cd)
>      TRef tr;
>      TValue tv;
>      /* Check for blacklisted C functions that might call a callback. */
> -    setlightudV(&tv,
> -		cdata_getptr(cdataptr(cd), (LJ_64 && tp == IRT_P64) ? 8 : 4));
> +    tv.u64 = ((uintptr_t)cdata_getptr(cdataptr(cd), (LJ_64 && tp == IRT_P64) ? 8 : 4) >> 2) | U64x(800000000, 00000000);

Returning to the previous comment, looks like you can just turn JIT off
for the one handcrafted function to implicitly blacklist another one.

>      if (tvistrue(lj_tab_get(J->L, cts->miscmap, &tv)))
>        lj_trace_err(J, LJ_TRERR_BLACKL);
>      if (ctype_isvoid(ctr->info)) {

<snipped>

> diff --git a/src/lj_ir.c b/src/lj_ir.c
> index 5baece67..2f7ddb24 100644
> --- a/src/lj_ir.c
> +++ b/src/lj_ir.c
> @@ -386,8 +386,10 @@ void lj_ir_kvalue(lua_State *L, TValue *tv, const IRIns *ir)
>    case IR_KPRI: setpriV(tv, irt_toitype(ir->t)); break;
>    case IR_KINT: setintV(tv, ir->i); break;
>    case IR_KGC: setgcV(L, tv, ir_kgc(ir), irt_toitype(ir->t)); break;
> -  case IR_KPTR: case IR_KKPTR: setlightudV(tv, ir_kptr(ir)); break;
> -  case IR_KNULL: setlightudV(tv, NULL); break;
> +  case IR_KPTR: case IR_KKPTR:
> +    setnumV(tv, (lua_Number)(uintptr_t)ir_kptr(ir));

[Looking forward to the day this will crash for 53-bit VA addresses]

> +    break;
> +  case IR_KNULL: setintV(tv, 0); break;
>    case IR_KNUM: setnumV(tv, ir_knum(ir)->n); break;
>  #if LJ_HASFFI
>    case IR_KINT64: {

<snipped>

> diff --git a/src/lj_snap.c b/src/lj_snap.c
> index 4cf27e76..2f7cf80a 100644
> --- a/src/lj_snap.c
> +++ b/src/lj_snap.c
> @@ -626,7 +626,12 @@ static void snap_restoreval(jit_State *J, GCtrace *T, ExitState *ex,
>    IRType1 t = ir->t;
>    RegSP rs = ir->prev;
>    if (irref_isk(ref)) {  /* Restore constant slot. */
> -    lj_ir_kvalue(J->L, o, ir);
> +    if (ir->o == IR_KPTR) {
> +      o->u64 = (uint64_t)(uintptr_t)ir_kptr(ir);

Side note: It's strange, why Mike didn't make this trick within
<lj_ir_kvalue>, but decided to play with lua_Number *and* do not use
<lj_ir_kvalue> for IR_KPTR for this case. I haven't checked precisely
whether IR_KPTR is passed to <lj_ir_kvalue> anywhere in the JIT.

> +    } else {
> +      lua_assert(!(ir->o == IR_KKPTR || ir->o == IR_KNULL));

Side note: BTW, this choice is quite obvious, since both IR_KKPTR and
IR_KNULL are limited by LuaJIT defines.

> +      lj_ir_kvalue(J->L, o, ir);
> +    }
>      return;
>    }
>    if (LJ_UNLIKELY(bloomtest(rfilt, ref)))

<snipped>

> diff --git a/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c b/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
> new file mode 100644
> index 00000000..801c7fe1
> --- /dev/null
> +++ b/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
> @@ -0,0 +1,52 @@
> +#include <lua.h>
> +#include <lauxlib.h>
> +
> +#include <sys/mman.h>
> +#include <unistd.h>
> +
> +#undef NDEBUG
> +#include <assert.h>
> +
> +#define START ((void *)-1)
> +
> +static int longptr(lua_State *L)
> +{
> +	/*
> +	 * We know that for arm64 at least 48 bits are available.
> +	 * So emulate manually push of lightuseradata within
> +	 * this range.
> +	 */
> +	void *longptr = (void *)(1llu << 48);
> +	lua_pushlightuserdata(L, longptr);
> +	assert(longptr == lua_topointer(L, -1));
> +	/*
> +	 * If start mapping address is not NULL, then the kernel
> +	 * takes it as a hint about where to place the mapping, so
> +	 * we try to get the highest memory address by hint
> +	 * equals -1.
> +	 */

BTW, I guess it's more natural to split this function into two: one for
hand-crafted pointer, and another for one obtained by <mmap>.

> +	const size_t pagesize = getpagesize();

Please declare the variables at the beginning of the scope.

> +	void *mmaped = mmap(START, pagesize, PROT_NONE,
> +				  MAP_PRIVATE | MAP_ANON, -1, 0);

Typo: Odd arguments alignment.

> +	if (mmaped != MAP_FAILED) {
> +		lua_pushlightuserdata(L, mmaped);
> +		assert(mmaped == lua_topointer(L, -1));
> +		assert(munmap(mmaped, pagesize) == 0);
> +	}
> +	/* Clear our stack. */
> +	lua_pop(L, 0);
> +	lua_pushboolean(L, 1);
> +	return 1;
> +}
> +
> +static const struct luaL_Reg testlightuserdata[] = {
> +	{"longptr", longptr},
> +	{NULL, NULL}
> +};
> +
> +LUA_API int luaopen_testlightuserdata(lua_State *L)
> +{
> +	luaL_register(L, "testlightuserdata", testlightuserdata);
> +	return 1;
> +}
> +
> -- 
> 2.31.0
> 

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/2] Linux/ARM64: Make mremap() non-moving due to VA space woes.
  2021-07-06 17:40 ` [Tarantool-patches] [PATCH luajit 2/2] Linux/ARM64: Make mremap() non-moving due to VA space woes Sergey Kaplun via Tarantool-patches
@ 2021-07-27 15:23   ` Igor Munkin via Tarantool-patches
  2021-07-28 12:29     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 20+ messages in thread
From: Igor Munkin via Tarantool-patches @ 2021-07-27 15:23 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Sergey,

Thanks for the patch! LGTM, except a few nits below.

On 06.07.21, Sergey Kaplun wrote:
> From: Mike Pall <mike>
> 
> This reduces overall performance on ARM64, but we have no choice.
> Linux kernel default userspace VA is 48 bit, but we'd need 47 bit.
> mremap() ignores address hints due to a kernel API issue. The mapping
> may move to an undesired address which will cause an assert or crash.
> 
> Reported by Raymond W. Ko.
> 
> (cherry picked from commit 67dbec82f4f05a416a78a560a726553beaa7a223)
> 
> 47-bit VA space is required by LuaJIT for keeping a GC object pointer in
> TValue. When need to reallocate to huge sized block `mrepmap()` on arm64
> may move out VA space from the 47-bit range. `mremap()` accepts the

I'd rather reword the previous sentence the following way:
| In case of huge blobs that are mapped directly, `mremap()` may move
| the chunk out of 47-bit range of VA space on ARM64.

> fifth argument (new address hint) only with MREMAP_FIXED flag. In that
> case it unmaps any other mapping to specified address.
> 
> To avoid this behaviour this patch restricts `mremap()` to relocate
> the mapping to a new virtual address by reset MREMAP_MAYMOVE flag

I'm confused a bit with "reset" word: MREMAP_MAYMOVE is simply changed
to 0 (i.e. CALL_MREMAP_NOMOVE). Moreover, it's better to stay in LuaJIT
terms instead of Linux ones.

> for arm64 architecture.
> 
> Sergey Kaplun:
> * added the description and the test for the problem
> 
> Needed for tarantool/tarantool#6154
> ---
>  src/lj_alloc.c                                |  2 +-
>  .../lj-671-arm64-assert-after-mremap.test.lua | 24 +++++++++++++++++++
>  2 files changed, 25 insertions(+), 1 deletion(-)
>  create mode 100644 test/tarantool-tests/lj-671-arm64-assert-after-mremap.test.lua
> 

<snipped>

> diff --git a/test/tarantool-tests/lj-671-arm64-assert-after-mremap.test.lua b/test/tarantool-tests/lj-671-arm64-assert-after-mremap.test.lua
> new file mode 100644
> index 00000000..0be60a2d
> --- /dev/null
> +++ b/test/tarantool-tests/lj-671-arm64-assert-after-mremap.test.lua
> @@ -0,0 +1,24 @@
> +local tap = require('tap')
> +
> +-- Test file to demonstrate assertion after `mremap()` on arm64.
> +-- See also, https://github.com/LuaJIT/LuaJIT/issues/671.
> +
> +local test = tap.test('lj-671-arm64-assert-after-mremap')
> +test:plan(1)
> +
> +-- `mremap()` is used on Linux for remap directly mapped big

Typo: s/for/to/.

> +-- (>=DEFAULT_MMAP_THRESHOLD) memory chunks.
> +-- The simplest way to test memory move is to allocate the huge
> +-- memory chunk for string buffer directly and reallocate it
> +-- after.
> +-- To allocate buffer exactly to threshold limit for direct chunk
> +-- mapping use `string.rep()` with length equals threshold.
> +-- Then concatenate result string (with length of
> +-- DEFAULT_MMAP_THRESHOLD) with the other one to reallocate
> +-- and remap string buffer.

Just polished two sections above:
| -- To allocate a memory buffer with the size up to the threshold
| -- for direct mapping `string.rep()` is used with the length that
| -- equals to DEFAULT_MMAP_THRESHOLD.
| -- Then concatenate the directly mapped result string with the
| -- other one to trigger buffer reallocation and its remapping.

> +
> +local DEFAULT_MMAP_THRESHOLD = 128 * 1024
> +local s = string.rep('x', DEFAULT_MMAP_THRESHOLD)..'x'
> +test:ok(s)
> +
> +os.exit(test:check() and 0 or 1)
> -- 
> 2.31.0
> 

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 1/2] Add support for full-range 64 bit lightuserdata.
  2021-07-27 13:59   ` Igor Munkin via Tarantool-patches
@ 2021-07-28 12:29     ` Sergey Kaplun via Tarantool-patches
  2021-08-01 10:36       ` Igor Munkin via Tarantool-patches
  2021-08-01 16:25       ` Sergey Ostanevich via Tarantool-patches
  0 siblings, 2 replies; 20+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-07-28 12:29 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches

Igor,
Thanks for the review!

On 27.07.21, Igor Munkin wrote:
> Sergey,
> 
> Thanks for the patch! Please consider my comments below.

I'll skip all the part regarding comments, as far as we discuss with
you offline to take patches as is to simplify backporting of patches
from the upstream. Of course comments are desired, especially taking in
account that there are a lot of places, where patch bases on Mike's
knowledge about how it works with tons of hacks.

> 
> On 06.07.21, Sergey Kaplun wrote:
> > From: Mike Pall <mike>
> > 
> > (cherry picked from commit e9af1abec542e6f9851ff2368e7f196b6382a44c)
> > 
> > LuaJIT uses special NaN-tagging technique to store internal type on
> > the Lua stack. In case LJ_GC64 first 13 bits are set in special NaN
> > type (0xfff8...). FPU generates the only one type. The next 4 bits are
> > used for an internal LuaJIT type of object on stack. The next 47 bits
> > are used for storing this object's content. For userdata, it is its
> > address. In case arm64 the pointer can have more than 47 significant
> > bits [1]. In this case the error BADLU error is raised.
> 
> This is not right for (LJ_64 && !LJ_GC64) case, but AFAIU you are
> writing about LJ_GC64 here since this is the only option for ARM64.

Yes, the LJ_GC64 mode is mentioned in the second sentence.

> 
> > 
> > For the support of full 64-bit range lightuserdata pointers two new
> > fields in GCState are added:
> > 
> > `lightudseg` - vector of segments of lightuserdata storing 32-bit
> 
> I don't get whether vector or segments or lightuserdata store these
> 32-bit values :)

Fixes.

> 
> > values. MS 25 bits equal to MS bits of lightuserdata address, the
> 
> It's better to use either "MSB" acronym or "most significant 25 bits".

Fixed.

> 
> > rest are filled with zeros. The lentgh of the vector is power of 2.
> > 
> > `lightudnum` - the length - 1 of aforementioned vector (up to 255).
> > 
> > When lightuserdata is pushed on the stack, if its segment is not stored
> > in vector new value is pushed on top. The maximum amount of segments is
> 
> On top of the stack or <lightudseg>? The wording is ambiguous. AFAIU,
> you're writing about the vector, so the preferable verb is "append" or
> "push back". Or simply "add", since nobody cares about the location of
> the particular key.

Fixed.

> 
> > 256, BADLU error is raised in case when user tried to add userdata with
> > the new 257-th segment.
> 
> It's worth to mention that such hack with segments still doesn't cover
> the full VA space, but only those areas covered with the segments. It
> was unclear to me at first.

Fixed.

> 
> > 
> > Also, in this patch all internal usage of lightuserdata (for hooks,
> > profilers, built-in package, ir and so on) is changed to special values
> 
> Typo: s/ir/IR/ since this is an acronym.

Fixed.

> 
> > on Lua Stack.
> > 
> > Also, conversion of TValue to ffi C type with store is no longer
> 
> Ditto for FFI.

Fixed.

> 
> > compiled for lightuserdata.
> > 
> > [1]: https://www.kernel.org/doc/html/latest/arm64/memory.html
> > 
> > Sergey Kaplun:
> > * added the description and the test for the problem
> > 
> > Needed for tarantool/tarantool#6154
> > Resolves tarantool/tarantool#2712
> 
> Please move "Resolves" tag on top. BTW, it's rather "Fixes", isn't it?

It will be resolved when the LuaJIT will be bumped in the Tarantool,
won't it?

Changed lines order.

The new commit message is the following:

===================================================================
Add support for full-range 64 bit lightuserdata.

(cherry picked from commit e9af1abec542e6f9851ff2368e7f196b6382a44c)

LuaJIT uses special NaN-tagging technique to store internal type on
the Lua stack. In case LJ_GC64 first 13 bits are set in special NaN
type (0xfff8...). FPU generates the only one type. The next 4 bits are
used for an internal LuaJIT type of object on stack. The next 47 bits
are used for storing this object's content. For userdata, it is its
address. In case arm64 the pointer can have more than 47 significant
bits [1]. In this case the error BADLU error is raised.

For the support of full 64-bit range lightuserdata pointers two new
fields in GCState are added:

`lightudseg` - vector of segments of lightuserdata. Each element keeps
32-bit value. 25 MSB equal to MSB of lightuserdata address, the rest are
filled with zeros. The lentgh of the vector is power of 2.

`lightudnum` - the length - 1 of aforementioned vector (up to 255).

When lightuserdata is pushed on the stack, if its segment is not stored
in vector new value is appended on top of this vector. The maximum
amount of segments is 256. BADLU error is raised in case when user tried
to add userdata with the new 257-th segment, so the whole VA-space
isn't covered by this patch.

Also, in this patch all internal usage of lightuserdata (for hooks,
profilers, built-in package, IR and so on) is changed to special values
on Lua Stack.

Also, conversion of TValue to FFI C type with store is no longer
compiled for lightuserdata.

[1]: https://www.kernel.org/doc/html/latest/arm64/memory.html

Sergey Kaplun:
* added the description and the test for the problem

Resolves tarantool/tarantool#2712
Needed for tarantool/tarantool#6154
===================================================================

> 
> > ---
> >  doc/status.html                               | 11 ----
> >  src/jit/dump.lua                              |  4 +-
> >  src/lib_debug.c                               | 12 ++--
> >  src/lib_jit.c                                 | 14 ++---
> >  src/lib_package.c                             |  8 +--
> >  src/lib_string.c                              |  2 +-
> >  src/lj_api.c                                  | 40 +++++++++++--
> >  src/lj_ccall.c                                |  2 +-
> >  src/lj_cconv.c                                |  2 +-
> >  src/lj_crecord.c                              |  6 +-
> >  src/lj_dispatch.c                             |  2 +-
> >  src/lj_ir.c                                   |  6 +-
> >  src/lj_obj.c                                  |  5 +-
> >  src/lj_obj.h                                  | 57 ++++++++++++-------
> >  src/lj_snap.c                                 |  7 ++-
> >  src/lj_state.c                                |  6 ++
> >  src/lj_strfmt.c                               |  2 +-
> >  test/tarantool-tests/CMakeLists.txt           |  1 +
> >  .../lj-49-bad-lightuserdata.test.lua          | 10 ++++
> >  .../lj-49-bad-lightuserdata/CMakeLists.txt    |  1 +
> >  .../testlightuserdata.c                       | 52 +++++++++++++++++
> >  21 files changed, 183 insertions(+), 67 deletions(-)
> >  create mode 100644 test/tarantool-tests/lj-49-bad-lightuserdata.test.lua
> >  create mode 100644 test/tarantool-tests/lj-49-bad-lightuserdata/CMakeLists.txt
> >  create mode 100644 test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
> > 

<snipped>

> > diff --git a/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c b/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
> > new file mode 100644
> > index 00000000..801c7fe1
> > --- /dev/null
> > +++ b/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
> > @@ -0,0 +1,52 @@
> > +#include <lua.h>
> > +#include <lauxlib.h>
> > +
> > +#include <sys/mman.h>
> > +#include <unistd.h>
> > +
> > +#undef NDEBUG
> > +#include <assert.h>
> > +
> > +#define START ((void *)-1)
> > +
> > +static int longptr(lua_State *L)
> > +{
> > +	/*
> > +	 * We know that for arm64 at least 48 bits are available.
> > +	 * So emulate manually push of lightuseradata within
> > +	 * this range.
> > +	 */
> > +	void *longptr = (void *)(1llu << 48);
> > +	lua_pushlightuserdata(L, longptr);
> > +	assert(longptr == lua_topointer(L, -1));
> > +	/*
> > +	 * If start mapping address is not NULL, then the kernel
> > +	 * takes it as a hint about where to place the mapping, so
> > +	 * we try to get the highest memory address by hint
> > +	 * equals -1.
> > +	 */
> 
> BTW, I guess it's more natural to split this function into two: one for
> hand-crafted pointer, and another for one obtained by <mmap>.
> 
> > +	const size_t pagesize = getpagesize();
> 
> Please declare the variables at the beginning of the scope.

Don't get it. We don't use the LuaJIT code stile for tests, see memprof
tests for example. Anyway after spllitting the this function into two,
the variables declared at the beginning of the scope.

> 
> > +	void *mmaped = mmap(START, pagesize, PROT_NONE,
> > +				  MAP_PRIVATE | MAP_ANON, -1, 0);
> 
> Typo: Odd arguments alignment.

Fixed.

See the iterative patch below.

===================================================================
diff --git a/test/tarantool-tests/lj-49-bad-lightuserdata.test.lua b/test/tarantool-tests/lj-49-bad-lightuserdata.test.lua
index 111d6a70..94a743c7 100644
--- a/test/tarantool-tests/lj-49-bad-lightuserdata.test.lua
+++ b/test/tarantool-tests/lj-49-bad-lightuserdata.test.lua
@@ -1,10 +1,11 @@
 local tap = require('tap')
 
 local test = tap.test('lj-49-bad-lightuserdata')
-test:plan(1)
+test:plan(2)
 
 local testlightuserdata = require('testlightuserdata')
 
-test:ok(testlightuserdata.longptr())
+test:ok(testlightuserdata.crafted_ptr())
+test:ok(testlightuserdata.mmaped_ptr())
 
 os.exit(test:check() and 0 or 1)
diff --git a/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c b/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
index 801c7fe1..c958841f 100644
--- a/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
+++ b/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
@@ -9,7 +9,7 @@
 
 #define START ((void *)-1)
 
-static int longptr(lua_State *L)
+static int crafted_ptr(lua_State *L)
 {
 	/*
 	 * We know that for arm64 at least 48 bits are available.
@@ -19,6 +19,14 @@ static int longptr(lua_State *L)
 	void *longptr = (void *)(1llu << 48);
 	lua_pushlightuserdata(L, longptr);
 	assert(longptr == lua_topointer(L, -1));
+	/* Clear our stack. */
+	lua_pop(L, 0);
+	lua_pushboolean(L, 1);
+	return 1;
+}
+
+static int mmaped_ptr(lua_State *L)
+{
 	/*
 	 * If start mapping address is not NULL, then the kernel
 	 * takes it as a hint about where to place the mapping, so
@@ -26,8 +34,8 @@ static int longptr(lua_State *L)
 	 * equals -1.
 	 */
 	const size_t pagesize = getpagesize();
-	void *mmaped = mmap(START, pagesize, PROT_NONE,
-				  MAP_PRIVATE | MAP_ANON, -1, 0);
+	void *mmaped = mmap(START, pagesize, PROT_NONE, MAP_PRIVATE | MAP_ANON,
+			    -1, 0);
 	if (mmaped != MAP_FAILED) {
 		lua_pushlightuserdata(L, mmaped);
 		assert(mmaped == lua_topointer(L, -1));
@@ -40,7 +48,8 @@ static int longptr(lua_State *L)
 }
 
 static const struct luaL_Reg testlightuserdata[] = {
-	{"longptr", longptr},
+	{"crafted_ptr", crafted_ptr},
+	{"mmaped_ptr", mmaped_ptr},
 	{NULL, NULL}
 };
 
===================================================================

> 
> > +	if (mmaped != MAP_FAILED) {
> > +		lua_pushlightuserdata(L, mmaped);
> > +		assert(mmaped == lua_topointer(L, -1));
> > +		assert(munmap(mmaped, pagesize) == 0);
> > +	}
> > +	/* Clear our stack. */
> > +	lua_pop(L, 0);
> > +	lua_pushboolean(L, 1);
> > +	return 1;
> > +}
> > +
> > +static const struct luaL_Reg testlightuserdata[] = {
> > +	{"longptr", longptr},
> > +	{NULL, NULL}
> > +};
> > +
> > +LUA_API int luaopen_testlightuserdata(lua_State *L)
> > +{
> > +	luaL_register(L, "testlightuserdata", testlightuserdata);
> > +	return 1;
> > +}
> > +
> > -- 
> > 2.31.0
> > 
> 
> -- 
> Best regards,
> IM

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/2] Linux/ARM64: Make mremap() non-moving due to VA space woes.
  2021-07-27 15:23   ` Igor Munkin via Tarantool-patches
@ 2021-07-28 12:29     ` Sergey Kaplun via Tarantool-patches
  2021-08-01 10:36       ` Igor Munkin via Tarantool-patches
  0 siblings, 1 reply; 20+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-07-28 12:29 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches

On 27.07.21, Igor Munkin wrote:
> Sergey,
> 
> Thanks for the patch! LGTM, except a few nits below.
> 
> On 06.07.21, Sergey Kaplun wrote:
> > From: Mike Pall <mike>
> > 
> > This reduces overall performance on ARM64, but we have no choice.
> > Linux kernel default userspace VA is 48 bit, but we'd need 47 bit.
> > mremap() ignores address hints due to a kernel API issue. The mapping
> > may move to an undesired address which will cause an assert or crash.
> > 
> > Reported by Raymond W. Ko.
> > 
> > (cherry picked from commit 67dbec82f4f05a416a78a560a726553beaa7a223)
> > 
> > 47-bit VA space is required by LuaJIT for keeping a GC object pointer in
> > TValue. When need to reallocate to huge sized block `mrepmap()` on arm64
> > may move out VA space from the 47-bit range. `mremap()` accepts the
> 
> I'd rather reword the previous sentence the following way:
> | In case of huge blobs that are mapped directly, `mremap()` may move
> | the chunk out of 47-bit range of VA space on ARM64.

Fixed.

> 
> > fifth argument (new address hint) only with MREMAP_FIXED flag. In that
> > case it unmaps any other mapping to specified address.
> > 
> > To avoid this behaviour this patch restricts `mremap()` to relocate
> > the mapping to a new virtual address by reset MREMAP_MAYMOVE flag
> 
> I'm confused a bit with "reset" word: MREMAP_MAYMOVE is simply changed
> to 0 (i.e. CALL_MREMAP_NOMOVE). Moreover, it's better to stay in LuaJIT
> terms instead of Linux ones.

Fixed.

The new commit message is the following:

===================================================================
Linux/ARM64: Make mremap() non-moving due to VA space woes.

This reduces overall performance on ARM64, but we have no choice.
Linux kernel default userspace VA is 48 bit, but we'd need 47 bit.
mremap() ignores address hints due to a kernel API issue. The mapping
may move to an undesired address which will cause an assert or crash.

Reported by Raymond W. Ko.

(cherry picked from commit 67dbec82f4f05a416a78a560a726553beaa7a223)

47-bit VA space is required by LuaJIT for keeping a GC object pointer in
TValue. In case of huge blobs that are mapped directly, `mremap()` may
move the chunk out of 47-bit range of VA space on ARM64. `mremap()`
accepts the fifth argument (new address hint) only with MREMAP_FIXED
flag. In that case it unmaps any other mapping to specified address.

To avoid this behaviour this patch restricts `mremap()` to relocate
the mapping to a new virtual address by set CALL_MREMAP_NOMOVE flag
instead of CALL_MREMAP_MAYMOVE for arm64 architecture.

Sergey Kaplun:
* added the description and the test for the problem

Needed for tarantool/tarantool#6154
===================================================================

> 
> > for arm64 architecture.
> > 
> > Sergey Kaplun:
> > * added the description and the test for the problem
> > 
> > Needed for tarantool/tarantool#6154
> > ---
> >  src/lj_alloc.c                                |  2 +-
> >  .../lj-671-arm64-assert-after-mremap.test.lua | 24 +++++++++++++++++++
> >  2 files changed, 25 insertions(+), 1 deletion(-)
> >  create mode 100644 test/tarantool-tests/lj-671-arm64-assert-after-mremap.test.lua
> > 
> 
> <snipped>
> 
> > diff --git a/test/tarantool-tests/lj-671-arm64-assert-after-mremap.test.lua b/test/tarantool-tests/lj-671-arm64-assert-after-mremap.test.lua
> > new file mode 100644
> > index 00000000..0be60a2d
> > --- /dev/null
> > +++ b/test/tarantool-tests/lj-671-arm64-assert-after-mremap.test.lua
> > @@ -0,0 +1,24 @@
> > +local tap = require('tap')
> > +
> > +-- Test file to demonstrate assertion after `mremap()` on arm64.
> > +-- See also, https://github.com/LuaJIT/LuaJIT/issues/671.
> > +
> > +local test = tap.test('lj-671-arm64-assert-after-mremap')
> > +test:plan(1)
> > +
> > +-- `mremap()` is used on Linux for remap directly mapped big
> 
> Typo: s/for/to/.
> 
> > +-- (>=DEFAULT_MMAP_THRESHOLD) memory chunks.
> > +-- The simplest way to test memory move is to allocate the huge
> > +-- memory chunk for string buffer directly and reallocate it
> > +-- after.
> > +-- To allocate buffer exactly to threshold limit for direct chunk
> > +-- mapping use `string.rep()` with length equals threshold.
> > +-- Then concatenate result string (with length of
> > +-- DEFAULT_MMAP_THRESHOLD) with the other one to reallocate
> > +-- and remap string buffer.
> 
> Just polished two sections above:
> | -- To allocate a memory buffer with the size up to the threshold
> | -- for direct mapping `string.rep()` is used with the length that
> | -- equals to DEFAULT_MMAP_THRESHOLD.
> | -- Then concatenate the directly mapped result string with the
> | -- other one to trigger buffer reallocation and its remapping.

Fixed. See the iterative patch below. Branch is force pushed.

===================================================================
diff --git a/test/tarantool-tests/lj-671-arm64-assert-after-mremap.test.lua b/test/tarantool-tests/lj-671-arm64-assert-after-mremap.test.lua
index 0be60a2d..0558cbe3 100644
--- a/test/tarantool-tests/lj-671-arm64-assert-after-mremap.test.lua
+++ b/test/tarantool-tests/lj-671-arm64-assert-after-mremap.test.lua
@@ -6,16 +6,16 @@ local tap = require('tap')
 local test = tap.test('lj-671-arm64-assert-after-mremap')
 test:plan(1)
 
--- `mremap()` is used on Linux for remap directly mapped big
+-- `mremap()` is used on Linux to remap directly mapped big
 -- (>=DEFAULT_MMAP_THRESHOLD) memory chunks.
 -- The simplest way to test memory move is to allocate the huge
 -- memory chunk for string buffer directly and reallocate it
 -- after.
--- To allocate buffer exactly to threshold limit for direct chunk
--- mapping use `string.rep()` with length equals threshold.
--- Then concatenate result string (with length of
--- DEFAULT_MMAP_THRESHOLD) with the other one to reallocate
--- and remap string buffer.
+-- To allocate a memory buffer with the size up to the threshold
+-- for direct mapping `string.rep()` is used with the length that
+-- equals to DEFAULT_MMAP_THRESHOLD.
+-- Then concatenate the directly mapped result string with the
+-- other one to trigger buffer reallocation and its remapping.
 
 local DEFAULT_MMAP_THRESHOLD = 128 * 1024
 local s = string.rep('x', DEFAULT_MMAP_THRESHOLD)..'x'
===================================================================

> 
> > +
> > +local DEFAULT_MMAP_THRESHOLD = 128 * 1024
> > +local s = string.rep('x', DEFAULT_MMAP_THRESHOLD)..'x'
> > +test:ok(s)
> > +
> > +os.exit(test:check() and 0 or 1)
> > -- 
> > 2.31.0
> > 
> 
> -- 
> Best regards,
> IM

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 1/2] Add support for full-range 64 bit lightuserdata.
  2021-07-28 12:29     ` Sergey Kaplun via Tarantool-patches
@ 2021-08-01 10:36       ` Igor Munkin via Tarantool-patches
  2021-08-02 14:56         ` Sergey Kaplun via Tarantool-patches
  2021-08-01 16:25       ` Sergey Ostanevich via Tarantool-patches
  1 sibling, 1 reply; 20+ messages in thread
From: Igor Munkin via Tarantool-patches @ 2021-08-01 10:36 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Sergey,

Thanks for the fixes and comments! LGTM then, except the three typos
I've missed in the previous review.

On 28.07.21, Sergey Kaplun wrote:
> Igor,
> Thanks for the review!
> 

<snipped>

> 
> The new commit message is the following:
> 
> ===================================================================
> Add support for full-range 64 bit lightuserdata.
> 
> (cherry picked from commit e9af1abec542e6f9851ff2368e7f196b6382a44c)
> 
> LuaJIT uses special NaN-tagging technique to store internal type on
> the Lua stack. In case LJ_GC64 first 13 bits are set in special NaN
> type (0xfff8...). FPU generates the only one type. The next 4 bits are
> used for an internal LuaJIT type of object on stack. The next 47 bits
> are used for storing this object's content. For userdata, it is its
> address. In case arm64 the pointer can have more than 47 significant
> bits [1]. In this case the error BADLU error is raised.
> 
> For the support of full 64-bit range lightuserdata pointers two new
> fields in GCState are added:
> 
> `lightudseg` - vector of segments of lightuserdata. Each element keeps
> 32-bit value. 25 MSB equal to MSB of lightuserdata address, the rest are
> filled with zeros. The lentgh of the vector is power of 2.

Typo: s/lentgh/length/.

> 
> `lightudnum` - the length - 1 of aforementioned vector (up to 255).
> 
> When lightuserdata is pushed on the stack, if its segment is not stored
> in vector new value is appended on top of this vector. The maximum
> amount of segments is 256. BADLU error is raised in case when user tried

Typo: s/tried/tries/.

> to add userdata with the new 257-th segment, so the whole VA-space
> isn't covered by this patch.
> 
> Also, in this patch all internal usage of lightuserdata (for hooks,
> profilers, built-in package, IR and so on) is changed to special values
> on Lua Stack.
> 
> Also, conversion of TValue to FFI C type with store is no longer
> compiled for lightuserdata.
> 
> [1]: https://www.kernel.org/doc/html/latest/arm64/memory.html
> 
> Sergey Kaplun:
> * added the description and the test for the problem
> 
> Resolves tarantool/tarantool#2712
> Needed for tarantool/tarantool#6154

Minor: Why #5629 is not mentioned?

> ===================================================================
> 

<snipped>

> 
> See the iterative patch below.
> 
> ===================================================================

<snipped>

> diff --git a/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c b/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
> index 801c7fe1..c958841f 100644
> --- a/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
> +++ b/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c

<snipped>

> @@ -26,8 +34,8 @@ static int longptr(lua_State *L)
>  	 * equals -1.

Typo: s/hint equals -1/hint that equals to -1/.

>  	 */
>  	const size_t pagesize = getpagesize();
> -	void *mmaped = mmap(START, pagesize, PROT_NONE,
> -				  MAP_PRIVATE | MAP_ANON, -1, 0);
> +	void *mmaped = mmap(START, pagesize, PROT_NONE, MAP_PRIVATE | MAP_ANON,
> +			    -1, 0);
>  	if (mmaped != MAP_FAILED) {
>  		lua_pushlightuserdata(L, mmaped);
>  		assert(mmaped == lua_topointer(L, -1));

<snipped>

> ===================================================================
> 

<snipped>

> 
> -- 
> Best regards,
> Sergey Kaplun

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/2] Linux/ARM64: Make mremap() non-moving due to VA space woes.
  2021-07-28 12:29     ` Sergey Kaplun via Tarantool-patches
@ 2021-08-01 10:36       ` Igor Munkin via Tarantool-patches
  2021-08-01 16:59         ` Sergey Ostanevich via Tarantool-patches
  2021-08-02 15:11         ` Sergey Kaplun via Tarantool-patches
  0 siblings, 2 replies; 20+ messages in thread
From: Igor Munkin via Tarantool-patches @ 2021-08-01 10:36 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Sergey,

Thanks for the fixes! LGTM, except the single typo.

On 28.07.21, Sergey Kaplun wrote:

<snipped>

> 
> The new commit message is the following:
> 
> ===================================================================
> Linux/ARM64: Make mremap() non-moving due to VA space woes.
> 
> This reduces overall performance on ARM64, but we have no choice.
> Linux kernel default userspace VA is 48 bit, but we'd need 47 bit.
> mremap() ignores address hints due to a kernel API issue. The mapping
> may move to an undesired address which will cause an assert or crash.
> 
> Reported by Raymond W. Ko.
> 
> (cherry picked from commit 67dbec82f4f05a416a78a560a726553beaa7a223)
> 
> 47-bit VA space is required by LuaJIT for keeping a GC object pointer in
> TValue. In case of huge blobs that are mapped directly, `mremap()` may
> move the chunk out of 47-bit range of VA space on ARM64. `mremap()`
> accepts the fifth argument (new address hint) only with MREMAP_FIXED
> flag. In that case it unmaps any other mapping to specified address.
> 
> To avoid this behaviour this patch restricts `mremap()` to relocate
> the mapping to a new virtual address by set CALL_MREMAP_NOMOVE flag

Typo: s/set/setting/.

> instead of CALL_MREMAP_MAYMOVE for arm64 architecture.
> 
> Sergey Kaplun:
> * added the description and the test for the problem
> 
> Needed for tarantool/tarantool#6154

Minor: Why #5629 is not mentioned?

> ===================================================================
> 

<snipped>

> 
> -- 
> Best regards,
> Sergey Kaplun

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 1/2] Add support for full-range 64 bit lightuserdata.
  2021-07-28 12:29     ` Sergey Kaplun via Tarantool-patches
  2021-08-01 10:36       ` Igor Munkin via Tarantool-patches
@ 2021-08-01 16:25       ` Sergey Ostanevich via Tarantool-patches
  2021-08-02 14:51         ` Sergey Kaplun via Tarantool-patches
  1 sibling, 1 reply; 20+ messages in thread
From: Sergey Ostanevich via Tarantool-patches @ 2021-08-01 16:25 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 3655 bytes --]

Hi! Thanks for the patch!

Some minor message fixes, one great gag from Mike’s code and a
test request.

Regards,
Sergos

> 
> The new commit message is the following:
> 
> ===================================================================
> Add support for full-range 64 bit lightuserdata.
> 
> (cherry picked from commit e9af1abec542e6f9851ff2368e7f196b6382a44c)
> 
> LuaJIT uses special NaN-tagging technique to store internal type on
> the Lua stack. In case LJ_GC64 first 13 bits are set in special NaN
		^^^^^^^		^
		In case of     the
> type (0xfff8...). FPU generates the only one type. The next 4 bits are
				  ^^^^^^^^^^^
			Which one and how is it relevant?	

> used for an internal LuaJIT type of object on stack. The next 47 bits
> are used for storing this object's content. For userdata, it is its
> address. In case arm64 the pointer can have more than 47 significant
	   ^^^^^
	   For
> bits [1]. In this case the error BADLU error is raised.
> 
> For the support of full 64-bit range lightuserdata pointers two new
> fields in GCState are added:
> 
> `lightudseg` - vector of segments of lightuserdata. Each element keeps
> 32-bit value. 25 MSB equal to MSB of lightuserdata address, the rest are
                                                    ^
						64bit
> filled with zeros. The length of the vector is power of 2.
> 
> `lightudnum` - the length - 1 of aforementioned vector (up to 255).
> 
> When lightuserdata is pushed on the stack, if its segment is not stored

> in vector new value is appended on top of this vector. The maximum
				 ^^^^^^^^^ to

At first I want you to put it as ’not found’ instead of ’not stored’. 
Then I start thinking over ‘on top’ for a vector and I got a strange
feeling... 


Now tell me, every time you put a LUD pointer to stack you have to roll
over all present segments in this '>>>' plain loop below?

--- a/src/lj_api.c
+++ b/src/lj_api.c
+#if LJ_64
+static void *lightud_intern(lua_State *L, void *p)
+{
+  global_State *g = G(L);
+  uint64_t u = (uint64_t)p;
+  uint32_t up = lightudup(u);
+  uint32_t *segmap = mref(g->gc.lightudseg, uint32_t);
+  MSize segnum = g->gc.lightudnum;
+  if (segmap) {
+    MSize seg;
>>> +    for (seg = 0; seg <= segnum; seg++)
>>> +      if (segmap[seg] == up)  /* Fast path. */
>>> +	return (void *)(((uint64_t)seg << LJ_LIGHTUD_BITS_LO) | lightudlo(u));
+    segnum++;
+  }
+  if (!((segnum-1) & segnum) && segnum != 1) {
+    if (segnum >= (1 << LJ_LIGHTUD_BITS_SEG)) lj_err_msg(L, LJ_ERR_BADLU);
+    lj_mem_reallocvec(L, segmap, segnum, segnum ? 2*segnum : 2u, uint32_t);
+    setmref(g->gc.lightudseg, segmap);
+  }
+  g->gc.lightudnum = segnum;
+  segmap[segnum] = up;
+  return (void *)(((uint64_t)segnum << LJ_LIGHTUD_BITS_LO) | lightudlo(u));
+}
+#endif
+

Can’t help to laugh at Mike’s /* Fast path */, brilliant isn’t it?
Perhaps addition of a new segment is not so often - and is counted to 256 -
so we can easily sort the array each time to make it log(n) rather (n) for
each lua_pushlightuserdata()?

> <snipped>
> 
> See the iterative patch below.
> 
> ===================================================================
> diff --git a/test/tarantool-tests/lj-49-bad-lightuserdata.test.lua b/test/tarantool-tests/lj-49-bad-lightuserdata.test.lua

This one tests the LUD push/pop to/fro stack. How about those 

> all internal usage of lightuserdata (for hooks,
> profilers, built-in package, IR and so on) is changed to special values
> on Lua Stack.

Can you add at least _some_ test to verify memprof is fine?


[-- Attachment #2: Type: text/html, Size: 29264 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/2] Linux/ARM64: Make mremap() non-moving due to VA space woes.
  2021-08-01 10:36       ` Igor Munkin via Tarantool-patches
@ 2021-08-01 16:59         ` Sergey Ostanevich via Tarantool-patches
  2021-08-02 15:08           ` Sergey Kaplun via Tarantool-patches
  2021-08-02 15:11         ` Sergey Kaplun via Tarantool-patches
  1 sibling, 1 reply; 20+ messages in thread
From: Sergey Ostanevich via Tarantool-patches @ 2021-08-01 16:59 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches

Hi! Thanks for the patch!

Now direct_resize() will fail if it doesn't fit into the current place
with a new size and a direct_alloc() supposedly will be called. This one 
doesn't help with 47-bit address AFAIU since it has no extra option -
I doubt it exist at all - to ask kernel to fit.
 
So, how it helps?


Regards,
Sergos


> On 1 Aug 2021, at 13:36, Igor Munkin <imun@tarantool.org> wrote:
> 
> Sergey,
> 
> Thanks for the fixes! LGTM, except the single typo.
> 
> On 28.07.21, Sergey Kaplun wrote:
> 
> <snipped>
> 
>> 
>> The new commit message is the following:
>> 
>> ===================================================================
>> Linux/ARM64: Make mremap() non-moving due to VA space woes.
>> 
>> This reduces overall performance on ARM64, but we have no choice.
>> Linux kernel default userspace VA is 48 bit, but we'd need 47 bit.
>> mremap() ignores address hints due to a kernel API issue. The mapping
>> may move to an undesired address which will cause an assert or crash.
>> 
>> Reported by Raymond W. Ko.
>> 
>> (cherry picked from commit 67dbec82f4f05a416a78a560a726553beaa7a223)
>> 
>> 47-bit VA space is required by LuaJIT for keeping a GC object pointer in
>> TValue. In case of huge blobs that are mapped directly, `mremap()` may
>> move the chunk out of 47-bit range of VA space on ARM64. `mremap()`
>> accepts the fifth argument (new address hint) only with MREMAP_FIXED
>> flag. In that case it unmaps any other mapping to specified address.
>> 
>> To avoid this behaviour this patch restricts `mremap()` to relocate
>> the mapping to a new virtual address by set CALL_MREMAP_NOMOVE flag
> 
> Typo: s/set/setting/.
> 
>> instead of CALL_MREMAP_MAYMOVE for arm64 architecture.
>> 
>> Sergey Kaplun:
>> * added the description and the test for the problem
>> 
>> Needed for tarantool/tarantool#6154
> 
> Minor: Why #5629 is not mentioned?
> 
>> ===================================================================
>> 
> 
> <snipped>
> 
>> 
>> -- 
>> Best regards,
>> Sergey Kaplun
> 
> -- 
> Best regards,
> IM


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 1/2] Add support for full-range 64 bit lightuserdata.
  2021-08-01 16:25       ` Sergey Ostanevich via Tarantool-patches
@ 2021-08-02 14:51         ` Sergey Kaplun via Tarantool-patches
  2021-08-02 15:42           ` Igor Munkin via Tarantool-patches
  2021-08-10 16:46           ` Sergey Ostanevich via Tarantool-patches
  0 siblings, 2 replies; 20+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-08-02 14:51 UTC (permalink / raw)
  To: Sergey Ostanevich; +Cc: tarantool-patches

Hi, Sergos!

Thanks for the review!

On 01.08.21, Sergey Ostanevich wrote:
> Hi! Thanks for the patch!
> 
> Some minor message fixes, one great gag from Mike’s code and a
> test request.
> 
> Regards,
> Sergos
> 
> > 
> > The new commit message is the following:
> > 
> > ===================================================================
> > Add support for full-range 64 bit lightuserdata.
> > 
> > (cherry picked from commit e9af1abec542e6f9851ff2368e7f196b6382a44c)
> > 
> > LuaJIT uses special NaN-tagging technique to store internal type on
> > the Lua stack. In case LJ_GC64 first 13 bits are set in special NaN
> 		^^^^^^^		^
> 		In case of     the

Fixed.

> > type (0xfff8...). FPU generates the only one type. The next 4 bits are
> 				  ^^^^^^^^^^^
> 			Which one and how is it relevant?	

Yep, it can be dropped.

> 
> > used for an internal LuaJIT type of object on stack. The next 47 bits
> > are used for storing this object's content. For userdata, it is its
> > address. In case arm64 the pointer can have more than 47 significant
> 	   ^^^^^
> 	   For

Fixed.

> > bits [1]. In this case the error BADLU error is raised.
> > 
> > For the support of full 64-bit range lightuserdata pointers two new
> > fields in GCState are added:
> > 
> > `lightudseg` - vector of segments of lightuserdata. Each element keeps
> > 32-bit value. 25 MSB equal to MSB of lightuserdata address, the rest are
>                                                     ^
> 						64bit

Fixed.

> > filled with zeros. The length of the vector is power of 2.
> > 
> > `lightudnum` - the length - 1 of aforementioned vector (up to 255).
> > 
> > When lightuserdata is pushed on the stack, if its segment is not stored
> 
> > in vector new value is appended on top of this vector. The maximum
> 				 ^^^^^^^^^ to

Fixed.

> 
> At first I want you to put it as ’not found’ instead of ’not stored’. 
> Then I start thinking over ‘on top’ for a vector and I got a strange
> feeling... 
> 
> 
> Now tell me, every time you put a LUD pointer to stack you have to roll
> over all present segments in this '>>>' plain loop below?

Yep, praying for the "Fast path".

Side note: Mike likes to "teach" people how they *should* write code,
and how they *shouldn't*. Also, he often tells that huge slowdowns and
lags, when the "wrong" code is running are teaching them to avoid bad
code. I suppose, he thinks, that intensive usage of userdata is a bad
pattern in Lua and more important in LuaJIT...

> 
> --- a/src/lj_api.c
> +++ b/src/lj_api.c
> +#if LJ_64
> +static void *lightud_intern(lua_State *L, void *p)
> +{
> +  global_State *g = G(L);
> +  uint64_t u = (uint64_t)p;
> +  uint32_t up = lightudup(u);
> +  uint32_t *segmap = mref(g->gc.lightudseg, uint32_t);
> +  MSize segnum = g->gc.lightudnum;
> +  if (segmap) {
> +    MSize seg;
> >>> +    for (seg = 0; seg <= segnum; seg++)
> >>> +      if (segmap[seg] == up)  /* Fast path. */
> >>> +	return (void *)(((uint64_t)seg << LJ_LIGHTUD_BITS_LO) | lightudlo(u));
> +    segnum++;
> +  }
> +  if (!((segnum-1) & segnum) && segnum != 1) {
> +    if (segnum >= (1 << LJ_LIGHTUD_BITS_SEG)) lj_err_msg(L, LJ_ERR_BADLU);
> +    lj_mem_reallocvec(L, segmap, segnum, segnum ? 2*segnum : 2u, uint32_t);
> +    setmref(g->gc.lightudseg, segmap);
> +  }
> +  g->gc.lightudnum = segnum;
> +  segmap[segnum] = up;
> +  return (void *)(((uint64_t)segnum << LJ_LIGHTUD_BITS_LO) | lightudlo(u));
> +}
> +#endif
> +
> 
> Can’t help to laugh at Mike’s /* Fast path */, brilliant isn’t it?
> Perhaps addition of a new segment is not so often - and is counted to 256 -
> so we can easily sort the array each time to make it log(n) rather (n) for
> each lua_pushlightuserdata()?

Mike's style... Also, I suggest to avoid sorting optimization for now by
two reasons:

1) We have no goal to beat everyone at ARM __yet__. Just make it
breathing.
2) We have no performance tests to measure such changes (I hope
__yet__, too).

> 
> > <snipped>
> > 
> > See the iterative patch below.
> > 
> > ===================================================================
> > diff --git a/test/tarantool-tests/lj-49-bad-lightuserdata.test.lua b/test/tarantool-tests/lj-49-bad-lightuserdata.test.lua
> 
> This one tests the LUD push/pop to/fro stack. How about those 
> 
> > all internal usage of lightuserdata (for hooks,
> > profilers, built-in package, IR and so on) is changed to special values
> > on Lua Stack.
> 
> Can you add at least _some_ test to verify memprof is fine?

Memprof avoids such extroversions. Do you mean the test for `jit.p`?

The new commit message is the following:

===================================================================
Add support for full-range 64 bit lightuserdata.

(cherry picked from commit e9af1abec542e6f9851ff2368e7f196b6382a44c)

LuaJIT uses special NaN-tagging technique to store internal type on
the Lua stack. In case of LJ_GC64 the first 13 bits are set in special
NaN type (0xfff8...). The next 4 bits are used for an internal LuaJIT
type of object on stack. The next 47 bits are used for storing this
object's content. For userdata, it is its address. For arm64 a pointer
can have more than 47 significant bits [1]. In this case the error BADLU
error is raised.

For the support of full 64-bit range lightuserdata pointers two new
fields in GCState are added:

`lightudseg` - vector of segments of lightuserdata. Each element keeps
32-bit value. 25 MSB equal to MSB of lightuserdata 64-bit address, the
rest are filled with zeros. The lentgh of the vector is power of 2.

`lightudnum` - the length - 1 of aforementioned vector (up to 255).

When lightuserdata is pushed on the stack, if its segment is not stored
in vector new value is appended to of this vector. The maximum amount of
segments is 256. BADLU error is raised in case when user tried to add
userdata with the new 257-th segment, so the whole VA-space isn't
covered by this patch.

Also, in this patch all internal usage of lightuserdata (for hooks,
profilers, built-in package, IR and so on) is changed to special values
on Lua Stack.

Also, conversion of TValue to FFI C type with store is no longer
compiled for lightuserdata.

[1]: https://www.kernel.org/doc/html/latest/arm64/memory.html

Sergey Kaplun:
* added the description and the test for the problem

Resolves tarantool/tarantool#2712
Needed for tarantool/tarantool#6154
===================================================================

Branch is force-pushed.

> 

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 1/2] Add support for full-range 64 bit lightuserdata.
  2021-08-01 10:36       ` Igor Munkin via Tarantool-patches
@ 2021-08-02 14:56         ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 20+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-08-02 14:56 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches

Igor,

On 01.08.21, Igor Munkin wrote:
> Sergey,
> 
> Thanks for the fixes and comments! LGTM then, except the three typos
> I've missed in the previous review.
> 
> On 28.07.21, Sergey Kaplun wrote:
> > Igor,
> > Thanks for the review!
> > 
> 
> <snipped>
> 
> > 
> > The new commit message is the following:
> > 
> > ===================================================================
> > Add support for full-range 64 bit lightuserdata.
> > 
> > (cherry picked from commit e9af1abec542e6f9851ff2368e7f196b6382a44c)
> > 
> > LuaJIT uses special NaN-tagging technique to store internal type on
> > the Lua stack. In case LJ_GC64 first 13 bits are set in special NaN
> > type (0xfff8...). FPU generates the only one type. The next 4 bits are
> > used for an internal LuaJIT type of object on stack. The next 47 bits
> > are used for storing this object's content. For userdata, it is its
> > address. In case arm64 the pointer can have more than 47 significant
> > bits [1]. In this case the error BADLU error is raised.
> > 
> > For the support of full 64-bit range lightuserdata pointers two new
> > fields in GCState are added:
> > 
> > `lightudseg` - vector of segments of lightuserdata. Each element keeps
> > 32-bit value. 25 MSB equal to MSB of lightuserdata address, the rest are
> > filled with zeros. The lentgh of the vector is power of 2.
> 
> Typo: s/lentgh/length/.

Fixed.

> 
> > 
> > `lightudnum` - the length - 1 of aforementioned vector (up to 255).
> > 
> > When lightuserdata is pushed on the stack, if its segment is not stored
> > in vector new value is appended on top of this vector. The maximum
> > amount of segments is 256. BADLU error is raised in case when user tried
> 
> Typo: s/tried/tries/.

Fixed.

> 
> > to add userdata with the new 257-th segment, so the whole VA-space
> > isn't covered by this patch.
> > 
> > Also, in this patch all internal usage of lightuserdata (for hooks,
> > profilers, built-in package, IR and so on) is changed to special values
> > on Lua Stack.
> > 
> > Also, conversion of TValue to FFI C type with store is no longer
> > compiled for lightuserdata.
> > 
> > [1]: https://www.kernel.org/doc/html/latest/arm64/memory.html
> > 
> > Sergey Kaplun:
> > * added the description and the test for the problem
> > 
> > Resolves tarantool/tarantool#2712
> > Needed for tarantool/tarantool#6154
> 
> Minor: Why #5629 is not mentioned?

Added.

> 
> > ===================================================================
> > 
> 
> <snipped>
> 
> > 
> > See the iterative patch below.
> > 
> > ===================================================================
> 
> <snipped>
> 
> > diff --git a/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c b/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
> > index 801c7fe1..c958841f 100644
> > --- a/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
> > +++ b/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
> 
> <snipped>
> 
> > @@ -26,8 +34,8 @@ static int longptr(lua_State *L)
> >  	 * equals -1.
> 
> Typo: s/hint equals -1/hint that equals to -1/.

Fixed:
===================================================================
diff --git a/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c b/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
index c958841f..1b909fc6 100644
--- a/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
+++ b/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
@@ -30,8 +30,8 @@ static int mmaped_ptr(lua_State *L)
         /*
          * If start mapping address is not NULL, then the kernel
          * takes it as a hint about where to place the mapping, so
-         * we try to get the highest memory address by hint
-         * equals -1.
+         * we try to get the highest memory address by hint, that
+         * equals to -1.
          */
         const size_t pagesize = getpagesize();
         void *mmaped = mmap(START, pagesize, PROT_NONE, MAP_PRIVATE | MAP_ANON,
===================================================================

> 
> >  	 */
> >  	const size_t pagesize = getpagesize();
> > -	void *mmaped = mmap(START, pagesize, PROT_NONE,
> > -				  MAP_PRIVATE | MAP_ANON, -1, 0);
> > +	void *mmaped = mmap(START, pagesize, PROT_NONE, MAP_PRIVATE | MAP_ANON,
> > +			    -1, 0);
> >  	if (mmaped != MAP_FAILED) {
> >  		lua_pushlightuserdata(L, mmaped);
> >  		assert(mmaped == lua_topointer(L, -1));
> 
> <snipped>
> 
> > ===================================================================
> > 
> 
> <snipped>
> 
> > 
> > -- 
> > Best regards,
> > Sergey Kaplun
> 
> -- 
> Best regards,
> IM

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/2] Linux/ARM64: Make mremap() non-moving due to VA space woes.
  2021-08-01 16:59         ` Sergey Ostanevich via Tarantool-patches
@ 2021-08-02 15:08           ` Sergey Kaplun via Tarantool-patches
  2021-08-02 15:55             ` Sergey Ostanevich via Tarantool-patches
  0 siblings, 1 reply; 20+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-08-02 15:08 UTC (permalink / raw)
  To: Sergey Ostanevich; +Cc: tarantool-patches

Hi, Sergos!

Thanks for the review!

On 01.08.21, Sergey Ostanevich wrote:
> Hi! Thanks for the patch!
> 
> Now direct_resize() will fail if it doesn't fit into the current place
> with a new size and a direct_alloc() supposedly will be called. This one 
> doesn't help with 47-bit address AFAIU since it has no extra option -
> I doubt it exist at all - to ask kernel to fit.
>  
> So, how it helps?

For arm64 LJ_64 mode is enabled.
It means LJ_ALLOC_MMAP_PROBE is defined and for each huge allocation
`mmap_probe()` is called. This function tries several allocations
unless the limit (30) is reached, or any suitable address is found.

> 
> 
> Regards,
> Sergos
> 
> 
> > On 1 Aug 2021, at 13:36, Igor Munkin <imun@tarantool.org> wrote:
> > 
> > Sergey,
> > 
> > Thanks for the fixes! LGTM, except the single typo.
> > 
> > On 28.07.21, Sergey Kaplun wrote:
> > 
> > <snipped>
> > 
> >> 
> >> The new commit message is the following:
> >> 
> >> ===================================================================
> >> Linux/ARM64: Make mremap() non-moving due to VA space woes.
> >> 
> >> This reduces overall performance on ARM64, but we have no choice.
> >> Linux kernel default userspace VA is 48 bit, but we'd need 47 bit.
> >> mremap() ignores address hints due to a kernel API issue. The mapping
> >> may move to an undesired address which will cause an assert or crash.
> >> 
> >> Reported by Raymond W. Ko.
> >> 
> >> (cherry picked from commit 67dbec82f4f05a416a78a560a726553beaa7a223)
> >> 
> >> 47-bit VA space is required by LuaJIT for keeping a GC object pointer in
> >> TValue. In case of huge blobs that are mapped directly, `mremap()` may
> >> move the chunk out of 47-bit range of VA space on ARM64. `mremap()`
> >> accepts the fifth argument (new address hint) only with MREMAP_FIXED
> >> flag. In that case it unmaps any other mapping to specified address.
> >> 
> >> To avoid this behaviour this patch restricts `mremap()` to relocate
> >> the mapping to a new virtual address by set CALL_MREMAP_NOMOVE flag
> > 
> > Typo: s/set/setting/.
> > 
> >> instead of CALL_MREMAP_MAYMOVE for arm64 architecture.
> >> 
> >> Sergey Kaplun:
> >> * added the description and the test for the problem
> >> 
> >> Needed for tarantool/tarantool#6154
> > 
> > Minor: Why #5629 is not mentioned?
> > 
> >> ===================================================================
> >> 
> > 
> > <snipped>
> > 
> >> 
> >> -- 
> >> Best regards,
> >> Sergey Kaplun
> > 
> > -- 
> > Best regards,
> > IM
> 

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/2] Linux/ARM64: Make mremap() non-moving due to VA space woes.
  2021-08-01 10:36       ` Igor Munkin via Tarantool-patches
  2021-08-01 16:59         ` Sergey Ostanevich via Tarantool-patches
@ 2021-08-02 15:11         ` Sergey Kaplun via Tarantool-patches
  1 sibling, 0 replies; 20+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-08-02 15:11 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches

Igor,

On 01.08.21, Igor Munkin wrote:
> Sergey,
> 
> Thanks for the fixes! LGTM, except the single typo.
> 
> On 28.07.21, Sergey Kaplun wrote:
> 
> <snipped>
> 
> > 
> > The new commit message is the following:
> > 
> > ===================================================================
> > Linux/ARM64: Make mremap() non-moving due to VA space woes.
> > 
> > This reduces overall performance on ARM64, but we have no choice.
> > Linux kernel default userspace VA is 48 bit, but we'd need 47 bit.
> > mremap() ignores address hints due to a kernel API issue. The mapping
> > may move to an undesired address which will cause an assert or crash.
> > 
> > Reported by Raymond W. Ko.
> > 
> > (cherry picked from commit 67dbec82f4f05a416a78a560a726553beaa7a223)
> > 
> > 47-bit VA space is required by LuaJIT for keeping a GC object pointer in
> > TValue. In case of huge blobs that are mapped directly, `mremap()` may
> > move the chunk out of 47-bit range of VA space on ARM64. `mremap()`
> > accepts the fifth argument (new address hint) only with MREMAP_FIXED
> > flag. In that case it unmaps any other mapping to specified address.
> > 
> > To avoid this behaviour this patch restricts `mremap()` to relocate
> > the mapping to a new virtual address by set CALL_MREMAP_NOMOVE flag
> 
> Typo: s/set/setting/.

Fixed.

> 
> > instead of CALL_MREMAP_MAYMOVE for arm64 architecture.
> > 
> > Sergey Kaplun:
> > * added the description and the test for the problem
> > 
> > Needed for tarantool/tarantool#6154
> 
> Minor: Why #5629 is not mentioned?

Added.

Branch is force-pushed.

> 
> > ===================================================================
> > 
> 
> <snipped>
> 
> > 
> > -- 
> > Best regards,
> > Sergey Kaplun
> 
> -- 
> Best regards,
> IM

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 1/2] Add support for full-range 64 bit lightuserdata.
  2021-08-02 14:51         ` Sergey Kaplun via Tarantool-patches
@ 2021-08-02 15:42           ` Igor Munkin via Tarantool-patches
  2021-08-10 16:46           ` Sergey Ostanevich via Tarantool-patches
  1 sibling, 0 replies; 20+ messages in thread
From: Igor Munkin via Tarantool-patches @ 2021-08-02 15:42 UTC (permalink / raw)
  To: Sergey Ostanevich; +Cc: tarantool-patches

Sergos,

On 02.08.21, Sergey Kaplun wrote:
> Hi, Sergos!
> 

<snipped>

> > 
> > Can’t help to laugh at Mike’s /* Fast path */, brilliant isn’t it?
> > Perhaps addition of a new segment is not so often - and is counted to 256 -
> > so we can easily sort the array each time to make it log(n) rather (n) for
> > each lua_pushlightuserdata()?
> 
> Mike's style... Also, I suggest to avoid sorting optimization for now by
> two reasons:
> 
> 1) We have no goal to beat everyone at ARM __yet__. Just make it
> breathing.
> 2) We have no performance tests to measure such changes (I hope
> __yet__, too).

And what is more important, this is impossible, AFAICS. The segment *id*
is returned to "userspace", so you can't just reorder the segments
without LUD values to be updated. The issue becomes impossible since
particular LUD value can't be obtained within Lua internal world. Hence
this is the *fastest* path for the current solution. Anyway, nobody can
stop us to improve it ;)

> 

<snipped>

> 
> -- 
> Best regards,
> Sergey Kaplun

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/2] Linux/ARM64: Make mremap() non-moving due to VA space woes.
  2021-08-02 15:08           ` Sergey Kaplun via Tarantool-patches
@ 2021-08-02 15:55             ` Sergey Ostanevich via Tarantool-patches
  0 siblings, 0 replies; 20+ messages in thread
From: Sergey Ostanevich via Tarantool-patches @ 2021-08-02 15:55 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi!

Thanks for explanation, LGTM.


> On 2 Aug 2021, at 18:08, Sergey Kaplun <skaplun@tarantool.org> wrote:
> 
> Hi, Sergos!
> 
> Thanks for the review!
> 
> On 01.08.21, Sergey Ostanevich wrote:
>> Hi! Thanks for the patch!
>> 
>> Now direct_resize() will fail if it doesn't fit into the current place
>> with a new size and a direct_alloc() supposedly will be called. This one 
>> doesn't help with 47-bit address AFAIU since it has no extra option -
>> I doubt it exist at all - to ask kernel to fit.
>> 
>> So, how it helps?
> 
> For arm64 LJ_64 mode is enabled.
> It means LJ_ALLOC_MMAP_PROBE is defined and for each huge allocation
> `mmap_probe()` is called. This function tries several allocations
> unless the limit (30) is reached, or any suitable address is found.
> 
>> 
>> 
>> Regards,
>> Sergos
>> 
>> 
>>> On 1 Aug 2021, at 13:36, Igor Munkin <imun@tarantool.org> wrote:
>>> 
>>> Sergey,
>>> 
>>> Thanks for the fixes! LGTM, except the single typo.
>>> 
>>> On 28.07.21, Sergey Kaplun wrote:
>>> 
>>> <snipped>
>>> 
>>>> 
>>>> The new commit message is the following:
>>>> 
>>>> ===================================================================
>>>> Linux/ARM64: Make mremap() non-moving due to VA space woes.
>>>> 
>>>> This reduces overall performance on ARM64, but we have no choice.
>>>> Linux kernel default userspace VA is 48 bit, but we'd need 47 bit.
>>>> mremap() ignores address hints due to a kernel API issue. The mapping
>>>> may move to an undesired address which will cause an assert or crash.
>>>> 
>>>> Reported by Raymond W. Ko.
>>>> 
>>>> (cherry picked from commit 67dbec82f4f05a416a78a560a726553beaa7a223)
>>>> 
>>>> 47-bit VA space is required by LuaJIT for keeping a GC object pointer in
>>>> TValue. In case of huge blobs that are mapped directly, `mremap()` may
>>>> move the chunk out of 47-bit range of VA space on ARM64. `mremap()`
>>>> accepts the fifth argument (new address hint) only with MREMAP_FIXED
>>>> flag. In that case it unmaps any other mapping to specified address.
>>>> 
>>>> To avoid this behaviour this patch restricts `mremap()` to relocate
>>>> the mapping to a new virtual address by set CALL_MREMAP_NOMOVE flag
>>> 
>>> Typo: s/set/setting/.
>>> 
>>>> instead of CALL_MREMAP_MAYMOVE for arm64 architecture.
>>>> 
>>>> Sergey Kaplun:
>>>> * added the description and the test for the problem
>>>> 
>>>> Needed for tarantool/tarantool#6154
>>> 
>>> Minor: Why #5629 is not mentioned?
>>> 
>>>> ===================================================================
>>>> 
>>> 
>>> <snipped>
>>> 
>>>> 
>>>> -- 
>>>> Best regards,
>>>> Sergey Kaplun
>>> 
>>> -- 
>>> Best regards,
>>> IM
>> 
> 
> -- 
> Best regards,
> Sergey Kaplun


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 1/2] Add support for full-range 64 bit lightuserdata.
  2021-08-02 14:51         ` Sergey Kaplun via Tarantool-patches
  2021-08-02 15:42           ` Igor Munkin via Tarantool-patches
@ 2021-08-10 16:46           ` Sergey Ostanevich via Tarantool-patches
  2021-08-11  5:54             ` Vitaliia Ioffe via Tarantool-patches
  1 sibling, 1 reply; 20+ messages in thread
From: Sergey Ostanevich via Tarantool-patches @ 2021-08-10 16:46 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 2566 bytes --]

>>> all internal usage of lightuserdata (for hooks,
>>> profilers, built-in package, IR and so on) is changed to special values
>>> on Lua Stack.
>> 
>> Can you add at least _some_ test to verify memprof is fine?
> 
> Memprof avoids such extroversions. Do you mean the test for `jit.p`?
> 

I can see memeprof is out of the business, sure I meant any our new
functionality - such as stats on gc - that can be hit by this in some
way.
 
> The new commit message is the following:
> 
> ===================================================================
> Add support for full-range 64 bit lightuserdata.
> 
> (cherry picked from commit e9af1abec542e6f9851ff2368e7f196b6382a44c)
> 
> LuaJIT uses special NaN-tagging technique to store internal type on
> the Lua stack. In case of LJ_GC64 the first 13 bits are set in special
> NaN type (0xfff8...). The next 4 bits are used for an internal LuaJIT
> type of object on stack. The next 47 bits are used for storing this
> object's content. For userdata, it is its address. For arm64 a pointer
> can have more than 47 significant bits [1]. In this case the error BADLU
> error is raised.
> 
> For the support of full 64-bit range lightuserdata pointers two new
> fields in GCState are added:
> 
> `lightudseg` - vector of segments of lightuserdata. Each element keeps
> 32-bit value. 25 MSB equal to MSB of lightuserdata 64-bit address, the
> rest are filled with zeros. The lentgh of the vector is power of 2.
> 
> `lightudnum` - the length - 1 of aforementioned vector (up to 255).
> 
> When lightuserdata is pushed on the stack, if its segment is not stored
> in vector new value is appended to of this vector. The maximum amount of
> segments is 256. BADLU error is raised in case when user tried to add
> userdata with the new 257-th segment, so the whole VA-space isn't
> covered by this patch.
> 
> Also, in this patch all internal usage of lightuserdata (for hooks,
> profilers, built-in package, IR and so on) is changed to special values
> on Lua Stack.
> 
> Also, conversion of TValue to FFI C type with store is no longer
> compiled for lightuserdata.
> 
> [1]: https://www.kernel.org/doc/html/latest/arm64/memory.html <https://www.kernel.org/doc/html/latest/arm64/memory.html>
> 
> Sergey Kaplun:
> * added the description and the test for the problem
> 
> Resolves tarantool/tarantool#2712
> Needed for tarantool/tarantool#6154
> ===================================================================
> 
> Branch is force-pushed.

LGTM.

Sergos


[-- Attachment #2: Type: text/html, Size: 34347 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Tarantool-patches]  [PATCH luajit 1/2] Add support for full-range 64 bit lightuserdata.
  2021-08-10 16:46           ` Sergey Ostanevich via Tarantool-patches
@ 2021-08-11  5:54             ` Vitaliia Ioffe via Tarantool-patches
  0 siblings, 0 replies; 20+ messages in thread
From: Vitaliia Ioffe via Tarantool-patches @ 2021-08-11  5:54 UTC (permalink / raw)
  To: Sergey Ostanevich; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 2584 bytes --]


Hi team, 
 
QA LGTM
 
 
--
Vitaliia Ioffe
 
  
>Вторник, 10 августа 2021, 19:46 +03:00 от Sergey Ostanevich via Tarantool-patches <tarantool-patches@dev.tarantool.org>:
> 
>all internal usage of lightuserdata (for hooks,
>profilers, built-in package, IR and so on) is changed to special values
>on Lua Stack.

Can you add at least _some_ test to verify memprof is fine?

Memprof avoids such extroversions. Do you mean the test for `jit.p`?
 
  I can see memeprof is out of the business, sure I meant any our new
functionality - such as stats on gc - that can be hit by this in some
way.
 
>The new commit message is the following:
>
>===================================================================
>Add support for full-range 64 bit lightuserdata.
>
>(cherry picked from commit e9af1abec542e6f9851ff2368e7f196b6382a44c)
>
>LuaJIT uses special NaN-tagging technique to store internal type on
>the Lua stack. In case of LJ_GC64 the first 13 bits are set in special
>NaN type (0xfff8...). The next 4 bits are used for an internal LuaJIT
>type of object on stack. The next 47 bits are used for storing this
>object's content. For userdata, it is its address. For arm64 a pointer
>can have more than 47 significant bits [1]. In this case the error BADLU
>error is raised.
>
>For the support of full 64-bit range lightuserdata pointers two new
>fields in GCState are added:
>
>`lightudseg` - vector of segments of lightuserdata. Each element keeps
>32-bit value. 25 MSB equal to MSB of lightuserdata 64-bit address, the
>rest are filled with zeros. The lentgh of the vector is power of 2.
>
>`lightudnum` - the length - 1 of aforementioned vector (up to 255).
>
>When lightuserdata is pushed on the stack, if its segment is not stored
>in vector new value is appended to of this vector. The maximum amount of
>segments is 256. BADLU error is raised in case when user tried to add
>userdata with the new 257-th segment, so the whole VA-space isn't
>covered by this patch.
>
>Also, in this patch all internal usage of lightuserdata (for hooks,
>profilers, built-in package, IR and so on) is changed to special values
>on Lua Stack.
>
>Also, conversion of TValue to FFI C type with store is no longer
>compiled for lightuserdata.
>
>[1]:   https://www.kernel.org/doc/html/latest/arm64/memory.html
>
>Sergey Kaplun:
>* added the description and the test for the problem
>
>Resolves tarantool/tarantool#2712
>Needed for tarantool/tarantool#6154
>===================================================================
>
>Branch is force-pushed. 
 
LGTM.
 
Sergos
 

[-- Attachment #2: Type: text/html, Size: 31624 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 0/2] arm64: fix 48-bit addresses issues
  2021-07-06 17:40 [Tarantool-patches] [PATCH luajit 0/2] arm64: fix 48-bit addresses issues Sergey Kaplun via Tarantool-patches
  2021-07-06 17:40 ` [Tarantool-patches] [PATCH luajit 1/2] Add support for full-range 64 bit lightuserdata Sergey Kaplun via Tarantool-patches
  2021-07-06 17:40 ` [Tarantool-patches] [PATCH luajit 2/2] Linux/ARM64: Make mremap() non-moving due to VA space woes Sergey Kaplun via Tarantool-patches
@ 2021-08-11  7:21 ` Igor Munkin via Tarantool-patches
  2 siblings, 0 replies; 20+ messages in thread
From: Igor Munkin via Tarantool-patches @ 2021-08-11  7:21 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Sergey,

I've checked the patchset into tarantool branch in tarantool/luajit and
bumped a new version in master.

On 06.07.21, Sergey Kaplun wrote:
> This patchset provides changes necessary for Tarantool working in docker
> on M1 (#2712), except well-known `cur_L` issue [1].
> 
> The first patch fixes issue with bad lightuserdata error.
> The second patch fixes issue with memory remaping to 48-bit VA space.
> 
> LuaJIT branch: https://github.com/tarantool/luajit/tree/skaplun/gh-2712-bad-lightud
> Tarantool branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-2712-bad-lightud
> Issues:
> * https://github.com/tarantool/tarantool/issues/2712
> * https://github.com/tarantool/tarantool/issues/6154
> 
> [1]: https://github.com/tarantool/tarantool/issues/6189
> 
> Mike Pall (2):
>   Add support for full-range 64 bit lightuserdata.
>   Linux/ARM64: Make mremap() non-moving due to VA space woes.
> 
>  doc/status.html                               | 11 ----
>  src/jit/dump.lua                              |  4 +-
>  src/lib_debug.c                               | 12 ++--
>  src/lib_jit.c                                 | 14 ++---
>  src/lib_package.c                             |  8 +--
>  src/lib_string.c                              |  2 +-
>  src/lj_alloc.c                                |  2 +-
>  src/lj_api.c                                  | 40 +++++++++++--
>  src/lj_ccall.c                                |  2 +-
>  src/lj_cconv.c                                |  2 +-
>  src/lj_crecord.c                              |  6 +-
>  src/lj_dispatch.c                             |  2 +-
>  src/lj_ir.c                                   |  6 +-
>  src/lj_obj.c                                  |  5 +-
>  src/lj_obj.h                                  | 57 ++++++++++++-------
>  src/lj_snap.c                                 |  7 ++-
>  src/lj_state.c                                |  6 ++
>  src/lj_strfmt.c                               |  2 +-
>  test/tarantool-tests/CMakeLists.txt           |  1 +
>  .../lj-49-bad-lightuserdata.test.lua          | 10 ++++
>  .../lj-49-bad-lightuserdata/CMakeLists.txt    |  1 +
>  .../testlightuserdata.c                       | 52 +++++++++++++++++
>  .../lj-671-arm64-assert-after-mremap.test.lua | 24 ++++++++
>  23 files changed, 208 insertions(+), 68 deletions(-)
>  create mode 100644 test/tarantool-tests/lj-49-bad-lightuserdata.test.lua
>  create mode 100644 test/tarantool-tests/lj-49-bad-lightuserdata/CMakeLists.txt
>  create mode 100644 test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
>  create mode 100644 test/tarantool-tests/lj-671-arm64-assert-after-mremap.test.lua
> 
> -- 
> 2.31.0
> 

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2021-08-11  7:45 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-06 17:40 [Tarantool-patches] [PATCH luajit 0/2] arm64: fix 48-bit addresses issues Sergey Kaplun via Tarantool-patches
2021-07-06 17:40 ` [Tarantool-patches] [PATCH luajit 1/2] Add support for full-range 64 bit lightuserdata Sergey Kaplun via Tarantool-patches
2021-07-27 13:59   ` Igor Munkin via Tarantool-patches
2021-07-28 12:29     ` Sergey Kaplun via Tarantool-patches
2021-08-01 10:36       ` Igor Munkin via Tarantool-patches
2021-08-02 14:56         ` Sergey Kaplun via Tarantool-patches
2021-08-01 16:25       ` Sergey Ostanevich via Tarantool-patches
2021-08-02 14:51         ` Sergey Kaplun via Tarantool-patches
2021-08-02 15:42           ` Igor Munkin via Tarantool-patches
2021-08-10 16:46           ` Sergey Ostanevich via Tarantool-patches
2021-08-11  5:54             ` Vitaliia Ioffe via Tarantool-patches
2021-07-06 17:40 ` [Tarantool-patches] [PATCH luajit 2/2] Linux/ARM64: Make mremap() non-moving due to VA space woes Sergey Kaplun via Tarantool-patches
2021-07-27 15:23   ` Igor Munkin via Tarantool-patches
2021-07-28 12:29     ` Sergey Kaplun via Tarantool-patches
2021-08-01 10:36       ` Igor Munkin via Tarantool-patches
2021-08-01 16:59         ` Sergey Ostanevich via Tarantool-patches
2021-08-02 15:08           ` Sergey Kaplun via Tarantool-patches
2021-08-02 15:55             ` Sergey Ostanevich via Tarantool-patches
2021-08-02 15:11         ` Sergey Kaplun via Tarantool-patches
2021-08-11  7:21 ` [Tarantool-patches] [PATCH luajit 0/2] arm64: fix 48-bit addresses issues Igor Munkin via Tarantool-patches

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox