Hi! Thanks for the patch!

Some minor message fixes, one great gag from Mike’s code and a
test request.

Regards,
Sergos


The new commit message is the following:

===================================================================
Add support for full-range 64 bit lightuserdata.

(cherry picked from commit e9af1abec542e6f9851ff2368e7f196b6382a44c)

LuaJIT uses special NaN-tagging technique to store internal type on
the Lua stack. In case LJ_GC64 first 13 bits are set in special NaN
               ^^^^^^^         ^
               In case of      the
type (0xfff8...). FPU generates the only one type. The next 4 bits are
                                ^^^^^^^^^^^^
                                Which one, and how is it relevant?

used for an internal LuaJIT type of object on stack. The next 47 bits
are used for storing this object's content. For userdata, it is its
address. In case arm64 the pointer can have more than 47 significant
         ^^^^^^^
         For
bits [1]. In this case the error BADLU error is raised.

For the support of full 64-bit range lightuserdata pointers two new
fields in GCState are added:

`lightudseg` - vector of segments of lightuserdata. Each element keeps
32-bit value. 25 MSB equal to MSB of lightuserdata address, the rest are
                                                   ^
                                                   64bit
filled with zeros. The length of the vector is power of 2.

`lightudnum` - the length - 1 of aforementioned vector (up to 255).
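To check that I read the 25/39-bit split right, here is a minimal standalone sketch. It assumes LJ_LIGHTUD_BITS_SEG is 8 and LJ_LIGHTUD_BITS_LO is 47 - 8 = 39 (which is what leaves 25 MSB out of 64). The helper names ud_up, ud_lo, ud_pack and ud_unpack are hypothetical; they only mimic what lightudup() and lightudlo() appear to do in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, not copied from the patch. */
#define BITS_SEG 8
#define BITS_LO  (47 - BITS_SEG)

/* Upper 25 bits of the address, left-aligned in 32 bits; low 7 bits are zero. */
static uint32_t ud_up(uint64_t u)
{
  return (uint32_t)((u >> BITS_LO) << (BITS_LO - 32));
}

/* Lower 39 bits of the address. */
static uint64_t ud_lo(uint64_t u)
{
  return u & (((uint64_t)1 << BITS_LO) - 1);
}

/* Value kept on the stack: segment index plus low bits, fits into 47 bits. */
static uint64_t ud_pack(uint32_t seg, uint64_t u)
{
  return ((uint64_t)seg << BITS_LO) | ud_lo(u);
}

/* Restore the full 64-bit address from the packed value and the segment map. */
static uint64_t ud_unpack(const uint32_t *segmap, uint64_t packed)
{
  uint32_t seg = (uint32_t)(packed >> BITS_LO) & ((1u << BITS_SEG) - 1);
  assert(segmap != NULL);
  return ((uint64_t)segmap[seg] << 32) | ud_lo(packed);
}
```

If this matches the intent, the commit message could say explicitly that the value on the stack is the segment index plus the 39 low bits of the address.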

When lightuserdata is pushed on the stack, if its segment is not stored
in vector new value is appended on top of this vector. The maximum
                                ^^^^^^^^^ to

At first I wanted to ask you to write ’not found’ instead of ’not stored’.
Then I started thinking about ‘on top’ for a vector and got a strange
feeling...

Now tell me: every time you push a LUD pointer onto the stack, do you have
to walk over all present segments in the plain loop marked '>>>' below?

--- a/src/lj_api.c
+++ b/src/lj_api.c
+#if LJ_64
+static void *lightud_intern(lua_State *L, void *p)
+{
+  global_State *g = G(L);
+  uint64_t u = (uint64_t)p;
+  uint32_t up = lightudup(u);
+  uint32_t *segmap = mref(g->gc.lightudseg, uint32_t);
+  MSize segnum = g->gc.lightudnum;
+  if (segmap) {
+    MSize seg;
>>> +    for (seg = 0; seg <= segnum; seg++)
>>> +      if (segmap[seg] == up)  /* Fast path. */
>>> +        return (void *)(((uint64_t)seg << LJ_LIGHTUD_BITS_LO) | lightudlo(u));
+    segnum++;
+  }
+  if (!((segnum-1) & segnum) && segnum != 1) {
+    if (segnum >= (1 << LJ_LIGHTUD_BITS_SEG)) lj_err_msg(L, LJ_ERR_BADLU);
+    lj_mem_reallocvec(L, segmap, segnum, segnum ? 2*segnum : 2u, uint32_t);
+    setmref(g->gc.lightudseg, segmap);
+  }
+  g->gc.lightudnum = segnum;
+  segmap[segnum] = up;
+  return (void *)(((uint64_t)segnum << LJ_LIGHTUD_BITS_LO) | lightudlo(u));
+}
+#endif
+

Can’t help laughing at Mike’s /* Fast path. */, brilliant, isn’t it?
Adding a new segment is probably rare, and there are at most 256 of them,
so we could keep the array sorted on each insertion to make the lookup
O(log n) rather than O(n) for each lua_pushlightuserdata()?
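A rough sketch of what I mean, not a patch. As far as I can see, the segment index is baked into the values already on the stack, so the segment array itself cannot be reordered; the sketch keeps a separate sorted index next to it instead. SegTable, seg_lower_bound, seg_intern and SEG_MAX are hypothetical names, and the cap of 256 assumes 1 << LJ_LIGHTUD_BITS_SEG with 8 segment bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SEG_MAX 256  /* Assumed cap: 1 << LJ_LIGHTUD_BITS_SEG. */

typedef struct SegTable {
  uint32_t segmap[SEG_MAX];  /* Insertion order; these indexes must stay stable. */
  uint8_t order[SEG_MAX];    /* Indexes into segmap, sorted by segment value. */
  uint32_t n;                /* Number of segments in use. */
} SegTable;

/* Binary search: first position in order[] whose segment is >= up. */
static uint32_t seg_lower_bound(const SegTable *t, uint32_t up)
{
  uint32_t lo = 0, hi = t->n;
  while (lo < hi) {
    uint32_t mid = lo + (hi - lo) / 2;
    if (t->segmap[t->order[mid]] < up) lo = mid + 1; else hi = mid;
  }
  return lo;
}

/* Return the stable index of the segment, adding it if missing; -1 if full. */
static int seg_intern(SegTable *t, uint32_t up)
{
  uint32_t pos;
  assert(t->n <= SEG_MAX);
  pos = seg_lower_bound(t, up);
  if (pos < t->n && t->segmap[t->order[pos]] == up)
    return t->order[pos];  /* Found in O(log n). */
  if (t->n >= SEG_MAX) return -1;  /* The real code would raise BADLU here. */
  /* Shift the tail of the sorted index to make room for the new entry. */
  memmove(&t->order[pos + 1], &t->order[pos],
          (t->n - pos) * sizeof(t->order[0]));
  t->order[pos] = (uint8_t)t->n;
  t->segmap[t->n] = up;
  return (int)t->n++;
}
```

The extra cost is 256 bytes for the index and one memmove per new segment, which should be negligible if new segments are as rare as I expect.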

<snipped>

See the iterative patch below.

===================================================================
diff --git a/test/tarantool-tests/lj-49-bad-lightuserdata.test.lua b/test/tarantool-tests/lj-49-bad-lightuserdata.test.lua

This one tests LUD push/pop to and from the stack. How about the following:

all internal usage of lightuserdata (for hooks,
profilers, built-in package, IR and so on) is changed to special values
on Lua Stack.

Can you add at least _some_ test to verify memprof is fine?