[Tarantool-patches] [PATCH luajit 1/2] Add support for full-range 64 bit lightuserdata.

Sergey Kaplun skaplun at tarantool.org
Wed Jul 28 15:29:32 MSK 2021


Igor,
Thanks for the review!

On 27.07.21, Igor Munkin wrote:
> Sergey,
> 
> Thanks for the patch! Please consider my comments below.

I'll skip all the part regarding comments, as far as we discuss with
you offline to take patches as is to simplify backporting of patches
from the upstream. Of course comments are desired, especially taking in
account that there are a lot of places, where patch bases on Mike's
knowledge about how it works with tons of hacks.

> 
> On 06.07.21, Sergey Kaplun wrote:
> > From: Mike Pall <mike>
> > 
> > (cherry picked from commit e9af1abec542e6f9851ff2368e7f196b6382a44c)
> > 
> > LuaJIT uses special NaN-tagging technique to store internal type on
> > the Lua stack. In case LJ_GC64 first 13 bits are set in special NaN
> > type (0xfff8...). FPU generates the only one type. The next 4 bits are
> > used for an internal LuaJIT type of object on stack. The next 47 bits
> > are used for storing this object's content. For userdata, it is its
> > address. In case arm64 the pointer can have more than 47 significant
> > bits [1]. In this case the error BADLU error is raised.
> 
> This is not right for (LJ_64 && !LJ_GC64) case, but AFAIU you are
> writing about LJ_GC64 here since this is the only option for ARM64.

Yes, the LJ_GC64 mode is mentioned in the second sentence.

> 
> > 
> > For the support of full 64-bit range lightuserdata pointers two new
> > fields in GCState are added:
> > 
> > `lightudseg` - vector of segments of lightuserdata storing 32-bit
> 
> I don't get whether vector or segments or lightuserdata store these
> 32-bit values :)

Fixes.

> 
> > values. MS 25 bits equal to MS bits of lightuserdata address, the
> 
> It's better to use either "MSB" acronym or "most significant 25 bits".

Fixed.

> 
> > rest are filled with zeros. The lentgh of the vector is power of 2.
> > 
> > `lightudnum` - the length - 1 of aforementioned vector (up to 255).
> > 
> > When lightuserdata is pushed on the stack, if its segment is not stored
> > in vector new value is pushed on top. The maximum amount of segments is
> 
> On top of the stack or <lightudseg>? The wording is ambiguous. AFAIU,
> you're writing about the vector, so the preferable verb is "append" or
> "push back". Or simply "add", since nobody cares about the location of
> the particular key.

Fixed.

> 
> > 256, BADLU error is raised in case when user tried to add userdata with
> > the new 257-th segment.
> 
> It's worth to mention that such hack with segments still doesn't cover
> the full VA space, but only those areas covered with the segments. It
> was unclear to me at first.

Fixed.

> 
> > 
> > Also, in this patch all internal usage of lightuserdata (for hooks,
> > profilers, built-in package, ir and so on) is changed to special values
> 
> Typo: s/ir/IR/ since this is an acronym.

Fixed.

> 
> > on Lua Stack.
> > 
> > Also, conversion of TValue to ffi C type with store is no longer
> 
> Ditto for FFI.

Fixed.

> 
> > compiled for lightuserdata.
> > 
> > [1]: https://www.kernel.org/doc/html/latest/arm64/memory.html
> > 
> > Sergey Kaplun:
> > * added the description and the test for the problem
> > 
> > Needed for tarantool/tarantool#6154
> > Resolves tarantool/tarantool#2712
> 
> Please move "Resolves" tag on top. BTW, it's rather "Fixes", isn't it?

It will be resolved when the LuaJIT will be bumped in the Tarantool,
won't it?

Changed lines order.

The new commit message is the following:

===================================================================
Add support for full-range 64 bit lightuserdata.

(cherry picked from commit e9af1abec542e6f9851ff2368e7f196b6382a44c)

LuaJIT uses special NaN-tagging technique to store internal type on
the Lua stack. In case LJ_GC64 first 13 bits are set in special NaN
type (0xfff8...). FPU generates the only one type. The next 4 bits are
used for an internal LuaJIT type of object on stack. The next 47 bits
are used for storing this object's content. For userdata, it is its
address. In case arm64 the pointer can have more than 47 significant
bits [1]. In this case the error BADLU error is raised.

For the support of full 64-bit range lightuserdata pointers two new
fields in GCState are added:

`lightudseg` - vector of segments of lightuserdata. Each element keeps
32-bit value. 25 MSB equal to MSB of lightuserdata address, the rest are
filled with zeros. The lentgh of the vector is power of 2.

`lightudnum` - the length - 1 of aforementioned vector (up to 255).

When lightuserdata is pushed on the stack, if its segment is not stored
in vector new value is appended on top of this vector. The maximum
amount of segments is 256. BADLU error is raised in case when user tried
to add userdata with the new 257-th segment, so the whole VA-space
isn't covered by this patch.

Also, in this patch all internal usage of lightuserdata (for hooks,
profilers, built-in package, IR and so on) is changed to special values
on Lua Stack.

Also, conversion of TValue to FFI C type with store is no longer
compiled for lightuserdata.

[1]: https://www.kernel.org/doc/html/latest/arm64/memory.html

Sergey Kaplun:
* added the description and the test for the problem

Resolves tarantool/tarantool#2712
Needed for tarantool/tarantool#6154
===================================================================

> 
> > ---
> >  doc/status.html                               | 11 ----
> >  src/jit/dump.lua                              |  4 +-
> >  src/lib_debug.c                               | 12 ++--
> >  src/lib_jit.c                                 | 14 ++---
> >  src/lib_package.c                             |  8 +--
> >  src/lib_string.c                              |  2 +-
> >  src/lj_api.c                                  | 40 +++++++++++--
> >  src/lj_ccall.c                                |  2 +-
> >  src/lj_cconv.c                                |  2 +-
> >  src/lj_crecord.c                              |  6 +-
> >  src/lj_dispatch.c                             |  2 +-
> >  src/lj_ir.c                                   |  6 +-
> >  src/lj_obj.c                                  |  5 +-
> >  src/lj_obj.h                                  | 57 ++++++++++++-------
> >  src/lj_snap.c                                 |  7 ++-
> >  src/lj_state.c                                |  6 ++
> >  src/lj_strfmt.c                               |  2 +-
> >  test/tarantool-tests/CMakeLists.txt           |  1 +
> >  .../lj-49-bad-lightuserdata.test.lua          | 10 ++++
> >  .../lj-49-bad-lightuserdata/CMakeLists.txt    |  1 +
> >  .../testlightuserdata.c                       | 52 +++++++++++++++++
> >  21 files changed, 183 insertions(+), 67 deletions(-)
> >  create mode 100644 test/tarantool-tests/lj-49-bad-lightuserdata.test.lua
> >  create mode 100644 test/tarantool-tests/lj-49-bad-lightuserdata/CMakeLists.txt
> >  create mode 100644 test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
> > 

<snipped>

> > diff --git a/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c b/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
> > new file mode 100644
> > index 00000000..801c7fe1
> > --- /dev/null
> > +++ b/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
> > @@ -0,0 +1,52 @@
> > +#include <lua.h>
> > +#include <lauxlib.h>
> > +
> > +#include <sys/mman.h>
> > +#include <unistd.h>
> > +
> > +#undef NDEBUG
> > +#include <assert.h>
> > +
> > +#define START ((void *)-1)
> > +
> > +static int longptr(lua_State *L)
> > +{
> > +	/*
> > +	 * We know that for arm64 at least 48 bits are available.
> > +	 * So emulate manually push of lightuseradata within
> > +	 * this range.
> > +	 */
> > +	void *longptr = (void *)(1llu << 48);
> > +	lua_pushlightuserdata(L, longptr);
> > +	assert(longptr == lua_topointer(L, -1));
> > +	/*
> > +	 * If start mapping address is not NULL, then the kernel
> > +	 * takes it as a hint about where to place the mapping, so
> > +	 * we try to get the highest memory address by hint
> > +	 * equals -1.
> > +	 */
> 
> BTW, I guess it's more natural to split this function into two: one for
> hand-crafted pointer, and another for one obtained by <mmap>.
> 
> > +	const size_t pagesize = getpagesize();
> 
> Please declare the variables at the beginning of the scope.

Don't get it. We don't use the LuaJIT code stile for tests, see memprof
tests for example. Anyway after spllitting the this function into two,
the variables declared at the beginning of the scope.

> 
> > +	void *mmaped = mmap(START, pagesize, PROT_NONE,
> > +				  MAP_PRIVATE | MAP_ANON, -1, 0);
> 
> Typo: Odd arguments alignment.

Fixed.

See the iterative patch below.

===================================================================
diff --git a/test/tarantool-tests/lj-49-bad-lightuserdata.test.lua b/test/tarantool-tests/lj-49-bad-lightuserdata.test.lua
index 111d6a70..94a743c7 100644
--- a/test/tarantool-tests/lj-49-bad-lightuserdata.test.lua
+++ b/test/tarantool-tests/lj-49-bad-lightuserdata.test.lua
@@ -1,10 +1,11 @@
 local tap = require('tap')
 
 local test = tap.test('lj-49-bad-lightuserdata')
-test:plan(1)
+test:plan(2)
 
 local testlightuserdata = require('testlightuserdata')
 
-test:ok(testlightuserdata.longptr())
+test:ok(testlightuserdata.crafted_ptr())
+test:ok(testlightuserdata.mmaped_ptr())
 
 os.exit(test:check() and 0 or 1)
diff --git a/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c b/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
index 801c7fe1..c958841f 100644
--- a/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
+++ b/test/tarantool-tests/lj-49-bad-lightuserdata/testlightuserdata.c
@@ -9,7 +9,7 @@
 
 #define START ((void *)-1)
 
-static int longptr(lua_State *L)
+static int crafted_ptr(lua_State *L)
 {
 	/*
 	 * We know that for arm64 at least 48 bits are available.
@@ -19,6 +19,14 @@ static int longptr(lua_State *L)
 	void *longptr = (void *)(1llu << 48);
 	lua_pushlightuserdata(L, longptr);
 	assert(longptr == lua_topointer(L, -1));
+	/* Clear our stack. */
+	lua_pop(L, 0);
+	lua_pushboolean(L, 1);
+	return 1;
+}
+
+static int mmaped_ptr(lua_State *L)
+{
 	/*
 	 * If start mapping address is not NULL, then the kernel
 	 * takes it as a hint about where to place the mapping, so
@@ -26,8 +34,8 @@ static int longptr(lua_State *L)
 	 * equals -1.
 	 */
 	const size_t pagesize = getpagesize();
-	void *mmaped = mmap(START, pagesize, PROT_NONE,
-				  MAP_PRIVATE | MAP_ANON, -1, 0);
+	void *mmaped = mmap(START, pagesize, PROT_NONE, MAP_PRIVATE | MAP_ANON,
+			    -1, 0);
 	if (mmaped != MAP_FAILED) {
 		lua_pushlightuserdata(L, mmaped);
 		assert(mmaped == lua_topointer(L, -1));
@@ -40,7 +48,8 @@ static int longptr(lua_State *L)
 }
 
 static const struct luaL_Reg testlightuserdata[] = {
-	{"longptr", longptr},
+	{"crafted_ptr", crafted_ptr},
+	{"mmaped_ptr", mmaped_ptr},
 	{NULL, NULL}
 };
 
===================================================================

> 
> > +	if (mmaped != MAP_FAILED) {
> > +		lua_pushlightuserdata(L, mmaped);
> > +		assert(mmaped == lua_topointer(L, -1));
> > +		assert(munmap(mmaped, pagesize) == 0);
> > +	}
> > +	/* Clear our stack. */
> > +	lua_pop(L, 0);
> > +	lua_pushboolean(L, 1);
> > +	return 1;
> > +}
> > +
> > +static const struct luaL_Reg testlightuserdata[] = {
> > +	{"longptr", longptr},
> > +	{NULL, NULL}
> > +};
> > +
> > +LUA_API int luaopen_testlightuserdata(lua_State *L)
> > +{
> > +	luaL_register(L, "testlightuserdata", testlightuserdata);
> > +	return 1;
> > +}
> > +
> > -- 
> > 2.31.0
> > 
> 
> -- 
> Best regards,
> IM

-- 
Best regards,
Sergey Kaplun


More information about the Tarantool-patches mailing list