[Tarantool-patches] [PATCH luajit 1/2][v2] Fix embedded bytecode loader.

Sergey Kaplun skaplun at tarantool.org
Tue Sep 5 17:10:14 MSK 2023


Hi, Sergey!
Thanks for the patch!
Please, consider my comments below.

On 31.08.23, Sergey Bronnikov wrote:
> From: Sergey Bronnikov <sergeyb at tarantool.org>
> 
> (cherry-picked from commit 820339960123dc78a7ce03edf53fcf4fdae0e55d)
> 
> The original problem is specific to x32 and is as follows: when a chunk
> with a bytecode library is loaded into memory, and the address is higher
> than 0x80000100, the `LexState->pe`, that contains an address of the end
> of the bytecode chunk in the memory, will wrap around and become smaller
> than the address in `LexState->p`, that contains an address of the
> beginning of bytecode chunk in the memory. In `bcread_fill()` called by
> `bcread_want()`, `memcpy()` is called with a very large size and causes
> bus error on x86 and segmentation fault on ARM Android.

Typo: s/bus error/the bus error/
Typo: s/segmentation fault/the segmentation fault/

> 
> The problem cannot be reproduced on platforms supported by Tarantool
> (ARM64, x86_64), so test doesn't reproduce a problem without a patch and
> tests the patch partially.
> 
> Sergey Bronnikov:
> * added the description and the test
> ---
>  src/lib_package.c                             |  4 +-
>  src/lj_bcread.c                               | 10 +-
>  src/lj_lex.c                                  |  6 ++
>  src/lj_lex.h                                  |  1 +
>  .../lj-549-bytecode-loader.test.lua           | 96 +++++++++++++++++++
>  5 files changed, 110 insertions(+), 7 deletions(-)
>  create mode 100644 test/tarantool-tests/lj-549-bytecode-loader.test.lua
> 

<snipped>

> diff --git a/test/tarantool-tests/lj-549-bytecode-loader.test.lua b/test/tarantool-tests/lj-549-bytecode-loader.test.lua
> new file mode 100644
> index 00000000..889be80a
> --- /dev/null
> +++ b/test/tarantool-tests/lj-549-bytecode-loader.test.lua
> @@ -0,0 +1,96 @@
> +local tap = require('tap')
> +local ffi = require('ffi')
> +local utils = require('utils')
> +local test = tap.test('lj-549-bytecode-loader'):skipcond({
> +    -- ['Test requires GC64 mode enabled'] = not require('ffi').abi('gc64'),
> +})

Minor: It's better to require ffi and utils after test initialization
via `tap.test()`, see other tests for example.
Also, I suppose that we don't need `utils` itself, but
`utils.exec.makecmd`.

> +
> +test:plan(1)
> +
> +-- Test creates a shared library with LuaJIT bytecode,
> +-- loads shared library as a Lua module and checks,
> +-- that no crashes eliminated.
> +--
> +-- $ make HOST_CC='gcc -m32' TARGET_CFLAGS='-m32' \
> +--                           TARGET_LDFLAGS='-m32' \
> +--                           TARGET_SHLDFLAGS='-m32' \
> +--                           -f Makefile.original
> +-- $ echo 'print("test")' > a.lua
> +-- $ LUA_PATH="src/?.lua;;" luajit -b a.lua a.c
> +-- $ gcc -m32 -fPIC -shared a.c -o a.so
> +-- $ luajit -e "require('a')"
> +-- Program received signal SIGBUS, Bus error
> +
> +local function file_exists(fname)
> +   return io.open(fname, 'r') or true and false

OK, this is a little bit confusing:
If file doesn't exists we go to `or true` and after check `and false`
which is always false. Tricky, but works.

Also, here we don't close file handler.
I suggest it is better to rewrite this as the following:
| local fh = io.open(name, 'r')
| return fh and io.close(fh)

It is simplier to read, and fixes problem with leaking handler.

> +end
> +
> +local function get_file_name(file)
> +    return file:match("[^/]*$")

Minor: it may match the empty string for a directory occasionally:
| src/luajit -e 'print([["]]..("/tmp/"):match("[^/]*$")..[["]])'
| ""

Nit: use single quotes instead of double quotes if possible.

Nit: `[^/\\]` is better since it also covers Windows.
See <test/lua-Harness-tests/314-regex.t:167>
| local dirname = arg[0]:gsub('([^/\\]+)$', '')
Since we don't support Windows feel free to ignore.

> +end
> +
> +local stdout_msg = 'Lango team'
> +local lua_code = ('print(%q)'):format(stdout_msg)
> +local fpath = os.tmpname()
> +local path_lua = ('%s.lua'):format(fpath)
> +local path_c = ('%s.c'):format(fpath)
> +local path_so = ('%s.so'):format(fpath)

Minor: I suppose it should be renamed to `path_shared`, since on macOS
they have the ".dyld" suffix for shared libs. Hence, we need to use the
suffix in format of the shared library name too. You may take some
inspiration from here [1].

> +
> +-- Create a file with a minimal Lua code.
> +local fh = assert(io.open(path_lua, 'w'))
> +fh:write(lua_code)
> +fh:close()
> +
> +local module_name = assert(get_file_name(fpath))
> +
> +local basedir = function(path)
> +    local sep = '/'

Why do we need an additional variable here?

Nit: Indent is 4 spaces instead of 2.

> +    return path:match('(.*' .. sep .. ')') or './'

It's better to mention that the pattern matching is greedy, so we match
until the last separator.

> +end
> +
> +-- Create a C file with LuaJIT bytecode.
> +-- We cannot use utils.makecmd, because command-line generated
> +-- by `makecmd` contains `-e` that is incompatible with option `-b`.

Nit: comment line width is more than 66 symbols

> +local function create_c_file(pathlua, pathc)
> +  local lua_path = os.getenv('LUA_PATH')
> +  local lua_bin = require('utils').exec.luacmd(arg):match('%S+')
> +  local cmd_fmt = 'LUA_PATH="%s" %s -b %s %s'
> +  local cmd = (cmd_fmt):format(lua_path, lua_bin, pathlua, pathc)
> +  local ret = os.execute(cmd)
> +  assert(ret == 0, 'create a C file with bytecode')
> +end
> +
> +create_c_file(path_lua, path_c)
> +assert(file_exists(path_c))

Minor: The test flow is a little bit hard to read due to function
declarations. Maybe it is better to declare all utility functions first
and then use them one by one? This makes control flow easier to read.

> +
> +-- Compile C source code with LuaJIT bytecode to a shared library.
> +-- `-m64` is not available on ARM64, see
> +-- "3.18.1 AArch64 Options in the manual",
> +-- https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html
> +local cflags_64 = jit.arch == 'arm64' and '-march=armv8-a' or '-m64'
> +local cflags = ffi.abi('32bit') and '-m32' or cflags_64
> +local cc_cmd = ('cc %s -fPIC -shared %s -o %s'):format(cflags, path_c, path_so)
> +local ph = io.popen(cc_cmd)
> +ph:close()
> +assert(file_exists(path_so))
> +
> +-- Load shared library as a Lua module.
> +local lua_cpath = ('"/tmp/?.so;"'):format(basedir(fpath))
> +assert(file_exists(path_so))
> +local cmd = utils.exec.makecmd(arg, {
> +    script = ('-e "require([[%s]])"'):format(module_name),

Nit: Indent is 4 spaces instead of 2.

> +    env = {
> +        LUA_CPATH = lua_cpath,

Nit: Indent is 4 spaces instead of 2.

> +        -- It is required to cleanup LUA_PATH, otherwise
> +        -- LuaJIT loads Lua module, see tarantool-tests/utils/init.lua.

Nit: comment line width is more than 66 symbols

Actually I don't understand from the comment what Lua module exactly is
loaded. Maybe it's better to fix this behaviour?
Feel free to ignore.

> +        LUA_PATH = '',
> +    },
> +})
> +local res = cmd()
> +test:ok(res == stdout_msg, 'bytecode loader works')
> +
> +os.remove(path_lua)
> +os.remove(path_c)
> +os.remove(path_so)
> +
> +os.exit(test:check() and 0 or 1)
> -- 
> 2.34.1
> 

[1]: https://github.com/tarantool/tarantool/blob/dc8973c3de6311ab11df8d43520e1d40de4b9c7b/test/box/func_reload.test.lua#L5

-- 
Best regards,
Sergey Kaplun


More information about the Tarantool-patches mailing list