[Tarantool-patches] [PATCH luajit 1/2][v2] Fix embedded bytecode loader.

Sergey Bronnikov sergeyb at tarantool.org
Thu Sep 7 18:21:50 MSK 2023


Hi, Sergey


On 9/5/23 17:10, Sergey Kaplun wrote:
> Hi, Sergey!
> Thanks for the patch!
> Please, consider my comments below.
>
> On 31.08.23, Sergey Bronnikov wrote:
>> From: Sergey Bronnikov <sergeyb at tarantool.org>
>>
>> (cherry-picked from commit 820339960123dc78a7ce03edf53fcf4fdae0e55d)
>>
>> The original problem is specific to x32 and is as follows: when a chunk
>> with a bytecode library is loaded into memory, and the address is higher
>> than 0x80000100, the `LexState->pe`, that contains an address of the end
>> of the bytecode chunk in the memory, will wrap around and become smaller
>> than the address in `LexState->p`, that contains an address of the
>> beginning of bytecode chunk in the memory. In `bcread_fill()` called by
>> `bcread_want()`, `memcpy()` is called with a very large size and causes
>> bus error on x86 and segmentation fault on ARM Android.
> Typo: s/bus error/the bus error/
> Typo: s/segmentation fault/the segmentation fault/

Fixed.


>
>> The problem cannot be reproduced on platforms supported by Tarantool
>> (ARM64, x86_64), so test doesn't reproduce a problem without a patch and
>> tests the patch partially.
>>
>> Sergey Bronnikov:
>> * added the description and the test
>> ---
>>   src/lib_package.c                             |  4 +-
>>   src/lj_bcread.c                               | 10 +-
>>   src/lj_lex.c                                  |  6 ++
>>   src/lj_lex.h                                  |  1 +
>>   .../lj-549-bytecode-loader.test.lua           | 96 +++++++++++++++++++
>>   5 files changed, 110 insertions(+), 7 deletions(-)
>>   create mode 100644 test/tarantool-tests/lj-549-bytecode-loader.test.lua
>>
> <snipped>
>
>> diff --git a/test/tarantool-tests/lj-549-bytecode-loader.test.lua b/test/tarantool-tests/lj-549-bytecode-loader.test.lua
>> new file mode 100644
>> index 00000000..889be80a
>> --- /dev/null
>> +++ b/test/tarantool-tests/lj-549-bytecode-loader.test.lua
>> @@ -0,0 +1,96 @@
>> +local tap = require('tap')
>> +local ffi = require('ffi')
>> +local utils = require('utils')
>> +local test = tap.test('lj-549-bytecode-loader'):skipcond({
>> +    -- ['Test requires GC64 mode enabled'] = not require('ffi').abi('gc64'),
>> +})
> Minor: It's better to require ffi and utils after test initialization
> via `tap.test()`, see other tests for example.
> Also, I suppose that we don't need `utils` itself, but
> `utils.exec.makecmd`.

Fixed.


>> +
>> +test:plan(1)
>> +
>> +-- Test creates a shared library with LuaJIT bytecode,
>> +-- loads shared library as a Lua module and checks,
>> +-- that no crashes eliminated.
>> +--
>> +-- $ make HOST_CC='gcc -m32' TARGET_CFLAGS='-m32' \
>> +--                           TARGET_LDFLAGS='-m32' \
>> +--                           TARGET_SHLDFLAGS='-m32' \
>> +--                           -f Makefile.original
>> +-- $ echo 'print("test")' > a.lua
>> +-- $ LUA_PATH="src/?.lua;;" luajit -b a.lua a.c
>> +-- $ gcc -m32 -fPIC -shared a.c -o a.so
>> +-- $ luajit -e "require('a')"
>> +-- Program received signal SIGBUS, Bus error
>> +
>> +local function file_exists(fname)
>> +   return io.open(fname, 'r') or true and false
> OK, this is a little bit confusing:
> If file doesn't exists we go to `or true` and after check `and false`
> which is always false. Tricky, but works.
>
> Also, here we don't close file handler.
> I suggest it is better to rewrite this as the following:
> | local fh = io.open(name, 'r')
> | return fh and io.close(fh)
>
> It is simplier to read, and fixes problem with leaking handler.
Updated.
>
>> +end
>> +
>> +local function get_file_name(file)
>> +    return file:match("[^/]*$")
> Minor: it may match the empty string for a directory occasionally:
> | src/luajit -e 'print([["]]..("/tmp/"):match("[^/]*$")..[["]])'


Fixed.

> | ""
>
> Nit: use single quotes instead of double quotes if possible.

Without context it is difficult to get what is line you talk about.

As I see everything is fine with quotes in version on the branch.


>
> Nit: `[^/\\]` is better since it also covers Windows.
> See <test/lua-Harness-tests/314-regex.t:167>
> | local dirname = arg[0]:gsub('([^/\\]+)$', '')
> Since we don't support Windows feel free to ignore.
>
>> +end
>> +
>> +local stdout_msg = 'Lango team'
>> +local lua_code = ('print(%q)'):format(stdout_msg)
>> +local fpath = os.tmpname()
>> +local path_lua = ('%s.lua'):format(fpath)
>> +local path_c = ('%s.c'):format(fpath)
>> +local path_so = ('%s.so'):format(fpath)
> Minor: I suppose it should be renamed to `path_shared`, since on macOS
> they have the ".dyld" suffix for shared libs. Hence, we need to use the
> suffix in format of the shared library name too. You may take some
> inspiration from here [1].
Fixed.
>> +
>> +-- Create a file with a minimal Lua code.
>> +local fh = assert(io.open(path_lua, 'w'))
>> +fh:write(lua_code)
>> +fh:close()
>> +
>> +local module_name = assert(get_file_name(fpath))
>> +
>> +local basedir = function(path)
>> +    local sep = '/'
> Why do we need an additional variable here?

For clarity.


>
> Nit: Indent is 4 spaces instead of 2.
>
>> +    return path:match('(.*' .. sep .. ')') or './'
> It's better to mention that the pattern matching is greedy, so we match
> until the last separator.


Updated.

>> +end
>> +
>> +-- Create a C file with LuaJIT bytecode.
>> +-- We cannot use utils.makecmd, because command-line generated
>> +-- by `makecmd` contains `-e` that is incompatible with option `-b`.
> Nit: comment line width is more than 66 symbols

Fixed.


>> +local function create_c_file(pathlua, pathc)
>> +  local lua_path = os.getenv('LUA_PATH')
>> +  local lua_bin = require('utils').exec.luacmd(arg):match('%S+')
>> +  local cmd_fmt = 'LUA_PATH="%s" %s -b %s %s'
>> +  local cmd = (cmd_fmt):format(lua_path, lua_bin, pathlua, pathc)
>> +  local ret = os.execute(cmd)
>> +  assert(ret == 0, 'create a C file with bytecode')
>> +end
>> +
>> +create_c_file(path_lua, path_c)
>> +assert(file_exists(path_c))
> Minor: The test flow is a little bit hard to read due to function
> declarations. Maybe it is better to declare all utility functions first
> and then use them one by one? This makes control flow easier to read.
Rearranged, take a look please.
>
>> +
>> +-- Compile C source code with LuaJIT bytecode to a shared library.
>> +-- `-m64` is not available on ARM64, see
>> +-- "3.18.1 AArch64 Options in the manual",
>> +-- https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html
>> +local cflags_64 = jit.arch == 'arm64' and '-march=armv8-a' or '-m64'
>> +local cflags = ffi.abi('32bit') and '-m32' or cflags_64
>> +local cc_cmd = ('cc %s -fPIC -shared %s -o %s'):format(cflags, path_c, path_so)
>> +local ph = io.popen(cc_cmd)
>> +ph:close()
>> +assert(file_exists(path_so))
>> +
>> +-- Load shared library as a Lua module.
>> +local lua_cpath = ('"/tmp/?.so;"'):format(basedir(fpath))
>> +assert(file_exists(path_so))
>> +local cmd = utils.exec.makecmd(arg, {
>> +    script = ('-e "require([[%s]])"'):format(module_name),
> Nit: Indent is 4 spaces instead of 2.
Fixed.
>
>> +    env = {
>> +        LUA_CPATH = lua_cpath,
> Nit: Indent is 4 spaces instead of 2.
Fixed.
>
>> +        -- It is required to cleanup LUA_PATH, otherwise
>> +        -- LuaJIT loads Lua module, see tarantool-tests/utils/init.lua.
> Nit: comment line width is more than 66 symbols
Fixed too.
>
> Actually I don't understand from the comment what Lua module exactly is
> loaded. Maybe it's better to fix this behaviour?
What behaviour you want to fix?
> Feel free to ignore.
>
>> +        LUA_PATH = '',
>> +    },
>> +})
>> +local res = cmd()
>> +test:ok(res == stdout_msg, 'bytecode loader works')
>> +
>> +os.remove(path_lua)
>> +os.remove(path_c)
>> +os.remove(path_so)
>> +
>> +os.exit(test:check() and 0 or 1)
>> -- 
>> 2.34.1
>>
> [1]: https://github.com/tarantool/tarantool/blob/dc8973c3de6311ab11df8d43520e1d40de4b9c7b/test/box/func_reload.test.lua#L5
>


More information about the Tarantool-patches mailing list