[Tarantool-patches] [PATCH luajit v2 1/1] Fix saved bytecode encapsulated in ELF objects.
Sergey Bronnikov
sergeyb at tarantool.org
Wed May 31 17:50:00 MSK 2023
Hello, Max! Thanks for comments!
Fixed all comments except checks for constant size and offset, fixes
force-pushed.
Sergey
On 5/30/23 12:05, Maxim Kokryashkin wrote:
> Hi, Sergey!
> Thanks for the patch!
> LGTM, except for a few comments below.
>
>
<snipped>
>
> diff --git
> a/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
> b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
> new file mode 100644
> index 00000000..5dec6751
> --- /dev/null
> +++ b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
> @@ -0,0 +1,202 @@
> +local tap = require('tap')
> +local test = tap.test('lj-366-strtab-correct-size'):skipcond({
> + -- The test is ELF-specific, and because LuaJIT exports object
> + -- files in ELF format for all operating systems except macOS
> + -- and Windows we skip test on these OSes.
>
> Typo: s/we skip test/we skip it/
>
Fixed!
> + -- See src/jit/bcsave.lua:bcsave_obj.
> + ['Disabled on Windows'] = jit.os == 'Windows',
> + ['Disabled on macOS'] = jit.os == 'OSX',
> +})
> +
> +local ffi = require 'ffi'
> +
> +-- Command below exports bytecode as an object file in ELF
> format:
> +-- $ luajit -b -n 'lango_team' -e 'print()' xxx.obj
> +-- $ file xxx.obj
> +-- xxx.obj: ELF 64-bit LSB relocatable, x86-64, version 1
> (SYSV), not stripped
> +--
> +-- With read_elf(1) it is possible displaying the entries in
> symbol table
>
> Typo: s/possible displaying the entries/possible to display entries/
>
Fixed!
> +-- section of the file, if it has one. Object file contains a
> single symbol
>
> Typo: s/a single symbol/the single symbol/
>
> +-- with name 'luaJIT_BC_lango_team':
> +--
> +-- $ readelf --symbols xxx.obj
> +--
> +-- Symbol table '.symtab' contains 2 entries:
> +-- Num: Value Size Type Bind Vis Ndx Name
> +-- 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
> +-- 1: 0000000000000000 66 OBJECT GLOBAL DEFAULT 4
> luaJIT_BC_lango_team
> +--
> +-- and displaying the information contained in the file's
> section headers, if
>
> Typo: s/and to display/
>
Fixed!
> Also, that sentence seems to be a continuation for the previous
> sentence about
> `readelf`’s functionality, however it is completely detached from
> it both by the
> setence inbetween them and grammatically. I suggest moving the
> demonstration
> of `.symtab` contents after the introduction of readelf’s
> functionality, to improve
> readability.
>
> +-- it has any. For our purposes we are interesting in section
> .symtab,
>
> Typo: s/are interesting in section .symtab/are interested in
> .symtab section/
>
Fixed!
> +-- so other sections snipped in the output:
>
> Typo: s/sections snipped/sections are snipped/
>
Fixed!
> +--
> +-- $ readelf --section-headers xxx.obj
> +-- There are 6 section headers, starting at offset 0x40:
> +--
> +-- Section Headers:
> +-- [Nr] Name Type Address Offset
> +-- Size EntSize Flags Link Info Align
> +-- ...
> +--
> +-- [ 3] .strtab STRTAB 0000000000000000 00000223
> +-- 0000000000000016 0000000000000000 0 0 1
> +-- ...
> +-- Reference numbers for strtab offset and size could be
> obtained with
> +-- readelf(1). Note that number system of these numbers are
> hexadecimal.
> +
> +local expected_strtab_size = 0x16
> +local expected_strtab_offset = 0x223
> +local module_name = 'lango_team'
>
> Simple value constants should be uppercased.
>
Fixed.
> +
> +-- Symbol name prefix for LuaJIT bytecode defined in bcsave.lua.
> +local LJBC_PREFIX = 'luaJIT_BC_'
> +
> +-- Defined in elf.h.
> +local SHT_SYMTAB = 2
> +
> +-- Using the same declarations as defined in
> <src/jit/bcsave.lua>.
> +ffi.cdef[[
> +typedef struct {
> + uint8_t emagic[4], eclass, eendian, eversion, eosabi,
> eabiversion, epad[7];
> + uint16_t type, machine;
> + uint32_t version;
> + uint32_t entry, phofs, shofs;
> + uint32_t flags;
> + uint16_t ehsize, phentsize, phnum, shentsize, shnum, shstridx;
> +} ELF32header;
> +
> +typedef struct {
> + uint8_t emagic[4], eclass, eendian, eversion, eosabi,
> eabiversion, epad[7];
> + uint16_t type, machine;
> + uint32_t version;
> + uint64_t entry, phofs, shofs;
> + uint32_t flags;
> + uint16_t ehsize, phentsize, phnum, shentsize, shnum, shstridx;
> +} ELF64header;
> +
> +typedef struct {
> + uint32_t name, type, flags, addr, ofs, size, link, info,
> align, entsize;
> +} ELF32sectheader;
> +
> +typedef struct {
> + uint32_t name, type;
> + uint64_t flags, addr, ofs, size;
> + uint32_t link, info;
> + uint64_t align, entsize;
> +} ELF64sectheader;
> +
> +typedef struct {
> + uint32_t name, value, size;
> + uint8_t info, other;
> + uint16_t sectidx;
> +} ELF32symbol;
> +
> +typedef struct {
> + uint32_t name;
> + uint8_t info, other;
> + uint16_t sectidx;
> + uint64_t value, size;
> +} ELF64symbol;
> +
> +typedef struct {
> + ELF32header hdr;
> + ELF32sectheader sect[6];
> + ELF32symbol sym[2];
> + uint8_t space[4096];
> +} ELF32obj;
> +
> +typedef struct {
> + ELF64header hdr;
> + ELF64sectheader sect[6];
> + ELF64symbol sym[2];
> + uint8_t space[4096];
> +} ELF64obj;
> +]]
> +
> +local is64_arch = {
> + ['x64'] = true,
> + ['arm64'] = true,
> + ['arm64be'] = true,
> + ['ppc'] = false,
> + ['mips'] = false,
> +}
> +
> +local is64 = is64_arch[jit.arch] or false
> +
> +local function create_obj_file(name)
> + local elf_filename = os.tmpname() .. '.obj'
> + local lua_path = os.getenv('LUA_PATH')
> + local lua_bin = require('utils').luacmd(arg):match('%S+')
> + local cmd_fmt = 'LUA_PATH="%s" %s -b -n "%s" -e "print()" %s'
> + local cmd = (cmd_fmt):format(lua_path, lua_bin, name,
> elf_filename)
> + local ret = os.execute(cmd)
> + assert(ret == 0, 'create an object file')
> + return elf_filename
> +end
> +
> +-- Reads a file located in a specified path and returns its
> content.
> +local function read_file(path)
> + local file = assert(io.open(path), 'cannot open an object file')
> + local content = file:read('*a')
> + file:close()
> + return content
> +end
> +
> +-- Parses a buffer in an ELF format and returns an offset and
> a size of strtab
> +-- and symtab sections.
> +local function read_elf(elf_content)
> + local ELFobj_type = ffi.typeof(is64 and 'ELF64obj *' or
> 'ELF32obj *')
> + local ELFsectheader_type = ffi.typeof(is64 and
> 'ELF64sectheader *' or 'ELF32sectheader *')
> + local elf = ffi.cast(ELFobj_type, elf_content)
> + local symtab_hdr, strtab_hdr
> + -- Iterate by section headers.
> + for i = 0, elf.hdr.shnum do
> + local sec = ffi.cast(ELFsectheader_type, elf.sect[i])
> + if sec.type == SHT_SYMTAB then
> + symtab_hdr = sec
> + strtab_hdr = ffi.cast(ELFsectheader_type,
> elf.sect[symtab_hdr.link])
> + break
> + end
> + end
> +
> + assert(strtab_hdr ~= nil, 'section .strtab was not found')
> + assert(symtab_hdr ~= nil, 'section .symtab was not found')
> +
> + return strtab_hdr, symtab_hdr
> +end
> +
> +test:plan(3)
> +
> +local elf_filename = create_obj_file(module_name)
> +local elf_content = read_file(elf_filename)
> +assert(#elf_content ~= 0, 'cannot read an object file')
> +
> +local strtab, symtab = read_elf(elf_content)
> +local strtab_size = tonumber(strtab.size)
> +local strtab_offset = tonumber(strtab.ofs)
> +local symtab_size = tonumber(symtab.size)
> +local sym_cnt = tonumber(symtab_size / symtab.entsize)
> +assert(sym_cnt ~= 0, 'number of symbols is zero')
> +
> +test:is(strtab_size, expected_strtab_size, 'check .strtab size')
> +test:is(strtab_offset, expected_strtab_offset, 'check .strtab
> offset')
>
> As I already said offline, I don’t like that approach. ELF is made
> to by
> as flexible as possible, so, generally, you have no guarantees about
> order and, hence, offsets of any sections. The only reason why we are
> getting away with that here, is because this ELF is generated
> in `bcsave.lua` in a particular manner. It is ok for now, but no one
> knows if this will ever change. I think that the symbol name search
> below is sufficient, because it is unlikely to pass with an
> incorrect offset.
>
Sure, I remember your thoughts regarding checking with constant size and
offset.
I left this for double-checking change proposed in patch.
I'll leave this as as without changes and will wait review from Sergey
and then we will decide remove these checks or not.
> All of the checks are valid for now, though, so feel free to ignore.
>
> +
> +local strtab_str = string.sub(elf_content, strtab_offset,
> strtab_offset + strtab_size)
> +
> +local strtab_p = ffi.cast('char *', strtab_str)
> +local sym_name_expected = LJBC_PREFIX .. module_name
> +local sym_is_found = false
> +for i = 1, sym_cnt do
> + local sym_name = ffi.string(strtab_p + i)
> + if sym_name_expected == sym_name then
> + sym_is_found = true
> + break
> + end
> +end
> +
> +test:ok(sym_is_found == true, 'symbol is found')
> +
> +local ret = os.remove(elf_filename)
> +assert(ret == true, 'cannot remove an object file')
> +
> +os.exit(test:check() and 0 or 1)
> --
> 2.34.1
>
> --
> Best regards,
> Maxim Kokryashkin
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.tarantool.org/pipermail/tarantool-patches/attachments/20230531/f623c801/attachment.htm>
More information about the Tarantool-patches
mailing list