[Tarantool-patches] [PATCH luajit v2 1/1] Fix saved bytecode encapsulated in ELF objects.

Sergey Bronnikov sergeyb at tarantool.org
Wed May 31 17:50:00 MSK 2023


Hello, Max! Thanks for comments!

Fixed all comments except checks for constant size and offset, fixes 
force-pushed.


Sergey

On 5/30/23 12:05, Maxim Kokryashkin wrote:
> Hi, Sergey!
> Thanks for the patch!
> LGTM, except for a few comments below.
>
>
<snipped>
>
>         diff --git
>         a/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
>         b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
>         new file mode 100644
>         index 00000000..5dec6751
>         --- /dev/null
>         +++ b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
>         @@ -0,0 +1,202 @@
>         +local tap = require('tap')
>         +local test = tap.test('lj-366-strtab-correct-size'):skipcond({
>         + -- The test is ELF-specific, and because LuaJIT exports object
>         + -- files in ELF format for all operating systems except macOS
>         + -- and Windows we skip test on these OSes.
>
>     Typo: s/we skip test/we skip it/
>
Fixed!


>         + -- See src/jit/bcsave.lua:bcsave_obj.
>         + ['Disabled on Windows'] = jit.os == 'Windows',
>         + ['Disabled on macOS'] = jit.os == 'OSX',
>         +})
>         +
>         +local ffi = require 'ffi'
>         +
>         +-- Command below exports bytecode as an object file in ELF
>         format:
>         +-- $ luajit -b -n 'lango_team' -e 'print()' xxx.obj
>         +-- $ file xxx.obj
>         +-- xxx.obj: ELF 64-bit LSB relocatable, x86-64, version 1
>         (SYSV), not stripped
>         +--
>         +-- With read_elf(1) it is possible displaying the entries in
>         symbol table
>
>     Typo: s/possible displaying the entries/possible to display entries/
>

Fixed!


>         +-- section of the file, if it has one. Object file contains a
>         single symbol
>
>     Typo: s/a single symbol/the single symbol/
>
>         +-- with name 'luaJIT_BC_lango_team':
>         +--
>         +-- $ readelf --symbols xxx.obj
>         +--
>         +-- Symbol table '.symtab' contains 2 entries:
>         +-- Num: Value Size Type Bind Vis Ndx Name
>         +-- 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
>         +-- 1: 0000000000000000 66 OBJECT GLOBAL DEFAULT 4
>         luaJIT_BC_lango_team
>         +--
>         +-- and displaying the information contained in the file's
>         section headers, if
>
>     Typo: s/and to display/
>
Fixed!


>     Also, that sentence seems to be a continuation for the previous
>     sentence about
>     `readelf`’s functionality, however it is completely detached from
>     it both by the
>     setence inbetween them and grammatically. I suggest moving the
>     demonstration
>     of `.symtab` contents after the introduction of readelf’s
>     functionality, to improve
>     readability.
>
>         +-- it has any. For our purposes we are interesting in section
>         .symtab,
>
>     Typo: s/are interesting in section .symtab/are interested in
>     .symtab section/
>

Fixed!


>         +-- so other sections snipped in the output:
>
>     Typo: s/sections snipped/sections are snipped/
>
Fixed!


>         +--
>         +-- $ readelf --section-headers xxx.obj
>         +-- There are 6 section headers, starting at offset 0x40:
>         +--
>         +-- Section Headers:
>         +-- [Nr] Name Type Address Offset
>         +-- Size EntSize Flags Link Info Align
>         +-- ...
>         +--
>         +-- [ 3] .strtab STRTAB 0000000000000000 00000223
>         +-- 0000000000000016 0000000000000000 0 0 1
>         +-- ...
>         +-- Reference numbers for strtab offset and size could be
>         obtained with
>         +-- readelf(1). Note that number system of these numbers are
>         hexadecimal.
>         +
>         +local expected_strtab_size = 0x16
>         +local expected_strtab_offset = 0x223
>         +local module_name = 'lango_team'
>
>     Simple value constants should be uppercased.
>
Fixed.


>         +
>         +-- Symbol name prefix for LuaJIT bytecode defined in bcsave.lua.
>         +local LJBC_PREFIX = 'luaJIT_BC_'
>         +
>         +-- Defined in elf.h.
>         +local SHT_SYMTAB = 2
>         +
>         +-- Using the same declarations as defined in
>         <src/jit/bcsave.lua>.
>         +ffi.cdef[[
>         +typedef struct {
>         + uint8_t emagic[4], eclass, eendian, eversion, eosabi,
>         eabiversion, epad[7];
>         + uint16_t type, machine;
>         + uint32_t version;
>         + uint32_t entry, phofs, shofs;
>         + uint32_t flags;
>         + uint16_t ehsize, phentsize, phnum, shentsize, shnum, shstridx;
>         +} ELF32header;
>         +
>         +typedef struct {
>         + uint8_t emagic[4], eclass, eendian, eversion, eosabi,
>         eabiversion, epad[7];
>         + uint16_t type, machine;
>         + uint32_t version;
>         + uint64_t entry, phofs, shofs;
>         + uint32_t flags;
>         + uint16_t ehsize, phentsize, phnum, shentsize, shnum, shstridx;
>         +} ELF64header;
>         +
>         +typedef struct {
>         + uint32_t name, type, flags, addr, ofs, size, link, info,
>         align, entsize;
>         +} ELF32sectheader;
>         +
>         +typedef struct {
>         + uint32_t name, type;
>         + uint64_t flags, addr, ofs, size;
>         + uint32_t link, info;
>         + uint64_t align, entsize;
>         +} ELF64sectheader;
>         +
>         +typedef struct {
>         + uint32_t name, value, size;
>         + uint8_t info, other;
>         + uint16_t sectidx;
>         +} ELF32symbol;
>         +
>         +typedef struct {
>         + uint32_t name;
>         + uint8_t info, other;
>         + uint16_t sectidx;
>         + uint64_t value, size;
>         +} ELF64symbol;
>         +
>         +typedef struct {
>         + ELF32header hdr;
>         + ELF32sectheader sect[6];
>         + ELF32symbol sym[2];
>         + uint8_t space[4096];
>         +} ELF32obj;
>         +
>         +typedef struct {
>         + ELF64header hdr;
>         + ELF64sectheader sect[6];
>         + ELF64symbol sym[2];
>         + uint8_t space[4096];
>         +} ELF64obj;
>         +]]
>         +
>         +local is64_arch = {
>         + ['x64'] = true,
>         + ['arm64'] = true,
>         + ['arm64be'] = true,
>         + ['ppc'] = false,
>         + ['mips'] = false,
>         +}
>         +
>         +local is64 = is64_arch[jit.arch] or false
>         +
>         +local function create_obj_file(name)
>         + local elf_filename = os.tmpname() .. '.obj'
>         + local lua_path = os.getenv('LUA_PATH')
>         + local lua_bin = require('utils').luacmd(arg):match('%S+')
>         + local cmd_fmt = 'LUA_PATH="%s" %s -b -n "%s" -e "print()" %s'
>         + local cmd = (cmd_fmt):format(lua_path, lua_bin, name,
>         elf_filename)
>         + local ret = os.execute(cmd)
>         + assert(ret == 0, 'create an object file')
>         + return elf_filename
>         +end
>         +
>         +-- Reads a file located in a specified path and returns its
>         content.
>         +local function read_file(path)
>         + local file = assert(io.open(path), 'cannot open an object file')
>         + local content = file:read('*a')
>         + file:close()
>         + return content
>         +end
>         +
>         +-- Parses a buffer in an ELF format and returns an offset and
>         a size of strtab
>         +-- and symtab sections.
>         +local function read_elf(elf_content)
>         + local ELFobj_type = ffi.typeof(is64 and 'ELF64obj *' or
>         'ELF32obj *')
>         + local ELFsectheader_type = ffi.typeof(is64 and
>         'ELF64sectheader *' or 'ELF32sectheader *')
>         + local elf = ffi.cast(ELFobj_type, elf_content)
>         + local symtab_hdr, strtab_hdr
>         + -- Iterate by section headers.
>         + for i = 0, elf.hdr.shnum do
>         + local sec = ffi.cast(ELFsectheader_type, elf.sect[i])
>         + if sec.type == SHT_SYMTAB then
>         + symtab_hdr = sec
>         + strtab_hdr = ffi.cast(ELFsectheader_type,
>         elf.sect[symtab_hdr.link])
>         + break
>         + end
>         + end
>         +
>         + assert(strtab_hdr ~= nil, 'section .strtab was not found')
>         + assert(symtab_hdr ~= nil, 'section .symtab was not found')
>         +
>         + return strtab_hdr, symtab_hdr
>         +end
>         +
>         +test:plan(3)
>         +
>         +local elf_filename = create_obj_file(module_name)
>         +local elf_content = read_file(elf_filename)
>         +assert(#elf_content ~= 0, 'cannot read an object file')
>         +
>         +local strtab, symtab = read_elf(elf_content)
>         +local strtab_size = tonumber(strtab.size)
>         +local strtab_offset = tonumber(strtab.ofs)
>         +local symtab_size = tonumber(symtab.size)
>         +local sym_cnt = tonumber(symtab_size / symtab.entsize)
>         +assert(sym_cnt ~= 0, 'number of symbols is zero')
>         +
>         +test:is(strtab_size, expected_strtab_size, 'check .strtab size')
>         +test:is(strtab_offset, expected_strtab_offset, 'check .strtab
>         offset')
>
>     As I already said offline, I don’t like that approach. ELF is made
>     to by
>     as flexible as possible, so, generally, you have no guarantees about
>     order and, hence, offsets of any sections. The only reason why we are
>     getting away with that here, is because this ELF is generated
>     in `bcsave.lua` in a particular manner. It is ok for now, but no one
>     knows if this will ever change. I think that the symbol name search
>     below is sufficient, because it is unlikely to pass with an
>     incorrect offset.
>
Sure, I remember your thoughts regarding checking with constant size and 
offset.

I left this for double-checking change proposed in patch.

I'll leave this as as without changes and will wait review from Sergey

and then we will decide remove these checks or not.

>     All of the checks are valid for now, though, so feel free to ignore.
>
>         +
>         +local strtab_str = string.sub(elf_content, strtab_offset,
>         strtab_offset + strtab_size)
>         +
>         +local strtab_p = ffi.cast('char *', strtab_str)
>         +local sym_name_expected = LJBC_PREFIX .. module_name
>         +local sym_is_found = false
>         +for i = 1, sym_cnt do
>         + local sym_name = ffi.string(strtab_p + i)
>         + if sym_name_expected == sym_name then
>         + sym_is_found = true
>         + break
>         + end
>         +end
>         +
>         +test:ok(sym_is_found == true, 'symbol is found')
>         +
>         +local ret = os.remove(elf_filename)
>         +assert(ret == true, 'cannot remove an object file')
>         +
>         +os.exit(test:check() and 0 or 1)
>         --
>         2.34.1
>
>     --
>     Best regards,
>     Maxim Kokryashkin
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.tarantool.org/pipermail/tarantool-patches/attachments/20230531/f623c801/attachment.htm>


More information about the Tarantool-patches mailing list