[Tarantool-patches] [PATCH luajit v2 1/1] Fix saved bytecode encapsulated in ELF objects.

Maxim Kokryashkin m.kokryashkin at tarantool.org
Tue May 30 12:05:04 MSK 2023


Hi, Sergey!
Thanks for the patch!
LGTM, except for a few comments below.
> 
>>From: Sergey Bronnikov < sergeyb at tarantool.org >
>>
>>Thanks to Dimitry Andric.
>>
>>(cherry picked from commit 7dbf0b05f1228c1c719866db5e5f3d58f87f74c8)
>>
>>Function `bcsave.lua:bcsave_elfobj()` produced an object file in ELF
>>format with a wrong size size of `.strtab`. Wrong .strtab size causes lld to
>>show an error message:
>>
>>```
>>$ luajit -b -n "module_name" -e "print()" xxx.obj
>>$ ld.lld xxx.obj
>>ld.lld: error: xxx.obj: SHT_STRTAB string table section [index 3] is non-null terminated
>>```
>>
>>Sergey Bronnikov:
>>* added the description and the test for the problem
>>
>>Signed-off-by: Sergey Bronnikov < sergeyb at tarantool.org >
>>Reviewed-by: Sergey Kaplun < skaplun at tarantool.org >
>>Reviewed-by: Maxim Kokryashkin < m.kokryashkin at tarantool.org >
>>---
>>
>>Changes in v2:
>>- Fixed comments as per review by Sergey.
>>
>>Branch:  https://github.com/tarantool/luajit/tree/ligurio/gh-366-strtab-section-correct-size
>>Tarantool CI:  https://github.com/ligurio/tarantool/tree/ligurio/gh-xxxx-strtab-section-correct-size
>>
>> src/jit/bcsave.lua | 2 +-
>> .../lj-366-strtab-correct-size.test.lua | 202 ++++++++++++++++++
>> 2 files changed, 203 insertions(+), 1 deletion(-)
>> create mode 100644 test/tarantool-tests/lj-366-strtab-correct-size.test.lua
>>
>>diff --git a/src/jit/bcsave.lua b/src/jit/bcsave.lua
>>index c17c88e0..2553d97e 100644
>>--- a/src/jit/bcsave.lua
>>+++ b/src/jit/bcsave.lua
>>@@ -275,7 +275,7 @@ typedef struct {
>>   o.sect[2].size = fofs(ofs)
>>   o.sect[3].type = f32(3) -- .strtab
>>   o.sect[3].ofs = fofs(sofs + ofs)
>>- o.sect[3].size = fofs(#symname+1)
>>+ o.sect[3].size = fofs(#symname+2)
>>   ffi.copy(o.space+ofs+1, symname)
>>   ofs = ofs + #symname + 2
>>   o.sect[4].type = f32(1) -- .rodata
>>diff --git a/test/tarantool-tests/lj-366-strtab-correct-size.test.lua b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
>>new file mode 100644
>>index 00000000..5dec6751
>>--- /dev/null
>>+++ b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
>>@@ -0,0 +1,202 @@
>>+local tap = require('tap')
>>+local test = tap.test('lj-366-strtab-correct-size'):skipcond({
>>+ -- The test is ELF-specific, and because LuaJIT exports object
>>+ -- files in ELF format for all operating systems except macOS
>>+ -- and Windows we skip test on these OSes.
>Typo: s/we skip test/we skip it/
>>+ -- See src/jit/bcsave.lua:bcsave_obj.
>>+ ['Disabled on Windows'] = jit.os == 'Windows',
>>+ ['Disabled on macOS'] = jit.os == 'OSX',
>>+})
>>+
>>+local ffi = require 'ffi'
>>+
>>+-- Command below exports bytecode as an object file in ELF format:
>>+-- $ luajit -b -n 'lango_team' -e 'print()' xxx.obj
>>+-- $ file xxx.obj
>>+-- xxx.obj: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped
>>+--
>>+-- With read_elf(1) it is possible displaying the entries in symbol table
>Typo: s/possible displaying the entries/possible to display entries/
>>+-- section of the file, if it has one. Object file contains a single symbol
>Typo: s/a single symbol/the single symbol/
>>+-- with name 'luaJIT_BC_lango_team':
>>+--
>>+-- $ readelf --symbols xxx.obj
>>+--
>>+-- Symbol table '.symtab' contains 2 entries:
>>+-- Num: Value Size Type Bind Vis Ndx Name
>>+-- 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
>>+-- 1: 0000000000000000 66 OBJECT GLOBAL DEFAULT 4 luaJIT_BC_lango_team
>>+--
>>+-- and displaying the information contained in the file's section headers, if
>Typo: s/and to display/
> 
>Also, that sentence seems to be a continuation for the previous sentence about
>`readelf`’s functionality, however it is completely detached from it both by the
>setence inbetween them and grammatically. I suggest moving the demonstration
>of `.symtab` contents after the introduction of readelf’s functionality, to improve
>readability.
>>+-- it has any. For our purposes we are interesting in section .symtab,
>Typo: s/are interesting in section .symtab/are interested in .symtab section/
>>+-- so other sections snipped in the output:
>Typo: s/sections snipped/sections are snipped/
>>+--
>>+-- $ readelf --section-headers xxx.obj
>>+-- There are 6 section headers, starting at offset 0x40:
>>+--
>>+-- Section Headers:
>>+-- [Nr] Name Type Address Offset
>>+-- Size EntSize Flags Link Info Align
>>+-- ...
>>+--
>>+-- [ 3] .strtab STRTAB 0000000000000000 00000223
>>+-- 0000000000000016 0000000000000000 0 0 1
>>+-- ...
>>+-- Reference numbers for strtab offset and size could be obtained with
>>+-- readelf(1). Note that number system of these numbers are hexadecimal.
>>+
>>+local expected_strtab_size = 0x16
>>+local expected_strtab_offset = 0x223
>>+local module_name = 'lango_team'
>Simple value constants should be uppercased.
>>+
>>+-- Symbol name prefix for LuaJIT bytecode defined in bcsave.lua.
>>+local LJBC_PREFIX = 'luaJIT_BC_'
>>+
>>+-- Defined in elf.h.
>>+local SHT_SYMTAB = 2
>>+
>>+-- Using the same declarations as defined in <src/jit/bcsave.lua>.
>>+ffi.cdef[[
>>+typedef struct {
>>+ uint8_t emagic[4], eclass, eendian, eversion, eosabi, eabiversion, epad[7];
>>+ uint16_t type, machine;
>>+ uint32_t version;
>>+ uint32_t entry, phofs, shofs;
>>+ uint32_t flags;
>>+ uint16_t ehsize, phentsize, phnum, shentsize, shnum, shstridx;
>>+} ELF32header;
>>+
>>+typedef struct {
>>+ uint8_t emagic[4], eclass, eendian, eversion, eosabi, eabiversion, epad[7];
>>+ uint16_t type, machine;
>>+ uint32_t version;
>>+ uint64_t entry, phofs, shofs;
>>+ uint32_t flags;
>>+ uint16_t ehsize, phentsize, phnum, shentsize, shnum, shstridx;
>>+} ELF64header;
>>+
>>+typedef struct {
>>+ uint32_t name, type, flags, addr, ofs, size, link, info, align, entsize;
>>+} ELF32sectheader;
>>+
>>+typedef struct {
>>+ uint32_t name, type;
>>+ uint64_t flags, addr, ofs, size;
>>+ uint32_t link, info;
>>+ uint64_t align, entsize;
>>+} ELF64sectheader;
>>+
>>+typedef struct {
>>+ uint32_t name, value, size;
>>+ uint8_t info, other;
>>+ uint16_t sectidx;
>>+} ELF32symbol;
>>+
>>+typedef struct {
>>+ uint32_t name;
>>+ uint8_t info, other;
>>+ uint16_t sectidx;
>>+ uint64_t value, size;
>>+} ELF64symbol;
>>+
>>+typedef struct {
>>+ ELF32header hdr;
>>+ ELF32sectheader sect[6];
>>+ ELF32symbol sym[2];
>>+ uint8_t space[4096];
>>+} ELF32obj;
>>+
>>+typedef struct {
>>+ ELF64header hdr;
>>+ ELF64sectheader sect[6];
>>+ ELF64symbol sym[2];
>>+ uint8_t space[4096];
>>+} ELF64obj;
>>+]]
>>+
>>+local is64_arch = {
>>+ ['x64'] = true,
>>+ ['arm64'] = true,
>>+ ['arm64be'] = true,
>>+ ['ppc'] = false,
>>+ ['mips'] = false,
>>+}
>>+
>>+local is64 = is64_arch[jit.arch] or false
>>+
>>+local function create_obj_file(name)
>>+ local elf_filename = os.tmpname() .. '.obj'
>>+ local lua_path = os.getenv('LUA_PATH')
>>+ local lua_bin = require('utils').luacmd(arg):match('%S+')
>>+ local cmd_fmt = 'LUA_PATH="%s" %s -b -n "%s" -e "print()" %s'
>>+ local cmd = (cmd_fmt):format(lua_path, lua_bin, name, elf_filename)
>>+ local ret = os.execute(cmd)
>>+ assert(ret == 0, 'create an object file')
>>+ return elf_filename
>>+end
>>+
>>+-- Reads a file located in a specified path and returns its content.
>>+local function read_file(path)
>>+ local file = assert(io.open(path), 'cannot open an object file')
>>+ local content = file:read('*a')
>>+ file:close()
>>+ return content
>>+end
>>+
>>+-- Parses a buffer in an ELF format and returns an offset and a size of strtab
>>+-- and symtab sections.
>>+local function read_elf(elf_content)
>>+ local ELFobj_type = ffi.typeof(is64 and 'ELF64obj *' or 'ELF32obj *')
>>+ local ELFsectheader_type = ffi.typeof(is64 and 'ELF64sectheader *' or 'ELF32sectheader *')
>>+ local elf = ffi.cast(ELFobj_type, elf_content)
>>+ local symtab_hdr, strtab_hdr
>>+ -- Iterate by section headers.
>>+ for i = 0, elf.hdr.shnum do
>>+ local sec = ffi.cast(ELFsectheader_type, elf.sect[i])
>>+ if sec.type == SHT_SYMTAB then
>>+ symtab_hdr = sec
>>+ strtab_hdr = ffi.cast(ELFsectheader_type, elf.sect[symtab_hdr.link])
>>+ break
>>+ end
>>+ end
>>+
>>+ assert(strtab_hdr ~= nil, 'section .strtab was not found')
>>+ assert(symtab_hdr ~= nil, 'section .symtab was not found')
>>+
>>+ return strtab_hdr, symtab_hdr
>>+end
>>+
>>+test:plan(3)
>>+
>>+local elf_filename = create_obj_file(module_name)
>>+local elf_content = read_file(elf_filename)
>>+assert(#elf_content ~= 0, 'cannot read an object file')
>>+
>>+local strtab, symtab = read_elf(elf_content)
>>+local strtab_size = tonumber(strtab.size)
>>+local strtab_offset = tonumber(strtab.ofs)
>>+local symtab_size = tonumber(symtab.size)
>>+local sym_cnt = tonumber(symtab_size / symtab.entsize)
>>+assert(sym_cnt ~= 0, 'number of symbols is zero')
>>+
>>+test:is(strtab_size, expected_strtab_size, 'check .strtab size')
>>+test:is(strtab_offset, expected_strtab_offset, 'check .strtab offset')
>As I already said offline, I don’t like that approach. ELF is made to by
>as flexible as possible, so, generally, you have no guarantees about
>order and, hence, offsets of any sections. The only reason why we are
>getting away with that here, is because this ELF is generated
>in `bcsave.lua` in a particular manner. It is ok for now, but no one
>knows if this will ever change. I think that the symbol name search
>below is sufficient, because it is unlikely to pass with an incorrect offset.
> 
>All of the checks are valid for now, though, so feel free to ignore.
>>+
>>+local strtab_str = string.sub(elf_content, strtab_offset, strtab_offset + strtab_size)
>>+
>>+local strtab_p = ffi.cast('char *', strtab_str)
>>+local sym_name_expected = LJBC_PREFIX .. module_name
>>+local sym_is_found = false
>>+for i = 1, sym_cnt do
>>+ local sym_name = ffi.string(strtab_p + i)
>>+ if sym_name_expected == sym_name then
>>+ sym_is_found = true
>>+ break
>>+ end
>>+end
>>+
>>+test:ok(sym_is_found == true, 'symbol is found')
>>+
>>+local ret = os.remove(elf_filename)
>>+assert(ret == true, 'cannot remove an object file')
>>+
>>+os.exit(test:check() and 0 or 1)
>>--
>>2.34.1
>--
>Best regards,
>Maxim Kokryashkin
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.tarantool.org/pipermail/tarantool-patches/attachments/20230530/16321b00/attachment.htm>


More information about the Tarantool-patches mailing list