Hello, Max! Thanks for comments! Fixed all comments except checks for constant size and offset, fixes force-pushed. Sergey On 5/30/23 12:05, Maxim Kokryashkin wrote: > Hi, Sergey! > Thanks for the patch! > LGTM, except for a few comments below. > > > > diff --git > a/test/tarantool-tests/lj-366-strtab-correct-size.test.lua > b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua > new file mode 100644 > index 00000000..5dec6751 > --- /dev/null > +++ b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua > @@ -0,0 +1,202 @@ > +local tap = require('tap') > +local test = tap.test('lj-366-strtab-correct-size'):skipcond({ > + -- The test is ELF-specific, and because LuaJIT exports object > + -- files in ELF format for all operating systems except macOS > + -- and Windows we skip test on these OSes. > > Typo: s/we skip test/we skip it/ > Fixed! > + -- See src/jit/bcsave.lua:bcsave_obj. > + ['Disabled on Windows'] = jit.os == 'Windows', > + ['Disabled on macOS'] = jit.os == 'OSX', > +}) > + > +local ffi = require 'ffi' > + > +-- Command below exports bytecode as an object file in ELF > format: > +-- $ luajit -b -n 'lango_team' -e 'print()' xxx.obj > +-- $ file xxx.obj > +-- xxx.obj: ELF 64-bit LSB relocatable, x86-64, version 1 > (SYSV), not stripped > +-- > +-- With read_elf(1) it is possible displaying the entries in > symbol table > > Typo: s/possible displaying the entries/possible to display entries/ > Fixed! > +-- section of the file, if it has one. Object file contains a > single symbol > > Typo: s/a single symbol/the single symbol/ > > +-- with name 'luaJIT_BC_lango_team': > +-- > +-- $ readelf --symbols xxx.obj > +-- > +-- Symbol table '.symtab' contains 2 entries: > +-- Num: Value Size Type Bind Vis Ndx Name > +-- 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND > +-- 1: 0000000000000000 66 OBJECT GLOBAL DEFAULT 4 > luaJIT_BC_lango_team > +-- > +-- and displaying the information contained in the file's > section headers, if > > Typo: s/and to display/ > Fixed! > Also, that sentence seems to be a continuation for the previous > sentence about > `readelf`’s functionality, however it is completely detached from > it both by the > setence inbetween them and grammatically. I suggest moving the > demonstration > of `.symtab` contents after the introduction of readelf’s > functionality, to improve > readability. > > +-- it has any. For our purposes we are interesting in section > .symtab, > > Typo: s/are interesting in section .symtab/are interested in > .symtab section/ > Fixed! > +-- so other sections snipped in the output: > > Typo: s/sections snipped/sections are snipped/ > Fixed! > +-- > +-- $ readelf --section-headers xxx.obj > +-- There are 6 section headers, starting at offset 0x40: > +-- > +-- Section Headers: > +-- [Nr] Name Type Address Offset > +-- Size EntSize Flags Link Info Align > +-- ... > +-- > +-- [ 3] .strtab STRTAB 0000000000000000 00000223 > +-- 0000000000000016 0000000000000000 0 0 1 > +-- ... > +-- Reference numbers for strtab offset and size could be > obtained with > +-- readelf(1). Note that number system of these numbers are > hexadecimal. > + > +local expected_strtab_size = 0x16 > +local expected_strtab_offset = 0x223 > +local module_name = 'lango_team' > > Simple value constants should be uppercased. > Fixed. > + > +-- Symbol name prefix for LuaJIT bytecode defined in bcsave.lua. > +local LJBC_PREFIX = 'luaJIT_BC_' > + > +-- Defined in elf.h. > +local SHT_SYMTAB = 2 > + > +-- Using the same declarations as defined in > . > +ffi.cdef[[ > +typedef struct { > + uint8_t emagic[4], eclass, eendian, eversion, eosabi, > eabiversion, epad[7]; > + uint16_t type, machine; > + uint32_t version; > + uint32_t entry, phofs, shofs; > + uint32_t flags; > + uint16_t ehsize, phentsize, phnum, shentsize, shnum, shstridx; > +} ELF32header; > + > +typedef struct { > + uint8_t emagic[4], eclass, eendian, eversion, eosabi, > eabiversion, epad[7]; > + uint16_t type, machine; > + uint32_t version; > + uint64_t entry, phofs, shofs; > + uint32_t flags; > + uint16_t ehsize, phentsize, phnum, shentsize, shnum, shstridx; > +} ELF64header; > + > +typedef struct { > + uint32_t name, type, flags, addr, ofs, size, link, info, > align, entsize; > +} ELF32sectheader; > + > +typedef struct { > + uint32_t name, type; > + uint64_t flags, addr, ofs, size; > + uint32_t link, info; > + uint64_t align, entsize; > +} ELF64sectheader; > + > +typedef struct { > + uint32_t name, value, size; > + uint8_t info, other; > + uint16_t sectidx; > +} ELF32symbol; > + > +typedef struct { > + uint32_t name; > + uint8_t info, other; > + uint16_t sectidx; > + uint64_t value, size; > +} ELF64symbol; > + > +typedef struct { > + ELF32header hdr; > + ELF32sectheader sect[6]; > + ELF32symbol sym[2]; > + uint8_t space[4096]; > +} ELF32obj; > + > +typedef struct { > + ELF64header hdr; > + ELF64sectheader sect[6]; > + ELF64symbol sym[2]; > + uint8_t space[4096]; > +} ELF64obj; > +]] > + > +local is64_arch = { > + ['x64'] = true, > + ['arm64'] = true, > + ['arm64be'] = true, > + ['ppc'] = false, > + ['mips'] = false, > +} > + > +local is64 = is64_arch[jit.arch] or false > + > +local function create_obj_file(name) > + local elf_filename = os.tmpname() .. '.obj' > + local lua_path = os.getenv('LUA_PATH') > + local lua_bin = require('utils').luacmd(arg):match('%S+') > + local cmd_fmt = 'LUA_PATH="%s" %s -b -n "%s" -e "print()" %s' > + local cmd = (cmd_fmt):format(lua_path, lua_bin, name, > elf_filename) > + local ret = os.execute(cmd) > + assert(ret == 0, 'create an object file') > + return elf_filename > +end > + > +-- Reads a file located in a specified path and returns its > content. > +local function read_file(path) > + local file = assert(io.open(path), 'cannot open an object file') > + local content = file:read('*a') > + file:close() > + return content > +end > + > +-- Parses a buffer in an ELF format and returns an offset and > a size of strtab > +-- and symtab sections. > +local function read_elf(elf_content) > + local ELFobj_type = ffi.typeof(is64 and 'ELF64obj *' or > 'ELF32obj *') > + local ELFsectheader_type = ffi.typeof(is64 and > 'ELF64sectheader *' or 'ELF32sectheader *') > + local elf = ffi.cast(ELFobj_type, elf_content) > + local symtab_hdr, strtab_hdr > + -- Iterate by section headers. > + for i = 0, elf.hdr.shnum do > + local sec = ffi.cast(ELFsectheader_type, elf.sect[i]) > + if sec.type == SHT_SYMTAB then > + symtab_hdr = sec > + strtab_hdr = ffi.cast(ELFsectheader_type, > elf.sect[symtab_hdr.link]) > + break > + end > + end > + > + assert(strtab_hdr ~= nil, 'section .strtab was not found') > + assert(symtab_hdr ~= nil, 'section .symtab was not found') > + > + return strtab_hdr, symtab_hdr > +end > + > +test:plan(3) > + > +local elf_filename = create_obj_file(module_name) > +local elf_content = read_file(elf_filename) > +assert(#elf_content ~= 0, 'cannot read an object file') > + > +local strtab, symtab = read_elf(elf_content) > +local strtab_size = tonumber(strtab.size) > +local strtab_offset = tonumber(strtab.ofs) > +local symtab_size = tonumber(symtab.size) > +local sym_cnt = tonumber(symtab_size / symtab.entsize) > +assert(sym_cnt ~= 0, 'number of symbols is zero') > + > +test:is(strtab_size, expected_strtab_size, 'check .strtab size') > +test:is(strtab_offset, expected_strtab_offset, 'check .strtab > offset') > > As I already said offline, I don’t like that approach. ELF is made > to by > as flexible as possible, so, generally, you have no guarantees about > order and, hence, offsets of any sections. The only reason why we are > getting away with that here, is because this ELF is generated > in `bcsave.lua` in a particular manner. It is ok for now, but no one > knows if this will ever change. I think that the symbol name search > below is sufficient, because it is unlikely to pass with an > incorrect offset. > Sure, I remember your thoughts regarding checking with constant size and offset. I left this for double-checking change proposed in patch. I'll leave this as as without changes and will wait review from Sergey and then we will decide remove these checks or not. > All of the checks are valid for now, though, so feel free to ignore. > > + > +local strtab_str = string.sub(elf_content, strtab_offset, > strtab_offset + strtab_size) > + > +local strtab_p = ffi.cast('char *', strtab_str) > +local sym_name_expected = LJBC_PREFIX .. module_name > +local sym_is_found = false > +for i = 1, sym_cnt do > + local sym_name = ffi.string(strtab_p + i) > + if sym_name_expected == sym_name then > + sym_is_found = true > + break > + end > +end > + > +test:ok(sym_is_found == true, 'symbol is found') > + > +local ret = os.remove(elf_filename) > +assert(ret == true, 'cannot remove an object file') > + > +os.exit(test:check() and 0 or 1) > -- > 2.34.1 > > -- > Best regards, > Maxim Kokryashkin >