Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH] Fix saved bytecode encapsulated in ELF objects.
@ 2023-05-22  8:58 Sergey Bronnikov via Tarantool-patches
  2023-05-23 13:09 ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 3+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2023-05-22  8:58 UTC (permalink / raw)
  To: tarantool-patches

From: Sergey Bronnikov <sergeyb@tarantool.org>

Thanks to Dimitry Andric.

(cherry picked from commit 7dbf0b05f1228c1c719866db5e5f3d58f87f74c8)

Function `bcsave.lua:bcsave_elfobj()` produced an object file in ELF
format with a wrong size size of `.strtab`. Wrong size causes lld to
show an error message similar to: `ld: error: obj/bytecode.o: string
table non-null terminated`.

Sergey Bronnikov:
* added the description and the test for the problem

Signed-off-by: Sergey Bronnikov <sergeyb@tarantool.org>
---
Branch: https://github.com/tarantool/luajit/tree/ligurio/gh-366-strtab-section-correct-size
PR: https://github.com/tarantool/tarantool/pull/8678
---
 src/jit/bcsave.lua                            |   2 +-
 .../lj-366-strtab-correct-size.test.lua       | 148 ++++++++++++++++++
 2 files changed, 149 insertions(+), 1 deletion(-)
 create mode 100644 test/tarantool-tests/lj-366-strtab-correct-size.test.lua

diff --git a/src/jit/bcsave.lua b/src/jit/bcsave.lua
index c17c88e0..2553d97e 100644
--- a/src/jit/bcsave.lua
+++ b/src/jit/bcsave.lua
@@ -275,7 +275,7 @@ typedef struct {
   o.sect[2].size = fofs(ofs)
   o.sect[3].type = f32(3) -- .strtab
   o.sect[3].ofs = fofs(sofs + ofs)
-  o.sect[3].size = fofs(#symname+1)
+  o.sect[3].size = fofs(#symname+2)
   ffi.copy(o.space+ofs+1, symname)
   ofs = ofs + #symname + 2
   o.sect[4].type = f32(1) -- .rodata
diff --git a/test/tarantool-tests/lj-366-strtab-correct-size.test.lua b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
new file mode 100644
index 00000000..d4b51537
--- /dev/null
+++ b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
@@ -0,0 +1,148 @@
+local tap = require('tap')
+local test = tap.test('lj-366-strtab-correct-size'):skipcond({
+  -- Test is ELF-specific and because LuaJIT exports object files in ELF format
+  -- for all operating systems except macOS and Windows we skip test on
+  -- these OSes. See src/jit/bcsave.lua:bcsave_obj.
+  ['Disabled on Windows'] = jit.os == 'Windows',
+  ['Disabled on macOS'] = jit.os == 'OSX',
+})
+
+local ffi = require 'ffi'
+
+-- Reference numbers for strtab offset and size could be obtained with
+-- readelf(1). Note that number system of these number is hexadecimal.
+--
+-- $ luajit -b -n "module_name" -e "print()" xxx.obj
+-- $ readelf --section-headers xxx.obj
+-- There are 6 section headers, starting at offset 0x40:
+--
+-- Section Headers:
+--   [Nr] Name              Type             Address           Offset
+--          Size              EntSize          Flags  Link  Info  Align
+-- ...
+--
+-- [ 3] .strtab           STRTAB           0000000000000000  00000223
+--      0000000000000017  0000000000000000           0     0     1
+-- ...
+
+local expected_strtab_size = 23 -- == 0x17
+local expected_strtab_offset = 547 -- == 0x223
+
+-- Defined in elf.h.
+local sht_symtab = 2
+
+ffi.cdef[[
+typedef struct {
+  uint8_t emagic[4], eclass, eendian, eversion, eosabi, eabiversion, epad[7];
+  uint16_t type, machine;
+  uint32_t version;
+  uint32_t entry, phofs, shofs;
+  uint32_t flags;
+  uint16_t ehsize, phentsize, phnum, shentsize, shnum, shstridx;
+} ELF32header;
+typedef struct {
+  uint8_t emagic[4], eclass, eendian, eversion, eosabi, eabiversion, epad[7];
+  uint16_t type, machine;
+  uint32_t version;
+  uint64_t entry, phofs, shofs;
+  uint32_t flags;
+  uint16_t ehsize, phentsize, phnum, shentsize, shnum, shstridx;
+} ELF64header;
+typedef struct {
+  uint32_t name, type, flags, addr, ofs, size, link, info, align, entsize;
+} ELF32sectheader;
+typedef struct {
+  uint32_t name, type;
+  uint64_t flags, addr, ofs, size;
+  uint32_t link, info;
+  uint64_t align, entsize;
+} ELF64sectheader;
+typedef struct {
+  uint32_t name, value, size;
+  uint8_t info, other;
+  uint16_t sectidx;
+} ELF32symbol;
+typedef struct {
+  uint32_t name;
+  uint8_t info, other;
+  uint16_t sectidx;
+  uint64_t value, size;
+} ELF64symbol;
+typedef struct {
+  ELF32header hdr;
+  ELF32sectheader sect[6];
+  ELF32symbol sym[2];
+  uint8_t space[4096];
+} ELF32obj;
+typedef struct {
+  ELF64header hdr;
+  ELF64sectheader sect[6];
+  ELF64symbol sym[2];
+  uint8_t space[4096];
+} ELF64obj;
+]]
+
+local is64_arch = {
+  ["x64"] = true,
+  ["arm64"] = true,
+  ["arm64be"] = true,
+  ["ppc"] = false,
+  ["mips"] = false,
+}
+
+local is64 = is64_arch[jit.arch] or false
+local ELFobj_type = ffi.typeof(is64 and "ELF64obj *" or "ELF32obj *")
+local ELFsectheader_type = ffi.typeof(is64 and "ELF64sectheader *" or "ELF32sectheader")
+
+-- Reads a file located in a specified path and returns its content.
+local function read_file(path)
+  local file = assert(io.open(path), 'cannot open an object file')
+  local content = file:read("*a")
+  file:close()
+  return content
+end
+
+-- Parses a buffer in an ELF format and returns an offset and a size of strtab
+-- section for a symtab.
+local function read_elf_strtab(elf_content)
+  local elf = ffi.cast(ELFobj_type, elf_content)
+  local elf_header = elf.hdr
+  local sec_strtab
+  for i = 0, elf_header.shnum do
+    local sec = ffi.cast(ELFsectheader_type, elf.sect[i])
+    if sec.type == sht_symtab then
+       sec_strtab = ffi.cast(ELFsectheader_type, elf.sect[sec.link])
+      break
+    end
+  end
+  return sec_strtab.size, sec_strtab.ofs
+end
+
+local function lua_bin_path(arg)
+  local args_str = require('utils').luacmd(arg)
+  local args = {}
+  for a in args_str:gmatch("%S+") do
+    table.insert(args, a);
+  end
+  return args[1]
+end
+
+test:plan(5)
+
+local elf_filename = os.tmpname() .. '.obj'
+local lua_path = os.getenv('LUA_PATH')
+local lua_bin = lua_bin_path(arg)
+local cmd = ('LUA_PATH="%s" %s -b -n "module_name" -e "print()" %s'):format(lua_path, lua_bin, elf_filename)
+local ret = os.execute(cmd)
+test:ok(ret == 0, 'create an object file')
+
+local elf_content = read_file(elf_filename)
+test:ok(#elf_content ~= 0, 'read an object file')
+local strtab_size, strtab_offset = read_elf_strtab(elf_content)
+test:ok(strtab_size == expected_strtab_size, 'check .strtab size')
+test:ok(strtab_offset == expected_strtab_offset, 'check .strtab offset')
+
+ret = os.execute(("rm -f %s"):format(elf_filename))
+test:ok(ret == 0, 'remove an object file')
+
+os.exit(test:check() and 0 or 1)
-- 
2.34.1


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Tarantool-patches] [PATCH] Fix saved bytecode encapsulated in ELF objects.
  2023-05-22  8:58 [Tarantool-patches] [PATCH] Fix saved bytecode encapsulated in ELF objects Sergey Bronnikov via Tarantool-patches
@ 2023-05-23 13:09 ` Sergey Kaplun via Tarantool-patches
  2023-05-29 14:34   ` Sergey Bronnikov via Tarantool-patches
  0 siblings, 1 reply; 3+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2023-05-23 13:09 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the patch!
Please consider my comments below.

On 22.05.23, Sergey Bronnikov wrote:
> From: Sergey Bronnikov <sergeyb@tarantool.org>
> 
> Thanks to Dimitry Andric.
> 
> (cherry picked from commit 7dbf0b05f1228c1c719866db5e5f3d58f87f74c8)
> 
> Function `bcsave.lua:bcsave_elfobj()` produced an object file in ELF
> format with a wrong size size of `.strtab`. Wrong size causes lld to
> show an error message similar to: `ld: error: obj/bytecode.o: string
> table non-null terminated`.
> 
> Sergey Bronnikov:
> * added the description and the test for the problem
> 
> Signed-off-by: Sergey Bronnikov <sergeyb@tarantool.org>
> ---
> Branch: https://github.com/tarantool/luajit/tree/ligurio/gh-366-strtab-section-correct-size
> PR: https://github.com/tarantool/tarantool/pull/8678
> ---
>  src/jit/bcsave.lua                            |   2 +-
>  .../lj-366-strtab-correct-size.test.lua       | 148 ++++++++++++++++++
>  2 files changed, 149 insertions(+), 1 deletion(-)
>  create mode 100644 test/tarantool-tests/lj-366-strtab-correct-size.test.lua
> 
> diff --git a/src/jit/bcsave.lua b/src/jit/bcsave.lua
> index c17c88e0..2553d97e 100644
> --- a/src/jit/bcsave.lua
> +++ b/src/jit/bcsave.lua
> @@ -275,7 +275,7 @@ typedef struct {
>    o.sect[2].size = fofs(ofs)
>    o.sect[3].type = f32(3) -- .strtab
>    o.sect[3].ofs = fofs(sofs + ofs)
> -  o.sect[3].size = fofs(#symname+1)
> +  o.sect[3].size = fofs(#symname+2)
>    ffi.copy(o.space+ofs+1, symname)
>    ofs = ofs + #symname + 2
>    o.sect[4].type = f32(1) -- .rodata
> diff --git a/test/tarantool-tests/lj-366-strtab-correct-size.test.lua b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
> new file mode 100644
> index 00000000..d4b51537
> --- /dev/null
> +++ b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
> @@ -0,0 +1,148 @@
> +local tap = require('tap')
> +local test = tap.test('lj-366-strtab-correct-size'):skipcond({
> +  -- Test is ELF-specific and because LuaJIT exports object files in ELF format

Typo: s/Test/The test/
Typo: s/, and/and/

Nit: Comment line is wider than 66 symbols.
Here and below.

> +  -- for all operating systems except macOS and Windows we skip test on
> +  -- these OSes. See src/jit/bcsave.lua:bcsave_obj.
> +  ['Disabled on Windows'] = jit.os == 'Windows',

Side note: I'm not sure, that we shall mention Windows, when we still
don't support it. Still, this skipcond is necessary, so I'm OK with it.

> +  ['Disabled on macOS'] = jit.os == 'OSX',
> +})
> +
> +local ffi = require 'ffi'
> +
> +-- Reference numbers for strtab offset and size could be obtained with
> +-- readelf(1). Note that number system of these number is hexadecimal.

Typo: s/number/numbers/

> +--
> +-- $ luajit -b -n "module_name" -e "print()" xxx.obj
> +-- $ readelf --section-headers xxx.obj

Minor: May you also provide the linker command with the failure
mentioned in the issue?

Side note: I suppose that the starting \0 byte [1] was forgotten to
count. :)

> +-- There are 6 section headers, starting at offset 0x40:
> +--
> +-- Section Headers:
> +--   [Nr] Name              Type             Address           Offset
> +--          Size              EntSize          Flags  Link  Info  Align
> +-- ...
> +--
> +-- [ 3] .strtab           STRTAB           0000000000000000  00000223
> +--      0000000000000017  0000000000000000           0     0     1
> +-- ...
> +
> +local expected_strtab_size = 23 -- == 0x17

Minor: Also, comment about this size (length of
`LJBC_PREFIX..ctx.modname` + 2) is desirable. Maybe we should copy this
string and calculate the size here as well.
Feel free to ignore.

> +local expected_strtab_offset = 547 -- == 0x223

Why don't declare as 0x17, 0x223?

> +
> +-- Defined in elf.h.
> +local sht_symtab = 2

Minor: Maybe it is better to use UPPER-CASE for constants?
Feel free to ignore.

> +
> +ffi.cdef[[
> +typedef struct {
> +  uint8_t emagic[4], eclass, eendian, eversion, eosabi, eabiversion, epad[7];
> +  uint16_t type, machine;
> +  uint32_t version;
> +  uint32_t entry, phofs, shofs;
> +  uint32_t flags;
> +  uint16_t ehsize, phentsize, phnum, shentsize, shnum, shstridx;
> +} ELF32header;

I suggest to add additional empty line between each declaration for
better readability.

> +typedef struct {
> +  uint8_t emagic[4], eclass, eendian, eversion, eosabi, eabiversion, epad[7];
> +  uint16_t type, machine;
> +  uint32_t version;
> +  uint64_t entry, phofs, shofs;
> +  uint32_t flags;
> +  uint16_t ehsize, phentsize, phnum, shentsize, shnum, shstridx;
> +} ELF64header;
> +typedef struct {
> +  uint32_t name, type, flags, addr, ofs, size, link, info, align, entsize;
> +} ELF32sectheader;
> +typedef struct {
> +  uint32_t name, type;
> +  uint64_t flags, addr, ofs, size;
> +  uint32_t link, info;
> +  uint64_t align, entsize;
> +} ELF64sectheader;
> +typedef struct {
> +  uint32_t name, value, size;
> +  uint8_t info, other;
> +  uint16_t sectidx;
> +} ELF32symbol;

I have the following declaration in <elf.h>, maybe we should use the the
same declarations ad litteram? Just to avoid inconsistencies in naming?
The same for all other structures defined.

| typedef struct
| {
|   Elf32_Word	st_name;		/* Symbol name (string tbl index) */
|   Elf32_Addr	st_value;		/* Symbol value */
|   Elf32_Word	st_size;		/* Symbol size */
|   unsigned char	st_info;		/* Symbol type and binding */
|   unsigned char	st_other;		/* Symbol visibility */
|   Elf32_Section	st_shndx;		/* Section index */
| } Elf32_Sym;

Or we may use the same as defined in <src/jit/bcsave.lua> (as you did),
but the comment about it is desirable.

> +typedef struct {
> +  uint32_t name;
> +  uint8_t info, other;
> +  uint16_t sectidx;
> +  uint64_t value, size;
> +} ELF64symbol;
> +typedef struct {
> +  ELF32header hdr;
> +  ELF32sectheader sect[6];
> +  ELF32symbol sym[2];
> +  uint8_t space[4096];
> +} ELF32obj;
> +typedef struct {
> +  ELF64header hdr;
> +  ELF64sectheader sect[6];
> +  ELF64symbol sym[2];
> +  uint8_t space[4096];
> +} ELF64obj;
> +]]
> +
> +local is64_arch = {
> +  ["x64"] = true,

Nit: please, use single-quotes for strings, here and below.

> +  ["arm64"] = true,
> +  ["arm64be"] = true,
> +  ["ppc"] = false,
> +  ["mips"] = false,

Why do we need ppc, and mips definitions here?

> +}
> +
> +local is64 = is64_arch[jit.arch] or false
> +local ELFobj_type = ffi.typeof(is64 and "ELF64obj *" or "ELF32obj *")
> +local ELFsectheader_type = ffi.typeof(is64 and "ELF64sectheader *" or "ELF32sectheader")

Should it be `ELF32sectheader *`?

> +
> +-- Reads a file located in a specified path and returns its content.
> +local function read_file(path)
> +  local file = assert(io.open(path), 'cannot open an object file')
> +  local content = file:read("*a")
> +  file:close()
> +  return content
> +end
> +
> +-- Parses a buffer in an ELF format and returns an offset and a size of strtab
> +-- section for a symtab.
> +local function read_elf_strtab(elf_content)
> +  local elf = ffi.cast(ELFobj_type, elf_content)
> +  local elf_header = elf.hdr
> +  local sec_strtab
> +  for i = 0, elf_header.shnum do
> +    local sec = ffi.cast(ELFsectheader_type, elf.sect[i])
> +    if sec.type == sht_symtab then

Why don't just to find .strtab instead?
LuaJIT emits really a small obj file, (it's literally the third section).

> +       sec_strtab = ffi.cast(ELFsectheader_type, elf.sect[sec.link])
> +      break
> +    end
> +  end
> +  return sec_strtab.size, sec_strtab.ofs
> +end
> +
> +local function lua_bin_path(arg)

Don't get why do we need this function to get luabin?
Why just `luacmd` is not enough?

> +  local args_str = require('utils').luacmd(arg)
> +  local args = {}
> +  for a in args_str:gmatch("%S+") do
> +    table.insert(args, a);
> +  end
> +  return args[1]

Why do we need to parse the whole string if we return only first
occurence of %S+?

> +end
> +
> +test:plan(5)
> +
> +local elf_filename = os.tmpname() .. '.obj'
> +local lua_path = os.getenv('LUA_PATH')
> +local lua_bin = lua_bin_path(arg)
> +local cmd = ('LUA_PATH="%s" %s -b -n "module_name" -e "print()" %s'):format(lua_path, lua_bin, elf_filename)

Nit: Code line is wider than 80 symbols.

> +local ret = os.execute(cmd)
> +test:ok(ret == 0, 'create an object file')

I suggest to use `assert()` for these checks for test correcness.
`test:*()` functions -- to check some bugs
`assert()` -- to verify test correctness or validity
So, we have only 2 test to plan: for size and offset.

> +
> +local elf_content = read_file(elf_filename)
> +test:ok(#elf_content ~= 0, 'read an object file')
> +local strtab_size, strtab_offset = read_elf_strtab(elf_content)
> +test:ok(strtab_size == expected_strtab_size, 'check .strtab size')
> +test:ok(strtab_offset == expected_strtab_offset, 'check .strtab offset')
> +
> +ret = os.execute(("rm -f %s"):format(elf_filename))

I suppose, that we may use `os.remove()` here.

> +test:ok(ret == 0, 'remove an object file')
> +
> +os.exit(test:check() and 0 or 1)
> -- 
> 2.34.1
> 

[1]: https://refspecs.linuxbase.org/elf/gabi4+/ch4.strtab.html

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Tarantool-patches] [PATCH] Fix saved bytecode encapsulated in ELF objects.
  2023-05-23 13:09 ` Sergey Kaplun via Tarantool-patches
@ 2023-05-29 14:34   ` Sergey Bronnikov via Tarantool-patches
  0 siblings, 0 replies; 3+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2023-05-29 14:34 UTC (permalink / raw)
  To: Sergey Kaplun, Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!

Thanks for review, please see my answers (and patches with fixes) below.

Changes force-pushed, and I'll send a second version to the mailing list.

Sergey

On 5/23/23 16:09, Sergey Kaplun wrote:
> Hi, Sergey!
> Thanks for the patch!
> Please consider my comments below.
>
> On 22.05.23, Sergey Bronnikov wrote:
>> From: Sergey Bronnikov <sergeyb@tarantool.org>
>>
>> Thanks to Dimitry Andric.
>>
>> (cherry picked from commit 7dbf0b05f1228c1c719866db5e5f3d58f87f74c8)
>>
>> Function `bcsave.lua:bcsave_elfobj()` produced an object file in ELF
>> format with a wrong size size of `.strtab`. Wrong size causes lld to
>> show an error message similar to: `ld: error: obj/bytecode.o: string
>> table non-null terminated`.
>>
>> Sergey Bronnikov:
>> * added the description and the test for the problem
>>
>> Signed-off-by: Sergey Bronnikov <sergeyb@tarantool.org>
>> ---
>> Branch: https://github.com/tarantool/luajit/tree/ligurio/gh-366-strtab-section-correct-size
>> PR: https://github.com/tarantool/tarantool/pull/8678
>> ---
>>   src/jit/bcsave.lua                            |   2 +-
>>   .../lj-366-strtab-correct-size.test.lua       | 148 ++++++++++++++++++
>>   2 files changed, 149 insertions(+), 1 deletion(-)
>>   create mode 100644 test/tarantool-tests/lj-366-strtab-correct-size.test.lua
>>
>> diff --git a/src/jit/bcsave.lua b/src/jit/bcsave.lua
>> index c17c88e0..2553d97e 100644
>> --- a/src/jit/bcsave.lua
>> +++ b/src/jit/bcsave.lua
>> @@ -275,7 +275,7 @@ typedef struct {
>>     o.sect[2].size = fofs(ofs)
>>     o.sect[3].type = f32(3) -- .strtab
>>     o.sect[3].ofs = fofs(sofs + ofs)
>> -  o.sect[3].size = fofs(#symname+1)
>> +  o.sect[3].size = fofs(#symname+2)
>>     ffi.copy(o.space+ofs+1, symname)
>>     ofs = ofs + #symname + 2
>>     o.sect[4].type = f32(1) -- .rodata
>> diff --git a/test/tarantool-tests/lj-366-strtab-correct-size.test.lua b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
>> new file mode 100644
>> index 00000000..d4b51537
>> --- /dev/null
>> +++ b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
>> @@ -0,0 +1,148 @@
>> +local tap = require('tap')
>> +local test = tap.test('lj-366-strtab-correct-size'):skipcond({
>> +  -- Test is ELF-specific and because LuaJIT exports object files in ELF format
> Typo: s/Test/The test/
> Typo: s/, and/and/

Fixed!

>
> Nit: Comment line is wider than 66 symbols.
> Here and below.

Fixed!

--- a/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
+++ b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
@@ -1,8 +1,9 @@
  local tap = require('tap')
  local test = tap.test('lj-366-strtab-correct-size'):skipcond({
-  -- Test is ELF-specific and because LuaJIT exports object files in 
ELF format
-  -- for all operating systems except macOS and Windows we skip test on
-  -- these OSes. See src/jit/bcsave.lua:bcsave_obj.
+  -- The test is ELF-specific, and because LuaJIT exports object
+  -- files in ELF format for all operating systems except macOS
+  -- and Windows we skip test on these OSes.
+  -- See src/jit/bcsave.lua:bcsave_obj.
    ['Disabled on Windows'] = jit.os == 'Windows',
    ['Disabled on macOS'] = jit.os == 'OSX',
  })

>> +  -- for all operating systems except macOS and Windows we skip test on
>> +  -- these OSes. See src/jit/bcsave.lua:bcsave_obj.
>> +  ['Disabled on Windows'] = jit.os == 'Windows',
> Side note: I'm not sure, that we shall mention Windows, when we still
> don't support it. Still, this skipcond is necessary, so I'm OK with it.
>
>> +  ['Disabled on macOS'] = jit.os == 'OSX',
>> +})
>> +
>> +local ffi = require 'ffi'
>> +
>> +-- Reference numbers for strtab offset and size could be obtained with
>> +-- readelf(1). Note that number system of these number is hexadecimal.
> Typo: s/number/numbers/

Fixed.


@@ -10,7 +11,7 @@ local test = 
tap.test('lj-366-strtab-correct-size'):skipcond({
  local ffi = require 'ffi'

  -- Reference numbers for strtab offset and size could be obtained with
--- readelf(1). Note that number system of these number is hexadecimal.
+-- readelf(1). Note that number system of these numbers are hexadecimal.
  --
  -- $ luajit -b -n "module_name" -e "print()" xxx.obj
  -- $ readelf --section-headers xxx.obj

>> +--
>> +-- $ luajit -b -n "module_name" -e "print()" xxx.obj
>> +-- $ readelf --section-headers xxx.obj
> Minor: May you also provide the linker command with the failure
> mentioned in the issue?

Sure:


     $ luajit  -b -n "module_name" -e "print()" xxx.obj
     $ ld.lld xxx.obj
     ld.lld: error: xxx.obj: SHT_STRTAB string table section [index 3] 
is non-null terminated


Added to commit message.

> Side note: I suppose that the starting \0 byte [1] was forgotten to
> count. :)
>> +-- There are 6 section headers, starting at offset 0x40:
>> +--
>> +-- Section Headers:
>> +--   [Nr] Name              Type             Address           Offset
>> +--          Size              EntSize          Flags  Link  Info  Align
>> +-- ...
>> +--
>> +-- [ 3] .strtab           STRTAB           0000000000000000  00000223
>> +--      0000000000000017  0000000000000000           0     0     1
>> +-- ...
>> +
>> +local expected_strtab_size = 23 -- == 0x17
> Minor: Also, comment about this size (length of
> `LJBC_PREFIX..ctx.modname` + 2) is desirable. Maybe we should copy this
> string and calculate the size here as well.
> Feel free to ignore.

Added.


>
>> +local expected_strtab_offset = 547 -- == 0x223
> Why don't declare as 0x17, 0x223?

Because I forgot that Lua allows to use hex numbers.

Fixed.

>
>> +
>> +-- Defined in elf.h.
>> +local sht_symtab = 2
> Minor: Maybe it is better to use UPPER-CASE for constants?
> Feel free to ignore.
Switched to upper-case.
>
>> +
>> +ffi.cdef[[
>> +typedef struct {
>> +  uint8_t emagic[4], eclass, eendian, eversion, eosabi, eabiversion, epad[7];
>> +  uint16_t type, machine;
>> +  uint32_t version;
>> +  uint32_t entry, phofs, shofs;
>> +  uint32_t flags;
>> +  uint16_t ehsize, phentsize, phnum, shentsize, shnum, shstridx;
>> +} ELF32header;
> I suggest to add additional empty line between each declaration for
> better readability.
Added.
>
>> +typedef struct {
>> +  uint8_t emagic[4], eclass, eendian, eversion, eosabi, eabiversion, epad[7];
>> +  uint16_t type, machine;
>> +  uint32_t version;
>> +  uint64_t entry, phofs, shofs;
>> +  uint32_t flags;
>> +  uint16_t ehsize, phentsize, phnum, shentsize, shnum, shstridx;
>> +} ELF64header;
>> +typedef struct {
>> +  uint32_t name, type, flags, addr, ofs, size, link, info, align, entsize;
>> +} ELF32sectheader;
>> +typedef struct {
>> +  uint32_t name, type;
>> +  uint64_t flags, addr, ofs, size;
>> +  uint32_t link, info;
>> +  uint64_t align, entsize;
>> +} ELF64sectheader;
>> +typedef struct {
>> +  uint32_t name, value, size;
>> +  uint8_t info, other;
>> +  uint16_t sectidx;
>> +} ELF32symbol;
> I have the following declaration in <elf.h>, maybe we should use the the
> same declarations ad litteram? Just to avoid inconsistencies in naming?
> The same for all other structures defined.
>
> | typedef struct
> | {
> |   Elf32_Word	st_name;		/* Symbol name (string tbl index) */
> |   Elf32_Addr	st_value;		/* Symbol value */
> |   Elf32_Word	st_size;		/* Symbol size */
> |   unsigned char	st_info;		/* Symbol type and binding */
> |   unsigned char	st_other;		/* Symbol visibility */
> |   Elf32_Section	st_shndx;		/* Section index */
> | } Elf32_Sym;
>
> Or we may use the same as defined in <src/jit/bcsave.lua> (as you did),
> but the comment about it is desirable.
C definitions were copy-pasted from bcsave.lua, added a comment about it.
>
>> +typedef struct {
>> +  uint32_t name;
>> +  uint8_t info, other;
>> +  uint16_t sectidx;
>> +  uint64_t value, size;
>> +} ELF64symbol;
>> +typedef struct {
>> +  ELF32header hdr;
>> +  ELF32sectheader sect[6];
>> +  ELF32symbol sym[2];
>> +  uint8_t space[4096];
>> +} ELF32obj;
>> +typedef struct {
>> +  ELF64header hdr;
>> +  ELF64sectheader sect[6];
>> +  ELF64symbol sym[2];
>> +  uint8_t space[4096];
>> +} ELF64obj;
>> +]]
>> +
>> +local is64_arch = {
>> +  ["x64"] = true,
> Nit: please, use single-quotes for strings, here and below.
Fixed.
>
>> +  ["arm64"] = true,
>> +  ["arm64be"] = true,
>> +  ["ppc"] = false,
>> +  ["mips"] = false,
> Why do we need ppc, and mips definitions here?

I would left unsupported arch's here as well, it will simplify a bit work

for those who will port test to these arch's in a future.

>
>> +}
>> +
>> +local is64 = is64_arch[jit.arch] or false
>> +local ELFobj_type = ffi.typeof(is64 and "ELF64obj *" or "ELF32obj *")
>> +local ELFsectheader_type = ffi.typeof(is64 and "ELF64sectheader *" or "ELF32sectheader")
> Should it be `ELF32sectheader *`?

Sure, fixed.


>
>> +
>> +-- Reads a file located in a specified path and returns its content.
>> +local function read_file(path)
>> +  local file = assert(io.open(path), 'cannot open an object file')
>> +  local content = file:read("*a")
>> +  file:close()
>> +  return content
>> +end
>> +
>> +-- Parses a buffer in an ELF format and returns an offset and a size of strtab
>> +-- section for a symtab.
>> +local function read_elf_strtab(elf_content)
>> +  local elf = ffi.cast(ELFobj_type, elf_content)
>> +  local elf_header = elf.hdr
>> +  local sec_strtab
>> +  for i = 0, elf_header.shnum do
>> +    local sec = ffi.cast(ELFsectheader_type, elf.sect[i])
>> +    if sec.type == sht_symtab then
> Why don't just to find .strtab instead?
> LuaJIT emits really a small obj file, (it's literally the third section).

strtab is an abstract section, it could be used not for storing symbols 
but for other purposes too.

in this test we need a strtab section for symtab.

>
>> +       sec_strtab = ffi.cast(ELFsectheader_type, elf.sect[sec.link])
>> +      break
>> +    end
>> +  end
>> +  return sec_strtab.size, sec_strtab.ofs
>> +end
>> +
>> +local function lua_bin_path(arg)
> Don't get why do we need this function to get luabin?
> Why just `luacmd` is not enough?

luacmd is not suitable for my test without fixing helper.

The problem was in passing additional arguments to a final luajit cmd.

It looked that easier don't use luacmd than fix it for my purposes.

>
>> +  local args_str = require('utils').luacmd(arg)
>> +  local args = {}
>> +  for a in args_str:gmatch("%S+") do
>> +    table.insert(args, a);
>> +  end
>> +  return args[1]
> Why do we need to parse the whole string if we return only first
> occurence of %S+?
Fixed.
>> +end
>> +
>> +test:plan(5)
>> +
>> +local elf_filename = os.tmpname() .. '.obj'
>> +local lua_path = os.getenv('LUA_PATH')
>> +local lua_bin = lua_bin_path(arg)
>> +local cmd = ('LUA_PATH="%s" %s -b -n "module_name" -e "print()" %s'):format(lua_path, lua_bin, elf_filename)
> Nit: Code line is wider than 80 symbols.
Fixed.
>
>> +local ret = os.execute(cmd)
>> +test:ok(ret == 0, 'create an object file')
> I suggest to use `assert()` for these checks for test correcness.
> `test:*()` functions -- to check some bugs
> `assert()` -- to verify test correctness or validity
> So, we have only 2 test to plan: for size and offset.

With asserts I don't like that it breaks test report.

According to TAP spec there is "bail out" that should handle cases

when something goes wrong and we couldn't continue testing.

In our implementation there is not "bail out" support.


Replaced test:ok() to asserts where checks are related to test 
environment and

output  now looks as below:

TAP version 13
1..2
ok - check .strtab size
ok - check .strtab offset

>
>> +
>> +local elf_content = read_file(elf_filename)
>> +test:ok(#elf_content ~= 0, 'read an object file')
>> +local strtab_size, strtab_offset = read_elf_strtab(elf_content)
>> +test:ok(strtab_size == expected_strtab_size, 'check .strtab size')
>> +test:ok(strtab_offset == expected_strtab_offset, 'check .strtab offset')
>> +
>> +ret = os.execute(("rm -f %s"):format(elf_filename))
> I suppose, that we may use `os.remove()` here.
Sure, replaced to os.remove().
>
>> +test:ok(ret == 0, 'remove an object file')
>> +
>> +os.exit(test:check() and 0 or 1)
>> -- 
>> 2.34.1
>>
> [1]: https://refspecs.linuxbase.org/elf/gabi4+/ch4.strtab.html
>

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-05-29 14:34 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-22  8:58 [Tarantool-patches] [PATCH] Fix saved bytecode encapsulated in ELF objects Sergey Bronnikov via Tarantool-patches
2023-05-23 13:09 ` Sergey Kaplun via Tarantool-patches
2023-05-29 14:34   ` Sergey Bronnikov via Tarantool-patches

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox