[Tarantool-patches] [PATCH luajit 1/2] OSX/iOS/ARM64: Fix generation of Mach-O object files.
Sergey Bronnikov
sergeyb at tarantool.org
Tue Mar 19 11:19:19 MSK 2024
Max,
thanks for review! see comments below
On 3/18/24 15:53, Maxim Kokryashkin wrote:
> Hi, Sergey, thanks for the patch!
> Please consider my comments below.
> On Thu, Mar 14, 2024 at 02:39:49PM +0300, Sergey Bronnikov wrote:
>> From: sergeyb at tarantool.org
>>
>> Thanks to Carlo Cabrera.
>>
>> (cherry picked from commit 3065c910ad6027031aabe2dfd3c26a3d0f014b4f)
>>
>> Mach-O FAT object constructed by LuaJIT had an incorrect format. The
> Typo: s/Mach-O FAT/The Mach-O FAT/
Fixed.
>> problem is reproduced when target hardware platform has AVX512F and
> Typo:s/target/the target/
Fixed.
>> LuaJIT is compiled with enabled AVX512F instructions.
>>
>> The problem is arise because LuaJIT FFI code for Mach-O file generation
> Typo: s/is arise/arises/
Fixed.
>> in `bcsave.lua` relied on undefined behavior for conversions to
> Typo: s/relied/relies/
Fixed.
>> `uint32_t`. AVX512F has the `vcvttsd2usi` instruction which converts
> Typo: s/instruction/instruction,/
Fixed.
>> `double`/`float` to `uint32_t/uint64_t`. Earlier architectures (SSE2,
>> AVX2) are sorely lacking such an instruction, as they only support
>> signed conversions. Unsigned conversions are done with a signed convert
>> and range shifting - the exact algorithm depends on the compiler.
>> A side-effect of these workarounds is that negative `double`/`float`
>> often inadvertently convert 'as expected', even though this is invoking
>> undefined behavior. Whereas `vcvttsd2usi` always returns 0x80000000 or
>> 0x8000000000000000 for out-of-range inputs.
>>
>> The patch fixes the problem, however, the real issue remains unfixed.
>>
>> Sergey Bronnikov:
>> * added the description and a test for the problem
> Typo: s/a test/the test/
Fixed.
>> Part of tarantool/tarantool#9595
>> ---
>> .github/workflows/exotic-builds-testing.yml | 5 +-
>> src/jit/bcsave.lua | 6 +-
>> .../lj-366-strtab-correct-size.test.lua | 10 +-
>> ...generation_of_mach-o_object_files.test.lua | 271 ++++++++++++++++++
>> test/tarantool-tests/utils/tools.lua | 8 +
>> 5 files changed, 287 insertions(+), 13 deletions(-)
>> create mode 100644 test/tarantool-tests/lj-865-fix_generation_of_mach-o_object_files.test.lua
>>
>> diff --git a/.github/workflows/exotic-builds-testing.yml b/.github/workflows/exotic-builds-testing.yml
>> index a9ba5fd5..df4bc2e9 100644
>> --- a/.github/workflows/exotic-builds-testing.yml
>> +++ b/.github/workflows/exotic-builds-testing.yml
>> @@ -32,6 +32,7 @@ jobs:
>> fail-fast: false
>> matrix:
>> BUILDTYPE: [Debug, Release]
>> + OS: [Linux, macOS]
>> ARCH: [ARM64, x86_64]
>> GC64: [ON, OFF]
>> FLAVOR: [checkhook, dualnum, gdbjit, nojit, nounwind]
>> @@ -50,13 +51,15 @@ jobs:
>> FLAVORFLAGS: -DLUAJIT_USE_GDBJIT=ON
> https://github.com/tarantool/luajit/actions/runs/8279362128
>> - FLAVOR: nounwind
>> FLAVORFLAGS: -DLUAJIT_NO_UNWIND=ON
>> + - FLAVOR: avx512
>> + CMAKEFLAGS: -DCMAKE_C_FLAGS=skylake-avx512 -DCMAKE_C_COMPILER=gcc
>> exclude:
>> - ARCH: ARM64
>> GC64: OFF
>> # DUALNUM is default for ARM64, no need for additional testing.
>> - FLAVOR: dualnum
>> ARCH: ARM64
>> - runs-on: [self-hosted, regular, Linux, '${{ matrix.ARCH }}']
>> + runs-on: [self-hosted, regular, Linux, '${{ matrix.ARCH }}', '${ matrix.OS }']
> The matrix.OS variable should be wrapped with double curly braces,
> instead of singular ones. So, it should be like this:
> | '${{ matrix.OS }}'
Fixed, thanks!
> Currently, exotic build testing fails to start because of this mistake.
> https://github.com/tarantool/luajit/actions/runs/8279362128
>> name: >
>> LuaJIT ${{ matrix.FLAVOR }}
>> (Linux/${{ matrix.ARCH }})
>> diff --git a/src/jit/bcsave.lua b/src/jit/bcsave.lua
>> index a287d675..7aec1555 100644
>> --- a/src/jit/bcsave.lua
>> +++ b/src/jit/bcsave.lua
>> @@ -446,18 +446,18 @@ typedef struct {
>> uint32_t value;
>> } mach_nlist;
>> typedef struct {
>> - uint32_t strx;
>> + int32_t strx;
>> uint8_t type, sect;
>> uint16_t desc;
>> uint64_t value;
>> } mach_nlist_64;
>> typedef struct
>> {
>> - uint32_t magic, nfat_arch;
>> + int32_t magic, nfat_arch;
>> } mach_fat_header;
>> typedef struct
>> {
>> - uint32_t cputype, cpusubtype, offset, size, align;
>> + int32_t cputype, cpusubtype, offset, size, align;
>> } mach_fat_arch;
>> typedef struct {
>> struct {
>> diff --git a/test/tarantool-tests/lj-366-strtab-correct-size.test.lua b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
>> index 8a97a441..0bb92da6 100644
>> --- a/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
>> +++ b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
> Let's move this update to a separate patch alongside with the added
> utility function. Currently, this change in unrelated test file is
> a bit confusing.
Moved to a separate patch:
commit 138bcda850ed4aad4bbbae4d30315d9a0934cb26
Author: Sergey Bronnikov <sergeyb at tarantool.org>
Date: Tue Mar 19 09:56:51 2024 +0300
test: introduce a helper read_file
The test `lj-366-strtab-correct-size.test.lua` has a test helper
`read_file` that read a file's content and returns it.
This helper will be useful for a test upcoming in the next commit,
so it is moved to a test tools.
diff --git a/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
index 8a97a441..0bb92da6 100644
--- a/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
+++ b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
@@ -138,14 +138,6 @@ local function create_obj_file(name)
return elf_filename
end
--- Reads a file located in a specified path and returns its content.
-local function read_file(path)
- local file = assert(io.open(path), 'cannot open an object file')
- local content = file:read('*a')
- file:close()
- return content
-end
-
-- Parses a buffer in an ELF format and returns an offset and a size
of strtab
-- and symtab sections.
local function read_elf(elf_content)
@@ -172,7 +164,7 @@ end
test:plan(3)
local elf_filename = create_obj_file(MODULE_NAME)
-local elf_content = read_file(elf_filename)
+local elf_content = require('utils').tools.read_file(elf_filename)
assert(#elf_content ~= 0, 'cannot read an object file')
local strtab, symtab = read_elf(elf_content)
diff --git a/test/tarantool-tests/utils/tools.lua
b/test/tarantool-tests/utils/tools.lua
index f35c6922..26b8c08d 100644
--- a/test/tarantool-tests/utils/tools.lua
+++ b/test/tarantool-tests/utils/tools.lua
@@ -12,4 +12,12 @@ function M.profilename(name)
return (arg[0]:gsub('^(.+)/([^/]+)%.test%.lua$', replacepattern))
end
+-- Reads a file located in a specified path and returns its content.
+function M.read_file(path)
+ local file = assert(io.open(path), 'cannot open an object file')
+ local content = file:read('*a')
+ file:close()
+ return content
+end
+
return M
>> @@ -138,14 +138,6 @@ local function create_obj_file(name)
>> return elf_filename
>> end
>>
>> --- Reads a file located in a specified path and returns its content.
>> -local function read_file(path)
>> - local file = assert(io.open(path), 'cannot open an object file')
>> - local content = file:read('*a')
>> - file:close()
>> - return content
>> -end
>> -
>> -- Parses a buffer in an ELF format and returns an offset and a size of strtab
>> -- and symtab sections.
>> local function read_elf(elf_content)
>> @@ -172,7 +164,7 @@ end
>> test:plan(3)
>>
>> local elf_filename = create_obj_file(MODULE_NAME)
>> -local elf_content = read_file(elf_filename)
>> +local elf_content = require('utils').tools.read_file(elf_filename)
>> assert(#elf_content ~= 0, 'cannot read an object file')
>>
>> local strtab, symtab = read_elf(elf_content)
>> diff --git a/test/tarantool-tests/lj-865-fix_generation_of_mach-o_object_files.test.lua b/test/tarantool-tests/lj-865-fix_generation_of_mach-o_object_files.test.lua
>> new file mode 100644
>> index 00000000..0519e134
>> --- /dev/null
>> +++ b/test/tarantool-tests/lj-865-fix_generation_of_mach-o_object_files.test.lua
>> @@ -0,0 +1,271 @@
>> +local tap = require('tap')
>> +local test = tap.test('lj-865-fix_generation_of_mach-o_object_files'):skipcond({
>> + -- XXX: Tarantool doesn't use default LuaJIT loaders, and Lua
>> + -- bytecode can't be loaded from the shared library. For more
> Typo: s/the shared/a shared/
Fixed.
>> + -- info: https://github.com/tarantool/tarantool/issues/9671.
>> + -- luacheck: no global
>> + ['Test uses exotic type of loaders (see #9671)'] = _TARANTOOL,
> Typo: s/uses exotic/uses an exotic/
Fixed.
>> +})
>> +
>> +test:plan(4)
>> +
>> +-- Test creates an object file in Mach-O format with LuaJIT bytecode
> Typo: s/Test/The test/
Fixed.
>> +-- and checks validness of the object file fields.
> Typo: s/validness/the validity/
Fixed.
>> +--
>> +-- The original problem is reproduced with LuaJIT that built with
>> +-- enabled AVX512F instructions. The support of AVX512F could be
> Typo: s/support of/support for/
Fixed.
>> +-- checked in `/proc/cpuinfo` on Linux and
>> +-- `sysctl hw.optional.avx512f` on Mac. AVX512F must be
>> +-- implicitly enabled in a C compiler by passing CPU codename.
>> +-- Please consult for available model architecture on GCC Online
> Typo: s/for/the/
Fixed.
>> +-- Documentation [1] for available CPU codenames. To detect
>> +-- CPU codename execute `gcc -march=native -Q --help=target | grep march`.
>> +--
>> +-- 1. https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html
>> +--
>> +-- Manual steps for reproducing are the following:
>> +--
>> +-- $ make CC=gcc TARGET_CFLAGS='skylake-avx512' -f Makefile.original
>> +-- $ echo > test.lua
>> +-- $ LUA_PATH="src/?.lua;;" luajit -b -o osx -a arm test.lua test.o
>> +-- $ file test.o
>> +-- empty.o: DOS executable (block device driver)
>> +
>> +local ffi = require('ffi')
>> +
>> +-- Format of the Mach-O is described in a document
> Typo: s/a document/the document/
Fixed.
>> +-- "OS X ABI Mach-O File Format Reference", published by Apple company.
>> +-- Copy of the (now removed) official documentation in [1].
> Let's replace "in [1]" with "can be found here [1].".
Fixed.
>> +-- Yet another source of thruth is a XNU headers, see the definition
> Typo: s/is a XNU/are XNU/
> Typo: s/the definition/definitions/
Fixed.
>> +-- of C-structures in: [2] (`nlist_64`), [3] (`fat_arch` and `fat_header`).
>> +
>> +-- 1. https://github.com/aidansteele/osx-abi-macho-file-format-reference
>> +-- 2. https://github.com/apple-oss-distributions/xnu/blob/xnu-10002.1.13/EXTERNAL_HEADERS/mach-o/nlist.h
>> +-- 3. https://github.com/apple-oss-distributions/xnu/blob/xnu-10002.1.13/EXTERNAL_HEADERS/mach-o/fat.h
>> +-- 4. https://developer.apple.com/documentation/apple-silicon/addressing-architectural-differences-in-your-macos-code
>> +--
>> +-- Using the same declarations as defined in <src/jit/bcsave.lua>.
>> +ffi.cdef[[
>> +typedef struct
>> +{
>> + uint32_t magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags;
>> +} mach_header;
>> +
>> +typedef struct
>> +{
>> + mach_header; uint32_t reserved;
>> +} mach_header_64;
>> +
>> +typedef struct {
>> + uint32_t cmd, cmdsize;
>> + char segname[16];
>> + uint32_t vmaddr, vmsize, fileoff, filesize;
>> + uint32_t maxprot, initprot, nsects, flags;
>> +} mach_segment_command;
>> +
>> +typedef struct {
>> + uint32_t cmd, cmdsize;
>> + char segname[16];
>> + uint64_t vmaddr, vmsize, fileoff, filesize;
>> + uint32_t maxprot, initprot, nsects, flags;
>> +} mach_segment_command_64;
>> +
>> +typedef struct {
>> + char sectname[16], segname[16];
>> + uint32_t addr, size;
>> + uint32_t offset, align, reloff, nreloc, flags;
>> + uint32_t reserved1, reserved2;
>> +} mach_section;
>> +
>> +typedef struct {
>> + char sectname[16], segname[16];
>> + uint64_t addr, size;
>> + uint32_t offset, align, reloff, nreloc, flags;
>> + uint32_t reserved1, reserved2, reserved3;
>> +} mach_section_64;
>> +
>> +typedef struct {
>> + uint32_t cmd, cmdsize, symoff, nsyms, stroff, strsize;
>> +} mach_symtab_command;
>> +
>> +typedef struct {
>> + int32_t strx;
>> + uint8_t type, sect;
>> + int16_t desc;
>> + uint32_t value;
>> +} mach_nlist;
>> +
>> +typedef struct {
>> + uint32_t strx;
>> + uint8_t type, sect;
>> + uint16_t desc;
>> + uint64_t value;
>> +} mach_nlist_64;
>> +
>> +typedef struct
>> +{
>> + uint32_t magic, nfat_arch;
>> +} mach_fat_header;
>> +
>> +typedef struct
>> +{
>> + uint32_t cputype, cpusubtype, offset, size, align;
>> +} mach_fat_arch;
>> +
>> +typedef struct {
>> + mach_fat_header fat;
>> + mach_fat_arch fat_arch[2];
>> + struct {
>> + mach_header hdr;
>> + mach_segment_command seg;
>> + mach_section sec;
>> + mach_symtab_command sym;
>> + } arch[2];
>> + mach_nlist sym_entry;
>> + uint8_t space[4096];
>> +} mach_fat_obj;
>> +
>> +typedef struct {
>> + mach_fat_header fat;
>> + mach_fat_arch fat_arch[2];
>> + struct {
>> + mach_header_64 hdr;
>> + mach_segment_command_64 seg;
>> + mach_section_64 sec;
>> + mach_symtab_command sym;
>> + } arch[2];
>> + mach_nlist_64 sym_entry;
>> + uint8_t space[4096];
>> +} mach_fat_obj_64;
>> +]]
>> +
>> +local function create_obj_file(name, arch)
>> + local mach_o_path = os.tmpname() .. '.o'
>> + local lua_path = os.getenv('LUA_PATH')
>> + local lua_bin = require('utils').exec.luacmd(arg):match('%S+')
>> + local cmd_fmt = 'LUA_PATH="%s" %s -b -n "%s" -o osx -a %s -e "print()" %s'
>> + local cmd = (cmd_fmt):format(lua_path, lua_bin, name, arch, mach_o_path)
>> + local ret = os.execute(cmd)
>> + assert(ret == 0, 'cannot create an object file')
>> + return mach_o_path
>> +end
>> +
>> +-- Parses a buffer in an Mach-O format and returns
> Typo: s/in an/in the/
Fixed.
>> +-- an fat magic number and nfat_arch.
> Typo: s/an fat/the FAT/
> Typo: s/nfat_arch/`nfat_arch`
Fixed.
>> +local function read_mach_o(buf, is64)
>> + local res = {
>> + header = {
>> + magic = 0,
>> + nfat_arch = 0,
>> + },
>> + fat_arch = {
>> + },
> I guess, that the formatting below is a bit better here:
> | fat_arch = {},
Fixed.
>> + }
>> +
>> + -- Mach-O FAT object.
>> + local mach_fat_obj_type = ffi.typeof(is64 and 'mach_fat_obj_64 *' or 'mach_fat_obj *')
> The line is longer than 80 symbols.
Fixed.
>> + local obj = ffi.cast(mach_fat_obj_type, buf)
>> +
>> + -- Mach-O FAT object header.
>> + local mach_fat_header_type = ffi.typeof('mach_fat_header *')
>> + local mach_fat_header = ffi.cast(mach_fat_header_type, obj.fat)
>> + local be32 = bit.bswap -- Mach-O FAT is BE, target arch is LE.
>> + res.header.magic = be32(mach_fat_header.magic)
>> + res.header.nfat_arch = be32(mach_fat_header.nfat_arch)
>> +
>> + -- Mach-O FAT object archs.
> Typo: s/archs/arches/
Fixed.
> Side note: I feel like the comments for the sections are not elaborate
> enough for unprepared reader. I think you should briefly desribe the
> basic structure of a FAT object (FAT header, then array of per-segment
> headers, then object files)
Added a short description and an ASCII scheme.
>> + local mach_fat_arch_type = ffi.typeof('mach_fat_arch *')
>> + for i = 0, res.header.nfat_arch - 1 do
>> + local fat_arch = ffi.cast(mach_fat_arch_type, obj.fat_arch[i])
>> + arch = {
>> + cputype = be32(fat_arch.cputype),
>> + cpusubtype = be32(fat_arch.cpusubtype),
>> + }
>> + table.insert(res.fat_arch, arch)
>> + end
>> +
>> + return res
>> +end
>> +
>> +-- Defined in <src/jit/bcsave.lua:bcsave_machobj>.
>> +local sum_cputype = {
>> + x86 = 7,
>> + x64 = 0x01000007,
>> + arm = 7 + 12,
>> + arm64 = 0x01000007 + 0x0100000c,
>> +}
>> +local sum_cpusubtype = {
>> + x86 = 3,
>> + x64 = 3,
>> + arm = 3 + 9,
>> + arm64 = 3 + 0,
>> +}
> It would be nice to have an explanation for these magic numbers.
Added:
@@ -183,7 +246,13 @@ local function read_mach_o(buf, is64)
return res
end
--- Defined in <src/jit/bcsave.lua:bcsave_machobj>.
+-- Universal Binary can contain executables for more than one
+-- CPU architecture. For simplicity the test compares
+-- sum of CPU types and CPU subtypes.
+-- Numbers below are defined in <src/jit/bcsave.lua:bcsave_machobj>.
+-- Original source is XNU source code, <osfmk/mach/machine.h> [1].
+--
+-- 1.
https://opensource.apple.com/source/xnu/xnu-4570.41.2/osfmk/mach/machine.h.auto.html
local sum_cputype = {
x86 = 7,
x64 = 0x01000007,
>> +
>> +-- The function builds Mach-O FAT object file and retrieves
>> +-- its header fields (magic and nfat_arch)
>> +-- and fields of the each arch (cputype, cpusubtype).
>> +--
>> +-- Mach-O FAT object header could be retrieved with `otool` on macOS:
> Typo: s/could be/can be/
Fixed.
>> +--
>> +-- $ otool -f empty.o
>> +-- Fat headers
>> +-- fat_magic 0xcafebabe
>> +-- nfat_arch 2
>> +-- <snipped>
>> +--
>> +-- CPU type and subtype could be retrieved with `lipo` on macOS:
> Typo: s/could be/can be/
Fixed.
>> +--
>> +-- $ luajit -b -o osx -a arm empty.lua empty.o
>> +-- $ lipo -archs empty.o
>> +-- i386 armv7
>> +-- $ luajit -b -o osx -a arm64 empty.lua empty.o
>> +-- $ lipo -archs empty.o
>> +-- x86_64 arm64
>> +local function build_and_check_mach_o(is64)
>> + local arch = is64 and 'arm64' or 'arm'
>> +
>> + -- FAT_MAGIC is an integer containing the value 0xCAFEBABE in
>> + -- big-endian byte order format. On a big-endian host CPU,
>> + -- this can be validated using the constant FAT_MAGIC;
>> + -- on a little-endian host CPU, it can be validated using
>> + -- the constant FAT_CIGAM.
>> + --
>> + -- FAT_NARCH is an integer specifying the number of fat_arch
>> + -- data structures that follow. This is the number of
>> + -- architectures contained in this binary.
>> + --
>> + -- See aforementioned "OS X ABI Mach-O File Format Reference".
>> + --
>> + local FAT_MAGIC = '0xffffffffcafebabe'
>> + local FAT_NARCH = 2
>> +
>> + local MODULE_NAME = 'lango_team'
>> +
>> + local mach_o_obj_path = create_obj_file(MODULE_NAME, arch)
>> + local mach_o_buf = require('utils').tools.read_file(mach_o_obj_path)
>> + assert(mach_o_buf == nil or #mach_o_buf ~= 0, 'cannot read an object file')
>> +
>> + local mach_o = read_mach_o(mach_o_buf, is64)
>> +
>> + -- Teardown.
>> + local retcode = os.remove(mach_o_obj_path)
>> + assert(retcode == true, 'remove an object file')
>> +
>> + local magic_str = string.format('0x%02x', mach_o.header.magic)
>> + test:is(magic_str, FAT_MAGIC, 'fat_magic is correct in Mach-O, ' .. arch)
>> + test:is(mach_o.header.nfat_arch, FAT_NARCH, 'nfat_arch is correct in Mach-O, ' .. arch)
>> +
>> + local total_cputype = 0
>> + local total_cpusubtype = 0
>> + for i = 1, mach_o.header.nfat_arch do
>> + total_cputype = total_cputype + mach_o.fat_arch[i].cputype
>> + total_cpusubtype = total_cpusubtype + mach_o.fat_arch[i].cpusubtype
>> + end
>> + test:is(total_cputype, sum_cputype[arch], 'cputype is correct in Mach-O, ' .. arch)
>> + test:is(total_cpusubtype, sum_cpusubtype[arch], 'cpusubtype is correct in Mach-O, ' .. arch)
>> +end
>> +
>> +-- ARM
>> +build_and_check_mach_o(false)
>> +
>> +test:done(true)
> Please mention that alongside with test and fix for the issue
> you've added this tool. IMO, it would be even better to do that
> in a separate commit, to avoid confusion because of the updates in
> test files unrelated to the patch.
Helper is moved to a separate patch.
>> diff --git a/test/tarantool-tests/utils/tools.lua b/test/tarantool-tests/utils/tools.lua
>> index f35c6922..26b8c08d 100644
>> --- a/test/tarantool-tests/utils/tools.lua
>> +++ b/test/tarantool-tests/utils/tools.lua
>> @@ -12,4 +12,12 @@ function M.profilename(name)
>> return (arg[0]:gsub('^(.+)/([^/]+)%.test%.lua$', replacepattern))
>> end
>>
>> +-- Reads a file located in a specified path and returns its content.
>> +function M.read_file(path)
>> + local file = assert(io.open(path), 'cannot open an object file')
>> + local content = file:read('*a')
>> + file:close()
>> + return content
>> +end
>> +
>> return M
>> --
>> 2.34.1
>>
More information about the Tarantool-patches
mailing list