Sergey,

thanks for the review. Fixes are applied and force-pushed to the same branch.

Sergey

On 09.07.2024 16:03, Sergey Kaplun via Tarantool-patches wrote:
> Hi, Sergey!
> Thanks for the patch!
> Please consider my comments below.
>
> May we add the test [1] to verify that there will be no regression in the
> future?

This "test" was for a FAT binary generated by LuaJIT. With the backported
patches, FAT (aka Universal Binary) support is gone. The Mach-O header is
checked by the tests in the proposed patch.

>
> On 05.07.24, Sergey Bronnikov wrote:
>> Reported by Sergey Bronnikov.
>>
>> (cherry picked from commit 7110b935672489afd6ba3eef3e5139d2f3bd05b6)
>>
>> Previously, LuaJIT generated Mach-O FAT object files for ARM and
>> ARM64 on macOS. The patch removes support of 32-bit ARM and
>> FAT object files and now LuaJIT generate Mach-O object files for
>> ARM64.
> I suppose we should mention that no x86/x86_64 objects are generated
> now.

x86_64 is still there:

$ ./build/gc64/src/luajit -b -o osx -a arm64 empty.lua empty.o
$ file empty.o
empty.o: Mach-O 64-bit arm64 object
$ ./build/gc64/src/luajit -b -o osx -a x64 empty.lua empty.o
$ file empty.o
empty.o: Mach-O 64-bit x86_64 object
$

Anyway, the commit message has been updated.
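For reference, the same check that `file` does above can be sketched in plain Lua (no FFI) by reading the first fields of the 64-bit Mach-O header. This is only an illustration and not part of the patch: the helper names `read_u32_le`, `u32_le` and `parse_mach_header_64` are hypothetical, and the sample buffer is hand-built rather than produced by `luajit -b`.

```lua
-- Reads a little-endian unsigned 32-bit integer at 1-based offset `pos`.
local function read_u32_le(buf, pos)
  local b1, b2, b3, b4 = buf:byte(pos, pos + 3)
  assert(b4, 'buffer is too short')
  return b1 + b2 * 0x100 + b3 * 0x10000 + b4 * 0x1000000
end

-- Parses the leading fields of `struct mach_header_64`
-- (field order as declared in <mach-o/loader.h>).
local function parse_mach_header_64(buf)
  return {
    magic = read_u32_le(buf, 1),
    cputype = read_u32_le(buf, 5),
    cpusubtype = read_u32_le(buf, 9),
    filetype = read_u32_le(buf, 13),
    ncmds = read_u32_le(buf, 17),
  }
end

-- Hypothetical helper: encodes a number as 4 little-endian bytes.
local function u32_le(n)
  return string.char(n % 0x100,
                     math.floor(n / 0x100) % 0x100,
                     math.floor(n / 0x10000) % 0x100,
                     math.floor(n / 0x1000000) % 0x100)
end

-- Hand-built sample header (an assumption for the example): MH_MAGIC_64,
-- CPU_TYPE_ARM64, subtype 0, MH_OBJECT, 2 load commands, then
-- sizeofcmds, flags and reserved set to zero.
local sample = u32_le(0xfeedfacf) .. u32_le(0x0100000c) .. u32_le(0) ..
               u32_le(1) .. u32_le(2) .. u32_le(0) .. u32_le(0) .. u32_le(0)

local hdr = parse_mach_header_64(sample)
```

For a real object file, the buffer would come from reading `empty.o` in binary mode instead of `sample`.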
>> Sergey Bronnikov:
>> * added the description and the trimmed the test for the problem
>>
>> Part of tarantool/tarantool#10199
>> ---
>>  src/jit/bcsave.lua                            | 155 ++-------
>>  ...-865-cross-generation-mach-o-file.test.lua | 294 +++---------------
>>  2 files changed, 70 insertions(+), 379 deletions(-)
>>
>> diff --git a/src/jit/bcsave.lua b/src/jit/bcsave.lua
>> index 26ec29c6..61953c2d 100644
>> --- a/src/jit/bcsave.lua
>> +++ b/src/jit/bcsave.lua
>
>
>> diff --git a/test/tarantool-tests/lj-865-cross-generation-mach-o-file.test.lua b/test/tarantool-tests/lj-865-cross-generation-mach-o-file.test.lua
>> index f008f3bd..6a58de95 100644
>> --- a/test/tarantool-tests/lj-865-cross-generation-mach-o-file.test.lua
>> +++ b/test/tarantool-tests/lj-865-cross-generation-mach-o-file.test.lua
>> @@ -3,109 +3,11 @@ local test = tap.test('lj-865-cross-generation-mach-o-file')
>>  local utils = require('utils')
>>  local ffi = require('ffi')
>>
>> -test:plan(2)
>> +test:plan(1)
>>
>>  -- The test creates an object file in Mach-O format with LuaJIT
>>  -- bytecode and checks the validity of the object file fields.
>> ---
>> --- The original problem is reproduced with LuaJIT, which is built
>> --- with enabled AVX512F instructions. The support for AVX512F
>> --- could be checked in `/proc/cpuinfo` on Linux and
>> --- `sysctl hw.optional.avx512f` on Mac. AVX512F must be
>> --- implicitly enabled in a C compiler by passing a CPU codename.
>> --- Please take a look at the GCC Online Documentation [1] for
>> --- available CPU codenames. Also, see the Wikipedia for CPUs with
>> --- AVX-512 support [2].
>> --- Execute command below to detect the CPU codename:
>> --- `gcc -march=native -Q --help=target | grep march`.
>> ---
>> --- 1.https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html
>> --- 2.https://en.wikipedia.org/wiki/AVX-512#CPUs_with_AVX-512
>> ---
>> --- Manual steps for reproducing are the following:
>> ---
>> --- $ CC=gcc TARGET_CFLAGS='skylake-avx512' cmake -S . -B build
>> --- $ cmake --build build --parallel
>> --- $ echo > test.lua
>> --- $ LUA_PATH="src/?.lua;;" luajit -b -o osx -a arm test.lua test.o
>> --- $ file test.o
>> --- empty.o: DOS executable (block device driver)
>>
>> --- LuaJIT can generate so called Universal Binary with Lua
>> --- bytecode. The Universal Binary format is a format for
>> --- executable files that run natively on hardware platforms with
>> --- different hardware architectures. This concept is more
>> --- generally known as a fat binary.
>> ---
>> --- The format of the Mach-O is described in the document
>> --- "OS X ABI Mach-O File Format Reference", published by Apple
>> --- company. The copy of the (now removed) official documentation
>> --- can be found here [1]. Yet another source of truth is
>> --- XNU headers, see the definition of C-structures in:
>> --- [2] (`nlist_64`), [3] (`fat_arch` and `fat_header`).
>> ---
>> --- There is a good visual representation of Universal Binary
>> --- in "Mac OS X Internals" book (pages 67-68) [5] and in the [6].
>> --- Below is the schematic structure of Universal Binary, which
>> --- includes two executables for PowerPC and Intel i386 (omitted):
>> ---
>> --- 0x0000000  ---------------------------------------
>> ---             |
>> --- struct      | 0xcafebabe  FAT_MAGIC                 magic
>> --- fat_header  | -------------------------------------
>> ---             | 0x00000003                            nfat_arch
>> ---             ---------------------------------------
>> ---             | 0x00000012  CPU_TYPE_POWERPC          cputype
>> ---             | -------------------------------------
>> ---             | 0x00000000  CPU_SUBTYPE_POWERPC_ALL   cpusubtype
>> --- struct      | -------------------------------------
>> --- fat_arch    | 0x00001000  4096 bytes                offset
>> ---             | -------------------------------------
>> ---             | 0x00004224  16932 bytes               size
>> ---             | -------------------------------------
>> ---             | 0x0000000c  2^12 = 4096 bytes         align
>> ---             ---------------------------------------
>> ---             ---------------------------------------
>> ---             | 0x00000007  CPU_TYPE_I386             cputype
>> ---             | -------------------------------------
>> ---             | 0x00000003  CPU_SUBTYPE_I386_ALL      cpusubtype
>> --- struct      | -------------------------------------
>> --- fat_arch    | 0x00006000  24576 bytes               offset
>> ---             | -------------------------------------
>> ---             | 0x0000292c  10540 bytes               size
>> ---             | -------------------------------------
>> ---             | 0x0000000c  2^12 = 4096 bytes         align
>> ---             ---------------------------------------
>> ---               Unused
>> --- 0x00001000 ---------------------------------------
>> ---             | 0xfeedface  MH_MAGIC                  magic
>> ---             | ------------------------------------
>> ---             | 0x00000012  CPU_TYPE_POWERPC          cputype
>> ---             | ------------------------------------
>> --- struct      | 0x00000000  CPU_SUBTYPE_POWERPC_ALL   cpusubtype
>> --- mach_header | ------------------------------------
>> ---             | 0x00000002  MH_EXECUTE                filetype
>> ---             | ------------------------------------
>> ---             | 0x0000000b  10 load commands          ncmds
>> ---             | ------------------------------------
>> ---             | 0x00000574  1396 bytes                sizeofcmds
>> ---             | ------------------------------------
>> ---             | 0x00000085  DYLDLINK TWOLEVEL         flags
>> ---             --------------------------------------
>> ---               Load commands
>> ---             ---------------------------------------
>> ---               Data
>> ---             ---------------------------------------
>> ---
>> ---             < x86 executable >
>> ---
>> --- 1.https://github.com/aidansteele/osx-abi-macho-file-format-reference
>> --- 2.https://github.com/apple-oss-distributions/xnu/blob/xnu-10002.1.13/EXTERNAL_HEADERS/mach-o/nlist.h
>> --- 3.https://github.com/apple-oss-distributions/xnu/blob/xnu-10002.1.13/EXTERNAL_HEADERS/mach-o/fat.h
>> --- 4.https://developer.apple.com/documentation/apple-silicon/addressing-architectural-differences-in-your-macos-code
>> --- 5.https://reverseengineering.stackexchange.com/a/6357/46029
>> --- 6.http://formats.kaitai.io/mach_o/index.html
> I prefer to keep the non-FAT part of this comment since it is very
> useful.

Reverted:

diff --git a/build/src/luajit b/build/src/luajit
deleted file mode 100755
index bd10e09a..00000000
Binary files a/build/src/luajit and /dev/null differ
diff --git a/test/tarantool-tests/lj-865-cross-generation-mach-o-file.test.lua b/test/tarantool-tests/lj-865-cross-generation-mach-o-file.test.lua
index 6a58de95..f9ca6e38 100644
--- a/test/tarantool-tests/lj-865-cross-generation-mach-o-file.test.lua
+++ b/test/tarantool-tests/lj-865-cross-generation-mach-o-file.test.lua
@@ -68,6 +68,62 @@ end
 -- Parses a buffer in the Mach-O format and returns its fields
 -- in a table.
+-- The test creates an object file in Mach-O format with LuaJIT
+-- bytecode and checks the validity of the object file fields.
+---
+--- The original problem is reproduced with LuaJIT, which is built
+--- with enabled AVX512F instructions. The support for AVX512F
+--- could be checked in `/proc/cpuinfo` on Linux and
+--- `sysctl hw.optional.avx512f` on Mac. AVX512F must be
+--- implicitly enabled in a C compiler by passing a CPU codename.
+--- Please take a look at the GCC Online Documentation [1] for
+--- available CPU codenames. Also, see the Wikipedia for CPUs with
+--- AVX-512 support [2].
+--- Execute command below to detect the CPU codename:
+--- `gcc -march=native -Q --help=target | grep march`.
+---
+--- 1. https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html
+--- 2. https://en.wikipedia.org/wiki/AVX-512#CPUs_with_AVX-512
+---
+--- Manual steps for reproducing are the following:
+---
+--- $ CC=gcc TARGET_CFLAGS='skylake-avx512' cmake -S . -B build
+--- $ cmake --build build --parallel
+--- $ echo > test.lua
+--- $ LUA_PATH="src/?.lua;;" luajit -b -o osx -a arm test.lua test.o
+--- $ file test.o
+--- empty.o: DOS executable (block device driver)
+
+--- The format of the Mach-O is described in the document
+--- "OS X ABI Mach-O File Format Reference", published by Apple
+--- company. The copy of the (now removed) official documentation
+--- can be found here [1]. There is a good visual representation
+--- of Mach-O format in "Mac OS X Internals" book (pages 67-68)
+--- [2] and in the [3].
+--
+--- 0x0000000  ---------------------------------------
+---             | 0xfeedface  MH_MAGIC                  magic
+---             | ------------------------------------
+---             | 0x00000012  CPU_TYPE_POWERPC          cputype
+---             | ------------------------------------
+--- struct      | 0x00000000  CPU_SUBTYPE_POWERPC_ALL cpusubtype
+--- mach_header | ------------------------------------
+---             | 0x00000002  MH_EXECUTE                filetype
+---             | ------------------------------------
+---             | 0x0000000b  10 load commands          ncmds
+---             | ------------------------------------
+---             | 0x00000574  1396 bytes sizeofcmds
+---             | ------------------------------------
+---             | 0x00000085  DYLDLINK TWOLEVEL         flags
+---             --------------------------------------
+---               Load commands
+---             ---------------------------------------
+---               Data
+---             ---------------------------------------
+---
+--- 1. https://github.com/aidansteele/osx-abi-macho-file-format-reference
+--- 2. https://reverseengineering.stackexchange.com/a/6357/46029
+--- 3. http://formats.kaitai.io/mach_o/index.html
 local function read_mach_o(buf, hw_arch)
   local is64 = hw_arch == 'arm64'

>
>> ---
>>  -- Using the same declarations as defined in .
>>  ffi.cdef[[
>
>
>>  local function create_obj_file(name, arch)
>> @@ -212,108 +66,37 @@ local function create_obj_file(name, arch)
>>    return mach_o_path
>>  end
>>
>> --- Parses a buffer in the Mach-O format and returns the FAT magic
>> --- number and `nfat_arch`.
>> +-- Parses a buffer in the Mach-O format and returns its fields
>> +-- in a table.
>>  local function read_mach_o(buf, hw_arch)
> I suggest renaming it to `read_mach_o_hdr()` and returning only the
> header without any additional wrapping in the table.

Updated:

diff --git a/build/src/luajit b/build/src/luajit
deleted file mode 100755
index bd10e09a..00000000
Binary files a/build/src/luajit and /dev/null differ
diff --git a/test/tarantool-tests/lj-865-cross-generation-mach-o-file.test.lua b/test/tarantool-tests/lj-865-cross-generation-mach-o-file.test.lua
index 6a58de95..0f269a8c 100644
--- a/test/tarantool-tests/lj-865-cross-generation-mach-o-file.test.lua
+++ b/test/tarantool-tests/lj-865-cross-generation-mach-o-file.test.lua
@@ -78,15 +134,7 @@ local function read_mach_o(buf, hw_arch)
   -- Mach-O object header.
   local mach_header = obj.hdr

-  return {
-    header = {
-      magic = mach_header.magic,
-      cputype = mach_header.cputype,
-      cpusubtype = mach_header.cpusubtype,
-      filetype = mach_header.filetype,
-      ncmds = mach_header.ncmds,
-    },
-  }
+  return mach_header
 end

 -- The function builds Mach-O object file and retrieves
@@ -104,22 +152,22 @@ local function build_and_check_mach_o(subtest)
   local mach_o_buf = utils.tools.read_file(mach_o_obj_path)
   assert(mach_o_buf ~= nil and #mach_o_buf ~= 0, 'cannot read an object file')

-  local mach_o = read_mach_o(mach_o_buf, hw_arch)
+  local mach_o = read_mach_o_hdr(mach_o_buf, hw_arch)

   -- Teardown.
   assert(os.remove(mach_o_obj_path), 'remove an object file')

-  local magic_str = string.format('%#x', mach_o.header.magic)
+  local magic_str = string.format('%#x', mach_o.magic)
   subtest:is(magic_str, '0xfeedfacf',
              'magic is correct in Mach-O')
-  local cputype_str = string.format('%#x', mach_o.header.cputype)
+  local cputype_str = string.format('%#x', mach_o.cputype)
   subtest:is(cputype_str, '0x100000c',
              'cputype is correct in Mach-O')
-  subtest:is(mach_o.header.cpusubtype, 0,
+  subtest:is(mach_o.cpusubtype, 0,
              'cpusubtype is correct in Mach-O')
-  subtest:is(mach_o.header.filetype, 1,
+  subtest:is(mach_o.filetype, 1,
              'filetype is correct in Mach-O')
-  subtest:is(mach_o.header.ncmds, 2,
+  subtest:is(mach_o.ncmds, 2,
              'ncmds is correct in Mach-O')
 end

>> -  local res = {
>> -    header = {
>> -      magic = 0,
>> -      nfat_arch = 0,
>> -    },
>> -    fat_arch = {},
>> -  }
>> -
>>    local is64 = hw_arch == 'arm64'
> Maybe it is better to use assert here, like we do for the
> `build_and_check_mach_o()` routine?

Added, although `read_mach_o_hdr()` is called from
`build_and_check_mach_o()`, where the assert is already in place.

>
>>
>> -  -- Mach-O FAT object.
>> -  local mach_fat_obj_type = ffi.typeof(is64 and
>> -                                       'mach_fat_obj_64 *' or
>> -                                       'mach_fat_obj *')
>> -  local obj = ffi.cast(mach_fat_obj_type, buf)
>> +  -- Mach-O object.
>> +  local mach_obj_type = ffi.typeof(is64 and 'mach_obj_64 *')
> Maybe just use mach_obj_64 since there is no alternative?

ok

>> +  local obj = ffi.cast(mach_obj_type, buf)
>>
>> -  -- Mach-O FAT object header.
>> -  local mach_fat_header = obj.fat
>> -  -- Mach-O FAT is BE, target arch is LE.
>> -  local be32 = bit.bswap
>> -  res.header.magic = be32(mach_fat_header.magic)
>> -  res.header.nfat_arch = be32(mach_fat_header.nfat_arch)
>> +  -- Mach-O object header.
>> +  local mach_header = obj.hdr
>>
>> -  -- Mach-O FAT object arches.
>> -  for i = 0, res.header.nfat_arch - 1 do
>> -    local fat_arch = obj.fat_arch[i]
>> -    local arch = {
>> -      cputype = be32(fat_arch.cputype),
>> -      cpusubtype = be32(fat_arch.cpusubtype),
>> -    }
>> -    table.insert(res.fat_arch, arch)
>> -  end
>> -
>> -  return res
>> +  return {
>> +    header = {
>> +      magic = mach_header.magic,
>> +      cputype = mach_header.cputype,
>> +      cpusubtype = mach_header.cpusubtype,
>> +      filetype = mach_header.filetype,
>> +      ncmds = mach_header.ncmds,
>> +    },
>> +  }
>>  end
>>
>> --- Universal Binary can contain executables for more than one
>> --- CPU architecture. For simplicity, the test compares the *sum*
>> --- of CPU types and CPU subtypes.
>> ---
>> --- has the definitions of the
>> --- numbers below. The original XNU source code may be found in
>> --- [1].
>> ---
>> --- 1.https://opensource.apple.com/source/xnu/xnu-4570.41.2/osfmk/mach/machine.h.auto.html
>> ---
>> -local SUM_CPUTYPE = {
>> -  -- x86 + arm.
>> -  arm = 7 + 12,
>> -  -- x64 + arm64.
>> -  arm64 = 0x01000007 + 0x0100000c,
>> -}
>> -local SUM_CPUSUBTYPE = {
>> -  -- x86 + arm.
>> -  arm = 3 + 9,
>> -  -- x64 + arm64.
>> -  arm64 = 3 + 0,
>> -}
>> -
>> --- The function builds Mach-O FAT object file and retrieves
>> --- its header fields (magic and nfat_arch) and fields of each arch
>> --- (cputype, cpusubtype).
>> ---
>> --- The Mach-O FAT object header can be retrieved with `otool` on
>> --- macOS:
>> ---
>> --- $ otool -f empty.o
>> --- Fat headers
>> --- fat_magic 0xcafebabe
>> --- nfat_arch 2
>> ---
>> ---
>> --- CPU type and subtype can be retrieved with `lipo` on macOS:
>> ---
>> --- $ luajit -b -o osx -a arm empty.lua empty.o
>> --- $ lipo -archs empty.o
>> --- i386 armv7
>> --- $ luajit -b -o osx -a arm64 empty.lua empty.o
>> --- $ lipo -archs empty.o
>> --- x86_64 arm64
>> +-- The function builds Mach-O object file and retrieves
>> +-- its header fields.
>>  local function build_and_check_mach_o(subtest)
>>    local hw_arch = subtest.name
>> -  assert(hw_arch == 'arm' or hw_arch == 'arm64')
>> +  -- LuaJIT always generate 64-bit non-FAT Mach-O object files.
> The fact that these files are generated only for M1 CPUs looks worth
> mentioning.
>
>> +  assert(hw_arch == 'arm64')
>>
>> -  subtest:plan(4)
>> -  -- FAT_MAGIC is an integer containing the value 0xCAFEBABE in
>> -  -- big-endian byte order format. On a big-endian host CPU,
>> -  -- this can be validated using the constant FAT_MAGIC;
>> -  -- on a little-endian host CPU, it can be validated using
>> -  -- the constant FAT_CIGAM.
>> -  --
>> -  -- FAT_NARCH is an integer specifying the number of fat_arch
>> -  -- data structures that follow. This is the number of
>> -  -- architectures contained in this binary.
>> -  --
>> -  -- See the aforementioned "OS X ABI Mach-O File Format
>> -  -- Reference".
> The similar comment about Mach-O format will be appretiated.
The Mach-O header struct is described in the comment above.

>
>> -  local FAT_MAGIC = '0xffffffffcafebabe'
>> -  local FAT_NARCH = 2
>> +  subtest:plan(5)
>>
>>    local MODULE_NAME = 'lango_team'
>>
>> @@ -327,24 +110,19 @@ local function build_and_check_mach_o(subtest)
>>    assert(os.remove(mach_o_obj_path), 'remove an object file')
>>
>>    local magic_str = string.format('%#x', mach_o.header.magic)
>> -  subtest:is(magic_str, FAT_MAGIC,
>> -             'fat_magic is correct in Mach-O')
>> -  subtest:is(mach_o.header.nfat_arch, FAT_NARCH,
>> -             'nfat_arch is correct in Mach-O')
>> -
>> -  local total_cputype = 0
>> -  local total_cpusubtype = 0
>> -  for i = 1, FAT_NARCH do
>> -    total_cputype = total_cputype + mach_o.fat_arch[i].cputype
>> -    total_cpusubtype = total_cpusubtype + mach_o.fat_arch[i].cpusubtype
>> -  end
>> -  subtest:is(total_cputype, SUM_CPUTYPE[hw_arch],
>> +  subtest:is(magic_str, '0xfeedfacf',
> Please use MH_MAGIC_64 named constant for this magic string.

Added.

>
>> +             'magic is correct in Mach-O')
> Looks like this line may be joined with the previous one.

There is not enough room on the line for this.

>> +  local cputype_str = string.format('%#x', mach_o.header.cputype)
>> +  subtest:is(cputype_str, '0x100000c',
> Please use the named constant CPU_TYPE_ARM64 for this magic string.

Added.

>
>>               'cputype is correct in Mach-O')
> Looks like this line may be joined with the previous one.

There is not enough room on the line for this.

>
>> -  subtest:is(total_cpusubtype, SUM_CPUSUBTYPE[hw_arch],
>> +  subtest:is(mach_o.header.cpusubtype, 0,
> Please use the named constant CPU_SUBTYPE_ARM64 for this magic constant.

Added.

>
>>               'cpusubtype is correct in Mach-O')
> Looks like this line may be joined with the previous one.

There is not enough room on the line for this.

>
>> +  subtest:is(mach_o.header.filetype, 1,
> What does the 1 filetype mean?
> Please use the named constant.

It is MH_OBJECT; added.

 * The file type MH_OBJECT is a compact format intended as output of the
 * assembler and input (and possibly output) of the link editor (the .o
 * format). All sections are in one unnamed segment with no segment padding.
 * This format is used as an executable format when the file is so small the
 * segment padding greatly increases it's size.

https://opensource.apple.com/source/xnu/xnu-344/EXTERNAL_HEADERS/mach-o/loader.h.auto.html

>
>> +             'filetype is correct in Mach-O')
> Looks like this line may be joined with the previous one.

There is no room on the line for this.

>
>> +  subtest:is(mach_o.header.ncmds, 2,
> Why there are 2 commands for Mach-O format?
> Please use the named constant.

Isn't the name 'ncmds' self-explanatory?

>
>> +             'ncmds is correct in Mach-O')
> Looks like this line may be joined with the previous one.

Joined.

>
>>  end
>>
>> -test:test('arm', build_and_check_mach_o)
>>  test:test('arm64', build_and_check_mach_o)
>>
>>  test:done(true)
>> --
>> 2.34.1
>>
> [1]:https://github.com/LuaJIT/LuaJIT/issues/1181#issue-2202788411
>
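As an illustration of the named constants discussed in the review, here is a minimal plain-Lua sketch. The values are taken from the XNU headers `<mach-o/loader.h>` and `<mach/machine.h>`; the table `EXPECTED_LOAD_COMMANDS` and the name `EXPECTED_NCMDS` are hypothetical. That the two load commands are one segment command plus one symbol table command is my reading of `bcsave.lua`, so treat it as an assumption rather than a statement about the final test.

```lua
-- Magic number of a 64-bit Mach-O file, see <mach-o/loader.h>.
local MH_MAGIC_64 = 0xfeedfacf
-- Relocatable object file (the .o format), see <mach-o/loader.h>.
local MH_OBJECT = 0x1

-- CPU types, see <mach/machine.h>.
local CPU_ARCH_ABI64 = 0x01000000
local CPU_TYPE_ARM = 12
local CPU_TYPE_ARM64 = CPU_ARCH_ABI64 + CPU_TYPE_ARM
local CPU_SUBTYPE_ARM64_ALL = 0

-- Load command types, see <mach-o/loader.h>.
local LC_SYMTAB = 0x2
local LC_SEGMENT_64 = 0x19

-- Assumption: the generated object carries one 64-bit segment
-- command and one symbol table command, hence ncmds == 2.
local EXPECTED_LOAD_COMMANDS = { LC_SEGMENT_64, LC_SYMTAB }
local EXPECTED_NCMDS = #EXPECTED_LOAD_COMMANDS
```

Deriving the expected `ncmds` from a list of load command types, rather than writing a bare `2`, is one way to answer the "why 2?" question in the code itself.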