[Tarantool-patches] [PATCH luajit 2/2] OSX/iOS: Always generate 64 bit non-FAT Mach-O object files.

Sergey Bronnikov sergeyb at tarantool.org
Wed Jul 10 15:43:54 MSK 2024


Sergey,

thanks for the review. The fixes are applied and force-pushed to the same branch.


Sergey

On 09.07.2024 16:03, Sergey Kaplun via Tarantool-patches wrote:
> Hi, Sergey!
> Thanks for the patch!
> Please consider my comments below.
>
> May we add the test [1] to verify that there will be no regression in the
> future?

This "test" was for a FAT binary generated by LuaJIT.

With the backported patches, FAT (aka Universal Binary) support is gone.

The Mach-O header is checked by the tests in the proposed patch.
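
As an illustration only (not part of the patch), the difference between the
old FAT output and the new plain Mach-O output is visible in the first four
bytes of the file. A minimal shell sketch, assuming a POSIX `od`; the helper
name `macho_kind` is hypothetical:

```shell
# Classify a file by its first four bytes.
# Prints "mach-o-64", "fat" or "unknown".
macho_kind() {
  # od prints the bytes in file order, independent of host endianness.
  magic=$(od -An -tx1 -N4 "$1" | tr -d '[:space:]')
  case "$magic" in
    cffaedfe) echo "mach-o-64" ;;  # MH_MAGIC_64 (0xfeedfacf), stored little-endian
    cafebabe) echo "fat" ;;        # FAT_MAGIC (0xcafebabe), stored big-endian
    *)        echo "unknown" ;;
  esac
}
```

Run on an object file produced by `luajit -b -o osx -a arm64`, this should
print `mach-o-64`. The `fat` answer is only a hint, since `cafebabe` is also
the magic number of Java class files.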

>
> On 05.07.24, Sergey Bronnikov wrote:
>> Reported by Sergey Bronnikov.
>>
>> (cherry picked from commit 7110b935672489afd6ba3eef3e5139d2f3bd05b6)
>>
>> Previously, LuaJIT generated Mach-O FAT object files for ARM and
>> ARM64 on macOS. The patch removes support of 32-bit ARM and
>> FAT object files and now LuaJIT generate Mach-O object files for
>> ARM64.
> I suppose we should mention that no x86/x86_64 objects are generated
> now.
x86_64 is still there:

$ ./build/gc64/src/luajit -b -o osx -a arm64 empty.lua empty.o
$ file empty.o
empty.o: Mach-O 64-bit arm64 object
$ ./build/gc64/src/luajit -b -o osx -a x64 empty.lua empty.o
$ file empty.o
empty.o: Mach-O 64-bit x86_64 object
$

Anyway, the commit message has been updated.

>> Sergey Bronnikov:
>> * added the description and the trimmed the test for the problem
>>
>> Part of tarantool/tarantool#10199
>> ---
>>   src/jit/bcsave.lua                            | 155 ++-------
>>   ...-865-cross-generation-mach-o-file.test.lua | 294 +++---------------
>>   2 files changed, 70 insertions(+), 379 deletions(-)
>>
>> diff --git a/src/jit/bcsave.lua b/src/jit/bcsave.lua
>> index 26ec29c6..61953c2d 100644
>> --- a/src/jit/bcsave.lua
>> +++ b/src/jit/bcsave.lua
> <snipped>
>
>> diff --git a/test/tarantool-tests/lj-865-cross-generation-mach-o-file.test.lua b/test/tarantool-tests/lj-865-cross-generation-mach-o-file.test.lua
>> index f008f3bd..6a58de95 100644
>> --- a/test/tarantool-tests/lj-865-cross-generation-mach-o-file.test.lua
>> +++ b/test/tarantool-tests/lj-865-cross-generation-mach-o-file.test.lua
>> @@ -3,109 +3,11 @@ local test = tap.test('lj-865-cross-generation-mach-o-file')
>>   local utils = require('utils')
>>   local ffi = require('ffi')
>>   
>> -test:plan(2)
>> +test:plan(1)
>>   
>>   -- The test creates an object file in Mach-O format with LuaJIT
>>   -- bytecode and checks the validity of the object file fields.
>> ---
>> --- The original problem is reproduced with LuaJIT, which is built
>> --- with enabled AVX512F instructions. The support for AVX512F
>> --- could be checked in `/proc/cpuinfo` on Linux and
>> --- `sysctl hw.optional.avx512f` on Mac. AVX512F must be
>> --- implicitly enabled in a C compiler by passing a CPU codename.
>> --- Please take a look at the GCC Online Documentation [1] for
>> --- available CPU codenames. Also, see the Wikipedia for CPUs with
>> --- AVX-512 support [2].
>> --- Execute command below to detect the CPU codename:
>> --- `gcc -march=native -Q --help=target | grep march`.
>> ---
>> --- 1.https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html
>> --- 2.https://en.wikipedia.org/wiki/AVX-512#CPUs_with_AVX-512
>> ---
>> --- Manual steps for reproducing are the following:
>> ---
>> --- $ CC=gcc TARGET_CFLAGS='skylake-avx512' cmake -S . -B build
>> --- $ cmake --build build --parallel
>> --- $ echo > test.lua
>> --- $ LUA_PATH="src/?.lua;;" luajit -b -o osx -a arm test.lua test.o
>> --- $ file test.o
>> --- empty.o: DOS executable (block device driver)
>>   
>> --- LuaJIT can generate so called Universal Binary with Lua
>> --- bytecode. The Universal Binary format is a format for
>> --- executable files that run natively on hardware platforms with
>> --- different hardware architectures. This concept is more
>> --- generally known as a fat binary.
>> ---
>> --- The format of the Mach-O is described in the document
>> --- "OS X ABI Mach-O File Format Reference", published by Apple
>> --- company. The copy of the (now removed) official documentation
>> --- can be found here [1]. Yet another source of truth is
>> --- XNU headers, see the definition of C-structures in:
>> --- [2] (`nlist_64`), [3] (`fat_arch` and `fat_header`).
>> ---
>> --- There is a good visual representation of Universal Binary
>> --- in "Mac OS X Internals" book (pages 67-68) [5] and in the [6].
>> --- Below is the schematic structure of Universal Binary, which
>> --- includes two executables for PowerPC and Intel i386 (omitted):
>> ---
>> ---   0x0000000 ---------------------------------------
>> ---             |
>> --- struct      | 0xcafebabe  FAT_MAGIC                 magic
>> --- fat_header  | -------------------------------------
>> ---             | 0x00000003                            nfat_arch
>> ---             ---------------------------------------
>> ---             | 0x00000012  CPU_TYPE_POWERPC          cputype
>> ---             | -------------------------------------
>> ---             | 0x00000000  CPU_SUBTYPE_POWERPC_ALL   cpusubtype
>> --- struct      | -------------------------------------
>> --- fat_arch    | 0x00001000  4096 bytes                offset
>> ---             | -------------------------------------
>> ---             | 0x00004224  16932 bytes               size
>> ---             | -------------------------------------
>> ---             | 0x0000000c  2^12 = 4096 bytes         align
>> ---             ---------------------------------------
>> ---             ---------------------------------------
>> ---             | 0x00000007  CPU_TYPE_I386             cputype
>> ---             | -------------------------------------
>> ---             | 0x00000003  CPU_SUBTYPE_I386_ALL      cpusubtype
>> --- struct      | -------------------------------------
>> --- fat_arch    | 0x00006000  24576 bytes               offset
>> ---             | -------------------------------------
>> ---             | 0x0000292c  10540 bytes               size
>> ---             | -------------------------------------
>> ---             | 0x0000000c  2^12 = 4096 bytes         align
>> ---             ---------------------------------------
>> ---               Unused
>> --- 0x00001000  ---------------------------------------
>> ---             | 0xfeedface  MH_MAGIC                  magic
>> ---             | ------------------------------------
>> ---             | 0x00000012  CPU_TYPE_POWERPC          cputype
>> ---             | ------------------------------------
>> --- struct      | 0x00000000  CPU_SUBTYPE_POWERPC_ALL   cpusubtype
>> --- mach_header | ------------------------------------
>> ---             | 0x00000002  MH_EXECUTE                filetype
>> ---             | ------------------------------------
>> ---             | 0x0000000b  10 load commands          ncmds
>> ---             | ------------------------------------
>> ---             | 0x00000574  1396 bytes                sizeofcmds
>> ---             | ------------------------------------
>> ---             | 0x00000085  DYLDLINK TWOLEVEL         flags
>> ---             --------------------------------------
>> ---               Load commands
>> ---             ---------------------------------------
>> ---               Data
>> ---             ---------------------------------------
>> ---
>> ---               < x86 executable >
>> ---
>> --- 1.https://github.com/aidansteele/osx-abi-macho-file-format-reference
>> --- 2.https://github.com/apple-oss-distributions/xnu/blob/xnu-10002.1.13/EXTERNAL_HEADERS/mach-o/nlist.h
>> --- 3.https://github.com/apple-oss-distributions/xnu/blob/xnu-10002.1.13/EXTERNAL_HEADERS/mach-o/fat.h
>> --- 4.https://developer.apple.com/documentation/apple-silicon/addressing-architectural-differences-in-your-macos-code
>> --- 5.https://reverseengineering.stackexchange.com/a/6357/46029
>> --- 6.http://formats.kaitai.io/mach_o/index.html
> I prefer to keep the non-FAT part of this comment since it is very
> useful.


reverted:

diff --git a/build/src/luajit b/build/src/luajit
deleted file mode 100755
index bd10e09a..00000000
Binary files a/build/src/luajit and /dev/null differ
diff --git a/test/tarantool-tests/lj-865-cross-generation-mach-o-file.test.lua b/test/tarantool-tests/lj-865-cross-generation-mach-o-file.test.lua
index 6a58de95..f9ca6e38 100644
--- a/test/tarantool-tests/lj-865-cross-generation-mach-o-file.test.lua
+++ b/test/tarantool-tests/lj-865-cross-generation-mach-o-file.test.lua
@@ -68,6 +68,62 @@ end

  -- Parses a buffer in the Mach-O format and returns its fields
  -- in a table.
+-- The test creates an object file in Mach-O format with LuaJIT
+-- bytecode and checks the validity of the object file fields.
+---
+--- The original problem is reproduced with LuaJIT, which is built
+--- with enabled AVX512F instructions. The support for AVX512F
+--- could be checked in `/proc/cpuinfo` on Linux and
+--- `sysctl hw.optional.avx512f` on Mac. AVX512F must be
+--- implicitly enabled in a C compiler by passing a CPU codename.
+--- Please take a look at the GCC Online Documentation [1] for
+--- available CPU codenames. Also, see the Wikipedia for CPUs with
+--- AVX-512 support [2].
+--- Execute command below to detect the CPU codename:
+--- `gcc -march=native -Q --help=target | grep march`.
+---
+--- 1. https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html
+--- 2. https://en.wikipedia.org/wiki/AVX-512#CPUs_with_AVX-512
+---
+--- Manual steps for reproducing are the following:
+---
+--- $ CC=gcc TARGET_CFLAGS='skylake-avx512' cmake -S . -B build
+--- $ cmake --build build --parallel
+--- $ echo > test.lua
+--- $ LUA_PATH="src/?.lua;;" luajit -b -o osx -a arm test.lua test.o
+--- $ file test.o
+--- empty.o: DOS executable (block device driver)
+
+--- The format of the Mach-O is described in the document
+--- "OS X ABI Mach-O File Format Reference", published by Apple
+--- company. The copy of the (now removed) official documentation
+--- can be found here [1]. There is a good visual representation
+--- of Mach-O format in "Mac OS X Internals" book (pages 67-68)
+--- [2] and in the [3].
+--
+--- 0x0000000  ---------------------------------------
+---             | 0xfeedface  MH_MAGIC                  magic
+---             | ------------------------------------
+---             | 0x00000012  CPU_TYPE_POWERPC          cputype
+---             | ------------------------------------
+--- struct      | 0x00000000  CPU_SUBTYPE_POWERPC_ALL cpusubtype
+--- mach_header | ------------------------------------
+---             | 0x00000002  MH_EXECUTE                filetype
+---             | ------------------------------------
+---             | 0x0000000b  10 load commands          ncmds
+---             | ------------------------------------
+---             | 0x00000574  1396 bytes sizeofcmds
+---             | ------------------------------------
+---             | 0x00000085  DYLDLINK TWOLEVEL         flags
+---             --------------------------------------
+---               Load commands
+---             ---------------------------------------
+---               Data
+---             ---------------------------------------
+---
+--- 1. https://github.com/aidansteele/osx-abi-macho-file-format-reference
+--- 2. https://reverseengineering.stackexchange.com/a/6357/46029
+--- 3. http://formats.kaitai.io/mach_o/index.html
  local function read_mach_o(buf, hw_arch)
    local is64 = hw_arch == 'arm64'


>
>> ---
>>   -- Using the same declarations as defined in <src/jit/bcsave.lua>.
>>   ffi.cdef[[
> <snipped>
>
>>   local function create_obj_file(name, arch)
>> @@ -212,108 +66,37 @@ local function create_obj_file(name, arch)
>>     return mach_o_path
>>   end
>>   
>> --- Parses a buffer in the Mach-O format and returns the FAT magic
>> --- number and `nfat_arch`.
>> +-- Parses a buffer in the Mach-O format and returns its fields
>> +-- in a table.
>>   local function read_mach_o(buf, hw_arch)
> I suggest renaming it to `read_mach_o_hdr()` and returning only the
> header without any additional wrapping in the table.

Updated:

diff --git a/build/src/luajit b/build/src/luajit
deleted file mode 100755
index bd10e09a..00000000
Binary files a/build/src/luajit and /dev/null differ
diff --git a/test/tarantool-tests/lj-865-cross-generation-mach-o-file.test.lua b/test/tarantool-tests/lj-865-cross-generation-mach-o-file.test.lua
index 6a58de95..0f269a8c 100644
--- a/test/tarantool-tests/lj-865-cross-generation-mach-o-file.test.lua
+++ b/test/tarantool-tests/lj-865-cross-generation-mach-o-file.test.lua
@@ -78,15 +134,7 @@ local function read_mach_o(buf, hw_arch)
    -- Mach-O object header.
    local mach_header = obj.hdr

-  return {
-    header = {
-      magic = mach_header.magic,
-      cputype = mach_header.cputype,
-      cpusubtype = mach_header.cpusubtype,
-      filetype = mach_header.filetype,
-      ncmds = mach_header.ncmds,
-    },
-  }
+  return mach_header
  end

  -- The function builds Mach-O object file and retrieves
@@ -104,22 +152,22 @@ local function build_and_check_mach_o(subtest)
    local mach_o_buf = utils.tools.read_file(mach_o_obj_path)
    assert(mach_o_buf ~= nil and #mach_o_buf ~= 0, 'cannot read an object file')

-  local mach_o = read_mach_o(mach_o_buf, hw_arch)
+  local mach_o = read_mach_o_hdr(mach_o_buf, hw_arch)

    -- Teardown.
    assert(os.remove(mach_o_obj_path), 'remove an object file')

-  local magic_str = string.format('%#x', mach_o.header.magic)
+  local magic_str = string.format('%#x', mach_o.magic)
    subtest:is(magic_str, '0xfeedfacf',
               'magic is correct in Mach-O')
-  local cputype_str = string.format('%#x', mach_o.header.cputype)
+  local cputype_str = string.format('%#x', mach_o.cputype)
    subtest:is(cputype_str, '0x100000c',
               'cputype is correct in Mach-O')
-  subtest:is(mach_o.header.cpusubtype, 0,
+  subtest:is(mach_o.cpusubtype, 0,
               'cpusubtype is correct in Mach-O')
-  subtest:is(mach_o.header.filetype, 1,
+  subtest:is(mach_o.filetype, 1,
               'filetype is correct in Mach-O')
-  subtest:is(mach_o.header.ncmds, 2,
+  subtest:is(mach_o.ncmds, 2,
               'ncmds is correct in Mach-O')
  end

>> -  local res = {
>> -    header = {
>> -      magic = 0,
>> -      nfat_arch = 0,
>> -    },
>> -    fat_arch = {},
>> -  }
>> -
>>     local is64 = hw_arch == 'arm64'
> Maybe it is better to use assert here, like we do for the
> `build_and_check_mach_o()` routine?
Added, but read_mach_o_hdr() is called by build_and_check_mach_o(), where
the assert has already been added.
>
>>   
>> -  -- Mach-O FAT object.
>> -  local mach_fat_obj_type = ffi.typeof(is64 and
>> -                                       'mach_fat_obj_64 *' or
>> -                                       'mach_fat_obj *')
>> -  local obj = ffi.cast(mach_fat_obj_type, buf)
>> +  -- Mach-O object.
>> +  local mach_obj_type = ffi.typeof(is64 and 'mach_obj_64 *')
> Maybe just use mach_obj_64 since there is no alternative?
ok
>> +  local obj = ffi.cast(mach_obj_type, buf)
>>   
>> -  -- Mach-O FAT object header.
>> -  local mach_fat_header = obj.fat
>> -  -- Mach-O FAT is BE, target arch is LE.
>> -  local be32 = bit.bswap
>> -  res.header.magic = be32(mach_fat_header.magic)
>> -  res.header.nfat_arch = be32(mach_fat_header.nfat_arch)
>> +  -- Mach-O object header.
>> +  local mach_header = obj.hdr
>>   
>> -  -- Mach-O FAT object arches.
>> -  for i = 0, res.header.nfat_arch - 1 do
>> -    local fat_arch = obj.fat_arch[i]
>> -    local arch = {
>> -      cputype = be32(fat_arch.cputype),
>> -      cpusubtype = be32(fat_arch.cpusubtype),
>> -    }
>> -    table.insert(res.fat_arch, arch)
>> -  end
>> -
>> -  return res
>> +  return {
>> +    header = {
>> +      magic = mach_header.magic,
>> +      cputype = mach_header.cputype,
>> +      cpusubtype = mach_header.cpusubtype,
>> +      filetype = mach_header.filetype,
>> +      ncmds = mach_header.ncmds,
>> +    },
>> +  }
>>   end
>>   
>> --- Universal Binary can contain executables for more than one
>> --- CPU architecture. For simplicity, the test compares the *sum*
>> --- of CPU types and CPU subtypes.
>> ---
>> --- <src/jit/bcsave.lua:bcsave_machobj> has the definitions of the
>> --- numbers below. The original XNU source code may be found in
>> --- <osfmk/mach/machine.h> [1].
>> ---
>> --- 1.https://opensource.apple.com/source/xnu/xnu-4570.41.2/osfmk/mach/machine.h.auto.html
>> ---
>> -local SUM_CPUTYPE = {
>> -  -- x86 + arm.
>> -  arm = 7 + 12,
>> -  -- x64 + arm64.
>> -  arm64 = 0x01000007 + 0x0100000c,
>> -}
>> -local SUM_CPUSUBTYPE = {
>> -  -- x86 + arm.
>> -  arm = 3 + 9,
>> -  -- x64 + arm64.
>> -  arm64 = 3 + 0,
>> -}
>> -
>> --- The function builds Mach-O FAT object file and retrieves
>> --- its header fields (magic and nfat_arch) and fields of each arch
>> --- (cputype, cpusubtype).
>> ---
>> --- The Mach-O FAT object header can be retrieved with `otool` on
>> --- macOS:
>> ---
>> --- $ otool -f empty.o
>> --- Fat headers
>> --- fat_magic 0xcafebabe
>> --- nfat_arch 2
>> --- <snipped>
>> ---
>> --- CPU type and subtype can be retrieved with `lipo` on macOS:
>> ---
>> --- $ luajit -b -o osx -a arm empty.lua empty.o
>> --- $ lipo -archs empty.o
>> --- i386 armv7
>> --- $ luajit -b -o osx -a arm64 empty.lua empty.o
>> --- $ lipo -archs empty.o
>> --- x86_64 arm64
>> +-- The function builds Mach-O object file and retrieves
>> +-- its header fields.
>>   local function build_and_check_mach_o(subtest)
>>     local hw_arch = subtest.name
>> -  assert(hw_arch == 'arm' or hw_arch == 'arm64')
>> +  -- LuaJIT always generate 64-bit non-FAT Mach-O object files.
> The fact that these files are generated only for M1 CPUs looks worth
> mentioning.
>
>> +  assert(hw_arch == 'arm64')
>>   
>> -  subtest:plan(4)
>> -  -- FAT_MAGIC is an integer containing the value 0xCAFEBABE in
>> -  -- big-endian byte order format. On a big-endian host CPU,
>> -  -- this can be validated using the constant FAT_MAGIC;
>> -  -- on a little-endian host CPU, it can be validated using
>> -  -- the constant FAT_CIGAM.
>> -  --
>> -  -- FAT_NARCH is an integer specifying the number of fat_arch
>> -  -- data structures that follow. This is the number of
>> -  -- architectures contained in this binary.
>> -  --
>> -  -- See the aforementioned "OS X ABI Mach-O File Format
>> -  -- Reference".
> The similar comment about Mach-O format will be appreciated.
The Mach-O header struct is described in the comment above.
>
>> -  local FAT_MAGIC = '0xffffffffcafebabe'
>> -  local FAT_NARCH = 2
>> +  subtest:plan(5)
>>   
>>     local MODULE_NAME = 'lango_team'
>>   
>> @@ -327,24 +110,19 @@ local function build_and_check_mach_o(subtest)
>>     assert(os.remove(mach_o_obj_path), 'remove an object file')
>>   
>>     local magic_str = string.format('%#x', mach_o.header.magic)
>> -  subtest:is(magic_str, FAT_MAGIC,
>> -             'fat_magic is correct in Mach-O')
>> -  subtest:is(mach_o.header.nfat_arch, FAT_NARCH,
>> -             'nfat_arch is correct in Mach-O')
>> -
>> -  local total_cputype = 0
>> -  local total_cpusubtype = 0
>> -  for i = 1, FAT_NARCH do
>> -    total_cputype = total_cputype + mach_o.fat_arch[i].cputype
>> -    total_cpusubtype = total_cpusubtype + mach_o.fat_arch[i].cpusubtype
>> -  end
>> -  subtest:is(total_cputype, SUM_CPUTYPE[hw_arch],
>> +  subtest:is(magic_str, '0xfeedfacf',
> Please use MH_MAGIC_64 named constant for this magic string.
Added
>
>> +             'magic is correct in Mach-O')
> Looks like this line may be joined with the previous one.
not enough free space for this
>> +  local cputype_str = string.format('%#x', mach_o.header.cputype)
>> +  subtest:is(cputype_str, '0x100000c',
> Please use the named constant CPU_TYPE_ARM64 for this magic string.
Added
>
>>                'cputype is correct in Mach-O')
> Looks like this line may be joined with the previous one.
not enough free space for this
>
>> -  subtest:is(total_cpusubtype, SUM_CPUSUBTYPE[hw_arch],
>> +  subtest:is(mach_o.header.cpusubtype, 0,
> Please use the named constant CPU_SUBTYPE_ARM64 for this magic constant.
added
>
>>                'cpusubtype is correct in Mach-O')
> Looks like this line may be joined with the previous one.
not enough free space for this
>
>> +  subtest:is(mach_o.header.filetype, 1,
> What does the 1 filetype mean?
> Please use the named constant.

MH_OBJECT, added:

 * The file type MH_OBJECT is a compact format intended as output of the
 * assembler and input (and possibly output) of the link editor (the .o
 * format). All sections are in one unnamed segment with no segment padding.
 * This format is used as an executable format when the file is so small the
 * segment padding greatly increases its size.

https://opensource.apple.com/source/xnu/xnu-344/EXTERNAL_HEADERS/mach-o/loader.h.auto.html


>
>> +             'filetype is correct in Mach-O')
> Looks like this line may be joined with the previous one.
no free space for this
>
>> +  subtest:is(mach_o.header.ncmds, 2,
> Why there are 2 commands for Mach-O format?
> Please use the named constant.
Is the constant name 'ncmds' self-explanatory?
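
For reference, the five fields checked above sit at fixed offsets at the
start of `mach_header_64` (magic at 0, cputype at 4, cpusubtype at 8,
filetype at 12, ncmds at 16), so they can also be inspected by hand. A rough
shell sketch, assuming a little-endian target and a POSIX `od`; `le32` and
`macho_hdr` are hypothetical helper names, not part of the patch:

```shell
# Print the 32-bit little-endian value stored at byte offset $2 of file $1.
le32() {
  # After `set --`, $1..$4 hold the four bytes in file order.
  set -- $(od -An -tx1 -j "$2" -N 4 "$1")
  # Reverse the byte order to get the numeric value.
  printf '0x%s%s%s%s\n' "$4" "$3" "$2" "$1"
}

# Dump the mach_header_64 fields that the test checks.
macho_hdr() {
  echo "magic=$(le32 "$1" 0)"
  echo "cputype=$(le32 "$1" 4)"
  echo "cpusubtype=$(le32 "$1" 8)"
  echo "filetype=$(le32 "$1" 12)"
  echo "ncmds=$(le32 "$1" 16)"
}
```

With the values the test expects, `macho_hdr empty.o` would print
magic=0xfeedfacf (MH_MAGIC_64), cputype=0x0100000c (CPU_TYPE_ARM64),
cpusubtype=0x00000000, filetype=0x00000001 (MH_OBJECT) and ncmds=0x00000002.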
>
>> +             'ncmds is correct in Mach-O')
> Looks like this line may be joined with the previous one.
joined
>
>>   end
>>   
>> -test:test('arm', build_and_check_mach_o)
>>   test:test('arm64', build_and_check_mach_o)
>>   
>>   test:done(true)
>> -- 
>> 2.34.1
>>
> [1]:https://github.com/LuaJIT/LuaJIT/issues/1181#issue-2202788411
>