Re: [Tarantool-patches] [PATCH luajit 1/2] OSX/iOS/ARM64: Fix generation of Mach-O object files.

Tarantool development patches archive
 help / color / mirror / Atom feed

From: Maxim Kokryashkin via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Sergey Bronnikov <estetus@gmail.com>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH luajit 1/2] OSX/iOS/ARM64: Fix generation of Mach-O object files.
Date: Mon, 18 Mar 2024 15:53:04 +0300	[thread overview]
Message-ID: <f4p5lqyc4mtnpcqcvudsrlvfizc64w5mtx56umwmneiqfpuumj@sm5px5nhetyl> (raw)
In-Reply-To: <9a71bf0765acd6ab019de4ae9f491a6c7bcb463d.1710416150.git.sergeyb@tarantool.org>

Hi, Sergey, thanks for the patch!
Please consider my comments below.
On Thu, Mar 14, 2024 at 02:39:49PM +0300, Sergey Bronnikov wrote:
> From: sergeyb@tarantool.org
>
> Thanks to Carlo Cabrera.
>
> (cherry picked from commit 3065c910ad6027031aabe2dfd3c26a3d0f014b4f)
>
> Mach-O FAT object constructed by LuaJIT had an incorrect format. The
Typo: s/Mach-O FAT/The Mach-O FAT/
> problem is reproduced when target hardware platform has AVX512F and
Typo:s/target/the target/
> LuaJIT is compiled with enabled AVX512F instructions.
>
> The problem is arise because LuaJIT FFI code for Mach-O file generation
Typo: s/is arise/arises/
> in `bcsave.lua` relied on undefined behavior for conversions to
Typo: s/relied/relies/
> `uint32_t`. AVX512F has the `vcvttsd2usi` instruction which converts
Typo: s/instruction/instruction,/
> `double`/`float` to `uint32_t/uint64_t`. Earlier architectures (SSE2,
> AVX2) are sorely lacking such an instruction, as they only support
> signed conversions. Unsigned conversions are done with a signed convert
> and range shifting - the exact algorithm depends on the compiler.
> A side-effect of these workarounds is that negative `double`/`float`
> often inadvertently convert 'as expected', even though this is invoking
> undefined behavior. Whereas `vcvttsd2usi` always returns 0x80000000 or
> 0x8000000000000000 for out-of-range inputs.
>
> The patch fixes the problem, however, the real issue remains unfixed.
>
> Sergey Bronnikov:
> * added the description and a test for the problem
Typo: s/a test/the test/
>
> Part of tarantool/tarantool#9595
> ---
>  .github/workflows/exotic-builds-testing.yml   |   5 +-
>  src/jit/bcsave.lua                            |   6 +-
>  .../lj-366-strtab-correct-size.test.lua       |  10 +-
>  ...generation_of_mach-o_object_files.test.lua | 271 ++++++++++++++++++
>  test/tarantool-tests/utils/tools.lua          |   8 +
>  5 files changed, 287 insertions(+), 13 deletions(-)
>  create mode 100644 test/tarantool-tests/lj-865-fix_generation_of_mach-o_object_files.test.lua
>
> diff --git a/.github/workflows/exotic-builds-testing.yml b/.github/workflows/exotic-builds-testing.yml
> index a9ba5fd5..df4bc2e9 100644
> --- a/.github/workflows/exotic-builds-testing.yml
> +++ b/.github/workflows/exotic-builds-testing.yml
> @@ -32,6 +32,7 @@ jobs:
>        fail-fast: false
>        matrix:
>          BUILDTYPE: [Debug, Release]
> +        OS: [Linux, macOS]
>          ARCH: [ARM64, x86_64]
>          GC64: [ON, OFF]
>          FLAVOR: [checkhook, dualnum, gdbjit, nojit, nounwind]
> @@ -50,13 +51,15 @@ jobs:
>              FLAVORFLAGS: -DLUAJIT_USE_GDBJIT=ON
https://github.com/tarantool/luajit/actions/runs/8279362128
>            - FLAVOR: nounwind
>              FLAVORFLAGS: -DLUAJIT_NO_UNWIND=ON
> +          - FLAVOR: avx512
> +            CMAKEFLAGS: -DCMAKE_C_FLAGS=skylake-avx512 -DCMAKE_C_COMPILER=gcc
>          exclude:
>            - ARCH: ARM64
>              GC64: OFF
>            # DUALNUM is default for ARM64, no need for additional testing.
>            - FLAVOR: dualnum
>              ARCH: ARM64
> -    runs-on: [self-hosted, regular, Linux, '${{ matrix.ARCH }}']
> +    runs-on: [self-hosted, regular, Linux, '${{ matrix.ARCH }}', '${ matrix.OS }']
The matrix.OS variable should be wrapped with double curly braces,
instead of singular ones. So, it should be like this:
| '${{ matrix.OS }}'

Currently, exotic build testing fails to start because of this mistake.
https://github.com/tarantool/luajit/actions/runs/8279362128
>      name: >
>        LuaJIT ${{ matrix.FLAVOR }}
>        (Linux/${{ matrix.ARCH }})
> diff --git a/src/jit/bcsave.lua b/src/jit/bcsave.lua
> index a287d675..7aec1555 100644
> --- a/src/jit/bcsave.lua
> +++ b/src/jit/bcsave.lua
> @@ -446,18 +446,18 @@ typedef struct {
>    uint32_t value;
>  } mach_nlist;
>  typedef struct {
> -  uint32_t strx;
> +  int32_t strx;
>    uint8_t type, sect;
>    uint16_t desc;
>    uint64_t value;
>  } mach_nlist_64;
>  typedef struct
>  {
> -  uint32_t magic, nfat_arch;
> +  int32_t magic, nfat_arch;
>  } mach_fat_header;
>  typedef struct
>  {
> -  uint32_t cputype, cpusubtype, offset, size, align;
> +  int32_t cputype, cpusubtype, offset, size, align;
>  } mach_fat_arch;
>  typedef struct {
>    struct {
> diff --git a/test/tarantool-tests/lj-366-strtab-correct-size.test.lua b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
> index 8a97a441..0bb92da6 100644
> --- a/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
> +++ b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
Let's move this update to a separate patch alongside with the added
utility function. Currently, this change in unrelated test file is
a bit confusing.
> @@ -138,14 +138,6 @@ local function create_obj_file(name)
>    return elf_filename
>  end
>
> --- Reads a file located in a specified path and returns its content.
> -local function read_file(path)
> -  local file = assert(io.open(path), 'cannot open an object file')
> -  local content = file:read('*a')
> -  file:close()
> -  return content
> -end
> -
>  -- Parses a buffer in an ELF format and returns an offset and a size of strtab
>  -- and symtab sections.
>  local function read_elf(elf_content)
> @@ -172,7 +164,7 @@ end
>  test:plan(3)
>
>  local elf_filename = create_obj_file(MODULE_NAME)
> -local elf_content = read_file(elf_filename)
> +local elf_content = require('utils').tools.read_file(elf_filename)
>  assert(#elf_content ~= 0, 'cannot read an object file')
>
>  local strtab, symtab = read_elf(elf_content)
> diff --git a/test/tarantool-tests/lj-865-fix_generation_of_mach-o_object_files.test.lua b/test/tarantool-tests/lj-865-fix_generation_of_mach-o_object_files.test.lua
> new file mode 100644
> index 00000000..0519e134
> --- /dev/null
> +++ b/test/tarantool-tests/lj-865-fix_generation_of_mach-o_object_files.test.lua
> @@ -0,0 +1,271 @@
> +local tap = require('tap')
> +local test = tap.test('lj-865-fix_generation_of_mach-o_object_files'):skipcond({
> +  -- XXX: Tarantool doesn't use default LuaJIT loaders, and Lua
> +  -- bytecode can't be loaded from the shared library. For more
Typo: s/the shared/a shared/
> +  -- info: https://github.com/tarantool/tarantool/issues/9671.
> +  -- luacheck: no global
> +  ['Test uses exotic type of loaders (see #9671)'] = _TARANTOOL,
Typo: s/uses exotic/uses an exotic/
> +})
> +
> +test:plan(4)
> +
> +-- Test creates an object file in Mach-O format with LuaJIT bytecode
Typo: s/Test/The test/
> +-- and checks validness of the object file fields.
Typo: s/validness/the validity/
> +--
> +-- The original problem is reproduced with LuaJIT that built with
> +-- enabled AVX512F instructions. The support of AVX512F could be
Typo: s/support of/support for/
> +-- checked in `/proc/cpuinfo` on Linux and
> +-- `sysctl hw.optional.avx512f` on Mac. AVX512F must be
> +-- implicitly enabled in a C compiler by passing CPU codename.
> +-- Please consult for available model architecture on GCC Online
Typo: s/for/the/
> +-- Documentation [1] for available CPU codenames. To detect
> +-- CPU codename execute `gcc -march=native -Q --help=target | grep march`.
> +--
> +-- 1. https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html
> +--
> +-- Manual steps for reproducing are the following:
> +--
> +-- $ make CC=gcc TARGET_CFLAGS='skylake-avx512' -f Makefile.original
> +-- $ echo > test.lua
> +-- $ LUA_PATH="src/?.lua;;" luajit -b -o osx -a arm test.lua test.o
> +-- $ file test.o
> +-- empty.o: DOS executable (block device driver)
> +
> +local ffi = require('ffi')
> +
> +-- Format of the Mach-O is described in a document
Typo: s/a document/the document/
> +-- "OS X ABI Mach-O File Format Reference", published by Apple company.
> +-- Copy of the (now removed) official documentation in [1].
Let's replace "in [1]" with "can be found here [1].".
> +-- Yet another source of thruth is a XNU headers, see the definition
Typo: s/is a XNU/are XNU/
Typo: s/the definition/definitions/
> +-- of C-structures in: [2] (`nlist_64`), [3] (`fat_arch` and `fat_header`).
> +
> +-- 1. https://github.com/aidansteele/osx-abi-macho-file-format-reference
> +-- 2. https://github.com/apple-oss-distributions/xnu/blob/xnu-10002.1.13/EXTERNAL_HEADERS/mach-o/nlist.h
> +-- 3. https://github.com/apple-oss-distributions/xnu/blob/xnu-10002.1.13/EXTERNAL_HEADERS/mach-o/fat.h
> +-- 4. https://developer.apple.com/documentation/apple-silicon/addressing-architectural-differences-in-your-macos-code
> +--
> +-- Using the same declarations as defined in <src/jit/bcsave.lua>.
> +ffi.cdef[[
> +typedef struct
> +{
> +  uint32_t magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags;
> +} mach_header;
> +
> +typedef struct
> +{
> +  mach_header; uint32_t reserved;
> +} mach_header_64;
> +
> +typedef struct {
> +  uint32_t cmd, cmdsize;
> +  char segname[16];
> +  uint32_t vmaddr, vmsize, fileoff, filesize;
> +  uint32_t maxprot, initprot, nsects, flags;
> +} mach_segment_command;
> +
> +typedef struct {
> +  uint32_t cmd, cmdsize;
> +  char segname[16];
> +  uint64_t vmaddr, vmsize, fileoff, filesize;
> +  uint32_t maxprot, initprot, nsects, flags;
> +} mach_segment_command_64;
> +
> +typedef struct {
> +  char sectname[16], segname[16];
> +  uint32_t addr, size;
> +  uint32_t offset, align, reloff, nreloc, flags;
> +  uint32_t reserved1, reserved2;
> +} mach_section;
> +
> +typedef struct {
> +  char sectname[16], segname[16];
> +  uint64_t addr, size;
> +  uint32_t offset, align, reloff, nreloc, flags;
> +  uint32_t reserved1, reserved2, reserved3;
> +} mach_section_64;
> +
> +typedef struct {
> +  uint32_t cmd, cmdsize, symoff, nsyms, stroff, strsize;
> +} mach_symtab_command;
> +
> +typedef struct {
> +  int32_t strx;
> +  uint8_t type, sect;
> +  int16_t desc;
> +  uint32_t value;
> +} mach_nlist;
> +
> +typedef struct {
> +  uint32_t strx;
> +  uint8_t type, sect;
> +  uint16_t desc;
> +  uint64_t value;
> +} mach_nlist_64;
> +
> +typedef struct
> +{
> +  uint32_t magic, nfat_arch;
> +} mach_fat_header;
> +
> +typedef struct
> +{
> +  uint32_t cputype, cpusubtype, offset, size, align;
> +} mach_fat_arch;
> +
> +typedef struct {
> +  mach_fat_header fat;
> +  mach_fat_arch fat_arch[2];
> +  struct {
> +    mach_header hdr;
> +    mach_segment_command seg;
> +    mach_section sec;
> +    mach_symtab_command sym;
> +  } arch[2];
> +  mach_nlist sym_entry;
> +  uint8_t space[4096];
> +} mach_fat_obj;
> +
> +typedef struct {
> +  mach_fat_header fat;
> +  mach_fat_arch fat_arch[2];
> +  struct {
> +    mach_header_64 hdr;
> +    mach_segment_command_64 seg;
> +    mach_section_64 sec;
> +    mach_symtab_command sym;
> +  } arch[2];
> +  mach_nlist_64 sym_entry;
> +  uint8_t space[4096];
> +} mach_fat_obj_64;
> +]]
> +
> +local function create_obj_file(name, arch)
> +  local mach_o_path = os.tmpname() .. '.o'
> +  local lua_path = os.getenv('LUA_PATH')
> +  local lua_bin = require('utils').exec.luacmd(arg):match('%S+')
> +  local cmd_fmt = 'LUA_PATH="%s" %s -b -n "%s" -o osx -a %s -e "print()" %s'
> +  local cmd = (cmd_fmt):format(lua_path, lua_bin, name, arch, mach_o_path)
> +  local ret = os.execute(cmd)
> +  assert(ret == 0, 'cannot create an object file')
> +  return mach_o_path
> +end
> +
> +-- Parses a buffer in an Mach-O format and returns
Typo: s/in an/in the/
> +-- an fat magic number and nfat_arch.
Typo: s/an fat/the FAT/
Typo: s/nfat_arch/`nfat_arch`
> +local function read_mach_o(buf, is64)
> +  local res = {
> +    header = {
> +      magic = 0,
> +      nfat_arch = 0,
> +    },
> +    fat_arch = {
> +    },
I guess, that the formatting below is a bit better here:
| fat_arch = {},
> +  }
> +
> +  -- Mach-O FAT object.
> +  local mach_fat_obj_type = ffi.typeof(is64 and 'mach_fat_obj_64 *' or 'mach_fat_obj *')
The line is longer than 80 symbols.
> +  local obj = ffi.cast(mach_fat_obj_type, buf)
> +
> +  -- Mach-O FAT object header.
> +  local mach_fat_header_type = ffi.typeof('mach_fat_header *')
> +  local mach_fat_header = ffi.cast(mach_fat_header_type, obj.fat)
> +  local be32 = bit.bswap -- Mach-O FAT is BE, target arch is LE.
> +  res.header.magic = be32(mach_fat_header.magic)
> +  res.header.nfat_arch = be32(mach_fat_header.nfat_arch)
> +
> +  -- Mach-O FAT object archs.
Typo: s/archs/arches/

Side note: I feel like the comments for the sections are not elaborate
enough for unprepared reader. I think you should briefly desribe the
basic structure of a FAT object (FAT header, then array of per-segment
headers, then object files)
> +  local mach_fat_arch_type = ffi.typeof('mach_fat_arch *')
> +  for i = 0, res.header.nfat_arch - 1 do
> +    local fat_arch = ffi.cast(mach_fat_arch_type, obj.fat_arch[i])
> +    arch = {
> +      cputype = be32(fat_arch.cputype),
> +      cpusubtype = be32(fat_arch.cpusubtype),
> +    }
> +    table.insert(res.fat_arch, arch)
> +  end
> +
> +  return res
> +end
> +
> +-- Defined in <src/jit/bcsave.lua:bcsave_machobj>.
> +local sum_cputype = {
> +  x86 = 7,
> +  x64 = 0x01000007,
> +  arm = 7 + 12,
> +  arm64 = 0x01000007 + 0x0100000c,
> +}
> +local sum_cpusubtype = {
> +  x86 = 3,
> +  x64 = 3,
> +  arm = 3 + 9,
> +  arm64 = 3 + 0,
> +}
It would be nice to have an explanation for these magic numbers.
> +
> +-- The function builds Mach-O FAT object file and retrieves
> +-- its header fields (magic and nfat_arch)
> +-- and fields of the each arch (cputype, cpusubtype).
> +--
> +-- Mach-O FAT object header could be retrieved with `otool` on macOS:
Typo: s/could be/can be/
> +--
> +-- $ otool -f empty.o
> +-- Fat headers
> +-- fat_magic 0xcafebabe
> +-- nfat_arch 2
> +-- <snipped>
> +--
> +-- CPU type and subtype could be retrieved with `lipo` on macOS:
Typo: s/could be/can be/
> +--
> +-- $ luajit -b -o osx -a arm empty.lua empty.o
> +-- $ lipo -archs empty.o
> +-- i386 armv7
> +-- $ luajit -b -o osx -a arm64 empty.lua empty.o
> +-- $ lipo -archs empty.o
> +-- x86_64 arm64
> +local function build_and_check_mach_o(is64)
> +  local arch = is64 and 'arm64' or 'arm'
> +
> +  -- FAT_MAGIC is an integer containing the value 0xCAFEBABE in
> +  -- big-endian byte order format. On a big-endian host CPU,
> +  -- this can be validated using the constant FAT_MAGIC;
> +  -- on a little-endian host CPU, it can be validated using
> +  -- the constant FAT_CIGAM.
> +  --
> +  -- FAT_NARCH is an integer specifying the number of fat_arch
> +  -- data structures that follow. This is the number of
> +  -- architectures contained in this binary.
> +  --
> +  -- See aforementioned "OS X ABI Mach-O File Format Reference".
> +  --
> +  local FAT_MAGIC = '0xffffffffcafebabe'
> +  local FAT_NARCH = 2
> +
> +  local MODULE_NAME = 'lango_team'
> +
> +  local mach_o_obj_path = create_obj_file(MODULE_NAME, arch)
> +  local mach_o_buf = require('utils').tools.read_file(mach_o_obj_path)
> +  assert(mach_o_buf == nil or #mach_o_buf ~= 0, 'cannot read an object file')
> +
> +  local mach_o = read_mach_o(mach_o_buf, is64)
> +
> +  -- Teardown.
> +  local retcode = os.remove(mach_o_obj_path)
> +  assert(retcode == true, 'remove an object file')
> +
> +  local magic_str = string.format('0x%02x', mach_o.header.magic)
> +  test:is(magic_str, FAT_MAGIC, 'fat_magic is correct in Mach-O, ' .. arch)
> +  test:is(mach_o.header.nfat_arch, FAT_NARCH, 'nfat_arch is correct in Mach-O, ' .. arch)
> +
> +  local total_cputype = 0
> +  local total_cpusubtype = 0
> +  for i = 1, mach_o.header.nfat_arch do
> +    total_cputype = total_cputype + mach_o.fat_arch[i].cputype
> +    total_cpusubtype = total_cpusubtype + mach_o.fat_arch[i].cpusubtype
> +  end
> +  test:is(total_cputype, sum_cputype[arch], 'cputype is correct in Mach-O, ' .. arch)
> +  test:is(total_cpusubtype, sum_cpusubtype[arch], 'cpusubtype is correct in Mach-O, ' .. arch)
> +end
> +
> +-- ARM
> +build_and_check_mach_o(false)
> +
> +test:done(true)

Please mention that alongside with test and fix for the issue
you've added this tool. IMO, it would be even better to do that
in a separate commit, to avoid confusion because of the updates in
test files unrelated to the patch.
> diff --git a/test/tarantool-tests/utils/tools.lua b/test/tarantool-tests/utils/tools.lua
> index f35c6922..26b8c08d 100644
> --- a/test/tarantool-tests/utils/tools.lua
> +++ b/test/tarantool-tests/utils/tools.lua
> @@ -12,4 +12,12 @@ function M.profilename(name)
>    return (arg[0]:gsub('^(.+)/([^/]+)%.test%.lua$', replacepattern))
>  end
>
> +-- Reads a file located in a specified path and returns its content.
> +function M.read_file(path)
> +  local file = assert(io.open(path), 'cannot open an object file')
> +  local content = file:read('*a')
> +  file:close()
> +  return content
> +end
> +
>  return M
> --
> 2.34.1
>

next prev parent reply	other threads:[~2024-03-18 12:53 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-14 11:39 [Tarantool-patches] [PATCH luajit 0/2] Mach-O generation fixes Sergey Bronnikov via Tarantool-patches
2024-03-14 11:39 ` [Tarantool-patches] [PATCH luajit 1/2] OSX/iOS/ARM64: Fix generation of Mach-O object files Sergey Bronnikov via Tarantool-patches
2024-03-18 12:53   ` Maxim Kokryashkin via Tarantool-patches [this message]
2024-03-19  8:19     ` Sergey Bronnikov via Tarantool-patches
2024-03-19 16:28       ` Maxim Kokryashkin via Tarantool-patches
2024-03-26 13:53         ` Sergey Bronnikov via Tarantool-patches
2024-03-26 15:44           ` Maxim Kokryashkin via Tarantool-patches
2024-04-08 15:01           ` Sergey Kaplun via Tarantool-patches
2024-04-09 11:07             ` Sergey Bronnikov via Tarantool-patches
2024-04-09 12:47               ` Sergey Kaplun via Tarantool-patches
2024-03-18 12:55   ` Maxim Kokryashkin via Tarantool-patches
2024-03-14 11:39 ` [Tarantool-patches] [PATCH luajit 2/2] OSX/iOS/ARM64: Fix bytecode embedding in Mach-O object file Sergey Bronnikov via Tarantool-patches
2024-03-18 13:44   ` Maxim Kokryashkin via Tarantool-patches
2024-03-19  8:22     ` Sergey Bronnikov via Tarantool-patches
2024-03-19 16:15       ` Maxim Kokryashkin via Tarantool-patches
2024-03-26 14:01         ` Sergey Bronnikov via Tarantool-patches
2024-03-26 15:45           ` Maxim Kokryashkin via Tarantool-patches
2024-04-08 15:16   ` Sergey Kaplun via Tarantool-patches
2024-04-11  7:56     ` Sergey Bronnikov via Tarantool-patches
2024-04-08  7:47 ` [Tarantool-patches] [PATCH luajit 0/2] Mach-O generation fixes Sergey Kaplun via Tarantool-patches
2024-04-08 13:06   ` Sergey Kaplun via Tarantool-patches
2024-04-11  8:08   ` Sergey Bronnikov via Tarantool-patches
2024-04-11  8:27     ` Sergey Kaplun via Tarantool-patches
2024-04-11 12:39       ` Sergey Bronnikov via Tarantool-patches

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f4p5lqyc4mtnpcqcvudsrlvfizc64w5mtx56umwmneiqfpuumj@sm5px5nhetyl \
    --to=tarantool-patches@dev.tarantool.org \
    --cc=estetus@gmail.com \
    --cc=m.kokryashkin@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH luajit 1/2] OSX/iOS/ARM64: Fix generation of Mach-O object files.' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox