Tarantool development patches archive
 help / color / mirror / Atom feed
From: Sergey Bronnikov via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: tarantool-patches@dev.tarantool.org,
	Sergey Kaplun <skaplun@tarantool.org>,
	Maxim Kokryashkin <m.kokryashkin@tarantool.org>
Subject: [Tarantool-patches] [PATCH luajit 1/2] OSX/iOS/ARM64: Fix generation of Mach-O object files.
Date: Thu, 14 Mar 2024 14:39:49 +0300	[thread overview]
Message-ID: <9a71bf0765acd6ab019de4ae9f491a6c7bcb463d.1710416150.git.sergeyb@tarantool.org> (raw)
In-Reply-To: <cover.1710416150.git.sergeyb@tarantool.org>

From: sergeyb@tarantool.org

Thanks to Carlo Cabrera.

(cherry picked from commit 3065c910ad6027031aabe2dfd3c26a3d0f014b4f)

Mach-O FAT object constructed by LuaJIT had an incorrect format. The
problem is reproduced when target hardware platform has AVX512F and
LuaJIT is compiled with enabled AVX512F instructions.

The problem is arise because LuaJIT FFI code for Mach-O file generation
in `bcsave.lua` relied on undefined behavior for conversions to
`uint32_t`. AVX512F has the `vcvttsd2usi` instruction which converts
`double`/`float` to `uint32_t/uint64_t`. Earlier architectures (SSE2,
AVX2) are sorely lacking such an instruction, as they only support
signed conversions. Unsigned conversions are done with a signed convert
and range shifting - the exact algorithm depends on the compiler.
A side-effect of these workarounds is that negative `double`/`float`
often inadvertently convert 'as expected', even though this is invoking
undefined behavior. Whereas `vcvttsd2usi` always returns 0x80000000 or
0x8000000000000000 for out-of-range inputs.

The patch fixes the problem, however, the real issue remains unfixed.

Sergey Bronnikov:
* added the description and a test for the problem

Part of tarantool/tarantool#9595
---
 .github/workflows/exotic-builds-testing.yml   |   5 +-
 src/jit/bcsave.lua                            |   6 +-
 .../lj-366-strtab-correct-size.test.lua       |  10 +-
 ...generation_of_mach-o_object_files.test.lua | 271 ++++++++++++++++++
 test/tarantool-tests/utils/tools.lua          |   8 +
 5 files changed, 287 insertions(+), 13 deletions(-)
 create mode 100644 test/tarantool-tests/lj-865-fix_generation_of_mach-o_object_files.test.lua

diff --git a/.github/workflows/exotic-builds-testing.yml b/.github/workflows/exotic-builds-testing.yml
index a9ba5fd5..df4bc2e9 100644
--- a/.github/workflows/exotic-builds-testing.yml
+++ b/.github/workflows/exotic-builds-testing.yml
@@ -32,6 +32,7 @@ jobs:
       fail-fast: false
       matrix:
         BUILDTYPE: [Debug, Release]
+        OS: [Linux, macOS]
         ARCH: [ARM64, x86_64]
         GC64: [ON, OFF]
         FLAVOR: [checkhook, dualnum, gdbjit, nojit, nounwind]
@@ -50,13 +51,15 @@ jobs:
             FLAVORFLAGS: -DLUAJIT_USE_GDBJIT=ON
           - FLAVOR: nounwind
             FLAVORFLAGS: -DLUAJIT_NO_UNWIND=ON
+          - FLAVOR: avx512
+            CMAKEFLAGS: -DCMAKE_C_FLAGS=skylake-avx512 -DCMAKE_C_COMPILER=gcc
         exclude:
           - ARCH: ARM64
             GC64: OFF
           # DUALNUM is default for ARM64, no need for additional testing.
           - FLAVOR: dualnum
             ARCH: ARM64
-    runs-on: [self-hosted, regular, Linux, '${{ matrix.ARCH }}']
+    runs-on: [self-hosted, regular, Linux, '${{ matrix.ARCH }}', '${ matrix.OS }']
     name: >
       LuaJIT ${{ matrix.FLAVOR }}
       (Linux/${{ matrix.ARCH }})
diff --git a/src/jit/bcsave.lua b/src/jit/bcsave.lua
index a287d675..7aec1555 100644
--- a/src/jit/bcsave.lua
+++ b/src/jit/bcsave.lua
@@ -446,18 +446,18 @@ typedef struct {
   uint32_t value;
 } mach_nlist;
 typedef struct {
-  uint32_t strx;
+  int32_t strx;
   uint8_t type, sect;
   uint16_t desc;
   uint64_t value;
 } mach_nlist_64;
 typedef struct
 {
-  uint32_t magic, nfat_arch;
+  int32_t magic, nfat_arch;
 } mach_fat_header;
 typedef struct
 {
-  uint32_t cputype, cpusubtype, offset, size, align;
+  int32_t cputype, cpusubtype, offset, size, align;
 } mach_fat_arch;
 typedef struct {
   struct {
diff --git a/test/tarantool-tests/lj-366-strtab-correct-size.test.lua b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
index 8a97a441..0bb92da6 100644
--- a/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
+++ b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
@@ -138,14 +138,6 @@ local function create_obj_file(name)
   return elf_filename
 end
 
--- Reads a file located in a specified path and returns its content.
-local function read_file(path)
-  local file = assert(io.open(path), 'cannot open an object file')
-  local content = file:read('*a')
-  file:close()
-  return content
-end
-
 -- Parses a buffer in an ELF format and returns an offset and a size of strtab
 -- and symtab sections.
 local function read_elf(elf_content)
@@ -172,7 +164,7 @@ end
 test:plan(3)
 
 local elf_filename = create_obj_file(MODULE_NAME)
-local elf_content = read_file(elf_filename)
+local elf_content = require('utils').tools.read_file(elf_filename)
 assert(#elf_content ~= 0, 'cannot read an object file')
 
 local strtab, symtab = read_elf(elf_content)
diff --git a/test/tarantool-tests/lj-865-fix_generation_of_mach-o_object_files.test.lua b/test/tarantool-tests/lj-865-fix_generation_of_mach-o_object_files.test.lua
new file mode 100644
index 00000000..0519e134
--- /dev/null
+++ b/test/tarantool-tests/lj-865-fix_generation_of_mach-o_object_files.test.lua
@@ -0,0 +1,271 @@
+local tap = require('tap')
+local test = tap.test('lj-865-fix_generation_of_mach-o_object_files'):skipcond({
+  -- XXX: Tarantool doesn't use default LuaJIT loaders, and Lua
+  -- bytecode can't be loaded from the shared library. For more
+  -- info: https://github.com/tarantool/tarantool/issues/9671.
+  -- luacheck: no global
+  ['Test uses exotic type of loaders (see #9671)'] = _TARANTOOL,
+})
+
+test:plan(4)
+
+-- Test creates an object file in Mach-O format with LuaJIT bytecode
+-- and checks validness of the object file fields.
+--
+-- The original problem is reproduced with LuaJIT that built with
+-- enabled AVX512F instructions. The support of AVX512F could be
+-- checked in `/proc/cpuinfo` on Linux and
+-- `sysctl hw.optional.avx512f` on Mac. AVX512F must be
+-- implicitly enabled in a C compiler by passing CPU codename.
+-- Please consult for available model architecture on GCC Online
+-- Documentation [1] for available CPU codenames. To detect
+-- CPU codename execute `gcc -march=native -Q --help=target | grep march`.
+--
+-- 1. https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html
+--
+-- Manual steps for reproducing are the following:
+--
+-- $ make CC=gcc TARGET_CFLAGS='skylake-avx512' -f Makefile.original
+-- $ echo > test.lua
+-- $ LUA_PATH="src/?.lua;;" luajit -b -o osx -a arm test.lua test.o
+-- $ file test.o
+-- empty.o: DOS executable (block device driver)
+
+local ffi = require('ffi')
+
+-- Format of the Mach-O is described in a document
+-- "OS X ABI Mach-O File Format Reference", published by Apple company.
+-- Copy of the (now removed) official documentation in [1].
+-- Yet another source of thruth is a XNU headers, see the definition
+-- of C-structures in: [2] (`nlist_64`), [3] (`fat_arch` and `fat_header`).
+
+-- 1. https://github.com/aidansteele/osx-abi-macho-file-format-reference
+-- 2. https://github.com/apple-oss-distributions/xnu/blob/xnu-10002.1.13/EXTERNAL_HEADERS/mach-o/nlist.h
+-- 3. https://github.com/apple-oss-distributions/xnu/blob/xnu-10002.1.13/EXTERNAL_HEADERS/mach-o/fat.h
+-- 4. https://developer.apple.com/documentation/apple-silicon/addressing-architectural-differences-in-your-macos-code
+--
+-- Using the same declarations as defined in <src/jit/bcsave.lua>.
+ffi.cdef[[
+typedef struct
+{
+  uint32_t magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags;
+} mach_header;
+
+typedef struct
+{
+  mach_header; uint32_t reserved;
+} mach_header_64;
+
+typedef struct {
+  uint32_t cmd, cmdsize;
+  char segname[16];
+  uint32_t vmaddr, vmsize, fileoff, filesize;
+  uint32_t maxprot, initprot, nsects, flags;
+} mach_segment_command;
+
+typedef struct {
+  uint32_t cmd, cmdsize;
+  char segname[16];
+  uint64_t vmaddr, vmsize, fileoff, filesize;
+  uint32_t maxprot, initprot, nsects, flags;
+} mach_segment_command_64;
+
+typedef struct {
+  char sectname[16], segname[16];
+  uint32_t addr, size;
+  uint32_t offset, align, reloff, nreloc, flags;
+  uint32_t reserved1, reserved2;
+} mach_section;
+
+typedef struct {
+  char sectname[16], segname[16];
+  uint64_t addr, size;
+  uint32_t offset, align, reloff, nreloc, flags;
+  uint32_t reserved1, reserved2, reserved3;
+} mach_section_64;
+
+typedef struct {
+  uint32_t cmd, cmdsize, symoff, nsyms, stroff, strsize;
+} mach_symtab_command;
+
+typedef struct {
+  int32_t strx;
+  uint8_t type, sect;
+  int16_t desc;
+  uint32_t value;
+} mach_nlist;
+
+typedef struct {
+  uint32_t strx;
+  uint8_t type, sect;
+  uint16_t desc;
+  uint64_t value;
+} mach_nlist_64;
+
+typedef struct
+{
+  uint32_t magic, nfat_arch;
+} mach_fat_header;
+
+typedef struct
+{
+  uint32_t cputype, cpusubtype, offset, size, align;
+} mach_fat_arch;
+
+typedef struct {
+  mach_fat_header fat;
+  mach_fat_arch fat_arch[2];
+  struct {
+    mach_header hdr;
+    mach_segment_command seg;
+    mach_section sec;
+    mach_symtab_command sym;
+  } arch[2];
+  mach_nlist sym_entry;
+  uint8_t space[4096];
+} mach_fat_obj;
+
+typedef struct {
+  mach_fat_header fat;
+  mach_fat_arch fat_arch[2];
+  struct {
+    mach_header_64 hdr;
+    mach_segment_command_64 seg;
+    mach_section_64 sec;
+    mach_symtab_command sym;
+  } arch[2];
+  mach_nlist_64 sym_entry;
+  uint8_t space[4096];
+} mach_fat_obj_64;
+]]
+
+local function create_obj_file(name, arch)
+  local mach_o_path = os.tmpname() .. '.o'
+  local lua_path = os.getenv('LUA_PATH')
+  local lua_bin = require('utils').exec.luacmd(arg):match('%S+')
+  local cmd_fmt = 'LUA_PATH="%s" %s -b -n "%s" -o osx -a %s -e "print()" %s'
+  local cmd = (cmd_fmt):format(lua_path, lua_bin, name, arch, mach_o_path)
+  local ret = os.execute(cmd)
+  assert(ret == 0, 'cannot create an object file')
+  return mach_o_path
+end
+
+-- Parses a buffer in an Mach-O format and returns
+-- an fat magic number and nfat_arch.
+local function read_mach_o(buf, is64)
+  local res = {
+    header = {
+      magic = 0,
+      nfat_arch = 0,
+    },
+    fat_arch = {
+    },
+  }
+
+  -- Mach-O FAT object.
+  local mach_fat_obj_type = ffi.typeof(is64 and 'mach_fat_obj_64 *' or 'mach_fat_obj *')
+  local obj = ffi.cast(mach_fat_obj_type, buf)
+
+  -- Mach-O FAT object header.
+  local mach_fat_header_type = ffi.typeof('mach_fat_header *')
+  local mach_fat_header = ffi.cast(mach_fat_header_type, obj.fat)
+  local be32 = bit.bswap -- Mach-O FAT is BE, target arch is LE.
+  res.header.magic = be32(mach_fat_header.magic)
+  res.header.nfat_arch = be32(mach_fat_header.nfat_arch)
+
+  -- Mach-O FAT object archs.
+  local mach_fat_arch_type = ffi.typeof('mach_fat_arch *')
+  for i = 0, res.header.nfat_arch - 1 do
+    local fat_arch = ffi.cast(mach_fat_arch_type, obj.fat_arch[i])
+    arch = {
+      cputype = be32(fat_arch.cputype),
+      cpusubtype = be32(fat_arch.cpusubtype),
+    }
+    table.insert(res.fat_arch, arch)
+  end
+
+  return res
+end
+
+-- Defined in <src/jit/bcsave.lua:bcsave_machobj>.
+local sum_cputype = {
+  x86 = 7,
+  x64 = 0x01000007,
+  arm = 7 + 12,
+  arm64 = 0x01000007 + 0x0100000c,
+}
+local sum_cpusubtype = {
+  x86 = 3,
+  x64 = 3,
+  arm = 3 + 9,
+  arm64 = 3 + 0,
+}
+
+-- The function builds Mach-O FAT object file and retrieves
+-- its header fields (magic and nfat_arch)
+-- and fields of the each arch (cputype, cpusubtype).
+--
+-- Mach-O FAT object header could be retrieved with `otool` on macOS:
+--
+-- $ otool -f empty.o
+-- Fat headers
+-- fat_magic 0xcafebabe
+-- nfat_arch 2
+-- <snipped>
+--
+-- CPU type and subtype could be retrieved with `lipo` on macOS:
+--
+-- $ luajit -b -o osx -a arm empty.lua empty.o
+-- $ lipo -archs empty.o
+-- i386 armv7
+-- $ luajit -b -o osx -a arm64 empty.lua empty.o
+-- $ lipo -archs empty.o
+-- x86_64 arm64
+local function build_and_check_mach_o(is64)
+  local arch = is64 and 'arm64' or 'arm'
+
+  -- FAT_MAGIC is an integer containing the value 0xCAFEBABE in
+  -- big-endian byte order format. On a big-endian host CPU,
+  -- this can be validated using the constant FAT_MAGIC;
+  -- on a little-endian host CPU, it can be validated using
+  -- the constant FAT_CIGAM.
+  --
+  -- FAT_NARCH is an integer specifying the number of fat_arch
+  -- data structures that follow. This is the number of
+  -- architectures contained in this binary.
+  --
+  -- See aforementioned "OS X ABI Mach-O File Format Reference".
+  --
+  local FAT_MAGIC = '0xffffffffcafebabe'
+  local FAT_NARCH = 2
+
+  local MODULE_NAME = 'lango_team'
+
+  local mach_o_obj_path = create_obj_file(MODULE_NAME, arch)
+  local mach_o_buf = require('utils').tools.read_file(mach_o_obj_path)
+  assert(mach_o_buf == nil or #mach_o_buf ~= 0, 'cannot read an object file')
+
+  local mach_o = read_mach_o(mach_o_buf, is64)
+
+  -- Teardown.
+  local retcode = os.remove(mach_o_obj_path)
+  assert(retcode == true, 'remove an object file')
+
+  local magic_str = string.format('0x%02x', mach_o.header.magic)
+  test:is(magic_str, FAT_MAGIC, 'fat_magic is correct in Mach-O, ' .. arch)
+  test:is(mach_o.header.nfat_arch, FAT_NARCH, 'nfat_arch is correct in Mach-O, ' .. arch)
+
+  local total_cputype = 0
+  local total_cpusubtype = 0
+  for i = 1, mach_o.header.nfat_arch do
+    total_cputype = total_cputype + mach_o.fat_arch[i].cputype
+    total_cpusubtype = total_cpusubtype + mach_o.fat_arch[i].cpusubtype
+  end
+  test:is(total_cputype, sum_cputype[arch], 'cputype is correct in Mach-O, ' .. arch)
+  test:is(total_cpusubtype, sum_cpusubtype[arch], 'cpusubtype is correct in Mach-O, ' .. arch)
+end
+
+-- ARM
+build_and_check_mach_o(false)
+
+test:done(true)
diff --git a/test/tarantool-tests/utils/tools.lua b/test/tarantool-tests/utils/tools.lua
index f35c6922..26b8c08d 100644
--- a/test/tarantool-tests/utils/tools.lua
+++ b/test/tarantool-tests/utils/tools.lua
@@ -12,4 +12,12 @@ function M.profilename(name)
   return (arg[0]:gsub('^(.+)/([^/]+)%.test%.lua$', replacepattern))
 end
 
+-- Reads a file located in a specified path and returns its content.
+function M.read_file(path)
+  local file = assert(io.open(path), 'cannot open an object file')
+  local content = file:read('*a')
+  file:close()
+  return content
+end
+
 return M
-- 
2.34.1


  reply	other threads:[~2024-03-14 11:40 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-14 11:39 [Tarantool-patches] [PATCH luajit 0/2] Mach-O generation fixes Sergey Bronnikov via Tarantool-patches
2024-03-14 11:39 ` Sergey Bronnikov via Tarantool-patches [this message]
2024-03-18 12:53   ` [Tarantool-patches] [PATCH luajit 1/2] OSX/iOS/ARM64: Fix generation of Mach-O object files Maxim Kokryashkin via Tarantool-patches
2024-03-19  8:19     ` Sergey Bronnikov via Tarantool-patches
2024-03-19 16:28       ` Maxim Kokryashkin via Tarantool-patches
2024-03-26 13:53         ` Sergey Bronnikov via Tarantool-patches
2024-03-26 15:44           ` Maxim Kokryashkin via Tarantool-patches
2024-04-08 15:01           ` Sergey Kaplun via Tarantool-patches
2024-04-09 11:07             ` Sergey Bronnikov via Tarantool-patches
2024-04-09 12:47               ` Sergey Kaplun via Tarantool-patches
2024-03-18 12:55   ` Maxim Kokryashkin via Tarantool-patches
2024-03-14 11:39 ` [Tarantool-patches] [PATCH luajit 2/2] OSX/iOS/ARM64: Fix bytecode embedding in Mach-O object file Sergey Bronnikov via Tarantool-patches
2024-03-18 13:44   ` Maxim Kokryashkin via Tarantool-patches
2024-03-19  8:22     ` Sergey Bronnikov via Tarantool-patches
2024-03-19 16:15       ` Maxim Kokryashkin via Tarantool-patches
2024-03-26 14:01         ` Sergey Bronnikov via Tarantool-patches
2024-03-26 15:45           ` Maxim Kokryashkin via Tarantool-patches
2024-04-08 15:16   ` Sergey Kaplun via Tarantool-patches
2024-04-11  7:56     ` Sergey Bronnikov via Tarantool-patches
2024-04-08  7:47 ` [Tarantool-patches] [PATCH luajit 0/2] Mach-O generation fixes Sergey Kaplun via Tarantool-patches
2024-04-08 13:06   ` Sergey Kaplun via Tarantool-patches
2024-04-11  8:08   ` Sergey Bronnikov via Tarantool-patches
2024-04-11  8:27     ` Sergey Kaplun via Tarantool-patches
2024-04-11 12:39       ` Sergey Bronnikov via Tarantool-patches

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9a71bf0765acd6ab019de4ae9f491a6c7bcb463d.1710416150.git.sergeyb@tarantool.org \
    --to=tarantool-patches@dev.tarantool.org \
    --cc=estetus@gmail.com \
    --cc=m.kokryashkin@tarantool.org \
    --cc=skaplun@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH luajit 1/2] OSX/iOS/ARM64: Fix generation of Mach-O object files.' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox