From: Maxim Kokryashkin via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Sergey Bronnikov <estetus@gmail.com>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH luajit 1/2] OSX/iOS/ARM64: Fix generation of Mach-O object files.
Date: Mon, 18 Mar 2024 15:53:04 +0300
Message-ID: <f4p5lqyc4mtnpcqcvudsrlvfizc64w5mtx56umwmneiqfpuumj@sm5px5nhetyl>
In-Reply-To: <9a71bf0765acd6ab019de4ae9f491a6c7bcb463d.1710416150.git.sergeyb@tarantool.org>

Hi, Sergey,
thanks for the patch! Please consider my comments below.

On Thu, Mar 14, 2024 at 02:39:49PM +0300, Sergey Bronnikov wrote:
> From: sergeyb@tarantool.org
>
> Thanks to Carlo Cabrera.
>
> (cherry picked from commit 3065c910ad6027031aabe2dfd3c26a3d0f014b4f)
>
> Mach-O FAT object constructed by LuaJIT had an incorrect format. The
Typo: s/Mach-O FAT/The Mach-O FAT/
> problem is reproduced when target hardware platform has AVX512F and
Typo: s/target/the target/
> LuaJIT is compiled with enabled AVX512F instructions.
>
> The problem is arise because LuaJIT FFI code for Mach-O file generation
Typo: s/is arise/arises/
> in `bcsave.lua` relied on undefined behavior for conversions to
Typo: s/relied/relies/
> `uint32_t`. AVX512F has the `vcvttsd2usi` instruction which converts
Typo: s/instruction/instruction,/
> `double`/`float` to `uint32_t/uint64_t`. Earlier architectures (SSE2,
> AVX2) are sorely lacking such an instruction, as they only support
> signed conversions. Unsigned conversions are done with a signed convert
> and range shifting - the exact algorithm depends on the compiler.
> A side-effect of these workarounds is that negative `double`/`float`
> often inadvertently convert 'as expected', even though this is invoking
> undefined behavior. Whereas `vcvttsd2usi` always returns 0x80000000 or
> 0x8000000000000000 for out-of-range inputs.
>
> The patch fixes the problem, however, the real issue remains unfixed.
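For context on the failure mode described in the quoted commit message:
LuaJIT's `bit.*` functions return results in the signed 32-bit range, so a
byte-swapped constant such as the FAT magic can come out negative, and
writing a negative number into a `uint32_t` FFI field is the out-of-range
conversion the message refers to. Below is a minimal sketch in plain Lua
showing the two readings of the same 32 bits. The helper names `to_i32`
and `to_u32` are hypothetical and are not part of `bcsave.lua`.

```lua
-- Hypothetical helpers for illustration; not taken from bcsave.lua.
local TWO32 = 2^32

-- Map an integer-valued number to the signed 32-bit range, which is
-- the range LuaJIT's bit operations return their results in.
local function to_i32(n)
  n = n % TWO32
  if n >= 2^31 then
    n = n - TWO32
  end
  return n
end

-- Map a signed 32-bit value back to the unsigned value that has the
-- same bit pattern.
local function to_u32(n)
  return n % TWO32
end

local magic = 0xcafebabe
local as_signed = to_i32(magic)
-- The top bit of 0xcafebabe is set, so the signed reading is negative.
print(as_signed < 0)
-- Going back to the unsigned reading restores the original constant.
print(to_u32(as_signed) == magic)
```

Declaring the fields as `int32_t`, as the patch does, keeps such values
within the range of the destination type.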
>
> Sergey Bronnikov:
> * added the description and a test for the problem
Typo: s/a test/the test/
>
> Part of tarantool/tarantool#9595
> ---
>  .github/workflows/exotic-builds-testing.yml   |   5 +-
>  src/jit/bcsave.lua                            |   6 +-
>  .../lj-366-strtab-correct-size.test.lua       |  10 +-
>  ...generation_of_mach-o_object_files.test.lua | 271 ++++++++++++++++++
>  test/tarantool-tests/utils/tools.lua          |   8 +
>  5 files changed, 287 insertions(+), 13 deletions(-)
>  create mode 100644 test/tarantool-tests/lj-865-fix_generation_of_mach-o_object_files.test.lua
>
> diff --git a/.github/workflows/exotic-builds-testing.yml b/.github/workflows/exotic-builds-testing.yml
> index a9ba5fd5..df4bc2e9 100644
> --- a/.github/workflows/exotic-builds-testing.yml
> +++ b/.github/workflows/exotic-builds-testing.yml
> @@ -32,6 +32,7 @@ jobs:
>        fail-fast: false
>        matrix:
>          BUILDTYPE: [Debug, Release]
> +        OS: [Linux, macOS]
>          ARCH: [ARM64, x86_64]
>          GC64: [ON, OFF]
>          FLAVOR: [checkhook, dualnum, gdbjit, nojit, nounwind]
> @@ -50,13 +51,15 @@ jobs:
>              FLAVORFLAGS: -DLUAJIT_USE_GDBJIT=ON
>            - FLAVOR: nounwind
>              FLAVORFLAGS: -DLUAJIT_NO_UNWIND=ON
> +          - FLAVOR: avx512
> +            CMAKEFLAGS: -DCMAKE_C_FLAGS=skylake-avx512 -DCMAKE_C_COMPILER=gcc
>          exclude:
>            - ARCH: ARM64
>              GC64: OFF
>            # DUALNUM is default for ARM64, no need for additional testing.
>            - FLAVOR: dualnum
>              ARCH: ARM64
> -    runs-on: [self-hosted, regular, Linux, '${{ matrix.ARCH }}']
> +    runs-on: [self-hosted, regular, Linux, '${{ matrix.ARCH }}', '${ matrix.OS }']
The matrix.OS variable should be wrapped in double curly braces
instead of single ones. So, it should look like this:
| '${{ matrix.OS }}'

Currently, exotic build testing fails to start because of this mistake.
https://github.com/tarantool/luajit/actions/runs/8279362128

>      name: >
>        LuaJIT ${{ matrix.FLAVOR }}
>        (Linux/${{ matrix.ARCH }})
> diff --git a/src/jit/bcsave.lua b/src/jit/bcsave.lua
> index a287d675..7aec1555 100644
> --- a/src/jit/bcsave.lua
> +++ b/src/jit/bcsave.lua
> @@ -446,18 +446,18 @@ typedef struct {
>    uint32_t value;
>  } mach_nlist;
>  typedef struct {
> -  uint32_t strx;
> +  int32_t strx;
>    uint8_t type, sect;
>    uint16_t desc;
>    uint64_t value;
>  } mach_nlist_64;
>  typedef struct
>  {
> -  uint32_t magic, nfat_arch;
> +  int32_t magic, nfat_arch;
>  } mach_fat_header;
>  typedef struct
>  {
> -  uint32_t cputype, cpusubtype, offset, size, align;
> +  int32_t cputype, cpusubtype, offset, size, align;
>  } mach_fat_arch;
>  typedef struct {
>    struct {
> diff --git a/test/tarantool-tests/lj-366-strtab-correct-size.test.lua b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
> index 8a97a441..0bb92da6 100644
> --- a/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
> +++ b/test/tarantool-tests/lj-366-strtab-correct-size.test.lua
Let's move this update to a separate patch along with the added
utility function. Currently, this change in an unrelated test file is
a bit confusing.
> @@ -138,14 +138,6 @@ local function create_obj_file(name)
>    return elf_filename
>  end
>
> --- Reads a file located in a specified path and returns its content.
> -local function read_file(path)
> -  local file = assert(io.open(path), 'cannot open an object file')
> -  local content = file:read('*a')
> -  file:close()
> -  return content
> -end
> -
>  -- Parses a buffer in an ELF format and returns an offset and a size of strtab
>  -- and symtab sections.
>  local function read_elf(elf_content)
> @@ -172,7 +164,7 @@ end
>  test:plan(3)
>
>  local elf_filename = create_obj_file(MODULE_NAME)
> -local elf_content = read_file(elf_filename)
> +local elf_content = require('utils').tools.read_file(elf_filename)
>  assert(#elf_content ~= 0, 'cannot read an object file')
>
>  local strtab, symtab = read_elf(elf_content)
> diff --git a/test/tarantool-tests/lj-865-fix_generation_of_mach-o_object_files.test.lua b/test/tarantool-tests/lj-865-fix_generation_of_mach-o_object_files.test.lua
> new file mode 100644
> index 00000000..0519e134
> --- /dev/null
> +++ b/test/tarantool-tests/lj-865-fix_generation_of_mach-o_object_files.test.lua
> @@ -0,0 +1,271 @@
> +local tap = require('tap')
> +local test = tap.test('lj-865-fix_generation_of_mach-o_object_files'):skipcond({
> +  -- XXX: Tarantool doesn't use default LuaJIT loaders, and Lua
> +  -- bytecode can't be loaded from the shared library. For more
Typo: s/the shared/a shared/
> +  -- info: https://github.com/tarantool/tarantool/issues/9671.
> +  -- luacheck: no global
> +  ['Test uses exotic type of loaders (see #9671)'] = _TARANTOOL,
Typo: s/uses exotic/uses an exotic/
> +})
> +
> +test:plan(4)
> +
> +-- Test creates an object file in Mach-O format with LuaJIT bytecode
Typo: s/Test/The test/
> +-- and checks validness of the object file fields.
Typo: s/validness/the validity/
> +--
> +-- The original problem is reproduced with LuaJIT that built with
> +-- enabled AVX512F instructions. The support of AVX512F could be
Typo: s/support of/support for/
> +-- checked in `/proc/cpuinfo` on Linux and
> +-- `sysctl hw.optional.avx512f` on Mac. AVX512F must be
> +-- implicitly enabled in a C compiler by passing CPU codename.
> +-- Please consult for available model architecture on GCC Online
Typo: s/for/the/
> +-- Documentation [1] for available CPU codenames. To detect
> +-- CPU codename execute `gcc -march=native -Q --help=target | grep march`.
> +--
> +-- 1. https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html
> +--
> +-- Manual steps for reproducing are the following:
> +--
> +-- $ make CC=gcc TARGET_CFLAGS='skylake-avx512' -f Makefile.original
> +-- $ echo > test.lua
> +-- $ LUA_PATH="src/?.lua;;" luajit -b -o osx -a arm test.lua test.o
> +-- $ file test.o
> +-- empty.o: DOS executable (block device driver)
> +
> +local ffi = require('ffi')
> +
> +-- Format of the Mach-O is described in a document
Typo: s/a document/the document/
> +-- "OS X ABI Mach-O File Format Reference", published by Apple company.
> +-- Copy of the (now removed) official documentation in [1].
Let's replace "in [1]" with "can be found here [1].".
> +-- Yet another source of thruth is a XNU headers, see the definition
Typo: s/is a XNU/are XNU/
Typo: s/the definition/definitions/
> +-- of C-structures in: [2] (`nlist_64`), [3] (`fat_arch` and `fat_header`).
> +
> +-- 1. https://github.com/aidansteele/osx-abi-macho-file-format-reference
> +-- 2. https://github.com/apple-oss-distributions/xnu/blob/xnu-10002.1.13/EXTERNAL_HEADERS/mach-o/nlist.h
> +-- 3. https://github.com/apple-oss-distributions/xnu/blob/xnu-10002.1.13/EXTERNAL_HEADERS/mach-o/fat.h
> +-- 4. https://developer.apple.com/documentation/apple-silicon/addressing-architectural-differences-in-your-macos-code
> +--
> +-- Using the same declarations as defined in <src/jit/bcsave.lua>.
> +ffi.cdef[[
> +typedef struct
> +{
> +  uint32_t magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags;
> +} mach_header;
> +
> +typedef struct
> +{
> +  mach_header; uint32_t reserved;
> +} mach_header_64;
> +
> +typedef struct {
> +  uint32_t cmd, cmdsize;
> +  char segname[16];
> +  uint32_t vmaddr, vmsize, fileoff, filesize;
> +  uint32_t maxprot, initprot, nsects, flags;
> +} mach_segment_command;
> +
> +typedef struct {
> +  uint32_t cmd, cmdsize;
> +  char segname[16];
> +  uint64_t vmaddr, vmsize, fileoff, filesize;
> +  uint32_t maxprot, initprot, nsects, flags;
> +} mach_segment_command_64;
> +
> +typedef struct {
> +  char sectname[16], segname[16];
> +  uint32_t addr, size;
> +  uint32_t offset, align, reloff, nreloc, flags;
> +  uint32_t reserved1, reserved2;
> +} mach_section;
> +
> +typedef struct {
> +  char sectname[16], segname[16];
> +  uint64_t addr, size;
> +  uint32_t offset, align, reloff, nreloc, flags;
> +  uint32_t reserved1, reserved2, reserved3;
> +} mach_section_64;
> +
> +typedef struct {
> +  uint32_t cmd, cmdsize, symoff, nsyms, stroff, strsize;
> +} mach_symtab_command;
> +
> +typedef struct {
> +  int32_t strx;
> +  uint8_t type, sect;
> +  int16_t desc;
> +  uint32_t value;
> +} mach_nlist;
> +
> +typedef struct {
> +  uint32_t strx;
> +  uint8_t type, sect;
> +  uint16_t desc;
> +  uint64_t value;
> +} mach_nlist_64;
> +
> +typedef struct
> +{
> +  uint32_t magic, nfat_arch;
> +} mach_fat_header;
> +
> +typedef struct
> +{
> +  uint32_t cputype, cpusubtype, offset, size, align;
> +} mach_fat_arch;
> +
> +typedef struct {
> +  mach_fat_header fat;
> +  mach_fat_arch fat_arch[2];
> +  struct {
> +    mach_header hdr;
> +    mach_segment_command seg;
> +    mach_section sec;
> +    mach_symtab_command sym;
> +  } arch[2];
> +  mach_nlist sym_entry;
> +  uint8_t space[4096];
> +} mach_fat_obj;
> +
> +typedef struct {
> +  mach_fat_header fat;
> +  mach_fat_arch fat_arch[2];
> +  struct {
> +    mach_header_64 hdr;
> +    mach_segment_command_64 seg;
> +    mach_section_64 sec;
> +    mach_symtab_command sym;
> +  } arch[2];
> +  mach_nlist_64 sym_entry;
> +  uint8_t space[4096];
> +} mach_fat_obj_64;
> +]]
> +
> +local function create_obj_file(name, arch)
> +  local mach_o_path = os.tmpname() .. '.o'
> +  local lua_path = os.getenv('LUA_PATH')
> +  local lua_bin = require('utils').exec.luacmd(arg):match('%S+')
> +  local cmd_fmt = 'LUA_PATH="%s" %s -b -n "%s" -o osx -a %s -e "print()" %s'
> +  local cmd = (cmd_fmt):format(lua_path, lua_bin, name, arch, mach_o_path)
> +  local ret = os.execute(cmd)
> +  assert(ret == 0, 'cannot create an object file')
> +  return mach_o_path
> +end
> +
> +-- Parses a buffer in an Mach-O format and returns
Typo: s/in an/in the/
> +-- an fat magic number and nfat_arch.
Typo: s/an fat/the FAT/
Typo: s/nfat_arch/`nfat_arch`/
> +local function read_mach_o(buf, is64)
> +  local res = {
> +    header = {
> +      magic = 0,
> +      nfat_arch = 0,
> +    },
> +    fat_arch = {
> +    },
I guess the formatting below is a bit better here:
| fat_arch = {},
> +  }
> +
> +  -- Mach-O FAT object.
> +  local mach_fat_obj_type = ffi.typeof(is64 and 'mach_fat_obj_64 *' or 'mach_fat_obj *')
The line is longer than 80 symbols.
> +  local obj = ffi.cast(mach_fat_obj_type, buf)
> +
> +  -- Mach-O FAT object header.
> +  local mach_fat_header_type = ffi.typeof('mach_fat_header *')
> +  local mach_fat_header = ffi.cast(mach_fat_header_type, obj.fat)
> +  local be32 = bit.bswap -- Mach-O FAT is BE, target arch is LE.
> +  res.header.magic = be32(mach_fat_header.magic)
> +  res.header.nfat_arch = be32(mach_fat_header.nfat_arch)
> +
> +  -- Mach-O FAT object archs.
Typo: s/archs/arches/
Side note: I feel like the comments for the sections are not elaborate
enough for an unprepared reader.
I think you should briefly describe the basic structure of a FAT object
(FAT header, then an array of per-architecture headers, then the object
files).
> +  local mach_fat_arch_type = ffi.typeof('mach_fat_arch *')
> +  for i = 0, res.header.nfat_arch - 1 do
> +    local fat_arch = ffi.cast(mach_fat_arch_type, obj.fat_arch[i])
> +    arch = {
> +      cputype = be32(fat_arch.cputype),
> +      cpusubtype = be32(fat_arch.cpusubtype),
> +    }
> +    table.insert(res.fat_arch, arch)
> +  end
> +
> +  return res
> +end
> +
> +-- Defined in <src/jit/bcsave.lua:bcsave_machobj>.
> +local sum_cputype = {
> +  x86 = 7,
> +  x64 = 0x01000007,
> +  arm = 7 + 12,
> +  arm64 = 0x01000007 + 0x0100000c,
> +}
> +local sum_cpusubtype = {
> +  x86 = 3,
> +  x64 = 3,
> +  arm = 3 + 9,
> +  arm64 = 3 + 0,
> +}
It would be nice to have an explanation for these magic numbers.
> +
> +-- The function builds Mach-O FAT object file and retrieves
> +-- its header fields (magic and nfat_arch)
> +-- and fields of the each arch (cputype, cpusubtype).
> +--
> +-- Mach-O FAT object header could be retrieved with `otool` on macOS:
Typo: s/could be/can be/
> +--
> +-- $ otool -f empty.o
> +-- Fat headers
> +-- fat_magic 0xcafebabe
> +-- nfat_arch 2
> +-- <snipped>
> +--
> +-- CPU type and subtype could be retrieved with `lipo` on macOS:
Typo: s/could be/can be/
> +--
> +-- $ luajit -b -o osx -a arm empty.lua empty.o
> +-- $ lipo -archs empty.o
> +-- i386 armv7
> +-- $ luajit -b -o osx -a arm64 empty.lua empty.o
> +-- $ lipo -archs empty.o
> +-- x86_64 arm64
> +local function build_and_check_mach_o(is64)
> +  local arch = is64 and 'arm64' or 'arm'
> +
> +  -- FAT_MAGIC is an integer containing the value 0xCAFEBABE in
> +  -- big-endian byte order format. On a big-endian host CPU,
> +  -- this can be validated using the constant FAT_MAGIC;
> +  -- on a little-endian host CPU, it can be validated using
> +  -- the constant FAT_CIGAM.
> +  --
> +  -- FAT_NARCH is an integer specifying the number of fat_arch
> +  -- data structures that follow. This is the number of
> +  -- architectures contained in this binary.
> +  --
> +  -- See aforementioned "OS X ABI Mach-O File Format Reference".
> +  --
> +  local FAT_MAGIC = '0xffffffffcafebabe'
> +  local FAT_NARCH = 2
> +
> +  local MODULE_NAME = 'lango_team'
> +
> +  local mach_o_obj_path = create_obj_file(MODULE_NAME, arch)
> +  local mach_o_buf = require('utils').tools.read_file(mach_o_obj_path)
> +  assert(mach_o_buf == nil or #mach_o_buf ~= 0, 'cannot read an object file')
> +
> +  local mach_o = read_mach_o(mach_o_buf, is64)
> +
> +  -- Teardown.
> +  local retcode = os.remove(mach_o_obj_path)
> +  assert(retcode == true, 'remove an object file')
> +
> +  local magic_str = string.format('0x%02x', mach_o.header.magic)
> +  test:is(magic_str, FAT_MAGIC, 'fat_magic is correct in Mach-O, ' .. arch)
> +  test:is(mach_o.header.nfat_arch, FAT_NARCH, 'nfat_arch is correct in Mach-O, ' .. arch)
> +
> +  local total_cputype = 0
> +  local total_cpusubtype = 0
> +  for i = 1, mach_o.header.nfat_arch do
> +    total_cputype = total_cputype + mach_o.fat_arch[i].cputype
> +    total_cpusubtype = total_cpusubtype + mach_o.fat_arch[i].cpusubtype
> +  end
> +  test:is(total_cputype, sum_cputype[arch], 'cputype is correct in Mach-O, ' .. arch)
> +  test:is(total_cpusubtype, sum_cpusubtype[arch], 'cpusubtype is correct in Mach-O, ' .. arch)
> +end
> +
> +-- ARM
> +build_and_check_mach_o(false)
> +
> +test:done(true)
Please mention that, along with the test and the fix for the issue,
you've added this tool. IMO, it would be even better to do that in a
separate commit, to avoid confusion caused by updates in test files
unrelated to the patch.
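To illustrate the kind of structural description requested above for the
FAT object, here is a minimal sketch in plain Lua that walks the leading
part of a FAT file byte by byte: a header with two big-endian 32-bit
fields (magic, nfat_arch), followed by nfat_arch records of five
big-endian 32-bit fields each. The helper names (`read_be32`,
`be32_bytes`, `parse_fat`) are hypothetical, and the offset, size, and
align values in the sample buffer are made up for the example.

```lua
-- Reads a big-endian unsigned 32-bit integer at 1-based offset `pos`.
local function read_be32(buf, pos)
  local b1, b2, b3, b4 = buf:byte(pos, pos + 3)
  return ((b1 * 256 + b2) * 256 + b3) * 256 + b4
end

-- Encodes an unsigned 32-bit integer as four big-endian bytes.
local function be32_bytes(n)
  local b4 = n % 256
  local b3 = math.floor(n / 256) % 256
  local b2 = math.floor(n / 65536) % 256
  local b1 = math.floor(n / 16777216) % 256
  return string.char(b1, b2, b3, b4)
end

-- Hypothetical parser: FAT header first, then one fat_arch record
-- (cputype, cpusubtype, offset, size, align) per architecture.
local function parse_fat(buf)
  local res = {
    magic = read_be32(buf, 1),
    nfat_arch = read_be32(buf, 5),
    archs = {},
  }
  local pos = 9
  for i = 1, res.nfat_arch do
    res.archs[i] = {
      cputype = read_be32(buf, pos),
      cpusubtype = read_be32(buf, pos + 4),
      offset = read_be32(buf, pos + 8),
      size = read_be32(buf, pos + 12),
      align = read_be32(buf, pos + 16),
    }
    pos = pos + 20
  end
  return res
end

-- Sample buffer: magic, nfat_arch = 2, then two fat_arch records.
-- The offset/size/align values are illustrative only.
local fields = {
  0xcafebabe, 2,
  7, 3, 48, 100, 2,
  12, 9, 148, 100, 2,
}
local parts = {}
for i, v in ipairs(fields) do
  parts[i] = be32_bytes(v)
end
local fat = parse_fat(table.concat(parts))
print(fat.nfat_arch, fat.archs[1].cputype, fat.archs[2].cputype)
```

Reading the bytes explicitly avoids any dependence on the host byte
order, which is one possible alternative to the `bit.bswap` approach
used in the test.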
> diff --git a/test/tarantool-tests/utils/tools.lua b/test/tarantool-tests/utils/tools.lua
> index f35c6922..26b8c08d 100644
> --- a/test/tarantool-tests/utils/tools.lua
> +++ b/test/tarantool-tests/utils/tools.lua
> @@ -12,4 +12,12 @@ function M.profilename(name)
>    return (arg[0]:gsub('^(.+)/([^/]+)%.test%.lua$', replacepattern))
>  end
>
> +-- Reads a file located in a specified path and returns its content.
> +function M.read_file(path)
> +  local file = assert(io.open(path), 'cannot open an object file')
> +  local content = file:read('*a')
> +  file:close()
> +  return content
> +end
> +
>  return M
> --
> 2.34.1
>
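On the magic numbers in `sum_cputype` and `sum_cpusubtype` in the test
above: one possible way to make them self-explanatory is to spell them
out as named constants. The sketch below assumes the values from XNU's
`<mach/machine.h>` as I recall them, so please double-check them against
the header. Only the two FAT combinations produced for `-a arm` and
`-a arm64` are shown.

```lua
-- Constant values assumed from XNU's <mach/machine.h>; verify before use.
local CPU_ARCH_ABI64 = 0x01000000  -- flag marking the 64-bit ABI
local CPU_TYPE_X86 = 7
local CPU_TYPE_ARM = 12
local CPU_SUBTYPE_X86_ALL = 3      -- same value for i386 and x86_64
local CPU_SUBTYPE_ARM_V7 = 9
local CPU_SUBTYPE_ARM64_ALL = 0

-- The header combines these with bitwise OR; the bits do not overlap,
-- so plain addition gives the same result here.
local CPU_TYPE_X86_64 = CPU_TYPE_X86 + CPU_ARCH_ABI64
local CPU_TYPE_ARM64 = CPU_TYPE_ARM + CPU_ARCH_ABI64

-- A FAT object built with `-a arm` holds i386 + armv7 slices;
-- one built with `-a arm64` holds x86_64 + arm64 slices.
local sum_cputype = {
  arm = CPU_TYPE_X86 + CPU_TYPE_ARM,
  arm64 = CPU_TYPE_X86_64 + CPU_TYPE_ARM64,
}
local sum_cpusubtype = {
  arm = CPU_SUBTYPE_X86_ALL + CPU_SUBTYPE_ARM_V7,
  arm64 = CPU_SUBTYPE_X86_ALL + CPU_SUBTYPE_ARM64_ALL,
}

print(sum_cputype.arm, sum_cpusubtype.arm)
```

With names like these, the expected sums in the test read as "x86 slice
plus ARM slice" rather than as bare literals.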